Ошибка быстрого запуска Cloudera: java.lang.RuntimeException: PipeMapRed.waitOutputThreads (): сбой подпроцесса с кодом 1 - PullRequest
0 голосов
/ 15 ноября 2018

Помогите мне решить следующую ошибку Hadoop:

Я хочу запустить модуль MRJob Python mrjob_word_count.py, который просто считает слова из текстового файла my_text.txt: Ниже приведена программа mrjob_word_count.py:

#!/usr/bin/python2.7
from mrjob.job import MRJob


class MRWordFrequencyCount(MRJob):

    def mapper(self, _, line):
        yield "chars", len(line)
        yield "words", len(line.split())
        yield "lines", 1

    def reducer(self, key, values):
        yield key, sum(values)


if __name__ == '__main__':
    MRWordFrequencyCount.run()

Когда я просто бегу:

python2.7 /home/cloudera/Documents/mrjob_word_count.py /home/cloudera/Documents/my_text.txt

Я получаю следующие ожидаемые результаты:

No configs found; falling back on auto-configuration
No configs specified for inline runner
Running step 1 of 1...
Creating temp directory /tmp/mrjob_word_count.cloudera.20181114.215028.951225
job output is in /tmp/mrjob_word_count.cloudera.20181114.215028.951225/output
Streaming final output from /tmp/mrjob_word_count.cloudera.20181114.215028.951225/output...
"chars" 786
"words" 125
"lines" 1
Removing temp directory /tmp/mrjob_word_count.cloudera.20181114.215028.951225...

Но когда я пытаюсь запустить то же самое на Hadoop Cloudera Quickstart на Virtual Box, я получаю ошибки. Имейте в виду, у меня снята проверка разрешений hdfs. Поэтому я не уверен, что это все еще ошибка прав доступа:

команда для запуска на hadoop:

 python2.7 /home/cloudera/Documents/mrjob_word_count.py -r hadoop hdfs:///user/root/my_input2/my_text.txt

Ошибка:

No configs found; falling back on auto-configuration
No configs specified for hadoop runner
Looking for hadoop binary in $PATH...
Found hadoop binary: /usr/bin/hadoop
Using Hadoop version 2.6.0
Looking for Hadoop streaming jar in /home/hadoop/contrib...
Looking for Hadoop streaming jar in /usr/lib/hadoop-mapreduce...
Found Hadoop streaming jar: /usr/lib/hadoop-mapreduce/hadoop-streaming.jar
Creating temp directory /tmp/mrjob_word_count.cloudera.20181114.213237.845998
Copying local files to hdfs:///user/cloudera/tmp/mrjob/mrjob_word_count.cloudera.20181114.213237.845998/files/...
Running step 1 of 1...
  packageJobJar: [] [/usr/lib/hadoop-mapreduce/hadoop-streaming-2.6.0-cdh5.15.1.jar] /tmp/streamjob7897542047673680590.jar tmpDir=null
  Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
  Connecting to ResourceManager at quickstart.cloudera/10.0.2.15:8032
  Total input paths to process : 1
  number of splits:2
  Submitting tokens for job: job_1542225023300_0003
  Submitted application application_1542225023300_0003
  The url to track the job: http://quickstart.cloudera:8088/proxy/application_1542225023300_0003/
  Running job: job_1542225023300_0003
  Job job_1542225023300_0003 running in uber mode : false
   map 0% reduce 0%
  Task Id : attempt_1542225023300_0003_m_000001_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1542225023300_0003_m_000000_0, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1542225023300_0003_m_000000_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1542225023300_0003_m_000001_1, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1542225023300_0003_m_000000_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

  Task Id : attempt_1542225023300_0003_m_000001_2, Status : FAILED
Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

   map 100% reduce 100%
  Job job_1542225023300_0003 failed with state FAILED due to: Task failed task_1542225023300_0003_m_000001
Job failed as tasks failed. failedMaps:1 failedReduces:0

  Job not successful!
  Streaming Command Failed!
Counters: 10
    Job Counters 
        Data-local map tasks=2
        Failed map tasks=8
        Killed reduce tasks=1
        Launched map tasks=8
        Other local map tasks=6
        Total megabyte-milliseconds taken by all map tasks=32033792
        Total time spent by all map tasks (ms)=62566
        Total time spent by all maps in occupied slots (ms)=32033792
        Total time spent by all reduces in occupied slots (ms)=0
        Total vcore-milliseconds taken by all map tasks=62566
Scanning logs for probable cause of failure...
Looking for history log in hdfs:///tmp/hadoop-yarn/staging...
Looking for history log in /var/log/hadoop-yarn...
Probable cause of failure:

Error: java.lang.RuntimeException: PipeMapRed.waitOutputThreads(): subprocess failed with code 1
    at org.apache.hadoop.streaming.PipeMapRed.waitOutputThreads(PipeMapRed.java:325)
    at org.apache.hadoop.streaming.PipeMapRed.mapRedFinished(PipeMapRed.java:538)
    at org.apache.hadoop.streaming.PipeMapper.close(PipeMapper.java:130)
    at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:61)
    at org.apache.hadoop.streaming.PipeMapRunner.run(PipeMapRunner.java:34)
    at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:459)
    at org.apache.hadoop.mapred.MapTask.run(MapTask.java:343)
    at org.apache.hadoop.mapred.YarnChild$2.run(YarnChild.java:164)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1924)
    at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:158)

Step 1 of 1 failed: Command '['/usr/bin/hadoop', 'jar', '/usr/lib/hadoop-mapreduce/hadoop-streaming.jar', '-files', 'hdfs:///user/cloudera/tmp/mrjob/mrjob_word_count.cloudera.20181114.213237.845998/files/mrjob.zip#mrjob.zip,hdfs:///user/cloudera/tmp/mrjob/mrjob_word_count.cloudera.20181114.213237.845998/files/mrjob_word_count.py#mrjob_word_count.py,hdfs:///user/cloudera/tmp/mrjob/mrjob_word_count.cloudera.20181114.213237.845998/files/setup-wrapper.sh#setup-wrapper.sh', '-input', 'hdfs:///user/root/my_input2/my_text.txt', '-output', 'hdfs:///user/cloudera/tmp/mrjob/mrjob_word_count.cloudera.20181114.213237.845998/output', '-mapper', 'sh -ex setup-wrapper.sh python mrjob_word_count.py --step-num=0 --mapper', '-reducer', 'sh -ex setup-wrapper.sh python mrjob_word_count.py --step-num=0 --reducer']' returned non-zero exit status 256
...