I have been stuck on this problem for a very long time.
I am trying to run a job in distributed mode.
I have 2 datanodes and a master running the namenode and the jobtracker.
I keep getting the following error in the tasktracker.log of each node:
<
2012-01-03 08:48:30,910 WARN mortbay.log - /mapOutput: org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201201031846_0001/attempt_201201031846_0001_m_000000_1/output/file.out.index in any of the configured local directories
2012-01-03 08:48:40,927 WARN mapred.TaskTracker - getMapOutput(attempt_201201031846_0001_m_000000_2,0) failed :
org.apache.hadoop.util.DiskChecker$DiskErrorException: Could not find taskTracker/jobcache/job_201201031846_0001/attempt_201201031846_0001_m_000000_2/output/file.out.index in any of the configured local directories
at org.apache.hadoop.fs.LocalDirAllocator$AllocatorPerContext.getLocalPathToRead(LocalDirAllocator.java:389)
at org.apache.hadoop.fs.LocalDirAllocator.getLocalPathToRead(LocalDirAllocator.java:138)
at org.apache.hadoop.mapred.TaskTracker$MapOutputServlet.doGet(TaskTracker.java:2887)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:707)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:820)
at org.mortbay.jetty.servlet.ServletHolder.handle(ServletHolder.java:502)
at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:363)
at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:766)
at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:417)
at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:230)
at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
at org.mortbay.jetty.Server.handle(Server.java:324)
at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:534)
at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:864)
at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:533)
at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:207)
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:403)
at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:409)
at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:522)
>
and this error in the slave's hadoop.log:
2012-01-03 10:20:36,732 WARN mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 adding host localhost to penalty box, next contact in 4 seconds
2012-01-03 10:20:41,738 WARN mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 copy failed: attempt_201201031954_0006_m_000001_2 from localhost
2012-01-03 10:20:41,738 WARN mapred.ReduceTask - java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000001_2&reduce=1
at sun.reflect.GeneratedConstructorAccessor6.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
Caused by: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000001_2&reduce=1
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
... 4 more
2012-01-03 10:20:41,739 WARN mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 adding host localhost to penalty box, next contact in 4 seconds
2012-01-03 10:20:46,761 WARN mapred.ReduceTask - attempt_201201031954_0006_r_000001_0 copy failed: attempt_201201031954_0006_m_000000_3 from localhost
2012-01-03 10:20:46,762 WARN mapred.ReduceTask - java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000000_3&reduce=1
at sun.reflect.GeneratedConstructorAccessor6.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:27)
at java.lang.reflect.Constructor.newInstance(Constructor.java:513)
at sun.net.www.protocol.http.HttpURLConnection$6.run(HttpURLConnection.java:1491)
at java.security.AccessController.doPrivileged(Native Method)
at sun.net.www.protocol.http.HttpURLConnection.getChainedException(HttpURLConnection.java:1485)
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1139)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getInputStream(ReduceTask.java:1447)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.getMapOutput(ReduceTask.java:1349)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.copyOutput(ReduceTask.java:1261)
at org.apache.hadoop.mapred.ReduceTask$ReduceCopier$MapOutputCopier.run(ReduceTask.java:1195)
Caused by: java.io.FileNotFoundException: http://localhost:50060/mapOutput?job=job_201201031954_0006&map=attempt_201201031954_0006_m_000000_3&reduce=1
at sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1434)
... 4 more
This is my configuration:
mapred-site:
<property>
<name>mapred.job.tracker</name>
<value>10.20.1.112:9001</value>
<description>The host and port that the MapReduce job tracker runs
at.</description>
</property>
<property>
<name>mapred.map.tasks</name>
<value>2</value>
<description>
define mapred.map tasks to be number of slave hosts
</description>
</property>
<property>
<name>mapred.reduce.tasks</name>
<value>2</value>
<description>
define mapred.reduce tasks to be number of slave hosts
</description>
</property>
<property>
<name>mapred.system.dir</name>
<value>filesystem/mapreduce/system</value>
</property>
<property>
<name>mapred.local.dir</name>
<value>filesystem/mapreduce/local</value>
</property>
<property>
<name>mapred.submit.replication</name>
<value>2</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>tmp</value>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx2048m</value>
</property>
core-site:
<property>
<name>fs.default.name</name>
<value>hdfs://10.20.1.112:9000</value>
<description>The name of the default file system. A URI whose
scheme and authority determine the FileSystem implementation.
</description>
</property>
I tried changing hadoop.tmp.dir, which did not help.
I tried changing mapred.local.dir, which did not help either.
I also tried looking at what is in the filesystem directory at run time.
I found that the path taskTracker/jobcache/job_201201031846_0001/attempt_201201031846_0001_m_000000_1/
exists, but there is no output folder in it.
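The check I did by hand can be sketched roughly as follows. This is only an illustration: the function name `find_map_output` and the way the directories are passed in are my own, and it assumes the layout shown in the error message (`taskTracker/jobcache/<job>/<attempt>/output/file.out.index`) under each `mapred.local.dir` entry. Relative entries are resolved against whatever directory the script is run from, which may not be the directory the TaskTracker itself uses.

```python
import os


def find_map_output(local_dirs, job_id, attempt_id):
    """Return every path under local_dirs where the map output index file exists.

    Hypothetical helper: the directory layout is taken from the error message,
    not from the Hadoop source.
    """
    found = []
    for local_dir in local_dirs:
        # Relative entries are resolved against the current working directory.
        base = os.path.abspath(local_dir)
        index_file = os.path.join(
            base, "taskTracker", "jobcache", job_id, attempt_id,
            "output", "file.out.index",
        )
        if os.path.isfile(index_file):
            found.append(index_file)
    return found


if __name__ == "__main__":
    hits = find_map_output(
        ["filesystem/mapreduce/local"],  # value of mapred.local.dir from my config
        "job_201201031846_0001",
        "attempt_201201031846_0001_m_000000_1",
    )
    print(hits or "file.out.index not found in any local directory")
```

Running it from different working directories on a slave shows whether the relative `mapred.local.dir` value points somewhere other than where I was looking.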
Any ideas?
Thanks.