I have an HDFS cluster with 2 NameNodes and 6 DataNodes. During a rolling upgrade of the cluster I see the following exceptions in a Spark job:
java.io.IOException: Got error, status message , ack with firstBadLink as x.x.x.x:50010
at org.apache.hadoop.hdfs.protocol.datatransfer.DataTransferProtoUtil.checkBlockOpStatus(DataTransferProtoUtil.java:142)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.createBlockOutputStream(DFSOutputStream.java:1359)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.nextBlockOutputStream(DFSOutputStream.java:1262)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer.run(DFSOutputStream.java:448)
The DataNode logs contain the following exceptions:
2019-09-18 17:07:52,270 ERROR [DataXceiver for client DFSClient_NONMAPREDUCE_-769906848_78 at /x.x.x.x:38980 [Receiving block BP-615224276-127.0.0.1-1528190490364:blk_2103245143_1029505696]] datanode.DataNode (DataXceiver.java:run(280)) - localhost:50010:DataXceiver error processing WRITE_BLOCK operation src: /x.x.x.x:38980 dst: /x.x.x.x:50010
java.net.ConnectException: Connection refused
at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:733)
at org.apache.hadoop.net.SocketIOWithTimeout.connect(SocketIOWithTimeout.java:206)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:531)
at org.apache.hadoop.net.NetUtils.connect(NetUtils.java:495)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:708)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
at java.lang.Thread.run(Thread.java:808)
2019-09-18 17:07:52,271 INFO [PacketResponder: BP-615224276-127.0.0.1-1528190490364:blk_2103245144_1029505697, type=HAS_DOWNSTREAM_IN_PIPELINE] datanode.DataNode (BlockReceiver.java:run(1361)) - PacketResponder: BP-615224276-127.0.0.1-1528190490364:blk_2103245144_1029505697, type=HAS_DOWNSTREAM_IN_PIPELINE
java.io.EOFException: Premature EOF: no length prefix available
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2294)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:244)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1291)
at java.lang.Thread.run(Thread.java:808)
2019-09-18 17:07:52,272 INFO [DataXceiver for client DFSClient_NONMAPREDUCE_-769906848_78 at /x.x.x.x:38986 [Receiving block BP-615224276-127.0.0.1-1528190490364:blk_2103245144_1029505697]] datanode.DataNode (BlockReceiver.java:receiveBlock(942)) - Exception for BP-615224276-127.0.0.1-1528190490364:blk_2103245144_1029505697
java.io.IOException: Premature EOF from inputStream
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:202)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:503)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:903)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:137)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:74)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:253)
at java.lang.Thread.run(Thread.java:808)
These exceptions occur only while the rolling upgrade is in progress.
Hadoop version: 2.7.7
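For context: during a rolling upgrade DataNodes are restarted one at a time, so a node in an active write pipeline can go away mid-write, which matches the `Connection refused` and `Premature EOF` errors above. One client-side knob that is sometimes tuned for writes on small clusters is the HDFS pipeline-recovery policy (`dfs.client.block.write.replace-datanode-on-failure.*`, present in hdfs-default.xml for Hadoop 2.7). This is only a sketch of one possible mitigation, not a confirmed fix for this exact failure; the trade-offs should be checked against the documentation before applying:

```xml
<!-- hdfs-site.xml (client side) - hedged example, verify defaults for 2.7.7 -->

<!-- Keep datanode replacement on pipeline failure enabled (the default). -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
  <value>true</value>
</property>

<!-- DEFAULT only replaces a failed datanode under certain conditions;
     on a 6-node cluster a stricter ALWAYS policy can fail writes when
     no spare datanode is available, so DEFAULT is usually kept. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
  <value>DEFAULT</value>
</property>

<!-- best-effort lets the client continue with the remaining pipeline
     if no replacement datanode can be found, at the cost of temporarily
     reduced replication for the block being written. -->
<property>
  <name>dfs.client.block.write.replace-datanode-on-failure.best-effort</name>
  <value>true</value>
</property>
```

With only 6 DataNodes and nodes cycling through restarts, `best-effort=true` reduces hard write failures during the upgrade window, while the NameNode re-replicates under-replicated blocks afterwards.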