I use Oozie with impala-shell for ETL. (The Impala query is an INSERT OVERWRITE.)
However, I found a workflow that ran much longer than usual and never finished.
I checked the unfinished query and found that Impala hung while writing a file to HDFS.
(This query has been running for more than 3 hours; it normally finishes in 10 minutes.)
I think the problem is that the Impala daemon got stuck at the acknowledgement step of the HDFS file write.
I don't know why SocketChannelImpl.ensureReadOpen stays blocked.
Can you give me any advice on how to solve this problem?
My environment:
- CDH 5.14.2 (Hadoop 2.6.0, Impala 2.11)
- Kerberos enabled
- Java 1.8.0_121
Impala daemon thread dump (IP addresses removed)
"ResponseProcessor for block BP-21905457-<my-ip-address>-1502174846412:blk_1162137978_122336977" #341266 daemon prio=5 os_prio=0 tid=0x00007f2930f14000 nid=0x65b3c waiting for monitor entry [0x00007f289262c000]
java.lang.Thread.State: BLOCKED (on object monitor)
at sun.nio.ch.SocketChannelImpl.ensureReadOpen(SocketChannelImpl.java:255)
- waiting to lock <0x00000002a4d00c60> (a java.lang.Object)
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:300)
- locked <0x00000002a4d00d10> (a java.lang.Object)
at org.apache.hadoop.net.SocketInputStream$Reader.performIO(SocketInputStream.java:57)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:142)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at org.apache.hadoop.crypto.CryptoInputStream.readFromUnderlyingStream(CryptoInputStream.java:220)
at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:200)
at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:658)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2303)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:235)
at org.apache.hadoop.hdfs.DFSOutputStream$DataStreamer$ResponseProcessor.run(DFSOutputStream.java:1093)
Locked ownable synchronizer:
- None
DataNode (first DataNode) thread dump
"PacketResponder: BP-21905457-<ip>-1502174846412:blk_1162137978_122336977, type=HAS_DOWNSTREAM_IN_PIPELINE" #32844152 daemon prio=5 os_prio=0 tid=0x00007f2532f2c000 nid=0x65b3b runnable [0x00007f25a2002000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000d111e670> (a sun.nio.ch.Util$3)
- locked <0x00000000d111e680> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000d111e628> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:131)
at java.io.FilterInputStream.read(FilterInputStream.java:133)
at org.apache.hadoop.crypto.CryptoInputStream.readFromUnderlyingStream(CryptoInputStream.java:220)
at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:200)
at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:658)
at java.io.FilterInputStream.read(FilterInputStream.java:83)
at org.apache.hadoop.hdfs.protocolPB.PBHelper.vintPrefixed(PBHelper.java:2303)
at org.apache.hadoop.hdfs.protocol.datatransfer.PipelineAck.readFields(PipelineAck.java:235)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver$PacketResponder.run(BlockReceiver.java:1291)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- None
"DataXceiver for client DFSClient_NONMAPREDUCE_633544602_1 at /<ip>:39456 [Receiving block BP-21905457-<ip>-1502174846412:blk_1162137978_122336977]" #32844151 daemon prio=5 os_prio=0 tid=0x000000000a609800 nid=0x65b3a runnable [0x00007f25a2102000]
java.lang.Thread.State: RUNNABLE
at sun.nio.ch.EPollArrayWrapper.epollWait(Native Method)
at sun.nio.ch.EPollArrayWrapper.poll(EPollArrayWrapper.java:269)
at sun.nio.ch.EPollSelectorImpl.doSelect(EPollSelectorImpl.java:93)
at sun.nio.ch.SelectorImpl.lockAndDoSelect(SelectorImpl.java:86)
- locked <0x00000000d11173b0> (a sun.nio.ch.Util$3)
- locked <0x00000000d11173c0> (a java.util.Collections$UnmodifiableSet)
- locked <0x00000000d1117368> (a sun.nio.ch.EPollSelectorImpl)
at sun.nio.ch.SelectorImpl.select(SelectorImpl.java:97)
at org.apache.hadoop.net.SocketIOWithTimeout$SelectorPool.select(SocketIOWithTimeout.java:335)
at org.apache.hadoop.net.SocketIOWithTimeout.doIO(SocketIOWithTimeout.java:157)
at org.apache.hadoop.net.SocketInputStream.read(SocketInputStream.java:161)
at org.apache.hadoop.crypto.CryptoInputStream.read(CryptoInputStream.java:198)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:246)
at java.io.BufferedInputStream.read1(BufferedInputStream.java:286)
at java.io.BufferedInputStream.read(BufferedInputStream.java:345)
- locked <0x00000000e8de7b60> (a java.io.BufferedInputStream)
at java.io.DataInputStream.read(DataInputStream.java:149)
at org.apache.hadoop.io.IOUtils.readFully(IOUtils.java:201)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doReadFully(PacketReceiver.java:213)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.doRead(PacketReceiver.java:134)
at org.apache.hadoop.hdfs.protocol.datatransfer.PacketReceiver.receiveNextPacket(PacketReceiver.java:109)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receivePacket(BlockReceiver.java:501)
at org.apache.hadoop.hdfs.server.datanode.BlockReceiver.receiveBlock(BlockReceiver.java:901)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.writeBlock(DataXceiver.java:808)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.opWriteBlock(Receiver.java:169)
at org.apache.hadoop.hdfs.protocol.datatransfer.Receiver.processOp(Receiver.java:106)
at org.apache.hadoop.hdfs.server.datanode.DataXceiver.run(DataXceiver.java:246)
at java.lang.Thread.run(Thread.java:745)
Locked ownable synchronizers:
- None
Hadoop NameNode log (HDFS path removed)
2020-04-01 04:11:17,671 INFO org.apache.hadoop.hdfs.StateChange: BLOCK* allocateBlock: /<data path>/_impala_insert_staging/e346b1f10713e4ba_c69b6dcd00000000/.e346b1f10713e4ba-c69b6dcd00000052_264212949_dir/service_terms_id=21/e346b1f10713e4ba-c69b6dcd00000052_1934381021_data.0.parq.
BP-21905457-<ip>-1502174846412
blk_1162137978_122336977{blockUCState=UNDER_CONSTRUCTION, primaryNodeIndex=-1, replicas=[
ReplicaUnderConstruction[[DISK]DS-513944a8-a1ca-4272-a521-959f8ebd6c4d:NORMAL:<ip>:1004|RBW],
ReplicaUnderConstruction[[DISK]DS-b912b9ce-93df-4c0f-b25d-55be36dae3e8:NORMAL:<ip>:1004|RBW],
ReplicaUnderConstruction[[DISK]DS-b6db02ea-1bbf-4920-9754-79c68eafba7f:NORMAL:<ip>:1004|RBW]]
}
# could not find a blockMap update log
DataNode log
# node 1
2020-04-01 04:11:17,673 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Receiving BP-21905457-<ip>-1502174846412:blk_1162137978_122336977 src: /<ip>:39456 dest: /<ip>:1004
2020-04-01 09:40:45,634 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Scheduling blk_1162137978_122336977 file /data2/dfs/dn/current/BP-21905457-<ip>-1502174846412/current/rbw/blk_1162137978 for deletion
2020-04-01 09:40:45,635 INFO org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.FsDatasetAsyncDiskService: Deleted BP-21905457-<ip>-1502174846412 blk_1162137978_122336977 file /data2/dfs/dn/current/BP-21905457-<ip>-1502174846412/current/rbw/blk_1162137978
# node 2 and node 3 logs - same as the node 1 log
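The two DataNode timestamps bracket how long the block stayed in the rbw directory before it was deleted. A small Python sketch that computes the gap between two log lines follows. The function name is hypothetical, and the format string assumes the timestamp layout seen in these logs (23 characters, comma before the milliseconds).

```python
from datetime import datetime

# Timestamp layout at the start of each log line, e.g. "2020-04-01 04:11:17,673".
LOG_TS_FORMAT = "%Y-%m-%d %H:%M:%S,%f"
LOG_TS_LENGTH = 23


def elapsed_between(first_line, last_line):
    """Return the timedelta between the leading timestamps of two log lines."""
    def parse(line):
        return datetime.strptime(line[:LOG_TS_LENGTH], LOG_TS_FORMAT)
    return parse(last_line) - parse(first_line)


if __name__ == "__main__":
    receiving = "2020-04-01 04:11:17,673 INFO ... Receiving ..."
    deleted = "2020-04-01 09:40:45,634 INFO ... Scheduling ... for deletion"
    print(elapsed_between(receiving, deleted))
```

For the node 1 lines this is about 5 hours 29 minutes between the "Receiving" entry and the deletion entry.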
Impala daemon log
- Impala logged no warnings or errors
Netstat information
Please let me know if you need more information. Thank you. :)