YARN cannot use embedded HBase
February 28, 2019

My setup has YARN running with Kerberos and SSL enabled on a small sandbox HDP 3.1 cluster. YARN ATS runs in embedded mode.

The problem is that YARN cannot come up, because the HBase connectors get stuck trying to read the base znode at /atsv2-hbase-secure/hbaseid.

If I delete the base znode, the Resource Manager can come up; however, the znode is eventually recreated by the embedded HBase Master, and Timeline Reader v2 then gets stuck trying to read it. The same thing happens with the Node Managers.

There are no errors in the logs of any of the services. The last messages I get before the hang are:

2019-02-27 16:24:10,321 INFO  common.HBaseTimelineStorageUtils (HBaseTimelineStorageUtils.java:getTimelineServiceHBaseConf(65)) - Using hbase configuration at file:///usr/hdp/current/hadoop-yarn-resourcemanager/conf/embedded-yarn-ats-hbase/hbase-site.xml
2019-02-27 16:24:10,446 INFO  zookeeper.ReadOnlyZKClient (ReadOnlyZKClient.java:<init>(130)) - Start read only zookeeper connection 0x1bbae752 to master01.dom.lab:2181,hive01.dom.lab:2181,master02.dom.lab:2181, session timeout 90000 ms, retries 6, retry interval 1000 ms, keep alive 60000 ms
2019-02-27 16:24:10,454 INFO  zookeeper.ZooKeeper (ZooKeeper.java:<init>(438)) - Initiating client connection, connectString=master01.dom.lab:2181,hive01.dom.lab:2181,master02.dom.lab:2181 sessionTimeout=90000 watcher=org.apache.hadoop.hbase.zookeeper.ReadOnlyZKClient$$Lambda$15/700988999@6ccab51a
2019-02-27 16:24:10,456 INFO  client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:run(289)) - Client will use GSSAPI as SASL mechanism.
2019-02-27 16:24:10,456 DEBUG client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:run(291)) - creating sasl client: client=rm/master02.dom.lab@DOM.LAB;service=zookeeper;serviceHostname=hive01.dom.lab
2019-02-27 16:24:10,457 INFO  zookeeper.ClientCnxn (ClientCnxn.java:logStartConnect(1019)) - Opening socket connection to server hive01.dom.lab/10.14.19.29:2181. Will attempt to SASL-authenticate using Login Context section 'Client'
2019-02-27 16:24:10,457 INFO  zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(864)) - Socket connection established, initiating session, client: /10.14.19.25:39414, server: hive01.dom.lab/10.14.19.29:2181
2019-02-27 16:24:10,458 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:primeConnection(936)) - Session establishment request sent on hive01.dom.lab/10.14.19.29:2181
2019-02-27 16:24:10,461 INFO  zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1279)) - Session establishment complete on server hive01.dom.lab/10.14.19.29:2181, sessionid = 0x1692b5629cb0012, negotiated timeout = 60000
2019-02-27 16:24:10,462 DEBUG client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:sendSaslPacket(421)) - ClientCnxn:sendSaslPacket:length=0
2019-02-27 16:24:10,462 DEBUG client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:run(369)) - saslClient.evaluateChallenge(len=0)
2019-02-27 16:24:10,470 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,473 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,473 DEBUG client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:run(369)) - saslClient.evaluateChallenge(len=50)
2019-02-27 16:24:10,474 DEBUG client.ZooKeeperSaslClient (ZooKeeperSaslClient.java:sendSaslPacket(403)) - ClientCnxn:sendSaslPacket:length=86
2019-02-27 16:24:10,475 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,475 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,475 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,475 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,476 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,476 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,476 DEBUG zookeeper.ClientCnxnSocketNIO (ClientCnxnSocketNIO.java:findSendablePacket(184)) - deferring non-priming packet: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 0,4  replyHeader:: 0,0,0  request:: '/atsv2-hbase-secure/hbaseid,F  response::  until SASL authentication completes.
2019-02-27 16:24:10,477 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(830)) - Reading reply sessionid:0x1692b5629cb0012, packet:: clientPath:/atsv2-hbase-secure/hbaseid serverPath:/atsv2-hbase-secure/hbaseid finished:false header:: 3,4  replyHeader:: 3,51539608995,0  request:: '/atsv2-hbase-secure/hbaseid,F  response:: #ffffffff000146d61737465723a3137303030ffffffa676ffffffb9ffffffd013ffffffe7ffffffbe6950425546a2433663031656330342d343039362d343834322d386632302d353236356562616138306161,s{25769804836,51539608796,1550173035842,1551282324524,29,0,30,0,67,0,25769804836} 
2019-02-27 16:24:12,017 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
2019-02-27 16:24:13,646 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0014 after 0ms
2019-02-27 16:24:15,353 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
2019-02-27 16:24:16,981 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0014 after 0ms
2019-02-27 16:24:18,689 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
2019-02-27 16:24:20,317 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0014 after 0ms
2019-02-27 16:24:22,027 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
2019-02-27 16:24:23,653 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0014 after 1ms
2019-02-27 16:24:25,365 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
2019-02-27 16:24:26,989 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0014 after 0ms
2019-02-27 16:24:28,702 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
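
One way to read a trace like this is to group the lines by ZooKeeper session id and see when each session was last mentioned. The sketch below does that on a few abbreviated lines copied from the log above (message tails trimmed). `last_seen_by_session` is a hypothetical helper, and the regular expression assumes only the `sessionid = 0x...`, `sessionid: 0x...` and `sessionid:0x...` spellings that appear in this trace.

```python
import re

# Abbreviated lines from the trace above; long message tails are trimmed.
SAMPLE = """\
2019-02-27 16:24:10,461 INFO  zookeeper.ClientCnxn (ClientCnxn.java:onConnected(1279)) - Session establishment complete on server hive01.dom.lab/10.14.19.29:2181, sessionid = 0x1692b5629cb0012, negotiated timeout = 60000
2019-02-27 16:24:10,477 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(830)) - Reading reply sessionid:0x1692b5629cb0012, packet:: clientPath:/atsv2-hbase-secure/hbaseid
2019-02-27 16:24:12,017 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0013 after 0ms
2019-02-27 16:24:13,646 DEBUG zookeeper.ClientCnxn (ClientCnxn.java:readResponse(729)) - Got ping response for sessionid: 0x2692b567b9b0014 after 0ms
"""

SESSION_RE = re.compile(r"sessionid\s*[:=]\s*(0x[0-9a-fA-F]+)")
TIME_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})")


def last_seen_by_session(log_text):
    """Map each session id to the timestamp of the last line mentioning it."""
    last = {}
    for line in log_text.splitlines():
        ts = TIME_RE.match(line)
        sid = SESSION_RE.search(line)
        if ts and sid:
            # Later lines overwrite earlier ones, so the last mention wins.
            last[sid.group(1)] = ts.group(1)
    return last


if __name__ == "__main__":
    for sid, ts in last_seen_by_session(SAMPLE).items():
        print(sid, ts)
```

In the full trace, the session that read `hbaseid` (0x1692b5629cb0012) is last mentioned at 16:24:10,477, while the later ping responses belong to sessions 0x2692b567b9b0013 and 0x2692b567b9b0014.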

The ACLs of the znode look correct:

[zk: master01:2181,master02:2181,hive01:2181(CONNECTED) 13] getAcl /atsv2-hbase-secure
'sasl,'yarn
: cdrwa
'world,'anyone
: r
'sasl,'yarn-ats-hbase
: cdrwa
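
To check output like this without reading it by eye, a small parser can help. The sketch below is only an illustration: `parse_getacl` and `can_read` are hypothetical helpers, and the input is assumed to follow exactly the two-lines-per-entry layout shown above.

```python
# ACL entries copied from the getAcl output above.
SAMPLE = """\
'sasl,'yarn
: cdrwa
'world,'anyone
: r
'sasl,'yarn-ats-hbase
: cdrwa
"""


def parse_getacl(text):
    """Parse zkCli `getAcl` output into {(scheme, id): permissions}."""
    acls = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("'"):
            # Identity line, e.g. 'sasl,'yarn
            scheme, ident = line.split(",", 1)
            current = (scheme.lstrip("'"), ident.lstrip("'"))
        elif line.startswith(":") and current is not None:
            # Permission line, e.g. ": cdrwa"
            acls[current] = line[1:].strip()
            current = None
    return acls


def can_read(acls, scheme, ident):
    """True if the identity itself, or world:anyone, holds the read (r) bit."""
    own = acls.get((scheme, ident), "")
    everyone = acls.get(("world", "anyone"), "")
    return "r" in own or "r" in everyone


if __name__ == "__main__":
    acls = parse_getacl(SAMPLE)
    print(acls)
    print(can_read(acls, "sasl", "yarn"))
```

Note that the ACL shown is for the parent znode /atsv2-hbase-secure; the same check would have to be repeated on /atsv2-hbase-secure/hbaseid to cover the znode that is actually being read.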

Below are some relevant configuration parameters from the embedded HBase instance:

hbase-env:

export HBASE_MANAGES_ZK=false

hbase-site.xml:

<property>
  <name>zookeeper.recovery.retry</name>
  <value>6</value>
</property>
<property>
  <name>zookeeper.session.timeout</name>
  <value>90000</value>
</property>
<property>
  <name>zookeeper.znode.parent</name>
  <value>/atsv2-hbase-secure</value>
</property>
<property>
  <name>hbase.security.authentication</name>
  <value>kerberos</value>
</property>
<property>
  <name>hbase.security.authorization</name>
  <value>true</value>
</property>
<property>
  <name>hbase.superuser</name>
  <value>yarn</value>
</property>
<property>
  <name>hbase.tmp.dir</name>
  <value>/hadoop/yarn/tmp/hbase-${user.name}</value>
</property>
<property>
  <name>hbase.zookeeper.property.clientPort</name>
  <value>2181</value>
</property>
<property>
  <name>hbase.zookeeper.quorum</name>
  <value>master01.dom.lab,hive01.dom.lab,master02.dom.lab</value>
</property>
<property>
  <name>hbase.rootdir</name>
  <value>/atsv2/hbase/data</value>
</property>
<property>
  <name>hbase.rpc.protection</name>
  <value>authentication</value>
</property>
<property>
  <name>hbase.rpc.timeout</name>
  <value>90000</value>
</property>
<property>
  <name>hbase.local.dir</name>
  <value>${hbase.tmp.dir}/local</value>
</property>
<property>
  <name>hbase.cluster.distributed</name>
  <value>true</value>
</property>
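
As a rough illustration of how these properties fit together, the sketch below parses a trimmed copy of the configuration and derives the ZooKeeper connect string and the `hbaseid` znode path that appear in the log above. The helper names (`load_properties`, `zk_connect_string`, `hbaseid_znode`) are hypothetical, and the sample wraps the properties in a `<configuration>` root element, which is an assumption because only the `<property>` entries are quoted here.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the properties quoted above. The <configuration> root
# element is assumed; the question only shows the <property> entries.
SAMPLE = """
<configuration>
  <property>
    <name>zookeeper.znode.parent</name>
    <value>/atsv2-hbase-secure</value>
  </property>
  <property>
    <name>hbase.zookeeper.property.clientPort</name>
    <value>2181</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>master01.dom.lab,hive01.dom.lab,master02.dom.lab</value>
  </property>
</configuration>
"""


def load_properties(xml_text):
    """Return a {name: value} dict from Hadoop-style configuration XML."""
    root = ET.fromstring(xml_text.strip())
    return {p.findtext("name"): p.findtext("value") for p in root.iter("property")}


def zk_connect_string(props):
    """Join each quorum host with the client port, as in the logged connectString."""
    port = props["hbase.zookeeper.property.clientPort"]
    hosts = props["hbase.zookeeper.quorum"].split(",")
    return ",".join(f"{host.strip()}:{port}" for host in hosts)


def hbaseid_znode(props):
    """Path of the hbaseid znode under the configured parent znode."""
    return props["zookeeper.znode.parent"].rstrip("/") + "/hbaseid"


if __name__ == "__main__":
    props = load_properties(SAMPLE)
    print(zk_connect_string(props))
    print(hbaseid_znode(props))
```

Comparing the two derived values with the `connectString` and `clientPath` fields in the client log is a quick way to confirm that the services are reading the configuration file they are expected to read.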

Thanks in advance for your help!
