Не удалось присоединиться к топологии Apache Ignite, когда несколько серверных модулей запускаются одновременно - PullRequest
0 голосов
/ 18 октября 2019

В настоящее время я настраиваю кластер Apache Ignite без сохранения состояния в среде Kubernetes.

Во время теста аварийного восстановления я перезагружал несколько узлов Ignite сервера одновременно и преднамеренно. Эти серверные узлы Ignite были запущены примерно в одно и то же время.

С момента восстановления серверных узлов Ignite весь кластер Ignite стал бесполезным, а соединение между серверами и клиентами потеряно и никогда не восстанавливалось.

В журнале узлов сервера постоянно появляется следующая строка:

Failed to wait for partition map exchange [topVer=AffinityTopologyVersion [topVer=572, minorTopVer=0], node=f1f26b7e-5130-423a-b6c0-477ad58437ee]. Dumping pending objects that might be the cause: 

Редактировать: Добавлено больше журналов, показывающих, что узлы пытаются последовательно вернуться в топологию Ignite

Added new node to topology: TcpDiscoveryNode [id=91be6833-9884-404b-8b20-afb004ce32a3, addrs=[100.64.32.153, 127.0.0.1], sockAddrs=[/100.64.32.153:0, /127.0.0.1:0], discPort=0, order=337, intOrder=212, lastExchangeTime=1571403600207, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true]
Topology snapshot [ver=337, locNode=98f9d085, servers=9, clients=78, state=ACTIVE, CPUs=152, offheap=2.3GB, heap=45.0GB]
Local node's value of 'java.net.preferIPv4Stack' system property differs from remote node's (all nodes in topology should have identical value) [locPreferIpV4=true, rmtPreferIpV4=null, locId8=98f9d085, rmtId8=4110272f, rmtAddrs=[securities-1-0-0-6d57b9989b-95wkn/100.64.0.31, /127.0.0.1], rmtNode=ClusterNode [id=4110272f-ca98-4a51-89e3-3478d87ff73e, order=338, addr=[100.64.0.31, 127.0.0.1], daemon=false]]
Added new node to topology: TcpDiscoveryNode [id=4110272f-ca98-4a51-89e3-3478d87ff73e, addrs=[100.64.0.31, 127.0.0.1], sockAddrs=[/127.0.0.1:0, /100.64.0.31:0], discPort=0, order=338, intOrder=213, lastExchangeTime=1571403600394, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true]
Topology snapshot [ver=338, locNode=98f9d085, servers=9, clients=79, state=ACTIVE, CPUs=153, offheap=2.3GB, heap=45.0GB]
Completed partition exchange [localNode=98f9d085-933a-435c-a09b-1846cf39c3b1, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=284, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode [id=f3fb9b23-e3b0-47ab-98da-baf2421fb59a, addrs=[100.64.32.132, 127.0.0.1], sockAddrs=[/100.64.32.132:0, /127.0.0.1:0], discPort=0, order=66, intOrder=66, lastExchangeTime=1571377609149, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true], done=true], topVer=AffinityTopologyVersion [topVer=284, minorTopVer=0], durationFromInit=104]
Finished exchange init [topVer=AffinityTopologyVersion [topVer=284, minorTopVer=0], crd=true]
Skipping rebalancing (obsolete exchange ID) [top=AffinityTopologyVersion [topVer=284, minorTopVer=0], evt=NODE_FAILED, node=f3fb9b23-e3b0-47ab-98da-baf2421fb59a]
Started exchange init [topVer=AffinityTopologyVersion [topVer=285, minorTopVer=0], mvccCrd=MvccCoordinator [nodeId=98f9d085-933a-435c-a09b-1846cf39c3b1, crdVer=1571377592872, topVer=AffinityTopologyVersion [topVer=117, minorTopVer=0]], mvccCrdChange=false, crd=true, evt=NODE_FAILED, evtNode=b4b25a6f-1d3c-411f-9d81-5593d52e9db1, customEvt=null, allowMerge=true]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=285, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=285, minorTopVer=0], err=null]
Local node's value of 'java.net.preferIPv4Stack' system property differs from remote node's (all nodes in topology should have identical value) [locPreferIpV4=true, rmtPreferIpV4=null, locId8=98f9d085, rmtId8=edc33f38, rmtAddrs=[transfer-1-0-0-846f8bf868-dnfjg/100.64.18.195, /127.0.0.1], rmtNode=ClusterNode [id=edc33f38-9c94-4c4d-a109-be722e918512, order=339, addr=[100.64.18.195, 127.0.0.1], daemon=false]]
Added new node to topology: TcpDiscoveryNode [id=edc33f38-9c94-4c4d-a109-be722e918512, addrs=[100.64.18.195, 127.0.0.1], sockAddrs=[/127.0.0.1:0, /100.64.18.195:0], discPort=0, order=339, intOrder=214, lastExchangeTime=1571403600468, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true]
Topology snapshot [ver=339, locNode=98f9d085, servers=9, clients=80, state=ACTIVE, CPUs=155, offheap=2.3GB, heap=46.0GB]
Completed partition exchange [localNode=98f9d085-933a-435c-a09b-1846cf39c3b1, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=285, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode [id=b4b25a6f-1d3c-411f-9d81-5593d52e9db1, addrs=[100.64.19.98, 127.0.0.1], sockAddrs=[/127.0.0.1:0, /100.64.19.98:0], discPort=0, order=71, intOrder=71, lastExchangeTime=1571377609159, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true], done=true], topVer=AffinityTopologyVersion [topVer=285, minorTopVer=0], durationFromInit=100]
Finished exchange init [topVer=AffinityTopologyVersion [topVer=285, minorTopVer=0], crd=true]
Skipping rebalancing (obsolete exchange ID) [top=AffinityTopologyVersion [topVer=285, minorTopVer=0], evt=NODE_FAILED, node=b4b25a6f-1d3c-411f-9d81-5593d52e9db1]
Started exchange init [topVer=AffinityTopologyVersion [topVer=286, minorTopVer=0], mvccCrd=MvccCoordinator [nodeId=98f9d085-933a-435c-a09b-1846cf39c3b1, crdVer=1571377592872, topVer=AffinityTopologyVersion [topVer=117, minorTopVer=0]], mvccCrdChange=false, crd=true, evt=NODE_FAILED, evtNode=c161e542-bad7-4f41-a973-54b6e6e7b555, customEvt=null, allowMerge=true]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=286, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=286, minorTopVer=0], err=null]
Completed partition exchange [localNode=98f9d085-933a-435c-a09b-1846cf39c3b1, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=286, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode [id=c161e542-bad7-4f41-a973-54b6e6e7b555, addrs=[100.64.17.126, 127.0.0.1], sockAddrs=[/127.0.0.1:0, /100.64.17.126:0], discPort=0, order=38, intOrder=38, lastExchangeTime=1571377608515, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true], done=true], topVer=AffinityTopologyVersion [topVer=286, minorTopVer=0], durationFromInit=20]
Finished exchange init [topVer=AffinityTopologyVersion [topVer=286, minorTopVer=0], crd=true]
Skipping rebalancing (obsolete exchange ID) [top=AffinityTopologyVersion [topVer=286, minorTopVer=0], evt=NODE_FAILED, node=c161e542-bad7-4f41-a973-54b6e6e7b555]
Started exchange init [topVer=AffinityTopologyVersion [topVer=287, minorTopVer=0], mvccCrd=MvccCoordinator [nodeId=98f9d085-933a-435c-a09b-1846cf39c3b1, crdVer=1571377592872, topVer=AffinityTopologyVersion [topVer=117, minorTopVer=0]], mvccCrdChange=false, crd=true, evt=NODE_FAILED, evtNode=0c16c5a7-6e3f-4fd4-8618-b6d8d8888af4, customEvt=null, allowMerge=true]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=287, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=287, minorTopVer=0], err=null]
Completed partition exchange [localNode=98f9d085-933a-435c-a09b-1846cf39c3b1, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=287, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode [id=0c16c5a7-6e3f-4fd4-8618-b6d8d8888af4, addrs=[100.64.34.22, 127.0.0.1], sockAddrs=[/127.0.0.1:0, /100.64.34.22:0], discPort=0, order=25, intOrder=25, lastExchangeTime=1571377607690, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true], done=true], topVer=AffinityTopologyVersion [topVer=287, minorTopVer=0], durationFromInit=52]
Finished exchange init [topVer=AffinityTopologyVersion [topVer=287, minorTopVer=0], crd=true]
Skipping rebalancing (obsolete exchange ID) [top=AffinityTopologyVersion [topVer=287, minorTopVer=0], evt=NODE_FAILED, node=0c16c5a7-6e3f-4fd4-8618-b6d8d8888af4]
Started exchange init [topVer=AffinityTopologyVersion [topVer=288, minorTopVer=0], mvccCrd=MvccCoordinator [nodeId=98f9d085-933a-435c-a09b-1846cf39c3b1, crdVer=1571377592872, topVer=AffinityTopologyVersion [topVer=117, minorTopVer=0]], mvccCrdChange=false, crd=true, evt=NODE_FAILED, evtNode=807333d7-0b71-4510-a35d-0ed41e068ac5, customEvt=null, allowMerge=true]
Finish exchange future [startVer=AffinityTopologyVersion [topVer=288, minorTopVer=0], resVer=AffinityTopologyVersion [topVer=288, minorTopVer=0], err=null]
Completed partition exchange [localNode=98f9d085-933a-435c-a09b-1846cf39c3b1, exchange=GridDhtPartitionsExchangeFuture [topVer=AffinityTopologyVersion [topVer=288, minorTopVer=0], evt=NODE_FAILED, evtNode=TcpDiscoveryNode [id=807333d7-0b71-4510-a35d-0ed41e068ac5, addrs=[100.64.32.231, 127.0.0.1], sockAddrs=[/127.0.0.1:0, /100.64.32.231:0], discPort=0, order=74, intOrder=74, lastExchangeTime=1571377609280, loc=false, ver=2.7.5#20190603-sha1:be4f2a15, isClient=true], done=true], topVer=AffinityTopologyVersion [topVer=288, minorTopVer=0], durationFromInit=60]
Finished exchange init [topVer=AffinityTopologyVersion [topVer=288, minorTopVer=0], crd=true]

1 Ответ

0 голосов
/ 19 октября 2019

Я также столкнулся с той же проблемой. Развертывание каждого узла один за другим - единственный способ восстановить это., Согласно. Насколько мой опыт с зажигать.

...