NameNodes in a Hadoop cluster do not start after a power outage
January 28, 2020

We have a Hadoop cluster managed by Ambari, running HDP 2.6.5, with 2 NameNodes.

After a power failure, neither NameNode starts.

On the first NameNode we see the following behavior: many cycles of "replaying edit log: xxxxxxxxx transactions completed", but the NameNode never finishes starting.

2020-01-27 06:20:27,276 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 181625/2282427 transactions completed. (8%)
2020-01-27 06:20:28,277 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 395566/2282427 transactions completed. (17%)
2020-01-27 06:20:29,278 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 589023/2282427 transactions completed. (26%)
2020-01-27 06:20:30,279 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 786070/2282427 transactions completed. (34%)
2020-01-27 06:20:31,280 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 980763/2282427 transactions completed. (43%)
2020-01-27 06:20:32,281 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 1195544/2282427 transactions completed. (52%)
2020-01-27 06:20:33,282 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 1401095/2282427 transactions completed. (61%)
2020-01-27 06:20:34,283 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 1573123/2282427 transactions completed. (69%)
2020-01-27 06:20:35,305 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 1742059/2282427 transactions completed. (76%)
2020-01-27 06:20:36,305 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 1931265/2282427 transactions completed. (85%)
2020-01-27 06:20:37,306 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 2072759/2282427 transactions completed. (91%)
2020-01-27 06:20:38,307 INFO  namenode.FSEditLogLoader (FSEditLogLoader.java:loadEditRecords(266)) - replaying edit log: 2214991/2282427 transactions completed. (97%)
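
To tell whether the replay really restarts in cycles or is just progressing slowly, the progress lines can be parsed and the number of passes counted. This is only a minimal sketch, assuming the log format shown above; `parse_progress` and `count_replay_passes` are hypothetical helper names, not part of Hadoop. A drop in the counter may simply mean that the next edit log segment is being loaded, so several passes are not necessarily an error.

```python
import re

# Matches progress lines such as:
# "... replaying edit log: 181625/2282427 transactions completed. (8%)"
PROGRESS_RE = re.compile(
    r"replaying edit log: (\d+)/(\d+) transactions completed"
)


def parse_progress(lines):
    """Return a list of (done, total) pairs found in NameNode log lines."""
    progress = []
    for line in lines:
        match = PROGRESS_RE.search(line)
        if match:
            progress.append((int(match.group(1)), int(match.group(2))))
    return progress


def count_replay_passes(progress):
    """Count replay passes: a new pass starts whenever 'done' decreases."""
    if not progress:
        return 0
    passes = 1
    previous_done = progress[0][0]
    for done, _total in progress[1:]:
        if done < previous_done:
            passes += 1
        previous_done = done
    return passes
```

Feeding the whole NameNode log through `parse_progress` and then `count_replay_passes` gives a rough count of how many times the replay started over; comparing the `total` values between passes hints at whether the same segment is replayed again or a different one.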

On the second NameNode we see the following:

2020-01-27 06:27:09,487 ERROR namenode.NameNode (NameNode.java:main(1783)) - Failed to start namenode.
org.apache.hadoop.hdfs.server.namenode.EditLogInputException: Error replaying edit log at offset 0.  Expected transaction ID was 19494143328
               at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:203)
               at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadFSEdits(FSEditLogLoader.java:143)
               at org.apache.hadoop.hdfs.server.namenode.FSImage.loadEdits(FSImage.java:838)
               at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:693)
               at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:289)
               at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:1073)
               at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:723)
               at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:697)
               at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:761)
               at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:1001)
               at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:985)
               at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1710)
               at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1778)
Caused by: org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream$PrematureEOFException: got premature end-of-file at txid 19494143327; expected file to go up to 19494143328
               at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:197)
               at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
               at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.skipUntil(EditLogInputStream.java:151)
               at org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream.nextOp(RedundantEditLogInputStream.java:179)
               at org.apache.hadoop.hdfs.server.namenode.EditLogInputStream.readOp(EditLogInputStream.java:85)
               at org.apache.hadoop.hdfs.server.namenode.FSEditLogLoader.loadEditRecords(FSEditLogLoader.java:190)
               ... 12 more
2020-01-27 06:27:09,488 INFO  util.ExitUtil (ExitUtil.java:terminate(124)) - Exiting with status 1
2020-01-27 06:27:09,489 INFO  namenode.NameNode (LogAdapter.java:info(47)) - SHUTDOWN_MSG: 
/************************************************************
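
To make the gap in the trace above explicit, the two transaction IDs can be pulled out of the `PrematureEOFException` message and compared. This is a rough sketch, assuming the message format shown in the trace; `missing_transactions` is a hypothetical helper name, and reading the difference as the number of transactions that could not be read is my interpretation of the message, not confirmed Hadoop behavior.

```python
import re

# Matches the "Caused by" message from the stack trace above.
EOF_RE = re.compile(
    r"got premature end-of-file at txid (\d+); "
    r"expected file to go up to (\d+)"
)


def missing_transactions(message):
    """Return (last_read_txid, expected_last_txid, difference), or None."""
    match = EOF_RE.search(message)
    if match is None:
        return None
    last_read = int(match.group(1))
    expected_last = int(match.group(2))
    return last_read, expected_last, expected_last - last_read
```

Applied to the message in the trace, this yields the IDs 19494143327 and 19494143328, which differ by one.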

What can we do, given that neither of the NameNodes starts?

...