Error: namenode cannot automatically switch to active

Error description:
  When I killed the active namenode1 process with kill -9 namenode-jps-id, namenode2 did not automatically switch to the active state.
  The namenode2 log showed the following:

org.apache.hadoop.hdfs.qjournal.client.QuorumException: Got too many exceptions to achieve quorum size 2/3. 3 exceptions thrown:
192.168.1.205:8485: Call From itcast01/192.168.1.201 to itcast05:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.1.206:8485: Call From itcast01/192.168.1.201 to itcast06:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
192.168.1.207:8485: Call From itcast01/192.168.1.201 to itcast07:8485 failed on connection exception: java.net.ConnectException: Connection refused; For more details see:  http://wiki.apache.org/hadoop/ConnectionRefused
    at org.apache.hadoop.hdfs.qjournal.client.QuorumException.create(QuorumException.java:81)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumCall.rethrowException(QuorumCall.java:223)
    at org.apache.hadoop.hdfs.qjournal.client.AsyncLoggerSet.waitForWriteQuorum(AsyncLoggerSet.java:142)
    at org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager.selectInputStreams(QuorumJournalManager.java:471)
    at org.apache.hadoop.hdfs.server.namenode.JournalSet.selectInputStreams(JournalSet.java:278)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1463)
    at org.apache.hadoop.hdfs.server.namenode.FSEditLog.selectInputStreams(FSEditLog.java:1487)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer.doTailEdits(EditLogTailer.java:212)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.doWork(EditLogTailer.java:324)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.access$200(EditLogTailer.java:282)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread$1.run(EditLogTailer.java:299)
    at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:412)
    at org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer$EditLogTailerThread.run(EditLogTailer.java:295)

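Question: passwordless ssh login is already configured, so why is an error about being unable to connect to the namenode still reported?
Guess: look at the sshfence configuration in hdfs-site.xml:
The purpose of sshfence is to log in to the previously active NameNode over ssh and kill it.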

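So the value of dfs.ha.fencing.ssh.private-key-files must be the path where the private key file is stored on the local machine.
Make sure this matches your own setup. On my machine the private key is at /root/.ssh/id_rsa, yet I carelessly copied someone else's configuration. Lesson learned!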
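
For reference, the relevant part of hdfs-site.xml might look roughly like the fragment below. This is only a sketch: the key path is the one from my machine mentioned above, and I am assuming sshfence is the only fencing method configured. Adjust both to your own environment.

```xml
<!-- Fencing method used during failover: ssh to the old active NameNode and kill it -->
<property>
  <name>dfs.ha.fencing.methods</name>
  <value>sshfence</value>
</property>

<!-- Must point to a private key that actually exists on THIS host
     (here /root/.ssh/id_rsa, not a path copied from someone else's config) -->
<property>
  <name>dfs.ha.fencing.ssh.private-key-files</name>
  <value>/root/.ssh/id_rsa</value>
</property>
```

A quick sanity check is to confirm that the file named in the value exists on each NameNode host and that the user running the failover process can read it.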

posted @ 2016-04-15 11:30  时光.漫步