镰鼬LL


 
Scenario: a colleague reported that a NameNode server in the Hadoop cluster had failed and the Hadoop cluster was unavailable.
 
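Symptom: the Hadoop cluster has two NameNodes. A was in the active state and serving requests; B was in the standby state as the backup. Node A went down and could not be restarted, yet node B stayed in standby and never took over.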
 
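Resolution: 1. First, the zkfc process was not running. zkfc is the process responsible for failover, so start it:
su - hadoop -c "/usr/local/hadoop/sbin/hadoop-daemon.sh --script hdfs start zkfc"
After it started, node B still did not switch over automatically.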
 
2. Run the manual failover commands:
bin/hdfs haadmin -transitionToActive nn2
hdfs  haadmin -failover -forcefence -forceactive  nn2  nn1
Here nn1 and nn2 are the names of the NameNodes; they can be looked up in /usr/local/hadoop-cdh/etc/hadoop/hdfs-site.xml.
The manual failover still failed, with the message:
forcefence and forceactive flags not supported with auto-failover enabled
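The cause is that dfs.ha.automatic-failover.enabled was configured to turn on automatic failover, which blocks manual failover.
After commenting out dfs.ha.automatic-failover.enabled in hdfs-site.xml, the switch still failed: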
[hadoop@poseidon78 bin]$ ./hdfs  haadmin -transitionToActive nn2   
21/07/23 16:01:30 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
21/07/23 16:01:50 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 0 time(s); maxRetries=45
21/07/23 16:02:10 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 1 time(s); maxRetries=45
21/07/23 16:02:30 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 2 time(s); maxRetries=45
21/07/23 16:02:50 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 3 time(s); maxRetries=45
21/07/23 16:03:10 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 4 time(s); maxRetries=45
21/07/23 16:03:30 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 5 time(s); maxRetries=45
21/07/23 16:03:50 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 6 time(s); maxRetries=45
21/07/23 16:04:10 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 7 time(s); maxRetries=45
21/07/23 16:04:30 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 8 time(s); maxRetries=45
21/07/23 16:04:50 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 9 time(s); maxRetries=45
21/07/23 16:05:10 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 10 time(s); maxRetries=45
21/07/23 16:05:30 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 11 time(s); maxRetries=45
21/07/23 16:05:50 INFO ipc.Client: Retrying connect to server: h00036/10.35.0.36:8020. Already tried 12 time(s); maxRetries=45
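
As a rough aid for reading the retry log above: the timestamps are 20 seconds apart and maxRetries is 45, so the command can hang for a long time before giving up. A minimal sketch of that arithmetic (the function name retry_wait_seconds is made up for illustration, and the 20-second interval is read off the log timestamps, not from any configuration):

```shell
#!/usr/bin/env bash
# Rough upper bound on how long the client keeps retrying one address:
# number of retries multiplied by the time each attempt takes.
retry_wait_seconds() {
  local max_retries=$1 interval_seconds=$2
  echo $(( max_retries * interval_seconds ))
}

# Values taken from the log above: maxRetries=45, one attempt every 20 s.
total=$(retry_wait_seconds 45 20)
echo "about $(( total / 60 )) minutes"
```

With these numbers the script prints "about 15 minutes", so it is reasonable to interrupt the command once the first few retries show the target NameNode is unreachable.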
 
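3. Asked for outside help. The fix went roughly as follows:
    1. JournalNode should run on three servers, but at that point only the process on node B was alive. The process on node A was down, and the third one, on a DataNode (call it C), had died a month earlier because its disk was full.
    2. Copy node B's JournalNode data to C and restart the JournalNode process there, which gives two JournalNodes. With only one, failover is impossible; with two, a warning is reported but startup succeeds.
su - hadoop -c "/usr/local/hadoop/sbin/hadoop-daemon.sh start journalnode"
    3. In hdfs-site.xml, comment out dfs.ha.fencing.methods and dfs.ha.fencing.ssh.private-key-files, change dfs.ha.namenodes.backupcluster to (nn2,nn2), and set dfs.ha.automatic-failover.enabled to false.
    4. Finally, run bin/hdfs haadmin -transitionToActive nn2; the switch succeeded.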
[hadoop@poseidon78 hadoop-cdh]$ bin/hdfs  haadmin -transitionToActive nn2
21/07/23 16:11:34 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Operation failed: Failed on local exception: java.io.EOFException; Host Details : local host is: "poseidon78/10.73.18.78"; destination host is: "h18078":8020;
    5. Restore the original configuration.
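
The JournalNode counts in step 2 match the majority rule used by the quorum journal: edits must be acknowledged by more than half of the configured JournalNodes. A minimal sketch of that check, assuming an ensemble of 3 as in this incident (the function names quorum_needed and has_quorum are made up for illustration and are not part of Hadoop):

```shell
#!/usr/bin/env bash
# Majority quorum needed by a JournalNode ensemble of the given size.
quorum_needed() {
  local total=$1
  echo $(( total / 2 + 1 ))
}

# Exit status 0 when enough JournalNodes are alive to form a majority.
has_quorum() {
  local total=$1 alive=$2
  [ "$alive" -ge "$(quorum_needed "$total")" ]
}

# Numbers from this incident: 3 configured JournalNodes, 1 to 3 alive.
for alive in 1 2 3; do
  if has_quorum 3 "$alive"; then
    echo "alive=$alive: quorum reached"
  else
    echo "alive=$alive: no quorum"
  fi
done
```

With one live JournalNode out of three there is no quorum, which fits the failed switch attempts; with two, the majority is reached even though the missing third one is still reported.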
 
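Recovering the standby node
    The original NameNode server could no longer be restarted, and its data was also lost because the RAID card lost power, so a new NameNode (in standby mode) had to be set up.
    Because the NameNode IP is hard-coded in the configuration on the DataNodes, the new NameNode was given the same IP as the old one to avoid extra work.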
 
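    1. Copy the active node's data files in full to the corresponding location on the standby node. Note that the in_use.lock file under the namenode_nfs directory must be deleted.
    2. Then copy the configuration files from the active node to the standby node and start the standby node. The web UI opened normally, but every DataNode was shown as dead.
    Cause: the suspicion is that, because the standby had been unavailable for a long time, the DataNode processes were holding blocks whose reports to the standby had failed. Once the standby restarted, the DataNodes resent from memory all the blocks that had not been reported successfully during that period. The volume was large, so the standby node used a great deal of memory processing them, ran out of memory, and could not finish starting up.
    Fix: it is best to raise the JVM memory settings of the NameNode process.
    In hadoop-env.sh: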
    export HADOOP_HEAPSIZE=20000
    export HADOOP_NAMENODE_INIT_HEAPSIZE="20000"

 

 
    3. After restarting the standby node, the web UI showed that the number of nodes was incomplete, and the log contained the following:
[11:30:05]org.apache.hadoop.hdfs.server.protocol.DisallowedDatanodeException: Datanode denied communication with namenode because hostname cannot be resolved (ip=10.13.32.61, hostname=10.13.32.61): DatanodeRegistration(0.0.0.0, datanodeUuid=e55cab6f-5120-467c-b11b-48be1f984719, infoPort=50075, ipcPort=50020, storageInfo=lv=-56;cid=CID-31a19ab7-910e-44d6-ab79-d3c3b7f09551;nsid=1959994934;c=0)
[11:30:05]        at org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:891)
[11:30:05]        at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4837)
[11:30:05]        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1038)
[11:30:05]        at org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
[11:30:05]        at org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26378)
[11:30:05]        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:587)
[11:30:05]        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1026)
[11:30:05]        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2013)
[11:30:05]        at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2009)
[11:30:05]        at java.security.AccessController.doPrivileged(Native Method)
[11:30:05]        at javax.security.auth.Subject.doAs(Subject.java:415)
[11:30:05]        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
[11:30:05]        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2007)

 

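    Cause: the DataNode entries need to be written into the hosts file on the NameNode. After adding entries for all DataNodes and restarting the standby node, the web UI showed the correct data again.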
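
To find which DataNodes still lack a hosts entry before restarting, a small check along these lines may help. It is only a sketch: the function name missing_hosts and the file datanode_ips.txt (one IP per line) are hypothetical and do not come from the original setup.

```shell
#!/usr/bin/env bash
# Print every IP from the list file that has no entry in the hosts file.
# Arguments: $1 = file with one DataNode IP per line, $2 = hosts file.
missing_hosts() {
  local ip_list=$1 hosts_file=$2 ip
  # The "|| [ -n ... ]" part keeps a last line that has no trailing newline.
  while read -r ip || [ -n "$ip" ]; do
    [ -z "$ip" ] && continue
    # An IP counts as present only when it is the first field of a line,
    # so commented-out entries ("#10.x.x.x ...") are not matched.
    if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$hosts_file"; then
      echo "$ip"
    fi
  done < "$ip_list"
}

# Usage on the NameNode (datanode_ips.txt is a hypothetical list file):
#   missing_hosts datanode_ips.txt /etc/hosts
```

Any IP it prints would still be rejected with the DisallowedDatanodeException shown in the log above until it is added to the hosts file.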
 
    4. Looking at the NameNode log again at this point, there were still errors:
2021-07-29 14:54:41,221 INFO org.apache.hadoop.ipc.Server: IPC Server handler 9 on 8020, call org.apache.hadoop.hdfs.protocol.ClientProtocol.renewLease from 10.77.115.38:55529 Call#6719850 Retry#-1: org.apache.hadoop.ipc.StandbyException: Operation category WRITE is not supported in state standby
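    The error looked like a problem with node switching in the Hadoop cluster, so I restarted zkfc on the standby node. That turned out to trigger an automatic failover: the standby node became the active node and the original active node went down, so I then restarted the NameNode and zkfc processes on that node.
    After some research, I found that in an HA-enabled cluster the DFS client cannot know in advance which NameNode is active at the moment of an operation. So when a client contacts a NameNode that happens to be the standby, the read or write is rejected and this message is logged. The client then automatically contacts the other NameNode and retries the operation. As long as the cluster has one active NameNode and one standby NameNode, this message can safely be ignored.
  These notes are messy because I do not understand the underlying principles. Sigh. Better to know a few things well than many things poorly.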
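
Before dismissing the StandbyException, it is worth confirming that the cluster really has one active and one standby NameNode. A minimal sketch of that check follows; the function name ha_pair_healthy is made up for illustration, while hdfs haadmin -getServiceState is the standard command that prints the state of a NameNode.

```shell
#!/usr/bin/env bash
# Exit status 0 when the two states are exactly one "active" and one
# "standby", the situation in which the StandbyException can be ignored.
ha_pair_healthy() {
  local a=$1 b=$2
  { [ "$a" = active ] && [ "$b" = standby ]; } ||
  { [ "$a" = standby ] && [ "$b" = active ]; }
}

# Usage on a cluster node (needs the hdfs CLI; nn1 and nn2 are the
# NameNode names used in this post):
#   ha_pair_healthy "$(hdfs haadmin -getServiceState nn1)" \
#                   "$(hdfs haadmin -getServiceState nn2)" \
#     && echo "one active, one standby" || echo "check HA state"
```

If a NameNode is unreachable, the state string is empty and the check fails, which is the case where the message should not be ignored.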
 
 
 
 
 
 
posted on 2021-07-29 23:26  镰鼬LL