HRegionServer on some HBase cluster nodes shuts down automatically after startup
Reference link
http://f.dataguru.cn/thread-209058-1-1.html
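I have 4 HRegionServer nodes and 1 master; 3 of the machines run Ubuntu and 2 run CentOS 6.5.
Startup looks completely normal, but after a while the HRegionServer on Slave3 shuts down by itself.
The output of tail -n100 hbase-hadoop-regionserver-Slave3.log is as follows: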
2015-07-04 16:18:52,761 WARN [regionserver/Slave3/192.168.2.38:16020] zookeeper.RecoverableZooKeeper: Possibly transient ZooKeeper, quorum=Master:2181,Slave1:2181,Slave2:2181,Slave3:2181,Slavrg.apache.zookeeper.KeeperException$OperationTimeoutException: KeeperErrorCode = OperationTimeout
2015-07-04 16:18:52,761 ERROR [regionserver/Slave3/192.168.2.38:16020] zookeeper.RecoverableZooKeeper: ZooKeeper delete failed after 4 attempts
2015-07-04 16:18:52,762 WARN [regionserver/Slave3/192.168.2.38:16020] regionserver.HRegionServer: Failed deleting my ephemeral node
org.apache.zookeeper.KeeperException$OperationTimeoutException: KeeperErrorCode = OperationTimeout
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.checkZk(RecoverableZooKeeper.java:145)
    at org.apache.hadoop.hbase.zookeeper.RecoverableZooKeeper.delete(RecoverableZooKeeper.java:179)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1347)
    at org.apache.hadoop.hbase.zookeeper.ZKUtil.deleteNode(ZKUtil.java:1336)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.deleteMyEphemeralNode(HRegionServer.java:1391)
    at org.apache.hadoop.hbase.regionserver.HRegionServer.run(HRegionServer.java:1074)
    at java.lang.Thread.run(Thread.java:745)
2015-07-04 16:18:52,767 INFO [regionserver/Slave3/192.168.2.38:16020] regionserver.HRegionServer: stopping server Slave3,16020,1435997816385; zookeeper connection close
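Adjusting the system time solved the problem. The referenced content is copied below:
2. The problem is caused by the nodes' clocks being out of sync. It can be fixed as follows:
1) In hbase-site.xml, add or modify the following property to raise the allowed clock skew (the value is in milliseconds):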
<property>
<name>hbase.master.maxclockskew</name>
<value>150000</value>
</property>
2) Change the system time so that all nodes agree (this is the recommended method):
Set the date
date -s 11/23/2013
Set the time
date -s 15:14:00
Check the hardware (CMOS) clock
clock -r
Write the system time to the CMOS clock
clock -w
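Before changing any clock by hand, it can help to see how far apart the nodes actually are. The following is only a rough sketch, not part of the original fix: skew_ms and check_skew are hypothetical helper names, it assumes passwordless ssh from the current machine to each node, and it works at one-second resolution with ssh latency included, so small differences are just noise.

```shell
#!/bin/sh
# Hypothetical helper: absolute difference between two epoch timestamps
# (in seconds), printed in milliseconds.
skew_ms() (
  d=$(( $1 - $2 ))
  if [ "$d" -lt 0 ]; then
    d=$(( 0 - d ))
  fi
  echo $(( d * 1000 ))
)

# Hypothetical helper: compare each host's clock with the local clock.
# Usage: check_skew MAX_MS HOST...
# Assumes passwordless ssh to every host; ssh latency is part of the
# measurement, so treat the numbers as approximate.
check_skew() (
  max_ms=$1
  shift
  for host in "$@"; do
    if ! remote=$(ssh "$host" date +%s); then
      echo "$host: unreachable"
      continue
    fi
    ms=$(skew_ms "$remote" "$(date +%s)")
    if [ "$ms" -gt "$max_ms" ]; then
      echo "$host: skew of about ${ms} ms exceeds ${max_ms} ms"
    else
      echo "$host: ok (about ${ms} ms)"
    fi
  done
)
```

For example, running check_skew 150000 Slave1 Slave2 Slave3 on the master would compare each slave against the master's clock, using the same limit as the hbase.master.maxclockskew value shown above. The host names are the ones that appear in the log; substitute your own.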
3. After the change, it is enough to start the HRegionServer node on its own:
Start all regionservers in the cluster
./hbase-daemons.sh start regionserver
Start a single regionserver (run this on that node)
./hbase-daemon.sh start regionserver
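In practice it is best to shut down both HBase and Hadoop and then restart them; only after that could I see the result in the browser at http://192.168.2.35:16010/.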
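Since the original symptom was a regionserver that started fine and then died a little later, a single check right after startup is not enough. Below is a minimal sketch for watching the process from the command line on the affected node. It assumes the JDK's jps tool is on the PATH; has_regionserver and regionserver_stays_up are hypothetical helper names, not HBase commands.

```shell
#!/bin/sh
# Reads `jps` output on stdin; succeeds if an HRegionServer JVM is listed.
has_regionserver() {
  grep -q ' HRegionServer$'
}

# Hypothetical helper: check CHECKS times, INTERVAL seconds apart, and
# fail as soon as the HRegionServer process has disappeared.
# Assumes the JDK's `jps` is on the PATH of the node being checked.
regionserver_stays_up() (
  checks=$1
  interval=$2
  i=0
  while [ "$i" -lt "$checks" ]; do
    if ! jps | has_regionserver; then
      echo "HRegionServer not running (check $i)"
      exit 1
    fi
    sleep "$interval"
    i=$(( i + 1 ))
  done
  echo "HRegionServer still running after $checks checks"
)
```

Running regionserver_stays_up 10 30 on Slave3, for instance, would watch the process for about five minutes; if it disappears again, the regionserver log is the place to look.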