HBase startup fails with FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state : blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5. state=PENDING_OPEN

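Through a careless operation I deleted an HBase table directly under HDFS, while HBase's '.META.' table still held that table's information. As a result, HMaster went down every time HBase was started. Looking at the hbase-hadoop-master-Master.Hadoop.log file on the master node, I found: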

2014-04-25 10:35:27,007 FATAL org.apache.hadoop.hbase.master.HMaster: Unexpected state : blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5. state=PENDING_OPEN, ts=1398393327006, server=Slave1.Hadoop,60020,1398393314428 .. Cannot transit it to OFFLINE.
java.lang.IllegalStateException: Unexpected state : blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5. state=PENDING_OPEN, ts=1398393327006, server=Slave1.Hadoop,60020,1398393314428 .. Cannot transit it to OFFLINE.
at org.apache.hadoop.hbase.master.AssignmentManager.setOfflineInZooKeeper(AssignmentManager.java:1879)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1688)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1424)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1399)
at org.apache.hadoop.hbase.master.AssignmentManager.assign(AssignmentManager.java:1394)
at org.apache.hadoop.hbase.master.handler.ClosedRegionHandler.process(ClosedRegionHandler.java:105)
at org.apache.hadoop.hbase.executor.EventHandler.run(EventHandler.java:175)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)

In the hbase-hadoop-regionserver-Slave1.Hadoop.log file on the slave1 node, I found:

2014-04-25 09:00:14,794 DEBUG org.apache.hadoop.hbase.util.FSTableDescriptors: Exception during readTableDecriptor. Current table name = blog
org.apache.hadoop.hbase.TableInfoMissingException: No .tableinfo file under hdfs://192.168.1.2:9000/hbase/blog
at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:417)
at org.apache.hadoop.hbase.util.FSTableDescriptors.getTableDescriptorModtime(FSTableDescriptors.java:409)
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:164)
at org.apache.hadoop.hbase.util.FSTableDescriptors.get(FSTableDescriptors.java:127)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2959)
at org.apache.hadoop.hbase.regionserver.HRegionServer.openRegion(HRegionServer.java:2922)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
at java.lang.reflect.Method.invoke(Method.java:601)
at org.apache.hadoop.hbase.ipc.WritableRpcEngine$Server.call(WritableRpcEngine.java:320)
at org.apache.hadoop.hbase.ipc.HBaseServer$Handler.run(HBaseServer.java:1426)
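
If more than one table has been removed from HDFS this way, it may help to pull the affected table directories out of the regionserver log instead of reading it by eye. The following is only a minimal Python sketch and was not part of the original troubleshooting: the function name `missing_table_dirs` is hypothetical, and the regular expression assumes the exception message has exactly the format shown in the log above.

```python
import re

# Assumption: the message format matches the regionserver log excerpt above.
MISSING_TABLEINFO = re.compile(
    r"TableInfoMissingException: No \.tableinfo file under (\S+)"
)


def missing_table_dirs(log_text):
    """Return the distinct table directories reported as missing,
    in order of first appearance in the log text."""
    seen = []
    for match in MISSING_TABLEINFO.finditer(log_text):
        path = match.group(1)
        if path not in seen:
            seen.append(path)
    return seen


# Sample line copied from the log above.
sample = (
    "org.apache.hadoop.hbase.TableInfoMissingException: "
    "No .tableinfo file under hdfs://192.168.1.2:9000/hbase/blog"
)

for table_dir in missing_table_dirs(sample):
    print(table_dir)
```

In practice the whole log file would be read and passed in as `log_text`; the sample line here only stands in for it.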

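So I concluded that the cause was that HBase's '.META.' table still stored the information of the deleted table. I started the master node on its own and then deleted the leftover information from the '.META.' table: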


hbase(main):054:0> delete '.META.','blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5.','info:regioninfo'
0 row(s) in 0.0060 seconds

hbase(main):055:0> delete '.META.','blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5.','info:server'
0 row(s) in 0.0050 seconds

hbase(main):056:0> delete '.META.','blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5.','info:serverstartcode'
0 row(s) in 0.0060 seconds
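
The three delete commands above differ only in the column name, and the row key is the region name printed in the FATAL line of the master log, so the commands can be generated from that line. This is a minimal Python sketch under stated assumptions: the region name contains no spaces or single quotes, and the catalog table is named '.META.' as in this HBase version. `delete_commands` is a hypothetical helper; it only builds text to paste into hbase shell and does not talk to HBase itself.

```python
import re

# Assumption: the region name follows "Unexpected state : " and contains no spaces.
UNEXPECTED_STATE = re.compile(r"Unexpected state : (\S+) state=(\w+)")

# The three columns removed from '.META.' in the session above.
META_COLUMNS = ("info:regioninfo", "info:server", "info:serverstartcode")


def delete_commands(log_line, meta_table=".META."):
    """Build one hbase shell delete command per column for the region
    named in a master-log FATAL line; return [] if the line does not match."""
    match = UNEXPECTED_STATE.search(log_line)
    if match is None:
        return []
    region = match.group(1)
    return [
        "delete '%s','%s','%s'" % (meta_table, region, column)
        for column in META_COLUMNS
    ]


# FATAL line copied from the master log above (shortened after the state).
line = (
    "2014-04-25 10:35:27,007 FATAL org.apache.hadoop.hbase.master.HMaster: "
    "Unexpected state : blog,,1384910214217.e0be69e4d19be9dacb4e4df3416d51e5. "
    "state=PENDING_OPEN, ts=1398393327006"
)

for command in delete_commands(line):
    print(command)
```

Reviewing the generated commands before pasting them is advisable, since a delete against the catalog table cannot be undone from the shell.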

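It is worth noting that I was fairly lucky here: for hbase shell to work, HBase needs the slave1 node to be running (the '.META.' table's data is presumably stored on slave1), and there happened to be no problem with the connection between master and slave1, which is the only reason this could be resolved smoothly!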

posted on 2014-04-25 11:50 by 因胖被判变瘦