Hadoop cluster error: Configured Capacity: 0 (0 KB) ... Present Capacity: 0 (0 KB) ... Datanodes available: 0 (0 total, 0 dead), and how I fixed it

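Today I finally got a multi-machine Hadoop cluster set up and was quite pleased with myself, but when I checked the cluster's status with the hadoop dfsadmin -report command, I was dumbfounded: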

Safe mode is ON
Configured Capacity: 0 (0 KB)
Present Capacity: 0 (0 KB)
DFS Remaining: 0 (0 KB)
DFS Used: 0 (0 KB)
DFS Used%: �%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Datanodes available: 0 (0 total, 0 dead)

How could everything be 0? If zero DataNodes are alive, the slaves presumably never started. I hurried over to the slave machines and checked with the jps tool, only to find that DataNode and TaskTracker were running "normally" on the slaves, and that NameNode, JobTracker, and SecondaryNameNode were running normally on the master as well. Still hoping I had simply misread something, I checked the cluster status in the web UI, and the DataNodes really were in trouble. So I opened the tasktracker log under logs on one of the slave machines and found this message:

2012-09-11 19:50:20,119 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: localhost/127.0.0.1:9000. Already tried 9 time(s)

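Why would it connect to localhost? Shouldn't it be the master's IP? I opened conf/core-site.xml on the master and discovered that I had absent-mindedly set fs.default.name to localhost:9000 (it should have been 121.49.110.3:9000, where 121.49.110.3 is the host's IP). I fixed it right away, restarted the cluster, and checked the status again, but the same error still showed up. Very frustrating. After some searching online I learned that the NameNode was in safe mode, so I turned safe mode off with this command: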

<hadoop path>/bin/hadoop dfsadmin -safemode leave
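
As an aside, the fs.default.name mistake described above is the kind of thing a small script can catch before the cluster is started. The following is only a rough sketch in Python, not part of Hadoop: the function names and the sample XML are made up for illustration, and it assumes the file uses the usual `<property><name>…</name><value>…</value></property>` layout of core-site.xml.

```python
import xml.etree.ElementTree as ET

# Host names that only make sense on a single-machine setup.
LOOPBACK_HOSTS = {"localhost", "127.0.0.1"}


def find_default_fs(core_site_xml):
    """Return the value of fs.default.name from core-site.xml text, or None."""
    root = ET.fromstring(core_site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == "fs.default.name":
            return (prop.findtext("value") or "").strip()
    return None


def points_at_loopback(value):
    """True if the host part of the address is a loopback name."""
    # Accept both "host:port" and "hdfs://host:port" forms.
    address = value.split("://", 1)[-1]
    host = address.split(":", 1)[0].split("/", 1)[0]
    return host in LOOPBACK_HOSTS


# Hypothetical sample mirroring the misconfiguration from this post.
SAMPLE = """<configuration>
  <property>
    <name>fs.default.name</name>
    <value>localhost:9000</value>
  </property>
</configuration>"""

if __name__ == "__main__":
    value = find_default_fs(SAMPLE)
    if value is not None and points_at_loopback(value):
        print("fs.default.name points at loopback:", value)
```

To check a real file, read conf/core-site.xml from disk and pass its text to find_default_fs. On a multi-machine cluster a loopback value means each slave looks for the NameNode on itself, which matches the retry message in the log.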

I checked the cluster status again, and a "miracle" happened:

Safe mode is OFF
namenode@namenode:~/hadoop-1.0.3/bin$ ./hadoop dfsadmin -report
Warning: $HADOOP_HOME is deprecated.

Configured Capacity: 2372235571200 (2.16 TB)
Present Capacity: 2171106631680 (1.97 TB)
DFS Remaining: 2171106373632 (1.97 TB)
DFS Used: 258048 (252 KB)
DFS Used%: 0%
Under replicated blocks: 1
Blocks with corrupt replicas: 0
Missing blocks: 1

-------------------------------------------------
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.0.3:50010
Decommission Status : Normal
Configured Capacity: 952717885440 (887.29 GB)
DFS Used: 86016 (84 KB)
Non DFS Used: 79567196160 (74.1 GB)
DFS Remaining: 873150603264(813.18 GB)
DFS Used%: 0%
DFS Remaining%: 91.65%
Last contact: Tue Sep 11 20:07:39 CST 2012


Name: 192.168.0.2:50010
Decommission Status : Normal
Configured Capacity: 466824966144 (434.76 GB)
DFS Used: 86016 (84 KB)
Non DFS Used: 48297947136 (44.98 GB)
DFS Remaining: 418526932992(389.78 GB)
DFS Used%: 0%
DFS Remaining%: 89.65%
Last contact: Tue Sep 11 20:07:39 CST 2012
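It actually worked. That said, I built the cluster on 4 machines, each with a disk of only about 400 GB, yet the report claims I have 2 TB, so there is clearly something I still haven't figured out. Still, I'm happy to see this result. One more tip: turn off the firewall on the machines while the cluster is running, otherwise you get all sorts of inexplicable errors.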
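
For what it's worth, the 2 TB figure can at least be cross-checked against the per-node lines in the report. The sketch below is my own arithmetic in Python, not anything Hadoop provides. It assumes the cluster-wide Configured Capacity is the sum of the DataNodes' Configured Capacity values, and since the report says 3 DataNodes but only two entries appear above, the third node's size is inferred by subtraction rather than read from the output. The helper names are made up.

```python
import re

# Only the lines needed for the arithmetic, copied from the report above.
REPORT = """\
Configured Capacity: 2372235571200 (2.16 TB)
Datanodes available: 3 (3 total, 0 dead)

Name: 192.168.0.3:50010
Configured Capacity: 952717885440 (887.29 GB)

Name: 192.168.0.2:50010
Configured Capacity: 466824966144 (434.76 GB)
"""


def configured_capacities(report):
    """All 'Configured Capacity' byte counts, in order of appearance."""
    pattern = r"^Configured Capacity: (\d+)"
    return [int(n) for n in re.findall(pattern, report, re.MULTILINE)]


def to_gb(n_bytes):
    """Convert bytes to GB the way the report does (1 GB = 1024**3 bytes)."""
    return round(n_bytes / 1024**3, 2)


# The first match is the cluster-wide total, the rest are per-node values.
cluster_total, *listed_nodes = configured_capacities(REPORT)

# Assumption: whatever the listed nodes do not account for belongs
# to the DataNode whose entry is not shown in the post.
unlisted = cluster_total - sum(listed_nodes)

if __name__ == "__main__":
    print("listed nodes (GB):", [to_gb(n) for n in listed_nodes])
    print("inferred third node (GB):", to_gb(unlisted))
```

Under that assumption the total is consistent with the per-node lines: two nodes of roughly 887 GB plus one of roughly 435 GB. So the place to look is why individual nodes report about 887 GB, not the cluster-wide sum.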

posted on 2012-09-11 20:38  傲视天下3314