Summary of problems encountered with Hadoop

Problem 1

Running hadoop fs -ls produces the following error:

# hadoop fs -ls  

11/08/31 22:51:39 INFO ipc.Client: Retrying connect to server: localhost/127.0.0.1:8020. Already tried 0 time(s).  

Bad connection to FS. command aborted. 
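Before formatting anything, it is worth first checking whether a NameNode is actually listening on the RPC port from the log (127.0.0.1:8020 here). A minimal probe, assuming bash and taking the port from the message above:

```shell
# Probe the NameNode RPC port seen in the retry message (127.0.0.1:8020).
# Uses bash's built-in /dev/tcp redirection, so no extra tools are required.
port=8020
if (exec 3<>/dev/tcp/127.0.0.1/$port) 2>/dev/null; then
  echo "port $port is open: something is listening (likely the NameNode)"
else
  echo "port $port is closed: the NameNode is not running"
fi
```

If the port is closed, the daemons simply are not up; formatting is only needed when the NameNode itself will not start.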

 

Solution:

1. Format the namenode (note: this wipes all existing HDFS data, so only do it on a fresh or disposable cluster):

# hadoop namenode -format 

 

2. Restart Hadoop:

# sh stop-all.sh

# sh start-all.sh

 

3. Check the background processes:

# jps

13508 NameNode

11008 SecondaryNameNode

14393 Jps

11096 JobTracker

The namenode is now running.

 

4. Run the command again:

# hadoop fs -ls

12/01/31 14:04:39 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

12/01/31 14:04:39 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

Found 1 items

drwxr-xr-x   - root supergroup          0 2012-01-31 13:57 /user/root/test

 


Problem 2

Running hadoop fs -put ../conf input produces the following error:

12/01/31 16:01:25 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

12/01/31 16:01:25 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

12/01/31 16:01:26 WARN hdfs.DFSClient: DataStreamer Exception: java.io.IOException: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1

put: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1

12/01/31 16:01:26 ERROR hdfs.DFSClient: Exception closing file /user/root/input/ssl-server.xml.example : java.io.IOException: File /user/root/input/ssl-server.xml.example could only be replicated to 0 nodes, instead of 1

Solution:

This error occurs because no datanodes have joined the cluster. The daemons need to be started in order: namenode first, then datanode, then jobtracker and tasktracker. Started that way, the problem does not appear. The workaround used here is to start the daemons individually:

# hadoop-daemon.sh start namenode

# hadoop-daemon.sh start datanode

1. Restart the namenode:

# hadoop-daemon.sh stop namenode

stopping namenode

# hadoop-daemon.sh start namenode

starting namenode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-namenode-www.keli.com.out

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

 

2. Restart the datanode:

# hadoop-daemon.sh stop datanode

stopping datanode

# hadoop-daemon.sh start datanode

starting datanode, logging to /usr/hadoop-0.21.0/bin/../logs/hadoop-root-datanode-www.keli.com.out

DEPRECATED: Use of this script to execute hdfs command is deprecated.

Instead use the hdfs command for it.

 

3. Change to Hadoop's bin directory:

# cd /usr/hadoop-0.21.0/bin/

 

4. List the HDFS home directory:

[root@www bin]# hadoop fs -ls

12/01/31 16:09:45 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

12/01/31 16:09:45 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

Found 4 items

drwxr-xr-x   - root supergroup          0 2012-01-31 16:01 /user/root/input

drwxr-xr-x   - root supergroup          0 2012-01-31 15:24 /user/root/test

-rw-r--r--   1 root supergroup          0 2012-01-31 14:37 /user/root/test-in

drwxr-xr-x   - root supergroup          0 2012-01-31 14:32 /user/root/test1

 

5. Delete the input directory from HDFS:

[root@www bin]# hadoop fs -rmr input

12/01/31 16:10:09 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

12/01/31 16:10:09 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

Deleted hdfs://m106:9000/user/root/input

 

6. Upload the data to the input directory in HDFS:

[root@www bin]# hadoop fs -put ../conf input

12/01/31 16:10:14 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

12/01/31 16:10:14 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

 

7. List the input directory to verify the uploaded files:

[root@www bin]# hadoop fs -ls input

12/01/31 16:10:21 INFO security.Groups: Group mapping impl=org.apache.hadoop.security.ShellBasedUnixGroupsMapping; cacheTimeout=300000

12/01/31 16:10:21 WARN conf.Configuration: mapred.task.id is deprecated. Instead, use mapreduce.task.attempt.id

Found 16 items

-rw-r--r--   1 root supergroup       3426 2012-01-31 16:10 /user/root/input/capacity-scheduler.xml

-rw-r--r--   1 root supergroup       1335 2012-01-31 16:10 /user/root/input/configuration.xsl

-rw-r--r--   1 root supergroup        757 2012-01-31 16:10 /user/root/input/core-site.xml

-rw-r--r--   1 root supergroup        321 2012-01-31 16:10 /user/root/input/fair-scheduler.xml

-rw-r--r--   1 root supergroup       2237 2012-01-31 16:10 /user/root/input/hadoop-env.sh

-rw-r--r--   1 root supergroup       1650 2012-01-31 16:10 /user/root/input/hadoop-metrics.properties

-rw-r--r--   1 root supergroup       4644 2012-01-31 16:10 /user/root/input/hadoop-policy.xml

-rw-r--r--   1 root supergroup        252 2012-01-31 16:10 /user/root/input/hdfs-site.xml

-rw-r--r--   1 root supergroup       4141 2012-01-31 16:10 /user/root/input/log4j.properties

-rw-r--r--   1 root supergroup       2997 2012-01-31 16:10 /user/root/input/mapred-queues.xml

-rw-r--r--   1 root supergroup        430 2012-01-31 16:10 /user/root/input/mapred-site.xml

-rw-r--r--   1 root supergroup         25 2012-01-31 16:10 /user/root/input/masters

-rw-r--r--   1 root supergroup         26 2012-01-31 16:10 /user/root/input/slaves

-rw-r--r--   1 root supergroup       1243 2012-01-31 16:10 /user/root/input/ssl-client.xml.example

-rw-r--r--   1 root supergroup       1195 2012-01-31 16:10 /user/root/input/ssl-server.xml.example

-rw-r--r--   1 root supergroup        250 2012-01-31 16:10 /user/root/input/taskcontroller.cfg

[root@www bin]#


Problem 3

Starting the datanode fails with "Unrecognized option: -jvm" and "Could not create the Java virtual machine."

[root@www bin]# hadoop-daemon.sh start datanode

starting datanode, logging to /usr/hadoop-0.20.203.0/bin/../logs/hadoop-root-datanode-www.keli.com.out

Unrecognized option: -jvm

Could not create the Java virtual machine.

 

Solution:

The bin/hadoop script in the Hadoop installation directory contains the following shell fragment:

CLASS='org.apache.hadoop.hdfs.server.datanode.DataNode'

  if [[ $EUID -eq 0 ]]; then

    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"

  else

    HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"

  fi

The branch of interest is:

if [[ $EUID -eq 0 ]]; then

    HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"

So what does it mean for $EUID to be 0?

The effective user ID (EUID) is the identity used to assign ownership to newly created processes, to check file access permissions, and to check permission to send signals to a process via the kill system call.

Running echo $EUID as root prints 0.

So under root the -jvm option gets added, which is exactly where the "Unrecognized option: -jvm" error above comes from.
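The branch can be reproduced outside the hadoop script. A minimal sketch of the same EUID test (HADOOP_DATANODE_OPTS is left empty here as a placeholder; the real script sets it elsewhere):

```shell
# Reproduce the branch from bin/hadoop: under root (EUID 0) the script adds
# the bad "-jvm server" option; under any other user it adds plain "-server".
HADOOP_OPTS=""
HADOOP_DATANODE_OPTS=""   # placeholder; the real script sets this elsewhere
if [[ $EUID -eq 0 ]]; then
  HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
else
  HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
fi
echo "$HADOOP_OPTS"
```

Run as root, this prints the "-jvm server" string that the JVM then rejects.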

 

That suggests two fixes. One is to edit the shell script and remove the -jvm option. The other is that since the branch only fires when $EUID is 0, simply run as a non-root user instead. I tried the second fix first: switching to an ordinary user and following the documentation worked. Out of curiosity I then tried the first fix as well. I did not really want to touch the source, but tweaking this script does no harm, so I removed the -jvm option by replacing the whole if/else structure with:

 

HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"

and that also ran successfully.
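For reference, the same edit can be scripted with sed. This sketch runs on a throwaway copy rather than the real script (in this install the real file would be /usr/hadoop-0.20.203.0/bin/hadoop):

```shell
# Demonstrate the patch on a temp copy instead of the real bin/hadoop.
cat > /tmp/hadoop-opts-snippet <<'EOF'
HADOOP_OPTS="$HADOOP_OPTS -jvm server $HADOOP_DATANODE_OPTS"
EOF

# Drop the bogus "-jvm" so root gets the same "-server" flag as everyone else.
sed -i 's/-jvm server/-server/' /tmp/hadoop-opts-snippet
cat /tmp/hadoop-opts-snippet
# prints: HADOOP_OPTS="$HADOOP_OPTS -server $HADOOP_DATANODE_OPTS"
```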


Problem 4

[root@www bin]# jps

3283 NameNode

2791 SecondaryNameNode

2856 JobTracker

3348 Jps

Hadoop did not start the datanode.

 

Solution:

After a format, the datanode still holds the ID recorded before the format. Because that stale ID is never removed, the datanode is refused when it tries to connect and register with the current namenode. The fix is to delete the old HDFS data directory on the datanode (this also deletes the block data stored there, which is acceptable on a test cluster).
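What actually mismatches is the namespaceID stored in each node's VERSION file (in a real cluster something like ${dfs.name.dir}/current/VERSION and ${dfs.data.dir}/current/VERSION; the exact paths depend on configuration). A simulated sketch of the check, with made-up IDs:

```shell
# Simulate the state after a namenode re-format: the namenode's VERSION file
# carries a fresh namespaceID while the datanode still holds the stale one.
mkdir -p /tmp/dfs/name/current /tmp/dfs/data/current
echo "namespaceID=123456789" > /tmp/dfs/name/current/VERSION   # new ID after format
echo "namespaceID=987654321" > /tmp/dfs/data/current/VERSION   # stale pre-format ID

nn=$(cut -d= -f2 /tmp/dfs/name/current/VERSION)
dn=$(cut -d= -f2 /tmp/dfs/data/current/VERSION)
if [ "$nn" != "$dn" ]; then
  echo "namespaceID mismatch: the datanode will fail to register"
fi
# prints: namespaceID mismatch: the datanode will fail to register
```

Wiping the datanode's storage directory (as below) clears the stale ID. An alternative that keeps the block data is to edit the stale namespaceID in the datanode's VERSION file to match the namenode's, but on a throwaway cluster deleting is simpler.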

[root@freepp ~]# rm -rf /hadoopdata/

Restart Hadoop:

[root@www bin]# jps

4132 Jps

3907 NameNode

4056 DataNode

2791 SecondaryNameNode

2856 JobTracker
posted @ 2012-11-24 16:50 蜗牛123