Hadoop 2.8.0 Distributed Installation Manual
Contents
3.5. OOM-related: vm.overcommit_memory
8.8.1. dfs.namenode.rpc-address
10.11.7. HDFS allows only one active and one standby NameNode
10.11.8. Storage balancing with start-balancer.sh
12.2.3. yarn rmadmin -getServiceState rm1
12.2.4. yarn rmadmin -transitionToStandby rm1
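1. Preface
Current Hadoop releases have removed the single points of failure in HDFS, YARN, and HBase, and they support automatic active/standby failover.
This document gives detailed installation instructions for Hadoop 2.8.0, the latest release at the time of writing, to reduce the difficulties met during installation and to explain the causes of some common errors. HDFS is configured for HA based on QJM (Quorum Journal Manager). The installation covers only hadoop-common, hadoop-hdfs, hadoop-mapreduce, and hadoop-yarn; it does not include HBase, Hive, or Pig.
The NameNode stores which blocks make up a file, but it does not persist which DataNodes hold those blocks; each DataNode reports the blocks it has. If the NameNode web UI shows "missing" blocks, no DataNode has reported them, so they are counted as lost.
2. Release history and features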
Version | Release date | New features |
3.0.0 | | Support for multiple NameNodes |
2.7.2 | 2016/1/25 | |
2.7.1 | 2015/7/6 | |
2.7.0 | 2015/4/21 | 1) JDK 6 no longer supported, JDK 7+ required 2) File truncate 3) Quotas per storage type 4) Variable-length blocks in a file (previously the block size was fixed, 64M by default) 5) Windows Azure Storage 6) Pluggable YARN authorization 7) Automatically shared, globally cached YARN localized resources (beta) 8) Limit on the number of running Map/Reduce tasks of a job 9) Faster FileOutputCommitter for large jobs with many output files |
2.6.4 | 2016/2/11 | |
2.6.3 | 2015/12/17 | |
2.6.2 | 2015/10/28 | |
2.6.1 | 2015/9/23 | |
2.6.0 | 2014/11/18 | 1) YARN support for long-running services 2) YARN rolling upgrades (with rollback) 3) YARN applications can run in Docker containers |
2.5.2 | 2014/11/19 | |
2.5.1 | 2014/9/12 | |
2.5.0 | 2014/8/11 | |
2.4.1 | 2014/6/30 | |
2.4.0 | 2014/4/7 | 1) HDFS rolling upgrades (with rollback) 2) Full https support in HDFS 3) Automatic failover for the YARN ResourceManager |
2.2.0 | 2013/10/15 | 1) HDFS Federation 2) HDFS Snapshots |
2.1.0-beta | 2013/8/25 | 1) HDFS snapshots 2) Windows support |
2.0.3-alpha | 2013/2/14 | 1) QJM-based NameNode HA |
2.0.0-alpha | 2012/5/23 | 1) NameNode HA with manual failover 2) HDFS Federation |
1.0.0 | 2011/12/27 | |
0.23.11 | 2014/6/27 | |
0.23.10 | 2013/12/11 | |
0.22.0 | 2011/12/10 | |
0.23.0 | 2011/11/17 | |
0.20.205.0 | 2011/10/17 | |
0.20.204.0 | 2011/9/5 | |
0.20.203.0 | 2011/5/11 | |
0.21.0 | 2010/8/23 | |
0.20.2 | 2010/2/26 | |
0.20.1 | 2009/9/14 | |
0.19.2 | 2009/7/23 | |
0.20.0 | 2009/4/22 | |
0.19.1 | 2009/2/24 | |
0.18.3 | 2009/1/29 | |
0.19.0 | 2008/11/21 | |
0.18.2 | 2008/11/3 | |
0.18.1 | 2008/9/17 | |
0.18.0 | 2008/8/22 | |
0.17.2 | 2008/8/19 | |
0.17.1 | 2008/6/23 | |
0.17.0 | 2008/5/20 | |
0.16.4 | 2008/5/5 | |
0.16.3 | 2008/4/16 | |
0.16.2 | 2008/4/2 | |
0.16.1 | 2008/3/13 | |
0.16.0 | 2008/2/7 | |
0.15.3 | 2008/1/18 | |
0.15.2 | 2008/1/2 | |
0.15.1 | 2007/11/27 | |
0.14.4 | 2007/11/26 | |
0.15.0 | 2007/10/29 | |
0.14.3 | 2007/10/19 | |
0.14.1 | 2007/9/4 | |
For the full list, see http://hadoop.apache.org/releases.html.
3. Deployment
The batch tools mooon_ssh, mooon_upload, and mooon_download are recommended for installation and deployment because they speed up repeated operations (https://github.com/eyjian/mooon/tree/master/mooon/tools). They are built with CMake and depend on two libraries, OpenSSL (https://www.openssl.org/) and libssh2 (http://www.libssh2.org); libssh2 itself also depends on OpenSSL.
3.1. Machine list
Five machines are used in total (ZooKeeper is deployed on all five), laid out as follows:
Role | Machines |
NameNode | 10.148.137.143, 10.148.137.204 |
JournalNode | 10.148.137.143, 10.148.137.204, 10.148.138.11 |
DataNode | 10.148.138.11, 10.148.140.14, 10.148.140.15 |
ZooKeeper | 10.148.137.143, 10.148.137.204, 10.148.138.11, 10.148.140.14, 10.148.140.15 |
3.2. Hostnames
Machine IP | Hostname |
10.148.137.143 | hadoop-137-143 |
10.148.137.204 | hadoop-137-204 |
10.148.138.11 | hadoop-138-11 |
10.148.140.14 | hadoop-140-14 |
10.148.140.15 | hadoop-140-15 |
Note that a hostname must not contain underscores; otherwise the SecondaryNameNode reports the error below at startup (taken from the file hadoop-hadoop-secondarynamenode-VM_39_166_sles10_64.out):
Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/hadoop/hadoop-2.8.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now.
It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'.
Exception in thread "main" java.lang.IllegalArgumentException: The value of property bind.address must not be null
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:88)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:971)
    at org.apache.hadoop.conf.Configuration.set(Configuration.java:953)
    at org.apache.hadoop.http.HttpServer2.initializeWebServer(HttpServer2.java:391)
    at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:344)
    at org.apache.hadoop.http.HttpServer2.<init>(HttpServer2.java:104)
    at org.apache.hadoop.http.HttpServer2$Builder.build(HttpServer2.java:292)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.initialize(SecondaryNameNode.java:264)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.<init>(SecondaryNameNode.java:192)
    at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.main(SecondaryNameNode.java:651)
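The underscore rule can be checked before any daemon is started. The following is a minimal sketch, not part of Hadoop: the function name hostname_ok is hypothetical, and the script only inspects the name returned by the standard hostname command.

```shell
# Return 0 when the given hostname contains no underscore, 1 otherwise.
hostname_ok() {
    case "$1" in
        *_*) return 1 ;;
        *)   return 0 ;;
    esac
}

# Check the local machine; fall back to a placeholder if `hostname` is unavailable.
name=$(hostname 2>/dev/null || echo unknown)
if hostname_ok "$name"; then
    echo "hostname '$name' is usable"
else
    echo "hostname '$name' contains '_' and should be renamed"
fi
```

Running it on every node before the first start catches names such as VM_39_166_sles10_64 early.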
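3.2.1. Changing the hostname temporarily
The hostname command not only shows the hostname but can also change it. The form is: hostname NEW_HOSTNAME.
Before the change, 172.25.40.171 had the hostname VM_40_171_sles10_64 and 172.25.39.166 had the hostname VM_39_166_sles10_64. Both contain underscores, so both must be changed. To keep things simple, only the underscores are replaced with hyphens:
hostname VM-40-171-sles10-64
hostname VM-39-166-sles10-64
This change alone is not enough. As with environment variables, it must also be made permanent by editing a system configuration file.
3.2.2. Changing the hostname permanently
The configuration file differs between Linux distributions. On SuSE 10.1 it is /etc/HOSTNAME:
# cat /etc/HOSTNAME
VM_39_166_sles10_64
Change "VM_39_166_sles10_64" in that file to "VM-39-166-sles10-64". On some distributions the file is /etc/hostname, on others /etc/sysconfig/network.
The format may differ as well as the file. Some use name/value pairs; /etc/sysconfig/network, for example, uses the form HOSTNAME=hostname.
After editing, restart the network service so that the change takes effect by running /etc/rc.d/boot.localnet start (this is the SuSE method; the command differs between systems). Running hostname again then shows the new name.
Rebooting the system also applies the change.
After changing the hostname, verify passwordless ssh login again with: ssh USER@NEW_HOSTNAME.
The hostname can be inspected in several places:
1) The hostname command (it can also set the hostname, but only until the next reboot)
2) cat /proc/sys/kernel/hostname
3) cat /etc/hostname or cat /etc/sysconfig/network (permanent change, applied after a restart)
4) sysctl kernel.hostname (it can also set the hostname, but only until the next reboot)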
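3.3. Scope of passwordless login
Passwordless login must work both by IP and by hostname:
1) Each NameNode can log in to every DataNode without a password
2) Each NameNode can log in to itself without a password
3) The NameNodes can log in to each other without a password
4) Each DataNode can log in to itself without a password
5) DataNodes do not need passwordless login to the NameNodes or to other DataNodes.
Note: passwordless login is not mandatory if scripts that rely on ssh and scp, such as hadoop-daemons.sh, are not used.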
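3.4. Raising the maximum number of open files
Edit /etc/security/limits.conf and add the following two lines:
* soft nofile 102400
* hard nofile 102400
# End of file
Here 102400 is the maximum number of files a single process may open. Choose a suitable value when a process holds many connections or open files.
The change takes effect only after logging in again. For jobs started from crontab, restart the cron service, for example service crond restart (on some platforms service cron restart).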
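Whether the new limit is active in the current login session can be checked with ulimit. The sketch below is illustrative only; the function name nofile_at_least is hypothetical and 102400 is the value used in this document.

```shell
# Succeed when the soft limit on open files in this shell is at least $1.
nofile_at_least() {
    current=$(ulimit -n)
    [ "$current" = "unlimited" ] && return 0
    [ "$current" -ge "$1" ]
}

if nofile_at_least 102400; then
    echo "open-file limit is sufficient: $(ulimit -n)"
else
    echo "open-file limit is too low: $(ulimit -n) (want 102400)"
fi
```

The check reflects only the shell it runs in, so it should be run as the user that starts the Hadoop daemons, after a fresh login.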
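3.5. OOM-related: vm.overcommit_memory
If /proc/sys/vm/overcommit_memory is 0, the kernel uses heuristic overcommit and may refuse large allocations, which can surface as out-of-memory failures. Setting it to 1 makes the kernel always allow overcommit. It is set the same way as other kernel parameters such as net.core.somaxconn, that is, through sysctl.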
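As a hedged illustration, the script below reads the current setting and prints what it means according to the Linux kernel documentation. The function name describe_overcommit is hypothetical; the commands that change the value are shown only as comments because they require root.

```shell
# Map a vm.overcommit_memory value to its meaning in the kernel documentation.
describe_overcommit() {
    case "$1" in
        0) echo "heuristic overcommit (kernel default)" ;;
        1) echo "always overcommit" ;;
        2) echo "strict accounting, no overcommit beyond the limit" ;;
        *) echo "unknown value: $1" ;;
    esac
}

f=/proc/sys/vm/overcommit_memory
if [ -r "$f" ]; then
    describe_overcommit "$(cat "$f")"
fi
# To change it as root (not executed here):
#   sysctl -w vm.overcommit_memory=1
#   echo 'vm.overcommit_memory = 1' >> /etc/sysctl.conf   # persist across reboots
```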
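4. Conventions
4.1. Installation directories
For ease of explanation, this document assumes the following installation directories for Hadoop and the JDK:
Component | Installation directory | Version | Notes |
JDK | /data/jdk | 1.7.0 | ln -s /data/jdk1.7.0_55 /data/jdk |
Hadoop | /data/hadoop/hadoop | 2.8.0 | ln -s /data/hadoop/hadoop-2.8.0 /data/hadoop/hadoop |
Adjust these to your environment in a real deployment.
4.2. Service ports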
Port | Purpose |
9000 | fs.defaultFS, for example hdfs://172.25.40.171:9000 |
9001 | dfs.namenode.rpc-address; DataNodes connect to this port |
50070 | dfs.namenode.http-address |
50470 | dfs.namenode.https-address |
50100 | dfs.namenode.backup.address |
50105 | dfs.namenode.backup.http-address |
50090 | dfs.namenode.secondary.http-address, for example 172.25.39.166:50090 |
50091 | dfs.namenode.secondary.https-address, for example 172.25.39.166:50091 |
50020 | dfs.datanode.ipc.address |
50075 | dfs.datanode.http.address |
50475 | dfs.datanode.https.address |
50010 | dfs.datanode.address, the DataNode data transfer port |
8480 | dfs.journalnode.http-address; the active and standby NameNodes fetch edit files from this port over http |
8481 | dfs.journalnode.https-address |
8032 | yarn.resourcemanager.address |
8088 | yarn.resourcemanager.webapp.address, the YARN http port |
8090 | yarn.resourcemanager.webapp.https.address |
8030 | yarn.resourcemanager.scheduler.address |
8031 | yarn.resourcemanager.resource-tracker.address |
8033 | yarn.resourcemanager.admin.address |
8042 | yarn.nodemanager.webapp.address |
8040 | yarn.nodemanager.localizer.address |
8188 | yarn.timeline-service.webapp.address |
10020 | mapreduce.jobhistory.address |
19888 | mapreduce.jobhistory.webapp.address |
2888 | ZooKeeper; on the Leader, listens for Follower connections |
3888 | ZooKeeper; used for Leader election |
2181 | ZooKeeper; listens for client connections |
16010 | hbase.master.info.port, the HMaster http port |
16000 | hbase.master.port, the HMaster RPC port |
16030 | hbase.regionserver.info.port, the HRegionServer http port (60030 before HBase 1.0) |
16020 | hbase.regionserver.port, the HRegionServer RPC port (60020 before HBase 1.0) |
8080 | hbase.rest.port, the HBase REST server port |
10000 | hive.server2.thrift.port |
9083 | hive.metastore.uris |
4.3. RPC and HTTP ports by module
Module | RPC port | HTTP port | HTTPS port |
HDFS JournalNode | 8485 | 8480 | 8481 |
HDFS NameNode | 8020 | 50070 | |
HDFS DataNode | 50020 | 50075 | |
HDFS SecondaryNameNode | | 50090 | 50091 |
Yarn Resource Manager | 8032 | 8088 | 8090 |
Yarn Node Manager | 8040 | 8042 | |
Yarn SharedCache | | 8788 | |
HMaster | | 16010 | |
HRegionServer | | 16030 | |
HBase thrift | 9090 | 9095 | |
HBase rest | | 8085 | |
Note: DataNodes transfer data over port 50010.
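5. Task list
The tasks required to run Hadoop (HDFS, YARN, and MapReduce):
Task | Notes |
JDK installation | Hadoop is written in Java, so a JDK is required. |
Passwordless login | The NameNode controls the SecondaryNameNode and DataNodes through ssh and scp, which must run without a password. |
Hadoop installation and configuration | This means HDFS, YARN, and MapReduce; it does not include installing HBase, Hive, and so on. |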
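6. JDK installation
This document installs JDK 1.7.0.
6.1. Downloading the package
Download page for the latest JDK binary packages:
http://www.oracle.com/technetwork/java/javase/downloads
Download page for the JDK 1.7 binary packages:
http://www.oracle.com/technetwork/java/javase/downloads/jdk7-downloads-1880260.html
This document uses the 64-bit Linux build of JDK 1.7: jdk-7u55-linux-x64.gz. Do not install JDK 1.8: in the author's experience it does not match this Hadoop version, and compiling the Hadoop source with it reports many errors.
6.2. Installation steps
Installing the JDK is simple: upload jdk-7u55-linux-x64.gz to the Linux machine, unpack it, and set the environment variables (here the file was uploaded to /data):
1) Change to the /data directory
2) Unpack the package: tar xzf jdk-7u55-linux-x64.gz, which creates the directory /data/jdk1.7.0_55
3) Create a symbolic link: ln -s /data/jdk1.7.0_55 /data/jdk
4) Edit /etc/profile, the profile in the user's home directory, or an equivalent file, and set the following environment variables:
export JAVA_HOME=/data/jdk
export CLASSPATH=$JAVA_HOME/lib/tools.jar
export PATH=$JAVA_HOME/bin:$PATH
Afterwards, log in again or source the profile file so that the variables take effect; running the export commands by hand also applies them immediately. To double-check, run java or javac and confirm that the commands execute. If they already worked before the JDK was installed, a JDK is present and this installation can be skipped.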
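7. Passwordless ssh2 login
The following applies to ssh2, not to ssh, and not to OpenSSH. The configuration has two parts: one for the machine that logs in (the client) and one for the machine being logged in to (the server). Together they give passwordless login from client to server. The commands below can be pasted directly into a Linux terminal; all were verified on SuSE 10.1.
Step 1: on every server machine, edit the sshd configuration file /etc/ssh2/sshd2_config:
1) (Skip this if Hadoop does not run as root.) Set PermitRootLogin to yes, that is, remove the leading comment character #
2) Set AllowedAuthentications to publickey,password, that is, remove the leading comment character #
3) Restart the sshd service: service ssh2 restart
Step 2: on every client machine, do the following:
1) Change to the .ssh2 directory: cd ~/.ssh2
2) ssh-keygen2 -t dsa -P''
-P gives the passphrase and -P'' means an empty passphrase. The option can be omitted, but then Enter must be pressed three times instead of once.
On success, the private key file id_dsa_2048_a and the public key file id_dsa_2048_a.pub are created under ~/.ssh2 in the user's home directory.
3) Create the identification file: echo "IdKey id_dsa_2048_a" >> identification. Note the space after IdKey, and make sure the file reads as follows:
# cat identification
IdKey id_dsa_2048_a
4) Upload id_dsa_2048_a.pub to the ~/.ssh2 directory of every server machine: scp id_dsa_2048_a.pub root@192.168.0.1:/root/.ssh2, assuming 192.168.0.1 is the IP of one of the servers. Before running scp, make sure /root/.ssh2 exists on 192.168.0.1, and replace /root/ with the actual HOME directory of the user. The environment variable $HOME normally holds the home directory, ~ also denotes it, and cd without arguments switches to it.
Step 3: on every server machine, do the following:
1) Change to the .ssh2 directory: cd ~/.ssh2
2) Create the authorization file: echo "Key id_dsa_2048_a.pub" >> authorization. Note the space after Key, and make sure the file reads as follows:
# cat authorization
Key id_dsa_2048_a.pub
Once this is done, ssh login from client to server no longer needs a password. If passwordless login is not configured correctly, startup fails with an error like this:
Starting namenodes on [172.25.40.171]
172.25.40.171: Host key not found from database.
172.25.40.171: Key fingerprint:
172.25.40.171: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux
172.25.40.171: You can get a public key's fingerprint by running
172.25.40.171: % ssh-keygen -F publickey.pub
172.25.40.171: on the keyfile.
172.25.40.171: warning: tcgetattr failed in ssh_rl_set_tty_modes_for_fd: fd 1: Invalid argument
or like this:
Starting namenodes on [172.25.40.171]
172.25.40.171: hadoop's password:
It is advisable to include the machine's own IP in the names of the generated private and public key files; otherwise the keys are easy to mix up.
Configure all passwordless logins as described in section 3.3 (scope of passwordless login). For more on passwordless login, see these blog posts:
1) http://blog.chinaunix.net/uid-20682147-id-4212099.html (passwordless login between two SSH2 hosts)
2) http://blog.chinaunix.net/uid-20682147-id-4212097.html (SSH2 to OpenSSH)
3) http://blog.chinaunix.net/uid-20682147-id-4212094.html (OpenSSH to SSH2)
4) http://blog.chinaunix.net/uid-20682147-id-5520240.html (passwordless login between two OpenSSH hosts)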
8. Hadoop installation and configuration
This part covers only HDFS, MapReduce, and YARN; it does not cover HBase, Hive, and so on.
8.1. Downloading the package
Hadoop binary packages are available at http://hadoop.apache.org/releases.html#Download (or directly from http://mirror.bit.edu.cn/apache/hadoop/common/). This document uses hadoop-2.8.0 (binary package:
http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.8.0/hadoop-2.8.0.tar.gz, source package: http://www.apache.org/dyn/closer.cgi/hadoop/common/hadoop-2.8.0/hadoop-2.8.0-src.tar.gz).
For the official installation instructions, see Cluster Setup:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/ClusterSetup.html.
8.2. Installation and environment variables
1) Upload the package hadoop-2.8.0.tar.gz to the /data/hadoop directory
2) Change to the /data/hadoop directory
3) Unpack the package there: tar xzf hadoop-2.8.0.tar.gz
4) Create a symbolic link: ln -s /data/hadoop/hadoop-2.8.0 /data/hadoop/hadoop
5) Edit .profile in the user's home directory (or /etc/profile, or any equivalent file) and set the Hadoop environment variables:
export JAVA_HOME=/data/jdk
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
export PATH=$HADOOP_HOME/bin:$PATH
Log in again for the variables to take effect, or run the export commands in the terminal (for example export HADOOP_HOME=/data/hadoop/hadoop) to apply them immediately.
8.3. Editing hadoop-env.sh
On every node, edit $HADOOP_HOME/etc/hadoop/hadoop-env.sh and add near the top of the file: export JAVA_HOME=/data/jdk
Note that hadoop-env.sh must be edited on every node even though JAVA_HOME is already set in /etc/profile, most likely because daemons started over ssh do not read the login profile. Otherwise startup reports errors like these:
10.12.154.79: Error: JAVA_HOME is not set and could not be found.
10.12.154.77: Error: JAVA_HOME is not set and could not be found.
10.12.154.78: Error: JAVA_HOME is not set and could not be found.
10.12.154.78: Error: JAVA_HOME is not set and could not be found.
10.12.154.77: Error: JAVA_HOME is not set and could not be found.
10.12.154.79: Error: JAVA_HOME is not set and could not be found.
Besides JAVA_HOME, also add:
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_CONF_DIR=${HADOOP_HOME}/etc/hadoop
It is also advisable to add the following to /etc/profile or ~/.profile:
export JAVA_HOME=/data/jdk
export HADOOP_HOME=/data/hadoop/hadoop
export HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
8.4. Editing /etc/hosts
To avoid unnecessary trouble, configure /etc/hosts on every node as follows:
10.148.137.143 hadoop-137-143 # NameNode
10.148.137.204 hadoop-137-204 # NameNode
10.148.138.11 hadoop-138-11 # DataNode
10.148.140.14 hadoop-140-14 # DataNode
10.148.140.15 hadoop-140-15 # DataNode
Do not map one IP to several different hostnames; otherwise the HTTP pages may not work properly.
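A hosts file can be scanned for IPs that carry more than one name. This is a sketch under stated assumptions: the function name dup_ips and the sample entry nn-backup are hypothetical, and on a real system lines such as the localhost entry would also be flagged and can be ignored.

```shell
# Print every IP that is mapped to more than one hostname in a hosts-format file.
# Comments are stripped first; names on the same line and on separate lines both count.
dup_ips() {
    awk '{ sub(/#.*/, ""); if (NF >= 2) n[$1] += NF - 1 }
         END { for (ip in n) if (n[ip] > 1) print ip }' "$1"
}

sample=$(mktemp)
cat > "$sample" <<'EOF'
10.148.137.143 hadoop-137-143 # NameNode
10.148.137.204 hadoop-137-204 # NameNode
10.148.137.204 nn-backup      # second name for the same IP
EOF
dup_ips "$sample"    # prints 10.148.137.204
rm -f "$sample"
```

On a cluster node the function would be called as dup_ips /etc/hosts.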
The hostname, for example VM-39-166-sles10-64, can be obtained with the hostname command. Because hostnames are now configured, each host must have been reached by ssh using its hostname at least once before HDFS or anything else is started; otherwise startup reports an error like this:
VM-39-166-sles10-64: Host key not found from database.
VM-39-166-sles10-64: Key fingerprint:
VM-39-166-sles10-64: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux
VM-39-166-sles10-64: You can get a public key's fingerprint by running
VM-39-166-sles10-64: % ssh-keygen -F publickey.pub
VM-39-166-sles10-64: on the keyfile.
VM-39-166-sles10-64: warning: tcgetattr failed in ssh_rl_set_tty_modes_for_fd: fd 1: Invalid argument
The error means that VM-39-166-sles10-64 has never been reached by ssh using its hostname. Fix it as follows:
ssh hadoop@VM-39-166-sles10-64
Host key not found from database.
Key fingerprint:
xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux
You can get a public key's fingerprint by running
% ssh-keygen -F publickey.pub
on the keyfile.
Are you sure you want to continue connecting (yes/no)? yes
Host key saved to /data/hadoop/.ssh2/hostkeys/key_36000_137vm_13739_137166_137sles10_13764.pub
host key for VM-39-166-sles10-64, accepted by hadoop Thu Apr 17 2014 12:44:32 +0800
Authentication successful.
Last login: Thu Apr 17 2014 09:24:54 +0800 from 10.32.73.69
Welcome to SuSE Linux 10 SP2 64Bit Nov 10,2010 by DIS Version v2.6.20101110
No mail.
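8.5. Editing slaves
The following scripts use the slaves file:
hadoop-daemons.sh
slaves.sh
start-dfs.sh
stop-dfs.sh
yarn-daemons.sh
All of them depend on passwordless SSH. If none of them is used, the slaves file can be ignored.
The slaves are the HDFS DataNodes. When start-dfs.sh starts HDFS, it reads this file and logs in to each slave without a password to start its DataNode.
On the active and standby NameNodes, edit $HADOOP_HOME/etc/hadoop/slaves and add the slave IPs (or the corresponding hostnames), one per line:
> cat slaves
10.148.138.11
10.148.140.14
10.148.140.15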
8.6. Preparing the configuration files
The configuration files live in $HADOOP_HOME/etc/hadoop. In Hadoop 2.3.0 and Hadoop 2.8.0, the files core-site.xml, yarn-site.xml, hdfs-site.xml, and mapred-site.xml in that directory are empty. Starting without configuring them, for example by running start-dfs.sh, leads to all kinds of errors.
Copy the default files from $HADOOP_HOME/share/doc/hadoop to $HADOOP_HOME/etc/hadoop and edit the copies (the commands below can be pasted as-is; in 2.3.0 the paths of the default.xml files differ from those in 2.8.0):
# Change to the $HADOOP_HOME directory
cd $HADOOP_HOME
cp ./share/doc/hadoop/hadoop-project-dist/hadoop-common/core-default.xml ./etc/hadoop/core-site.xml
cp ./share/doc/hadoop/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml ./etc/hadoop/hdfs-site.xml
cp ./share/doc/hadoop/hadoop-yarn/hadoop-yarn-common/yarn-default.xml ./etc/hadoop/yarn-site.xml
cp ./share/doc/hadoop/hadoop-mapreduce-client/hadoop-mapreduce-client-core/mapred-default.xml ./etc/hadoop/mapred-site.xml
Next, the default core-site.xml, yarn-site.xml, hdfs-site.xml, and mapred-site.xml must be adjusted; otherwise startup still fails.
The QJM configuration follows the official documentation ("HDFS High Availability Using the Quorum Journal Manager").
8.7. Editing hdfs-site.xml
The changes to hdfs-site.xml involve the properties in the following table:
Property | Value | Notes |
dfs.nameservices | mycluster | |
dfs.ha.namenodes.mycluster | nn1,nn2 | Only one or two NameNodes may be configured per nameservice, so there can be no nn3 |
dfs.namenode.rpc-address.mycluster.nn1 | hadoop-137-143:8020 | |
dfs.namenode.rpc-address.mycluster.nn2 | hadoop-137-204:8020 | |
dfs.namenode.http-address.mycluster.nn1 | hadoop-137-143:50070 | |
dfs.namenode.http-address.mycluster.nn2 | hadoop-137-204:50070 | |
dfs.namenode.shared.edits.dir | qjournal://hadoop-137-143:8485;hadoop-137-204:8485;hadoop-138-11:8485/mycluster | Configure at least three Quorum Journal nodes |
dfs.client.failover.proxy.provider.mycluster | org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider | Clients use it to locate the active NameNode |
dfs.ha.fencing.methods | sshfence | Guarantees that only one NameNode is active at any time, which prevents split-brain. With sshfence, when the active NameNode misbehaves, the fencer logs in to it over ssh and kills it with fuser, so fuser must be available on every NameNode. A user name and port may be given, in the form sshfence([[username][:port]]); the port is needed when sshd does not listen on the default port 22. The value may also be a shell script, in the form shell(/path/to/my/script.sh arg1 arg2 ...), for example shell(/bin/true) |
dfs.ha.fencing.ssh.private-key-files | /data/hadoop/.ssh2/id_dsa_2048_a | The private key; with OpenSSH the value would be /data/hadoop/.ssh/id_rsa |
dfs.ha.fencing.ssh.connect-timeout | 30000 | Optional |
dfs.journalnode.edits.dir | /data/hadoop/hadoop/journal | Do not add the prefix "file://". This is the absolute path on each JournalNode machine where the JournalNode stores its edits and other local state. With the prefix, the error "Journal dir should be an absolute path" is reported |
dfs.datanode.data.dir | file:///data/hadoop/hadoop/data | Include the prefix "file://". Do not configure all directories as SSD storage; otherwise writes fail with "Failed to place enough replicas" |
dfs.namenode.name.dir | | Include the prefix "file://". Directory for NameNode metadata; the default is file://${hadoop.tmp.dir}/dfs/name, that is, under the temporary directory. Consider moving it to the data directory |
dfs.namenode.checkpoint.dir | | The default is file://${hadoop.tmp.dir}/dfs/namesecondary; not needed when no SecondaryNameNode is used |
dfs.ha.automatic-failover.enabled | true | Automatic active/standby failover |
dfs.datanode.max.xcievers | 4096 | Optional. Similar to the Linux limit on open files; the old default is 256 and a larger value is recommended (Hadoop 2.x names this property dfs.datanode.max.transfer.threads). Also make sure the system open-file limit is high enough (check with ulimit). A value that is too small can cause HBase to report "notservingregionexception" |
dfs.journalnode.rpc-address | 0.0.0.0:8485 | RPC address of the JournalNode; the default 0.0.0.0:8485 normally needs no change |
dfs.hosts | | Optional but recommended, to keep stray DataNodes from joining. It defines a DataNode whitelist: only listed DataNodes may connect to the NameNode. The value is the absolute path of a local file, for example /data/hadoop/etc/hadoop/hosts.include |
dfs.hosts.exclude | | Normally left empty; used when decommissioning DataNodes. The value is the absolute path of a local file that lists one hostname or IP per line, for example /data/hadoop/etc/hadoop/hosts.exclude |
dfs.namenode.num.checkpoints.retained | 2 | Default 2; the number of fsImage files the NameNode keeps |
dfs.namenode.num.extra.edits.retained | 1000000 | Number of extra edits retained |
dfs.namenode.max.extra.edits.segments.retained | 10000 | |
dfs.datanode.scan.period.hours | | Default 504 hours |
dfs.blockreport.intervalMsec | | Interval at which a DataNode sends block reports to the NameNode; default 21600000 ms |
dfs.datanode.directoryscan.interval | | Interval at which a DataNode compares the blocks in memory with those on disk and reconciles the differences; default 21600 seconds |
dfs.heartbeat.interval | 3 | Heartbeat interval to the NameNode, in seconds |
For the full list of settings, see:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-hdfs/hdfs-default.xml.
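To show how the table translates into a file, the sketch below prints a minimal hdfs-site.xml that contains only the core HA properties listed above. The helper names prop and gen_hdfs_site and the output file name are hypothetical; the property names and values are taken from the table, and the remaining properties would be added in the same way.

```shell
# Emit one <property> element for the given name and value.
prop() {
    printf '  <property>\n    <name>%s</name>\n    <value>%s</value>\n  </property>\n' "$1" "$2"
}

gen_hdfs_site() {
    echo '<?xml version="1.0"?>'
    echo '<configuration>'
    prop dfs.nameservices mycluster
    prop dfs.ha.namenodes.mycluster nn1,nn2
    prop dfs.namenode.rpc-address.mycluster.nn1 hadoop-137-143:8020
    prop dfs.namenode.rpc-address.mycluster.nn2 hadoop-137-204:8020
    prop dfs.namenode.shared.edits.dir \
        'qjournal://hadoop-137-143:8485;hadoop-137-204:8485;hadoop-138-11:8485/mycluster'
    prop dfs.client.failover.proxy.provider.mycluster \
        org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider
    prop dfs.ha.fencing.methods sshfence
    prop dfs.ha.automatic-failover.enabled true
    prop dfs.journalnode.edits.dir /data/hadoop/hadoop/journal
    echo '</configuration>'
}

# Written to a scratch file; review it before copying it into $HADOOP_CONF_DIR by hand.
out="${TMPDIR:-/tmp}/hdfs-site.sample.xml"
gen_hdfs_site > "$out"
echo "wrote $out"
```

Generating the file from one script keeps the copies on all nodes identical, which matters because the NameNodes, JournalNodes, and DataNodes must agree on the nameservice settings.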
8.8. Editing core-site.xml
The changes to core-site.xml involve the properties in the following table:
Property | Value | Notes |
fs.defaultFS | hdfs://mycluster | |
fs.default.name | hdfs://mycluster | In principle unnecessary because fs.defaultFS has replaced it, but without it startup reported the error: fs.defaultFS is file:/// |
hadoop.tmp.dir | /data/hadoop/hadoop/tmp | |
ha.zookeeper.quorum | hadoop-137-143:2181,hadoop-138-11:2181,hadoop-140-14:2181 | |
ha.zookeeper.parent-znode | /mycluster/hadoop-ha | |
io.seqfile.local.dir | | Default is ${hadoop.tmp.dir}/io/local |
fs.s3.buffer.dir | | Default is ${hadoop.tmp.dir}/s3 |
fs.s3a.buffer.dir | | Default is ${hadoop.tmp.dir}/s3a |
Before starting, create the configured directories, for example /data/hadoop/hadoop/tmp. For details, see:
http://hadoop.apache.org/docs/current/hadoop-project-dist/hadoop-common/core-default.xml.
8.8.1. dfs.namenode.rpc-address
If it is not configured, startup reports the following error:
Incorrect configuration: namenode address dfs.namenode.servicerpc-address or dfs.namenode.rpc-address is not configured.
Both the host and the port must be given. If only an IP is given, for example <value>10.148.137.143</value>, startup prints:
Starting namenodes on []
After changing it to "<value>hadoop-137-143:8020</value>", startup prints:
Starting namenodes on [10.148.137.143]
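A value can be checked for the host:port form before it goes into the configuration. The following is a minimal sketch; the function name is_host_port is hypothetical and the check is purely syntactic, so it does not verify that the host resolves or that the port is free.

```shell
# Succeed only for values of the form host:port with a numeric port in 1..65535.
is_host_port() {
    host=${1%:*}
    port=${1##*:}
    [ "$host" != "$1" ] || return 1      # no colon at all
    [ -n "$host" ] || return 1           # empty host part
    case "$port" in
        ''|*[!0-9]*) return 1 ;;         # empty or non-numeric port
    esac
    [ "$port" -ge 1 ] && [ "$port" -le 65535 ]
}

for v in 10.148.137.143 hadoop-137-143:8020; do
    if is_host_port "$v"; then
        echo "ok:  $v"
    else
        echo "bad: $v (port missing or invalid)"
    fi
done
```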
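8.9. Editing mapred-site.xml
The changes to mapred-site.xml involve the properties in the following table:
Property | Value | Scope |
mapreduce.framework.name | yarn | All MapReduce nodes |
For the full list of settings, see mapred-default.xml in the official documentation.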
8.10. Editing yarn-site.xml
The changes to yarn-site.xml involve the properties in the following table:
Property | Value | Scope and notes |
yarn.resourcemanager.hostname | 0.0.0.0 | ResourceManager, NodeManager. Optional in HA mode, but other settings may reference it, so keeping 0.0.0.0 is recommended; if nothing references it, it can be omitted. |
yarn.nodemanager.hostname | 0.0.0.0 | |
yarn.nodemanager.aux-services | mapreduce_shuffle | |
HA-related settings, including automatic failover (these may be configured on the ResourceManager nodes only) |
yarn.resourcemanager.ha.enabled | true | Enables HA |
yarn.resourcemanager.cluster-id | yarn-cluster | May differ from the HDFS nameservice name |
yarn.resourcemanager.ha.rm-ids | rm1,rm2 | NodeManagers must be configured the same way as the ResourceManagers |
yarn.resourcemanager.hostname.rm1 | hadoop-137-143 | |
yarn.resourcemanager.hostname.rm2 | hadoop-137-204 | |
yarn.resourcemanager.webapp.address.rm1 | hadoop-137-143:8088 | |
yarn.resourcemanager.webapp.address.rm2 | hadoop-137-204:8088 | |
yarn.resourcemanager.zk-address | hadoop-137-143:2181,hadoop-137-204:2181,hadoop-138-11:2181 | |
yarn.resourcemanager.ha.automatic-failover.enabled | true | Optional, because it defaults to true when yarn.resourcemanager.ha.enabled is true |
NodeManager settings |
yarn.nodemanager.vmem-pmem-ratio | | Maximum virtual memory usable per 1 MB of physical memory; default 2.1. If spark-sql fails with "Yarn application has already exited with state FINISHED", check the NodeManager log to see whether this value is too small |
yarn.nodemanager.resource.cpu-vcores | | Total number of virtual CPUs available to the NodeManager; default 8 |
yarn.nodemanager.resource.memory-mb | | Total physical memory YARN may use on this node; default 8192 (MB) |
yarn.nodemanager.pmem-check-enabled | | Whether a thread checks the physical memory used by each task and kills tasks that exceed their allocation; default true |
yarn.nodemanager.vmem-check-enabled | | Whether a thread checks the virtual memory used by each task and kills tasks that exceed their allocation; default true |
ResourceManager settings |
yarn.scheduler.minimum-allocation-mb | | Minimum memory a single container may request |
yarn.scheduler.maximum-allocation-mb | | Maximum memory a single container may request |
If yarn.nodemanager.hostname is set to a specific IP, such as 10.12.154.79, every NodeManager needs a different configuration file. For the full list of settings, see:
http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-common/yarn-default.xml.
For YARN HA configuration, see:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/ResourceManagerHA.html.
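9. Startup order
ZooKeeper -> JournalNode -> format the NameNode -> initialize the JournalNodes -> create the failover namespace (zkfc) -> NameNode -> failover controller process -> DataNode -> ResourceManager -> NodeManager
Note that the NameNode must be formatted before its first start, and that the standby NameNode is started in a different way. The failover controller process only needs to be started after the "create the failover namespace (zkfc)" step.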
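The order can be kept in a small script that only prints the plan, so that each command can be reviewed and then run by hand on the right hosts. This is a sketch under stated assumptions: the HDFS and ZooKeeper commands are the ones used elsewhere in this document, the two yarn-daemon.sh lines are assumed from the standard Hadoop 2.x sbin scripts and do not appear in this section, and the "initialize the JournalNodes" step is left out because it applies only when converting a non-HA cluster.

```shell
# Print the start-up steps in order. Nothing is executed here.
startup_plan() {
    cat <<'EOF'
zookeeper        ./zkServer.sh start
journalnode      ./hadoop-daemon.sh start journalnode
format-namenode  ./hdfs namenode -format
format-zk        ./hdfs zkfc -formatZK
namenode         ./hadoop-daemon.sh start namenode
standby-namenode ./hdfs namenode -bootstrapStandby && ./hadoop-daemon.sh start namenode
zkfc             ./hadoop-daemon.sh start zkfc
datanode         ./hadoop-daemon.sh start datanode
resourcemanager  ./yarn-daemon.sh start resourcemanager
nodemanager      ./yarn-daemon.sh start nodemanager
EOF
}

# Number the steps for readability.
startup_plan | awk '{ step = $1; $1 = ""; printf "%2d. %-17s%s\n", NR, step, $0 }'
```

The two format steps are needed only for a new cluster; on later restarts they must be skipped, since formatting an existing NameNode destroys its metadata.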
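10. Starting HDFS
The NameNode must be formatted before HDFS is started.
10.1. Starting ZooKeeper
./zkServer.sh start
Start ZooKeeper before anything else.
10.2. Creating the failover namespace
This step has no ordering dependency on formatting the NameNode or initializing the JournalNodes. On one of the NameNodes, run: ./hdfs zkfc -formatZK
On success, the path given by ha.zookeeper.parent-znode in core-site.xml is created in ZooKeeper. If the value of dfs.ha.namenodes.mycluster in hdfs-site.xml is changed, formatZK must be run again; otherwise automatic NameNode failover stops working and the zkfc log shows a message like the following (assuming nm1 was changed to nn1):
Unable to determine service address for namenode 'nm1'
Note that after changing dfs.ha.namenodes.mycluster, upper-layer services that depend on HDFS, such as HBase, must be restarted as well.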
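10.3. Starting all JournalNodes
The NameNode records its metadata operation log on the JournalNodes, and the active and standby NameNodes synchronize metadata through the log stored there.
Run the following on every JournalNode (note that the command takes two arguments, and that this step comes before "hdfs namenode -format"):
./hadoop-daemon.sh start journalnode
The JournalNodes must be running before "hdfs namenode -format" is executed, and the format must in turn happen before the NameNode is started.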
10.4. Formatting the NameNode
This step is needed only for a new cluster, and only on the active NameNode.
1) Change to the $HADOOP_HOME/bin directory
2) Format: ./hdfs namenode -format
If the output ends with "INFO util.ExitUtil: Exiting with status 0", the format succeeded.
If the hostname-to-IP mapping "172.25.40.171 VM-40-171-sles10-64" is missing from /etc/hosts, formatting reports an error like this:
14/04/17 03:44:09 WARN net.DNS: Unable to determine local hostname -falling back to "localhost"
java.net.UnknownHostException: VM-40-171-sles10-64: VM-40-171-sles10-64: unknown error
    at java.net.InetAddress.getLocalHost(InetAddress.java:1484)
    at org.apache.hadoop.net.DNS.resolveLocalHostname(DNS.java:264)
    at org.apache.hadoop.net.DNS.<clinit>(DNS.java:57)
    at org.apache.hadoop.hdfs.server.namenode.NNStorage.newBlockPoolID(NNStorage.java:945)
    at org.apache.hadoop.hdfs.server.namenode.NNStorage.newNamespaceInfo(NNStorage.java:573)
    at org.apache.hadoop.hdfs.server.namenode.FSImage.format(FSImage.java:144)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.format(NameNode.java:845)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1256)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1370)
Caused by: java.net.UnknownHostException: VM-40-171-sles10-64: unknown error
    at java.net.Inet4AddressImpl.lookupAllHostAddr(Native Method)
    at java.net.InetAddress$2.lookupAllHostAddr(InetAddress.java:907)
    at java.net.InetAddress.getAddressesFromNameService(InetAddress.java:1302)
    at java.net.InetAddress.getLocalHost(InetAddress.java:1479)
    ... 8 more
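10.5. Initializing the JournalNodes
This step must come after formatting the NameNode.
It is needed only when converting a non-HA cluster to HA. Run the command once, on the NameNode whose edits are to be copied (in this deployment the NameNode machines also host JournalNodes):
./hdfs namenode -initializeSharedEdits
The command is interactive by default; add the -force option to make it non-interactive.
Create the following directory on every JournalNode:
mkdir -p /data/hadoop/hadoop/journal/mycluster/current
If this step is run before the NameNode is formatted, it fails with "NameNode is not formatted".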
10.6. Starting the active NameNode
1) Change to the $HADOOP_HOME/sbin directory
2) Start the active NameNode:
./hadoop-daemon.sh start namenode
If startup reports the error below, the NameNode cannot log in to itself without a password. If passwordless login by IP already works, the usual cause is that the host has never been reached by ssh under its hostname. The fix is to ssh once using the hostname, for example ssh hadoop@VM_40_171_sles10_64, and then start again.
Starting namenodes on [VM_40_171_sles10_64]
VM_40_171_sles10_64: Host key not found from database.
VM_40_171_sles10_64: Key fingerprint:
VM_40_171_sles10_64: xofiz-zilip-tokar-rupyb-tufer-tahyc-sibah-kyvuf-palik-hazyt-duxux
VM_40_171_sles10_64: You can get a public key's fingerprint by running
VM_40_171_sles10_64: % ssh-keygen -F publickey.pub
VM_40_171_sles10_64: on the keyfile.
VM_40_171_sles10_64: warning: tcgetattr failed in ssh_rl_set_tty_modes_for_fd: fd 1: Invalid argument
10.7. Starting the standby NameNode
1) ./hdfs namenode -bootstrapStandby
2) ./hadoop-daemon.sh start namenode
If step 1 is skipped, starting directly fails with:
No valid image files found
or the NameNode log shows:
2016-04-08 14:08:39,745 WARN org.apache.hadoop.hdfs.server.namenode.FSNamesystem: Encountered exception loading fsimage
java.io.IOException: NameNode is not formatted.
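10.8. Starting the failover controller
Start the failover controller on every NameNode:
./hadoop-daemon.sh start zkfc
HDFS can fail over automatically only when the DFSZKFailoverController process is running.
Note: zkfc stands for ZooKeeper Failover Controller.
10.9. Starting all DataNodes
Run the following on each DataNode:
./hadoop-daemon.sh start datanode
If a DataNode process fails to come up, try deleting the DataNode logs under the logs directory and starting it again.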
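10.10. Checking that startup succeeded
1) Use the jps command shipped with the JDK to check that the expected processes are running
2) Check the log and out files under $HADOOP_HOME/logs for exceptions.
If both nn1 and nn2 are in standby state after startup, switch nn1 to active (with automatic failover enabled and zkfc running, one NameNode normally becomes active by itself and this manual command may be refused):
./hdfs haadmin -transitionToActive nn1
10.10.1. DataNode
Run jps (note: jps is part of the JDK, not of the JRE) to see the DataNode process:
$ jps
18669 DataNode
24542 Jps
10.10.2. NameNode
Run jps to see the NameNode process:
$ jps
18669 NameNode
24542 Jps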
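The jps check can be scripted so that missing daemons are reported by name. In this sketch the function name missing_daemons is hypothetical; it reads jps output on standard input, so it can be tried without a running cluster.

```shell
# Read `jps` output on stdin and print each expected daemon that is not listed.
missing_daemons() {
    out=$(cat)
    for d in "$@"; do
        echo "$out" | awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' \
            || echo "$d"
    done
}

# Sample input matching the NameNode listing in this section.
printf '18669 NameNode\n24542 Jps\n' | missing_daemons NameNode DFSZKFailoverController
# prints: DFSZKFailoverController
```

On a NameNode host the real output would be piped in instead: jps | missing_daemons NameNode DFSZKFailoverController.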
10.11. Running HDFS commands
Run some HDFS commands to verify further that the installation and configuration are correct. Running hdfs or hdfs dfs without arguments prints the usage.
10.11.1. Checking that the DataNodes started
hdfs dfsadmin -report
If fs.default.name in core-site.xml is file:///, the command reports:
report: FileSystem file:/// is not an HDFS file system
Usage: hdfs dfsadmin [-report] [-live] [-dead] [-decommissioning]
To fix this, set fs.default.name to the same value as fs.defaultFS.
10.11.2. Checking the active/standby state of the NameNodes
To see whether NameNode1 and NameNode2 are active or standby:
$ hdfs haadmin -getServiceState nn1
standby
$ hdfs haadmin -getServiceState nn2
active
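10.11.3. hdfs dfs -ls
"hdfs dfs -ls" takes one argument. An argument that starts with "hdfs://URI" addresses that HDFS; otherwise the command resolves the path against the default file system, much like ls. The URI is the IP or hostname of the NameNode, optionally with a port, that is, the value of "dfs.namenode.rpc-address" in hdfs-site.xml.
"hdfs dfs -ls" assumes the default port 8020. If another port such as 9000 is configured, it must be given explicitly; otherwise the port can be omitted, much like a browser opening a URL. Example:
> hdfs dfs -ls hdfs://172.25.40.171:9001/
The slash after 9001 is required; without it the argument is treated as a file. If the port 9001 is omitted, the default 8020 is used. "172.25.40.171:9001" is the value of "dfs.namenode.rpc-address" in hdfs-site.xml.
It follows that "hdfs dfs -ls" can work with different HDFS clusters simply by using different URIs.
An uploaded file is stored under the DataNode data directory (set by "dfs.datanode.data.dir" in the DataNode's hdfs-site.xml), for example:
$HADOOP_HOME/data/current/BP-139798373-172.25.40.171-1397735615751/current/finalized/blk_1073741825
The "blk" in the file name stands for block. By default blk_1073741825 is one complete block of the file, and Hadoop does not process it any further.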
10.11.4. hdfs dfs -put
Uploads a file. Example:
> hdfs dfs -put /etc/SuSE-release hdfs://172.25.40.171:9001/
10.11.5. hdfs dfs -rm
Deletes a file. Example:
> hdfs dfs -rm hdfs://172.25.40.171:9001/SuSE-release
Deleted hdfs://172.25.40.171:9001/SuSE-release
10.11.6. Manually switching the active and standby NameNodes
hdfs haadmin -failover --forcefence --forceactive nn1 nn2 # makes nn2 the active NameNode
10.11.7. HDFS allows only one active and one standby NameNode
Note: Hadoop 3.0 will support multiple standby NameNodes, similar to HBase.
If three NameNodes are configured, for example:
<property>
  <name>dfs.ha.namenodes.test</name>
  <value>nn1,nn2,nn3</value>
  <description> The prefix for a given nameservice, contains a comma-separated list of namenodes for a given nameservice (eg EXAMPLENAMESERVICE). </description>
</property>
then running "hdfs namenode -bootstrapStandby" reports the error below, which means that one namespace cannot have more than two NameNodes:
16/04/11 09:51:57 ERROR namenode.NameNode: Failed to start namenode.
java.io.IOException: java.lang.IllegalArgumentException: Expected exactly 2 NameNodes in namespace 'test'. Instead, got only 3 (NN ids were 'nn1','nn2','nn3'
    at org.apache.hadoop.hdfs.server.namenode.ha.BootstrapStandby.run(BootstrapStandby.java:425)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1454)
    at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
Caused by: java.lang.IllegalArgumentException: Expected exactly 2 NameNodes in namespace 'test'. Instead, got only 3 (NN ids were 'nn1','nn2','nn3'
    at com.google.common.base.Preconditions.checkArgument(Preconditions.java:115)
10.11.8. 存储均衡start-balancer.sh
示例:start-balancer.sh –t 10%
10%表示机器与机器之间磁盘使用率偏差小于10%时认为均衡,否则做均衡搬动。“start-balancer.sh”调用“hdfs start balancer”来做均衡,可以调用stop-balancer.sh停止均衡。
均衡过程非常慢,但是均衡过程中,仍能够正常访问HDFS,包括往HDFS上传文件。
[VM2016@hadoop-030 /data4/hadoop/sbin]$ hdfs balancer # can be replaced by a call to start-balancer.sh
16/04/08 14:26:55 INFO balancer.Balancer: namenodes = [hdfs://test] // "test" is the HDFS cluster name
16/04/08 14:26:55 INFO balancer.Balancer: parameters = Balancer.Parameters[BalancingPolicy.Node, threshold=10.0, max idle iteration = 5, number of nodes to be excluded = 0, number of nodes to be included = 0]
Time Stamp Iteration# Bytes Already Moved Bytes Left To Move Bytes Being Moved
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.231:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.229:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.213:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.208:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.232:50010
16/04/08 14:26:56 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.207:50010
16/04/08 14:26:56 INFO balancer.Balancer: 5 over-utilized: [192.168.1.231:50010:DISK, 192.168.1.229:50010:DISK, 192.168.1.213:50010:DISK, 192.168.1.208:50010:DISK, 192.168.1.232:50010:DISK]
16/04/08 14:26:56 INFO balancer.Balancer: 1 underutilized: [192.168.1.207:50010:DISK] # data will be moved to this node
16/04/08 14:26:56 INFO balancer.Balancer: Need to move 816.01 GB to make the cluster balanced. # 816.01 GB must be moved to reach balance
16/04/08 14:26:56 INFO balancer.Balancer: Decided to move 10 GB bytes from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK # move 10 GB from 192.168.1.231 to 192.168.1.207
16/04/08 14:26:56 INFO balancer.Balancer: Will move 10 GB in this iteration
16/04/08 14:32:58 INFO balancer.Dispatcher: Successfully moved blk_1073749366_8542 with size=77829046 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:32:59 INFO balancer.Dispatcher: Successfully moved blk_1073749386_8562 with size=77829046 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.231:50010
16/04/08 14:33:34 INFO balancer.Dispatcher: Successfully moved blk_1073749378_8554 with size=77829046 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.231:50010
16/04/08 14:34:38 INFO balancer.Dispatcher: Successfully moved blk_1073749371_8547 with size=134217728 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:34:54 INFO balancer.Dispatcher: Successfully moved blk_1073749395_8571 with size=134217728 from 192.168.1.231:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.231:50010
Apr 8, 2016 2:35:01 PM 0 478.67 MB 816.01 GB 10 GB
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.213:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.229:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.232:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.231:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.208:50010
16/04/08 14:35:10 INFO net.NetworkTopology: Adding a new node: /default-rack/192.168.1.207:50010
16/04/08 14:35:10 INFO balancer.Balancer: 5 over-utilized: [192.168.1.213:50010:DISK, 192.168.1.229:50010:DISK, 192.168.1.232:50010:DISK, 192.168.1.231:50010:DISK, 192.168.1.208:50010:DISK]
16/04/08 14:35:10 INFO balancer.Balancer: 1 underutilized: [192.168.1.207:50010:DISK]
16/04/08 14:35:10 INFO balancer.Balancer: Need to move 815.45 GB to make the cluster balanced.
16/04/08 14:35:10 INFO balancer.Balancer: Decided to move 10 GB bytes from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK
16/04/08 14:35:10 INFO balancer.Balancer: Will move 10 GB in this iteration
16/04/08 14:41:18 INFO balancer.Dispatcher: Successfully moved blk_1073760371_19547 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:41:19 INFO balancer.Dispatcher: Successfully moved blk_1073760385_19561 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:41:22 INFO balancer.Dispatcher: Successfully moved blk_1073760393_19569 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010
16/04/08 14:41:23 INFO balancer.Dispatcher: Successfully moved blk_1073760363_19539 with size=77829046 from 192.168.1.213:50010:DISK to 192.168.1.207:50010:DISK through 192.168.1.213:50010 |
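A long balancer run produces a lot of output, and the figure that matters most is how much data is still left to move. The sketch below pulls that figure out of each iteration. The function name balancer_remaining is made up for illustration, and it assumes the "Need to move ... to make the cluster balanced" wording seen in the log above.

```shell
# Print the "Need to move" amount of every balancer iteration, one per line.
# Input: balancer log text on stdin (wording assumed to match the log above).
balancer_remaining() {
  sed -n 's/.*Need to move \([0-9.]* [KMGT]B\) to make the cluster balanced.*/\1/p'
}
```

One possible use is `hdfs balancer 2>&1 | tee balancer.log | balancer_remaining`, which keeps the full log in a file while showing only the shrinking remainder on screen.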
10.11.9. Finding Which Nodes Hold a File
hdfs fsck hdfs:///tmp/slaves -files -locations -blocks
10.11.10. Leaving Safe Mode
hdfs dfsadmin -safemode leave
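10.11.11. Deleting Missing Blocks
hdfs fsck / -delete
fsck requires a path (here the root, /). The -delete option removes the corrupted files themselves, so their data is lost.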
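11. Scaling Out and Decommissioning
11.1. Adding JournalNodes
When scaling out, pack the current directory of an existing JournalNode and unpack it on the new machine at the same location, as specified by "dfs.journalnode.edits.dir".
To make sure that adding or removing JournalNodes succeeds, first stop all NameNodes and JournalNodes, then modify the configuration. Start the NameNodes only after the JournalNodes have started successfully (the log stops at "IPC Server listener on 8485: starting"); otherwise you may hit an error like this: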
org.apache.hadoop.hdfs.qjournal.protocol.JournalOutOfSyncException: Can't write, no segment open |
Pick an existing JournalNode and edit its hdfs-site.xml to include the new JournalNodes. For example, starting from
<value>qjournal://hadoop-030:8485;hadoop-031:8485;hadoop-032:8485/test</value>
add the two JournalNodes hadoop-033 and hadoop-034:
<value>qjournal://hadoop-030:8485;hadoop-031:8485;hadoop-032:8485;hadoop-033:8485;hadoop-034:8485/test</value>
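Then copy both the installation directory and the data directory (the one specified by dfs.journalnode.edits.dir in hdfs-site.xml) to the new nodes.
If the JournalNode data directory is not copied, the JournalNode on the new node may fail with "Journal Storage Directory /data/journal/test not formatted"; a future version may synchronize it automatically. By contrast, scaling out ZooKeeper does not require copying the data and datalog of existing nodes, and it must not be done that way.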
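Editing the qjournal URI by hand is easy to get wrong, so the change can also be scripted. The following is a minimal sketch; qjournal_add and qjournal_count are hypothetical helper names, and the code assumes the URI always has the form qjournal://host:port;host:port/journal-id.

```shell
# Append JournalNode host:port entries to a qjournal URI, keeping the journal ID.
# Usage: qjournal_add 'qjournal://a:8485;b:8485/test' c:8485 d:8485
qjournal_add() {
  local uri=$1; shift
  local hosts=${uri#qjournal://}   # strip the scheme -> "a:8485;b:8485/test"
  local id=${hosts#*/}             # journal ID after the slash -> "test"
  hosts=${hosts%%/*}               # host list before the slash
  local h
  for h in "$@"; do hosts="$hosts;$h"; done
  printf 'qjournal://%s/%s\n' "$hosts" "$id"
}

# Count the JournalNodes in a URI. QJM writes need a majority of JournalNodes,
# so an odd count (3, 5, ...) is the usual choice.
qjournal_count() {
  local hosts=${1#qjournal://}
  hosts=${hosts%%/*}
  printf '%s\n' "$hosts" | tr ';' '\n' | grep -c .
}
```

Calling qjournal_add with the three-node URI above plus hadoop-033:8485 and hadoop-034:8485 prints the five-node value, which can then be pasted into dfs.namenode.shared.edits.dir on every node.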
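Next, start the JournalNode on the new node (no initialization is needed) and restart the NameNodes. Watch the JournalNode log to check whether it started successfully. An INFO-level entry like the following indicates a successful start: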
2016-04-26 10:31:11,160 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Finalizing edits file /data/journal/test/current/edits_inprogress_0000000000000194269 -> /data/journal/test/current/edits_0000000000000194269-0000000000000194270
However, only when entries like the following appear is the JournalNode actually working normally:
2017-05-18 15:22:42,901 INFO org.apache.hadoop.ipc.Server: IPC Server listener on 8485: starting
2017-05-18 15:23:27,028 INFO org.apache.hadoop.hdfs.qjournal.server.JournalNode: Initializing journal in directory /data/journal/data/test
2017-05-18 15:23:27,042 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/journal/data/test/in_use.lock acquired by nodename 15259@hadoop-40
2017-05-18 15:23:27,057 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Scanning storage FileJournalManager(root=/data/journal/data/test)
2017-05-18 15:23:27,152 INFO org.apache.hadoop.hdfs.qjournal.server.Journal: Latest log is EditLogFile(file=/data/journal/data/test/current/edits_inprogress_0000000000027248811,first=0000000000027248811,last=0000000000027248811,inProgress=true,hasCorruptHeader=false) |
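11.2. How Does a New NameNode Join?
Remember that after replacing a NameNode you must run "hdfs zkfc -formatZK" again; otherwise automatic failover will not work.
When a NameNode machine fails, a new NameNode has to take its place. Change the configuration to point to the new NameNode, then start the new NameNode as a standby; this way the new NameNode joins the cluster:
1) ./hdfs namenode -bootstrapStandby
2) ./hadoop-daemon.sh start namenode |
Remember to start the failover controller process DFSZKFailoverController; otherwise automatic failover will not work!
The new NameNode pulls the fsImage from the active NameNode through the bootstrapStandby operation (hadoop-091:50070 is the active NameNode):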
17/04/24 14:25:32 INFO namenode.TransferFsImage: Opening connection to http://hadoop-091:50070/imagetransfer?getimage=1&txid=2768127&storageInfo=-63:2009831148:1492719902489:CID-5b2992bb-4dcb-4211-8070-6934f4d232a8&bootstrapstandby=true
17/04/24 14:25:32 INFO namenode.TransferFsImage: Image Transfer timeout configured to 60000 milliseconds
17/04/24 14:25:32 INFO namenode.TransferFsImage: Transfer took 0.01s at 28461.54 KB/s
17/04/24 14:25:32 INFO namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000002768127 size 379293 bytes. |
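If not enough DataNodes are connected to the NameNode, the NameNode also enters safe mode. The message below shows that 0 DataNodes are connected to the NameNode.
A possible cause is that the value of dfs.ha.namenodes.mycluster was changed, for example from nm1 to nn1, so that the DataNodes no longer recognize the NameNodes. In this case you also need to run formatZK again; otherwise automatic failover stops working.
If the configuration on the DataNodes was updated as well but the DataNodes were not restarted afterwards, restart them: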
Safe mode is ON. The reported blocks 0 needs additional 12891 blocks to reach the threshold 0.9990 of total blocks 12904. The number of live datanodes 0 has reached the minimum number 0. Safe mode will be turned off automatically once the thresholds have been reached. |
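When checking several clusters, it can help to reduce this message to its three key numbers. The sketch below does that with sed; safemode_summary is a hypothetical name, and the pattern assumes the exact wording of the message above (other Hadoop versions may phrase it differently).

```shell
# Summarise a NameNode safe-mode message read from stdin.
# Prints: reported=<n> total=<n> live_datanodes=<n>
safemode_summary() {
  sed -n 's/.*The reported blocks \([0-9]*\) .* of total blocks \([0-9]*\)\. The number of live datanodes \([0-9]*\) .*/reported=\1 total=\2 live_datanodes=\3/p'
}
```

A result with live_datanodes=0 points at the DataNode connection problem described above rather than at genuinely lost blocks.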
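11.3. Adding DataNodes
Compatible versions can be mixed when scaling out, for example extending a Hadoop-2.7.2 cluster with Hadoop-2.8.0 nodes. To scale out, install and configure the DataNode on the new machine, start it successfully, and then run on the active NameNode: bin/hdfs dfsadmin -refreshNodes. This completes the expansion.
To balance data onto the newly added machines, run sbin/start-balancer.sh. It accepts the parameter -threshold, whose default value is 10, for example: sbin/start-balancer.sh -threshold 5. Valid -threshold values range from 1 to 100.
The balancer command can be run on a NameNode or a DataNode, but it is best run on a newly added or idle machine.
The -threshold value is a percentage that bounds the gap between a node's storage utilization and the cluster's overall storage utilization. Nodes whose utilization exceeds the cluster figure by more than the threshold are treated as over-utilized, nodes that fall below it by more than the threshold as under-utilized, and the balancer moves blocks from the former to the latter until every node is within the threshold.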
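The threshold rule can be illustrated with a few lines of arithmetic. This is only a sketch of the comparison, not the balancer's actual code; balance_class is a hypothetical name and all three arguments are percentages.

```shell
# Classify a DataNode by comparing its usage (%) with the cluster usage (%).
# Usage: balance_class <node_used_pct> <cluster_used_pct> [threshold_pct]
balance_class() {
  awk -v n="$1" -v c="$2" -v t="${3:-10}" 'BEGIN {
    if (n > c + t)      print "over-utilized"      # data should move away
    else if (n < c - t) print "under-utilized"     # data should move here
    else                print "within-threshold"   # nothing to do
  }'
}
```

With the default threshold of 10, a freshly added node at 5% in a cluster that is 60% full is under-utilized, while a node at 55% is already within the threshold. Lowering the threshold to 5 tightens the band, so more nodes take part in balancing and the run takes longer.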
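11.4. Decommissioning DataNodes
Constraint: this operation must be performed on the active NameNode! If the standby NameNode is also running, it is recommended to make the same change to the standby's hdfs-site.xml in case a failover happens during decommissioning; alternatively, simply stop the standby NameNode.
After decommissioning completes, remember to revert hdfs-site.xml (that is, reset dfs.hosts.exclude to an empty value) and run "/data/hadoop/bin/hdfs dfsadmin -refreshNodes" once more; otherwise the decommissioned DataNodes remain in the Decommissioned state indefinitely.
After decommissioning, as long as dfs.hosts is configured, a decommissioned DataNode cannot connect again even if its process has not been stopped. This is the recommended setup, because it prevents outside DataNodes from connecting by accident.
Before resetting dfs.hosts.exclude to an empty value, however, stop all decommissioned DataNode processes, and preferably also set dfs.hosts in hdfs-site.xml to restrict which DataNodes may connect to the NameNode. Otherwise a decommissioned DataNode can easily connect again by mistake. Also note that if the slaves file is used, it must be updated accordingly.
Edit hdfs-site.xml on the active NameNode and set dfs.hosts.exclude to the full path of a file, for example /home/hadoop/etc/hadoop/hosts.exclude. The file lists the host names or IPs of the DataNodes to decommission (that is, remove), one per line. (For now, do not remove these DataNodes from slaves.)
After editing hdfs-site.xml, run bin/hdfs dfsadmin -refreshNodes on the active NameNode to refresh the DataNodes. Once decommissioning completes, you can run the balancer, as after scaling out.
Use bin/hdfs dfsadmin -report or the web UI to watch the decommissioning status of the DataNodes. When it completes, remove the decommissioned DataNodes from slaves.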
Status before decommissioning:
$ hdfs dfsadmin -report
Name: 192.168.31.33:50010 (hadoop-33)
Hostname: hadoop-33
Decommission Status : Normal
Configured Capacity: 3247462653952 (2.95 TB)
DFS Used: 297339283 (283.56 MB)
Non DFS Used: 165960652397 (154.56 GB)
DFS Remaining: 3081204662272 (2.80 TB)
DFS Used%: 0.01%
DFS Remaining%: 94.88%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Wed Apr 19 18:03:33 CST 2017 |
Status while decommissioning is in progress:
$ hdfs dfsadmin -report
Name: 192.168.31.33:50010 (hadoop-33)
Hostname: hadoop-33
Decommission Status : Decommission in progress
Configured Capacity: 3247462653952 (2.95 TB)
DFS Used: 297339283 (283.56 MB)
Non DFS Used: 165960652397 (154.56 GB)
DFS Remaining: 3081204662272 (2.80 TB)
DFS Used%: 0.01%
DFS Remaining%: 94.88%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 16
Last contact: Thu Apr 20 09:00:48 CST 2017 |
Status after decommissioning completes:
$ hdfs dfsadmin -report
Name: 192.168.31.33:50010 (hadoop-33)
Hostname: hadoop-33
Decommission Status : Decommissioned
Configured Capacity: 1935079350272 (1.76 TB)
DFS Used: 257292167968 (239.62 GB)
Non DFS Used: 99063741175 (92.26 GB)
DFS Remaining: 1578723441129 (1.44 TB)
DFS Used%: 13.30%
DFS Remaining%: 81.58%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 13
Last contact: Thu Apr 20 09:29:00 CST 2017 |
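If a node stays in "Decommission In Progress" for a long time and does not reach Decommissioned, check with "hdfs fsck".
After a successful decommission, also remove the node from slaves and from the file referenced by dfs.hosts.exclude, then run bin/hdfs dfsadmin -refreshNodes again.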
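With many DataNodes, reading the full report to find the decommission state of each one is tedious. The sketch below condenses the report to one line per node. decom_status is a hypothetical name, and the parsing assumes the "Hostname:" and "Decommission Status :" lines appear as in the reports above, one field per line.

```shell
# Print "<hostname> <decommission status>" for every DataNode in the
# `hdfs dfsadmin -report` text read from stdin.
decom_status() {
  awk -F': *' '
    /^Hostname:/              { host = $2 }        # remember the current node
    /^Decommission Status *:/ { print host, $2 }   # emit its status
  '
}
```

For example, `hdfs dfsadmin -report | decom_status` can be re-run periodically until the node being removed shows Decommissioned.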
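11.5. Forcing a DataNode to Send a Block Report
While scaling out, you may find that a DataNode does not report its blocks to the NameNode at startup. Normally the NameNode tells the DataNode to report blocks in its heartbeat responses, but this mechanism can fail, for example when the NameNode and DataNode versions differ. In that case, searching the DataNode log finds no "sent block report" entry.
If the NameNode is then restarted, a large number of "missing block" entries appear. Fortunately, HDFS provides a tool that forces a DataNode to report its blocks directly: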
hdfs dfsadmin -triggerBlockReport 192.168.31.26:50020 |
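Here 192.168.31.26 is the IP address of the DataNode and 50020 is its RPC port. Ultimately the DataNode and NameNode versions should be kept consistent; otherwise this step has to be repeated every time, and other problems may exist as well.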
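If several DataNodes are affected, the command can be repeated for each of them. The following sketch loops over a host list; trigger_reports is a hypothetical name, and the port defaults to the 50020 used above.

```shell
# Ask every DataNode listed on stdin (one host name or IP per line) to send a
# block report. The optional argument is the DataNode RPC port.
trigger_reports() {
  local port=${1:-50020} host
  while read -r host; do
    [ -n "$host" ] || continue    # skip blank lines
    # </dev/null keeps the command from consuming the host list on stdin
    hdfs dfsadmin -triggerBlockReport "$host:$port" </dev/null \
      || echo "failed: $host" >&2
  done
}
```

One way to call it is `trigger_reports 50020 < $HADOOP_HOME/etc/hadoop/slaves`, assuming the slaves file lists exactly the DataNodes that should report.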
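12. Starting YARN
12.1. Starting YARN
If automatic failover does not work, check whether another ResourceManager is occupying ZooKeeper.
1) Change to the $HADOOP_HOME/sbin directory
2) Run start-yarn.sh on both the active and the standby machine to start YARN
If startup succeeds, running jps on the Master node shows ResourceManager:
> jps
24689 NameNode
30156 Jps
28861 ResourceManager |
Running jps on a Slaves node shows NodeManager:
$ jps
14019 NodeManager
23257 DataNode
15115 Jps |
To start only the ResourceManager on a given node: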
./yarn-daemon.sh start resourcemanager
And for a NodeManager:
./yarn-daemon.sh start nodemanager
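The jps check described above can be turned into a small test that scripts can use after starting a daemon. This is a sketch; has_daemon is a hypothetical name, and it relies on jps printing "<pid> <name>" per line as in the examples above.

```shell
# Succeed (exit status 0) if the named Java process appears in the `jps`
# output read from stdin, fail otherwise.
has_daemon() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

For example, `jps | has_daemon NodeManager || echo "NodeManager is not running"` reports a NodeManager that failed to come up.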
12.2. Running YARN Commands
12.2.1. yarn node -list
Lists all NodeManagers in the YARN cluster, for example (mind the spaces between arguments; running yarn by itself prints the usage help):
> yarn node -list
Total Nodes:3
Node-Id Node-State Node-Http-Address Number-of-Running-Containers
localhost:45980 RUNNING localhost:8042 0
localhost:47551 RUNNING localhost:8042 0
localhost:58394 RUNNING localhost:8042 0 |
12.2.2. yarn node -status
Shows the status of the given NodeManager, for example:
> yarn node -status localhost:47551
Node Report :
Node-Id : localhost:47551
Rack : /default-rack
Node-State : RUNNING
Node-Http-Address : localhost:8042
Last-Health-Update : 星期五 18/四月/14 01:45:41:555GMT
Health-Report :
Containers : 0
Memory-Used : 0MB
Memory-Capacity : 8192MB
CPU-Used : 0 vcores
CPU-Capacity : 8 vcores |
12.2.3. yarn rmadmin -getServiceState rm1
Shows whether rm1 is currently active or standby.
12.2.4. yarn rmadmin -transitionToStandby rm1
Switches rm1 from active to standby.
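Before switching a ResourceManager to standby it is useful to know which one is active. The sketch below queries each ID in turn with the getServiceState command shown above; active_rm is a hypothetical name, and the IDs passed in are assumed to match yarn.resourcemanager.ha.rm-ids.

```shell
# Print the ID of the first ResourceManager that reports "active".
# Usage: active_rm rm1 rm2
active_rm() {
  local id state
  for id in "$@"; do
    state=$(yarn rmadmin -getServiceState "$id" 2>/dev/null)
    if [ "$state" = "active" ]; then
      echo "$id"
      return 0
    fi
  done
  return 1    # no ResourceManager reported "active"
}
```

A non-zero return status means that none of the given ResourceManagers is active, which is worth investigating before any manual transition.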
For more yarn commands, see:
https://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/YarnCommands.html
13. Running MapReduce Programs
The share/hadoop/mapreduce subdirectory of the installation directory contains ready-made example programs:
hadoop@VM-40-171-sles10-64:~/hadoop> ls share/hadoop/mapreduce
hadoop-mapreduce-client-app-2.8.0.jar hadoop-mapreduce-client-jobclient-2.8.0-tests.jar
hadoop-mapreduce-client-common-2.8.0.jar hadoop-mapreduce-client-shuffle-2.8.0.jar
hadoop-mapreduce-client-core-2.8.0.jar hadoop-mapreduce-examples-2.8.0.jar
hadoop-mapreduce-client-hs-2.8.0.jar lib
hadoop-mapreduce-client-hs-plugins-2.8.0.jar lib-examples
hadoop-mapreduce-client-jobclient-2.8.0.jar sources |
Try running an example program:
hdfs dfs -put /etc/hosts hdfs:///test/in/
hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount hdfs:///test/in/ hdfs:///test/out/ |
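While it runs, the Java jps command shows that YARN has started processes named YarnChild.
When wordcount finishes, the results are stored in the out directory, in files named like "part-r-00000". There are two points to note when running this example:
1) The in directory must contain text files, or in may itself be the text file to count. It can be a file or directory on HDFS, or a local file or directory.
2) The out directory must not exist; the program creates it automatically and reports an error if it already exists.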
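The second point is the one that most often causes a rerun to fail, so a wrapper can check for it up front. This is a minimal sketch; run_wordcount is a hypothetical name, and it relies on `hdfs dfs -test -e`, which returns 0 when the path exists.

```shell
# Run the bundled wordcount example, refusing to start if the output path
# already exists.
# Usage: run_wordcount <examples-jar> <input-path> <output-path>
run_wordcount() {
  local jar=$1 src=$2 dst=$3
  if hdfs dfs -test -e "$dst"; then
    echo "output path already exists: $dst" >&2
    return 1
  fi
  hadoop jar "$jar" wordcount "$src" "$dst"
}
```

The wrapper deliberately does not delete an existing output directory; removing old results is left as an explicit manual step.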
The hadoop-mapreduce-examples-2.8.0.jar package contains several example programs; run it without arguments to see the usage:
> hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar wordcount
Usage: wordcount <in> <out>
> hadoop jar ./share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.0.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files. |
Change the log level to DEBUG and print to the screen:
export HADOOP_ROOT_LOGGER=DEBUG,console
14. HDFS Permission Configuration
14.1. hdfs-site.xml
dfs.permissions.enabled = true
dfs.permissions.superusergroup = supergroup
dfs.cluster.administrators = ACL-for-admins
dfs.namenode.acls.enabled = true
dfs.web.ugi = webuser,webgroup |
14.2. core-site.xml
fs.permissions.umask-mode = 022
hadoop.security.authentication = simple (authentication mode; either simple or kerberos) |
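The effect of fs.permissions.umask-mode is easiest to see with the arithmetic written out. The sketch below applies a umask to a base mode. It assumes HDFS starts from 666 for new files and 777 for new directories, as described in the HDFS permissions guide; apply_umask is a hypothetical name.

```shell
# Show the permission bits left after a umask is applied to a base mode.
# Usage: apply_umask <base-mode> <umask>   (both octal, e.g. 666 022)
apply_umask() {
  local base=$1 mask=$2
  # A leading 0 makes the shell read both values as octal.
  printf '%03o\n' $(( 0$base & ~0$mask ))
}
```

With the value 022 configured above, new files come out as 644 and new directories as 755; a stricter umask of 077 would give 600 and 700.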
15. C++ Client Programming
15.1. Sample Code
// g++ -g -o x x.cpp -L$JAVA_HOME/lib/amd64/jli -ljli -L$JAVA_HOME/jre/lib/amd64/server -ljvm -I$HADOOP_HOME/include $HADOOP_HOME/lib/native/libhdfs.a -lpthread -ldl
#include "hdfs.h"
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(int argc, char **argv)
{
#if 0
    // Write example (disabled by default)
    hdfsFS fs = hdfsConnect("default", 0); // HA mode
    const char* writePath = "hdfs://mycluster/tmp/testfile.txt";
    hdfsFile writeFile = hdfsOpenFile(fs, writePath, O_WRONLY | O_CREAT, 0, 0, 0);
    if (!writeFile) {
        fprintf(stderr, "Failed to open %s for writing!\n", writePath);
        exit(-1);
    }
    const char* buffer = "Hello, World!\n";
    // strlen(buffer)+1 also writes the terminating NUL byte
    tSize num_written_bytes = hdfsWrite(fs, writeFile, (void*)buffer, strlen(buffer)+1);
    if (hdfsFlush(fs, writeFile)) {
        fprintf(stderr, "Failed to 'flush' %s\n", writePath);
        exit(-1);
    }
    hdfsCloseFile(fs, writeFile);
#else
    // Directory listing example
    struct hdfsBuilder* bld = hdfsNewBuilder();
    hdfsBuilderSetNameNode(bld, "default"); // HA mode
    hdfsFS fs = hdfsBuilderConnect(bld);
    if (NULL == fs) {
        fprintf(stderr, "Failed to connect hdfs\n");
        exit(-1);
    }
    // List the directory given as the first argument, or "/" if none is given
    int num_entries = 0;
    const char* dir = (argc < 2) ? "/" : argv[1];
    hdfsFileInfo* entries = hdfsListDirectory(fs, dir, &num_entries);
    fprintf(stdout, "num_entries: %d\n", num_entries);
    for (int i=0; i<num_entries; ++i) {
        fprintf(stdout, "%s\n", entries[i].mName);
    }
    if (entries != NULL) {
        hdfsFreeFileInfo(entries, num_entries);
    }
    hdfsDisconnect(fs);
    // No hdfsFreeBuilder(bld) here: hdfsBuilderConnect() has already freed the builder
#endif
    return 0;
} |
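15.2. Running the Example
CLASSPATH must be set before running. If it is set incorrectly you may run into a fair amount of trouble: for example, operations intended for files and directories on HDFS end up acting on local files and directories, or errors such as "java.net.UnknownHostException" appear.
To avoid such errors, it is strongly recommended to obtain the correct CLASSPATH value with the command "hadoop classpath --glob".
You also need to add the locations of the libjli.so and libjvm.so libraries to LD_LIBRARY_PATH, for example: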
export LD_LIBRARY_PATH=$JAVA_HOME/lib/amd64/jli:$JAVA_HOME/jre/lib/amd64/server:$LD_LIBRARY_PATH |
16. fsImage
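Hadoop provides viewers for fsImage and edit files, oiv and oev respectively. Usage examples:
hdfs oiv -i fsimage_0000000000000001953 -p XML -o x.xml
hdfs oev -i edits_0000000000000001054-0000000000000001055 -o x.xml |
With these tools, fsImage or edit files can be edited and modified to repair data.
The active and standby NameNodes synchronize data through QJM. The QJM (JournalNode) data directory is set by dfs.journalnode.edits.dir, and the NameNode data directory by dfs.namenode.name.dir.
QJM produces log files (edit files) through a consistent replication protocol. An example log file name:
edits_0000000000024641811-0000000000024641854
The log files on all nodes are therefore identical, that is, they have the same MD5 value. The active and standby NameNodes fetch log files from QJM and store them in their own data directories, so the log files on all QJM nodes and on both NameNodes are identical.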
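The two numbers in a finalized edit file name are the first and last transaction IDs it contains. A short sketch for decoding them follows; edits_range is a hypothetical name, and it only handles finalized names of the form edits_<first>-<last>, not edits_inprogress files.

```shell
# Split an edit-log file name into its first and last transaction IDs.
# Prints: <first> <last> <number of transactions>
edits_range() {
  basename "$1" | awk -F'[_-]' '{ printf "%d %d %d\n", $2, $3, $3 - $2 + 1 }'
}
```

For the example name above this prints 24641811 24641854 44, meaning the file covers 44 transactions. Comparing the output across JournalNodes is a quick way to spot a node that is missing a segment before resorting to MD5 checks.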
The standby NameNode periodically merges log files into an fsImage file and syncs the fsImage to the active NameNode, so under normal conditions the fsImage files on the two NameNodes are identical as well. If they differ, the data on the two NameNodes may have become inconsistent, or the standby may have just produced a new fsImage that has not yet been synced to the active NameNode.
By default the standby NameNode merges edit files into a new fsImage file once an hour and keeps only the two most recent fsImage files:
The following shows the start of merging edit files into a new fsImage file (3600 seconds, i.e. 1 hour; the actual interval is set by dfs.namenode.checkpoint.period in hdfs-site.xml):
2017-04-21 15:35:44,994 INFO org.apache.hadoop.hdfs.server.namenode.ha.StandbyCheckpointer: Triggering checkpoint because it has been 3600 seconds since the last checkpoint, which exceeds the configured interval 3600
2017-04-21 15:35:44,994 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Save namespace ...
2017-04-21 15:35:45,022 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 24641535
The following shows the deletion of the fsImage file before the previous one, fsimage_0000000000024638036:
2017-04-21 15:35:45,022 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/data5/namenode/current/fsimage_0000000000024638036, cpktTxId=0000000000024638036)
The following shows that uploading the new fsImage file to the active NameNode took 0.142 seconds:
2017-04-21 15:35:45,239 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Uploaded image with txid 24647473 to namenode at http://hadoop-030:50070 in 0.142 seconds
2017-04-21 15:36:38,528 INFO org.apache.hadoop.hdfs.server.namenode.ha.EditLogTailer: Triggering log roll on remote NameNode hadoop-030/10.143.136.207:8020
2017-04-21 15:36:38,587 INFO org.apache.hadoop.hdfs.server.namenode.FSImage: Reading org.apache.hadoop.hdfs.server.namenode.RedundantEditLogInputStream@5854eec9 expecting start txid #24647474 |
The standby NameNode uploads the latest fsImage to the active NameNode:
2017-04-21 15:35:45,119 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Transfer took 0.01s at 56923.08 KB/s
The following shows that the latest fsImage file has been downloaded; its file name will be fsimage_0000000000024647473:
2017-04-21 15:35:45,119 INFO org.apache.hadoop.hdfs.server.namenode.TransferFsImage: Downloaded file fsimage.ckpt_0000000000024647473 size 758002 bytes.
The following shows that 2 fsImage files are retained:
2017-04-21 15:35:45,126 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Going to retain 2 images with txid >= 24641535
The following shows the deletion of the fsImage file before the previous one, fsimage_0000000000024638036:
2017-04-21 15:35:45,126 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old image FSImageFile(file=/data5/namenode/current/fsimage_0000000000024638036, cpktTxId=0000000000024638036)
2017-04-21 15:35:45,236 INFO org.apache.hadoop.hdfs.qjournal.client.QuorumJournalManager: Purging remote journals older than txid 23641536
2017-04-21 15:35:45,236 INFO org.apache.hadoop.hdfs.server.namenode.FileJournalManager: Purging logs older than 23641536
The following shows the deletion of older edit files:
2017-04-21 15:35:45,244 INFO org.apache.hadoop.hdfs.server.namenode.NNStorageRetentionManager: Purging old edit log EditLogFile(file=/data5/namenode/current/edits_0000000000023641041-0000000000023641120,first=0000000000023641041,last=0000000000023641120,inProgress=false,hasCorruptHeader=false) |
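17. Common Errors
1) ConnectException when running "hdfs dfs -ls"
A possible cause is that the specified port 9000 is wrong. The port is specified by the property "dfs.namenode.rpc-address" in hdfs-site.xml; it is the NameNode's RPC service port.
After a file is uploaded, it is stored under the DataNode's data directory (specified by the property "dfs.datanode.data.dir" in the DataNode's hdfs-site.xml), for example: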
$HADOOP_HOME/data/current/BP-139798373-172.25.40.171-1397735615751/current/finalized/blk_1073741825
The "blk" in the file name stands for block. By default blk_1073741825 is one complete block of the file; Hadoop does no additional processing on it.
hdfs dfs -ls hdfs://172.25.40.171:9000 14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. 14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/04/17 12:04:02 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. Java HotSpot(TM) 64-Bit Server VM warning: You have loaded library /data/hadoop/hadoop-2.8.0/lib/native/libhadoop.so.1.0.0 which might have disabled stack guard. The VM will try to fix the stack guard now. It's highly recommended that you fix the library with 'execstack -c <libfile>', or link it with '-z noexecstack'. 14/04/17 12:04:02 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable 14/04/17 12:04:03 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.attempts; Ignoring. 14/04/17 12:04:03 WARN conf.Configuration: mapred-site.xml:an attempt to override final parameter: mapreduce.job.end-notification.max.retry.interval; Ignoring. ls: Call From VM-40-171-sles10-64/172.25.40.171 to VM-40-171-sles10-64:9000 failed on connection exception: java.net.ConnectException: 拒绝连接; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused |
2) Initialization failed for Block pool
A possible cause is that the DataNode data directories were not emptied before the NameNode was formatted.
3) Incompatible clusterIDs
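The "Incompatible clusterIDs" error occurs because the data directories of the DataNodes were not emptied before running "hdfs namenode -format".
Some articles and posts online say it is the tmp directory. That is not wrong in itself, but in Hadoop 2.8.0 it is the data directory, and the log already points this out with "/data/hadoop/hadoop-2.8.0/data". So do not follow online fixes blindly; look closely at what the logs say when you hit a problem.
As the above shows, the fix is to empty the data directory on all DataNodes, taking care not to delete the data directory itself. The data directory is specified by the property "dfs.datanode.data.dir" in hdfs-site.xml.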
2014-04-17 19:30:33,075 INFO org.apache.hadoop.hdfs.server.common.Storage: Lock on /data/hadoop/hadoop-2.8.0/data/in_use.lock acquired by nodename 28326@localhost 2014-04-17 19:30:33,078 FATAL org.apache.hadoop.hdfs.server.datanode.DataNode: Initialization failed for block pool Block pool <registering> (Datanode Uuid unassigned) service to /172.25.40.171:9001 java.io.IOException: Incompatible clusterIDs in /data/hadoop/hadoop-2.8.0/data: namenode clusterID = CID-50401d89-a33e-47bf-9d14-914d8f1c4862; datanode clusterID = CID-153d6fcb-d037-4156-b63a-10d6be224091 at org.apache.hadoop.hdfs.server.datanode.DataStorage.doTransition(DataStorage.java:472) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:225) at org.apache.hadoop.hdfs.server.datanode.DataStorage.recoverTransitionRead(DataStorage.java:249) at org.apache.hadoop.hdfs.server.datanode.DataNode.initStorage(DataNode.java:929) at org.apache.hadoop.hdfs.server.datanode.DataNode.initBlockPool(DataNode.java:900) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.verifyAndSetNamespaceInfo(BPOfferService.java:274) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:220) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:815) at java.lang.Thread.run(Thread.java:744) 2014-04-17 19:30:33,081 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Ending block pool service for: Block pool <registering> (Datanode Uuid unassigned) service to /172.25.40.171:9001 2014-04-17 19:30:33,184 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN java.lang.Exception: trace at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143) at org.apache.hadoop.hdfs.server.datanode.BlockPoolManager.remove(BlockPoolManager.java:91) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:859) at 
org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837) at java.lang.Thread.run(Thread.java:744) 2014-04-17 19:30:33,184 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Removed Block pool <registering> (Datanode Uuid unassigned) 2014-04-17 19:30:33,184 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Block pool ID needed, but service not yet registered with NN java.lang.Exception: trace at org.apache.hadoop.hdfs.server.datanode.BPOfferService.getBlockPoolId(BPOfferService.java:143) at org.apache.hadoop.hdfs.server.datanode.DataNode.shutdownBlockPool(DataNode.java:861) at org.apache.hadoop.hdfs.server.datanode.BPOfferService.shutdownActor(BPOfferService.java:350) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.cleanUp(BPServiceActor.java:619) at org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:837) at java.lang.Thread.run(Thread.java:744) 2014-04-17 19:30:35,185 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Exiting Datanode 2014-04-17 19:30:35,187 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 0 2014-04-17 19:30:35,189 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: SHUTDOWN_MSG: /************************************************************ SHUTDOWN_MSG: Shutting down DataNode at localhost/127.0.0.1 ************************************************************/ |
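Before emptying a DataNode directory, it is worth confirming that the cluster IDs really differ. Both the NameNode and the DataNode record the ID as a clusterID= line in the current/VERSION file of their storage directory. The sketch below reads and compares the two; cluster_id and same_cluster are hypothetical names, and the directories in the usage note are examples taken from the logs in this document.

```shell
# Print the clusterID recorded in a storage directory's current/VERSION file.
cluster_id() {
  sed -n 's/^clusterID=//p' "$1/current/VERSION"
}

# Succeed when two storage directories carry the same clusterID.
same_cluster() {
  [ "$(cluster_id "$1")" = "$(cluster_id "$2")" ]
}
```

For instance, `cluster_id /data5/namenode` on the NameNode and `cluster_id /data/hadoop/hadoop-2.8.0/data` on the DataNode should print the same CID value. If they do not, the DataNode directory belongs to an earlier format of the cluster.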
4) Inconsistent checkpoint fields
The "Inconsistent checkpoint fields" error in the SecondaryNameNode may be caused by "hadoop.tmp.dir" in core-site.xml on the SecondaryNameNode not being set properly.
2014-04-17 11:42:18,189 INFO org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Log Size Trigger :1000000 txns 2014-04-17 11:43:18,365 ERROR org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode: Exception in doCheckpoint java.io.IOException: Inconsistent checkpoint fields. LV = -56 namespaceID = 1384221685 cTime = 0 ; clusterId = CID-319b9698-c88d-4fe2-8cb2-c4f440f690d4 ; blockpoolId = BP-1627258458-172.25.40.171-1397735061985. Expecting respectively: -56; 476845826; 0; CID-50401d89-a33e-47bf-9d14-914d8f1c4862; BP-2131387753-172.25.40.171-1397730036484. at org.apache.hadoop.hdfs.server.namenode.CheckpointSignature.validateStorageInfo(CheckpointSignature.java:135) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doCheckpoint(SecondaryNameNode.java:518) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.doWork(SecondaryNameNode.java:383) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode$1.run(SecondaryNameNode.java:349) at org.apache.hadoop.security.SecurityUtil.doAsLoginUserOrFatal(SecurityUtil.java:415) at org.apache.hadoop.hdfs.server.namenode.SecondaryNameNode.run(SecondaryNameNode.java:345) at java.lang.Thread.run(Thread.java:744)
In addition, set "dfs.datanode.data.dir" in hdfs-site.xml on the SecondaryNameNode to a suitable value. An example hadoop.tmp.dir setting:
<property>
<name>hadoop.tmp.dir</name>
<value>/data/hadoop/current/tmp</value>
<description>A base for other temporary directories.</description>
</property> |
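5) fs.defaultFS is file:///
In core-site.xml, this error is reported when only fs.defaultFS is set while fs.default.name keeps the default file:///. The fix is to set both to the same value.
6) a shared edits dir must not be specified if HA is not enabled
This error may be caused by dfs.nameservices or dfs.ha.namenodes.mycluster not being configured in hdfs-site.xml.
7) /tmp/dfs/name is in an inconsistent state: storage directory does not exist or is not accessible.
Simply create the directory indicated in the log.
8) The auxService:mapreduce_shuffle does not exist
The cause is that "yarn.nodemanager.aux-services" is not configured in yarn-site.xml. Set its value to mapreduce_shuffle and restart YARN to resolve the problem. Remember that all YARN nodes need this change, including the ResourceManager and the NodeManagers; if a NodeManager is not updated, the error persists.
9) org.apache.hadoop.ipc.Client: Retrying connect to server
This problem may occur because yarn-site.xml on the NodeManager differs from the one on the ResourceManager, for example when yarn.resourcemanager.ha.rm-ids is not configured on the NodeManager.
10) mapreduce.Job: Running job: job_1445931397013_0001
When a MapReduce job is submitted, Hadoop hangs at mapreduce.Job: Running job: job_1445931397013_0001.
A possible cause is that the YARN NodeManager is not running; confirm with the JDK's jps.
The problem may also occur because yarn-site.xml on the NodeManager differs from the one on the ResourceManager, for example when yarn.resourcemanager.ha.rm-ids is not configured on the NodeManager.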
11) Could not format one or more JournalNodes
Running "./hdfs namenode -format" reports "Could not format one or more JournalNodes".
dfs.namenode.shared.edits.dir in hdfs-site.xml may be misconfigured, for example with a duplicated entry:
<value>qjournal://hadoop-168-254:8485;hadoop-168-254:8485;hadoop-168-253:8485;hadoop-168-252:8485;hadoop-168-251:8485/mycluster</value>
After fixing it, restart the JournalNodes; this may resolve the problem.
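A duplicated entry like the one above is hard to spot by eye in a long URI. The following sketch lists any host:port that occurs more than once; qjournal_dups is a hypothetical name.

```shell
# Print every host:port that appears more than once in a qjournal URI.
# Prints nothing when all entries are distinct.
qjournal_dups() {
  local hosts=${1#qjournal://}   # strip the scheme
  hosts=${hosts%%/*}             # drop the journal ID
  printf '%s\n' "$hosts" | tr ';' '\n' | sort | uniq -d
}
```

Applied to the value above it prints hadoop-168-254:8485, the entry that was listed twice.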
12) org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: Already in standby state
This error may be caused by a misconfigured yarn.resourcemanager.webapp.address in yarn-site.xml, for example two entries named yarn.resourcemanager.webapp.address.rm1 where there should be yarn.resourcemanager.webapp.address.rm1 and yarn.resourcemanager.webapp.address.rm2.
13) No valid image files found
If this is the standby NameNode, run "hdfs namenode -bootstrapStandby" and then start it.
2015-12-01 15:24:39,535 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.io.FileNotFoundException: No valid image files found
at org.apache.hadoop.hdfs.server.namenode.FSImageTransactionalStorageInspector.getLatestImages(FSImageTransactionalStorageInspector.java:165)
at org.apache.hadoop.hdfs.server.namenode.FSImage.loadFSImage(FSImage.java:623)
at org.apache.hadoop.hdfs.server.namenode.FSImage.recoverTransitionRead(FSImage.java:294)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFSImage(FSNamesystem.java:975)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.loadFromDisk(FSNamesystem.java:681)
at org.apache.hadoop.hdfs.server.namenode.NameNode.loadNamesystem(NameNode.java:584)
at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:644)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
2015-12-01 15:24:39,536 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2015-12-01 15:24:39,539 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
14) xceivercount 4097 exceeds the limit of concurrent xcievers 4096
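This error occurs because the value 4096 of the hdfs-site.xml setting "dfs.datanode.max.xcievers" is too small and needs to be increased. (In Hadoop 2.x this property is named dfs.datanode.max.transfer.threads; the old name is deprecated.) The error causes HBase to report "notservingregionexception".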
16/04/06 14:30:34 ERROR namenode.NameNode: Failed to start namenode.
15) java.lang.IllegalArgumentException: Unable to construct journal, qjournal://hadoop-030:8485;hadoop-031:8454;hadoop-032
When "hdfs namenode -format" fails with the error above, dfs.namenode.shared.edits.dir in hdfs-site.xml is misconfigured: the ":8454" part is missing after hadoop-032.
16) Bad URI 'qjournal://hadoop-030:8485;hadoop-031:8454;hadoop-032:8454': must identify journal in path component
This occurs because the path in "dfs.namenode.shared.edits.dir" in hdfs-site.xml is missing the cluster name.
17) 16/04/06 14:48:19 INFO ipc.Client: Retrying connect to server: hadoop-032/10.143.136.211:8454. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
Check the value of "dfs.namenode.shared.edits.dir" in hdfs-site.xml. The default JournalNode port is 8485, not 8454; make sure it was not mistyped. The JournalNode port is determined by the dfs.journalnode.rpc-address setting in hdfs-site.xml.
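Errors 15 to 17 all come from small slips in the same value, so one check can catch them together. The sketch below flags a missing journal ID, a missing port, and any port other than the default 8485. qjournal_check is a hypothetical name, and a non-default port is only a hint, since dfs.journalnode.rpc-address may legitimately be changed.

```shell
# Report common mistakes in a dfs.namenode.shared.edits.dir value.
# Prints one line per finding; prints nothing when the URI looks fine.
qjournal_check() {
  local rest=${1#qjournal://} hosts h
  hosts=${rest%%/*}
  case $rest in
    */?*) ;;                                  # journal ID present after "/"
    *)    echo "missing journal ID in path" ;;
  esac
  printf '%s\n' "$hosts" | tr ';' '\n' | while read -r h; do
    case $h in
      *:8485) ;;                              # default JournalNode RPC port
      *:*)    echo "non-default port: $h" ;;
      *)      echo "missing port: $h" ;;
    esac
  done
}
```

Run against the value from error 15, it reports the missing journal ID, the 8454 port on hadoop-031, and the missing port on hadoop-032.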
18) Exception in thread "main" org.apache.hadoop.HadoopIllegalArgumentException: Could not get the namenode ID of this node. You may run zkfc on the node other than namenode.
When "hdfs zkfc -formatZK" fails with the error above, it is because "hdfs namenode -format" has not been run yet. The NameNode ID is generated during "hdfs namenode -format".
19) 2016-04-06 17:08:07,690 INFO org.apache.hadoop.hdfs.server.common.Storage: Storage directory [DISK]file:/data3/datanode/data/ has already been used.
The DataNode fails to start when launched as a non-root user, and its log file contains the following error messages:
2016-04-06 17:08:07,707 INFO org.apache.hadoop.hdfs.server.common.Storage: Analyzing storage directories for bpid BP-418073539-10.143.136.207-1459927327462
2016-04-06 17:08:07,707 WARN org.apache.hadoop.hdfs.server.common.Storage: Failed to analyze storage directories for block pool BP-418073539-10.143.136.207-1459927327462
java.io.IOException: BlockPoolSliceStorage.recoverTransitionRead: attempt to load an used block storage: /data3/datanode/data/current/BP-418073539-10.143.136.207-1459927327462
Looking further, you will also find the following error message:
Invalid dfs.datanode.data.dir /data3/datanode/data:
EPERM: Operation not permitted
Checking the permissions of /data3/datanode/data with "ls -l" shows that its owner is root, because the DataNode was previously started as root. Changing the owner back resolves the problem.
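The ownership check can be scripted so that it runs before every DataNode start. This is a sketch under two assumptions: the host is Linux with GNU stat, and the DataNode runs as the user executing the check. owned_by is a hypothetical name.

```shell
# Succeed if the directory is owned by the given numeric UID
# (default: the UID of the current user). Requires GNU stat.
owned_by() {
  local dir=$1 uid=${2:-$(id -u)}
  [ "$(stat -c %u "$dir")" = "$uid" ]
}
```

For example, `owned_by /data3/datanode/data || echo "wrong owner"` run as the DataNode user flags the situation described above. The owner can then be corrected with chown -R as root, using whatever user and group the DataNode actually runs as.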
20) 2016-04-06 18:00:26,939 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: Problem connecting to server: hadoop-031/10.143.136.208:8020
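The DataNode log keeps recording entries like the following because the DataNode tries to reach hadoop-031 (10.143.136.208) as a NameNode, but that NameNode has not been started and is not the active one. This does not mean the DataNode failed to start: a DataNode maintains heartbeats with both the active and the standby NameNode, so these entries are normal while the standby NameNode is down.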
2016-04-06 18:00:32,940 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-031/10.143.136.208:8020. Already tried 0 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
2016-04-06 17:55:44,555 INFO org.apache.hadoop.hdfs.server.datanode.DataNode: Namenode Block pool BP-418073539-10.143.136.207-1459927327462 (Datanode Uuid 2d115d45-fd48-4e86-97b1-e74a1f87e1ca) service to hadoop-030/10.143.136.207:8020 trying to claim ACTIVE state with txid=1
"trying to claim ACTIVE state" comes from updateActorStatesFromHeartbeat() in hadoop/hdfs/server/datanode/BPOfferService.java.
2016-04-06 17:55:49,893 INFO org.apache.hadoop.ipc.Client: Retrying connect to server: hadoop-031/10.143.136.208:8020. Already tried 5 time(s); retry policy is RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS)
"Retrying connect to server" comes from handleConnectionTimeout() and handleConnectionFailure() in hadoop/ipc/Client.java.
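The retry policy named in the log, RetryUpToMaximumCountWithFixedSleep(maxRetries=10, sleepTime=1000 MILLISECONDS), can be pictured with the following shell sketch. It is not Hadoop's implementation: retry_fixed is a hypothetical helper that simply runs a command up to a maximum number of attempts, sleeping a fixed interval after each failure.

```shell
#!/bin/sh
# retry_fixed MAX DELAY_SECONDS COMMAND [ARGS...]
# Run COMMAND until it succeeds, at most MAX attempts, sleeping
# DELAY_SECONDS after every failed attempt. Returns 0 on success,
# 1 once all attempts have failed.
retry_fixed() {
    max=$1
    delay=$2
    shift 2
    tried=0
    while [ "$tried" -lt "$max" ]; do
        if "$@"; then
            return 0
        fi
        echo "Already tried $tried time(s); max=$max, sleep=${delay}s" >&2
        tried=$((tried + 1))
        sleep "$delay"
    done
    return 1
}

# Example (values taken from the log above, command is a placeholder):
# retry_fixed 10 1 some_connect_command
```

With a fixed sleep the total wait is bounded by roughly MAX times DELAY_SECONDS, which is why the DataNode log repeats at a steady rhythm while the standby NameNode is down.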
21) ERROR cluster.YarnClientSchedulerBackend: Yarn application has already exited with state FINISHED!
If you hit this error, check the NodeManager log for a message like the following:
WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl: Container [pid=26665,containerID=container_1461657380500_0020_02_000001] is running beyond virtual memory limits. Current usage: 345.0 MB of 1 GB physical memory used; 2.2 GB of 2.1 GB virtual memory used. Killing container.
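If it is present, increase the value of yarn.nodemanager.vmem-pmem-ratio in yarn-site.xml. The default value is 2.1, which is why the container above, with 1 GB of physical memory, was limited to 2.1 GB of virtual memory and killed at 2.2 GB.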
16/10/13 10:23:19 ERROR client.TransportClient: Failed to send RPC 7614640087981520382 to /10.143.136.231:34800: java.nio.channels.ClosedChannelException
java.nio.channels.ClosedChannelException
16/10/13 10:23:19 ERROR cluster.YarnSchedulerBackend$YarnSchedulerEndpoint: Sending RequestExecutors(0,0,Map()) to AM was unsuccessful
java.io.IOException: Failed to send RPC 7614640087981520382 to /10.143.136.231:34800: java.nio.channels.ClosedChannelException
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:249)
at org.apache.spark.network.client.TransportClient$3.operationComplete(TransportClient.java:233)
at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:680)
at io.netty.util.concurrent.DefaultPromise$LateListeners.run(DefaultPromise.java:845)
at io.netty.util.concurrent.DefaultPromise$LateListenerNotifier.run(DefaultPromise.java:873)
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:357)
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:357)
at io.netty.util.concurrent.SingleThreadEventExecutor$2.run(SingleThreadEventExecutor.java:111)
at java.lang.Thread.run(Thread.java:745)
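The virtual memory limit in item 21 is the container's physical memory multiplied by yarn.nodemanager.vmem-pmem-ratio. The sketch below reproduces that arithmetic for the container in the NodeManager log; the helper names vmem_limit_mb and exceeds_vmem_limit are hypothetical, and the values are in MB (1 GB = 1024 MB, 2.2 GB = 2252.8 MB).

```shell
#!/bin/sh
# Virtual memory limit in MB = physical memory in MB * vmem-pmem ratio.
vmem_limit_mb() {
    awk -v pmem="$1" -v ratio="$2" 'BEGIN { printf "%.1f\n", pmem * ratio }'
}

# Return 0 if the used virtual memory (MB) is above the limit, 1 otherwise.
exceeds_vmem_limit() {
    awk -v used="$1" -v pmem="$2" -v ratio="$3" \
        'BEGIN { exit !(used > pmem * ratio) }'
}

# Container from the log: 1 GB physical memory, default ratio 2.1.
vmem_limit_mb 1024 2.1

# 2.2 GB of virtual memory used is above that limit.
if exceeds_vmem_limit 2252.8 1024 2.1; then
    echo "container would be killed at ratio 2.1"
fi
```

Raising the ratio until the observed virtual memory usage fits under the limit is one way to choose a new value; the right setting depends on the workload.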
22) java.net.SocketException: Unresolved address
This may be caused by starting a NameNode on a host that is not configured as a NameNode:
java.net.SocketException: Call From cluster to null:0 failed on socket exception: java.net.SocketException: Unresolved address
23) should be specified as a URI in configuration files
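Add the prefix "file://" to the paths configured in dfs.namenode.name.dir, dfs.journalnode.edits.dir, and dfs.datanode.data.dir. (For dfs.journalnode.edits.dir, verify the behavior on your version first, since the JournalNode may expect a plain absolute path rather than a URI.)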
common.Util: Path /home/namenode/data should be specified as a URI in configuration files. Please update hdfs configuration.
For example:
<property>
<name>dfs.namenode.name.dir</name>
<value>file:///home/namenode/data</value>
</property>
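The conversion from a plain path to the URI form is mechanical. A minimal sketch, assuming a hypothetical helper named to_file_uri:

```shell
#!/bin/sh
# Convert an absolute local path into a file:// URI.
# Values that already contain a scheme are printed unchanged.
to_file_uri() {
    case "$1" in
        *://*) printf '%s\n' "$1" ;;          # already a URI
        /*)    printf 'file://%s\n' "$1" ;;   # absolute path -> file:///...
        *)     echo "not an absolute path: $1" >&2
               return 1 ;;
    esac
}

# Path from the warning above.
to_file_uri /home/namenode/data
```

Note the three slashes in the result: two belong to "file://" and the third is the leading slash of the absolute path.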
24) Failed to place enough replicas
If every dfs.datanode.data.dir entry on the DataNodes is configured with the SSD storage type, running "hdfs dfs -put /etc/hosts hdfs:///tmp/" fails with the following errors:
2017-05-04 16:08:22,545 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) For more information, please enable DEBUG log level on org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy and org.apache.hadoop.net.NetworkTopology
2017-05-04 16:08:22,545 WARN org.apache.hadoop.hdfs.protocol.BlockStoragePolicy: Failed to place enough replicas: expected size is 3 but only 0 storage types can be selected (replication=3, selected=[], unavailable=[DISK], removed=[DISK, DISK, DISK], policy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]})
2017-05-04 16:08:22,545 WARN org.apache.hadoop.hdfs.server.blockmanagement.BlockPlacementPolicy: Failed to place enough replicas, still in need of 3 to reach 3 (unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}, newBlock=true) All required storage types are unavailable: unavailableStorages=[DISK], storagePolicy=BlockStoragePolicy{HOT:7, storageTypes=[DISK], creationFallbacks=[], replicationFallbacks=[ARCHIVE]}
2017-05-04 16:08:22,545 INFO org.apache.hadoop.ipc.Server: IPC Server handler 37 on 8020, call Call#5 Retry#0 org.apache.hadoop.hdfs.protocol.ClientProtocol.addBlock from 10.208.5.220:40701
java.io.IOException: File /tmp/in/hosts._COPYING_ could only be replicated to 0 nodes instead of minReplication (=1). There are 5 datanode(s) running and no node(s) are excluded in this operation.
at org.apache.hadoop.hdfs.server.blockmanagement.BlockManager.chooseTarget4NewBlock(BlockManager.java:1733)
at org.apache.hadoop.hdfs.server.namenode.FSDirWriteFileOp.chooseTargetForNewBlock(FSDirWriteFileOp.java:265)
at org.apache.hadoop.hdfs.server.namenode.FSNamesystem.getAdditionalBlock(FSNamesystem.java:2496)
at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.addBlock(NameNodeRpcServer.java:828)
at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolServerSideTranslatorPB.addBlock(ClientNamenodeProtocolServerSideTranslatorPB.java:506)
at org.apache.hadoop.hdfs.protocol.proto.ClientNamenodeProtocolProtos$ClientNamenodeProtocol$2.callBlockingMethod(ClientNamenodeProtocolProtos.java)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:447)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:989)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:845)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:788)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1807)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2455)
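The log for item 24 shows that the HOT storage policy asks for storage type DISK, while every data directory is tagged SSD, so no target can be chosen. The sketch below lists the storage type of each entry in a dfs.datanode.data.dir value and checks whether any DISK entry exists. The helper names storage_types and has_disk and the /data1, /data2 paths are assumptions for illustration; entries without a [TYPE] tag are treated as DISK, which is the HDFS default.

```shell
#!/bin/sh
# Print the storage type of each comma-separated entry in a
# dfs.datanode.data.dir value, one per line. Untagged entries are DISK.
storage_types() {
    printf '%s\n' "$1" | tr ',' '\n' | while IFS= read -r entry; do
        case "$entry" in
            \[*\]*) t=${entry#\[}              # drop the leading "["
                    printf '%s\n' "${t%%\]*}"  # keep text before "]"
                    ;;
            *)      printf 'DISK\n' ;;
        esac
    done
}

# Return 0 if at least one entry has storage type DISK, 1 otherwise.
has_disk() {
    storage_types "$1" | grep -qx 'DISK'
}

# Hypothetical all-SSD configuration, as in the failing case above.
if ! has_disk '[SSD]file:///data1,[SSD]file:///data2'; then
    echo "no DISK storage: the default HOT policy cannot place replicas"
fi
```

When the check fails, the log suggests that either at least one directory should remain DISK or the target path needs a storage policy that accepts SSD.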