Hadoop 2.2 multi-HDFS cluster installation
http://www.superwu.cn/2014/02/12/1094/
The basic installation can follow the tutorial above, but a few problems still came up along the way; they are collected here:
1. Installing as the root user works and avoids a lot of permission trouble, but it is better not to, because it hides problems. Add a dedicated user instead:
If the user is new, simply add it, e.g. useradd hadoop. If the user existed before and was deleted but its group is still there, use useradd -g hadoop hadoop.
2. Change the hadoop user's password: as root, run passwd hadoop and type the new password.
3. Then give the hadoop user ownership of the directories it will operate on (including the installation and data directories): chown -R hadoop:hadoop <directory>
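A minimal sketch of steps 1 to 3, run as root; the two directory paths are examples taken from the configuration later in this post and may differ on your machines:
groupadd hadoop                                   # create the group if it does not exist yet
useradd -g hadoop hadoop                          # create the hadoop user inside that group
passwd hadoop                                     # set its password interactively
chown -R hadoop:hadoop /usr/local/hadoop-2.2.0    # Hadoop installation directory (example path)
chown -R hadoop:hadoop /usr/local/hadoop/tmp      # data directory from hadoop.tmp.dir (example path)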
4. Set up passwordless SSH:
On the master node, as the hadoop user, run ssh-keygen -t rsa and press Enter through every prompt.
On the master node, run cp ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
On the master node, run ssh-copy-id -i <hostname> for every other node.
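For example, the loop below (a sketch using the hostnames listed in step 5) pushes the key to every other node and verifies the passwordless login:
for host in hadoop-kf101.jd.com hadoop-kf102.jd.com hadoop-kf103.jd.com \
            hadoop-kf104.jd.com hadoop-kf105.jd.com; do
    ssh-copy-id -i ~/.ssh/id_rsa.pub "$host"   # appends the public key to the remote authorized_keys
    ssh "$host" hostname                       # should print the remote hostname without a password prompt
done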
5. The nodes of the test cluster are as follows:
cluster1:
192.168.157.100 hadoop-kf100.jd.com
192.168.157.101 hadoop-kf101.jd.com
192.168.157.102 hadoop-kf102.jd.com
cluster2:
192.168.157.103 hadoop-kf103.jd.com
192.168.157.104 hadoop-kf104.jd.com
192.168.157.105 hadoop-kf105.jd.com
The roles of each machine are as follows:
Role | hadoop-kf100.jd.com | hadoop-kf101.jd.com | hadoop-kf102.jd.com | hadoop-kf103.jd.com | hadoop-kf104.jd.com | hadoop-kf105.jd.com
NameNode? | Yes (cluster1) | Yes (cluster1) | Yes (cluster1) | Yes (cluster2) | Yes (cluster2) | Yes (cluster2)
DataNode? | Yes | Yes | Yes | Yes | Yes | Yes
JournalNode? | Yes | Yes | No | Yes | No | No
ZooKeeper? | Yes | Yes | No | Yes | No | No
ZKFC? | Yes | Yes | No | Yes | Yes | No
6. Set the environment variables as follows:
export JAVA_HOME=/export/servers/jdk1.6.0_25
export JAVA_BIN=/export/servers/jdk1.6.0_25/bin
export HADOOP_HOME=/usr/local/hadoop-2.2.0
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export HADOOP_OPTS="-Djava.library.path=$HADOOP_HOME/lib"
export JAVA_LIBRARY_PATH=$HADOOP_HOME/lib/native
export ZOOKEEPER_HOME=/export/servers/zookeeper-3.4.6
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$ZOOKEEPER_HOME/bin:$PATH
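These lines are typically appended to the hadoop user's ~/.bashrc (or /etc/profile) on every node; a quick way to verify they took effect, assuming that file is used:
source ~/.bashrc     # reload the profile after adding the export lines above
java -version        # should report JDK 1.6.0_25
hadoop version       # should report Hadoop 2.2.0
which zkServer.sh    # should resolve inside zookeeper-3.4.6/bin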
7. After extracting Hadoop into the installation directory, the following files are the main ones to modify:
(1)core-site.xml:
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://cluster1</value> <!-- name of cluster1, used as the default HDFS path; the nameservice is defined in hdfs-site.xml -->
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/usr/local/hadoop/tmp</value> <!-- common base directory where the NameNode, DataNode, JournalNode, etc. store their data -->
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>hadoop-kf100.jd.com:2181,hadoop-kf101.jd.com:2181,hadoop-kf103.jd.com:2181</value>
<description>ZooKeeper quorum used for automatic failover</description>
</property>
<property>
<name>hadoop.proxyuser.hadoop.hosts</name> <!-- proxy-user permission for the Oozie client -->
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hadoop.groups</name>
<value>*</value>
</property>
</configuration>
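Since fs.defaultFS points at cluster1 while hdfs-site.xml declares both nameservices, clients on these nodes can address either cluster; a small illustration (the paths are examples):
hadoop fs -ls /                              # relative paths resolve against fs.defaultFS, i.e. hdfs://cluster1/
hadoop fs -ls hdfs://cluster2/               # the second nameservice has to be addressed explicitly
hadoop fs -put a.txt hdfs://cluster2/tmp/    # example upload into cluster2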
(2)hadoop-env.sh:
export JAVA_HOME=/export/servers/jdk1.6.0_25
(3)hdfs-site.xml(cluster2):
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value> <!-- replication factor -->
</property>
<property>
<name>dfs.nameservices</name>
<value>cluster1,cluster2</value> <!-- the two HDFS nameservices (clusters) -->
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>hadoop100,hadoop101</value> <!-- NameNodes of cluster1 -->
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.hadoop100</name>
<value>hadoop-kf100.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.hadoop100</name>
<value>hadoop-kf100.jd.com:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
<value>hadoop-kf101.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.hadoop101</name>
<value>hadoop-kf101.jd.com:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-kf100.jd.com:8485;hadoop-kf101.jd.com:8485;hadoop-kf103.jd.com:8485/cluster2</value>
<description>The NameNodes of cluster2 share their edits directory through this JournalNode cluster</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.cluster1</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster2</name>
<value>hadoop103,hadoop104</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster2.hadoop103</name>
<value>hadoop-kf103.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster2.hadoop103</name>
<value>hadoop-kf103.jd.com:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster2.hadoop104</name>
<value>hadoop-kf104.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster2.hadoop104</name>
<value>hadoop-kf104.jd.com:50070</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.cluster2</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster2</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/local/hadoop/tmp/journal</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
(4)hdfs-site.xml(cluster1). The only difference between the two clusters is the value of dfs.namenode.shared.edits.dir:
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value> <!-- replication factor -->
</property>
<property>
<name>dfs.nameservices</name>
<value>cluster1,cluster2</value> <!-- the two HDFS nameservices (clusters) -->
</property>
<property>
<name>dfs.ha.namenodes.cluster1</name>
<value>hadoop100,hadoop101</value> <!-- NameNodes of cluster1 -->
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.hadoop100</name>
<value>hadoop-kf100.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.hadoop100</name>
<value>hadoop-kf100.jd.com:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster1.hadoop101</name>
<value>hadoop-kf101.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster1.hadoop101</name>
<value>hadoop-kf101.jd.com:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://hadoop-kf100.jd.com:8485;hadoop-kf101.jd.com:8485;hadoop-kf103.jd.com:8485/cluster1</value>
<description>The NameNodes of cluster1 share their edits directory through this JournalNode cluster</description>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.cluster1</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster1</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.namenodes.cluster2</name>
<value>hadoop103,hadoop104</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster2.hadoop103</name>
<value>hadoop-kf103.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster2.hadoop103</name>
<value>hadoop-kf103.jd.com:50070</value>
</property>
<property>
<name>dfs.namenode.rpc-address.cluster2.hadoop104</name>
<value>hadoop-kf104.jd.com:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.cluster2.hadoop104</name>
<value>hadoop-kf104.jd.com:50070</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled.cluster2</name>
<value>true</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.cluster2</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/usr/local/hadoop/tmp/journal</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.permissions</name>
<value>false</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
</configuration>
(5)mapred-site.xml:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
(6)slaves:
hadoop-kf100.jd.com
hadoop-kf101.jd.com
hadoop-kf102.jd.com
hadoop-kf103.jd.com
hadoop-kf104.jd.com
hadoop-kf105.jd.com
(7)yarn-site.xml:
<configuration>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop-kf100.jd.com</value>
</property>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/usr/local/hadoop-2.2.0/etc/hadoop/fair-scheduler.xml</value>
</property>
</configuration>
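yarn-site.xml above references a fair-scheduler.xml allocation file that is not listed in this post; a minimal sketch of such a file, assuming a single default queue (the queue name and weight are illustrative):
<?xml version="1.0"?>
<allocations>
  <queue name="default">
    <weight>1.0</weight>
  </queue>
</allocations>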
At this point the configuration files are done; apart from the slight difference in hdfs-site.xml, every node uses the same files.
8. Now start the cluster:
The startup steps can follow the link above, but it is best to type each command yourself rather than copy-pasting, which easily goes wrong. A sketch of the typical sequence follows.
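The sketch below assumes the role assignment from step 5; the clusterId value is illustrative, and the exact sequence in the linked tutorial takes precedence:
# on hadoop-kf100/101/103: start ZooKeeper
zkServer.sh start
# on hadoop-kf100/101/103: start the JournalNodes
sbin/hadoop-daemon.sh start journalnode
# on the first NameNode of each cluster: format with a shared clusterId (so the
# DataNodes can register with both nameservices), then start it
bin/hdfs namenode -format -clusterId hadoop-cluster    # clusterId is an example value
sbin/hadoop-daemon.sh start namenode
# on the standby NameNode of each cluster: copy the metadata and start it
bin/hdfs namenode -bootstrapStandby
sbin/hadoop-daemon.sh start namenode
# once per cluster, on one of its NameNodes: initialize the failover znode in ZooKeeper
bin/hdfs zkfc -formatZK
# on the master: start the remaining daemons (DataNodes, ZKFCs, YARN)
sbin/start-dfs.sh
sbin/start-yarn.sh
# check which NameNode of each cluster is active
bin/hdfs haadmin -ns cluster1 -getServiceState hadoop100
bin/hdfs haadmin -ns cluster2 -getServiceState hadoop103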
9. A few problems that came up during installation:
(1)It is best to run at least one JournalNode in each HDFS cluster, otherwise the NameNode fails on startup complaining that it has not been formatted; the same applies to ZooKeeper.
(2)Watch user permissions carefully throughout; careless handling of them is an easy source of problems.
(3)Each HDFS cluster supports at most two NameNodes; configuring three or more causes errors.
(4)After installation, run the classic WordCount example and make sure it completes without errors (see the sketch after this list).
(5)Finally, open http://hadoop-kf100.jd.com:50070/dfshealth.jsp and http://hadoop-kf100.jd.com:8088 in a browser; if both pages load, the installation is fine.
(6)If connections to port 10020 fail, the JobHistory process is not running; start it with: sh mr-jobhistory-daemon.sh start historyserver
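For item (4), a sketch of running the bundled WordCount example (the input and output paths are illustrative):
hadoop fs -mkdir -p /tmp/wc-in                     # example input directory
hadoop fs -put $HADOOP_HOME/etc/hadoop/*.xml /tmp/wc-in
hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.2.0.jar \
    wordcount /tmp/wc-in /tmp/wc-out
hadoop fs -cat /tmp/wc-out/part-r-00000 | head     # inspect a few counted words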