Installing a Highly Available Hadoop Ecosystem (Part 3): Installing Hadoop
3. Installing Hadoop
3.1. Extracting the package
※ Run on each of the three servers
tar -xf ~/install/hadoop-2.7.3.tar.gz -C /opt/cloud/packages
ln -s /opt/cloud/packages/hadoop-2.7.3 /opt/cloud/bin/hadoop
ln -s /opt/cloud/packages/hadoop-2.7.3/etc/hadoop /opt/cloud/etc/hadoop
mkdir -p /opt/cloud/hdfs/name
mkdir -p /opt/cloud/hdfs/data
mkdir -p /opt/cloud/hdfs/journal
mkdir -p /opt/cloud/hdfs/tmp/java
mkdir -p /opt/cloud/logs/hadoop/yarn
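The directory layout above can also be created in a loop. The following is a minimal sketch, not part of the original procedure; the function name make_hadoop_dirs and its optional root argument are assumptions introduced here so the layout can be tried out under a scratch directory first.

```shell
# Create the HDFS and log directories used in this guide under a given root.
# The root defaults to /opt/cloud; pass another path for a dry run.
# make_hadoop_dirs is a hypothetical helper, not a Hadoop tool.
make_hadoop_dirs() {
    root="${1:-/opt/cloud}"
    for d in hdfs/name hdfs/data hdfs/journal hdfs/tmp/java logs/hadoop/yarn; do
        # mkdir -p is idempotent, so re-running the function is harmless
        mkdir -p "$root/$d" || return 1
    done
}
```

For example, make_hadoop_dirs /tmp/cloud-test creates the same tree under /tmp/cloud-test without touching /opt/cloud.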
3.2. Setting environment variables
Set the Java and Hadoop environment variables:
vi ~/.bashrc
Add:
export HADOOP_HOME=/opt/cloud/bin/hadoop
export HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp
export YARN_PID_DIR=/opt/cloud/hdfs/tmp
export HADOOP_OPTS="-Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java"
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Apply the changes to the current shell:
source ~/.bashrc
Copy the file to the other two servers:
scp ~/.bashrc hadoop2:/home/hadoop
scp ~/.bashrc hadoop3:/home/hadoop
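After sourcing ~/.bashrc on each server, it can be useful to confirm that none of the variables is missing. The helper below is only a sketch; missing_hadoop_vars is a hypothetical name, and the list of variables is taken from the exports in this section.

```shell
# Print the name of every Hadoop variable from this section that is unset or empty.
# missing_hadoop_vars is a hypothetical helper; it prints nothing when all are set.
missing_hadoop_vars() {
    for name in HADOOP_HOME HADOOP_CONF_DIR HADOOP_LOG_DIR HADOOP_PID_DIR YARN_PID_DIR; do
        # Indirect lookup: read the value of the variable whose name is in $name
        eval "value=\${$name:-}"
        [ -n "$value" ] || printf '%s\n' "$name"
    done
}
```

Running missing_hadoop_vars in a fresh login shell on hadoop2 or hadoop3 should print nothing if the copied ~/.bashrc was picked up.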
3.3. Modifying Hadoop parameters
cd ${HADOOP_HOME}/etc/hadoop
Edit log4j.properties, hadoop-env.sh, yarn-env.sh, slaves, core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml, then distribute them to the same directory on hadoop2 and hadoop3.
3.3.1. Editing the logging configuration file log4j.properties
hadoop.root.logger=INFO,DRFA
hadoop.log.dir=/opt/cloud/logs/hadoop
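3.3.2. Editing hadoop-env.sh
hadoop-env.sh sets several Hadoop environment variables. Up to and including version 2.7.3, however, a bug prevents it from picking up the correct values from the system environment, so they have to be set by hand. Near the top of the file, find
export JAVA_HOME=${JAVA_HOME}
Comment it out and set the value explicitly:
export JAVA_HOME="/usr/lib/jvm/java"
Find #export HADOOP_LOG_DIR in the file and add below it:
export HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Find export HADOOP_PID_DIR=${HADOOP_PID_DIR} in the file and change it to:
export HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
To set Java's temporary directory, find
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true "
and change it to
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true -Djava.io.tmpdir=/opt/cloud/hdfs/tmp/java"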
3.3.3. Editing yarn-env.sh
Find the comment "default log directory" and add the following line after it:
export YARN_LOG_DIR=/opt/cloud/logs/hadoop/yarn
3.3.4. Editing slaves
vi slaves
Contents:
Remove: localhost
Add:
hadoop2
hadoop3
3.3.5. Editing core-site.xml
vi core-site.xml
Contents:
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>
  <property>
    <name>ha.zookeeper.quorum</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/opt/cloud/hdfs/tmp</value>
  </property>
  <property>
    <name>io.file.buffer.size</name>
    <value>131072</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.groups</name>
    <value>hadoop</value>
  </property>
  <property>
    <name>hadoop.proxyuser.hadoop.hosts</name>
    <value>hadoop1,hadoop2,hadoop3,127.0.0.1,localhost</value>
  </property>
  <property>
    <name>ipc.client.rpc-timeout.ms</name>
    <value>4000</value>
  </property>
  <property>
    <name>ipc.client.connect.timeout</name>
    <value>4000</value>
  </property>
  <property>
    <name>ipc.client.connect.max.retries</name>
    <value>100</value>
  </property>
  <property>
    <name>ipc.client.connect.retry.interval</name>
    <value>10000</value>
  </property>
</configuration>
3.3.6. Editing hdfs-site.xml
vi hdfs-site.xml
Contents:
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>mycluster</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.mycluster</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>
    <value>hadoop1:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn1</name>
    <value>hadoop1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>
    <value>hadoop2:9000</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.mycluster.nn2</name>
    <value>hadoop2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://hadoop1:8485;hadoop2:8485;hadoop3:8485/mycluster</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/opt/cloud/hdfs/journal</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.failover.proxy.provider.mycluster</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.ha.fencing.methods</name>
    <value>
sshfence
shell(/bin/true)
    </value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.private-key-files</name>
    <value>/home/hadoop/.ssh/id_rsa</value>
  </property>
  <property>
    <name>dfs.ha.fencing.ssh.connect-timeout</name>
    <value>30000</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.name.dir</name>
    <value>/opt/cloud/hdfs/name</value>
  </property>
  <property>
    <name>dfs.data.dir</name>
    <value>/opt/cloud/hdfs/data</value>
  </property>
  <property>
    <name>dfs.permissions</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.support.append</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.webhdfs.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.enable</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.client.block.write.replace-datanode-on-failure.policy</name>
    <value>NEVER</value>
  </property>
  <property>
    <name>dfs.datanode.max.xcievers</name>
    <value>8192</value>
  </property>
</configuration>
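The value of dfs.namenode.shared.edits.dir follows a fixed pattern: the JournalNode hosts with port 8485, separated by semicolons, followed by the nameservice name. The sketch below builds that string and also computes how many JournalNode failures a quorum can survive, which for the Quorum Journal Manager is (N - 1) / 2. Both function names are hypothetical helpers introduced here for illustration.

```shell
# Build the qjournal:// URI for dfs.namenode.shared.edits.dir.
# Usage: qjournal_uri NAMESERVICE HOST...
# Port 8485 is the JournalNode port used in this guide.
qjournal_uri() {
    ns="$1"; shift
    list=""
    for host in "$@"; do
        # Append "host:8485", separated from earlier entries by a semicolon
        list="${list:+$list;}$host:8485"
    done
    printf 'qjournal://%s/%s\n' "$list" "$ns"
}

# Number of JournalNode failures that N nodes can tolerate: (N - 1) / 2.
jn_tolerated_failures() {
    echo $(( ($1 - 1) / 2 ))
}
```

With the three hosts of this cluster, qjournal_uri mycluster hadoop1 hadoop2 hadoop3 reproduces the value configured above, and jn_tolerated_failures 3 shows that one JournalNode may be down.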
3.3.7. Editing mapred-site.xml
mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml
Contents:
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>0.0.0.0:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>0.0.0.0:19888</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.resource.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.command-opts</name>
    <value>-Xmx800m</value>
  </property>
  <property>
    <name>mapreduce.map.memory.mb</name>
    <value>512</value>
  </property>
  <property>
    <name>mapreduce.map.java.opts</name>
    <value>-Xmx400m</value>
  </property>
  <property>
    <name>mapreduce.reduce.memory.mb</name>
    <value>1024</value>
  </property>
  <property>
    <name>mapreduce.reduce.java.opts</name>
    <value>-Xmx800m</value>
  </property>
</configuration>
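In the configuration above, every -Xmx value is smaller than the matching container size (800 MB in a 1024 MB container, 400 MB in a 512 MB container), which leaves room for non-heap JVM memory inside the container. The following sketch makes that relationship explicit; heap_percent and heap_fits are hypothetical helper names, and the numbers are the ones from this file.

```shell
# Print the JVM heap (-Xmx, in MB) as an integer percentage of the container size (in MB).
# Usage: heap_percent CONTAINER_MB HEAP_MB
heap_percent() {
    echo $(( $2 * 100 / $1 ))
}

# Succeed only if the heap is strictly smaller than the container.
# Usage: heap_fits CONTAINER_MB HEAP_MB
heap_fits() {
    [ "$2" -lt "$1" ]
}
```

For the map settings, heap_percent 512 400 gives 78, and the reduce and ApplicationMaster settings give the same ratio.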
3.3.8. Editing yarn-site.xml (non-HA version)
vi yarn-site.xml
Contents:
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
</configuration>
3.3.9. Editing yarn-site.xml (HA version)
vi yarn-site.xml
Contents:
<configuration>
  <property>
    <name>yarn.resourcemanager.ha.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.cluster-id</name>
    <value>clusteryarn</value>
  </property>
  <property>
    <name>yarn.resourcemanager.ha.rm-ids</name>
    <value>rm1,rm2</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm1</name>
    <value>hadoop1</value>
  </property>
  <property>
    <name>yarn.resourcemanager.hostname.rm2</name>
    <value>hadoop2</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.resourcemanager.zk-address</name>
    <value>hadoop1:2181,hadoop2:2181,hadoop3:2181</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.connect.retry-interval.ms</name>
    <value>5000</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>3072</value>
  </property>
  <property>
    <name>yarn.nodemanager.vmem-pmem-ratio</name>
    <value>4</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>2</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-mb</name>
    <value>512</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-mb</name>
    <value>2048</value>
  </property>
  <property>
    <name>yarn.scheduler.minimum-allocation-vcores</name>
    <value>1</value>
  </property>
  <property>
    <name>yarn.scheduler.maximum-allocation-vcores</name>
    <value>2</value>
  </property>
</configuration>
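The memory settings above bound how many containers one NodeManager can host. As a rough sketch, assuming memory is the only limiting resource (vcores and scheduler rounding are deliberately ignored here), the upper bound is the node memory divided by the container size. containers_by_memory is a hypothetical helper name.

```shell
# Upper bound on concurrent containers per NodeManager, by memory alone.
# Usage: containers_by_memory NODE_MB CONTAINER_MB
# Assumption: vcores and any scheduler-side rounding are not considered.
containers_by_memory() {
    echo $(( $1 / $2 ))
}
```

With yarn.nodemanager.resource.memory-mb set to 3072, this gives at most 6 map containers of 512 MB or 3 reduce containers of 1024 MB per node.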
3.3.10. Copying to the other two servers
Copy the configuration files with scp:
scp /opt/cloud/bin/hadoop/etc/hadoop/* hadoop2:/opt/cloud/bin/hadoop/etc/hadoop/
scp /opt/cloud/bin/hadoop/etc/hadoop/* hadoop3:/opt/cloud/bin/hadoop/etc/hadoop/
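3.4. Starting HDFS for the first time
- Start the JournalNode cluster:
cexec 'hadoop-daemon.sh start journalnode'
Note that the JournalNodes need to be started this way only the first time; afterwards, starting HDFS starts them as well.
- Format the first NameNode:
ssh hadoop1 'hdfs namenode -format -clusterId mycluster'
Formatting succeeded if the following two lines appear near the end of the output: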
INFO common.Storage: Storage directory /opt/cloud/hdfs/name has been successfully formatted.
...
INFO util.ExitUtil: Exiting with status 0
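- Start the first NameNode:
ssh hadoop1 'hadoop-daemon.sh start namenode'
- Initialize the second NameNode as a standby (this copies the metadata from the first NameNode):
ssh hadoop2 'hdfs namenode -bootstrapStandby'
The operation succeeded if the following two lines appear near the end of the output: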
INFO common.Storage: Storage directory /opt/cloud/hdfs/name has been successfully formatted.
...
INFO util.ExitUtil: Exiting with status 0
- Start the second NameNode:
ssh hadoop2 'hadoop-daemon.sh start namenode'
- Format the HA state in ZooKeeper:
ssh hadoop1 'hdfs zkfc -formatZK'
The following message indicates that formatting succeeded:
INFO ha.ActiveStandbyElector: Successfully created /hadoop-ha/mycluster in ZK.
- Start the two ZKFC daemons:
ssh hadoop1 'hadoop-daemon.sh start zkfc'
ssh hadoop2 'hadoop-daemon.sh start zkfc'
- Start all DataNodes:
ssh hadoop1 'hadoop-daemons.sh start datanode'
Open http://hadoop1:50070 and http://hadoop2:50070 in a browser to check the status.
One NameNode should be active and the other standby; on the active NameNode's page, the three QJM servers should show the same Written txid.
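The active/standby check can also be done from the command line instead of the browser. The sketch below assumes the NameNode IDs nn1 and nn2 from hdfs-site.xml and relies on hdfs haadmin -getServiceState, which prints the state of one NameNode; check_ha_pair is a hypothetical helper that only compares the two reported states.

```shell
# Succeed only if exactly one of the two reported states is "active"
# and the other is "standby". check_ha_pair is a hypothetical helper.
# Usage: check_ha_pair STATE_OF_NN1 STATE_OF_NN2
check_ha_pair() {
    case "$1/$2" in
        active/standby|standby/active)
            return 0 ;;
        *)
            echo "unexpected NameNode states: $1, $2" >&2
            return 1 ;;
    esac
}

# Example on a live cluster (assumes the IDs nn1 and nn2 configured above):
#   check_ha_pair "$(hdfs haadmin -getServiceState nn1)" \
#                 "$(hdfs haadmin -getServiceState nn2)"
```

A non-zero exit status means both NameNodes report the same state or one of them could not be queried, which is worth investigating before starting YARN.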
3.5. Regular startup of HDFS and YARN
Run on hadoop1:
start-dfs.sh
start-yarn.sh
Start the second ResourceManager on hadoop2:
ssh hadoop2 'yarn-daemon.sh start resourcemanager'
Check the processes with jps:
[hadoop@hadoop1 ~]$ cexec jps
************************* cloud *************************
--------- hadoop1---------
1223 QuorumPeerMain
3757 DFSZKFailoverController
4787 Jps
3872 ResourceManager
3365 NameNode
3578 JournalNode
--------- hadoop2---------
1220 QuorumPeerMain
24240 NodeManager
24545 Jps
24022 JournalNode
24139 DFSZKFailoverController
23847 NameNode
23923 DataNode
24419 ResourceManager
--------- hadoop3---------
23764 Jps
23578 NodeManager
23471 JournalNode
23372 DataNode
1224 QuorumPeerMain
Open the following URLs in a browser to see the web monitoring interfaces:
http://hadoop1:50070/ HDFS web monitoring interface
http://hadoop2:50070/ HDFS web monitoring interface; of hadoop1 and hadoop2, one is active and the other is standby
http://hadoop1:8088 YARN web interface
http://hadoop2:8088 redirects automatically to http://hadoop1:8088
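3.6. Starting HDFS automatically at boot
CentOS 7 uses systemd as its service manager. It has several advantages, such as making dependencies between services easy to declare. However, every service starts with a freshly initialized environment ("systemd does not inherit any context"), so the unit file must set every environment variable the service needs, each with a line of the form Environment=name=value. The good news is that Environment may appear on multiple lines; the bad news is that Environment does not expand variables that were declared earlier, so value must not contain $name or ${name}.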
3.6.1. journalnode service
vi hadoop-journalnode.service
[Unit]
Description=hadoop journalnode service
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start journalnode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop journalnode'
[Install]
WantedBy=multi-user.target
3.6.2. namenode service
vi hadoop-namenode.service
[Unit]
Description=hadoop namenode service
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start namenode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop namenode'
[Install]
WantedBy=multi-user.target
3.6.3. datanode service
vi hadoop-datanode.service
[Unit]
Description=hadoop datanode service
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start datanode'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop datanode'
[Install]
WantedBy=multi-user.target
3.6.4. zkfc service
vi hadoop-zkfc.service
[Unit]
Description=hadoop zkfc service
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh start zkfc'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/hadoop-daemon.sh stop zkfc'
[Install]
WantedBy=multi-user.target
3.6.5. yarn resource manager service
vi yarn-rm.service
[Unit]
Description=yarn resource manager service
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh start resourcemanager'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh stop resourcemanager'
[Install]
WantedBy=multi-user.target
3.6.6. yarn nodemanager service
vi yarn-nm.service
[Unit]
Description=yarn node manager service
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh start nodemanager'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/yarn-daemon.sh stop nodemanager'
[Install]
WantedBy=multi-user.target
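The six unit files differ only in their description, the daemon script, and the daemon name, so they could also be generated from one template. The following is a sketch under that assumption; render_unit is a hypothetical helper, and all paths are the ones used in this guide, written out in full because Environment does not expand variables.

```shell
# Print a systemd unit for one Hadoop daemon to standard output.
# Usage: render_unit DESCRIPTION SCRIPT DAEMON
#   SCRIPT is hadoop-daemon.sh or yarn-daemon.sh; DAEMON is e.g. journalnode.
# render_unit is a hypothetical helper, not part of Hadoop or systemd.
render_unit() {
    desc="$1"; script="$2"; daemon="$3"
    cat <<EOF
[Unit]
Description=$desc
After=network.target
[Service]
Type=forking
User=hadoop
Group=hadoop
Environment=JAVA_HOME=/usr/lib/jvm/java
Environment=JRE_HOME=/usr/lib/jvm/java/jre
Environment=CLASSPATH=.:/usr/lib/jvm/java/jre/lib/rt.jar:/usr/lib/jvm/java/lib/dt.jar:/usr/lib/jvm/java/lib/tools.jar
Environment=HADOOP_HOME=/opt/cloud/bin/hadoop
Environment=HADOOP_CONF_DIR=/opt/cloud/etc/hadoop
Environment=HADOOP_LOG_DIR=/opt/cloud/logs/hadoop
Environment=HADOOP_PID_DIR=/opt/cloud/hdfs/tmp/
ExecStart=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/$script start $daemon'
ExecStop=/usr/bin/sh -c '/opt/cloud/bin/hadoop/sbin/$script stop $daemon'
[Install]
WantedBy=multi-user.target
EOF
}
```

For example, render_unit "hadoop zkfc service" hadoop-daemon.sh zkfc > hadoop-zkfc.service writes the zkfc unit shown in 3.6.4.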
3.6.7. Testing the services and enabling them at boot
Write the unit files for the six services and copy each one to /etc/systemd/system on the servers that run that service.
hadoop2 (all 6 services)
systemctl start hadoop-journalnode
systemctl start hadoop-namenode
systemctl start hadoop-datanode
systemctl start hadoop-zkfc
systemctl start yarn-rm
systemctl start yarn-nm
Once the tests pass:
systemctl enable hadoop-journalnode
systemctl enable hadoop-namenode
systemctl enable hadoop-datanode
systemctl enable hadoop-zkfc
systemctl enable yarn-rm
systemctl enable yarn-nm
hadoop1 (4 services)
systemctl enable hadoop-journalnode
systemctl enable hadoop-namenode
systemctl enable hadoop-zkfc
systemctl enable yarn-rm
hadoop3 (3 services)
systemctl enable hadoop-journalnode
systemctl enable hadoop-datanode
systemctl enable yarn-nm
Reboot all three servers, then run cexec jps to check the state of the system.
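The assignment of services to hosts in this section can be captured in one small lookup, which avoids retyping the lists for enable and later for disable. This is only a sketch; services_for and systemctl_plan are hypothetical helpers, and the second one prints the systemctl commands instead of running them so the plan can be reviewed first.

```shell
# Print the systemd services that run on a given host of this cluster.
# services_for is a hypothetical helper; the lists mirror this section.
services_for() {
    case "$1" in
        hadoop1) echo "hadoop-journalnode hadoop-namenode hadoop-zkfc yarn-rm" ;;
        hadoop2) echo "hadoop-journalnode hadoop-namenode hadoop-datanode hadoop-zkfc yarn-rm yarn-nm" ;;
        hadoop3) echo "hadoop-journalnode hadoop-datanode yarn-nm" ;;
        *) echo "unknown host: $1" >&2; return 1 ;;
    esac
}

# Print (rather than run) the systemctl commands for one host.
# Usage: systemctl_plan ACTION HOST, e.g. systemctl_plan enable hadoop3
systemctl_plan() {
    action="$1"
    services=$(services_for "$2") || return 1
    for svc in $services; do
        echo "systemctl $action $svc"
    done
}
```

The printed lines can be pasted into a root shell on the matching host, or piped to sh there once they have been checked.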
3.7. Uninstalling
- Stop YARN, then stop HDFS:
ssh hadoop1 'stop-yarn.sh'
ssh hadoop2 'yarn-daemon.sh stop resourcemanager'
ssh hadoop1 'stop-dfs.sh'
cexec jps should no longer show any HDFS or YARN processes.
- Disable the system services:
hadoop2 (all 6 services)
systemctl disable hadoop-journalnode
systemctl disable hadoop-namenode
systemctl disable hadoop-datanode
systemctl disable hadoop-zkfc
systemctl disable yarn-rm
systemctl disable yarn-nm
hadoop1 (4 services)
systemctl disable hadoop-journalnode
systemctl disable hadoop-namenode
systemctl disable hadoop-zkfc
systemctl disable yarn-rm
hadoop3 (3 services)
systemctl disable hadoop-journalnode
systemctl disable hadoop-datanode
systemctl disable yarn-nm
- Delete the data directories:
rm -rf /opt/cloud/hdfs
rm -rf /opt/cloud/logs/hadoop
- Delete the program directories:
rm -rf /opt/cloud/bin/hadoop
rm -rf /opt/cloud/etc/hadoop
rm -rf /opt/cloud/packages/hadoop-2.7.3
- Restore the environment variables:
vi ~/.bashrc
Delete the Hadoop-related lines.