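ResourceManager HA Configuration
Like the HDFS NameNode, the YARN ResourceManager is a single point of failure: what happens if it goes down? We need to configure ResourceManager high availability (when one instance dies, another takes over the work). As with the NameNode, this can be implemented with ZooKeeper's master election mechanism.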
- 1. Make sure the ZooKeeper service is healthy: start ZooKeeper on each of master, slave1 and slave2:
zkServer.sh start
- 2. Stop YARN:
stop-yarn.sh
- 3. Back up yarn-site.xml:
cp yarn-site.xml yarn-site.xml_bak
- 4. Add the following configuration to yarn-site.xml (comment out the original yarn.resourcemanager.hostname and yarn.resourcemanager.address properties):
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>cluster1</value>
<description>Unique identifier of the cluster</description>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
<description>Unique identifiers of the two RMs</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>master</value>
<description>Hostname of the machine running the first RM</description>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>slave1</value>
<description>Hostname of the machine running the second RM</description>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>master:8088</value>
<description>Web UI address (host:port) of the first RM</description>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>slave1:8088</value>
<description>Web UI address (host:port) of the second RM</description>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>master:2181,slave1:2181,slave2:2181</value>
<description>Hostnames and ports of the ZooKeeper servers</description>
</property>
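Before copying the file to the other nodes, it can help to confirm that every id listed in yarn.resourcemanager.ha.rm-ids has a matching hostname property. The following is a minimal sketch; the helper names get_prop and check_rm_ha are hypothetical, and the parsing is line-based (it assumes each <name> and <value> tag sits on its own line, as in the fragment above, and it does not skip commented-out properties).

```shell
#!/bin/sh
# get_prop FILE NAME: print the <value> that follows <name>NAME</name>.
# Line-based sketch, not a real XML parser.
get_prop() {
  awk -v want="$2" '
    /<name>/ { n = $0; gsub(/.*<name>|<\/name>.*/, "", n); hit = (n == want) }
    /<value>/ && hit { v = $0; gsub(/.*<value>|<\/value>.*/, "", v); print v; exit }
  ' "$1"
}

# check_rm_ha FILE: print "<id> -> <host>" for each configured RM id and
# return non-zero if rm-ids is unset or any id has no hostname property.
check_rm_ha() {
  ha_file=$1
  ha_rc=0
  ha_ids=$(get_prop "$ha_file" yarn.resourcemanager.ha.rm-ids)
  if [ -z "$ha_ids" ]; then
    echo "yarn.resourcemanager.ha.rm-ids is not set" >&2
    return 1
  fi
  for ha_id in $(echo "$ha_ids" | tr ',' ' '); do
    ha_host=$(get_prop "$ha_file" "yarn.resourcemanager.hostname.$ha_id")
    if [ -n "$ha_host" ]; then
      echo "$ha_id -> $ha_host"
    else
      echo "missing yarn.resourcemanager.hostname.$ha_id" >&2
      ha_rc=1
    fi
  done
  return $ha_rc
}
```

Usage: after sourcing the functions, run check_rm_ha yarn-site.xml in the Hadoop configuration directory.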
- 5. Sync the file to the other nodes:
scp yarn-site.xml hadoop-twq@slave1:~/bigdata/hadoop-2.7.5/etc/hadoop/
scp yarn-site.xml hadoop-twq@slave2:~/bigdata/hadoop-2.7.5/etc/hadoop/
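The same copy is needed again when the configuration is restored later, so it may be convenient to wrap it in a small function. This is only a sketch: sync_conf and the DRY_RUN variable are hypothetical names, while the user hadoop-twq and the destination path are taken from the commands above.

```shell
#!/bin/sh
# sync_conf FILE HOST...: copy FILE into the Hadoop conf directory on each host.
# With DRY_RUN=1 the scp commands are printed instead of executed.
sync_conf() {
  conf_file=$1
  shift
  for conf_host in "$@"; do
    # "~" stays literal here and is expanded on the remote side by scp.
    conf_dest="hadoop-twq@$conf_host:~/bigdata/hadoop-2.7.5/etc/hadoop/"
    if [ "${DRY_RUN:-0}" = 1 ]; then
      echo "scp $conf_file $conf_dest"
    else
      scp "$conf_file" "$conf_dest" || return 1
    fi
  done
}
```

Usage: sync_conf yarn-site.xml slave1 slave2, or prefix it with DRY_RUN=1 to preview the commands first.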
- 6. Start YARN on master:
start-yarn.sh
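- 7. Start the second ResourceManager on slave1 (in Hadoop 2.x, start-yarn.sh only starts the ResourceManager on the node where it is run):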
~/bigdata/hadoop-2.7.5/sbin/yarn-daemon.sh start resourcemanager
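- 8. Check the state of a ResourceManager from the command line (rm1 is one of the ids configured in yarn.resourcemanager.ha.rm-ids; the command prints active or standby):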
yarn rmadmin -getServiceState rm1
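To see both ResourceManagers at once, the command can be run in a loop over the configured ids. A minimal sketch follows; rm_states is a hypothetical helper name, and an id is reported as "unreachable" whenever the yarn command exits with a non-zero status.

```shell
#!/bin/sh
# rm_states ID...: print "<id>: <state>" for each ResourceManager id.
rm_states() {
  for rm_id in "$@"; do
    rm_state=$(yarn rmadmin -getServiceState "$rm_id" 2>/dev/null) || rm_state=unreachable
    echo "$rm_id: $rm_state"
  done
}
```

Usage: rm_states rm1 rm2. In a healthy HA pair, one id should report active and the other standby.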
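- 9. Kill whichever ResourceManager is currently active (11177 is an example PID; use jps to find the actual one):
kill -9 11177
Then check whether the other ResourceManager has become active; this can also be seen in the Web UI.
- 10. Restart the ResourceManager that was killed: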
yarn-daemon.sh start resourcemanager
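The failover check in steps 9 and 10 can also be scripted by polling the surviving ResourceManager until it reports active. The sketch below is one possible way to do it; wait_for_active and its default of 10 tries with a 3-second delay are assumptions for illustration, not values from this tutorial.

```shell
#!/bin/sh
# wait_for_active ID [TRIES] [DELAY]: poll until ResourceManager ID reports
# "active". Returns 0 on success, 1 if it never becomes active.
wait_for_active() {
  wa_id=$1
  wa_tries=${2:-10}
  wa_delay=${3:-3}
  while [ "$wa_tries" -gt 0 ]; do
    if [ "$(yarn rmadmin -getServiceState "$wa_id" 2>/dev/null)" = active ]; then
      echo "$wa_id is active"
      return 0
    fi
    wa_tries=$((wa_tries - 1))
    sleep "$wa_delay"
  done
  echo "$wa_id did not become active" >&2
  return 1
}
```

Usage: after killing the active ResourceManager (say rm1), run wait_for_active rm2 to confirm that the standby took over.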
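- 11. The rest of this course does not need two ResourceManagers, so the following steps restore the original single-RM setup.
- 12. Stop YARN: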
stop-yarn.sh
- 13. Restore the original file from the backup (the HA version is kept as yarn-site.xml_ha):
cp yarn-site.xml yarn-site.xml_ha
mv yarn-site.xml_bak yarn-site.xml
- 14. Sync the restored file to the other nodes:
scp yarn-site.xml hadoop-twq@slave1:~/bigdata/hadoop-2.7.5/etc/hadoop/
scp yarn-site.xml hadoop-twq@slave2:~/bigdata/hadoop-2.7.5/etc/hadoop/
- 15. Run start-yarn.sh and check the processes with jps, then stop the ResourceManager on the other node where one was started:
yarn-daemon.sh stop resourcemanager