Hadoop YARN HA Cluster Setup (Setup Guide)

1. Prerequisites

I learned this piece by piece: after getting the basics of HDFS down and getting HDFS HA working, I started setting up YARN HA. Before building, I suggest first reading the reposted theory article; once you understand the principles, the setup goes quickly. To stress it once more: this post assumes HDFS is already set up and working.

2. Environment Preparation

  • 1. Host environment: four CentOS machines.

    rtest-mysql-01: primary NN and ResourceManager. Running processes: NameNode, ResourceManager, DFSZKFailoverController.

      Mainly runs the NameNode and the ResourceManager.

    rtest-mysql-02: standby NN and ResourceManager. Running processes: NameNode, DFSZKFailoverController, DataNode, ResourceManager, JournalNode, NodeManager, ZooKeeper.

      Standby NN and ResourceManager; together with 03 and 04 it also hosts the ZooKeeper and JournalNode clusters.

    rtest-mysql-03: running processes: DataNode, JournalNode, NodeManager, ZooKeeper.

      Data node.

    rtest-mysql-04: running processes: DataNode, JournalNode, NodeManager, ZooKeeper.

      Data node.

  • 2. The hosts in detail

    In Hadoop 2.x, HDFS HA usually consists of two NameNodes: one in active state and one in standby state. The active NameNode serves client requests, while the standby NameNode does not; it only synchronizes the active NameNode's state so that it can take over quickly if the active one fails. Hadoop 2.0 officially provides two HDFS HA solutions: one based on NFS and one based on QJM (proposed by Cloudera; its principle is similar to ZooKeeper's). I use QJM here. The active and standby NameNodes synchronize metadata through a group of JournalNodes; an edit is considered successfully written once it reaches a majority of the JournalNodes. An odd number of JournalNodes is usually configured.
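The majority-write rule can be illustrated with a tiny shell sketch; the numbers below are illustrative, not taken from a real cluster:

```shell
# Quorum rule used by QJM: an edit is durable once a strict majority
# of JournalNodes acknowledge it. With 3 JNs the majority is 2, so
# the cluster tolerates the loss of 1 JournalNode.
jn_total=3                        # number of JournalNodes (odd)
required=$(( jn_total / 2 + 1 ))  # strict majority
acks=2                            # hypothetical number of acks received
if [ "$acks" -ge "$required" ]; then
    echo "edit committed (acks=$acks, required=$required)"
else
    echo "edit lost (acks=$acks, required=$required)"
fi
```

The same arithmetic explains why an odd count is preferred: 4 JournalNodes still require 3 acks, so they tolerate no more failures than 3 do.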

3. Setup Process

1. The prerequisite, stressed once more

HDFS HA must already be working, because YARN HA also relies on ZooKeeper-based automatic failover (the same ZooKeeper quorum used by ZKFC). If that failover setup is not working, YARN failover will not succeed either.

2.mapred-site.xml

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn</value>
                <description>The runtime framework for executing MapReduce jobs.
                        Can be one of local, classic or yarn.
                </description>
        </property>

<!-- jobhistory properties -->
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>0.0.0.0:10020</value>
                <description>MapReduce JobHistory Server IPC host:port</description>
        </property>

        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>0.0.0.0:19888</value>
                <description>MapReduce JobHistory Server Web UI host:port</description>
        </property>

</configuration>

3. yarn-site.xml (the main configuration file)

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>

<!-- Site specific YARN configuration properties -->
<!-- Resource Manager Configs -->

<!-- Retry interval (ms) for reconnecting after losing contact with the RM -->
    <property>
        <name>yarn.resourcemanager.connect.retry-interval.ms</name>
        <value>2000</value>
    </property>

<!-- Enable ResourceManager HA; default is false -->
    <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
    </property>

<!-- Logical IDs of the ResourceManagers -->
    <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
    </property>

    <property>
        <name>ha.zookeeper.quorum</name>
        <value>rtest-mysql-02:2181,rtest-mysql-03:2181,rtest-mysql-04:2181</value>
    </property>

<!-- Enable automatic failover -->
    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.enabled</name>
        <value>true</value>
    </property>
    
   <property>
        <description>The hostname of the RM.</description>
        <name>yarn.resourcemanager.hostname</name>
        <value>rtest-mysql-01</value>
   </property> 
  
    <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>rtest-mysql-01</value>
    </property>

    <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>rtest-mysql-02</value>
    </property>
    
    <!--
    Note: it is common to copy the finished config file to the other machines,
    but this property MUST be changed on the second YARN RM machine.
    -->
    <property>
        <name>yarn.resourcemanager.ha.id</name>
        <value>rm1</value>
        <description>If we want to launch more than one RM in single node,we need this configuration</description>
    </property>
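Since yarn.resourcemanager.ha.id differs per host, a common approach after copying yarn-site.xml to the second RM machine is to flip the value with sed. The file below is a minimal stand-in created only for this demonstration; on a real cluster you would edit the copied etc/hadoop/yarn-site.xml instead:

```shell
# Create a minimal stand-in fragment containing the property (demo only).
cat > /tmp/yarn-ha-id-demo.xml <<'EOF'
<property>
    <name>yarn.resourcemanager.ha.id</name>
    <value>rm1</value>
</property>
EOF
# On the rm2 host, change the id from rm1 to rm2:
sed -i 's|<value>rm1</value>|<value>rm2</value>|' /tmp/yarn-ha-id-demo.xml
grep '<value>' /tmp/yarn-ha-id-demo.xml
```

If you forget this step, both ResourceManagers believe they are the same logical RM, and the second one fails to start or binds the wrong addresses.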

<!-- Enable RM state recovery after restart/failover -->
    <property>
        <name>yarn.resourcemanager.recovery.enabled</name>
        <value>true</value>
    </property>

<!-- ZooKeeper connection addresses -->
    <property>
        <name>yarn.resourcemanager.zk-state-store.address</name>
        <value>rtest-mysql-02:2181,rtest-mysql-03:2181,rtest-mysql-04:2181</value>
    </property>

    <property>
        <name>yarn.resourcemanager.store.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore</value>
    </property>

    <property>
        <name>yarn.resourcemanager.zk-address</name>
        <value>rtest-mysql-02:2181,rtest-mysql-03:2181,rtest-mysql-04:2181</value>
    </property>

    <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>yarn-cluster</value>
    </property>
    
    <!-- How long (ms) the AM waits between attempts to reconnect to the scheduler -->
    <property>
        <name>yarn.app.mapreduce.am.scheduler.connection.wait.interval-ms</name>
        <value>5000</value>
    </property>
   
    <!-- Note: it is common to copy the finished config file to the other machines, but this property must be changed on the second YARN RM machine -->
   <property>
        <description>The address of the applications manager interface in the RM.</description>
        <name>yarn.resourcemanager.address.rm1</name>
        <value>rtest-mysql-01:8032</value>
   </property>

   <property>
        <description>The address of the scheduler interface.</description>
        <name>yarn.resourcemanager.scheduler.address.rm1</name>
        <value>rtest-mysql-01:8030</value>
   </property>

   <property>
        <description>The http address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.address.rm1</name>
        <value>rtest-mysql-01:8088</value>
   </property>

   <property>
<description>The https address of the RM web application.</description>
        <name>yarn.resourcemanager.webapp.https.address.rm1</name>
        <value>rtest-mysql-01:8090</value>
   </property>

   <property>
        <name>yarn.resourcemanager.resource-tracker.address.rm1</name>
        <value>rtest-mysql-01:8031</value>
   </property>

   <property>
        <description>The address of the RM admin interface.</description>
        <name>yarn.resourcemanager.admin.address.rm1</name>
        <value>rtest-mysql-01:8033</value>
   </property>
  
    <property>
        <name>yarn.resourcemanager.ha.admin.address.rm1</name>
        <value>rtest-mysql-01:23142</value>
    </property>

    <!--*******************************************************-->
    
    <property>
        <description>the valid service name should only contain a-zA-Z0-9_ and can not start with numbers</description>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
   </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    
   <property>
        <description>The class to use as the resource scheduler.</description>
        <name>yarn.resourcemanager.scheduler.class</name>
        <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
   </property>
  
    <property>
        <name>yarn.nodemanager.log-dirs</name>
        <value>/home/biedong/hadoop/yarn/log</value>
    </property>

    <property>
        <name>mapreduce.shuffle.port</name>
        <value>23080</value>
    </property>

   <property>
        <description>fair-scheduler conf location</description>
        <name>yarn.scheduler.fair.allocation.file</name>
        <value>/home/biedong/hadoop-2.7.0/etc/hadoop/fairscheduler.xml</value>
   </property>
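The allocation file referenced above is never shown in this post. A minimal fairscheduler.xml might look like the following; the queue name and resource limits are illustrative, not values from the author's cluster:

```xml
<?xml version="1.0"?>
<allocations>
    <!-- One example queue; adjust names and limits to your cluster. -->
    <queue name="default">
        <minResources>1024 mb, 1 vcores</minResources>
        <maxResources>8192 mb, 4 vcores</maxResources>
        <maxRunningApps>10</maxRunningApps>
        <weight>1.0</weight>
        <schedulingPolicy>fair</schedulingPolicy>
    </queue>
</allocations>
```

If this file is missing or unreadable at the configured path, the FairScheduler falls back to default queue behavior, so it is worth verifying the path on every RM host.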

   <property>
        <description>List of directories to store localized files in. An 
              application's localized file directory will be found in:
              ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}.
              Individual containers' work directories, called container_${contid}, will
              be subdirectories of this.
         </description>
        <name>yarn.nodemanager.local-dirs</name>
        <value>/home/biedong/hadoop/yarn/local</value>
   </property>

   <property>
        <description>Whether to enable log aggregation</description>
        <name>yarn.log-aggregation-enable</name>
        <value>true</value>
   </property>

   <property>
        <description>Where to aggregate logs to.</description>
        <name>yarn.nodemanager.remote-app-log-dir</name>
        <value>/tmp/logs</value>
   </property>

   <property>
        <description>Amount of physical memory, in MB, that can be allocated for containers.</description>
        <name>yarn.nodemanager.resource.memory-mb</name>
        <value>8192</value>
   </property>

   <property>
        <description>Number of CPU cores that can be allocated for containers.</description>
        <name>yarn.nodemanager.resource.cpu-vcores</name>
        <value>4</value>
   </property>
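As a rough sanity check, the two values above bound how many containers each NodeManager can host at once. The 1024 MB minimum allocation below is a hypothetical yarn.scheduler.minimum-allocation-mb, not a value configured in this post:

```shell
# With 8192 MB and 4 vcores per NM, and minimum-size containers of
# 1024 MB / 1 vcore, the NM is limited by whichever resource runs out first.
node_mb=8192; node_vcores=4
min_mb=1024;  min_vcores=1
by_mem=$(( node_mb / min_mb ))          # containers allowed by memory
by_cpu=$(( node_vcores / min_vcores ))  # containers allowed by vcores
if [ "$by_mem" -lt "$by_cpu" ]; then
    max_containers=$by_mem
else
    max_containers=$by_cpu
fi
echo "at most $max_containers minimum-size containers per NodeManager"
```

This is not YARN's exact scheduling formula, only the back-of-the-envelope bound; it shows that on this hardware the 4 vcores, not the 8 GB of memory, are the limiting resource.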
  
  <!-- Failover proxy provider class used by clients to find the active RM -->
    <property>
        <name>yarn.client.failover-proxy-provider</name>
        <value>org.apache.hadoop.yarn.client.ConfiguredRMFailoverProxyProvider</value>
    </property>

    <property>
        <name>yarn.resourcemanager.ha.automatic-failover.zk-base-path</name>
        <value>/yarn-leader-election</value>
        <description>Optional setting. The default value is /yarn-leader-election</description>
    </property>

</configuration>

4. Start YARN

You can start the ResourceManager and NodeManager separately with:
  sbin/yarn-daemon.sh start resourcemanager
  sbin/yarn-daemon.sh start nodemanager (with multiple DataNodes, use yarn-daemons.sh)
  Or start everything at once: sbin/start-yarn.sh

5. Verify the Setup

Run the following commands to check which RM is active and which is standby:

  yarn rmadmin -getServiceState rm1

  yarn rmadmin -getServiceState rm2

Then open http://rtest-mysql-02:8088 in a browser and check that the page loads.

6. Verify Automatic Failover

Accessing the web UI of the standby ResourceManager redirects with a notice like:
This is standby RM. Redirecting to the current active RM: http://rtest-mysql-01:8088/cluster/apps
Now kill the active ResourceManager process.

Then verify the active/standby roles again with the same yarn rmadmin commands; the surviving RM should now report itself as active.

7. Run a Sample Application

Run the following: bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.0.jar pi 20 10

If the job runs without obvious errors, the installation works.

posted @ 2016-08-18 18:47  楚时邀月