Hadoop Production Environment Setup (with HA and Federation)
Hadoop Production Environment Setup

1. Place the installation package hadoop-2.x.x.tar.gz in a directory of your choice and extract it.
2. Edit the configuration files under etc/hadoop in the extracted directory (create any file that does not exist yet), including hadoop-env.sh, mapred-site.xml, core-site.xml, hdfs-site.xml and yarn-site.xml.
3. Format and start HDFS.
4. Start YARN.

The overall procedure is essentially the same as setting up a single-node Hadoop test environment; what differs is the content of the configuration files in step 2 and the details of step 3.

HA Configuration for HDFS 2.0 (Active/Standby NameNode)

Notes:
1) There are several ways to configure an active/standby NameNode pair; here we use the JournalNode approach. Prepare at least three nodes to act as JournalNodes.
2) The active and standby NameNode should run on separate, dedicated machines. (In HDFS 2.0 there is no need to configure a SecondaryNameNode; the standby NameNode takes over that role.)
3) Failover between the active and standby NameNode can be manual or automatic. Automatic failover is implemented with ZooKeeper, so it requires a separately deployed ZooKeeper ensemble, usually an odd number of nodes and at least three.

==================================================================================

HDFS HA Deployment Architecture and Workflow

HDFS HA deployment architecture:
  three JournalNodes
  two NameNodes
  N DataNodes

HDFS HA deployment workflow — hdfs-site.xml settings:
  dfs.nameservices                        list of nameservices in the cluster (user-defined)
  dfs.ha.namenodes.${ns}                  logical names of the NameNodes within a nameservice (user-defined)
  dfs.namenode.rpc-address.${ns}.${nn}    RPC address for each logical NameNode name
  dfs.namenode.http-address.${ns}.${nn}   HTTP address for each logical NameNode name
  dfs.namenode.name.dir                   directory where the NameNode keeps its fsimage
  dfs.namenode.shared.edits.dir           shared storage through which the active and standby NameNode synchronize metadata
  dfs.journalnode.edits.dir               directory where each JournalNode stores its data

HDFS HA deployment workflow — hdfs-site.xml example:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-rokid</value>
  </property>
  <property>
    <name>dfs.ha.namenodes.hadoop-rokid</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-rokid.nn1</name>
    <value>nn1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-rokid.nn2</name>
    <value>nn2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-rokid.nn1</name>
    <value>nn1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-rokid.nn2</name>
    <value>nn2:50070</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/zhangzhenghai/cluster/hadoop/dfs/name</value>
  </property>
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://jnode1:8485;jnode2:8485;jnode3:8485/hadoop-rokid</value>
  </property>
  <!-- Required so that clients can resolve the logical URI hdfs://hadoop-rokid -->
  <property>
    <name>dfs.client.failover.proxy.provider.hadoop-rokid</name>
    <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/zhangzhenghai/cluster/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/zhangzhenghai/cluster/hadoop/dfs/journal</value>
  </property>
</configuration>

HDFS HA deployment workflow — core-site.xml example (with HA enabled, the default filesystem must point at the logical nameservice rather than a single NameNode, otherwise clients cannot follow a failover; fs.defaultFS replaces the deprecated fs.default.name):

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://hadoop-rokid</value>
  </property>
</configuration>

HDFS HA deployment workflow — slaves example: list the hostnames of all DataNode (worker) machines in the cluster, one per line.

Startup sequence:

Hadoop 2.x hands-on practice (multi-node deployment: HDFS HA + YARN)

HA
Note: all commands are run from the Hadoop installation directory.
Starting the Hadoop cluster:
step1: On each JournalNode machine, start the JournalNode service:
  sbin/hadoop-daemon.sh start journalnode
step2: On [nn1], format the NameNode and start it:
  bin/hdfs namenode -format
  sbin/hadoop-daemon.sh start namenode
step3: On [nn2], sync nn1's metadata:
  bin/hdfs namenode -bootstrapStandby
step4: Start [nn2]:
  sbin/hadoop-daemon.sh start namenode
After these four steps, nn1 and nn2 are both in standby state.
step5: Switch [nn1] to active:
  bin/hdfs haadmin -transitionToActive nn1
step6: On [nn1], start all DataNodes:
  sbin/hadoop-daemons.sh start datanode
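Before moving on, it is worth confirming that the pair actually behaves as an HA pair. The commands below are a minimal smoke test, assuming the logical names nn1/nn2 configured above; note that haadmin -failover additionally requires dfs.ha.fencing.methods to be configured in hdfs-site.xml, which this walkthrough does not set up:

  # Run from the Hadoop installation directory on either NameNode.

  # Confirm which NameNode is active and which is standby
  bin/hdfs haadmin -getServiceState nn1
  bin/hdfs haadmin -getServiceState nn2

  # Smoke-test the filesystem through the logical nameservice URI
  bin/hdfs dfs -mkdir -p /tmp/ha-test
  bin/hdfs dfs -ls /tmp

  # Optional: exercise a manual failover from nn1 to nn2
  bin/hdfs haadmin -failover nn1 nn2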
==================================================================================

Hadoop HA + Federation Deployment Architecture and Workflow

HDFS HA + Federation deployment architecture:
  three JournalNodes
  four NameNodes (two active/standby pairs)
  N DataNodes

HDFS HA + Federation deployment workflow — hdfs-site.xml example:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>hadoop-rokid1,hadoop-rokid2</value>
  </property>
  <!-- hadoop-rokid1 -->
  <property>
    <name>dfs.ha.namenodes.hadoop-rokid1</name>
    <value>nn1,nn2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-rokid1.nn1</name>
    <value>nn1:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-rokid1.nn2</name>
    <value>nn2:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-rokid1.nn1</name>
    <value>nn1:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-rokid1.nn2</name>
    <value>nn2:50070</value>
  </property>
  <!-- hadoop-rokid2 -->
  <property>
    <name>dfs.ha.namenodes.hadoop-rokid2</name>
    <value>nn3,nn4</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-rokid2.nn3</name>
    <value>nn3:8020</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.hadoop-rokid2.nn4</name>
    <value>nn4:8020</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-rokid2.nn3</name>
    <value>nn3:50070</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.hadoop-rokid2.nn4</name>
    <value>nn4:50070</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///home/zhangzhenghai/cluster/hadoop/dfs/name</value>
  </property>
  <!-- JournalNode URI for hadoop-rokid1: the two nameservices use different values,
       so on each NameNode keep only the entry for its own nameservice -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://jnode1:8485;jnode2:8485;jnode3:8485/hadoop-rokid1</value>
  </property>
  <!-- JournalNode URI for hadoop-rokid2: keep this entry instead on nn3/nn4 -->
  <property>
    <name>dfs.namenode.shared.edits.dir</name>
    <value>qjournal://jnode1:8485;jnode2:8485;jnode3:8485/hadoop-rokid2</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///home/zhangzhenghai/cluster/hadoop/dfs/data</value>
  </property>
  <property>
    <name>dfs.ha.automatic-failover.enabled</name>
    <value>false</value>
  </property>
  <property>
    <name>dfs.journalnode.edits.dir</name>
    <value>/home/zhangzhenghai/cluster/hadoop/dfs/journal</value>
  </property>
</configuration>

As in the single-nameservice case, clients also need dfs.client.failover.proxy.provider.<nameservice> set for each nameservice before the logical URIs hdfs://hadoop-rokid1 and hdfs://hadoop-rokid2 will resolve.

Startup sequence:

For the nn1/nn2 pair (nameservice hadoop-rokid1):
step1: On each JournalNode machine, start the JournalNode service:
  sbin/hadoop-daemon.sh start journalnode
step2: On [nn1], format the NameNode and start it. Note that every nameservice in a federation must share the same cluster ID, so pass one common value (e.g. hadoop-rokid) to both pairs, not a different ID per nameservice:
  bin/hdfs namenode -format -clusterId hadoop-rokid
  sbin/hadoop-daemon.sh start namenode
step3: On [nn2], sync nn1's metadata:
  bin/hdfs namenode -bootstrapStandby
step4: On [nn2], start the NameNode:
  sbin/hadoop-daemon.sh start namenode
(After these four steps, nn1 and nn2 are both in standby state.)
step5: On [nn1], switch the NameNode to active:
  bin/hdfs haadmin -ns hadoop-rokid1 -transitionToActive nn1

For the nn3/nn4 pair (nameservice hadoop-rokid2):
step1: On each JournalNode machine, start the JournalNode service:
  sbin/hadoop-daemon.sh start journalnode
step2: On [nn3], format the NameNode and start it (same shared cluster ID as above):
  bin/hdfs namenode -format -clusterId hadoop-rokid
  sbin/hadoop-daemon.sh start namenode
step3: On [nn4], sync nn3's metadata:
  bin/hdfs namenode -bootstrapStandby
step4: On [nn4], start the NameNode:
  sbin/hadoop-daemon.sh start namenode
(After these four steps, nn3 and nn4 are both in standby state.)
step5: On [nn3], switch the NameNode to active:
  bin/hdfs haadmin -ns hadoop-rokid2 -transitionToActive nn3

Finally, on [nn1], start all DataNodes:
  sbin/hadoop-daemons.sh start datanode
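Each HA pair can be verified the same way as before, scoped by nameservice, e.g. bin/hdfs haadmin -ns hadoop-rokid1 -getServiceState nn1. With federation, clients must also decide which namespace each path lives in. One common option, not covered above, is a client-side mount table (ViewFs) that presents both nameservices as a single virtual filesystem. A minimal core-site.xml sketch; the mount-table name ClusterView and the mount points /data1 and /data2 are illustrative choices, not values taken from this cluster:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <!-- Present one virtual namespace to clients -->
  <property>
    <name>fs.defaultFS</name>
    <value>viewfs://ClusterView</value>
  </property>
  <!-- Hypothetical mount points: /data1 routes to hadoop-rokid1, /data2 to hadoop-rokid2 -->
  <property>
    <name>fs.viewfs.mounttable.ClusterView.link./data1</name>
    <value>hdfs://hadoop-rokid1/data1</value>
  </property>
  <property>
    <name>fs.viewfs.mounttable.ClusterView.link./data2</name>
    <value>hdfs://hadoop-rokid2/data2</value>
  </property>
</configuration>

With this in place, bin/hdfs dfs -ls /data1 transparently goes to hadoop-rokid1 and /data2 to hadoop-rokid2.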
==================================================================================

YARN Deployment Architecture

  one ResourceManager
  N NodeManagers

yarn-site.xml example:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>YARN001</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>${yarn.resourcemanager.hostname}:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>${yarn.resourcemanager.hostname}:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>${yarn.resourcemanager.hostname}:8088</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.https.address</name>
    <value>${yarn.resourcemanager.hostname}:8090</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>${yarn.resourcemanager.hostname}:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>${yarn.resourcemanager.hostname}:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.class</name>
    <value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
  </property>
  <property>
    <name>yarn.scheduler.fair.allocation.file</name>
    <value>${yarn.home.dir}/etc/hadoop/fairscheduler.xml</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>/home/zhangzhenghai/cluster/hadoop/yarn/local</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/tmp/logs</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.memory-mb</name>
    <value>30720</value>
  </property>
  <property>
    <name>yarn.nodemanager.resource.cpu-vcores</name>
    <value>12</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
</configuration>

fairscheduler.xml example:

<?xml version="1.0" encoding="UTF-8"?>
<allocations>
  <queue name="basic">
    <minResources>102400 mb, 50 vcores</minResources>
    <maxResources>153600 mb, 100 vcores</maxResources>
    <maxRunningApps>200</maxRunningApps>
    <minSharePreemptionTimeout>300</minSharePreemptionTimeout>
    <weight>1.0</weight>
    <aclSubmitApps>root,yarn,search,hdfs</aclSubmitApps>
  </queue>
  <queue name="queue1">
    <minResources>102400 mb, 50 vcores</minResources>
    <maxResources>153600 mb, 100 vcores</maxResources>
  </queue>
  <queue name="queue2">
    <minResources>102400 mb, 50 vcores</minResources>
    <maxResources>153600 mb, 100 vcores</maxResources>
  </queue>
</allocations>

mapred-site.xml example:

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
    <description>The runtime framework for executing MapReduce jobs.
    Can be one of local, classic or yarn.</description>
  </property>
  <!-- jobhistory properties -->
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>jobhistory:10020</value>
    <description>MapReduce JobHistory Server IPC host:port</description>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>jobhistory:19888</value>
    <description>MapReduce JobHistory Server Web UI host:port</description>
  </property>
</configuration>

YARN start/stop steps

Run the following commands on YARN001.
Start YARN:
  sbin/start-yarn.sh
Stop YARN:
  sbin/stop-yarn.sh
Start the MapReduce JobHistory server:
  sbin/mr-jobhistory-daemon.sh start historyserver
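Once YARN and the JobHistory server are up, a quick end-to-end check is to confirm the NodeManagers have registered and then run one of the example jobs bundled with the distribution through a Fair Scheduler queue. A minimal sketch; the examples jar path and the choice of queue1 (defined in fairscheduler.xml above) are assumptions about this particular layout:

  # Run on YARN001 from the Hadoop installation directory.

  # List NodeManagers registered with the ResourceManager; each should report RUNNING
  bin/yarn node -list

  # Submit the bundled pi example (2 maps, 4 samples each) to Fair Scheduler queue "queue1"
  bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-*.jar pi \
      -Dmapreduce.job.queuename=queue1 2 4

  # The finished job should then appear in the JobHistory web UI at http://jobhistory:19888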