Hadoop 2.7.2 Cluster Deployment Steps

1. NameNode host information

|No.|IP|Hostname|OS|CPU|Memory|Disk|Environment & Services|Notes|
|:--|:--|:--|:--|:--|:--|:--|:--|:--|
|1|10.10.40.10|master1|CentOS 7.6 64-bit|24c|96G|1T|java1.8 + zookeeper + namenode| |
|2|10.10.40.11|master2|CentOS 7.6 64-bit|24c|96G|1T|java1.8 + zookeeper + namenode| |
|3|10.10.40.12|master3|CentOS 7.6 64-bit|24c|96G|1T|java1.8 + zookeeper + namenode| |

2. DataNode host information (raw disks)

|No.|IP|Hostname|OS|CPU|Memory|Disk|Services|Notes|
|:--|:--|:--|:--|:--|:--|:--|:--|:--|
|1|10.10.40.13-42|datanode1-30|CentOS 7.6 64-bit|32c|128G|5.5T*12|java1.8 + datanode| |

3. Deploy the Java environment across the cluster

  • Download the package: jdk-8u151-linux-x64.tar.gz

  • Upload it to every node (a distribution sketch follows after this list)

  • Extract it to the install location: tar -xf jdk-8u151-linux-x64.tar.gz -C /data1/xinsrv

  • Add the environment variables: create /etc/profile.d/java.sh with the following content, then run source /etc/profile.d/java.sh

      export JAVA_HOME=/data1/xinsrv/jdk1.8.0_151/
      
      PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
      
      export PATH
    
  • Verify the version: java -version
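
A minimal distribution sketch for the upload/extract steps above, assuming root SSH access between nodes, that java.sh has already been created on the current host, and that a hypothetical nodes.txt lists the remaining node IPs one per line:

      # push the JDK tarball and profile script to each node, then install and verify
      while read -r host; do
        scp jdk-8u151-linux-x64.tar.gz "${host}:/tmp/"
        scp /etc/profile.d/java.sh "${host}:/etc/profile.d/java.sh"
        ssh "$host" 'tar -xf /tmp/jdk-8u151-linux-x64.tar.gz -C /data1/xinsrv && source /etc/profile.d/java.sh && java -version'
      done < nodes.txt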

4. Deploy the ZooKeeper cluster

Deployment hosts: 10.10.40.10-10.10.40.12 (more hosts may be used; an odd number of ZooKeeper hosts is recommended)

  • Download the package: zookeeper-3.4.8.tar.gz

  • Upload it to the hosts

  • Extract to the install location: tar -xf zookeeper-3.4.8.tar.gz -C /data1/xinsrv/ && cd /data1/xinsrv/ && mv zookeeper-3.4.8 zookeeper

  • Create the data and log directories: mkdir -p /data1/data/zookeeper && mkdir -p /data1/logs/app/zookeeper

  • Configure myid (one value per host):

      10.10.40.10 echo "1" >/data1/data/zookeeper/myid
      10.10.40.11 echo "2" >/data1/data/zookeeper/myid
      10.10.40.12 echo "3" >/data1/data/zookeeper/myid
    
  • Configure zoo.cfg: cat /data1/xinsrv/zookeeper/conf/zoo.cfg |egrep -v "^#|^$"

      tickTime=2000
      initLimit=10
      syncLimit=5
      dataDir=/data1/data/zookeeper
      dataLogDir=/data1/logs/app/zookeeper
      clientPort=2181
      maxClientCnxns=3000
      server.1=10.10.40.10:3181:4181
      server.2=10.10.40.11:3181:4181
      server.3=10.10.40.12:3181:4181
    

4.1 Start the service: cd /etc/init.d/ && ./zookeeper_2181 start (the zookeeper_2181 init script shown below is placed in /etc/init.d)

cat zookeeper_2181

#!/bin/bash
#
# zookeeper       Startup script for zookeeper
#
# chkconfig: - 93 19
# processname: zookeeper
. /etc/init.d/functions
bin=/data1/xinsrv/zookeeper/bin/zkServer.sh
conf=/data1/xinsrv/zookeeper/conf/zoo.cfg
logdir=/data1/logs/xinsrv/zookeeper
[ ! -d $logdir ] && mkdir -p $logdir && cd $logdir
service_count=`ps -ef |grep /data1/xinsrv/zookeeper/conf/zoo.cfg|grep -v grep|wc -l`
start(){
if [ $service_count -ge 1 ];then
   echo "the service of zookeeper is running..."
   exit
fi
$bin start $conf
if [ $? -ne 0 ];then
   action "starting zookeeper..." /bin/false
   exit
else
   action "starting zookeeper..." /bin/true
fi
}
stop(){
if [ $service_count -lt 1 ];then
   echo "the service of zookeeper is not running..."
   exit
fi
$bin stop $conf
if [ $? -ne 0 ];then
   action "stopping zookeeper..." /bin/false
   exit
else
   action "stopping zookeeper..." /bin/true
fi
}
restart(){
$bin restart $conf
}
status(){
$bin status $conf
}
main () {
case "$1" in
  start)
	start
	;;
  stop)
	stop
	;;
  restart)
	restart
	;;
  status)
	status
	;;
   *)
    echo $"Usage: $0 {start|stop|status|restart}"
    exit 1
esac
}
main $*
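
Once the service has been started on all three hosts, the ensemble state can be checked with the standard ZooKeeper "stat" four-letter command (this sketch assumes nc is installed; one host should report "Mode: leader" and the other two "Mode: follower"):

for h in 10.10.40.10 10.10.40.11 10.10.40.12; do
    echo -n "$h: "
    echo stat | nc "$h" 2181 | grep Mode
done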

5. Deploy the Hadoop cluster

5.1 Base environment preparation

  • On all nodes

  • Install dependencies: yum install python-devel libevent-devel libmcrypt libmcrypt-devel -y

  • Add the cluster host entries to /etc/hosts on every node:

      10.10.40.10		master1
      10.10.40.11		master2
      10.10.40.12		master3
      10.10.40.13		datanode1
      10.10.40.14		datanode2
      10.10.40.15		datanode3
    
  • Create the required directories and set their ownership (the hadoop user is created in section 5.2):

    • mkdir -p /data1/cache/hadoop /data1/cache/hadoop/dfs/data /data1/cache/hadoop/dfs/name /data1/cache/hadoop/tmp /var/run/hadoop /var/run/spark /data1/xinsrv/hadoop-2.7.2 /data1/xinsrv/tez-0.8.4
    • chown hadoop.hadoop -R /data1/cache/hadoop /data1/cache/hadoop/dfs/data /data1/cache/hadoop/dfs/name /data1/cache/hadoop/tmp /var/run/hadoop /var/run/spark /data1/xinsrv/hadoop-2.7.2 /data1/xinsrv/tez-0.8.4

5.2 Install Hadoop

  • On all nodes

  • Download the packages: hadoop-2.7.2.tar.gz, tez-0.8.4.tar.gz, scala-2.11.8.tgz

  • Upload them to the hosts

  • Install Tez: tar xf tez-0.8.4.tar.gz -C /data1/xinsrv

  • Install Scala: tar xf scala-2.11.8.tgz -C /data1/xinsrv

  • Install Hadoop: useradd hadoop && tar xf hadoop-2.7.2.tar.gz -C /data1/xinsrv && chown -R hadoop.hadoop /data1/xinsrv/hadoop-2.7.2

  • Format and mount the DataNode raw disks, naming the mount points disk1-disk12 (a formatting/mount sketch follows after this list)

  • Create the DataNode data directories:

      df -h |grep disk &>/dev/null  && for n in `df -h |grep disk|awk '{print $NF}'|grep "^/disk"`;do mkdir -p $n/cache/hadoop/dfs/data/ && chown -R hadoop.hadoop $n/cache/hadoop/dfs/data/;done
    
  • Set up passwordless SSH: on 10.10.40.10, run sudo su - hadoop and then ssh-keygen -t rsa to generate a key pair; add the resulting public key to the other nodes (see the sketch below).
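
A rough sketch of the raw-disk formatting and mounting step above; the device names (/dev/sdb through /dev/sdm) and the xfs filesystem are assumptions and must be verified against the actual hardware before running:

      # format 12 data disks and mount them as /disk1 ... /disk12
      i=1
      for dev in /dev/sd{b..m}; do
          mkfs.xfs -f "$dev"                                       # destroys any existing data on the device
          mkdir -p "/disk${i}"
          echo "$dev /disk${i} xfs defaults,noatime 0 0" >> /etc/fstab
          i=$((i+1))
      done
      mount -a && df -h | grep disk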
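And a sketch of the passwordless-login setup (run as the hadoop user on 10.10.40.10; the host list below is taken from the /etc/hosts entries in 5.1 and should be extended to cover every DataNode; ssh-copy-id prompts for the hadoop password on each target). Note that the sshfence setting in hdfs-site.xml relies on this key, so at minimum the two NameNodes must be able to reach each other:

      ssh-keygen -t rsa -N '' -f ~/.ssh/id_rsa
      for host in master2 master3 datanode1 datanode2 datanode3; do
          ssh-copy-id -i ~/.ssh/id_rsa.pub "hadoop@${host}"
      done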

5.3 hadoop-env.sh (Hadoop startup environment)

cat hadoop-env.sh |egrep -v "^#|^$"

export JAVA_HOME=/data1/xinsrv/jdk1.8.0_151
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/data1/xinsrv/hadoop-2.7.2/etc/hadoop"}
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
  if [ "$HADOOP_CLASSPATH" ]; then
    export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
  else
    export HADOOP_CLASSPATH=$f
  fi
done
export HADOOP_HEAPSIZE=8192
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_NAMENODE_OPTS="-Xms30g -Xmx30g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx1024m $HADOOP_PORTMAP_OPTS"
export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
export HADOOP_PID_DIR=/var/run/hadoop
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_IDENT_STRING=$USER
export TEZ_CONF_DIR=/data1/xinsrv/hadoop-2.7.2/etc/hadoop/tez-site.xml
export TEZ_JARS=/data1/xinsrv/tez-0.8.4
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*

5.4 hdfs-site.xml (NameNode and DataNode configuration)

cat hdfs-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>2</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/data1/cache/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>/disk1/cache/hadoop/dfs/data/,/disk2/cache/hadoop/dfs/data/,/disk3/cache/hadoop/dfs/data/,/disk4/cache/hadoop/dfs/data/,/disk5/cache/hadoop/dfs/data/,/disk6/cache/hadoop/dfs/data/,/disk7/cache/hadoop/dfs/data/,/disk8/cache/hadoop/dfs/data/,/disk9/cache/hadoop/dfs/data/,/disk10/cache/hadoop/dfs/data/,/disk11/cache/hadoop/dfs/data/,/disk12/cache/hadoop/dfs/data/</value>
        </property>
        <property>
                <name>dfs.datanode.failed.volumes.tolerated</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.blocksize</name>
                <value>134217728</value>
        </property>
	<property>
  		<name>dfs.nameservices</name>
  		<value>mycluster</value>
		<description>Logical name for this new nameservice</description>
	</property>
	<property>
		 <name>dfs.ha.namenodes.mycluster</name>
 		 <value>nn1,nn2</value>
	</property>
        <property>
        	 <name>dfs.namenode.rpc-address.mycluster.nn1</name>
                 <value>10.10.40.10:9000</value>
        </property>
        <property>
          	 <name>dfs.namenode.rpc-address.mycluster.nn2</name>
          	 <value>10.10.40.11:9000</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.mycluster.nn1</name>
                <value>10.10.40.10:50070</value>
        </property>
        <property>
                <name>dfs.namenode.http-address.mycluster.nn2</name>
                <value>10.10.40.11:50070</value>
        </property>
        <property>
                <name>dfs.namenode.shared.edits.dir</name>
                <value>qjournal://10.10.40.10:8485;10.10.40.11:8485;10.10.40.12:8485/mycluster</value>
        </property>
        <property>
                <name>dfs.client.failover.proxy.provider.mycluster</name>
                <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
        </property>
        <property>
                <name>dfs.ha.fencing.methods</name>
                <value>sshfence</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.private-key-files</name>
                <value>/home/hadoop/.ssh/id_rsa</value>
        </property>
        <property>
                <name>dfs.journalnode.edits.dir</name>
                <value>/data1/cache/hadoop/journal/node/data</value>
        </property>
        <property>
                <name>dfs.ha.fencing.ssh.connect-timeout</name>
                <value>30000</value>
        </property>
        <property>
                <name>dfs.ha.automatic-failover.enabled</name>
                <value>true</value>
        </property>
        <property>
                <name>dfs.datanode.max.transfer.threads</name>
                <value>10240</value>
        </property>
  	<property>
    		<name>dfs.permissions</name>
    		<value>true</value>
  	</property>
        <property>
                <name>dfs.permissions.superusergroup</name>
                <value>hadoop</value>
        </property>
	<property>
		<name>dfs.namenode.acls.enabled</name>
		<value>true</value>
	</property>
	<property>
                <name>fs.trash.interval</name>
                <value>10080</value>
        </property>
 	<property>
    		<name>dfs.webhdfs.enabled</name>
    		<value>true</value>
  	</property>
	<property>
  		<name>dfs.hosts.exclude</name>
  		<value>/data1/xinsrv/hadoop-2.7.2/etc/hadoop/excludes</value>
  		<description>Names a file that contains a list of hosts that are not permitted to connect to the namenode.  The full pathname of the file must be specified.  If the value is empty, no hosts are excluded.</description>
	</property>
	<property>
		<name>dfs.namenode.handler.count</name>
		<value>150</value>
	</property>
	<property>
		<name>dfs.datanode.handler.count</name>
		<value>100</value>
	</property>
        <property>
                <name>dfs.namenode.datanode.registration.ip-hostname-check</name>
                <value>false</value>
        </property>
        <property>
        	<name>dfs.qjournal.write-txns.timeout.ms</name>
        	<value>90000</value>
	</property>
</configuration>
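
Once HDFS is running (section 6), the active/standby roles of the nn1/nn2 IDs defined above can be confirmed with hdfs haadmin:

hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2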

5.5 core-site.xml

cat core-site.xml

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://mycluster</value>
        </property>
        <property>
                <name>io.file.buffer.size</name>
                <value>131072</value>
        </property>
        <property>
       		<name>ha.zookeeper.quorum</name>
        	<value>10.10.40.10:2181,10.10.40.11:2181,10.10.40.12:2181</value>
        </property>
	<property>
    		<name>io.compression.codecs</name>
        	<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
	</property>
	<property>
    		<name>hadoop.native.lib</name>
    		<value>true</value>
  	</property>
  	<property>
    		<name>hadoop.tmp.dir</name>
    		<value>/data1/cache/hadoop/tmp</value>
  	</property>
	<property>
  		<name>hadoop.proxyuser.hue.hosts</name>
  		<value>*</value>
	</property>
	<property>
  		<name>hadoop.proxyuser.hue.groups</name>
  		<value>*</value>
	</property>
	<property>
		<name>ipc.server.listen.queue.size</name>
		<value>512</value>
	</property>
<!-- The following settings integrate with the Aliyun OSS service and are used for data migration -->
<property>
  <name>fs.oss.endpoint</name>
  <description>Aliyun OSS endpoint to connect to. An up-to-date list is
    provided in the Aliyun OSS Documentation.
   </description>
  <value>oss-cn-beijing-internal.aliyuncs.com</value>
</property>

<property>
  <name>fs.oss.accessKeyId</name>
  <description>Aliyun access key ID</description>
  <value>LTAI</value>
</property>

<property>
  <name>fs.oss.accessKeySecret</name>
  <description>Aliyun access key secret</description>
  <value>AZO</value>
</property>

<property>
   <name>fs.oss.impl</name>
   <value>org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem</value>
</property>

<property>
  <name>fs.oss.buffer.dir</name>
  <value>/tmp/oss</value>
  <description>Comma separated list of directories to buffer OSS data before uploading to Aliyun OSS</description>
</property>

<property>
  <name>fs.oss.connection.maximum</name>
  <value>2048</value>
  <description>Number of simultaneous connections to oss.</description>
</property>

<property>
  <name>fs.oss.connection.secure.enabled</name>
  <value>false</value>
  <description>Connect to oss over ssl or not, true by default.</description>
</property>

<!-- 2019-07-22 11:13:20: tuning to reduce frequent NameNode failovers -->
    <!-- HealthMonitor NameNode check timeout: default 45000 ms, raised to 3 minutes -->
    <property>
        <name>ha.health-monitor.rpc-timeout.ms</name>
        <value>180000</value>
    </property>
   <!-- ZKFC ZooKeeper session timeout: default 5000 ms, raised to 2 minutes -->
    <property>
        <name>ha.zookeeper.session-timeout.ms</name>
        <value>120000</value>
    </property>
</configuration>
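
With the OSS settings above, data migration is typically done with distcp; a sketch only (the bucket name my-bucket and the paths are placeholders, and the hadoop-aliyun support jars must be on the Hadoop classpath):

hadoop distcp hdfs://mycluster/user/hive/warehouse/demo_table oss://my-bucket/backup/demo_table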

5.6 mapred-site.xml

cat mapred-site.xml

<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
        <property>
                <name>mapreduce.framework.name</name>
                <value>yarn-tez</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.address</name>
                <value>10.10.40.12:10020</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.webapp.address</name>
                <value>10.10.40.12:19888</value>
        </property>
        <property>
                <name>mapreduce.map.memory.mb</name>
                <value>2048</value>
        </property>
        <property>
                <name>mapreduce.reduce.memory.mb</name>
                <value>4096</value>
        </property>
        <property>
                <name>mapreduce.task.io.sort.mb</name>
                <value>150</value>
        </property>
        <property>
                <name>yarn.app.mapreduce.am.resource.mb</name>
                <value>2048</value>
        </property>
        <property>
                <name>yarn.app.mapreduce.am.command-opts</name>
                <value>-Xmx3276m</value>
        </property>
	<property>
    		<name>mapreduce.jobhistory.done-dir</name>
    		<value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
	</property>
	<property>
    		<name>mapreduce.jobhistory.intermediate-done-dir</name>
    		<value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
	</property>
	<property>
    		<name>yarn.app.mapreduce.am.staging-dir</name>
    		<value>/tmp/hadoop-yarn/staging</value>
	</property>
	<property>
    		<name>mapreduce.jobhistory.joblist.cache.size</name>
    		<value>20000</value>
	</property>
	<property>
		<name>mapreduce.job.reduce.slowstart.completedmaps</name>
		<value>1</value>
		<description>Fraction of the number of maps in the job which should be complete before reduces are scheduled for the job.</description>
	</property>
	<property>
		<name>mapreduce.reduce.cpu.vcores</name>
		<value>2</value>
		<description>The number of virtual cores to request from the scheduler for each reduce task.</description>
	</property>
	<property>
		<name>mapreduce.tasktracker.map.tasks.maximum</name>
		<value>10</value>
		<description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
	</property>
	<property>
		<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
		<value>5</value>
		<description>The maximum number of reduce tasks that will be run simultaneously by a task tracker.</description>
	</property>
	<property>
		<name>mapreduce.task.io.sort.factor</name>
		<value>50</value>
		<description>The number of streams to merge at once while sorting files. This determines the number of open file handles.</description>
	</property>
	<property>
		<name>mapreduce.reduce.shuffle.parallelcopies</name>
		<value>20</value>
		<description>The default number of parallel transfers run by reduce during the copy(shuffle) phase.</description>
	</property>
	<property>
		<name>mapreduce.map.output.compress</name>
		<value>true</value>
		<description>Should the outputs of the maps be compressed before being sent across the network. Uses SequenceFile compression.</description>
	</property>
	<property>
		<name>mapreduce.map.output.compress.codec</name>
		<value>org.apache.hadoop.io.compress.GzipCodec</value>
		<description>If the map outputs are compressed, how should they be compressed.</description>
	</property>
        <property>
                <name>mapreduce.job.running.map.limit</name>
                <value>60</value>
                <description>The maximum number of simultaneous map tasks per job. There is no limit if this value is 0 or negative.</description>
        </property>
        <property>
                <name>mapreduce.job.running.reduce.limit</name>
                <value>30</value>
                <description>The maximum number of simultaneous reduce tasks per job. There is no limit if this value is 0 or negative.</description>
        </property>
        <property>
                <name>mapred.child.java.opts</name>
                <value>-Xmx3712m</value>
        </property>
        <property>
                <name>mapreduce.jobhistory.client.thread-count</name>
                <value>20</value>
        </property>

</configuration>

5.7 yarn-site.xml

cat yarn-site.xml

<?xml version="1.0"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<configuration>
	<!-- yarn-ha start-->
	<property>
		<name>yarn.resourcemanager.ha.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.resourcemanager.cluster-id</name>
		<value>yarn-ha</value>
	</property>
	<property>
		<name>yarn.resourcemanager.ha.rm-ids</name>
		<value>rm1,rm2</value>
	</property>
        <property>
                <name>yarn.resourcemanager.hostname.rm1</name>
                <value>10.10.40.10</value>
        </property>
        <property>
                <name>yarn.resourcemanager.hostname.rm2</name>
                <value>10.10.40.11</value>
        </property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm1</name>
		<value>10.10.40.10:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.webapp.address.rm2</name>
		<value>10.10.40.11:8088</value>
	</property>
	<property>
		<name>yarn.resourcemanager.zk-address</name>
		<value>10.10.40.10:2181,10.10.40.11:2181,10.10.40.12:2181</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address.rm1</name>
		<value>10.10.40.10:8032</value>
	</property>
	<property>
		<name>yarn.resourcemanager.address.rm2</name>
		<value>10.10.40.11:8032</value>
	</property>
	<!-- yarn ha end -->

        <property>
                <name>yarn.nodemanager.aux-services</name>
                <value>mapreduce_shuffle,spark_shuffle</value>
        </property>
	<property>
		<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
		<value>org.apache.spark.network.yarn.YarnShuffleService</value>
	</property>
        <property>
                <name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
                <value>org.apache.hadoop.mapred.ShuffleHandler</value>
        </property>
        <property>
                <name>yarn.nodemanager.resource.memory-mb</name>
                <value>106496</value>
        </property>
        <property>
                <name>yarn.scheduler.minimum-allocation-mb</name>
                <value>1024</value>
        </property>
        <property>
                <name>yarn.scheduler.maximum-allocation-mb</name>
                <value>8192</value>
        </property>
        <property>
                <name>yarn.nodemanager.resource.cpu-vcores</name>
                <value>30</value>
        </property>
        <property>
                <name>yarn.scheduler.maximum-allocation-vcores</name>
                <value>50</value>
        </property>
	<property>
    		<name>yarn.log-aggregation-enable</name>
    		<value>true</value>
	</property>
        <property>
                <name>yarn.nodemanager.vmem-check-enabled</name>
                <value>false</value>
        </property>
        <property>
                <name>yarn.nodemanager.pmem-check-enabled</name>
                <value>false</value>
        </property>
	<property>
   		<name>yarn.nodemanager.vmem-pmem-ratio</name>
   		<value>4</value>
   		<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
	</property>
	<property>
		<name>yarn.resourcemanager.scheduler.class</name>
		<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
	</property>
	<property>
		<name>yarn.scheduler.fair.allocation.file</name>
		<value>/data1/xinsrv/hadoop-2.7.2/etc/hadoop/fair-scheduler.xml</value>
	</property>
	<property>
		<name>yarn.scheduler.fair.preemption</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.scheduler.fair.user-as-default-queue</name>
		<value>true</value>
		<description>default is True</description>
	</property>
	<property>
		<name>yarn.scheduler.fair.allow-undeclared-pools</name>
		<value>false</value>
		<description>default is True</description>
	</property>
	<property>
		<name>yarn.timeline-service.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.timeline-service.hostname</name>
		<value>master3</value>
	</property>
	<property>
		<name>yarn.timeline-service.address</name>
		<value>master3:10200</value>
	</property>
	<property>
		<name>yarn.timeline-service.webapp.address</name>
		<value>master3:8188</value>
	</property>
	<property>
		<name>yarn.timeline-service.webapp.https.address</name>
		<value>master3:8190</value>
	</property>
	<property>
		<name>yarn.timeline-service.http-cross-origin.enabled</name>
		<value>true</value>
	</property>
	<property>
		<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
		<value>true</value>
	</property>
	<property>
		<description>Enable age off of timeline store data.</description>
		<name>yarn.timeline-service.ttl-enable</name>
		<value>true</value>
	</property>
	<property>
		<description>Time to live for timeline store data in milliseconds.</description>
		<name>yarn.timeline-service.ttl-ms</name>
		<value>259200000</value>
	</property>
	<property>
        	<description>Handler thread count to serve the client RPC requests.</description>
        	<name>yarn.timeline-service.handler-thread-count</name>
        	<value>35</value>
	</property>
	<property>
		<description>Length of time to wait between deletion cycles of leveldb timeline store in milliseconds.</description>
		<name>yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms</name>
		<value>1200000</value>
	</property>
	<property>
		<name>yarn.resourcemanager.nodes.exclude-path</name>
		<value>/data1/xinsrv/hadoop-2.7.2/etc/hadoop/node-manager-excludes</value>
	</property>
	<property>
		<name>yarn.log.server.url</name>
		<value>http://master3:19888/jobhistory/logs</value>
	</property>
        <property>
                <name>yarn.resourcemanager.webapp.cross-origin.enabled</name>
                <value>true</value>
        </property>
<property>
  <name>yarn.timeline-service.generic-application-history.enabled</name>
  <value>true</value>
</property>
</configuration>
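
After YARN is started, the ResourceManager HA roles (rm1/rm2 above) can be checked in the same way as the NameNodes:

yarn rmadmin -getServiceState rm1
yarn rmadmin -getServiceState rm2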

5.8 slaves (list of DataNode IPs)

cat slaves

10.10.40.13
10.10.40.14
10.10.40.15
10.10.40.16
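
Only the first few DataNodes are listed here; since the DataNode table in section 2 covers 10.10.40.13-42, the full slaves file can be generated with a one-liner (verify the range before overwriting the file):

for i in $(seq 13 42); do echo "10.10.40.$i"; done > /data1/xinsrv/hadoop-2.7.2/etc/hadoop/slaves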

6. Start the Hadoop cluster

6.1 DataNode environment check

The following script checks that the required components and directories exist and that their permissions are correct.

cat hadoop_check.sh
#!/bin/bash

#######################################################
# $Name:         hadoop_check.sh
# $Version:      v1.0
# $Function:     Check the environment of a node newly added to the Hadoop cluster
# $Author:       liuxiaopeng
# $Create Date:  2018-07-09 18:57:20
# $Description:
#######################################################
# Shell Env
SHELL_NAME=`basename $0`
SHELL_DIR="/tmp"
SHELL_LOG="${SHELL_DIR}/${SHELL_NAME}.log"
LOCK_FILE="/tmp/${SHELL_NAME}.lock"

# mains
#Write Log
shell_log(){
    LOG_INFO=$1
    echo "$(date "+%Y-%m-%d") $(date "+%H-%M-%S") : ${SHELL_NAME} : ${LOG_INFO}" >> ${SHELL_LOG}
}

red_color(){
    CONTENT=$1
    echo -e "\033[31m $CONTENT \033[0m"
}
green_color(){
    CONTENT=$1
    echo -e "\033[32m $CONTENT \033[0m"
}
# Shell Usage
#shell_usage(){
#    echo $"Usage: $0 {backup}"
#}

shell_lock(){
    touch ${LOCK_FILE}
}

shell_unlock(){
    rm -f ${LOCK_FILE}
}

TAG=$(/sbin/ip a l |grep "inet "|awk '{print $2}'|awk -F '/' '{print $1}'|grep -v '127.0.0.1')
check_hosts(){
linename1=`cat /etc/hosts|grep  $TAG`
linenum1=`cat /etc/hosts|grep $TAG|wc -l`
if [ $linenum1 -ne 1 ];then
   red_color "Check 1: /etc/hosts entry [abnormal]: $linename1"
else
   green_color "Check 1: /etc/hosts entry [normal]"
fi
}
check_slaves(){
linenum2=`cat /data1/xinsrv/hadoop-2.7.2/etc/hadoop/slaves|grep $TAG|wc -l`
if [ $linenum2 -ne 1 ];then
   red_color "Check 2: slaves file [abnormal]: this IP is not listed"
else
   green_color "Check 2: slaves file [normal]"
fi
}
check_dir_cache(){
linename3=`ls -ld /data1/cache/hadoop|awk '{print $3}'`
[ -d /data1/cache/hadoop ] && [ $linename3 == "hadoop" ] && linenum3=1
[ -d /data1/cache/hadoop ] && [ $linename3 == "hadoop" ] || linenum3=2

if [ $linenum3 -ne 1 ];then
   red_color "Check 3: /data1/cache/hadoop directory/ownership [abnormal]: directory missing or wrong owner"
else
   green_color "Check 3: /data1/cache/hadoop directory/ownership [normal]"
fi
}
check_dir_run(){
linename4=`ls -ld /var/run/hadoop|awk '{print $3}'`
[ -d /var/run/hadoop ] && [ $linename4 == "hadoop" ] && linenum4=1
[ -d /var/run/hadoop ] && [ $linename4 == "hadoop" ] || linenum4=2
if [ $linenum4 -ne 1 ];then
   red_color "Check 4: /var/run/hadoop directory/ownership [abnormal]: directory missing or wrong owner"
else
   green_color "Check 4: /var/run/hadoop directory/ownership [normal]"
fi
}
check_scala(){
linename5=`which scala`
[ $linename5 == "/data1/xinsrv/scala-2.11.8/bin/scala" ]  && linenum5=1
[ $linename5 == "/data1/xinsrv/scala-2.11.8/bin/scala" ]  || linenum5=2
if [ $linenum5 -ne 1 ];then
   red_color "Check 5: scala not installed or profile misconfigured [abnormal]"
else
   green_color "Check 5: scala configuration [normal]"
fi
}

check_jdk(){
linename6=$(ls -l $(ls -l `which java`|awk '{print $NF}')|awk '{print $NF}')
[ $linename6 == "/data1/xinsrv/jdk1.8.0_151/bin/java" ] && linenum6=1
[ $linename6 == "/data1/xinsrv/jdk1.8.0_151/bin/java" ] || linenum6=2
if [ $linenum6 -ne 1 ];then
   red_color "Check 6: JDK not installed or profile misconfigured [abnormal]"
else
   green_color "Check 6: JDK configuration [normal]"
fi
}
check_numpy(){
linenum7=`pip2.7 list|grep numpy|wc -l`
if [ $linenum7 -ne 1 ];then
   red_color "Check 7: python2.7 numpy module not installed [abnormal]"
else
   green_color "Check 7: python2.7 numpy module [normal]"
fi
}
check_pandas(){
linenum8=`pip2.7 list|grep pandas|wc -l`
if [ $linenum8 -ne 1 ];then
   red_color "Check 8: python2.7 pandas module not installed [abnormal]"
else
   green_color "Check 8: python2.7 pandas module [normal]"
fi
}
check_hadoop(){
linename9=`ls -l /data1/xinsrv/hadoop-2.7.2/bin/yarn|awk '{print $3}'`
[ $linename9 == "hadoop" ] && linenum9=1
[ $linename9 == "hadoop" ] || linenum9=2
if [ $linenum9 -ne 1 ];then
   red_color "Check 9: hadoop-2.7.2 not installed or not owned by hadoop [abnormal]"
else
   green_color "Check 9: hadoop-2.7.2 installation/ownership [normal]"
fi
}
check_tez(){
linename10=`ls -l /data1/xinsrv/tez-0.8.4/lib/jetty-6.1.26.jar|awk '{print $3}'`
[ $linename10 == "hadoop" ] && linenum10=1
[ $linename10 == "hadoop" ] || linenum10=2
if [ $linenum10 -ne 1 ];then
   red_color "Check 10: tez-0.8.4 not installed or not owned by hadoop [abnormal]"
else
   green_color "Check 10: tez-0.8.4 installation/ownership [normal]"
fi
}
check_disk(){
linename11=`ls -ld /disk1/cache/hadoop/dfs/data/|awk '{print $3}'`
[ $linename11 == "hadoop" ] && linenum11=1
[ $linename11 == "hadoop" ] || linenum11=2
if [ $linenum11 -ne 1 ];then
   red_color "Check 11: /disk1/cache/hadoop/dfs/data/ not created or not owned by hadoop [abnormal]"
else
   green_color "Check 11: /disk1/cache/hadoop/dfs/data/ creation/ownership [normal]"
fi
}
check_hostname(){
linename12=`cat /etc/hosts|grep $TAG|awk '{print $2}'`
hostname $linename12
hostname=`hostname`
[ $linename12 == $hostname ] && linenum12=1
[ $linename12 == $hostname ] ||  linenum12=2

if [ $linenum12 -ne 1 ];then
   red_color "Check 12: hostname is not $linename12 [abnormal]"
else
   green_color "Check 12: hostname is $linename12 [normal]"
fi
}
check_hdfs_site(){
linename13=`cat /data1/xinsrv/hadoop-2.7.2/etc/hadoop/hdfs-site.xml |grep /disk1/cache/hadoop/dfs/data|wc -l`
[ $linename13 -eq 1 ] && linenum13=1
[ $linename13 -eq 1 ] || linenum13=2
if [ $linenum13 -ne 1 ];then
   red_color "Check 13: /data1/xinsrv/hadoop-2.7.2/etc/hadoop/hdfs-site.xml does not configure the disk1 directory [abnormal]"
else
   green_color "Check 13: /data1/xinsrv/hadoop-2.7.2/etc/hadoop/hdfs-site.xml configuration [normal]"
fi

}
check_disk_total_num(){
linename14=`df -h|grep disk|wc -l`
[ $linename14 -eq 16 -o $linename14 -eq 6 -o $linename14 -eq 12 ] && linenum14=1
[ $linename14 -eq 16 -o $linename14 -eq 6 -o $linename14 -eq 12 ] || linenum14=2
if [ $linenum14 -ne 1 ];then
   red_color "Check 14: number of /disk mounts on this host: $linename14 [abnormal]"
else
   green_color "Check 14: number of /disk mounts on this host: $linename14 [normal]"
fi

}
main(){
  check_hosts
  check_slaves
  check_dir_cache
  check_dir_run
  check_scala
  check_jdk
  check_numpy
  check_pandas
  check_hadoop
  check_tez
  check_disk
  check_hostname
  check_hdfs_site
  check_disk_total_num
  green_color "If all checks above pass, run on ${TAG}: sudo su - hadoop; /data1/xinsrv/hadoop-2.7.2/sbin/yarn-daemon.sh start nodemanager;/data1/xinsrv/hadoop-2.7.2/sbin/hadoop-daemon.sh start datanode; then verify with: jps"
}
main $*

6.2 Start Hadoop

6.2.1 Initialize the cluster NameNode

/data1/xinsrv/hadoop-2.7.2/sbin/start-dfs.sh
hadoop namenode -format   (only for the very first initialization of the cluster!)
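
For reference, a commonly used first-time initialization order for a QJM-based HA cluster looks like the sketch below (run as the hadoop user; this is the generic procedure, adapt it to this environment):

# 1. start the journalnodes on the three master hosts
/data1/xinsrv/hadoop-2.7.2/sbin/hadoop-daemon.sh start journalnode
# 2. format and start the first namenode (10.10.40.10)
hdfs namenode -format
/data1/xinsrv/hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode
# 3. copy the formatted metadata to the second namenode (10.10.40.11)
hdfs namenode -bootstrapStandby
# 4. initialize the HA state in ZooKeeper (see 6.2.3), then bring up HDFS
hdfs zkfc -formatZK
/data1/xinsrv/hadoop-2.7.2/sbin/start-dfs.sh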

6.2.2 Start the cluster services

/data1/xinsrv/hadoop-2.7.2/sbin/start-all.sh

6.2.3 If ZKFC fails with: FATAL org.apache.hadoop.ha.ZKFailoverController: Unable to start failover controller. Parent znode does not exist.

hdfs zkfc -formatZK

6.2.4 If the cluster reports: Incompatible namespaceID for journal Storage Directory

Update the VERSION file on the other NameNode so the namespaceIDs match, then delete the data on the DataNode nodes:

for n in /disk1/cache/hadoop/dfs/data/ /disk2/cache/hadoop/dfs/data/ /disk3/cache/hadoop/dfs/data/ /disk4/cache/hadoop/dfs/data/ /disk5/cache/hadoop/dfs/data/ /disk6/cache/hadoop/dfs/data/ /disk7/cache/hadoop/dfs/data/ /disk8/cache/hadoop/dfs/data/ /disk9/cache/hadoop/dfs/data/ /disk10/cache/hadoop/dfs/data/ /disk11/cache/hadoop/dfs/data/ /disk12/cache/hadoop/dfs/data/;do cd $n && rm -rf  ./*;done

Start again: /data1/xinsrv/hadoop-2.7.2/sbin/start-all.sh

7. Post-startup service verification

Add the cluster's hosts entries on your local machine, then check:
1. HDFS cluster status: http://10.10.40.11:50070/dfshealth.html#tab-overview
2. YARN: http://master1:8088/cluster/scheduler
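
A quick end-to-end functional check can be run with the example job bundled in the Hadoop 2.7.2 tarball (run as the hadoop user; if the Tez integration is not yet in place, append -D mapreduce.framework.name=yarn after "pi" to force plain MapReduce):

hadoop jar /data1/xinsrv/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 5 10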
