Hadoop-2.7.2 Cluster Deployment Steps
1. NameNode host information
|No.|IP|Hostname|OS|CPU|Memory|Disk|Environment & services|Notes|
|---|---|---|---|---|---|---|---|---|
|1|10.10.40.10|master1|CentOS 7.6 64-bit|24c|96G|1T|java1.8+zookeeper+namenode| |
|2|10.10.40.11|master2|CentOS 7.6 64-bit|24c|96G|1T|java1.8+zookeeper+namenode| |
|3|10.10.40.12|master3|CentOS 7.6 64-bit|24c|96G|1T|java1.8+zookeeper+namenode| |
2. DataNode host information (raw disks)
|No.|IP|Hostname|OS|CPU|Memory|Disk|Services|Notes|
|---|---|---|---|---|---|---|---|---|
|1|10.10.40.13-42|datanode1-30|CentOS 7.6 64-bit|32c|128G|5.5T*12|java1.8+datanode| |
3. Deploy the Java environment on the cluster
- Download the package: jdk-8u151-linux-x64.tar.gz
- Upload it to every node (see the distribution sketch after this list)
- Extract it to the install location: tar -xf jdk-8u151-linux-x64.tar.gz -C /data1/xinsrv
- Add the environment variables: source /etc/profile.d/java.sh
  export JAVA_HOME=/data1/xinsrv/jdk1.8.0_151/
  PATH=$PATH:$HOME/bin:$JAVA_HOME/bin
  export PATH
- Verify the version: java -version
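A minimal sketch for distributing and unpacking the JDK on every node in one pass. It assumes root SSH access from a control host, that the hostnames below are already resolvable, and that a local java.sh file contains the exports shown above; adjust the node list to the actual cluster.

```bash
# Hypothetical distribution loop; node names follow the host tables above.
for host in master1 master2 master3 datanode{1..30}; do
  scp jdk-8u151-linux-x64.tar.gz java.sh ${host}:/tmp/
  ssh ${host} "tar -xf /tmp/jdk-8u151-linux-x64.tar.gz -C /data1/xinsrv \
    && cp /tmp/java.sh /etc/profile.d/java.sh \
    && source /etc/profile.d/java.sh && java -version"
done
```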
4. Deploy the ZooKeeper cluster
Deployment hosts: 10.10.40.10-10.10.40.12 (more hosts are possible; an odd number of ZooKeeper nodes is recommended)
- Download the package: zookeeper-3.4.8.tar.gz
- Upload it to the hosts
- Extract to the install location: tar -xf zookeeper-3.4.8.tar.gz -C /data1/xinsrv/ && cd /data1/xinsrv/ && mv zookeeper-3.4.8 zookeeper
- Create the required directories: mkdir -p /data1/data/zookeeper && mkdir -p /data1/logs/app/zookeeper
- Configure myid:
  10.10.40.10: echo "1" > /data1/data/zookeeper/myid
  10.10.40.11: echo "2" > /data1/data/zookeeper/myid
  10.10.40.12: echo "3" > /data1/data/zookeeper/myid
- Configure the conf file: cat /data1/xinsrv/zookeeper/conf/zoo.cfg | egrep -v "^#|^$"
  tickTime=2000
  initLimit=10
  syncLimit=5
  dataDir=/data1/data/zookeeper
  dataLogDir=/data1/logs/app/zookeeper
  clientPort=2181
  maxClientCnxns=3000
  server.1=10.10.40.10:3181:4181
  server.2=10.10.40.11:3181:4181
  server.3=10.10.40.12:3181:4181
4.1. Start the service: cd /etc/init.d/ && ./zookeeper_2181 start
cat zookeeper_2181
#!/bin/bash
#
# zookeeper Startup script for zookeeper
#
# chkconfig: - 93 19
# processname: zookeeper
. /etc/init.d/functions
bin=/data1/xinsrv/zookeeper/bin/zkServer.sh
conf=/data1/xinsrv/zookeeper/conf/zoo.cfg
logdir=/data1/logs/xinsrv/zookeeper
[ ! -d $logdir ] && mkdir -p $logdir
cd $logdir
service_count=`ps -ef |grep /data1/xinsrv/zookeeper/conf/zoo.cfg|grep -v grep|wc -l`
start(){
if [ $service_count -ge 1 ];then
echo "the service of zookeeper is running..."
exit
fi
$bin start $conf
if [ $? -ne 0 ];then
action "starting zookeeper..." /bin/false
exit
else
action "starting zookeeper..." /bin/true
fi
}
stop(){
if [ $service_count -lt 1 ];then
echo "the service of zookeeper is not running..."
exit
fi
$bin stop $conf
if [ $? -ne 0 ];then
action "stopping zookeeper..." /bin/false
exit
else
action "stopping zookeeper..." /bin/true
fi
}
restart(){
$bin restart $conf
}
status(){
$bin status $conf
}
main () {
case "$1" in
start)
start
;;
stop)
stop
;;
restart)
restart
;;
status)
status
;;
*)
echo $"Usage: $0 {start|stop|status|restart}"
exit 1
esac
}
main $*
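After starting the service on each node, the ensemble state can be verified with zkServer.sh; one node should report Mode: leader and the others Mode: follower. A minimal check loop, assuming SSH access to the three nodes:

```bash
for host in 10.10.40.10 10.10.40.11 10.10.40.12; do
  echo "== ${host} =="
  ssh ${host} "/data1/xinsrv/zookeeper/bin/zkServer.sh status"
done
```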
5. Deploy the Hadoop cluster
5.1. Base environment preparation
- All nodes
- Install dependencies: yum install python-devel libevent-devel libmcrypt libmcrypt-devel -y
- Add host resolution for the cluster:
  10.10.40.10 master1
  10.10.40.11 master2
  10.10.40.12 master3
  10.10.40.13 datanode1
  10.10.40.14 datanode2
  10.10.40.15 datanode3
- Create the required directories and set their ownership (a sketch for applying this on every node follows this list):
  - mkdir -p /data1/cache/hadoop /data1/cache/hadoop/dfs/data /data1/cache/hadoop/dfs/name /data1/cache/hadoop/tmp /var/run/hadoop /var/run/spark /data1/xinsrv/hadoop-2.7.2 /data1/xinsrv/tez-0.8.4
  - chown hadoop.hadoop -R /data1/cache/hadoop /data1/cache/hadoop/dfs/data /data1/cache/hadoop/dfs/name /data1/cache/hadoop/tmp /var/run/hadoop /var/run/spark /data1/xinsrv/hadoop-2.7.2 /data1/xinsrv/tez-0.8.4
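A minimal sketch for running the directory creation and ownership change on every node from one control host. It assumes root SSH access and that the hadoop user already exists on each node (it is created in section 5.2):

```bash
for host in master1 master2 master3 datanode{1..30}; do
  ssh ${host} 'mkdir -p /data1/cache/hadoop/dfs/data /data1/cache/hadoop/dfs/name \
      /data1/cache/hadoop/tmp /var/run/hadoop /var/run/spark \
      /data1/xinsrv/hadoop-2.7.2 /data1/xinsrv/tez-0.8.4 \
    && chown -R hadoop.hadoop /data1/cache/hadoop /var/run/hadoop /var/run/spark \
      /data1/xinsrv/hadoop-2.7.2 /data1/xinsrv/tez-0.8.4'
done
```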
5.2. Hadoop installation
- All nodes
- Download the packages: hadoop-2.7.2.tar.gz, tez-0.8.4.tar.gz, scala-2.11.8.tgz
- Upload them to the hosts
- Install Tez: tar xf tez-0.8.4.tar.gz -C /data1/xinsrv
- Install Scala: tar xf scala-2.11.8.tgz -C /data1/xinsrv
- Install Hadoop: useradd hadoop && tar xf hadoop-2.7.2.tar.gz -C /data1/xinsrv && chown -R hadoop.hadoop /data1/xinsrv/hadoop-2.7.2
- Format and mount the DataNode raw disks; name the mount points disk1-disk12
- Create the DataNode data directories:
  df -h | grep disk &>/dev/null && for n in `df -h | grep disk | awk '{print $NF}' | grep "^/disk"`; do mkdir -p $n/cache/hadoop/dfs/data/ && chown -R hadoop.hadoop $n/cache/hadoop/dfs/data/; done
- Set up passwordless SSH login: on 10.10.40.10, run sudo su - hadoop, then generate a key pair with ssh-keygen -t rsa and add the public key to the other nodes (see the sketch after this list).
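A minimal sketch of the key distribution, assuming it is run as the hadoop user on 10.10.40.10 and that the hadoop user's password on each target node is available for ssh-copy-id:

```bash
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
for host in master1 master2 master3 datanode{1..30}; do
  ssh-copy-id -i ~/.ssh/id_rsa.pub hadoop@${host}
done
```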
5.3. hadoop-env.sh (Hadoop cluster startup environment)
cat hadoop-env.sh | egrep -v "^#|^$"
export JAVA_HOME=/data1/xinsrv/jdk1.8.0_151
export HADOOP_CONF_DIR=${HADOOP_CONF_DIR:-"/data1/xinsrv/hadoop-2.7.2/etc/hadoop"}
for f in $HADOOP_HOME/contrib/capacity-scheduler/*.jar; do
if [ "$HADOOP_CLASSPATH" ]; then
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:$f
else
export HADOOP_CLASSPATH=$f
fi
done
export HADOOP_HEAPSIZE=8192
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
export HADOOP_NAMENODE_OPTS="-Xms30g -Xmx30g -XX:+UseParNewGC -XX:+UseConcMarkSweepGC -XX:CMSInitiatingOccupancyFraction=70 -XX:+CMSParallelRemarkEnabled -XX:+PrintTenuringDistribution -Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_NAMENODE_OPTS"
export HADOOP_DATANODE_OPTS="-Dhadoop.security.logger=ERROR,RFAS $HADOOP_DATANODE_OPTS"
export HADOOP_SECONDARYNAMENODE_OPTS="-Dhadoop.security.logger=${HADOOP_SECURITY_LOGGER:-INFO,RFAS} -Dhdfs.audit.logger=${HDFS_AUDIT_LOGGER:-INFO,NullAppender} $HADOOP_SECONDARYNAMENODE_OPTS"
export HADOOP_NFS3_OPTS="$HADOOP_NFS3_OPTS"
export HADOOP_PORTMAP_OPTS="-Xmx1024m $HADOOP_PORTMAP_OPTS"
export HADOOP_CLIENT_OPTS="-Xmx2048m $HADOOP_CLIENT_OPTS"
export HADOOP_SECURE_DN_USER=${HADOOP_SECURE_DN_USER}
export HADOOP_SECURE_DN_LOG_DIR=${HADOOP_LOG_DIR}/${HADOOP_HDFS_USER}
export HADOOP_PID_DIR=/var/run/hadoop
export HADOOP_SECURE_DN_PID_DIR=${HADOOP_PID_DIR}
export HADOOP_IDENT_STRING=$USER
export TEZ_CONF_DIR=/data1/xinsrv/hadoop-2.7.2/etc/hadoop/tez-site.xml
export TEZ_JARS=/data1/xinsrv/tez-0.8.4
export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:${TEZ_CONF_DIR}:${TEZ_JARS}/*:${TEZ_JARS}/lib/*
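A quick sanity check of the environment above, run as the hadoop user; the Tez entries added via TEZ_JARS should appear in the reported classpath:

```bash
sudo su - hadoop
/data1/xinsrv/hadoop-2.7.2/bin/hadoop version
/data1/xinsrv/hadoop-2.7.2/bin/hadoop classpath | tr ':' '\n' | grep tez
```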
5.4. hdfs-site.xml (NameNode and DataNode configuration)
cat hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>file:/data1/cache/hadoop/dfs/name</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/disk1/cache/hadoop/dfs/data/,/disk2/cache/hadoop/dfs/data/,/disk3/cache/hadoop/dfs/data/,/disk4/cache/hadoop/dfs/data/,/disk5/cache/hadoop/dfs/data/,/disk6/cache/hadoop/dfs/data/,/disk7/cache/hadoop/dfs/data/,/disk8/cache/hadoop/dfs/data/,/disk9/cache/hadoop/dfs/data/,/disk10/cache/hadoop/dfs/data/,/disk11/cache/hadoop/dfs/data/,/disk12/cache/hadoop/dfs/data/</value>
</property>
<property>
<name>dfs.datanode.failed.volumes.tolerated</name>
<value>1</value>
</property>
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
</property>
<property>
<name>dfs.nameservices</name>
<value>mycluster</value>
<description>Logical name for this new nameservice</description>
</property>
<property>
<name>dfs.ha.namenodes.mycluster</name>
<value>nn1,nn2</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn1</name>
<value>10.10.40.10:9000</value>
</property>
<property>
<name>dfs.namenode.rpc-address.mycluster.nn2</name>
<value>10.10.40.11:9000</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn1</name>
<value>10.10.40.10:50070</value>
</property>
<property>
<name>dfs.namenode.http-address.mycluster.nn2</name>
<value>10.10.40.11:50070</value>
</property>
<property>
<name>dfs.namenode.shared.edits.dir</name>
<value>qjournal://10.10.40.10:8485;10.10.40.11:8485;10.10.40.12:8485/mycluster</value>
</property>
<property>
<name>dfs.client.failover.proxy.provider.mycluster</name>
<value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
</property>
<property>
<name>dfs.ha.fencing.methods</name>
<value>sshfence</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.private-key-files</name>
<value>/home/hadoop/.ssh/id_rsa</value>
</property>
<property>
<name>dfs.journalnode.edits.dir</name>
<value>/data1/cache/hadoop/journal/node/data</value>
</property>
<property>
<name>dfs.ha.fencing.ssh.connect-timeout</name>
<value>30000</value>
</property>
<property>
<name>dfs.ha.automatic-failover.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.datanode.max.transfer.threads</name>
<value>10240</value>
</property>
<property>
<name>dfs.permissions</name>
<value>true</value>
</property>
<property>
<name>dfs.permissions.superusergroup</name>
<value>hadoop</value>
</property>
<property>
<name>dfs.namenode.acls.enabled</name>
<value>true</value>
</property>
<property>
<name>fs.trash.interval</name>
<value>10080</value>
</property>
<property>
<name>dfs.webhdfs.enabled</name>
<value>true</value>
</property>
<property>
<name>dfs.hosts.exclude</name>
<value>/data1/xinsrv/hadoop-2.7.2/etc/hadoop/excludes</value>
<description>Names a file that contains a list of hosts that are not permitted to connect to the namenode. The full pathname of the file must be specified. If the value is empty, no hosts are excluded.</description>
</property>
<property>
<name>dfs.namenode.handler.count</name>
<value>150</value>
</property>
<property>
<name>dfs.datanode.handler.count</name>
<value>100</value>
</property>
<property>
<name>dfs.namenode.datanode.registration.ip-hostname-check</name>
<value>false</value>
</property>
<property>
<name>dfs.qjournal.write-txns.timeout.ms</name>
<value>90000</value>
</property>
</configuration>
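Once the cluster is running (section 6), the HA settings above can be verified from the command line; one NameNode should report "active" and the other "standby":

```bash
# Run as the hadoop user; nn1/nn2 are the IDs defined in dfs.ha.namenodes.mycluster.
hdfs haadmin -getServiceState nn1
hdfs haadmin -getServiceState nn2
```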
5.5. core-site.xml
cat core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://mycluster</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>ha.zookeeper.quorum</name>
<value>10.10.40.10:2181,10.10.40.11:2181,10.10.40.12:2181</value>
</property>
<property>
<name>io.compression.codecs</name>
<value>org.apache.hadoop.io.compress.DefaultCodec,org.apache.hadoop.io.compress.GzipCodec,org.apache.hadoop.io.compress.BZip2Codec,org.apache.hadoop.io.compress.SnappyCodec</value>
</property>
<property>
<name>hadoop.native.lib</name>
<value>true</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/data1/cache/hadoop/tmp</value>
</property>
<property>
<name>hadoop.proxyuser.hue.hosts</name>
<value>*</value>
</property>
<property>
<name>hadoop.proxyuser.hue.groups</name>
<value>*</value>
</property>
<property>
<name>ipc.server.listen.queue.size</name>
<value>512</value>
</property>
<!-- The following settings integrate with Aliyun OSS and are used for data migration -->
<property>
<name>fs.oss.endpoint</name>
<description>Aliyun OSS endpoint to connect to. An up-to-date list is
provided in the Aliyun OSS Documentation.
</description>
<value>oss-cn-beijing-internal.aliyuncs.com</value>
</property>
<property>
<name>fs.oss.accessKeyId</name>
<description>Aliyun access key ID</description>
<value>LTAI</value>
</property>
<property>
<name>fs.oss.accessKeySecret</name>
<description>Aliyun access key secret</description>
<value>AZO</value>
</property>
<property>
<name>fs.oss.impl</name>
<value>org.apache.hadoop.fs.aliyun.oss.AliyunOSSFileSystem</value>
</property>
<property>
<name>fs.oss.buffer.dir</name>
<value>/tmp/oss</value>
<description>Comma separated list of directories to buffer OSS data before uploading to Aliyun OSS</description>
</property>
<property>
<name>fs.oss.connection.maximum</name>
<value>2048</value>
<description>Number of simultaneous connections to oss.</description>
</property>
<property>
<name>fs.oss.connection.secure.enabled</name>
<value>false</value>
<description>Connect to oss over ssl or not, true by default.</description>
</property>
<!-- 2019-07-22 11:13:20: tuning to reduce frequent NameNode failovers -->
<!-- Timeout for the HealthMonitor NameNode check; default is 45000 ms, raised to 3 minutes -->
<property>
<name>ha.health-monitor.rpc-timeout.ms</name>
<value>180000</value>
</property>
<!-- ZooKeeper session timeout for ZKFC failover; default is 5000 ms, raised to 2 minutes -->
<property>
<name>ha.zookeeper.session-timeout.ms</name>
<value>120000</value>
</property>
</configuration>
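Sanity checks for the settings above once HDFS is up: the default filesystem should resolve to hdfs://mycluster, and the OSS connector can be probed against a bucket (oss://my-bucket is a placeholder, not a value from this deployment):

```bash
hdfs getconf -confKey fs.defaultFS
hadoop fs -ls /
hadoop fs -ls oss://my-bucket/   # hypothetical bucket name
```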
5.6. mapred-site.xml
cat mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn-tez</value>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>10.10.40.12:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>10.10.40.12:19888</value>
</property>
<property>
<name>mapreduce.map.memory.mb</name>
<value>2048</value>
</property>
<property>
<name>mapreduce.reduce.memory.mb</name>
<value>4096</value>
</property>
<property>
<name>mapreduce.task.io.sort.mb</name>
<value>150</value>
</property>
<property>
<name>yarn.app.mapreduce.am.resource.mb</name>
<value>2048</value>
</property>
<property>
<name>yarn.app.mapreduce.am.command-opts</name>
<value>-Xmx3276m</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>${yarn.app.mapreduce.am.staging-dir}/history/done_intermediate</value>
</property>
<property>
<name>yarn.app.mapreduce.am.staging-dir</name>
<value>/tmp/hadoop-yarn/staging</value>
</property>
<property>
<name>mapreduce.jobhistory.joblist.cache.size</name>
<value>20000</value>
</property>
<property>
<name>mapreduce.job.reduce.slowstart.completedmaps</name>
<value>1</value>
<description>Fraction of the number of maps in the job which should be complete before reduces are scheduled for the job.</description>
</property>
<property>
<name>mapreduce.reduce.cpu.vcores</name>
<value>2</value>
<description>The number of virtual cores to request from the scheduler for each reduce task.</description>
</property>
<property>
<name>mapreduce.tasktracker.map.tasks.maximum</name>
<value>10</value>
<description>The maximum number of map tasks that will be run simultaneously by a task tracker.</description>
</property>
<property>
<name>mapreduce.tasktracker.reduce.tasks.maximum</name>
<value>5</value>
<description>The maximum number of reduce tasks that will be run simultaneously by a task tracker.</description>
</property>
<property>
<name>mapreduce.task.io.sort.factor</name>
<value>50</value>
<description>The number of streams to merge at once while sorting files. This determines the number of open file handles.</description>
</property>
<property>
<name>mapreduce.reduce.shuffle.parallelcopies</name>
<value>20</value>
<description>The default number of parallel transfers run by reduce during the copy(shuffle) phase.</description>
</property>
<property>
<name>mapreduce.map.output.compress</name>
<value>true</value>
<description>Should the outputs of the maps be compressed before being sent across the network. Uses SequenceFile compression.</description>
</property>
<property>
<name>mapreduce.map.output.compress.codec</name>
<value>org.apache.hadoop.io.compress.GzipCodec</value>
<description>If the map outputs are compressed, how should they be compressed.</description>
</property>
<property>
<name>mapreduce.job.running.map.limit</name>
<value>60</value>
<description>The maximum number of simultaneous map tasks per job. There is no limit if this value is 0 or negative.</description>
</property>
<property>
<name>mapreduce.job.running.reduce.limit</name>
<value>30</value>
<description>The maximum number of simultaneous reduce tasks per job. There is no limit if this value is 0 or negative.</description>
</property>
<property>
<name>mapred.child.java.opts</name>
<value>-Xmx3712m</value>
</property>
<property>
<name>mapreduce.jobhistory.client.thread-count</name>
<value>20</value>
</property>
</configuration>
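A quick end-to-end test of the MapReduce settings above, using the examples jar bundled with the Hadoop tarball (run as the hadoop user once YARN is up; with mapreduce.framework.name set to yarn-tez this assumes the Tez libraries referenced by tez-site.xml are already available):

```bash
hadoop jar /data1/xinsrv/hadoop-2.7.2/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.2.jar pi 10 100
```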
5.7. yarn-site.xml
cat yarn-site.xml
<?xml version="1.0"?>
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License. See accompanying LICENSE file.
-->
<configuration>
<!-- yarn-ha start-->
<property>
<name>yarn.resourcemanager.ha.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.cluster-id</name>
<value>yarn-ha</value>
</property>
<property>
<name>yarn.resourcemanager.ha.rm-ids</name>
<value>rm1,rm2</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm1</name>
<value>10.10.40.10</value>
</property>
<property>
<name>yarn.resourcemanager.hostname.rm2</name>
<value>10.10.40.11</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm1</name>
<value>10.10.40.10:8088</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.address.rm2</name>
<value>10.10.40.11:8088</value>
</property>
<property>
<name>yarn.resourcemanager.zk-address</name>
<value>10.10.40.10:2181,10.10.40.11:2181,10.10.40.12:2181</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm1</name>
<value>10.10.40.10:8032</value>
</property>
<property>
<name>yarn.resourcemanager.address.rm2</name>
<value>10.10.40.11:8032</value>
</property>
<!-- yarn ha end -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle,spark_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.spark_shuffle.class</name>
<value>org.apache.spark.network.yarn.YarnShuffleService</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>
<property>
<name>yarn.nodemanager.resource.memory-mb</name>
<value>106496</value>
</property>
<property>
<name>yarn.scheduler.minimum-allocation-mb</name>
<value>1024</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-mb</name>
<value>8192</value>
</property>
<property>
<name>yarn.nodemanager.resource.cpu-vcores</name>
<value>30</value>
</property>
<property>
<name>yarn.scheduler.maximum-allocation-vcores</name>
<value>50</value>
</property>
<property>
<name>yarn.log-aggregation-enable</name>
<value>true</value>
</property>
<property>
<name>yarn.nodemanager.vmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.pmem-check-enabled</name>
<value>false</value>
</property>
<property>
<name>yarn.nodemanager.vmem-pmem-ratio</name>
<value>4</value>
<description>Ratio between virtual memory to physical memory when setting memory limits for containers</description>
</property>
<property>
<name>yarn.resourcemanager.scheduler.class</name>
<value>org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.FairScheduler</value>
</property>
<property>
<name>yarn.scheduler.fair.allocation.file</name>
<value>/data1/xinsrv/hadoop-2.7.2/etc/hadoop/fair-scheduler.xml</value>
</property>
<property>
<name>yarn.scheduler.fair.preemption</name>
<value>true</value>
</property>
<property>
<name>yarn.scheduler.fair.user-as-default-queue</name>
<value>true</value>
<description>default is True</description>
</property>
<property>
<name>yarn.scheduler.fair.allow-undeclared-pools</name>
<value>false</value>
<description>default is True</description>
</property>
<property>
<name>yarn.timeline-service.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.timeline-service.hostname</name>
<value>master3</value>
</property>
<property>
<name>yarn.timeline-service.address</name>
<value>master3:10200</value>
</property>
<property>
<name>yarn.timeline-service.webapp.address</name>
<value>master3:8188</value>
</property>
<property>
<name>yarn.timeline-service.webapp.https.address</name>
<value>master3:8190</value>
</property>
<property>
<name>yarn.timeline-service.http-cross-origin.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.resourcemanager.system-metrics-publisher.enabled</name>
<value>true</value>
</property>
<property>
<description>Enable age off of timeline store data.</description>
<name>yarn.timeline-service.ttl-enable</name>
<value>true</value>
</property>
<property>
<description>Time to live for timeline store data in milliseconds.</description>
<name>yarn.timeline-service.ttl-ms</name>
<value>259200000</value>
</property>
<property>
<description>Handler thread count to serve the client RPC requests.</description>
<name>yarn.timeline-service.handler-thread-count</name>
<value>35</value>
</property>
<property>
<description>Length of time to wait between deletion cycles of leveldb timeline store in milliseconds.</description>
<name>yarn.timeline-service.leveldb-timeline-store.ttl-interval-ms</name>
<value>1200000</value>
</property>
<property>
<name>yarn.resourcemanager.nodes.exclude-path</name>
<value>/data1/xinsrv/hadoop-2.7.2/etc/hadoop/node-manager-excludes</value>
</property>
<property>
<name>yarn.log.server.url</name>
<value>http://master3:19888/jobhistory/logs</value>
</property>
<property>
<name>yarn.resourcemanager.webapp.cross-origin.enabled</name>
<value>true</value>
</property>
<property>
<name>yarn.timeline-service.generic-application-history.enabled</name>
<value>true</value>
</property>
</configuration>
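yarn.scheduler.fair.allocation.file above points at fair-scheduler.xml, which is not reproduced in this document. A minimal sketch of such a file, with a single hypothetical "default" queue sized to the NodeManager resources configured above, could look like this:

```bash
cat > /data1/xinsrv/hadoop-2.7.2/etc/hadoop/fair-scheduler.xml <<'EOF'
<?xml version="1.0"?>
<allocations>
  <queue name="default">
    <minResources>2048 mb,2 vcores</minResources>
    <maxResources>106496 mb,30 vcores</maxResources>
    <weight>1.0</weight>
    <schedulingPolicy>fair</schedulingPolicy>
  </queue>
</allocations>
EOF
```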
5.8. slaves (add the DataNode node IPs)
cat slaves
10.10.40.13
10.10.40.14
10.10.40.15
10.10.40.16
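Once the configuration files in sections 5.3-5.8 are final, they need to be identical on every node. A minimal sketch for pushing the configuration directory out with rsync (assumes SSH access and the hostnames from the hosts file above):

```bash
for host in master2 master3 datanode{1..30}; do
  rsync -av /data1/xinsrv/hadoop-2.7.2/etc/hadoop/ ${host}:/data1/xinsrv/hadoop-2.7.2/etc/hadoop/
done
```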
6. Start the Hadoop cluster
6.1. DataNode environment check
Used mainly to check that the required components and directories exist on a node and that their ownership is correct.
cat hadoop_check.sh
#!/bin/bash
#######################################################
# $Name: hadoop_check.sh
# $Version: v1.0
# $Function: check the environment of a node newly added to the Hadoop cluster
# $Author: liuxiaopeng
# $Create Date: 2018-07-09 18:57:20
# $Description:
#######################################################
# Shell Env
SHELL_NAME=`basename $0`
SHELL_DIR="/tmp"
SHELL_LOG="${SHELL_DIR}/${SHELL_NAME}.log"
LOCK_FILE="/tmp/${SHELL_NAME}.lock"
# mains
#Write Log
shell_log(){
LOG_INFO=$1
echo "$(date "+%Y-%m-%d") $(date "+%H-%M-%S") : ${SHELL_NAME} : ${LOG_INFO}" >> ${SHELL_LOG}
}
red_color(){
CONTENT=$1
echo -e "\033[31m $CONTENT \033[0m"
}
green_color(){
CONTENT=$1
echo -e "\033[32m $CONTENT \033[0m"
}
# Shell Usage
#shell_usage(){
# echo $"Usage: $0 {backup}"
#}
shell_lock(){
touch ${LOCK_FILE}
}
shell_unlock(){
rm -f ${LOCK_FILE}
}
TAG=$(/sbin/ip a l |grep "inet "|awk '{print $2}'|awk -F '/' '{print $1}'|grep -v '127.0.0.1')
check_hosts(){
linename1=`cat /etc/hosts|grep $TAG`
linenum1=`cat /etc/hosts|grep $TAG|wc -l`
if [ $linenum1 -ne 1 ];then
red_color "第一项:HOSTS配置【异常】: $linename1"
else
green_color "第一项:HOSTS配置【正常】"
fi
}
check_slaves(){
linenum2=`cat /data1/xinsrv/hadoop-2.7.2/etc/hadoop/slaves|grep $TAG|wc -l`
if [ $linenum2 -ne 1 ];then
red_color "第二项:slaves配置【异常】:没有配置该IP"
else
green_color "第二项:slaves配置【正常】"
fi
}
check_dir_cache(){
linename3=`ls -ld /data1/cache/hadoop|awk '{print $3}'`
[ -d /data1/cache/hadoop ] && [ $linename3 == "hadoop" ] && linenum3=1
[ -d /data1/cache/hadoop ] && [ $linename3 == "hadoop" ] || linenum3=2
if [ $linenum3 -ne 1 ];then
red_color "第三项:/data1/cache/hadoop目录及属性配置【异常】:目录或者属性异常"
else
green_color "第三项:/data1/cache/hadoop目录及属性配置【正常】"
fi
}
check_dir_run(){
linename4=`ls -ld /var/run/hadoop|awk '{print $3}'`
[ -d /var/run/hadoop ] && [ $linename4 == "hadoop" ] && linenum4=1
[ -d /var/run/hadoop ] && [ $linename4 == "hadoop" ] || linenum4=2
if [ $linenum4 -ne 1 ];then
red_color "第四项:/var/run/hadoop目录及属性配置【异常】:目录或者属性异常"
else
green_color "第四项:/var/run/hadoop目录及属性配置【正常】"
fi
}
check_scala(){
linename5=`which scala`
[ $linename5 == "/data1/xinsrv/scala-2.11.8/bin/scala" ] && linenum5=1
[ $linename5 == "/data1/xinsrv/scala-2.11.8/bin/scala" ] || linenum5=2
if [ $linenum5 -ne 1 ];then
red_color "第五项:scala未部署或者profile配置【异常】"
else
green_color "第五项:scala配置【正常】"
fi
}
check_jdk(){
linename6=$(ls -l $(ls -l `which java`|awk '{print $NF}')|awk '{print $NF}')
[ $linename6 == "/data1/xinsrv/jdk1.8.0_151/bin/java" ] && linenum6=1
[ $linename6 == "/data1/xinsrv/jdk1.8.0_151/bin/java" ] || linenum6=2
if [ $linenum6 -ne 1 ];then
red_color "第六项:JDK未部署或者profile配置【异常】"
else
green_color "第六项:JDK配置【正常】"
fi
}
check_numpy(){
linenum7=`pip2.7 list|grep numpy|wc -l`
if [ $linenum7 -ne 1 ];then
red_color "第七项:python2.7 numpy模块未安装【异常】"
else
green_color "第七项:python2.7 numpy模块【正常】"
fi
}
check_pandas(){
linenum8=`pip2.7 list|grep pandas|wc -l`
if [ $linenum8 -ne 1 ];then
red_color "第八项:python2.7 pandas模块未安装【异常】"
else
green_color "第八项:python2.7 pandas模块【正常】"
fi
}
check_hadoop(){
linename9=`ls -l /data1/xinsrv/hadoop-2.7.2/bin/yarn|awk '{print $3}'`
[ $linename9 == "hadoop" ] && linenum9=1
[ $linename9 == "hadoop" ] || linenum9=2
if [ $linenum9 -ne 1 ];then
red_color "第九项:hadoop-2.7.2未安装或者权限非hadoop【异常】"
else
green_color "第九项:hadoop-2.7.2安装及属性【正常】"
fi
}
check_tez(){
linename10=`ls -l /data1/xinsrv/tez-0.8.4/lib/jetty-6.1.26.jar|awk '{print $3}'`
[ $linename10 == "hadoop" ] && linenum10=1
[ $linename10 == "hadoop" ] || linenum10=2
if [ $linenum10 -ne 1 ];then
red_color "第十项:tez-0.8.4未安装或者权限非hadoop【异常】"
else
green_color "第十项:tez-0.8.4安装及属性【正常】"
fi
}
check_disk(){
linename11=`ls -ld /disk1/cache/hadoop/dfs/data/|awk '{print $3}'`
[ $linename11 == "hadoop" ] && linenum11=1
[ $linename11 == "hadoop" ] || linenum11=2
if [ $linenum11 -ne 1 ];then
red_color "第十一项:/disk1/cache/hadoop/dfs/data/未创建或者权限非hadoop【异常】"
else
green_color "第十一项:/disk1/cache/hadoop/dfs/data/创建及属性【正常】"
fi
}
check_hostname(){
linename12=`cat /etc/hosts|grep $TAG|awk '{print $2}'`
hostname $linename12
hostname=`hostname`
[ $linename12 == $hostname ] && linenum12=1
[ $linename12 == $hostname ] || linenum12=2
if [ $linenum12 -ne 1 ];then
red_color "第十二项:hostname 不为$linename12【异常】"
else
green_color "第十二项:hostname 为$linename12【正常】"
fi
}
check_hdfs_site(){
linename13=`cat /data1/xinsrv/hadoop-2.7.2/etc/hadoop/hdfs-site.xml |grep /disk1/cache/hadoop/dfs/data|wc -l`
[ $linename13 -eq 1 ] && linenum13=1
[ $linename13 -eq 1 ] || linenum13=2
if [ $linenum13 -ne 1 ];then
red_color "第十三项:/data1/xinsrv/hadoop-2.7.2/etc/hadoop/hdfs-site.xml 未配置disk1目录【异常】"
else
green_color "第十三项:/data1/xinsrv/hadoop-2.7.2/etc/hadoop/hdfs-site.xml 配置【正常】"
fi
}
check_disk_total_num(){
linename14=`df -h|grep disk|wc -l`
[ $linename14 -eq 16 -o $linename14 -eq 6 -o $linename14 -eq 12 ] && linenum14=1
[ $linename14 -eq 16 -o $linename14 -eq 6 -o $linename14 -eq 12 ] || linenum14=2
if [ $linenum14 -ne 1 ];then
red_color "第十四项: 主机/disk盘数为:$linename14 【异常】"
else
green_color "第十四项:主机/disk盘数为:$linename14【正常】"
fi
}
main(){
check_hosts
check_slaves
check_dir_cache
check_dir_run
check_scala
check_jdk
check_numpy
check_pandas
check_hadoop
check_tez
check_disk
check_hostname
check_hdfs_site
check_disk_total_num
green_color "如果以上均正常,请在${TAG}主机执行:sudo su - hadoop; /data1/xinsrv/hadoop-2.7.2/sbin/yarn-daemon.sh start nodemanager;/data1/xinsrv/hadoop-2.7.2/sbin/hadoop-daemon.sh start datanode; 检查命令:jps"
}
main $*
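A sketch for running the check on every DataNode from a control host, assuming root SSH access and that hadoop_check.sh has been copied to /tmp on each node:

```bash
for host in datanode{1..30}; do
  echo "== ${host} =="
  ssh ${host} "bash /tmp/hadoop_check.sh"
done
```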
6.2. Start Hadoop
6.2.1. Initialize the NameNode for the cluster
/data1/xinsrv/hadoop-2.7.2/sbin/start-dfs.sh
hadoop namenode -format (only for the very first initialization of the cluster!!!)
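For reference, the initialization order commonly used for a first-time QJM-based HA setup is roughly as follows (run as the hadoop user; adapt hosts and paths to this deployment):

```bash
# 1) Start the JournalNodes on 10.10.40.10-12:
/data1/xinsrv/hadoop-2.7.2/sbin/hadoop-daemon.sh start journalnode
# 2) Format and start the first NameNode (10.10.40.10):
hdfs namenode -format
/data1/xinsrv/hadoop-2.7.2/sbin/hadoop-daemon.sh start namenode
# 3) Copy the metadata to the second NameNode (10.10.40.11):
hdfs namenode -bootstrapStandby
# 4) Initialize the HA state in ZooKeeper, then start HDFS:
hdfs zkfc -formatZK
/data1/xinsrv/hadoop-2.7.2/sbin/start-dfs.sh
```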
6.2.2. Start the cluster services
/data1/xinsrv/hadoop-2.7.2/sbin/start-all.sh
6.2.3. FATAL org.apache.hadoop.ha.ZKFailoverController: Unable to start failover controller. Parent znode does not exist.
hdfs zkfc -formatZK
6.2.4. The cluster reports: Incompatible namespaceID for journal Storage Directory
Fix the VERSION file on the other NameNode
Delete the data on the DataNode nodes:
for n in /disk1/cache/hadoop/dfs/data/ /disk2/cache/hadoop/dfs/data/ /disk3/cache/hadoop/dfs/data/ /disk4/cache/hadoop/dfs/data/ /disk5/cache/hadoop/dfs/data/ /disk6/cache/hadoop/dfs/data/ /disk7/cache/hadoop/dfs/data/ /disk8/cache/hadoop/dfs/data/ /disk9/cache/hadoop/dfs/data/ /disk10/cache/hadoop/dfs/data/ /disk11/cache/hadoop/dfs/data/ /disk12/cache/hadoop/dfs/data/;do cd $n && rm -rf ./*;done
Start again: /data1/xinsrv/hadoop-2.7.2/sbin/start-all.sh
7. Post-start service verification
Add the Hadoop cluster hosts entries on your local machine:
1. Hadoop cluster status: http://10.10.40.11:50070/dfshealth.html#tab-overview
2. YARN: http://master1:8088/cluster/scheduler
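Command-line checks that complement the web UIs above (run as the hadoop user on a cluster node):

```bash
hdfs dfsadmin -report | head -n 20   # live DataNodes and capacity
yarn rmadmin -getServiceState rm1    # ResourceManager HA state (active/standby)
yarn node -list                      # registered NodeManagers
```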