Hadoop on Docker (1): Installing an Experimental Hadoop Cluster (a Bit of a Pain)

I. Environment preparation

1.1. Machine plan

Hostname       Alias     IP            Roles
9321a27a2b91   hadoop1   172.17.0.10   NN1 ZK RM
7c3a3c9cd595   hadoop2   172.17.0.9    NN2 ZK RM JOBHIS
f89eaf2a2548   hadoop3   172.17.0.8    DN ZK NM
28620eee1426   hadoop4   172.17.0.7    DN QJM1 NM
ae1f06bd04c8   hadoop5   172.17.0.6    DN QJM2 NM
11c433a003b6   hadoop6   172.17.0.5    DN QJM3 NM

1.2. Users and groups

User        Group    Purpose
hdfs        hadoop   manages HDFS
yarn        hadoop   manages YARN
zookeeper   hadoop   manages ZooKeeper
hive        hadoop   manages Hive
hbase       hadoop   manages HBase

Script:
  groupadd hadoop
  useradd -g hadoop hdfs
  passwd hdfs <<EOF
  hdfs
  hdfs
  EOF
  useradd -g hadoop yarn
  passwd yarn <<EOF
  yarn
  yarn
  EOF
  useradd -g hadoop zookeeper
  passwd zookeeper <<EOF
  zookeeper
  zookeeper
  EOF
  useradd -g hadoop hive
  passwd hive <<EOF
  hive
  hive
  EOF
  useradd -g hadoop hbase
  passwd hbase <<EOF
  hbase
  hbase
  EOF
  echo user added!
 
 

1.3. Edit /etc/hosts

Add the IP, hostname, and alias of every node (the IPs follow the table in 1.1):
  echo "127.0.0.1 localhost" > /etc/hosts
  echo "172.17.0.10 9321a27a2b91 hadoop1" >> /etc/hosts
  echo "172.17.0.9 7c3a3c9cd595 hadoop2" >> /etc/hosts
  echo "172.17.0.8 f89eaf2a2548 hadoop3" >> /etc/hosts
  echo "172.17.0.7 28620eee1426 hadoop4" >> /etc/hosts
  echo "172.17.0.6 ae1f06bd04c8 hadoop5" >> /etc/hosts
  echo "172.17.0.5 11c433a003b6 hadoop6" >> /etc/hosts
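
The same entries can be generated from the planning table in 1.1 instead of being typed by hand. This is only a minimal sketch: it assumes the table is kept as whitespace-separated "IP hostname alias" rows, and the function name render_hosts is made up for illustration. It writes to stdout, so nothing is overwritten until you redirect the output yourself.

```shell
#!/bin/sh
# Hypothetical helper: turn "IP hostname alias" rows on stdin into /etc/hosts lines.
render_hosts() {
  echo "127.0.0.1 localhost"
  # Keep only rows with at least three fields; this also skips blank lines.
  awk 'NF >= 3 && $1 !~ /^#/ { print $1, $2, $3 }'
}

# Planning table from section 1.1 (IP, container hostname, alias).
render_hosts <<'EOF'
172.17.0.10 9321a27a2b91 hadoop1
172.17.0.9  7c3a3c9cd595 hadoop2
172.17.0.8  f89eaf2a2548 hadoop3
172.17.0.7  28620eee1426 hadoop4
172.17.0.6  ae1f06bd04c8 hadoop5
172.17.0.5  11c433a003b6 hadoop6
EOF
```

Once the output looks right, it could be redirected to /etc/hosts as root on each node.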

1.4. Passwordless SSH

Run on every machine:

  su hdfs
  # generate a key pair first if ~/.ssh/id_rsa.pub does not exist yet
  ssh-keygen -t rsa
  ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.5
  ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.6
  ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.7
  ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.8
  ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.9
  ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.10
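
The commands above differ only in the target address, so they can be produced by a loop. The sketch below deliberately prints the commands instead of running them, so it can be inspected first; the function name ssh_copy_cmds is an illustrative assumption, and the target list is simply the six addresses from the machine plan.

```shell
#!/bin/sh
# Print one ssh-copy-id command per target host.
# First argument: public key path; remaining arguments: targets.
ssh_copy_cmds() {
  key=$1
  shift
  for host in "$@"; do
    printf 'ssh-copy-id -i %s %s\n' "$key" "$host"
  done
}

# Targets taken from the machine plan in 1.1.
ssh_copy_cmds '~/.ssh/id_rsa.pub' \
  172.17.0.10 172.17.0.9 172.17.0.8 172.17.0.7 172.17.0.6 172.17.0.5
```

Piping the output to sh would execute the commands. Hostnames or aliases can be passed in place of the IPs.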
 

1.5. Adjust ulimit

First check ulimit -a as the hdfs, yarn, hive, etc. users. If -n, -m, -l, -u and so on do not meet the requirements, edit /etc/security/limits.conf.
  [hdfs@9321a27a2b91 root]$ ulimit -a
  core file size (blocks, -c) unlimited
  data seg size (kbytes, -d) unlimited
  scheduling priority (-e) 0
  file size (blocks, -f) unlimited
  pending signals (-i) 95612
  max locked memory (kbytes, -l) 64
  max memory size (kbytes, -m) unlimited
  open files (-n) 65536
  pipe size (512 bytes, -p) 8
  POSIX message queues (bytes, -q) 819200
  real-time priority (-r) 0
  stack size (kbytes, -s) 8192
  cpu time (seconds, -t) unlimited
  max user processes (-u) 1024
  virtual memory (kbytes, -v) unlimited
  file locks (-x) unlimited
If the limits are insufficient, add to /etc/security/limits.conf:
  hdfs hard nofile 65536
  hdfs soft nofile 65536
  yarn hard nofile 65536
  yarn soft nofile 65536
  ......
nofile is the maximum number of open files; nproc and other items can be set the same way.
Note: no changes were made on these test machines.
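
The check can be scripted so it is easy to repeat for each user. A small sketch, assuming 65536 open files as the target (the value used in the limits.conf example); the helper name limit_ok is hypothetical, and "unlimited" is treated as always sufficient.

```shell
#!/bin/sh
# Return 0 if the current limit satisfies the required minimum.
# The literal value "unlimited" satisfies any requirement.
limit_ok() {
  current=$1
  required=$2
  if [ "$current" = "unlimited" ]; then
    return 0
  fi
  [ "$current" -ge "$required" ]
}

# Compare this shell's open-file limit with the example threshold.
have=$(ulimit -n)
if limit_ok "$have" 65536; then
  echo "open files: $have (ok)"
else
  echo "open files: $have (below 65536, consider editing /etc/security/limits.conf)"
fi
```

Run it after su to each service user, since limits are applied per user session.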
 

1.6. Disable the firewall

  service iptables stop

1.7. Disable SELinux

Permanent: set SELINUX=disabled in /etc/selinux/config.
Temporary: run setenforce 0.
Since a Docker container cannot be rebooted, the second method is used here:
  setenforce 0
 
 

II. Software preparation

2.1. Install the JDK

On the hadoop1 node:
Download the latest JDK 8 tarball (8u121 here) from the official site and unpack it under /usr/local/java:
  [root@9321a27a2b91 ~]# mkdir /usr/local/java
  [root@9321a27a2b91 ~]# cp jdk-8u121-linux-x64.tar.gz /usr/local/java/
  [root@9321a27a2b91 ~]# chown -R hdfs:hadoop /usr/local/java/
  [root@9321a27a2b91 ~]# su hdfs
  [hdfs@9321a27a2b91 root]$ cd /usr/local/java/
  [hdfs@9321a27a2b91 java]$ tar -zxvf jdk-8u121-linux-x64.tar.gz
Repeat this on every node.
Alternatively, after finishing on one node, scp the files to the other nodes:
  mkdir /usr/local/java
  chown hdfs:hadoop /usr/local/java
  su hdfs
  scp -r hdfs@hadoop1:/usr/local/java/jdk1.8.0_121 /usr/local/java
 

2.2. Hadoop package

On the hadoop1 node:
Download Hadoop 2.7.3 from the official site and unpack it under /opt/hadoop:
  [root@9321a27a2b91 ~]# mkdir /opt/hadoop
  [root@9321a27a2b91 ~]# chown hdfs:hadoop hadoop-2.7.3.tar.gz
  [root@9321a27a2b91 ~]# chown hdfs:hadoop /opt/hadoop
  [root@9321a27a2b91 ~]# cp hadoop-2.7.3.tar.gz /opt/hadoop
  [root@9321a27a2b91 ~]# su hdfs
  [hdfs@9321a27a2b91 root]$ cd /opt/hadoop/
  [hdfs@9321a27a2b91 hadoop]$ tar -zxvf hadoop-2.7.3.tar.gz
 

2.3. NTP service

hadoop1 is the NTP server; the other nodes are NTP clients.
1) On hadoop1, as root:
If ntp is not installed, install it with yum first:
  yum -y install ntp
Edit the NTP configuration file /etc/ntp.conf and add:
  # hosts in this subnet may synchronize
  restrict 172.17.0.0 mask 255.255.0.0 nomodify
  # preferred time server
  server 172.17.0.10 prefer
  # log file location
  logfile /var/log/ntp.log
Then start ntpd:
  [root@9321a27a2b91 hadoop]# service ntpd start
  Starting ntpd: [ OK ]
  [root@9321a27a2b91 hadoop]# service ntpd status
  ntpd dead but pid file exists
ntpd has stopped, so look at /var/log/ntp.log:
  3 Apr 11:20:08 ntpd[732]: ntp_io: estimated max descriptors: 65536, initial socket boundary: 16
  3 Apr 11:20:08 ntpd[732]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
  3 Apr 11:20:08 ntpd[732]: Listen and drop on 1 v6wildcard :: UDP 123
  3 Apr 11:20:08 ntpd[732]: Listen normally on 2 lo 127.0.0.1 UDP 123
  3 Apr 11:20:08 ntpd[732]: Listen normally on 3 eth0 172.17.0.10 UDP 123
  3 Apr 11:20:08 ntpd[732]: Listen normally on 4 lo ::1 UDP 123
  3 Apr 11:20:08 ntpd[732]: Listen normally on 5 eth0 fe80::42:acff:fe11:a UDP 123
  3 Apr 11:20:08 ntpd[732]: Listening on routing socket on fd #22 for interface updates
  3 Apr 11:20:08 ntpd[732]: 0.0.0.0 c016 06 restart
  3 Apr 11:20:08 ntpd[732]: ntp_adjtime() failed: Operation not permitted
  3 Apr 11:20:08 ntpd[732]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
  3 Apr 11:20:08 ntpd[732]: 0.0.0.0 c011 01 freq_not_set
  3 Apr 11:20:08 ntpd[732]: cap_set_proc() failed to drop root privileges: Operation not permitted
A search on Baidu turned up this explanation:
Cause: possibly a bug in the Linux kernel used by the virtual machine prevents ntpd from dropping root (that is, from running as a non-root user). In a Docker container the more probable cause is that the container does not have the CAP_SYS_TIME capability by default, which would account for both "Operation not permitted" lines. Recompiling the kernel is out of the question, so edit /etc/sysconfig/ntpd instead.
Comment out this line:
  OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"
for example by rewriting the file:
  echo "# Drop root to id 'ntp:ntp' by default." > /etc/sysconfig/ntpd
  echo '#OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"' >> /etc/sysconfig/ntpd
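
Overwriting the whole file works here because it only has two lines. If the file might contain other settings, commenting out just the OPTIONS line is safer. A sketch under that assumption; the function name comment_out_options is made up, and the demo runs on a scratch file rather than the real /etc/sysconfig/ntpd.

```shell
#!/bin/sh
# Comment out every active OPTIONS= line in the given file,
# leaving all other lines untouched.
comment_out_options() {
  tmp=$(mktemp) &&
  sed 's/^OPTIONS=/#OPTIONS=/' "$1" > "$tmp" &&
  cat "$tmp" > "$1" &&   # write back in place so ownership and mode are kept
  rm -f "$tmp"
}

# Demonstrate on a scratch copy instead of the real configuration file.
demo=$(mktemp)
printf '%s\n' "# Drop root to id 'ntp:ntp' by default." \
  'OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"' > "$demo"
comment_out_options "$demo"
cat "$demo"
rm -f "$demo"
```

On the real node the call would be comment_out_options /etc/sysconfig/ntpd, run as root.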
 
Then start ntpd again:
  [root@9321a27a2b91 hadoop]# service ntpd start
  Starting ntpd: [ OK ]
  [root@9321a27a2b91 hadoop]# service ntpd status
  ntpd (pid 796) is running..

2) On the other nodes:
Edit /etc/ntp.conf and add:
  server 172.17.0.10 prefer
 
 

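2.4. MySQL database

The MySQL database is for Hive. For convenience, install it directly with yum; downloading the package and installing it manually also works.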
 
 

III. Installing Hadoop and its components

3.1 Install HDFS and YARN

3.1.1 Set environment variables in .bash_profile
On hadoop1, edit /home/hdfs/.bash_profile:
  su hdfs
  vi ~/.bash_profile
and add the following:
  JAVA_HOME=/usr/local/java/jdk1.8.0_121
  CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib:$CLASS
  HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
  HADOOP_PREFIX=/opt/hadoop/hadoop-2.7.3
  HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
  HADOOP_YARN_HOME=$HADOOP_HOME
  LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server
  PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
  # export every variable so that child processes (the Hadoop scripts) can see it
  export JAVA_HOME CLASSPATH HADOOP_HOME HADOOP_PREFIX HADOOP_CONF_DIR HADOOP_YARN_HOME LD_LIBRARY_PATH PATH
Copy it to all machines:
  su hdfs
  scp -r hdfs@hadoop1:/home/hdfs/.bash_profile ~
 
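3.1.2 Configure the environment files read at Hadoop startup (xxx-env.sh)
On hadoop1:
3.1.2.1 hadoop-env.sh
This file mainly holds the JVM memory parameters, environment variables, and so on:
  export JAVA_HOME=/usr/local/java/jdk1.8.0_121
  export HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
  # maximum heap size for Hadoop daemons (namenode/datanode/secondarynamenode etc.), default 1000 MB
  #export HADOOP_HEAPSIZE=
  # initial heap size of the namenode, defaults to the value above; allocate as needed
  #export HADOOP_NAMENODE_INIT_HEAPSIZE=""
  # JVM startup options, empty by default
  export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
  # memory can also be configured per component:
  #export HADOOP_NAMENODE_OPTS=
  #export HADOOP_DATANODE_OPTS=
  #export HADOOP_SECONDARYNAMENODE_OPTS=
  # Hadoop log directory, default $HADOOP_HOME/logs
  export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
Set each parameter according to your own capacity plan. Note that the namenode's block map and namespace both live in its heap, so a production namenode needs a large heap size.
Also watch the total memory used by all components: in production, leave 5-15% of memory (typically about 10 GB) for the Linux system.

None of these parameters are set here; allocate them as needed.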
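
The "leave 5-15% for the OS" rule is plain arithmetic, sketched below. The node size (65536 MB) and the three 1000 MB heaps are made-up example figures, and both function names are hypothetical.

```shell
#!/bin/sh
# Memory (MB) left for Hadoop daemons after reserving a percentage for the OS.
hadoop_budget_mb() {
  total_mb=$1
  reserve_pct=$2
  echo $(( total_mb * (100 - reserve_pct) / 100 ))
}

# Succeed if the sum of the given heap sizes (MB) fits inside the budget (MB).
heaps_fit() {
  budget=$1
  shift
  sum=0
  for h in "$@"; do
    sum=$(( sum + h ))
  done
  [ "$sum" -le "$budget" ]
}

# Example: a 64 GB node (65536 MB) reserving 10% for Linux.
budget=$(hadoop_budget_mb 65536 10)
echo "budget: ${budget} MB"
if heaps_fit "$budget" 1000 1000 1000; then
  echo "three 1000 MB daemons fit"
fi
```

Integer division rounds down, which errs on the side of leaving slightly more for the operating system.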
 
3.1.2.2 yarn-env.sh
  export JAVA_HOME=/usr/local/java/jdk1.8.0_121
  JAVA_HEAP_MAX=-Xmx1000m
  # YARN_HEAPSIZE=1000                         # heap size of the YARN daemons
  #export YARN_RESOURCEMANAGER_HEAPSIZE=1000   # heap size of the ResourceManager only
  #export YARN_TIMELINESERVER_HEAPSIZE=1000    # heap size of the Timeline Server only
  #export YARN_RESOURCEMANAGER_OPTS=           # JVM options for the ResourceManager only
  #export YARN_NODEMANAGER_HEAPSIZE=1000       # heap size of the NodeManager only
  #export YARN_NODEMANAGER_OPTS=               # JVM options for the NodeManager only
As with hadoop-env.sh, allocate as needed.
 
3.1.3 Edit the Hadoop configuration files
On hadoop1:
3.1.3.1 Edit core-site.xml
  <configuration>
    <property>
      <name>fs.defaultFS</name>
      <value>hdfs://hadoop1:9000</value>
      <description>HDFS address and port</description>
    </property>
    <property>
      <name>io.file.buffer.size</name>
      <value>131072</value>
    </property>
    <property>
      <name>hadoop.tmp.dir</name>
      <value>/opt/hadoop/hadoop-2.7.3/tmp</value>
      <description>Default is /tmp/hadoop-${user.name}; changed to a persistent directory</description>
    </property>
  </configuration>
Create the directory as the hdfs user:

  mkdir ${HADOOP_HOME}/tmp
 
3.1.3.2 hdfs-site.xml
  <configuration>
    <property>
      <name>dfs.replication</name>
      <value>3</value>
      <description>Number of replicas of each data block</description>
    </property>
    <property>
      <name>dfs.namenode.name.dir</name>
      <value>/opt/hadoop/hadoop-2.7.3/namenodedir</value>
      <description>Directory holding the namenode metadata; create it yourself</description>
    </property>
    <property>
      <name>dfs.blocksize</name>
      <value>134217728</value>
      <description>Block size, 128 MB</description>
    </property>
    <property>
      <name>dfs.datanode.data.dir</name>
      <value>/opt/hadoop/hadoop-2.7.3/datadir</value>
      <description>Datanode data directory</description>
    </property>
  </configuration>
Create the directories as the hdfs user:
  mkdir ${HADOOP_HOME}/datadir
  mkdir ${HADOOP_HOME}/namenodedir
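
To get a feel for what dfs.blocksize means in practice: 134217728 bytes is 128 * 1024 * 1024, and a file occupies ceil(size / blocksize) blocks. A small sketch of that arithmetic; the function name blocks_for and the two file sizes are illustrative only.

```shell
#!/bin/sh
# Number of HDFS blocks needed for a file: ceil(size / blocksize),
# computed with integer arithmetic.
blocks_for() {
  size=$1
  blocksize=$2
  echo $(( (size + blocksize - 1) / blocksize ))
}

BLOCKSIZE=$(( 128 * 1024 * 1024 ))   # 134217728, the dfs.blocksize value

blocks_for 14978 "$BLOCKSIZE"                      # a small 14978-byte file
blocks_for $(( 300 * 1024 * 1024 )) "$BLOCKSIZE"   # a 300 MiB file
```

With dfs.replication set to 3, each of those blocks is then stored three times across the datanodes.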
 
3.1.3.3 mapred-site.xml
MapReduce job settings:
Parameter | Value | Notes
mapreduce.framework.name | yarn | Execution framework set to Hadoop YARN.
mapreduce.map.memory.mb | 1536 | Larger resource limit for maps (memory ceiling of a map task).
mapreduce.map.java.opts | -Xmx1024M | Larger heap-size for child JVMs of maps.
mapreduce.reduce.memory.mb | 3072 | Larger resource limit for reduces (memory ceiling of a reduce task).
mapreduce.reduce.java.opts | -Xmx2560M | Larger heap-size for child JVMs of reduces.
mapreduce.task.io.sort.mb | 512 | Higher memory-limit while sorting data for efficiency.
mapreduce.task.io.sort.factor | 100 | More streams merged at once while sorting files.
mapreduce.reduce.shuffle.parallelcopies | 50 | Higher number of parallel copies run by reduces to fetch outputs from very large number of maps.
JobHistory Server:
Parameter | Value | Notes
mapreduce.jobhistory.address | MapReduce JobHistory Server host:port | Default port is 10020.
mapreduce.jobhistory.webapp.address | MapReduce JobHistory Server Web UI host:port | Default port is 19888.
mapreduce.jobhistory.intermediate-done-dir | /mr-history/tmp | Directory where history files are written by MapReduce jobs.
mapreduce.jobhistory.done-dir | /mr-history/done | Directory where history files are managed by the MR JobHistory Server.
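
In the table, each -Xmx value is smaller than the matching memory.mb value (1024 vs 1536, 2560 vs 3072). That gap is needed because the JVM uses memory beyond its heap. The sketch below checks the relationship; the helper names xmx_mb and heap_fits are assumptions for illustration, and only the m/M and g/G suffixes are handled.

```shell
#!/bin/sh
# Extract the heap size in MB from a JVM option such as -Xmx1024M or -Xmx2g.
xmx_mb() {
  case $1 in
    -Xmx*[mM]) v=${1#-Xmx}; echo "${v%[mM]}" ;;
    -Xmx*[gG]) v=${1#-Xmx}; v=${v%[gG]}; echo $(( v * 1024 )) ;;
    *) return 1 ;;
  esac
}

# Succeed when the heap (second argument, an -Xmx option) is strictly
# smaller than the container limit in MB (first argument).
heap_fits() {
  heap=$(xmx_mb "$2") || return 1
  [ "$heap" -lt "$1" ]
}

# The value pairs from the table above.
if heap_fits 1536 -Xmx1024M; then echo "map: heap fits"; fi
if heap_fits 3072 -Xmx2560M; then echo "reduce: heap fits"; fi
```

A pair that fails this check would probably lead to containers being killed for exceeding their memory limit.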
 
Only the following parameters are configured here:
  <configuration>
    <property>
      <name>mapreduce.framework.name</name>
      <value>yarn</value>
      <description>Use YARN to manage MapReduce</description>
    </property>
    <property>
      <name>mapreduce.jobhistory.address</name>
      <value>hadoop2</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.webapp.address</name>
      <value>hadoop2</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.intermediate-done-dir</name>
      <value>/opt/hadoop/hadoop-2.7.3/mrHtmp</value>
    </property>
    <property>
      <name>mapreduce.jobhistory.done-dir</name>
      <value>/opt/hadoop/hadoop-2.7.3/mrhHdone</value>
    </property>
  </configuration>
Create the directories on hadoop2:
  mkdir ${HADOOP_HOME}/mrHtmp
  mkdir ${HADOOP_HOME}/mrhHdone
Note: paths without a scheme are normally resolved against fs.defaultFS (HDFS here), so these local directories may go unused.
 
3.1.3.4 yarn-site.xml
yarn-site.xml has many parameters, and most of them have defaults; see the official documentation.
These are the more important ones:
ResourceManager configuration:
Parameter | Value | Notes
yarn.resourcemanager.address | ResourceManager host:port for clients to submit jobs. | host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.scheduler.address | ResourceManager host:port for ApplicationMasters to talk to Scheduler to obtain resources. | host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.resource-tracker.address | ResourceManager host:port for NodeManagers. | host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname. This is where NodeManagers report to the RM.
yarn.resourcemanager.admin.address | ResourceManager host:port for administrative commands. | host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname.
yarn.resourcemanager.webapp.address | ResourceManager web-ui host:port. | host:port. If set, overrides the hostname set in yarn.resourcemanager.hostname. Has a default value.
yarn.resourcemanager.hostname | ResourceManager host. | host. Single hostname that can be set in place of setting all yarn.resourcemanager*address resources. Results in default ports for ResourceManager components.
yarn.resourcemanager.scheduler.class | ResourceManager Scheduler class. | CapacityScheduler (recommended), FairScheduler (also recommended), or FifoScheduler.
yarn.scheduler.minimum-allocation-mb | Minimum limit of memory to allocate to each container request at the Resource Manager. | In MBs.
yarn.scheduler.maximum-allocation-mb | Maximum limit of memory to allocate to each container request at the Resource Manager. | In MBs.
yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path | List of permitted/excluded NodeManagers. | If necessary, use these files to control the list of allowable NodeManagers.

NodeManager configuration:
Parameter | Value | Notes
yarn.nodemanager.resource.memory-mb | Resource i.e. available physical memory, in MB, for given NodeManager. | Defines total available resources on the NodeManager to be made available to running containers.
yarn.nodemanager.vmem-pmem-ratio | Maximum ratio by which virtual memory usage of tasks may exceed physical memory. | The virtual memory usage of each task may exceed its physical memory limit by this ratio. The total amount of virtual memory used by tasks on the NodeManager may exceed its physical memory usage by this ratio. A task that goes beyond the ratio is killed.
yarn.nodemanager.local-dirs | Comma-separated list of paths on the local filesystem where intermediate data is written. | Multiple paths help spread disk i/o.
yarn.nodemanager.log-dirs | Comma-separated list of paths on the local filesystem where logs are written. | Multiple paths help spread disk i/o.
yarn.nodemanager.log.retain-seconds | 10800 | Default time (in seconds) to retain log files on the NodeManager. Only applicable if log-aggregation is disabled.
yarn.nodemanager.remote-app-log-dir | /logs | HDFS directory where the application logs are moved on application completion. Need to set appropriate permissions. Only applicable if log-aggregation is enabled.
yarn.nodemanager.remote-app-log-dir-suffix | logs | Suffix appended to the remote log dir. Logs will be aggregated to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}. Only applicable if log-aggregation is enabled.
yarn.nodemanager.aux-services | mapreduce_shuffle | Shuffle service that needs to be set for Map Reduce applications.

YARN ACL configuration:
Parameter | Value | Notes
yarn.acl.enable | true / false | Enable ACLs? Defaults to false.
yarn.admin.acl | Admin ACL | ACL to set admins on the cluster. ACLs are of the form comma-separated-users, a space, then comma-separated-groups. Defaults to the special value * which means anyone. The special value of just a space means no one has access. Example user list: root,yarn
yarn.log-aggregation-enable | false | Configuration to enable or disable log aggregation (collecting the logs in one place).
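
The effect of yarn.nodemanager.vmem-pmem-ratio is easiest to see as a number: the virtual memory ceiling of a container is its physical memory limit times the ratio. A rough sketch; the function name vmem_limit_mb is made up, and 2.1 is used only as an example ratio (as far as I know it is the shipped default in Hadoop 2.x, but check your own yarn-default.xml).

```shell
#!/bin/sh
# Virtual-memory ceiling (MB, truncated to an integer) for a container:
# physical memory limit in MB times the vmem-pmem ratio.
vmem_limit_mb() {
  awk -v p="$1" -v r="$2" 'BEGIN { printf "%d\n", p * r }'
}

# A 1536 MB map container with an assumed ratio of 2.1.
vmem_limit_mb 1536 2.1
```

awk is used because plain shell arithmetic cannot multiply by a fractional ratio.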
 
This experiment only sets yarn.resourcemanager.hostname and yarn.nodemanager.aux-services:
  <configuration>
    <!-- Site specific YARN configuration properties -->
    <property>
      <name>yarn.resourcemanager.hostname</name>
      <value>hadoop1</value>
      <description>The resourcemanager node</description>
    </property>
    <property>
      <name>yarn.nodemanager.aux-services</name>
      <value>mapreduce_shuffle</value>
      <description>The aux service of the nodemanager</description>
    </property>
  </configuration>
 
3.1.4 Set up the slaves file
On hadoop1, add the hostnames of the datanode/nodemanager machines to ${HADOOP_HOME}/etc/hadoop/slaves:
  vi $HADOOP_HOME/etc/hadoop/slaves
with this content:
  hadoop3
  hadoop4
  hadoop5
  hadoop6
 
3.1.5 Start Hadoop
3.1.5.1 Copy the Hadoop configuration to the other machines
Run on each of the other machines:
  mkdir /opt/hadoop
  chown hdfs:hadoop /opt/hadoop
  su hdfs
  scp -r hdfs@hadoop1:/opt/hadoop/hadoop-2.7.3 /opt/hadoop

3.1.5.2 Format the namenode
  $HADOOP_HOME/bin/hdfs namenode -format
 
3.1.5.3 Start HDFS
On one machine (hadoop1 here):
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ start-dfs.sh
  Starting namenodes on [hadoop1]
  hadoop1: starting namenode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-namenode-9321a27a2b91.out
  hadoop3: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-f89eaf2a2548.out
  hadoop4: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-28620eee1426.out
  hadoop5: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-ae1f06bd04c8.out
  hadoop6: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-11c433a003b6.out
  Starting secondary namenodes [0.0.0.0]
  0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-secondarynamenode-9321a27a2b91.out
Run jps to check the processes:
  [hdfs@9321a27a2b91 hadoop]$ jps
  11105 Jps
  10981 SecondaryNameNode
  10777 NameNode
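
Reading jps output by eye gets tedious with six nodes. A small sketch that reports which expected daemons are absent; the function name missing_daemons is hypothetical, and the sample input is the jps listing shown above, so the block runs without a live cluster.

```shell
#!/bin/sh
# Read jps output ("PID Name" per line) on stdin and print each expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
  listing=$(cat)
  for d in "$@"; do
    printf '%s\n' "$listing" |
      awk -v d="$d" '$2 == d { found = 1 } END { exit !found }' ||
      echo "$d"
  done
}

# Sample input copied from the jps listing; ResourceManager is not running yet.
missing_daemons NameNode SecondaryNameNode ResourceManager <<'EOF'
11105 Jps
10981 SecondaryNameNode
10777 NameNode
EOF
```

On a live node the call would be jps | missing_daemons NameNode SecondaryNameNode, with the daemon list adjusted to that node's roles.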
 
Verify HDFS:
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ hdfs dfs -put NOTICE.txt /
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ hdfs dfs -ls /
  Found 1 items
  -rw-r--r-- 3 hdfs supergroup 14978 2017-04-03 19:15 /NOTICE.txt
Check the HDFS web page:
  [root@9321a27a2b91 hdfs]# curl hadoop1:50070
  <!--
  Licensed to the Apache Software Foundation (ASF) under one or more
  contributor license agreements. See the NOTICE file distributed with
  this work for additional information regarding copyright ownership.
  ................
Of course, with port mapping in place you can view the web UI in a browser!
For how to set up port mapping, see my other post <docker iptables端口映射> (Docker iptables port mapping).
 
 
 
3.1.5.4 Start YARN
I wanted to start YARN as the yarn user, but that means setting up SSH and the environment variables for yarn as well, which is a hassle, so YARN is started as the hdfs user.
  [hdfs@9321a27a2b91 hadoop]$ start-yarn.sh
  starting yarn daemons
  starting resourcemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-resourcemanager-9321a27a2b91.out
  hadoop5: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-ae1f06bd04c8.out
  hadoop6: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-11c433a003b6.out
  hadoop3: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-f89eaf2a2548.out
  hadoop4: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-28620eee1426.out
Run jps to check the processes:
  [hdfs@9321a27a2b91 hadoop]$ jps
  11105 Jps
  10981 SecondaryNameNode
  10777 NameNode
  10383 ResourceManager

Check the YARN page:
curl returned no data, but the page is visible through the web port:
 
3.1.5.5 Test the cluster configuration
Use the Hadoop example program to check that the cluster works:
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -mkdir /user
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -mkdir /user/hdfs
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -put etc/hadoop input
  ...............
  17/04/12 12:38:24 INFO mapreduce.JobSubmitter: number of splits:30
  17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491968887469_0003
  17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hdfs/.staging/job_1491968887469_0003
  java.lang.IllegalArgumentException: Does not contain a valid host:port authority: hadoop2
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
  ........................
Whoa, an error!
The error message:
it cannot find the host hadoop2!
What, a host called hadoop2? Shouldn't it be 7c3a3c9cd595? Damn. Quickly change every hostname starting with hadoop to the real Docker hostname:
core-site.xml:
<value>hdfs://hadoop1:9000</value> becomes <value>hdfs://9321a27a2b91:9000</value>
yarn-site.xml:
<value>hadoop1</value> becomes <value>9321a27a2b91</value>
mapred-site.xml:
<value>hadoop2</value> becomes <value>7c3a3c9cd595</value>
slaves file:
hadoop3
hadoop4
hadoop5
hadoop6
becomes:
f89eaf2a2548
28620eee1426
ae1f06bd04c8
11c433a003b6
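
Swapping aliases for container hostnames by hand is error-prone with twelve-character IDs. A sketch of doing the lookup from hosts-format rows instead; the function name alias_to_hostname is hypothetical, and the rows are the four slave entries from the machine plan.

```shell
#!/bin/sh
# Hypothetical helper: print the container hostname (column 2) for an alias
# (column 3) found in hosts-format text on stdin; fail if the alias is absent.
alias_to_hostname() {
  awk -v a="$1" '$3 == a { print $2; found = 1; exit } END { exit !found }'
}

# Rows copied from the planning table: IP, container hostname, alias.
HOSTS_TABLE='172.17.0.8 f89eaf2a2548 hadoop3
172.17.0.7 28620eee1426 hadoop4
172.17.0.6 ae1f06bd04c8 hadoop5
172.17.0.5 11c433a003b6 hadoop6'

# Turn an alias-based slaves list into container hostnames.
for name in hadoop3 hadoop4 hadoop5 hadoop6; do
  printf '%s\n' "$HOSTS_TABLE" | alias_to_hostname "$name"
done
```

Redirecting the loop's output to a file would give a ready-made slaves file.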
 
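Stop YARN and HDFS, then scp the configuration files under etc/hadoop to every node:
  scp -r hdfs@hadoop1:/opt/hadoop/hadoop-2.7.3/etc/hadoop/* /opt/hadoop/hadoop-2.7.3/etc/hadoop/
Start HDFS again:
  [hdfs@9321a27a2b91 hadoop]$ start-dfs.sh
  Starting namenodes on [9321a27a2b91]
  The authenticity of host '9321a27a2b91 (172.17.0.10)' can't be established.
  RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
  Are you sure you want to continue connecting (yes/no)?
Damn, it wants SSH set up all over again! Passwordless SSH is apparently tied to the literal host name, and 9321a27a2b91 and hadoop1 turn out not to be interchangeable! I am speechless. (Strictly speaking this prompt is host-key verification rather than a password request: known_hosts entries are keyed by the name or IP used to connect, and the container hostnames had apparently never been used for a connection before.)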
 
3.1.5.6 Start the HDFS daemons on each node
Never mind, start HDFS separately on each node.
Start the namenode on 9321a27a2b91 (hadoop1):
  $HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
Start a datanode on these nodes:
f89eaf2a2548 hadoop3
28620eee1426 hadoop4
ae1f06bd04c8 hadoop5
11c433a003b6 hadoop6
  [hdfs@11c433a003b6 hadoop-2.7.3]$ $HADOOP_HOME/sbin/hadoop-daemons.sh start datanode
  The authenticity of host '28620eee1426 (172.17.0.7)' can't be established.
  RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
  Are you sure you want to continue connecting (yes/no)? The authenticity of host '11c433a003b6 (172.17.0.5)' can't be established.
  RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
  Are you sure you want to continue connecting (yes/no)? The authenticity of host 'ae1f06bd04c8 (172.17.0.6)' can't be established.
  RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
  Are you sure you want to continue connecting (yes/no)? f89eaf2a2548: datanode running as process 5764. Stop it first.
Damn, running it on one node tries to start all the other nodes too! Fine, hadoop-daemons.sh must be reading the slaves file. So on the datanode, remove the slaves file (rm $HADOOP_HOME/etc/hadoop/slaves) and start the datanode again:
  [hdfs@11c433a003b6 hadoop-2.7.3]$ $HADOOP_HOME/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
  cat: /opt/hadoop/hadoop-2.7.3/etc/hadoop/slaves: No such file or directory
Still an error. What the hell! Let's see what is inside $HADOOP_HOME/sbin/hadoop-daemons.sh:
  usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."
  # if no args specified, show usage
  if [ $# -le 1 ]; then
    echo $usage
    exit 1
  fi
  bin=`dirname "${BASH_SOURCE-$0}"`
  bin=`cd "$bin"; pwd`
  DEFAULT_LIBEXEC_DIR="$bin"/../libexec
  HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
  . $HADOOP_LIBEXEC_DIR/hadoop-config.sh
  exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
For crying out loud! It simply calls slaves.sh and hadoop-daemon.sh! To start a daemon on a single node you have to use hadoop-daemon.sh, yet the official site tells us to use hadoop-daemons.sh. What a trap!
OK, start it with hadoop-daemon.sh:
  $HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
Finally, it started successfully.
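
The difference between the two scripts comes down to fan-out: hadoop-daemons.sh hands the command to slaves.sh, which repeats it once per host in the slaves file, while hadoop-daemon.sh acts on the local node only. The sketch below is a deliberately simplified stand-in, not the real slaves.sh (which does more, such as handling ssh options). It only prints what would be run, and the function name fan_out is made up.

```shell
#!/bin/sh
# Simplified illustration of per-host fan-out: for each non-empty line of the
# slaves file, print the ssh invocation that would run the given command there.
fan_out() {
  slaves_file=$1
  shift
  while read -r host; do
    if [ -n "$host" ]; then
      echo "ssh $host $*"
    fi
  done < "$slaves_file"
}

# Demo with a scratch slaves file holding two of the container hostnames.
demo=$(mktemp)
printf '%s\n' f89eaf2a2548 28620eee1426 > "$demo"
fan_out "$demo" hadoop-daemon.sh start datanode
rm -f "$demo"
```

Seen this way, running the plural script on a datanode contacts every listed host, which matches the host-key prompts shown earlier.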
 
3.1.5.7 Start the YARN daemons on each node

Start the resourcemanager on 9321a27a2b91 (hadoop1):

  [hdfs@9321a27a2b91 hadoop]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
  starting resourcemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-resourcemanager-9321a27a2b91.out
Start a nodemanager on these nodes:
f89eaf2a2548 hadoop3
28620eee1426 hadoop4
ae1f06bd04c8 hadoop5
11c433a003b6 hadoop6
  [hdfs@f89eaf2a2548 hadoop-2.7.3]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start nodemanager
  starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-f89eaf2a2548.out
Likewise, the official documentation gets this one wrong too. Argh!

OK, YARN is up as well.
Now test the example program again:
  [hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -put etc/hadoop input
  ...............
  17/04/12 12:38:24 INFO mapreduce.JobSubmitter: number of splits:30
  17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491968887469_0003
  17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hdfs/.staging/job_1491968887469_0003
  java.lang.IllegalArgumentException: Does not contain a valid host:port authority: hadoop2
  at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
  ........................
The same old problem, damn it! So it was never a host alias problem. The culprit must be this:
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>7c3a3c9cd595</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>7c3a3c9cd595</value>
  </property>
A port has to be added here; change it to:
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>7c3a3c9cd595:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>7c3a3c9cd595:19888</value>
  </property>
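
The exception text ("Does not contain a valid host:port authority") is literal: the value must contain a colon followed by a port number. A quick pre-flight check along those lines might look like the sketch below; the function name has_port is hypothetical, and the check is purely syntactic, so it does not prove the host is reachable.

```shell
#!/bin/sh
# Succeed only if the value looks like host:port with a non-empty host
# and an all-digit port.
has_port() {
  case $1 in
    *:*) port=${1##*:} ;;
    *) return 1 ;;
  esac
  case $port in
    ''|*[!0-9]*) return 1 ;;
  esac
  [ -n "${1%:*}" ]
}

# The value that failed, followed by the two corrected values.
for v in hadoop2 7c3a3c9cd595:10020 7c3a3c9cd595:19888; do
  if has_port "$v"; then
    echo "$v: ok"
  else
    echo "$v: missing port"
  fi
done
```

Running such a check over the address values in mapred-site.xml before starting a job would have caught this mistake earlier.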
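Distribute the files to every node:
  scp -r hdfs@hadoop1:/opt/hadoop/hadoop-2.7.3/etc/hadoop/* /opt/hadoop/hadoop-2.7.3/etc/hadoop/
Then restart HDFS and YARN and test again.
This time that error is gone!
Why did I leave out the port the first time? Because when I installed Hadoop 2.6.3 the jobhistory port did not have to be written, so... I got lazy.
That error is gone, but another one shows up: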
  2017-04-03 19:13:12,328 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in offerService
  java.io.EOFException: End of File Exception between local host is: "ae1f06bd04c8/172.17.0.6"; destination host is: "hadoop1":9000; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
  at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
  at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
  at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
This is the error I found in the nodemanager logs (the excerpt itself is logged by the DataNode class): I/O is failing. Three of the four nodemanagers died. With only one nodemanager running, the job succeeds. I take this to mean that the Docker containers on a single host are contending for disk, so that once one process uses it the others cannot. Next time I will try mounting a separate directory for each container.
 

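IV. Summary

4.1. Hadoop installation steps

It is actually quite simple; most of the work is preparing the environment.
1) Install the JDK
2) Add the hadoop group and the hdfs user; add yarn too if you like
3) Set the environment variables in /home/hdfs/.bash_profile and edit /etc/hosts
4) Disable the firewall
5) Adjust ulimit
6) Configure SSH; this step can be skipped if you are willing to start the daemons on each node yourself
7) Install ntp
8) Download and unpack the Hadoop package
9) Edit the Hadoop configuration files, then start HDFS and YARN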

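4.2. Problems encountered

1) The jobhistory addresses must include the port (why could 2.6.3 apparently do without it?)
2) When starting the daemons on each node individually (without start-dfs.sh / start-yarn.sh), use the daemon.sh scripts, not daemons.sh!

ZooKeeper is not installed yet; it will get its own section after the Docker disk contention is sorted out.
Plans for later:
1. Resolve the Docker disk contention
2. Install ZooKeeper
3. Install HDFS HA
4. Install Hive
5. Install HBase
6. Install Kafka
7. Install Solr
8. Install ES
Of course, some of this may change.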
 
 
 
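Postscript:
The next day I built a new Hadoop cluster, because I had sorted out four Docker things: iptables port mapping, container hostnames, fixed IPs, and mounted directories. The IPs of the new cluster:
172.18.0.11
172.18.0.12
172.18.0.13
172.18.0.14
172.18.0.15
172.18.0.16
The roles are the same as above. Future updates will be based on the newly deployed cluster.

Note: cnblogs supports Markdown syntax, so later posts will be written in Markdown.
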
posted on 2017-04-13 10:13  月饼馅饺子