Hadoop on Docker (1): Installing an Experimental Hadoop Cluster (a Bit of a Pain)
I. Environment preparation
1.1. Machine planning
Hostname Alias | IP | Role |
9321a27a2b91 hadoop1 | 172.17.0.10 | NN1 ZK RM |
7c3a3c9cd595 hadoop2 | 172.17.0.9 | NN2 ZK RM JOBHIS |
f89eaf2a2548 hadoop3 | 172.17.0.8 | DN ZK ND |
28620eee1426 hadoop4 | 172.17.0.7 | DN QJM1 ND |
ae1f06bd04c8 hadoop5 | 172.17.0.6 | DN QJM2 ND |
11c433a003b6 hadoop6 | 172.17.0.5 | DN QJM3 ND |
1.2. Users and groups
User | Group | Purpose |
hdfs | hadoop | manages HDFS |
yarn | hadoop | manages YARN |
zookeeper | hadoop | manages ZooKeeper |
hive | hadoop | manages Hive |
hbase | hadoop | manages HBase |
Script:
groupadd hadoop
useradd -g hadoop hdfs
passwd hdfs <<EOF
hdfs
hdfs
EOF
useradd -g hadoop yarn
passwd yarn <<EOF
yarn
yarn
EOF
useradd -g hadoop zookeeper
passwd zookeeper <<EOF
zookeeper
zookeeper
EOF
useradd -g hadoop hive
passwd hive <<EOF
hive
hive
EOF
useradd -g hadoop hbase
passwd hbase <<EOF
hbase
hbase
EOF
echo user added!
1.3. Edit /etc/hosts
Add the IP, hostname, and alias of every node (the IPs follow the planning table in 1.1):
echo "127.0.0.1 localhost localhost">/etc/hosts
echo "172.17.0.10 9321a27a2b91 hadoop1">>/etc/hosts
echo "172.17.0.9 7c3a3c9cd595 hadoop2">>/etc/hosts
echo "172.17.0.8 f89eaf2a2548 hadoop3">>/etc/hosts
echo "172.17.0.7 28620eee1426 hadoop4">>/etc/hosts
echo "172.17.0.6 ae1f06bd04c8 hadoop5">>/etc/hosts
echo "172.17.0.5 11c433a003b6 hadoop6">>/etc/hosts
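Keeping the planning table and /etc/hosts in sync by hand is error-prone, so here is a minimal sketch that renders the hosts file from a single node table. The helper name render_hosts is made up for illustration, and the IPs are the ones from the planning table in 1.1. It only prints to stdout, so nothing is changed until the output is redirected.

```shell
# Node table: IP, container ID (the real hostname), alias.
# Values are taken from the planning table in section 1.1.
NODES='172.17.0.10 9321a27a2b91 hadoop1
172.17.0.9 7c3a3c9cd595 hadoop2
172.17.0.8 f89eaf2a2548 hadoop3
172.17.0.7 28620eee1426 hadoop4
172.17.0.6 ae1f06bd04c8 hadoop5
172.17.0.5 11c433a003b6 hadoop6'

# render_hosts (hypothetical helper): print a complete hosts file to stdout.
render_hosts() {
    echo "127.0.0.1 localhost"
    printf '%s\n' "$NODES"
}

# Preview only. To apply it as root: render_hosts > /etc/hosts
render_hosts
```

Because every node gets the same table, the same snippet can be run unchanged in each container.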
1.4. Passwordless SSH login
Run on every machine as the hdfs user (if ~/.ssh/id_rsa.pub does not exist yet, generate a key pair with ssh-keygen first):
su hdfs
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.5
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.6
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.7
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.8
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.9
ssh-copy-id -i ~/.ssh/id_rsa.pub 172.17.0.10
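The key distribution can be wrapped in a loop. The sketch below is only an illustration: setup_ssh_keys, RUN, and NODE_IPS are names I made up, and the IP list assumes the addresses from the planning table in 1.1. By default it runs in dry-run mode and just prints the commands.

```shell
# RUN defaults to "echo", so commands are only printed (dry run).
# Set RUN to an empty string (RUN= sh script.sh) to really execute them.
RUN=${RUN-echo}

# Assumed node IPs, taken from the planning table in section 1.1.
NODE_IPS="172.17.0.5 172.17.0.6 172.17.0.7 172.17.0.8 172.17.0.9 172.17.0.10"

# setup_ssh_keys (hypothetical helper): create a key pair if needed,
# then copy the public key to every node.
setup_ssh_keys() {
    key="$HOME/.ssh/id_rsa"
    # ssh-copy-id needs an existing key pair; create one without a passphrase.
    if [ ! -f "$key" ]; then
        $RUN ssh-keygen -t rsa -N "" -f "$key"
    fi
    for ip in $NODE_IPS; do
        $RUN ssh-copy-id -i "$key.pub" "$ip"
    done
}

setup_ssh_keys
```

Each ssh-copy-id still asks for the hdfs password once per node; after that, logins from this machine should no longer prompt.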
1.5. Adjust ulimit
First run ulimit -a as the hdfs, yarn, hive, etc. users. If -n, -m, -l, -u and so on do not meet the requirements, edit /etc/security/limits.conf.
[hdfs@9321a27a2b91 root]$ ulimit -a
core file size (blocks, -c) unlimited
data seg size (kbytes, -d) unlimited
scheduling priority (-e) 0
file size (blocks, -f) unlimited
pending signals (-i) 95612
max locked memory (kbytes, -l) 64
max memory size (kbytes, -m) unlimited
open files (-n) 65536
pipe size (512 bytes, -p) 8
POSIX message queues (bytes, -q) 819200
real-time priority (-r) 0
stack size (kbytes, -s) 8192
cpu time (seconds, -t) unlimited
max user processes (-u) 1024
virtual memory (kbytes, -v) unlimited
file locks (-x) unlimited
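If the requirements are not met, edit /etc/security/limits.conf and add:
hdfs hard nofile 65536
hdfs soft nofile 65536
yarn hard nofile 65536
yarn soft nofile 65536
......
nofile is the maximum number of open files; nproc (maximum number of processes) and other items can be set the same way.
Note: the machines in this experiment are left unchanged.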
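As a rough sketch of the check described above, the snippet below compares the current open-file limit with a target and prints the limits.conf lines one would add. The helper names limits_lines and needs_raise are made up, and 65536 is simply the value used in this post. It prints only and does not touch /etc/security/limits.conf.

```shell
# limits_lines USER VALUE (hypothetical helper):
# print the soft and hard nofile entries for one user.
limits_lines() {
    for kind in soft hard; do
        echo "$1 $kind nofile $2"
    done
}

# needs_raise VALUE (hypothetical helper): succeed if the open-file
# limit of the current shell is below VALUE.
needs_raise() {
    current=$(ulimit -n)
    [ "$current" != "unlimited" ] && [ "$current" -lt "$1" ]
}

if needs_raise 65536; then
    echo "open files limit is $(ulimit -n); suggested limits.conf lines:"
    for user in hdfs yarn; do
        limits_lines "$user" 65536
    done
else
    echo "open files limit is already sufficient: $(ulimit -n)"
fi
```

Limits are per user, so the check is only meaningful when run as the user in question (hdfs, yarn, and so on).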
1.6. Disable the firewall
service iptables stop
1.7. Disable SELinux
Permanent: set SELINUX=disabled in /etc/selinux/config
Temporary: run setenforce 0
Since a Docker container cannot be rebooted, the second method is used here:
setenforce 0
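II. Software preparation
2.1. Install the JDK
On the hadoop1 node:
Download the latest JDK 8 tar package from the official site (jdk-8u121 here) and extract it under /usr/local/java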
[root@9321a27a2b91 ~]#mkdir /usr/local/java
[root@9321a27a2b91 ~]#cp jdk-8u121-linux-x64.tar.gz /usr/local/java/
[root@9321a27a2b91 ~]#chown -R hdfs:hadoop /usr/local/java/
[root@9321a27a2b91 ~]#su hdfs
[hdfs@9321a27a2b91 root]$ cd /usr/local/java/
[hdfs@9321a27a2b91 java]$ tar -zxvf jdk-8u121-linux-x64.tar.gz
Do this on every node.
Alternatively, finish it on one node and then scp the files to the other nodes:
mkdir /usr/local/java
chown hdfs:hadoop /usr/local/java
su hdfs
scp -r hdfs@hadoop1:/usr/local/java/jdk1.8.0_121 /usr/local/java
2.2. Hadoop package
On the hadoop1 node:
Download hadoop 2.7.3 from the official site and extract it under /opt/hadoop
[root@9321a27a2b91 ~]# mkdir /opt/hadoop
[root@9321a27a2b91 ~]# chown hdfs:hadoop hadoop-2.7.3.tar.gz
[root@9321a27a2b91 ~]# chown hdfs:hadoop /opt/hadoop
[root@9321a27a2b91 ~]# cp hadoop-2.7.3.tar.gz /opt/hadoop
[root@9321a27a2b91 ~]# su hdfs
[hdfs@9321a27a2b91 root]$ cd /opt/hadoop/
[hdfs@9321a27a2b91 hadoop]$ tar -zxvf hadoop-2.7.3.tar.gz
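2.3. NTP service
Make hadoop1 the NTP server and the other nodes NTP clients.
1) On hadoop1, as root:
If ntp is not installed, install it with yum first:
yum -y install ntp
Edit the NTP configuration file /etc/ntp.conf and add:
# hosts in this subnet are allowed to synchronize
restrict 172.17.0.0 mask 255.255.0.0 nomodify
# preferred time server
server 172.17.0.10 prefer
# log file location
logfile /var/log/ntp.log
Then start ntpd: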
[root@9321a27a2b91 hadoop]# service ntpd start
Starting ntpd:[ OK ]
[root@9321a27a2b91 hadoop]# service ntpd status
ntpd dead but pid file exists
ntpd has stopped. Check /var/log/ntp.log:
3 Apr 11:20:08 ntpd[732]: ntp_io: estimated max descriptors: 65536, initial socket boundary: 16
3 Apr 11:20:08 ntpd[732]: Listen and drop on 0 v4wildcard 0.0.0.0 UDP 123
3 Apr 11:20:08 ntpd[732]: Listen and drop on 1 v6wildcard :: UDP 123
3 Apr 11:20:08 ntpd[732]: Listen normally on 2 lo 127.0.0.1 UDP 123
3 Apr 11:20:08 ntpd[732]: Listen normally on 3 eth0 172.17.0.10 UDP 123
3 Apr 11:20:08 ntpd[732]: Listen normally on 4 lo ::1 UDP 123
3 Apr 11:20:08 ntpd[732]: Listen normally on 5 eth0 fe80::42:acff:fe11:a UDP 123
3 Apr 11:20:08 ntpd[732]: Listening on routing socket on fd #22 for interface updates
3 Apr 11:20:08 ntpd[732]: 0.0.0.0 c016 06 restart
3 Apr 11:20:08 ntpd[732]: ntp_adjtime() failed: Operation not permitted
3 Apr 11:20:08 ntpd[732]: 0.0.0.0 c012 02 freq_set kernel 0.000 PPM
3 Apr 11:20:08 ntpd[732]: 0.0.0.0 c011 01 freq_not_set
3 Apr 11:20:08 ntpd[732]: cap_set_proc() failed to drop root privileges: Operation not permitted
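After searching online:
Possible cause: a bug in the Linux kernel used by the virtual machine that keeps ntpd from dropping root (that is, from running as a non-root user). In a Docker container, though, "Operation not permitted" more likely means the container lacks the needed capabilities: Docker does not grant CAP_SYS_TIME by default, so ntp_adjtime() is refused. Rebuilding the kernel is out of the question either way, so edit /etc/sysconfig/ntpd instead.
Comment out:
OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"
echo "# Drop root to id 'ntp:ntp' by default.">/etc/sysconfig/ntpd
echo '#OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"'>>/etc/sysconfig/ntpd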
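The two echo commands overwrite the whole file. If /etc/sysconfig/ntpd contains other settings worth keeping, a sed one-liner that only comments out the OPTIONS line is a gentler alternative. This is a sketch: comment_out_options is a made-up helper name, and the demonstration runs on a scratch file rather than the real /etc/sysconfig/ntpd.

```shell
# comment_out_options FILE (hypothetical helper): comment out the
# OPTIONS= line and leave every other line untouched.
comment_out_options() {
    # -i.bak edits the file in place and keeps a backup as FILE.bak
    sed -i.bak 's/^OPTIONS=/#OPTIONS=/' "$1"
}

# Demonstration on a scratch copy; on the server one would pass
# /etc/sysconfig/ntpd as root.
demo=$(mktemp)
printf '%s\n' \
    "# Drop root to id 'ntp:ntp' by default." \
    'OPTIONS="-u ntp:ntp -p /var/run/ntpd.pid -g"' > "$demo"
comment_out_options "$demo"
cat "$demo"
```

The backup file makes it easy to restore the original options if ntpd is later run outside a container.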
Then start ntpd again:
[root@9321a27a2b91 hadoop]# service ntpd start
Starting ntpd:[ OK ]
[root@9321a27a2b91 hadoop]# service ntpd status
ntpd (pid 796) is running..
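2) On the other nodes:
Edit /etc/ntp.conf and add:
server 172.17.0.10 prefer
2.4. MySQL database
The MySQL database is for Hive. For convenience, install it directly with yum; downloading the package and installing it manually also works.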
III. Installing Hadoop and its components
3.1 Install HDFS and YARN
3.1.1 Set environment variables in .bash_profile
On hadoop1, edit /home/hdfs/.bash_profile
su hdfs
vi ~/.bash_profile
JAVA_HOME=/usr/local/java/jdk1.8.0_121
CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar:$JRE_HOME/lib:$CLASS
HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
HADOOP_PREFIX=/opt/hadoop/hadoop-2.7.3
HADOOP_CONF_DIR=$HADOOP_HOME/etc/hadoop
HADOOP_YARN_HOME=$HADOOP_HOME
LD_LIBRARY_PATH=$LD_LIBRARY_PATH:$JAVA_HOME/jre/lib/amd64/server
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
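export JAVA_HOME CLASSPATH HADOOP_HOME HADOOP_PREFIX HADOOP_CONF_DIR HADOOP_YARN_HOME LD_LIBRARY_PATH PATH
Copy it to all machines:
su hdfs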
scp -r hdfs@hadoop1:/home/hdfs/.bash_profile ~
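After copying the profile around, it is worth checking on each node that the variables really resolve to existing directories. A minimal sketch follows; check_hadoop_env is a made-up helper name, and the three variables it checks are the ones this post relies on later.

```shell
# check_hadoop_env (hypothetical helper): report variables that are
# unset or that point to a directory that does not exist.
check_hadoop_env() {
    status=0
    for name in JAVA_HOME HADOOP_HOME HADOOP_CONF_DIR; do
        # Indirect lookup: read the value of the variable named in $name.
        eval "value=\${$name-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value"
            status=1
        fi
    done
    return $status
}

check_hadoop_env || echo "fix ~/.bash_profile, then log in again or source it"
```

Run it as the hdfs user after logging in again, since .bash_profile is only read by login shells.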
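3.1.2 Set the environment files used to start hadoop (xxx-env.sh)
On hadoop1:
3.1.2.1 hadoop-env.sh
This file mainly holds the JVM memory parameters, environment variables, and so on:
export JAVA_HOME=/usr/local/java/jdk1.8.0_121
export HADOOP_HOME=/opt/hadoop/hadoop-2.7.3
# maximum heap size of the hadoop daemons (namenode/datanode/secondarynamenode, etc.), default 1000 MB
#export HADOOP_HEAPSIZE=
# initial heap size of the namenode; defaults to the value above, set as needed
#export HADOOP_NAMENODE_INIT_HEAPSIZE=""
# JVM startup options, empty by default
export HADOOP_OPTS="$HADOOP_OPTS -Djava.net.preferIPv4Stack=true"
# memory can also be configured per component:
#export HADOOP_NAMENODE_OPTS=
#export HADOOP_DATANODE_OPTS=
#export HADOOP_SECONDARYNAMENODE_OPTS=
# hadoop log directory, default $HADOOP_HOME/logs
export HADOOP_LOG_DIR=${HADOOP_LOG_DIR}/$USER
Set each parameter according to your own capacity planning. Note that the namenode keeps both the block map and the namespace in its heap, so production clusters need a large heap size.
Also watch the total memory used by all components; in production, leave 5-15% of the memory to the Linux system (typically about 10 GB).
None of these parameters are set here; allocate them as needed.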
3.1.2.2 yarn-env.sh
export JAVA_HOME=/usr/local/java/jdk1.8.0_121
JAVA_HEAP_MAX=-Xmx1000m
# YARN_HEAPSIZE=1000 # heap size of the YARN daemons
#export YARN_RESOURCEMANAGER_HEAPSIZE=1000 # heap size of the ResourceManager only
#export YARN_TIMELINESERVER_HEAPSIZE=1000 # heap size of the TimelineServer only
#export YARN_RESOURCEMANAGER_OPTS= # JVM options of the ResourceManager only
#export YARN_NODEMANAGER_HEAPSIZE=1000 # heap size of the NodeManager only
#export YARN_NODEMANAGER_OPTS= # JVM options of the NodeManager only
As with hadoop-env.sh, allocate as needed.
3.1.3 Edit the hadoop configuration files
On hadoop1:
3.1.3.1 Edit core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://hadoop1:9000</value>
</property>
<property>
<name>io.file.buffer.size</name>
<value>131072</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/opt/hadoop/hadoop-2.7.3/tmp</value>
<description>Default is /tmp/hadoop-${user.name}; changed to a persistent directory</description>
</property>
</configuration>
Create the directory as the hdfs user:
mkdir ${HADOOP_HOME}/tmp
3.1.3.2 hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>3</value>
<description>Number of replicas of each data block</description>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/opt/hadoop/hadoop-2.7.3/namenodedir</value>
<description>Directory that holds the namenode metadata; create it yourself</description>
</property>
<property>
<name>dfs.blocksize</name>
<value>134217728</value>
<description>Block size, 128 MB</description>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/opt/hadoop/hadoop-2.7.3/datadir</value>
</property>
</configuration>
Create the directories as the hdfs user:
mkdir ${HADOOP_HOME}/datadir
mkdir ${HADOOP_HOME}/namenodedir
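The local directories named in core-site.xml and hdfs-site.xml can be created in one go. The sketch below uses mkdir -p so it is safe to run repeatedly; make_hadoop_dirs is a made-up helper name, and the demonstration uses a scratch directory instead of the real $HADOOP_HOME.

```shell
# make_hadoop_dirs BASE (hypothetical helper): create the local
# directories referenced by core-site.xml and hdfs-site.xml under BASE.
make_hadoop_dirs() {
    for d in tmp namenodedir datadir; do
        # -p: no error if the directory already exists
        mkdir -p "$1/$d"
    done
}

# Demonstration in a scratch directory.
# On the cluster, as the hdfs user: make_hadoop_dirs "$HADOOP_HOME"
base=$(mktemp -d)
make_hadoop_dirs "$base"
ls "$base"
```

Running it as the hdfs user matters, because the daemons are started as hdfs and must be able to write to these directories.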
3.1.3.3 mapred-site.xml
Settings for MR jobs:
Parameter | Value | Notes |
---|---|---|
mapreduce.framework.name | yarn | Execution framework set to Hadoop YARN. |
mapreduce.map.memory.mb | 1536 | Larger resource limit for maps. |
mapreduce.map.java.opts | -Xmx1024M | Larger heap-size for child jvms of maps. |
mapreduce.reduce.memory.mb | 3072 | Larger resource limit for reduces. |
mapreduce.reduce.java.opts | -Xmx2560M | Larger heap-size for child jvms of reduces. |
mapreduce.task.io.sort.mb | 512 | Higher memory-limit while sorting data for efficiency. |
mapreduce.task.io.sort.factor | 100 | More streams merged at once while sorting files. |
mapreduce.reduce.shuffle.parallelcopies | 50 | Higher number of parallel copies run by reduces to fetch outputs from very large number of maps. |
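Note that in the table each -Xmx value is smaller than the matching memory.mb value: the container limit has to cover the JVM heap plus non-heap memory. A small arithmetic sketch of that relationship, using the example values from the table (heap_percent is a made-up helper name):

```shell
# heap_percent CONTAINER_MB HEAP_MB (hypothetical helper):
# JVM heap as an integer percentage of the container memory limit.
heap_percent() {
    echo $(( $2 * 100 / $1 ))
}

# Values from the table: memory.mb versus the -Xmx in java.opts.
echo "map:    $(heap_percent 1536 1024)% of the container is JVM heap"
echo "reduce: $(heap_percent 3072 2560)% of the container is JVM heap"
```

With these numbers the map heap takes 66% and the reduce heap 83% of the container (integer division), and the remainder is headroom for everything outside the heap.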
JobHistory Server:
Parameter | Value | Notes |
---|---|---|
mapreduce.jobhistory.address | MapReduce JobHistory Server host:port | Default port is 10020. |
mapreduce.jobhistory.webapp.address | MapReduce JobHistory Server Web UI host:port | Default port is 19888. |
mapreduce.jobhistory.intermediate-done-dir | /mr-history/tmp | Directory where history files are written by MapReduce jobs. |
mapreduce.jobhistory.done-dir | /mr-history/done | Directory where history files are managed by the MR JobHistory Server. |
Only the following parameters are configured here:
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
<description>Use YARN to run MR jobs</description>
</property>
<property>
<name>mapreduce.jobhistory.address</name>
<value>hadoop2</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>hadoop2</value>
</property>
<property>
<name>mapreduce.jobhistory.intermediate-done-dir</name>
<value>/opt/hadoop/hadoop-2.7.3/mrHtmp</value>
</property>
<property>
<name>mapreduce.jobhistory.done-dir</name>
<value>/opt/hadoop/hadoop-2.7.3/mrhHdone</value>
</property>
</configuration>
Create the directories on hadoop2:
mkdir ${HADOOP_HOME}/mrHtmp
mkdir ${HADOOP_HOME}/mrhHdone
3.1.3.4 yarn-site.xml
yarn-site.xml has many settable parameters, most of which have default values; see the official documentation.
The more important ones are listed below.
ResourceManager configuration:
Parameter | Value | Notes |
---|---|---|
yarn.resourcemanager.address | ResourceManager host:port for clients to submit jobs. | host:port If set, overrides the hostname set in yarn.resourcemanager.hostname. |
yarn.resourcemanager.scheduler.address | ResourceManager host:port for ApplicationMasters to talk to Scheduler to obtain resources. | host:port If set, overrides the hostname set in yarn.resourcemanager.hostname. |
yarn.resourcemanager.resource-tracker.address | ResourceManager host:port for NodeManagers. | host:port If set, overrides the hostname set in yarn.resourcemanager.hostname. |
yarn.resourcemanager.admin.address | ResourceManager host:port for administrative commands. | host:port If set, overrides the hostname set in yarn.resourcemanager.hostname. |
yarn.resourcemanager.webapp.address | ResourceManager web-ui host:port. | host:port If set, overrides the hostname set in yarn.resourcemanager.hostname. Has a default value. |
yarn.resourcemanager.hostname | ResourceManager host. | host Single hostname that can be set in place of setting all yarn.resourcemanager*address resources. Results in default ports for ResourceManager components. |
yarn.resourcemanager.scheduler.class | ResourceManager Scheduler class. | CapacityScheduler (recommended), FairScheduler (also recommended), or FifoScheduler |
yarn.scheduler.minimum-allocation-mb | Minimum limit of memory to allocate to each container request at the Resource Manager. | In MBs |
yarn.scheduler.maximum-allocation-mb | Maximum limit of memory to allocate to each container request at the Resource Manager. | In MBs |
yarn.resourcemanager.nodes.include-path / yarn.resourcemanager.nodes.exclude-path | List of permitted/excluded NodeManagers. | If necessary, use these files to control the list of allowable NodeManagers. |
NodeManager configuration:
Parameter | Value | Notes |
---|---|---|
yarn.nodemanager.resource.memory-mb | Resource i.e. available physical memory, in MB, for given NodeManager | Defines total available resources on the NodeManager to be made available to running containers. |
yarn.nodemanager.vmem-pmem-ratio | Maximum ratio by which virtual memory usage of tasks may exceed physical memory | The virtual memory usage of each task may exceed its physical memory limit by this ratio. The total amount of virtual memory used by tasks on the NodeManager may exceed its physical memory usage by this ratio. Tasks that go beyond it are killed. |
yarn.nodemanager.local-dirs | Comma-separated list of paths on the local filesystem where intermediate data is written. | Multiple paths help spread disk i/o. |
yarn.nodemanager.log-dirs | Comma-separated list of paths on the local filesystem where logs are written. | Multiple paths help spread disk i/o. |
yarn.nodemanager.log.retain-seconds | 10800 | Default time (in seconds) to retain log files on the NodeManager. Only applicable if log-aggregation is disabled. |
yarn.nodemanager.remote-app-log-dir | /logs | HDFS directory where the application logs are moved on application completion. Need to set appropriate permissions. Only applicable if log-aggregation is enabled. |
yarn.nodemanager.remote-app-log-dir-suffix | logs | Suffix appended to the remote log dir. Logs will be aggregated to ${yarn.nodemanager.remote-app-log-dir}/${user}/${thisParam}. Only applicable if log-aggregation is enabled. |
yarn.nodemanager.aux-services | mapreduce_shuffle | Shuffle service that needs to be set for Map Reduce applications. |
YARN ACL configuration:
Parameter | Value | Notes |
---|---|---|
yarn.acl.enable | true / false | Enable ACLs? Defaults to false. |
yarn.admin.acl | Admin ACL | ACL to set admins on the cluster. ACLs are of the form comma-separated-users space comma-separated-groups. Defaults to special value of * which means anyone. Special value of just space means no one has access. Example user list: root,yarn |
yarn.log-aggregation-enable | false | Configuration to enable or disable log aggregation. |
In this experiment only yarn.resourcemanager.hostname and yarn.nodemanager.aux-services are set:
<configuration>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.resourcemanager.hostname</name>
<value>hadoop1</value>
<description>The ResourceManager node</description>
</property>
<!-- Site specific YARN configuration properties -->
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
<description>Auxiliary service of the NodeManager</description>
</property>
</configuration>
3.1.4 Set up the slaves file
On hadoop1, add the hostnames of the DataNode/NodeManager machines to ${HADOOP_HOME}/etc/hadoop/slaves:
vi $HADOOP_HOME/etc/hadoop/slaves
hadoop3
hadoop4
hadoop5
hadoop6
3.1.5 Start Hadoop
3.1.5.1 Copy the hadoop configuration to the other machines
Run on the other machines:
mkdir /opt/hadoop
chown hdfs:hadoop /opt/hadoop
su hdfs
scp -r hdfs@hadoop1:/opt/hadoop/hadoop-2.7.3 /opt/hadoop
3.1.5.2 Format the namenode
$HADOOP_HOME/bin/hdfs namenode -format
3.1.5.3 Start HDFS
On one machine (hadoop1 here):
[hdfs@9321a27a2b91 hadoop-2.7.3]$ start-dfs.sh
Starting namenodes on [hadoop1]
hadoop1: starting namenode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-namenode-9321a27a2b91.out
hadoop3: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-f89eaf2a2548.out
hadoop4: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-28620eee1426.out
hadoop5: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-ae1f06bd04c8.out
hadoop6: starting datanode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-datanode-11c433a003b6.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: starting secondarynamenode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-secondarynamenode-9321a27a2b91.out
Run jps to check the processes:
[hdfs@9321a27a2b91 hadoop]$ jps
11105 Jps
10981 SecondaryNameNode
10777 NameNode
Verify HDFS:
[hdfs@9321a27a2b91 hadoop-2.7.3]$ hdfs dfs -put NOTICE.txt /
[hdfs@9321a27a2b91 hadoop-2.7.3]$ hdfs dfs -ls /
Found 1 items
-rw-r--r-- 3 hdfs supergroup 14978 2017-04-03 19:15 /NOTICE.txt
Check the HDFS web page:
[root@9321a27a2b91 hdfs]# curl hadoop1:50070
<!--
Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
................
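Of course, with port mapping in place the page can be viewed in a browser!
For how to set up port mapping, see my other article <docker iptables端口映射> (Docker iptables port mapping).
3.1.5.4 Start YARN
I wanted to start YARN as the yarn user, but that would mean setting up SSH and the environment variables for yarn as well, which is a hassle, so YARN is started as the hdfs user.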
[hdfs@9321a27a2b91 hadoop]$ start-yarn.sh
starting yarn daemons
starting resourcemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-resourcemanager-9321a27a2b91.out
hadoop5: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-ae1f06bd04c8.out
hadoop6: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-11c433a003b6.out
hadoop3: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-f89eaf2a2548.out
hadoop4: starting nodemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-nodemanager-28620eee1426.out
Run jps to check the processes:
[hdfs@9321a27a2b91 hadoop]$ jps
11105 Jps
10981 SecondaryNameNode
10777 NameNode
10383 ResourceManager
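Eyeballing jps output on six nodes gets tedious, so here is a small sketch that reads a jps listing and names the expected daemons that are missing. missing_daemons is a made-up helper name; the sample input is the jps output shown in the HDFS step, taken before YARN was started, so ResourceManager is reported as missing.

```shell
# missing_daemons NAME... (hypothetical helper): read jps output on
# stdin and print every expected daemon name that does not appear.
missing_daemons() {
    listing=$(cat)
    for name in "$@"; do
        # jps prints "PID ClassName"; compare against the second column.
        printf '%s\n' "$listing" \
            | awk -v n="$name" '$2 == n { found = 1 } END { exit !found }' \
            || echo "$name"
    done
}

# On a live master node one would run:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager
# Here the input is the sample listing from the HDFS step.
printf '%s\n' '11105 Jps' '10981 SecondaryNameNode' '10777 NameNode' \
    | missing_daemons NameNode SecondaryNameNode ResourceManager
```

On the worker nodes the expected list would be DataNode and NodeManager instead.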
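Check the YARN page:
curl returned no data, but the page can be viewed in a browser through the web port.
3.1.5.5 Test the cluster configuration
Use the hadoop example programs to verify that the cluster works: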
[hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -mkdir /user
[hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -mkdir /user/hdfs
[hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -put etc/hadoop input
...............
17/04/12 12:38:24 INFO mapreduce.JobSubmitter: number of splits:30
17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491968887469_0003
17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hdfs/.staging/job_1491968887469_0003
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: hadoop2
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
........................
Oops, an error!
Error message:
Host hadoop2 cannot be found!
What, a host called hadoop2? Shouldn't it be 7c3a3c9cd595? Damn. Quickly change every hostname that starts with hadoop to the real Docker hostname:
core-site.xml:
<value>hdfs://hadoop1:9000</value> becomes <value>hdfs://9321a27a2b91:9000</value>
yarn-site.xml:
<value>hadoop1</value> becomes <value>9321a27a2b91</value>
mapred-site.xml:
<value>hadoop2</value> becomes <value>7c3a3c9cd595</value>
slaves file:
hadoop3
hadoop4
hadoop5
hadoop6
becomes:
f89eaf2a2548
28620eee1426
ae1f06bd04c8
11c433a003b6
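Rewriting aliases by hand invites typos, and the mapping already lives in /etc/hosts (IP, container hostname, alias). The sketch below translates a list of aliases into the real hostnames with awk. alias_to_hostname is a made-up helper name, and the demonstration uses a scratch file with two of the entries from this post rather than the real /etc/hosts.

```shell
# alias_to_hostname HOSTS_FILE (hypothetical helper): read aliases on
# stdin and print the matching hostname (2nd column of the hosts file);
# names without a matching alias (3rd column) are printed unchanged.
alias_to_hostname() {
    awk 'NR == FNR { if (NF >= 3) real[$3] = $2; next }
         { print (($1 in real) ? real[$1] : $1) }' "$1" -
}

# Demonstration with a scratch hosts file.
# On a node: alias_to_hostname /etc/hosts < slaves > slaves.new
hosts=$(mktemp)
printf '%s\n' \
    '172.17.0.8 f89eaf2a2548 hadoop3' \
    '172.17.0.7 28620eee1426 hadoop4' > "$hosts"
printf '%s\n' hadoop3 hadoop4 | alias_to_hostname "$hosts"
```

Writing to a new file first and comparing it with the old slaves file is safer than editing in place.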
Stop YARN and HDFS, then scp the configuration files under etc/hadoop to every node:
scp -r hdfs@hadoop1:/opt/hadoop/hadoop-2.7.3/etc/hadoop/* /opt/hadoop/hadoop-2.7.3/etc/hadoop/
Start HDFS again:
[hdfs@9321a27a2b91 hadoop]$ start-dfs.sh
Starting namenodes on [9321a27a2b91]
The authenticity of host '9321a27a2b91 (172.17.0.10)' can't be established.
RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
Are you sure you want to continue connecting (yes/no)?
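Damn, SSH wants confirmation all over again! The prompt is about the host key: entries in known_hosts are recorded per host name string, so 9321a27a2b91 and hadoop1 are not treated as the same host. I'm speechless!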
3.1.5.6 Start the HDFS daemons on each node
Never mind, start HDFS separately on each node:
On 9321a27a2b91 (hadoop1), start the namenode:
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
Start the datanode on the following nodes:
f89eaf2a2548 hadoop3
28620eee1426 hadoop4
ae1f06bd04c8 hadoop5
11c433a003b6 hadoop6
[hdfs@11c433a003b6 hadoop-2.7.3]$ $HADOOP_HOME/sbin/hadoop-daemons.sh start datanode
The authenticity of host '28620eee1426 (172.17.0.7)' can't be established.
RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
Are you sure you want to continue connecting (yes/no)? The authenticity of host '11c433a003b6 (172.17.0.5)' can't be established.
RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
Are you sure you want to continue connecting (yes/no)? The authenticity of host 'ae1f06bd04c8 (172.17.0.6)' can't be established.
RSA key fingerprint is 60:0c:61:73:2c:49:ef:e3:f7:61:c9:27:93:5a:1d:c7.
Are you sure you want to continue connecting (yes/no)? f89eaf2a2548: datanode running as process 5764. Stop it first.
Damn, running it on one node also starts the other nodes! OK, the hadoop-daemons.sh script must be reading the slaves file. So remove the slaves file on the datanode with rm $HADOOP_HOME/etc/hadoop/slaves, then start the datanode again:
[hdfs@11c433a003b6 hadoop-2.7.3]$ $HADOOP_HOME/sbin/hadoop-daemons.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
cat: /opt/hadoop/hadoop-2.7.3/etc/hadoop/slaves: No such file or directory
Still an error, unbelievable! Let's see what is inside $HADOOP_HOME/sbin/hadoop-daemons.sh:
usage="Usage: hadoop-daemons.sh [--config confdir] [--hosts hostlistfile] [start|stop] command args..."
# if no args specified, show usage
if [ $# -le 1 ]; then
echo $usage
exit 1
fi
bin=`dirname "${BASH_SOURCE-$0}"`
bin=`cd "$bin"; pwd`
DEFAULT_LIBEXEC_DIR="$bin"/../libexec
HADOOP_LIBEXEC_DIR=${HADOOP_LIBEXEC_DIR:-$DEFAULT_LIBEXEC_DIR}
. $HADOOP_LIBEXEC_DIR/hadoop-config.sh
exec "$bin/slaves.sh" --config $HADOOP_CONF_DIR cd "$HADOOP_PREFIX" \; "$bin/hadoop-daemon.sh" --config $HADOOP_CONF_DIR "$@"
Damn it! It simply calls slaves.sh and hadoop-daemon.sh!!! Starting a single node clearly needs hadoop-daemon.sh, yet the official site tells us to use hadoop-daemons.sh. What a trap!
OK, start it with hadoop-daemon.sh:
$HADOOP_HOME/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start datanode
Good, it finally started.
3.1.5.7 Start the YARN daemons on each node
On 9321a27a2b91 (hadoop1), start the resourcemanager:
[hdfs@9321a27a2b91 hadoop]$ $HADOOP_PREFIX/sbin/hadoop-daemon.sh --config $HADOOP_CONF_DIR --script hdfs start namenode
starting namenode, logging to /opt/hadoop/hadoop-2.7.3/logs/hadoop-hdfs-namenode-9321a27a2b91.out
f89eaf2a2548 hadoop3
28620eee1426 hadoop4
ae1f06bd04c8 hadoop5
11c433a003b6 hadoop6
[hdfs@f89eaf2a2548 hadoop-2.7.3]$ $HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config $HADOOP_CONF_DIR start resourcemanager
starting resourcemanager, logging to /opt/hadoop/hadoop-2.7.3/logs/yarn-hdfs-resourcemanager-f89eaf2a2548.out
Again, the official documentation gets this wrong!
Good, YARN is up as well.
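To keep the single-node scripts straight, here is a sketch that prints the start command for each daemon role: HDFS roles go through hadoop-daemon.sh and YARN roles through yarn-daemon.sh, both of which ship in the sbin directory of hadoop 2.7.3. start_cmd is a made-up helper name. The mapping follows my reading of the 2.7 cluster setup guide, so treat it as an assumption rather than a transcript of what was run above. It only prints the commands.

```shell
# start_cmd ROLE (hypothetical helper): print the single-node start
# command for a daemon role. Nothing is executed.
start_cmd() {
    case "$1" in
        namenode|datanode)
            printf '%s\n' "\$HADOOP_HOME/sbin/hadoop-daemon.sh --config \$HADOOP_CONF_DIR --script hdfs start $1" ;;
        resourcemanager|nodemanager)
            printf '%s\n' "\$HADOOP_YARN_HOME/sbin/yarn-daemon.sh --config \$HADOOP_CONF_DIR start $1" ;;
        *)
            echo "unknown role: $1" >&2
            return 1 ;;
    esac
}

start_cmd resourcemanager   # for the master, hadoop1
start_cmd nodemanager       # for the workers, hadoop3 to hadoop6
```

Under that assumption, the worker nodes would run the nodemanager variant, and only hadoop1 would run the resourcemanager one.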
Now test the example program again:
[hdfs@9321a27a2b91 hadoop-2.7.3]$ bin/hdfs dfs -put etc/hadoop input
...............
17/04/12 12:38:24 INFO mapreduce.JobSubmitter: number of splits:30
17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1491968887469_0003
17/04/12 12:38:24 INFO mapreduce.JobSubmitter: Cleaning up the staging area /tmp/hadoop-yarn/staging/hdfs/.staging/job_1491968887469_0003
java.lang.IllegalArgumentException: Does not contain a valid host:port authority: hadoop2
at org.apache.hadoop.net.NetUtils.createSocketAddr(NetUtils.java:213)
........................
The same old problem, argh! So it is not a hostname alias issue after all. The culprit should be:
<property>
<name>mapreduce.jobhistory.address</name>
<value>7c3a3c9cd595</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>7c3a3c9cd595</value>
</property>
A port has to be added here. Change it to:
<property>
<name>mapreduce.jobhistory.address</name>
<value>7c3a3c9cd595:10020</value>
</property>
<property>
<name>mapreduce.jobhistory.webapp.address</name>
<value>7c3a3c9cd595:19888</value>
</property>
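Distribute to every node:
scp -r hdfs@hadoop1:/opt/hadoop/hadoop-2.7.3/etc/hadoop/* /opt/hadoop/hadoop-2.7.3/etc/hadoop/
Restart HDFS and YARN, then test again:
That error is gone this time!
Why did I leave out the port the first time? Because in my hadoop 2.6.3 installation the jobhistory port did not have to be written, so... I got lazy~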
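A missing port like this can be caught before a job is ever submitted. Below is a minimal sketch of a host:port check that one could run over the jobhistory values. has_port is a made-up helper name, and the check is deliberately simple: it only verifies that something numeric follows the last colon.

```shell
# has_port VALUE (hypothetical helper): succeed only if VALUE looks
# like host:port with a purely numeric port.
has_port() {
    case "$1" in
        *:*) port=${1##*:} ;;   # text after the last colon
        *)   return 1 ;;        # no colon at all
    esac
    case "$port" in
        ''|*[!0-9]*) return 1 ;;  # empty or contains a non-digit
    esac
    return 0
}

# The values tried in this post for the two jobhistory addresses.
for v in 7c3a3c9cd595 7c3a3c9cd595:10020 7c3a3c9cd595:19888; do
    if has_port "$v"; then
        echo "ok:      $v"
    else
        echo "no port: $v"
    fi
done
```

The bare hostname is flagged, while the two corrected values pass.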
That error is gone, but a different one shows up:
2017-04-03 19:13:12,328 WARN org.apache.hadoop.hdfs.server.datanode.DataNode: IOException in offerService
java.io.EOFException: End of File Exception between local host is: "ae1f06bd04c8/172.17.0.6"; destination host is: "hadoop1":9000; : java.io.EOFException; For more details see: http://wiki.apache.org/hadoop/EOFException
at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
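This error comes from the node logs: I/O is failing. Three of the four nodemanagers died, and with only one nodemanager running the job succeeds. That suggests the Docker containers on the same host are contending for the disk: once one process uses it, the others cannot. Next time I will try mounting a separate directory into each container.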
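IV. Summary
4.1. Hadoop installation steps
It is actually quite simple; most of the work is preparing the environment.
1) Install the JDK
2) Add the hadoop group and the hdfs user; add yarn as well if you like
3) Set up the environment: /home/hdfs/.bash_profile and /etc/hosts
4) Disable the firewall
5) Adjust ulimit
6) Configure SSH; this step can be skipped if you are willing to start the daemons on each node by hand
7) Install ntp
8) Download and extract the hadoop package
9) Edit the hadoop configuration files, then start HDFS and YARN
4.2. Problems encountered
1) The jobhistory addresses need a port (why could it be left out in 2.6.3?)
2) When starting the daemons node by node (without start-dfs.sh and start-yarn.sh), be sure to use daemon.sh rather than daemons.sh!
ZooKeeper is not installed yet; it will get its own section once the Docker disk contention is sorted out.
Plans for later:
1. Resolve the Docker disk contention
2. Install ZooKeeper
3. Install HDFS HA
4. Install Hive
5. Install HBase
6. Install Kafka
7. Install Solr
8. Install ES
These plans may change, of course.
Postscript:
The next day I built a new hadoop cluster, because I had sorted out four things in Docker: iptables port mapping, container hostnames, fixed IPs, and mounted directories. IPs of the new hadoop cluster: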
172.18.0.11
172.18.0.12
172.18.0.13
172.18.0.14
172.18.0.15
172.18.0.16
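The roles are the same as above. Future updates will be based on the newly deployed cluster.
Note: cnblog supports markdown syntax, so later posts will be written in markdown~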