Hadoop Installation
1. Hardware Preparation
Three Alibaba Cloud servers, running CentOS 6.9.
2. Installation
2.1 Install the JDK
1) Download the JDK: get the Linux x64 build from the official site.
Either download it locally and upload it to the server, or fetch it directly with wget/curl. Note that the AuthParam token at the end of the URL changes every time, so you have to grab a fresh one yourself.
https://www.oracle.com/java/technologies/javase/javase-jdk8-downloads.html
Start the download in a browser, then copy the real download URL it resolves to:
curl -O https://download.oracle.com/otn/java/jdk/8u261-b12/a4634525489241b9a9e1aa73d9e118e6/jdk-8u261-linux-x64.tar.gz?AuthParam=1600227784_d1d21d93eaccab801bd1324fb5089179
Extract it:
[root@iZ8vbikyhq14lnnku3lyegZ soft]# tar -zxvf jdk-8u261-linux-x64.tar.gz -C ../modules
Configure the environment variables (the extracted jdk1.8.0_261 directory is presumably renamed or symlinked to /opt/modules/jdk):
vi /etc/profile

JAVA_HOME=/opt/modules/jdk
PATH=$PATH:$JAVA_HOME/bin
export JAVA_HOME PATH

source /etc/profile

[root@hadoop-001 jdk]# which java
/opt/modules/jdk/bin/java
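As a quick sanity check (assuming the extracted JDK really does live at /opt/modules/jdk as configured above), the runtime should now resolve from any shell:

java -version
# should report the downloaded build, e.g. java version "1.8.0_261"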
2) Configure host resolution between the nodes
Change the hostname.
/etc/hosts is the static hostname lookup table, used for fast resolution between IP addresses and hostnames.
Format:
IP  hostname  domain-name  aliases  (one IP may have several names, separated by spaces)
[root@iZ8vbikyhq14lnnku3lyegZ modules]# cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.20.5.0  hadoop01
[root@iZ8vbikyhq14lnnku3lyegZ modules]# hostname
iZ8vbikyhq14lnnku3lyegZ
[root@iZ8vbikyhq14lnnku3lyegZ modules]# hostname hadoop01
[root@iZ8vbikyhq14lnnku3lyegZ modules]# hostname
hadoop01
This is the most error-prone step. On Alibaba Cloud (or any setup with a fixed public IP), the local node's hostname must not be mapped to its public IP here; it has to be the private (intranet) IP.
[root@hadoop-001 jdk]# cat /etc/hosts
127.0.0.1       localhost localhost.localdomain localhost4 localhost4.localdomain4
::1             localhost localhost.localdomain localhost6 localhost6.localdomain6
172.24.67.124   hadoop-001 hadoop-001
39.100.128.158  hadoop-002 hadoop-002
39.100.146.104  hadoop-003 hadoop-003
[root@hadoop-001 jdk]# ping hadoop-003
PING hadoop-003 (39.100.146.104) 56(84) bytes of data.
64 bytes from hadoop-003 (39.100.146.104): icmp_seq=1 ttl=64 time=0.266 ms
64 bytes from hadoop-003 (39.100.146.104): icmp_seq=2 ttl=64 time=0.203 ms
The new configuration (private IPs for all nodes):
[hadoop@hadoop01 ~]$ cat /etc/hosts
127.0.0.1   localhost localhost.localdomain localhost4 localhost4.localdomain4
::1         localhost localhost.localdomain localhost6 localhost6.localdomain6
172.20.5.0  hadoop01
172.20.5.1  hadoop02
172.20.5.2  hadoop03
3) Create the hadoop user
[root@hadoop-001 jdk]# useradd hadoop
[root@hadoop-001 jdk]# passwd hadoop
Changing password for user hadoop.
New password:
BAD PASSWORD: it is based on a dictionary word
BAD PASSWORD: is too simple
Retype new password:
passwd: all authentication tokens updated successfully.
[root@hadoop-001 jdk]# chown -R hadoop:hadoop /opt/modules/
[root@hadoop-001 jdk]# ll /opt/
total 8
drwxr-xr-x 3 hadoop hadoop 4096 Sep 16 14:04 modules
drwxr-xr-x 2 root   root   4096 Sep 16 11:35 soft
Grant sudo privileges:
[root@hadoop-001 jdk]# vi /etc/sudoers
 90 ## Allow root to run any commands anywhere
 91 root    ALL=(ALL)       ALL
 92 hadoop  ALL=(ALL)       NOPASSWD: ALL
[root@hadoop-001 jdk]# su - hadoop
[hadoop@hadoop-001 ~]$ sudo mkdir /opt/test
[hadoop@hadoop-001 ~]$ ll /opt/
total 12
drwxr-xr-x 3 hadoop hadoop 4096 Sep 16 14:04 modules
drwxr-xr-x 2 root   root   4096 Sep 16 11:35 soft
drwxr-xr-x 2 root   root   4096 Sep 16 14:23 test

[hadoop@hadoop-001 soft]$ sudo chown -R hadoop:hadoop /opt/soft/
[hadoop@hadoop-001 soft]$ ll /opt/
total 12
drwxr-xr-x 3 hadoop hadoop 4096 Sep 16 14:04 modules
drwxr-xr-x 2 hadoop hadoop 4096 Sep 16 14:34 soft
drwxr-xr-x 2 root   root   4096 Sep 16 14:23 test
2.2 Install Hadoop
1) Upload and extract
[hadoop@hadoop-001 soft]$ tar -zxvf hadoop-2.7.2.tar.gz -C ../modules/
[hadoop@hadoop-001 hadoop-2.7.2]$ sudo vi /etc/profile

HADOOP_HOME=/opt/modules/hadoop-2.7.2
PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
export JAVA_HOME PATH HADOOP_HOME

[hadoop@hadoop-001 hadoop-2.7.2]$ source /etc/profile
[hadoop@hadoop-001 hadoop-2.7.2]$ hadoop version
Hadoop 2.7.2
Subversion Unknown -r Unknown
Compiled by root on 2017-05-22T10:49Z
Compiled with protoc 2.5.0
From source with checksum d0fda26633fa762bff87ec759ebe689c
This command was run using /opt/modules/hadoop-2.7.2/share/hadoop/common/hadoop-common-2.7.2.jar
2.3 Cluster installation prerequisites
1) Passwordless SSH among the three machines (001 must be able to log in to 002 and 003 without a password)
First, set it up for the hadoop user:
[hadoop@hadoop-001 ~]$ ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/home/hadoop/.ssh/id_rsa):
Created directory '/home/hadoop/.ssh'.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/hadoop/.ssh/id_rsa.
Your public key has been saved in /home/hadoop/.ssh/id_rsa.pub.
The key fingerprint is:
aa:ad:73:ca:f5:ab:a9:a6:99:a4:29:93:28:84:f4:39 hadoop@hadoop-001
The key's randomart image is:
(randomart image not reproduced here)

[hadoop@hadoop-001 ~]$ ssh-copy-id hadoop-001
The authenticity of host 'hadoop-001 (172.24.67.124)' can't be established.
RSA key fingerprint is ae:64:82:22:d7:6c:cb:ab:5d:d8:04:8c:fe:16:a7:ea.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-001,172.24.67.124' (RSA) to the list of known hosts.
hadoop@hadoop-001's password:
Now try logging into the machine, with "ssh 'hadoop-001'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[hadoop@hadoop-001 ~]$ ssh-copy-id hadoop-002
The authenticity of host 'hadoop-002 (39.100.128.158)' can't be established.
RSA key fingerprint is ea:6f:0d:c6:43:18:36:34:e7:d3:c7:2b:c2:3a:17:5b.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-002,39.100.128.158' (RSA) to the list of known hosts.
hadoop@hadoop-002's password:
Now try logging into the machine, with "ssh 'hadoop-002'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[hadoop@hadoop-001 ~]$ ssh-copy-id hadoop-003
The authenticity of host 'hadoop-003 (39.100.146.104)' can't be established.
RSA key fingerprint is ae:64:82:22:d7:6c:cb:ab:5d:d8:04:8c:fe:16:a7:ea.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop-003,39.100.146.104' (RSA) to the list of known hosts.
Now try logging into the machine, with "ssh 'hadoop-003'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[hadoop@hadoop-001 ~]$ ssh hadoop-002 ls /opt
Then copy the key to the root account on each node as well, so the hadoop user can also ssh in as root:
[hadoop@hadoop-001 bin]$ ssh-copy-id root@hadoop-001
root@hadoop-001's password:
Now try logging into the machine, with "ssh 'root@hadoop-001'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[hadoop@hadoop-001 bin]$ ssh-copy-id root@hadoop-002
root@hadoop-002's password:
Now try logging into the machine, with "ssh 'root@hadoop-002'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[hadoop@hadoop-001 bin]$ ssh-copy-id root@hadoop-003
root@hadoop-003's password:
Now try logging into the machine, with "ssh 'root@hadoop-003'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
Switch to the root user and set up passwordless login for root as well (some operations are simply more convenient as root):
[root@hadoop01 ~]# ssh-keygen -t rsa
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
de:9f:57:a5:41:ea:2e:6b:cc:e0:44:4f:69:74:18:e2 root@hadoop01
The key's randomart image is:
(randomart image not reproduced here)

[root@hadoop01 ~]# ssh-copy-id hadoop01
The authenticity of host 'hadoop01 (172.20.5.0)' can't be established.
RSA key fingerprint is 55:c0:b5:f5:65:73:f7:76:c5:4b:87:c9:0c:97:df:c7.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop01,172.20.5.0' (RSA) to the list of known hosts.
root@hadoop01's password:
Now try logging into the machine, with "ssh 'hadoop01'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[root@hadoop01 ~]# ssh-copy-id hadoop02
The authenticity of host 'hadoop02 (39.99.132.211)' can't be established.
RSA key fingerprint is 9d:13:8f:31:29:9b:11:b3:ef:ac:4e:3f:ec:da:6e:6f.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop02,39.99.132.211' (RSA) to the list of known hosts.
root@hadoop02's password:
Now try logging into the machine, with "ssh 'hadoop02'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.

[root@hadoop01 ~]# ssh-copy-id hadoop03
The authenticity of host 'hadoop03 (39.99.129.254)' can't be established.
RSA key fingerprint is 3b:84:38:db:ba:02:a9:5a:ed:b8:b6:84:ba:8a:c6:4a.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'hadoop03,39.99.129.254' (RSA) to the list of known hosts.
root@hadoop03's password:
Now try logging into the machine, with "ssh 'hadoop03'", and check in:
  .ssh/authorized_keys
to make sure we haven't added extra keys that you weren't expecting.
2) Write a file-distribution script, xsync
#!/bin/bash
# check that exactly one argument was passed
if (($# != 1))
then
    echo "please pass exactly one argument"
    exit
fi
# resolve the absolute directory and the file name
dirname=$(cd `dirname $1`; pwd -P)
filename=$(basename $1)
echo "the file to distribute is $dirname/$filename"
for ((i=2; i<4; i++))
do
    echo ------hadoop-00$i---------
    rsync -rvlt $dirname/$filename hadoop@hadoop-00$i:$dirname
done
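Once the script is executable and on the PATH (set up below), distributing a file is a single command. A hypothetical invocation, just to illustrate the behaviour (hi.txt is a placeholder file name):

xsync /home/hadoop/hi.txt
# prints "the file to distribute is /home/hadoop/hi.txt"
# then rsyncs it to hadoop-002 and hadoop-003 under the same absolute path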
3) Make the environment variables in /etc/profile take effect in non-login shells. Commands run over ssh (as xsync and xcall do) start non-login shells that do not read /etc/profile, so source it from ~/.bashrc:
[hadoop@hadoop-001 ~]$ vi .bashrc
[hadoop@hadoop-001 ~]$ tail .bashrc
# .bashrc

# Source global definitions
if [ -f /etc/bashrc ]; then
        . /etc/bashrc
fi

# User specific aliases and functions
source /etc/profile
4) Write a script that runs a command on every node (xcall)
[hadoop@hadoop-001 bin]$ cat xcall
#!/bin/bash
# make sure a command was passed in
if (($# == 0))
then
    echo "please provide a command!!!"
    exit
fi
# run the command on every node
for ((i=1; i<4; i++))
do
    echo -----------hadoop-00$i------------
    ssh hadoop-00$i $*
done
Because /home/hadoop/bin is on the PATH, placing the scripts there lets them be run from anywhere; make them executable with chmod u+x.
[hadoop@hadoop01 ~]$ mkdir bin
[hadoop@hadoop01 ~]$ mv xcall xsync ./bin/
[hadoop@hadoop01 ~]$ cd bin/
[hadoop@hadoop01 bin]$ chmod u+x *
[hadoop@hadoop01 bin]$ ll
total 8
-rwxrw-r-- 1 hadoop hadoop 209 Oct 14 15:49 xcall
-rwxrw-r-- 1 hadoop hadoop 388 Oct 14 15:48 xsync
Copy the two scripts over for the root user as well:
[root@hadoop01 ~]# cp /home/hadoop/bin/* /usr/local/bin/
[root@hadoop01 ~]# ll /usr/local/bin/
total 8
-rwxr--r-- 1 root root 205 Oct 14 16:35 xcall
-rwxr--r-- 1 root root 386 Oct 14 16:35 xsync
Edit xsync so that it syncs as the root user:
[root@hadoop01 ~]# cat /usr/local/bin/xsync
#!/bin/bash
# check that exactly one argument was passed
if (($# != 1))
then
    echo "please pass exactly one argument"
    exit
fi
# resolve the absolute directory and the file name
dirname=$(cd `dirname $1`; pwd -P)
filename=`basename $1`
echo "the file to distribute is $dirname/$filename"
for ((i=2; i<4; i++))
do
    echo ------hadoop0$i---------
    rsync -rvlt $dirname/$filename root@hadoop0$i:$dirname
done
[hadoop@hadoop-001 bin]$ echo $PATH
/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin:/opt/modules/jdk/bin:/opt/modules/hadoop-2.7.2/bin:/opt/modules/hadoop-2.7.2/sbin:/home/hadoop/bin
2.4 Planning
Process planning: spread the core daemons, and daemons of the same kind, across the nodes as evenly as possible.
|      | Hadoop-001         | Hadoop-002                   | Hadoop-003                  |
|------|--------------------|------------------------------|-----------------------------|
| HDFS | NameNode, DataNode | DataNode                     | SecondaryNameNode, DataNode |
| YARN | NodeManager        | ResourceManager, NodeManager | NodeManager                 |
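One detail implied by this layout but only mentioned at the very end (group-start scripts): since every node runs a DataNode and a NodeManager, the $HADOOP_HOME/etc/hadoop/slaves file read by start-dfs.sh/start-yarn.sh would simply list all three hosts. A sketch, using the hadoop01/02/03 hostnames from the corrected /etc/hosts:

hadoop01
hadoop02
hadoop03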
2.5 Edit the configuration files
1)core-site.xml
[hadoop@hadoop-001 hadoop]$ pwd
/opt/modules/hadoop-2.7.2/etc/hadoop
[hadoop@hadoop-001 hadoop]$ vi core-site.xml
[hadoop@hadoop-001 hadoop]$ tail -20 core-site.xml
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->

<!-- Put site-specific property overrides in this file. -->

<configuration>
    <!-- Address of the HDFS NameNode -->
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://hadoop-001:9000</value>
    </property>
    <!-- Directory for files Hadoop generates at runtime -->
    <property>
        <name>hadoop.tmp.dir</name>
        <value>/opt/modules/hadoop-2.7.2/data/tmp</value>
    </property>
</configuration>

[hadoop@hadoop-001 hadoop]$ cd ..
[hadoop@hadoop-001 etc]$ cd ..
[hadoop@hadoop-001 hadoop-2.7.2]$ ll
total 52
drwxr-xr-x 2 hadoop hadoop  4096 May 22  2017 bin
drwxr-xr-x 3 hadoop hadoop  4096 May 22  2017 etc
drwxr-xr-x 2 hadoop hadoop  4096 May 22  2017 include
drwxr-xr-x 3 hadoop hadoop  4096 May 22  2017 lib
drwxr-xr-x 2 hadoop hadoop  4096 May 22  2017 libexec
-rw-r--r-- 1 hadoop hadoop 15429 May 22  2017 LICENSE.txt
-rw-r--r-- 1 hadoop hadoop   101 May 22  2017 NOTICE.txt
-rw-r--r-- 1 hadoop hadoop  1366 May 22  2017 README.txt
drwxr-xr-x 2 hadoop hadoop  4096 May 22  2017 sbin
drwxr-xr-x 4 hadoop hadoop  4096 May 22  2017 share
[hadoop@hadoop-001 hadoop-2.7.2]$ mkdir data
[hadoop@hadoop-001 hadoop-2.7.2]$ mkdir -p data/tmp
2) Sync to the other nodes
Environment variables:
[hadoop@hadoop01 ~]$ xsync .bashrc
the file to distribute is /home/hadoop/.bashrc
------hadoop02---------
sending incremental file list
.bashrc

sent 217 bytes  received 37 bytes  508.00 bytes/sec
total size is 144  speedup is 0.57
------hadoop03---------
sending incremental file list
.bashrc

sent 217 bytes  received 37 bytes  508.00 bytes/sec
total size is 144  speedup is 0.57

[root@hadoop01 ~]# xsync /etc/profile
the file to distribute is /etc/profile
------hadoop02---------
sending incremental file list
profile

sent 666 bytes  received 49 bytes  1430.00 bytes/sec
total size is 1985  speedup is 2.78
------hadoop03---------
sending incremental file list
profile

sent 666 bytes  received 49 bytes  1430.00 bytes/sec
total size is 1985  speedup is 2.78
Hadoop and JDK files:
xsync modules/
(1) If this is the first time the cluster is started, format the NameNode:
hadoop namenode -format
[hadoop@hadoop-001 tmp]$ tree
.
└── dfs
    └── name
        └── current
            ├── edits_0000000000000000001-0000000000000000002
            ├── edits_0000000000000000003-0000000000000000004
            ├── fsimage_0000000000000000000
            ├── fsimage_0000000000000000000.md5
            ├── seen_txid
            └── VERSION

3 directories, 6 files
(2) Start the NameNode on 001
cd /opt/modules/hadoop-2.7.2/sbin/
[hadoop@hadoop-001 bin]$ hadoop-daemon.sh start namenode
starting namenode, logging to /opt/modules/hadoop-2.7.2/logs/hadoop-hadoop-namenode-hadoop-001.out

Check the log:

2020-09-17 22:01:34,101 ERROR org.apache.hadoop.hdfs.server.namenode.NameNode: Failed to start namenode.
java.net.BindException: Problem binding to [hadoop-001:9000] java.net.BindException: Cannot assign requested address; For more details see: http://wiki.apache.org/hadoop/BindException
        at sun.reflect.NativeConstructorAccessorImpl.newInstance0(Native Method)
        at sun.reflect.NativeConstructorAccessorImpl.newInstance(NativeConstructorAccessorImpl.java:62)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
        at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
        at org.apache.hadoop.net.NetUtils.wrapWithMessage(NetUtils.java:792)
        at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:721)
        at org.apache.hadoop.ipc.Server.bind(Server.java:425)
        at org.apache.hadoop.ipc.Server$Listener.<init>(Server.java:574)
        at org.apache.hadoop.ipc.Server.<init>(Server.java:2215)
        at org.apache.hadoop.ipc.RPC$Server.<init>(RPC.java:938)
        at org.apache.hadoop.ipc.ProtobufRpcEngine$Server.<init>(ProtobufRpcEngine.java:534)
        at org.apache.hadoop.ipc.ProtobufRpcEngine.getServer(ProtobufRpcEngine.java:509)
        at org.apache.hadoop.ipc.RPC$Builder.build(RPC.java:783)
        at org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.<init>(NameNodeRpcServer.java:344)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createRpcServer(NameNode.java:673)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.initialize(NameNode.java:646)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:811)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.<init>(NameNode.java:795)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.createNameNode(NameNode.java:1488)
        at org.apache.hadoop.hdfs.server.namenode.NameNode.main(NameNode.java:1554)
Caused by: java.net.BindException: Cannot assign requested address
        at sun.nio.ch.Net.bind0(Native Method)
        at sun.nio.ch.Net.bind(Net.java:444)
        at sun.nio.ch.Net.bind(Net.java:436)
        at sun.nio.ch.ServerSocketChannelImpl.bind(ServerSocketChannelImpl.java:225)
        at sun.nio.ch.ServerSocketAdaptor.bind(ServerSocketAdaptor.java:74)
        at org.apache.hadoop.ipc.Server.bind(Server.java:408)
        ... 13 more
2020-09-17 22:01:34,103 INFO org.apache.hadoop.util.ExitUtil: Exiting with status 1
2020-09-17 22:01:34,104 INFO org.apache.hadoop.hdfs.server.namenode.NameNode: SHUTDOWN_MSG:
/*********************************************************
Root cause: an Alibaba Cloud instance cannot bind to its public IP address. In /etc/hosts, the master (NameNode) node's entry must use the private IP.
Verify after restarting:
Web UI check: the HTTP port that corresponds to RPC port 9000 is 50070; it has to be opened in the Alibaba Cloud security group before the UI is reachable from outside.
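A minimal way to verify from outside, assuming the 50070 rule has been added to the security group (the public IP below is a placeholder):

# run from your local machine; replace the placeholder with hadoop-001's public IP
curl -I http://<hadoop-001-public-ip>:50070
# an HTTP 200 (or a redirect) response means the NameNode web UI is reachable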
(3) Start the DataNodes on the other nodes. They need to know where the NameNode is, so sync the configuration from 001 to 002 and 003 first:
xsync core-site.xml
[hadoop@hadoop-001 hadoop]$ xcall hadoop-daemon.sh start datanode
-----------hadoop-001------------
starting datanode, logging to /opt/modules/hadoop-2.7.2/logs/hadoop-hadoop-datanode-hadoop-001.out
-----------hadoop-002------------
starting datanode, logging to /opt/modules/hadoop-2.7.2/logs/hadoop-hadoop-datanode-hadoop-002.out
-----------hadoop-003------------
starting datanode, logging to /opt/modules/hadoop-2.7.2/logs/hadoop-hadoop-datanode-hadoop-003.out
[hadoop@hadoop-001 hadoop]$ xcall jps
-----------hadoop-001------------
16770 Jps
16686 DataNode
6191 NameNode
-----------hadoop-002------------
12439 DataNode
12509 Jps
-----------hadoop-003------------
14121 DataNode
14191 Jps
Configure the SecondaryNameNode, YARN, and MapReduce.

SecondaryNameNode:

vi /opt/modules/hadoop-2.7.2/etc/hadoop/hdfs-site.xml

<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>hadoop03:50090</value>
    </property>
</configuration>

YARN:

vi yarn-site.xml

<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.resourcemanager.hostname</name>
        <value>hadoop02</value>
    </property>
</configuration>

MapReduce:

mv mapred-site.xml.template mapred-site.xml
vi mapred-site.xml

<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
</configuration>
Sync the configs:
[hadoop@hadoop01 etc]$ xsync hadoop/
the file to distribute is /opt/modules/hadoop-2.7.2/etc/hadoop
------hadoop02---------
sending incremental file list
hadoop/
hadoop/.mapred-site.xml.swp
hadoop/hdfs-site.xml
hadoop/mapred-site.xml
hadoop/yarn-site.xml

sent 15282 bytes  received 110 bytes  30784.00 bytes/sec
total size is 89844  speedup is 5.84
------hadoop03---------
sending incremental file list
hadoop/
hadoop/.mapred-site.xml.swp
hadoop/hdfs-site.xml
hadoop/mapred-site.xml
hadoop/yarn-site.xml

sent 15282 bytes  received 110 bytes  30784.00 bytes/sec
total size is 89844  speedup is 5.84
Start the remaining daemons
Start the SecondaryNameNode on hadoop03:
hadoop-daemon.sh start secondarynamenode
Starting YARN
Start the ResourceManager on hadoop02:
[hadoop@hadoop02 ~]$ yarn-daemon.sh start resourcemanager
starting resourcemanager, logging to /opt/modules/hadoop-2.7.2/logs/yarn-hadoop-resourcemanager-hadoop02.out
[hadoop@hadoop02 ~]$ jps
13688 DataNode
14154 Jps
13932 ResourceManager
Start a NodeManager on every node:
[hadoop@hadoop01 ~]$ xcall yarn-daemon.sh start nodemanager
-----------hadoop01------------
starting nodemanager, logging to /opt/modules/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-hadoop01.out
-----------hadoop02------------
starting nodemanager, logging to /opt/modules/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-hadoop02.out
-----------hadoop03------------
starting nodemanager, logging to /opt/modules/hadoop-2.7.2/logs/yarn-hadoop-nodemanager-hadoop03.out
[hadoop@hadoop01 ~]$ xcall jps
-----------hadoop01------------
4881 NodeManager
4610 NameNode
4722 DataNode
4989 Jps
-----------hadoop02------------
14305 Jps
13688 DataNode
14201 NodeManager
13932 ResourceManager
-----------hadoop03------------
14371 Jps
14070 SecondaryNameNode
14267 NodeManager
13948 DataNode
Test and verify
[hadoop@hadoop01 ~]$ hadoop fs -mkdir /wcinput
[hadoop@hadoop01 ~]$ echo "hi hi hi peter amy study hadoop and yarn">> hi
[hadoop@hadoop01 ~]$ hadoop fs -put hi /wcinput/
[hadoop@hadoop01 ~]$ cd $HADOOP_HOME
[hadoop@hadoop01 hadoop-2.7.2]$ cd share/hadoop/mapreduce/
[hadoop@hadoop01 mapreduce]$ ls
hadoop-mapreduce-client-app-2.7.2.jar     hadoop-mapreduce-client-hs-2.7.2.jar          hadoop-mapreduce-client-jobclient-2.7.2-tests.jar  lib
hadoop-mapreduce-client-common-2.7.2.jar  hadoop-mapreduce-client-hs-plugins-2.7.2.jar  hadoop-mapreduce-client-shuffle-2.7.2.jar          lib-examples
hadoop-mapreduce-client-core-2.7.2.jar    hadoop-mapreduce-client-jobclient-2.7.2.jar   hadoop-mapreduce-examples-2.7.2.jar                sources
[hadoop@hadoop01 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.7.2.jar wordcount /wcinput /wcoutput
20/10/15 10:35:47 INFO client.RMProxy: Connecting to ResourceManager at hadoop02/172.20.5.1:8032
20/10/15 10:35:48 INFO input.FileInputFormat: Total input paths to process : 1
20/10/15 10:35:49 INFO mapreduce.JobSubmitter: number of splits:1
20/10/15 10:35:49 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1602728522305_0001
20/10/15 10:35:49 INFO impl.YarnClientImpl: Submitted application application_1602728522305_0001
20/10/15 10:35:49 INFO mapreduce.Job: The url to track the job: http://hadoop02:8088/proxy/application_1602728522305_0001/
20/10/15 10:35:49 INFO mapreduce.Job: Running job: job_1602728522305_0001
20/10/15 10:35:58 INFO mapreduce.Job: Job job_1602728522305_0001 running in uber mode : false
20/10/15 10:35:58 INFO mapreduce.Job:  map 0% reduce 0%
20/10/15 10:36:05 INFO mapreduce.Job:  map 100% reduce 0%
20/10/15 10:36:12 INFO mapreduce.Job:  map 100% reduce 100%
20/10/15 10:36:12 INFO mapreduce.Job: Job job_1602728522305_0001 completed successfully
20/10/15 10:36:13 INFO mapreduce.Job: Counters: 49
        File System Counters
                FILE: Number of bytes read=83
                FILE: Number of bytes written=235065
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=137
                HDFS: Number of bytes written=49
                HDFS: Number of read operations=6
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=2
        Job Counters
                Launched map tasks=1
                Launched reduce tasks=1
                Data-local map tasks=1
                Total time spent by all maps in occupied slots (ms)=4273
                Total time spent by all reduces in occupied slots (ms)=4884
                Total time spent by all map tasks (ms)=4273
                Total time spent by all reduce tasks (ms)=4884
                Total vcore-milliseconds taken by all map tasks=4273
                Total vcore-milliseconds taken by all reduce tasks=4884
                Total megabyte-milliseconds taken by all map tasks=4375552
                Total megabyte-milliseconds taken by all reduce tasks=5001216
        Map-Reduce Framework
                Map input records=1
                Map output records=9
                Map output bytes=77
                Map output materialized bytes=83
                Input split bytes=96
                Combine input records=9
                Combine output records=7
                Reduce input groups=7
                Reduce shuffle bytes=83
                Reduce input records=7
                Reduce output records=7
                Spilled Records=14
                Shuffled Maps =1
                Failed Shuffles=0
                Merged Map outputs=1
                GC time elapsed (ms)=125
                CPU time spent (ms)=940
                Physical memory (bytes) snapshot=308670464
                Virtual memory (bytes) snapshot=4131770368
                Total committed heap usage (bytes)=170004480
        Shuffle Errors
                BAD_ID=0
                CONNECTION=0
                IO_ERROR=0
                WRONG_LENGTH=0
                WRONG_MAP=0
                WRONG_REDUCE=0
        File Input Format Counters
                Bytes Read=41
        File Output Format Counters
                Bytes Written=49
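To inspect the actual word counts, read the job's output file (part-r-00000 is the default name for a single-reducer job). A sketch; the counts below simply follow from the sample line put into /wcinput above:

[hadoop@hadoop01 mapreduce]$ hadoop fs -cat /wcoutput/part-r-00000
amy     1
and     1
hadoop  1
hi      3
peter   1
study   1
yarn    1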
Group-start scripts: they read $HADOOP_HOME/etc/hadoop/slaves to get the hostnames of all nodes and start the daemons over ssh (which is why passwordless login and the `source /etc/profile` in .bashrc matter).
If start-yarn.sh is run on a machine that is not the ResourceManager node, it will not start the ResourceManager.
So it is recommended to run the YARN group-start script on the RM node; see the sketch below.
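A minimal sketch of the group-start flow under this layout, assuming the slaves file lists hadoop01/02/03 as noted in section 2.4:

# on hadoop01: starts the NameNode, every DataNode listed in slaves, and the SecondaryNameNode
start-dfs.sh

# on hadoop02 (the RM node): starts the ResourceManager plus a NodeManager on every node in slaves
start-yarn.sh

# verify from any node
xcall jps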