【Hadoop】2. Platform Deployment (Standalone)
I: Configure the Environment
1. Configure the basic environment
Configure the network
[root@wangzhigang ~]# vim /etc/sysconfig/network-scripts/ifcfg-ens33
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
IPV6_ADDR_GEN_MODE=stable-privacy
NAME=ens33
UUID=da1a701d-8cee-4e1d-9423-56280232e595
DEVICE=ens33
ONBOOT=yes
IPADDR=192.168.100.10
PREFIX=24
GATEWAY=192.168.100.2
DNS1=114.114.114.114
[root@wangzhigang ~]# ip address show
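Before relying on the new address, the file can be sanity-checked for the keys a static setup requires. A minimal sketch (demoed against a scratch copy, /tmp/ifcfg.demo, rather than the live /etc/sysconfig/network-scripts/ifcfg-ens33):

```shell
# check_ifcfg FILE: report any key a static configuration needs but FILE lacks.
check_ifcfg() {
    f=$1
    ok=0
    for key in 'BOOTPROTO=static' 'ONBOOT=yes' 'IPADDR=' 'GATEWAY='; do
        grep -q "^${key}" "$f" || { echo "missing: $key"; ok=1; }
    done
    return $ok
}

# Demo against a scratch copy holding the same settings as above:
printf 'BOOTPROTO=static\nONBOOT=yes\nIPADDR=192.168.100.10\nGATEWAY=192.168.100.2\n' > /tmp/ifcfg.demo
check_ifcfg /tmp/ifcfg.demo && echo "ifcfg OK"
```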
Set the hostname
[root@wangzhigang ~]# hostnamectl set-hostname wangzhigang
[root@wangzhigang ~]# bash
[root@wangzhigang ~]# hostname
wangzhigang
Bind the hostname to the IP address
(when the IP address changes, only this one binding file needs updating, instead of many separate configuration files)
[root@wangzhigang ~]# vim /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.100.10 wangzhigang
[root@wangzhigang ~]# ping wangzhigang
PING wangzhigang (192.168.100.10) 56(84) bytes of data.
64 bytes from wangzhigang (192.168.100.10): icmp_seq=1 ttl=64 time=0.092 ms
64 bytes from wangzhigang (192.168.100.10): icmp_seq=2 ttl=64 time=0.030 ms
64 bytes from wangzhigang (192.168.100.10): icmp_seq=3 ttl=64 time=0.029 ms
^C
--- wangzhigang ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2000ms
rtt min/avg/max/mdev = 0.029/0.050/0.092/0.030 ms
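The binding step can also be scripted so that rerunning it never duplicates the entry. A sketch that writes to a scratch file, /tmp/hosts.demo; substitute /etc/hosts on the real machine:

```shell
HOSTS_FILE=/tmp/hosts.demo            # use /etc/hosts on the real machine
ENTRY='192.168.100.10 wangzhigang'

touch "$HOSTS_FILE"
# Append only when the exact line is absent, so the script is safe to rerun.
grep -qxF "$ENTRY" "$HOSTS_FILE" || echo "$ENTRY" >> "$HOSTS_FILE"
grep -qxF "$ENTRY" "$HOSTS_FILE" || echo "$ENTRY" >> "$HOSTS_FILE"
grep -c 'wangzhigang' "$HOSTS_FILE"   # prints 1: the second run added nothing
```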
Check the SSH service
(in a distributed Hadoop environment, the cluster nodes communicate with each other over SSH)
[root@wangzhigang ~]# systemctl status sshd
Stop the firewall
(Hadoop is managed through web pages, which are unreachable while the firewall is running; to keep it off after a reboot, also run systemctl disable firewalld)
[root@wangzhigang ~]# systemctl stop firewalld
Create the hadoop user
(install the Hadoop runtime environment as root, then run Hadoop as the hadoop user, to guard against accidental damage)
[root@wangzhigang ~]# useradd hadoop
[root@wangzhigang ~]# passwd hadoop
2. Install the Java environment
Note: connect via SecureCRT and transfer the archives jdk-8u152-linux-x64.tar.gz and hadoop-2.7.1.tar.gz to the /root directory.
Extract the archive
[root@wangzhigang ~]# tar -zxvf jdk-8u152-linux-x64.tar.gz -C /usr/local/src
[root@wangzhigang ~]# ll /usr/local/src
total 0
drwxr-xr-x. 8 hadoop hadoop 255 Sep 14 2017 jdk1.8.0_152
Set the Java environment variables
(settings in /etc/profile take effect system-wide)
[root@wangzhigang ~]# vi /etc/profile
Append:
export JAVA_HOME=/usr/local/src/jdk1.8.0_152 # JAVA_HOME points to the JDK installation directory
export PATH=$PATH:$JAVA_HOME/bin # add the JDK bin directory to PATH
Run source to apply the settings
[root@wangzhigang ~]# source /etc/profile
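The same two lines can be appended non-interactively with a quoted heredoc (the quoted delimiter keeps $PATH and $JAVA_HOME literal in the file). A sketch against a scratch file, /tmp/profile.demo; on the real machine the target is /etc/profile:

```shell
PROFILE=/tmp/profile.demo             # use /etc/profile on the real machine

# 'EOF' in quotes prevents the shell from expanding $PATH/$JAVA_HOME here.
cat >> "$PROFILE" <<'EOF'
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
export PATH=$PATH:$JAVA_HOME/bin
EOF

. "$PROFILE"                          # same effect as: source /etc/profile
echo "$JAVA_HOME"                     # prints /usr/local/src/jdk1.8.0_152
```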
Check that Java is available
[root@wangzhigang ~]# echo $JAVA_HOME
/usr/local/src/jdk1.8.0_152
# JAVA_HOME now points to the JDK installation directory
[root@wangzhigang ~]# java -version
java version "1.8.0_152"
Java(TM) SE Runtime Environment (build 1.8.0_152-b11)
Java HotSpot(TM) 64-Bit Server VM (build 25.152-b11, mixed mode)
If the Java version is displayed, the JDK is installed and configured correctly.
II: Install the Hadoop Software
Extract the archive
[root@wangzhigang ~]# tar -zxvf hadoop-2.7.1.tar.gz -C /usr/local/src/
[root@wangzhigang ~]# ll /usr/local/src/
total 0
drwxr-xr-x. 9 hadoop hadoop 149 Jun 29 2015 hadoop-2.7.1
drwxr-xr-x. 8 hadoop hadoop 255 Sep 14 2017 jdk1.8.0_152
[root@wangzhigang ~]# ll /usr/local/src/hadoop-2.7.1/
total 28
drwxr-xr-x. 2 hadoop hadoop 194 Jun 29 2015 bin
drwxr-xr-x. 3 hadoop hadoop 20 Jun 29 2015 etc
drwxr-xr-x. 2 hadoop hadoop 106 Jun 29 2015 include
drwxr-xr-x. 3 hadoop hadoop 20 Jun 29 2015 lib
drwxr-xr-x. 2 hadoop hadoop 239 Jun 29 2015 libexec
-rw-r--r--. 1 hadoop hadoop 15429 Jun 29 2015 LICENSE.txt
-rw-r--r--. 1 hadoop hadoop 101 Jun 29 2015 NOTICE.txt
-rw-r--r--. 1 hadoop hadoop 1366 Jun 29 2015 README.txt
drwxr-xr-x. 2 hadoop hadoop 4096 Jun 29 2015 sbin
drwxr-xr-x. 4 hadoop hadoop 31 Jun 29 2015 share
Configure the Hadoop environment variables
Edit /etc/profile again, just as when setting the Java environment variables
[root@wangzhigang ~]# vi /etc/profile
Append:
export HADOOP_HOME=/usr/local/src/hadoop-2.7.1 # HADOOP_HOME points to the Hadoop installation directory
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Run source to apply the settings:
[root@wangzhigang ~]# source /etc/profile
Check that the settings took effect:
[root@wangzhigang ~]# hadoop
Usage: hadoop [--config confdir] [COMMAND | CLASSNAME]
  CLASSNAME            run the class named CLASSNAME
 or
  where COMMAND is one of:
  fs                   run a generic filesystem user client
  version              print the version
  jar <jar>            run a jar file
                       note: please use "yarn jar" to launch
                             YARN applications, not this command.
  checknative [-a|-h]  check native hadoop and compression libraries availability
  distcp <srcurl> <desturl> copy file or directories recursively
  archive -archiveName NAME -p <parent path> <src>* <dest> create a hadoop archive
  classpath            prints the class path needed to get the
                       Hadoop jar and the required libraries
  credential           interact with credential providers
  daemonlog            get/set the log level for each daemon
  trace                view and modify Hadoop tracing settings

Most commands print help when invoked w/o parameters.
Seeing the Hadoop help text confirms the installation.
Change the owner and group of the directories
(so that the hadoop user can run the Hadoop software)
[root@wangzhigang ~]# chown -R hadoop:hadoop /usr/local/src
[root@wangzhigang ~]# ll /usr/local/src/
total 0
drwxr-xr-x. 9 hadoop hadoop 149 Jun 29 2015 hadoop-2.7.1
drwxr-xr-x. 8 hadoop hadoop 255 Sep 14 2017 jdk1.8.0_152
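A quick recursive check that the chown reached every file: list anything under the tree not owned by hadoop; an empty result means the ownership change is complete. A sketch (demoed on a scratch directory owned by the current user, since chown to hadoop requires root):

```shell
# check_owner DIR USER: print every path under DIR whose owner is not USER.
check_owner() {
    find "$1" ! -user "$2"
}

# On the real machine: check_owner /usr/local/src hadoop
mkdir -p /tmp/owner.demo/sub
touch /tmp/owner.demo/sub/file
check_owner /tmp/owner.demo "$(id -un)"   # prints nothing: everything owned correctly
```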
III: Set Up a Standalone Hadoop System
1. Edit the Hadoop configuration file
(this tells Hadoop where the JDK is installed)
[root@wangzhigang ~]# cd /usr/local/src/hadoop-2.7.1/
[root@wangzhigang hadoop-2.7.1]# vi etc/hadoop/hadoop-env.sh
Change:
export JAVA_HOME=/usr/local/src/jdk1.8.0_152
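The edit can also be made non-interactively with sed, replacing whatever JAVA_HOME line the file shipped with. A sketch run against a scratch copy, /tmp/hadoop-env.demo, seeded with the default line from 2.7.1; for real, target etc/hadoop/hadoop-env.sh:

```shell
ENV_SH=/tmp/hadoop-env.demo                        # use etc/hadoop/hadoop-env.sh for real
echo 'export JAVA_HOME=${JAVA_HOME}' > "$ENV_SH"   # the default line in 2.7.1

# Point JAVA_HOME at the actual JDK install, whatever it was set to before.
sed -i 's|^export JAVA_HOME=.*|export JAVA_HOME=/usr/local/src/jdk1.8.0_152|' "$ENV_SH"
grep '^export JAVA_HOME' "$ENV_SH"                 # prints the rewritten line
```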
2. Test Hadoop in local mode
Switch to the hadoop user
[root@wangzhigang ~]# su - hadoop
[hadoop@wangzhigang ~]$
Create a directory for the input data
[hadoop@wangzhigang ~]$ mkdir ~/input
Create the input data file
[hadoop@wangzhigang ~]$ vim ~/input/data.txt
Hello World
Hello Hadoop
Hello Huasan
Hello 王智刚
3. Test a MapReduce run
Run the official WordCount example to count how often each word appears in data.txt
[hadoop@wangzhigang ~]$ hadoop jar /usr/local/src/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount ~/input/data.txt ~/output
The results are saved in the ~/output directory; inspect it after the command finishes
[hadoop@wangzhigang ~]$ ll ~/output
total 4
-rw-r--r--. 1 hadoop hadoop 46 Mar 13 17:15 part-r-00000
-rw-r--r--. 1 hadoop hadoop 0 Mar 13 17:15 _SUCCESS
The _SUCCESS file indicates the job completed successfully
The results themselves are stored in part-r-00000; view that file
[hadoop@wangzhigang ~]$ cat ~/output/part-r-00000
Hadoop 1
Hello 4
Huasan 1
World 1
王智刚 1
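The WordCount result can be cross-checked without Hadoop: the same tally is a short awk script over the input, printing word, a tab, and the count, just like part-r-00000. A sketch using the same four lines, written to a scratch file /tmp/data.txt:

```shell
printf 'Hello World\nHello Hadoop\nHello Huasan\nHello 王智刚\n' > /tmp/data.txt

# Count every whitespace-separated token, then sort by word as WordCount does.
awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
     END { for (w in count) printf "%s\t%d\n", w, count[w] }' /tmp/data.txt \
    | sort | tee /tmp/wc.txt
```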
Note: the output directory must not exist before the job runs, or Hadoop aborts with an error.
To count several input files in one run, pass them all and write to a new directory, output2 (here ~/input/111.txt and ~/input/222.txt are extra files created beforehand, containing 11111 and 22222):
[hadoop@wangzhigang ~]$ hadoop jar /usr/local/src/hadoop-2.7.1/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount ~/input/data.txt ~/input/111.txt ~/input/222.txt ~/output2
[hadoop@wangzhigang ~]$ ll output2/
total 4
-rw-r--r--. 1 hadoop hadoop 58 Mar 14 17:42 part-r-00000
-rw-r--r--. 1 hadoop hadoop 0 Mar 14 17:42 _SUCCESS
[hadoop@wangzhigang ~]$ cat output2/part-r-00000
11111 1
22222 1
Hadoop 1
Hello 4
Huasan 1
World 1
王智刚 1
Notice: reproduction without permission is prohibited