Installing and Configuring Hadoop on a Single Linux Machine
1. Download Hadoop from the official website:
https://hadoop.apache.org/releases.html
2. Run cd /usr/local to change into the /usr/local directory, where Hadoop will be installed:
[root@master ~]# cd /usr/local/
[root@master local]#
3. Upload the downloaded files to the server, into the /usr/local directory:
[root@master tools]# ls
hadoop-3.3.1.tar.gz jdk-8u131-linux-x64.rpm
4. After the upload completes, run ls to confirm the files are there:
[root@master tools]# ls
hadoop-3.3.1.tar.gz jdk-8u131-linux-x64.rpm
5. Run tar -zxvf hadoop-3.3.1.tar.gz to extract the tarball into the current directory:
[root@master local]# tar -zxvf hadoop-3.3.1.tar.gz
6. Run vim /etc/profile and append the Hadoop environment variables:
[root@master hadoop-3.3.1]# vim /etc/profile
#zkm 2022-03-25
export HADOOP_HOME="/usr/local/hadoop-3.3.1"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
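If you script the setup, the same two lines can be appended idempotently instead of editing /etc/profile by hand. A minimal sketch, written against a scratch copy at /tmp/profile-demo so it is safe to run anywhere; point PROFILE at /etc/profile on the real host:

```shell
# Append the Hadoop variables only if they are not already present,
# so running the script twice does not duplicate them.
PROFILE=/tmp/profile-demo          # use /etc/profile on the real server
: > "$PROFILE"                     # start from an empty scratch file for the demo

add_hadoop_env() {
  if ! grep -q 'HADOOP_HOME' "$PROFILE"; then
    cat >> "$PROFILE" <<'EOF'
export HADOOP_HOME="/usr/local/hadoop-3.3.1"
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
EOF
  fi
}

add_hadoop_env
add_hadoop_env                     # second call is a no-op
grep -c 'HADOOP_HOME' "$PROFILE"   # both added lines mention HADOOP_HOME, so this prints 2
```

The grep guard is what makes re-running the script harmless; without it, every run would append another copy of the two exports.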
7. Run source /etc/profile so the changes take effect in the current shell:
[root@master hadoop-3.3.1]# source /etc/profile
8. Run hadoop version to print the Hadoop version and confirm the installation succeeded:
[root@master hadoop-3.3.1]# hadoop version
Hadoop 3.3.1
Source code repository https://github.com/apache/hadoop.git -r a3b9c37a397ad4188041dd80621bdeefc46885f2
Compiled by ubuntu on 2021-06-15T05:13Z
Compiled with protoc 3.7.1
From source with checksum 88a4ddb2299aca054416d6b7f81ca55
This command was run using /usr/local/hadoop-3.3.1/share/hadoop/common/hadoop-common-3.3.1.jar
[root@master hadoop-3.3.1]#
9. Run cd /usr/local/hadoop-3.3.1/etc/hadoop to enter the configuration directory, then list its files with ls:
[root@master local]# cd hadoop-3.3.1/etc/hadoop/
[root@master hadoop]# pwd
/usr/local/hadoop-3.3.1/etc/hadoop
[root@master hadoop]# ls
capacity-scheduler.xml hadoop-policy.xml kms-acls.xml mapred-queues.xml.template yarn-env.cmd
configuration.xsl hadoop-user-functions.sh.example kms-env.sh mapred-site.xml yarn-env.sh
container-executor.cfg hdfs-rbf-site.xml kms-log4j.properties shellprofile.d yarnservice-log4j.properties
core-site.xml hdfs-site.xml kms-site.xml ssl-client.xml.example yarn-site.xml
hadoop-env.cmd httpfs-env.sh log4j.properties ssl-server.xml.example
hadoop-env.sh httpfs-log4j.properties mapred-env.cmd user_ec_policies.xml.template
hadoop-metrics2.properties httpfs-site.xml mapred-env.sh workers
[root@master hadoop]#
10. Run vim hadoop-env.sh to set the runtime environment variables:
[root@master hadoop]# vim hadoop-env.sh
Add the following lines:
#zkm 2022-03-25
export JAVA_HOME=/usr/java/jdk1.8.0_131
export HDFS_NAMENODE_USER=root
export HDFS_DATANODE_USER=root
export HDFS_SECONDARYNAMENODE_USER=root
export YARN_RESOURCEMANAGER_USER=root
export YARN_NODEMANAGER_USER=root
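A quick sanity check after editing hadoop-env.sh is to confirm that the JAVA_HOME you wrote actually points at a JDK; a missing java binary is a common reason the daemons later fail to start. A sketch, run against a scratch copy at /tmp/hadoop-env-demo.sh; on the server, point ENV_FILE at etc/hadoop/hadoop-env.sh instead:

```shell
# Extract JAVA_HOME from the env file and verify that $JAVA_HOME/bin/java exists.
ENV_FILE=/tmp/hadoop-env-demo.sh   # use etc/hadoop/hadoop-env.sh on the server
echo 'export JAVA_HOME=/usr/java/jdk1.8.0_131' > "$ENV_FILE"

JH=$(sed -n 's/^export JAVA_HOME=//p' "$ENV_FILE")
echo "JAVA_HOME is set to: $JH"
if [ -x "$JH/bin/java" ]; then
  echo "java binary found"
else
  echo "WARNING: $JH/bin/java not found -- the daemons will not start"
fi
```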
11. Run vim core-site.xml and add the following properties inside the <configuration> element:
[root@master hadoop]# vim core-site.xml
Add the following:
<!--zkm 2022-03-25-->
<property>
  <name>fs.defaultFS</name>
  <value>hdfs://master:9820</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/usr/local/hadoop-3.3.1/hadoopdata</value>
</property>
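A malformed core-site.xml (for example, a closing tag left out) makes every hdfs command fail with a cryptic parse error, so a quick structural check after editing is worthwhile. A rough grep-based sketch that writes the snippet above into a scratch file at /tmp/core-site-demo.xml; a real validation would use an XML parser such as xmllint:

```shell
# Count opening and closing tags for each element we just added; a mismatch
# usually means a typo that will break Hadoop's XML parser.
CONF=/tmp/core-site-demo.xml       # use etc/hadoop/core-site.xml on the server
cat > "$CONF" <<'EOF'
<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://master:9820</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/local/hadoop-3.3.1/hadoopdata</value>
  </property>
</configuration>
EOF

for tag in configuration property name value; do
  open=$(grep -c "<$tag>" "$CONF")
  close=$(grep -c "</$tag>" "$CONF")
  [ "$open" -eq "$close" ] || echo "mismatched <$tag> tags: $open open, $close close"
done
echo "check finished"
```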
12. Run vim hdfs-site.xml and set the replication factor (1 is sufficient on a single node):
[root@master hadoop]# vim hdfs-site.xml
Add the following:
<!--zkm 2022-03-25-->
<property>
  <name>dfs.replication</name>
  <value>1</value>
</property>
13. Run vim mapred-site.xml and set MapReduce to run on YARN:
[root@master hadoop]# vim mapred-site.xml
Add the following:
<!--zkm 2022-03-25-->
<property>
  <name>mapreduce.framework.name</name>
  <value>yarn</value>
</property>
14. Run vim yarn-site.xml and configure the NodeManager:
[root@master hadoop]# vim yarn-site.xml
Add the following:
<!--zkm 2022-03-25-->
<property>
  <name>yarn.nodemanager.aux-services</name>
  <value>mapreduce_shuffle</value>
</property>
<property>
  <name>yarn.nodemanager.env-whitelist</name>
  <value>JAVA_HOME,HADOOP_COMMON_HOME,HADOOP_HDFS_HOME,HADOOP_CONF_DIR,CLASSPATH_PREPEND_DISTCACHE,HADOOP_YARN_HOME,HADOOP_MAPRED_HOME</value>
</property>
15. Run chmod -R 777 /usr/local/hadoop-3.3.1 to make the whole tree writable; otherwise formatting HDFS later can fail with permission errors. (777 is the quick fix for a single-user test box; on any shared machine, chown -R to the user that runs Hadoop is safer.)
[root@master hadoop-3.3.1]# chmod -R 777 /usr/local/hadoop-3.3.1
Run cd /usr/local/hadoop-3.3.1 to change into the hadoop-3.3.1 directory:
[root@master hadoop-3.3.1]# cd /usr/local/hadoop-3.3.1/
[root@master hadoop-3.3.1]# ll
total 116
drwxrwxrwx 2 lighthouse lighthouse 4096 Jun 15 2021 bin
drwxrwxrwx 3 lighthouse lighthouse 4096 Jun 15 2021 etc
drwxrwxrwx 2 root root 4096 Aug 25 10:27 hadoopdata
drwxrwxrwx 2 lighthouse lighthouse 4096 Jun 15 2021 include
drwxrwxrwx 3 lighthouse lighthouse 4096 Jun 15 2021 lib
drwxrwxrwx 4 lighthouse lighthouse 4096 Jun 15 2021 libexec
-rwxrwxrwx 1 lighthouse lighthouse 23450 Jun 15 2021 LICENSE-binary
drwxrwxrwx 2 lighthouse lighthouse 4096 Jun 15 2021 licenses-binary
-rwxrwxrwx 1 lighthouse lighthouse 15217 Jun 15 2021 LICENSE.txt
-rwxrwxrwx 1 lighthouse lighthouse 29473 Jun 15 2021 NOTICE-binary
-rwxrwxrwx 1 lighthouse lighthouse 1541 May 22 2021 NOTICE.txt
-rwxrwxrwx 1 lighthouse lighthouse 175 May 22 2021 README.txt
drwxrwxrwx 3 lighthouse lighthouse 4096 Jun 15 2021 sbin
drwxrwxrwx 4 lighthouse lighthouse 4096 Jun 15 2021 share
[root@master hadoop-3.3.1]#
Run bin/hdfs namenode -format to format the HDFS filesystem:
[root@master hadoop-3.3.1]# bin/hdfs namenode -format
Note: format HDFS only once. Each format regenerates the NameNode's cluster ID, so repeated formatting can leave the DataNode unable to join; if that happens, delete the /usr/local/hadoop-3.3.1/hadoopdata directory and format again.
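The format-once rule can be enforced with a small guard: on the first format the NameNode creates a current/ directory under its storage path, so its presence means formatting already happened. A hypothetical guard script, demonstrated on a scratch directory /tmp/hadoopdata-demo; on the server, set DATA_DIR=/usr/local/hadoop-3.3.1/hadoopdata and run it from /usr/local/hadoop-3.3.1:

```shell
# Only format HDFS if the NameNode storage directory has never been initialised.
DATA_DIR=/tmp/hadoopdata-demo            # /usr/local/hadoop-3.3.1/hadoopdata on the server
mkdir -p "$DATA_DIR/dfs/name/current"    # demo only: pretend a format already happened

if [ -d "$DATA_DIR/dfs/name/current" ]; then
  echo "HDFS already formatted; skipping (delete $DATA_DIR to re-format)"
else
  bin/hdfs namenode -format
fi
```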
16. Run sbin/start-all.sh to start all Hadoop daemons:
[root@master hadoop-3.3.1]# sbin/start-all.sh
After startup, run jps (a command you will use often from now on) to check which processes are running.
Error:
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password)
Fix: root needs passwordless SSH to itself, because the start/stop scripts launch each daemon over SSH (reference: https://www.jianshu.com/p/181b06293067):
[root@master hadoop]# ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Generating public/private rsa key pair.
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:07u+nJQ2t0HJTUCH0t8YCLOvKi0/+uwAuhsgWWjvAi4 root@master
The key's randomart image is:
+---[RSA 2048]----+
| o+oo. |
| . .o+o. |
|... .. ..+ |
|.o. . o +o .|
|= .. S . = . |
|+... . . = |
|Eoo. .. B o |
|. .o ooo= = o |
| o. .B*o*.. |
+----[SHA256]-----+
[root@master hadoop]# cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
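One detail the transcript does not show: sshd silently ignores authorized_keys if ~/.ssh or the file itself is group- or world-writable, which leaves the same "Permission denied" error in place even after the key is added. The fix is to tighten the permissions, shown here on a scratch directory so it can be run anywhere; apply the same chmod calls to the real ~/.ssh:

```shell
# Demonstrate the permissions sshd expects: 700 on .ssh, 600 on authorized_keys.
SSH_DIR=/tmp/ssh-demo              # use ~/.ssh on the real host
mkdir -p "$SSH_DIR"
touch "$SSH_DIR/authorized_keys"
chmod 700 "$SSH_DIR"
chmod 600 "$SSH_DIR/authorized_keys"
ls -ld "$SSH_DIR" "$SSH_DIR/authorized_keys"
```

After fixing the permissions, `ssh -o BatchMode=yes master true` should exit cleanly without prompting for a password; if it still fails, check /var/log/secure for the reason sshd rejected the key.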
[root@master sbin]# pwd
/usr/local/hadoop-3.3.1/sbin
[root@master sbin]# ./stop-all.sh
Stopping namenodes on [master]
Last login: Thu Aug 25 11:05:25 CST 2022 on pts/3
master: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Stopping datanodes
Last login: Thu Aug 25 11:12:33 CST 2022 on pts/3
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Stopping secondary namenodes [master]
Last login: Thu Aug 25 11:12:33 CST 2022 on pts/3
master: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Stopping nodemanagers
Last login: Thu Aug 25 11:12:34 CST 2022 on pts/3
localhost: Permission denied (publickey,gssapi-keyex,gssapi-with-mic,password).
Stopping resourcemanager
Last login: Thu Aug 25 11:12:36 CST 2022 on pts/3
[root@master sbin]# pwd
/usr/local/hadoop-3.3.1/sbin
[root@master sbin]# ./start-all.sh
Starting namenodes on [master]
Last login: Thu Aug 25 11:12:37 CST 2022 on pts/3
Starting datanodes
Last login: Thu Aug 25 11:15:10 CST 2022 on pts/3
Starting secondary namenodes [master]
Last login: Thu Aug 25 11:15:12 CST 2022 on pts/3
Starting resourcemanager
Last login: Thu Aug 25 11:15:17 CST 2022 on pts/3
Starting nodemanagers
Last login: Thu Aug 25 11:15:23 CST 2022 on pts/3
[root@master sbin]# jps
29827 NameNode
29972 DataNode
31046 Jps
30632 NodeManager
30200 SecondaryNameNode
30489 ResourceManager
[root@master sbin]#
All six daemons are up.
From a browser on the Windows host, open http://IP:8088/cluster (using the VM's IP address) for the ResourceManager UI,
http://IP:9870 for the NameNode UI, and
http://IP:9864 for the DataNode UI.
Note: run stop-all.sh to stop Hadoop. Since the bin and sbin directories were added to PATH in /etc/profile, from now on you can start and stop Hadoop from any directory instead of cd-ing into the installation folder.