Hadoop 3.2.0: Standalone and Pseudo-Distributed Configuration
(1) Install the JDK and Hadoop
1: Download the JDK and Hadoop
https://www.oracle.com/technetwork/java/javase/downloads/index.html
https://hadoop.apache.org/releases.html
jdk-11.0.2_linux-x64_bin.tar.gz
hadoop-3.2.0.tar.gz
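Before extracting, it is worth checking the Hadoop tarball against the SHA-512 digest published with the release. The following is a minimal sketch, assuming GNU coreutils (sha512sum) is installed; verify_sha512 is a hypothetical helper name, and the expected digest has to be copied by hand from the release page:

```shell
# verify_sha512 FILE EXPECTED_HEX
# Succeeds when the SHA-512 digest of FILE equals EXPECTED_HEX (case-insensitive).
verify_sha512() {
  vs_file=$1
  vs_expected=$(printf '%s' "$2" | tr 'A-F' 'a-f')
  vs_actual=$(sha512sum "$vs_file" | awk '{print $1}')
  if [ "$vs_actual" = "$vs_expected" ]; then
    echo "OK: $vs_file"
  else
    echo "MISMATCH: $vs_file" >&2
    return 1
  fi
}

# Usage (digest intentionally left out; copy it from the release page):
# verify_sha512 hadoop-3.2.0.tar.gz "<sha512 from the release page>"
```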
2: Extract and install the JDK and Hadoop
2.1 Install the JDK
tar -xzvf jdk-11.0.2_linux-x64_bin.tar.gz -C /usr/local
mv /usr/local/jdk-11.0.2 /usr/local/jdk
Append the following lines to /etc/profile, then reload it with: source /etc/profile
export JAVA_HOME=/usr/local/jdk
export HADOOP_HOME=/usr/local/hadoop
export PATH=$PATH:$JAVA_HOME/bin
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
java -version
java version "11.0.2" 2019-01-15 LTS
Java(TM) SE Runtime Environment 18.9 (build 11.0.2+9-LTS)
Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.2+9-LTS, mixed mode)
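The /etc/profile edit above can also be scripted so that running it twice does not duplicate the lines. This is only a sketch: append_once and setup_profile are hypothetical helper names, and the paths are the ones used in this post.

```shell
# append_once LINE FILE
# Appends LINE to FILE only if that exact line is not already present.
append_once() {
  grep -qxF -- "$1" "$2" 2>/dev/null || printf '%s\n' "$1" >> "$2"
}

# setup_profile [PROFILE]
# PROFILE defaults to /etc/profile; pass a scratch file to try the sketch safely.
setup_profile() {
  sp_profile=${1:-/etc/profile}
  append_once 'export JAVA_HOME=/usr/local/jdk' "$sp_profile"
  append_once 'export HADOOP_HOME=/usr/local/hadoop' "$sp_profile"
  append_once 'export PATH=$PATH:$JAVA_HOME/bin' "$sp_profile"
  append_once 'export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin' "$sp_profile"
}

# Usage (as root): setup_profile && . /etc/profile
```

The single quotes keep $PATH and $JAVA_HOME unexpanded, so they are evaluated when the profile is loaded rather than when the line is written.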
2.2 Install Hadoop
tar -xzvf hadoop-3.2.0.tar.gz -C /usr/local
cd /usr/local
root@dl1-VirtualBox:/usr/local# ls
bin etc games hadoop-3.2.0 include jdk lib man sbin share src
root@dl1-VirtualBox:/usr/local# mv hadoop-3.2.0 hadoop
root@dl1-VirtualBox:/usr/local# hadoop version
Hadoop 3.2.0
Source code repository https://github.com/apache/hadoop.git -r e97acb3bd8f3befd27418996fa5d4b50bf2e17bf
Compiled by sunilg on 2019-01-08T06:08Z
Compiled with protoc 2.5.0
From source with checksum d3f0795ed0d9dc378e2c785d3668f39
This command was run using /usr/local/hadoop/share/hadoop/common/hadoop-common-3.2.0.jar
At this point, the standalone installation is complete.
(2) Configure passwordless SSH login
1: At first, logging in requires a password
ssh localhost
The authenticity of host 'localhost (127.0.0.1)' can't be established.
ECDSA key fingerprint is SHA256:od6liPOhNmokdAObfCZYGWz8Af3GT8SE/1bjgJ+Usbk.
Are you sure you want to continue connecting (yes/no)? yes
Warning: Permanently added 'localhost' (ECDSA) to the list of known hosts.
root@localhost's password:
Permission denied, please try again.
root@localhost's password:
Permission denied, please try again.
root@localhost's password:
1.1 Generate a key pair (press Enter at each prompt to accept the defaults and an empty passphrase)
ssh-keygen
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /root/.ssh/id_rsa.
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
SHA256:TGIR8PGQ3PZuYnF0TOcbn/CGKeBEhGj2umV+pSwPI0E root@dl1-VirtualBox
The key's randomart image is:
+---[RSA 2048]----+
| .o*+oo o. . |
| =o*+ . oo |
| oE=.o= . .o |
| .. =+ + =+.|
| .. S= . o.+.|
| ..oo o.. . |
| .=+.oo |
| ..oo+ |
| +. |
+----[SHA256]-----+
1.3 Install the public key
root@dl1-VirtualBox:~# ls .ssh/
id_rsa id_rsa.pub known_hosts
root@dl1-VirtualBox:~/.ssh# ls
id_rsa id_rsa.pub known_hosts
root@dl1-VirtualBox:~/.ssh# cat id_rsa.pub > authorized_keys
root@dl1-VirtualBox:~/.ssh# ls
authorized_keys id_rsa id_rsa.pub known_hosts
root@dl1-VirtualBox:~/.ssh# cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDnrNKRojUnocNCWr3ZXWA9MP75B4vbvUq0zwxXJdL20x0Lr9XpSea996wx4xZ/ooX9XMTm/jPE53L5EQVrbjoQNq5oYo4ktlw3GO7oTAgBMO3M4IV8+rhOrv5hIqWgBMLzjjV9kL4s1wKvjs4JKz4yBpvM8s2Qwc8rNzpx4nyvL7EmVfp7v03CXJBIpPqq8jlg57CxMYHESe1G3CGwMIV06vbQxn0iRn/5zptuBNTTZz2nIxwT/GQjVetsefw0SmynEQfj3b2WyzAQnGPwRoYm6p2XohzR1e1MJ8D577VIHH8YvtPbsON5J2SH+jTxMlv0iLbHuq8GOFj0Y+2XSWUr root@dl1-VirtualBox
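The key generation and authorized_keys steps can be done without any prompts. A minimal sketch, assuming OpenSSH's ssh-keygen is installed; make_key and authorize_key are hypothetical helper names. It appends rather than overwrites, so any keys already in authorized_keys survive, and it tightens permissions because sshd may ignore a key file that other users can write to.

```shell
# make_key KEYFILE
# Creates an RSA key pair with an empty passphrase unless KEYFILE already exists.
make_key() {
  [ -f "$1" ] || ssh-keygen -t rsa -N '' -f "$1" -q
}

# authorize_key PUBFILE AUTHFILE
# Appends the public key to AUTHFILE once and restricts permissions.
authorize_key() {
  ak_key=$(cat "$1") || return 1
  mkdir -p "$(dirname "$2")"
  touch "$2"
  grep -qxF -- "$ak_key" "$2" || printf '%s\n' "$ak_key" >> "$2"
  chmod 700 "$(dirname "$2")"
  chmod 600 "$2"
}

# Usage:
# make_key ~/.ssh/id_rsa
# authorize_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys
```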
1.4 Configuration succeeded
root@dl1-VirtualBox:~/.ssh# ssh localhost
Welcome to Ubuntu 16.04.4 LTS (GNU/Linux 4.15.0-46-generic x86_64)
* Documentation: https://help.ubuntu.com
* Management: https://landscape.canonical.com
* Support: https://ubuntu.com/advantage
189 packages can be updated.
0 updates are security updates.
New release '18.04.2 LTS' available.
Run 'do-release-upgrade' to upgrade to it.
Last login: Thu Mar 7 09:48:51 2019 from 127.0.0.1
(3) Configure the Hadoop files
Edit etc/hadoop/hadoop-env.sh to define some parameters as follows (in this setup the JDK is in /usr/local/jdk, so use that path):
# set to the root of your Java installation
export JAVA_HOME=/usr/java/latest
(4) Start Hadoop in pseudo-distributed mode
1: Start HDFS
1.1 Configure core-site.xml and hdfs-site.xml
etc/hadoop/core-site.xml:
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
etc/hadoop/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/hadoop/tmp/dfs/data</value>
  </property>
</configuration>
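The configuration files can also be written from a script and then read back as a quick sanity check. The sketch below does this for core-site.xml only; the same approach would apply to hdfs-site.xml. write_core_site and get_property are hypothetical helper names, and get_property is a naive line-based reader that assumes each <name> and <value> sits on its own line, not a real XML parser.

```shell
# write_core_site FILE
# Writes the core-site.xml shown above to FILE.
write_core_site() {
  cat > "$1" <<'EOF'
<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>file:/usr/local/hadoop/tmp</value>
    <description>Abase for other temporary directories.</description>
  </property>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>
EOF
}

# get_property FILE NAME
# Prints the first <value> found after the line containing <name>NAME</name>.
get_property() {
  awk -v name="$2" '
    index($0, "<name>" name "</name>") { found = 1; next }
    found && match($0, /<value>[^<]*<\/value>/) {
      # skip the 7-char "<value>" and drop the 8-char "</value>"
      print substr($0, RSTART + 7, RLENGTH - 15); exit
    }' "$1"
}

# Usage:
# write_core_site /usr/local/hadoop/etc/hadoop/core-site.xml
# get_property /usr/local/hadoop/etc/hadoop/core-site.xml fs.defaultFS
```

On a machine where Hadoop is already on the PATH, hdfs getconf -confKey fs.defaultFS reports the value Hadoop actually resolved, which is a stronger check than reading the file back.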
1.2 Format the NameNode
hdfs namenode -format
2019-03-07 11:33:41,718 INFO common.Storage: Storage directory /usr/local/hadoop/tmp/dfs/name has been successfully formatted.
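Formatting is meant to be a one-time step: formatting again generates a new cluster ID, and a DataNode whose data directory still carries the old ID will refuse to start. A minimal guard, assuming the name directory configured above; is_formatted and format_if_needed are hypothetical helper names, and the check relies on the current/VERSION file that a formatted name directory contains.

```shell
# is_formatted NAME_DIR
# Succeeds if NAME_DIR already holds a formatted NameNode image.
is_formatted() {
  [ -f "$1/current/VERSION" ]
}

# format_if_needed NAME_DIR
# Formats the NameNode only when no VERSION file exists yet.
format_if_needed() {
  if is_formatted "$1"; then
    echo "already formatted: $1"
  else
    hdfs namenode -format -nonInteractive
  fi
}

# Usage: format_if_needed /usr/local/hadoop/tmp/dfs/name
```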
1.3 Start HDFS
root@dl1-VirtualBox:/usr/local# start-dfs.sh
Starting namenodes on [localhost]
ERROR: Attempting to operate on hdfs namenode as root
ERROR: but there is no HDFS_NAMENODE_USER defined. Aborting operation.
Starting datanodes
ERROR: Attempting to operate on hdfs datanode as root
ERROR: but there is no HDFS_DATANODE_USER defined. Aborting operation.
Starting secondary namenodes [dl1-VirtualBox]
ERROR: Attempting to operate on hdfs secondarynamenode as root
ERROR: but there is no HDFS_SECONDARYNAMENODE_USER defined. Aborting operation.
The startup fails. Fix: define the user that runs each daemon in hadoop-env.sh:
root@dl1-VirtualBox:/usr/local/hadoop/etc/hadoop# vim hadoop-env.sh
JAVA_HOME=/usr/local/jdk
HDFS_NAMENODE_USER="root"
HDFS_DATANODE_USER="root"
HDFS_SECONDARYNAMENODE_USER="root"
YARN_RESOURCEMANAGER_USER="root"
YARN_NODEMANAGER_USER="root"
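The same variables can be appended from a script instead of an editor. This is a sketch only: set_daemon_users is a hypothetical helper name, and it writes the variables with export, which is the form the stock hadoop-env.sh uses for its own settings. Running every daemon as root is what this post does; a dedicated unprivileged account is the safer choice, in which case pass that account name instead.

```shell
# set_daemon_users FILE USER
# Appends one export line per daemon-user variable, skipping any that
# FILE already defines.
set_daemon_users() {
  sdu_file=$1
  sdu_user=$2
  for sdu_var in HDFS_NAMENODE_USER HDFS_DATANODE_USER HDFS_SECONDARYNAMENODE_USER \
                 YARN_RESOURCEMANAGER_USER YARN_NODEMANAGER_USER; do
    if ! grep -q "^export ${sdu_var}=" "$sdu_file" 2>/dev/null; then
      printf 'export %s="%s"\n' "$sdu_var" "$sdu_user" >> "$sdu_file"
    fi
  done
}

# Usage: set_daemon_users /usr/local/hadoop/etc/hadoop/hadoop-env.sh root
```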
1.4 With the error fixed, start HDFS again; this time it succeeds
root@dl1-VirtualBox:/usr/local/hadoop/etc/hadoop# start-dfs.sh
Starting namenodes on [localhost]
Starting datanodes
Starting secondary namenodes [dl1-VirtualBox]
dl1-VirtualBox: Warning: Permanently added 'dl1-virtualbox' (ECDSA) to the list of known hosts.
root@dl1-VirtualBox:/usr/local/hadoop/etc/hadoop# jps
29716 Jps
29477 SecondaryNameNode
29275 DataNode
29131 NameNode
Open http://localhost:9870/ in a browser.
The NameNode web UI loads successfully.
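Reading the jps listing by eye is easy to get wrong, especially since NameNode is a substring of SecondaryNameNode. A small sketch that checks the listing for exact daemon names; missing_daemons is a hypothetical helper name, and it reads jps output on standard input so it can be tried on saved output too.

```shell
# missing_daemons NAME...
# Reads `jps` output on stdin and prints every NAME that is not running.
# Returns 0 when all names are present, 1 otherwise.
missing_daemons() {
  md_out=$(cat)
  md_rc=0
  for md_name in "$@"; do
    # Compare the second column exactly, so NameNode does not match
    # SecondaryNameNode.
    if ! printf '%s\n' "$md_out" |
         awk -v n="$md_name" '$2 == n { ok = 1 } END { exit !ok }'; then
      echo "$md_name"
      md_rc=1
    fi
  done
  return $md_rc
}

# Usage: jps | missing_daemons NameNode DataNode SecondaryNameNode
```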
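2: Start YARN
2.1 Configure the YARN files
This step can be skipped; YARN starts without any extra configuration.
2.2 Problem after starting YARN
The ResourceManager and NodeManager started, then exited on their own shortly afterwards.
I tried many fixes, such as configuring yarn-site.xml and mapred-site.xml in detail and editing /etc/hosts, but none of them helped, so I checked the logs.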
root@dl1-VirtualBox:/usr/local/hadoop/logs# tail hadoop-root-resourcemanager-dl1-VirtualBox.log
... 49 more
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 83 more
2019-03-07 12:55:16,944 INFO org.apache.hadoop.yarn.server.resourcemanager.ResourceManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down ResourceManager at dl1-VirtualBox/127.0.0.1
************************************************************/
root@dl1-VirtualBox:/usr/local/hadoop/logs# tail hadoop-root-nodemanager-dl1-VirtualBox.log
... 52 more
Caused by: java.lang.ClassNotFoundException: javax.activation.DataSource
at java.base/jdk.internal.loader.BuiltinClassLoader.loadClass(BuiltinClassLoader.java:583)
at java.base/jdk.internal.loader.ClassLoaders$AppClassLoader.loadClass(ClassLoaders.java:178)
at java.base/java.lang.ClassLoader.loadClass(ClassLoader.java:521)
... 86 more
2019-03-07 12:55:26,438 INFO org.apache.hadoop.yarn.server.nodemanager.NodeManager: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NodeManager at dl1-VirtualBox/127.0.0.1
************************************************************/
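The logs show that the JDK version is the cause. The java.activation module (package javax.activation) was deprecated in JDK 9 and removed in JDK 11, so the class is no longer available by default. To avoid running into further incompatibilities, I downgraded the JDK to 8.
Error seen on JDK 9 and later:
java.lang.ClassNotFoundException: javax.activation.DataSource
Missing JAR: activation-1.1.jar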
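Instead of tailing each log separately, the root causes can be pulled out of all daemon logs in one pass. A minimal sketch, assuming the logs live in /usr/local/hadoop/logs as above; root_causes is a hypothetical helper name.

```shell
# root_causes LOGFILE...
# Prints each distinct "Caused by:" line with its count, most frequent first.
root_causes() {
  grep -h 'Caused by:' "$@" | sed 's/^[[:space:]]*//' | sort | uniq -c | sort -rn
}

# Usage: root_causes /usr/local/hadoop/logs/hadoop-root-*.log
```

If the same exception tops the list for both the ResourceManager and the NodeManager, as it does here, the cause is most likely the shared runtime rather than either daemon's configuration.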
2.3 Switch to JDK 8
tar -xzvf jdk-8u201-linux-x64.tar.gz -C /usr/local
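Extracting the tarball is not enough on its own: JAVA_HOME still points at /usr/local/jdk, which holds JDK 11. One way to switch, sketched below, is to turn /usr/local/jdk into a symlink to the new JDK so that neither /etc/profile nor hadoop-env.sh needs editing. switch_jdk is a hypothetical helper name, the directory name jdk1.8.0_201 is an assumption about what the tarball unpacks to (check with ls /usr/local), and ln -sfn assumes GNU coreutils.

```shell
# switch_jdk NEW_JDK_DIR LINK
# Points LINK at NEW_JDK_DIR. An existing real directory at LINK is kept
# as LINK.bak instead of being deleted.
switch_jdk() {
  sj_new=$1
  sj_link=$2
  if [ ! -x "$sj_new/bin/java" ]; then
    echo "no java binary under $sj_new" >&2
    return 1
  fi
  # A real directory must be moved aside first; otherwise ln would
  # create the link inside it.
  if [ -d "$sj_link" ] && [ ! -L "$sj_link" ]; then
    mv "$sj_link" "$sj_link.bak"
  fi
  ln -sfn "$sj_new" "$sj_link"
}

# Usage: switch_jdk /usr/local/jdk1.8.0_201 /usr/local/jdk && java -version
```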
Reformat the NameNode, then start HDFS and YARN again. The processes now stay up and no longer exit.
root@dl1-VirtualBox:/home/dl1/下载# start-yarn.sh
Starting resourcemanager
Starting nodemanagers
root@dl1-VirtualBox:/home/dl1/下载# jps
11205 Jps
10885 ResourceManager
11030 NodeManager
10553 SecondaryNameNode
10350 DataNode
10206 NameNode
root@dl1-VirtualBox:/home/dl1/下载# jps
10885 ResourceManager
11030 NodeManager
10553 SecondaryNameNode
11469 Jps
10350 DataNode
10206 NameNode
The following two URLs are now reachable as well:
http://localhost:9870/dfshealth.html#tab-overview
http://localhost:8088/cluster
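(Summary)
Using the latest software versions is a good way to keep up with new technology, but new releases come with many problems that take time to work through.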