Setting up a Hadoop 2.6 single-node pseudo-distributed cluster in Docker

 1 Get a basic Docker system image and create a container

  1.1 Here I choose to pull the CentOS image

docker pull centos

  1.2 If the downloaded image ended up with a different name, rename it to centos with docker tag; then create a simple interactive container

docker run -it --name=client1 centos /bin/bash

 2 Download and install Java inside the container

  2.1 Downloading the JDK

    Go to the Oracle website and pick the JDK to download:

    http://www.oracle.com/technetwork/java/javase/archive-139210.html

    After choosing a version, select the Linux x64 tar package on the download page, right-click it, and open it in a new tab.

    Copy the download URL from the new tab's address bar for use with wget.

    Then download it with wget; the request must carry a special cookie that accepts Oracle's license:

wget --no-cookies --header "Cookie: gpw_e24=http%3A%2F%2Fwww.oracle.com%2F" http://download.oracle.com/otn/java/jdk/7u80-b15/jdk-7u80-linux-x64.tar.gz


  2.2 Installing the JDK

    Choose an installation directory, extract the archive, and create a symlink:

[root@f795f10ac377 java]# pwd
/usr/local/java
[root@f795f10ac377 java]# tar -zxvf jdk-7u80-linux-x64.tar.gz
[root@f795f10ac377 java]# ln -s jdk1.7.0_80 jdk
[root@f795f10ac377 java]# ls
jdk  jdk1.7.0_80

    Set the environment variables: edit the /etc/profile file and append the following at the end:

export JAVA_HOME=/usr/local/java/jdk
export JRE_HOME=${JAVA_HOME}/jre
export CLASSPATH=.:${JAVA_HOME}/lib:${JRE_HOME}/lib
export PATH=${JAVA_HOME}/bin:$PATH

    After saving and exiting, run source /etc/profile (or . /etc/profile) so the changes take effect immediately, then check the Java version with java -version.

 3 Installing and configuring SSH

  3.1 Install via yum

yum install openssh-server
yum install openssh-clients

  3.2 Starting sshd fails with errors because the host keys are missing:

[root@f795f10ac377 hadoop]# /usr/sbin/sshd
Could not load host key: /etc/ssh/ssh_host_rsa_key
Could not load host key: /etc/ssh/ssh_host_ecdsa_key
Could not load host key: /etc/ssh/ssh_host_ed25519_key

  The fix is to generate the host keys manually:

# ssh-keygen: -q runs quietly (suppresses progress output); -t sets the type of host key to generate
[root@f795f10ac377 hadoop]# ssh-keygen -q -t rsa -b 2048 -f /etc/ssh/ssh_host_rsa_key -N ''
[root@f795f10ac377 hadoop]# ssh-keygen -q -t ecdsa -f /etc/ssh/ssh_host_ecdsa_key -N ''
[root@f795f10ac377 hadoop]# ssh-keygen -t dsa -f /etc/ssh/ssh_host_ed25519_key -N ''   # note: this writes a DSA key into the ed25519 file name; -t ed25519 would match the filename
Generating public/private dsa key pair.
Your identification has been saved in /etc/ssh/ssh_host_ed25519_key.
Your public key has been saved in /etc/ssh/ssh_host_ed25519_key.pub.
The key fingerprint is:
d4:8a:c2:e0:75:cf:fc:2b:46:b2:8a:b4:d9:a2:8b:7a root@f795f10ac377
The key's randomart image is:
+--[ DSA 1024]----+
|                 |
| .               |
|  . . . . .      |
|   . + . * .     |
|    . o . S      |
|     .. ..       |
|      . + .      |
|..E= . o .       |
|*++.o. . ..      |
+-----------------+
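The three per-type invocations above can be wrapped in a small helper; this is only a sketch, assuming an OpenSSH new enough to support ed25519 (on modern systems a single ssh-keygen -A call generates all missing host keys at once):

```shell
# Sketch: generate any missing SSH host keys of the given types into a
# target directory (modern OpenSSH can do this in one call: ssh-keygen -A).
gen_host_keys() {
    dir=$1; shift
    for t in "$@"; do
        [ -f "$dir/ssh_host_${t}_key" ] || \
            ssh-keygen -q -t "$t" -f "$dir/ssh_host_${t}_key" -N ''
    done
}
```

Usage: gen_host_keys /etc/ssh rsa ecdsa ed25519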

  3.3 Set the root password and test logging in to the local machine

[root@f795f10ac377 ssh]# /usr/sbin/sshd    # start the sshd service
[root@f795f10ac377 ssh]# netstat -tnulp    # check that it started successfully
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      158/sshd
tcp6       0      0 :::22                   :::*                    LISTEN      158/sshd
[root@f795f10ac377 ssh]# passwd root      # set the root account's password
Changing password for user root.
New password:
BAD PASSWORD: The password fails the dictionary check - it does not contain enough DIFFERENT characters
Retype new password:
passwd: all authentication tokens updated successfully.
[root@f795f10ac377 ssh]# ssh root@localhost  # log in to the local machine over ssh
root@localhost's password:
[root@f795f10ac377 ~]# ls
anaconda-ks.cfg
[root@f795f10ac377 ~]# exit            # log out
logout
Connection to localhost closed.

  3.4 Set up passwordless login to the local machine

[root@f795f10ac377 ~]# ssh-keygen                 # creates an RSA key pair by default (-t rsa)
Generating public/private rsa key pair.
Enter file in which to save the key (/root/.ssh/id_rsa):   # press Enter to use the default path and filename
Enter passphrase (empty for no passphrase):           # press Enter for an empty passphrase
Enter same passphrase again:                    # press Enter to confirm
Your identification has been saved in /root/.ssh/id_rsa.   
Your public key has been saved in /root/.ssh/id_rsa.pub.
The key fingerprint is:
c5:3b:1f:cc:07:d5:6b:e3:12:57:c2:84:a3:2e:97:32 root@f795f10ac377
The key's randomart image is:
+--[ RSA 2048]----+
|             +o. |
|         .  o.o o|
|          o... .o|
|         ..+ o = |
|        S.o.+ * .|
|        E +o + . |
|         =  . .  |
|                 |
|                 |
+-----------------+

  The public and private keys now appear under the ~/.ssh/ directory:

[root@f795f10ac377 .ssh]# pwd
/root/.ssh
[root@f795f10ac377 .ssh]# ls
id_rsa  id_rsa.pub

  Create the authorized_keys file:

[root@f795f10ac377 .ssh]# touch authorized_keys
[root@f795f10ac377 .ssh]# cat id_rsa.pub >> authorized_keys
[root@f795f10ac377 .ssh]# cat authorized_keys
ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC8e4J0ogiNuH3SC8/pXKa5gfHZxWY+soo9wOBoYnkq2HCga/p/cmhpy87mO+IZHLdyskH8TylK2/tFaovbnWNDXHE7uH2gToPjbQG0wCOoRWYy0Irz++wmK64eMTsVYF0L4/AEF6l46iYonQ1RT9xCC/BNgcKaiPNnNlu2O5jMw1ZQCJGg5IDT9RGFms/aYw/cblafYRkwF14keULWpGHFQgyiNthFP/1faaWIu9KJqBr9I93FXWE3cD7F05M/EGV0cRlrVnPOUD5oLUS7y+useBm3Cu8IRUy5SvaJ1qoUb78fX1ExhUFcewt4D1K9XNsFGTi6a4Q60RN7jTjHvRm/ root@f795f10ac377

  Test the passwordless login locally:

[root@f795f10ac377 /]# ssh root@localhost    # logs in directly, no password prompt
[root@f795f10ac377 ~]# exit
logout
Connection to localhost closed.
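The whole of section 3.4 can also be done non-interactively; a sketch follows. One caveat worth knowing: sshd silently ignores authorized_keys unless ~/.ssh is mode 700 and the file is mode 600, so the permissions are set explicitly here.

```shell
# Sketch: non-interactive version of the key generation and authorization
# steps above; sshd requires strict permissions on ~/.ssh and authorized_keys.
mkdir -p ~/.ssh && chmod 700 ~/.ssh
[ -f ~/.ssh/id_rsa ] || ssh-keygen -q -t rsa -N '' -f ~/.ssh/id_rsa
cat ~/.ssh/id_rsa.pub >> ~/.ssh/authorized_keys
chmod 600 ~/.ssh/authorized_keys
```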

 4 Downloading, installing, and configuring Hadoop

  4.1 Download the Hadoop 2.6 package in the container with curl. Download URL: http://apache.fayea.com/hadoop/common/hadoop-2.6.0/

curl -o hadoop-2.6.0.tar.gz http://apache.fayea.com/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz

  4.2 Extract and rename

tar -zxvf  hadoop-2.6.0.tar.gz
mv hadoop-2.6.0 hadoop
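When downloading from a mirror as above, it is worth checking the tarball's integrity before unpacking, since Apache publishes a digest next to each release. A hypothetical helper (the expected digest must be copied from the release page):

```shell
# Sketch: compare a file's SHA-256 digest against an expected value before
# unpacking (the expected digest comes from the Apache release page).
verify_sha256() {
    file=$1; expected=$2
    actual=$(sha256sum "$file" | awk '{print $1}')
    [ "$actual" = "$expected" ]
}
```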

  4.3 Add the Hadoop environment variables

  Edit /etc/profile: add the HADOOP_HOME variable and put its bin directory on PATH:

export HADOOP_HOME=/usr/local/hadoop
export PATH=${JAVA_HOME}/bin:${HADOOP_HOME}/bin:$PATH

  4.4 Configure pseudo-distributed mode

  Edit the /usr/local/hadoop/etc/hadoop/core-site.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <property>
                <name>hadoop.tmp.dir</name>
                <value>file:/data/hadoop/tmp</value>
        </property>
        <property>
                <name>fs.defaultFS</name>
                <value>hdfs://localhost:9000</value>
        </property>
</configuration>

  Edit the /usr/local/hadoop/etc/hadoop/hdfs-site.xml file:

<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>

<configuration>
        <property>
                <name>dfs.replication</name>
                <value>1</value>
        </property>
        <property>
                <name>dfs.namenode.name.dir</name>
                <value>file:/data/hadoop/dfs/name</value>
        </property>
        <property>
                <name>dfs.datanode.data.dir</name>
                <value>file:/data/hadoop/dfs/data</value>
        </property>
</configuration>
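Hadoop will normally create the directories referenced by hadoop.tmp.dir, dfs.namenode.name.dir, and dfs.datanode.data.dir itself, but creating them up front avoids permission surprises. A small sketch, assuming the /data/hadoop base path used in the XML above:

```shell
# Sketch: create the directories referenced in core-site.xml and
# hdfs-site.xml under a given base path.
make_hadoop_dirs() {
    base=$1
    mkdir -p "$base/tmp" "$base/dfs/name" "$base/dfs/data"
}
```

For the configuration above, run make_hadoop_dirs /data/hadoop before formatting.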

  4.5 Format the NameNode

[root@f795f10ac377 bin]# ./hdfs namenode -format
./hdfs: line 28: which: command not found
dirname: missing operand
Try 'dirname --help' for more information.
/usr/local/hadoop/bin/../libexec/hdfs-config.sh: line 21: which: command not found

  The format itself succeeds, but the errors above show that the which command is missing. Its absence will break later steps, so install it:

[root@f795f10ac377 /]# yum install which

  Run the format again:

[root@f795f10ac377 bin]# ./hdfs namenode -format
16/08/06 16:12:08 INFO namenode.NameNode: STARTUP_MSG:
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = f795f10ac377/172.17.0.2
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.6.0
................................
16/08/06 16:12:11 INFO common.Storage: Storage directory /data/hadoop/dfs/name has been successfully formatted. # format succeeded
16/08/06 16:12:11 INFO namenode.NNStorageRetentionManager: Going to retain 1 images with txid >= 0
16/08/06 16:12:11 INFO util.ExitUtil: Exiting with status 0
16/08/06 16:12:11 INFO namenode.NameNode: SHUTDOWN_MSG:
/************************************************************
SHUTDOWN_MSG: Shutting down NameNode at f795f10ac377/172.17.0.2
************************************************************/

  4.6 Start HDFS and YARN

[root@f795f10ac377 hadoop]# start-dfs.sh
[root@f795f10ac377 hadoop]# start-yarn.sh
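If the start scripts abort with "JAVA_HOME is not set and could not be found", it is because they launch the daemons over ssh, and a non-login ssh command does not source /etc/profile. The usual fix is to set the variable in Hadoop's own env file; a sketch, assuming the JDK path used earlier:

```shell
# Fragment for /usr/local/hadoop/etc/hadoop/hadoop-env.sh (sketch): daemons
# started over ssh by start-dfs.sh do not see exports from /etc/profile,
# so JAVA_HOME must also be set here.
export JAVA_HOME=/usr/local/java/jdk
```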

  4.7 Test

  List the running Java processes:

[root@f795f10ac377 hadoop]# jps
1322 DataNode
1460 SecondaryNameNode
2068 Jps
1692 ResourceManager
1775 NodeManager
1214 NameNode

  Check the listening ports:

# if network commands such as netstat and ifconfig are missing, install them with: yum install net-tools
[root@f795f10ac377 hadoop]# netstat -tnulp
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        0      0 0.0.0.0:50020           0.0.0.0:*               LISTEN      1322/java
tcp        0      0 127.0.0.1:9000          0.0.0.0:*               LISTEN      1214/java
tcp        0      0 0.0.0.0:50090           0.0.0.0:*               LISTEN      1460/java
tcp        0      0 0.0.0.0:50070           0.0.0.0:*               LISTEN      1214/java
tcp        0      0 0.0.0.0:22              0.0.0.0:*               LISTEN      304/sshd
tcp        0      0 0.0.0.0:50010           0.0.0.0:*               LISTEN      1322/java
tcp        0      0 0.0.0.0:50075           0.0.0.0:*               LISTEN      1322/java
tcp6       0      0 :::8032                 :::*                    LISTEN      1692/java
tcp6       0      0 :::37216                :::*                    LISTEN      1775/java
tcp6       0      0 :::8033                 :::*                    LISTEN      1692/java
tcp6       0      0 :::8040                 :::*                    LISTEN      1775/java
tcp6       0      0 :::8042                 :::*                    LISTEN      1775/java
tcp6       0      0 :::22                   :::*                    LISTEN      304/sshd
tcp6       0      0 :::8088                 :::*                    LISTEN      1692/java
tcp6       0      0 :::8030                 :::*                    LISTEN      1692/java
tcp6       0      0 :::8031                 :::*                    LISTEN      1692/java

  Check the HDFS status:

[root@f795f10ac377 hadoop]# hadoop dfsadmin -report
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.

16/08/06 16:32:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
Configured Capacity: 10726932480 (9.99 GB)
Present Capacity: 9665077248 (9.00 GB)
DFS Remaining: 9665073152 (9.00 GB)
DFS Used: 4096 (4 KB)
DFS Used%: 0.00%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0

-------------------------------------------------
Live datanodes (1):

Name: 127.0.0.1:50010 (localhost)
Hostname: f795f10ac377
Decommission Status : Normal
Configured Capacity: 10726932480 (9.99 GB)
DFS Used: 4096 (4 KB)
Non DFS Used: 1061855232 (1012.66 MB)
DFS Remaining: 9665073152 (9.00 GB)
DFS Used%: 0.00%
DFS Remaining%: 90.10%
Configured Cache Capacity: 0 (0 B)
Cache Used: 0 (0 B)
Cache Remaining: 0 (0 B)
Cache Used%: 100.00%
Cache Remaining%: 0.00%
Xceivers: 1
Last contact: Sat Aug 06 16:32:59 UTC 2016
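All of the manual steps above can be captured for reuse. The following is a rough, untested Dockerfile sketch of the same procedure; it assumes the two tarballs sit in the build context (the Oracle cookie download may no longer work as-is), uses the directory layout from earlier sections, and leaves out interactive parts such as passwd:

```dockerfile
# Hypothetical consolidation of the steps in this post; a sketch, not a
# drop-in image.
FROM centos
RUN yum install -y openssh-server openssh-clients which net-tools tar

# JDK (section 2); ADD auto-extracts a local tar.gz
ADD jdk-7u80-linux-x64.tar.gz /usr/local/java/
RUN ln -s /usr/local/java/jdk1.7.0_80 /usr/local/java/jdk

# Hadoop (section 4)
ADD hadoop-2.6.0.tar.gz /usr/local/
RUN mv /usr/local/hadoop-2.6.0 /usr/local/hadoop

ENV JAVA_HOME=/usr/local/java/jdk \
    HADOOP_HOME=/usr/local/hadoop
ENV PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$PATH

# SSH host keys and passwordless root login (section 3)
RUN ssh-keygen -A && \
    mkdir -p /root/.ssh && chmod 700 /root/.ssh && \
    ssh-keygen -q -t rsa -N '' -f /root/.ssh/id_rsa && \
    cat /root/.ssh/id_rsa.pub >> /root/.ssh/authorized_keys && \
    chmod 600 /root/.ssh/authorized_keys

CMD ["/usr/sbin/sshd", "-D"]
```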


posted @ 2016-08-04 23:09  Amei1314