Hadoop Setup Guide

These are the notes I took when I first taught myself Hadoop. I am posting them on the blog so there is still a backup in case the copy on my computer ever goes missing. The steps are as follows:

1. Environment overview:

     1) Operating system: Windows 8.1 Enterprise, 64-bit

     2) Virtualization software: VMware Workstation 9.0.2 build-1031769

     3) Linux image: Red Hat 5.4, 32-bit

     4) Hadoop version: hadoop-1.2.1-bin.tar.gz

     5) JDK: jdk-7u45-linux-i586.tar.gz

     6) Three machines in total: master, slaver1, slaver2

     7) IP addresses: 192.168.1.1 (master), 192.168.1.2 (slaver1), 192.168.1.3 (slaver2); the Windows virtual network adapter is 192.168.1.10

     8) master runs the NameNode and JobTracker

     9) slaver1 and slaver2 run the DataNodes and TaskTrackers

2. Install Linux

     1) A mostly default installation; detailed steps are omitted.

3. Install the JDK

     1) Transfer the JDK package to the Linux machine over FTP.

     2) Extract the archive with:  tar -xzvf <package name>

     3) Configure the environment variables.

     4) Add the JDK installation path to /etc/profile (the path must match where the archive was extracted; jdk-7u45-linux-i586.tar.gz unpacks to a jdk1.7.0_45 directory):

         export JAVA_HOME=/opt/jdk1.7.0_45

         export JRE_HOME=$JAVA_HOME/jre

         export CLASSPATH=.:$JAVA_HOME/lib:$JRE_HOME/lib:$CLASSPATH

         export PATH=$JAVA_HOME/bin:$JRE_HOME/bin:$PATH

     5) Apply the configuration:

         $ source /etc/profile
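
     A quick sanity check that the JDK is picked up (a minimal sketch; the exact version string depends on the JDK actually installed):

         $ echo $JAVA_HOME        # should print the path set above
         $ java -version          # should report the installed JDK, e.g. 1.7.0_45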

4. Create the hadoop user

         useradd hadoop

         Set a password for the user:

         passwd hadoop

         The default home directory is /home/hadoop.

         Create a Hadoop installation directory under the root directory and give the hadoop user the appropriate permissions (see the sketch below for unpacking the Hadoop tarball):

         mkdir -p /hadoop-1.2.1

         chown -R hadoop:hadoop /hadoop-1.2.1

         chmod -R 775 /hadoop-1.2.1
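
         The steps above prepare the directory but do not unpack Hadoop itself; a minimal sketch, assuming the tarball was uploaded to /home/hadoop (hadoop-1.2.1-bin.tar.gz extracts to a directory named hadoop-1.2.1, so extracting under / fills /hadoop-1.2.1):

         # run as root; adjust the source path to wherever the tarball actually is
         tar -xzvf /home/hadoop/hadoop-1.2.1-bin.tar.gz -C /

         # re-apply ownership so the hadoop user owns the extracted files
         chown -R hadoop:hadoop /hadoop-1.2.1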

5. Add the Hadoop installation path to /etc/profile (it must match the directory created in step 4):

       export HADOOP_HOME=/hadoop-1.2.1

       export PATH=$HADOOP_HOME/bin:$PATH

         Run source /etc/profile to make the new environment variables take effect.
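
       A quick check that the PATH change works (a minimal sketch; hadoop version is a standard Hadoop 1.x command):

         $ source /etc/profile
         $ hadoop version          # should report Hadoop 1.2.1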

6. Configure Hadoop

       The configuration files live in the conf folder inside the Hadoop installation directory.

         1) vi conf/hadoop-env.sh

                   Edit the JAVA_HOME line so it points at the JDK installed earlier:

                         export JAVA_HOME=/opt/jdk1.7.0_45

         2) Edit core-site.xml

             Add the following inside <configuration>:

                   <configuration>

                   <property>

                   <name>hadoop.tmp.dir</name>

                   <value>/hadoop/hadoop-datastore</value>

                   </property>

                   <property>

                   <name>fs.default.name</name>

                    <value>hdfs://master:9000</value>

                    </property>

                   </configuration>

         3) Edit hdfs-site.xml

                   Add the following inside <configuration>:

                   <configuration>

                   <property>

                   <name>dfs.replication</name>

                   <value>2</value>

                   </property>

                   </configuration>

         4) Edit mapred-site.xml

                   Add the following inside <configuration>:

                   <configuration>

                   <property>

                   <name>mapred.job.tracker</name>

                   <value>master:9001</value>                                                

                  </property>

                   </configuration>

         5) Configure conf/masters and conf/slaves on the NameNode

                   $ vi masters      (this file contains a single line: master)

                   $ vi slaves       (this file contains slaver1 and slaver2, one hostname per line)
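
                   One way to write both files without opening an editor (a minimal sketch, run from the conf directory; note that in Hadoop 1.x conf/masters lists the SecondaryNameNode host, while conf/slaves lists the DataNode/TaskTracker hosts):

                   echo "master" > masters
                   printf "slaver1\nslaver2\n" > slaves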

7. Clone two copies of the Linux VM to form the cluster together with the existing one

      1) Shut the VM down completely, make two copies of the virtual machine files, and rename them slaver1 and slaver2.

      2) Change the IP address (and hostname) of slaver1 and slaver2; a sketch follows below.
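
      A minimal sketch of the edits on each clone, assuming the standard Red Hat 5.x network files (file names and the eth0 device name may differ on your installation); slaver1 is shown as the example:

      # /etc/sysconfig/network-scripts/ifcfg-eth0  -- static address for this node
      DEVICE=eth0
      BOOTPROTO=static
      IPADDR=192.168.1.2
      NETMASK=255.255.255.0
      ONBOOT=yes

      # /etc/sysconfig/network  -- hostname of this node
      HOSTNAME=slaver1

      # /etc/hosts  -- identical on all three machines, so that the hostnames used in the
      # Hadoop configuration files resolve to the right addresses
      192.168.1.1   master
      192.168.1.2   slaver1
      192.168.1.3   slaver2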

8. Set up passwordless SSH login between the three machines

      1) Log in as the hadoop user and go to its home directory: su - hadoop, or, if already logged in as hadoop, simply cd ~.

      2) Run ssh-keygen -t dsa on every node.

      3) Then cd into the .ssh/ directory and append the content of id_dsa.pub to a file named authorized_keys.

      4) The command is:  cat .ssh/id_dsa.pub >> .ssh/authorized_keys

      5) Then use scp to copy authorized_keys into the same directory on the other nodes, appending each node's own id_dsa.pub content to the file as you go.

      6) In the end authorized_keys contains the id_dsa.pub content of all three nodes; place this file in the .ssh/ directory on every node and passwordless login works. A consolidated sketch follows below.
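
      A consolidated sketch of the key exchange, run as the hadoop user (it assumes the hostnames already resolve and that you accept the host-key and password prompts on first connection):

      # on every node: generate a DSA key pair with an empty passphrase
      ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa

      # on master: gather the public keys of all three nodes into one authorized_keys file
      cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
      ssh slaver1 cat .ssh/id_dsa.pub >> ~/.ssh/authorized_keys
      ssh slaver2 cat .ssh/id_dsa.pub >> ~/.ssh/authorized_keys

      # distribute the combined file to the slave nodes and set safe permissions
      scp ~/.ssh/authorized_keys slaver1:~/.ssh/
      scp ~/.ssh/authorized_keys slaver2:~/.ssh/
      chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys    # repeat on every node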

9. Format the file system

      1) Format HDFS. Run this on master only:

           $ hadoop namenode -format

10. Start Hadoop

          $ start-all.sh

11. Verify that the startup succeeded

          As the hadoop user, type jps and press Enter.

          jps lists the running Java processes; on the NameNode the result looks like this:

                   5334 JobTracker

                   5215 SecondaryNameNode

                   5449 Jps

                   5001 NameNode
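
          On slaver1 and slaver2 the same jps check should list the worker daemons; a sketch of the expected output (the process IDs are illustrative and will differ):

                   4120 DataNode

                   4237 TaskTracker

                   4355 Jps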

12. Check the cluster status

                $ hadoop dfsadmin -report

                    Make sure the number of running DataNodes is correct.

                   Below is the report from this two-DataNode cluster:

                  

                   [hadoop@master conf]$ hadoop dfsadmin -report

                   Configured Capacity: 57986523136 (54 GB)

                   Present Capacity: 48732041216 (45.39 GB)

                   DFS Remaining: 48731803648 (45.39 GB)

                   DFS Used: 237568 (232 KB)

                   DFS Used%: 0%

                   Under replicated blocks: 0

                   Blocks with corrupt replicas: 0

                   Missing blocks: 0

 

                   -------------------------------------------------

                   Datanodes available: 2 (2 total, 0 dead)

 

                   Name: 192.168.1.2:50010

                   Decommission Status : Normal

                   Configured Capacity: 28993261568 (27 GB)

                   DFS Used: 118784 (116 KB)

                   Non DFS Used: 4627247104 (4.31 GB)

                   DFS Remaining: 24365895680(22.69 GB)

                   DFS Used%: 0%

                   DFS Remaining%: 84.04%

                   Last contact: Sat Oct 19 16:39:55 CST 2013

 

 

                   Name: 192.168.1.3:50010

                   Decommission Status : Normal

                   Configured Capacity: 28993261568 (27 GB)

                   DFS Used: 118784 (116 KB)

                   Non DFS Used: 4627234816 (4.31 GB)

                   DFS Remaining: 24365907968(22.69 GB)

                   DFS Used%: 0%

                   DFS Remaining%: 84.04%

                   Last contact: Sat Oct 19 16:39:53 CST 2013

posted on 2015-01-24 18:16  metmetyou