Installing CDH5 Automatically and Manually

  Looking back on last year's nightmare: fresh out of university, I switched careers into IT and followed the trend into big data, only to run into a fraud of an instructor named Yang Yong, who bragged all day, knew nothing about the material, and wasted three and a half months of my time. I gritted my teeth and taught myself Hadoop clusters, wrote a fair number of documents, and built up my ability to learn on my own. After joining a company I was lucky enough to move into JavaWeb development, slowly working my way up from the bottom, with the goal of eventually doing real big-data work.

  Lately I've wanted to "new" a fresh instance of myself and start blogging, recording things bit by bit, hoping it will "return" a future I won't regret. For day one, I'm posting the CM-based and manual Hadoop cluster installs from back when my self-study drive was at its peak.

Installing CDH5 via CM

1. Run CM_cleanup_cluster.sh to clean up the cluster (this script is not part of Hadoop; it exists only for this lab).

2. Configure IPs

   2.1 Change the hostname:

  vi /etc/sysconfig/network

  vi /etc/hosts
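  For reference, a minimal sketch of the hostname file (assuming RHEL/CentOS 6-style networking, which CDH5-era hosts typically used; values shown for elephant, adjust per machine; the matching /etc/hosts entries are in 2.2 below):

  # /etc/sysconfig/network
  NETWORKING=yes
  HOSTNAME=elephant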

 

 2.2 /etc/hosts entries:

  192.168.20.3    elephant

  192.168.20.4    monkey

  192.168.20.5    tiger

  192.168.20.6    lion

  192.168.20.7    horse

3. Install Hadoop:

    3.1 Run Install_CDH5.sh

 

    3.2 Edit /etc/cloudera-scm-agent/config.ini, pointing the agent at the CM server:

  Change
  server_host=localhost
  to
  server_host=lion
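  If you prefer to script the edit, a one-liner along these lines should work (standard sed; the path is the one given above):

  sudo sed -i 's/^server_host=.*/server_host=lion/' /etc/cloudera-scm-agent/config.ini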

   

    3.3 If the other machines do not have Hadoop installed yet, run the following:

  copy_CM_agent_config.sh

 

4. On lion only:

  Install the database: sudo yum install --assumeyes cloudera-manager-server-db-2

  Start the database: sudo service cloudera-scm-server-db-2 start

  Start the CM server: sudo service cloudera-scm-server start

  Check the database status: sudo service cloudera-scm-server-db-2 status

  Check the CM server status: sudo service cloudera-scm-server status

   

5. Start the agent

  sudo service cloudera-scm-agent start

  Check the status:

  sudo service cloudera-scm-agent status

   

  Check the log:

  sudo tail -f /var/log/cloudera-scm-agent/cloudera-scm-agent.log

 

6. Open a browser and go to http://lion:7180 (username: admin, password: admin).
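  To confirm the server is listening before opening the browser, a quick check (assuming curl is available on the host):

  curl -s -o /dev/null -w "%{http_code}\n" http://lion:7180
  # any 2xx/3xx code means the CM web UI is up; the server can take a few minutes to start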

 

Manually Installing a Hadoop Cluster

This install uses five machines, named elephant, horse, tiger, monkey, and lion. Install the daemons strictly according to this layout: the NameNode on elephant, the ResourceManager on horse, the SecondaryNameNode on tiger, the JobHistoryServer on monkey, and a DataNode plus a NodeManager on every machine (lion runs only a DataNode and a NodeManager).

 

I. Check that the five machines are clean

1. First make sure every machine can ping the others (a quick loop is sketched below).
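A minimal check, assuming the /etc/hosts entries above are in place on every machine:

for h in elephant monkey tiger lion horse; do
  ping -c 1 -W 2 $h > /dev/null && echo "$h ok" || echo "$h UNREACHABLE"
done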

2. Before installing, run jps and check that no Hadoop processes are running; if any are, stop the services:

$ sudo service hadoop-hdfs-namenode stop

$ sudo service hadoop-hdfs-secondarynamenode stop

$ sudo service hadoop-hdfs-datanode stop

$ sudo service hadoop-yarn-resourcemanager stop

$ sudo service hadoop-yarn-nodemanager stop

$ sudo service hadoop-mapreduce-historyserver stop

3. Check for logs, and delete any that exist:

sudo ls /var/log/hadoop-*/*

sudo rm -rf /var/log/hadoop-*/*

4. Check whether any Hadoop packages are already installed:

rpm -qa | grep hadoop

If any are found, remove them:

$ sudo yum remove --assumeyes hadoop-hdfs-secondarynamenode

$ sudo yum remove --assumeyes hadoop-yarn-resourcemanager

$ sudo yum remove --assumeyes hadoop-mapreduce-historyserver

 

II. Installing the NameNode on elephant

1. Install the NameNode on elephant:

sudo yum install --assumeyes hadoop-hdfs-namenode

2. Create the following directories under the root directory:

sudo mkdir -p /hadoop-cluster/fsimage/nn1

sudo mkdir -p /hadoop-cluster/fsimage/nn2

 

3. After installing the NameNode, first copy the configuration files from /training_materials/admin/stubs/ into /etc/hadoop/conf. Then configure core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml as follows:

core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://elephant:8020</value>
  </property>
</configuration>

hdfs-site.xml:

First create the directories under /:

sudo mkdir -p /disk1/dfs/nn
sudo mkdir -p /disk2/dfs/nn
sudo mkdir -p /disk1/dfs/dn
sudo mkdir -p /disk2/dfs/dn

Open up permissions on disk1 and disk2:

sudo chmod -R 1777 /disk1

sudo chmod -R 1777 /disk2

Change the owner and group:

sudo chown -R hdfs:hadoop /disk1/dfs/

sudo chown -R hdfs:hadoop /disk2/dfs/

<configuration>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:///disk1/dfs/nn,file:///disk2/dfs/nn</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:///disk1/dfs/dn,file:///disk2/dfs/dn</value>
  </property>
</configuration>

4. Format HDFS as the hdfs user: sudo -u hdfs hdfs namenode -format
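If the format succeeded, each name directory should now contain a current/ subdirectory; a quick sanity check (the paths are the ones created above):

sudo ls /disk1/dfs/nn/current
# expect files such as VERSION, an fsimage_* file, and seen_txid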

5. Start the NameNode

sudo service hadoop-hdfs-namenode start

sudo service hadoop-hdfs-namenode status

Once it starts successfully, run sudo jps and check that the NN process is listed. At this point every machine in the cluster can have its DataNode installed.
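As an extra check, the NameNode web UI should respond on its default CDH5 port (50070, assuming the default was not changed):

curl -s -o /dev/null -w "%{http_code}\n" http://elephant:50070
# 200 means the NameNode web UI is up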

 

III. Installing the ResourceManager on horse

1. Install the ResourceManager on horse:

sudo yum install --assumeyes hadoop-yarn-resourcemanager

2. After installing the ResourceManager, copy the configuration files from /training_materials/admin/stubs/ into /etc/hadoop/conf. Then configure core-site.xml, hdfs-site.xml, mapred-site.xml, and yarn-site.xml as follows:

core-site.xml:

<configuration>
  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://elephant:8020</value>
  </property>
</configuration>

mapred-site.xml (edit with vim /etc/hadoop/conf/mapred-site.xml; the properties go inside the <configuration> element):

<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>monkey:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>monkey:19888</value>
  </property>
  <property>
    <name>yarn.app.mapreduce.am.staging-dir</name>
    <value>/user</value>
  </property>
</configuration>

yarn-site.xml

Before configuring it, first create the following directories under /:

sudo mkdir -p /disk1/nodemgr/local

sudo mkdir -p /disk2/nodemgr/local

sudo mkdir -p /var/log/hadoop-yarn/containers  

sudo mkdir -p /var/log/hadoop-yarn/apps

Open up permissions on the directories:

sudo chmod -R 1777 /disk1/nodemgr/local

sudo chmod -R 1777 /disk2/nodemgr/local

sudo chmod -R 1777 /var/log/hadoop-yarn/containers  

sudo chmod -R 1777 /var/log/hadoop-yarn/apps

Change their owner and group:

sudo chown -R yarn:yarn /disk1/nodemgr/

sudo chown -R yarn:yarn /disk2/nodemgr/

sudo chown -R yarn:yarn /var/log/hadoop-yarn/containers

sudo chown -R yarn:yarn /var/log/hadoop-yarn/apps

<configuration>
  <property>
    <name>yarn.resourcemanager.hostname</name>
    <value>horse</value>
  </property>
  <property>
    <name>yarn.application.classpath</name>
    <value>
      $HADOOP_CONF_DIR,
      $HADOOP_COMMON_HOME/*,$HADOOP_COMMON_HOME/lib/*,
      $HADOOP_HDFS_HOME/*,$HADOOP_HDFS_HOME/lib/*,
      $HADOOP_MAPRED_HOME/*,$HADOOP_MAPRED_HOME/lib/*,
      $HADOOP_YARN_HOME/*,$HADOOP_YARN_HOME/lib/*
    </value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.local-dirs</name>
    <value>file:///disk1/nodemgr/local,file:///disk2/nodemgr/local</value>
  </property>
  <property>
    <name>yarn.nodemanager.log-dirs</name>
    <value>/var/log/hadoop-yarn/containers</value>
  </property>
  <property>
    <name>yarn.nodemanager.remote-app-log-dir</name>
    <value>/var/log/hadoop-yarn/apps</value>
  </property>
  <property>
    <name>yarn.log-aggregation-enable</name>
    <value>true</value>
  </property>
</configuration>

 

3. Start the service

sudo service hadoop-yarn-resourcemanager start

sudo service hadoop-yarn-resourcemanager status

Once it starts successfully, run sudo jps and check that the RM process is listed. At this point every machine can have its NodeManager installed.
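Likewise, the ResourceManager web UI should respond on its default port (8088, assuming defaults):

curl -s -o /dev/null -w "%{http_code}\n" http://horse:8088
# 200 means the ResourceManager web UI is up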

 

IV. Installing the SecondaryNameNode on tiger

1. Install the SecondaryNameNode on tiger:

sudo yum install --assumeyes hadoop-hdfs-secondarynamenode

2. After installing the SecondaryNameNode, copy the configuration files from /training_materials/admin/stubs/ into /etc/hadoop/conf. Then configure core-site.xml and hdfs-site.xml exactly as on elephant (Part II, step 3), including creating the /disk1 and /disk2 directories with the same permissions and ownership.

3. Start the SecondaryNameNode

sudo service hadoop-hdfs-secondarynamenode start

sudo service hadoop-hdfs-secondarynamenode status

Once it starts successfully, run sudo jps and check that the SecondaryNameNode process is listed.

 

V. Installing the JobHistoryServer on monkey

Before starting the JHS, the following directories need to be created on HDFS (a sketch of the commands is given below):
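A minimal sketch of those commands, assuming the usual CDH5 convention in which the staging directory is /user (as set in mapred-site.xml above) and job history lives under /user/history owned by mapred:

sudo -u hdfs hadoop fs -mkdir -p /user/history
sudo -u hdfs hadoop fs -chmod -R 1777 /user/history
sudo -u hdfs hadoop fs -chown mapred:hadoop /user/history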

 

1. Install the JobHistoryServer:

sudo yum install --assumeyes hadoop-mapreduce-historyserver

2. After installing the history server, copy the configuration files from /training_materials/admin/stubs/ into /etc/hadoop/conf. Then configure core-site.xml and hdfs-site.xml exactly as on elephant (Part II, step 3), including the same directory creation, permissions, and ownership steps.

3. Start the history server

sudo service hadoop-mapreduce-historyserver start

sudo service hadoop-mapreduce-historyserver status

Once it starts successfully, run sudo jps and check that the JobHistoryServer process is listed.

 

VI. Installing a DataNode and a NodeManager on every machine in the cluster

Important: be sure to install the hadoop-mapreduce package (step 3 below) before starting the NodeManagers!

1. Install the DataNode

Note that core-site.xml and hdfs-site.xml must be configured on every machine. In the steps above, elephant, tiger, and monkey were already configured, so horse and lion need the same configuration; then install and start:

sudo yum install --assumeyes hadoop-hdfs-datanode

sudo service hadoop-hdfs-datanode start

sudo service hadoop-hdfs-datanode status
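Once the DataNodes on all five machines are up, one way to confirm they registered with the NameNode (run from any machine with the client config):

sudo -u hdfs hdfs dfsadmin -report
# the report should list all five DataNodes as live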

2. Install the NodeManager

Note that mapred-site.xml and yarn-site.xml must be configured. Only horse was configured in the steps above, so elephant, tiger, monkey, and lion all need the same configuration as horse; then install and start:

sudo yum install --assumeyes hadoop-yarn-nodemanager

sudo service hadoop-yarn-nodemanager start

sudo service hadoop-yarn-nodemanager status
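Once the NodeManagers are up, they should register with the ResourceManager; one way to check, using the yarn CLI that ships with CDH5:

yarn node -list
# expect five nodes in RUNNING state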

3. Install the MapReduce package

sudo yum install --assumeyes hadoop-mapreduce

With that, the Hadoop cluster is fully installed and ready for testing.
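As a smoke test, the examples jar from the hadoop-mapreduce package can run a small job; a sketch, assuming the default CDH5 package install path and running as the hdfs superuser so the staging directory can be created:

sudo -u hdfs hadoop jar /usr/lib/hadoop-mapreduce/hadoop-mapreduce-examples.jar pi 2 100
# a successful run prints an estimate of pi and exercises HDFS plus YARN end to end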

If errors crop up while configuring the cluster, check the run logs under /var/log.

posted @ 2017-12-14 16:58 Danfoen