[Hadoop] Hadoop HA (High Availability) Setup Guide

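The approach uses virtual machines: create one Linux VM (here CentOS-6.5-x86_64-minimal.iso, configured with 1 core and 2 GB of RAM) to act as the master node, then create the other nodes by cloning the master VM. The VM network mode is NAT.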

Role             Hostname  IP
ActiveNameNode   master    172.16.152.130
StandbyNameNode  slave     172.16.152.131
DataNode         slave1    172.16.152.132

Appendix 1: Changing the network IP

vi /etc/sysconfig/network-scripts/ifcfg-eth0

# Remove the UUID and HWADDR lines (they belong to the VM this one was cloned from)
# Modify or add the following:
IPADDR=172.16.152.131
NETMASK=255.255.255.0
GATEWAY=172.16.152.2
DNS1=192.168.0.1
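
The same edit can be scripted so that every cloned VM gets it applied the same way. This is only a minimal sketch: set_static_ip is a hypothetical helper name, and the netmask, gateway and DNS values are simply the ones listed above. Try it on a copy of the file before touching the real ifcfg-eth0.

```shell
# Sketch only: set_static_ip is a hypothetical helper, not a CentOS tool.
# Usage: set_static_ip FILE IPADDR
set_static_ip() {
    file=$1
    ip=$2
    tmp="$file.tmp"
    # Drop the clone-specific UUID/HWADDR lines and any old values
    # of the four keys that are rewritten below.
    sed -e '/^UUID=/d' -e '/^HWADDR=/d' \
        -e '/^IPADDR=/d' -e '/^NETMASK=/d' \
        -e '/^GATEWAY=/d' -e '/^DNS1=/d' "$file" > "$tmp" || return 1
    # Append the static settings used in this guide.
    {
        echo "IPADDR=$ip"
        echo "NETMASK=255.255.255.0"
        echo "GATEWAY=172.16.152.2"
        echo "DNS1=192.168.0.1"
    } >> "$tmp"
    mv "$tmp" "$file"
}
```

For example, `set_static_ip /tmp/ifcfg-eth0.copy 172.16.152.131` rewrites a copy, which can then be compared against the original.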

Appendix 2: Changing the hostname

vi /etc/sysconfig/network
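
On CentOS 6 the hostname lives in the HOSTNAME= line of that file. The sketch below sets that line and prints hosts entries for the three nodes. Both function names are hypothetical, and the /etc/hosts step is an assumption: the guide itself does not say how the names master, slave and slave1 are resolved, and a hosts file is just one common way.

```shell
# Sketch only: both helpers are hypothetical names.
# Usage: set_hostname_line FILE NAME
set_hostname_line() {
    file=$1
    name=$2
    tmp="$file.tmp"
    # Remove any existing HOSTNAME= line, then append the new one.
    sed -e '/^HOSTNAME=/d' "$file" > "$tmp" || return 1
    echo "HOSTNAME=$name" >> "$tmp"
    mv "$tmp" "$file"
}

# Print name-resolution entries for the three nodes in the table above.
# Appending them to /etc/hosts on every node is an assumption of this sketch.
cluster_hosts() {
    cat <<'EOF'
172.16.152.130 master
172.16.152.131 slave
172.16.152.132 slave1
EOF
}
```

The new hostname from /etc/sysconfig/network takes effect after a reboot.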

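Appendix 3: Passwordless SSH login
For A to log in to B without a password, generate a key pair on A and append A's public key (id_rsa.pub) to ~/.ssh/authorized_keys on B.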

# On A
# Press Enter at every prompt to generate the key pair
ssh-keygen -t rsa
scp ~/.ssh/id_rsa.pub root@xxx:~
# On B (the key was copied to the home directory, not to ~/.ssh)
cat ~/id_rsa.pub >> ~/.ssh/authorized_keys
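
Running the append step twice leaves duplicate lines in authorized_keys, and sshd typically refuses the key if the file permissions are too open. A minimal sketch of a safer append follows. add_authorized_key is a hypothetical helper, and the optional second argument exists only so it can be tried against a scratch directory.

```shell
# Sketch only: add_authorized_key is a hypothetical helper.
# Usage: add_authorized_key PUBKEY_FILE [SSH_DIR]   (SSH_DIR defaults to ~/.ssh)
add_authorized_key() {
    pub=$1
    dir=${2:-$HOME/.ssh}
    auth="$dir/authorized_keys"
    # sshd expects restrictive permissions on both the directory and the file.
    mkdir -p "$dir" && chmod 700 "$dir" || return 1
    touch "$auth" && chmod 600 "$auth" || return 1
    # Append only if this exact key line is not already present.
    if ! grep -qxF -f "$pub" "$auth"; then
        cat "$pub" >> "$auth"
    fi
}
```

On B this would be called as `add_authorized_key ~/id_rsa.pub`. Where it is installed, `ssh-copy-id` run from A does the copy and the append in one step.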

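Appendix 4: Checking which process is using a port

netstat -ntulp | grep 80   # show sockets on port 80 (grep also matches any line containing "80", such as port 8080)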
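
Because a plain grep matches any line containing the digits, a stricter filter can compare only the port part of the local address column. The sketch below assumes the usual `netstat -ntulp` column layout (local address in the fourth field); port_users is a hypothetical name.

```shell
# Sketch only: port_users is a hypothetical helper.
# Usage: netstat -ntulp | port_users PORT
port_users() {
    awk -v port="$1" '
        $1 ~ /^(tcp|udp)/ {
            # The local address looks like host:port; IPv6 addresses contain
            # extra colons, so take the last colon-separated piece.
            n = split($4, parts, ":")
            if (parts[n] == port) print
        }'
}
```

With this filter, `netstat -ntulp | port_users 80` skips a process listening on 8080.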

Setting up the ZooKeeper cluster

# In the ZooKeeper conf directory
mv zoo_sample.cfg zoo.cfg
vi zoo.cfg

# Modify or add
dataDir=/opt/zookeeper/data

server.1=master:2888:3888
server.2=slave:2888:3888
server.3=slave1:2888:3888


# Create myid under dataDir; the number in myid must match this node's server.N entry (1, 2 or 3)
vi /opt/zookeeper/data/myid
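
Since the myid value has to agree with the server.N lines, it can be derived from zoo.cfg instead of typed by hand on each node. A minimal sketch, assuming short hostnames without dots as used in this guide; myid_for_host is a hypothetical name.

```shell
# Sketch only: myid_for_host is a hypothetical helper.
# Usage: myid_for_host ZOO_CFG HOSTNAME
# Prints N from the line "server.N=HOSTNAME:2888:3888"; fails if no line matches.
myid_for_host() {
    awk -F'[.=:]' -v host="$2" '
        # Fields of "server.1=master:2888:3888": server, 1, master, 2888, 3888
        $1 == "server" && $3 == host { print $2; found = 1; exit }
        END { exit !found }' "$1"
}
```

On each node this could be used as `myid_for_host zoo.cfg "$(hostname)" > /opt/zookeeper/data/myid`, provided `hostname` returns the same short name that appears in zoo.cfg.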

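# Start. Note: start the instance on every node; the ensemble only works once a majority of nodes is up
cd /usr/local/software/zookeeper-3.4.6/bin/
./zkServer.sh start

# Check status
./zkServer.sh status

# Stop
./zkServer.sh stop

# For convenience, add the bin directory to PATH
export ZOOKEEPER_HOME=/usr/local/software/zookeeper-3.4.6

# Append to the end of the export PATH line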
:$ZOOKEEPER_HOME/bin:$ZOOKEEPER_HOME/sbin



# Default log file
/usr/local/software/zookeeper-3.4.6/bin/zookeeper.out

Hadoop environment variables

export HADOOP_HOME=/usr/local/software/hadoop-2.5.1
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH

core-site.xml configuration

<configuration>

  <property>
    <name>fs.defaultFS</name>
    <value>hdfs://mycluster</value>
  </property>

  <property>
    <!-- fsimage and edits -->
    <name>hadoop.tmp.dir</name>
    <value>/opt/hadoop-2.5</value>
  </property>

  <property>
    <name>ha.zookeeper.quorum</name>
    <value>master,slave,slave1</value>
  </property>

</configuration>
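
After editing, it helps to read a value back out of the file to confirm it is what was intended. The sketch below is a deliberately naive reader, not a real XML parser: it assumes each <name> and <value> element sits on a single line, as in the file above. get_prop is a hypothetical name.

```shell
# Sketch only: get_prop is a hypothetical helper, not a Hadoop command.
# Usage: get_prop CONFIG_FILE PROPERTY_NAME
get_prop() {
    awk -v key="$2" '
        /<name>/ {
            # Strip everything around the element text and remember
            # whether this is the property being looked for.
            n = $0
            gsub(/.*<name>|<\/name>.*/, "", n)
            found = (n == key)
        }
        /<value>/ && found {
            v = $0
            gsub(/.*<value>|<\/value>.*/, "", v)
            print v
            exit
        }' "$1"
}
```

For example, `get_prop core-site.xml fs.defaultFS` should print the nameservice URI. Once Hadoop is on the PATH, `hdfs getconf -confKey fs.defaultFS` reports the value Hadoop itself resolves.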

hdfs-site.xml configuration

<configuration> 
  <property> 
    <name>dfs.nameservices</name>  
    <value>mycluster</value> 
  </property>

  <property> 
    <name>dfs.ha.namenodes.mycluster</name>  
    <value>nn1,nn2</value> 
  </property>  

  <!-- RPC ports -->
  <property> 
    <name>dfs.namenode.rpc-address.mycluster.nn1</name>  
    <value>master:8020</value> 
  </property>  
  <property> 
    <name>dfs.namenode.rpc-address.mycluster.nn2</name>  
    <value>slave:8020</value> 
  </property>  

  <!-- Web UI ports -->
  <property> 
    <name>dfs.namenode.http-address.mycluster.nn1</name>  
    <value>master:50070</value> 
  </property>  
  <property> 
    <name>dfs.namenode.http-address.mycluster.nn2</name>  
    <value>slave:50070</value> 
  </property> 
    
  <!-- List of JournalNodes used to share the edit log -->
  <property> 
    <name>dfs.namenode.shared.edits.dir</name>  
    <value>qjournal://master:8485;slave:8485;slave1:8485/mycluster</value> 
  </property>  

  <property> 
    <name>dfs.client.failover.proxy.provider.mycluster</name>
   <value>org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider</value>
  </property> 

  <property> 
    <name>dfs.ha.fencing.methods</name>  
    <value>sshfence</value> 
  </property>

  <property> 
    <name>dfs.ha.fencing.ssh.private-key-files</name>  
    <value>/root/.ssh/id_rsa</value> 
  </property> 
 
  <!-- Directory on each JournalNode where the edits are stored -->
  <property> 
    <name>dfs.journalnode.edits.dir</name>  
    <value>/opt/journalnode/data</value> 
  </property>  

  <property> 
    <name>dfs.ha.automatic-failover.enabled</name>  
    <value>true</value> 
  </property> 
  
</configuration>
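
The dfs.namenode.shared.edits.dir value packs the JournalNode list and the journal ID into a single URI of the form qjournal://host1:port;host2:port;host3:port/journalId. A small sketch that pulls the hostnames back out, which is handy for checking that a JournalNode is started on each of them; journal_hosts is a hypothetical name.

```shell
# Sketch only: journal_hosts is a hypothetical helper.
# Usage: journal_hosts 'qjournal://h1:8485;h2:8485;h3:8485/journalId'
# Prints one JournalNode hostname per line.
journal_hosts() {
    printf '%s\n' "$1" \
        | sed -e 's|^qjournal://||' -e 's|/.*$||' \
        | tr ';' '\n' \
        | cut -d: -f1
}
```

For the value configured above this prints master, slave and slave1, which matches the nodes where the JournalNode daemon is started later in this guide.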

cd /usr/local/software/hadoop-2.5.1/etc/hadoop
vi masters
# Add
master



vi slaves
# Write the following
master
slave
slave1

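Send the Hadoop directory configured on master to slave and slave1

scp -r /usr/local/software/hadoop-2.5.1/ root@slave:/usr/local/software/
scp -r /usr/local/software/hadoop-2.5.1/ root@slave1:/usr/local/software/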

Start the JournalNode on every node (assumes the Hadoop environment variables are already configured)

cd /usr/local/software/hadoop-2.5.1/sbin
./hadoop-daemon.sh start journalnode

# Success if jps lists a JournalNode process, for example: 51190 JournalNode
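
Checking the jps output by eye gets tedious across three nodes, so the check can be scripted. A minimal sketch, assuming the usual two-column jps output (PID, then class name); has_daemon is a hypothetical name.

```shell
# Sketch only: has_daemon is a hypothetical helper.
# Usage: jps | has_daemon NAME
# Succeeds (exit status 0) if a process with exactly that name is listed.
has_daemon() {
    awk -v name="$1" '$2 == name { found = 1 } END { exit !found }'
}
```

For example: `jps | has_daemon JournalNode && echo "JournalNode is up"`. Once the whole cluster is running, the same check can be repeated for names such as QuorumPeerMain, NameNode, DataNode and DFSZKFailoverController, depending on the roles of the node.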

On master, format the NameNode, then copy the generated files to the standby NameNode

# The generated files are in the directory set by hadoop.tmp.dir in core-site.xml
hdfs namenode -format
scp -r /opt/hadoop-2.5/ root@slave:/opt 

On master, initialize the HA state in ZooKeeper

hdfs zkfc -formatZK

Start HDFS

cd /usr/local/software/hadoop-2.5.1/sbin/
./start-dfs.sh


Web access
 http://masternode:50070/dfshealth.html#tab-overview
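
Besides the web page, the command line can report which NameNode is currently active through `hdfs haadmin -getServiceState`. The sketch below loops over the NameNode IDs from dfs.ha.namenodes.mycluster; active_namenode is a hypothetical wrapper, and it assumes `hdfs` is on the PATH.

```shell
# Sketch only: active_namenode is a hypothetical wrapper around
# `hdfs haadmin -getServiceState`, which prints "active" or "standby".
# Usage: active_namenode nn1 nn2
active_namenode() {
    for id in "$@"; do
        state=$(hdfs haadmin -getServiceState "$id" 2>/dev/null)
        if [ "$state" = "active" ]; then
            echo "$id"
            return 0
        fi
    done
    # No NameNode reported itself as active.
    return 1
}
```

Running `active_namenode nn1 nn2` before and after stopping the active NameNode is a simple way to see automatic failover at work.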

end

Appendix: Startup order

1 Start ZooKeeper on all nodes
  cd /usr/local/software/zookeeper-3.4.6/bin/
  ./zkServer.sh start
2 Start the JournalNode on all nodes
  cd /usr/local/software/hadoop-2.5.1/sbin
  ./hadoop-daemon.sh start journalnode
3 Start HDFS from master
  cd /usr/local/software/hadoop-2.5.1/sbin/
  ./start-dfs.sh
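
The order above can be written down as a plan that prints one ssh command per step, so it can be reviewed before anything is run. This is a sketch under assumptions: startup_plan is a hypothetical name, it relies on passwordless ssh as root from the machine running it to all three nodes, and a non-interactive ssh session may not load the same environment (JAVA_HOME, for instance) as a login shell.

```shell
# Sketch only: startup_plan is a hypothetical helper that prints the
# commands in startup order; it does not run anything itself.
startup_plan() {
    zk_bin=/usr/local/software/zookeeper-3.4.6/bin
    hadoop_sbin=/usr/local/software/hadoop-2.5.1/sbin
    # 1. ZooKeeper on all nodes
    for host in master slave slave1; do
        echo "ssh root@$host $zk_bin/zkServer.sh start"
    done
    # 2. JournalNode on all nodes
    for host in master slave slave1; do
        echo "ssh root@$host $hadoop_sbin/hadoop-daemon.sh start journalnode"
    done
    # 3. HDFS, started from master
    echo "ssh root@master $hadoop_sbin/start-dfs.sh"
}
```

Run `startup_plan` to review the commands, and `startup_plan | sh` to execute them in order.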
posted @ 2020-04-02 18:35  加州风尘