Setting Up a Fully Distributed Hadoop Cluster
I. Virtual Machine Setup
1. Clone three virtual machines
Assign each clone its own IP address and hostname (node01, node02, node03)
2. Set up passwordless SSH login (run on every node)
ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa
Map hostnames to IP addresses
vim /etc/hosts
Copy the public key to all three nodes (repeat on every node)
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node01
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node02
ssh-copy-id -i ~/.ssh/id_rsa.pub root@node03
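As a sketch, the /etc/hosts mapping is the same on all three nodes; the 192.168.100.x addresses below are placeholders, so substitute the IPs you assigned when cloning:

```shell
# /etc/hosts entries on every node (IP addresses are assumptions)
192.168.100.101  node01
192.168.100.102  node02
192.168.100.103  node03
```

After the keys are copied, `ssh node02` from node01 should log in without a password prompt; verify this in both directions between every pair of nodes before continuing.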
3. Preparation
Upload the JDK and Hadoop archives and extract them
tar -zxvf hadoop-2.6.5.tar.gz
Move the extracted directory
mv hadoop-2.6.5 /opt/sxt/
Set the JDK and Hadoop environment variables
vim /etc/profile
export JAVA_HOME=/usr/java/jdk1.7.0_67
export HADOOP_HOME=/opt/sxt/hadoop-2.6.5
export PATH=$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH
Check the path: echo $PATH
Reload the configuration file: source /etc/profile
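The order of the PATH entries in the profile above matters: prepending $JAVA_HOME/bin makes this JDK shadow any system-installed Java. A minimal sketch, using the same paths as the profile:

```shell
# Prepend the JDK bin directory, as /etc/profile above does
export JAVA_HOME=/usr/java/jdk1.7.0_67
export PATH=$JAVA_HOME/bin:$PATH
# The first PATH component is now the JDK's bin directory,
# so `java` resolves there before any system java
echo "$PATH" | cut -d: -f1
```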
4. Configure the Hadoop Files
Set the JDK path (JAVA_HOME) in each of these scripts, all under /opt/sxt/hadoop-2.6.5/etc/hadoop/:
vim hadoop-env.sh
vim mapred-env.sh
vim yarn-env.sh
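Instead of editing the three scripts by hand, one hedged shortcut is to append the export at the end of each script; the last assignment wins, overriding any commented-out or placeholder default (this assumes the stock 2.6.5 script layout):

```shell
cd /opt/sxt/hadoop-2.6.5/etc/hadoop
for f in hadoop-env.sh mapred-env.sh yarn-env.sh; do
  # appended last, so it overrides any earlier JAVA_HOME default
  echo 'export JAVA_HOME=/usr/java/jdk1.7.0_67' >> "$f"
done
```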
Edit the Hadoop core configuration file
vim /opt/sxt/hadoop-2.6.5/etc/hadoop/core-site.xml
<property>
<name>fs.defaultFS</name>
<value>hdfs://node01:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/var/sxt/hadoop/full</value>
</property>
vim /opt/sxt/hadoop-2.6.5/etc/hadoop/hdfs-site.xml
<property>
<name>dfs.namenode.secondary.http-address</name>
<value>node02:50090</value>
</property>
<property>
<name>dfs.namenode.secondary.https-address</name>
<value>node02:50091</value>
</property>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
Create the temporary data directory
mkdir -p /var/sxt/hadoop/full
List the DataNode hosts in slaves
vim /opt/sxt/hadoop-2.6.5/etc/hadoop/slaves
node01
node02
node03
Copy the Hadoop directory to the other nodes
scp -r /opt/sxt/hadoop-2.6.5/ root@node02:/opt/sxt/
scp -r /opt/sxt/hadoop-2.6.5/ root@node03:/opt/sxt/
Copy the environment variable configuration to the other nodes (-r is not needed for a single file)
scp /etc/profile root@node02:/etc/profile
scp /etc/profile root@node03:/etc/profile
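The four copy commands above can be collapsed into one loop; a sketch:

```shell
for node in node02 node03; do
  scp -r /opt/sxt/hadoop-2.6.5 root@$node:/opt/sxt/   # Hadoop install tree
  scp /etc/profile root@$node:/etc/profile            # environment variables
done
```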
Reload the configuration file (on every node)
source /etc/profile
Format the NameNode (run on node01 only)
hdfs namenode -format
Start the cluster
start-dfs.sh
node01: starting namenode, logging to /opt/sxt/hadoop-2.6.5/logs/hadoop-root-namenode-node01.out
node02: starting datanode, logging to /opt/sxt/hadoop-2.6.5/logs/hadoop-root-datanode-node02.out
node03: starting datanode, logging to /opt/sxt/hadoop-2.6.5/logs/hadoop-root-datanode-node03.out
node01: starting datanode, logging to /opt/sxt/hadoop-2.6.5/logs/hadoop-root-datanode-node01.out
Starting secondary namenodes [node02]
node02: starting secondarynamenode, logging to /opt/sxt/hadoop-2.6.5/logs/hadoop-root-secondarynamenode-node02.out
Check the running processes on each node
jps
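Given the configuration above (NameNode on node01, SecondaryNameNode on node02, DataNodes on all three), jps on each node should report roughly the following process sets:

```shell
jps   # run on every node
# node01: NameNode, DataNode, Jps
# node02: SecondaryNameNode, DataNode, Jps
# node03: DataNode, Jps
```

If a DataNode is missing, check its log under /opt/sxt/hadoop-2.6.5/logs/ on that node.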
Create a directory and upload a file
hdfs dfs -mkdir -p /sxt/bigdata
hdfs dfs -D dfs.blocksize=1048576 -put jdk-7u67-linux-x64.rpm /user/root
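The -D dfs.blocksize=1048576 flag overrides the block size for this one upload (1048576 bytes = 1 MB), so the JDK RPM is split into many 1 MB blocks, each replicated twice per dfs.replication above. One way to inspect the resulting block layout:

```shell
hdfs fsck /user/root/jdk-7u67-linux-x64.rpm -files -blocks -locations
```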
Stop the cluster
stop-dfs.sh
Stopping namenodes on [node01]
node01: stopping namenode
node02: stopping datanode
node03: stopping datanode
node01: stopping datanode
Stopping secondary namenodes [node02]
node02: stopping secondarynamenode
Remember to take a snapshot of each virtual machine.