Setting Up a Hadoop Distributed Environment

1. System Environment
10.0.0.11  master   CentOS 6.6  x86_64
10.0.0.12  slave1   CentOS 6.6  x86_64
 
2. Configure hosts
Add entries for master and slave1 to the hosts file on both servers.
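For example, append these two entries (the IPs from the environment above) to /etc/hosts on both machines:
10.0.0.11  master
10.0.0.12  slave1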
 
3. Passwordless SSH Login
On master:
# ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
# cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
# If this worked, ssh master now logs in without a password.
Run the same steps on slave1; if successful, ssh slave1 on slave1 also works without a password.
Append the contents of master's public key id_dsa.pub to authorized_keys on slave1,
and the contents of slave1's public key id_dsa.pub to authorized_keys on master (one way to do this is sketched below).
Once both are done, master and slave1 can log in to each other without a password.
Note: let ssh-keygen create the .ssh directory rather than making it by hand; if you do create it manually, its permissions must be 700 (and authorized_keys 600), or sshd will refuse key-based login.
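One way to exchange the keys (just a sketch; scp asks for the password once, since passwordless login is not yet in place):
on master:  scp ~/.ssh/id_dsa.pub slave1:/tmp/master_id_dsa.pub
on slave1:  cat /tmp/master_id_dsa.pub >> ~/.ssh/authorized_keys
Then repeat in the opposite direction for slave1's key.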
 
4. Install Java
yum -y install java-1.7.0-openjdk*
After installation, configure the environment variable:
edit /etc/profile and append export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79.x86_64 at the end,
then run source /etc/profile so the change takes effect immediately.
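For example, the lines appended to /etc/profile might look like this (the exact JDK directory name depends on the build yum installs, so check /usr/lib/jvm first; adding bin to PATH is optional but convenient):
export JAVA_HOME=/usr/lib/jvm/java-1.7.0-openjdk-1.7.0.79.x86_64
export PATH=$JAVA_HOME/bin:$PATH
Afterwards, java -version should report OpenJDK 1.7.0.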
 
5. Install Hadoop
Download the latest stable binary package of Hadoop from the official site (2.6.0 in this article)
and extract it to /usr/local/hadoop; example commands are shown below.
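For example (the mirror URL here is an assumption; use whichever mirror the Hadoop download page offers):
wget http://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/hadoop-2.6.0.tar.gz
tar -xzf hadoop-2.6.0.tar.gz -C /usr/local/
mv /usr/local/hadoop-2.6.0 /usr/local/hadoop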
Edit /etc/profile and append:
export HADOOP_PREFIX=/usr/local/hadoop
export PATH=$PATH:/usr/local/hadoop/bin:/usr/local/hadoop/sbin
Run source /etc/profile so the changes take effect immediately.
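If the PATH is picked up correctly, the following should print the Hadoop 2.6.0 version information:
hadoop version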
 
Create the working directories:
mkdir /usr/local/hadoop/dfs
mkdir /usr/local/hadoop/dfs/name
mkdir /usr/local/hadoop/dfs/data
mkdir /usr/local/hadoop/tmp
 
6. Hadoop Configuration
1. etc/hadoop/slaves
List the hostnames of all slave nodes, one per line; this article has only slave1.
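With the setup in this article the file contains a single line:
slave1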
 
2. etc/hadoop/core-site.xml
<configuration>
    <property>
        <name>fs.defaultFS</name>
        <value>hdfs://master:8020</value>
    </property>
    <property>
        <name>io.file.buffer.size</name>
        <value>131072</value>
    </property>
    <property>
        <name>hadoop.tmp.dir</name>
        <value>file:/usr/local/hadoop/tmp</value>
        <description>A base for other temporary directories.</description>
    </property>
</configuration>
 
3. etc/hadoop/hdfs-site.xml
<configuration>
    <property>
        <name>dfs.namenode.secondary.http-address</name>
        <value>master:9001</value>
    </property>
    <property>
        <name>dfs.namenode.name.dir</name>
        <value>file:/usr/local/hadoop/dfs/name</value>
    </property>
    <property>
        <name>dfs.datanode.data.dir</name>
        <value>file:/usr/local/hadoop/dfs/data</value>
    </property>
    <property>
        <name>dfs.replication</name>
        <value>1</value>
    </property>
    <property>
        <name>dfs.webhdfs.enabled</name>
        <value>true</value>
    </property>
</configuration>

 

4. etc/hadoop/mapred-site.xml
(This file is not present by default in the 2.6.0 distribution; copy etc/hadoop/mapred-site.xml.template to mapred-site.xml first.)
<configuration>
    <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.address</name>
        <value>master:10020</value>
    </property>
    <property>
        <name>mapreduce.jobhistory.webapp.address</name>
        <value>master:19888</value>
    </property>
</configuration>

 

5. etc/hadoop/yarn-site.xml
<configuration>
    <property>
        <name>yarn.nodemanager.aux-services</name>
        <value>mapreduce_shuffle</value>
    </property>
    <property>
        <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
        <value>org.apache.hadoop.mapred.ShuffleHandler</value>
    </property>
    <property>
        <name>yarn.resourcemanager.address</name>
        <value>master:8032</value>
    </property>
    <property>
        <name>yarn.resourcemanager.scheduler.address</name>
        <value>master:8030</value>
    </property>
    <property>
        <name>yarn.resourcemanager.resource-tracker.address</name>
        <value>master:8031</value>
    </property>
    <property>
        <name>yarn.resourcemanager.admin.address</name>
        <value>master:8033</value>
    </property>
    <property>
        <name>yarn.resourcemanager.webapp.address</name>
        <value>master:8088</value>
    </property>
</configuration>
 
6. Copy the Hadoop directory to every node.
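For example, from master (assuming slave1 uses the same /usr/local/hadoop path and the same Java setup):
scp -r /usr/local/hadoop root@slave1:/usr/local/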
 
7. Startup
1. Format the NameNode:
hdfs namenode -format
(The older form hadoop namenode -format still works in 2.x but is deprecated.)
2. Run start-all.sh on master to start all daemons (or, since start-all.sh is deprecated, run start-dfs.sh followed by start-yarn.sh).
Use jps to check the process status on master and slave1:
master: NameNode, SecondaryNameNode, ResourceManager
slave1: DataNode, NodeManager
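A couple of further checks (a sketch; 50070 is the default NameNode web UI port in Hadoop 2.6, and 8088 comes from yarn-site.xml above):
hdfs dfsadmin -report
http://master:50070    HDFS NameNode web UI
http://master:8088     YARN ResourceManager web UI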
 
 
 
 
 
 