Configuring a CDH3 Cluster with alternatives

1. Switching configuration directories with alternatives

# Copy the default configuration into a conf.cluster directory
sudo mkdir -p /etc/hadoop-0.20/conf.cluster
sudo cp -r /etc/hadoop-0.20/conf.empty/* /etc/hadoop-0.20/conf.cluster
# Register the directory with alternatives. The trailing 50 is the priority:
# in auto mode, alternatives activates the registered entry with the highest priority.
sudo alternatives --install /etc/hadoop-0.20/conf hadoop-0.20-conf /etc/hadoop-0.20/conf.cluster 50
# Show all registered configurations and which one is active
sudo alternatives --display hadoop-0.20-conf
# Remove an entry. This does not delete the files, only the registration;
# under the hood, alternatives manages the candidate directories with symlinks (ln -s)
sudo alternatives --remove hadoop-0.20-conf /etc/hadoop-0.20/conf.cluster
# Make conf.cluster the active Hadoop configuration: after --install, just --set it
sudo alternatives --set hadoop-0.20-conf /etc/hadoop-0.20/conf.cluster
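
Because everything is symlink-based, you can verify which directory is live with readlink (a quick sanity check; /etc/alternatives is the standard link location on CentOS/RHEL):

readlink /etc/hadoop-0.20/conf
# -> /etc/alternatives/hadoop-0.20-conf
readlink /etc/alternatives/hadoop-0.20-conf
# -> /etc/hadoop-0.20/conf.cluster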
  
2. CDH keeps its metadata under /tmp/hadoop-hdfs/ by default; if you move it elsewhere, you must create the directories yourself and assign the proper owner and group

a. Edit /etc/hosts, add the hostnames of every cluster node, and sync the file to all nodes (sample entries below).
b. Edit /etc/hadoop-0.20/conf.cluster/masters and /etc/hadoop-0.20/conf.cluster/slaves, add the corresponding hostnames, and sync both files to all nodes.
c. Edit core-site.xml, hdfs-site.xml, and mapred-site.xml under /etc/hadoop-0.20/conf.cluster (the <property> blocks below go inside each file's <configuration> element), and sync them to all nodes.
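
For example, with one master and two slaves (the IP addresses below are placeholders; substitute your own):

# /etc/hosts
192.168.1.10  cdh3-master1
192.168.1.11  cdh3-slave1
192.168.1.12  cdh3-slave2

# conf.cluster/masters (hosts that run the secondarynamenode)
cdh3-master1

# conf.cluster/slaves (hosts that run datanode/tasktracker)
cdh3-slave1
cdh3-slave2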
  
<!-- core-site.xml -->
<property>
  <name>fs.default.name</name>
  <value>hdfs://cdh3-master1:9000</value>
</property>
<property>
  <name>hadoop.tmp.dir</name>
  <value>/tmp/hadoop-cdh/cache/</value>
</property>
  
<!-- hdfs-site.xml -->
<property>
  <name>dfs.name.dir</name>
  <value>/data/dfs/nn</value>
</property>
<property>
  <name>dfs.data.dir</name>
  <value>/data/dfs/dn</value>
</property>
<property>
  <name>dfs.replication</name>
  <value>2</value>
</property>
  
  
<!-- mapred-site.xml -->
<property>
  <name>mapred.job.tracker</name>
  <value>cdh3-master1:9001</value>
</property>
  
<property>
  <name>mapred.local.dir</name>
  <value>/data/mapred/local</value>
</property>
  
scp masters slaves hdfs-site.xml core-site.xml mapred-site.xml root@cdh3-slave1:/etc/hadoop-0.20/conf.cluster/
scp masters slaves hdfs-site.xml core-site.xml mapred-site.xml root@cdh3-slave2:/etc/hadoop-0.20/conf.cluster/
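
With more nodes, a small loop saves typing (same files, same hosts as above):

cd /etc/hadoop-0.20/conf.cluster
for host in cdh3-slave1 cdh3-slave2; do
  scp masters slaves hdfs-site.xml core-site.xml mapred-site.xml root@$host:/etc/hadoop-0.20/conf.cluster/
done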
     
3. Create the directories referenced by the configuration and assign ownership

# nn is needed on the master; dn and mapred/local on every node
sudo mkdir -p /data/dfs/nn
sudo mkdir -p /data/dfs/dn
sudo mkdir -p /data/mapred/local

sudo chown -R hdfs:hadoop /data/dfs/nn
sudo chown -R hdfs:hadoop /data/dfs/dn
sudo chown -R mapred:hadoop /data/mapred/local

# Format HDFS (on the master, as the hdfs user, since hdfs owns the name directory)
sudo -u hdfs hadoop namenode -format
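
If the format succeeded, the name directory should now hold the filesystem metadata (a quick check; expect files such as VERSION, fsimage, and edits, owned by hdfs:hadoop):

sudo ls -l /data/dfs/nn/current/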
  
4. Start the daemons and register them as system services

# On the master
sudo service hadoop-0.20-namenode start
sudo service hadoop-0.20-secondarynamenode start
# On each slave
sudo service hadoop-0.20-datanode start
  
# Create the mapred system directory in HDFS before starting MapReduce
sudo -u hdfs hadoop fs -mkdir /mapred/system
sudo -u hdfs hadoop fs -chown mapred:hadoop /mapred/system
  
# MapReduce daemons: tasktracker on each slave, jobtracker on the master
sudo service hadoop-0.20-tasktracker start
sudo service hadoop-0.20-jobtracker start
  
# Start at boot (enable on each node only the services it actually runs)
sudo chkconfig hadoop-0.20-namenode on
sudo chkconfig hadoop-0.20-jobtracker on
sudo chkconfig hadoop-0.20-secondarynamenode on
sudo chkconfig hadoop-0.20-tasktracker on
sudo chkconfig hadoop-0.20-datanode on
  

# The daemons start fine this way, but ps -ax | grep hadoop shows two processes
# for every service. I haven't figured out why; any insight is welcome.
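
Once everything is up, a quick smoke test confirms HDFS is healthy (a sketch; /smoketest is an arbitrary path):

# All datanodes should report in
sudo -u hdfs hadoop dfsadmin -report
# Round-trip a small file through HDFS
sudo -u hdfs hadoop fs -mkdir /smoketest
sudo -u hdfs hadoop fs -put /etc/hosts /smoketest/
sudo -u hdfs hadoop fs -cat /smoketest/hosts
sudo -u hdfs hadoop fs -rmr /smoketest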

  
5. Another way to start the cluster: the scripts under hadoop/bin (recommended)

# First sort out permissions: add the cdh user to the hadoop group, so the group
# contains three users: hdfs, mapred, and cdh (recommended; see the command below).
# Alternatively, adjust permissions on the Hadoop directories directly. Either way,
# plan the ownership of every directory Hadoop uses by user and group.
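
Adding the user to the group is one command (assuming the user is named cdh, as elsewhere in this post):

sudo usermod -a -G hadoop cdh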
  
# A blunt but workable approach: open up the log and pid directories,
# and hand /data to the cdh user
sudo chmod -R 777 /usr/lib/hadoop-0.20/logs
sudo chmod -R 777 /usr/lib/hadoop-0.20/pids
sudo chown -R cdh:cdh /data
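
Note that start-all.sh logs into every host listed in masters and slaves over SSH, so the launching user needs passwordless SSH to each node first (a sketch, run as the cdh user; hostnames as assumed above):

ssh-keygen -t rsa      # accept the defaults, empty passphrase
ssh-copy-id cdh@cdh3-master1
ssh-copy-id cdh@cdh3-slave1
ssh-copy-id cdh@cdh3-slave2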
  
# Then start the whole cluster
sh /usr/lib/hadoop-0.20/bin/start-all.sh
   
http://log.medcl.net/item/2011/02/hadoop-cluster-configuration-centos/

http://archive.cloudera.com/cdh/3/hadoop-0.20.2-cdh3u0/index.html
