Setting up a local Hadoop test environment
Operating system: Ubuntu 9.10
Download Hadoop: hadoop-0.20.1.tar.gz
Install dependencies:
Java 1.6.x or later
$ sudo apt-get install sun-java6-bin
Set the JAVA_HOME variable in conf/hadoop-env.sh:
export JAVA_HOME=/usr/lib/jvm/java-6-sun
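The exact JVM directory name can vary between installs; a quick way to confirm the path before editing conf/hadoop-env.sh:
$ ls /usr/lib/jvm/
$ java -version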
ssh and sshd
$ sudo apt-get install openssh-server
Hadoop test environment configuration (single machine, pseudo-distributed mode)
NameNode configuration
conf/core-site.xml:
<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
HDFS configuration
conf/hdfs-site.xml: no block replication is needed on a single node
<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>
JobTracker configuration
conf/mapred-site.xml:
<configuration>
<property>
<name>mapred.job.tracker</name>
<value>localhost:9001</value>
</property>
</configuration>
Passwordless ssh login configuration
$ ssh localhost
# If that fails, set up key-based authentication:
$ ssh-keygen -t dsa -P '' -f ~/.ssh/id_dsa
$ cat ~/.ssh/id_dsa.pub >> ~/.ssh/authorized_keys
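If ssh localhost still prompts for a password, it is often a permissions issue on the key files; with a default OpenSSH setup the following usually fixes it:
$ chmod 700 ~/.ssh
$ chmod 600 ~/.ssh/authorized_keys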
Initialize HDFS and start Hadoop
$ bin/hadoop namenode -format
$ bin/start-all.sh
# All daemon logs are written to ${HADOOP_LOG_DIR} (default: ${HADOOP_HOME}/logs).
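A quick way to confirm that all five daemons started (jps ships with the Sun JDK):
$ jps
# Expect NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker in the list.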
Now open the NameNode and JobTracker web UIs:
NameNode - http://localhost:50070/
JobTracker - http://localhost:50030/
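To verify the whole setup end to end, you can run one of the bundled example jobs (a minimal sketch; the jar name assumes the stock hadoop-0.20.1 distribution):
$ bin/hadoop fs -put conf input
$ bin/hadoop jar hadoop-0.20.1-examples.jar grep input output 'dfs[a-z.]+'
$ bin/hadoop fs -cat output/*
# When finished, stop all daemons:
$ bin/stop-all.sh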