Setting Up a Hadoop Distributed Development Environment
1. Install the Java environment
Add the Java environment variables:
vi /etc/profile
# add by tank
export JAVA_HOME=/data/soft/jdk/jdk1.7.0_71
export JRE_HOME=$JAVA_HOME/jre
export CLASSPATH=.:$JAVA_HOME/lib/dt.jar:$JAVA_HOME/lib/tools.jar
export PATH=$PATH:$JAVA_HOME/bin:$JRE_HOME/bin
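After reloading the profile it can help to confirm that JAVA_HOME really points at a JDK. The following is a minimal sketch of such a check; the function name check_java_home is hypothetical and not part of the original setup.

```shell
# Hypothetical helper: succeed only if the given directory looks like a JDK,
# i.e. it contains an executable bin/java.
check_java_home() {
    local home="$1"
    if [ -x "$home/bin/java" ]; then
        echo "OK: $home/bin/java is executable"
    else
        echo "MISSING: $home/bin/java" >&2
        return 1
    fi
}

# Typical use after editing /etc/profile:
#   source /etc/profile
#   check_java_home "$JAVA_HOME" && java -version
```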
2. Raise the file handle limit
vi /etc/security/limits.conf
# add by tank
* soft nofile 65536
* hard nofile 65536
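The limits in limits.conf are applied at login, so they only show up in a new session. A minimal sketch for checking the result follows; the function name nofile_at_least is hypothetical.

```shell
# Hypothetical helper: succeed if the current soft limit on open files
# is at least the requested value ("unlimited" always passes).
nofile_at_least() {
    local want="$1"
    local have
    have=$(ulimit -n)
    if [ "$have" = "unlimited" ]; then
        return 0
    fi
    [ "$have" -ge "$want" ]
}

# Typical use, in a fresh login session:
#   nofile_at_least 65536 && echo "limit applied" || echo "limit still too low"
```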
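3. Set up passwordless SSH login
Reference: http://www.cnblogs.com/tankaixiong/p/4172942.html
Every host must be able to log in to every other host without a password.
The authorized_keys file contains the public keys of all hosts. With many hosts, authorized_keys can be shared through an NFS mount, so that a single change applies everywhere.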
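One possible way to build the shared authorized_keys file is to collect each host's public key into a directory and merge them. This is only a sketch under that assumption; the function name merge_pubkeys and the example paths are hypothetical, and the referenced article may use a different procedure.

```shell
# Hypothetical helper: concatenate every *.pub file in a directory into one
# authorized_keys file, dropping duplicate lines and restricting permissions.
merge_pubkeys() {
    local keydir="$1"
    local out="$2"
    cat "$keydir"/*.pub | sort -u > "$out"
    chmod 600 "$out"
}

# Typical use: generate a key pair on each host first (standard OpenSSH command):
#   ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
# then, after collecting every host's id_rsa.pub into one directory:
#   merge_pubkeys /path/to/collected_keys ~/.ssh/authorized_keys
```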
4. Configure the hosts file
vi /etc/hosts
127.0.0.1 localhost localhost.localdomain localhost4 localhost4.localdomain4
::1 localhost localhost.localdomain localhost6 localhost6.localdomain6
192.168.183.130 tank1
192.168.183.131 tank2
192.168.183.132 tank3
192.168.183.133 tank4
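Since the same entries have to exist on every node, a quick check for missing names can catch typos early. The sketch below assumes the tank1 to tank4 names used above; the function name missing_hosts is hypothetical.

```shell
# Hypothetical helper: print every hostname that has no entry in the hosts file.
# Usage: missing_hosts <hosts-file> <name>...
missing_hosts() {
    local file="$1"
    shift
    local name
    for name in "$@"; do
        # Match the name as a whole field after the address, ignoring comment lines.
        awk -v n="$name" '
            /^[ \t]*#/ { next }
            { for (i = 2; i <= NF; i++) if ($i == n) found = 1 }
            END { exit (found ? 0 : 1) }
        ' "$file" || echo "$name"
    done
}

# Typical use:
#   missing_hosts /etc/hosts tank1 tank2 tank3 tank4
```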
5. Install Hadoop
This setup uses Hadoop 2.2.0.
Directory structure:
Set the environment variables:
export HADOOP_HOME=/data/hadoop/hadoop-2.2.0
export PATH=$PATH:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
Note: the files under $HADOOP_HOME/bin and $HADOOP_HOME/sbin must all have execute permission.
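A quick way to spot files that are missing the execute bit is sketched below. The function name non_executable is hypothetical, and the sketch only looks at the owner execute bit of files directly inside the directory.

```shell
# Hypothetical helper: list regular files directly under a directory that
# lack the owner execute bit.
non_executable() {
    find "$1" -maxdepth 1 -type f ! -perm -100
}

# Typical use:
#   non_executable "$HADOOP_HOME/bin"
#   non_executable "$HADOOP_HOME/sbin"
# and, if anything is listed:
#   chmod +x "$HADOOP_HOME"/bin/* "$HADOOP_HOME"/sbin/*
```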
Edit the configuration files:
[tank@192 hadoop]$ vi core-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<!--
  Licensed under the Apache License, Version 2.0 (the "License");
  you may not use this file except in compliance with the License.
  You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

  Unless required by applicable law or agreed to in writing, software
  distributed under the License is distributed on an "AS IS" BASIS,
  WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
  See the License for the specific language governing permissions and
  limitations under the License. See accompanying LICENSE file.
-->
<!-- Put site-specific property overrides in this file. -->
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/usr/hadoop/tmp</value>
    <description>(Note: create the tmp folder under /usr/hadoop first.) A base for other temporary directories.</description>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://192.168.149.128:9000</value>
  </property>
</configuration>
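Note: if hadoop.tmp.dir is not set, the default temporary directory is /tmp/hadoop-${user.name} (for example /tmp/hadoop-hadoop). That directory is wiped on every reboot, after which the NameNode has to be formatted again or startup fails.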
[tank@192 hadoop]$ vi hdfs-site.xml
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/data/soft/hadoop/hadoop-2.2.0/hdfs/name</value>
    <final>true</final>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/data/soft/hadoop/hadoop-2.2.0/hdfs/data</value>
  </property>
</configuration>
These directories must be created in advance and already exist!
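The directories named in hdfs-site.xml can be created with mkdir -p. A minimal sketch follows; the function name make_hdfs_dirs is hypothetical, and the base path in the usage comment is taken from the configuration above.

```shell
# Hypothetical helper: create the NameNode and DataNode directories
# (hdfs/name and hdfs/data) under a base path.
make_hdfs_dirs() {
    local base="$1"
    mkdir -p "$base/hdfs/name" "$base/hdfs/data"
}

# Typical use, matching the paths in hdfs-site.xml:
#   make_hdfs_dirs /data/soft/hadoop/hadoop-2.2.0
# and for hadoop.tmp.dir from core-site.xml:
#   mkdir -p /usr/hadoop/tmp
```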
[tank@192 hadoop]$ vi yarn-site.xml
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>192.168.149.128:9001</value>
  </property>
</configuration>
Note: always use the IP address above, not localhost, otherwise Eclipse cannot connect!
Define the master and slave roles in the $HADOOP_HOME/etc/hadoop/ directory:
[hadoop@tank1 hadoop]$ vi masters
192.168.183.130
// The masters file is only needed on the master; the slaves do not need it.
[hadoop@tank1 hadoop]$ vi slaves
192.168.183.131
192.168.183.132
192.168.183.133
[hadoop@tank1 hadoop]$ hadoop namenode -format // only needed the first time
Start the cluster:
sbin/start-all.sh
Check the status on the master:
[tank@192 hadoop-2.2.0]$ jps
2751 ResourceManager
2628 SecondaryNameNode
2469 NameNode
Check the status on a slave:
[hadoop@tank2 sbin]$ jps
1745 NodeManager
1658 DataNode
In total there are five kinds of Hadoop daemon process: three on the master and two on each slave.
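The jps listings above can also be checked automatically. The sketch below reads jps output and reports which expected daemons are absent; the function name missing_daemons is hypothetical.

```shell
# Hypothetical helper: read `jps` output on stdin and print each expected
# daemon name (given as arguments) that does not appear in it.
missing_daemons() {
    local seen
    local d
    # The second column of jps output is the main class name.
    seen=$(awk '{ print $2 }')
    for d in "$@"; do
        # -x matches whole lines, so NameNode does not match SecondaryNameNode.
        echo "$seen" | grep -qxF "$d" || echo "$d"
    done
}

# Typical use:
#   jps | missing_daemons NameNode SecondaryNameNode ResourceManager   # on the master
#   jps | missing_daemons DataNode NodeManager                         # on a slave
```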
Open this address to check the running state of HDFS:
http://192.168.149.128:50070/dfshealth.jsp
After running for a while, the NodeManager on the slave stopped. The log shows:
FATAL org.apache.hadoop.yarn.server.nodemanager.NodeManager: Error starting NodeManager
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: java.net.ConnectException: Call From tank2/192.168.183.131 to 0.0.0.0:8031 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
at org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl.serviceStart(NodeStatusUpdaterImpl.java:181)
Solution:
Add the following property to yarn-site.xml:
<property>
<name>yarn.resourcemanager.hostname</name>
<value>192.168.183.130</value> <!-- master IP -->
</property>
Stop the cluster, then start it again.
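The 0.0.0.0:8031 address in the log is what the NodeManager falls back to when yarn.resourcemanager.hostname is left at its default, so the property presumably has to be present in yarn-site.xml on the slaves as well. A minimal sketch for checking a file before restarting follows; the function name has_rm_hostname is hypothetical, and it does a plain text match rather than parsing the XML.

```shell
# Hypothetical helper: succeed if the given yarn-site.xml mentions the
# yarn.resourcemanager.hostname property (plain text match, not an XML parse).
has_rm_hostname() {
    grep -qF '<name>yarn.resourcemanager.hostname</name>' "$1"
}

# Typical sequence on the master, using the scripts shipped in $HADOOP_HOME/sbin:
#   has_rm_hostname "$HADOOP_HOME/etc/hadoop/yarn-site.xml" || echo "property missing"
#   stop-all.sh
#   start-all.sh
```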