Hadoop 2.x完全分布式安装
前期规划
192.168.100.231 db01 192.168.100.232 db02 192.168.100.233 db03 |
一、安装java
[root@master ~]# vim /etc/profile
在末尾添加环境变量:
export JAVA_HOME=/usr/java/jdk1.7.0_67-cloudera
export PATH=$JAVA_HOME/bin:$JAVA_HOME/jre/bin:$PATH
export CLASSPATH=.:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$CLASSPATH
检查java是否安装成功:
[root@master ~]# java -version
二、创建hadoop用户用于安装软件
groupadd hadoop
useradd -g hadoop hadoop
echo "dbking588" | passwd --stdin hadoop
配置环境变量:
export HADOOP_HOME=/opt/cdh-5.3.6/hadoop-2.5.0
export PATH=$HADOOP_HOME/bin:$HADOOP_HOME/sbin:$PATH:$HOME/bin
三、安装hadoop
# cd /opt/software
# tar -zxvf hadoop-2.5.0.tar.gz -C /opt/cdh-5.3.6/
# chown -R hadoop:hadoop /opt/cdh-5.3.6/hadoop-2.5.0
四、配置SSH
--配置方法: $ ssh-keygen -t rsa $ ssh-copy-id db07.chavin.king (ssh-copy-id方式只能用于rsa加密秘钥配置,测试对于dsa加密配置无效) |
--验证: [hadoop@db01 ~]$ ssh db02 date Wed Apr 19 09:57:34 CST 2017 |
五、编辑hadoop配置文件
需要配置的文件包括:
HDFS配置文件:
etc/hadoop/hadoop-env.sh
etc/hadoop/core-site.xml
etc/hadoop/hdfs-site.xml
etc/haoop/slaves
YARN配置文件:
etc/hadoop/yarn-env.sh
etc/hadoop/yarn-site.xml
etc/haoop/slaves
MapReduce配置文件:
etc/hadoop/mapred-env.sh
etc/hadoop/mapred-site.xml
配置文件内容如下:
[hadoop@db01 hadoop-2.5.0]$ cat etc/hadoop/core-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <property> <name>fs.defaultFS</name> <value>hdfs://db01:9000</value> </property> <property> <name>hadoop.tmp.dir</name> <value>/opt/cdh-5.3.6/hadoop-2.5.0/data/tmp</value> </property> <property> <name>fs.trash.interval</name> <value>7000</value> </property> </configuration> |
[hadoop@db01 hadoop-2.5.0]$ cat etc/hadoop/hdfs-site.xml <?xml version="1.0" encoding="UTF-8"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <property> <name>dfs.namenode.secondary.http-address</name> <value>db03:50090</value> </property> </configuration> |
[hadoop@db01 hadoop-2.5.0]$ cat etc/hadoop/yarn-site.xml <?xml version="1.0"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <configuration> <property> <name>yarn.nodemanager.aux-services</name> <value>mapreduce_shuffle</value> </property> <property> <name>yarn.resourcemanager.hostname</name> <value>db02</value> </property> <property> <name>yarn.log-aggregation-enable</name> <value>true</value> </property> <property> <name>yarn.log-aggregation.retain-seconds</name> <value>600000</value> </property> </configuration> |
[hadoop@db01 hadoop-2.5.0]$ cat etc/hadoop/mapred-site.xml <?xml version="1.0"?> <?xml-stylesheet type="text/xsl" href="configuration.xsl"?> <!-- Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License. See accompanying LICENSE file. --> <!-- Put site-specific property overrides in this file. --> <configuration> <property> <name>mapreduce.framework.name</name> <value>yarn</value> </property> <property> <name>mapreduce.jobhistory.address</name> <value>db01:10020</value> </property> <property> <name>mapreduce.jobhistory.webapp.address</name> <value>db01:19888</value> </property> </configuration> |
[hadoop@db01 hadoop-2.5.0]$ cat etc/hadoop/slaves db01 db02 db03 |
在以下文件中修改Java环境变量: etc/hadoop/hadoop-env.sh etc/hadoop/yarn-env.sh etc/hadoop/mapred-env.sh |
创建数据目录: /opt/cdh-5.3.6/hadoop-2.5.0/data/tmp |
六、格式化HDFS
[hadoop@db01 hadoop-2.5.0]$ hdfs namenode -format
七、启动hadoop
*启动方式1:各个服务器逐一启动(比较常用,可编写shell脚本)
hdfs:
sbin/hadoop-daemon.sh start|stop namenode
sbin/hadoop-daemon.sh start|stop datanode
sbin/hadoop-daemon.sh start|stop secondarynamenode
yarn:
sbin/yarn-daemon.sh start|stop resourcemanager
sbin/yarn-daemon.sh start|stop nodemanager
mapreduce:
sbin/mr-jobhistory-daemon.sh start|stop historyserver
*启动方式2:各个模块分开启动:需要配置ssh对等性,需要在namenode上运行
hdfs:
sbin/start-dfs.sh
sbin/start-yarn.sh
yarn:
sbin/stop-dfs.sh
sbin/stop-yarn.sh
*启动方式3:全部启动:不建议使用,这个命令需要在namenode上运行,但是会同时叫secondaryname节点也启动到namenode节点
sbin/start-all.sh
sbin/stop-all.sh
八、测试集群
[hadoop@db01 logs]$ cd ~/hadoop-2.5.2/share/hadoop/mapreduce/
[hadoop@db02 mapreduce]$ hadoop jar hadoop-mapreduce-examples-2.5.2.jar pi 10 10
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· C#/.NET/.NET Core优秀项目和框架2025年2月简报
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 【杭电多校比赛记录】2025“钉耙编程”中国大学生算法设计春季联赛(1)