Setting up a Hadoop environment on CentOS 7
Reference blog: https://blog.csdn.net/yujuan110/article/details/78457259
Prerequisites:
1. JDK 1.8
2. Three nodes
3. Address mapping in /etc/hosts
4. Passwordless SSH login (a minimal example of items 3 and 4 follows this list)
5. The Hadoop binary package downloaded: hadoop-2.6.4.tar.gz
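For reference, a minimal sketch of prerequisites 3 and 4, assuming three hosts named spark01, spark02 and spark03 with illustrative IP addresses (adjust everything to your own environment):
# /etc/hosts, identical on every node (example addresses):
192.168.1.101 spark01
192.168.1.102 spark02
192.168.1.103 spark03
# Passwordless SSH from the master (spark01) to every node, including itself:
ssh-keygen -t rsa
ssh-copy-id spark01
ssh-copy-id spark02
ssh-copy-id spark03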
1. Extract and move the package
tar -zxvf hadoop-2.6.4.tar.gz
mv hadoop-2.6.4 /opt/hadoop
(Note: the rest of this guide uses /home/spark/apps/hadoop-2.6.4 as the Hadoop home; adjust the paths below to match wherever you actually place the directory.)
2. Modify the configuration files
Go to the /home/spark/apps/hadoop-2.6.4/etc/hadoop/ directory.
Set JAVA_HOME in both hadoop-env.sh and yarn-env.sh:
export JAVA_HOME=/usr/java/jdk1.8.0_151
(Use the actual install path of your JDK 1.8.)
core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <!-- spark01 is the master node's host name; use its IP if no hosts mapping is configured -->
    <value>hdfs://spark01:9000</value>
  </property>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/spark/apps/hadoop-2.6.4/tmp</value>
  </property>
</configuration>
hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>2</value>
  </property>
  <property>
    <name>dfs.namenode.name.dir</name>
    <value>file:/home/spark/apps/hadoop-2.6.4/dfs/name</value>
  </property>
  <property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/home/spark/apps/hadoop-2.6.4/dfs/data</value>
  </property>
</configuration>
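The tmp, name and data directories referenced in core-site.xml and hdfs-site.xml above can be created up front so the configured paths are guaranteed to exist (an optional sketch, using the paths from this guide):
mkdir -p /home/spark/apps/hadoop-2.6.4/tmp
mkdir -p /home/spark/apps/hadoop-2.6.4/dfs/name
mkdir -p /home/spark/apps/hadoop-2.6.4/dfs/data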
mapred-site.xml (if this file does not exist yet, create it from the shipped mapred-site.xml.template first):
<configuration>
  <property>
    <name>mapreduce.framework.name</name>
    <value>yarn</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.address</name>
    <value>spark01:10020</value>
  </property>
  <property>
    <name>mapreduce.jobhistory.webapp.address</name>
    <value>spark01:19888</value>
  </property>
</configuration>
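Note: the two jobhistory addresses above are only served while the JobHistory server process is running, and start-all.sh (used in step 4) does not start it. If you want the history UI, start it separately on spark01, for example:
sbin/mr-jobhistory-daemon.sh start historyserver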
yarn-site.xml:
<configuration>
  <!-- Site specific YARN configuration properties -->
  <property>
    <name>yarn.nodemanager.aux-services</name>
    <value>mapreduce_shuffle</value>
  </property>
  <property>
    <name>yarn.nodemanager.aux-services.mapreduce.shuffle.class</name>
    <value>org.apache.hadoop.mapred.ShuffleHandler</value>
  </property>
  <property>
    <name>yarn.resourcemanager.address</name>
    <value>spark01:8032</value>
  </property>
  <property>
    <name>yarn.resourcemanager.scheduler.address</name>
    <value>spark01:8030</value>
  </property>
  <property>
    <name>yarn.resourcemanager.resource-tracker.address</name>
    <value>spark01:8031</value>
  </property>
  <property>
    <name>yarn.resourcemanager.admin.address</name>
    <value>spark01:8033</value>
  </property>
  <property>
    <name>yarn.resourcemanager.webapp.address</name>
    <value>spark01:8088</value>
  </property>
</configuration>
Edit the slaves file:
Clear the slaves file, then add the host names of the worker nodes:
spark02
spark03
3. Distribute Hadoop to every node. (Because of permission issues you may have to copy the folder into a home directory such as /home/master on each node first, then log in to each node and move it to /opt; a sketch follows this step.)
scp -r /home/spark/apps/hadoop-2.6.4 spark02:/home/spark/apps/
scp -r /home/spark/apps/hadoop-2.6.4 spark03:/home/spark/apps/
(Since all three of my machines use root, there was no permission problem. If you do hit one, copy the directory to /home/<username>/ first and then move it with sudo on the target machine.)
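A minimal sketch of the sudo workaround mentioned above, assuming the directory was first copied into the target user's home directory (the user name and the final destination are placeholders; use wherever your configuration actually points):
# Run on spark02 and spark03 after the scp:
sudo mv /home/<username>/hadoop-2.6.4 /opt/
sudo chown -R <username>:<username> /opt/hadoop-2.6.4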
4. Start Hadoop on the spark01 server; the worker nodes are started automatically. Go to the /home/spark/apps/hadoop-2.6.4 directory.
(1) Initialize (format the NameNode, first start only): bin/hdfs namenode -format
(2) Start everything with sbin/start-all.sh, or start HDFS and YARN separately with sbin/start-dfs.sh and sbin/start-yarn.sh
(3) Stop the cluster: sbin/stop-all.sh
(4) Run the jps command to check the running Java processes (roughly what to expect is sketched below)
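Roughly what jps should show once the cluster is up (process IDs omitted; the exact set can vary, e.g. a JobHistoryServer only appears if it was started separately):
# On spark01 (master):
NameNode
SecondaryNameNode
ResourceManager
# On spark02 / spark03 (workers):
DataNode
NodeManager
Two further checks that are sometimes useful: bin/hdfs dfsadmin -report lists the registered DataNodes, and bin/yarn node -list lists the registered NodeManagers.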
5. Access the web UIs from a browser on your local machine
(1) Stop the firewall: systemctl stop firewalld.service (or open only the required ports instead; see the sketch after this list)
(2) Open http://spark01:8088/ in a browser (or use the IP address; entering spark01 in a local browser only works if the address mapping is also configured in the Windows hosts file)
(3) Open http://spark01:50070/ in a browser (the HDFS NameNode web UI)
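As an alternative to stopping firewalld entirely in (1), you can open only the ports this guide actually uses. A hedged sketch with firewall-cmd, run on the relevant nodes (the port list mirrors the values configured above):
firewall-cmd --permanent --add-port=9000/tcp    # HDFS NameNode RPC (fs.default.name)
firewall-cmd --permanent --add-port=50070/tcp   # NameNode web UI
firewall-cmd --permanent --add-port=8088/tcp    # ResourceManager web UI
firewall-cmd --reload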