Hadoop run test

Running the Hadoop cluster

The configuration files must be prepared in advance.

The user on the slave nodes must be hadoop, and everything under /usr/local/src must be owned by hadoop.

On all three virtual machines, disable SELinux (setenforce 0) and the firewall, and configure hostname resolution.
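For reference, those preparation steps boil down to commands like the following on each node (a minimal sketch, assuming CentOS 7 with firewalld; the addresses are the ones used in the Windows hosts section below, substitute your own):

[root@master ~]# chown -R hadoop:hadoop /usr/local/src     # everything under /usr/local/src owned by hadoop
[root@master ~]# setenforce 0                              # disable SELinux for the current boot
[root@master ~]# systemctl stop firewalld
[root@master ~]# systemctl disable firewalld
[root@master ~]# cat >> /etc/hosts << EOF
192.168.3.138  master
192.168.3.139  slave1
192.168.3.140  slave2
EOF

Run the same commands on slave1 and slave2 as well.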

Format the NameNode

[hadoop@master src]$ cd /usr/local/src/hadoop/
[hadoop@master hadoop]$ ./bin/hdfs namenode -format
22/04/02 20:49:35 INFO namenode.NameNode: STARTUP_MSG: 
/************************************************************
STARTUP_MSG: Starting NameNode
STARTUP_MSG:   host = master/192.168.3.23
STARTUP_MSG:   args = [-format]
STARTUP_MSG:   version = 2.7.1
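The banner above is only the start of the output. One way to confirm the format worked (a sketch, assuming the name directory sits next to the dfs/data directory mentioned at the end of this post) is to check that a fresh fsimage was written:

[hadoop@master hadoop]$ ls /usr/local/src/hadoop/dfs/name/current/
VERSION  fsimage_0000000000000000000  fsimage_0000000000000000000.md5  seen_txid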

Start the NameNode

[hadoop@master hadoop]$ hadoop-daemon.sh start namenode
starting namenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-namenode-master.out

Check the Java processes with jps

[hadoop@master hadoop]$ jps
10356 NameNode
10427 Jps
5518 SecondaryNameNode
[hadoop@master hadoop]$ 

Start the DataNode on slave1

[hadoop@slave1 src]$ hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
[hadoop@slave1 src]$ jps
10908 Jps
10511 DataNode

Start the DataNode on slave2

[hadoop@slave2 hadoop]$ hadoop-daemon.sh start datanode
starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
[hadoop@slave2 hadoop]$ jps
9763 DataNode
9828 Jps
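With both DataNodes started, they should register with the NameNode. A quick check from master (a sketch; the addresses shown are illustrative):

[hadoop@master hadoop]$ hdfs dfsadmin -report | grep -E 'Live datanodes|Name:'
Live datanodes (2):
Name: 192.168.3.139:50010 (slave1)
Name: 192.168.3.140:50010 (slave2)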

Configure hostname resolution on Windows

Copy C:\Windows\System32\drivers\etc\hosts out to the desktop, edit it, then put it back. Be sure to include the slave entries:

# localhost name resolution is handled within DNS itself.
#	127.0.0.1       localhost
#	::1             localhost

192.168.3.138  master
192.168.3.139  slave1
192.168.3.140  slave2
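A quick way to verify the entries from a Windows command prompt (assuming the Windows host can reach the 192.168.3.0/24 network):

C:\> ping master
C:\> ping slave1
C:\> ping slave2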

Restart the SecondaryNameNode (stop the leftover instance first, then start it again)

[hadoop@master hadoop]$ hadoop-daemon.sh stop secondarynamenode
stopping secondarynamenode
[hadoop@master hadoop]$ hadoop-daemon.sh start secondarynamenode
starting secondarynamenode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-secondarynamenode-master.out
[hadoop@master hadoop]$ jps
13938 NameNode
12980 DataNode
14233 Jps
14190 SecondaryNameNode

Distribute the master's public key (as the hadoop user)

ssh-keygen
ssh-copy-id slave1
ssh-copy-id slave2
ssh-copy-id master
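To confirm passwordless login works before running the start scripts:

[hadoop@master ~]$ ssh slave1 hostname
slave1
[hadoop@master ~]$ ssh slave2 hostname
slave2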

Start the distributed file system (HDFS) and the resource manager (YARN)

[hadoop@master .ssh]$ start-dfs.sh 
Starting namenodes on [master]
master: namenode running as process 4729. Stop it first.
192.168.3.128: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave1.out
192.168.3.129: starting datanode, logging to /usr/local/src/hadoop/logs/hadoop-hadoop-datanode-slave2.out
Starting secondary namenodes [0.0.0.0]
0.0.0.0: secondarynamenode running as process 4843. Stop it first.
[hadoop@master .ssh]$ start-yarn.sh 
starting yarn daemons
starting resourcemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-resourcemanager-master.out
192.168.3.129: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave2.out
192.168.3.128: starting nodemanager, logging to /usr/local/src/hadoop/logs/yarn-hadoop-nodemanager-slave1.out
[hadoop@master .ssh]$ jps
5799 Jps
4729 NameNode
5530 ResourceManager
4843 SecondaryNameNode
[hadoop@master .ssh]$ 

[hadoop@slave1 network-scripts]$ jps
4049 NodeManager
4217 Jps

If NodeManager shows up on the slave nodes and ResourceManager shows up on the master node, the startup succeeded.
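Beyond jps, the ResourceManager can list the registered NodeManagers directly (a quick sanity check):

[hadoop@master ~]$ yarn node -list

Both slave1 and slave2 should appear in the output with Node-State RUNNING.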

Create a directory in HDFS

[hadoop@slave1 network-scripts]$ hdfs dfs -mkdir /input
[hadoop@slave1 network-scripts]$ hdfs  dfs -ls /
Found 1 items
drwxr-xr-x   - hadoop supergroup          0 2022-04-03 11:53 /input
[hadoop@slave1 network-scripts]$ 

Create a file and upload it

[hadoop@master network-scripts]$ mkdir ~/input2
[hadoop@master network-scripts]$ vi ~/input2/data.txt
[hadoop@master network-scripts]$ cat ~/input2/data.txt 
Hello   World
Hello   Supermao
Hello   huawei
[hadoop@master ~]$ hdfs dfs -put ~/input/data.txt  /input
[hadoop@master ~]$ hdfs dfs -mkdir /mqy
[hadoop@master ~]$ hdfs dfs -put input/data.txt  /mqy
[hadoop@master ~]$ hdfs dfs -ls /mqy
Found 1 items
-rw-r--r--   2 hadoop supergroup         40 2022-04-03 14:12 /mqy/data.txt
[hadoop@master ~]$ hdfs dfs -cat /mqy/data.txt
Hello World
Hello redhat
Hello supermao

From the local host, browse to http://master:50070/explorer.html#/ to see the uploaded files.
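The file can also be pulled back out of HDFS from the command line (a sketch):

[hadoop@master ~]$ hdfs dfs -get /mqy/data.txt /tmp/data_copy.txt
[hadoop@master ~]$ cat /tmp/data_copy.txt
Hello World
Hello redhat
Hello supermao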

Test MapReduce

[hadoop@master hadoop]$ hdfs dfs -mkdir /input        # create the input directory
[hadoop@master hadoop]$ hdfs dfs -ls /                # check the HDFS root
[hadoop@master hadoop]$ cat ~/input/data.txt          # write a data file
Hello World
Hello Hadoop
Hello Huasan
[hadoop@master hadoop]$ hdfs dfs -put ~/input/data.txt /input      # upload data.txt into /input
[hadoop@master hadoop]$ hdfs dfs -ls /input
[hadoop@master hadoop]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.7.1.jar wordcount /input/data.txt /output    # run the wordcount example; /output must not exist yet, the job creates it
[hadoop@master hadoop]$ hdfs dfs -cat /output/part-r-00000    # view the counting result
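With the data.txt above, the counting result should look roughly like this (an expected-output sketch, not copied from the cluster):

Hadoop	1
Hello	3
Huasan	1
World	1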

Stop Hadoop

[hadoop@master hadoop]$ stop-yarn.sh
[hadoop@slave1 hadoop]$ hadoop-daemon.sh stop datanode
[hadoop@slave2 hadoop]$ hadoop-daemon.sh stop datanode
[hadoop@master hadoop]$ hadoop-daemon.sh stop namenode
[hadoop@master hadoop]$ hadoop-daemon.sh stop secondarynamenode
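After these stop commands, jps on each node should show only the Jps process itself (the PID below is illustrative):

[hadoop@master hadoop]$ jps
15012 Jps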

Troubleshooting

[hadoop@master hadoop]$ hdfs dfsadmin -report
Configured Capacity: 0 (0 B)
Present Capacity: 0 (0 B)
DFS Remaining: 0 (0 B)
DFS Used: 0 (0 B)
DFS Used%: NaN%
Under replicated blocks: 0
Blocks with corrupt replicas: 0
Missing blocks: 0
Missing blocks (with replication factor 1): 0

The DataNodes could not connect to the NameNode.
Delete the data directory left over from the previous run, /usr/local/src/hadoop/dfs/data,
then run hadoop-daemon.sh start datanode again; the DataNode now starts successfully.
The cause: if the NameNode is reformatted after the DataNodes have already been started,
the clusterIDs no longer match, so the master and the slave nodes cannot connect.
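The mismatch can be confirmed by comparing the clusterID recorded on the NameNode with the one on a DataNode (a sketch, assuming the name directory sits next to the dfs/data directory above):

[hadoop@master hadoop]$ grep clusterID /usr/local/src/hadoop/dfs/name/current/VERSION
clusterID=CID-xxxxxxxx
[hadoop@slave1 hadoop]$ grep clusterID /usr/local/src/hadoop/dfs/data/current/VERSION
clusterID=CID-yyyyyyyy

If the two values differ, the DataNode refuses to join the cluster.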

Fix: stop all services:
stop-all.sh
Delete the data from the previous DataNode run, then restart the DataNode.
The file below records where the DataNode keeps its data:
hdfs-site.xml
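The relevant property is dfs.datanode.data.dir; a sketch of what that part of hdfs-site.xml typically looks like for this layout (the value is an assumption matching the dfs/data path above):

<property>
    <name>dfs.datanode.data.dir</name>
    <value>file:/usr/local/src/hadoop/dfs/data</value>
</property>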
