Basic Hadoop Usage
(1) Change the hostname and refresh it (reconnecting in a new terminal window is recommended)
[root@bdvm2022 ~]# hostnamectl set-hostname lixianhui38
[root@bdvm2022 ~]#
[root@bdvm2022 ~]#
[root@bdvm2022 ~]# bash
(2) Run the installation script prepared in advance
[root@lixianhui38 ~]# sh install.sh
Confirm installation of Hadoop HBase Hive Sqoop Spark; press any key to continue... press Ctrl+c to cancel
Initializing the operating system
Failed to execute operation: No such file or directory
Failed to restart iptables.service: Unit not found.
The service command supports only basic LSB actions (start, stop, restart, try-restart, reload, force-reload, status). For other actions, please try to use systemctl.
Loaded plugins: fastestmirror
Determining fastest mirrors
(3) Change to the hadoop directory and start Hadoop
[root@lixianhui38 ~]# cd /opt/apps/hadoop
[root@lixianhui38 hadoop]# sbin/start-dfs.sh
WARNING: HADOOP_SECURE_DN_USER has been replaced by HDFS_DATANODE_SECURE_USER. Using value of HADOOP_SECURE_DN_USER.
Starting namenodes on [lixianhui38]
Last login: Wed May 18 19:59:14 CST 2022 from 192.168.233.1 on pts/0
lixianhui38: namenode is running as process 2411. Stop it first.
Starting datanodes
Last login: Wed May 18 20:01:32 CST 2022 on pts/0
localhost: datanode is running as process 2555. Stop it first.
Starting secondary namenodes [lixianhui38]
Last login: Wed May 18 20:01:33 CST 2022 on pts/0
lixianhui38: secondarynamenode is running as process 2759. Stop it first.
[root@lixianhui38 hadoop]# start-yarn.sh
Starting resourcemanager
Last login: Wed May 18 20:01:34 CST 2022 on pts/0
Starting nodemanagers
Last login: Wed May 18 20:09:32 CST 2022 on pts/0
[root@lixianhui38 hadoop]# sbin/start-yarn.sh
Starting resourcemanager
Last login: Wed May 18 20:09:34 CST 2022 on pts/0
resourcemanager is running as process 4335. Stop it first.
Starting nodemanagers
Last login: Wed May 18 20:09:48 CST 2022 on pts/0
localhost: nodemanager is running as process 4489. Stop it first.
[root@lixianhui38 hadoop]# jps
2759 SecondaryNameNode
4489 NodeManager
5114 Jps
2411 NameNode
2555 DataNode
4335 ResourceManager
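In the output above, "is running as process N. Stop it first." means the daemon was already up, so the start script left it alone. The sketch below reads the step (3) output in two ways: it lists the daemons a start script reported as already running, and it checks a jps listing for the five daemons shown in this transcript. The function names already_running and missing_daemons are hypothetical helpers, not part of Hadoop.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for reading the output of step (3); not part of Hadoop.

# Print "<daemon> <pid>" for every "... is running as process N." line on stdin.
already_running() {
  awk '/ is running as process [0-9]+/ {
         for (i = 1; i <= NF; i++)
           if ($i == "is" && $(i + 1) == "running") {
             pid = $(i + 4)
             sub(/\.$/, "", pid)      # drop the trailing period after the PID
             print $(i - 1), pid
           }
       }'
}

# Read jps-style lines ("<pid> <name>") on stdin and print the expected
# daemons that are absent. The list matches the single-node transcript above.
missing_daemons() {
  local seen missing="" d
  seen=$(awk '{ print $2 }')
  for d in NameNode DataNode SecondaryNameNode ResourceManager NodeManager; do
    # -x matches whole lines, so NameNode does not match SecondaryNameNode.
    printf '%s\n' "$seen" | grep -qx "$d" || missing="$missing $d"
  done
  echo $missing
}

# Usage on a live node:
#   sbin/start-dfs.sh 2>&1 | already_running
#   jps | missing_daemons
```

If a daemon really needs a restart, the matching stop script (sbin/stop-dfs.sh or sbin/stop-yarn.sh) would be run before starting it again.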
(4) Under the HDFS root directory, recursively create a two-level directory named after your class and student number
[root@lixianhui38 hadoop]# hdfs dfs -mkdir -p /rjgj2031/38
(5) List the files and directories that were created in the file system
[root@lixianhui38 hadoop]# hdfs dfs -ls -R /
drwxr-xr-x - root supergroup 0 2022-05-18 20:23 /rjgj2031
drwxr-xr-x - root supergroup 0 2022-05-18 20:23 /rjgj2031/38
[root@lixianhui38 hadoop]#
(6) Create a text file named after yourself in a local directory, then upload it with hdfs to the /rjgj2031/38 directory created above
[root@lixianhui38 hadoop]# vi lixianhui.txt
[root@lixianhui38 hadoop]# hdfs dfs -put lixianhui.txt /rjgj2031/38
2022-05-18 20:49:00,085 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[root@lixianhui38 hadoop]#
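(7) Create a local folder named 38 (the task says drive D; in this transcript it is created in the home directory of the Linux machine), then use the hdfs command to download the uploaded file /rjgj2031/38/lixianhui.txt into that folder, renaming it to rj8.txt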
[root@lixianhui38 ~]# mkdir 38
[root@lixianhui38 ~]# hdfs dfs -get /rjgj2031/38/lixianhui.txt 38/rj8.txt
2022-05-18 20:53:12,164 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
[root@lixianhui38 ~]#
(8) Use hdfs to view the contents of the uploaded text file /rjgj2031/38/lixianhui.txt
[root@lixianhui38 ~]# hdfs dfs -cat /rjgj2031/38/lixianhui.txt
2022-05-18 21:00:31,427 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
aaaaaaaaaaaaaaaa
sdddd
ddddd
ddddd
ddddd
sssssssss
[root@lixianhui38 ~]#
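Steps (4) through (8) can be collected into one script. The sketch below is a minimal version under stated assumptions: DRY_RUN, run and hdfs_round_trip are illustrative names that do not come from Hadoop, the text file is assumed to already exist in the current directory, and by default the commands are only printed, not executed.

```shell
#!/usr/bin/env bash
# Sketch of the HDFS round trip from steps (4)-(8).
# DRY_RUN, run() and hdfs_round_trip() are illustrative helpers, not Hadoop tools.
DRY_RUN="${DRY_RUN:-1}"

run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"     # dry run: only show the command
  else
    "$@"            # real run: execute it
  fi
}

hdfs_round_trip() {
  local class="$1" id="$2" name="$3"
  local dir="/$class/$id"
  run hdfs dfs -mkdir -p "$dir"                      # step (4): two-level directory
  run hdfs dfs -ls -R /                              # step (5): recursive listing
  run hdfs dfs -put "$name.txt" "$dir"               # step (6): upload
  run mkdir -p "$id"                                 # step (7): local folder
  run hdfs dfs -get "$dir/$name.txt" "$id/rj8.txt"   # step (7): download and rename
  run hdfs dfs -cat "$dir/$name.txt"                 # step (8): print contents
}

# Values taken from the transcript above.
hdfs_round_trip rjgj2031 38 lixianhui
```

Running it with DRY_RUN=0 on the cluster node would execute the same commands as the transcript, apart from the working directory, which changes between steps (6) and (7) in the original session.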
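(9) Run the wordcount example that ships with the Hadoop distribution. The jar is at /opt/apps/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar, the input is the uploaded file /rjgj2031/38/lixianhui.txt, and the task names the output directory rjgjout38 (the command below passes the directory /rjgj2031/38/ as input and writes to /rjgj2031/out38/)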
[root@lixianhui38 hadoop]# hadoop jar /opt/apps/hadoop/share/hadoop/mapreduce/hadoop-mapreduce-examples-3.2.1.jar wordcount /rjgj2031/38/ /rjgj2031/out38/
2022-05-18 21:34:44,894 INFO client.RMProxy: Connecting to ResourceManager at lixianhui38/192.168.233.10:8032
2022-05-18 21:34:45,546 INFO mapreduce.JobResourceUploader: Disabling Erasure Coding for path: /tmp/hadoop-yarn/staging/root/.staging/job_1652875775793_0001
2022-05-18 21:34:45,654 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2022-05-18 21:34:45,774 INFO input.FileInputFormat: Total input files to process : 1
2022-05-18 21:34:45,802 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2022-05-18 21:34:45,821 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2022-05-18 21:34:45,832 INFO mapreduce.JobSubmitter: number of splits:1
2022-05-18 21:34:45,969 INFO sasl.SaslDataTransferClient: SASL encryption trust check: localHostTrusted = false, remoteHostTrusted = false
2022-05-18 21:34:46,384 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1652875775793_0001
2022-05-18 21:34:46,384 INFO mapreduce.JobSubmitter: Executing with tokens: []
2022-05-18 21:34:46,585 INFO conf.Configuration: resource-types.xml not found
2022-05-18 21:34:46,586 INFO resource.ResourceUtils: Unable to find 'resource-types.xml
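The log above is cut off before the job result is shown, so a local cross-check can help when reading the output later. The sketch below approximates what the wordcount example computes, assuming it splits the input on whitespace and writes one tab-separated word and count per line, sorted by key. local_wordcount is a hypothetical helper, and the sample input is the file content printed in step (8).

```shell
#!/usr/bin/env bash
# Local approximation of the wordcount example (assumption: whitespace tokens,
# "word<TAB>count" lines, keys in byte order). Not the MapReduce job itself.
local_wordcount() {
  tr -s '[:space:]' '\n' |        # one token per line
    grep -v '^$' |                # drop empty lines
    LC_ALL=C sort |               # byte-order sort
    uniq -c |                     # count adjacent duplicates
    awk '{ printf "%s\t%s\n", $2, $1 }'
}

# Sample input: the contents of lixianhui.txt shown in step (8).
printf 'aaaaaaaaaaaaaaaa\nsdddd\nddddd\nddddd\nddddd\nsssssssss\n' | local_wordcount
```

On a live cluster, the result could be compared with hdfs dfs -cat /rjgj2031/out38/part-r-00000, assuming the default single reducer. A MapReduce job also refuses to start when its output directory already exists, so rerunning step (9) requires removing /rjgj2031/out38 first.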
(10) Use hdfs to delete the /rjgj2031 directory
[root@lixianhui38 ~]# hdfs dfs -rm -r /rjgj2031
Deleted /rjgj2031
[root@lixianhui38 ~]# hdfs dfs -ls /
Found 1 items
drwx------ - root supergroup 0 2022-05-18 21:34 /tmp
[root@lixianhui38 ~]#
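The results can also be checked in the web management interfaces. Open them with the virtual machine's address plus the port number, or with localhost plus the port number when browsing on the machine itself. The YARN port is 8088 and the Hadoop (HDFS NameNode) port is 9870.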
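As a small illustration of the address rule, the sketch below builds both web interface URLs from a host name or IP address. web_ui_urls is a hypothetical helper, the ports are the ones stated in these notes, and the sample address is the one the ResourceManager reported in the step (9) log.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the two web UI addresses for a given host.
# Ports 9870 (HDFS NameNode) and 8088 (YARN) are the ones stated in these notes.
web_ui_urls() {
  local host="${1:-localhost}"    # default to localhost when no host is given
  echo "HDFS NameNode UI: http://$host:9870"
  echo "YARN ResourceManager UI: http://$host:8088"
}

# Address taken from the step (9) log line "Connecting to ResourceManager at ...".
web_ui_urls 192.168.233.10
```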