Running wordcount on Hadoop 2.8.2
1 Location of the examples jar
[hadoop@hadoop02 mapreduce]$ pwd
/hadoop/hadoop-2.8.2/share/hadoop/mapreduce
[hadoop@hadoop02 mapreduce]$ ls -lrt
total 5084
drwxr-xr-x 2 hadoop hadoop    4096 Oct 20 05:11 lib
drwxr-xr-x 2 hadoop hadoop    4096 Oct 20 05:11 jdiff
-rw-r--r-- 1 hadoop hadoop  301936 Oct 20 05:11 hadoop-mapreduce-examples-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop   77142 Oct 20 05:11 hadoop-mapreduce-client-shuffle-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop 1588114 Oct 20 05:11 hadoop-mapreduce-client-jobclient-2.8.2-tests.jar
-rw-r--r-- 1 hadoop hadoop   67003 Oct 20 05:11 hadoop-mapreduce-client-jobclient-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop   31535 Oct 20 05:11 hadoop-mapreduce-client-hs-plugins-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop  195052 Oct 20 05:11 hadoop-mapreduce-client-hs-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop 1571759 Oct 20 05:11 hadoop-mapreduce-client-core-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop  782757 Oct 20 05:11 hadoop-mapreduce-client-common-2.8.2.jar
-rw-r--r-- 1 hadoop hadoop  563771 Oct 20 05:11 hadoop-mapreduce-client-app-2.8.2.jar
drwxr-xr-x 2 hadoop hadoop    4096 Oct 20 05:11 sources
drwxr-xr-x 2 hadoop hadoop      29 Oct 20 05:11 lib-examples
2 Create the data file
[hadoop@hadoop01 ~]$ echo "Hello World">>word.txt
[hadoop@hadoop01 ~]$ echo "Hello Hadoop">>word.txt
[hadoop@hadoop01 ~]$ echo "Hello Hive">>word.txt
3 Create the HDFS directory
[hadoop@hadoop01 ~]$ hadoop dfs -mkdir /work/data/input
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
[hadoop@hadoop01 ~]$ hadoop dfs -lsr /work/data
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
lsr: DEPRECATED: Please use 'ls -R' instead.
drwxr-xr-x - hadoop supergroup 0 2017-11-12 09:00 /work/data/input
[hadoop@hadoop01 ~]$
4 Upload the data file word.txt to the HDFS directory /work/data/input
[hadoop@hadoop01 ~]$ hadoop dfs -copyFromLocal word.txt /work/data/input
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
[hadoop@hadoop01 ~]$ hadoop dfs -text /work/data/input/word.txt
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Hello World
Hello Hadoop
Hello Hive
[hadoop@hadoop01 ~]$
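The DEPRECATED warnings in the transcripts above come from invoking file system commands through `hadoop dfs`. The warning itself names `hdfs` as the replacement, and `-lsr` is replaced by `-ls -R`. Below is a minimal sketch of steps 3 and 4 using `hdfs dfs`. The `run` helper, the `hdfs_steps` function, and the `DRY_RUN` variable are hypothetical conveniences introduced here, not part of Hadoop: by default the script only prints the commands, and it executes them only when `DRY_RUN=0` is set on a node with a configured Hadoop client. The `-p` flag on `-mkdir` is an addition that creates missing parent directories.

```shell
#!/bin/sh
# Sketch: steps 3 and 4 with `hdfs dfs` instead of the deprecated `hadoop dfs`.
# DRY_RUN, run() and hdfs_steps() are illustrative helpers, not Hadoop features.
DRY_RUN="${DRY_RUN:-1}"
INPUT_DIR="/work/data/input"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

hdfs_steps() {
    run hdfs dfs -mkdir -p "$INPUT_DIR"               # create directory and parents
    run hdfs dfs -ls -R /work/data                    # replaces `-lsr`
    run hdfs dfs -copyFromLocal word.txt "$INPUT_DIR" # upload the data file
    run hdfs dfs -text "$INPUT_DIR/word.txt"          # print the uploaded file
}

hdfs_steps
```

Running the script as-is prints the four commands without contacting the cluster, which makes it easy to review them before setting `DRY_RUN=0`.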
5 Run the wordcount example
[hadoop@hadoop01 hadoop-2.8.2]$ pwd
/hadoop/hadoop-2.8.2
[hadoop@hadoop01 hadoop-2.8.2]$ hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-2.8.2.jar wordcount /work/data/input /work/data/output
17/11/12 09:05:14 INFO client.RMProxy: Connecting to ResourceManager at hadoop02/192.168.169.102:8032
17/11/12 09:05:15 INFO input.FileInputFormat: Total input files to process : 1
17/11/12 09:05:15 INFO mapreduce.JobSubmitter: number of splits:1
17/11/12 09:05:15 INFO mapreduce.JobSubmitter: Submitting tokens for job: job_1510447239720_0001
17/11/12 09:05:16 INFO impl.YarnClientImpl: Submitted application application_1510447239720_0001
17/11/12 09:05:16 INFO mapreduce.Job: The url to track the job: http://hadoop02:8088/proxy/application_1510447239720_0001/
17/11/12 09:05:16 INFO mapreduce.Job: Running job: job_1510447239720_0001
17/11/12 09:05:25 INFO mapreduce.Job: Job job_1510447239720_0001 running in uber mode : false
17/11/12 09:05:25 INFO mapreduce.Job: map 0% reduce 0%
17/11/12 09:05:35 INFO mapreduce.Job: map 100% reduce 0%
17/11/12 09:05:40 INFO mapreduce.Job: map 100% reduce 100%
17/11/12 09:05:41 INFO mapreduce.Job: Job job_1510447239720_0001 completed successfully
17/11/12 09:05:41 INFO mapreduce.Job: Counters: 49
    File System Counters
        FILE: Number of bytes read=53
        FILE: Number of bytes written=276955
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=152
        HDFS: Number of bytes written=31
        HDFS: Number of read operations=6
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=2
    Job Counters
        Launched map tasks=1
        Launched reduce tasks=1
        Data-local map tasks=1
        Total time spent by all maps in occupied slots (ms)=5860
        Total time spent by all reduces in occupied slots (ms)=3296
        Total time spent by all map tasks (ms)=5860
        Total time spent by all reduce tasks (ms)=3296
        Total vcore-milliseconds taken by all map tasks=5860
        Total vcore-milliseconds taken by all reduce tasks=3296
        Total megabyte-milliseconds taken by all map tasks=6000640
        Total megabyte-milliseconds taken by all reduce tasks=3375104
    Map-Reduce Framework
        Map input records=3
        Map output records=6
        Map output bytes=59
        Map output materialized bytes=53
        Input split bytes=117
        Combine input records=6
        Combine output records=4
        Reduce input groups=4
        Reduce shuffle bytes=53
        Reduce input records=4
        Reduce output records=4
        Spilled Records=8
        Shuffled Maps =1
        Failed Shuffles=0
        Merged Map outputs=1
        GC time elapsed (ms)=224
        CPU time spent (ms)=2190
        Physical memory (bytes) snapshot=443719680
        Virtual memory (bytes) snapshot=4207517696
        Total committed heap usage (bytes)=293076992
    Shuffle Errors
        BAD_ID=0
        CONNECTION=0
        IO_ERROR=0
        WRONG_LENGTH=0
        WRONG_MAP=0
        WRONG_REDUCE=0
    File Input Format Counters
        Bytes Read=35
    File Output Format Counters
        Bytes Written=31
[hadoop@hadoop01 hadoop-2.8.2]$
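The record counters in the job output can be checked against the input by hand. The three input lines account for Map input records=3, the six whitespace-separated words for Map output records=6, and the four distinct words for Combine output records=4 and Reduce output records=4 (with a single map task, the combiner sees every occurrence of each word). The sketch below recomputes these numbers locally. It assumes word.txt holds exactly the three lines written in step 2 and recreates the file in a temporary directory; the variable names are illustrative.

```shell
#!/bin/sh
# Sketch: recompute the record counters from a local copy of the input.
# Assumes the input is exactly the three lines created in step 2.
tmpdir="$(mktemp -d)"
input="$tmpdir/word.txt"
printf '%s\n' "Hello World" "Hello Hadoop" "Hello Hive" > "$input"

# One map input record per line.
map_input_records="$(wc -l < "$input" | tr -d ' ')"
# One map output record per whitespace-separated word.
map_output_records="$(wc -w < "$input" | tr -d ' ')"
# One combine/reduce output record per distinct word.
distinct_words="$(tr -s ' ' '\n' < "$input" | LC_ALL=C sort -u | wc -l | tr -d ' ')"

echo "Map input records=$map_input_records"
echo "Map output records=$map_output_records"
echo "Distinct words=$distinct_words"

rm -rf "$tmpdir"
```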
6 View the results
[hadoop@hadoop01 hadoop-2.8.2]$ hadoop dfs -lsr /work/data/output
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
lsr: DEPRECATED: Please use 'ls -R' instead.
-rw-r--r-- 2 hadoop supergroup 0 2017-11-12 09:05 /work/data/output/_SUCCESS
-rw-r--r-- 2 hadoop supergroup 31 2017-11-12 09:05 /work/data/output/part-r-00000
[hadoop@hadoop01 hadoop-2.8.2]$ hadoop dfs -text /work/data/output/part-r-00000
DEPRECATED: Use of this script to execute hdfs command is deprecated.
Instead use the hdfs command for it.
Hadoop 1
Hello 3
Hive 1
World 1
[hadoop@hadoop01 hadoop-2.8.2]$
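As a sanity check, the same counts can be reproduced without the cluster using standard Unix tools. This is only a local approximation of what the job computes, not the wordcount implementation itself. It assumes the input is the three-line word.txt from step 2, splits on whitespace, and sorts in byte order (`LC_ALL=C`) to mimic the key order seen in part-r-00000. The function name `wordcount_local` is hypothetical.

```shell
#!/bin/sh
# Sketch: a local stand-in for the wordcount job, for cross-checking results.
tmpdir="$(mktemp -d)"
printf '%s\n' "Hello World" "Hello Hadoop" "Hello Hive" > "$tmpdir/word.txt"

# wordcount_local FILE: print "word<TAB>count" lines sorted by word.
wordcount_local() {
    tr -s '[:space:]' '\n' < "$1" |   # one word per line
        sed '/^$/d' |                 # drop empty lines, if any
        LC_ALL=C sort |               # group identical words together
        uniq -c |                     # count each group
        awk '{ printf "%s\t%s\n", $2, $1 }'
}

result="$(wordcount_local "$tmpdir/word.txt")"
printf '%s\n' "$result"

rm -rf "$tmpdir"
```

For the three-line input this prints the same four word and count pairs as the job output above, with a tab between word and count.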