CDH: Testing MapReduce (MR)
The jar containing CDH's sample MapReduce programs is located in the following directory:
[zc.lee@ip-172-32-1-221 hadoop-0.20-mapreduce]$ pwd
/opt/cloudera/parcels/CDH-5.10.0-1.cdh5.10.0.p0.41/lib/hadoop-0.20-mapreduce
List the files in that directory:
[zc.lee@ip-172-32-1-221 hadoop-0.20-mapreduce]$ ll
total 400
drwxr-xr-x 2 root root 4096 Jan 20 2017 bin
-rw-r--r-- 1 root root 348776 Jan 20 2017 CHANGES.txt
drwxr-xr-x 2 root root 4096 Jan 20 2017 cloudera
lrwxrwxrwx 1 root root 16 Jul 20 2017 conf -> /etc/hadoop/conf
drwxr-xr-x 6 root root 4096 Jan 20 2017 contrib
drwxr-xr-x 4 root root 4096 Jan 20 2017 example-confs
lrwxrwxrwx 1 root root 45 Jul 20 2017 hadoop-ant-2.6.0-mr1-cdh5.10.0.jar -> ../../jars/hadoop-ant-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 34 Jul 20 2017 hadoop-ant-mr1.jar -> hadoop-ant-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 46 Jul 20 2017 hadoop-core-2.6.0-mr1-cdh5.10.0.jar -> ../../jars/hadoop-core-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 35 Jul 20 2017 hadoop-core-mr1.jar -> hadoop-core-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 50 Jul 20 2017 hadoop-examples-2.6.0-mr1-cdh5.10.0.jar -> ../../jars/hadoop-examples-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 23 Jul 20 2017 hadoop-examples.jar -> hadoop-examples-mr1.jar
lrwxrwxrwx 1 root root 39 Jul 20 2017 hadoop-examples-mr1.jar -> hadoop-examples-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 46 Jul 20 2017 hadoop-test-2.6.0-mr1-cdh5.10.0.jar -> ../../jars/hadoop-test-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 35 Jul 20 2017 hadoop-test-mr1.jar -> hadoop-test-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 47 Jul 20 2017 hadoop-tools-2.6.0-mr1-cdh5.10.0.jar -> ../../jars/hadoop-tools-2.6.0-mr1-cdh5.10.0.jar
lrwxrwxrwx 1 root root 36 Jul 20 2017 hadoop-tools-mr1.jar -> hadoop-tools-2.6.0-mr1-cdh5.10.0.jar
drwxr-xr-x 3 root root 4096 Jan 20 2017 include
drwxr-xr-x 5 root root 4096 Jan 20 2017 lib
-rw-r--r-- 1 root root 13366 Jan 20 2017 LICENSE.txt
-rw-r--r-- 1 root root 101 Jan 20 2017 NOTICE.txt
-rw-r--r-- 1 root root 1366 Jan 20 2017 README.txt
drwxr-xr-x 3 root root 4096 Jan 20 2017 sbin
drwxr-xr-x 5 root root 4096 Jan 20 2017 webapps
The wordcount program in hadoop-examples.jar can be used for the test. Run the jar without arguments first:
#hadoop jar hadoop-examples.jar
The output lists the example programs that are available in the jar:
An example program must be given as the first argument.
Valid program names are:
  aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
  aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
  bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
  dbcount: An example job that count the pageview counts from a database.
  distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
  grep: A map/reduce program that counts the matches of a regex in the input.
  join: A job that effects a join over sorted, equally partitioned datasets
  multifilewc: A job that counts words from several files.
  pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
  pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
  randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
  randomwriter: A map/reduce program that writes 10GB of random data per node.
  secondarysort: An example defining a secondary sort to the reduce.
  sort: A map/reduce program that sorts the data written by the random writer.
  sudoku: A sudoku solver.
  teragen: Generate data for the terasort
  terasort: Run the terasort
  teravalidate: Checking results of terasort
  wordcount: A map/reduce program that counts the words in the input files.
  wordmean: A map/reduce program that counts the average length of the words in the input files.
  wordmedian: A map/reduce program that counts the median length of the words in the input files.
  wordstandarddeviation: A map/reduce program that counts the standard deviation of the length of the words in the input files.
Here I use the wordcount program directly for the test. First, upload a file to HDFS to serve as the job input:
hdfs dfs -mkdir /user/zc.lee/input/
hdfs dfs -put /user/PG/conf/type.txt /user/zc.lee/input/
Run the job:
hadoop jar hadoop-examples.jar wordcount /user/zc.lee/input/type.txt /user/zc.lee/ouputtest
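A MapReduce job refuses to start if its output directory already exists, so rerunning the command above against the same path fails. A minimal sketch of a cleanup step for reruns, assuming you are sure the old output can be deleted; `clean_output_dir` is a hypothetical helper name, not something shipped with CDH:

```shell
# Hypothetical helper: delete a previous job output directory on HDFS, if present.
# WARNING: this removes the directory recursively; only point it at job output.
clean_output_dir() {
  out_dir="$1"
  # `hdfs dfs -test -d` exits 0 when the path exists and is a directory.
  if hdfs dfs -test -d "$out_dir"; then
    hdfs dfs -rm -r "$out_dir"
  fi
}
```

For example, `clean_output_dir /user/zc.lee/ouputtest` could be run before repeating the wordcount command.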
Check the result:
hdfs dfs -text /user/zc.lee/ouputtest/*
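The job output is one line per word, with the word and its count separated by a tab. For a small input, the result can be cross-checked against a local computation on the same file. A rough sketch, assuming the file fits on one machine; `wordcount_local` is a hypothetical helper, and it splits on awk's default whitespace, which only approximates the tokenization used by the example job:

```shell
# Hypothetical helper: count whitespace-separated words in a local file and
# print "word<TAB>count" lines in byte order, similar in shape to the job output.
wordcount_local() {
  awk '{ for (i = 1; i <= NF; i++) count[$i]++ }
       END { for (w in count) print w "\t" count[w] }' "$1" | LC_ALL=C sort
}
```

Running `wordcount_local /user/PG/conf/type.txt` on the local copy and comparing it with the `hdfs dfs -text` output above should show the same words and counts, though line order may differ if the job ran with more than one reducer.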