hadoop-mapreduce-examples: Hadoop examples
[root@master hadoop-3.1.1]# bin/hadoop jar share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar
An example program must be given as the first argument.
Valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the input files.
wordmedian: A map/reduce program that counts the median length of the words in the input files.
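As the message says, the program name goes first after the jar path, followed by that program's own arguments. Below is a minimal sketch for the pi example, which takes the number of map tasks and the number of samples per map. The helper name run_example and the values 10 and 100 are illustrative assumptions, not part of Hadoop. The helper only prints the command when bin/hadoop is not present in the current directory, so the snippet can be tried outside the hadoop-3.1.1 directory.

```shell
# Jar path copied from the command shown above.
EXAMPLES_JAR=share/hadoop/mapreduce/hadoop-mapreduce-examples-3.1.1.jar

# Hypothetical helper: run an example program by name, or print the
# command that would be run when bin/hadoop is not available here.
run_example() {
  if [ -x bin/hadoop ]; then
    bin/hadoop jar "$EXAMPLES_JAR" "$@"
  else
    echo "bin/hadoop jar $EXAMPLES_JAR $*"
  fi
}

# pi <number of maps> <samples per map>; 10 and 100 are arbitrary small values.
run_example pi 10 100
```

Other programs in the list are started the same way, each with its own arguments. Running most of them with no arguments prints a usage message.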
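In other words, the valid program names are:
aggregatewordcount: an Aggregate-based map/reduce program that counts the words in the input files.
aggregatewordhist: an Aggregate-based map/reduce program that computes a histogram of the words in the input files.
bbp: a map/reduce program that uses the Bailey-Borwein-Plouffe formula to compute exact digits of Pi.
dbcount: an example job that counts pageviews stored in a database.
distbbp: a map/reduce program that uses a BBP-type formula to compute exact bits of Pi.
grep: a map/reduce program that counts the matches of a regular expression in the input.
join: a job that performs a join over sorted, equally partitioned datasets.
multifilewc: a job that counts words from several files.
pentomino: a map/reduce tile-laying program that finds solutions to pentomino problems.
pi: a map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: a map/reduce program that writes 10GB of random text data per node.
randomwriter: a map/reduce program that writes 10GB of random data per node.
secondarysort: an example that defines a secondary sort for the reduce phase.
sort: a map/reduce program that sorts the data written by the random writer.
sudoku: a sudoku solver.
teragen: generates data for terasort.
terasort: runs the terasort.
teravalidate: checks the results of terasort.
wordcount: a map/reduce program that counts the words in the input files.
wordmean: a map/reduce program that computes the average length of the words in the input files.
wordmedian: a map/reduce program that computes the median length of the words in the input files.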
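To get a feel for what wordcount and wordmean compute before running them on a cluster, the same results can be approximated locally on a small sample. This is only a sketch using standard shell tools, not the Hadoop implementation: the function names local_wordcount and local_wordmean and the sample text are made up for illustration, and words are assumed to be split on whitespace.

```shell
# Hypothetical local stand-in for wordcount: read text on stdin,
# print one "word<TAB>count" line per distinct word, sorted by word.
local_wordcount() {
  awk '{ for (i = 1; i <= NF; i++) print $i }' |
    LC_ALL=C sort |
    uniq -c |
    awk '{ print $2 "\t" $1 }'
}

# Hypothetical local stand-in for wordmean: read text on stdin,
# print the average word length (total characters / number of words).
local_wordmean() {
  awk '{ for (i = 1; i <= NF; i++) { total += length($i); n++ } }
       END { if (n > 0) print total / n }'
}

# Made-up two-line sample input.
printf 'hello hadoop\nhello mapreduce\n' | local_wordcount
printf 'hello hadoop\nhello mapreduce\n' | local_wordmean
```

For this sample the word counts are hadoop 1, hello 2, mapreduce 1, and the mean word length is 25 / 4 = 6.25.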