hadoop--mapreduce--custom key type

Problem:

A sample of input file A is shown below (note that the file is tab-delimited; check the separators after pasting):

20170101     x

20170102     y

20170103     x

20170104     y

20170105     z

20170106     x

A sample of input file B is shown below:

20170101      y

20170102      y

20170103      x

20170104      z

20170105      y

A sample of output file C, obtained by merging input files A and B and removing duplicate lines, is shown below:

20170101      x

20170101      y

20170102      y

20170103      x

20170104      y

20170104      z

20170105      y

20170105      z

20170106      x
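
To make the expected result concrete, here is a minimal plain-Java sketch of the same merge, with no Hadoop involved. It assumes "merge" means the sorted, de-duplicated union of the lines of both files, which is what sample C suggests. The class name MergeDedupSketch and the constants SAMPLE_A and SAMPLE_B are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.TreeSet;

// Plain-Java stand-in for the merge; names here are hypothetical.
class MergeDedupSketch {
    // The sample lines of input files A and B, tab-delimited.
    static final List<String> SAMPLE_A = Arrays.asList(
            "20170101\tx", "20170102\ty", "20170103\tx",
            "20170104\ty", "20170105\tz", "20170106\tx");
    static final List<String> SAMPLE_B = Arrays.asList(
            "20170101\ty", "20170102\ty", "20170103\tx",
            "20170104\tz", "20170105\ty");

    // Union of both inputs: the TreeSet drops duplicate lines and keeps
    // the remaining lines in sorted order.
    static List<String> merge(List<String> a, List<String> b) {
        TreeSet<String> unique = new TreeSet<>();
        unique.addAll(a);
        unique.addAll(b);
        return new ArrayList<>(unique);
    }

    public static void main(String[] args) {
        // Print the merged lines, one per row.
        for (String line : merge(SAMPLE_A, SAMPLE_B)) {
            System.out.println(line);
        }
    }
}
```

Running main on the two samples gives the same nine date/tag pairs as sample C, in the same order.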

Code implementation:

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Task1 {
    // Emit each whole input line as the key; the value is an empty placeholder.
    public static class MapClass extends Mapper<LongWritable, Text, Text, Text> {
        @Override
        public void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            context.write(value, new Text(""));
        }
    }

    // Identical lines arrive grouped under one key, so writing each key once removes duplicates.
    public static class ReduceClass extends Reducer<Text, Text, Text, Text> {
        @Override
        public void reduce(Text key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
            context.write(key, new Text(""));
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {
        Configuration conf = new Configuration();
        Job job = Job.getInstance(conf); // replaces the deprecated new Job(conf)
        job.setJarByClass(Task1.class);
        job.setMapperClass(MapClass.class);
        job.setReducerClass(ReduceClass.class);
        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        // Both input files are read by the same job; the output directory must not exist yet.
        FileInputFormat.addInputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\input2.txt"));
        FileInputFormat.addInputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\input1.txt"));
        FileOutputFormat.setOutputPath(job, new Path("C:\\Users\\Administrator\\Desktop\\新建文件夹\\output"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}
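
The job above uses the whole line as a Text key rather than a key class of its own. For comparison, here is a rough sketch of what a custom key for this data might look like. The class DateTagKey, its fields, and the roundTrip helper are all hypothetical. In a real job the class would declare that it implements org.apache.hadoop.io.WritableComparable<DateTagKey>; that is left out here so the sketch compiles with only the JDK, but the write, readFields and compareTo methods follow that interface's shape.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Objects;

// Hypothetical composite key (date, tag). In a Hadoop job it would also
// declare "implements WritableComparable<DateTagKey>".
class DateTagKey implements Comparable<DateTagKey> {
    private String date = "";
    private String tag = "";

    // A no-argument constructor is needed so the key can be created before readFields fills it.
    DateTagKey() {}

    DateTagKey(String date, String tag) {
        this.date = date;
        this.tag = tag;
    }

    // Serialize the fields in a fixed order.
    public void write(DataOutput out) throws IOException {
        out.writeUTF(date);
        out.writeUTF(tag);
    }

    // Read the fields back in the same order they were written.
    public void readFields(DataInput in) throws IOException {
        date = in.readUTF();
        tag = in.readUTF();
    }

    // Order by date first, then by tag.
    @Override
    public int compareTo(DateTagKey other) {
        int byDate = date.compareTo(other.date);
        return byDate != 0 ? byDate : tag.compareTo(other.tag);
    }

    @Override
    public boolean equals(Object o) {
        if (!(o instanceof DateTagKey)) return false;
        DateTagKey k = (DateTagKey) o;
        return date.equals(k.date) && tag.equals(k.tag);
    }

    // Equal keys must hash alike so they are routed to the same partition.
    @Override
    public int hashCode() {
        return Objects.hash(date, tag);
    }

    @Override
    public String toString() {
        return date + "\t" + tag;
    }

    // Sketch-only helper: serialize a key to bytes and read it back.
    static DateTagKey roundTrip(DateTagKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            DateTagKey copy = new DateTagKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

With a key like this, the mapper would split each line on the tab and emit a DateTagKey instead of the raw Text line. The sort order and the grouping of duplicates would then come from compareTo rather than from comparing the line's bytes.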

Result:

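Pitfalls encountered:

  Reasons why reduce may not run:

    1. The program threw an exception; the logs can be used to debug it;

    2. The parameter types do not match;

    and so on.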
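
One way the second pitfall can show up (an assumption about what was hit here, not something the post confirms): if the parameter list of reduce differs from the base class's, for example Iterator instead of Iterable, Java treats the method as an overload rather than an override. The framework keeps calling the base method, and the custom logic never runs. The sketch below reproduces this with plain-Java stand-ins. BaseReducer, BrokenReducer and FixedReducer are hypothetical and much simpler than Hadoop's Reducer.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Hypothetical, simplified stand-in for a framework reducer base class.
class BaseReducer {
    final List<String> output = new ArrayList<>();

    // Default behaviour: pass every value through unchanged.
    protected void reduce(String key, Iterable<String> values) {
        for (String v : values) {
            output.add(key + "\t" + v);
        }
    }

    // The framework only ever calls the Iterable-based method.
    final void run(String key, List<String> values) {
        reduce(key, values);
    }
}

// Wrong parameter type (Iterator instead of Iterable): this is an overload,
// so run() never reaches it. Adding @Override here would be a compile error.
class BrokenReducer extends BaseReducer {
    protected void reduce(String key, Iterator<String> values) {
        output.add(key);
    }
}

// Matching signature: @Override compiles and run() dispatches to this method.
class FixedReducer extends BaseReducer {
    @Override
    protected void reduce(String key, Iterable<String> values) {
        output.add(key);
    }
}
```

Putting @Override on map and reduce turns this kind of mismatch into a compile error instead of a job that silently falls back to the default behaviour.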

posted @ 2018-10-24 20:55  bear_ge