Changing the output path of the statistics MapReduce job.
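First, add the following statement at line 104 of the MR program from the previous post:
jobConf.setOutputFormat(MyMultipleFilesTextOutputFormat.class);
Its purpose is to replace the job's default output format with a custom one.
The MR code from the previous post: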
http://www.cnblogs.com/rocky24/p/f7a27b79fa8e5dfdc22fb535cadb86bc.html
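- 1 Extend MultipleOutputFormat and implement its abstract method getBaseRecordWriter, which supplies the RecordWriter that writes the key/value pairs to the file system.
- 2 Override generateFileNameForKeyValue so that different records go to differently named output files.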
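/**
 * Custom output format that writes each record to a file chosen from its key.
 *
 * Imports needed in the enclosing source file, if not already present:
 *   java.io.IOException
 *   org.apache.hadoop.fs.FileSystem
 *   org.apache.hadoop.io.IntWritable
 *   org.apache.hadoop.io.Text
 *   org.apache.hadoop.mapred.JobConf
 *   org.apache.hadoop.mapred.TextOutputFormat
 *   org.apache.hadoop.mapred.lib.MultipleOutputFormat
 *   org.apache.hadoop.util.Progressable
 */
public static class MyMultipleFilesTextOutputFormat extends MultipleOutputFormat<Text, IntWritable> {

    // Created on first use and reused for every output file.
    private TextOutputFormat<Text, IntWritable> output = null;

    // State explicitly which RecordWriter class does the actual writing.
    @Override
    protected org.apache.hadoop.mapred.RecordWriter<Text, IntWritable> getBaseRecordWriter(
            FileSystem fs, JobConf job, String name, Progressable progress)
            throws IOException {
        if (output == null) {
            output = new TextOutputFormat<Text, IntWritable>();
        }
        return output.getRecordWriter(fs, job, name, progress);
    }

    // Override the method that generates the output file name.
    @Override
    protected String generateFileNameForKeyValue(Text key, IntWritable value, String name) {
        // The output file name is derived from k3, the output key.
        final String keyString = key.toString();
        if (keyString.contains("download")) {
            return "download";
        } else if (keyString.contains("upload")) {
            return "upload";
        } else if (keyString.contains("debug")) {
            return "debug";
        } else {
            return "others";
        }
    }
}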
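To check the routing rule without starting a Hadoop job, the branch logic of generateFileNameForKeyValue can be copied into a plain Java class. The following is only a minimal sketch: the class name FileNameRouting, the helper fileNameFor and the sample keys are hypothetical and do not come from the previous post. Only the contains() checks and their order are taken from the code above.

```java
// Stand-alone sketch of the file name routing rule; no Hadoop dependency.
// FileNameRouting and fileNameFor are hypothetical names used for illustration.
final class FileNameRouting {

    // Same checks, in the same order, as generateFileNameForKeyValue above.
    static String fileNameFor(String key) {
        if (key.contains("download")) {
            return "download";
        } else if (key.contains("upload")) {
            return "upload";
        } else if (key.contains("debug")) {
            return "debug";
        } else {
            return "others";
        }
    }

    public static void main(String[] args) {
        // Made-up sample keys, not real job output.
        String[] sampleKeys = {
            "/logs/download/a.zip",
            "user_upload",
            "debug_upload",
            "debug_trace",
            "login"
        };
        for (String key : sampleKeys) {
            System.out.println(key + " -> " + fileNameFor(key));
        }
    }
}
```

Note that the checks are evaluated in order, so a key that contains more than one keyword (such as the made-up "debug_upload") goes to the file of the first match, here "upload". Any key that matches none of the three keywords ends up in "others".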
God has given me a gift. Only one. I am the most complete fighter in the world. My whole life, I have trained. I must prove I am worthy of something. rocky_24