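MapReduce Case Study: Traffic Statistics (Part 2)
### Requirement 2: Sort by upstream flow in descending order
Analysis: use the output of Requirement 1 as the input for the sort. Define a custom FlowBean and use it as the map output key, with the phone number as the map output value. This works because MapReduce sorts the map output by key, so the ordering defined on FlowBean determines the order of the final result.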
##### Step 1: Define FlowBean and implement WritableComparable to provide the sort order
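Notes on Java's compareTo method:
- compareTo compares the current object with the method's argument.
- It returns 0 if the current object is equal to the argument.
- It returns a negative integer if the current object is less than the argument (commonly -1, but any negative value is allowed).
- It returns a positive integer if the current object is greater than the argument (commonly 1, but any positive value is allowed).
For example, in `o1.compareTo(o2);` a positive result places the current object (o1, the object on which compareTo is called) after the compared object (o2, the argument) in the sorted output; a negative result places it before.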
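The ordering rule above can be tried without Hadoop. The following is a minimal sketch, assuming a hypothetical `UpFlowKey` class that stands in for FlowBean and carries only the field used for ordering; it shows how swapping the operands in `compareTo` produces a descending sort.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class CompareToDemo {
    // Sorts the given values using UpFlowKey's natural ordering and returns them in that order.
    static List<Integer> sortDescending(int... values) {
        List<UpFlowKey> keys = new ArrayList<>();
        for (int v : values) {
            keys.add(new UpFlowKey(v));
        }
        Collections.sort(keys); // uses UpFlowKey.compareTo
        List<Integer> result = new ArrayList<>();
        for (UpFlowKey k : keys) {
            result.add(k.upFlow);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(sortDescending(3, 10, 7)); // prints [10, 7, 3]
    }
}

// Hypothetical stand-in for FlowBean: only the upstream value is kept.
class UpFlowKey implements Comparable<UpFlowKey> {
    final int upFlow;

    UpFlowKey(int upFlow) {
        this.upFlow = upFlow;
    }

    // Descending order: the operands are swapped, so a larger upFlow compares as "smaller".
    @Override
    public int compareTo(UpFlowKey other) {
        return Integer.compare(other.upFlow, this.upFlow);
    }
}
```

The sketch uses `Integer.compare` rather than subtracting the two values, because subtracting two `int` values of opposite sign can overflow and yield a result with the wrong sign.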
~~~java
import java.io.DataInput;
import java.io.DataOutput;
import java.io.IOException;

import org.apache.hadoop.io.WritableComparable;

public class FlowBean implements WritableComparable<FlowBean> {
    private Integer upFlow;        // number of upstream packets
    private Integer downFlow;      // number of downstream packets
    private Integer upCountFlow;   // total upstream traffic
    private Integer downCountFlow; // total downstream traffic

    public Integer getUpFlow() {
        return upFlow;
    }

    public void setUpFlow(Integer upFlow) {
        this.upFlow = upFlow;
    }

    public Integer getDownFlow() {
        return downFlow;
    }

    public void setDownFlow(Integer downFlow) {
        this.downFlow = downFlow;
    }

    public Integer getUpCountFlow() {
        return upCountFlow;
    }

    public void setUpCountFlow(Integer upCountFlow) {
        this.upCountFlow = upCountFlow;
    }

    public Integer getDownCountFlow() {
        return downCountFlow;
    }

    public void setDownCountFlow(Integer downCountFlow) {
        this.downCountFlow = downCountFlow;
    }

    // Tab-separated form used by TextOutputFormat when the bean is written as a value
    @Override
    public String toString() {
        return upFlow +
                "\t" + downFlow +
                "\t" + upCountFlow +
                "\t" + downCountFlow;
    }

    // Serialization: the field order here must match readFields
    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(upFlow);
        out.writeInt(downFlow);
        out.writeInt(upCountFlow);
        out.writeInt(downCountFlow);
    }

    // Deserialization: read the fields in the same order they were written
    @Override
    public void readFields(DataInput in) throws IOException {
        this.upFlow = in.readInt();
        this.downFlow = in.readInt();
        this.upCountFlow = in.readInt();
        this.downCountFlow = in.readInt();
    }

    // Sort rule: descending by upFlow.
    // Integer.compare with swapped operands avoids the overflow that
    // plain subtraction (flowBean.upFlow - this.upFlow) can produce.
    @Override
    public int compareTo(FlowBean flowBean) {
        return Integer.compare(flowBean.upFlow, this.upFlow);
    }
}
~~~
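The serialization methods only work if `write` and `readFields` handle the fields in the same order. The following is a minimal sketch of a round trip that uses only `java.io`, so it runs without Hadoop. `SimpleFlow` is a hypothetical, reduced stand-in for FlowBean with two fields, and the numbers are illustrative values.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

class RoundTripDemo {
    // Serializes a bean to a byte array and reads it back into a fresh instance.
    static SimpleFlow roundTrip(SimpleFlow original) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(buffer)) {
                original.write(out);
            }
            SimpleFlow copy = new SimpleFlow();
            try (DataInputStream in =
                         new DataInputStream(new ByteArrayInputStream(buffer.toByteArray()))) {
                copy.readFields(in);
            }
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SimpleFlow copy = roundTrip(new SimpleFlow(24, 27));
        System.out.println(copy.upFlow + "\t" + copy.downFlow);
    }
}

// Hypothetical reduced bean: same write/readFields shape as FlowBean, two fields only.
class SimpleFlow {
    int upFlow;
    int downFlow;

    SimpleFlow() {
        // A no-argument constructor lets an empty instance be created and then filled by readFields.
    }

    SimpleFlow(int upFlow, int downFlow) {
        this.upFlow = upFlow;
        this.downFlow = downFlow;
    }

    void write(DataOutput out) throws IOException {
        out.writeInt(upFlow);
        out.writeInt(downFlow);
    }

    void readFields(DataInput in) throws IOException {
        // Same order as write: upFlow first, then downFlow.
        this.upFlow = in.readInt();
        this.downFlow = in.readInt();
    }
}
```

If the two methods disagreed on the order, the copy would come back with its fields swapped and no exception would be raised, which is why this kind of check is worth running once.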
##### Step 2: Define FlowSortMapper
```java
import java.io.IOException;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class FlowSortMapper extends Mapper<LongWritable, Text, FlowBean, Text> {
    // map method: converts K1 and V1 into K2 and V2
    @Override
    protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
        // 1: Split the line (V1) to get the four flow fields and wrap them in a FlowBean ----> K2
        String[] split = value.toString().split("\t");
        FlowBean flowBean = new FlowBean();
        flowBean.setUpFlow(Integer.parseInt(split[1]));
        flowBean.setDownFlow(Integer.parseInt(split[2]));
        flowBean.setUpCountFlow(Integer.parseInt(split[3]));
        flowBean.setDownCountFlow(Integer.parseInt(split[4]));

        // 2: Take the phone number from the first column of the line ---> V2
        String phoneNum = split[0];

        // 3: Write K2 and V2 to the context
        context.write(flowBean, new Text(phoneNum));
    }
}
```
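The parsing step in the mapper assumes every input line has five tab-separated columns: the phone number followed by four integers. The following is a minimal sketch of that parsing in plain Java, with a field-count check added. `FlowLineParser` is a hypothetical helper and the sample line is made up for illustration; it is not taken from the real Requirement 1 output.

```java
import java.util.Arrays;

class FlowLineParser {
    // Assumed layout: phone, upFlow, downFlow, upCountFlow, downCountFlow
    static final int EXPECTED_FIELDS = 5;

    // Returns the phone number from the first column.
    static String parsePhone(String line) {
        return splitChecked(line)[0];
    }

    // Returns the four counters in column order.
    static int[] parseCounters(String line) {
        String[] fields = splitChecked(line);
        int[] counters = new int[EXPECTED_FIELDS - 1];
        for (int i = 0; i < counters.length; i++) {
            counters[i] = Integer.parseInt(fields[i + 1].trim());
        }
        return counters;
    }

    // Splits on tabs and rejects lines that do not have exactly five columns.
    private static String[] splitChecked(String line) {
        String[] fields = line.split("\t");
        if (fields.length != EXPECTED_FIELDS) {
            throw new IllegalArgumentException(
                    "Expected " + EXPECTED_FIELDS + " fields but got " + fields.length);
        }
        return fields;
    }

    public static void main(String[] args) {
        String line = "13500000000\t3\t3\t180\t180"; // made-up sample line
        System.out.println(parsePhone(line));                      // prints 13500000000
        System.out.println(Arrays.toString(parseCounters(line)));  // prints [3, 3, 180, 180]
    }
}
```

Whether a malformed line should fail the task or be skipped is a design choice; the sketch throws so that the problem is visible.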
##### Step 3: Define FlowSortReducer
```java
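import java.io.IOException;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Reducer;

/*
  K2: FlowBean
  V2: Text (phone number)
  K3: Text (phone number)
  V3: FlowBean
 */
public class FlowSortReducer extends Reducer<FlowBean, Text, Text, FlowBean> {
    @Override
    protected void reduce(FlowBean key, Iterable<Text> values, Context context) throws IOException, InterruptedException {
        // 1: Iterate over the values; each value is a K3 (phone number).
        //    Write K3 and V3 (the FlowBean key) to the context.
        for (Text value : values) {
            context.write(value, key);
        }
    }
}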
```
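The loop in the reducer matters because keys that compare as equal are grouped into a single `reduce` call, so several phone numbers with the same upstream value arrive together. The following is a rough simulation in plain Java, assuming a `TreeMap` with a reversed comparator as a stand-in for the framework's sort and group phases; the phone labels and values are made up.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class GroupingDemo {
    // Groups phone labels by upFlow, then emits "phone<TAB>upFlow" lines in descending key order.
    static List<String> sortedOutput(String[] phones, int[] upFlows) {
        // Stand-in for the shuffle: keys are sorted in descending order, equal keys share one group.
        TreeMap<Integer, List<String>> groups = new TreeMap<>(Comparator.<Integer>reverseOrder());
        for (int i = 0; i < phones.length; i++) {
            groups.computeIfAbsent(upFlows[i], k -> new ArrayList<>()).add(phones[i]);
        }

        // Stand-in for reduce: one iteration per key, one output line per value in the group.
        List<String> lines = new ArrayList<>();
        for (Map.Entry<Integer, List<String>> group : groups.entrySet()) {
            for (String phone : group.getValue()) {
                lines.add(phone + "\t" + group.getKey());
            }
        }
        return lines;
    }

    public static void main(String[] args) {
        String[] phones = {"p1", "p2", "p3"}; // made-up labels
        int[] upFlows = {5, 9, 5};
        for (String line : sortedOutput(phones, upFlows)) {
            System.out.println(line);
        }
    }
}
```

In this simulation `p1` and `p3` share the key 5 and are emitted from the same group. Writing only the first value of each group would silently drop `p3`.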
##### Step 4: Program entry point (main)
```java
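import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.conf.Configured;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.hadoop.mapreduce.lib.output.TextOutputFormat;
import org.apache.hadoop.util.Tool;
import org.apache.hadoop.util.ToolRunner;

public class JobMain extends Configured implements Tool {
    // Configures and runs a single job
    @Override
    public int run(String[] args) throws Exception {
        // 1: Create the job object
        Job job = Job.getInstance(super.getConf(), "mapreduce_flowsort");

        // 2: Configure the job (eight steps)
        // Step 1: Set the input format and the input path
        job.setInputFormatClass(TextInputFormat.class);
        //TextInputFormat.addInputPath(job, new Path("hdfs://node01:8020/wordcount"));
        TextInputFormat.addInputPath(job, new Path("file:///D:\\out\\flowcount_out"));

        // Step 2: Set the Mapper class and the map output types
        job.setMapperClass(FlowSortMapper.class);
        // Type of K2
        job.setMapOutputKeyClass(FlowBean.class);
        // Type of V2
        job.setMapOutputValueClass(Text.class);

        // Steps 3 (partition) and 4 (sort): defaults; the sort uses FlowBean.compareTo
        // Step 5: combiner (not used)
        // Step 6: grouping (default)

        // Step 7: Set the Reducer class and the output types
        job.setReducerClass(FlowSortReducer.class);
        // Type of K3
        job.setOutputKeyClass(Text.class);
        // Type of V3
        job.setOutputValueClass(FlowBean.class);

        // Step 8: Set the output format
        job.setOutputFormatClass(TextOutputFormat.class);
        // Set the output path
        TextOutputFormat.setOutputPath(job, new Path("file:///D:\\out\\flowsort_out"));

        // Wait for the job to finish
        boolean bl = job.waitForCompletion(true);
        return bl ? 0 : 1;
    }

    public static void main(String[] args) throws Exception {
        Configuration configuration = new Configuration();
        // Launch the job
        int run = ToolRunner.run(configuration, new JobMain(), args);
        System.exit(run);
    }
}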
```