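Traffic Statistics: Custom Mapper Class (Step 2)
1. access.log
(1) Under com.imooc.bigdata (hadoop-train-v2), create a new directory: access
(2) Under access, create a new directory: input
(3) Under input, create a new file: access.log. Each record goes on its own line, with fields separated by tabs, because AccessMapper splits each line on "\t":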
001	13161021751	192.168.126.100	200	5	10
002	13161021751	192.168.126.100	200	5	10
003	13161021751	192.168.126.100	200	5	10
004	13161021751	192.168.126.100	200	5	10
005	13161021751	192.168.126.100	200	5	10
006	13161021752	192.168.126.101	200	15	20
007	13161021752	192.168.126.101	200	15	20
008	13161021752	192.168.126.101	200	15	20
009	13161021752	192.168.126.101	200	15	20
010	13161021752	192.168.126.101	200	15	20
011	13161021753	192.168.126.102	200	25	30
012	13161021753	192.168.126.102	200	25	30
013	13161021753	192.168.126.102	200	25	30
014	13161021753	192.168.126.102	200	25	30
015	13161021753	192.168.126.102	200	25	30
016	13161021754	192.168.126.103	200	35	40
017	13161021754	192.168.126.103	200	35	40
018	13161021754	192.168.126.103	200	35	40
019	13161021754	192.168.126.103	200	35	40
020	13161021754	192.168.126.103	200	35	40
021	13161021755	192.168.126.104	200	45	50
022	13161021755	192.168.126.104	200	45	50
023	13161021755	192.168.126.104	200	45	50
024	13161021755	192.168.126.104	200	45	50
025	13161021755	192.168.126.104	200	45	50
026	13161021755	192.168.126.104	200	45	50
027	13161021755	192.168.126.104	200	45	50
028	13161021755	192.168.126.104	200	45	50
029	13161021755	192.168.126.104	200	45	50
030	13161021755	192.168.126.104	200	45	50
2. AccessMapper.java
package com.imooc.bigdata.hadoop.mr.access;

import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

import java.io.IOException;

/*
 * Custom Mapper class.
 *
 * A Mapper extracts the fields we need and applies any conversion required.
 *
 * Type parameters of the Mapper being extended:
 *   LongWritable: input key, the byte offset of the line
 *   Text:         input value, one line of text
 *   Text:         output key, the phone number
 *   Access:       output value, the composite type defined in Access.java
 */
public class AccessMapper extends Mapper<LongWritable, Text, Text, Access> {

    // Override map (in the IDE, type "map" and pick the protected method).
    // Task: split each line of access.log on tabs and extract three fields.
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        String[] fields = value.toString().split("\t");

        String phone = fields[1];                               // phone number
        long up = Long.parseLong(fields[fields.length - 2]);    // upstream traffic
        long down = Long.parseLong(fields[fields.length - 1]);  // downstream traffic

        // Write the result through the context: phone number as key,
        // an Access record (which also holds the sum) as value.
        context.write(new Text(phone), new Access(phone, up, down));
    }
}
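To check the field extraction without starting a Hadoop job, the parsing steps from map can be tried in plain Java. This is only a minimal sketch: the class ParseLineSketch and its two helper methods are hypothetical names that are not part of the project, and the input line is assumed to be tab-separated like the records in access.log.

```java
public class ParseLineSketch {

    // Hypothetical helper: same phone extraction as in AccessMapper.map.
    static String parsePhone(String line) {
        String[] fields = line.split("\t");
        return fields[1];
    }

    // Hypothetical helper: returns {up, down}, read from the last two fields.
    static long[] parseTraffic(String line) {
        String[] fields = line.split("\t");
        long up = Long.parseLong(fields[fields.length - 2]);
        long down = Long.parseLong(fields[fields.length - 1]);
        return new long[] {up, down};
    }

    public static void main(String[] args) {
        // Record 006 from access.log, written with explicit tab escapes.
        String line = "006\t13161021752\t192.168.126.101\t200\t15\t20";
        long[] traffic = parseTraffic(line);
        // prints: 13161021752 up=15 down=20
        System.out.println(parsePhone(line) + " up=" + traffic[0] + " down=" + traffic[1]);
    }
}
```

Reading the traffic values from the end of the array (length - 2 and length - 1), as the mapper does, keeps the extraction working even if extra fields appear in the middle of a record. If the file is saved with spaces instead of tabs, split("\t") returns a single element and fields[1] throws ArrayIndexOutOfBoundsException.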
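3. Access.java
(1) In addition to the default constructor, add the following constructor so that AccessMapper.java can build an Access object in one call:
// Convenience constructor used by AccessMapper.java
public Access(String phone, long up, long down) {
    this.phone = phone;
    this.up = up;
    this.down = down;
    this.sum = up + down;
}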
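The effect of this constructor can be seen in isolation with a stripped-down stand-in for Access. This is a sketch under assumptions: AccessSketch is a hypothetical class, it carries only the four fields the constructor touches, and it leaves out the Hadoop serialization code that the real Access.java needs in order to be used as a map output value.

```java
public class AccessSketch {
    private String phone;
    private long up;
    private long down;
    private long sum;

    // The no-argument constructor stays in place next to the new one.
    public AccessSketch() {
    }

    // Same body as the constructor added to Access.java: sum is derived, not passed in.
    public AccessSketch(String phone, long up, long down) {
        this.phone = phone;
        this.up = up;
        this.down = down;
        this.sum = up + down;
    }

    public String getPhone() { return phone; }
    public long getUp() { return up; }
    public long getDown() { return down; }
    public long getSum() { return sum; }

    public static void main(String[] args) {
        // Values taken from record 021 of access.log.
        AccessSketch access = new AccessSketch("13161021755", 45, 50);
        // prints: 13161021755 95
        System.out.println(access.getPhone() + " " + access.getSum());
    }
}
```

The default constructor should not be removed when the new one is added: once a class declares any constructor, Java no longer generates the implicit no-argument one, and Hadoop creates Writable values through reflection using that no-argument constructor.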