
1 weekend110 review + the serialization mechanism in Hadoop + developing the traffic-sum MR program

 

  No more preamble, let's get straight to the content!

 

 

 

The above is a review summary of weekend110's source-code analysis of the YARN job submission flow.

 

Next up is weekend110's coverage of the serialization mechanism in Hadoop.

 

 

1363157985066      13726230503  00-FD-07-A4-72-B8:CMCC       120.196.100.82        i02.c.aliimg.com                24     27     2481         24681       200

1363157995052      13826544101  5C-0E-8B-C7-F1-E0:CMCC       120.197.40.4                      4       0       264  0       200

1363157991076      13926435656  20-10-7A-28-CC-0A:CMCC       120.196.100.99                          2       4       132  1512         200

1363154400022      13926251106  5C-0E-8B-8B-B1-50:CMCC       120.197.40.4                      4       0       240  0       200

1363157993044      18211575961  94-71-AC-CD-E6-18:CMCC-EASY     120.196.100.99        iface.qiyi.com  视频网站         15         12     1527         2106         200

1363157995074      84138413         5C-0E-8B-8C-E8-20:7DaysInn 120.197.40.4   122.72.52.12            20     16     4116         1432         200

1363157993055      13560439658  C4-17-FE-BA-DE-D9:CMCC      120.196.100.99                          18     15     1116         954  200

1363157995033      15920133257  5C-0E-8B-C7-BA-20:CMCC      120.197.40.4   sug.so.360.cn  信息安全         20     20     3156         2936         200

1363157983019      13719199419  68-A1-B7-03-07-B1:CMCC-EASY      120.196.100.82                          4       0       240  0       200

1363157984041      13660577991  5C-0E-8B-92-5C-20:CMCC-EASY      120.197.40.4   s19.cnzz.com  站点统计         24     9         6960         690  200

1363157973098      15013685858  5C-0E-8B-C7-F7-90:CMCC       120.197.40.4   rank.ie.sogou.com  搜索引擎         28     27         3659         3538         200

1363157986029      15989002119  E8-99-C4-4E-93-E0:CMCC-EASY      120.196.100.99        www.umeng.com    站点统计         3         3       1938         180  200

1363157992093      13560439658  C4-17-FE-BA-DE-D9:CMCC      120.196.100.99                          15     9       918  4938         200

1363157986041      13480253104  5C-0E-8B-C7-FC-80:CMCC-EASY      120.197.40.4                      3       3       180  180  200

1363157984040      13602846565  5C-0E-8B-8B-B6-00:CMCC       120.197.40.4   2052.flash2-http.qq.com         综合门户         15     12     1938         2910         200

1363157995093      13922314466  00-FD-07-A2-EC-BA:CMCC      120.196.100.82        img.qfc.cn                  12     12     3008         3720         200

1363157982040      13502468823  5C-0A-5B-6A-0B-D4:CMCC-EASY    120.196.100.99        y0.ifengimg.com      综合门户         57     102  7335         110349     200

1363157986072      18320173382  84-25-DB-4F-10-1A:CMCC-EASY      120.196.100.99        input.shouji.sogou.com   搜索引擎         21     18     9531         2412         200

1363157990043      13925057413  00-1F-64-E1-E6-9A:CMCC        120.196.100.55        t3.baidu.com   搜索引擎         69     63         11058       48243       200

1363157988072      13760778710  00-FD-07-A4-7B-08:CMCC       120.196.100.82                          2       2       120  120  200

1363157985066      13726238888  00-FD-07-A4-72-B8:CMCC       120.196.100.82        i02.c.aliimg.com                24     27     2481         24681       200

1363157993055      13560436666  C4-17-FE-BA-DE-D9:CMCC      120.196.100.99                          18     15     1116         954  200

 

Phone number        Timestamp        IP        Website        Upstream traffic        Downstream traffic        Total traffic
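To make the record layout concrete, here is a minimal sketch of pulling the phone number and the two traffic counters out of one log line. It rests on assumptions read off the sample data rather than on anything the course states: the phone number is taken to be the second whitespace-separated field, and the upstream and downstream byte counts are taken to be the third-from-last and second-from-last fields (counting from the end because the website and category columns are sometimes empty). `FlowLineParser` and `FlowRecord` are hypothetical names.

```java
/** Hypothetical helper: extracts the phone number and traffic counters from one log line. */
class FlowLineParser {

    /** Simple value holder for the fields this job cares about. */
    static final class FlowRecord {
        final String phone;
        final long upFlow;
        final long downFlow;

        FlowRecord(String phone, long upFlow, long downFlow) {
            this.phone = phone;
            this.upFlow = upFlow;
            this.downFlow = downFlow;
        }

        long sumFlow() {
            return upFlow + downFlow;
        }
    }

    static FlowRecord parse(String line) {
        String[] fields = line.trim().split("\\s+");
        // Shortest sample lines have 9 fields: timestamp, phone, MAC, IP,
        // two packet counts, two byte counts, and the status code.
        if (fields.length < 9) {
            throw new IllegalArgumentException("too few fields: " + line);
        }
        // Assumption: optional columns sit in the middle of the line, so the
        // byte counters are located by counting back from the end.
        String phone = fields[1];
        long upFlow = Long.parseLong(fields[fields.length - 3]);
        long downFlow = Long.parseLong(fields[fields.length - 2]);
        return new FlowRecord(phone, upFlow, downFlow);
    }

    public static void main(String[] args) {
        FlowRecord r = parse("1363157985066 13726230503 00-FD-07-A4-72-B8:CMCC "
                + "120.196.100.82 i02.c.aliimg.com 24 27 2481 24681 200");
        System.out.println(r.phone + "\t" + r.upFlow + "\t" + r.downFlow + "\t" + r.sumFlow());
    }
}
```

If the real input is strictly tab-separated with empty columns preserved, fixed indexes from the front would work as well; counting from the end is simply the more forgiving choice for the sample as shown here.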

 

 

 

 

 

 

 

 

 

 

 

 

Source code of LongWritable

 

/**

 * Licensed to the Apache Software Foundation (ASF) under one

 * or more contributor license agreements.  See the NOTICE file

 * distributed with this work for additional information

 * regarding copyright ownership.  The ASF licenses this file

 * to you under the Apache License, Version 2.0 (the

 * "License"); you may not use this file except in compliance

 * with the License.  You may obtain a copy of the License at

 *

 *     http://www.apache.org/licenses/LICENSE-2.0

 *

 * Unless required by applicable law or agreed to in writing, software

 * distributed under the License is distributed on an "AS IS" BASIS,

 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

 * See the License for the specific language governing permissions and

 * limitations under the License.

 */

 

package org.apache.hadoop.io;

 

 

import java.io.DataInput;

import java.io.DataOutput;

import java.io.IOException;

 

import org.apache.hadoop.classification.InterfaceAudience;

import org.apache.hadoop.classification.InterfaceStability;

 

/** A WritableComparable for longs. */

@InterfaceAudience.Public

@InterfaceStability.Stable

public class LongWritable implements WritableComparable<LongWritable> {

  private long value;

 

  public LongWritable() {}

 

  public LongWritable(long value) { set(value); }

 

  /** Set the value of this LongWritable. */

  public void set(long value) { this.value = value; }

 

  /** Return the value of this LongWritable. */

  public long get() { return value; }

 

  @Override

  public void readFields(DataInput in) throws IOException {

    value = in.readLong();

  }

 

  @Override

  public void write(DataOutput out) throws IOException {

    out.writeLong(value);

  }

 

  /** Returns true iff <code>o</code> is a LongWritable with the same value. */

  @Override

  public boolean equals(Object o) {

    if (!(o instanceof LongWritable))

      return false;

    LongWritable other = (LongWritable)o;

    return this.value == other.value;

  }

 

  @Override

  public int hashCode() {

    return (int)value;

  }

 

  /** Compares two LongWritables. */

  @Override

  public int compareTo(LongWritable o) {

    long thisValue = this.value;

    long thatValue = o.value;

    return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1));

  }

 

  @Override

  public String toString() {

    return Long.toString(value);

  }

 

  /** A Comparator optimized for LongWritable. */

  public static class Comparator extends WritableComparator {

    public Comparator() {

      super(LongWritable.class);

    }

 

    @Override

    public int compare(byte[] b1, int s1, int l1,

                       byte[] b2, int s2, int l2) {

      long thisValue = readLong(b1, s1);

      long thatValue = readLong(b2, s2);

      return (thisValue<thatValue ? -1 : (thisValue==thatValue ? 0 : 1));

    }

  }

 

  /** A decreasing Comparator optimized for LongWritable. */

  public static class DecreasingComparator extends Comparator {

   

    @Override

    public int compare(WritableComparable a, WritableComparable b) {

      return -super.compare(a, b);

    }

    @Override

    public int compare(byte[] b1, int s1, int l1, byte[] b2, int s2, int l2) {

      return -super.compare(b1, s1, l1, b2, s2, l2);

    }

  }

 

  static {                                       // register default comparator

    WritableComparator.define(LongWritable.class, new Comparator());

  }

 

}
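The nested `Comparator` in the listing above compares two keys directly on their serialized bytes, without first building `LongWritable` objects. The plain-JDK sketch below imitates that idea so it can be run without Hadoop on the classpath. `RawLongCompare` and its `readLong` helper are hypothetical stand-ins written for illustration; they are not the Hadoop classes themselves.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Hypothetical demo: comparing serialized longs without deserializing into objects. */
class RawLongCompare {

    /** Serializes a long as DataOutput.writeLong does: 8 bytes, high byte first. */
    static byte[] serialize(long value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decodes 8 big-endian bytes starting at the given offset. */
    static long readLong(byte[] bytes, int start) {
        long result = 0;
        for (int i = 0; i < 8; i++) {
            result = (result << 8) | (bytes[start + i] & 0xFF);
        }
        return result;
    }

    /** Same ordering logic as the listing: decode both sides, then compare. */
    static int compare(byte[] b1, int s1, byte[] b2, int s2) {
        long thisValue = readLong(b1, s1);
        long thatValue = readLong(b2, s2);
        return thisValue < thatValue ? -1 : (thisValue == thatValue ? 0 : 1);
    }

    public static void main(String[] args) {
        System.out.println(compare(serialize(5L), 0, serialize(9L), 0));
    }
}
```

Working on the raw bytes matters during the sort phase, where a very large number of key comparisons happen and object creation would add avoidable overhead.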

 

 

 

 

Source code of WritableComparable

 

/**

 * Licensed to the Apache Software Foundation (ASF) under one

 * or more contributor license agreements.  See the NOTICE file

 * distributed with this work for additional information

 * regarding copyright ownership.  The ASF licenses this file

 * to you under the Apache License, Version 2.0 (the

 * "License"); you may not use this file except in compliance

 * with the License.  You may obtain a copy of the License at

 *

 *     http://www.apache.org/licenses/LICENSE-2.0

 *

 * Unless required by applicable law or agreed to in writing, software

 * distributed under the License is distributed on an "AS IS" BASIS,

 * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.

 * See the License for the specific language governing permissions and

 * limitations under the License.

 */

 

package org.apache.hadoop.io;

 

import org.apache.hadoop.classification.InterfaceAudience;

import org.apache.hadoop.classification.InterfaceStability;

 

/**

 * A {@link Writable} which is also {@link Comparable}.

 *

 * <p><code>WritableComparable</code>s can be compared to each other, typically

 * via <code>Comparator</code>s. Any type which is to be used as a

 * <code>key</code> in the Hadoop Map-Reduce framework should implement this

 * interface.</p>

 *

 * <p>Note that <code>hashCode()</code> is frequently used in Hadoop to partition

 * keys. It's important that your implementation of hashCode() returns the same

 * result across different instances of the JVM. Note also that the default

 * <code>hashCode()</code> implementation in <code>Object</code> does <b>not</b>

 * satisfy this property.</p>

 * 

 * <p>Example:</p>

 * <p><blockquote><pre>

 *     public class MyWritableComparable implements WritableComparable<MyWritableComparable> {

 *       // Some data

 *       private int counter;

 *       private long timestamp;

 *      

 *       public void write(DataOutput out) throws IOException {

 *         out.writeInt(counter);

 *         out.writeLong(timestamp);

 *       }

 *      

 *       public void readFields(DataInput in) throws IOException {

 *         counter = in.readInt();

 *         timestamp = in.readLong();

 *       }

 *      

 *       public int compareTo(MyWritableComparable o) {

 *         int thisValue = this.value;

 *         int thatValue = o.value;

 *         return (thisValue &lt; thatValue ? -1 : (thisValue==thatValue ? 0 : 1));

 *       }

 *

 *       public int hashCode() {

 *         final int prime = 31;

 *         int result = 1;

 *         result = prime * result + counter;

 *         result = prime * result + (int) (timestamp ^ (timestamp &gt;&gt;&gt; 32));

 *         return result

 *       }

 *     }

 * </pre></blockquote></p>

 */

@InterfaceAudience.Public

@InterfaceStability.Stable

public interface WritableComparable<T> extends Writable, Comparable<T> {

}
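The Javadoc above notes that `hashCode()` is frequently used to partition keys, which is why it must return the same value in every JVM. The sketch below shows the connection in plain Java: a hash built only from field values, and the formula that Hadoop's default `HashPartitioner` applies to turn a hash code into a reducer index. `PartitionSketch` and its method names are hypothetical.

```java
/** Hypothetical demo of how a key's hash code selects a reduce task. */
class PartitionSketch {

    /** Same recipe as the Javadoc example: the result depends only on field values. */
    static int stableHash(int counter, long timestamp) {
        final int prime = 31;
        int result = 1;
        result = prime * result + counter;
        result = prime * result + (int) (timestamp ^ (timestamp >>> 32));
        return result;
    }

    /**
     * Clears the sign bit so the result is never negative, then takes the
     * remainder by the number of reduce tasks.
     */
    static int partitionFor(int hashCode, int numReduceTasks) {
        return (hashCode & Integer.MAX_VALUE) % numReduceTasks;
    }

    public static void main(String[] args) {
        int hash = stableHash(1, 2L);
        System.out.println("hash=" + hash + " partition=" + partitionFor(hash, 4));
    }
}
```

If two JVMs computed different hash codes for equal keys, as the default `Object.hashCode()` can, records with the same key could land on different reducers and the per-key results would be split.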

 

 

 

 

 

 

 

 

 

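Writing out only the field values keeps the serialized data compact and reduces the network bandwidth it consumes, which is why Hadoop uses its own serialization mechanism.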
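The size difference is easy to observe with the JDK alone. The sketch below serializes one long value twice: once with Java's built-in object serialization, and once in the style of `LongWritable.write`, which only calls `writeLong`. `SerializationSizeDemo` and `JavaBean` are hypothetical names used for illustration.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

/** Hypothetical demo comparing serialized sizes of a single long value. */
class SerializationSizeDemo {

    /** Plain Java-serializable wrapper around one long. */
    static class JavaBean implements Serializable {
        private static final long serialVersionUID = 1L;
        final long value;

        JavaBean(long value) {
            this.value = value;
        }
    }

    /** Size when using Java's built-in serialization (stream header, class descriptor, data). */
    static int javaSerializedSize(long value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(new JavaBean(value));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    /** Size when writing only the field, as LongWritable.write does. */
    static int writableStyleSize(long value) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeLong(value);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.size();
    }

    public static void main(String[] args) {
        System.out.println("Java serialization: " + javaSerializedSize(24681L) + " bytes");
        System.out.println("Writable style:     " + writableStyleSize(24681L) + " bytes");
    }
}
```

The Writable-style form is always 8 bytes for a long, because both sides already agree on the type and only the value travels over the wire.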

 

The above is weekend110's coverage of the serialization mechanism in Hadoop.

 

//Serialize the object's data into the data stream

    @Override

    public void write(DataOutput out) throws IOException {

       // TODO Auto-generated method stub

      

    }

Serialization means writing the data out.

 

 

 

 

 //Deserialize the object's data from the data stream

    @Override

    public void readFields(DataInput in) throws IOException {

       // TODO Auto-generated method stub

      

    }

 

Deserialization means reading the data in.

 

 

At this point, the code for FlowBean.java is complete.
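Since the finished class is not reproduced in the text, here is a minimal sketch of what such a FlowBean could look like, with the two method stubs filled in. The field names (`phoneNB`, `upFlow`, `downFlow`, `sumFlow`) and the `roundTrip` helper are assumptions made for illustration. So that the sketch runs with only the JDK, the class does not declare `implements Writable`; in the actual job it would implement `org.apache.hadoop.io.Writable`, whose two methods have exactly these signatures.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

/** Sketch of a FlowBean; the real class would add "implements Writable". */
class FlowBean {
    private String phoneNB;
    private long upFlow;
    private long downFlow;
    private long sumFlow;

    /** No-arg constructor: the framework creates an empty instance, then calls readFields. */
    FlowBean() {}

    FlowBean(String phoneNB, long upFlow, long downFlow) {
        this.phoneNB = phoneNB;
        this.upFlow = upFlow;
        this.downFlow = downFlow;
        this.sumFlow = upFlow + downFlow;
    }

    long getSumFlow() {
        return sumFlow;
    }

    /** Serialize the object's data into the data stream. */
    public void write(DataOutput out) throws IOException {
        out.writeUTF(phoneNB);
        out.writeLong(upFlow);
        out.writeLong(downFlow);
        out.writeLong(sumFlow);
    }

    /** Deserialize the object's data; fields must be read in the order they were written. */
    public void readFields(DataInput in) throws IOException {
        phoneNB = in.readUTF();
        upFlow = in.readLong();
        downFlow = in.readLong();
        sumFlow = in.readLong();
    }

    @Override
    public String toString() {
        return upFlow + "\t" + downFlow + "\t" + sumFlow;
    }

    /** Writes a bean to a byte array and reads it back into a fresh instance. */
    static FlowBean roundTrip(FlowBean bean) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            bean.write(new DataOutputStream(buffer));
            FlowBean copy = new FlowBean();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(buffer.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

The one rule that is easy to get wrong is the field order: `readFields` has no field names to go by, so it must mirror `write` exactly.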

 

 

 

 


 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

[hadoop@weekend110 ~]$ /home/hadoop/app/hadoop-2.4.1/bin/hadoop jar flow.jar cn.itcast.hadoop.mr.flowsum.FlowSumRunner /flow/data /flow/output
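When checking the job's output, it helps to know what totals to expect. The sketch below is not the Hadoop job itself; it is a plain-Java imitation of what the map step (extract the phone number and traffic counters) and the reduce step (sum per phone number) compute, so the expected numbers can be reproduced locally. The field positions are assumptions read off the sample data, and `FlowSumSimulation` is a hypothetical name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical in-memory imitation of the traffic-sum job's map and reduce logic. */
class FlowSumSimulation {

    /** Returns phone -> {upstream total, downstream total, overall total}, sorted by phone. */
    static Map<String, long[]> sumByPhone(List<String> lines) {
        Map<String, long[]> totals = new TreeMap<>();
        for (String line : lines) {
            // "Map" step: pick out the key and the two counters.
            // Assumption: counters are the 3rd- and 2nd-from-last fields.
            String[] fields = line.trim().split("\\s+");
            String phone = fields[1];
            long up = Long.parseLong(fields[fields.length - 3]);
            long down = Long.parseLong(fields[fields.length - 2]);

            // "Reduce" step: accumulate per phone number.
            long[] sums = totals.computeIfAbsent(phone, k -> new long[3]);
            sums[0] += up;
            sums[1] += down;
            sums[2] += up + down;
        }
        return totals;
    }

    public static void main(String[] args) {
        List<String> sample = Arrays.asList(
                "1363157993055 13560439658 C4-17-FE-BA-DE-D9:CMCC 120.196.100.99 18 15 1116 954 200",
                "1363157992093 13560439658 C4-17-FE-BA-DE-D9:CMCC 120.196.100.99 15 9 918 4938 200");
        for (Map.Entry<String, long[]> e : sumByPhone(sample).entrySet()) {
            long[] s = e.getValue();
            System.out.println(e.getKey() + "\t" + s[0] + "\t" + s[1] + "\t" + s[2]);
        }
    }
}
```

Phone number 13560439658 appears twice in the sample data, so it is a convenient record for spot-checking that the job really sums across lines.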

 

 

 

The above is weekend110's development of the traffic-sum MR program.

 

posted @ 2016-09-22 15:00  大数据和AI躺过的坑