Finding Common Friends with MapReduce

1. Test file

A:B,C,D,F,E,O
B:A,C,E,K
C:F,A,D,I
D:A,E,F,L
E:B,C,D,M,L
F:A,B,C,D,E,O,M
G:A,C,D,E,F
H:A,C,D,E,O
I:A,O
J:B,O
K:A,C,D
L:D,E,F
M:E,F,G
O:A,H,I,J
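Each line holds one user, a colon, and that user's comma-separated friend list. A minimal parsing sketch in plain Java (the class and helper names here are illustrative, not part of the job code below):

```java
import java.util.Arrays;
import java.util.List;

public class ParseLine {
    // Extract the user from a line such as "A:B,C,D,F,E,O".
    static String user(String line) {
        return line.substring(0, line.indexOf(':'));
    }

    // Extract the friend list after the colon.
    static List<String> friends(String line) {
        return Arrays.asList(line.substring(line.indexOf(':') + 1).split(","));
    }

    public static void main(String[] args) {
        String line = "A:B,C,D,F,E,O";
        System.out.println(user(line) + " -> " + friends(line));
    }
}
```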

2. Methods

2-1. Method one:

1. Invert each line: emit the friend list as the key and the user as the value.

  {B,C,D,F,E,O}:A
  {A,C,E,K}:B

2. From this we can see that B, C, D, F, E, and O all share the common friend A.

3. Combine the members of A's list two by two as keys, with A as the value, and emit every pair with a nested loop.

4. After the shuffle, a pair such as BC becomes a key whose values (the common friends) are gathered into one collection.

5. Iterate over that collection and write the friends out in a single pass.
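The steps above can be simulated in plain Java without Hadoop: build the inverted friend-to-users map (the map phase), then pair each friend's users two by two (the shuffle/reduce phase). The class and method names are illustrative only:

```java
import java.util.*;

public class CommonFriendsInvert {
    // Phase 1: for each line "A:B,C,...", record (friend -> users who list that friend).
    // Phase 2: for each friend, sort its users and emit every ordered pair
    // (u1,u2); the pair key then accumulates all common friends.
    static Map<String, Set<String>> commonFriends(List<String> lines) {
        Map<String, List<String>> friendToUsers = new TreeMap<>();
        for (String line : lines) {
            String user = line.substring(0, line.indexOf(':'));
            for (String f : line.substring(line.indexOf(':') + 1).split(",")) {
                friendToUsers.computeIfAbsent(f, k -> new ArrayList<>()).add(user);
            }
        }
        Map<String, Set<String>> result = new TreeMap<>();
        for (Map.Entry<String, List<String>> e : friendToUsers.entrySet()) {
            List<String> users = e.getValue();
            Collections.sort(users); // sort so each pair key is canonical, e.g. "A-B" not "B-A"
            for (int i = 0; i < users.size(); i++)
                for (int j = i + 1; j < users.size(); j++)
                    result.computeIfAbsent(users.get(i) + "-" + users.get(j),
                            k -> new TreeSet<>()).add(e.getKey());
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A:B,C,D,F,E,O", "B:A,C,E,K", "C:F,A,D,I");
        System.out.println(commonFriends(lines));
    }
}
```

Sorting the user list before pairing is what guarantees that, after the shuffle, both halves of a pair land under the same key.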

2-2. Method two:

1. Emit the user as the key and the friend list as the value
   (the right-hand column shows each list after sorting):

  A:B,C,D,F,E,O     --A:B,C,D,F,E,O
  B:A,C,E,K     --B:A,C,E,K
  C:F,A,D,I     --C:A,D,F,I
  D:A,E,F,L     --D:A,E,F,L
  E:B,C,D,M,L       --E:B,C,D,L,M

2. Add all key-value pairs to a map.

3. Take the map's keys (all users) as an array.

4. Iterate over the array; for a user such as "A", fetch A's friends from the map.

5. Iterate over every user other than "A" and fetch their friend lists;

  if a user appears in both "A"'s and "B"'s friend lists,

  then that user is a common friend of "A" and "B".

  --A:{B,C,D,F,E,O}
  --B:{A,C,E,K}

  "C" and "E" appear in both "A"'s list and "B"'s list, so "C,E" are the common friends of A and B.

6. Emit "AB" as the key and the common friends as the value.
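Method two amounts to a pairwise set intersection, which can be sketched in plain Java as follows; this mirrors what the reducer's `cleanup()` does in the job code below (class and method names are illustrative):

```java
import java.util.*;

public class CommonFriendsIntersect {
    // Load every user's friend list into a map, then for each pair of users
    // intersect their two friend sets.
    static Map<String, Set<String>> commonFriends(List<String> lines) {
        Map<String, Set<String>> friendsOf = new TreeMap<>();
        for (String line : lines) {
            friendsOf.put(line.substring(0, line.indexOf(':')),
                    new TreeSet<>(Arrays.asList(
                            line.substring(line.indexOf(':') + 1).split(","))));
        }
        List<String> users = new ArrayList<>(friendsOf.keySet());
        Map<String, Set<String>> result = new TreeMap<>();
        for (int i = 0; i < users.size(); i++) {
            for (int j = i + 1; j < users.size(); j++) {
                // Intersection of the two friend sets.
                Set<String> common = new TreeSet<>(friendsOf.get(users.get(i)));
                common.retainAll(friendsOf.get(users.get(j)));
                if (!common.isEmpty()) {
                    result.put(users.get(i) + users.get(j), common);
                }
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("A:B,C,D,F,E,O", "B:A,C,E,K");
        System.out.println(commonFriends(lines)); // the "AB" entry holds A and B's common friends
    }
}
```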

3. Code

import java.io.IOException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.HashMap;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.Mapper;
import org.apache.hadoop.mapreduce.Reducer;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class Friends {

    // Mapper: each input line looks like "A:B,C,D,...". Emit the user (the
    // single character before the colon) as the key and the comma-separated
    // friend list as the value.
    public static class MRMapper extends Mapper<LongWritable, Text, Text, Text> {

        @Override
        protected void map(LongWritable key, Text value, Context context) throws IOException, InterruptedException {
            String str = value.toString();
            String friends = str.substring(2);
            context.write(new Text(str.charAt(0) + ""), new Text(friends));
        }
    }

    // Reducer: collect every (user, friend list) pair into an in-memory map,
    // then compute the pairwise intersections in cleanup(). Note that the
    // static map only works correctly with a single reducer task.
    public static class MRReducer extends Reducer<Text, Text, Text, Text> {

        private static HashMap<String, String> map1 = new HashMap<String, String>();

        @Override
        public void reduce(Text key, Iterable<Text> iterable, Context context)
                throws IOException, InterruptedException {

            // Each user appears on exactly one input line, so this stores
            // user -> friend list.
            for (Text t : iterable) {
                map1.put(key.toString(), t.toString());
            }
        }

        @Override
        public void cleanup(Reducer<Text, Text, Text, Text>.Context context)
                throws IOException, InterruptedException {

            List<String> list = new ArrayList<String>();

            Collection<String> keys = map1.keySet();// all users

            String[] split = keys.toArray(new String[0]);

            for (int i = 0; i < split.length; i++) {// iterate over users

                String a = split[i].trim();

                for (int j = i + 1; j < split.length; j++) {// iterate over every later user, so each pair is visited once

                    String b = split[j].trim();

                    String a_and_b = "";

                    // a's friends
                    String af = map1.get(a);

                    String[] friends = af.split(",");

                    for (String s : friends) {// compare the two friend lists, keeping the common friends

                        if (map1.get(b).contains(s)) {

                            a_and_b += "," + s;
                        }
                    }

                    if (a_and_b.length() > 1) {

                        list.add(a + "," + b + " common friends: " + a_and_b.substring(1));
                    }
                }
            }
            for (String s : list) {

                context.write(new Text(""), new Text(s));
            }
        }
    }

    public static void main(String[] args) throws IOException, ClassNotFoundException, InterruptedException {

        Configuration conf = new Configuration();

        Job job = Job.getInstance(conf);
        job.setJarByClass(Friends.class);

        job.setMapperClass(MRMapper.class);
        job.setReducerClass(MRReducer.class);
        // Do not set this reducer as a combiner: it buffers state in a static
        // map and writes its output in cleanup(), which breaks if run map-side.

        job.setMapOutputKeyClass(Text.class);
        job.setMapOutputValueClass(Text.class);

        job.setOutputKeyClass(Text.class);
        job.setOutputValueClass(Text.class);

        FileInputFormat.setInputPaths(job, new Path("hdfs://hadoop5:9000/input/friends.txt"));
        FileOutputFormat.setOutputPath(job, new Path("hdfs://hadoop5:9000/output/friends"));

        System.exit(job.waitForCompletion(true) ? 0 : 1);
    }
}

If you know a more concise approach, feel free to leave a comment.

posted @ 2017-07-20 19:39  总被人想的静静