Post archive for July 30, 2016 - XGogo

July 30, 2016

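Summary: Prototype: def reduceByKeyLocally(func: (V, V) => V): Map[K, V]. This function merges the V values of each key K in an RDD[K,V] using the given function, and returns the result as a Map[K,V] rather than an RDD[K,V]. scala> var rdd1 = sc.makeRDD(Ar Read more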

posted @ 2016-07-30 23:14 XGogo Views (676) Comments (0) Recommended (0)
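
The behaviour described in the reduceByKeyLocally summary above can be modelled without Spark. The sketch below is a plain-Scala illustration, not Spark's implementation: `reduceByKeyLocallyModel` is a hypothetical helper, the input is an ordinary `Seq` standing in for the RDD, and the sample data is made up. It should run in a plain Scala REPL.

```scala
// Plain-Scala model of reduceByKeyLocally (hypothetical helper, not Spark code).
// The values of each key are merged with `func`, and the result is an
// ordinary Map rather than a distributed collection.
def reduceByKeyLocallyModel[K, V](pairs: Seq[(K, V)])(func: (V, V) => V): Map[K, V] =
  pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
    // First value seen for a key is stored as-is; later values are merged in.
    acc.updated(k, acc.get(k).map(func(_, v)).getOrElse(v))
  }

// Made-up sample data: sum the values of each key.
val localResult: Map[String, Int] =
  reduceByKeyLocallyModel(Seq(("A", 0), ("A", 2), ("B", 1), ("B", 2), ("C", 1)))(_ + _)
```

Because the real operator returns a local Map, it is best suited to results with few distinct keys.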

The reduceByKey operator on PairRDD, illustrated

Summary: reduceByKey function prototypes: def reduceByKey(func: (V, V) => V): RDD[(K, V)] def reduceByKey(func: (V, V) => V, numPartitions: Int): RDD[(K, V)] def reduceByKey( Read more

posted @ 2016-07-30 23:09 XGogo Views (1861) Comments (0) Recommended (0)
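
As a rough illustration of reduceByKey, the following plain-Scala sketch models its two phases: values are first combined inside each partition, then the partial results are merged across partitions. The helper names `reducePairs` and `reduceByKeyModel` are hypothetical, each inner `Seq` stands in for one partition, and the model returns a Map where Spark would return an RDD[(K, V)].

```scala
// Plain-Scala model of reduceByKey (hypothetical helpers, not Spark code).

// Merge the values of each key within one collection of pairs.
def reducePairs[K, V](pairs: Seq[(K, V)])(func: (V, V) => V): Map[K, V] =
  pairs.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
    acc.updated(k, acc.get(k).map(func(_, v)).getOrElse(v))
  }

def reduceByKeyModel[K, V](partitions: Seq[Seq[(K, V)]])(func: (V, V) => V): Map[K, V] = {
  // Phase 1: combine locally, one partial result per key per partition.
  val partials: Seq[(K, V)] = partitions.flatMap(part => reducePairs(part)(func))
  // Phase 2: merge the partial results that share a key.
  reducePairs(partials)(func)
}

// Made-up data split over two partitions.
val reducedCounts: Map[String, Int] =
  reduceByKeyModel(Seq(
    Seq(("A", 1), ("B", 2), ("A", 3)),
    Seq(("B", 4), ("C", 5))
  ))(_ + _)
```

Since the same `func` is used in both phases, it should be associative and commutative for the result to be independent of how the data is partitioned.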

The foldByKey operator on PairRDD, illustrated

Summary: foldByKey function prototypes: def foldByKey(zeroValue: V)(func: (V, V) => V): RDD[(K, V)] def foldByKey(zeroValue: V, numPartitions: Int)(func: (V, V) => V): RDD[(K Read more

posted @ 2016-07-30 22:58 XGogo Views (621) Comments (0) Recommended (0)
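
One detail of foldByKey that is easy to miss is that zeroValue is applied once per key in each partition, not once per key overall. The plain-Scala sketch below models that behaviour under the assumption that each inner `Seq` is one partition; `foldByKeyModel` is a hypothetical helper and the data is made up.

```scala
// Plain-Scala model of foldByKey (hypothetical helper, not Spark code).
def foldByKeyModel[K, V](partitions: Seq[Seq[(K, V)]])(zeroValue: V)(func: (V, V) => V): Map[K, V] = {
  // Inside a partition, the first value of each key is folded into zeroValue.
  val perPartition: Seq[Map[K, V]] = partitions.map { part =>
    part.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
      acc.updated(k, func(acc.getOrElse(k, zeroValue), v))
    }
  }
  // Across partitions, partial results are merged with func alone (no zeroValue).
  perPartition.flatten.foldLeft(Map.empty[K, V]) { case (acc, (k, v)) =>
    acc.updated(k, acc.get(k).map(func(_, v)).getOrElse(v))
  }
}

// Key "A" appears in both partitions, key "B" only in the second.
val foldParts = Seq(
  Seq(("A", 1), ("A", 2)),
  Seq(("A", 3), ("B", 4))
)

val foldedWithZero0: Map[String, Int]  = foldByKeyModel(foldParts)(0)(_ + _)
val foldedWithZero10: Map[String, Int] = foldByKeyModel(foldParts)(10)(_ + _)
```

With a zeroValue of 10, key "A" picks up the extra 10 twice (once per partition) and "B" once, which is why a neutral element such as 0 for addition is the usual choice.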

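The combineByKey operator on PairRDD, illustrated

Summary: 1. combineByKey. "Combine" means to merge. Purpose: transforms RDD[(K,V)] => RDD[(K,C)], meaning the value type V can be converted to C, and the two types may differ. def combineByKey[C](createCombiner: V => C, mergeValue: (C, V) => C, merge Read more

posted @ 2016-07-30 22:00 XGogo Views (1155) Comments (0) Recommended (0)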
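
To make the V to C conversion concrete, here is a plain-Scala sketch that computes a per-key average with C = (sum, count). It is a model of the semantics, not Spark's implementation: `combineByKeyModel` is a hypothetical helper, each inner `Seq` stands in for a partition, and the third function is named `mergeCombiners` after the corresponding parameter in Spark's API.

```scala
// Plain-Scala model of combineByKey (hypothetical helper, not Spark code).
def combineByKeyModel[K, V, C](partitions: Seq[Seq[(K, V)]])(
    createCombiner: V => C,
    mergeValue: (C, V) => C,
    mergeCombiners: (C, C) => C): Map[K, C] = {
  // Inside a partition: the first value of a key creates a combiner,
  // later values of that key are merged into it.
  val perPartition: Seq[Map[K, C]] = partitions.map { part =>
    part.foldLeft(Map.empty[K, C]) { case (acc, (k, v)) =>
      val next = acc.get(k) match {
        case Some(c) => mergeValue(c, v)
        case None    => createCombiner(v)
      }
      acc.updated(k, next)
    }
  }
  // Across partitions: combiners of the same key are merged.
  perPartition.flatten.foldLeft(Map.empty[K, C]) { case (acc, (k, c)) =>
    acc.updated(k, acc.get(k).map(mergeCombiners(_, c)).getOrElse(c))
  }
}

// Made-up scores: V is Int, C is (sum, count).
val scoreParts = Seq(
  Seq(("alice", 90), ("bob", 70), ("alice", 80)),
  Seq(("bob", 90), ("alice", 100))
)

val sumCounts: Map[String, (Int, Int)] =
  combineByKeyModel[String, Int, (Int, Int)](scoreParts)(
    v => (v, 1),                                // createCombiner
    (c, v) => (c._1 + v, c._2 + 1),             // mergeValue
    (c1, c2) => (c1._1 + c2._1, c1._2 + c2._2)  // mergeCombiners
  )

val averages: Map[String, Double] =
  sumCounts.map { case (k, (sum, n)) => k -> (sum.toDouble / n) }
```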

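The aggregateByKey operator on PairRDD, illustrated

Summary: PairRDD has several tricky operators that I often understand and then forget again, so I am recording them here in my own words for later reference. 1. aggregateByKey. "Aggregate" means to gather together; intuitively, it aggregates by key. Transformation: RDD[(K,V)] ==> RDD[(K,U)]. As you can see, the type of the return value does not need to match that of the original RDD's Read more

posted @ 2016-07-30 21:08 XGogo Views (693) Comments (0) Recommended (0)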
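
The RDD[(K,V)] ==> RDD[(K,U)] conversion can be sketched in plain Scala by collecting the distinct Int values of each key into a Set[Int]. This is only a model under stated assumptions: `aggregateByKeyModel` is a hypothetical helper, each inner `Seq` stands in for a partition, and the names `seqOp` and `combOp` follow Spark's parameter names for the within-partition and cross-partition functions.

```scala
// Plain-Scala model of aggregateByKey (hypothetical helper, not Spark code).
def aggregateByKeyModel[K, V, U](partitions: Seq[Seq[(K, V)]])(zeroValue: U)(
    seqOp: (U, V) => U,
    combOp: (U, U) => U): Map[K, U] = {
  // Inside a partition: fold each value into the key's accumulator,
  // starting from zeroValue.
  val perPartition: Seq[Map[K, U]] = partitions.map { part =>
    part.foldLeft(Map.empty[K, U]) { case (acc, (k, v)) =>
      acc.updated(k, seqOp(acc.getOrElse(k, zeroValue), v))
    }
  }
  // Across partitions: merge accumulators of the same key.
  perPartition.flatten.foldLeft(Map.empty[K, U]) { case (acc, (k, u)) =>
    acc.updated(k, acc.get(k).map(combOp(_, u)).getOrElse(u))
  }
}

// Made-up data: V is Int, U is Set[Int].
val aggParts = Seq(
  Seq(("A", 1), ("A", 1), ("B", 2)),
  Seq(("A", 3), ("B", 2))
)

val distinctValues: Map[String, Set[Int]] =
  aggregateByKeyModel[String, Int, Set[Int]](aggParts)(Set.empty[Int])(
    (set, v) => set + v, // seqOp: add a value to the partition-local set
    (s1, s2) => s1 ++ s2 // combOp: union the sets from different partitions
  )
```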
