Summary of Commonly Used Spark Operators (4): reduceByKey
reduceByKey groups the pairs by key and aggregates the values of each key with the given func function, producing one (key, value) pair per key.
val arr = List(("A",3),("A",2),("B",1),("B",3)) val rdd = sc.parallelize(arr) val reduceByKeyRDD = rdd.reduceByKey(_ +_) reduceByKeyRDD.foreach(println) sc.stop # (A,5) # (A,4)