Winter Vacation Study Log 01
Today I watched a video on installing Spark and worked through some Spark basics.
RDD basics (map, flatMap, reduceByKey):
# Imports and a local SparkContext so the snippets below actually run
# (the master "local[*]" and app name are assumed defaults)
from pyspark import SparkConf, SparkContext
conf = SparkConf().setMaster("local[*]").setAppName("rdd_basics")
sc = SparkContext(conf=conf)

# map: apply a function to every element
rdd = sc.parallelize([1, 2, 3, 4, 5])
def func(data):
    return data * 10
rdds = rdd.map(func)
print(rdds.collect())

# flatMap: split each element, then flatten the nested results
rdd = sc.parallelize(["dwad wad wdas", "dwadw dfgawdfw dwad", "dwadwad"])
rdds = rdd.flatMap(lambda x: x.split(" "))
print(rdds.collect())

# reduceByKey: group by key, then reduce the values pairwise
rdd = sc.parallelize([('男', 99), ('女', 99), ('女', 99), ('男', 99), ('男', 99), ('男', 99)])
rdds = rdd.reduceByKey(lambda a, b: a + b)
print(rdds.collect())
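As an illustrative follow-up (not part of the original notes), the three operators above can be chained into the classic word-count pattern. This is a minimal sketch assuming the SparkContext `sc` created in the snippet above; the sample sentences and the variable names are made-up placeholders.

# Hedged sketch: chaining flatMap, map and reduceByKey into a word count.
# Assumes the SparkContext `sc` from the snippet above; the input strings are invented.
lines = sc.parallelize(["spark rdd map", "spark rdd flatMap", "spark reduceByKey"])

word_counts = (lines
               .flatMap(lambda line: line.split(" "))   # flatten each line into individual words
               .map(lambda word: (word, 1))             # pair every word with an initial count of 1
               .reduceByKey(lambda a, b: a + b))        # sum the counts for identical words

print(word_counts.collect())  # e.g. [('spark', 3), ('rdd', 2), ...] (order not guaranteed)

reduceByKey only combines values that share the same key, which is why the map step first wraps each word in a (word, 1) pair.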