Wide Dependencies and Narrow Dependencies

Posted on 2018-08-01 14:26  打杂滴

Narrow dependency (Narrow Dependency)

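Each partition of the parent RDD is used by at most one partition of the child RDD, so partitions of the parent and child line up one-to-one. Examples: map, filter.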
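To make the one-to-one relationship concrete, here is a minimal sketch in plain Python. It is a toy model, not Spark code: an "RDD" is assumed to be just a list of partitions, and the helper names `narrow_map` and `narrow_filter` are made up for illustration.

```python
# Toy model (not Spark): an "RDD" is a list of partitions,
# and each partition is a list of records.
parent = [[1, 2, 3], [4, 5], [6, 7, 8, 9]]


def narrow_map(partitions, fn):
    # Child partition i is computed from parent partition i only,
    # so no data has to move between partitions.
    return [[fn(x) for x in part] for part in partitions]


def narrow_filter(partitions, pred):
    # Same shape: each child partition reads exactly one parent partition.
    return [[x for x in part if pred(x)] for part in partitions]


doubled = narrow_map(parent, lambda x: x * 2)
evens = narrow_filter(parent, lambda x: x % 2 == 0)

print(doubled)  # [[2, 4, 6], [8, 10], [12, 14, 16, 18]]
print(evens)    # [[2], [4], [6, 8]]
```

The number of partitions and their boundaries are unchanged, which is why chains of such operations can be pipelined within a single stage.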

Wide dependency (Shuffle Dependency)

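In essence, a wide dependency is a shuffle. Examples: reduceByKey, groupByKey. The data in one partition of the parent RDD is distributed to multiple partitions of the child RDD.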
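The fan-out from one parent partition to several child partitions can be sketched with the same kind of toy model in plain Python. This is not Spark's implementation: the function `group_by_key`, the `key % num_partitions` partitioner, and the sample data are all assumptions chosen to keep the example deterministic.

```python
from collections import defaultdict

# Toy model (not Spark): three parent partitions of (key, value) pairs.
parent = [
    [(1, "a"), (2, "b"), (3, "c")],
    [(1, "d"), (4, "e")],
    [(2, "f"), (3, "g"), (4, "h")],
]


def group_by_key(partitions, num_partitions):
    children = [defaultdict(list) for _ in range(num_partitions)]
    fan_out = []  # for each parent partition: which child partitions it feeds
    for part in partitions:
        targets = set()
        for key, value in part:
            idx = key % num_partitions  # hypothetical hash partitioner
            children[idx][key].append(value)
            targets.add(idx)
        fan_out.append(targets)
    return [dict(c) for c in children], fan_out


children, fan_out = group_by_key(parent, num_partitions=2)

print(children)  # [{2: ['b', 'f'], 4: ['e', 'h']}, {1: ['a', 'd'], 3: ['c', 'g']}]
print(fan_out)   # [{0, 1}, {0, 1}, {0, 1}]
```

Every parent partition here sends records to both child partitions, and every child partition needs records from all three parents. That all-to-all movement is what the shuffle performs.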

Rule of thumb: if a shuffle is involved, it is a wide dependency; otherwise it is a narrow dependency.

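Operations with narrow dependencies: map, filter, union, join (when the parent RDDs are already hash-partitioned), mapPartitions, mapValues
Operations with wide dependencies: groupByKey, reduceByKey, join (when the parent RDDs are not hash-partitioned), partitionBy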
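Why join appears in both lists can be illustrated with one more plain-Python sketch. It is a toy model under stated assumptions, not Spark code: `hash_partition` and `join_copartitioned` are hypothetical helpers, and "hash-partitioned" is taken to mean that both parents were already partitioned by the same function into the same number of partitions.

```python
def hash_partition(records, num_partitions):
    # Stands in for partitionBy: this step is the one that moves data around.
    parts = [[] for _ in range(num_partitions)]
    for key, value in records:
        parts[key % num_partitions].append((key, value))
    return parts


def join_copartitioned(left, right):
    # Both inputs use the same partitioner, so matching keys are guaranteed
    # to sit at the same partition index. Child partition i reads only
    # left[i] and right[i]: a narrow dependency on each parent.
    assert len(left) == len(right)
    joined = []
    for lpart, rpart in zip(left, right):
        out = [(k, (lv, rv)) for k, lv in lpart for rk, rv in rpart if k == rk]
        joined.append(out)
    return joined


users = hash_partition([(1, "ann"), (2, "bob"), (3, "cid")], 2)
orders = hash_partition([(1, "book"), (3, "pen"), (3, "ink")], 2)
joined = join_copartitioned(users, orders)

print(joined)
# [[], [(1, ('ann', 'book')), (3, ('cid', 'pen')), (3, ('cid', 'ink'))]]
```

If the two parents were not partitioned the same way, matching keys could sit in any pair of partitions, and the join would first have to repartition at least one side, which is the shuffle case.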
