spark - tasks is bigger than spark.driver.maxResultSize

Error

ERROR TaskSetManager: Total size of serialized results of 8113 tasks (1131.0 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
ERROR TaskSetManager: Total size of serialized results of 8114 tasks (1131.1 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
ERROR TaskSetManager: Total size of serialized results of 8115 tasks (1131.2 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)
ERROR TaskSetManager: Total size of serialized results of 8116 tasks (1131.3 MB) is bigger than spark.driver.maxResultSize (1024.0 MB)

Cause

  • caused by actions like RDD’s collect() that send big chunk of data to the driver(不一定是因为RDD的问题哦~)

Solution

  • set by SparkConf: conf.set("spark.driver.maxResultSize", "3g")
  • set by spark-defaults.confspark.driver.maxResultSize 3g
  • set when calling spark-submit--conf spark.driver.maxResultSize=3g
posted @ 2016-12-13 12:02  澄轶  阅读(13533)  评论(0编辑  收藏  举报