[错误]Caused by: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 65536 bytes of memory, got 0
今天,在运行Spark SQL代码的时候,遇到了以下错误:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 50.0 failed 4 times, most recent failure: Lost task 3.3 in stage 50.0 (TID 9204, bjhw-feed-shunyi2513.bjhw.baidu.com, executor 20): org.apache.spark.SparkException: Task failed while writing rows at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:165) at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:93) at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:88) at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87) at org.apache.spark.scheduler.Task.run(Task.scala:112) at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345) at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149) at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624) at java.lang.Thread.run(Thread.java:748) Caused by: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 65536 bytes of memory, got 0 at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157) at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98) at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.<init>(UnsafeInMemorySorter.java:128) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:161) at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.create(UnsafeExternalSorter.java:128) at org.apache.spark.sql.execution.UnsafeExternalRowSorter.<init>(UnsafeExternalRowSorter.java:108) at org.apache.spark.sql.execution.UnsafeExternalRowSorter.create(UnsafeExternalRowSorter.java:93) at org.apache.spark.sql.execution.SortExec.createSorter(SortExec.scala:87) at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.init(Unknown Source) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10.apply(WholeStageCodegenExec.scala:611) at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10.apply(WholeStageCodegenExec.scala:608) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:847) at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:847) at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38) at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
Spark SQL关键代码如下:
spark.sql( s""" |select |$baseDay, |T1.type, |T1.app_id, |T1.name, |T1.level, |T1.business_line_id, |T1.status, |T1.quota, |T1.real_flow, |T1.real_show, |T1.article_num, |T1.baoliang_num, |T1.baoliang_finish_num, |T1.begin_time, |T1.end_time, |T2.fawen_num_bjh, |T3.fawen_num_tth, |T4.fenfa_num_bjh, |T5.fenfa_num_tth, |T6.fensi_num_bjh, |T7.fensi_num_tth |from |base_table T1 |left outer join bjh_fawen T2 on T1.app_id=T2.app_id |left outer join tth_fawen T3 on T1.app_id=T3.app_id |left outer join bjh_fenfa T4 on T1.app_id=T4.app_id |left outer join tth_fenfa T5 on T1.app_id=T5.app_id |left outer join bjh_fensi_add T6 on T1.app_id=T6.app_id |left outer join tth_fensi_add T7 on T1.app_id=T7.app_id """.stripMargin).rdd.map(r => { r.mkString("\t") }).coalesce(5).saveAsTextFile(s"xxx")
解决办法:
去掉coalesce。
参考
https://www.e-learn.cn/content/wangluowenzhang/700757