[Error] Caused by: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 65536 bytes of memory, got 0
My Zhihu: DarrenChan陈驰
Today, while running some Spark SQL code, I hit the following error:
Caused by: org.apache.spark.SparkException: Job aborted due to stage failure: Task 3 in stage 50.0 failed 4 times, most recent failure: Lost task 3.3 in stage 50.0 (TID 9204, bjhw-feed-shunyi2513.bjhw.baidu.com, executor 20): org.apache.spark.SparkException: Task failed while writing rows
    at org.apache.spark.internal.io.SparkHadoopWriter$.org$apache$spark$internal$io$SparkHadoopWriter$$executeTask(SparkHadoopWriter.scala:165)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:93)
    at org.apache.spark.internal.io.SparkHadoopWriter$$anonfun$3.apply(SparkHadoopWriter.scala:88)
    at org.apache.spark.scheduler.ResultTask.runTask(ResultTask.scala:87)
    at org.apache.spark.scheduler.Task.run(Task.scala:112)
    at org.apache.spark.executor.Executor$TaskRunner.run(Executor.scala:345)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: org.apache.spark.memory.SparkOutOfMemoryError: Unable to acquire 65536 bytes of memory, got 0
    at org.apache.spark.memory.MemoryConsumer.throwOom(MemoryConsumer.java:157)
    at org.apache.spark.memory.MemoryConsumer.allocateArray(MemoryConsumer.java:98)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeInMemorySorter.<init>(UnsafeInMemorySorter.java:128)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.<init>(UnsafeExternalSorter.java:161)
    at org.apache.spark.util.collection.unsafe.sort.UnsafeExternalSorter.create(UnsafeExternalSorter.java:128)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.<init>(UnsafeExternalRowSorter.java:108)
    at org.apache.spark.sql.execution.UnsafeExternalRowSorter.create(UnsafeExternalRowSorter.java:93)
    at org.apache.spark.sql.execution.SortExec.createSorter(SortExec.scala:87)
    at org.apache.spark.sql.catalyst.expressions.GeneratedClass$GeneratedIteratorForCodegenStage22.init(Unknown Source)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10.apply(WholeStageCodegenExec.scala:611)
    at org.apache.spark.sql.execution.WholeStageCodegenExec$$anonfun$10.apply(WholeStageCodegenExec.scala:608)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:847)
    at org.apache.spark.rdd.RDD$$anonfun$mapPartitionsWithIndex$1$$anonfun$apply$26.apply(RDD.scala:847)
    at org.apache.spark.rdd.MapPartitionsRDD.compute(MapPartitionsRDD.scala:38)
    at org.apache.spark.rdd.RDD.computeOrReadCheckpoint(RDD.scala:324)
The key Spark SQL code is as follows:
spark.sql(
  s"""
     |select
     |$baseDay,
     |T1.type,
     |T1.app_id,
     |T1.name,
     |T1.level,
     |T1.business_line_id,
     |T1.status,
     |T1.quota,
     |T1.real_flow,
     |T1.real_show,
     |T1.article_num,
     |T1.baoliang_num,
     |T1.baoliang_finish_num,
     |T1.begin_time,
     |T1.end_time,
     |T2.fawen_num_bjh,
     |T3.fawen_num_tth,
     |T4.fenfa_num_bjh,
     |T5.fenfa_num_tth,
     |T6.fensi_num_bjh,
     |T7.fensi_num_tth
     |from
     |base_table T1
     |left outer join bjh_fawen T2 on T1.app_id=T2.app_id
     |left outer join tth_fawen T3 on T1.app_id=T3.app_id
     |left outer join bjh_fenfa T4 on T1.app_id=T4.app_id
     |left outer join tth_fenfa T5 on T1.app_id=T5.app_id
     |left outer join bjh_fensi_add T6 on T1.app_id=T6.app_id
     |left outer join tth_fensi_add T7 on T1.app_id=T7.app_id
   """.stripMargin).rdd.map(r => {
  r.mkString("\t")
}).coalesce(5).saveAsTextFile(s"xxx")
Solution:
Remove the coalesce. The reason this works: coalesce(5) merges partitions without a shuffle, so Spark collapses the entire upstream stage, the six joins and the sorts they require, into only 5 tasks. Each task then has to sort far more data, and its UnsafeExternalSorter can no longer acquire execution memory, which produces the error above.
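If you still need a small number of output files, an alternative (my suggestion, not from the original post) is repartition instead of coalesce: repartition inserts a shuffle boundary, so the joins keep their original parallelism and only the final write runs with 5 tasks. A minimal sketch, assuming the same pipeline as above with a hypothetical output path:

import org.apache.spark.sql.SparkSession

// Hypothetical session setup for a self-contained example.
val spark = SparkSession.builder().appName("demo").getOrCreate()

spark.sql("select ...")           // the join query shown above, elided here
  .rdd
  .map(r => r.mkString("\t"))
  .repartition(5)                 // shuffles, so the upstream joins/sorts keep their full parallelism
  .saveAsTextFile("/tmp/output")  // hypothetical output path

On the RDD API, coalesce(5, shuffle = true) is equivalent to repartition(5); either way the shuffle decouples the write parallelism from the join parallelism.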
Reference:
https://www.e-learn.cn/content/wangluowenzhang/700757