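Spark issues
- The driver reports the error below, and the failure is also reported at the collect call in my own code. The top-user query does not fail but the top-file query does; my guess is that this is because there are far more files than users.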
20/08/24 08:37:15 ERROR MicroBatchExecution: Query [id = de341482-5e75-4c34-b924-146a7eb6c9b0, runId = 13007eb2-10eb-4ef0-a799-dc048a7fc0bf] terminated with error org.apache.spark.SparkException: Job aborted due to stage failure: ShuffleMapStage 4 (start at top_n.scala:646) has failed the maximum allowable number of times: 4. Most recent failure reason: org.apache.spark.shuffle.MetadataFetchFailedException: Missing an output location for shuffle 2 at org.apache.spark.MapOutputTracker$.$anonfun$convertMapStatuses$2(MapOutputTracker.scala:1010) a
The executor logs show:
20/08/24 08:30:25 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:30:43 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:30:58 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:17 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:35 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:31:53 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:32:12 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
20/08/24 08:32:25 WARN TaskMemoryManager: Failed to allocate a page (67108864 bytes), try again.
ref:
https://blog.csdn.net/lingbo229/article/details/84943560
Solution:
Increased memory from 16G to 24G, switched to the G1 garbage collector, and turned on GC logging.
"spark.executor.extraJavaOptions": "-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps",
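Put together, the change might look like the fragment below. This is a sketch under assumptions: the notes only say "memory", so `spark.executor.memory` is a guess at which setting was raised, and the enclosing JSON object stands in for whatever job config holds these properties.

```json
{
  "spark.executor.memory": "24g",
  "spark.executor.extraJavaOptions": "-XX:+UseG1GC -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps"
}
```

Note that `-XX:+PrintGCDetails` and `-XX:+PrintGCDateStamps` are JDK 8 style flags; on JDK 9 and later, GC logging is configured through unified logging (`-Xlog:gc*`) instead, so the options string would likely need adjusting there.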