Analysis of a Netty DirectMemory allocation failure in Spark caused by an oversized shuffle read
1. Error log:
23/11/01 22:14:25 INFO [Executor task launch worker for task 79952] MapOutputTrackerWorker: Don't have map outputs for shuffle 7, fetching them
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] TorrentBroadcast: Started reading broadcast variable 174
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] MemoryStore: Block broadcast_174_piece0 stored as bytes in memory (estimated size 3.5 MB, free 8.0 GB)
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] TorrentBroadcast: Reading broadcast variable 174 took 29 ms
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] MemoryStore: Block broadcast_174 stored as values in memory (estimated size 3.5 MB, free 8.0 GB)
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] MapOutputTracker: Broadcast mapstatuses size = 435, actual size = 3636093
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] MapOutputTrackerWorker: Got the output locations
23/11/01 22:14:28 INFO [Executor task launch worker for task 79952] ShuffleBlockFetcherIterator: Getting 9369 non-empty blocks including 138 local blocks and 9231 remote blocks
23/11/01 22:14:28 INFO [Executor task launch worker for task 79972] ShuffleBlockFetcherIterator: Getting 9369 non-empty blocks including 138 local blocks and 9231 remote blocks
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] ShuffleBlockFetcherIterator: Getting 9369 non-empty blocks including 138 local blocks and 9231 remote blocks
23/11/01 22:14:28 INFO [Executor task launch worker for task 79952] ShuffleBlockFetcherIterator: Started 9 remote fetches in 9 ms
23/11/01 22:14:28 INFO [Executor task launch worker for task 79972] ShuffleBlockFetcherIterator: Started 11 remote fetches in 10 ms
23/11/01 22:14:28 INFO [Executor task launch worker for task 79932] ShuffleBlockFetcherIterator: Started 10 remote fetches in 12 ms    // [Analysis point 1]
23/11/01 22:14:30 WARN [shuffle-client-7-1] TransportChannelHandler: Exception in connection from emr-worker-2500.cluster-265451/10.64.156.19:7337
io.netty.util.internal.OutOfDirectMemoryError: failed to allocate 16777216 byte(s) of direct memory (used: 838860800, max: 838860800)    // [Analysis point 2]
    at io.netty.util.internal.PlatformDependent.incrementMemoryCounter(PlatformDependent.java:725)    // [Analysis point 3]
    at io.netty.util.internal.PlatformDependent.allocateDirectNoCleaner(PlatformDependent.java:680)
    at io.netty.buffer.PoolArena$DirectArena.allocateDirect(PoolArena.java:758)
    at io.netty.buffer.PoolArena$DirectArena.newChunk(PoolArena.java:734)
    at io.netty.buffer.PoolArena.allocateNormal(PoolArena.java:245)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:227)
    at io.netty.buffer.PoolArena.allocate(PoolArena.java:147)
    at io.netty.buffer.PooledByteBufAllocator.newDirectBuffer(PooledByteBufAllocator.java:342)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:187)
    at io.netty.buffer.AbstractByteBufAllocator.directBuffer(AbstractByteBufAllocator.java:178)
    at io.netty.buffer.AbstractByteBufAllocator.ioBuffer(AbstractByteBufAllocator.java:139)
    at io.netty.channel.DefaultMaxMessagesRecvByteBufAllocator$MaxMessageHandle.allocate(DefaultMaxMessagesRecvByteBufAllocator.java:114)
    at io.netty.channel.nio.AbstractNioByteChannel$NioByteUnsafe.read(AbstractNioByteChannel.java:147)
    at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:700)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:635)
    at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:552)
    at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:514)
    at io.netty.util.concurrent.SingleThreadEventExecutor$6.run(SingleThreadEventExecutor.java:1044)
    at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
    at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
    at java.lang.Thread.run(Thread.java:748)
2. Error analysis:
(1) From [Analysis point 1] we can see that the job is in a shuffle stage and is fetching data from the upstream map outputs (hence the remote fetches).
(2) [Analysis point 2] shows the Spark process requesting 16777216 bytes (16 MB) of direct memory and failing, because at most 838860800 bytes (800 MB) may be used and all of it is already in use.
(3) Following [Analysis point 3] into the Netty source, we find that if the requested memory plus the memory already in use exceeds DIRECT_MEMORY_LIMIT, an error is thrown -- exactly the one in [Analysis point 2]: failed to allocate 16777216 byte(s) of direct memory (used: 838860800, max: 838860800).
private static void incrementMemoryCounter(int capacity) {
    if (DIRECT_MEMORY_COUNTER != null) {
        for (;;) {
            long usedMemory = DIRECT_MEMORY_COUNTER.get();
            long newUsedMemory = usedMemory + capacity;
            if (newUsedMemory > DIRECT_MEMORY_LIMIT) {
                throw new OutOfDirectMemoryError("failed to allocate " + capacity
                        + " byte(s) of direct memory (used: " + usedMemory
                        + ", max: " + DIRECT_MEMORY_LIMIT + ')');
            }
            if (DIRECT_MEMORY_COUNTER.compareAndSet(usedMemory, newUsedMemory)) {
                break;
            }
        }
    }
}
So where does DIRECT_MEMORY_LIMIT come from? Tracing the source shows it is derived from maxDirectMemory. From the comment in the snippet below we can see that if the io.netty.maxDirectMemory property is not set (the default of -1), Netty inherits the max direct memory from the JVM, where the "practical max direct memory" is twice the JDK-defined maximum. Our executors are configured with 20 GB of memory, so an 800 MB limit cannot come from that default path.
// Here is how the system property is used:
//
//  * <  0  - Don't use cleaner, and inherit max direct memory from java. In this case the
//            "practical max direct memory" would be 2 * max memory as defined by the JDK.
//  * == 0  - Use cleaner, Netty will not enforce max memory, and instead will defer to JDK.
//  * >  0  - Don't use cleaner. This will limit Netty's total direct memory
//            (note: that JDK's direct memory limit is independent of this).
long maxDirectMemory = SystemPropertyUtil.getLong("io.netty.maxDirectMemory", -1);
DIRECT_MEMORY_LIMIT = maxDirectMemory;
Finally, we located this setting in our environment: io.netty.maxDirectMemory had been explicitly capped at 838860800 bytes (800 MB), which confirmed the diagnosis.
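For reference, a JVM system property like this normally reaches the executors through spark.executor.extraJavaOptions (or through the external shuffle service's own JVM options); where exactly it is set is specific to each environment. Below is a minimal Scala sketch of passing a larger cap from the application side and checking the property; the 1600 MB figure is only an illustration, not a recommendation:

import org.apache.spark.sql.SparkSession

object DirectMemoryCheck {
  def main(args: Array[String]): Unit = {
    // Ask Spark to launch executors with a larger Netty direct-memory cap.
    // 1677721600 bytes = 1600 MB -- an illustrative value, size it to the host.
    val spark = SparkSession.builder()
      .appName("direct-memory-check")
      .config("spark.executor.extraJavaOptions", "-Dio.netty.maxDirectMemory=1677721600")
      .getOrCreate()

    // On the driver JVM, print what Netty would read; null means the property is unset
    // and Netty falls back to the default (-1) path described in the comment above.
    println(s"io.netty.maxDirectMemory = ${System.getProperty("io.netty.maxDirectMemory")}")

    spark.stop()
  }
}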
3. Background: why does Netty need to allocate memory during shuffle read?
The description below is borrowed from existing write-ups on the topic.
A shuffle consists of two phases: shuffle write and shuffle read.
Data transfer between the two phases is handled by the Netty framework, which is why the error in the log surfaces as a Netty direct-memory problem.
The number of shuffle write partitions is determined by the partition count of the previous stage's RDD, while the number of shuffle read partitions is controlled by Spark configuration parameters.
Shuffle write can be loosely thought of as a "save to local disk" operation: the intermediate results of a stage are written, according to the partitioning rules, to the local disks of the executors.
If the shuffle read partition count is set too small while the shuffle read volume is large, each reduce task has to pull a huge amount of remote data at once. Netty's own receive buffers are not enough, so it keeps requesting additional direct memory, 16 MB at a time, until it reaches the configured ceiling and the allocation fails.
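To make the "small partition count + large shuffle read" point concrete, here is a back-of-envelope sketch; the 2 TB total is a made-up figure, and it assumes data is spread evenly across reduce partitions, which real jobs rarely are:

object ShuffleReadEstimate {
  def main(args: Array[String]): Unit = {
    val shuffleWriteBytes = 2L * 1024 * 1024 * 1024 * 1024 // pretend the map side wrote 2 TB (made-up figure)
    val reducePartitions  = 200L                           // spark.sql.shuffle.partitions default

    val bytesPerReduceTask = shuffleWriteBytes / reducePartitions
    println(f"~${bytesPerReduceTask / 1024.0 / 1024 / 1024}%.1f GB pulled by each reduce task")
    // With only 200 reduce partitions, every task pulls ~10 GB from thousands of map outputs,
    // so Netty's receive path keeps grabbing 16 MB direct buffers until it hits the configured cap.
  }
}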
4. Solutions:
Once the cause is known, the problem is easy to fix. The standard advice from the community applies here, attacking the issue from two angles: the amount of shuffled data and the number of partitions that process it.
(1). Reduce the amount of shuffled data
Consider whether a map-side join or broadcast join can avoid the shuffle altogether (a sketch follows this item).
Filter out unnecessary data before the shuffle. For example, if the raw data has 20 columns but only a few are needed, select just those columns; this reduces the shuffle volume accordingly.
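A minimal DataFrame sketch of both ideas; the paths, table sizes and column names are placeholders:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions.broadcast

object ShuffleReductionSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("shuffle-reduction-sketch").getOrCreate()

    // Prune columns before the shuffle: keep only what the join/aggregation actually needs.
    val facts = spark.read.parquet("/path/to/facts").select("user_id", "amount")

    // A dimension table small enough to fit in every executor's memory.
    val smallDim = spark.read.parquet("/path/to/dim")

    // broadcast() hints Spark to ship the small table to all executors,
    // turning a shuffle join into a map-side (broadcast hash) join.
    val joined = facts.join(broadcast(smallDim), Seq("user_id"))
    joined.write.parquet("/path/to/output")
  }
}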
(2). Spark SQL and DataFrame operations such as join and groupBy
The partition count is controlled by spark.sql.shuffle.partitions (default 200); raise it according to the shuffle volume and the complexity of the computation, as shown below.
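For example, in spark-shell or at session creation (2000 is only an illustrative value; choose it from the shuffle volume):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder().appName("more-sql-partitions").getOrCreate()
// Raise the SQL/DataFrame shuffle partition count from the default 200 (illustrative value).
spark.conf.set("spark.sql.shuffle.partitions", "2000")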
(3). RDD operations such as join, groupBy, and reduceByKey
The number of partitions used for shuffle read and reduce processing is controlled by spark.default.parallelism. It defaults to the total number of cores running the job (8 in Mesos fine-grained mode, the number of local cores in local mode); the official recommendation is to set it to 2-3x the total core count. A sketch follows below.
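A sketch for an RDD job, with placeholder paths and numbers (1200 would correspond to roughly 2-3x of a 400-600 core job):

import org.apache.spark.{SparkConf, SparkContext}

object RddParallelismSketch {
  def main(args: Array[String]): Unit = {
    // Raise the default parallelism used by RDD shuffles (placeholder value).
    val conf = new SparkConf()
      .setAppName("rdd-parallelism-sketch")
      .set("spark.default.parallelism", "1200")
    val sc = new SparkContext(conf)

    val pairs = sc.textFile("/path/to/input").map(line => (line.split(",")(0), 1L))

    // The reduce-side partition count can also be passed explicitly per operation.
    val counts = pairs.reduceByKey(_ + _, numPartitions = 1200)
    counts.saveAsTextFile("/path/to/output")
    sc.stop()
  }
}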
(4). Increase executor memory
Raise the executor memory appropriately via spark.executor.memory, for example as below.
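For example (placeholder sizes; they must fit within the cluster's container limits, and can equally be passed with spark-submit --conf):

import org.apache.spark.sql.SparkSession

val spark = SparkSession.builder()
  .appName("bigger-executors")
  .config("spark.executor.memory", "24g")
  .config("spark.executor.memoryOverhead", "4g") // extra container headroom for off-heap usage such as direct buffers
  .getOrCreate()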
(5). Check for data skew
Have null keys been filtered out? Can abnormal data (a key with a disproportionately large amount of data) be handled separately? Consider changing the data partitioning rules; see the sketch below.
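A sketch of the usual first checks, with hypothetical column names and paths: drop null keys before the shuffle, find out whether a few keys dominate, and if so spread a hot key over several partitions by salting it:

import org.apache.spark.sql.SparkSession
import org.apache.spark.sql.functions._

object SkewCheckSketch {
  def main(args: Array[String]): Unit = {
    val spark = SparkSession.builder().appName("skew-check-sketch").getOrCreate()
    val df = spark.read.parquet("/path/to/facts") // hypothetical input

    // 1. Null keys all land in the same reducer, so filter them out before the shuffle.
    val cleaned = df.filter(col("user_id").isNotNull)

    // 2. Inspect the heaviest keys; if a handful hold most of the rows they may need separate handling.
    cleaned.groupBy("user_id").count().orderBy(desc("count")).show(20)

    // 3. Salting: append a random suffix so a single hot key is spread over ~10 partitions.
    val salted = cleaned.withColumn(
      "salted_key",
      concat(col("user_id"), lit("_"), (rand() * 10).cast("int").cast("string"))
    )
    salted.groupBy("salted_key").count().show(5)
  }
}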