Previously, in [Erlang 0126] (Erlang papers we have read), we mentioned the following paper:

On Preserving Term Sharing in the Erlang Virtual Machine
Link: http://user.it.uu.se/~kostis/Papers/erlang12_sharing.pdf

Abstract: In this paper we describe our experiences and argue through examples why flattening terms during copying is not a good idea for a language like Erlang. More importantly, we propose a sharing preserving copying mechanism for Erlang/OTP and describe a publicly available complete implementation of this mechanism.
Term sharing is nothing new: the "Efficiency Guide User's Guide" already covers it, in section 4.2 "Constructing binaries" [link] and section 8.2 "Loss of sharing" [link] (another gripe about how the Erlang documentation is organized: a single topic is often scattered across several documents, so be patient). And the issue is not limited to the binary type; term sharing is a general concern. The Guide names the scenarios that force a copy: sending a term to another process, or inserting it into an ETS table. Quoting the documentation:
Loss of sharing
Shared sub-terms are not preserved when a term is sent to another process, passed as the initial process arguments in the spawn call, or stored in an ETS table. That is an optimization. Most applications do not send messages with shared sub-terms.
When copying a term, Erlang traverses it twice. The first pass computes the flat size (the size_object function in erts/emulator/beam/copy.c) and allocates memory accordingly; the second pass performs the actual copy (the copy_struct function in erts/emulator/beam/copy.c).
First, a small piece of code to demonstrate erts_debug:size/1 and erts_debug:flat_size/1, which we will use repeatedly below:
```erlang
s3(L) ->
    L2 = [L, L, L, L],
    {{erts_debug:size(L), erts_debug:flat_size(L)},
     {erts_debug:size(L2), erts_debug:flat_size(L2)}}.

9> d:s3([1,2,3,4,5,6]).
{{12,12},{20,56}}
```
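The numbers above can be worked out by hand. On a 64-bit system each cons cell occupies 2 words, and small integers are immediates that take no extra heap. A sketch of the arithmetic (word counts, under those assumptions):

```erlang
%% Word-count arithmetic behind {{12,12},{20,56}}.
%% Assumption: 2 words per cons cell; small integers are immediate (0 heap words).
ConsWords = 2,
LWords = 6 * ConsWords,        %% L = [1,2,3,4,5,6]: 6 cells -> 12 words
Spine  = 4 * ConsWords,        %% L2 = [L,L,L,L]: 4 cells -> 8 words
Shared = Spine + LWords,       %% sharing kept:  8 + 12   = 20
Flat   = Spine + 4 * LWords.   %% flattened:     8 + 4*12 = 56
```

L itself contains no sharing, so its size and flat_size agree at 12; for L2 they diverge exactly as the shell shows.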
The following shell session demonstrates spawn, message sending, and insertion into ETS in turn:
```erlang
Eshell V6.0 (abort with ^G)
1> L = [1,2,3,4,5,6,7,8,9,10].
[1,2,3,4,5,6,7,8,9,10]
2> L2 = [L,L,L,L,L,L].
[[1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10]]
3> erts_debug:size(L2).
32
4> erts_debug:flat_size(L2).
132
5> spawn(fun() -> receive Data -> io:format("~p", [erts_debug:size(Data)]) end end).
<0.39.0>
6> v(5) ! L2.
132[[1,2,3,4,5,6,7,8,9,10],
    [1,2,3,4,5,6,7,8,9,10],
    [1,2,3,4,5,6,7,8,9,10],
    [1,2,3,4,5,6,7,8,9,10],
    [1,2,3,4,5,6,7,8,9,10],
    [1,2,3,4,5,6,7,8,9,10]]
7> erts_debug:size(L2).
32
8> ets:new(test, [named_table]).
test
9> ets:insert(test, {1, L2}).
true
10> ets:lookup(test, 1).
[{1,
  [[1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10]]}]
11> [{1,Data}] = v(10).
[{1,
  [[1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10],
   [1,2,3,4,5,6,7,8,9,10]]}]
12> Data.
[[1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10],
 [1,2,3,4,5,6,7,8,9,10]]
13> erts_debug:size(Data).
132
14> spawn(d, test, [L2]).
132<0.54.0>

%% d:test/1 used in step 14:
test(Data) -> io:format("~p", [erts_debug:size(Data)]).
```
Beyond the cases above, some less obvious operations can also cause flattening, such as this example constructed in the paper mentioned earlier:
```erlang
show_printing_may_be_bad() ->
    F = fun(N) ->
            T = now(),
            L = mklist(N),
            S = erts_debug:size(L),
            io:format("mklist(~w), size ~w, ", [N, S]),
            io:format("is ~P, ", [L, 2]),   %%% BAD !!!
            D = timer:now_diff(now(), T),
            io:format("in ~.3f sec.~n", [D/1000000])
        end,
    lists:foreach(F, [10, 20, 22, 24, 26, 28, 30]).

mklist(0) -> 0;
mklist(M) -> X = mklist(M-1), [X, X].
```
Running the code with and without the line io:format("is ~P, ", [L, 2]), %%% BAD !!! gives the following results on my machine:
```erlang
Eshell V6.0 (abort with ^G)
1> d:show_printing_may_be_bad().
mklist(10), size 40, in 0.001 sec.
mklist(20), size 80, in 0.000 sec.
mklist(22), size 88, in 0.000 sec.
mklist(24), size 96, in 0.000 sec.
mklist(26), size 104, in 0.000 sec.
mklist(28), size 112, in 0.000 sec.
mklist(30), size 120, in 0.000 sec.
ok

Eshell V6.0 (abort with ^G)
1> d:show_printing_may_be_bad().
mklist(10), size 40, is [[...]|...], in 0.001 sec.
mklist(20), size 80, is [[...]|...], in 0.110 sec.
mklist(22), size 88, is [[...]|...], in 0.421 sec.
mklist(24), size 96, is [[...]|...], in 43.105 sec.
mklist(26), size 104,
Crash dump was written to: erl_crash.dump
eheap_alloc: Cannot allocate 3280272216 bytes of memory (of type "heap").
rlwrap: warning: erl killed by SIGABRT.
rlwrap has not crashed, but for transparency, it will now kill itself (without dumping core) with the same signal
```
Clearly, the version with that line not only takes far longer to run, it also needs an enormous amount of memory.
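The blow-up is easy to quantify. Each level of mklist adds two cons cells (4 words) and shares everything below, so the shared size is 4*M words; a flattening copy duplicates both branches, giving the recurrence Flat(M) = 4 + 2*Flat(M-1), i.e. 4*(2^M - 1) words. A sketch of that arithmetic (my own back-of-the-envelope calculation, not from the paper):

```erlang
%% Shared vs flattened size of mklist(M), in words (64-bit: 1 word = 8 bytes).
SharedWords = fun(M) -> 4 * M end,                %% matches "size 120" for M = 30
FlatWords   = fun(M) -> 4 * ((1 bsl M) - 1) end,
SharedWords(30),   %% 120 words, as printed in the shell output above
FlatWords(26).     %% 268435452 words (~2 GB), same order as the eheap_alloc crash
```

So at M = 26 the flattened copy already needs gigabytes, and each further step doubles it, which is exactly where the run crashed.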
Why does this happen? It is exactly the "Loss of sharing" described above. But why does printing trigger the flattening? We have discussed io:format before ([Erlang 0041] io:format in detail [link]): in Erlang/OTP, I/O is implemented by sending I/O requests to an I/O server. A call to io:format actually sends an io request message to the I/O server, which handles the rest. So although L is printed in the abbreviated form "[[...]|...]", passing it inside the message has already triggered the flattening and copying.
This kind of problem is very easy to overlook, so it is well worth using macro options to strip out all io:format debug output when building a release.
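One way to do this, as a minimal sketch (the DBG macro and debug.hrl names are my own, not from the post): wrap debug printing in a macro that expands to ok unless DEBUG is defined, so release builds never send the term to the I/O server at all.

```erlang
%% In a shared header, e.g. debug.hrl (hypothetical name):
-ifdef(DEBUG).
-define(DBG(Fmt, Args), io:format(Fmt, Args)).
-else.
-define(DBG(Fmt, Args), ok).
-endif.

%% Usage: ?DBG("is ~P, ", [L, 2]) only performs I/O when compiled with
%%   erlc -DDEBUG mymodule.erl
%% In a release build (no -DDEBUG) the call compiles to the atom ok,
%% so the shared term is never copied into an io request message.
```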