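hibernate is mainly used when a process is idle: it discards the process's stack and garbage-collects its heap, which reclaims memory and saves resources.

hibernate works for both OTP processes and plain processes. The official documentation for erlang:hibernate/3 reads: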

Puts the calling process into a wait state where its memory allocation has been reduced as much as possible, which is useful if the process does not expect to receive any messages in the near future.

The process will be awaken when a message is sent to it, and control will resume in Module:Function with the arguments given by Args with the call stack emptied, meaning that the process will terminate when that function returns. Thus erlang:hibernate/3 will never return to its caller.

If the process has any message in its message queue, the process will be awaken immediately in the same way as described above.

In more technical terms, what erlang:hibernate/3 does is the following. It discards the call stack for the process. Then it garbage collects the process. After the garbage collection, all live data is in one continuous heap. The heap is then shrunken to the exact same size as the live data which it holds (even if that size is less than the minimum heap size for the process).

If the size of the live data in the process is less than the minimum heap size, the first garbage collection occurring after the process has been awaken will ensure that the heap size is changed to a size not smaller than the minimum heap size.

Note that emptying the call stack means that any surrounding catch is removed and has to be re-inserted after hibernation. One effect of this is that processes started using proc_lib (also indirectly, such as gen_server processes), should use proc_lib:hibernate/3 instead to ensure that the exception handler continues to work when the process wakes up.

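1, It is mainly intended for a process that expects to be idle for the near future.

2, hibernate empties the process's stack and then garbage-collects the process.

From the point of view of the Erlang process PCB (process control block), the heap mainly holds compound data (such as tuples, lists, and big integers), while the stack mainly holds simple values and references to the compound data on the heap. Because hibernate empties the stack before running the GC, nothing on the heap is kept alive by a stack reference, so as much of the heap as possible is reclaimed. This makes it more thorough than an ordinary GC.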
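To make the effect concrete, here is a minimal sketch for a plain (non-OTP) process. The module name hib_demo and its functions are hypothetical and exist only for this illustration. The worker builds and drops a large list, reports its total heap size, and hibernates; the parent then polls process_info/2 until the reported size has dropped.

```erlang
-module(hib_demo).
-export([run/0, wake/0]).

%% Returns {Before, After, Reply}: the worker's total heap size (in words)
%% before and after hibernating, and whether it woke up on a message.
run() ->
    Parent = self(),
    Pid = spawn(fun() -> worker(Parent) end),
    Before = receive {before, Pid, Words} -> Words end,
    After = wait_shrunk(Pid, Before, 100),
    Pid ! {stop, self()},
    Reply = receive {woke, Pid} -> woke after 1000 -> timeout end,
    {Before, After, Reply}.

worker(Parent) ->
    %% Build a large list and drop it, so the heap grows and fills with garbage.
    _ = lists:sum(lists:seq(1, 100000)),
    {total_heap_size, Words} = erlang:process_info(self(), total_heap_size),
    Parent ! {before, self(), Words},
    %% Never returns: the stack is discarded and execution resumes in wake/0.
    erlang:hibernate(?MODULE, wake, []).

%% Entry point after wake-up; the process ends when this function returns.
wake() ->
    receive
        {stop, From} -> From ! {woke, self()}
    end.

%% Poll until the worker's heap is smaller than Before, or give up.
wait_shrunk(Pid, Before, Tries) ->
    {total_heap_size, Now} = erlang:process_info(Pid, total_heap_size),
    if
        Now < Before -> Now;
        Tries =:= 0 -> Now;
        true -> timer:sleep(10), wait_shrunk(Pid, Before, Tries - 1)
    end.
```

Calling hib_demo:run() should return a tuple whose second element is much smaller than the first. The exact sizes depend on the VM version and settings.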

gen_server hibernate

The gen_server module supports hibernate mainly through callback return values (for example {reply, R, S, hibernate}) and through the enter_loop function.

%% A timeout value of 'hibernate' puts the process into hibernation.
loop(Parent, Name, State, Mod, hibernate, Debug) ->
    proc_lib:hibernate(?MODULE,wake_hib,[Parent, Name, State, Mod, Debug]);

%% Entry point after wake-up: take the message that woke the process.
wake_hib(Parent, Name, State, Mod, Debug) ->
    Msg = receive
              Input ->
                  Input
          end,
    decode_msg(Msg, Parent, Name, State, Mod, hibernate, Debug, true).

When the Timeout argument of loop/6 is 'hibernate', the gen_server process calls proc_lib:hibernate/3 and goes into hibernation. When a message is sent to the process, it is resumed in wake_hib/5 and returns to the normal decode_msg flow.

enter_loop, in turn, makes an already existing process into a gen_server process:
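From the callback module's side, requesting hibernation only takes putting hibernate in the timeout position of the return tuple. The following is a minimal sketch; the module name hib_server and the functions read/1 and store/2 are hypothetical names chosen for this example.

```erlang
-module(hib_server).
-behaviour(gen_server).

-export([start_link/0, read/1, store/2]).
-export([init/1, handle_call/3, handle_cast/2]).

start_link() -> gen_server:start_link(?MODULE, [], []).

read(Pid) -> gen_server:call(Pid, read).

store(Pid, Value) -> gen_server:cast(Pid, {store, Value}).

%% Hibernate right after initialisation, until the first request arrives.
init([]) ->
    {ok, undefined, hibernate}.

%% 'hibernate' takes the place of the timeout value in the return tuple.
handle_call(read, _From, State) ->
    {reply, State, State, hibernate}.

handle_cast({store, Value}, _State) ->
    {noreply, Value, hibernate}.
```

The server hibernates after every request, so each call or cast pays for a wake-up. That trade-off only makes sense for processes that are idle most of the time.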

The calling process will enter the gen_server receive loop and become a gen_server process.

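How hibernate is used in practice

Hibernating empties the process's stack and runs a GC, and when the process is resumed its heap has to be allocated again. hibernate therefore saves memory for the VM at the cost of extra CPU work. So how does the community use this feature?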

Mochiweb

The well-known article A Million-user Comet Application with Mochiweb mentions hibernate as one of its optimizations.

This sounds reasonable - let's try hibernating after every message and see what happens.

And the effect was immediate:

Judicious use of hibernate means the mochiweb application memory levels out at 78MB Resident with 10k connections, much better than the 450MB we saw in Part 1. There was no significant increase in CPU usage.

So Mochiweb's approach is to hibernate immediately after each received message.
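The "hibernate after every message" pattern can be sketched for a plain receive loop as follows. This is not Mochiweb's code; the module name hib_loop and the message shapes are assumptions made for this illustration.

```erlang
-module(hib_loop).
-export([start/0, loop/1]).

start() ->
    spawn(?MODULE, loop, [0]).

%% Handle one message, then hibernate. The next message wakes the process
%% in loop/1 again, with an empty stack and a freshly compacted heap.
loop(Count) ->
    receive
        {ping, From} ->
            From ! {pong, Count + 1},
            erlang:hibernate(?MODULE, loop, [Count + 1]);
        stop ->
            ok
    end.
```

Because erlang:hibernate/3 resumes in an exported function, loop/1 has to be exported, and all state has to travel in the argument list.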

Ejabberd

Ejabberd is more cautious: it hibernates a process only after the process has received no messages for some time.

The ejabberd_receiver module receives socket messages in Ejabberd. It is a gen_server process that sits between the socket and the C2S gen_fsm process. ejabberd_receiver sets a timeout; when the timeout fires (that is, when the process receives timeout), the process calls proc_lib:hibernate/3. When a new message is later sent to the process, hibernate resumes it in gen_server:enter_loop/3, which wakes the ejabberd_receiver process and puts it back into the gen_server loop.

In other words, Ejabberd hibernates only processes that have been idle for a while:

handle_info(timeout, State) ->
    proc_lib:hibernate(gen_server, enter_loop,
               [?MODULE, [], State]),
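The same "hibernate only when idle" idea can also be sketched with nothing but standard gen_server return values, instead of calling proc_lib:hibernate/3 directly as Ejabberd does. The module name idle_hib, the bump/1 function, and the state layout are hypothetical and chosen for this example.

```erlang
-module(idle_hib).
-behaviour(gen_server).

-export([start_link/1, bump/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2]).

%% IdleMs: how long the server may sit idle before it hibernates.
start_link(IdleMs) -> gen_server:start_link(?MODULE, IdleMs, []).

bump(Pid) -> gen_server:call(Pid, bump).

%% State is {Counter, IdleMs}; every return re-arms the idle timeout.
init(IdleMs) ->
    {ok, {0, IdleMs}, IdleMs}.

handle_call(bump, _From, {N, IdleMs}) ->
    {reply, N + 1, {N + 1, IdleMs}, IdleMs}.

handle_cast(_Msg, {_, IdleMs} = State) ->
    {noreply, State, IdleMs}.

%% No message arrived within IdleMs: hibernate until the next one.
handle_info(timeout, State) ->
    {noreply, State, hibernate};
handle_info(_Other, {_, IdleMs} = State) ->
    {noreply, State, IdleMs}.
```

A busy server never hits the timeout and so never pays the hibernation cost; only a server that has been quiet for IdleMs milliseconds gives its memory back.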

RabbitMQ

RabbitMQ also uses hibernate, in its gen_server2 code, in a way similar to Ejabberd (a process hibernates only after it has been idle for a while).

The init function of a module that uses the gen_server2 behaviour:

1 init(Q) ->
2     process_flag(trap_exit, true),
3     ?store_proc_name(Q#amqqueue.name),
4     {ok, init_state(Q#amqqueue{pid = self()}), hibernate,
5      {backoff, ?HIBERNATE_AFTER_MIN, ?HIBERNATE_AFTER_MIN, ?DESIRED_HIBERNATE},
6     ?MODULE}.

Line 5 defines the hibernate-related parameters.

In the gen_server2 module, the code that handles the return value of the user module's init:

        {ok, State, Timeout, Backoff = {backoff, _, _, _}, Mod1} ->
            Backoff1 = extend_backoff(Backoff),
            proc_lib:init_ack(Starter, {ok, self()}),
            loop(GS2State #gs2_state { mod           = Mod1,
                                       state         = State,
                                       time          = Timeout,
                                       timeout_state = Backoff1 });

This passes the backoff hibernate parameters, together with the gs2_state, into the main loop. Next, gen_server2's loop/1 function enters process_next_msg/1:

 1 process_next_msg(GS2State = #gs2_state { time          = Time,
 2                                          timeout_state = TimeoutState,
 3                                          queue         = Queue }) ->
 4     case priority_queue:out(Queue) of
 5         {{value, Msg}, Queue1} ->
 6             process_msg(Msg, GS2State #gs2_state { queue = Queue1 });
 7         {empty, Queue1} ->
 8             {Time1, HibOnTimeout}
 9                 = case {Time, TimeoutState} of
10                       {hibernate, {backoff, Current, _Min, _Desired, _RSt}} ->
11                           {Current, true};
12                       {hibernate, _} ->
13                           %% wake_hib/7 will set Time to hibernate. If
14                           %% we were woken and didn't receive a msg
15                           %% then we will get here and need a sensible
16                           %% value for Time1, otherwise we crash.
17                           %% R13B1 always waits infinitely when waking
18                           %% from hibernation, so that's what we do
19                           %% here too.
20                           {infinity, false};
21                       _ -> {Time, false}
22                   end,
23             receive
24                 Input ->
25                     %% Time could be 'hibernate' here, so *don't* call loop
26                     process_next_msg(
27                       drain(in(Input, GS2State #gs2_state { queue = Queue1 })))
28             after Time1 ->
29                     case HibOnTimeout of
30                         true ->
31                             pre_hibernate(
32                               GS2State #gs2_state { queue = Queue1 });
33                         false ->
34                             process_msg(timeout,
35                                         GS2State #gs2_state { queue = Queue1 })
36                     end
37             end
38     end.

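When the priority_queue is empty (line 7), the process waits for a message (line 23) with a timeout based on ?HIBERNATE_AFTER_MIN (lines 11 and 28). When the timeout fires, gen_server2 first calls the user module's handle_pre_hibernate callback (specific to gen_server2) and only then calls hibernate.

RabbitMQ's use of hibernate is therefore more cautious still.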

References:

1, http://blog.yufeng.info/archives/1615

2, http://www.erlang.org/doc/man/erlang.html#hibernate-3

3, http://www.metabrew.com/article/a-million-user-comet-application-with-mochiweb-part-2

4, Characterizing the Scalability of Erlang VM

posted on 2015-02-05 14:31  _00