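gen_server terminate and trap_exit
Whether you are new to Erlang or experienced, when writing a gen_server you will often be puzzled that terminate/2 sometimes runs and sometimes does not.
One example is the Stack Overflow question "Handling the cleanup of the gen_server state": the documentation for terminate is rather vague and does not say how to make sure terminate/2 always runs.
To sort out the different cases I ran a small experiment. The conclusions are as follows.
A process can be made to exit from two sources:
- Internal: it finishes running on its own, or it crashes with an exception.
- External: another process uses erlang:exit/2 to force the running process to exit.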
gen_server exit cause | Start function | trap_exit | terminate/2 |
---|---|---|---|
Internal crash | any | any | runs |
exit(P, kill) | any | any | does not run |
exit(P, Reason) | any | true | runs |
exit(P, Reason) | any | false | does not run |
Pid ! {'EXIT', StarterPid, Reason} | gen_server:start_link | any | runs |
Pid ! {'EXIT', StarterPid, Reason} | gen_server:start | any | does not run |
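- Note that kill is an especially brutal exit signal: it cannot be trapped, the process is forced to exit at once, and terminate/2 does not run. This is also what a supervisor uses to stop a child whose shutdown strategy is brutal_kill.
- The most common way to end up with terminate/2 not running:
1. The process is attached to an Application through a supervision tree (gen_server:start_link/3-4).
2. When the Application stops, the supervisor shuts its children down one by one, in the same way as supervisor:terminate_child/2.
3. A worker process is shut down with exit(Pid, shutdown).
4. So with trap_exit set to false, terminate/2 does not run.
- To make sure the process runs terminate/2, add process_flag(trap_exit, true) to init/1.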
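As a minimal sketch of that last point, the module below traps exits in init/1 so that a shutdown signal from its parent ends in terminate/2. The module name cleanup_server and the printed message are made up for illustration; a real server would put its own cleanup in terminate/2.

```erlang
-module(cleanup_server).
-behaviour(gen_server).

-export([start_link/0]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

init([]) ->
    %% Without this flag, exit(Pid, shutdown) from the parent kills the
    %% process directly and terminate/2 is skipped.
    process_flag(trap_exit, true),
    {ok, undefined}.

handle_call(_Request, _From, State) ->
    {reply, ok, State}.

handle_cast(_Msg, State) ->
    {noreply, State}.

handle_info(_Info, State) ->
    {noreply, State}.

terminate(Reason, _State) ->
    %% Hypothetical cleanup step: this sketch only prints the reason.
    io:format("cleanup_server terminating: ~p~n", [Reason]),
    ok.
```

A caller that wants to try this from the shell should trap exits itself first, because the server exits with reason shutdown and the shell is linked to it.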
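Next we go through the cases one by one.
What trap_exit does
erlang:process_flag(trap_exit, true).
- When it is set to false:
If a linked process exits abnormally (for example exit(whatever)), this process also exits abnormally.
If a linked process exits normally (it runs to completion or calls exit(normal)), this process is not affected at all: it receives no message and does not exit.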
1> erlang:process_flag(trap_exit, false), self().
<0.64.0>
2> erlang:spawn_link(fun() -> exit(whatever) end). %% the child exits with a reason other than normal
** exception exit: whatever
3> self(). %% the parent exited abnormally too; a new shell process has taken its place
<0.68.0>
4> erlang:process_flag(trap_exit,false),self().
<0.68.0>
5> erlang:spawn_link(fun() -> exit(normal) end). %% the child exits with reason normal
<0.79.0>
6> erlang:spawn_link(fun() -> {ok, true} end). %% the child finishes normally, same as in 5
<0.80.0>
11> flush(). %% no message was received
ok
12> self(). %% the process did not exit
<0.68.0>
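- When it is set to true:
If a linked process exits abnormally (for example exit(whatever)), this process does not exit; it only receives the message {'EXIT', FromPid, whatever}.
If a linked process exits normally (it simply runs to completion), this process receives {'EXIT', FromPid, normal}.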
13> erlang:process_flag(trap_exit, true),self().
<0.68.0>
14> erlang:spawn_link(fun() -> exit(whatever) end).
<0.72.0>
15> flush().
Shell got {'EXIT',<0.72.0>,whatever}
ok
16> erlang:spawn_link(fun() -> {ok, true} end).
<0.90.0>
17> flush().
Shell got {'EXIT',<0.90.0>,normal}
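Summary:
- trap_exit defaults to false. If a linked process crashes, the exit signal takes this process down as well; if the linked process exits normally, this process is unaffected and receives no message.
- With trap_exit set to true, this process receives an {'EXIT', FromPid, Reason} message whenever a linked process exits, whether normally or by crashing.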
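The second case in the summary can be packed into a small helper. This is only a sketch: the module name trap_demo and the function watch/1 are invented here, and the one-second timeout is an arbitrary choice.

```erlang
-module(trap_demo).
-export([watch/1]).

%% Spawn a linked child that runs Fun, trap its exit, and return the
%% reason carried by the resulting 'EXIT' message.
watch(Fun) when is_function(Fun, 0) ->
    Old = process_flag(trap_exit, true),
    Pid = spawn_link(Fun),
    Reason = receive
                 {'EXIT', Pid, R} -> R
             after 1000 ->
                 timeout
             end,
    %% Restore whatever trap_exit setting the caller had before.
    process_flag(trap_exit, Old),
    Reason.
```

trap_demo:watch(fun() -> exit(whatever) end) returns whatever, and trap_demo:watch(fun() -> ok end) returns normal, matching the two shell sessions above.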
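Now let us look at what happens when a standalone gen_server process runs into an error.
What happens when a gen_server fails internally?
If the internal logic of a gen_server hits an error and the process crashes itself, for example by dividing by zero or applying ++ to atoms, will terminate/2 run?
The answer: it always will!
A simple gen_server to verify this: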
-module(gen_server_test).
-behaviour(gen_server).
-export([start_link/1, start/1]).
-export([divide/2, stop/1, crash/1]).
%% gen_server callbacks
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2, code_change/3]).
%% API
start(TrapExit) ->
    gen_server:start({local, ?MODULE}, ?MODULE, [TrapExit], []).
start_link(TrapExit) ->
    gen_server:start_link({local, ?MODULE}, ?MODULE, [TrapExit], []).
stop(Reason) ->
    gen_server:call(?MODULE, {stop, Reason}).
crash(Reason) ->
    gen_server:call(?MODULE, {crash, Reason}).
divide(X, Y) ->
    gen_server:call(?MODULE, {divide, X, Y}).
init([TrapExit]) ->
    erlang:process_flag(trap_exit, TrapExit),
    {ok, undefined}.
handle_call({divide, X, Y}, _From, State) ->
    io:format("[line:~p] Divide:~p/~p ~n", [?LINE, X, Y]),
    {reply, X/Y, State};
handle_call({stop, Reason}, _From, State) ->
    io:format("[line:~p] Got stop by ~p ~n", [?LINE, Reason]),
    {stop, Reason, ok, State};
handle_call({crash, Reason}, _From, State) ->
    io:format("[line:~p] Got crash: error(~p).~n", [?LINE, Reason]),
    erlang:error(Reason),
    {reply, ok, State};
handle_call(_Msg, _From, State) -> {reply, ignore, State}.
handle_info(Msg, State) ->
    io:format("[line:~p] Got ~p~n", [?LINE, Msg]),
    {noreply, State}.
terminate(Reason, _State) ->
    io:format("[line:~p] Terminate reason: ~p~n", [?LINE, Reason]),
    ok.
handle_cast(_Msg, State) -> {noreply, State}.
code_change(_Old, State, _Extra) -> {ok, State}.
- Result of an internal crash:
1> c(gen_server_test).
{ok,gen_server_test}
2> gen_server_test:start_link(false).
{ok,<0.71.0>}
3> gen_server_test:divide(1,0).
[line:31] Divide:1/0
%% after the crash, the terminate/2 callback runs
[line:47] Terminate reason: {badarith,
[{gen_server_test,handle_call,3,
[{file,"gen_server_test.erl"},{line,32}]},
{gen_server,try_handle_call,4,
[{file,"gen_server.erl"},{line,636}]},
{gen_server,handle_msg,6,
[{file,"gen_server.erl"},{line,665}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]}
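%% gen_server_test:divide(1,0) goes through gen_server:call/2, which monitors
%% the gen_server_test process; when gen_server_test exits abnormally, the call
%% exits in the caller (the shell process) with the same reason.
%% The shell is also linked to the server through start_link and has
%% trap_exit at its default of false, so it goes down with it.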
** exception exit: badarith
in function gen_server_test:handle_call/3 (gen_server_test.erl, line 32)
in call from gen_server:try_handle_call/4 (gen_server.erl, line 636)
in call from gen_server:handle_msg/6 (gen_server.erl, line 665)
in call from proc_lib:init_p_do_apply/3 (proc_lib.erl, line 247)
4>
%% When a gen_server process ends with a Reason other than normal (or shutdown / {shutdown, _}), an error report is logged through error_logger by default.
=ERROR REPORT==== 21-May-2018::17:03:34 ===
** Generic server gen_server_test terminating
** Last message in was {divide,1,0}
** When Server state == undefined
** Reason for termination ==
** {badarith,[{gen_server_test,handle_call,3,
[{file,"gen_server_test.erl"},{line,32}]},
{gen_server,try_handle_call,4,
[{file,"gen_server.erl"},{line,636}]},
{gen_server,handle_msg,6,[{file,"gen_server.erl"},{line,665}]},
{proc_lib,init_p_do_apply,3,
[{file,"proc_lib.erl"},{line,247}]}]}
** Client <0.64.0> stacktrace
** [{gen,do_call,4,[{file,"gen.erl"},{line,169}]},
{gen_server,call,2,[{file,"gen_server.erl"},{line,202}]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
- Result of a normal stop from inside the process:
4> gen_server_test:start_link(false).
{ok,<0.75.0>}
5> gen_server_test:stop(normal).
[line:34] Got stop by normal
[line:47] Terminate reason: normal
ok
6> gen_server_test:start_link(false).
{ok,<0.78.0>}
7> gen_server_test:stop(whatever).
[line:34] Got stop by whatever
[line:47] Terminate reason: whatever
%% a regular stop whose reason is not normal still gets a report printed through error_logger
=ERROR REPORT==== 21-May-2018::17:07:02 ===
** Generic server gen_server_test terminating
** Last message in was {stop,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** Client <0.73.0> stacktrace
** [{gen,do_call,4,[{file,"gen.erl"},{line,169}]},
{gen_server,call,2,[{file,"gen_server.erl"},{line,202}]},
{erl_eval,do_apply,6,[{file,"erl_eval.erl"},{line,674}]},
{shell,exprs,7,[{file,"shell.erl"},{line,687}]},
{shell,eval_exprs,7,[{file,"shell.erl"},{line,642}]},
{shell,eval_loop,3,[{file,"shell.erl"},{line,627}]}]
** exception exit: whatever
Summary:
- When a gen_server exits on its own or crashes internally, terminate/2 always runs.
- If the stop reason is not normal, error_logger records the exit.
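What happens when a gen_server is forced to exit from outside?
- exit(Pid, kill) forces an exit signal on the process: terminate/2 does not run.
- trap_exit: false, and exit(Pid, Reason) forces an exit signal: terminate/2 does not run.
- trap_exit: true, and exit(Pid, Reason) forces an exit signal from the parent (the shell in the session here, or a supervisor): terminate/2 runs.
- An {'EXIT', Pid, Reason} message sent to a process started with gen_server:start/3 is treated as an ordinary message and handled by handle_info/2.
- An {'EXIT', Pid, Reason} message sent to a process started with gen_server:start_link/3, where Pid is the parent that called start_link, is treated as an exit signal and handled by terminate/2.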
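One consequence of the list above is that, once trap_exit is true, 'EXIT' messages from linked processes other than the parent arrive in handle_info/2 like any other message. The sketch below records them instead of stopping. The module name exit_aware_server and its API (spawn_worker/2, exits/1) are hypothetical and exist only for this illustration.

```erlang
-module(exit_aware_server).
-behaviour(gen_server).

-export([start_link/0, spawn_worker/2, exits/1]).
-export([init/1, handle_call/3, handle_cast/2, handle_info/2, terminate/2]).

start_link() ->
    gen_server:start_link(?MODULE, [], []).

%% Ask the server to spawn a worker that is linked to the server.
spawn_worker(Server, Fun) ->
    gen_server:call(Server, {spawn_worker, Fun}).

%% Return the {Pid, Reason} pairs recorded so far, oldest first.
exits(Server) ->
    gen_server:call(Server, exits).

init([]) ->
    process_flag(trap_exit, true),
    {ok, []}.

handle_call({spawn_worker, Fun}, _From, Exits) ->
    {reply, spawn_link(Fun), Exits};
handle_call(exits, _From, Exits) ->
    {reply, lists:reverse(Exits), Exits}.

handle_cast(_Msg, Exits) ->
    {noreply, Exits}.

%% 'EXIT' messages from linked processes that are not the parent land
%% here; an 'EXIT' from the parent never reaches this callback.
handle_info({'EXIT', Pid, Reason}, Exits) ->
    {noreply, [{Pid, Reason} | Exits]};
handle_info(_Other, Exits) ->
    {noreply, Exits}.

terminate(_Reason, _Exits) ->
    ok.
```

Whether to keep running, restart the worker, or return {stop, Reason, State} from handle_info/2 is a design decision for the server; this sketch simply keeps a log.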
8> {ok, Pid} = gen_server_test:start_link(false). %% trap_exit false
{ok,<0.86.0>}
9> erlang:exit(Pid, whatever).
** exception exit: whatever
10> {ok, Pid1} = gen_server_test:start_link(true). %% trap_exit true
{ok,<0.90.0>}
11> erlang:exit(Pid1, whatever).
[line:47] Terminate reason: whatever
true
=ERROR REPORT==== 21-May-2018::17:10:24 ===
** Generic server gen_server_test terminating
** Last message in was {'EXIT',<0.88.0>,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** exception exit: whatever
12> gen_server_test:start(true). %% started with gen_server:start/3, so the {'EXIT', self(), whatever} message is just an ordinary message passed to handle_info/2
{ok,<0.94.0>}
13> gen_server_test ! {'EXIT',self(), whatever}.
[line:43] Got {'EXIT',<0.92.0>,whatever}
{'EXIT',<0.92.0>,whatever}
14> gen_server_test:stop(normal).
[line:34] Got stop by normal
[line:47] Terminate reason: normal
ok
15> gen_server_test:start_link(false). %% started with gen_server:start_link/3, so the {'EXIT', self(), whatever} message is treated as an exit request and ends in terminate/2
{ok,<0.98.0>}
16> gen_server_test ! {'EXIT',self(), whatever}.
[line:47] Terminate reason: whatever
{'EXIT',<0.92.0>,whatever}
=ERROR REPORT==== 21-May-2018::17:14:56 ===
** Generic server gen_server_test terminating
** Last message in was {'EXIT',<0.92.0>,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** exception exit: whatever
17> gen_server_test:start_link(false).
{ok,<0.102.0>}
18> exit(<0.102.0>, whatever).
** exception exit: whatever
19> gen_server_test:start_link(true).
{ok,<0.111.0>}
20> exit(<0.111.0>, kill).
** exception exit: killed
21> gen_server_test:start_link(true).
{ok,<0.106.0>}
22> exit(<0.106.0>, whatever).
[line:47] Terminate reason: whatever
true
=ERROR REPORT==== 21-May-2018::17:20:24 ===
** Generic server gen_server_test terminating
** Last message in was {'EXIT',<0.104.0>,whatever}
** When Server state == undefined
** Reason for termination ==
** whatever
** exception exit: whatever
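What happens if terminate itself crashes?
The crash is propagated further up; in most cases the exit ends up at the supervisor, which deals with it. The gen_server source handles it as follows: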
terminate(ExitReason, ReportReason, Name, Msg, Mod, State, Debug) ->
    Reply = try_terminate(Mod, ExitReason, State),
    case Reply of
        {'EXIT', ExitReason1, ReportReason1} ->
            FmtState = format_status(terminate, Mod, get(), State),
            error_info(ReportReason1, Name, Msg, FmtState, Debug),
            exit(ExitReason1);
        _ ->
            case ExitReason of
                normal ->
                    exit(normal);
                shutdown ->
                    exit(shutdown);
                {shutdown,_}=Shutdown ->
                    exit(Shutdown);
                _ ->
                    FmtState = format_status(terminate, Mod, get(), State),
                    error_info(ReportReason, Name, Msg, FmtState, Debug),
                    exit(ExitReason)
            end
    end.
try_terminate(Mod, Reason, State) ->
    try
        {ok, Mod:terminate(Reason, State)}
    catch
        throw:R ->
            {ok, R};
        error:R ->
            Stacktrace = erlang:get_stacktrace(),
            {'EXIT', {R, Stacktrace}, {R, Stacktrace}};
        exit:R ->
            Stacktrace = erlang:get_stacktrace(),
            {'EXIT', R, {R, Stacktrace}}
    end.
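As the source above shows, an error or exit raised inside terminate/2 replaces the original exit reason. If that is not wanted, one option is to guard the cleanup step. The helper below is a sketch under that assumption; the module name safe_terminate and the function run_cleanup/1 are made up, not part of OTP.

```erlang
-module(safe_terminate).
-export([run_cleanup/1]).

%% Run a zero-arity cleanup fun and never let it raise.
%% Returns ok on success, or {cleanup_failed, Class, Reason} when the
%% fun throws, errors, or exits.
run_cleanup(Cleanup) when is_function(Cleanup, 0) ->
    try
        _ = Cleanup(),
        ok
    catch
        Class:Reason ->
            {cleanup_failed, Class, Reason}
    end.
```

Inside a callback module it could be used as terminate(_Reason, State) -> _ = safe_terminate:run_cleanup(fun() -> my_cleanup(State) end), ok. where my_cleanup/1 stands for whatever cleanup the server needs. With this guard the process still exits with the reason gen_server passed to terminate/2.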