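I recently ran into several problems that all involved Erlang shell output. The problems are solved, but the questions behind them deserve a closer look, so the next few posts will revolve around this topic. Let's start from io:format("hello world!").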
%% Source path: \erl5.9\lib\stdlib-1.18\src\io.erl
format(Format) ->
    format(Format, []).

format(Format, Args) ->
    format(default_output(), Format, Args).

format(Io, Format, Args) ->
    o_request(Io, {format,Format,Args}, format).

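Open \erl5.9\lib\stdlib-1.18\src\io.erl: a call to io:format("hello world!") clearly ends up in format(default_output(), Format, Args). The first thing to look at is default_output/0, whose implementation is trivial:
default_output() ->
    group_leader().
Here group_leader/0 is actually erlang:group_leader(). The official documentation describes it as follows: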

group_leader() -> GroupLeader

Types:
GroupLeader = pid()
     Returns the pid of the group leader for the process which evaluates the function.
   Every process is a member of some process group and all groups have a group leader. All IO from the group is channeled to the group leader. When a new process is spawned, it gets the same group leader as the spawning process. Initially, at system start-up, init is both its own group leader and the group leader of all processes.

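Erlang processes are not isolated: every process belongs to a process group, and every group has a group leader. All I/O from the group is channeled to the group leader. A newly spawned process inherits the group leader of the process that spawned it. At system start-up, init is its own group leader and the group leader of all other processes. Let's verify this in the Erlang shell:
1. What is the group leader of a process spawned from the shell?
2. What is the group leader of the current shell process?
3. What is the group leader of the init process?
4. Crash the shell so that it restarts, then look again.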
Eshell V5.9  (abort with ^G)
1> self().
<0.30.0>
2> P=spawn(fun()-> receive after infinity -> hello end end ).
<0.33.0>
3> erlang:process_info(P).
[{current_function,{erl_eval,receive_clauses,8}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% note: this is the group leader of process P
{total_heap_size,233},
{heap_size,233},
{stack_size,10},
{reductions,18},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,0}]},
{suspending,[]}]
4> erlang:process_info(pid(0,23,0)). %% what kind of process is P's group leader?
[{registered_name,user}, %% note: the group leader of P is registered as user
{current_function,{user,server_loop,2}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.21.0>,<0.24.0>,#Port<0.319>,<0.5.0>]},
{dictionary,[{unicode,false},
{read_mode,list},
{shell,<0.24.0>}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% the user process is its own group leader
{total_heap_size,3194},
{heap_size,2584},
{stack_size,9},
{reductions,1310},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,2}]},
{suspending,[]}]
5> whereis(init). %% now look at the metadata of the init process
<0.0.0>
6> erlang:process_info(pid(0,0,0)).
[{registered_name,init},
{current_function,{init,loop,1}},
{initial_call,{otp_ring0,start,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.5.0>,<0.6.0>,<0.3.0>]},
{dictionary,[]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.0.0>}, %% init is its own group leader
{total_heap_size,1974},
{heap_size,1597},
{stack_size,2},
{reductions,2357},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,4}]},
{suspending,[]}]
7> erlang:process_info(pid(0,30,0)). %% back to the current shell: what is its group leader?
[{current_function,{erl_eval,do_apply,6}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,0},
{messages,[]},
{links,[<0.24.0>]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% remember whose pid this is? Right, it is user
{total_heap_size,3571},
{heap_size,2584},
{stack_size,24},
{reductions,18941},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,6}]},
{suspending,[]}]
8> self(). %% next we crash the current shell
<0.30.0>
9> 1/0.
** exception error: bad argument in an arithmetic expression
in operator '/'/2
called as 1 / 0
10> self(). %% check again: the shell's pid has changed
<0.41.0>
11> erlang:process_info(pid(0,41,0)).
[{current_function,{erl_eval,do_apply,6}},
{initial_call,{erlang,apply,2}},
{status,running},
{message_queue_len,0},
{messages,[]},
{links,[<0.24.0>]},
{dictionary,[]},
{trap_exit,false},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>}, %% note: the group leader is still user
{total_heap_size,3571},
{heap_size,2584},
{stack_size,24},
{reductions,3281},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,8}]},
{suspending,[]}]
12>
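At system start-up the init process is created first (pid <0.0.0>) and is its own group leader. The documentation says it is also the group leader of all processes. That holds initially, because every other process descends from init and inherits its group leader until something calls group_leader/2. In this session, however, both the shell process and the processes we spawned from the shell have the user process as their group leader.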
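The inheritance rule can also be checked programmatically rather than by reading process_info output. The following is a minimal sketch; the module name gl_inherit and the function check/0 are made up for illustration. It only compares the child's group leader with the parent's and makes no assumption about which process that leader is, since that depends on how the node was started.

```erlang
-module(gl_inherit).
-export([check/0]).

%% Spawn a child, let it report its own group leader, and compare it
%% with the group leader of the calling (parent) process.
check() ->
    Parent = self(),
    Child = spawn(fun() -> Parent ! {self(), group_leader()} end),
    receive
        {Child, ChildGL} ->
            %% true when the child inherited the parent's group leader
            ChildGL =:= group_leader()
    after 1000 ->
        timeout
    end.
```

Calling gl_inherit:check() from the shell should return true, matching what the session above shows for process P.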
What does the user process do? Here is the description from the official documentation:

user 
Standard I/O Server
DESCRIPTION
user is a server which responds to all the messages defined in the I/O interface. The code in user.erl can be used as a model for building alternative I/O servers.

So user is the standard I/O server. Here is a fragment of how the user process starts up:
%% Source path: \erl5.9\lib\kernel-2.15\src\user.erl
run(P) ->
    put(read_mode,list),
    put(unicode,false),
    case init:get_argument(noshell) of
        %% non-empty list -> noshell
        {ok, [_|_]} ->
            put(shell, noshell),
            server_loop(P, queue:new());
        _ ->
            group_leader(self(), self()),
            catch_loop(P, start_init_shell())
    end.
Note the call to group_leader(self(), self()). Here is what that function does:
group_leader(GroupLeader, Pid) -> true
 Types:
GroupLeader = Pid = pid()

Sets the group leader of Pid to GroupLeader. Typically, this is used when a processes started from a certain shell should have another group leader than init.
See also group_leader/0.

In other words, it sets the group leader of the process Pid to GroupLeader. The call group_leader(self(), self()) made by user above makes user its own group leader.
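To see group_leader/2 at work outside of user.erl, here is a small sketch that redirects the output of one process into a file. It relies on the fact that the pid returned by file:open/2 (without the raw option) is itself an I/O server. The module name gl_redirect and the Path argument are hypothetical and chosen only for this illustration.

```erlang
-module(gl_redirect).
-export([to_file/1]).

%% Path is a caller-supplied file name (hypothetical, for illustration).
to_file(Path) ->
    %% Without 'raw', file:open/2 returns the pid of an I/O server.
    {ok, Fd} = file:open(Path, [write]),
    Parent = self(),
    Pid = spawn(fun() ->
                    %% Make the file's I/O server this process's group leader.
                    group_leader(Fd, self()),
                    %% No device argument: output goes to the group leader.
                    io:format("hello from ~p~n", [redirected]),
                    Parent ! {self(), done}
                end),
    receive
        {Pid, done} -> ok
    after 5000 ->
        exit(timeout)
    end,
    ok = file:close(Fd),
    %% Read back what the redirected process wrote.
    file:read_file(Path).
```

Only the spawned process is affected; the caller keeps its own group leader, so its output still appears in the shell.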
 
So far we have not followed io:format all the way through. Going back to its implementation, we had reached o_request(Io, {format,Format,Args}, format). Let's continue from there:
%% Source path: \erl5.9\lib\stdlib-1.18\src\io.erl
o_request(Io, Request, Func) ->
    %% Io here is the group leader
    case request(Io, Request) of
        {error, Reason} ->
            [_Name | Args] = tuple_to_list(to_tuple(Request)),
            {'EXIT',{get_stacktrace,[_Current|Mfas]}} = (catch erlang:error(get_stacktrace)),
            erlang:raise(error, conv_reason(Func, Reason), [{io, Func, [Io | Args]}|Mfas]);
        Other ->
            Other
    end.

request(Request) ->
    request(default_output(), Request).

request(standard_io, Request) ->
    request(group_leader(), Request);
request(Pid, Request) when is_pid(Pid) ->
    %% This is the clause we take. io_request/2 only converts the
    %% request into the message format.
    execute_request(Pid, io_request(Pid, Request));
request(Name, Request) when is_atom(Name) ->
    case whereis(Name) of
        undefined ->
            {error, arguments};
        Pid ->
            request(Pid, Request)
    end.

%% ... and then we arrive here
execute_request(Pid, {Convert,Converted}) ->
    Mref = erlang:monitor(process, Pid),
    %% Send a message to the group leader; next we look at what the
    %% user process does when it receives it.
    Pid ! {io_request,self(),Pid,Converted},
    if
        Convert ->
            convert_binaries(wait_io_mon_reply(Pid, Mref));
        true ->
            wait_io_mon_reply(Pid, Mref)
    end.
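The clauses of request/2 above suggest that the device argument of io:format/3 may be the atom standard_io, a pid, or a registered name. A short sketch to try this out follows; the module name io_targets is made up, and the third call assumes a process is registered as user, which is the case on a normally started node.

```erlang
-module(io_targets).
-export([demo/0]).

%% Send the same kind of request through each dispatch path of request/2.
demo() ->
    R1 = io:format(standard_io, "via standard_io~n", []),
    R2 = io:format(group_leader(), "via a pid~n", []),
    %% Assumes a process registered as 'user' exists on this node.
    R3 = io:format(user, "via a registered name~n", []),
    [R1, R2, R3].
```

Each call returns ok when the addressed I/O server accepts the request.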
The trace above ends with the converted request being sent to the group leader as a message. Let's continue in user.erl and see how this message is received and handled:
%% Source path: \erl5.9\lib\kernel-2.15\src\user.erl
server_loop(Port, Q) ->
    receive
        {io_request,From,ReplyAs,Request} when is_pid(From) ->
            server_loop(Port, do_io_request(Request, From, ReplyAs, Port, Q));
        {Port,{data,Bytes}} ->
            case get(shell) of
                noshell ->
                    server_loop(Port, queue:snoc(Q, Bytes));
                _ ->
                    case contains_ctrl_g_or_ctrl_c(Bytes) of
                        false ->
                            server_loop(Port, queue:snoc(Q, Bytes));
                        _ ->
                            throw(new_shell)
                    end
            end;
        {Port, eof} ->
            put(eof, true),
            server_loop(Port, Q);

        %% Ignore messages from port here.
        {'EXIT',Port,badsig} -> % Ignore badsig errors
            server_loop(Port, Q);
        {'EXIT',Port,What} -> % Port has exited
            exit(What);

        %% Check if shell has exited
        {'EXIT',SomePid,What} ->
            case get(shell) of
                noshell ->
                    server_loop(Port, Q); % Ignore
                _ ->
                    throw({unknown_exit,{SomePid,What},Q})
            end;

        _Other -> % Ignore other messages
            server_loop(Port, Q)
    end.
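The shell shows up in this code: the user process keeps the pid of the current shell in its process dictionary (under the key shell), which is how the user I/O server keeps track of the shell it is serving. As an experiment, let's inspect the metadata of user, kill the shell, and then see what the process dictionary of user looks like: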
Eshell V5.9  (abort with ^G)
1> whereis(user).
<0.23.0>
2> erlang:process_info(whereis(user)).
[{registered_name,user},
{current_function,{user,server_loop,2}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.21.0>,<0.24.0>,#Port<0.319>,<0.5.0>]},
{dictionary,[{unicode,false},
{read_mode,list},
{shell,<0.24.0>}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>},
{total_heap_size,987},
{heap_size,610},
{stack_size,9},
{reductions,666},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,5}]},
{suspending,[]}]
3> exit(pid(0,24,0),kill).
*** ERROR: Shell process terminated! ***
Eshell V5.9 (abort with ^G)
1> whereis(user).
<0.23.0>
2> erlang:process_info(whereis(user)).
[{registered_name,user},
{current_function,{user,server_loop,2}},
{initial_call,{erlang,apply,2}},
{status,waiting},
{message_queue_len,0},
{messages,[]},
{links,[<0.5.0>,<0.21.0>,<0.34.0>,#Port<0.319>]},
{dictionary,[{unicode,false},
{read_mode,list},
{shell,<0.34.0>}]},
{trap_exit,true},
{error_handler,error_handler},
{priority,normal},
{group_leader,<0.23.0>},
{total_heap_size,1364},
{heap_size,987},
{stack_size,9},
{reductions,1659},
{garbage_collection,[{min_bin_vheap_size,46368},
{min_heap_size,233},
{fullsweep_after,65535},
{minor_gcs,8}]},
{suspending,[]}]
3>

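At this point the overall flow of io:format is fairly clear: the caller sends an I/O request in the prescribed format to its group leader, and the group leader handles the request. In the setup shown here the group leader is normally user. The user process keeps the pid of the shell in its process dictionary and updates that entry when the shell is restarted. We can now do some fun things, such as skipping io:format and sending a request straight to the user process: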
Eshell V5.9  (abort with ^G)
1> U =whereis(user).
<0.23.0>
2> U!{io_request,self(),self(), {put_chars,unicode,io_lib,format, ["hello world :~p~n",[zen]]}}.
hello world :zen
{io_request,<0.30.0>,<0.30.0>,
{put_chars,unicode,io_lib,format,
["hello world :~p~n",[zen]]}}
3>
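The shell one-liner above sends the request but never collects the io_reply, which stays in the shell's mailbox. A slightly fuller sketch, modelled loosely on what execute_request/2 does, monitors the server and waits for the reply. The module name raw_io and the function write/2 are made up for illustration, and a fresh reference is used as ReplyAs instead of a pid.

```erlang
-module(raw_io).
-export([write/2]).

%% Send a put_chars request to the group leader by hand and wait for
%% the matching io_reply, without going through the io module.
write(Format, Args) ->
    Server = group_leader(),
    Mref = erlang:monitor(process, Server),
    %% ReplyAs is opaque to the server; a reference identifies our request.
    ReplyAs = make_ref(),
    Server ! {io_request, self(), ReplyAs,
              {put_chars, unicode, io_lib, format, [Format, Args]}},
    receive
        {io_reply, ReplyAs, Reply} ->
            erlang:demonitor(Mref, [flush]),
            Reply;
        {'DOWN', Mref, process, Server, Reason} ->
            %% The I/O server died before it answered.
            {error, Reason}
    end.
```

For example, raw_io:write("hello world: ~p~n", [zen]) prints the line through the group leader and returns the server's reply, which is ok on success.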
It also lets us solve some practical problems, such as this question from erlangqa:


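litaocheng has already given a solution, and by now it should no longer look unfamiliar:
Redirect the I/O by changing the group leader.
For example, take this code:
cat test.erl
-module(test).
-compile([export_all]).

r() ->
    io:format("group leader:~p~n", [erlang:group_leader()]),
    io:format("node:~p~n", [node()]),
    erlang:group_leader(whereis(user), self()),
    io:format("hello world~n").
Then start two nodes:
erl -sname t1
erl -sname t2
On t1, run:
net_kernel:connect_node('t2@litao').
rpc:call('t2@litao', test, r, []).
You will see hello world printed on t2. The first two io:format calls still print on t1, because a process started through rpc:call uses the caller's group leader until group_leader/2 replaces it with the user process local to t2.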

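You may have noticed that our trace stops at the point where the user process receives the message. That is because following the code any further would take us deep into the details of the I/O protocol. Here is a brief look: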
The Erlang I/O-protocol
The I/O-protocol in Erlang specifies a way for a client to communicate with an io_server and vice versa. The io_server is a process handling the requests and that performs the requested task on i.e. a device. The client is any Erlang process wishing to read or write data from/to the device.
The common I/O-protocol has been present in OTP since the beginning, but has been fairly undocumented and has also somewhat evolved over the years. In an addendum to Robert Virdings rationale the original I/O-protocol is described. This document describes the current I/O-protocol.
The original I/O-protocol was simple and flexible. Demands for spacial and execution time efficiency has triggered extensions to the protocol over the years, making the protocol larger and somewhat less easy to implement than the original. It can certainly be argumented that the current protocol is too complex, but this text describes how it looks today, not how it should have looked.
The basic ideas from the original protocol still hold. The io_server and client communicate with one single, rather simplistic protocol and no server state is ever present in the client. Any io_server can be used together with any client code and client code need not be aware of the actual device the io_server communicates with.
1.1 Protocol basics
As described in Robert's paper, servers and clients communicate using io_request/io_reply tuples as follows:
{io_request, From, ReplyAs, Request}
{io_reply, ReplyAs, Reply}
The client sends an io_request to the io_server and the server eventually sends a corresponding reply.

* From is the pid() of the client, the process which the io_server sends the reply to.
* ReplyAs can be any datum and is simply returned in the corresponding io_reply. The io-module in the Erlang standard library simply uses the pid() of the io_server as the ReplyAs datum, but a more complicated client could have several outstanding io-requests to the same server and would then use i.e. a reference() or something else to differentiate among the incoming io_reply's. The ReplyAs element should be considered opaque by the io_server. Note that the pid() of the server is not explicitly present in the io_reply. The reply can be sent from any process, not necessarily the actual io_server. The ReplyAs element is the only thing that connects one io_request with an io_reply.
* Request and Reply are described below.

When an io_server receives an io_request, it acts upon the actual Request part and eventually sends an io_reply with the corresponding Reply part.
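Because the protocol is just these two tuples, any process can act as an io_server. The sketch below is a deliberately minimal one that collects output in memory instead of writing to a device. The module name capture_io and its function names are made up for illustration. It only understands the two unicode-aware put_chars requests and answers {error, request} to everything else, so it is a toy under those assumptions and not a complete I/O server.

```erlang
-module(capture_io).
-export([capture/1]).

%% Run Fun in a process whose group leader is our toy io_server and
%% return everything it printed, as a binary. If Fun crashes, the
%% caller exits with 'timeout' after five seconds.
capture(Fun) ->
    Parent = self(),
    Server = spawn(fun() -> loop([]) end),
    Worker = spawn(fun() ->
                       group_leader(Server, self()),
                       Fun(),
                       Parent ! {self(), done}
                   end),
    receive {Worker, done} -> ok after 5000 -> exit(timeout) end,
    Server ! {get, Parent},
    receive {captured, Bin} -> Bin after 5000 -> exit(timeout) end.

%% Server side: answer each io_request with an io_reply that echoes
%% ReplyAs unchanged, as the protocol requires.
loop(Acc) ->
    receive
        {io_request, From, ReplyAs, Request} ->
            {Reply, Acc1} = handle(Request, Acc),
            From ! {io_reply, ReplyAs, Reply},
            loop(Acc1);
        {get, To} ->
            To ! {captured, unicode:characters_to_binary(lists:reverse(Acc))}
    end.

%% Only the unicode-aware put_chars requests are supported here.
handle({put_chars, _Encoding, Chars}, Acc) ->
    {ok, [Chars | Acc]};
handle({put_chars, _Encoding, M, F, A}, Acc) ->
    {ok, [apply(M, F, A) | Acc]};
handle(_Other, Acc) ->
    {{error, request}, Acc}.
```

For example, capture_io:capture(fun() -> io:format("hi ~p~n", [zen]) end) returns the printed text as a binary instead of showing it in the shell.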
For the full protocol, see: http://erlang.org/doc/apps/stdlib/io_protocol.html

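The topic of I/O output is far from finished, and plenty of questions remain. For example: how is I/O output controlled during rpc:call? What is the group leader of the processes in an OTP application once it has started? When you attach to a node through JCL, what path does the output take? That is all for today. To be continued.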

See also: the online documentation for the io module.