ranch
Overview
At a high level, ranch is a three-layer supervision tree:
- Layer 1: ranch_sup, responsible for starting the whole application; it starts the ranch_server process, which manages the application's configuration and connection data
- Layer 2: ranch_listener_sup, responsible for managing a listener
- Layer 3: ranch_conns_sup and ranch_acceptors_sup, which handle new connections and accept new connections respectively
At the very bottom, ranch_acceptor plays a key role: every time a new connection arrives, it hands control of the socket to ranch_conns_sup, which manages all connections in one place.
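To make this concrete, here is a minimal sketch of how the whole tree gets built from user code. The listener name my_echo and the port are placeholders, and echo_protocol stands for the kind of ranch_protocol callback module sketched at the end of this article:

%% Starting the ranch application brings up ranch_sup and ranch_server;
%% starting a listener then adds a ranch_listener_sup with ranch_conns_sup
%% and ranch_acceptors_sup (plus its acceptors, 10 by default) below it.
{ok, _} = application:ensure_all_started(ranch),
{ok, _} = ranch:start_listener(my_echo,
    ranch_tcp, [{port, 5555}],
    echo_protocol, []).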
ranch.app
The application's start module is ranch_app, which means ranch_app.erl is where we look to see how the application is started.
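For reference, the relevant entries of the application resource file look roughly like this (abridged sketch; the description and registered list are quoted from memory):

{application, ranch, [
    {description, "Socket acceptor pool for TCP protocols."},
    {registered, [ranch_sup, ranch_server]},
    {applications, [kernel, stdlib]},
    {mod, {ranch_app, []}},  %% application:start(ranch) calls ranch_app:start/2
    {env, []}
]}.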
ranch_app.erl
Based on application parameters it optionally enables profiling for testing; its main job is to start a ranch_sup supervisor process.
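The start callback is essentially a one-liner. A simplified sketch (the real module first checks the profile application env and may start the profiler before handing over to the supervisor):

-module(ranch_app).
-behaviour(application).
-export([start/2, stop/1]).

%% Simplified: profiling setup omitted.
start(_Type, _Args) ->
    ranch_sup:start_link().

stop(_State) ->
    ok.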
ranch_sup.erl
Creates an ets table named ranch_server, then starts and supervises the ranch_server process with a one_for_one strategy.
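Its init/1 looks roughly like this (a sketch; the restart intensity values are from memory):

init([]) ->
    %% A public named table so other ranch processes can read it directly.
    %% It is owned by ranch_sup, so the data survives ranch_server restarts.
    ranch_server = ets:new(ranch_server, [ordered_set, public, named_table]),
    Procs = [
        {ranch_server, {ranch_server, start_link, []},
            permanent, 5000, worker, [ranch_server]}
    ],
    {ok, {{one_for_one, 1, 5}, Procs}}.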
ranch_server.erl
Starts a process that manages the ranch_server ets table and provides a number of interfaces (a sketch of one of them follows the list):
set_new_listener_opts: stores the options of a new listener
set_connections_sup: records the pid of a new connections supervisor, monitors that process, and adds {MonitorRef, Pid} to #state.monitors
set_listener_sup: records the pid of a listener supervisor, monitors that process, and likewise adds {MonitorRef, Pid} to #state.monitors
set_addr: records the listening address in the ets table
set_max_conns: sets the maximum number of connections
set_trans_opts: sets the transport options
set_proto_opts: sets the protocol options
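To illustrate the pattern these calls follow, a handler for set_connections_sup might look like the clause below. This is a sketch based on the description above rather than the verbatim ranch_server source; ?TAB stands for the ranch_server table and #state{} for ranch_server's own state record:

handle_call({set_connections_sup, Ref, Pid}, _From,
        State=#state{monitors=Monitors}) ->
    %% Record the connections supervisor in the ets table and monitor it,
    %% so ranch_server notices when it goes down.
    true = ets:insert(?TAB, {{conns_sup, Ref}, Pid}),
    MonitorRef = erlang:monitor(process, Pid),
    {reply, ok,
        State#state{monitors=[{{MonitorRef, Pid}, {conns_sup, Ref}}|Monitors]}}.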
At this point the ranch application's groundwork is complete; all that remains is for outside code to call into it.
ranch.erl
This is the API module of the ranch application. Everything is set up through start_listener/6 (start_listener/5 just fills in the number of acceptors from TransOpts), which takes the functional parameters. One of them, Transport, is the name of ranch's transport module: either ranch_ssl or ranch_tcp. It starts a ranch_listener_sup process under ranch_sup; what that process does is covered in detail below. For now, the important point is that ranch's real work begins when the ranch_listener_sup process is started.
-spec start_listener(ref(), module(), any(), module(), any())
-> supervisor:startchild_ret().
start_listener(Ref, Transport, TransOpts, Protocol, ProtoOpts) ->
NumAcceptors = proplists:get_value(num_acceptors, TransOpts, 10),
start_listener(Ref, NumAcceptors, Transport, TransOpts, Protocol, ProtoOpts).
-spec start_listener(ref(), non_neg_integer(), module(), any(), module(), any())
-> supervisor:startchild_ret().
start_listener(Ref, NumAcceptors, Transport, TransOpts, Protocol, ProtoOpts)
when is_integer(NumAcceptors) andalso is_atom(Transport)
andalso is_atom(Protocol) ->
_ = code:ensure_loaded(Transport),
case erlang:function_exported(Transport, name, 0) of
false ->
{error, badarg};
true ->
Res = supervisor:start_child(ranch_sup, child_spec(Ref, NumAcceptors,
Transport, TransOpts, Protocol, ProtoOpts)),
Socket = proplists:get_value(socket, TransOpts),
case Res of
{ok, Pid} when Socket =/= undefined ->
%% Give ownership of the socket to ranch_acceptors_sup
%% to make sure the socket stays open as long as the
%% listener is alive. If the socket closes however there
%% will be no way to recover because we don't know how
%% to open it again.
Children = supervisor:which_children(Pid),
{_, AcceptorsSup, _, _}
= lists:keyfind(ranch_acceptors_sup, 1, Children),
%%% Note: the catch is here because SSL crashes when you change
%%% the controlling process of a listen socket because of a bug.
%%% The bug will be fixed in R16.
catch Transport:controlling_process(Socket, AcceptorsSup);
_ ->
ok
end,
maybe_started(Res)
end.
-spec child_spec(ref(), module(), any(), module(), any())
-> supervisor:child_spec().
child_spec(Ref, Transport, TransOpts, Protocol, ProtoOpts) ->
NumAcceptors = proplists:get_value(num_acceptors, TransOpts, 10),
child_spec(Ref, NumAcceptors, Transport, TransOpts, Protocol, ProtoOpts).
-spec child_spec(ref(), non_neg_integer(), module(), any(), module(), any())
-> supervisor:child_spec().
child_spec(Ref, NumAcceptors, Transport, TransOpts, Protocol, ProtoOpts)
when is_integer(NumAcceptors) andalso is_atom(Transport)
andalso is_atom(Protocol) ->
{{ranch_listener_sup, Ref}, {ranch_listener_sup, start_link, [
Ref, NumAcceptors, Transport, TransOpts, Protocol, ProtoOpts
]}, permanent, infinity, supervisor, [ranch_listener_sup]}.
ranch_listener_sup.erl
When this supervisor starts, it calls ranch_server:set_listener_sup/2 so that its pid is recorded in the ets table and monitored by ranch_server. Under it, it starts ranch_conns_sup and then ranch_acceptors_sup, in that order, with a rest_for_one strategy. ranch_conns_sup supervises the connection processes, while ranch_acceptors_sup supervises the acceptors. If ranch_conns_sup dies, all connections are gone and the acceptors under ranch_acceptors_sup can no longer do their job (they hand every accepted socket to ranch_conns_sup), so they can only work normally again after ranch_conns_sup has been restarted.
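A simplified sketch of its init/1 (the exact child arguments vary between ranch versions, but the start order and restart strategy are the point here; ProtoOpts and MaxConns are stored in ranch_server by start_link, via set_new_listener_opts, before init runs):

init({Ref, NumAcceptors, Transport, TransOpts, Protocol}) ->
    %% Register ourselves with ranch_server, which also monitors us.
    ok = ranch_server:set_listener_sup(Ref, self()),
    ChildSpecs = [
        %% Started first: supervises the connection processes.
        {ranch_conns_sup, {ranch_conns_sup, start_link,
                [Ref, Transport, Protocol]},
            permanent, infinity, supervisor, [ranch_conns_sup]},
        %% Started second: supervises the acceptors, and is restarted
        %% together with them whenever ranch_conns_sup dies (rest_for_one).
        {ranch_acceptors_sup, {ranch_acceptors_sup, start_link,
                [Ref, NumAcceptors, Transport, TransOpts]},
            permanent, infinity, supervisor, [ranch_acceptors_sup]}
    ],
    {ok, {{rest_for_one, 1, 5}, ChildSpecs}}.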
ranch_conns_sup.erl
This module is not a supervisor behaviour; the author hand-rolled something supervisor-like instead. On startup it calls ranch_server:set_connections_sup/2 to record its own pid and fetches the relevant connection parameters from ranch_server. It uses proc_lib:init_ack/2 to answer proc_lib:start_link/3, so the process starts synchronously, achieving the same effect as a gen_server. It then enters the loop/4 function to handle messages; the main ones are listed below.
{?MODULE, start_protocol, To, Socket}: To is the pid of the ranch_acceptor process and Socket is the client socket that acceptor received. The loop calls Protocol:start_link/4 to start a connection process; Protocol is the user-supplied callback module, usually the process that receives and handles the socket's data, like echo_protocol.erl or reverse_protocol.erl in the examples. If the start succeeds, shoot/8 is called to change the socket's controlling process to the callback process, i.e. this is where it is decided which process the socket's messages are delivered to; after the transfer, the callback process is sent a {shoot, Ref, Transport, Socket, AckTimeout} message. The loop then checks whether the current connection count has reached MaxConns from the configuration: if it has, the acceptor (To) is put on the sleepers list instead of being told to accept again. Either way the child count is incremented and loop/4 continues.
{?MODULE, active_connections, To, Tag}: replies to the calling process To with the current number of connections
{remove_connection, Ref, Pid}: removes a connection from the connection count
-spec init(pid(), ranch:ref(), module(), module()) -> no_return().
init(Parent, Ref, Transport, Protocol) ->
process_flag(trap_exit, true),
ok = ranch_server:set_connections_sup(Ref, self()),
MaxConns = ranch_server:get_max_connections(Ref),
TransOpts = ranch_server:get_transport_options(Ref),
ConnType = proplists:get_value(connection_type, TransOpts, worker),
Shutdown = proplists:get_value(shutdown, TransOpts, 5000),
AckTimeout = proplists:get_value(ack_timeout, TransOpts, 5000),
ProtoOpts = ranch_server:get_protocol_options(Ref),
ok = proc_lib:init_ack(Parent, {ok, self()}),
loop(#state{parent=Parent, ref=Ref, conn_type=ConnType,
shutdown=Shutdown, transport=Transport, protocol=Protocol,
opts=ProtoOpts, ack_timeout=AckTimeout, max_conns=MaxConns}, 0, 0, []).
loop(State=#state{parent=Parent, ref=Ref, conn_type=ConnType,
transport=Transport, protocol=Protocol, opts=Opts,
max_conns=MaxConns}, CurConns, NbChildren, Sleepers) ->
receive
{?MODULE, start_protocol, To, Socket} ->
try Protocol:start_link(Ref, Socket, Transport, Opts) of
{ok, Pid} ->
shoot(State, CurConns, NbChildren, Sleepers, To, Socket, Pid, Pid);
{ok, SupPid, ProtocolPid} when ConnType =:= supervisor ->
shoot(State, CurConns, NbChildren, Sleepers, To, Socket, SupPid, ProtocolPid);
Ret ->
To ! self(),
error_logger:error_msg(
"Ranch listener ~p connection process start failure; "
"~p:start_link/4 returned: ~999999p~n",
[Ref, Protocol, Ret]),
Transport:close(Socket),
loop(State, CurConns, NbChildren, Sleepers)
catch Class:Reason ->
To ! self(),
error_logger:error_msg(
"Ranch listener ~p connection process start failure; "
"~p:start_link/4 crashed with reason: ~p:~999999p~n",
[Ref, Protocol, Class, Reason]),
loop(State, CurConns, NbChildren, Sleepers)
end;
{?MODULE, active_connections, To, Tag} ->
To ! {Tag, CurConns},
loop(State, CurConns, NbChildren, Sleepers);
%% Remove a connection from the count of connections.
{remove_connection, Ref, Pid} ->
case put(Pid, removed) of
active ->
loop(State, CurConns - 1, NbChildren, Sleepers);
remove ->
loop(State, CurConns, NbChildren, Sleepers);
undefined ->
_ = erase(Pid),
loop(State, CurConns, NbChildren, Sleepers)
end;
%% Upgrade the max number of connections allowed concurrently.
%% We resume all sleeping acceptors if this number increases.
{set_max_conns, MaxConns2} when MaxConns2 > MaxConns ->
_ = [To ! self() || To <- Sleepers],
loop(State#state{max_conns=MaxConns2},
CurConns, NbChildren, []);
{set_max_conns, MaxConns2} ->
loop(State#state{max_conns=MaxConns2},
CurConns, NbChildren, Sleepers);
%% Upgrade the protocol options.
{set_opts, Opts2} ->
loop(State#state{opts=Opts2},
CurConns, NbChildren, Sleepers);
{'EXIT', Parent, Reason} ->
terminate(State, Reason, NbChildren);
{'EXIT', Pid, Reason} when Sleepers =:= [] ->
case erase(Pid) of
active ->
report_error(Ref, Protocol, Pid, Reason),
loop(State, CurConns - 1, NbChildren - 1, Sleepers);
removed ->
report_error(Ref, Protocol, Pid, Reason),
loop(State, CurConns, NbChildren - 1, Sleepers);
undefined ->
loop(State, CurConns, NbChildren, Sleepers)
end;
%% Resume a sleeping acceptor if needed.
{'EXIT', Pid, Reason} ->
case erase(Pid) of
active when CurConns > MaxConns ->
report_error(Ref, Protocol, Pid, Reason),
loop(State, CurConns - 1, NbChildren - 1, Sleepers);
active ->
report_error(Ref, Protocol, Pid, Reason),
[To|Sleepers2] = Sleepers,
To ! self(),
loop(State, CurConns - 1, NbChildren - 1, Sleepers2);
removed ->
report_error(Ref, Protocol, Pid, Reason),
loop(State, CurConns, NbChildren - 1, Sleepers);
undefined ->
loop(State, CurConns, NbChildren, Sleepers)
end;
{system, From, Request} ->
sys:handle_system_msg(Request, From, Parent, ?MODULE, [],
{State, CurConns, NbChildren, Sleepers});
%% Calls from the supervisor module.
{'$gen_call', {To, Tag}, which_children} ->
Children = [{Protocol, Pid, ConnType, [Protocol]}
|| {Pid, Type} <- get(),
Type =:= active orelse Type =:= removed],
To ! {Tag, Children},
loop(State, CurConns, NbChildren, Sleepers);
{'$gen_call', {To, Tag}, count_children} ->
Counts = case ConnType of
worker -> [{supervisors, 0}, {workers, NbChildren}];
supervisor -> [{supervisors, NbChildren}, {workers, 0}]
end,
Counts2 = [{specs, 1}, {active, NbChildren}|Counts],
To ! {Tag, Counts2},
loop(State, CurConns, NbChildren, Sleepers);
{'$gen_call', {To, Tag}, _} ->
To ! {Tag, {error, ?MODULE}},
loop(State, CurConns, NbChildren, Sleepers);
Msg ->
error_logger:error_msg(
"Ranch listener ~p received unexpected message ~p~n",
[Ref, Msg]),
loop(State, CurConns, NbChildren, Sleepers)
end.
shoot(State=#state{ref=Ref, transport=Transport, ack_timeout=AckTimeout, max_conns=MaxConns},
CurConns, NbChildren, Sleepers, To, Socket, SupPid, ProtocolPid) ->
case Transport:controlling_process(Socket, ProtocolPid) of
ok ->
ProtocolPid ! {shoot, Ref, Transport, Socket, AckTimeout},
put(SupPid, active),
CurConns2 = CurConns + 1,
if CurConns2 < MaxConns ->
To ! self(),
loop(State, CurConns2, NbChildren + 1, Sleepers);
true ->
loop(State, CurConns2, NbChildren + 1, [To|Sleepers])
end;
{error, _} ->
Transport:close(Socket),
%% Only kill the supervised pid, because the connection's pid,
%% when different, is supposed to be sitting under it and linked.
exit(SupPid, kill),
To ! self(),
loop(State, CurConns, NbChildren, Sleepers)
end.
ranch_acceptors_sup.erl
Fetches the ranch_conns_sup pid from ranch_server and gets the transport options TransOpts. If no listening socket was supplied, it opens one itself, records the listening address in ranch_server, and then starts NumAcceptors ranch_acceptor child processes.
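A simplified sketch of its init/1 (the real code also strips ranch-specific keys from TransOpts before calling Transport:listen/1 and reports listen errors; both are omitted here):

init([Ref, NumAcceptors, Transport, TransOpts]) ->
    ConnsSup = ranch_server:get_connections_sup(Ref),
    %% Reuse a socket passed in the transport options, or open the
    %% listening socket here so that this supervisor owns it.
    LSocket = case proplists:get_value(socket, TransOpts) of
        undefined ->
            {ok, Socket} = Transport:listen(TransOpts),
            Socket;
        Socket ->
            Socket
    end,
    {ok, Addr} = Transport:sockname(LSocket),
    ranch_server:set_addr(Ref, Addr),
    %% One child spec per acceptor, all sharing the same listening socket.
    Procs = [
        {{acceptor, self(), N}, {ranch_acceptor, start_link, [
            LSocket, Transport, ConnsSup
        ]}, permanent, brutal_kill, worker, []}
        || N <- lists:seq(1, NumAcceptors)],
    {ok, {{one_for_one, 1, 5}, Procs}}.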
ranch_acceptor.erl
Starts a loop/3 loop. After accepting a client socket it changes the socket's controlling process to the connections supervisor ranch_conns_sup (which has the corresponding message handling), then calls ranch_conns_sup:start_protocol/2, which sends {?MODULE, start_protocol, self(), Socket}. The ranch_conns_sup process handles that message itself; see the ranch_conns_sup.erl section for details. At this point everything on ranch's listening side is ready. (One part was glossed over: a ranch_protocol behaviour implementation is still needed to actually handle client data; a minimal example follows the loop code below.) All that is missing now is a client connecting.
-spec loop(inet:socket(), module(), pid()) -> no_return().
loop(LSocket, Transport, ConnsSup) ->
_ = case Transport:accept(LSocket, infinity) of
{ok, CSocket} ->
case Transport:controlling_process(CSocket, ConnsSup) of
ok ->
%% This call will not return until process has been started
%% AND we are below the maximum number of connections.
ranch_conns_sup:start_protocol(ConnsSup, CSocket);
{error, _} ->
Transport:close(CSocket)
end;
%% Reduce the accept rate if we run out of file descriptors.
%% We can't accept anymore anyway, so we might as well wait
%% a little for the situation to resolve itself.
{error, emfile} ->
error_logger:warning_msg("Ranch acceptor reducing accept rate: out of file descriptors~n"),
receive after 100 -> ok end;
%% We want to crash if the listening socket got closed.
{error, Reason} when Reason =/= closed ->
ok
end,
flush(),
?MODULE:loop(LSocket, Transport, ConnsSup).
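To close the loop, here is a minimal ranch_protocol callback module, modeled on the echo_protocol example shipped with ranch. It targets the same ranch version as the code above: ranch:accept_ack/1 waits for the {shoot, Ref, Transport, Socket, AckTimeout} message that ranch_conns_sup sends after transferring socket ownership:

-module(echo_protocol).
-behaviour(ranch_protocol).

-export([start_link/4]).
-export([init/4]).

start_link(Ref, Socket, Transport, Opts) ->
    Pid = spawn_link(?MODULE, init, [Ref, Socket, Transport, Opts]),
    {ok, Pid}.

init(Ref, Socket, Transport, _Opts) ->
    %% Blocks until ranch_conns_sup has made us the controlling process.
    ok = ranch:accept_ack(Ref),
    loop(Socket, Transport).

loop(Socket, Transport) ->
    case Transport:recv(Socket, 0, 5000) of
        {ok, Data} ->
            %% Echo the data back to the client and keep reading.
            Transport:send(Socket, Data),
            loop(Socket, Transport);
        _ ->
            ok = Transport:close(Socket)
    end.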