Understanding pgpool-II's child process (Part 2)
Continuing the analysis of the do_child function in pgpool-II's child.c, I added some logging to it, as follows:
/*
 * child main loop
 */
void do_child(int unix_fd, int inet_fd)
{
    ……
    for (;;)
    {
        ……
        /* perform accept() */
        frontend = do_accept(unix_fd, inet_fd, &timeout);

        /*added by gaojian begin*/
        if (frontend==NULL)
        {
            pool_log("frontend is NULL");
        }
        else
        {
            pool_log("frontend is not NULL");
            pool_log("I am %d", getpid());
        }
        /*added by gaojian end*/

        if (frontend == NULL)   /* connection request from frontend timed out */
        {
            /* check select() timeout */
            if (connected && pool_config->child_life_time > 0 &&
                timeout.tv_sec == 0 && timeout.tv_usec == 0)
            {
                pool_debug("child life %d seconds expired", pool_config->child_life_time);
                /*
                 * Doesn't need to call this. child_exit() calls it.
                 * send_frontend_exits();
                 */
                child_exit(2);
            }
            continue;
        }
        ……
    }
    child_exit(0);
}
After recompiling and starting pgpool, I opened one client. The log shows:
LOG: pid 16210: frontend is not NULL.
LOG: pid 16210: I am 16210
Opening a second client gives:
LOG: pid 16212: frontend is not NULL.
LOG: pid 16212: I am 16212
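In other words, each new client connection wakes up one child process.
(The wake-up happens inside do_accept. Every child calls do_accept, but which one is actually woken for a given connection is presumably decided by some selection rule.)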
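To make the "everyone calls do_accept, one gets woken" behaviour concrete, here is a minimal pre-fork sketch in C. It is not pgpool-II code: child_loop and prefork_demo are hypothetical names, and the children block directly in accept() on a loopback TCP socket instead of going through pgpool's do_accept(). In this sketch the choice of which child receives a connection is left entirely to the operating system.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical stand-in for do_child(): block in accept() on the shared
 * listening socket and report our pid to every client we get. */
static void child_loop(int listen_fd)
{
    for (;;) {
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0) {
            if (errno == EINTR)
                continue;           /* interrupted: just retry */
            _exit(1);
        }
        pid_t me = getpid();
        ssize_t n = write(conn, &me, sizeof(me));
        (void) n;                   /* best effort in this demo */
        close(conn);
    }
}

/* Fork nchildren workers on one listening socket, connect nclients times,
 * and store in served_by[i] the pid that answered client i.
 * Returns the number of clients answered, or -1 on a setup error. */
int prefork_demo(int nchildren, int nclients, pid_t *children, pid_t *served_by)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int started = 0, served = 0;

    assert(children != NULL && served_by != NULL);
    if (nchildren <= 0)
        return -1;

    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the kernel pick a free port */
    if (bind(listen_fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(listen_fd, 16) < 0 ||
        getsockname(listen_fd, (struct sockaddr *) &addr, &len) < 0) {
        close(listen_fd);
        return -1;
    }

    /* All children inherit the same listen_fd and wait on it together. */
    for (; started < nchildren; started++) {
        pid_t pid = fork();
        if (pid < 0)
            break;
        if (pid == 0)
            child_loop(listen_fd);  /* never returns */
        children[started] = pid;
    }

    /* Each connection is picked up by exactly one of the waiting children. */
    for (int i = 0; started == nchildren && i < nclients; i++) {
        int fd = socket(AF_INET, SOCK_STREAM, 0);
        served_by[i] = 0;
        if (fd >= 0 &&
            connect(fd, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
            read(fd, &served_by[i], sizeof(pid_t)) == (ssize_t) sizeof(pid_t))
            served++;
        if (fd >= 0)
            close(fd);
    }

    for (int i = 0; i < started; i++) {
        kill(children[i], SIGKILL);
        waitpid(children[i], NULL, 0);
    }
    close(listen_fd);
    return started == nchildren ? served : -1;
}
```

Calling prefork_demo(3, 4, children, served_by) with arrays of at least 3 and 4 pid_t entries fills served_by with the pid that answered each of the four connections. The same pid may or may not appear more than once, which matches the observation that the responding child is not fixed.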
Next, run kill -9 16210, then execute another SQL statement in the client that was being served by that child:
select * from a5;
The output is:
postgres=#select * from a5;
server closed the connection unexpectedly
        This probably means the server terminated abnormally
        before or while processing the request.
The connection to the server was lost. Attempting reset: Succeeded.
postgres=#
Now look at the pgpool log again:
LOG: pid 16216: frontend is not NULL.
LOG: pid 16216: I am 16216
This is because the client's "Attempting reset" opened a new connection, which woke up another child process.
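Two conclusions follow:
Conclusion 1: Every child runs do_child and calls do_accept inside it, so in effect they all compete for incoming connections. Which one ends up serving a given client is not fixed.
Conclusion 2: If a child dies unexpectedly, the client's reconnect attempt wakes up another child, which then continues serving that client.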
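The kill -9 experiment can be reproduced in miniature with the sketch below. Again, this is only an illustration under my own assumptions, not pgpool-II code: session_child, connect_and_greet and reconnect_demo are hypothetical names, there are only two children, and the "client" is simply the parent process. The client connects, learns which child serves it, kills that child, sees the old connection die, and reconnects to the surviving child.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <signal.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical session loop: announce our pid, then hold the connection
 * open until the client goes away. */
static void session_child(int listen_fd)
{
    for (;;) {
        char buf[64];
        int conn = accept(listen_fd, NULL, NULL);
        if (conn < 0)
            _exit(1);
        pid_t me = getpid();
        if (write(conn, &me, sizeof(me)) == (ssize_t) sizeof(me))
            while (read(conn, buf, sizeof(buf)) > 0)
                ;                   /* discard client input until EOF */
        close(conn);
    }
}

/* Connect and read the pid of the child that accepted us.
 * Returns the connected socket, or -1 on failure. */
static int connect_and_greet(const struct sockaddr_in *addr, pid_t *pid)
{
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;
    if (connect(fd, (const struct sockaddr *) addr, sizeof(*addr)) < 0 ||
        read(fd, pid, sizeof(*pid)) != (ssize_t) sizeof(*pid)) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Returns 0 if, after killing the first child, a reconnect was served
 * by a different child; -1 otherwise. */
int reconnect_demo(pid_t *first, pid_t *second)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    pid_t kids[2] = {0, 0};
    char c;
    int ok = -1;

    assert(first != NULL && second != NULL);
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    if (listen_fd < 0)
        return -1;
    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);  /* port 0: kernel picks */
    if (bind(listen_fd, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(listen_fd, 16) < 0 ||
        getsockname(listen_fd, (struct sockaddr *) &addr, &len) < 0) {
        close(listen_fd);
        return -1;
    }
    for (int i = 0; i < 2; i++) {
        kids[i] = fork();
        if (kids[i] == 0)
            session_child(listen_fd);   /* never returns */
    }

    int fd = (kids[0] > 0 && kids[1] > 0) ? connect_and_greet(&addr, first) : -1;
    if (fd >= 0) {
        kill(*first, SIGKILL);          /* the "kill -9" step */
        waitpid(*first, NULL, 0);
        for (int i = 0; i < 2; i++)
            if (kids[i] == *first)
                kids[i] = 0;            /* already reaped */
        /* The old connection is dead: read() reports EOF or an error. */
        if (read(fd, &c, 1) <= 0) {
            close(fd);
            fd = connect_and_greet(&addr, second);  /* the "reset" step */
            if (fd >= 0 && *second != *first)
                ok = 0;
        }
        if (fd >= 0)
            close(fd);
    }
    for (int i = 0; i < 2; i++) {
        if (kids[i] > 0) {
            kill(kids[i], SIGKILL);
            waitpid(kids[i], NULL, 0);
        }
    }
    close(listen_fd);
    return ok;
}
```

After a successful reconnect_demo(&first, &second), first holds the pid of the killed child and second the pid of the child that took over, mirroring the 16210 to 16216 hand-over seen in the log above.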