pgpool-II 3.1 Memory Leaks (Part 6)
Honing technical skill, practicing the way of data, pursuing outstanding value
[Author: 高健 @ 博客园 (cnblogs), luckyjackgao@gmail.com]
Continued from the previous post. Case C: save_ps_display_args calls malloc
==27927== 1,602 (256 direct, 1,346 indirect) bytes in 1 blocks are definitely lost in loss record 92 of 100
==27927== at 0x4A05E1C: malloc (vg_replace_malloc.c:195)
==27927== by 0x434605: save_ps_display_args (ps_status.c:173)
==27927== by 0x403ECA: fork_a_child (main.c:1066)
==27927== by 0x406C00: main (main.c:550)
==27927==
The log above shows the following call chain:
main -> fork_a_child -> save_ps_display_args -> malloc
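In a valgrind leak record, "direct" bytes are the lost block itself, and "indirect" bytes are blocks that can only be reached through pointers stored inside that lost block. A malloc'ed pointer array filled with strdup'ed strings produces exactly this shape. The sketch below is not pgpool code: copy_string_vector is a hypothetical helper that copies a NULL-terminated string vector in the same style and reports both byte counts. If pointers are 8 bytes on the machine that produced the log (an assumption), 256 direct bytes would correspond to an array of 32 pointer slots.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pgpool): copy a NULL-terminated
 * string vector with malloc() + strdup(), and report how many bytes go
 * into the pointer array ("direct") and into the strings ("indirect").
 * The two counters are only meaningful when the function succeeds.
 */
char **copy_string_vector(char *const *src, size_t *direct, size_t *indirect)
{
    size_t n = 0;
    size_t i;
    char **copy;

    assert(src != NULL && direct != NULL && indirect != NULL);

    /* count the entries before the terminating NULL */
    while (src[n] != NULL)
        n++;

    copy = malloc((n + 1) * sizeof(char *));
    if (copy == NULL)
        return NULL;

    *direct = (n + 1) * sizeof(char *);
    *indirect = 0;

    for (i = 0; i < n; i++)
    {
        copy[i] = strdup(src[i]);
        if (copy[i] == NULL)
        {
            /* undo the partial copy so nothing is left behind */
            while (i > 0)
                free(copy[--i]);
            free(copy);
            return NULL;
        }
        *indirect += strlen(src[i]) + 1;    /* string plus its '\0' */
    }
    copy[n] = NULL;
    return copy;
}
```

If the only pointer to the returned array is dropped, a leak checker would be expected to report the array as directly lost and every string as indirectly lost.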
The relevant logic of each function is shown below.
The main function:
/*
 * pgpool main program
 */
int main(int argc, char **argv){
    ……
    myargc = argc;
    myargv = argv;
    ……
    /*
     * We need to block signal here. Otherwise child might send some
     * signals, for example SIGUSR1(fail over). Children will inherit
     * signal blocking but they do unblock signals at the very beginning
     * of process. So this is harmless.
     */
    POOL_SETMASK(&BlockSig);
    /* fork the children */
    for (i=0;i<pool_config->num_init_children;i++){
        process_info[i].pid = fork_a_child(unix_fd, inet_fd, i);
        process_info[i].start_time = time(NULL);
    }
    /* set up signal handlers */
    pool_signal(SIGTERM, exit_handler);
    pool_signal(SIGINT, exit_handler);
    pool_signal(SIGQUIT, exit_handler);
    pool_signal(SIGCHLD, reap_handler);
    pool_signal(SIGUSR1, failover_handler);
    pool_signal(SIGUSR2, wakeup_handler);
    pool_signal(SIGHUP, reload_config_handler);
    /* create pipe for delivering event */
    if (pipe(pipe_fds) < 0){
        pool_error("failed to create pipe");
        myexit(1);
    }
    ……
    /*
     * This is the main loop
     */
    for (;;)
    {
        ……
    }
    pool_shmem_exit(0);
}
The fork_a_child function:
/*
 * fork a child
 */
pid_t fork_a_child(int unix_fd, int inet_fd, int id)
{
    pid_t pid;
    pid = fork();
    if (pid == 0)
    {
        /* Before we unconditionally closed pipe_fds[0] and pipe_fds[1]
         * here, which is apparently wrong since in the start up of
         * pgpool, pipe(2) is not called yet and it mistakenly closes
         * fd 0. Now we check the fd > 0 before close(), expecting
         * pipe returns fds greater than 0. Note that we cannot
         * unconditionally remove close(2) calls since fork_a_child()
         * may be called *after* pgpool starting up.
         */
        if (pipe_fds[0] > 0)
        {
            close(pipe_fds[0]);
            close(pipe_fds[1]);
        }
        myargv = save_ps_display_args(myargc, myargv);
        /* call child main */
        POOL_SETMASK(&UnBlockSig);
        reload_config_request = 0;
        my_proc_id = id;
        run_as_pcp_child = false;
        do_child(unix_fd, inet_fd);
    }
    else if (pid == -1)
    {
        pool_error("fork() failed. reason: %s", strerror(errno));
        myexit(1);
    }
    return pid;
}
The save_ps_display_args function:
/*
 * Call this early in startup to save the original argc/argv values.
 * If needed, we make a copy of the original argv[] array to preserve it
 * from being clobbered by subsequent ps_display actions.
 *
 * (The original argv[] will not be overwritten by this routine, but may be
 * overwritten during init_ps_display. Also, the physical location of the
 * environment strings may be moved, so this should be called before any code
 * that might try to hang onto a getenv() result.)
 */
char **
save_ps_display_args(int argc, char **argv)
{
    save_argc = argc;
    save_argv = argv;
#if defined(PS_USE_CLOBBER_ARGV)
    /*
     * If we're going to overwrite the argv area, count the available space.
     * Also move the environment to make additional room.
     */
    {
        char *end_of_area = NULL;
        char **new_environ;
        int i;
        /*
         * check for contiguous argv strings
         */
        for (i = 0; i < argc; i++)
        {
            if (i == 0 || end_of_area + 1 == argv[i])
                end_of_area = argv[i] + strlen(argv[i]);
        }
        if (end_of_area == NULL) /* probably can't happen? */
        {
            ps_buffer = NULL;
            ps_buffer_size = 0;
            return argv;
        }
        /*
         * check for contiguous environ strings following argv
         */
        for (i = 0; environ[i] != NULL; i++)
        {
            if (end_of_area + 1 == environ[i])
                end_of_area = environ[i] + strlen(environ[i]);
        }
        ps_buffer = argv[0];
        ps_buffer_size = end_of_area - argv[0];
        /*
         * move the environment out of the way
         */
        new_environ = (char **) malloc((i + 1) * sizeof(char *));
        for (i = 0; environ[i] != NULL; i++)
            new_environ[i] = strdup(environ[i]);
        new_environ[i] = NULL;
        environ = new_environ;
    }
#endif /* PS_USE_CLOBBER_ARGV */
#if defined(PS_USE_CHANGE_ARGV) || defined(PS_USE_CLOBBER_ARGV)
    /*
     * If we're going to change the original argv[] then make a copy for
     * argument parsing purposes.
     *
     * (NB: do NOT think to remove the copying of argv[], even though
     * postmaster.c finishes looking at argv[] long before we ever consider
     * changing the ps display. On some platforms, getopt() keeps pointers
     * into the argv array, and will get horribly confused when it is
     * re-called to analyze a subprocess' argument string if the argv storage
     * has been clobbered meanwhile. Other platforms have other dependencies
     * on argv[].
     */
    {
        char **new_argv;
        int i;
        new_argv = (char **) malloc((argc + 1) * sizeof(char *));
        for (i = 0; i < argc; i++)
            new_argv[i] = strdup(argv[i]);
        new_argv[argc] = NULL;
#if defined(__darwin__)
        /*
         * Darwin (and perhaps other NeXT-derived platforms?) has a static
         * copy of the argv pointer, which we may fix like so:
         */
        *_NSGetArgv() = new_argv;
#endif
        argv = new_argv;
    }
#endif /* PS_USE_CHANGE_ARGV or PS_USE_CLOBBER_ARGV */
    return argv;
}
From the logic of these functions, we can see that save_ps_display_args allocates memory with malloc and never frees it.
First:
If the condition #if defined(PS_USE_CLOBBER_ARGV) holds, the following code is executed:
    /*
     * move the environment out of the way
     */
    new_environ = (char **) malloc((i + 1) * sizeof(char *));
    for (i = 0; environ[i] != NULL; i++)
        new_environ[i] = strdup(environ[i]);
    new_environ[i] = NULL;
    environ = new_environ;
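If execution takes this branch, the memory obtained through malloc and strdup is indeed never fully released. (A small amount of memory is lost when pgpool exits.)
If the condition #if defined(PS_USE_CHANGE_ARGV) || defined(PS_USE_CLOBBER_ARGV) holds, the following code is executed: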
    new_argv = (char **) malloc((argc + 1) * sizeof(char *));
    for (i = 0; i < argc; i++)
        new_argv[i] = strdup(argv[i]);
    new_argv[argc] = NULL;
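On the one hand, the memory allocated by the strdup calls is never fully released. (A small amount of memory is lost when pgpool exits.)
On the other hand,
    argv = new_argv;
has to be returned to the calling function, so it cannot be freed at this point.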
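Because the copy has to outlive the function, any cleanup would have to happen later, for example when the process shuts down. A minimal sketch of what that cleanup could look like follows. Both function names are hypothetical and do not exist in pgpool. The sketch also assumes nothing else still holds pointers into the vector when it is freed.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of pgpool): free a NULL-terminated
 * vector that was built with malloc() + strdup().
 * Returns the number of strings that were freed.
 */
size_t free_string_vector(char **vec)
{
    size_t n = 0;

    if (vec == NULL)
        return 0;
    for (; vec[n] != NULL; n++)
        free(vec[n]);       /* each strdup'ed string */
    free(vec);              /* the pointer array itself */
    return n;
}

/*
 * Demonstration only: copy argv the same way the excerpt above does,
 * then release the copy again. Returns the number of strings released.
 */
size_t copy_and_release_argv(int argc, char **argv)
{
    char **new_argv;
    int i;

    assert(argc >= 0 && argv != NULL);

    new_argv = malloc(((size_t) argc + 1) * sizeof(char *));
    if (new_argv == NULL)
        return 0;
    for (i = 0; i < argc; i++)
    {
        new_argv[i] = strdup(argv[i]);
        if (new_argv[i] == NULL)
            break;          /* stop at the first failed copy */
    }
    new_argv[i] = NULL;     /* terminate, even after a failed strdup */
    return free_string_vector(new_argv);
}
```

The same helper would also fit the new_environ vector, but only if environ no longer points at it and no getenv() result is still in use.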
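In fork_a_child, save_ps_display_args is called each time a child process is created.
If the configuration file sets the initial number of child processes to 128, save_ps_display_args is called 128 times, once in each child.
Its behavior at the start of each child process appears to be identical, so repeating the call is somewhat wasteful of memory.
It would be better to move the save_ps_display_args call to before the fork() operation, so that running it once is enough.
Still, if that is all there is to it, the waste can be tolerated.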
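The "run it once" idea can be sketched as a guard around the copy. The code below is an illustration under assumptions, not a patch for pgpool: save_args_once and save_args_copies_made are hypothetical names, and whether calling the real function before fork() is safe for pgpool's ps display handling has not been verified here.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical cache of the one and only copy (not pgpool state). */
static char **saved_argv_copy = NULL;
static int copies_made = 0;

/*
 * Copy argv on the first call and hand back the cached copy on every
 * later call. Falls back to the original argv if allocation fails.
 */
char **save_args_once(int argc, char **argv)
{
    char **new_argv;
    int i;

    if (saved_argv_copy != NULL)
        return saved_argv_copy;     /* already copied: nothing to do */

    assert(argc >= 0 && argv != NULL);

    new_argv = malloc(((size_t) argc + 1) * sizeof(char *));
    if (new_argv == NULL)
        return argv;
    for (i = 0; i < argc; i++)
    {
        new_argv[i] = strdup(argv[i]);
        if (new_argv[i] == NULL)
        {
            /* undo the partial copy and keep using the original */
            while (i > 0)
                free(new_argv[--i]);
            free(new_argv);
            return argv;
        }
    }
    new_argv[argc] = NULL;

    saved_argv_copy = new_argv;
    copies_made++;
    return saved_argv_copy;
}

/* Number of times a real copy was made (for illustration and tests). */
int save_args_copies_made(void)
{
    return copies_made;
}
```

If such a function were called before the first fork(), each child would inherit the already-filled cache, so later calls inside the children would return immediately instead of allocating again.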
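One point is still uncertain:
If fork_a_child runs only during system initialization, then at worst some memory is wasted. It would not cause memory usage to keep growing the longer the system runs.
If some condition causes fork_a_child to be triggered again after the system has been running for a while, memory will be wasted again each time.
(For example, failover appears to regenerate all of the child processes.)
In that situation, a memory leak can occur.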
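One observation that may help with this open question: in the fork_a_child listing, the save_ps_display_args call sits inside the pid == 0 branch, so the copies are made in each child's own address space, and a process's memory is returned to the operating system when that process exits. The sketch below uses hypothetical names and a simple byte counter to show that an allocation made in a child after fork() does not change the parent's state. It says nothing about how pgpool's parent process actually behaves over time. That would still need to be measured on the parent itself.

```c
#define _POSIX_C_SOURCE 200809L  /* for strdup(), fork(), waitpid() */
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical per-process counter of bytes allocated via tracked_strdup(). */
static size_t bytes_allocated = 0;

/* Duplicate s and record its size in the calling process's counter. */
static char *tracked_strdup(const char *s)
{
    char *copy = strdup(s);

    if (copy != NULL)
        bytes_allocated += strlen(s) + 1;
    return copy;
}

/*
 * Fork a child that allocates a string and exits without freeing it.
 * Returns the parent's counter after the child has exited,
 * or (size_t) -1 if fork() or waitpid() fails.
 */
size_t parent_bytes_after_child_alloc(void)
{
    int status = 0;
    pid_t pid = fork();

    if (pid == -1)
        return (size_t) -1;

    if (pid == 0)
    {
        /* child: allocate, never free, then exit */
        (void) tracked_strdup("PGPOOL_EXAMPLE=1");
        _exit(bytes_allocated > 0 ? 0 : 1);
    }

    /* parent: wait for the child, then look at our own counter */
    if (waitpid(pid, &status, 0) == -1)
        return (size_t) -1;
    assert(WIFEXITED(status) && WEXITSTATUS(status) == 0);
    return bytes_allocated;
}
```

The child's exit status confirms that its own counter went up, while the value returned in the parent reflects only what the parent itself allocated.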