PG Process Structure and Memory Structure

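This article covers the process structure and memory structure of the PostgreSQL database (PG for short). The physical structure will be covered in a later post.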

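The figure above shows PG's process structure, memory structure, and part of its physical structure. Its contents fall into two parts:

Allocated when PG starts
Allocated when an application connects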

Process structure and memory structure at PG startup

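PG is a client/server application that uses a process-per-user model: each client connection is served by its own process. When PG starts, it launches several processes, namely a main process and a set of auxiliary processes. Before going into detail, let's run a test. Here is the session log: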

[postgres@CHENZX-DB01 ~]$ ps -ef|grep postgres
root     15592 15571  0 14:23 pts/1    00:00:00 su - postgres
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 15684 15593  0 14:24 pts/1    00:00:00 ps -ef
postgres 15685 15593  0 14:24 pts/1    00:00:00 grep --color=auto postgres
[postgres@CHENZX-DB01 ~]$ pg_ctl start
server starting
[postgres@CHENZX-DB01 ~]$ 
[postgres@CHENZX-DB01 ~]$ 
[postgres@CHENZX-DB01 ~]$ ps -ef|grep postgres
root     15592 15571  0 14:23 pts/1    00:00:00 su - postgres
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 15712     1  0 14:25 pts/1    00:00:00 /postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata
postgres 15713 15712  0 14:25 ?        00:00:00 postgres: logger process   
postgres 15715 15712  0 14:25 ?        00:00:00 postgres: checkpointer process   
postgres 15716 15712  0 14:25 ?        00:00:00 postgres: writer process   
postgres 15717 15712  0 14:25 ?        00:00:00 postgres: wal writer process   
postgres 15718 15712  0 14:25 ?        00:00:00 postgres: autovacuum launcher process   
postgres 15719 15712  0 14:25 ?        00:00:00 postgres: stats collector process   
postgres 15736 15593  0 14:25 pts/1    00:00:00 ps -ef
postgres 15737 15593  0 14:25 pts/1    00:00:00 grep --color=auto postgres
[postgres@CHENZX-DB01 ~]$ kill -9 15712
[postgres@CHENZX-DB01 ~]$ ps -ef|grep postgres
root     15592 15571  0 14:23 pts/1    00:00:00 su - postgres
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 16164 15593  0 14:32 pts/1    00:00:00 ps -ef
postgres 16165 15593  0 14:32 pts/1    00:00:00 grep --color=auto postgres

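Comparing the OS process list before and after startup, PG added 7 processes. Processes 15713, 15715, 15716, 15717, 15718 and 15719 were all started by process 15712 (their PPID is 15712). After process 15712 was killed, the remaining 6 processes stopped as well. Process 15712 is therefore PG's main process. Its role here is to start and stop the database and to manage the auxiliary processes the database needs.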
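The parent/child relationship can also be checked mechanically instead of by eye. The following is a minimal sketch that groups rows of the `ps -ef` output above by their PPID column. The function name `children_by_parent` and the embedded sample text are illustrative and not part of any PG tooling.

```python
from collections import defaultdict

# Sample rows copied from the `ps -ef` output above (UID PID PPID ... CMD).
PS_OUTPUT = """\
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 15712     1  0 14:25 pts/1    00:00:00 /postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata
postgres 15713 15712  0 14:25 ?        00:00:00 postgres: logger process
postgres 15715 15712  0 14:25 ?        00:00:00 postgres: checkpointer process
postgres 15716 15712  0 14:25 ?        00:00:00 postgres: writer process
postgres 15717 15712  0 14:25 ?        00:00:00 postgres: wal writer process
postgres 15718 15712  0 14:25 ?        00:00:00 postgres: autovacuum launcher process
postgres 15719 15712  0 14:25 ?        00:00:00 postgres: stats collector process
"""


def children_by_parent(ps_text):
    """Map each parent PID to the list of (pid, command) it started."""
    tree = defaultdict(list)
    for line in ps_text.splitlines():
        # UID PID PPID C STIME TTY TIME CMD: the command is the 8th field onward.
        fields = line.split(None, 7)
        if len(fields) < 8:
            continue
        pid, ppid, cmd = int(fields[1]), int(fields[2]), fields[7]
        tree[ppid].append((pid, cmd))
    return tree


tree = children_by_parent(PS_OUTPUT)
for pid, cmd in tree[15712]:
    print(pid, cmd)
```

Looking up PID 15712 in the result lists exactly the six auxiliary processes, which matches the observation that they all disappear when their parent is killed.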

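What do these six auxiliary processes do? The table below describes each of them, plus the archiver process, which is not running in the test above:

Process	Role
System logger (SysLogger)	Collects all stderr output from the Postmaster, every backend process and the other auxiliary processes, and writes it to the log files.
Background writer (BgWriter)	Periodically writes dirty pages from shared memory to disk, paced by an algorithm. If it writes too aggressively, resources are wasted: a data block may be modified many times, and flushing it to disk after every change is expensive. If it writes too slowly, queries stall: when a new query or update needs memory for data read from disk and there is not enough free space, dirty pages must first be flushed to free some memory, so the operation has to wait and performance drops.
Checkpointer	Performs checkpoints: flushes all dirty pages to disk and writes a checkpoint record to the WAL. It appears in the ps output above as "checkpointer process".
Archiver (PgArch)	Starting with the 8.x releases, PostgreSQL introduced PITR (Point-In-Time Recovery), which can restore a database to any recorded point in its history. Besides WalWriter, the other foundation of PITR is archiving of WAL files. The PgArch auxiliary process archives the WAL as it is stored on disk (the Xlog files). It runs only when archiving is enabled, which is why it does not appear in the ps output above.
WAL writer (WalWriter)	The process that writes the WAL. Write-ahead logging means that a change must be recorded on disk in the log before the data itself is modified, so the modified data does not have to be persisted to the data files immediately. If the machine crashes or the database exits abnormally and some dirty data in memory has not yet reached the data files, the database reads the WAL on restart and replays the last part of it to return to the state at the time of the crash.
Autovacuum (AutoVacuum)	In PostgreSQL, an UPDATE or DELETE on a tuple does not remove the old version right away. The old tuple is only marked as deleted and its space is not released. Multi-version concurrency control requires this: as long as another transaction may still see a version of a tuple, that version cannot be removed. Once an expired version is no longer visible to any transaction, its space must be reclaimed for new tuples, otherwise disk usage would grow without bound. This cleanup is done by running VACUUM. Since PostgreSQL 8.1 there is an optional auxiliary process, AutoVacuum, which runs VACUUM and ANALYZE automatically to reclaim the space of records marked as deleted and to refresh table statistics.
Statistics collector (PgStat)	Collects statistics about database activity.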
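To make the write-ahead idea from the WalWriter row concrete, here is a toy model of it. It is only a sketch under simplifying assumptions (a dictionary stands in for data pages, a list stands in for the WAL on disk) and says nothing about how PG implements WAL internally. The class name `ToyWalStore` and its methods are invented for illustration.

```python
class ToyWalStore:
    """Toy key-value store that logs every change before applying it."""

    def __init__(self):
        self.wal = []      # stands in for WAL records already flushed to disk
        self.memory = {}   # stands in for pages in shared buffers (may be dirty)
        self.disk = {}     # stands in for the data files

    def update(self, key, value):
        # Write-ahead rule: record the change first, then touch the page.
        self.wal.append((key, value))
        self.memory[key] = value

    def checkpoint(self):
        # Flush dirty pages to the data files; older WAL is no longer needed.
        self.disk.update(self.memory)
        self.wal.clear()

    def crash(self):
        # Memory is lost; the WAL and the data files survive.
        self.memory = {}

    def recover(self):
        # Start from the data files and replay the WAL written since the
        # last checkpoint.
        self.memory = dict(self.disk)
        for key, value in self.wal:
            self.memory[key] = value


store = ToyWalStore()
store.update("a", 1)
store.checkpoint()
store.update("a", 2)   # logged, but never flushed to the data files
store.update("b", 3)
store.crash()
store.recover()
print(store.memory)    # {'a': 2, 'b': 3}
```

The data files still hold only the value from the last checkpoint, yet replaying the log restores both later changes. That is the reason the data files do not need to be written on every update.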

Now let's start PG again and look at the memory side:

[postgres@CHENZX-DB01 ~]$ pg_ctl start
server starting
[postgres@CHENZX-DB01 ~]$ ps -ef|grep postgres
root     15592 15571  0 14:23 pts/1    00:00:00 su - postgres
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 16403     1  0 14:36 pts/1    00:00:00 /postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata
postgres 16404 16403  0 14:36 ?        00:00:00 postgres: logger process   
postgres 16406 16403  0 14:36 ?        00:00:00 postgres: checkpointer process   
postgres 16407 16403  0 14:36 ?        00:00:00 postgres: writer process   
postgres 16408 16403  0 14:36 ?        00:00:00 postgres: wal writer process   
postgres 16409 16403  0 14:36 ?        00:00:00 postgres: autovacuum launcher process   
postgres 16410 16403  0 14:36 ?        00:00:00 postgres: stats collector process   
postgres 19276 15593  0 15:24 pts/1    00:00:00 ps -ef
postgres 19277 15593  0 15:24 pts/1    00:00:00 grep --color=auto postgres
[postgres@CHENZX-DB01 ~]$ ps aux --sort -rss|grep 16403
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 16403  0.0  2.1 319992 41268 pts/1    S    14:36   0:00 /postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata
postgres 19299  0.0  0.0 112704   968 pts/1    S+   15:24   0:00 grep --color=auto 16403

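After PG starts, the main process allocates a region of memory that is shared by all backend processes. This region is called shared memory, and it consists of three parts:

Shared Buffer Pool: PG loads table and index pages from persistent storage into this area and operates on them directly
WAL Buffer: the buffer that holds WAL data before it is persisted to the WAL files
CommitLog Buffer: PostgreSQL keeps the state of every transaction in the commit log; these states are held in shared memory and consulted throughout transaction processing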
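The VSZ and RSS columns in the `ps aux` output above are easier to read in MiB. The sketch below converts one row, assuming the Linux `ps` convention that both columns are reported in KiB. The helper name `parse_ps_aux_memory` is made up for this example. Note that these figures describe the main process as a whole, so they are only a rough indication of the shared memory size, not a measurement of it.

```python
def parse_ps_aux_memory(line):
    """Return (pid, vsz_mib, rss_mib) from one `ps aux` row.

    Columns: USER PID %CPU %MEM VSZ RSS ...; VSZ and RSS are taken to be KiB.
    """
    fields = line.split()
    pid, vsz_kib, rss_kib = int(fields[1]), int(fields[4]), int(fields[5])
    return pid, vsz_kib / 1024, rss_kib / 1024


# Row for the main process, copied from the output above.
row = ("postgres 16403  0.0  2.1 319992 41268 pts/1    S    14:36   0:00 "
       "/postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata")
pid, vsz, rss = parse_ps_aux_memory(row)
print(f"pid={pid} VSZ={vsz:.1f} MiB RSS={rss:.1f} MiB")
# pid=16403 VSZ=312.5 MiB RSS=40.3 MiB
```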

Process structure and memory structure when an application connects

Next, from another machine that has the PG client installed, we log in remotely:

[postgres@CHENZX-DB02 bin]$ psql -h 192.168.1.61 -p 5432
psql (9.6.11)
Type "help" for help.

postgres=#

Now check the processes on the PG server:

[postgres@CHENZX-DB01 ~]$ ps -ef|grep postgres
root     15592 15571  0 14:23 pts/1    00:00:00 su - postgres
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 16403     1  0 14:36 pts/1    00:00:00 /postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata
postgres 16404 16403  0 14:36 ?        00:00:00 postgres: logger process   
postgres 16406 16403  0 14:36 ?        00:00:00 postgres: checkpointer process   
postgres 16407 16403  0 14:36 ?        00:00:00 postgres: writer process   
postgres 16408 16403  0 14:36 ?        00:00:00 postgres: wal writer process   
postgres 16409 16403  0 14:36 ?        00:00:00 postgres: autovacuum launcher process   
postgres 16410 16403  0 14:36 ?        00:00:00 postgres: stats collector process   
postgres 23120 16403  0 16:23 ?        00:00:00 postgres: postgres postgres 192.168.1.62(36106) idle        <------ the new process
postgres 23136 15593  0 16:24 pts/1    00:00:00 ps -ef
postgres 23137 15593  0 16:24 pts/1    00:00:00 grep --color=auto postgres

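In addition to the 7 processes present since startup, there is one more: process 23120, forked by the main process. Let's run another test and log in remotely as a user that does not exist in the database.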

[postgres@CHENZX-DB02 bin]$ psql -U chenzx -h 192.168.1.61 -p 5432
psql: FATAL:  role "chenzx" does not exist

Did the server gain a process this time? Since the login failed, it should not have.

[postgres@CHENZX-DB01 ~]$ ps -ef|grep postgres
root     15592 15571  0 14:23 pts/1    00:00:00 su - postgres
postgres 15593 15592  0 14:23 pts/1    00:00:00 -bash
postgres 16403     1  0 14:36 pts/1    00:00:00 /postgres/postgresql-9.6.11/bin/postgres -D /postgres/pgdata
postgres 16404 16403  0 14:36 ?        00:00:00 postgres: logger process   
postgres 16406 16403  0 14:36 ?        00:00:00 postgres: checkpointer process   
postgres 16407 16403  0 14:36 ?        00:00:00 postgres: writer process   
postgres 16408 16403  0 14:36 ?        00:00:00 postgres: wal writer process   
postgres 16409 16403  0 14:36 ?        00:00:00 postgres: autovacuum launcher process   
postgres 16410 16403  0 14:36 ?        00:00:00 postgres: stats collector process   
postgres 28197 15593  0 17:45 pts/1    00:00:00 ps -ef
postgres 28198 15593  0 17:45 pts/1    00:00:00 grep --color=auto postgres

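As expected, the server has no extra process and still shows the same 7. (A backend is forked briefly to handle authentication, but it exits as soon as the login fails.) This shows two more roles of the main process: it listens for client connections and forks a separate postgres backend process for each one.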
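The fork-per-connection pattern can be sketched in a few lines. This is only an illustration of the general Unix technique, not PG's actual code: a socket pair stands in for the client connection, the parent plays both the listening main process and the client, and the function name `handle_in_child` is invented. It relies on `os.fork`, so it runs on Unix-like systems only.

```python
import os
import socket


def handle_in_child(request):
    """Fork a child to answer one request; return (child_pid, reply)."""
    parent_sock, child_sock = socket.socketpair()
    pid = os.fork()
    if pid == 0:
        # Child: plays the role of the backend serving this one connection.
        parent_sock.close()
        data = child_sock.recv(1024)
        child_sock.sendall(b"pid %d handled: " % os.getpid() + data)
        child_sock.close()
        os._exit(0)  # exit at once, skipping the parent's cleanup handlers
    # Parent: plays the role of the main process (and, here, the client).
    child_sock.close()
    parent_sock.sendall(request)
    reply = parent_sock.recv(1024)
    parent_sock.close()
    os.waitpid(pid, 0)  # reap the child so that no zombie is left behind
    return pid, reply


if __name__ == "__main__":
    child_pid, reply = handle_in_child(b"SELECT 1")
    print(child_pid, reply)
```

The reply carries the PID of the child, which differs from the PID of the parent. This is the same relationship seen between process 23120 and the main process 16403 in the listing above.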

Open a session to PG again, then check how much memory its backend process uses:

[postgres@CHENZX-DB01 ~]$ ps aux --sort -rss|grep 28572
USER       PID %CPU %MEM    VSZ   RSS TTY      STAT START   TIME COMMAND
postgres 28572  0.0  0.1 321604  2948 ?        Ss   17:51   0:00 postgres: postgres postgres 192.168.1.62(36134) idle

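When a session connects to the database, its backend process sets aside memory of its own for queries and other operations. This is called local memory, and it consists of three parts:

temp_buffers: used by operations on temporary tables
work_mem: the buffer used by internal sorts (ORDER BY, DISTINCT) and hash tables before they spill to temporary disk files
maintenance_work_mem: the buffer used by maintenance operations such as VACUUM, CREATE INDEX and ALTER TABLE ADD FOREIGN KEY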
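Because local memory is allocated per backend, its total grows with the number of connections. The following back-of-the-envelope sketch makes that explicit. The formula is a deliberate simplification and an assumption of this example, not something PG computes: it treats every sort or hash as using its full work_mem and every backend as using its full temp_buffers. The default values are chosen to mirror the usual 9.6 defaults (4MB, 8MB, 64MB); check your own postgresql.conf before relying on them.

```python
def estimate_local_memory_mib(connections, sorts_per_query=1,
                              work_mem_mib=4, temp_buffers_mib=8,
                              maintenance_workers=0,
                              maintenance_work_mem_mib=64):
    """Rough upper bound on backend-local memory, in MiB."""
    # Each backend may use work_mem once per sort/hash, plus its temp buffers.
    per_backend = sorts_per_query * work_mem_mib + temp_buffers_mib
    return (connections * per_backend
            + maintenance_workers * maintenance_work_mem_mib)


print(estimate_local_memory_mib(connections=100))             # 1200
print(estimate_local_memory_mib(connections=100, sorts_per_query=3,
                                maintenance_workers=2))       # 2128
```

Real usage is normally far lower, as the 2948 KiB RSS of the idle backend above suggests, since these buffers are only allocated when an operation needs them.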

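These two scenarios cover the essentials of PG's process structure and memory structure. To close, here is a summary of what the PG main process does, based on the scenarios above:

Starting and stopping the database
Listening for client connections
Forking a separate postgres process for each client connection
Recovering when a backend process fails
Managing the data files
Managing the auxiliary processes the database needs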

posted @ 2019-01-17 19:09 Nolan_Chan
