首页  :: 新随笔  :: 管理

PostgreSQL 9.6流复制(异步复制)

Posted on 2021-12-07 23:57  高&玉  阅读(487)  评论(0编辑  收藏  举报

1 介绍

        PostgresSQL在9.0版本实现的复制功能只能进行异步复制,在PostgreSQL 9.1版本中可以支持同步复制

        PostgreSQL流复制是基于WAL日志传输实现的:主库发送WAL日志,备库接收WAL日志并进行回放。

 

流复制的3个主要进程:

  • 主库的walsender进程:负责发送WAL日志给备库。
  • 备库的walreceiver进程:负责主库发送的WAL日志。
  • 备库的startup进程:重放接收的WAL日志。

 

主从之间是怎么通信的?

(1)primary端 后端进程通过执行函数XLogInsert()和XLogFlush(),将WAL数据写入并刷新到WAL段文件中。

(2)primary端 WAL发送器进程将写入WAL段文件的WAL数据发送到WAL接收器进程。

(3)primary端 在发送WAL数据之后,后端进程继续等待来自备库的ACK响应。更确切地说,后端进程通过执行内部函数SyncRepWaitForLSN()来获取锁存器,并等待它被释放。

(4)standby端 上的WAL接收器通过write()系统函数调用,将接收到的WAL数据写入备库的WAL段,并向WAL发送器返回ACK响应。

(5)standby端 WAL接收器通过调用(如fsync()函数)将WAL数据刷新到WAL段中,向WAL发送器返回另一个ACK响应,并通知启动进程相关WAL数据已更新。

(6)standby端 启动进程重放已写入WAL段的WAL数据。

(7)primary端 WAL发送器在收到来自WAL接收器的ACK响应后释放后端进程的锁存器,然后后端进程完成commit或abort动作。锁存器释放的时间取决于参数synchronous_commit。如果它是‘on’(默认),那么当接收到步骤(5)的ACK时,锁存器被释放;如果它是'remote_write',当接收到步骤(4)的ACK时,其被释放。

 

每个ACK响应将备库的内部信息通知给主库,包含以下4个项目:

  • 已写入最新WAL数据的LSN位置。
  • 已刷新最新WAL数据的LSN位置。
  • 启动进程已经重放最新的WAL数据的LSN。
  • 发送此响应的时间戳。

2 安装PostgreSQL 9.6

CentOS 7.6

PostgreSQL 14.5

 

 

1. 安装依赖包

[root]# yum install -y perl-ExtUtils-Embed python-devel bison flex readline-devel zlib-devel gcc gcc-c++ wget

2. 源码安装uuid

[root]# wget ftp://ftp.ossp.org/pkg/lib/uuid/uuid-1.6.2.tar.gz
[root]# tar -zxvf uuid-1.6.2.tar.gz
[root]# cd uuid-1.6.2
[root]# ./configure --with-uuid=ossp
[root]# make && make install

3. 源码安装postgres

[root]# https://ftp.postgresql.org/pub/source/v9.6.0/postgresql-9.6.0.tar.gz
[root]# tar -zxvf postgresql-9.6.0.tar.gz
[root]# cd postgresql-9.6.0
[root]# ./configure --prefix=/usr/local/postgresql-9.6.0 --with-perl --with-python --enable-thread-safety --with-uuid=ossp --with-wal-segsize=64
[root]# make && make install

4. 安装contrib的工具

[root]# cd postgresql-9.6.0/contrib
[root]# make && make install

5. 更新安装目录用户及属组并做软链接

[root]# chown -R postgres:postgres /usr/local/postgresql-9.6.0
[root]# ln -s /usr/local/postgresql-9.6.0 /usr/local/postgresql

6. 软链接libuuid.so.16

[root]# ln -s /usr/local/lib/libuuid.so.16 /usr/local/postgresql-9.6.0/lib/

7. 创建PGDATA目录

[root]# mkdir /data/postgres
[root]# chown postgres:postgres /data/postgres
[root]# chmod 0700 /data/postgres

8. 创建postgres用户及环境变量

[root]# useradd postgres
[root]# echo "Your_passwd" | passwd postgres --stdin

[postgres]# vi /home/postgres/.bashrc
[postgres]# export PGDATA=/data/postgres
[postgres]# export PATH=/usr/local/postgresql/bin:$PATH
[postgres]# export LD_LIBRARY_PATH=/usr/local/postgresql/lib

9. 初始化postgres

[postgres]# initdb -D $PGDATA -k
The files belonging to this database system will be owned by user "postgres".
This user must also own the server process.

The database cluster will be initialized with locale "en_US.UTF-8".
The default database encoding has accordingly been set to "UTF8".
The default text search configuration will be set to "english".

Data page checksums are enabled.

fixing permissions on existing directory /data/postgres ... ok
creating subdirectories ... ok
selecting dynamic shared memory implementation ... posix
selecting default max_connections ... 100
selecting default shared_buffers ... 128MB
selecting default time zone ... PRC
creating configuration files ... ok
running bootstrap script ... ok
performing post-bootstrap initialization ... ok
syncing data to disk ... ok

initdb: warning: enabling "trust" authentication for local connections
You can change this by editing pg_hba.conf or using the option -A, or
--auth-local and --auth-host, the next time you run initdb.

Success. You can now start the database server using:

    pg_ctl -D /data/postgres -l logfile start

10.启动postgres

启动postgres

[postgres]# pg_ctl start
waiting for server to start....2022-09-22 08:52:37.658 CST [33036] LOG:  starting PostgreSQL 9.6 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit
2022-09-22 08:52:37.659 CST [33036] LOG:  listening on IPv6 address "::1", port 5432
2022-09-22 08:52:37.659 CST [33036] LOG:  listening on IPv4 address "127.0.0.1", port 5432
2022-09-22 08:52:37.660 CST [33036] LOG:  listening on Unix socket "/tmp/.s.PGSQL.5432"
2022-09-22 08:52:37.662 CST [33037] LOG:  database system was shut down at 2022-09-22 08:52:16 CST
2022-09-22 08:52:37.665 CST [33036] LOG:  database system is ready to accept connections
 done
server started

查看postgres版本

postgres=# select version();
                                                 version                                                  
----------------------------------------------------------------------------------------------------------
 PostgreSQL 9.6.0 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44), 64-bit

3 配置复制流

3.1 配置“复制槽”

        standby端如果长时间停机,重启后standby可能因缺少相应的WAL日志无法连接primary。此时可以通过启用max_replication_slots参数启用复制槽来解决此问题。

        primary端实例会一直保留预写日志(WAL)文件,直到所有备库所需的插槽都确认已接收到特定段为止。只有完成此操作后,主库实例才会移除相应的WAL文件。

 

Primary端创建复制槽

SELECT * FROM pg_create_physical_replication_slot('pg_primary');

 

查看复制槽

postgres=# select * from pg_replication_slots;
 slot_name  | plugin | slot_type | datoid | database | active | active_pid | xmin | catalog_xmin | restart_lsn | confirmed_flush_lsn 
------------+--------+-----------+--------+----------+--------+------------+------+--------------+-------------+---------------------
 pg_primary |        | physical  |        |          | t      |     116077 |      |              | 89/A40003E0 | 

 

Standby端配置复制槽

[postgres]# vi $PGDATA/recovery.conf
添加如下内容:
primary_slot_name = 'pg_primary'

3.2 primary端配置归档

开启归档模式

创建归档目录

[postgres]# mkdir $PGDATA/arch_log

配置归档脚本

[postgres]# vi $PGDATA/archive.sh
cp --preserve=timestamps $1 $PGDATA/arch_log/$2 ; find $PGDATA/arch_log -type f -mtime +30 | xargs rm -fr;

修改归档相关参数

[postgres]# vi $PGDATA/postgresql.conf
archive_mode = on
archive_command = '/bin/bash archive.sh %p %f'
archive_timeout = 1800

重启postgres使归档参数生效

[postgres]# pg_ctl restart

查看参数是否已经生效

postgres=# select name,setting from pg_settings where name in ('archive_mode','archive_command','archive_timeout');
      name       |          setting           
-----------------+----------------------------
 archive_command | /bin/bash archive.sh %p %f
 archive_mode    | on
 archive_timeout | 1800

手动归档

postgres=# select pg_switch_xlog();
 pg_switch_wal 
---------------
 0/4000160

查看归档目录是否已存在归档文件

[postgres]# ls $PGDATA/arch_log
000000010000000000000004

3.3 primary创建复制用户

 创建复制用户

postgres=# create role repl login replication encrypted password 'Your_passwd';
CREATE ROLE

配置权限配置文件

[postgres]# vi pg_hba.conf
添加如下:
host replication repl 0.0.0.0/0        md5

 配置参数(这个必须配置)

[postgres]# vi postgresql.conf
listen_addresses = '*'
wal_level = replica

重启PG使参数生效

[postgres]# pg_ctl restart

 3.4 配置standby端

 删除standby端数据

[postgres]# pg_ctl stop
[postgres]# rm -fr $PGDATA/*

配置primary端到standby端

[postgres]# pg_basebackup -h 192.168.1.71 -U repl -D $PGDATA -X stream -P

配置recovery.conf文件

[postgres]# vi $PGDATA/recovery.conf
standby_mode = on
primary_conninfo = 'host=192.168.1.71 port=5432 user=repl password=Your_passwd'
recovery_target_timeline = 'latest'
trigger_file = '/tmp/trigger_file0'
primary_slot_name = 'pg_primary'

备机开启只读模式(默认配置下备机不可读,启用hot_standby后备机可读)

[postgres]# vi $PGDATA/postgressql.conf
hot_standby = on

 3.5 启动流复制

启动standby端postgres

[postgres]# pg_ctl start

 

primary端查看流复制详情

postgres=# \x on;
Expanded display is on.
postgres=# select * from pg_stat_replication;
----+------------------------------
pid              | 116077
usesysid         | 16384
usename          | repl
application_name | walreceiver
client_addr      | 192.168.1.72
client_hostname  | 
client_port      | 45984
backend_start    | 2021-12-20 16:17:58.493769+08
backend_xmin     | 
state            | streaming
sent_location    | 89/A2BE8E10
write_location   | 89/A2BE8E10
flush_location   | 89/A2BE8E10
replay_location  | 89/A2BE8E10
sync_priority    | 0
sync_state       | async

4 监控复制流

Primary端查看已连接的Standby端信息,如果为空说明复制流异常

postgres=# \x on;
Expanded display is on.
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid              | 4007
usesysid         | 16384
usename          | repl
application_name | walreceiver
client_addr      | 192.168.1.72
client_hostname  | host72
client_port      | 34596
backend_start    | 2022-09-21 17:47:09.629914+08
backend_xmin     | 
state            | streaming
sent_location    | 1/F4000060
write_location   | 1/F4000060
flush_location   | 1/F4000060
replay_location  | 1/F4000060
sync_priority    | 0
sync_state       | async

 

Standby端查看是否处于recovery状态,值为“t”说明是当前数据库是备库,值为“f”说明当前数据库是主库(也可能复制流出现异常出现双主)

postgres=# select pg_is_in_recovery();
 pg_is_in_recovery 
-------------------
 t

 

通过pg_controldata命令查看Database cluster state,值为“in archive recovery”说明是当前数据库是备库,值为“in production”说明当前数据库是主库

[postgres@host72 postgres]$ pg_controldata $DATA
pg_control version number:            960
Catalog version number:               201608131
Database system identifier:           7122063347532860770
Database cluster state:               in archive recovery
pg_control last modified:             Wed 21 Sep 2022 05:52:09 PM CST
Latest checkpoint location:           1/F4000098
Prior checkpoint location:            1/F0000060
Latest checkpoint's REDO location:    1/F4000060
Latest checkpoint's REDO WAL file:    00000001000000010000003D
Latest checkpoint's TimeLineID:       1
Latest checkpoint's PrevTimeLineID:   1
Latest checkpoint's full_page_writes: on
Latest checkpoint's NextXID:          0:1762
Latest checkpoint's NextOID:          24576
Latest checkpoint's NextMultiXactId:  1
Latest checkpoint's NextMultiOffset:  0
Latest checkpoint's oldestXID:        1753
Latest checkpoint's oldestXID's DB:   1
Latest checkpoint's oldestActiveXID:  1762
Latest checkpoint's oldestMultiXid:   1
Latest checkpoint's oldestMulti's DB: 1
Latest checkpoint's oldestCommitTsXid:0
Latest checkpoint's newestCommitTsXid:0
Time of latest checkpoint:            Wed 21 Sep 2022 05:51:43 PM CST
Fake LSN counter for unlogged rels:   0/1
Minimum recovery ending location:     1/F4000108
Min recovery ending loc's timeline:   1
Backup start location:                0/0
Backup end location:                  0/0
End-of-backup record required:        no
wal_level setting:                    replica
wal_log_hints setting:                on
max_connections setting:              500
max_worker_processes setting:         8
max_prepared_xacts setting:           0
max_locks_per_xact setting:           64
track_commit_timestamp setting:       off
Maximum data alignment:               8
Database block size:                  8192
Blocks per segment of large relation: 131072
WAL block size:                       8192
Bytes per WAL segment:                67108864
Maximum length of identifiers:        64
Maximum columns in an index:          32
Maximum size of a TOAST chunk:        1996
Size of a large-object chunk:         2048
Date/time type storage:               64-bit integers
Float4 argument passing:              by value
Float8 argument passing:              by value
Data page checksum version:           0

 

在standby端查看wal接收情况

postgres=# \x on;
Expanded display is on.
postgres=# select * from pg_stat_wal_receiver;
-[ RECORD 1 ]---------+------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
pid                   | 31805
status                | streaming
receive_start_lsn     | 0/11000000
receive_start_tli     | 1
written_lsn           | 0/12000000
flushed_lsn           | 0/12000000
received_tli          | 1
last_msg_send_time    | 2022-09-22 17:31:42.117758+08
last_msg_receipt_time | 2022-09-22 17:31:42.11795+08
latest_end_lsn        | 0/12000000
latest_end_time       | 2022-09-22 16:56:39.809168+08
slot_name             | pg_primary
sender_host           | 192.168.1.71
sender_port           | 5432
conninfo              | user=repl password=******** channel_binding=disable dbname=replication host=192.168.1.72 port=5432 fallback_application_name=walreceiver sslmode=disable sslcompression=0 sslsni=1 ssl_min_protocol_version=TLSv1.2 gssencmode=disable krbsrvname=postgres target_session_attrs=any

5 主从切换

查看primary端状态

[postgres]# pg_controldata | grep "Database cluster state:"
Database cluster state:               in production

 

查看standby端状态

[postgres]# pg_controldata | grep "Database cluster state:"
Database cluster state:               in archive recovery

 

standby切换为primary

[postgres]# pg_ctl promote -D $PGDATA 
server promoting