postgresql 10.5 主从复制--搭建测试
env:
role | master | slave |
host | pg1 | pg2 |
ip | 11 | 12 |
pg-version | 10.5 | 10.5 |
1 初始化查看
[ceiec@localhost ~]$ df -h [ceiec@localhost ~]$ free -m [ceiec@localhost ~]$ lscpu [ceiec@localhost ~]$ cat /etc/redhat-release CentOS Linux release 7.6.1810 (Core)
--master [root@localhost ~]# hostnamectl set-hostname pg1
--slave [root@localhost ~]# hostnamectl set-hostname pg2
官方下载源码包
https://www.postgresql.org/ftp/source/
2 解压并安装
# tar zxvf postgresql-10.5.tar.gz # cd postgresql-10.5/ # mkdir /usr/local/postgresql # yum install -y zlib-devel readline-devel gcc # ./configure --prefix=/usr/local/postgresql --with-python --with-perl
编译时增加 --with-python configure: error: header file <Python.h> is required for Python 解决方法: yum install python-devel 编译时增加 --with-perl configure: error: could not determine flags for linking embedded Perl. This probably means that ExtUtils::Embed or ExtUtils::MakeMaker is notinstalled. 解决方法: yum install perl-ExtUtils-Embed
用户、路径、权限
# useradd postgres # mkdir -p /home/pgdata # chown -R postgres.postgres /home/pgdata # vim /etc/profile export PATH=/usr/local/postgresql/bin:$PATH export LD_LIBRARY_PATH=/usr/local/postgresql/lib # source /etc/profile # vim /etc/hosts *.11 pg1 *.12 pg2 # chown -R postgres:postgres /usr/local/postgresql
postgres用户profile
[root@localhost postgresql]# su - postgres [postgres@pg1 ~]$ vim /home/postgres/.bash_profile export PGHOME=/usr/local/postgresql export PGDATA=/home/pgdata export PGHOST=/tmp export PATH="$HOME/bin:$HOME/.local/bin:$PATH:$PGHOME/bin" export MANPATH=$PGHOME/share/man:$MANPATH export LANG=en_US.utf8 export DATE=`date +"%Y-%m-%d %H:%M:%S"` export LD_LIBRARY_PATH=$PGHOME/lib:$LD_LIBRARY_PATH [postgres@pg1 ~]$ source .bash_profile
初始化数据库
$ /usr/local/postgresql/bin/initdb -D /home/pgdata
修改参数文件
[postgres@pg1 pgdata]$ vim postgresql.conf listen_addresses = '*' max_connections = 1000 shared_buffer = 10240MB wal_level = hot_standby synchronous_commit = on checkpoint_timeout = 5min archive_mode = on archive_command = '/bin/date' max_wal_senders = 10 wal_keep_segments = 16 hot_standby = on max_standby_archive_delay = 30s max_standby_streaming_delay = 30s wal_receiver_status_interval = 1s hot_standby_feedback = on wal_receiver_timeout = 60s
连接权限
[postgres@pg1 pgdata]$ vim pg_hba.conf host all all *.7.0/24 md5 host all all *.12/32 md5 host all all *.0/32 md5 host all all *.11/32 md5 host replication repuser *.12/32 md5
启动
$ pg_ctl start -D /home/pgdata/ [postgres@pg1 pgdata]$ netstat -lnt|grep 5432 tcp 0 0 0.0.0.0:5432 0.0.0.0:* LISTEN tcp6 0 0 :::5432 :::* LISTEN [postgres@pg1 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# create role repuser login replication encrypted password '***'; CREATE ROLE postgres=# \q
--salve配置 ,在初始化db前的操作都一样,slave上不需要初始化
从库安装完成后,不初始化,若已经初始化,删除其data目录
[postgres@pg2 ~]$ pg_basebackup -D /home/pgdata -F p -X stream -R -v -P -h *.11 -p 5432 -U repuser Password: pg_basebackup: initiating base backup, waiting for checkpoint to complete pg_basebackup: checkpoint completed pg_basebackup: write-ahead log start point: 0/2000028 on timeline 1 pg_basebackup: starting background WAL receiver 23740/23740 kB (100%), 1/1 tablespace pg_basebackup: write-ahead log end point: 0/20000F8 pg_basebackup: waiting for background process to finish streaming ... pg_basebackup: base backup completed
配置备库参数postgresql.conf
[postgres@pg2 pgdata]$ vim postgresql.conf #在基础备份时,初始化文件是从主库复制来的,所以配置文件一致,注释掉 wal_level, max_wal_senders wal_keep_segments等参数 打开如下参数: hot_standby = on #在备份的同时允许查询 max_standby_streaming_delay = 30s #可选,流复制最大延迟 wal_receiver_status_interval = 10s #可选,从向主报告状态的最大间隔时间 hot_standby_feedback = on #可选,查询冲突时向主反馈 max_connections = 1100 #默认参数,非主从配置相关参数,表示到数据库的连接数,一般从库做主要的读服务时,设置值需要高于主库
[postgres@pg2 pgdata]$ vim pg_hba.conf host replication repuser *.11/32 md5 [postgres@pg2 data]# cat recovery.conf standby_mode = 'on' primary_conninfo = 'user=repuser password=****** host=*.11 port=5432 sslmode=prefer sslcompression=1 krbsrvname=postgres' [postgres@pg2 pgdata]$ vim recovery.conf recovery_target_timeline = 'latest'
启动
[postgres@pg2 pgdata]$ pg_ctl start waiting for server to start....2019-07-20 03:00:00.026 CST [19694] FATAL: data directory "/home/pgdata" has group or world access 2019-07-20 03:00:00.026 CST [19694] DETAIL: Permissions should be u=rwx (0700). stopped waiting pg_ctl: could not start server Examine the log output. [postgres@pg2 pgdata]$ chmod 700 /home/pgdata [postgres@pg2 pgdata]$ pg_ctl start
进程查看
--master [postgres@pg1 pgdata]$ ps -ef|grep postgres root 1363 10030 0 09:59 pts/0 00:00:00 su - postgres postgres 1364 1363 0 09:59 pts/0 00:00:00 -bash postgres 6608 11510 0 10:59 ? 00:00:00 postgres: wal sender process repuser *.12(50726) streaming 0/3000140 postgres 7486 1364 0 11:00 pts/0 00:00:00 ps -ef postgres 7487 1364 0 11:00 pts/0 00:00:00 grep --color=auto postgres postgres 11510 1 0 10:18 pts/0 00:00:00 /usr/local/postgresql/bin/postgres -D /home/pgdata postgres 11512 11510 0 10:18 ? 00:00:00 postgres: checkpointer process postgres 11513 11510 0 10:18 ? 00:00:00 postgres: writer process postgres 11514 11510 0 10:18 ? 00:00:00 postgres: wal writer process postgres 11515 11510 0 10:18 ? 00:00:00 postgres: autovacuum launcher process postgres 11516 11510 0 10:18 ? 00:00:00 postgres: archiver process last was 000000010000000000000002.00000028.backup postgres 11517 11510 0 10:18 ? 00:00:00 postgres: stats collector process postgres 11518 11510 0 10:18 ? 00:00:00 postgres: bgworker: logical replication launcher --slave [postgres@pg2 pgdata]$ ps -ef|grep postgres root 17802 4245 0 02:30 pts/0 00:00:00 su - postgres postgres 17803 17802 0 02:30 pts/0 00:00:00 -bash postgres 19891 1 0 03:02 pts/0 00:00:00 /usr/local/postgresql/bin/postgres postgres 19892 19891 0 03:02 ? 00:00:00 postgres: startup process recovering 000000010000000000000003 postgres 19893 19891 0 03:02 ? 00:00:00 postgres: checkpointer process postgres 19894 19891 0 03:02 ? 00:00:00 postgres: writer process postgres 19895 19891 0 03:02 ? 00:00:00 postgres: stats collector process postgres 19896 19891 0 03:02 ? 00:00:00 postgres: wal receiver process streaming 0/3000140 postgres 19914 17803 0 03:02 pts/0 00:00:00 ps -ef postgres 19915 17803 0 03:02 pts/0 00:00:00 grep --color=auto postgres
状态查看
--master postgres=# select client_addr,sync_state from pg_stat_replication; client_addr | sync_state --------------+------------ *.12 | async (1 row) postgres=# select * from pg_stat_replication; pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | backend_xmin | state | sent_ls n | write_lsn | flush_lsn | replay_lsn | write_lag | flush_lag | replay_lag | sync_priority | sync_state ------+----------+---------+------------------+--------------+-----------------+-------------+-------------------------------+--------------+-----------+-------- ---+-----------+-----------+------------+-----------+-----------+------------+---------------+------------ 6608 | 16384 | repuser | walreceiver | *.12 | | 50726 | 2019-07-19 10:59:51.168278+08 | 556 | streaming | 0/30001 40 | 0/3000140 | 0/3000140 | 0/3000140 | | | | 0 | async (1 row) postgres=# create user yhq with password '****'; CREATE ROLE postgres=# create database test1 owner yhq; CREATE DATABASE postgres=# grant all privileges on database test1 to yhq; GRANT postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# CREATE TABLE user_tbl(name VARCHAR(20), signup_date DATE); CREATE TABLE test1=# INSERT INTO user_tbl(name, signup_date) VALUES('yhqtest','2019-07-19'); INSERT 0 1 test1=# select *from user_tbl; name | signup_date ---------+------------- yhqtest | 2019-07-19 (1 row)
第一次主从切换
--master
--master [postgres@pg1 pgdata]$ pg_controldata | grep 'Database cluster state' Database cluster state: in production [postgres@pg1 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# CREATE TABLE t1(id integer not null,date TIMESTAMP NOT NULL DEFAULT LOCALTIMESTAMP(0)); CREATE TABLE test1=# INSERT INTO t1 (id) VALUES (1); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (2); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (3); INSERT 0 1 test1=# select * from t1; id | date ----+--------------------- 1 | 2019-07-19 14:26:27 2 | 2019-07-19 14:26:30 3 | 2019-07-19 14:26:32 (3 rows)
--slave
--slave [postgres@pg2 pgdata]$ pg_controldata | grep 'Database cluster state' Database cluster state: in archive recovery [postgres@pg2 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# select * from t1; id | date ----+--------------------- 1 | 2019-07-19 14:26:27 2 | 2019-07-19 14:26:30 3 | 2019-07-19 14:26:32 (3 rows) test1=# INSERT INTO t1 (id) VALUES (4); 2019-07-20 06:33:51.823 CST [2326] ERROR: cannot execute INSERT in a read-only transaction 2019-07-20 06:33:51.823 CST [2326] STATEMENT: INSERT INTO t1 (id) VALUES (4); ERROR: cannot execute INSERT in a read-only transaction
--master关掉pg
--master [postgres@pg1 pgdata]$ pg_ctl stop -m fast waiting for server to shut down....2019-07-19 14:37:47.168 CST [7897] LOG: received fast shutdown request 2019-07-19 14:37:47.171 CST [7897] LOG: aborting any active transactions 2019-07-19 14:37:47.172 CST [20572] FATAL: terminating connection due to administrator command 2019-07-19 14:37:47.172 CST [7897] LOG: worker process: logical replication launcher (PID 7905) exited with exit code 1 2019-07-19 14:37:47.174 CST [7899] LOG: shutting down Fri Jul 19 14:37:47 CST 2019 2019-07-19 14:37:47.484 CST [7897] LOG: database system is shut down done server stopped [postgres@pg1 pgdata]$ pg_controldata | grep 'Database cluster state' Database cluster state: shut down
--slave
--slave --错误日志 2019-07-20 06:41:10.304 CST [2915] FATAL: could not connect to the primary server: could not connect to server: Connection refused Is the server running on host "*.11" and accepting TCP/IP connections on port 5432? [postgres@pg2 pgdata]$ pg_ctl promote waiting for server to promote....2019-07-20 06:41:13.568 CST [24073] LOG: received promote request 2019-07-20 06:41:13.568 CST [24073] LOG: redo done at 0/5000028 2019-07-20 06:41:13.568 CST [24073] LOG: last completed transaction was at log time 2019-07-19 14:26:32.495757+08 2019-07-20 06:41:13.573 CST [24073] LOG: selected new timeline ID: 2 2019-07-20 06:41:14.116 CST [24073] LOG: archive recovery complete 2019-07-20 06:41:14.130 CST [24071] LOG: database system is ready to accept connections Sat Jul 20 06:41:14 CST 2019 Sat Jul 20 06:41:14 CST 2019 Sat Jul 20 06:41:14 CST 2019 done server promoted [postgres@pg2 pgdata]$ pg_controldata | grep 'Database cluster state' Database cluster state: in production [postgres@pg2 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# INSERT INTO t1 (id) VALUES (4); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (5); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (6); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (7); INSERT 0 1 test1=# select * from t1; id | date ----+--------------------- 1 | 2019-07-19 14:26:27 2 | 2019-07-19 14:26:30 3 | 2019-07-19 14:26:32 4 | 2019-07-20 06:42:29 5 | 2019-07-20 06:42:36 6 | 2019-07-20 06:42:42 7 | 2019-07-20 06:42:44 (7 rows) [postgres@pg2 pgdata]$ ll recovery.done -rw-rw-r--. 1 postgres postgres 199 Jul 20 02:58 recovery.done
--旧master修改并启动且加入
--master [postgres@pg1 pgdata]$ ll recovery ls: cannot access recovery: No such file or directory [postgres@pg1 pgdata]$ vim recovery.conf recovery_target_timeline = 'latest' standby_mode = 'on' primary_conninfo = 'user=repuser password=***** host=*.12 port=5432 sslmode=disable sslcompression=1 target_session_attrs=any' [postgres@pg1 pgdata]$ vim postgresql.conf max_connections = 1200 #max_wal_senders #wal_keep_segments [postgres@pg1 pgdata]$ pg_ctl start [postgres@pg1 pgdata]$ pg_controldata | grep 'Database cluster state' Database cluster state: in archive recovery [postgres@pg1 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# select * from t1; id | date ----+--------------------- 1 | 2019-07-19 14:26:27 2 | 2019-07-19 14:26:30 3 | 2019-07-19 14:26:32 4 | 2019-07-20 06:42:29 5 | 2019-07-20 06:42:36 6 | 2019-07-20 06:42:42 7 | 2019-07-20 06:42:44 (7 rows)
--再次把新slave停掉
[postgres@pg2 pgdata]$ pg_ctl stop -m fast
--旧master作为主
[postgres@pg1 pgdata]$ pg_ctl promote [postgres@pg1 pgdata]$ pg_controldata | grep 'Database cluster state' Database cluster state: in production [postgres@pg1 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# INSERT INTO t1 (id) VALUES (17); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (18); INSERT 0 1 test1=# INSERT INTO t1 (id) VALUES (19); INSERT 0 1 test1=# select * from t1; id | date ----+--------------------- 1 | 2019-07-19 14:26:27 2 | 2019-07-19 14:26:30 3 | 2019-07-19 14:26:32 4 | 2019-07-20 06:42:29 5 | 2019-07-20 06:42:36 6 | 2019-07-20 06:42:42 7 | 2019-07-20 06:42:44 17 | 2019-07-19 14:48:22 18 | 2019-07-19 14:48:27 19 | 2019-07-19 14:48:30 (10 rows)
--最开始的slave,依然作为slave加入
--slave [postgres@pg2 pgdata]$ mv recovery.done recovery.conf [postgres@pg2 pgdata]$ pg_ctl start
--new master
--master test1=# select client_addr,sync_state from pg_stat_replication; client_addr | sync_state --------------+------------ *.12 | async (1 row) --slave [postgres@pg2 pgdata]$ psql psql (10.5) Type "help" for help. postgres=# \c test1 You are now connected to database "test1" as user "postgres". test1=# select * from t1; id | date ----+--------------------- 1 | 2019-07-19 14:26:27 2 | 2019-07-19 14:26:30 3 | 2019-07-19 14:26:32 4 | 2019-07-20 06:42:29 5 | 2019-07-20 06:42:36 6 | 2019-07-20 06:42:42 7 | 2019-07-20 06:42:44 17 | 2019-07-19 14:48:22 18 | 2019-07-19 14:48:27 19 | 2019-07-19 14:48:30 (10 rows)
--note:
1 pg1和pg2的时间不一样,暴露了在初始化的时候没有修改正确
同步2台主机的时间
[root@pg2 ~]# date Sat Jul 20 06:56:13 CST 2019 [root@pg1 pgdata]# date Fri Jul 19 14:48:55 CST 2019 [root@pg1 pgdata]# ntpdate 0.arch.pool.ntp.org 19 Jul 15:00:41 ntpdate[19435]: step time server *** offset 155.499032 sec [root@pg2 ~]# ntpdate 0.arch.pool.ntp.org 19 Jul 14:59:55 ntpdate[4130]: adjust time server *** offset 0.004792 sec
2 sync_state=async 异步流复制
修改为sync同步参数
#synchronous_standby_names = '' # standby servers that provide sync rep # method to choose sync standbys, number of sync standbys, # and comma-separated list of application_name # from standby(s); '*' = all
postgres=# SELECT client_addr,application_name,sync_state FROM pg_stat_replication; client_addr | application_name | sync_state --------------+------------------+------------ *.12 | walreceiver | async (1 row) [postgres@pg1 pgdata]$ vim postgresql.conf [postgres@pg1 pgdata]$ pg_ctl reload server signaled 2019-07-19 15:44:09.263 CST [23330] LOG: received SIGHUP, reloading configuration files 2019-07-19 15:44:09.264 CST [23330] LOG: parameter "synchronous_standby_names" changed to "walreceiver" 2019-07-19 15:44:14.879 CST [3985] LOG: standby "walreceiver" is now a synchronous standby with priority 1 postgres=# SELECT client_addr,application_name,sync_state FROM pg_stat_replication; client_addr | application_name | sync_state --------------+------------------+------------ *.12 | walreceiver | sync (1 row)
postgres=# \x Expanded display is on. postgres=# select * from pg_stat_replication; -[ RECORD 1 ]----+------------------------------ pid | 3985 usesysid | 16384 usename | repuser application_name | walreceiver client_addr | *.12 client_hostname | client_port | 50758 backend_start | 2019-07-19 14:51:11.747452+08 backend_xmin | 573 state | streaming sent_lsn | 0/60005E8 write_lsn | 0/60005E8 flush_lsn | 0/60005E8 replay_lsn | 0/60005E8 write_lag | flush_lag | replay_lag | sync_priority | 1 sync_state | sync