PG primary and standby switchover
http://blog.sina.com.cn/s/blog_544a710b0101a122.html
http://blog.51cto.com/heyiyi/1898506
https://blog.csdn.net/fjgui/article/details/47421609
https://blog.csdn.net/baiyinqiqi/article/details/47951687
- 1. On the standby, add recovery_target_timeline = 'latest' to $PGDATA/recovery.conf
The official documentation since PG 9 includes this passage (release-note style):
Allow standby recovery to switch to a new timeline automatically (Heikki Linnakangas)
Now standby servers scan the archive directory for new timelines periodically
What is a new timeline? It shows up in the promote log further down.
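The timeline ID is also visible in file names: a WAL segment name starts with eight hex digits for the timeline, and each new timeline gets a `<timeline>.history` file (the `00000002.history` that appears in the logs of step 3). A minimal sketch for reading the timeline out of such a name; `timeline_of` is a made-up helper, and the segment name in the second call is illustrative, not taken from these servers:

```python
import re

# WAL segment: 8 hex digits timeline + 8 hex digits log + 8 hex digits segment.
WAL_SEGMENT = re.compile(r"^([0-9A-F]{8})([0-9A-F]{8})([0-9A-F]{8})$")
# Timeline history file: 8 hex digits timeline + ".history".
HISTORY_FILE = re.compile(r"^([0-9A-F]{8})\.history$")


def timeline_of(filename):
    """Return the timeline ID encoded in a WAL or history file name, else None."""
    m = WAL_SEGMENT.match(filename) or HISTORY_FILE.match(filename)
    return int(m.group(1), 16) if m else None


print(timeline_of("00000002.history"))          # 2
print(timeline_of("00000001000000000000001A"))  # 1 (illustrative segment name)
```

With recovery_target_timeline = 'latest', a standby follows whichever timeline is newest instead of staying on the one it started with.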
- 2. Shut down the primary
pg_ctl stop -D $PGDATA -m fast
2018-11-27 17:23:01.059 CST,,,1624,,5bfcd2a7.658,1,,2018-11-27 13:14:15 CST,,0,LOG,00000,"shutting down",,,,,,,,,""
2018-11-27 17:23:01.443 CST,,,1624,,5bfcd2a7.658,2,,2018-11-27 13:14:15 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""
2018-11-27 17:23:01.672 CST,"repl","",3204,"172.16.10.142:58547",5bfd0cf5.c84,1,"",2018-11-27 17:23:01 CST,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,""
2018-11-27 17:23:02.839 CST,"role1","pdb1",3205,"10.1.161.35:54606",5bfd0cf6.c85,1,"",2018-11-27 17:23:02 CST,,0,FATAL,57P03,"the database system is shutting down",,,,,,,,,""
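Before promoting, it can help to confirm from the log that the old primary really finished shutting down. The sketch below parses csvlog text with Python's csv module and looks for the "database system is shut down" entry. The column list is the 23-column csvlog layout of 9.x as I understand it; `parse_csvlog` and `shutdown_completed` are names invented here.

```python
import csv
import io

# Assumed csvlog column order for PostgreSQL 9.x (23 columns).
CSVLOG_COLUMNS = [
    "log_time", "user_name", "database_name", "process_id", "connection_from",
    "session_id", "session_line_num", "command_tag", "session_start_time",
    "virtual_transaction_id", "transaction_id", "error_severity",
    "sql_state_code", "message", "detail", "hint", "internal_query",
    "internal_query_pos", "context", "query", "query_pos", "location",
    "application_name",
]


def parse_csvlog(text):
    """Parse csvlog text into a list of dicts; quoted multi-line messages are handled by csv."""
    reader = csv.reader(io.StringIO(text, newline=""))
    return [dict(zip(CSVLOG_COLUMNS, row)) for row in reader if row]


def shutdown_completed(text):
    """True if the log contains the final 'database system is shut down' entry."""
    return any(
        r["error_severity"] == "LOG" and r["message"] == "database system is shut down"
        for r in parse_csvlog(text)
    )


# The first two log lines from this step.
SAMPLE = '''2018-11-27 17:23:01.059 CST,,,1624,,5bfcd2a7.658,1,,2018-11-27 13:14:15 CST,,0,LOG,00000,"shutting down",,,,,,,,,""
2018-11-27 17:23:01.443 CST,,,1624,,5bfcd2a7.658,2,,2018-11-27 13:14:15 CST,,0,LOG,00000,"database system is shut down",,,,,,,,,""
'''

print(shutdown_completed(SAMPLE))
```

The FATAL 57P03 lines that follow are just clients (the replication user and an application role) being refused while the server goes down.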
- 3. Promote the standby
pg_ctl promote -D $PGDATA
2018-11-27 17:25:02.448 CST,,,1940,,5bfd0d6e.794,1,,2018-11-27 17:25:02 CST,,0,FATAL,XX000,"could not connect to the primary server: could not connect to server: Connection refused
Is the server running on host ""172.16.10.100"" and accepting
TCP/IP connections on port 5432?
",,,,,,,,,""
2018-11-27 17:25:03.792 CST,,,31753,,5bfd0874.7c09,7,,2018-11-27 17:03:48 CST,1/0,0,LOG,00000,"received promote request",,,,,,,,,""
2018-11-27 17:25:03.792 CST,,,31753,,5bfd0874.7c09,8,,2018-11-27 17:03:48 CST,1/0,0,LOG,00000,"redo done at 0/19000028",,,,,,,,,""
2018-11-27 17:25:03.792 CST,,,31753,,5bfd0874.7c09,9,,2018-11-27 17:03:48 CST,1/0,0,LOG,00000,"last completed transaction was at log time 2018-11-27 17:06:58.916715+08",,,,,,,,,""
2018-11-27 17:25:03.794 CST,,,31753,,5bfd0874.7c09,10,,2018-11-27 17:03:48 CST,1/0,0,LOG,00000,"selected new timeline ID: 2",,,,,,,,,""
2018-11-27 17:25:03.836 CST,,,31753,,5bfd0874.7c09,11,,2018-11-27 17:03:48 CST,1/0,0,FATAL,42501,"could not open file ""recovery.conf"": Permission denied",,,,,,,,,""
2018-11-27 17:25:03.836 CST,,,31751,,5bfd0874.7c07,3,,2018-11-27 17:03:48 CST,,0,LOG,00000,"startup process (PID 31753) exited with exit code 1",,,,,,,,,""
2018-11-27 17:25:03.836 CST,,,31751,,5bfd0874.7c07,4,,2018-11-27 17:03:48 CST,,0,LOG,00000,"terminating any other active server processes",,,,,,,,,""
2018-11-27 17:25:03.836 CST,"postgres","pdb1",32068,"[local]",5bfd091d.7d44,1,"idle",2018-11-27 17:06:37 CST,3/0,0,WARNING,57P02,"terminating connection because of crash of another server process","The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.","In a moment you should be able to reconnect to the database and repeat your command.",,,,,,,"psql"
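The server had no permission to modify recovery.conf, so the instance processes were terminated, and the instance could not continue after being started again.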
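The FATAL "could not open file recovery.conf: Permission denied" suggests checking the file before running pg_ctl promote. A rough pre-flight sketch, assuming the cause is file or directory permissions (the log does not say more than that); `promote_preflight` is a hypothetical helper and must run as the OS user that owns the server process:

```python
import os


def promote_preflight(pgdata):
    """Return a list of problems that could stop promote from renaming recovery.conf."""
    problems = []
    conf = os.path.join(pgdata, "recovery.conf")
    if not os.path.isfile(conf):
        problems.append("recovery.conf not found")
    elif not os.access(conf, os.R_OK | os.W_OK):
        problems.append("recovery.conf is not readable and writable by this user")
    # Renaming to recovery.done also needs write access to the directory itself.
    if not os.access(pgdata, os.W_OK | os.X_OK):
        problems.append("data directory is not writable by this user")
    return problems


if __name__ == "__main__":
    # Falls back to the current directory when PGDATA is not set.
    print(promote_preflight(os.environ.get("PGDATA", ".")))
```

An empty list means none of the checked conditions failed; it does not guarantee that the promote will succeed.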
2018-11-28 10:12:51.648 CST,,,18795,,5bfdf855.496b,7,,2018-11-28 10:07:17 CST,1/0,0,LOG,00000,"received promote request",,,,,,,,,""
2018-11-28 10:12:51.648 CST,,,18795,,5bfdf855.496b,8,,2018-11-28 10:07:17 CST,1/0,0,LOG,00000,"redo done at 0/1A000028",,,,,,,,,""
2018-11-28 10:12:51.648 CST,,,18795,,5bfdf855.496b,9,,2018-11-28 10:07:17 CST,1/0,0,LOG,00000,"last completed transaction was at log time 2018-11-28 10:10:28.375684+08",,,,,,,,,""
2018-11-28 10:12:51.649 CST,,,18795,,5bfdf855.496b,10,,2018-11-28 10:07:17 CST,1/0,0,LOG,00000,"selected new timeline ID: 2",,,,,,,,,""
2018-11-28 10:12:51.697 CST,,,18795,,5bfdf855.496b,11,,2018-11-28 10:07:17 CST,1/0,0,LOG,00000,"archive recovery complete",,,,,,,,,""
2018-11-28 10:12:51.715 CST,,,18795,,5bfdf855.496b,12,,2018-11-28 10:07:17 CST,1/0,0,LOG,00000,"MultiXact member wraparound protections are now enabled",,,,,,,,,""
2018-11-28 10:12:51.716 CST,,,18793,,5bfdf855.4969,3,,2018-11-28 10:07:17 CST,,0,LOG,00000,"database system is ready to accept connections",,,,,,,,,""
2018-11-28 10:12:51.716 CST,,,19752,,5bfdf9a3.4d28,1,,2018-11-28 10:12:51 CST,,0,LOG,00000,"autovacuum launcher started",,,,,,,,,""
2018-11-28 10:12:51.760 CST,,,19753,,5bfdf9a3.4d29,1,,2018-11-28 10:12:51 CST,,0,LOG,00000,"archive command failed with exit code 1","The failed archive command was: test ! -f /mysqldata/pg/pgarch/00000002.history && cp pg_xlog/00000002.history /mysqldata/pg/pgarch/00000002.history",,,,,,,,""
2018-11-28 10:12:52.763 CST,,,19753,,5bfdf9a3.4d29,2,,2018-11-28 10:12:51 CST,,0,LOG,00000,"archive command failed with exit code 1","The failed archive command was: test ! -f /mysqldata/pg/pgarch/00000002.history && cp pg_xlog/00000002.history /mysqldata/pg/pgarch/00000002.history",,,,,,,,""
2018-11-28 10:12:53.766 CST,,,19753,,5bfdf9a3.4d29,3,,2018-11-28 10:12:51 CST,,0,LOG,00000,"archive command failed with exit code 1","The failed archive command was: test ! -f /mysqldata/pg/pgarch/00000002.history && cp pg_xlog/00000002.history /mysqldata/pg/pgarch/00000002.history",,,,,,,,""
2018-11-28 10:12:53.766 CST,,,19753,,5bfdf9a3.4d29,4,,2018-11-28 10:12:51 CST,,0,WARNING,01000,"archiving transaction log file ""00000002.history"" failed too many times, will try again later",,,,,,,,,""
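The promote itself succeeded, but the new primary cannot archive 00000002.history. The log only gives exit code 1, so the actual cause is not known from these notes. A small sketch that mirrors the `test ! -f dest && cp src dest` command and names the first condition that would make it fail; `diagnose_archive` is a made-up helper, and the list of causes is a guess at the usual suspects, not a complete one:

```python
import os


def diagnose_archive(src, dest):
    """Mirror `test ! -f dest && cp src dest`; return the first failing condition or None."""
    if os.path.isfile(dest):
        return "destination already exists (test ! -f fails)"
    if not os.path.isfile(src):
        return "source file missing (cp would fail)"
    dest_dir = os.path.dirname(dest) or "."
    if not os.path.isdir(dest_dir):
        return "archive directory missing (cp would fail)"
    if not os.access(dest_dir, os.W_OK | os.X_OK):
        return "archive directory not writable (cp would fail)"
    return None


if __name__ == "__main__":
    # archive_command runs with the data directory as its working directory,
    # so run this from $PGDATA as the postgres OS user.
    print(diagnose_archive("pg_xlog/00000002.history",
                           "/mysqldata/pg/pgarch/00000002.history"))
```

On a freshly promoted standby, a missing or unwritable archive directory is a plausible candidate, since the directory may only ever have been set up on the old primary.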
[postgres@mycat02 ~]$ pg_controldata
pg_control version number: 942
Catalog version number: 201409291
Database system identifier: 6583145462094845370
Database cluster state: in production
The standby has now become the primary; in $PGDATA, recovery.conf has been renamed to recovery.done.
- 4. Bring the old primary back as a standby in the new setup
cd $PGDATA
mv recovery.done recovery.conf
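vi recovery.conf
standby_mode = on # run as a standby
primary_conninfo = 'host=172.16.10.143 port=5432 user=repl password=mall%9K0924' # connection info for the new primary
recovery_target_timeline = 'latest' # follow the newest timeline, i.e. the one created by the promote
vi postgresql.conf
hot_standby = on
# On the new standby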
[postgres@mysql56 pg_log]$ pg_controldata
pg_control version number: 942
Catalog version number: 201409291
Database system identifier: 6583145462094845370
Database cluster state: in archive recovery
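The "Database cluster state" line is what distinguishes the two roles here: "in production" on the promoted server, "in archive recovery" on the new standby. A minimal sketch that reads the role from pg_controldata text; `parse_controldata` and `cluster_role` are hypothetical names, and only the two states seen in these notes are mapped:

```python
def parse_controldata(output):
    """Turn `key: value` lines of pg_controldata output into a dict."""
    fields = {}
    for line in output.splitlines():
        key, sep, value = line.partition(":")  # split on the first colon only
        if sep:
            fields[key.strip()] = value.strip()
    return fields


def cluster_role(output):
    """Map the cluster state to a role; anything else is reported as unknown."""
    state = parse_controldata(output).get("Database cluster state")
    if state == "in production":
        return "primary"
    if state == "in archive recovery":
        return "standby"
    return "unknown"


# Output captured on the new standby in this step.
SAMPLE = """pg_control version number: 942
Catalog version number: 201409291
Database system identifier: 6583145462094845370
Database cluster state: in archive recovery
"""

print(cluster_role(SAMPLE))
```

Both servers report the same database system identifier, which is expected for a primary and a standby cloned from it.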
- 5. Cascading replication status
master_172.16.10.143 --> slave01_172.16.10.100 --> slave02_172.16.10.142
# master
postgres=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 20456
usesysid | 16426
usename | repl
application_name | walreceiver
client_addr | 172.16.10.100
client_hostname |
client_port | 39208
backend_start | 2018-11-28 10:17:55.837594+08
backend_xmin |
state | streaming
sent_location | 0/1A000348
write_location | 0/1A000348
flush_location | 0/1A000348
replay_location | 0/1A000348
sync_priority | 0
sync_state | async
# slave01
pdb1=# select * from pg_stat_replication;
-[ RECORD 1 ]----+------------------------------
pid | 8202
usesysid | 16426
usename | repl
application_name | walreceiver
client_addr | 172.16.10.142
client_hostname |
client_port | 60725
backend_start | 2018-11-28 10:17:55.108761+08
backend_xmin | 1892
state | streaming
sent_location | 0/1A000348
write_location | 0/1A000348
flush_location | 0/1A000348
replay_location | 0/1A000348
sync_priority | 0
sync_state | async
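In both views sent_location and replay_location are equal, so neither downstream server is behind. To put a number on the lag, an LSN such as 0/1A000348 can be converted to a byte position (high 32 bits before the slash, low 32 bits after) and subtracted. A minimal sketch with invented helper names:

```python
def lsn_to_int(lsn):
    """Convert an LSN like '0/1A000348' to an absolute byte position."""
    hi, lo = lsn.split("/")
    return (int(hi, 16) << 32) | int(lo, 16)


def lag_bytes(sent, replayed):
    """Bytes of WAL sent by the upstream but not yet replayed downstream."""
    return lsn_to_int(sent) - lsn_to_int(replayed)


# Values from pg_stat_replication on the master above.
print(lag_bytes("0/1A000348", "0/1A000348"))  # 0
```

On the server side, pg_xlog_location_diff() does the same subtraction in SQL.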