centos7部署postgresql集群高可用 patroni + etcd 之patroni篇
实验环境:centos7.4纯净版
postgres版本: 9.6.15
etcd版本:3.3.11
patroni版本:1.6.0
patroni介绍可参考:https://github.com/zalando/patroni
ip规划
192.168.216.130 node1 master
192.168.216.132 node2 slave
192.168.216.134 node3 slave
etcd集群部署请看上一篇文章:https://www.cnblogs.com/caidingyu/p/11408389.html
postgres部署参考文章:https://www.cnblogs.com/virtulreal/p/9921978.html
修改node1中postgresql.conf配置如下
max_connections = '100' max_wal_senders = '10' port = '5432' listen_addresses = '0.0.0.0' synchronous_commit = on full_page_writes = on wal_log_hints = on synchronous_standby_names = '*' max_replication_slots = 10 wal_level = replica
修改node1中pg_hba.conf配置如下
[root@localhost data]# more pg_hba.conf|grep -v ^#|grep -v ^$ local all all peer host all all 127.0.0.1/32 md5 host all postgres 127.0.0.1/32 md5 host all all 192.168.216.0/24 md5 host all all ::1/128 md5 local replication replicator peer host replication replicator 127.0.0.1/32 md5 host replication replicator ::1/128 md5 host replication replicator 192.168.216.130/32 md5 host replication replicator 192.168.216.132/32 md5 host replication replicator 192.168.216.134/32 md5
node1上创建复制槽,至关重要,patroni 用到了这个玩意
postgres=# create user replicator replication login encrypted password '1qaz2wsx'; postgres=# alter user postgres with password '1qaz2wsx'; postgres=# select * from pg_create_physical_replication_slot('pgsql96_node1'); postgres=# select * from pg_create_physical_replication_slot('pgsql96_node2'); postgres=# select * from pg_create_physical_replication_slot('pgsql96_node3');
node2 配置stream replication
systemctl stop postgresql-9.6 su - postgres cd /var/lib/pgsql/9.6/data rm -rf ./* /usr/pgsql-9.6/bin/pg_basebackup -h 192.168.216.130 -D /var/lib/pgsql/9.6/data -U replicator -v -P -R vi recovery.conf recovery_target_timeline = 'latest' standby_mode = 'on' primary_conninfo = 'host=192.168.216.130 port=5432 user=replicator password=1qaz2wsx' primary_slot_name = 'pgsql96_node2' trigger_file = '/tmp/postgresql.trigger.5432' 执行exit返回root用户 systemctl start postgresql-9.6
node3 配置stream replication
systemctl stop postgresql-9.6 su - postgres cd /var/lib/pgsql/9.6/data rm -rf ./* /usr/pgsql-9.6/bin/pg_basebackup -h 192.168.216.130 -D /var/lib/pgsql/9.6/data -U replicator -v -P -R vi recovery.conf recovery_target_timeline = 'latest' standby_mode = 'on' primary_conninfo = 'host=192.168.216.130 port=5432 user=replicator password=1qaz2wsx' primary_slot_name = 'pgsql96_node3' trigger_file = '/tmp/postgresql.trigger.5432' 执行exit返回root用户 systemctl start postgresql-9.6
在node1上连接数据库,查看复制状态
select client_addr,pg_xlog_location_diff(sent_location, write_location) as write_delay,pg_xlog_location_diff(sent_location, flush_location) as flush_delay,pg_xlog_location_diff(sent_location, replay_location) as replay_delay from pg_stat_replication;
下载、安装 patroni,如遇网络问题可以多执行几次pip install或者切换其他pip源
yum install gcc yum install python-devel.x86_64 cd /tmp curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py python get-pip.py pip install psycopg2-binary pip install patroni[etcd,consul]
验证是否安装成功
which patroni patronictl --help
node1上patroni 配置文件如下,该配置文件需要手动创建
mkdir -p /usr/patroni/conf cd /usr/patroni/conf/ cat /usr/patroni/conf/patroni_postgresql.yml scope: pgsql96 namespace: /pgsql/ name: pgsql96_node1 restapi: listen: 192.168.216.130:8008 connect_address: 192.168.216.130:8008 etcd: host: 192.168.216.130:2379 bootstrap: # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster # and all other cluster members will use it as a `global configuration` dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false postgresql: use_pg_rewind: true use_slots: true parameters: listen_addresses: "0.0.0.0" port: 5432 wal_level: logical hot_standby: "on" wal_keep_segments: 1000 max_wal_senders: 10 max_replication_slots: 10 wal_log_hints: "on" # archive_mode: "on" # archive_timeout: 1800s # archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz # recovery_conf: # restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p postgresql: listen: 0.0.0.0:5432 connect_address: 192.168.216.130:5432 data_dir: /var/lib/pgsql/9.6/data bin_dir: /usr/pgsql-9.6/bin # config_dir: /etc/postgresql/9.6/main authentication: replication: username: replicator password: 1qaz2wsx superuser: username: postgres password: 1qaz2wsx #watchdog: # mode: automatic # Allowed values: off, automatic, required # device: /dev/watchdog # safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
node2上patroni 配置文件如下
[root@localhost etcd]# cat /usr/patroni/conf/patroni_postgresql.yml scope: pgsql96 namespace: /pgsql/ name: pgsql96_node2 restapi: listen: 192.168.216.132:8008 connect_address: 192.168.216.132:8008 etcd: host: 192.168.216.132:2379 bootstrap: # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster # and all other cluster members will use it as a `global configuration` dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false postgresql: use_pg_rewind: true use_slots: true parameters: listen_addresses: "0.0.0.0" port: 5432 wal_level: logical hot_standby: "on" wal_keep_segments: 1000 max_wal_senders: 10 max_replication_slots: 10 wal_log_hints: "on" # archive_mode: "on" # archive_timeout: 1800s # archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz # recovery_conf: # restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p postgresql: listen: 0.0.0.0:5432 connect_address: 192.168.216.132:5432 data_dir: /var/lib/pgsql/9.6/data bin_dir: /usr/pgsql-9.6/bin # config_dir: /etc/postgresql/9.6/main authentication: replication: username: replicator password: 1qaz2wsx superuser: username: postgres password: 1qaz2wsx #watchdog: # mode: automatic # Allowed values: off, automatic, required # device: /dev/watchdog # safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
node3上patroni 配置文件如下
scope: pgsql96 namespace: /pgsql/ name: pgsql96_node3 restapi: listen: 192.168.216.134:8008 connect_address: 192.168.216.134:8008 etcd: host: 192.168.216.134:2379 bootstrap: # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster # and all other cluster members will use it as a `global configuration` dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false postgresql: use_pg_rewind: true use_slots: true parameters: listen_addresses: "0.0.0.0" port: 5432 wal_level: logical hot_standby: "on" wal_keep_segments: 1000 max_wal_senders: 10 max_replication_slots: 10 wal_log_hints: "on" # archive_mode: "on" # archive_timeout: 1800s # archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz # recovery_conf: # restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p postgresql: listen: 0.0.0.0:5432 connect_address: 192.168.216.134:5432 data_dir: /var/lib/pgsql/9.6/data bin_dir: /usr/pgsql-9.6/bin # config_dir: /etc/postgresql/9.6/main authentication: replication: username: replicator password: 1qaz2wsx superuser: username: postgres password: 1qaz2wsx #watchdog: # mode: automatic # Allowed values: off, automatic, required # device: /dev/watchdog # safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false
手动启动 patroni,注意这里需要切换到postgres用户下启动
node1、node2、node3 三个节点依次启动
patroni /usr/patroni/conf/patroni_postgresql.yml
查看 patroni 集群状态
克隆一个窗口,执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml list
查看 etcd 的 信息
etcdctl ls /pgsql/pgsql96
etcdctl get /pgsql/pgsql96/members/pgsql96_node1
为了方便开机自启,故配置成 patroni.service,3个node都需要进行配置,配置好patroni.service后就可以直接在root用户下切换Leader以及重启postgres节点等操作
[root@localhost data]# vi /etc/systemd/system/patroni.service [root@localhost data]# cat /etc/systemd/system/patroni.service [Unit] Description=patroni - a high-availability PostgreSQL Documentation=https://patroni.readthedocs.io/en/latest/index.html After=syslog.target network.target etcd.target Wants=network-online.target [Service] Type=simple User=postgres Group=postgres PermissionsStartOnly=true ExecStart=/usr/bin/patroni /usr/patroni/conf/patroni_postgresql.yml ExecReload=/bin/kill -HUP $MAINPID LimitNOFILE=65536 KillMode=process KillSignal=SIGINT Restart=on-abnormal RestartSec=30s TimeoutSec=0 [Install] WantedBy=multi-user.target
禁止 postgresql 的自启动,通过 patroni 来管理 postgresql
systemctl status patroni systemctl start patroni systemctl enable patroni systemctl status postgresql systemctl disable postgresql systemctl status etcd systemctl enable etcd
如何手动切换Leader
执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml switchover
多执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml list进行刷新,可以看到Leader由node2切换为node1
可以尝试模拟node节点断电,即手动停止Leader所在节点的patroni服务,此时查询集群状态,可以看到Leader自动切换到其他node节点上
目前可以看到Leader位于132节点上,
在132节点上停止patroni服务
再次查看集群状态,可以看到132被stop,Leader自动被转移到134上
参考:https://blog.csdn.net/ctypyb2002/article/details/81007990
https://www.cnblogs.com/ctypyb2002/p/9792939.html