centos7部署postgresql集群高可用 patroni + etcd 之patroni篇
实验环境:centos7.4纯净版
postgres版本: 9.6.15
etcd版本:3.3.11
patroni版本:1.6.0
patroni介绍可参考:https://github.com/zalando/patroni
ip规划
192.168.216.130 node1 master
192.168.216.132 node2 slave
192.168.216.134 node3 slave
etcd集群部署请看上一篇文章:https://www.cnblogs.com/caidingyu/p/11408389.html
postgres部署参考文章:https://www.cnblogs.com/virtulreal/p/9921978.html
修改node1中postgresql.conf配置如下
1 2 3 4 5 6 7 8 9 10 | max_connections = '100' max_wal_senders = '10' port = '5432' listen_addresses = '0.0.0.0' synchronous_commit = on full_page_writes = on wal_log_hints = on synchronous_standby_names = '*' max_replication_slots = 10 wal_level = replica |
修改node1中pg_hba.conf配置如下
1 2 3 4 5 6 7 8 9 10 11 12 | [root@localhost data] # more pg_hba.conf|grep -v ^#|grep -v ^$ local all all peer host all all 127.0.0.1 /32 md5 host all postgres 127.0.0.1 /32 md5 host all all 192.168.216.0 /24 md5 host all all ::1 /128 md5 local replication replicator peer host replication replicator 127.0.0.1 /32 md5 host replication replicator ::1 /128 md5 host replication replicator 192.168.216.130 /32 md5 host replication replicator 192.168.216.132 /32 md5 host replication replicator 192.168.216.134 /32 md5 |
node1上创建复制槽,至关重要,patroni 用到了这个玩意
1 2 3 4 5 | postgres= # create user replicator replication login encrypted password '1qaz2wsx'; postgres= # alter user postgres with password '1qaz2wsx'; postgres= # select * from pg_create_physical_replication_slot('pgsql96_node1'); postgres= # select * from pg_create_physical_replication_slot('pgsql96_node2'); postgres= # select * from pg_create_physical_replication_slot('pgsql96_node3'); |
node2 配置stream replication
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | systemctl stop postgresql-9.6 su - postgres cd /var/lib/pgsql/9 .6 /data rm -rf ./* /usr/pgsql-9 .6 /bin/pg_basebackup -h 192.168.216.130 -D /var/lib/pgsql/9 .6 /data -U replicator - v -P -R vi recovery.conf recovery_target_timeline = 'latest' standby_mode = 'on' primary_conninfo = 'host=192.168.216.130 port=5432 user=replicator password=1qaz2wsx' primary_slot_name = 'pgsql96_node2' trigger_file = '/tmp/postgresql.trigger.5432' 执行 exit 返回root用户 systemctl start postgresql-9.6 |
node3 配置stream replication
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 | systemctl stop postgresql-9.6 su - postgres cd /var/lib/pgsql/9 .6 /data rm -rf ./* /usr/pgsql-9 .6 /bin/pg_basebackup -h 192.168.216.130 -D /var/lib/pgsql/9 .6 /data -U replicator - v -P -R vi recovery.conf recovery_target_timeline = 'latest' standby_mode = 'on' primary_conninfo = 'host=192.168.216.130 port=5432 user=replicator password=1qaz2wsx' primary_slot_name = 'pgsql96_node3' trigger_file = '/tmp/postgresql.trigger.5432' 执行 exit 返回root用户 systemctl start postgresql-9.6 |
在node1上连接数据库,查看复制状态
1 | select client_addr,pg_xlog_location_diff(sent_location, write_location) as write_delay,pg_xlog_location_diff(sent_location, flush_location) as flush_delay,pg_xlog_location_diff(sent_location, replay_location) as replay_delay from pg_stat_replication; |
下载、安装 patroni,如遇网络问题可以多执行几次pip install或者切换其他pip源
1 2 3 4 5 6 7 | yum install gcc yum install python-devel.x86_64 cd /tmp curl https: //bootstrap .pypa.io /get-pip .py -o get-pip.py python get-pip.py pip install psycopg2-binary pip install patroni[etcd,consul] |
验证是否安装成功
1 2 | which patroni patronictl --help |
node1上patroni 配置文件如下,该配置文件需要手动创建
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 | mkdir -p /usr/patroni/conf cd /usr/patroni/conf/ cat /usr/patroni/conf/patroni_postgresql .yml scope: pgsql96 namespace: /pgsql/ name: pgsql96_node1 restapi: listen: 192.168.216.130:8008 connect_address: 192.168.216.130:8008 etcd: host: 192.168.216.130:2379 bootstrap: # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster # and all other cluster members will use it as a `global configuration` dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false postgresql: use_pg_rewind: true use_slots: true parameters: listen_addresses: "0.0.0.0" port: 5432 wal_level: logical hot_standby: "on" wal_keep_segments: 1000 max_wal_senders: 10 max_replication_slots: 10 wal_log_hints: "on" # archive_mode: "on" # archive_timeout: 1800s # archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz # recovery_conf: # restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p postgresql: listen: 0.0.0.0:5432 connect_address: 192.168.216.130:5432 data_dir: /var/lib/pgsql/9 .6 /data bin_dir: /usr/pgsql-9 .6 /bin # config_dir: /etc/postgresql/9.6/main authentication: replication: username: replicator password: 1qaz2wsx superuser: username: postgres password: 1qaz2wsx #watchdog: # mode: automatic # Allowed values: off, automatic, required # device: /dev/watchdog # safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false |
node2上patroni 配置文件如下
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 | [root@localhost etcd] # cat /usr/patroni/conf/patroni_postgresql.yml scope: pgsql96 namespace: /pgsql/ name: pgsql96_node2 restapi: listen: 192.168.216.132:8008 connect_address: 192.168.216.132:8008 etcd: host: 192.168.216.132:2379 bootstrap: # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster # and all other cluster members will use it as a `global configuration` dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false postgresql: use_pg_rewind: true use_slots: true parameters: listen_addresses: "0.0.0.0" port: 5432 wal_level: logical hot_standby: "on" wal_keep_segments: 1000 max_wal_senders: 10 max_replication_slots: 10 wal_log_hints: "on" # archive_mode: "on" # archive_timeout: 1800s # archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz # recovery_conf: # restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p postgresql: listen: 0.0.0.0:5432 connect_address: 192.168.216.132:5432 data_dir: /var/lib/pgsql/9 .6 /data bin_dir: /usr/pgsql-9 .6 /bin # config_dir: /etc/postgresql/9.6/main authentication: replication: username: replicator password: 1qaz2wsx superuser: username: postgres password: 1qaz2wsx #watchdog: # mode: automatic # Allowed values: off, automatic, required # device: /dev/watchdog # safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false |
node3上patroni 配置文件如下
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 | scope: pgsql96 namespace: /pgsql/ name: pgsql96_node3 restapi: listen: 192.168.216.134:8008 connect_address: 192.168.216.134:8008 etcd: host: 192.168.216.134:2379 bootstrap: # this section will be written into Etcd:/<namespace>/<scope>/config after initializing new cluster # and all other cluster members will use it as a `global configuration` dcs: ttl: 30 loop_wait: 10 retry_timeout: 10 maximum_lag_on_failover: 1048576 master_start_timeout: 300 synchronous_mode: false postgresql: use_pg_rewind: true use_slots: true parameters: listen_addresses: "0.0.0.0" port: 5432 wal_level: logical hot_standby: "on" wal_keep_segments: 1000 max_wal_senders: 10 max_replication_slots: 10 wal_log_hints: "on" # archive_mode: "on" # archive_timeout: 1800s # archive_command: gzip < %p > /data/backup/pgwalarchive/%f.gz # recovery_conf: # restore_command: gunzip < /data/backup/pgwalarchive/%f.gz > %p postgresql: listen: 0.0.0.0:5432 connect_address: 192.168.216.134:5432 data_dir: /var/lib/pgsql/9 .6 /data bin_dir: /usr/pgsql-9 .6 /bin # config_dir: /etc/postgresql/9.6/main authentication: replication: username: replicator password: 1qaz2wsx superuser: username: postgres password: 1qaz2wsx #watchdog: # mode: automatic # Allowed values: off, automatic, required # device: /dev/watchdog # safety_margin: 5 tags: nofailover: false noloadbalance: false clonefrom: false nosync: false |
手动启动 patroni,注意这里需要切换到postgres用户下启动
node1、node2、node3 三个节点依次启动
1 | patroni /usr/patroni/conf/patroni_postgresql .yml |
查看 patroni 集群状态
克隆一个窗口,执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml list
查看 etcd 的 信息
etcdctl ls /pgsql/pgsql96
etcdctl get /pgsql/pgsql96/members/pgsql96_node1
为了方便开机自启,故配置成 patroni.service,3个node都需要进行配置,配置好patroni.service后就可以直接在root用户下切换Leader以及重启postgres节点等操作
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 | [root@localhost data] # vi /etc/systemd/system/patroni.service [root@localhost data] # cat /etc/systemd/system/patroni.service [Unit] Description=patroni - a high-availability PostgreSQL Documentation=https: //patroni .readthedocs.io /en/latest/index .html After=syslog.target network.target etcd.target Wants=network-online.target [Service] Type=simple User=postgres Group=postgres PermissionsStartOnly= true ExecStart= /usr/bin/patroni /usr/patroni/conf/patroni_postgresql .yml ExecReload= /bin/kill -HUP $MAINPID LimitNOFILE=65536 KillMode=process KillSignal=SIGINT Restart=on-abnormal RestartSec=30s TimeoutSec=0 [Install] WantedBy=multi-user.target |
禁止 postgresql 的自启动,通过 patroni 来管理 postgresql
1 2 3 4 5 6 7 8 9 | systemctl status patroni systemctl start patroni systemctl enable patroni systemctl status postgresql systemctl disable postgresql systemctl status etcd systemctl enable etcd |
如何手动切换Leader
执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml switchover
多执行patronictl -c /usr/patroni/conf/patroni_postgresql.yml list进行刷新,可以看到Leader由node2切换为node1
可以尝试模拟node节点断电,即手动停止Leader所在节点的patroni服务,此时查询集群状态,可以看到Leader自动切换到其他node节点上
目前可以看到Leader位于132节点上,
在132节点上停止patroni服务
再次查看集群状态,可以看到132被stop,Leader自动被转移到134上
参考:https://blog.csdn.net/ctypyb2002/article/details/81007990
https://www.cnblogs.com/ctypyb2002/p/9792939.html
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 如何编写易于单元测试的代码
· 10年+ .NET Coder 心语,封装的思维:从隐藏、稳定开始理解其本质意义
· .NET Core 中如何实现缓存的预热?
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 地球OL攻略 —— 某应届生求职总结
· 周边上新:园子的第一款马克杯温暖上架
· Open-Sora 2.0 重磅开源!
· 提示词工程——AI应用必不可少的技术
· .NET周刊【3月第1期 2025-03-02】