KingbaseES R6 cluster manual switchover case

1. Current cluster status


[kingbase@ECOLABAPP37 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+--------
 1  | node248 | standby |   running |          | default  | 100      | 2        | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node249 | primary | * running |          | default  | 100      | 2        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

2. Perform the manual switchover


[kingbase@ECOLABAPP37 ~]$ repmgr standby switchover --siblings-follow
bash: repmgr: command not found...
[kingbase@ECOLABAPP37 ~]$ cd cluster/R6HA/KHA/kingbase/bin
[kingbase@ECOLABAPP37 bin]$ ./repmgr standby switchover --siblings-follow
NOTICE: executing switchover on node "node248" (ID: 1)
ERROR: local node "node248" (ID: 1) is not a downstream of demotion candidate primary "node249" (ID: 2)
DETAIL: local node has no registered upstream node
HINT: execute "repmgr standby register --force" to update the local node's metadata
[kingbase@ECOLABAPP37 bin]$ ./repmgr standby switchover --siblings-follow --force
NOTICE: executing switchover on node "node248" (ID: 1)
ERROR: local node "node248" (ID: 1) is not a downstream of demotion candidate primary "node249" (ID: 2)
DETAIL: local node has no registered upstream node
HINT: execute "repmgr standby register --force" to update the local node's metadata

--siblings-follow means that all other standby nodes change their upstream to the new primary.

---- The switchover failed. The output above shows that the cluster is currently in an abnormal state.

3. Check the cluster status

[kingbase@ECOLABAPP37 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+--------
 1  | node248 | standby |   running |          | default  | 100      | 2        | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node249 | primary | * running |          | default  | 100      | 2        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

---- node248 is the standby. In a healthy cluster its Upstream field should show the primary, "node249", but the field is currently empty.
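The check described above (a standby whose Upstream column is empty) can be scripted. The following is a minimal sketch, assuming the `cluster show` table keeps the column order seen in this post (ID, Name, Role, Status, Upstream, ...). The function name `standbys_without_upstream` is hypothetical and is not part of repmgr.

```shell
#!/bin/sh
# Print the name of every standby whose Upstream column is empty.
# Reads the output of "repmgr cluster show" from stdin.
# Assumption: columns are ID | Name | Role | Status | Upstream | ...
standbys_without_upstream() {
  awk -F'|' '
    # Data rows are the ones whose first column is a numeric node ID;
    # the header row and the "----+----" separator are skipped.
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      name = $2; role = $3; upstream = $5
      gsub(/[ \t]/, "", name)
      gsub(/[ \t]/, "", role)
      gsub(/[ \t]/, "", upstream)
      if (role == "standby" && upstream == "") print name
    }
  '
}

# Usage (run from the bin directory, as in the sessions in this post):
#   ./repmgr cluster show | standbys_without_upstream
```

Any node name printed by this function is a candidate for `repmgr standby register --force`, as the HINT in the failed switchover suggests.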

4. Register the standby with the cluster

[kingbase@ECOLABAPP37 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node248" (ID: 1)
INFO: connecting to primary database
INFO: standby registration complete
NOTICE: standby node "node248" (ID: 1) successfully registered

Check the cluster status (the Upstream field of node248 is now "node249"):


[kingbase@ECOLABAPP37 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+-------
 1  | node248 | standby |   running | node249  | default  | 100      | 2        | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node249 | primary | * running |          | default  | 100      | 2        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

5. Perform the manual cluster switchover

[kingbase@ECOLABAPP37 bin]$ ./repmgr standby switchover --siblings-follow
NOTICE: executing switchover on node "node248" (ID: 1)
WARNING: option "--sibling-nodes" specified, but no sibling nodes exist
INFO: pausing repmgrd on node "node248" (ID 1)
INFO: pausing repmgrd on node "node249" (ID 2)
NOTICE: local node "node248" (ID: 1) will be promoted to primary; current primary "node249" (ID: 2) will be demoted to standby
NOTICE: stopping current primary node "node249" (ID: 2)
NOTICE: issuing CHECKPOINT
NOTICE: node (ID: 2) release the virtual ip 192.168.7.246/24 success
DETAIL: executing server command "/home/kingbase/cluster/R6HA/KHA/kingbase/bin/sys_ctl  -D '/home/kingbase/cluster/R6HA/KHA/kingbase/data' -l /home/kingbase/cluster/R6HA/KHA/kingbase/bin/logfile -W -m fast stop"
INFO: checking for primary shutdown; 1 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 2 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 3 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 4 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 5 of 60 attempts ("shutdown_check_timeout")
INFO: checking for primary shutdown; 6 of 60 attempts ("shutdown_check_timeout")
NOTICE: current primary has been cleanly shut down at location 1/65000028
NOTICE: PING 192.168.7.246 (192.168.7.246) 56(84) bytes of data.

--- 192.168.7.246 ping statistics ---
2 packets transmitted, 0 received, +1 errors, 100% packet loss, time 1002ms


WARNING: ping host"192.168.7.246" failed
DETAIL: average RTT value is not greater than zero
NOTICE: new primary node (ID: 1) acquire the virtual ip 192.168.7.246/24 success
NOTICE: promoting standby to primary
DETAIL: promoting server "node248" (ID: 1) using sys_promote()
NOTICE: waiting up to 60 seconds (parameter "promote_check_timeout") for promotion to complete
INFO: SET synchronous TO "async" on primary host 
NOTICE: STANDBY PROMOTE successful
DETAIL: server "node248" (ID: 1) was successfully promoted to primary
NOTICE: issuing CHECKPOINT
INFO: local node 2 can attach to rejoin target node 1
DETAIL: local node's recovery point: 1/65000028; rejoin target node's fork point: 1/650000A0
NOTICE: setting node 2's upstream to node 1
WARNING: unable to ping "host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3"
DETAIL: PQping() returned "PQPING_NO_RESPONSE"
NOTICE: begin to start server at 2021-03-01 11:42:21.425222
NOTICE: starting server using "/home/kingbase/cluster/R6HA/KHA/kingbase/bin/sys_ctl  -w -t 90 -D '/home/kingbase/cluster/R6HA/KHA/kingbase/data' -l /home/kingbase/cluster/R6HA/KHA/kingbase/bin/logfile start"
NOTICE: start server finish at 2021-03-01 11:42:21.835553
NOTICE: replication slot "repmgr_slot_1" deleted on node 2
NOTICE: NODE REJOIN successful
DETAIL: node 2 is now attached to node 1
NOTICE: switchover was successful
DETAIL: node "node248" is now primary and node "node249" is attached as standby
INFO: unpausing repmgrd on node "node248" (ID 1)
INFO: unpause node "node248" (ID 1) successfully
INFO: unpausing repmgrd on node "node249" (ID 2)
INFO: unpause node "node249" (ID 2) successfully
NOTICE: STANDBY SWITCHOVER has completed successfully
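
In the log above, the old primary's recovery point (1/65000028) lies before the new primary's fork point (1/650000A0), and node 2 is reported as able to attach to node 1. As an illustration only, and assuming the rule is that the recovery point must not be ahead of the fork point, the comparison can be sketched as follows. An LSN written as HI/LO is two hexadecimal numbers, with HI as the upper 32 bits. The function names `lsn_to_int` and `can_attach` are hypothetical and not part of repmgr.

```shell
#!/bin/sh
# Convert an LSN written as "HI/LO" (both hexadecimal) to a 64-bit integer.
lsn_to_int() {
  hi=${1%%/*}   # part before the slash: upper 32 bits
  lo=${1##*/}   # part after the slash: lower 32 bits
  echo $(( (0x$hi << 32) + 0x$lo ))
}

# Succeed when the recovery point ($1) is at or before the fork point ($2).
# Assumption: this mirrors the condition behind "can attach to rejoin target".
can_attach() {
  [ "$(lsn_to_int "$1")" -le "$(lsn_to_int "$2")" ]
}

# Values taken from the log above.
if can_attach 1/65000028 1/650000A0; then
  echo "recovery point is not ahead of the fork point"
fi
```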

6. Check the cluster status (switchover succeeded)


[kingbase@ECOLABAPP37 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+
 1  | node248 | primary | * running |          | default  | 100      | 3        | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node249 | standby |   running | node248  | default  | 100      | 2        | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
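
The final state can also be checked mechanically. Below is a minimal sketch, assuming the same `cluster show` column layout as in this post: it succeeds only when the expected node is the single primary and every standby lists it as upstream. The function name `switchover_ok` is hypothetical and not part of repmgr.

```shell
#!/bin/sh
# Succeed when node $1 is the only primary and every standby has it as upstream.
# Reads the output of "repmgr cluster show" from stdin.
# Assumption: columns are ID | Name | Role | Status | Upstream | ...
switchover_ok() {
  awk -F'|' -v expected="$1" '
    function trim(s) { gsub(/^[ \t]+|[ \t]+$/, "", s); return s }
    # Only rows that start with a numeric node ID are data rows.
    $1 ~ /^[ \t]*[0-9]+[ \t]*$/ {
      name = trim($2); role = trim($3); upstream = trim($5)
      if (role == "primary") {
        primaries++
        if (name != expected) bad = 1
      } else if (role == "standby" && upstream != expected) {
        bad = 1
      }
    }
    END { exit (primaries == 1 && !bad) ? 0 : 1 }
  '
}

# Usage (run from the bin directory, as in the sessions in this post):
#   ./repmgr cluster show | switchover_ok node248 && echo "switchover verified"
```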

posted @ 2021-06-30 19:42  天涯客1224