KingbaseES V8R6集群运维案例之---通过备库clone在线添加新节点

案例说明:
KingbaseES V8R6集群可以通过图形化方式在线添加新节点,但是在添加新节点clone环节时,是从主库copy数据到新的节点,这样在生产环境,如果数据量大,将会对主库的网络I/O造成压力。可以通过‘repmgr standby clone’指定已有的备库进行克隆,减轻主库压力。

如下图所示:图形化添加新节点默认从主库clone数据

适用版本:

KingbaseES V8R6

集群架构:

[kingbase@node1 bin]$ cat /etc/hosts
......
192.168.7.248   node1          #主库节点
192.168.7.249   node2          #新增备库节点
192.168.7.243   node3          #备库节点


[kingbase@node1 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+---------------
 1  | node243 | standby |   running | node248  | default  | 100      | 12       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node248 | primary | * running |          | default  | 100      | 12       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

一、新节点准备工作

=新节点的系统配置的准备,可以参考集群部署的要求。建立和集群其他节点相同的集群管理路径,并从其他节点上传相关的目录和文件(除了data目录)到新节点=

[kingbase@node2 R6C5R]$ pwd
/home/kingbase/cluster/R6C5/R6C5R
[kingbase@node2 R6C5R]$ ls -lh
total 145M
-rw-rw-r--. 1 kingbase kingbase   62 Mar 22 13:24 control.so
-rwxr-xr-x. 1 kingbase root     145M Mar 22 13:20 db.zip
drwxrwxr-x. 9 kingbase kingbase   99 Mar 22 13:25 kingbase
-rwxrwxrwx. 1 kingbase kingbase 3.4K Mar 22 13:24 license.dat
[kingbase@node2 R6C5R]$ cd kingbase/
[kingbase@node2 kingbase]$ ls -lh
total 32K
drwxrwxr-x.  2 kingbase kingbase    6 Mar 22 13:24 archive
drwxr-xr-x.  2 kingbase kingbase 4.0K Mar 22 13:24 bin
-rw-------.  1 kingbase kingbase 8.0K Mar 22 16:25 copy_file
drwxrwxr-x.  2 kingbase kingbase  101 Mar 22 13:25 etc
drwxrwxr-x.  5 kingbase kingbase 8.0K Nov  5 17:20 lib
drwxr-xr-x.  2 kingbase kingbase   37 Mar 22 13:25 log
drwxrwxr-x.  8 kingbase kingbase 4.0K Mar 22 13:25 share

二、在新节点执行在线clone

1、首先在主库上建立新备库的replication slot

test=# select sys_create_physical_replication_slot('repmgr_slot_3');
 sys_create_physical_replication_slot 
--------------------------------------
 (repmgr_slot_3,)
(1 row)

test=# select * from sys_replication_slots;              
   slot_name   | plugin | slot_type | datoid | database | temporary | active | active_pid | xmin  | catalog_xmin | restart_lsn | confirmed_flush_
lsn 
---------------+--------+-----------+--------+----------+-----------+--------+------------+-------+--------------+-------------+-----------------
----
 repmgr_slot_1 |        | physical  |        |          | f         | t      |       8229 | 68604 |              | 0/44004000  | 
 repmgr_slot_3 |        | physical  |        |          | f         | f      |            |       |              | 0/44003100  | 
(2 rows)

2、在新节点执行clone预演

[kingbase@node2 kingbase]$ ./repmgr standby clone -h 192.168.7.243 -U esrep -d esrep --dry-run
-bash: ./repmgr: No such file or directory
[kingbase@node2 kingbase]$ cd bin
[kingbase@node2 bin]$ ./repmgr standby clone -h 192.168.7.243 -U esrep -d esrep --dry-run
NOTICE: destination directory "/home/kingbase/cluster/R6C5/R6C5R/kingbase/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.7.243 user=esrep dbname=esrep
DETAIL: current installation size is 108 MB
INFO: "repmgr" extension is installed in database "esrep"
INFO: parameter "max_replication_slots" set to 32
INFO: parameter "max_wal_senders" set to 32
NOTICE: checking for available walsenders on the source node (2 required)
INFO: sufficient walsenders available on the source node
DETAIL: 2 required, 32 available
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: required number of replication connections could be made to the source server
DETAIL: 2 replication connections required
NOTICE: standby will attach to upstream node 2
HINT: consider using the -c/--fast-checkpoint option
INFO: all prerequisites for "standby clone" are met

3、在新节点执行clone

[kingbase@node2 bin]$ ./repmgr standby clone -h 192.168.7.243 -U esrep -d esrep 
# 注:-h ,指定已有的备库的ip地址。

NOTICE: destination directory "/home/kingbase/cluster/R6C5/R6C5R/kingbase/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.7.243 user=esrep dbname=esrep
DETAIL: current installation size is 108 MB
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: creating directory "/home/kingbase/cluster/R6C5/R6C5R/kingbase/data"...
NOTICE: starting backup (using sys_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  /home/kingbase/cluster/R6C5/R6C5R/kingbase/bin/sys_basebackup -l "repmgr base backup"  -D /home/kingbase/cluster/R6C5/R6C5R/kingbase/data -h 192.168.7.243 -p 54321 -U esrep -X stream -S repmgr_slot_3 
NOTICE: standby clone (using sys_basebackup) complete
NOTICE: you can now start your Kingbase server
HINT: for example: sys_ctl -D /home/kingbase/cluster/R6C5/R6C5R/kingbase/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"

4、启动新节点数据库服务

[kingbase@node2 bin]$ ./sys_ctl -D /home/kingbase/cluster/R6C5/R6C5R/kingbase/data start
waiting for server to start....2022-03-22 12:14:15.746 CST [3450] LOG:  sysaudit extension initialized
.......
server started

5、注册新备库到集群

[kingbase@node2 bin]$ ./repmgr standby register --force
INFO: connecting to local node "node249" (ID: 3)
INFO: connecting to primary database
WARNING: --upstream-node-id not supplied, assuming upstream node is primary (node ID 2)
INFO: standby registration complete
NOTICE: standby node "node249" (ID: 3) successfully registered

6、查看集群节点状态

[kingbase@node2 bin]$ ./repmgr cluster show
 ID | Name    | Role    | Status    | Upstream | Location | Priority | Timeline | Connection string                                                                                                                                
----+---------+---------+-----------+----------+----------+----------+----------+---------------------------------------------------------------------------------------------------------------------------------------------------
 1  | node243 | standby |   running | node248  | default  | 100      | 12       | host=192.168.7.243 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 2  | node248 | primary | * running |          | default  | 100      | 12       | host=192.168.7.248 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
 3  | node249 | standby |   running | node248  | default  | 100      | 12       | host=192.168.7.249 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3

=由以上获知,新的备库节点添加成功。=

三、备库clone错误案例

=在从指定备库clone前,如果未创建replication slot 出现以下错误=

错误日志:

[kingbase@node2 bin]$ ./repmgr standby clone -h 192.168.7.243 -U esrep -d esrep 
NOTICE: destination directory "/home/kingbase/cluster/R6C5/R6C5R/kingbase/data" provided
INFO: connecting to source node
DETAIL: connection string is: host=192.168.7.243 user=esrep dbname=esrep
DETAIL: current installation size is 108 MB
NOTICE: checking for available walsenders on the source node (2 required)
NOTICE: checking replication connections can be made to the source server (2 required)
INFO: creating directory "/home/kingbase/cluster/R6C5/R6C5R/kingbase/data"...
NOTICE: starting backup (using sys_basebackup)...
HINT: this may take some time; consider using the -c/--fast-checkpoint option
INFO: executing:
  /home/kingbase/cluster/R6C5/R6C5R/kingbase/bin/sys_basebackup -l "repmgr base backup"  -D /home/kingbase/cluster/R6C5/R6C5R/kingbase/data -h 192.168.7.243 -p 54321 -U esrep -X stream -S repmgr_slot_3 
NOTICE: creating replication slot "repmgr_slot_3" on upstream node 2
ERROR: drop_replication_slot(): unable to drop replication slot "repmgr_slot_3"
DETAIL: 
ERROR:  replication slot "repmgr_slot_3" does not exist

DETAIL: query text is:
SELECT pg_catalog.pg_drop_replication_slot('repmgr_slot_3')
ERROR: unable to delete replication slot "repmgr_slot_3" on source node
NOTICE: standby clone (using sys_basebackup) complete
NOTICE: you can now start your Kingbase server
HINT: for example: sys_ctl -D /home/kingbase/cluster/R6C5/R6C5R/kingbase/data start
HINT: after starting the server, you need to register this standby with "repmgr standby register"

=解决方案: 从主库提前创建备库复制槽,复制槽名称查看备库的kingbase.auto.conf文件。=

posted @ 2022-03-24 12:02  天涯客1224  阅读(160)  评论(0编辑  收藏  举报