KingbaseES V8R6集群运维案例---删除test库测试
案例说明:
KingbaseES V8R6默认初始化实例后,创建test库,用于测试。对于有的生产环境,为安全考虑,需要删除test库;本案例测试了在删除test库后,对集群管理的影响。
适用版本:
KingbaseES V8R6
一、集群节点状态
[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 6 | | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | standby | running | node1 | default | 100 | 6 | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
二、删除集群test库
1、集群启动后删除test库
[kingbase@node101 bin]$ ./ksql -U system prod
ksql (V8.0)
Type "help" for help.
prod=# drop database test;
ERROR: database "test" is being accessed by other users
DETAIL: There is 1 other session using the database.
如下图所示,有session访问test库,无法删除:
2、查看访问test库的session
prod=# select * from sys_stat_activity where datname='test';
-[ RECORD 1 ]----+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
datid | 12145
datname | test
pid | 4653
usesysid | 10
usename | system
application_name |
client_addr | ::1
client_hostname |
client_port | 44336
backend_start | 2023-04-24 16:33:40.039128+08
xact_start |
query_start | 2023-04-24 16:33:40.082766+08
state_change | 2023-04-24 16:33:40.083743+08
wait_event_type | Client
wait_event | ClientRead
state | idle
backend_xid |
backend_xmin |
query |SELECT *,
|(case pg_is_in_recovery() when 't' then null else pg_current_wal_lsn() end) AS pg_current_wal_lsn,
|(case pg_is_in_recovery() when 't' then null else pg_wal_lsn_diff(pg_current_wal_lsn(), pg_lsn('0/0'))::float end) AS pg_current_wal_lsn_bytes,+
| (case pg_is_in_recovery() when 't' then null else pg_wal_lsn_diff(pg_current_wal_lsn(), replay_lsn)::float end) AS pg_wal_lsn_diff
| FROM pg_stat_replication
backend_type | client backend
---如上所示,主库节点通过test库访问sys_stat_replication视图。
3、关闭集群启动主库节点数据库服务
1)修改参数synchronous_commit
[kingbase@node101 bin]$ cat /data/kingbase/hac7/data/es_rep.conf |grep -i commit
#synchronous_commit = remote_apply
synchronous_commit = local
2)启动主库节点数据库服务
3)删除主库实例test库
prod=# drop database test;
DROP DATABASE
4)修改主库synchronous_commit = remote_apply
5)启动集群后查看备库test库也已经删除。
三、集群管理测试
1、集群关闭和启动测试
# 执行集群重启
[kingbase@node101 bin]$ ./sys_monitor.sh restart
# 集群启动后节点状态正常
[kingbase@node101 bin]$ ./repmgr cluster show
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 6 | | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | standby | running | node1 | default | 100 | 6 | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2、备库recovery测试
1)配置主备库参数recovery='standby'
2)关闭备库数据库服务
3)查看主库hamgr.log日志
如下图所示,对standby库的recovery成功:
3、集群failover测试
1)配置集群参数recovery='automatic'
2)关闭主库数据库服务
3)查看failover日志信息
4)切换后集群节点状态
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | standby | running | node2 | default | 100 | 6 | 0 bytes | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | primary | * running | | default | 100 | 7 | | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
4、主备switchover测试
# 备库执行switchover
[kingbase@node101 bin]$ ./repmgr standby switchover -h 192.168.1.102 -U esrep -d esrep
........
ID | Name | Role | Status | Upstream | Location | Priority | Timeline | LSN_Lag | Connection string
----+-------+---------+-----------+----------+----------+----------+----------+---------+---------------------------------------------------------------------------------------------------------------------------------------------------
1 | node1 | primary | * running | | default | 100 | 8 | | host=192.168.1.101 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
2 | node2 | standby | running | node1 | default | 100 | 7 | 0 bytes | host=192.168.1.102 user=esrep dbname=esrep port=54321 connect_timeout=10 keepalives=1 keepalives_idle=10 keepalives_interval=1 keepalives_count=3
---如上所示,swithover后集群状态正常。
5、数据库日志错误信息
四、执行物理备份测试
Tips:
执行sys_backup.sh做物理备份。
如下图所示,在执行sys_backup.sh init时连接test库失败。
---具体解决方案可以参考:https://www.cnblogs.com/kingbase/p/16460379.html
五、总结
KingbaseES V8R6集群的管理一般由esrep用户访问esrep库的元数据来完成,在生产环境为安全起见可以删除test库,但需要严格的测试后再执行。
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 全程不用写代码,我用AI程序员写了一个飞机大战
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· 记一次.NET内存居高不下排查解决与启示
· 白话解读 Dapr 1.15:你的「微服务管家」又秀新绝活了
· DeepSeek 开源周回顾「GitHub 热点速览」