KingbaseES V8R6 Cluster Operations Case: sys_backup.sh init Fails with a 'pg_replslot' Link Error
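Case description:
On a cluster with one primary and one standby, running sys_backup.sh init failed with the error "link 'pg_replslot' destination ..." and the backup did not complete. The failure is shown in the figure below: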
Applicable version:
KingbaseES V8R6
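I. Problem Analysis
Listing the files in the data directory shows that pg_replslot is a symbolic link:
The file listing of a normal data directory is shown in the figure below: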
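The check described above can also be scripted. The following is a minimal sketch, assuming GNU `readlink -f` and `find -maxdepth` are available on the node; `find_internal_links` is a hypothetical helper name and not part of KingbaseES. It lists the symlinks directly under a data directory whose targets resolve to a location inside that same directory, which is the condition the backup error message complains about.

```shell
# List symlinks directly under a data directory whose resolved targets lie
# inside that same directory. find_internal_links is a hypothetical helper
# name used for illustration only.
find_internal_links() {
    local data_dir real_data link target
    data_dir="$1"
    # Resolve the data directory itself so the comparison uses real paths.
    real_data="$(readlink -f "$data_dir")" || return 1
    find "$real_data" -maxdepth 1 -type l | while IFS= read -r link; do
        target="$(readlink -f "$link")"
        case "$target" in
            # Only report links whose target is under the data directory.
            "$real_data"/*) printf '%s -> %s\n' "$(basename "$link")" "$target" ;;
        esac
    done
}
```

For example, `find_internal_links /home/kingbase/cluster/R6C8/HAC8/kingbase/data` uses the data path that appears in the backup log of this case. Links that point outside the data directory are not reported.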
II. Reproducing the Case
1. Create the pg_replslot link
# Create the pg_replslot link
[kingbase@node1 data]$ mkdir sys_replslot
[kingbase@node1 data]$ ln -s sys_replslot pg_replslot
[kingbase@node1 data]$ ls -lh
total 88K
drwx------ 9 kingbase kingbase 86 Oct 13 14:41 base
-rw------- 1 kingbase kingbase 46 Nov 20 17:17 current_logfiles
-rw-rw-r-- 1 kingbase kingbase 933 Oct 27 10:49 es_rep.conf
drwx------ 2 kingbase kingbase 4.0K Nov 20 17:18 global
........
lrwxrwxrwx 1 kingbase kingbase 12 Nov 20 17:54 pg_replslot -> sys_replslot
# Contents of pg_replslot
[kingbase@node1 data]$ ls -lh pg_replslot/
total 0
[kingbase@node1 data]$ cp -r sys_replslot.bk/repmgr_slot_2 pg_replslot/
[kingbase@node1 data]$ ls -lh pg_replslot/
total 0
drwx------ 2 kingbase kingbase 18 Nov 20 17:54 repmgr_slot_2
2. Run the backup initialization
[kingbase@node1 bin]$ ./sys_backup.sh init
# pre-condition: check the non-archived WAL files
# generate local sys_rman.conf...DONE
# update all node: sys_rman.conf and archive_command with sys_rman.archive-push...
# update all node: sys_rman.conf and archive_command with sys_rman.archive-push...DONE
# create stanza and check...(maybe 60+ seconds)
# create stanza and check...DONE
# initial first full backup...(maybe several minutes)
ERROR: full backup failed, check log file /home/kingbase/cluster/R6C8/HAC8/kingbase/log/sys_rman_backup.log
# Backup log:
2023-11-20 17:55:48.562 P00 INFO: backup command begin 2.27: --archive-copy --no-archive-statistics --archive-timeout=600 --band-width=0 --cmd-ssh=/home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_securecmd --compress-level=3 --compress-type=none --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=26829-8cc6447c --kb2-host=192.168.1.202 --kb2-host-user=kingbase --kb1-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/data --kb2-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/data --kb1-port=54321 --kb2-port=54321 --kb1-user=esrep --kb2-user=esrep --log-level-console=info --log-level-file=info --log-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/log --log-subprocess --non-archived-space=1024 --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --repo1-retention-full=5 --stanza=kingbase --start-fast --type=full
WARN: set process-max 4 is too large, auto set to CPU core count 1
2023-11-20 17:55:49.298 P00 INFO: Get pageCheckSum flag from ControlFile is 1
2023-11-20 17:55:49.401 P00 INFO: Check the non archvied WAL space under the setting 1024 MB
2023-11-20 17:55:49.401 P00 INFO: Non archived WAL files have 0 MB.
2023-11-20 17:55:49.401 P00 INFO: execute non-exclusive sys_start_backup(): backup begins after the requested immediate checkpoint completes
2023-11-20 17:55:49.725 P00 INFO: backup start archive = 000000050000000000000034, lsn = 0/34000028
2023-11-20 17:55:49.725 P00 INFO: check archive for prior segment 000000050000000000000033
ERROR: [070]: link 'pg_replslot' destination '/home/kingbase/cluster/R6C8/HAC8/kingbase/data/sys_replslot' is in KBDATA
2023-11-20 17:55:49.928 P00 INFO: backup command end: aborted with exception [070]
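The decisive line in the log above is the [070] error. As a small sketch, the link name and its destination can be pulled out of a backup log with `sed`; `link_errors_from_log` is a hypothetical helper name, and the pattern assumes the message wording is exactly as it appears in this log.

```shell
# Print "<link name> <destination>" for every "link ... is in KBDATA" error
# found in a backup log. link_errors_from_log is a hypothetical helper name;
# the pattern is taken from the message text shown in this case.
link_errors_from_log() {
    sed -n "s/.*link '\([^']*\)' destination '\([^']*\)' is in KBDATA.*/\1 \2/p" "$1"
}
```

In this case it would be called on the log file named by sys_backup.sh, for example `link_errors_from_log /home/kingbase/cluster/R6C8/HAC8/kingbase/log/sys_rman_backup.log`.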
The failure is reproduced, as shown in the figure below:
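III. Resolution
After the symbolic link was removed and the sys_replslot directory was used directly, the backup succeeded. In a production environment, first find out why the symlink was created and confirm that removing it will not affect the running database, and only then remove it.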
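Removing the link can be done defensively so that a real directory is never deleted by mistake. The following is a minimal sketch under the assumption that it has already been confirmed that nothing depends on the link; `remove_link_only` is a hypothetical helper name, not a KingbaseES command.

```shell
# Remove a path only if it is a symbolic link; a real directory or file is
# left untouched. remove_link_only is a hypothetical helper name.
remove_link_only() {
    local path
    path="$1"
    if [ -L "$path" ]; then
        # rm on a symlink removes the link itself, not the directory it points to.
        rm -- "$path"
        echo "removed symlink: $path"
    else
        echo "not a symlink, left untouched: $path" >&2
        return 1
    fi
}
```

In the reproduction above this would be `remove_link_only pg_replslot`, run from the data directory; the sys_replslot directory and its contents stay in place. After that, sys_backup.sh init can be run again.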