KingbaseES V8R6 Cluster Operations Case: Archiving Fails on the New Primary After Failover

Case overview:
After a primary/standby failover, archiving fails on the new primary.
Database version:
KingbaseES V008R006C008B0014

I. Symptom
The sys_log of the new primary contains the following archive failure message:
2024-01-31 11:22:22.653 CST,,,7347,,65b9b987.1cb3,59,,2024-01-31 11:07:51 CST,,0,LOG,00000,"archive command failed with exit code 103","The failed archive command was: export TZ=Asia/Shanghai;/home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_rman --config /home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase archive-push sys_wal/00000012.history",,,,,,,,""
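To check whether the failure is a one-off or recurring, the log can be filtered for this message. The following is a minimal sketch; the function names are hypothetical, and the log file path has to be supplied by the caller because the sys_log location depends on the installation.

```shell
# Print every archive failure recorded in a sys_log file.
# The search string is copied from the log entry shown above.
archive_failures() {
    local logfile="$1"
    # -F matches the message as a literal string, not as a regex.
    grep -F 'archive command failed' "$logfile"
}

# Print the distinct exit codes of the failed archive commands.
archive_failure_codes() {
    local logfile="$1"
    grep -oE 'archive command failed with exit code [0-9]+' "$logfile" \
        | awk '{ print $NF }' \
        | sort -u
}
```

For example, `archive_failure_codes <path to a sys_log file>` should print `103` for the entry above, which matches the error code sys_rman reports when it cannot find a valid repository.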

II. Analysis
1. Check sys_rman.conf on the new primary
As shown below, the backup repo-host is still the original primary (192.168.1.201):

[kingbase@node202 sys_log]$ cat /home/kingbase/kbbr_repo/sys_rman.conf
# Genarate by script at 20240119005235, should not change manually
[kingbase]
kb1-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/data
[global]
repo1-host=192.168.1.201
repo1-host-user=kingbase
repo1-host-config=/home/kingbase/kbbr_repo/sys_rman.conf
repo1-path=/home/kingbase/kbbr_repo
archive-statistics=n
log-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/log
log-level-file=info
log-level-console=info
log-subprocess=y
#### support: gz none
compress-type=none
compress-level=3
band-width=0
archive-timeout=660
link-all=y
cmd-ssh=/home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_securecmd
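When several nodes need to be compared, it can help to pull a single setting out of the file instead of reading it in full. The sketch below assumes the simple `key=value` layout shown above; `conf_value` is a hypothetical helper, not part of the KingbaseES tooling.

```shell
# Print the value of one key from a "key=value" style config file.
# Only an exact key match counts, so "repo1-host" does not match
# "repo1-host-user" or "repo1-host-config".
conf_value() {
    local key="$1" conf="$2"
    awk -F= -v k="$key" '
        $1 == k {
            # Print everything after "key=", even if the value contains "=".
            print substr($0, length(k) + 2)
            exit
        }' "$conf"
}
```

Usage: `conf_value repo1-host /home/kingbase/kbbr_repo/sys_rman.conf` prints the host that archive-push will try to reach.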

2. Check the backup type
The cluster uses cluster-mode backup:

[kingbase@node201 bin]$ cat sys_backup.conf|grep -i _target_db
_target_db_style="cluster"

3. Run the archive command manually

[kingbase@node202 data]$ /home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_rman --config /home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase archive-push sys_wal/00000012.history
2024-01-31 11:25:40.226 P00   INFO: archive-push command begin 2.27: [sys_wal/00000012.history] --no-archive-statistics --archive-timeout=660 --band-width=0 --cmd-ssh=/home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_securecmd --compress-level=3 --compress-type=none --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=16023-6b7b3c24 --kb1-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/data --log-level-console=info --log-level-file=info --log-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/log --log-subprocess --repo1-host=192.168.1.201 --repo1-host-config=/home/kingbase/kbbr_repo/sys_rman.conf --repo1-host-user=kingbase --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
ERROR: [103]: unable to find a valid repository:
       repo1: [unknown_err] remote-0 process on '192.168.1.201' terminated unexpectedly [255]: ES: connect to host 192.168.1.201 port 8890: Connection refused
2024-01-31 11:25:40.229 P00   INFO: archive-push command end: aborted with exception [103]

As the error above shows, the connection to securecmdd on the original primary (192.168.1.201, port 8890) was refused.

4. Test the securecmd connection to the original primary
As shown below, the securecmd connection to the original primary fails:

[kingbase@node202 bin]$ ./sys_securecmd kingbase@192.168.1.201
ES: connect to host 192.168.1.201 port 8890: Connection refused
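A plain TCP probe can separate "nothing is listening on the port" from problems inside sys_securecmd itself (keys, users, configuration). The sketch below is a generic check that relies on bash's `/dev/tcp` feature and the coreutils `timeout` command; `port_open` is a hypothetical helper, and port 8890 is taken from the error message above. It only tests whether the port accepts connections, not whether securecmdd authenticates correctly.

```shell
# Return 0 if a TCP connection to host:port succeeds within 3 seconds,
# non-zero otherwise (refused, unreachable, or timed out).
port_open() {
    local host="$1" port="$2"
    # bash opens a TCP socket when redirecting to /dev/tcp/<host>/<port>.
    # The inner bash exits right away, which closes the socket again.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$1/$2"' _ "$host" "$port" 2>/dev/null
}
```

Usage: `port_open 192.168.1.201 8890 && echo reachable || echo "not reachable"`.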

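5. Check the securecmdd service on the original primary
As shown below, the securecmdd service on the original primary was not running, so it was enabled and started: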

[kingbase@node201 bin]$ ps -ef |grep securecmd
kingbase  6822  3948  0 11:26 pts/0    00:00:00 grep --color=auto securecmd

[root@node201 ~]# systemctl enable securecmdd
Created symlink from /etc/systemd/system/multi-user.target.wants/securecmdd.service to /etc/systemd/system/securecmdd.service.
[root@node201 ~]# systemctl start securecmdd
[root@node201 ~]# systemctl status securecmdd
● securecmdd.service - KingbaseES - sys_securecmdd daemon
   Loaded: loaded (/etc/systemd/system/securecmdd.service; enabled; vendor preset: disabled)
   Active: active (running) since Wed 2024-01-31 11:27:04 CST; 5s ago
 Main PID: 7053 (sys_securecmdd)
   CGroup: /system.slice/securecmdd.service
           └─7053 sys_securecmdd: /home/kingbase/cluster/securecmdd/bin/sys_securecmdd -f /etc/.kes/sec...

Jan 31 11:27:04 node201 systemd[1]: Started KingbaseES - sys_securecmdd daemon.

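6. Run the archive command manually again
As shown below, the manual archive command must be run from the data directory; running it from the bin directory fails with error [032]: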

[kingbase@node202 bin]$ /home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_rman --config /home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase archive-push sys_wal/00000012.history
2024-01-31 11:27:21.594 P00   INFO: archive-push command begin 2.27: [sys_wal/00000012.history] --no-archive-statistics --archive-timeout=660 --band-width=0 --cmd-ssh=/home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_securecmd --compress-level=3 --compress-type=none --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=16510-89811805 --kb1-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/data --log-level-console=info --log-level-file=info --log-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/log --log-subprocess --repo1-host=192.168.1.201 --repo1-host-config=/home/kingbase/kbbr_repo/sys_rman.conf --repo1-host-user=kingbase --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
ERROR: [032]: Kingbase working directory '/home/kingbase/cluster/R6C8/HAC8/kingbase/bin' is not the same as option kb1-path '/home/kingbase/cluster/R6C8/HAC8/kingbase/data'
       HINT: is the Kingbase data_directory configured the same as the kb1-path option?
2024-01-31 11:27:21.594 P00   INFO: archive-push command end: aborted with exception [032]

Run from the data directory:
[kingbase@node202 data]$ /home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_rman --config /home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase archive-push sys_wal/00000012.history
2024-01-31 11:27:46.754 P00   INFO: archive-push command begin 2.27: [sys_wal/00000012.history] --no-archive-statistics --archive-timeout=660 --band-width=0 --cmd-ssh=/home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_securecmd --compress-level=3 --compress-type=none --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=16585-06d4bec5 --kb1-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/data --log-level-console=info --log-level-file=info --log-path=/home/kingbase/cluster/R6C8/HAC8/kingbase/log --log-subprocess --repo1-host=192.168.1.201 --repo1-host-config=/home/kingbase/kbbr_repo/sys_rman.conf --repo1-host-user=kingbase --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2024-01-31 11:27:47.105 P00   INFO: pushed WAL file '00000012.history' to the archive
2024-01-31 11:27:47.205 P00   INFO: archive-push command end: completed successfully (455ms)
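Because archive-push checks the current working directory against kb1-path, a small wrapper that changes directory first avoids the [032] error during manual tests. This is a sketch only: `push_from_data_dir` is a hypothetical helper, and all paths are passed in by the caller rather than assumed.

```shell
# Run archive-push from inside the data directory.
# Arguments: <sys_rman binary> <config file> <stanza> <data dir> <WAL path relative to data dir>
push_from_data_dir() {
    local rman_bin="$1" conf="$2" stanza="$3" data_dir="$4" wal="$5"
    # The subshell keeps the caller's working directory unchanged.
    (
        cd "$data_dir" || exit 1
        "$rman_bin" --config "$conf" --stanza="$stanza" archive-push "$wal"
    )
}
```

With the values from this case, the call would be `push_from_data_dir /home/kingbase/cluster/R6C8/HAC8/kingbase/bin/sys_rman /home/kingbase/kbbr_repo/sys_rman.conf kingbase /home/kingbase/cluster/R6C8/HAC8/kingbase/data sys_wal/00000012.history`.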

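III. Summary
When backup has been deployed with sys_backup.sh in 'cluster' mode on the cluster's original primary, it does not need to be redeployed after a primary/standby failover: backups and archives taken by the database are still stored on the original repo-host. For that reason the securecmdd service on the original primary must be running after the failover; otherwise archive-push on the new primary cannot reach the repository and fails with error 103, as in this case.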

posted @ 天涯客1224