KingbaseES V8R6 Database Operations Case: sys_backup.sh init Failure on a Single Instance

Case description:
In a KingbaseES V8R6 single-instance environment, running sys_backup.sh init fails with the error "unable to find primary cluster - cannot proceed", as shown in the figure below:

Applicable version:
KingbaseES V8R6

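Reproducing the case
1. Configure sys_backup.conf
As shown in the figure below, the backup user is set to an ordinary (non-superuser) database user: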

2. Run the backup initialization
[kingbase@node201 bin]$ sh -x sys_backup.sh init

......
+ '[' Xonly_sys_rman_conf == X ']'
+ '[' X0 '!=' X0 ']'
+ /bin/echo '# update single archive_command with sys_rman.archive-push...DONE'
# update single archive_command with sys_rman.archive-push...DONE
+ '[' X0 == X0 ']'
+ '[' Xcluster_standby_step1 '!=' X ']'
+ '[' Xroot == Xkingbase ']'
+ /bin/sed -i -e 's/archive_command.*/archive_command='\''export TZ=Asia\/Shanghai;\/opt\/Kingbase\/ES\/R6_24\/Server\/bin\/sys_rman --config \/home\/kingbase\/kbbr_repo\/sys_rman.conf --stanza=kingbase archive-push %p'\''/' /home/kingbase/db/r6_24/data/kingbase.conf
+ /bin/sed -i -e 's/archive_command.*/archive_command='\''export TZ=Asia\/Shanghai;\/opt\/Kingbase\/ES\/R6_24\/Server\/bin\/sys_rman --config \/home\/kingbase\/kbbr_repo\/sys_rman.conf --stanza=kingbase archive-push %p'\''/' /home/kingbase/db/r6_24/data/es_rep.conf
+ /opt/Kingbase/ES/R6_24/Server/bin/sys_ctl -D /home/kingbase/db/r6_24/data reload
+ /bin/echo '# create stanza and check...(maybe 60+ seconds)'
# create stanza and check...(maybe 60+ seconds)
+ /bin/rm -rf /home/kingbase/kbbr_repo/archive
+ /bin/rm -rf /home/kingbase/kbbr_repo/backup
+ /opt/Kingbase/ES/R6_24/Server/bin/sys_rman --config=/home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase --log-level-console=info stanza-create
+ '[' X0 '!=' X56 ']'
+ /bin/echo 'ERROR: create stanza failed, check log file /opt/Kingbase/ES/R6_24/Server/log/sys_rman_stanza-create.log'
ERROR: create stanza failed, check log file /opt/Kingbase/ES/R6_24/Server/log/sys_rman_stanza-create.log
+ exit 2

---As shown above, the initialization failed at the stanza-create step.

Check the backup log:

[kingbase@node201 bin]$ tail -100 /opt/Kingbase/ES/R6_24/Server/log/sys_rman_stanza-create.log
2023-08-30 17:48:18.842 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=31837-eec3829a --log-level-console=info --log-level-file=info --log-path=/opt/Kingbase/ES/R6_24/Server/log --log-subprocess --kb1-path=/home/kingbase/db/r6_24/data --kb1-port=54322 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
WARN: unable to check kb-1: [DbConnectError] unable to connect to 'dbname='test' port=54322 user='system'': could not connect to server: No such file or directory
        Is the server running locally and accepting
        connections on Unix domain socket "/tmp/.s.KINGBASE.54322"?
ERROR: [056]: unable to find primary cluster - cannot proceed
2023-08-30 17:48:18.842 P00   INFO: stanza-create command end: aborted with exception [056]

2023-08-30 17:50:15.134 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=32037-87a2a1fd --log-level-console=info --log-level-file=info --log-path=/opt/Kingbase/ES/R6_24/Server/log --log-subprocess --kb1-path=/home/kingbase/db/r6_24/data --kb1-port=54322 --kb1-user=system --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-08-30 17:50:15.350 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-08-30 17:50:15.356 P00   INFO: stanza-create command end: completed successfully (224ms)
2023-08-31 14:07:04.974 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=9877-5cac3fb3 --log-level-console=info --log-level-file=info --log-path=/opt/Kingbase/ES/R6_24/Server/log --log-subprocess --kb1-path=/home/kingbase/db/r6_24/data --kb1-port=54322 --kb1-user=admin --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
2023-08-31 14:07:05.184 P00   INFO: stanza-create for stanza 'kingbase' on repo1
2023-08-31 14:07:05.199 P00   INFO: stanza-create command end: completed successfully (228ms)
2023-08-31 14:11:05.316 P00   INFO: stanza-create command begin 2.27: --band-width=0 --config=/home/kingbase/kbbr_repo/sys_rman.conf --exec-id=10128-7bcf817b --log-level-console=info --log-level-file=info --log-path=/opt/Kingbase/ES/R6_24/Server/log --log-subprocess --kb1-path=/home/kingbase/db/r6_24/data --kb1-port=54322 --kb1-user=tom --process-max=4 --repo1-path=/home/kingbase/kbbr_repo --stanza=kingbase
WARN: unable to check kb-1: [DbQueryError] unable to select some rows from sys_settings
      HINT: is the backup running as the kingbase user?
      HINT: is the sys_read_all_settings role assigned for Kingbase >= 10?
ERROR: [056]: unable to find primary cluster - cannot proceed
2023-08-31 14:11:05.415 P00   INFO: stanza-create command end: aborted with exception [056]
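
The log above contains two different causes behind the same [056] error: a DbConnectError (the server could not be reached on the socket) and a DbQueryError (the connection worked, but the settings could not be read). As a rough aid for telling them apart, the following sketch reads a stanza-create log on standard input and reports the outcome of the most recent run. The function name classify_rman_failure and its output labels are hypothetical and not part of sys_rman; the patterns are taken only from the log lines shown in this post.

```shell
#!/bin/sh
# Sketch: classify the most recent stanza-create run in a sys_rman log.
# Reads the log on stdin and prints one of:
#   connect   - last run failed after a [DbConnectError] warning
#   query     - last run failed after a [DbQueryError] warning
#   unknown   - last run failed with [056] but no recognized warning
#   ok        - last run completed successfully
#   no-result - no finished run was found in the input
classify_rman_failure() {
    awk '
        /\[DbConnectError\]/     { cause = "connect" }
        /\[DbQueryError\]/       { cause = "query" }
        /ERROR: \[056\]/         { last = (cause == "" ? "unknown" : cause); cause = "" }
        /completed successfully/ { last = "ok"; cause = "" }
        END { print (last == "" ? "no-result" : last) }
    '
}
```

It could be used as `tail -100 /opt/Kingbase/ES/R6_24/Server/log/sys_rman_stanza-create.log | classify_rman_failure`. For the excerpt above, the last run (user tom) ends with a DbQueryError, so the result would be `query`.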

As shown in the figure below:

3. Fault analysis
Re-run the failing command from the log with debug output enabled (--log-level-console=debug):
[kingbase@node201 bin]$ /opt/Kingbase/ES/R6_24/Server/bin/sys_rman --config=/home/kingbase/kbbr_repo/sys_rman.conf --stanza=kingbase --log-level-console=debug stanza-create

.......
2023-08-31 14:35:09.343 P00  DEBUG:     db/db::dbQueryRow: (this: {client: {host: null, port: 54322, database: {"test"}, user: {"tom"}, queryTimeout 1800000}, remoteClient: null}, query: {"select (select setting from pg_catalog.pg_settings where name = 'server_version_num')::int4, (select setting from pg_catalog.pg_settings where name = 'data_directory')::text, (select setting from pg_catalog.pg_settings where name = 'archive_mode')::text, (select ' '||setting from pg_catalog.pg_settings where name = 'archive_command')::text, (select setting from pg_catalog.pg_settings where name = 'checkpoint_timeout')::int4"})
2023-08-31 14:35:09.343 P00  DEBUG:     db/db::dbQuery: (this: {client: {host: null, port: 54322, database: {"test"}, user: {"tom"}, queryTimeout 1800000}, remoteClient: null}, query: {"select (select setting from pg_catalog.pg_settings where name = 'server_version_num')::int4, (select setting from pg_catalog.pg_settings where name = 'data_directory')::text, (select setting from pg_catalog.pg_settings where name = 'archive_mode')::text, (select ' '||setting from pg_catalog.pg_settings where name = 'archive_command')::text, (select setting from pg_catalog.pg_settings where name = 'checkpoint_timeout')::int4"})
2023-08-31 14:35:09.343 P00  DEBUG:     postgres/client::pgClientQuery: (this: {host: null, port: 54322, database: {"test"}, user: {"tom"}, queryTimeout 1800000}, query: {"select (select setting from pg_catalog.pg_settings where name = 'server_version_num')::int4, (select setting from pg_catalog.pg_settings where name = 'data_directory')::text, (select setting from pg_catalog.pg_settings where name = 'archive_mode')::text, (select ' '||setting from pg_catalog.pg_settings where name = 'archive_command')::text, (select setting from pg_catalog.pg_settings where name = 'checkpoint_timeout')::int4"})
2023-08-31 14:35:09.359 P00  DEBUG:     postgres/client::pgClientQuery: => {VariantList}
2023-08-31 14:35:09.359 P00  DEBUG:     db/db::dbQuery: => {VariantList}
2023-08-31 14:35:09.359 P00  DEBUG:     db/db::dbQueryRow: => {VariantList}
WARN: unable to check kb-1: [DbQueryError] unable to select some rows from sys_settings
      HINT: is the backup running as the kingbase user?
      HINT: is the sys_read_all_settings role assigned for Kingbase >= 10?
2023-08-31 14:35:09.359 P00  DEBUG:     common/exit::exitSafe: (result: 0, error: true, signalType: 0)
ERROR: [056]: unable to find primary cluster - cannot proceed
2023-08-31 14:35:09.359 P00   INFO: stanza-create command end: aborted with exception [056]
2023-08-31 14:35:09.359 P00  DEBUG:     common/lock::lockRelease: (failOnNoLock: false)
2023-08-31 14:35:09.359 P00  DEBUG:     common/lock::lockRelease: => true
2023-08-31 14:35:09.359 P00  DEBUG:     common/exit::exitSafe: => 56
2023-08-31 14:35:09.359 P00  DEBUG:     main::main: => 56

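---As shown above, the user connects to the database successfully, but the query against the pg_catalog.pg_settings view does not return the settings that sys_rman needs.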

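As shown in the figure below, an ordinary user cannot read the data_directory setting through the view, which causes the initialization to fail, whereas a superuser can read all settings: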

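4. Summary
When sys_backup.sh init initializes the backup, the backup user must connect to the test database and query the pg_catalog.pg_settings view. If the test database does not exist (or the connection fails for another reason, as in the 2023-08-30 17:48 log entry), or if the user lacks permission to read the required settings from the view, the "unable to find primary cluster - cannot proceed" error occurs.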
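
Both conditions from the summary can be checked before running sys_backup.sh init. The sketch below is a minimal pre-flight check, not part of the KingbaseES tooling: the function name check_backup_user and the KSQL variable are hypothetical, and it assumes that the ksql client is on the PATH, accepts the psql-style options -U, -d, -p, -A, -t and -c, and can authenticate without an interactive password prompt. It connects as the backup user to the test database and asks for data_directory, one of the settings that appears in the sys_rman debug query above.

```shell
#!/bin/sh
# Sketch: pre-flight check for the backup user before "sys_backup.sh init".
# KSQL is the client binary; "ksql" on PATH is an assumption, override as needed.
KSQL="${KSQL:-ksql}"

# check_backup_user USER PORT [DBNAME]
# Returns 0 if data_directory is readable, 1 if it is hidden from the user,
# 2 if the connection or query itself failed.
check_backup_user() {
    user="$1"
    port="$2"
    db="${3:-test}"   # sys_rman connected to "test" in the log shown above
    query="select setting from pg_catalog.pg_settings where name = 'data_directory'"

    # -A -t: unaligned, tuples-only output, so a hidden setting yields an empty string
    if ! value=$("$KSQL" -U "$user" -d "$db" -p "$port" -At -c "$query" 2>/dev/null); then
        echo "cannot connect to database '$db' as '$user' on port $port"
        return 2
    fi
    if [ -z "$value" ]; then
        echo "user '$user' cannot see data_directory in pg_settings"
        return 1
    fi
    echo "ok: data_directory=$value"
}
```

For the environment in this post, `check_backup_user tom 54322` would be expected to report that data_directory is not visible, while the same call with a superuser such as system should print the data directory.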

Posted by 天涯客1224