[Case] invalid primary checkpoint record

Problem: invalid primary checkpoint record

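WAL archiving on the server had been failing, so the WAL directory filled up and the database eventually went down. Starting the database again also failed. The same error often appears after a server loses power.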

LOG,00000,"database system was interrupted while in recovery at 2022-04-08 09:08:03 HKT",,"This probably means that some data is corrupted and you will have to use the last backup for recovery.",,,,,,,""
LOG,00000,"invalid primary checkpoint record",,,,,,,,,""
PANIC,XX000,"could not locate a valid checkpoint record",,,,,,,,,""
LOG,00000,"startup process (PID 76560) was terminated by signal 6: Aborted",,,,,,,,,""
LOG,00000,"aborting startup due to startup process failure",,,,,,,,,""
LOG,00000,"database system is shut down",,,,,,,,,""
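
Since the outage here started with an archiving backlog, a small check like the following could flag the problem before the disk fills. This is a minimal sketch, not part of the original incident: the function name count_pending_wal and the threshold of 100 are hypothetical, and it assumes the PostgreSQL 10+ layout in which each unarchived segment has a .ready marker under pg_wal/archive_status.

```shell
#!/usr/bin/env bash
# Count WAL segments that are still waiting to be archived.
# Assumption: $1 is the data directory and markers live in
# $1/pg_wal/archive_status/*.ready (PostgreSQL 10 or later).
count_pending_wal() {
    local status_dir="$1/pg_wal/archive_status"
    # One line per .ready marker; wc -l counts them, tr strips padding.
    find "$status_dir" -maxdepth 1 -type f -name '*.ready' 2>/dev/null \
        | wc -l | tr -d ' '
}

# Example usage (hypothetical threshold):
# pending=$(count_pending_wal "$PGDATA")
# if [ "$pending" -gt 100 ]; then
#     echo "WARNING: $pending WAL segments not yet archived"
#     df -h "$PGDATA/pg_wal"
# fi
```

Run from cron or a monitoring agent, a growing count points at a failing archive_command well before pg_wal runs out of space.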

Fix: pg_resetwal

Use su to switch to the postgres user so that pg_resetwal can be run.
 su - postgres 
[postgres@LXUATEDPD2 log]$ pg_resetwal  -f $PGDATA
Write-ahead log reset
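
Because the reset cannot be undone, it may be worth wrapping it in a slightly more cautious sequence. The sketch below is an assumption layered on top of the command above, not what was run in this incident: the function name safe_resetwal and the backup path are hypothetical. It copies the stopped cluster, previews the new control values with the -n (no-action) option, and only then performs the forced reset.

```shell
#!/usr/bin/env bash
# Sketch: copy the data directory, preview, then reset.
# Assumptions: the server is stopped, $1 is the data directory,
# $2 is a not-yet-existing backup path with enough free space.
safe_resetwal() {
    local pgdata="$1" backup_dir="$2"
    # 1. Keep a file-level copy of the damaged cluster.
    cp -a "$pgdata" "$backup_dir" || return 1
    # 2. -n prints the values that would be written, changing nothing.
    pg_resetwal -n "$pgdata" || return 1
    # 3. Perform the actual reset.
    pg_resetwal -f "$pgdata"
}

# Example usage (hypothetical backup path):
# safe_resetwal "$PGDATA" /backup/pg_data_before_resetwal
```

Keeping the copy means the original files are still available for a different recovery attempt if the reset cluster turns out to be unusable.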

# Start the database
[postgres@LXUATEDPD2 log]$ pg_ctl start
waiting for server to start....2022-04-08 09:22:12.216 HKT [76634] LOG:  starting PostgreSQL 12.9 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 8.5.0 20210514 (Red Hat 8.5.0-4), 64-bit
2022-04-08 09:22:12.217 HKT [76634] LOG:  listening on IPv4 address "0.0.0.0", port 5432
2022-04-08 09:22:12.218 HKT [76634] LOG:  listening on Unix socket "/db_data/dwh/pg_data/.s.PGSQL.5432"
2022-04-08 09:22:12.328 HKT [76634] LOG:  redirecting log output to logging collector process
2022-04-08 09:22:12.328 HKT [76634] HINT:  Future log output will appear in directory "log".
 done
server started

Notes

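pg_resetwal clears the write-ahead log (WAL) and optionally resets some of the control information stored in the pg_control file. Use it only as a last resort.
After running the tool the database should be able to start, but it may contain inconsistent data because some transactions may be only partially committed. Once the server is up, immediately dump the data, run initdb, and reload the data. Then check the data for inconsistencies and repair it as needed.
The data directory must be given explicitly on the command line; pg_resetwal does not read the PGDATA environment variable. (In the command above, the shell expands $PGDATA before pg_resetwal sees it.)
The -f option forces pg_resetwal to proceed, for example when the server was not shut down cleanly or when valid data for pg_control cannot be determined.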
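
The dump, initdb, and reload advice above can be sketched as a single function. This is only an outline under stated assumptions: the name rebuild_cluster and all three paths are hypothetical, the commands run as the postgres user, the reset cluster is currently running and reachable with default connection settings, and the new data directory does not exist yet. Configuration files such as postgresql.conf and pg_hba.conf are not carried over and would need to be copied separately.

```shell
#!/usr/bin/env bash
# Sketch: move data from the reset cluster into a freshly initialized one.
# $1 = old data directory, $2 = new data directory, $3 = dump file.
rebuild_cluster() {
    local old_data="$1" new_data="$2" dump_file="$3"
    # Dump all databases and roles while the reset cluster is running.
    pg_dumpall -f "$dump_file" || return 1
    # Stop the old cluster so the new one can use the same port.
    pg_ctl stop -D "$old_data" -m fast || return 1
    # Create a clean cluster and start it, waiting until it is ready.
    initdb -D "$new_data" || return 1
    pg_ctl start -D "$new_data" -w || return 1
    # Reload the dump; -X skips any ~/.psqlrc customizations.
    psql -X -d postgres -f "$dump_file"
}

# Example usage (hypothetical paths):
# rebuild_cluster "$PGDATA" /db_data/dwh/pg_data_new /backup/all.sql
```

Errors printed by pg_dumpall or psql during this sequence are a useful first indication of which tables were damaged and need manual repair.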





posted @ 2022-04-08 11:39  www.cqdba.cn