PostgreSQL块损坏模拟修复
PostgreSQL块损坏模拟修复
PG块损坏模拟修复
报错信息:ERROR: invalid page in block 10 of relation "base/13593/11025587"
-
测试环境
服务器:CentOS Linux release 7.6.1810
数据库:PostgreSQL 12.3
-
1、创建表测试
postgres=# CREATE TABLE filler (txt TEXT);
CREATE TABLE
postgres=# INSERT INTO filler SELECT generate_series::text FROM generate_series(-10000000,10000000);
INSERT 0 20000001
postgres=# create unique index i_filler_txt on filler(txt);
CREATE INDEX
postgres=# \dt+ filer
Did not find any relation named "filer".
postgres=# \dt+ filler
List of relations
Schema | Name | Type | Owner | Size | Description
--------+--------+-------+-------+--------+-------------
public | filler | table | sa | 761 MB |
(1 row)
-
2、查看数据路径
postgres=# select pg_relation_filepath('filler');
pg_relation_filepath
----------------------
base/13593/11025542
(1 row)
postgres=# select *from filler where ctid = '(10,1)';
txt
----------
-9998150
(1 row)
-
3、 checkpoint
记得做checkpoint,不然可能dd导出的数据是空的
postgres=# checkpoint;
CHECKPOINT
-
4、备份数据文件
[07-01 18:00:31] root@gauss01:/home/postgres/abdata/7.0/abase1
$ cp /home/postgres/abdata/7.0/abase1/base/13593/11025601 /home/postgres/backup/11025601_bak
[07-01 18:02:30] root@gauss01:/home/postgres/abdata/7.0/abase1
$ ll /home/postgres/backup/11025601_bak
-rw------- 1 root 761M Jul 1 18:02 /home/postgres/backup/11025601_bak
-
5、查看块数据
我们查看第10块的数据,总共有185条数据
postgres=# select *from filler where ctid >'(10,0)' and ctid <'(11,1)';
txt
----------
-9998150
-9998149
-9998148
-9998147
-9998146
-9998145
-9998144
-9998143
-9998142
-9998141
-9998140
、、、
-9997974
-9997973
-9997972
-9997971
-9997970
-9997969
-9997968
-9997967
-9997966
(185 rows)
-
6、导出块数据
将第10个数据块导出:
[07-01 15:00:14] root@gauss01:/home/postgres/abdata/7.0/abase1
$ dd bs=8192 count=1 skip=10 if=/home/postgres/abdata/7.0/abase1/base/13593/11025542 of=block
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000715983 s, 11.4 MB/s
[07-01 15:05:09] root@gauss01:/home/postgres/abdata/7.0/abase1
$ hexdump -C block
00000000 53 00 00 00 f0 89 07 f3 00 00 00 00 fc 02 18 03 |S...............|
00000010 00 20 04 20 00 00 00 00 d8 9f 42 00 b0 9f 42 00 |. . ......B...B.|
00000020 88 9f 42 00 60 9f 42 00 38 9f 42 00 10 9f 42 00 |..B.`.B.8.B...B.|
00000030 e8 9e 42 00 c0 9e 42 00 98 9e 42 00 70 9e 42 00 |..B...B...B.p.B.|
00000040 48 9e 42 00 20 9e 42 00 f8 9d 42 00 d0 9d 42 00 |H.B. .B...B...B.|
00000050 a8 9d 42 00 80 9d 42 00 58 9d 42 00 30 9d 42 00 |..B...B.X.B.0.B.|
00000060 08 9d 42 00 e0 9c 42 00 b8 9c 42 00 90 9c 42 00 |..B...B...B...B.|
00000070 68 9c 42 00 40 9c 42 00 18 9c 42 00 f0 9b 42 00 |h.B.@.B...B...B.|
00000080 c8 9b 42 00 a0 9b 42 00 78 9b 42 00 50 9b 42 00 |..B...B.x.B.P.B.|
00000090 28 9b 42 00 00 9b 42 00 d8 9a 42 00 b0 9a 42 00 |(.B...B...B...B.|
000000a0 88 9a 42 00 60 9a 42 00 38 9a 42 00 10 9a 42 00 |..B.`.B.8.B...B.|
000000b0 e8 99 42 00 c0 99 42 00 98 99 42 00 70 99 42 00 |..B...B...B.p.B.|
000000c0 48 99 42 00 20 99 42 00 f8 98 42 00 d0 98 42 00 |H.B. .B...B...B.|
000000d0 a8 98 42 00 80 98 42 00 58 98 42 00 30 98 42 00 |..B...B.X.B.0.B.|
000000e0 08 98 42 00 e0 97 42 00 b8 97 42 00 90 97 42 00 |..B...B...B...B.|
000000f0 68 97 42 00 40 97 42 00 18 97 42 00 f0 96 42 00 |h.B.@.B...B...B.|
00000100 c8 96 42 00 a0 96 42 00 78 96 42 00 50 96 42 00 |..B...B.x.B.P.B.|
00000110 28 96 42 00 00 96 42 00 d8 95 42 00 b0 95 42 00 |(.B...B...B...B.|
00000120 88 95 42 00 60 95 42 00 38 95 42 00 10 95 42 00 |..B.`.B.8.B...B.|
00000130 e8 94 42 00 c0 94 42 00 98 94 42 00 70 94 42 00 |..B...B...B.p.B.|
00000140 48 94 42 00 20 94 42 00 f8 93 42 00 d0 93 42 00 |H.B. .B...B...B.|
00000150 a8 93 42 00 80 93 42 00 58 93 42 00 30 93 42 00 |..B...B.X.B.0.B.|
00000160 08 93 42 00 e0 92 42 00 b8 92 42 00 90 92 42 00 |..B...B...B...B.|
00000170 68 92 42 00 40 92 42 00 18 92 42 00 f0 91 42 00 |h.B.@.B...B...B.|
00000180 c8 91 42 00 a0 91 42 00 78 91 42 00 50 91 42 00 |..B...B.x.B.P.B.|
00000190 28 91 42 00 00 91 42 00 d8 90 42 00 b0 90 42 00 |(.B...B...B...B.|
000001a0 88 90 42 00 60 90 42 00 38 90 42 00 10 90 42 00 |..B.`.B.8.B...B.|
000001b0 e8 8f 42 00 c0 8f 42 00 98 8f 42 00 70 8f 42 00 |..B...B...B.p.B.|
000001c0 48 8f 42 00 20 8f 42 00 f8 8e 42 00 d0 8e 42 00 |H.B. .B...B...B.|
000001d0 a8 8e 42 00 80 8e 42 00 58 8e 42 00 30 8e 42 00 |..B...B.X.B.0.B.|
000001e0 08 8e 42 00 e0 8d 42 00 b8 8d 42 00 90 8d 42 00 |..B...B...B...B.|
000001f0 68 8d 42 00 40 8d 42 00 18 8d 42 00 f0 8c 42 00 |h.B.@.B...B...B.|
00000200 c8 8c 42 00 a0 8c 42 00 78 8c 42 00 50 8c 42 00 |..B...B.x.B.P.B.|
00000210 28 8c 42 00 00 8c 42 00 d8 8b 42 00 b0 8b 42 00 |(.B...B...B...B.|
00000220 88 8b 42 00 60 8b 42 00 38 8b 42 00 10 8b 42 00 |..B.`.B.8.B...B.|
00000230 e8 8a 42 00 c0 8a 42 00 98 8a 42 00 70 8a 42 00 |..B...B...B.p.B.|
00000240 48 8a 42 00 20 8a 42 00 f8 89 42 00 d0 89 42 00 |H.B. .B...B...B.|
00000250 a8 89 42 00 80 89 42 00 58 89 42 00 30 89 42 00 |..B...B.X.B.0.B.|
00000260 08 89 42 00 e0 88 42 00 b8 88 42 00 90 88 42 00 |..B...B...B...B.|
00000270 68 88 42 00 40 88 42 00 18 88 42 00 f0 87 42 00 |h.B.@.B...B...B.|
00000280 c8 87 42 00 a0 87 42 00 78 87 42 00 50 87 42 00 |..B...B.x.B.P.B.|
00000290 28 87 42 00 00 87 42 00 d8 86 42 00 b0 86 42 00 |(.B...B...B...B.|
000002a0 88 86 42 00 60 86 42 00 38 86 42 00 10 86 42 00 |..B.`.B.8.B...B.|
000002b0 e8 85 42 00 c0 85 42 00 98 85 42 00 70 85 42 00 |..B...B...B.p.B.|
000002c0 48 85 42 00 20 85 42 00 f8 84 42 00 d0 84 42 00 |H.B. .B...B...B.|
000002d0 a8 84 42 00 80 84 42 00 58 84 42 00 30 84 42 00 |..B...B.X.B.0.B.|
000002e0 08 84 42 00 e0 83 42 00 b8 83 42 00 90 83 42 00 |..B...B...B...B.|
000002f0 68 83 42 00 40 83 42 00 18 83 42 00 00 00 00 00 |h.B.@.B...B.....|
00000300 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 |................|
00000310 00 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |.........m......|
00000320 00 00 00 00 00 00 0a 00 b9 00 01 00 02 09 18 00 |................|
00000330 13 2d 39 39 39 37 39 36 36 00 00 00 00 00 00 00 |.-9997966.......|
00000340 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
00000350 b8 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 36 |.........-999796|
00000360 37 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |7........m......|
00000370 00 00 00 00 00 00 0a 00 b7 00 01 00 02 09 18 00 |................|
00000380 13 2d 39 39 39 37 39 36 38 00 00 00 00 00 00 00 |.-9997968.......|
00000390 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
000003a0 b6 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 36 |.........-999796|
000003b0 39 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |9........m......|
000003c0 00 00 00 00 00 00 0a 00 b5 00 01 00 02 09 18 00 |................|
000003d0 13 2d 39 39 39 37 39 37 30 00 00 00 00 00 00 00 |.-9997970.......|
000003e0 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
000003f0 b4 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 37 |.........-999797|
00000400 31 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |1........m......|
00000410 00 00 00 00 00 00 0a 00 b3 00 01 00 02 09 18 00 |................|
00000420 13 2d 39 39 39 37 39 37 32 00 00 00 00 00 00 00 |.-9997972.......|
00000430 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
00000440 b2 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 37 |.........-999797|
00000450 33 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |3........m......|
00000460 00 00 00 00 00 00 0a 00 b1 00 01 00 02 09 18 00 |................|
00000470 13 2d 39 39 39 37 39 37 34 00 00 00 00 00 00 00 |.-9997974.......|
00000480 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
00000490 b0 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 37 |.........-999797|
000004a0 35 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |5........m......|
000004b0 00 00 00 00 00 00 0a 00 af 00 01 00 02 09 18 00 |................|
000004c0 13 2d 39 39 39 37 39 37 36 00 00 00 00 00 00 00 |.-9997976.......|
000004d0 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
000004e0 ae 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 37 |.........-999797|
000004f0 37 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |7........m......|
00000500 00 00 00 00 00 00 0a 00 ad 00 01 00 02 09 18 00 |................|
00000510 13 2d 39 39 39 37 39 37 38 00 00 00 00 00 00 00 |.-9997978.......|
00000520 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
00000530 ac 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 37 |.........-999797|
00000540 39 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |9........m......|
00000550 00 00 00 00 00 00 0a 00 ab 00 01 00 02 09 18 00 |................|
00000560 13 2d 39 39 39 37 39 38 30 00 00 00 00 00 00 00 |.-9997980.......|
00000570 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
00000580 aa 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 38 |.........-999798|
00000590 31 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |1........m......|
000005a0 00 00 00 00 00 00 0a 00 a9 00 01 00 02 09 18 00 |................|
-
7、替换数据
将9998000替换为9999000,并且将旧值存放进blockorig
root@gauss01:/home/postgres/abdata/7.0/abase1 $ sed -iorig 's/9998000/9999000/' block
root@gauss01:/home/postgres/abdata/7.0/abase1 $ diff -u <(hexdump -C block) <(hexdump -C blockorig)
--- /dev/fd/63 2022-07-01 15:15:52.276949646 +0800
+++ /dev/fd/62 2022-07-01 15:15:52.277949613 +0800
@@ -134,7 +134,7 @@
00000850 98 00 01 00 02 09 18 00 13 2d 39 39 39 37 39 39 |.........-999799|
00000860 39 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |9........m......|
00000870 00 00 00 00 00 00 0a 00 97 00 01 00 02 09 18 00 |................|
-00000880 13 2d 39 39 39 39 30 30 30 00 00 00 00 00 00 00 |.-9999000.......|
+00000880 13 2d 39 39 39 38 30 30 30 00 00 00 00 00 00 00 |.-9998000.......|
00000890 e4 6d db 0a 00 00 00 00 00 00 00 00 00 00 0a 00 |.m..............|
000008a0 96 00 01 00 02 09 18 00 13 2d 39 39 39 38 30 30 |.........-999800|
000008b0 31 00 00 00 00 00 00 00 e4 6d db 0a 00 00 00 00 |1........m......|
-
8、对比
--修改前的值
$ hexdump -C blockorig
00000880 13 2d 39 39 39 38 30 30 30 00 00 00 00 00 00 00 |.-9998000.......|
--修改后的值
$ hexdump -C block
00000880 13 2d 39 39 39 39 30 30 30 00 00 00 00 00 00 00 |.-9999000.......|
-
9、模拟文件损坏
将替换的数据写入到数据文件中,来模拟数据文件损坏
[07-01 15:15:52] root@gauss01:/home/postgres/abdata/7.0/abase1
$ dd bs=8192 count=1 seek=10 of=/home/postgres/abdata/7.0/abase1/base/13593/11025542 if=block conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.00363887 s, 2.3 MB/s
写入后要先做checkpoint,然后重启
需要加入conv=notrunc,保证只替换第10页,不加的话10页后面的数据都丢失了。
这样表中就有两条一样的数据了,但是还是能查询出来,表也没报错
postgres=# SELECT ctid, * FROM filler WHERE txt = '-9999000';
ctid | txt
----------+----------
(5,76) | -9999000
(10,151) | -9999000
(2 rows)
神奇的是,该sql能走索引,而且查询出来了。这里检查了zero_damaged_pages=off,并未打开
postgres=# vacuum full filler;
VACUUM
postgres=# SELECT ctid, * FROM filler WHERE txt = '-9999000';
ctid | txt
----------+----------
(5,76) | -9999000
(10,151) | -9999000
(2 rows)
--查看执行计划返回了两条数据
postgres=# explain (analyze,buffers)SELECT ctid, * FROM filler WHERE txt = '-9999000';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
Index Scan using i_filler_txt on filler (cost=0.28..8.29 rows=1 width=14) (actual time=0.115..0.120 rows=2 loops=1)
Index Cond: (txt = '-9999000'::text)
Buffers: shared hit=4
Planning Time: 0.319 ms
Execution Time: 0.278 ms
(5 rows)
-
11、重建索引报错:
postgres=# reindex table filler;
错误: 无法创建唯一索引"i_filler_txt"
DETAIL: 键值(txt)=(-9999000)重复了
-
未能复现出ERROR: invalid page in block 10 of relation 的报错但是很明显数据确实被破坏了,表中有重复的数据,而且索引重建冲突
-
上面还有一点,vacuum full没有报错,而reindex的时候报错了,我们都知道vacuum是会清理掉索引膨胀的,单并没有重建这个索引。上面替换进去的页被正常查出来了,但是数据也是重复的。
-
从这篇文章来看要开启checksum才会报页面损坏,如果不开的话,很可能就是上面的情况,只是数据冲突,还是能继续访问。上面的测试基于pg12.3https://blog.csdn.net/weixin_43230594/article/details/123781575
打开checksum
打开checksum,然后使用dd模拟坏块
-
1、查看checksum是否打开,0表示关闭,1表示打开
[07-01 16:28:07] postgres@gauss01:~ $ pg_controldata -D $PGDATA |grep checksum
Data page checksum version: 0
-
2、关闭数据库,打开checksum
[07-01 16:28:17] postgres@gauss01:~ $ sh stop_abase1.sh
正在关闭数据库实例...
waiting for server to shut down...... done
server stopped
--zhprs_dict_template1这个可能和abase加入了全文检索有关系,删掉这个文件
[07-01 16:29:31] postgres@gauss01:~ $ pg_checksums -D$PGDATA --enable --progress
pg_checksums: error: invalid segment number 0 in file name "/home/postgres/abdata/7.0/abase1/base/1/zhprs_dict_template1.tag"
-
3、开启checksum
--开启checksum
[07-01 16:33:18] postgres@gauss01:~
$ pg_checksums -D$PGDATA --enable --progress
4443/4443 MB (99%) computed
Checksum operation completed
Files scanned: 3665
Blocks scanned: 568745
pg_checksums: syncing data directory
pg_checksums: updating control file
Checksums enabled in cluster
--启动数据库
[07-01 16:35:09] postgres@gauss01:~
$ sh startup_abase1.sh
正在启动数据库实例...
waiting for server to start....2022-07-01 16:35:16.957 CST - - - - 5722: 日志: Auto detecting pg_stat_kcache.linux_hz parameter...
done
server started
--开启checksum
[07-01 16:35:21] postgres@gauss01:~
$ pg_controldata -D $PGDATA |grep checksum
Data page checksum version: 1
-
4、模拟数据损坏
postgres=# create table t1 (id int,c_name varchar(100));
CREATE TABLE
postgres=# insert into t1 select generate_series(1,100000),'zsan';
INSERT 0 100000
postgres=# alter table t1 add primary key(id);
ALTER TABLE
postgres=# select pg_relation_filepath('t1');
pg_relation_filepath
----------------------
base/13593/11025587
(1 row)
-
5、使用dd
--这里使用当前页去替换
[07-01 16:56:55] root@gauss01:/home/postgres/abdata/7.0/abase1
$ dd bs=8192 count=1 seek=1 of=/home/postgres/abdata/7.0/abase1/base/13593/11025587 if=/home/postgres/abdata/7.0/abase1/base/13593/11025587
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000700158 s, 11.7 MB/s
--替换后只剩下370条数据了,还没重启
postgres=# select count(*) from t1;
count
-------
370
(1 row)
-- 然后重启,重启后在查询:
--这里不能执行checkpoint,实验中发现,如果执行了checkpoint那么可能会将缓存中的数据刷到磁盘,数据块仍然是正常的
postgres=# select count(*) from t1;
警告: 页校验失败,计算出的校验和为51401,但期望值是51402
警告: 关系 "base/13593/11025587" 中的块 1 无效的页;正在对页进行清零操作
-
打开checksum能够复现出块损坏的报错
-
6、使用checksum检查块
停掉数据库使用checksum检查,可以可能出我开始手动损坏的块都检测出来了,但是查询是正常的
$ pg_checksums -D$PGDATA --check
pg_checksums: error: checksum verification failed in file "/home/postgres/abdata/7.0/abase1/base/13593/11025587", block 1: calculated checksum C8C9 but block contains C8CA
pg_checksums: error: checksum verification failed in file "/home/postgres/abdata/7.0/abase1/base/13593/11025551", block 1: calculated checksum 1CB7 but block contains 1CB8
Checksum operation completed
Files scanned: 3663
Blocks scanned: 551755
Bad checksums: 2
Data checksum version: 1
7、转换
将checksum报错16进制转换为2进制刚好能对上
-- block 1: calculated checksum C8C9 but block contains C8CA
[07-04 16:48:54] postgres@gauss01:~
$ echo $((0xC8C9))
51401
[07-04 16:49:31] postgres@gauss01:~
$ echo $((0xC8CA))
51402
--页校验失败,计算出的校验和为51401,但期望值是51402
-
8、打开zero_damaged_pages
-
打开checksum后能复现报错,尝试使用zero_damaged_pages来忽略坏块
postgres=# set zero_damaged_pages=on;
SET
postgres=# select count(*) from t1;
警告: 页校验失败,计算出的校验和为57110,但期望值是57109
警告: 关系 "base/13593/11121105" 中的块 1 无效的页;正在对页进行清零操作
count
-------
185
(1 row)
可以看到确实可以访问表了,但是警告仍然存在,这给了我们操作空间,能访问意味着我们可以备份这个表
关闭checksum
-
1、关闭checksum
[07-01 17:40:43] postgres@gauss01:~ $ pg_checksums -D$PGDATA --disable --progress
pg_checksums: syncing data directory
pg_checksums: updating control file
Checksums disabled in cluster
--重启数据库
[07-01 17:41:11] postgres@gauss01:~
$ sh startup_abase1.sh
-
2、 启动数据库查询数据,发现数据发生了错误
--数据发生了冲突
postgres=# select count(*) from t1;
count
-------
370
(1 row)
--一页数据完全重复
postgres=# select count(distinct id) from t1;
count
-------
185
(1 row)
--并没打开
postgres=# show zero_damaged_pages;
zero_damaged_pages
--------------------
off
(1 row)
--数据冲突的
postgres=# select * from t1 where id = 1;
id | c_name
----+--------
1 | zsan
1 | zsan
(2 rows)
-
3、zero_damaged_pages
设置后还是有数据冲突的问题,当数据重复且能正常查询出来的时候设置zero_damaged_pages这个参数已经没有效果,因为数据已经正常只是索引里面冲突了,zero_damaged_pages不能解决数据冲突的问题
--打开zero_damaged_pages
postgres=# set zero_damaged_pages=on;
SET
postgres=# select * from t1 where id = 1;
id | c_name
----+--------
1 | zsan
1 | zsan
(2 rows)
postgres=# checkpoint;
CHECKPOINT
--查看数据有冲突
postgres=# select * from t1 where id = 1;
id | c_name
----+--------
1 | zsan
1 | zsan
(2 rows)
postgres=# show zero_damaged_pages;
zero_damaged_pages
--------------------
on
(1 row)
-
小结:也就是说没有开启checksum下的数据文件损坏,可能导致查询到错误的数据
数据页损坏修复
我们再次复现上面第一个例子,先打开checksum,看能检查出报错,另外就是备份数据文件,看能否从备份的数据文件中还原损坏的块
从读库还原坏块
-
1、我们用第一个模拟文件损坏来实验
打开checksum以后复现:
postgres=# SELECT ctid, * FROM filler WHERE txt = '-9999000';
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--count失败
postgres=# select count(*) from filler;
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--更新失败
postgres=# UPDATE filler SET txt ='Postgres Rules!' WHERE txt = '-9997999';
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--更新其他好的页也失败
postgres=# UPDATE filler SET txt ='Postgres Rules!' WHERE txt = '1';
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--查询其他页可以
postgres=# select *from filler where ctid = '(1,1)';
txt
----------
-9999815
(1 row)
--更新这个能查询的页也不行
postgres=# UPDATE filler SET txt ='Postgres Rules!' WHERE txt = '-9999815';
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--更新ctid也不行
postgres=# UPDATE filler SET txt ='Postgres Rules!' WHERE ctid = '(1,1)';
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--删除这个坏块也不行,看来网上说的删除坏块的方法行不通,dd应该是可以
postgres=# delete from filler where ctid = '(10,1)';
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
--关闭数据库执行checksum
[07-04 14:49:17] postgres@gauss01:~
$ sh stop_abase1.sh
正在关闭数据库实例...
waiting for server to shut down.... done
server stopped
[07-04 14:51:43] postgres@gauss01:~
$ pg_checksums -D$PGDATA --check
pg_checksums: error: checksum verification failed in file "/home/postgres/abdata/7.0/abase1/base/13593/11121060", block 10: calculated checksum 8600 but block contains DBE9
Checksum operation completed
Files scanned: 3666
Blocks scanned: 697218
Bad checksums: 1
Data checksum version: 1
--启动数据库
postgres=# vacuum analyze filler;
警告: 页校验失败,计算出的校验和为34304,但期望值是56297
错误: 关系 "base/13593/11121060" 中的块 10 存在无效的页
-
也就是说如果一开始打开了checksum,那么检测出损坏的块,即使后面关掉checksum仍然会报错
-
2、从读库还原坏块
使用dd将备份的数据文件替换进去,加入11121060_bak是流复制中读库的数据
[07-04 13:52:35] postgres@gauss01:~
$ dd bs=8192 count=1 skip=10 if=/home/postgres/backup/11121060_bak seek=10 of=/home/postgres/abdata/7.0/abase1/base/13593/11121060 conv=notrunc
1+0 records in
1+0 records out
8192 bytes (8.2 kB) copied, 0.000897603 s, 9.1 MB/s
[07-04 14:47:32] postgres@gauss01:~ $ ll /home/postgres/abdata/7.0/abase1/base/13593/11121060
-rw------- 1 postgres 761M Jul 4 14:47 /home/postgres/abdata/7.0/abase1/base/13593/11121060
3、重启数据库
--再次查看数据确实能够查询出来,说明这个方案是靠谱的,如果有高可用那么从从库恢复的方案可行
postgres=# select count(*) from filler;
count
-------
20000001
(1 row)
--没有重复的数据
postgres=# select count(distinct txt) from filler;
count
-------
20000001
(1 row)
-
有高可用情况可以非常快速的恢复,且不丢失数据
抹除损坏的块
dd抹除
生产环境谨慎使用,如果没有备份,可以抹除掉损坏的块
-
1、dd抹除,conv=notrunc 会阻止 dd 截断表的其余部分。
-
如果不加conv=notrunc,将会丢失大量数据
dd bs=8192 seek=10 count=1 of=/home/postgres/abdata/7.0/abase1/base/13593/11121060 if=/dev/zero conv=notrunc
表示块大小,seek标识第10块,count=1标识总共处理1块,if标识输入的文件,of标识输出的文件
抹除后执行vacuum
postgres=# vacuum analyze filler;
VACUUM
--能够查询出数据
postgres=# select count(*) from filler;
count
----------
19999816
(1 row)
--执行vacuum full以后第10页会被占用,数据正常
postgres=# vacuum full filler;
VACUUM
postgres=# select * from filler where ctid = '(10,1)';
txt
----------
-9997965
(1 row)
我们可以看到最开始插入的数据是20000001,刚好相差185条数据,也就是丢失了第10页的数据
导出正常数据
如果害怕dd使用有问题,那么我们可以考虑将正常的数据导出使用sql来处理,导出非坏块的数据
-
1、根据报错找到对应的坏块
-
将数据分别插入到临时表中
-- 找到坏块之后,我们来处理数据,处理的步骤是忽略坏块,使用临时表将数据合并。
select * int t_09 from t1 where ctid <='(10,0)';
select * int t_11 from t1 where ctid >='(11,0)';
--将两部分数据分别插入临时表,然后再处理
insert into t_11 select * from t_10 ;
--修改原来的表名,然后将新的表名修改为原来的表名
alter table t1 rename to t1_bak;
alter table t11 rename to t1;
--关闭参数
alter system set zero_damaged_pages='off';
select pg_reload_conf();
-
对整库进行备份,让运维人员对整个磁盘设备进行检查是否有坏块,并修复坏块
-
另外看到说可以删除坏块的方式,测试发现不可行
使用zero_damaged_pages
-
如上打开checksum示例,我们可以打开这个参数来忽略坏块,然后备份数据,当然某些情况下这个参数也不好使,还得用上面两种方法
总结
1、在没有打开checksum的情况下块损坏后还是能访问,但是数据错误
2、如果报错页损坏invalid page in block of relation base ,那么可以设置zero_damaged_pages=on一般就可以访问表,使用dump的方式将数据备份出来。
3、第2步如果不行那么就需要特殊处理
常见的方式有使用dd抹除损坏的这个页,另外就是使用临时表的方式将有用的数据导出来,然后还原数据
4、如果有做高可用,那么可以直接从备库的数据文件中还原损坏的块,此办法较快
5、checksum可以检查具体有那些块是损坏的,然后针对全库进行修复
6、做好备份,并且定期检查备份的正确是很有必要的。块损坏会导致备份失败
zero_damaged_pages:检测到损坏的页头通常会导致PostgreSQL报告错误,中止当前事务。设置zero_damaged_pages
为 on 会导致系统报告警告,将内存中损坏的页面清零,然后继续处理。这种行为会破坏数据,即损坏页面上的所有行。但是,它确实允许您克服错误并从表中可能存在的任何未损坏的页面中检索行。如果由于硬件或软件错误而发生损坏,它对于恢复数据很有用。在您放弃从表的损坏页恢复数据的希望之前,您通常不应设置此选项。清零页不会被强制写入磁盘,因此建议在再次关闭此参数之前重新创建表或索引。默认设置为off
,并且只能由超级用户更改。
参考资料:
https://www.endpointdev.com/blog/2010/06/tracking-down-database-corruption-with/
http://bitrotincarnate.herokuapp.com/blog/2013/08/27/tuple-flensing-treating-postgres-corruption-part-i
https://mp.weixin.qq.com/s/LFPta3nGD12MRFVyuYEvHA
https://www.endpointdev.com/blog/2013/10/downstream-implications-of-data-page/
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
· 【自荐】一款简洁、开源的在线白板工具 Drawnix
· 园子的第一款AI主题卫衣上架——"HELLO! HOW CAN I ASSIST YOU TODAY
· Docker 太简单,K8s 太复杂?w7panel 让容器管理更轻松!