PostgresQL-toast表数据损坏修复案例
PostgresQL-toast表数据损坏修复案例
pg_statistic-toast表块损坏修复
当访问该表的时候报错:missing chunk number 0 for toast value 59747165 in pg_toast_2619
首先我们要找到出问题的表:toast表的表名是字符串"pg_toast"与表tbl的oid "2619" 拼接而成,可以直接查询oid=2619
1、
pgbenchdb=# select 2619::regclass;
regclass
--------------
pg_statistic
(1 row)
2、
pgbenchdb=# select relname from pg_class where oid= 2619;
relname
--------------
pg_statistic
(1 row)
--3、也可以根据reltoastrelid来获取,reltoastrelid存储了该表相关联的TOAST表的OID
pgbenchdb=# select relname from pg_class where reltoastrelid = (select oid from pg_class where relname = 'pg_toast_2619');
relname
--------------
pg_statistic
(1 row)
这样我们就获取到了出问题的表是pg_statistic的toast表:pg_toast_2619
--查询报错
pgbenchdb=# select * from pg_statistic;
错误: missing chunk number 0 for toast value 59747165 in pg_toast_2619
pgbenchdb=# select * from pg_statistic;
错误: missing chunk number 0 for toast value 59747165 in pg_toast_2619
--尝试vacuum full
pgbenchdb=# vacuum full pg_statistic;
错误: missing chunk number 0 for toast value 59747165 in pg_toast_2619
--analyze可以执行
pgbenchdb=# analyze pg_statistic;
ANALYZE
--重建索引不报错,因为此处是toast表的损坏
pgbenchdb=# reindex table pg_statistic;
REINDEX
pgbenchdb=# vacuum full pg_statistic;
错误: missing chunk number 0 for toast value 59747165 in pg_toast_2619
pgbenchdb=# select * from pg_statistic;
错误: missing chunk number 0 for toast value 59747165 in pg_toast_2619
pgbenchdb=#
pgbenchdb=#
-
pg_toast_2619通过vacuum full无法修复
修复
-
由于pg_statistic表的数据是存储的统计信息,所以可以删掉这个表的数据
--删掉这个表的数据,访问正常
pgbenchdb=# delete from pg_statistic;
DELETE 485
pgbenchdb=# select * from pg_statistic;
starelid | staattnum | stainherit | stanullfrac | stawidth | stadistinct | stakind1 | stakind2 | s
takind3 | stakind4 | stakind5 | staop1 | staop2 | staop3 | staop4 | staop5 | stacoll1 | stacoll2 |
stacoll3 | stacoll4 | stacoll5 | stanumbers1 | stanumbers2 | stanumbers3 | stanumbers4 | stanumbers
5 | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5
----------+-----------+------------+-------------+----------+-------------+----------+----------+--
--------+----------+----------+--------+--------+--------+--------+--------+----------+----------+-
---------+----------+----------+-------------+-------------+-------------+-------------+-----------
--+------------+------------+------------+------------+------------
(0 rows)
删掉后会没有数据,重新analyze一遍
pgbenchdb=# vacuum analyze;
VACUUM
--再次访问这个表就正常了
pgbenchdb=# select * from pg_statistic limit 10;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------
starelid | 1260
staattnum | 1
stainherit | f
stanullfrac | 0
stawidth | 4
stadistinct | -1
stakind1 | 2
stakind2 | 3
stakind3 | 0
stakind4 | 0
stakind5 | 0
staop1 | 609
复现该问题
-- 1、获取pg_toast_2619的文件
pgbenchdb=# select pg_relation_filepath('pg_toast.pg_toast_2619');
pg_relation_filepath
------------------------
base/32544061/59747222
(1 row)
-- 2、用户dd来抹除数据文件
[thunisoft@gauss01 ~]$ dd if=/dev/zero of=/home/thunisoft/abdata/7.0/abase1/base/32544061/59747222 bs=8k count=2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.000906751 s, 18.1 MB/s
-- 3、访问表没问题
pgbenchdb=# select * from pg_statistic;
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
------
starelid | 1260
staattnum | 1
stainherit | f
stanullfrac | 0
stawidth | 4
stadistinct | -1
、、、、、、
-
重启数据库,排除缓存干扰
[thunisoft@gauss01 ~]$ sh stop_abase1.sh
正在关闭数据库实例...
waiting for server to shut down........................................ done
server stopped
[thunisoft@gauss01 ~]$ sh startup_abase1.sh
正在启动数据库实例...
waiting for server to start....2022-02-14 10:21:56.537 CST - - - - 23533: 警告: 42704: 未认可的配置参数 "pg_stat_statements.track_planning"
2022-02-14 10:21:56.537 CST - - - - 23533: 位置: EmitWarningsOnPlaceholders, guc.c:8795
2022-02-14 10:21:56.568 CST - - - - 23533: 日志: 00000: Auto detecting pg_stat_kcache.linux_hz parameter...
、、
2022-02-14 10:21:59.027 CST - - - - 23533: 位置: SysLogger_Start, syslogger.c:675
. done
server started
-
再次查看该表就能复现出这个报错
--访问表报错
pgbenchdb=# select * from pg_statistic;
错误: missing chunk number 0 for toast value 59747207 in pg_toast_2619
按照上面的方式再次修复的时候翻车了,修复的时候报了另外一个错误:
ERROR: could not read block 2 in file "base/32544061/59747222":read only 0 of 8192 bytes
pgbenchdb=# delete from pg_statistic;
错误: 无法读取文件"base/32544061/59747222"的块2:只读取了8192字节的0
ERROR: could not read block 2 in file "base/32544061/59747222":read only 0 of 8192 bytes
前面测试的时候没有报这个错,这次突然报这个,怀疑和索引可能有关系,因为pg_toast_2619上面也是有索引的
重建该索引后就可以正常删除
pgbenchdb=# explain delete from pg_statistic;
QUERY PLAN
---------------------------------------------------------------------
Delete on pg_statistic (cost=0.00..29.85 rows=485 width=6)
-> Seq Scan on pg_statistic (cost=0.00..29.85 rows=485 width=6)
(2 rows)
pgbenchdb=# reindex index pg_toast_2619_index;
错误: 关系 "pg_toast_2619_index" 不存在
--重建索引
pgbenchdb=# reindex index pg_toast.pg_toast_2619_index;
REINDEX
--重建后可以正常删除
pgbenchdb=# delete from pg_statistic;
DELETE 485
复现索引损坏:
pgbenchdb=# select relfilenode from pg_class where relname = 'pg_toast_2619_index';
relfilenode
-------------
59863937
(1 row)
[thunisoft@gauss01 ~]$ dd if=/dev/zero of=/home/thunisoft/abdata/7.0/abase1/base/32544061/59863937 bs=8k count=2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.000638181 s, 25.7 MB/s
--重启数据库
--这次的报错和前面不一样,但是也提示索引损坏
pgbenchdb=# select * from pg_statistic ;
错误: 索引"pg_toast_2619_index"在块0上包含未期望的零页
HINT: 请重建索引 (REINDEX).
普通表toast损坏修复
如果没有备份出现损坏的数据如何最大程度的保留数据呢?
create table test(id int primary key ,c_name varchar);
insert into test select generate_series(1,100),'a';
--插入大字段
insert into test select generate_series(101,200),repeat(md5(random()::text),8000);
insert into test select generate_series(201,300),'b';
postgres=# select oid from pg_class where relname = 'test';
oid
----------
59986613
(1 rows)
--查看toast文件
postgres=# select pg_relation_filepath('pg_toast.pg_toast_59986613');
pg_relation_filepath
----------------------
base/13593/59986616
(1 row)
--损坏toast文件
[thunisoft@gauss01 ~]$ dd if=/dev/zero of=/home/thunisoft/abdata/7.0/abase1/base/13593/59986616 bs=8k count=2
2+0 records in
2+0 records out
16384 bytes (16 kB) copied, 0.000544404 s, 30.1 MB/s
--重启数据库
[thunisoft@gauss01 ~]$ sh stop_abase1.sh
正在关闭数据库实例...
waiting for server to shut down..... done
server stopped
--清理缓存
[root@gauss01 ~]# echo 3 > /proc/sys/vm/drop_caches
[thunisoft@gauss01 ~]$ sh startup_abase1.sh
正在启动数据库实例...
定位损坏数据
--重启后发现访问该表报错
postgres=# select * from test;
错误: missing chunk number 0 for toast value 59986621 in pg_toast_59986613
--count没问题
postgres=# select count(*) from test;
count
-------
300
(1 row)
--选择其他列也没有问题,但是访问c_name有问题
postgres=# select id from test;
id
-----
1
2
3
--建索引,更新统计信息不行
postgres=# reindex table test;
REINDEX
postgres=# reindex table pg_toast.pg_toast_59986613;
REINDEX
postgres=# vacuum analyze test;
VACUUM
postgres=# select * from test;
错误: missing chunk number 0 for toast value 59986621 in pg_toast_59986613
获取损坏的数据
postgres=# select * from test order by id limit 100 offset 0;
postgres=# select * from test order by id limit 100 offset 100;
postgres=# select * from test order by id limit 100 offset 200;
postgres=# select * from test order by id limit 100 offset 300;
、、、
--当执行到limit 100 offset 100的时候报错
postgres=# select * from test order by id limit 100 offset 100;
错误: missing chunk number 0 for toast value 59986621 in pg_toast_59986613
这种查找方式比较笨拙
-
使用脚本查找出所有的损坏行
[thunisoft@gauss01 ~]$ cat check_toast.sh
#!/bin/bash
j=300 #300是表的总行数
for ((i=1; i<=j;i++))
do
psql -U sa -d postgres -c "SELECT * FROM test order by id LIMIT 1 offset $i" >/dev/null || echo $i
done
-
上面的语法意味着:执行查询并将输出重定向到,/dev/null或者,如果出现错误(||),则输出错误消息。
定位损坏的行
[thunisoft@gauss01 ~]$ sh check_toast.sh
错误: missing chunk number 0 for toast value 59986621 in pg_toast_59986613
100
错误: missing chunk number 0 for toast value 59986622 in pg_toast_59986613
101
错误: missing chunk number 0 for toast value 59986623 in pg_toast_59986613
102
错误: missing chunk number 0 for toast value 59986624 in pg_toast_59986613
103
错误: missing chunk number 0 for toast value 59986625 in pg_toast_59986613
104
错误: missing chunk number 0 for toast value 59986626 in pg_toast_59986613
105
错误: missing chunk number 0 for toast value 59986627 in pg_toast_59986613
106
错误: missing chunk number 0 for toast value 59986628 in pg_toast_59986613
107
错误: missing chunk number 0 for toast value 59986629 in pg_toast_59986613
108
错误: missing chunk number 0 for toast value 59986630 in pg_toast_59986613
109
错误: missing chunk number 0 for toast value 59986631 in pg_toast_59986613
110
错误: missing chunk number 0 for toast value 59986632 in pg_toast_59986613
111
错误: missing chunk number 0 for toast value 59986633 in pg_toast_59986613
112
错误: missing chunk number 0 for toast value 59986634 in pg_toast_59986613
113
错误: missing chunk number 0 for toast value 59986635 in pg_toast_59986613
114
错误: missing chunk number 0 for toast value 59986636 in pg_toast_59986613
115
错误: missing chunk number 0 for toast value 59986637 in pg_toast_59986613
116
错误: missing chunk number 0 for toast value 59986638 in pg_toast_59986613
117
错误: missing chunk number 0 for toast value 59986639 in pg_toast_59986613
118
错误: missing chunk number 0 for toast value 59986640 in pg_toast_59986613
119
错误: missing chunk number 0 for toast value 59986641 in pg_toast_59986613
120
错误: missing chunk number 0 for toast value 59986642 in pg_toast_59986613
121
错误: missing chunk number 0 for toast value 59986643 in pg_toast_59986613
122
错误: missing chunk number 0 for toast value 59986644 in pg_toast_59986613
123
错误: missing chunk number 0 for toast value 59986645 in pg_toast_59986613
124
错误: missing chunk number 0 for toast value 59986646 in pg_toast_59986613
125
错误: missing chunk number 0 for toast value 59986647 in pg_toast_59986613
126
错误: missing chunk number 0 for toast value 59986648 in pg_toast_59986613
127
错误: missing chunk number 0 for toast value 59986649 in pg_toast_59986613
128
错误: missing chunk number 0 for toast value 59986650 in pg_toast_59986613
129
错误: missing chunk number 0 for toast value 59986651 in pg_toast_59986613
130
错误: missing chunk number 0 for toast value 59986652 in pg_toast_59986613
131
错误: missing chunk number 0 for toast value 59986653 in pg_toast_59986613
132
错误: missing chunk number 0 for toast value 59986654 in pg_toast_59986613
133
错误: missing chunk number 0 for toast value 59986655 in pg_toast_59986613
134
错误: missing chunk number 0 for toast value 59986656 in pg_toast_59986613
135
错误: missing chunk number 0 for toast value 59986657 in pg_toast_59986613
136
错误: missing chunk number 0 for toast value 59986658 in pg_toast_59986613
137
错误: missing chunk number 0 for toast value 59986659 in pg_toast_59986613
138
错误: missing chunk number 0 for toast value 59986660 in pg_toast_59986613
139
错误: missing chunk number 0 for toast value 59986661 in pg_toast_59986613
140
错误: missing chunk number 0 for toast value 59986662 in pg_toast_59986613
141
错误: missing chunk number 0 for toast value 59986663 in pg_toast_59986613
142
错误: missing chunk number 0 for toast value 59986664 in pg_toast_59986613
143
错误: missing chunk number 0 for toast value 59986665 in pg_toast_59986613
144
错误: missing chunk number 0 for toast value 59986666 in pg_toast_59986613
145
错误: missing chunk number 0 for toast value 59986667 in pg_toast_59986613
146
错误: missing chunk number 0 for toast value 59986668 in pg_toast_59986613
147
错误: missing chunk number 0 for toast value 59986669 in pg_toast_59986613
148
错误: missing chunk number 0 for toast value 59986670 in pg_toast_59986613
149
错误: missing chunk number 0 for toast value 59986671 in pg_toast_59986613
150
错误: missing chunk number 0 for toast value 59986672 in pg_toast_59986613
151
错误: missing chunk number 0 for toast value 59986673 in pg_toast_59986613
152
错误: missing chunk number 0 for toast value 59986674 in pg_toast_59986613
153
错误: missing chunk number 0 for toast value 59986675 in pg_toast_59986613
154
错误: missing chunk number 0 for toast value 59986676 in pg_toast_59986613
155
错误: missing chunk number 0 for toast value 59986677 in pg_toast_59986613
156
错误: missing chunk number 0 for toast value 59986678 in pg_toast_59986613
157
错误: missing chunk number 0 for toast value 59986679 in pg_toast_59986613
158
错误: missing chunk number 0 for toast value 59986680 in pg_toast_59986613
159
错误: missing chunk number 0 for toast value 59986681 in pg_toast_59986613
160
错误: missing chunk number 0 for toast value 59986682 in pg_toast_59986613
161
错误: missing chunk number 0 for toast value 59986683 in pg_toast_59986613
162
错误: missing chunk number 0 for toast value 59986684 in pg_toast_59986613
163
错误: missing chunk number 0 for toast value 59986685 in pg_toast_59986613
164
错误: missing chunk number 0 for toast value 59986686 in pg_toast_59986613
165
错误: missing chunk number 0 for toast value 59986687 in pg_toast_59986613
166
错误: missing chunk number 0 for toast value 59986688 in pg_toast_59986613
167
错误: missing chunk number 0 for toast value 59986689 in pg_toast_59986613
168
错误: missing chunk number 0 for toast value 59986690 in pg_toast_59986613
169
错误: missing chunk number 0 for toast value 59986691 in pg_toast_59986613
170
错误: missing chunk number 0 for toast value 59986692 in pg_toast_59986613
171
错误: missing chunk number 0 for toast value 59986693 in pg_toast_59986613
172
错误: missing chunk number 0 for toast value 59986694 in pg_toast_59986613
173
错误: missing chunk number 0 for toast value 59986695 in pg_toast_59986613
174
错误: missing chunk number 0 for toast value 59986696 in pg_toast_59986613
175
错误: missing chunk number 0 for toast value 59986697 in pg_toast_59986613
176
错误: missing chunk number 0 for toast value 59986698 in pg_toast_59986613
177
错误: missing chunk number 0 for toast value 59986699 in pg_toast_59986613
178
错误: missing chunk number 0 for toast value 59986700 in pg_toast_59986613
179
错误: missing chunk number 0 for toast value 59986701 in pg_toast_59986613
180
错误: missing chunk number 0 for toast value 59986702 in pg_toast_59986613
181
错误: missing chunk number 0 for toast value 59986703 in pg_toast_59986613
182
错误: missing chunk number 0 for toast value 59986704 in pg_toast_59986613
183
错误: missing chunk number 0 for toast value 59986705 in pg_toast_59986613
184
错误: missing chunk number 0 for toast value 59986706 in pg_toast_59986613
185
错误: missing chunk number 0 for toast value 59986707 in pg_toast_59986613
186
错误: missing chunk number 0 for toast value 59986708 in pg_toast_59986613
187
错误: missing chunk number 0 for toast value 59986709 in pg_toast_59986613
188
错误: missing chunk number 0 for toast value 59986710 in pg_toast_59986613
189
错误: missing chunk number 0 for toast value 59986711 in pg_toast_59986613
190
错误: missing chunk number 0 for toast value 59986712 in pg_toast_59986613
191
错误: missing chunk number 0 for toast value 59986713 in pg_toast_59986613
192
错误: missing chunk number 0 for toast value 59986714 in pg_toast_59986613
193
错误: missing chunk number 0 for toast value 59986715 in pg_toast_59986613
194
错误: missing chunk number 0 for toast value 59986716 in pg_toast_59986613
195
错误: missing chunk number 0 for toast value 59986717 in pg_toast_59986613
196
错误: missing chunk number 0 for toast value 59986718 in pg_toast_59986613
197
错误: missing chunk number 0 for toast value 59986719 in pg_toast_59986613
198
错误: missing chunk number 0 for toast value 59986720 in pg_toast_59986613
199
-
可以看到offset 从100开始到199都是有问题的
--offset 99没问题offset 100有问题,对应的id是100没问题
postgres=# select * from test limit 1 offset 99;
id | c_name
-----+--------
100 | a
(1 row)
postgres=# select * from test limit 1 offset 199;
错误: missing chunk number 0 for toast value 59986720 in pg_toast_59986613
postgres=# select * from test limit 1 offset 200;
id | c_name
-----+--------
201 | b
(1 row)
--删除掉损坏的行
postgres=# delete from test where id >100 and id <201;
DELETE 100
-
访问该表正常
postgres=# select * from test;
id | c_name
-----+--------
1 | a
2 | a
3 | a
4 | a
5 | a
6 | a
7 | a
8 | a
9 | a
、、、、、
postgres=# select count(*) from test;
count
-------
200
(1 row)
postgres=# vacuum full test;
VACUUM
小结
1、除了toast的修复外,还有一些表数据文件可能会损坏的修复,常常会用到一个参数zero_damaged_pages
,可以将该参数设置为on来跳过损坏的块,然后重新备份
2、ignore_system_indexes
是专门用来针对系统表的索引损坏,导致数据库无法正常连接的时候,该参数可以忽略系统索引,可以打开该参数来重建损坏的系统表索引
3、pg_statistic的数据是更新统计信息的时候插入的toast损坏后可以直接清理,然后vacuum analyze会重新生成
4、业务表的toast损坏,如果没有完整的备份,可以删除掉损坏的行来恢复
参考资料:
http://m.2cto.com/database/201802/720718.html
https://www.bbsmax.com/A/qVde1ZBAdP/
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 别再用vector<bool>了!Google高级工程师:这可能是STL最大的设计失误
· 单元测试从入门到精通