KingbaseES V8R6运维案例之---pg_statistic toast表故障修复


案例说明:
数据库在日常的维护过程中,执行表结构查询语句(\d t1),如下图所示,出现“missing chunk number 0 for toast value 16259 in pg_toast_2619”,从报错信息看和toast表有关。本案例复现了以上故障,并提供了解决方案。

适用版本:
KingbaseES V8R6

一、案例复现
TOAST存储的表不能单独创建,只有当普通表包含了main,extended或external存储格式的字段时,系统会自动创建一个和普通表关联的TOAST表。当一行记录(tuple)存储的(包括压缩后的大小)大小超过TOAST_TUPLE_THRESHOLD(默认2K)时,会存储到TOAST表。

1、查询toast表对应的源表

Tips:

toast表pg_toast_2619的‘2619’,对应着源表在pg_class中的OID,可以通过此OID查询表名称。

test=# select oid,reltoastrelid ,relname from pg_class where oid=2619;


 oid  | reltoastrelid |   relname
------+---------------+--------------
 2619 |          2840 | pg_statistic
(1 row)

Tips:
pg_statistic表存储有关该数据库内容的统计数据。 记录是由ANALYZE创建的,并且随后被查询规划器使用。

2、查看toast表名称

test=# select relname from pg_class where oid=2840;
    relname
---------------
 pg_toast_2619
(1 row)

二、复现toast表故障

1、查看toast表文件存储路径

prod=# select pg_relation_filepath('pg_toast.pg_toast_2619');
 pg_relation_filepath
----------------------
 base/16385/2840
(1 row)

2、模拟toast文件数据破坏

# 查看toast表文件信息
[kingbase@node102 16385]$ ls -lh /data/kingbase/v8r6_c6/data/base/16385/2840
-rw------- 1 kingbase kingbase 32K Oct 27 14:18 /data/kingbase/v8r6_c6/data/base/16385/2840

# 模拟数据文件故障
[kingbase@node102 v8r6_c6]$ dd if=/dev/zero of=/data/kingbase/v8r6_c6/data/base/16385/2840 bs=512 count=1
1+0 records in
1+0 records out
512 bytes (512 B) copied, 0.000186372 s, 2.7 MB/s

# 查看数据文件信息
[kingbase@node102 v8r6_c6]$ ls -lh /data/kingbase/v8r6_c6/data/base/16385/2840
-rw------- 1 kingbase kingbase 512 Nov  2 14:19 /data/kingbase/v8r6_c6/data/base/16385/2840

3、执行数据库访问(出现toast表访问错误)

prod=# \d+ test_t1;
ERROR:  could not read block 0 in file "base/16385/2840": read only 512 of 8192 bytes

常见 pg_statistic toast 访问问题

prod=# select count(*) from pg_statistic;
 count
-------
   508
(1 row)

prod=# select * from pg_statistic;
ERROR:  missing chunk number 0 for toast value 16259 in pg_toast_2619

prod=# \d pg_toast.pg_toast_2619
ERROR:  could not read block 0 in file "base/16385/2840": read only 512 of 8192 bytes

三、toast表故障解决步骤

# 清理pg_statistic中的数据
prod=#  delete from pg_statistic;
ERROR:  permission denied: "pg_statistic" is a system catalog

# 默认对系统表用户没有权限执行系统表修改操作
prod=# show allow_system_table_mods;
 allow_system_table_mods
-------------------------
 off
(1 row)

# 配置系统参数允许管理员修改系统表
prod=# alter system set allow_system_table_mods=on;
ALTER SYSTEM

# 重启数据库服务
[kingbase@node102 bin]$ ./sys_ctl restart -D /data/kingbase/v8r6_c6/data/
waiting for server to shut down.... done
......

# 连接数据库清理pg_statistic表数据
[kingbase@node102 bin]$ ./ksql -U system test -p 54325
ksql (V8.0)
Type "help" for help.

test=# show allow_system_table_mods;
 allow_system_table_mods
-------------------------
 on
(1 row)

# 仍然没有对系统表的DML权限
prod=# delete from pg_statistic;
ERROR:  permission denied: "pg_statistic" is a system catalog

# 执行truncate清理系统表
prod=# truncate table pg_statistic;
TRUNCATE TABLE

# pg_statistic表清理后查询(可以正常查询)
prod=# select * from pg_statistic;
 starelid | staattnum | stainherit | stanullfrac | stawidth | stadistinct | stakind1 | stakind2 | stakind3 | stakind4 | stakind5 | staop1 | staop2 | staop3 | staop4 | staop5 | stacoll1 | stacoll2 | stacoll3 | stacoll4 | stacoll5 | stanumbers1 | stanumbers2 | stanumbers3 | stanumbers4 | stanumbers5 | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5
----------+-----------+------------+-------------+----------+-------------+----------+----------+----------+----------+----------+--------+--------+--------+--------+--------+----------+----------+----------+----------+----------+-------------+-------------+-------------+-------------+-------------+------------+------------+------------+------------+------------
(0 rows)


prod=# select count(*) from pg_toast.pg_toast_2619;
 count
-------
     0
(1 row)

# 查询表结构
prod=# \d test_t1;
              Table "public.test_t1"
 Column |  Type   | Collation | Nullable | Default
--------+---------+-----------+----------+---------
 name   | varchar |           |          |

# 执行表分析
prod=# vacuum analyze t1;
VACUUM

# 执行表分析后pg_statistic有数据插入
prod=# select * from pg_statistic;
 starelid | staattnum | stainherit | stanullfrac | stawidth | stadistinct | stakind1 | stakind2 | stakind3 | stakind4 | stakind5 | staop1 | staop2 | staop3 | staop4 | staop5 | stacoll1 | stacoll2 | stacoll3 | stacoll4 | stacoll5 | stanumbers1 | stanumbers2 | stanumbers3 | stanumbers4 | stanumbers5 | stavalues1 | stavalues2 | stavalues3 | stavalues4 | stavalues5
----------+-----------+------------+-------------+----------+-------------+----------+----------+----------+----------+----------+--------+--------+--------+--------+--------+----------+----------+----------+----------+----------+-------------+-------------+-------------+-------------+-------------+------------+------------+------------+------------+------------
    16386 |         1 | f          |           0 |        4 |          -1 |        0 |        0 |        0 |        0 |0 |      0 |      0 |      0 |      0 |      0 |        0 |        0 |        0 |        0 |        0 |             |   |             |             |             |            |            |            |            |
(1 row)

# 如上所示,pg_statistic及toast表都可以正常访问。

四、总结
以上toast表故障发生在pg_statistic表,对于pg_statistic表数据可以通过truncate方式清理,但是如果是业务普通表,将不能直接通过truncate方式清理(除非有备份),对于业务普通表的toast故障,将在后面的案例中分析。

posted @ 2023-02-03 11:24  KINGBASE研究院  阅读(336)  评论(0编辑  收藏  举报