KingbaseESV8R6中一个报错:tuple concurrently updated
前言
前几天同事在生产环境遇到的一个报错:tuple concurrently updated
从字面意思看是行的并发争用。这个问题原因是数据库更新系统表时产生了争用,所以不同session不能并发更新系统表。
测试
窗口1,不提交事务
begin;
truncate t1;
窗口2,因为无法更新系统表信息,授权语句无法执行成功,卡主
grant select on all tables in schema public to u2 ;
这时候ctrl+c终止后收到一条信息,这个关键点可以帮助我们分析问题
TEST=# grant select on all tables in schema public to u2 ;
^CCancel request sent
错误: 由于用户请求而正在取消查询
CONTEXT: 当更新关系"sys_class"的元组(28, 37)时
查看系统表的行偏移(28,37),发现表t1上有争用
TEST=# select * from sys_class where ctid = '(28,37)';
oid | relname | relnamespace | reltype | reloftype | relowner | relam | relfilenode | reltablespace | relpages | reltuples | relallvisible | reltoastrelid | relhasindex | relisshared | relpersistence |
relkind | relnatts | relchecks | relhasrules | relhastriggers | relhassubclass | relrowsecurity | relforcerowsecurity | relispopulated | relreplident | relispartition | relrewrite | relfrozenxid | relminmxi
d | relacl | reloptions | relpartbound
--------+---------+--------------+---------+-----------+----------+-------+-------------+---------------+----------+-----------+---------------+---------------+-------------+-------------+----------------+-
--------+----------+-----------+-------------+----------------+----------------+----------------+---------------------+----------------+--------------+----------------+------------+--------------+----------
--+-------------------------------------+------------+--------------
110336 | t1 | 2200 | 110338 | 0 | 10 | 2 | 110492 | 0 | 0 | 0 | 0 | 0 | f | f | p |
r | 1 | 0 | f | f | f | f | f | t | d | f | 0 | 1914706 | 56
7 | {system=arwdDxt/system,u2=r/system} | |
(1 row)
查看表sys_stat_activity,发现有会话正在 tuncate t1的操作,说明truncate语句和grant语句产生更新系统表的冲突
TEST=# select * from sys_stat_activity where state<>'idle' and pg_backend_pid() != pid and (backend_xid is not null or backend_xmin is not null ) ;
datid | datname | pid | usesysid | usename | application_name | client_addr | client_hostname | client_port | backend_start | xact_start | query_start
| state_change | wait_event_type | wait_event | state | backend_xid | backend_xmin | query | backend_type
-------+---------+-------+----------+---------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+------------------------------
-+-------------------------------+-----------------+------------+---------------------+-------------+--------------+-------------------------------+----------------
91362 | test | 23875 | 10 | system | | | | | 2023-02-09 16:59:30.233225+08 | 2023-02-17 15:51:13.788882+08 | 2023-02-17 15:51:13.788882+08
| 2023-02-17 15:51:13.788896+08 | Lock | relation | active | 1914711 | 1914708 | select perf.snapshot_timer(); | kwr collector
91362 | test | 535 | 110096 | u2 | kingbase_*&+_ | | | -1 | 2023-02-17 15:33:14.456389+08 | 2023-02-17 15:49:58.210625+08 | 2023-02-17 15:50:00.666372+08
| 2023-02-17 15:50:00.666792+08 | Client | ClientRead | idle in transaction | 1914708 | | truncate t1; | client backend
(2 rows)
重复上述操作,直到授权语句卡主不取消
然后窗口1,commit提交事务后,
窗口2 会发生错误tuple concurrently updated,
test=# grant select on all tables in schema public to u2 ;
ERROR: tuple concurrently updated
窗口1提交后,正常情况窗口2的授权语句可以顺利执行成功,可是从数据库日志中看到,commit和grant时间几乎是同时完成的,所以仍然会有此报错。
此案例提供了分析思路和方法,建议类似grant和truncate需要更新sys_class信息的语句按事务顺序执行,不要并发完成。
KINGBASE研究院