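PostgreSQL Lock Wait Detection and Handling
Background
For most databases, tracking down locks is one of a DBA's essential skills. Locks are what keep work orderly under high concurrency. Below we look at how to find blocking sessions in PostgreSQL.
1. Find the blocked session and get its PID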
select distinct pid from pg_locks where not granted;
2. Find the blocker, starting from the blocked session's PID
test=# select * from pg_blocking_pids(53920);
pg_blocking_pids
------------------
{53868}
(1 row)
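Steps 1 and 2 can also be combined into one query. The following is a minimal sketch, assuming PostgreSQL 9.6 or later (where pg_blocking_pids is available); the column aliases are illustrative and not part of the original procedure.

```sql
-- List every session that is currently blocked, together with
-- the PIDs blocking it (sketch; aliases are illustrative).
select a.pid                   as blocked_pid,
       pg_blocking_pids(a.pid) as blocking_pids,
       a.wait_event_type,
       a.wait_event,
       a.query                 as blocked_query
from pg_stat_activity a
where cardinality(pg_blocking_pids(a.pid)) > 0;
```

Because pg_blocking_pids returns an empty array for a session that is not waiting, the cardinality filter keeps only sessions that are actually blocked.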
3. Inspect the blocked PID's current session
test=# select * from pg_stat_activity where pid=53920;
-[ RECORD 1 ]----+------------------------------
datid | 13285
datname | test
pid | 53920
usesysid | 10
usename | system
application_name | ksql
client_addr |
client_hostname |
client_port | -1
backend_start | 2022-04-22 10:20:29.124634+08
xact_start | 2022-04-22 10:20:30.962902+08
query_start | 2022-04-22 10:20:30.962902+08
state_change | 2022-04-22 10:20:30.962905+08
wait_event_type | Lock
wait_event | relation
state | active
backend_xid | 1286297005
backend_xmin | 1286297004
query | drop table a;
backend_type | client backend
The lock that the blocked PID is currently waiting for:
test=# select * from pg_locks where pid=53920 and not granted;
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath
----------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------------+---------+----------
relation | 13285 | 1907887 | | | | | | | | 5/1358301 | 53920 | AccessExclusiveLock | f | f
(1 row)
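The relation column in pg_locks is an OID. As a sketch, assuming you are connected to the database that owns the object (datid 13285, test, in this example), casting it to regclass resolves it to a table name:

```sql
-- Show the lock the blocked session is waiting for, with the relation
-- OID resolved to a name. The cast is only meaningful for relations in
-- the current database, hence the filter on l.database.
select l.pid,
       l.mode,
       l.granted,
       l.relation::regclass as relation_name
from pg_locks l
where l.pid = 53920
  and not l.granted
  and l.locktype = 'relation'
  and l.database = (select oid from pg_database
                    where datname = current_database());
```

If the OID does not correspond to a relation visible in the current database, the cast simply prints the numeric OID.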
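The blocker
1. Check the blocker's current state (note: the current session content may not reveal the statement that caused the blocking)
Current session content of the PIDs blocking this PID: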
test=# select * from pg_stat_activity where pid= any (pg_blocking_pids(53920));
-[ RECORD 1 ]----+------------------------------
datid | 13285
datname | test
pid | 53868
usesysid | 10
usename | system
application_name | psql
client_addr |
client_hostname |
client_port | -1
backend_start | 2019-04-22 10:20:21.377909+08
xact_start | 2019-04-22 10:20:23.832489+08
query_start | 2019-04-22 10:20:25.529063+08
state_change | 2019-04-22 10:20:25.53116+08
wait_event_type | Client
wait_event | ClientRead
state | idle in transaction
backend_xid | 1286297004
backend_xmin |
query | truncate a;
backend_type | client backend
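If the current state does not reveal which SQL statement took the lock, look it up in the audit log (this requires log_statement = 'all'). Pay particular attention to the wait_event_type and state columns. Here they show that the holder has finished its last statement but is sitting idle inside an open transaction, waiting for the client to send the next request. This is common when application framework code forgets to commit, or when the client has hung.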
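To spot sessions in this state before they cause trouble, a query along the following lines may help. It is only a sketch: the aliases xact_age, idle_for and last_query are illustrative names, not columns of pg_stat_activity.

```sql
-- Sessions that are idle inside an open transaction, oldest first.
-- For an idle session, the query column holds the LAST statement it ran,
-- which is not necessarily the one that acquired the lock.
select pid,
       usename,
       application_name,
       state,
       now() - xact_start   as xact_age,
       now() - state_change as idle_for,
       query                as last_query
from pg_stat_activity
where state = 'idle in transaction'
order by xact_start;
```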
2. Find the evidence against the blocker:
Locks currently held by the PIDs blocking this PID:
test=# select * from pg_locks where pid=any (pg_blocking_pids(53920)) order by pid;
locktype | database | relation | page | tuple | virtualxid | transactionid | classid | objid | objsubid | virtualtransaction | pid | mode | granted | fastpath
---------------+----------+----------+------+-------+------------+---------------+---------+-------+----------+--------------------+-------+---------------------+---------+----------
virtualxid | | | | | 4/1372747 | | | | | 4/1372747 | 53868 | ExclusiveLock | t | t
relation | 13285 | 1907887 | | | | | | | | 4/1372747 | 53868 | ShareLock | t | f
relation | 13285 | 1907887 | | | | | | | | 4/1372747 | 53868 | AccessExclusiveLock | t | f
transactionid | | | | | | 1286297004 | | | | 4/1372747 | 53868 | ExclusiveLock | t | f
(4 rows)
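The waiting lock and the held locks can also be lined up side by side. The sketch below assumes PostgreSQL 9.6 or later and only considers relation-level locks; it does not evaluate the lock conflict matrix itself, but relies on pg_blocking_pids to restrict the holders to actual blockers.

```sql
-- Pair each ungranted relation lock with the granted locks that the
-- blocking sessions hold on the same relation.
select w.pid                as waiting_pid,
       w.mode               as waiting_mode,
       h.pid                as holding_pid,
       h.mode               as held_mode,
       w.relation::regclass as relation_name
from pg_locks w
join pg_locks h
  on h.locktype = w.locktype
 and h.database is not distinct from w.database
 and h.relation is not distinct from w.relation
 and h.pid <> w.pid
where w.locktype = 'relation'
  and not w.granted
  and h.granted
  and h.pid = any (pg_blocking_pids(w.pid));
```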
3. Putting it together
The blocked session needs the following lock on object 13285.1907887 (database OID 13285, relation OID 1907887):
relation | 13285 | 1907887 | | | | | | | | 5/1358301 | 53920 | AccessExclusiveLock | f | f
The blocker already holds the following locks on object 13285.1907887:
relation | 13285 | 1907887 | | | | | | | | 4/1372747 | 53868 | ShareLock | t | f
relation | 13285 | 1907887 | | | | | | | | 4/1372747 | 53868 | AccessExclusiveLock | t | f
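The requested AccessExclusiveLock conflicts with every other lock mode, including the ShareLock and AccessExclusiveLock already held, so the blocked session has to wait. Finally, confirm with the application owner whether the lock holder is a live transaction and whether it can be ended cleanly. If not, terminate the session with:
select pg_terminate_backend(53868);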
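When there are several blockers, the termination step can be driven from pg_blocking_pids instead of hard-coding each PID. This is a hedged sketch rather than a recommended default: it assumes the caller has sufficient privileges to signal the target backends, and it only touches blockers that are idle in transaction, leaving sessions that are actively running a query alone.

```sql
-- Terminate only those blockers of PID 53920 that are idle inside an
-- open transaction. Run the select without pg_terminate_backend first
-- to review which sessions would be affected.
select pid,
       pg_terminate_backend(pid) as terminated
from pg_stat_activity
where pid = any (pg_blocking_pids(53920))
  and state = 'idle in transaction';
```

pg_cancel_backend is generally not a substitute here, because it cancels a running query and an idle-in-transaction session has none; ending the session rolls back its open transaction and releases the locks.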