为什么Index Only Scan却还需要访问表

在实际SQL优化工作中,我们经常会发现SQL 执行计划明明是 “Index Only Scan”,但执行计划后面却有 “Heap Fetches: x” ,也就是说实际执行计划还是访问了表记录。这是为什么了?

一、举个例子

1、创建数据

1
2
3
create table t1(id1 integer,id2 integer,name text);
insert into t1 select generate_series(1,100),generate_series(1,100),repeat('a',1000);
create index ind_t1 on t1(id1,id2);

2、查看执行计划

1
2
3
4
5
6
7
8
9
test=# explain analyze select id2 from t1 where id1=1;
                                                   QUERY PLAN                                                  
----------------------------------------------------------------------------------------------------------------
 Index Only Scan using ind_t1 on t1  (cost=0.14..8.16 rows=1 width=4) (actual time=0.024..0.025 rows=1 loops=1)
   Index Cond: (id1 = 1)
   Heap Fetches: 1
 Planning Time: 0.286 ms
 Execution Time: 0.044 ms
(5 rows)

 可以看到,虽然SQL 只访问一条记录,但 heap fetches 值是 1 ,也就是实际需要访问表。

3、原因分析

1
2
3
4
5
test=# select pg_relation_filepath('t1');
 pg_relation_filepath
----------------------
 base/16089/16428
(1 row)

查看该路径下是否有 vm 文件。没有visibility map,postgresql就不知道是否所有的行对当前事务都是可见的,因此需要去访问表获取数据。只有fsm ,没有vm

1
2
3
[kb21@dbhost03 data]$ ls -l base/16089/16428*
-rw------- 1 kb21 kb21 122880 Sep 20 09:54 base/16089/16428
-rw------- 1 kb21 kb21  24576 Sep 20 09:54 base/16089/16428_fsm

4、vacuum 后执行计划

1
2
3
4
5
6
7
8
9
10
11
test=# vacuum analyze t1;
VACUUM
test=# explain analyze select id2 from t1 where id1=1;
                                                   QUERY PLAN                                                  
----------------------------------------------------------------------------------------------------------------
 Index Only Scan using ind_t1 on t1  (cost=0.14..4.16 rows=1 width=4) (actual time=0.011..0.012 rows=1 loops=1)
   Index Cond: (id1 = 1)
   Heap Fetches: 0
 Planning Time: 0.249 ms
 Execution Time: 0.031 ms
(5 rows)

vacuum 后,heap fetches 变为 0

二、进一步分析

1、通过 sys_visibility 扩展进行分析

1
2
3
4
5
6
7
8
9
10
11
12
13
14
test=# create extension  sys_visibility;
CREATE EXTENSION
test=# \dx+ sys_visibility
     Objects in extension "sys_visibility"
              Object description              
-----------------------------------------------
 function pg_check_frozen(regclass)
 function pg_check_visible(regclass)
 function pg_truncate_visibility_map(regclass)
 function pg_visibility_map(regclass)
 function pg_visibility_map(regclass,bigint)
 function pg_visibility_map_summary(regclass)
 function pg_visibility(regclass)
 function pg_visibility(regclass,bigint)

pg_visibility_map 函数的参数:regclass, blkno bigint, all_visible OUT boolean, all_frozen OUT boolean

2、删除记录

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
test=# begin;
BEGIN
test=# delete from t1 where id1=1;
DELETE 1
test=# rollback;
ROLLBACK
test=# select pg_visibility_map('t1'::regclass, 0);
 pg_visibility_map
-------------------
 (f,f)
(1 row)<br>
test=# select pg_visibility_map('t1'::regclass, 1);
pg_visibility_map
-------------------
(t,f)
(1 row)

因为id1=1 是在数据块0,因此,数据块并不是all visible

3、验证执行计划

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
test=# explain analyze select id2 from t1 where id1=1;
                                                   QUERY PLAN                                                  
----------------------------------------------------------------------------------------------------------------
 Index Only Scan using ind_t1 on t1  (cost=0.14..4.16 rows=1 width=4) (actual time=0.016..0.017 rows=1 loops=1)
   Index Cond: (id1 = 1)
   Heap Fetches: 1
 Planning Time: 0.054 ms
 Execution Time: 0.033 ms
(5 rows)
 
test=# explain analyze select distinct id1,id2 from t1;
                                                        QUERY PLAN                                                       
--------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.14..10.14 rows=100 width=8) (actual time=0.014..0.042 rows=100 loops=1)
   ->  Index Only Scan using ind_t1 on t1  (cost=0.14..9.64 rows=100 width=8) (actual time=0.013..0.024 rows=100 loops=1)
         Heap Fetches: 7
 Planning Time: 0.057 ms
 Execution Time: 0.060 ms
(5 rows)
 
test=# vacuum analyze t1;
VACUUM
test=# explain analyze select distinct id1,id2 from t1;
                                                        QUERY PLAN                                                       
--------------------------------------------------------------------------------------------------------------------------
 Unique  (cost=0.14..10.14 rows=100 width=8) (actual time=0.009..0.034 rows=100 loops=1)
   ->  Index Only Scan using ind_t1 on t1  (cost=0.14..9.64 rows=100 width=8) (actual time=0.008..0.017 rows=100 loops=1)
         Heap Fetches: 0
 Planning Time: 0.213 ms
 Execution Time: 0.051 ms
(5 rows)

至于在Vacuum之前 heap fetches 为什么是 7 , 没搞明白。但明确的是vacuum 之后,heap fectches 变为0.

 

posted @   KINGBASE研究院  阅读(258)  评论(0编辑  收藏  举报
编辑推荐:
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
阅读排行:
· 被坑几百块钱后,我竟然真的恢复了删除的微信聊天记录!
· 没有Manus邀请码?试试免邀请码的MGX或者开源的OpenManus吧
· 【自荐】一款简洁、开源的在线白板工具 Drawnix
· 园子的第一款AI主题卫衣上架——"HELLO! HOW CAN I ASSIST YOU TODAY
· Docker 太简单,K8s 太复杂?w7panel 让容器管理更轻松!
点击右上角即可分享
微信分享提示