Postgresql 全局索引与分区索引对于SQL性能影响的比较及DDL操作后分区全局索引是否会失效

Postgresql 提供了对于分区表 global index 的支持。global index 不仅提供了对于唯一索引功能的改进（无需包含分区键），而且在性能上相比非global index （local index）有很大的提升（无法提供分区条件情况下）。以下举例说明二者在性能方面的差异。

1、准备数据

create table t1(id1 integer,id2 integer,name text) partition by hash(id1) partitions 200;
 
insert into t1 select generate_series(1,10000000),generate_series(1,10000000),repeat('a',500);

2、本地索引的性能

没有提供分区条件时

create index ind_t1_id2 on t1(id2);
 
test=# \di+ ind_t1_id2
                                List of relations
 Schema |    Name    |       Type        | Owner  | Table |  Size   | Description
--------+------------+-------------------+--------+-------+---------+-------------
 public | ind_t1_id2 | partitioned index | system | t1    | 0 bytes |
(1 row)
 
test=# explain analyze select * from t1 where id2=10004;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------
 Append  (cost=0.29..1662.50 rows=200 width=512) (actual time=1.324..3.249 rows=1 loops=1)
   ->  Index Scan using t1_p0_id2_idx on t1_p0  (cost=0.29..8.31 rows=1 width=512) (actual time=0.054..0.055 rows=0 loops=1)
         Index Cond: (id2 = 10004)
   ->  Index Scan using t1_p1_id2_idx on t1_p1  (cost=0.29..8.31 rows=1 width=512) (actual time=0.065..0.065 rows=0 loops=1)
         Index Cond: (id2 = 10004)
   ......
   ->  Index Scan using t1_p198_id2_idx on t1_p198  (cost=0.29..8.31 rows=1 width=512) (actual time=0.031..0.031 rows=0 loops=1)
         Index Cond: (id2 = 10004)
   ->  Index Scan using t1_p199_id2_idx on t1_p199  (cost=0.29..8.31 rows=1 width=512) (actual time=0.025..0.025 rows=0 loops=1)
         Index Cond: (id2 = 10004)
 Planning Time: 39.262 ms
 Execution Time: 5.673 ms
(403 rows)

使用非全局索引，并且没有提供分区条件情况下，优化器需要读取所有索引分区及表分区的统计数据，才能确定最优的执行计划。对于数据访问，同样需要访问所有分区的索引（即使该分区没有所需要的数据）。

提供分区条件时

test=# explain analyze select * from t1 where id2=10004 and id1=10004;
                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------
 Index Scan using t1_p71_id2_idx on t1_p71  (cost=0.29..8.31 rows=1 width=512) (actual time=0.045..0.046 rows=1 loops=1)
   Index Cond: (id2 = 10004)
   Filter: (id1 = 10004)
 Planning Time: 0.346 ms
 Execution Time: 0.064 ms
(5 rows)

在提供分区条件情况下，只需要访问单个索引分区及表分区的统计数据，因此，所需的语句的解析时间更少。

3、全局索引的性能

create unique index ind_t1_id2 on t1(id2) global;
 
test=# \di+ ind_t1_id2
                             List of relations
 Schema |    Name    |     Type     | Owner  | Table |  Size  | Description
--------+------------+--------------+--------+-------+--------+-------------
 public | ind_t1_id2 | global index | system | t1    | 215 MB |
(1 row)
 
 
test=# explain analyze select * from t1 where id2=10004;
                                                        QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------
 Global Index Scan using ind_t1_id2 on t1  (cost=0.38..8.39 rows=200 width=512) (actual time=0.136..0.137 rows=1 loops=1)
   Index Cond: (id2 = 10004)
 Planning Time: 9.896 ms
 Execution Time: 0.264 ms
(4 rows)

可以SQL 解析与执行时间都比本地索引的情景快很多。

4、DDL操作不会影响全局索引

Oracle 在对分区做DDL操作时，会使分区全局索引失效，需要加上关键字update global indexes。

（1）、创建测试数据

create table t1_part(id integer,name text,status char(1))
partition by list(status)
(
  partition p_0 values ('0'),
  partition p_1 values ('1'),
  partition p_default values (default)
);
 
 
insert into t1_part select generate_series(1,100000),repeat('a',500),'0';
insert into t1_part select generate_series(100001,200000),repeat('a',500),'1';
insert into t1_part select generate_series(200001,300000),repeat('a',500),'2';
create unique index ind_t1_part_2 on t1_part(id) global;
analyze t1_part;
 
set enable_globalindexscan = on;

（2）略

truncate partition 不会导致全局索引失效。

Postgresql 对于delete操作，只是在 tuple 上做了个标记，而索引不会进行操作。在通过索引访问tuple时，如果发现数据已经被 Deleted ，也不会报错。因此，对于truncate ，实际就相当于记录被delete。可以看到，在truncate 之后，索引的占用空间没有发生变动，而在 vacuum full ，索引尺寸小了很多。

posted @ 2022-06-06 11:18 数据库集中营阅读(2178) 评论(0) 编辑收藏举报

刷新页面返回顶部

数据库集中营