postgresql/lightdb分区技术及其优化
PostgreSQL 10开始,实现了原生表分区,算是真正意义上进入了支持分区的数据库的圈子。
11实现了大量的功能如支持更新分区键、默认分区、自动创建分区索引、外键支持、唯一索引、分区聚合pushdown(enable_partitionwise_aggregate
)、哈希分区、动态分区剪除,但这个版本的分区算不上真正的成熟(GA不代表mature),12才算是勉强的成熟,可以广泛用于生产。
创建分区
哈希分区的创建:https://blog.csdn.net/ctypyb2002/article/details/104971991
创建分区索引
和mysql一样,在pg实现中,只有分区本地索引,没有全局索引,但是社区已经有提案https://www.percona.com/blog/2019/11/20/proposal-for-global-indexes-in-postgresql/,主要是因为绝大部分情况下,分区数量没有成千上万的时候,全局索引搜索的效率远比搜索多个本地索引效率高,主键定义时必须包含分区键。
CREATE INDEX idx_entrus_no_f ON ses_entrust USING btree (entrust_no, init_date); -- 默认在所有分区上创建索引,如果指定了ONLY子句,则不在分区上创建索引,这样做通常无意义
查看分区定义
select
*
from
pg_partition_tree(
'traffic_violations_p'
);
relid |parentrelid|isleaf|level| ----------------|-----------|------|-----| ses_entrust | |false | 0| ses_entrust_1_f |ses_entrust|true | 1| ses_entrust_2_f |ses_entrust|true | 1| ses_entrust_3_f |ses_entrust|true | 1| ses_entrust_4_f |ses_entrust|true | 1| ses_entrust_5_f |ses_entrust|true | 1| ses_entrust_6_f |ses_entrust|true | 1| ses_entrust_7_f |ses_entrust|true | 1| ses_entrust_8_f |ses_entrust|true | 1| ses_entrust_9_f |ses_entrust|true | 1| ses_entrust_10_f|ses_entrust|true | 1| ses_entrust_11_f|ses_entrust|true | 1| ses_entrust_12_f|ses_entrust|true | 1| ses_entrust_13_f|ses_entrust|true | 1| ses_entrust_14_f|ses_entrust|true | 1| ses_entrust_15_f|ses_entrust|true | 1| ses_entrust_16_f|ses_entrust|true | 1| ses_entrust_17_f|ses_entrust|true | 1| ses_entrust_18_f|ses_entrust|true | 1| ses_entrust_19_f|ses_entrust|true | 1| ses_entrust_20_f|ses_entrust|true | 1| ses_entrust_21_f|ses_entrust|true | 1| ses_entrust_22_f|ses_entrust|true | 1| ses_entrust_23_f|ses_entrust|true | 1| ses_entrust_24_f|ses_entrust|true | 1| ses_entrust_25_f|ses_entrust|true | 1| ses_entrust_26_f|ses_entrust|true | 1| ses_entrust_27_f|ses_entrust|true | 1| ses_entrust_28_f|ses_entrust|true | 1| ses_entrust_29_f|ses_entrust|true | 1| ses_entrust_30_f|ses_entrust|true | 1| ses_entrust_31_f|ses_entrust|true | 1| ses_entrust_32_f|ses_entrust|true | 1|
查看分区剪除(pruning)及按分区关联、聚合
讨论分区,不看分区剪除的效果就是耍流氓。相关测试可以参见https://www.cnblogs.com/abclife/p/11647623.html(测试最多基于11版本,因为12版本届时尚未发布,基于13测试的测试见此)。
分区对OLTP的性能影响
至少在千万级以下(30个字段、或200字节每行为界),分区意义弊大于利(可能比非分区表增加10%左右响应时间以及cost),当分区只有个位数个时,分区意义也不大。
https://www.2ndquadrant.com/en/blog/partition-elimination-postgresql-11/
分区 vs 非分区的性能 https://wenku.baidu.com/view/1a1f00da5cbfc77da26925c52cc58bd63086935e.html
哈希分区数量的建议
推荐为2的倍数。参见:http://www.manongjc.com/article/90533.html。事实上因为分区数量有限,通常成百上千最多了,所以影响有限(几十个的时候性能无影响,见下文)。
组合分区与多列分区
当前,pg暂不支持子分区(组合分区)。
分区与并行
讨论分区,不和并行一起讨论是不完整的。可惜的是,在pg当前的实现中,所有Append下的子计划(subplan)间均不支持并行执行,只支持子计划内的并行执行。
其他
从pg 13开始,逻辑复制支持针对分区表。
自动创建分区
社区有个pg_pathman(lightdb也PR修复了几个缺陷),可自动创建分区,比如按照日期每天1个。
https://wenku.baidu.com/view/99fbc278834d2b160b4e767f5acfa1c7aa0082aa.html
分区数量对性能的影响
https://www.jianshu.com/p/1cba77d18694 比较客观测试,分区数量不多的情况下,性能是差不多的。
CREATE TABLE t33_hash_part ( a character varying(32) not null, b character varying(32) , c character varying(32) , d character varying(32) , e character varying(32) , f character varying(32) , g character varying(32) , h character varying(32) , a1 bigint primary key , a2 numeric(19,2) , a3 numeric(19,2) , a4 numeric(19,2) , a5 numeric(19,2) , a6 numeric(19,2) , a7 numeric(19,2) , a8 numeric(19,2) , a9 numeric(19,2) , a10 numeric(19,2) , a11 numeric(19,2) , a12 numeric(19,2) , a13 numeric(19,2) , a14 numeric(19,2) , a15 numeric(19,2) , a16 numeric(19,2) , x1 character(10) , x2 character(10) not null, x3 character(10) not null, x4 character(10) not null, x5 character(10) not null, x6 character(10) not null, x7 character(10) not null, x8 character(10) not null, x9 character(10) not null ) PARTITION BY HASH (a1); create table t33_hash_part_1 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 0); create table t33_hash_part_2 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 1); create table t33_hash_part_3 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 2); create table t33_hash_part_4 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 3); create table t33_hash_part_5 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 4); create table t33_hash_part_6 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 5); create table t33_hash_part_7 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 6); create table t33_hash_part_8 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 7); create table t33_hash_part_9 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 8); create table t33_hash_part_10 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 9); create table t33_hash_part_11 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 10); create table t33_hash_part_12 partition of t33_hash_part FOR VALUES WITH (MODULUS 12, REMAINDER 11); \timing on zjh@postgres=# insert into t33_hash_part select 'aaaa' || i,'bbbb' || i,'cccc' || i,'dddd' || i,'eeee' || i,'ffff' || i,'gggg' || i,'hhhh' || i, i*2,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,'x1','x2','x3','x4','x5','x6','x7','x8','x9' from generate_series(1,50000000) i; INSERT 0 50000000 Time: 239574.209 ms (03:59.574) zjh@postgres=# \COPY (select * from t33_hash_part) TO '/mnt/pmem1/zjh/t33_hash_nopart.txt' with csv ; COPY 50000000 Time: 131624.683 ms (02:11.625) zjh@postgres=# truncate table t33_hash_part; TRUNCATE TABLE Time: 4758.602 ms (00:04.759) zjh@postgres=# \COPY t33_hash_part FROM '/mnt/pmem1/zjh/t33_hash_nopart.txt' with csv ; COPY 50000000 Time: 284130.323 ms (04:44.130) --------------非分区 CREATE TABLE t33_hash_nopart ( a character varying(32) not null, b character varying(32) , c character varying(32) , d character varying(32) , e character varying(32) , f character varying(32) , g character varying(32) , h character varying(32) , a1 bigint primary key , a2 numeric(19,2) , a3 numeric(19,2) , a4 numeric(19,2) , a5 numeric(19,2) , a6 numeric(19,2) , a7 numeric(19,2) , a8 numeric(19,2) , a9 numeric(19,2) , a10 numeric(19,2) , a11 numeric(19,2) , a12 numeric(19,2) , a13 numeric(19,2) , a14 numeric(19,2) , a15 numeric(19,2) , a16 numeric(19,2) , x1 character(10) , x2 character(10) not null, x3 character(10) not null, x4 character(10) not null, x5 character(10) not null, x6 character(10) not null, x7 character(10) not null, x8 character(10) not null, x9 character(10) not null ); \COPY t33_hash_nopart FROM '/mnt/pmem1/zjh/t33_hash_nopart.txt' with csv ; COPY 50000000 Time: 298424.421 ms (04:58.424) zjh@postgres=# insert into t33_hash_nopart select 'aaaa' || i,'bbbb' || i,'cccc' || i,'dddd' || i,'eeee' || i,'ffff' || i,'gggg' || i,'hhhh' || i, zjh@postgres-# i*2,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,i*2.11,'x1','x2','x3','x4','x5','x6','x7','x8','x9' from generate_series(1,50000000) i; INSERT 0 50000000 Time: 228608.017 ms (03:48.608) zjh@postgres=# \COPY (select * from t33_hash_nopart) TO '/mnt/pmem1/zjh/t33_hash_nopart.txt' with csv ; COPY 50000000 Time: 121091.492 ms (02:01.091)
分区统计信息
分区的统计信息分为子分区的统计信息和根分区的统计信息,GPORCA依赖于根分区的统计信息,pg优化器依赖于子分区的统计信息。GP参数optimizer_analyze_root_partition控制默认是否收集根分区统计信息。
查看分区表的大小
CREATE OR REPLACE FUNCTION pg_partition_table_size(text) RETURNS TEXT AS $$ SELECT pg_size_pretty(sum(pg_relation_size(inhrelid))::bigint) FROM pg_inherits WHERE inhparent=$1::regclass; $$ language sql;
zjh@postgres=# select pg_partition_table_size('t33_hash_part'); pg_partition_table_size ------------------------- 177 MB (1 row)
https://www.enterprisedb.com/blog/partition-wise-joins-divide-and-conquer-joins-between-partitioned-table