【PostgreSQL 15】PostgreSQL 15对distinct的优化
2022-07-17 11:34 abce 阅读(1153) 评论(0) 编辑 收藏 举报示例表
1 2 3 4 5 6 7 8 9 10 11 12 13 | table t_ex; c1 | c2 ----+---- 2 | B 4 | C 6 | A 2 | C 4 | B 6 | B 2 | A 4 | B 6 | C 2 | C |
以下SQL语句有序地返回"c1"列中唯一值:
1 | select distinct on (c1) * from abce; |
对于c2列,会根据c1的唯一性,从表中找到的第一个值。
1 2 3 4 5 6 7 | postgres=# select distinct on (c1) * from abce; c1 | c2 ----+---- 2 | B 4 | B 6 | B (3 rows ) |
以下SQL语句有序地返回"c2"列中唯一值:
1 2 3 4 5 6 7 | # select distinct on (c2) * from abce; c1 | c2 ----+---- 6 | A 2 | B 4 | C (3 rows ) |
最后从表中返回唯一性的记录
1 2 3 4 5 6 7 8 9 10 11 12 | postgres=# select distinct * from abce; c1 | c2 ----+---- 6 | C 4 | C 4 | B 2 | C 2 | A 6 | B 6 | A 2 | B (8 rows ) |
那么你可能会问,在postgresql15中,distinct的增强体现在哪些方面呢?答案是:并发
在此之前,只有一个cpu或进程来计算不同的值。在postgresql15中,可以使用并发,使用多个cpu进程。
这一特性涉及好几个参数,但是,我们只聚焦在参数max_parallel_workers_per_gather。
为了演示这个改进,我们创建三个表,没有索引,填充大约5000000条记录。注意,表的列数分别为1,5,10。
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 | Table "public.t1" Column | Type | Collation | Nullable | Default --------+---------+-----------+----------+--------- c1 | integer | | | Table "public.t5" Column | Type | Collation | Nullable | Default --------+-----------------------+-----------+----------+--------- c1 | integer | | | c2 | integer | | | c3 | integer | | | c4 | integer | | | c5 | character varying (40) | | | Table "public.t10" Column | Type | Collation | Nullable | Default --------+-----------------------+-----------+----------+--------- c1 | integer | | | c2 | integer | | | c3 | integer | | | c4 | integer | | | c5 | character varying (40) | | | c6 | integer | | | c7 | integer | | | c8 | integer | | | c9 | integer | | | c10 | integer | | | |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 | insert into t1 select generate_series(1,500); insert into t5 select generate_series(1,500) ,generate_series(500,1000) ,generate_series(1000,1500) ,(random()*100):: int , 'aofjaofjwaoeev$#^ÐE#@#Fasrhk!!@%Q@' ; insert into t10 select generate_series(1,500) ,generate_series(500,1000) ,generate_series(1000,1500) ,(random()*100):: int , 'aofjaofjwaoeev$#^ÐE#@#Fasrhk!!@%Q@' ,generate_series(1500,2000) ,generate_series(2500,3000) ,generate_series(3000,3500) ,generate_series(3500,4000) ,generate_series(4000,4500); List of relations Schema | Name | Type | Owner | Persistence | Access method | Size | --------+------+-------+----------+-------------+---------------+--------+ public | t1 | table | postgres | permanent | heap | 173 MB | public | t10 | table | postgres | permanent | heap | 522 MB | public | t5 | table | postgres | permanent | heap | 404 MB | |
下一步是将生成的数据dump到以下的版本中:
1 2 3 4 5 6 7 8 | PG VERSION pg96 pg10 pg11 pg12 pg13 pg14 pg15 |
数据导入后,使用下面的脚本生成结果:
1 2 3 4 5 6 7 8 9 10 11 12 | #!/bin/bash for v in 96 10 11 12 13 14 15 do # run the explain analzye 5X in order to derive consistent numbers for u in $(seq 1 5) do echo "--- explain analyze: pg${v}, ${u}X ---" psql -p 100$v db01 -c "explain analyze select distinct on (c1) * from t1" > t1.pg$v.explain.txt psql -p 100$v db01 -c "explain analyze select distinct * from t5" > t5.pg$v.explain.txt psql -p 100$v db01 -c "explain analyze select distinct * from t10" > t10.pg$v.explain.txt done done |
以下是结果比较,可以看到表越大,性能收获越大。
PG VERSION |
1 column (t1), ms |
5 column (t5), ms |
10 column (t10), ms |
pg96 |
3,382 |
9,743 |
20,026 |
pg10 |
2,004 |
5,746 |
13,241 |
pg11 |
1,932 |
6,062 |
14,295 |
pg12 |
1,876 |
5,832 |
13,214 |
pg13 |
1,973 |
2,358 |
3,135 |
pg14 |
1,948 |
2,316 |
2,909 |
pg15 |
1,439 |
1,025 |
1,245 |

来看看不同版本之间的执行计划:
1 2 3 4 5 6 7 8 9 | PG96 QUERY PLAN, TABLE T1 ------------------------------------------------------------------------------- Unique (cost=765185.42..790185.42 rows =500 width=4) (actual time =2456.805..3381.230 rows =500 loops=1) -> Sort (cost=765185.42..777685.42 rows =5000000 width=4) (actual time =2456.804..3163.600 rows =5000000 loops=1) Sort Key : c1 Sort Method: external merge Disk: 68432kB -> Seq Scan on t1 (cost=0.00..72124.00 rows =5000000 width=4) (actual time =0.055..291.523 rows =5000000 loops=1) Planning time : 0.161 ms Execution time : 3381.662 ms |
1 2 3 4 5 6 7 8 9 10 11 12 13 | PG15 QUERY PLAN, TABLE T1 --------------------------------------------------------------------------- Unique (cost=557992.61..582992.61 rows =500 width=4) (actual time =946.556..1411.421 rows =500 loops=1) -> Sort (cost=557992.61..570492.61 rows =5000000 width=4) (actual time =946.554..1223.289 rows =5000000 loops=1) Sort Key : c1 Sort Method: external merge Disk: 58720kB -> Seq Scan on t1 (cost=0.00..72124.00 rows =5000000 width=4) (actual time =0.038..259.329 rows =5000000 loops=1) Planning Time : 0.229 ms JIT: Functions: 1 Options: Inlining true , Optimization true , Expressions true , Deforming true Timing: Generation 0.150 ms, Inlining 31.332 ms, Optimization 6.746 ms, Emission 6.847 ms, Total 45.074 ms Execution Time : 1438.683 ms |
当DISTINCT列的数量增加时,真正的差异出现了,如查询表 t10 所示。 可以看到并行化在起作用!
1 2 3 4 5 6 7 8 9 | PG96 QUERY PLAN, TABLE T10 ------------------------------------------------------------------------------------------- Unique (cost=1119650.30..1257425.30 rows =501000 width=73) (actual time =14257.801..20024.271 rows =50601 loops=1) -> Sort (cost=1119650.30..1132175.30 rows =5010000 width=73) (actual time =14257.800..19118.145 rows =5010000 loops=1) Sort Key : c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 Sort Method: external merge Disk: 421232kB -> Seq Scan on t10 (cost=0.00..116900.00 rows =5010000 width=73) (actual time =0.073..419.701 rows =5010000 loops=1) Planning time : 0.352 ms Execution time : 20025.956 ms |
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 | PG15 QUERY PLAN, TABLE T10 ------------------------------------------------------------------------------------------- HashAggregate (cost=699692.77..730144.18 rows=501000 width=73) (actual time=1212.779..1232.667 rows=50601 loops=1) Group Key : c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 Planned Partitions: 16 Batches: 17 Memory Usage: 8373kB Disk Usage: 2976kB -> Gather (cost=394624.22..552837.15 rows =1002000 width=73) (actual time =1071.280..1141.814 rows =151803 loops=1) Workers Planned: 2 Workers Launched: 2 -> HashAggregate (cost=393624.22..451637.15 rows =501000 width=73) (actual time =1064.261..1122.628 rows =50601 loops=3) Group Key : c1, c2, c3, c4, c5, c6, c7, c8, c9, c10 Planned Partitions: 16 Batches: 17 Memory Usage: 8373kB Disk Usage: 15176kB Worker 0: Batches: 17 Memory Usage: 8373kB Disk Usage: 18464kB Worker 1: Batches: 17 Memory Usage: 8373kB Disk Usage: 19464kB -> Parallel Seq Scan on t10 (cost=0.00..87675.00 rows =2087500 width=73) (actual time =0.072..159.083 rows =1670000 loops=3) Planning Time : 0.286 ms JIT: Functions: 31 Options: Inlining true , Optimization true , Expressions true , Deforming true Timing: Generation 3.510 ms, Inlining 123.698 ms, Optimization 200.805 ms, Emission 149.608 ms, Total 477.621 ms Execution Time : 1244.556 ms |
提高性能:
postgres运行时参数max_parallel_workers_per_gather来提高性能。新初始化的集群中的默认值为2。
如下表所示,由于测试硬件本身的能力有限,它很快成为收益递减的原因。
在postgresql 15中:
max_parallel_workers_per_gather |
1 column (t1) |
5 column (t5) |
10 column (t10) |
2 |
1,439 |
1,025 |
1,245 |
3 |
1,464 |
875 |
1,013 |
4 |
1,391 |
858 |
977 |
6 |
1,401 |
846 |
1,045 |
8 |
1,428 |
856 |
993 |
关于索引:如本查询计划中所示,应用索引时未实现性能改进。
PG15,表T10,max_parallel_workers_per_gather=4:
1 2 3 4 5 6 7 8 9 10 11 | QUERY PLAN ----------------------------------------------------------------------------------- Unique (cost=0.43..251344.40 rows =501000 width=73) (actual time =0.060..1240.729 rows =50601 loops=1) -> Index Only Scan using t10_c1_c2_c3_c4_c5_c6_c7_c8_c9_c10_idx on t10 (cost=0.43..126094.40 rows =5010000 width=73) (actual time =0.058..710.780 rows =5010000 loops=1) Heap Fetches: 582675 Planning Time : 0.596 ms JIT: Functions: 1 Options: Inlining false , Optimization false , Expressions true , Deforming true Timing: Generation 0.262 ms, Inlining 0.000 ms, Optimization 0.122 ms, Emission 2.295 ms, Total 2.679 ms Execution Time : <strong>1249.391 ms</strong> |
跨多个CPU运行DISTINCT是性能能力的一大进步。
但是请记住,当增加max_parallel_workers_per_gather的数量并接近硬件的限制时,性能下降的风险。
在正常情况下,查询计划器可能会决定使用索引而不是运行并行工作程序。
解决此问题的一种方法是考虑禁用运行时参数,例如enable_indexonlyscan和enable_indexscan。
最后,不要忘记运行EXPLAIN ANALYZE以了解发生了什么。
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· DeepSeek 开源周回顾「GitHub 热点速览」
· 物流快递公司核心技术能力-地址解析分单基础技术分享
· .NET 10首个预览版发布:重大改进与新特性概览!
· AI与.NET技术实操系列(二):开始使用ML.NET
· .NET10 - 预览版1新功能体验(一)
2015-07-17 Putty设置删除
2015-07-17 ssh/scp 远程连接ssh非默认端口方法
2015-07-17 查看LINUX版本