分享一次公司晋级考试的SQL题目,非常有趣的案例(postgresql 标量子查询 where like 谓词过滤条件)
同事今天晋级高级工程师考试,发来一道公司出题目让我帮忙进行优化,其中场景二的案例非常有意思😉。
题目内容如下:
原始SQL:
scott=> explain analyze scott-> select scott-> a.id, scott-> a.col2, scott-> (select sum(b.id) from table02 b where a.col2 like b.col2||'%' ) scott-> from table01 a; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------ Seq Scan on table01 a (cost=0.00..3905341.00 rows=100000 width=45) (actual time=0.579..50568.090 rows=100000 loops=1) SubPlan 1 -> Aggregate (cost=39.02..39.03 rows=1 width=8) (actual time=0.504..0.504 rows=1 loops=100000) -> Seq Scan on table02 b (cost=0.00..39.00 rows=10 width=4) (actual time=0.063..0.499 rows=8 loops=100000) Filter: (a.col2 ~~ (col2 || '%'::text)) Rows Removed by Filter: 1992 Planning Time: 0.097 ms Execution Time: 50590.882 ms (8 行记录) 时间:50591.756 ms (00:50.592)
table01、table02 这两张表没有创建任何索引,全表扫描+标量子查询SQL执行需要50s才能出结果,速度非常慢。
考题要求要优化这条SQL,意思既是无论是调整 postgresql数据库的参数,对SQL加索引,等价改写SQL,这些手段都没问题,只要能让执行速度变快就行。
由于当时我在忙其他的事情,大致看了一眼后给出了以下的改写方案(我没加索引,感觉加索引的用处不大):
改写1:
scott=> explain analyze select scott-> a.id, scott-> a.col2, scott-> b.sum_b_id scott-> from table01 a scott-> left join (select sum(b.id) sum_b_id,b.col2 from table02 b group by b.col2) b scott-> ON a.col2 like b.col2||'%' scott-> ; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------- Nested Loop Left Join (cost=39.00..448135.74 rows=127500 width=45) (actual time=1.283..8674.517 rows=100000 loops=1) Join Filter: (a.col2 ~~ (b.col2 || '%'::text)) Rows Removed by Join Filter: 25400387 -> Seq Scan on table01 a (cost=0.00..1841.00 rows=100000 width=37) (actual time=0.018..19.620 rows=100000 loops=1) -> Materialize (cost=39.00..45.37 rows=255 width=11) (actual time=0.000..0.024 rows=255 loops=100000) -> HashAggregate (cost=39.00..41.55 rows=255 width=11) (actual time=1.241..1.316 rows=255 loops=1) Group Key: b.col2 -> Seq Scan on table02 b (cost=0.00..29.00 rows=2000 width=7) (actual time=0.007..0.342 rows=2000 loops=1) Planning Time: 0.181 ms Execution Time: 8682.974 ms (10 行记录) 时间:8684.338 ms (00:08.684)
可以看到等价改写以后,SQL从原来执行 50s 的时间已经降低到 8.8s 左右,提升还是挺大的。
把答案给了同事,我也去忙其他的事情了。
晚上我闲下来没事做,贼无聊,仔细看了下改写1 SQL的执行计划,感觉这种计划可能不是最优的执行计划。
因为我始终感觉走HASH可能才是最佳的执行计划,如果这条SQL在ORACLE 上执行,CBO很大可能会让计划走HASH,但是在PG就是走NL(脑残优化器)😅。
吃完饭后一直在尝试改写,搞了哥很长时间,最终还是把HASH版本的SQL给改出来了,泪目🤣。
改写2:
explain analyze with a as (select t1.id, t1.col2, substr(t1.col2, 1, x.rn) rn1 from table01 t1, (select min(length(col2)) rn from table02) x) , b as (select t2.col2, t2.sum_b_id, substr(t2.col2, 1, x.rn) rn2 from (SELECT col2, SUM(id) as sum_b_id FROM table02 GROUP BY col2) t2, (select min(length(col2)) rn from table02) x) SELECT a.id, a.col2, b.sum_b_id FROM a LEFT JOIN b ON a.rn1 = b.rn2 and a.col2 LIKE b.col2 || '%'; QUERY PLAN ---------------------------------------------------------------------------------------------------------------------------------- -------- Hash Left Join (cost=127.86..14881.38 rows=100000 width=45) (actual time=2.578..219.098 rows=100000 loops=1) Hash Cond: (substr(t1.col2, 1, (min(length(table02.col2)))) = substr(table02_2.col2, 1, (min(length(table02_1.col2))))) Join Filter: (t1.col2 ~~ (table02_2.col2 || '%'::text)) -> Nested Loop (cost=39.00..2880.02 rows=100000 width=41) (actual time=0.692..36.984 rows=100000 loops=1) -> Aggregate (cost=39.00..39.01 rows=1 width=4) (actual time=0.682..0.685 rows=1 loops=1) -> Seq Scan on table02 (cost=0.00..29.00 rows=2000 width=3) (actual time=0.020..0.297 rows=2000 loops=1) -> Seq Scan on table01 t1 (cost=0.00..1841.00 rows=100000 width=37) (actual time=0.006..15.019 rows=100000 loops=1) -> Hash (cost=85.67..85.67 rows=255 width=15) (actual time=1.871..1.876 rows=255 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 20kB -> Nested Loop (cost=78.00..85.67 rows=255 width=15) (actual time=1.641..1.749 rows=255 loops=1) -> Aggregate (cost=39.00..39.01 rows=1 width=4) (actual time=0.630..0.631 rows=1 loops=1) -> Seq Scan on table02 table02_1 (cost=0.00..29.00 rows=2000 width=3) (actual time=0.005..0.266 rows=2000 l oops=1) -> HashAggregate (cost=39.00..41.55 rows=255 width=11) (actual time=1.010..1.069 rows=255 loops=1) Group Key: table02_2.col2 -> Seq Scan on table02 table02_2 (cost=0.00..29.00 rows=2000 width=7) (actual time=0.005..0.271 rows=2000 l oops=1) Planning Time: 0.531 ms Execution Time: 227.639 ms (17 行记录)
这个案例从最早的 50秒 改写到 8秒,到最后的 227毫秒出结果,花了不少时间研究各种改写方式😂。
只能说PG的优化器确实太拉跨了,要用CTE才能让优化器走HASH连接,浪费开发者不少时间,换成ORACLE数据库不会走这种SB执行计划。😁
以后估计会很少发博客,正在考虑转行卖炒粉,现在的市场真的是一言难尽。😢