PG优化实战系列-利用hint固定执行计划，避免数据全量排序

　　项目现场开发给了个慢SQL寻求优化。T3表结构又全是varchar…已经无力吐槽了，之前的博客已经讲过了在磁盘上varchar几乎等同于实际长度，但是如果放到内存，分配的空间是定义长度，很容易导致内存不足，如果嫌命长，可以多试试。此外不要用UUID做主键，UUID做主键会导致索引结构大，同时UUID是随机的，入库时很容易导致叶子节点分裂，破坏B树结构，影响查询，入库以及更新效率。

--原始SQL
select t3.apptype, t3.uname, t3.uid, t3.nickname, t3.action_time, t3.content
  from t1
 inner join t2
    on (t1.task_id = t2.task_id or t1.child_task_id = t2.task_id)
 inner join t3
    on t2.md_id = t3.md_id
 where t1.id = 1
 order by t3.action_time desc limit 10 offset 0;

--对应执行计划
                                                                                 QUERY PLAN                                                                                 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=2226.68..2226.70 rows=10 width=206) (actual time=1770.610..1770.641 rows=10 loops=1)
   Buffers: shared hit=776920
   ->  Sort  (cost=2226.68..2227.82 rows=458 width=206) (actual time=1770.607..1770.614 rows=10 loops=1)
         Sort Key: t3.action_time DESC
         Sort Method: top-N heapsort  Memory: 30kB
         Buffers: shared hit=776920
         ->  Nested Loop  (cost=29.33..2216.78 rows=458 width=206) (actual time=26.288..1674.090 rows=106066 loops=1)
               Buffers: shared hit=776920
               ->  Nested Loop  (cost=28.77..1666.80 rows=458 width=33) (actual time=26.244..336.274 rows=162491 loops=1)
                     Buffers: shared hit=20890
                     ->  Seq Scan on t1  (cost=0.00..1.18 rows=1 width=292) (actual time=0.008..0.015 rows=1 loops=1)
                           Filter: (id = 1)
                           Rows Removed by Filter: 13
                           Buffers: shared hit=1
                     ->  Bitmap Heap Scan on t2  (cost=28.77..1661.05 rows=458 width=47) (actual time=26.230..157.987 rows=162491 loops=1)
                           Recheck Cond: (((t1.task_id)::text = (task_id)::text) OR ((t1.child_task_id)::text = (task_id)::text))
                           Heap Blocks: exact=19013
                           Buffers: shared hit=20889
                           ->  BitmapOr  (cost=28.77..28.77 rows=458 width=0) (actual time=23.810..23.811 rows=0 loops=1)
                                 Buffers: shared hit=1876
                                 ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..14.27 rows=229 width=0) (actual time=23.790..23.791 rows=162491 loops=1)
                                       Index Cond: ((task_id)::text = (t1.task_id)::text)
                                       Buffers: shared hit=1872
                                 ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..14.27 rows=229 width=0) (actual time=0.012..0.013 rows=0 loops=1)
                                       Index Cond: ((task_id)::text = (t1.child_task_id)::text)
                                       Buffers: shared hit=4
               ->  Index Scan using idx_mdid on  t3  (cost=0.56..1.19 rows=1 width=239) (actual time=0.006..0.006 rows=1 loops=162491)
                     Index Cond: ((md_id)::text = (t2.md_id)::text)
                     Buffers: shared hit=756030
 Planning Time: 0.336 ms
 Execution Time: 1770.725 ms

--命中10w+
->  Nested Loop  (cost=29.33..2216.78 rows=458 width=206) (actual time=26.288..1674.090 rows=106066 loops=1)
--循环16w+次，耗时947ms
->  Index Scan using idx_mdid on  t3  (cost=0.56..1.19 rows=1 width=239) (actual time=0.006..0.006 rows=1 loops=162491)

　　从截取的执行计划段可以看到总体结果集捞出来了10w+的数据，而实际输出的分页只需10条，显然执行计划里面将全部命中结果获取后再进行排序取10条的方式很不明智。

验证：不全部命中直接返回10条是否能提速？

--验证SQL，直接去掉order by
explain (analyze,buffers) select t3.apptype, t3.uname, t3.uid, t3.nickname, t3.action_time, t3.content
  from  t1
 inner join  t2
    on (t1.task_id = t2.task_id or t1.child_task_id = t2.task_id)
 inner join  t3
    on t2.md_id = t3.md_id
 where t1.id = 1
  limit 10 offset 0;

--对应执行计划
                                                                              QUERY PLAN                                                                              
----------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=29.33..77.09 rows=10 width=206) (actual time=29.312..29.474 rows=10 loops=1)
   Buffers: shared hit=1932
   ->  Nested Loop  (cost=29.33..2216.78 rows=458 width=206) (actual time=29.310..29.444 rows=10 loops=1)
         Buffers: shared hit=1932
         ->  Nested Loop  (cost=28.77..1666.80 rows=458 width=33) (actual time=29.271..29.291 rows=11 loops=1)
               Buffers: shared hit=1878
               ->  Seq Scan on  t1  (cost=0.00..1.18 rows=1 width=292) (actual time=0.008..0.009 rows=1 loops=1)
                     Filter: (id = 1)
                     Rows Removed by Filter: 10
                     Buffers: shared hit=1
               ->  Bitmap Heap Scan on  t2  (cost=28.77..1661.05 rows=458 width=47) (actual time=29.257..29.265 rows=11 loops=1)
                     Recheck Cond: (((t1.task_id)::text = (task_id)::text) OR ((t1.child_task_id)::text = (task_id)::text))
                     Heap Blocks: exact=1
                     Buffers: shared hit=1877
                     ->  BitmapOr  (cost=28.77..28.77 rows=458 width=0) (actual time=26.835..26.836 rows=0 loops=1)
                           Buffers: shared hit=1876
                           ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..14.27 rows=229 width=0) (actual time=26.810..26.811 rows=162491 loops=1)
                                 Index Cond: ((task_id)::text = (t1.task_id)::text)
                                 Buffers: shared hit=1872
                           ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..14.27 rows=229 width=0) (actual time=0.014..0.015 rows=0 loops=1)
                                 Index Cond: ((task_id)::text = (t1.child_task_id)::text)
                                 Buffers: shared hit=4
         ->  Index Scan using idx_mdid on  t3  (cost=0.56..1.19 rows=1 width=239) (actual time=0.011..0.011 rows=1 loops=11)
               Index Cond: ((md_id)::text = (t2.md_id)::text)
               Buffers: shared hit=54
 Planning Time: 1.044 ms
 Execution Time: 29.538 ms

　　从以上执行计划可以看到，在不全部命中直接输出的情况下，速度提升巨大。与业务沟通后，输出必须带时间排序。

思路：能否只命中部分数据后直接返回？

“order by t3.action_time desc limit 10 offset 0;”实际是取最近数据的10条。

与业务沟通后可以增加action_time过滤条件，如近3天、5天。

--对应SQL
explain (analyze,buffers)    
select t3.apptype, t3.uname, t3.uid, t3.nickname, t3.action_time, t3.content
  from  t1
 inner join  t2
    on (t1.task_id = t2.task_id or t1.child_task_id = t2.task_id)
 inner join  t3
    on t2.md_id = t3.md_id
 where t1.id = 1 and action_time>'1610035195'
 order by t3.action_time desc limit 10 offset 0;

--对应执行计划
                                                                                 QUERY PLAN                                                                                 
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=3461.45..3461.46 rows=4 width=205) (actual time=1789.902..1789.931 rows=10 loops=1)
   Buffers: shared hit=781825
   ->  Sort  (cost=3461.45..3461.46 rows=4 width=205) (actual time=1789.899..1789.906 rows=10 loops=1)
         Sort Key: t3.action_time DESC
         Sort Method: top-N heapsort  Memory: 31kB
         Buffers: shared hit=781825
         ->  Nested Loop  (cost=45.35..3461.41 rows=4 width=205) (actual time=46.231..1760.202 rows=106066 loops=1)
               Buffers: shared hit=781825
               ->  Nested Loop  (cost=44.79..1766.05 rows=460 width=33) (actual time=35.603..387.490 rows=162491 loops=1)
                     Buffers: shared hit=25795
                     ->  Seq Scan on  t1  (cost=0.00..1.18 rows=1 width=292) (actual time=0.013..0.019 rows=1 loops=1)
                           Filter: (id = 1)
                           Rows Removed by Filter: 13
                           Buffers: shared hit=1
                     ->  Bitmap Heap Scan on  t2  (cost=44.79..1760.27 rows=460 width=47) (actual time=35.582..189.271 rows=162491 loops=1)
                           Recheck Cond: (((t1.task_id)::text = (task_id)::text) OR ((t1.child_task_id)::text = (task_id)::text))
                           Heap Blocks: exact=22050
                           Buffers: shared hit=25794
                           ->  BitmapOr  (cost=44.79..44.79 rows=460 width=0) (actual time=32.343..32.344 rows=0 loops=1)
                                 Buffers: shared hit=3744
                                 ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..22.28 rows=230 width=0) (actual time=32.316..32.317 rows=162491 loops=1)
                                       Index Cond: ((task_id)::text = (t1.task_id)::text)
                                       Buffers: shared hit=3740
                                 ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..22.28 rows=230 width=0) (actual time=0.015..0.016 rows=0 loops=1)
                                       Index Cond: ((task_id)::text = (t1.child_task_id)::text)
                                       Buffers: shared hit=4
               ->  Index Scan using idx_mdid on  t3  (cost=0.56..3.68 rows=1 width=238) (actual time=0.007..0.007 rows=0 loops=162491)
                     Index Cond: ((md_id)::text = (t2.md_id)::text)
                     Filter: ((action_time)::text > '1610035195'::text)
                     Rows Removed by Filter: 0
                     Buffers: shared hit=756030
 Planning Time: 0.411 ms
 Execution Time: 1790.004 ms

　　从执行计划可以看到没有按照想象中的来执行；

试试hint

--SQL
/*+ indexscan(t3 idx_index_webpost_action_time2) */
explain (analyze,buffers)    
select t3.apptype, t3.uname, t3.uid, t3.nickname, t3.action_time, t3.content
  from  t1
 inner join  t2
    on (t1.task_id = t2.task_id or t1.child_task_id = t2.task_id)
 inner join  t3
    on t2.md_id = t3.md_id
 where t1.id = 1 and action_time>'1610035195'
 order by t3.action_time desc limit 10 offset 0;

--执行计划
                                                                                           QUERY PLAN                                                                                        
    
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
----
 Limit  (cost=174120.21..174120.22 rows=4 width=205) (actual time=709.149..709.188 rows=10 loops=1)
   Buffers: shared hit=66616, temp read=1725 written=1725
   ->  Sort  (cost=174120.21..174120.22 rows=4 width=205) (actual time=709.147..709.161 rows=10 loops=1)
         Sort Key: t3.action_time DESC
         Sort Method: top-N heapsort  Memory: 28kB
         Buffers: shared hit=66616, temp read=1725 written=1725
         ->  Hash Join  (cost=1772.23..174120.17 rows=4 width=205) (actual time=523.189..682.476 rows=28597 loops=1)
               Hash Cond: ((t3.md_id)::text = (t2.md_id)::text)
               Buffers: shared hit=66616, temp read=1725 written=1725
               ->  Index Scan using idx_index_webpost_action_time2 on  t3  (cost=0.43..172161.43 rows=49838 width=238) (actual time=0.019..54.132 rows=46447 loops=1)
                     Index Cond: ((action_time)::text > '1610035195'::text)
                     Buffers: shared hit=40821
               ->  Hash  (cost=1766.05..1766.05 rows=460 width=33) (actual time=497.318..497.320 rows=162491 loops=1)
                     Buckets: 131072 (originally 1024)  Batches: 2 (originally 1)  Memory Usage: 7169kB
                     Buffers: shared hit=25795, temp written=526
                     ->  Nested Loop  (cost=44.79..1766.05 rows=460 width=33) (actual time=34.844..369.624 rows=162491 loops=1)
                           Buffers: shared hit=25795
                           ->  Seq Scan on  t1  (cost=0.00..1.18 rows=1 width=292) (actual time=0.015..0.021 rows=1 loops=1)
                                 Filter: (id = 1)
                                 Rows Removed by Filter: 13
                                 Buffers: shared hit=1
                           ->  Bitmap Heap Scan on  t2  (cost=44.79..1760.27 rows=460 width=47) (actual time=34.822..184.299 rows=162491 loops=1)
                                 Recheck Cond: (((t1.task_id)::text = (task_id)::text) OR ((t1.child_task_id)::text = (task_id)::text))
                                 Heap Blocks: exact=22050
                                 Buffers: shared hit=25794
                                 ->  BitmapOr  (cost=44.79..44.79 rows=460 width=0) (actual time=31.881..31.882 rows=0 loops=1)
                                       Buffers: shared hit=3744
                                       ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..22.28 rows=230 width=0) (actual time=31.856..31.857 rows=162491 loops=1)
                                             Index Cond: ((task_id)::text = (t1.task_id)::text)
                                             Buffers: shared hit=3740
                                       ->  Bitmap Index Scan on "TTRR_CONFLICT_TASK_ID_IDX"  (cost=0.00..22.28 rows=230 width=0) (actual time=0.014..0.015 rows=0 loops=1)
                                             Index Cond: ((task_id)::text = (t1.child_task_id)::text)
                                             Buffers: shared hit=4
 Planning Time: 0.429 ms
 Execution Time: 709.298 ms

　　从执行计划可以看到在构建hash表的时候有很多临时read、written，临时读写一方面可能是work_mem不够，另外一方面最根本原因是表结构字段类型全被设置成varchar(500)导致；

->  Index Scan using idx_index_webpost_action_time2 on  t3  (cost=0.43..172161.43 rows=49838 width=238) (actual time=0.019..54.132 rows=46447 loops=1)

　　此外可以看到捞取T3表的行数从10w降低到4.6w（此值跟选定的时间范围有关，业务侧可以根据实际情况选择1,3,5天）；

同时可以看到通过T3的action_time索引扫描cost代价很大,导致优化器不选择此执行计划，鉴于实际情况使用此执行计划更满足业务场景，需要用hint对pg进行执行计划选取。

问题总结：

1、表结构要设计合理，字段类型不要为了省事全部设置成varchar类型

2、 UUID尽量少用，可以使用序列代替，如果是为了保证行唯一，可以考虑用hashtext（uuid）::bigint代替。

3、根据业务场景合理变更SQL，如果执行计划不符合要求可以调整配置参数或者用hint。

4、优化的原则是尽量减少磁盘IO,走索引或者减少捞取最终或中间数据量等都能达到目的。

posted @ 2021-01-20 16:05 DUAN的博客阅读(909) 评论(0) 收藏举报

刷新页面返回顶部

段磊的博客

PG优化实战系列-利用hint固定执行计划，避免数据全量排序

公告