对PostgreSQL Nested Loop 执行计划的初步学习

开始

伪代码如下:

for (i = 0; i < length(outer); i++)
  for (j = 0; j < length(inner); j++)
    if (outer[i] == inner[j])
      output(outer[i], inner[j]);

就是个双层循环。

实际运行一下:

建表:

postgres=# CREATE TEMPORARY TABLE sample1 (id, junk) AS
postgres-# SELECT oid, repeat('x',250)
postgres-# FROM pg_proc
postgres-# ORDER BY random(); -- add rows in random order;
SELECT 2491
postgres=# 
postgres=# CREATE TEMPORARY TABLE sample2 (id, junk) AS
postgres-# SELECT oid, repeat('x',250)
postgres-# FROM pg_class
postgres-# ORDER BY random();
SELECT 298
postgres=# 

运行:

postgres=# EXPLAIN analyze SELECT sample2.junk
FROM sample1 JOIN sample2 ON (sample1.id = sample2.id)
WHERE sample1.id = 33;
                                                  QUERY PLAN                                                   
---------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..271.44 rows=399 width=32) (actual time=0.120..0.120 rows=0 loops=1)
   ->  Seq Scan on sample1  (cost=0.00..235.99 rows=57 width=4) (actual time=0.087..0.118 rows=1 loops=1)
         Filter: (id = 33::oid)
         Rows Removed by Filter: 2490
   ->  Materialize  (cost=0.00..30.48 rows=7 width=36) (actual time=0.000..0.000 rows=0 loops=1)
         ->  Seq Scan on sample2  (cost=0.00..30.45 rows=7 width=36) (actual time=0.000..0.000 rows=0 loops=1)
               Filter: (id = 33::oid)
               Rows Removed by Filter: 298
 Total runtime: 0.234 ms
(9 rows)

如果和使用 with 子句相比,还是快一些:

postgres=# explain analyze                    
WITH testsample AS (
   select * from sample1 where sample1.id=33
)
SELECT sample2.junk
FROM testsample JOIN sample2 ON (testsample.id = sample2.id)
;
                                                   QUERY PLAN                                                    
-----------------------------------------------------------------------------------------------------------------
 Hash Join  (cost=237.84..274.35 rows=421 width=32) (actual time=0.387..0.387 rows=0 loops=1)
   Hash Cond: (sample2.id = testsample.id)
   CTE testsample
     ->  Seq Scan on sample1  (cost=0.00..235.99 rows=57 width=36) (actual time=0.174..0.284 rows=1 loops=1)
           Filter: (id = 33::oid)
           Rows Removed by Filter: 2490
   ->  Seq Scan on sample2  (cost=0.00..26.76 rows=1476 width=36) (actual time=0.009..0.040 rows=298 loops=1)
   ->  Hash  (cost=1.14..1.14 rows=57 width=4) (actual time=0.301..0.301 rows=1 loops=1)
         Buckets: 1024  Batches: 1  Memory Usage: 1kB
         ->  CTE Scan on testsample  (cost=0.00..1.14 rows=57 width=4) (actual time=0.177..0.288 rows=1 loops=1)
 Total runtime: 0.489 ms
(11 rows)

postgres=# 

不过这只是一个实验,要是真的简化起来,不就是这样了么?

postgres=# explain analyze SELECT sample2.junk
from sample2 WHERE sample2.id=33;
                                            QUERY PLAN                                             
---------------------------------------------------------------------------------------------------
 Seq Scan on sample2  (cost=0.00..30.45 rows=7 width=32) (actual time=0.042..0.042 rows=0 loops=1)
   Filter: (id = 33::oid)
   Rows Removed by Filter: 298
 Total runtime: 0.147 ms
(4 rows)

postgres=# 

结束

posted @ 2012-11-09 09:43  健哥的数据花园  阅读(3180)  评论(0编辑  收藏  举报