1 Sample Dataset
wget https://ftp.postgresql.org/pub/projects/pgFoundry/dbsamples/dellstore2/dellstore2-normal-1.0/dellstore2-normal-1.0.tar.gz
tar -xf dellstore2-normal-1.0.tar.gz
createdb dellstore2
cd dellstore2-normal-1.0
psql -f dellstore2-normal-1.0.sql -d dellstore2
psql -d dellstore2 -c "vacuum verbose analyze"
Database size:
dellstore2=# select pg_size_pretty(pg_database_size('dellstore2'));
pg_size_pretty
----------------
24 MB
(1 row)
The tables and indexes contained in this database:
dellstore2=# select schemaname,relname, pg_size_pretty(pg_total_relation_size(relid)) from pg_stat_user_tables order by pg_relation_size(relid) desc; -- total size, including indexes
schemaname | relname | pg_size_pretty
------------+------------+----------------
public | customers | 5024 kB
public | orderlines | 4448 kB
public | cust_hist | 3984 kB
public | products | 1560 kB
public | orders | 1392 kB
public | inventory | 712 kB
public | categories | 56 kB
public | reorder | 0 bytes
(8 rows)
dellstore2=# select schemaname,relname, pg_size_pretty(pg_table_size(relid)) from pg_stat_user_tables order by pg_relation_size(relid) desc; -- table only, excluding indexes (but including TOAST, the free space map, and the visibility map)
schemaname | relname | pg_size_pretty
------------+------------+----------------
public | customers | 3944 kB
public | orderlines | 3112 kB
public | cust_hist | 2648 kB
public | products | 840 kB
public | orders | 832 kB
public | inventory | 472 kB
public | categories | 40 kB
public | reorder | 0 bytes
(8 rows)
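The two queries above report the total size (with indexes) and the table-only size; the difference is mostly index space. To break that down per index, something like the following sketch works (output omitted here):

-- List each user index and its size, largest first.
SELECT schemaname, relname, indexrelname,
       pg_size_pretty(pg_relation_size(indexrelid)) AS index_size
FROM pg_stat_user_indexes
ORDER BY pg_relation_size(indexrelid) DESC;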
2 EXPLAIN Basics
If a query is running slowly, the first thing to try is running it with EXPLAIN in front of it. This displays the query's execution plan: a list of what is expected to happen when the query is actually executed. You can instead prefix the statement with EXPLAIN ANALYZE, which shows both the planner's estimates and a description of what actually happened when the query ran. Be aware that this form really executes the statement, exactly as if you had run it by hand.
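A minimal sketch of the two forms. The key practical difference is that EXPLAIN ANALYZE really executes the statement, so a data-modifying statement should be wrapped in a transaction that you roll back (the no-op UPDATE below is only an illustration):

-- Plan only; the statement is not executed.
EXPLAIN SELECT count(*) FROM customers;

-- Plan plus actual run-time statistics; the statement really executes,
-- so roll back if it modifies data.
BEGIN;
EXPLAIN ANALYZE UPDATE customers SET zip = zip WHERE customerid = 1000;
ROLLBACK;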
Timing Overhead
Suppose you want to run a simple query that counts all the customers in the database, and you also want to know how long it takes to execute:
dellstore2=# \timing
Timing is on.
dellstore2=# select count(1) from customers;
count
-------
20000
(1 row)
Time: 4.233 ms
The execution plan, together with the time the query took:
dellstore2=# explain analyze select count(1) from customers;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=578.29..578.30 rows=1 width=8) (actual time=3.199..3.200 rows=1 loops=1)
-> Index Only Scan using customers_pkey on customers (cost=0.29..528.29 rows=20000 width=0) (actual time=0.014..1.760 rows=20000 loops=1)
Heap Fetches: 0
Planning time: 0.051 ms
Execution time: 3.251 ms
(5 rows)
Time: 3.958 ms
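Note that the instrumentation EXPLAIN ANALYZE adds to every node has a cost of its own, so the reported execution time is not exactly what the bare query would take. Since PostgreSQL 9.2 the per-node timers can be switched off while keeping the actual row counts, which reduces that overhead; a minimal sketch:

-- Keep the actual row counts but skip per-node timing.
EXPLAIN (ANALYZE, TIMING OFF) SELECT count(1) FROM customers;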
Hot and Cold Cache
The second run of the query above was faster than the first. It demonstrates "hot" cache behavior: the data the query needs is already in the database or operating system cache, having stayed there after it was loaded the first time. Whether your cache is hot or cold (the data is not cached) is something you need to pay attention to.
You can measure an effective transfer rate by timing a query that reads the whole table and dividing the table's size by that time (9.489 milliseconds here); this gives a feel for how quickly data moves between two areas of memory in the hot-cache case.
dellstore2=# select pg_size_pretty(cast(pg_relation_size('customers')/9.489*1000 as int8)) as bytes_per_second;
bytes_per_second
------------------
402 MB
(1 row)
Time: 1.343 ms
Clearing the Cache
How to drop all the cached data in order to measure cold-cache performance depends on the operating system. Just stopping the database server is not enough, because the operating system cache still holds plenty of cached data. On Linux you can use the drop_caches facility to throw away everything in its page cache:
pg_ctl stop
su -
sync                              # flush dirty data to disk first
echo 3 > /proc/sys/vm/drop_caches # drop the page cache, dentries and inodes
logout
pg_ctl start
The sync here flushes all dirty data out to disk before the caches are dropped.
Re-running the same benchmark now shows completely different performance:
dellstore2=# \timing
Timing is on.
dellstore2=# select count(1) from customers;
count
-------
20000
(1 row)
Time: 18.135 ms
Now run the following query:
dellstore2=# select pg_size_pretty(cast(pg_relation_size('customers')/32.417*1000 as int8)) as bytes_per_second;
bytes_per_second
------------------
118 MB
(1 row)
Time: 5.509 ms
This works out to an effective read rate of about 118 MB/s from the disk drive, and the data is not necessarily laid out sequentially; on a data set this small it is hard to reach the drive's top speed anyway.
Now re-run the query. Since the cache is hot again, we can expect to be back to the earlier speed:
dellstore2=# select count(1) from customers;
count
-------
20000
(1 row)
Time: 2.658 ms
When testing against a real data set, you need to know whether the data is already cached. A common technique is to run each query three times. If the first run is much slower than the second and third, the query started with a cold cache. If all three runs take about the same time, the data was probably hot before you started. And if the three times are all different, something other than caching is probably varying, and you will need a larger number of runs to see a pattern in how the speed changes.
3 Plan Node Structure
EXPLAIN output is organized as a series of plan nodes. At the lowest level are nodes that look at tables, scan them, or look things up in an index. Higher-level nodes take the output of the lower-level nodes and operate on it. When you run EXPLAIN, each line of the output is a plan node.
dellstore2=# explain analyze select * from customers;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Seq Scan on customers (cost=0.00..688.00 rows=20000 width=268) (actual time=0.302..25.800 rows=20000 loops=1)
Planning time: 2.570 ms
Execution time: 26.541 ms
(3 rows)
This plan has a single Seq Scan node. The first set of numbers reported are the plan estimates, which are the only ones you see when running EXPLAIN without ANALYZE:
- cost=0.00..688.00: The first cost is the node's startup cost: the estimated amount of work before the node can produce its first row of output. Here it is 0 because a Seq Scan returns rows immediately; a sort is an example of an operation that takes a while before returning its first row. The second number is the estimated cost of running the node to completion. It may not actually run to completion: if there is a node with a LIMIT above it, for example, execution can stop early, before all the estimated rows have been retrieved.
- rows=20000: The number of rows this node is expected to output if it runs to completion.
- width=268: The estimated average number of bytes in each row this node outputs. For example, 20,000 rows of 268 bytes each means the node is expected to produce 5,360,000 bytes of output in total. That is somewhat larger than the table itself (3.8 MB), because it includes the overhead of how tuples are stored in memory while the plan executes.
The "actual" numbers show how the query really ran:
- actual time=0.302..25.800: The actual startup cost was not quite zero; it took a little while to begin producing output. Once started, running this plan node to completion took 25.8 milliseconds.
- rows=20000: The node actually output 20,000 rows. A mismatch between the expected and actual row counts is one of the most common sources of query problems, where the optimizer ends up making bad decisions.
- loops=1: Some nodes, such as those performing joins, execute more than once. In that case the loops value will be larger than 1, and the actual time and row values shown are per loop rather than for the node as a whole; to get whole-node totals, multiply by the number of loops. For example, actual time=0.010..0.050 with loops=100 means the node accounted for roughly 100 × 0.050 = 5 ms in total.
To understand how to use these numbers when making decisions, you need to know more about how the estimated costs are computed.
3.1 Basic Cost Computation
The query optimizer's job is to generate many possible plans that could execute the query and then pick the cheapest one to run. The cost computations use arbitrary units that are only loosely tied to real-world execution cost:
- seq_page_cost: How long it takes to read a single database page from disk when the expectation is that pages will be read sequentially, one adjacent page after another. All the other cost parameters are essentially set relative to this one, which acts as the reference cost of 1.0.
- random_page_cost: The estimated cost of reading rows scattered randomly across the disk. Defaults to 4.0.
- cpu_tuple_cost: The cost of processing a single data row. Defaults to 0.01.
- cpu_index_tuple_cost: The cost of processing a single index entry during an index scan. Defaults to 0.005, lower than the per-row cost because data rows carry much more header information (such as xmin and xmax) than index entries do.
- cpu_operator_cost: The expected cost of executing a simple operator or function. If the query needs to add two numbers together, that is one operator cost; it defaults to a very low 0.0025.
The following table shows these values as relative speed ratios against the seq_page_cost reference value (a query to inspect the current settings follows the table):
| Parameter            | Default | Relative speed |
|----------------------|---------|----------------|
| seq_page_cost        | 1.0     | reference      |
| random_page_cost     | 4.0     | 4x slower      |
| cpu_tuple_cost       | 0.01    | 100x faster    |
| cpu_index_tuple_cost | 0.005   | 200x faster    |
| cpu_operator_cost    | 0.0025  | 400x faster    |
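A quick way to check what these parameters are set to on your own server is to query pg_settings; a minimal sketch:

-- Show the current values of the planner's per-operation cost parameters.
SELECT name, setting
FROM pg_settings
WHERE name IN ('seq_page_cost', 'random_page_cost', 'cpu_tuple_cost',
               'cpu_index_tuple_cost', 'cpu_operator_cost');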
We can use these numbers to reproduce the cost shown in the earlier example. A sequential scan of the customers table has to read every page in it and process every resulting row. Look at the statistics the optimizer uses to estimate the table's pages and row count, and combine them with the internal cost parameters:
dellstore2=# select relpages,
dellstore2-# current_setting('seq_page_cost') as seq_page_cost,
dellstore2-# relpages * current_setting('seq_page_cost')::decimal as page_cost,
dellstore2-# reltuples,
dellstore2-# current_setting('cpu_tuple_cost') as cpu_tuple_cost,
dellstore2-# reltuples * current_setting('cpu_tuple_cost')::decimal as tuple_cost
dellstore2-# from pg_class where relname = 'customers';
relpages | seq_page_cost | page_cost | reltuples | cpu_tuple_cost | tuple_cost
----------+---------------+-----------+-----------+----------------+------------
488 | 1 | 488 | 20000 | 0.01 | 200
(1 row)
Adding the cost of reading the pages (488) to the cost of processing the rows (200) gives 688.00, exactly the cost the EXPLAIN plan showed for the sequential scan in the previous section.
Every plan node's cost breaks down into these five operations: a sequential read, a random read, processing a row, processing an index entry, or executing an operator. Everything else is a more complicated structure built on top of these primitives.
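These cost parameters can also be changed, per session or permanently, which shifts how the optimizer weighs those five operations. For example, on storage where random reads are nearly as cheap as sequential ones (such as SSDs), random_page_cost is commonly lowered. A minimal sketch (the value 1.1 is only an illustrative assumption, not a recommendation for this data set):

-- Lower the random-read penalty for this session only and compare the
-- estimated costs EXPLAIN reports before and after.
SET random_page_cost = 1.1;
EXPLAIN SELECT * FROM customers WHERE customerid = 1000;
RESET random_page_cost;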
3.2 Plan Output Formats
Starting with PostgreSQL 9.0, the explain output can be produced in several additional formats. Here is a relatively simple plan:
dellstore2=# explain verbose select customerid,firstname,lastname from customers where customerid > 1000 order by zip;
QUERY PLAN
-----------------------------------------------------------------------------
Sort (cost=2088.30..2135.80 rows=19000 width=26)
Output: customerid, firstname, lastname, zip
Sort Key: customers.zip
-> Seq Scan on public.customers (cost=0.00..738.00 rows=19000 width=26)
Output: customerid, firstname, lastname, zip
Filter: (customers.customerid > 1000)
(6 rows)
The same plan is much larger in YAML format, but also easier to read:
dellstore2=# explain (format yaml) select customerid,firstname,lastname from customers where customerid > 1000 order by zip;
QUERY PLAN
---------------------------------------
- Plan: +
Node Type: "Sort" +
Parallel Aware: false +
Startup Cost: 2088.30 +
Total Cost: 2135.80 +
Plan Rows: 19000 +
Plan Width: 26 +
Sort Key: +
- "zip" +
Plans: +
- Node Type: "Seq Scan" +
Parent Relationship: "Outer" +
Parallel Aware: false +
Relation Name: "customers" +
Alias: "customers" +
Startup Cost: 0.00 +
Total Cost: 738.00 +
Plan Rows: 19000 +
Plan Width: 26 +
Filter: "(customerid > 1000)"
(1 row)
JSON format:
dellstore2=# explain (format json) select customerid,firstname,lastname from customers where customerid > 1000 order by zip;
QUERY PLAN
-------------------------------------------
[ +
{ +
"Plan": { +
"Node Type": "Sort", +
"Parallel Aware": false, +
"Startup Cost": 2088.30, +
"Total Cost": 2135.80, +
"Plan Rows": 19000, +
"Plan Width": 26, +
"Sort Key": ["zip"], +
"Plans": [ +
{ +
"Node Type": "Seq Scan", +
"Parent Relationship": "Outer",+
"Parallel Aware": false, +
"Relation Name": "customers", +
"Alias": "customers", +
"Startup Cost": 0.00, +
"Total Cost": 738.00, +
"Plan Rows": 19000, +
"Plan Width": 26, +
"Filter": "(customerid > 1000)"+
} +
] +
} +
} +
]
(1 row)
XML format:
dellstore2=# explain (format xml) select customerid,firstname,lastname from customers where customerid > 1000 order by zip;
QUERY PLAN
------------------------------------------------------------
<explain xmlns="http://www.postgresql.org/2009/explain"> +
<Query> +
<Plan> +
<Node-Type>Sort</Node-Type> +
<Parallel-Aware>false</Parallel-Aware> +
<Startup-Cost>2088.30</Startup-Cost> +
<Total-Cost>2135.80</Total-Cost> +
<Plan-Rows>19000</Plan-Rows> +
<Plan-Width>26</Plan-Width> +
<Sort-Key> +
<Item>zip</Item> +
</Sort-Key> +
<Plans> +
<Plan> +
<Node-Type>Seq Scan</Node-Type> +
<Parent-Relationship>Outer</Parent-Relationship>+
<Parallel-Aware>false</Parallel-Aware> +
<Relation-Name>customers</Relation-Name> +
<Alias>customers</Alias> +
<Startup-Cost>0.00</Startup-Cost> +
<Total-Cost>738.00</Total-Cost> +
<Plan-Rows>19000</Plan-Rows> +
<Plan-Width>26</Plan-Width> +
<Filter>(customerid > 1000)</Filter> +
</Plan> +
</Plans> +
</Plan> +
</Query> +
</explain>
(1 row)
4 Assembling Row Sets
4.1 Scan Methods
Tid Scan
dellstore2=# select ctid,customerid from customers limit 3;
ctid | customerid
-------+------------
(0,1) | 1
(0,2) | 2
(0,3) | 3
(3 rows)
dellstore2=# explain select customerid from customers where ctid = '(0,1)';
QUERY PLAN
---------------------------------------------------------
Tid Scan on customers (cost=0.00..4.01 rows=1 width=4)
TID Cond: (ctid = '(0,1)'::tid)
(2 rows)
Object OID
dellstore2=# select oid,relname from pg_class where relname = 'customers';
oid | relname
-------+-----------
16411 | customers
(1 row)
dellstore2=# explain select relname from pg_class where oid = 16411;
QUERY PLAN
------------------------------------------------------------------------------------
Index Scan using pg_class_oid_index on pg_class (cost=0.27..8.29 rows=1 width=64)
Index Cond: (oid = '16411'::oid)
(2 rows)
Sequential Scan
Expect the planner to use a Seq Scan when no useful index is available, or when so much of the table is expected to be returned that using an index would only add overhead. A Seq Scan is also used when very little data needs to be accessed: if the table only occupies a few pages on disk, the overhead of going through an index is not worth it.
A Seq Scan has to read through any dead rows in the table, even though they are not included in its output, so a bloated table takes correspondingly longer to produce all of its output (see the sketch below).
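A minimal sketch of both points, assuming (as in the stock dellstore2 schema) that there is no index on the state column, so even a selective filter still requires scanning the whole table; the second query checks how many dead rows the statistics collector currently reports for the table:

-- No index on state, so the planner has to use a Seq Scan with a Filter.
EXPLAIN ANALYZE SELECT * FROM customers WHERE state = 'MA';

-- Dead rows are scanned but filtered out; table bloat shows up here.
SELECT n_live_tup, n_dead_tup
FROM pg_stat_user_tables
WHERE relname = 'customers';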
Index Scan
If an index exists that is useful for satisfying a selective WHERE condition, you get a plan like this instead:
dellstore2=# explain analyze select * from customers where customerid = 1000;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------
Index Scan using customers_pkey on customers (cost=0.29..8.30 rows=1 width=268) (actual time=0.013..0.013 rows=1 loops=1)
Index Cond: (customerid = 1000)
Planning time: 0.333 ms
Execution time: 0.036 ms
(4 rows)
The cost here is mainly two random page reads (4.0 each, 8.0 in total): one for the index block and one for the block holding the actual row.
Index Only Scan
PostgreSQL 9.2 added a new feature: index-only scans. When every column the query needs is present in a single index, the planner can choose an Index Only Scan instead of a regular index scan:
dellstore2=# explain analyze select customerid from customers where customerid = 1000;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------
Index Only Scan using customers_pkey on customers (cost=0.29..4.30 rows=1 width=4) (actual time=0.098..0.099 rows=1 loops=1)
Index Cond: (customerid = 1000)
Heap Fetches: 0
Planning time: 0.076 ms
Execution time: 0.119 ms
(5 rows)
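Heap Fetches: 0 means every page the scan touched was marked all-visible in the table's visibility map, so the heap never had to be consulted. After heavy write activity that is no longer true and the heap fetch count climbs; running VACUUM refreshes the visibility map. A minimal sketch:

-- Refresh the visibility map (and statistics), then check whether the
-- index-only scan still reports Heap Fetches: 0.
VACUUM ANALYZE customers;
EXPLAIN ANALYZE SELECT customerid FROM customers WHERE customerid = 1000;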
4.2 Processing Nodes
Sort
A Sort node shows up when you add an ORDER BY clause to a query:
dellstore2=# explain analyze select customerid from customers order by zip;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Sort (cost=2116.77..2166.77 rows=20000 width=8) (actual time=27.677..30.429 rows=20000 loops=1)
Sort Key: zip
Sort Method: external sort Disk: 352kB
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=8) (actual time=0.012..7.927 rows=20000 loops=1)
Planning time: 0.465 ms
Execution time: 32.008 ms
(6 rows)
If the sort is expected to fit in memory, it can be done there with a quicksort; otherwise it spills to disk and uses the external sort shown in this example. The threshold that decides which one is used is the server's work_mem setting:
dellstore2=# show work_mem;
work_mem
----------
1MB
(1 row)
An external disk sort in PostgreSQL works by writing out a set of sorted files and then merging the results, which needs far less memory than a quicksort:
dellstore2=# set work_mem='2MB';
SET
dellstore2=# explain analyze select customerid from customers order by zip;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Sort (cost=2116.77..2166.77 rows=20000 width=8) (actual time=9.466..11.168 rows=20000 loops=1)
Sort Key: zip
Sort Method: quicksort Memory: 1621kB
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=8) (actual time=0.008..3.801 rows=20000 loops=1)
Planning time: 0.047 ms
Execution time: 13.081 ms
(6 rows)
Limit
Like everything else, a query LIMIT is built on top of an existing scan that returns a set of rows:
dellstore2=# explain analyze select customerid from customers limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..0.34 rows=10 width=4) (actual time=0.010..0.012 rows=10 loops=1)
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=4) (actual time=0.005..0.007 rows=10 loops=1)
Planning time: 0.088 ms
Execution time: 0.025 ms
(4 rows)
Offset
When OFFSET is added to a query, there is no dedicated node type to handle it; it is processed by a different form of Limit. Essentially, the first rows produced by the underlying scan node are thrown away. This is easy to see with a variation of the LIMIT query:
dellstore2=# explain analyze select customerid from customers offset 10 limit 10;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------
Limit (cost=0.34..0.69 rows=10 width=4) (actual time=0.010..0.012 rows=10 loops=1)
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=4) (actual time=0.007..0.010 rows=20 loops=1)
Planning time: 0.052 ms
Execution time: 0.023 ms
(4 rows)
Note how the Seq Scan node produced 20 rows of output this time. The first 10 were skipped by the OFFSET, the next 10 satisfied the Limit node above it, and at that point the query was done.
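Because the skipped rows are still produced by the scan and only then thrown away, large offsets get progressively more expensive. A sketch to see this on the same table (watch the actual rows figure on the scan node):

-- The scan has to produce 10010 rows just so the first 10000 can be
-- discarded before the LIMIT is satisfied.
EXPLAIN ANALYZE SELECT customerid FROM customers OFFSET 10000 LIMIT 10;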
Aggregate
Aggregate functions take a series of values and produce a single output, for example AVG(), COUNT(), EVERY(), MIN(), MAX(), STDDEV(), SUM() and VARIANCE(). Computing an aggregate normally means reading all of the rows and feeding them into an Aggregate node that works out the result:
dellstore2=# explain analyze select max(zip) from customers;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
Aggregate (cost=738.00..738.01 rows=1 width=4) (actual time=4.938..4.939 rows=1 loops=1)
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=4) (actual time=0.006..1.550 rows=20000 loops=1)
Planning time: 0.079 ms
Execution time: 4.961 ms
(4 rows)
That is not always the case, though, because some values can be computed using an index:
dellstore2=# explain analyze select max(customerid) from customers;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------------------------
Result (cost=0.32..0.33 rows=1 width=4) (actual time=8.271..8.272 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.29..0.32 rows=1 width=4) (actual time=8.264..8.264 rows=1 loops=1)
-> Index Only Scan Backward using customers_pkey on customers (cost=0.29..578.29 rows=20000 width=4) (actual time=8.262..8.262 rows=1 loops=1)
Index Cond: (customerid IS NOT NULL)
Heap Fetches: 0
Planning time: 0.101 ms
Execution time: 8.295 ms
(8 rows)
HashAggregate
As a versatile node type, HashAggregate takes its input rows, hashes them into buckets, and outputs a set of derived data. Hashing avoids having to sort the values, which can sometimes be an expensive step.
Aggregates computed with GROUP BY, and sometimes even DISTINCT, may use a HashAggregate to compute their output:
dellstore2=# explain analyze select category,count(*) from products group by category order by category;
QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
Sort (cost=251.48..251.52 rows=16 width=12) (actual time=7.027..7.030 rows=16 loops=1)
Sort Key: category
Sort Method: quicksort Memory: 25kB
-> HashAggregate (cost=251.00..251.16 rows=16 width=12) (actual time=7.012..7.014 rows=16 loops=1)
Group Key: category
-> Seq Scan on products (cost=0.00..201.00 rows=10000 width=4) (actual time=0.401..4.779 rows=10000 loops=1)
Planning time: 10.014 ms
Execution time: 7.147 ms
(8 rows)
Unique
A Unique node takes a sorted set of rows as input and outputs the set with duplicate rows removed. It can appear for DISTINCT and for deduplicating UNION, and its output keeps the input's ordering. Note that recent planner versions usually prefer a HashAggregate for this instead, which is what the following example actually shows (a sketch for provoking a real Unique node follows it):
dellstore2=# explain analyze select distinct(state) from customers;;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=738.00..738.51 rows=51 width=3) (actual time=6.393..6.398 rows=52 loops=1)
Group Key: state
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=3) (actual time=0.006..1.628 rows=20000 loops=1)
Planning time: 0.269 ms
Execution time: 6.432 ms
(5 rows)
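To see a classic Sort plus Unique plan instead, you can disable hash aggregation for the session; a minimal sketch (enable_hashagg is a standard planner switch, used here only for experimentation):

-- With hash aggregation disabled, the planner typically sorts the input
-- and removes adjacent duplicates with a Unique node.
SET enable_hashagg = off;
EXPLAIN ANALYZE SELECT DISTINCT state FROM customers;
RESET enable_hashagg;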
Result
Sometimes a node only needs to return the result of evaluating an expression:
dellstore2=# explain analyze select 1;
QUERY PLAN
------------------------------------------------------------------------------------
Result (cost=0.00..0.01 rows=1 width=4) (actual time=0.001..0.001 rows=1 loops=1)
Planning time: 0.022 ms
Execution time: 0.010 ms
(3 rows)
When handling a single value rather than a set of rows, the Result node is essentially a fast pass-through. Result nodes can also be folded into other plans, which allows the parts of a WHERE clause that only need to be computed once to be optimized:
dellstore2=# explain analyze select * from customers where customerid = (select min(customerid) from customers)+1;
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using customers_pkey on customers (cost=0.62..8.63 rows=1 width=268) (actual time=0.028..0.028 rows=1 loops=1)
Index Cond: (customerid = ($1 + 1))
InitPlan 2 (returns $1)
-> Result (cost=0.32..0.33 rows=1 width=4) (actual time=0.019..0.019 rows=1 loops=1)
InitPlan 1 (returns $0)
-> Limit (cost=0.29..0.32 rows=1 width=4) (actual time=0.017..0.017 rows=1 loops=1)
-> Index Only Scan using customers_pkey on customers customers_1 (cost=0.29..578.29 rows=20000 width=4) (actual time=0.017..0.017 rows=1 loops=1)
Index Cond: (customerid IS NOT NULL)
Heap Fetches: 0
Planning time: 0.546 ms
Execution time: 0.054 ms
(11 rows)
Append
Like UNION, Append is another node type that is not as common as it used to be (a UNION ALL variation is sketched after the example):
dellstore2=# explain analyze select customerid,city from customers where state = 'MA' union select customerid,city from customers where state = 'MD';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=1482.02..1486.03 rows=401 width=122) (actual time=5.632..5.742 rows=401 loops=1)
Group Key: customers.customerid, customers.city
-> Append (cost=0.00..1480.01 rows=401 width=122) (actual time=0.010..5.497 rows=401 loops=1)
-> Seq Scan on customers (cost=0.00..738.00 rows=214 width=12) (actual time=0.010..2.710 rows=214 loops=1)
Filter: ((state)::text = 'MA'::text)
Rows Removed by Filter: 19786
-> Seq Scan on customers customers_1 (cost=0.00..738.00 rows=187 width=12) (actual time=0.020..2.742 rows=187 loops=1)
Filter: ((state)::text = 'MD'::text)
Rows Removed by Filter: 19813
Planning time: 0.310 ms
Execution time: 5.787 ms
(11 rows)
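The HashAggregate sitting on top of the Append exists only to remove the duplicates that UNION requires; UNION ALL skips that step, so its plan is normally just the Append over the two scans. A minimal sketch:

-- UNION ALL does not deduplicate, so no HashAggregate is needed on top.
EXPLAIN ANALYZE
SELECT customerid, city FROM customers WHERE state = 'MA'
UNION ALL
SELECT customerid, city FROM customers WHERE state = 'MD';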
Group
The Group node was how GROUP BY was implemented in early PostgreSQL versions. It requires its input to already be sorted on the set of grouping columns. Current versions usually choose a HashAggregate instead, which is what the example below shows:
dellstore2=# explain analyze select state,count(*) from customers group by state;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=788.00..788.51 rows=51 width=11) (actual time=7.626..7.633 rows=52 loops=1)
Group Key: state
-> Seq Scan on customers (cost=0.00..688.00 rows=20000 width=3) (actual time=0.009..1.562 rows=20000 loops=1)
Planning time: 0.056 ms
Execution time: 7.684 ms
(5 rows)
Subquery Conversion and IN Lists
Some things you might expect to execute as subqueries are actually turned into a type of join. That happens when a subquery is used to find a list of rows that is then used by IN (an equivalent EXISTS form is sketched after the example):
dellstore2=# explain analyze select * from orders where customerid in (select customerid from customers where state = 'MD');
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------
Hash Semi Join (cost=740.34..994.61 rows=112 width=30) (actual time=3.003..8.050 rows=120 loops=1)
Hash Cond: (orders.customerid = customers.customerid)
-> Seq Scan on orders (cost=0.00..220.00 rows=12000 width=30) (actual time=0.362..4.114 rows=12000 loops=1)
-> Hash (cost=738.00..738.00 rows=187 width=4) (actual time=2.618..2.618 rows=187 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 15kB
-> Seq Scan on customers (cost=0.00..738.00 rows=187 width=4) (actual time=0.020..2.580 rows=187 loops=1)
Filter: ((state)::text = 'MD'::text)
Rows Removed by Filter: 19813
Planning time: 11.360 ms
Execution time: 8.094 ms
(10 rows)
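The equivalent EXISTS form is normally planned the same way, as a semi join, rather than re-running the subquery once per outer row; a minimal sketch:

-- An EXISTS subquery that the planner typically turns into a semi join.
EXPLAIN ANALYZE
SELECT * FROM orders o
WHERE EXISTS (SELECT 1
              FROM customers c
              WHERE c.customerid = o.customerid
                AND c.state = 'MD');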
Set Operations
dellstore2=# explain analyze select * from customers where state = 'MD' intersect select * from customers where zip = '21340';
QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------
HashSetOp Intersect (cost=0.00..1487.28 rows=1 width=1566) (actual time=13.592..13.592 rows=1 loops=1)
-> Append (cost=0.00..1477.88 rows=188 width=1566) (actual time=0.034..12.896 rows=188 loops=1)
-> Subquery Scan on "*SELECT* 2" (cost=0.00..738.01 rows=1 width=272) (actual time=0.034..7.547 rows=1 loops=1)
-> Seq Scan on customers (cost=0.00..738.00 rows=1 width=268) (actual time=0.033..7.546 rows=1 loops=1)
Filter: (zip = 21340)
Rows Removed by Filter: 19999
-> Subquery Scan on "*SELECT* 1" (cost=0.00..739.87 rows=187 width=272) (actual time=0.025..5.327 rows=187 loops=1)
-> Seq Scan on customers customers_1 (cost=0.00..738.00 rows=187 width=268) (actual time=0.023..5.278 rows=187 loops=1)
Filter: ((state)::text = 'MD'::text)
Rows Removed by Filter: 19813
Planning time: 0.210 ms
Execution time: 13.716 ms
(12 rows)
Materialize
Normally, nodes return their output rows as their parent node asks for them. But sometimes, when executing a subquery or on the inner side of a join, the planner can decide it is more efficient to materialize a node instead: the child node's entire row set is fetched in one go, rather than one row at a time whenever the upper node asks for it.
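None of the dellstore2 examples above happen to produce one, but a typical place a Materialize node appears is on the inner side of a Nested Loop that has to be rescanned many times, such as a join on an inequality condition that can use neither a hash nor a merge join. A sketch (the exact plan depends on version and statistics, so the Materialize node may or may not show up):

-- An inequality join on unindexed columns usually forces a Nested Loop;
-- the small inner scan is then often wrapped in a Materialize node so
-- that rescanning it for every outer row is cheap.
EXPLAIN
SELECT c.customerid, cat.categoryname
FROM customers c
JOIN categories cat ON cat.categoryname > c.state;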
CTE Scan
Common table expressions (CTEs) add a powerful new way to write queries, even allowing recursion in SQL. One way to use them is as a form of inline view, which makes it easy to show the type of plan node this feature produces:
dellstore2=# explain analyze with monthlysales as (select extract(year from orderdate) as year,extract(month from orderdate) as month,sum(netamount) as sales from orders group by year,month) select year,sum(sales) as sales from monthlysales group by year;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=447.34..449.84 rows=200 width=40) (actual time=10.674..10.674 rows=1 loops=1)
Group Key: monthlysales.year
CTE monthlysales
-> HashAggregate (cost=430.00..438.21 rows=365 width=48) (actual time=10.655..10.659 rows=12 loops=1)
Group Key: date_part('year'::text, (orders.orderdate)::timestamp without time zone), date_part('month'::text, (orders.orderdate)::timestamp without time zone)
-> Seq Scan on orders (cost=0.00..340.00 rows=12000 width=22) (actual time=0.024..6.112 rows=12000 loops=1)
-> CTE Scan on monthlysales (cost=0.00..7.30 rows=365 width=40) (actual time=10.659..10.668 rows=12 loops=1)
Planning time: 1.527 ms
Execution time: 10.734 ms
(9 rows)
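One version-related caveat: the plan above (and its CTE Scan node) comes from a release where a CTE is always computed once and then scanned. Starting with PostgreSQL 12, a simple non-recursive CTE like this may instead be inlined into the outer query unless materialization is requested explicitly; a minimal sketch of that newer syntax:

-- PostgreSQL 12 and later: force the CTE to be computed once and read
-- through a CTE Scan, instead of being inlined into the outer query.
EXPLAIN
WITH monthlysales AS MATERIALIZED (
    SELECT extract(year FROM orderdate) AS year,
           extract(month FROM orderdate) AS month,
           sum(netamount) AS sales
    FROM orders
    GROUP BY year, month
)
SELECT year, sum(sales) AS sales
FROM monthlysales
GROUP BY year;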