一个RDBMS左连接SQL执行计划解析
1、测试数据如下:
SQL> select * from t1;
a | b | c
---+----+---
1 | 10 | 1
2 | 20 | 2
3 | 30 | 3
4 | 40 | 4
5 | 50 | 5
6 | 60 | 6
(6 rows)
SQL> select * from t2;
a | b | d
---+----+---
1 | 10 | 1
2 | 20 | 2
3 | 30 | 3
(3 rows)
2、解析示例SQL 如下 :
select *
from (
select * from t1 where c >= 2
) t1 left join (
select * from t2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1
where t1.b < 50
;
3、Oracle数据库查看执行结果及执行计划:
SQL> select *
from (
select * from t1 where c >= 2
) t1 left join (
select * from t2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1
where t1.b < 50
;
A B C A B D
---------- ---------- ---------- ---------- ---------- ----------
2 20 2 2 20 2
3 30 3
4 40 4
Execution Plan
----------------------------------------------------------
Plan hash value: 1823443478
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 3 | 234 | 7 (15)| 00:00:01 |
|* 1 | HASH JOIN OUTER | | 3 | 234 | 7 (15)| 00:00:01 |
|* 2 | TABLE ACCESS FULL| T1 | 3 | 117 | 3 (0)| 00:00:01 |
|* 3 | TABLE ACCESS FULL| T2 | 1 | 39 | 3 (0)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("T1"."A"="T2"."A"(+))
2 - filter("T1"."B"<50 AND "C">=2)
3 - filter("T2"."D"(+)>1 AND "B"(+)<30)
Note
-----
- dynamic sampling used for this statement (level=2)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
7 consistent gets
0 physical reads
0 redo size
926 bytes sent via SQL*Net to client
523 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
3 rows processed
4、PGSQL数据库查看执行结果及执行计划:
postgres=# select *
postgres-# from (
postgres(# select * from t1 where c >= 2
postgres(# ) t1 left join (
postgres(# select * from t2 where b < 30
postgres(# ) t2 on t1.a = t2.a
postgres-# and t2.d > 1
postgres-# where t1.b < 50
postgres-# ;
a | b | c | a | b | d
---+----+---+---+----+---
2 | 20 | 2 | 2 | 20 | 2
3 | 30 | 3 | | |
4 | 40 | 4 | | |
(3 rows)
postgres=# explain analyze select *
postgres-# from (
postgres(# select * from t1 where c >= 2
postgres(# ) t1 left join (
postgres(# select * from t2 where b < 30
postgres(# ) t2 on t1.a = t2.a
postgres-# and t2.d > 1
postgres-# where t1.b < 50
postgres-# ;
QUERY PLAN
------------------------------------------------------------------------------------------------------------
Hash Left Join (cost=37.04..85.88 rows=197 width=24) (actual time=0.020..0.027 rows=3 loops=1)
Hash Cond: ("outer".a = "inner".a)
-> Seq Scan on t1 (cost=0.00..36.55 rows=197 width=12) (actual time=0.005..0.008 rows=3 loops=1)
Filter: ((c >= 2) AND (b < 50))
-> Hash (cost=36.55..36.55 rows=197 width=12) (actual time=0.006..0.006 rows=1 loops=1)
-> Seq Scan on t2 (cost=0.00..36.55 rows=197 width=12) (actual time=0.002..0.003 rows=1 loops=1)
Filter: ((b < 30) AND (d > 1))
Total runtime: 0.052 ms
(8 rows)
5、MySQL数据库查看执行结果及执行计划:
mysql> select *
-> from (
-> select * from t1 where c >= 2
-> ) t1 left join (
-> select * from t2 where b < 30
-> ) t2 on t1.a = t2.a
-> and t2.d > 1
-> where t1.b < 50
-> ;
+---+----+---+------+------+------+
| a | b | c | a | b | d |
+---+----+---+------+------+------+
| 2 | 20 | 2 | 2 | 20 | 2 |
| 3 | 30 | 3 | NULL | NULL | NULL |
| 4 | 40 | 4 | NULL | NULL | NULL |
+---+----+---+------+------+------+
3 rows in set (0.05 sec)
mysql> explain select *
-> from (
-> select * from t1 where c >= 2
-> ) t1 left join (
-> select * from t2 where b < 30
-> ) t2 on t1.a = t2.a
-> and t2.d > 1
-> where t1.b < 50
-> ;
+----+-------------+------------+------+---------------+-------------+---------+------+------+-------------+
| id | select_type | table | type | possible_keys | key | key_len | ref | rows | Extra |
+----+-------------+------------+------+---------------+-------------+---------+------+------+-------------+
| 1 | PRIMARY | <derived2> | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
| 1 | PRIMARY | <derived3> | ref | <auto_key0> | <auto_key0> | 8 | t1.a | 1 | Using where |
| 3 | DERIVED | t2 | ALL | NULL | NULL | NULL | NULL | 3 | Using where |
| 2 | DERIVED | t1 | ALL | NULL | NULL | NULL | NULL | 6 | Using where |
+----+-------------+------------+------+---------------+-------------+---------+------+------+-------------+
4 rows in set (0.00 sec)
6、针对以上SQL执行计划的分析:
1) 全表扫描左表T1,同时根据T1表子查询条件"C">=2和where过滤条件"T1"."B"<50联合过滤,即filter("T1"."B"<50 AND "C">=2),计算结果临时表记为tmp1;
2) 全表扫描右表T2,同时根据T2表子查询条件"B"(+)<30和on子句"T2"."D"(+)>1联合过滤,即filter("T2"."D"(+)>1 AND "B"(+)<30),计算结果临时表记为tmp2;
3) 左表T1及右表T2处理后临时表tmp1和tmp2通过access("T1"."A"="T2"."A"(+))连接条件进行Hash Left Join操作,左临时表结果集全量返回,右表不匹配行置为null,返回结果临时表记为tmp3;
4) 返回结果集。
7、一些更为复杂得SQL如下,有兴趣自行研究:
1) 测试数据
create table tmp1 as
select a,b,c,a as e from t1;
create table tmp2 as
select a,b,d,a as e from t2;
2) 示例SQL
select *
from (
select * from tmp1 where c >= 1
) t1 left join (
select * from tmp2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1 and t1.e >= 2
where t1.b < 50
;
select *
from (
select * from tmp1 where c >= 1
) t1 left join (
select * from tmp2 where b < 30
) t2 on t1.a = t2.a
and t2.d > 1 and t1.e >= 2
where t1.b < 50 and t2.e <= 3
;