In和exists使用及性能分析(三):in和exists的性能分析
本节主要讨论in和exists的性能。
三、in和exists的性能分析
3.1 基于8i数据库使用in和exists
3.1.1数据准备
SQL> create table big_table as select * from dba_objects;
Table created
SQL> insert into big_table select * from big_table;
3160 rows inserted
........
SQL> create table small_table as select * from dba_objects where rownum<50;
Table created
SQL> create index idx_big_table on big_table(object_id);
Index created
SQL> create index idx_small_table on small_table(object_id);
Index created
SQL> analyze table big_table compute statistics;
Table analyzed
SQL> analyze table small_table compute statistics;
Table analyzed
SQL> select count(1) from big_table;
COUNT(1)
----------
202240
SQL> select count(1) from small_table;
COUNT(1)
----------
49
3.1.2 外层数据集大于内层数据集
? 8i CBO下in和exists比较ora8@ORACLE8I> select count(1) from big_table b where b.object_id in (select object_id from small_table s);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=52 Card=1 Bytes=16)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=52 Card=3136 Bytes=50176)
3 2 VIEW OF 'VW_NSO_1' (Cost=3 Card=49 Bytes=637)
4 3 SORT (UNIQUE) (Cost=3 Card=49 Bytes=147)
5 4 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (Cost=1 Card=
49 Bytes=147)
6 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=202240 Bytes=606720)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
115 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
367 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processedora8@ORACLE8I> select count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=363 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=363 Card=1011
2 Bytes=30336)
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE) (
Cost=1 Card=1 Bytes=3)
Statistics
----------------------------------------------------------
0 recursive calls
5 db block gets
188435 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
从以上结果可以分析:8i在CBO优化器下,如果外层数据集远大于内层数据集,则in操作的效率远高于exists的效率。
? 8i RBO下in和exists比较ora8@ORACLE8I> select /*+ rule */count(1) from big_table b where b.object_id in (select object_id from small_table s);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
109 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
378 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedora8@ORACLE8I> select /*+ rule */ count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
5 db block gets
188435 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
393 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可以看出,当外层数据远大于内层数据时,在8i RBO优化器下,用in的操作比exists效率高很多。
3.1.3 外层数据集小于内层数据集
? 8i CBO下in和exists比较ora8@ORACLE8I> select count(1) from small_table s where s.object_id in (select object_id from big_table b);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1463 Card=1 Bytes=16
)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=1463 Card=49 Bytes=784)
3 2 VIEW OF 'VW_NSO_1' (Cost=1463 Card=3159 Bytes=41067)
4 3 SORT (UNIQUE) (Cost=1463 Card=3159 Bytes=9477)
5 4 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=363 Card=
202240 Bytes=606720)
6 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
1021 recursive calls
359 db block gets
5861 consistent gets
604 physical reads
19796 redo size
365 bytes sent via SQL*Net to client
367 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
1 sorts (disk)
1 rows processedora8@ORACLE8I> select count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (Cost=1 Card=3 By
tes=9)
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=65 Bytes=195)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
103 consistent gets
0 physical reads
0 redo size
365 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
从以上结果可以看出,外层数据集小,内层数据集大的情况时,在oracle 8i CBO环境下,用in的效率比exists低很多。
? 8i RBO下in和exists比较 ora8@ORACLE8I> select /*+ rule */count(1) from small_table s where s.object_id in (select object_id from big_table s);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'BIG_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
1021 recursive calls
363 db block gets
5859 consistent gets
604 physical reads
20436 redo size
365 bytes sent via SQL*Net to client
379 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1 rows processedora8@ORACLE8I> select /*+ rule */ count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
103 consistent gets
0 physical reads
0 redo size
365 bytes sent via SQL*Net to client
394 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
由上面的执行计划可以看到,8i在RBO优化器下,如果外层数据集远小于内层数据集,则用exists的效率远高于用in。
3.2 基于9i数据库使用in和exists
3.2.1 准备表及数据
SQL> create table big_table as select * from dba_objects;
Table created
SQL> insert into big_table select * from big_table;
6299 rows inserted
SQL> /
12598 rows inserted
SQL> /
25196 rows inserted
SQL> /
50392 rows inserted
SQL> /
100784 rows inserted
根据exists都运行机理,我们在big_table和small_table都object_id上建立索引,以让exists高效。下面都比较都是基于关联字段有合适索引都基础上讨论的
SQL> create table small_table as select * from dba_objects where rownum<50;
Table created
SQL> create index idx_big_table on big_table(object_id);
Index created
SQL> create index idx_small_table on small_table(object_id);
Index created
SQL> analyze table big_table compute statistics;
Table analyzed
SQL> analyze table small_table compute statistics;
Table analyzed
SQL> select count(1) from big_table;
COUNT(1)
----------
198560
SQL> select count(1) from small_table;
COUNT(1)
----------
49
3.2.2 外层数据集大于内层数据集
? 9i CBO下in和exists比较suk@ORACLE9I> select count(1) from big_table b where b.object_id in (select object_id from small_table s);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=42 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=42 Card=1568 Bytes=9408)
3 2 INDEX (FAST FULL SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
(Cost=42 Card=198560 Bytes=595680)
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
424 consistent gets
0 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processedsuk@ORACLE9I> select count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=42 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=42 Card=1568 Bytes=9408)
3 2 INDEX (FAST FULL SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
(Cost=42 Card=198560 Bytes=595680)
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
424 consistent gets
0 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见,在9i的CBO优化器下,in和exists的执行计划是一样的,也就是说,它们效率是一样的
? 9i RBO下的in和exists比较suk@ORACLE9I> select /*+ rule */count(1) from big_table b where b.object_id in (select object_id from small_table s);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
60 consistent gets
0 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedsuk@ORACLE9I> select /*+ rule */ count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
192915 consistent gets
300 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见,在9i的RBO优化器下,如果外层数据集远大于内层数据集,则用in的效率远远高于exists。
3.2.3外层数据集小于内层数据集
? 9i CBO下的in和exists比较suk@ORACLE9I> select count(1) from small_table s where s.object_id in (select object_id from big_table b);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=50 Card=49 Bytes=294)
3 2 INDEX (FULL SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE) (C
ost=1 Card=49 Bytes=147)
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=198432 Bytes=595296)suk@ORACLE9I> select count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=50 Card=49 Bytes=294)
3 2 INDEX (FULL SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE) (C
ost=1 Card=49 Bytes=147)
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=198432 Bytes=595296)
从以上结果可以得出结论:9i的CBO优化器下,in和exists执行路径一样,效率也就是一样的。
? RBO下的in和exists比较suk@ORACLE9I> select /*+ rule */ count(1) from small_table s where s.object_id in (select object_id from big_table b);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'BIG_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2421 consistent gets
1386 physical reads
0 redo size
376 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedsuk@ORACLE9I> select /*+ rule */ count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
101 consistent gets
0 physical reads
0 redo size
376 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见,在9i的RBO优化器下,如果外层数据集远小于内层数据集,则用exists的效率要远高于in。
3.3 简单总结:
1、在相关字段有合适索引的情况下:
(1)当外层数据集远大于内层数据集时:
8i RBO/CBO优化器下都是in比exists更高效
9i 在RBO下in比exists高效,在CBO下in和exists执行计划相同,效率也相同。
(2)当外层数据集远小于内层数据集时:
8i RBO/CBO优化器下都是exists比in更高效
9i 在RBO下exists比in高效,在CBO下in和exists执行计划相同,效率也相同。
2、根据in和exists的原理,不难推断,如果相关字段没有合适索引,得到的结论和有合适索引时一致。
本文是讨论的是相对极端的例子,如果外层数据集和内层数据集大小差不多,则in和exists的效率那个更高一点则取决于其他的因素,此时可以考虑用其他方法,如外连接+ null等。