In和exists使用及性能分析(三)：in和exists的性能分析

本节主要讨论in和exists的性能。

三、in和exists的性能分析
3.1 基于8i数据库使用in和exists
3.1.1数据准备
SQL> create table big_table as select * from dba_objects;
Table created
SQL> insert into big_table select * from big_table;
3160 rows inserted
........
SQL> create table small_table as select * from dba_objects where rownum<50;
Table created
SQL> create index idx_big_table on big_table(object_id);
Index created
SQL> create index idx_small_table on small_table(object_id);
Index created
SQL> analyze table big_table compute statistics;
Table analyzed
SQL> analyze table small_table compute statistics;
Table analyzed
SQL> select count(1) from big_table;
COUNT(1)
----------
202240
SQL> select count(1) from small_table;
COUNT(1)
----------
49
3.1.2 外层数据集大于内层数据集
? 8i CBO下in和exists比较ora8@ORACLE8I> select count(1) from big_table b where b.object_id in (select object_id from small_table s);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=52 Card=1 Bytes=16)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=52 Card=3136 Bytes=50176)
3 2 VIEW OF 'VW_NSO_1' (Cost=3 Card=49 Bytes=637)
4 3 SORT (UNIQUE) (Cost=3 Card=49 Bytes=147)
5 4 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (Cost=1 Card=
49 Bytes=147)
6 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=202240 Bytes=606720)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
115 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
367 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
4 sorts (memory)
0 sorts (disk)
1 rows processedora8@ORACLE8I> select count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=363 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=363 Card=1011
2 Bytes=30336)
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE) (
Cost=1 Card=1 Bytes=3)
Statistics
----------------------------------------------------------
0 recursive calls
5 db block gets
188435 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
从以上结果可以分析：8i在CBO优化器下，如果外层数据集远大于内层数据集，则in操作的效率远高于exists的效率。
? 8i RBO下in和exists比较ora8@ORACLE8I> select /*+ rule */count(1) from big_table b where b.object_id in (select object_id from small_table s);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
109 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
378 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedora8@ORACLE8I> select /*+ rule */ count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
5 db block gets
188435 consistent gets
0 physical reads
0 redo size
366 bytes sent via SQL*Net to client
393 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可以看出，当外层数据远大于内层数据时，在8i RBO优化器下，用in的操作比exists效率高很多。
3.1.3 外层数据集小于内层数据集
? 8i CBO下in和exists比较ora8@ORACLE8I> select count(1) from small_table s where s.object_id in (select object_id from big_table b);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1463 Card=1 Bytes=16
)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (Cost=1463 Card=49 Bytes=784)
3 2 VIEW OF 'VW_NSO_1' (Cost=1463 Card=3159 Bytes=41067)
4 3 SORT (UNIQUE) (Cost=1463 Card=3159 Bytes=9477)
5 4 TABLE ACCESS (FULL) OF 'BIG_TABLE' (Cost=363 Card=
202240 Bytes=606720)
6 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
1021 recursive calls
359 db block gets
5861 consistent gets
604 physical reads
19796 redo size
365 bytes sent via SQL*Net to client
367 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
3 sorts (memory)
1 sorts (disk)
1 rows processedora8@ORACLE8I> select count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=1 Card=1 Bytes=3)
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL_TABLE' (Cost=1 Card=3 By
tes=9)
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=65 Bytes=195)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
103 consistent gets
0 physical reads
0 redo size
365 bytes sent via SQL*Net to client
381 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
2 sorts (memory)
0 sorts (disk)
1 rows processed
从以上结果可以看出，外层数据集小，内层数据集大的情况时，在oracle 8i CBO环境下，用in的效率比exists低很多。
? 8i RBO下in和exists比较 ora8@ORACLE8I> select /*+ rule */count(1) from small_table s where s.object_id in (select object_id from big_table s);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'BIG_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
1021 recursive calls
363 db block gets
5859 consistent gets
604 physical reads
20436 redo size
365 bytes sent via SQL*Net to client
379 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
1 sorts (disk)
1 rows processedora8@ORACLE8I> select /*+ rule */ count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
12 db block gets
103 consistent gets
0 physical reads
0 redo size
365 bytes sent via SQL*Net to client
394 bytes received via SQL*Net from client
3 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
由上面的执行计划可以看到，8i在RBO优化器下，如果外层数据集远小于内层数据集，则用exists的效率远高于用in。
3.2 基于9i数据库使用in和exists
3.2.1 准备表及数据
SQL> create table big_table as select * from dba_objects;
Table created
SQL> insert into big_table select * from big_table;
6299 rows inserted
SQL> /
12598 rows inserted
SQL> /
25196 rows inserted
SQL> /
50392 rows inserted
SQL> /
100784 rows inserted
根据exists都运行机理，我们在big_table和small_table都object_id上建立索引，以让exists高效。下面都比较都是基于关联字段有合适索引都基础上讨论的
SQL> create table small_table as select * from dba_objects where rownum<50;
Table created
SQL> create index idx_big_table on big_table(object_id);
Index created
SQL> create index idx_small_table on small_table(object_id);
Index created
SQL> analyze table big_table compute statistics;
Table analyzed
SQL> analyze table small_table compute statistics;
Table analyzed
SQL> select count(1) from big_table;
COUNT(1)
----------
198560
SQL> select count(1) from small_table;
COUNT(1)
----------
49
3.2.2 外层数据集大于内层数据集
? 9i CBO下in和exists比较suk@ORACLE9I> select count(1) from big_table b where b.object_id in (select object_id from small_table s);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=42 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=42 Card=1568 Bytes=9408)
3 2 INDEX (FAST FULL SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
(Cost=42 Card=198560 Bytes=595680)
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
424 consistent gets
0 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processedsuk@ORACLE9I> select count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=42 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=42 Card=1568 Bytes=9408)
3 2 INDEX (FAST FULL SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
(Cost=42 Card=198560 Bytes=595680)
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
424 consistent gets
0 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见，在9i的CBO优化器下，in和exists的执行计划是一样的，也就是说，它们效率是一样的
? 9i RBO下的in和exists比较suk@ORACLE9I> select /*+ rule */count(1) from big_table b where b.object_id in (select object_id from small_table s);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
60 consistent gets
0 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedsuk@ORACLE9I> select /*+ rule */ count(1) from big_table b where exists (select 1 from small_table s where b.object_id=s.object_id);
COUNT(1)
----------
1568
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'BIG_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
192915 consistent gets
300 physical reads
0 redo size
377 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见，在9i的RBO优化器下，如果外层数据集远大于内层数据集，则用in的效率远远高于exists。
3.2.3外层数据集小于内层数据集
? 9i CBO下的in和exists比较suk@ORACLE9I> select count(1) from small_table s where s.object_id in (select object_id from big_table b);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=50 Card=49 Bytes=294)
3 2 INDEX (FULL SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE) (C
ost=1 Card=49 Bytes=147)
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=198432 Bytes=595296)suk@ORACLE9I> select count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=CHOOSE (Cost=50 Card=1 Bytes=6)
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS (SEMI) (Cost=50 Card=49 Bytes=294)
3 2 INDEX (FULL SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE) (C
ost=1 Card=49 Bytes=147)
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE) (Co
st=1 Card=198432 Bytes=595296)
从以上结果可以得出结论：9i的CBO优化器下，in和exists执行路径一样，效率也就是一样的。
? RBO下的in和exists比较suk@ORACLE9I> select /*+ rule */ count(1) from small_table s where s.object_id in (select object_id from big_table b);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 NESTED LOOPS
3 2 VIEW OF 'VW_NSO_1'
4 3 SORT (UNIQUE)
5 4 TABLE ACCESS (FULL) OF 'BIG_TABLE'
6 2 INDEX (RANGE SCAN) OF 'IDX_SMALL_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
2421 consistent gets
1386 physical reads
0 redo size
376 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
1 sorts (memory)
0 sorts (disk)
1 rows processedsuk@ORACLE9I> select /*+ rule */ count(1) from small_table s where exists (select 1 from big_table b where b.object_id=s.object_id);
COUNT(1)
----------
49
Execution Plan
----------------------------------------------------------
0 SELECT STATEMENT Optimizer=HINT: RULE
1 0 SORT (AGGREGATE)
2 1 FILTER
3 2 TABLE ACCESS (FULL) OF 'SMALL_TABLE'
4 2 INDEX (RANGE SCAN) OF 'IDX_BIG_TABLE' (NON-UNIQUE)
Statistics
----------------------------------------------------------
0 recursive calls
0 db block gets
101 consistent gets
0 physical reads
0 redo size
376 bytes sent via SQL*Net to client
503 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
可见，在9i的RBO优化器下，如果外层数据集远小于内层数据集，则用exists的效率要远高于in。
3.3 简单总结：
1、在相关字段有合适索引的情况下：
（1）当外层数据集远大于内层数据集时：
8i RBO/CBO优化器下都是in比exists更高效
9i 在RBO下in比exists高效，在CBO下in和exists执行计划相同，效率也相同。
（2）当外层数据集远小于内层数据集时：
8i RBO/CBO优化器下都是exists比in更高效
9i 在RBO下exists比in高效，在CBO下in和exists执行计划相同，效率也相同。
2、根据in和exists的原理，不难推断，如果相关字段没有合适索引，得到的结论和有合适索引时一致。
本文是讨论的是相对极端的例子，如果外层数据集和内层数据集大小差不多，则in和exists的效率那个更高一点则取决于其他的因素，此时可以考虑用其他方法，如外连接+ null等。

posted on 2008-05-10 22:15 一江水阅读(1137) 评论(0) 收藏举报