优化实例- not in 和 not exists
客户运行一个SQL,非常慢。于是进行了一下改写。速度飞快,首先看一下原来的SQL。
original sql
SQL> explain plan for 2 select count(*) from pnadmin.si_vsl where vsl_status_i = 'A' and to_number(vsl_id_n) not in (select vessel_id from pnadmin.vessel_master); Explained. SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 3613440888 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 9 | 9466K (4)| 31:33:15 | | 1 | SORT AGGREGATE | | 1 | 9 | | | |* 2 | FILTER | | | | | | |* 3 | TABLE ACCESS FULL| SI_VSL | 35069 | 308K| 642 (5)| 00:00:08 | |* 4 | TABLE ACCESS FULL| VESSEL_MASTER | 1 | 5 | 352 (4)| 00:00:05 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter( NOT EXISTS (SELECT 0 FROM "PNADMIN"."VESSEL_MASTER" "VESSEL_MASTER" WHERE LNNVL("VESSEL_ID"<>TO_NUMBER(:B1)))) 3 - filter("VSL_STATUS_I"='A') 4 - filter(LNNVL("VESSEL_ID"<>TO_NUMBER(:B1))) 19 rows selected.
可以看到是filter 操作。 filter操作在之前的blog中有介绍(http://www.cnblogs.com/kramer/archive/2013/04/12/3017013.html)。 这是一个类似于nested loop的操作,可以理解为对SI_VSL的每一行都要对VESSEL_MASTER做一次全表扫描,所以这个很显然是会非常慢的。 于是想到如果用hash join可能会快很多。 两种办法,一种是改写,一种是用 hash hint。 我用的是改写
new sql
SQL> explain plan for 2 select count(*) from PNADMIN.si_vsl a where a.vsl_status_i = 'A' and not exists (select 1 from PNADMIN.vessel_master b where b.vessel_id=to_number(a.vsl_id_n)); Explained. SQL> select * from table(dbms_xplan.display()); PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------- Plan hash value: 3198032547 ------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 14 | 995 (4)| 00:00:12 | | 1 | SORT AGGREGATE | | 1 | 14 | | | |* 2 | HASH JOIN ANTI | | 35067 | 479K| 995 (4)| 00:00:12 | |* 3 | TABLE ACCESS FULL| SI_VSL | 35069 | 308K| 641 (5)| 00:00:08 | | 4 | TABLE ACCESS FULL| VESSEL_MASTER | 74163 | 362K| 348 (3)| 00:00:05 | ------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("B"."VESSEL_ID"=TO_NUMBER("A"."VSL_ID_N")) 3 - filter("A"."VSL_STATUS_I"='A')
新的SQL用hash 非常快。 但是我想通过hash hint的方式修改原sql 效果不好。 也许是因为对hash hint理解不深,明天好好研究一下。
--
后来发现 hash_aj hint在 oracle10g中不支持了。 同样的还有很多
The AND_EQUAL
, HASH_AJ
, MERGE_AJ
, NL_AJ
, HASH_SJ
, MERGE_SJ
, NL_SJ
, EXPAND_GSET_TO_UNION
, ORDERED_PREDICATES
, ROWID
, and STAR
hints have been deprecated and should not be used. 详细请看下面的URL
http://docs.oracle.com/cd/B13789_01/server.101/b10752/whatsnew.htm
OK 既然hash hint不支持了 我们就不再纠结了 。 本文只要记得 至少在10g里面 通过 not in exist 等的转换还可以获得不同的执行计划就好了。