一个改善含有ROWNUM条件语句性能的案例
数据库版本: 10.2.0.3;
案例背景:我们有一个Java程序会从一个配置文件中读取并执行以下一条SQL语句,由查询结果确认是否存在满足条件的数据记录,然后据此再执行后续代码。
最初的语句为:
- Select count(1) as rowcount
- from B2B_BIZ_KEY B
- WHERE EXISTS
- (SELECT ID FROM B2B_TRANSACTION T
- WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')));
程序的中判断条件为rowcount是否大于0.
表B2B_BIZ_KEY是表B2B_TRANSACTION子表,它有一个外键(已建索引)在字段(TRANSACTION_ID)上,指向B2B_TRANSACTION的主键 (ID)。尽管在B2B_TRANSACTION上还存在一个索引(MSG_REQ_ID, ID),但由于这两张表中的数据量很大,因此语句的性能很差。以下是其执行后的性能统计数据。
- Elapsed: 00:02:12.48
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=165253 Card=1 Bytes=71)
- 1 0 SORT (AGGREGATE)
- 2 1 HASH JOIN (RIGHT SEMI) (Cost=165253 Card=35644057 Bytes=2530728047)
- 3 2 INDEX (FAST FULL SCAN) OF 'B2B_TRANSACTION_UN' (INDEX(UNIQUE)) (Cost=15035 Card=8184629 Bytes=384677563)
- 4 2 INDEX (FAST FULL SCAN) OF 'B2B_BIZ_KEY_UN' (INDEX (UNIQUE)) (Cost=65567 Card=36056379 Bytes=865353096)
- Statistics
- ----------------------------------------------------------
- 943 recursive calls
- 0 db block gets
- 363128 consistent gets
- 416008 physical reads
- 0 redo size
- 353 bytes sent via SQL*Net to client
- 435 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
通过逻辑分析,我们知道实际上根本需要获取满足条件的所有记录数。因此,我们可以给语句加上“ROWNUM<=1”条件。
修改后的语句是这样的:
- Select count(1) as rowcount
- from B2B_BIZ_KEY B
- WHERE EXISTS
- (SELECT ID FROM B2B_TRANSACTION T
- WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))
- AND ROWNUM <= 1;
但是我们发现,尽管语句的性能确实得到很大地提高,但并未达到我们期望的效果。运行结果如下:
- Elapsed: 00:00:19.76
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=165253 Card=1 Bytes=71)
- 1 0 SORT (AGGREGATE)
- 2 1 COUNT (STOPKEY)
- 3 2 HASH JOIN (RIGHT SEMI) (Cost=165253 Card=35644057 Bytes=2530728047)
- 4 3 INDEX (FAST FULL SCAN) OF 'B2B_TRANSACTION_UN' (INDEX (UNIQUE)) (Cost=15035 Card=8184629 Bytes=384677563)
- 5 3 INDEX (FAST FULL SCAN) OF 'B2B_BIZ_KEY_UN' (INDEX (UNIQUE)) (Cost=65567 Card=36056379 Bytes=865353096)
- Statistics
- ----------------------------------------------------------
- 249 recursive calls
- 0 db block gets
- 66098 consistent gets
- 13694 physical reads
- 0 redo size
- 350 bytes sent via SQL*Net to client
- 435 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
我们假设"ROWNUM <= 1"的目的是为了使语句避免扫描没必要的记录,同时让优化器能基于这样的目的对执行计划进行优化————这种优化模式也就是我们所知的取前K条记录(First K rows)的优化模式。然而,由于COUNT(1)的存在,优化器知道服务端与客户端之间只会交互一次,因此认为取前K条记录的优化模式对于这条语句是没有意义的。如果将COUNT移除,运行结果如下:
- HELLODBA.COM> Select 1 as rowcount
- 2 from B2B_BIZ_KEY B
- 3 WHERE EXISTS
- 4 (SELECT ID FROM B2B_TRANSACTION T
- 5 WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))
- 6 AND ROWNUM<=1;
- Elapsed: 00:00:00.43
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=8 Card=1 Bytes=71)
- 1 0 COUNT (STOPKEY)
- 2 1 NESTED LOOPS (SEMI) (Cost=8 Card=3 Bytes=213)
- 3 2 TABLE ACCESS (FULL) OF 'B2B_BIZ_KEY' (TABLE) (Cost=2 Card=22996310 Bytes=551911440)
- 4 2 INDEX (RANGE SCAN) OF 'B2B_TRANSACTION_IX2' (INDEX) (Cost=2 Card=5456419 Bytes=256451693)
- Statistics
- ----------------------------------------------------------
- 1 recursive calls
- 0 db block gets
- 7 consistent gets
- 0 physical reads
- 0 redo size
- 350 bytes sent via SQL*Net to client
- 435 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
这个结果正式我们期望的。不过,如果仅仅将语句修改成这样,那么会导致程序缺陷。因为在我们的Java代码中并没有处理NO_DATA_FOUND的异常。我们也不想再修改Java代码。那还有其他办法吗?我曾经考虑过,如果能将上述结果写入一张临时表,然后再对临时表计数,应该可以解决上述问题。因此,我尝试将语句写出以下形式:
- HELLODBA.COM> with V as
- 2 (Select /*+ MATERIALIZE qb_name(wv)*/1
- 3 from B2B_BIZ_KEY B
- 4 WHERE EXISTS
- 5 (SELECT ID FROM B2B_TRANSACTION T
- 6 WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))
- 7 AND ROWNUM<=1)
- 8 select count(1) from v;
- Elapsed: 00:00:15.84
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=165255 Card=1)
- 1 0 TEMP TABLE TRANSFORMATION
- 2 1 LOAD AS SELECT
- 3 2 COUNT (STOPKEY)
- 4 3 HASH JOIN (RIGHT SEMI) (Cost=165253 Card=35644057 Bytes=2530728047)
- 5 4 INDEX (FAST FULL SCAN) OF 'B2B_TRANSACTION_UN' (INDEX (UNIQUE)) (Cost=15035 Card=8184629 Bytes=384677563)
- 6 4 INDEX (FAST FULL SCAN) OF 'B2B_BIZ_KEY_UN' (INDEX(UNIQUE)) (Cost=65567 Card=36056379 Bytes=865353096)
- 7 1 SORT (AGGREGATE)
- 8 7 VIEW (Cost=2 Card=1)
- 9 8 TABLE ACCESS (FULL) OF 'SYS_TEMP_0FD9FC8C0_51579C6B'(TABLE (TEMP)) (Cost=2 Card=1 Bytes=13)
- Statistics
- ----------------------------------------------------------
- 520 recursive calls
- 10 db block gets
- 66112 consistent gets
- 11595 physical reads
- 1460 redo size
- 350 bytes sent via SQL*Net to client
- 435 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
单从执行计划看,似乎与我的想法比较吻合:SQL引擎先进子查询结果写入一直系统临时表,然后再扫描临时表计数。但是,这个两个过程并不是分开的,而属于同一执行计划————COUNT还是导致优化器没有采纳取前K条记录的优化模式。好吧,那就使用SQL技巧将COUNT消除掉吧:
- HELLODBA.COM> with V as
- 2 (Select 1
- 3 from B2B_BIZ_KEY B
- 4 WHERE EXISTS
- 5 (SELECT ID FROM B2B_TRANSACTION T
- 6 WHERE T.ID=B.TRANSACTION_ID AND T.MSG_REQ_ID <= RPAD('EDI', 11, TO_CHAR(TO_TIMESTAMP('2012-07-31 00:00:00','YYYY-MM-DD HH24:MI:SS.FF'), 'YYYYMMDD')))
- 7 AND ROWNUM<=1)
- 8 select nvl2((select 1 from v),1,0) as aaa from dual;
- Elapsed: 00:00:00.21
- Execution Plan
- ----------------------------------------------------------
- 0 SELECT STATEMENT Optimizer=ALL_ROWS (Cost=2 Card=1)
- 1 0 VIEW (Cost=8 Card=1)
- 2 1 COUNT (STOPKEY)
- 3 2 NESTED LOOPS (SEMI) (Cost=8 Card=3 Bytes=213)
- 4 3 TABLE ACCESS (FULL) OF 'B2B_BIZ_KEY' (TABLE) (Cost=2Card=22996310 Bytes=551911440)
- 5 3 INDEX (RANGE SCAN) OF 'B2B_TRANSACTION_IX2' (INDEX)(Cost=2 Card=5456419 Bytes=256451693)
- 6 0 FAST DUAL (Cost=2 Card=1)
- Statistics
- ----------------------------------------------------------
- 0 recursive calls
- 0 db block gets
- 7 consistent gets
- 0 physical reads
- 0 redo size
- 345 bytes sent via SQL*Net to client
- 435 bytes received via SQL*Net from client
- 2 SQL*Net roundtrips to/from client
- 0 sorts (memory)
- 0 sorts (disk)
- 1 rows processed
Ok. 性能满足要求,也不存在 NO_DATA_FOUND 异常的问题了。
--- Fuyuncat ---
转自 http://blog.csdn.net/haoxiaoyan/article/details/14553047