一次SQL分页的优化
今天优化了一个分页的SQL,以前虽然做了上千个SQL的优化,不过都是一些OLAP的,虽然也有OLTP的不过从来没做过分页优化,所以这里记录一下。
SQL和执行计划如下:
SQL> SELECT A.ROWNO,EMS_EVENT_VIEW.* FROM EMS_EVENT_VIEW, 2 (SELECT * FROM (SELECT ROW_NUMBER() OVER(ORDER BY first_occurrence_time DESC) AS ROWNO,EVENT_ID 3 FROM EMS_EVENT_VIEW WHERE (first_occurrence_time>to_date('2012-02-22 00:00:00','yyyy-mm-dd hh24:mi:ss') 4 5 and first_occurrence_time<to_date('2012-02-29 09:42:35','yyyy-mm-dd hh24:mi:ss')) or (last_occurrence_time>to_date('2012-02-22 00:00:00','yyyy-mm-dd hh24:mi:ss') and last_occurrence_time<to_date('2012-02-29 09:42:35','yyyy-mm-dd hh24:mi:ss'))) 6 7 8 WHERE ROWNO>=0 AND ROWNO<=20) A 9 WHERE EMS_EVENT_VIEW.EVENT_ID=A.EVENT_ID; Plan hash value: 2052413575 ------------------------------------------------------------------------------------------------------------------------------------------------------- PLAN_TABLE_OUTPUT ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ------------------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 20 |00:00:53.37 | 757K| | | | |* 1 | HASH JOIN | | 1 | 81G| 20 |00:00:53.37 | 757K| 1179K| 1179K| 6598K (0)| |* 2 | VIEW | | 1 | 2104K| 20 |00:00:30.83 | 101K| | | | |* 3 | WINDOW SORT PUSHED RANK | | 1 | 2104K| 21 |00:00:30.83 | 101K| 4096 | 4096 | 4096 (0)| |* 4 | FILTER | | 1 | | 1255K|00:00:30.10 | 101K| | | | | 5 | VIEW | EMS_EVENT_VIEW | 1 | 2104K| 1255K|00:00:28.85 | 101K| | | | | 6 | UNION-ALL | | 1 | | 1255K|00:00:28.85 | 101K| | | | | 7 | CONCATENATION | | 1 | | 0 |00:00:00.01 | 335 | | | | | 8 | TABLE ACCESS BY INDEX ROWID| EMS_EVENT | 1 | 1 | 0 |00:00:00.01 | 63 | | | | |* 9 | INDEX RANGE SCAN | LAST_OCCURRENCE_TIME_INDEX | 1 | 1 | 0 |00:00:00.01 | 63 | | | | |* 10 | TABLE ACCESS BY INDEX ROWID| EMS_EVENT | 1 | 1 | 0 |00:00:00.01 | 272 | | | | |* 11 | INDEX RANGE SCAN | FIRST_OCCURRENCE_INDEX | 1 | 1 | 0 |00:00:00.01 | 272 | | | | |* 12 | VIEW | index_join_006 | 1 | 2104K| 1255K|00:00:28.84 | 100K| | | | |* 13 | HASH JOIN | | 1 | | 3863K|00:00:26.50 | 100K| 195M| 9M| 248M (0)| |* 14 | HASH JOIN | | 1 | | 3863K|00:00:13.87 | 63020 | 160M| 10M| 214M (0)| | 15 | PARTITION RANGE ALL | | 1 | 2104K| 3863K|00:00:00.01 | 31419 | | | | | 16 | INDEX FAST FULL SCAN | IDX_FIRSTTIME_201202 | 338 | 2104K| 3863K|00:00:00.07 | 31419 | | | | | 17 | PARTITION RANGE ALL | | 1 | 2104K| 3863K|00:00:00.01 | 31601 | | | | | 18 | INDEX FAST FULL SCAN | IDX_LASTOCCURRENCE_201202 | 338 | 2104K| 3863K|00:00:00.06 | 31601 | | | | | 19 | INDEX FAST FULL SCAN | PK_EMS_EVENT_HISTORY_201202 | 1 | 2104K| 3863K|00:00:00.01 | 37894 | | | | | 20 | VIEW | EMS_EVENT_VIEW | 1 | 3864K| 3867K|00:00:19.34 | 656K| | | | | 21 | UNION-ALL | | 1 | | 3867K|00:00:15.47 | 656K| | | | | 22 | TABLE ACCESS FULL | EMS_EVENT | 1 | 3867 | 3950 |00:00:00.02 | 2046 | | | | | 23 | PARTITION RANGE ALL | | 1 | 3860K| 3863K|00:00:07.73 | 654K| | | | | 24 | TABLE ACCESS FULL | EMS_EVENT_HISTORY | 338 | 3860K| 3863K|00:00:09.51 | 654K| | | | ------------------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("EMS_EVENT_VIEW"."EVENT_ID"="from_subquery_003"."EVENT_ID") 2 - filter(("ROWNO">=:SYS_B_8 AND "ROWNO"<=:SYS_B_9)) 3 - filter(ROW_NUMBER() OVER ( ORDER BY INTERNAL_FUNCTION("FIRST_OCCURRENCE_TIME") DESC )<=:SYS_B_9) 4 - filter(:SYS_B_8<=:SYS_B_9) 9 - access("A"."LAST_OCCURRENCE_TIME">TO_DATE(:SYS_B_4,:SYS_B_5) AND "A"."LAST_OCCURRENCE_TIME"<TO_DATE(:SYS_B_6,:SYS_B_7)) 10 - filter((LNNVL("A"."LAST_OCCURRENCE_TIME"<TO_DATE(:SYS_B_6,:SYS_B_7)) OR LNNVL("A"."LAST_OCCURRENCE_TIME">TO_DATE(:SYS_B_4,:SYS_B_5)))) 11 - access("A"."FIRST_OCCURRENCE_TIME">TO_DATE(:SYS_B_0,:SYS_B_1) AND "A"."FIRST_OCCURRENCE_TIME"<TO_DATE(:SYS_B_2,:SYS_B_3)) 12 - filter((("B"."FIRST_OCCURRENCE_TIME">TO_DATE(:SYS_B_0,:SYS_B_1) AND "B"."FIRST_OCCURRENCE_TIME"<TO_DATE(:SYS_B_2,:SYS_B_3)) OR ("B"."LAST_OCCURRENCE_TIME">TO_DATE(:SYS_B_4,:SYS_B_5) AND "B"."LAST_OCCURRENCE_TIME"<TO_DATE(:SYS_B_6,:SYS_B_7)))) 13 - access(ROWID=ROWID) 14 - access(ROWID=ROWID)
如果你看不清楚SQL,我在这里再贴一下:
SELECT A.ROWNO, EMS_EVENT_VIEW.*
FROM EMS_EVENT_VIEW,
(SELECT *
FROM (SELECT ROW_NUMBER() OVER(ORDER BY first_occurrence_time DESC) AS ROWNO,
EVENT_ID
FROM EMS_EVENT_VIEW
WHERE (first_occurrence_time >
to_date('2012-02-22 00:00:00',
'yyyy-mm-dd hh24:mi:ss') and
first_occurrence_time <
to_date('2012-02-29 09:42:35',
'yyyy-mm-dd hh24:mi:ss'))
))
WHERE ROWNO >= 0
AND ROWNO <= 20) A
WHERE EMS_EVENT_VIEW.EVENT_ID = A.EVENT_ID;
这个SQL其实就是一个分页SQL,利用 row_number over 做分页,EMS_EVENT_VIEW是一个视图。这个SQL确实写得很坑爹,它要扫描EMS_EVENT_VIEW两次,其实我们可以改写它,让它扫描一次,而不是自己和自己利用event_id 做自连接。
EMS_EVENT_VIEW的定义就不贴出来了,涉及保密 。它的大概意思就是 select * from a union all select * from b; 无where 过滤条件。
因为这个SQL是朋友给我的,我无法连接到他的DB,所以我只有自己做测试了,测试代码如下:
create table a as select * from dba_objects;
create table b as select * from dba_objects;
create view test_view as select * from a
union all select * from b;
create index idx_a on a(created ,last_ddl_time);
create index idx_b on b(created ,last_ddl_time);
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT',
tabname => 'B',
estimate_percent => 100,
method_opt => 'for all columns size auto',
no_invalidate => FALSE,
degree => 4,
cascade => TRUE);
END;
/
BEGIN
DBMS_STATS.GATHER_TABLE_STATS(ownname => 'SCOTT',
tabname => 'B',
estimate_percent => 100,
method_opt => 'for all columns size auto',
no_invalidate => FALSE,
degree => 4,
cascade => TRUE);
END;
/
要优化的SQL可以改写成如下代码,只访问一次视图:
select * from
(
select t.*,rownum rn from
(select /*+ INDEX(TEST_VIEW.A idx_a) INDEX(TEST_VIEW.b idx_b) */ *
from test_view
where created > to_date('2010-01-01', 'yyyy-mm-dd') or
LAST_DDL_TIME < to_date('2007-0101', 'yyyy-mm-dd')
order by created desc
) t where rownum<=20
) where rn>=0;
现在来看它的执行计划
SQL> select * from 2 ( 3 select t.*,rownum rn from 4 (select /*+ INDEX(TEST_VIEW.A idx_a) INDEX(TEST_VIEW.b idx_b) */ * 5 from test_view 6 where created > to_date('2010-01-01', 'yyyy-mm-dd') or 7 LAST_DDL_TIME < to_date('2007-0101', 'yyyy-mm-dd') 8 order by created desc 9 ) t where rownum<=20 10 ) where rn>=0; 已选择20行。 已用时间: 00: 00: 00.10 执行计划 ---------------------------------------------------------- Plan hash value: 1808710389 ------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 3800 | | 1898 (2)| 00:00:23 | |* 1 | VIEW | | 20 | 3800 | | 1898 (2)| 00:00:23 | |* 2 | COUNT STOPKEY | | | | | | | | 3 | VIEW | | 70304 | 11M| | 1898 (2)| 00:00:23 | |* 4 | SORT ORDER BY STOPKEY | | 70304 | 6659K| 17M| 1898 (2)| 00:00:23 | | 5 | VIEW | TEST_VIEW | 70304 | 6659K| | 329 (5)| 00:00:04 | | 6 | UNION-ALL PARTITION | | | | | | | | 7 | TABLE ACCESS BY INDEX ROWID| A | 1650 | 156K| | 238 (3)| 00:00:03 | |* 8 | INDEX FULL SCAN | IDX_A | 1650 | | | 199 (4)| 00:00:03 | | 9 | TABLE ACCESS BY INDEX ROWID| B | 1650 | 156K| | 238 (3)| 00:00:03 | |* 10 | INDEX FULL SCAN | IDX_B | 1650 | | | 199 (4)| 00:00:03 | ------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN">=0) 2 - filter(ROWNUM<=20) 4 - filter(ROWNUM<=20) 8 - filter("CREATED">TO_DATE('2010-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') OR "LAST_DDL_TIME"<TO_DATE('2007-0101','yyyy-mm-dd')) 10 - filter("CREATED">TO_DATE('2010-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') OR "LAST_DDL_TIME"<TO_DATE('2007-0101','yyyy-mm-dd')) 统计信息 ---------------------------------------------------------- 8 recursive calls 0 db block gets 566 consistent gets 9 physical reads 0 redo size 2621 bytes sent via SQL*Net to client 411 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 20 rows processed
逻辑读566,那么这样改写是不是最优化的呢?显然不是,因为索引 IDX_A,IDX_B 都是走的 index full scan,会扫描整个索引block,原始的SQL这个索引里面有3863K 条数据,性能肯定是很低的。 所以进一步的 改写SQL 如下:
select * from
(
select t.*,rownum rn from
(
select * from
(select * from
(
select /*+ index_desc(a) */ *
from a
where created > to_date('2010-01-01', 'yyyy-mm-dd') or
LAST_DDL_TIME < to_date('2007-0101', 'yyyy-mm-dd')
order by created desc
) where rownum<=20
union all
select * from
(
select /*+ index_desc(b) */ *
from b
where created > to_date('2010-01-01', 'yyyy-mm-dd') or
LAST_DDL_TIME < to_date('2007-0101', 'yyyy-mm-dd')
order by created desc
) where rownum<=20
) order by created desc
) t where rownum<=20
) where rn>=0
执行计划和逻辑读如下:
SQL> select * from 2 ( 3 select t.*,rownum rn from 4 ( 5 select * from 6 (select * from 7 ( 8 select /*+ index_desc(a) */ * 9 from a 10 where created > to_date('2010-01-01', 'yyyy-mm-dd') or 11 LAST_DDL_TIME < to_date('2007-0101', 'yyyy-mm-dd') 12 order by created desc 13 ) where rownum<=20 14 union all 15 select * from 16 ( 17 select /*+ index_desc(b) */ * 18 from b 19 where created > to_date('2010-01-01', 'yyyy-mm-dd') or 20 LAST_DDL_TIME < to_date('2007-0101', 'yyyy-mm-dd') 21 order by created desc 22 ) where rownum<=20 23 ) order by created desc 24 ) t where rownum<=20 25 ) where rn>=0; 已选择20行。 已用时间: 00: 00: 00.04 执行计划 ---------------------------------------------------------- Plan hash value: 3460309830 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 20 | 3800 | 244 (4)| 00:00:03 | |* 1 | VIEW | | 20 | 3800 | 244 (4)| 00:00:03 | |* 2 | COUNT STOPKEY | | | | | | | 3 | VIEW | | 40 | 7080 | 244 (4)| 00:00:03 | |* 4 | SORT ORDER BY STOPKEY | | 40 | 7080 | 244 (4)| 00:00:03 | | 5 | VIEW | | 40 | 7080 | 243 (3)| 00:00:03 | | 6 | UNION-ALL | | | | | | |* 7 | COUNT STOPKEY | | | | | | | 8 | VIEW | | 1650 | 285K| 238 (3)| 00:00:03 | | 9 | TABLE ACCESS BY INDEX ROWID| A | 1650 | 156K| 238 (3)| 00:00:03 | |* 10 | INDEX FULL SCAN DESCENDING| IDX_A | 1650 | | 199 (4)| 00:00:03 | |* 11 | COUNT STOPKEY | | | | | | | 12 | VIEW | | 20 | 3540 | 5 (0)| 00:00:01 | | 13 | TABLE ACCESS BY INDEX ROWID| B | 20 | 1940 | 5 (0)| 00:00:01 | |* 14 | INDEX FULL SCAN DESCENDING| IDX_B | 1650 | | 4 (0)| 00:00:01 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - filter("RN">=0) 2 - filter(ROWNUM<=20) 4 - filter(ROWNUM<=20) 7 - filter(ROWNUM<=20) 10 - filter("CREATED">TO_DATE('2010-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') OR "LAST_DDL_TIME"<TO_DATE('2007-0101','yyyy-mm-dd')) 11 - filter(ROWNUM<=20) 14 - filter("CREATED">TO_DATE('2010-01-01 00:00:00', 'yyyy-mm-dd hh24:mi:ss') OR "LAST_DDL_TIME"<TO_DATE('2007-0101','yyyy-mm-dd')) 统计信息 ---------------------------------------------------------- 1 recursive calls 0 db block gets 10 consistent gets 0 physical reads 0 redo size 2457 bytes sent via SQL*Net to client 411 bytes received via SQL*Net from client 3 SQL*Net roundtrips to/from client 1 sorts (memory) 0 sorts (disk) 20 rows processed
逻辑读整整下降了60倍。
现在根据这个案例来谈谈SQL分页的优化思路,SQL分页通常要进行排序,比如select xxxx from t where 条件 order by ......
优化分页SQL可以重点关注 order by 这个条件,写SQL的时候要让ORACLE 对 order by 列 上的索引进行有序的扫描,然后根据stopkey 停止,也就是不要把索引的block全都给扫描了,应该扫描一部分block就停止。