SQL 优化之该走索引却不走索引的分析
分析案例:
1.走rule很快,但是收集了执行计划后却很慢
SQL> create table test(id int);
表已创建。
SQL> insert into test select 1 from dba_objects;
已创建49883行。
SQL> commit;
提交完成。
SQL> insert into test select 2 from user_objects where rownum<101;
已创建100行。
SQL> commit;
提交完成。
SQL> create index i_test_1 on test(id);
索引已创建。
SQL> analyze table test compute statistics;
表已分析。
SQL> set autot trace
SQL> set timing on
SQL> select * from test where id=2;
已选择100行。
已用时间: 00: 00: 00.01
执行计划
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 24992 | 49984 | 20 (10)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 24992 | 49984 | 20 (10)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("ID"=2)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
86 consistent gets
0 physical reads
0 redo size
1690 bytes sent via SQL*Net to client
451 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100 rows processed
可以看出这里并没有走索引,可以对比rows和实际返回的行数100,得出这时候的统计信息并没有反应真实的情况
SQL> select /*+ rule */ * from test where id=2;
已选择100行。
已用时间: 00: 00: 00.00
执行计划
----------------------------------------------------------
Plan hash value: 3245211066
-------------------------------------
| Id | Operation | Name |
-------------------------------------
| 0 | SELECT STATEMENT | |
|* 1 | INDEX RANGE SCAN| I_TEST_1 |
-------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ID"=2)
Note
-----
- rule based optimizer used (consider using cbo)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
1690 bytes sent via SQL*Net to client
451 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100 rows processed
已用时间: 00: 00: 00.01
从这里可以看出如果是走索引的话是很快的,大概已经明白了,这个字段是倾斜的,所以收集下直方图应该就能走正确的执行计划
SQL> analyze table test compute statistics for table for all indexes for all indexed columns;
表已分析。
已用时间: 00: 00: 00.37
SQL> select * from test where id=2;
已选择100行。
已用时间: 00: 00: 00.00
执行计划
----------------------------------------------------------
Plan hash value: 3245211066
-----------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
-----------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 100 | 200 | 1 (0)| 00:00:01 |
|* 1 | INDEX RANGE SCAN| I_TEST_1 | 100 | 200 | 1 (0)| 00:00:01 |
-----------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - access("ID"=2)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
9 consistent gets
0 physical reads
0 redo size
1690 bytes sent via SQL*Net to client
451 bytes received via SQL*Net from client
8 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
100 rows processed
//收集直方图后走了正确的执行计划
2.加hint也不走索引
a.隐式类型转换
对于这个问题,10g可以根据执行计划下面的谓词部分来判断,而对于10g以前的版本可以根据强制加hint不走索引,
对于index full scan和index fast full scan做为切入点
example:
执行计划
----------------------------------------------------------
Plan hash value: 4122059633
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 29 | 58 (6)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST1 | 1 | 29 | 58 (6)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("OBJECT_ID")=10000)
从这里可以看出oracle隐式的把object_id变成了number类型,强制加hint看看
1* select /*+ index(test1) */ * from test1 where object_id=10000
SQL> /
未选定行
执行计划
----------------------------------------------------------
Plan hash value: 4122059633
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 29 | 58 (6)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST1 | 1 | 29 | 58 (6)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter(TO_NUMBER("OBJECT_ID")=10000)
确实加了hint也不走索引,既然已经知道了是因为隐式转换导致的不走索引,这时候可以有2个办法,避免隐式转换或者创建函数索引
b.索引失效
执行计划
----------------------------------------------------------
Plan hash value: 4122059633
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 29 | 57 (4)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST1 | 1 | 29 | 57 (4)| 00:00:01 |
---------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"='10000')
确认过这里的没有存在隐式类型转换,如果是10g的话,skip_unusable_indexes=true强制走索引会报错.
检查索引的状态.
SQL> select status from user_indexes where table_name='TEST1';
STATUS
--------
UNUSABLE
索引失效了,只能rebuild,同时收集索引的统计信息,9i不要用alter index.. rebuild compute statistics;
因为这个会同时收集表的统计信息,10g以后没有问题,如果怕rebuild影响dml,那可以加上online选项
SQL> alter index i_test1_1 rebuild;
索引已更改。
SQL> analyze index i_test1_1 compute statistics;
索引已分析
SQL> set autot trace
SQL> select * from test1 where object_id='10000';
未选定行
执行计划
----------------------------------------------------------
Plan hash value: 3616871662
--------------------------------------------------------------------------------
---------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| T
ime |
--------------------------------------------------------------------------------
---------
| 0 | SELECT STATEMENT | | 1 | 29 | 2 (0)| 0
0:00:01 |
| 1 | TABLE ACCESS BY INDEX ROWID| TEST1 | 1 | 29 | 2 (0)| 0
0:00:01 |
|* 2 | INDEX RANGE SCAN | I_TEST1_1 | 1 | | 1 (0)| 0
0:00:01 |
--------------------------------------------------------------------------------
---------
Predicate Information (identified by operation id):
---------------------------------------------------
2 - access("OBJECT_ID"='10000')
这时候终于走索引
c.索引的访问路径oracle并没有考虑
主要讨论index fast full scan
index fast full scan类似全表扫描,只是把索引当作表来处理,支持并行和多块读
index fast full scan的前提是,索引必须满足这个查询,同时必须保证非空
(通过非空字段或者是在where条件中声明非空),最为典型的就是count(*)
SQL> create table test as select * from dba_objects;
表已创建。
SQL> create index i_test_1 on test(object_id);
索引已创建。
SQL> analyze table test compute statistics;
表已分析。
SQL> select count(*) from test;
执行计划
----------------------------------------------------------
Plan hash value: 1950795681
-------------------------------------------------------------------
| Id | Operation | Name | Rows | Cost (%CPU)| Time |
-------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 154 (2)| 00:00:02 |
| 1 | SORT AGGREGATE | | 1 | | |
| 2 | TABLE ACCESS FULL| TEST | 49883 | 154 (2)| 00:00:02 |
-------------------------------------------------------------------
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
690 consistent gets
0 physical reads
0 redo size
410 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
这里虽然有索引,但是oracle并不知道索引中是否有null值,所以无法走索引
SQL> select count(*) from test where object_id is not null;
执行计划
----------------------------------------------------------
Plan hash value: 1366347385
--------------------------------------------------------------------------------
--
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time
|
--------------------------------------------------------------------------------
--
| 0 | SELECT STATEMENT | | 1 | 4 | 27 (4)| 00:00:01
|
| 1 | SORT AGGREGATE | | 1 | 4 | |
|
|* 2 | INDEX FAST FULL SCAN| I_TEST_1 | 49882 | 194K| 27 (4)| 00:00:01
|
--------------------------------------------------------------------------------
--
Predicate Information (identified by operation id):
---------------------------------------------------
2 - filter("OBJECT_ID" IS NOT NULL)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
117 consistent gets
0 physical reads
0 redo size
410 bytes sent via SQL*Net to client
385 bytes received via SQL*Net from client
2 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
1 rows processed
select count(distinct object_id) from test;
select count(*) from (select object_id from test group by object_id);
这是itpub上的一个例子,严格来说这2个语句并不是等价的,第一个语句会忽略null,因此只要object_id上有索引
,同时索引比表小的情况就会用索引扫描代替全表扫描,而第二个语句则不会.
第一个语句和select count(*) from (select object_id from test where object_id is not null group by object_id);
是等价的.
3.统计信息不对
SQL> create table test as select * from dba_objects where 1=0;
表已创建。
SQL> analyze table test compute statistics;
表已分析。
SQL> insert into test select * from dba_objects;
已创建49920行。
SQL> commit;
提交完成。
SQL> create index i_test_1 on test(object_id);
索引已创建。
SQL> select * from test where object_id=10000;
未选定行
执行计划
----------------------------------------------------------
Plan hash value: 1357081020
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 177 | 2 (0)| 00:00:01 |
|* 1 | TABLE ACCESS FULL| TEST | 1 | 177 | 2 (0)| 00:00:01 |
--------------------------------------------------------------------------
Predicate Information (identified by operation id):
---------------------------------------------------
1 - filter("OBJECT_ID"=10000)
统计信息
----------------------------------------------------------
1 recursive calls
0 db block gets
689 consistent gets
0 physical reads
0 redo size
992 bytes sent via SQL*Net to client
374 bytes received via SQL*Net from client
1 SQL*Net roundtrips to/from client
0 sorts (memory)
0 sorts (disk)
0 rows processed
从这里来判断统计信息错误,这个表大概60000多行,这里的cost=2,rows=1(实际上这里可能是0行),基本可以判断出表是没有统计信息的,
如果这个表的统计信息是正确的话,这个表肯定是小表,一般不可能导致比较大的逻辑读