之前一直不明白为什么order by字段有索引却不走索引,因此在互联网的帮助下,自己做了一个实验,算是比较清楚了。将具体的分析记录在此。
(学习中,有不对的欢迎批评指正)
实验的数据说明:
b$l_interest_info表有280万多条数据;g3eid,g3e_fid,lttid 均为 NOT NULL,name字段在测试中先为null,后改为not null,且该四个字段的索引均不是唯一索引。
(I)对于一个语句,没有相应的where条件,ORDER BY的字段中的索引要被引用首先要满足以下条件:
1、ORDER BY的字段必须为非空字段(见二和三的例子);
(一)
以下语句由于没用上name字段索引,因此ORDER BY排序用到了临时表空间。此时的name字段没有not null的限制。
SELECT a.name FROM b$l_interest_info a ORDER BY a.name;
2 -------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
4 -------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 26M| | 17075 (4)|
6 | 1 | SORT ORDER BY | | 2825K| 26M| 86M| 17075 (4)|
7 | 2 | TABLE ACCESS FULL| B$L_INTEREST_INFO | 2825K| 26M| | 5374 (3)|
8 -------------------------------------------------------------------------------------
(二)
以下语句由于用上了索引,ORDER BY不需要再进行排序,避免了使用临时表空间。此处将name字段做了not null的限制。
ALTER TABLE b$l_interest_info MODIFY name NOT NULL;--将name字段改为非空
SELECT a.name FROM b$l_interest_info a ORDER BY a.name;
2 ------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
4 ------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 26M| 8370 (2)|
6 | 1 | INDEX FULL SCAN | IDX_B$L_INTEREST_INFO_NAME | 2825K| 26M| 8370 (2)|
7 ------------------------------------------------------------------------------------
2、ORDER BY的字段必须被select选取;
(三)select选取的字段包含了g3e_id字段
SELECT * FROM b$l_interest_info a ORDER BY a.g3e_id;
2 ------------------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
4 ------------------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 137M| 35640 (1)|
6 | 1 | TABLE ACCESS BY INDEX ROWID| B$L_INTEREST_INFO | 2825K| 137M| 35640 (1)|
7 | 2 | INDEX FULL SCAN | IDX_B$L_INTEREST_INFO_G3EID | 2825K| | 6657 (2)|
8 ------------------------------------------------------------------------------------------------
(四)g3e_id字段不在select 中。
SELECT name FROM b$l_interest_info a ORDER BY a.g3e_id;
2 -------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
4 -------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 43M| | 20818 (4)|
6 | 1 | SORT ORDER BY | | 2825K| 43M| 129M| 20818 (4)|
7 | 2 | TABLE ACCESS FULL| B$L_INTEREST_INFO | 2825K| 43M| | 5374 (3)|
8 -------------------------------------------------------------------------------------
疑问1:
什么情况下,select选取的字段可以不做限制呢(前提是选择的字段中已经包含了order by字段的情况下)
分别见(七),(九),(十)的例子,name,g3e_fid,g3e_id都是一样类型的索引,为什么只有(七)这个语句用到了该字段的索引。然不成是跟这些字段在表中
的数据布局不同造成的?
(九)
SELECT * FROM b$l_interest_info a ORDER BY a.g3e_fid;
2 -------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
4 -------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 137M| | 41742 (2)|
6 | 1 | SORT ORDER BY | | 2825K| 137M| 455M| 41742 (2)|
7 | 2 | TABLE ACCESS FULL| B$L_INTEREST_INFO | 2825K| 137M| | 5442 (5)|
8 -------------------------------------------------------------------------------------
(十)
SELECT * FROM b$l_interest_info a ORDER BY a.name;
2 -------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
4 -------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 137M| | 41742 (2)|
6 | 1 | SORT ORDER BY | | 2825K| 137M| 455M| 41742 (2)|
7 | 2 | TABLE ACCESS FULL| B$L_INTEREST_INFO | 2825K| 137M| | 5442 (5)|
8 -------------------------------------------------------------------------------------
如果将(九),(十)例子中语句修改为select字段只有一个order by的字段,就可以用上索引。
如下语句所示:
SELECT a.name FROM b$l_interest_info a ORDER BY a.name;
2 ------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
4 ------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 26M| 8622 (2)|
6 | 1 | INDEX FULL SCAN | IDX_B$L_INTEREST_INFO_NAME | 2825K| 26M| 8622 (2)|
7 ------------------------------------------------------------------------------------
疑问2:如下语句中:语句1可以走索引,语句2却不能走索引,这是为什么呢?
b$l_interest_info_test表是通过create table as子句创建的。与b$l_interest_info表唯一不同的是g3e_id字段建的索引在b$l_interest_info该表是非空索引,
在b$l_interest_info_test是唯一非空索引。
语句1:SELECT * FROM b$l_interest_info a ORDER BY a.g3e_id;
2 ------------------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
4 ------------------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2825K| 137M| 35526 (1)|
6 | 1 | TABLE ACCESS BY INDEX ROWID| B$L_INTEREST_INFO | 2825K| 137M| 35526 (1)|
7 | 2 | INDEX FULL SCAN | IDX_B$L_INTEREST_INFO_G3EID | 2825K| | 6615 (2)|
8 ------------------------------------------------------------------------------------------------
语句2:SELECT * FROM b$l_interest_info_test a ORDER BY a.g3e_id;
2 ------------------------------------------------------------------------------------------
3 | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
4 ------------------------------------------------------------------------------------------
5 | 0 | SELECT STATEMENT | | 2828K| 137M| | 41816 (2)|
6 | 1 | SORT ORDER BY | | 2828K| 137M| 455M| 41816 (2)|
7 | 2 | TABLE ACCESS FULL| B$L_INTEREST_INFO_TEST | 2828K| 137M| | 5482 (4)|
8 ------------------------------------------------------------------------------------------
/*
解答上述疑问于2011-6-22
查看一下语句:
产生上述1)可以走索引,而另两个不行的原因分析:
首先,看下B$L_INTEREST_INFO表各个索引的聚簇因子与行数和块数的关系。(如果聚簇因子与块数接近,表明表数据比较有序,若与行数接近,则表明表中数据比较无序)。
如下显示可知,G3e_id索引的CLUSTERING_FACTOR值与LEAF_BLOCKS比较接近,相对于另几个索引。
INDEX_NAME CLUSTERING_FACTOR LEAF_BLOCKS NUM_ROWS
------------------------------ ----------------- ----------- ----------
IDX_B$L_INTEREST_INFO_NAME 802058 8509 2935624
IDX_B$L_INTEREST_INFO_G3EID 28691 6511 2747581
IDX_B$L_INTEREST_INFO_FID 294698 6374 2688408
其次看以下三个表(数据均来源于B$L_INTEREST_INFO表):
--查看b$l_interest_info_test表的索引聚簇因子与行数和块数情况:
------------------------------ ----------------- ----------- ----------
IDX_B$L_INTEREST_TEST_FID 334627 6754 2850273
IDX_B$L_INTEREST_TEST_G3EID 53689 6772 2857793
IDX_B$L_INTEREST_TEST_NAME 843456 8515 2935220
create index IDX_INTEREST_TEST_G3EID on INTEREST_TEST (G3E_ID);
create index IDX_INTEREST_TEST_NAME on INTEREST_TEST (NAME);
1)SELECT * FROM interest_test a ORDER BY a.g3e_id; --可以走IDX_INTEREST_TEST_G3EID索引
2)SELECT * FROM interest_test a ORDER BY a.g3e_fid; --不能走IDX_INTEREST_TEST_FID索引
3)SELECT * FROM interest_test a ORDER BY a.name;--不能走IDX_INTEREST_TEST_NAME索引
------------------------------ ----------------- ----------- ----------
IDX_INTEREST_TEST_G3EID 28959 6588 2778513 --该索引的聚簇因子与原表b$l_interest_info的g3e_id字段索引聚簇因子接近
IDX_INTEREST_TEST_FID 305312 6696 2825449
IDX_INTEREST_TEST_NAME 778870 8211 2825449
create index IDX_INTEREST_TEST2_FID on INTEREST_TEST2 (G3E_FID);
create index IDX_INTEREST_TEST2_G3EID on INTEREST_TEST2 (G3E_ID);
create index IDX_INTEREST_TEST2_NAME on INTEREST_TEST2 (NAME);
BEGIN dbms_stats.gather_table_stats('LTTFM','interest_test2');END;
1)SELECT * FROM interest_test2 a ORDER BY a.g3e_id; --不能走IDX_INTEREST_TEST_G3EID索引
2)SELECT * FROM interest_test2 a ORDER BY a.g3e_fid; --可以走IDX_INTEREST_TEST_FID索引
3)SELECT * FROM interest_test2 a ORDER BY a.name;--不能走IDX_INTEREST_TEST_NAME索引
--查看interest_test2表的索引聚簇因子与行数和块数情况:
INDEX_NAME CLUSTERING_FACTOR LEAF_BLOCKS NUM_ROWS
------------------------ ----------------- ----------- ----------
IDX_INTEREST_TEST2_FID 28900 6588 2778513
IDX_INTEREST_TEST2_G3EID 47523 6517 2748431
IDX_INTEREST_TEST2_NAME 750750 8574 2927127
最后得出以上疑问的结论:
通过上述的分析,可以发现如果索引的聚簇因子与块数接近,在使用order by查询时就会使用该索引,反之必须只选择order by的字段。或者可以通过hint使其强制使用索引。
因为order by查询相当于是对整个表数据的查询,因此此时如果表的数据排列cbo认为是比较无序的,那么就会选择全表扫描。*/
(II)对于一个语句,含有where条件和ORDER BY条件:
where
条件中的字段有索引且该条件下的选取的行数占表很少的一部分且数据布局是有序的,不管该字段的索引是否非空,一般是可以走索引的。(III)ORDER BY语句会用到临时表空间进行排序,但是如果有索引,则不需用到临时表空间,因为索引已经进行了排序
(一)
以下语句由于没用上name字段索引,因此ORDER BY排序用到了临时表空间。此时的name字段没有not null的限制。
SELECT a.name FROM b$l_interest_info a ORDER BY a.name;
2
-------------------------------------------------------------------------------------3 | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)|
4
-------------------------------------------------------------------------------------5 | 0 | SELECT STATEMENT | | 2825K| 26M| | 17075
(4)|6 | 1 | SORT ORDER BY | | 2825K|
26M| 86M| 17075 (4)|7 | 2 | TABLE ACCESS FULL| B$L_INTEREST_INFO | 2825K|
26M| | 5374 (3)|8
-------------------------------------------------------------------------------------(二)
以下语句由于用上了索引,
ORDER BY不需要再进行排序,避免了使用临时表空间。此处将name字段做了not null的限制。ALTER TABLE b$l_interest_info MODIFY name NOT NULL;
--将name字段改为非空SELECT a.name FROM b$l_interest_info a ORDER BY a.name;
2
------------------------------------------------------------------------------------3 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)|
4
------------------------------------------------------------------------------------5 | 0 | SELECT STATEMENT | | 2825K| 26M| 8370
(2)|6 | 1 | INDEX FULL SCAN | IDX_B$L_INTEREST_INFO_NAME | 2825K| 26M| 8370
(2)|7
------------------------------------------------------------------------------------