Oracle 表的访问方式(2)-----索引扫描
索引扫描(Index scan)
我们先通过index查找到数据对应的rowid值(对于非唯一索引可能返回多个rowid值),然后根据rowid直接从表中得到具体的数据,这种查找方式称为索引扫描或索引查找(index lookup)。一个rowid唯一的表示一行数据,该行对应的数据块是通过一次i/o得到的,在此情况下该次i/o只会读取一个数据库块。在索引中,除了存储每个索引的值外,索引还存储具有此值的行对应的ROWID值。索引扫描可以由2步组成: (1) 扫描索引得到对应的rowid值。 (2) 通过找到的rowid从表中读出具体的数据。
根据索引的类型与where限制条件的不同,有5种类型的索引扫描:
1)索引唯一扫描(index unique scan)
2)索引范围扫描(index range scan)
3)索引全扫描(index full scan)
4)索引快速扫描(index fast full scan)
5)索引跳跃扫描(INDEX SKIP SCAN)
索引唯一扫描(INDEX UNIQUE SCAN)
通过唯一索引查找一个数值经常返回单个ROWID
唯一索引由单独列组成:
1 --收集统计信息 2 SQL> exec dbms_stats.gather_table_stats('SCOTT','EMP'); 3 4 PL/SQL procedure successfully completed. 5 6 Commit complete. 7 SQL> 8 9 10 --获取创建索引语句 11 SQL> SELECT DBMS_METADATA.GET_DDL('INDEX',u.index_name) 12 2 FROM USER_INDEXES u 13 3 WHERE u.TABLE_NAME='EMP'; 14 15 DBMS_METADATA.GET_DDL('INDEX',U.INDEX_NAME) 16 -------------------------------------------------------------------------------- 17 CREATE UNIQUE INDEX "SCOTT"."PK_EMP" ON "SCOTT"."EMP" ("EMPNO") 18 PCTFREE 10 19 20 SQL> 21 22 1.索引名称 PK_EMP 23 2.索引包含列 EMPNO 24 3.索引为唯一索引 25 26 --执行计划走唯一索引的语句 27 SQL> SELECT * FROM SCOTT.EMP WHERE EMPNO='7369'; 28 29 Execution Plan 30 ---------------------------------------------------------- 31 Plan hash value: 2949544139 32 33 -------------------------------------------------------------------------------------- 34 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 35 -------------------------------------------------------------------------------------- 36 | 0 | SELECT STATEMENT | | 1 | 38 | 1 (0)| 00:00:01 | 37 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 | 38 |* 2 | INDEX UNIQUE SCAN | PK_EMP | 1 | | 0 (0)| 00:00:01 | 39 -------------------------------------------------------------------------------------- 40 41 Predicate Information (identified by operation id): 42 --------------------------------------------------- 43 44 2 - access("EMPNO"=7369) 45 46 47 48 SQL> SELECT * FROM SCOTT.EMP WHERE EMPNO IN ('7499','7521'); 49 50 51 Execution Plan 52 ---------------------------------------------------------- 53 Plan hash value: 2355049923 54 55 --------------------------------------------------------------------------------------- 56 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 57 --------------------------------------------------------------------------------------- 58 | 0 | SELECT STATEMENT | | 2 | 76 | 2 (0)| 00:00:01 | 59 | 1 | INLIST ITERATOR | | | | | | 60 | 2 | TABLE ACCESS BY INDEX ROWID| EMP | 2 | 76 | 2 (0)| 00:00:01 | 61 |* 3 | INDEX UNIQUE SCAN | PK_EMP | 2 | | 1 (0)| 00:00:01 | 62 --------------------------------------------------------------------------------------- 63 64 Predicate Information (identified by operation id): 65 --------------------------------------------------- 66 67 3 - access("EMPNO"=7499 OR "EMPNO"=7521) 68 69 SQL> SELECT * FROM SCOTT.EMP WHERE EMPNO='7499' OR EMPNO='7521'; 70 71 Execution Plan 72 ---------------------------------------------------------- 73 Plan hash value: 2355049923 74 75 --------------------------------------------------------------------------------------- 76 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 77 --------------------------------------------------------------------------------------- 78 | 0 | SELECT STATEMENT | | 2 | 76 | 2 (0)| 00:00:01 | 79 | 1 | INLIST ITERATOR | | | | | | 80 | 2 | TABLE ACCESS BY INDEX ROWID| EMP | 2 | 76 | 2 (0)| 00:00:01 | 81 |* 3 | INDEX UNIQUE SCAN | PK_EMP | 2 | | 1 (0)| 00:00:01 | 82 --------------------------------------------------------------------------------------- 83 84 Predicate Information (identified by operation id): 85 --------------------------------------------------- 86 87 3 - access("EMPNO"=7499 OR "EMPNO"=7521) 88 SQL>
SELECT * FROM SCOTT.EMP WHERE EMPNO='7369';
SELECT * FROM SCOTT.EMP WHERE EMPNO IN ('7499','7521');
SELECT * FROM SCOTT.EMP WHERE EMPNO='7499' OR EMPNO='7521'
总结,索引在where条件中,且谓词条件可以确定唯一值时,走唯一索引。思考下2,3语句的查询过程
唯一索引由多个列组成(即组合索引)
1 --创建一个唯一索引(优质索引) 2 create unique index scott.idx_test on scott.emp(ename, deptno); --ename为引导列,表中ename列值具有唯一性 3 4 --谓词条件中的列顺序与索引的列顺序完全一致,走唯一索引 5 SQL> select * from scott.emp where ename = 'ALLEN' and deptno = 20 ; 6 7 no rows selected 8 9 10 Execution Plan 11 ---------------------------------------------------------- 12 Plan hash value: 4010583877 13 14 ---------------------------------------------------------------------------------------- 15 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 16 ---------------------------------------------------------------------------------------- 17 | 0 | SELECT STATEMENT | | 1 | 38 | 1 (0)| 00:00:01 | 18 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 | 19 |* 2 | INDEX UNIQUE SCAN | IDX_TEST | 1 | | 0 (0)| 00:00:01 | 20 ---------------------------------------------------------------------------------------- 21 22 Predicate Information (identified by operation id): 23 --------------------------------------------------- 24 25 2 - access("ENAME"='ALLEN' AND "DEPTNO"=20) 26 27 28 --谓词条件中的列顺序与唯索引的列顺序不一致 ,走唯一索引 29 SQL> select * from scott.emp where deptno = 20 and ename = 'ALLEN'; 30 31 no rows selected 32 33 34 Execution Plan 35 ---------------------------------------------------------- 36 Plan hash value: 4010583877 37 38 ---------------------------------------------------------------------------------------- 39 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 40 ---------------------------------------------------------------------------------------- 41 | 0 | SELECT STATEMENT | | 1 | 38 | 1 (0)| 00:00:01 | 42 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 | 43 |* 2 | INDEX UNIQUE SCAN | IDX_TEST | 1 | | 0 (0)| 00:00:01 | 44 ---------------------------------------------------------------------------------------- 45 46 Predicate Information (identified by operation id): 47 --------------------------------------------------- 48 49 2 - access("ENAME"='ALLEN' AND "DEPTNO"=20) 50 51 --只有引导列在谓词条件中 52 SQL> select * from scott.emp where ename = 'ALLEN'; --即使是唯一数据 也不走唯一索引 53 54 55 Execution Plan 56 ---------------------------------------------------------- 57 Plan hash value: 2317538385 58 59 ---------------------------------------------------------------------------------------- 60 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 61 ---------------------------------------------------------------------------------------- 62 | 0 | SELECT STATEMENT | | 1 | 38 | 2 (0)| 00:00:01 | 63 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 2 (0)| 00:00:01 | 64 |* 2 | INDEX RANGE SCAN | IDX_TEST | 1 | | 1 (0)| 00:00:01 | 65 ---------------------------------------------------------------------------------------- 66 67 Predicate Information (identified by operation id): 68 --------------------------------------------------- 69 70 2 - access("ENAME"='ALLEN') 71 72 73 --引导列不在谓词条件中 74 SQL> select * from scott.emp where deptno = 20; 75 76 77 Execution Plan 78 ---------------------------------------------------------- 79 Plan hash value: 3956160932 80 81 -------------------------------------------------------------------------- 82 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 83 -------------------------------------------------------------------------- 84 | 0 | SELECT STATEMENT | | 5 | 190 | 3 (0)| 00:00:01 | 85 |* 1 | TABLE ACCESS FULL| EMP | 5 | 190 | 3 (0)| 00:00:01 | 86 -------------------------------------------------------------------------- 87 88 Predicate Information (identified by operation id): 89 --------------------------------------------------- 90 91 1 - filter("DEPTNO"=20) 92 93 SQL>
1 --创建一个唯一索引(劣质索引) 2 create unique index idx_test on scott.emp(deptno,ename); --deptno为引导列,表中deptno列值不具有唯一性 3 4 分别对别如下sql的执行计划: 5 --谓词条件中的列顺序与索引的列顺序完全一致,,走唯一索引 6 SQL> select * from scott.emp where deptno = 20 and ename = 'ALLEN'; 7 8 no rows selected 9 10 Execution Plan 11 ---------------------------------------------------------- 12 Plan hash value: 1531058326 13 14 ------------------------------------------------------------------------------------------ 15 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 16 ------------------------------------------------------------------------------------------ 17 | 0 | SELECT STATEMENT | | 1 | 38 | 1 (0)| 00:00:01 | 18 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 | 19 |* 2 | INDEX UNIQUE SCAN | IDX_TEST01 | 1 | | 0 (0)| 00:00:01 | 20 ------------------------------------------------------------------------------------------ 21 22 Predicate Information (identified by operation id): 23 --------------------------------------------------- 24 25 2 - access("DEPTNO"=20 AND "ENAME"='ALLEN') 26 27 SQL> 28 29 --谓词条件中的列顺序与唯索引的列顺序不一致 ,走唯一索引 30 SQL> select * from scott.emp where ename = 'ALLEN' and deptno = 20 ; 31 32 no rows selected 33 34 Execution Plan 35 ---------------------------------------------------------- 36 Plan hash value: 1531058326 37 38 ------------------------------------------------------------------------------------------ 39 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 40 ------------------------------------------------------------------------------------------ 41 | 0 | SELECT STATEMENT | | 1 | 38 | 1 (0)| 00:00:01 | 42 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 1 | 38 | 1 (0)| 00:00:01 | 43 |* 2 | INDEX UNIQUE SCAN | IDX_TEST01 | 1 | | 0 (0)| 00:00:01 | 44 ------------------------------------------------------------------------------------------ 45 46 Predicate Information (identified by operation id): 47 --------------------------------------------------- 48 49 2 - access("DEPTNO"=20 AND "ENAME"='ALLEN') 50 51 SQL> 52 53 --只有引导列在谓词条件中 54 SQL> select * from scott.emp where deptno = 20; 55 56 Execution Plan 57 ---------------------------------------------------------- 58 Plan hash value: 560737562 59 60 ------------------------------------------------------------------------------------------ 61 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 62 ------------------------------------------------------------------------------------------ 63 | 0 | SELECT STATEMENT | | 5 | 190 | 2 (0)| 00:00:01 | 64 | 1 | TABLE ACCESS BY INDEX ROWID| EMP | 5 | 190 | 2 (0)| 00:00:01 | 65 |* 2 | INDEX RANGE SCAN | IDX_TEST01 | 5 | | 1 (0)| 00:00:01 | 66 ------------------------------------------------------------------------------------------ 67 68 Predicate Information (identified by operation id): 69 --------------------------------------------------- 70 71 2 - access("DEPTNO"=20) 72 SQL> 73 74 --引导列不在谓词条件 75 SQL> select * from scott.emp where ename = 'ALLEN'; 76 77 Execution Plan 78 ---------------------------------------------------------- 79 Plan hash value: 3956160932 80 81 -------------------------------------------------------------------------- 82 | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | 83 -------------------------------------------------------------------------- 84 | 0 | SELECT STATEMENT | | 1 | 38 | 3 (0)| 00:00:01 | 85 |* 1 | TABLE ACCESS FULL| EMP | 1 | 38 | 3 (0)| 00:00:01 | 86 -------------------------------------------------------------------------- 87 88 Predicate Information (identified by operation id): 89 --------------------------------------------------- 90 91 1 - filter("ENAME"='ALLEN') 92 93 SQL>
总结:使用组合索引时,遵守以下原则:
1.引导列标识性要强;
2.索引列尽量全部出现在谓词条件中
3.引导列尽量出现在谓词条件中
索引范围扫描(INDEX RANGE SCAN)
使用一个索引存取多行数据,在唯一索引上使用索引范围扫描的典型情况下是在谓词(where限制条件)中使用了范围操作符 (如>、<、<>、>=、<=、between)。在非唯一索引上,谓词"="也可能返回多行数据,所以在非唯一索引上都使用索引范围扫描。
使用index rang scan的3种情况:
1.在唯一索引列上使用了range操作符(> < <> >= <= between)
2.在组合索引上,只使用部分列进行查询,导致查询出多行
3.对非唯一索引列上进行的任何查询。
通过index range scan访问的表可以通过按照索引顺序重新建立表来提高效率:
1.如果你只读一部分数据,假设20% ,如果表数据顺序混乱,实际上可能把整个表都读进来了;
如果表顺序和索引一致,则只需要读进 20%的表的block就够了。这是简单情况
2.复杂情况下,顺序混乱的时候 block 可能在整个查询的不同时间点多次反复访问
当再次要访问这个块的时候说不定已经被换出去了,或者被修改过了,那代价更大
而如果顺序一样,对同一个block的访问集中在一段连续的很短的时间内,变数少,不会对同一个block产生多次IO
Index Unique Scan对比Index Range Scan
1.Index Unique Scan和Index Range Scan在B Tree上的搜索路径是一样的
2.Index Unique Scan在找到应该含有要找的Index Key的block后便停止了搜索,因为该键是唯一的;而Index Range Scan还要循着指针继续找下去直到条件不满足时
3.Index Unique Scan和Index Range Scan都只是索引上的查询,与是否扫描表没有关系。
如果所选择的列都在index上就不用去scan table;如果扫描到表, 必然还有一个table access by rowid
索引全扫描(index full scan)
与全表扫描对应,也有相应的全索引扫描。在某些情况下,可能进行全索引扫描而不是范围扫描,需要注意的是全索引扫描只在CBO模式下才有效。 CBO根据统计数值得知进行全索引扫描比进行全表扫描更有效时,才进行全索引扫描,而且此时查询出的数据都必须从索引中可以直接得到。
一般通过索引进行排序时,会用到(index full scan)
索引快速扫描(index fast full scan)
扫描索引中的所有的数据块,与 index full scan很类似,但是一个显著的区别就是它不对查询出的数据进行排序,即数据不是以排序顺序被返回。在这种存取方法中,可以使用多块读功能,也可以使用并行读入,以便获得最大吞吐量与缩短执行时间。
索引跳跃扫描(INDEX SKIP SCAN)
Skip Scans are initiated by probing the index for distinct values of the prefix column. Each of these distinct values is then used as a starting point for a regular index search. The result is several separate searches of a single index that, when combined, eliminate the affect of the prefix column.
skip scan会探测出索引前导列的唯一值个数,每个唯一值都会作为常规扫描的入口,在此基础上做一次查找,最后合并 这些查询。例如:表employees (sex, employee_id, address) ,有一个组合索引(sex, employee_id). 在索引跳跃的情况 下,我们可以逻辑上把他们看成两个索引,一个是(男,employee_id),一个是(女,employee_id). select * from employees where employee_id=1;发出这个查询后,oracle先进入sex为男的入口,查找employee_id=1的条目。 再进入sex为女的入口,查找employee_id=1的条目。最后合并两个结果集
参考blog:http://www.itpub.net/thread-1372696-1-1.html
http://blog.csdn.net/dba_waterbin/article/details/8550405