B树索引

  1. 3大特点
    1. 高度较低
      1. where =
      2. 分区索引设计
    2. 存储列值
      1. countStar(*)优化
      2. SUM/AVG优化
      3. MAX/MIN优化
      4. 索引和回表
    3. 结构有序
      1. order by
      2. distinct
      3. union
  2. 主外键
  3. 组合索引
  4. 索引危害

位图索引

  1. 性能卓越:
    1. 统计条数奋勇夺冠
    2. 即时查询一骑绝尘
  2. 缺陷:
    1. 遭遇更新苦不堪言
    2. 重复度低一败涂地
  3. 原理

函数索引

  1. 列运算让索引失效
  2. 函数索引如何应用
  3. 避免列运算的经典案例

B树索引结构:

  1. 形状:根块-枝块-叶块;叶块:索引入口头部,键列长度,键列值,rowid;通过双向链表链接叶块。
  2. 定位方式:根-枝-叶-数据块 至少4次IO
  3. 建立过程:
    1. 要建索引先排序
    2. 列值入块成索引
    3. 填满一块接一块
    4. 同级两块有人管
  4. 3大特点:
    1. 索引高度较低:A:一张500g的大表有几百亿条的记录,索引高度才6层。B:查少量数据走索引快,大量数据用全表扫描快。C:查少量数据,有索引和没索引差距极大。D:分区表时,有分区索引需要结合分区字段来实现性能优化,否则遍历分区索引反而会增加开销。E:分区的案例:分区多时,查询要用到分区条件避免遍历分区索引。如果查询是用到分区条件,则需要建立分区索引。如果查询实在用不到分区条件,那么不要建分区索引。
      create table t1 as select rownum as id,rownum+1 as id1 from dual connect by level <=5;
      create index idx_1 on t1(id);
      select segment_name, bytes/1024 from user_segments where segment_name in ('IDX_1','IDX_2','IDX_3','IDX_4','IDX_5','IDX_6');
      
      SEGMENT_NA BYTES/1024
      ---------- ----------
      IDX_1           64
      IDX_2           64 因为一次性分配8块=64k
      IDX_3           64
      IDX_4          128
      IDX_5         1024
      IDX_6         9216
      
      SELECT INDEX_NAME,BLEVEL,LEAF_BLOCKS,NUM_ROWS,DISTINCT_KEYS,CLUSTERING_FACTOR FROM USER_IND_STATISTICS WHERE INDEX_NAME IN  ('IDX_1','IDX_2','IDX_3','IDX_4','IDX_5','IDX_6');
      
      INDEX_NAME     BLEVEL LEAF_BLOCKS   NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
      ------------ ---------- ----------- ---------- ------------- -----------------
      IDX_1       只有叶块0       1             5           5             1
      IDX_2              0       1            50          50             1
      IDX_3       没有枝块1       2           500         500             1
      IDX_4              1      11          5000        5000             9
      IDX_5              1     110         50000       50000           101
      IDX_6      根枝叶3块2    1113        500000      500000          1035  1叶块=449个索引--》0-499,1-25万,2-1亿,3-500亿
      SQL> select /*+ INDEX(IDX_5) */ count(*) from t5;--没有附加条件,加hint也没用,oracle会自动全表扫描。
      -------------------------------------------------------------------
      | Id  | Operation       | Name | Rows  | Cost (%CPU)| Time      |
      -------------------------------------------------------------------
      |   0 | SELECT STATEMENT   |      |    1 |    32   (0)| 00:00:01 |
      |   1 |  SORT AGGREGATE    |      |    1 |           |      |
      |   2 |   TABLE ACCESS FULL| T5   | 50000 |    32   (0)| 00:00:01 |
      -------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
          105  consistent gets
      
      SQL> select /*+ INDEX(IDX_5) */ count(*) from t5 WHERE ID <50000;--先走索引,再回表,在读取大多数表内数据的情况下,更加消耗资源
      -------------------------------------------------------------------------------
      | Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
      -------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT      |       |     1 |     5 |    32    (0)| 00:00:01 |
      |   1 |  SORT AGGREGATE       |       |     1 |     5 |        |          |
      |*  2 |   INDEX FAST FULL SCAN| IDX_5 | 49999 |   244K|    32    (0)| 00:00:01 |
      -------------------------------------------------------------------------------
      
      Statistics
      ----------------------------------------------------------
          118  consistent gets
      SQL> select * from t3 WHERE ID =100;
      -------------------------------------------------------------------------------------
      | Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
      -------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT        |        |      1 |      8 |      2   (0)| 00:00:01 |
      |   1 |  TABLE ACCESS BY INDEX ROWID| T3    |      1 |      8 |      2   (0)| 00:00:01 |
      |*  2 |   INDEX RANGE SCAN        | IDX_3 |      1 |        |      1   (0)| 00:00:01 |
      -------------------------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
            0  recursive calls
            0  db block gets
            4  consistent gets---500条记录
      
      SQL> select * from t5 WHERE ID =100;
      -------------------------------------------------------------------------------------
      | Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
      -------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT        |        |      1 |     10 |      2   (0)| 00:00:01 |
      |   1 |  TABLE ACCESS BY INDEX ROWID| T5    |      1 |     10 |      2   (0)| 00:00:01 |
      |*  2 |   INDEX RANGE SCAN        | IDX_5 |      1 |        |      1   (0)| 00:00:01 |
      -------------------------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
            0  recursive calls
            0  db block gets
            4  consistent gets--50000条记录
      SQL> select * from t6 WHERE ID =100;
      
      | Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
      -------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT        |        |      1 |     10 |      4   (0)| 00:00:01 |
      |   1 |  TABLE ACCESS BY INDEX ROWID| T6    |      1 |     10 |      4   (0)| 00:00:01 |
      |*  2 |   INDEX RANGE SCAN        | IDX_6 |      1 |        |      3   (0)| 00:00:01 |
      -------------------------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
            5  consistent gets --50万记录
      SQL> drop index idx_6;
      Index dropped.
      
      SQL> select * from t6 WHERE ID =100;--无索引状态。
      --------------------------------------------------------------------------
      | Id  | Operation      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
      --------------------------------------------------------------------------
      |   0 | SELECT STATEMENT  |     |     1 |    10 |   292   (2)| 00:00:04 |
      |*  1 |  TABLE ACCESS FULL| T6     |     1 |    10 |   292   (2)| 00:00:04 |
      --------------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
          168  recursive calls
             1059  consistent gets
      SQL> conn nyc/nyc
      SQL> create table part_tab (id int,col2 int,col3 int) partition by range (id)(
        2  partition p1 values less than (10000),
        3  partition p2 values less than (20000),
        4  partition p3 values less than (30000),
        5  partition p4 values less than (40000),
        6  partition p5 values less than (50000),
        7  partition p6 values less than (60000),
        8  partition p7 values less than (70000),
        9  partition p8 values less than (80000),
       10  partition p9 values less than (90000),
       11  partition p10 values less than (100000),
       12  partition p11 values less than (maxvalue)
       13  );
      SQL> insert into part_tab select rownum, rownum+1, rownum+2 from dual connect by rownum<=100000;
      SQL> commit;
      SQL> create index idx_col2 on part_tab(col2) local;
      SQL> create index idx_col3 on part_tab(col3);
      SQL> set autotrace traceonly
      SQL> set linesize 1000
      SQL> set timing on
      SQL> select * from part_tab where col3=8; --全局索引比分区索引快的情况。
      ---------------------------------------------------------------------------------------------------------------
      | Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
      ---------------------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT           |          |     1 |    39 |     2    (0)| 00:00:01 |       |       |
      |   1 |  TABLE ACCESS BY GLOBAL INDEX ROWID| PART_TAB |     1 |    39 |     2    (0)| 00:00:01 | ROWID | ROWID |
      |*  2 |   INDEX RANGE SCAN           | IDX_COL3 |     1 |       |     1    (0)| 00:00:01 |       |       |
      ---------------------------------------------------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
            4  consistent gets
      
      SQL> select * from part_tab where col2=8; 因为要遍历分区索引,其实大索引和小索引高度差不多,所以消耗倍增。
      ---------------------------------------------------------------------------------------------------------------
      | Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
      ---------------------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT           |          |     1 |    39 |    13    (0)| 00:00:01 |       |       |
      |   1 |  PARTITION RANGE ALL           |          |     1 |    39 |    13    (0)| 00:00:01 |     1 |    11 |
      |   2 |   TABLE ACCESS BY LOCAL INDEX ROWID| PART_TAB |     1 |    39 |    13    (0)| 00:00:01 |     1 |    11 |
      |*  3 |    INDEX RANGE SCAN           | IDX_COL2 |     1 |       |    12    (0)| 00:00:01 |     1 |    11 |
      ---------------------------------------------------------------------------------------------------------------
      Statistics
      ----------------------------------------------------------
           23  consistent gets

      SQL> select * from part_tab where col2=8 and id=7;  --这里需要加上id这一分区条件,则速度倍增

      ---------------------------------------------------------------------------------------------------------------
      | Id  | Operation               | Name     | Rows  | Bytes | Cost (%CPU)| Time     | Pstart| Pstop |
      ---------------------------------------------------------------------------------------------------------------
      |   0 | SELECT STATEMENT           |          |     1 |    39 |     2    (0)| 00:00:01 |       |       |
      |   1 |  PARTITION RANGE SINGLE        |          |     1 |    39 |     2    (0)| 00:00:01 |     1 |     1 |
      |*  2 |   TABLE ACCESS BY LOCAL INDEX ROWID| PART_TAB |     1 |    39 |     2    (0)| 00:00:01 |     1 |     1 |
      |*  3 |    INDEX RANGE SCAN           | IDX_COL2 |     1 |       |     1    (0)| 00:00:01 |     1 |     1 |
      ---------------------------------------------------------------------------------------------------------------

      Statistics
      ----------------------------------------------------------
            4  consistent gets

    2. 索引存储列值:索引存列值和rowid的特性 :
      1. count * 优化:
        1. select count(*) from t where id is not null; oracle通过读索引大大加快了速度,其中not null是必要的。
          SQL> select count(*) from t where id is not null;
          ---------------------------------------------------------------------------------
          | Id  | Operation          | Name    | Rows    | Bytes | Cost (%CPU)| Time    |
          ---------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT      |     |     1 |    13 |    65   (0)| 00:00:01 |
          |   1 |  SORT AGGREGATE       |     |     1 |    13 |         |        |
          |*  2 |   INDEX FAST FULL SCAN| IDX_TID |   110K|  1398K|    65   (0)| 00:00:01 |
          ---------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
              230  consistent gets
        2. select count(*) from t; 的情况下,oracle不知道是否有空值列,因为索引不会记录空值,如有空值,扫描索引的的结果和全表扫描不等价。所以oracle仍然进行全表扫描。
          SQL> select count(*) from t;
          -------------------------------------------------------------------
          | Id  | Operation       | Name | Rows  | Cost (%CPU)| Time      |
          -------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |      |    1 |    69   (2)| 00:00:01 |
          |   1 |  SORT AGGREGATE    |      |    1 |           |      |
          |   2 |   TABLE ACCESS FULL| T      |   110K|    69   (2)| 00:00:01 |
          -------------------------------------------------------------------
          Statistics
              238  consistent gets
        3. Alter table t modify id not null; 将一列改为非空。
        4. Alter table t add constraint pk_id primary key (id);主键=非空+唯一
        5. 一张表只有1个字段,这个时候索引比全表大(加入rowid)。
      2. SUM/AVG 优化:
        1. SUM/AVG空值bug已经被修复:oracle对于聚集语句,原来在没确认非空的情况下做全表扫,这个是个bug似乎这里已经解决。
          SQL> select sum(object_id) from t;
          
          SUM(OBJECT_ID)
          ---------------------------------------------------------------------------------
          | Id  | Operation          | Name    | Rows    | Bytes | Cost (%CPU)| Time    |
          ---------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT      |     |     1 |    13 |    49   (0)| 00:00:01 |
          |   1 |  SORT AGGREGATE       |     |     1 |    13 |         |        |
          |   2 |   INDEX FAST FULL SCAN| IDX_TID | 65544 |   832K|    49   (0)| 00:00:01 |
          Statistics
          ----------------------------------------------------------
              168  consistent gets
        2. MAX/MIN优化:
          SQL> select max(object_id) from t;
          --------------------------------------------------------------------------------------
          | Id  | Operation           | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT       |         |       1 |      13 |       2   (0)| 00:00:01 |
          |   1 |  SORT AGGREGATE        |         |       1 |      13 |          |         |
          |   2 |   INDEX FULL SCAN (MIN/MAX)| IDX_TID |       1 |      13 |       2   (0)| 00:00:01 |
          --------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                2  consistent gets
          SQL> select max(object_id),min(object_id) from t; --全表扫描
          ---------------------------------------------------------------------------
          | Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time      |
          ---------------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |      |    1 |    13 |   288   (1)| 00:00:04 |
          |   1 |  SORT AGGREGATE    |      |    1 |    13 |           |      |
          |   2 |   TABLE ACCESS FULL| T      | 65544 |   832K|   288   (1)| 00:00:04 |
          Statistics
          ----------------------------------------------------------
                 1034  consistent gets
          
          SQL> select max(object_id),min(object_id) from t where object_id is not null;走普通索引
          | Id  | Operation          | Name    | Rows    | Bytes | Cost (%CPU)| Time    |
          ---------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT      |     |     1 |    13 |    49   (0)| 00:00:01 |
          |   1 |  SORT AGGREGATE       |     |     1 |    13 |         |        |
          |*  2 |   INDEX FAST FULL SCAN| IDX_TID | 65544 |   832K|    49   (0)| 00:00:01 |
          ---------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
              168  consistent gets
          
          SQL> select max,min from (select max(object_id) max from t) a,(select min(object_id) min from t) b;
          | Id  | Operation             | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
          ----------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT         |           |     1 |    26 |     4     (0)| 00:00:01 |
          |   1 |  NESTED LOOPS             |           |     1 |    26 |     4     (0)| 00:00:01 |
          |   2 |   VIEW                 |           |     1 |    13 |     2     (0)| 00:00:01 |
          |   3 |    SORT AGGREGATE         |           |     1 |    13 |        |           |
          |   4 |     INDEX FULL SCAN (MIN/MAX)| IDX_TID |     1 |    13 |     2     (0)| 00:00:01 |
          |   5 |   VIEW                 |           |     1 |    13 |     2     (0)| 00:00:01 |
          |   6 |    SORT AGGREGATE         |           |     1 |    13 |        |           |
          |   7 |     INDEX FULL SCAN (MIN/MAX)| IDX_TID |     1 |    13 |     2     (0)| 00:00:01 |
          ----------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                4  consistent gets
        3. 索引回表与优化
          SQL> select * from t where object_id<=5; --有回表,开发人员有时为了方便这样写,尽量让他们把业务具体化。
          | Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
          ---------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |          |     4 |   828 |     3    (0)| 00:00:01 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T       |     4 |   828 |     3    (0)| 00:00:01 |
          |*  2 |   INDEX RANGE SCAN        | IDX_TID |     4 |       |     2    (0)| 00:00:01 |
          ---------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                5  consistent gets
          SQL> select object_id from t where object_id<=5; --无回表,尽量把多余字段取消只留下索引段,可以避免回表。
          | Id  | Operation     | Name    | Rows  | Bytes | Cost (%CPU)| Time       |
          ----------------------------------------------------------------------------
          |   0 | SELECT STATEMENT |       |     4 |    52 |     2   (0)| 00:00:01 |
          |*  1 |  INDEX RANGE SCAN| IDX_TID |     4 |    52 |     2   (0)| 00:00:01 |
          ----------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                3  consistent gets
          --为了避免回表,建立组合索引。缺点是,索引变大。需要平衡考虑
          SQL> create index idx_c on t(object_id,object_name);
          
          SQL> select object_id,object_name from t where object_id<=10;
          | Id  | Operation     | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------
          |   0 | SELECT STATEMENT |     |    12 |   948 |     2   (0)| 00:00:01 |
          |*  1 |  INDEX RANGE SCAN| IDX_C |    12 |   948 |     2   (0)| 00:00:01 |
          --------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                3  consistent gets
        4. 聚合因子clustering_factor(在必须回表查询时,查询的方式也有高下之分),这个值越高,代表同块索引指到不同数据库的情况越多。
          --分别创建聚合因子高和低的表。
          create table t_co (id int,col2 varchar2(100));
          begin
          for i in 1..100000
              loop
                  insert into t_co(id,col2)
                  values(i,rpad(dbms_random.random,95,'*'));
              end loop;
          end;
          /
          alter table t_co add constraint pk_t_co primary key(id);
          create table t_nco as select id,col2 from t_co order by col2;
          alter table t_nco add constraint pk_t_nco primary key(id);
          commit;
          select index_name,blevel,leaf_blocks,num_rows,distinct_keys,clustering_factor from user_ind_statistics where table_name in ('T_CO','T_NCO');
          
          INDEX_NAME               BLEVEL LEAF_BLOCKS    NUM_ROWS DISTINCT_KEYS CLUSTERING_FACTOR
          ------------------------------ ---------- ----------- ---------- ------------- -----------------
          PK_T_CO                 1      208      100000    100000            1469
          PK_T_NCO                1      208      100000    100000           99937
          另一种查询方法:
          set linesize 1000
          alter session set statistics_level=all;
          --Sql语句
          select /*+index(t)*/* from t_co t where id>=20000 and id<=40000;
          
          聚合度低的:21364

          select /*+index(t)*/* from t_nco t where id>=20000 and id<=40000

          --------------------------------------------------------------------------------------------------
          | Id  | Operation            | Name     | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
          --------------------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |           |      1 |     |  20001 |00:00:00.06 |   21364 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T_NCO    |      1 |  17512 |  20001 |00:00:00.06 |   21364 |
          |*  2 |   INDEX RANGE SCAN        | PK_T_NCO |      1 |  17512 |  20001 |00:00:00.01 |    1374 |
          --------------------------------------------------------------------------------------------------

          聚合度高的:2985,为上面1/10。如何提高表的聚合度?让常用的索引列的顺序和表本身的排序尽量接近。

          select /*+index(t)*/* from t_co t where id>=20000 and id<=40000

          -------------------------------------------------------------------------------------------------
          | Id  | Operation            | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
          -------------------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |          |      1 |    |  20001 |00:00:00.02 |    2985 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T_CO    |      1 |  15888 |  20001 |00:00:00.02 |    2985 |
          |*  2 |   INDEX RANGE SCAN        | PK_T_CO |      1 |  15888 |  20001 |00:00:00.01 |    1374 |
          -------------------------------------------------------------------------------------------------

    3. 索引本身有序:
      1. order by优化:
        1. 一般看执行计划的性能,cost是最重要的指标。逻辑读90%情况下有效,但是有时和cost矛盾,取cost的结果。
        2. 排序查询会消耗太多资源,所以可以用建索引的办法来避免排序。
        3. 索引可以避免排序,但因为查询的值太多,会反复回表,这样会提高逻辑读,降低性能;如果能只查索引列,那就最理想了。
          create table t as select * from object_id;
          select * from t where object_id>2;
          --------------------------------------------------------------------------
          | Id  | Operation      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------
          |   0 | SELECT STATEMENT  |     | 83491 |    16M|   288   (1)| 00:00:04 |
          |*  1 |  TABLE ACCESS FULL| T     | 83491 |    16M|   288   (1)| 00:00:04 |
                 5784  consistent gets
          
          select * from t where object_id>2 order by object_id;
          | Id  | Operation       | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time      |
          -----------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |      | 83491 |    16M|      |  4046   (1)| 00:00:49 |
          |   1 |  SORT ORDER BY       |      | 83491 |    16M|    19M|  4046   (1)| 00:00:49 |
          |*  2 |   TABLE ACCESS FULL| T      | 83491 |    16M|      |   288   (1)| 00:00:04 |
                 1035  consistent gets
                1  sorts (memory)
                0  sorts (disk)
          --这里是比较有排序和没有排序,可以看出,排序的逻辑读减少,但是时间和cost极大增加。
          --建立索引后发现时间变少,cost降低,逻辑读暴增(回表现象导致),但是总体cost降低,因为没有排序操作。索引回表如果大量产生,有时效率比全表扫描还差,因为全表扫描可以一次读入多个数据库,而走索引只能一块一块读去数据块。
          SQL> select * from t where object_id>2 order by object_id;
          | Id  | Operation            | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
          ---------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |          | 72293 |  6848K|  1248    (1)| 00:00:15 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T       | 72293 |  6848K|  1248    (1)| 00:00:15 |
          |*  2 |   INDEX RANGE SCAN        | IDX_OID | 72293 |       |   161    (0)| 00:00:02 |
          ---------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                10810  consistent gets
                0  sorts (memory)

          SQL> select object_id from t where object_id>2 order by object_id;--查索引列,避免了回表
          | Id  | Operation     | Name    | Rows  | Bytes | Cost (%CPU)| Time       |
          ----------------------------------------------------------------------------
          |   0 | SELECT STATEMENT |       | 72293 |   352K|  161   (0)| 00:00:02 |
          |*  1 |  INDEX RANGE SCAN| IDX_OID | 72293 |   352K|   161   (0)| 00:00:02 |
          ----------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                 4969  consistent gets

      2. Distinct排重优化
        1. 这种排序不会出现在sort统计上,只会出现在tempspc上。
          create table t as select * from dba_objects;
          alter table t modify object_id not null;
          update t set object_id=2;
          update t set object_id=3 where rownum<=25000;
          SQL> select distinct object_id from t;
          | Id  | Operation       | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time      |
          -----------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |      | 83491 |  1059K|      |   687   (1)| 00:00:09 |
          |   1 |  HASH UNIQUE       |      | 83491 |  1059K|  1648K|   687   (1)| 00:00:09 |
          |   2 |   TABLE ACCESS FULL| T      | 83491 |  1059K|      |   288   (1)| 00:00:04 |
          -----------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                 1035  consistent gets
                0  sorts (memory)
                0  sorts (disk)
          SQL> select object_id from t;
          | Id  | Operation      | Name | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------
          |   0 | SELECT STATEMENT  |     | 83491 |  1059K|   288   (1)| 00:00:04 |
          |   1 |  TABLE ACCESS FULL| T     | 83491 |  1059K|   288   (1)| 00:00:04 |
          Statistics
          ----------------------------------------------------------
                 5797  consistent gets
          
          --对于等值的排重查询
          SQL> select distinct object_id from t where object_id=2;
          ---------------------------------------------------------------------------
          | Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time      |
          ---------------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |      | 51744 |   656K|   537   (1)| 00:00:07 |
          |   1 |  SORT UNIQUE NOSORT|      | 51744 |   656K|   537   (1)| 00:00:07 | --此算法消除排序,使得cost从687--》537略有提升。
          |*  2 |   TABLE ACCESS FULL| T      | 51744 |   656K|   288   (1)| 00:00:04 |
          ---------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                 1035  consistent gets
          
          --建立索引
          SQL> select distinct object_id from t;
          | Id  | Operation          | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
          -----------------------------------------------------------------------------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT      |       | 83491 |  1059K|       |   443    (1)| 00:00:06 |
          |   1 |  HASH UNIQUE          |       | 83491 |  1059K|  1648K|   443    (1)| 00:00:06 |
          |   2 |   INDEX FAST FULL SCAN| IDX_T | 83491 |  1059K|       |    44    (0)| 00:00:01 |
          Statistics
          ----------------------------------------------------------
              150  consistent gets
          --
          SQL> select /*+index(t)*/ distinct object_id from t;索引消除排序,不过开销反而更大。证明优化不理想。
          | Id  | Operation       | Name  | Rows  | Bytes | Cost (%CPU)| Time       |
          ----------------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |       | 83491 |  1059K|   556   (1)| 00:00:07 |
          |   1 |  SORT UNIQUE NOSORT|       | 83491 |  1059K|   556   (1)| 00:00:07 |
          |   2 |   INDEX FULL SCAN  | IDX_T | 83491 |  1059K|   156   (0)| 00:00:02 |
          ----------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
              222  consistent gets
      3. 索引全扫和索引快扫的区别:INDEX FULL SCAN VS INDEX FAST FULL SCAN
        1. 快扫每次读取多个块,不排序。全扫每次读取一个块,排序。所以sum和avg等统计根本无需使用排序,一般走快扫。
      4. Union合并的优化:
        1. Union all就是简单的合并,没有消除重复的记录。Union 会去除重复的记录和distinct很相似需要排序。
          SQL> select object_id from t2 union all select object_id from t1;
          | Id  | Operation       | Name | Rows  | Bytes | Cost (%CPU)| Time      |
          ---------------------------------------------------------------------------
          |   0 | SELECT STATEMENT   |      |   125K|  1587K|   575  (51)| 00:00:07 |
          |   1 |  UNION-ALL       |      |      |      |           |      |
          |   2 |   TABLE ACCESS FULL| T2   | 57108 |   725K|   288   (1)| 00:00:04 |
          |   3 |   TABLE ACCESS FULL| T1   | 67936 |   862K|   288   (1)| 00:00:04 |
          ---------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                11579  consistent gets
          SQL> select object_id from t2 union select object_id from t1;--排序带来的额外开销是非常巨大的超过逻辑读的开销。
          | Id  | Operation        | Name | Rows  | Bytes |TempSpc| Cost (%CPU)| Time       |
          ------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT    |       |   125K|  1587K|       |  1173  (53)| 00:00:15 |
          |   1 |  SORT UNIQUE        |       |   125K|  1587K|  2473K|  1173  (53)| 00:00:15 |
          |   2 |   UNION-ALL        |       |       |       |       |        |       |
          |   3 |    TABLE ACCESS FULL| T2   | 57108 |   725K|       |   288   (1)| 00:00:04 |
          |   4 |    TABLE ACCESS FULL| T1   | 67936 |   862K|       |   288   (1)| 00:00:04 |
          Statistics
          ----------------------------------------------------------
                 2070  consistent gets
        2. 通过建索引优化:结论:是不可行的,最常用的优化是在两个表不会重复的时候,把union改为union all。 
          SQL> create index idx_1 on t1(object_id);
          SQL> create index idx_2 on t2(object_id);
          SQL> set autotrace traceonly
          SQL> set linesize 1000
          SQL> select object_id from t1 union select object_id from t2;
          ----------------------------------------------------------------------------------------
          | Id  | Operation           | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time     |
          ----------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT       |       |   125K|  1587K|       |   697    (47)| 00:00:09 |
          |   1 |  SORT UNIQUE           |       |   125K|  1587K|  2473K|排序不可避免697    (47)| 00:00:09 |
          |   2 |   UNION-ALL           |       |       |       |       |        |           |
          |   3 |    INDEX FAST FULL SCAN| IDX_1 | 67936 |   862K|       |    49     (0)| 00:00:01 |
          |   4 |    INDEX FAST FULL SCAN| IDX_2 | 57108 |   725K|       |    49     (0)| 00:00:01 |
          Statistics --Oracle认为索引排序而省去的开销不足以抵消对索引排序而增加的逻辑读。
          ----------------------------------------------------------
              336  consistent gets
          SQL> select object_id from t1 union all select object_id from t2;--极大提高了性能。走索引就只读所以而已。
          | Id  | Operation          | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
          -------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT      |       |   125K|  1587K|    99  (51)| 00:00:02 |
          |   1 |  UNION-ALL          |       |       |       |        |          |
          |   2 |   INDEX FAST FULL SCAN| IDX_1 | 67936 |   862K|    49    (0)| 00:00:01 |
          |   3 |   INDEX FAST FULL SCAN| IDX_2 | 57108 |   725K|    49    (0)| 00:00:01 |
          Statistics
          ----------------------------------------------------------
                 9952  consistent gets

          SQL> select /*+index(t1)*/ object_id from t1 union select /*+index(t2)*/ object_id from t2;

          -----------------------------------------------------------------------------------
          | Id  | Operation      | Name  | Rows  | Bytes |TempSpc| Cost (%CPU)| Time      |
          -----------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT  |      |   125K|  1587K|      |   947  (48)| 00:00:12 |
          |   1 |  SORT UNIQUE      |      |   125K|  1587K| 2473K|  947  (48)| 00:00:12 |
          |   2 |   UNION-ALL      |      |      |      |      |           |      |
          |   3 |    INDEX FULL SCAN| IDX_1 | 67936 |   862K|      |   174   (0)| 00:00:03 |
          |   4 |    INDEX FULL SCAN| IDX_2 | 57108 |   725K|      |   174   (0)| 00:00:03 |
          Statistics
          ----------------------------------------------------------
              322  consistent gets --两个结果集的索引没办法起到消除排序的作用。

      5. 主外键设计:
        1. 主外键三大特点:
          1. 主键本身是一种索引
          2. 可以保证表中主键所在列的唯一性
          3. 可以有效限制外表依赖的记录的完整性
            SQL> create table t_p (id number,name varchar2(30));
            SQL> create table t_c (id number, fid number,name varchar2(30));
            SQL> alter table t_p add constraint pk1 primary key(id);
            SQL> alter table t_c add constraint fk1 foreign key(fid) references t_p(id);
            SQL> insert into t_p select rownum,table_name from all_tables;
            SQL> insert into t_c select rownum, mod(rownum,1000)+1,object_name from all_objects;
            SQL> commit;
            SQL> select a.id,a.name,b.name from t_p a,t_c b where a.id=b.fid and a.id=880;
            
            71 rows selected.
            -------------------------------------------------------------------------------------
            | Id  | Operation             | Name | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT         |        |     39 |  2340 |    136   (0)| 00:00:02 |
            |   1 |  NESTED LOOPS             |        |     39 |  2340 |    136   (0)| 00:00:02 |
            |   2 |   TABLE ACCESS BY INDEX ROWID| T_P  |      1 |     30 |      0   (0)| 00:00:01 |
            |*  3 |    INDEX UNIQUE SCAN         | PK1  |      1 |        |      0   (0)| 00:00:01 |
            |*  4 |   TABLE ACCESS FULL         | T_C  |     39 |  1170 |    136   (0)| 00:00:02 |
            -------------------------------------------------------------------------------------
            Statistics
            ----------------------------------------------------------
                418  consistent gets
            SQL> create index idx_c on t_c(fid);
            
            Index created.
            
            SQL> select a.id,a.name,b.name from t_p a,t_c b where a.id=b.fid and a.id=880;
            | Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
            --------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT         |         |      71 |    4260 |      98   (0)| 00:00:02 |
            |   1 |  NESTED LOOPS             |         |      71 |    4260 |      98   (0)| 00:00:02 |
            |   2 |   TABLE ACCESS BY INDEX ROWID| T_P   |       1 |      30 |       0   (0)| 00:00:01 |
            |*  3 |    INDEX UNIQUE SCAN         | PK1   |       1 |         |       0   (0)| 00:00:01 |
            |   4 |   TABLE ACCESS BY INDEX ROWID| T_C   |      71 |    2130 |      98   (0)| 00:00:02 |
            |*  5 |    INDEX RANGE SCAN         | IDX_C |      71 |         |       1   (0)| 00:00:01 |
            Statistics
            ----------------------------------------------------------
                128  consistent gets 建了索引之后逻辑读降到原来的不到1/3。cost也降低了快到1/3。
        2. 外键索引和避免锁的争用
          --会话1:
          SQL> create index idx_c on t_c(fid);
          Index created.
          SQL> delete t_c where id =4;
          1 row deleted.
          --会话2:
          SQL> delete t_p where id =200;
          delete t_p where id =200
          ERROR at line 1:
          ORA-02292: integrity constraint (NYC.FK1) violated - child record found
          SQL> delete t_p where id =2000; 有索引的状态下,因为表被锁住,但是我们可以访问索引,所以可以删除。
          1 row deleted.
          -- 如果会话1没有创建索引,同样动作后会话2如下:
          SQL> delete t_p where id =200;
          这里就不动了,也就是会话1对表做修改时,锁定了它所参照的表。如果它删除的结果会导致外键的状态发生变化,所以会锁住外键对应的父表。同样的情况就不会发生在insert上面,因为insert这个操作独立于子表的删除操作。

          SQL> insert into t_p values(3000,'abc');

          1 row created.

        3. 主外键约束的级联删除
          SQL> alter table t_c add constraint fk1 foreign key(fid) references t_p(id) on delete cascade;
          --删除父表时子表一起删除。
          SQL> delete from t_p where id=2;
          1 row deleted.
          SQL> select count(*) from t_c where fid=2;
            COUNT(*)
          ----------
               0
        4. 改造为主键的简便方法:因为建主键的动作其实就是建一个唯一性索引,再增加一个非空约束,仅此而已。
          drop table t cascade constraints purge;
          create table t (id number,name varchar2(30));
          insert into t select rownum,table_name from all_tables;
          commit;
          create index idx_t on t(id);
          alter table t add constraint t_id_pk primary key(id);
      6. 组合索引高效设计的要领:
        1. 适当场合避免回表:
        2. 组合列返回越少越高效:
          --三种情况:
          select * from t where a=1 and b=2; 如果a返回多,b也返回多,但是a and b返回少,这是就算要回表也适合建索引。因为ab联合可以定位的行多。过多字段的联合是不可取的。那还不如做索引组织表。
          另一种情况是如果ab组合返回值和ab单独返回值差不多,那就没有比较做联合索引。只用一列比较节省维护索引的空间。
          第三种:select a,b from t where a=1 and b=2; --适合建组合索引避免回表。
        3. 组合两列谁在前更合适:
          --等值查询都一样
          SQL> drop table t purge;
          SQL> create table t as select * from dba_objects;
          SQL> create index idx1_t on t (object_id,object_type);
          SQL> create index idx2_t on t (object_type,object_id);
          SQL> set autotrace traceonly
          SQL> set linesize 1000
          SQL> select /*+index(t,idx1_t)*/* from t where object_id=20 and object_type='TABLE';
          --------------------------------------------------------------------------------------
          | Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |         |       7 |    1449 |       2   (0)| 00:00:01 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T      |       7 |    1449 |       2   (0)| 00:00:01 |
          |*  2 |   INDEX RANGE SCAN        | IDX1_T |       1 |         |       1   (0)| 00:00:01 |
          --------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                4  consistent gets
          
          SQL> select /*+index(t,idx2_t)*/* from t where object_id=20 and object_type='TABLE';
          --------------------------------------------------------------------------------------
          | Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |         |       7 |    1449 |       2   (0)| 00:00:01 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T      |       7 |    1449 |       2   (0)| 00:00:01 |
          |*  2 |   INDEX RANGE SCAN        | IDX2_T |       1 |         |       1   (0)| 00:00:01 |
          --------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
                4  consistent gets
          --范围查询不一样:当一列查询是范围查询一列是等值,那么范围查询的索引应该放在后面才高效。先分类在取范围。
          SQL> select /*+index(t,idx1_t)*/* from t where object_id>=20 and object_id<=2000 and object_type='TABLE';
          --------------------------------------------------------------------------------------
          | Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |         |     503 |     101K|       7   (0)| 00:00:01 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T      |     503 |     101K|       7   (0)| 00:00:01 |
          |*  2 |   INDEX RANGE SCAN        | IDX1_T |       5 |         |       6   (0)| 00:00:01 |
          --------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
          82  consistent gets
          
          SQL> select /*+index(t,idx2_t)*/* from t where object_id>=20 and object_id<=2000 and object_type='TABLE';
          
          465 rows selected.
          | Id  | Operation            | Name   | Rows  | Bytes | Cost (%CPU)| Time     |
          --------------------------------------------------------------------------------------
          |   0 | SELECT STATEMENT        |         |     503 |     101K|      24   (0)| 00:00:01 |
          |   1 |  TABLE ACCESS BY INDEX ROWID| T      |     503 |     101K|      24   (0)| 00:00:01 |
          |*  2 |   INDEX RANGE SCAN        | IDX2_T |     503 |         |       3   (0)| 00:00:01 |
          --------------------------------------------------------------------------------------
          Statistics
          ----------------------------------------------------------
               79  consistent gets
        4. 哪列在前的原理讨论:在前的索引先执行。
          1. 对于等值查询:索引只定位单个值,所以基本上走B树的方式,步骤是固定的。
          2. 对于范围查询,一个索引管范围一个管分类:这个时候先分类,然后用range scan就搞定,效率高。
          3. 对于经常做OLTP查询的服务器:往往可以做range的列是会被频繁查询。这个时候就不用管上面的规则,按照业务需求来。
          4. 组合查询in的改写:
            --如果每个id只对应一个值,那么range scan速度快,如果每个id对应很多值,那么对每个值直接走索引快。
            SQL> select /*+index(t,idx_1)*/ * from t where object_type='TABLE' AND object_id in(20,21);
            | Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
            --------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT         |         |       7 |    1449 |       4   (0)| 00:00:01 |
            |   1 |  INLIST ITERATOR         |         |         |         |          |         |
            |   2 |   TABLE ACCESS BY INDEX ROWID| T     |       7 |    1449 |       4   (0)| 00:00:01 |
            |*  3 |    INDEX RANGE SCAN         | IDX_1 |       1 |         |       3   (0)| 00:00:01 |
            Statistics
            ----------------------------------------------------------
                  7  consistent gets
            
            SQL> select /*+index(t,idx_1)*/ * from t where object_type='TABLE' AND object_id>=20 and object_id<=21;
            | Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT        |        |      2 |    414 |      3   (0)| 00:00:01 |
            |   1 |  TABLE ACCESS BY INDEX ROWID| T     |      2 |    414 |      3   (0)| 00:00:01 |
            |*  2 |   INDEX RANGE SCAN        | IDX_1 |      1 |        |      2   (0)| 00:00:01 |
            -------------------------------------------------------------------------------------
            Statistics
            ----------------------------------------------------------
                  5  consistent gets
            
            --将id=20变成26000条,则用range扫的方法就非常耗时,因为虽然值相同,他也要一个一个比较,走索引没有这个比较的步骤。
            SQL> update t set object_id=20 where rownum<=26000;
            SQL> update t set object_id=21 where object_id<>20;
            
            SQL> select /*+index(t,idx_1)*/ * from t where object_type='TABLE' AND object_id in(20,21);
            | Id  | Operation             | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
            --------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT         |         |    1877 |     379K|       4   (0)| 00:00:01 |
            |   1 |  INLIST ITERATOR         |         |         |         |          |         |
            |   2 |   TABLE ACCESS BY INDEX ROWID| T     |    1877 |     379K|       4   (0)| 00:00:01 |
            |*  3 |    INDEX RANGE SCAN         | IDX_1 |       1 |         |       3   (0)| 00:00:01 |
            Statistics
            ----------------------------------------------------------
                565  consistent gets
            
            SQL> select /*+index(t,idx_1)*/ * from t where object_type='TABLE' AND object_id>=20 and object_id<=21;
            | Id  | Operation            | Name  | Rows  | Bytes | Cost (%CPU)| Time     |
            -------------------------------------------------------------------------------------
            |   0 | SELECT STATEMENT        |        |  1877 |    379K|    769   (1)| 00:00:10 |
            |   1 |  TABLE ACCESS BY INDEX ROWID| T     |  1877 |    379K|    769   (1)| 00:00:10 |
            |*  2 |   INDEX RANGE SCAN        | IDX_1 |    291 |        |    764   (1)| 00:00:10 |
            Statistics
            ----------------------------------------------------------
                815  consistent gets
        5. 索引的开销:
          1. 索引越多插入越慢:举例,10万记录:索引个数,时间:0个,5秒;1个,15秒;2个,3分钟;3个,5分钟。
          2. 无序插入索引影响更加惊人:
            SQL> insert into t select * from dba_objects where rownum<=5000 order by dbms_random.random;
            5000 rows created.
            Elapsed: 00:00:00.18
            SQL> insert into t select * from dba_objects where rownum<=5000;
            5000 rows created.
            Elapsed: 00:00:00.05
          3. 修改删除与插入略有差别:
            1. 对于insert语句负面最大,只要有索引,必然慢,越多越慢。
            2. 对于delete,有好有坏,删少数记录有索引好,大量数据,影响性能,因为其它列的索引要跟着更新。
            3. 对update影响最小,只会影响到更新的列的索引。
          4. 建索引动作引发排序和锁:
            案例:对大表建索引,开10并行度,耗尽10个cpu的资源,导致系统性能缓慢15分钟。
        6. 如何控制索引的数量:
          1. 对所有的监控:
            alter index idx_1 monitoring usage;
            select * from v$object_usage;
            INDEX_NAME               TABLE_NAME              MON USE START_MONITORING      END_MONITORING
            ------------------------------ ------------------------------ --- --- ------------------- -------------------
            IDX_1                   T                  YES YES 04/17/2015 17:16:01
            SQL> alter index idx_1 nomonitoring usage;
posted on 2015-04-15 06:12  alexblog  阅读(203)  评论(0编辑  收藏  举报