代码改变世界

Oracle之SQL优化专题03-如何看懂SQL的执行计划

2020-02-13 21:27  AlfredZhao  阅读(1375)  评论(0编辑  收藏  举报

专题第一篇《Oracle之SQL优化专题01-查看SQL执行计划的方法》讲到了查看SQL执行计划的方法,并介绍了各种方法的应用场景,那么这一篇就主要介绍下如何看懂SQL的执行计划。毕竟如果SQL的执行计划都看不懂,那优化就无从谈起了。

关于如何看懂SQL的执行计划,我把它简单分为3个部分:

1.判断执行计划的执行顺序

**口诀:**先子后父,先上后下。 一般简单的执行计划可以直接根据这个口诀来判断执行计划的执行顺序。类似的口诀还有"最右最上"之类,其表达的意思都是一样的,选择一种自己易接受的口诀记忆即可。 举一个简单的例子:
SQL> show user  
USER is "SYS"
SQL> alter session set current_schema=scott;
SQL> alter session set statistics_level = ALL;
SQL> select a.empno, a.ename, b.dname, a.job, a.sal from emp a, dept b
  2  where a.deptno = b.deptno and empno = 7788
  3  ;

     EMPNO ENAME      DNAME          JOB              SAL
---------- ---------- -------------- --------- ----------
      7788 SCOTT      RESEARCH       ANALYST         3000

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  19ktq5t44xsyp, child number 0
-------------------------------------
select a.empno, a.ename, b.dname, a.job, a.sal from emp a, dept b where
a.deptno = b.deptno and empno = 7788

Plan hash value: 2385808155

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |        |      1 |00:00:00.01 |       4 |
|   1 |  NESTED LOOPS                |         |      1 |      1 |      1 |00:00:00.01 |       4 |
|   2 |   TABLE ACCESS BY INDEX ROWID| EMP     |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  3 |    INDEX UNIQUE SCAN         | PK_EMP  |      1 |      1 |      1 |00:00:00.01 |       1 |
|   4 |   TABLE ACCESS BY INDEX ROWID| DEPT    |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  5 |    INDEX UNIQUE SCAN         | PK_DEPT |      1 |      1 |      1 |00:00:00.01 |       1 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   3 - access("EMPNO"=7788)
   5 - access("A"."DEPTNO"="B"."DEPTNO")


24 rows selected.

执行计划的执行顺序(按Id区分)就是 3 -> 2 -> 5 -> 4 -> 1 -> 0.

这里需要特别注意的是:
1)实际上这个所谓的口诀得到的执行顺序只是为了方便我们理解操作数据的顺序,而执行计划其实是按照Id从上往下递归调用的,简单说其实优化器首先判断是一条select语句,涉及多表关联,关联方式采用了NL Join,然后这个NL Join包含分别对EMP和DEPT索引回表的访问。

2)理解了第一点,那对于SQL语句中select部分含有标量子查询的部分不遵循这个口诀就更好理解了(因为实际这个标量子查询是最后执行的)。
举一个简单例子说明这类标量子查询:

SQL> select a.empno, a.ename, b.dname, a.job, a.sal, (select * from dual) x from emp a, dept b
  2  where a.deptno = b.deptno and empno = 7788
  3  ;

     EMPNO ENAME      DNAME          JOB              SAL X
---------- ---------- -------------- --------- ---------- -
      7788 SCOTT      RESEARCH       ANALYST         3000 X

SQL> select * from table(dbms_xplan.display_cursor(null,null,'allstats last'));

PLAN_TABLE_OUTPUT
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
SQL_ID  dpbagkc0c6ctv, child number 0
-------------------------------------
select a.empno, a.ename, b.dname, a.job, a.sal, (select * from dual) x
from emp a, dept b where a.deptno = b.deptno and empno = 7788

Plan hash value: 1121436124

--------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Starts | E-Rows | A-Rows |   A-Time   | Buffers |
--------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |      1 |        |      1 |00:00:00.01 |       4 |
|   1 |  TABLE ACCESS FULL           | DUAL    |      1 |      1 |      1 |00:00:00.01 |       2 |
|   2 |  NESTED LOOPS                |         |      1 |      1 |      1 |00:00:00.01 |       4 |
|   3 |   TABLE ACCESS BY INDEX ROWID| EMP     |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  4 |    INDEX UNIQUE SCAN         | PK_EMP  |      1 |      1 |      1 |00:00:00.01 |       1 |
|   5 |   TABLE ACCESS BY INDEX ROWID| DEPT    |      1 |      1 |      1 |00:00:00.01 |       2 |
|*  6 |    INDEX UNIQUE SCAN         | PK_DEPT |      1 |      1 |      1 |00:00:00.01 |       1 |
--------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("EMPNO"=7788)
   6 - access("A"."DEPTNO"="B"."DEPTNO")


25 rows selected.

执行计划的执行顺序(按Id区分),如果按口诀来看就是 1 -> 4 -> 3 -> 6 -> 5 -> 2 -> 0.
而由于这其中1是标量子查询,实际要最后执行,所以真实的执行顺序为 4 -> 3 -> 6 -> 5 -> 2 -> 1 -> 0.

2.理解执行计划每步的含义

以上面的执行计划为例,要清楚知道每一列代表的含义: - Id 操作id号;
  • Operation
    这一列,可以看到SQL的每一步操作,深入理解就需要清楚表的主要访问方式(TABLE ACCESS FULL、TABLE ACCESS BY INDEX ROWID、TABLE ACCESS BY USER ROWID),索引的主要访问方式(INDEX UNIQUE SCAN、INDEX RANGE SCAN、INDEX FULL SCAN、INDEX FAST FULL SCAN、INDEX SKIP SCAN),多表查询还需要清楚表连接的方法(NESTED LOOP JOIN、SORT MERGE JOIN、HASH JOIN、MERGE JOIN CARTESIAN)和顺序(先访问驱动表)等知识;

  • Name
    这一列,就是操作的具体对象名称;

  • Starts
    这一列,代表访问次数;比如在NESTED LOOP JOIN中,被驱动表的具体访问次数就可以依据Starts的数值来判断。

  • E-Rows
    预估返回行数;

  • A-Rows
    实际返回行数;

  • A-Time
    实际执行时间;

  • Buffers
    逻辑读块/次;

如果是其他方式,看到的执行计划内容有所不同,比如可能就是下面这样:

Execution Plan
----------------------------------------------------------
Plan hash value: 1121436124

----------------------------------------------------------------------------------------
| Id  | Operation                    | Name    | Rows  | Bytes | Cost (%CPU)| Time     |
----------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |         |     1 |    38 |     2   (0)| 00:00:01 |
|   1 |  TABLE ACCESS FULL           | DUAL    |     1 |     2 |     2   (0)| 00:00:01 |
|   2 |  NESTED LOOPS                |         |     1 |    38 |     2   (0)| 00:00:01 |
|   3 |   TABLE ACCESS BY INDEX ROWID| EMP     |     1 |    25 |     1   (0)| 00:00:01 |
|*  4 |    INDEX UNIQUE SCAN         | PK_EMP  |     1 |       |     0   (0)| 00:00:01 |
|   5 |   TABLE ACCESS BY INDEX ROWID| DEPT    |     1 |    13 |     1   (0)| 00:00:01 |
|*  6 |    INDEX UNIQUE SCAN         | PK_DEPT |     1 |       |     0   (0)| 00:00:01 |
----------------------------------------------------------------------------------------

可以看到,前三列没有区别,后面只有Rows、Bytes、Cost (%CPU)、Time等信息。

3.了解执行计划相关的信息

执行计划下面还有相关的信息,类似如下:
Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("EMPNO"=7788)
   6 - access("A"."DEPTNO"="B"."DEPTNO")

Note
-----
   - SQL plan baseline SQL_PLAN_9sdj6nc4ybu5x2b78d17a used for this statement

Statistics
----------------------------------------------------------
          0  recursive calls
          0  db block gets
          6  consistent gets
          0  physical reads
          0  redo size
        879  bytes sent via SQL*Net to client
        520  bytes received via SQL*Net from client
          2  SQL*Net roundtrips to/from client
          0  sorts (memory)
          0  sorts (disk)
          1  rows processed
  • 谓词信息(Predicate Information)
    标识出哪一步有access、filter这样的操作;

  • Note部分(Note)
    如果SQL执行中用到了动态采样、基数反馈、自适应游标共享、SQL_Profile、SPM等,会在这里显示出来;

  • 统计信息(Statistics)
    这里指的是SQL执行实际发生的递归调用、逻辑读、物理读、排序、处理行数等统计信息。