标量子查询加聚合函数sql改写一

标量子查询的语句:

select /*+ GATHER_PLAN_STATISTICS  dwtest */ empno,
       (select count(*) from DEPT1 b where b.id = a.id) as d,
       (select sum(x) from DEPT1 b where b.id = a.id) as e
  from EMP1 a --where a.id in (1,2,3,4,5,6,7,8,9,11,12,13) 

创建表:

create table DEPT1 as select  * from DEPT;
insert into DEPT1 select  * from DEPT1;   ---多执行几次

create table emp1 as select  * from emp;
insert into emp1 select  * from emp1;      ---多执行几次

alter table DEPT1 add id number;
update DEPT1 set id=rownum;

alter table emp1 add id number;
update emp1 set id=rownum;
create index idx_emp1 on emp1(id);

create index idx_DEPT1 on dept1(id);

改写后的语句:

select /*+ GATHER_PLAN_STATISTICS  dwtes2 */ empno,d,e from EMP1 a
left join (select count(1)d,id,sum(x) e from DEPT1 group by id) b on a.id=b.id --where a.id in (1,2,3,4,5,6,7,8,9,11,12,13);

  在放开where条件时,标量子查询时的多次索引范围扫描,导致cost较高,性能比左连接方式要差些,从下图看的不是很明显,如果从monitor看耗时更直观些。得出结论是:左连接改写后效率比标量子查询稍好,但是没有明显提升。

----
SQL_ID  8mgcw7x9x16rq, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS  dwtest */ empno,        (select 
count(*) from DEPT1 b where b.id = a.id) as d,        (select sum(x) 
from DEPT1 b where b.id = a.id) as e   from EMP1 a where a.id in 
(1,2,3,4,5,6,7,8,9,11,12,13)
 
Plan hash value: 3470857716
 
-------------------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
-------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT             |           |      1 |        |     12 |00:00:00.01 |       9 |      4 |
|   1 |  SORT AGGREGATE              |           |     12 |      1 |     12 |00:00:00.01 |       8 |      1 |
|*  2 |   INDEX RANGE SCAN           | IDX_DEPT1 |     12 |      1 |     12 |00:00:00.01 |       8 |      1 |
|   3 |  SORT AGGREGATE              |           |     12 |      1 |     12 |00:00:00.01 |      10 |      0 |
|   4 |   TABLE ACCESS BY INDEX ROWID| DEPT1     |     12 |      1 |     12 |00:00:00.01 |      10 |      0 |
|*  5 |    INDEX RANGE SCAN          | IDX_DEPT1 |     12 |      1 |     12 |00:00:00.01 |       8 |      0 |
|   6 |  INLIST ITERATOR             |           |      1 |        |     12 |00:00:00.01 |       9 |      4 |
|   7 |   TABLE ACCESS BY INDEX ROWID| EMP1      |     12 |     12 |     12 |00:00:00.01 |       9 |      4 |
|*  8 |    INDEX RANGE SCAN          | IDX_EMP1  |     12 |     12 |     12 |00:00:00.01 |       8 |      4 |
-------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   2 - access("B"."ID"=:B1)
   5 - access("B"."ID"=:B1)
   8 - access(("A"."ID"=1 OR "A"."ID"=2 OR "A"."ID"=3 OR "A"."ID"=4 OR "A"."ID"=5 OR "A"."ID"=6 OR 
              "A"."ID"=7 OR "A"."ID"=8 OR "A"."ID"=9 OR "A"."ID"=11 OR "A"."ID"=12 OR "A"."ID"=13))
 


-----
SQL_ID  dpwyqsf1rch2g, child number 0
-------------------------------------
select /*+ GATHER_PLAN_STATISTICS  dwtes2 */ empno,d,e from EMP1 a left 
join (select count(1)d,id,sum(x) e from DEPT1 group by id) b on 
a.id=b.id where a.id in (1,2,3,4,5,6,7,8,9,11,12,13)
 
Plan hash value: 1193336691
 
----------------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                       | Name      | Starts | E-Rows | A-Rows |   A-Time   | Buffers |  OMem |  1Mem | Used-Mem |
----------------------------------------------------------------------------------------------------------------------------------
|   0 | SELECT STATEMENT                |           |      1 |        |     12 |00:00:00.01 |      19 |       |       |          |
|*  1 |  HASH JOIN OUTER                |           |      1 |     12 |     12 |00:00:00.01 |      19 |  1969K|  1969K| 1407K (0)|
|   2 |   INLIST ITERATOR               |           |      1 |        |     12 |00:00:00.01 |       9 |       |       |          |
|   3 |    TABLE ACCESS BY INDEX ROWID  | EMP1      |     12 |     12 |     12 |00:00:00.01 |       9 |       |       |          |
|*  4 |     INDEX RANGE SCAN            | IDX_EMP1  |     12 |     12 |     12 |00:00:00.01 |       8 |       |       |          |
|   5 |   VIEW                          |           |      1 |     12 |     12 |00:00:00.01 |      10 |       |       |          |
|   6 |    HASH GROUP BY                |           |      1 |     12 |     12 |00:00:00.01 |      10 |  1116K|  1116K| 2222K (0)|
|   7 |     INLIST ITERATOR             |           |      1 |        |     12 |00:00:00.01 |      10 |       |       |          |
|   8 |      TABLE ACCESS BY INDEX ROWID| DEPT1     |     12 |     12 |     12 |00:00:00.01 |      10 |       |       |          |
|*  9 |       INDEX RANGE SCAN          | IDX_DEPT1 |     12 |     12 |     12 |00:00:00.01 |       8 |       |       |          |
----------------------------------------------------------------------------------------------------------------------------------
 
Predicate Information (identified by operation id):
---------------------------------------------------
 
   1 - access("A"."ID"="B"."ID")
   4 - access(("A"."ID"=1 OR "A"."ID"=2 OR "A"."ID"=3 OR "A"."ID"=4 OR "A"."ID"=5 OR "A"."ID"=6 OR "A"."ID"=7 OR 
              "A"."ID"=8 OR "A"."ID"=9 OR "A"."ID"=11 OR "A"."ID"=12 OR "A"."ID"=13))
   9 - access(("ID"=1 OR "ID"=2 OR "ID"=3 OR "ID"=4 OR "ID"=5 OR "ID"=6 OR "ID"=7 OR "ID"=8 OR "ID"=9 OR "ID"=11 OR 
              "ID"=12 OR "ID"=13))

  在没有where条件时,上例的索引选择性非常好,标量子查询时的多次索引范围扫描,与左连接方式的对两表的全表扫描的hash排序cost差不多,导致两种方式效率相差无几。得出结论是:在关联索引选择性非常好时,左连接改写后效率比标量子查询差不多,但是如果索引选择性一般时,左连接效果要好。

  但是在关联条件没有索引时,emp1表多少行,就要对dept是乘2次的全表扫描,此时就会导致cost非常高。

所以,尽量使用左连接加分组来优化

posted on 2023-02-02 17:36  小杜的学习天地  阅读(60)  评论(0编辑  收藏  举报

导航