标量子查询加聚合函数sql改写一
标量子查询的语句:
select /*+ GATHER_PLAN_STATISTICS dwtest */ empno, (select count(*) from DEPT1 b where b.id = a.id) as d, (select sum(x) from DEPT1 b where b.id = a.id) as e from EMP1 a --where a.id in (1,2,3,4,5,6,7,8,9,11,12,13)
创建表:
create table DEPT1 as select * from DEPT; insert into DEPT1 select * from DEPT1; ---多执行几次 create table emp1 as select * from emp; insert into emp1 select * from emp1; ---多执行几次 alter table DEPT1 add id number; update DEPT1 set id=rownum; alter table emp1 add id number; update emp1 set id=rownum;
create index idx_emp1 on emp1(id);
create index idx_DEPT1 on dept1(id);
改写后的语句:
select /*+ GATHER_PLAN_STATISTICS dwtes2 */ empno,d,e from EMP1 a left join (select count(1)d,id,sum(x) e from DEPT1 group by id) b on a.id=b.id --where a.id in (1,2,3,4,5,6,7,8,9,11,12,13);
在放开where条件时,标量子查询时的多次索引范围扫描,导致cost较高,性能比左连接方式要差些,从下图看的不是很明显,如果从monitor看耗时更直观些。得出结论是:左连接改写后效率比标量子查询稍好,但是没有明显提升。
---- SQL_ID 8mgcw7x9x16rq, child number 0 ------------------------------------- select /*+ GATHER_PLAN_STATISTICS dwtest */ empno, (select count(*) from DEPT1 b where b.id = a.id) as d, (select sum(x) from DEPT1 b where b.id = a.id) as e from EMP1 a where a.id in (1,2,3,4,5,6,7,8,9,11,12,13) Plan hash value: 3470857716 ------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | ------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 12 |00:00:00.01 | 9 | 4 | | 1 | SORT AGGREGATE | | 12 | 1 | 12 |00:00:00.01 | 8 | 1 | |* 2 | INDEX RANGE SCAN | IDX_DEPT1 | 12 | 1 | 12 |00:00:00.01 | 8 | 1 | | 3 | SORT AGGREGATE | | 12 | 1 | 12 |00:00:00.01 | 10 | 0 | | 4 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 12 | 1 | 12 |00:00:00.01 | 10 | 0 | |* 5 | INDEX RANGE SCAN | IDX_DEPT1 | 12 | 1 | 12 |00:00:00.01 | 8 | 0 | | 6 | INLIST ITERATOR | | 1 | | 12 |00:00:00.01 | 9 | 4 | | 7 | TABLE ACCESS BY INDEX ROWID| EMP1 | 12 | 12 | 12 |00:00:00.01 | 9 | 4 | |* 8 | INDEX RANGE SCAN | IDX_EMP1 | 12 | 12 | 12 |00:00:00.01 | 8 | 4 | ------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("B"."ID"=:B1) 5 - access("B"."ID"=:B1) 8 - access(("A"."ID"=1 OR "A"."ID"=2 OR "A"."ID"=3 OR "A"."ID"=4 OR "A"."ID"=5 OR "A"."ID"=6 OR "A"."ID"=7 OR "A"."ID"=8 OR "A"."ID"=9 OR "A"."ID"=11 OR "A"."ID"=12 OR "A"."ID"=13)) ----- SQL_ID dpwyqsf1rch2g, child number 0 ------------------------------------- select /*+ GATHER_PLAN_STATISTICS dwtes2 */ empno,d,e from EMP1 a left join (select count(1)d,id,sum(x) e from DEPT1 group by id) b on a.id=b.id where a.id in (1,2,3,4,5,6,7,8,9,11,12,13) Plan hash value: 1193336691 ---------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ---------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 12 |00:00:00.01 | 19 | | | | |* 1 | HASH JOIN OUTER | | 1 | 12 | 12 |00:00:00.01 | 19 | 1969K| 1969K| 1407K (0)| | 2 | INLIST ITERATOR | | 1 | | 12 |00:00:00.01 | 9 | | | | | 3 | TABLE ACCESS BY INDEX ROWID | EMP1 | 12 | 12 | 12 |00:00:00.01 | 9 | | | | |* 4 | INDEX RANGE SCAN | IDX_EMP1 | 12 | 12 | 12 |00:00:00.01 | 8 | | | | | 5 | VIEW | | 1 | 12 | 12 |00:00:00.01 | 10 | | | | | 6 | HASH GROUP BY | | 1 | 12 | 12 |00:00:00.01 | 10 | 1116K| 1116K| 2222K (0)| | 7 | INLIST ITERATOR | | 1 | | 12 |00:00:00.01 | 10 | | | | | 8 | TABLE ACCESS BY INDEX ROWID| DEPT1 | 12 | 12 | 12 |00:00:00.01 | 10 | | | | |* 9 | INDEX RANGE SCAN | IDX_DEPT1 | 12 | 12 | 12 |00:00:00.01 | 8 | | | | ---------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 1 - access("A"."ID"="B"."ID") 4 - access(("A"."ID"=1 OR "A"."ID"=2 OR "A"."ID"=3 OR "A"."ID"=4 OR "A"."ID"=5 OR "A"."ID"=6 OR "A"."ID"=7 OR "A"."ID"=8 OR "A"."ID"=9 OR "A"."ID"=11 OR "A"."ID"=12 OR "A"."ID"=13)) 9 - access(("ID"=1 OR "ID"=2 OR "ID"=3 OR "ID"=4 OR "ID"=5 OR "ID"=6 OR "ID"=7 OR "ID"=8 OR "ID"=9 OR "ID"=11 OR "ID"=12 OR "ID"=13))
在没有where条件时,上例的索引选择性非常好,标量子查询时的多次索引范围扫描,与左连接方式的对两表的全表扫描的hash排序cost差不多,导致两种方式效率相差无几。得出结论是:在关联索引选择性非常好时,左连接改写后效率比标量子查询差不多,但是如果索引选择性一般时,左连接效果要好。
但是在关联条件没有索引时,emp1表多少行,就要对dept是乘2次的全表扫描,此时就会导致cost非常高。
所以,尽量使用左连接加分组来优化