Oracle反连接HASH JOIN ANTI NA会处理驱动表连接列null值
Oracle反连接HASH JOIN ANTI NA会处理驱动表连接列null值
这个现象和Oracle内连接HASH JOIN/半连接HASH JOIN SEMI不处理驱动表连接列null值相反。
反连接中无论一下哪个结论都一样:
- HASH JOIN ANTI NA
- HASH JOIN ANTI SNA
- HASH JOIN RIGHT ANTI NA
- HASH JOIN RIGHT ANTI SNA
测试表t,tt的信息如下:其中t表全是null值。
15:16:18 ZKM@zkm(37)> desc t 15:16:20 ZKM@zkm(37)> desc tt Name Null? Type Name Null? Type ------- -------- ------------- ------- -------- ------------- COL VARCHAR2(20) COL VARCHAR2(20) 15:17:40 ZKM@zkm(37)> select col,count(*) from t group by col; 15:17:43 ZKM@zkm(37)> select col,count(*) from tt group by col; COL COUNT(*) COL COUNT(*) ------ ---------- ------ ---------- 1000000 1 a 1 Elapsed: 00:00:00.16 Elapsed: 00:00:00.00
现有sql和执行情况如下:
其中,id=2花费的时间达到25s(33-7-1),如果是Oracle内连接HASH JOIN/半连接HASH JOIN SEMI不处理驱动表连接列null值中,会忽略null值导致秒出。
select /*+ leading(t) */ count(*) from t where t.col not in (select tt.col from tt); Plan hash value: 1065647028 ----------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | OMem | 1Mem | Used-Mem | ----------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:00.33 | 1578 | | | | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:00.33 | 1578 | | | | |* 2 | HASH JOIN ANTI NA | | 1 | 201M| 0 |00:00:00.33 | 1578 | 37M| 7549K| 58M (0)| | 3 | TABLE ACCESS FULL| T | 1 | 201M| 1000K|00:00:00.07 | 1574 | | | | | 4 | TABLE ACCESS FULL| TT | 1 | 2 | 1 |00:00:00.01 | 4 | | | | ----------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T"."COL"="TT"."COL")
创建索引,还是一样。
create index idx_col on t(col);
insert多15000000多行null值数据,如下:
16:43:47 ZKM@zkm(37)> select col,count(*) from t group by col; COL COUNT(*) ------------------------------------------------------------ ---------- 16000000 Elapsed: 00:00:02.62
再次执行sql语句:防偷
时间和消耗大大增加,说明在反连接的情况下,hash join无法忽略null值,还是会对null进行hash值计算,其实这很不合理(其实合理,详见:https://www.cnblogs.com/PiscesCanon/p/17481966.html)。
select /*+ leading(t) */ count(*) from t where t.col not in (select tt.col from tt); Plan hash value: 1065647028 --------------------------------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Starts | E-Rows | A-Rows | A-Time | Buffers | Reads | Writes | OMem | 1Mem | Used-Mem | Used-Tmp| --------------------------------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | | 1 |00:00:06.05 | 24392 | 14229 | 14229 | | | | | | 1 | SORT AGGREGATE | | 1 | 1 | 1 |00:00:06.05 | 24392 | 14229 | 14229 | | | | | |* 2 | HASH JOIN ANTI NA | | 1 | 201M| 0 |00:00:06.05 | 24392 | 14229 | 14229 | 542M| 29M| 892M (0)| 115K| | 3 | TABLE ACCESS FULL| T | 1 | 201M| 16M|00:00:01.18 | 24388 | 0 | 0 | | | | | | 4 | TABLE ACCESS FULL| TT | 1 | 2 | 1 |00:00:00.01 | 4 | 0 | 0 | | | | | --------------------------------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - access("T"."COL"="TT"."COL")