将数列唯一值化后再求中值的效率比较 第一方案胜出,加索引后在近两百万数据中查出中值耗时0.176秒
建表及充值:
create table hy_emp( id number(9,0) primary key, name nvarchar2(20) not null, salary integer not null); insert into hy_emp select rownum,dbms_random.string('*',dbms_random.value(2,20)),dbms_random.value(1000,10000) from dual connect by level<2000000 order by dbms_random.random;
方案一(运行时间0.95秒,cost3082)
SQL:
with soloemp as (select distinct salary from hy_emp) select avg(salary) from (select row_number() over (order by salary) as seq, row_number() over (order by salary desc) as revseq,salary from soloemp) a where abs(a.seq-a.revseq)<=1
解释计划:
Plan hash value: 2542413663 ---------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | 3082 (3)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 39 | | | |* 2 | VIEW | | 9001 | 342K| 3082 (3)| 00:00:01 | | 3 | WINDOW SORT | | 9001 | 114K| 3082 (3)| 00:00:01 | | 4 | WINDOW SORT | | 9001 | 114K| 3082 (3)| 00:00:01 | | 5 | VIEW | | 9001 | 114K| 3081 (3)| 00:00:01 | | 6 | HASH UNIQUE | | 9001 | 36004 | 3081 (3)| 00:00:01 | | 7 | TABLE ACCESS FULL| HY_EMP | 1999K| 7812K| 3026 (1)| 00:00:01 | ---------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(ABS("A"."SEQ"-"A"."REVSEQ")<=1)
方案二(运行时间0.85秒,cost:3089)
SQL:
with soloemp as (select distinct salary from hy_emp) select avg(b.salary) from (select a.salary,abs(a.seq-a.revseq) as diff from (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) b where b.diff=(select min(c.diff) from (select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) c)
解释计划:
Plan hash value: 2351668320 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | 3089 (3)| 00:00:01 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A2_18C7CE | | | | | | 3 | HASH UNIQUE | | 9001 | 36004 | 3081 (3)| 00:00:01 | | 4 | TABLE ACCESS FULL | HY_EMP | 1999K| 7812K| 3026 (1)| 00:00:01 | | 5 | SORT AGGREGATE | | 1 | 39 | | | |* 6 | VIEW | | 9001 | 342K| 4 (25)| 00:00:01 | | 7 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 8 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 9 | VIEW | | 9001 | 36004 | 3 (0)| 00:00:01 | | 10 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A2_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | | 11 | SORT AGGREGATE | | 1 | 26 | | | | 12 | VIEW | | 9001 | 228K| 4 (25)| 00:00:01 | | 13 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 14 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 15 | VIEW | | 9001 | 36004 | 3 (0)| 00:00:01 | | 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A2_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - filter(ABS("A"."SEQ"-"A"."REVSEQ")= (SELECT MIN(ABS("A"."SEQ"-"A"."REVSEQ")) FROM (SELECT DENSE_RANK() OVER ( ORDER BY "SALARY") "SEQ",DENSE_RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SALARY") DESC ) "REVSEQ" FROM (SELECT /*+ CACHE ("T1") */ "C0" "SALARY" FROM "SYS"."SYS_TEMP_0FD9DE4A2_18C7CE" "T1") "SOLOEMP" ORDER BY "SALARY") "A"))
方案三(用时1.663秒,Cost:3089):
SQL:
with soloemp as (select distinct salary from hy_emp) select avg(a.salary) from (select salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from soloemp) as ceil from soloemp) a where a.seq=a.ceil or a.revseq=a.ceil
解释计划:
Plan hash value: 940756949 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 52 | 3089 (3)| 00:00:01 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A4_18C7CE | | | | | | 3 | HASH UNIQUE | | 9001 | 36004 | 3081 (3)| 00:00:01 | | 4 | TABLE ACCESS FULL | HY_EMP | 1999K| 7812K| 3026 (1)| 00:00:01 | | 5 | SORT AGGREGATE | | 1 | 52 | | | | 6 | SORT AGGREGATE | | 1 | | | | | 7 | VIEW | | 9001 | | 3 (0)| 00:00:01 | | 8 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A4_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | |* 9 | VIEW | | 9001 | 457K| 8 (25)| 00:00:01 | | 10 | WINDOW SORT | | 9001 | 114K| 8 (25)| 00:00:01 | | 11 | WINDOW SORT | | 9001 | 114K| 8 (25)| 00:00:01 | | 12 | VIEW | | 9001 | 114K| 3 (0)| 00:00:01 | | 13 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A4_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 9 - filter("A"."SEQ"="A"."CEIL" OR "A"."REVSEQ"="A"."CEIL")
从目前看来,第一种方案以耗时少和cost低稳居榜首。
让我们增加一个salary列上的索引:
create index idx_hyemp_salary on hy_emp(salary);
再看三种方案的耗时和cost:
方案一(耗时0.176秒,Cost:1198):
with soloemp as (select distinct salary from hy_emp) select avg(salary) from (select row_number() over (order by salary) as seq, row_number() over (order by salary desc) as revseq,salary from soloemp) a where abs(a.seq-a.revseq)<=1
解释计划:
Plan hash value: 3967344228 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | 1198 (6)| 00:00:01 | | 1 | SORT AGGREGATE | | 1 | 39 | | | |* 2 | VIEW | | 9001 | 342K| 1198 (6)| 00:00:01 | | 3 | WINDOW SORT | | 9001 | 114K| 1198 (6)| 00:00:01 | | 4 | WINDOW SORT | | 9001 | 114K| 1198 (6)| 00:00:01 | | 5 | VIEW | | 9001 | 114K| 1196 (6)| 00:00:01 | | 6 | HASH UNIQUE | | 9001 | 36004 | 1196 (6)| 00:00:01 | | 7 | INDEX FAST FULL SCAN| IDX_HYEMP_SALARY | 1999K| 7812K| 1142 (1)| 00:00:01 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(ABS("A"."SEQ"-"A"."REVSEQ")<=1)
方案二:(耗时0.252秒,Cost:1205)
SQL:
with soloemp as (select distinct salary from hy_emp) select avg(b.salary) from (select a.salary,abs(a.seq-a.revseq) as diff from (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) b where b.diff=(select min(c.diff) from (select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from soloemp order by salary) a ) c)
解释计划:
Plan hash value: 3915164046 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | 1205 (6)| 00:00:01 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A6_18C7CE | | | | | | 3 | HASH UNIQUE | | 9001 | 36004 | 1196 (6)| 00:00:01 | | 4 | INDEX FAST FULL SCAN | IDX_HYEMP_SALARY | 1999K| 7812K| 1142 (1)| 00:00:01 | | 5 | SORT AGGREGATE | | 1 | 39 | | | |* 6 | VIEW | | 9001 | 342K| 4 (25)| 00:00:01 | | 7 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 8 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 9 | VIEW | | 9001 | 36004 | 3 (0)| 00:00:01 | | 10 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A6_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | | 11 | SORT AGGREGATE | | 1 | 26 | | | | 12 | VIEW | | 9001 | 228K| 4 (25)| 00:00:01 | | 13 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 14 | WINDOW SORT | | 9001 | 36004 | 4 (25)| 00:00:01 | | 15 | VIEW | | 9001 | 36004 | 3 (0)| 00:00:01 | | 16 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A6_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 6 - filter(ABS("A"."SEQ"-"A"."REVSEQ")= (SELECT MIN(ABS("A"."SEQ"-"A"."REVSEQ")) FROM (SELECT DENSE_RANK() OVER ( ORDER BY "SALARY") "SEQ",DENSE_RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SALARY") DESC ) "REVSEQ" FROM (SELECT /*+ CACHE ("T1") */ "C0" "SALARY" FROM "SYS"."SYS_TEMP_0FD9DE4A6_18C7CE" "T1") "SOLOEMP" ORDER BY "SALARY") "A"))
方案三(0.24秒,Cost:1205):
SQL:
with soloemp as (select distinct salary from hy_emp) select avg(a.salary) from (select salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from soloemp) as ceil from soloemp) a where a.seq=a.ceil or a.revseq=a.ceil
解释计划:
Plan hash value: 234565008 ---------------------------------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time | ---------------------------------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 52 | 1205 (6)| 00:00:01 | | 1 | TEMP TABLE TRANSFORMATION | | | | | | | 2 | LOAD AS SELECT (CURSOR DURATION MEMORY)| SYS_TEMP_0FD9DE4A8_18C7CE | | | | | | 3 | HASH UNIQUE | | 9001 | 36004 | 1196 (6)| 00:00:01 | | 4 | INDEX FAST FULL SCAN | IDX_HYEMP_SALARY | 1999K| 7812K| 1142 (1)| 00:00:01 | | 5 | SORT AGGREGATE | | 1 | 52 | | | | 6 | SORT AGGREGATE | | 1 | | | | | 7 | VIEW | | 9001 | | 3 (0)| 00:00:01 | | 8 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A8_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | |* 9 | VIEW | | 9001 | 457K| 8 (25)| 00:00:01 | | 10 | WINDOW SORT | | 9001 | 114K| 8 (25)| 00:00:01 | | 11 | WINDOW SORT | | 9001 | 114K| 8 (25)| 00:00:01 | | 12 | VIEW | | 9001 | 114K| 3 (0)| 00:00:01 | | 13 | TABLE ACCESS FULL | SYS_TEMP_0FD9DE4A8_18C7CE | 9001 | 36004 | 3 (0)| 00:00:01 | ---------------------------------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 9 - filter("A"."SEQ"="A"."CEIL" OR "A"."REVSEQ"="A"."CEIL")
还是第一种方案占优。
最终的状元是第一种方案,它的成绩是两百万数据中查询中值用时0.176秒:
with soloemp as (select distinct salary from hy_emp) select avg(salary) from (select row_number() over (order by salary) as seq, row_number() over (order by salary desc) as revseq,salary from soloemp) a where abs(a.seq-a.revseq)<=1
--2020-04-14--
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· C#/.NET/.NET Core优秀项目和框架2025年2月简报
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 【杭电多校比赛记录】2025“钉耙编程”中国大学生算法设计春季联赛(1)