新三种求数列中值SQL之效率再比拼
在 https://www.cnblogs.com/xiandedanteng/p/12677688.html 中我列举了三种求中值方案,其中日本人MICK的做法因为不适用于二百万结果集而放弃,取而代之是新方案一。
新方案一:
经过思考后我又得出了一种中值的新解法,那就是利用排序后正向序列和反向序列交叉点为中值区的原理,如果两个序列相减小于等于一则求所在区域的均值即可。
SQL:
select avg(a.salary) from (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a where abs(a.seq-a.revseq)<=1
解释计划:
SQL> select avg(a.salary) from 2 (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a where abs(a.seq-a.revseq)<=1; 已用时间: 00: 00: 00.00 执行计划 ---------------------------------------------------------- Plan hash value: 9035349 --------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | --------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | | 8850 (2)| 00:01:47 | | 1 | SORT AGGREGATE | | 1 | 39 | | | | |* 2 | VIEW | | 2000K| 74M| | 8850 (2)| 00:01:47 | | 3 | WINDOW SORT | | 2000K| 7812K| 22M| 8850 (2)| 00:01:47 | | 4 | WINDOW SORT | | 2000K| 7812K| 22M| 8850 (2)| 00:01:47 | | 5 | TABLE ACCESS FULL| TB_EMPLOYEE | 2000K| 7812K| | 3045 (2)| 00:00:37 | --------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(ABS("A"."SEQ"-"A"."REVSEQ")<=1)
从Cost看是目前最快的。
执行时间:
SQL> select avg(a.salary) from 2 (select salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a where abs(a.seq-a.revseq)<=1; AVG(A.SALARY) ------------- 5500 已用时间: 00: 00: 02.46
从运行时间上看排第二。
经局部优化后两种方案SQL如下:
原有方案二:
select avg(b.salary) from (select a.salary,abs(a.seq-a.revseq) as diff from (select id,salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a ) b where b.diff=(select min(c.diff) from (select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a ) c)
解释计划:
SQL> select avg(b.salary) from 2 (select a.salary,abs(a.seq-a.revseq) as diff from (select id,salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a ) b 3 where b.diff=(select min(c.diff) from 4 (select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a ) c); 已用时间: 00: 00: 00.00 执行计划 ---------------------------------------------------------- Plan hash value: 3874635296 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 39 | | 19969 (2)| 00:04:00 | | 1 | SORT AGGREGATE | | 1 | 39 | | | | |* 2 | VIEW | | 2000K| 74M| | 11119 (2)| 00:02:14 | | 3 | WINDOW SORT | | 2000K| 19M| 38M| 11119 (2)| 00:02:14 | | 4 | WINDOW SORT | | 2000K| 19M| 38M| 11119 (2)| 00:02:14 | | 5 | TABLE ACCESS FULL | TB_EMPLOYEE | 2000K| 19M| | 3045 (2)| 00:00:37 | | 6 | SORT AGGREGATE | | 1 | 26 | | | | | 7 | VIEW | | 2000K| 49M| | 8850 (2)| 00:01:47 | | 8 | WINDOW SORT | | 2000K| 7812K| 22M| 8850 (2)| 00:01:47 | | 9 | WINDOW SORT | | 2000K| 7812K| 22M| 8850 (2)| 00:01:47 | | 10 | TABLE ACCESS FULL| TB_EMPLOYEE | 2000K| 7812K| | 3045 (2)| 00:00:37 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 2 - filter(ABS("A"."SEQ"-"A"."REVSEQ")= (SELECT MIN(ABS("A"."SEQ"-"A"."REVSEQ")) FROM (SELECT DENSE_RANK() OVER ( ORDER BY "SALARY") "SEQ",DENSE_RANK() OVER ( ORDER BY INTERNAL_FUNCTION("SALARY") DESC ) "REVSEQ" FROM "TB_EMPLOYEE" "TB_EMPLOYEE" ORDER BY "SALARY") "A"))
执行时间:
SQL> select avg(b.salary) from
2 (select a.salary,abs(a.seq-a.revseq) as diff from (select id,salary,dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a ) b
3 where b.diff=(select min(c.diff) from
4 (select abs(a.seq-a.revseq) as diff from (select dense_rank() over (order by salary asc) as seq,dense_rank() over (order by salary desc) as revseq from tb_employee order by salary) a ) c);
AVG(B.SALARY)
-------------
5500
已用时间: 00: 00: 04.56
原有方案三:
select avg(a.salary) from (select salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from tb_employee) as ceil from tb_employee) a where a.seq=a.ceil or a.revseq=a.ceil
解释计划:
SQL> select avg(a.salary) from 2 (select 3 salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from tb_employee) as ceil 4 from tb_employee) a 5 where a.seq=a.ceil or a.revseq=a.ceil; 已用时间: 00: 00: 00.00 执行计划 ---------------------------------------------------------- Plan hash value: 3541675306 ----------------------------------------------------------------------------------------------- | Id | Operation | Name | Rows | Bytes |TempSpc| Cost (%CPU)| Time | ----------------------------------------------------------------------------------------------- | 0 | SELECT STATEMENT | | 1 | 52 | | 14656 (3)| 00:02:56 | | 1 | SORT AGGREGATE | | 1 | 52 | | | | | 2 | SORT AGGREGATE | | 1 | | | | | | 3 | INDEX FAST FULL SCAN| SYS_C0012264 | 2000K| | | 1474 (2)| 00:00:18 | |* 4 | VIEW | | 2000K| 99M| | 14656 (3)| 00:02:56 | | 5 | WINDOW SORT | | 2000K| 7812K| 22M| 14656 (3)| 00:02:56 | | 6 | WINDOW SORT | | 2000K| 7812K| 22M| 14656 (3)| 00:02:56 | | 7 | TABLE ACCESS FULL | TB_EMPLOYEE | 2000K| 7812K| | 3045 (2)| 00:00:37 | ----------------------------------------------------------------------------------------------- Predicate Information (identified by operation id): --------------------------------------------------- 4 - filter("A"."SEQ"="A"."CEIL" OR "A"."REVSEQ"="A"."CEIL")
消耗时间:
SQL> ; 1 select avg(b.salary) from 2 (select 3 salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from tb_employee) as ceil 4 - filter("A"."SEQ"="A"."CEIL" OR "A"."REVSEQ"="A"."CEIL"); 5* where a.seq=a.ceil or a.revseq=a.ceil SQL> select avg(a.salary) from 2 (select 3 salary,row_number() over (order by salary asc) as seq,row_number() over (order by salary desc) as revseq,(select ceil(count(*)/2) from tb_employee) as ceil 4 from tb_employee) a 5 where a.seq=a.ceil or a.revseq=a.ceil; AVG(A.SALARY) ------------- 5498 已用时间: 00: 00: 01.98
而单查表数量的解释计划大家也参考看一下:
SQL> select count(*) from tb_employee; 已用时间: 00: 00: 00.01 执行计划 ---------------------------------------------------------- Plan hash value: 2866911338 ------------------------------------------------------------------------------ | Id | Operation | Name | Rows | Cost (%CPU)| Time | ------------------------------------------------------------------------------ | 0 | SELECT STATEMENT | | 1 | 1474 (2)| 00:00:18 | | 1 | SORT AGGREGATE | | 1 | | | | 2 | INDEX FAST FULL SCAN| SYS_C0012264 | 2000K| 1474 (2)| 00:00:18 | ------------------------------------------------------------------------------
最后的大比拼表格:
方案 | Cost | 耗时 |
新方案一 | 8850 | 2.46秒 |
原有方案二 | 19946 | 4.56秒 |
原有方案三 | 14646 | 1.98 |
截至目前,方案三以较高效率胜出,新方案一其实也不错,原有方案三排第三位。
--2020年4月12日--
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· C#/.NET/.NET Core优秀项目和框架2025年2月简报
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 【杭电多校比赛记录】2025“钉耙编程”中国大学生算法设计春季联赛(1)
2018-04-12 【Python】得到当前日期时间
2018-04-12 【Python】在控制台输出不同颜色的文字
2018-04-12 【python】如何去掉使用BeautifulSoup读取html出现的警告UserWarning: You provided Unicode markup but also provided a value for from_encoding