求程序员工资的中位数 附两种众数求法
除了平均数外,众数和中位数也是衡量集合的标尺之一,中位数是将集合升序排列后恰好位于正中间的元素,如果集合总数为偶数,则取中间两个元素的平均值作为中位数,下文将就用SQL去求中位数展开讨论。
首先建表:
create table tb_coder( id number(4,0) not null primary key, name nvarchar2(20) not null, salary integer not null)
然后充值:
insert into tb_coder select rownum,dbms_random.string('*',dbms_random.value(2,20)),dbms_random.value(3000,20000) from dual connect by level<11 order by dbms_random.random;
然后我们看看表中数据:
SQL> select * from tb_coder order by salary; ID NAME SALARY ---------- ---------------------------------------- ---------- 9 DTVMMQSOAJA 4947 10 CSIZK 5424 2 TYLWH 5676 7 EPUIILOYENSD 10316 3 MSKHEGSWNIFZWB 15153 4 OFZKCXNWZWSJR 15326 5 HOJLJWIOYNDBB 16101 8 XJQTKPDXAIOOWJ 16843 6 RWRGMEIRGZN 18599 1 OZKFBBFYWGEERAUGSNH 19710 已选择10行。
现在可以观察得知,总个数为10,是偶数,中位数就该是处于中间两个数15153和15326的平均值:15329.5
经过思考,我发现解决问题的关键是得知中间在哪,为此我添加了两列以帮助思考:
SQL> select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary; ID SALARY SEQ REVSEQ ---------- ---------- ---------- ---------- 9 4947 1 10 10 5424 2 9 2 5676 3 8 7 10316 4 7 3 15153 5 6 4 15326 6 5 5 16101 7 4 8 16843 8 3 6 18599 9 2 1 19710 10 1 已选择10行。
进过观察可以知道,用seq减去revseq再求绝对值,值大的一定靠两边,值小的一定靠中间。于是再查:
SQL> select a.*,abs(a.seq-a.revseq) as diff from 2 (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a; ID SALARY SEQ REVSEQ DIFF ---------- ---------- ---------- ---------- ---------- 9 4947 1 10 9 10 5424 2 9 7 2 5676 3 8 5 7 10316 4 7 3 3 15153 5 6 1 4 15326 6 5 1 5 16101 7 4 3 8 16843 8 3 5 6 18599 9 2 7 1 19710 10 1 9 已选择10行。
从上面已经明显看出,diff值最小的就是我们要找的15153和15326两条记录。
如果我取diff值最小的记录,求平均值不就是中位数了吗,于是有了下面的SQL:
SQL> select avg(b.salary) from 2 (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) b 3 where b.diff=(select min(c.diff) from 4 (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) c); AVG(B.SALARY) ------------- 15239.5
结果等于预期,这时偶数个情况,那么奇数个呢?
让我们先删除最后一行:
SQL> delete from tb_coder where id=10; 已删除 1 行。 SQL> commit; 提交完成。
再看结果集:
SQL> select * from tb_coder order by salary; ID NAME SALARY ---------- ---------------------------------------- ---------- 9 DTVMMQSOAJA 4947 2 TYLWH 5676 7 EPUIILOYENSD 10316 3 MSKHEGSWNIFZWB 15153 4 OFZKCXNWZWSJR 15326 5 HOJLJWIOYNDBB 16101 8 XJQTKPDXAIOOWJ 16843 6 RWRGMEIRGZN 18599 1 OZKFBBFYWGEERAUGSNH 19710 已选择9行。
只剩下九条,那么处于中间的15326应该是中位数。
看看是不是:
SQL> select avg(b.salary) from 2 (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) b 3 where b.diff=(select min(c.diff) from 4 (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) c); AVG(B.SALARY) ------------- 15326
结果等于预期!
附:另一种中位数求法,这种方法比较简洁,但是是别人的方法《SQL进阶教程》P61-P62,不是我自己想出来的。
select avg(distinct salary) from ( select t1.salary from tb_coder t1,tb_coder t2 group by t1.salary having sum(case when t1.salary>=t2.salary then 1 else 0 end)>=count(*)/2 and sum(case when t1.salary<=t2.salary then 1 else 0 end)>=count(*)/2 )
另外众数也是衡量集合属性的标尺之一,下面列出了两种求众数的方法供大家参考(一千元分一个档):
--求众数(使用谓词) select a.sal,count(*) as cnt from (select round(salary/1000)*1000 as sal from tb_coder) a group by a.sal having count(*) >= all(select count(*) from (select round(salary/1000)*1000 as sal from tb_coder) b group by b.sal) --求众数(使用rownum) select b.sal,b.cnt from ( select a.sal,count(*) as cnt from (select round(salary/1000)*1000 as sal from tb_coder) a group by a.sal order by cnt desc ) b where rownum=1
--2020年4月5日--
以上用到的全部SQL:
create table tb_coder( id number(4,0) not null primary key, name nvarchar2(20) not null, salary integer not null) insert into tb_coder select rownum,dbms_random.string('*',dbms_random.value(2,20)),dbms_random.value(3000,20000) from dual connect by level<11 order by dbms_random.random; select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a select b.* from (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) b order by b.diff select b.* from (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) b where b.diff=(select min(c.diff) from (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) c) select avg(b.salary) from (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) b where b.diff=(select min(c.diff) from (select a.*,abs(a.seq-a.revseq) as diff from (select id,salary,rank() over (order by salary asc) as seq,rank() over (order by salary desc) as revseq from tb_coder order by salary) a ) c)
分类:
Oracle.繁难Sql文
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
· Linux系列:如何用 C#调用 C方法造成内存泄露
· AI与.NET技术实操系列(二):开始使用ML.NET
· 无需6万激活码!GitHub神秘组织3小时极速复刻Manus,手把手教你使用OpenManus搭建本
· C#/.NET/.NET Core优秀项目和框架2025年2月简报
· Manus爆火,是硬核还是营销?
· 终于写完轮子一部分:tcp代理 了,记录一下
· 【杭电多校比赛记录】2025“钉耙编程”中国大学生算法设计春季联赛(1)
2018-04-05 【Canvas与桌面】十字网格黑灰背景(1920x1080)
2018-04-05 【Canvas与电脑桌面】米字格黑灰背景电脑桌面(1920x1080)