569. 员工薪水中位数

Employee 表包含所有员工。Employee 表有三列:员工Id,公司名和薪水。

+-----+------------+--------+
|Id   | Company    | Salary |
+-----+------------+--------+
|1    | A          | 2341   |
|2    | A          | 341    |
|3    | A          | 15     |
|4    | A          | 15314  |
|5    | A          | 451    |
|6    | A          | 513    |
|7    | B          | 15     |
|8    | B          | 13     |
|9    | B          | 1154   |
|10   | B          | 1345   |
|11   | B          | 1221   |
|12   | B          | 234    |
|13   | C          | 2345   |
|14   | C          | 2645   |
|15   | C          | 2645   |
|16   | C          | 2652   |
|17   | C          | 65     |
+-----+------------+--------+

请编写SQL查询来查找每个公司的薪水中位数。挑战点:你是否可以在不使用任何内置的SQL函数的情况下解决此问题。

+-----+------------+--------+
|Id   | Company    | Salary |
+-----+------------+--------+
|5    | A          | 451    |
|6    | A          | 513    |
|12   | B          | 234    |
|9    | B          | 1154   |
|14   | C          | 2645   |
+-----+------------+--------+


解法一
注意到每个公司的薪水升序后,中位数的下标和个数是确定的。

薪水总数N是偶数时,下标(N-1)/2和N/2位置处为中位数。N是奇数时,下标(N-1)/2为中位数。

对每个公司group by分组,统计每组个数N,确定中位数区间[beg,beg+cnt],beg从0开始。

beg = (N-1)/2

cnt = 0或1,N为偶数时为1,N为奇数时为0。

结果命名为表A。

(
SELECT E.Company,FLOOR((COUNT(*)-1)/2) AS `beg`,if(COUNT(*) % 2=1,0,1) AS `cnt`
FROM employee AS E
GROUP BY E.Company
) AS A
COUNT(*)为每组总数,FLOOR向下取整函数。

计算每个人薪水的升序排名。最小的薪水排第一,第2小的薪水排第二,…..。

薪水比较方法:

if (A.salary = B.salary and A.id > B.id or A.salary > B.salary)
{
那么A的排名在B的排名后。
}
员工表left join员工表,得出同一个公司中,排名在每个人之前的所有人。

SELECT *
FROM employee AS E1
LEFT JOIN employee AS E2
ON(E1.company = E2.company AND (E1.salary = E2.salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary))
那么,按E1分组后,统计每组内,E2的个数即为E1的排名,排名从0开始。

最后结果按薪水升序,命名为表B。

(
SELECT E1.Id,E1.Company,E1.Salary, COUNT(E2.Salary) AS `trank`
FROM employee AS E1
LEFT JOIN employee AS E2
ON(E1.company = E2.company AND (E1.salary = E2.salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary))
GROUP BY E1.Id,E1.Company,E1.Salary
ORDER BY E1.Company,E1.Salary
) AS B
现在有了中位数区间和每个人的排名。

连接表A和表B,公司相同,B.trank在[A.beg,A.beg+A.cnt]之间的行,为中位数。

SELECT B.Id,B.Company,B.Salary
FROM
(
SELECT E.Company,FLOOR((COUNT(*)-1)/2) AS `beg`,if(COUNT(*) % 2=1,0,1) AS `cnt`
FROM employee AS E
GROUP BY E.Company
) AS A
JOIN (
SELECT E1.Id,E1.Company,E1.Salary, COUNT(E2.Salary) AS `trank`
FROM employee AS E1
LEFT JOIN employee AS E2
ON(E1.company = E2.company AND (E1.salary = E2.salary AND E1.Id > E2.Id OR E1.Salary > E2.Salary))
GROUP BY E1.Id,E1.Company,E1.Salary
ORDER BY E1.Company,E1.Salary
) AS B
ON (A.company = B.company AND B.trank BETWEEN A.beg AND (A.beg+A.cnt))
解法二
思路与解法一相同。换了一种计算公司内部人员排名的方法。

解法一采用表left join,分组再计数的方法。

这里,采用用户变量的方法,简化排名的计算。

定义用户变量:pre_company——上一行的公司字段。trank——当前行的排名,从0开始。

(SELECT @pre_company:= NULL,@trank:=0) AS T
计算排名的逻辑:

if (当前行的company = pre_company){
trank = trank + 1
}else{
trank = 0
}
SQL代码,结果命名为表A

(
SELECT
id,
company,
salary,
@trank:=IF(@pre_company=company,@trank+1,0) AS `trank`,
@pre_company:=company
FROM employee,(SELECT @pre_company:= NULL,@trank:=0) AS T
ORDER BY company,salary
) AS A
计算中位数下标的方法如解法一。

薪水总数N是偶数时,下标(N-1)/2和N/2位置处为中位数。N是奇数时,下标(N-1)/2为中位数。

对每个公司group by分组,统计每组个数N,确定中位数区间[beg,beg+cnt],beg从0开始。

beg = (N-1)/2

cnt = 0或1,N为偶数时为1,N为奇数时为0。

结果命名为表B。

(
SELECT company, FLOOR((COUNT(*)-1)/2) AS `beg`, if(COUNT(*)%2=1,0,1) AS `cnt`
FROM employee
GROUP BY company
) AS B
连接表A和表B ,公司相同,B.trank在[A.beg,A.beg+A.cnt]之间的行,为中位数。

SELECT A.Id,A.Company,A.Salary
FROM
(
SELECT
id,
company,
salary,
@trank:=IF(@pre_company=company,@trank+1,0) AS `trank`,
@pre_company:=company
FROM employee,(SELECT @pre_company:= NULL,@trank:=0) AS T
ORDER BY company,salary
) AS A
JOIN
(
SELECT company, FLOOR((COUNT(*)-1)/2) AS `beg`, if(COUNT(*)%2=1,0,1) AS `cnt`
FROM employee
GROUP BY company
) AS B
ON (A.company=B.company AND A.trank BETWEEN B.beg AND B.beg+B.cnt)

作者:jason-2
链接:https://leetcode-cn.com/problems/median-employee-salary/solution/liang-chong-jie-fa-by-jason-2-2/
来源:力扣(LeetCode)
著作权归作者所有。商业转载请联系作者获得授权,非商业转载请注明出处。

posted @ 2019-11-20 23:39  lihui1625  阅读(949)  评论(0编辑  收藏  举报