窗口函数【分析函数】

窗口函数【分析函数】【聚合函数 over(paratition by 字段 / orderr by 字段) as 别名】

Mysql8.0 支持窗口函数【Window Function】,也称分析函数
与分组聚合函数类似,但是 每一行数据都生成一个结果
聚合窗口函数:SUM、AVG、COUNT、MAX、MIN等等

1. 窗口函数 VS 传统聚合函数

传统聚合函数 group by

窗口函数【分析函数】===> 可以显示出原始数据

select 
    year, 
    country, 
    product, 
    profit, 
	sum(profit) OVER (partition by country) as country_profit
from
	sales
order by
	country,
	year,
	product,
	profit

2. 专用窗口函数

2.1 排名函数

row_number():唯一

rank():跳过

dense_rank():不跳过

percent_rank()

2.2 分组内的第一或最后一名

first_value()

last_value()

lead():相对当前数据,向后查第几个

lag():相对当前数据,向前查几个

2.3 第一行到当前行求和

# Write your MySQL query statement below

# 有一队乘客在等着上巴士。然而,巴士有1000  千克 的重量限制,所以其中一部分乘客可能无法上巴士。

# 找出 最后一个 上巴士且不超过重量限制的乘客,并报告 person_name 。题目测试用例确保顺位第一的人可以上巴士且不会超重。

select person_name from

(select 
    turn Turn,
    person_id ID,
    person_name,
    weight Weight,
    sum(weight) over(order by turn ROWS between UNBOUNDED PRECEDING AND CURRENT ROW) Total_Weight 
    from Queue) temp
    where Total_Weight <= 1000
    order by Total_Weight desc
    limit 1
  1. 主动请求确认消息的用户
# 24 小时窗口内两次请求确认消息的用户的 ID
select distinct(user_id) user_id from
(select
    user_id,
    time_stamp,
    lag(time_stamp) over(partition by user_id order by time_stamp) time_lag
    from Confirmations) temp
    where time_lag is not null and
    timestampdiff(second, time_lag, time_stamp) <= 86400

2.3 分析函数

cume_dist():累积到现在,这个数据占了多少

nth_value()

ntile()

3. 应用

1532. 最近的三笔订单

  1. 先窗口函数
SELECT  
    order_id,
    order_date,
    customer_id,
    row_number() over(partition by customer_id order by order_date DESC) as rank_tag
FROM orders 
order_id order_date customer_id rank_tag
8 2020-08-03 1 1
1 2020-07-31 1 2
10 2020-07-15 1 3
5 2020-06-10 1 4
9 2020-08-07 2 1
6 2020-08-01 2 2
2 2020-07-30 2 3
7 2020-08-01 3 1
3 2020-07-31 3 2
4 2020-07-29 4 1
  1. 再左连接
select name customer_name, temp.customer_id customer_id, order_id, order_date from
    (select  
        order_id,
        order_date,
        customer_id,
        row_number() over(partition by customer_id order by order_date DESC) as rank_tag
    from orders) temp left join Customers on
    temp.customer_id = Customers.customer_id
  1. 再根据 rank_tag 筛选,并排序
select name customer_name, temp.customer_id customer_id, order_id, order_date from
    (select  
        order_id,
        order_date,
        customer_id,
        row_number() over(partition by customer_id order by order_date DESC) as rank_tag
    from orders) temp left join Customers on
    temp.customer_id = Customers.customer_id
    where rank_tag <= 3
    order by customer_name, customer_id, order_date desc

1077. 项目员工Ⅲ

select project_id, employee_id from
(select project_id, 
        employee_id,
        rank() over(partition by project_id order by experience_years desc) as `rank` from
(select project_id, p.employee_id, experience_years from Project p left join Employee e on
    p.employee_id = e.employee_id) temp1) temp2
    where `rank` = 1;

1285. 找到连续区间的开始和结束数字【技巧】

等差数列 - 等差数列,值才会相等
所以才会是一组
group by(log_id - rk)

log_id rk
1 1
2 2
3 3
7 4
8 5
10 6
# Write your MySQL query statement below
select min(log_id) start_id, max(log_id) end_id from
(select 
    log_id,
    rank() over(order by log_id) rk from Logs) temp
    group by (log_id - rk)

1709. 访问日期之间最大的空档期

# Write your MySQL query statement below

# now:2021-1-1
# 求:最大间隔

# 1. 特殊情况:最后一个,now - max
select user_id, max(date_diff) biggest_window from 
(select user_id, datediff('2021-1-1', max(visit_date)) date_diff from UserVisits 
    group by user_id 
union
select user_id, max(datediff(next, visit_date)) date_diff from
(select 
    user_id,
    visit_date,
    lead(visit_date) over(partition by user_id order by visit_date) next
    from UserVisits) temp
    where temp.next is not null
    group by user_id) temp2
    group by user_id

1412. 查找成绩处于中游的学生【中游学生】

select temp1.student_id, student_name from

(select distinct student_id from Exam 
    where student_id not in
(select student_id from
(select exam_id, 
    student_id, 
    score,
    rank() over(partition by exam_id order by score) rk1,  # min:rk1 = 1
    rank() over(partition by exam_id order by score desc) rk2   # max:rk2 = 1
    from Exam) temp 
    where rk1 = 1 or rk2 = 1)) temp1 left join Student on
    temp1.student_id = Student.student_id
    order by temp1.student_id

550. 游戏玩法分析 Ⅳ

select round(
(select count(distinct player_id) from
(select
    player_id,
    event_date,
    count(1) over(partition by player_id) size,
    lead(event_date) over(partition by player_id order by event_date) lead1,
    rank() over (partition by player_id order by event_date) rk
    from Activity) temp
    where rk = 1 and datediff(lead1, event_date) = 1)

/
    (select count(distinct player_id) from Activity), 2) fraction  
    

585. 2016年的投资

# Write your MySQL query statement below

#  2016 年 
# 1. 他在 2015 年的投保额 (tiv_2015) 至少跟一个其他投保人在 2015 年的投保额相同。

# 2. 他所在的城市必须与其他投保人都不同(也就是说 (lat, lon) 不能跟其他任何一个投保人完全相同)。

# select round(sum(tiv_2016), 2) tiv_2016 from Insurance 
#     where pid not in

select round(sum(tiv_2016), 2) tiv_2016 from Insurance where
    pid in
(select pid from
(select 
    pid,
    count(1) over (partition by lat, lon) ct1,  # 唯一坐标 ct1 = 1
    count(1) over (partition by tiv_2015) ct2   # 2015相同 ct2 > 1
    from Insurance) temp
    where ct1 = 1 and ct2 > 1)
  

1164. 指定日期的产品价格

# Write your MySQL query statement below

# 1. 在 2019-08-16 时全部产品的价格
# 2. 假设所有产品在修改前的价格都是 10

## 一、有符合条件的
## 二、没有符合条件
select product_id, price from
(select 
    product_id, 
    new_price price,
    change_date,
    rank() over(partition by product_id order by change_date desc) rk
    from Products
    where change_date <= '2019-08-16') temp
    where rk = 1
union
select product_id, 10 from Products 
    where product_id not in
(select product_id from Products 
    where change_date <= '2019-08-16') 
posted @ 2023-12-20 12:58  爱新觉罗LQ  阅读(19)  评论(0编辑  收藏  举报