MySQL8.0新特性—窗口函数

一、概述

窗口函数(window functions)是一种对结果集进行计算,并将计算结果合并到结果集上返回多行的一类函数。MySQL8开始支持窗口函数,包括rank()lag()ntile()等非聚合窗口函数。以及部分聚合函数现在可以用作窗口函数,例如:sum()avg()

使用窗口函数需在函数调用后使用over子句,over子句定义了窗口函数如何处理查询行即窗口规范。窗口规范可以在表达式中直接定义,可以使用在其他子句中定义的窗口规范的名称即命名窗口。

注意:不能使用窗口函数在update或delete语句中更新行,可在查询或子查询中用来选择行。

over子句定义语法:

OVER ([window_name] [partition_clause] [order_clause] [frame_clause])
  • window_name:查询中其他windows子句定义的命名窗口。如只含window_name则完全引用。
    如还包含partitioning, ordering,framing则会修改命名窗口(注意:OVER子句只能向命名窗口添加属性,而不能修改它们。);
  • partition_clause:由PARTITION BY子句指定如何分组;默认所有结果集一个分组;MySQL支持PARTITION BY跟列名或表达式;
  • order_clause:由ORDER BY子句指定分组中的行如何排序;
  • frame_clause:滑动窗口中使用。定义frame,即分区中的子集;

命名窗口定义语法(位于from子句之后,having和order by子句之间)

WINDOW window_name AS (window_spec)[, window_name AS (window_spec)]

window_spec: [window_name] [partition_clause] [order_clause] [frame_clause]

二、案例

mysql> select class_id, student, subject, score from subject_score;  
+----------+---------+---------+-------+
| class_id | student | subject | score |
+----------+---------+---------+-------+
|        8 | lee     | english |    60 |
|        8 | lee     | math    |    50 |
|        8 | lee     | chinese |    80 |
|        8 | xiaole  | english |    90 |
|        8 | xiaole  | math    |   100 |
|        8 | xiaole  | chinese |    90 |
|        8 | xiaoyun | english |    95 |
|        8 | xiaoyun | math    |    95 |
|        8 | xiaoyun | chinese |    70 |
+----------+---------+---------+-------+
9 rows in set (0.00 sec)

2.1 不指定分组

mysql> select class_id,
      -> student, subject, score,
      -> row_number() over() row_num
      -> from subject_score;
+----------+---------+---------+-------+---------+
| class_id | student | subject | score | row_num |
+----------+---------+---------+-------+---------+
|        8 | lee     | english |    60 |       1 |
|        8 | lee     | math    |    50 |       2 |
|        8 | lee     | chinese |    80 |       3 |
|        8 | xiaole  | english |    90 |       4 |
|        8 | xiaole  | math    |   100 |       5 |
|        8 | xiaole  | chinese |    90 |       6 |
|        8 | xiaoyun | english |    95 |       7 |
|        8 | xiaoyun | math    |    95 |       8 |
|        8 | xiaoyun | chinese |    70 |       9 |
+----------+---------+---------+-------+---------+
9 rows in set (0.00 sec)

2.2 指定分组

mysql> select 
      -> class_id, student, subject, score,
      -> row_number() over( partition by subject ) as row_num_par_sub
      -> from subject_score;
+----------+---------+---------+-------+-----------------+
| class_id | student | subject | score | row_num_par_sub |
+----------+---------+---------+-------+-----------------+
|        8 | lee     | chinese |    80 |               1 |
|        8 | xiaole  | chinese |    90 |               2 |
|        8 | xiaoyun | chinese |    70 |               3 |
|        8 | lee     | english |    60 |               1 |
|        8 | xiaole  | english |    90 |               2 |
|        8 | xiaoyun | english |    95 |               3 |
|        8 | lee     | math    |    50 |               1 |
|        8 | xiaole  | math    |   100 |               2 |
|        8 | xiaoyun | math    |    95 |               3 |
+----------+---------+---------+-------+-----------------+
9 rows in set (0.00 sec)

2.3 指定分组和排序

mysql> select class_id, student, subject, score,
      -> row_number()over(partition by subject order by subject, score desc) as subject_order 
      -> from subject_score;
+----------+---------+---------+-------+---------------+
| class_id | student | subject | score | subject_order |
+----------+---------+---------+-------+---------------+
|        8 | xiaole  | chinese |    90 |             1 |
|        8 | lee     | chinese |    80 |             2 |
|        8 | xiaoyun | chinese |    70 |             3 |
|        8 | xiaoyun | english |    95 |             1 |
|        8 | xiaole  | english |    90 |             2 |
|        8 | lee     | english |    60 |             3 |
|        8 | xiaole  | math    |   100 |             1 |
|        8 | xiaoyun | math    |    95 |             2 |
|        8 | lee     | math    |    50 |             3 |
+----------+---------+---------+-------+---------------+
9 rows in set (0.00 sec)

2.4 使用命名窗口

mysql> select class_id,
      -> student, subject, score,
      -> row_number() over w as row_num_w
      -> from subject_score
      -> window w as(partition by subject);              
+----------+---------+---------+-------+-----------+
| class_id | student | subject | score | row_num_w |
+----------+---------+---------+-------+-----------+
|        8 | lee     | chinese |    80 |         1 |
|        8 | xiaole  | chinese |    90 |         2 |
|        8 | xiaoyun | chinese |    70 |         3 |
|        8 | lee     | english |    60 |         1 |
|        8 | xiaole  | english |    90 |         2 |
|        8 | xiaoyun | english |    95 |         3 |
|        8 | lee     | math    |    50 |         1 |
|        8 | xiaole  | math    |   100 |         2 |
|        8 | xiaoyun | math    |    95 |         3 |
+----------+---------+---------+-------+-----------+
9 rows in set (0.00 sec)

2.5 使用并改写命名窗口

mysql> select class_id,
      -> student, subject, score,
      -> row_number() over(w order by subject, score desc) as row_num_ss
      -> from subject_score
      -> window w as(partition by subject);   
+----------+---------+---------+-------+------------+
| class_id | student | subject | score | row_num_ss |
+----------+---------+---------+-------+------------+
|        8 | xiaole  | chinese |    90 |          1 |
|        8 | lee     | chinese |    80 |          2 |
|        8 | xiaoyun | chinese |    70 |          3 |
|        8 | xiaoyun | english |    95 |          1 |
|        8 | xiaole  | english |    90 |          2 |
|        8 | lee     | english |    60 |          3 |
|        8 | xiaole  | math    |   100 |          1 |
|        8 | xiaoyun | math    |    95 |          2 |
|        8 | lee     | math    |    50 |          3 |
+----------+---------+---------+-------+------------+

三、部分聚合函数支持over子句用作窗口函数

3.1 统计

avg() 平均数

mysql> select class_id,
      -> student, subject, score,
      -> avg(score) over(partition by subject) as subject_avg 
      -> from subject_score;
+----------+---------+---------+-------+-------------+
| class_id | student | subject | score | subject_avg |
+----------+---------+---------+-------+-------------+
|        8 | lee     | chinese |    80 |     80.0000 |
|        8 | xiaole  | chinese |    90 |     80.0000 |
|        8 | xiaoyun | chinese |    70 |     80.0000 |
|        8 | lee     | english |    60 |     81.6667 |
|        8 | xiaole  | english |    90 |     81.6667 |
|        8 | xiaoyun | english |    95 |     81.6667 |
|        8 | lee     | math    |    50 |     81.6667 |
|        8 | xiaole  | math    |   100 |     81.6667 |
|        8 | xiaoyun | math    |    95 |     81.6667 |
+----------+---------+---------+-------+-------------+
9 rows in set (0.00 sec)

count()

mysql> select class_id,
      -> student, subject, score,
      -> count(score) over(partition by subject) as subject_count
      -> from subject_score;      
+----------+---------+---------+-------+---------------+
| class_id | student | subject | score | subject_count |
+----------+---------+---------+-------+---------------+
|        8 | lee     | chinese |    80 |             3 |
|        8 | xiaole  | chinese |    90 |             3 |
|        8 | xiaoyun | chinese |    70 |             3 |
|        8 | lee     | english |    60 |             3 |
|        8 | xiaole  | english |    90 |             3 |
|        8 | xiaoyun | english |    95 |             3 |
|        8 | lee     | math    |    50 |             3 |
|        8 | xiaole  | math    |   100 |             3 |
|        8 | xiaoyun | math    |    95 |             3 |
+----------+---------+---------+-------+---------------+
9 rows in set (0.00 sec)

max()、min()

mysql> select class_id,
      -> student, subject, score,
      -> max(score) over(partition by subject) as score_max,
      -> min(score) over(partition by subject) as score_min
      -> from subject_score;       
+----------+---------+---------+-------+-----------+-----------+
| class_id | student | subject | score | score_max | score_min |
+----------+---------+---------+-------+-----------+-----------+
|        8 | lee     | chinese |    80 |        90 |        70 |
|        8 | xiaole  | chinese |    90 |        90 |        70 |
|        8 | xiaoyun | chinese |    70 |        90 |        70 |
|        8 | lee     | english |    60 |        95 |        60 |
|        8 | xiaole  | english |    90 |        95 |        60 |
|        8 | xiaoyun | english |    95 |        95 |        60 |
|        8 | lee     | math    |    50 |       100 |        50 |
|        8 | xiaole  | math    |   100 |       100 |        50 |
|        8 | xiaoyun | math    |    95 |       100 |        50 |
+----------+---------+---------+-------+-----------+-----------+
9 rows in set (0.00 sec)

sum()

mysql> select class_id,
      -> student, subject, score,
      -> sum(score) over(partition by student) as student_sum
      -> from subject_score;              
+----------+---------+---------+-------+-------------+
| class_id | student | subject | score | student_sum |
+----------+---------+---------+-------+-------------+
|        8 | lee     | english |    60 |         190 |
|        8 | lee     | math    |    50 |         190 |
|        8 | lee     | chinese |    80 |         190 |
|        8 | xiaole  | english |    90 |         280 |
|        8 | xiaole  | math    |   100 |         280 |
|        8 | xiaole  | chinese |    90 |         280 |
|        8 | xiaoyun | english |    95 |         260 |
|        8 | xiaoyun | math    |    95 |         260 |
|        8 | xiaoyun | chinese |    70 |         260 |
+----------+---------+---------+-------+-------------+
9 rows in set (0.00 sec)

3.2 BIT

BIT_AND() 按位与
BIT_OR() 按位或
BIT_XOR() 按位异或

3.3 JSON

json_arrayagg: 将结果集聚合为单个JSON数组,其元素由行组成。

mysql> select class_id,
      -> student, subject, score,
      -> json_arrayagg(score) over(partition by subject) as score_json
      -> from subject_score;               
+----------+---------+---------+-------+---------------+
| class_id | student | subject | score | score_json    |
+----------+---------+---------+-------+---------------+
|        8 | lee     | chinese |    80 | [80, 90, 70]  |
|        8 | xiaole  | chinese |    90 | [80, 90, 70]  |
|        8 | xiaoyun | chinese |    70 | [80, 90, 70]  |
|        8 | lee     | english |    60 | [60, 90, 95]  |
|        8 | xiaole  | english |    90 | [60, 90, 95]  |
|        8 | xiaoyun | english |    95 | [60, 90, 95]  |
|        8 | lee     | math    |    50 | [50, 100, 95] |
|        8 | xiaole  | math    |   100 | [50, 100, 95] |
|        8 | xiaoyun | math    |    95 | [50, 100, 95] |
+----------+---------+---------+-------+---------------+
9 rows in set (0.01 sec)

json_objectagg: 将两个列名或表达式作为参数,第一个用作键,第二个用作值,并返回包含键值对的JSON对象。

mysql> select class_id,
      -> student, subject, score,
      -> json_objectagg(subject, score) over(partition by student) as score_json
      -> from subject_score;
+----------+---------+---------+-------+---------------------------------------------+
| class_id | student | subject | score | score_json                                  |
+----------+---------+---------+-------+---------------------------------------------+
|        8 | lee     | english |    60 | {"math": 50, "chinese": 80, "english": 60}  |
|        8 | lee     | math    |    50 | {"math": 50, "chinese": 80, "english": 60}  |
|        8 | lee     | chinese |    80 | {"math": 50, "chinese": 80, "english": 60}  |
|        8 | xiaole  | english |    90 | {"math": 100, "chinese": 90, "english": 90} |
|        8 | xiaole  | math    |   100 | {"math": 100, "chinese": 90, "english": 90} |
|        8 | xiaole  | chinese |    90 | {"math": 100, "chinese": 90, "english": 90} |
|        8 | xiaoyun | english |    95 | {"math": 95, "chinese": 70, "english": 95}  |
|        8 | xiaoyun | math    |    95 | {"math": 95, "chinese": 70, "english": 95}  |
|        8 | xiaoyun | chinese |    70 | {"math": 95, "chinese": 70, "english": 95}  |
+----------+---------+---------+-------+---------------------------------------------+
9 rows in set (0.00 sec)

3.4 STD和VAR

stddev_pop(), stddev(), std(): 总体标准差
stddev_samp(): 样本标准差

mysql> select class_id,
      -> student, subject, score,
      -> std(score) over(partition by subject) as score_std,
      -> stddev_samp(score) over(partition by subject) as score_samp
      -> from subject_score;            
+----------+---------+---------+-------+--------------------+-------------------+
| class_id | student | subject | score | score_std          | score_samp        |
+----------+---------+---------+-------+--------------------+-------------------+
|        8 | lee     | chinese |    80 |   8.16496580927726 |                10 |
|        8 | xiaole  | chinese |    90 |   8.16496580927726 |                10 |
|        8 | xiaoyun | chinese |    70 |   8.16496580927726 |                10 |
|        8 | lee     | english |    60 |  15.45603082582617 | 18.92969448600091 |
|        8 | xiaole  | english |    90 |  15.45603082582617 | 18.92969448600091 |
|        8 | xiaoyun | english |    95 |  15.45603082582617 | 18.92969448600091 |
|        8 | lee     | math    |    50 | 22.484562605386735 | 27.53785273643051 |
|        8 | xiaole  | math    |   100 | 22.484562605386735 | 27.53785273643051 |
|        8 | xiaoyun | math    |    95 | 22.484562605386735 | 27.53785273643051 |
+----------+---------+---------+-------+--------------------+-------------------+
9 rows in set (0.00 sec)

var_pop(), variance(): 总体标准方差
var_samp(): 样本方差

mysql> select class_id,
      -> student, subject, score,
      -> var_pop(score) over(partition by subject) as score_pop,
      -> var_samp(score) over(partition by subject) as score_samp
      -> from subject_score;            
+----------+---------+---------+-------+--------------------+--------------------+
| class_id | student | subject | score | score_pop          | score_samp         |
+----------+---------+---------+-------+--------------------+--------------------+
|        8 | lee     | chinese |    80 |  66.66666666666667 |                100 |
|        8 | xiaole  | chinese |    90 |  66.66666666666667 |                100 |
|        8 | xiaoyun | chinese |    70 |  66.66666666666667 |                100 |
|        8 | lee     | english |    60 | 238.88888888888883 | 358.33333333333326 |
|        8 | xiaole  | english |    90 | 238.88888888888883 | 358.33333333333326 |
|        8 | xiaoyun | english |    95 | 238.88888888888883 | 358.33333333333326 |
|        8 | lee     | math    |    50 |  505.5555555555555 |  758.3333333333333 |
|        8 | xiaole  | math    |   100 |  505.5555555555555 |  758.3333333333333 |
|        8 | xiaoyun | math    |    95 |  505.5555555555555 |  758.3333333333333 |
+----------+---------+---------+-------+--------------------+--------------------+
9 rows in set (0.00 sec)

四、非聚合窗口函数

注意:虽然某些窗口函数允许使用null_treatment子句指定计算结果时如何处理NULL值,但MySQL实际使用RESPECT NULLS(默认值)处理NULL值,即在计算结果时会考虑NULL值。

Name Description
cume_dist() 累计分布,分区值小于等于当前值的百分比。需和order by一起使用
dense_rank() 当前行在其分区内的排名,不会产生不连续的秩数。 对等点被视为关系并获得相同的等级。
需和order by一起使用
first_value() value of argument from first row of window frame
lag() 分区内在当前行前n 行的值。 如果没有这样的行,则返回值为默认值。通常用于计算行之间的差异
last_value() value of argument from last row of window frame
lead() 分区内在当前行后n行的值。如果没有这样的行,则返回值为默认值。通常用于计算行之间的差异
nth_value() value of argument from n-th row of window frame
ntile() 将一个分区划分为n个组(桶),为分区中的每一行分配其桶号,并返回其分区内当前行的桶号。
需和order by一起使用
percent_rank() 返回小于当前行中的值的分区值的百分比,不包括最高值。
rank() 返回当前行在其分区内的排名,有间隙。需和order by一起使用。注意和dense_rank的区别
row_number() 返回其分区内当前行的编号。 行数范围从1到分区行数。order by影响行编号的顺序。
如果没有order by,行编号是不确定的。

cume_dist(): 累计分布,分区值小于等于当前值的百分比。需和order by 一起使用

mysql> select student,
      -> subject, score,
      -> cume_dist() over w
      -> from subject_score
      -> window w as (partition by subject order by score desc);
+---------+---------+-------+--------------------+
| student | subject | score | CUME_DIST() over w |
+---------+---------+-------+--------------------+
| xiaole  | chinese |    85 | 0.3333333333333333 |
| lee     | chinese |    80 | 0.6666666666666666 |
| xiaoyun | chinese |    70 |                  1 |
| xiaoyun | english |    95 | 0.3333333333333333 |
| xiaole  | english |    90 | 0.6666666666666666 |
| lee     | english |    60 |                  1 |
| xiaole  | math    |   100 | 0.3333333333333333 |
| xiaoyun | math    |    75 | 0.6666666666666666 |
| lee     | math    |    50 |                  1 |
+---------+---------+-------+--------------------+
9 rows in set (0.00 sec)

percent_rank(): 返回小于当前行中的值的分区值的百分比,不包括最高值。

mysql> select student,
      -> subject, score,
      -> percent_rank() over w
      -> from subject_score
      -> window w as (partition by subject order by score desc);
+---------+---------+-------+-----------------------+
| student | subject | score | percent_rank() over w |
+---------+---------+-------+-----------------------+
| xiaole  | chinese |    85 |                     0 |
| lee     | chinese |    80 |                   0.5 |
| xiaoyun | chinese |    70 |                     1 |
| xiaoyun | english |    95 |                     0 |
| xiaole  | english |    90 |                   0.5 |
| lee     | english |    60 |                     1 |
| xiaole  | math    |   100 |                     0 |
| xiaoyun | math    |   100 |                     0 |
| lee     | math    |    50 |                     1 |
+---------+---------+-------+-----------------------+
9 rows in set (0.00 sec)

dense_rank(): 当前行在其分区内的排名,不会产生不连续的秩数。对等点被视为关系并获得相同的等级。需和order by一起使用

mysql> select student,
      -> subject, score,
      -> dense_rank() over w
      -> from subject_score
      -> window w as (partition by subject order by score desc);           
+---------+---------+-------+---------------------+
| student | subject | score | dense_rank() over w |
+---------+---------+-------+---------------------+
| xiaole  | chinese |    85 |                   1 |
| lee     | chinese |    80 |                   2 |
| xiaoyun | chinese |    70 |                   3 |
| xiaoyun | english |    95 |                   1 |
| xiaole  | english |    90 |                   2 |
| lee     | english |    60 |                   3 |
| xiaole  | math    |   100 |                   1 |
| xiaoyun | math    |   100 |                   1 |
| lee     | math    |    50 |                   2 |
+---------+---------+-------+---------------------+
9 rows in set (0.00 sec)

rank(): 返回当前行在其分区内的排名,有间隙。需和order by一起使用。注意和DENSE_RANK的区别

mysql> select student,
      -> subject, score,
      -> rank() over w
      -> from subject_score
      -> window w as (partition by subject order by score desc);      
+---------+---------+-------+---------------+
| student | subject | score | rank() over w |
+---------+---------+-------+---------------+
| xiaole  | chinese |    85 |             1 |
| lee     | chinese |    80 |             2 |
| xiaoyun | chinese |    70 |             3 |
| xiaoyun | english |    95 |             1 |
| xiaole  | english |    90 |             2 |
| lee     | english |    60 |             3 |
| xiaole  | math    |   100 |             1 |
| xiaoyun | math    |   100 |             1 |
| lee     | math    |    50 |             3 |
+---------+---------+-------+---------------+
9 rows in set (0.00 sec)

lag(): 分区内在当前行前N 行的值。 如果没有这样的行,则返回值为默认值。通常用于计算行之间的差异

mysql> select student,
      -> subject, score,
      -> score - lag(score, 1, score) over w
      -> from subject_score
      -> window w as ( order by score ); 
+---------+---------+-------+-----------------------------------+
| student | subject | score | score - lag(score, 1, score) over w |
+---------+---------+-------+-----------------------------------+
| lee     | math    |    50 |                                 0 |
| lee     | english |    60 |                                10 |
| xiaoyun | chinese |    70 |                                10 |
| lee     | chinese |    80 |                                10 |
| xiaole  | chinese |    85 |                                 5 |
| xiaole  | english |    90 |                                 5 |
| xiaoyun | english |    95 |                                 5 |
| xiaole  | math    |   100 |                                 5 |
| xiaoyun | math    |   100 |                                 0 |
+---------+---------+-------+-----------------------------------+
9 rows in set (0.00 sec)

lead(): 分区内在当前行后N 行的值。如果没有这样的行,则返回值为默认值。通常用于计算行之间的差异

mysql> select student,
      -> subject, score,
      -> lead(score, 1, score) over w -score
      -> from subject_score
      -> window w as ( order by score ); 
+---------+---------+-------+-----------------------------------+
| student | subject | score | lead(score, 1, score) over w -score |
+---------+---------+-------+-----------------------------------+
| lee     | math    |    50 |                                10 |
| lee     | english |    60 |                                10 |
| xiaoyun | chinese |    70 |                                10 |
| lee     | chinese |    80 |                                 5 |
| xiaole  | chinese |    85 |                                 5 |
| xiaole  | english |    90 |                                 5 |
| xiaoyun | english |    95 |                                 5 |
| xiaole  | math    |   100 |                                 0 |
| xiaoyun | math    |   100 |                                 0 |
+---------+---------+-------+-----------------------------------+
9 rows in set (0.00 sec)

first_value(): 取分区或结果集的第一行。
last_value():取分区或结果集的最后一行。
nth_value():取分区或结果集中的第N行获取值。如果第N行不存在,则函数返回NULL。N必须是正整数,例如1,2和3。

mysql> select student,
      -> subject, score,
      -> first_value(score) over w as sorce_first,
      -> last_value(score) over w as sorce_last,
      -> nth_value(score, 2) over w as sorce_nth
      -> from subject_score
      -> window w as (order by score ROWS BETWEEN 1 PRECEDING AND 1 FOLLOWING);    
+---------+---------+-------+-------------+------------+-----------+
| student | subject | score | sorce_first | sorce_last | sorce_nth |
+---------+---------+-------+-------------+------------+-----------+
| lee     | math    |    50 |          50 |         60 |        60 |
| lee     | english |    60 |          50 |         70 |        60 |
| xiaoyun | chinese |    70 |          60 |         80 |        70 |
| lee     | chinese |    80 |          70 |         85 |        80 |
| xiaole  | chinese |    85 |          80 |         90 |        85 |
| xiaole  | english |    90 |          85 |         95 |        90 |
| xiaoyun | english |    95 |          90 |        100 |        95 |
| xiaole  | math    |   100 |          95 |        100 |       100 |
| xiaoyun | math    |   100 |         100 |        100 |       100 |
+---------+---------+-------+-------------+------------+-----------+
9 rows in set (0.00 sec)

row_number(): 返回其分区内当前行的编号。行数范围从1到分区行数。

mysql> select student,
      -> subject, score,
      -> row_number() over w
      -> from subject_score
      -> window w as (partition by subject order by score desc);
+---------+---------+-------+---------------------+
| student | subject | score | ROW_NUMBER() over w |
+---------+---------+-------+---------------------+
| xiaole  | chinese |    85 |                   1 |
| lee     | chinese |    80 |                   2 |
| xiaoyun | chinese |    70 |                   3 |
| xiaoyun | english |    95 |                   1 |
| xiaole  | english |    90 |                   2 |
| lee     | english |    60 |                   3 |
| xiaole  | math    |   100 |                   1 |
| xiaoyun | math    |   100 |                   2 |
| lee     | math    |    50 |                   3 |
+---------+---------+-------+---------------------+
9 rows in set (0.00 sec)

ntile(): 将一个分区划分为N个组(桶),为分区中的每一行分配其桶号,并返回其分区内当前行的桶号。需和order by一起使用

mysql> select student,
      -> subject, score,
      -> ntile(4) over w
      -> from subject_score
      -> window w as (order by score desc);
+---------+---------+-------+-----------------+
| student | subject | score | NTILE(4) over w |
+---------+---------+-------+-----------------+
| xiaole  | math    |   100 |               1 |
| xiaoyun | math    |   100 |               1 |
| xiaoyun | english |    95 |               1 |
| xiaole  | english |    90 |               2 |
| xiaole  | chinese |    85 |               2 |
| lee     | chinese |    80 |               3 |
| xiaoyun | chinese |    70 |               3 |
| lee     | english |    60 |               4 |
| lee     | math    |    50 |               4 |
+---------+---------+-------+-----------------+
9 rows in set (0.00 sec)

五、滑动窗口

可以使用frame子句定义frameframe_units定义窗口范围基于ROW还是基于RANGEframe_extent定义frame的起点和终点。

#frame子句

frame_clause: frame_units frame_extent

frame_units: {ROWS | RANGE}

# ROWS:fame由开始和结束行位置定义。 偏移量是行号与当前行号的差异。
# RANGE:fame由值范围内的行定义。 偏移量是行值与当前行值的差异。

frame_extent: {frame_start | frame_between}

frame_between:BETWEEN frame_start AND frame_end

# 使用frame_between时frame_start不能比frame_end大

frame_start, frame_end: {

# frame_start 和 frame_end可以使用以下范围表达式
    CURRENT ROW 
  | UNBOUNDED PRECEDING
  | UNBOUNDED FOLLOWING
  | expr PRECEDING
  | expr FOLLOWING
}

# CURRENT ROW : 基于ROW 边界是当前行,基于Range 边界是当前行的对应值。一般和其他范围表达式一起使用
# UNBOUNDED PRECEDING : 边界是分区第一行
# UNBOUNDED FOLLOWING : 边界是分区最后一行
# expr PRECEDING : 基于ROW 边界是当前行之前的expr行。 基于Range 边界是当前行值减去expr的行。
# expr FOLLOWING : 基于ROW 边界是当前行之后的expr行。 基于Range 边界是当前行值加上expr的行。

expr在prepare statement中可以用?占位,可以是一个非负的数据 ,或者是一个时间间隔 INTERVAL val unit。  
例如:
5 PRECEDING
interval 5 year PRECEDING
5 FOLLOWING  
interval 5 year  FOLLOWING
基于时间和数字表达式的Range 需要 时间和数字表达式上的order by

用作窗口函数的聚合函数和以下非聚合窗口函数支持滑动窗口

first_value()
last_value()
nth_value()

mysql> select student,
      -> subject, score,
      -> first_value(score) over w score_first,
      -> last_value(score) over w score_last,
      -> nth_value(score, 2) over w score_nth
      -> from subject_score
      -> window w as (partition by subject order by score rows UNBOUNDED PRECEDING);
+---------+---------+-------+-------------+------------+-----------+
| student | subject | score | score_first | score_last | score_nth |
+---------+---------+-------+-------------+------------+-----------+
| xiaoyun | chinese |    70 |          70 |         70 |      NULL |
| lee     | chinese |    80 |          70 |         80 |        80 |
| xiaole  | chinese |    85 |          70 |         85 |        80 |
| lee     | english |    60 |          60 |         60 |      NULL |
| xiaole  | english |    90 |          60 |         90 |        90 |
| xiaoyun | english |    95 |          60 |         95 |        90 |
| lee     | math    |    50 |          50 |         50 |      NULL |
| xiaole  | math    |   100 |          50 |        100 |       100 |
| xiaoyun | math    |   100 |          50 |        100 |       100 |
+---------+---------+-------+-------------+------------+-----------+
9 rows in set (0.00 sec)

创建一个简单的收入表,收入数量和时间。通过滑动窗口查询每一个时间点前2小时的总收入,平均收入,最高、最低单次收入。

mysql> select money,
      -> insert_time,
      -> sum(money) over w as money_sum,
      -> avg(money) over w as money_avg,
      -> max(money) over w as money_max,
      -> min(money) over w as money_min
      -> from t_test
      -> window w as (order by insert_time range between interval 2 hour preceding and current row );
+-------+---------------------+-----------+-----------+-----------+-----------+
| money | insert_time         | money_sum | money_avg | money_max | money_min |
+-------+---------------------+-----------+-----------+-----------+-----------+
|     1 | 2021-08-28 17:35:19 |         1 |    1.0000 |         1 |         1 |
|     2 | 2021-08-28 17:45:19 |         3 |    1.5000 |         2 |         1 |
|     3 | 2021-08-28 17:55:19 |         6 |    2.0000 |         3 |         1 |
|     4 | 2021-08-28 18:05:19 |        10 |    2.5000 |         4 |         1 |
|     6 | 2021-08-28 18:15:19 |        16 |    3.2000 |         6 |         1 |
|     7 | 2021-08-28 18:25:19 |        23 |    3.8333 |         7 |         1 |
|     8 | 2021-08-28 18:35:19 |        31 |    4.4286 |         8 |         1 |
|     9 | 2021-08-28 18:45:19 |        40 |    5.0000 |         9 |         1 |
|    11 | 2021-08-28 18:55:19 |        51 |    5.6667 |        11 |         1 |
|    12 | 2021-08-28 19:05:19 |        63 |    6.3000 |        12 |         1 |
|    13 | 2021-08-28 19:15:19 |        76 |    6.9091 |        13 |         1 |
|    14 | 2021-08-28 19:25:19 |        90 |    7.5000 |        14 |         1 |
|    15 | 2021-08-28 19:35:19 |       105 |    8.0769 |        15 |         1 |
|    17 | 2021-08-28 19:45:19 |       121 |    9.3077 |        17 |         2 |
|    18 | 2021-08-28 19:55:19 |       137 |   10.5385 |        18 |         3 |
|    19 | 2021-08-28 20:05:19 |       153 |   11.7692 |        19 |         4 |
|    20 | 2021-08-28 20:15:19 |       169 |   13.0000 |        20 |         6 |
|    21 | 2021-08-28 20:25:19 |       184 |   14.1538 |        21 |         7 |
|    22 | 2021-08-28 20:35:19 |       199 |   15.3077 |        22 |         8 |
|   212 | 2021-08-31 15:35:19 |      2670 |  205.3846 |       212 |       199 |
|   999 | 2021-08-31 15:45:19 |      3470 |  266.9231 |       999 |       200 |
+-------+---------------------+-----------+-----------+-----------+-----------+
422 rows in set (0.09 sec)
posted @ 2022-11-01 18:07  夏尔_717  阅读(352)  评论(0编辑  收藏  举报