OVER窗口函数
OVER简介:
OVER窗口(OVER Window)是传统数据库的标准开窗,不同于Group By Window,OVER窗口中每1个元素都对应1个窗口,每个元素触发一次计算。OVER窗口可以按照实际元素的行或实际的元素值(时间戳值,支持事件时间(eventtime)和处理时间(proctime))确定窗口,因此流数据元素可能分布在多个窗口中。语法:
select sum(amount) OVER (definition) AS amount,
avg(age) OVER (definition) AS age from table;
函数类型:
1、ROWS OVER Window
释义:每1行元素都被视为新的计算行,即每1行都是一个新的窗口;分为Unbounded(无界流)和Bounded(有界流)。
语法:
SELECT
agg1(col1) OVER(
[PARTITION BY (value_expression1,..., value_expressionN)]
ORDER BY timeCol
ROWS BETWEEN (UNBOUNDED | rowCount) PRECEDING AND CURRENT ROW) AS colName, ...
FROM Tab1;
SELECT
agg1(col1) OVER(
[PARTITION BY (value_expression1,..., value_expressionN)]
ORDER BY timeCol
ROWS BETWEEN (UNBOUNDED | rowCount) PRECEDING AND CURRENT ROW) AS colName, ...
FROM Tab1;
例子:
统计当前商品上架之前同类的3个商品的最高价格(有界流)
CREATE TEMPORARY TABLE tmall_item(
itemID VARCHAR,
itemType VARCHAR,
eventtime varchar,
onSellTime AS TO_TIMESTAMP(eventtime),
price DOUBLE, WATERMARK FOR onSellTime AS onSellTime - INTERVAL '0' SECOND --为Rowtime定义Watermark。
) WITH ('connector' = 'sls', ...);
itemID VARCHAR,
itemType VARCHAR,
eventtime varchar,
onSellTime AS TO_TIMESTAMP(eventtime),
price DOUBLE, WATERMARK FOR onSellTime AS onSellTime - INTERVAL '0' SECOND --为Rowtime定义Watermark。
) WITH ('connector' = 'sls', ...);
SELECT itemID, itemType, onSellTime, price,
MAX(price) OVER ( PARTITION BY itemType ORDER BY onSellTime ROWS BETWEEN 2 preceding AND CURRENT ROW)
AS maxPrice
FROM tmall_item;
MAX(price) OVER ( PARTITION BY itemType ORDER BY onSellTime ROWS BETWEEN 2 preceding AND CURRENT ROW)
AS maxPrice
FROM tmall_item;
2、RANGE OVER Window
释义:具有相同时间值的所有元素行视为同一计算行,即具有相同时间值的所有行都是同一个窗口。
语法:
SELECT
agg1(col1) OVER(
[PARTITION BY (value_expression1,..., value_expressionN)]
ORDER BY timeCol
RANGE
BETWEEN (UNBOUNDED | timeInterval) PRECEDING AND CURRENT ROW) AS colName,...
FROM Tab1;
agg1(col1) OVER(
[PARTITION BY (value_expression1,..., value_expressionN)]
ORDER BY timeCol
RANGE
BETWEEN (UNBOUNDED | timeInterval) PRECEDING AND CURRENT ROW) AS colName,...
FROM Tab1;
例子:
统计当前商品上架时间早2分钟的同类商品中的最高价格
CREATE TEMPORARY TABLE tmall_item(
itemID VARCHAR,
itemType VARCHAR,
eventtime varchar,
onSellTime AS TO_TIMESTAMP(eventtime),
price DOUBLE,
WATERMARK FOR onSellTime AS onSellTime - INTERVAL '0' SECOND --为Rowtime定义Watermark。
) WITH (
'connector' = 'sls',
...
);
SELECT itemID, itemType, onSellTime, price,
MAX(price) OVER (
PARTITION BY itemType
ORDER BY onSellTime
RANGE BETWEEN INTERVAL '2' MINUTE preceding AND CURRENT ROW) AS maxPrice
FROM tmall_item;