Hive—简单窗口分析函数
hive 窗口分析函数 0: jdbc:hive2://localhost:10000> select * from t_access; +----------------+---------------------------------+-----------------------+--------------+--+ | t_access.ip | t_access.url | t_access.access_time | t_access.dt | +----------------+---------------------------------+-----------------------+--------------+--+ | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 20170804 | | 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 20170804 | | 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 20170804 | | 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 20170804 | | 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 20170804 | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 20170805 | | 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 20170805 | | 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 20170805 | | 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 20170805 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 20170805 | | 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 20170806 | | 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 20170806 | | 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 20170806 | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 | | 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 | | 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 20170806 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 20170806 | +----------------+---------------------------------+-----------------------+--------------+--+ ## LAG函数 select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, lag(access_time,1,0) over(partition by ip order by access_time)as last_access_time from t_access; +----------------+---------------------------------+----------------------+-----+----------------------+--+ | ip | url | access_time | rn | last_access_time | +----------------+---------------------------------+----------------------+-----+----------------------+--+ | 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | 0 | | 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | 0 | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 0 | | 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | 2017-08-04 15:30:20 | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | 2017-08-04 15:35:20 | | 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | 0 | | 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 0 | | 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | 2017-08-04 15:30:20 | | 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | 0 | | 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | 0 | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-05 16:30:20 | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | 2017-08-06 16:30:20 | | 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | 0 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | 0 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | 2017-08-05 15:40:20 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | 2017-08-06 15:40:20 | | 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | 0 | +----------------+---------------------------------+----------------------+-----+----------------------+--+ ## LEAD函数 select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, lead(access_time,1,0) over(partition by ip order by access_time)as last_access_time from t_access; +----------------+---------------------------------+----------------------+-----+----------------------+--+ | ip | url | access_time | rn | last_access_time | +----------------+---------------------------------+----------------------+-----+----------------------+--+ | 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | 0 | | 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | 0 | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 15:35:20 | | 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | 2017-08-05 15:30:20 | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | 0 | | 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | 0 | | 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | 2017-08-04 16:30:20 | | 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | 0 | | 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | 0 | | 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | 2017-08-06 16:30:20 | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | 2017-08-06 16:30:20 | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | 0 | | 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | 0 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | 2017-08-06 15:40:20 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | 2017-08-06 15:40:20 | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | 0 | | 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | 0 | | 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | 0 | +----------------+---------------------------------+----------------------+-----+----------------------+--+ ## FIRST_VALUE 函数 例:取每个用户访问的第一个页面 select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, first_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time from t_access; +----------------+---------------------------------+----------------------+-----+---------------------------------+--+ | ip | url | access_time | rn | last_access_time | +----------------+---------------------------------+----------------------+-----+---------------------------------+--+ | 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | http://www.xxx.ccc.aa/register | | 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/register | | 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu | | 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | http://www.xxx.ccc.aa/stu | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | http://www.xxx.ccc.aa/stu | | 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize | | 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu | | 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | http://www.xxx.ccc.aa/stu | | 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | http://www.xxx.ccc.aa/stu | | 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | http://www.xxx.ccc.aa/job | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | http://www.xxx.ccc.aa/job | | 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | http://www.xxx.ccc.aa/job | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | http://www.xxx.ccc.aa/job | | 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/pay | | 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | http://www.xxx.ccc.aa/teach | +----------------+---------------------------------+----------------------+-----+---------------------------------+--+ ## LAST_VALUE 函数 例:取每个用户访问的最后一个页面 select ip,url,access_time, row_number() over(partition by ip order by access_time) as rn, last_value(url) over(partition by ip order by access_time rows between unbounded preceding and unbounded following)as last_access_time from t_access; +----------------+---------------------------------+----------------------+-----+---------------------------------+--+ | ip | url | access_time | rn | last_access_time | +----------------+---------------------------------+----------------------+-----+---------------------------------+--+ | 192.168.111.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:35:20 | 1 | http://www.xxx.ccc.aa/register | | 192.168.133.3 | http://www.xxx.ccc.aa/register | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/register | | 192.168.33.25 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu | | 192.168.33.3 | http://www.xxx.ccc.aa/teach | 2017-08-04 15:35:20 | 2 | http://www.xxx.ccc.aa/stu | | 192.168.33.3 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 3 | http://www.xxx.ccc.aa/stu | | 192.168.33.36 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 1 | http://www.xxx.ccc.aa/excersize | | 192.168.33.4 | http://www.xxx.ccc.aa/stu | 2017-08-04 15:30:20 | 1 | http://www.xxx.ccc.aa/stu | | 192.168.33.4 | http://www.xxx.ccc.aa/job | 2017-08-04 16:30:20 | 2 | http://www.xxx.ccc.aa/stu | | 192.168.33.44 | http://www.xxx.ccc.aa/stu | 2017-08-05 15:30:20 | 1 | http://www.xxx.ccc.aa/stu | | 192.168.33.46 | http://www.xxx.ccc.aa/job | 2017-08-05 16:30:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 2 | http://www.xxx.ccc.aa/job | | 192.168.33.46 | http://www.xxx.ccc.aa/excersize | 2017-08-06 16:30:20 | 3 | http://www.xxx.ccc.aa/job | | 192.168.33.5 | http://www.xxx.ccc.aa/job | 2017-08-04 15:40:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-05 15:40:20 | 1 | http://www.xxx.ccc.aa/job | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 2 | http://www.xxx.ccc.aa/job | | 192.168.33.55 | http://www.xxx.ccc.aa/job | 2017-08-06 15:40:20 | 3 | http://www.xxx.ccc.aa/job | | 192.168.34.44 | http://www.xxx.ccc.aa/pay | 2017-08-06 15:30:20 | 1 | http://www.xxx.ccc.aa/pay | | 192.168.44.3 | http://www.xxx.ccc.aa/teach | 2017-08-05 15:35:20 | 1 | http://www.xxx.ccc.aa/teach | +----------------+---------------------------------+----------------------+-----+---------------------------------+--+ /* 累计报表--分析函数实现版 */ -- sum() over() 函数 select id ,month ,sum(amount) over(partition by id order by month rows between unbounded preceding and current row) from (select id,month, sum(fee) as amount from t_test group by id,month) tmp;
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 从 HTTP 原因短语缺失研究 HTTP/2 和 HTTP/3 的设计差异
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· TypeScript + Deepseek 打造卜卦网站:技术与玄学的结合
· Manus的开源复刻OpenManus初探
· AI 智能体引爆开源社区「GitHub 热点速览」
· 三行代码完成国际化适配,妙~啊~
· .NET Core 中如何实现缓存的预热?