Hive SQL: sorting inside collect_list
Suppose I have a Hive table that looks like this:

ID    event     order_num
-------------------------
A     red       2
A     blue      1
A     yellow    3
B     yellow    2
B     green     1
...
I am trying to use collect_list to build a list of events for each ID, with something like the following:

    SELECT
      ID,
      collect_list(event) AS events_list
    FROM table
    GROUP BY ID;
However, within each ID that I group by, the events need to be sorted by order_num, so that the result table looks like this:

ID    events_list
-----------------------------
A     ["blue","red","yellow"]
B     ["green","yellow"]
I cannot do a global sort by ID and order_num before the collect_list() query, because the table is very large. Is there a way to sort by order_num inside collect_list?
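The trick is to use a subquery with DISTRIBUTE BY and SORT BY clauses. DISTRIBUTE BY sends all rows with the same ID to the same reducer, and SORT BY orders rows only within each reducer, so no global sort is needed. See below: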
    WITH table1 AS (
      SELECT 'A' AS ID, 'red' AS event, 2 AS order_num UNION ALL
      SELECT 'A' AS ID, 'blue' AS event, 1 AS order_num UNION ALL
      SELECT 'A' AS ID, 'yellow' AS event, 3 AS order_num UNION ALL
      SELECT 'B' AS ID, 'yellow' AS event, 2 AS order_num UNION ALL
      SELECT 'B' AS ID, 'green' AS event, 1 AS order_num
    )
    -- Collect it
    SELECT
      subquery.ID,
      collect_list(subquery.event) AS events_list
    FROM (
      SELECT
        table1.ID,
        table1.event,
        table1.order_num
      FROM table1
      DISTRIBUTE BY table1.ID
      SORT BY table1.ID, table1.order_num
    ) subquery
    GROUP BY subquery.ID;
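If you would rather not depend on the order in which rows reach collect_list, a different technique is to collect (order_num, event) pairs and sort the resulting array itself. The following is only a sketch, not part of the original answer. It assumes a Hive version in which sort_array accepts an array of structs and in which selecting a field from an array of structs returns an array of that field; check both on your cluster. The names sorted_pairs and s are made up for illustration.

```sql
-- Sketch only: assumes sort_array() supports arrays of structs in your Hive version.
WITH table1 AS (
  SELECT 'A' AS ID, 'red' AS event, 2 AS order_num UNION ALL
  SELECT 'A' AS ID, 'blue' AS event, 1 AS order_num UNION ALL
  SELECT 'A' AS ID, 'yellow' AS event, 3 AS order_num UNION ALL
  SELECT 'B' AS ID, 'yellow' AS event, 2 AS order_num UNION ALL
  SELECT 'B' AS ID, 'green' AS event, 1 AS order_num
)
SELECT
  s.ID,
  -- Field access on array<struct> is assumed to yield an array of the event values.
  s.sorted_pairs.event AS events_list
FROM (
  SELECT
    ID,
    -- order_num is the first struct field, so it acts as the primary sort key.
    sort_array(
      collect_list(named_struct('order_num', order_num, 'event', event))
    ) AS sorted_pairs
  FROM table1
  GROUP BY ID
) s;
```

Because the sort happens on each collected array, this variant needs neither DISTRIBUTE BY nor SORT BY, at the cost of holding the structs for each ID in memory while they are sorted.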