Hive 刷题——HiveSql 实现分钟级的趋势图
问题描述
在Hive中,怎么用sql实现分钟级的趋势图?比如从交易表中,如何统计0点到每分钟的交易趋势图?原表:trade_A(trade_id,pay_time(格式是2020-08-05 10:30:28),pay_gmv)。希望用sql实现分钟级的0点到当前分钟的GMV。结果表:result_A(minute_rn(分钟顺序),pay_gmv_td(每分钟的交易额,都是0点到当前分钟的累加值))。
示例数据
with temp as (select 101 as trade_id, '2024-03-05 00:30:28' as pay_time, 100 as pay_gmv union all select 102 as trade_id, '2024-03-05 00:30:58' as pay_time, 200 as pay_gmv union all select 103 as trade_id, '2024-03-05 00:35:28' as pay_time, 300 as pay_gmv union all select 104 as trade_id, '2024-03-05 01:36:28' as pay_time, 400 as pay_gmv)
参考实现
with temp as (select 101 as trade_id, '2024-03-05 00:30:28' as pay_time, 100 as pay_gmv union all select 102 as trade_id, '2024-03-05 00:30:58' as pay_time, 200 as pay_gmv union all select 103 as trade_id, '2024-03-05 00:35:28' as pay_time, 300 as pay_gmv union all select 104 as trade_id, '2024-03-05 01:36:28' as pay_time, 400 as pay_gmv) select minute_rn, sum(nvl(pay_gmv, 0)) over (partition by date_format(minute_rn, 'yyyy-MM-dd') order by minute_rn) as pay_gmv_td from (select from_unixtime(unix_timestamp(t1.min_pay_time) + tab.pos * 60, 'yyyy-MM-dd HH:mm') as minute_rn from (select min(date_format(pay_time, 'yyyy-MM-dd 00:00:00')) as min_pay_time --取最小日期 , max(date_format(pay_time, 'yyyy-MM-dd 23:59:59')) as max_pay_time --取最大日期 from temp) t1 lateral view posexplode(split( repeat(',', cast(ceil((unix_timestamp(max_pay_time) - unix_timestamp(min_pay_time)) / 60) as int)), ',')) tab as pos, val) t1 left join (select date_format(pay_time, 'yyyy-MM-dd HH:mm') as pay_time , sum(pay_gmv) as pay_gmv from temp group by date_format(pay_time, 'yyyy-MM-dd HH:mm')) t2 on t1.minute_rn = t2.pay_time order by 1;