Hive 刷题——HiveSql 实现分钟级的趋势图

问题描述

在Hive中,怎么用sql实现分钟级的趋势图?比如从交易表中,如何统计0点到每分钟的交易趋势图?原表:trade_A(trade_id,pay_time(格式是2020-08-05 10:30:28),pay_gmv)。希望用sql实现分钟级的0点到当前分钟的GMV。结果表:result_A(minute_rn(分钟顺序),pay_gmv_td(每分钟的交易额,都是0点到当前分钟的累加值))。

示例数据

with temp as (select 101 as trade_id, '2024-03-05 00:30:28' as pay_time, 100 as pay_gmv
              union all
              select 102 as trade_id, '2024-03-05 00:30:58' as pay_time, 200 as pay_gmv
              union all
              select 103 as trade_id, '2024-03-05 00:35:28' as pay_time, 300 as pay_gmv
              union all
              select 104 as trade_id, '2024-03-05 01:36:28' as pay_time, 400 as pay_gmv)

参考实现

with temp as (select 101 as trade_id, '2024-03-05 00:30:28' as pay_time, 100 as pay_gmv
              union all
              select 102 as trade_id, '2024-03-05 00:30:58' as pay_time, 200 as pay_gmv
              union all
              select 103 as trade_id, '2024-03-05 00:35:28' as pay_time, 300 as pay_gmv
              union all
              select 104 as trade_id, '2024-03-05 01:36:28' as pay_time, 400 as pay_gmv)

select minute_rn,
       sum(nvl(pay_gmv, 0)) over (partition by date_format(minute_rn, 'yyyy-MM-dd') order by minute_rn) as pay_gmv_td
from (select from_unixtime(unix_timestamp(t1.min_pay_time) + tab.pos * 60, 'yyyy-MM-dd HH:mm') as minute_rn
      from (select min(date_format(pay_time, 'yyyy-MM-dd 00:00:00')) as min_pay_time --取最小日期
                 , max(date_format(pay_time, 'yyyy-MM-dd 23:59:59')) as max_pay_time --取最大日期
            from temp) t1
               lateral view posexplode(split(
                       repeat(',',
                              cast(ceil((unix_timestamp(max_pay_time) - unix_timestamp(min_pay_time)) / 60) as int)),
                       ',')) tab as pos, val) t1
         left join (select date_format(pay_time, 'yyyy-MM-dd HH:mm') as pay_time
                         , sum(pay_gmv)                              as pay_gmv
                    from temp
                    group by date_format(pay_time, 'yyyy-MM-dd HH:mm')) t2
                   on t1.minute_rn = t2.pay_time
order by 1;

 

posted @ 2024-03-28 12:34  晓枫的春天  阅读(48)  评论(0编辑  收藏  举报