Hive 刷题——查询有新注册用户的当天的新用户数量、新用户的第一天留存率

需求描述

从用户登录明细表(user_login_detail)中首次登录算作当天新增,第二天也登录了算作一日留存
结果如下:
first_login
 
(注册时间)
register
 
(新增用户数)
retention
<decimal(16,2)>
(留存率)
2021-09-21
1
0.00
2021-09-22
1
0.00
2021-09-23
1
0.00
2021-09-24
1
0.00
2021-09-25
1
0.00
2021-09-26
1
0.00
2021-09-27
1
0.00
2021-10-04
2
0.50
2021-10-06
1
0.00
需要用到的表:
用户登录明细表:user_login_detail
user_id(用户id)
ip_address(ip地址)
login_ts(登录时间)
logout_ts(登出时间)
101
180.149.130.161
2021-09-21 08:00:00
2021-09-27 08:30:00
102
120.245.11.2
2021-09-22 09:00:00
2021-09-27 09:30:00
103
27.184.97.3
2021-09-23 10:00:00
2021-09-27 10:30:00

参考SQL

select login_date first_login, cast(register as string) register, cast(relonin / register as decimal(16, 2)) retention
from (select t.login_date,
             count(t.user_id)                                         register,
             sum(if(datediff(t1.login_date, t.login_date) = 1, 1, 0)) relonin
      from (select user_id, login_date
            from (select user_id, login_date, rank() over (partition by user_id order by login_date) rk
                  from (select user_id, date(login_ts) login_date
                        from user_login_detail
                        group by user_id, date(login_ts)) i) i
            where rk = 1) t
               left join
           (select user_id, login_date
            from (select user_id, login_date, rank() over (partition by user_id order by login_date) rk
                  from (select user_id, date(login_ts) login_date
                        from user_login_detail
                        group by user_id, date(login_ts)) i) i
            where rk = 2) t1 on t.user_id = t1.user_id
      group by t.login_date) i;
posted @ 2023-02-03 11:39  晓枫的春天  阅读(207)  评论(0编辑  收藏  举报