Hive 刷题 查询首次登录系统后第二天登录系统的用户数占总用户数的百分比

需求描述

从用户登录信息表(temp_user_login)中查询首次登录后第二天仍然登录的用户占所有用户的比例,结果保留2位小数,使用百分数显示,

原始数据

desc temp_user_login;
day_id    string 登录日期
user_id    int    用户ID
if_login    int 是否登录 10select * from temp_user_login;
2022-03-01    1    1
2022-03-01    2    0
2022-03-01    3    0
2022-03-02    1    1
2022-03-02    2    1
2022-03-02    3    0
2022-03-03    1    1
2022-03-03    2    0
2022-03-03    3    1
2022-03-04    1    1
2022-03-04    2    0
2022-03-04    3    0
2022-03-05    1    1
2022-03-05    2    1
2022-03-05    3    1

解题思路

1、先计算总用户数,这个很 esay;

2、取出连续两天登录的用户,进行去重,进行计算即可

参考sql

select concat(cast(round(login_again / user_cnt * 100, 2) as string), '%') login_again_percent
from (select sum(1) user_cnt
      from (select user_id
            from temp_user_login
            group by user_id) t) total_user
         join
     (select sum(1) login_again
      from (select t.user_id
            from (select user_id, day_id
                  from (select user_id, day_id, rank() over (partition by user_id order by day_id) rk
                        from (select user_id, day_id
                              from temp_user_login
                              where if_login = '1'
                              group by user_id, day_id) t) t1
                  where rk = 1) t
                     join
                 (select user_id, day_id
                  from (select user_id, day_id, rank() over (partition by user_id order by day_id) rk
                        from (select user_id, day_id
                              from temp_user_login
                              where if_login = '1'
                              group by user_id, day_id) t) t1
                  where rk = 2) t1 on t.user_id = t1.user_id and datediff(t1.day_id, t.day_id) = 1
            group by t.user_id) again) again;

 

posted @ 2023-01-11 16:22  晓枫的春天  阅读(82)  评论(0编辑  收藏  举报