测试数据:
解题思路:
1.用窗口函数根据userId分组,将每个分组内的数据根据排序顺序编号
select userid, loginDate , ROW_NUMBER() OVER (PARTITION BY userId ORDER BY loginDate) AS seq from test;
2. 用每条记录的loginDate 减去seq,得到一个时间 ,如果两条记录得到的这个时间相等表示是连续登录
select userid, loginDate , ROW_NUMBER() OVER (PARTITION BY userId ORDER BY loginDate) AS seq, date_sub(logindate,interval row_number() over (partition by userid order by logindate) day) as day from test;
3.根据题目要求统计连续登录大于等于7天的用户,就是统计至少存在7条记录的day是相等的userid
with gr as (select userid, date_sub(logindate,interval row_number() over (partition by userid order by logindate) day) as day from test) select userid, day, count(1) from gr group by userId ,day HAVING (count(1)>=7);
上面的写法等同于
select userid, day, count(1) from (select userid, date_sub(logindate,interval row_number() over (partition by userid order by logindate) day) as day from test) t group by userId ,day HAVING (count(1)>=7);