查找至少连续出现三次的所有数字/连续3天的日期【LeetCode】

编写一个SQL查询,查找至少连续出现三次的所有数字。
+----+-----+ | Id | Num | +----+-----+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 1 | | 6 | 2 | | 7 | 2 | +----+-----+

【1】sql server优质解法:

【1.1】连续3次以上出现的数字

CREATE TABLE #A
(
    id INT IDENTITY(1,1),
    val INT 
)
INSERT INTO #A(val) VALUES(1),(1),(1),(1),(2),(2),(3),(4),(4),(4),(4),(4)
INSERT INTO #A(val) VALUES(1)
select * from #A
SELECT val,MIN(id) AS minid,MAX(id) AS maxid, COUNT(1) AS cmd FROM 
(
SELECT *,id-ROW_NUMBER() OVER( PARTITION BY val ORDER BY id )  AS idx FROM #A
 
) S GROUP BY val,idx

 


【1.2】连续出现的日期:(比如,想要查询连续登录超过3天的用户)

CREATE TABLE #b
(
id INT IDENTITY(1,1),
userid INT,
login_time datetime
)
INSERT INTO #b(userid,login_time)VALUES(101,'20180801'),(102,'20180801')
INSERT INTO #b(userid,login_time)VALUES(102,'20180802')
INSERT INTO #b(userid,login_time)VALUES(101,'20180803'),(102,'20180803')
INSERT INTO #b(userid,login_time)VALUES(101,'20180804'),(102,'20180804')
INSERT INTO #b(userid,login_time)VALUES(101,'20180805'),(102,'20180805')
INSERT INTO #b(userid,login_time)VALUES(101,'20180806')
INSERT INTO #b(userid,login_time)VALUES(101,'20180807')
INSERT INTO #b(userid,login_time)VALUES(101,'20180809')

SELECT
* FROM #b ORDER BY userid ,login_time --解答 SELECT userid, MIN(login_time) AS StartDate, MAX(login_time) AS EndDate, COUNT(1) AS DayCount FROM ( SELECT userid ,login_time ,DATEADD(dd, -ROW_NUMBER() OVER ( PARTITION BY userid ORDER BY login_time), login_time) AS Grp FROM #b ) AS T GROUP BY userid, [Grp] ORDER BY 1


 【2】mysql办法解决

【2.1】连续时间(比如,想要查询连续登录超过3天的用户)

(8.0以前,8.0以后可以用上述sql server 办法)


 测试代码

-- 测试数据代码
CREATE TABLE b
(
id INT primary key auto_increment,
userid INT,
login_time datetime
);
-- select * from b order by userid;
INSERT INTO b(userid,login_time) VALUES(101,'20180801'),(102,'20180801');
INSERT INTO b(userid,login_time) VALUES(101,'20180802'),(102,'20180802');
INSERT INTO b(userid,login_time) VALUES(101,'20180803'),(102,'20180803');
INSERT INTO b(userid,login_time) VALUES(101,'20180804'),(102,'20180804');
INSERT INTO b(userid,login_time) VALUES(101,'20180805'),(102,'20180805');
INSERT INTO b(userid,login_time) VALUES(101,'20180806');
INSERT INTO b(userid,login_time) VALUES(101,'20180807');
INSERT INTO b(userid,login_time) VALUES(101,'20180808');
INSERT INTO b(userid,login_time) VALUES(101,'20180809');
INSERT INTO b(userid,login_time) VALUES(101,'20180810');
INSERT INTO b(userid,login_time) VALUES(101,'20180731');
INSERT INTO b(userid,login_time) VALUES(102,'20180731');

插入后生成的测试表数据: 

    

实现代码:

select userid,min(login_time) min_date,max(login_time) max_date,count(1) as day_count
from (
    select b.*,
    date_add(login_time,interval -if(@group_str=userid,@num:=@num+1,@num:=1) day) as login ,
    @group_str:=userid as temp  
    from b cross join (select @num:=0,@group_str=-1) t 
order by b.userid,login_time

) t
group by userid,login

结果:

 

 

【2.2】连续3次以上出现的数字

  强烈推荐解法三

编写一个SQL查询,查找至少连续出现三次的所有数字。
+----+-----+ | Id | Num | +----+-----+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 1 | | 6 | 2 | | 7 | 2 | +----+-----+

 

-- 建表
create table  Logs(id int primary key auto_increment,num int);

-- 造数据
INSERT INTO Logs(num) VALUES(1),(1),(1),(1),(2),(2),(3),(4),(4),(4),(4),(4),(1);

 

mysql解法:

这道题给了我们一个Logs表,让我们找Num列中连续出现相同数字三次的数字,那么由于需要找三次相同数字,所以我们需要建立三个表的实例.
我们可以用l1分别和l2, l3内交,l1和l2的Id下一个位置比,l1和l3的下两个位置比,然后将Num都相同的数字返回即可:

解法一:

SELECT DISTINCT l1.Num FROM Logs l1
JOIN Logs l2 ON l1.Id = l2.Id - 1
JOIN Logs l3 ON l1.Id = l3.Id - 2
WHERE l1.Num = l2.Num AND l2.Num = l3.Num;

  

   

 

下面这种方法没用用到Join,而是直接在三个表的实例中查找,然后把四个条件限定上,就可以返回正确结果了:

 

解法二:

SELECT DISTINCT l1.Num FROM Logs l1, Logs l2, Logs l3
WHERE l1.Id = l2.Id - 1 AND l2.Id = l3.Id - 1
AND l1.Num = l2.Num AND l2.Num = l3.Num;

 

  再来看一种画风截然不同的方法,用到了变量count和pre,分别初始化为0和-1,然后需要注意的是用到了IF语句,MySQL里的IF语句和我们所熟知的其他语言的if不太一样,相当于我们所熟悉的三元操作符a?b:c,若a真返回b,否则返回c,具体可看这个帖子。那么我们先来看对于Num列的第一个数字1,pre由于初始化是-1,和当前Num不同,所以此时count赋1,此时给pre赋为1,然后Num列的第二个1进来,此时的pre和Num相同了,count自增1,到Num列的第三个1进来,count增加到了3,此时满足了where条件,t.n >= 3,所以1就被select出来了,以此类推遍历完整个Num就可以得到最终结果:

 

解法三:

SELECT DISTINCT Num FROM (
SELECT Num, @count := IF(@pre = Num, @count + 1, 1) AS n, @pre := Num
FROM Logs, (SELECT @count := 0, @pre := -1) AS init
) AS t WHERE t.n >= 3;

  

 

【3】求出某段时间内玩家最大连胜次数

【3.1】思路分解

-- 获取某段日期内某个表的玩家最大连胜次数
-- t1 构造全局顺序伪列 all_rn 
use log_db;
with t1 as (
    select numid,result,logtime,row_number() over(order by numid,logtime) as all_rn
    from log_ZhuoLaoMaZi_20255_2021_07 order by numid,logtime limit 100  --时间段过滤,在这里添加条件
)
-- t2 筛选出获胜局 , result >0 为赢 ,这时如果不连续胜利的,那么 all_rn 必定会断连续
,t2 as (
    select * from t1 where result >0
)
-- t3 构造根据上面的结果集,以用户 numberid 分组 构造分组顺序伪列 group_rn
,t3 as (
    select *, 
    row_number() over(partition by numid order by logtime) as group_rn  
    from t2  
)
-- t4 构造出可以分组的列,连续获胜的构造出同一个值出来,以便分组获取连胜次数
,t4 as (
    select *, all_rn - group_rn as group_val from t3 
)
-- 最终结果
,t5 as (
    select numid as '玩家id',group_val '分组辅助列',    
    min(logtime) as '连胜begintime' ,max(logtime) as '连胜endtime',count(1) as '连胜数' 
    from t4 
    group by numid,group_val 
)
select * from t5;

  

 

 抽取了一百条做查询测试,如上图,只有一行是0的;所以应该有99行相关连胜,最终查询结果如下:

下图中连胜数加起来确实是 99行,所以是正确的,想要获取最大连胜数的行信息,只需要再把代码 order by 连胜数 desc limit 1 即可;

  

【3.2】最佳实践

本小节的思路、代码来自:SQLSERVER技术交流群

use tempdb;
if object_id('#temp1') is not null
    drop table #temp1
create table #temp1(userid int ,result int,logtime datetime)
insert into #temp1  values(101,2,'2021-08-31 08:25:25'),(101,2,'2021-08-31 08:25:27'),(101,2,'2021-08-31 08:25:29')
insert into #temp1  values(101,2,'2021-08-31 08:25:31'),(101,2,'2021-08-31 08:26:01'),(101,1,'2021-08-31 08:26:03')
insert into #temp1  values(101,2,'2021-08-31 08:26:05'),(101,2,'2021-08-31 08:26:06'),(101,2,'2021-08-31 08:26:07')
insert into #temp1  values(101,2,'2021-08-31 08:27:05'),(101,2,'2021-08-31 08:28:06'),(101,2,'2021-08-31 08:29:07')
insert into #temp1  values(101,2,'2021-08-31 09:27:05'),(101,2,'2021-08-31 10:28:06'),(101,2,'2021-08-31 11:29:07')

insert into #temp1  values(101,2,'2021-09-11 08:25:25'),(101,2,'2021-09-11 08:25:27'),(101,2,'2021-09-12 08:25:29')
insert into #temp1  values(101,2,'2021-09-13 08:25:31'),(101,2,'2021-09-14 08:26:01'),(101,2,'2021-09-14 08:26:03')
insert into #temp1  values(101,2,'2021-09-15 08:26:05'),(101,2,'2021-09-16 08:26:06'),(101,2,'2021-09-17 08:26:07')
insert into #temp1  values(101,2,'2021-09-18 08:27:05'),(101,2,'2021-09-19 08:28:06'),(101,1,'2021-09-20 08:29:07')
insert into #temp1  values(101,2,'2021-09-21 09:27:05'),(101,2,'2021-09-22 10:28:06'),(101,2,'2021-09-25 11:29:07')

简写一下【3.1】:

select userid,result,all_rn-group_rn as [分组伪列],count(1) count_number from 
(
    select *  
    ,row_number() over(partition by userid,result order by  logtime) as all_rn
    ,row_number() over(partition by userid order by logtime) as group_rn
    from #temp1
) y
group by userid,result,all_rn-group_rn--having result=2
  

另外一种思路:

select userid,count(1) count_number ,max(result) rt
from (
    select * ,sum(rt) over(partition by userid order by logtime) sum_rt
    from (
        select *  
        ,iif(lag(result,1,result) over(partition by userid order by  logtime) = result,0,1) as rt
        from #temp1
    ) t 

) t1
group by userid,sum_rt
--having rt=2
  

 

posted @ 2018-07-04 12:15  郭大侠1  阅读(4663)  评论(2编辑  收藏  举报