查找至少连续出现三次的所有数字/连续3天的日期【LeetCode】
编写一个SQL查询,查找至少连续出现三次的所有数字。
+----+-----+ | Id | Num | +----+-----+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 1 | | 6 | 2 | | 7 | 2 | +----+-----+
【1】sql server优质解法:
【1.1】连续3次以上出现的数字
CREATE TABLE #A ( id INT IDENTITY(1,1), val INT ) INSERT INTO #A(val) VALUES(1),(1),(1),(1),(2),(2),(3),(4),(4),(4),(4),(4) INSERT INTO #A(val) VALUES(1)
select * from #A
SELECT val,MIN(id) AS minid,MAX(id) AS maxid, COUNT(1) AS cmd FROM ( SELECT *,id-ROW_NUMBER() OVER( PARTITION BY val ORDER BY id ) AS idx FROM #A ) S GROUP BY val,idx
【1.2】连续出现的日期:(比如,想要查询连续登录超过3天的用户)
CREATE TABLE #b ( id INT IDENTITY(1,1), userid INT, login_time datetime ) INSERT INTO #b(userid,login_time)VALUES(101,'20180801'),(102,'20180801') INSERT INTO #b(userid,login_time)VALUES(102,'20180802') INSERT INTO #b(userid,login_time)VALUES(101,'20180803'),(102,'20180803') INSERT INTO #b(userid,login_time)VALUES(101,'20180804'),(102,'20180804') INSERT INTO #b(userid,login_time)VALUES(101,'20180805'),(102,'20180805') INSERT INTO #b(userid,login_time)VALUES(101,'20180806') INSERT INTO #b(userid,login_time)VALUES(101,'20180807')
INSERT INTO #b(userid,login_time)VALUES(101,'20180809')
SELECT * FROM #b ORDER BY userid ,login_time --解答 SELECT userid, MIN(login_time) AS StartDate, MAX(login_time) AS EndDate, COUNT(1) AS DayCount FROM ( SELECT userid ,login_time ,DATEADD(dd, -ROW_NUMBER() OVER ( PARTITION BY userid ORDER BY login_time), login_time) AS Grp FROM #b ) AS T GROUP BY userid, [Grp] ORDER BY 1
【2】mysql办法解决
【2.1】连续时间(比如,想要查询连续登录超过3天的用户)
(8.0以前,8.0以后可以用上述sql server 办法)
测试代码
-- 测试数据代码 CREATE TABLE b ( id INT primary key auto_increment, userid INT, login_time datetime ); -- select * from b order by userid; INSERT INTO b(userid,login_time) VALUES(101,'20180801'),(102,'20180801'); INSERT INTO b(userid,login_time) VALUES(101,'20180802'),(102,'20180802'); INSERT INTO b(userid,login_time) VALUES(101,'20180803'),(102,'20180803'); INSERT INTO b(userid,login_time) VALUES(101,'20180804'),(102,'20180804'); INSERT INTO b(userid,login_time) VALUES(101,'20180805'),(102,'20180805'); INSERT INTO b(userid,login_time) VALUES(101,'20180806'); INSERT INTO b(userid,login_time) VALUES(101,'20180807'); INSERT INTO b(userid,login_time) VALUES(101,'20180808'); INSERT INTO b(userid,login_time) VALUES(101,'20180809'); INSERT INTO b(userid,login_time) VALUES(101,'20180810'); INSERT INTO b(userid,login_time) VALUES(101,'20180731'); INSERT INTO b(userid,login_time) VALUES(102,'20180731');
插入后生成的测试表数据:
实现代码:
select userid,min(login_time) min_date,max(login_time) max_date,count(1) as day_count from ( select b.*, date_add(login_time,interval -if(@group_str=userid,@num:=@num+1,@num:=1) day) as login , @group_str:=userid as temp from b cross join (select @num:=0,@group_str=-1) t order by b.userid,login_time ) t group by userid,login
结果:
【2.2】连续3次以上出现的数字
强烈推荐解法三
编写一个SQL查询,查找至少连续出现三次的所有数字。
+----+-----+ | Id | Num | +----+-----+ | 1 | 1 | | 2 | 1 | | 3 | 1 | | 4 | 2 | | 5 | 1 | | 6 | 2 | | 7 | 2 | +----+-----+
-- 建表 create table Logs(id int primary key auto_increment,num int);
-- 造数据
INSERT INTO Logs(num) VALUES(1),(1),(1),(1),(2),(2),(3),(4),(4),(4),(4),(4),(1);
mysql解法:
这道题给了我们一个Logs表,让我们找Num列中连续出现相同数字三次的数字,那么由于需要找三次相同数字,所以我们需要建立三个表的实例.
我们可以用l1分别和l2, l3内交,l1和l2的Id下一个位置比,l1和l3的下两个位置比,然后将Num都相同的数字返回即可:
解法一:
SELECT DISTINCT l1.Num FROM Logs l1 JOIN Logs l2 ON l1.Id = l2.Id - 1 JOIN Logs l3 ON l1.Id = l3.Id - 2 WHERE l1.Num = l2.Num AND l2.Num = l3.Num;
下面这种方法没用用到Join,而是直接在三个表的实例中查找,然后把四个条件限定上,就可以返回正确结果了:
解法二:
SELECT DISTINCT l1.Num FROM Logs l1, Logs l2, Logs l3 WHERE l1.Id = l2.Id - 1 AND l2.Id = l3.Id - 1 AND l1.Num = l2.Num AND l2.Num = l3.Num;
再来看一种画风截然不同的方法,用到了变量count和pre,分别初始化为0和-1,然后需要注意的是用到了IF语句,MySQL里的IF语句和我们所熟知的其他语言的if不太一样,相当于我们所熟悉的三元操作符a?b:c,若a真返回b,否则返回c,具体可看这个帖子。那么我们先来看对于Num列的第一个数字1,pre由于初始化是-1,和当前Num不同,所以此时count赋1,此时给pre赋为1,然后Num列的第二个1进来,此时的pre和Num相同了,count自增1,到Num列的第三个1进来,count增加到了3,此时满足了where条件,t.n >= 3,所以1就被select出来了,以此类推遍历完整个Num就可以得到最终结果:
解法三:
SELECT DISTINCT Num FROM ( SELECT Num, @count := IF(@pre = Num, @count + 1, 1) AS n, @pre := Num FROM Logs, (SELECT @count := 0, @pre := -1) AS init ) AS t WHERE t.n >= 3;
【3】求出某段时间内玩家最大连胜次数
【3.1】思路分解
-- 获取某段日期内某个表的玩家最大连胜次数 -- t1 构造全局顺序伪列 all_rn use log_db; with t1 as ( select numid,result,logtime,row_number() over(order by numid,logtime) as all_rn from log_ZhuoLaoMaZi_20255_2021_07 order by numid,logtime limit 100 --时间段过滤,在这里添加条件 ) -- t2 筛选出获胜局 , result >0 为赢 ,这时如果不连续胜利的,那么 all_rn 必定会断连续 ,t2 as ( select * from t1 where result >0 ) -- t3 构造根据上面的结果集,以用户 numberid 分组 构造分组顺序伪列 group_rn ,t3 as ( select *, row_number() over(partition by numid order by logtime) as group_rn from t2 ) -- t4 构造出可以分组的列,连续获胜的构造出同一个值出来,以便分组获取连胜次数 ,t4 as ( select *, all_rn - group_rn as group_val from t3 ) -- 最终结果 ,t5 as ( select numid as '玩家id',group_val '分组辅助列', min(logtime) as '连胜begintime' ,max(logtime) as '连胜endtime',count(1) as '连胜数' from t4 group by numid,group_val ) select * from t5;
抽取了一百条做查询测试,如上图,只有一行是0的;所以应该有99行相关连胜,最终查询结果如下:
下图中连胜数加起来确实是 99行,所以是正确的,想要获取最大连胜数的行信息,只需要再把代码 order by 连胜数 desc limit 1 即可;
【3.2】最佳实践
本小节的思路、代码来自:SQLSERVER技术交流群
use tempdb; if object_id('#temp1') is not null drop table #temp1 create table #temp1(userid int ,result int,logtime datetime) insert into #temp1 values(101,2,'2021-08-31 08:25:25'),(101,2,'2021-08-31 08:25:27'),(101,2,'2021-08-31 08:25:29') insert into #temp1 values(101,2,'2021-08-31 08:25:31'),(101,2,'2021-08-31 08:26:01'),(101,1,'2021-08-31 08:26:03') insert into #temp1 values(101,2,'2021-08-31 08:26:05'),(101,2,'2021-08-31 08:26:06'),(101,2,'2021-08-31 08:26:07') insert into #temp1 values(101,2,'2021-08-31 08:27:05'),(101,2,'2021-08-31 08:28:06'),(101,2,'2021-08-31 08:29:07') insert into #temp1 values(101,2,'2021-08-31 09:27:05'),(101,2,'2021-08-31 10:28:06'),(101,2,'2021-08-31 11:29:07') insert into #temp1 values(101,2,'2021-09-11 08:25:25'),(101,2,'2021-09-11 08:25:27'),(101,2,'2021-09-12 08:25:29') insert into #temp1 values(101,2,'2021-09-13 08:25:31'),(101,2,'2021-09-14 08:26:01'),(101,2,'2021-09-14 08:26:03') insert into #temp1 values(101,2,'2021-09-15 08:26:05'),(101,2,'2021-09-16 08:26:06'),(101,2,'2021-09-17 08:26:07') insert into #temp1 values(101,2,'2021-09-18 08:27:05'),(101,2,'2021-09-19 08:28:06'),(101,1,'2021-09-20 08:29:07') insert into #temp1 values(101,2,'2021-09-21 09:27:05'),(101,2,'2021-09-22 10:28:06'),(101,2,'2021-09-25 11:29:07')
简写一下【3.1】:
select userid,result,all_rn-group_rn as [分组伪列],count(1) count_number from ( select * ,row_number() over(partition by userid,result order by logtime) as all_rn ,row_number() over(partition by userid order by logtime) as group_rn from #temp1 ) y group by userid,result,all_rn-group_rn--having result=2
另外一种思路:
select userid,count(1) count_number ,max(result) rt from ( select * ,sum(rt) over(partition by userid order by logtime) sum_rt from ( select * ,iif(lag(result,1,result) over(partition by userid order by logtime) = result,0,1) as rt from #temp1 ) t ) t1 group by userid,sum_rt --having rt=2