LeetCode 1097. Game Play Analysis V
1、题目描述
We define the install date of a player to be the first login day of that player.
We also define day 1 retention of some date X to be the number of players whose install date is X and they logged back in on the day right after X, divided by the number of players whose install date is X, rounded to 2 decimal places.
Write an SQL query that reports for each install date, the number of players that installed the game on that day and the day 1 retention.
The query result format is in the following example:
Activity table:
player_id | device_id | event_date | games_played |
1 | 2 | 2016-03-01 | 5 |
1 | 2 | 2016-03-02 | 6 |
2 | 3 | 2017-06-25 | 1 |
2 | 1 | 2016-03-01 | 0 |
3 | 4 | 2016-07-03 | 5 |
Result table:
install_dt | installs | Day1_retention |
2016-03-01 | 2 | 0.50 |
2017-06-25 | 1 | 0.00 |
Player 1 and 3 installed the game on 2016-03-01 but only player 1 logged back in on 2016-03-02 so the day 1 retention of 2016-03-01 is 1 / 2 = 0.50
Player 2 installed the game on 2017-06-25 but didn’t log back in on 2017-06-26 so the day 1 retention of 2017-06-25 is 0 / 1 = 0.00
2、解题思路
- 建表语句:
1 INSERT INTO Activity (player_id, event_date) 2 VALUES 3 (1, '2018-01-01') ; 4 5 INSERT INTO Activity (player_id, event_date) 6 VALUES 7 (1, '2018-01-02') ; 8 9 INSERT INTO Activity (player_id, event_date) 10 VALUES 11 (2, '2018-01-01') ; 12 13 INSERT INTO Activity (player_id, event_date) 14 VALUES 15 (2, '2019-03-01') ; 16 17 INSERT INTO Activity (player_id, event_date) 18 VALUES 19 (2, '2019-03-02') ; 20 21 INSERT INTO Activity (player_id, event_date) 22 VALUES 23 (3, '2019-02-28') ; 24 25 INSERT INTO Activity (player_id, event_date) 26 VALUES 27 (3, '2019-03-01') ; 28 29 INSERT INTO Activity (player_id, event_date) 30 VALUES 31 (4, '2019-02-25') ; 32 33 INSERT INTO Activity (player_id, event_date) 34 VALUES 35 (4, '2019-03-01') ; 36 37 INSERT INTO Activity (player_id, event_date) 38 VALUES 39 (5, '2019-03-05') ; 40 41 INSERT INTO Activity (player_id, event_date) 42 VALUES 43 (6, '2019-03-01') ;
- 查询每日安装游戏的玩家数:
1 SELECT 2 event_date, 3 COUNT(1) AS installs 4 FROM 5 (SELECT 6 player_id, 7 event_date 8 FROM 9 Activity 10 GROUP BY player_id 11 ORDER BY player_id, 12 event_date) a 13 GROUP BY event_date;
- 按照访问日期升序,为每位玩家的事件记录进行编号:
如果你使用的数据库支持ROW_NUMBER()函数,可以更优雅的实现这一步。
1 SELECT 2 player_id, 3 event_date, 4 IF( 5 @id = player_id, 6 @rank := @rank + 1, 7 @rank := 1 8 ) AS rank, 9 @id := player_id 10 FROM 11 Activity, 12 (SELECT 13 @id := NULL, 14 @rank := 0) r 15 ORDER BY player_id, 16 event_date ;
- 统计安装游戏后第二日仍然登陆游戏的玩家信息:
用上一步的中间表自关联,关联条件见SQL语句。
1 SELECT 2 * 3 FROM 4 (SELECT 5 player_id, 6 event_date, 7 IF( 8 @id = player_id, 9 @rank := @rank + 1, 10 @rank := 1 11 ) AS rank, 12 @id := player_id 13 FROM 14 Activity, 15 (SELECT 16 @id := NULL, 17 @rank := 0) r 18 ORDER BY player_id, 19 event_date) t1, 20 (SELECT 21 player_id, 22 event_date, 23 IF( 24 @id2 = player_id, 25 @rank2 := @rank2 + 1, 26 @rank2 := 1 27 ) AS rank, 28 @id2 := player_id 29 FROM 30 Activity, 31 (SELECT 32 @id2 := NULL, 33 @rank2 := 0) r 34 ORDER BY player_id, 35 event_date) t2 36 WHERE t1.rank = 1 37 AND t2.rank =2 38 AND t1.player_id = t2.player_id 39 AND DATEDIFF(t2.event_date,t1.event_date)=1;
- 完整查询语句:
1 SELECT 2 temp1.event_date AS install_dt, 3 temp1.installs, 4 ROUND( 5 IFNULL(temp2.counts, 0) / temp1.installs, 6 2 7 ) AS Day1_retention 8 FROM 9 (SELECT 10 event_date, 11 COUNT(1) AS installs 12 FROM 13 (SELECT 14 player_id, 15 event_date 16 FROM 17 Activity 18 GROUP BY player_id 19 ORDER BY player_id, 20 event_date) a 21 GROUP BY event_date) temp1 22 LEFT JOIN 23 (SELECT 24 t1.event_date, 25 COUNT(1) AS counts 26 FROM 27 (SELECT 28 player_id, 29 event_date, 30 IF( 31 @id = player_id, 32 @rank := @rank + 1, 33 @rank := 1 34 ) AS rank, 35 @id := player_id 36 FROM 37 Activity, 38 (SELECT 39 @id := NULL, 40 @rank := 0) r 41 ORDER BY player_id, 42 event_date) t1, 43 (SELECT 44 player_id, 45 event_date, 46 IF( 47 @id2 = player_id, 48 @rank2 := @rank2 + 1, 49 @rank2 := 1 50 ) AS rank, 51 @id2 := player_id 52 FROM 53 Activity, 54 (SELECT 55 @id2 := NULL, 56 @rank2 := 0) r 57 ORDER BY player_id, 58 event_date) t2 59 WHERE t1.rank = 1 60 AND t2.rank = 2 61 AND t1.player_id = t2.player_id 62 AND DATEDIFF(t2.event_date, t1.event_date) = 1 63 GROUP BY t1.event_date) temp2 64 ON temp1.event_date = temp2.event_date ;
3、小结
本题知识点有以下几个:
- 如何实现分组按顺序进行编号。这是一个常见需求,最简单的实现方式是row_number函数,但mysql中没有这个函数,需要自己用sql语句实现。
- 日期函数的使用。
- 左连接与空值判定。