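EF Core: Getting the First Element of Each Group After GroupBy
EF Core has reached 5.x as of this writing, yet its support for slightly more complex queries is still unsatisfying, and sometimes you have no choice but to use Dapper alongside it.
Introduction
Suppose we have a login log table like the one below and need to query each user's most recent login record. How do we do that with EF Core?
The table structure and mock data are as follows.
Database generation script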
USE [EFCoreGroupDemo]
GO
/****** Object: Table [dbo].[LoginLog] Script Date: 2021-06-20 16:50:39 ******/
SET ANSI_NULLS ON
GO
SET QUOTED_IDENTIFIER ON
GO
CREATE TABLE [dbo].[LoginLog](
[Id] [uniqueidentifier] NOT NULL,
[DateAdded] [datetime2](7) NOT NULL,
[UserId] [uniqueidentifier] NOT NULL,
[ClientId] [uniqueidentifier] NOT NULL,
[LoginResult] [int] NOT NULL,
PRIMARY KEY CLUSTERED
(
[Id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, OPTIMIZE_FOR_SEQUENTIAL_KEY = OFF) ON [PRIMARY]
) ON [PRIMARY]
GO
INSERT [dbo].[LoginLog] ([Id], [DateAdded], [UserId], [ClientId], [LoginResult]) VALUES (N'20212010-9906-40f3-8830-31d6a006c027', CAST(N'2021-06-20T15:45:09.1666667' AS DateTime2), N'a79c02e4-61f2-422a-a8ab-fdbf2b201e44', N'88dcc77d-30cf-4b1b-9cbb-d6451a31ce67', 0)
GO
INSERT [dbo].[LoginLog] ([Id], [DateAdded], [UserId], [ClientId], [LoginResult]) VALUES (N'5c8d4eec-d443-46a2-a0f2-5c51ae2d9592', CAST(N'2021-06-20T15:46:08.0133333' AS DateTime2), N'54eab252-b171-4e51-80c2-92bd0f15fc86', N'88dcc77d-30cf-4b1b-9cbb-d6451a31ce67', 0)
GO
INSERT [dbo].[LoginLog] ([Id], [DateAdded], [UserId], [ClientId], [LoginResult]) VALUES (N'05426fc6-c9e6-4f98-ac83-bdc2f7a76e7d', CAST(N'2021-06-19T15:43:39.0000000' AS DateTime2), N'a79c02e4-61f2-422a-a8ab-fdbf2b201e44', N'7e40223a-cd28-4886-8480-2a9c3270a963', 0)
GO
INSERT [dbo].[LoginLog] ([Id], [DateAdded], [UserId], [ClientId], [LoginResult]) VALUES (N'8e3e8e17-6617-47cc-9fd7-bfb6603a3a8f', CAST(N'2021-06-19T15:44:19.0000000' AS DateTime2), N'54eab252-b171-4e51-80c2-92bd0f15fc86', N'88dcc77d-30cf-4b1b-9cbb-d6451a31ce67', 0)
GO
ALTER TABLE [dbo].[LoginLog] ADD DEFAULT (newid()) FOR [Id]
GO
ALTER TABLE [dbo].[LoginLog] ADD DEFAULT (getdate()) FOR [DateAdded]
GO
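The queries later in this post use dbCtx.Set<LoginLog>(), but the entity class itself is not shown. A minimal sketch of what it might look like, assuming the property names simply mirror the column names in the script above (the attributes and the choice of int for LoginResult are assumptions, not the original code):

```csharp
using System;
using System.ComponentModel.DataAnnotations;
using System.ComponentModel.DataAnnotations.Schema;

// Hypothetical entity for the [dbo].[LoginLog] table.
// Property names mirror the column names from the script; the class is an assumption.
[Table("LoginLog")]
public class LoginLog
{
    [Key]
    public Guid Id { get; set; }            // uniqueidentifier, defaults to newid()

    public DateTime DateAdded { get; set; } // datetime2(7), defaults to getdate()

    public Guid UserId { get; set; }        // the column the queries group by

    public Guid ClientId { get; set; }

    public int LoginResult { get; set; }    // int in the table; real code may use an enum
}
```

With a class like this registered in the DbContext, dbCtx.Set<LoginLog>() should resolve to the table created above.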
Exploration
The most obvious way to write it is probably something like this:
var set = dbCtx.Set<LoginLog>().AsNoTracking();
var latestLogs = set.GroupBy(x => x.UserId)
.Select(g => g.OrderByDescending(x => x.DateAdded).First());
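Testing shows that this does not work: running it throws System.InvalidOperationException.
For now, EF Core's GroupBy support is limited to basic aggregate functions. For details, see this issue in the EF Core repository: Support ability to select top N of each group.
So until EF Core supports the form above, are we stuck writing raw SQL? Not quite. I found an alternative that works: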
var set = dbCtx.Set<LoginLog>().AsNoTracking();
var sorted = set.OrderByDescending(x => x.DateAdded);
var latestLogs = set.Select(x => x.UserId)
.Distinct()
.SelectMany(x => sorted.Where(y => y.UserId == x).Take(1));
The SQL generated by this approach looks like this:
SELECT [t1].[Id], [t1].[ClientId], [t1].[DateAdded], [t1].[LoginResult], [t1].[UserId]
FROM (
SELECT DISTINCT [l].[UserId]
FROM [LoginLog] AS [l]
) AS [t]
INNER JOIN (
SELECT [t0].[Id], [t0].[ClientId], [t0].[DateAdded], [t0].[LoginResult], [t0].[UserId]
FROM (
SELECT [l0].[Id], [l0].[ClientId], [l0].[DateAdded], [l0].[LoginResult], [l0].[UserId], ROW_NUMBER() OVER(PARTITION BY [l0].[UserId] ORDER BY [l0].[DateAdded] DESC) AS [row]
FROM [LoginLog] AS [l0]
) AS [t0]
WHERE [t0].[row] <= 1
) AS [t1] ON [t].[UserId] = [t1].[UserId]
As you can see, the generated SQL uses the window function ROW_NUMBER.
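To make the role of ROW_NUMBER concrete, here is a rough in-memory imitation of the inner subquery using LINQ to Objects. It is only a sketch of the idea, not what EF Core executes; the RowNumberSketch class and the tuple shape are made up for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper that imitates, in memory, the subquery
//   ROW_NUMBER() OVER(PARTITION BY UserId ORDER BY DateAdded DESC) ... WHERE row <= 1
public static class RowNumberSketch
{
    public static List<(Guid UserId, DateTime DateAdded)> TopOnePerUser(
        IEnumerable<(Guid UserId, DateTime DateAdded)> rows)
    {
        return rows
            .GroupBy(r => r.UserId)                              // PARTITION BY UserId
            .SelectMany(g => g
                .OrderByDescending(r => r.DateAdded)             // ORDER BY DateAdded DESC
                .Select((r, i) => (Row: r, RowNumber: i + 1)))   // ROW_NUMBER(), starting at 1
            .Where(x => x.RowNumber <= 1)                        // WHERE row <= 1
            .Select(x => x.Row)
            .ToList();
    }
}
```

The outer INNER JOIN against the distinct user Ids does not drop any of these rows, since every partition belongs to a user that appears in the table.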
I also printed the query results to the console.
The method that prints the results:
static void WriteLogs(IEnumerable<LoginLog> logs)
{
    foreach (var log in logs)
    {
        Console.WriteLine($"{log.UserId}, {log.DateAdded:yyyy-MM-dd HH:mm}, {log.LoginResult}");
    }
}
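The first column is the user Id, the second is the date added, and the last is the login result. As you can see, the query results match expectations.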
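The expected rows can also be checked without a database. The sketch below rebuilds the four sample rows in memory and runs both query shapes with LINQ to Objects, where GroupBy followed by First() works fine. LogRow and LatestLoginCheck are hypothetical names, and the fractional seconds from the script are dropped for brevity.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical stand-in for the LoginLog entity, reduced to the printed columns.
public class LogRow
{
    public Guid UserId { get; set; }
    public DateTime DateAdded { get; set; }
    public int LoginResult { get; set; }
}

public static class LatestLoginCheck
{
    // The four rows from the INSERT script (fractional seconds omitted).
    public static List<LogRow> SampleRows()
    {
        var userA = Guid.Parse("a79c02e4-61f2-422a-a8ab-fdbf2b201e44");
        var userB = Guid.Parse("54eab252-b171-4e51-80c2-92bd0f15fc86");
        return new List<LogRow>
        {
            new LogRow { UserId = userA, DateAdded = new DateTime(2021, 6, 20, 15, 45, 9) },
            new LogRow { UserId = userB, DateAdded = new DateTime(2021, 6, 20, 15, 46, 8) },
            new LogRow { UserId = userA, DateAdded = new DateTime(2021, 6, 19, 15, 43, 39) },
            new LogRow { UserId = userB, DateAdded = new DateTime(2021, 6, 19, 15, 44, 19) },
        };
    }

    // The "obvious" shape: group, sort each group, take the first element.
    public static List<LogRow> ViaGroupBy(IEnumerable<LogRow> rows) =>
        rows.GroupBy(x => x.UserId)
            .Select(g => g.OrderByDescending(x => x.DateAdded).First())
            .ToList();

    // The workaround shape: distinct keys, then the top sorted row per key.
    public static List<LogRow> ViaDistinctSelectMany(IEnumerable<LogRow> rows)
    {
        var sorted = rows.OrderByDescending(x => x.DateAdded);
        return rows.Select(x => x.UserId)
            .Distinct()
            .SelectMany(id => sorted.Where(y => y.UserId == id).Take(1))
            .ToList();
    }

    // Same layout as WriteLogs, with an invariant culture so the separators are fixed.
    public static string Format(LogRow log) =>
        string.Format(CultureInfo.InvariantCulture,
            "{0}, {1:yyyy-MM-dd HH:mm}, {2}", log.UserId, log.DateAdded, log.LoginResult);
}
```

Passing LatestLoginCheck.SampleRows() to either method and printing each row with Format should give one line per user: the 2021-06-20 15:45 login for a79c02e4-... and the 2021-06-20 15:46 login for 54eab252-....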
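Summary
EF6 actually supports GroupBy followed by taking the first element after sorting, but EF Core's support here has been slow to arrive, even though the related issue has been open since 2018. For now the only option is a workaround. I have run into this kind of query several times, so I am writing it down here, and I hope EF Core adds support for it soon 😓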