SQL IN查询优化
实际项目中有如下SQL, 发现效率很低,用时超过1分钟
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime) from BFDB_Code.dbo.packcodeinfo p inner join Task t on p.TaskID = t.ID where datediff(day, t.EndDate, getdate()) < @day and t.Status in(4,5) group by TaskID
通过查询计划可知上面语句进行了全表扫描,所以效率很低
单个查询时并不慢,因为在TaskID上已经建立索引
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime) from BFDB_Code.dbo.packcodeinfo p where p.TaskID in (2488,2499) group by TaskID
但如果是这样查询,仍然会全表扫描
select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime) from BFDB_Code.dbo.packcodeinfo p where p.TaskID in ( SELECT id FROM task t where t.Status in(2,3) ) group by TaskID
优化方案
去除inner join,写一个函数返回类似(2488,2499)值,然后动态构造SQL语句执行。
函数定义get_begin_task
create function [dbo].[get_begin_task]() returns varchar(1000) as begin DECLARE @csv VARCHAR(1000) SELECT @csv = COALESCE(@csv + ',', '') + Convert(varchar(10), ID) FROM task t where t.Status in(2,3) and createdate >= '2016-05-01'; Return @csv end;
修改存储过程如:
declare @sql varchar(8000); set @sql= ' select TaskID, StartDate = min(UpdateTime), EndDate = max(UpdateTime) from BFDB_Code.dbo.packcodeinfo p where p.TaskID in (' + dbo.get_begin_task() +') group by TaskID ' exec(@sql);
小插曲:
因为用到了跨数据库查询,动态执行SQL语句需要执行 EXEC sp_addlinkedserver ‘BFDB_Code’命令。
签名:删除冗余的代码最开心,找不到删除的代码最痛苦!