Slow query using non-deterministic user defined function

Slow query using non-deterministic user defined function

http://blogs.msdn.com/b/psssql/archive/2014/07/08/slow-query-using-non-deterministic-user-defined-function.aspx

Recently we worked with a customer who reported a query that used to run a few seconds in SQL Server 2000 but it never finishes in SQL Server 2008 R2 following upgrade.

We went around and tried quite a few things but couldn't get SQL Server 2008 R2 to generate similar plan. Upon closer look at 2008 R2's query plan, we noticed something unusual. The plan has a warning "NO JOIN PREDICATE". What this means is that a cartesian product is introduced.

To illustrate the problem, let's use an example setup:

 

drop function dbo.myfunc 
go
drop view v1, v2 
go
drop table t1, t2 
go 
create table t1 (c1 int not null, c2 varchar(100)) 
go
create table t2 (c1 int not null, c2 varchar(100)) 
go 
set nocount on
go
declare @i int 
begin tran 
select @i = 0 
while (@i < 1000) 
begin 
insert into t1 (c1, c2) values (@i, 'a') 
insert into t2 (c1, c2) values (@i, 'b') 
select @i = @i + 1 
end 
commit tran 
go
drop function dbo.myFunc 
go
create function dbo.myfunc (@c1 int) 
returns int 
--with schemabinding 
as 
begin 
return (@c1 * 100 ) 
end 
go 
create view v1 as select c1, c2, dbo.myfunc(c1) as c3 from t1 
go 
create view v2 as select c1, c2, dbo.myfunc(c1) as c3 from t2 
go

 

 

Now, let's run the following query

dbcc freeproccache 
go 
set statistics profile on 
go

-- But by pulling UDF above join in this query we actually introduce a cartesian product (NO JOIN PREDICATE) 
-- UDF is called 1 million times instead of 1000 times each for the two views! 
select count(*) from v1 as t1 join v2 as t2 on t1.c3 = t2.c3 
go 
set statistics profile off 
go

 

The above query is very slow as illustrated in the query plan below. In the line 6 for the query plan, there is a warning "no join predicate". The join resulted in 1,000,000 rows (1,000 x 1,000 rows from each table).

In line 5, the myfunc is called 2,000,000 times (1,000,000 for computing t1.c1 and 1,000,000 for t2.c1).

This is because starting SQL Server 2005, optimizer has rule changes that will disallow non-deterministic scalar functions to be 'pushed down' in some situations (like this one).

 

 

Solution

 

Many times, you can simply make a function deterministic by adding schemabinding option. In the above example, re-write the function with schemabinding, it will be much faster.

From the query plan, you will no longer see that the "NO JOIN PREDICATE". The scalare UDF is pushed down right after table scan and applied only 100 times on each table.

drop function dbo.myFunc 
go
create function dbo.myfunc (@c1 int) 
returns int 
with schemabinding 
as 
begin 
return (@c1 * 100 ) 
end

 

 

 

Obviously, the function can be made deterministic. If you use following, the function will not be deterministic even you use schemabidning because of getdate(). In such cases, you will continue to see "NO JOIN PREDICATE" Cartesian product joins.

drop function dbo.myFunc 
go
create function dbo.myfunc (@c1 int) 
returns int 
with schemabinding 
as 
begin 
return (@c1 * 100 * datepart (mm,getdate())) 
end

 

 

 

Jack Li | Senior Escalation Engineer | Microsoft SQL Server Support

 

posted @ 2014-10-23 16:32  桦仔  阅读(148)  评论(0)    收藏  举报