Scalar UDF performance issue

Excerpted from Inside Microsoft SQL Server 2008: T-SQL Programming.

You should be aware that invoking scalar UDFs in queries has a high cost when you provide
the function with attributes from the outer table as inputs. Even when the function only has
a RETURN clause with a scalar expression, it is not considered inline. The overhead of the
function call per row involves a high cost. You can run a simple performance test to see
the cost of UDFs compared to inline expressions in a query.

Before you run the performance test, run the code in Listing 2-1 to create an auxiliary table
of numbers called Nums and populate it with 1,000,000 numbers. Note that this book makes
frequent use of this helper table, so you may want to keep it around after creating it.

Listing 2-1 Creating and Populating Auxiliary Table of Numbers

SET NOCOUNT ON;
USE InsideTSQL2008;

IF OBJECT_ID('dbo.Nums', 'U') IS NOT NULL DROP TABLE dbo.Nums;
CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY);

DECLARE @max AS INT, @rc AS INT;
SET @max = 1000000;
SET @rc = 1;

-- Seed the table with a single row, then double the row count on each pass.
INSERT INTO dbo.Nums VALUES(1);
WHILE @rc * 2 <= @max
BEGIN
  INSERT INTO dbo.Nums SELECT n + @rc FROM dbo.Nums;
  SET @rc = @rc * 2;
END

-- Add the remaining rows up to @max.
INSERT INTO dbo.Nums
  SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max;
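
As a quick sanity check after running Listing 2-1, a query along the following lines should confirm that Nums holds the integers 1 through 1,000,000. This is only a sketch and is not part of the original listing; the column aliases are illustrative.

```sql
-- Sanity check (not from the original listing): row count and range of dbo.Nums.
-- With @max = 1000000 the expected values are 1000000, 1, and 1000000.
SELECT COUNT(*) AS cnt, MIN(n) AS min_n, MAX(n) AS max_n
FROM dbo.Nums;
```

Because n is the primary key and therefore unique, a count of 1,000,000 together with a minimum of 1 and a maximum of 1,000,000 implies there are no gaps.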

Turn on the Discard results after execution option in SQL Server Management Studio (SSMS), so that
your measurements do not include the time it takes to generate the output.

  

Start by running a query against a million rows from Nums, with an inline expression that
adds 1 to n:

SELECT n, n + 1 AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;

This query finished in less than a second.

Next, create the AddOne scalar UDF:

IF OBJECT_ID('dbo.AddOne', 'FN') IS NOT NULL
  DROP FUNCTION dbo.AddOne;
GO
CREATE FUNCTION dbo.AddOne(@i AS INT) RETURNS INT
AS
BEGIN
  RETURN @i + 1;
END
GO

Now run the query using AddOne:

SELECT n, dbo.AddOne(n) AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;

This query ran for five seconds.
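
To put numbers on the comparison rather than relying on the status bar, one option is to capture the time before each query and print the elapsed milliseconds afterward. The following is a minimal sketch, assuming dbo.Nums and dbo.AddOne already exist; the @start variable and the message labels are illustrative and not from the original text.

```sql
-- Sketch: time the inline expression and the scalar UDF in one batch.
-- @start is an illustrative variable name.
DECLARE @start AS DATETIME2;

SET @start = SYSDATETIME();
SELECT n, n + 1 AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;
PRINT 'inline expression: '
  + CAST(DATEDIFF(ms, @start, SYSDATETIME()) AS VARCHAR(20)) + ' ms';

SET @start = SYSDATETIME();
SELECT n, dbo.AddOne(n) AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;
PRINT 'scalar UDF: '
  + CAST(DATEDIFF(ms, @start, SYSDATETIME()) AS VARCHAR(20)) + ' ms';
```

Absolute timings depend on hardware and server load, so it is the ratio between the two figures that matters.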

Fortunately, there is a solution that lets you avoid choosing between the convenience of a
UDF and the performance of an inline expression: a way to create a UDF without negatively
affecting the performance of the query.
The solution is applicable only when the function is based on a single expression, as opposed
to having a full body with flow. Instead of defining a scalar UDF, define an inline table-valued
UDF that returns a query with no FROM clause, with a single column based on the expression
of interest. I’ll provide more details about inline table-valued UDFs later in the chapter, in
the section “Table-Valued UDFs.” For now, suffice it to say that an inline table-valued UDF is
very much like a view that can accept input parameters. Here’s the inline table-valued UDF
version of AddOne:

IF OBJECT_ID('dbo.AddOneInline', 'IF') IS NOT NULL
DROP FUNCTION dbo.AddOneInline;
GO
CREATE FUNCTION dbo.AddOneInline(@n AS INT) RETURNS TABLE
AS
RETURN SELECT @n + 1 AS val;
GO
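
To see the "view with parameters" behavior on its own, you can query the function directly with a constant argument. This small example is an illustration added here, not part of the original text; the input value 41 is arbitrary.

```sql
-- Query the inline table-valued UDF like a parameterized view.
-- Returns a single row with a single column: val = 42.
SELECT val FROM dbo.AddOneInline(41);
```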

Because this UDF is table-valued, you can’t just call it as part of an expression; you have to
query it. Therefore, to write a scalar expression based on the function call, you have to use a
scalar subquery, like so:

SELECT n, (SELECT val FROM dbo.AddOneInline(n) AS F) AS n_plus_one
FROM dbo.Nums WHERE n <= 1000000;

This query also ran in under a second.

The ability to refer to a table UDF within a subquery and pass attributes from the outer
table as input is like an implicit use of the APPLY operator functionality. If you prefer, as an
alternative you could use the APPLY operator explicitly, like so:

SELECT Nums.n, A.val AS n_plus_one
FROM dbo.Nums
CROSS APPLY dbo.AddOneInline(n) AS A
WHERE n <= 1000000;

This query also ran in under a second.
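
Before swapping a scalar UDF for its inline counterpart, it can be worth confirming that the two forms return the same results. One way to sketch that check, assuming dbo.Nums and dbo.AddOneInline exist as defined above, is to compare the inline-expression query with the APPLY query using EXCEPT. This check is an addition, not something the original text performs.

```sql
-- Rows produced by the inline expression but not by the inline UDF.
-- An empty result means the two forms agree for every n in dbo.Nums.
SELECT n, n + 1 AS n_plus_one
FROM dbo.Nums
EXCEPT
SELECT Nums.n, A.val
FROM dbo.Nums
  CROSS APPLY dbo.AddOneInline(n) AS A;
```

Because both queries return exactly one row per value of n, checking one direction is enough here.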

 



 

posted @ 2014-08-31 17:57  Cathy Lee