scalar UDFs performance issue
refer from Inside Miscrsoft SQL Server 2008: T-SQL Programming.
You should be aware that invoking scalar UDFs in queries has a high cost when you provide
the function with attributes from the outer table as inputs. Even when the function only has
a RETURN clause with a scalar expression, it is not considered inline. The overhead of the
function
call per row involves a high cost. You can run a simple performance test to realize
the high cost involved with UDFs compared to inline expressions in a query.
Before you run the performance test, run the code in Listing 2-1 to create an auxiliary table
of numbers called Nums and populate it with 1,000,000 numbers. Note that this book makes
frequent use of this helper table, so you may want to keep it around after creating it.
List ing 2-1 Creating and Populating Auxiliary Table of Numbers
SET NOCOUNT ON; USE InsideTSQL2008; IF OBJECT_ID('dbo.Nums', 'U') IS NOT NULL DROP TABLE dbo.Nums; CREATE TABLE dbo.Nums(n INT NOT NULL PRIMARY KEY); DECLARE @max AS INT, @rc AS INT; SET @max = 1000000; SET @rc = 1; INSERT INTO Nums VALUES(1); WHILE @rc * 2 <= @max BEGIN INSERT INTO dbo.Nums SELECT n + @rc FROM dbo.Nums; SET @rc = @rc * 2; END INSERT INTO dbo.Nums SELECT n + @rc FROM dbo.Nums WHERE n + @rc <= @max; Turn on the Discard results after execution in SQL Server Management Studio (SSMS), so that your measurements do not include the time it takes to generate the output.
Start by running a query against a million rows from Nums, with an inline expression that
adds 1 to n:
SELECT n, n + 1 AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;
it finished in less than a second.
Next, create the AddOne scalar UDF:
1 IF OBJECT_ID('dbo.AddOne', 'FN') IS NOT NULL 2 DROP FUNCTION dbo.AddOne; 3 GO 4 CREATE FUNCTION dbo.AddOne(@i AS INT) RETURNS INT 5 AS 6 BEGIN 7 RETURN @i + 1; 8 END 9 GO
Now run the query using AddOne:
SELECT n, dbo.AddOne(n) AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;
this query ran for five seconds
Fortunately, there is a solution that allows you
to avoid making such a choice—a solution that allows you to create a UDF without negatively
effecting the performance of the query.
The solution is applicable only when the function is based on a single expression, as opposed
to having a full body with flow. Instead of defining a scalar UDF, define an inline table-valued
UDF that returns a query with no FROM clause, with a single column based on the expression
of interest. I’ll provide more details about inline table-valued UDFs later in the chapter, in
the section “Table-Valued UDFs.” For now, suffice to say that an inline table-valued UDF is
very much like a view that can accept input parameters. Here’s the inline table-valued UDF
version
of AddOne:
IF OBJECT_ID('dbo.AddOneInline', 'IF') IS NOT NULL DROP FUNCTION dbo.AddOneInline; GO CREATE FUNCTION dbo.AddOneInline(@n AS INT) RETURNS TABLE AS RETURN SELECT @n + 1 AS val; GO
Because this UDF is table-valued, you can’t just call it as part of an expression—you have to
query it. Therefore, to write a scalar expression based on the function call, you have to use a
scalar subquery, like so:
SELECT n, (SELECT val FROM dbo.AddOneInline(n) AS F) AS n_plus_one FROM dbo.Nums WHERE n <= 1000000;
this query also ran for under a second
The ability to refer to a table UDF within a subquery and pass attributes from the outer
table as input is like an implicit use of the APPLY operator functionality. If you prefer, as an
alternative
you could use the APPLY operator explicitly, like so:
SELECT Nums.n, A.val AS n_plus_one FROM dbo.Nums CROSS APPLY dbo.AddOneInline(n) AS A WHERE n <= 1000000;
this query also ran for under a second