SET STATISTICS IO

Causes SQL Server to display information regarding the amount of disk activity generated by Transact-SQL statements.

SET STATISTICS IO { ON | OFF }
Remarks
 
When STATISTICS IO is ON, statistical information is displayed. When OFF, the information is not displayed.

After this option is set ON, all subsequent Transact-SQL statements return the statistical information until the option is set to OFF.

The following table lists and describes the output items.

 

Output item

Meaning

Table

Name of the table.

Scan count

Number of seeks/scans started after reaching the leaf level in any direction to retrieve all the values to construct the final dataset for the output.

  • Scan count is 0 if the index used is a unique index or clustered index on a primary key and you are seeking for only one value. For example WHERE Primary_Key_Column = <value>.

  • Scant count is 1 when you are searching for one value using a non-unique clustered index which is defined on a non-primary key column. This is done to check for duplicate values for the key value that you are searching for. For example WHERE Clustered_Index_Key_Column = <value>.

  • Scan count is N when N is the number of different seek/scan started towards the left or right side at the leaf level after locating a key value using the index key.

logical reads

Number of pages read from the data cache.

physical reads

Number of pages read from disk.

read-ahead reads

Number of pages placed into the cache for the query.

lob logical reads

Number of text, ntext, image, or large value type (varchar(max), nvarchar(max), varbinary(max)) pages read from the data cache.

lob physical reads

Number of text, ntext, image or large value type pages read from disk.

lob read-ahead reads

Number of text, ntext, image or large value type pages placed into the cache for the query.

The setting of SET STATISTICS IO is set at execute or run time and not at parse time.

Note Note

When Transact-SQL statements retrieve LOB columns, some LOB retrieval operations might require traversing the LOB tree multiple times. This may cause SET STATISTICS IO to report higher than expected logical reads.

 
 
 
To use SET STATISTICS IO, users must have the appropriate permissions to execute the Transact-SQL statement. The SHOWPLAN permission is not required.
 
Examples
 
This example shows how many logical and physical reads are used by SQL Server as it processes the statements.
USE AdventureWorks2012;
GO       
SET STATISTICS IO ON;
GO
SELECT * 
FROM Production.ProductCostHistory
WHERE StandardCost < 500.00;
GO
SET STATISTICS IO OFF;
GO

Here is the result set:

Table 'ProductCostHistory'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Reference
 
From http://msdn.microsoft.com/en-us/library/ms184361.aspx

Beware misleading data from SET STATISTICS IO

My co-worker Steve Wright (blog | @SQL_Steve) prodded me with a question recently on a strange result he was seeing. In order to test some functionality in our latest tool, SQL Sentry Plan Explorer PRO, he had manufactured a wide and large table, and was running a variety of queries against it. In one case he was returning a lot of data, but STATISTICS IO was showing that very few reads were taking place. I pinged some people on #sqlhelp and, since it seemed nobody had seen this issue, I thought I would blog about it.

In short, be very aware that there are some scenarios where you can't rely on STATISTICS IO to tell you the truth. In some cases (this one involving TOP and parallelism), it will vastly under-report logical reads. This can lead you to believe you have a very I/O-friendly query when you don't. There are other more obvious cases – such as when you have a bunch of I/O hidden away by the use of scalar user-defined functions. We think Plan Explorer makes those cases more obvious; this one, however, is a little trickier.

The problem query

The table has 37 million rows, up to 250 bytes per row, about 1 million pages, and very low fragmentation (0.42% on level 0, 15% on level 1, and 0 beyond that). There are no computed columns, no UDFs in play, and no indexes except a clustered primary key on the leadingINT column. A simple query returning 500,000 rows, all columns, using TOP and SELECT *:

SET STATISTICS IO ON;
 
SELECT TOP 500000 * FROM dbo.OrderHistory 
WHERE OrderDate < (SELECT '19961029');

(And yes, I realize I am violating my own rules and using SELECT * and TOP without ORDER BY, but for the sake of simplicity I am trying my best to minimize my influence on the optimizer.)

Results:

(500000 row(s) affected)
Table 'OrderHistory'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We're returning 500,000 rows, and it takes about 10 seconds. I immediately know that something is wrong with the logical reads number. Even if I didn't already know about the underlying data, I can tell from the grid results in Management Studio that this is pulling more than 23 pages of data, whether they are from memory or cache, and this should be reflected somewhere in STATISTICS IO. Looking at the plan…

…we see parallelism is in there, and that we've scanned the entire table. So how is it possible that there are only 23 logical reads?

Another "identical" query

One of my first questions back to Steve was: "What happens if you eliminate parallelism?" So I tried it out. I took the original subquery version and added MAXDOP 1:

SET STATISTICS IO ON;
 
SELECT TOP 500000 * FROM dbo.OrderHistory 
WHERE OrderDate < (SELECT '19961029') OPTION (MAXDOP 1);

Results and plan:

(500000 row(s) affected)
Table 'OrderHistory'. Scan count 1, logical reads 149589, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

We have a slightly less complex plan, and without the parallelism (for obvious reasons), STATISTICS IO is showing us much more believable numbers for logical read counts.

What is the truth?

It's not hard to see that one of these queries is not telling the whole truth. While STATISTICS IO might not tell us the whole story, maybe trace will. If we retrieve runtime metrics by generating an actual execution plan in Plan Explorer, we see that the magical low-read query is, in fact, pulling the data from memory or disk, and not from a cloud of magic pixie dust. In fact it has *more* reads than the other version:

So it is clear that reads are happening, they're just not appearing correctly in the STATISTICS IO output.

What is the problem?

Well, I'll be quite honest: I don't know, other than the fact that parallelism is definitely playing a role, and it seems to be some kind of race condition. STATISTICS IO (and, since that's where we get the data, our Table I/O tab) shows a very misleading number of reads. It's clear that the query returns all of the data we're looking for, and it's clear from the trace results that it uses reads and not osmosis to do so. I asked Paul White (blog | @SQL_Kiwi) about it and he suggested that only some of the pre-thread I/O counts are being included in the total (and agrees that this is a bug).

If you want to try this out at home, all you need is AdventureWorks (this should repro against 2008, 2008 R2 and 2012 versions), and the following query:

SET STATISTICS IO ON;
DBCC SETCPUWEIGHT(1000) WITH NO_INFOMSGS;
GO
 
SELECT TOP (15000) * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate < (SELECT '20080101');
 
SELECT TOP (15000) * 
FROM Sales.SalesOrderHeader 
WHERE OrderDate < (SELECT '20080101') 
OPTION (MAXDOP 1);
 
DBCC SETCPUWEIGHT(1) WITH NO_INFOMSGS;

(Note that SETCPUWEIGHT is only used to coax parallelism. For more info, see Paul White's blog post on Plan Costing.)

Results:

Table 'SalesOrderHeader'. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'SalesOrderHeader'. Scan count 1, logical reads 333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Paul pointed out an even simpler repro:

SET STATISTICS IO ON;
GO
 
SELECT TOP (15000) *
FROM Production.TransactionHistory
WHERE TransactionDate < (SELECT '20080101')
OPTION (QUERYTRACEON 8649, MAXDOP 4);
 
SELECT TOP (15000) *
FROM Production.TransactionHistory AS th
WHERE TransactionDate < (SELECT '20080101');

Results:

Table 'TransactionHistory'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Table 'TransactionHistory'. Scan count 1, logical reads 110, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

So it seems that we can easily reproduce this at will with a TOP operator and a low enough DOP. I've filed a bug:

And Paul has filed two other somewhat-related bugs involving parallelism, the first as a result of our conversation:

(For the nostalgic, here are six other parallelism bugs I pointed out a few years ago.)

What is the lesson?

Be careful about trusting a single source. If you look solely at STATISTICS IO after changing a query like this, you may be tempted to focus on the miraculous drop in reads instead of the increase in duration. At which point you may pat yourself on the back, leave work early and enjoy your weekend, thinking you have just made a tremendous performance impact on your query. When of course nothing could be further from the truth.

 

posted @ 2014-04-22 18:57  princessd8251  阅读(257)  评论(0编辑  收藏  举报