SET STATISTICS IO
Causes SQL Server to display information regarding the amount of disk activity generated by Transact-SQL statements.
SET STATISTICS IO { ON | OFF }
After this option is set ON, all subsequent Transact-SQL statements return the statistical information until the option is set to OFF.
The following table lists and describes the output items.
Output item |
Meaning |
---|---|
Table |
Name of the table. |
Scan count |
Number of seeks/scans started after reaching the leaf level in any direction to retrieve all the values to construct the final dataset for the output.
|
logical reads |
Number of pages read from the data cache. |
physical reads |
Number of pages read from disk. |
read-ahead reads |
Number of pages placed into the cache for the query. |
lob logical reads |
Number of text, ntext, image, or large value type (varchar(max), nvarchar(max), varbinary(max)) pages read from the data cache. |
lob physical reads |
Number of text, ntext, image or large value type pages read from disk. |
lob read-ahead reads |
Number of text, ntext, image or large value type pages placed into the cache for the query. |
The setting of SET STATISTICS IO is set at execute or run time and not at parse time.
Note |
---|
When Transact-SQL statements retrieve LOB columns, some LOB retrieval operations might require traversing the LOB tree multiple times. This may cause SET STATISTICS IO to report higher than expected logical reads. |
USE AdventureWorks2012;
GO
SET STATISTICS IO ON;
GO
SELECT *
FROM Production.ProductCostHistory
WHERE StandardCost < 500.00;
GO
SET STATISTICS IO OFF;
GO
Here is the result set:
Table 'ProductCostHistory'. Scan count 1, logical reads 5, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Beware misleading data from SET STATISTICS IO
My co-worker Steve Wright (blog | @SQL_Steve) prodded me with a question recently on a strange result he was seeing. In order to test some functionality in our latest tool, SQL Sentry Plan Explorer PRO, he had manufactured a wide and large table, and was running a variety of queries against it. In one case he was returning a lot of data, but STATISTICS IO
was showing that very few reads were taking place. I pinged some people on #sqlhelp and, since it seemed nobody had seen this issue, I thought I would blog about it.
In short, be very aware that there are some scenarios where you can't rely on STATISTICS IO
to tell you the truth. In some cases (this one involving TOP
and parallelism), it will vastly under-report logical reads. This can lead you to believe you have a very I/O-friendly query when you don't. There are other more obvious cases – such as when you have a bunch of I/O hidden away by the use of scalar user-defined functions. We think Plan Explorer makes those cases more obvious; this one, however, is a little trickier.
The problem query
The table has 37 million rows, up to 250 bytes per row, about 1 million pages, and very low fragmentation (0.42% on level 0, 15% on level 1, and 0 beyond that). There are no computed columns, no UDFs in play, and no indexes except a clustered primary key on the leadingINT
column. A simple query returning 500,000 rows, all columns, using TOP
and SELECT *
:
SET STATISTICS IO ON;
SELECT TOP 500000 * FROM dbo.OrderHistory
WHERE OrderDate < (SELECT '19961029');
(And yes, I realize I am violating my own rules and using SELECT *
and TOP
without ORDER BY
, but for the sake of simplicity I am trying my best to minimize my influence on the optimizer.)
Results:
Table 'OrderHistory'. Scan count 1, logical reads 23, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
We're returning 500,000 rows, and it takes about 10 seconds. I immediately know that something is wrong with the logical reads number. Even if I didn't already know about the underlying data, I can tell from the grid results in Management Studio that this is pulling more than 23 pages of data, whether they are from memory or cache, and this should be reflected somewhere in STATISTICS IO
. Looking at the plan…
…we see parallelism is in there, and that we've scanned the entire table. So how is it possible that there are only 23 logical reads?
Another "identical" query
One of my first questions back to Steve was: "What happens if you eliminate parallelism?" So I tried it out. I took the original subquery version and added MAXDOP 1
:
SET STATISTICS IO ON;
SELECT TOP 500000 * FROM dbo.OrderHistory
WHERE OrderDate < (SELECT '19961029') OPTION (MAXDOP 1);
Results and plan:
Table 'OrderHistory'. Scan count 1, logical reads 149589, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
We have a slightly less complex plan, and without the parallelism (for obvious reasons), STATISTICS IO
is showing us much more believable numbers for logical read counts.
What is the truth?
It's not hard to see that one of these queries is not telling the whole truth. While STATISTICS IO
might not tell us the whole story, maybe trace will. If we retrieve runtime metrics by generating an actual execution plan in Plan Explorer, we see that the magical low-read query is, in fact, pulling the data from memory or disk, and not from a cloud of magic pixie dust. In fact it has *more* reads than the other version:
So it is clear that reads are happening, they're just not appearing correctly in the STATISTICS IO
output.
What is the problem?
Well, I'll be quite honest: I don't know, other than the fact that parallelism is definitely playing a role, and it seems to be some kind of race condition. STATISTICS IO
(and, since that's where we get the data, our Table I/O tab) shows a very misleading number of reads. It's clear that the query returns all of the data we're looking for, and it's clear from the trace results that it uses reads and not osmosis to do so. I asked Paul White (blog | @SQL_Kiwi) about it and he suggested that only some of the pre-thread I/O counts are being included in the total (and agrees that this is a bug).
If you want to try this out at home, all you need is AdventureWorks (this should repro against 2008, 2008 R2 and 2012 versions), and the following query:
SET STATISTICS IO ON;
DBCC SETCPUWEIGHT(1000) WITH NO_INFOMSGS;
GO
SELECT TOP (15000) *
FROM Sales.SalesOrderHeader
WHERE OrderDate < (SELECT '20080101');
SELECT TOP (15000) *
FROM Sales.SalesOrderHeader
WHERE OrderDate < (SELECT '20080101')
OPTION (MAXDOP 1);
DBCC SETCPUWEIGHT(1) WITH NO_INFOMSGS;
(Note that SETCPUWEIGHT
is only used to coax parallelism. For more info, see Paul White's blog post on Plan Costing.)
Results:
Table 'SalesOrderHeader'. Scan count 1, logical reads 333, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
Paul pointed out an even simpler repro:
SET STATISTICS IO ON;
GO
SELECT TOP (15000) *
FROM Production.TransactionHistory
WHERE TransactionDate < (SELECT '20080101')
OPTION (QUERYTRACEON 8649, MAXDOP 4);
SELECT TOP (15000) *
FROM Production.TransactionHistory AS th
WHERE TransactionDate < (SELECT '20080101');
Results:
Table 'TransactionHistory'. Scan count 1, logical reads 110, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.
So it seems that we can easily reproduce this at will with a TOP
operator and a low enough DOP. I've filed a bug:
And Paul has filed two other somewhat-related bugs involving parallelism, the first as a result of our conversation:
- Cardinality Estimation Error With Pushed Predicate on a Lookup [ related blog post ]
- Poor Performance with Parallelism and Top [ related blog post ]
(For the nostalgic, here are six other parallelism bugs I pointed out a few years ago.)
What is the lesson?
Be careful about trusting a single source. If you look solely at STATISTICS IO
after changing a query like this, you may be tempted to focus on the miraculous drop in reads instead of the increase in duration. At which point you may pat yourself on the back, leave work early and enjoy your weekend, thinking you have just made a tremendous performance impact on your query. When of course nothing could be further from the truth.
Bug information: STATISTICS IO under-reports logical reads for parallel plans
http://connect.microsoft.com/SQLServer/feedback/details/767250/statistics-io-under-reports-logical-reads-for-parallel-plans
We recently stumbled on a scenario where if a plan uses TOP and parallelism, with a low enough DOP, SET STATISTICS IO ON produces incorrect and unreliable information. Thanks to Steve Wright and Thomas Stringer for helping me narrow in and focus on the true issue. Note that DBCC SETCPUWEIGHT is present only to ensure a parallel plan is used for this quite simplistic repro.
USE AdventureWorks2012;
GO
SET NOCOUNT ON;
SET STATISTICS IO ON;
DBCC SETCPUWEIGHT(1000) WITH NO_INFOMSGS;
GO
-- STATISTICS IO shows 4 logical reads, trace shows 420:
SELECT TOP 15000 * FROM Sales.SalesOrderHeader WHERE OrderDate < '20080101';
-- STATISTICS IO and trace both show 333 logical reads:
SELECT TOP 15000 * FROM Sales.SalesOrderHeader WHERE OrderDate < '20080101' OPTION (MAXDOP 1);
DBCC SETCPUWEIGHT(1) WITH NO_INFOMSGS;
Another AdventureWorks repro, but which does not require messing about with weights:
SELECT TOP (15000)
th.*
FROM Production.TransactionHistory AS th
WHERE
th.TransactionDate < (SELECT '20080101')
OPTION (QUERYTRACEON 8649, MAXDOP 4);
I found the issue occurs much less frequently at higher DOP, hence the MAXDOP hint.