When does “physical reads” include read-ahead reads in SQL Server?

pre-fetching mechanism

SQL Server has more than one way to pull pages in from disk for your queries. SQL Server can do a physical read of an 8KB page, or an extent of 8 of those 8KB pages.

SQL Server can also use the “read-ahead” mechanism to pull even larger chunks of data in from disk when you have a query that wants to read a lot of data — because just plucking one 8KB page or even 64KB of pages into disk isn’t super fast when you need lotsa pages.

But these terms get a little confusing when you’re changing between different diagnostic tools in SQL Server, because some of these tools include read-ahead reads in physical reads, and some don’t!

In this case, physical reads = 5,625 and read-ahead reads = 438,117 against our largest table. There are 2 other reads against the small FirstName table for a total of 443,384 physical pages read.

The sys.dm_exec_query_stats DMV combines all physical reads

When I query the sys.dm_exec_query_stats DMV and pull back execution statistics just for my query (with a diagnostic query like this), it doesn’t have a separate column for read-ahead reads. Instead, it reports all reads from disk under physical reads:

Notably, this DMV shows that more physical reads were done than were reported by STATISTICS IO! It saw 483,128 physical reads.

Well, how about a trace? I ran a quick trace on my session and collected sql_statement_completed. Like sys.dm_exec_query_stats, it doesn’t separate out read-ahead reads: it reports everything as physical reads.

I was really happy to see that the trace agreed with the DMV, and that they both saw 483,128 physical reads. It’s nice to have a little consistency!

What about tracing sqlserver.file_read_completed for read-ahead reads?

So I set up my Extended Events trace, cleared out my buffer pool, and ran my query. Here’s the physical read count and sizes, I saw, with some columns added for analysis:

The “read-ahead reads” column here has been converted to the page count read. I didn’t include single page reads or the reads of a single extent in that column. (I also was a little lazy and didn’t bother to filter out “cache warming” pages read, which is probably around 9 pages, but don’t tell anyone, OK?)

This trace saw fewer read-ahead reads than STATISTICS IO reported. It also saw fewer physical reads than sys.dm_exec_query_stats or sql_statement_completed saw. But we’re in the ballpark, and these are different methods of measurement.

My view: STATISTICS IO isn’t super-precise, but it is cool that it separates out read-ahead reads

It’s actually not easy to see how many read-ahead reads your query is doing! STATISTICS IO is one of the few places that breaks it out in a way that’s simple to see. I see these numbers as a ballpark estimate, and I wouldn’t bet on them being “exactly right”. They’re good enough to be really useful.

When you’re looking at sys.dm_exec_query_stats, or you’re tracing statements completed, in those places “physical reads” includes reads of all sizes– including read-ahead reads.