This post is part 21 of a 30-part series about the Who is Active stored procedure. A new post will run each day during the month of April, 2011. After April all of these posts will be edited and combined into a single document to become the basis of the Who is Active documentation.

Tempdb. Everyone’s favorite shared bottleneck.

The funny thing about tempdb is that it’s not used by every query. It’s only really used by the biggest queries. The queries where performance really matters. And of course, that makes the situation all the worse. When tempdb is a problem, it’s a major problem.

A common cause of tempdb issues is latch contention. This occurs due to three “special” page types: GAM (Global Allocation Map), SGAM (Shared Global Allocation Map), and PFS (Page Free Space). Each of these pages (sometimes many of each) exist in all database files, and they are responsible for helping to identify where incoming data can be written in the physical file. For complete background, read this post by Paul Randal.

Whenever a process needs to update one of these special pages, a latch is taken. A latch can be thought of as a kind of very lightweight lock; it’s designed for synchronization and is intended to be quickly turned on, and then just as quickly turned off after it’s no longer needed. The problem occurs when lots of processes happen to hit the SQL Server instance at the exact same time, and they all need to find a place to store some data. Suddenly, there are numerous outstanding latch requests. They start queuing up, and before you know it the last one in line has been waiting for seconds—and not doing any work at all in the meantime.

Seeing this in action is quite simple. Create a bunch of temporary tables, in a very short period of time. I like to demo this using SQLQueryStress. Ideally you should do this kind of test on a server with only a single tempdb data file, to really highlight the issue. Here’s the code to run:

20 or so concurrent threads should do it. Hit the start button in your load tool, wait a few seconds, run Who is Active, and you’ll see something similar to:

See those PAGELATCH waits? They’re all on the same resource: PFS pages, in tempdb file ID 1. The format for PAGELATCH and PAGEIOLATCH waits is: [wait_type]:[database_name]:[file_id](page_type). Who is Active can decode the page types for GAM, SGAM, PFS, DCM (Differential Change Map), and BCM (Bulk Change Map) pages. For any other page, the page type will be an asterisk (*).

These waits are all on update (UP) latches, but it's also quite common to see exclusive (EX) latches when this problem occurs

Fixing this problem is amazingly simple: just create more tempdb files! When you create additional tempdb files—as long as they’re equally sized—SQL Server will automatically balance incoming requests across the multiple files. Since each file maintains its own PFS, GAM, and SGAM information, the contention will be eliminated.

How many files should you create? Well, that depends on your workload. I take a simple approach: start with a number of files equal to one quarter to one half of the number of logical schedulers. If the contention goes away (i.e. Who is Active is no longer showing PAGELATCH waits on these special pages in tempdb), stop there. Otherwise, keep increasing the number of files until the contention does go away.

Again: make sure to keep the files equally sized! SQL Server’s algorithm is based on a proportional fill model, which means that if one file is bigger than the others it will be chosen more often. This will cause it to grow more quickly, which will cause it to be chosen more often, which will cause it to grow more quickly, which will keep compounding until your disk is full and your performance is back in the red zone. Not a good situation to get yourself into. Grow the files bigger than you need them at create time, and monitor to make sure things don’t get out of hand.

Homework

Use Who is Active to check your production servers for tempdb contention! Tell me in the comments below whether you found any. It’s amazing how common this issue is, yet how simple the fix turns out to be. A very satisfying task for even the most harried of DBAs.

I've noticed that sp_whoisactive uses tempdb itself and when there's tempdb contention I've seen sp_whoisactive take a while. So I've learned to do quick and dirty ad hoc queries from dm_exec_requests...

But does this make sense? That sp_whoisactive can struggle to report on tempdb contention because it needs to use tempdb itself?

First off, which version are you using? v11.11 was specifically tuned to use less tempdb than prior versions.

Second, unfortunately I had to make a choice between [workspace] memory and tempdb [spills]. I chose the latter, as I consider tempdb to be a much cheaper and more expendable resource than memory. Admittedly, I tuned this using a pretty heavy knob and I didn't do a lot of analysis on whether or not I could balance things out a bit more. I hope to do that for a future version.

When u say equally sized. Do you mean exactly same size? E.g if i have a tempdb.mdf as 4045.50mb, to add another files is it ok to around off to 4045mb. Will server still choosing the first file since it is only 0.5mb bigger? Do i have to restart the mssql service after i have added the extra data files?

Leave a Comment

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.