SQL Server 2008: Lock Escalation, INSERTs, and a Potential Bug

Lock escalation is a funny thing. I've found myself on numerous occasions waging war against its concurrency-sapping existence, and rarely have I found myself wishing that it would work more aggressively. But there is a time and place for everything, and yesterday I discovered that a major change has occurred with regard to lock escalation in SQL Server 2008.

The idea behind lock escalation is simple: Lock management is not free. According to BOL, each lock requires around 100 bytes, and that can really add up. So after a certain number of smaller locks have been taken, it sometimes makes sense to drop them in favor of one bigger lock. This can amount to huge savings, especially when dealing with millions of rows.

Lock escalation has been around for several SQL Server versions--as long as I've been using the product, as far as I'm aware (though I didn't even know of its existence for the first few years)--and we expect things to more or less work as they always have. So when things suddenly change, I tend to get annoyed. And how do we discover when something as internal as lock escalation changes? You would either have to be a complete internals geek--someone like Kalen--or notice a problem. When it comes to lock escalation, I fall into the latter category.

Here's what happened: I was doing a large data load of around 100 million rows, and checked the system at around the halfway point, using my Who is Active procedure. What I saw was that the load was humming along, but the monitoring software had 5 active sessions open, the longest of which had been running for 25 minutes. This was an immediate red flag--you never want your monitoring software taking 25 minutes to do its queries. Kind of defeats the point. These sessions were all hitting the sys.dm_tran_locks DMV, which unfortunately tends to not scale too well when a lot of locks get created, so I did a COUNT(*)--which took almost five minutes to run--and discovered that I was dealing with 58 million open locks.

A quick script and a few tests later and I figured out what the problem was: The lock escalation algorithm for INSERTs no longer works the same way in 2008 as it did in 2005. Unfortunately, the BOL entry is quite vague and a bit on the confusing side so I'm not certain whether this change was made on purpose. Based on my tests escalation still behaves for SELECTs identically--the issue seems to only be with INSERTs (and perhaps other DML operations--I haven't tested yet).

To see this on your end, try the following script in both SQL Server 2005 and SQL Server 2008:

USE tempdbGO

CREATE TABLE x ( i INT NOT NULL PRIMARY KEY)GO

BEGIN TRAN INSERT x SELECT TOP (40000) ROW_NUMBER() OVER ( ORDER BY (SELECT NULL) ) AS r FROM master..spt_values a, master..spt_values b ORDER BY r

In SQL Server 2005, the final query will return either 1 or 2 (why either? I'm actually not certain; the second lock, when it is taken, references an object ID that doesn't seem to exist--a mystery for another day). In SQL Server 2008, on the other hand, the final query will return 40,066 or 40,067. 40,000 KEY locks, 65 PAGE locks, and either 1 or 2 OBJECT locks (again, the mystery object shows up).

So is this a major problem? If you're doing large data loads, I believe that it is. Again, according to BOL each lock takes up 100 bytes, so in my case when I looked I had 5.8 GB of my precious RAM taken up by lock data. I would have much preferred an escalation, as would have occurred in previous versions of SQL Server. There is a workaround: use a TABLOCK hint. But if you're dealing with third-party databases this may not be an option. And even SQL Server's own features, such as Change Data Capture, will not work with this fix.

I've filed a bug on Connect, and I hope you'll vote for it. I'm not certain that this is a bug, per se, or something done "by design", but I feel that the behavior is less than ideal and that the algorithms should be rolled back to their pre-SQL Server 2008 behavior.

I found that even when inserting in the order of the clustered index with concurrentinserts, i had to use a real batch size. If you are inserting into an index concurrently, i think you have to make sure that the data sets are non overlapping.

Unfortunately, I can't do minimal logging in my scenario, because we have CDC enabled on the table (whether or not this is a good use for CDC is a whole other topic). And while I can stop the key locks on the main insert, I can't control what CDC does; it will kick in and cause the same problem when it does its own work.

Leave a Comment

About Adam Machanic

Adam Machanic is a Boston-based SQL Server developer, writer, and speaker. He focuses on large-scale data warehouse performance and development, and is author of the award-winning SQL Server monitoring stored procedure, sp_WhoIsActive. Adam has written for numerous web sites and magazines, including SQLblog, Simple Talk, Search SQL Server, SQL Server Professional, CoDe, and VSJ. He has also contributed to several books on SQL Server, including "SQL Server 2008 Internals" (Microsoft Press, 2009) and "Expert SQL Server 2005 Development" (Apress, 2007). Adam regularly speaks at conferences and training events on a variety of SQL Server topics. He is a Microsoft Most Valuable Professional (MVP) for SQL Server, a Microsoft Certified IT Professional (MCITP), and an alumnus of the INETA North American Speakers Bureau.