SQL Server Locking and You!

Did you know that SQL Server’s locking has a name? It’s called two-phase locking. If we’re really getting specific about it, SQL Server uses what’s called strong strict two-phase locking or SS2PL. We’ll get there in a few minutes, right now we’re going to take a look at what makes up two-phase locking.

But First, Some History

The earliest references to two-phase locking (2PL) that I can find is in Bernstein and Goodman’s 1981 paper Concurrency Control in Distributed Database Systems. The authors examine multiple 2PL techniques for synchronizing transactions, alternatives using timestamp ordering, integrated concurrency control methods combining 2PL and timestamp ordering, before mentioning some other also-rans in an appendix. (This may also be the first use of the now tired example involving bank balances and transactions.)

Even though his paper focuses on distributed databases, it’s still valuable because it sets up a common vocabulary for things to come. There’s a lot of theory in here. It’s interesting, but it’s still mathematical theory and most people glaze over when they see that sort of thing.

The Basics of Two-Phase Locking

2PL works by being explicit about who is doing what to whom. Or, in clearer terms:

(1) Different transactions cannot simultaneously own conflicting locks; and (2) once a transaction surrenders ownership of a lock, it may never obtain additional locks

Locks conflict if they’re not compatible with each other. That is to say that if both locks are on the same thing and at least one lock is a write lock, that’s a conflict. (Astute readers will notice that readers won’t block readers.) Anyone who has used SQL Server for a while will be familiar with what happens when we try to acquire a conflicting lock: we wait. Sometimes, we’ll wait for a good long while and a lock will eventually be released. Sometimes, we would end up waiting forever (a deadlock) if the lock manager didn’t step in and kill off one of the processes.

Why Is It Called Two-Phase Locking?

This process is called two-phase locking because there are two distinct phases. The two rules above hint at them, but in effect a transaction can either be issuing locks or releasing locks, it cannot be in stasis.

In reality, this works more like the following:

A statement is issued by an application.

SQL Server compiles the statement and determines the types of locks that are needed to most efficiently satisfy the query.

Once all locks are acquired, the transaction is in a ready state.

SQL Server will begin operations and release locks as appropriate.

If you’re really playing along at home, at some point you’ll figure out that lock acquisition and release varies by isolation level and results in the various phenomena that you see in each of the isolation levels. If you noticed that on your own, give yourself a gold star. If you didn’t, you’re normal.

Deadlocks

Deadlocks are, sadly, a byproduct of a 2PL mechanism. Most of the literature talks about things like transaction graphs (or waits-for graphs) and edges. The Great Triumvirate illustrates this perfectly.

Deadlocks, circa 1836

Daniel Webster is grooming his eyebrows in a fashion that almost became known as ‘the Webster’. He’s impatiently waiting for Henry Clay to finish with the funny pages, but he won’t release his comb until he has the funny pages. Henry Clay is reading what passes for the funny pages in 1836 (hint: it’s the New Yorker). Henry is well pleased with himself, but he’s waiting on John C Calhoun to relinquish the volumizer before he will give up the funny pages. John C Calhoun is volumizing his hair to lofty heights but he really wants Daniel Webster’s eyebrow comb. Everyone is waiting on something from everyone else, but nobody will give up first. This is a deadlock. Okay, maybe that’s not really a deadlock, but it’s better than a waits-for graph. A combination of techniques can be used to determine which transactions will be killed off. SQL Server will typically use the least expensive transaction (in terms of optimizer cost). SQL Server avoids some problems, too, by not attempting to retry transactions.

A Waits-For (or Deadlock) Graph

Strict Two-Phase Locking

So far, we’ve only talked about 2PL, but I said SQL Server is SS2PL. Between the two is Strict Two-Phase Locking or S2PL. S2PL is like 2PL but there are some more rules.

To be considered S2PL, a transaction has to follow the rules of 2PL (sound like normalization rules?). In addition, a transaction also has to release write locks after the transaction has ended and either been rolled back or committed. Interestingly, nothing is directly said about read locks in S2PL. However, read locks can be released as they are no longer needed during the transaction.

Strong Strict Two-Phase Locking

SS2PL (called S2PL in Concurrency Control and Recovery in Database Systems) requires that the locks are only released after the transaction is finished and has been committed or rolled back. SS2PL provides serializability – database transactions appear as if they are atomic and occurring in complete isolation from one another. Serializable transactions are interesting because for a database to truly be serializable, it should be possible to process transactions in any order, s long as the effective is the same as that of some serial order (not any, just some).

Why the Devil Should I Care?

Locks are the primary way that SQL Server manages concurrency. This is a limitation of the database. New CPUs, a SAN, more disks, solid state hardware, and more RAM will not remove locking, blocking, and deadlocks from the database. New CPUs, a SAN, more disks, solid state hardware, and more RAM will make locking, blocking, and deadlocks happen faster. They may happen so fast, that you don’t really notice the problem until it’s growing out of control.

Combine a healthy knowledge of how locking operates with a working knowledge of isolation levels and some allegedly insurmountable application problems can be resolved through simple changes in the data layer.

When I’m not working with databases, you’ll find me at a food truck in Portland, Oregon, or at conferences such as DevLink, Stir Trek, and OSCON. My sessions have been highly rated and I pride myself on their quality.

Brent didn’t write this one, I did. It’s probably not clear in the RSS feed, but we’ll get that fixed.

I am a bit of a math nut, but I wouldn’t call myself a math geek. I’m just interested in the theory behind making computers work.

With two phase update, if you’re referring to the topics in paper on two-phase update, I wouldn’t consider that to be the same thing as 2PL. Two-phase update is a way to get around a lot of the concurrency problems that are a part of 2PL. In fact, Section 2.1 of that research paper compares two-phase update to locking mechanisms and some of the other mechanisms listed in the Bernstein and Goodman paper.

Jeremiah – just to record it along with article, you nicely summed up how adding hardware can buy you time with:
“hardware alone won’t resolve locking issues, it just makes them happen faster (or in memory). But hardware can also buy you enough time so that you can effectively research the issues instead of pressing every button that SQL Server has to offer in the hopes that it will can solve the problem.”