Transaction Isolation Levels

One of the most eagerly anticipated new features of SQL Server 7.0 is its support for row-level locking. I fully intend to tell you all about the new locking mechanisms, hints about using the new locking, and the tools for troubleshooting locks. But I can't yet, because you need to know about something else first. Closely tied in with the modes and methods of locking is the transaction isolation level. To understand the new locking behavior, you need to understand the four transaction isolation levels in SQL Server 7.0: Uncommitted Read (also called "dirty read"), Committed Read, Repeatable Read, and Serializable.

Isolation Levels

The isolation level that your transaction runs in determines how sensitive your application is to changes other users' transactions make, and consequently, how long your transaction must hold locks to protect against these changes. The ANSI SQL standard defines four levels of transaction isolation. Although previous versions of SQL Server let you specify all four distinct levels of transaction isolation, there were only three different behaviors because SQL Server internally treated two of the syntactic specifications (i.e., Repeatable Read and Serializable) as synonymous.

You can change the level of isolation that a particular connection is operating in by using the SET TRANSACTION ISOLATION LEVEL command. Keep in mind that the SET command applies only to your current connection, and every time you make a new connection (or open a new window in the Query Analyzer), you'll be back in the default isolation level. I'll use each of the four isolation levels in the examples to follow.

To see how each level behaves, you can use the script in Listing 1, page 20, to create a table with a few rows in it. I'll refer back to this table in examples for each of the four isolation levels.

Uncommitted Read

Uncommitted Read, or dirty read, lets a transaction read any data currently on a data page, whether or not that data has been committed. For example, although another user might have a transaction in progress that has updated data, and that transaction is holding exclusive locks on the data, your transaction can read the data anyway, and possibly take further actions based on the values you read. The other user might then decide to roll back his or her transaction, so logically, those changes never occurred. Although this scenario isn't desirable, with Uncommitted Read you won't get stuck waiting for a lock, nor will your reads acquire share locks that might affect others.

Let's see how Uncommitted Read behaves. Use the SQL Server 7.0 Query Analyzer, and start two separate connections. Use the pubs database in each one. In the first connection, begin a transaction, but don't commit it:

BEGIN TRAN
UPDATE ISOLATION_TEST
SET col2 = 'New Value'

Now, use the second connection, and change your isolation level before trying to access the same table.

All the values in col1 are 0, even though the transaction in the first connection has not committed yet. In fact, the transaction might never commit. If you took some action based on the fact that all the values are 0, you could regret it if the changes turned out not to be permanent. Back in the first connection, roll back the transaction:

ROLLBACK TRAN

Now rerun the SELECT statement in the second connection to see that all the values are back to what they were before. If you're following along with these examples, make sure you close your connections after each one, so that all outstanding locks are released.

Committed Read

Committed Read is SQL Server's default isolation level. It ensures that an operation will never read data another application has changed but not yet committed. Because you can never read uncommitted data, if a transaction running with Committed Read isolation revisits data, that data might have changed, or new rows might appear that meet the criteria of the original query. Rows that appear in this way are called phantoms.

So Committed Read behavior has two aspects. To see the first aspect, you can run the above example, without setting the second connection to use isolation level Read Uncommitted. The second connect would then block on the SELECT statement; it can't read the changes the first connection has made but not yet committed (or rolled back). To see the second Committed Read behavior, close all the connections in the Query Analyzer from the previous example, and open two new connections using pubs again. In the first connection, run the following batch:

Notice that the update is successful, even though the first connection is still inside a transaction.

Go back to the first connection and run the same SELECT statement:

SELECT AVG(col1) from ISOLATION_TEST

The average value is now different. The default isolation level does not prevent another connection from changing data you have read. Because you are not guaranteed to see the same data if you rerun the SELECT within the transaction, the read operations are not guaranteed to be repeatable.

Repeatable Read

If you want the read operations to be repeatable, choose the third isolation level. The Repeatable Read isolation level adds to the properties of Committed Read by ensuring that if a transaction revisits data or if a query is reissued, the data will not have changed. In other words, issuing the same query twice within a transaction won't pick up any changes to data values that another user's transaction has made. No other user can modify the data that your transaction visits as long as you have not yet committed or rolled back your transaction.

To see Repeatable Read behavior, close all the connections, and open two new ones in pubs. Issue the same two queries as above, but this time, have the first connection SET ISOLATION LEVEL REPEATABLE READ.

The second connection will have to use a slightly different update statement, because the value of 50 for col1 no longer exists:

UPDATE ISOLATION_TEST
SET col1 = 5000 WHERE col1 = 500

This update will block when it tries to update the ISOLATION_TEST table. And the first connection will get the same result when it reissues its original SELECT. Preventing nonrepeatable reads is a desirable safeguard, but it comes at a price. The cost of this extra safeguard is that all the shared locks in a transaction must be held until the completion (COMMIT or ROLLBACK) of the transaction.

However, Repeatable Read isolation doesn't prevent all possible changes. It protects only the data that you have read. The following example shows you what this protection means. Close all connections, and open two new ones connecting to pubs. In the first connection, start a transaction in Repeatable Read isolation level and look for all rows that meet a certain condition.

The second time you execute the same statement, the new row appears. Because the row doesn't even exist the first time you run the SELECT statement, it isn't locked. This new row that appears is called a phantom. You can prevent phantoms with the fourth isolation level.

Serializable

The Serializable isolation level ensures that if a query is reissued, no data will have changed and no new rows will appear in the interim. In other words, you won't see phantoms if the same query is issued twice within a transaction. Rerun the example from the Repeatable Reads section, inserting a row with a col1 value of 35. But this time, set your isolation level to SERIALIZABLE. The second connection will block when you try to do the INSERT, and the first connection will read exactly the same rows each time.

You pay a price to prevent phantoms. In addition to locking all the data you have read, enforcing the Serializable isolation level requires that SQL Server also lock data that doesn't exist! The Serializable level gets its name from the fact that running multiple serializable transactions at the same time is the equivalent of running them one at a time—that is, serially—regardless of sequence.

Controlling the Isolation Level

SQL Server's default isolation level is Committed Read, but as you've seen, you can override this setting within your application. The most straightforward way is by using the SET command:

Previous versions of SQL Server treated Repeatable Read and Serializable as synonymous. I thought the difference was that Repeatable Reads prevented UPDATE operations, and Serializable prevented INSERTs and DELETEs. But the difference is in what data is locked. Repeatable Read locks only the data that has been read. With Serializable, SQL Server has to guarantee complete serializability, so it locks ranges of data.

Previous versions of SQL Server gave you no way to distinguish between these two levels, because the smallest unit of locking was an entire page. Locking the page containing the col1 values between 20, 30, and 40 prevented changes to those specific rows but also prevented inserts anywhere on the same page. With row-level locking, SQL Server 7.0 can lock only the accessed rows. SQL Server 7.0 also introduced a new kind of lock to lock the ranges of data between two rows, which I'll discuss in an upcoming article.