Featured Database Articles

Transactions 101

Introduction

SQL Server uses many mechanisms to ensure the integrity of its databases. One
of these mechanisms is the concept of a transaction.

A transaction is the smallest unit of work SQL Server will do. One
transaction can involve a single operation on one row of one table, or it can involve hundreds
of operations on multiple tables in multiple databases on multiple servers. The
developer usually controls when the transaction begins and ends. The server
makes sure that either the entire transaction runs to completion or instead rolls it
back to the point where it originally began. There's no middle ground.

All T-SQL statements must run within a transaction; there are no provisions
for doing otherwise.

A unit of work is a transaction if it meets the ACID test. ACID is an acronym
for atomic, consistent, isolated, and durable.

Atomic

The entire transaction has to run or the server will restore the data to the
point before the transaction started. For instance, if an update
statement modifies 100 rows and one fails because of a check constraint, every
row will be restored to the point where it was before the update started.

It's important that the developer both understands how the server works and does his
or her part. Potential problem areas are:

The developer explicitly begins a transaction but, due to a typing or
logic error, never ends it. The server will wait forever for the transaction
to complete, up to the point the server is restarted. Most likely, the user
will tire and break the connection or the DBA will break it for them. Either
way, the server will roll back the transaction and no work will be done.

The connection is set for implicit transactions, the developer issues a
statement that begins a transaction, and neither commits nor rolls the
transaction back. The outcome will be the same as for explicit transactions,
though perhaps harder to debug.

The developer mixes T-SQL and database API calls. If the developer uses
ADO's BeginTrans to begin a transaction, they shouldn't use a
T-SQL COMMIT TRANSACTION to commit it. BOL says this may
produce undefined results.

Consistent

Consistency means the data never appears to other transactions to be in a
transitory state. They either see the data as it was before the transaction
began or after it was committed. Consistency involves a tradeoff of
concurrency--the ability of multiple users to share data without impacting one
another. In practice, maintaining perfect consistency imposes an unacceptable
hardship on other users, and SQL Server won't enforce it without instructions
from the developer. Read-only databases combine performance and consistency, and as such
are a potential workaround.

Isolated

SQL Server's design permits multiple users to work with the same data
simultaneously. A lack of controls to isolate transactions from one another introduces three potential problems: dirty reads, nonrepeatable reads, and phantoms.

Dirty reads are reads of another transaction's uncommitted data
modifications. If the other transaction is rolled back, the first transaction
has effectively read data that never existed. SQL Server won't permit this
without instructions from the developer.

Nonrepeatable reads are instances where a transaction reads rows, another
transaction modifies or deletes those rows and commits its changes, and the
first transaction re-reads the rows. SQL Server will permit this if not told
otherwise. Affected applications should be designed to handle nonrepeatable
reads. Timestamp, datetime, and smalldatetime data types can be used, the latter
two in conjunction with triggers.

Phantoms are instances where a transaction reads rows satisfying a search
condition, another transaction inserts and commits rows that satisfy the search
condition, and the first transaction rereads using the same search condition and
gets a different set of rows. SQL Server permits this if not instructed
otherwise.

SQL Server follows the ANSI specifications for isolation and allows four
distinct levels. They are implemented using locks of increasing scope and
duration.

Transaction Isolation Level

Allow Dirty reads

Allow Nonrepeatable reads

Allow Phantoms

READ UNCOMMITTED

Yes

Yes

Yes

READ COMMITTED

No

Yes

Yes

REPEATABLE READ

No

No

7Yes/6No

SERIALIZABLE

No

No

No

The levels of interest are READ COMMITTED, which is SQL Server's default
level, and SERIALIZABLE, which can turn a multi-user system into a single-user
system for the duration of the transaction.

There are three ways to control the isolation level: T-SQL, database API
calls, and locking hints.

/* Either of the following is set prior to beginning
** the transaction and remains in effect until reset or the
** connection is closed.
*/
-- T-SQL
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
-- ADO
conC.IsolationLevel = adXactSerializable
/*
** Setting a locking hint gives you more
** control than a connection-level setting.
** It also overrides a connection-level
** setting. See Locking Hints in BOL for
** details.
*/
select *
from swynk_table with (HOLDLOCK)

Durable

Durability guarantees that a committed transaction will be permanently
written to the database, even if the system fails during the process. It also
guarantees that an uncommitted transaction will be completely rolled back. This
is the theory. In practice, there are a few things to note, and some gotchas.

SQL Server allows you to restore to a specific point in time, assuming you have
backups. You can restore the database to the state it was in before you
inadvertently committed the deletion of a million rows. There are several
products for SQL 6.5, and at
least one for 7.0 and 2000, that go beyond
SQL Server's recovery capability. Nothing is permanent.

Your hardware--particularly the disk subsystem--and file system settings
can break durability. Concerning hardware, keep in mind the old motorcycle
helmet advertisement--"If you have a $20 head, buy a $20 helmet." The
system, especially the disk controller(s), should be outfitted for a database
server. Write-caching controllers are problematic because they lie to SQL Server
that data has been written to the disks before it physically has. If the system
fails, you're relying on the controller to pick up exactly where it left off.
See Q234656.

Use NTFS for the file system but don't enable NTFS compression. Compression
isn't supported, kills performance, and may corrupt your data. See Q231347.