1.2 Concurrency control implementation

This section covers the basic implementation techniques based on locks,
namely (strict) two-phase locking, describes the properties satisfied by
the (strict) 2PL protocol, and discusses SQL support for serializability.

Locking protocols

A lock is a value associated with each database object.

The basic locking protocol is two-phase locking (2PL): If a
transaction wants to read (resp., write) an object, the scheduler first
requests a shared/read (resp., exclusive/write) lock for the object from a
lock manager. If a requested lock is not granted, the transaction is
suspended until a lock is granted by the lock manager. All locks obtained
by a transaction must be released before the transaction completes. In
2PL, all locks must be obtained before any locks are released.

Shared locks allow other transactions to read but not write the object;
exclusive locks prevent other transactions from even reading the object.

Now, we must remove T1's lock on B, restore T1's initial
value of A, rerun T2 (because it read an incorrect value of A)
despite having committed, and we may need to rerun other transactions
which read A after T2 wrote it.

The locking protocol used in practice is strict two-phase locking
(strict 2PL): If a transaction wants to read (resp., write) an object,
it first requests a shared (resp., exclusive) lock for the object. All
locks obtained by a transaction must be held until the transaction
commits or aborts. (The exact definition of when shared and exclusive
locks may be released is subtle.)

Strict 2PL ensures schedules are serialisable and strict (doh).

Dynamic databases

When the set of database objects can change, the situation is
more complex.

Suppose transaction T1 reads the Edinburgh accounts (Ai) in a bank and
compares their sum with the stored total (T). Suppose transaction T2
inserts a new Edinburgh account (A) and updates the total.

R1(A1) R1(A2) R1(A3) I2(A) R2(T) W2(T) R1(T)

In this schedule, the total read by T1 is not the sum of the accounts it
read, despite the fact that this is a 2PL schedule (even a strict
2PL schedule). This is called the phantom problem. To some
extent it arises because we ignored the ``end of sequence'' item which
transactions T1 and T2 both accessed. One solution is to use locks on the
indexto the accounts in the bank, but this seems too
implementation-oriented. Another solution is to use
predicate locking, locking the set of all Edinburgh accounts, but
this is expensive.

Other operations

Operations such as increment and decrement cause
additional problems.

Deadlocks

Deadlocks may occur even with strict 2PL. (Conservative 2PL Ð collect all
required locks at once - can avoid this but reduces concurrency.) The
DMBS needs to periodically check whether deadlocks have occurred and
somehow break them.

One way to identify whether a deadlock has occurred is to maintain a
waits-for graph. This graph has a node for each active
transaction (in a schedule) and an arc from Ti to Tj iff T1 is waiting for
Tj to release a lock. The lock managemer adds edges to this graph when it
suspends transactions and removes edges when it grants lock requests. A
schedule is deadlocked if the waits-for graph contains a cycle. The
deadlock can be broken by aborting one of the transactions in the schedule
(and later redoing it).

Transaction support in SQL

Transactions start with a SELECT, UPDATE, CREATE TABLE, and other similar
statements. Transactions end with a COMMIT or ROLLBACK (abort) statement.

SQL allows programmers to especify three characteristics of a transaction:
access mode, diagnostics size and isolation level. Access mode is READ
ONLY, or READ WRITE. Diagnostics size only affects error reporting.
Isoliation level is the important one which affects the extent to which
transactions may be executed concurrently. The four possible isolation
levels are (R&G, Fig. 16.10):

Level

Dirty read

Unrepeatable
read

Phantom

READ UNCOMMITTED

Maybe

Maybe

Maybe

READ COMMITTED

No

Maybe

Maybe

REPEATABLE READ

No

No

Maybe

SERIALIZABLE

No

No

No

The SERIALIZABLE level is the safest to use; it guarantees strict,
serialisable
schedules using predicate locking to avoid the phantom problem.

The isolation level and access mode can be set using the SET TRANSACTION
command, e.g.,