The SQL standard defines
four levels of transaction isolation in terms of three phenomena
that must be prevented between concurrent transactions. These
undesirable phenomena are:

dirty read

A transaction reads data written by a concurrent
uncommitted transaction.

nonrepeatable read

A transaction re-reads data it has previously read and
finds that data has been modified by another transaction
(that committed since the initial read).

phantom read

A transaction re-executes a query returning a set of
rows that satisfy a search condition and finds that the set
of rows satisfying the condition has changed due to another
recently-committed transaction.

The four transaction
isolation levels and the corresponding behaviors are described in
Table
12-1.

Table 12-1. SQL Transaction Isolation Levels

Isolation Level   | Dirty Read   | Nonrepeatable Read | Phantom Read
------------------+--------------+--------------------+--------------
Read uncommitted  | Possible     | Possible           | Possible
Read committed    | Not possible | Possible           | Possible
Repeatable read   | Not possible | Not possible       | Possible
Serializable      | Not possible | Not possible       | Not possible

In PostgreSQL, you can
request any of the four standard transaction isolation levels.
But internally, there are only two distinct isolation levels,
which correspond to the levels Read Committed and Serializable.
When you select the level Read Uncommitted you really get Read
Committed, and when you select Repeatable Read you really get
Serializable, so the actual isolation level may be stricter than
what you select. This is permitted by the SQL standard: the four
isolation levels only define which phenomena must not happen,
they do not define which phenomena must happen. The reason that
PostgreSQL only provides two
isolation levels is that this is the only sensible way to map the
standard isolation levels to the multiversion concurrency control
architecture. The behavior of the available isolation levels is
detailed in the following subsections.

To set the transaction isolation level of a transaction, use
the command SET
TRANSACTION.
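
For example, the isolation level can be set as the first action in a transaction block (the accounts table here is illustrative, not part of the original text):

```sql
BEGIN;
-- SET TRANSACTION must run before any query or data-modification
-- statement in the transaction
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM accounts WHERE acctnum = 12345;
COMMIT;
```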

Read Committed is the default
isolation level in PostgreSQL.
When a transaction runs on this isolation level, a SELECT query sees only data committed before the
query began; it never sees either uncommitted data or changes
committed during query execution by concurrent transactions.
(However, the SELECT does see the
effects of previous updates executed within its own
transaction, even though they are not yet committed.) In
effect, a SELECT query sees a snapshot
of the database as of the instant the query begins to
</gr-replace>
run. Notice that two successive SELECT
commands can see different data, even though they are within a
single transaction, if other transactions commit changes during
execution of the first SELECT.
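
A sketch of that situation, with a hypothetical accounts table and a second session shown as comments (the concrete balances are illustrative):

```sql
-- Session 1 (Read Committed):
BEGIN;
SELECT balance FROM accounts WHERE acctnum = 12345;   -- e.g. returns 100.00

-- Meanwhile, Session 2 commits a change:
--   UPDATE accounts SET balance = 200.00 WHERE acctnum = 12345;
--   COMMIT;

-- Back in Session 1, still inside the same transaction:
SELECT balance FROM accounts WHERE acctnum = 12345;   -- now returns 200.00
COMMIT;
```

Each statement gets a fresh snapshot, so the second SELECT sees Session 2's committed change.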

UPDATE, DELETE, SELECT FOR
UPDATE, and SELECT FOR SHARE
commands behave the same as SELECT in
terms of searching for target rows: they will only find target
rows that were committed as of the command start time. However,
such a target row may have already been updated (or deleted or
locked) by another concurrent transaction by the time it is
found. In this case, the would-be updater will wait for the
first updating transaction to commit or roll back (if it is
still in progress). If the first updater rolls back, then its
effects are negated and the second updater can proceed with
updating the originally found row. If the first updater
commits, the second updater will ignore the row if the first
updater deleted it, otherwise it will attempt to apply its
operation to the updated version of the row. The search
condition of the command (the WHERE
clause) is re-evaluated to see if the updated version of the
row still matches the search condition. If so, the second
updater proceeds with its operation, starting from the updated
version of the row. (In the case of SELECT
FOR UPDATE and SELECT FOR SHARE,
that means it is the updated version of the row that is locked
and returned to the client.)

Because of the above rule, it is possible for an updating
command to see an inconsistent snapshot: it can see the effects
of concurrent updating commands that affected the same rows it
is trying to update, but it does not see effects of those
commands on other rows in the database. This behavior makes
Read Committed mode unsuitable for commands that involve
complex search conditions. However, it is just right for
simpler cases. For example, consider updating bank balances
with transactions like

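Such a transaction might look like this (table and column names illustrative):

```sql
BEGIN;
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
UPDATE accounts SET balance = balance - 100.00 WHERE acctnum = 7534;
COMMIT;
```
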
If two such transactions concurrently try to change the
balance of account 12345, we clearly want the second
transaction to start from the updated version of the account's
row. Because each command is affecting only a predetermined
row, letting it see the updated version of the row does not
create any troublesome inconsistency.

Since in Read Committed mode each new command starts with a
new snapshot that includes all transactions committed up to
that instant, subsequent commands in the same transaction will
see the effects of the committed concurrent transaction in any
case. The point at issue here is whether or not within a
single command we see
an absolutely consistent view of the database.

The partial transaction isolation provided by Read Committed
mode is adequate for many applications, and this mode is fast
and simple to use. However, for applications that do complex
queries and updates, it may be necessary to guarantee a more
rigorously consistent view of the database than the Read
Committed mode provides.

The level Serializable provides the
strictest transaction isolation. This level emulates serial
transaction execution, as if transactions had been executed one
after another, serially, rather than concurrently. However,
applications using this level must be prepared to retry
transactions due to serialization failures.

When a transaction is on the serializable level, a
SELECT query sees only data committed
before the transaction began; it never sees either uncommitted
data or changes committed during transaction execution by
concurrent transactions. (However, the SELECT does see the effects of previous updates
executed within its own transaction, even though they are not
yet committed.) This is different from Read Committed in that
the SELECT sees a snapshot as of the
start of the transaction, not as of the start of the current
query within the transaction. Thus, successive SELECT commands within a single transaction
always see the same data.

UPDATE, DELETE, SELECT FOR
UPDATE, and SELECT FOR SHARE
commands behave the same as SELECT in
terms of searching for target rows: they will only find target
rows that were committed as of the transaction start time.
However, such a target row may have already been updated (or
deleted or locked) by another concurrent transaction by the
time it is found. In this case, the serializable transaction
will wait for the first updating transaction to commit or roll
back (if it is still in progress). If the first updater rolls
back, then its effects are negated and the serializable
transaction can proceed with updating the originally found row.
But if the first updater commits (and actually updated or
deleted the row, not just locked it) then the serializable
transaction will be rolled back with the message

ERROR: could not serialize access due to concurrent update

because a serializable transaction cannot modify or lock
rows changed by other transactions after the serializable
transaction began.
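
A sketch of a timeline that produces this error, again with a hypothetical accounts table and the concurrent session shown as comments:

```sql
-- Session 1 (Serializable):
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT balance FROM accounts WHERE acctnum = 12345;   -- snapshot established

-- Meanwhile, Session 2 runs and commits:
--   UPDATE accounts SET balance = balance - 50.00 WHERE acctnum = 12345;
--   COMMIT;

-- Back in Session 1: the target row was changed after our snapshot, so:
UPDATE accounts SET balance = balance + 100.00 WHERE acctnum = 12345;
-- ERROR: could not serialize access due to concurrent update
ROLLBACK;   -- the application should now retry the whole transaction
```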

When the application receives this error message, it should
abort the current transaction and then retry the whole
transaction from the beginning. The second time through, the
transaction sees the previously-committed change as part of its
initial view of the database, so there is no logical conflict
in using the new version of the row as the starting point for
the new transaction's update.

Note that only updating transactions may need to be retried;
read-only transactions will never have serialization
conflicts.

The Serializable mode provides a rigorous guarantee that
each transaction sees a wholly consistent view of the database.
However, the application has to be prepared to retry
transactions when concurrent updates make it impossible to
sustain the illusion of serial execution. Since the cost of
redoing complex transactions may be significant, this mode is
recommended only when updating transactions contain logic
sufficiently complex that they may give wrong answers in Read
Committed mode. Most commonly, Serializable mode is necessary
when a transaction executes several successive commands that
must see identical views of the database.

The intuitive meaning (and mathematical definition) of
"serializable" execution is that
any two successfully committed concurrent transactions will
appear to have executed strictly serially, one after the
other — although which one appeared to occur first may not be
predictable in advance. It is important to realize that
forbidding the undesirable behaviors listed in Table 12-1 is
not sufficient to guarantee true serializability, and in fact
PostgreSQL's Serializable
mode does not guarantee
serializable execution in this sense. As an example,
consider a table mytab, initially
containing

 class | value
-------+-------
     1 |    10
     1 |    20
     2 |   100
     2 |   200

Suppose that serializable transaction A computes

SELECT SUM(value) FROM mytab WHERE class = 1;

and then inserts the result (30) as the value in a new row with class = 2. Concurrently, serializable
transaction B computes

SELECT SUM(value) FROM mytab WHERE class = 2;

and obtains the result 300, which it inserts in a new row
with class = 1. Then both
transactions commit. None of the listed undesirable behaviors
have occurred, yet we have a result that could not have
occurred in either order serially. If A had executed before
B, B would have computed the sum 330, not 300, and similarly
the other order would have resulted in a different sum
computed by A.
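
Spelled out as SQL, the interleaving looks like this (transaction B shown as comments alongside A's script; both run at the Serializable level):

```sql
-- Transaction A (Serializable):
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT SUM(value) FROM mytab WHERE class = 1;   -- returns 30

-- Transaction B, concurrently (also Serializable):
--   SELECT SUM(value) FROM mytab WHERE class = 2;   -- returns 300
--   INSERT INTO mytab VALUES (1, 300);
--   COMMIT;

-- Back in A: neither transaction saw the other's insert
INSERT INTO mytab VALUES (2, 30);
COMMIT;
-- Both new rows now exist, an outcome no serial order of A and B
-- could have produced.
```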

To guarantee true mathematical serializability, it is
necessary for a database system to enforce predicate locking, which means that a
transaction cannot insert or modify a row that would have
matched the WHERE condition of a
query in another concurrent transaction. For example, once
transaction A has executed the query SELECT ... WHERE class = 1, a
predicate-locking system would forbid transaction B from
inserting any new row with class 1 until A has committed.
[1] Such a
locking system is complex to implement and extremely
expensive in execution, since every session must be aware of
the details of every query executed by every concurrent
transaction. And this large expense is mostly wasted, since
in practice most applications do not do the sorts of things
that could result in problems. (Certainly the example above
is rather contrived and unlikely to represent real software.)
Accordingly, PostgreSQL does
not implement predicate locking, and so far as we are aware
no other production DBMS does either.

In those cases where the possibility of nonserializable
execution is a real hazard, problems can be prevented by
appropriate use of explicit locking. Further discussion
appears in the following sections.

Comments

Sept. 28, 2006, 2:38 a.m.

I don't find the distinction between a nonrepeatable read and a phantom read at all clear. It seems to use different wording to say the same thing, at least in the parts of it...making the wording more the same might better highlight the key difference. For example "re-reads data it has previously read" and "re-executes a query returning a set of rows" are different wording for the same thing, no? The biggest difference in what they say that I can identify is "recently-committed" vs "committed since the initial read". Is that really the only difference? Recentness of transactions that might affect the read?

Nov. 1, 2006, 9:11 p.m.

Eric, this question probably should have been posted on the mailing list or asked in IRC. I can see other people getting confused by the same thing, so I'll answer here anyway. The difference between a 'nonrepeatable read' and a 'phantom read' might best be explained by the following example.

Suppose the query is: SELECT col1, col2 WHERE col2 <= 3;

First Read Results: A 1 B 2

Non-Repeatable Read: A 1 B 3. The value of col2 in the 'B' row has been changed.

Phantom Read 1: A 1 B 2 C 3. The value of col2 in the 'C' row has been changed so that it is now included in the resultset.

Phantom Read 2: A 1. The value of col2 in the 'B' row has been changed so that it is no longer included in the resultset.

The main difference between a phantom read and a nonrepeatable read is that a nonrepeatable read only differs by the values in the rows, but a phantom read will return different rows. They are very closely related and that probably explains why PostgreSQL handles both of them with the same level of locking.

Nov. 21, 2006, 2:59 p.m.

The difference between nonrepeatable reads and phantom reads is that already-read data is not modified/deleted by phantom reads.

Just wanted to point out that in the previous comment by Mark Drago, the case 'Phantom Read 2' is actually a nonrepeatable read. In a nonrepeatable read, previously read data is modified when re-read. Row B has been modified and so the read is not repeatable.