The SQL standard defines
four levels of transaction isolation. The most strict is
Serializable, which is defined by the standard in a paragraph
which says that any concurrent execution of a set of Serializable
transactions is guaranteed to produce the same effect as running
them one at a time in some order. The other three levels are
defined in terms of phenomena, resulting from interaction between
concurrent transactions, which must not occur at each level. The
standard notes that due to the definition of Serializable, none
of these phenomena are possible at that level. (This is hardly
surprising -- if the effect of the transactions must be
consistent with having been run one at a time, how could you see
any phenomena caused by interactions?)

The phenomena which are prohibited at various levels are:

dirty read

A transaction reads data written by a concurrent
uncommitted transaction.

nonrepeatable read

A transaction re-reads data it has previously read and
finds that data has been modified by another transaction
(that committed since the initial read).

phantom read

A transaction re-executes a query returning a set of
rows that satisfy a search condition and finds that the set
of rows satisfying the condition has changed due to another
recently-committed transaction.

The four transaction isolation levels and the corresponding
behaviors are described in Table 13-1.

Table 13-1. Standard SQL Transaction Isolation Levels

Isolation Level

Dirty Read

Nonrepeatable Read

Phantom Read

Read uncommitted

Possible

Possible

Possible

Read committed

Not possible

Possible

Possible

Repeatable read

Not possible

Not possible

Possible

Serializable

Not possible

Not possible

Not possible

In PostgreSQL, you can
request any of the four standard transaction isolation levels.
But internally, there are only three distinct isolation levels,
which correspond to the levels Read Committed, Repeatable Read,
and Serializable. When you select the level Read Uncommitted you
really get Read Committed, and phantom reads are not possible in
the PostgreSQL implementation of
Repeatable Read, so the actual isolation level might be stricter
than what you select. This is permitted by the SQL standard: the
four isolation levels only define which phenomena must not
happen, they do not define which phenomena must happen. The
reason that PostgreSQL only
provides three isolation levels is that this is the only sensible
way to map the standard isolation levels to the multiversion
concurrency control architecture. The behavior of the available
isolation levels is detailed in the following subsections.

To set the transaction isolation level of a transaction, use
the command SET
TRANSACTION.

Important: Some PostgreSQL data types and functions have
special rules regarding transactional behavior. In
particular, changes made to a sequence (and therefore the
counter of a column declared using serial) are immediately visible to all other
transactions and are not rolled back if the transaction that
made the changes aborts. See Section 9.16 and Section
8.1.4.

Read Committed is the default
isolation level in PostgreSQL.
When a transaction uses this isolation level, a SELECT query (without a FOR
UPDATE/SHARE clause) sees only data committed before the
query began; it never sees either uncommitted data or changes
committed during query execution by concurrent transactions. In
effect, a SELECT query sees a snapshot
of the database as of the instant the query begins to run.
However, SELECT does see the effects
of previous updates executed within its own transaction, even
though they are not yet committed. Also note that two
successive SELECT commands can see
different data, even though they are within a single
transaction, if other transactions commit changes during
execution of the first SELECT.

UPDATE, DELETE, SELECT FOR
UPDATE, and SELECT FOR SHARE
commands behave the same as SELECT in
terms of searching for target rows: they will only find target
rows that were committed as of the command start time. However,
such a target row might have already been updated (or deleted
or locked) by another concurrent transaction by the time it is
found. In this case, the would-be updater will wait for the
first updating transaction to commit or roll back (if it is
still in progress). If the first updater rolls back, then its
effects are negated and the second updater can proceed with
updating the originally found row. If the first updater
commits, the second updater will ignore the row if the first
updater deleted it, otherwise it will attempt to apply its
operation to the updated version of the row. The search
condition of the command (the WHERE
clause) is re-evaluated to see if the updated version of the
row still matches the search condition. If so, the second
updater proceeds with its operation using the updated version
of the row. In the case of SELECT FOR
UPDATE and SELECT FOR SHARE, this
means it is the updated version of the row that is locked and
returned to the client.

Because of the above rule, it is possible for an updating
command to see an inconsistent snapshot: it can see the effects
of concurrent updating commands on the same rows it is trying
to update, but it does not see effects of those commands on
other rows in the database. This behavior makes Read Committed
mode unsuitable for commands that involve complex search
conditions; however, it is just right for simpler cases. For
example, consider updating bank balances with transactions
like:

If two such transactions concurrently try to change the
balance of account 12345, we clearly want the second
transaction to start with the updated version of the account's
row. Because each command is affecting only a predetermined
row, letting it see the updated version of the row does not
create any troublesome inconsistency.

More complex usage can produce undesirable results in Read
Committed mode. For example, consider a DELETE command operating on data that is being
both added and removed from its restriction criteria by another
command, e.g., assume website is a
two-row table with website.hits
equaling 9 and 10:

The DELETE will have no effect even
though there is a website.hits = 10
row before and after the UPDATE. This
occurs because the pre-update row value 9 is skipped, and when the UPDATE completes and DELETE obtains a lock, the new row value is no
longer 10 but 11, which no longer matches the criteria.

Because Read Committed mode starts each command with a new
snapshot that includes all transactions committed up to that
instant, subsequent commands in the same transaction will see
the effects of the committed concurrent transaction in any
case. The point at issue above is whether or not a single command sees an absolutely
consistent view of the database.

The partial transaction isolation provided by Read Committed
mode is adequate for many applications, and this mode is fast
and simple to use; however, it is not sufficient for all cases.
Applications that do complex queries and updates might require
a more rigorously consistent view of the database than Read
Committed mode provides.

The Repeatable Read isolation level
only sees data committed before the transaction began; it never
sees either uncommitted data or changes committed during
transaction execution by concurrent transactions. (However, the
query does see the effects of previous updates executed within
its own transaction, even though they are not yet committed.)
This is a stronger guarantee than is required by the
SQL standard for this
isolation level, and prevents all of the phenomena described in
Table
13-1. As mentioned above, this is specifically allowed by
the standard, which only describes the minimum protections each
isolation level must provide.

This level is different from Read Committed in that a query
in a repeatable read transaction sees a snapshot as of the
start of the transaction, not as of the start
of the current query within the transaction. Thus, successive
SELECT commands within a single transaction see the same
data, i.e., they do not see changes made by other transactions
that committed after their own transaction started.

Applications using this level must be prepared to retry
transactions due to serialization failures.

UPDATE, DELETE, SELECT FOR
UPDATE, and SELECT FOR SHARE
commands behave the same as SELECT in
terms of searching for target rows: they will only find target
rows that were committed as of the transaction start time.
However, such a target row might have already been updated (or
deleted or locked) by another concurrent transaction by the
time it is found. In this case, the repeatable read transaction
will wait for the first updating transaction to commit or roll
back (if it is still in progress). If the first updater rolls
back, then its effects are negated and the repeatable read
transaction can proceed with updating the originally found row.
But if the first updater commits (and actually updated or
deleted the row, not just locked it) then the repeatable read
transaction will be rolled back with the message

ERROR: could not serialize access due to concurrent update

because a repeatable read transaction cannot modify or lock
rows changed by other transactions after the repeatable read
transaction began.

When an application receives this error message, it should
abort the current transaction and retry the whole transaction
from the beginning. The second time through, the transaction
will see the previously-committed change as part of its initial
view of the database, so there is no logical conflict in using
the new version of the row as the starting point for the new
transaction's update.

Note that only updating transactions might need to be
retried; read-only transactions will never have serialization
conflicts.

The Repeatable Read mode provides a rigorous guarantee that
each transaction sees a completely stable view of the database.
However, this view will not necessarily always be consistent
with some serial (one at a time) execution of concurrent
transactions of the same level. For example, even a read only
transaction at this level may see a control record updated to
show that a batch has been completed but not see one of the detail records
which is logically part of the batch because it read an earlier
revision of the control record. Attempts to enforce business
rules by transactions running at this isolation level are not
likely to work correctly without careful use of explicit locks
to block conflicting transactions.

Note: Prior to PostgreSQL version 9.1, a request for
the Serializable transaction isolation level provided
exactly the same behavior described here. To retain the
legacy Serializable behavior, Repeatable Read should now be
requested.

The Serializable isolation level
provides the strictest transaction isolation. This level
emulates serial transaction execution for all committed
transactions; as if transactions had been executed one after
another, serially, rather than concurrently. However, like the
Repeatable Read level, applications using this level must be
prepared to retry transactions due to serialization failures.
In fact, this isolation level works exactly the same as
Repeatable Read except that it monitors for conditions which
could make execution of a concurrent set of serializable
transactions behave in a manner inconsistent with all possible
serial (one at a time) executions of those transactions. This
monitoring does not introduce any blocking beyond that present
in repeatable read, but there is some overhead to the
monitoring, and detection of the conditions which could cause a
serialization anomaly will trigger a
serialization failure.

As an example, consider a table mytab, initially containing:

class | value
-------+-------
1 | 10
1 | 20
2 | 100
2 | 200

Suppose that serializable transaction A computes:

SELECT SUM(value) FROM mytab WHERE class = 1;

and then inserts the result (30) as the value in a new row with class= 2.
Concurrently, serializable transaction B computes:

SELECT SUM(value) FROM mytab WHERE class = 2;

and obtains the result 300, which it inserts in a new row
with class=
1. Then both transactions try to commit. If either
transaction were running at the Repeatable Read isolation
level, both would be allowed to commit; but since there is no
serial order of execution consistent with the result, using
Serializable transactions will allow one transaction to commit
and will roll the other back with this message:

ERROR: could not serialize access due to read/write dependencies among transactions

This is because if A had executed before B, B would have
computed the sum 330, not 300, and similarly the other order
would have resulted in a different sum computed by A.

When relying on Serializable transactions to prevent
anomalies, it is important that any data read from a permanent
user table not be considered valid until the transaction which
read it has successfully committed. This is true even for
read-only transactions, except that data read within a
deferrable read-only transaction is
known to be valid as soon as it is read, because such a
transaction waits until it can acquire a snapshot guaranteed to
be free from such problems before starting to read any data. In
all other cases applications must not depend on results read
during a transaction that later aborted; instead, they should
retry the transaction until it succeeds.

To guarantee true serializability PostgreSQL uses predicate locking, which means that it keeps
locks which allow it to determine when a write would have had
an impact on the result of a previous read from a concurrent
transaction, had it run first. In PostgreSQL these locks do not cause any
blocking and therefore can not play any part in causing a
deadlock. They are used to identify and flag dependencies among
concurrent serializable transactions which in certain
combinations can lead to serialization anomalies. In contrast,
a Read Committed or Repeatable Read transaction which wants to
ensure data consistency may need to take out a lock on an
entire table, which could block other users attempting to use
that table, or it may use SELECT FOR
UPDATE or SELECT FOR SHARE which
not only can block other transactions but cause disk
access.

Predicate locks in PostgreSQL, like in most other database
systems, are based on data actually accessed by a transaction.
These will show up in the pg_locks
system view with a mode of SIReadLock. The particular locks acquired during
execution of a query will depend on the plan used by the query,
and multiple finer-grained locks (e.g., tuple locks) may be
combined into fewer coarser-grained locks (e.g., page locks)
during the course of the transaction to prevent exhaustion of
the memory used to track the locks. A READ
ONLY transaction may be able to release its SIRead locks
before completion, if it detects that no conflicts can still
occur which could lead to a serialization anomaly. In fact,
READ ONLY transactions will often be
able to establish that fact at startup and avoid taking any
predicate locks. If you explicitly request a SERIALIZABLE READ ONLY DEFERRABLE transaction,
it will block until it can establish this fact. (This is the
only case where
Serializable transactions block but Repeatable Read
transactions don't.) On the other hand, SIRead locks often need
to be kept past transaction commit, until overlapping read
write transactions complete.

Consistent use of Serializable transactions can simplify
development. The guarantee that any set of concurrent
serializable transactions will have the same effect as if they
were run one at a time means that if you can demonstrate that a
single transaction, as written, will do the right thing when
run by itself, you can have confidence that it will do the
right thing in any mix of serializable transactions, even
without any information about what those other transactions
might do. It is important that an environment which uses this
technique have a generalized way of handling serialization
failures (which always return with a SQLSTATE value of
'40001'), because it will be very hard to predict exactly which
transactions might contribute to the read/write dependencies
and need to be rolled back to prevent serialization anomalies.
The monitoring of read/write dependencies has a cost, as does
the restart of transactions which are terminated with a
serialization failure, but balanced against the cost and
blocking involved in use of explicit locks and SELECT FOR UPDATE or SELECT
FOR SHARE, Serializable transactions are the best
performance choice for some environments.

For optimal performance when relying on Serializable
transactions for concurrency control, these issues should be
considered:

Declare transactions as READ
ONLY when possible.

Control the number of active connections, using a
connection pool if needed. This is always an important
performance consideration, but it can be particularly
important in a busy system using Serializable
transactions.

Don't put more into a single transaction than needed for
integrity purposes.

Eliminate explicit locks, SELECT FOR
UPDATE, and SELECT FOR SHARE
where no longer needed due to the protections automatically
provided by Serializable transactions.

When the system is forced to combine multiple page-level
predicate locks into a single relation-level predicate lock
because the predicate lock table is short of memory, an
increase in the rate of serialization failures may occur.
You can avoid this by increasing
max_pred_locks_per_transaction.

A sequential scan will always necessitate a
relation-level predicate lock. This can result in an
increased rate of serialization failures. It may be helpful
to encourage the use of index scans by reducing random_page_cost
and/or increasing cpu_tuple_cost.
Be sure to weigh any decrease in transaction rollbacks and
restarts against any overall change in query execution
time.

Warning

Support for the Serializable transaction isolation
level has not yet been added to Hot Standby replication
targets (described in Section 25.5). The strictest
isolation level currently supported in hot standby mode
is Repeatable Read. While performing all permanent
database writes within Serializable transactions on the
master will ensure that all standbys will eventually
reach a consistent state, a Repeatable Read transaction
run on the standby can sometimes see a transient state
which is inconsistent with any serial execution of
serializable transactions on the master.