A Batch of Commit Batching

A database commit can be the most expensive single operation that
its users have to wait for. Recent trends in the database industry
have shown that some applications are willing to accept durability loss
when it must be sacrificed to reach performance goals. And an inevitable
downside of more durable approaches like Synchronous Replication is
their impact on server commit speed.

Some of the fundamental limitations here are physical ones: disk rotation,
network performance, and the speed of light. Recent performance
improvements for PostgreSQL 9.2 aim at getting closer to the theoretical best
possible behavior here in every situation. It's more important than ever
to tell when the limit you're hitting is a physical one, and when it's
something you can address with a software change. Controlling commit
batch size and the number of concurrent clients is getting even more
important as PostgreSQL is deployed onto cloud and other virtual hardware
environments.
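
As a rough illustration of a physical limit, the sketch below estimates
the commit ceiling imposed by disk rotation alone. It assumes a single
client, a drive with no write cache, and one full platter rotation per
commit. The function name and the list of drive speeds are illustrative
and do not come from PostgreSQL itself.

```python
def max_commits_per_second(rpm: float) -> float:
    """Upper bound on commits/sec for one client, limited by rotation.

    Assumption: each commit has to wait for one full platter rotation
    before the next write can land, so the ceiling is simply the number
    of rotations per second.
    """
    if rpm <= 0:
        raise ValueError("rpm must be positive")
    return rpm / 60.0


# Common drive speeds, chosen only as examples.
for rpm in (5400, 7200, 10000, 15000):
    ceiling = max_commits_per_second(rpm)
    print(f"{rpm:>6} RPM -> about {ceiling:.0f} commits/sec")
```

If a single client is already committing at close to this rate, the
limit is probably a physical one, and no software change on its own
will move it.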

Four of the fundamental factors going into how expensive a commit is are
atomicity, consistency, isolation, and durability, collectively referred
to as ACID. PostgreSQL has always respected the durability aspects of ACID
compliance. Extending that to reach onto multiple servers can significantly
expand the suitability of the database for business critical applications.
It will cost you though. The question isn't just how much durability
you want; it's how much durability can you afford?

The innovative design used in PostgreSQL doesn't force you to make this
sort of decision at the database level. Every individual commit can
specify its durability requirements at any time, even in the middle
of a transaction. Being able to classify your need at such a fine level
allows PostgreSQL an unprecedented range of options in this area.
Mission critical data that needs multi-node synchronous commit can
coexist with high volume/best effort data, with each transaction
fine-tuned to its position in the reliability vs. speed trade-off
spectrum.
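
One way to picture per-transaction durability is the sketch below, which
builds the SQL a client might send. It assumes the synchronous_commit
setting is the control being described, changed with SET LOCAL so the
choice lasts only until the end of the current transaction. The helper
name and the table names (payments, page_hits) are hypothetical, and the
list of allowed values reflects PostgreSQL 9.1 as best I understand it.

```python
# synchronous_commit values assumed to be available in PostgreSQL 9.1.
ALLOWED_LEVELS = ("on", "local", "off")


def transaction_statements(body, durability="on"):
    """Wrap SQL statements in a transaction with its own durability level.

    SET LOCAL only affects the current transaction, so other sessions
    and later transactions keep whatever setting they already had.
    """
    if durability not in ALLOWED_LEVELS:
        raise ValueError(f"unknown durability level: {durability!r}")
    return [
        "BEGIN",
        f"SET LOCAL synchronous_commit TO {durability}",
        *body,
        "COMMIT",
    ]


# Mission critical row: wait for the synchronous standby (if one is set up).
critical = transaction_statements(
    ["INSERT INTO payments (amount) VALUES (100)"], durability="on"
)

# Best effort row: return without waiting for the WAL flush.
best_effort = transaction_statements(
    ["INSERT INTO page_hits (url) VALUES ('/index')"], durability="off"
)

for statement in critical + best_effort:
    print(statement + ";")
```

Both transactions can run against the same database at the same time,
which is the coexistence described above.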

There's a second factor to consider too: client count. The Synchronous
Replication implementation used for PostgreSQL 9.1 makes it possible
to increase aggregate commit throughput by scaling up the
number of concurrent clients. Improvements in progress for PostgreSQL
9.2 take that basic idea and apply it more aggressively to local
commits as well. Per-client commit behavior is becoming an
increasingly important bottleneck to understand and design
against.
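
The effect of client count can be sketched with a deliberately
idealized model. It assumes every commit wait (a local flush or a
standby round trip) takes the same fixed time, and that with batching
every waiting client is acknowledged by the same wait. Real servers
will fall short of this. The function and parameter names are
illustrative only.

```python
def aggregate_commits_per_second(clients: int, wait_seconds: float,
                                 batched: bool = True) -> float:
    """Idealized aggregate commit rate for a number of concurrent clients.

    batched=True:  all clients waiting at once share a single wait.
    batched=False: waits are serialized, so extra clients add nothing.
    """
    if clients < 1:
        raise ValueError("clients must be at least 1")
    if wait_seconds <= 0:
        raise ValueError("wait_seconds must be positive")
    commits_per_wait = clients if batched else 1
    return commits_per_wait / wait_seconds


# Example: a 10 ms wait per commit, chosen only for illustration.
wait = 0.010
for clients in (1, 2, 8, 32):
    shared = aggregate_commits_per_second(clients, wait, batched=True)
    serial = aggregate_commits_per_second(clients, wait, batched=False)
    print(f"{clients:>2} clients: batched {shared:.0f}/sec, "
          f"serialized {serial:.0f}/sec")
```

In this model each individual client still waits the full time for its
own commit. Only the aggregate rate grows, which is why client count
has to be part of the design.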