Just for kicks, I ran two 30-minute pgbench tests at scale factor 300
tonight on Nate Boley's machine, with -n -l -c 32 -j 32. The
configurations were identical, except that on one of them, I set
wal_buffers=64MB. It seemed to make quite a lot of difference:
wal_buffers not set (thus, 16MB):
tps = 3162.594605 (including connections establishing)
wal_buffers=64MB:
tps = 6164.194625 (including connections establishing)
Rest of config: shared_buffers = 8GB, maintenance_work_mem = 1GB,
synchronous_commit = off, checkpoint_segments = 300,
checkpoint_timeout = 15min, checkpoint_completion_target = 0.9,
wal_writer_delay = 20ms
I have attached tps scatterplots. The obvious conclusion appears to
be that, with only 16MB of wal_buffers, the buffer "wraps around" with
some regularity: we can't insert more WAL because the buffer we need
to use still contains WAL that hasn't yet been fsync'd, leading to
long stalls. More buffer space ameliorates the problem. This is not
very surprising, when you think about it: it's clear that the peak tps
rate approaches 18k/s on these tests; right after a checkpoint, every
update will force a full page write - that is, a WAL record > 8kB. So
we'll fill up a 16MB WAL segment in about a tenth of a second. That
doesn't leave much breathing room. I think we might want to consider
adjusting our auto-tuning formula for wal_buffers to allow for a
higher cap, although this is obviously not enough data to draw any
firm conclusions.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company