This is probably somewhat superfluous, but we had another one of these
incidents last night, and its details confirm your explanation here.
[2006-04-21 00:22:19.500 ] 2452 LOG: could not rename file
"pg_xlog/000000010000011A0000004C" to
"pg_xlog/000000010000011A00000071", continuing to try
The autovacuums (which wouldn't actually have been vacuuming anything,
since update traffic would have stopped by then) continued until:
[2006-04-21 01:57:35.968 ] 4048 LOG: autovacuum: processing database
"bigbird"
and the Web site first started noticing timeouts at 01:31:42,827.
Overnight traffic is so light that 70 minutes to work through 32K / 2
transactions is probably about right.
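
For anyone who wants to check the arithmetic, here is a rough sketch in
Python. It assumes the default 8 kB block size and 2 status bits per
transaction (which is where the 32K figure comes from), and it reads
"32K / 2" as "on average about half a pg_clog page was still unused when
the rename started failing". The variable names are made up for
illustration and the resulting rate is only an estimate.

```python
from datetime import datetime

# Assumed defaults: 8 kB blocks, 2 status bits (4 transactions) per byte.
BLCKSZ = 8192
CLOG_XACTS_PER_BYTE = 4
xacts_per_page = BLCKSZ * CLOG_XACTS_PER_BYTE  # the "32K" figure

# Timestamps taken from the log line and the Web site report above.
fmt = "%Y-%m-%d %H:%M:%S.%f"
rename_failed = datetime.strptime("2006-04-21 00:22:19.500", fmt)
first_timeout = datetime.strptime("2006-04-21 01:31:42.827", fmt)

elapsed_s = (first_timeout - rename_failed).total_seconds()

# Assumption: about half a page of XIDs was left when WAL writes stalled.
remaining_xacts = xacts_per_page // 2
rate = remaining_xacts / elapsed_s

print(f"transactions per pg_clog page: {xacts_per_page}")
print(f"elapsed: {elapsed_s / 60:.1f} minutes")
print(f"implied overnight rate: {rate:.1f} transactions/second")
```

That works out to a few transactions per second, which is plausible for
our overnight load.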
Pete
>>> Tom Lane <tgl(at)sss(dot)pgh(dot)pa(dot)us> 04/18/06 9:01 pm >>>
[ thinks for awhile longer ... ] No, I take that back. Once you'd
exhausted the current pg_clog page (32K transactions), even read-only
transactions would be blocked by the need to create a new pg_clog page
(which is a WAL-logged action). A read-only transaction never actually
makes a WAL entry, but it does still consume an XID and hence a slot on
the current pg_clog page. So I just hadn't tried enough transactions.