Honorable members of the list,
I would like to share with you a side effect that I discovered today on
our PostgreSQL 8.1 server.
We have been running this instance with PITR for two months now without
any problems.
The WAL files are copied to a remote machine by the pg_archive_command,
and also locally to another directory.
For unrelated reasons we made the remote machine unreachable for a few
hours. As expected, the pg_archive_command returned a failure value.
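
To make the setup concrete, the archive step described above can be
sketched roughly as follows. This is only a simplified illustration in
Python under assumptions, not the actual script: the names archive_wal
and copy_once and the two directory arguments are hypothetical, and the
remote machine is assumed to be reachable as a mounted directory. The
point is only that the step reports failure (non-zero) whenever either
copy cannot be made.

```python
import filecmp
import os
import shutil
import sys


def copy_once(src: str, dst_dir: str) -> bool:
    """Copy src into dst_dir; return True on success, False on any failure."""
    dst = os.path.join(dst_dir, os.path.basename(src))
    try:
        if os.path.exists(dst):
            # Retry after an earlier partial failure: never overwrite,
            # accept the existing file only if it is byte-for-byte identical.
            return filecmp.cmp(src, dst, shallow=False)
        tmp = dst + ".tmp"
        shutil.copy2(src, tmp)   # raises OSError if dst_dir is unreachable
        os.replace(tmp, dst)     # rename so a half-copied file is never visible
        return True
    except OSError:
        return False


def archive_wal(src: str, local_dir: str, remote_dir: str) -> int:
    """Return 0 only if the segment is present in both destinations."""
    ok_local = copy_once(src, local_dir)
    ok_remote = copy_once(src, remote_dir)
    return 0 if (ok_local and ok_remote) else 1


if __name__ == "__main__":
    # Hypothetical invocation: <segment path> <local dir> <remote dir>
    if len(sys.argv) == 4:
        sys.exit(archive_wal(sys.argv[1], sys.argv[2], sys.argv[3]))
```

With the remote directory unreachable, every attempt returns 1, so the
same segment is retried until the remote machine comes back.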
Now to what puzzles me:
The load on the box, which normally stays between 0.7 and 1.5, suddenly
rose to 4.5-5.5, and process responsiveness got bad.
The pg_xlog directory has plenty of space to hold several days of WAL
files. There were no unfinished backups or anything else that could
apparently have slowed the machine that much.
So the question is: is there a correlation between the WAL files not
being archived and this "massive" load increase?
In my understanding, since the PostgreSQL engine has nothing more to do
with a filled log segment than make sure it is archived correctly, there
should not be any significant load increase for this reason. Looking at
the logs, the engine tried approximately every 3 minutes to archive the
WAL files.
Is this behaviour expected? If it is, is it reasonable to burden an
engine that is already in an unexpected situation with some IMHO
unnecessary extra load?
Your thoughts are welcome.
Cedric