Postfix and NFS

Postfix support status for NFS

What is the status of support for Postfix on NFS? The answer
is that Postfix itself is supported when you use NFS, but there is
no promise that an NFS-related problem will promptly receive a
Postfix workaround, or that a workaround will even be possible.

That said, Postfix will in many cases work very well on NFS,
because Postfix implements a number of workarounds (see below).
Good NFS implementations seldom, if ever, cause problems with
Postfix, so Wietse recommends that you spend your money wisely.

Postfix file locking and NFS

For the Postfix mail queue, it does not matter how well NFS
file locking works. The reason is that you cannot share Postfix
queues among multiple running Postfix instances. You can use NFS
to switch a Postfix mail queue from one NFS client to another one,
but only one NFS client can access a Postfix mail queue at any
particular point in time.

For mailbox file sharing with NFS, your options are to use
fcntl (kernel locks), to use dotlock (username.lock
files), to use both locking methods simultaneously, or to switch
to maildir format. The maildir format uses one file per message and
needs no file locking support in Postfix or in other mail software.

Many sites that use mailbox format play safe and use both locking
methods simultaneously.
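
As an illustration, a main.cf fragment that requests both locking
methods might look like the one below. The mailbox_delivery_lock
parameter applies to the local(8) delivery agent and
virtual_mailbox_lock to the virtual(8) delivery agent. Which methods
are available depends on the system; "postconf -l" lists them.

```
# Sketch only: use both kernel locks and dotlock files for
# mailbox-format deliveries.
mailbox_delivery_lock = fcntl, dotlock
virtual_mailbox_lock = fcntl, dotlock
```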

Postfix NFS workarounds

The list below summarizes the workarounds that exist for running
Postfix on NFS as of the middle of 2003. As a reminder, Postfix
itself is still supported when it runs on NFS, but there is no
promise that an NFS-related problem will promptly receive a Postfix
workaround, or that a workaround will even be possible.

Problem: when renaming a file, the operation may succeed
but report an error anyway[1].

Workaround: when rename(old, new) reports an error, Postfix
checks if the new name exists and the old name is gone. If the check
succeeds, Postfix assumes that the rename() operation completed
normally.
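
The check described above can be sketched in Python as follows.
This is an illustration only and not Postfix source code (Postfix
is written in C); the function name nfs_safe_rename is hypothetical.

```python
import os


def nfs_safe_rename(old, new):
    """Rename old to new, tolerating a false error report."""
    try:
        os.rename(old, new)
    except OSError:
        # The rename may have completed even though an error came back.
        # Accept it only if the new name exists and the old name is gone.
        if os.path.lexists(new) and not os.path.lexists(old):
            return
        raise
```

Calling the function a second time with the same arguments behaves
like a retransmitted request: the underlying rename fails because
the old name is gone, and the check then accepts the result.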

Problem: when creating a directory, the operation may succeed
but report an error anyway[1].

Workaround: when mkdir(new) reports an EEXIST error, Postfix
checks if the new name resolves to a directory. If the check succeeds,
Postfix assumes that the mkdir() operation completed normally.
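
A minimal Python sketch of this check, for illustration only; the
function name nfs_safe_mkdir and the default mode are assumptions
and do not come from Postfix.

```python
import os


def nfs_safe_mkdir(path, mode=0o700):
    """Create a directory, tolerating a false EEXIST report."""
    try:
        os.mkdir(path, mode)
    except FileExistsError:
        # EEXIST may mean that an earlier copy of this request already
        # succeeded. Accept it only if the name resolves to a directory.
        if not os.path.isdir(path):
            raise
```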

Problem: when creating a hardlink to a file, the operation
may succeed but report an error anyway[1].

Workaround: when link(old, new) fails, Postfix compares the
device and inode numbers of the old and new files. When both names
refer to the same file, Postfix assumes that the link() operation
completed normally.
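
The device and inode comparison can be sketched in Python as
follows. This is an illustration under stated assumptions and not
Postfix code; nfs_safe_link and _same_file are hypothetical names.

```python
import os


def _same_file(a, b):
    """Return True if both names have the same device and inode number."""
    try:
        sa, sb = os.stat(a), os.stat(b)
    except OSError:
        return False
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def nfs_safe_link(old, new):
    """Create a hard link, tolerating a false error report."""
    try:
        os.link(old, new)
    except OSError:
        # The link may already exist because an earlier copy of the
        # request succeeded. Accept it only if both names are one file.
        if not _same_file(old, new):
            raise
```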

Problem: when creating a dotlock (username.lock)
file, the operation may succeed but report an error anyway[1].

Workaround: in this case, the only safe action is to back off
and try again later.
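
A back-off loop of this kind might look like the Python sketch
below. The function name, the retry count and the delay are all
assumptions chosen for illustration. The sketch creates the lock
file with O_EXCL for brevity; that flag has historically been
unreliable on older NFS versions, so do not read this as the way
Postfix creates its lock files.

```python
import os
import time


def acquire_dotlock(mailbox, attempts=5, delay=1.0):
    """Try to create mailbox + '.lock'; return its path, or None."""
    lock = mailbox + ".lock"
    for attempt in range(attempts):
        try:
            fd = os.open(lock, os.O_WRONLY | os.O_CREAT | os.O_EXCL, 0o644)
        except FileExistsError:
            # We cannot tell whether the lock file belongs to another
            # process or to an earlier copy of our own request, so the
            # only safe action is to back off and try again.
            if attempt + 1 < attempts:
                time.sleep(delay)
            continue
        os.close(fd)
        return lock
    return None
```

A caller that receives None should defer the delivery rather than
remove a lock file that may belong to another process.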

Problem: when a file server's "time of day" clock is not
synchronized with the client's "time of day" clock, email deliveries
are delayed by a minute or more.

Workaround: Postfix explicitly sets file time stamps to avoid
delays with new mail (Postfix uses "last modified" file time stamps
to decide when a queue file is ready for delivery).
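
To illustrate the idea, the Python sketch below sets a file's time
stamps from the client's own clock instead of relying on the time
that the file server assigns. The function name is hypothetical and
the sketch does not reproduce Postfix's actual queue file handling.

```python
import os
import time


def stamp_with_local_clock(path, when=None):
    """Set access and modification times explicitly; return the time used."""
    if when is None:
        when = time.time()  # client clock, not the file server's clock
    os.utime(path, (when, when))
    return when
```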

[1] How can an operation succeed and report an error
anyway?

Suppose that an NFS server executes a client request successfully,
and that the server's reply to the client is lost. After some time
the client retransmits the request to the server. Normally, the
server remembers that it already completed the request (it keeps a
list of recently-completed requests and replies), and simply
retransmits the reply.

However, when the server has rebooted or when it has been very
busy, the server no longer remembers that it already completed the
request, and repeats the operation. This causes no problems with
file read/write requests (they contain a file offset and can therefore
be repeated safely), but fails with non-idempotent operations. For
example, when the server executes a retransmitted rename() request,
the server reports an ENOENT error because the old name does not
exist; and when the server executes a retransmitted link(), mkdir()
or create() request, the server reports an EEXIST error because the
name already exists.

Thus, successful non-idempotent NFS operations will report
false errors when the server reply is lost, the client retransmits
the request, and the server does not remember that it already
completed the request.
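
The sequence of events can be reproduced with a toy model. The
Python sketch below is not an NFS implementation; the class
ToyServer, its request identifiers and its string replies are
invented for illustration. It models only a set of existing names
and a cache of recent replies that is lost when the server reboots.

```python
class ToyServer:
    """Toy file server with a cache of recently completed requests."""

    def __init__(self):
        self.names = set()      # survives a reboot (on disk)
        self.reply_cache = {}   # lost on reboot (in memory)

    def mkdir(self, request_id, name):
        # A retransmitted request that is still remembered gets the
        # original reply back, without repeating the operation.
        if request_id in self.reply_cache:
            return self.reply_cache[request_id]
        if name in self.names:
            reply = "EEXIST"
        else:
            self.names.add(name)
            reply = "OK"
        self.reply_cache[request_id] = reply
        return reply

    def reboot(self):
        self.reply_cache.clear()


server = ToyServer()
first = server.mkdir(1, "incoming")    # succeeds; assume the reply is lost
retry = server.mkdir(1, "incoming")    # retransmission, reply from cache
server.reboot()
late = server.mkdir(1, "incoming")     # retransmission after a reboot
print(first, retry, late)              # prints "OK OK EEXIST"
```

The last call reports an error even though the directory was created
by the first call, which is the false error described above.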