6.4.13.11 Comparing Mail Back Ends

First, just for terminology, the back end is the common word for a
low-level access method—a transport, if you will, by which something
is acquired. The sense is that one's mail has to come from somewhere,
and so selection of a suitable back end is required in order to get that
mail within spitting distance of Gnus.

The same concept exists for Usenet itself: Though access to articles is
typically done by NNTP these days, once upon a midnight dreary, everyone
in the world got at Usenet by running a reader on the machine where the
articles lay (the machine which today we call an NNTP server), and
access was by the reader stepping into the articles' directory spool
area directly. One can still select between either the nntp or
nnspool back ends, to select between these methods, if one happens
actually to live on the server (or can see its spool directly, anyway,
via NFS).

The goal in selecting a mail back end is to pick one which
simultaneously represents a suitable way of dealing with the original
format plus leaving mail in a form that is convenient to use in the
future. Here are some high and low points on each:

nnmbox

UNIX systems have historically had a single, very common, and well-defined
format. All messages arrive in a single spool file, and
they are delineated by a line whose regular expression matches
‘^From_’. (My notational use of ‘_’ is to indicate a space,
to make it clear in this instance that this is not the RFC-specified
‘From:’ header.) Because Emacs and therefore Gnus emanate
historically from the Unix environment, it is simplest if one does not
mess a great deal with the original mailbox format, so if one chooses
this back end, Gnus' primary activity in getting mail from the real spool
area to Gnus' preferred directory is simply to copy it, with no
(appreciable) format change in the process. It is the “dumbest” way
to move mail into availability in the Gnus environment. This makes it
fast to move into place, but slow to parse, when Gnus has to look at
what's where.

nnbabyl

Once upon a time, there was the DEC-10 and DEC-20, running operating
systems called TOPS and related things, and the usual (only?) mail
reading environment was a thing called Babyl. I don't know what format
was used for mail landing on the system, but Babyl had its own internal
format to which mail was converted, primarily involving creating a
spool-file-like entity with a scheme for inserting Babyl-specific
headers and status bits above the top of each message in the file.
Rmail was Emacs's first mail reader, it was written by Richard Stallman,
and Stallman came out of that TOPS/Babyl environment, so he wrote Rmail
to understand the mail files folks already had in existence. Gnus (and
VM, for that matter) continue to support this format because it's
perceived as having some good qualities in those mailer-specific
headers/status bits stuff. Rmail itself still exists as well, of
course, and is still maintained within Emacs. Since Emacs 23, it
uses standard mbox format rather than Babyl.

Both of the above forms leave your mail in a single file on your
file system, and they must parse that entire file each time you take a
look at your mail.

nnml

nnml is the back end which smells the most as though you were
actually operating with an nnspool-accessed Usenet system. (In
fact, I believe nnml actually derived from nnspool code,
lo these years ago.) One's mail is taken from the original spool file,
and is then cut up into individual message files, 1:1. It maintains a
Usenet-style active file (analogous to what one finds in an INN- or
CNews-based news system in (for instance) /var/lib/news/active,
or what is returned via the ‘NNTP LIST’ verb) and also creates
overview files for efficient group entry, as has been defined for
NNTP servers for some years now. It is slower in mail-splitting,
due to the creation of lots of files, updates to the nnml active
file, and additions to overview files on a per-message basis, but it is
extremely fast on access because of what amounts to the indexing support
provided by the active file and overviews.

nnml costs inodes in a big way; that is, it soaks up the
resource which defines available places in the file system to put new
files. Sysadmins take a dim view of heavy inode occupation within
tight, shared file systems. But if you live on a personal machine where
the file system is your own and space is not at a premium, nnml
wins big.

It is also problematic using this back end if you are living in a
FAT16-based Windows world, since much space will be wasted on all these
tiny files.

nnmh

The Rand MH mail-reading system has been around UNIX systems for a very
long time; it operates by splitting one's spool file of messages into
individual files, but with little or no indexing support—nnmh
is considered to be semantically equivalent to “nnml without
active file or overviews”. This is arguably the worst choice, because
one gets the slowness of individual file creation married to the
slowness of access parsing when learning what's new in one's groups.

nnfolder

Basically the effect of nnfolder is nnmbox (the first
method described above) on a per-group basis. That is, nnmbox
itself puts all one's mail in one file; nnfolder provides a
little bit of optimization to this so that each of one's mail groups has
a Unix mail box file. It's faster than nnmbox because each group
can be parsed separately, and still provides the simple Unix mail box
format requiring minimal effort in moving the mail around. In addition,
it maintains an “active” file making it much faster for Gnus to figure
out how many messages there are in each separate group.

If you have groups that are expected to have a massive amount of
messages, nnfolder is not the best choice, but if you receive
only a moderate amount of mail, nnfolder is probably the most
friendly mail back end all over.

nnmaildir

For configuring expiry and other things, nnmaildir uses
incompatible group parameters, slightly different from those of other
mail back ends.

nnmaildir is largely similar to nnml, with some notable
differences. Each message is stored in a separate file, but the
filename is unrelated to the article number in Gnus. nnmaildir
also stores the equivalent of nnml's overview files in one file
per article, so it uses about twice as many inodes as nnml.
(Use df -i to see how plentiful your inode supply is.) If this
slows you down or takes up very much space, a non-block-structured
file system.

Since maildirs don't require locking for delivery, the maildirs you use
as groups can also be the maildirs your mail is directly delivered to.
This means you can skip Gnus' mail splitting if your mail is already
organized into different mailboxes during delivery. A directory
entry in mail-sources would have a similar effect, but would
require one set of mailboxes for spooling deliveries (in mbox format,
thus damaging message bodies), and another set to be used as groups (in
whatever format you like). A maildir has a built-in spool, in the
new/ subdirectory. Beware that currently, mail moved from
new/ to cur/ instead of via mail splitting will not
undergo treatment such as duplicate checking.

nnmaildir stores article marks for a given group in the
corresponding maildir, in a way designed so that it's easy to manipulate
them from outside Gnus. You can tar up a maildir, unpack it somewhere
else, and still have your marks.

nnmaildir uses a significant amount of memory to speed things up.
(It keeps in memory some of the things that nnml stores in files
and that nnmh repeatedly parses out of message files.) If this
is a problem for you, you can set the nov-cache-size group
parameter to something small (0 would probably not work, but 1 probably
would) to make it use less memory. This caching will probably be
removed in the future.

Startup is likely to be slower with nnmaildir than with other
back ends. Everything else is likely to be faster, depending in part
on your file system.

nnmaildir does not use nnoo, so you cannot use nnoo
to write an nnmaildir-derived back end.