9.17.4 Spam and Ham Processors

Spam and ham processors specify special actions to take when you exit
a group buffer. Spam processors act on spam messages, and ham
processors on ham messages. At present, the main role of these
processors is to update the dictionaries of dictionary-based spam back
ends such as Bogofilter (see Bogofilter) and the Spam Statistics
package (see Spam Statistics Filtering).

The spam and ham processors that apply to each group are determined by
the group's spam-process group parameter. If this group
parameter is not defined, they are determined by the variable
gnus-spam-process-newsgroups.
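
As an illustration, the variable could be set along the following
lines. This is a minimal sketch, not a recommended setup: the group
names are hypothetical, each entry is assumed to have the form
(REGEXP (PROCESSOR ...)), and the processor forms
(spam spam-use-bogofilter) and (ham spam-use-bogofilter) assume that
Bogofilter is the back end in use.

```elisp
;; Sketch only: the group names are hypothetical, and Bogofilter is
;; assumed to be installed and enabled as the spam back end.
(setq gnus-spam-process-newsgroups
      '(;; Train Bogofilter on spam when leaving this group.
        ("^nnml:mail\\.spam$" ((spam spam-use-bogofilter)))
        ;; Train Bogofilter on ham when leaving this group.
        ("^nnml:mail\\.ham$" ((ham spam-use-bogofilter)))))
```

A spam-process group parameter set on a group takes precedence over
any match in this variable, as described above.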

Gnus learns from the spam you get. You have to collect your spam in
one or more spam groups, and set or customize the variable
spam-junk-mailgroups as appropriate. You can also declare
groups to contain spam by setting their group parameter
spam-contents to gnus-group-spam-classification-spam, or
by customizing the corresponding variable
gnus-spam-newsgroup-contents. The spam-contents group
parameter and the gnus-spam-newsgroup-contents variable can
also be used to declare groups as ham groups if you set their
classification to gnus-group-spam-classification-ham. If
groups are not classified by means of spam-junk-mailgroups,
spam-contents, or gnus-spam-newsgroup-contents, they are
considered unclassified. All groups are unclassified by
default.
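
The classification described above might be configured as in the
following sketch. The group names and regular expressions are
hypothetical, and gnus-spam-newsgroup-contents is assumed to take
entries of the form (REGEXP CLASSIFICATION).

```elisp
;; Sketch only: all group names and regexps are hypothetical.

;; Groups named here are treated as spam groups.
(setq spam-junk-mailgroups '("spam" "mail.junk"))

;; Classify further groups by matching their names.
(setq gnus-spam-newsgroup-contents
      '(("^nnml:mail\\.spam$" gnus-group-spam-classification-spam)
        ("^nnml:mail\\.ham$" gnus-group-spam-classification-ham)))
```

Any group that matches none of these settings stays unclassified.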

In spam groups, all messages are considered to be spam by default:
they get the ‘$’ mark (gnus-spam-mark) when you enter the
group. If you have seen a message, had it marked as spam, then
unmarked it, it won't be marked as spam when you enter the group
thereafter. You can disable that behavior, so all unread messages
will get the ‘$’ mark, if you set the
spam-mark-only-unseen-as-spam parameter to nil. You
should remove the ‘$’ mark when you are in the group summary
buffer for every message that is not spam after all. To remove the
‘$’ mark, you can use M-u to “unread” the article, or
d for declaring it read the non-spam way. When you leave a
group, all spam-marked (‘$’) articles are sent to a spam
processor which will study them as spam samples.

Messages may also be deleted in various other ways. Unless the
ham-marks group parameter is overridden as described below, marks ‘R’
and ‘r’ for default read or explicit delete, marks ‘X’ and
‘K’ for automatic or explicit kills, and mark ‘Y’ for
low scores are all considered to be associated with articles that
are not spam. This assumption might be false, in particular if you
use kill files or score files as a means of detecting genuine spam; in
that case you should adjust the ham-marks group parameter.

— Variable: ham-marks

You can customize this group or topic parameter to be the list of
marks you want to consider ham. By default, the list contains the
deleted, read, killed, kill-filed, and low-score marks (the idea is
that these articles have been read, but are not spam). It can be
useful to also include the tick mark in the ham marks. It is not
recommended to make the unread mark a ham mark, because it normally
indicates a lack of classification. But you can do it, and we'll be
happy for you.

— Variable: spam-marks

You can customize this group or topic parameter to be the list of
marks you want to consider spam. By default, the list contains only
the spam mark. It is not recommended to change that, but you can if
you really want to.

When you leave any group, regardless of its
spam-contents classification, all spam-marked articles are sent
to a spam processor, which will study these as spam samples. If you
explicitly kill a lot, you might sometimes end up with articles marked
‘K’ which you never saw, and which might accidentally contain
spam. Best is to make sure that real spam is marked with ‘$’,
and nothing else.

When you leave a spam group, all spam-marked articles are
marked as expired after processing with the spam processor. This is
not done for unclassified or ham groups. Also, any
ham articles in a spam group will be moved to a location
determined by either the ham-process-destination group
parameter or a match in the gnus-ham-process-destinations
variable, which is a list of regular expressions matched with group
names (it's easiest to customize this variable with M-x
customize-variable <RET> gnus-ham-process-destinations). Each
group name list is a standard Lisp list, if you prefer to customize
the variable manually. If the ham-process-destination
parameter is not set, ham articles are left in place. If the
spam-mark-ham-unread-before-move-from-spam-group parameter is
set, the ham articles are marked as unread before being moved.

If ham cannot be moved (because of a read-only back end such as
NNTP, for example), it will be copied.

Note that you can use multiple destinations per group or regular
expression! This enables you to send your ham to a regular mail
group and to a ham training group.
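
A possible setting of gnus-ham-process-destinations is sketched
below. All group names are hypothetical, and each entry is assumed to
have the form (REGEXP DESTINATION), where DESTINATION is either one
group name or a list of group names.

```elisp
;; Sketch only: all group names below are hypothetical.
(setq gnus-ham-process-destinations
      '(;; Ham found in this spam group is moved to a single group.
        ("^nnml:mail\\.spam$" "nnml:mail.misc")
        ;; Ham found here goes to a regular mail group and to a ham
        ;; training group.
        ("^nnml:mail\\.junk$" ("nnml:mail.misc" "nnml:train.ham"))))
```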

When you leave a ham group, all ham-marked articles are sent to
a ham processor, which will study these as non-spam samples.

By default the variable spam-process-ham-in-spam-groups is
nil. Set it to t if you want ham found in spam groups
to be processed. Normally this is not done; you are expected instead
to send your ham to a ham group and process it there.

By default the variable spam-process-ham-in-nonham-groups is
nil. Set it to t if you want ham found in non-ham (spam
or unclassified) groups to be processed. Normally this is not done;
you are expected instead to send your ham to a ham group and process
it there.
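
If you do want ham processed where it is found, the two variables
above can be turned on as in this short sketch. Whether you need
both depends on your setup.

```elisp
;; Process ham found in spam groups when leaving them.
(setq spam-process-ham-in-spam-groups t)

;; Process ham found in any non-ham (spam or unclassified) group.
(setq spam-process-ham-in-nonham-groups t)
```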

When you leave a ham or unclassified group, all
spam articles are moved to a location determined by either
the spam-process-destination group parameter or a match in the
gnus-spam-process-destinations variable, which is a list of
regular expressions matched with group names (it's easiest to
customize this variable with M-x customize-variable <RET>
gnus-spam-process-destinations). Each group name list is a standard
Lisp list, if you prefer to customize the variable manually. If the
spam-process-destination parameter is not set, the spam
articles are only expired. The group name is fully qualified, meaning
that if you see ‘nntp:servername’ before the group name in the
group buffer then you need it here as well.

If spam cannot be moved (because of a read-only back end such as
NNTP, for example), it will be copied.

Note that you can use multiple destinations per group or regular
expression! This enables you to send your spam to multiple spam
training groups.
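
The following sketch shows one way gnus-spam-process-destinations
might be set. The group names are hypothetical, and each entry is
assumed to have the form (REGEXP DESTINATION), where DESTINATION is
either one group name or a list of group names.

```elisp
;; Sketch only: all group names below are hypothetical.  Destination
;; names are fully qualified, as the text above requires.
(setq gnus-spam-process-destinations
      '(;; Spam found in any nnml mail group is sent to two spam
        ;; training groups.
        ("^nnml:mail\\." ("nnml:train.spam1" "nnml:train.spam2"))))
```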

The problem with processing ham and spam is that Gnus doesn't track
this processing by default. Enable the spam-log-to-registry
variable so spam.el will use gnus-registry.el to track
what articles have been processed, and avoid processing articles
multiple times. Keep in mind that if you limit the number of registry
entries, this won't work as well as it does without a limit.
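
A minimal sketch of enabling this tracking follows. It assumes that
the Gnus registry is not yet set up elsewhere in your configuration;
if it is, only the last form is needed.

```elisp
;; Assumption: the registry has not been initialized elsewhere.
(require 'gnus-registry)
(gnus-registry-initialize)

;; Let spam.el record processed articles in the registry.
(setq spam-log-to-registry t)
```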

Set the variable spam-mark-only-unseen-as-spam if you want only unseen
articles in spam groups to be marked as spam. By default, it is set. If
you set it to nil, unread articles will also be marked as spam.

Set the variable spam-mark-ham-unread-before-move-from-spam-group if
you want ham to be unmarked before it is moved out of the spam group.
This is very useful when you use something like the tick mark ‘!’ to
mark ham: the article will be placed in your ham-process-destination,
unmarked as if it came fresh from the mail server.
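
The two marking variables spam-mark-only-unseen-as-spam and
spam-mark-ham-unread-before-move-from-spam-group, both named earlier
in this section, could be set as in this sketch. The values shown are
only an example, not a recommendation.

```elisp
;; Mark all unread articles in spam groups with `$', not only the
;; unseen ones.
(setq spam-mark-only-unseen-as-spam nil)

;; Mark ham as unread before it is moved out of a spam group.
(setq spam-mark-ham-unread-before-move-from-spam-group t)
```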

When autodetecting spam, the variable spam-autodetect-recheck-messages
tells spam.el whether only unseen articles or all unread articles
should be checked for spam. It is recommended that you leave it off.