Manual training of the Bayesian filter uses scripts and the Spam Training Utility to update the dictionary file with spam and ham. Manual training can run alongside auto-training and is a good way to add emails that evaded detection to the dictionary, so that similar messages are caught in future.

As with auto-training, both spam and ham need to be collected, but the process for collecting each differs, as detailed below.

Collecting spam for manual training

Two ways to collect spam for manual training purposes are:

Creating a catchall address. Set up a mailbox address (e.g. spam@example.com) as a catchall address. This address collects all email sent to the domain that does not map to a mailbox. Most of the mail in this mailbox will be spam, because spammers often send to unknown addresses at a domain. Do not use an address that is already being used for auto-training.

Using public folders. Set up public folders for post offices for the purpose of collecting spam. IMAP users can drag and drop spam messages from their inbox into the public folder for collection. A script can then be scheduled to copy the content of these folders to a single spam repository folder for addition to the dictionary. For an example script, see the Manual Training section.
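As a rough illustration of the copy step described above, the following Python sketch gathers message files from several collection folders into one repository folder. It is not the MailEnable example script. The function name collect_messages, the folder paths shown in the usage note, and the assumption that messages are stored as *.MAI files are all illustrative and should be checked against the actual installation.

```python
import shutil
from pathlib import Path


def collect_messages(source_dirs, repository, pattern="*.MAI"):
    """Copy message files from each source folder into one repository folder.

    The "*.MAI" pattern is an assumption about how messages are stored.
    Returns the list of destination paths that were written.
    """
    repo = Path(repository)
    repo.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, source in enumerate(source_dirs):
        source = Path(source)
        if not source.is_dir():
            continue  # skip collection folders that do not exist
        for message in sorted(source.glob(pattern)):
            # Prefix the name with the source index so that files with the
            # same name in different folders do not overwrite each other.
            target = repo / f"{index}_{message.name}"
            shutil.copy2(message, target)
            copied.append(target)
    return copied
```

A call such as collect_messages([r"C:\Example\PublicSpamA", r"C:\Example\PublicSpamB"], r"C:\Example\SpamRepository") would then populate the repository folder. These paths are hypothetical placeholders, and the same function could be pointed at ham sources with a separate ham repository folder.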

Collecting ham for manual training

One way of collecting ham for manual training is to configure a filter that collects mail from senders who have authenticated. To do this, follow this procedure:

Create a mailbox called ham@example.com in the domain.

Create a global filter called “Ham Collection” with the criteria of “Where the sender has authenticated” and the action “Forward message to ham@example.com”. More advanced criteria can be used to determine which messages to use for training.

The inbox of this mailbox can then be used as a source for ham messages to be used for manual training.

Compiling the dictionary using a script

The Spam Training Utility is used to add emails to a dictionary. It takes spam and ham messages from two specified folders, processes them and adds them to the dictionary. Because the emails to add may be spread across various public folders and catchall mailboxes, a scheduled DOS script is normally used to copy them from these locations into the two folders read by the Spam Training Utility.

An example script for this is below. The script also stops and starts the MTA service so that manual training can be used alongside auto-training. Because the Spam Training Utility works only on the dictionary file on the hard drive, the MTA service must be stopped first so that any auto-training additions are written out to disk.

The script is just an example and would need to be modified to match the MailEnable configuration.
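The ordering described above (stop the MTA service, run the training step, start the service again) can also be sketched in Python. This is only an outline under stated assumptions: the service name MEMTAS is assumed and should be verified in the Windows Services list, and the training command is passed in by the caller because the Spam Training Utility's actual path and arguments are not given here. The helper names build_plan and run_plan are illustrative.

```python
import subprocess


def build_plan(training_command, service_name="MEMTAS"):
    """Return the commands in order: stop service, train, start service.

    service_name is an assumed default; training_command is supplied by the
    caller as a list of program and arguments.
    """
    return [
        ["net", "stop", service_name],
        list(training_command),
        ["net", "start", service_name],
    ]


def run_plan(plan, dry_run=True):
    """Print the plan, or run it when dry_run is False."""
    stop, train, start = plan
    if dry_run:
        for command in plan:
            print(" ".join(command))
        return
    subprocess.run(stop, check=True)
    try:
        subprocess.run(train, check=True)
    finally:
        # Attempt to restart the service even if the training step failed.
        subprocess.run(start, check=True)
```

Running run_plan(build_plan([...])) with the default dry_run=True only prints the three commands, which makes it possible to review the sequence before letting it stop a live mail service.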