Python Milter Mail Policy

The milter package is a flexible milter built using
pymilter that
emphasizes authentication. It helps prevent forgery of legitimate mail,
and most spam is rejected as forged, or because of poor reputation
when not forged.

These are the policies implemented by the bms.py
application in the milter package. The milter and Milter modules
in the pymilter package
do not implement any policies themselves.

Classify connection

When the SMTP client connects, the connection IP address is
saved for later verification, and the connection
is classified as INTERNAL or EXTERNAL by matching the IP
address against the internal_connect configuration.
IP addresses with no PTR, and PTR names that look like
the kind assigned to dynamic IPs (as determined by a heuristic
algorithm) are flagged as DYNAMIC. IPs that match the
trusted_relay configuration are flagged as TRUSTED.
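As a rough illustration of the classification step, the sketch below flags a
connection from its IP address. It assumes that internal_connect and
trusted_relay are given as lists of CIDR networks, which may not match the
syntax the real configuration accepts; the function name is hypothetical, and
the DYNAMIC heuristic is left out.

```python
import ipaddress

def classify_connection(ip, internal_connect, trusted_relay):
    """Return the set of flags for a connecting IP address.

    Assumption: both configuration values are lists of CIDR strings.
    """
    addr = ipaddress.ip_address(ip)

    def matches(networks):
        return any(addr in ipaddress.ip_network(n) for n in networks)

    # Every connection is exactly one of INTERNAL or EXTERNAL.
    flags = {"INTERNAL" if matches(internal_connect) else "EXTERNAL"}
    # TRUSTED is an additional flag, independent of the first one.
    if matches(trusted_relay):
        flags.add("TRUSTED")
    return flags
```

For example, classify_connection("192.168.1.5", ["192.168.0.0/16"], []) yields
a set containing only INTERNAL.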

HELO Check

The HELO name provided by the client is saved for later verification
(for example by SPF). We could validate the HELO at this point
by verifying that an A record for the HELO name matches the connect IP.
However, currently we only block certain obvious problems.
HELO names that look like an IPv4 address
and ones that match the hello_blacklist configuration
are immediately rejected. The hello_blacklist typically contains
the current MTA's own HELO name or email domains.
Clients that attempt to skip HELO are immediately rejected.
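The three HELO rejections described above can be sketched as a single
predicate. This is only an illustration: the function name is hypothetical,
hello_blacklist is assumed to be a plain list of names compared without regard
to case, and "looks like an IPv4 address" is approximated by a bare dotted
quad.

```python
import re

# Approximation of "looks like an IPv4 address": four dot-separated numbers.
IP4_RE = re.compile(r"\d{1,3}(\.\d{1,3}){3}")

def helo_rejected(helo, hello_blacklist):
    """Return True if the HELO name should be rejected immediately."""
    if not helo:
        return True  # the client skipped HELO
    name = helo.strip().lower()
    if IP4_RE.fullmatch(name):
        return True  # HELO is a bare IP address
    return name in {h.lower() for h in hello_blacklist}
```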

MAIL FROM Check

Before calling our milter, sendmail checks a DNS blacklist to
block banned sender domains. We never see a blocked domain.

The MAIL FROM address is saved for possible use by the smart-alias
feature. First, the internal_domains configuration, if defined, is used for
a simple screening. If the MAIL FROM for an INTERNAL connection
is NOT in internal_domains, then it is rejected (the
PC is most likely infected and attempting to send out spam).
If the MAIL FROM for an EXTERNAL connection IS in
internal_domains, then the message is immediately rejected.
This is quick and effective for most small company MTAs. For more
complex mail networks, this check is too simplistic, and internal_domains
should not be defined. SPF will handle the complex cases.
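The internal_domains screening amounts to requiring that "connection is
INTERNAL" and "sender domain is internal" agree. A minimal sketch, with a
hypothetical function name and return values, and with the null sender passed
through as an assumption:

```python
def screen_mail_from(mail_from, internal, internal_domains):
    """Return 'REJECT' or 'CONTINUE' for the simple internal_domains check.

    mail_from: envelope sender, e.g. 'user@example.com'
    internal:  True if the connection was classified INTERNAL
    """
    if not internal_domains or not mail_from:
        # Screening is skipped when internal_domains is not defined.
        # Passing the null sender through is an assumption of this sketch.
        return "CONTINUE"
    domain = mail_from.rpartition("@")[2].lower()
    sender_is_internal = domain in {d.lower() for d in internal_domains}
    # INTERNAL connections must use an internal domain;
    # EXTERNAL connections must not.
    if internal != sender_is_internal:
        return "REJECT"
    return "CONTINUE"
```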

wiretap

The wiretap feature can screen and/or monitor mail to/from certain
users. If the MAIL FROM is being wiretapped, the recipients are
altered accordingly.

SPF check

The MAIL FROM, connect IP, and HELO name are checked against
any SPF records published via DNS for the alleged sender (MAIL FROM)
to determine the official SPF policy result.
The official SPF result is then logged in the Received-SPF header field,
but certain results are subjected to further processing to create
an effective result for policy purposes.

If the official result is 'none', we try to turn it into an effective result of
'pass' or 'fail'. First, we check for a local substitute SPF record
under the domain defined in the [spf]delegate configuration.
It is often useful to add local SPF records for correspondents that are
too clueless to add their own. If there is no local substitute, we use a "best
guess" SPF record of "v=spf1 a/24 mx/24 ptr" for MAIL FROM or "v=spf1 a/24
mx/24" for HELO. In addition, a HELO that is a subdomain of MAIL FROM and
resolves to the connect IP results in an effective result of 'pass'.

If there is no local SPF record, and the effective result is still not
'pass', we check for either a valid HELO name or a valid PTR record for
the connect IP. A valid HELO or PTR cannot look like a dynamic name
as determined by the heuristic in Milter.dynip.

If HELO has an SPF record, and the result is anything but 'pass', we reject
the connection.

Note that HELO does not have any forwarding issues like MAIL FROM, and so
any result other than 'pass' or 'none' should be treated like 'fail'.

Only if nothing about the SMTP envelope can be validated does the effective
result remain 'none'. I call this the "3 strikes" rule.
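One way to read the "3 strikes" rule is as three independent chances to
validate something about the envelope. The sketch below encodes that reading.
The function and parameter names are hypothetical, and the three inputs are
assumed to have been computed already (with "valid" meaning the name resolves
to the connect IP and does not look dynamic).

```python
def strikes_out(spf_guess_result, helo_valid, ptr_valid):
    """Return True if the effective SPF result stays 'none'.

    spf_guess_result: result of the local substitute or best-guess record
    helo_valid:       True if the HELO name was validated
    ptr_valid:        True if the PTR record was validated
    """
    strike_one = spf_guess_result != "pass"
    strike_two = not helo_valid
    strike_three = not ptr_valid
    return strike_one and strike_two and strike_three
```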

If the official result is 'permerror' (a syntax error in the sender's
policy), we use the 'lax' option in pyspf to try various heuristics to guess
what they really meant. For instance, the invalid mechanism "ip:1.2.3.4" is
treated as "ip4:1.2.3.4". The result of lax processing is then used
as the effective result for policy purposes.
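To make the "ip:" example concrete, here is a small stand-alone sketch of that
one heuristic. It is not pyspf's code, and the function name is hypothetical;
it only rewrites a mistyped "ip:" mechanism as "ip4:" and leaves every other
term alone.

```python
def lax_fix(record):
    """Rewrite the invalid 'ip:' mechanism as 'ip4:' in an SPF record."""
    terms = []
    for term in record.split():
        # A mechanism may carry a leading qualifier: +, -, ~ or ?
        qualifier = term[0] if term[0] in "+-~?" else ""
        body = term[len(qualifier):]
        if body.lower().startswith("ip:"):
            body = "ip4:" + body[3:]
        terms.append(qualifier + body)
    return " ".join(terms)
```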

With an effective SPF result in hand, we consult the sendmail access
database to find our receiver policy for the sender.

REJECT

Reject the sender with a 550 5.7.1 SMTP code. The SMTP rejection
includes a detailed description of the problem.

CBV

Do a Call Back Validation by connecting to an MX of the sender
and checking that using the sender as the RCPT TO is not rejected.
We quit the CBV connection before actually sending a message.
If the CBV is rejected, our SMTP connection is rejected with the
same error code and message. CBV results are cached.

DSN

Do a Call Back Validation by connecting to an MX of the sender
and checking that using the sender as the RCPT TO is not rejected.
Unlike a CBV, we continue on to data and send a detailed message
explaining the problem. This can be useful for reporting PermError
or SoftFail to the sender. Keep in mind that for any result other
than 'pass', the sender could be forged, and your DSN could annoy the
wrong person. However, a SoftFail result is requesting such feedback
for debugging and a PermError result needs to be fixed by the sender ASAP
whether forged or not. DSN results are cached so that senders are
annoyed only weekly.

OK

Accept the sender. The message may still be rejected via reputation
or content filtering.

SPF policy syntax

First, the full sender is checked:

SPF-Fail:abeb@adelphia.net DSN

This says to accept mail from that adelphia.net user despite the
SPF fail, but only after annoying them with a DSN about their ISP's broken
policy.

If there is no match on the full sender, the domain is checked:

SPF-Neutral:aol.com REJECT

This says to reject mail from AOL with an SPF result of neutral.
This means AOL users can't use their AOL address with another mail service
to send us mail. This is good because the other mail service is
likely a badly configured greeting card site or a virus.

Finally, a default policy for the result is checked. While there are program
defaults, you should have defaults in the access database for SPF results.
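The lookup order described in this section (full sender, then domain, then a
default for the result) can be sketched with a plain dictionary standing in
for the sendmail access database. Several details are assumptions: the
function name, the lower-casing of keys, and in particular the form of the
default key, written here as the SPF result followed by an empty sender.

```python
def lookup_policy(access, spf_result, sender, program_default=None):
    """Return the receiver policy (e.g. 'REJECT', 'CBV', 'DSN', 'OK').

    access: dict standing in for the access database, keys lower-cased
    """
    domain = sender.rpartition("@")[2]
    prefix = f"SPF-{spf_result}:"
    # Most specific first: full sender, then domain, then the default
    # entry for this result (hypothetical key format: empty sender).
    for key in (prefix + sender, prefix + domain, prefix):
        policy = access.get(key.lower())
        if policy is not None:
            return policy
    return program_default
```

With entries for "SPF-Fail:abeb@adelphia.net" and "SPF-Neutral:aol.com" as in
the examples above, the first returns DSN for that one user and the second
returns REJECT for any aol.com sender with a neutral result.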

Reputation

If the sender has not been rejected by this point, and if a GOSSiP server is
configured, we consult GOSSiP for the reputation score of the sender and
SPF result. The score is a number from -100 to 100 with a confidence
percentage from 0 to 100. A really bad reputation (less than -50 with
confidence greater than 3) is rejected. Note that the reputation is tracked
independently for each SPF result and sender combination. So aol.com:neutral
might have a really bad reputation, while aol.com:pass would be ok.
Furthermore, when a sender finally publishes an SPF policy and starts
getting SPF pass, their reputation is effectively reset.
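The rejection threshold and the per-result keying can be shown in a few lines.
This is a sketch only: the function names are hypothetical, the key format
simply mirrors the "aol.com:neutral" notation used above, and the scores in
the usage note are made up.

```python
def reputation_key(domain, spf_result):
    """Reputation is tracked per sender and SPF result combination."""
    return f"{domain}:{spf_result}"

def reject_for_reputation(score, confidence):
    """Return True for a 'really bad' reputation.

    score:      -100 to 100
    confidence: 0 to 100 (percent)
    """
    return score < -50 and confidence > 3
```

So a hypothetical score of -80 with confidence 10 for
reputation_key("aol.com", "neutral") would be rejected, while the separate
"aol.com:pass" entry is judged on its own score.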

Whitelists and Blacklists

The administrator can whitelist or blacklist senders and sending domains by
appending them to ${datadir}/auto_whitelist.log or
${datadir}/blacklist.log respectively. In addition,
recipients of internal senders (except for automatic replies like vacation
messages and return receipts) are automatically whitelisted for 60 days, and
senders that fail CBV or DSN checks are automatically blacklisted for 30 days.
Whitelisted and blacklisted senders are used to automatically train the
Bayesian content filter before being delivered or rejected, respectively.
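The automatic entries expire after a fixed lifetime. The sketch below keeps
such entries in memory with the time they were added. It says nothing about
the actual format of the log files, the class name is hypothetical, and it
does not model entries added by the administrator.

```python
from datetime import datetime, timedelta

WHITELIST_DAYS = 60  # recipients of internal senders
BLACKLIST_DAYS = 30  # senders that fail CBV or DSN checks

class AutoList:
    """In-memory list of automatic entries that expire after a lifetime."""

    def __init__(self, lifetime_days):
        self.lifetime = timedelta(days=lifetime_days)
        self.entries = {}  # address -> time the entry was added

    def add(self, address, when):
        self.entries[address.lower()] = when

    def contains(self, address, now):
        added = self.entries.get(address.lower())
        return added is not None and now - added < self.lifetime
```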

Real Soon Now, users will be able to maintain their own whitelist and
blacklist that applies only when they are the recipient.

Recipient Check

When the pysrs package is installed and configured,
outgoing mail is "signed" by adding a crypto-cookie to MAIL FROM.
A legitimate DSN (null MAIL FROM) is only ever sent to an address that
was used as MAIL FROM,
so a DSN without a validated cookie in RCPT is immediately rejected.
Forwarded domains can have a list of valid recipients configured,
and invalid recipients are rejected. The MTA rejects invalid local RCPTs.
Four or more invalid RCPTs cause the IP to be blacklisted.
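Two of the rules above are simple enough to sketch directly. The cookie
validation itself is represented by a caller-supplied function, because the
pysrs API is not shown here; all names are hypothetical, and the null sender
is assumed to arrive as an empty string.

```python
def dsn_allowed(mail_from, rcpt, cookie_is_valid):
    """A DSN (null MAIL FROM) needs a validated cookie in RCPT."""
    if mail_from != "":
        return True  # not a DSN, so this rule does not apply
    return cookie_is_valid(rcpt)

class InvalidRcptCounter:
    """Counts invalid recipients seen from one connecting IP."""

    LIMIT = 4  # four or more invalid RCPTs blacklist the IP

    def __init__(self):
        self.count = 0

    def note_invalid(self):
        """Record one invalid RCPT; return True if the IP should be blacklisted."""
        self.count += 1
        return self.count >= self.LIMIT
```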

Content Filter

Most messages have been rejected or delivered by now, but spammers
are always finding new places to send their junk from. For instance,
we get around 10000 emails a day, of which around 500 are first time
spam senders. A Bayesian filter is trained by the whitelists and
blacklists, and scores the message. What is likely spam is either
rejected or quarantined. If the sender is an effective SPF pass,
then they get a DSN notifying them that their message has been
quarantined. (A DSN failure gets the sender auto blacklisted.)
Else, if the reject_spam option is set, the message is rejected.
Otherwise, a CBV is done (failure gets the sender auto blacklisted)
and the message is silently quarantined.
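The handling of a message that the filter scores as likely spam is a
three-way decision. The sketch below returns the steps in order; the function
name and the step labels are hypothetical, and the follow-up blacklisting on a
failed DSN or CBV is only noted in comments.

```python
def spam_actions(effective_spf_pass, reject_spam):
    """Return the ordered steps taken for a message scored as likely spam."""
    if effective_spf_pass:
        # The sender is notified; a DSN failure blacklists the sender.
        return ["DSN", "QUARANTINE"]
    if reject_spam:
        return ["REJECT"]
    # A CBV failure blacklists the sender; otherwise quarantine silently.
    return ["CBV", "QUARANTINE"]
```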

Normally, you don't want email messages to silently disappear into
a black hole, so you should set the reject_spam option. However,
if you don't want your correspondents' email to get rejected, you can
check your quarantine frequently instead.

Honeypot

You can also blacklist recipients by listing them as aliases of the
'honeypot' dspam user. These are collectively called
the honeypot. Any email to these recipients is used to train the
spam filter as spam and chalk up a reputation demerit for the sender, then
discarded. It might be a good idea to blacklist the sender if it has SPF pass
as well, but I'm afraid of accidents.

Reputation

Reputation is tracked by sending domain and effective SPF result.
The GOSSiP server tracks the spam/ham status of the last 1024 messages
for each domain:result combination. When the server is queried during
the SMTP envelope phase (MAIL FROM), it also queries any configured
peers, and the scores are combined. Domains with a history of spam for
a given SPF result are rejected at MAIL FROM. The GOSSiP system has
a command line utility to reset (delete) a reputation for cases where a
sender that was infected with malware is repaired. In addition,
the confidence score of a reputation decays with time, so a bad sender
will eventually be able to try again without manual intervention.
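The "last 1024 messages" bookkeeping can be pictured as a bounded queue per
domain:result key. The sketch below shows only that window and the reset
operation. It is not GOSSiP's implementation, the class and method names are
hypothetical, and it deliberately does not try to reproduce how the score,
confidence, decay, or peer combination are computed.

```python
from collections import defaultdict, deque

WINDOW = 1024  # messages remembered per domain:result combination

class ReputationHistory:
    def __init__(self):
        # Each key keeps only its most recent WINDOW observations.
        self.history = defaultdict(lambda: deque(maxlen=WINDOW))

    def record(self, domain, spf_result, is_spam):
        self.history[f"{domain}:{spf_result}"].append(bool(is_spam))

    def counts(self, domain, spf_result):
        """Return (spam, ham) counts within the current window."""
        seen = self.history.get(f"{domain}:{spf_result}", ())
        spam = sum(seen)
        return spam, len(seen) - spam

    def reset(self, domain, spf_result):
        """Forget a reputation, e.g. after an infected sender is repaired."""
        self.history.pop(f"{domain}:{spf_result}", None)
```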