For most Internet Service Providers and networks in North America, Europe or Australasia, roughly 90% of today's incoming email traffic is spam and only about 10% is legitimate, wanted email. [1]

The problem for mail system administrators is how to filter out the spam while not losing legitimate email, and how to keep mail queues flowing without spam-filter processes impacting performance and slowing the mail queue.

The problem for network executives is how to do this cost-effectively.

This whitepaper sets out how to do it, using existing technology and open-source software such as SpamAssassin.

2-Stage Filtering

You need to look at your mail filtering as a two-stage process, with the fastest and cheapest process in front to reduce the known spam in your incoming mail flow from a torrent to a trickle. Using only the Spamhaus DNS-based Blocklists SBL, XBL and PBL (collectively known as Zen), internet networks can very safely reject at least 85% of inbound mail traffic outright at SMTP connect time and before your mail servers are burdened with having to process it.

This one set of very safe Spamhaus DNSBLs alone does the job of reducing your torrent of incoming spam to a trickle. Now we can deal with the trickle: The remaining spam that has got past the Spamhaus Zen IP blocklist check at SMTP connect time should then be filtered in a second stage using a Spamhaus blocklist designed for this exact purpose: The Spamhaus Domain Block List (DBL).

On an average production mail system, this setup can achieve a catch rate of approximately 299 out of every 300 spams (99+%) with zero false positives.

1st Stage

The first stage is to install the Spamhaus Zen blocklist on your incoming mail relay(s). Zen, which is a combination of Spamhaus's SBL, XBL and PBL blocklists, will identify and reject approximately 85% of an average mail relay's incoming mail traffic. Zen has a long-held reputation as having a false positive rate so low as to be unmeasurable and insignificant even on the largest mail systems. False positives with Zen are extremely rare, which is why Zen is used by almost all of the world's major email providers.

Incoming mail from servers listed on Spamhaus's SBL, XBL or PBL (collectively Zen) at this first stage should be rejected at the RCPT TO command, terminating the SMTP transaction before the message body is sent or accepted.

This is very cost-effective: it safely handles more than three quarters of your incoming mail traffic, drastically cutting your incoming mail bandwidth and the size of your subsequent mail queue. It is also the safe way to handle message filtering, because if a legitimate sender is ever blocked in error they are immediately notified, by their own mail server, of the reason why their message could not be delivered, as well as what to do and who to contact about it. [3]
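
As a rough illustration of the first-stage check, the sketch below shows how a DNSBL query against Zen is usually formed: the octets of the connecting IPv4 address are reversed and prepended to the zone name, and an A-record answer indicates a listing. The function names, the choice to treat only 127.0.x.x answers as listings, and the wording of the SMTP reply are assumptions made for this example. In production this check is normally handled by the mail server's own DNSBL support rather than by custom code.

```python
import ipaddress
import socket

ZEN_ZONE = "zen.spamhaus.org"


def zen_query_name(ip, zone=ZEN_ZONE):
    """Build the DNSBL query name: reversed IPv4 octets followed by the zone."""
    addr = ipaddress.IPv4Address(ip)  # raises ValueError on malformed input
    reversed_octets = ".".join(reversed(str(addr).split(".")))
    return f"{reversed_octets}.{zone}"


def is_listed(ip, zone=ZEN_ZONE):
    """Return True if the zone answers with a 127.0.x.x A record for this IP.

    A failed lookup (for example NXDOMAIN) is treated as "not listed".
    Ignoring answers outside 127.0.x.x is an assumption of this sketch.
    """
    try:
        answer = socket.gethostbyname(zen_query_name(ip, zone))
    except socket.gaierror:
        return False
    return answer.startswith("127.0.")


def rcpt_to_reply(ip, listed):
    """Pick an SMTP reply for RCPT TO (reply text is illustrative only)."""
    if listed:
        return f"550 5.7.1 Client host [{ip}] blocked using {ZEN_ZONE}"
    return "250 2.1.5 OK"
```

A relay would call `is_listed()` with the connecting client's address and send the result of `rcpt_to_reply()` in response to RCPT TO, so a listed sender is turned away before any message body is transferred.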

2nd Stage

Email traffic which gets past the first stage can now be put through more in-depth filters which check message bodies and specifically the domains and IPs of web sites advertised in them.

This stage is done using software [2] capable of rapidly scanning message bodies for Uniform Resource Identifiers (URIs, also known as URLs) and testing these against a URI blocklist such as Spamhaus's Domain Block List (DBL).
The vast majority of spam contains links to spammers' web sites and therefore to domains. These domains are what the Spamhaus DBL is designed to find and filter.

The Spamhaus DBL is an advanced URI blocklist which makes extensive use of Spamhaus technologies to identify domains used in spam in real time. It is extremely effective, has a zero-false-positive reputation and can also be used for SMTP header checks.
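
To make the second-stage check more concrete, here is a minimal sketch of the preparation step: pulling the hostnames out of http(s) URIs in a message body and building the names that would be queried in the DBL zone. The regular expression, the function names, and the decision to query the hostname exactly as it appears (rather than reducing it to a registered domain) are assumptions for illustration. Filters such as SpamAssassin implement this far more thoroughly.

```python
import ipaddress
import re
from urllib.parse import urlsplit

DBL_ZONE = "dbl.spamhaus.org"

# Deliberately simple pattern: http(s) URIs up to whitespace or a closing delimiter.
URI_PATTERN = re.compile(r"""https?://[^\s<>"')]+""", re.IGNORECASE)


def _is_ip_literal(host):
    """True if the host is a bare IPv4 or IPv6 address rather than a domain."""
    try:
        ipaddress.ip_address(host)
        return True
    except ValueError:
        return False


def extract_hosts(body):
    """Return unique, lowercased hostnames from URIs, in order of appearance.

    Bare IP addresses are skipped because the DBL lists domains.
    """
    hosts = []
    for match in URI_PATTERN.finditer(body):
        try:
            host = urlsplit(match.group(0)).hostname
        except ValueError:  # malformed URI, e.g. a broken IPv6 literal
            continue
        if not host:
            continue
        host = host.rstrip(".")
        if host and not _is_ip_literal(host) and host not in hosts:
            hosts.append(host)
    return hosts


def dbl_query_name(host, zone=DBL_ZONE):
    """Build the DNS name to look up for a domain in the DBL zone."""
    return f"{host.rstrip('.')}.{zone}"
```

Each name produced by `dbl_query_name()` would then be resolved with an ordinary DNS A-record lookup, in the same way as an IP blocklist query.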

In addition to checking URI blocklists, URIs can also be checked against the SBL using a feature called URIBL_SBL [4]. Over 50% of spam contains links to web sites whose webserver IPs are listed on the Spamhaus SBL [5]. Spamhaus lists the IPs of spammers' web servers and DNS servers in the SBL for this purpose.
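
The idea behind checking URIs against the SBL can be sketched as follows: resolve each hostname found in the message to its IP addresses, then look those addresses up in the SBL zone. This is not SpamAssassin's implementation of URIBL_SBL. The function names are hypothetical, and the resolver and lookup are passed in as parameters so the logic can be exercised without network access.

```python
import socket

SBL_ZONE = "sbl.spamhaus.org"


def sbl_query_name(ip, zone=SBL_ZONE):
    """Reverse the IPv4 octets and append the zone name."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def resolve_ipv4(host):
    """Return the IPv4 addresses for a hostname, or an empty list on failure."""
    try:
        return socket.gethostbyname_ex(host)[2]
    except socket.gaierror:
        return []


def dns_listed(query_name):
    """True if the query name resolves to a 127.0.x.x answer (sketch assumption)."""
    try:
        return socket.gethostbyname(query_name).startswith("127.0.")
    except socket.gaierror:
        return False


def hosts_on_sbl(hosts, resolve=resolve_ipv4, listed=dns_listed):
    """Return the hosts that have at least one resolved IP listed in the zone."""
    flagged = []
    for host in hosts:
        if any(listed(sbl_query_name(ip)) for ip in resolve(host)):
            flagged.append(host)
    return flagged
```

Injecting `resolve` and `listed` is a design choice for this example only: it keeps the matching logic separate from DNS access, which also makes it easy to add caching or timeouts later.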

Remaining spam, which should now be reduced to less than 0.5% of your total incoming email traffic, can now be taken care of easily by other filter components [2], with the result that your total spam catch rate should now average 99.6%, or 299 in every 300 spams.
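
The figures quoted in this paper can be checked against each other with a little arithmetic. The sketch below works through a hypothetical batch of 1,000 incoming messages using the paper's own percentages (90% spam, 85% of all traffic rejected in the first stage, 1 spam in 300 missed overall). The batch size and the assumption that the first stage rejects only spam are illustrative, not measurements.

```python
def two_stage_breakdown(total=1000, spam_pct=90, stage1_reject_pct=85, missed_one_in=300):
    """Work through the paper's percentages for a batch of `total` messages."""
    spam = total * spam_pct // 100                      # 900 spam messages
    legit = total - spam                                # 100 legitimate messages
    stage1_rejected = total * stage1_reject_pct // 100  # 850 rejected at RCPT TO
    spam_after_stage1 = spam - stage1_rejected          # 50 spam reach stage two
    spam_missed = spam // missed_one_in                 # 3 spam reach the inbox
    return {
        "spam": spam,
        "legit": legit,
        "stage1_rejected": stage1_rejected,
        "spam_after_stage1": spam_after_stage1,
        "spam_missed": spam_missed,
        "catch_rate": (spam - spam_missed) / spam,
        "missed_share_of_total": spam_missed / total,
    }


if __name__ == "__main__":
    for key, value in two_stage_breakdown().items():
        print(f"{key}: {value}")
```

With these inputs, 3 of 900 spam messages get through, which is 0.3% of total traffic (under the 0.5% quoted above) and corresponds to a catch rate of 299 in 300.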
--
Effective Spam Filtering
Updated January 2010

[1] Many Internet networks report the figure is greater than 90% spam, some as high as 96%.

[2] Examples: SpamAssassin

[3] The Spamhaus DNSBLs return a text message (TXT) on a positive hit, giving the URL to the precise record page explaining why the IP is listed and who to contact to get the issue resolved. The Spamhaus XBL and PBL DNSBLs allow end-users to remove their own addresses from the blocklist.

[4] http://wiki.apache.org/spamassassin/Rules/URIBL_SBL
[5] URL on SBL: Out of a sample of 1000 spams tested by Spamhaus in December 2009 using the "URIBL_SBL" feature in SpamAssassin, 647 (64.7%) contained URLs of spammer web sites whose IPs were listed on the SBL.