This is the first part of a two-part series article by Yakov Shafranovich, a co-founder and software architect with SolidMatrix Technologies, Inc., and former co-chair of the Anti-Spam Research Group (ASRG) of the Internet Research Task Force (IRTF). Read Part II of this series.

A long, long time ago, when the Internet was still young and most people were still using clunky Apples, PCs and mainframes, two documents were published by the Advanced Research Projects Agency (ARPA), part of the US Government's Department of Defense. They were called "RFC 821 - Simple Mail Transfer Protocol" and "RFC 822 - Standard for the format of ARPA Internet text messages". Written by Jon Postel and Dave Crocker respectively, who are often referred to as two of the founding fathers of the Internet, they defined a simple text-based email system for the use of the fledgling network then called the "ARPA Internet". The year was 1982: IBM and Apple had just come out with their respective computers, and Microsoft was still a tiny company shipping DOS for IBM's new PC under a contract with "Big Blue". In those days the phrase "Evil Empire" still referred to the Soviet Union in general, and to IBM in the technology industry specifically. Internet standards were developed for the US Government, to be used on its private network, and no one had heard of "open source".

Now fast forward twenty years to 2002: Jon Postel has passed away, the Internet standards process is now run by the IETF under ISOC instead of the DOD, Apple's and IBM's market shares are almost nil, the Soviet Union has collapsed, Linux and open source software now run the majority of the Internet infrastructure, and the phrase "Evil Empire" is now used to refer to Microsoft. But the original two documents defining the email system of what has become a worldwide network, now simply called "the Internet", are still here. Every single email message sent today still follows the original guidelines and format set out over 20 years ago by two scientists working on behalf of the US Government. Of course, adjustments have been made by the IETF over the years, but the basic design and protocols have remained the same.

While the thought of a standard surviving for so long is kind of cute and a testament to the original designers, the network has changed drastically while the standard has remained the same. The original email standards were designed for a private government network with restricted access, on a scale much smaller than what we have today. Viruses, crackers, hackers, spammers, phishers, and other associated evils that go bump in the Internet night were simply not part of the original design. Hence the problems that we have today with email viruses, phishing and spam. The very fabric of the network itself, the email protocols, is now being successfully used for sending spam and phishing messages. Add to this the large scale of today's Internet, with so many network operators who are not always willing to shut abusers out of their networks and who sometimes even get paid by abusers, and we have a sure recipe for disaster. I will leave it to the reader to fill in his or her own favorite additional reasons why spam is so prevalent.

Back in 1997, right after Paul Vixie, the author of BIND, established MAPS and blacklists, he received a message from Jim Miller describing the idea of using DNS to authenticate incoming mail hosts (copy of the message appears here). In 2002, Paul Vixie wrote it up as a draft and posted it to an IETF list (original draft). At the same time, David Green posted a draft with a similar idea (original draft). At the end of 2002, Hadmut Danisch published the first draft of the Reverse Mail eXchanger protocol, otherwise known as RMX. The main idea that all of these proposals pursued was the closure of one of the loopholes in the original design of the email system. The original SMTP protocol provided no way to check whether any of the sender information supplied in the SMTP transaction was actually valid. These proposals added the ability to check whether a specific domain had authorized a given SMTP server to use its name in the MAIL FROM command, which indicates the address to which bounces are sent. However, due to lack of interest, or perhaps more accurately lack of proof that this would actually help reduce spam, no one acted on these proposals and they simply faded away.
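The check that these proposals have in common can be illustrated with a minimal sketch. It assumes a hypothetical in-memory table, `AUTHORIZED_NETWORKS`, standing in for whatever a domain owner would publish in DNS. The function name, the example domain and the addresses are illustrative only and are not taken from any of the drafts.

```python
import ipaddress

# Hypothetical stand-in for the records a domain owner would publish in DNS.
# Real proposals differ in record type and syntax; this table is an assumption.
AUTHORIZED_NETWORKS = {
    "example.com": ["192.0.2.0/24", "198.51.100.17/32"],
}

def check_sender(mail_from: str, client_ip: str) -> str:
    """Classify the connecting SMTP client against the MAIL FROM domain."""
    # Take everything after the last "@" as the domain of the bounce address.
    domain = mail_from.rpartition("@")[2].lower()
    networks = AUTHORIZED_NETWORKS.get(domain)
    if networks is None:
        # The domain has published nothing, so no conclusion can be drawn.
        return "no-policy"
    ip = ipaddress.ip_address(client_ip)
    if any(ip in ipaddress.ip_network(net) for net in networks):
        return "authorized"
    return "unauthorized"
```

A receiving server would run a check like this when it sees the MAIL FROM command, for example `check_sender("alice@example.com", "192.0.2.25")`. What to do with an "unauthorized" or "no-policy" result is a separate policy decision that this sketch leaves open.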

Things would probably have stayed that way except for a few developments. In early 2003, Paul Judge of CipherTrust managed to convince the Internet Architecture Board (IAB) and Vern Paxson, the chair of the Internet Research Task Force (IRTF), to create a research group under the IRTF to fight spam. This group was chartered in March 2003 under the name "Anti-Spam Research Group", or ASRG. The first topic of discussion on the ASRG mailing list was RMX, with a post from Hadmut Danisch. Eventually, Gordon Fecyk would create another proposal called DMP based on RMX, and several other lesser-known proposals would be spawned as well (DRIP, HAAS, MTA MARK, FSV, IMX, etc.). As of May 2003, the ASRG was discussing RMX and its related proposals, which brought all of these ideas closer to the possibility of standardization than ever before. These discussions would continue for at least a year. Nevertheless, the IRTF has no power to create standards, and the chairs of the ASRG were reluctant to send the idea over to the IETF for standardization until it could be better determined that the idea actually had merit. In any case, the idea was within one hop of the IETF standards process, having made its way into the ASRG, which as an IRTF group contributes input to the IETF process. The chances of it actually making it into the IETF were slim at the time, but as fate would have it, things turned out very differently.

In the summer of 2003, Meng Wong, CTO of Pobox.com, saw these proposals and, after trying to negotiate with the original authors, decided to fork his own version. Thus "Sender Permitted From", later renamed "Sender Policy Framework" (also known as SPF), was born. However, Meng had what the others were lacking: the ability to form a community. The SPF website soon followed, together with mailing lists, tools and a small but ever-growing list of contributors. Meng also actively promoted his proposal through a variety of outlets, including the media, mailing lists, the open source community and commercial companies. The popularity of sender authentication, as this set of proposals became known, grew, and the ASRG started a dedicated subgroup (LMAP) to merge all the varied proposals. Unfortunately, that proved to be a mistake, as I realized later on, because the authors of the various proposals were not able to come to an agreement, and also because we were trying to do in the ASRG what is normally done in the IETF: engineering instead of research. At that time, SPF seemed to be the leader of the pack, with the most tools available, the largest community and the biggest deployment.

After the failure of the ASRG subgroup for sender authentication, background discussions began in the IETF about the possibility of chartering a working group. Eventually, a planning BOF session was scheduled for the March 2004 IETF meeting in Seoul. The ASRG provided an overview document as input to the BOF and managed to get all of the proposal authors to provide their drafts as well, including Meng of SPF, who was reluctant at the time to go to the IETF. At the same time, Microsoft had begun circulating its own proposal, Caller-ID, among various ISPs and industry leaders, and had also begun talks with Meng Wong of SPF. In February 2004, right before the Seoul meeting, Microsoft publicly announced its proposal in a speech by Bill Gates at the RSA Conference. The proposal did not differ significantly from the existing ones, except in two respects: the use of XML for encoding data (instead of plain text as SPF did, or new DNS record types as RMX did); and an algorithm for checking Caller-ID information based on email headers inside an email client, as opposed to using the SMTP transaction inside an email server. Microsoft wanted to deploy Caller-ID inside Outlook and Outlook Express, so it came up with this algorithm, which became known as "Purported Responsible Address" or PRA. It was eventually published as a separate Internet draft.
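To make the contrast with the MAIL FROM check concrete, here is a simplified sketch of a header-based selection in the spirit of PRA. The article does not list the headers involved. The precedence used below (Resent-Sender, Resent-From, Sender, From) is an assumption based on how the PRA algorithm was later described, and the sketch omits the edge cases a full implementation would need. The function and constant names are hypothetical.

```python
from email import message_from_string
from email.utils import parseaddr
from typing import Optional

# Assumed precedence, simplified; not the complete PRA algorithm.
HEADER_PRECEDENCE = ["Resent-Sender", "Resent-From", "Sender", "From"]

def purported_responsible_domain(raw_message: str) -> Optional[str]:
    """Return the domain of the first usable header address, or None."""
    msg = message_from_string(raw_message)
    for name in HEADER_PRECEDENCE:
        value = msg.get(name)  # header lookup is case-insensitive
        if not value:
            continue
        # parseaddr splits "Display Name <user@host>" into (name, address).
        _, address = parseaddr(value)
        if "@" in address:
            return address.rpartition("@")[2].lower()
    return None
```

Because this looks only at the message headers, it can run in a mail client long after the SMTP transaction is over. The domain it returns would then be checked against the sending host in the same way as a MAIL FROM domain.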

After the Seoul meeting, the IETF chartered a working group for sender authentication called MARID. The SPF and Caller-ID proposals merged in May 2004, and the combination became known as Sender-ID. By the time the August 2004 IETF meeting in San Diego took place, most of the technical parts of the standard had been worked out. At the time of writing, the MARID WG is undergoing a last call procedure to see whether Sender-ID should be sent to the IESG for approval as a standard. However, things are not as peachy as people think. Many members of the anti-spam community still think that the entire sender authentication set of proposals may not accomplish anything at all, even with the proposed reputation extensions. This, of course, remains to be seen.

Comments

Interesting article. Thanks. Just one minor correction and one comment, if I may…

First - that minor correction: RFC821 and RFC822 were in fact superseded by two new standards track RFCs conveniently numbered RFC2821 and RFC2822 in early 2001. As I read these though, they do not extend SMTP/MIME so much as clarify it, so you are correct that the principles of message content and transport still used today do indeed date back to the early 80s.

My comment: SPF will not reduce spam at all. In fact, that does not seem to me to be what it is intended to do. What it will do, if widely adopted, is put an end to the spoofing of sender addresses so popular with both spammers and the creators of mass-mailing worms. This may be a very good thing, although both types of abuser have repeatedly shown their ability to adapt. There can be little doubt that they will adapt again to an environment in which sender authentication has become widely adopted. Ultimately, therefore, SPF will serve simply to defend the reputations of bona fide Internet users by preventing the use of their domains in spam and worms.

The question of whether and how sender identification is beneficial to the fight against spam is the subject of a paper I presented at the recent Conference on Email and Anti-Spam. If that question interests you, you can find the paper at the following address.

This whole debate is silly. There will never be a universal standard for e-mail authorship verification, no matter what IETF MARID or Microsoft do. Let a thousand flowers bloom. SPF works. Jim Miller's MAIL-FROM (which I documented here) works. Domain holders can put in metadata for all the verification styles they consider relevant, and mailserver administrators can look up the ones they consider relevant. Let's stop debating this and start implementing it. We're bogging down on questions of IPR and authorship when the fact is that a single standard isn't necessary (or possible). It's clear that IETF just can't settle this kind of controversy — but we all know that market forces can.

Agree entirely - and the first step in my opinion is for domain owners to publish SPF or Sender ID (or both) for all their domains, which is exactly what we are doing here.

Just to be clear though - the most immediate benefit will not be a reduction in spam; it will be a reduction in the number of messages, either delivered or bounced, purporting to be from someone other than the true originator. This in itself is a worthwhile outcome.