Jonathon asks if writing the mailto link with javascript is necessary, when his email address is entity-encoded. I’m afraid that it’s a question of whether a kitten is better hidden with its head under the couch or under a blanket: if its rump is still sticking out, it doesn’t much matter.

The problem with most current methods of spam-proofing is that they are designed from a web designer’s point of view rather than a programmer’s, and they are perhaps too influenced by our thoughts about spammers. When a web designer looks at a string like “&#109;&#097;&#105;&#108;&#116;” he sees confusion, and when he thinks about spammers, he naturally thinks of morons. But although spammers certainly are morons (otherwise they wouldn’t offer to enlarge both my breasts and my penis), the authors of spam-harvesting bots aren’t. For the first month or so after someone thought of it, entity- or hex-encoding your email address, or just the @ as Movable Type does in comments, probably worked pretty well. Once you tell the world about it, though, all that string means to a spambot author is that she needs to write another couple of functions to decode hex- and entity-encoded URLs, and a couple of regular expressions to capture the words on either side of &#064; and %40. Hidden by a document.write? She’s not parsing the source for working links, she’s just looking for text on either side of an @ that might make an email address. If you want to hide from her, you’ll have to do something that she can’t write a regular expression to capture.
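To make that concrete, here is a rough sketch of how little work the decoding step might be. It is not taken from any real harvester; the function names and the address pattern are my own assumptions, and a real bot would surely be sloppier or greedier.

```javascript
// Hypothetical sketch of a harvester's decoding step; all names are illustrative.

// Turn numeric character references (&#064; or &#x40;) back into characters.
function decodeEntities(text) {
  return text
    .replace(/&#x([0-9a-f]+);/gi, function (_, hex) {
      return String.fromCodePoint(parseInt(hex, 16));
    })
    .replace(/&#(\d+);/g, function (_, dec) {
      return String.fromCodePoint(parseInt(dec, 10));
    });
}

// Turn percent-escapes such as %40 back into characters.
function decodePercent(text) {
  return text.replace(/%([0-9a-f]{2})/gi, function (_, hex) {
    return String.fromCharCode(parseInt(hex, 16));
  });
}

// Decode first, then grab anything that looks like text on either side of an @.
// The pattern is an assumed, deliberately loose notion of "an email address".
function harvest(source) {
  var decoded = decodePercent(decodeEntities(source));
  return decoded.match(/[\w.+-]+@[\w-]+(?:\.[\w-]+)+/g) || [];
}
```

Feed harvest() the raw page source and it does not care whether the address sat in a link, in plain text, or inside a document.write string, which is the whole point.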

Unless someone can see a problem with it, I think I’ve got a solution that will hide the address from harvesters, work reasonably transparently in most browsers, and still be fairly accessible.

First, you need to get a disposable address, one that you can happily and easily discard if it gets harvested. I favor Spam Motel, which lets you create any number of disposable addresses, and forwards any mail they receive to you until you delete an address, but there are other options (even a Hotmail or Yahoo address would work).

Then, you need a section of javascript in the <head> section of any page where you want an email link, to assemble the pieces of your address:
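A minimal sketch of the idea follows, and it is a reconstruction rather than the exact script used on this site. The pieces ("phil", example.com), the placeholder disposable address, and the onmouseover hookup shown in the comment are all assumptions made for illustration; only the name obfuscate_email() comes from this entry.

```javascript
// Hypothetical pieces of the real address; example.com stands in for a real domain.
var user = "phil";
var host = "example" + "." + "com";

// Build the address at call time so it never appears whole in the source.
// String.fromCharCode(64) is the at sign, kept out of the source on purpose.
function realAddress() {
  return user + String.fromCharCode(64) + host;
}

// Point a link at the real address. `link` is the <a> element, assumed to be
// written with the disposable address as its plain href, for example:
//   <a href="mailto:disposable@example.net" onmouseover="obfuscate_email(this)">
function obfuscate_email(link) {
  link.href = "mailto:" + realAddress();
  // Some browsers let a script set the status bar text; skip it where there
  // is no window object at all.
  if (typeof window !== "undefined") {
    window.status = link.href;
  }
  return true;
}
```

The link in the page keeps the disposable address as its ordinary href, so anything that never runs the script only ever sees that one.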

Then, people without javascript and spambots will get your Spam Motel address, and people with javascript should get your real address (tested in Mozilla, Opera 6, and IE 6 for Windows). Depending on their browser and their settings, they may even see your real address in the status bar: IE shows it, Mozilla shows it unless you have your preferences set to not allow javascript to change the status bar, and Opera is, as usual with javascript, a bit funky. Give it a try (my name in the byline is an email link using this, for now at least), or let me know what I’m missing that makes this harvestable. Note that I’m including obfuscate_email() with <script src="">, since I already had one, and it probably adds a little extra measure of protection: my real address doesn’t actually appear in the source, and a spambot would have to be following <script src=""> links to find it.

One other thing to consider about spam-proofing: I’ve had my @barrysworld address completely available in quite a few pages for around a year now, and I still get less than one piece of spam through it per week. On the other hand, the work address I just got rid of, which was available in a web page several years ago, was getting thirty or forty spams per day. I suspect that there is a whole lot less harvesting, and more reselling of the same old addresses, these days.

<update>Oops, trying to edit this down under a thousand words, I inadvertently cut out the part giving credit for the idea of assembling pieces of email address with javascript, which I lifted wholesale from Hossein’s YACCS code.</update>

This entry was posted on Wednesday, June 26th, 2002 at 2:51 pm and is filed under blogging tech.

7 Comments

Though come to think of it, it might be more accessible to use a form for the link, and replace it with a mailto: in the javascript. After all, people without javascript by choice or by force are also likely to not have an email client tied into their browser.

Just be sure not to use formmail.cgi in a directory named cgi-bin: probably half my 404 traffic comes from spammers prospecting for /cgi-bin/formmail.cgi, /cgi-bin/FormMail.cgi, /cgi-bin/formmail.pl, and /cgi-bin/FormMail.pl. If I could come up with a good sticky trap for a program that thinks it’s filling out a formmail.cgi script, I’d be happy to put something there for them.

I have a solution that has worked fairly well for me for quite a long time: I have no mailto: links on my sites, yet I have clickable links that will open an e-mail client in most browsers.

This is done using simple CSS and a mod_rewrite hack. The mod_rewrite hack sets an HTTP Location: header with a mailto: URL, something spam harvesters seem to ignore but browsers seem to understand (even Lynx does).
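The rewrite rule itself isn't shown above. To illustrate the same redirect idea without guessing at that rule, here is a sketch written as a plain Node-style request handler instead of mod_rewrite. The /contact path and the address are placeholders, and nothing here is the commenter's actual setup.

```javascript
// Hypothetical values: the path and the address are placeholders.
var CONTACT_PATH = "/contact";
var CONTACT_ADDRESS = "someone" + String.fromCharCode(64) + "example.com";

// Node-style handler taking (req, res). It answers the contact path with a
// redirect whose Location header is a mailto: URL, so pages can link to
// /contact and never contain a mailto: link themselves.
function handleRequest(req, res) {
  if (req.url === CONTACT_PATH) {
    res.writeHead(302, { Location: "mailto:" + CONTACT_ADDRESS });
    res.end();
    return;
  }
  res.writeHead(404, { "Content-Type": "text/plain" });
  res.end("Not found");
}
```

A handler shaped like this could be passed to Node's http.createServer(), with pages linking to /contact like any other URL. Whether a given browser follows a redirect to mailto: is up to the browser.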

Some of you may have noticed that in my bio and to the right I’ve posted an obfuscated version of my e-mail address. Basically, the obfuscation is a small javascript that, when the link is clicked, causes your mail client…

Phil Ringnalda has a cool idea for obfuscating email addresses in a web page: he constructs the address dynamically in client script in an onmouseover handler. He credits someone else with the idea, but he’s got a nice clear example.