Earlier this week, Felten made the observation that the government eavesdropping on Lavabit could be considered an insider attack against Lavabit users. This leads to the obvious question: how might we design an email system that’s resistant to such an attack? The sad answer is that we’ve had this technology for decades but it never took off. Phil Zimmermann put out PGP in 1991. S/MIME-based PKI email encryption was widely supported by the late 1990s. So why didn’t it become ubiquitous?

Consider a hypothetical of three Internet users: Alice, Bob, and Charlie. If Alice wants to communicate anonymously with Charlie, she may relay her messages through Bob. While Charlie knows Bob is an intermediary, Charlie does not know with whom he is ultimately communicating. For even greater anonymity Alice can pass her messages through multiple Bobs, and by applying cryptography she can ensure no individual Bob can piece together that she is communicating with Charlie. This basic approach to anonymity is remarkable in its independence of the Internet’s design: it only requires that some Bob(s) can and do run intermediary software. Even on an Internet where users could verify each other’s identity this means of anonymity would remain viable.

The sad state of software security – the latest DHS weekly bulletin alone identified over 40 “high severity” vulnerabilities – is what enables malicious users to exploit the Internet’s indelible capacity for anonymity. Modifying the prior hypothetical, suppose Alice now wants to spam, phish, or hack Charlie, or to launch a denial of service (DoS) attack against him. After compromising Bob’s computer with malicious software (malware), Alice can send emails, host websites, and launch DoS attacks from it; Charlie knows Bob is apparently misbehaving, but has no means of discovering Alice’s role. Nearly all spam, phishing, and DoS attacks are now perpetrated with networks of compromised computers like Bob’s (botnets). According to a July 2009 private sector report, just five botnets sourced nearly 75% of spam at the time of writing. Worse yet, botnets are increasingly self-perpetuating: spam and phishing websites propagate malware that compromises new computers for the botnet.

Shortcomings in authentication, the means of proving one’s identity either when necessary or at all times, are a secondary contributor to the Internet’s ills. Most applications rely on passwords, which are easily guessed or divulged through deception – the very mechanisms of most phishing and account hijacking. There are potential technical solutions that would enable a user to authenticate themselves without the risk of compromising accounts. But any approach will be undermined by weaknesses in underlying software security when a malicious party can trivially compromise a user’s computer.

The policy community is already trending towards acceptance of Internet anonymity and refocusing on software security and authentication; the recent White House Cyberspace Policy Review in particular emphasizes both issues. To the remaining unpersuaded, I can only offer at last a truism: There’s anonymity on the Internet. Get over it.

Today, I got a spam touting a Citrix product (“Free virtualization training for you and your students!”). This message arrived in my mailbox with an unsubscribe link hosted by xmr3.com which bounced me back to a page at Citrix. The Citrix page then asks me for assorted personal information (name, email, country, employer). There was also a mailto link from xmr3 allowing me to opt-out.

At no time did I ever opt into any communication from Citrix. I’ve never done business with them. I don’t know anybody who works there. I couldn’t care less about their product.

What’s wrong here? A seemingly legitimate company is sending out spam to people who have never requested anything from them. They’re not employing any of the tactics that are normally employed by spammers to hide themselves. They’re not advertising drugs for sexual dysfunction or replicas of expensive watches. Maybe they got my email by surfing through faculty web pages. Maybe they got my email from some conference registration list. They’ve used a dubious third party to distribute the spam, one that provides no method for reporting that its client is violating its terms of service (nor can those terms of service be found anywhere on its home page).

Based on this, it’s easy to advocate technical countermeasures (e.g., black-hole treatment for xmr3.com and citrix.com) or improvements to laws (the message appears to be superficially compliant with the CAN-SPAM act, but a detailed analysis would take more time than it’s worth). My hope is that we can also apply some measure of shame. Citrix, as a company, should be embarrassed and ashamed to advertise itself this way. If it ever became culturally acceptable for companies to do this sort of thing, then the deluge of “legitimate” spam would be intolerable.

You can run, but you can’t hide. Here are a few of the latest things I’ve seen, in no particular order.

On a PHPBB-style chat board which I sometimes frequent, there was a thread about do-it-yourself television repair, dormant for over a year. Recently, there was a seemingly robotic post, from a brand new user, that was still on-topic, giving general diagnosis advice and offering to sell parts for TV repair. The spam was actually somewhat germane to the main thread of the discussion. Is it still spam?

In my email, I recently got a press release for a local fried chicken franchise celebrating their 40th anniversary. My blogging output generally doesn’t extend to writing restaurant reviews (tempting as that might be), although I do sometimes link to foodie things from Google Reader which will also show up in my public FriendFeed. Spam or not spam?

I’ve complained about spammers before, but this one is new. I recently received a spam that supports the case of Michael Skelly for Congress, saying negative things about incumbent John Culberson. What’s interesting: this is my home precinct. These people are actually competing for my vote. This leads to the question: how on earth did the Skelly people manage to map my work email address to my home mailing address? Is there a database out there that they used? Maybe they just spammed everybody at my employer, since this particular Congressional district includes our campus; all of our students, in our dorms, who are registered locally will be voting in this particular race.

Part of me wants to bias my voting decision against the idiot candidate who thought that email marketing was a good way to efficiently reach voters. Sadly, that decision will have to be based on more substantial issues, like which candidate I think will perform better in Congress. Instead, I’m going to direct my fire at VerticalResponse, the service provider who the Skelly campaign used to send me the spam. According to their anti-spam policy,

VerticalResponse has no tolerance for the sending of spam and unsolicited mail, and we prohibit the use of third-party, purchased, rented, or harvested mailing lists. Any customer found using VerticalResponse to send such mail is banned from the use of our service.

VerticalResponse takes several steps to keep abuse to a minimum. Among other things, we:

- Interview new clients about both the origins of their mailing lists and their marketing practices. Clients who do not meet our standards are not allowed to use the VerticalResponse service.

…

- Read most emails before they can go out the door. Email sent through our system goes to a staging area where it is looked over by a member of the VerticalResponse staff. If we have any concerns, the mailing is stopped and we contact the client.

Really? I find that impossible to believe. In what way could any reasonable human have decided that a blob of partisan political attack messaging being delivered to what we can only presume is a non-trivial mailing list is, in any way, anything other than gratuitous spam? For the record, I have never supported either the Democratic or Republican parties financially. I am not a member of either party. The only possible way my email address could have been used is that it was either harvested in bulk, along with other Rice email addresses, or perhaps more charitably, if somebody thought “ahh, that Prof. Wallach seems like he’d be interested in political propaganda from our party and/or candidate.” Neither one would appear to be compatible with VerticalResponse’s stated anti-spam policies.

I’ll also note that, while VerticalResponse provides a one-click way for me to opt out of this particular spam source, they provide no way for me to opt out of any other future source or otherwise specify any sort of policy from my end. There’s no way, short of training my spam filter, for me to say “I never want to receive email from VerticalResponse, ever again.” Surely, I figured, I can’t be the first person to complain about them, yet a Google search on any of the usual terms didn’t find anybody else complaining like this.

Instead, I started digging through my historical email. It appears that there have been a handful of VerticalResponse “campaigns” that I considered to be non-spam and have kept. One series of non-spam messages was from a house builder who I thought I might want to use at one point. Another was an update notice for a web service that I use. Historically, I’ve reported one other spam to them, via their abuse email address. They stated, in response, that they removed me from that particular mailing list and would investigate the infraction. I received no subsequent email about the resolution of that case.

Of course, that’s far from everything. Generally, when I get these things, I just click the “unsubscribe” link, retrain my spam filter, and move on with life. I haven’t kept count of how many such spams I’ve treated this way.

I did a similar search through my old mail for ConstantContact, one of VerticalResponse’s competitors. I found not a single email, from them to me, that I had kept, although several were forwarded to mailing lists that I archive, so those I kept. I have no records of having ever contacted their abuse department.

Does this mean that one vendor is more spammy than the other, does it mean that one vendor just has more market share than the other, or does it mean that my spam filter is removing more of this stuff before I have to look at it? It’s hard to say without more data.

Okay, big policy question: given that political campaigns and everybody else on the marketing side of the equation deeply loves the idea of targeted email marketing campaigns, how should we accommodate them? Should they be required to provide better proof to firms like VerticalResponse or ConstantContact that their email addresses were harvested in some proper fashion? How on earth could they actually do such a thing? Short of having users opt-in directly at the email distribution service, everything else boils down to the email service taking the marketer at their word, which seems about as likely to be true as those “no documentation required” mortgages.

Maybe the answer is for “ethical” email distributors to pay fees, per message, perhaps as a government tax. Call it “spam postage”, and tweak the fee structure so the sender ends up paying more money when the recipient hits the “unsubscribe” or “abuse” button. First off, by adding a real monetary expense to the process, senders might be incentivized to reduce their mailing lists. The penalties incentivize them to cull their lists down to their true supporters. The only problem with a structure like this is that it tends to push email marketers away from “ethical” email distribution services and toward either do-it-yourself solutions or toward shady vendors who don’t charge the postage fees. (And, we all know that the real-money postage costs of physical mail do seemingly little to deter all the paper spam that we receive.)
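To make the incentive concrete, here is a minimal sketch of how such a fee structure might be computed. Every rate below is a made-up placeholder, and `PostageRates` and `campaign_cost` are hypothetical names chosen for illustration; the post does not propose specific prices.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PostageRates:
    """Hypothetical "spam postage" rates, in dollars. All values are placeholders."""
    per_message: float = 0.001       # flat fee for every message sent
    per_unsubscribe: float = 0.05    # penalty when a recipient unsubscribes
    per_abuse_report: float = 0.50   # larger penalty when a recipient reports abuse


def campaign_cost(sent: int, unsubscribes: int, abuse_reports: int,
                  rates: PostageRates = PostageRates()) -> float:
    """Total postage owed for one mailing under the assumed rates."""
    return (sent * rates.per_message
            + unsubscribes * rates.per_unsubscribe
            + abuse_reports * rates.per_abuse_report)


# A small list of genuine supporters versus a large harvested list.
supporters = campaign_cost(sent=1_000, unsubscribes=5, abuse_reports=0)
harvested = campaign_cost(sent=100_000, unsubscribes=2_000, abuse_reports=500)
print(f"supporters: ${supporters:.2f}, harvested: ${harvested:.2f}")
```

Under these assumed numbers the harvested list costs several times more per message than the supporter list, which is the pressure toward culling that the proposal relies on. It does nothing, of course, about senders who route around the fee entirely.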

For better or for worse, we’ll never get rid of email spam. Maybe we can filter out recurring messages from Nigerian dictators or overseas pharmacies, but no training-based spam filter is going to be able to learn every new thing to come down the block when it’s still new. The only thing that will ever truly work is if and when people just stop paying attention.

[Sidebar: so how should a political campaign effectively reach people like me to convey their message? I tend to go out and surf their web sites, read their policy papers, and I pay attention to the endorsements of newspapers, bloggers, and others who I trust. For the "down-ballot" races, I tend to spend some quality time with the non-partisan League of Women Voters guide. The LWV asks candidates to respond to a variety of relevant questions, but space constraints limit the answers. An online version could presumably give the candidates space to really explain their positions (and/or firmly demonstrate their lack of clue). At the end of all that, I make a cheat sheet with my favorite candidates and bring it with me to the polls.]

My phone at work rings. The caller ID has a weird number (“50622961841” – yes, it’s got an extra digit in it). I answer. It’s a recording telling me I can get lower rates on my card (what card?) if I just hit one to connect me to a representative. Umm, okay. “1”. Recorded voice: “Just a moment.” Human voice: “Hello, card center.”

At this point, I was mostly thinking that this was unsolicited spam, not a phishing attack. Either way, I knew I had a limited time to ask questions before they’d hang up. “Who is this? What company is this?” They hung up. Damn! I should have played along a little further. I imagine they would have asked for my credit card number. I could have then made something up to see how far the interaction would go. Oh well.

Clearly, this was a variant on a credit card phishing attack, except instead of an email from a Nigerian dictator, it was a phone call. I’m sure the caller ID is total garbage, although that, along with the demon-dialer, says that the scammer has some non-trivial infrastructure in place to make it happen.

So, the next time one of you receives an unsolicited call offering to get you lower rates on your card, please do play along and feed them random numbers when they ask for data. At the very least, there’s some entertainment value. If you’re lucky, you might be able to learn something that would be useful to mount a criminal investigation. Maybe half-way through you could suddenly have an important meeting to get to and see if you can get them to give you a callback phone number.

Update: reader “anon” points to an article from The Register that discusses this in more detail.

ZDNet’s “Zero Day” blog has an interesting post on the gray-market economy in solving CAPTCHAs.

CAPTCHAs are those online tests that ask you to type in a sequence of characters from a hard-to-read image. By doing this, you prove that you’re a real person and not an automated bot – the assumption being that bots cannot decipher the CAPTCHA images reliably. The goal of CAPTCHAs is to raise the price of access to a resource, by requiring a small quantum of human attention, in the hope that legitimate human users will be willing to expend a little attention but spammers, password guessers, and other unwanted users will not.

It’s no surprise, then, that a gray market in CAPTCHA-solving has developed, and that that market uses technology to deliver CAPTCHAs efficiently to low-wage workers who solve many CAPTCHAs per hour. It’s no surprise, either, that there is vigorous competition between CAPTCHA-solving firms in India and elsewhere. The going rate, for high-volume buyers, seems to be about $0.002 per CAPTCHA solved.

I would happily pay that rate to have somebody else solve the CAPTCHAs I encounter. I see two or three CAPTCHAs a week, so this would cost me about twenty-five cents a year. I assume most of you, and most people in the developed world, would happily pay that much to never see CAPTCHAs. There’s an obvious business opportunity here, to provide a browser plugin that recognizes CAPTCHAs and outsources them to low-wage solvers – if some entrepreneur can overcome transaction costs and any legal issues.
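The yearly figure follows from simple multiplication. A quick check, assuming 52 weeks in a year and the $0.002 rate quoted above (the function name `yearly_cost` is just for illustration):

```python
PRICE_PER_CAPTCHA = 0.002  # dollars, the high-volume rate quoted in the post
WEEKS_PER_YEAR = 52


def yearly_cost(captchas_per_week: float) -> float:
    """Dollars per year to outsource every CAPTCHA at the quoted rate."""
    return captchas_per_week * WEEKS_PER_YEAR * PRICE_PER_CAPTCHA


low, high = yearly_cost(2), yearly_cost(3)
print(f"between ${low:.2f} and ${high:.2f} per year")
```

Two to three CAPTCHAs a week works out to roughly 21 to 31 cents a year, consistent with the estimate of about twenty-five cents.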

Of course, the fact that CAPTCHAs can be solved for a small fee, and even that most users are willing to pay that fee, does not make CAPTCHAs useless. They still do raise the cost of spamming and other undesired behavior. The key question is whether imposing a $0.002 fee on certain kinds of accesses deters enough bad behavior. That’s an empirical question that is answerable in principle. We might not have the data to answer it in practice, at least not yet.

Another interesting question is whether it’s good public policy to try to stop CAPTCHA-solving services. It’s not clear whether governments can actually hinder CAPTCHA-solving services enough to raise the price (or risk) of using them. But even assuming that governments can raise the price of CAPTCHA-solving, the price increase will deter some bad behavior but will also prevent some beneficial transactions such as outsourcing by legitimate customers. Whether the bad behavior deterred outweighs the good behavior deterred is another empirical question we probably can’t answer yet.

On the first question – the impact of cheap CAPTCHA-solving – we’re starting a real-world experiment, like it or not.

Today marks the 30th anniversary of (what is reputed to be) the first spam email. Here’s the body of the email:

DIGITAL WILL BE GIVING A PRODUCT PRESENTATION OF THE NEWEST MEMBERS OF THE DECSYSTEM-20 FAMILY; THE DECSYSTEM-2020, 2020T, 2060, AND 2060T. THE DECSYSTEM-20 FAMILY OF COMPUTERS HAS EVOLVED FROM THE TENEX OPERATING SYSTEM AND THE DECSYSTEM-10 (PDP-10) COMPUTER ARCHITECTURE. BOTH THE DECSYSTEM-2060T AND 2020T OFFER FULL ARPANET SUPPORT UNDER THE TOPS-20 OPERATING SYSTEM. THE DECSYSTEM-2060 IS AN UPWARD EXTENSION OF THE CURRENT DECSYSTEM 2040 AND 2050 FAMILY. THE DECSYSTEM-2020 IS A NEW LOW END MEMBER OF THE DECSYSTEM-20 FAMILY AND FULLY SOFTWARE COMPATIBLE WITH ALL OF THE OTHER DECSYSTEM-20 MODELS.

WE INVITE YOU TO COME SEE THE 2020 AND HEAR ABOUT THE DECSYSTEM-20 FAMILY AT THE TWO PRODUCT PRESENTATIONS WE WILL BE GIVING IN CALIFORNIA THIS MONTH. THE LOCATIONS WILL BE:

THURSDAY, MAY 11, 1978 – 2 PM
DUNFEY’S ROYAL COACH
SAN MATEO, CA
(4 MILES SOUTH OF S.F. AIRPORT AT BAYSHORE, RT 101 AND RT 92)

A 2020 WILL BE THERE FOR YOU TO VIEW. ALSO TERMINALS ON-LINE TO OTHER DECSYSTEM-20 SYSTEMS THROUGH THE ARPANET. IF YOU ARE UNABLE TO ATTEND, PLEASE FEEL FREE TO CONTACT THE NEAREST DEC OFFICE FOR MORE INFORMATION ABOUT THE EXCITING DECSYSTEM-20 FAMILY.

This is relatively mild by the standards of today’s spam. The message announced legitimate events relating to legitimate products in which the recipients might plausibly be interested. The sender was apparently unaware that this kind of message was against the rules.

Yet this message has much in common with today’s spam. The message used ALL CAPS, which was more common in those days but not the universal practice for email. The list of recipients was long. The message was incorrectly formatted – the original had more recipients than the email software of the day could handle, so what was supposed to be the recipient list actually spilled over into the body of the email, apparently unnoticed by the sender.

At the time, the Net’s rules forbade commercial activity, so the message was against the rules. Beyond the rule violation, the message’s propriety was widely questioned, and people debated what to do about it. (Brad Templeton has posted parts of the debate.)

Thirty years later, there is more spam than ever and no end is in sight. This shouldn’t be surprising, because the spam problem is fundamentally driven by economics. If anyone can send to anyone, and the cost of sending is nearly zero, many messages will be sent. Distinguishing unwanted email from wanted email is notoriously difficult – often you have to read a message to decide whether reading it was a waste of time. In this environment, spam will be a fact of life. The surprise, if anything, is that we have done as well as we have in coping with it.

I’m sure this sort of behavior is old news, but it’s still really annoying. Starting last night and continuing as I’m writing this, some annoying spammer has been forging my email address as the “From” line of a variety of spams. This is causing a staggering volume of backscatter, mostly of the “Delivery Status Notification (failure)” variety. Sampling these messages, I’m seeing several interesting things.

The spammer is using my proper email address (dwallach@…) on each message, but a different “real” name on each one. The name “Dan Wallach” does not appear anywhere.

I forward everything to Gmail. Gmail considers all of this backscatter to be spam. That’s probably the correct answer, but I’m not sure I want to train my own DSPAM to do the same thing. (DSPAM runs locally, and then I save a local copy and forward to Gmail.) If I send a real message and it legitimately bounces, I want to know about it. If I train DSPAM that all of these delivery status notifications are spam, it will inevitably throw away anything from “mailer-daemon”. I’m unclear on whether that’s good or bad.

You could easily build a bounce-message validator. Every backscatter seems to have the original message ID in it, somewhere. If the backscatter mentions a message ID that my system actually generated, then the backscatter is allowed. Otherwise it’s dropped. (This idea appears to be a variation of VERP; I’d make the message ID be a keyed MAC of a sequence number.)
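Here is a minimal sketch of that validator, assuming outgoing message IDs take the form `<seq.tag@domain>` where the tag is a truncated HMAC-SHA256 of the sequence number. The key, the domain, the ID layout, and the function names are all assumptions made for illustration, not a description of any existing mail system.

```python
import hashlib
import hmac
import re

# Placeholders: a real deployment would load a long random key from secure storage.
SECRET_KEY = b"replace-with-a-long-random-secret"
DOMAIN = "example.org"


def _tag(seq: int) -> str:
    """Keyed MAC of a sequence number (HMAC-SHA256, truncated to 16 hex chars)."""
    digest = hmac.new(SECRET_KEY, str(seq).encode("ascii"), hashlib.sha256)
    return digest.hexdigest()[:16]


def make_message_id(seq: int) -> str:
    """Message-ID to stamp on outgoing mail number `seq`."""
    return f"<{seq}.{_tag(seq)}@{DOMAIN}>"


_ID_PATTERN = re.compile(r"<(\d+)\.([0-9a-f]{16})@" + re.escape(DOMAIN) + r">")


def bounce_is_genuine(bounce_text: str) -> bool:
    """True if the bounce quotes at least one message ID that we generated."""
    for seq, tag in _ID_PATTERN.findall(bounce_text):
        # compare_digest avoids leaking the expected tag through timing.
        if hmac.compare_digest(tag, _tag(int(seq))):
            return True
    return False
```

Because the tag can only be computed with the key, a spammer forging the “From” address cannot produce an ID that validates, and the server needs no database of sent messages. One limitation of this sketch: a bounce that strips or rewrites the original message ID would be dropped even if it were legitimate.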

A large number of these spams have a message body consisting entirely of “Take a look at yourself :)” and linking to “video.exe” on a variety of different web sites. Gmail helpfully rewrites those links such that they can track that I clicked on it. This would also seem to give them an opportunity to give me an anti-virus warning, but they don’t do any such thing. (“video.exe” is one of the common names used by the Storm worm.)

Many spams include links that redirect through Google’s PageAd server to yet another server. I clicked on one of them. It appears that the PageAd redirector worked, but then Firefox’s “badware” detector caught the destination as being bad, ultimately taking me to stopbadware.org. Go Firefox!

Some legit antispam firewall products (including Barracuda) are helpfully telling me my message “was blocked by our Spam Firewall. The email you sent with the following subject has NOT BEEN DELIVERED”. This is clearly broken behavior. Just drop it and move on!

Several of the backscatter messages are actually validation messages (sender address verification). This has been largely discredited due to a variety of practical problems, never mind common-case annoyance to normal users.

One of the spammers seems to be quite keen to sell replicas of expensive wristwatches, and those links take you to some kind of seemingly real online store, albeit with a funky DNS name. Somehow, even if I did want a fake expensive watch, I’m not sure I’d be comfortable typing my credit card number into a web site whose name is a list of random characters and who (clearly) is closely related to the underworld of lecherous spammers.

I love spammers, really I do. Some of you may recall my earlier post here about freezing your credit report. In the past week, I’ve deleted two comments that were clearly spam and that made it through Freedom to Tinker’s Akismet filter. Both had generic, modestly complimentary language and a link to some kind of credit card application processing site. What’s interesting about this? One of two things.

Akismet is letting those spams through because their content is “related” to the post.

Or more ominously, the spammer in question is trolling the blogosphere for “relevant” threads and is then inserting “relevant” comment spam.

If it’s the former, then one can certainly imagine that Akismet and other such filters will eventually improve to the point where the problem goes away (i.e., even if it’s “relevant” to a thread here, if it’s posted widely then it must be spam). If it’s the latter, then we’re in trouble. How is an automated spam catcher going to detect “relevant” spam that’s (statistically) on-topic with the discussion where it’s posted and is never posted anywhere else?
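The “posted widely” heuristic from the first case could look something like the sketch below. This is only an illustration of the idea and makes no claim about how Akismet works; the class name, the normalization rule, and the threshold are all assumptions.

```python
import hashlib
import re
from collections import defaultdict


class WidePostingDetector:
    """Flags a comment once near-identical text has appeared on too many sites."""

    def __init__(self, max_sites: int = 3):
        self.max_sites = max_sites  # assumed threshold, chosen arbitrarily
        self._sites_by_fingerprint = defaultdict(set)

    @staticmethod
    def fingerprint(text: str) -> str:
        # Lowercase and collapse punctuation and whitespace so trivial edits still match.
        normalized = re.sub(r"\W+", " ", text.lower()).strip()
        return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

    def observe(self, site: str, text: str) -> bool:
        """Record a comment; return True if it now looks like widely posted spam."""
        sites = self._sites_by_fingerprint[self.fingerprint(text)]
        sites.add(site)
        return len(sites) > self.max_sites
```

A detector like this only ever fires on repetition across sites. A comment written for one thread and posted nowhere else produces a fingerprint seen on a single site, so it passes, which is exactly the worrying second case.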

Freedom to Tinker is hosted by Princeton's Center for Information Technology Policy, a research center that studies digital technologies in public life. Here you'll find comment and analysis from the digital frontier, written by the Center's faculty, students, and friends.