Why Do Not Track may not protect anybody’s privacy

Microsoft ruffled feathers in the online privacy community this week by announcing that Internet Explorer 10 would enable Do Not Track technology by default. Many have lauded the move as an instance of Microsoft putting consumers’ interests above those of behavioral advertisers — which, ironically, includes itself. However, Microsoft’s stance may run afoul of the W3C committee that’s actually drafting the Do Not Track standard. Right now, they’re saying browser makers must only send a Do Not Track signal with a user’s explicit consent.

What is Do Not Track and how can it protect your privacy? And is Microsoft’s stance about protecting consumers…or whittling away at Google’s dominant position in the online advertising world?

What is Do Not Track?

Do Not Track is a proposed (emphasis mine) technology standard intended to enable individual Web users to express whether or not they consent to having their online activities monitored and collated, mostly for the purpose of being served targeted advertising. It was originally proposed back in 2009 by Christopher Soghoian, Sid Stamm, and Dan Kaminsky in the wake of the Federal Trade Commission indicating it was looking into the idea of implementing a “Do Not Track” list similar to the “Do Not Call” list that has been reasonably successful in letting consumers opt out of telemarketing calls in the United States. However, Do Not Track is not backed by any legislative or regulatory authority: it’s purely a voluntary effort from the technology community, and one many hope will help stave off any government involvement in consumer tracking issues. Do Not Track is not yet a finalized standard: as with most things at the W3C, progress is slow as working groups assemble, stakeholders weigh in, and drafts get circulated.

At a very basic level, Do Not Track is elegantly simple. If Do Not Track is active, a user’s Web browser sends a single HTTP header to remote servers along with every request for pages, images, and any other constituent items that make up a Web page. Whenever you load a Web page, your browser sends a flurry of headers to the remote system indicating not just the specific page you want, but what types of media you can handle, your preferred languages, any cookies the site had previously set for you, information about your Web browser, and more.

The Do Not Track header is called, logically enough, DNT. If the value of that header is “1,” the header serves as a signal to the server that the user does not wish to be tracked. If the value is “0,” it means the user consents to being tracked. If the header is missing, it means the user isn’t supplying any preference at all about tracking.
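To make the three cases concrete, here is a minimal sketch in Python of how a server might read the header. The function name and the three labels it returns are invented for illustration; they do not come from the draft specification, and treating an unexpected value like a missing header is an assumption of this sketch.

```python
from typing import Mapping


def interpret_dnt(headers: Mapping[str, str]) -> str:
    """Map a request's DNT header to one of three illustrative labels."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    value = normalized.get("dnt")
    if value == "1":
        return "opt-out"      # the user does not wish to be tracked
    if value == "0":
        return "consent"      # the user consents to being tracked
    # Header missing: no preference expressed. Lumping any unexpected
    # value in with "missing" is an assumption of this sketch.
    return "no-preference"
```

A call such as `interpret_dnt({"DNT": "1"})` yields the opt-out label, while an empty header dictionary yields the no-preference label.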

There’s a lot more to Do Not Track — it’s rolled into a larger specification called “Tracking Preference Expression” which includes JavaScript APIs, server responses, machine-readable policies, and much more. But that 0 and 1 are the basics.

Simple isn’t easy

The simplicity of implementing this Do Not Track functionality has led many browser makers to jump the gun on the standards process and roll out initial support before the standard is complete. (This is pretty common with Web technologies — so much so that it’s the status quo with HTML5.) Microsoft’s Internet Explorer 9 was the first to offer Do Not Track support, with Firefox, Safari, and Opera quickly following. Google Chrome is the only major browser that doesn’t currently implement Do Not Track, but Google does offer the “Keep My Opt-Outs” Chrome extension to users for free.

But what’s simple for browsers isn’t simple for sites. There’s nothing magic about those DNT headers that stops Web sites and online advertisers from tracking users. Sites have to specifically code support for honoring the intentions expressed by Do Not Track headers, and this is where Do Not Track can get astonishingly complicated. Unless Web sites and services specifically change their practices, turning on Do Not Track in a Web browser will do absolutely nothing to protect users’ privacy.
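As a rough illustration of what “coding support” could look like, here is a small Python WSGI application that declines to set a visitor cookie when a request carries DNT: 1. The cookie name `visitor_id` and the decision that honoring the header means “set no cookie” are assumptions made for this sketch; the draft specification does not prescribe either, and a real site would have far more than one cookie to worry about.

```python
import uuid


def app(environ, start_response):
    """Tiny WSGI app that skips its (hypothetical) tracking cookie on DNT: 1."""
    headers = [("Content-Type", "text/plain; charset=utf-8")]
    # WSGI exposes the request header "DNT" under the environ key "HTTP_DNT".
    if environ.get("HTTP_DNT") == "1":
        body = b"not tracking this request\n"
    else:
        # Hypothetical cookie name; only issue an ID if the browser has none.
        if "visitor_id=" not in environ.get("HTTP_COOKIE", ""):
            cookie = f"visitor_id={uuid.uuid4().hex}; Path=/"
            headers.append(("Set-Cookie", cookie))
        body = b"tracking as usual\n"
    start_response("200 OK", headers)
    return [body]
```

To try it locally, the standard library’s `wsgiref.simple_server.make_server("", 8000, app)` can serve the application. Note that nothing here touches data the site already holds: the header only changes what this one response does.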

Right now, there’s little agreement on what setting a Do Not Track value of “1” should mean for Web sites — and different sites are likely to interpret it differently, if they bother to interpret it at all. Should sites not keep any record of visitors? If so, sites won’t even know what their users like and don’t like even in general terms; furthermore, many sites cannot function properly unless they’re allowed to set cookies, allow people to set up accounts and log in, or present personalized information. For many Web sites, being able to present personalized content is the entire point — and that can’t be done if they can’t keep track of users.

Do Not Track is not intended to break personalization features and things like site log-ins: after all, users are theoretically opting in to having the site serve up customized data by signing up in the first place. So if a user doesn’t like the way a site they visit collects and uses data about them, Do Not Track is not going to change anything.

No ads were harmed serving this page

So what are users opting out of with Do Not Track? Ads? Well…no. Web site operators tend to look at visiting a site as a voluntary thing, complete with a tacit understanding that users may be shown advertisements simply by visiting. In all probability, these sites will continue to use everything they know about visitors (from what browser they’re using, to their rough geographic location obtained by IP address, to any information associated with an account they’ve logged in to) to serve up their own ads, regardless of a user’s Do Not Track setting.

Similarly, Do Not Track is unlikely to have any impact on social plug-ins that pull in content or controls from services like Facebook, Twitter, Pinterest, Google+, or any other social services. If users have an account with those services, they will likely consider themselves a “first party” — even if they’re embedded in someone else’s site — and track users with Do Not Track set just as thoroughly as anyone else.

Where Do Not Track should have an impact is on third-party advertising. Most Web sites do not manage all the advertising that appears on their pages; instead, they subscribe to ad networks and services that automatically display ads they feel will be appropriate. The sites get a cut of the revenue from anyone who happens to click.

To target ads to those users, those third party ad networks also try to collect every iota of information they can about site visitors: the idea is that the more in tune an ad is with a user’s interests and tastes, the more likely they are to click. Usually, targeting starts off with the same sort of general details in HTTP headers: browser, IP address, etc. However, each of these third party ad networks will also try to store cookies on the browser for later reference. If that same browser later visits another site serviced by the same ad network, it will be recognized and the advertising service now knows “Ah, in addition to being near Arlington, Virginia, and loading pages about Web browser privacy, this browser also visits pages related to My Little Pony. How interesting.” Suddenly the ad network knows not just technical details of a browser, but potentially very personal information about its user. (Don’t think so? Substitute “HIV testing” or “bankruptcy attorney” for My Little Pony, above.) Suddenly, ads served to that browser by that ad company may take on a very different character.
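The mechanism is easy to model. The toy Python class below stands in for a third-party ad network; its name, its methods, and the cookie format are all invented for illustration, and real networks are vastly more elaborate. The point is only that one cookie, presented on two unrelated sites, is enough to join two visits into a single profile.

```python
from collections import defaultdict
from itertools import count


class AdNetwork:
    """Toy third-party ad network that links page visits through one cookie."""

    def __init__(self):
        self._ids = count(1)
        self.profiles = defaultdict(set)   # cookie ID -> topics seen so far

    def serve_ad(self, cookie, page_topic):
        # First contact: hand the browser a new cookie ID.
        if cookie is None:
            cookie = f"browser-{next(self._ids)}"
        # Every later request with that cookie adds to the same profile.
        self.profiles[cookie].add(page_topic)
        return cookie


network = AdNetwork()
cookie = network.serve_ad(None, "web browser privacy")   # ad embedded on site A
cookie = network.serve_ad(cookie, "My Little Pony")      # ad embedded on site B
print(sorted(network.profiles[cookie]))
```

Neither site A nor site B needs to know about the other; the ad network sees both requests and does the joining itself.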

In theory, turning on Do Not Track should tell third party advertising networks “this user does not want to be tracked; do not use any information associated with this browser to target ads.” That doesn’t mean the ads go away: it just means the ad network won’t use any information it may have collected to help decide what ad to serve. It’ll likely make a decision based on what it knows about the first-party site you’re visiting, what kind of device you’re using, and things like your IP address.
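In code, that theoretical behavior might look something like the following Python sketch. The function, its parameters, and the ad inventory are hypothetical, and the rule it applies (ignore the stored profile when DNT is “1,” otherwise use it) is one plausible reading of the paragraph above, not anything an actual ad network has committed to.

```python
def choose_ad(dnt, profile_interests, page_topic, ads_by_topic,
              default_ad="generic ad"):
    """Pick an ad, consulting the stored profile only when DNT is not "1"."""
    if dnt != "1":
        # Behavioral targeting: try interests collected on earlier visits.
        for interest in profile_interests:
            if interest in ads_by_topic:
                return ads_by_topic[interest]
    # Contextual fallback: rely only on the page being viewed right now.
    return ads_by_topic.get(page_topic, default_ad)


# Hypothetical inventory and profile, for illustration only.
ads = {"My Little Pony": "toy ad", "web browser privacy": "security software ad"}
interests = ["My Little Pony"]
with_dnt = choose_ad("1", interests, "web browser privacy", ads)
without_dnt = choose_ad(None, interests, "web browser privacy", ads)
```

In both cases an ad is served. The only difference is whether the profile built from earlier visits had a say in choosing it.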

Does Do Not Track protect anybody?

It’s not clear what turning on Do Not Track will actually do to protect users’ privacy. For instance, most people have been using the Web for years, and third parties have accumulated detailed dossiers on tens of millions of Internet users. Does turning on Do Not Track mean those parties suddenly delete all that information?

Probably not. If there’s one thing ad networks hate doing, it’s deleting data. Heck, even deleting data from Facebook is nearly impossible for users: by default, Facebook just squirrels it away so folks can re-activate it later if they change their mind. Ad networks tend to feel the same way.

As yet, there’s no industry-wide consensus on exactly how sites should respond to a Do Not Track flag. Can they keep and store data from requests labelled Do Not Track and use that data in aggregate? Yes, according to the current draft Do Not Track specification. What about third parties that are acting exclusively on behalf of a first party — say, to provide detailed analysis of their users? According to the draft, they can ignore Do Not Track. What about a service that provides a shopping cart or media streaming for a site — are they allowed to track information about users even if they set a Do Not Track flag? That’s a grey area.

There are others. What about analytics companies that try to track site visitors as verification to advertisers that sites get as much traffic as they say? Those companies are likely to collect as much information as they can regardless of Do Not Track settings, using the reasoning that they’ve been contracted by the first-party sites. Don’t be surprised if some advertisers use similar reasoning.

What about third-party utilities? If a user installs an antivirus or security package, can it legitimately turn on Do Not Track unless it gets a user’s specific approval? Would simply installing the package constitute consent?

And here’s another wrinkle. Support for Do Not Track will be entirely voluntary and, essentially, based on the honor system. There is no provision for either government or industry to audit how sites comply with or interpret Do Not Track, and it’s entirely possible for bad actors to set up shop saying “We honor Do Not Track!” while, in fact, they do no such thing. Fundamentally, Do Not Track merely expresses a preference: nothing requires Web sites to honor users’ wishes. You can set a Web browser’s Do Not Track settings as often as you like: if Web sites don’t honor it, they (and anyone they do business with) are still following your every move.

And there’s another problem: users who don’t want to be tracked will have to find and set the Do Not Track setting in every browser they use: desktop computer, notebook computer, smartphone, tablet — everywhere. That’s a lot of fiddly detail and configuration that most users won’t follow through with — and helps illustrate why the choice of a default setting for Do Not Track is so important.

Microsoft and IE10

“Consumers should be empowered to make an informed choice and, for these reasons, we believe that for IE10 in Windows 8, a privacy-by-default state for online behavioral advertising is the right approach,” wrote Microsoft’s chief privacy officer Brendon Lynch.

The wrinkle in this statement is that the “group consensus” in the W3C’s user-tracking working group is that Web browsers must not transmit either opt-in or opt-out Do Not Track information without explicit user consent. If Microsoft ships IE10 with Do Not Track enabled by default and if the final Do Not Track specification retains the same language as the draft, Microsoft will be in the ironic position of being out of compliance with the Do Not Track spec. Those are two significant “ifs,” and the spec isn’t expected to be finalized until late 2012. Change one word in the draft, and Microsoft would be in compliance.

Choosing not to decide

The Do Not Track specification allows for two Do Not Track values: zero and one, corresponding to a user’s choice to opt in to or out of tracking. However, there’s in fact a third choice: don’t send any information at all. The Do Not Track spec defines this as a “no preference” setting. It is the de facto setting for every browser on the Web today, and it is the W3C working group’s current consensus for what should be the default value once the standard is completed. Browsers wouldn’t send any Do Not Track information unless a user specifically goes into the browser preferences, finds the setting, and turns it on or off.
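From the browser’s side, the three states can be sketched in a few lines of Python. The function name and the choice to model the stored preference as True, False, or None are assumptions for illustration; no browser is claimed to work exactly this way.

```python
from typing import Dict, Optional


def dnt_request_headers(preference: Optional[bool]) -> Dict[str, str]:
    """Return the DNT header a browser would add for a stored preference.

    True  = the user has opted out of tracking
    False = the user has consented to tracking
    None  = the user has made no choice
    """
    if preference is None:
        return {}                  # no preference: send no DNT header at all
    return {"DNT": "1" if preference else "0"}
```

Seen this way, the argument over defaults is an argument over the initial value of `preference`: the working group’s consensus corresponds to starting at None, while a browser that ships with Do Not Track enabled starts at True.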

“We don’t know what the user wants, so we’re not sending any signals to servers,” wrote Mozilla’s Alex Fowler. “This causes the presence of the signal to mean more — the signal being sent should be the user’s choice, not ours. Therefore, Firefox doesn’t broadcast anything until our user has told us what to send.”

The problem with this logic is that sending no signal is functionally equivalent to explicitly opting in to being tracked. The W3C working group tacitly acknowledges this by saying Web browsers need only provide two options for Do Not Track (corresponding to “on” and “no preference”) rather than three, and they explicitly state Web sites can interpret the lack of Do Not Track information any way they like “in light of the user’s privacy expectations and cultural circumstances.” In other words, having no preference is business as usual: track everything possible. The reality (or perhaps, as the W3C would put it, the “cultural expectation”) is that not sending Do Not Track information is equivalent to opting in to tracking.

Microsoft’s clever move

So why would Microsoft publicly jump in front of a bus and announce it plans to enable Do Not Track by default in IE10?

It goes over well with consumers. After being late to the Internet revolution, the portable media player revolution, and the mobile revolution, Microsoft is no longer seen as leading-edge in the technology world. Jumping ahead on Do Not Track lets Microsoft paint itself with a good-guy image as it prepares to launch Windows 8 and Windows RT, as well as reboot Windows Phone.

It thwarts Google. Internet Explorer is still the most common browser on the Internet, although Chrome and Firefox have eroded its market share. However, Google is the dominant player in online advertising, and those ads generate the bulk of Google’s revenue. Although it will take time for Internet Explorer 10 to catch on and become a significant player in the browser market, every installation of Internet Explorer 10 that has Do Not Track enabled is another browser that is not participating in Google’s advertising business. True, those browsers won’t be participating in Microsoft’s advertising business either, but Microsoft has far less to lose: its cash cows are Office, Windows, and its Server and Tools groups. Google’s cash cow is ads.

Does Do Not Track have a future?

If you’re technically inclined, take a glance at the draft specification for Tracking Preference Expression. If you manage a Web site, it’s easy to see how the technology could become a major headache: it requires machine-readable summaries of a site’s tracking behaviors, potentially including tracking behaviors specific to particular pages or features. The idea is that sites can reliably communicate how they do or don’t track their users, and, if browsers like, they can present a pretty interface summarizing the details. Sites that really get into the technology can even provide mechanisms for users to view and even edit or delete information that’s been tracked about them, providing a very high degree of control over potentially personal data. That way folks could get rid of associations with My Little Pony, if they liked.

Anyone who has been through the technology wringer a few times will recognize technology like this as akin to P3P, the so-called Platform for Privacy Preferences. P3P got started in the late 1990s and was finally standardized in 2002. It was intended as a machine-readable protocol that could let Web users see how sites claimed to use data they collected. In theory, users could peruse sites’ policies and make an informed decision about whether to continue visiting them.

P3P was well-intentioned, but went over like a lead balloon. Ironically, Microsoft Internet Explorer was the only browser that tried to implement P3P support, and even extremely popular Web sites never bothered. After all, what good was being able to see standardized information about how a site might use your data if you couldn’t do anything about it? Similarly, P3P policies were awkward and complicated to maintain, especially for large and complex sites.

Do Not Track faces many of the same hurdles: unless advertisers, social networks, analytics services, and essentially every major Web site choose to adopt the technology, it’s going to go nowhere fast. There may be more impetus for sites to adopt Do Not Track thanks to the Obama administration’s proposed Privacy Bill of Rights, and the technology industry may well decide the effort involved in supporting Do Not Track is preferable to government regulation. But the Web is global: no matter the legal status of technology like Do Not Track in the United States and similar privacy efforts in Europe, even if Do Not Track were ready today it would be years before it offered any meaningful protection to consumers.

Until then, consumers’ privacy will be in the same place it’s always been: right in marketers’ crosshairs.