Strong opinions, weakly held

Today the New York Times Opinionator blog ran a piece by Robert Wright that made the following assertion about HTML5:

In principle, HTML 5 will allow sites you visit to know your physical location and will make it easier for them to keep track of your browsing and shopping history.

That assertion is based on this news article from the Times, which says:

In the next few years, a powerful new suite of capabilities will become available to Web developers that could give marketers and advertisers access to many more details about computer users’ online activities. Nearly everyone who uses the Internet will face the privacy risks that come with those capabilities, which are an integral part of the Web language that will soon power the Internet: HTML 5.

All of this talk is about one piece of HTML5, client storage. For the details, check out Mark Pilgrim’s chapter on local storage in Dive Into HTML5.

There are two points to make. The first is that Web sites won’t have access to any information that they don’t already have. In that sense, the talk about “access to many more details” is misleading. It’s not that Web sites will have access to new information, but rather that they’ll have a new, possibly more convenient place to store information that they already collect.

For example, if I don’t share my current location with FourSquare, they won’t suddenly be able to retrieve it if I use a browser that supports local storage. However, if I do give them access to my current location, they could store it in local storage on my own computer rather than using their own resources to store it on their server. In that sense, the information may suddenly be worth storing and easier to access, but it’s information they could already obtain and store on their own servers if they chose to do so. This aspect of local storage subjects users to no real risk beyond the risk already posed by cookies or other vectors for storing information about users.

What’s really gotten people wound up is evercookie (mentioned in the New York Times story), a proof of concept that demonstrates how the variety of ways Web sites can store information on the client can be exploited so that it’s nearly impossible to delete tracking cookies. Browser cookies are one way to store information on the client, as is local storage. Flash Local Shared Objects (also known as Flash cookies) can also store information on behalf of Web sites on your computer. evercookie uses a number of other methods for storing information as well. The nefarious thing about it is that when the information is deleted in one of these locations, evercookie replicates it again from another location where it is still stored. So if I delete my browser cookie, evercookie will copy that information from Flash and put it back in place. If I delete the Flash cookie, it will look in one of the other locations where it stashes information and copy it back again.
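To make the respawning behavior concrete, here is a minimal sketch in Python. It is not evercookie's actual code: the three dictionaries merely stand in for separate client-side storage mechanisms, and the names `stores`, `write_everywhere`, and `read_and_respawn` are made up for illustration.

```python
from typing import Dict, Optional

# Each dict stands in for one client-side storage mechanism.
# The store names are illustrative assumptions, not evercookie's internals.
stores: Dict[str, Dict[str, str]] = {
    "http_cookie": {},
    "local_storage": {},
    "flash_lso": {},
}


def write_everywhere(key: str, value: str) -> None:
    """Copy the value into every storage location."""
    for store in stores.values():
        store[key] = value


def read_and_respawn(key: str) -> Optional[str]:
    """Return the value from any surviving store and copy it back into the rest."""
    survivor = next((s[key] for s in stores.values() if key in s), None)
    if survivor is not None:
        write_everywhere(key, survivor)
    return survivor


write_everywhere("uid", "abc123")
del stores["http_cookie"]["uid"]    # the user clears browser cookies
restored = read_and_respawn("uid")  # next visit: the ID is copied back
```

The identifier only disappears if every location is cleared before the next visit, which is why deleting ordinary cookies alone is not enough.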

Using tricks like this to make it difficult for users to prevent Web sites from tracking them is unethical. Web sites that take this approach should be classified as spyware. But the existence of these techniques has nothing to do with HTML5.

What concerns me is that we’re on a path toward HTML5 being perceived negatively by regular users because the only thing they’ve heard about it is that it is likely to compromise their privacy. This perception could become a major stumbling block on the road to wider usage of browsers with HTML5 support. As developers, we need to educate users and, perhaps more importantly, the media, so that people don’t conjure up risks where they don’t exist and damage the HTML5 brand in the process.

If your aim is education, perhaps you might work on eliminating scripting as a component of web design.

HTML was designed for displaying information, and creating a convenient scheme for locating it.

The AJAX loons speak about how it “enhances” your experience, but in reality it is a highway for information thievery. Unethical is putting it mildly.

JavaScript does not enhance the web for the user, but rather plunders them for information without permission. Sort of like being strip-searched in your favorite store before you can actually shop.

The concept of “Tracking” through cookies, LSOs and evercookies points to the failure of companies that use this technology to tell their story honestly. That they would rather steal from you than be honest should wake folks up.

Ripping JavaScript interpreters out of browsers would be a good first step toward bringing the web back to being a place to exchange information.

We might even see ethics and honesty as a fundamental principle of web design rather than an apology mill after getting caught.

@alan: I like your tinfoil hat, it’s jaunty and set at a nice rakish angle. You should go after CSS next, that should be entertaining.

JS has nothing to do with cookies. Even without JS enabled, a server can still send you a Set-Cookie header in a response and your browser will still send back a Cookie header in the next request. You can do tracking with plain old images, no scripting required.
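A rough sketch of that exchange, using only Python's standard library. The handler name `handle_image_request` and the cookie name `visitor` are hypothetical, and a real server would also return the image bytes; the point is only that the whole round trip happens in HTTP headers.

```python
import uuid
from http.cookies import SimpleCookie
from typing import Dict, Tuple


def handle_image_request(request_headers: Dict[str, str]) -> Tuple[str, Dict[str, str]]:
    """Return (visitor_id, response_headers) for a request for a tiny image."""
    cookie = SimpleCookie(request_headers.get("Cookie", ""))
    response_headers = {"Content-Type": "image/gif"}
    if "visitor" in cookie:
        # Returning browser: it sent the ID back on its own, no script involved.
        visitor_id = cookie["visitor"].value
    else:
        # First visit: mint an ID and hand it over in a Set-Cookie header.
        visitor_id = uuid.uuid4().hex
        out = SimpleCookie()
        out["visitor"] = visitor_id
        response_headers["Set-Cookie"] = out["visitor"].OutputString()
    return visitor_id, response_headers


# First request carries no cookie; the second echoes the one that was set.
first_id, first_headers = handle_image_request({})
second_id, second_headers = handle_image_request({"Cookie": f"visitor={first_id}"})
```

Embed that image on many pages and the same ID comes back from each of them, with JavaScript switched off the whole time.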

HTML5 provides numerous new avenues for cookie storage and a few new features (like geolocation) that are potentially hazardous. In reality that horse already fled the barn, between Flash cookies, IP-based tracking, ETag hacks, browser fingerprinting and so on, but there is definitely a trend towards more identifiability online that is worrying even if HTML5 isn’t the biggest component. (By far the biggest hazard to online privacy & anonymity is Facebook serving buttons on every website on Earth, without even the buried privacy controls that Google provides for tracking.)

I don’t think end users have heard of HTML5 or know anything about it, so I wouldn’t worry about that.

Geolocation is one of the more worrisome features. I don’t consider where I live to be a secret, but the default settings on the iPhone & Flickr mean that photos I upload from it identify the precise house I live in. (And which end of the house they were taken in. It’s very specific.) I’m okay with that, but many people wouldn’t be. That isn’t exactly HTML5, but it’s one of a class of features of new applications that have pretty serious privacy implications that aren’t entirely apparent at first sight.