Friday, July 16, 2010

mind the gap

Web privacy is a very hard problem to solve. Well, at least what people perceive as "the privacy problem."

One of the main reasons it's hard to "solve privacy" is that the term privacy is used in many contexts to indicate many things.

This is ironic, since user data is also used in many contexts to indicate many different things. That is, a piece of data may be considered private in some contexts, but not in others.

I'd like to take a more focused approach and concentrate on one main cause of the "ZOMG my privacy is violated!!1!" uproars to see if we can't help address it.

Facebook Beacon. In late 2007, Facebook launched a new Beacon feature that caused a brisk community reaction. The feature automatically syndicated users' activities on partner sites to their Facebook news feed. For example, if you bought tickets to the Harry Potter movie on fandango.com (a Beacon partner), the purchase might be broadcast to all your friends, revealing where and when you were going to the movie. People were mad because non-Facebook activities were now automatically imported into Facebook and shared.

Google Buzz. Google turned on their new Buzz feature for some Google users in February 2010. The feature automatically created a Twitter-like stream of your activities (such as what you read in Google Reader and the photos you upload to Picasa) and immediately connected you to "follow" other Google users from your "exchanged mail with" list. Harriet Jacobs' article exemplifies the reaction: she didn't want people who emailed her on occasion to know everything she does, but suddenly this new technology connected her activities to everyone she had ever received mail from.

LSOs, a.k.a. Flash Cookies. When people clear their cookies, not all cookies actually get deleted! Gasp! Here's why: Adobe's Flash plug-in has its own data storage space on your computer -- separate from where your browser stores cookies, bookmarks, and passwords. The browser doesn't have direct control over Flash's data, since Flash is essentially a separate application that happens to show its content inside your browser window. The result? You clear cookies, but your browser doesn't know how to clear Flash cookies. How is this used? In many ways, but one particularly sneaky use rubs many people the wrong way: web sites can use Flash to keep longer-lived cookies on your system that re-populate regular cookies after you clear them. People are mad. (FYI, this is being worked out; see this bug.)
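The respawning trick can be sketched abstractly. This is not real browser or Flash code -- both stores are modeled as plain maps, and the "uid" key and function names are hypothetical -- but the logic mirrors what these sites do: write the tracking ID to both the cookie jar and the Flash LSO, then quietly restore the cookie from the LSO copy whenever it goes missing.

```typescript
// Hypothetical model of cookie "respawning" via a Flash LSO.
// In a real browser, the cookie jar belongs to the browser while
// the LSO lives in Flash's own storage directory, which the
// browser's "clear cookies" action never touches.

type Store = Map<string, string>;

// On every visit, the site writes the tracking ID to both places.
function setTrackingId(cookies: Store, lso: Store, id: string): void {
  cookies.set("uid", id);
  lso.set("uid", id); // survives "clear cookies"
}

// On a later visit: if the cookie is gone but the LSO copy
// survived, silently restore ("respawn") the cookie.
function respawn(cookies: Store, lso: Store): string | undefined {
  const fromCookie = cookies.get("uid");
  if (fromCookie !== undefined) return fromCookie;

  const fromLso = lso.get("uid");
  if (fromLso !== undefined) {
    cookies.set("uid", fromLso); // the user's deletion is undone
    return fromLso;
  }
  return undefined; // genuinely new visitor
}
```

In this model, the user calls the equivalent of `cookies.clear()`, but `lso` is untouched, so the next call to `respawn` hands the site its old tracking ID anyway -- exactly the surprise that widens the gap described below.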

The Gap. There's this dark and mysterious area between what users think is happening with the data they put on the web and what actually happens. I call this the Privacy Perception Gap (PPG). There are a variety of reasons this gap exists:

Software makers are not psychologists -- they don't know what people expect, only how the system works.

Software makers are not anthropologists -- they don't know how different cultures expect secrets to be kept or shared.

Software is reactive -- users complain, software is re-engineered, and the cycle repeats.

The PPG is not well understood.

This last reason is something we can address with proper research. Before we can close the PPG, we first need to understand its size and its causes -- especially before we know who is best poised to do the work. Is it users, user agents, infrastructure, applications, or some combination that should take the giant leap? How big is this gap on average? Surely it varies across web applications.

If we minimize the PPG, we can expect users to be better informed, which may resolve the variety of situations enumerated above. Users wouldn't be surprised by what happens, and the suspicion that web companies are out to violate their users would be reduced significantly.

I'm a big fan of transparency (see Open and Obvious), since a lack of transparency is a big part of the PPG problem. We should start by making data relationships transparent: first through disclosure, and second -- most importantly -- through user accessibility. For instance, having a privacy policy linked from my web site doesn't really make me transparent unless users can find and understand it. The gap doesn't shrink if users don't understand! My theory is that an informed user is a happy user, and if we can better understand the PPG we can take the first step toward making web users happy.

6 comments:

As mentioned in the comments on Bruce Schneier's post that you linked, until it is implemented properly in Firefox, the Better Privacy add-on deals with LSOs (there's a development version on the author's site which works with the current beta).

I'd argue that a lot of sites benefit (or believe they do) from at least a small gap. I think most users' explicitly stated privacy preferences are probably slightly stricter than what they're really comfortable with in practice. As long as users believe the settings are slightly stricter than they actually are, both sides may be happy. The problem is twofold, though: sites overreach, and users find out. Eventually, anything but a small gap becomes problematic.

Note, I say all this as a user not a software or site designer. I may not like the privacy perception gap as a user, but in theory at least I can see how a small gap might benefit both parties.

The problem is that people submit information to a (social networking) site for a reason; for example, to share certain information with their friends. Later, it turns out that the company behind the website has been using this information for all kinds of purposes, most of them not in the interests of the person who submitted it.

Then there's an outcry. Surprised?

In my opinion, the problem is that there's no way to limit what companies can do with your data. There needs to be a legal framework with redress against companies that use user data for purposes the submitting user did not intend.

@Anonymous: What you describe is precisely the gap I mentioned. Users submit data to a social site to do one thing, but the site secretly does something else the users don't expect. The gap is between what users think is happening and what actually happens, and in your example all the covert activities undertaken by the site fall into that gap.

I don't see an easy solution to this problem, but that doesn't mean there isn't one. I also don't want to focus too much on technology-only solutions since maybe we can benefit from some policy/legal/psychological approaches too.

The "private browsing mode" in Firefox 3.5 and above cooperates with Adobe Flash Player to control this, and we hope for deeper cooperation in the future... here's a January overview: http://blogs.adobe.com/jd/2010/01/private_browsing.html

@John Dowell: Yup, private browsing mode is supported, but not yet "clear private data". The latter is being held up because the API is hard to standardize when different browsers have different schemas for storing this data... but progress is being made!