Friday, July 16, 2010

Web privacy is a very hard problem to solve. Well, at least what people perceive as "the privacy problem."

One of the main reasons it's hard to "solve privacy" is that the term privacy is used in many contexts to indicate many things.

This is ironic, since user data is also used in many contexts to indicate many different things. That is, a piece of data may be considered private in some contexts, but not in others.

I'd like to take a more focused approach and concentrate on one main cause of the "ZOMG my privacy is violated!!1!" uproars to see if we can't help address it.

Facebook Beacon. In late 2007, Facebook launched a new Beacon feature that caused a brisk community reaction. The feature automatically syndicated users' activities on partner sites to their Facebook news feed. For example, if you bought tickets to the Harry Potter movie on fandango.com (a Beacon partner), it might be broadcast to all your friends where and when you were going to the movie. People were mad because non-Facebook activities were now automatically imported into Facebook and shared.

Google Buzz. Google turned their new "buzz" feature on for some of google users in February 2010. This feature automatically created a twitter-like stream for things you do (such as what you read in google reader and photos you upload to picasa) and immediately connected you to "follow" other google users in your "exchanged mail with" list. Harriet Jacobs' article exemplifies the reaction. She didn't want people who emailed her on occasion to know everything she does, but suddenly this new technology connected her activities to everyone she had received mail from.

LSOs, a.k.a. Flash Cookies. When people clear their cookies, not all cookies actually get deleted! Gasp! Here's why: Adobe's Flash plug-in has its own data storage space on your computer -- separate from where your browser stores cookies, bookmarks and passwords. The browser doesn't have direct control over Flash's data, since Flash is essentially a separate application that happens to show its content inside your browser window. The result? You clear cookies, but your browser doesn't know how to clear flash cookies. How is this used? In many ways, but one particular sneaky use rubs many people the wrong way: web sites can use flash to keep longer lived cookies on your system that can be used to re-populate regular cookies after you clear them. People are mad. (FYI, this is being worked out, see this bug).

The Gap. There's this dark and mysterious area between what users think is happening with the data they put on the web and what actually happens. I call this the Privacy Perception Gap (PPG). There are a variety of reasons this gap exists:

Software makers are not psychologists -- they don't know what people expect, only how the system works.

Software makers are not anthropologists -- they don't know how different cultures expect secrets to be kept or shared.

Software is reactive -- users complain, software is re-engineered, and the cycle repeats

The PPG is not well understood

This last reason is something we can address with proper research. First, we need to understand the size and reason for the PPG before we can close it, especially before we know who is best poised to do the work. Is it users, user agents, infrastructure, applications, or a combination who should take the giant leap? How big is this gap on average? Surely it's different for various web applications.

If we minimize the PPG, we can expect users to be better informed, and that may have solved the variety of situations enumerated above. Users wouldn't be surprised with what happens, and the suspicion that web companies are out to violate their users would be reduced significantly.

I'm a big fan of transparency (see Open and Obvious), as it is a big part of the PPG problem. We should start by making data relationships transparent: this includes disclosure first and then most importantly user accessibility second. For instance, having a privacy policy linked from my web site doesn't really make me transparent unless users can find and understand it. The gap doesn't shrink if users don't understand! My theory is that an informed user is a happy user, and if we can better understand the PPG we can take the first step towards making web users happy.

Thursday, July 01, 2010

It's hard to resist the YouTube hotness -- especially if you feel like blogging about the adorable kittens you just saw on YouTube because all your friends must absolutely see it. Sarcasm aside, there are plenty of reasons to embed videos on a blog or web site, but there's something to consider: the embedded video might be used to track your visitors!

Frankly, any content embedded on your site but hosted by another site can cause privacy concerns. When the content is requested, the user's cookies go along with it. The effect is that not only does YouTube know what site a user is on when viewing the kitten video, but they potentially also know who the viewer is! This means that unless you log out of YouTube regularly, they know all the videos you've viewed and on what sites you saw them.

This isn't breaking news. This stuff has been around for a while, it's just Not Obvious.

So you run a blog or web site and you don't want to aid in YouTube tracking your visitors. What do you do? You have three options in my view:

1. Use YouTube's privacy mode. You can embed videos that suppress sending cookies until the visitor has clicked to watch them. This works by changing URLs for any content automatically loaded by your site from the domain "youtube.com" to "youtube-nocookie.com". This way the cookies (for youtube.com) aren't sent until more content is loaded from youtube.com after the click.

2. Don't trust 'em? Use EFF's MyTube. This does something similar, but the static content is loaded from your own site, then when the visitor clicks on the video, it is loaded from youtube.com and the cookies are sent.

3. Open Video FTW! If possible, host the video yourself (on your own website) and use the <video> tag! More info here. Flash is not the only way to display video on the web! Use open standards!