Towards a Taxonomy of Social Networking Data

Bruce Schneier has posted the following draft taxonomy of social networking data on his blog:

Service data is the data you give to a social networking site in order to use it. Such data might include your legal name, your age, and your credit-card number.

Disclosed data is what you post on your own pages: blog entries, photographs, messages, comments, and so on.

Entrusted data is what you post on other people’s pages. It’s basically the same stuff as disclosed data, but the difference is that you don’t have control over the data once you post it — another user does.

Incidental data is what other people post about you: a paragraph about you that someone else writes, a picture of you that someone else takes and posts. Again, it’s basically the same stuff as disclosed data, but the difference is that you don’t have control over it, and you didn’t create it in the first place.

Behavioral data is data the site collects about your habits by recording what you do and who you do it with. It might include games you play, topics you write about, news articles you access (and what that says about your political leanings), and so on.

Derived data is data about you that is derived from all the other data. For example, if 80 percent of your friends self-identify as gay, you’re likely gay yourself.

Why is this important? To develop ways of controlling the data we distribute in the cloud, we first need to classify precisely the different types of data and where each sits within our digital footprint and the surrounding ecology. Disclosed data differs in value from behavioral or derived data, and most people will likely value their individual content, such as pictures and posts, much more than the aggregated patterns sucked out of their footprint by a social network site’s algorithms. Much to think about here.
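
To make the idea of classifying data by its position concrete, here is a minimal Python sketch of the six categories as a small data structure. It is only an illustration, not part of Schneier's draft. The names Party, Category, TAXONOMY and outside_your_control are hypothetical. Some entries are assumptions rather than statements from the taxonomy: that incidental data is controlled by the person who posted it, and that derived data is produced by the site. Where the taxonomy does not say who controls a category, the field is left empty.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Party(Enum):
    YOU = "you"
    OTHER_USER = "another user"
    SITE = "the site"


@dataclass(frozen=True)
class Category:
    name: str
    created_by: Party
    controlled_by: Optional[Party]  # None where the taxonomy does not say


TAXONOMY = [
    Category("service", Party.YOU, None),
    Category("disclosed", Party.YOU, Party.YOU),
    Category("entrusted", Party.YOU, Party.OTHER_USER),
    # Assumption: whoever posted incidental data about you controls it.
    Category("incidental", Party.OTHER_USER, Party.OTHER_USER),
    Category("behavioral", Party.SITE, None),
    # Assumption: derived data is produced by the site's algorithms.
    Category("derived", Party.SITE, None),
]


def outside_your_control(categories):
    """Names of categories whose controller is known and is not you."""
    return [
        c.name
        for c in categories
        if c.controlled_by is not None and c.controlled_by is not Party.YOU
    ]


if __name__ == "__main__":
    print(outside_your_control(TAXONOMY))
```

Even a table this small shows why the distinctions matter: any control mechanism has to treat categories differently depending on who created the data and who holds it.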