Shirky on Social Networks and Filesharing

Continuing my recent theme of social networks, trust, searching and filesharing, I present an article by Clay Shirky.

He describes the technological and social effects of the RIAA’s strategies against file-sharing: by attacking some of the few nodes that make a large number of files available most of the time, they have weakened the system enough that users are angry, both at the networks (Kazaa, for example) and at the RIAA.

Shirky goes on to describe a new kind of system, one that is slowly evolving today, based on “trust networks”. Users invite each other to join networks of trust, smaller and even less centralised than the current generation of P2P networks (Gnutella, Kazaa, eDonkey etc). Shirky claims that the efficiency of these networks in finding desired content is higher than one might expect for a random distribution of files, because users who trust each other are likely to have similar tastes.

This brings me to a criticism that has been levelled at Outfoxed, and which I think is appropriate here: Trust is not absolute or all-encompassing. If my friend George is a Professor of Theoretical Mathematics at Princeton, then of course I’ll trust his opinion on anything mathematical – however, he may also be a recalcitrant sexist pig, so I wouldn’t trust his opinion on women.

Likewise, I may trust my friends not to rat on me to the RIAA when we share files, but I may also think their taste in music is crap (this is, in fact, true).

So when we talk about the trust vectors that we hook into the cloud, they need to have more dimensions than just “who?” and “how much?”. They also need “with regard to what?”. In order to operate efficiently, our file-sharing trust networks need to be built on a heuristic which combines both “secrecy” (traditional trust) and “semantics”, to let us walk the traditional trust graph to content we actually want.
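To make the three dimensions concrete, here is a minimal sketch of a trust graph whose edges carry a topic as well as a weight. Everything in it is my own assumption for illustration: the `TrustGraph` class, the topic labels, the 0-to-1 weights, and the rule that trust along a path is the product of its edge weights. It is not how Outfoxed or any existing P2P client works.

```python
from collections import defaultdict


class TrustGraph:
    """Hypothetical trust graph: each edge is (who, whom, topic, weight)."""

    def __init__(self):
        # edges[truster][topic][trustee] = weight between 0.0 and 1.0
        self.edges = defaultdict(lambda: defaultdict(dict))

    def trust(self, who, whom, topic, weight):
        """Record that `who` trusts `whom` on `topic` with the given weight."""
        self.edges[who][topic][whom] = weight

    def score(self, source, target, topic, max_hops=3):
        """Best product of edge weights over paths of at most max_hops.

        Only edges labelled with `topic` are followed, so trust in one
        area never leaks into another.
        """
        best = {source: 1.0}
        frontier = {source: 1.0}
        for _ in range(max_hops):
            next_frontier = {}
            for node, so_far in frontier.items():
                for peer, weight in self.edges[node][topic].items():
                    candidate = so_far * weight
                    if candidate > best.get(peer, 0.0):
                        best[peer] = candidate
                        next_frontier[peer] = candidate
            frontier = next_frontier
        return best.get(target, 0.0)

    def combined(self, source, target, topics, max_hops=3):
        """Assumed combining rule: a peer is only as good as its weakest topic."""
        return min(self.score(source, target, t, max_hops) for t in topics)
```

With this, I could trust George highly on "secrecy" and hardly at all on "music", and `combined("me", "george", ["secrecy", "music"])` would come out low, which is the behaviour the argument above asks for. Taking the minimum is only one possible heuristic; a weighted sum would be just as defensible.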

6 Replies to “Shirky on Social Networks and Filesharing”

It’s true that peer-to-peer, on whatever scale it ends up clustering, vastly favours the top 10% of the most popular files.

One thing that P2P networks should be remarkably good at, but aren’t yet doing, is inferring relationships between files from what a particular node requests (say, songs by particular artists) and which files exist on that node. If you assume that the node belongs to an individual, then you can approximate their tastes by enumerating the list of available files and doing Amazon-style statistical analysis.

This gives you the whole “people who liked X also liked Y” thing. Which people clearly really dig, because it’s one of the cornerstones of Amazon.com’s business model.
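As a rough sketch of what that analysis might look like, the snippet below counts how often other files sit alongside a given file across node libraries. The `also_liked` function and the `libraries` mapping (node name to list of hosted files) are hypothetical names of mine, and plain co-occurrence counting is only a stand-in: I have no idea what Amazon actually runs.

```python
from collections import Counter


def also_liked(item, libraries, top_n=3):
    """Return the files most often hosted alongside `item`.

    `libraries` maps a node name to the list of files that node hosts.
    Each node counts once per file, however many copies it holds.
    """
    counts = Counter()
    for files in libraries.values():
        hosted = set(files)
        if item in hosted:
            # Every other file on this node co-occurs with `item` once.
            counts.update(hosted - {item})
    return [name for name, _ in counts.most_common(top_n)]
```

Raw counts will always favour whatever is popular everywhere, so a real version would probably want to normalise by how common each file is across the whole network.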

P2P networks currently operate on searches, but as a happy Acquisition user I can wholeheartedly say that I now also operate 50% of the time on browsing.

If I find a node has a track by a band that I like, I can click a button and browse every file hosted on that node. More often than not, it turns out that they have a whole heap of stuff I like, and also a whole heap of stuff I’d never heard of that I discover that I like.

This is directly analogous to the trust-taste directed graph. The P2P client tracks which nodes supplied me with stuff that I like, and assumes that there’ll be more stuff that I like in the same place, particularly where said stuff has a strong statistical correlation vis-a-vis location across the whole network.
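Here is one way a client could do that bookkeeping, as a sketch only. The `NodeTasteTracker` class and its methods are invented for illustration, and the smoothed like-ratio is my own choice of scoring rule, not something Acquisition or any other client is known to use.

```python
from collections import defaultdict


class NodeTasteTracker:
    """Hypothetical per-node record of how often its files turned out well."""

    def __init__(self):
        self.liked = defaultdict(int)   # downloads from the node that I liked
        self.total = defaultdict(int)   # all downloads from the node

    def record(self, node, liked):
        """Note one download from `node` and whether I liked it."""
        self.total[node] += 1
        if liked:
            self.liked[node] += 1

    def score(self, node):
        """Smoothed fraction of liked downloads: (liked + 1) / (total + 2).

        The smoothing means an unseen node scores 0.5 rather than 0,
        and a single lucky download does not give a node a perfect score.
        """
        return (self.liked[node] + 1) / (self.total[node] + 2)

    def rank(self, nodes):
        """Order nodes from most to least promising for browsing."""
        return sorted(nodes, key=self.score, reverse=True)
```

A client could call `rank` on the nodes returned by a search and offer the top few for browsing first.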