Thursday, April 17, 2008

Facebook knows who you are, and that's worth more than you think

It's very fashionable to declare that Facebook is an over-hyped fad and will never make any real money, certainly not enough to justify its insane $15 billion valuation. At first glance, it's easy to understand why some people might think it's a toy -- most of the activity there seems to involve biting, poking, and joining groups with funny names.

However, I think that assessment misses out on something very interesting: Facebook is capturing everyone's identity and relationships. Of course there's some noise caused by random friending, but by examining the larger graph as well as other details such as location, affiliations, interactions, and of course explicitly entered relationship details ("how do you know Paul?"), they can get a pretty good idea of which people are actual friends and acquaintances.
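To make the "larger graph" idea a little more concrete, here is a minimal sketch of one way to separate random friending from real friendships: score each pair by how much their friend lists overlap. This is only an illustration, not a description of what Facebook actually does; the `tie_strength` function, the sample names, and the choice of Jaccard overlap as the measure are all my own assumptions.

```python
def tie_strength(friends, a, b):
    """Hypothetical score: Jaccard overlap of two people's friend lists.

    `friends` maps each person to the set of people they have friended.
    a and b themselves are excluded so the link between them is not counted.
    """
    fa = friends.get(a, set()) - {a, b}
    fb = friends.get(b, set()) - {a, b}
    union = fa | fb
    if not union:
        return 0.0
    return len(fa & fb) / len(union)


# Made-up sample data: alice, bob, carol and dave all know each other,
# while zed has friended alice and nobody else.
friends = {
    "alice": {"bob", "carol", "dave", "zed"},
    "bob": {"alice", "carol", "dave"},
    "carol": {"alice", "bob", "dave"},
    "dave": {"alice", "bob", "carol"},
    "zed": {"alice"},
}

strong = tie_strength(friends, "alice", "bob")  # many mutual friends
weak = tie_strength(friends, "alice", "zed")    # no mutual friends
```

A pair with lots of mutual friends scores high, while a one-off friending with no shared connections scores zero. A real system would presumably also fold in the other signals mentioned above, such as location, affiliations, and interactions.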

The lack of reliable identity information has always been an issue on the web. It's the reason why we don't have a useful directory of email addresses -- everyone in the directory would get bombarded by spam or other unwanted messages, and even if it did exist, how would you know which of the thousands of Adam Smiths is the one that you are looking for? Facebook has already solved this problem for a large fraction of people. It's easy to search for a name and then pick out the right person based on their picture, location, or friends. I get a lot of messages on Facebook, but unlike email, I have yet to receive any spam. That's pretty remarkable.

Perhaps a people directory doesn't seem terribly valuable, but if you can't imagine how to make money from knowing everyone's identity and trust networks, then you aren't being very imaginative. Spam and fraud are two of the biggest problems on the internet, and they are very difficult to stop because it's so easy to create new identities, and we have no good way of differentiating between real identities and fake ones. Even in "real" life, people are able to skip from town to town, defrauding people again and again, because to the people in the new town they have a new and unknown identity.

One of the best examples of this problem on the internet is eBay. If you try to buy or sell something on eBay (especially computers or electronics, apparently), there is a very good chance that someone will try to rip you off -- just search Google for "ebay scammers" and you will find pages such as "How scammers run rings round eBay" and "eBay Forums: Today's Scams In Progress". eBay has had a relatively solid lock on the auction market due to network effects, but with billions of dollars in profits, a $42 billion market cap, and 10 years of not innovating, I'm willing to bet that won't last. With reliable identity information, most of these fraud schemes would become impractical, which would obviously be a real advantage for an eBay competitor.

What else is highly profitable on the internet? Search. I doubt that anyone will ever beat Google at Google-style search, certainly not Microsoft or Yahoo, even if they do tie their horses together. The only way anyone will create something significantly better than today's Google is if they add a new and important ingredient to the mix. Many people have suggested that demographic information, or perhaps knowing what your friends have searched for, will help, but I doubt it. What could work is actual, direct, human involvement by the users. In fact, it's already helping in a very limited form -- Wikipedia pages are written and edited by random people on the internet, and they frequently occupy the top spots on Google (and I always click on them). Of course the problem with letting random users edit or reorder the search results is that you will quickly be overwhelmed by spam and fraud. But what if you knew who the users were and which ones you could trust?

Those are just the first few things that come to mind -- the uses of identity information are endless. Of course there's no guarantee that Facebook will actually realize any of this potential -- there were many search engines before Google, and they all fumbled the opportunity they had, but it's important to at least understand the potential for big things.

Update: This post was supposed to be about data more so than Facebook (Facebook just happens to have the data). See this post for a (hopefully) better explanation.