A really tiny explanation of how Facebook’s Graph Search works

I spent the morning listening to Mark Zuckerberg and company talk about Facebook’s next big product, Graph Search, which allows folks to search (for now) using people, places, photos and interests as search vectors. The reaction to the news ranges from boredom to muted interest to wild exultation.

So, how does this Unicorn-based Graph Search work? From what I understand, there are two parts to the search, both built in-house. The first part is based on natural language processing and interprets the questions users type. The second part is the one that brings back the answers.

The second part is built on an internal retrieval tool called Unicorn that has been in use at Facebook for a while. Facebook has created an index of all objects on the social platform. On the Facebook Engineering blog, Lars Rasmussen, who along with Tom Stocky was responsible for building Graph Search, writes:

Using traditional information-retrieval systems to mix keyword and structured queries is fairly well understood. But we needed the system also to find answers more than a single connection away, such as “restaurants liked by my friends from India.” Here we were in luck: one of our three existing systems, Unicorn, was designed exactly with this in mind.

The search infra team decided on a two-stage approach: first build out Unicorn to manage all the existing search experiences on the site and then, build out Unicorn further to meet all the requirements of Graph Search. Today, we are far enough along now to launch Graph Search as a beta, but what we’re still missing is the ability to index all of the posts and comments people have shared on Facebook: they make up by far the biggest dataset we have for Graph Search and Unicorn.

At a very rudimentary level, Rasmussen later explained on a call that when we type a query, say, “Restaurants liked by my friends from India in San Francisco,” an aggregator tool using natural language processing takes the query and parses it into keywords against which searches are conducted. So “Restaurants,” “San Francisco,” and “Friends from India” become queries whose results are fetched and further sorted to give us the final answer. The approach is not new, except that the scale of data on which such queries are being run is possibly a new (and rising) high-water mark.
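
To make that description concrete, here is a toy sketch in Python of the fetch-and-sort half of the process. It is not Facebook's code and says nothing about how Unicorn is actually implemented: the data, the index terms such as "type:restaurant", and the function names are all made up for illustration, and the natural language parsing step is assumed to have already turned the sentence into three sub-queries.

```python
from collections import Counter

# Hypothetical toy data: every name and ID below is invented for illustration.
FRIENDS = {"me": {"asha", "ravi", "tom"}}
HOMETOWN_COUNTRY = {"asha": "India", "ravi": "India", "tom": "USA"}
LIKES = {
    "asha": {"dosa_house", "taqueria_sol"},
    "ravi": {"dosa_house", "curry_corner"},
    "tom": {"burger_barn"},
}
# A tiny inverted index: term -> set of matching object IDs.
INDEX = {
    "type:restaurant": {"dosa_house", "taqueria_sol", "curry_corner", "burger_barn"},
    "city:san_francisco": {"dosa_house", "taqueria_sol", "burger_barn"},
}


def friends_from(user, country):
    """Sub-query 1: the user's friends whose hometown is in the given country."""
    return {f for f in FRIENDS[user] if HOMETOWN_COUNTRY.get(f) == country}


def liked_by(people):
    """Sub-query 2: count how many of these people like each object."""
    counts = Counter()
    for person in people:
        counts.update(LIKES.get(person, ()))
    return counts


def search(user, country, city):
    """Fetch each sub-query's results, intersect them, then sort."""
    like_counts = liked_by(friends_from(user, country))
    candidates = (
        INDEX["type:restaurant"] & INDEX[f"city:{city}"] & set(like_counts)
    )
    # Assumed ranking: most friend likes first, ties broken alphabetically.
    return sorted(candidates, key=lambda r: (-like_counts[r], r))


results = search("me", "India", "san_francisco")
print(results)
```

The ranking here is just a count of friend likes, which is an assumption made to keep the sketch short; the real ranking is the part Rasmussen says will keep improving with use.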

Rasmussen acknowledged that there are a lot of unknowns for the company and it is still not clear what kind of computing resources are going to be required for Graph Search at scale. Facebook plans to learn from a phased beta rollout and do compute-resource planning based on that, he said.

My take: This is going to be a big infrastructure challenge, considering Lars and his cohorts want to keep the latency below two seconds. Rasmussen pointed out that the “ranking algorithm” for results is going to keep improving as more and more people use the search. With as many as a billion searches on Facebook every day, even a few million queries are going to be enough to help fine-tune this ranking algorithm.

As for the rest of my questions, I guess I will have to wait for the company to share details “about this challenge on the engineering blog soon,” as Facebook said in its blog.

I personally don’t like it as is. They need to fix it so you can toggle between custom search and full search. Doing a search that connects me to everyone on my friends list is a nice idea at first, but what if I want to search outside of that bubble they just created for me? I want my Facebook page back with Facebook’s normal search bar, or Facebook needs to update it so I have the choice of switching between custom and full search for things on Facebook.

Natural Language Processing is what makes Graph Search cool. Of course, you could already find most, if not all, of the information you’d typically look for: friends who like organic food, friends who went to SXSW and attended a certain party, friends who have other friends in New York City and like Tiesto, etc. But the NLP makes this super-fast (hopefully) and allows one to ‘de-duplicate’ when doing complex searches with several crossovers. This is great, I think, for marketing peeps. Not so sure about everyday use, though.

I know a lot of people don’t see the utility in Graph Search, but I do. Here’s my use case:

A few weeks ago I was planning a housewarming party. I wanted to invite very select people in my network, almost all of whom are on Facebook: People who live in my city, don’t work with me, are close friends, and over 21. It took me 30 minutes to get them all on one list and send an invitation.

Graph Search (from what I understand) could have done this for me in 5 seconds.

(Of course, if Facebook had Google+-style Circles, it could have been done a lot quicker, too. But Facebook refuses to incorporate this MOST USEFUL tool, apparently because Zuckerberg didn’t think of it.)

I told my wife about a big new Facebook announcement and she said, “What, something dumb?” Then I told her you can search for the favorite SF restaurants of your Indian friends and she said, “You can do that already.”

Just reporting the reaction from a non-technical 30-something user of Facebook who has a decent job, spends some time on FB every day and has 2 kids.