Monday, May 17, 2010

Facebook's Realtime Updates -- Use Cases

At f8 Facebook launched a first version of real time updates for the Graph API. These updates allow consumers to subscribe to users of their application and get notified via an HTTP Post when the data has changed so they can go and fetch it by invoking the Graph API endpoint.Some in the community were wondering why the PubSubHubbub protocol was not used and be concerned about issues like the Thundering Herd.

It comes down to these three reasons:

The need for simple data modeling of any resource:

PubSubHubbub currently only supports Atom and RSS and we wanted to use JSON to match the rest of the Graph API so developers don't have to write additional wrappers.

We need to syndicate changes to any type of resource, not just feeds or lists. The changes may include updating properties of a given resource or deleting the resource altogether. PubSubHubbub only supports appending to the list.

The ability to do light pings where only the notification is sent. Not all the use cases require fetching the data right away but can still benefit from the notifications model as opposed to continous polling.

The need for user authorization

We needed to let users remain in control over what data is shared and with whom based on their privacy settings.

Facebook is committed to authenticity and quality and there are rules to encourage this.So one of the main reasons why we could not use traditional PubSubHubbub is that it does not address authenticating the publishers or the subscribers to determine the quality of data being pushed in or the trust that the user has in the consumer.

As you may have seen, the Graph API is extremely powerful in its simplicity, flexibility and efficiency.To query the Graph API you just need to figure out the url of the resource you are interested in fetching and use OAuth 2.0Ex: https://graph.facebook.com/ciberch/feed?token=XXXX we wanted to use a similar elegant approach for subscribing to notifications

The need for a more efficient content propagation architecture.

For those of you who are not familiar with the term, PubSubHubbub is a open protocol which allows exchange of news feeds in real time by POSTing changes to subscribers as they occur (this methodology is called web hooks). One of the main goals of PubSubHubbub is to allow the syndication of public feeds to any party. It works best when the same content is requested by a multitude of subscribers. For example CNN's feed would benefit from publishing to a hub that can help them service all their consumers. The publisher and the hub do not know or worry about how the information will be resyndicated. PubSubHubbub is built with small publishers and large hubs in mind which allow publishers to fan out. It is definitely a far superior to consumers polling publishers directly.

In contrast to the CNN news example above, Facebook has a more personal relationship with their users.

Here is an example to illustrate the challenge we would face trying to use PubSubHubbub. We have a user *Tim* wanting to share content with to a subset of people and applications and 2 applications: Sports Club and Restaurant Rating Site. The same content can't be sent to both applications because it would violate Tim's privacy.

So what we needed was to provide a simple way for external developers to keep user's data in sync taking into consideration the authorization given by users and thus we selected to use OAuth 2.0 for subscription creation and for data retrieval.

This use case, as well as existing Facebook interaction requirements, materialized in the following three needs:

Need to have a decoupled notification system which allows to syndicate changes to arbitrary data.

Need to only syndicate data to authenticated consumers

The need for a more efficient data propagation architecture.

The fact that Facebook only sends notifications of the content that changed means the consumer can always fetch the up to date version from the server. No need to check timestamps in case the updates were received in the incorrect order.

The need for Facebook to act as a consumer aware delivery hub distributing the content of a relatively small subset of its publishers to another relatively small subset of interested consumers within a variety of multiple contexts

Facebook delivers personlized content to each user on each app and has a lot of users and apps. This means that it would not benefit from fanning out the same content to multiple consumers. Every user and every application gets different updates. In FB's world you see content through your social graph so everyone sees something different.

The PSHB example below does not work for Facebook because it's neither a small publisher needing to fan out nor a traditional consumer agnostic hub. Facebook is only interested in syndicating the updates from Facebook users to trusted parties.

We think that there are other platforms which may be facing similar challenges syndicating changes to arbitrarily modeled data to authenticated consumers.Before the release of OAuth WRAP and OAuth 2.0 we had some discussions on the PubSubHubbub mailing list about using OAuth 1.0a topic url signing. Here is the proposal. This is a good and simple enhancement and now with the release of OAuth 2.0 we can use a very similar approach.