Oracle Blog

Chris Webster's Weblog

Sunday Mar 30, 2008

I just finished working with the Facebook Data API to see how it compared with other storage services like Amazon S3. According to the documentation there is no intention to charge for normal usage of this service, so if you are building a facebook application it is worth considering this API.

The Data API requires referencing objects by identity. All objects have
intrinsic identity which is returned whenever an object is created. This can be referenced from FQL as '_id'. The Data API also supports the ability to reference an object using a user defined key (referenced in the documentation as a hash key). Just as in object oriented programming, objects can have properties which must be one of the following set of types (integer, string (255 char max), and text storage as a blob).

I used this API to allow associating tags with a user. I started by defining the data model using the DataStoreAdmin link on application page. Using this tool, you can quickly create object and association types as well as execute FQL queries. I created two object types user (containing user id) and tag (containing the tag value) as well as the tagged association (a symmetric association between user and tag).

When a user is tagged the following operations are performed:

\* Create a tag object by invoking data.setHashValue passing in the tag string for the hash key as well as the value property. This service returns the object id and allows the object to be referenced via hash key in addition to the object id. In this data model, invoking setHashValue is idempotent and thus this can be called every time a user is tagged.

\* Create a user object by invoking data.setHashValue passing in the user id as the hash key and the id property.

\* Create an association using the data.setAssociation service to link the user object and the tag object. The id's returned by setHashValue should be used to collect the id's.

When the tags for a user are displayed, the tagged association can be queried, using fql.query, to determine the set of tags applied to the user. The query for extracting the tags would roughly be:

SELECT value FROM app.tag WHERE _id IN (SELECT tag FROM app.tagged WHERE user = tagged_user)

The storage mechanism makes this data model pretty straightforward. This hash key mechanism can store properties of non scalar data such as JSON. This supports the ability to use the getHashKey method to retrieve more than a single property. There is also an interesting feature of hashkey, data.incHashKey, which allows atomically modifying an integer property value.

The example above could be extended to support a tag cloud, by counting how many instances of the tag have been added. One way to achieve this is by adding a count object with a single count integer property. The count object type would have an association with both the user and tag objects to have a unique count for each user. The count would be obtained through a query over both associations. Another implementation technique would be to use a synthetic hash key and a unique tag object for each user. The tag object would be extended with the count property. The tag object's hash key would combine the user id and the tag name. The count property would be incremented each time the user is tagged.

Observations

The API would be smaller if only hash keys were required for each object. The object id is somewhat difficult to work with and typically domain data provides unique identifiers.

The batch API is required for any significant use of this API, as a single logical operation will translate to multiple service invocations. It would be nice to have more support directly in the data API.

A lot of common use cases could be supported by extending the increment hash key capability to any object property. This could be done by passing the last version id and the current update and rejecting the change if an intervening change has occurred. In relational databases, this is implemented using an optimistic locking technique such as a version column.

This technology is worth evaluating if you are working with a viral facebook application.