Freebase: Your database is ready!

This is going to be really frickin’ cool. There’s just no other way to put it. Maybe I’m a little too much of a data geek, because I can’t seem to sit still since receiving my email letting me know that Freebase is now in alpha, and the account I requested months ago can now be activated. I logged and immediately started poking around. I’ve been doing that for about 48 hours straight now.

What is ‘Freebase’?

Well, the short answer is that Freebase is a public domain relational database maintained by the community. If this sounds like Wikipedia, don’t get too attached to that comparison. It’s true that Wikipedia is also maintained by its users, but that’s where the similarities end. You see, while Wikipedia stores information in a way that makes it attractive and easy for humans to find things, Freebase provides the kind of structure and relational characteristics that make it useful to application developers (programmatic access). It provides a relational database, which is typically used by programs, instead of an encyclopedia, which is used by people.

If you’re a DBA, your first thought might be that these are people who are trying to take your job. Not so. This is in no way suitable for internal, private, corporate, proprietary data. In fact, I don’t believe it’s even allowed. What it *is* good for is applications that can make use of publicly available and/or publicly maintained data. For example, a sample application called “Concierge” allows users to browse restaurants in their area by first telling the application the area they live in, and then the type of restaurant they’re looking for. The data about the restaurants is all stored in the publicly maintained Freebase database, and Concierge also provides an interface for users to add new restaurants, which adds the data back to Freebase.

Me as a working example of a typical Freebase geek

I myself am fleshing out the Freebase “beer”, “beer style”, and “beer style category” types, in the hopes that I can provide a web interface that allows beer enthusiasts and brewers alike to know more about beer, and helps them to develop recipes for their own beer. I’m using the BJCP (Beer Judge Certification Program) published beer style guidelines to flesh things out, and then other people can come in and start associating beers with styles and breweries and all kinds of other characteristics. Having the beer style definitions held in a public place means that, as the style guidelines evolve, people who care about such things are free to make updates to the data in Freebase. This means the data used by my application is always up to date, and I never have to push out updates to users just to update their local copy of this data.

It may also mean that I can afford to make the application available to large numbers of people for free, since I won’t have to find a hosting plan that lets me house however many gigabytes of data this thing grows to. I can eventually work in all of the properties of different hop varieties, different grain characteristics, yeast attenuation rates, water profiles of different beer brewing regions, etc. Further, I don’t have to be an expert on every facet of beer, because people who care about, say, yeast attenuation are probably going to populate that data for me anyway, whether its specifically for my application, or one of their own.

The Future of Freebase (Pre-“Total World Domination”)

Aside from simple publicly maintained data, I think there are other implications of this model. For example, with all kinds of applications using Freebase as the back end database, and considering that users can have their own “private domain” data type definitions, it makes sense for applications to use the users’ Freebase credentials to maintain application preferences data using that domain. This would seem to make Freebase a contender for a de facto standard OpenID or CAS portal.

If the model is Earth-shaking, it stands in contrast to the current state of the actual user interface. Oh, I’m getting by just fine, but that’s more in spite of the interface than because of it. Some of the ajax-y features hurt as often as they help, navigation needs a little improvement, and there’s no way to add massive amounts of data quickly that I’ve found yet without writing code.

Further, it’s still unclear to me how they plan to foster cooperation between people who want to relate different data types. For example, I quite naturally want to list, for each beer, the brewery that makes the beer. Someone else has already created a “brewery” type, and the community has done a darn good job at fleshing out the data for that type. However, when you go and look at a brewery definition, there is no listing of the beers produced by that brewery. Freebase certainly supports the idea of a “reciprocal link” that would cause beers to show up under “brewery” entries as people add beer definitions and fill in the “brewery” property of the beer. However, there’s no clear rules on how to get this reciprocal link to happen if you’re not the creator of all of the types involved.

What’s more, I’m not the only person who has created a “beer” type. Which one should the “brewery” type administrator link to? Well, this wouldn’t be as big a problem if I were allowed to add a property to an existing beer type! Then I wouldn’t have to create a competing type at all! Currently, this is not allowed. I cannot go redefining the properties associated with a type that’s maintained by someone else. As a result, in order to support properties of beer that brewers and enthusiasts care about, I have to strike out on my own and hope that in the long run, my “beer” type becomes “the” beer type.

This should really all be opened up, and people should be allowed to add properties that are submitted for approval by the type administrator. Reciprocal links should be put in a “pending” state, or maybe even a “probationary” state. These are features that would encourage more interaction between users who care about the same data, and foster a community around the data that community cares about.

I’m sure there are plenty of other things to think about as well. For example, will Freebase let me upload the Briess malt profiles, published by one of the biggest maltsters in the US? Briess may have a problem with that – but how will Freebase know without receiving a cease and desist from some friendly neighborhood lawyers? Then there are technical and financial details. Presumably, they’ll either charge applications that use Freebase for commercial gain, or they’ll have to charge for some higher service level in order to guarantee that data will be available for applications to use.

This is not a simple service. I’ll say this though: I wish I could buy stock in Freebase, if only to cash out when they are inevitably purchased by Google.