Archive for the ‘Awesomeness’ Category

Today we are incredibly proud to announce a major new release of Fluidinfo.

Our vision is that information should be easy to create, find and use. This led us to develop our core technology: an openly writable database for storing and organizing metadata. Our database is useful for developers looking to glue together data from multiple sources, be it from the web or in-house systems. Information stored in context is more valuable. By aggregating information in one place, writing new visualizations, generating reports or performing analytics across multiple sources of data becomes trivial.

Our new user interface provides these same benefits to users of the social web. We take the most useful pieces of data from your social network – the web links and hashtags that people use – and attach activity to them from around the web. To start with, we’ve focused on content from Twitter, but you’ll see data from other services very soon.

URLs and hashtags become places where conversations can occur, as shown in the image on the right for the hashtag #euro. We save you the pain of repeatedly scanning multiple activity streams: a dashboard shows what’s hot in your social graph, or across Fluidinfo as a whole. Contribute directly, or Tweet knowing that whatever hashtag or URL you mention will find its way into Fluidinfo.

This act of gluing different sets of data together to detect signals, trends and relevant activity also forms the basis for our new set of enterprise products. The problem that individuals face in trying to stitch together information from multiple sources to see what’s going on is magnified in any organization dealing with multiple databases. For most organizations, managing the sprawl of different data platforms is the biggest barrier to turning data into a useful asset for employees or customers.

Our first new product, Fluidinfo Enterprise, helps companies take the first steps in taming their data. It bundles our new user interface with the openly writable database and enterprise-grade API features to allow organizations to view and manage their data wherever it may be stored.

By co-locating metadata from existing platforms, Fluidinfo Enterprise eliminates the need to copy or move large chunks of data around, instead providing a single index for all data. This can include data that is not hosted on-site, whether it be on social networks, web pages, or third-party cloud services. Employees, partners and customers can then interact with the data via the API or UI. We believe this freedom and flexibility can turn a company into an open-data business.

FluidSense, our second product, is a white-label browser extension for Firefox, Safari and Chrome that takes this ability to expose useful data even further. It lets a company give end-users a way to engage with their unified content as they browse the web. A browser sidebar alerts users to contextually relevant company content and comments from their social network.

We are now living in the age of data. We want individuals and organizations to be able to find, share and use the data that matters most to them. Whether you are a web user interested in links and hashtags, a publisher whose core data is books, or a news company whose critical content is its articles, our new interface and products help you establish and track these important data types and allow users to start collaborating around them.

Russell Manley became Fluidinfo CEO last November. Although many people are aware of the change, we’ve not announced it until now as we’ve been busy working on the new UI, have been settling in with how Russell is running the company, and he’s been getting to know our investors better.

Russell pointed me to Delicious back in late 2005, after he’d read the early work I did on Fluidinfo in the late 90s. That led directly to the founding of Fluidinfo in London. The corporate address was Russell’s home, and the two of us formed the board. Because we only needed programmers early on, we planned for Russell to join full time once the Fluidinfo architecture was developed and deployed and we had significant external interest. Last November Russell took the plunge, resigned from his London job, and took over from me as CEO.

Russell is extraordinarily competent. He spent 10 years as a “company doctor” in London. He went into a dozen companies as CFO, COO, or CEO, charged with turning them around. Walk around central London with him and it seems that almost everything you see he’s had a hand in running. The diversity of his operations and management experience is extraordinary. In 2005 he joined SMIF, a Secondary Market Infrastructure Fund, where he helped acquire, manage, and eventually sell hundreds of assets: long-term management contracts and debt on UK schools, motorways, hospitals, prisons etc. Russell was frequently in the middle of deals worth tens or hundreds of millions of dollars. SMIF sold themselves to Land Securities Group for $1.4B, where Russell became an Investment Director. Soon afterwards he and a few others spun themselves out of Land Securities to form Semperian. Russell became Group Communications Director and also CIO. He directed the set-up of their entire IT infrastructure in the clouds, a daring and difficult move to pull off in 2007, especially with the stakes so high (Semperian supports about $3B of public sector infrastructure). Russell devised and ran Semperian’s company systems and processes and sat on the board of over 30 companies. Just before joining Fluidinfo he spent 9 months restructuring one of their companies and then negotiating its very complex sale. In the final act he spent two full days signing the 600+ documents he’d coordinated among 28 parties involved in the sale. He knows how to close a deal.

That’s just a sample of Russell’s background and skills; there’s a lot more where that came from. As you can probably guess, we’re extremely happy to now have him running Fluidinfo.

We’re delighted to announce that Neil Levine (LinkedIn, Twitter) has today joined Fluidinfo as VP Product. Neil has been working in the industry for over 17 years and has a great track record of taking both consumer and enterprise products to market, most notably at Canonical, where he was VP of Corporate Services and also Director of Information Infrastructure. He’s based in the Bay Area.

I met Neil a year ago via an introduction from Jamu Kakar (also of Fluidinfo). Neil had been Jamu’s boss at Canonical, and had a stellar reputation. We got on really well immediately, and stayed in contact. I often wondered if one day we’d be lucky enough to find someone like Neil to join us on the product side. We’ve always been careful and patient in hiring, looking for people we think are brilliant and who really “get” Fluidinfo at a fundamental level. People who can’t stop thinking about what a Fluidinfo-enabled future could offer. Neil certainly fits that category, and we’re thrilled to have him on board. So, please join us in welcoming him to the team!

Getting Started With Fluidinfo is now available in hardcopy and various eBook formats from O’Reilly. The authors, Nicholas Radcliffe (@njr) and Nicholas Tollervey (@ntoll), know Fluidinfo inside out, as you might hope. They’ve written multiple Fluidinfo client libraries, web applications, command line tools and visualizations, have written many blog posts about Fluidinfo, have imported tons of data into the system, and have both contributed to the design and architecture in many ways. The book is extremely well written. Both of the Nicholases are entertaining and clear writers.

The first chapter has a wonderful introduction and overview of Fluidinfo, and should be understandable by a broad audience. After that, things get more technical with a chapter on using Fish, a Fluidinfo shell, either from your shell command line or via Shell-fish, a web interface. Playing directly with Fluidinfo, adding to objects and running queries, is probably the best way to understand its (very simple!) data model. Then it’s on to programmatic access, using two Python libraries (one low level, one high level) and JavaScript. An example social book reader application is built from the ground up in JavaScript. The book concludes with chapters on the REST API, advanced use of Fish, discussion of the special Fluidinfo about tag, and a description of the query language.

Namespaces and tags provide a powerful mechanism for organizing information. The Fluidinfo API provides a set of tools for creating, describing and using them to store information about anything and everything. Until now, you had to create them before you could use them to store values, but we’ve changed that. Namespaces and tags are now created automatically, on first use, provided you have permission to do so. A number of API calls that had to be made in the past are no longer necessary, which makes storing data easier and faster than before.

Permissions provide fine-grained privacy controls to define who has access to see and work with information in Fluidinfo. By default, Fluidinfo creates permissions that grant everyone read access to information, while limiting write access to the author of the information. This is good for the most part, as Fluidinfo and its users benefit from sharing information with each other. The default behaviour could cause surprising results when used with a namespace that had been locked down and made private though, because the new namespaces and tags would be public. This is no longer the case. Permissions for new namespaces and tags now inherit from their parent namespace, at creation time. Changing permissions for existing namespaces and tags won’t cause any changes to propagate to children.

Namespace permissions are inherited one-to-one. That is, the create namespace permission is copied to a new child namespace, the update namespace permission is copied to a new child namespace, and so on. Tags are a little bit different because they have a different set of permissions than namespaces. The update tag, delete tag, write tag value and delete tag value permissions are all inherited from the create namespace permission on the parent namespace. The read tag value permission is inherited from the list namespaces permission on the parent namespace. Control permissions are inherited from the namespace’s control permissions.
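The inheritance rules above can be sketched as a small model. This is only an illustration of the mapping described in the text, not Fluidinfo's implementation: the dictionary layout, the (policy, exceptions) pairs and the function names are all assumptions made for the example, and the tag's control permissions are collapsed into a single entry.

```python
# Illustrative model only. Permission names follow the prose above; the data
# layout and function names are hypothetical, not part of the Fluidinfo API.


def inherit_namespace_permissions(parent):
    """A new child namespace copies every parent permission one-to-one."""
    return dict(parent)


def inherit_tag_permissions(parent):
    """A new tag derives its permissions from its parent namespace."""
    return {
        # Permissions that change the tag or its values follow the parent's
        # create namespace permission.
        "update tag": parent["create"],
        "delete tag": parent["create"],
        "write tag value": parent["create"],
        "delete tag value": parent["create"],
        # Reading tag values follows the parent's list namespaces permission.
        "read tag value": parent["list"],
        # Control follows the parent's control permission (simplified to a
        # single entry in this sketch).
        "control": parent["control"],
    }


# A private parent namespace: closed to everyone except the listed users.
private_parent = {
    "create": ("closed", ("ntoll",)),
    "update": ("closed", ("ntoll",)),
    "delete": ("closed", ("ntoll",)),
    "list": ("closed", ("ntoll", "terrycojones")),
    "control": ("closed", ("ntoll",)),
}

new_tag = inherit_tag_permissions(private_parent)
new_namespace = inherit_namespace_permissions(private_parent)
```

In this model a tag created under a locked-down namespace is readable only by the users who could list that namespace, which is the behaviour the change is meant to guarantee. Because the copy happens once, at creation time, later edits to `private_parent` would not alter `new_tag` or `new_namespace`.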

The combination of automatic namespaces and tags with inherited permissions makes Fluidinfo both easier and safer to use. We hope you enjoy these changes!

Today we’re excited to announce the release of a writable API for O’Reilly books and authors. There’s far too much news and information around this release to pack into a single blog post. Here’s a summary of what’s new today and where to find out more.

General manager and publisher Joe Wikert is excited by the opportunities that a writable API provides to O’Reilly and other publishers. “It’s like LEGOs for publishing,” he says of the new malleability in his industry. “It’s as though we’ve been selling plastic children’s toys and the pieces were all glued together so customers could only use them the way we intended them to be used,” he adds. “Now we’ve decided to break the pieces into their component parts and let customers build whatever they want.”

Last but not least: if you want a modern, writable API for your data, drop us a line at info at fluidinfo com, and let’s talk.

O’Reilly’s favorite startup is Terry Jones’ Fluidinfo “because I’m not sure it’s going to work. He’s got his teeth into something that is bigger than he is. He may be overwhelmed and he may not get it,” O’Reilly said. “That passion it’s kind of like the Wright Brothers that wanted to fly or Thomas Edison and the light bulb… It’s not the entrepreneur chasing the million bucks, it’s the entrepreneur chasing the big idea.”

Imagine you wanted to tell the world you were interested in something, for example an email address or a phone number, without telling the world what that thing was. That may not sound so interesting, but if several people were doing the same thing, it would be a mechanism for discovery of private things you had in common, without telling anyone else what those things were.

Russell Manley and I just thought of a simple way to do this using Fluidinfo. Here’s how we did it for the email addresses we know.

For each email address, compute its MD5 sum. Then, put a rustlem/knows or terrycojones/knows tag onto the object whose fluiddb/about value is the MD5 sum. The MD5 algorithm is essentially one-way, so even if someone finds a Fluidinfo object with either of our tags on it (which is trivial) they cannot recover the original email address.

This is pretty nice. We’re independently indicating things of interest, but neither of us is publicly saying what those things are. Because we’re putting our information onto the same objects in Fluidinfo, we can then easily discover things we have in common with each other (and with others), without the world knowing what. We can do the same thing for phone numbers, or anything else.

Getting the data into Fluidinfo was trivial. Here’s code I used to put a terrycojones/knows tag (with value True) onto the appropriate objects:
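The tagging step can be sketched as follows. This is not the script from the post: it only computes the MD5 sum and describes the HTTP request, without sending anything. The base URL, the /about/<about>/<tag> path layout and the value content type are assumptions recalled from the REST API documentation, and lower-casing the address before hashing is a choice made for this sketch (everyone taking part has to normalise addresses the same way for their tags to land on the same object).

```python
import hashlib
import json
from urllib.parse import quote

# Assumed values for this sketch; check them against the API documentation.
BASE_URL = "https://fluiddb.fluidinfo.com"
VALUE_MIME = "application/vnd.fluiddb.value+json"


def about_for(email):
    """Return the MD5 hex digest used as the fluiddb/about value."""
    normalised = email.strip().lower()  # normalisation is an assumption
    return hashlib.md5(normalised.encode("utf-8")).hexdigest()


def knows_request(email, tag="terrycojones/knows"):
    """Describe the PUT that would set `tag` to True on the MD5's object."""
    about = quote(about_for(email), safe="")
    return {
        "method": "PUT",
        "url": "%s/about/%s/%s" % (BASE_URL, about, tag),
        "headers": {"Content-Type": VALUE_MIME},
        "body": json.dumps(True),  # the JSON literal true
    }
```

Calling `knows_request` once per address in an address book yields the requests to send with whichever HTTP client or Fluidinfo library you prefer; only the digest ever leaves your machine.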

Russell and I each had about a thousand email addresses in our address books. A first question is how many addresses we know in common. You can get the answer to this with the simple Fluidinfo query has terrycojones/knows and has rustlem/knows. It turns out there are 53 common addresses. But the results don’t tell us which addresses those are, which is also interesting.

We also wrote a small script to print any tags ending in /knows for a set of email addresses given on the command line.
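The filtering part of such a script might look like the sketch below. It is not the original script: it assumes you have already fetched the list of tag paths on the object for an address's MD5 sum, and the function name is hypothetical.

```python
def knowers(tag_paths):
    """Return the owners of every tag path that ends in /knows."""
    return sorted(
        path.rsplit("/", 1)[0]  # everything before the final /knows
        for path in tag_paths
        if path.endswith("/knows")
    )


# Tag paths as they might appear on the object for one MD5 sum.
example_paths = ["fluiddb/about", "terrycojones/knows", "rustlem/knows"]
print(knowers(example_paths))
```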

So given an email address, we can run the above and see who else knows (or claims to) that email address.

We find all this quite thought provoking. Without going into details of the social side of this, it’s worth pointing out that Fluidinfo makes this kind of information sharing very easy because it has a guaranteed writable object for everything, including all MD5 sums. Because the fluiddb/about tag is unique and isn’t owned by anyone, any user can add their knows tag to the object for any MD5 sum. The ability for users and applications to work independently and yet to share information by just following a fluiddb/about convention is one of the coolest things about Fluidinfo.

Finally, note that this system does not guarantee privacy. If someone already knows an email address or phone number (etc) they can compute its MD5 sum and examine the Fluidinfo tags on the corresponding object. Doing so they might see a rustlem/knows tag and would then be free to draw their own conclusion.

You can play too. All you need is a Fluidinfo account and the above code. Please let us know how you get on. For example, you can freely tweet any MD5 sums we have in common. We’re going to use the hashtag #incommon, like this.

With all the BoingBoing data from the past ten years now in Fluidinfo, the next question is “what can we do with it..?” That’s what I’ll be answering in this technical how-to, so expect lots of code and examples!

Basic Fluidinfo Concepts

How does this all fit together..? Objects are simply tagged with data. Put another way, tags associate a value with an object.

The other important concept to make clear is that nobody owns objects, there are no permissions associated with objects and objects last for ever. Although every object has a unique ID, objects are usually identified by a globally unique and immutable “about” tag value. It’s used as you’d expect: to indicate what the object is supposed to be about. Finally, anyone can add data to any object (more on this later).

(er… that’s really all it is.)

Of course, since Fluidinfo is a data-store it is possible to do searches, link objects and store all sorts of different types of data (from primitive types like numbers, booleans and text to more opaque values such as images, video, sound and other binary data).

Oh yeah, interaction with the data is via a simple yet powerful REST API. There are plenty of client libraries in many different languages which allow you to work without worrying about the dirty implementation details.

How the BoingBoing data is organised in Fluidinfo

Each of the 64,000 BoingBoing articles is represented by a corresponding Fluidinfo object whose about tag value is the URL of the original post on boingboing.net. In the original XML dump, each post looked something like this:

<row>
  <permalink>http://boingboing.net/2000/01/21/street-tech-reviews-.html</permalink>
  <created_on>2000-01-21 14:07:38</created_on>
  <basename>street_tech_reviews_</basename>
  <author>Mark Frauenfelder</author>
  <title>Street Tech Reviews and news</title>
  <body><A HREF="http://www.streettech.com/">Street Tech</A> Reviews and news for gadget-lovers and propeller heads of all stripes.</body>
  <body_more>NULL</body_more>
  <comment_count>0</comment_count>
  <categories>NULL</categories>
</row>

I’ve done the simplest thing possible: created a top-level boingboing.net namespace in Fluidinfo under which all tags used to annotate BoingBoing data are defined. I’ve added tags to this namespace that map to the original XML elements: permalink, created_on, basename, author, title, body, body_more, comment_count and categories. The Fluidinfo objects representing BoingBoing posts have data associated with them using these tags. For example, the object representing the post described in the XML example above has a boingboing.net/title tag with the associated value: “Street Tech Reviews and news”.

Since I was also cleaning the raw XML I decided to extract / re-structure some of the data. This resulted in some additional tags: year, month, day, timestamp, links and domains. The function of the date related tags should be clear. The links and domains tags are interesting because I scraped all the anchor tags in the body and body_more fields and processed the href values. Obviously the links tag references a list of all the URLs referenced in an article and the domains tag references a related list containing just the domain names.
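The link and domain extraction can be sketched with the Python standard library. This is a minimal illustration, not the import script itself: the class and function names are hypothetical, and the domain is simply the host part of each URL with no further normalisation (the real import may have cleaned domains differently).

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class LinkCollector(HTMLParser):
    """Collect the href value of every anchor tag that is fed in."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lower-cases tag and attribute names, so <A HREF=...>
        # and <a href=...> are handled alike.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.links.append(value)


def links_and_domains(*fragments):
    """Return (links, domains) for the given body / body_more fragments."""
    collector = LinkCollector()
    for fragment in fragments:
        if fragment and fragment != "NULL":  # the dump uses NULL for empty
            collector.feed(fragment)
    collector.close()
    domains = []
    for link in collector.links:
        host = urlparse(link).netloc.lower()
        if host and host not in domains:
            domains.append(host)
    return collector.links, domains


body = '<A HREF="http://www.streettech.com/">Street Tech</A> Reviews and news'
links, domains = links_and_domains(body, "NULL")
```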

I did one final enhancement to the data dump. I extracted all the authors and categories and turned them into tags. When I imported the data I used these tags in the “delicious” way of tagging: simply by having such a tag (with no associated value) an object is associated with an author or category.

Here’s what an object representing a BoingBoing article looks like:

Another interesting view on the data is to explore the BoingBoing tags and namespaces in the Fluidinfo Explorer (see the screen-shot on the right). In the Explorer, if you right-click on a tag and select “Open Object” you’ll see the object that represents the tag in the main area of the application. This object is itself tagged with useful information – such as a description (containing copyright information). Yeah, I know, it sounds odd but this makes meta-tagging possible.

In addition to creating Fluidinfo objects for all the BoingBoing articles I also created an object for every domain referenced by BoingBoing throughout the last ten years.

The about tag value for these domain objects is the domain name itself. For example, there is an object about the “bbc.co.uk” domain.

Each of these domain objects has been tagged with a list of all the BoingBoing articles that mention them. This is, I think, rather cool. To continue the example, the bbc.co.uk domain was referenced in 177 BoingBoing articles.

Minecraft (example data mining interactions with the API)

So here comes the cool how-to stuff…

Should you need to, use the existing documentation to read about the Fluidinfo API in super-painfully-precise-techno-vision. However, I’m going to present a quick guided tour in the form of a Python session using the fluiddb.py module (remember my advice to use one of the client libs). The advantage of using fluiddb.py is that it’s a very thin layer on top of the HTTP API so you get a feel for how various things work. The other advantage is that reading Python is like reading pseudo-code and is thus a great teaching tool.

In the following example I simply import the fluiddb module and ask it for information about my user (ntoll). The basic pattern for calling Fluidinfo is: fluiddb.call("HTTP-VERB", "PATH IN API", OTHER OPTIONAL ARGS)

Notice how the “content-location” in the headers tells you what the full URL of the API call is (this is interesting since fluiddb.py creates this automagically for you). The body (result) is a Python dict object that basically mirrors the JSON dict object Fluidinfo served up.

The following example grabs information about a specific object. Notice that I pass in the path to the Fluidinfo resource I’m GETting as a list. This ensures that the BoingBoing URL gets correctly percent encoded.
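To see why passing the path as a list matters, here is a small sketch of the kind of encoding a client library can apply to each component. It illustrates the idea using only the standard library; it is not fluiddb.py's own code, and the function name is hypothetical.

```python
from urllib.parse import quote


def build_path(components):
    """Join path components, percent-encoding each one separately."""
    # safe="" makes quote() encode "/" and ":" too, so a URL used as an
    # about value stays a single path component.
    return "/" + "/".join(quote(part, safe="") for part in components)


path = build_path(
    ["about", "http://boingboing.net/2000/01/21/street-tech-reviews-.html"]
)
print(path)
```

Had the path been passed as one pre-joined string, the slashes inside the BoingBoing URL would be indistinguishable from the separators between API path components.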

Hopefully, the result speaks for itself: it contains the unique ID of the Fluidinfo object that is about the BoingBoing URL, and a list of the tags on that object. Getting the value of a specific tag is simple:

A call is made to the “/values” endpoint with a list of tags whose values we want returned and a query to generate the result set. The query is written in Fluidinfo’s super-simple query language. The headers of the response look like this:

Happily, fluiddb.py has converted it into the Python equivalent so we can find out some useful information and look at individual results.

>>> len(body['results']['id'])  # how many results do we have..?
1214
>>> body['results']['id'].keys()[0]  # what's the id of the first result..?
u'f2976562-eba6-47e4-94a1-b36ffe9a2ab1'
>>> body['results']['id']['f2976562-eba6-47e4-94a1-b36ffe9a2ab1']  # show the record for the first result...
{u'boingboing.net/categories': {u'value': [u'science',
                                           u'technology',
                                           u'art and design',
                                           u'design']},
 u'boingboing.net/created_on': {u'value': u'2010-10-14 13:14:14'},
 u'boingboing.net/title': {u'value': u'TED releases iPad app today'}}
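For readers without fluiddb.py to hand, the request and the shape of the result can be sketched with the standard library alone. The base URL and the query and tag parameter names are assumptions recalled from the API documentation, the example query is hypothetical, and nothing is sent over the network; the result structure is the one shown in the session above.

```python
from urllib.parse import urlencode

BASE_URL = "https://fluiddb.fluidinfo.com"  # assumed API host


def values_url(query, tags):
    """Build a /values URL for one query and the tags to return."""
    params = [("query", query)] + [("tag", tag) for tag in tags]
    return BASE_URL + "/values?" + urlencode(params)


def titles(body):
    """Map object ID to title for a decoded /values response."""
    return {
        object_id: record["boingboing.net/title"]["value"]
        for object_id, record in body["results"]["id"].items()
        if "boingboing.net/title" in record
    }


# A hypothetical query; any expression in the query language would do.
url = values_url("has ntoll/tuba", ["boingboing.net/title"])

# A response trimmed to a single result, shaped like the session above.
sample = {
    "results": {
        "id": {
            "f2976562-eba6-47e4-94a1-b36ffe9a2ab1": {
                "boingboing.net/title": {"value": "TED releases iPad app today"}
            }
        }
    }
}
print(url)
print(titles(sample))
```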

Great! So you have all the tools you need to search and explore all the BoingBoing articles from the last ten years. That’s what a conventional data API provides.

However, Fluidinfo can do additional super-duper cool stuff..!

Super-duper cool stuff!

Fluidinfo is an openly writeable database where objects have value because they are annotated with data from different sources. That’s why anyone can tag any data to any object. Since you control who can use, read and control your namespaces and tags, you still maintain control of data and importantly create a mechanism for trust.

You can trust values annotated with tags from the boingboing.net namespace because only BoingBoing is allowed to create and edit anything under this namespace. Since BoingBoing has annotated objects with information about articles, it’s safe to assume those objects are about BoingBoing articles.

Here’s the super-duper stuff: you can contribute data to these objects too.

How..?

I’m glad you asked…

First of all you’ll need an account on Fluidinfo. Once you’ve signed up you’ll be the proud owner of a top-level namespace with the same name as your username. Before you can add data to objects you’ll need to create a tag to add it with:

The new tag is given a name (“tuba”), description and an indication if it should be indexed. The “201” status that Fluidinfo returned confirms that the new tag was successfully created under the “ntoll” namespace.
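The shape of that request can be sketched as below. It describes the call rather than making it, and the details are assumptions to check against the API documentation: a POST to /tags/<namespace> with a JSON payload carrying the name, description and indexed fields mentioned above. The description text is made up for the example.

```python
import json


def create_tag_request(namespace, name, description, indexed=False):
    """Describe the request that would create a tag in a namespace."""
    payload = {"name": name, "description": description, "indexed": indexed}
    return {
        "method": "POST",
        "path": "/tags/" + namespace,  # assumed endpoint layout
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload, sort_keys=True),
        "expected_status": 201,  # "Created", as reported in the text
    }


request = create_tag_request("ntoll", "tuba", "Marks tuba-related things")
print(request["path"], request["body"])
```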

In case you hadn’t guessed, I like tubas! I’d like others to be able to find tuba-related objects in Fluidinfo, so I’ve decided to attach this newly created tag to anything tuba-related, including BoingBoing posts. As it happens, Fluidinfo helps me get a bunch of these posts with a search like this:

Oops, I forgot I’d already tagged a couple of non-BoingBoing objects with the ntoll/tuba tag: one whose about tag value is “CrossCountryTuba” and the other being the object that represents me in Fluidinfo.

Notice how the value for the ntoll/tuba tag on the object about “CrossCountryTuba” contains only metadata: the type of data stored by that tag on that particular object (image/jpg) and the size of the data (467947 bytes). Looks like it’s an image of some sort. Let’s get it and see:

Now we’ve covered a lot of ground, so let’s just consider where we’ve got to.

We have a consistent, simple and powerful API to play with.

We can retrieve values using a simple query language referencing data contributed from many different users.

We can contribute data ourselves in such a way that the data remains under our control.

We can put all our data in the right place. If I want to contribute something about a BoingBoing article I just tag it to the object representing the right BoingBoing article.

We can contribute all sorts of data be it searchable primitive values like numbers, text and booleans or opaque data such as images, audio or anything else for which you can specify a MIME type.

You’re armed with enough basic knowledge to both mine BoingBoing data and contribute to it too. In fact, if you look carefully you’ll find all sorts of interesting objects in Fluidinfo. Remember, to find out more about the API check out our technical documentation.

I’ve gotten to know Marc slowly over the last 10 years. We first met very briefly when he was CEO of Popular Power, a San Francisco start-up. Nelson Minar (Marc’s co-founder) and Derek Smith, two of my close friends who are very close to Marc, were both working there. Nelson and Derek, as well as several others including Fluidinfo investor and advisor Tim O’Reilly, have sky-high opinions of Marc. Hearing regular off-the-charts superlatives about Marc over the years always kept me interested in getting to know him better someday.

Marc was present at my first ever (abysmal!) solo VC pitch for Fluidinfo, to the ill-fated Bryce Roberts and Mark Jacobson of OATV in early 2007. During the presentation, Marc interrupted to ask if he could take a photo of my slide titled “Revenue”. I think he wanted it as an example of how not to pitch a VC. I’ve never forgotten. He snapped the pic, resumed his seat, and told me to carry on.

Marc has a ton of experience. He founded and led Lucas Online, the internet subsidiary of Lucasfilm, was director of engineering at Organic Online, and was also CTO at Webstorm. After Popular Power he was VP of Engineering at Sana Security, and then Entrepreneur in Residence at OATV, gaining intimate knowledge of the world of venture capital and interacting with hundreds of start-up companies. Marc then co-founded Wesabe where he was Chief Product Officer before becoming CEO. These days he’s Chief Product Officer at Daylife in New York.

As you can probably imagine, we’re honored and excited to have Marc involved at Fluidinfo.