Mis-matched parts

I’m a rubbish shopper. I don’t like browsing for hours, and I invariably find what I want, but in the wrong size, shape, or colour. This is most exasperating in shops with technical bits and pieces. Usually, this is because what I’m trying to do requires something kind of like a USB to HDMI adapter, but with Firewire and a place for my coffee mug. Hours and hours in a parts shop will usually find me with a basket full of various odds, ends, and middles which—when assembled—usually work, but after a fashion and after a lot of worrying about overloading circuits.

The simple reason behind this is that I’m not trying to buy a specific part that matches a nicely-designed and single-branded setup. I’m hacking something new out of what’s available, and that leaves me with the mess of trying to use tools made to support one thing (say, a lead for a TV that I don’t own) to give me what I want (say, console gaming through my monitor instead!).

I think this is similar to what Paul Miller was blogging about over on GigaOm, when he talked about data markets:

Matters become far more complex when you want to start combining different data sets, even within a single data marketplace. Typically, it’s not what these services are designed for, and typically, there is insufficient metadata to enable sensible combinations.

Data markets are very new, and they’re all going through various learning curves. Most, though, seem to be building shop-fronts that either allow for the easy purchase of data as-is (buying CSV files, for example) or are tailored for a particular data-related task (such as visualisation). It takes a lot to cater for sensible combinations of data!

The thing is, people who are shopping around a data marketplace are—almost by definition—developers and hackers. They’ll be looking for that dataset that lets them build an application, produce a service or answer a question. They’ll need it to match what they’ve got, or they’ll have to hack it into place and make sure they don’t end up crossing metrics, querying the wrong columns and returning completely wrong answers due to a misunderstood schema.

No data market could possibly anticipate every shopper’s exact recipe, but what one could do is provide everything in a way that makes remixing, chopping, changing and repurposing as easy as possible.

Kasabi is built natively on Linked Data: data expressed as RDF triples that can be queried and matched against any number of graph patterns. It’s the W3C’s recommended way to publish data on the web, because it allows the infrastructure of the web itself to handle the shape of the information.
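To make "triples matched against graph patterns" a little more concrete, here is a minimal sketch in plain Python. It is not Kasabi's API or a real RDF library: the `query` and `match_pattern` helpers, the `ex:` identifiers and the town figures are all invented for illustration. It only shows the idea that two datasets sharing identifiers can be combined and queried with one pattern.

```python
# A toy illustration of triples and graph patterns.
# Everything here (helper names, "ex:" identifiers, figures) is hypothetical.

def match_pattern(pattern, triple, bindings):
    """Unify one pattern with one triple; return extended bindings or None."""
    new = dict(bindings)
    for term, value in zip(pattern, triple):
        if term.startswith("?"):
            # A variable: bind it, or check it agrees with an earlier binding.
            if new.setdefault(term, value) != value:
                return None
        elif term != value:
            # A constant that does not match this triple.
            return None
    return new


def query(triples, patterns):
    """Return every set of variable bindings that satisfies all patterns."""
    solutions = [{}]
    for pattern in patterns:
        next_solutions = []
        for bindings in solutions:
            for triple in triples:
                extended = match_pattern(pattern, triple, bindings)
                if extended is not None:
                    next_solutions.append(extended)
        solutions = next_solutions
    return solutions


# Two separate (made-up) datasets that happen to use the same identifiers.
population = {
    ("ex:townA", "ex:population", "12000"),
    ("ex:townB", "ex:population", "45000"),
}
rainfall = {
    ("ex:townA", "ex:annualRainfallMm", "800"),
    ("ex:townC", "ex:annualRainfallMm", "650"),
}

# Combining them is just a set union; the pattern does the joining.
combined = population | rainfall
results = query(combined, [
    ("?town", "ex:population", "?pop"),
    ("?town", "ex:annualRainfallMm", "?rain"),
])
print(results)
```

Only the town that appears in both datasets satisfies both patterns, so the join falls out of the shared identifier rather than from hand-aligned columns.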

This doesn’t mean that every developer wanting access to data in Kasabi will need to learn a whole new way of doing things (that’s like asking all your shoppers to have the same size feet to make buying shoes simpler). Anyone who can develop for the web can use Kasabi’s APIs (and even tailor them for themselves). But it does allow for flexibility in the way data are queried, delivered and reused. By using a bit of graph-thinking, Kasabi will be much better placed to handle the problem Paul points out of data not matching up.

It’s our vision to allow developers to mix and match through use of Kasabi’s underlying Linked Data structure. This will let people reshape the information they need, using web standards, right in the market, rather than downloading it as-is and needing extra hacking steps to get the data structure to match their needs. As these features are polished, we’re also aiming to provide tools for blending data that don’t require a full set of developer’s skills. People with specialist knowledge, whether experts in their fields or simply those with an interest in particular datasets, should have the tools available to mix and match too.
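As a rough picture of what "reshaping" could mean in practice, the sketch below pivots a handful of triples into a CSV table with only the columns a user asks for. It uses just Python's standard `csv` and `io` modules; the `triples_to_rows` and `rows_to_csv` helpers and the `ex:` data are assumptions made up for this example, not a description of any Kasabi feature.

```python
import csv
import io

# Hypothetical helpers and data, for illustration only.

def triples_to_rows(triples, columns):
    """Pivot (subject, predicate, object) triples into one dict per subject."""
    rows = {}
    for subject, predicate, obj in triples:
        if predicate in columns:
            rows.setdefault(subject, {"id": subject})[predicate] = obj
    # Sort by subject so the output order is predictable.
    return [rows[subject] for subject in sorted(rows)]


def rows_to_csv(rows, columns):
    """Serialise the pivoted rows as CSV text, leaving missing cells empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer, fieldnames=["id", *columns], restval="", lineterminator="\n"
    )
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


triples = [
    ("ex:townA", "ex:population", "12000"),
    ("ex:townA", "ex:annualRainfallMm", "800"),
    ("ex:townB", "ex:population", "45000"),
]
columns = ["ex:population", "ex:annualRainfallMm"]

table = rows_to_csv(triples_to_rows(triples, columns), columns)
print(table)
```

Choosing a different list of `columns` gives a differently shaped table from the same underlying triples, which is the kind of reshaping that would otherwise happen after download.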

Different features will be rolled out and tested throughout our Closed and subsequent Open betas very soon. We’ll no doubt revisit the Linked Data side of the marketplace here on the blog, or you can drop in on us on IRC (#kasabi on freenode.net).