Open Data Node – what it is, what it does, what is next

One of the main goals of the COMSODE project is the creation of a publication platform for Open Data called Open Data Node (see project description). Open Data Node (or ODN for short) will be, or to some degree already is, Open Source, and it is intended mainly for publishing Open Data. That is the basic use case. Our motivation is to provide a tool that makes this publication repeatable, automated and easy to use.

So, what does it do?

Organizations, companies or even individuals wishing to publish Open Data from existing information systems usually face many difficulties stemming from where the data currently resides:

Data usually sits in internal systems, isolated from the Internet and the general public, and even from the rest of the organization and its staff.

Data is maintained in complex, heterogeneous environments, using a wide variety of formats and many different technologies to access and process it.

This sometimes forces those wishing to publish Open Data either to abandon their plans or to make compromises, such as publishing only manually, or failing to deliver timely updates, necessary clean-up, etc.

In order to publish Open Data from such systems properly, i.e. using Open formats, in machine-readable form and in a timely manner, Open Data Node does the following:

it extracts (harvests) data from internal systems using any available interface and method that allows doing so safely, effectively and at low cost,

it processes that data, performing format conversions, cleansing, anonymization, enrichment, linking, etc. (and, as part of that, also compiling some metadata about the data),

it stores the results (both data and metadata), effectively serving as a cache that protects internal systems from overload when user demand for the data is high,

it makes the results available to the general public and businesses, supporting both common users (with the usual office tools on PCs or other devices) and application developers (equipped with powerful software development tools and above-average hardware), and it also implements automated and efficient distribution of updated data (increments) and dataset replication (including metadata),

it allows all this to be automated, easy to use and easy to maintain.

In short: Open Data Node helps publishers of Open Data cope with the complexity of source data and continuously delivers easy-to-use, high-quality Open Data to users.
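To make the steps above more concrete, here is a minimal sketch in Python of the extract, process, store and publish flow. It is an illustration only and not ODN's or UnifiedViews' actual code or API: the sample export, the PERSONAL_FIELDS list, the Store class and all function names are assumptions made up for this example, and dropping columns is only the simplest possible form of anonymization.

```python
import csv
import io
import json

# Hypothetical export from an internal system (semicolon-separated, decimal commas).
INTERNAL_EXPORT = (
    "id;name;email;amount\n"
    "1;Alice;alice@example.org;100,50\n"
    "2;Bob;bob@example.org;20,00\n"
)

# Assumed list of columns that must not be published.
PERSONAL_FIELDS = {"name", "email"}


def extract(raw):
    """Harvest rows from the source export as a list of dicts."""
    return list(csv.DictReader(io.StringIO(raw), delimiter=";"))


def process(rows):
    """Convert formats, drop personal fields and compile basic metadata."""
    cleaned = []
    for row in rows:
        # Simplistic anonymization: withhold personal columns entirely.
        record = {k: v for k, v in row.items() if k not in PERSONAL_FIELDS}
        record["id"] = int(record["id"])
        # Format conversion: decimal comma to a machine-readable number.
        record["amount"] = float(record["amount"].replace(",", "."))
        cleaned.append(record)
    fields = list(cleaned[0].keys()) if cleaned else []
    metadata = {"record_count": len(cleaned), "fields": fields}
    return cleaned, metadata


class Store:
    """In-memory cache of results; readers never touch the source system."""

    def __init__(self):
        self._datasets = {}

    def save(self, name, data, metadata):
        self._datasets[name] = {"data": data, "metadata": metadata}

    def as_json(self, name):
        """Output aimed at application developers: data plus metadata."""
        return json.dumps(self._datasets[name], sort_keys=True)

    def as_csv(self, name):
        """Output aimed at users of common office tools: plain CSV."""
        entry = self._datasets[name]
        buffer = io.StringIO()
        writer = csv.DictWriter(
            buffer, fieldnames=entry["metadata"]["fields"], lineterminator="\n"
        )
        writer.writeheader()
        writer.writerows(entry["data"])
        return buffer.getvalue()


def run_pipeline(name, raw, store=None):
    """Run extract -> process -> store once; scheduling it gives automation."""
    store = store if store is not None else Store()
    data, metadata = process(extract(raw))
    store.save(name, data, metadata)
    return store
```

Calling run_pipeline("payments", INTERNAL_EXPORT) and then as_csv("payments") or as_json("payments") on the returned store serves the same cached result in two formats, without another request to the internal system. Incremental updates and replication are left out of this sketch.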

And what is it?

To some degree, ODN already exists. Its main component was released recently (see the blog post about UnifiedViews). More about ODN’s components will be published in subsequent articles.

As already mentioned, ODN is software. So it is not a service. But it can be used to create one.

And not just any software: ODN is Open Source software, licensed mainly under a combination of GPLv3 and LGPLv3. Thus you can use it yourself without paying license fees. You can use it as it is, customize it, or integrate it with something else. If you lack the skills to do that yourself, you can buy support, either from COMSODE consortium members (e.g. from EEA Company) or from somebody else, even building local expertise in your own country.

Given that, ODN can also be seen as an integration component that helps to reliably and safely bridge the information systems of public bodies with the IT infrastructure of citizens and businesses.

Is it an all-in-one package, a silver bullet?

Having reached this point, you may think ODN is a wonder tool that does everything, The Only Right Solution In The World(TM), a sort of silver bullet. No, it is not.

ODN is not:

It is not a service, and it is not a Cloud.

Given demand and “room in the market”, we may provide a Cloud service. But others already provide such services, and for now this is outside our project scope.

It is also not a Data Catalogue. It is meant to complement data catalogues, not to replace them.

We will make sure ODN plays well with data catalogues, so that data previews work as expected and, when a user clicks a link to get the data, the data is delivered.

ODN is not and will not be the only solution for publication of Open Data.

We do not plan to replace simple shell scripts if they work well. Nor do we want to force ODN into places where a simple update of existing information systems (e.g. accounting software, CMS, DMS, etc.) can achieve publication of Open Data more effectively.

What is next?

Throughout the COMSODE project, we will keep enhancing UnifiedViews and combining it with other components, as well as working on delivering Open Data Node as promised within the next year, with alpha and beta versions and pre-releases.

To make that happen in an Open fashion, more articles will follow shortly, describing in more detail, for example, the history of ODN, its architecture and design, the motivation behind it, its intended use cases and so on.