Over the last four years open government and open data have been at the forefront of the debate on how governments can become more transparent, participative and efficient. The theory is well known: rather than (or alongside) providing the government’s interpretation or packaging of public data, this data should be made available in raw, open format for people to build their own views and applications.

Open data evangelists say that this is an essential component of any open government initiative, and they must be right, given the number of jurisdictions that are pursuing this around the world. I do agree with the principle that making data equally available to everybody creates a level playing field and helps overcome some of the most evident problems with information not being available or carrying a spin that precludes it from being really transparent.

The downside is a deluge of data. People can easily drown in raw open data that is either too much or simply meaningless unless some processing takes place.

But who is supposed to do the processing? It can’t be government. Or – better – it can be, but this would bring us back to square one, with suspicion of government cooking its data to prove a certain point or to hide some uncomfortable reality. Then you have the so-called “civil society”, made of voluntary and advocacy groups, activists, as well as lobby groups, corporations and the mythical “application developers”.

It is reasonable to assume that each of these groups has a vested interest in packaging open data. “Vested interest” should not be read as a negative term. The interest may just be to increase transparency, if you are an activist, or to provide people with better information about a particular noble cause, if you are an advocacy group, or simply to make a name for yourself. But less optimistic scenarios are also plausible: businesses trying to influence consumer behaviors, supporters of less noble causes “torturing” data to make their case, and so forth.

What is interesting, though, is not to try to foresee which of these trends will prevail. Let’s agree that there will be orders of magnitude more good than evil coming out of this: this is not the point. The point is that there will be people and organizations that are able to use, package and leverage masses of open data, and people who are not. There will be people who rely on the way in which other people and organizations package data to take important decisions for their lives.

One might argue that this is not different from the past: information has always come through few sources, and open data cannot but broaden those sources, hence creating a clear opportunity for diversity and transparency. My contention is that this may turn out to be an illusion. The more data there is and the more sophisticated the analysis and presentation tools, the more specialized the skills and resources required to process that data. Although consumer technologies become increasingly powerful and massive processing resources become available as a commodity, making sense of big, open data is not for the faint of heart, and will require significant investments for some time to come.

Every time I suggest that there might be a darker side to open data, some of the evangelists and supporters come after me as if I were a scaremonger or just in denial of the great future that open data will help create for us all. I do sincerely hope for the open data potential to be fully realized. But this will require more people to play devil’s advocate, to scrutinize what infomediaries and application developers are doing, to provide people with ways that allow them to both benefit from and critically review open data packaging and analysis.

The open data train is in motion and there is no way that calls for caution like mine can slow it down. They are meant to make sure that more than a few benefit from this “transparency spring”, and to prevent a new divide from being created between those who have the skills and resources to interpret open data and those who don’t. In my humble opinion, such a divide would be far more insidious than the one between those who have access to technology and the Internet and those who don’t. In fact, after almost two decades of positive actions to provide access to all, after major broadband deployment programs in several countries, now that we may be close to bridging the gap, it would be very sad to fall prey to a new, less evident but equally (if not more) pernicious divide.

The open data ‘movement’ is almost certainly a force for good. But you hint that sometimes the response – in terms of applications created around open data – is not as wonderful as many governments would hope for. Or, indeed, there may be no response at all.

Many of these issues will be discussed at an upcoming event, CITIZEN 2012, in London on June 28. The event will be live-streamed completely free. I hope that some of your readers may like to watch and participate.

As Churchill said, “democracy is the worst form of government except all those other forms that have been tried from time to time.”

Like democratizing access to power, democratizing access to data can have negative consequences, especially when that access is distributed unevenly in practice. But I’ll take that any day over having to trust a benign dictatorship or oligarchy with control.

[…] Andrea Di Maio of Gartner recently articulated concerns about open data processing, particularly the divide between data professionals who have the skills to do so and those who do not. Di Maio writes, “Over the last four years open government and open data have been at the forefront of the debate on how governments can become more transparent, participative and efficient. The theory is well known: rather than (or alongside) providing the government’s interpretation or packaging of public data, this data should be made available in raw, open format for people to build their own views and applications… The downside is a deluge of data. People can easily drown in raw open data that is either too much or simply meaningless unless some processing takes place.” […]

Comments or opinions expressed on this blog are those of the individual contributors only, and do not necessarily represent the views of Gartner, Inc. or its management. Readers may copy and redistribute blog postings on other blogs, or otherwise for private, non-commercial or journalistic purposes, with attribution to Gartner. This content may not be used for any other purposes in any other formats or media. The content on this blog is provided on an "as-is" basis. Gartner shall not be liable for any damages whatsoever arising out of the content or use of this blog.