Open Data: Empowering the Empowered or Effective Data Use for Everyone?

The open data movement in the area of access to public (and other) information is a relatively new but very significant, and potentially powerful, emerging force. It has now been widely endorsed by, among others, Tim Berners-Lee, generally acknowledged as the father of the World Wide Web. The overall intention is to make local, regional and national data (and particularly publicly acquired data) available in a form that allows for direct manipulation using software tools, for example for the purposes of cross-tabulation, visualization, mapping and so on.

The underlying idea is that public (and other) data, whether collected directly as part of census collection or indirectly as a secondary output of other activities (crime or accident statistics, for example), should be available in electronic form and accessible via the web. There are significant initiatives in this area underway in the US, the UK and Canada, among many other jurisdictions.

This drive towards increased public transparency and enhanced, data-enriched citizen/public engagement in policy and other analysis and assessment is certainly a very positive outcome of public computing and online tools for data management and manipulation. However, as with the earlier discussion concerning the “digital divide”, there would appear in this context to be some confusion between movements to enhance citizen “access” to data and the related issues of enhancing citizen “use” of this data as part, for example, of interventions concerning public policies and programs.

In an earlier paper dealing with the digital divide discussion I suggested the concept of “effective use” to distinguish the opportunity for digitally-enabled activity presented by ICT access from the actual realization of those opportunities. At that time I introduced a set of layers of requirements, which can be understood as “pre-conditions” for the realization of “effective use” of digital “access”.

Efforts to extend access to “data” will perhaps inevitably create a “data divide”, parallel to the oft-discussed “digital divide”, between those who have access to data which could have significance in their daily lives and those who don’t. Associated with this will, one can assume, be many of the same background conditions which have been identified as likely reasons for the digital divide, that is, differences in income, education, literacy and so on. However, just as with the “digital divide”, these divisions are not simply resolved by the provision of digital (or data) “access”. What is necessary as well is that those for whom access is being provided are in a position to actually make use of the now available access (to the Internet or to data) in ways that are meaningful and beneficial for them.

The question then becomes, who is in a position to make “effective use” of this newly available data?

The suggestion implicit in most of the discussions on “open data” (and explicit in Berners-Lee’s above quoted talk) is that “everyone” has the potential to make use of the data. However, as we know from experience elsewhere, not “everyone” has access to the digital infrastructure, to the hardware or software, or to the financial or educational resources/skills which would allow for the effective use of data or any other digital resource. Thus rather than the entire range of potential users being able to translate their access into meaningful applications and uses, the lack of these foundational requirements means that the exciting new outcomes available from open data are available only to those who are already reasonably well provided for technologically and with other resources.

The example that Berners-Lee quotes concerning the role of the data mashup in the Zanesville lawsuit is an interesting case in point. In this instance, the direct creators of the mashup were the Cedar Grove Institute, a public interest consulting firm specializing in GIS applications, employing several leading Ph.D. GIS specialists and with a University of North Carolina MBA as its CEO. The lawyer who argued the case, and who presumably deployed the mashup so effectively, is a Harvard Law School graduate.

Of course, there is nothing wrong with this, nor with the outcome of their intervention and their use of open data—in fact, as with Berners-Lee, I think this is an exemplary case of the positive benefits for people that can come from open data.

However, this is a very long way from what folks like Berners-Lee seem to be asserting, which is that “open data” empowers everyone. In fact, the example indicates precisely the opposite: “open data” empowers those with access to the basic infrastructure and the background knowledge and skills to make use of the data for specific ends. Given that these resources are more likely to be found among those who already have access to, and the resources for making effective use of, digitally available information, one could suggest that a primary impact of “open data” may be to further empower and enrich the already empowered and the well provided for rather than those most in need of the benefits of such new developments (unless, of course, they have the means or the luck to find benefactors such as the Cedar Grove Institute or Harvard Law School graduates willing to work pro bono or on a contingency basis).

A very interesting and well-documented example of this empowering of the empowered can be found in the work of Solly Benjamin and his colleagues looking at the impact of the digitization of land records in Bangalore. Their findings were that newly available access to land ownership and title information in Bangalore was primarily being put to use by middle and upper income people and by corporations to gain ownership of land from the marginalized and the poor. The newly digitized and openly accessible data allowed the well-to-do to take the information provided and use it as the basis for instructions to land surveyors, lawyers and others to challenge titles, exploit gaps in title, take advantage of mistakes in documentation, and identify opportunities and targets for bribery, among other things. They were able to directly translate their enhanced access to the information, along with their already available access to capital and professional skills, into unequal contests around land titles, court actions, offers of purchase and so on, for self-benefit and to further marginalize those already marginalized.

Certainly the newly digitized information was “accessible” to all on an equal basis but the availability of resources to translate that “access” into a beneficial “effective use” was directly proportional to the already existing resources available to those to whom the access was being provided. The old story about the pauper and the millionaire having equal opportunity to purchase a printing press as a means to promote their interests can be seen as holding equally here as in the 19th century.

Benjamin’s meticulously documented paper shows how the digitization and related digital access to land title records in Bangalore had the direct effect of shifting power and wealth to those with the financial resources and skills to use this information in self-interested ways. This is not to suggest that processes of computerization inevitably lead to such outcomes but rather to say that in the absence of efforts to equalize the playing field with respect to enabling opportunities for the use of newly available data, the end result may be increased social divides rather than reduced ones particularly with respect to the already poor and marginalized.

As well, this is not to argue against “open data” which in fact is a very significant advance and support to broad-based democratic action and empowerment but rather to argue that in the absence of specific efforts to ensure the widest possible availability of the pre-requisites for “effective use” the outcome of “open data” may be quite the opposite to that which is anticipated (and presumably desired) by its strongest proponents.

An “effective use” approach to open data would thus be one that ensured that opportunities and resources for translating this open data into useful outcomes would be available (and adapted) for the widest possible range of users. Thus, to ensure the effective use of open data a range of considerations needs to be included in the open data process and as elements in the open data movement including such factors as the cost and availability of Internet access, the language in which the data is presented, the technical or professional requirements for interpreting and making use of the data, the availability of training in data use and visualization, among others.

An interesting example of how open data, with appropriate attention being given to some of these pre-conditions, can in fact provide a basis for effective use can be seen in how the UCLA Center for Health Policy Research’s California Health Interview Survey (CHIS) has been put to use by Community Advocates in Solano County. The CHPR conducts a bi-annual California Health Interview Survey in conjunction with the California Department of Health “to provide a snapshot of the health and healthcare of Californians”. The survey is used by a range of political authorities but, most interestingly, the CHPR provides free and widely accessible training on how to use the information “to develop appropriate and targeted policy responses” and overall “to learn how to use and apply the data to improve health and health care”. That is, the information is not only made accessible but attention is paid and resources are provided to ensure that the data is usable by those who might make effective use of it.

In this instance, the Solano County Community Advocates were trained so as to be able to take the data provided by the CHIS and plot incidences of asthma by local electoral district. They were then able to create a map showing an extremely high frequency of asthma among residents in a particular local area. The Community Advocates successfully argued against developing another truck stop along I-80 in the county based on CHIS 2001 data estimates that showed Solano County to have the state’s highest rate of asthma symptom prevalence overall and one of the highest rates for children.
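
To make the kind of analysis involved a little more concrete, here is a minimal sketch of a cross-tabulation by district of the sort the advocates were trained to do. Everything in it is an assumption for illustration: the district names, the records and the function names are invented, this is not CHIS data, and a real survey analysis would also apply the survey's sampling weights.

```python
from collections import defaultdict

# Hypothetical survey records: (electoral district, respondent reports asthma symptoms).
# These rows are invented for illustration and are NOT CHIS data.
RESPONSES = [
    ("District A", True),
    ("District A", False),
    ("District A", True),
    ("District A", False),
    ("District B", False),
    ("District B", False),
    ("District B", True),
    ("District B", False),
    ("District C", False),
    ("District C", False),
]


def asthma_rate_by_district(responses):
    """Return {district: share of respondents reporting asthma symptoms}."""
    totals = defaultdict(int)
    cases = defaultdict(int)
    for district, has_asthma in responses:
        totals[district] += 1
        if has_asthma:
            cases[district] += 1
    return {district: cases[district] / totals[district] for district in totals}


def rank_districts(rates):
    """Sort (district, rate) pairs from the highest rate to the lowest."""
    return sorted(rates.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    for district, rate in rank_districts(asthma_rate_by_district(RESPONSES)):
        print(f"{district}: {rate:.0%}")
```

The point of the sketch is how little machinery is needed once someone has been shown what to do; the skill that the training supplies is knowing which question to ask of the data and how to read the answer.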

While in many respects this example parallels the earlier one from North Carolina, the difference here is that the skills required for doing the analysis of the online data were provided through training to the local community, who were then able to mount a local campaign to achieve the desired end. The key difference was the attention that was paid by the provider of the information, the CHPR, to ensuring that the data could be effectively used without the need for highly skilled (and expensive) professional intermediaries. This involved the development of end-user-oriented training programs.

In this instance it should be noted that Internet access, bandwidth, the language of the data among other factors were not an issue. However, in other circumstances such as for example among indigenous peoples, non-English speakers, the very poor, those living in areas with poor connectivity and so on, these issues will be inhibitors of use of open data and a responsible intervener would be concerned to ensure that these issues were attended to as part of an open data program.

(Additionally, a discussion of the difficulties and the types of interventions required to ensure that effective use can be made of information by the intended clients can be found in the very interesting report from Shelter in the UK on Social inclusion in the Digital Age. This report documents a very useful approach to providing some of the tools needed for effective use of online information by those to whom that information is being directed and who would necessarily be those who could make the most active and effective use of that information: information on housing for the homeless being made available for use by the homeless themselves. Regrettably, the project being reported upon has been canceled by the UK government.)

For a more detailed discussion on “effective data use” and overcoming the “data divide” see the next blogpost at:

Some fantastic points in this post, very much what I found in my recent MSc dissertation on Open Government Data in the UK (http://practicalparticipation.co.uk/odi/report/). The inequalities of effective use are significant, both in access to use data as a tool in securing change for an individual, and in terms of who gets to be interpreters and gatekeepers of what open data means.

There is a particular aspect of the ‘open data movement’ (in practice I think an economically driven ‘Public Sector Information’ movement, an open government movement, and a digitising government computerisation movement all wrapped up together) in terms of the narrow focus in many contexts (particularly the UK with TBL’s role, but I think increasingly other national initiatives) on machine-readable datasets and encoding datasets to be interoperable, sometimes at the cost of their usability in other contexts.

Many open data advocates assume that technically skilled users will mediate access to this open data by providing user-friendly platforms, but in practice I’m not sure this is happening in any widespread way.

I hope this post generates some good discussion about the detail of the directions we’re heading in with open data. Like you, I see open data as a key aspect of deepening and developing democracy, but not as a development that will automatically and necessarily promote the equality of access and effective use required for that to be realised.

This is an interesting post. I have been writing about openness and transparency for a little while and, hopefully, I should be able to put up a blog post of some of my work by the end of this week. Your post has provoked me to look through some of the material and analyses that I have been working with. Here are some thoughts and my own contentions in response to your post and to the notions of ‘public access’, ‘open access’, ‘public data’ and ‘openness’.

I will begin with Solly Benjamin’s work that you have cited in this post. You state:

“Benjamin’s meticulously documented paper shows how the digitization and related digital access to land title records in Bangalore had the direct effect of shifting power and wealth to those with the financial resources and skills to use this information in self-interested ways. This is not to suggest that processes of computerization inevitably lead to such outcomes but rather to say that in the absence of efforts to equalize the playing field with respect to enabling opportunities for the use of newly available data, the end result may be increased social divides rather than reduced ones particularly with respect to the already poor and marginalized.”

My problems lie with the contentions that you have made here. Solomon Benjamin’s work clearly shows that making ‘public’ certain kinds of data, in this case information about land records and titles, is problematic in the first place. Therefore, before making any data public, the question that needs to be asked is whether the ‘openness’ will result in infringing on the liberties and autonomy of particular groups. I find that ‘openness’ and ‘transparency’ are seen, lauded and promoted as universal virtues without attending to the fact that certain kinds of transparencies and open access may actually have negative effects for particular groups of citizens.

Benjamin’s work does not say that the playing field has to be equalized in order for people to benefit from open access. In the case of land ownership and occupancy, the field is highly tenuous in the first place because ways and means of owning and holding land are highly cultural, social, historical and political. The law does not recognize the context specificities and historical trajectories of different forms of land ownership and occupancies. Digital technologies, when they are deployed to implement equality in such fraught contexts by creating ‘open access’ to what is perceived as ‘public’ data, actually end up narrowing and even homogenizing the meaning of land ownership. Such databases, digitization processes and technologies then actually make it possible for law, law-makers and policy-makers to immediately act upon and clamp down on individuals and groups whose land ownership practices do not conform with the single, universal, legal meaning of land ownership and title. This is what computerization invariably does in particular contexts.

Given this, is the answer then no ‘openness’ and no ‘public access’? Am I then saying that we should throw the baby out with the bath water? No. I am suggesting that the issue with respect to ‘open access’ and ‘public data’ is not simply and merely one of digital divides and enabling all people with tools and infrastructure which will help them to access and effectively use the data. I want to suggest instead that, fundamentally, the issue at hand is what kinds of information we view as ‘public’ and, if we say that there is indeed something called ‘public data’, then who is this public in the first place and why does such information/data need to be coded for deepening democracy? I will go on to say, at the cost of much self-criticism, that in certain contexts it is wiser not to store information in databases and make it open access. Information is coded, stored, memorized and accessed in many social and cultural ways and through intricate kinds of networks of individuals and groups. That such information is culturally coded, stored, memorized and even accessed in ways that are public to the public which needs to know the information in question can also present opportunities for democracy (if democracy is understood in terms of enabling negotiations and interactions between people).

To conclude quickly then, we need to examine what we understand by the term ‘public’? Who is the ‘public’ that is likely to benefit from ‘openness’? Does ‘openness’ translate uniformly in all the contexts and circumstances to which it is applied? Is ‘openness’ always a virtue for everyone or do we need to understand and conceive of different ways in which information is socially, culturally, historically and politically stored, coded, memorized and accessed?

Interesting comments and I generally agree. I should point out that the argument concerning the “effective use” of “open data” was my own and not Benjamin’s. He and his colleagues were, overall, making a somewhat different, although important, point about the centralizing effects of digitization (as well as your point about the way in which the digitization process, and the resulting data, have to be seen within a specific cultural, social, and economic context). For the purposes of my discussion this contextualization would be one of the elements that would need to be included in an appropriate design for effective use.

I think this discussion draws apart usefully the two sides of advocating for ‘equality promoting open data’.

On the supply side, we need to be critical about how data is encoded, and how certain ways of encoding data prioritise certain sets of individuals. For example, many involved in producing open data resources are attached to the idea of ‘friction free’ data use (the ability to pull down and combine any two or more datasets based almost entirely on the machine-readable meta-data that can make them interoperable) rather than use based on having a conversation with the data-owners, or having to draw on human interpretation of the data’s meaning and potential comparability / compatibility. There is the question of how we build an equality promoting architecture. (For example, it might be right to provide some land ownership data in open forms, but to ensure that its digital representation only points to ambiguous local records as the authoritative source of information on the topic, rather than the digital version becoming the authoritative record: socially just outcomes from open data may come from making for ‘less friction’ but not making data ‘friction free’.)

On the use side, we have to address who has the resources, skills, capacities to use open data – in some sense a more classic digital divide problem. This is where I’d understood an “effective access” perspective to be focussed. Would that be a fair interpretation?

This is in reply to my earlier comments and also to Michael’s and Tim’s responses.

I re-read Michael’s post twice to get a stronger grasp of some of the issues raised in the post. On second and third readings, I realized that the post definitely raises very useful questions, insights and issues concerning open access and effective use, particularly the fact that effective use cannot be achieved only by providing more and more infrastructure. The issue of interpreting data is equally, or perhaps more, critical, and here again the question is who does the interpretation, in what way and for which publics. At the same time, which publics are putting the data and its interpretation to effective use/s, and how does effective use by one kind of public or particular communities impact the immediate and larger socio-economic-political-spatial ecosystems?

The other issue also lies with definitions and here, Tim’s point of demand and supply (who is demanding open data and who is supplying this open data) remains critical. Thus, in the case of crime and cities, the issue remains as to who is providing access to data concerning crime and crime records, how this data is interpreted and who is doing the interpretation, particularly in the light of the fact that in cities in some parts of the world, the histories and geographical patterns of segregation are so entrenched that these continue to perpetuate beliefs about crime, criminality and which groups are indulging in the ‘crimes’. Open data and open access in this case become very sensitive because, despite making the data public and accessible, the data may only continue to perpetuate beliefs which may be disconnected from, and/or discount, ground realities. I state this more from my experience of Johannesburg city, where certain areas are marked as ‘criminal’ and where open data about crimes may not account for the intensity, the gravity, the historicity and the natures of crimes in different parts of the city. Moreover, what counts as crime in one part of the city may be only a minor infringement or a much more serious problem in another part. So also, what count as crimes in one city may not have similar intensity and gravity in other parts of the world. Therefore, perhaps, the onus on the interpretation front is also to contextualize the meanings of crime and to attend to the historical and spatial patterns which underlie ‘crimes’.

The problem of open access in the case of land records is that, in the Indian experience, which I know somewhat about, the government takes the responsibility to computerize land records by digitizing them and then making some of the details public to enable rural citizens to get their land records quickly and without much delay. The problem with the digitization, as Tim has rightly pointed out, is the manner in which the data tends to get encoded. Typically, digitization of land records would mean either scanning the record as it is, or inputting all the data on the record as it is, without changing any fields. In the case of Karnataka state, of which Bangalore city is the capital, we find that historically, the state was created by amalgamating four different regions. Given that the state has four different regions with historically different forms of governance, political cultures and land administration systems, the ways of maintaining land records are highly diverse. Now, when one arm of the government, in this case the revenue department, comes in and says that it will digitize all the land records to enhance transparency and make the records open or public, it does this not by accounting for the diversity but by struggling to figure out how to create a common database system where information from all the records will be stored. Moreover, the information that is stored in such a land records database is about the latest owner/tenant/cultivator, which then eliminates very rich and of course tenuous/contested histories of the past: the histories of how the land parcel came to be the way it is, who made what kinds of claims at what points in time, how those claims were resolved, whose claims were diminished and whose claims got priority, how sub-divisions and usufruct rights were negotiated and enforced, etc. The transition from paper to digital was done to enhance efficiency and transparency in transactions.

Invariably, in this case, the way in which the systems were created and what finally came out enforced a highly linear history of land ownership and land parcels, thereby transforming and in some cases homogenizing existing understandings and notions of tenure. Also, ownership is not the only means of holding a land parcel. There are other ways which may not legally qualify as ownership but are in fact social, cultural and historical ways of using/holding land. Digitization in the case of Karnataka has enforced and given primacy to only ‘legal’ ‘ownership’, which then has implications not only for governance systems but also for democracy, in terms of the nature and scope of interactions and negotiations between local groups, local administration and state administration.

When governments have done digitization of land records, as has been the case in India, the idea has also been to then give legal sanctity, authenticity and final authority to the digitized record. Government decrees and orders in India, in this respect, clearly state that once a record is digitized, it is final and binding and the paper record is no longer valid. The question then remains: when various kinds of data are made available to people through digital technologies, do other and older systems where similar data was coded, stored and accessed become insignificant or even legally illegitimate? Does opening up data and making it available online imply that the digital version of the data is the only authentic one? Are there instances on the ground where users and interpreters also access older systems where the data was stored even when the same data is available digitally?

Finally, I think the friction-free and less-friction nuance which Tim has raised is interesting. In the context of social justice, where the notion of social presumes similarity/equality/homogeneity in the polity/citizenry and the notion of justice tends to be predicated more or less on absolute rights and entitlements rather than on claims that are negotiated differently at various points in time, there are likely to be frictions in the uses, interpretations, and among representatives and those represented – sometimes more, sometimes less, sometimes vociferous – when certain kinds of data are made open/public in particular ways. How ambiguities are represented/eliminated/accentuated/interpreted/perceived in digital forms and simultaneously in older forms will be most useful to document in order for effectiveness in present and future uses and interpretations for various publics.

Maybe I am missing the point completely, but weren’t land records already “open” and “public” in India, though they were never free? I believe anyone can apply for certified copies after paying the stamp duty and printing charges, which combined are pretty cheap, almost peanuts…

It may not be openness but digitization of land records that led to the misuse of Open Data in Bangalore. Land sharks all over India have been doing this all the time through traditional means such as bribery and strong-arm tactics, but digitization and easy access to such a huge amount of data have obviously helped them take exploitation to the next level. If the government put up digitized data for empowering the poor, then this was yet another abuse of government policies and initiatives out of many.

However, the government should give priority to collecting and digitizing census data (are they available?), especially of the marginalized segments, which may attract more social entrepreneurs rather than greedy corporations.

Prior to digitization, land records in India were available to people who made requests with village accountants for them. At that time, the ‘book’ kept by the village accountants was the source of information about records. Among some of the most vociferous arguments advanced in favour of digitization, this practice of the village accountants singularly having land records information and (presumably, in all circumstances) holding farmers to ransom for issuing records was seen as a barrier to efficiency, accountability and transparency in governance. Now, one of the issues is that prior to digitization, this information about land records was ‘locally’ available and, despite all the arguments of corruption and inefficiency leveled against the village accountants, it also remained that encountering the village accountant and mobilizing his agency, directly or indirectly through other networks, constituted politics and interfaces with the government and the state. The outcomes of such mobilizations were not totally certain and neither was this kind of politics perfect, ideal and normative, but people managed to get hold of their records one way or the other.

With digitization, such as the Bhoomi programme in Karnataka, as you rightly pointed out, you could now access land records of any place and any farmer if you knew the survey number/s of the plot/s. This kind of easy availability of land records’ information, again as you rightly pointed out, was problematic because now anybody and everybody could get hold of the information about your land parcel and, given the money and muscle power that developers can have, they could use this information to swoop down upon individuals/groups who were not cooperating in making certain transactions around the land. The other issue, as my colleagues and I have found through another study, is that after digitization of land records and subsequently other revenue services, village accountants no longer personally visit the villages they are in charge of or sit in the panchayat offices of these villages and conduct surveys and meetings with villagers/individual farmers. They have now shifted the accountability to the computer kiosks and to the private operators because of the state sponsored public private partnerships in providing revenue services digitally. Thus, what has happened with digitization is a reorganization of earlier forms of social and political relations (as well as economic ones) that underlay land records information and transactions around obtaining land records. The accountability systems, post-digitization, have moved the levers of accountability from the immediate village level to the taluk/hobli level and have aligned/realigned legal and political structures that are variously favourable and unfavourable, advantageous and disadvantageous, to different socio-economic and political groupings in rural areas.

The digitization process, as I have tried to outline in some of the above comments, was not making a perfect digital copy of the record. Rather, the database created for storing land records actually only stored a part of the information. Also, with digitization, the importance of land records changes. The land record now comes to be viewed as the single most important document which proves your ownership over the land parcel. Earlier, by contrast, given the way in which the records were kept, you could variously use the land records information to make different kinds of claims over the state and/or within the communities and networks you were associated with. So yes, the digitization has its share of problems and it becomes necessary to interrogate the claims of efficiency, transparency and accountability underpinning digitization: efficiency, transparency and accountability for whom and at what costs?

On making census data public, I have to think about it. I have been against making voter lists publicly available in the past because I have experienced rioting and violence personally, and because during communal riots in Delhi and Mumbai, mobs and rioters managed to get information about particular identity groups through voter rolls. Openness remains double-edged and, in certain situations, a precarious virtue. Having said this, I also want to suggest that, unlike the imagination and the practices which digital technologies (can and tend to) usher in, can openness be conceived in terms of the social, cultural, historical and political practices of communities where certain kinds of information remain open in particular ways to particular publics ….

writerruns,
Different Indian states have their own land reforms/ revenue laws and from your accounts it seems Karnataka has a slightly different mechanism. Nonetheless, I fully understand your contention.

“Rather, the database created for storing land records actually only stored a part of the information. Also, with digitization, the importance of land records changes. The land record now comes to be viewed as the single most important document which proves your ownership over the land parcel.”

However, what I learnt from a brief glance at the NIC portal is that the records are available for information sharing only and cannot be used as legal documents, i.e. they won’t be admissible in court as certified copies of the record, which means the records in their original format are still updated manually too. I would be surprised otherwise, since without certified copies the court administrations would lose a lot of the revenue which goes into their maintenance. As I said before, I am not familiar with the laws and procedures of Karnataka and would want to know them before commenting.

But in Orissa, before digitization, you were required to go to the Tehsildar’s office, apply for a RoR or other details and get it after months of bribing and greasing the official machinery; online it takes a few minutes to do the same thing. But in either case (online or Tehsil office) the documents are reports, and there is no way to manipulate them other than by using the same old techniques: deceit, muscle power, bribery. It is for the state to implement policies to check these. It is for the government to ensure that accountants, clerks and bureaucrats work honestly and efficiently.

On digitization of census data and voter lists, it is almost certain that in the next election Open Data would be used to rig the election, but here again openness is not the issue: they would find the data anyway, through cable TV/mobile subscription lists etc. In any case, some people would always have access to the records; is there any way to know for sure that these people won’t abuse them? As for data being used to target certain segments during riots, two of the worst communal carnages in modern India saw the ruling party and the administration engineer them. In contrast, not making voter lists public would definitely bring down voter turnout significantly and make absolutely sure that the incumbent wins the election.

Openness, as you rightly said, is a double-edged sword; the actor wielding it should make sure that it doesn’t inflict injury on self.

Thanks again Tim. I must say that in the original post I was dealing, as you say, only with the “user” side, and your comments concerning the “data provision/supply” side are very interesting and well taken. Once I have a chance to think further on that (and look through the thesis that you reference) it might be the subject of another blogpost, but in the meantime I would urge those interested to read both of your comments above.

Your blog raises a number of interesting and clearly valid concerns about ‘open data’. I appreciated your distinction between the availability of data and its effective use. I would suggest that there is a third leg which is too often omitted, and that is the knowledge and experience with which to interpret data. The digital availability of data does not make everybody able to access it or competent to interpret it, any more than the availability of graphic design software made everybody competent to design graphics (although many users failed to understand that distinction). While there are enormous benefits in open data, there are also significant perils caused by the assumption that access=understanding.

There seems a valid parallel with the law profession, and the right of people at all socioeconomic levels to have access to professionals – not just to the statutes. Your emphasis on ‘effective use’ of open data needs to be extended to the requirement for many more informatics professionals to be available to help interpret the data for all demographic groups.

There is a subtle belief that access, experience and knowledge /is/ power, as the proverb goes. But knowledge matters little if not /combined/ with power: money, weapons, media penetration, organized or important people and so on. Thus there are many individuals with insights into their situation but without any means to change it. This adds to Michael’s “Empowering the Empowered”.

Such a perspective could be linked to a recent article and theorization on tracing and reconfiguring networks as a new type of political project:
Yannick Rumpala, “Knowledge and praxis of networks as a political project”, Twenty-First Century Society, Volume 4, Issue 3, November 2009.
Abstract:
Modern-day society is increasingly described as an extensive web of networks, but as such, it is often perceived and experienced as elusive. In light of this paralysing description, this paper aims to highlight the potentially political dimension of network analysis, namely as defined in the social sciences, and of the notion of networks itself. It will be shown that a political project could, in this case, be built on the desire to know this reticular world better, but also to be able to act appropriately towards it. Three steps are proposed to specify how such a political project could be built. The first step aims at deploying knowledge of networks and emphasises the usefulness of a procedure to trace them. The second step shows the possibilities that this knowledge offers, particularly in allowing one to find one’s bearings in a world which is frequently described as veering towards an increasing complexity, and by helping to rebuild the selection criteria for connections in this world, thanks to an additional degree of reflexivity. The third step draws on these points to extend them and bring out potentialities with regards to the intervention capacities in network configurations.
See also http://yannickrumpala.wordpress.com/category/networks-and-rhizomes/

The importance of training is exactly the reason we organized monthly hands-on trainings and yearly conferences for the government data website I formerly helped run, the MetroBoston DataCommon.

In addition to community-based groups like the one in LA you cite, journalists are often another interesting intermediary between government data and citizens. Projects like Everyblock.com and other “database-driven” reporting transform raw data into more understandable formats, although through the journalists’ own lenses.

Finally, a note on the example of digital parcels. In most of the US, the digitization has already been done, but the data is made available only to users of specialized GIS software and often at a substantial fee. (In Boston the complete data requires purchasing a $138 CD – for a 40 MB file.) In essence, digital parcels are already helping the privileged. Putting them up in a machine-readable format would at least create the potential for community groups and advocates to make “effective use” of this data.

Excellent observations Rob and thanks for the additional example. In Canada community and advocacy based mapping has been severely retarded because the Canadian Geographical Survey sells the base maps for truly exorbitant sums. I discussed doing a somewhat parallel project to the California one here in BC with some folks at one point but we had to abandon this because the cost of the maps was just too high.

Thanks for this interesting post, which helps us take some distance from a movement which, as often with young and fresh ideas, carries a certain dose of naivety.

As you will read without any surprise, I totally agree with your general comment: technology and information per se don’t empower the majority of those who need to be empowered, and technology per se doesn’t have democratic virtues; rather, the way we implement the technology and information in society will determine its capacity for social change.

Talking about the digital divide, we have all seen those beautiful telecenter projects, granted brand new computers and connections, but with no budget to hire skilled people to help those who were supposed to visit these public access points, and no money for training or for content building. One explanation for this might lie in the question of visible and invisible, tangible and intangible. I am convinced that sponsors (public as well as private) are culturally thinking as we did in the previous century, in the material world, and want to see their money embodied in tangible objects (computers) and not in immaterial know-how. (But this leads us far away from open data.)
On the other hand, we have some beautiful examples, although rare, of funding that takes into account the whole chain required for empowerment: access + training + conditions for participation. Indeed, the city of Brest in Brittany, France, has decided that the web site providing information (culture, history, geography…) about its region would be open to any person willing to contribute, and has therefore provided a wiki. This is where a traditional authority would have stopped its action. The city of Brest instead funded some local non-profit organizations to set up training sessions for the inhabitants: how to use a wiki, how to write a post, under which conditions information can be shared etc… And the result is a lively web site http://www.wiki-brest.net/index.php/Wiki-Brest,_les_carnets_collaboratifs_du_Pays_de_Brest, full of useful information.

Coming back to open data, I think one of the problems with open data evangelists such as TBL is that they are Western-centric and can only consider the potential effects in countries with a long democratic history, where the problem is to overcome the limits of representation, participation being one possible track. If we take the Bangalore example, the process of consolidating various ownership systems, including rules governing the commons, was carried out in Western countries centuries ago, most often also in pain and sorrow (see Peter Linebaugh in The Magna Carta Manifesto: Liberties and Commons for All).

Having said that, the questions you raise, Michael, are still totally valid, even in a Western context. I would like to point out some levers which might be helpful to ensure that open data empowers the right people.
I think most often open data operates in “a 3 sides snooker” as we say in French (hope it makes sense in English). One side is the web site where the data is posted, or better said, mashed up: a web site run by civic hackers or democratic movements. The second side is the people themselves, often in a contributive mode (e.g. posting information) or in an interactive mode (e.g. comparing their opinion with the data available). And the third side is data journalism (e.g. The Guardian, OWNI…).
Let me be more explicit through 2 examples.
On a website like http://www.electionleaflets.org/, people can upload the leaflets distributed by the candidates during the campaign. Scanning and uploading a leaflet cannot be done by just anyone (you need a scanner, a computer etc.), but it is a relatively simple task and it can be done by people who might not feel at ease in an electoral debate. They can therefore contribute, even in a modest form, to enlightening the public debate by allowing comparison between the candidates’ positions or making public some disrespectful material (racist, insulting…). Nevertheless, if some disrespectful material is disclosed, it will not have a real impact on the course of the election unless some media, alerted by the website, grab the information and make it public in a more classical way. (Wikileaks operates the same way, presenting its “hottest” documents to the mainstream media before publishing them.)
Here you see an interesting combination between traditional tools of democracy (press freedom) and new tools for democracy (open data, civic hackers, digital participation).

Another interesting example, which again goes along Michael’s line, is the issue of open data and criminality. It is amazing to see the success of crime databases in the US (they are among the most downloaded on data.gov). If you go to the San Francisco open data app showcase, the first set of apps is related to crime.
One could argue that raw open data on criminality contributes to an anxious society, obsessed with crime, and that it helps people choose their house according to the level of criminality in the district, consequently increasing urban segregation. All this is true.
But when you look at an app like http://apps.facebook.com/ukcrimestatsquiz/, it’s another story. By helping people realize there is a gap between the reality of criminality and their intuitive perception, the application contributes to raising the awareness of citizens and their capacity to distance themselves from the dominant vision conveyed by the traditional media.

So again, technology and information need to be inserted into approaches that help people develop the cognitive skills necessary for an active citizenship.

I attended an “open government” conference in the US recently, where the conversations, both in the prepared presentations and in the hallways, were mostly about creating physical access to data.

Discussion about making meaning for their communities was notably absent. I did hear a little bit about the consequences of open data, such as making crime data available on a neighbourhood basis would likely have an impact on property values in the neighbourhood, but it was a pretty primitive conversation… no thought was being given to understanding the contributing circumstances.

I asked one of the municipal CTOs at the conference how his city’s publicly accessible data would be used and was told that he expected that local entrepreneurs would create “apps” that would transform the newly available data into socially useful products.

Although everyone at the conference had the goal of creating a better community through open data, I don’t think the concepts have matured sufficiently for those creating today’s tools to have easy access to the concepts. We certainly need to continue doing stuff in this nascent field, such as putting data online, and creating apps on the practical level, trying different mashups, creating user interfaces and we desperately need a more comprehensive conceptual framework that is accessible to practical folks: politicians, CTOs, entrepreneurs.

Pushing this conversation forward will require the convening of a number of disciplines and vested interests within the community… the subject cuts across many disciplines and professions. I think the group convened here is a great start!

An interesting conversation. I think it’s important to realize that the Indian example is a particularly extreme one. In many cases open government data is far less problematic — meteorological data is a good example, I think. It’s also an example of how concerns about social justice and equal access work in practice. It’s true that elites will be the ones who use the data; but everyone benefits from cheaper weather information services.

I think it’s also important to distinguish between digital data’s effects on efficiency and its effects on possibilities. In some cases these effects are difficult to disentangle. In the Indian case, though, it seems as though the opportunities for taking advantage of the poor already existed. Open data allowed this particular market to clear much more quickly — and in an obviously problematic way — but the essential issue is that the market was structured unfairly.

In computer security circles, relying on the difficulty — but not impossibility — of obtaining data to prevent negative effects is known as “security through obscurity”, and it’s rightly mocked. It’s simply not a viable long-term strategy for achieving desired effects. Put another way: the genie is not going back in the bottle. We need to do our best to minimize the disruptions that can be caused as open data reveals the flaws in our social systems. But fixing those systems is ultimately the only real way to prevent abuses.

All this is great, and wonderful. But a more centralized open source approach could be utilized in a fashion similar to my evolving project known as the Universal Debating Project. More importantly, there must be a way of reducing the amount of information into lists, or numbered “sentences” similar to note-taking…otherwise the sheer diversity of data can be quite “overwhelming”. This is an urgent issue which needs to be addressed by all “stakeholders” such as NGOs, governments, businesses, et al. http://www.p2pfoundation.net/Universal_Debating_Project

[…] a long excerpt from Gurstein's post, Open Data: Empowering the Empowered or Effective Data Use for Everyone? A very interesting and well-documented example of this empowering of the empowered can be found in […]

[…] A while ago I wrote a bit of a rant about Schooloscope, and how its over-simplification of school data made us feel perhaps smarter than we really are. Mike Gurstein, who is Executive Director of the Centre for Community Informatics Research, Development and Training (Vancouver BC and Cape Town, South Africa), has written another angle on a parallel issue. He argues that open data is, of course, a good thing, but that without proper training in its use it just empowers those with the social capital – Internet access, education, time – who can then, in the time-honoured fashion, suck resources away from the less-empowered. Open Data: Empowering the Empowered or Effective Data Use for Everyone? « Gurstein’s Communit…: […]

[…] This is making the rounds today. As usual, I see it not only through the regulatory lens, but the educational one as well — and think it is yet another data point as we think about eCitizenship in our own country: A very interesting and well-documented example of this empowering of the empowered can be found in the work of Solly Benjamin and his colleagues looking at the impact of the digitization of land records in Bangalore. Their findings were that newly available access to land ownership and title information in Bangalore was primarily being put to use by middle and upper income people and by corporations to gain ownership of land from the marginalized and the poor. The newly digitized and openly accessible data allowed the well to do to take the information provided and use that as the basis for instructions to land surveyors and lawyers and others to challenge titles, exploit gaps in title, take advantage of mistakes in documentation, identify opportunities and targets for bribery, among others. They were able to directly translate their enhanced access to the information along with their already available access to capital and professional skills into unequal contests around land titles, court actions, offers of purchase and so on for self-benefit and to further marginalize those already marginalized. […]

[…] Open Data: Empowering the Empowered or Effective Data Use for Everyone? « Gurstein’s Communit… This blog by Michael Gurstein provides well-balanced and informed commentary as well as insightful views on Community Informatics and the use of Open Data throughout the world. Gurstein's posts are well written and often attract comment from interesting and informed contemporaries and contain links to quality information including referenced articles and government papers. The tagline of the blog is “enabling and empowering communities with information and communications technologies” and this is largely the focus of blog posts; however, there are often parallel lines between Community Informatics and the use of Open Data and this is touched on often in the commentary. […]

[…] tweet last week alerted me to a ground-shaking study. Mike Gurstein in “Open Data: Empowering the Empowered or Effective Data Use for Everyone?” says this drive towards increased public transparency and allowing for enhanced data […]

[…] a post that seemed to get a fair amount of traction last week (Open Data: Empowering the Empowered or Effective Data Use for Everyone?), Mike Gurstein wrote: Efforts to extend access to “data” will perhaps inevitably create a […]

[…] Does open data only empower the empowered? This excellent blog post by Michael Gurstein uses the example of the digitized land registry in Bangalore to warn that open data initiatives may empower the already-empowered to use information in self-interested ways: “This is not to suggest that processes of computerization inevitably lead to such outcomes but rather to say that in the absence of efforts to equalize the playing field… the end result may be increased social divides rather than reduced ones.” […]

[…] Finding new ways to broker information – bring together needs with haves and different participants, empowered and disempowered is., as Anselm discussed with me, one way to change our view of human to human, human to environment and human to civilization communication (particularly in light of this “sobering account of how open data is used against the poor in Bangalore” that as @timoreilly noted recently OpenData Empowering the Empowered). […]

[…] Another approach to get huge amounts of data is through sensors, which measure all sorts of factors from our environment. The idea is that soon low-cost sensors will be available, for example, to measure noise, air quality or one’s physical condition. Such sensors can also be RFID chips, but can go even further. These sensors could be included in a watch or mobile phone; this way millions of people can deliver real time information. Sounds like science fiction, but there are already some crowdsourcing projects “using humans as sensors”. Citypulse wants to measure the air quality through the contributions from pedestrians. And with a smart phone it is easy to join a project to measure the noise level worldwide. […]

[…] As is the case with any other tool that is very powerful, Open PSI can also have negative effects, even if in the big picture, or in the medium/long term, their advantages still greatly outweigh the disadvantages. One first, potential disadvantage of opening PSI (more on this later) can be temporary disillusion and loss of interest for politics, if not disgust, in citizens. Another, more likely risk, is the fact that, at least initially, Open Data may only benefit people in the upper classes of society who have, on average, better Internet connectivity and much more familiarity with online services than the others, who may therefore be harmed. A perfect, very recent example of this problem has been discussed in September 2010 by M. Gurstein: […]

[…] remains very unevenly distributed – not that big a leap of faith. In a paper aptly titled “Empowering the empowered?”, Mike Gurstein quotes a 2007 research paper on the digitization of land records in Bangalore: […]

[…] Gurstein’s blog post last year on Open Data: Empowering the Empowered, or Effective Use for Everyone sparked some interesting discussions about how open data policies and practices impact different […]

[…] “Shiny app syndrome” and Gov 2.0 >> O’Reilly Radar“Where the conversation in Manor got heated, however, was when Texas state government officials revealed that there was no Android or BlackBerry app, nor was there a mobile version of the Texas.gov site. One attendee, CityCamp founder Kevin Curry, asked a simple but important question: Are .gov iPhone apps ‘empowering the empowered?’ Given that such apps require an internet connection and an expensive iPod Touch or iPhone, do they essentially add to a digital divide? Is this an evolution of the issue that Michael Gurstein raised in September, where open data empowers the empowered?” […]

[…] following the thread of danah boyd’s “transparency is not enough” talk at the 2010 Gov 2.0 Expo. Open data can empower the empowered. To make open government data sing, infomediaries need to have time and resources. If we’re going […]

[…] lot has been written recently about the fact that open data alone is not enough to make a difference. Data needs to be put into the hands of those who can use it to make a difference, and if the only […]

[…] when we say that we’re advocating for better data access for all. As transparency bloggers have talked about before, efforts to increase transparency can have unexpectedly oppressive effects: A very interesting and […]

[…] decisionmakers. Correcting asymmetries of information can ameliorate asymmetries of power, despite the occasional troublingly counterintuitive result. Look at what Public Laboratory is doing: democratizing technology to make it possible for ordinary […]

[…] As Jesse Lichtenstein asserted, “open data alone isn’t enough,” following the thread of danah boyd’s “transparency is not enough” talk at the 2010 Gov 2.0 Expo. Open data can empower the empowered. […]

[…] “The unfortunate part is that the data to power a truly democratic process exists,” said Massa. “We all know that no one is hand-drawing maps and then typing out the lengthy legislative proposals that describe, in text, the boundaries of a district. The fact that the political parties use tech and data to craft their proposals and then, in most cases, refuse to publish the data they used to make their decisions, or electronic versions of the proposals themselves, is particularly infuriating. This is a prime example of data ‘empowering the empowered‘.” […]

[…] Efforts to extend access to “data” will perhaps inevitably create a “data divide” parallel to the oft-discussed “digital divide” between those who have access to data which could have significance … […]