A Semantic Web Founding Father Explains Why Americans Should Care About Keeping Open Government Data Alive

There’s still no official word on how much peril open government data initiatives such as Data.gov may be in. And perhaps to many Americans, the hand-wringing they’ve heard about funding cuts in this area seems trivial when the country is looking at the U.S. public debt nearing its statutory ceiling of about $14.3 trillion. After all, what’s the real applicability of structured government data sets – and projects that translate that data into RDF, hook it up to the Linked Data cloud, and build apps and demos off it – to their lives?

More than they know. Open data matters to individuals in their roles as citizens, taxpayers, and community members (not to mention, potentially, as innovators), says one of the Semantic Web’s founding fathers, James Hendler, the Tetherless World Professor of Computer and Cognitive Science at Rensselaer Polytechnic Institute. There, at the Tetherless World Constellation, he is a leader of the Data-gov Wiki project that uses semantic web technologies to investigate open government data sets. (There are six other sites in the open government data initiative besides Data.gov, including USASpending.gov and paymentaccuracy.gov.)

Let’s start with the first point, about how open data matters to us as citizens who want to make intelligent decisions about the direction our country is heading in. How, Hendler asks, can we be effective moderators between opposing political points of view unless we can access the actual information that helps us understand and judge the merits of each side’s arguments? Now, it’s true that most individuals won’t themselves be reaching into open government data to see what it shows about whether one side’s proposal will cut jobs and another’s increase them, for instance, but intermediaries such as the media can leverage that data to help enlighten the citizenry.

“Much of this comes via other companies or traditional media, and now using this government data they can show us things like that — here’s what the situation is that can be backed up by data,” he says.

And, by the way, as a citizen and a taxpayer, you’re paying for the government to collect a lot of information anyway, regardless of whether or not it sees the light of day as open data. Take scientific data collected by agencies like the EPA or Health and Human Services, Hendler says. Researchers may want to review various data sets from these agencies and turn that information into something useful for the average individual – for instance, for what it has to say about issues such as environmental impact on health.

“We pay taxes as citizens to create that data that’s needed to monitor these things, but it hasn’t always been easy to get our hands back on it to use it,” Hendler says. “The ability to get that data back is very important,” and open data has made that job a lot easier.

Community Power

Another way open data matters is for how it “can power a lot of communities people will care about.” So many online communities exist around various issues, he says, pointing just as one example to New York as a state rife with such communities interested in health and food. “Where do you get data to figure out where the farms are, what they grow, what the transport network looks like between them, what kind of information helps you grow stuff in your backyard that would be healthful,” he says. “It’s out there because the government uses that data for the same or different purposes. …So when the government makes that available, then a community can build a level of information services on top of it in a way it couldn’t afford to do if it had to build the whole thing from scratch.” It also generally is too expensive for small communities to turn to private organizations that themselves sell such data.

Hendler also notes that the government would itself like to see this data create an innovation environment. “One Google helps the government in a huge way and those come about because of innovation,” he says. Government data that powers products or applications that become popular can create thriving companies, he says, and that’s a good thing for consumers, too. “The point is that creation of innovation is important and government data looks like it could be a good source of new innovations,” Hendler says.

Given the leading role the U.S. has taken on opening up government data, and its position as the leading data sharing source among nations today, could we be jeopardizing our competitive advantage vs. other countries that have come on board (and continue to do so) in our footsteps? Maybe. “Again, no one quite knows how the innovation economy around this stuff might grow,” Hendler says. “Certainly it is the case that people see a lot of opportunities and it would be sad for the U.S. to miss the boat by being pound foolish but penny wise.”

Even if you don’t buy the innovation argument — and some have rejected open government data as a driving force here — Hendler says it’s impossible to deny the community benefits that come from having access to data without having to pay third parties for it. “It’s clear there is value in that information,” he says.

It’s understandable, though, if most users don’t get the connection between continuing the trend the U.S. spearheaded to open up government data – never mind semanticizing it – and the benefits that accrue from that. We live, after all, in a world of apps and mash-ups, and while consumers enjoy using them, the percentage that follows the trail of how the provider (whether a company or a community programmer) puts these things together is probably pretty small.

How many citizens of Boston, for instance, realize the example Hendler speaks of, when he discusses how open government data advocates there convinced that city to make GPS data available from its buses, and then themselves built an application to help give riders better information about those services? This at a time when the government couldn’t afford to do that itself. “The government runs the buses, people get better service quality knowing where the buses are, and that feeds back to the government discovering it can use that [data] to make more efficient services, too,” he says. The upshot is that “we’re still just at the beginning of learning how to make the data available, learning how to share it, how to give it to communities, learning what its value is.”

That value should become more evident (assuming funding will be there to support it) with various communities growing on Data.gov for health, law, and more. “The idea is to create areas where groups can request specific data, and share technologies – the whole idea is for this to be a data mart, not just a place to download data,” Hendler says. “There’s more that it can be. The government is trying to create an open data ecosystem and we will see tremendous slowing in the growth of that ecosystem if the raw data can’t come out.” Also potentially in jeopardy would be opportunities to build APIs on Data.gov so people can get to data more easily, and more pre-built mash-ups to make the data even easier to use.

The Tetherless World Constellation, which is also doing a lot of work with international data, won’t see a direct impact on its work, Hendler thinks. Of course it would be a setback to some degree, as it’s easier to understand data about one’s own country, and much easier when Data.gov serves as a federated source for information coming from various municipalities across the U.S.

But it hits home on another, more visceral level, too. Hendler says he’s been pondering whether his reaction to the potential funding shortfall was the same as that of so many others who protect their turf when cuts loom, and concluded that the answer was no: there really is a national good in the project, for citizens, for creating an innovation economy, and for saving the government money, too. So perhaps the biggest hit to his team’s work, says Hendler, is that “we enjoyed the fact that we were doing what we thought of as a national service, and that would not be something we can do in the same way without the data.”

Right now, there’s still some hope that things won’t go completely dark. While cutting budgets to $2 million would make it almost impossible to maintain the open government data sites, even half of the $30+ million requested would be enough to keep things running at some steam. “What does it take to make back $14 million for the government?” Hendler says. “It’s not a huge investment, and what they’re getting back already could cover it.”

About the author

Jennifer Zaino is a New York-based freelance writer specializing in business and technology journalism. She has been an executive editor at leading technology publications, including InformationWeek, where she spearheaded an award-winning news section, and Network Computing, where she helped develop online content strategies including review exclusives and analyst reports. Her freelance credentials include being a regular contributor of original content to The Semantic Web Blog; acting as a contributing writer to RFID Journal; and serving as executive editor at the Smart Architect Smart Enterprise Exchange group. Her work also has appeared in publications and on web sites including EdTech (K-12 and Higher Ed), Ingram Micro Channel Advisor, The CMO Site, and Federal Computer Week.