Sebastopol brainstorm session by industry experts aims to make government data more accessible

NATHAN HALVERSON

THE PRESS DEMOCRAT | December 19, 2007

The U.S. government is one of the largest repositories of public information in the world.

It collects data on everything from pet ownership and bathroom remodeling to congressional votes and political contributions.

The sheer volume can be overwhelming. But it doesn't have to be, according to an influential group of technology experts who are pushing for governments to further digitize public data and make it easier to access.

In a two-day brainstorming session at O'Reilly Media in Sebastopol, a group of 30 industry leaders -- from the head of Google's public content division to the founder of Stanford's Center for Internet and Society -- hammered out eight guidelines that define what an open government looks like in the Internet age.

Their goal is to make public information easier to access, and their guidelines have already attracted interest from as far away as New Zealand, as well as from Congress and cities across the United States.

"This is part of our broader initiative to get government online," said Carl Malamud, a Sebastopol resident who helped force the U.S. Securities and Exchange Commission to put corporate financial documents online in 1994.

Notably, the group did not necessarily call for an improvement in government Web sites when it met in early December. Rather, the group wants access to the vast amount of publicly available data stored digitally by governments -- allowing entrepreneurs and innovators to present public data in unique ways.

Malamud's efforts with the SEC opened the door for companies such as Google and Yahoo to gather information about publicly traded companies and make it freely available on their Web sites -- everything from same-day SEC filings to stock quotes and insider sales by corporate executives.

Advocates want the same to happen with everything from congressional voting histories to crime reports and court opinions.

"If they have data available, they need to make it available with these eight characteristics," Malamud said.

The principles they outline range from making information such as video available in non-proprietary formats to ensuring that computers can process government data into a searchable format. For instance, some video formats, such as those used by RealPlayer and Windows Media Video, are proprietary and can only be played on licensed players. Also, putting text information online doesn't necessarily ensure a computer can compile it into a sortable database.

Advocates believe that governments -- from federal to municipal -- should standardize how digital data is collected, archived and distributed. This, in turn, will encourage the private sector to develop less-expensive and more innovative methods for analyzing and displaying the information.

"It means researchers can build other things," Malamud said. "It's an innovation issue. Some young startup should be able to make a better case law search tool."

Web sites such as MapLight.org and Chicagocrime.org are examples of organizations using public data to inform people in innovative new ways.

Chicagocrime.org taps into the Chicago Police Department's online database of recently reported crimes and then uses Google Maps to display the information. The nonprofit site lets people search crimes by street name, ZIP code or type of offense, displaying the results on an interactive map for easy interpretation.
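
For readers curious what such a search involves once crime reports are available as structured records, the following is a minimal Python sketch. It is not Chicagocrime.org's code: the CrimeReport fields, the search_reports function and the sample records are all invented for illustration, and the mapping step is left out.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional


@dataclass(frozen=True)
class CrimeReport:
    # Hypothetical record layout; a real police feed will differ.
    street: str
    zip_code: str
    offense: str


def search_reports(
    reports: Iterable[CrimeReport],
    street: Optional[str] = None,
    zip_code: Optional[str] = None,
    offense: Optional[str] = None,
) -> List[CrimeReport]:
    """Return the reports that match every criterion that was supplied."""

    def matches(value: str, wanted: Optional[str]) -> bool:
        # A criterion left as None is treated as "no filter".
        return wanted is None or value.lower() == wanted.lower()

    return [
        r
        for r in reports
        if matches(r.street, street)
        and matches(r.zip_code, zip_code)
        and matches(r.offense, offense)
    ]


# Made-up sample data, not real crime reports.
SAMPLE_REPORTS = [
    CrimeReport("Main St", "60601", "theft"),
    CrimeReport("Main St", "60601", "burglary"),
    CrimeReport("Oak Ave", "60614", "theft"),
]

thefts = search_reports(SAMPLE_REPORTS, offense="theft")
```

The point of the sketch is that a search like this only works when each report arrives with separate, consistently named fields, which is what the advocates mean by data a computer can process.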

MapLight.org analyzes publicly available voting records of California legislators, and then pairs how a politician voted on a bill with the amount of campaign contributions the official received from interested industries.

"The votes and the money have been available publicly for decades," said Dan Newman, co-founder of MapLight. "But never have they been available together. So for instance, you can see that legislators that voted in favor of pharmaceutical companies got almost three times as much money from the drug companies as others."
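
The pairing Newman describes amounts to joining two data sets on the legislator's name. The Python sketch below shows one way that could look, under stated assumptions: the function name, the input shapes and every figure in it are hypothetical and are not MapLight's data or method.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple


def average_contributions_by_vote(
    votes: Dict[str, str],
    contributions: Iterable[Tuple[str, float]],
) -> Dict[str, float]:
    """Average industry contributions received, grouped by how legislators voted.

    votes maps a legislator to "yes" or "no" on one bill.
    contributions is a list of (legislator, amount) pairs from one industry.
    """
    # Step 1: total the money each legislator received.
    totals: Dict[str, float] = defaultdict(float)
    for legislator, amount in contributions:
        totals[legislator] += amount

    # Step 2: pair each vote with that legislator's total (0 if none recorded).
    grouped = defaultdict(list)
    for legislator, vote in votes.items():
        grouped[vote].append(totals.get(legislator, 0.0))

    # Step 3: average within each voting group.
    return {vote: sum(amounts) / len(amounts) for vote, amounts in grouped.items()}


# Invented example figures, for illustration only.
sample_votes = {"Legislator A": "yes", "Legislator B": "yes", "Legislator C": "no"}
sample_contributions = [
    ("Legislator A", 3000.0),
    ("Legislator B", 1000.0),
    ("Legislator A", 2000.0),
    ("Legislator C", 1000.0),
]

averages = average_contributions_by_vote(sample_votes, sample_contributions)
```

With these invented numbers, the "yes" group averages 3,000 and the "no" group averages 1,000. The join itself takes a few lines; the hard part is getting both data sets in a form where legislator names match.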

MapLight has paired political donations with how California politicians voted on more than 5,000 bills from 2003 to 2004. But compiling the data is a time-consuming process for the nonprofit organization, because the information is not easily accessible in digital format.