The Promise of Open Government Data & Where We Go Next

One of the biggest boons for the Open Data movement in recent years has been the enthusiastic support from all levels of government for releasing more, and higher quality, datasets to the public. In May 2013, the White House released its Open Data Policy and announced the launch of Project Open Data, a repository of tools and information (to which anyone is free to contribute) that help government agencies release data that is “available, discoverable, and usable.”

Since 2013, many enterprising government leaders across the United States at the federal, state, and local levels have responded to the President’s call to see just how far Open Data can take us in the 21st century. Following the White House’s groundbreaking appointment in 2009 of Aneesh Chopra as the country’s first Chief Technology Officer, many local and state governments across the United States have created similar positions. San Francisco last year named its first Chief Data Officer, Joy Bonaguro, and released a strategic plan to institutionalize Open Data in the city’s government. Los Angeles’ new Chief Data Officer, Abhi Nemani, was formerly at Code for America and hopes to make LA a model city for open government. His office recently launched an Open Data portal along with other programs aimed at fostering a vibrant data community in Los Angeles.1

Open government data is powerful because of its potential to reveal information about major trends and to inform questions pertaining to the economic, demographic, and social makeup of the United States. A second, no less important, reason why open government data is powerful is its potential to help shift the culture of government toward one of greater collaboration, innovation, and transparency.

These gains are encouraging, but there is still room for growth. One pressing need is for more government leaders to establish Open Data policies that specify the type, format, frequency, and availability of the data that their offices release. A clear Open Data policy ensures that government entities not only release data to the public, but release it in useful and accessible formats.

Only nine states currently have formal Open Data policies, although at least two dozen have some form of informal policy and/or an Open Data portal.2 Agencies and state and local governments should not wait too long to standardize their policies about releasing Open Data. Delay will severely limit Open Data’s potential, because data released in formats that are hard to process is hard to use. There is not much that a data analyst can do with a PDF.

One area of great potential is for data whizzes to pair open government data with web crawl data. Government data makes for a natural complement to other big datasets, like Common Crawl’s corpus of web crawl data, that together allow for rich educational and research opportunities. Educators and researchers should find Common Crawl data a valuable complement to government datasets when teaching data science and analysis skills. There is also vast potential to pair web crawl data with government data to create innovative social, business, or civic ventures.
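As a rough illustration of what such a pairing might look like, here is a minimal Python sketch that matches a tiny, made-up government dataset of agency websites against a made-up list of crawled URLs and counts how many crawled pages belong to each agency. The agency names, the example.gov addresses, and the helper functions are all hypothetical placeholders, not real datasets or Common Crawl APIs; a real project would read an actual open dataset and an actual crawl index instead.

```python
import csv
import io
from collections import Counter
from urllib.parse import urlsplit

# Hypothetical open government dataset: agencies and their official websites.
AGENCIES_CSV = """agency,website
Example Parks Department,https://parks.example.gov
Example Transit Authority,https://transit.example.gov
"""

# Hypothetical URLs standing in for records from a web crawl.
CRAWLED_URLS = [
    "https://parks.example.gov/trails",
    "https://parks.example.gov/permits",
    "https://transit.example.gov/schedules",
    "https://unrelated.example.com/",
]


def host(url):
    """Return the lowercased hostname of a URL, or an empty string if absent."""
    return (urlsplit(url).hostname or "").lower()


def pages_per_agency(agencies_csv, crawled_urls):
    """Count crawled pages whose hostname matches each agency's website."""
    counts = Counter(host(u) for u in crawled_urls)
    reader = csv.DictReader(io.StringIO(agencies_csv))
    return {row["agency"]: counts[host(row["website"])] for row in reader}


if __name__ == "__main__":
    for agency, n in pages_per_agency(AGENCIES_CSV, CRAWLED_URLS).items():
        print(f"{agency}: {n} crawled pages")
```

Joining on hostname is only one possible key; depending on the datasets, a place name, an organization identifier, or a date could serve the same purpose.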

Innovative government leaders across the United States (and the world!) and enterprising organizations like Code for America have laid an impressive foundation that others can continue to build upon as more and more government data is released to the public in increasingly usable formats. Common Crawl is encouraged by the rapid growth of a relatively new movement and we are excited to see the collaborations to come as Open Government and Open Data grow together.

 

Allison Domicone was formerly a Program and Policy Consultant to Common Crawl and previously worked for Creative Commons. She is currently pursuing a master’s degree in public policy from the Goldman School of Public Policy at the University of California, Berkeley.

Big Data Week: meetups in SF and around the world

 

Big Data Week aims to connect data enthusiasts, technologists, and professionals across the globe through a series of meetups from April 19th to 28th. The idea is to build community among groups working on big data and to spur conversations about relevant topics ranging from technology to commercial use cases. With big data an increasingly hot topic, it’s becoming ever more important for data scientists, technologists, and wranglers to work together to establish best practices and build upon each other’s innovations.

With 50 meetups spread across England, Australia, and the U.S., there is plenty happening from April 19th to 28th. If you’re in the SF Bay Area, here are a few noteworthy events that may be of interest to you!

  • Bio + Tech | Bio Hackers and Founders Meetup on Tuesday, April 24th, 7pm at Giordano in the Mission. This will be a great chance to network with a diverse group of professionals from across the fields of science, data, and medicine.
  • Introduction to Hadoop on Tuesday, April 24th, 6:30pm at Swissnex. This is a full event, but you can join the waiting list.
  • Infochimps Presents Ironfan on Thursday, April 26th, 7pm at SurveyMonkey in Palo Alto. Hear Flip Kromer, CTO of Infochimps, present on Ironfan, which makes provisioning and configuring your Big Data infrastructure simple.
  • Data Science Hackathon on Saturday, April 28th. This international hackathon aims to demonstrate the possibilities and power of combining Data Science with Open Source, Hadoop, Machine Learning, and Data Mining tools.

See a full list of events on the Big Data Week website.