Open Data: An Interview with U.S. Deputy CTO Nick Sinai

Brad is a contributor to FedTech. He has covered technology since 1991, when he got his start in the industry.

Data: The federal government has lots of it and wants to give much of it away. In response to the administration’s open-government directive and the president’s executive order on open data issued last year, the White House Office of Science and Technology Policy (OSTP) and the Office of Management and Budget (OMB) are leading efforts to make more government data publicly available and easier to find and use. Agencies must now inventory all their existing data and publish a list of all of their public data. And when they create new data, they must ensure that it’s in machine-readable format so others can use it.

U.S. Deputy Chief Technology Officer Nick Sinai is at the center of these open-data initiatives. He spoke to FedTech Managing Editor Brad Grimes about their details and the potential for solving challenges through open data.

FEDTECH: When you say agencies need to make data more open, what data are you talking about?

SINAI: The government collects and creates a vast amount of data — in its research and scientific activities, through procurement, via regulation and through a variety of different programs. Often, that data can be made public as fuel for companies and economic growth. So we want to continue opening up data that fuels private-sector innovation or helps build a more efficient and accountable government.

The classic examples are global positioning and weather data. Opening those valuable information resources over the past few decades has led to countless useful products and services. Think about your local news weathercaster, weather apps and services, or even financial instruments, such as weather insurance for farmers. Those are all powered by weather data from the Commerce Department’s National Oceanic and Atmospheric Administration.

Similarly, the global positioning system was something opened up by Presidents Reagan and Clinton. We now have a vast ecosystem — a multibillion-dollar ecosystem — of products, services and companies, from navigation systems to maps to apps at your fingertips via smartphones. It’s also how we have higher crop yields, thanks to precision, GPS-enabled tractors. Civilian access to GPS has increased maritime safety, as our supertankers navigate via GPS. Those are just some of the most well-known examples of open data. We’ve worked over the past few years to launch a series of open-data initiatives in health, energy, education, public safety, global development, finance, climate, geospatial and more in an effort to transform government data into powerful resources for private-sector innovation.

FEDTECH: What is required of agencies as part of the open-data initiative?

SINAI: The president issued an executive order last May, and with that, the Office of Management and Budget and the Office of Science and Technology Policy published OMB Memorandum M-13-13, the Open Data Policy memo. The memo is about managing information as an asset and thinking about openness and interoperability throughout the information lifecycle. The president has made open and machine-readable the default for new or substantially modernized information systems, while ensuring that we still protect data where appropriate, as in the case of personally identifiable or private information.

So the new default is for data to be open and machine-readable, where possible, but we also have a lot of existing systems in the federal government. OMB M-13-13 requires agencies to inventory their data assets. After all, we inventory desk chairs and other physical resources, and data are valuable virtual resources within departments and agencies. Agencies need to be able to catalog and list data sets and identify those that can be made public. Then they can begin to work with external users of that data to make it easier for the public to find and use. There is a series of requirements in the Open Data Policy, and agencies are making great progress toward meeting them.

FEDTECH: What do OMB and OSTP mean by “machine-readable” as it pertains to open data?

SINAI: This is all about making data easier to find and use, starting with better human-readable descriptions of the data. We also want to make government data more interoperable and machine-readable for outside software developers. Usually the easiest first step is to make data available via bulk download in formats like CSV, JSON [JavaScript Object Notation] or something similar. And there are a number of agencies doing great work in building application programming interfaces for stakeholders to access and use their data repeatedly. For example, the Food and Drug Administration is rolling out an API this year for data in cases of adverse effects of drugs reported to the agency.
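
For readers who want a concrete picture of what "machine-readable" can mean in practice, here is a minimal sketch of the first step Sinai describes: taking data published as CSV and re-emitting it as JSON. The sample rows, column names and function names are hypothetical and purely illustrative; they are not drawn from any agency's actual data or tooling.

```python
import csv
import io
import json

# Hypothetical sample data; not taken from any real agency data set.
SAMPLE_CSV = """station,date,temp_f
A1,2014-01-01,31.5
A2,2014-01-01,28.0
"""


def csv_to_records(text):
    """Parse CSV text into a list of dicts, one per row, keyed by column header."""
    return list(csv.DictReader(io.StringIO(text)))


def records_to_json(records):
    """Serialize the parsed rows as a JSON array of objects."""
    return json.dumps(records, indent=2)


if __name__ == "__main__":
    records = csv_to_records(SAMPLE_CSV)
    print(records_to_json(records))
```

Note that the CSV reader leaves every value as a string, so a real publisher would likely also document each column's type and units alongside the file.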

FEDTECH: Are there deadlines for opening agency data?

SINAI: Initial deadlines to get started were the end of November 2013. Agencies are now required to report a snapshot of their entire enterprise data inventories to OMB quarterly.

A subset of an agency’s enterprise data inventory includes those data sets that are public or can be made public. Agencies are required to post and keep this information current, called a public data listing, on their .gov/data web pages. That’s something that should be updated regularly, as they continue to make data sets available.
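
The relationship between the two lists can be sketched in a few lines of code: the public data listing is simply the enterprise data inventory filtered down to the entries that are public. The entries, the `access_level` field and its values below are assumptions made for illustration only; the actual metadata fields agencies use are defined by the Open Data Policy guidance, not by this example.

```python
import json

# Hypothetical inventory entries; field names and values are illustrative assumptions.
INVENTORY = [
    {"title": "Weather station readings", "access_level": "public"},
    {"title": "Employee records", "access_level": "non-public"},
    {"title": "Grant awards", "access_level": "public"},
]


def public_listing(inventory):
    """Return only the entries marked public, preserving their original order."""
    return [entry for entry in inventory if entry["access_level"] == "public"]


if __name__ == "__main__":
    print(json.dumps(public_listing(INVENTORY), indent=2))
```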

The public can find data sets on an agency’s web page, as well as on the main Data.gov site, which consolidates all the data web pages across government. We’ve rebuilt the Data.gov site to make it easier to use and more mobile- and tablet-friendly, and we continue to [update] it, shipping code every two weeks.

Agencies are also required to have public-feedback mechanisms, including on their data pages. OMB is working with agencies to make sure they continue to expand, enrich and develop those inventories. We recognize this is going to be an ongoing process, and inventories and public-data listings are only going to improve over time.

FEDTECH: How has progress been so far?

SINAI: We’re very excited about the progress agencies have made, but there is still a lot of work to be done. Ultimately, this is an ongoing effort, part of managing an agency’s assets, and a real shift in how we think about data as a valuable national resource. The open-data initiative is governmentwide, so you’re going to see agencies proceed at different paces and ultimately engage with their users and stakeholders about the technology solutions that work well for them — approaches aren’t going to be exactly the same between an agency or department that manages troves of geospatial data and one that manages education data.

One example is how Secretary [Penny] Pritzker has made open data a strategic priority for the Department of Commerce. NOAA recently released a request for the public, industry and stakeholders to provide input about the agency unleashing even more of its valuable, large data sets.

We also continue to host brainstorming and collaboration events, called “datajams” and “datapaloozas.” These events help facilitate the use of data, as well as highlight innovative uses of government data by private-sector innovators and entrepreneurs to help Americans — in transportation, law enforcement and consumer safety, as well as in areas of college affordability and teaching and learning innovation.

This is an ongoing process. It’s about continuing to manage information resources strategically as assets within an agency and continuing to protect them from a privacy and security perspective. Data are valuable assets that taxpayers paid for and that need to be protected, and where they can be made public, we need to make sure we’re opening them up by making them easy for users inside and outside the federal government to discover and use.

FEDTECH: What kind of guidance is available to agencies that are trying to determine how to comply with the Open Data Policy?

SINAI: Project Open Data has a useful set of free tools that a lot of agencies use. The project was set up by the White House, but it is an open and collaborative site on GitHub where anybody, whether government, contractor or member of the public, can post a free tool or case study to help.

One of the things we’ve found is that agencies have a lot of great data across many different offices, so often they set up working groups or data councils inside an agency, to make sure they’re facilitating both the executive and staff conversations across those different offices. The CIO staff are almost always part of those groups to help make sure agencies understand what data assets they’ve got.

OSTP and OMB also meet with the agencies on a biweekly basis to troubleshoot and brainstorm how to move forward on this.

FEDTECH: The advantages of opening government data to the public seem obvious, but is there an agency-to-agency benefit from open data?

SINAI: Absolutely. This is about a strategic approach to managing data throughout its lifecycle and making it open, machine-readable and useful to all users, including users within or among agencies. Making our data usable and reusable, where appropriate, will have benefits to the public, improve internal processes and boost efficiencies in government. For one thing, it will help with internal coordination, support more informed decision-making, and ensure less duplication within agencies.

FEDTECH: Provided it’s adequately protected, correct? Is there a risk that users of government data can take public information and piece together private records?

SINAI: Privacy, confidentiality and security of government data are tremendously important. We take this very seriously. Each agency has a senior official responsible for privacy and an information security team who work together to make sure that they’re protecting data.

Agencies are extremely conscious of the risks of reidentification from certain disparate data sets and take great care to guard against them. They put protections and strict legal safeguards in place around those data sets. But this is a complicated subject, and it will be an area of continued effort among privacy and security folks.

FEDTECH: Before you were Mr. Open Data, you played a major role in the nation’s broadband policy. Care to give an update?

SINAI: Yes! Broadband is critical to a 21st century economy. It is increasingly how citizens access a wide range of government services, and it is important to every major sector of the economy. That’s why it’s a priority for the administration, and we’re making some great progress. At the same time there is no silver-bullet policy option, so we have numerous initiatives underway to spur deployment, adoption, and competition.

In addition to helping co-author the FCC’s National Broadband Plan, I had the opportunity to contribute to the administration’s grid-modernization policy framework in 2011. There are some interesting connections here: One example of broadband- and software-fueled innovation has to do with energy data. The Green Button initiative is something I’ve been heavily involved in — basically getting energy data back into the hands of families and businesses, privately and securely, so they can better shop for solar panels, perform virtual building energy audits or just be more informed about their energy use so they can save on their energy bills.