Agencies exploring the right balance between open data, security

The Department of Health and Human Services is turning to big data to improve the
security of its computer networks.

At the same time, HHS is striving to make more data accessible to the public and
across all of the agency's bureaus.

Kevin Charest, the chief information security officer at HHS, said balancing these
two mandates requires the ability to explain to the program managers why data
security is important beyond the typical "because it is" response.

"Typically security folks tend to talk about it from a confidentiality aspect,
protection and that sort of thing, and unfortunately that tends not to resonate
with actual end users, whether they be producers of data or consumers of data,"
Charest said after he spoke on a panel discussion Thursday at the AFFIRM/GITEC
Colossal Data conference in Washington. "They are really more interested in the
integrity, from a research standpoint, and the availability, I want what I want
when I want it type of thing. Particularly as it relates to HHS, we have
a tremendous amount of research data, and very often scientists have a tendency to
see security as a nuisance and getting in the way of science. But the reality is
what we found is by going to them and having perfectly open and honest
conversations about what would happen if the integrity of your research was to be
compromised. You begin to relate it to their world view, and all of a sudden
security becomes less of an evil and more of a necessity."

He said if you relate security to the needs of the individual, it's a much
easier sell.

Like most agencies, HHS creates and holds a lot of data, especially the sensitive
kind, whether it's personnel or health information, and there is more pressure to
share data among bureaus.

Charest said as the data sets get bigger and bigger, the normal type of processing
just doesn't work.

APIs to Commerce data

Additionally, the Office of Management and Budget's mandate to make information
more accessible through Data.gov in a machine-readable format is adding another
layer of complexity for agencies. President Barack Obama issued an executive
order and OMB followed with implementation guidance in May.

The Commerce Department is trying to comply with that mandate by developing
application programming interfaces (APIs) that will make data accessibility
easier.

Simon Szykman, Commerce's chief information officer, said at the event that the
department has made more than 100,000 data sets available through Data.gov. OMB
plans to revamp the website later this year, and already has included a list of
APIs available to users.

He said Commerce needs to improve data governance by building it in early, from
a lifecycle perspective.

In addition to individual data sets from all the agencies, the administration is
funding research and development on how to better harness big data.

Fen Zhao, a staff associate in the Directorate for Computer and Information
Science and Engineering at the National Science Foundation, said her agency and
the Office of Science and Technology Policy issued a request for information
earlier this year to see how multi-stakeholder partnerships could come together to
solve big data problems.

Zhao said NSF and OSTP will announce a new round of research and development
projects this fall.

Mash-ups offering benefits

While the administration focuses on big data R&D, several agencies are mashing up
data to improve how they meet their missions.

The Agriculture Department is combining data on wildfires with information
about crops to better understand the path of the fire. Then the Forest Service can
redirect the blaze away from the corn or soybeans or other farmland.

Charles McClam, the USDA deputy CIO, said the agency is developing a big data
strategy to decide which systems would benefit the most from analytics technology
and mashing data together.

He said the strategy could be completed in the next three to six months.

HHS's Charest said the agency is trying to break down the data silos that
currently exist.

He said there is great value in bringing together information from, for example,
the Food and Drug Administration, the Centers for Disease Control and Prevention
and the National Institutes of Health.

"We recognize the value of unlocking the data and bringing it together. So, what
we have to do is not just take the underlying siloed security, but overlay a level
of security for the interoperability and collaborative nature of the
infrastructure," he said. "So we are designing it now. We are looking at those
pieces and deciding what goes into that mash-up and what doesn't make sense,
recognizing it will be an evolutionary and iterative process, but security is
being designed up front before we do any data collaboration."