The Problem of Dark Data

A March New York Times article sounded warning bells for
researchers: the scourge of dark data. Dark data doesn’t refer to anything
secret or illegal, but rather to data developed by the government and other
organizations that is at risk of being lost. A more complete definition, often used in the
corporate context, is “the information assets organizations collect, process and store during
regular business activities, but generally fail to use for other purposes.” Concern
over the loss of data that could lead to new discoveries has centered especially
on scientific data stored by agencies and other
organizations. Much of this data is stored on government servers, with no legal
obligation that it remain available. The Trump administration’s proposed cuts to
scientific research and agency funding have only increased the alarm felt by
scientists and other researchers.

An additional problem is that dark data, by
definition, is unknown. It can’t be verified if it can’t be found, even though
we know it’s there. Somewhere. Right now, data.gov
is the central repository for government-created databases, but it relies on
agencies to self-report and covers, by many researchers’ estimates, only a fraction
of the data created by the agencies. The use of proprietary code and data.gov’s practice of linking to data housed
on websites, instead of to the databases themselves, make access even more difficult
for researchers.

While there does not seem to be any federal
legislation prohibiting the destruction or decentralization of these types of
data, several non-profits have formed to keep this data from going dark by identifying
and downloading data viewed as vulnerable to deletion.

To learn more about dark data, here are some
resources to get you started:
