About

Endangered Data Week is an initiative committed to raising awareness of threats to publicly available data; exploring the power dynamics of data creation, sharing, and retention; and teaching ways to make endangered data more accessible and secure.

At the Library, we’re dedicated to ensuring that research output, in all forms, is preserved and made accessible to the people who need it. We interviewed IU Libraries employees to learn more about the work they do to steward at-risk data.

Theresa Quill – Social Sciences Librarian

What role does publicly available data play in your position here at IU Bloomington?

Part of my job is helping researchers locate geospatial data for their area of interest. Almost all of the GIS data that I work with is publicly available and is usually created by government agencies at the federal, state, and local levels. We’re fortunate that in the state of Indiana, we have a robust system for making geospatial data free, secure, and publicly available. However, this is not always the case for other states or countries.

Which (if any) types of data or datasets that you know of and/or use are currently at risk?

Two currently introduced bills (H.R.482 and S.103) contain language that prohibits federal funding for a “Federal database of geospatial information on community racial disparities or disparities in access to affordable housing.” As a member of the geospatial community, and as someone who works with researchers who rely on access to this type of data, I am extremely alarmed by this language that restricts access to vital data. Along with proposed funding cuts to the Environmental Protection Agency (EPA) and other agencies that produce geospatial data, I would consider almost all federal geospatial datasets to be at risk.

Do/have you participated in any preservation of public data efforts?

On a local level, the Indiana University Libraries use Archive-It to scrape state and local websites that host geospatial data. The intention behind this is to preserve historic versions of datasets and maintain access during events such as government shutdowns or funding cuts.

What about efforts toward awareness?

Indiana University is involved in the Big Ten Academic Alliance Geoportal. This is a project that aggregates publicly available geospatial data to aid in discoverability and awareness.

Molly Wittenberg – Records Manager

What role does publicly available data play in your position here at IU Bloomington?

I work with departments across campus to identify and schedule the records they create and manage as evidence of their activity; many of the records will become available publicly, and it’s key that they remain accessible.

Do/have you participated in any preservation of public data efforts?

Yes, in my ongoing work with departments and staff at IU about what records they have and how they should manage them. I am also in the process of looking at the current web content created by IU and our efforts towards archiving that content.

What about efforts toward awareness?

My efforts right now are mostly focused on assessing what people have – awareness and outreach are the next steps.

Heidi Kelly – Digital Preservation Librarian

What role does publicly available data play in your position here at IU Bloomington?

I work as the Digital Preservation Librarian, so I’m more focused on how to preserve data and make it accessible. I work a lot with special collections on campus, like the University Archives, which is tasked with preserving the scholarly and historic record. I’ve seen things come through like hard drives from our congresspeople and former governors, which will be really important to future historians and researchers.

Which (if any) types of data or datasets that you know of and/or use are currently at risk?

I’m interested in web archiving, and we’ve seen some government websites changing as the new administration has come in. For example, the Spanish version of whitehouse.gov has gone dark. We’ve seen this internationally as well – for example, when the Dutch flight was shot down over Ukraine a few years ago, one of the leaders of the group posted to his blog that they’d taken down a plane. A few hours later, as they realized that it was a commercial jet and not freight, they took the post down. Luckily, the Internet Archive captured the website before it was removed; otherwise, we wouldn’t have that.

Do/have you participated in any preservation of public data efforts?

I’m involved in the Software Preservation Network, as well as other national digital preservation efforts. These aren’t directly related to public data efforts, but they sort of build the infrastructure in a lot of ways.

Emily Alford – Social Sciences Librarian

What role does publicly available data play in your position here at IU Bloomington?

In one word? Everything. I am the IUB Libraries liaison to government information and data for students, faculty, and community researchers. It is my job to make sure library users are aware of what public government data is available to them, how to access it, and of course how to utilize and incorporate it into their research. I work with public local, state, federal, and international data.

Have you participated in any efforts focusing on awareness of endangered data?

This! I was so excited to hear about Endangered Data Week and the efforts made by all of the organizers and collaborators. Here at IU, we rapidly gathered up some promotion ideas which we hope to expand even further in preparation for EDW 2018.

I’d like to share a recent example of how an IU faculty member used IUScholarWorks to share research materials from a 30-year, NIH-funded research project online. Take a look at the Learnability Project archive. These collections, created by Judith A. Gierut, Professor Emerita of Speech and Hearing and founder of the Learnability Project, include working papers, data sets, and journal articles from 1980-2015. Professor Gierut worked closely with IUScholarWorks staff to address issues of metadata, digital preservation, and copyright before depositing her materials. Now that the project is archived in IUScholarWorks, future scholars will be able to access this important research, and Professor Gierut can rest assured that her life’s work will be preserved for years to come. For more information, read our news item on the Libraries’ website.

Scholarly communication librarians (and our colleagues) often tout institutional repositories as the best place for authors to keep their work safe and accessible. Yet, more than 4 years after the publication of Dorothea Salo’s infamous article, “Innkeeper at the Roach Motel,” another former librarian is pointing out that while work uploaded to IRs is secure, it is oftentimes undiscoverable by the public.

Before I go any further, I should clarify that the IUScholarWorks team identified and addressed our own discoverability problems long ago. Our metadata is regularly indexed and is fully findable by Google Scholar. We also have staff hard at work on some of the usability and product awareness issues raised below. This post is merely intended to serve as a conversation starter among librarians.

Over on the MmITS blog, Louise Morrison has written a provocative post highlighting several problems that often plague users who try to find content in IRs:

Lack of discoverability via Google and Google Scholar

Poorly conceived IR search tools

Users are oblivious to the existence of IRs (and hence they cannot find content therein)

Subject repositories don’t exist for every subject

While I can’t speak to all the points raised by Louise, I find the metadata interoperability issue intriguing. Is Dublin Core a poor choice for an IR metadata standard, given that Google Scholar uses a different approved metadata scheme? Should we change our standards to fit with common practice, or hold tight to best practice?
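To make the question concrete, here is a minimal sketch of what a crosswalk between the two might look like. As I understand Google Scholar's inclusion guidelines, they favor Highwire Press style tags such as citation_title and citation_author over Dublin Core tags. Everything else below is an assumption for illustration only: the mapping table, the function name, and the sample record are hypothetical and do not describe how IUScholarWorks or any particular IR platform works.

```python
from html import escape

# Hypothetical mapping from Dublin Core element names to Highwire Press
# style "citation_*" meta tag names. Illustrative only, not an official crosswalk.
DC_TO_CITATION = {
    "title": "citation_title",
    "creator": "citation_author",
    "date": "citation_publication_date",
    "identifier": "citation_pdf_url",
}


def citation_meta_tags(record):
    """Return HTML <meta> lines for the Dublin Core fields in the mapping."""
    lines = []
    for dc_field, tag_name in DC_TO_CITATION.items():
        values = record.get(dc_field, [])
        if isinstance(values, str):
            values = [values]  # treat a single value as a one-item list
        for value in values:
            content = escape(value, quote=True)  # keep attribute values safe
            lines.append(f'<meta name="{tag_name}" content="{content}">')
    return lines


# Made-up record, not real repository data.
example_record = {
    "title": "Sample working paper",
    "creator": ["Doe, Jane", "Roe, Richard"],
    "date": "2015",
    "identifier": "https://repository.example.edu/sample.pdf",
}

for line in citation_meta_tags(example_record):
    print(line)
```

In a setup like this, the repository would keep Dublin Core as its internal record and emit the extra tags only on the public item page, which is one way of reading "common practice" and "best practice" as complementary rather than competing.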

Jenn Riley (Head, Carolina Digital Library and Archives*), in her keynote speech at the 2011 Australian Committee on Cataloging Seminar, suggested that as far as metadata goes, libraries should become integrated, rather than carving out a niche (Maclean, 2011)—and I would agree.

What do you think? Common practice or best practice? And how would you suggest we tackle building more user-friendly search tools for the major IR platforms?

* Full disclosure: Riley was my supervisor at IU’s Digital Library Program in 2007.