Northeastern University Libraries is embarking on a program to help researchers at the University write data management plans for grant applications, store their research data and create metadata for it, and, eventually, publish their datasets with the appropriate metadata to the appropriate audiences.

What is the use of this? In many respects, it is a way to make research more efficient and thus, in the long run, cheaper. Researchers regularly create data from experiments and observations. In an academic environment, they then use that data to test hypotheses and generate new ideas. The results are published, and humanity gains new knowledge. (I realize that I’m simplifying the process considerably.)

But what happens to the data that has been collected? The researchers who create it obviously use it for further inquiries, but much of the time, the data is stored somewhere and forgotten. People have slowly been realizing that more knowledge may be gleaned from the “old” data. Perhaps it can be combined with data from one or more other sources to answer new questions. In fact, funding agencies, such as the National Science Foundation, are starting to require that grant applications include a Data Management Plan, which should explain how the researchers on the grant will:

“share with other researchers, at no more than incremental cost and within a reasonable time, the primary data, samples, physical collections and other supporting materials created or gathered in the course of work under NSF grants.”

I will be working with the staff at Northeastern University Libraries and with researchers at Northeastern to help the University become a leader in data management and curation.