Data in the Scholarly Communications Solar System

Guest post by Kathryn Funk, program manager for NLM’s PubMed Central.

The Library of the Future. What will it look like? The NLM Strategic Plan envisions it partly as “one of connections between and among literature, data, models, and analytical tools.” In this future, journal articles are no longer lone objects drifting in space, but, rather, each a solar system waiting to be explored. Indeed, we’re already seeing the published literature associated with datasets, clinical trials, protocols, software, earlier versions (including preprints), peer review documents, and so on through consistent identifiers and standardized publishing and archival practices.

To help researchers and the public navigate this new solar system, PubMed Central (PMC), NLM’s full-text archive of journal literature, has been collaborating with publishers and funders for the last year to support efficient ways of linking journal articles with associated data. We’re encouraging authors to cite their open datasets and publishers to archive those data citations and make them available in a machine-readable format. Data citations still represent only a small percentage of how PMC articles are linked to data; supplementary material continues to be the predominant method for associating data with articles in the archival record. Even so, the growth in data citations in the last year has been promising, nearly doubling the previous year’s total (850 articles with data citations in 2017 vs. approximately 440 in 2016). NLM is also supporting the public access policy requirements of our research funder partners by encouraging authors to deposit datasets as supporting documents via the NIH Manuscript Submission (NIHMS) system.

But solar systems, even the metaphorical kind, are meant to be explored, so we’re also working to expose each journal article solar system in a way that promotes discoverability. We want to make it easier to discover articles in PMC with associated data citations, data availability statements, and supplementary data, through improved record displays and new search facets, leveraging the data-related search filters announced earlier this year.

NLM is also looking beyond datasets to archive and expose articles’ key satellites, including, for example, comments generated during the peer review process. As the effort to expand the openness of peer review gains traction, PMC staff have been collaborating with publishers and Crossref on standardized ways to make those peer review materials readily available.

As with any exploration of new solar systems, it’s our hope that taking these steps will help generate new knowledge, and in so doing drive research that is reproducible, robust, transparent, and reusable. And as we move toward becoming the Library of the Future, how can we best support your research needs in connecting the literature with the rest of the research universe? Please let us know.

With thanks to Jeff Beck for the solar system analogy.

Kathryn Funk is the program manager for PubMed Central. She is responsible for PMC policy as well as PMC’s role in supporting the public access policies of numerous funding agencies, including NIH. Katie received her master’s degree in library and information science from The Catholic University of America.