from OCLC Research

Print preservation and peer review: a narrowing prospective?

[Here’s a bit of half-baked blog fodder I thought had been published months ago. It seems newly relevant, so rather than discard it, I thought I’d put it out for public dissection. Have at it, fellow anatomizers.]

I’ve been thinking [ca. September 2008] quite a bit recently about peer review and its relationship to print preservation and access initiatives among research libraries. A recent white paper from Ithaka, summarizing findings from a range of recent studies on faculty and librarian views of the transition from print to online scholarly communication practices, prompted me to reconsider the ways in which the discoverability and availability of library print collections are likely to affect scholarship in a largely digital information landscape. The prospective value of academic print collections (outside of special collections) is largely determined by scholarly communication practices and the paradigm of peer review. So long as content in print continues to play a role in the creation, exchange or verification of scholarly work, it is likely to survive.

A while back, Merrilee and I took a look at the copyright status of US imprints in WorldCat and citation patterns in a selected set of scholarly publications in the humanities. The point of this exercise was to gauge the potential impact of increased online access to public domain content made available through mass digitization efforts. The results were not encouraging: less than 20% of the US imprints represented in the WorldCat database were judged to be in the public domain, hence available for broad electronic distribution, while a majority of scholarly citations referenced in-copyright content in books and journals. It didn’t appear to us that “freeing” the content of every public domain title would necessarily enable researchers to do more of the kind of intellectual work that is currently most valued and rewarded by the academic community. This is not a judgment of the quality or importance of such work compared to innovative approaches that might be supported by increased digital access, simply an observation that current scholarly practices depend upon the availability of content that is protected by copyright. Universal online access to a time-bound fragment of the scholarly record is likely to produce a narrower (if arguably deeper) kind of knowledge. One implication of this is that library-owned print collections will continue to play an important role in extending access to scholarly perspectives that don’t rise to the surface of the Web in digital collections.

James Evans, a sociologist at the University of Chicago, recently published a study in which he examined citation rates for peer-reviewed articles published in online scientific journals. The results of his study, “Electronic Publication and the Narrowing of Science and Scholarship,” originally published in Science (Vol. 321, No. 5887, 18 July 2008, 395-399 [abstract]), were summarized in an article in the Economist, which noted that:

…as more journals become available online, fewer articles are being cited …. Moreover, those articles that do get a mention tend to have been recently published themselves. Far from growing longer, the long tail is being docked.
[“Great minds think (too much) alike,” The Economist, July 17th 2008]

Quite a lot of attention has been devoted to the question of how online availability affects the impact (as measured in citations) of scholarly publications. Proponents of open access, in particular, have been keen to demonstrate that the increased accessibility of content in pre- and e-print repositories results in greater citation rates and presumably greater penetration into the collective scientific consciousness. Evans takes a different stance, arguing that the increasing scope of online scholarly content is — paradoxically — narrowing our perspective (or “prospective” as the Economist cannily puts it). His research was based on examination of citations to content in journals whose print back-files have come online over the past decade or so. Counter to what one might expect, the increasing availability of content was correlated with a decreasing frequency of citation. Several prominent figures in the open access community have questioned Evans’ interpretation of the citation patterns, preferring to see the increasing concentration (narrowing range of citations) as a sign that the best quality work is attracting well-deserved attention: the signal strength of the best papers is being intensified. Some of this critical commentary is summarized in the useful OpCit bibliography of studies on the citation impact of open access.

Roger Schonfeld, manager of research projects at Ithaka, has also looked at citation rates as a function of online availability. Ithaka has close ties to JSTOR, so it is not especially surprising that they have studied the impact of back-file digitization on scholarly practice. In a project with researchers at the University of Michigan and Dartmouth College, Roger examined the impact of digitization (that is, the increasing availability of journal articles in digital archives like JSTOR) on citation rates to content previously available in print-only format. Their findings suggested that the “online advantage” varied by discipline, ranging from a 5% to 20% boost in citations. The sciences benefited from the greatest increase in citations following digitization; the humanities (history, in particular) benefited the least. In contrast to Evans’ findings, there was no evidence that online availability hastened the demise (or shortened the half-life) of the scholarly literature, nor any suggestion that the range of scholarly enquiry had been curtailed. Ithaka was tracking impact on journal titles, not articles, so the studies are not easily compared. The Ithaka study (briefly outlined here) nevertheless raises perplexing questions about the effect of online access on the longevity and vitality of a scholarly record “gone dark” (figuratively speaking) in less-used and less-discoverable print collections.

All of this raises the question of how print collections function in current scholarly workflows. Evans speculates that scholarly interactions with literature in print formats are intrinsically different from research in the online environment. Similar assertions are often offered up as a justification for retaining print collections (especially journal collections) that are duplicated in electronic format, even when there is little evidence that the print collections are continuing to fulfill a “browsing” function. Carol Mandel, Dean of Libraries at New York University, has observed that what faculty, like Evans, mean when they say that browseable collections are central to the library’s function may be less about the materiality, proximity or accessibility of print than the social dimensions of scholarship, and the degree to which the library is able to foster communication between researchers and the scholarly record, regardless of format. Her point is that the library’s function is defined by its engagement with the scholarly process, and not by a particular medium of communication.

It seems to me that we are lacking a risk assessment framework [cf. some current work just reported on] against which the prospective value of print collections can be measured with confidence. Academic library investments in electronic resources dwarf investments in print, but operational workflows are still dominated by models developed for building and managing print collections in a period when library-owned content was viewed as a durable institutional asset. As scholarly activity has moved into the network space, the ways in which collections deliver (and accrue) value have changed; visibility, mobility, and referenceability are all key to maximizing the usefulness of research resources in the online environment.