Saturday, March 17, 2007

SUMMARY: Cornell University's Institutional Repository (IR) so far houses only a very small percentage of its own annual research output, even though this output is the target content for Open Access (OA) IRs. As such, Cornell's IR is no different from all other IRs worldwide except those that have already adopted a "Green OA" deposit mandate. Alma Swan's international, multidisciplinary surveys have found that most researchers report they will not deposit without a mandate but will comply willingly if deposit is mandated by their institutions and/or their funders. Arthur Sale's comparative analyses of mandated and unmandated IRs have confirmed this in actual practice. Cornell's IR too has confirmed this with high deposit rates for the few subcollections that are mandated. IRs with Green OA mandates approach 100% OA within about 2 years. The worldwide baseline for unmandated self-archiving is about 15%.

Davis & Connolly's 2007 D-Lib article takes no cognizance of this prior published information. It surveys a sample of Cornell researchers for their attitudes to self-archiving and finds the usual series of uninformed misunderstandings, already long-catalogued and answered in published FAQs. The article then draws some incorrect conclusions derived entirely from incorrect assumptions it first makes, among them the following:

(1) The purpose of Green OA self-archiving is to compete with journals? (No, the purpose is to supplement subscription access by depositing the author's final draft online, free for all users who cannot access the subscription-based version.)
(2) IRs should instead store the "grey literature"? (No, OA's target content is peer-reviewed research.)
(3) IRs are for preservation? (No, they are for research access-provision.)
(4) Some disciplines may not benefit from Green OA self-archiving? (The only disciplines that would not benefit would be those that do not benefit from maximizing the usage and impact of their peer-reviewed journal article output.)

The only thing Cornell needs to do if it wants its IR filled with Cornell's own research output is to mandate deposit.

D & C: "Problem: While there has been considerable attention dedicated to the development and implementation of Institutional Repositories [IRs], there has been little done to evaluate them, especially with regards [sic] to faculty participation."

On the contrary, little has been done to develop IRs apart from creating them. Moreover, many surveys and analyses have evaluated faculty non-participation and identified how and why to remedy it: by mandating deposit. (See Sale and Swan references at the end of this posting.)

(I note that, unlike Harvard, Cornell is not one of the 132 Universities that have signed in support of the US Federal Green OA Mandate, the FRPAA; this may be a sign of equivocation, but in Cornell's defense, none of the 132 have yet practised what they petitioned, by adopting locally the global mandate they are urging federally. European, Australian and Asian Universities have been faster off the mark.)

D & C: "[The only] steady growth [is in] collections in which [Cornell] university has made an administrative investment, such [as] requiring deposits of theses and dissertations into DSpace."

This passage states the problem (empty IRs) as well as the solution (mandating deposit) -- but the article itself then proceeds to ignore this obvious and already known outcome, and instead goes on and on about the many groundless (and easily answered) reasons faculty cite for not depositing unless it is mandated.

D & C: "Many faculty use alternatives to institutional repositories, such as their personal Web pages and disciplinary repositories"

If all or most faculty were indeed spontaneously depositing their peer-reviewed articles on their personal Web pages or in central disciplinary repositories (CRs) (like Arxiv), there would be no problem: 100% Open Access (OA) would already be upon us, for IRs could easily fill themselves by simply harvesting their faculty's output from their Web pages and CRs.

The trouble is that -- except where mandated -- most faculty are not depositing their articles on their Web pages today, and only a few sub-disciplines are depositing in CRs. Hence OA is only at about 15% today.

D & C: "[CRs] are perceived to have higher community salience than one's affiliate institution."

Right now, the only two CRs with any appreciable content -- Arxiv and PubMed Central -- certainly do have "higher community salience" than IRs, since most IRs are mostly empty. But institutions need merely mandate depositing and the "salience" of their IRs will sail, along with the size of their contents.

(Moreover, the true success rate of a repository -- whether IR or CR -- is the percentage of its total annual target content that it is currently capturing. By that proportionate measure, central disciplinary CRs are in fact doing just as badly as unmandated IRs and the real champions are (unsurprisingly) the harvesters like Citeseer, OAIster and Google Scholar that trawl their contents from the distributed IRs and CRs.)

All IRs are OAI-compliant and interoperable. Researchers' institutions cover all of research output space. Hence researchers' own IRs are the natural and optimal locus for direct deposit. Institutions also have a proprietary interest in showcasing, monitoring, evaluating and storing their own research output -- as well as in maximizing its research impact. Hence both funders and institutions should mandate direct deposit in the researcher's own IR. (CRs can then harvest therefrom, if they wish.) (See: Optimizing OA Self-Archiving Mandates: What? Where? When? Why? How?)

D & C: "Faculty gave many reasons for not using repositories: redundancy with other modes of disseminating information"

There is no "redundancy" with OA's target content: peer-reviewed journal articles. Those users who can afford paid access have paid access. Those who cannot, have no access. The purpose of OA self-archiving in IRs is to supplement the existing paid access, providing free access to the author's final draft, self-archived online, for those would-be users who do not have paid access to the journal's proprietary version.

(The authors of this article, D & C, as we shall see, draw precisely the conclusions from their article that they have themselves put into it, in the form of assumptions, often incorrect ones. Apart from that, all they do is amplify the volume of the faculty misunderstandings they sample, instead of correcting them.)

The purpose of maximizing research access is to maximize research impact (downloads, usage, applications, citations, productivity, progress).

Cornell faculty are right to regard the affordability problem as not their problem. The accessibility problem, however, is their problem, both from the point of view of Cornell researchers' own lost access to the work of researchers at other institutions (in journals that even Cornell cannot afford to subscribe to) and -- even more important (as most researchers at other institutions are not sitting as pretty as Cornell for subscriptions) -- from the point of view of Cornell researchers' lost research impact (owing to the access problems of would-be users at other institutions).

D & C: "Each discipline has a normative culture, largely defined by their reward system and traditions. If the goal of institutional repositories is to capture and preserve the scholarship of one's faculty, institutional repositories will need to address this cultural diversity."

The target content of OA IRs is peer-reviewed journal articles. If there are any disciplines that do not care about maximizing the usage and impact of their peer-reviewed journal article output, then there are indeed reasons to examine discipline differences. If not, then what is needed is not discipline-difference studies but pandisciplinary deposit mandates.

D & C: "most faculty host their digital objects on a personal website, where their long-term preservation is not secure. If institutions truly value the content created by their faculty, they must take some responsibility for the long-term curation of this content."

D & C: "There are two opposing philosophical camps among those who work to justify institutional repositories: one that views IRs as competition for traditional publishing, the other that sees IRs as a supplement to traditional publishing."

There are indeed two opposing views of what IRs are for, but the opposition is certainly not about whether IRs compete with or supplement traditional publishing. It is about whether IRs are primarily for OA content (i.e., peer-reviewed research) or for other kinds of content (e.g., "grey literature"). (There is also some related confusion about whether IRs are primarily for supplementing access or for digital preservation.)

Among OA advocates there is no divergence whatsoever on the fact that OA IRs (Green OA) supplement journal publishing; they are not a substitute for it, nor a competitor to it.

(There is competition between subscription-based publishing and Gold OA publishing, but that is an entirely different matter, having nothing to do with IRs or Green OA.)

Here is a core example of how the authors of this article first make incorrect assumptions, and then simply proceed to derive their inevitably incorrect consequences:

D & C: "In 1994, Stevan Harnad wrote his Subversive Proposal for Electronic Publishing, in which he argued that all academics should make their research articles publicly available through open repositories. This collective effort would help to reduce the power wielded by publishers who have built economic barriers to limit scholars' access to the literature."

(1) From the very outset, the Subversive Proposal was to supplement traditional publishing with (what we have since come to call) Green OA self-archiving of the author's peer-reviewed final draft. Self-archiving was never proposed as a substitute for peer-reviewed journal publication -- as a Google search on "harnad supplement substitute" will repeatedly confirm!

Latent in the Subversive Proposal -- a Green OA supplement proposal -- was, of course, the possibility of an eventual transition to Gold OA publishing. But that is and always was treated as a hypothetical possibility, whereas Green OA self-archiving (which eventually led to the first OA IR software, EPrints, and eventually to the OA IR movement) was proposed as a concrete, practical action, within reach of all researchers -- a practical action that has since been widely tried, tested, and confirmed empirically to work, and to deliver the enhanced research usage and impact for which it was intended.

(2) Davis & Connolly have also completely conflated the explicitly stated purpose of the Subversive Proposal -- which was to maximize research access and usage -- with the library community's struggle with the journal affordability problem. Green OA self-archiving is not about "reducing publisher power" nor about changing economics. It is just about maximizing research access.

D & C: "In opposition, Clifford Lynch views IRs as supplements, not primary venues for scholarly publishing, and warns against assuming the role of certification in the scholarly publishing process."

All OA IR advocates view IRs as supplements: a way to provide free access to the author's peer-reviewed final draft, accepted for publication by the "primary venue" (the journal) -- not as a substitute form of peer review or certification or publication.

D & C: "[Lynch] argues that 'the institutional repository isn't a journal, or a collection of journals, and should not be managed like one'"

Preaching to the choir: No one thinks IRs are journals.

D & C: "Lynch fears that viewing IRs as instruments for undermining the economics of the current publishing system discounts their importance and reduces their ability to promote a broader spectrum of scholarly communication."

IRs are not "instruments for undermining the economics of the current publishing system"; they are instruments for maximizing the access and impact of currently published research articles.

D & C: "Institutional repositories may better serve to disseminate the so-called "grey literature": documents such as pamphlets, bulletins, visual conference presentations, and other materials that are typically ignored by traditional publishers."

The idea that IRs should focus on the grey (unpublished) literature instead of the OA Green literature remains just as off-the-mark and wrong-headed today as on the day it was first mooted. (See: Cliff Lynch on Institutional Archives)

D & C: "DSpace was not conceived as competition to commercial publishers, but as a resource to capture, preserve and communicate the diversity of intellectual output of an institution's faculty and researchers. It was designed specifically to deal with a wide range of content types including research articles, grey literature, theses, cultural materials, scientific datasets, institutional records, and educational materials, among others."

More's the pity that DSpace does not now, nor did it ever, have its priorities straight. The #1 priority for IRs is and always has been (or ought to have been!) OA. (See: EPrints, DSpace or ESpace?)

D & C: "On May 1st, 2005, a policy was enacted that recommended, not required, that all researchers receiving grant monies from the National Institutes of Health deposit final copies of their manuscripts in PubMed Central (PMC), a free digital archive of biomedical and life sciences journal literature. PMC offers many valuable services to authors, such as indexing in Medline (the primary literature index for the biomedical and life sciences), as well as dynamic links to the published version of their article. After eight months, the participation rate remained a dismal 3.8%. Lack of awareness of the policy was not cited as contributing to the low compliance rate. On December 14th, 2005, Senator Joseph Lieberman introduced the CURES Act (S.2104), which would require (not recommend) mandatory deposit of final manuscripts"

The NIH Public Access Policy failed for three reasons (in order of priority):

(1) because it was not a mandate, but merely a request,

(2) because it allowed deposit to be delayed (up to a year) rather than immediate,

(3) and because it insisted upon central deposit, in PMC, instead of local deposit (in the fundee's own IR, harvestable by PMC).

D & C: "Cornell's DSpace is largely underpopulated and underused by its faculty. Its complex organization is seen at comparable institutions, but may discourage contributions to DSpace by making it appear empty. In addition, faculty have little knowledge of and no motivation to use DSpace."

D & C: "Each discipline has a normative culture, largely defined by their reward system and inertia. If the goal of institutional repositories is to capture and preserve the scholarship of one's faculty, IRs will need to address this cultural diversity."

No, the remedy is not to delve into disciplinary diversity. It is to promote what all disciplines (indeed all of research) have in common, which is the need to maximize the usage and impact of their peer-reviewed research findings -- by mandating Green OA.

The American Scientist Open Access Forum has been chronicling, and often directing, the course of progress in providing Open Access to universities' peer-reviewed research articles since it was launched in the US in 1998 by American Scientist, which is published by the Sigma Xi Society.

The Forum is largely for policy-makers at universities, research institutions and research funding agencies worldwide who are interested in institutional Open Access Provision policy. (It is not a general discussion group for serials, pricing or publishing issues: it is specifically focussed on institutional Open Access policy.)