Thursday, September 21, 2006

SUMMARY: NIH's, PLoS's, the Wellcome Trust's and now the UK MRC's
unreflective support for PubMed Central (PMC), a Central Repository
(CR), as the locus for direct self-archiving by authors is very
unfortunate for Institutional Repositories (IRs), for self-archiving,
and for Open Access (OA) progress in general. Alma Swan has published
key papers on both OA self-archiving policy and institutional versus
central self-archiving (IRs vs. CRs) analysing the reasons.
(a) Institutional self-archiving
and central self-archiving are at odds in the quest for a universal
self-archiving policy solution that will cover all OA research output.
(b) It would be awkward and
inefficient to have a different external cross-institution CR as the
locus of primary deposit for every funding area, subject area,
combination of subject areas, or nation.
(c) Researchers' own IRs are the
most natural and efficient way to scale up to covering all of OA space
from all disciplines, institutions and nations.
(d) Direct central self-archiving
is already obsolete in the OAI
era of interoperable OAI-compliant IRs.
(e) The optimal solution is for
researchers to self-archive their own papers in their own OAI-compliant
IRs and for CRs to be harvested from those distributed IRs.
(f) Universities are in the best
position to mandate
self-archiving and monitor and reward compliance.
(g) Mandating self-archiving in CRs
instead simply creates an unsystematic and incoherent policy that does
not scale up to covering all research output from all research
institutions.
(h) What the NIH, Wellcome Trust
and MRC should be mandating is not direct depositing in PMC, but
universal depositing in the fundee's own IR, from which PMC can then
harvest collections.

Let me try to explain why unreflective support for PubMed Central (PMC,
and UK PMC) as the locus for direct self-archiving by authors is very
unfortunate for Institutional Repositories (IRs), for self-archiving,
and for Open Access (OA) progress in general. The reason is very
simple, and I very much hope
that it will be given some thought by the many who are currently
unquestioningly promoting central self-archiving. (Please note that
this has nothing to do with the existence and enormous value of PMC
itself: only with whether or not PMC (or any other Central Repository)
should be the place where authors self-archive their papers, and the
place where institutions and funders mandate that authors should
self-archive their papers -- instead of self-archiving them in their
own IRs.)

(1) PMC and UK PMC are grounded in two things: (i) the pre-OAI and
pre-IR central-archiving model originating from the early and very
successful Physics Arxiv and (ii) Harold Varmus's -- and hence NIH's,
PLoS's, the Wellcome Trust's and now the UK MRC's -- fixation on the
central (indeed the PMC) model of OA self-archiving. That
self-archiving model is already obsolete in the OAI era of distributed,
interoperable OAI-compliant IRs.

(2) Although they appear to be complementary -- after all, OAI renders
all OAI-compliant archives, whether central or institutional,
interoperable, and hence equivalent -- in reality, at this critical
point in the evolution of OA self-archiving policy-making,
(a) institutional self-archiving and (b) central self-archiving are
profoundly at odds with one another in the quest for a systematic,
universal self-archiving policy solution that will systematically scale
up to cover all research output, from all institutions, in all
disciplines, worldwide.

(3) In the OAI-interoperable age, the natural and optimal solution is
for researchers to self-archive their own papers in their own
OAI-compliant Institutional Repositories (IRs) and for whatever central
archives one may wish to have -- whether subject-based or funder-based
or national -- to be harvested, via the OAI Protocol for Metadata
Harvesting (OAI-PMH), from the distributed local IRs, rather than
deposited (or re-deposited) directly. That is what the OAI
metadata-harvesting protocol was created for!
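
To make the harvesting model concrete, here is a minimal sketch of what
a central harvester does with an IR's reply to an OAI-PMH ListRecords
request in the standard oai_dc (Dublin Core) format. The sample
response, the host ir.example.edu and the function name
parse_list_records are illustrative assumptions, not any real
repository's data or PMC's actual software.

```python
import xml.etree.ElementTree as ET

# XML namespaces defined by OAI-PMH 2.0 and by Dublin Core.
NS = {
    "oai": "http://www.openarchives.org/OAI/2.0/",
    "dc": "http://purl.org/dc/elements/1.1/",
}

# Hypothetical, hand-written ListRecords response from an IR (not real data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <ListRecords>
    <record>
      <header>
        <identifier>oai:ir.example.edu:1234</identifier>
        <datestamp>2006-09-01</datestamp>
      </header>
      <metadata>
        <oai_dc:dc xmlns:oai_dc="http://www.openarchives.org/OAI/2.0/oai_dc/"
                   xmlns:dc="http://purl.org/dc/elements/1.1/">
          <dc:title>An example paper</dc:title>
          <dc:identifier>http://ir.example.edu/1234/</dc:identifier>
        </oai_dc:dc>
      </metadata>
    </record>
  </ListRecords>
</OAI-PMH>"""


def parse_list_records(xml_text):
    """Return one dict per harvested record: OAI identifier, title, link."""
    root = ET.fromstring(xml_text)
    records = []
    for record in root.iterfind(".//oai:record", NS):
        records.append({
            "oai_id": record.findtext("oai:header/oai:identifier", namespaces=NS),
            "title": record.findtext(".//dc:title", namespaces=NS),
            "link": record.findtext(".//dc:identifier", namespaces=NS),
        })
    return records


for rec in parse_list_records(SAMPLE_RESPONSE):
    print(rec["oai_id"], "|", rec["title"], "|", rec["link"])
```

The author deposits once, in the IR; everything the central service
needs arrives through the harvest, with no second deposit.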

(4) So although on the surface it looks as if there is room for
complementarity, pluralism, and parallelism between Central
Repositories (CRs) and Institutional Repositories (IRs), the question
of what their optimal interrelationship should be is far more
complicated insofar as formulating a systematic, effective OA
self-archiving policy is concerned, and ensuring that the policy will
scale up to cover all of OA space. There is a profound and important
strategic conflict specifically related to institutional and
research-funder self-archiving policy (mandates).

(6) The gist of the strategic and practical conflict between IRs and
CRs, as well as the basis for resolving it, is the following:

(7) Universities (and other research institutions) are the primary
research providers. It is their researchers who conduct and publish
the research. It is they and their researchers who are in a position to
provide OA. It is they and their researchers who co-benefit from
providing OA by self-archiving their own research output. The natural
place for them to self-archive their own research output is in their
own respective (OAI-compliant) IRs. This covers all the output of all
their disciplines (some research institutions have just one research
speciality, whereas others, including all universities, cover most or
all research specialties).

(8) Universities (and other research institutions) are real entities,
with their own institutional identity, and it is their own
institutional visibility and productivity and research impact (along
with the impact and progress of research in general) that they are
motivated and indeed obliged to promote and foster. CRs, in
contrast, do not correspond to institutional entities with needs of
their own. (The partial exception is when a CR is research
funder-based, where the funder is an entity with interests. I will
return to this.)

(9) Universities (and other research institutions) are also the ones
that are in the strongest position to mandate the self-archiving of
their own research output, as well as to monitor and to reward
compliance with their self-archiving policy. (Again, the only exception
is a research funder, or a national government.)

(10) Universities (and other research institutions) are helped in their
efforts to mandate OA self-archiving by OA self-archiving mandates from
the funders of their research, but (a) not all their research is
funded, and (b) it would be extremely awkward and inefficient if, for a
single institution's authors, there were a different external
cross-institution CR that needed to be deposited in for every funder
and every subject and every other possible combination of subjects (and
nations!).

(11) Instead, the natural and efficient way to gather content into CRs
-- whether funder CRs or subject-based CRs or multidisciplinary CRs or
national CRs -- is to selectively harvest their contents from the
individual, distributed IRs of the researchers' own institutions.
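
Selective harvesting is something OAI-PMH supports directly: a
ListRecords request can carry an optional "set" argument (a grouping
defined by the repository) and an optional "from" date. The sketch
below only builds such request URLs. The IR addresses, the set name
"biomed" and the function name list_records_url are hypothetical, and
each real IR defines its own sets.

```python
from urllib.parse import urlencode

# Hypothetical OAI-PMH base URLs of three institutional repositories.
IR_BASE_URLS = [
    "http://ir.example.edu/oai",
    "http://eprints.example.ac.uk/oai",
    "http://archive.example.org/oai",
]


def list_records_url(base_url, set_spec=None, from_date=None):
    """Build an OAI-PMH ListRecords request URL, optionally selective."""
    params = {"verb": "ListRecords", "metadataPrefix": "oai_dc"}
    if set_spec is not None:
        params["set"] = set_spec      # restrict to one repository-defined set
    if from_date is not None:
        params["from"] = from_date    # only records changed since this date
    return base_url + "?" + urlencode(params)


# A subject-based CR would send the same selective request to every IR.
for base in IR_BASE_URLS:
    print(list_records_url(base, set_spec="biomed", from_date="2006-01-01"))
```

The same loop, with a different set or no set at all, would serve a
funder-based, national or multidisciplinary CR.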

(12) IRs are also the most natural and efficient and systematic and
universal way to scale up to cover all of OA space -- originating from
all disciplines, at all institutions, in all nations.

(13) A few generic OAI-compliant CRs are fine for provisionally or even
permanently depositing research by researchers whose institutions do
not yet have an IR (or by researchers who do not even have an
institution!); but apart from that, direct depositing in CRs is
extremely counterproductive at a time when self-archiving has not yet
been established as a systematic research imperative.

(14) The optimal thing for both research institutions and
funders to do now is to mandate self-archiving in the researcher's
own IR (except where a default generic CR is needed because the
researcher's institution does not yet have an IR).

(15) Compliance can be monitored and rewarded, primarily by the
researcher's own institution, but also through the grant-fulfilment
conditions of the funder.

(16) This will systematically scale up to cover all disciplines, at all
institutions, globally.

(17) If central self-archiving (e.g., in PMC) is mandated instead, that
simply creates an unsystematic and incoherent policy that does not
translate into a general means of covering all research output of all
research institutions.

(18) The NIH, Wellcome Trust and MRC self-archiving policies (though
they make important contributions to OA) are hence complicating and
retarding progress toward a universal, systematic solution for
making all institutions' research output OA because of their insistence
on direct deposit in PMC.

(19) What the NIH, Wellcome Trust and MRC should be mandating is not
arbitrary direct depositing in PMC, but universal depositing in the
fundee's own IR, from which PMC (and any other CRs) can then harvest
collections, if they wish.

(20) In this way, institutional and funder self-archiving mandates can
be synergistic instead of antagonistic. (Antagonistic mandates confuse
researchers about where to self-archive, arouse resentment about the
need to do multiple deposits, fail to generalize and scale up to a
systematic, universal self-archiving policy and solution for all
institutions, disciplines, funders and nations, and in general retard
instead of accelerate progress in the formulation of effective and
compatible self-archiving policies globally.)

(21) The last point is that not only is primary depositing in CRs a
very bad idea, but in the OAI age CRs need not "house" the
full-texts at all: they really only need to be "virtual archives"
in much the way that Google or OAIster is: they harvest the metadata
and links, allow focussed search, and then point back to the IRs for
accessing the full-text itself. The notion of having to have one
central "place" in which to put all papers is obsolete in the OAI age.
(I am not referring to redundancy and preservation issues, for which
some duplication is useful and indeed necessary; I am referring to the
fallacious notion that we need CRs in order to have the target content
for searching and accessing "all in one place." We do not; and we
should not. Yet I am almost certain that this is the main reason so
many people think they need a CR!)
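
As a rough illustration of such a "virtual archive", the sketch below
keeps only harvested titles and links and answers a search by pointing
back to the IRs. The class name VirtualArchive, its methods and the
example entries are all hypothetical; a real service would hold richer
metadata and a proper search index.

```python
class VirtualArchive:
    """Central search over harvested metadata; holds no full texts."""

    def __init__(self):
        self._entries = []  # (title, link to the full text in its IR)

    def add_harvested(self, title, link):
        """Record one harvested item: its title and its IR link only."""
        self._entries.append((title, link))

    def search(self, term):
        """Return IR links for every title containing the search term."""
        term = term.lower()
        return [link for title, link in self._entries
                if term in title.lower()]


archive = VirtualArchive()
# Hypothetical records harvested from two different IRs.
archive.add_harvested("Open Access and citation impact",
                      "http://ir.example.edu/1234/")
archive.add_harvested("Protein folding dynamics",
                      "http://eprints.example.ac.uk/77/")

for link in archive.search("open access"):
    print(link)  # the reader follows this link to the IR's own copy
```

The search happens "all in one place", yet each full text is fetched
from the IR where it was deposited.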

Many well-meaning advocates of OA do not yet understand much of this,
imagining that CRs like PMC will in some mysterious way manage to cover
all of OA space. I hope the summary above will help to redirect the
welcome and important contributions of the supporters of the
NIH-PLoS-Wellcome-MRC OA initiatives in a direction that is more
helpful for scaling up to cover the world's research output as a whole.