Six questions about preprints

2017 is shaping up to be the year that preprints in biomedical sciences go mainstream.

At the beginning of the year MRC and Wellcome Trust both moved to accept preprints in grant applications and scientific reviews. Another major UK biomedical funder is likely to follow suit. In the USA the NIH has recently done the same. The ASAPbio initiative has issued its call for proposals for a central preprint service and it will be interesting to see what comes out of that. Meanwhile the Chan Zuckerberg Initiative has made a major commitment to support bioRxiv financially. Several new discipline-specific preprint servers have been launched. All this and there’s still nearly 8 months of 2017 to go!

The more I think about and learn about preprints, the more questions seem to crop up. Maybe you can help me find some answers.

Last year’s SPOTON conference on the future of peer review featured a session on preprints and peer review which I chaired and introduced. The discussion was interesting although it felt inconclusive. The recently-issued report from the conference has a piece by me: “The history of peer review, and looking forward to preprints in biomedicine” – now available as a post on the BMC blog. It’s nothing profound but I speculate slightly:

We may be moving to a world where some research is just published ‘as is’, and subject [only] to post-publication peer review, while other research goes through a more rigorous form of review including reproducibility checks. This [change] will be a gradual process, over a period of years. New tools … using artificial intelligence, will help by providing new ways to discover and filter the literature. A new set of research behaviours will emerge around reading, interpreting and responding to preprint literature. The corridors of science will resound with caveat lector and nullius in verba.

Whether / when to post?

Last week I gave a short talk about preprints to the postdocs at the MRC LMCB at UCL (here are my slides). They were a lively group, already knowledgeable about preprints and open access but keen to learn more. I focused on a quick history of preprints and some general background. The other speaker, Fiona Watts, provided real insight into the experience of a researcher who posts preprints, also drawing on her long experience as a senior editor on various top journals. She suggested that researchers should carefully consider which papers they preprint (is it a verb yet? I think it needs to be). Papers that have a heavy data component are appropriate for preprinting, but those with a high conceptual element might not be. I wonder if that is a generally accepted notion, and whether there are other criteria that people use for deciding whether to post a preprint.

Scooping and citation

Related to this, the question of being scooped came up. Audience members shared their different experiences. Some believed it to be really uncommon; others reported having been scooped. In theory, posting a preprint can give you priority over anything published subsequently. Some felt that would not be much comfort if someone else publishes a peer-reviewed paper before your preprint is published in a journal, particularly if the author of that paper fails to cite your preprint. Maybe peer reviewers need to be reminded that they should include preprints in their search for relevant literature?

One person at the LMCB meeting raised the question of preprint citeability. This had recently been discussed on Twitter (it has been usefully summarised here on the Jabberwocky blog). Apparently NAR does not permit preprints to be cited.

Are there other journals that have a similar policy?

Clearly preprint citation is necessary if we are to give appropriate credit/priority to work posted as a preprint. But I can understand why some may be wary of allowing citation of work that’s not been peer-reviewed. I’m not altogether persuaded by the argument that because physicists do it it must be OK for biologists to do it. There are differences between disciplines and their cultures of knowledge-sharing.

I would like to know how much difference there is between a preprint version and a finally-published version of an article. (Indeed, I’d also like to know how much difference there is between a ‘green’ version of an article and the version of record, but that’s another story).

In the comments on the Jabberwocky blogpost mentioned above, Martin Jung suggests that the key issue is “how to support and improve scientific peer-review to better deal with grey literature and non-published sources” (such as preprints). We need to think harder about this.

In talking to people about appropriate policies someone raised the idea of establishing an institutional preprint server. I can imagine a world where that would be a good idea, but I don’t think we are living in that world. I fear that an institutional server would be an irrelevant backwater, ignored by most in favour of the big disciplinary preprint servers. What do you think – is there much benefit to an institutional preprint server?

Is “preprint” a verb yet?

What criteria (if any) do you use for deciding whether to post a preprint?

Should peer reviewers be told that they should include preprints in their search for relevant literature?

Do you know of any journals which do not allow preprints to be cited as references?

How much difference is there between a preprint version and a finally-published version of an article?

Is there any benefit to an institutional preprint server?

Please give your answers in comments below, or on Twitter.

About Frank Norman

I am a librarian in a biomedical research institute. I've been around a few years, long enough to know that exciting new things fall into the same familiar patterns. I'm interested in navigating a path for libraries as we move further from print to electronic resources to open research, and become more embedded in research workflows.

6 Responses to Six questions about preprints

Frank, these are good questions. I can answer on behalf of preprints.org, a multidisciplinary preprint server:

1. You’ve just used it, so I think that makes it acceptable!

2. It doesn’t quite answer your question, but we are looking for papers that ‘look like’ a research article or review. We try to avoid opinion pieces and editorials. Beyond that, it’s up to the authors. I would be interested to see further discussions about this.

3. In my view, yes! Most preprints are available via Google Scholar, but not PubMed or Web of Science, so it might take a bit more effort.

4. Not journals as far as I know, but some funding bodies still don’t allow them in grant applications. With the NIH move, the situation is improving, though.

5. Preprints can be updated, so authors often upload the final accepted version. I haven’t looked in detail at this, but I would say the main differences are in language and tighter formatting (e.g. use of units, title/section formats, figure/table formats). Another important difference is in the references: publishers put quite a bit of effort into making sure citations are complete and correct and adding details like the DOI.

6. Good question, I haven’t seen this proposed anywhere (but I may be wrong). There’s no reason why not, but the main drawback I would see is discoverability: these preprints might remain too hidden to be widely known and used.

Re. citeability (qn 4), there is at least one journal that does not allow authors to include preprints in their reference list – Nucleic Acids Research.

Re. qn 5, I wondered whether a textmining wizard could summon a cunning algorithm to make such comparisons on a wider scale. It’s interesting to reflect on whether final accepted versions should be uploaded to the preprint server or to (eg) the PubMedCentral repository – or both. Our funders insist on PubMedCentral.
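
On the text-mining idea for qn 5, here is a minimal sketch in Python of how such a comparison might start. It assumes you already have plain-text copies of the preprint and the published version (getting hold of them is not covered here). The function names version_similarity and changed_passages are hypothetical, and the ratio from Python's standard difflib module is only a rough word-level measure, not a validated metric of how much the science changed.

```python
import difflib


def version_similarity(preprint_text: str, published_text: str) -> float:
    """Return a rough 0..1 similarity score, comparing the texts word by word."""
    a = preprint_text.split()
    b = published_text.split()
    # autojunk=False stops difflib from ignoring very frequent words in long texts.
    return difflib.SequenceMatcher(None, a, b, autojunk=False).ratio()


def changed_passages(preprint_text: str, published_text: str):
    """List (kind, old_words, new_words) for each run of words that differs."""
    a = preprint_text.split()
    b = published_text.split()
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # tag is 'replace', 'delete' or 'insert'
            changes.append((tag, " ".join(a[i1:i2]), " ".join(b[j1:j2])))
    return changes


if __name__ == "__main__":
    # Made-up sentences for illustration only, not taken from any real article.
    old = "We measured growth in ten samples"
    new = "We measured growth in twelve samples"
    print(version_similarity(old, new))
    print(changed_passages(old, new))
```

Run over many preprint/article pairs, scores like these could give a first impression of how often versions differ substantially, though formatting and reference changes would need to be separated from changes to the findings.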

Re. qn 6, yes that was my worry too. Plus the issue of economies of scale (or rather diseconomies of small-scale).

I have some views and concerns on question 5. One of these concerns is copyright. I have heard of a publisher (I can’t remember which) claiming copyright not just of the version of record but all previous versions (has anyone else?). Whether this is legally justified is another question (I would hope not). (Obviously authors do sometimes have the option not to assign copyright to publishers but it does still happen.)

Also I worry that if a researcher posts a preprint and the version of record is substantially similar then a publisher may decline to publish it if they are aware of this (although I admit I am not aware that this has ever happened). What ‘substantially similar’ means will obviously be a matter of interpretation in each case.

Are these concerns groundless or nearly so? I’d be interested to know what others think.

Stephen – I’m not sure about the first concern, but it sounds quite an outrageous claim for a publisher to make. The Harnad/Oppenheim strategy was conceived to avoid such claims, I believe.

As for the second concern, there’s a growing list of journals which will accept for publication papers that have been preprinted previously. But there are still some which will not – notably the New England Journal of Medicine. This Wikipedia page has a useful list of journals and preprint policies.

My responses:
1. Yes!
2. I decided that my group would preprint everything where possible. I believe this is important for setting people’s expectations and for having a discussion about preprints early rather than at the last minute. We recently held back from preprinting one manuscript because the journal we were sending it to did not allow preprints.
3. Yes. I think most reviewers are aware of preprints in their area (as you say preprints are mainstream). That said, I’m uncomfortable with the notion that prior work (preprinted or not) renders a manuscript worthless.
4. No (only the NAR example you mentioned). Our preprints have picked up some citations in papers in Nat Cell Biol and J Cell Biol, and we have cited a preprint in our recent J Cell Sci paper. They are cited just like a normal paper, although in the case of JCS they note “preprint” next to the in-text citation.
5. A short answer to this is difficult. I will leave this stat: of the 2512 preprints that were published in a journal by the end of 2016, only 27 were revised once and 2 were revised twice.
6. No.

Re. qn 3, I certainly wouldn’t say that prior work renders a manuscript worthless, but researchers I speak to certainly become anxious if they think that a rival lab is getting into print ahead of them. Perhaps, again, the intensity of this anxiety varies between fields?