Some (probably not original) thoughts about originality

25 August 2009, 26 Comments

A number of things have prompted me to think about what makes a piece of writing “original” in a web-based world where we might draft things in the open and get informal public peer review, where un-refereed conference posters can be published online, and where pre-print servers for submitted versions of papers are increasingly widely used. I’m in the process of correcting an invited paper that derives mostly from a set of blog posts, and I had to revise another piece because it was too much like a blog post, but what got me thinking most was a discussion on the PLoS ONE Academic Editors forum about the originality requirements for PLoS ONE.

In particular the question arose of papers that have been previously peer reviewed and published, but in journals that are not indexed or that very few people have access to. Many of us have one or two papers in journals that are essentially inaccessible: local society journals, or just journals that were never online and never widely enough distributed for anyone to find. I have a paper in Complex Systems (volume 17, issue 4 since you ask) that is not indexed in PubMed, is only available in a preprint archive, and has effectively no citations, probably because it isn’t in an index and no-one has ever found it. But it describes a nice piece of work that we went through hell to publish because we hoped someone might find it useful.

Now everyone agreed, and this is what the PLoS ONE submission policy says quite clearly, that such a paper cannot be submitted for publication. This is essentially a restatement of the Ingelfinger Rule. But being the contrary person I am, I started wondering why. For a commercial publisher with a subscription business model it is clear that you don’t want to take on content that you can’t exert a copyright over, but for a non-profit with a mission to bring science to a wider audience does this really make sense? If the science is currently inaccessible and is of appropriate quality for a given journal, and the authors are willing to pay the costs to bring it to a wider public, why is this not allowed?

The reason usually given is that if something is “already published” then we don’t need another version. But if something is effectively inaccessible, is that really true? Are preprints, conference proceedings, even privately circulated copies, not “already published”? There is also still a strong sense that there needs to be a “version of record”, that there is a potential for confusion with different versions. There is a need for better practice in the citation of different versions of work but this is a problem we already have. Again, a version in an obscure place is unlikely to cause confusion. Another reason is that refereeing is a scarce resource that needs to be protected. This points to our failure to publish and re-use referees’ reports within the current system, to actually realise the value that we (claim to) ascribe to them. But again, if the author is willing to pay for this, why should they not be allowed to?

However, in my view, at the core of the rejection of “republication” is an objection to the idea that people might manage to get double credit for a single publication. In a world where the numbers matter, people do game the system to maximise the number of papers they have. Credit where credit’s due is a good principle and people feel, rightly, uneasy with people getting more credit for the same work published in the same form. I think there are three answers to this, one social, one technical, and one…well let’s just call it heretical.

Firstly, placing two versions of a manuscript on the same CV is simply bad practice. Physicists don’t list both the ArXiv and journal versions of papers on their publication lists. In most disciplines where conference papers are not peer reviewed, they are listed separately from formally published peer reviewed papers in CVs. We have strong social norms around “double counting”. These differ from discipline to discipline as to whether work presented at conferences can be published as a journal paper, whether pre-prints are widely accepted, and how much control needs to be exerted over media releases. But while there may be differences over what constitutes “the same paper”, there are strong social norms that you only publish the same thing once. These social norms are at the root of the objection to re-publication.

Secondly, the technical identification of duplicate available versions, either deliberately by the authors to avoid potential confusion, or in an investigative role to identify potential misconduct, is now trivial. A quick search can rapidly identify duplicate versions of papers. I note parenthetically that it would be even easier with a fully open access corpus, but where there is either misconduct, or the potential for confusion, tools like Turnitin and Google will sort it out for you pretty quickly.

Finally though, for me the strongest answer to the concern over “double credit” is that this is a deep indication we have the whole issue backwards. Are we really more concerned about someone having an extra paper on their CV than we are about getting the science into the hands of as many people as possible? This seems to me a strong indication that we value the role of the paper as a virtual notch on the bedpost over its role in communicating results. We should never forget that STM publishing is a multibillion dollar industry supported primarily through public subsidy. There are cheaper ways to provide people with CV points if that is all we care about.

This is a place where the author (or funder) pays model really comes into its own. If an author feels strongly enough that a paper will get to a wider audience in a new journal, if they feel strongly enough that it will benefit from that journal’s peer review process, and they are prepared to pay a fee for that publication, why should they be prevented from doing so? If that publication does bring that science to a wider audience, is not a public service publisher discharging their mission through that publication?

Now I’m not going to recommend this as a change in policy to PLoS. It’s far too radical and would probably raise more problems in terms of public relations than it would solve in terms of science communication. But I do want to question the motivations that lie at the bottom of this traditional prohibition. As I have said before and will probably say over and over (and over) again: we are spending public money here. We need to be clear about what it is we are buying, whether it is primarily for career measurement or communication, and whether we are getting the best possible value for money. If we don’t ask the question, then in my view we don’t deserve the funding.

Thanks for your comment. The link is quite helpful. I would however turn:

“Readers of primary source periodicals, whether print or electronic, deserve to be able to trust that what they are reading is original unless there is a clear statement that the author and editor are intentionally republishing an article.”

on its head. Readers also deserve to be able to find important results that are relevant to them. I would agree there needs to be a value test here – what is the added value from re-publication? – and that clarity is important. There is a good point made about the potential for double counting of studies in meta-analyses. I would agree that that is an important issue, especially for medical research.

I agree with your point that publication should be about dissemination and not about earning points on your CV. However, I think that even if the authors pay for publication, republication of the same (or very similar) papers is nonetheless a problem. What I am concerned about is time waste: even if the authors pay for the direct expenses of the journal, they do not pay for the time spent by editors and reviewers (who are generally not paid by the journal). Also, republication potentially wastes the time of other scientists who end up reading the same paper several times in different journals.

Lars, I would agree that an editor should make a decision based on the usefulness of re-publishing. My point I guess is that there are cases where it shouldn’t be ruled out immediately. I do agree with your point about additional costs but again, if the editor sees value in submitting a paper to their review process (whatever it is), doesn’t that justify the time of the referee to the same extent as for a new paper? There are hidden costs.

I’m also not convinced about the “wasting time reading the paper again” argument, on three counts. One, I seem to end up doing this anyway, either because I forget I’ve read it, or because I have seen it in some other form (conference paper etc). Secondly, filtering and keeping track of what you’ve read is your problem, surely [though in light of point one I’m just proving I’m badly organised]. But again, if this paper is in some obscure journal no-one can get at, then the number of people who might accidentally read it again versus the number who might benefit from reading it the first time seems like a weight on the positive side.

Cameron, I think there are two situations here that need to be kept separate: republication where the authors honestly point out that this is (essentially) the same publication as an earlier paper of theirs, and republication where the editor is not informed about this.

If the editor is informed that the manuscript in question is a republication and decides that it makes sense to republish it, I agree with you that republication is not a problem. However, in that case I don’t see why the editor should waste the time of reviewers if the manuscript has already undergone successful peer review. Also, it should IMHO be clearly stated that the paper is a republication so that the readers are aware.

In case the authors did not inform the editor that the manuscript is a republication, it is fully possible that the editor will not discover this. In that case the editor may not make an informed decision, and the choice to republish the manuscript may be based on false premises. This I consider a big problem.

It may be that I’m too cynical, but my guess is that the editor is typically not informed. I have on multiple occasions as a reviewer discovered that large portions of manuscripts had already been published elsewhere – as soon as the editor learned this, they promptly rejected the manuscripts.

Lars, I see your point. I was definitely assuming everyone had the best possible motivations here. I agree that any form of hiding that it is a republication is bad, whether by the editor or by the author. That is just rude as well as wasting resources. I was definitely thinking about your first example, which is still ruled out by most journal policies (including PLoS ONE).

The fact that the second is prevalent is evidenced by the paper that Michael Nielsen bookmarked today, which apparently uses for its sample set 5000 papers that appear in two journals with the same first author and same title. That there are 5000 was certainly surprising to me, but I would guess that either the editors were not aware in most of these cases or that they were examples of CV padding via vanity publishers.

It just occurred to me that there is also a practical problem with respect to the first situation: If the journal in which the paper was originally published is not Open Access, the authors will typically (although there are exceptions) have signed over copyright, which would make republication of the paper illegal. If on the other hand the paper was originally published in an Open Access journal, it is already available to everyone, which would make republication rather pointless from a dissemination point-of-view.

Come on Lars, it was quite a nice piece until you shot all those logical and sensible holes through it :-)

I would guess this mostly comes up with small local journals, many of which have subsequently folded. But even then the authors may have retained copyright in a pre-submission version. I admit it’s getting more difficult to sustain as more than a theoretical argument, but I think there could be legitimate circumstances in which it came up.

Anna

I was going to raise the issue of prior copyright as well (but Lars beat me to it). The problem is that even if an operation has folded, is there not a timespan within which they (or someone else, other than the author) still retain the copyright?
From my point of view, any book that is no longer published and/or readily available should be free game (for scanning/dissemination/photocopying/republication etc), but I am sure that the copyright police would get me on those. I can’t see that journals are much different.
Also, if you have a copy already available via archive, is it not sufficient for you to flag this up online, rather than republish? Otherwise all that would happen is those rich enough to present their work through an open-access journal would take up all the bandwidth …

In the examples I was thinking of, the journal may not have required a copyright transfer, particularly for small local journals. In the case of a journal folding, the big publishers actually have arrangements to put things into the public domain in some form or other to prevent this kind of thing happening. The point about archiving is a fair one, especially as more people use Google Scholar to search, which is better on things in repositories compared to PubMed or WOK.

But I still think the point is that the immediate reaction that “this is wrong” is misplaced. There are many reasons why it might be the wrong thing to do, but to rule it out in principle is the thing I am arguing against. I’m questioning what seems to be a deep seated and not entirely logical reaction against the idea.

Cameron – this is interesting. Of course publishers do already republish articles. They sometimes collect different articles on a topic together and republish them in book form. They clearly label the individual articles as to their origin, though the book itself is not always marketed as a re-hash of existing material. Librarians hate these books as we can end up paying again for something we already have in journal form!

So there is nothing inherently wrong in republishing provided, as Lars says, it is clearly labelled and you are not trying to create a new article out of an old article.

Teaching institutions in effect do the same thing, often having to spend a great deal of time on clearing the rights to re-use content. See HERON for more info http://heronweb.ingenta.com/heron/

Some journals also republish old, classic articles with a commentary. Again the intent is to re-expose the old article to a new audience, not to practise any kind of deception.

To me, some kind of re-publishing of non-indexed studies needs to be done somehow. I myself search theses and dissertations when I have to really get deep in a topic – these can have great data on new measures. Manual search of conference proceedings can get you info on emerging topics, and show who is working on these. Increasingly, these are popping up in e-searches. Theses and dissertations are indexed, but they are kind of out-of-the-mainstream so do not get found. I am currently doing a review, and I am including decent dissertation evidence, with the caveat that this source is not “vetted” by peer-review (although the quality was probably scrutinized highly). This is kind of “grey market” for knowledge, and it would be great to have some improved way to access these “grey market” research pubs.

Another reason for duplicate publication is for pharma to more quickly create their own snowball effect of evidence for marketing reasons – if pharma reps can walk into the doc’s office with more than one article as evidence of efficacy, it gives the impression that the new drug has got something goin’ on. Generate some buzz for the (patented) med.
This blogger gets credit for noting this specific example:

That will pad your CV, but in this case it seems likely the motivation was to generate product ‘buzz.’

Tzar Mohd Nizam

hi all! i’m new in this area. consider this situation: your paper has been accepted by a journal – problem is the journal is not indexed – therefore your work can’t be shared with many people except for those who subscribe to that journal. can u republish your article in another indexed journal? tq

License

To the extent possible under law, Cameron Neylon has waived all copyright and related or neighboring rights to Science in the Open. Published from the United Kingdom.