A systemic malady: the pervasive problem of misconduct in the biomedical sciences part 2: detection and prevention

By Dr Gerald H. Lushington and Rathnam Chaguturu

Summer 2015

Discovering safe and effective medicines, one of the greatest gifts to humanity, relies intensely upon the availability of detailed, unassailably accurate biomedical scientific knowledge.

The key distinction that separates science from other philosophical disciplines is that no idea should become an accepted scientific fact without rigorous independent verification. The importance of this bedrock principle is paramount, but is being jeopardised by an epidemic of false-science unfolding around us.

Part 1 of this two-part series on the pervasive problem of misconduct in the biomedical sciences discussed the issues and causes of key problematic areas, including falsification of results, peer-review rigging, data over-interpretation and improper or wilfully selective sampling practices. This second instalment explores tactics for identifying and counteracting the flawed practices that engender faulty or dishonest research. As a learned society, we must truly commit to such exhaustive scrutiny if we hope to actively reverse this destructive trend.

Although science, like Wall Street, is self-correcting to a certain extent, the outcome of the recent housing bubble reminds us vividly how much pain and destruction can be incurred by passively awaiting organic self-correction. As this article conveys, all affected entities – investigators, publishers, research institutions, funding agencies and national policy-makers – have the imperative, the ability and the resources to identify research improprieties and prevent such misconduct from inflicting negative consequences on the global biomedical research enterprise.

Most scientific advancement occurs not through blockbuster papers, which definitively solve or address a problem, but through careful and incremental progress, building on prior work.

If results and key findings are not reproducible, subsequent progress in the field is compromised until erroneous conclusions are identified and either corrected or removed from the public record. The oft-quoted inability of Amgen, Bayer and the ALS Therapy Institute to reproduce seminal biomedical studies published in high-impact journals is a testament to this malady1. However, we clearly do not know the magnitude or prevalence of the problem. We simply do not take the time, or have the resources or the incentive, to replicate a published study to ascertain its reproducibility quotient. To address this problem, or fill the void, various public and private organisations are now starting to fund replication studies to gauge research reproducibility, under the expectation that independent replication of key findings will foster successful biomedical innovation2. The most prominent example is the Cancer Biology Reproducibility Project3 using the Science Exchange network. The market-driven collaboration model of the Science Exchange network (www.scienceexchange.com) has been critical for delivering and publishing replication studies. Monetarily incentivising participation has enabled rapid delivery of high-quality replication experiments. The following replication best practice model has been adapted by the Science Exchange (one of the authors, RC, is on its advisory board), and should be relevant to anyone who embarks on a replication study2:

• Conduct a direct replication using the same materials and methods as closely as possible, including any additional controls as necessary.

• Obtain input from the original author on the proposed replication protocol (if desired).

• Use expert, independent labs with extensive expertise in the techniques being replicated.

• Where possible, use positive and negative controls to confirm that replication experiments were successful.

• Use power calculations to ensure that the replication sample size is sufficient to verify the reported effect with at least 80% power.

• Provide all protocols, results, raw and processed data for review.
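The power-calculation item in the list above can be illustrated with a short sketch. It is not the Science Exchange procedure; it assumes the simplest common case of a two-sided comparison of two group means, uses the normal approximation, and takes the reported effect as a standardised effect size (Cohen's d). The function name and defaults are illustrative.

```python
import math
from statistics import NormalDist


def samples_per_group(effect_size, alpha=0.05, power=0.80):
    """Approximate sample size per group for a two-sided, two-group
    comparison of means (normal approximation, equal group sizes).

    effect_size is the standardised difference (Cohen's d) reported by
    the original study; alpha and power are assumed conventional values.
    """
    if effect_size <= 0:
        raise ValueError("effect_size must be positive")
    if not (0 < alpha < 1 and 0 < power < 1):
        raise ValueError("alpha and power must lie strictly between 0 and 1")

    standard_normal = NormalDist()
    z_alpha = standard_normal.inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_power = standard_normal.inv_cdf(power)          # quantile for target power

    n = 2 * ((z_alpha + z_power) / effect_size) ** 2
    return math.ceil(n)


if __name__ == "__main__":
    for d in (0.3, 0.5, 0.8):
        print(f"d = {d}: about {samples_per_group(d)} samples per group")
```

Under these assumptions a medium effect (d = 0.5) needs about 63 samples per group for 80% power. Exact t-based calculations give slightly larger numbers, and published effect sizes are often inflated, so a replication may reasonably plan for a smaller effect than the one originally reported.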

The awarding of a $1.3 million grant by the Arnold Foundation in October 2013 to replicate 50 landmark cancer studies is an acknowledgement of the relevance of the Science Exchange network's Reproducibility Initiative2. The results from the replication studies are available through the Open Science Framework (OSF), a free service that provides open public access to experimental protocols, materials, data, analysis and results, and are published in PLOS ONE, a peer-reviewed, open-access online publication4.

Misconduct detection and prevention

Very few readers will disagree with the assertion that scientific misconduct is a problem that should be prevented. Regardless of whether one is seeking to build upon the great body of knowledge in basic science, or to extend such knowledge translationally toward key technological developments such as advancing the drug discovery pipeline, the true onus is on the originating (academic) researchers to safeguard their professional reputations by adhering to the bedrock principle of reproducibility and thus exercising due diligence in preclinical research practices. For subsequent researchers to blindly assume such diligence on the part of their forebears, however, is a recipe for frustration or failure. The more complex issue is how problematic research can best be identified and, even more fundamentally, whose responsibility it is to expose and counteract such problems.

Most journals now take steps to prevent some examples of misconduct from being published. The most straightforward issue to address (as is discussed elsewhere in this paper) is the one with the least impact on reproducibility, namely plagiarism, whose detection has been rendered increasingly facile by a prevalence of text digitisation and by the growing power of comprehensive text search tools. Manuscript screening is being increasingly implemented by journals in the earliest stages of quality control, which means that for some periodicals the most egregious papers can be intercepted even before the peer-review process begins. Of greater relevance to knowledge integrity is the fact that some journals are also beginning to implement automated scrutiny of images (eg, gel blots and microscopy diagrams) to detect significant image editing. Beyond that, however, there are practical limits to editorial mandate for uncovering falsified or fraudulent data or conclusions. Based on specialised domain expertise, journal referees might be able to spot possible instances of misconduct, and the occasional proposal review panel may catch problems with preliminary findings in the research proposals that they evaluate, but the most important figures in the battle against scientific misconduct are scientific researchers themselves: people who must believe in the contributions and insight provided by collaborators, and people who rely upon the accuracy of prior studies in order to build their own research plans.
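To make concrete why text search has made plagiarism the easiest form of misconduct to detect, the sketch below measures how many five-word sequences in a candidate passage also occur in a source passage. It is an illustrative assumption about how such screening can work, not a description of any journal's actual tool; the function names and the five-word window are hypothetical choices.

```python
import re


def shingles(text, n=5):
    """Return the set of n-word sequences in text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def overlap_fraction(candidate, source, n=5):
    """Fraction of the candidate's n-word sequences that also appear in source.

    Returns 0.0 when the candidate is shorter than n words.
    """
    candidate_shingles = shingles(candidate, n)
    if not candidate_shingles:
        return 0.0
    shared = candidate_shingles & shingles(source, n)
    return len(shared) / len(candidate_shingles)


if __name__ == "__main__":
    earlier = "Cells were lysed in buffer and centrifuged at high speed for ten minutes."
    later = "Cells were lysed in buffer and centrifuged at high speed before analysis."
    print(f"overlap: {overlap_fraction(later, earlier):.2f}")
```

A high overlap only flags a passage for human review. As the later discussion of boilerplate methods text argues, shared wording in a Methods section is not in itself evidence of unreliable science.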

The means by which an individual may detect instances of research impropriety have varied over time and continue to evolve. Ultimately, the greatest chance of detection arises if a certain minimum amount of critical information is provided to all collaborators prior to publication, to all referees during the peer-review process, and to all readers post-publication. Key forms of information are delineated below:

• A comprehensive description of the underlying methodology should be provided such that any reader would be able to reassemble the experimental/analytical protocol without ambiguity. This is a fundamental tenet of scientific reproducibility, yet it is unnerving to see how many papers do not describe their protocols, instrumentation and computational manipulations in terms that are specific and detailed enough for another researcher to repeat.

• Authors should include all spreadsheets and other files that fully justify the numbers achieved in each step of the data analysis process.

• Where practical, an electronic compendium of all raw data and metadata employed in the study should also be available. If this is not possible due to electronic storage limitations, then a condensed set of processed data may be acceptable, provided that clear data provenance is established and there is clear documentation to demonstrate that the reported data subset has not been cherry-picked in any subjective manner.

Ensuring that this level of information is available will greatly facilitate the process of identifying problematic assertions because it will support rigorous statistical validation, and then permit multiple levels of thorough verification for any instances that appear to be anomalous or statistically questionable. In terms of how best to exploit this detail, Glenn Begley proposes watching for the following six red flags while examining basic science publications5:

1. Were experiments performed blinded?

2. Were basic experiments repeated?

3. Were all the results reported?

4. Were there positive and negative controls?

5. Were reagents used validated?

6. Were statistical tests relevant and appropriate?

We would also suggest that researchers should not be swayed by a paper's title, the journal, the authors or the institution. When using prior results to formulate a research project, it is imperative to read the paper rigorously, checking the methods, error bars and deductive logic to ensure adequate documentation. We strongly encourage viewing documents on the computer screen, instead of the printed document, to facilitate scrutiny of graphical manipulation (eg, cropping or Photoshopping).

Ultimately, the path toward elimination of scientific misconduct is best laid not through detection and reporting, but by pre-emptive prevention. While public knowledge of the repercussions of cheating may provide deterrence, the most effective strategies may lie in reforming the research environment. Some of this can be instituted within the mindset of young researchers by stressing education in proper sampling, controls and validation strategies and in the usage of relevant basic instrumentation. Additional progress may be achieved by distributing responsibility more broadly. For example, the prospect for undetected data fabrication (at least during the early stages of a drug discovery research programme) is almost non-existent in the pharmaceutical sector. This is mainly because promising projects move in quick succession from the molecular/cell biology lab to the hands of assay developers, the HTS facility and eventually the chemistry/computational team for QSAR studies. The disconnects in personal responsibility for a project help to impose ‘real time’ checks and balances, such that data quality assurance is effected repeatedly (each time the project transitions from one lab to the next) and inconsistencies are quickly detected. Such repeated independent verification within an integrated project management environment is rare in academia, wherein the same principal investigator oversees the whole development process, and a single researcher may retain a vested interest throughout – a personalisation of responsibility that amplifies both the incentives and opportunities for scientific misconduct. Fortunately, a foundation for improved academic self-regulation is emerging now as well. The establishment of independent service cores (eg, for high throughput screening, synthesis, purification or modelling) typically encourages non-vested independent scrutiny from outside of the principal investigator’s laboratory. In addition, pressure to reform is emanating from sober public watchdog entities such as Retraction Watch (www.retractionwatch.com) and the Reproducibility Initiative (validation.scienceexchange.com).

Detecting peer review fraud

The Ferguson article6 exhorts journal editors to be wary of authors who submit proposed referees with free, non-institutional e-mail addresses (eg, Yahoo, Gmail, etc; such generic addresses are abused frequently enough that journals may consider restricting review correspondence to only established institutional addresses), as well as those who provide excessively long lists of researchers to avoid directing the paper to. Other red flags stated in this article include rapid referee replies and unanimously positive responses6. Unlike many aspects of scientific misconduct, the abuse of peer review protocol is readily corrected – if journal editorial boards are serious about publishing manuscripts of reasonable quality, the board must simply not cut corners in the refereeing process. Possible solutions to the challenge of finding qualified and willing reviewers include:

• Offering small honoraria.

• Inviting corresponding authors to become part of the referee roster.

• Mandatory use of institutional IDs.
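The red flags attributed above to the Ferguson article (free e-mail addresses, rapid replies, unanimously positive responses and long exclusion lists) lend themselves to a simple editorial screen. The sketch below is a hypothetical illustration only: the domain list, thresholds, class and function names are all assumptions, and a flag is a prompt for an editor to look more closely, not a finding of fraud.

```python
from dataclasses import dataclass

# Illustrative, deliberately incomplete list of free e-mail providers (assumption).
FREE_EMAIL_DOMAINS = {"gmail.com", "yahoo.com", "hotmail.com", "outlook.com"}


@dataclass
class SuggestedReview:
    """One report from an author-suggested referee (hypothetical record layout)."""
    reviewer_email: str
    days_to_reply: float
    recommendation: str  # eg, "accept", "minor", "major", "reject"


def red_flags(reviews, excluded_reviewers, fast_reply_days=2.0, max_exclusions=5):
    """Return human-readable warnings for an editor; thresholds are assumed values."""
    flags = []
    for review in reviews:
        domain = review.reviewer_email.rsplit("@", 1)[-1].lower()
        if domain in FREE_EMAIL_DOMAINS:
            flags.append(f"non-institutional address: {review.reviewer_email}")
        if review.days_to_reply < fast_reply_days:
            flags.append(f"unusually rapid reply from {review.reviewer_email}")
    if len(reviews) >= 2 and all(r.recommendation == "accept" for r in reviews):
        flags.append("unanimously positive reports")
    if len(excluded_reviewers) > max_exclusions:
        flags.append("long exclusion list supplied by the authors")
    return flags


if __name__ == "__main__":
    reports = [
        SuggestedReview("referee.one@gmail.com", 0.5, "accept"),
        SuggestedReview("referee.two@university.example.edu", 12, "accept"),
    ]
    for warning in red_flags(reports, excluded_reviewers=[]):
        print(warning)
```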

Unfortunately, many journals that have come into existence in the past decade have not demonstrated a commitment to rigorous review; unless a journal can commit to providing a diligent and fair evaluation of all of the submissions it receives, the journal will not sustain a readership necessary to perpetuate its existence.

Reporting and responding to misconduct

Scientific misconduct, whether due to innocent ignorance or malicious malfeasance, should be identified and dealt with by listening to concerns, questioning gaps and auditing operations7. When Congressman Albert Gore Jr probed Philip Felig (Yale) during the 1981 Biomedical Research Fraud hearings, there was no effective national response mechanism for safeguarding the research granting investments made by our federal funding agencies8. Today in the United States, the Office of Research Integrity (ORI) provides a framework, established by the Health & Human Services (HHS)/Public Health Service (PHS), for addressing allegations of research misconduct in biomedical sciences (https://ori.hhs.gov/handlingmisconduct).

The administrative actions PHS/HHS may take against respondents who have a finding of research misconduct made against them include, but are not limited to (https://ori.hhs.gov/administrative-actions):

• Debarment from eligibility to receive Federal funds for grants and contracts.

• Prohibition from service on PHS advisory committees, peer review committees or as consultants.

• Certification of information sources by respondent that is forwarded by institution.

• Certification of data by institution.

• Imposition of supervision on the respondent by the institution.

• Submission of a correction of a published article by respondent.

• Submission of a retraction of a published article by respondent.

There could also be formal reprimand, termination, rejection of thesis and rescission of degree. The choice, number and duration of administrative actions depend on the seriousness of the misconduct, the impact of the misconduct and whether the misconduct demonstrates a pattern of behaviour. Administrative actions are usually imposed for three years, but have ranged from one year to a lifetime. Case summaries providing background information and administrative actions levied are published (https://ori.hhs.gov/case_summary).

At the time of writing (January 07, 2015), there were active administrative actions of various degrees against 43 investigators, but with only four recorded retractions! (https://ori.hhs.gov/ORI_PHS_alert.html). None of these investigators faced loss of positions or criminal prosecution. While the processes and procedures in place at ORI seem satisfactory, execution is an altogether different matter. David Wright, former Director of ORI, resigned after two years out of frustration over “profoundly dysfunctional” federal bureaucracy and budgetary constraints. Wright further states: “It was the accumulation of frustrations with the bureaucracy and trying to operate a regulatory office which requires precision, transparency, procedural rigor in an organisation that values none of those things.”9

In reality, most complaints against scientific misconduct produce no significant, measurable, open or transparent consequences. In academia, the punishment received by the perpetrators, except a few (http://articles.latimes.com/keyword/scientificmisconduct), is rarely commensurate with the level of the misconduct.

Unlike in ‘basic science,’ misconduct in clinical trials poses a significant and immediate safety risk to patients and puts the FDA’s mission to protect public health in jeopardy, and hence must be dealt with swiftly (Deborah Collyar, Patient Advocates in Research, personal communication). In this spirit, perpetrators in professional disciplines, such as the pharmaceutical industry, are dealt with much more efficiently and severely than their counterparts in academia, as the following sampling of cases shows:

• A researcher who was found to have fabricated data no longer works at the drug maker Bristol-Myers Squibb (Wall Street Journal, November 24, 2014).

• Head of GSK’s Shanghai site fired for fabricating results in a Nature Medicine 2010 paper (In the Pipeline, 2013).

• Bell Labs physicist fired for falsifying and fabricating data in a series of high-profile papers (Physics World, September 2002).

Given these grave consequences, we believe that the research community must develop and embrace a firm, responsive and equitable platform for detecting and preventing scientific misconduct. Accountability must become a paramount objective. Advisors who push students or research staff toward indefensible conclusions must share in the culpability of falsification. Well-funded senior investigators and administrators should not have allegations of impropriety swept under the carpet while their junior colleagues are taken to task. The pursuit of scientific integrity must be accompanied by vigorous, wholehearted administrative integrity; institutions that fail to demonstrate this principle must run the risk of external audits and legal judgments. Ultimately, as with fraud in other sectors, misconduct must be punishable by law, as was demonstrated for Eric Poehlman, who falsified 17 NIH grant applications and fabricated data in 10 of his papers. He was fined $180,000, received a lifetime ban on receiving federal funds and became the first American academic to serve jail time10.

A common thread through the mechanisms available for misconduct reporting and for all of the above instances of successfully prosecuted research fraud is that, at some point in the process, individual researchers encountered failures in the validation of others’ studies and were confronted with the question of what to do about the corresponding evidence of research malpractice. Although it is debatable whether or not each of us has an imperative to report such problems, there is clearly at least a moral persuasion to do so. Such reporting must never be undertaken lightly, however. The first step for someone considering exposing an instance of alleged scientific misconduct, even before speaking to anyone about it, should be to familiarise oneself with institutional policies regarding the onus and procedure for dealing with misconduct. Such policies have been developed in the last decade, but many are in states of frequent flux and vary substantially from one institution to another, and across the academic, corporate and public sectors. One should clearly understand and be prepared to rigorously follow any standard reporting protocol, and one should be knowledgeable regarding the presence (or lack) of any institutional whistleblower protections. If no institutional whistleblower policy is in place, one should investigate federal programmes (eg, within the United States, one should refer to resources made available by the Office of the Inspector General, http://www.oig.doc.gov/Pages/Hotline.aspx).

Once the basic responsibilities, protections and reporting logistics have been established, one should proceed with exceptional care when reporting a prospective case of misconduct. Regardless of whether one plans to take the information to a friend, a supervisor, the alleged offender or higher administration, one must exhaustively document the precise allegation and be prepared to describe or report any communications with the potentially culpable researcher(s). The reason for this abundance of caution is clear – accusations have a tendency to bring out the worst responses, not only from those being accused but sometimes even from people one might view as friends or mentors, or otherwise expect to be impartial. Numerous negative repercussions may emerge in ways that are difficult to foresee. The authors are aware of disturbing allegations of institutional hypocrisy in which the same administrators who have readily prosecuted misconduct complaints against some researchers are accused of simultaneously suppressing other comparable complaints, even to the point of intimidating or dismissing the complainant. Nonetheless, given care and commitment, anyone acting in good faith and with strong documentation should be able to advance the cause of public good without incurring personal damage.

In many cases, the person who might detect or suspect scientific misconduct may have no professional relationship with the prospective perpetrator, and may not have a strong, immediate vested interest in the case. There may then be limited motivation for attempting to initiate a major punitive response, but a scientist with integrity may still wish to see suitable corrections in the open literature record. In such cases, one may write to the editor of the journal that has published the questionable work. Professor David Vaux, a molecular biologist at the Walter and Eliza Hall Institute of Medical Research in Melbourne, Australia, frequently does precisely this (personal communication, see box).

Plagiarism perspective: why we shouldn’t just take the easy road

In the first instalment of this two-article series1, we raised the conundrum posed by textual plagiarism as the form of scientific misconduct that was most readily confronted (because it is much easier to identify than other forms of malfeasance) but least damaging to the knowledge discovery pyramid (because plagiarised material is no more likely to contain erroneous findings than the average scientific report). Further clouding the issue is the philosophical quandary over whether hyper-diligent prosecution of textual plagiarism morally amounts to much more than systematic discrimination against non-English-speaking scientists, especially given the fact that federal documents specify plagiarism as comprising not only uncited borrowing of text or ideas but also the use of uncredited third-party writers.

In clouding the prospects for substantial third-party assistance in drafting or rewriting manuscripts, one can glimpse the predicament faced by many non-English-speaking researchers. Although every obstacle likely has some path for circumvention, it is not difficult to see why some people could begin to see textual plagiarism as a practical and justifiable way of acquiring reasonable-sounding text to convey useful information. Should those of us who do not face such obstacles consider expending less energy on condemnation of the actions caused by this regrettable state, and more toward finding possible solutions to the underlying problem?

One remedy may be emerging, courtesy of linguistic software development. The incredible progress made in recent years by language translation utilities has rendered facile multilingual communication increasingly feasible, even proving capable of plausible rendition of many features and quirks of common vernacular. Although at first blush it may seem counter-intuitive, the linguistic challenges of translating common spoken language (especially considering dialectal variations and slang) are much greater than those of processing scientific writing. Papers reporting novel technical developments tend toward a fairly rigid formulaic structure that lends itself well to automated translation. Anecdotal evidence for this exists in accounts whereby scientists of one European culture may find it possible to passably read papers in other European languages, even without formal schooling in those languages – the message being that the structure and vocabulary of technical papers are closely enough conserved from one language to the next for rapid and intuitive translation.

So let us envision a time in the near future when it may be possible for scientific researchers worldwide to compose manuscripts in their language of choice and have the text subsequently converted automatically to publication-quality English for submission to high quality periodicals. What are the implications?

• Is this an unqualified victory for global science, or does it expose new ethical issues?

• Is it any more ethical to use the description of an experimental protocol that has been textually constructed by software, than it was to borrow phrases liberally from prior scientific publications?

• What if the protocol translation software learned aspects of syntax and nomenclature from prior examples of published scientific literature?

• On another angle, one wonders about the implications of errors in translation; who will accept liability in cases where damage arises?

Miscommunications in technical literature (especially in the formulation of protocols or in the delineation of practical conclusions) can incur a significant practical penalty if they induce irreproducibility or misinterpreted conclusions; thus widespread reliance on automated communications is only viable if one has either a strong degree of confidence in the automation or effective mechanisms for verification. At the time of this writing, it is uncertain how one might go about ensuring these safeguards.

Universal textual novelty: a moral imperative, or a recipe for irreproducibility?

While the challenge of intelligible and accurate reporting is of utmost importance for discussing and deriving conclusions from experimental analysis, the vitality of science is even more contingent on full and reliable reporting of experimental protocol. Numerous instances of textual plagiarism are detected within the methodological descriptions of papers. Along with the Introduction, the Methods section likely hosts the greatest incidence levels of text copying of any manuscript component, because the material covered in these sections is most likely to have substantial overlap with prior work. Ironically, plagiarism within Introduction and Methods sections engenders less net harm than other components. While novel and compelling background and introductory text is often critical for contextual motivation in proposals, such material is of modest importance in publications as it encapsulates prior knowledge that many readers will be familiar with. Furthermore, many researchers are interested in papers for reasons other than those of the original authors, and thus many readers may do little more than skim the Introduction. Within the Methods section, one might opine that there is a practical aspect to mimicry – in cases where it is important to perform an experiment via a fairly standard protocol that is identical or very similar to prior studies, there is no did not heed attribution of the core laboratory, but rather insinuated that the author plagiarised an earlier manuscript that had already included precisely the same boilerplate description provided by the same service laboratory.

Provision of boilerplate methodology encapsulation is a fairly common practice among core service laboratories. While some authors make an effort to rewrite the text provided, this diligence is far from universal and, in light of earlier statements on accuracy, one may question whether it is actually advisable for a non-expert to rewrite (for no reason other than ensuring textual uniqueness) the protocol provided by a specialised practitioner of a specific technique. While boilerplating is not uncommon in academic settings, it is distinctly popular within corporate research settings where a premium is placed on accurate and efficient recording of methodological protocol12. This raises the rather disturbing question: is our collective obsession with textual plagiarism actually harming scientific progress by encouraging people to revise and potentially obfuscate previously clear protocol descriptions? Even worse, might concerns over the scientific provenance of technical text cause people to eschew cross-disciplinary research on the grounds that authors may not feel comfortable in writing about (or including in manuscripts) methods that they have an imperfect grasp of?

Emerging and future solutions?

As discussed earlier, the prospect for using automated translation software to facilitate the generation of publication-quality text in English (presumably from a manuscript constructed with care within the author’s native language) was explored as a means for obviating a key motivation for textual plagiarism. Although any question of translation reliability likely stands in the way of using automated translators for critical sections such as the Methods, Discussion and Conclusions sections, there is no reason why it cannot provide a reasonable avenue for handling the basic text generation for introductory paragraphs, simple (ie, non-interpreted) statements of results, as well as acknowledgements.

How, then, might we achieve a compromise for facilitating production of the more scientifically critical sections? The question of how to accomplish this for interpretive discussion and conclusions, although of substantial community interest, is perhaps beyond the scope of this article since these sections are much less likely to contain significant textual plagiarism than the Methods section. For the latter, perhaps it is time to think like quality-minded technical developers rather than creative artists. Instead of regarding the Methods section of a paper as a unique piece of creative writing, would it not be more productive to regard it as a rigorously formulated technical document in which clarity and accuracy of communication are critical?

The means for achieving this may already be well exemplified within many of our standard public data services. Consider, for example, the high degree of informational standardisation that must be effected by researchers who wish to make technical results publicly available within services such as the Protein Data Bank, PubChem and the many genomic and proteomic data repositories. What if there existed a public service that accepted and curated similarly standardised descriptions of experimental and analytical protocols? Each protocol deposited could be massaged into a form maximised for transparency and reproducibility via a combination of an information standardisation interface (designed to perceive, require and format the minimum required detail), followed by peer review intended to validate the completeness and the methodological and statistical rigour of the study. In order to facilitate studies that largely follow prior protocols, it would be helpful to permit use of pre-existing templates, with the reciprocal benefit that the interface can then fully document the inherited lineage of the modified protocol (ie, reference all prior studies that conceptually influenced the strategy and implementation in the current methodology).
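As a purely hypothetical illustration of what such a service might store, the sketch below models a deposited protocol as a structured record with a list of required fields and a reference to the templates it was derived from, so that the inherited lineage can be traced automatically. No such repository or schema is described in this article; every field and function name here is an assumption.

```python
from dataclasses import dataclass, field

# Assumed minimum detail a deposit must supply before peer review.
REQUIRED_FIELDS = ("title", "materials", "steps", "analysis")


@dataclass
class ProtocolRecord:
    """One deposited protocol (hypothetical schema)."""
    protocol_id: str
    title: str
    materials: list
    steps: list
    analysis: str
    derived_from: list = field(default_factory=list)  # ids of parent templates


def missing_fields(record):
    """Names of required fields that were left empty in the deposit."""
    return [name for name in REQUIRED_FIELDS if not getattr(record, name)]


def lineage(registry, protocol_id):
    """Ids of all ancestor protocols, nearest first, each listed once."""
    seen, ordered = set(), []
    queue = list(registry[protocol_id].derived_from)
    while queue:
        parent_id = queue.pop(0)
        if parent_id in seen or parent_id not in registry:
            continue  # skip repeats and references to undeposited protocols
        seen.add(parent_id)
        ordered.append(parent_id)
        queue.extend(registry[parent_id].derived_from)
    return ordered


if __name__ == "__main__":
    base = ProtocolRecord("P1", "Kinase assay", ["ATP"], ["incubate"], "t-test")
    variant = ProtocolRecord("P2", "Kinase assay, 384-well", ["ATP"],
                             ["incubate", "read plate"], "t-test", ["P1"])
    registry = {p.protocol_id: p for p in (base, variant)}
    print(lineage(registry, "P2"), missing_fields(variant))
```

A manuscript could then cite a protocol identifier rather than re-describing a standard method in fresh prose, which is the point of the proposal: the lineage credits prior work explicitly instead of leaving it to be inferred from similar wording.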

The obvious advantages for such a centralised resource would include, but are not limited to, the following:

• Researchers who are not native English speakers will be able to document their research procedures conveniently in a systematic, transparent way via clear reference to existing templates, without incurring a risk of text-plagiarism insinuations.

• Studies using this resource will have a standardised, validated and clearly articulated protocol statement, thus greatly aiding research reproducibility.

• Journals will be encouraged to require an accepted, validated protocol prior to manuscript submission. This should substantially simplify journal peer review, which for many manuscripts is greatly complicated by referee misinterpretation of poorly communicated methodology descriptions.

• The resource will grow collaboratively into a comprehensive, conveniently searchable global resource, documenting unique and variant technical protocols which may be subsequently cross-referenced with resulting publications.
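
To make the proposal more concrete, the following is a minimal sketch of how such a service might enforce a minimum level of detail and record the lineage of protocols derived from templates. No such service is described in detail here, so every name in the sketch (Protocol, Registry, REQUIRED_FIELDS and the four field names) is a hypothetical choice made for illustration, and the peer-review step is left out entirely.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

# Hypothetical minimum detail; a real service would define this per discipline.
REQUIRED_FIELDS = ("title", "materials", "steps", "statistics")


@dataclass
class Protocol:
    protocol_id: str
    fields: Dict[str, str]
    parent_id: Optional[str] = None  # template this protocol was derived from


class Registry:
    """In-memory stand-in for the proposed public protocol service."""

    def __init__(self) -> None:
        self._records: Dict[str, Protocol] = {}

    def deposit(self, protocol: Protocol) -> None:
        # Reject deposits that omit any of the required detail.
        missing = [name for name in REQUIRED_FIELDS
                   if not protocol.fields.get(name, "").strip()]
        if missing:
            raise ValueError("missing required detail: " + ", ".join(missing))
        if protocol.protocol_id in self._records:
            raise ValueError("identifier already in use: " + protocol.protocol_id)
        if protocol.parent_id is not None and protocol.parent_id not in self._records:
            raise KeyError("unknown template: " + protocol.parent_id)
        self._records[protocol.protocol_id] = protocol

    def derive(self, parent_id: str, new_id: str, changes: Dict[str, str]) -> Protocol:
        # Start from an existing template and override only what changed.
        parent = self._records[parent_id]
        child = Protocol(new_id, {**parent.fields, **changes}, parent_id=parent_id)
        self.deposit(child)
        return child

    def lineage(self, protocol_id: str) -> List[str]:
        # Walk back through the templates, nearest ancestor first.
        chain: List[str] = []
        current = self._records[protocol_id].parent_id
        while current is not None:
            chain.append(current)
            current = self._records[current].parent_id
        return chain
```

In this sketch, a study that adapts an existing protocol calls derive with only the fields it changed, and lineage then lists every earlier protocol it inherited from, which is the attribution trail the template mechanism is meant to preserve.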

Personal perspectives

One of the authors (GL) has developed a distinct interest in aspects of the professional and cultural climate that drives researchers toward text misappropriation. As a scientific researcher whose career has taken an interesting detour after having foolishly failed to report text plagiarism committed by a colleague who published several items (two bioinformatics articles, a conference paper and a poster abstract) containing borrowed text, this interest has a personal basis (see box).

There are better uses for the reader’s time than to dwell on the long-term implications of my failure to act, so I will summarise them briefly. Ironically, they were decidedly mixed. The stain on my reputation has been humbling, but on the other hand the episode produced a rather liberating form of self-reinvention. My collaborators understood that I had not personally engaged in plagiarism and had never been accused of such, and thus I found myself able to productively sustain all mutually beneficial scientific interactions, while shelving some inequitable, unsatisfying relationships. A shift to independent consultancy permitted me to extract my more lucrative intellectual property from the confining bounds of a public institution, thus affording a productive, self-sustaining situation in which to grow my research. In spite of the pleasant silver linings, however, I would not generally recommend the experience to others. Rather, I feel an obligation to advocate key practical precautions for reducing the prospects that well-intentioned scientists will be stung with the taint of text-plagiarism.

The first key is to understand the scenarios in which we are most likely to unwittingly become co-authors on violating papers. The greatest risk comes with complex, multidisciplinary research wherein there are aspects to the project for which most collaborators and co-authors have limited background and literature-awareness. Within our own areas of specialisation, most of us can instinctively sense which statements and assertions in a manuscript likely originated in other recent papers and thus warrant external referencing. For more distant disciplines, however, this instinct may be less attuned to distinguishing between statements of basic fact (ie, principles that are widely enough known in the subfield that they rarely warrant citation) and novel concepts with a clear attribution trail. Thus, when one’s collaborations stray outside of a personal zone of experience, it becomes increasingly important to rely on instinct rather than knowledge. In particular, one must know a collaborator’s native writing style. Although one might think first to examine old papers by this collaborator, a better strategy is to instead examine informal communications such as e-mails. Prior papers may have been heavily edited by others and may even contain undetected plagiarism, and hence this set of writing samples may provide little guidance as to improprieties in newer drafts. By contrast, very few people ever dwell slavishly over the style, content, spelling, punctuation and grammar of the e-mails that they compose. Natively competent writers will typically write fairly fluent e-mails, while those with problematic writing abilities will typically produce sloppy communications. The key metric for suspicion would thus be eloquent and polished formal manuscript contributions by a researcher who generally composes atrocious informal memos. A further, related metric entails writing that contains a mixture of complex, grammatically immaculate constructs juxtaposed with sloppy, juvenile ones: the sense that the same piece of writing has been composed by two very different people. A third recourse is to perform simple Google text searches on a small selection of the most eloquent, informative or provocative unattributed statements in a piece of writing.
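
Where the likely source documents are already to hand (for example, a collaborator’s reading list), the same idea as the text search can be approximated locally. The sketch below is an illustration under stated assumptions rather than an established detection method: it compares runs of consecutive words in each sentence of a draft against a set of source texts and flags sentences with heavy overlap. The function names, the five-word window and the 0.5 threshold are all arbitrary choices made for this example.

```python
import re
from typing import List, Set, Tuple


def shingles(text: str, n: int = 5) -> Set[Tuple[str, ...]]:
    """Return the set of n-word runs in the text, ignoring case and punctuation."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def containment(draft: str, source: str, n: int = 5) -> float:
    """Fraction of the draft's n-word runs that also occur in the source."""
    draft_set = shingles(draft, n)
    if not draft_set:
        return 0.0  # fewer than n words: nothing to compare
    return len(draft_set & shingles(source, n)) / len(draft_set)


def flag_sentences(draft: str, sources: List[str],
                   n: int = 5, threshold: float = 0.5) -> List[str]:
    """List draft sentences whose overlap with any source meets the threshold."""
    sentences = re.split(r"(?<=[.!?])\s+", draft.strip())
    return [s for s in sentences
            if any(containment(s, src, n) >= threshold for src in sources)]
```

A flagged sentence is only a prompt for a closer look: properly quoted and cited passages will also match, and a paraphrase that reorders words will not.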

A second set of pitfalls awaits group leaders or senior researchers mentoring younger scientists. At a certain point in the younger scientist’s appointment, senior researchers will have formed a reasonable understanding of the junior colleague’s writing style, strengths and weaknesses, but the first year often comprises a grey area in which undetected improprieties may arise, especially among ambitious postdoctoral fellows who may actually research and compose manuscripts prodigiously in this modest amount of time. The onus is clearly upon the group leader to communicate a need for plagiarism-free manuscripts and to at least cursorily scrutinise each document for novelty as it arises. One may also improve the prospects for successful plagiarism avoidance by instituting self-policing within the group, delegating a portion of the responsibility for rigour and probity by requiring that any document produced for eventual publication be checked carefully by one or more non-authoring group members.

Discussing such grave measures as these provides a vivid illustration of one key form of communal harm that plagiarism has helped to inflict on the research community: mistrust. On one hand, one may argue that in terms of the scope of modern scientific misconduct, text misappropriation is a minor offence that does little to damage research integrity. However, irrespective of one’s opinion on that point, the practice is dishonest, and dishonesty breeds suspicion, which in turn engenders a host of other possible repercussions that threaten to render our esteemed discipline a quagmire that can swallow the disreputable and well-intentioned alike. For this reason as much as any other, it has become essential to restore probity and discipline to the scholarship that holds vital keys to the future well-being of our society.

Self-plagiarism

In the earlier component of this discussion1, we identified self-plagiarism (aka, text recycling or publication duplication) as an unethical practice that is sometimes considered when speaking about the sphere of behaviours collectively known as scientific misconduct. As such, we should now consider in this paper how such a practice could be detected and prevented.

Detection of self-plagiarism, of course, relies on the same mechanisms used for identifying plagiarism, which thus need not be reiterated. Prevention of the practice, however, is somewhat distinct. Unlike plagiarism, which is frequently prosecuted in official channels with vigour, there is no such guarantee for self-plagiarism. In the United States, ORI does not pursue administrative action against self-plagiarism since it does not recognise it as a form of conduct that rises to its own institutional definition of scientific misconduct:

ORI often receives allegations of plagiarism that involve efforts by scientists to publish the same data in more than one journal article. Assuming that the duplicated figures represent the same experiment and are labelled the same in both cases (if not, possible falsification of data makes the allegation significantly more serious), this so-called ‘self-plagiarism’ does not meet the PHS research misconduct standard. However, once again, ORI notes that this behaviour violates the rules of most journals and is considered inappropriate by most institutions. In these cases, ORI will notify the institution(s) from which the duplicate publications/grants originated, being careful to note that ORI has no direct interest in the matter13.

Prevention of self-plagiarism is thus left to less official channels. The most impactful arena for this to occur is within the publishing industry. It is within the copyright prerogative of scientific journals to unilaterally retract papers that they feel infringe upon the copyright of other publications. A simpler response, if agreed upon by all affected publishing parties, is to append, post-facto, a statement to later publications that specifically cites a prior publication as comprising a key portion of the originating material.

The most practical level of remediation for the problem of self-plagiarism, however, is at the individual level with strategic re-education of scientific authors. Indeed, the vast majority of self-plagiarism instances might be readily prevented by judicious and relatively painless diligence on the part of the authors. Specifically, authors will frequently find that publishers are readily agreeable to permitting material from their manuscript to be reprinted in subsequent publications, as long as official permission has been sought and a clear citation is provided in the borrowing paper. This citation, of course, is the second major factor that not only absolves authors of wrongdoing, but also provides added exposure to their own work and to the publisher of the originating paper. It is difficult to understand why such a simple and attractive solution would not be pursued in the first place. Perhaps this truly exposes what self-plagiarisers are most guilty of: it is not theft or self-aggrandisement, but rather mere abject stupidity.

Final thoughts

True scientific advancements are built equally on the knowledge of what works and what does not work. Are we all to blame for scientific misconduct? Is it our bias toward recognising only the successes that tempts too many researchers to disguise informationally useful failures instead as false and deleterious successes? If we are truly to end scientific misconduct, perhaps it is time to end the exclusive glorification of supposed breakthroughs and begin offering comparable recognition to systematic, plodding studies that we can trust. We should open our journals to recognition of high-quality negative results that effectively eliminate dead-end concepts from further consideration. The philosophy of science prizes objective repudiation of hypotheses. Should we not also commit to this ideal?

In addition to establishing tangible carrots in the form of recognising and valuing honest, non-spectacular science, one must support those sticks that are emerging to dissuade dishonest, tantalising propagation of falsehoods. Retraction Watch, the Reproducibility Initiative and the scrutiny of a fully functional HHS/ORI would collectively form a powerful motivation for scientific diligence that should send tremors down the spines of any investigators who have deliberately ventured into malicious malfeasance in the publication of irreproducible research.

It is good to be reminded of the quotes from Mark Twain and Louis Pasteur, which were true then, are certainly true now and surely will be for eternity:

“It ain’t what you don’t know that gets you into trouble. It’s what you know for sure that just ain’t so.” Mark Twain

“Science knows no country, because knowledge belongs to humanity, and is the torch which illuminates the world.” Louis Pasteur

Acknowledgements

We thank our many colleagues who have influenced us in innumerable ways over the years; we have been the beneficiaries of their collective wisdom. We are particularly indebted to David Vaux (Walter and Eliza Hall Institute of Medical Research), Deborah Collyar (Patient Advocates in Research) and Elizabeth Iorns (Science Exchange) for shaping our ideas, but the viewpoints expressed herein are our own.

Dr Gerald H. Lushington, an avid collaborator, focuses primarily on applying simulations, visualisation and data analysis techniques to help extract physiological insight from structural biology data, and relate physical attributes of small bioactive molecules (drugs, metabolites, toxins) toward physiological effects. Most of his 150+ publications have involved work with experimental molecular and biomedical scientists, covering diverse pharmaceutical and biotechnology applications. His technical expertise includes QSAR, quantum and classical simulations, statistical modelling and machine learning. Key interests include applying simulations and artificial intelligence to extract. After productive academic service, Lushington’s consultancy practice supports R&D and commercialisation efforts for clients in academia, government and the pharmaceutical and biotechnology industries. Dr Lushington serves as Informatics Section Editor in the journal Combinatorial Chemistry & High Throughput Screening, Bioinformatics Editor for WebMedCentral and is on editorial boards for Current Bioactive Compounds, Current Enzymology and the Journal of Clinical Bioinformatics.

Rathnam Chaguturu is the Founder & CEO of iDDPartners, a non-profit think-tank focused on pharmaceutical innovation. He has more than 35 years of experience in academia and industry, managing new lead discovery projects and forging collaborative partnerships with academia, disease foundations, non-profits and government agencies. He is the Founding President of the International Chemical Biology Society, a Founding Member of the Society for Biomolecular Sciences and Editor-in-Chief of the journal Combinatorial Chemistry and High Throughput Screening. He serves on several editorial and scientific advisory boards, has been the recipient of several awards and is a sought-after speaker at major national and international conferences, passionately discussing the threat of scientific misconduct in biomedical sciences and advocating the virtues of collaborative partnerships in addressing the pharmaceutical innovation crisis. ‘Collaborative Innovation in Drug Discovery: Strategies for Public and Private Partnerships’, edited by Rathnam, has just been published by Wiley.