Why do researchers commit misconduct? A new preprint offers some clues

“Why Do Scientists Fabricate And Falsify Data?” That’s the start of the title of a new preprint posted on bioRxiv this week by researchers whose names Retraction Watch readers will likely find familiar. Daniele Fanelli, Rodrigo Costas, Ferric Fang (a member of the board of directors of our parent non-profit organization), Arturo Casadevall, and Elisabeth Bik have all studied misconduct, retractions, and bias. In the new preprint, they used a set of papers from PLOS ONE shown in earlier research to have included manipulated images to test what factors were linked to such misconduct. The results confirmed some earlier work, but also provided some evidence contradicting previous findings. We spoke to Fanelli by email.

Retraction Watch (RW): This paper builds on a previous study by three of your co-authors, on the rate of inappropriate image manipulation in the literature. Can you explain how it took advantage of those findings, and why that was an important data set?

Daniele Fanelli (DF): The data set in question is unique in offering a virtually unbiased proxy of the rate of scientific misconduct. Most data that we have about misconduct comes either from anonymous surveys or from retracted publications. Both of these sources have important limitations. Surveys are by definition reports of what people think or admit to have done, and usually come from a self-selected group of voluntary respondents. Retractions result from complex sociological processes and therefore their occurrence is determined by multiple uncontrollable factors, such as the policies of retracting journals, the policies of the country in which authors are working, the level of scrutiny that a journal or a field is subject to, the willingness of research institutions to cooperate in investigations, etc.

In a previous study co-authored with Rodrigo Costas, we had tested several hypotheses on the causes of scientific misconduct on papers that were retracted and corrected. We had used a matched-control analysis, which is very powerful and accurate but cannot avoid the retraction-specific biases mentioned above. The data set compiled by Elisabeth Bik and colleagues, however, was free from any such sociological confounding factors. As soon as I saw the paper published on bioRxiv, and realized that its lead author was based at Stanford, like me, I got in touch with her and we quickly agreed to join forces to produce a study of misconduct of unprecedented power and accuracy.

RW: A recent paper you co-authored found that “pressure to publish” (aka “publish or perish”) was not a cause of bias in the literature. This study found partial support for the idea that scientists commit misconduct because of “pressure to publish.” Can you put these findings in context?

DF: Results obtained in this study largely resemble what we found on retractions and corrections (the latter interpreted as a proxy of research integrity) as well as bias in meta-analysis. In all these cases, we observed that researchers who work in countries in which publications are incentivized with cash bonuses are at greater risk of misconduct or bias. We also consistently observe that the number of years a researcher has been active is inversely related to this risk — suggesting that early-career researchers are more likely to engage in misconduct or questionable research practices (QRPs). A remarkable finding of this new study is that, in secondary analyses in which we focus on first authors working in Western countries, we seem to observe a positive and marginally significant association between first authors’ productivity and risk of data fabrication. This could be a surprising suggestion that, in some contexts, our naive predictions about pressures to publish might indeed be correct. However, these results are exploratory and need to be confirmed.

Depending on how you look at them, these findings could be interpreted as being partially consistent with the notion that certain policies and certain career conditions that impose greater productivity expectations on researchers might damage the quality of research. However, in all three studies mentioned we consistently observed that countries with other publication incentive systems are not at greater risk and in fact may run a significantly lower risk of misconduct and bias. Similarly, we generally found a null or even a negative correlation at the level of individual authors. Proxies of bias or misconduct are null or negatively correlated with measures of an author’s productivity and impact, such as the number of papers published per year, the average impact factor, or the number of citations. In other words, we repeatedly had to conclude that researchers who publish many papers in high-impact journals are not at greater risk of misconduct and may in fact manifest signs of higher research integrity: they may be less likely to have papers retracted, more likely to correct them, less likely to report exaggerated findings, and less likely, this latest study suggests, to actually fabricate data.

I am oversimplifying, of course. There are still many confounding factors to take into account and the reasons underlying these patterns are difficult to disentangle. However, the results are clearly not consistent with a certain simplistic narrative that postulates that pressures to publish are straightforwardly distorting science. The picture is clearly more complex than that.

Retraction Watch readers who are skeptical might appreciate the fact that this study was pre-registered, which means that all primary analyses presented were conducted exactly as planned, with no second-guessing allowed. These are genuine tests. We could have found completely different results, and we would have interpreted them as evidence that pressures to publish do indeed distort science as it is often argued. This is actually what one of my earliest studies suggested. However, that study was very coarse-grained, and in multiple later studies I have obtained very different results. Such is the power of scientific research: it forces us to update our opinions.

RW: A previous paper by two of your co-authors found that men were far more likely than women to be found guilty of misconduct by the U.S. Office of Research Integrity. However, this paper found no such increased likelihood of men committing misconduct, and even hinted that women may be more likely than men to manipulate images inappropriately. What does that all mean, if anything, about the role of gender in misconduct?

DF: The study you refer to showed that males were over-represented amongst researchers who were found guilty of scientific misconduct. It was already noted at the time that this finding could have multiple alternative explanations.

One possible interpretation of the ORI data was that males are more likely to engage in misconduct, and we tested this hypothesis in this study as well as in the aforementioned studies on retractions, corrections, and bias. In all these cases, we consistently observed that males and females are not significantly different. Interestingly, according to some secondary analyses, female first authors might actually be more likely to engage in data fabrication, but these are exploratory results that need to be confirmed independently and are probably affected by many confounding factors.

Our main conclusion so far is that gender is not a relevant predictor of scientific integrity.

RW: What are the important limitations of the approach you used, and of the findings?

DF: A specific limitation of the data set is that we cannot know which instances of data duplication in our sample were due to intentional fabrication as opposed to error. This ignorance, however, forced us to make a prediction that was strikingly supported, retrospectively validating the entire analysis. This is one of the results that I find most exciting, so let me explain it in detail. In their previous paper, Elisabeth, Ferric and Arturo had classified image duplications into three categories: Category 1 duplications are simple enough to be most likely caused by random errors, while Category 3 consists of elaborate image manipulations that are more likely to be fraudulent, with Category 2 falling in between. We still do not know which individual papers are fraudulent. However, in our pre-registration we predicted that Category 1 papers would be the least affected by any of the parameters measured and that they might even exhibit effects opposite to those of Category 2 and 3 papers. This prediction turned out to be rather accurate, and retrospectively adds credibility to the entire study design: hypothesised risk factors for scientific misconduct turn out to have significant effects only for categories of papers that are more likely to be fraudulent.

A general limitation of these results is that they concern a very specific form of data fabrication. Our findings may not be generalizable to other fields or forms of misconduct. However, as I mentioned above, our main findings are in remarkable agreement with those obtained on completely different phenomena: retractions, corrections and the likelihood to report over-estimated effects.

RW: Understanding — as you point out in the paper — that correlation is not causation, given all of the findings, what would you say is the answer to the question posed by your paper’s title? And how can it help guide policy?

DF: Our results corroborate previous findings that underscore the importance of good mentorship for early-career researchers and the benefits of working in conditions that favour mutual criticism. However, our strongest and most important finding is that countries differ widely in their risk of producing unreliable science. Authors from China, India, and several developing countries contributed a disproportionate number of problematic papers. Irrespective of whether these problems were caused by error or misconduct, there is clearly important work to do to improve research standards in these countries.

The title “Why Do Scientists Fabricate And Falsify Data?” uses too liberal a form of the term “scientists”. A true scientist would not fabricate or falsify data. A mere researcher might. A true scientist would not over-interpret findings. A mere researcher might. A true scientist would not hide data that falsifies his/her story. A mere researcher might. Each of these features of good scientific practice takes more time and energy than mere research, giving advantage in terms of superficial measures of ‘productivity’ to those not burdened by such discipline. The least we can do is to not blur the distinction between scientists and researchers.

Daniele Fanelli and his colleagues have made significant contributions, based on empirical data, to our understanding of research misconduct. This paper is another example. It enhances our knowledge of the contribution of the research setting.

Yes, external factors do play a role. However, all investigators, during a career in science, practice in the same environment. What then determines who is vulnerable at any given stage and what remedies are required?

For example, Fanelli rightfully identifies the important role of mentors in reducing misconduct among trainees. Therefore, the funding of training grants should be dependent on the presence of a good mentoring program.

Senior faculty misconduct requires different remedies. I believe that effective protection of whistleblowers, which heightens the fear of detection, together with the institution of truly severe penalties, would be an effective deterrent for this group.

The bottom line: Research misconduct is the product of multiple personal and environmental factors. They all must be addressed.

A little rhyme I wrote on the occasion of the just-in-time detection of a large-scale bit of fraud by reverse engineering (using the new app “Darseecalc 1.0”) addressed the issue of cultural milieu thusly:

John Darsee is the toadstool that
grew upon the dunghill that
Others (here unnamed) had shat

Misconduct is driven by sociopathy and by desperation due to lack of funds. Successful PIs then feel they must continue the deception to maintain status. This is so obvious I don’t know why we need studies to investigate the issue.