Associations Between bioRxiv Preprint Noncitation Attention and Publication in the Biomedical Literature: A Cross-sectional and Cohort Study

Stylianos Serghiou1, John P. A. Ioannidis1-4

Objective To describe associations between bioRxiv preprint traffic, Altmetric scores, and eventual publication and to compare the attention that former preprints receive once published in the canonical literature with the attention given to published articles that were never posted on bioRxiv.

Design We downloaded all preprints available on bioRxiv through January 17, 2017; all metrics available for each preprint; all data held for each preprint by Altmetric; and all data held by Altmetric and PubMed for the published versions of preprints, which bioRxiv identifies. We randomly chose 211 published articles that had been bioRxiv preprints, randomly identified 5 journal- and time-matched control articles that had not been preprinted on bioRxiv, and compared Altmetric data. We compared means using paired t tests and Wilcoxon signed rank tests, estimated associations between the preprint covariable for the field of study and canonical publication using a multivariable Cox proportional hazards model, and compared matched data using a mixed-effects model with random intercept.
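The paired comparison named in the Design (the same article scored as a preprint and again after publication) can be sketched as follows. This is a minimal illustration in standard-library Python, not the authors' analysis code: the function names and the five score pairs are hypothetical, and only the test statistics are computed (no P values).

```python
import math
import statistics


def paired_t_statistic(before, after):
    """Return (t, degrees of freedom) for a paired t test on after - before."""
    diffs = [a - b for b, a in zip(before, after)]
    n = len(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(n)
    return statistics.mean(diffs) / standard_error, n - 1


def wilcoxon_signed_rank(before, after):
    """Return (W_plus, W_minus), the rank sums of positive and negative differences.

    Zero differences are dropped; tied absolute differences share their average rank.
    """
    diffs = sorted((a - b for b, a in zip(before, after) if a != b), key=abs)
    ranks = [0.0] * len(diffs)
    i = 0
    while i < len(diffs):
        j = i
        # Extend j over the run of equal absolute differences starting at i.
        while j + 1 < len(diffs) and abs(diffs[j + 1]) == abs(diffs[i]):
            j += 1
        average_rank = (i + j) / 2 + 1  # ranks are 1-based
        for k in range(i, j + 1):
            ranks[k] = average_rank
        i = j + 1
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_minus = sum(r for d, r in zip(diffs, ranks) if d < 0)
    return w_plus, w_minus


# Hypothetical Altmetric scores for five articles (illustration only).
preprint_scores = [5.0, 12.0, 3.0, 40.0, 8.0]
published_scores = [9.0, 10.0, 15.0, 90.0, 8.0]

t_stat, dof = paired_t_statistic(preprint_scores, published_scores)
w_plus, w_minus = wilcoxon_signed_rank(preprint_scores, published_scores)
print(f"paired t = {t_stat:.3f} on {dof} df; W+ = {w_plus}, W- = {w_minus}")
```

With heavily right-skewed scores, as reported in the Results, the rank-based statistic is less sensitive to a few extreme articles than the t statistic, which is one plausible reason to report both.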

Results Among 7760 preprints, median traffic to abstracts was 943 (range, 6-192,570) and median traffic to PDFs was 331 (range, 16-151,520). Median Altmetric score was 7.3 (range, 0.25-2506) with a heavy right skew. Two thousand thirty-one preprints (36%) reached the canonical literature within a year of being uploaded and had a higher mean Altmetric score (19.5 vs 11.4, P = 3.8 × 10^-8) than preprints that had not reached the canonical literature within a year. After adjusting for the field of study, the Altmetric score remained statistically significantly but weakly associated with eventual publication (hazard ratio, 1.005; 95% CI, 1.002-1.007). Once a preprint was published, its Altmetric score was on average 14.9 points higher than it had been as a preprint on bioRxiv (pairwise absolute mean difference; P = 10^-5). The biggest contributor to this difference was the number of citations by the mainstream media (pairwise absolute mean [SE] difference, 16.6 [3.8]). Articles that had previously been posted on bioRxiv attracted significantly more attention than controls once they were published (mean difference, 19.2; 95% CI, 5.5-32.8).
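To see why a hazard ratio of 1.005 is described as weak, it can help to rescale it to larger score differences. The sketch below assumes that the reported hazard ratio is per 1-point increase in Altmetric score (the abstract does not state the unit) and relies on the log-linear form of the Cox model, under which the ratio for a difference of d points is the per-point ratio raised to the power d. The function name is hypothetical.

```python
def scale_hazard_ratio(hr_per_unit, delta):
    """Hazard ratio implied for a covariate difference of `delta` units,
    assuming the Cox model's log-linear effect: HR(delta) = HR(1) ** delta."""
    return hr_per_unit ** delta


# Values reported in the Results; per-point units are an assumption.
hazard_ratio, ci_low, ci_high = 1.005, 1.002, 1.007

for delta in (10, 100):
    scaled = [scale_hazard_ratio(x, delta) for x in (hazard_ratio, ci_low, ci_high)]
    print(f"{delta}-point difference: HR {scaled[0]:.3f} "
          f"(95% CI, {scaled[1]:.3f}-{scaled[2]:.3f})")
```

Under these assumptions, a 10-point difference in Altmetric score corresponds to a hazard ratio of roughly 1.05, which is small relative to the observed range of scores.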

Conclusions Many preprints on bioRxiv attract substantial noncitation attention and reach canonical publication. The Altmetric score is not meaningfully associated with eventual publication, but articles that had been posted on bioRxiv tend to receive more attention once published than articles that had never been posted on bioRxiv.

1Department of Health Research and Policy, Stanford University School of Medicine, Stanford, CA, USA; sstelios@stanford.edu; 2Stanford Prevention Research Center, Department of Medicine, Stanford University School of Medicine, Stanford, CA, USA; 3Department of Statistics, Stanford University School of Humanities and Sciences, Stanford, CA, USA; 4Meta-Research Innovation Center at Stanford (METRICS), Stanford University, Stanford, CA, USA

Conflict of Interest Disclosures: Dr Ioannidis is a Peer Review Congress Associate Director but was not involved in the review or decision for this abstract.