Monday, August 13, 2012

Earlier today, some big names in the world of science journalism started a fascinating discussion on Twitter about reporting from the arXiv preprint server for physics papers. Because these papers usually have yet to be peer-reviewed (or may never be peer-reviewed), several questions naturally arose during the discussion.

How can a journalist trust one of these papers? What steps should reporters take to verify the claims in an arXiv paper, and should the process differ from reporting on a peer-reviewed article? How much faith should reporters place in the peer-review process?

As someone who frequently writes about papers from the arXiv, I've grown to appreciate this gold mine of often undiscovered physics research. But it can also be a tomb for dubious papers and speculative musings. Consequently, I decided to do a quick (non-scientific) analysis of arXiv articles to see which categories (e.g. astrophysics or high energy physics) may have more articles that are eventually peer-reviewed. This is certainly not the best test of "trustworthiness" for arXiv papers, but it may help us better answer some of the questions above.

Papers submitted to the arXiv are not required to be reviewed by other scientists before they appear online, but submission generally requires an endorsement, which often comes with having an email address associated with a research institution. Some of the papers include notes from the authors about an article's future publication, and these comments revealed differences among three sub-fields in physics.

For the charts below, I looked at the 25 most recent articles in each of three areas of arXiv physics (this may not represent a random sample, but it's convenient for these purposes). Based on the author-submitted comments, I sorted each article into one of four categories: accepted/published, submitted to a journal, conference talks, or not specified. Here are the results:

Astrophysics

Out of the three groups that I analyzed, astrophysics had the most papers accepted or published in a reputable scientific journal. Submitted articles came in second, while unspecified papers and conference proceedings tied for the bottom spot.

High Energy Physics: Experiment (Think LHC/Fermilab research)

Among high energy physics experimental papers, fewer than half had a specified publication status, which contrasts sharply with the astrophysics papers.

Condensed Matter (Think superconductors or magnetism)

Over three quarters of the condensed matter papers were unspecified, and no conference proceedings were in the latest batch of 25 papers.
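The sorting behind all three tallies above could be automated along these lines. This is a minimal Python sketch, not the method behind the charts: the keyword lists, the function names (`classify_comment`, `tally`) and the sample comments are all assumptions made up for illustration, and a real run would need the actual author comments from each arXiv listing.

```python
from collections import Counter

# Hypothetical keyword lists. The phrases that should count toward each
# category are an assumption; adjust them after reading real comments.
# Order matters: the first category with a matching keyword wins.
CATEGORY_KEYWORDS = [
    ("accepted/published", ("accepted", "published", "to appear in", "in press")),
    ("submitted", ("submitted",)),
    ("conference", ("proceedings", "conference", "talk", "workshop")),
]


def classify_comment(comment):
    """Return a category name for one author-submitted comment string."""
    text = (comment or "").lower()  # a missing comment is treated as empty
    for category, keywords in CATEGORY_KEYWORDS:
        if any(keyword in text for keyword in keywords):
            return category
    return "not specified"


def tally(comments):
    """Count how many comments fall into each category."""
    return Counter(classify_comment(c) for c in comments)


# Invented sample comments, for illustration only (not real arXiv data).
sample_comments = [
    "12 pages, 5 figures, accepted for publication in ApJ",
    "Submitted to MNRAS",
    "Talk presented at a workshop, 4 pages",
    "8 pages, 3 figures",
    None,
]

counts = tally(sample_comments)
for category, n in counts.most_common():
    print(f"{category}: {n} of {len(sample_comments)}")
```

Simple keyword matching like this will misfile some papers (a conference proceeding that was also "accepted" lands in the first category, for example), so the output is best treated as a rough first pass to be checked by eye.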

What Does This Mean?

I wouldn't read too much into these numbers, but they reveal a little bit about the variety of papers submitted to the arXiv. Astrophysics papers tend to be ready for publication (at least in format), whereas high energy physics and condensed matter articles take on a different form. Or maybe their authors are too lazy to tag their papers in the comments.

Nonetheless, it seems that physicists from different fields take different approaches to using the arXiv. So how should journalists or members of the public interpret these open access results? Most importantly, know what you're reading.

Scientific results can be presented in a variety of formats other than journal papers, and they can be revealed at different stages. ArXiv articles are more likely to be preliminary, and they may have less detail (or maybe more) than a typical journal paper.

Also, my preliminary investigation suggests that there may be a big difference between the types of papers from different fields within physics. Astrophysics papers tend to follow the more standard format that journalists are accustomed to, but that may not be the case for other areas.

For instance, some papers on the arXiv are just small snippets of research worth sharing but perhaps not worth reporting more widely. Longer review papers can be deceiving as well. Although these papers may appear to the untrained eye to be unveiling new research, the authors are usually giving an overview of the field to physicists in other specialties. Both of these types of papers may account for many of the unspecified papers.

So should science journalists change their approach to covering arXiv papers before they're "officially" published? Probably not, as long as they follow good journalistic practices such as contacting independent experts and relaying the source of these results.

If a paper sounds too good to be true, then it probably is, whether or not it's on the arXiv. Applying the crackpot index to a paper always helps, and I think journalists covering the arXiv can develop good instincts over time.

Maybe someone with some more patience and coding practice could do a thorough, scientific analysis of arXiv papers to see if and where they are eventually published. For now, Physics Buzz readers, check the source whenever possible.