Saturday, 1 October 2016

On the incomprehensibility of much neurogenetics research

Together with some colleagues, I am carrying out an analysis of methodological issues such as statistical power in papers in top neuroscience journals. Our focus is on papers that compare brain and/or behaviour measures in people who vary on common genetic variants.

I'm learning a lot by being forced to read research outside my area, but I'm struck by how difficult many of these papers are to follow. I'm neither a statistician nor a geneticist, but I have a nodding acquaintance with both disciplines, as well as with neuroscience, yet in many cases I find myself struggling to make sense of what researchers did and what they found. Some papers have taken hours of reading and re-reading just to extract the key information we are seeking for our analysis, i.e. the largest association that was reported.

This is worrying for the field, because the number of people competent to review such papers will be extremely small. Good editors will, of course, try to cover all bases by finding reviewers with complementary skill sets, but this can be hard, and people will be understandably reluctant to review a highly complex paper that contains a lot of material beyond their expertise. I remember a top geneticist on Twitter a while ago lamenting that when reviewing papers they often had to just take the statistics on trust, because they had gone beyond the comprehension of all but a small set of people. The same is true, I suspect, for neuroscience. Put the two disciplines together and you have a big problem.

I'm not sure what the solution is. Making raw data available may help, in that it allows people to check analyses using more familiar methods, but that is very time-consuming and only for the most dedicated reviewer.

Do others agree we have a problem, or is it inevitable that as things get more complex the number of people who can understand scientific papers will contract to a very small set?

12 comments:

Indeed: much can go wrong between teams working on topics beyond their own skill level, reviewers who are similarly unprepared, and some of the most complex analytic pipelines in science. Reading some papers, it feels as if the authors skimp on methods and fail to be explicit about which choices were made simply because they don't know. Others use this judiciously to obscure inadequate choices and faulty deductions. Fixing it all demands much of reviewers already pressed for time, but...

But I think the good will is there to do this: genetics is a very open field, and sharing of code and data is deep in its own 'DNA'/research culture, with organisations like the BGA doing much to foster this, the NIH and others running workshops to teach these complex but essential tool chains, and the Broad Institute and others releasing high-quality software. Genetics journals and reviewers have been good, too, at rapidly publishing critiques (viz. the open debate about GCTA last year, which exposed flaws in the critiques). The field seems above average at accepting errors too, rather than flogging a dead horse.

It's worth noting too that there are many, many papers which are crystal clear, open, and well conducted. Often very well written, these papers serve as excellent templates. Often, too, these authors are the ones who go to the trouble of writing explainer papers, publishing open-source programs, and writing detailed methods papers that review the current state of theory and lay out and test its assumptions. Naomi Wray and Peter Visscher spring to mind here as good role models.

Yes, it can be hellish hard. One of the things that has helped part of the problem in clinical research has been the advent of reporting guidelines, starting with CONSORT for clinical trials - especially the flow diagram of participants: http://www.jameslindlibrary.org/articles/a-history-of-the-evolution-of-guidelines-for-reporting-medical-research-the-long-road-to-the-equator-network/

Reporting guidelines are spreading: http://www.equator-network.org/reporting-guidelines/ - and they can make some difference at least: https://www.ncbi.nlm.nih.gov/pubmed/22730543

Thanks. One of the things we are hoping to do is to come up with recommendations for future studies, so this is a very useful suggestion. There is a huge gulf between the best and worst studies in terms of ease of finding basic information (and that's before you even grapple with the stats).

While trying to comprehend (starting from absolute scratch in all four circles of your diagram) a certain genetics/psychology article in PNAS, it became clear to me, well before I understood the article, that the reviewers could not possibly have done so. Of course, such situations are not unique to cross-disciplinary research, but I would bet money that they are a lot more common there.

As an occasional editor, reviewer, and reader of such and related articles, I feel exactly the same! It is reassuring that others feel as I do; thank you.

When I happen to stumble upon such papers, I always feel like the techniques have advanced to a degree that they have left me standing out in the cold. It feels like I really need to read up on new developments and catch up with the latest statistics et al. Given it's not my main field either, I've always just assumed that those within the field had kept up and it was us sideliners who were trailing. I wouldn't know if that pool of experts was shrinking or not. In fact, given the increasingly commonplace occurrence of GWAS, I always just assumed the field was growing, rather than shrinking.

Thanks Nick and Bjorn. I think a big problem is that new methods keep evolving, and I suspect one reason is that many studies don't find very much. I'm sure many have had my experience of doing a study (in this case ERP) and finding very little of interest with traditional analysis. This is very hard to accept, and so you go on a quest to find a better method that may uncover some subtle features of the data that were not evident in the standard analysis. This can be sensible. For instance, I'm pretty sure that with ERP there are more things to be found than are evident with classic analysis of peaks: looking in the frequency domain, or using methods such as ICA to find independent sources, makes sense in terms of what we know about how the signal is generated. But if you then find nothing, the temptation is to go on and on with ever more complicated analysis.

I know less about fMRI, but it's clear that while we have paradigms that reliably generate activation 'blobs' in specific areas, there's more to be found out by looking at connectivity, for instance. Again, a sensible move at many levels, but there's an explosion of methods. The field is perhaps just too young, and I don't think it helps to have a plethora of different approaches, with everyone doing their own thing, which very few people can understand. Add to the mix the concern about limited power of many studies and opportunities for p-hacking, and we have a bit of a mess.
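The worry about an explosion of methods plus p-hacking can be made concrete with a toy simulation (entirely hypothetical numbers, not from any study in the post): if each of ten candidate analyses amounts to an independent test on pure noise, the chance that at least one of them "works" at p < .05 is far higher than 5%.

```python
import random

random.seed(1)
n_experiments = 10_000
n_methods = 10  # hypothetical stand-ins: peaks, frequency bands, ICA components...

hits = 0
for _ in range(n_experiments):
    # under the null hypothesis, each test's p-value is uniform on [0, 1]
    p_values = [random.random() for _ in range(n_methods)]
    if min(p_values) < 0.05:  # did ANY method look "significant"?
        hits += 1

rate = hits / n_experiments
# analytically 1 - 0.95**10, i.e. roughly 0.40
print(f"family-wise false positive rate: {rate:.2f}")
```

In practice the candidate analyses are correlated rather than independent, so the inflation is usually smaller than this worst case, but the direction of the problem is the same.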

Because I will only review (pre-pub) a paper I think I can review in toto and do a good job, I tend to review a very small percentage of those sent my way. But I put post-pub comments onto, for example, PubPeer much more frequently because partial peer review is feasible. Nobody as yet has approached me to review just the acquisition part of an fMRI study, which is a shame.

That's a great blogpost - and confirms that it's not just me that thinks we have a problem. The growing complexity of methods, coupled with interdisciplinary research, is another reason why the conventional publishing model is becoming dysfunctional.

The endless diversification of analysis is also a "garden of forking paths": researcher degrees of freedom that can eke "results" out of noise. This should be highly suspect unless pre-registered, or cross-validated on held-out data that you don't peek at until you have decided on your analysis.
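The hold-out discipline mentioned above can be sketched in a few lines (a toy example with invented "analysis methods", not anyone's actual pipeline): explore as many analyses as you like on one half of the data, but apply the winner exactly once to the half you never peeked at.

```python
import random
import statistics

random.seed(42)

def make_noise(n):
    """Two groups with no true difference between them."""
    return ([random.gauss(0, 1) for _ in range(n)],
            [random.gauss(0, 1) for _ in range(n)])

# Candidate "analysis methods" (hypothetical stand-ins for peak,
# frequency-band, ICA-based measures, etc.).
analyses = {
    "mean_diff": lambda a, b: statistics.mean(a) - statistics.mean(b),
    "median_diff": lambda a, b: statistics.median(a) - statistics.median(b),
    "var_diff": lambda a, b: statistics.variance(a) - statistics.variance(b),
    "max_diff": lambda a, b: max(a) - max(b),
}

explore = make_noise(100)   # exploration half: look all you like
holdout = make_noise(100)   # held-out half: touched exactly once

# Pick whichever method looks most impressive on the exploration data...
best = max(analyses, key=lambda k: abs(analyses[k](*explore)))
selected_effect = abs(analyses[best](*explore))

# ...then confirm that one method, once, on the data you never peeked at.
confirmed_effect = abs(analyses[best](*holdout))
print(best, round(selected_effect, 2), round(confirmed_effect, 2))
```

Because the winner was selected for looking large on the exploration half, its size there is biased upwards; the held-out estimate is the honest one, which is the whole point of not peeking.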