A bridge over troubled waters for fMRI?

Yesterday’s ‘troubles with fMRI’ article has caused lots of debate, so I thought I’d post the original answers, given to me by neuroimagers Russ Poldrack and Tal Yarkoni, from which I quoted.

Poldrack and Yarkoni have been at the forefront of finding, fixing and fine-tuning fMRI and its difficulties. I asked them about current challenges but could only include small quotes in The Observer article. Their full answers, included below with their permission, are important and revealing, so well worth checking out.

First, however, a quick note about the reactions the piece has received from the neuroimaging community. They tend to be split into “well said” and “why are you saying fMRI is flawed?”

Because of this, it’s worth saying that I don’t think fMRI or other imaging methods are flawed in themselves. However, it is true that we have discovered that a significant proportion of past research has been based on potentially misleading methods.

Although it is true that these methods have largely been abandoned there still remain some important and ongoing uncertainties around how we should interpret neuroimaging data.

Because of these issues, and genuinely because brain scans are often enchantingly beautiful, I think neuroimaging results are currently given too much weight as we try to understand the brain. That isn’t to say, though, that we should undervalue neuroimaging as a science.

Although our confidence in past studies has been shaken, neuroimaging will clearly come out better and stronger as a result of current debates about problems with analysis and interpretation.

At the moment, the science is at a fascinating point of transition, so it’s a great time to be interested in cognitive neuroscience and I think this is made crystal clear from Russ and Tal’s answers below.

Russ Poldrack

What’s the most pressing problem fMRI research needs to address at the moment?

I think the biggest fundamental problem is the great flexibility of analytic methods that one can bring to bear on any particular dataset; the ironic thing is that this is also one of fMRI’s greatest strengths, i.e., that it allows us to ask so many different questions in many different ways. The problem comes about when researchers search across many different analysis approaches for a result, without realizing that this increases the ultimate likelihood of finding a false positive. I think that another problem that interacts with this is the prevalence of relatively underpowered studies, which are often analyzed using methods that are not stringent enough to control the level of false positives. The flexibility that I mentioned above also includes methods that are known by experts to be invalid, but unfortunately these still get into top journals, which only helps perpetuate them.
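The false positive inflation Poldrack describes is easy to make concrete with a quick simulation. In the sketch below, the number of analysis pipelines is my own illustrative assumption, and the pipelines are treated as statistically independent looks at null data, which is a worst-case simplification (real analysis variants are correlated, which softens the effect):

```python
import random

random.seed(0)

N_EXPERIMENTS = 2000  # simulated experiments where there is NO real effect
N_PIPELINES = 10      # hypothetical number of analysis variants tried (assumption)
ALPHA = 0.05

false_positives = 0
for _ in range(N_EXPERIMENTS):
    # Under the null hypothesis, each pipeline's p-value is uniform on [0, 1].
    p_values = [random.random() for _ in range(N_PIPELINES)]
    # Report "a result" if ANY of the pipelines reaches p < .05
    if min(p_values) < ALPHA:
        false_positives += 1

rate = false_positives / N_EXPERIMENTS
print(f"nominal alpha: {ALPHA}, observed family-wise error rate: {rate:.2f}")
# Analytically, ten independent looks give 1 - 0.95**10, i.e. about 0.40
```

Ten uncorrelated looks turn a nominal 5% false positive rate into roughly 40%, which is why searching across analysis approaches without correction is so dangerous.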

Someone online asked the question “How Much of the Neuroimaging Literature Should We Discard?” How do you think we should consider past fMRI studies that used problematic methodology?

I think that replication is the ultimate answer. For example, the methods that we used in our 1999 NeuroImage paper that examined semantic versus phonological processing seem pretty abominable by today’s standards, but the general finding of that paper has been replicated many times since then. There are many other findings from the early days that have stood the test of time, while others have failed to replicate. So I would say that if a published study used problematic methods, then one really wants to see some kind of replication before buying the result.

Tal Yarkoni

What’s the most pressing problem fMRI research needs to address at the moment?

My own feeling (which I’m sure many people would disagree with) is that the biggest problem isn’t methodological laxness so much as skewed incentives. As in most areas of science, researchers have a big incentive to come up with exciting new findings that make a splash. What’s particularly problematic about fMRI research–as opposed to, say, cognitive psychology–is the amount of flexibility researchers have when performing their analyses. There simply isn’t any single standard way of analyzing fMRI data (and it’s not clear there should be); as a result, it’s virtually impossible to assess the plausibility of many if not most fMRI findings simply because you have no idea how many things the researchers tried before they got something to work.

The other very serious and closely related problem is what I’ve talked about in my critique of Friston’s paper [on methods in fMRI analysis] as well as other papers (e.g., I wrote a commentary on the Vul et al “voodoo correlations” paper to the same effect): in the real world, most effects are weak and diffuse. In other words, we expect complicated psychological states or processes–e.g., decoding speech, experiencing love, or maintaining multiple pieces of information in mind–to depend on neural circuits widely distributed throughout the brain, most of which are probably going to play a relatively minor role. The problem is that when we conduct fMRI studies with small samples at very stringent statistical thresholds, we’re strongly biased to detect only a small fraction of the ‘true’ effects, and because of the bias, the effects we do detect will seem much stronger than they actually are in the real world. The result is that fMRI studies will paradoxically tend to produce *less* interesting results as the sample size gets bigger. Which means your odds of getting a paper into a journal like Science or Nature are, in many cases, much higher if you only collect data from 20 subjects than if you collect data from 200.

The net result is that we have hundreds of very small studies in the literature that report very exciting results but are unlikely to ever be directly replicated, because researchers don’t have much of an incentive to collect the large samples needed to get a really good picture of what’s going on.
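The effect-size inflation Yarkoni describes (often called the “winner’s curse”) can be demonstrated with a short simulation. The true effect size, sample size, and threshold below are all hypothetical choices of mine, but the qualitative result holds for any underpowered design:

```python
import random
import statistics

random.seed(0)

TRUE_EFFECT = 0.2   # a weak, diffuse effect in standard-deviation units (assumption)
N_SUBJECTS = 20     # a typical small fMRI sample (assumption)
N_STUDIES = 5000
CRIT_Z = 3.29       # stringent threshold, roughly two-tailed p < .001

significant_estimates = []
for _ in range(N_STUDIES):
    sample = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N_SUBJECTS)]
    mean = statistics.fmean(sample)
    se = statistics.stdev(sample) / N_SUBJECTS ** 0.5
    # Normal approximation to a one-sample t-test
    if mean / se > CRIT_Z:
        significant_estimates.append(mean)

power = len(significant_estimates) / N_STUDIES
inflated = statistics.fmean(significant_estimates)
print(f"power: {power:.3f}")  # only a tiny fraction of studies detect the effect
print(f"true effect: {TRUE_EFFECT}, mean estimate among significant studies: {inflated:.2f}")
```

Only the studies that got lucky cross the stringent threshold, so the estimates that survive to publication come out several times larger than the true effect.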

Someone online asked the question “How Much of the Neuroimaging Literature Should We Discard?” How do you think we should consider past fMRI studies that used problematic methodology?

This is a very difficult question to answer in a paragraph or two. I guess my most general feeling is that our default attitude to any new and interesting fMRI finding should be skepticism–instead of accepting findings at face value until we discover a good reason to discount them, we should incline toward disbelief until a finding has been replicated and extended. Personally I’d say I don’t really believe about 95% of what gets published. That’s not to say I think 95% of the literature is flat-out wrong; I think there’s probably a kernel of truth to most findings that get published. But the real problem in my view is a disconnect between what we should really conclude from any given finding and what researchers take license to say in their papers. To take just one example, I think claims of “selective” activation are almost without exception completely baseless (because very few studies really have the statistical power to confidently claim that absence of evidence is evidence of absence).

For example, suppose someone publishes a paper reporting that romantic love selectively activates region X, and that activation in that region explains a very large proportion of the variance in some behavior (this kind of thing happens all the time). My view is that the appropriate response is to say, “well, look, there probably is a real effect in region X, but if you had had a much larger sample, you would realize that the effect in region X is much smaller than you think it is, and moreover, there are literally dozens of other regions that show similarly-sized effects.” The argument is basically that much of the novelty of fMRI findings stems directly from the fact that most studies are grossly underpowered. So really I think the root problem is not that researchers aren’t careful to guard against methodological problems X, Y, and Z when doing their analyses; it’s that our mental model of what most fMRI studies can tell us is fundamentally wrong in most cases. A statistical map of brain activity is *not* in any sense an accurate window into how the brain supports cognition; it’s more like a funhouse mirror that heavily distorts the true image, and to understand the underlying reality, you also have to take into account the distortion introduced by the measurement. The latter part is where I think we have a systemic problem in fMRI research.
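Yarkoni’s point about “selective” activation claims can also be simulated. In this sketch, two hypothetical regions carry identical true effects, yet with a small sample one region frequently crosses the significance threshold while the other does not, inviting a spurious selectivity claim (all the numbers are illustrative assumptions):

```python
import random
import statistics

random.seed(2)

TRUE_EFFECT = 0.5   # identical true effect in BOTH regions (assumption)
N_SUBJECTS = 16     # small sample (assumption)
N_STUDIES = 4000
CRIT_T = 2.131      # two-tailed p < .05 critical t value, 15 degrees of freedom

selective_claims = 0
for _ in range(N_STUDIES):
    significant = []
    for _region in range(2):
        sample = [random.gauss(TRUE_EFFECT, 1.0) for _ in range(N_SUBJECTS)]
        t = statistics.fmean(sample) / (statistics.stdev(sample) / N_SUBJECTS ** 0.5)
        significant.append(t > CRIT_T)
    # One region crosses the threshold while the other does not: an apparent
    # "selective" activation, even though the true effects are identical.
    if significant[0] != significant[1]:
        selective_claims += 1

frac = selective_claims / N_STUDIES
print(f"spurious 'selectivity' in {frac:.0%} of studies with identical true effects")
```

The underlying statistical point is that the difference between a significant and a non-significant result is not itself significant; claiming selectivity requires directly testing the difference between regions, with enough power to detect it.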


8 thoughts on “A bridge over troubled waters for fMRI?”

This is a good follow-up to the Observer article, Vaughan. I think the recent criticisms of fMRI should actually be taken as a sign that the methods are becoming more mature, as some constructive self-criticism is necessary to make them more rigorous.

The power problem (i.e. that there are not enough participants to be sure that we are not getting false positive or false negative results) is not unique to fMRI. As we psychologists are aware, the vast majority of psychological studies have very small sample sizes. The general public may not be aware of this, however. In this respect, some of the best psychology studies I’ve seen have been carried out by the dating website OK Cupid, with sample sizes of 100,000+.

The other problem with fMRI as I see it is the association-cause problem. Yes, BOLD signals may be a good proxy for cognition, but just because there is activation in brain area X when we engage in behaviour Y doesn’t mean that brain area X causes cognition Z. For this reason, I tend to trust interpretations of double-dissociation studies with brain injured people much more than imaging methods.

However, the power problem is sometimes even worse in brain injury studies (I am using qualitative data and case studies for part of my PhD for example). Also, one thing that fMRI has going for it is that it is truly experimental, particularly if methods like TMS are used in conjunction with fMRI. In comparison, double-dissociation brain injury studies are by their nature pseudo-experimental, as we have to find brain injured participants who have been unfortunate enough to have suffered a brain injury of the type we are interested in studying. This is another reason why our sample sizes are often quite small.

Thanks so much Vaughan for continuing this important debate, and giving it such prominence in the national press.

Re Tom Michael’s comment about brain injury studies, it’s always good to look at a scientific issue from multiple angles, and brain injury studies are a useful example here – but carry their own problems (damage not limited to area under study, brain adapting to damage in unpredictable ways, unknown extra hidden damage, e.g. in white matter tracts, and so on). Neuroimaging, especially fMRI, is an incredibly useful, powerful method potentially, but can only live up to this potential if various things improve, including better training, more valid application of techniques, more replication, and generally more rigour at all stages.

And there was a great special issue of The Psychologist magazine this month, with many articles discussing the problem of replication in psychology, including one article (by me) specifically describing problems and potential solutions in neuroimaging. You can access this issue here:

Which is where I argue that we ought to pre-register scientific studies so that the original methods and analyses are publicly available, and post hoc “fishing” can be distinguished from hypothesis-driven a priori work.

It’s interesting that both Tal & Russell point to the flexibility of fMRI analyses as the primary problem with the field – this flexibility could be a very good thing (it means the methods are versatile) but it easily becomes a problem when people are able to selectively publish only the significant findings.

Preregistration would help mitigate this – although it’s not a panacea and replication would still be important.

I think a major issue with the “flexibility” of fMRI analyses is the limited number of people who really understand these different methods. As someone who uses electrophysiology to study the MTL, I come across a lot of fMRI results in this region and have no means of vetting them. Just recently I was reading a paper that found something using MVPA (http://onlinelibrary.wiley.com/doi/10.1002/hipo.20960/pdf) and I just have to trust that the reviewers did their due diligence.

This seems to lead to another problem: the closed circle of fMRI researchers vetting each others’ work. I presume not many people outside of fMRI work have used something like MVPA enough to understand it and ensure it’s properly employed. While I don’t mean to imply there’s some kind of fMRI old boys’ club, I can understand Tal Yarkoni’s skepticism since it’s probably counterproductive for an fMRI-using reviewer to tear apart a submission to a high tier journal.
