On another nostalgic note, this blog turns 3 today. To mark the occasion, we'll take a peek back into The Neurocritic's archives, because what's old is new again.

New Voodoo Correlations: Now Taking Nominations!

By now, most neuroimagers and cognitive neuroscientists have heard about the controversial (some would say inflammatory) new paper by Ed Vul and colleagues on Voodoo Correlations in Social Neuroscience (PDF), summarized in this post.1 In the article, Vul et al. claimed that over half of the fMRI studies that were surveyed used faulty statistical techniques to analyze their data:

...using a strategy that computes separate correlations for individual voxels, and reports means of just the subset of voxels exceeding chosen thresholds. We show how this non-independent analysis grossly inflates correlations, while yielding reassuring-looking scattergrams. This analysis technique was used to obtain the vast majority of the implausibly high correlations in our survey sample.
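The inflation Vul et al. describe is easy to reproduce. In the toy simulation below (the subject count, voxel count, and threshold are illustrative, not taken from the paper), every "voxel" is pure noise, yet selecting voxels by their correlation with behavior and then correlating the mean of the survivors with that same behavior yields a strikingly high r:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 16, 10_000

behavior = rng.standard_normal(n_subjects)
# Pure noise "activations": no voxel truly correlates with behavior.
voxels = rng.standard_normal((n_voxels, n_subjects))

# Correlate every voxel with behavior (vectorized Pearson r).
bz = (behavior - behavior.mean()) / behavior.std()
vz = (voxels - voxels.mean(axis=1, keepdims=True)) / voxels.std(axis=1, keepdims=True)
r_all = vz @ bz / n_subjects

# Non-independent step: keep only voxels exceeding the threshold,
# then report the correlation of their mean with the same behavior.
selected = voxels[r_all > 0.6]
roi_mean = selected.mean(axis=0)
r_reported = np.corrcoef(roi_mean, behavior)[0, 1]
print(f"{len(selected)} noise voxels survive; reported r = {r_reported:.2f}")
```

Because the same data both choose the voxels and measure the effect, the averaged noise is guaranteed to look like signal; a scattergram of `roi_mean` against `behavior` would look every bit as reassuring as the ones Vul et al. criticize.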

Needless to say, authors of the criticized papers were not pleased, and some have posted rebuttals (Jabbi et al. in preparation, PDF). Vul and colleagues responded to that rebuttal, but a new invited reply by Lieberman et al. (submitted, PDF) has just popped up. Here are some highlights from the abstract:

...Vul et al. incorrectly claim that whole-brain regression analyses use an invalid and “non-independent” two-step inferential procedure. We explain how whole-brain regressions are a valid single-step method of identifying brain regions that have reliable correlations with individual difference measures. ... Finally, it is troubling that almost 25% of the “non-independent” correlations in the papers reviewed by Vul et al. were omitted from their own meta-analysis without explanation.

1. The methodology is opaque, in particular the method used to identify relevant papers. The authors have criticised a number of the imaging studies on similar grounds.

2. In my opinion there is possibly a selection bias in this paper: only a small number of all possible papers was selected, and because the methodology section is opaque we cannot ascertain the nature of any such bias. The authors criticise other researchers for identifying voxel activity based on its correlation with the behaviour or phenomenological experience in question, i.e., for selection bias.

3. If there is a selection bias, then the authors would have selected those papers which support their argument, thus generating a result similar to the 'non-independent error'. Furthermore, they have produced 'visually appealing' graphs for their data which 'provide reassurance' to the 'viewer that s/he is looking at a result that is solid'.

Vul et al. are fully capable of responding to these objections, and I'm sure we'll see a rebuttal from them shortly. What I would like to do here is to mention 6 previous posts from The Neurocritic's archives (which all happen to cover the field of social neuroscience):

Mental as Anything

Although these posts don't engage in a rigorous deconstruction of analytic techniques à la Vul et al., they do ask some questions about the methods and about how the results are framed (i.e., over-interpreted). But first, let's reiterate that

The critics among us are not trying to trash the entire field of social neuroscience (or neuroimaging in general). Some of us are taking concrete steps to open a dialogue and improve its methodology, while others are trying to rein in runaway interpretations.

The rebuttals to Vul et al. emphasize that the latter's analytic objections are by no means unique to social neuroscience. Vul et al. acknowledged this, albeit not in a prominent way. The criticized authors are also peeved that "the media" (mostly bloggers) have contributed to a "sensationalized" atmosphere before the paper has been published. However, as previously noted in Mind Hacks,

The paper was accepted by a peer-reviewed journal before it was released to the public. The idea that something actually has to appear in print before anyone is allowed to discuss it seems to be a little outdated (in fact, was this ever the case?).

Why do I have a problem with some papers in social neuroscience? Let's take the study by Mitchell et al. (2006), which qualified for its own special category in the Vul et al. critique:

* (study 26 carried out a slightly different, non-independent analysis: instead of explicitly selecting for a correlation between IAT and activation, they split the data into two groups, those with high IAT scores and those with low IAT scores; they then found voxels that showed a main effect between these two groups, and then computed a correlation within those voxels. This procedure is also non-independent, and will inflate correlations.)
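The median-split variant just described is circular in the same way. A toy sketch (group sizes and the selection threshold are hypothetical): split noise subjects into high and low scorers, keep voxels with a large group difference, then correlate their mean with the very scores used for the split.

```python
import numpy as np

rng = np.random.default_rng(2)
n_subjects, n_voxels = 16, 10_000

iat = rng.standard_normal(n_subjects)                  # behavioral scores (noise)
voxels = rng.standard_normal((n_voxels, n_subjects))   # activations (noise)

high = iat > np.median(iat)                            # median split into two groups

# Step 1: select voxels showing a large high-vs-low group difference.
diff = voxels[:, high].mean(axis=1) - voxels[:, ~high].mean(axis=1)
picked = voxels[diff > 1.0]

# Step 2: correlate the mean of those voxels with the same scores.
r = np.corrcoef(picked.mean(axis=0), iat)[0, 1]
print(f"{len(picked)} voxels picked; r = {r:.2f} despite zero true effect")
```

The selected voxels are, by construction, the ones whose noise happens to track the group split, so their mean correlates with the scores even though nothing real is there.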

The participants in that study made judgments about hypothetical people who were either similar or dissimilar to themselves (on liberal vs. conservative sociopolitical views). Two regions of medial prefrontal cortex were identified: ventral mPFC was the "Like Me" area and dorsal mPFC was the "Not Like Me" area. Even if we believe those inferences about mentalizing, how are we to interpret these graphs?

Does the first bar graph (left) mean that liberals are a little less hostile to conservatives than vice versa? Does the other bar graph (right) mean that the “Not Like Me” area in liberals is equally activated by “self” and “conservative other”?? What DOES it all mean?

...completed a "liberal-conservative" IAT [Implicit Association Test]2 that used photos of the hypothetical persons presented for "mentalizing" judgments in the scanning session.

The authors used the IAT to retroactively assign subjects to "like liberal" and "not like liberal" groups. As the graph illustrates, only 3 subjects (out of 15 total) actually had RT effects indicating they might have a closer affinity to the conservative "other" (if you believe the IAT).

Very well then. The researchers should have recruited actual conservative Christians for a valid sample of conservative students.

But the Voodoo Guru Award goes to...

...King-Casas et al. (Science 2008) for The Rupture and Repair of Cooperation in Borderline Personality Disorder!!

This paper examined how well individuals with borderline personality disorder (BPD) trusted others in an economic exchange game (called, conveniently enough, the Trust Game). In this game, one player (the Investor) gives a sum of money to the other player (the Trustee). The investment triples, and the Trustee decides how much to give back to the Investor. Relative to the control group, the BPD group was more likely to make a small repayment after receiving a small investment. This reflected a lack of cooperation (or "coaxing" behavior) designed to induce the Investors to trust their partners.
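For concreteness, here is the arithmetic of a single round (the dollar amounts are made up for illustration; the actual stakes in King-Casas et al. differ):

```python
# One round of the Trust Game, with hypothetical amounts.
endowment = 20
investment = 10                      # the Investor sends $10, keeps $10
tripled = 3 * investment             # $30 arrives at the Trustee
repayment = 15                       # a cooperative Trustee sends half back

investor_total = endowment - investment + repayment  # 20 - 10 + 15 = 25
trustee_total = tripled - repayment                  # 30 - 15 = 15
print(investor_total, trustee_total)                 # both gain relative to no exchange
```

A Trustee who keeps nearly everything after a small investment fails to "coax" the Investor into trusting again, which is the pattern the BPD group showed.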

So why do I feel like I'm goin' to lose my mind?3 As suggested by a colleague, the present paper 1) makes liberal use of reverse inference and 2) reeks of fishing.4 In my view, the real trouble arises when the authors try to explain what bits of the brain might be implicated in the lack of trust shown by players with BPD. It's the insula! [and only the insula]. Why is that problematic? We shall return to that question in a moment.

In the study of Sanfey et al., unfair offers were associated with greater activity in bilateral anterior insula, dorsolateral prefrontal cortex, and anterior cingulate cortex, with the degree of insular activity related to the stinginess of the offer. A similar relationship was observed here in the controls, but not in the BPD patients. Taking a step back for a moment, we see differences between control and BPD participants (for the contrast low vs. high investment) in quite a number of places...

However, the within-group analysis in controls yielded a "small investment" effect only in bilateral anterior insula (12 voxels and 15 voxels, respectively). BPD brains showed no such effect (as explained here), and do not respond correctly to social norm violations.

[Thus], the authors bypassed more general analyses comparing BPD and control brains during the point of investment and the point of repayment. Instead, the major neuroimaging result contrasted the receipt of low investment offers vs. high investment offers, as illustrated below. Control brains showed a nearly perfect linear correlation [r=-.97] between $ offer and activity in the anterior insula (expressed here as a negative correlation, because low $ offers correlated with high insula activity). Such a relationship was not observed in BPD brains...

Neurally, activity in the anterior insula, a region known to respond to norm violations across affective, interoceptive, economic, and social dimensions, strongly differentiated healthy participants from individuals with BPD.

However, many other imaging studies have shown that this exact same region of the insula is activated during tasks that assess speech, language, explicit memory, working memory, reasoning, pain, and listening to emotional music (see this figure).

Voodoo or no voodoo?

What do you think?

Footnotes

1 You can also read a quick overview here and more in-depth commentary here.

2 We'll gloss over the objections of some commenters who think the IAT itself is voodoo.

3 Other than the fact that I am not knowledgeable in behavioral game theory (see Camerer et al., 2003 for that, PDF).

Paul Ekman’s pioneering work on emotions and facial expressions is the basis for a new television series, “Lie to Me,” premiering tonight at 9:00 PM on Fox (check local listings for details). The main character -- Dr. Lightman, the “world’s leading deception expert,” – is described as a human lie detector who cracks criminal cases by reading facial expressions, voice, and movements.

Join Ekman and members of the production group and cast at the APS 21st Annual Convention this May in San Francisco, CA, for a discussion on how psychological science research is used in the show. The panel discussion, “Prime Time Psychology: Science Is the Story in ‘Lie to Me,’” will be moderated by Robert W. Levenson, and the program will include excerpts from the series.

Thursday, January 22, 2009

Did you know that male "sexual sweat" differs from ordinary sweat? Apparently so, according to a new paper in the Journal of Neuroscience (Zhou & Chen, 2008). Curiously, the article did not cite any references for this, nor did it specify the chemical composition of sexual sweat. Nonetheless, the results of an fMRI experiment suggested that the orbitofrontal cortex and the fusiform region in 20 female participants responded differently when smelling the two substances. How was such a study conducted, you might ask?

And here the fun begins...

Sweat collection. From 2 d before the experiment until the end of the experiment, 20 heterosexual male donors in a larger study refrained from using deodorant/antiperspirant/scented products, and used scent-free shampoo/conditioner, soap, and lotion provided by the experimenter. They reported to have experience with watching sexually explicit videos, and signed informed consent before participation. Subjects kept a 4" x 4" pad (rayon/polyester for maximum absorbance) in each armpit while they watched 20-min-long video segments intended to produce the emotions of sexual arousal (sexual intercourse between heterosexual couples) and neutrality (educational documentaries), respectively. ... Over the course of the 20 min video segments, donors experienced greater arousal (measured by skin conductance) while watching erotic videos than while watching neutral videos... Three healthy, male nonsmokers (aged 26, 29, and 29 years) were subsequently selected for the current study mainly because of their higher level of the self-reported sexual arousal.

How were the female participants selected?

We recruited only women for their superior sense of smell and sensitivity to emotional signals. Twenty right-handed females (mean age = 23.4 years) were selected from a group of 42 women on the basis that they reported to have no rhinal disorders or neurological diseases, and that they showed superior olfactory sensitivity to PSP [the putative sex pheromone androstadienone] and PEA [phenyl ethyl alcohol]. They either were in a heterosexual relationship or had been in one within the previous year. They were not on hormone contraceptives, and were tested during the periovulatory phase of their menstrual cycles. ... Subjects were informed that the study was on brain activations to natural compounds. They were blind to the nature of the smells used in the experiment.

The scanning was performed while the women were inhaling...

...the sweat of sexual arousal in comparison with two other social chemosensory compounds (PSP and the sweat of neutrality) and a nonsocial smell [phenyl ethyl alcohol (PEA)].

The sweat of neutrality. The sweat of sexual arousal! [plus the two others.] The subjects rated the four inhalants (presented 10 times each) on intensity and pleasantness, as shown below. And the smell of sexual sweat was not particularly pleasant...

Figure 1. Mean intensity and pleasantness ratings. There are four types of olfactory stimuli, and SE bars are shown. For intensity, 1 refers to no smell, 2 little smell, 3 moderate smell, 4 quite a bit smell, and 5 strong smell. For pleasantness, 1 refers to very unpleasant, 2 unpleasant, 3 neutral, 4 pleasant, and 5 very pleasant. Sex, Sexual sweat; Neutral, neutral sweat. Sexual sweat and PSP were perceived to be more intense than neutral sweat; PEA was perceived to be more pleasant than sexual sweat and neutral sweat.

At the end of the experiment, the participants gave verbal descriptions of the smells. Only one characterized sexual sweat as "sweaty/human." So the women were not [consciously] aware that the odor was obtained from sexually aroused men.

The right hypothalamus showed increased activity to sexual sweat relative to alcohol, but androstadienone and neutral sweat produced the same effect. The two brain regions that responded more to sexual sweat than to the other odors are illustrated below. The right orbitofrontal cortex is an olfactory region, but the right fusiform gyrus is a high-level visual region. The authors say their fusiform region1 falls in the vicinity of the fusiform face area (FFA) and fusiform body area (FBA). Hmm.

The authors took a giant leap when speculating about visual imagery of faces and bodies:

The Talairach coordinates of the fusiform region identified in our experiment fall in the range of the coordinates for FFA and FBA. Such anatomical location likely reflects a recognition of the human quality in the sexual sweat, whose emotional nature may have also contributed to the activation. Considering its functional connectivity to the right hippocampus/ parahippocampal gyrus, the recognition may arise from implicitly associating the sexual sweat with humans based on past experience. The fact that most subjects did not perceive the sexual sweat as human related suggests that the effects we observed occurred at a subconscious level.

Wholly unconscious face/body visual processing in response to a sexual chemosensory cue? In the absence of any specific activity in the hypothalamus or amygdala? That's a hard one to swallow.

Footnote

1 The FFA and FBA have been dissociated with scanning at high resolution.

Chemosensory communication of affect and motivation is ubiquitous among animals. In humans, emotional expressions are naturally associated with faces and voices. Whether chemical signals play a role as well has hardly been addressed. Here, we use functional magnetic resonance imaging to show that the right orbitofrontal cortex, right fusiform cortex, and right hypothalamus respond to airborne natural human sexual sweat, indicating that this particular chemosensory compound is encoded holistically in the brain. Our findings provide neural evidence that socioemotional meanings, including the sexual ones, are conveyed in the human sweat.

Tuesday, January 20, 2009

"That we are in the midst of crisis is now well understood. Our nation is at war, against a far-reaching network of violence and hatred. Our economy is badly weakened, a consequence of greed and irresponsibility on the part of some, but also our collective failure to make hard choices and prepare the nation for a new age. Homes have been lost; jobs shed; businesses shuttered. Our health care is too costly; our schools fail too many; and each day brings further evidence that the ways we use energy strengthen our adversaries and threaten our planet.

These are the indicators of crisis, subject to data and statistics. Less measurable but no less profound is a sapping of confidence across our land - a nagging fear that America's decline is inevitable, and that the next generation must lower its sights.

Today I say to you that the challenges we face are real. They are serious and they are many.

They will not be met easily or in a short span of time. But know this, America - they will be met. On this day, we gather because we have chosen hope over fear, unity of purpose over conflict and discord."

Monday, January 19, 2009

President Obama, on his first day in office, can make a number of changes that will mark a clean break with the Bush presidency. He can, and should, issue an executive order revoking any prior order that permits detainee mistreatment by any government agency. He should begin the process of closing Guantánamo, and he should submit to Congress a bill to end the use of military commissions, at least as presently constituted. Over the coming months he can pursue other reforms to restore respect for the Constitution, such as revising the Patriot Act, abolishing secret prisons and "extraordinary rendition," and ending practices, like signing statements, that seek to undo laws.

the man who said it would take several hundred thousand troops: fired
the man who said it would cost more than a hundred billion: fired
the man who revealed Bush's yellowcake lies: smeared
his wife's covert status: exposed
the White House liars who did it and covered it up: not fired
one convicted: Bush commutes his sentence

Thursday, January 15, 2009

Most hip researchers in cognitive neuroscience and human brain imaging have already heard about the critical new journal article with the incendiary title: "Voodoo Correlations in Social Neuroscience" (Vul et al., in press - PDF). If you haven't, you can read a comprehensive summary here and a micro version here.

Avenging Voodoo Schadenfreude

Nature News ran a piece on the debate and the burgeoning backlash from an angry mob of researchers whose methods were derided as fatally flawed. Some of these authors (and perhaps some Nature editors) were miffed that bloggers wrote about the preprint when it was first made available to the public, as if that somehow violates the scientific method:

The swift rebuttal was prompted by scientists' alarm at the speed with which the accusations have spread through the community. The provocative title — 'Voodoo correlations in social neuroscience' — and iconoclastic tone have attracted coverage on many blogs, including that of Newsweek. Those attacked say they have not had the chance to argue their case in the normal academic channels.

"I first heard about this when I got a call from a journalist," comments neuroscientist Tania Singer of the University of Zurich, Switzerland, whose papers on empathy are listed as examples of bad analytical practice. "I was shocked — this is not the way that scientific discourse should take place." Singer says she asked for a discussion with the authors when she received the questionnaire, to clarify the type of information needed, but got no reply.

Based on the statements above, it would seem that Dr. Singer and her colleagues (Jabbi, Keysers, and Stephan) are not keeping up with the way that scientific discourse is evolving. [See Mind Hacks on this point as well.] Citing "in press" articles in the normal academic channels is a frequent event; why should bloggers, some of whom are read more widely than the authors' original papers, refrain from such a practice? Is it the "read more widely" part? To their credit, however, they commented in blogs and publicized the link to a preliminary version of their detailed reply.....although calling it "summary information for the press" assumes that "the press" is extremely knowledgeable about neuroimaging methodology and statistical analysis.

All is not puppies and flowers in the world of science social media, however. Proponents rarely acknowledge that many companies and institutions block access to these sites, so at present their usefulness is limited for many in the scientific community. A more obvious issue is that these sites can turn into an enormous time sink.

Humans have the capacity to empathize with the pain of others, but we don't empathize in all circumstances. An experiment on human volunteers playing an economic game looked at the conditional nature of our sympathy, and the results show that fairness of social interactions is key to the empathic neural response. Both men and women empathized with the pain of cooperative people. But if people are selfish, empathic responses were absent, at least in men. And it seems that physical harm might even be considered a good outcome — perhaps the first neuroscientific evidence for schadenfreude.

Nature and Science have a long history of issuing overblown press releases that extrapolate the findings of a single, quite flawed [if you side with Vul et al.] neuroimaging paper to yield the revelation of deep truths about human social interactions (among other things). The Nature News piece, Brain imaging studies under fire (Abbott, 2009), continues:

The article is scheduled for publication in September, alongside one or more replies. But the accused scientists are concerned that the impression now being established through media reports will be hard to shake after the nine-month delay. "We are not worried about our close colleagues, who will understand the arguments. We are worried that the whole enterprise of social neuroscience falls into disrepute," says neuroscientist Chris Frith of University College London, whose Nature paper [Singer et al., 2006] on response to perceived fairness was called into question.

So media reports heavily promoted the field, and media reports will unduly tarnish the field.1

NewScientist provides a clear instance of this, in what is surely a textbook exemplar of a pot-kettle moment.

SOME of the hottest results in the nascent field of social neuroscience, in which emotions and behavioural traits are linked to activity in a particular region of the brain, may be inflated and in some cases entirely spurious.

But one doesn't have to look very far to find NewScientist headlines like these (I just searched the archives of this blog):

IT IS two centuries since the birth of Charles Darwin, but even now his advice can be spot on. The great man attempted a little neuroscience in The Expressions of the Emotions in Man and Animals, published in 1872, in which he discussed the link between facial expressions and the brain. "Our present subject is very obscure," Darwin warned in his book, "and it is always advisable to perceive clearly our ignorance."

Modern-day neuroscience might benefit from adopting a similar stance. The field has produced some wonderful science, including endless technicolor images of the brain at work and headline-grabbing papers about the areas that "light up" when registering emotions. Researchers charted those sad spots that winked on in women mourning the end of a relationship, the areas that got fired up when thinking about infidelity, or those that surged in arachnophobes when they thought they were about to see a spider. The subjective subject of feelings seemed at last to be becoming objective.

Now it seems that a good chunk of the papers in this field contain exaggerated claims, according to an analysis which suggests that "voodoo correlations" often inflate the link between brain areas and particular behaviours.

Some of the resulting headlines appeared in New Scientist, so we have to eat a little humble pie and resolve that next time a sexy-sounding brain scan result appears we will strive to apply a little more scepticism to our coverage.

Um, no joke guys.

On the other hand, Sharon Begley at Newsweek is one science writer who hasn't been entirely convinced by the colorful brain images. On March 10, 2008, she wrote:

Brain-imaging studies have proliferated so mindlessly (no pun intended) that neuroscientists should have to wear a badge pleading, “stop me before I scan again.” I mean, does it really add to the sum total of human knowledge to learn that the brain’s emotion regions become active when people listen to candidates for president? Or that the reward circuitry in the brains of drug addicts become active when they see drug paraphernalia?

Therefore, her recent commentary on the brouhaha does not come across as an opinion that was invented yesterday:

If you are a fan of science news, then odds are you are also intrigued by brain imaging, the technique that produces those colorful pictures of brains “lit up” with activity, showing which regions are behind which behaviors, thoughts and emotions. So maybe you remember these recent hits... [gives many examples here] . . . the list goes on and on and on. And now a bombshell has fallen on dozens of such studies: according to a team of well-respected scientists, they amount to little more than voodoo science.

The neuroscience blogosphere is crackling with—so far—glee over the upcoming paper, which rips apart an entire field: the use of brain imaging in social neuroscience.....

Before concluding, I will state that I am not a complete neuroimaging nihilist. For examples of this view, see Coltheart, 2006 and especially van Orden and Paap, 1997 (as quoted by Coltheart):

What has functional neuroimaging told us about the mind so far? Nothing, and it never will: the nature of cognition is such that this technique in principle cannot provide evidence about the nature of cognition.

So no, I am not a Jerry Fodor Functionalist. I do believe that learning about human brain function is essential to learning about "the mind," that the latter can be reduced to the former, that fMRI can have something useful to say, and (more broadly, in case any anti-psychiatry types are listening) that psychiatric disorders are indeed caused by faulty brain function. But there's still a lot about fMRI as a technique that we don't really know. The best-practice statistical procedures for analyzing functional images are obviously a contentious issue; there is no consensus at this point. Our knowledge of what the BOLD signal is measuring, exactly, is not very clear either [see the recent announcement in J. Neurosci. that "BOLD Signals Do Not Always Reflect Neural Activity."] The critics among us2 are not trying to trash the entire field of social neuroscience (or neuroimaging in general). Some of us are taking concrete steps to open a dialogue and improve its methodology, while others are trying to rein in runaway interpretations.

1. Correction for multiple comparisons safeguards against inflation of correlations. It appears that the authors of the rebuttal misunderstand what correction for multiple comparisons provides...

2. Our claims based on calculations of an 'upper bound' on the correlations are inappropriate.....The fact that there is some variability and uncertainty associated with reliability estimates does not seem to us to be likely very important in understanding why this literature has featured so many enormous correlations.

3. Our simulations are misleading about false alarm rates......We did not intend to make any assertions about the rate of false alarms, nor to claim that all the correlations that we contend to be inflated are false alarms.

4. Non-independent analyses sometimes yield low or non-significant correlations. In the rebuttal, the authors assert that the sorts of non-independent analyses we describe do not always produce substantial correlations. However, they do not provide specific examples, so we are not able to meaningfully comment.

5. Correlation magnitude is not so important.....Whether or not the authors themselves care about the magnitude of the correlations, their procedures for producing these correlation estimates produce inflated numbers. The scientific literature should, where possible, be free of such erroneous measurements.

6. If non-independent analyses are so untrustworthy, why are they producing replicable results? This is a very important point: if what we say is true, why do replications of the measured correlations occur? Assessing this claim requires an in-depth examination of specific literatures, which is beyond the scope of this rapid response, but we look forward to examining some specific cases in the future.....

7. Our survey was misleading and confusing. This critical point would seem to be whether we mis-classified the methods of some studies, and counted them as having conducted non-independent analyses, when in fact they had not. If this happened, we would regret it, and any authors who feel that their papers have been misclassified should please contact us and provide details.....

8. Our suggested split-half analyses are not necessarily non-independent. Here we think the authors of the rebuttal bring up an excellent point. ..... There is evidently no single perfect analysis of brain-behavior correlations, but the procedures we suggest should offer a major improvement over the non-independent approaches being widely used.
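The split-half remedy in point 8 can be sketched directly (sizes and threshold illustrative, not from the paper): select voxels on one half of the subjects, then estimate the correlation on the held-out half. On pure noise, the within-half estimate is still inflated, while the held-out estimate hovers near zero.

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_voxels = 30, 10_000

behavior = rng.standard_normal(n_subjects)
voxels = rng.standard_normal((n_voxels, n_subjects))  # pure noise again

half1 = np.arange(0, n_subjects, 2)   # subjects used for voxel selection
half2 = np.arange(1, n_subjects, 2)   # held-out subjects for estimation

def zscore(x):
    return (x - x.mean(axis=-1, keepdims=True)) / x.std(axis=-1, keepdims=True)

# Select voxels using only the first half of the subjects...
r1 = zscore(voxels[:, half1]) @ zscore(behavior[half1]) / len(half1)
roi = voxels[r1 > 0.6]

# ...then compare the circular estimate with the held-out one.
r_biased = np.corrcoef(roi[:, half1].mean(axis=0), behavior[half1])[0, 1]
r_unbiased = np.corrcoef(roi[:, half2].mean(axis=0), behavior[half2])[0, 1]
print(f"within-half r = {r_biased:.2f}, held-out r = {r_unbiased:.2f}")
```

The held-out estimate is honest because the subjects that measure the correlation played no role in choosing the voxels.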

BPS Research Digest is a little late in their coverage of this debate, but yesterday they wrote that a second rebuttal is in preparation:

Matthew Lieberman, a co-author on Eisenberger's social rejection study, told us that he and his colleagues have drafted a robust reply to these methodological accusations, which will be published in Perspectives on Psychological Science alongside the Pashler paper. In particular he stressed that concerns over multiple comparisons in fMRI research are not new, are not specific to social neuroscience, and that the methodological approach of the Pashler group, done correctly, would lead to similar results to those already published. "There are numerous errors in their handling of the data that they reanalyzed," he argued. "While trying to recreate their [most damning] Figure 5, we went through and pulled all the correlations from all the papers. We found around 50 correlations that were clearly in the papers Pashler's team reviewed but were not included in their analyses. Almost all of these overlooked correlations tend to work against their hypotheses."

Visual Illusion Contest

Contestants are invited to submit novel visual or multimodal illusions (unpublished, or published no earlier than 2008) in standard image, movie or html formats. An international panel of impartial judges will rate the submissions and narrow them to the top ten. Then, at the Contest Gala in Naples, the top ten illusionists will present their contributions and the attendees of the event (that means you!) will vote to pick the TOP THREE WINNERS!

. . .

Submissions can be emailed to Dr. Susana Martinez-Conde (Illusion Contest Coordinator, Neural Correlate Society) until February 16th, 2009. Illusion submissions should come with a (no more than) one-page description of the illusion and its theoretical underpinnings (if known). Illusions will be rated according to:

Only one entry per person. Use either the PDF or WORD versions of the official entry form (copies of the form are acceptable) to enter a picture on the topic for your age group (see website). Drawings must be done by hand using pencils, pens, markers, and/or crayons. Feel free to use words in your drawings. Be creative!

In the spirit of American political debate shows such as Crossfire, The McLaughlin Group, Hannity & Colmes, and the classic Point/Counterpoint (both the 60 Minutes and SNL versions), The Neurocritic is pleased to present an excerpt from a rebuttal to the lively and controversial paper by Vul, Harris, Winkielman, and Pashler (PDF).

Two "anonymous commenters" tipped me off to the preliminary version of a detailed reply by some of the authors on the Vul et al. hit list. The entire document is available for download as a PDF. The abstract, main bullet points, and conclusions are reproduced below.

Rebuttal of "Voodoo Correlations in Social Neuroscience" by Vul et al. – summary information for the press

2 University Medical Center Groningen, Department of Neuroscience, University of Groningen, The Netherlands. www.bcn-nic.nl/socialbrain.html

3 Laboratory for Social and Neural Systems Research, University of Zurich, Switzerland. http://www.socialbehavior.uzh.ch/index.html

The paper by Vul et al., entitled "Voodoo correlations in social neuroscience" and accepted for publication by Perspectives on Psychological Science, claims that "a disturbingly large, and quite prominent, segment of social neuroscience research is using seriously defective research methods and producing a profusion of numbers that should not be believed." In all brevity, we here summarise conceptual shortcomings and methodological errors of this paper and explain why their criticisms are invalid. A detailed reply will be submitted to a peer reviewed scientific journal shortly.

2. The authors make strong claims on the basis of a questionable upper bound argument.

3. The authors use misleading simulations to support their claims.

4. The authors inappropriately dismiss the existence of non-significant correlations.

5. The authors' understanding of the rationale behind the use and interpretation of correlations in social neuroscience is incomplete.

6. The authors ignore that the same brain-behaviour correlations have been replicated by several independent studies and that major results in social neuroscience are not based on correlations at all.

7. The authors used an ambiguous and incomplete questionnaire.

8. The authors make flawed suggestions for data analysis.

. . .

Conclusions

In this summary, we have provided a very brief summary that exposes some of the flaws that undermine the criticisms by Vul et al. We have pointed out that brain-behaviour correlations in social neuroscience are valid, provided that one adheres to good statistical practice. It has also been emphasized that many analyses and findings in social neuroscience do not rest on brain-behaviour correlations and have been replicated several times by independent studies conducted by different laboratories. A full analysis of the Vul et al. paper and a detailed reply will be submitted to a peer-reviewed scientific journal shortly.

A rebuttal to the rebuttal, along with commentary by The Neurocritic, all to come in the next exciting episode!

Monday, January 05, 2009

The end of 2008 brought us the tabloid headline, Scan Scandal Hits Social Neuroscience. As initially reported by Mind Hacks, a new "bombshell of a paper" (Vul et al., 2009) questioned the implausibly high correlations observed in some fMRI studies in Social Neuroscience. A new look at the analytic methods revealed that over half of the sampled papers used faulty techniques to obtain their results.

Edward Vul, the first author, deserves a tremendous amount of credit (and a round of applause) for writing and publishing such a critical paper under his own name [unlike all those cowardly pseudonymous bloggers who shall go unnamed here]. He's a graduate student in Nancy Kanwisher's Lab at MIT. Dr. Kanwisher1 is best known for her work on the fusiform face area.

Vul et al. start with the observation that the new field of Social Neuroscience (or Social Cognitive Neuroscience) has garnered a great deal of attention and funding in its brief existence. Many high-profile neuroimaging articles have been published in Science, Nature, and Neuron, and have received widespread coverage in the popular press. However, all may not be rosy in paradise:2

Eisenberger, Lieberman, and Williams (2003), writing in Science, described a game they created to expose individuals to social rejection in the laboratory. The authors measured the brain activity in 13 individuals at the same time as the actual rejection took place, and later obtained a self-report measure of how much distress the subject had experienced. Distress was correlated at r=.88 with activity in the anterior cingulate cortex (ACC).

In another Science paper, Singer et al. (2004) found that the magnitude of differential activation within the ACC and left insula induced by an empathy-related manipulation was correlated between .52 and .72 with two scales of emotional empathy (the Empathic Concern Scale of Davis, and the Balanced Emotional Empathy Scale of Mehrabian).

Why is a correlation of r=.88 with 13 subjects considered "remarkably high"? For starters, it exceeds the reliability of the hemodynamic and behavioral (social, emotional, personality) measurements:

The problem is this: It is a statistical fact... that the strength of the correlation observed between measures A and B reflects not only the strength of the relationship between the traits underlying A and B, but also the reliability of the measures of A and B.

Evidence from the existing literature suggests the test-retest reliability of personality rating scales to be .7-.8 at best, and a reliability no higher than .7 for the BOLD (Blood-Oxygen-Level Dependent) signal. Even if the correlation between the underlying traits were [impossibly] perfect, the highest observable correlation would be sqrt(.8 * .7), or about .74.
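This attenuation bound is easy to check numerically. A minimal sketch, using the upper-end reliability estimates quoted above (these are the article's figures, not fresh measurements):

```python
import math

# Spearman's attenuation formula: the correlation observable between two
# noisy measures cannot exceed sqrt(reliability_A * reliability_B), even
# when the underlying traits correlate perfectly.
behavioral_reliability = 0.8   # upper estimate for personality scales
bold_reliability = 0.7         # upper estimate for the BOLD signal

max_observable_r = math.sqrt(behavioral_reliability * bold_reliability)
print(f"{max_observable_r:.2f}")  # prints 0.75; Vul et al. report this bound as .74
```

Any reported brain-behaviour correlation above this ceiling should therefore raise immediate suspicion about the analysis that produced it.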

This observation prompted the authors to conduct a meta-analysis of the literature. They identified 54 papers that met their criteria for fMRI studies reporting correlations between the BOLD response in a particular brain region and some social/emotional/personality measure. In most cases, the Methods sections did not provide enough detail about the statistical procedures used to obtain these correlations. Therefore, a questionnaire was devised and sent to the corresponding authors of all 54 papers:

APPENDIX 1: fMRI Survey Question Text

Would you please be so kind as to answer a few very quick questions about the analysis that produced, i.e., the correlations on page XX. We expect this will just take you a minute or two at most.

To make this as quick as possible, we have framed these as multiple choice questions and listed the more common analysis procedures as options, but if you did something different, we'd be obliged if you would describe what you actually did.

The data plotted reflect the percent signal change or difference in parameter estimates (according to some contrast) of...

1. ...the average of a number of voxels.
2. ...one peak voxel that was most significant according to some functional measure.
3. ...something else?

etc.....

Thank you very much for giving us this information so that we can describe your study accurately in our review.

They received 51 replies. Did these authors suspect the final product could put some of their publications in such a negative light?

SpongeBob: What if Squidward’s right? What if the award is a phony? Does this mean my whole body of work is meaningless?

After providing a nice overview of fMRI analysis procedures (beginning on page 6 of the preprint), Vul et al. present the results of the survey, and then explain the problems associated with the use of non-independent analysis methods.

...23 [papers] reported a correlation between behavior and one peak voxel; 29 reported the mean of a number of voxels. ... Of the 45 studies that used functional constraints to choose voxels (either for averaging, or for finding the ‘peak’ voxel), 10 said they used functional measures defined within a given subject, 28 used the across-subject correlation to find voxels, and 7 did something else. All of the studies using functional constraints used the same data to select voxels, and then to measure the correlation. Notably, 54% of the surveyed studies selected voxels based on a correlation with the behavioral individual-differences measure, and then used those same data to compute a correlation within that subset of voxels.

Therefore, for these 28 papers, voxels were selected because they correlated highly with the behavioral measure of interest. Using simulations, Vul et al. demonstrate that this glaring "non-independence error" can produce significant correlations out of noise!

This analysis distorts the results by selecting noise exhibiting the effect being searched for, and any measures obtained from such a non-independent analysis are biased and untrustworthy (for a formal discussion see Vul & Kanwisher, in press, PDF).
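The inflation is straightforward to reproduce from pure noise. Here is a minimal sketch of the kind of simulation Vul et al. describe — the subject count, voxel count, and selection threshold are illustrative choices, not the paper's exact parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
n_subjects, n_voxels = 13, 10_000

# By construction there is NO real relationship: both the behavioral
# scores and the voxel responses are independent Gaussian noise.
behavior = rng.standard_normal(n_subjects)
voxels = rng.standard_normal((n_voxels, n_subjects))

# Per-voxel Pearson correlation with behavior (z-score dot product).
bz = (behavior - behavior.mean()) / behavior.std()
vz = (voxels - voxels.mean(axis=1, keepdims=True)) / voxels.std(axis=1, keepdims=True)
r_per_voxel = vz @ bz / n_subjects

# The non-independence error: select voxels BECAUSE they correlate with
# the behavioral measure, then report the correlation of their mean with
# that very same measure.
roi_mean = voxels[r_per_voxel > 0.6].mean(axis=0)
inflated_r = np.corrcoef(roi_mean, behavior)[0, 1]
print(f"{inflated_r:.2f}")  # a strikingly high r, from noise alone
```

Averaging the selected voxels cancels out their independent noise while preserving the behavior-aligned component that got them selected, so the reported correlation can approach 1 even though the true correlation is exactly zero.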

And the problem is magnified in correlations that used activity in one peak voxel (out of a grand total of between 40,000 and 500,000 voxels in the entire brain) instead of a cluster of voxels that passed a statistical threshold. Papers that used non-independent analyses were much more likely to report implausibly high correlations, as illustrated in the figure below.
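The peak-voxel version of the problem can be illustrated the same way: with tens of thousands of candidate voxels and only 13 subjects, the single best-correlating noise voxel is all but guaranteed to show an enormous r. Again an illustrative sketch, not the authors' code:

```python
import numpy as np

rng = np.random.default_rng(1)
n_subjects, n_voxels = 13, 40_000   # low end of whole-brain voxel counts

behavior = rng.standard_normal(n_subjects)
voxels = rng.standard_normal((n_voxels, n_subjects))  # pure noise

bz = (behavior - behavior.mean()) / behavior.std()
vz = (voxels - voxels.mean(axis=1, keepdims=True)) / voxels.std(axis=1, keepdims=True)
r_per_voxel = vz @ bz / n_subjects

# The 'peak' voxel: the most extreme of 40,000 chance correlations.
peak_r = np.abs(r_per_voxel).max()
print(f"{peak_r:.2f}")  # routinely around .9 at these sample sizes
```

This is simply the statistics of the maximum: the expected extreme of tens of thousands of chance correlations with n = 13 sits near the very values being reported as discoveries.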

Figure 5 (Vul et al., 2009). The histogram of the correlation values from the studies we surveyed, color-coded by whether or not the article used non-independent analyses. Correlations coded in green correspond to those that were achieved with independent analyses, avoiding the bias described in this paper. However, those in red correspond to the 54% of articles surveyed that reported conducting non-independent analyses – these correlation values are certain to be inflated. Entries in orange arise from papers whose authors chose not to respond to our survey.

Not so coincidentally, some of these same papers have been flagged (or flogged) in this very blog. The Neurocritic's very first post 2.94 yrs ago, Men are Torturers, Women are Nurturers..., complained about the overblown conclusions and misleading press coverage of a particular paper (Singer et al., 2006), as well as its methodology:

And don't get me started on their methodology -- a priori regions of interest (ROIs) for pain-related empathy in fronto-insular cortex and anterior cingulate cortex (like the relationship between those brain regions and "pain-related empathy" are well-established!) -- and on their pink-and-blue color-coded tables!

Not necessarily the most sophisticated deconstruction of analytic techniques, but it was the first...and it did question how the regions of interest were selected. And of course how the data were interpreted and presented in the press.

SUMMARY from The Neurocritic: Ummm, it's nice they can generalize from 16 male undergrads to the evolution of sex differences that are universally valid in all societies.

As you can tell, this one really bothers me...

And what are the conclusions of Vul et al.?

To sum up, then, we are led to conclude that a disturbingly large, and quite prominent, segment of social neuroscience research is using seriously defective research methods and producing a profusion of numbers that should not be believed.

Finally, they call upon the authors to re-analyze their data and correct the scientific record.

About Me

Born in West Virginia in 1980, The Neurocritic embarked upon a roadtrip across America at the age of thirteen with his mother. She abandoned him when they reached San Francisco and The Neurocritic descended into a spiral of drug abuse and prostitution. At fifteen, The Neurocritic's psychiatrist encouraged him to start writing as a form of therapy.