In a substantial contribution to the fMRI field, Eklund et al. (1) use nonparametric methods to demonstrate that random field theory (RFT)-based familywise error (FWE) correction for cluster inference does not control false positives at the nominal rate, and that this inflation is more pronounced for lenient cluster-defining thresholds (CDTs). Moreover, they point to violations of RFT assumptions as the culprit.

Given these results, how should we interpret existing fMRI literature that used RFT-based, FWE-corrected P values (pRFT-FWE)? To suggest caution is reasonable but incomplete; we require concrete, quantitative guidelines to enable appropriate calibration of skepticism.

Here, we undertake an initial attempt at such guidance. We heed Eklund et al.’s (1) warning and prefer nonparametric null distributions to RFT. However, we focus on the false discovery rate (FDR) (2), which is a more natural target for multiple testing control [as recognized by Nichols and coworkers in previous work (3)]: A researcher is naturally more concerned with the proportion of reported clusters that are false positives (FDR) than whether any are false positives (FWE). Thus, a reader considering a table of clusters significant under RFT–FWE might ask which of these results would have survived had the study instead used a nonparametric FDR-based method.
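The distinction between the two error rates can be stated compactly. As a sketch in standard notation (the symbols R and V are introduced here for illustration), let R be the number of clusters reported as significant and V the number of those that are false positives:

```latex
\mathrm{FWE} = \Pr(V \geq 1), \qquad
\mathrm{FDR} = \mathbb{E}\!\left[\frac{V}{\max(R, 1)}\right]
```

FWE control bounds the chance of any false positive at all, whereas FDR control bounds the expected fraction of reported clusters that are false.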

We address this question using the same task fMRI data (4, 5) analyzed by Eklund et al. (1) (available from openfMRI, ref. 6).

For each contrast, we generate 5,000 realizations of the data through sign flipping (code, data, and extended methods: https://github.com/mangstad/FDR_permutations). To obtain a null distribution of cluster extents (for an arbitrary cluster) we combine normalized frequencies of extents at each realization. This distribution is used to assign uncorrected P values to each observed cluster. We next submit the vector of uncorrected P values for each contrast to Benjamini and Hochberg’s (2) FDR procedure with αFDR=.05 (cf. ref. 7 for a parametric implementation of clusterwise FDR).
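The last three steps above (pooling null extents, assigning uncorrected P values, and applying the Benjamini and Hochberg procedure) can be sketched as follows. This is a minimal illustration, not the code from the linked repository. It assumes that "combining normalized frequencies" means averaging the per-realization extent histograms, and that realizations with no suprathreshold clusters are skipped. All function names are hypothetical.

```python
from collections import Counter


def pooled_null_distribution(null_extents_per_realization):
    """Average the normalized extent histogram of each realization.

    Assumption: realizations with no clusters contribute nothing.
    """
    pooled = Counter()
    used = 0
    for extents in null_extents_per_realization:
        if not extents:
            continue
        used += 1
        n = len(extents)
        for extent, count in Counter(extents).items():
            pooled[extent] += count / n  # within-realization frequency
    if used == 0:
        raise ValueError("no realization contained any cluster")
    return {extent: weight / used for extent, weight in pooled.items()}


def uncorrected_p(observed_extent, null_dist):
    """Null probability of a cluster at least as large as the observed one."""
    return sum(w for extent, w in null_dist.items() if extent >= observed_extent)


def benjamini_hochberg(p_values, alpha=0.05):
    """Return (reject flags, q values) in the original order of p_values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k with p_(k) <= (k / m) * alpha.
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= rank / m * alpha:
            k_max = rank
    reject = [False] * m
    for rank, i in enumerate(order, start=1):
        reject[i] = rank <= k_max

    # q values: p * m / rank, made monotone from the largest rank downward.
    q = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q[i] = running_min
    return reject, q


# Toy example with made-up extents (in voxels), three null realizations.
null_extents = [[1, 1, 2, 10], [1, 3], [2, 2, 5, 50]]
null_dist = pooled_null_distribution(null_extents)
observed = [5, 100, 1]
p_unc = [uncorrected_p(k, null_dist) for k in observed]
reject, q = benjamini_hochberg(p_unc, alpha=0.05)
```

With a finite number of realizations, an observed extent larger than every null extent receives a P value of exactly zero under this sketch; a common alternative is to add one to the numerator and denominator, which the letter does not specify.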

We compare pRFT-FWE values to qFDR values and note whether they survive FDR correction under αFDR=.05. We generate separate plots for this analysis conducted at CDT = {0.001, 0.01}.

Based on our results (Fig. 1), we suggest that nearly all clusters identified as significant when using CDT = 0.001 and RFT–FWE correction are trustworthy by the nonparametric FDR benchmark. For clusters identified as significant with CDT = 0.01 and RFT–FWE correction, the guidance depends on the corrected P value: Clusters with pRFT-FWE<.00001 seem consistently trustworthy by the nonparametric FDR benchmark, whereas clusters with larger pRFT-FWE values are not reliably trustworthy.

These findings have promising implications for past fMRI studies that applied RFT-based cluster-level inference with CDT = 0.001, estimated to be upward of 8,500 reports (8, 9). Although the story is mixed for CDT = 0.01 (used in ∼3,500 studies) (8, 9), our findings suggest that not all such previously reported clusters are unreliable. For this threshold, we identify pRFT-FWE = 0.00001 as a potential cutoff for trustworthiness.

Our results provide more granular guidance on the relationship between pRFT-FWE and the trustworthiness of results. A more comprehensive examination of fMRI task datasets that used RFT-based FWE correction can further refine this guidance.

Acknowledgments

We thank Anders Eklund and Thomas Nichols for providing us with processed data and for very helpful comments on earlier versions of this letter.
