I'm not blogging about any of these events. Many many others have already written about them (see selected reading list below). And The Neurocritic has been feeling tapped out lately.

Hence the cats on treadmills. They're here to introduce a new study which demonstrated that early visual experience is not necessary for the perception of biological motion (Bottari et al., 2015). Biological motion perception involves the ability to understand and visually track the movement of a living being. This phenomenon is often studied using point light displays, as shown below in a demo from the BioMotion Lab. You should really check out their flash animation that allows you to view human, feline, and pigeon walkers moving from right to left, scrambled and unscrambled, masked and unmasked, inverted and right side up.

People born with dense, bilateral cataracts that are surgically removed at a later date show deficits in higher visual processing, including the perception of global motion, global form, faces, and illusory contours. Proper neural development during the critical, or sensitive, period early in life depends on experience, in this case visual input. However, it seems that the perception of biological motion (BM) does not require early visual experience (Bottari et al., 2015).

Participants in the study were 12 individuals with congenital cataracts that were removed at a mean age of 7.8 years (range 4 months to 16 yrs). Age at testing was 17.8 years (range 10-35 yrs). The study assessed their biological motion thresholds (extracting BM from noise) and recorded their EEG to point light displays of a walking man and to scrambled versions of the walking man (see demo).

Behavioral performance on the BM threshold task didn't differ much between the congenital cataract (cc) and matched control (mc) groups (i.e., there was a lot of overlap between the filled diamonds and the open triangles below).

The event-related potentials (ERPs) averaged to presentations of the walking man vs. scrambled man showed the same pattern in cc and mc groups as well: larger to walking man (BM) than scrambled man (SBM).

The N1 component (the peak at about 0.25 sec post-stimulus) seems a little smaller in cc but that wasn't significant. On the other hand, the earlier P1 was significantly reduced in the cc group. Interestingly, the duration of visual deprivation, amount of visual experience, and post-surgical visual acuity did not correlate with the size of the N1.

The authors discuss three possible explanations for these results:

(1) The neural circuitries associated with the processing of BM can specialize in late childhood or adulthood. That is, as soon as visual input becomes available, [it] initiates the functional maturation of the BM system. Alternatively the neural systems for BM might mature independently of vision. (2) Either they are shaped cross-modally or (3) they mature independent of experience.

They ultimately favor the third explanation, that "the neural systems for BM specialize independently of visual experience." They also point out that the ERPs to faces vs. scrambled faces in the cc group do not show the characteristic difference between these stimulus types. What's so special about biological motion, then? Here the authors wave their hands and arms a bit:

We can only speculate why these different developmental trajectories for faces and BM emerge: BM is characteristic for any type of living being and the major properties are shared across species. ... By contrast, faces are highly specific for a species and biases for the processing of faces from our own ethnicity and age have been shown.

It's more important to see if a bear is running towards you than it is to recognize faces, as anyone with congenital prosopagnosia ("face blindness") might tell you...

"The third sequence showed a walking cat. The data are based on a high-speed (200 fps) video sequence showing a cat walking on a treadmill. Fourteen feature points were manually sampled from single frames. As with the pigeon sequence, data were approximated with a third-order Fourier series to obtain a generic walking cycle."
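For the curious, here is a minimal sketch of what "approximated with a third-order Fourier series" can look like for a single marker coordinate. It assumes one walking cycle sampled at evenly spaced frames; the toy signal, the function names (`fit_fourier`, `evaluate`) and the frame count are made up for illustration and are not the BioMotion Lab's actual pipeline.

```python
import math

def fit_fourier(samples, order=3):
    """Fourier coefficients for one cycle sampled at evenly spaced frames.

    For uniform sampling with more than 2 * order frames, these projections
    coincide with the least-squares fit of a truncated Fourier series.
    """
    n = len(samples)
    mean = sum(samples) / n
    cos_coef, sin_coef = [], []
    for k in range(1, order + 1):
        a = 2.0 / n * sum(y * math.cos(2 * math.pi * k * i / n)
                          for i, y in enumerate(samples))
        b = 2.0 / n * sum(y * math.sin(2 * math.pi * k * i / n)
                          for i, y in enumerate(samples))
        cos_coef.append(a)
        sin_coef.append(b)
    return mean, cos_coef, sin_coef

def evaluate(coeffs, phase):
    """Value of the fitted series at a phase (in radians) within the cycle."""
    mean, cos_coef, sin_coef = coeffs
    total = mean
    for k, (a, b) in enumerate(zip(cos_coef, sin_coef), start=1):
        total += a * math.cos(k * phase) + b * math.sin(k * phase)
    return total

# Hypothetical marker trajectory: a smooth gait-like signal plus a
# fifth-harmonic wiggle that a third-order fit should discard.
n_frames = 40
cycle = []
for i in range(n_frames):
    t = 2 * math.pi * i / n_frames
    cycle.append(1.0 + 2.0 * math.cos(t) + 0.5 * math.sin(3 * t)
                 + 0.3 * math.sin(5 * t))

coeffs = fit_fourier(cycle, order=3)
smoothed = [evaluate(coeffs, 2 * math.pi * i / n_frames)
            for i in range(n_frames)]
```

Truncating at the third harmonic keeps the slow, repeatable part of the movement and drops the faster jitter, which is presumably the point of a "generic" walking cycle.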

Sunday, August 09, 2015

In the coming era of Precision Medicine, we'll all want customized treatments that “take into account individual differences in people’s genes, environments, and lifestyles.” To do this, we'll need precise diagnostic tools to identify the specific disease process in each individual. Although focused on cancer in the near-term, the longer-term goal of the White House initiative is to apply Precision Medicine to all areas of health. This presumably includes psychiatry, but the links between Precision Medicine, the BRAIN initiative, and RDoC seem a bit murky at present.1

But there's nothing a good infographic can't fix. Science recently published a Perspective piece by the NIMH Director and the chief architect of the Research Domain Criteria (RDoC) initiative (Insel & Cuthbert, 2015). There's Deconstruction involved, so what's not to like? 2

ILLUSTRATION: V. Altounian and C. Smith / SCIENCE

In this massively ambitious future scenario, the totality of one's genetic risk factors, brain activity, physiology, immune function, behavioral symptom profile, and life experience (social, cultural, environmental) will be deconstructed and stratified and recompiled into a neat little cohort. 3

The new categories will be data driven. The project might start by collecting colossal quantities of expensive data from millions of people, and continue by running classifiers on exceptionally powerful computers (powered by exceptionally bright scientists/engineers/coders) to extract meaningful patterns that can categorize the data with high levels of sensitivity and specificity. Perhaps I am filled with pathologically high levels of negative affect (Loss? Frustrative Nonreward?), but I find it hard to be optimistic about progress in the immediate future. You know, for a Precision Medicine treatment for me (and my pessimism)...

But let's just focus on the brain for now. For a long time, most neuroscientists have viewed mental disorders as brain disorders. [But that's not to say that environment, culture, experience, etc. play no role! cf. Footnote 3]. So our opening question becomes, How do we classify and diagnose brain disorders, er, neural circuit disorders in a fashion consistent with RDoC principles? Is there really One Brain Network for All Mental Illness, for instance? (I didn't think so.)

Our colleagues in Asia and Australia and Europe and Canada may not have gotten the funding memo, however, and continue to run classifiers based on DSM categories. 5 In my previous post, I promised an unsystematic review of machine learning as applied to the classification of major depression. You can skip directly to the Appendix to see that.

Regardless of whether we use DSM-5 categories or RDoC matrix constructs, what we need are robust and reproducible biomarkers (see Table 1 above). A brief but excellent primer by Woo and Wager (2015) outlined the characteristics of a useful neuroimaging biomarker:

1. Criterion 1: diagnosticity

Good biomarkers should produce high diagnostic performance in classification or prediction. Diagnostic performance can be evaluated by sensitivity and specificity. Sensitivity concerns whether a model can correctly detect signal when signal exists. Effect size is a closely related concept; larger effect sizes are related to higher sensitivity. Specificity concerns whether the model produces negative results when there is no signal. Specificity can be evaluated relative to a range of specific alternative conditions that may be confusable with the condition of interest.

2. Criterion 2: interpretability

Brain-based biomarkers should be meaningful and interpretable in terms of neuroscience, including previous neuroimaging studies and converging evidence from multiple sources (eg, animal models, lesion studies, etc). One potential pitfall in developing neuroimaging biomarkers is that classification or prediction models can capitalize on confounding variables that are not neuroscientifically meaningful or interesting at all (eg, in-scanner head movement). Therefore, neuroimaging biomarkers should be evaluated and interpreted in the light of existing neuroscientific findings.

3. Criterion 3: deployability

Once the classification or outcome-prediction model has been developed as a neuroimaging biomarker, the model and the testing procedure should be precisely defined so that it can be prospectively applied to new data. Any flexibility in the testing procedures could introduce potential overoptimistic biases into test results, rendering them useless and potentially misleading. For example, “amygdala activity” cannot be a good neuroimaging biomarker without a precise definition of which “voxels” in the amygdala should be activated and the relative expected intensity of activity across each voxel. A well-defined model and standardized testing procedure are crucial aspects of turning neuroimaging results into a “research product,” a biomarker that can be shared and tested across laboratories.

4. Criterion 4: generalizability

Clinically useful neuroimaging biomarkers aim to provide predictions about new individuals. Therefore, they should be validated through prospective testing to prove that their performance is generalizable across different laboratories, different scanners or scanning procedures, different populations, and variants of testing conditions (eg, other types of chronic pain). Generalizability tests inherently require multistudy and multisite efforts. With a precisely defined model and standardized testing procedure (criterion 3), we can easily test the generalizability of biomarkers and define the boundary conditions under which they are valid and useful.

[Then the authors evaluated the performance of a structural MRI signature for IBS presented in an accompanying paper.]
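To make Criterion 1 concrete, here is a small sketch of how sensitivity and specificity fall out of a classifier's hit and miss counts. The counts are invented for illustration. The last function goes one step beyond the quoted passage and is my addition: it shows how the same classifier fares when the condition is rare, with the prevalence values being pure assumptions.

```python
def sensitivity(true_pos, false_neg):
    """Fraction of people with the condition that the model flags."""
    return true_pos / (true_pos + false_neg)

def specificity(true_neg, false_pos):
    """Fraction of people without the condition that the model clears."""
    return true_neg / (true_neg + false_pos)

def positive_predictive_value(sens, spec, prevalence):
    """Chance that a positive result is a true positive (Bayes' rule)."""
    true_pos_rate = sens * prevalence
    false_pos_rate = (1 - spec) * (1 - prevalence)
    return true_pos_rate / (true_pos_rate + false_pos_rate)

# Hypothetical study: 100 patients and 100 controls.
sens = sensitivity(true_pos=80, false_neg=20)   # 0.8
spec = specificity(true_neg=90, false_pos=10)   # 0.9

# Same classifier, two assumed base rates: a balanced research sample
# versus a population where 5% of people have the condition.
ppv_balanced = positive_predictive_value(sens, spec, prevalence=0.5)
ppv_rare = positive_predictive_value(sens, spec, prevalence=0.05)
```

With these made-up numbers, a positive result is right about 89% of the time in the balanced sample but only about 30% of the time at a 5% base rate, which is one reason case-control accuracy does not translate directly into a diagnostic test.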

Should we try to improve on a neuroimaging biomarker (or “neural signature”) for classic disorders in which “Neuroanatomical diagnosis was correct in 80% and 72% of patients with major depression and schizophrenia, respectively...” (Koutsouleris et al., 2015)? That study used large cohorts and evaluated the trained biomarker against an independent validation database (i.e., it was more thorough than many other investigations). Or is the field better served by classifying when loss and agency and auditory perception go awry? What would individualized treatments for these constructs look like? Presumably, the goal is to develop better treatments, and to predict who will respond to specific treatments.

OR should we adopt the surprisingly cynical view of some prominent investigators, who say:

...identifying a genuine neural signature would necessitate the discovery of a specific pattern of brain responses that possesses nearly perfect sensitivity and specificity for a given condition or other phenotype. At the present time, neuroscientists are not remotely close to pinpointing such a signature for any psychological disorder or trait...

If that's true, then we'll have an awfully hard time with our resting state fMRI classifier for neuro-nihilism.

2 Derrida's Deconstruction and RDoC are diametrically opposed, as irony would have it.

3 Or maybe an n of 1... I'm especially curious about how life experience will be incorporated into the mix. Perhaps the patient of the future will upload all the data recorded by their memory implants, as in The Entire History of You (an episode of Black Mirror).

4 The word “shroud” always makes everything sound so dire and deathly important... especially when used as a noun.

5 As do many research groups in the US. This is meant to be snarky, but not condescending to anyone who follows DSM-5 in their research.

This last one is especially important, since an accurate diagnosis can avoid the potentially disastrous prescribing of antidepressants in bipolar depression.

Idea that may already be implemented somewhere: Individual labs or research groups could perhaps contribute to a support vector machine clearing house (e.g., at NITRC or OpenfMRI or GitHub) where everyone can upload the code for data processing streams and various learning/classification algorithms to try out on each other's data.

In machine learning, computers apply statistical learning techniques to automatically identify patterns in data. These techniques can be used to make highly accurate predictions.

You should really head over there right now to view it, because it's very impressive.

Computational neuroscience types are using machine learning algorithms to classify all sorts of brain states, and diagnose brain disorders, in humans. How accurate are these classifications? Do the studies all use separate training sets and test sets, as shown in the example above?

Let's say your fMRI measure is able to differentiate individuals with panic disorder (n=33) from those with panic disorder + depression (n=26) with 79% accuracy.1 Or with structural MRI scans you can distinguish 20 participants with treatment-refractory depression from 21 never-depressed individuals with 85% accuracy.2 Besides the issues outlined in the footnotes, the “reality check” is that the model must be able to predict group membership for a new (untrained) data set. And most studies don't seem to do this.

I was originally drawn to the topic by a 3 page article entitled, Machine learning algorithm accurately detects fMRI signature of vulnerability to major depression (Sato et al., 2015). Wow! Really? How accurate? Which fMRI signature? Let's take a look.

The authors used a “standard leave-one-subject-out procedure in which the classification is cross-validated iteratively by using a model based on the sample after excluding one subject to independently predict group membership” but they did not test their fMRI signature in completely independent groups of participants.
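For readers who haven't met leave-one-subject-out cross-validation, here is a bare-bones sketch of the loop. The nearest-centroid classifier and the two-feature toy subjects are stand-ins I made up. They are not the classifier or the connectivity features used by Sato et al.

```python
def centroid(rows):
    """Mean feature vector of a list of feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def sq_dist(a, b):
    """Squared Euclidean distance between two feature vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def predict(train, features):
    """Assign the label whose training centroid is closest."""
    by_label = {}
    for feats, label in train:
        by_label.setdefault(label, []).append(feats)
    centroids = {label: centroid(rows) for label, rows in by_label.items()}
    return min(centroids, key=lambda label: sq_dist(centroids[label], features))

def leave_one_subject_out(subjects):
    """Hold out each subject once, train on the rest, score the held-out one."""
    hits = 0
    for i, (features, label) in enumerate(subjects):
        train = subjects[:i] + subjects[i + 1:]
        if predict(train, features) == label:
            hits += 1
    return hits / len(subjects)

# Hypothetical subjects: (two made-up brain measures, group label).
subjects = [
    ([1.0, 1.2], "remitted"), ([0.8, 1.0], "remitted"), ([1.1, 0.9], "remitted"),
    ([3.0, 3.1], "control"), ([2.9, 3.3], "control"), ([3.2, 2.8], "control"),
]
loso_accuracy = leave_one_subject_out(subjects)
```

Every held-out subject still comes from the same sample, scanner, and analysis pipeline as the training subjects, and any step that peeks at the labels (such as choosing which regions to use) has to be redone inside the loop. That is why a good leave-one-out score is not the same thing as replication in a completely independent group.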

Nor did they try to compare individuals who are currently depressed to those who are currently remitted. That didn't matter, apparently, because the authors suggest the fMRI signature is a trait marker of vulnerability, not a state marker of current mood. But the classifier missed 28% of the remitted group who did not have the “guilt-selective anterior temporal functional connectivity changes.”

What is that, you ask? This is a set of mini-regions (i.e., not too many voxels in each) functionally connected to a right superior anterior temporal lobe seed region of interest during a contrast of guilt vs. anger feelings (selected from a number of other possible emotions) for self or best friend, based on written imaginary scenarios like “Angela [self] does act stingily towards Rachel [friend]” and “Rachel does act stingily towards Angela” conducted outside the scanner (after the fMRI session is over). Got that?

You really need to read a bunch of other articles to understand what that means, because the current paper is less than 3 pages long. Did I say that already?

The patients were previously diagnosed according to DSM-IV-TR (which was current at the time), and in remission for at least 12 months. The study was conducted by investigators from Brazil and the UK, so they didn't have to worry about RDoC, i.e. “new ways of classifying mental disorders based on behavioral dimensions and neurobiological measures” (instead of DSM-5 criteria). A “guilt-proneness” behavioral construct, along with the “guilt-selective” network of idiosyncratic brain regions, might be more in line with RDoC than past major depression diagnosis.

Could these results possibly generalize to other populations of remitted and never-depressed individuals? Well, the fMRI signature seems a bit specialized (and convoluted). And overfitting is another likely problem here...

Ideally, the [decision] tree should perform similarly on both known and unknown data.

So this one is less than ideal. [NOTE: the one that's 90% in the top figure]

These errors are due to overfitting. Our model has learned to treat every detail in the training data as important, even details that turned out to be irrelevant.
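Here is a toy version of that train-versus-test gap. It swaps the decision tree for a one-nearest-neighbor rule, which memorizes the training set outright, and compares it with a plain threshold. All of the numbers are invented, and two training labels are deliberately "wrong" to play the role of irrelevant detail.

```python
# One made-up feature per case; label 1 mostly goes with larger values.
# The points at 3.0 and 7.0 carry noisy labels on purpose.
train = [(1.0, 0), (2.0, 0), (3.0, 1), (4.0, 0),
         (6.0, 1), (7.0, 0), (8.0, 1), (9.0, 1)]
test = [(1.4, 0), (2.8, 0), (3.6, 0), (6.4, 1), (7.2, 1), (8.6, 1)]

def nearest_neighbor(x):
    """Overfit model: copy the label of the closest training point."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]

def threshold_rule(x):
    """Simple model: ignore the quirks and split at the midpoint."""
    return 1 if x > 5.0 else 0

def accuracy(model, data):
    """Fraction of cases where the model's prediction matches the label."""
    return sum(model(x) == y for x, y in data) / len(data)

nn_train = accuracy(nearest_neighbor, train)   # perfect on data it has seen
nn_test = accuracy(nearest_neighbor, test)     # worse on new data
rule_train = accuracy(threshold_rule, train)   # imperfect on training data
rule_test = accuracy(threshold_rule, test)     # holds up on new data
```

The memorizing model looks flawless on the data it was trained on and stumbles on new cases, while the cruder rule does the opposite. Only the test-set numbers say anything about how a classifier would behave on the next person through the door.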

In my next post, I'll present an unsystematic review of machine learning as applied to the classification of major depression. It's notable that Sato et al. (2015) used the word “classification” instead of “diagnosis.”3

ADDENDUM (Aug 3 2015): In the comments, I've presented more specific critiques of: (1) the leave-one-out procedure and (2) how the biomarker is temporally disconnected from when the participants identify their feeling as 'guilt' or 'anger', etc. (and why shame is more closely related to depression than guilt).

Footnotes

1 The sensitivity (true positive rate) was 73% and the specificity (true negative rate) was 85%. After correcting for confounding variables, these numbers were 77% and 70%, respectively.

2 The abstract concludes this is a “high degree of accuracy.” Not to pick on these particular authors (this is a typical study), but Dr. Dorothy Bishop explains why this is not very helpful for screening or diagnostic purposes. And what you'd really want to do here is to discriminate between treatment-resistant vs. treatment-responsive depression. If an individual does not respond to standard treatments, it would be highly beneficial to avoid a long futile period of medication trials.

3 In case you're wondering, the title of this post was based on The Dark Side of Diagnosis by Brain Scan, which is about Dr Daniel Amen. The work of the investigators discussed here is in no way, shape, or form related to any of the issues discussed in that post.
