Research

Copyright

The documents distributed here have been provided as a means to ensure
timely dissemination of scholarly and technical work on a
noncommercial basis. Copyright and all rights therein are maintained
by the authors or by other copyright holders, notwithstanding that
they have offered their works here electronically. It is understood
that all persons copying this information will adhere to the terms and
constraints invoked by each author's copyright. These works may not be
reposted without the explicit permission of the copyright holder.

Behavioural phenomena are central to psychiatric disorders. Computational
modelling allows the learning and decision-making processes underlying behaviour
to be modelled in great detail. By doing so, specific and possibly highly
complex hypotheses about the underlying processes can be directly tested on the
data. The first part of this chapter introduces Markov Decision Problems (MDPs)
as a formal framework for decision-making. It then describes several solutions
to MDPs including reinforcement learning and dynamic programming, and briefly
introduces some of their key characteristics. The second part of the chapter
provides a tutorial overview of how to use MDPs in a generative modelling
framework to test hypotheses about learning and decision-making. The final part
of the chapter discusses the methods using a few worked examples from the
literature.
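
As an illustration of the dynamic-programming solutions mentioned above, the following is a minimal sketch of value iteration on a small MDP. The two-state, two-action problem, the array layout (`P[a, s, s']`, `R[a, s]`) and the discount factor are assumptions made for this example and are not taken from the chapter.

```python
import numpy as np

def value_iteration(P, R, gamma=0.9, tol=1e-8):
    """Solve an MDP by repeated Bellman backups.

    P[a, s, s2] is the probability of moving from s to s2 under action a.
    R[a, s] is the expected immediate reward for taking action a in state s.
    Returns the optimal state values and a greedy policy.
    """
    n_actions, n_states, _ = P.shape
    V = np.zeros(n_states)
    while True:
        Q = R + gamma * (P @ V)        # action values, shape (n_actions, n_states)
        V_new = Q.max(axis=0)          # best achievable value in each state
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, Q.argmax(axis=0)
        V = V_new

# Hypothetical two-state, two-action MDP (illustration only).
P = np.array([
    [[1.0, 0.0], [0.0, 1.0]],   # action 0: stay in the current state
    [[0.0, 1.0], [1.0, 0.0]],   # action 1: switch to the other state
])
R = np.array([
    [0.0, 1.0],                 # staying pays 1 only in state 1
    [0.0, 0.0],                 # switching pays nothing
])
V, policy = value_iteration(P, R)
print(V, policy)
```

In this toy problem the greedy policy switches out of state 0 and stays in state 1, because the long-run value of state 1 outweighs the unrewarded move needed to reach it.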

Important real-world decisions are often arduous as they frequently involve
sequences of choices, with initial selections affecting future options.
Evaluating every possible combination of choices is computationally intractable,
particularly for longer multi-step decisions. Therefore, humans frequently
employ heuristics to reduce the complexity of decisions. We recently used a
goal-directed planning task to demonstrate the profound behavioral influence and
ubiquity of one such shortcut, namely aversive pruning, a reflexive Pavlovian
process that involves neglecting parts of the decision space residing beyond
salient negative outcomes. However, how the brain implements this important
decision heuristic, and what underlies individual differences in its strength
have hitherto remained unanswered. Therefore, we administered an adapted version
of the same planning task to healthy volunteers undergoing functional magnetic
resonance imaging (fMRI) to determine the neural basis of aversive pruning.
Through both computational and standard categorical fMRI analyses, we show that
when planning was influenced by aversive pruning, the subgenual cingulate cortex
was robustly recruited. This neural signature was distinct from those associated
with general planning and valuation, two fundamental cognitive components
elicited by our task but which are complementary to aversive pruning.
Furthermore, we found that individual variation in levels of aversive pruning
was associated with the responses of insula and dorsolateral prefrontal cortex
to the receipt of large monetary losses, and also with sub-clinical levels of
anxiety. In summary, our data reveal the neural signatures of an important
reflexive Pavlovian process that shapes goal-directed evaluations, and thereby
determines the outcome of high-level sequential cognitive processes.
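
The idea of neglecting the decision space beyond a salient loss can be sketched as a depth-limited tree search with an optional pruning rule. This is a deliberately simplified, deterministic version: the tree, the outcome values and the loss threshold below are hypothetical, and the sketch is not the computational model fitted in the study.

```python
def plan(tree, node, depth, large_loss=-70, prune=False):
    """Best total outcome reachable from `node` within `depth` moves.

    tree[node] is a list of (outcome, next_node) pairs.
    With prune=True, evaluation stops at any transition whose outcome is
    at or below `large_loss`, so later gains on that branch are ignored.
    """
    if depth == 0:
        return 0.0
    best = float("-inf")
    for outcome, next_node in tree[node]:
        if prune and outcome <= large_loss:
            value = outcome   # branch cut: nothing beyond the loss is considered
        else:
            value = outcome + plan(tree, next_node, depth - 1, large_loss, prune)
        best = max(best, value)
    return best

# Hypothetical task: a large loss from A to B is followed by a larger gain.
tree = {
    "A": [(-70, "B"), (-20, "C")],
    "B": [(140, "A")],
    "C": [(-20, "A")],
}
full_value = plan(tree, "A", depth=2)                # looks through the loss
pruned_value = plan(tree, "A", depth=2, prune=True)  # stops at the loss
print(full_value, pruned_value)
```

Full evaluation finds the path through the large loss, whereas the pruned planner settles for the path with smaller losses and a worse total.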

Computational psychiatry attempts to apply mathematical and computational
techniques to help improve psychiatric care. Here, we consider formal valuation
accounts of emotion. The flexibility of emotional responses and the nature of
appraisals suggest the need for a model-based framework for emotions. Resource
limitations make plain model-based valuation impossible, and require strategies
to apportion cognitive resources adaptively. We argue that emotions can
implement such approximations by restricting the range of behaviours and states
considered. We consider the processes that guide the deployment of the emotional
approximations, discerning between innate, model-free, heuristic and model-based
controllers. A focus on complex model-based decisions reveals the necessity for
strategies to deal with the complexity of the problems. Emotions may provide
such approximations, and this framework may provide a principled approach to
examining them.

Alcohol-related cues acquire incentive salience through Pavlovian conditioning
and then can markedly affect instrumental behavior of alcohol-dependent patients
to promote relapse. However, it is unclear whether similar effects occur with
alcohol-unrelated cues. We tested 116 early abstinent alcohol-dependent patients
and 91 healthy controls who completed a delay discounting task to assess choice
impulsivity, and a Pavlovian-to-instrumental transfer (PIT) paradigm employing
both alcohol-unrelated and alcohol-related stimuli. To modify instrumental
choice behavior, we tiled the background of the computer screen either with
conditioned stimuli (CS) previously generated by pairing abstract pictures with
pictures indicating monetary gains or losses, or with pictures displaying
alcohol or water beverages. CS paired with money gains and losses affected
instrumental choices differently. This PIT effect was significantly more
pronounced in patients compared to controls, and the group difference was mainly
driven by highly impulsive patients. The PIT effect was particularly strong in
trials in which the instrumental stimulus required inhibition of instrumental
response behavior and the background CS was associated with monetary gains. Under
that condition, patients performed inappropriate approach behavior, contrary to
their previously formed behavioral intention. Surprisingly, the effect of
alcohol and water pictures as background stimuli resembled that of aversive and
appetitive CS, respectively. These findings suggest that positively valenced
background CS can provoke dysfunctional instrumental approach behavior in
impulsive alcohol-dependent patients. Consequently, in real life they might be
easily seduced by environmental cues to engage in actions thwarting their
long-term goals. Such behaviors may include, but are not limited to, approaching
alcohol.

Addiction is supposed to be characterized by a shift from goal-directed to
habitual decision-making, thus facilitating automatic drug intake. The two-step
task allows distinguishing between these mechanisms by computationally modelling
goal-directed and habitual behavior as model-based and model-free control. In
addicted patients, decision-making may also strongly depend upon drug-associated
expectations. Therefore, we investigated model-based vs. model-free
decision-making and its neural correlates as well as alcohol expectancies in
alcohol-dependent patients and healthy controls and assessed treatment outcome
in patients.
Ninety detoxified, medication-free alcohol-dependent patients and 96 age- and
gender-matched controls underwent functional magnetic resonance imaging during
the two-step task. Alcohol expectancies were measured with the Alcohol
Expectancy Questionnaire. Over a follow-up period of 48 weeks, 37 patients
remained abstinent whereas 53 patients relapsed.
Patients who relapsed displayed reduced medial prefrontal cortex (mPFC)
activation during model-based decision-making. Furthermore, high alcohol
expectancies were associated with low model-based control in relapsers, while
the opposite was observed in abstainers and healthy controls. However, reduced
model-based control per se was not associated with subsequent relapse.
These findings suggest that poor treatment outcome in addicted patients does not
simply result from reduced model-based control but is rather dependent on the
interaction between high drug expectancies and low model-based decision-making.
Reduced model-based mPFC signatures in prospective relapsers point to a neural
correlate of relapse risk. These observations suggest that therapeutic
interventions should target subjective alcohol expectancies.
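
To make the distinction between model-based and model-free control concrete, the sketch below shows one common way of combining the two at the first stage of a two-step task: a weight `w` mixes model-based and model-free action values before a softmax choice rule. The function name, the weighting scheme, the inverse temperature `beta` and all numbers are assumptions for illustration; the study's own model may differ.

```python
import numpy as np

def first_stage_choice_probs(q_mf, q_stage2, transitions, w, beta=3.0):
    """Choice probabilities for the two first-stage actions.

    q_mf[a]            model-free value of first-stage action a
    q_stage2[s, a2]    learned value of action a2 in second-stage state s
    transitions[a, s]  probability that action a leads to second-stage state s
    w                  weight on model-based control (0 = model-free only)
    """
    # Model-based value: expected best second-stage value under the transition model.
    q_mb = transitions @ q_stage2.max(axis=1)
    q_net = w * q_mb + (1.0 - w) * q_mf
    z = beta * (q_net - q_net.max())   # subtract the maximum for numerical stability
    p = np.exp(z)
    return p / p.sum()

# Hypothetical values in which the two controllers disagree.
q_mf = np.array([0.8, 0.2])
q_stage2 = np.array([[0.1, 0.2], [0.9, 0.5]])
transitions = np.array([[0.7, 0.3], [0.3, 0.7]])
p_model_free = first_stage_choice_probs(q_mf, q_stage2, transitions, w=0.0)
p_model_based = first_stage_choice_probs(q_mf, q_stage2, transitions, w=1.0)
print(p_model_free, p_model_based)
```

With `w = 0` the agent favours the action that was rewarded in the past, and with `w = 1` it favours the action most likely to lead to the currently more valuable second-stage state.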

Alcohol dependence is a mental disorder which has been associated with
an imbalance in behavioral control favoring model-free habitual over
model-based goal-directed strategies. It is as yet unknown, however,
whether such an imbalance reflects a predisposing vulnerability or
results as a consequence of repeated and/or excessive alcohol exposure.
We, therefore, examined the association of alcohol consumption with
model-based goal-directed and model-free habitual control in 188
eighteen-year-old social drinkers in a two-step sequential
decision-making task while undergoing fMRI before prolonged alcohol
misuse could have led to severe neurobiological adaptations.
Behaviorally, participants showed a mixture of model-free and
model-based decision-making as observed previously. Measures of
impulsivity were positively related to alcohol consumption. In
contrast, neither model-free nor model-based decision weights nor the
tradeoff between them were associated with alcohol consumption. There
were also no significant associations between alcohol consumption and
neural correlates of model-free or model-based decision quantities in
either ventral striatum or ventromedial prefrontal cortex. Exploratory
whole-brain fMRI analyses with a lenient threshold revealed early onset
of drinking to be associated with an enhanced representation of
model-free reward prediction errors in the posterior putamen. These
results suggest that an imbalance between model-based goal-directed and
model-free habitual control might rather not be a trait marker of
alcohol intake per se.

Cognitive biases, such as the anchoring bias, pose a serious challenge to
rational accounts of human cognition. We investigate whether rational theories
can meet this challenge by taking into account the mind's bounded cognitive
resources. We asked what reasoning under uncertainty would look like if people
made rational use of their finite time and limited cognitive resources. To
answer this question, we applied a mathematical theory of bounded rationality to
the problem of numerical estimation. Our analysis led to a rational process
model that can be interpreted in terms of anchoring-and-adjustment. This model
provided a unifying explanation for ten anchoring phenomena including the
differential effect of accuracy motivation on the bias towards provided versus
self-generated anchors. Our results illustrate the potential of
resource-rational analysis to provide formal theories that can unify a wide
range of empirical results and reconcile the impressive capacities of the human
mind with its apparently irrational cognitive biases.

People's estimates of numerical quantities are systematically biased towards
their initial guess. This anchoring bias is usually interpreted as a sign of human
irrationality, but it has recently been suggested that the anchoring bias
instead results from people's rational use of their finite time and limited
cognitive resources. If this were true, then adjustment should decrease with the
relative cost of time. To test this hypothesis, we designed a new numerical
estimation paradigm that controls people's knowledge and varies the cost of time
and error independently while allowing people to invest as much or as little
time and effort into refining their estimate as they wish. Two experiments
confirmed the prediction that adjustment decreases with time cost but increases
with error cost regardless of whether the anchor was self-generated or provided.
These results support the hypothesis that people rationally adapt their number
of adjustments to achieve a near-optimal speed-accuracy tradeoff. This suggests
that the anchoring bias might be a signature of the rational use of finite time
and limited cognitive resources rather than a sign of human irrationality.

Using simple mathematical models of choice behavior, we present a Bayesian
adaptive algorithm to assess measures of impulsive and risky decision making.
Practically, these measures are characterized by discounting rates and are used
to classify individuals or population groups, to distinguish unhealthy behavior,
and to predict developmental courses. However, a constant demand for improved
tools to assess these constructs remains unanswered. The algorithm is based on
trial-by-trial observations. At each step, a choice is made between immediate
(certain) and delayed (risky) options. Then the current parameter estimates are
updated by the likelihood of observing the choice, and the next offers are
provided from the indifference point, so that they will acquire the most
informative data based on the current parameter estimates. The procedure
continues for a certain number of trials in order to reach a stable estimation.
The algorithm is discussed in detail for the delay discounting case, and results
from decision making under risk for gains, losses, and mixed prospects are also
provided. Simulated experiments using prescribed parameter values were performed
to justify the algorithm in terms of the reproducibility of its parameters for
individual assessments, and to test the reliability of the estimation procedure
in a group-level analysis. The algorithm was implemented as an experimental
battery to measure temporal and probability discounting rates together with loss
aversion, and was tested on a healthy participant sample.
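
The trial-by-trial procedure described above can be sketched for the delay discounting case as follows. The hyperbolic discounting form, the logistic choice rule with a fixed slope `beta`, the grid over discount rates, the fixed delayed amount and the simulated participant are all assumptions made for this example; they are not taken from the paper.

```python
import numpy as np

def p_choose_delayed(k, immediate, delayed, delay, beta=0.5):
    """Probability of choosing the delayed option under hyperbolic discounting."""
    v_delayed = delayed / (1.0 + k * delay)
    return 1.0 / (1.0 + np.exp(-beta * (v_delayed - immediate)))

def adaptive_estimate(true_k, n_trials=60, delayed=100.0, seed=0):
    """Estimate a discount rate by offering choices at the current indifference point."""
    rng = np.random.default_rng(seed)
    k_grid = np.logspace(-4, 0, 200)                     # candidate discount rates
    posterior = np.full(k_grid.size, 1.0 / k_grid.size)  # flat prior on the log-spaced grid
    delays = [7, 30, 90, 180, 365]                       # delays in days (hypothetical)
    for t in range(n_trials):
        # Current point estimate: posterior mean of log k.
        k_hat = np.exp(np.sum(posterior * np.log(k_grid)))
        delay = delays[t % len(delays)]
        # Offer the immediate amount that k_hat predicts to be the indifference point.
        immediate = delayed / (1.0 + k_hat * delay)
        # Simulated participant with discount rate true_k makes a choice.
        chose_delayed = rng.random() < p_choose_delayed(true_k, immediate, delayed, delay)
        # Bayesian update: weight each candidate k by the likelihood of that choice.
        like = p_choose_delayed(k_grid, immediate, delayed, delay)
        posterior *= like if chose_delayed else 1.0 - like
        posterior /= posterior.sum()
    return np.exp(np.sum(posterior * np.log(k_grid)))

estimate = adaptive_estimate(true_k=0.05)
print(estimate)
```

Offering choices near the estimated indifference point is what makes each trial informative: a choice there shifts the posterior towards higher or lower discount rates, much like a noisy bisection search.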

Background: A substantial proportion of the burden of depression arises from its
recurrent nature. The risk of relapse after antidepressant (ADM) discontinuation
is high but not uniform. Predictors of individual relapse risk after
antidepressant discontinuation could help to guide treatment and mitigate the
long-term course of depression.

Methods: We conducted a systematic literature search in Pubmed to identify
relapse predictors using the search terms "(depress* OR MDD*) AND (relapse* OR
recurren*) AND (predict* OR risk) AND (discontinu* OR withdraw* OR maintenance
OR maintain or continu*) AND (antidepress* OR medication OR drug)" for published
studies until November 2014. Studies investigating predictors of relapse in
patients aged between 18 and 65 with a main diagnosis of Major Depressive
Disorder (MDD) who remitted from a depressive episode while treated with
antidepressant medication and were followed up for at least 6 months to assess
relapse after part of the sample discontinued their ADM, were included in the
review.

Results: Although relevant information is present in many studies, only thirteen
studies based on nine separate samples investigated predictors for relapse after
ADM discontinuation. There are multiple promising predictors, including markers
of true treatment response and the number of prior episodes. However, the
existing evidence is weak and there are no established, validated markers of
individual relapse risk after antidepressant cessation.

Conclusion: There is little evidence to guide discontinuation decisions in an
individualized manner beyond overall recurrence risk. Thus, there is a pressing
need to investigate neurobiological markers of individual relapse risk, focusing
on treatment discontinuation.

The burden of depression is substantially aggravated by relapses and
recurrences, and these become more inevitable with every episode of depression.
This chapter first describes how computational psychiatry can provide a
normative framework for emotions that might provide an integrative approach to
core cognitive components of depression and relapse. At the heart of this
account is the notion that emotions effectively imply a valuation, and that they
are therefore amenable to description and dissection by reinforcement-learning
methods. It is argued that cognitive accounts of emotion can be viewed in terms
of model-based valuation, and that automatic emotional responses relate to
model-free valuation and the innate recruitment of fixed behavioural patterns.
The model-based view captures phenomena such as helplessness, hopelessness,
attributions and stress sensitization. Considering it in more atomic algorithmic
detail opens up the possibility of viewing rumination and emotion regulation in
this same normative framework, too. The chapter then briefly outlines the
problem of treatment selection for relapse and recurrence prevention, and then
suggests ways in which the computational framework of emotions might help in
improving this. The discussion closes with a very brief general overview of
what we can hope to gain from computational psychiatry.

Psychiatry faces a number of challenges, among them the reconceptualization
of symptoms and diagnoses, disease prevention, treatment development and
monitoring of its effects, and the provision of individualized, precision
medicine. Achieving these goals will require an increase in the biological,
quantitative, and theoretical grounding of psychiatry. To address these
challenges, psychiatry must confront the complexity and heterogeneity intrinsic
to the nature of brain disorders. This chapter seeks to identify the sources of
complexity and heterogeneity as a means of confronting the challenges facing the
field. These sources include the interplay between genetic and epigenetic
factors with the environment and their impact on neural circuits. Moreover,
these interactions are expressed dynamically over the course of development and
continue to play out during the disease process and treatment.

We propose that computational approaches provide a framework for addressing the
complexity and heterogeneity that underlie the challenges facing psychiatry.
Central to our argument is the idea that these characteristics are not noise to
be eliminated from diagnosis and treatment of disorders. Instead, such
complexity and heterogeneity arise from intrinsic features of brain function
and, therefore, represent opportunities for computational models to provide a
more accurate biological foundation for diagnosis and treatment of psychiatric
disorders. The challenges to be addressed by a computational framework include
the following. First, it must improve the search for risk factors and
biomarkers, which can be used toward primary prevention of disease. Second, it
must help to represent the biological ground truth of psychiatric disorders,
which will improve the accuracy of diagnostic categories, assist in discovering
new treatments, and aid in precision medicine. Third, to be useful for secondary
prevention, it must represent how risk factors, biomarkers, and the underlying
biology change through the course of development, disease progression, and
treatment process.

Computational psychiatry is a young field that aims to further our understanding of mental illness and its treatment with the use of novel computational techniques. The present issue provides an overview of the breadth of the field. On the one hand, computational techniques can be used to provide mechanistic insight into illnesses. This is exemplified by contributions applying Bayesian and reinforcement-learning techniques to schizophrenia, methamphetamine and alcohol use disorders. On the other hand, mechanistically agnostic techniques can directly infer information relevant to treatment. Examples in the issue include prediction of depression treatment responses with EEG and response prediction with fMRI. The issue concludes with a novel way to address heterogeneity, and finally with a proposal to adopt a developmental pathway akin to that in drug development to ensure computational psychiatry fulfills its promise to improve patient outcomes.

Dopamine potentially unites two important roles: one in addiction, being
involved in most substances of abuse including alcohol, and a second one in a
specific type of learning, namely model-free temporal-difference reinforcement
learning. Theories of addiction have long suggested that drugs of abuse may
usurp dopamine's role in learning. We here briefly review the preclinical
literature to motivate specific hypotheses about model-free temporal-difference
learning, and then review the imaging evidence in the drug of abuse with the
most substantial societal consequences: alcohol. Despite the breadth of the
literature, only very few studies have examined the predictions directly, and
these provide at best inconclusive evidence for the involvement of
temporal-difference learning alterations in alcohol dependence. We discuss the
difficulties of testing the theory, make specific suggestions and close with a
focus on the interaction with other learning mechanisms.
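
For readers unfamiliar with model-free temporal-difference learning, a minimal TD(0) sketch is given here. The cue and outcome states, the learning rate and the episode structure are hypothetical and serve only to show how the prediction error drives value learning; nothing in the sketch is specific to the review.

```python
def td0(episodes, alpha=0.1, gamma=1.0):
    """Learn state values from experience with the TD(0) rule.

    Each episode is a list of (state, reward, next_state) steps;
    next_state is None when the episode ends.
    """
    V = {}
    for episode in episodes:
        for state, reward, next_state in episode:
            v_next = 0.0 if next_state is None else V.get(next_state, 0.0)
            # Prediction error: what was received plus what is now expected,
            # minus what was expected before.
            delta = reward + gamma * v_next - V.get(state, 0.0)
            V[state] = V.get(state, 0.0) + alpha * delta
    return V

# Hypothetical conditioning experiment: a cue is always followed by a reward of 1.
episodes = [[("cue", 0.0, "outcome"), ("outcome", 1.0, None)]] * 500
values = td0(episodes)
print(values)
```

After repeated pairings the value of the reward propagates back to the cue, so the cue itself comes to predict the full reward.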

Background: Computational psychiatry is a burgeoning field that utilizes
mathematical approaches to investigate psychiatric disorders, derive
quantitative predictions, and integrate data across multiple levels of
description. Computational psychiatry has already led to many new insights into
the neurobehavioral mechanisms that underlie several psychiatric disorders, but
its usefulness from a clinical standpoint is only now starting to be considered.
Methods: Examples of computational psychiatry are highlighted, and a phase-based
pipeline for the development of clinical computational-psychiatry applications
is proposed, similar to the phase-based pipeline used in drug development. It is
proposed that each phase has unique endpoints and deliverables, which will be
important milestones to move tasks, procedures, computational models, and
algorithms from the laboratory to clinical practice.
Results: Application of computational approaches should be tested on healthy
volunteers in Phase I, transitioned to target populations in Phase IB and Phase
IIA, and thoroughly evaluated using randomized clinical trials in Phase IIB and
Phase III. Successful completion of these phases should be the basis of
determining whether computational models are useful tools for prognosis,
diagnosis, or treatment of psychiatric patients.
Conclusions: A new type of infrastructure will be necessary to implement the
proposed pipeline. This infrastructure should consist of groups of investigators
with diverse backgrounds collaborating to make computational psychiatry relevant
for the clinic.

Neuroimaging increasingly exploits machine learning techniques in an attempt to
achieve clinically relevant single-subject predictions. An alternative to
machine learning, which tries to establish predictive links between features of
the observed data and clinical variables, is the deployment of computational
models for inferring on the (patho)physiological and cognitive mechanisms that
generate behavioural and neuroimaging responses. This paper discusses the
rationale behind a computational approach to neuroimaging-based single-subject
inference, focusing on its potential for characterising disease mechanisms in
individual subjects and mapping these characterisations to clinical predictions.
Following an overview of two main approaches - Bayesian model selection and
generative embedding - which can link computational models to individual
predictions, we review how these methods accommodate heterogeneity in
psychiatric and neurological spectrum disorders, help avoid erroneous
interpretations of neuroimaging data, and establish a link between a
mechanistic, model-based approach and the statistical perspectives afforded by
machine learning.

Behavioral choice can be characterized along two axes. One axis distinguishes
reflexive, model-free systems that slowly accumulate values through experience
and a model-based system that uses knowledge to reason prospectively. The second
axis distinguishes Pavlovian valuation of stimuli from instrumental valuation of
actions or stimulus-action pairs. This results in a quartet of values and many
possible interactions between them, with important consequences for accounts of
individual variation. We here explored whether individual variation along one
axis was related to individual variation along the other. Specifically, we
asked whether individuals' balance between model-based and model-free learning
was related to their tendency to show Pavlovian interferences with instrumental
decisions. In two independent samples with a total of 243 subjects,
Pavlovian-instrumental transfer effects were negatively correlated with the
strength of model-based reasoning in a two-step task. This suggests a potential
common underlying substrate predisposing individuals to both have strong
Pavlovian interference and be less model-based, and provides a framework within
which to interpret the observation of both effects in addiction.

Exploration-exploitation of functions, that is learning and optimizing a mapping
between inputs and expected outputs, is ubiquitous in many real-world
situations. These situations sometimes require us to avoid certain outcomes at
all cost, for example because they are poisonous, harmful, or otherwise
dangerous. We test participants' behavior in scenarios in which they have to
find the optimum of a function while at the same time avoid outputs below a
certain threshold. In two experiments, we find that Safe-Optimization, a
Gaussian Process-based exploration-exploitation algorithm, describes
participants' behavior well and that participants seem to care firstly whether a
point is safe and then try to pick the optimal point from all such safe points.
This means that their trade-off between exploration and exploitation can be seen
as an intelligent, approximate, and homeostasis-driven strategy.

Background: The Cognitive Style Questionnaire is a valuable tool for
the assessment of hopeless cognitive styles in depression research, with
predictive power in longitudinal studies. Even the short form is still long, and
neither this nor the original version exists in a validated German translation.

Methods: The questionnaire was translated from English to German,
back-translated and commented on by clinicians. The reliability, factor
structure and external validity of an online form of the questionnaire were
examined on 214 participants. External validity was measured on a subset of 90
subjects.

Results: The resulting CSQ-SF-D had good to excellent reliability, both
across items and subscales, and similar external validity to the original
English version. The internality subscale appeared less robust than other
subscales. A detailed analysis of individual item performance suggests that
stable results could be achieved with a very short form (CSQ-VSF-D) including
only 27 items.

Conclusions: The CSQ-SF-D is a validated and freely distributed
translation of the CSQ-SF into German. This should make efficient assessment of
cognitive style in German samples more accessible to researchers.

Translating advances in neuroscience into benefits for patients with mental
illness presents enormous challenges because it involves both the most complex
organ, the brain, and its interaction with a similarly complex environment.
Dealing with such complexities demands powerful techniques. Computational
psychiatry combines multiple levels and types of computation with multiple types
of data in an effort to improve understanding, prediction, and treatment of
mental illness. Computational psychiatry, broadly defined, encompasses two
complementary approaches: data-driven and theory-driven. Data-driven approaches
apply machine-learning methods to high-dimensional data to improve
classification of disease, predict treatment outcomes, or improve
treatment selection. These approaches are generally agnostic as to the
underlying mechanisms. Theory-driven approaches, in contrast, use models that
instantiate prior knowledge of, or explicit hypotheses about, such mechanisms,
possibly at multiple levels of analysis and abstraction. We review recent
advances in both approaches, with an emphasis on clinical applications, and
highlight the utility of combining them.

Background: Changes in reflexive emotional responses are hallmarks of
depression, but how emotional reflexes impact on adaptive decision-making in
depression has not been examined formally. Using a Pavlovian-Instrumental
Transfer (PIT) task, we compared the influence of affectively valenced stimuli
on decision-making in depression and generalized anxiety disorder with that in
healthy controls, and related this to the longitudinal course of the illness.

Methods: Forty subjects with a current DSM-IV-TR diagnosis of Major
Depressive Disorder, Dysthymia, Generalised Anxiety Disorder or a combination
thereof, and 40 matched healthy controls performed a Pavlovian-instrumental
transfer (PIT) task that assesses how instrumental approach and withdrawal
behaviours are influenced by appetitive and aversive Pavlovian conditioned
stimuli. Patients were followed up after 4-6 months. Analyses focussed on
patients with depression alone (n=25).

Contemporary psychiatry faces major challenges. Its phenomenological nosology
continues to lack mechanistic interpretability and predictive guidance;
treatment largely depends on trial and error; drug development is impeded
through ignorance of potential beneficiary subgroups; neuroscientific and
genetics research has yet to impact disease definitions or to contribute to
clinical decision-making. In this parlous setting, what should psychiatric
research focus on?
In two companion papers, we present a list of concrete problems nominated by
clinicians and researchers from different disciplines as candidates for future
scientific investigation of mental disorders. These problems are loosely grouped
into challenges concerning nosology and diagnosis (this article) and those
addressing pathogenesis and aetiology (in the companion article). Motivated by
successful examples in other disciplines (particularly the famous list of
"Hilbert's problems" in mathematics), this subjective and eclectic list of
priority problems is intended to provide inspiration for the field, helping to
refocus existing research and providing perspectives for future psychiatric
science.

This is the second of two companion papers proposing priority problems for
research on mental disorders. Whereas the first article focuses on questions of
nosology and diagnosis, the challenges articulated in this paper concern
pathogenesis and aetiology of psychiatric diseases. We hope that this
(non-exhaustive and subjective) list of problems, nominated by scientists and
clinicians from different fields and institutions, provides guidance and
perspectives for choosing future directions in psychiatric science.

Computational Psychiatry aims to describe the relationship between the brain's
neurobiology, its environment and mental symptoms in computational terms. In so
doing, it may improve psychiatric classification and the diagnosis and treatment
of mental illness. It can unite many levels of description in a mechanistic and
rigorous fashion, whilst avoiding biological reductionism and artificial
categorization. We describe how computational models of cognition can infer the
current state of the environment and weigh up future actions, and how these
models provide new perspectives on two example disorders, depression and
schizophrenia. Reinforcement learning describes how the brain can choose and
value courses of actions according to their long-term future value. Some
depressive symptoms may result from aberrant valuations, which could arise from
prior beliefs about the loss of agency ("helplessness"), or from an inability to
inhibit the mental exploration of aversive events. Predictive coding explains
how the brain might perform Bayesian inference about the state of its
environment by combining sensory data with prior beliefs, each weighted
according to their certainty (or precision). Several cortical abnormalities in
schizophrenia might reduce precision at higher levels of the inferential
hierarchy, biasing inference towards sensory data and away from prior beliefs.
We discuss whether striatal hyperdopaminergia might have an adaptive function in
this context, and also how reinforcement learning and incentive salience models
may shed light on the disorder. Finally, we review some of Computational
Psychiatry's applications to neurological disorders such as Parkinson's disease,
and some pitfalls to avoid when applying its methods.
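
As a minimal illustration of the precision weighting described above, the
following sketch combines a prior belief with one sensory sample. It assumes
Gaussian beliefs, and the function name and numbers are hypothetical rather
than taken from the paper. Lowering the precision of the prior pulls the
inference towards the sensory data.

```python
def combine(prior_mean, prior_precision, obs, obs_precision):
    """Posterior mean and precision for a Gaussian prior and one Gaussian observation."""
    post_precision = prior_precision + obs_precision
    # Each source is weighted by its share of the total precision.
    post_mean = (prior_precision * prior_mean + obs_precision * obs) / post_precision
    return post_mean, post_precision


# Hypothetical numbers: the prior belief says 0.0, the sensory sample says 1.0.
balanced, _ = combine(0.0, 4.0, 1.0, 4.0)    # equal precision: estimate midway
weak_prior, _ = combine(0.0, 1.0, 1.0, 4.0)  # reduced prior precision: estimate nearer the data
print(balanced, weak_prior)
```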

Major Depressive Disorder (MDD) is clinically, and likely
pathophysiologically, heterogeneous. A potentially fruitful approach to parsing
this heterogeneity is to focus on promising endophenotypes. Guided by the NIMH
Research Domain Criteria (RDoC) initiative, we used source localization of
scalp-recorded EEG resting data to examine the neural correlates of three
emerging endophenotypes of depression: neuroticism, blunted reward learning and
cognitive control deficits. Data were drawn from the ongoing multi-site EMBARC
study. We estimated intracranial current density for standard EEG frequency
bands in 82 unmedicated adults with MDD, using Low-Resolution Brain
Electromagnetic Tomography (LORETA). Region-of-interest and whole-brain
analyses tested associations between resting state EEG current density and
endophenotypes of interest. Neuroticism was associated with increased resting
gamma (36.5-44 Hz) current density in the ventral (subgenual) anterior cingulate
cortex (ACC) and orbitofrontal cortex (OFC). In contrast, reduced cognitive
control correlated with decreased gamma activity in the left dorsolateral
prefrontal cortex (dlPFC), decreased theta (6.5-8 Hz) and alpha2 (10.5-12 Hz)
activity in the dorsal ACC, and increased alpha2 activity in the right dlPFC.
Finally, blunted reward learning correlated with lower OFC and left dlPFC gamma
activity. Computational modeling of trial-by-trial reinforcement learning
further indicated that lower OFC gamma activity was linked to reduced reward
sensitivity. Three putative endophenotypes of depression were found to have
partially dissociable resting intracranial EEG correlates, reflecting different
underlying neural dysfunctions. Overall, these findings highlight the need to
parse the heterogeneity of MDD by focusing on promising endophenotypes linked to
specific pathophysiological abnormalities.

In detoxified alcohol-dependent patients, alcohol-related stimuli can promote
relapse. However, to date, the mechanisms by which contextual stimuli promote
relapse have not been elucidated in detail. One hypothesis is that such
contextual stimuli directly stimulate the motivation to drink via associated
brain regions like the ventral striatum and thus promote alcohol seeking, intake
and relapse. Pavlovian-to-Instrumental-Transfer (PIT) may be one of those
behavioral phenomena contributing to relapse, capturing how Pavlovian
conditioned (contextual) cues determine instrumental behavior (e.g. alcohol
seeking and intake). We used a PIT paradigm during functional magnetic resonance
imaging to examine the effects of classically conditioned Pavlovian stimuli on
instrumental choices in n=31 detoxified patients diagnosed with alcohol
dependence and n=24 healthy controls matched for age and gender. Patients were
followed up over a period of 3 months. We observed that (1) there was a
significant behavioral PIT effect for all participants, which was significantly
more pronounced in alcohol-dependent patients; (2) PIT was significantly
associated with blood oxygen level-dependent (BOLD) signals in the nucleus
accumbens (NAcc) in subsequent relapsers only; and (3) PIT-related NAcc
activation was associated with, and predictive of, critical outcomes (amount of
alcohol intake and relapse during a 3-month follow-up period) in
alcohol-dependent patients. These observations show for the first time that
PIT-related BOLD signals, as a measure of the influence of Pavlovian cues on
instrumental behavior, predict alcohol intake and relapse in alcohol dependence.

Psychiatric disorders profoundly impair many aspects of decision-making. Poor
choices have negative consequences in the moment, and also make it very hard
to navigate complex social environments. Computational neuroscience provides
normative, neurobiologically informed, descriptions of the components of
decision-making which serve as a platform for a principled exploration of
dysfunctions. Here, we identify and discuss three classes of failure-modes
arising in these formalisms. They stem from abnormalities in the framing of
problems or tasks, from the mechanisms of cognition used to solve the tasks, or
from the historical data available from the environment.

Good decisions reflect the past and improve the future. Trouble ensues when the
rewards of the past conflict with the goals of the future. New goals often fail
to override behaviors reinforced in the past, for instance, when patients with
obsessive-compulsive disorder (OCD) try to stop the counting behavior that has
been their main strategy to avoid aversive intrusions. [...]

In this issue of the Journal, Gillan et al. (5) add to an impressive catalog
of studies on the imbalance between habitual and goal-directed decisions in
OCD.

Humans routinely formulate plans in domains so complex that even the most
powerful computers are taxed. To do so, they seem to avail themselves of many
strategies and heuristics that efficiently simplify, approximate, and
hierarchically decompose hard tasks into simpler subtasks. Theoretical and
cognitive research has revealed several such strategies; however, little is
known about their establishment, interaction, and efficiency. Here, we use
model-based behavioral analysis to provide a detailed examination of the
performance of human subjects in a moderately deep planning task. We find that
subjects exploit the structure of the domain to establish subgoals in a way that
achieves a nearly maximal reduction in the cost of computing values of choices,
but then combine partial searches with greedy local steps to solve subtasks, and
maladaptively prune the decision trees of subtasks in a reflexive manner upon
encountering salient losses. Subjects come idiosyncratically to favor particular
sequences of actions to achieve subgoals, creating novel complex actions or
"options."

Dual system theories suggest that behavioral control is parsed between a
deliberative `model-based' and a more reflexive `model-free' system. A balance
of control exerted by these systems is thought to be related to dopamine
neurotransmission. However, in the absence of direct measures of human dopamine,
it remains unknown whether this reflects a quantitative relation with dopamine
either in the striatum or other brain areas. Using a sequential decision task
performed during fMRI, combined with striatal measures of dopamine using
[18F]DOPA PET, we show that higher presynaptic ventral striatal dopamine levels
were associated with a behavioral bias towards more model-based control. Higher
presynaptic dopamine in ventral striatum was associated with greater coding of
model-based information in lateral prefrontal cortex and diminished coding of
model-free prediction errors in ventral striatum. Thus, inter-individual
variability in ventral striatal presynaptic dopamine reflects a balance in the
behavioral expression and the neural signatures of model-free and model-based
control. Our data provide a novel perspective on how alterations in presynaptic
dopamine levels might be accompanied by a disruption of behavioral control as
observed in aging or neuropsychiatric diseases such as schizophrenia and
addiction.

The manifold symptoms of depression are a common and often transient feature of
healthy life that are likely to be adaptive in difficult circumstances. It is
when these symptoms enter a seemingly self-propelling spiral that the
maladaptive features of a disorder emerge. We examine this malignant
transformation from the perspective of the computational neuroscience of
decision making, investigating how dysfunction of the brain's mechanisms of
evaluation might lie at its heart. We start by considering the behavioural
implications of pessimistic evaluations of decision variables. We then provide a
selective review of work suggesting how such pessimism might arise via specific
failures of the mechanisms of evaluation or state estimation. Finally, we
analyze ways that miscalibration between the subject and environment may be
self-perpetuating. We employ the formal framework of Bayesian decision theory
as a foundation for this study, showing how most of the problems arise from one
of its broad algorithmic facets, namely model-based reasoning.

Theories of decision-making and its neural substrates have long assumed the
existence of two distinct and competing valuation systems, variously described
as goal-directed versus habitual, or, more recently and based on statistical
arguments, as model-free versus model-based reinforcement-learning. Though both
have been shown to control choices, the cognitive abilities associated with
these systems are under ongoing investigation. Here we examine the link to
cognitive abilities, and find that individual differences in processing speed
covary with a shift from model-free to model-based choice control in the
presence of above-average working memory function. This suggests shared
cognitive and neural processes; provides a bridge between literatures on
intelligence and valuation; and may guide the development of process models of
different valuation components. Furthermore, it provides a rationale for
individual differences in the tendency to deploy valuation systems, which may be
important for understanding the manifold neuropsychiatric diseases associated
with malfunctions of valuation.

Drugs of abuse elicit dopamine release in ventral striatum, possibly biasing
dopamine-driven reinforcement learning towards drug-related reward at the
expense of non-drug-related reward. Indeed, reactivity in dopaminergic target
areas of patients with alcohol dependence is shifted from non-drug-related
stimuli towards drug-related stimuli. Such `hijacked' dopamine signals may
impair flexible learning from non-drug-related rewards and thus promote craving
for the drug of abuse. Here, we used fMRI to measure ventral striatal
activation by reward prediction errors (RPEs) during a probabilistic reversal
learning task in recently detoxified alcohol-dependent patients and healthy
controls (N=27). The same subjects also underwent FDOPA PET to assess ventral
striatal dopamine synthesis capacity. Neither ventral striatal activation by
RPEs, nor striatal dopamine synthesis capacity differed between groups.
However, ventral striatal coding of RPEs was negatively correlated with
craving in patients. Furthermore, we found a negative correlation between
ventral striatal coding of RPEs and dopamine synthesis capacity in healthy
controls, but not in alcohol-dependent patients. Moderator analyses showed
that the magnitude of the association between RPE coding and dopamine
synthesis capacity depended on the amount of chronic-habitual alcohol intake.
Given the relatively small sample size, a power analysis showed that these
results are rather unlikely to have arisen by chance. Using a multimodal
imaging approach, this study suggests that dopaminergic modulation of neural
learning signals is disrupted in alcohol dependence and this is linked to
long-term alcohol intake of patients. Alcohol intake may perpetuate itself by
interfering with dopaminergic modulation of neural learning signals in
ventral striatum, thus increasing craving for habitual drug intake.

Fluid intelligence (fluid IQ), defined as the capacity for rapid problem solving
and behavioral adaptation, is known to be modulated by learning and experience.
Both stressful life events (SLES) and neural correlates of learning
[specifically, a key mediator of adaptive learning in the brain, namely the
ventral striatal representation of prediction errors (PE)] have been shown to be
associated with individual differences in fluid IQ. Here, we examine the
interaction between adaptive learning signals (using a well-characterized
probabilistic reversal learning task in combination with fMRI) and SLES on fluid
IQ measures. We find that the correlation between ventral striatal BOLD PE and
fluid IQ, which we have previously reported, is quantitatively modulated by the
amount of reported SLES. Thus, after experiencing adversity, basic neuronal
learning signatures appear to align more closely with a general measure of
flexible learning (fluid IQ), a finding complementing studies on the effects of
acute stress on learning. The results suggest that an understanding of the
neurobiological correlates of trait variables like fluid IQ needs to take
socioemotional influences such as chronic stress into account.

Instrumental decision making has long been argued to be vulnerable to emotional
responses. Literature on multiple decision making systems suggests that this
emotional biasing might reflect effects of a system that regulates innately
specified, evolutionarily preprogrammed responses. To test this hypothesis
directly, we investigated whether effects of emotional faces on instrumental
action can be predicted by effects of emotional faces on bodily freezing, an
innately specified response to aversive relative to appetitive cues. We tested
43 women using a novel emotional decision making task combined with
posturography, which involves a force platform to detect small oscillations of
the body to accurately quantify postural control in upright stance. On the
platform, participants learned whole body approach-avoidance actions based on
monetary feedback, while being primed by emotional faces (angry/happy). Our data
evidence an emotional biasing of instrumental action. Thus, angry relative to
happy faces slowed instrumental approach relative to avoidance responses.
Critically, individual differences in this emotional biasing effect were
predicted by individual differences in bodily freezing. This result suggests
that emotional biasing of instrumental action involves interaction with a system
that controls innately specified responses. Furthermore, our findings help
bridge (animal and human) decision making and emotion research to advance our
mechanistic understanding of decision making anomalies in daily encounters as
well as in a wide range of psychopathology.

Background: Pavlovian processes are thought to play an important role in the
development, maintenance and relapse of alcohol dependence, possibly by
influencing and usurping on-going thought and behavior. The influence of
Pavlovian stimuli on on-going behavior is paradigmatically measured by
Pavlovian-to-instrumental-transfer (PIT) tasks. These involve multiple stages
and are complex. Whether increased PIT is involved in human alcohol dependence
is uncertain. We therefore aimed to establish and validate a modified PIT
paradigm that would be robust, consistent, and tolerated by healthy controls as
well as by patients suffering from alcohol dependence, and to explore whether
alcohol dependence is associated with enhanced Pavlovian-Instrumental
transfer.
Methods: 32 recently detoxified alcohol-dependent patients and 32 age and gender
matched healthy controls performed a PIT task with instrumental go/no-go
approach behaviours. The task involved both Pavlovian stimuli associated with
monetary rewards and losses, and images of drinks.
Results: Both patients and healthy controls showed a robust and temporally
stable PIT effect. Strengths of PIT effects to drug-related and monetary
conditioned stimuli were highly correlated. Patients more frequently showed a
PIT effect and the effect was stronger in response to aversively conditioned CSs
(conditioned suppression), but there was no group difference in response to
appetitive CSs.
Conclusion: The implementation of PIT has favorably robust properties in chronic
alcohol-dependent patients and in healthy controls. It shows internal
consistency between monetary and drug-related cues. The findings support an
association of alcohol dependence with an increased propensity towards PIT.

Background: Human and animal work suggests a shift from
goal-directed to habitual decision-making in addiction. However, the evidence
for this in human alcohol dependence is as yet inconclusive.
Methods: Twenty-six healthy controls and twenty-six recently
detoxified alcohol-dependent patients underwent behavioral testing with a
two-step task designed to disentangle goal-directed and habitual response
patterns.
Results: Alcohol-dependent patients showed less evidence of
goal-directed choices than healthy controls, particularly after losses. There
was no difference in the strength of the habitual component. The group
differences did not survive controlling for performance on the digit symbol
substitution task.
Conclusion: Chronic alcohol use appears to selectively impair
goal-directed function, rather than promoting habitual responding. It appears to
do so particularly after non-rewards, and this may be mediated by the effects of
alcohol on more general cognitive functions subserved by the prefrontal
cortex.

Optimists hold positive a priori beliefs about the future. In Bayesian
statistical theory, a priori beliefs can be overcome by experience. However,
optimistic beliefs can at times appear surprisingly resistant to evidence,
suggesting that optimism might also influence how new information is selected
and learned. Here, we use a novel Pavlovian conditioning task, embedded in a
normative framework, to directly assess how trait optimism, as classically
measured using self-report questionnaires, influences choices between visual
targets, as learning about their association with reward progresses. We find
that trait optimism relates to an a priori belief about the likelihood of
rewards, but not losses, in our task. Critically, this positive belief behaves
like a probabilistic prior, i.e. its influence reduces with increasing
experience. Contrary to findings in the literature related to unrealistic
optimism and self-beliefs, it does not appear to influence the iterative
learning process directly.
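
The statement that a probabilistic prior loses influence with experience can
be sketched with a Beta prior over reward probability. This is an illustrative
assumption, not the model used in the study, and the prior parameters are
hypothetical. The gap between an optimistic and a flat prior shrinks as
observations accumulate.

```python
def posterior_mean(prior_a, prior_b, rewards, trials):
    """Mean of the Beta posterior over reward probability after `trials` outcomes."""
    return (prior_a + rewards) / (prior_a + prior_b + trials)


def prior_gap(trials, reward_rate=0.5):
    """Difference between optimistic and flat-prior estimates given identical data."""
    rewards = reward_rate * trials
    optimist = posterior_mean(4.0, 1.0, rewards, trials)  # hypothetical optimistic prior
    neutral = posterior_mean(1.0, 1.0, rewards, trials)   # flat prior
    return optimist - neutral


# The same evidence is seen after 0, 10 and 100 trials.
gaps = [prior_gap(n) for n in (0, 10, 100)]
print(gaps)
```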

Reinforcement learning (RL) techniques are a set of solutions for optimal
long-term action choice such that actions take into account both immediate and
delayed consequences. They fall into two broad classes. Model-based approaches
assume an explicit model of the environment and the agent. The model describes
the consequences of actions and the associated returns. From this, optimal
policies can be inferred. Psychologically, model-based descriptions apply to
goal-directed decisions, in which choices reflect current preferences over
outcomes. Model-free approaches forgo any explicit knowledge of the dynamics of
the environment or the consequences of actions and evaluate how good actions are
through trial-and-error learning. Model-free values underlie habitual and
Pavlovian conditioned responses that are emitted reflexively when faced with
certain stimuli. While model-based techniques have substantial computational
demands, model-free techniques require extensive experience.
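
The contrast between the two classes can be sketched on a toy problem. The
example below is an illustration only: the three-state task, its rewards and
all names are hypothetical. Value iteration stands in for a model-based
solution and tabular Q-learning for a model-free one. Both arrive at the same
action values, one by computing over a known model, the other by accumulating
experience.

```python
import random

# Hypothetical toy task: TRANSITIONS[state][action] = (next_state, reward).
# State 2 is terminal. The better first move (action 1) pays off only later.
TRANSITIONS = {
    0: {0: (2, 1.0), 1: (1, 0.0)},
    1: {0: (2, 0.0), 1: (2, 5.0)},
}
TERMINAL = 2
GAMMA = 0.9  # discount factor


def empty_q():
    return {s: {a: 0.0 for a in acts} for s, acts in TRANSITIONS.items()}


def model_based_values(sweeps=50):
    """Value iteration: plan using full knowledge of TRANSITIONS."""
    q = empty_q()
    for _ in range(sweeps):
        for s, acts in TRANSITIONS.items():
            for a, (nxt, r) in acts.items():
                future = 0.0 if nxt == TERMINAL else max(q[nxt].values())
                q[s][a] = r + GAMMA * future
    return q


def model_free_values(episodes=2000, alpha=0.5, seed=0):
    """Q-learning: learn from sampled transitions while acting at random."""
    rng = random.Random(seed)
    q = empty_q()
    for _ in range(episodes):
        s = 0
        while s != TERMINAL:
            a = rng.choice(list(q[s]))
            # Environment step: the learner only uses the sampled (nxt, r).
            nxt, r = TRANSITIONS[s][a]
            future = 0.0 if nxt == TERMINAL else max(q[nxt].values())
            q[s][a] += alpha * (r + GAMMA * future - q[s][a])
            s = nxt
    return q


planned = model_based_values()
learned = model_free_values()
print(planned[0], learned[0])
```

In this sketch the model-based routine needs the transition table but only a
few sweeps, whereas the model-free routine needs no table but many episodes.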

An increasing wealth of experimental detail is becoming available about the
development and nature of addiction. Critical issues such as the varying
vulnerabilities of individuals who develop addiction are being illuminated
across levels of phenomenological, psychological and neurobiological detail.
Furthermore, a rich theoretical understanding is emerging in the field of neural
reinforcement learning, with glimmers as to how this might be related to the
subjective experience of those individuals affected. In this chapter, we
consider some particularly pressing current issues in the interface between
experiment and theory, notably the so-called "compulsive" phase of drug taking.

Dopaminergic signals play a mathematically precise role in
reward-related learning, and variations in dopaminergic signalling have been
implicated in vulnerability to addiction. Here, we provide a detailed overview
of the relationship between theoretical, mathematical and experimental accounts
of phasic dopamine signalling, with implications for the role of
learning-related dopamine signalling in addiction and related disorders. We
describe the theoretical and behavioural characteristics of model-free learning
based on errors in the prediction of reward, including step-by-step explanations
of the underlying equations. We then use recent insights from an animal model
that highlights individual variation in learning during a Pavlovian conditioning
paradigm to describe overlapping aspects of incentive salience attribution and
model-free learning. We argue that this provides a computationally coherent
account of some features of addiction.

2013

Subjects with schizophrenia are impaired at reinforcement-driven reversal
learning from as early as their first episode. The neurobiological basis of this
deficit is unknown. We obtained behavioral and fMRI data in 24 unmedicated,
primarily first episode, schizophrenia patients and 24 age-, IQ- and
gender-matched healthy controls during a reversal learning task. We supplemented
our fMRI analysis, focusing on learning from prediction errors, with detailed
computational modeling to probe task solving strategy including an ability to
deploy an internal goal directed model of the task. Patients displayed reduced
functional activation in the ventral striatum (VS) elicited by prediction
errors. However, modeling task performance revealed that a subgroup did not
adjust their behavior according to an accurate internal model of the task
structure, and these were also the more severely psychotic patients. In patients
who could adapt their behavior, as well as in controls, task solving was best
described by cognitive strategies according to a Hidden Markov Model. When we
compared patients and controls who acted according to this strategy, patients
still displayed a significant reduction in VS activation elicited by informative
errors that precede salient changes of behavior (reversals). Thus, our study
shows that VS dysfunction in schizophrenia patients during reward-related
reversal learning remains a core deficit even when controlling for task solving
strategies. This result highlights that VS dysfunction is tightly linked to a
reward-related reversal learning deficit in early, unmedicated schizophrenia
patients.

Adaptive decision-making involves interaction between
systems regulating Pavlovian and instrumental control of behavior. Here we
investigate in humans the role of serotonin in such Pavlovian-instrumental
transfer in both the aversive and the appetitive domain using acute tryptophan
depletion, known to lower central serotonin levels. Acute tryptophan depletion
attenuated the inhibiting effect of aversive Pavlovian cues on instrumental
behavior, while leaving unaltered the activating effect of appetitive Pavlovian
cues. These data suggest that serotonin is selectively involved in Pavlovian
inhibition due to aversive expectations and have implications for our
understanding of the mechanisms underlying a range of affective, impulsive, and
aggressive neuropsychiatric disorders.

Background: Decision-making involves two fundamental axes
of control, namely valence, spanning reward and punishment, and action, spanning
invigoration and inhibition. We recently used a task whose contingencies
explicitly decouple valence and action to show that these axes are inextricably
coupled during learning. This results in a disadvantage in acquiring active
choices in punished conditions and passive choices in rewarded conditions. The
neuromodulators dopamine and serotonin are likely to play a role in these
asymmetries. Dopamine signals anticipation of future rewards and is involved in
an invigoration of motor responses leading to reward. Serotonin is associated
with motor inhibition and punishment processing.
Methods: Here we combined computational modelling with a pharmacological
manipulation of dopamine and serotonin to examine acquisition of instrumental
responding in a task that crosses action (go/no go) with valence
(reward/punishment) in healthy human volunteers.
Results: Contrary to expectation we found that levodopa decreased the coupling
of action and valence that was evident in the placebo and citalopram groups.
Citalopram had distinct effects and increased participants' tendency to perform
active responses independent of outcome valence, consistent with decreased motor
inhibition.
Conclusion: The current data highlight the importance of orthogonally
manipulating action requirements and outcome valence if one wants to reveal the
full complexity of the roles played by dopamine and serotonin in instrumental
learning.

Computational Psychiatry is a heterogeneous field at the intersection of
computational neuroscience and psychiatry. Incorporating methods from
psychiatry, psychology, neuroscience, behavioural economics and machine
learning, computational psychiatry focuses on building mathematical models of
neural or cognitive phenomena relevant to psychiatric diseases. The models span
a wide range - from biologically detailed models of neurons or networks to
abstract models describing high-level cognitive abilities of an organism.
Psychiatric diseases are conceptualized as either an extreme of normal function,
or as a consequence of alterations in parts of the model.

As in computational neuroscience more generally, the building of models forces
key concepts to be made concrete and hidden assumptions to be made explicit. One
critical function of these models in the setting of psychiatry is their
ability to bridge between low-level biological and high-level cognitive
features. While many neurobiological alterations are known, the exclusively
atheoretical focus of standard psychiatric nosology on high-level symptoms has
as yet prevented an integration of these bodies of knowledge. David Marr pointed
out that models at different levels may be independent (Marr, 1982).
Nevertheless, implementational details may constrain functions at the
computational level. The models used in computational psychiatry make these
constraints explicit, and thereby aim to provide normative conduits between the
different levels at which neural systems are analysed (Stephan et al., 2006;
Huys et al., 2011; Hasler, 2012; Montague et al., 2012). This in turn allows for
a principled approach to study dysfunctions, and indeed may allow the
dysfunctions observed in psychiatry to inform neuroscience in general.

Practically, it underpins hopes that computational techniques may facilitate the
development of a psychiatric nomenclature based on an understanding of the
underlying neuroscience. Computational models enhance experimental designs by
allowing more intricate neural and/or cognitive processes to be inferred from
complex features of the data, often via Bayesian inference. These aspects
motivate hopes that it may facilitate the development of clinical treatment
decision tools informed by advances in neuroscience.

Background: Disadvantageous decisions with respect to
alcohol consumption play a central role in alcohol dependency (AD). This
decision making pattern seems to be in part a result of
Pavlovian-to-instrumental-transfer effects (PIT effects). The aim of this
review is to summarize important findings on PIT within the scope of addiction
disorders. Building on this, open questions in the field of human AD are
discussed. Methods: This review is not based on a systematic
and standardized literature search. Instead the review was based on the
literature search conducted in DFG research group 1617 (Learning and
Habitization in Alcohol Dependence, LeAD). Selection of research articles was
based on expert opinion. Results: PIT effects in AD might
possibly lead to a vicious cycle consisting of enhanced PIT effects through
alcohol consumption and enhanced alcohol consumption through enhanced PIT
effects. Discussion: PIT effects in alcohol addiction are
mainly known from animal studies because there are but few human AD PIT studies.
In human AD research the PIT paradigm may be able to reveal how particular cues
disproportionally motivate AD patients to drink alcohol. PIT experiments thus
have potential uses in the prediction of relapse and the measurement of
addiction severity.

Adaptive behaviour involves interactions between systems regulating Pavlovian
and instrumental control of actions. Here, we present the first investigation of
the neural mechanisms underlying aversive Pavlovian-instrumental transfer using
fMRI in humans. Recent evidence indicates that these Pavlovian influences on
instrumental actions are action-specific: Instrumental approach is invigorated
by appetitive Pavlovian cues, but inhibited by aversive Pavlovian cues.
Conversely, instrumental withdrawal is inhibited by appetitive Pavlovian cues,
but invigorated by aversive Pavlovian cues. We show that BOLD responses in the
amygdala and the nucleus accumbens were associated with behavioural inhibition
by aversive Pavlovian cues, irrespective of action context. Furthermore, BOLD
responses in the ventromedial prefrontal cortex differed between approach and
withdrawal actions. Aversive Pavlovian conditioned stimuli modulated
connectivity between the ventromedial prefrontal cortex and the caudate nucleus.
These results show that action-specific aversive control of instrumental
behaviour involves the modulation of fronto-striatal interactions by
Pavlovian conditioned stimuli.

Background: Depression is characterised partly by blunted reactions to
reward. However, tasks probing this deficiency have not distinguished
insensitivity to reward from insensitivity to the prediction errors for reward
that determine learning and are putatively reported by the phasic activity of
dopamine neurons. We attempted to disentangle these factors with respect to
anhedonia in the context of stress, Major Depressive Disorder (MDD), Bipolar
Disorder (BPD) and a dopaminergic challenge.

Methods: Six behavioural datasets involving 392 experimental sessions
were subjected to a model-based, Bayesian meta-analysis. Participants across all
six studies performed a probabilistic reward task that used an asymmetric
reinforcement schedule to assess reward learning. Healthy controls were tested
under baseline conditions, stress or after receiving the dopamine D2 agonist
pramipexole. In addition, participants with current or past MDD or BPD were
evaluated. Reinforcement learning models isolated the contributions of variation
in reward sensitivity and learning rate.

Results: MDD and anhedonia reduced reward sensitivity more than they
affected the learning rate, while a low dose of the dopamine D2 agonist
pramipexole showed the opposite pattern. Stress led to a pattern consistent with
a mixed effect on reward sensitivity and learning rate.

Conclusion: Reward-related learning reflected at least two partially
separable contributions. The first related to phasic prediction error
signalling, and was preferentially modulated by a low dose of the dopamine
agonist pramipexole. The second related directly to reward sensitivity, and was
preferentially reduced in MDD and anhedonia. Stress altered both components.
Collectively, these findings highlight the contribution of model-based
reinforcement learning meta-analysis for dissecting anhedonic behavior.
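
The distinction between reward sensitivity and learning rate can be sketched
with a single action value updated by a prediction error. The update rule
below, Q <- Q + alpha * (rho * r - Q), is one common parameterisation and is
assumed here for illustration; the parameter values are hypothetical. Reduced
sensitivity (rho) lowers the value that learning settles on, whereas a reduced
learning rate (alpha) only slows the approach to the same value.

```python
def learn(alpha, rho, rewards):
    """Track one action value with Q <- Q + alpha * (rho * r - Q)."""
    q, history = 0.0, []
    for r in rewards:
        # rho scales the experienced reward inside the prediction error.
        q += alpha * (rho * r - q)
        history.append(q)
    return history


rewards = [1.0] * 200  # hypothetical schedule: the action is rewarded on every trial
baseline = learn(alpha=0.2, rho=1.0, rewards=rewards)
low_sensitivity = learn(alpha=0.2, rho=0.5, rewards=rewards)
low_learning_rate = learn(alpha=0.05, rho=1.0, rewards=rewards)

# Compare an early trial with the final trial for each setting.
print(baseline[4], low_sensitivity[4], low_learning_rate[4])
print(baseline[-1], low_sensitivity[-1], low_learning_rate[-1])
```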

Pavlovian biases influence learning and decision making by intricately
coupling reward seeking with action invigoration and punishment
avoidance with action suppression. This bias is not always adaptive; it
can oftentimes interfere with instrumental requirements. The prefrontal
cortex is thought to help resolve such conflict between motivational
systems, but the nature of this control process remains unknown. EEG
recordings of mid-frontal theta band power are sensitive to conflict
and predictive of adaptive control over behavior, but it is not clear
whether this signal would reflect control over conflict between
motivational systems. Here we utilized a task that orthogonalized
action requirements and outcome valence while recording concurrent EEG
in human participants. By applying a computational model of task
performance, we derived parameters reflective of the latent influence
of Pavlovian bias and how it was modulated by mid-frontal theta power
during motivational conflict. Between subjects, individuals who
performed better under Pavlovian conflict exhibited higher mid-frontal
theta power. Within subjects, trial-to-trial variance in theta power
was predictive of ability to overcome the influence of the Pavlovian
bias, and this effect was most pronounced in individuals with higher
mid-frontal theta to conflict. These findings demonstrate that
mid-frontal theta is not only a sensitive index of prefrontal control,
but it can also reflect the application of top-down control over
instrumental processes.

Proceedings of the 35th Annual Conference of the Cognitive Science Society (2013)

In learned helplessness experiments, subjects first experience a lack of control
in one situation, and then show learning deficits when performing or learning
another task in another situation. Generalization, thus, is at the core of the
learned helplessness phenomenon. Substantial experimental and theoretical effort
has been invested into establishing that a state- and task-independent belief
about controllability is necessary. However, to what extent generalization is
also sufficient to explain the transfer has not been examined. Here, we show
qualitatively and quantitatively that Bayesian learning of action-outcome
contingencies at three levels of abstraction is sufficient to account for the
key features of learned helplessness, including escape deficits and impairment
of appetitive learning after inescapable shocks.

Senescence affects the ability to utilize information about the likelihood of rewards for optimal decision-making. Using functional magnetic resonance imaging in humans, we found that healthy older adults had an abnormal signature of expected value, resulting in an incomplete reward prediction error (RPE) signal in the nucleus accumbens, a brain region that receives rich input projections from substantia nigra/ventral tegmental area (SN/VTA) dopaminergic neurons. Structural connectivity between SN/VTA and striatum, measured by diffusion tensor imaging, was tightly coupled to inter-individual differences in the expression of this expected reward value signal. The dopamine precursor levodopa (L-DOPA) increased the task-based learning rate and task performance in some older adults to the level of young adults. This drug effect was linked to restoration of a canonical neural RPE. Our results identify a neurochemical signature underlying abnormal reward processing in older adults and indicate that this can be modulated by L-DOPA.

2012

Decision-making invokes two fundamental axes of control: affect or valence,
spanning reward and punishment, and effect or action, spanning invigoration and
inhibition. We studied the acquisition of instrumental responding in healthy
human volunteers in a task in which we orthogonalized action requirements and
outcome valence. Subjects were much more successful in learning active choices
in rewarded conditions, and passive choices in punished conditions. Using
computational reinforcement-learning models, we teased apart contributions from
putatively instrumental and Pavlovian components in the generation of the
observed asymmetry during learning. Moreover, using model-based fMRI, we showed
that BOLD signals in striatum and substantia nigra/ventral tegmental area
(SN/VTA) correlated with instrumentally learnt action values, but with opposite
signs for go and no-go choices. Finally, we showed that successful instrumental
learning depends on engagement of bilateral inferior frontal gyrus. Our
behavioral and computational data showed that instrumental learning is
contingent on overcoming inherent and plastic Pavlovian biases, while our
neuronal data showed this learning is linked to unique patterns of brain
activity in regions implicated in action and inhibition respectively.
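
One way to picture how instrumental and Pavlovian components might combine is sketched below: the propensity to "go" is the instrumental action value plus a Pavlovian term proportional to the state value, passed through a logistic choice rule. This is a simplified illustration under stated assumptions; the names `go_probability`, `pavlovian_weight` and `go_bias` are hypothetical, and the form is not claimed to be the model fitted in the paper.

```python
import math


def go_probability(q_go, q_nogo, state_value, pavlovian_weight, go_bias=0.0):
    """Probability of emitting a 'go' response (assumed logistic choice rule)."""
    # Pavlovian term: positive state values promote action, negative ones suppress it.
    w_go = q_go + go_bias + pavlovian_weight * state_value
    w_nogo = q_nogo
    return 1.0 / (1.0 + math.exp(-(w_go - w_nogo)))


if __name__ == "__main__":
    # Identical instrumental values, opposite outcome valence.
    print(go_probability(0.0, 0.0, state_value=1.0, pavlovian_weight=1.0))
    print(go_probability(0.0, 0.0, state_value=-1.0, pavlovian_weight=1.0))
```

With equal instrumental values, a positive state value pushes the response towards "go" and a negative one towards "no-go", which is the kind of asymmetry between rewarded and punished conditions that the abstract describes.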

When planning a series of actions, it is usually infeasible to consider all
potential future sequences; instead, one must prune the decision tree. Provably
optimal pruning is, however, still computationally ruinous and the specific
approximations humans employ remain unknown. We designed a new sequential
reinforcement-based task and showed that human subjects adopted a simple pruning
strategy: during mental evaluation of a sequence of choices, they curtailed any
further evaluation of a sequence as soon as they encountered a large loss. This
pruning strategy was Pavlovian: it was reflexively evoked by large losses and
persisted even when overwhelmingly counterproductive. It was also evident above
and beyond loss aversion. We found that the tendency towards Pavlovian pruning
was selectively predicted by the degree to which subjects exhibited sub-clinical
mood disturbance, in accordance with theories that ascribe Pavlovian behavioural
inhibition, via serotonin, a role in mood disorders. We conclude that Pavlovian
behavioural inhibition shapes highly flexible, goal-directed choices in a
manner that may be important for theories of decision-making in mood disorders.
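
The pruning strategy described above can be illustrated with a small tree search in which evaluation of a sequence stops as soon as a large loss is encountered. The sketch below is a deterministic simplification; the tree, the reward magnitudes and the threshold `large_loss` are invented for illustration, and the paper's own model need not take this form.

```python
def best_value(tree, node, depth, large_loss=None):
    """Best summed reward over `depth` steps; tree maps node -> [(reward, next_node)]."""
    if depth == 0 or not tree.get(node):
        return 0.0
    values = []
    for reward, nxt in tree[node]:
        if large_loss is not None and reward <= large_loss:
            # Pruned: nothing beyond the large loss is evaluated.
            values.append(reward)
        else:
            values.append(reward + best_value(tree, nxt, depth - 1, large_loss))
    return max(values)


if __name__ == "__main__":
    # Hypothetical two-step problem: a large loss hides the best overall sequence.
    tree = {"a": [(-70, "b"), (20, "c")], "b": [(140, "d")], "c": [(-20, "d")], "d": []}
    print(best_value(tree, "a", depth=2))                  # full evaluation
    print(best_value(tree, "a", depth=2, large_loss=-70))  # with pruning
```

In this toy tree the best sequence passes through the large loss, so pruning yields a worse outcome than full evaluation, mirroring the observation that pruning persisted even when counterproductive.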

Fluid intelligence represents the capacity for flexible problem solving and
rapid behavioral adaptation. Rewards drive flexible behavioral adaptation, in
part via a teaching signal expressed as reward prediction errors in the ventral
striatum, which has been associated with phasic dopamine release in animal
studies. We examined a sample of 28 healthy male adults using multimodal imaging
and biological parametric mapping with (1) functional magnetic resonance imaging
during a reversal learning task and (2) in a subsample of 17 subjects also with
positron emission tomography using 6-[18F]fluoro-L-DOPA to assess dopamine
synthesis capacity. Fluid intelligence was measured using a battery of nine
standard neuropsychological tests. Ventral striatal BOLD correlates of reward
prediction errors were positively correlated with fluid intelligence and, in the
right ventral striatum, also inversely correlated with dopamine synthesis
capacity (FDOPA K_in^app). When exploring aspects of fluid intelligence, we
observed that prediction error signaling correlates with complex attention and
reasoning. These findings indicate that individual differences in the capacity
for flexible problem solving relate to ventral striatal activation during
reward-related learning, which in turn proved to be inversely associated with
ventral striatal dopamine synthesis capacity.

The acquisition of reward and the avoidance of punishment could logically be
contingent on either emitting or withholding particular actions. However, the
separate pathways in the striatum for go and no-go appear to violate this
independence, instead coupling affect and effect. Respect for this
interdependence has biased many studies of reward and punishment, so potential
action-outcome valence interactions during anticipatory phases remain
unexplored. In a functional magnetic resonance imaging study with healthy human
volunteers, we manipulated subjects' requirement to emit or withhold an action
independent from subsequent receipt of reward or avoidance of punishment. During
anticipation, in the striatum and a lateral region within the substantia
nigra/ventral tegmental area (SN/VTA), action representations dominated over
valence representations. Moreover, we did not observe any representation
associated with different state values through accumulation of outcomes,
challenging a conventional and dominant association between these areas and
state value representations. In contrast, a more medial sector of the SN/VTA
responded preferentially to valence, with opposite signs depending on whether
action was anticipated to be emitted or withheld. This dominant influence of
action requires an enriched notion of opponency between reward and punishment.

Hard-wired, Pavlovian, responses elicited by predictions of rewards and
punishments exert significant benevolent and malevolent influences over
instrumentally-appropriate actions. These influences come in two main groups,
defined along anatomical, pharmacological, behavioural and functional lines.
Investigations of the influences have so far concentrated on the groups as a
whole; here we take the critical step of looking inside each group, using a
detailed reinforcement learning model to distinguish effects to do with value,
specific actions, and general activation or inhibition. We show a high degree of
sophistication in Pavlovian influences, with appetitive Pavlovian stimuli
specifically promoting approach and inhibiting withdrawal, and aversive
Pavlovian stimuli promoting withdrawal and inhibiting approach. These influences
account for differences in the instrumental performance of approach and
withdrawal behaviours. Finally, although losses are as informative as gains, we
find that subjects neglect losses in their instrumental learning. Our findings
argue for a view of the Pavlovian system as a constraint or prior, facilitating
learning by alleviating computational costs that come with increased
flexibility.

Mathematically rigorous descriptions of key hypotheses and theories are becoming
more common in neuroscience and are beginning to be applied to psychiatry. In
this article two fictional characters, Dr. Strong and Mr. Micawber, debate the
use of such computational models (CMs) in psychiatry. We present four
fundamental challenges to the use of CMs in psychiatry: (a) the applicability of
mathematical approaches to core concepts in psychiatry such as subjective
experiences, conflict and suffering; (b) whether psychiatry is mature enough to
allow informative modelling; (c) whether theoretical techniques are powerful
enough to approach psychiatric problems; and (d) the issue of communicating
clinical concepts to theoreticians and vice versa. We argue that CMs have yet to
influence psychiatric practice, but that they help psychiatric research in two
fundamental ways: (a) to build better theories integrating psychiatry with
neuroscience; and (b) to enforce explicit, global and efficient testing of
hypotheses through more powerful analytical methods. CMs allow the complexity of
a hypothesis to be rigorously weighed against the complexity of the data. The
paper concludes with a discussion of the path ahead. It points to stumbling
blocks, like the poor communication between theoretical and medical communities.
But it also identifies areas in which the contributions of CMs will likely be
pivotal, like an understanding of social influences in psychiatry, and of the
co-morbidity structure of psychiatric diseases.

Serotonin is a neuromodulator that is extensively entangled in fundamental
aspects of brain function and behavior. We present a computational view of its
involvement in the control of appetitively and aversively motivated actions. We
first describe a range of its effects in invertebrates, endowing specific
structurally fixed networks with plasticity at multiple spatial and temporal
scales. We then consider its rather widespread distribution in the mammalian
brain. We argue that this is associated with a more unified representational and
functional role in aversive processing that is amenable to computational
analyses with the kinds of reinforcement learning techniques that have helped
elucidate dopamine's role in appetitive behavior. Finally, we suggest that it is
only a partial reflection of dopamine because of essential asymmetries between
the natural statistics of rewards and punishments.

Helplessness, a belief that the world is not subject to behavioral control, has
long been central to our understanding of depression, and has influenced
cognitive theories, animal models and behavioral treatments. However, despite
its importance, there is no fully accepted definition of helplessness or
behavioral control in psychology or psychiatry, and the formal treatments in
engineering appear to capture only limited aspects of the intuitive concepts.
Here, we formalize controllability in terms of characteristics of prior
distributions over affectively charged environments. We explore the relevance of
this notion of control to reinforcement learning methods of optimising behavior
in such environments and consider how apparently maladaptive beliefs can result
from normative inference processes. These results are discussed with reference
to depression and animal models thereof.

2008

Decision making lies at the very heart of many psychiatric diseases. It is also
a central theoretical concern in a wide variety of fields and has undergone
detailed, in-depth, analyses. We take as an example Major Depressive Disorder
(MDD), applying insights from a Bayesian reinforcement learning framework. We
focus on anhedonia and helplessness. Helplessness (a core element in the
conceptualizations of MDD that has led to major advances in its treatment,
pharmacological and neurobiological understanding) is formalized as a simple
prior over the outcome entropy of actions in uncertain environments. Anhedonia,
which is an equally fundamental aspect of the disease, is related to the
effective reward size. These formulations allow for the design of specific tasks
to measure anhedonia and helplessness behaviorally. We show that these
behavioral measures capture explicit, questionnaire-based cognitions. We also
provide evidence that these tasks may allow classification of subjects into
healthy and MDD groups based purely on a behavioural measure and avoiding any
verbal reports.
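
To make the notion of outcome entropy concrete, the fragment below computes the Shannon entropy of the outcome distribution that follows an action. It illustrates only the quantity over which the prior is defined, not the prior itself or the tasks used in the paper; the function name and example distributions are assumptions.

```python
import math


def outcome_entropy(probabilities):
    """Shannon entropy (in bits) of an action's outcome distribution."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    print(outcome_entropy([1.0, 0.0]))  # action with a fully predictable outcome
    print(outcome_entropy([0.5, 0.5]))  # action whose outcome is maximally uncertain
```

Low outcome entropy corresponds to actions whose consequences are predictable; a prior that expects high outcome entropy can be read as a belief that actions have little reliable effect.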

Pavlovian predictions of future aversive outcomes lead to behavioral inhibition,
suppression, and withdrawal. There is considerable evidence for the involvement
of serotonin in both the learning of these predictions and the inhibitory
consequences that ensue, although less for a causal relationship between the
two. In the context of a highly simplified model of chains of affectively
charged thoughts, we interpret the combined effects of serotonin in terms of
pruning a tree of possible decisions (i.e., eliminating those choices that have
low or negative expected outcomes). We show how a drop in behavioral inhibition,
putatively resulting from an experimentally or psychiatrically influenced drop
in serotonin, could result in unexpectedly large negative prediction errors and
a significant aversive shift in reinforcement statistics. We suggest an
interpretation of this finding that helps dissolve the apparent contradiction
between two facts: inhibition of serotonin reuptake is the first-line
treatment of depression, yet serotonin itself is most strongly linked with
aversive rather than appetitive outcomes and predictions.

PhD thesis

Depression, like many psychiatric disorders, is a disorder of affect. Over the past decades, a large number of affective issues in depression have been
characterised, both in human experiments and animal models of the disorder. Over the same period, experimental neuroscience, helped by computational
theories such as reinforcement learning, has provided detailed descriptions of the psychology and neurobiology of affective decision making. Here, we
attempt to harvest the advances in the understanding of the brain's normal dealings with rewards and punishments to dissect out and define more clearly the
components that make up depression. We start by exploring changes to primary reinforcer sensitivity in the learned helplessness animal models of
depression. Then, a detailed formalisation of control in a goal-directed decision making framework is presented and related to animal and human data.
Finally, we show how serotonin's joint involvement in reporting negative values and inhibiting actions may explain some aspects of its involvement in
depression. Throughout, aspects of depression are seen as emerging from normal affective function and reinforcement learning, and we thus conclude that
computational descriptions of normal affective function provide one possible avenue by which to define an ætiology of depression.

Population coding

Peer reviewed

2008

Naturally occurring sensory stimuli are dynamic. In this letter, we consider how
spiking neural populations might transmit information about continuous dynamic
stimulus variables. The combination of simple encoders and temporal stimulus
correlations leads to a code in which information is not readily available to
downstream neurons. Here, we explore a complex encoder that is paired with a
simple decoder that allows representation and manipulation of the dynamic
information in neural systems. The encoder we present takes the form of a
biologically plausible recurrent spiking neural network where the output
population recodes its inputs to produce spikes that are independently
decodeable. We show that this network can be learned in a supervised manner by a
simple local learning rule.

2007

Uncertainty coming from the noise in its neurons and the ill-posed nature of
many tasks plagues neural computations. Maybe surprisingly, many studies show
that the brain manipulates these forms of uncertainty in a probabilistically
consistent and normative manner, and there is now a rich theoretical literature
on the capabilities of populations of neurons to implement computations in the
face of uncertainty. However, one major facet of uncertainty has received
comparatively little attention: time. In a dynamic, rapidly changing world, data
are only temporarily relevant. Here, we analyze the computational consequences
of encoding stimulus trajectories in populations of neurons. For the most
obvious, simple, instantaneous encoder, the correlations induced by natural,
smooth stimuli engender a decoder that requires access to information that is
nonlocal both in time and across neurons. This formally amounts to a ruinous
representation. We show that there is an alternative encoder that is
computationally and representationally powerful in which each spike
contributes independent information; it is independently decodable, in other words.
We suggest this as an appropriate foundation for understanding time-varying
population codes. Furthermore, we show how adaptation to temporal stimulus
statistics emerges directly from the demands of simple decoding.

2004

As animals interact with their environments, they must constantly update
estimates about their states. Bayesian models combine prior probabilities, a
dynamical model and sensory evidence to update estimates optimally. These
models are consistent with the results of many diverse psychophysical studies.
However, little is known about the neural representation and manipulation of
such Bayesian information, particularly in populations of spiking neurons. We
consider this issue, suggesting a model based on standard neural architecture
and activations. We illustrate the approach on a simple random walk example,
and apply it to a sensorimotor integration task that provides a particularly
compelling example of dynamic probabilistic computation.
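
The combination of a prior, a dynamical model and sensory evidence can be sketched as a discrete Bayes filter for a random walk on a ring of positions. This is an abstract illustration of the update the abstract refers to, not the spiking network model of the paper; the circular state space, the transition probabilities and the function names are assumptions.

```python
def predict(belief, p_stay=0.5):
    """Propagate the belief through random-walk dynamics on a ring (assumed model)."""
    n = len(belief)
    p_move = (1.0 - p_stay) / 2.0
    new_belief = [0.0] * n
    for i, b in enumerate(belief):
        new_belief[i] += p_stay * b
        new_belief[(i - 1) % n] += p_move * b
        new_belief[(i + 1) % n] += p_move * b
    return new_belief


def update(belief, likelihood):
    """Multiply the predicted belief by the sensory likelihood and renormalise."""
    posterior = [b * l for b, l in zip(belief, likelihood)]
    total = sum(posterior)
    return [p / total for p in posterior]


if __name__ == "__main__":
    belief = [0.2] * 5                        # flat prior over five positions
    likelihood = [0.1, 0.2, 0.9, 0.2, 0.1]    # noisy evidence favouring position 2
    belief = update(predict(belief), likelihood)
    print(belief)
```

Each time step alternates a prediction, which spreads the belief according to the dynamics, with an update, which sharpens it according to the evidence.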

Single cells

Peer reviewed

2009

Biophysically detailed models of single cells are difficult to fit to real data.
Recent advances in imaging techniques allow simultaneous access to various
intracellular variables, and these data can be used to significantly facilitate
the modelling task. These data, however, are noisy, and current approaches to
building biophysically detailed models are not designed to deal with this. We
extend previous techniques to take the noisy nature of the measurements into
account. Sequential Monte Carlo ("particle filtering") methods, in combination
with a detailed biophysical description of a cell, are used for principled,
model-based smoothing of noisy recording data. We also provide an alternative
formulation of smoothing where the neural nonlinearities are estimated in a
non-parametric manner. Biophysically important parameters of detailed models
(such as channel densities, intercompartmental conductances, input resistances,
and observation noise) are inferred automatically from noisy data via
expectation-maximisation. Overall, we find that model-based smoothing is a
powerful, robust technique for smoothing of noisy biophysical data and for
inference of biophysical parameters in the face of recording noise.
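
As a rough illustration of the sequential Monte Carlo idea, the sketch below runs a bootstrap particle filter on a one-dimensional random-walk state observed with Gaussian noise. It shows only the forward filtering pass on a toy model; the biophysical cell model, the smoothing pass and the expectation-maximisation step of the paper are not reproduced, and all names and default parameter values are assumptions.

```python
import math
import random


def particle_filter(observations, n_particles=500, process_sd=0.1, obs_sd=0.1, seed=0):
    """Bootstrap particle filter; returns the posterior-mean estimate at each time step."""
    rng = random.Random(seed)
    particles = [rng.gauss(0.0, 1.0) for _ in range(n_particles)]  # broad initial belief
    estimates = []
    for y in observations:
        # Propagate each particle through the assumed random-walk dynamics.
        particles = [x + rng.gauss(0.0, process_sd) for x in particles]
        # Weight each particle by the Gaussian observation likelihood.
        weights = [math.exp(-0.5 * ((y - x) / obs_sd) ** 2) for x in particles]
        total = sum(weights)
        weights = [w / total for w in weights]
        estimates.append(sum(w * x for w, x in zip(weights, particles)))
        # Resample in proportion to the weights to avoid degeneracy.
        particles = rng.choices(particles, weights=weights, k=n_particles)
    return estimates


if __name__ == "__main__":
    print(particle_filter([1.0] * 20)[-1])
```

The same propagate, weight and resample cycle applies when the random walk is replaced by a more detailed dynamical model, at the cost of more expensive propagation steps.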

2006

Biophysically accurate multicompartmental models of individual neurons have
significantly advanced our understanding of the input-output function of single
cells. These models depend on a large number of parameters that are difficult to
estimate. In practice, they are often hand-tuned to match measured physiological
behaviors, thus raising questions of identifiability and interpretability. We
propose a statistical approach to the automatic estimation of various
biologically relevant parameters, including 1) the distribution of channel
densities, 2) the spatiotemporal pattern of synaptic input, and 3) axial
resistances across extended dendrites. Recent experimental advances, notably in
voltage-sensitive imaging, motivate us to assume access to: i) the
spatiotemporal voltage signal in the dendrite and ii) an approximate description
of the channel kinetics of interest. We show here that, given i and ii,
parameters 1-3 can be inferred simultaneously by nonnegative linear regression;
that this optimization problem possesses a unique solution and is guaranteed to
converge despite the large number of parameters and their complex nonlinear
interaction; and that standard optimization algorithms efficiently reach this
optimum with modest computational and data requirements. We demonstrate that the
method leads to accurate estimations on a wide variety of challenging model data
sets that include up to about 10^4 parameters (roughly two orders of magnitude
more than previously feasible) and describe how the method gives insights into
the functional interaction of groups of channels.
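
The core computational step, nonnegative linear regression, can be sketched with projected gradient descent: a plain least-squares gradient step followed by clipping negative coefficients to zero. This solver is swapped in for brevity and is not claimed to be the optimisation algorithm used in the paper; the design matrix and target values below are invented for illustration.

```python
def nonnegative_least_squares(X, y, n_iter=2000):
    """Minimise ||X g - y||^2 subject to g >= 0 by projected gradient descent."""
    n_rows, n_cols = len(X), len(X[0])
    # Step size 1/trace(X^T X) is no larger than 1/lambda_max, so the iteration is stable.
    step = 1.0 / sum(X[t][k] ** 2 for t in range(n_rows) for k in range(n_cols))
    g = [0.0] * n_cols
    for _ in range(n_iter):
        residual = [sum(X[t][k] * g[k] for k in range(n_cols)) - y[t] for t in range(n_rows)]
        gradient = [sum(X[t][k] * residual[t] for t in range(n_rows)) for k in range(n_cols)]
        # Gradient step followed by projection onto the nonnegative orthant.
        g = [max(0.0, g[k] - step * gradient[k]) for k in range(n_cols)]
    return g


if __name__ == "__main__":
    # Hypothetical regressors (columns) and observed signal generated from g = [2.0, 0.5].
    X = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    y = [2.0, 0.5, 2.5]
    print(nonnegative_least_squares(X, y))
```

Because the objective is convex and the constraint set is convex, the problem has no spurious local minima, which is the property the abstract emphasises.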

2005

Our understanding of the input-output function of single cells has been
substantially advanced by biophysically accurate multi-compartmental models. The
large number of parameters needing hand tuning in these models has, however,
somewhat hampered their applicability and interpretability. Here we propose a
simple and well-founded method for automatic estimation of many of these key
parameters: 1) the spatial distribution of channel densities on the cell's
membrane; 2) the spatiotemporal pattern of synaptic input; 3) the channels'
reversal potentials; 4) the intercompartmental conductances; and 5) the noise
level in each compartment. We assume experimental access to: a) the
spatiotemporal voltage signal in the dendrite (or some contiguous subpart
thereof, e.g. via voltage sensitive imaging techniques), b) an approximate
kinetic description of the channels and synapses present in each compartment,
and c) the morphology of the part of the neuron under investigation. The key
observation is that, given data a)-c), all of the parameters 1)-4) may be
simultaneously inferred by a version of constrained linear regression; this
regression, in turn, is efficiently solved using standard algorithms, without
any "local minima" problems despite the large number of parameters and complex
dynamics. The noise level 5) may also be estimated by standard techniques. We
demonstrate the method's accuracy on several model datasets, and describe
techniques for quantifying the uncertainty in our estimates.

Objectives: Selecting patients with asymmetrical
sensorineural hearing loss for further investigation continues to pose clinical
and medicolegal challenges, given the disparity between the number of
symptomatic patients, and the low incidence of vestibular schwannoma as the
underlying cause. We developed and validated a diagnostic model using a
generalisation of neural networks, for detecting vestibular schwannomas from
clinical and audiological data, and compared its performance with six previously
published clinical and audiological decision-support screening protocols.

Participants: Clinical and audiometric details of 129
patients with, and as many age and sex-matched patients without vestibular
schwannomas, as determined with magnetic resonance imaging.

Main outcome measures: The ability to diagnose a patient as
having or not having vestibular schwannoma.

Results: A Gaussian Process Ordinal Regression Classifier
was trained and cross-validated to classify cases as 'with' or 'without'
vestibular schwannoma, and its diagnostic performance was assessed using
receiver operator characteristic plots. It proved possible to pre-select
sensitivity and specificity, with an area under the curve of 0.8025. At 95%
sensitivity, the trained system had a specificity of 56%, 30% better than
audiological protocols with closest sensitivities. The sensitivities of
previously-published audiological protocols ranged between 82-97%, and their
specificities ranged between 15-61%.

Discussion: The Gaussian Process Ordinal Regression
Classifier increased the flexibility and specificity of the screening process
for vestibular schwannoma when applied to a sample of matched patients with and
without this condition. If applied prospectively, it could reduce the number
of 'normal' magnetic resonance (MR) scans by as much as 30% without reducing
detection sensitivity. Performance can be further improved through
incorporating additional data domains. Current findings need to be reproduced
using a larger dataset.
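
The diagnostic measures reported above (sensitivity, specificity and area under the receiver operating characteristic curve) can be computed from classifier scores as in the following sketch. The scores and labels are invented for illustration and are unrelated to the study data; the Gaussian Process classifier itself is not reproduced.

```python
def sensitivity_specificity(scores, labels, threshold):
    """Sensitivity and specificity when cases with score >= threshold are called positive."""
    tp = sum(1 for s, l in zip(scores, labels) if l == 1 and s >= threshold)
    fn = sum(1 for s, l in zip(scores, labels) if l == 1 and s < threshold)
    tn = sum(1 for s, l in zip(scores, labels) if l == 0 and s < threshold)
    fp = sum(1 for s, l in zip(scores, labels) if l == 0 and s >= threshold)
    return tp / (tp + fn), tn / (tn + fp)


def area_under_curve(scores, labels):
    """AUC as the probability that a positive case outscores a negative one (ties count half)."""
    pos = [s for s, l in zip(scores, labels) if l == 1]
    neg = [s for s, l in zip(scores, labels) if l == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]  # hypothetical classifier outputs
    labels = [1, 1, 1, 0, 0, 0]              # 1 = condition present, 0 = absent
    print(sensitivity_specificity(scores, labels, threshold=0.5))
    print(area_under_curve(scores, labels))
```

Lowering the threshold raises sensitivity at the expense of specificity, which is how an operating point such as 95% sensitivity can be pre-selected on a probabilistic classifier.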