15

A Computational Neuroscience
Perspective on the Change Process
in Psychotherapy
Ryan Smith, Richard D. Lane, Lynn Nadel, and Michael Moutoussis

Introduction

We recently explored a potential neurocognitive basis for effective
psychotherapeutic intervention, grounded in memory reconsolidation research (Lane, Ryan,
Nadel, & Greenberg, 2015). The proposed “integrated memory model” (IMM)
suggests that different psychotherapeutic modalities influence the reconsoli-
dation of multiple types of memory. These include episodic memory, semantic
memory, and the implicit memories underlying emotional responses. The
IMM was largely articulated at the level of description of large-​scale brain sys-
tems (i.e., the level of description of cognitive neuroscience or neuroimaging).
We thus suggested that the relevant reconsolidation processes occur in memory
complexes distributed across medial temporal lobe structures (hippocampal for-
mation, amygdala) and the broad expanses of neocortex implicated in long-​term
memory storage.
In the present chapter, we focus on the level of description of formal
algorithms, seeking to capture mathematically the information processing
implemented by neurons within specified network architectures (Friston, 2010;
Marr, 1982). This approach naturally prioritizes the computational problems the
brain needs to solve and then considers different classes of algorithms capable of
solving them. By subsequently focusing on the subset of those algorithms that are
biologically plausible, it can facilitate the development of “neural process theo-
ries.” These specify (and can quantitatively simulate) how specific neural network
architectures might implement such algorithms. In recent years, this perspec-
tive has yielded a wide array of converging theoretical and empirical findings
regarding biologically plausible mechanisms for perception, learning, decision-making,
and skeletomotor and visceromotor control (e.g., Greve, Cooper, Kaula,
Anderson, & Henson, 2017; Pezzulo, Rigoli, & Friston, 2015; Schwartenbeck,
FitzGerald, Mathys, Dolan, & Friston, 2015). The growth of this computational
neuroscience approach is attributable to several different factors, including
(among others) its precise (mathematically defined) theoretical constructs and
the resources it offers for modeling neural processes in a manner that allows for
quantitative experimental predictions.

Ryan Smith, Richard D. Lane, Lynn Nadel, and Michael Moutoussis, A Computational Neuroscience
Perspective on the Change Process in Psychotherapy. In: Neuroscience of Enduring Change. Edited by:
Richard D. Lane and Lynn Nadel, Oxford University Press (2020).
© Oxford University Press.
DOI: 10.1093/oso/9780190881511.003.0015

As computational approaches have advanced in the cognitive/​neural sciences,
they have also seen applications in psychiatry. Applications to psychopathology
have led to the emerging field of “computational psychiatry” (Friston, Stephan,
Montague, & Dolan, 2014; Huys, Guitart-​Masip, Dolan, & Dayan, 2015; Huys,
Maia, & Frank, 2016; Parr, Rees, & Friston, 2018). This field seeks to (a) under-
stand how normative and aberrant neurocomputational processes can promote
maladaptive patterns of perception, learning, and behavior and (b) leverage this
understanding to inform treatment. Computational characterizations of cer-
tain psychotherapeutic interventions (Moutoussis, Shahar, Hauser, & Dolan,
2017) and of subjective, psychiatrically relevant emotions (Will, Rutledge,
Moutoussis, & Dolan, 2017) have also started to emerge. In this volume,
computational approaches have already been invoked in a few chapters (see
Chapters 3, 4, 6, 10, 14, and 16).
Our goal in this chapter is to extend previous discussions with a focus on the
potential intersection of computational psychiatry with the IMM. The struc-
ture of this chapter is as follows: (a) we will briefly review models of probabilistic
inference and the algorithms thought to implement it in the brain; (b) we will
provide examples of how, within these models, memory updating (i.e., learning)
could lead to effective psychotherapeutic change; and (c) we will discuss the
implications this has for the IMM approach.
We focus on a few major themes:

A. The influence of maladaptive prior (and often implicit) expectations on
perception, learning, and behavior.
B. The various types of clinically relevant memories within computational
neuroscience models, which are encoded in the strengths of synaptic
connections that represent a range of quantitative probabilistic estimates
regarding the hierarchical causal structure of, and value of states/actions
within, the world (including the body).
C. How successful psychotherapeutic interventions can be understood to alter
these synaptic connections according to computationally defined learning
rules, and thereby promote more adaptive psychological functioning.

We believe consideration of the neurocomputational level of description greatly
enhances the systems-level description of the IMM. Specifically, it allows
consideration of an intermediate scale of neural processing that can bridge
molecular/cellular neurobiology with cognitive neuroscience, phenomenology, and
behavior. As will be discussed, for example, it can link subcellular changes to spe-
cific memory contents and their influences on perception and behavior by char-
acterizing the computational function of memory-​related changes in particular
types of synapses. This allows us to “zoom in” on a more fine-​grained description
of the learning and memory processes within the IMM that could underlie ef-
fective psychotherapeutic interventions, providing an insightful complementary
level of description. Prior to discussing these topics in detail, however, we briefly
introduce some broad neural/​computational principles and discuss how these
principles could shape interactions between learning, memory, emotion, and be-
havior within psychotherapy.

A Primer on Probabilistic Inference and Learning

To optimally solve many problems in perception, learning, and action selection,
the brain must (at least approximately) solve a broad set of probabilistic inference
problems (Knill & Pouget, 2004). Within probability theory, the optimal
solution to such inference problems obeys Bayes’ theorem. Effectively, by applying
this theorem, the brain can take in new sensory evidence (from the retina, the
cochlea, the skin, various internal organs, etc.), combine sensations with knowledge
gained from past experience, and infer the most likely cause(s) of that sensory
input in the world outside of the brain, including the body. These inference
processes are what give rise to what psychologists call extero- and interoception.
Given an internal model of the world, the same process can also be used to infer
what behaviors are most likely to produce (and perceptually confirm) desired
outcomes.
Bayes’ theorem is a mathematical equation that quantitatively specifies how
the strength of any prior belief B (a probability distribution prior to an expe-
rience) should change in response to the new experience E, leading to a new
probability distribution termed the posterior belief. It essentially stipulates that a
posterior belief (henceforth a “posterior”) should reflect a compromise between
the prior belief (henceforth a “prior”; typically gleaned from past experience)
and how consistent a new experience is with that prior. This consistency level
is formalized by the likelihood function (henceforth a “likelihood”). The like-
lihood specifies how much evidence different experiences would provide for
different beliefs one could hold. Formally, a prior is a simple unconditional prob-
ability of a belief, P(B), a likelihood is the conditional probability of the new ex-
perience given that belief, P(E | B), and a posterior is the conditional probability
of the belief given the new experience, P(B | E). The equation states:

$$P(B \mid E) = \frac{P(B)\,P(E \mid B)}{P(E)} \qquad \text{(Equation 1)}$$

The term P(E) is the marginal likelihood, which takes into account how likely the
experience is in general (i.e., how consistent the experience is with other possible
beliefs). Technically, it ensures that the posterior is normalized (i.e., that the
posterior probabilities of all possible beliefs sum to 1).
To illustrate how this theorem can be applied to a clinically relevant topic—​
emotional awareness—​consider a case in which an individual has just experi-
enced a bodily affective response, and their brain is trying to infer what emotion
concept (if any) best explains or accounts for their bodily sensations. For sim-
plicity, we will assume that the brain is only trying to infer whether sadness or
sickness is the best way of understanding their experience. Next, we will assume
that the individual has two pieces of information: (a) that they are at a funeral
(context) and (b) that their movements feel slow and effortful. Further assume
that, in the context of a funeral, (a) P(SAD) = 0.8 and P(SICK) = 0.2 (i.e., sadness
is more expected than sickness while at a funeral), and (b) P(“slow movement”
| SAD) = 0.4 and P(“slow movement” | SICK) = 0.6 (i.e., slow movement is more
consistent with sickness than sadness). If an individual’s brain were using an
algorithm that approximates Bayes’ theorem, the inference would go as follows.
First, calculate the marginal likelihood, P(E):

P(“slow movement”) = P(“slow movement” | SICK) × P(SICK) +
P(“slow movement” | SAD) × P(SAD) = (0.6 × 0.2) + (0.4 × 0.8) = 0.44

Then, for each possible interpretation of the experience (SAD or SICK), take
the product of the prior and the likelihood, P(B) and P(E | B), and divide by the
marginal likelihood, P(E):

P(SAD | “slow movement”) = (0.4 × 0.8) / 0.44 = 0.73
P(SICK | “slow movement”) = (0.6 × 0.2) / 0.44 = 0.27

Here we can see that the brain would infer that, in the context of a funeral,
sadness is more likely (P = 0.73) to account for these bodily feelings—​
corresponding to the conscious recognition by the individual that he or she is
feeling sad. It is understood that this process occurs unconsciously but that its
outputs can (but need not always) become available to subjective awareness.
(Note: the term “belief,” as used in this context, is an element of sub-personal
inference processes, and therefore will not always correspond to a conscious,
reportable belief.)
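
The funeral example above can be reproduced in a few lines of code. The following is only a minimal sketch of Bayes' theorem over a discrete set of candidate beliefs; the function name `posterior` and the dictionary layout are illustrative assumptions, and nothing here is meant to suggest how the brain actually implements the computation.

```python
def posterior(priors, likelihoods):
    """Apply Bayes' theorem to a discrete set of candidate beliefs.

    priors:      dict mapping belief -> P(B)
    likelihoods: dict mapping belief -> P(E | B) for the observed experience E
    Returns (posterior dict, marginal likelihood P(E)).
    """
    # Marginal likelihood: sum of prior x likelihood over all candidate beliefs.
    marginal = sum(priors[b] * likelihoods[b] for b in priors)
    # Posterior for each belief: prior x likelihood, divided by the marginal.
    post = {b: priors[b] * likelihoods[b] / marginal for b in priors}
    return post, marginal


# Values taken from the funeral example in the text.
priors = {"SAD": 0.8, "SICK": 0.2}
likelihoods = {"SAD": 0.4, "SICK": 0.6}  # P("slow movement" | belief)

post, marginal = posterior(priors, likelihoods)
print(round(marginal, 2))                          # 0.44
print({b: round(p, 2) for b, p in post.items()})   # {'SAD': 0.73, 'SICK': 0.27}
```

Because the two posterior values are divided by the same marginal likelihood, they necessarily sum to 1.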
Note here that for an individual to adaptively infer what emotion(s) best account
for what they are feeling (i.e., to have high emotional awareness), they will need
to (a) have learned about many different possible emotions (i.e., possess many
emotion concepts), (b) have learned the right priors for different emotions in
different contexts (e.g., knowing anxiety is more likely when giving a speech than
when watching TV at home), and (c) have learned the right likelihood functions
for each emotion (e.g., that feeling “high energy” is more consistent with excite-
ment than with depression). Note, in passing, that inferring the right emotion
can then usefully inform prior beliefs about actions to be performed next (for ex-
plicit computational models of these processes, see Smith, Lane, Parr, & Friston,
2019; Smith, Parr, & Friston, 2019).
Importantly, to accomplish this kind of inference process, the brain must in
some sense embody a generative model of the world. Generative models can be
thought of as specific combinations of prior and likelihood values, which de-
scribe the joint probability of all salient causes and consequent perceptions. They
are termed “generative” because, given a set of internally represented possible
causes (i.e., possible beliefs about what could be happening in the world outside
of the brain), they can generate predictions (e.g., about the sensory inputs those
causes would produce). This is the inverse of the previous inference, which led
from sensations to causes (or from experiences to beliefs about the most likely
causes or interpretations of those experiences).
One aspect of Bayesian cognition that is highly relevant for understanding
clinical phenomena is that the reliability of prior expectations and sensory inputs
typically depends upon context. For example, visual input may be more reliable
during the day than it is at night. Optimal inference requires that, in any given
context, the brain must learn how much it should “trust” new sensory input and
how much it should instead trust the prior expectations it has previously ac-
quired. To solve this further problem, the probability distributions used in Bayes’
theorem can be associated with an estimated level of reliability or “precision.” Thus,
sensory inputs which are estimated (believed) to have higher precision (i.e., less
variance), or are otherwise believed to carry more reliable information, will con-
tribute more to inference than sensory input with lower estimated precision/​reli-
ability; in the latter case, prior expectations will dominate inference. Thus, in our
previous example, if the sensory evidence for “slow movement” were imprecise
(e.g., movement felt slow in general, but its speed varied a lot from one moment
to the next), the brain would trust its prior expectations about sadness versus
sickness much more (i.e., the likelihood term would be effectively downweighted
in the previous equation). This can be thought of as the individual paying less
attention to movement speed when trying to figure out what emotion they are
experiencing (Feldman & Friston, 2010; Parr & Friston, 2017). Thus, in our ex-
ample of inferring one’s own emotions, a fourth requirement for high emotional
awareness is the need for adaptive precision estimates (i.e., knowing what to pay
attention to and what to ignore during the inference process).
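
One simple way to picture the downweighting of the likelihood term is to raise the likelihood to a power between 0 and 1 before applying Bayes' theorem. This exponent scheme is an illustrative assumption rather than the formulation used in the models cited above, and the function name `precision_weighted_posterior` and the parameter `weight` are hypothetical.

```python
def precision_weighted_posterior(priors, likelihoods, weight):
    """Posterior with the likelihood raised to the power `weight`.

    weight = 1.0 reproduces ordinary Bayes' theorem; weight = 0.0 ignores the
    evidence entirely, so the posterior equals the prior. Treating an exponent
    as the "precision" weight is an illustrative assumption.
    """
    unnormalized = {b: priors[b] * likelihoods[b] ** weight for b in priors}
    total = sum(unnormalized.values())
    return {b: v / total for b, v in unnormalized.items()}


priors = {"SAD": 0.8, "SICK": 0.2}
likelihoods = {"SAD": 0.4, "SICK": 0.6}  # P("slow movement" | belief)

for w in (1.0, 0.5, 0.0):
    p_sad = precision_weighted_posterior(priors, likelihoods, w)["SAD"]
    print(f"weight={w:.1f}  P(SAD | slow movement) = {p_sad:.2f}")
```

With full weight the result matches the earlier calculation (0.73); as the weight falls toward zero, the posterior for sadness moves back toward the prior (0.80), which corresponds to the prior expectation dominating inference when sensory evidence is judged unreliable.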
Based on the previously described framework, at a very broad level there are
three major types of “memories” stored in the brain as a part of its internal model
of the world: expectations (priors), the relationships between particular sensory
inputs and particular beliefs or descriptions of what is happening out in the world
(likelihoods), and estimates about how reliable particular sensory inputs and
prior expectations are in a given context (precision estimates). Computationally,
these memories are learned parameters of the internal model, such as the ex-
pectation and precision of the priors, parameters of the likelihood function, etc.
It is important to be clear, however, that these general types of memories can
be stored for a vast range of different types of representational content. For ex-
ample, a person might have prior expectations that they have low self-​worth,
that other people are highly judgmental, that the world is uncontrollable, that
what their friends tell them is unreliable, and that feelings of low energy are more
consistent with sickness than sadness. Thus it is the content of certain specific
priors, likelihoods, and precision estimates in an individual’s internal model that
can generate maladaptive percepts, beliefs, emotions, and behaviors in partic-
ular clinical cases. Therapeutic interventions will be effective to the extent that
they can alter these internally stored parameters, so as to make them more adap-
tive. In the following section, we will review how each of these stored parameters
(which can be thought of as implicit memories) can be linked to changes in syn-
aptic strengths between different types of neurons in the brain.

Biologically Plausible Algorithms

Once this normative Bayesian framework is in place, one can explore and test
neural process theories capable of implementing algorithms that approximate
Bayes’ theorem—​allowing them to successfully solve a wide range of inference
problems (i.e., so long as relevant parameters stored in memory are not too far
from their true values). Requiring biological plausibility at the process level
ensures that the resulting theory will be capable of capturing realistic neural-​level
explanations for how individuals interact with their environment. These process
theories can be broadly classified into those dealing with stimulus conditions, or
perception, and those dealing with actions, or behavior.
In the domain of perception, the “predictive coding” theory represents one
important approach (e.g., Friston, 2005). According to predictive coding theory,
the brain is organized into hierarchical levels of representation. Each level
has two classes of neurons: prediction neurons and prediction error neurons.
Prediction neurons at each level provide priors, shaped by the strength of
descending synapses, regarding the expected activity at the level below, driven by
activity representing current beliefs at the level above. At the lowest level, these
would be priors regarding sensory input. Prediction error neurons convey likeli-
hood information to the level above, signaling the difference between predicted
values and those most consistent with a new observation (e.g., the difference be-
tween predicted and actual sensory input at the lowest level). By finding the set
of represented percepts/​beliefs that minimize prediction error across levels, the
updated percepts/​beliefs will approximate the optimal Bayesian solution. This
layered set of interconnected inferences then corresponds to the multilayered,
rich structure of our moment-​to-​moment perceptions.
Pyramidal neurons in cortical layers 5/​6 have been hypothesized to estimate
predictions, and pyramidal neurons in layers 2/​3 to estimate prediction error
(Bastos et al., 2012). Downward prediction signals from layer 5/6 are thought to
be synaptically conveyed by glutamatergic NMDA receptors, which have slow
time constants; in contrast, upward prediction error signals are thought to be
mediated by synaptic AMPA receptors, which have fast time constants (Friston,
2005; Salin & Bullier, 1995). This allows predictions at one level to involve
temporally extended patterns of change at the level below. Progressively higher
levels of cortex can therefore learn about progressively longer timescale patterns
(e.g., recognizing the meaning of a word, a phrase, a sentence, a paragraph, etc.;
Hasson, Chen, & Honey, 2015; Hasson, Yang, Vallines, Heeger, & Rubin, 2008;
Kiebel, Daunizeau, & Friston, 2008; Murray, 2014). Higher levels also incorpo-
rate a progressively wider (more convergent) array of inputs, allowing greater
spatial and multimodal integration. Within the higher association cortices most
associated with long-​term memory, one can therefore learn about long timescale
relationships across many types of perceptual experiences. This would allow one
to learn, for example, that fear often involves the co-​occurrence of muscle ten-
sion and a perceived threat. Finally, precision estimates are implemented in pre-
dictive coding through neuromodulatory systems (e.g., the synaptic strengths
of serotonergic, dopaminergic, and noradrenergic synapses) that amplify or
suppress the strength of synaptic inputs conveying more reliable or less reliable
prediction error signals, respectively. Thus, if a person paid more attention to
muscle tension when trying to recognize their emotions, the activity of predic-
tion error neurons associated with muscle tension estimates would be upwardly
modulated. For a simplified illustration of the dynamics of this type of neural
network architecture, see Figure 15.1 and the figure legend describing the associated
example (i.e., where a neural network has to infer happiness level based on
perceived energy level, given different priors and precision estimates).
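
The dynamics of such a network can be sketched numerically. The code below is a deliberately simplified illustration in the spirit of single-unit predictive coding models, not a reproduction of any published simulation. It assumes that the expected energy level simply equals the happiness estimate (a linear mapping), and the function name `infer_happiness`, the parameter names, and all numerical values are hypothetical.

```python
def infer_happiness(energy_input, prior_mean, prior_precision,
                    sensory_precision, dt=0.01, steps=2000):
    """Settle a one-unit predictive coding network by Euler integration.

    Assumes (for illustration only) that the predicted energy level equals
    the happiness estimate, i.e., a linear mapping g(h) = h.
    """
    h = prior_mean  # start the happiness estimate at the prior expectation
    for _ in range(steps):
        # Precision-weighted prediction errors at the higher and lower level.
        pe_prior = prior_precision * (h - prior_mean)
        pe_sensory = sensory_precision * (energy_input - h)
        # The estimate moves in the direction that reduces both errors.
        h += dt * (pe_sensory - pe_prior)
    return h


neutral_energy = 0.0  # the same input is used in every case
low_prior = infer_happiness(neutral_energy, -1.0, 1.0, 1.0)
high_prior = infer_happiness(neutral_energy, +1.0, 1.0, 1.0)
reliable_low_prior = infer_happiness(neutral_energy, -1.0, 4.0, 1.0)
unreliable_low_prior = infer_happiness(neutral_energy, -1.0, 0.25, 1.0)
print(low_prior, high_prior, reliable_low_prior, unreliable_low_prior)
```

Under these assumptions the estimate settles at the precision-weighted average of the prior expectation and the input: roughly -0.5 versus +0.5 for low versus high prior expectations, and roughly -0.8 versus -0.2 when a prior expectation of low happiness is treated as highly reliable versus unreliable. The same neutral input is therefore read as more or less happiness depending only on the stored priors and precision estimates.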

Figure 15.1 (A) A possible neural basis for an internal model guiding the
recognition of one’s own level of happiness based on felt energy levels. This example
illustrates how the same input (neutral energy level) can be interpreted as a sign of
higher or lower levels of happiness based on different learned prior expectations and
precision estimates. In this model, blue triangles indicate cortical pyramidal neurons,
and black lines indicate axons terminating in synaptic connections. Arrows indicate
excitatory synaptic influences, and circles indicate inhibitory synaptic influences
(dashed arrows are not modeled, but indicate additional context-specific modulatory
influences that would be present in a more complete model). Activity of the
“Happiness Level” (HL) neuron estimates level of happiness, and “Energy Level” (EL)
neuron activity represents felt level of energy (i.e., low activity indicates low energy
and high activity indicates high energy). The two “Prediction Error” (PE) neurons
reflect prediction errors associated with HL activation (higher level) and with EL
activation (lower level). The strength of the two looping axons’ synapses (connecting
each PE neuron to itself) estimates the precision (reliability) of prior expectations
(πHL; higher level) and EL activity (πEL; lower level). Expected happiness (PrHL) is
conveyed through the strength of the top–down inhibitory synapse on the higher-level
PE neuron. Although not modeled here, PP (“Predictive Processing”) models
also include quantitative synaptic learning mechanisms (i.e., update equations)
allowing the strengths of the PrHL, πHL, and πEL synapses (i.e., prior expectations and
precision estimates) to be altered over time to better match patterns in experience.
Panels B through E illustrate changes in HL neuron activity (i.e., inferred happiness
level; black lines) over time when presented with a neutral energy level (i.e.,
moderate EL activity, most consistent with a neutral amount of happiness, all else
being equal; blue lines) under different model parameter values. These different
parameter values reflect (top) prior expectations of low (B) versus high (C) levels of
happiness, and (bottom) high (D) versus low (E) reliability estimates for expectations
of low happiness. As can be seen, after HL neuron activity stabilizes, lower levels
of happiness are inferred in B compared to C (reflecting the influence of prior
predictions), and in D compared to E (reflecting the influence of higher reliability
estimates for prior predictions of low happiness). For the detailed mathematics on
which this example is based, see Bogacz (2017).

With respect to behavior, two overlapping perspectives are of particular interest,
referred to as reinforcement learning (RL; Sutton & Barto, 1998) and active
inference (AI; Friston et al., 2016) models. Both provide algorithms capable
of representing the value of different states and/or observations (e.g., observing
the presence of a friend, the state of feeling sad, the state of being excited to go
on a run, etc.) and then inferring the sequence of actions (called a policy) that is
most likely to lead to the states and/or observations that have the highest value.
RL models focus on reward/value maximization. AI models aim to maximize
both reward (or “preferred observations”) and information gain (by minimizing
a statistical quantity called “expected free energy” that is closely related to
prediction error), as part of a broader aim to keep an organism in states congruent
with its own survival. We will first discuss RL models and then highlight the
additional benefits offered by AI.
The process of decision-making/action selection within RL models involves two
broad classes of mathematically defined algorithms: model-free (MF) and
model-based (MB) algorithms. MF algorithms work by slowly updating the stored values
of different actions (behaviors) one can take in different states of the world that
one can occupy. In the simplest case, the equations describing MF algorithms first
calculate a reward prediction error, reflecting the difference between expected
reward and observed reward after choosing an action in a specific state. Then, they
increase/decrease the stored value of a specific action in a specific state if it leads
to more/less reward than expected and/or if it leads to a state that itself has tended
to lead to highly rewarding/unrewarding states in the past. Such algorithms often
eventually learn optimal behavior policies, but they require a large number of trials
to optimize their action values for all states. This is in part because, unlike MB
algorithms, they do not keep a stored model of the world; they simply recognize
the state they are in and then select the action in that state with the highest stored
value (i.e., with no expectations about possible future outcomes).
Neural models of MF algorithms suggest that phasic midbrain dopamine
signals convey reward prediction errors that update the strengths of synapses
within corticostriatal loops that store state-dependent action values; this in turn
alters the future inhibition of action representations in each loop, where higher
value corresponds to lower inhibition (Berns & Sejnowski, 1996; Dayan & Daw,
2008; Dolan & Dayan, 2013; Frank, 2011). Subsequent competition between
different corticostriatal loops (associated with different possible actions) is then
biased toward selection of actions with the highest value (i.e., least inhibition).
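
The model-free update described in the text (compute a reward prediction error, then nudge the stored action value) can be sketched as follows. This uses a standard temporal-difference update of the Q-learning type as one representative MF algorithm; the state and action labels, the learning rate, and the discount factor are hypothetical values chosen only for illustration.

```python
from collections import defaultdict


def model_free_update(values, state, action, reward, next_state, actions,
                      learning_rate=0.1, discount=0.9):
    """One temporal-difference (Q-learning-style) update of a stored action value."""
    # Best value currently stored for the state that the action led to.
    best_next = max(values[(next_state, a)] for a in actions)
    # Reward prediction error: what was obtained (plus discounted future value)
    # minus what was expected for this action in this state.
    prediction_error = reward + discount * best_next - values[(state, action)]
    # Move the stored value a small step in the direction of the error.
    values[(state, action)] += learning_rate * prediction_error
    return prediction_error


values = defaultdict(float)      # all stored action values start at 0
actions = ["approach", "avoid"]  # hypothetical action labels

# Repeatedly experiencing a rewarded action slowly raises its stored value.
for _ in range(50):
    model_free_update(values, "party", "approach", reward=1.0,
                      next_state="end", actions=actions)
print(values[("party", "approach")])
```

Note that the stored value approaches the true reward only gradually, and only for the state-action pair that was actually experienced, which illustrates why such algorithms need many trials.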

MB algorithms are thought to co-exist with MF algorithms in the brain,
drawing on working memory processes within dorsolateral prefrontal and
dorsomedial striatal regions—​and interactions with other regions such as orbit-
ofrontal cortex (OFC) and hippocampus (Daw, Gershman, Seymour, Dayan, &
Dolan, 2011; Doll, Simon, & Daw, 2012; Wilson, Takahashi, Schoenbaum, & Niv,
2014). These algorithms build an internal cognitive map (model) of the world,
allowing one to simulate the consequences of different sequences of actions
and state changes and use these simulations of possible futures to select the ac-
tion sequence with the highest expected long-​term reward. This allows much
faster optimal policy learning than with MF algorithms alone (which only learn
from direct experience; see Sutton, 1991; Sutton & Barto, 1998). However, MB
algorithms become intractable when used alone with sufficiently large cogni-
tive maps (with many states and actions), which has led to proposals regarding
MB and MF system interactions and mechanisms that restrict search to
locations “nearby” one’s current location in such maps (e.g., see Browne et al.,
2012; Huys et al., 2012).
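
To make the contrast with model-free learning concrete, the following sketch plans by simulating action sequences on a stored map. The three-state map, its reward values, and the function names `simulate` and `best_plan` are hypothetical, and exhaustive search is used only because it is the simplest possible planner, not because it is proposed as the brain's method.

```python
from itertools import product

# Hypothetical cognitive map: transitions[state][action] -> (next_state, reward).
transitions = {
    "home":   {"stay": ("home", 0.0),   "go_out": ("street", -1.0)},
    "street": {"stay": ("street", 0.0), "go_out": ("friend", 5.0)},
    "friend": {"stay": ("friend", 1.0), "go_out": ("home", 0.0)},
}


def simulate(state, plan):
    """Mentally roll out a sequence of actions and sum the predicted rewards."""
    total = 0.0
    for action in plan:
        state, reward = transitions[state][action]
        total += reward
    return total


def best_plan(state, horizon):
    """Exhaustively evaluate every action sequence of length `horizon`."""
    actions = list(transitions[state])  # every state here offers the same actions
    return max(product(actions, repeat=horizon),
               key=lambda plan: simulate(state, plan))


print(best_plan("home", 1))  # ('stay',)
print(best_plan("home", 3))  # ('go_out', 'go_out', 'stay')
```

A one-step lookahead avoids the small immediate cost of leaving home, whereas a three-step lookahead accepts it to reach the larger later reward. The number of sequences evaluated grows as the number of actions raised to the power of the horizon, which is why this kind of search becomes intractable for large maps.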
The last RL concept that is relevant to our following examples is that of the ex-
plore/​exploit trade-​off (Wilson, Geana, White, Ludvig, & Cohen, 2014). To un-
derstand this trade-​off, consider what would happen if the previously described
algorithms were applied to a world that changes over time instead of remaining
stable. In this case, the algorithms would continue to select actions based on pre-
vious learning and therefore may never learn that a state that used to be quite
aversive is now highly rewarding. Therefore, for such algorithms to remain effec-
tive in changing environments, they must occasionally either choose actions at
random (and therefore explore states at random) or intentionally explore the en-
vironment to make sure their model is still accurate. However, if one explores too
much, then behavior also quickly becomes suboptimal. For example, it would be
maladaptive to continually explore the state of “touching a hot stove” to see if it
still burns you like it did the previous times you touched it. Therefore, it is best
if exploratory actions are taken with small probability (e.g., P = 0.2), whereas
learned action values are instead exploited the majority of the time (e.g., P = 0.8).
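
One common way to implement this trade-off is an "epsilon-greedy" rule, in which a random action is chosen with a small fixed probability and the highest-valued action is chosen otherwise. The text does not name a specific rule, so this choice is an assumption; the action labels and values below are likewise hypothetical.

```python
import random


def choose_action(action_values, explore_prob=0.2, rng=random):
    """Epsilon-greedy choice: explore at random with probability `explore_prob`."""
    if rng.random() < explore_prob:
        return rng.choice(list(action_values))          # explore
    return max(action_values, key=action_values.get)    # exploit


action_values = {"touch_stove": -10.0, "cook_dinner": 2.0, "order_food": 1.0}
rng = random.Random(0)  # seeded generator so that runs are repeatable
choices = [choose_action(action_values, 0.2, rng) for _ in range(10_000)]
print(choices.count("cook_dinner") / len(choices))
print(choices.count("touch_stove") / len(choices))
```

Because random exploration is blind to learned values, even the clearly harmful action is still sampled occasionally, which is one motivation for the more directed forms of exploration discussed next.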
The previously described processes have close analogues within AI models
(e.g., see Pezzulo et al., 2015). However, the AI framework models policy selec-
tion as approximate Bayesian inference and provides a hierarchical neural pro-
cess theory largely analogous to (and combinable with) the previously described
predictive coding model—​positing populations of neurons that convey pre-
diction and prediction error signals with respect to preferred observations,
perceived states, and policies (among others), as well as synaptic strengths and
neuromodulatory processes that encode prior expectations, precision estimates,
and related parameters (Friston, Parr, & de Vries, 2017). When long timescale
predictions are assigned high precision estimates, an AI network can select
“long-sighted” policies in a manner similar to MB algorithms in RL models,
whereas if higher precision estimates are assigned to short timescale predictions,
an AI network can act more “short-​sightedly” like MF algorithms (e.g., produ-
cing more habitual and impulsive behavior that is less sensitive to context and
higher-​level knowledge).
Aside from providing a more explicit neural process theory, the AI framework
accounts for three further aspects of behavior. First, as briefly mentioned previ-
ously, the AI framework selects policies that minimize an information-​theoretic
quantity called expected free energy. This implicitly motivates behaviors that
balance maximizing reward, in the sense of observing preferred outcomes,
with maximizing information gain, that is, minimizing ambiguity/uncertainty
and maximizing internal model accuracy (Friston, FitzGerald, Rigoli,
Schwartenbeck, & Pezzulo, 2017). This means that selected policies will balance
exploration/exploitation in a principled manner,1 exploring the environment
enough to become confident in the accuracy of the internal model before exploiting inferences
about the states that will produce preferred observations. To do this in an
approximately Bayes-optimal manner, AI formally casts preferences as probability
distributions over observations that are “expected” to maintain evolutionary fit-
ness. One thus inherits the “expectation” to observe specific blood glucose levels,
for example, and so food is rewarding when it brings blood glucose levels back
to expected levels; or one could implicitly “expect” to observe high amounts
of social support, and so loss of social support would be aversive. (To be clear,
this use of the term “expect” simply reflects the formal mathematics, and is dis-
tinct from what a person would expect to observe if they made one choice vs.
another.) Second, the AI framework seamlessly accounts for behavioral varia-
bility. Policies are explicitly inferred, and this allows for continuously updated
estimates of the expected precision of policy selection, which drives choice var-
iability. Third, the AI framework offers a specific account of skeletomotor and
visceromotor control. Here, when proprioceptive/​interoceptive prediction error
signals are transiently assigned low-​precision estimates, their influence is atten-
uated and descending proprioceptive and interoceptive prediction signals can
then dynamically alter the set points of skeletomotor and visceromotor reflex
arcs—​causing them to alter muscle and internal organ activity in a manner that
fulfills those predictions (Adams, Shipp, & Friston, 2013; Pezzulo et al., 2015;
Pezzulo, Rigoli, & Friston, 2018). Thus, AI provides a mechanism by which per-
ceptual inferences can engage interoceptive predictions leading to affective au-
tonomic responses, and inferred policies can engage proprioceptive predictions
moving the body in ways that implement a planned sequence of actions.
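
To make the trade-off between preferred outcomes and information gain concrete, the following minimal Python sketch computes expected free energy for a policy as the sum of a "risk" term (the divergence between predicted and preferred observations) and an "ambiguity" term (the expected uncertainty of observations given states). It then converts these values into policy probabilities using a precision parameter. The function names, the two-state toy problem, and all numerical values are illustrative assumptions, not quantities taken from the cited models.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def expected_free_energy(state_probs, likelihood, preferences):
    """Risk + ambiguity for a single policy.

    state_probs[s]   : predicted probability of hidden state s under the policy
    likelihood[s][o] : probability of observation o given state s
    preferences[o]   : prior ("expected") probability of observation o
    """
    n_states, n_obs = len(state_probs), len(preferences)
    predicted_obs = [
        sum(state_probs[s] * likelihood[s][o] for s in range(n_states))
        for o in range(n_obs)
    ]
    # Risk: how far predicted observations diverge from preferred ones.
    risk = sum(q * math.log(q / p)
               for q, p in zip(predicted_obs, preferences) if q > 0)
    # Ambiguity: expected uncertainty about observations given states.
    ambiguity = sum(state_probs[s] * entropy(likelihood[s])
                    for s in range(n_states))
    return risk + ambiguity


def policy_probabilities(efe_values, precision):
    """Softmax over negative expected free energy, scaled by precision."""
    weights = [math.exp(-precision * g) for g in efe_values]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical toy problem: observations are [preferred, non-preferred].
preferences = [0.8, 0.2]
likelihood = [[0.8, 0.2],   # state 0: usually yields the preferred observation
              [0.5, 0.5]]   # state 1: yields maximally ambiguous observations
g_a = expected_free_energy([1.0, 0.0], likelihood, preferences)  # policy A -> state 0
g_b = expected_free_energy([0.0, 1.0], likelihood, preferences)  # policy B -> state 1

low = policy_probabilities([g_a, g_b], precision=1.0)
high = policy_probabilities([g_a, g_b], precision=8.0)
print(g_a, g_b, low, high)
```

In this toy case, policy A has the lower expected free energy, and raising the precision parameter makes its selection more deterministic, whereas lower precision yields more variable choices.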

1 To be clear, RL models have also been developed to balance exploration/exploitation in different
ways (Sutton & Barto, 1998).



Based on the conceptual resources provided by this approach, we will now
provide a few concrete examples of different psychotherapeutic interventions.
Specifically, we will illustrate how these therapeutic interventions would alter the
parameters of an individual’s internal model, leading to adaptive change.

Example 1: Exposure Therapy for Spider Phobia

In this first example, we will cast fear of spiders, its treatment, and the limitations
of this treatment in terms of computational neuroscience. The crucial issue here
is to infer whether the causes behind sensory experiences associated with spiders
are actually dangerous. Here, we will emphasize this by talking about “latent”
causes—​the underlying (often unobservable) situations/​factors that generate
patterns in an individual’s subjective observations (Gershman, Norman, & Niv,
2015). We will supplement the dynamics of changing latent causes with a predic-
tive coding/​active inference model to describe spider phobia and its treatment
using exposure therapy. As illustrated in Figure 15.2, a heuristic three-​level gen-
erative model (see Figure 15.1 for more detail) can add intuition regarding the
relevant computational mechanisms. This model illustrates how:

1. Probabilistic representations of abstract world states (or latent causes) can
predict the probabilistic co-occurrence of different concept representations
within semantic memory (deployed for stimulus recognition).
2. Concept representations can in turn predict the probabilistic co-​occurrence
of unimodal perceptual representations within exteroceptive, somato-
sensory, interoceptive, and proprioceptive cortical/​ subcortical sensory
systems.
3. The perceptual representations within interoceptive and proprioceptive
systems can (when their predictions are afforded high precision estimates)
implement visceromotor and skeletomotor commands, respectively (otherwise,
these representations would be adjusted to minimize prediction errors arising
from afferent bodily input, leading to new bodily percepts; that is, they also
contribute to perceptual inference).
4. Exteroceptive perceptual representations can predict, and attempt to min-
imize prediction error with respect to, patterns of activation across extero-
ceptive sensory organ states (e.g., retina, cochlea), which in turn track the
presence of objects (like spiders) out in the world.

The ability to infer latent causes is considered highly adaptive within the envir-
onments that humans and other animals inhabit. Not only are causes hidden,
but their effects may change (Chan, Niv, & Norman, 2016). Thus, when an

[Figure 15.2 diagram. Level 3 (Abstract World States): Threat Context, Safe Context. Level 2 (Concepts): DANGER, SPIDER, SAFETY. Level 1 (Percepts): High Heart Rate & Pain, Avoidance Behavior, “Black,” “Spider-shaped,” Low Heart Rate & No Pain, Approach Behavior. Signals: Somatic/Interoceptive Signals (including visceromotor commands), Exteroceptive Signals, Proprioceptive Signals (including skeletomotor commands). Bottom layer: Somatovisceral Body States, Exteroceptive Sensory Organ States, Skeletomotor States.]

Figure 15.2 A potential hierarchical basis for multimodal inference/learning
(abstracting from the example of hierarchical neural structure shown in
Figure 15.1). At the highest level, the brain is here envisioned to infer the presence
of latent causes, or abstract states of the world (context) the individual is in; two
latent causes/​contexts may be indistinguishable when based solely on current
perception. Based on innate or learned expectations, different contexts predict
the co-​occurrence of different concepts; here, the threat context predicts the
co-​occurrence of SPIDER and DANGER, whereas the safe context predicts the
co-​occurrence of SPIDER and SAFETY. When activated, each of these concepts,
in turn, predicts the co-​occurrence of different percepts across different sensory
modalities; here, SPIDER predicts the co-​occurrence of exteroceptive percepts
such as “black” and “spider-​shaped.” DANGER predicts the co-​occurrence of
somatic/​interoceptive percepts such as “high heart rate” and “pain,” as well as
proprioceptive percepts associated with skeletomotor actions involving avoidance.
SAFETY instead predicts the co-​occurrence of somatic/​interoceptive percepts
such as “low heart rate” and “no pain,” as well as proprioceptive percepts associated
with skeletomotor actions involving approach. Active inference models suggest
that interoceptive and proprioceptive predictions can be fulfilled by (and therefore
act as) visceromotor and skeletomotor commands, respectively. Before exposure
therapy, in this simplified example the brain of an individual with spider phobia can
be understood to infer that it is in the threat context when it perceives/​recognizes
a spider, leading to a cascade of predictions that ultimately lead to increased heart
rate and avoidance behavior. After exposure therapy, such an individual’s brain can
be understood to instead infer that it is in the safe context, leading to a different
cascade of predictions that ultimately lead to low heart rate and approach behavior.

individual’s expectations are violated, this can be resolved with two types of be-
lief updates. First, the brain could infer that the same cause (that has been active
all along) now produces different observations. Second, and more likely if one
assumes that causal relationships in the world are fairly stable, the brain could
infer that a new latent cause is now present instead of the previous one.
This latter inference appears to occur in most animal studies on fear extinc-
tion (Gershman et al., 2015), where the original relationship between a condi-
tioned and unconditioned stimulus (CS and US, respectively) is in fact generated
by an unobservable latent cause (i.e., in this case that unobservable cause is the
current goal of the experimenter). When this relationship abruptly changes
during extinction trials, the animal’s brain now (implicitly) infers the presence
of a new latent cause (often referred to as forming a new “safety memory” in the
animal extinction literature) and that this second latent cause is generating the
new CS–​US relationship (i.e., in this case the new cause is the experimenter’s
new goal, but the animal only implicitly infers that the cause has changed—​it
doesn’t know what the cause is). However, in ambiguous future circumstances,
the brain can infer the probable return of the original latent cause, leading to
spontaneous recovery (return of fear). Interestingly, one prediction of latent
cause models is that if the CS–​US relationship changes gradually as opposed to
abruptly, this would favor the inference that the original latent cause is still pre-
sent but that the observations it generates have evolved (i.e., altering the original
“fear memory”; memory reconsolidation), which should prevent return of fear.
Significantly, a recent experiment (Gershman, Jones, Norman, Monfils, & Niv,
2013) confirmed this prediction by showing that spontaneous fear recovery did
not occur after a gradual (as opposed to abrupt) extinction phase (i.e., in which
the CS was followed by the US with lower and lower frequency, instead of simply
ceasing to follow the CS completely right from the start).
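
The contrast between abrupt and gradual extinction can be illustrated with a deliberately simplified Python sketch. It is not the latent cause model used in the cited studies. Here, each observation is the experienced rate at which the US follows the CS, the currently active cause predicts that rate with Gaussian noise, and a candidate new cause predicts all rates equally. The function names and parameter values (prior probability of a new cause, noise level, learning rate) are assumptions chosen purely for illustration.

```python
import math


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution at x."""
    return math.exp(-0.5 * ((x - mean) / sd) ** 2) / (sd * math.sqrt(2 * math.pi))


def run_extinction(us_rates, p_new=0.05, sd=0.15, learning_rate=0.5):
    """Return the learned US-rate prediction of each inferred latent cause."""
    cause_means = [1.0]   # original "fear memory": the CS is always followed by the US
    active = 0
    for x in us_rates:
        like_old = gaussian_pdf(x, cause_means[active], sd)
        like_new = 1.0    # a brand-new cause predicts any rate in [0, 1] equally
        post_new = p_new * like_new / (p_new * like_new + (1 - p_new) * like_old)
        if post_new > 0.5:
            # Large prediction error: infer a new cause, leave the old memory intact.
            cause_means.append(x)
            active = len(cause_means) - 1
        else:
            # Small prediction error: revise the prediction of the current cause.
            cause_means[active] += learning_rate * (x - cause_means[active])
    return cause_means


abrupt = run_extinction([0.0] * 10)
gradual = run_extinction([round(1.0 - 0.1 * k, 1) for k in range(1, 11)])
print("abrupt:", abrupt)
print("gradual:", gradual)
```

Under these assumptions, the abrupt schedule produces a second cause while the original prediction (US rate of 1.0) remains stored and available for later reactivation, whereas the gradual schedule keeps a single cause whose prediction drifts toward zero.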

Figure 15.2 Continued

Return of fear after exposure therapy can then be understood as reflecting later
uncertainty about occupying the threat context or the safe context. Recent research
suggests that gradual (as opposed to abrupt) extinction learning may lead to changes
in what the threat context predicts (i.e., changes in the original memory, instead
of favoring the inference that there has been a change in state/​cause). Black arrows
indicate the exchange of mutually reinforcing top–​down predictions and bottom–​up
prediction error signals between hierarchical levels. Smaller gray arrows indicate
lateral (within-​layer) excitatory and inhibitory signaling, allowing, for example,
the activation of the threat context representations to inhibit the activation of
the safe context representations, or allowing “high heart rate” representations to
directly prime predictions about avoidance behavior (e.g., if these were consistently
activated together in past experience).

Recent studies have provided evidence that the OFC infers/​represents prob-
ability distributions over latent causes, where different OFC activation patterns
correspond to different probable latent cause inferences—​leading to different
top–​down influences (prior predictions) over lower-​level structures (e.g., the
amygdala; Chan et al., 2016). These different top–​down influences alter lower-​
level predictions regarding co-​occurring stimuli, leading, for example, a CS
representation to predict a US representation under one pattern of top–​down
influence but not under another. In the case of gradual extinction, on the other
hand, one would expect lower-​level representations of CS–​US relationships (e.g.,
in the amygdala) to change while under the same top–​down influence (reflecting
the same latent cause), although this has not been thoroughly tested to date.
Having introduced these relevant preliminary concepts, the present example
of exposure therapy in spider phobia will make use of the three-​level predictive
processing model shown in Figure 15.2. The individual with spider phobia starts
out representing the presence of a latent cause at the third level that we will refer
to as the “threat context,” which predicts the high-​probability co-​occurrence
of the concept SPIDER and the concept DANGER. The concept SPIDER in
turn predicts particular exteroceptive perceptual representations (e.g., “black,”
“spider-​shaped”); thus, when a spider is actually observed, prediction error min-
imization leads to an increase in the represented probability of these perceptual
representations, which in turn increases the represented probability of the con-
cept SPIDER (i.e., the person would recognize the presence of a spider). Because
the threat context predicts the high-​probability co-​occurrence of the concept
SPIDER and the concept DANGER, this latter concept is also strongly activated
and predicts specific somatosensory, interoceptive, and proprioceptive percepts
(e.g., those associated with pain, high heart rate, and avoidance behavior, respec-
tively). Assuming interoceptive/​proprioceptive predictions are assigned high
precision in the threat context, the person would therefore respond to seeing
a spider by displaying an increased heart rate and by behaviorally avoiding the
spider (which could also be cast more explicitly in terms of the policy selection
processes described previously). The most clinically relevant memories in this
description are here identified as the learned prediction strengths (priors) be-
tween levels (e.g., the threat context representation predicts the co-​occurrence of
SPIDER and DANGER with probability = 0.9; likely mediated by the strengths of
top–down synaptic connections from OFC neurons at the higher level to
lower-level neurons in the amygdala and other temporal lobe structures).
During exposure therapy, an individual abruptly and continuously observes
a spider along with the absence of many percepts predicted by the concept
DANGER (e.g., no pain and reduced arousal over time). As described previously,
when these prediction errors are strong and abrupt during extinction, the brain
does not appear to minimize the resulting prediction error signals by altering the
represented probability that SPIDER and DANGER will co-​occur in the threat
context; instead, it infers that the latent cause has changed and that it is now in a
new world state that we will call the “safe context” (i.e., even if the threat context
and safe context are otherwise perceptually indistinguishable). Therefore, the ex-
posure experience teaches the brain that, in the safe context, the probability that
SPIDER and DANGER will co-​occur is low and that the probability that SPIDER
and SAFETY will co-​occur is high instead. Put another way, the individual’s brain
still implicitly believes that the original latent causal association exists; it simply
infers that a different (unobservable) latent cause is now present. Following the
previously described explanatory structure, because SAFETY predicts low heart
rate and approach behavior, this is how an individual will respond to the spider
after exposure therapy (again, for simplicity, we have here omitted discussion
of additional nuances associated with policy selection). However, the original
memories associated with the threat context are still present and can lead to the
return of phobic symptoms as soon as the brain again represents that this latent
cause has returned. The process of inferring the new safe context corresponds to
the “new memory” typically created during extinction learning, which is thought
to explain why spontaneous recovery (i.e., “return of fear”) can occur after
extinction (i.e., because the original threat context predictions can still be reactivated).
Following schema theory, the establishment of two sets of associations, when
simply SPIDER–​SAFETY would suffice, has been termed “overaccommodation”
(Moutoussis et al., 2017).
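
A small worked example may help to show why fear can return even though the safe context has been learned. The Python sketch below averages the SPIDER–DANGER prediction over the two contexts, weighted by the current belief in the threat context. The value 0.9 for the threat context is taken from the example above; the value 0.1 for the safe context and the three belief levels are hypothetical numbers chosen for illustration.

```python
def p_danger_given_spider(p_threat_context,
                          p_danger_in_threat=0.9,   # from the example in the text
                          p_danger_in_safe=0.1):    # assumed for illustration
    """Probability that DANGER co-occurs with SPIDER, averaged over contexts."""
    return (p_threat_context * p_danger_in_threat
            + (1.0 - p_threat_context) * p_danger_in_safe)


before_therapy = p_danger_given_spider(0.95)   # threat context strongly inferred
after_therapy = p_danger_given_spider(0.05)    # safe context strongly inferred
ambiguous_later = p_danger_given_spider(0.50)  # later uncertainty about context
print(before_therapy, after_therapy, ambiguous_later)
```

Because the threat context prediction itself is unchanged in this sketch, any later increase in the inferred probability of the threat context raises the predicted probability of danger again, which corresponds to the return of fear described above.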
Based on the recent work on gradual extinction, however, it would be possible
to alter the threat context memory instead (i.e., alter the descending synaptic
influence this latent cause representation has on lower-​level structures and/​or
alter lateral synaptic interactions encoding expected co-​variance relationships
between the lower-​level representations of SPIDER and DANGER). This would
happen if the size of prediction errors driven by the threat context were man-
aged so as to change gradually, rather than abruptly, because large prediction
errors during abrupt extinction favor new latent cause inference, whereas
smaller prediction errors during gradual extinction may promote slower
changes to beliefs about the original latent cause. In experimental conditioning/​
extinction paradigms, this is achieved by slowly rather than abruptly lowering
the experienced frequency with which the CS co-​occurs with the US, but in the
human context, where an individual has rarely or never actually experienced
a dangerous spider bite, other clinically relevant features analogous to the US
frequency that may signal latent cause switching must be considered. One such
feature may simply be the presence versus absence of a therapist (Moutoussis
et al., 2017). Carefully designed behavioral experiments could make a start on
this by explicitly inviting the patient not only to work within his or her own
threat contexts but also to mitigate the downstream consequences predicted
by the patient’s own internal model, enough to construct a “zone of proximal
learning” (Vygotsky, 1980). More generally, this discussion suggests that the
form and time course of experienced contingency changes in therapy may rep-
resent an important variable to consider when attempting to alter an original
fear memory (as opposed to promoting the formation of a new “latent cause”
memory, as appears to occur in current exposure therapies that use abrupt ex-
tinction procedures).

Example 2: Corrective Therapeutic Experiences That Improve Social Functioning

In this second example, we illustrate how the same hierarchical structure
depicted in the previous example can also be used to model corrective
therapeutic experiences that relate to problems in social functioning. In Figure 15.3,
a client starts in an abstract world state that represents a very generic prior ex-
pectation that “I am not a person of value” (perhaps learned through repeated
experience during development; see Chapter 12 of this volume for discussion
of a similar case). This predicts a high probability of the concept or schema of
SOCIAL REJECTION. In the context of percepts furnishing soft evidence, such
as another individual looking away from the client with a neutral facial ex-
pression, the presence of social rejection would be inferred (i.e., just like sad-
ness accounted for slow movement in the previous primer). Activation of the
SOCIAL REJECTION concept would in turn promote unpleasant autonomic
arousal and avoidance behavior.
The clinically relevant memory in this case would primarily encode the
prior belief “I am not a person of value” in synaptic strengths conveying signals
from higher neurocognitive levels, while corrective therapeutic experiences
would typically involve a therapist repeatedly responding empathetically and
nonjudgmentally. This would strengthen the alternative expectation, that “I am a
person of value.” Hence, the prior for ACCEPTANCE will come to have a higher
probability relative to REJECTION, given identical ambiguous exteroceptive so-
cial percepts. This can have crucial computational consequences. A greater belief
in ACCEPTANCE will predict, and lead to, the experience of a greater expectation
of pleasant bodily percepts (e.g., low arousal). Crucially, the component of
expected unpleasant surprise (free energy) associated with approaching those
who appear rejecting will no longer outweigh the value of acting to reduce
uncertainty about them (the epistemic component of free energy). The person will
explore, approach and learn about others more—​which would in turn tend to
garner positive responses from other people and, over time, improvements in
social functioning.
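
The role of the prior in interpreting ambiguous social percepts can be sketched with Bayes' rule. In the minimal Python example below, an ambiguous percept (e.g., a neutral face looking away) is assumed to be equally likely under REJECTION and ACCEPTANCE, so the inferred probability of rejection is determined entirely by the prior. The function name and all numerical values are hypothetical.

```python
def posterior_rejection(prior_rejection, lik_if_rejection, lik_if_acceptance):
    """Posterior probability of REJECTION given one percept (two-concept case)."""
    joint_rejection = prior_rejection * lik_if_rejection
    joint_acceptance = (1.0 - prior_rejection) * lik_if_acceptance
    return joint_rejection / (joint_rejection + joint_acceptance)


# Assumed: the ambiguous percept is equally likely under both concepts.
ambiguous = (0.5, 0.5)
before_therapy = posterior_rejection(0.8, *ambiguous)  # strong prior for rejection
after_therapy = posterior_rejection(0.3, *ambiguous)   # prior shifted by therapy
print(before_therapy, after_therapy)
```

With identical sensory evidence, the inference follows the prior, which is why strengthening the alternative expectation changes what the same ambiguous percepts are taken to mean.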

[Figure 15.3 diagram. Level 3 (Abstract World States): “I am not a person of value,” “I am a person of value.” Level 2 (Concepts): SOCIAL REJECTION, SOCIAL ACCEPTANCE. Level 1 (Percepts): Unpleasant High Arousal, Avoidance Behavior, Other Person’s Neutral Facial Expression, Other Person Looking Away, Pleasant Low Arousal, Approach Behavior. Signals: Somatic/Interoceptive Signals (including visceromotor commands), Exteroceptive Signals, Proprioceptive Signals (including skeletomotor commands). Bottom layer: Somatovisceral Body States, Exteroceptive Sensory Organ States, Skeletomotor States.]

Figure 15.3 An extension of the hierarchical model in Figure 15.2 to the case of
maladaptive abstract expectations learned during childhood and how these can
influence perception, conceptualization, and behavior. Here, the abstract Level
3 expectation that “I am not a person of value” predicts SOCIAL REJECTION,
which, in turn, predicts ambiguous percepts of the behavior of others (e.g., neutral
facial expressions and looking away) as well as “unpleasant high arousal” and
“avoidance behavior.” Thus, when the “I am not a person of value” expectation biases
perception and conceptualization, such ambiguous percepts will be recognized as
evidence of social rejection and lead to unpleasant arousal and avoidance behavior.
In contrast, the Level 3 expectation that “I am a person of value” predicts SOCIAL
ACCEPTANCE, which, in turn, predicts the same ambiguous percepts of the
behavior of others (e.g., neutral facial expressions and looking away) as well as
“pleasant low arousal” and “approach behavior.” Thus, when the “I am a person
of value” expectation biases perception and conceptualization, such ambiguous
percepts will be recognized as consistent with social acceptance and lead to low
arousal and approach behavior. Even higher levels could also be included in such
a model, where a person expects that they are valuable/​lovable in certain contexts
but not others. As in Figure 15.2, black arrows indicate the exchange of mutually
reinforcing top–​down predictions and bottom–​up prediction error signals between
hierarchical levels, whereas the smaller gray arrows indicate lateral (within-​layer)
excitatory and inhibitory signaling.

Example 3: Repetitive Maladaptive Behavior and the Role of Therapy in Promoting Behavior Change Through Exploration

In the previous example, we used the predictive coding/active inference
framework to illustrate one way in which abstract expectations can lead to
self-reinforcing cycles of behavior (i.e., high represented probabilities for “I am not a
person of value” led to behaviors that promoted further social isolation, whereas
high represented probabilities for “I am a person of value” led to behaviors that
promoted further social acceptance). In the third example described here we ap-
peal to computational models of action selection within the RL tradition to il-
lustrate another type of clinically relevant memory (action value memory) that
successful therapies likely (and should) target to promote more adaptive behav-
ioral patterns. This same example could also be framed in terms of active infer-
ence, and the take-​home message would be similar (i.e., in active inference this
would correspond in part to learning prior expectations about policies—​often
analogous to habit learning—​and in part to learning new associations between
states and preferred observations). We use the simpler RL formulation here for
ease of exposition.
To shed light on maladaptive avoidance behavior, we draw an analogy to a
well-​known difficulty in RL called the “shortcut problem” (Sutton & Barto,
1998). This problem pertains to the previously discussed explore/​exploit di-
lemma, and it is not effectively solved, even by occasionally taking exploratory
actions. After outlining this analogy, we will show how, in clinical cases of mala-
daptive behaviors that can be modeled by this problem, solving it may require a
therapist to motivate an individual to intentionally choose specific sequences of
exploratory actions and visit specific sequences of states, even though they have
reliably led to very aversive outcomes in a client’s personal past. This requires
a strong client–​therapist alliance; otherwise, either avoidance of therapy or
overaccommodation will more likely ensue.
To understand the shortcut problem and how it relates to maladaptive beha-
vior and therapeutic mechanisms, consider the top panel illustrated in Figure
15.4. This depicts an abstract “grid-​world,” where each box is a state (i.e., an in-
ternal/​external situation the person could be in), and there are four possible
actions in each state (up, down, left, and right) standing in for different behaviors
a person could choose to engage in. The state marked “S” is where the individual
starts, and the state “G” is the goal, the only state that provides direct reward.
The direction of the arrow depicted in each state indicates the action with the
highest learned value in that state. This represents optimal behavior given the
location of the barrier (dark gray boxes). The set of behaviors a child learns in a

[Figure 15.4 diagram. Top panel: model learned from repeated experience in the agent’s “childhood environment” (goal state marked “G”). Bottom panel: inability to learn a new model (with a more adaptive action sequence) in the agent’s “adult environment.”]

Figure 15.4 Illustration of the shortcut problem (based on Sutton & Barto, 1998,
Figure 9.8). In this example, the agent lives in “gridworld,” where each square
represents an abstract “state” the agent can occupy (e.g., the state of being at school,
the state of feeling anxiety, etc.). There are four actions available in each state (up,
down, left, and right), which lead the agent to move to a different state (unless a
barrier is blocking the state transition, in which case the agent remains in the same
state). These abstract actions stand in for actions a real person would consider when
in a real state they recognize (e.g., expressing vs. suppressing emotions when in the
state of feeling anxiety, leading the agent to transition to a new state that may be
rewarding, punishing, or neutral). Initially, the agent explores randomly (starting
in state “S”), and eventually finds state “G” (which represents the “goal state” that
provides a reward). Based on the mathematically defined learning algorithms for
learning/​planning within reinforcement learning (RL) models, over many trials the
agent comes to prefer (i.e., assign the highest value to) the actions indicated by the
arrows shown in each state (i.e., which would be selected by model-​free algorithms);
model-​based algorithms can also use a learned internal model of gridworld to
simulate different possible action sequences and select the optimal route in the
model. In this example, the optimal route (i.e., going left and up around the barrier)
is accurately learned for the “childhood environment” (top). In adulthood (bottom),
however, the world has changed, and a more optimal “shortcut” path is possible (i.e.,
the shorter open path going right and then up). However, for multiple reasons, many
RL algorithms (such as one-​step tabular Q-​learning) will fail to ever learn that this
more adaptive behavior pattern exists. First, model-​free algorithms will continue to

certain situation, such as in the context (grid) of an angry parent (barrier), can be
thought of in these terms.
However, even if such behaviors were historically optimal, the situation in
adulthood often differs dramatically from that in childhood. This may be espe-
cially true if one’s childhood environment was characterized by abuse or neglect,
where “optimal” behavioral adaptations can lead to very poor functioning in
adulthood. For example, in adulthood, approaching certain states of an angry
partner compassionately but assertively may offer the most healthy and effective
means (the “shortcut”) leading to preferred outcomes. For those who never had
to go the “long way” in childhood (i.e., avoidance or defensive behavior, because
the “approach” strategy just mentioned was severely punished in childhood),
it can be difficult to understand why suboptimal behaviors are repeated again
and again in adulthood. The bottom panel of Figure 15.4 illustrates this: com-
putational algorithms relying on local exploration have difficulty solving this
problem, failing to learn that the shortcut even exists.
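
The shortcut problem can be reproduced in a few lines of Python. The sketch below is a hedged illustration under several assumptions: the grid layout only loosely follows the shortcut maze, value iteration stands in for whatever learning process produced the "childhood" action values, and the agent then acts purely greedily on those cached values in the changed "adult" environment. All names and parameter values are hypothetical.

```python
ROWS, COLS = 6, 9
START, GOAL = (5, 3), (0, 8)
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}
GAMMA = 0.95  # discount factor (assumed)

CHILD_WALLS = {(3, c) for c in range(1, 9)}  # only gap is at the far left
ADULT_WALLS = {(3, c) for c in range(1, 8)}  # a new gap has opened at the far right


def step(state, action, walls):
    """Deterministic transition; blocked moves leave the agent in place."""
    r = state[0] + ACTIONS[action][0]
    c = state[1] + ACTIONS[action][1]
    if 0 <= r < ROWS and 0 <= c < COLS and (r, c) not in walls:
        return (r, c)
    return state


def learn_action_values(walls, sweeps=200):
    """Value iteration: returns Q[state][action] for a given wall layout."""
    states = [(r, c) for r in range(ROWS) for c in range(COLS)
              if (r, c) not in walls]
    q = {s: {a: 0.0 for a in ACTIONS} for s in states}
    for _ in range(sweeps):
        for s in states:
            if s == GOAL:
                continue
            for a in ACTIONS:
                nxt = step(s, a, walls)
                reward = 1.0 if nxt == GOAL else 0.0
                future = 0.0 if nxt == GOAL else max(q[nxt].values())
                q[s][a] = reward + GAMMA * future
    return q


def greedy_path(q, walls, max_steps=100):
    """Follow the highest-valued cached action in the actual environment."""
    state, path = START, [START]
    while state != GOAL and len(path) <= max_steps:
        action = max(q[state], key=q[state].get)
        state = step(state, action, walls)
        path.append(state)
    return path


childhood_q = learn_action_values(CHILD_WALLS)
habit_path = greedy_path(childhood_q, ADULT_WALLS)        # old values, new world
replanned_path = greedy_path(learn_action_values(ADULT_WALLS), ADULT_WALLS)
print(len(habit_path) - 1, len(replanned_path) - 1)       # steps taken by each
```

In this sketch, the agent acting on childhood values still reaches the goal, so nothing it observes contradicts its expectations, yet it never passes through the newly opened gap and therefore takes the longer route.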
To understand this, consider the values of different actions (“action values”)
that guide behavior. Action values can be either directly stored as memories to
be retrieved in each state, or calculated on the fly using (now inaccurate semantic
and episodic) memories of the current context. “Trying out” behaviors that
would allow learning about the shortcut would, according to the client’s internal
model, lead to states with worse values and worse affect, while their established

Figure 15.4 Continued

prefer the actions with the highest values learned in childhood (put differently, they
will “avoid” actions with low values when those actions were previously punished,
or if they were simply less efficient at getting to the goal state). Even if an algorithm
occasionally chooses an “exploratory” action at random (i.e., with some defined
probability; e.g., 10% of the time), it will rarely get far enough to the right to learn
the new route (i.e., requiring eight random actions in a row in this example). Second,
model-​based algorithms will continue to simulate the presence of the barrier
on the right and therefore will never generate a plan to take that path or gain the
new experience necessary to update the model. To learn about this more adaptive
path (i.e., to learn a more adaptive “policy” for selecting actions in each state), the
agent will therefore need to somehow be guided/​motivated to explore states and
actions with currently low represented values. This is one way of understanding
why a therapist may be necessary for some individuals to learn more adaptive
behavior patterns and to stop repeating maladaptive behavior patterns (i.e., which
were learned precisely because they were adaptive during childhood). Put another
way, the therapist may facilitate/​motivate a client to explore the consequences of
action sequences that they have learned to fear (i.e., assign low value to) and would
therefore never choose on their own.

actions would continue to be followed by states consistent with their current
action values. Even if “exploratory” actions are built into behavior, resulting in
choosing the “going right” behavior with, say, probability p = 0.2, the client
would have to randomly choose eight low-value actions in a row to discover the
shortcut (let alone learn to prefer it). The chances of this would be 0.2^8
(about 2.6 × 10^-6), or about 1 in 400,000. It can thus be very difficult for an
individual to learn more adaptive
adult behaviors if current behaviors continue to lead to expected outcomes.
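
The arithmetic behind this estimate can be checked directly. The short Python sketch below assumes that each of the eight steps is an independent choice made with the same probability; the function name is hypothetical, and the second call uses the 10% exploration rate mentioned in the caption of Figure 15.4.

```python
def chance_of_run(p_per_step, n_steps):
    """Probability of n independent low-value choices in a row."""
    return p_per_step ** n_steps


p_run = chance_of_run(0.2, 8)
attempts_needed = 1.0 / p_run          # expected attempts before one such run
p_run_caption = chance_of_run(0.1, 8)  # 10% exploration rate, as in Figure 15.4
print(p_run, attempts_needed, p_run_caption)
```

With p = 0.2, the probability is roughly 2.56 in a million, or about 1 in 390,625 attempts; with a 10% exploration rate, it falls to 1 in 100 million.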
How can therapy help in this model of maladaptive repetitive behaviors?
One key insight is that unless some outside influence strongly encourages and
motivates a client to repeatedly choose the shortcut behavior—​even though
many of the steps along the way will at first be experienced as highly aversive—​
then a client may never learn the more adaptive behavior on their own. A client
may also need additional influences in therapy to prevent/​correct counterfac-
tual reasoning problems (e.g., assuming things would have become worse if
they continued with the new behavior for much longer). This extensive (but
therapist-​directed, non-​random) exploratory behavior, combined with correc-
tive feedback about automatic cognitive interpretations, may be the only way
that, eventually, action values cached in memory (so-​called MF action values)
can be updated. It may also be the only way that the client’s internal model of
the world can be updated to include the shortcut and use it in future behavior
planning. The relevant point here is that there are many clinical cases in which
a therapy can be seen as providing the corrective experiences (i.e., expecta-
tion violations), motivation, and cognitive guidance that promotes, and allows
learning from, this exploration of sequences of states that a client has learned to
expect will be highly aversive. Direct early therapeutic interactions (i.e., those
that motivate such exploratory actions) can also be thought of as facilitating
some initial revisions to a client’s internal model—​such that they start to ex-
plicitly simulate less aversive outcomes in such states. For example, helping a
client identify alternative interpretations of the past and alternative
predictions about the future in cognitive therapies (e.g., reappraisal)
can be understood as broadening the estimated probability distribution over
possible interpretations of their situation within their internal model (i.e., re-
ducing its precision)—​thus reducing the influence of any one interpretation on
future perception/​behavior. However, these types of “within-​session” model
revisions would have less immediate impact on cached action values that drive
action tendencies, which is why true, and not simply simulated, exploratory
experiences are necessary. In summary, a therapist can be seen as someone who
can help a client to explore, and correctly learn from, states of the world that
are actually safe but where a client’s past experience has taught them that those
states are dangerous, which prevents them from taking actions to learn other-
wise on their own.
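
The idea that reappraisal broadens the distribution over interpretations (i.e., reduces its precision) can be sketched numerically. The Python example below flattens a distribution by raising each probability to the power 1/temperature and renormalizing. This tempering operation is only one possible way to formalize "broadening," and the function names, the three candidate interpretations, and all values are assumptions made for illustration.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats); higher values mean a broader distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def broaden(probs, temperature):
    """Temper a distribution: temperature > 1 flattens it (lower precision)."""
    weights = [p ** (1.0 / temperature) for p in probs]
    total = sum(weights)
    return [w / total for w in weights]


# Hypothetical beliefs over three interpretations of the same situation.
before = [0.90, 0.05, 0.05]
after = broaden(before, temperature=2.0)
print(after, entropy(before), entropy(after))
```

After broadening, the dominant interpretation carries less weight, so it exerts less influence on subsequent perception and planning, although in this sketch nothing is changed about cached action values.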

To illustrate the wide applicability of this formulation, consider three
concrete clinical realizations. First, consider a case in which a client has learned in
childhood that close relationships most often lead to pain. In this case, the be-
havioral policy such a child has learned will likely involve sequences of actions
that avoid all such relationships (the analogue of “leftward” behavior in Figure
15.4), reflecting the implicit (MF, implicitly cached in memory) and/​or explicit
(MB, consciously worked out) expectation of pain if they did not. Note that this
is not because avoiding such relationships is intrinsically pleasant but simply
because the actions/​states that involve seeking out such relationships (i.e., the
analogue of “rightward” behavior in Figure 15.4) are represented as having
much lower values (reflecting more unpleasant expected outcomes), and
the observed outcomes of avoiding relationships continue to be as expected (i.e.,
there is no observed evidence that their beliefs are wrong). One important ther-
apeutic mechanism is therefore for the therapist to facilitate a client beginning to
seek out relationships in adulthood anyway, until he or she has enough experi-
ence to learn that this often actually leads to higher overall quality of, and satis-
faction with, their life than expected (i.e., the analogue of learning that the more
optimal shortcut behavior path exists). The relevant mechanism here is therefore
to change a particular set of action value and “state value” memories, which facil-
itate more adaptive independent behavior in a client over time.
Similarly, consider a case where, in childhood, an individual was physically
abused and told to “toughen up” whenever he or she displayed outward signs of
sadness. As such, the action value for “expressing sadness” (i.e., when in the state
of “feeling sad”) would be very low, and the action value for “suppressing signs of
sadness” would be much higher. Further, the individual’s internal model would
predict that “expressing sadness” would lead to highly aversive (low-​value) states.
In adulthood, however, the objective fact of the matter is that many people will re-
spond with warmth and care when recognizing that the individual is sad and that
expressing sadness could actually facilitate greater quality of life for the individual as
a whole (i.e., “expressing sadness” represents the more optimal “shortcut” behavior,
whereas “suppressing signs of sadness” represents the less optimal “longer” behav-
ioral route). As in the last example, a therapist can here also be seen as facilitating/​
motivating the individual to “explore” or “try out” expressing sadness a few times, so
that he or she can learn that it leads to better outcomes than expected (i.e., updating
their implicitly stored values for sadness-​expressing actions).
Finally, consider a case in which an individual was abused in childhood and
continues to return to an abusive romantic relationship later in adulthood.
In this case, the relevant question is, “Why does the individual continue to
go back to the abusive partner?” In MF terms, this could be understood as a
case in which the “return to relationship” action has a higher stored value than
the “remain alone” action (i.e., the more optimal behavior, which would be
required to eventually find a healthier relationship). In MB terms, it would be
expected that the individual’s internal model would also predict that the “re-
main alone” action will lead to more aversive future states. This is a complex
case, however, and likely involves many factors (and perhaps different factors
in different individuals; Estrellado & Loh, 2019; Griffing et al., 2005). We
suggest that the RL framework introduced here may at least offer a possible
partial explanation involving a few distinct factors. First, in childhood,
it was likely more adaptive to remain with an abusive parent than to leave
and try to survive childhood alone. This is one way in which the action of re-
turning to, or remaining in, an abusive relationship may have gained a higher
stored value. Second, one necessary mathematical component of successful
MF algorithms is something called a “discount rate term,” which entails that
action values are affected less and less by outcomes that occur farther into
the future. Thus, if returning to an abusive relationship is initially met with
positive consequences and the abuse only returns after some delay (e.g., a few
weeks), then these later negative consequences would be expected to have less
of an effect on the value assigned to the “return to relationship” action (i.e., it
would remain high). Third, it is known that humans often tend to implicitly
prefer familiarity to novelty (Hansen & Wänke, 2009; Liao, Yeh, & Shimojo,
2011; Park, Shimojo, & Shimojo, 2010); in RL terms, this means that the ac-
tion of “remaining alone” and the unfamiliar state of “being in a non-​abusive
relationship” may each be represented as having a lower value (i.e., as being
aversive) simply because they are unfamiliar. Fourth, clients in such cases
often report believing that the abusive partner has now changed and that the
abuse will not continue (e.g., because the partner has been nice/​apologetic).
Such updated beliefs could be thought of in terms of latent cause inference.
For example, if a client’s internal model represented people as either “nice” or
“mean,” the partner’s recent nice behavior could promote the inference that
the latent cause has changed (i.e., the partner is now a “nice” person); in con-
trast, the therapist may make a completely different inference (e.g., that one or
both partners are unstable).
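The discounting argument above (the second factor) can be made concrete with a small numerical sketch. The reward sequences and the discount factor, here called `gamma`, are hypothetical numbers chosen purely for illustration, and multiplying each outcome by a power of `gamma` is one common way of implementing discounting, assumed here rather than taken from the chapter.

```python
# Illustrative sketch only: the reward sequences and discount factors are
# hypothetical numbers, not estimates from any study.


def discounted_value(rewards, gamma):
    """Sum of rewards, each weighted by gamma raised to its delay in time steps."""
    return sum((gamma ** t) * r for t, r in enumerate(rewards))


# "Return to relationship": a few pleasant weeks, then abuse resumes.
return_to_relationship = [2.0, 2.0, 2.0, -3.0, -3.0, -3.0, -3.0, -3.0]
# "Remain alone": mildly unpleasant throughout (assumed for illustration).
remain_alone = [-0.5] * 8

for gamma in (1.0, 0.5):
    v_return = discounted_value(return_to_relationship, gamma)
    v_alone = discounted_value(remain_alone, gamma)
    print(f"gamma={gamma}: return={v_return:.2f}, alone={v_alone:.2f}")
```

With no discounting (`gamma = 1.0`), the delayed abuse dominates and "remain alone" has the higher value. With steep discounting (`gamma = 0.5`), the early positive outcomes dominate and "return to relationship" has the higher value, even though the outcomes themselves are identical.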
Thus, a wide range of plausible factors could lead to a set of state and action
values and associated inferential processes that would maintain this type of
maladaptive, repetitive behavior, and prevent individuals from learning that a
more adaptive behavior pattern is available to them (for further consideration
of maladaptive repetitive patterns, see Chapter 14 of this volume). As with the
previous examples, a therapist can help to correct MB expectations and mala-
daptive inferences within the therapeutic context and motivate the client to try
out the “remain alone” action long enough to update the stored (MF) value of
this behavior and prevent return to the abusive partner (and eventually seek out
a healthier relationship).

Memories Are Updated as a Part of Effective Therapy

In Example 1, the clinically relevant memories involved either:

1. Memories gained by learning about the existence of a new latent cause/​
context (i.e., if exposure is abrupt), in which stimuli seen as threatening no
longer predict danger in the new context, or
2. Memories gained by updating the predictions made under an existing la-
tent cause/​context (i.e., if exposure is gradual), leading to changes in beliefs
about the dangerousness of a stimulus in general (i.e., within-​level associ-
ations, not dependent on a change in context).

The latter type of memory update would lead to similar affective and behavioral
consequences, but with lower chances of later return of fear.
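The first of these two possibilities can be illustrated with a deliberately simplified sketch. It reduces latent cause inference to Bayes' rule over just two candidate contexts; the context names, likelihoods, and prior are all assumed values, and the latent cause models discussed in this chapter are considerably richer than this.

```python
# Illustrative sketch only: a two-hypothesis simplification with made-up
# probabilities, not the full latent cause models cited in the chapter.

P_HARM = {"threat_context": 0.8, "safe_context": 0.05}  # assumed likelihoods
PRIOR = {"threat_context": 0.9, "safe_context": 0.1}    # assumed prior beliefs


def posterior_over_causes(observations, prior=PRIOR):
    """Bayes' rule over two candidate latent causes.

    observations: iterable of booleans, True if harm followed the stimulus.
    """
    belief = dict(prior)
    for harm in observations:
        for cause in belief:
            likelihood = P_HARM[cause] if harm else 1.0 - P_HARM[cause]
            belief[cause] *= likelihood
        total = sum(belief.values())
        belief = {cause: p / total for cause, p in belief.items()}
    return belief


# Abrupt exposure: the stimulus appears five times and harm never follows.
after_exposure = posterior_over_causes([False] * 5)
```

After a short run of harmless exposures, nearly all of the probability shifts to the "safe" context. The entry `P_HARM["threat_context"]` is never revised in this sketch, which is one way of picturing why fear can return when the original context is inferred again.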
In Example 2, the clinically relevant memories that are changed in therapy
are the learned probabilities (priors) regarding high-​level, abstract expectations
about long timescale regularities in the world (e.g., whether or not people in ge-
neral will tend to value and accept you). These changes then alter (a) the way
social interactions are interpreted and (b) the affective bodily and behavioral
responses such interactions lead to.
The types of memories described in these first two examples map best onto the
psychological construct of implicit schemas within semantic memory (previous
work has also linked schemas to latent causes; Chan et al., 2016). For example,
implicit schemas about dangerous or non-​dangerous events that are expected in
particular situations or implicit schemas about the likely results of social inter-
action. Any potential changes in direct (within-​level) probabilistic relationships
between percepts would instead correspond best to changes in implicit emo-
tional memories (e.g., classically conditioned response memories). For example,
changes in the degree to which the percept of a spider directly predicts (and thus
leads to) bodily changes such as increased heart rate and muscle tension (i.e., in-
dependent of context).
In Example 3, the clinically relevant memories that are changed in therapy in-
volve the stored action values and state values (used by MF and MB algorithms)
to choose between different available behaviors in different perceived states (situ-
ations). By promoting/​motivating exploration of previously aversive states, these
“value memories” can be updated—​breaking patterns of maladaptive, repetitive,
and avoidant behavior and promoting more adaptive patterns of behavior.

Implications for Specific Therapies and the Integrated Memory Model

In simple phobia (Example 1), we saw that exposure and response prevention
processes may provide powerful prediction error signals driving the brain to
infer an encounter with a new (although perceptually similar) context, where the
likelihood of danger in the presence of the anxiogenic stimulus is low. However,
it may also be possible for effective beliefs about the threat context to be revised
instead, preventing fear recovery. One way to facilitate this would be to moti-
vate the client to engage in exposures under conditions that match their own
threat context (e.g., the client’s home), as opposed to in a new safe context (e.g., at
the clinic). Another interesting possibility to be tested is to dissociate a stimulus
and the feared response gradually as opposed to abruptly. This, however, would
not mean gradual exposure with the consistent absence of feared consequences
(as is typical in exposure therapies), but would instead mean that the feared
consequences would be observed initially, perhaps illustrating the developmental
or evolutionary preparedness behind phobias, and would slowly be observed
with lower and lower frequencies over many exposures to a phobic stimulus.
For example, in spider phobia, one could perhaps first show video clips in which
a spider does bite someone but then also occasionally show clips in which the
spider does not cause harm. If over time the frequency of “bite” videos went
down and the frequency of “no-​bite” videos went up, perhaps this would pro-
mote revision of the stored probabilities within the threat context. If successful,
this approach would be expected to minimize spontaneous return of fear. In
IMM terms, it would involve reactivation, revision, and reconsolidation of the
original memory and thus a more generalized change, rather than creation of a
new safety memory vulnerable to being superseded by the original memory in
stressful circumstances (return of fear).
Other psychotherapy approaches can be understood somewhat similarly.
For example, both cognitive-​behavioral therapy (CBT; e.g., Barlow, Allen, &
Choate, 2016) and acceptance and commitment therapy (e.g., Hayes & Smith,
2005) have similar exposure-​based components. However, they also teach cog-
nitive strategies (mindful awareness, cognitive flexibility, cognitive distancing,
acceptance, etc.) that may prevent cognitive processes from interfering with
memory updating (e.g., by reducing the estimated precision of automatic neg-
ative thoughts).
In low self-​worth (Example 2), a range of different therapies could be seen as
trying to alter the estimated probabilities of adaptive versus maladaptive high-​
level expectations (e.g., how much value one has as a person in general). CBT
would do so by first helping an individual to identify that they have these ex-
pectations. Reappraisal techniques would then draw a client’s attention to ev-
idence inconsistent with those expectations. In doing so, sufficient prediction
errors may be generated over time to alter these expectations and promote more
successful attempts at real-​world exposure. Psychodynamic approaches (e.g.,
Abbass, Kisely, & Kroenke, 2009) use the therapeutic relationship as a tool for
reactivation of expectations, then guide learning of new models of relationships.
For example, a therapist consistently reacting empathetically while avoiding
superficial affirmation treats the client as indeed an “adult of value.” Guided by
experience in therapy, clients explore engaging with others, perhaps in ways
that make them feel vulnerable, to generalize their learning that more adap-
tive expectations than the originally learned ones now apply. More generally,
any therapy that promotes re-​experiencing of negative emotion (e.g., emotion-​
focused therapy; Greenberg, 2010) might be expected to decrease the influence
of a client’s prior expectations during learning (Joffily & Coricelli, 2013) and lead
to more efficient memory updating during therapeutic interaction.
In Example 3, about persisting redundant policies (i.e., engrained behavioral
tendencies), therapies can be seen as changing “action value” and “state value”
memories. As in Example 1, exposure-based behavioral therapies very directly
promote exploration of aversive states, and this also applies to the present example.
More interestingly, cognitive therapies can be seen to promote “mental exploration” of
different models of the world (e.g., Are they in the state of “near a dog that bites,”
or are they in the state of “near a dog that loves playing”?). This can be thought of
as a way of “relocating one’s self” within one’s own internal model of the possible
states of the world and also updating that internal model. This would then lead
MB algorithms to predict different future scenarios and different available action
options, which would then increase the probability of the same type of behav-
ioral exploration of (previously) aversive states as occurs in exposure therapies.
Finally, psychodynamic therapies can similarly be seen as helping an individual
update their internal model, focusing on the (“dynamic”) forces exerted on the
patient’s observable behavior by implicit models they hold about relationships,
leading to improved insight/​awareness. This may be another example of coming
to “relocate one’s self” within a more helpful internal model (e.g., “I thought
I simply was in the state ‘worthless person’ but now I see that I am really in the
state of ‘a valuable person who is prone, but not condemned, to self-criticism’”).
After reconceptualizing one’s current situation in this manner and having other
expectations altered via interactions within the therapeutic relationship (as in
Example 2), this would ultimately also facilitate enough motivation and under-
standing to explore (previously) aversive states in spite of avoidant urges.
Therefore, superficially different therapeutic techniques can all ultimately result
in therapeutic efficacy through updating learned expectations and memories
associated with (a) the information stored in one’s internal model of the world
(e.g., What will happen in the future if I do X?), (b) the state one believes one
is in within that model (e.g., How should I think about my current situation?),
(c) what that internal model predicts about the appropriate affective/​visceral
responses in particular states (e.g., Does this state call for high or low arousal?),
and (d) the state and action values that directly govern decision-​making and be-
havior (e.g., automatic action tendencies due to previously observed positive/​
negative outcomes). Action and state value memories do not correspond clearly
to the memory categories within the IMM but can be understood as medi-
ating stable habits, other action tendencies, and decision-​making tendencies
(note: they are also linked to brain systems/​structures, such as the dopaminergic
midbrain and basal ganglia, that were not highly emphasized in the IMM; Frank,
2011). This illustrates a range of types of memory (at a computational level of de-
scription) that successful therapeutic interventions can be understood to target,
most (but not all) of which correspond to broader elements within the IMM.
The crucial tenet of the IMM (Lane et al., 2015) that moderate levels of af-
fective arousal and re-​experiencing negative emotion play an important role
in updating memories during therapy naturally dovetails with two computational
explanations, in which negative emotion serves, first, as a simple indicator and,
second, as a causal mediator of effective psychotherapeutic process. First, experience
of negative emotion may simply indicate that the semantic states recalled activate
state-action and state-value memories overlapping with real life. Both the expectations
and, more importantly, the prediction errors (Rutledge, Skandali, Dayan, &
Dolan, 2014; Will, Rutledge, Moutoussis, & Dolan, 2017b) associated with fictive
state-​transitions would map to such emotions. The same could also be expressed
in terms of expected free energy (Joffily & Coricelli, 2013). As a psychodynamic
example, a patient feels awful because they expect the therapist to be an awful
transference figure. The therapist’s behavior doesn’t just reassure them about this
but also teaches that these feelings/​expectations are faceable, thinkable, and, ulti-
mately, learnable, corresponding to an upgrading of the epistemic value of being
in such states.
The causal mediation hypothesis is even more interesting. Here, negative
emotion tunes precision estimates to facilitate belief updating (Clark, Watson,
& Friston, 2018; Joffily & Coricelli, 2013). Feeling negative emotion corresponds
to body states involving increasing prediction errors (i.e., an increase in expected
undesirable surprises; free energy). Negative valence therefore signals that
something is wrong with one’s current internal model of the world, leading to a
reduced probability of keeping the body in states conducive to survival and fit-
ness. Positive valence instead signals that the predictions of one’s current internal
model of the world are reliable. When one’s current model is inaccurate, the op-
timal strategy is to amplify the influence of new sensory input on belief updating
and to reduce the influence of current model priors—​as this would be the most
efficient way to correct one’s current model. This corresponds to increasing pre-
cision estimates on new sensory input and decreasing precision estimates on
predictions emanating from higher levels. The upshot is that the influence of new
experience on model updating (revising expectations) will be stronger when
feeling negative emotion. This is consistent with a body of work showing that
people in a sad mood process information more carefully and attend more to de-
tail (Bless, Bohner, Schwarz, & Strack, 1990; Bodenhausen, Sheppard, & Kramer,
1994; Gasper & Clore, 2002; Krauth-​Gruber & Ric, 2000; Zarinpoush, Cooper, &
Moylan, 2000), whereas people in a happy mood tend to rely more on heuristics
and prior expectations (e.g., filling in details from stereotypes, schemas, scripts;
Bless et al., 1996; Bodenhausen, Kramer, & Süsser, 1994; Park & Banaji, 2000),
and that these phenomena may relate to differences in uncertainty (Tiedens &
Linton, 2001).
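The precision-weighting idea in this paragraph can be sketched numerically. The following is a minimal sketch assuming a single Gaussian belief updated by a single observation; the function name, the rating scale, and all numbers are hypothetical and serve only to show how lowering prior precision and raising sensory precision enlarges the belief update.

```python
# Illustrative sketch only: all numbers are hypothetical.


def update_belief(prior_mean, prior_precision, observation, sensory_precision):
    """Precision-weighted (Gaussian) belief update for a single observation."""
    posterior_precision = prior_precision + sensory_precision
    weight_on_observation = sensory_precision / posterior_precision
    prediction_error = observation - prior_mean
    posterior_mean = prior_mean + weight_on_observation * prediction_error
    return posterior_mean, posterior_precision


# Belief about how others will respond (-1 = rejection, +1 = acceptance).
prior_mean = -0.8
observation = 0.6  # an accepting response in therapy

# Confident prior, little weight on new input.
rigid, _ = update_belief(prior_mean, prior_precision=9.0,
                         observation=observation, sensory_precision=1.0)
# Lower prior precision and higher sensory precision, as proposed for
# states of negative emotion.
malleable, _ = update_belief(prior_mean, prior_precision=1.0,
                             observation=observation, sensory_precision=4.0)
```

The same prediction error moves the belief only slightly in the `rigid` case and much further in the `malleable` case. The difference comes entirely from the relative precisions, not from the experience itself.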
According to this converging theoretical and empirical work, while a client
experiences negative emotion in therapy, their reliance on prior expectation will
be reduced. Thus, the parameters (memories) in their internal model will be
more malleable, and new experience will have a stronger influence on revising
beliefs and expectations. In contrast, without feeling negative emotion, the thera-
peutic interaction would be less efficient at altering an individual’s beliefs, expec-
tations, schemas, action tendencies, and so forth. It follows that the expectation
violations experienced in therapy will more effectively alter the network archi-
tecture depicted in Figures 15.2 and 15.3 when a person re-​experiences nega-
tive emotion. This is an important alternative to the “catharsis” hypothesis that
getting negative affect “out” is curative; rather, negative affect is useful in part because it sets
the stage for more efficient learning and less reliance on maladaptive schemas
when corrective emotional experiences occur. At the same time, much more
work is needed to understand the evolution of emotions in successful therapy, as
the states of useful negative emotion in fact contain complex mixes rather than a
single good–​bad dimension, including positive feelings of strong therapeutic al-
liance and positive prediction errors associated with successful reconsolidation
of more adaptive knowledge.
The role of episodic memory, important in the IMM, also needs addressing.
Our consideration of computational models so far has not highlighted any
internal model parameters directly relevant to episodic memory. So where
does episodic memory fit in? According to the IMM, episodic memory is not
a separate entity, but largely co-​activated and co-​modified with other types
of memory during reconsolidation and so would be involved in the phe-
nomena considered so far. Neurobiologically, the basis of state representation
may involve the hippocampus—​a structure strongly implicated in episodic
memory—​as an important structure for storing and representing an internal
model (or “cognitive map”) of the world (Ji & Maren, 2007; O’Keefe & Nadel,
1978; Stachenfeld, Botvinick, & Gershman, 2017). This is consistent with its
role in spatial navigation and in the representation of spatiotemporal aspects
of context (Haviland, Louise Warren, & Riggs, 2000; O’Keefe & Nadel, 1978;
Pfeiffer & Foster, 2013; Smith & Mizumori, 2006). Therefore, it is plausible
that the hippocampus is important for recognizing and representing what
state of the world an individual is in and for simulating different sequences
of future (and past) states within MB processes. Within multiple trace theory
(Nadel, Samsonovich, Ryan, & Moscovitch, 2000), the hippocampus also plays
a permanent role in episodic memory by representing the context in which
a memory occurred, binding that context to the object representations that
were also part of the memory (e.g., “The lamp was in my bedroom”; “My uncle
was in the kitchen”; etc.). Predictive coding models of episodic memory have
also been proposed (Henson & Gagnepain, 2010), in which the hippocampus
is envisioned to sit at the top of a memory representation hierarchy (see also
Nadel & Peterson, 2013). Here the hippocampus is proposed to represent mul-
timodal, spatiotemporally extended patterns of activation in lower hierarchical
levels—​with the more specific function of optimizing the mutual predictability
of objects from contexts, and vice versa. This allows the perception of a given
set of objects to activate a particular context representation, allowing the recol-
lection of particular episodic memories in which these objects interacted over
a particular timescale. Such a process appears central to recognizing and con-
ceptualizing the meaning of one’s current perceptions based on their similarity
to past episodic experiences.
It thus seems reasonable to suggest that episodic memories function to
guide the recognition/​representation of one’s current state, which then engages
the implicit priors (schemas) that guide perception, interpretation, and pla-
nning and which also primes the relevant actions and action values that con-
tribute to habitual (MF-​based) behavioral tendencies. Therefore, if therapeutic
interventions involved the reconsolidation of episodic memories, this would
most plausibly act as an indirect means of altering the state recognition pro-
cesses described above, leading to the activation of different priors/​schemas and
to different predictions regarding optimal visceral (affective) and behavioral
responses. This could then complement the learning processes discussed above,
which alter the contents (prior predictions) of schemas and update the stored
action values that more directly influence behavioral tendencies. In some ways,
the third step of the IMM, which involves applying new ways of experiencing
and interacting with the world in a variety of contexts, can be seen as contrib-
uting to the updating of these stored action values so that more adaptive beha-
vior becomes more automatic.

Conclusions

This chapter has outlined a number of ways in which the IMM, and the associated
model of therapeutic change (dubbed the “Lane, Ryan, Nadel, and Greenberg” or
“LRNG” model; Lane et al., 2015), can be complemented by a description at the
unique level associated with computational neuroscience. Some major points in-
clude the following:

1. Implicit schemas within semantic memory can be understood as prior
probability estimates about the co-​occurrence of different percepts and
concepts within particular states of the world, implemented by the strengths
of specific types of top–​down axonal/​synaptic connections. Therapeutically
altering these prior beliefs (in part by influencing what clients attend to
and learn most from; i.e., precision estimates) or updating beliefs about
the state of the world a client occupies can adaptively influence a range of
perception, conceptualization, and explicit decision-​making processes that
show strong negative/​threat biases prior to therapy.
2. Some implicit emotional memories can be understood as direct, within-​
level expectations (implemented by the strengths of lateral synaptic
connections encoding covariance relationships) about the pleasant or un-
pleasant visceral percepts/​responses that should or should not co-​occur
with particular stimuli. Ensuring, in a therapy context, that key latent causes
are the subject of new inference (through activating the cause of interest
by key memory features, and then altering these within-level interactions)
offers the possibility of preventing return of fear. Here, promoting reconsolidation
of expectations regarding these old memories, rather than inferring
the presence of new latent causes (new memories), will more plausibly
lead to enduring change (e.g., prevent return of fear).
3. Action value and state value memories correspond less directly to the elem-
ents and brain regions stressed by the IMM and appear to be implemented
by interactions between the midbrain, basal ganglia, and frontal cortex.
However, they appear highly relevant to understanding action tendencies
and instrumental behaviors that promote continued avoidance and repeti-
tive maladaptive behavior patterns. Through therapeutically motivated and
guided exploration of avoided states/​actions, these stored value memories
can be gradually updated—​allowing the development of more adaptive
“default” behavioral patterns. This highlights a type of memory and asso-
ciated brain systems that could be usefully incorporated into an updated/​
expanded IMM.
4. All memories will be more effectively updated during the experience of
moderately intense negative emotion. This is because negative valence may
arise as a result of an increasing rate of prediction errors, which promotes
poor implicit confidence in one’s own model of the world. This in turn
promotes increased precision estimates assigned to new sensory input and
makes an individual’s internal model more malleable during therapeutic
learning experiences and less influenced by prior expectations. Thus,
violations of one’s maladaptive expectations in therapy are more likely to
have enduring effects if they occur in the context of the current experience
of negative emotion—​as predicted by the IMM. This suggests that the expe-
rience of negative affect can play an important role in corrective emotional
experiences that promote enduring change. However, more work is needed
to understand the computational roles of complex and evolving mixes of
emotion in successful versus unsuccessful therapies.
5. Models within computational neuroscience, including those within the
predictive coding/​active inference and RL frameworks, provide precise,
empirically supported, and biologically plausible mathematical equations
(algorithms) that quantitatively stipulate how these learning processes and
associated synaptic strength changes occur in response to new experience
(for derivations of these mathematical equations from normative princi-
ples, see Bogacz, 2017; Friston, 2005; Friston et al., 2016; Huys et al., 2015;
Sutton & Barto, 1998).

To conclude, it is also worth asking what the computational perspective adds.
First, we would argue that it provides additional mechanistic insights at a novel
level of description and offers links between that level of description and both
neural and cognitive levels of description. For example, it links probabilistic
latent cause inference and prediction signaling to extinction learning and
OFC–​amygdala interactions, it links the influences of maladaptive schemas
to optimal prediction error minimization processes under maladaptive
priors, and it links repetitive maladaptive behaviors to normative action value
learning within the basal ganglia under conditions of mismatching childhood
and adulthood environments (and the difficulty of optimally solving the ex-
plore/​exploit dilemma in such cases). Second, it offers additional insights re-
garding the nature of memories targeted in therapy, how this would alter a
range of broader psychological processes, and why this should lead to effective
change. For example, it suggests that synaptic changes associated with learning
in therapy may encode specific priors, precision estimates, and action values,
which in turn bias perception, attention/​learning, and automatic action ten-
dencies, respectively. Third, it offers the possibility of quantitative predictions,
potentially allowing for much more rigorous empirical studies. For example,
a body of work on “computational phenotyping” has already begun to emerge
in which performance on behavioral tasks can be used to estimate the priors
and precision estimates an individual has learned (de Berker et al., 2016;
Schwartenbeck & Friston, 2016). Other tasks estimate differences in the de-
gree to which MB and MF processes tend to contribute most to an individual’s
behavior (Kool, Cushman, & Gershman, 2016). Thus, these and other tasks,
adapted more closely to the clinical context—​for example, addressing informa-
tion sampling biases via metacognitive therapy in psychosis (Kumar, Menon,
Moritz, & Woodward, 2015; Wilson, Geana, et al., 2014)—​could be used to test
the effects of psychotherapeutic interventions on such measures, the neural
basis of these changes, and the consistency of these results with the predictions
of the IMM. They could also potentially be used to provide clinically informa-
tive patient characterizations in guiding treatment selection. We believe this
represents a highly promising road forward for understanding and improving
the treatment of psychological disorders.

References
Abbass, A., Kisely, S., & Kroenke, K. (2009). Short-term psychodynamic psychotherapy
for somatic disorders: Systematic review and meta-analysis of clinical trials.
Psychotherapy and Psychosomatics, 78(5), 265–​274. http://​doi.org/​10.1159/​000228247
Adams, R., Shipp, S., & Friston, K. (2013). Predictions not commands: Active inference
in the motor system. Brain Structure and Function, 218(3), 611–​643. http://​doi.org/​
10.1007/​s00429-​012-​0475-​5
Barlow, D., Allen, L., & Choate, M. (2016). Toward a unified treatment for emotional
disorders: Republished article. Behavior Therapy, 47(6), 838–​853. http://​doi.org/​
10.1016/​j.beth.2016.11.005
Bastos, A., Usrey, W., Adams, R., Mangun, G., Fries, P., & Friston, K. (2012). Canonical
microcircuits for predictive coding. Neuron, 76(4), 695–​711. http://​doi.org/​10.1016/​
j.neuron.2012.10.038
Berns, G., & Sejnowski, T. (1996). How the basal ganglia makes decisions. In A. Damasio,
H. Damasio, & Y. Christen (Eds.), Neurobiology of decision-​making (pp. 101–​113).
Berlin, Germany: Springer. http://doi.org/10.1007/978-3-642-79928-0
Bless, H., Bohner, G., Schwarz, N., & Strack, F. (1990). Mood and persuasion: A cognitive
response analysis. Personality and Social Psychology Bulletin, 16, 331–​345.
Bless, H., Clore, G., Schwarz, N., Golisano, V., Rabe, C., & Wölk, M. (1996). Mood and the
use of scripts: Does a happy mood really lead to mindlessness? Journal of Personality
and Social Psychology, 71(4), 665–​679. http://​doi.org/​10.1037/​0022-​3514.71.4.665
Bodenhausen, G., Kramer, G., & Süsser, K. (1994). Happiness and stereotypic thinking in
social judgment. Journal of Personality and Social Psychology, 66(4), 621–​632. http://​
doi.org/​10.1037/​0022-​3514.66.4.621
Bodenhausen, G., Sheppard, L., & Kramer, G. (1994). Negative affect and social judg-
ment: The differential impact of anger and sadness. European Journal of Social
Psychology, 24(1), 45–​62. http://​doi.org/​10.1002/​ejsp.2420240104
Bogacz, R. (2017). A tutorial on the free-​energy framework for modelling perception
and learning. Journal of Mathematical Psychology, 76(Pt B), 198–​211. http://​doi.org/​
10.1016/​j.jmp.2015.11.003
Browne, C., Powley, E., Whitehouse, D., Lucas, S., Cowling, P., Rohlfshagen, P., . . .
Colton, S. (2012). A survey of Monte Carlo tree search methods. IEEE Transactions
on Computational Intelligence and AI in Games, 4(1), 1–43. http://doi.org/10.1109/TCIAIG.2012.2186810
Chan, S., Niv, Y., & Norman, K. (2016). A probability distribution over latent causes, in
the orbitofrontal cortex. Journal of Neuroscience, 36(30), 7817–​7828. http://​doi.org/​
10.1523/​JNEUROSCI.0659-​16.2016
Clark, J. E., Watson, S., & Friston, K. J. (2018). What is mood? A computational per-
spective. Psychological Medicine, 48(14), 2277–​2284. http://​doi.org/​10.1017/​
S0033291718000430
Daw, N., Gershman, S., Seymour, B., Dayan, P., & Dolan, R. (2011). Model-​based
influences on humans’ choices and striatal prediction errors. Neuron, 69(6), 1204–​
1215. http://​doi.org/​10.1016/​j.neuron.2011.02.027
Dayan, P., & Daw, N. (2008). Decision theory, reinforcement learning, and the brain.
Cognitive, Affective & Behavioral Neuroscience, 8(4), 429–​453. http://​doi.org/​10.3758/​
CABN.8.4.429
de Berker, A., Rutledge, R., Mathys, C., Marshall, L., Cross, G., Dolan, R., & Bestmann, S.
(2016). Computations of uncertainty mediate acute stress responses in humans. Nature
Communications, 7, 10996. http://​doi.org/​10.1038/​ncomms10996
Dolan, R., & Dayan, P. (2013). Goals and habits in the brain. Neuron, 80(2), 312–​325.
http://​doi.org/​10.1016/​j.neuron.2013.09.007
Doll, B., Simon, D., & Daw, N. (2012). The ubiquity of model-​based reinforcement
learning. Current Opinion in Neurobiology, 22(6), 1075–​1081. http://​doi.org/​10.1016/​
J.CONB.2012.08.003
Estrellado, A. F., & Loh, J. (2019). To stay in or leave an abusive relationship: Losses and gains
experienced by battered Filipino women. Journal of Interpersonal Violence, 34(9), 1843–1863.
Feldman, H., & Friston, K. (2010). Attention, uncertainty, and free-​energy. Frontiers in
Human Neuroscience, 4, 215. http://​doi.org/​10.3389/​fnhum.2010.00215
Frank, M. (2011). Computational models of motivated action selection in corticostriatal
circuits. Current Opinion in Neurobiology, 21(3), 381–​386. http://​doi.org/​10.1016/​
J.CONB.2011.02.013
Friston, K. (2005). A theory of cortical responses. Philosophical Transactions of the Royal
Society of London. Series B, Biological Sciences, 360(1456), 815–​836. http://​doi.org/​10.1098/​
rstb.2005.1622
Friston, K. (2010). The free-​energy principle: A unified brain theory? Nature Reviews.
Neuroscience, 11(2), 127–​138. http://​doi.org/​10.1038/​nrn2787
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., O’Doherty, J., & Pezzulo, G.
(2016). Active inference and learning. Neuroscience and Biobehavioral Reviews, 68,
862–​879. http://​doi.org/​10.1016/​j.neubiorev.2016.06.022
Friston, K., FitzGerald, T., Rigoli, F., Schwartenbeck, P., & Pezzulo, G. (2017). Active in-
ference: A process theory. Neural Computation, 29(1), 1–​49. http://​doi.org/​10.1162/​
NECO_​a_​00912
Friston, K., Parr, T., & de Vries, B. (2017). The graphical brain: Belief propagation and ac-
tive inference. Network Neuroscience, 1(4), 381–​414. http://​doi.org/​10.1162/​NETN_​a_​
00018
Friston, K., Stephan, K., Montague, R., & Dolan, R. (2014). Computational psychi-
atry: The brain as a phantastic organ. The Lancet. Psychiatry, 1(2), 148–​158. http://​doi.
org/​10.1016/​S2215-​0366(14)70275-​5
Gasper, K., & Clore, G. (2002). Attending to the big picture: Mood and global versus local
processing of visual information. Psychological Science, 13(1), 34–​40. http://​doi.org/​
10.1111/​1467-​9280.00406
Gershman, S., Jones, C., Norman, K., Monfils, M., & Niv, Y. (2013). Gradual extinc-
tion prevents the return of fear: Implications for the discovery of state. Frontiers in
Behavioral Neuroscience, 7, 164. http://​doi.org/​10.3389/​fnbeh.2013.00164
Gershman, S., Norman, K., & Niv, Y. (2015). Discovering latent causes in reinforcement
learning. Current Opinion in Behavioral Sciences, 5, 43–​50. http://​doi.org/​10.1016/​
J.COBEHA.2015.07.007
Greenberg, L. (2010). Emotion-focused therapy: Theory and practice. Washington,
DC: APA Press.
Greve, A., Cooper, E., Kaula, A., Anderson, M., & Henson, R. (2017). Does prediction
error drive one-​shot declarative learning? Journal of Memory and Language, 94, 149–​
165. http://​doi.org/​10.1016/​j.jml.2016.11.001
Griffing, S., Ragin, D., Morrison, S., Sage, R., Madry, L., & Primm, B. (2005). Reasons
for returning to abusive relationships: Effects of prior victimization. Journal of Family
Violence, 20(5), 341–​348. http://​doi.org/​10.1007/​s10896-​005-​6611-​8
Hansen, J., & Wänke, M. (2009). Liking what’s familiar: The importance of unconscious
familiarity in the mere-​exposure effect. Social Cognition, 27(2), 161–​182. http://​doi.
org/​10.1521/​soco.2009.27.2.161
Hasson, U., Chen, J., & Honey, C. (2015). Hierarchical process memory: Memory as an in-
tegral component of information processing. Trends in Cognitive Sciences, 19(6), 304–​
313. http://​doi.org/​10.1016/​j.tics.2015.04.006
Hasson, U., Yang, E., Vallines, I., Heeger, D., & Rubin, N. (2008). A hierarchy of tem-
poral receptive windows in human cortex. Journal of Neuroscience, 28(10), 2539–​2550.
http://​doi.org/​10.1523/​JNEUROSCI.5487-​07.2008
Haviland, M., Warren, W. L., & Riggs, M. (2000). An observer scale to measure
alexithymia. Psychosomatics, 41(5), 385–​392.
Hayes, S., & Smith, S. (2005). Get out of your mind and into your life: The new acceptance
and commitment therapy. Oakland, CA: New Harbinger.
Henson, R., & Gagnepain, P. (2010). Predictive, interactive multiple memory systems.
Hippocampus, 20(11), 1315–​1326. http://​doi.org/​10.1002/​hipo.20857
Huys, Q., Eshel, N., O’Nions, E., Sheridan, L., Dayan, P., & Roiser, J. (2012). Bonsai trees
in your head: How the Pavlovian system sculpts goal-​directed choices by pruning de-
cision trees. PLoS Computational Biology, 8(3), e1002410. http://​doi.org/​10.1371/​
journal.pcbi.1002410
Huys, Q., Guitart-​Masip, M., Dolan, R., & Dayan, P. (2015). Decision-​theoretic psy-
chiatry. Clinical Psychological Science, 3(3), 400–​421. http://​doi.org/​10.1177/​
2167702614562040
Huys, Q., Maia, T., & Frank, M. (2016). Computational psychiatry as a bridge from neu-
roscience to clinical applications. Nature Neuroscience, 19(3), 404–​413. http://​doi.org/​
10.1038/​nn.4238
Ji, J., & Maren, S. (2007). Hippocampal involvement in contextual modulation of fear ex-
tinction. Hippocampus, 17(9), 749–​758. http://​doi.org/​10.1002/​hipo.20331
Joffily, M., & Coricelli, G. (2013). Emotional valence and the free-​energy principle.
PLoS Computational Biology, 9(6), e1003094. http://​doi.org/​10.1371/​journal.pcbi.
1003094
Kiebel, S., Daunizeau, J., & Friston, K. (2008). A hierarchy of time-​scales and the brain.
PLoS Computational Biology, 4(11), e1000209. http://​doi.org/​10.1371/​journal.
pcbi.1000209
Knill, D., & Pouget, A. (2004). The Bayesian brain: The role of uncertainty in neural
coding and computation. Trends in Neurosciences, 27(12), 712–​719. http://​doi.org/​
10.1016/​j.tins.2004.10.007
Kool, W., Cushman, F., & Gershman, S. (2016). When does model-​based control pay
off? PLoS Computational Biology, 12(8), e1005090. http://​doi.org/​10.1371/​journal.
pcbi.1005090
Krauth-​Gruber, S., & Ric, F. (2000). Affect and stereotypic thinking: A test of the mood-​
and-​general-​knowledge model. Personality and Social Psychology Bulletin, 26(12),
1587–​1597. http://​doi.org/​10.1177/​01461672002612012
Kumar, D., Menon, M., Moritz, S., & Woodward, T. (2015). Using the back
door: Metacognitive training for psychosis. Psychosis, 7(2), 166–​178. http://​doi.org/​
10.1080/​17522439.2014.913073
Lane, R., Ryan, L., Nadel, L., & Greenberg, L. (2015). Memory reconsolidation, emotional
arousal and the process of change in psychotherapy: New insights from brain science.
Behavioral and Brain Sciences, 38, e1.
Liao, H.-​I., Yeh, S.-​L., & Shimojo, S. (2011). Novelty vs. familiarity principles in preference
decisions: Task-​context of past experience matters. Frontiers in Psychology, 2, 43. http://​
doi.org/​10.3389/​fpsyg.2011.00043
Marr, D. (1982). Vision: A computational investigation into the human representation and
processing of visual information. New York, NY: Freeman.
Moutoussis, M., Shahar, N., Hauser, T., & Dolan, R. (2017). Computation in psycho-
therapy, or how computational psychiatry can aid learning-​based psychological thera-
pies. Computational Psychiatry, 1–​21. http://​doi.org/​10.1162/​CPSY_​a_​00014
Murray, J., Bernacchia, A., Freedman, D., Romo, R., Wallis, J., Cai, X., . . . Wang, X.-​J.
(2014). A hierarchy of intrinsic timescales across primate cortex. Nature Neuroscience,
17(12), 1661–​1663. http://​doi.org/​10.1038/​nn.3862
Nadel, L., & Peterson, M. (2013). The hippocampus: Part of an interactive posterior repre-
sentational system spanning perceptual and memorial systems. Journal of Experimental
Psychology. General, 142(4), 1242–​1254. http://​doi.org/​10.1037/​a0033690
Nadel, L., Samsonovich, A., Ryan, L., & Moscovitch, M. (2000). Multiple trace theory
of human memory: Computational, neuroimaging, and neuropsychological
results. Hippocampus, 10(4), 352–​368. http://​doi.org/​10.1002/​1098-​1063(2000)
10:4<352::AID-​HIPO2>3.0.CO;2-​D
O’Keefe, J., & Nadel, L. (1978). The hippocampus as a cognitive map. Oxford, England:
Oxford University Press.
Park, J., & Banaji, M. (2000). Mood and heuristics: The influence of happy and sad states
on sensitivity and bias in stereotyping. Journal of Personality and Social Psychology,
78(6), 1005–​1023.
Park, J., Shimojo, E., & Shimojo, S. (2010). Roles of familiarity and novelty in visual
preference judgments are segregated across object categories. Proceedings of the National
Academy of Sciences of the United States of America, 107(33), 14552–14555.
http://doi.org/10.1073/pnas.1004374107
Parr, T., & Friston, K. (2017). Working memory, attention, and salience in active infer-
ence. Scientific Reports, 7(1), 14678. http://​doi.org/​10.1038/​s41598-​017-​15249-​0
Parr, T., Rees, G., & Friston, K. (2018). Computational neuropsychology and Bayesian
inference. Frontiers in Human Neuroscience, 12, 61. http://​doi.org/​10.3389/​
fnhum.2018.00061
Pezzulo, G., Rigoli, F., & Friston, K. (2015). Active inference, homeostatic regulation and
adaptive behavioural control. Progress in Neurobiology, 134, 17–​35. http://​doi.org/​
10.1016/​j.pneurobio.2015.09.001
Pezzulo, G., Rigoli, F., & Friston, K. (2018). Hierarchical active inference: A theory
of motivated control. Trends in Cognitive Sciences, 22(4), 294–​306. http://​doi.org/​
10.1016/​J.TICS.2018.01.009
Pfeiffer, B., & Foster, D. (2013). Hippocampal place-​cell sequences depict future paths to
remembered goals. Nature, 497(7447), 74–79. http://doi.org/10.1038/nature12112
Rutledge, R., Skandali, N., Dayan, P., & Dolan, R. (2014). A computational and neural
model of momentary subjective well-​being. Proceedings of the National Academy of
Sciences of the United States of America, 111(33), 12252–​12257. http://​doi.org/​10.1073/​
pnas.1407535111
Salin, P., & Bullier, J. (1995). Corticocortical connections in the visual system: Structure
and function. Physiological Reviews, 75(1), 107–​154. http://​doi.org/​10.1152/​
physrev.1995.75.1.107
Schwartenbeck, P., FitzGerald, T., Mathys, C., Dolan, R., & Friston, K. (2015). The dopa-
minergic midbrain encodes the expected certainty about desired outcomes. Cerebral
Cortex, 25(10), 3434–​3445. http://​doi.org/​10.1093/​cercor/​bhu159
Schwartenbeck, P., & Friston, K. (2016). Computational phenotyping in psychiatry: A
worked example. eNeuro, 3(4), ENEURO.0049-16.2016. http://doi.org/10.1523/ENEURO.0049-16.2016
Smith, R., Lane, R., Parr, T., & Friston, K. (2019). Neurocomputational mechanisms
underlying emotional awareness: Insights afforded by deep active inference and their
potential clinical relevance. Neuroscience and Biobehavioral Reviews. https://doi.org/10.1101/681288
Smith, D., & Mizumori, S. (2006). Hippocampal place cells, context, and episodic
memory. Hippocampus, 16(9), 716–​729. http://​doi.org/​10.1002/​hipo.20208
Smith, R., Parr, T., & Friston, K. J. (2019). Simulating emotions: An active inference model
of emotional state inference and emotion concept learning. bioRxiv 640813. https://​
doi.org/​10.1101/​640813
Stachenfeld, K., Botvinick, M., & Gershman, S. (2017). The hippocampus as a predictive
map. Nature Neuroscience, 20(11), 1643–​1653. http://​doi.org/​10.1038/​nn.4650
Sutton, R. (1991). Dyna, an integrated architecture for learning, planning and reacting.
SIGART Bulletin, 2, 160–163.
Sutton, R., & Barto, A. (1998). Reinforcement learning: An introduction. London,
England: MIT Press.
Tiedens, L., & Linton, S. (2001). Judgment under emotional certainty and uncertainty: The
effects of specific emotions on information processing. Journal of Personality and Social
Psychology, 81(6), 973–​988.
Vygotsky, L. (1980). Mind in society: The development of higher psychological processes.
Cambridge, MA: Harvard University Press.
Will, G.-​J., Rutledge, R., Moutoussis, M., & Dolan, R. (2017). Neural and computational
processes underlying dynamic changes in self-esteem. eLife, 6, e28098. http://doi.org/
10.7554/​eLife.28098
Wilson, R., Geana, A., White, J., Ludvig, E., & Cohen, J. (2014). Humans use directed
and random exploration to solve the explore-​exploit dilemma. Journal of Experimental
Psychology. General, 143(6), 2074–​2081. http://​doi.org/​10.1037/​a0038199
Wilson, R., Takahashi, Y., Schoenbaum, G., & Niv, Y. (2014). Orbitofrontal cortex
as a cognitive map of task space. Neuron, 81(2), 267–​279. http://​doi.org/​10.1016/​
j.neuron.2013.11.005
Zarinpoush, F., Cooper, M., & Moylan, S. (2000). The effects of happiness and sadness on
moral reasoning. Journal of Moral Education, 29(4), 397–​412. http://​doi.org/​10.1080/​
713679391
