
2021 - Computations, Language Comprehension

The document discusses whether language comprehension relies on domain-general circuits or domain-specific circuits. It analyzes recent work characterizing the roles of the language network and multiple-demand network in language processing. While the language network responds robustly to linguistic variables, the multiple-demand network shows no sensitivity to language. This suggests it does not play a core role in language comprehension.


ASSOCIATION FOR PSYCHOLOGICAL SCIENCE

Similarity of Computations Across Domains Does Not Imply Shared Implementation: The Case of Language Comprehension

Current Directions in Psychological Science
2021, Vol. 30(6) 526–534
© The Author(s) 2021
Article reuse guidelines: sagepub.com/journals-permissions
DOI: 10.1177/09637214211046955
https://doi.org/10.1177/09637214211046955
www.psychologicalscience.org/CDPS

Evelina Fedorenko and Cory Shain


Department of Brain & Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute
of Technology

Abstract
Understanding language requires applying cognitive operations (e.g., memory retrieval, prediction, structure building)
that are relevant across many cognitive domains to specialized knowledge structures (e.g., a particular language’s
lexicon and syntax). Are these computations carried out by domain-general circuits or by circuits that store domain-
specific representations? Recent work has characterized the roles in language comprehension of the language network,
which is selective for high-level language processing, and the multiple-demand (MD) network, which has been
implicated in executive functions and linked to fluid intelligence and thus is a prime candidate for implementing
computations that support information processing across domains. The language network responds robustly to diverse
aspects of comprehension, but the MD network shows no sensitivity to linguistic variables. We therefore argue that the
MD network does not play a core role in language comprehension and that past findings suggesting the contrary are
likely due to methodological artifacts. Although future studies may reveal some aspects of language comprehension
that require the MD network, evidence to date suggests that those will not be related to core linguistic processes such
as lexical access or composition. The finding that the circuits that store linguistic knowledge carry out computations
on those representations aligns with general arguments against the separation of memory and computation in the mind
and brain.

Keywords
language, domain specificity, executive functions, working memory, cognitive control, prediction

Corresponding Authors:
Evelina Fedorenko, Department of Brain & Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology
Email: [email protected]
Cory Shain, Department of Brain & Cognitive Sciences and McGovern Institute for Brain Research, Massachusetts Institute of Technology
Email: [email protected]

Incremental language comprehension likely relies on general cognitive operations such as retrieval of representations from memory, predictive processing, attentional selection, and hierarchical structure building (e.g., Gibson, 2000; Tanenhaus et al., 1995). For example, in any sentence containing a nonlocal dependency between words, the first dependent has to be retrieved from memory when the second dependent is encountered. These kinds of operations are also invoked in other domains of perception and cognition, including object recognition, numerical and spatial reasoning, music perception, social cognition, and task planning (e.g., Botvinick, 2007; Dehaene et al., 2003). The apparent similarity of these kinds of mental operations across domains has led to arguments that the brain contains domain-general circuits that carry out these operations and that language draws on these circuits (e.g., Fitch & Martins, 2014; Koechlin & Jubault, 2006; Novick et al., 2005; Patel, 2003; Fig. 1a).

Indeed, a network of frontal and parietal brain regions, the multiple-demand (MD) system (also known as the executive- or cognitive-control network; Fig. 2b), has been shown to respond during diverse cognitive tasks and to be linked to constructs such as working memory, inhibition, attention, prediction,
[Figure 1. Panel (a): Domain 1 (Language), Domain 2 (Music), and Domain 3 (Social Cognition) connect to shared Domain-General Hubs for Attentional Selection, Encoding Prediction Error, and Hierarchical Structure Building. Panel (b): each of Domain 1 (Language), Domain 2 (Music), and Domain 3 (Social Cognition) contains its own Attentional Selection, Encoding Prediction Error, and Hierarchical Structure Building.]

Fig. 1. Schematic illustrations of (a) an architecture in which computations that are used across domains are
implemented in shared circuits and (b) an architecture in which such general computations are implemented
locally within each relevant set of domain-specific circuits. The architecture in (a) assumes separation between
memory circuits (which store domain-specific knowledge representations) and computation circuits (which
support attention, prediction, structure building, and other operations across domains); in the architecture in
(b), the circuits that store domain-specific knowledge representations also carry out computations on those
representations (e.g., Dasgupta & Gershman, 2021; Hasson et al., 2015).

structure building, and fluid intelligence (e.g., Duncan et al., 2020). These findings make the MD network an ideal candidate for carrying out hypothesized domain-general operations. However, many domains, including language, rely on domain-specific knowledge representations stored in specialized brain areas and networks. For example, language recruits a network of frontal and temporal brain regions that respond in a highly selective manner during language comprehension (Fedorenko et al., 2011; Fig. 2a), and damage to these regions in adulthood leads to selectively linguistic deficits (e.g., Fedorenko & Varley, 2016).

During language comprehension, the MD network may work together with the language network, carrying out general operations on domain-specific knowledge representations. However, it is also possible that the language network locally implements general types of computations (e.g., retrieval of information from memory, predictive processing, and structure building; Caplan & Waters, 1999; R. L. Lewis, 1996; Martin et al., 1994). More generally, these kinds of computations, although important for various domains, may not draw on shared circuits (e.g., Dasgupta & Gershman, 2021; Hasson et al., 2015; Fig. 1b), possibly as a way of minimizing wiring lengths (Chklovskii & Koulakov, 2004). In this view, the MD network may be a general fallback system for domains or tasks for which the brain lacks specialized circuitry.

In this article, we review a recent body of work in which various aspects of human sentence comprehension were investigated using functional MRI (fMRI) techniques that reliably distinguish the language network from the domain-general MD network (see Fedorenko & Blank, 2020, for review), so that their functional response properties could be probed independently. (This approach is complementary to, but more direct than, past work, which has used dual-task paradigms and examination of brain-damaged patients to probe the role of domain-general resources in language comprehension; e.g., Caplan & Waters, 1999; Martin et al., 1994.) The results have consistently shown (a) strong sensitivity in the language network, (b) little
[Figure 2. Panels: (a) The Language Network (brain image); (b) The Multiple-Demand Network (brain image); (c) Blank & Fedorenko (2017): Correlation for Experiments 1 to 3; (d) Diachek et al. (2020): % Signal Change for Sentences and Word Lists in the Language Network and the MD Network; (e) Shain et al. (2020) and Shain et al. (2021): % Signal Change for 5-Gram Surprisal, Syntactic Parser-Based Surprisal, and DLT Integration Cost; (f) Wehbe et al. (2021): Normalized Correlation in the Language Network and the MD Network; (g) Diachek et al. (2020): % Signal Change with No Additional Task and with an Additional Task in the Language Network and the MD Network.]

Fig. 2. Location of the language and multiple-demand (MD) networks and evidence that putatively domain-general cognitive operations
important for language processing are carried out locally within the language system, rather than the MD network. The brain images
show the location of the (a) language and (b) MD networks as identified by group-level patterns of activation in response to sentences
versus nonword lists (Fedorenko et al., 2011) and hard versus easy working memory tasks (Fedorenko et al., 2013), respectively.
These locations were used to constrain the definition of functional regions of interest in each individual participant in all the studies
whose results are presented in the graphs (c–g). The graphs summarize recent functional MRI findings showing that the language, but
not the MD, network responds to diverse aspects of language comprehension. Results are averaged across the regions within each
network, but the patterns also hold for each region individually. The graph in (c) shows the correlation in neural response across
participants in each network during naturalistic story comprehension (Blank & Fedorenko, 2017). The correlation was stronger in the
language network than in the MD network, which suggests that stimulus tracking was stronger in the language network. The graph in
(d) shows the response of each network during processing of sentences and word lists (Diachek et al., 2020). The average response
across participants and experiments, indicated by the horizontal lines, revealed that the language network was more strongly engaged
for sentences than for word lists, but the opposite pattern held in the MD network. Darker gray bars show results from passive reading
and listening experiments, and lighter bars show results from experiments in which language processing was accompanied by a
secondary task (e.g., a memory probe, comprehension, or sentence judgment task). The graph in (e) shows effects of 5-gram surprisal
(Shain et al., 2020), syntactic parser-based surprisal (i.e., surprisal derived from a computational model of syntactic structure building;
Shain et al., 2020), and integration cost (Shain et al., 2021) in each network during naturalistic story comprehension. Integration cost
was operationalized as in the dependency locality theory (DLT; Gibson, 2000). The language, but not the MD, network was sensitive
to all three measures. The graph in (f) shows the performance of reading times (a measure of comprehension difficulty) in self-paced
reading and eye tracking during reading (quantified as the correlation between model predictions and observed responses in out-of-
sample data) as predictors of activity in the two networks (Wehbe et al., 2021). The language, but not the MD, network was robustly
sensitive to comprehension difficulty. The graph in (g) shows the response of each network during passive reading or listening and
when language comprehension was accompanied by an additional task (Diachek et al., 2020). The language network responded
robustly during language comprehension regardless of the presence or absence of an extraneous task, but the MD network responded
only in the presence of an extraneous task. Horizontal lines correspond to averages across participants. Error bars indicate ±1 SEM by
participants in (c), (d), (f), and (g) and ±1 SEM by functional region of interest in (e).
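
The between-participant correlation summarized in panel (c) can be illustrated with a small simulation. What follows is a minimal sketch in Python, not the pipeline used by Blank and Fedorenko (2017): it assumes a leave-one-out scheme in which each participant's time course is correlated with the average of the remaining participants. The function names (`pearson_r`, `leave_one_out_isc`, `simulate_network`), the Gaussian noise model, and all parameter values are hypothetical choices made for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


def leave_one_out_isc(timecourses):
    """Correlate each participant with the mean of all other participants.

    timecourses: list of per-participant lists, all of the same length.
    Returns one correlation per participant.
    """
    if len(timecourses) < 2:
        raise ValueError("need at least two participants")
    scores = []
    for i, own in enumerate(timecourses):
        others = [tc for j, tc in enumerate(timecourses) if j != i]
        # Average the other participants at every time point.
        others_mean = [sum(vals) / len(others) for vals in zip(*others)]
        scores.append(pearson_r(own, others_mean))
    return scores


def simulate_network(n_participants, n_timepoints, signal_weight, rng):
    """Hypothetical data: a shared stimulus-driven signal plus private noise."""
    shared = [rng.gauss(0, 1) for _ in range(n_timepoints)]
    return [
        [signal_weight * s + rng.gauss(0, 1) for s in shared]
        for _ in range(n_participants)
    ]


rng = random.Random(0)
# Assumed weights: a strongly stimulus-driven network vs. a weakly driven one.
strong = simulate_network(10, 300, 1.0, rng)
weak = simulate_network(10, 300, 0.1, rng)

mean_strong = sum(leave_one_out_isc(strong)) / len(strong)
mean_weak = sum(leave_one_out_isc(weak)) / len(weak)
print(f"strongly driven network: mean r = {mean_strong:.2f}")
print(f"weakly driven network:   mean r = {mean_weak:.2f}")
```

In this toy setting, the network whose shared component is as large as its noise yields a clearly higher mean correlation than the network whose shared component is weak. That is the qualitative contrast the caption describes between the language and MD networks, although the real analysis operates on fMRI time series rather than simulated values.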

response in the MD network, and (c) significantly stronger responses in the language network than the MD network for every investigated component of natural-language comprehension, including word predictability, working memory retrieval, and generalized measures of language comprehension difficulty. Together, these findings support the existence of a self-sufficient specialized language system that carries out the bulk of language-related processing demands.

The MD Network Does Not Closely Track the Linguistic Signal

Activity in a brain region or network that supports linguistic computations should be modulated by the properties of the linguistic stimulus. One method for estimating the degree of stimulus-linked activity (or stimulus tracking), developed by Hasson and colleagues (e.g., Hasson et al., 2010), is based on the correlations across individuals during the processing of naturalistic stimuli. The logic is as follows: If a brain region or network processes features of a stimulus, different individuals should show similar patterns of increases and decreases in neural response in that region or network over time. Note that this method makes no assumptions about what features in the stimulus are important, so it provides a theory-neutral way to estimate the degree of stimulus-linked activity. In three experiments, Blank and Fedorenko (2017) investigated synchrony in neural response across individuals during naturalistic language processing and found strongly stimulus-linked responses in the language network, as expected. Critically, however, the MD network exhibited substantially lower levels of stimulus-linked activity (Fig. 2c). To rule out the possibility that the MD network tracks linguistic stimuli closely but in a more variable way across individuals than the language system does, Blank and Fedorenko also examined within-participant correlations to multiple presentations of the same stimulus. Within-participant correlations in the MD network were lower than within-participant correlations in the language network and about as low as between-participants correlations in the MD network. This result indicates that the MD network’s activity is less strongly modulated by changes in the linguistic signal than is the activity of the language network.

The MD Network Does Not Show a Core Functional Signature of Language Processing

Natural-language sentences exhibit rich patterns of syntactic (e.g., Chomsky, 1957) and semantic (e.g., Montague, 1973) structure that are not present in perceptually matched stimuli, such as lists of unconnected words or nonwords. Processing syntactic and semantic dependencies is widely thought to impose a computational burden (e.g., S. Lewis & Phillips, 2015), and thus an expected signature of language processing is an increased neural
response to sentences relative to control stimuli that lack structure. The language network robustly bears out this prediction (e.g., Fedorenko et al., 2011). In a recent large-scale study, Diachek et al. (2020) investigated whether the same is true of the MD network. Their sample consisted of fMRI responses from 481 participants, each of whom completed one or more of 30 language-comprehension experiments varying in linguistic materials. Some experiments included sentences, others included lists of unconnected words, and still others included both of these stimulus types. Results replicated past findings of stronger responses in the language network during the processing of sentences compared with word lists, but showed systematically greater MD engagement during the processing of word lists than during the processing of sentences, plausibly a reflection of the greater difficulty of encoding unstructured stimuli (Fig. 2d). This pattern is inconsistent with generalized MD involvement in sentence comprehension.

The MD Network Does Not Show Effects of Word Predictability

Effects of word predictability are robust in behavioral (e.g., Ehrlich & Rayner, 1981) and electrophysiological (e.g., Kutas & Hillyard, 1984) measures of human language processing, and it has been argued that frontal and parietal cortical areas, likely within the MD network, encode expectancies across domains (e.g., Corbetta & Shulman, 2002), including language (Strijkers et al., 2019). Thus, one possible role for the MD network in language processing is to encode incremental prediction error. With our colleagues, we (Shain et al., 2020) investigated this possibility by analyzing measures of word-by-word surprisal (e.g., Levy, 2008) in fMRI responses to naturalistic audio stories (see Note 1). We examined effects of surprisal estimates based on both word sequences (5-gram surprisal models that predict the next word on the basis of the preceding four words) and syntactic structures (probabilistic context-free grammar models that predict the next word on the basis of an incomplete syntactic analysis of the unfolding sentence). These estimates of surprisal had significant (and separable) effects in the language network, but neither 5-gram surprisal nor syntactic parser-based surprisal had a significant effect in the MD network (Fig. 2e, first two sets of bars). Thus, whereas the results support the existence of a rich predictive architecture that exploits both word co-occurrences and syntactic patterns, this architecture appears to rely on a mechanism housed in language-specific cortical circuits, rather than on a domain-general predictive coding mechanism that may reside in the MD network.

The MD Network Does Not Show Effects of Syntactic Integration

Influential theories of human sentence comprehension posit a critical role for working memory retrieval in integrating words into an incomplete parse of the unfolding sentence (e.g., Gibson, 2000). Given that working memory is thought to be one of the core functions supported by the MD network (Duncan et al., 2020), one plausible role for this network in language comprehension is as a working memory resource for syntactic structure building. We (Shain et al., 2021) investigated this possibility by exploring the contribution of multiple theory-derived measures of working memory cost to explaining variance in the language and MD networks’ responses to naturalistic linguistic stimuli. The language network showed a systematic and generalizable (to an unseen data portion) response to variants of integration cost as proposed by Gibson’s (2000) dependency locality theory. Gibson posited that constructing syntactic dependencies incurs a retrieval cost proportional to the number of intervening elements that compete referentially with the retrieval target. This pattern did not hold in the MD network (Fig. 2e, third set of bars), where activity did not reliably increase with measures of integration cost (or other types of working memory demand explored in the study). Thus, whereas these results support a role for working memory retrieval in naturalistic language processing, they indicate that the working memory resources that support such computations reside in language-specific circuits, and that working memory resources housed in the MD network play little role.

The MD Network Does Not Show Effects of Comprehension Difficulty

The foregoing results challenge the hypothesis that the MD network plays a role in two of the core classes of computation posited by current theorizing in human sentence-processing research: prediction (e.g., Levy, 2008) and integration (e.g., Gibson, 2000). However, it is infeasible to enumerate and test the many other possible computations involved in human language processing, including those not covered by existing theory, in which MD may play a role. In another study (Wehbe et al., 2021), we bypassed this limitation by leveraging independent measures of reading times to predict fMRI responses to naturalistic stories. Reading times are widely regarded in psycholinguistics as reliable, theory-neutral proxies for language-comprehension difficulty and are commonly used as dependent variables to test hypotheses about the determinants of
comprehension difficulty (Rayner, 1998). Using this measure enabled us to test whether comprehension difficulty in general registers in the MD network, without precommitting to a particular theory of sentence processing. We found that activity in the language, but not the MD, network showed a strong effect of comprehension difficulty as measured by reading times (Fig. 2f). Thus, the MD network is unlikely to play a critical role in the computations that govern incremental (word-by-word) language-comprehension difficulty, regardless of how this difficulty is explained theoretically.

The MD Network Does Not Respond During Comprehension in the Absence of Extraneous Task Demands

This lack of evidence for the MD network’s engagement during language processing appears to contradict many prior reports of activity in what appear to be MD regions during language processing (e.g., Novick et al., 2005). Critically, such results have almost always been obtained when word or sentence comprehension has been accompanied by an extraneous task, which may engage the MD network given its robust sensitivity to task demands (Duncan et al., 2020). Diachek et al. (2020) investigated this possibility by contrasting the MD network’s engagement in language experiments that involved passive comprehension (visual or auditory) with its engagement in experiments that involved an additional task, such as responding to memory probes, answering comprehension questions, or judging semantic associations. Whereas the language network was equally engaged in the presence and the absence of an additional task, the MD network was engaged only in the presence of an additional task (Fig. 2g). In other words, passive language comprehension is sufficient to engage language-selective regions, but not MD regions, which suggests that MD engagement during language comprehension is primarily induced by nonlinguistic task demands (see Discussion).

Discussion

The evidence presented here challenges the hypothesis that domain-general executive resources support core computations of incremental language processing. The MD network (Duncan et al., 2020), where such resources are likely housed, does not closely track linguistic stimuli; responds more robustly to less language-like materials (e.g., more robustly to lists of unconnected words than to sentences); does not show evidence of engagement in predictive linguistic processing, retrieval of previously encountered linguistic elements from working memory, or any other linguistic operation that leads to comprehension difficulty during language processing; and, unlike the core language network, is not engaged by passive comprehension, instead becoming engaged only in the presence of a secondary task (e.g., memory probe or sentence judgments, Figs. 2c–g). These findings greatly constrain the space of plausible language-related computations that the MD network might support and align with the architecture outlined in Figure 1b. In particular, it appears that the network that stores linguistic knowledge representations is also the network that performs all the relevant computations on these representations in the course of incremental comprehension, despite the fact that many of these computations may be similar to, or the same as, computations used in other domains.

Why have prior studies reached different conclusions about the reliance of language processing on domain-general resources? The answer is likely twofold. First, for many years, researchers have not clearly differentiated between the language-selective and the domain-general circuits that cohabit the left frontal lobe but are robustly and unambiguously distinct (see Fedorenko & Blank, 2020, for discussion). This failure to separate the two networks is due to a combination of (a) traditional group-averaging analyses, which blur nearby functionally distinct regions, especially in the association cortex, where the precise locations of such regions differ across individuals (e.g., Frost & Goebel, 2012), and (b) frequent reverse inference from coarse anatomical locations (i.e., concluding that a cognitive function was involved because anatomical brain areas previously associated with that function were active; e.g., Poldrack, 2006). Second, many prior studies have used paradigms in which word or sentence comprehension is accompanied by a secondary task and/or the linguistic materials used are highly artificial. Such paradigms may indeed recruit the MD network, which is robustly sensitive to task demands, but this recruitment does not speak to the role of this network in core linguistic operations such as lexical access, or syntactic or semantic structure building. For these reasons, we have focused our review on studies that (a) relied on well-validated functional localizers (Fedorenko et al., 2011) to identify the language and MD networks within each individual brain and (b) used naturalistic comprehension tasks (Hasson et al., 2018). Such studies converged on a clear answer: The domain-general MD network does not support core linguistic computations.

The fact that the language system appears to locally implement general computations such as memory retrieval, prediction, and structure building suggests that local computation may systematically accompany
functional specialization. This conjecture aligns with prior arguments for a tight integration between memory and computation at the neuronal level (Dasgupta & Gershman, 2021; Hasson et al., 2015). If a particular stimulus (be it a face, spatial layout, high-pitched sound, or linguistic input) is encountered with sufficient frequency to support specialization in particular circuits, then it may be advantageous for those circuits to carry out as much processing as possible in that domain, given that local computation may reduce processing latencies that would result from interactive communication with other systems (Chklovskii & Koulakov, 2004). Nevertheless, novel cognitive demands regularly arise, and it is infeasible to dedicate cortical “real estate” to each of them. A general-purpose cognitive system like the MD network is therefore indispensable to robust and flexible cognition (Duncan et al., 2020), including the critical ability to solve novel problems. Indeed, recent computational-modeling work has shown that an artificial neural network trained on multiple tasks will spontaneously develop functionally specialized subnetworks for different tasks; however, if new tasks are continually introduced, a subset of the network will remain flexible and not show a preference for any known task (Yang et al., 2019).

Although the studies summarized here rule out a large set of possibilities for the role of the MD network in language processing, more work is needed to evaluate the role of this network in language production, in more diverse linguistic phenomena (e.g., pragmatic inference, including during conversational exchanges), and in recovery from damage to the language network (e.g., Hartwigsen, 2018). Furthermore, the contributions of (possibly domain-general) subcortical and cerebellar circuits to language comprehension and cognitive processing require additional investigation; future work may show overlap between linguistic and nonlinguistic functions in such circuits.

Conclusion

Despite the apparent similarity between the mental operations required for language comprehension and those required by other cognitive domains, the evidence we have reviewed here challenges the hypothesis that domain-general executive circuits (housed within the MD network) play a core role in language comprehension. We conjecture that such circuits similarly do not play a core role in other domains that rely on domain-specific representations, and that the core contribution of the MD network to human cognition lies in supporting flexible behavior and the ability to solve new problems.

Recommended Reading

Dasgupta, I., & Gershman, S. J. (2021). (See References). A review in which the authors argue that memory, in the form of memorization (storing computational outputs for future use), may be a ubiquitous component of neural information processing, rather than the domain of a designated resource.

Duncan, J. (2010). The multiple-demand (MD) system of the primate brain: Mental programs for intelligent behaviour. Trends in Cognitive Sciences, 14(4), 172–179. https://doi.org/10.1016/j.tics.2010.01.004. A review of evidence for the existence of a broad, domain-general frontoparietal multiple-demand brain system that supports executive functions and fluid intelligence.

Fedorenko, E., & Blank, I. A. (2020). (See References). A review in which the authors argue that Broca’s area contains functionally distinct subregions that belong to the language and multiple-demand networks and that conflation of these subregions in much prior work has led to substantial confusion in the field.

Kanwisher, N. (2010). Functional specificity in the human brain: A window into the functional architecture of the mind. Proceedings of the National Academy of Sciences, 107(25), 11163–11170. https://doi.org/10.1073/pnas.1005062107. A review of evidence for functional specialization in the human brain, with emphasis on the visual system and discussion of general implications for cognitive architecture and research methods.

Transparency

Action Editor: Robert L. Goldstone
Editor: Robert L. Goldstone

Declaration of Conflicting Interests
The author(s) declared that there were no conflicts of interest with respect to the authorship or the publication of this article.

Funding
The work summarized here and E. Fedorenko were supported by National Institutes of Health R00 Award HD057522 and R01 Awards DC016607 and DC016950, by a grant from the Simons Foundation to the Simons Center for the Social Brain at the Massachusetts Institute of Technology, and by funds from the Department of Brain & Cognitive Sciences and the McGovern Institute for Brain Research at the Massachusetts Institute of Technology.

ORCID iD

Cory Shain https://orcid.org/0000-0002-2704-7197

Acknowledgments

We thank former and current EvLab and TedLab members (at the Massachusetts Institute of Technology), especially Idan Blank, for helpful comments and discussions over the last few years; Yev Diachek and Leila Wehbe for help with organizing the data for Figure 2; Hannah Small for creating the figures; and Matt Davis and John Duncan for comments on
the earlier draft of the manuscript. We apologize to researchers whose relevant reports we do not cite; this is due to the strict limit on the number of references allowed by the journal, and we have provided a more complete list of relevant references at https://osf.io/dx4ah/.

Note

1. Given a probability model $p$, surprisal $I$ is the negative log probability of a word $w_i$ given its preceding context: $I(w_i) = -\log p(w_i \mid w_0 \ldots w_{i-1})$.

References

Blank, I. A., & Fedorenko, E. (2017). Domain-general brain regions do not track linguistic input as closely as language-selective regions. Journal of Neuroscience, 37(41), 9999–10011. https://doi.org/10.1523/JNEUROSCI.3642-16.2017
Botvinick, M. M. (2007). Multilevel structure in behaviour and in the brain: A model of Fuster’s hierarchy. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1485), 1615–1626. https://doi.org/10.1098/rstb.2007.2056
Caplan, D., & Waters, G. S. (1999). Verbal working memory and sentence comprehension. Behavioral & Brain Sciences, 22(1), 77–94. https://doi.org/10.1017/S0140525X99001788
Chklovskii, D. B., & Koulakov, A. A. (2004). Maps in the brain: What can we learn from them? Annual Review of Neuroscience, 27, 369–392. https://doi.org/10.1146/annurev.neuro.27.070203.144226
Chomsky, N. (1957). Syntactic structures. Mouton.
Corbetta, M., & Shulman, G. L. (2002). Control of goal-directed and stimulus-driven attention in the brain. Nature Reviews Neuroscience, 3(3), 201–215. https://doi.org/10.1038/nrn755
Dasgupta, I., & Gershman, S. J. (2021). Memory as a computational resource. Trends in Cognitive Sciences, 25(3), 240–251. https://doi.org/10.1016/j.tics.2020.12.008
Dehaene, S., Piazza, M., Pinel, P., & Cohen, L. (2003). Three parietal circuits for number processing. Cognitive Neuropsychology, 20(3–6), 487–506. https://doi.org/10.1080/02643290244000239
Diachek, E., Blank, I., Siegelman, M., Affourtit, J., & Fedorenko, E. (2020). The domain-general multiple demand (MD) network does not support core aspects of language comprehension: A large-scale fMRI investigation. Journal of Neuroscience, 40(23), 4536–4550. https://doi.org/10.1523/JNEUROSCI.2036-19.2020
Duncan, J., Assem, M., & Shashidhara, S. (2020). Integrated intelligence from distributed brain activity. Trends in Cognitive Sciences, 24(10), 838–852. https://doi.org/10.1016/j.tics.2020.06.012
Ehrlich, S. F., & Rayner, K. (1981). Contextual effects on word perception and eye movements during reading. Journal of Verbal Learning and Verbal Behavior, 20(6), 641–655. https://doi.org/10.1016/S0022-5371(81)90220-6
Fedorenko, E., Behr, M. K., & Kanwisher, N. (2011). Functional specificity for high-level linguistic processing in the human brain. Proceedings of the National Academy of Sciences, USA, 108(39), 16428–16433. https://doi.org/10.1073/pnas.1112937108
Fedorenko, E., & Blank, I. A. (2020). Broca’s area is not a natural kind. Trends in Cognitive Sciences, 24(4), 270–284. https://doi.org/10.1016/j.tics.2020.01.001
Fedorenko, E., Duncan, J., & Kanwisher, N. (2013). Broad domain generality in focal regions of frontal and parietal cortex. Proceedings of the National Academy of Sciences, USA, 110(41), 16616–16621. https://doi.org/10.1073/pnas.1315235110
Fedorenko, E., & Varley, R. (2016). Language and thought are not the same thing: Evidence from neuroimaging and neurological patients. Annals of the New York Academy of Sciences, 1369(1), 132–153. https://doi.org/10.1111/nyas.13046
Fitch, W. T., & Martins, M. D. (2014). Hierarchical processing in music, language, and action: Lashley revisited. Annals of the New York Academy of Sciences, 1316(1), 87–104. https://doi.org/10.1111/nyas.12406
Frost, M. A., & Goebel, R. (2012). Measuring structural–functional correspondence: Spatial variability of specialised brain regions after macro-anatomical alignment. NeuroImage, 59(2), 1369–1381. https://doi.org/10.1016/j.neuroimage.2011.08.035
Gibson, E. (2000). The dependency locality theory: A distance-based theory of linguistic complexity. In A. P. Marantz, Y. Miyashita, & W. O’Neil (Eds.), Image, language, brain: Papers from the first Mind Articulation Project symposium (pp. 95–126). MIT Press. https://doi.org/10.7551/mitpress/3654.003.0008
Hartwigsen, G. (2018). Flexible redistribution in cognitive networks. Trends in Cognitive Sciences, 22(8), 687–698. https://doi.org/10.1016/j.tics.2018.05.008
Hasson, U., Chen, J., & Honey, C. J. (2015). Hierarchical process memory: Memory as an integral component of information processing. Trends in Cognitive Sciences, 19(6), 304–313. https://doi.org/10.1016/j.tics.2015.04.006
Hasson, U., Egidi, G., Marelli, M., & Willems, R. M. (2018). Grounding the neurobiology of language in first principles: The necessity of non-language-centric explanations for language comprehension. Cognition, 180, 135–157. https://doi.org/10.1016/j.cognition.2018.06.018
Hasson, U., Malach, R., & Heeger, D. J. (2010). Reliability of cortical activity during natural stimulation. Trends in Cognitive Sciences, 14(1), 40–48. https://doi.org/10.1016/j.tics.2009.10.011
Koechlin, E., & Jubault, T. (2006). Broca’s area and the hierarchical organization of human behavior. Neuron, 50(6), 963–974. https://doi.org/10.1016/j.neuron.2006.05.017
Kutas, M., & Hillyard, S. A. (1984). Brain potentials during reading reflect word expectancy and semantic association. Nature, 307(5947), 161–163. https://doi.org/10.1038/307161a0
Levy, R. (2008). Expectation-based syntactic comprehension. Cognition, 106(3), 1126–1177. https://doi.org/10.1016/j.cognition.2007.05.006
Lewis, R. L. (1996). Interference in short-term memory: The magical number two (or three) in sentence processing. Journal of Psycholinguistic Research, 25(1), 93–115.
Lewis, S., & Phillips, C. (2015). Aligning grammatical theories and language processing models. Journal of Psycholinguistic Research, 44(1), 27–46. https://doi.org/10.1007/BF01708421
Martin, R. C., Shelton, J. R., & Yaffee, L. S. (1994). Language processing and working memory: Neuropsychological evidence for separate phonological and semantic capacities. Journal of Memory and Language, 33(1), 83–111. https://doi.org/10.1006/jmla.1994.1005
Montague, R. (1973). The proper treatment of quantification in ordinary English. In K. J. J. Hintikka, J. M. E. Moravcsik, & P. Suppes (Eds.), Approaches to natural language: Proceedings of the 1970 Stanford Workshop on Grammar and Semantics (pp. 221–242). D. Reidel.
Novick, J. M., Trueswell, J. C., & Thompson-Schill, S. L. (2005). Cognitive control and parsing: Reexamining the role of Broca’s area in sentence comprehension. Cognitive, Affective, & Behavioral Neuroscience, 5(3), 263–281. https://doi.org/10.3758/CABN.5.3.263
Patel, A. D. (2003). Language, music, syntax and the brain. Nature Neuroscience, 6(7), 674–681. https://doi.org/10.1038/nn1082
Poldrack, R. A. (2006). Can cognitive processes be inferred from neuroimaging data? Trends in Cognitive Sciences, 10(2), 59–63. https://doi.org/10.1016/j.tics.2005.12.004
Rayner, K. (1998). Eye movements in reading and information processing: 20 years of research. Psychological Bulletin, 124(3), 372–422. https://doi.org/10.1037/0033-2909.124.3.372
Shain, C., Blank, I. A., Fedorenko, E., Gibson, E., & Schuler, W. (2021). Robust effects of working memory demand during naturalistic language comprehension in language-selective cortex. BioRxiv. https://doi.org/10.1101/2021.09.18.460917
Shain, C., Blank, I. A., van Schijndel, M., Schuler, W., & Fedorenko, E. (2020). fMRI reveals language-specific predictive coding during naturalistic sentence comprehension. Neuropsychologia, 138(17), Article 107307. https://doi.org/10.1016/j.neuropsychologia.2019.107307
Strijkers, K., Chanoine, V., Munding, D., Dubarry, A.-S., Trébuchon, A., Badier, J.-M., & Alario, F.-X. (2019). Grammatical class modulates the (left) inferior frontal gyrus within 100 milliseconds when syntactic context is predictive. Scientific Reports, 9(1), Article 4830. https://doi.org/10.1038/s41598-019-41376-x
Tanenhaus, M. K., Spivey-Knowlton, M. J., Eberhard, K. M., & Sedivy, J. C. (1995). Integration of visual and linguistic information in spoken language comprehension. Science, 268(5217), 1632–1634. https://doi.org/10.1126/science.7777863
Wehbe, L., Blank, I. A., Shain, C., Futrell, R., Levy, R., von der Malsburg, T., Smith, N., Gibson, E., & Fedorenko, E. (2021). Incremental language comprehension difficulty predicts activity in the language network but not the multiple demand network. Cerebral Cortex, 31(9), 4006–4023. https://doi.org/10.1093/cercor/bhab065
Yang, G. R., Joglekar, M. R., Song, H. F., Newsome, W. T., & Wang, X.-J. (2019). Task representations in neural networks trained to perform many cognitive tasks. Nature Neuroscience, 22(2), 297–306. https://doi.org/10.1038/s41593-018-0310-2
