Contemporary Debates in Cognitive Science
Edited by Robert J. Stainton
Blackwell Publishing
© 2006 by Robert J. Stainton
BLACKWELL PUBLISHING
350 Main Street, Malden, MA 02148-5020, USA
9600 Garsington Road, Oxford OX4 2DQ, UK
550 Swanston Street, Carlton, Victoria 3053, Australia
The right of Robert J. Stainton to be identified as the Author of the Editorial Material in this Work has been asserted in accordance with the UK Copyright, Designs, and Patents Act 1988.
All rights reserved. No part of this publication may be reproduced, stored in a retrieval system, or transmitted, in any form or by any means, electronic, mechanical, photocopying, recording or otherwise, except as permitted by the UK Copyright, Designs, and Patents Act 1988, without the prior permission of the publisher.
First published 2006 by Blackwell Publishing Ltd
1 2006
Library of Congress Cataloging-in-Publication Data
Contents

Acknowledgments vii
Notes on Contributors viii
Preface xiii
Jesse J. Prinz
Richard Samuels
James McGilvray
Gerd Gigerenzer
William G. Lycan
Brie Gertler
Ray Jackendoff
Georges Rey
Kirk Ludwig
16 Is the Aim of Perception to Provide Accurate Representations? A Case for the "No" Side
Christopher Viger
Index
Acknowledgments

A great number of people - too many to mention here - made suggestions of topics, authors, etc. as the volume took shape. ...ous) help. I am also grateful, of course, to the authors for their excellent contributions ... and for their patience while the book was completed. I would also like to thank the extremely helpful crew at Blackwell. Finally, ...
Notes on Contributors
Adele Abrahamsen is Associate Project Scientist in the Center for Research in Language at the University of California, San Diego. Her research focuses on the onset and early development of symbolic gestures and words as well as foundational and theoretical issues in cognitive science. She is author of Child Language: An Interdisciplinary Guide to Theory and Research (1977) and coauthor, with William Bechtel, of Connectionism and the Mind (Blackwell, 2002).
... Leonard Bloomfield Book Award from the Linguistic Society of America in 2004. He was the Constance E. Smith Fellow at the Radcliffe Institute for Advanced Study at Harvard University in 2005-6, when his joint chapter with Barbara C. Scholz in this volume was completed.
Georges Rey completed his PhD in Philosophy at Harvard University in 1978. He works primarily on the foundations of cognitive science. He has published on issues of consciousness and qualia, concepts and intentionality, and the philosophy of linguistics. He is the author of Contemporary Philosophy of Mind (Blackwell, 1997), the editor (with Barry Loewer) of Meaning in Mind: Fodor and His Critics (Blackwell, 1991), and the section editor for cognitive science for the Routledge Encyclopedia of Philosophy. He has taught at SUNY Purchase and the University of Colorado, and has held visiting positions at MIT, CREA, the University of Split, the University of London, the Australian National University, and Stanford. He is presently Professor of Philosophy at the University of Maryland at College Park.
Richard Samuels is Lecturer in Philosophy at King's College, London. His research focuses primarily on issues in the philosophy of psychology and the foundations of cognitive science. He has published papers on nativism, cognitive architecture, evolutionary psychology, and the implications of empirical psychology for our understanding of human rationality. He is currently completing a book on cognitive architecture.
Barbara C. Scholz lives in Santa Cruz, California, and held the Frieda L. Miller Fellowship at the Radcliffe Institute for Advanced Study at Harvard University during 2005-6, when her joint chapter with Geoffrey K. Pullum in this volume was completed. She publishes articles in journals of philosophy, psychology, linguistics, and psycholinguistics. Her particular interests lie in model-theoretic syntax and the philosophy of linguistic science.
John Tienson is Professor of Philosophy at the University of Memphis. He has published extensively on the foundations of cognitive science, including Connectionism and the Philosophy of Psychology (with Terence Horgan, 1996). He is currently working on a book in the philosophy of mind entitled Phenomenal Intentionality, with Terence Horgan and George Graham, Wake Forest University. He has recently published a dozen articles in the philosophy of mind related to the book, many with Horgan and Graham.
Christopher Viger has been assistant professor at the University of Western Ontario since 2002. He received his PhD in philosophy from McGill University in 1999 and has done postdoctoral work at Tufts University with Daniel Dennett; at the CUNY Graduate Center, with David Rosenthal; and at Rutgers University. His research areas are in philosophy of mind, philosophy of language, and cognitive science, with particular interest in the connection between language and thought in an attempt to find alternatives to the language of thought hypothesis. He has published in such journals as Mind and Language, Synthese, and Philosophical Psychology.
Preface
Robert J. Stainton
This volume is about debates in cognitive science. Yet it is part of a series called Contemporary Debates in Philosophy. How can it be both?
Let's begin with what cognitive science is. It is the interdisciplinary attempt to understand the mind, most especially the human mind. More specifically, one can think of cognitive science as having four branches: there are the behavioral and brain sciences, like psycholinguistics, neuroscience, and cognitive psychology; there are those social sciences that more or less directly inform us about the mind, like anthropology and sociolinguistics; there are formal disciplines like logic, computer science, and especially Artificial Intelligence; and, finally, there are parts of philosophy, especially philosophy of mind and language. The hallmark of cognitive science, in brief, is that it draws on the methods and results of all these branches, to attempt to give a global understanding of the mind.
To anticipate a worry, the idea obviously is not that philosophy is wholly contained in cognitive science. To pick only two examples, history of philosophy and political philosophy clearly aren't parts of cognitive science. What's more, even some parts of, say, philosophy of mind don't fit easily within cognitive science, e.g., issues about personal identity and life after death. The intersection, rather, is between certain sub-areas of philosophy and the other three branches.
From this definition alone we can immediately see why a debate can be both part of cognitive science and part of philosophy - for there is overlap between the two overarching fields. Some of the debates in this volume exemplify overlap of that kind. Brie Gertler and William Lycan debate about the nature and source of consciousness. Peter Carruthers, Jesse Prinz, and Richard Samuels debate the variety and extent of modular specialization in the human mind. Geoffrey Pullum and Barbara Scholz debate with Robert Matthews and James McGilvray about how language develops in the mind, and specifically about the role that an innate endowment plays. Such questions are core parts of traditional philosophy of language and mind, but they are equally core parts of today's cognitive science. Thus there is an intersection of philosophy and cognitive science.
There is another way, however, in which a debate can be both philosophical and cognitive scientific. Many researchers accept that, though philosophy and empirical science are not the same thing, nevertheless the two are continuous. According to this view, called "naturalism," there is no sharp dividing line where philosophy stops and empirical science begins. This isn't merely the claim, just made, that a question can fall into both domains (e.g., the nature of space and time is among the oldest philosophical issues, but it is also pursued by experimental methods). The additional idea is that work which is straightforwardly empirical can bear on long-standing "properly philosophical" questions, and vice versa. Debates in this volume which exemplify empirical results informing philosophy include Kirk Ludwig and Chris Viger on the nature and function of perception. Recent research on how human thermoreceptors work, for instance, suggests that it is not their job to give an accurate representation of temperature to the agent. Instead, the job of a thermoreceptor is to bypass accuracy in favor of immediate, limb-saving reactions - like withdrawing a hand from something hot. This can seem to suggest that a very long tradition in philosophical thinking about perception misconceives the phenomenon from the get-go. (Ludwig firmly resists this inference.) Or again, what rationality is, is an extremely long-standing philosophical issue. The empirical research that Gerd Gigerenzer brings to bear in his debate with David Matheson again suggests that the philosophical tradition has deeply misunderstood rationality's fundamental nature. Going in the other direction, from philosophy to empirical science, Timothy Williamson urges that knowledge should be as central to the scientific understanding of the mind as it is to philosophical epistemology. Knowledge, insists Williamson, is a fundamental mental state with an importantly different behavioral profile than well-grounded belief. Thus the interdisciplinary attempt to understand the mind - cognitive science - cannot leave knowledge out. (This, in turn, means that cognitive science cannot ignore what is outside the mind, since a state counts as knowledge only insofar as it corresponds to what obtains "out there.") As another example, Ray Jackendoff and Georges Rey differ sharply about the implications for the science of language/mind of metaphysical worries about what "really exists" independently of the mind. These are all four of them interactions between philosophy and cognitive science.
So, how could a debate be both part of philosophy and part of cognitive science? In many ways, actually. There are various sorts of intersections and untold interactions. Speaking of "untold," let me end with this. Whether an investigation into x will yield evidence relevant to y cannot be known a priori: it depends upon whether x and y turn out to be linked in interesting ways. One just never knows for sure, then, which curious facts might turn out to be deeply evidentially relevant to a problem one is working on. To my mind, it is this aspect of the intersection and interaction between cognitive science and philosophy - never knowing where the next big lead may come from - that makes work in this area so challenging, but also so exciting.
Enjoy.
CHAPTER ONE
The Case for Massively Modular Models of Mind
Peter Carruthers
My charge in this chapter is to set out the positive case supporting massively modular models of the human mind.1 Unfortunately, there is no generally accepted understanding of what a massively modular model of the mind is. So at least some of our discussion will have to be terminological. I shall begin by laying out the range of things that can be meant by "modularity." I shall then adopt a pair of strategies. One will be to distinguish some things that "modularity" definitely can't mean, if the thesis of massive modularity is to be even remotely plausible. The other will be to look at some of the arguments that have been offered in support of massive modularity, discussing what notion of "module" they might warrant. It will turn out that there is, indeed, a strong case in support of massively modular models of the mind on one reasonably natural understanding of "module." But what really matters in the end, of course, is the substantive question of what sorts of structure are adequate to account for the organization and operations of the human mind, not whether or not the components appealed to in that account get described as "modules." So the more interesting question before us is what the arguments that have been offered in support of massive modularity can succeed in showing us about those structures, whatever they get called.
1 Introduction: On Modularity
In the weakest sense, a module can just be something like a dissociable functional component. This is pretty much the everyday sense in which one can speak of buying a hi-fi system on a modular basis, for example. The hi-fi is modular if one can purchase the speakers independently of the tape-deck, say, or substitute one set of speakers for another for use with the same tape-deck. Moreover, it counts towards the modularity of the system if one doesn't have to buy a tape-deck at all - just purchasing a CD player along with the rest - or if the tape-deck can be broken while the remainder of the system continues to operate normally.
Understood in this weak way, the thesis of massive mental modularity would claim that the mind consists entirely of distinct components, each of which has some specific job to do in the functioning of the whole. It would predict that the properties of many of these components could vary independently of the properties of the others. (This would be consistent with the hypothesis of "special intelligences" - see Gardner, 1983.) And the theory would predict that it is possible for some of these components to be damaged or absent altogether, while leaving the functioning of the remainder at least partially intact.
Would a thesis of massive mental modularity of this sort be either interesting or controversial? That would depend upon whether the thesis in question were just that the mind consists entirely (or almost entirely) of modular components, on the one hand; or whether it is that the mind consists of a great many modular components, on the other. Read in the first way, then nearly everyone is a massive modularist, given the weak sense of "module" that is in play. For everyone will allow that the mind does consist of distinct components; and everyone will allow that at least some of these components can be damaged without destroying the functionality of the whole. The simple facts of blindness and deafness are enough to establish these weak claims.
Read in the second way, however, the thesis of massive modularity would be by no means anodyne - although obviously it would admit of a range of different strengths, depending upon how many components the mind is thought to contain. Certainly it isn't the case that everyone believes that the mind is composed of a great many distinct functional components. For example, those who (like Fodor, 1983) picture the mind as a big general-purpose computer with a limited number of distinct input and output links to the world (vision, audition, etc.) don't believe this.
It is clear, then, that a thesis of massive (in the sense of "multiple") modularity is a controversial one, even when the term "module" is taken in its weakest sense. So those evolutionary psychologists who have defended the claim that the mind consists of a great many modular components (Tooby and Cosmides, 1992; Sperber, 1996; Pinker, 1997) are defending a thesis of considerable interest, even if "module" just means "component."
At the other end of the spectrum of notions of modularity, and in the strongest sense, a module would have all of the properties of what is sometimes called a "Fodor-module" (Fodor, 1983). That is, it would be a domain-specific, innately specified processing system, with its own proprietary transducers, and delivering "shallow" (nonconceptual) outputs (e.g., in the case of the visual system, delivering a 2½-D sketch; Marr, 1983). In addition, a module in this sense would be mandatory in its operations, swift in its processing, isolated from and inaccessible to the rest of cognition, associated with particular neural structures, liable to specific and characteristic patterns of breakdown, and would develop according to a paced and distinctively-arranged sequence of growth.
Let me comment briefly on the various different elements of this account. According to Fodor (1983) modules are domain-specific processing systems of the mind. Like most others who have written about modularity since, he understands this to mean that a module will be restricted in the kinds of content that it can take as input.2 It is restricted to those contents that constitute its domain, indeed. So the visual system is restricted to visual inputs; the auditory system is restricted to auditory inputs; and so on. Furthermore, Fodor claims that each module should have its own transducers: the rods and cones of the retina for the visual system; the eardrum for the auditory system; and so forth.
According to Fodor, moreover, the outputs of a module are shallow in the sense of being nonconceptual. So modules generate information of various sorts, but they don't issue in thoughts or beliefs. On the contrary, belief-fixation is argued by Fodor to be the very archetype of a nonmodular (or holistic) process. Hence the visual module might deliver a representation of surfaces and edges in the perceived scene, say, but it wouldn't as such issue in recognition of the object as a chair, nor in the belief that a chair is present. This would require the cooperation of some other (nonmodular) system or systems.
Fodor-modules are supposed to be innate, in some sense of that term, and to be localized to specific structures in the brain (although these structures might not, themselves, be local ones, but could rather be distributed across a set of dispersed neural systems). Their growth and development would be under significant genetic control, therefore, and might be liable to distinctive patterns of breakdown, either genetic or developmental. And one would expect their growth to unfold according to a genetically guided developmental timetable, buffered against the vagaries of the environment and the individual's learning opportunities.
Fodor-modules are also supposed to be mandatory and swift in their processing. So their operations aren't under voluntary control (one can't turn them off), and they generate their outputs extremely quickly by comparison with other (nonmodular) systems. When we have our eyes open we can't help but see what is in front of us. And nor can our better judgment (e.g., about the equal lengths of the two lines in a Müller-Lyer illusion) override the operations of the visual system. Moreover, compare the speed with which vision is processed with the (much slower) speed of conscious decision making.
Finally, modules are supposed by Fodor to be both isolated from the remainder of cognition (i.e., encapsulated) and to have internal operations that are inaccessible elsewhere. These properties are often run together with each other (and also with domain specificity), but they are really quite distinct. To say that a processing system is encapsulated is to say that its internal operations can't draw on any information held outside of that system. (This isn't to say that the system can't access any stored information at all, of course, for it might have its own dedicated database that it consults during its operations.) In contrast, to say that a system is inaccessible is to say that other systems can have no access to its internal processing, but only to its outputs, or to the results of that processing.
Note that neither of these notions should be confused with that of domain specificity. The latter is about restrictions on the input to a system. To say that a system is domain specific is to say that it can only process inputs of a particular sort, concerning a certain kind of subject-matter. Whereas to say that the processing of a system is encapsulated, on the one hand, or inaccessible, on the other, is to say something about the access-relations that obtain between the internal operations of that system and others. Hence one can easily envisage systems that might lack domain specificity, for example (being capable of receiving any sort of content as input), but whose internal operations are nevertheless encapsulated and inaccessible (Carruthers, 2002a; Sperber, 2002).
... and defend a thesis of massive modularity is moot. Certainly, innateness has been emphasized by evolutionary psychologists, who have argued that natural selection has led to the development of multiple innately channeled cognitive systems (Tooby and Cosmides, 1992). But others have argued that modularity is the product of learning and development (Karmiloff-Smith, 1992). Both sides in this debate agree, however, that modules will be realized in specific neural structures (not necessarily the same from individual to individual). And both sides are agreed, at least, that development begins with a set of innate attention biases and a variety of different innately structured learning mechanisms.
My own sympathies in this debate are towards the nativist end of the spectrum. I suspect that much of the structure, and many of the contents, of the human mind are innate or innately channeled. But in the context of developing a thesis of massive modularity, it seems wisest to drop the innateness-constraint from our definition of what modules are. For one might want to allow that some aspects of the mature language faculty are modular, for example, even though it is saturated with acquired information about the lexicon of a specific natural language like English. And one might want to allow that modules can be constructed by over-learning, say, in such a way that it might be appropriate to describe someone's reading competence as modular.
Finally, we come to the properties of encapsulated and inaccessible processing. These are thought by many (including Fodor, 2000) to be the core properties of modular systems. And there seems to be no a priori reason why the mind shouldn't be composed exclusively out of such systems, and cycles of operation of such systems. At any rate, such claims have been defended by a number of those who describe themselves as massive modularists (Sperber, 1996, 2002, 2005; Carruthers, 2002a, 2003, 2004a). Accordingly, they will be left untouched for the moment, pending closer examination of the arguments in support of massive modularity.
What we have so far, then, is that if a thesis of massive mental modularity is to be remotely plausible, then by "module" we cannot mean "Fodor-module." In particular, the properties of having proprietary transducers, shallow outputs, domain specificity, comparatively fast processing, and significant innateness or innate channeling will have to be struck out. That leaves us with the idea that modules might be isolable function-specific processing systems, whose operations are mandatory, which are associated with specific neural structures, and whose internal operations may be both encapsulated from the remainder of cognition and inaccessible to it. Whether all of these properties should be retained in the most defensible version of a thesis of massive mental modularity will be the subject of the next two sections of this chapter.
... properties that we observe in adult minds, then, aren't (except very indirectly) a product of natural selection, but are rather a result of learning from the environment within which fitness-enhancing behaviors will need to be manifested.
Such a proposal is an obvious non-starter, however. It is one thing to claim that all the contents of the mind are acquired from the environment using general learning principles, as empiricists have traditionally claimed. (This is implausible enough by itself; see section 3.2 below.) And it is quite another thing to claim that the structure and organization of the mind is similarly learned. How could the differences between, and characteristic causal roles of, beliefs, desires, emotions, and intentions be learned from experience?5 For there is nothing corresponding to them in the world from which they could be learned; and in any case, any process of learning must surely presuppose that a basic mental architecture is already in place. Moreover, how could the differences between personal (or "episodic") memory, factual (or "semantic") memory, and short-term (or "working") memory be acquired from the environment? The idea seems barely coherent. And indeed, no empiricist has ever been foolish enough to suggest such things.
We have no other option, then, but to see the structure and organization of the mind as a product of the human genotype, in exactly the same sense as, and to the same extent that, the structure and organization of the human body is a product of our genotypes. But someone could still try to maintain that the mind isn't the result of any process of natural selection. Rather, it might be said, the structure of the mind might be the product of a single macro-mutation, which became general in the population through sheer chance, and which has remained thereafter through mere inertia. Or it might be the case that the organization in question was arrived at through random genetic drift - that is to say, a random walk through a whole series of minor genetic mutations, each of which just happened to become general in the population, and the sequence of which just happened to produce the structure of our mind as its end-point.
These possibilities are so immensely unlikely that they can effectively be dismissed out of hand. Evolution by natural selection remains the only explanation of organized functional complexity that we have (Dawkins, 1986). Any complex phenotypic structure, such as the human eye or the human mind, will require the cooperation of many thousands of genes to build it. And the possibility that all of these thousands of tiny genetic mutations might have occurred all at once by chance, or might have become established in sequence (again by chance), is unlikely in the extreme. The odds in favor of either thing happening are vanishingly small. (Throwing a six with a fair die many thousands of times in a row would be much more likely.) We can be confident that each of the required small changes, initially occurring through chance mutation, conferred at least some minor fitness-benefit on its possessor, sufficient to stabilize it in the population, and thus providing a platform on which the next small change could occur.
The strength of this argument, in respect of any given biological system, is directly proportional to the degree of its organized functional complexity - the more complex the organization of the system, the more implausible it is that it might have arisen by chance macro-mutation or random genetic walk. Now, even from the perspective of commonsense psychology the mind is an immensely complex system, which seems to be organized in ways that are largely adaptive.6 And the more we learn about the mind from a scientific perspective, the more it seems that it is even more complex than we might initially have been inclined to think. Systems such as vision, for example - that are treated as "simples" from the perspective of commonsense psychology - turn out to have a hugely complex internal structure.
The prediction of this line of reasoning, then, is that cognition will be structured out of dissociable systems, each of which has a distinctive function, or set of functions, to perform.7 This gives us a notion of a cognitive "module" that is pretty close to the everyday sense in which one can talk about a hi-fi system as "modular" - provided that the tape-deck can be purchased, and can function, independently of the CD player, and so forth. Roughly, a module is just a dissociable component.
Consistent with the above prediction, there is now a great deal of evidence of a neuropsychological sort that something like massive modularity (in the everyday sense of "module") is indeed true of the human mind. People can have their language system damaged while leaving much of the remainder of cognition intact (aphasia); people can lack the ability to reason about mental states while still being capable of much else (autism); people can lose their capacity to recognize just human faces; someone can lose the capacity to reason about cheating in a social exchange while retaining otherwise parallel capacities to reason about risks and dangers; and so on and so forth (Sacks, 1985; Shallice, 1988; Tager-Flusberg, 1999; Stone et al., 2002; Varley, 2002).
But just how many components does this argument suggest that the mind consists of? Simon's (1962) argument makes the case for hierarchical organization. At the top of the hierarchy will be the target system in question (a cell, a bodily organ, the human mind). And at the base will be the smallest micro-components of the system, bottoming out (in the case of the mind) in the detailed neural processes that realize cognitive ones. But it might seem that it is left entirely open how high or how low the pyramid is (i.e., how many "levels" the hierarchy consists of); how broad its base is; or whether the "pyramid" has concave or convex edges. If the pyramid is quite low with concave sides, then the mind might decompose at the first level of analysis into just a few constituents such as perception, belief, desire, and the will, much as traditional "faculty psychologies" have always assumed; and these might then get implemented quite rapidly in neural processes. In contrast, only if the pyramid is high with a broad base and convex sides should we expect the mind to decompose into many components, each of which in turn consists of many components, and so on.
There is more mileage to be derived from Simon's argument yet, however. For the complexity and range of functions that the overall system needs to execute will surely give us a direct measure of the manner in which the "pyramid" will slope. (The greater the complexity, the greater the number of subsystems into which the system will decompose.) This is because the hierarchical organization is there in the first place to ensure robustness of function. Evolution needs to be able to tinker with one function in response to selection pressures without necessarily impacting any of the others.8 (So does learning, since once you have learned one skill, you need to be able to isolate and preserve it while you acquire others. See Manoel et al., 2002.)
Roughly speaking, then, we should expect there to be one distinct subsystem for each reliably recurring function that human minds are called upon to perform. And as evolutionary psychologists have often emphasized, these are myriad (Tooby and Cosmides, 1992; Pinker, 1997). Focusing just on the social domain, for example, humans need to: identify degrees of relatedness of kin, care for and assist kin, avoid incest, woo and select a mate, identify and care for offspring, make friends and build coalitions, enter into contracts, identify and punish those who are cheating on a contract, identify and acquire the norms of one's surrounding culture, identify the beliefs and goals of other agents, predict the behavior of other agents, and so on and so forth - plainly this is just the tip of a huge iceberg, even in this one domain. In which case the argument from biology enables us to conclude that the mind will consist in a very great many distinct components, which is a (weak) form of massive modularity thesis.
3.2 The argument from task specificity
A second line of reasoning supporting massive modularity derives from reflection on the differing task demands of the very different learning challenges that people and other animals must face, as well as the demands of generating appropriate fitness-enhancing intrinsic desires (Gallistel, 1990, 2000; Tooby and Cosmides, 1992, 2005). It is one sort of task to learn the sun's azimuth (its compass direction at any given time of day and year) so as to provide a source of direction. It is quite another sort of task to perform the calculations required for dead reckoning, integrating distance traveled with the angle of each turn, so as to provide the direction and distance to home from one's current position. And it is quite another task again to learn the center of rotation of the night sky from observation of the stars, extracting from it the polar north. These are all learning problems that animals can solve. But they require quite different learning mechanisms to succeed (Gallistel, 2000).
When we widen our focus from navigation to other sorts of learning problem, the argument is further reinforced. Many such problems pose computational challenges to extract the information required from the data provided that are distinct from any others. From vision, to speech recognition, to mind-reading, to cheater-detection, to complex skill acquisition, the challenges posed are plainly quite distinct. So for each such problem, we should postulate the existence of a distinct learning mechanism, whose internal processes are computationally specialized in the way required to solve the task. It is very hard to believe that there could be any sort of general learning mechanism that could perform all of these different roles.
One might think that conditioning experiments fly in the face of these claims. But general-purpose conditioning is rare at best. Indeed, Gallistel (2000; Gallistel and Gibbon, 2001) has forcefully argued that there is no such thing as a general learning mechanism. Specifically, he argues that the results from conditioning experiments are best explained in terms of the computational operations of a specialized rate-estimation module, rather than some sort of generalized associative process. For example, it is well established that delay of reinforcement has no effect on rate of acquisition, so long as the intervals between trials are increased by the same proportions. And the number of reinforcements required for acquisition of a new behavior isn't affected by interspersing a significant number of unreinforced trials. These facts are hard to explain if the animals are supposed to be building associations, since the delays and unreinforced trials should surely weaken those associations. But they can be predicted if what the animals are doing is estimating relative rates of return. For the rate of reinforcement per stimulus presentation relative to the rate of reinforcement in background conditions remains the same, whether or not significant numbers of stimulus presentations remain unreinforced, for example.
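One toy way to see why the rate ratio is invariant (my own simplified arithmetic, not Gallistel's model): suppose each stimulus presentation lasts $T$ seconds, is reinforced with probability $p$, and is separated from the next by an interval $I$. Over $n$ trials,

$$\frac{\text{rate during the stimulus}}{\text{background rate}} \;=\; \frac{p/T}{\,np/\,n(T+I)\,} \;=\; \frac{T+I}{T}.$$

The ratio is independent of $p$, so interspersing unreinforced trials leaves it unchanged; and it is preserved when $T$ and $I$ are scaled by the same factor, so delays of reinforcement accompanied by proportionally lengthened intertrial intervals leave it unchanged too.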
What emerges from these considerations is a picture of the mind as containing a whole host of specialized learning systems (as well as systems charged with generating fitness-enhancing intrinsic desires). And this looks very much like some sort of thesis of massive modularity. Admittedly, it doesn't yet follow from the argument that the mind is composed exclusively of such systems. But when combined with the previous argument, outlined in section 3.1 above, the stronger conclusion would seem to be warranted.
There really is no reason to believe, however, that each processing system will employ a unique processing algorithm. On the contrary, consideration of how evolution generally operates suggests that the same or similar algorithms may be replicated many times over in the human mind/brain. (We could describe this by saying that the same module-type is tokened more than once in the human brain, with distinct input and output connections, and hence with a distinct functional role, in each case.) Marcus (2004) explains how evolution often operates by splicing and copying, followed by adaptation. First, the genes that result in a given micro-structure (a particular bank of neurons, say, with a given set of processing properties) are copied, yielding two or more instances of such structures. Then second, some of the copies can be adapted to novel tasks. Sometimes this will involve tweaking the processing algorithm that is implemented in one or more of the copies. But often it will just involve provision of novel input and/or output connections for the new system.
Samuels (1998) challenges the above line of argument for massive processing modularity, however, claiming that instead of a whole suite of specialized learning systems, there might be just a single general-learning/general-inferencing mechanism, but one operating on lots of organized bodies of innate information. (He calls this "informational modularity," contrasting it with the more familiar form of computational modularity.) However, this would surely create a serious processing bottleneck. If there were really just one (or even a few) inferential systems - generating beliefs about the likely movements of the surrounding mechanical objects; about the likely beliefs, goals, and actions of the surrounding agents; about who owes what to whom in a social exchange; and so on and so forth - then it looks as if there would be a kind of tractability problem here. It would be the problem of forming novel beliefs on all these different subject matters in real time (in seconds or fractions of a second), using a limited set of inferential resources. Indeed (and in contrast with Samuels's suggestion) surely everyone now thinks that the mind/brain is massively parallel in its organization. In which case we should expect there to be distinct systems that can process each of the different kinds of information at the same time.
Samuels might try claiming that there could be a whole suite of distinct domain-general processing systems, all running the same general-learning/general-inferencing algorithms, but each of which is attached to, and draws upon the resources of, a distinct domain-specific body of innate information. This would get him the computational advantages of parallel processing, but without commitment (allegedly) to any ... ones.
Combining the arguments of sections 3.1 and 3.2, then, we can predict that the mind should be composed entirely or almost entirely of modular components (in the everyday sense of "module"), many of which will be innate or innately channeled. All of these component systems should run task-specific processing algorithms, with distinct input and/or output connections to other systems, although some of them may replicate some of the same algorithm types in the service of distinct tasks. This looks like a thesis of massive modularity worth the name, even if there is nothing here yet to warrant the claims that the internal processing of the modules in question should be either encapsulated, on the one hand, or inaccessible, on the other.
3.3 The argument from computational tractability
Perhaps the best-known of the arguments for massive modularity, however - at least among philosophers - is the argument from computational tractability, which derives from Fodor (1983, 2000).9 And it is generally thought that this argument, if it were successful, would license the claim that the mind is composed of encapsulated processing systems, thus supporting a far stronger form of massive modularity hypothesis than has been defended in this chapter so far (Carruthers, 2002a; Sperber, 2002).
The first premise of the argument is the claim that the mind is realized in processes that are computational in character. This claim is by no means uncontroversial, of course, although it is the guiding methodological assumption of much of cognitive science. Indeed, it is a claim that is denied by certain species of distributed connectionism. But in recent years arguments have emerged against these competitors that are decisive, in my view (Gallistel, 2000; Marcus, 2001). And what remains is that computational psychology represents easily our best - and perhaps our only - hope for fully understanding how mental processes can be realized in physical ones (Rey, 1997). In any case, I propose just to assume the truth of this first premise for the purposes of the discussion that follows.
The second premise of the argument is the claim that if cognitive processes are to be realized computationally, then those computations must be tractable ones. What does this amount to? First of all, it means that the computations must be such that they can in principle be carried out within finite time. But it isn't enough that the computations postulated to take place in the human brain should be tractable in principle, of course. It must also be feasible that those computations could be executed (perhaps in parallel) in a system with the properties of the human brain, within timescales characteristic of actual human performance. By this criterion, it seems likely that many computations that aren't strictly speaking intractable from the perspective of computer science should nevertheless count as such for the purposes of cognitive science.
There is a whole branch of computer science devoted to the study of more-or-less intractable problems, known as "Complexity Theory." And one doesn't have to dig very deep into the issues to discover results that have important implications for cognitive science. For example, it has traditionally been assumed by philosophers that any candidate new belief should be checked for consistency with existing beliefs before being accepted. But in fact consistency-checking is demonstrably intractable, if attempted on an exhaustive basis. Consider how one might check the consistency of a set of beliefs via a truth-table. Even if each line could be checked in the time that it takes a photon of light to travel the diameter of a proton, then even after 20 billion years the truth-table for a set of just 138 beliefs (2¹³⁸ lines) still wouldn't have been completed (Cherniak, 1986).
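The arithmetic behind Cherniak's point runs roughly as follows (the physical constants are supplied here for illustration, not taken from the text):

$$2^{138} \approx 3.5\times10^{41}\ \text{lines}, \qquad t_{\text{line}} \approx \frac{1.7\times10^{-15}\,\mathrm{m}}{3\times10^{8}\,\mathrm{m/s}} \approx 5.7\times10^{-24}\,\mathrm{s},$$

$$2^{138}\, t_{\text{line}} \approx 2\times10^{18}\,\mathrm{s} \approx 6\times10^{10}\ \text{years},$$

several times the 20 billion years mentioned in the text.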
From the first two premises together, then, we can conclude that the human mind must be realized in a set of computational processes that are suitably tractable. This means that those processes will have to be frugal, both in the amount of information that they require for their normal operations, and in the complexity of the algorithms that they deploy when processing that information.
The third premise of the argument then claims that in order to be tractable, computations need to be encapsulated; for only encapsulated processes can be appropriately frugal in the informational and computational resources that they require. As Fodor (2000) explains it, the constraint here can be expressed as one of locality. Computationally tractable processes have to be local, in the sense of only consulting a limited database of information relevant to those computations, and ignoring all other information held in the mind. For if they attempted to consult all (or even a significant subset) of the total information available, they would be subject to combinatorial explosion, and hence would fail to be tractable after all.
This third premise, in conjunction with the other two, would then (if it were acceptable) license the conclusion that the mind must be realized in a set of encapsulated computational processes. And when combined with the conclusions of the arguments of sections 3.1 and 3.2 above, this would give us the claim that the mind consists in a set of encapsulated computational systems whose operations are mandatory, each of which has its own function to perform, and many of which execute processing algorithms that aren't to be found elsewhere in the mind (although some re-use algorithms that are also found in other systems for novel functions). It is therefore crucial for our purposes to know whether the third premise is really warranted; and if not, what one might put in its stead. This will form the topic of the next section.
... is unsuccessful within a specified time-frame, can also work remarkably well - such as accessing the information in the order in which it was last used, or accessing the information that is partially activated (and hence made salient) by the context.10
For a different sort of example, consider the simple practical reasoning system sketched in Carruthers (2002a). It takes as initial input whatever is currently the strongest desire, for P.11 It then queries the various belief-generating modules, while also conducting a targeted search of long-term memory, looking for beliefs of the form Q ⊃ P. If it receives one as input, or if it finds one from its own search of memory, it consults a database of action schemata, to see if Q is something doable here and now. If it is, it goes ahead and does it. If it isn't, it initiates a further search for beliefs of the form R ⊃ Q, and so on. If it has gone more than n conditionals deep without success, or if it has searched for the right sort of conditional belief without finding one for more than some specified time t, then it stops and moves on to the next strongest desire.
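A minimal Python rendering of this frugal search (my own sketch; the helper names, the default depth bound n, and the time budget t are illustrative assumptions, not Carruthers's specification):

```python
import time

def practical_reason(desire, find_conditional, doable, n=3, t=0.05):
    """Frugal backward-chaining from a desired goal.

    find_conditional(goal) returns some q such that "q implies goal" is
    believed (standing in for memory search and queries to belief modules),
    or None; doable(q) consults the database of action schemata. The search
    stops after n conditionals deep or t seconds, mimicking stopping rules.
    """
    deadline = time.monotonic() + t
    goal = desire
    for _ in range(n):                 # at most n conditionals deep
        if time.monotonic() > deadline:
            break                      # time budget t exhausted
        q = find_conditional(goal)     # look for a belief q -> goal
        if q is None:
            break                      # no conditional found: give up
        if doable(q):
            return q                   # q is doable here and now: do it
        goal = q                       # else search for r with r -> q
    return None                        # move on to the next strongest desire

# Toy beliefs: "fill kettle -> boil water", "boil water -> have tea".
beliefs = {"have tea": "boil water", "boil water": "fill kettle"}
actions = {"fill kettle"}
print(practical_reason("have tea", beliefs.get, actions.__contains__))
```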
Such a system would be frugal, both in the information that it uses, and in the complexity of its algorithms. But does it count as encapsulated? This isn't encapsulation as that notion would generally be understood, which requires there to be a limited module-specific database that gets consulted by the computational process in question. For here, on the contrary, the practical reasoning system can search within the total set of the organism's beliefs, using structure-sensitive search rules. But for all that, there is a sense in which the system is encapsulated that is worth noticing.
Put as neutrally as possible, we can say that the idea of an encapsulated system is the notion of a system whose internal operations can't be affected by most or all of the information held elsewhere in the mind. But there is a scope ambiguity here.12 We can have the modal operator take narrow scope with respect to the quantifier, or we can have it take wide scope. In its narrow-scope form, an encapsulated system would be this: concerning most of the information held in the mind, the system in question can't be affected by that information in the course of its processing. Call this "narrow-scope encapsulation." In its wide-scope form, on the other hand, an encapsulated system would be this: the system is such that it can't be affected by most of the information held in the mind in the course of its processing. Call this "wide-scope encapsulation."
Narrow-scope encapsulation is the one that is taken for granted in the philosophical literature on modularity. We tend to think of encapsulation as requiring some determinate (and large) body of information, such that that information can't penetrate the module. However, it can be true that the operations of a module can't be affected by most of the information in a mind, without there being some determinate subdivision between the information that can affect the system and the information that can't. For as we have just seen, it can be the case that the system's algorithms are so set up that only a limited amount of information is ever consulted before the task is completed or aborted. Put it this way: a module can be a system that must only consider a small subset of the information available. Whether it does this via encapsulation as traditionally understood (the narrow-scope variety), or via frugal search heuristics and stopping rules (wide-scope encapsulation), is inessential. The important thing is that the system should be frugal, both in the information that it uses and in the resources that it requires for processing that information.
The argument from computational tractability, then, does warrant the claim that the mind should be constructed entirely out of systems that are frugal; but it doesn't warrant a claim of encapsulation, as traditionally understood (the narrow-scope variety). It does, however, warrant a non-standard encapsulation claim (the wide-scope version). In addition, it supports the claim that the processing systems in question should have internal operations that are inaccessible elsewhere. Or so I shall now briefly argue by reductio, and by induction across current practices in AI.
Consider what it would be like if the internal operations of each system were accessible to all other systems. (This would be complete accessibility. Of course the notions of accessibility and inaccessibility, just like the notions of encapsulation and lack of encapsulation, admit of degrees.) In order to make use of that information, those other systems would need to contain a model of those operations, or they would need to be capable of simulating or replicating them. In order to use the information that a given processing system is currently undertaking such-and-such computations, the other systems would need to contain a representation of the algorithms in question. This would defeat the purpose of dividing up processing into distinct subsystems running different algorithms for different purposes, and would likely result in some sort of combinatorial explosion. At the very least, we should expect that most of those processing systems should have internal operations that are inaccessible to all others; and that all of the processing systems that make up the mind should have internal operations that are inaccessible to most others.13
Such a conclusion is also supported inductively by current practices in AI, where researchers routinely assume that processing needs to be divided up among distinct systems running algorithms specialized for the particular tasks in question. These systems can talk to one another and query one another, but not access one another's internal operations. And yet they may be conducting guided searches over the same memory database. (Personal communication: Mike Anderson, John Horty, Aaron Sloman.) That researchers attempting to build working cognitive systems have converged on some such architecture is evidence of its inevitability, and hence evidence that the human mind will be similarly organized.
This last point is worth emphasizing further, since it suggests a distinct line of argument supporting the thesis of massive modularity in the sense that we are currently considering. Researchers charged with trying to build intelligent systems have increasingly converged on architectures in which the processing within the total system is divided up among a much wider set of task-specific processing mechanisms, which can query one another, and provide input to each other, and many of which can access shared databases. But many of these systems will deploy processing algorithms that aren't shared by the others. And most of them won't know or care about what is going on within the others.
Indeed, the convergence here is actually wider still, embracing computer science more generally and not just AI. Although the language of modularity isn't so often used by computer scientists, the same concept arguably gets deployed under the heading of "object-oriented programs." Many programming languages now enable a total processing system to treat some of its parts as "objects" which can be queried or informed, but where the processing that takes place within those objects isn't accessible elsewhere. This enables the code within the "objects" to be altered without having to make alterations in code elsewhere, with all the attendant risks that this would bring. And the resulting architecture is regarded as well-nigh inevitable once a certain threshold in the overall degree of complexity of the system gets passed. (Note the parallel here with Simon's argument from complexity, discussed in section 3.1 above.)
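By way of illustration, here is a schematic Python sketch of my own (the class and method names are invented): the "object" below can be informed or queried by other parts of a program, but its internal routine is private by convention, so it can be rewritten without touching any code elsewhere.

```python
class ParserModule:
    """A processing 'object': other systems may query or inform it,
    but only its outputs ever cross the boundary."""

    def __init__(self):
        self._cache = {}                 # internal state, hidden by convention

    def inform(self, sentence, parse):
        """Other systems can hand the object information..."""
        self._cache[sentence] = parse

    def query(self, sentence):
        """...or ask it for a result, without seeing how it was computed."""
        if sentence not in self._cache:
            self._cache[sentence] = self._parse(sentence)
        return self._cache[sentence]

    def _parse(self, sentence):
        # Internal operation: swapping in a better algorithm here requires
        # no changes to any code that merely queries the object.
        return sentence.split()

module = ParserModule()
print(module.query("the cat sat"))       # ['the', 'cat', 'sat']
```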
5 Conclusion
What emerges, then, is that there is a strong case for saying that the mind is very likely to consist of a great many different processing systems, which exist and operate to some degree independently of one another. Each of these systems will have a distinctive function or set of functions; each will have a distinct neural realization; and many will be significantly innate, or genetically channeled. Many of them will deploy processing algorithms that are unique to them. And all of these systems will need to be frugal in their operations, hence being encapsulated in either the narrow-scope or the wide-scope sense. Moreover, the processing that takes place within each of these systems will generally be inaccessible elsewhere.14 Only the results, or outputs, of that processing will be made available for use by other systems.
Does such a thesis deserve the title of "massive modularity"? It is certainly a form of massive modularity in the everyday sense that we distinguished at the outset. And it retains many of the important features of Fodor-modularity. Moreover, it does seem that this is the notion of "module" that is used pretty commonly in AI, if not so much in philosophy or psychology (McDermott, 2001). But however it is described, we have here a substantive and controversial claim about the basic architecture of the human mind; and it is one that is supported by powerful arguments.
In any complete defense of massively modular models of mind, so conceived, we would of course have to consider all the various arguments against such models, particularly those deriving from the holistic and creative character of much of human thinking. This is a task that I cannot undertake here, but that I have attempted elsewhere (Carruthers, 2002a, 2002b, 2003, 2004a). If those attempted rebuttals should prove to be successful, then we can conclude that the human mind will, indeed, be massively modular (in one good sense of the term "module").
Acknowledgments
Thanks to Mike Anderson, Clark Barrett, John Horty, Edouard Machery, Richard Samuels, Aaron Sloman, Robert Stainton, Stephen Stich, and Peter Todd for discussion and/or critical comments that helped me to get clearer about the topics covered by this chapter. Stich and Samuels, in particular, induced at least one substantial change of mind from my previously published views, in which I had defended the idea that modules must be encapsulated (as traditionally understood). See Carruthers, 2002a, 2003.
Notes
1 For the negative case, defending such models against the attacks of opponents, see Carruthers, 2002a, 2002b, 2003, 2004a.
2 Evolutionary psychologists may well understand domain specificity differently. They tend to understand the domain of a module to be its function. The domain of a module is what it is supposed to do, on this account, rather than the class of contents that it can receive as input. I shall follow the more common content reading of "domain" in the present chapter. See Carruthers, forthcoming, for further discussion.
3 This is no accident, since Fodor's analysis was explicitly designed to apply to modular input and output systems like color perception or face recognition. Fodor has consistently maintained that there is nothing modular about central cognitive processes of believing and reasoning. See Fodor, 1983, 2000.
4 Is this way of proceeding question-begging? Can one insist, on the contrary, that since modules are domain-specific systems, we can therefore see at a glance that the mind can't be massively modular in its organization? This would be fine if there were already a pre-existing agreed understanding of what modules are supposed to be. But there isn't. As stressed above, there are a range of different meanings of "module" available. So principles of charity of interpretation dictate that we should select the meaning that makes the best sense of the claims of massive modularists.
5 Note that we aren't asking how one could learn from experience of beliefs, desires, and the other mental states. Rather, we are asking how the differences between these states themselves could be learned. The point concerns our acquisition of the mind itself, not the acquisition of a theory of mind.
6 As evidence of the latter point, witness the success of our species as a whole, which has burgeoned in numbers and spread across the whole planet in the course of a mere 100,000 years.
7 We should expect many cognitive systems to have a set of functions, rather than a unique function, since multi-functionality is rife in the biological world. Once a component has been selected, it can be co-opted, and partly maintained and shaped, in the service of other tasks.
8 Human software engineers and artificial intelligence researchers have hit upon the same
problem, and the same solution, which sometimes goes under the name "object-oriented
programming." In order that one part of a program can be improved and updated without
any danger of introducing errors elsewhere, engineers now routinely modularize their
programs. See the discussion towards the end of section 4.
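The engineering point can be sketched in a few lines of Python (my illustration, not the author's; the class and its fields are hypothetical): a module's internals stay hidden behind a fixed interface, so they can be rewritten without risk to the code that calls it.

    # Illustrative sketch: a "module" whose internals may be freely revised,
    # provided its narrow public interface stays fixed.
    class EdgeDetector:
        def __init__(self, threshold: float = 0.5):
            self._threshold = threshold  # private state; clients never touch it

        def detect(self, pixels: list[float]) -> list[bool]:
            # This rule can be improved or replaced without introducing
            # errors elsewhere, since callers depend only on detect().
            return [abs(p) > self._threshold for p in pixels]

    print(EdgeDetector().detect([0.1, 0.9, -0.7]))  # [False, True, True]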
9 Fodor himself doesn't argue for massive modularity, of course. Rather, since he claims
that we know that central processes of belief fixation and decision making can't be
modular, he transforms what would otherwise be an argument for massive modularity
into an argument for pessimism about the prospects for computational psychology.
See Carruthers, 2002a, 2002b, 2003, 2004a for arguments that the knowledge-claim
underlying such pessimism isn't warranted.
10 See Carruthers, forthcoming, for an extended discussion of the relationship between the
massive modularity hypothesis and the simple heuristics movement, and for elaboration
and defense of a number of the points made in the present section.
11 Note that competition for resources is another of the heuristics that may be widely used
within our cognitive systems; see Sperber, 2005. In the present instance one might think of
all activated desires as competing with one another for entry into the practical reasoning
system.
12 Modal terms like "can" and "can't" have wide scope if they govern the whole sentence
in which they occur; they have narrow scope if they govern only a part. Compare:
"] can't kill everyone" (wide scope; equivalent to, "It is impossible that ] kill everyone")
with, "Everyone is such that I can't kill them" (narrow scope). The latter is equivalent to,
"I can't kill anyone."
13 One important exception to this generalization is as follows. We should expect that many
modules will be composed out of other modules as parts. Some of these component parts
may feed their outputs directly to other systems. (Hence such components might be shared
between two or more larger modules.) Or it might be the case that they can be queried
independently by other systems. These would then be instances where some of the
intermediate stages in the processing of the larger module would be available elsewhere,
without the intermediate processing itself being so available.
14 As we already noted above, the notions of "encapsulation" and "inaccessibility" admit of
degrees. The processing within a given system may be more or less encapsulated from
and inaccessible to other systems.
References
Carruthers, P. (2002a). The cognitive functions of language. And author's response: modularity, language and the flexibility of thought. Behavioral and Brain Sciences, 25/6, 657-719.
- (2002b). Human creativity: its evolution, its cognitive basis, and its connections with childhood pretence. British Journal for the Philosophy of Science, 53, 1-25.
- (2003). On Fodor's Problem. Mind and Language, 18, 502-23.
- (2004a). Practical reasoning in a modular mind. Mind and Language, 19, 259-78.
- (2004b). On being simple minded. American Philosophical Quarterly, 41, 205-20.
- (forthcoming). Simple heuristics meet massive modularity. In P. Carruthers, S. Laurence, and S. Stich (eds.), The Innate Mind: Culture and Cognition. New York: Oxford University Press.
Cherniak, C. (1986). Minimal Rationality. Cambridge, MA: MIT Press.
Dawkins, R. (1986). The Blind Watchmaker. New York: Norton.
Fodor, J. (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
- (2000). The Mind Doesn't Work That Way. Cambridge, MA: MIT Press.
Gallistel, R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
- (2000). The replacement of general-purpose learning models with adaptively specialized learning modules. In M. Gazzaniga (ed.), The New Cognitive Neurosciences (2nd edn.). Cambridge, MA: MIT Press.
Gallistel, R. and Gibbon, J. (2001). Time, rate and conditioning. Psychological Review, 108, 289-344.
Gardner, H. (1983). Frames of Mind: The Theory of Multiple Intelligences. London: Heinemann.
Gigerenzer, G., Todd, P., and the ABC Research Group (1999). Simple Heuristics that Make Us Smart. New York: Oxford University Press.
Karmiloff-Smith, A. (1992). Beyond Modularity. Cambridge, MA: MIT Press.
Manoel, E., Basso, L., Correa, U., and Tani, G. (2002). Modularity and hierarchical organization of action programs in human acquisition of graphic skills. Neuroscience Letters, 335/2, 83-6.
Marcus, G. (2001). The Algebraic Mind. Cambridge, MA: MIT Press.
- (2004). The Birth of the Mind: How a Tiny Number of Genes Creates the Complexities of Human Thought. New York: Basic Books.
Simon, H. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467-82.
Sperber, D. (1996). Explaining Culture: A Naturalistic Approach. Oxford: Blackwell.
- (2002). In defense of massive modularity. In E. Dupoux (ed.), Language, Brain and Cognitive Development. Cambridge, MA: MIT Press.
- (2005). Modularity and relevance: how can a massively modular mind be flexible and context-sensitive? In P. Carruthers, S. Laurence, and S. Stich (eds.), The Innate Mind: Structure and Contents. New York: Oxford University Press.
Stone, V., Cosmides, L., Tooby, J., Kroll, N., and Knight, R. T. (2002). Selective impairment of reasoning about social exchange in a patient with bilateral limbic system damage. Proceedings of the National Academy of Sciences, 99, 11531-6.
Tager-Flusberg, H. (ed.) (1999). Neurodevelopmental Disorders. Cambridge, MA: MIT Press.
Tooby, J. and Cosmides, L. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, and J. Tooby (eds.), The Adapted Mind. New York: Oxford University Press.
Tooby, J., Cosmides, L., and Barrett, C. (2005). Resolving the debate on innate ideas: learnability constraints and the evolved interpenetration of motivational and conceptual functions. In P. Carruthers, S. Laurence, and S. Stich (eds.), The Innate Mind: Structure and Contents. New York: Oxford University Press.
Varley, R. (2002). Science without grammar: scientific reasoning in severe agrammatic aphasia. In P. Carruthers, S. Stich, and M. Siegal (eds.), The Cognitive Basis of Science. New York: Cambridge University Press.
CHAPTER
TWO
Is the Mind Really Modular?
Jesse J. Prinz
When Fodor titled his (1983) book The Modularity of Mind, he overstated his position.
His actual view is that the mind divides into systems, some of which are modular
and others of which are not. The book would have been more aptly, if less
provocatively, called The Modularity of Low-Level Peripheral Systems. High-level perception
and cognitive systems are non-modular on Fodor's theory. In recent years,
modularity has found more zealous defenders, who claim that the entire mind divides
into highly specialized modules. This view has been especially popular among evolutionary
psychologists. They claim that the mind is massively modular (Cosmides and
Tooby, 1994; Sperber, 1994; Pinker, 1997; see also Samuels, 1998). Like a Swiss army
knife, the mind is an assembly of specialized tools, each of which has been designed
for some particular purpose. My goal here is to raise doubts about both peripheral
modularity and massive modularity. To do that, I will rely on the criteria for modularity
laid out by Fodor (1983). I will argue that neither input systems nor central
systems are modular on any of these criteria.
Some defenders of modularity have dropped parts of Fodor's definition and
defined modularity with reference to a more restricted list of features. Carruthers
(chapter 1, THE CASE FOR MASSIVELY MODULAR MODELS OF MIND) makes such a move. My
arguments against modularity threaten these accounts as well. My claim is not just
that Fodor's criteria are not jointly satisfied by subsystems within the mind, but that
they are rarely satisfied individually. When we draw boundaries around subsystems that
satisfy any one of Fodor's criteria for modularity, we find, at best, scattered islands
of modularity. If modules exist, they are few and far between. The kinds of systems
that have been labeled modular by defenders of both peripheral and massive modularity
probably don't qualify. Thus, modularity is not a very useful construct in doing
mental cartography.
1 Fodor's Criteria
Modularity should be contrasted with the uncontroversial assumption of "functional
decomposition": the mind contains systems that can be distinguished by the functions
they carry out. The modularity hypothesis is a claim about what some of the
systems underlying human competences are like. Fodor characterizes modularity by
appeal to nine special properties. He says that modular systems are:
1 domain specific;
2 mandatory in their operation;
3 limited in the central access that other systems have to their intermediate representations;
4 fast;
5 informationally encapsulated;
6 shallow in their outputs;
7 associated with fixed neural architecture;
8 subject to characteristic and specific patterns of breakdown; and
9 characterized by a fixed pace and sequencing in their development.
brain (Pulvermüller, 1999). Or consider vision. There is considerable debate about the
location of systems involved in processing things as fundamental as space and color.
Uttal also points out that neuroimaging studies often implicate large-scale networks,
rather than small regions, suggesting that vast expanses of cortex contribute to many
fundamental tasks. Sometimes the size of these networks is underestimated. By focusing
on hotspots, researchers often overlook regions of the brain that are moderately
active during task performance.
Lesion studies are mired in similar problems. Well-known deficits, such as visual
neglect, are associated with lesions in entirely different parts of the brain (e.g., frontal
eye-fields and inferior parietal cortex). Sometimes, lesions in the same area have
different effects in different people, and all too often neuropsychologists draw general
conclusions from individual case studies. This assumes localization rather than
providing evidence for it. Connectionist models have been used to show that focal lesions
can lead to specific deficits even when there is no localization of functions: a
massively distributed artificial neural network can exhibit a selective deficit after a few
nodes are removed (simulating a focal lesion), even though those nodes were not the
locus of the capacity that is lost (Plaut, 1995). More generally, when a lesion leads to
an impairment of a capacity, we do not know if the locus of the lesion is the neural
correlate of the capacity or the correlate of some ancillary prerequisite for the capacity.
I do not want to exaggerate the implications of these considerations. There is probably
a fair degree of localization in the brain. No one is tempted to defend Lashley's
(1950) equipotentiality hypothesis, according to which the brain is an undifferentiated
mass. But the rejection of equipotentiality does not support modularity.
Defenders of modularity combine localization with domain specificity: they assume
that brain regions are exclusively dedicated to specific functions. Call this "strong
localization." If, in reality, mental functions are located in large-scale overlapping
networks, then it would be misleading to talk about anatomical regions as modules.
Evidence for strong localization is difficult to come by. Similar brain areas are
active during multiple tasks, and focal brain lesions tend to produce multiple deficits.
For example, aphasia patients regularly have impairments unrelated to language (Bates,
1994; Bates et al., 2000). Even genetic language disorders (specific language impairments)
are co-morbid with nonlinguistic problems, such as impairments in rapid auditory
processing or orofacial control (Bishop, 1992; Vargha-Khadem et al., 1995; Tallal
et al., 1996).
To take another example, consider the discussion in Stone et al. (2002) of a patient
who is said to have a selective deficit in reasoning about social exchanges. This patient
is also impaired in recognizing faux pas and mental-state terms, so he does not
support the existence of a social exchange module. Nor does this patient support
the existence of a general social cognition module, because he performs other social
tasks well.
In sum, it is difficult to find cases where specific brain regions have truly specific
functions. One could escape the localization criterion by defining modules as motley
assortments of abilities (e.g., syntax plus orofacial control; social exchange plus faux
pas), but this would trivialize the modularity hypothesis. There is little evidence that
the capacities presumed to be modular by defenders of the modularity hypothesis are
strongly localized.
4 Ontogenetic Determinism
Fodor implies that modules are ontogenetically determined: they develop in a predictable
way in all healthy individuals. Modules emerge through maturation, rather
than learning and experience. In a word, they are innate. I am skeptical. I think many
alleged modular systems are learned, at least in part.
Of all alleged modules, the senses have the best claim to being innate, but they
actually depend essentially on experience. Within the neocortex of infants, there
is considerably less differentiation between the senses than there is in adults.
Cortical pathways seem to emerge through a course of environmentally stimulated
strengthening of connections and pruning. One possibility is that low-level sensory
mechanisms are innate (including sense organs, subcortical sensory hubs, and the
cytoarchitecture of primary sensory cortices), while high-level sensory mechanisms
are acquired through environmental interaction (Quartz and Sejnowski, 1997). This
conjecture is supported by the plasticity of the senses (Chen et al., 2002). For example,
amputees experience phantom limbs because unused limb-detectors get rewired to
neighboring cells, and blind people use brain areas associated with vision to read
Braille. In such cases, sensory wiring seems to be input driven. Thus, it is impossible
to classify the senses as strictly innate or acquired.
Strong nativist claims are even harder to defend when we go outside the senses.
Consider folk physics: our core knowledge of how medium-sized physical objects behave.
It is sometimes suggested that folk physics is an innate module. For example, some
developmental psychologists conjecture that infants innately recognize that objects
move as bounded wholes, that objects cannot pass through each other, and that objects
fall when dropped. I don't find these conjectures plausible (see Prinz, 2002).
Newborns are not surprised by violations of boundedness (Slater et al., 1990), and
five-month-olds are not surprised by violations of solidity and gravity (Needham
and Baillargeon, 1993). Indeed, some tasks involving gravity and solidity even stump
two-year-olds (Hood et al., 2000). My guess is that innate capacities to track movement
through space combine with experience to derive the basic principles of folk
physics (compare Scholl and Leslie, 1999). If so, folk physics is a learned byproduct
of general tracking mechanisms.
Consider another example: massive modularists claim that we have an innate capacity
for "mind-reading," i.e., attributing mental states (e.g., Leslie, 1994; Baron-Cohen,
1996). The innateness claim is supported by two facts: mind-reading emerges on a
fixed schedule, and it is impaired in autism, which is a genetic disorder. Consider
these in turn. The evidence for a fixed schedule comes from studies of healthy western
children. Normally developing western children generally master mind-reading skills
between the third and fourth birthdays. However, this pattern fails to hold up cross-culturally
(Lillard, 1998; Vinden, 1999). For example, Quechua speakers of Peru don't
master belief attribution until they are eight (Vinden, 1996). Moreover, individual
differences in belief attribution are highly correlated with language skills and exposure
to social interaction (Garfield et al., 2001). This suggests that mind-reading skills
are acquired through social experience and language training.
What about autism? I don't think that the mind-reading deficit in autism is evidence
for innateness. An alternative hypothesis is that mind-reading depends on a
more general capacity which is compromised in autism. One suggestion is that autistic
individuals' difficulty with mind-reading is a consequence of genetic abnormality in
oxytocin transmission, which prevents them from forming social attachments, and
thereby undermines learned social skills (Insel et al., 1999).
Jesse J. Prinz
5 Domain Specificity
Domain specificity is closely related to innateness. To say that a capacity is innate
is to say that we are biologically prepared with that specific capacity. Innate entails
domain specific. But, as we have just seen, domain specific does not entail innate.
Therefore, in arguing against the innateness criterion of modularity, I have not undermined
the domain specificity criterion. Domain specificity is regarded by some as the
essence of modularity, and it deserves careful consideration.
It is difficult to assess the claim that some mental systems are domain specific
without clarifying definitions. What exactly is a "domain"? What is "specificity"?
On some interpretations, domain specificity is a trivial property. "Domain" can be
interpreted as a synonym for "subject matter." To say that a cognitive system concerns
a domain, on this reading, is to say that the system has a subject matter. The
subject matter might be a class of objects in the world, a class of related behaviors,
a skill, or any other coherent category. On the weak reading, just about anything
can qualify as a domain. Consider an individual concept, such as the concept "camel."
A mental representation used to categorize camels is specific to a domain, since camels
are a coherent subject matter. Likewise for every concept.
"SpecifIcity" also has a weak reading. In saying that a mental resource is domain
specifIC, we may be saying no more than that is it is used to process informa
tion underlying our aptitude for that domain. In other words, domain specifIcity
would not require exclusivity. Consider the capacity to throw crumpled paper into a
wastebasket. Presumably, the mental resources underlying that ability overlap with
resources used in throwing basketballs in hoops or throwing tin cans into recycle
bins. On the weak defInition of "specificity," we have a domain specifIC capacity
for throwing paper into wastebaskets simply in virtue of having mental resources
underlying that capacity, regardless of the fact that those resources are not dedicated
exclusively to that capacity.
Clearly defenders of domain specificity want something more. On a stronger reading,
"domain" refers, not to any subject matter, but to matters that are relatively
encompassing. Camels are too specific. The class of animals might qualify as a domain,
because it is more inclusive. Psychologists have this kind of category in mind when
they talk about "basic ontological domains." But notice that the stronger definition
is hopelessly vague. What does it mean to say domains are relatively encompassing?
Relative to what? "Camel" is an encompassing concept relative to the concept: "the
particular animal used by Lawrence to cross the Arabian desert." Moreover, it is common
in cognitive science to refer to language, mind-reading, and social exchange as
domains. Are these things encompassing in the same sense and to the same degree
as "animal"? In response to these difficulties, some researchers define domains as
sets of principles. This won't help. We have principles underlying our knowledge of
camels, as well as principles underlying our knowledge of animals. I see no escape.
If we drop the weak definition of "domain" (domain = subject matter), we still find
ourselves with definitions that are vague or insufficiently restrictive.
Things are slightly better with "specificity." On a strong reading, "specific" means
"exclusively dedicated." To say that modules are domain specific is to say that they are
exclusively dedicated to their subject matter. This is a useful explanatory construct,
and it may be applicable to certain mental systems. Consider the columns of cells in
primary visual cortex that are used to detect edges. These cells may be dedicated to
that function and nothing else. Perhaps modules are supposed to be like that.
There is still some risk of triviality here. We can show that any collection of rules
and representations in the mind-brain is dedicated by simply listing an exhaustive
disjunction of everything that those rules and representations do. To escape triviality,
we want to rule out disjunctive lists of functions. We say that systems are domain
specific when the domain can be specified in an intuitively coherent way. Let's assume
for the sake of argument that this requirement can be made more precise. The problem
is that alleged examples of modules probably aren't domain specific in this strong
sense.
Consider vision. Edge detectors may be domain specific, but other resources used
for processing visual information may be more general. For example, the visual system
can be recruited in problem solving, as when one uses imagery to estimate where
a carton of milk can squeeze into a crammed refrigerator. Some of our conceptual
knowledge may be stored in the form of visual records. We know that damage to
visual areas can disrupt conceptual competence (Martin and Chao, 2001). I have also
noted that, when people lose their sense of sight, areas once used for vision get used
for touch. Visually perceived stimuli also generate activity in cells that are bimodal.
The very same cells are used by the touch system and the auditory system. If we
excluded rules and representations that can be used for something other than deriving
information from light, the boundaries of the "visual system" would shrink
considerably. At the neural level of description, it is possible that only isolated
islands of cells would remain. This would be a strange way to carve up the mind.
One of the important things about our senses is that they can moonlight. They can
help each other out and they can play a central role in the performance of cognitive
tasks. Vision, taken as a coherent whole, is not domain specific in the strong sense,
even if it contains some rules and representations that are.
Similar conclusions can be drawn for language. I have said that language may
share resources with systems that serve other functions: pattern recognition, muscle
control, and so on. Broca's area seems to contain mirror neurons, which play a role
in the recognition of manual actions, such as pinching and grasping (Heiser et al.,
2003). Wernicke's area seems to contain cells that are used in the categorization of
nonlinguistic sounds (Saygin et al., 2003). Of course, there may be some language-specific
rules and representations within the systems that contribute to language. Perhaps
the neurons dedicated to conjugating the verb "to be" have no nonlinguistic function.
Such highly localized instances of domain specificity will offer little comfort to
the defender of modularity. They are too specific to correspond to modules that have
been proposed. Should we conclude that there is a module dedicated to the conjugation
of each irregular verb?
There is relatively little evidence for large-scale modules, if we use domain
specificity as the criterion. It is hard to find systems that are exclusively dedicated
to broad domains. Vision and language systems are not dedicated in the strong
sense, and the same is true for other alleged modules. Consider mind-reading,
which clearly exploits domain general capacities. I noted above that mind-reading
is correlated with language skills. Hale and Tager-Flusberg (2003) found that
preschoolers who failed the false belief task were more likely to succeed after receiving
training in sentential complement clauses. They went from 20 percent correct
in attributing false beliefs to over 75 percent correct. Mind-reading also depends
on working memory. Performance in attributing false beliefs is impaired in healthy
subjects when they are given an unrelated working memory task (McKinnon and
Moscovitch, unpublished). In neuroimaging studies, mind-reading is shown to recruit
language centers in left frontal cortex, visuospatial areas in right temporal-parietal
regions, the amygdala, which mediates emotional responses, and the precuneus, which
is involved in mental image inspection and task switching. In short, mind-reading
seems to exploit a large network of structures all of which contribute to many other
capacities.
This seems to be the general pattern for alleged modules. The brain structures involved
in mathematical cognition are also involved in language, spatial processing, and
attention (Dehaene, 1997; Simon, 1997). Folk physics seems to rely on multi-object
attention mechanisms (Scholl and Leslie, 1999). Moral judgment recruits ordinary
emotion centers (Greene and Haidt, 2002).
For all I have said, alleged modules may have domain specific components. Perhaps
these systems use some proprietary rules and representations. But they don't seem
to be proprietary throughout. Therefore, domain specificity cannot be used to trace
the boundaries around the kinds of systems that modularists have traditionally
discussed.
case that choice is determined conceptually (we know that people cannot run when
they are lying down). Remarkably, subjects are primed to use the word "her" equally
fast in both conditions. If lexical processing were encapsulated from conceptual processing,
one would expect lexically determined word choices to arise faster. These
results imply that formal aspects of language are under the immediate and constant influence
of general world knowledge.
Thus far, I have been talking about top-down influences on input systems. There
is also evidence that input systems can speak to each other. This is incompatible with
encapsulation, because a truly encapsulated system would be insulated from any external
influence. Consider some examples. First, when subjects hear speech sounds that
are inconsistent with observed mouth movements, the visual experience systematically
distorts the auditory experience of the speech sounds (McGurk and MacDonald,
1976). Second, Ramachandran has developed a therapeutic technique for treating phantom
limb pain, in which amputees use a mirror reflection to visually relocate an intact
limb in the location of a missing limb; if they scratch or soothe the intact limb, the
discomfort in the phantom subsides (Ramachandran et al., 1995). Third, sound can
give rise to touch illusions: hearing multiple tones can make people feel multiple
taps, when there has been only one (Hötting and Röder, 2004). Finally, people with
synesthesia experience sensations in one modality when they are stimulated in another;
for example, some people see colors when they hear sounds, and others experience
shapes when they taste certain flavors. All these examples show that there can be
direct and content-specific cross-talk between the senses.
The empirical evidence suggests that mental systems are not encapsulated. But
the story cannot end here. There is also a principled argument for encapsulation,
which is nicely presented by Carruthers. It goes like this: mental processes must be
computationally tractable, because the mind is a computer, and mental processes are
carried out successfully in a finite amount of time; if mental processes had access
to all the information stored in the mind (i.e., if they were not encapsulated), they
would not be tractable (merely checking consistency against a couple hundred beliefs
would take billions of years); therefore, mental processes are encapsulated.
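The scale involved is easy to verify with a back-of-the-envelope estimate in the spirit of Cherniak (1986) (the processing speed below is my illustrative assumption): a brute-force consistency check over n logically independent beliefs must examine 2^n truth assignments, so for a couple hundred beliefs,

\[
2^{200} \approx 1.6 \times 10^{60} \text{ assignments}, \qquad \frac{1.6 \times 10^{60}}{10^{15}\ \text{checks per second}} \approx 1.6 \times 10^{45}\ \text{seconds} \approx 5 \times 10^{37}\ \text{years}.
\]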
Carruthers recognizes that there is a major flaw in this argument. According to
the second premise, mental processes would be intractable if they had access to all
the information stored in the mind. This is actually false. Computational systems can
sort through stupendously large databases at breakneck speed. The trick is to use
" frugal" search rules. Frugal rules are ones that radically reduce processing load by
exploiting simple procedures for selecting relevant items in the database. Once the
most relevant items are selected, more thorough processing of those items can begin.
Psychologists call such simple rules "heuristics" (Kahneman et al., 1 982). There is
overwhelming evidence that we make regular use of heuristics in performing cognit
ive tasks. For example, suppose you want to guess which of two cities is larger, Hamburg
or Mainz. You could try to collect some population statistics (which would take a
long time), or you could just pick the city name that is most familiar. This Take the
Best strategy is extremely easy and very effective ; it is even a good way to choose
stocks that will perform well in the market (Gigerenzer et al., 1 999). With heuristics,
we can avoid exhaustive database searches even when a complete database is at our
disposal. There are also ways to search through a colossal database without much
cost. Internet search engines provide an existence proof (Clark, 2002). Consider Google.
A Google search on the word "heuristic" sorts through over a billion web pages in
0.18 seconds, and the most useful results appear in the first few hits. Search engines
look for keywords and for web-pages that have been frequently linked or accessed.
If we perform the mental equivalent of a Google search on our mental files, we should
be able to call up relevant information relatively quickly. The upshot is that encapsulation
is not needed for computational tractability.
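The frugal strategy Prinz describes can be put in a few lines of Python (a toy sketch of my own; the familiarity scores are invented, and full Take the Best would order several cues by validity rather than consult just one):

    # Toy sketch: guess the larger city from a single cheap cue,
    # with no exhaustive search of population statistics.
    FAMILIARITY = {"Hamburg": 0.9, "Mainz": 0.3}  # hypothetical scores

    def guess_larger_city(a: str, b: str) -> str:
        # Frugal rule: consult one cue and stop.
        return a if FAMILIARITY.get(a, 0.0) >= FAMILIARITY.get(b, 0.0) else b

    print(guess_larger_city("Hamburg", "Mainz"))  # -> Hamburg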
At this point, one might expect Carruthers to abandon the assumption that mental
systems are encapsulated. Instead, he draws a distinction between two kinds of
encapsulation. Narrow-scope encapsulation occurs when most of the information held
in the mind is such that a system can't be affected by that information in the course
of processing. This is the kind of encapsulation that Fodor attributes to modules, and
it is what Carruthers rejects when he appeals to heuristics. It is possible that any item
of information is such that a system could be affected by it. But Carruthers endorses
wide-scope encapsulation: systems are such that they can't be affected by most of
the information held in the mind at the time of processing. This seems reasonable
enough. If every item in the mind sent inputs to a given system simultaneously, that
system would be overwhelmed. So, I accept "wide-scope encapsulation." But "wide-scope
encapsulation" is not really encapsulation at all. "Encapsulation" implies that
one system cannot be accessed by another. "Wide-scope encapsulation" says that all
systems are accessible; they just aren't accessed all at once. Carruthers' terminological
move cannot be used to save the hypothesis that mental systems are encapsulated.
In recognizing the power of heuristic search, he tacitly concedes that the
primary argument for encapsulation is unsuccessful.
I do not want to claim that there is no encapsulation in the mind. It is possible
that some subsystems are impervious to external inputs. I want to claim only that
there is a lot of cross-talk between mental systems. If we try to do mental cartography
by drawing lines around the few subsystems that are encapsulated, we will
end up with borders that are not especially helpful. Encapsulation is not
sufficiently widespread to be an interesting organizing principle.
to the greater whole. My goal has been to criticize a specific account of what the
functional units in the mind are like. The functional units need not be fast, automatic,
innate, shallow, or encapsulated. Some of the components may be dedicated
to a single mental capacity, but others may serve a variety of different capacities. It
is possible that no component in the mind exhibits the preponderance of properties
on Fodor's list.
Some defenders of modularity are committed to nothing more than functional decomposition.
They reject Fodor's list and adopt the simple view that the mind is a machine
with component parts. That view is uncontroversial. Massive modularity sounds like
a radical thesis, but, when the notion of modularity is denatured, it turns into a platitude.
Of course central cognition has a variety of different rules and representations.
Of course we bring different knowledge and skills to bear when we reason about the
social world as opposed to the world of concrete objects. Of course it is possible for
someone to lose a specific cognitive capacity without losing every other cognitive
capacity. Controversy arises only when functional components are presumed to have
properties on Fodor's list.
I think the term "modularity" should be dropped because it implies that many
mental systems are modular in Fodor's sense, and that thesis lacks support. Cognitive
scientists should continue to engage in functional decomposition, but we should resist
the temptation to postulate and proliferate modules.
Dehaene, S. (1997). The Number Sense. New York: Oxford University Press.
Fodor, J. (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
- (2000). The Mind Doesn't Work That Way: The Scope and Limits of Computational Psychology. Cambridge, MA: MIT Press.
Garfield, J. L., Peterson, C. C., and Perry, T. (2001). Social cognition, language acquisition and the development of the theory of mind. Mind and Language, 16, 494-541.
Gigerenzer, G., Todd, P. M., and the ABC Research Group (1999). Simple Heuristics that Make Us Smart. New York: Oxford University Press.
Greene, J. and Haidt, J. (2002). How (and where) does moral judgment work? Trends in Cognitive Science, 6, 517-23.
Hale, C. M. and Tager-Flusberg, H. (2003). The influence of language on theory of mind: a training study. Developmental Science, 6, 346-59.
Heiser, M., Iacoboni, M., Maeda, F., Marcus, J., and Mazziotta, J. C. (2003). The essential role of Broca's area in imitation. European Journal of Neuroscience, 17, 1123-8.
Hood, B., Carey, S., and Prasada, S. (2000). Predicting the outcomes of physical events: two-year-olds fail to reveal knowledge of solidity and support. Child Development, 71, 1540-54.
Hötting, K. and Röder, B. (2004). Hearing cheats touch, but less in congenitally blind than in sighted individuals. Psychological Science, 15, 60-4.
Insel, T. R., O'Brien, D. J., and Leckman, J. F. (1999). Oxytocin, vasopressin, and autism: is there a connection? Biological Psychiatry, 45, 145-57.
Jacobs, R. A. (1999). Computational studies of the development of functionally specialized neural modules. Trends in Cognitive Science, 3, 31-8.
Kahneman, D., Slovic, P., and Tversky, A. (eds.) (1982). Judgment Under Uncertainty: Heuristics and Biases. New York: Cambridge University Press.
Kosslyn, S. M., Thompson, W. L., Kim, I. J., and Alpert, N. M. (1995). Topographical representations of mental images in primary visual cortex. Nature, 378, 496-8.
Lashley, K. (1950). In search of the engram. Symposia of the Society for Experimental Biology, 4, 454-82.
Leslie, A. M. (1994). ToMM, ToBy, and Agency: core architecture and domain specificity. In L. Hirschfeld and S. Gelman (eds.), Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Lillard, A. (1998). Ethnopsychologies: cultural variations in theories of mind. Psychological Bulletin, 123, 3-32.
Marslen-Wilson, W. and Tyler, L. (1987). Against modularity. In J. L. Garfield (ed.), Modularity in Knowledge Representation and Natural-Language Understanding. Cambridge, MA: MIT Press.
Martin, A. and Chao, L. (2001). Semantic memory and the brain: structure and processes. Current Opinion in Neurobiology, 11, 194-201.
McGurk, H. and MacDonald, J. (1976). Hearing lips and seeing voices. Nature, 264, 746-8.
McKinnon, M. and Moscovitch, M. (unpublished). Domain-general contributions to social reasoning: perspectives from aging and the dual-task method. Manuscript, University of Toronto.
Needham, A. and Baillargeon, R. (1993). Intuitions about support in 4.5-month-old infants. Cognition, 47, 121-48.
Nisbett, R. and Wilson, T. (1977). Telling more than we can know: verbal reports on mental processes. Psychological Review, 84, 231-59.
Pinker, S. (1997). How the Mind Works. New York: Norton.
Plaut, D. C. (1995). Double dissociation without modularity: evidence from connectionist neuropsychology. Journal of Clinical and Experimental Neuropsychology, 17, 291-321.
Poeppel, D. (1996). A critical review of PET studies of phonological processing. Brain and Language, 55, 317-51.
Prinz, J. J. (2002). Furnishing the Mind: Concepts and their Perceptual Basis. Cambridge, MA: MIT Press.
Pulvermüller, F. (1999). Words in the brain's language. Behavioral and Brain Sciences, 22, 253-336.
Quartz, S. R. and Sejnowski, T. J. (1997). The neural basis of cognitive development: a constructivist manifesto. Behavioural and Brain Sciences, 20, 537-96.
Ramachandran, V. S., Rogers-Ramachandran, D., and Cobb, S. (1995). Touching the phantom limb. Nature, 377, 489-90.
Samuels, R. (1998). Evolutionary psychology and the massive modularity hypothesis. British Journal for the Philosophy of Science, 49, 575-602.
Saygin, A. P., Dick, F., Wilson, S. M., Dronkers, N. F., and Bates, E. (2003). Neural resources for processing language and environmental sounds: evidence from aphasia. Brain, 126/4, 928-45.
Scholl, B. J. and Leslie, A. M. (1999). Explaining the infant's object concept: beyond the perception/cognition dichotomy. In E. Lepore and Z. Pylyshyn (eds.), What is Cognitive Science? Oxford: Blackwell.
Seung, H.-K. and Chapman, R. S. (2000). Digit span in individuals with Down's syndrome and typically developing children: temporal aspects. Journal of Speech, Language, and Hearing Research, 43, 609-20.
Simon, T. J. (1997). Reconceptualizing the origins of number knowledge: a non-numerical account. Cognitive Development, 12, 349-72.
Slater, A., Morison, V., Somers, M., Mattock, A., Brown, E., and Taylor, D. (1990). Newborn and older infants' perception of partly occluded objects. Infant Behavior and Development, 13, 33-49.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. A. Hirschfeld and S. A. Gelman (eds.), Mapping the Mind: Domain Specificity in Cognition and Culture. New York: Cambridge University Press.
Stone, V. E., Cosmides, L., Tooby, J., Kroll, N., and Knight, R. T. (2002). Selective impairment of reasoning about social exchange in a patient with bilateral limbic system damage. Proceedings of the National Academy of Sciences, 99, 11531-6.
Tallal, P., Miller, S. L., Bedi, G., et al. (1996). Language comprehension in language-learning impaired children improved with acoustically modified speech. Science, 271, 81-4.
Thomas, M. S. C. and Karmiloff-Smith, A. (2002). Are developmental disorders like cases of adult brain damage? Implications from connectionist modelling. Behavioural and Brain Sciences, 25/6, 727-88.
Uttal, W. R. (2001). The New Phrenology: The Limits of Localizing Cognitive Processes in the Brain. Cambridge, MA: MIT Press.
Van Giffen, K. and Haith, M. M. (1984). Infant response to Gestalt geometric forms. Infant Behavioral Development, 7, 335-46.
Vargha-Khadem, F., Watkins, K., Alcock, K., Fletcher, P., and Passingham, R. (1995). Praxic and nonverbal cognitive deficits in a large family with a genetically transmitted speech and language disorder. Proceedings of the National Academy of Science, 92, 930-3.
Vinden, P. G. (1996). Junin Quechua children's understanding of mind. Child Development, 67, 1701-16.
- (1999). Children's understanding of mind and emotion: a multi-culture study. Cognition and Emotion, 13, 19-48.
Warren, R. M. and Warren, R. P. (1970). Auditory illusions and confusions. Scientific American, 223, 30-6.
CHAPTER
THREE
Is the Human Mind Massively Modular?
Richard Samuels
Here's how I'll proceed. In section 1, I clarify the main commitments of MM and
the attendant notion(s) of a module. In section 2, I sketch some of the main theoretical
arguments for MM and highlight their deficiencies. In section 3, I very briefly
discuss some problems with the experimental case for MM. In section 4, I outline some
reasons for finding MM, at least in radical form, implausible. Finally, in section 5,
I argue against those who not merely reject MM but deny that minds are modular to any
interesting degree.
1 What's at Issue?
To a first approximation, MM is the hypothesis that the human mind is largely or
entirely composed from a great many modules. Slightly more precisely, MM can be
formulated as the conjunction of three claims:
1.1 Composition Thesis
MM is in large measure a claim about the kinds of mechanisms from which our minds
are composed - viz., it is largely or even entirely composed from modules.1 As stated,
this is vague in at least two respects. First, it leaves unspecified the precise extent to
which minds are composed from modules. In particular, this way of formulating the
proposal accommodates two different positions, which I call strong and weak massive
modularity. According to strong MM, all cognitive mechanisms are modules. Such a
view would be undermined if we were to discover that any cognitive mechanism was
nonmodular in character. By contrast, weak MM maintains only that the human mind
- including those parts responsible for central processing - is largely modular in structure.
In contrast to strong MM, such a view is clearly compatible with the claim that
there are some nonmodular mechanisms. So, for example, the proponent of weak MM
is able to posit the existence of some nonmodular devices for reasoning and learning.
A second respect in which the above Composition Thesis is vague is that it leaves
unspecified what modules are. For present purposes, this is an important matter, since the
interest and plausibility of the thesis turns crucially on what one takes modules to be.2
Robust modules
So, the minimal notion of a module won't suffice for an interesting Composition Thesis.
But debate in cognitive science frequently assumes some more robust conception of
modularity, of which the most well known and most demanding is the one developed
in Fodor (1983). On this view, modules are functionally characterizable cognitive
mechanisms which are (at least paradigmatically) domain specific, informationally
encapsulated, innate, mandatory, fast relative to central processes, shallow, neurally
localized, exhibit characteristic breakdown patterns, are inaccessible, and have characteristic
ontogenetic timetables (Fodor, 1983).3
Though the full-fledged Fodorian notion has been highly influential in many areas
of cognitive science (Garfield, 1987), it has not played much role in debate over MM,4
and for good reason. The thesis that minds are largely or entirely composed of Fodorian
modules is obviously implausible. Indeed, some of the entries on Fodor's list - relative
speed and shallowness, for example - make little sense when applied to central
systems (Carruthers, chapter 1; Sperber, forthcoming). And even where Fodor's properties
can be sensibly ascribed - as in the case of innateness - they carry a heavy
justificatory burden that few seem much inclined to shoulder (Baron-Cohen, 1995;
Sperber, 1996).
In any case, there is a broad consensus that not all the characteristics on Fodor's
original list are of equal theoretical import. Rather, domain specificity and informational
encapsulation are widely regarded as most central. Both these properties
concern the architecturally imposed5 informational restrictions to which cognitive
mechanisms are subject - the range of representations they can access - though the
kinds of restriction involved are different.
Domain specificity is a restriction on the representations a cognitive mechanism
can take as input - that "trigger" it or "turn it on." Roughly, a mechanism is domain
specific (as opposed to domain general) to the extent that it can only take as input
a highly restricted range of representations.6 Standard candidates include mechanisms
for low-level visual perception, face recognition, and arithmetic.
Informational encapsulation is a restriction on the kinds of information a mechanism
can use as a resource once so activated - paradigmatically, though not essentially,
information stored in memory. Slightly more precisely, a cognitive mechanism
is encapsulated to the extent it can access, in the course of its computations, less
than all of the information available to the organism as a whole (Fodor, 1983). Standard
candidates include mechanisms, such as those for low-level visual perception and
phonology, which do not draw on the full range of an organism's beliefs and
goals.
To be sure, there are many characteristics other than domain specificity and encapsulation
that have been ascribed to modules. But if one uses "module" as more than
a mere terminological expedient - as more than just a nice way of saying "cognitive
mechanism" - yet denies that modules possess either of these properties, then one could with
some justification be accused of changing the subject. In view of this, I will tend
when discussing more robust conceptions of modularity to assume that modules
must be domain specific and/or encapsulated. This has a number of virtues. First,
these properties clearly figure prominently in dispute over MM. Moreover - and in
contrast to the minimal module version of the Composition Thesis discussed earlier
- the claim that minds are largely or entirely composed of domain specific and/or
encapsulated mechanisms is a genuinely interesting one. Not only does it go beyond
the banal claim that our minds are comprised from functionally characterizable
cognitive mechanisms; but it is also a claim that opponents of MM almost invariably
deny. In later sections I will consider the plausibility of this thesis; but first
I need to discuss the other two theses associated with MM.
1.2 Plurality Thesis
On reflection, however, it's hard to see how this could be right: how a mere plurality
of functionally specifiable mechanisms could make for an interesting and distinctive
MM. This is because even radical opponents of MM endorse the view that minds
contain a great many such components. So, for instance, the picture of the mind as a
big general-purpose, "classical" computer - roughly, the sort of general-purpose device
that manipulates symbols according to algorithmically specifiable rules - is often
(and rightly) characterized as being firmly at odds with MM. Yet big general-purpose
computers are not simple entities. On the contrary, they are almost invariably
decomposable into a huge number of functionally characterizable submechanisms.7
So, for example, a standard von Neumann-type architecture decomposes into a calculating
unit, a control unit, a fast-to-access memory, a slow-to-access memory, and
so on; and each of these decomposes further into smaller functional units, which
are themselves decomposable into submechanisms, and so on. As a consequence, a
standard von Neumann machine will typically have hundreds or even thousands of
subcomponents.8 Call this a version of massive modularity if you like. But it surely
isn't an interesting or distinctive one.
1.3 Central Modularity
So far we have discussed the Composition and Plurality theses and seen that both
are interesting on a robust construal of modules, but that neither seems interesting
or distinctive on the minimal conception. But there is another thesis that requires
our attention, the thesis of Central Modularity, which states that modules are found
not merely at the periphery of the mind but also in those central regions responsible
for reasoning and decision making.
This does not strictly follow from any of the claims discussed so far since one
might deny there are any central systems for reasoning and decision making. But
this is not the view that advocates of MM seek to defend. Indeed, a large part of
what distinguishes MM from the earlier, well-known modularity hypothesis defended
by Fodor ( 1 983) and others is that the modular structure of the mind is not restricted
to input systems (those responsible for perception, including language perception)
and output systems (those responsible for producing behavior). Advocates of MM accept
the Fodorian thesis that such peripheral systems are modular. But pace Fodor, they
maintain that the central systems responsible for reasoning and decision making are
largely or entirely modular as well (Jackendoff, 1 992). So, for example, it has been
suggested that there are modules for such central processes as social reasoning (Cosmides
and Tooby, 1992), biological categorization (Pinker, 1994), and probabilistic inference
(Gigerenzer, 1994 and 1996). In what follows, then, I assume MM is committed to
some version of Central Modularity.
Again, how interesting a thesis is this? If formulated in terms of the minimal notion,
it's hard to see how Central Modularity could be an interesting and distinctive one.
After all, even those who endorse paradigmatically nonmodular views of central
cognition can readily accept the claim. For example, advocates of the "Big Computer"
view of central systems can accept the claim that central cognition is entirely
subserved by a great many minimal modules since big computers are themselves composed
of a great many such entities. All this is, of course, wholly compatible with
there being some suitable modification that makes for an interesting version of Central
Modularity. But the point I want to insist on here is that if one's arguments succeed
only in supporting this version of the thesis, then they fail to support a distinctive
and interesting version of MM.
Things look rather different if a robust conception of modules is adopted. Here,
the degree to which one's version of Central Modularity is interesting will depend
on (a) the extent to which central cognition is subserved by domain specific and/or
encapsulated mechanisms and (b) how many such modules there are. Both these
questions could be answered in a variety of different ways. At one extreme, for
example, one might adopt the following view:
Strong Central Modularity: All central systems are domain specific and/or encapsulated,
and there are a great many of them.
That would be a genuinely radical position since it implies that there are no domain
general, informationally unencapsulated central systems. But this Strong Central
Modularity is very implausible. For as we will see in later sections, there are no good
reasons to accept it, and some reason to think it is false. At the other extreme, one
might maintain that:
2.1 Evolvability
- both highly regarded for their fine watches. But while Hora prospered, Tempus became
poorer and poorer and finally lost his shop. The reason:
The watches the men made consisted of about 1000 parts each. Tempus had so constructed
his that if he had one partially assembled and had to put it down - to answer
the phone, say - it immediately fell to pieces and had to be reassembled from the elements
. . . The watches Hora handled were no less complex . . . but he had designed them
so that he could put together sub-assemblies of about ten elements each. Ten of these
sub-assemblies, again, could be put together into a larger sub-assembly and a system
of ten of the latter constituted the whole watch. Hence, when Hora had to put down a
partly assembled watch in order to answer the phone, he lost only a small part of his
work, and he assembled his watches in only a fraction of the man-hours it took Tempus.
(Simon, 1962)
The obvious moral - and the one Simon invites us to accept - is that evolutionary
stability requires that complex systems be hierarchically organized from dissociable
subsystems; and according to Carruthers and Carston, this militates in favor of MM
(Carston, 1996, p. 75).
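Simon's point can be made quantitative (the numbers below are my illustration in the spirit of his own calculation, not a quotation): if each added part carries an interruption probability of p = 0.01, Tempus must get through all 1000 steps uninterrupted, while Hora only ever risks a ten-part sub-assembly:

\[
(1-p)^{1000} = 0.99^{1000} \approx 4.3 \times 10^{-5}, \qquad (1-p)^{10} = 0.99^{10} \approx 0.90.
\]

Hierarchical assembly thus makes completion overwhelmingly more likely under the same rate of disruption.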
Response. Though evolutionary stability may initially appear to militate in favor
of MM, it is in fact only an argument for the familiar mechanistic thesis that complex
machines are hierarchically assembled from (and decomposable into) many subcomponents.
But this clearly falls short of the claim that all (or even any) are domain
specific or encapsulated. Rather it supports at most the sort of banal Plurality Thesis
discussed earlier; one that is wholly compatible with even a Big Computer view of
central processes. All it implies is that if there are such complex central systems,
they will need to be hierarchically organized into dissociable subsystems - which,
incidentally, was the view Simon and his main collaborators endorsed all along (Simon,
1962; Newell, 1990).
2.2
Throughout the biological world - from cells to cellular assemblies, whole organs, and
so on - one finds hierarchical organization into semi-decomposable components. We
should expect the same to be true of cognition (Chomsky, 1980; Carruthers, chapter 1).
Response. Same problem as the previous argument. Though all this is correct, it
is at most an argument for the claim that our minds are semi-decomposable systems
- hierarchically organized into dissociable subsystems - a conclusion that is in no way
incompatible with even the most radically nonmodular accounts of central systems.
2.3 Task specificity
There are a great many cognitive tasks whose solutions impose quite different demands.
So, for example, the demands on vision are distinct from those of speech recognition,
of mind-reading, cheater-detection, probabilistic judgment, grammar induction, and
so on. Moreover, since it is very hard to believe there could be a single general
inferencing mechanism for all of them, for each such task we should postulate the
existence of a distinct mechanism, whose internal processes are computationally
specialized for processing different sorts of information in the way required to solve
the task (Carruthers, chapter 1; Cosmides and Tooby, 1992, 1994).
Response. Two points. First, if the alternatives were MM or a view of minds as
comprised of just a single general-purpose cognitive device, then I too would opt for
MM. But these are clearly not the only options. On the contrary, one can readily
deny that central systems are modular while still insisting there are plenty of
modules for perception, motor control, selective attention, and so on. In other words,
the issue is not merely whether some cognitive tasks require specialized modules but
whether the sorts of tasks associated with central cognition - paradigmatically, reasoning
and decision making - typically require a proliferation of such mechanisms.
Second, it's important to see that the addition of functionally dedicated mechanisms
is not the only way of enabling a complex system to address multiple tasks.
An alternative is to provide some relatively functionally inspecific mechanism with
requisite bodies of information for solving the tasks it confronts. This is a familiar
proposal among those who advocate nonmodular accounts of central processes. Indeed,
advocates of nonmodular reasoning architectures routinely assume that reasoning devices
have access to a huge amount of specialized information on a great many topics, much
of which will be learned but some of which may be innately specified (Anderson,
1990; Newell, 1990). Moreover, it is one that plausibly explains much of the proliferation
of cognitive competences that humans exhibit throughout their lives - e.g., the
ability to reason about historical issues as opposed to politics or gene splicing or
restaurants. To be sure, it might be that each such task requires a distinct mechanism,
but such a conclusion does not flow from general argument alone. For all we know,
the same is true of the sorts of tasks advocates of MM discuss. It may be that the
capacity to perform certain tasks is explained by the existence of specialized mechanisms.
But how often this is the case for central cognition is an almost entirely
open question that is not adjudicated by the argument from task specificity.
2.4 Bottleneck argument
2.5 Computational tractability
It is common to argue for MM on the grounds that the alternatives are computationally
intractable. In brief, the argument is as follows: Human cognitive processes
are realized by computational mechanisms. But for this to be the case, our cognitive
mechanisms would need to be computationally tractable; and this in turn requires
that they be informationally encapsulated - that they have access to less than all
the information available to the mind as a whole. Hence, the mind is composed of
informationally encapsulated cognitive mechanisms.
Response. If one is disinclined to accept a computational account of cognitive processes, this is not an argument one is likely to take seriously. But even if one endorses a computational account of the mind, the argument still doesn't work. There is a long story about what's wrong (see Samuels, 2005). But the short version is that computational tractability does not require informational encapsulation in any standard sense of that expression; and the claim that some other kind of "encapsulation" is involved is tantamount to relabeling. As ordinarily construed, a mechanism is encapsulated only if - by virtue of architectural constraints - there is some relatively determinate class of informational states that it cannot access. (For example, the paradigmatic cases of encapsulated devices - low-level perceptual devices - cannot access the agent's beliefs or desires in the course of their computations.) While such an architectural constraint may be one way to engender tractability, it is clearly not the only way. Rather, what's required is that the mechanism be frugal in its use of information: that it not engage in exhaustive search but only use a restricted amount of the available information. Moreover, this might be achieved in a variety of ways, most obviously by heuristic and approximation techniques of the sort familiar from computer science and Artificial Intelligence.
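To make the contrast vivid, here is a minimal Python sketch (purely illustrative; the belief store and the word-overlap relevance heuristic are invented for the example) of a mechanism that is frugal without being encapsulated: no class of beliefs is architecturally off-limits, yet only a bounded handful of items is ever passed on for costly downstream inference.

```python
import heapq

def frugal_retrieve(query, beliefs, k=3, relevance=None):
    """Frugal but unencapsulated retrieval: every belief is accessible
    in principle, but only the k most relevant items are handed on for
    further (costly) inference, so downstream processing stays bounded
    no matter how large the belief store grows."""
    if relevance is None:
        # Crude stand-in relevance heuristic: shared-word count.
        relevance = lambda q, b: len(set(q.split()) & set(b.split()))
    return heapq.nlargest(k, beliefs, key=lambda b: relevance(query, b))

beliefs = [
    "snow is white",
    "whales are mammals",
    "wason selection tasks involve cards",
    "white wine goes with fish",
    "paris is the capital of france",
]
print(frugal_retrieve("why is snow white", beliefs, k=2))
```

Nothing here corresponds to an architecturally determinate class of inaccessible states; the restriction is entirely a matter of how much of the accessible information gets used.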
So, it would seem that the tractability argument fails: frugality, not encapsulation, is what's required. Carruthers has responded to this, however, by drawing a distinction between two notions of encapsulation: the standard notion, which he calls "narrow-scope encapsulation," and another notion, "wide-scope encapsulation," on which the "operations of a module can't be affected by most of the information in a mind, without there being some determinate subdivision between the information that can affect the system and the information that can't" (Carruthers, chapter 1, p. 16). But this really isn't an interesting notion of encapsulation. For not only is it different from what most theorists mean by "encapsulation," but it's simply what you get by denying exhaustive search; and since virtually no one thinks exhaustive search is characteristic of human cognition, the present kind of "encapsulation" is neither distinctive nor interesting.
Moreover - a point I return to in section 5 - the experimental case for central modularity is not strong (Samuels, 1998; Fodor, 2000). To the extent that things seem otherwise, I suspect it is because the interpretation of the data is heavily driven by a prior acceptance of the general theoretical arguments for MM - arguments which, as we have seen, there is little reason to endorse.10
In some cases, the reliance on general theoretical considerations in the absence of convincing data is egregious. (My current favorite example is the "homicide module" hypothesis advocated by Buss and his collaborators; see Buss and Duntley, 2005.) But even in cases where the influence of such considerations is less obvious, the data for Central Modularity are uncompelling. Often the problem is that the putative evidence for specialized modules can be better explained in terms of other, less specific processes. This is well illustrated by arguably the flagship case of a putative reasoning module: a dedicated mechanism for social contract reasoning (Cosmides and Tooby, 1992; Gigerenzer and Hug, 1992). Advocates of this hypothesis sometimes represent it as a kind of modularist assault on the "doctrinal 'citadel' of . . . general-purpose processes": human reasoning (Cosmides and Tooby, 1992). But the main experimental support for the hypothesis - evidence of so-called content effects in Wason's Selection Task - can be very plausibly explained in terms of quite general features of language comprehension and, hence, provides no support for a dedicated social contract reasoning mechanism (Sperber et al., 1995; Fodor, 2000; Sperber and Girotto, 2003).
Another problem with the experimental case for Central Modularity is that even where the data suggest some kind of specialized cognitive structure for a given domain, they seldom adjudicate clearly between claims about the existence of specialized mechanisms and claims about the existence of specialized bodies of knowledge, such as a mentally represented theory. The former are modularity theses in the relevant sense, while the latter are wholly compatible with a highly nonmodular account of central processing on which different bodies of knowledge are used by a small set of nonmodular mechanisms (Newell, 1990; Gopnik and Meltzoff, 1997; Fodor, 2000). This point is well illustrated by the debate over folk biology. Many have proposed the existence of specialized cognitive structures for folk biology. But while some maintain it is subserved by a dedicated module (Pinker, 1994; Atran, 1998, 2001), others claim merely that we possess a body of information - a theory - that is deployed by relatively inspecific inferential devices (Carey, 1995). The problem is that the main available evidence regarding folk biology - e.g., cross-cultural evidence for universality, developmental evidence of precocity, anthropological evidence for rapid cultural transmission, and so on - fails to adjudicate between these options. What it suggests is that folk biology involves some dedicated - and perhaps innate - cognitive structure. But once the general arguments for MM are rejected, there is little reason to interpret this evidence as favoring a modular account of folk biology.
Finally, even where the evidence for previously unrecognized modules is strong, it seldom turns out to be clear evidence for central modularity. Consider, for example, the "geometric module" hypothesis advocated by Cheng, Gallistel, and others (Gallistel, 1990; Hermer-Vazquez et al., 1999). Though contentious, the evidence for such a device is quite compelling. But this does not support a modular view of central processes, since the geometric module is most plausibly construed as part of vision or visuomotor control (Pylyshyn, 1999). Perhaps surprisingly, a similar point applies to what is widely regarded as among the strongest candidates for Central Modularity: the theory of mind module (ToMM) hypothesized by Alan Leslie and his collaborators (Leslie et al., 2004). The ToMM hypothesis is not without problems (see Nichols and Stich, 2002). But even if we put these to one side, the existence of ToMM would do little to strengthen the case for Central Modularity since, in its most recent and plausible incarnations, ToMM is characterized as a relatively low-level device for selective attention; and moreover one which relies heavily on decidedly nonmodular executive systems - most notably the "selection processor" - in order to perform its characteristic function (Leslie et al., 2004). Thus what is widely viewed as a strong candidate for Central Modularity both fails to be a clear example of a central system and presupposes the existence of nonmodular systems.
5 Whither Modularity?
The main burden of this chapter has been to argue that the case for an interesting and distinctive version of MM is not strong because there is little reason to suppose central processes are modular. In this section I conclude with some comments on a more radical view: that little or none of cognition - including peripheral systems for low-level perception - is modular in character. In my view, this claim goes too far, since there are strong empirical grounds for positing a wide array of non-central modules. Efforts to draw grand conclusions about the irrelevance of modularity thus strike me as, at best, premature and, at worst, a serious distortion of our current best picture of the cognitive mind.
5.1

As with almost any scientific hypothesis, the empirical case for peripheral modularity is not so strong as to exclude all competitors. But what matters is the relative
According to Prinz, "alleged modules may have domain-specific components" - e.g., proprietary rules and representations - but they "don't seem to be proprietary throughout." Prinz's argument for this conclusion divides in two. First, he distinguishes between some different senses of "specificity" and "domain" and argues that for a mechanism to be interestingly domain specific it should be exclusively dedicated to processing some relatively inclusive subject matter. Next, Prinz argues that the main alleged examples of modules - vision, language, mathematics, folk physics, etc. - are not domain specific in this sense. Neither part of the argument strikes me as convincing.
Let's turn to the case against encapsulated mechanisms. Prinz's strategy is to try to undermine what is widely regarded as the most plausible case for encapsulation: perceptual illusions, which modularists claim result from the operation of low-level perceptual mechanisms. But pace Fodor, Pylyshyn, and others, Prinz maintains illusions are not good evidence for encapsulation because there is a competing hypothesis - what he calls the "trumping hypothesis" - which explains the existence of illusions and, moreover, does a better job of accommodating evidence of top-down effects on perception. Again, I am unconvinced.

First, the trumping hypothesis itself is hard to make sense of. The rough idea is simple enough: though belief is trumped by perception when the two are in conflict, it can influence perception when no conflict exists. But how exactly is this to occur? On the most natural interpretation, what's required is some kind of consistency check between a belief (e.g., that the lines in the Müller-Lyer illusion are of equal length) and a representation produced by some perceptual process (e.g., that the lines are of different lengths). But if the trumping hypothesis were correct, such a checking process would presuppose the existence of encapsulated perceptual mechanisms. After all, for a consistency check to occur at all, there must be a perceptual representation - i.e., the output of some perceptual device - that can be checked against belief. And since, according to the trumping hypothesis, beliefs only influence perceptual processing when no conflict exists, it cannot be that beliefs are implicated in producing the output of this perceptual device.
In any case, the data cited by Prinz do not merit the rejection of encapsulated low-level perceptual mechanisms. Following a long tradition, Prinz argues that top-down effects on perception are incompatible with encapsulation. But the argument turns on the assumption that a "truly encapsulated system would be insulated from any external influence"; and this is simply false. On the contrary, advocates of encapsulation agree with their opponents that there are top-down influences on perception. What's at issue is the character and extent of this influence. Specifically, what advocates of encapsulated early perception are most concerned to reject is a picture - widely associated with Bruner's New Look psychology - on which early perceptual mechanisms have something approximating unlimited access to one's beliefs and desires in the course of their online processing (Fodor, 1983, p. 60; Pylyshyn, 1999). But this rejection is wholly compatible with many sorts of external cognitive influence on perception, including:
Shifts in the locus of focal attention brought about by cues, instructions, or preferences about where to look;
Top-down processing within a perceptual modality;
Cross-talk between perceptual systems;
Diachronic or developmental effects in which one's beliefs and goals influence the development of perceptual systems - e.g., via training effects;
Beliefs and desires influencing late perceptual processes, such as perceptual categorization.13
As far as I can tell, all the putative objections to encapsulation Prinz cites fall into one or other of these categories. For example, it is very plausible that the influence of verbal cues and decisions on our experience of ambiguous figures consists in the production of shifts in the locus of focal attention (Peterson and Gibson, 1991; Pylyshyn, 1999). Similarly, the role of expectations - e.g., in producing nonveridical experiences - is plausibly viewed as an influence on late perceptual processes. And so on. In view of this, I see no reason here to deny the modularity of early perception.
Conclusion
We started by clarifying the sort of view that advocates of MM seek to defend. We then saw that the main theoretical arguments for views of this sort fail to provide reason to prefer them over other competing proposals, such as those on which much of cognition depends on nonmodular mechanisms with access to bodies of specialized information. Next, we saw that the available experimental case for MM is not strong because there is little evidence for the existence of modular central systems. Moreover, we saw that there is some reason to reject a strong MM which claims that all central systems are domain specific and/or encapsulated. Finally, we saw that while MM is not well supported by the arguments and evidence, it would be wrong to maintain that minds are not modular to any interesting degree, since there are good reasons to suppose that more peripheral regions of the mind - especially those for low-level perception - are modular in character. Where does this leave us? If our assessment of the evidence is correct, then the most plausible position to adopt is one that takes a middle way between those, such as Carruthers, who endorse a thoroughgoing massive modularity and those, such as Prinz, who reject modularity altogether. The situation is, in other words, much as Fodor advocated over two decades ago (Fodor, 1983).
Acknowledgments
I would like to thank Guy Longworth, David Papineau, Gabe Segal, Rob Stain ton, and Mark
Textor for helpful comments on earlier drafts of this chapter. I would also like to thank Brian
Scholl for helpful discussion of the material in section 5.
Chomsky, N. (1980). Rules and Representations. New York: Columbia University Press.
Coltheart, M. (1999). Modularity and cognition. Trends in Cognitive Sciences, 3/3, 115-20.
Copeland, J. (1993). Artificial Intelligence: A Philosophical Introduction. Oxford: Blackwell.
Cosmides, L. (1989). The logic of social exchange: has natural selection shaped how humans reason? Studies with the Wason Selection Task. Cognition, 31, 187-276.
Cosmides, L. and Tooby, J. (1992). Cognitive adaptations for social exchange. In J. Barkow, L. Cosmides, and J. Tooby (eds.), The Adapted Mind. New York: Oxford University Press.
- and - (1994). Origins of domain specificity: the evolution of functional organization. In L. Hirschfeld and S. Gelman (eds.), Mapping the Mind. Cambridge: Cambridge University Press.
Epstein, R., Harris, A., Stanley, D., and Kanwisher, N. (1999). The parahippocampal place area: recognition, navigation, or encoding? Neuron, 23, 115-25.
Feigenson, L., Dehaene, S., and Spelke, E. (2004). Core systems of number. Trends in Cognitive Sciences, 8/7, 307-14.
Fodor, J. (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
- (2000). The Mind Doesn't Work That Way. Cambridge, MA: MIT Press.
- (2005). Reply to Steven Pinker: "So how does the mind work?" Mind and Language, 20/1, 25-32.
Gallistel, R. (1990). The Organization of Learning. Cambridge, MA: MIT Press.
- (2000). The replacement of general-purpose learning models with adaptively specialized learning modules. In M. Gazzaniga (ed.), The New Cognitive Neurosciences (2nd edn.). Cambridge, MA: MIT Press.
Garfield, J. (ed.) (1987). Modularity in Knowledge Representation and Natural-Language Understanding. Cambridge, MA: MIT Press.
Gigerenzer, G. (1994). Why the distinction between single-event probabilities and frequencies is important for psychology (and vice versa). In G. Wright and P. Ayton (eds.), Subjective Probability. New York: Wiley.
Gigerenzer, G. and Hug, K. (1992). Reasoning about social contracts: cheating and perspective change. Cognition, 43, 127-71.
Gopnik, A. and Meltzoff, A. (1997). Words, Thoughts and Theories. Cambridge, MA: MIT Press.
Hermer-Vazquez, L., Spelke, E., and Katsnelson, A. (1999). Sources of flexibility in human cognition: dual-task studies of space and language. Cognitive Psychology, 39, 3-36.
Jackendoff, R. (1992). Is there a faculty of social cognition? In R. Jackendoff, Languages of the Mind. Cambridge, MA: MIT Press.
Karmiloff-Smith, A. (1992). Beyond Modularity. Cambridge, MA: MIT Press.
Kirsh, D. (1991). Today the earwig, tomorrow man. Artificial Intelligence, 47, 161-84.
Leslie, A. M., Friedman, O., and German, T. P. (2004). Core mechanisms in "theory of mind." Trends in Cognitive Sciences, 8, 528-33.
McKone, E. and Kanwisher, N. (2005). Does the human brain process objects of expertise like faces? A review of the evidence. In S. Dehaene, J. R. Duhamel, M. Hauser, and G. Rizzolatti (eds.), From Monkey Brain to Human Brain. Cambridge, MA: MIT Press.
Newell, A. (1990). Unified Theories of Cognition. Cambridge, MA: Harvard University Press.
Nichols, S. and Stich, S. (2002). Mindreading. New York: Oxford University Press.
Palmer, S. (1999). Vision Science: Photons to Phenomenology. Cambridge, MA: MIT Press.
Peterson, M. A. and Gibson, B. S. (1991). Directing spatial attention within an object: altering the functional equivalence of shape descriptions. Journal of Experimental Psychology: Human Perception and Performance, 17, 170-82.
Pinker, S. (1994). The Language Instinct. New York: William Morrow.
- (1997). How the Mind Works. New York: Norton.
- (2005). So how does the mind work? Mind and Language, 20/1, 1-24.
Pylyshyn, Z. (1984). Computation and Cognition. Cambridge, MA: MIT Press.
- (1999). Is vision continuous with cognition? The case for cognitive impenetrability of visual perception. Behavioral and Brain Sciences, 22, 341-423.
Samuels, R. (1998). Evolutionary psychology and the massive modularity hypothesis. British Journal for the Philosophy of Science, 49, 575-602.
- (2000). Massively modular minds: evolutionary psychology and cognitive architecture. In P. Carruthers and A. Chamberlain (eds.), Evolution and the Human Mind: Modularity, Language and Meta-Cognition. Cambridge: Cambridge University Press.
- (2005). The complexity of cognition: tractability arguments for massive modularity. In P. Carruthers, S. Laurence, and S. Stich (eds.), The Innate Mind: Structure and Contents. New York: Oxford University Press.
Segal, G. (1996). The modularity of theory of mind. In P. Carruthers and P. Smith (eds.), Theories of Theory of Mind. Cambridge: Cambridge University Press.
Simon, H. (1962). The architecture of complexity. Proceedings of the American Philosophical Society, 106, 467-82.
Sperber, D. (1994). The modularity of thought and the epidemiology of representations. In L. Hirschfeld and S. Gelman (eds.), Mapping the Mind. Cambridge: Cambridge University Press.
- (1996). Explaining Culture: A Naturalistic Approach. Oxford: Blackwell.
- (2002). In defense of massive modularity. In E. Dupoux (ed.), Language, Brain and Cognitive Development. Cambridge, MA: MIT Press.
- (2005). Massive modularity and the first principle of relevance. In P. Carruthers, S. Laurence, and S. Stich (eds.), The Innate Mind: Structure and Contents. Oxford: Oxford University Press.
Sperber, D. and Girotto, V. (2003). Does the selection task detect cheater detection? In K. Sterelny and J. Fitness (eds.), From Mating to Mentality: Evaluating Evolutionary Psychology. New York: Psychology Press.
Sperber, D. and Hirschfeld, L. (2004). The cognitive foundations of cultural stability and diversity. Trends in Cognitive Sciences, 8/1, 40-6.
Sperber, D., Cara, F., and Girotto, V. (1995). Relevance theory explains the selection task. Cognition, 57, 31-95.
Tooby, J. and Cosmides, L. (1992). The psychological foundations of culture. In J. Barkow, L. Cosmides, and J. Tooby (eds.), The Adapted Mind. New York: Oxford University Press.
Van der Lely, H. K. J., Rosen, S., and Adlard, A. (2004). Grammatical language impairment and the specificity of cognitive domains: relations between auditory and language abilities. Cognition, 94, 167-83.
Varley, R. (2002). Science without grammar: scientific reasoning in severe agrammatic aphasia. In P. Carruthers, S. Stich, and M. Siegal (eds.), The Cognitive Basis of Science. Cambridge: Cambridge University Press.
Varley, R. A., Siegal, M., and Want, S. (2001). Severe impairment in grammar does not preclude theory of mind. Neurocase, 7, 489-93.
Zeki, S. and Bartels, A. (1998). The autonomy of the visual systems and the modularity of conscious vision. Philosophical Transactions of the Royal Society of London, 353, 1911-14.
HOW MUCH KNOWLEDGE OF LANGUAGE IS INNATE?

CHAPTER FOUR

Irrational Nativist Exuberance

Barbara C. Scholz and Geoffrey K. Pullum
The protracted dispute over the degree of independence of language acquisition from sensory experience often degenerates into an unsavory cavalcade of exaggerated claims, tendentious rhetoric, and absurd parodies of opposing views.1 In this chapter we try to distinguish between partisan polemics and research programs. If either side of the partisan dispute about the acquisition of syntax were as stupid as the opposing side alleges, the free-for-all would not be worthy of serious attention; but in fact we think there are two important complementary research programs for syntax acquisition involved here.
We are skeptical about recent triumphalist claims for linguistic nativism, and this may lead to us being mistaken for defenders of some sort of "empiricism."2 But touting empiricist stock is not our project. Curbing the excesses of irrational nativist exuberance is more like it. We argue that it is premature to celebrate nativist victory (as Laurence and Margolis, 2001, seem to be doing, for instance),3 for at least two reasons. First, the partisan dispute is too ill-delineated to reach a resolution at all, because of a persistent tendency to conflate non-nativism with reductive empiricism, and because of equivocations rooted in the polysemy of the word "innate" (section 2). And second, linguistic nativist research programs need theories of learning - exactly what non-nativist research programs aim to develop (section 3).
Although we use "linguistic nativism" throughout this paper to denote a number of contemporary views about the acquisition of syntax, the reader will note that we tend to avoid using "innate." We try instead to address the specifics that the term "innate" often seems to occlude rather than illumine: the extent to which the acquisition of syntax proceeds independently of the senses, for example, and the extent to which it depends on generally applicable human cognitive capacities. The traditional "empiricist" claim is that the syntactic structure of languages (like everything else) is learned from sensory input. This could be false in at least two ways: it could be that the syntactic aspects of language are not acquired at all, but are antecedently
1 Empiricism
The classic empiricist slogan states that there is "nothing in the intellect which was not previously in the senses" (Aquinas, Summa Theologica, Ia). Nativism is often taken to be the negation of empiricism: the view that at least one thing is in the intellect that was not acquired from the senses. But this is too weak to be an interesting form of contemporary nativism. It would surely be a Pyrrhic victory if linguistic nativism were true simply in virtue of one solitary unacquired or unlearned contentful linguistic principle, everything else being learned.4 And it would make it a mystery why nativist linguists have attempted to establish the existence of so many such principles, and have emphasized their abundance.5 For the purposes of this chapter, we take linguistic nativism to be the view stated in (1):

1 Most of the acquisition of natural languages by human beings depends on unacquired (or acquired but unlearned) linguistic knowledge or language-specialized cognitive mechanisms.
This psychological generalization quantifies over unacquired (or acquired but unlearned) knowledge and mechanisms specialized for language.6 The research program of linguistic nativism aims to show, proposition by proposition and mechanism by mechanism, that very little knowledge of syntactic structure is acquired or learned from sensory stimuli. Thus the discovery of one (or even a few) language-specialized cognitive mechanisms does not resolve the partisan nativist/non-nativist dispute. Even after the discovery of one genuinely unacquired linguistic principle, the development of both nativist and non-nativist research programs would and should continue.
Non-nativism with regard to language acquisition is the view stated in (2):

2 It is not the case that most of the acquisition of natural languages by human beings depends on unacquired (or acquired but unlearned) linguistic knowledge or language-specialized cognitive mechanisms.
Given this reductive characterization, Fodor struggles to specify the point of disagreement between nativists and both historical empiricists and contemporary non-nativists (1981, pp. 279-83). Do they disagree "over which concepts are primitive"? Do nativists deny "that the primitive concepts constitute an epistemologically interesting set"? Do contemporary empiricists accept that "the primitive concepts are the ones whose attainment I can't eventually account for by appeal to the mechanisms of concept learning"? Unsurprisingly, his effort to locate the precise point of difference fails. A dispute that is purportedly about the resources required for language acquisition is miscast as a dispute about the acquisition of unstructured concepts and the failure of reductive empiricism, all wrapped up in an empiricist theory of justification rather than a theory of concept acquisition.
Contemporary non-nativist psychology need not be either atomistic or reductive. Consider, for example, the non-reductive conjectures about the sense-based acquisition of natural-kind concepts developed by Boyd (1981, 1991) and Kornblith (1993), crucial to their understanding of scientific realism. Boyd finds in certain (unofficial) Lockean views the suggestion that natural kinds are primitive, complex, structured, homeostatic clusters of properties, and our sense-based concepts of them are complex homeostatic cluster concepts. What Boyd seems to reject is that primitive concepts are non-complex, and that non-complex concepts are epistemologically interesting. We will not extend these ideas to language acquisition here, but we note that Christiansen and Curtin (1999) appear to be applying them to word individuation.
Fodor, however, is certainly right about at least two things. First, any coherent research program in language acquisition must accept that some acquisition mechanisms are not acquired. All parties must concede this on pain of a vicious regress of acquired mechanisms for acquisition (see Block, 1981, p. 280). But Chomsky presses this point to a parodic extreme:

To say that "language is not innate" is to say that there is no difference between my granddaughter, a rock, and a rabbit. In other words, if you take a rock, a rabbit, and my granddaughter and put them in a community where people are talking English, they'll all learn English. If people believe that, then they'll believe language is not innate. If they believe that there is a difference between my granddaughter, a rabbit, and a rock, then they believe that language is innate. (Chomsky, 2000, p. 50)
if what is so acquired is never anything more than an exact copy of the statistical distributions in the stimulus. This error is frequently made by linguistic nativists. For example, Lidz et al. (2003) write that "It is hopeless to suppose that learning is responsive (solely) to input frequency, because the first word [that children acquire] in English vocabulary is not the." As Elman (2003) notes, it is an error to take stochastic learning theory to hypothesize that children learn statistics, i.e., that they merely copy or memorize stimulus frequency patterns. On the contrary, stochastic learning theory holds that language learning is based on complex, higher-order properties of stochastic patterns in sensory experience, not a mere tabulation of the frequency of patterns. To take children's (or adults') sense-based stochastic acquisition abilities to be limited to frequency detection and tabulation greatly underestimates their power. One leading question in statistical language acquisition research concerns the kinds of stochastic patterns infants can acquire (Saffran et al., 1996).
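As a concrete illustration of one such higher-order property, here is a rough Python sketch (ours, not from the experimental literature; the syllables and toy vocabulary are invented, loosely modeled on the familiarization streams used in statistical-segmentation studies). It computes transitional probabilities between adjacent syllables; dips in these conditional probabilities, rather than raw syllable frequencies, mark plausible word boundaries.

```python
import random
from collections import Counter

def transitional_probabilities(stream):
    """Estimate P(next syllable | current syllable) from a stream."""
    pairs = Counter(zip(stream, stream[1:]))
    firsts = Counter(stream[:-1])
    return {(a, b): n / firsts[a] for (a, b), n in pairs.items()}

# A toy 'language' of three trisyllabic words, concatenated with no
# pauses between words, as in statistical-segmentation experiments.
words = [["bi", "da", "ku"], ["pa", "do", "ti"], ["go", "la", "bu"]]
random.seed(0)
stream = [syll for _ in range(300) for syll in random.choice(words)]

tp = transitional_probabilities(stream)
# Within a word the next syllable is fully predictable (TP = 1.0);
# across a word boundary it is not (TP around 1/3). The dips in TP
# thus recover the word boundaries, a statistic that a mere tally of
# syllable frequencies (all roughly equal here) could never reveal.
print(tp[("bi", "da")])   # within-word transition: 1.0
print(tp[("ku", "pa")])   # cross-boundary transition: about 0.33
```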
Nativists also sometimes mistakenly assume that the only kind of linguistic stimulus that could be relevant to language acquisition is the presence or absence of certain individual strings in the primary linguistic data. The assumption that rare or even absent strings would have to occur frequently for stochastically based learning to succeed oversimplifies (without investigation) the relevant distributional properties of the data (Lewis and Elman, 2001; Elman, 2003). Reali and Christiansen (forthcoming) provide further evidence that the relevant features of the stimulus for statistically based language acquisition models are the stochastic properties of the overall input, not just the presence or absence of individual strings therein. And see also Saffran et al. (1996), Aslin et al. (1998), Gomez (2002), and Saffran and Wilson (2003) for evidence that children are effective statistical learners.
Non-nativist researchers on language acquisition are free either to accept or to reject historical empiricist doctrines, because contemporary linguistic non-nativism is not a form of reductive empiricism. It is merely a rejection of (1).
2 What Innateness Is
The hypothesis that some features of natural languages are acquired by triggering is characteristic of the "principles and parameters" theory.7 Parameters are supposed to "reduce the difficulty of the learning problem" (Gibson and Wexler, 1994, p. 107). Parametrized universal principles are hypothesized to facilitate language acquisition by reducing what must be learned from sensory experience about the systematic parochial variations of natural languages.8 A parameter does not specify a single property common to all natural languages. Rather, it specifies a fixed set of mutually exclusive linguistic properties, of which any given natural language can have exactly one.

Parameters are alleged to be unacquired. What is acquired is a particular setting of a parameter, by the process of being triggered by an environmental stimulus or range of stimuli. For example, "initial" might be one possible setting for a parameter governing the position of the lexical head (e.g., the verb in a verb phrase), and "final" the other setting, ruling out the possibility of any language in which lexical heads are positioned, say, as close to the middle of a phrase as possible. The debated issue
The antecedently given range of possible parameter settings is "in the slave boy." The information in the activated parameter is not acquired by reasoning or inference from information in the environmental trigger. It is inherent in the boy's pre-existing parameter.

In what follows we use instantaneous acquisition for this kind of parameter setting by (Fodorian) triggering. In instantaneous acquisition no information in the environmental trigger informs or is taken up into the product of the triggering process: e.g., exposure to ambient temperatures of above 90°F might cause the head parameter to be set to strict verb-final clause structure.
Under the second triggering metaphor (Stich, 1975, p. 15), the parameter is not merely set off or activated. Rather, the information in the environmental trigger is relevant to the information content of the product of the triggering process (though the latter is not inferred from the former). As Gibson and Wexler (1994, p. 408) characterize it, triggers are "sentences in the child's experience that point directly to the correct settings of parameters"; indeed, for any setting of a parameter "there is a sentence that is grammatical under that setting but not under any other." Exposure to a trigger "allows the learner to determine that the appropriate parameter setting is the one that allows for the grammaticality of the sentence." Gibson and Wexler go on to develop this view (see their "Triggering Learning Algorithm," 1994, pp. 409-10): if a trigger fails to be grammatical as analyzed by the currently entertained grammar, the learning algorithm modifies a parameter setting to see if that will permit the trigger to be analyzed successfully (and changes it back again if not).
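The algorithm is easy to state in code. Here is a minimal sketch (ours, not Gibson and Wexler's implementation; in particular, the "parser" is a stand-in that simply checks agreement with whatever parameters a sentence happens to express):

```python
import random

def parses(grammar, sentence):
    """Stand-in for a real parser: a sentence 'parses' iff the grammar
    agrees with the target settings on every parameter the sentence
    expresses. (A real TLA would attempt a genuine parse.)"""
    return all(grammar[i] == v for i, v in sentence.items())

def make_sentence(target):
    """An input expressing a random nonempty subset of the parameters."""
    k = random.randint(1, len(target))
    return {i: target[i] for i in random.sample(range(len(target)), k)}

def tla_step(grammar, sentence):
    """One step of a simplified Triggering Learning Algorithm: on a
    parsing failure, flip one randomly chosen parameter and keep the
    flip only if the sentence now parses; otherwise change it back."""
    if parses(grammar, sentence):
        return grammar
    i = random.randrange(len(grammar))
    flipped = list(grammar)
    flipped[i] = 1 - flipped[i]
    flipped = tuple(flipped)
    return flipped if parses(flipped, sentence) else grammar

random.seed(0)
target = (1, 0, 1)        # the environment's settings (labels arbitrary)
grammar = (0, 1, 0)       # the learner's initial settings
for _ in range(500):
    grammar = tla_step(grammar, make_sentence(target))
print(grammar == target)  # True: the settings have been 'triggered'
```

The trigger here carries information - it is grammatical under the correct setting - but the learner never reasons from that information; a blind flip is merely kept or discarded.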
Henceforth we use the term accelerated acquisition for this kind of informationally
triggered parameter setting.
By claiming that the learner determines the setting, Gibson and Wexler mean that the uptake of linguistic information in the trigger is necessary for the acquisition of a particular setting for a parameter. The familiar expression "poverty of the stimulus" picks up on the impoverished information in the triggering stimulus by contrast with the richness of the information in the set parameter, in both instantaneous and accelerated acquisition. But the two triggering processes are distinct. We speculate that an overemphasis on arguments from the poverty of the stimulus has led many philosophers and linguists to overlook the differences between these two kinds of triggering. However, Gibson and Wexler do distinguish sharply between their concept of triggering and Fodorian triggering (instantaneous acquisition), which is "supposed to mean something like an experience that has nothing theoretically to do with a parameter setting, but nevertheless determines the setting of a parameter" (Gibson and Wexler, 1994, p. 408, n. 2).
Instantaneous acquisition is a brute-causal psychological process, unmediated by intervening psychological states, exactly as Samuels says. Thus the product of instantaneous acquisition is innate for Samuels. For Gibson and Wexler, on the other hand, discrimination and uptake of the information in the stimulus mediates the setting of a parameter. Parameter setting by accelerated acquisition is neither psychologically primitive nor environmentally canalized, so its products are not innate in either Samuels' or Ariew's sense.
Doubtless Gibson and Wexler would reject both Samuels' and Ariew's concepts of innateness, because they claim the product of their preferred triggering process is innate, although neither psychologically primitive nor canalized. But if it is, then there must be some other concept of innateness to vindicate their claim. Quite a few are on offer. Sober (1998) has argued that all that is left of the pre-scientific concept of innateness is the idea that what is innate is invariant across environments. But it is immediately clear that invariance innateness won't do: triggering is supposed to explain the acquisition of linguistic structures that systematically vary across natural languages. Of course, parameters that have not yet been set are in a sense invariant. But they do not explain how infants acquire knowledge of parochial aspects of their languages.
Some scientists talk as if what is universal across all typical members of the species, or across all natural languages, is innate (see, e.g., Barkow et al., 1992); but, ex hypothesi, the products of triggering are not universal.

Stich (1975, p. 12) considers a Cartesian dispositional analysis of innate beliefs: a belief is innate for a person just in case "that person is disposed to acquire it under any circumstances sufficient for the acquisition of any belief." But this lends no support to any advocate of the idea that the products of triggering are innate. Knowledge of particular languages that is acquired by the triggering of a parameter requires special circumstances.
Gibson and Wexler should probably not consider reverting to the idea that what is innate is known a priori. First, if a priori acquisition is defined as "what is acquired independently of any specific experience," then the products of instantaneous and accelerated acquisition, which depend on the experience of specific sensory triggers, are not innate; and defining a priori knowledge as "what is known on the basis of reason alone" fails because the products of all triggering processes are, by definition, not acquired by means of inference or reason.
Bealer (1999) has more recently articulated a concept of the a priori through evidence that "is not imparted through experiences but rather through intuitions" (p. 245). According to Bealer, "For you to have an intuition that A is just for it to seem to you that A" (p. 247). Thus, we might say that

a trait or structure, A, is innate for S just in case S's evidence for A is a priori, i.e., it cognitively, consciously, and reliably seems to S that A.
But the triggering process is, ex hypothesi, not consciously accessible. And the idea
that a trait is innate just in case it is due to our "biological endowment" fails, since
even a behaviorist thinks association mechanisms are part of our biology.
At least three of the senses of "X is innate" in the linguistics literature that we have discussed here are empirically dissociated: (i) X is a psychological primitive, (ii) X is canalized, and (iii) X is universal across all natural languages. We have also seen that linguistic nativism hypothesizes at least three distinct specialized mechanisms of language acquisition that correspond to each of these kinds of innateness: instantaneous acquisition, accelerated acquisition, and unacquired universal principles. Our point is not that any one of these conceptions of innateness is somehow illegitimate, or that one is to be preferred to the others. So far as we can tell, each of these kinds of innateness could plausibly play a role in the explanation of language acquisition. Rather, our worry is that treating empirically dissociated mechanisms with the single label "innate" only obscures the detailed and accurate understanding of language acquisition that is the goal of cognitive psychology.
We are certainly not the first to notice that the blanket labeling of distinct developmental trajectories as "innate" (or "learned") impedes scientific understanding. Bateson (1991; 2004, pp. 37-9) has identified seven ways in which "instinct" and "innate" are polysemous in the behavioral ecology literature, and notes that few traits are innate in all seven senses. Griffiths (2002) argues that "innateness" is undesirable as a theoretical term, since it confuses exactly what needs to be clarified. We join Bateson, Griffiths, and others in recommending that the term "innate" be abandoned in theorizing about language acquisition, because it impedes the study of language acquisition.
3 What Are Unacquired Linguistic Universals Good For?
Linguistic nativists have repeatedly emphasized that they think that the human infant must be in possession of unacquired linguistic universals (which we will henceforth refer to as ULUs). The following remarks of Hauser, Chomsky, and Fitch are representative:

No known "general learning mechanism" can acquire a natural language solely on the basis of positive or negative evidence, and the prospects for finding any such domain-independent device seem rather dim. The difficulty of this problem leads to the hypothesis that whatever system is responsible must be biased or constrained in certain ways. Such constraints have historically been termed "innate predispositions," with those underlying language referred to as "universal grammar." Although these particular terms have been forcibly rejected by many researchers, and the nature of the particular constraints on human (or animal) learning mechanisms is currently unresolved, the existence of some such constraints cannot be seriously doubted. (Hauser et al., 2002, p. 1577)
But from the claim that language acquisition must be affected by some sorts of bias or constraint it does not follow that those biases or constraints must stem from either ULUs or parameters. A non-nativist can readily accept biases or constraints stemming from sensory mechanisms that are specific to language but non-cognitive, or from cognitive-computational mechanisms that are not language-specialized.
What tempts the defenders of nativism to believe otherwise? The matter is complex. In brief, we see three factors conspiring to tempt nativists into thinking that only ULUs could guide language acquisition: (i) an inappropriately selective skepticism based on Humean underdetermination; (ii) a highly selective faith in lexical learning by hypothesis formation and testing; and (iii) a failure to appreciate the breadth of scope of the important mathematical results set out by E. Mark Gold (1967).
The idea of studying learning by investigating the limits of an abstract pattern-learning machine originates with Ray Solomonoff in work done at the end of the 1950s (Li and Vitanyi, 1997, pp. 86-92, provide a very useful history with references). Independently, it would seem, Hilary Putnam (1963a, 1963b) provided a basic impossibility result about what a machine for inductive learning could in principle accomplish: there can never be a "perfect" learning machine, because for any proposed such machine we can define a regularity that it cannot induce. Putnam's proof strategy was later used by Gold (again, independently, it would appear) to prove a key result in the narrower domain of language learning by guessing grammars (Gold, 1967).
Gold's enormously influential paper stimulated the development of a whole subfield of mathematical work on learning based on recursive function theory (see Jain et al., 1999, for a comprehensive survey). Conceptualizing language learning as a process of guessing a generative grammar (or, in a separate series of results, guessing a parser), Gold advocated investigation of "the limiting behavior of the guesses as successively larger and larger bodies of information are considered" (Gold, 1967, p. 465). He obtained both pessimistic and optimistic results. On one hand, he showed that there was a sense in which for all interesting classes of generative grammars9 the learning problem was unsolvable, because what has to be learned is deductively underdetermined by the positive evidence (evidence about what is in the language; successful learning from such evidence is called "identification in the limit from text"). On the other hand, he showed that if the evidence is an information sequence covering both what is and what is not in the language, the learning problem is solvable for a huge range of classes of languages.
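The pessimistic half can be felt in a toy case (our illustration, not Gold's own construction). Take the class containing, for each k, the finite language {a^i : i <= k}, together with the infinite language a*. A conservative learner that always guesses the smallest consistent language never stabilizes on a text for a*; a bolder learner that leaps to a* would be unrecoverably wrong whenever the target is finite, since positive data can never contradict an overgeneral guess.

```python
def conservative_guess(positive_data):
    """Guess the smallest language in the class {a^i : i <= k} (for
    some k) union {a*} consistent with the examples seen so far."""
    k = max(len(s) for s in positive_data)
    return f"{{a^i : i <= {k}}}"

# A 'text' (positive-only presentation) for the infinite language a*:
text = ["a" * i for i in range(1, 8)]
for n in range(1, len(text) + 1):
    print(n, conservative_guess(text[:n]))
# The guess changes at every step and never stabilizes on a*; yet every
# initial segment of the text is also consistent with some finite
# language, so jumping to a* risks an overgeneralization that positive
# data could never correct. With an informant supplying negative as
# well as positive evidence, the class becomes identifiable.
```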
The pessimistic results depend on a number of assumptions. We summarize those relevant to "identification in the limit from text" in (3).

3 (a) The learner's evidence consists only of positive data: an unbounded sequence of strings drawn from the target language.
(b) Success consists in eventually settling on a perfectly correct generative grammar of the target language.
(c) The learner's hypotheses are whole generative grammars, and strings are the only relevant evidence.
(d) Acquisition proceeds by hypothesis formation and testing.
Since human children do learn languages, and Gold has apparently proved that they can't, we face a paradox. The only plausible response is to reject one or more of the assumptions leading to it. That is, one or more of Gold's assumptions must not hold for child language learners (see Scholz, 2004, for further discussion of this point). Many contemporary linguistic nativists respond by rejecting one of them, namely (3d), the assumption that language acquisition proceeds by hypothesis formation and testing. The positive alternative they propose is that ULUs do essentially all the work.
This move might seem to obviate any need for an algorithm that could acquire a natural language by hypothesis formation and testing: such an algorithm would be otiose. No significant linguistic generalizations are learned, because none need to be. But in fact Gold's paradox recurs for learning any parochial linguistic generalization that involves universal quantification over an unbounded domain, even a lexical generalization. The fact that natural languages are lexically open (see Johnson and Postal, 1980, ch. 14; Pullum and Scholz, 2001, 2003; Postal, 2004) is relevant. Many purely parochial lexical generalizations are highly productive, because it is always possible to add another word to the lexicon of a natural language. Take the English prefix anti-, or the suffix -ish. It appears that any noun will allow them: we can form words like anti-borogove, anti-chatroom, anti-humvee, anti-Yushchenko (though perhaps some of these have not yet been coined); similarly for borogovish, chatroomish, humveeish, Yushchenko-ish. Thus the Gold paradox returns: the learner must in effect identify a grammar for the lexicon, a grammar generating indefinitely many derived words; and that cannot be done from positive input alone. For what is the generalization for anti-, or for -ish? Permissible with all bases beginning with b, ch, h, or y? Or all nouns other than arachnophobe? Or all non-Latinate roots? Or all bases of two or more syllables? All these and indefinitely many other hypotheses entail the finite corpus of English words considered above. But lexical generalizations are just as underdetermined by the evidence as hypotheses about syntactic structure are, so expanding the evidence won't determine a hypothesis. Yet neither ULUs nor parameters can help here: ex hypothesi these parochial lexical generalizations are just those that are acquired from evidence of use.10
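The underdetermination is easy to exhibit concretely. In the following toy sketch (ours; the candidate rules, and the deliberately crude tests standing in for them, are invented for illustration), several mutually incompatible generalizations about anti- all survive the same finite corpus of attested coinages:

```python
corpus = ["anti-borogove", "anti-chatroom", "anti-humvee", "anti-Yushchenko"]
bases = [w[len("anti-"):].lower() for w in corpus]

# Mutually incompatible candidate generalizations for 'anti-', each
# implemented by a deliberately crude stand-in test:
hypotheses = {
    "bases beginning with b, ch, h, or y":
        lambda b: b.startswith(("b", "ch", "h", "y")),
    "all nouns other than 'arachnophobe'":
        lambda b: b != "arachnophobe",
    "all non-Latinate bases":
        lambda b: not b.endswith(("tion", "ity", "ium")),
    "all bases of two or more syllables":
        lambda b: sum(c in "aeiouy" for c in b) >= 2,  # crude vowel-count proxy
}

survivors = [name for name, test in hypotheses.items()
             if all(test(b) for b in bases)]
print(survivors)   # every hypothesis survives: the corpus cannot decide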
Something more than ULUs and various sorts of parameters will be required for the full story about language acquisition. Unless anyone wants to propose the extremely implausible view that no one ever learns anything about any language, we will need a theory of how people learn what is learned. And developing such a theory is exactly the non-nativist research program.
If nativists respond to Gold by rejecting learning by hypothesis formation and testing, how do contemporary non-nativists respond? There are many current non-nativist programs, but none of Gold's assumptions is accepted by all of them as relevant to children's first language acquisition:
Instead of only positive data, the child's experience has been investigated and shown to offer plenty of information about what is not in the language (Chouinard and Clark, 2004), and the underdetermination problem is addressed through Bayesian inference, which rules out many accidental generalizations that are supported by the corpus, using probability computations to determine whether certain absences from the corpus are systematic or accidental (Elman, 2005) - see the toy calculation after this list;
Instead of success being defined as hitting upon a perfectly correct generative grammar, approximative definitions of success have been proposed (the whole field of "probably approximately correct" or "PAC" learning in computer science is based on this move);
Instead of a hypothesis-testing procedure with whole grammars as hypotheses and only strings as relevant evidence, various monotonic and incremental procedures for approaching a workable grammar are proposed.
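Here is what such an absence computation might look like in miniature (a toy Bayesian sketch of ours; the probabilities are invented). Hypothesis H1 says a string s is licensed by the target grammar and would be produced with some small probability per utterance; H2 says s is ungrammatical. A long run of utterances without s then progressively shifts belief toward H2, so a systematic absence itself functions as evidence:

```python
def p_grammatical_given_absence(n, p_s=0.01, prior=0.5):
    """Posterior probability that string s is licensed by the target
    grammar, given that s has not occurred in n observed utterances.
    H1: s is licensed and occurs with probability p_s per utterance.
    H2: s is ungrammatical, so its absence is certain."""
    like_h1 = (1 - p_s) ** n    # probability of n misses under H1
    like_h2 = 1.0               # absence is guaranteed under H2
    return prior * like_h1 / (prior * like_h1 + (1 - prior) * like_h2)

for n in (0, 50, 200, 1000):
    print(n, round(p_grammatical_given_absence(n), 4))
# As n grows, the posterior for 'grammatical' falls toward zero: the
# longer the absence persists, the more confident the learner can be
# that it is systematic rather than accidental.
```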
The leading challenge these research programs present for linguistic nativism is this: if some of the above proposed methods are utilized by children to learn lexical generalizations, why are ULUs and linguistic parameters required for the acquisition of natural language syntax, but not lexical structure?
4.1
Lewis and Elman (2001) demonstrate that a Simple Recurrent Network (henceforth, SRN) correctly models the acquisition of what linguistic nativists thought required unacquired representations of hierarchical syntactic structure.11 The case Lewis and Elman consider is the one that Crain (1991) calls the "parade case of an innate constraint." Nativist researchers take it to be one of the strongest arguments for linguistic nativism from the poverty of the stimulus. The reader is typically introduced to the acquisition problem via contrasting pairs of sentences (we cite examples from Laurence and Margolis, 2001, p. 222):
4 (a)
(b)

5 (a)
(b)
Given just these four types of sentences, the nativist's assumption is that the child (or is it the linguist?) would be tempted to hypothesize the following kind of syntactic generalization:

6

(Set aside for now that (6) contains the linguistic concepts "sentence" and "subject.") The hypothesized (6) turns out to be one of the seductive accidental generalizations that is not supported by further data, as the following pair of sentences shows.
7 (a)
(b)

The correct yes/no question formed from (7a) is (8), where the second is has been repositioned:

8
The right hypothesis could be framed in various ways, but a straightforward one would be this:12

9 All closed interrogative clauses formed from declarative clauses are formed by placing the main-clause auxiliary verb at the beginning of the sentence, before the subject.
If this is the child's generalization about the structure of English, then lexical concepts like "main clause" and "auxiliary verb" must, it is supposed, be antecedently known (Fodor, 1981), or the generalization cannot even be entertained. The concept "main clause" relates to hierarchical syntactic structure, not just the linear order of words (the presumed stimulus). So there is every reason for the nativist to suppose that children couldn't frame (9) from stimuli that consist merely of unstructured strings of words.
Certainly, children are reported not to make mistakes like (7b). Crain and Nakayama (1987) ran a study of thirty children (ages 3 to 5 years) who were told to "Ask Jabba if the man who is running is bald." Crain (1991) reports that the outcome was as predicted: children never produced incorrect sentences like (7b). From this Crain (1991) concludes, "The findings of this study, then, lend support to one of the central claims of universal grammar, that the initial state of the language faculty contains structure-dependence as an inherent property."13
If replicable, the finding that 3- to 5-year-old children don't make mistakes like (7b) would be interesting to nativists and non-nativists alike. But it does not support the idea that children have a particular ULU or parameter. And the finding only supports the idea that children have some ULU or other if there is no better explanation.
Could children learn not to make mistakes like (7b) from linguistic stimuli? Chomsky has asserted (without citing evidence) that "A person might go through much or all of his life without ever having been exposed to relevant evidence" of this kind; he states (see Piattelli-Palmarini, 1980, pp. 114-15) that "you can go over a vast amount of data of experience without ever finding such a case" - i.e., a sentence with this structural property. Sampson (1989, 2002) and Pullum and Scholz (2002) question whether such strings are all that rare, providing evidence that relevantly similar strings are found in a variety of texts, including spoken English sources, some of them aimed at fairly young children.14 But Lewis and Elman (2001) did something particularly interesting that took a different approach.
Lewis and Elman showed that "the stochastic information in the data that is uncontroversially available to children is sufficient to allow learning." SRNs will "generalize to predict" (2001) in a word-by-word fashion that English has interrogatives like (8), but not like (7b), from training sets that contained strings with the structure of (4) and (5), but not the supposedly rare (8). These training sets encoded "no grammatical information beyond what can be determined from statistical regularities." Thus, from finite training sets, their SRN does not generalize in hopelessly wrong ways. Nor is learning accomplished by ignoring rich stochastic information in the data. More recently, Reali and Christiansen (forthcoming) have obtained similar results using a noisy corpus of language addressed to children. These results should be very surprising and intriguing to a linguistic nativist like Crain.
The moral Lewis and Elman draw is that "assumptions about the nature of the input, and the ability of the learner to utilize the information therein, clearly play a critical role in determining which properties of language to attribute to UG" (2001, p. 11). If the relevant stimulus is underestimated to exclude its relevant stochastic features, and if the mechanisms of language acquisition are assumed not to exploit them, too much will be taken to be unacquired or triggered. The linguistic nativist seems to tacitly assume that the relevant stimuli for acquisition are simply strings observed in a context of use. But as Lewis and Elman put it: "the statistical structure of language provides for far more sophisticated inferences than those which can be made within a theory that considers only whether or not a particular form appears." The relevant input includes distributional information about the set of acquisition stimuli (for an SRN, what is in the training set).
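For readers who have not met SRNs, the following minimal Elman-style network (our toy reconstruction in Python with numpy, trained on an artificial three-symbol pattern rather than on Lewis and Elman's English-like corpus) shows the architecture's essential trick: the previous hidden state is copied into a context layer and treated as ordinary input, so the network's word-by-word predictions come to depend on preceding context, not just on the current symbol.

```python
import numpy as np

class SRN:
    """Minimal Elman-style Simple Recurrent Network for next-symbol
    prediction. As in Elman's original scheme, the previous hidden
    state is copied into a context layer and treated as a plain input,
    so training is ordinary backprop with no unrolling through time."""
    def __init__(self, vocab_size, hidden=16, lr=0.2, seed=0):
        rng = np.random.default_rng(seed)
        self.Wxh = rng.normal(0, 0.1, (hidden, vocab_size))
        self.Whh = rng.normal(0, 0.1, (hidden, hidden))
        self.Why = rng.normal(0, 0.1, (vocab_size, hidden))
        self.lr, self.h = lr, np.zeros(hidden)

    def step(self, x_idx, target_idx=None):
        x = np.zeros(self.Wxh.shape[1]); x[x_idx] = 1.0
        context = self.h                       # copy of previous state
        self.h = np.tanh(self.Wxh @ x + self.Whh @ context)
        logits = self.Why @ self.h
        p = np.exp(logits - logits.max()); p /= p.sum()
        if target_idx is not None:             # one backprop update
            d_log = p.copy(); d_log[target_idx] -= 1.0
            d_h = (self.Why.T @ d_log) * (1.0 - self.h ** 2)
            self.Why -= self.lr * np.outer(d_log, self.h)
            self.Wxh -= self.lr * np.outer(d_h, x)
            self.Whh -= self.lr * np.outer(d_h, context)
        return p

def predict_after(net, prefix):
    net.h = np.zeros_like(net.h)
    for sym in prefix[:-1]:
        net.step(sym)
    return net.step(prefix[-1])

# Train on the repeating pattern a a b: what follows 'a' depends on
# what preceded it, so bare input-output associations cannot succeed.
net = SRN(vocab_size=3)
seq = [0, 0, 1] * 1000
for cur, nxt in zip(seq, seq[1:]):
    net.step(cur, target_idx=nxt)

print(predict_after(net, [0, 0, 1, 0]).round(2))     # ends ...b a: 'a' should win
print(predict_after(net, [0, 0, 1, 0, 0]).round(2))  # ends ...b a a: 'b' should win
```

The point of the toy is only architectural: the distributional information carried forward in the context layer is what lets the same input symbol receive different predictions in different contexts.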
Suddenly it begins to look as if what matters for language acquisition is what information is present in the overall stimulus and how the stimulus is impoverished, not just whether it is impoverished. Lewis and Elman's training sets included none of the supposedly crucial rare sentences like (8). It begins to seem that structure-dependence can be acquired from the stimuli even if sentences like (8) are entirely absent, contrary to over 25 years of nativist claims.
Of course, there might be other linguistic universals that can't be learned. But these findings about SRNs raise a series of further questions for both types of research programs. One wants to know exactly which kinds of gaps in training sets SRNs do and do not fill in, and to extend this line of work to children's language acquisition. If children fail to fill in as SRNs do, then that might be grist for the nativist's mill. Indeed, the results of these computational experiments suggest that Jerry Fodor's (1981) claims about the necessity of unacquired linguistic concepts and the impossibility of learning a language by hypothesis formation and testing only hold for symbolic representations and for the particular learning theory he considers. But they seem irrelevant to the acquisition of distributed representations by means of learning theories based on information-rich statistical regularities in the stimulus, which is a serious contender as a better explanation of the phenomena.15
4.2
The research program of Newport and Aslin (2000) has found that children might well acquire some morphological/syntactic categories and generalizations from inconsistent and error-ridden data by attending to the stochastic properties of the stimulus. They studied children whose linguistic input is "entirely from speakers who are themselves not fluent or native users of the language" (Newport and Aslin, 2000, p. 12). Their subjects were congenitally and profoundly deaf children acquiring American Sign Language (ASL) as a first language in families with only very imperfect proficiency in ASL. They describe these children's ASL input as "very reduced and inconsistent." We will focus on two of their case studies: one involving a child they call Simon, and the other involving two children they call Stewart and Sarah.
Simon is widely celebrated in the popular literature on language acquisition; see Pinker (1994, pp. 38-9) for an often cited discussion. Simon's acquisition of ASL is taken to provide powerful support for linguistic nativism. The case study as reported in Newport and Aslin (2000), however, does not vindicate a nativist interpretation. Simon is an only child, with deaf parents who were not exposed to ASL until their teens. None of Simon's teachers knew ASL, so his ASL input was all from his parents. Stewart and Sarah are different in that their parents are hearing, though similar to Simon in other relevant respects. Newport and Aslin report:
Simon's parents sign like other late learners: they use virtually all of the obligatory ASL morphemes, but only with middling levels of consistency. On relatively simple morphemes (the movement morphemes of ASL), they average 65-75 percent correct usage. In contrast, Simon uses these morphemes much more consistently (almost 90 percent correct), fully equal to children whose parents are native ASL signers. Thus, when input is quite inconsistent, Simon is nonetheless able to regularize the language and surpass his input models. On more difficult morphemes (the handshape classifiers of ASL), where his parents were extremely inconsistent (about 45 percent correct), Simon did not perform at native levels by age seven; but even there he did surpass his parents. (Newport and Aslin, 2000, p. 13)
Newport and Aslin state competing hypotheses that might explain this finding. The linguistic nativist hypothesis is "that children know, innately, that natural language morphology is deterministic, not probabilistic" (2000, p. 14), and that they regularize the inconsistent morphological stimuli of their parents' signing to conform with ULUs. But they consider one non-nativist alternative: that children have general cognitive mechanisms that "sharpen and regularize" inconsistent patterns in the stimuli.
Newport and Aslin elaborate the latter hypothesis in their discussion of Stewart
and Sarah. They note that the correct (native) ASL pattern was used by the parents
"with some moderate degree of consistency, while the errors are highly inconsistent"
(2000, p. 19). On the second hypothesis, Simon, Stewart, and Sarah have acquired ASL from the consistent patterns in their parents' overall inconsistent ASL use. This suggests that the overall stochastic information in the inconsistent stimulus is exploited in child language acquisition. That is, learning that is based on the rich stochastic information in the degraded, inconsistent, and messy ASL use of their parents is regularized by children's general stochastic learning mechanisms.16
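The contrast between probability matching and regularization can be put in a few lines of code. The following toy simulation is ours, not Newport and Aslin's model; the 0.70 input-consistency rate merely echoes the 65-75 percent range they report for Simon's parents, and the regularizer's 0.95 output rate is an arbitrary stand-in for near-native consistency.

# Probability matching vs. regularization over inconsistent input (toy model).
import random

random.seed(1)
p_correct_input = 0.70                 # parent uses the morpheme correctly ~70% of the time
stream = [random.random() < p_correct_input for _ in range(10_000)]

matching_rate = sum(stream) / len(stream)      # a probability matcher mirrors the input

majority = matching_rate > 0.5                 # a regularizer adopts the majority pattern
regularized_rate = 0.95 if majority else 0.05  # ...and applies it near-consistently

print(f"input consistency:    {p_correct_input:.2f}")
print(f"probability matcher:  {matching_rate:.2f}")    # ~0.70, like the parents
print(f"regularizing learner: {regularized_rate:.2f}") # ~0.95, like Simon

The empirical question the text raises is precisely which of these two output profiles children show, and whether the regularizing profile requires linguistic knowledge or only a general mechanism that sharpens moderately consistent patterns.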
These case studies do not, of course, refute the view stated in (1), that most of the acquisition of natural languages depends on unacquired or unlearned linguistic knowledge. But a clear moral is that without careful attention to the stochastic properties of the stimulus, the hypothesis that general cognitive mechanisms play a significant role in language acquisition is not ruled out. Perhaps because of the way Newport and Aslin's research has been publicized, the finding that Simon regularized over inconsistent input has been taken as clear support for linguistic nativism by means of a poverty of the stimulus argument. But this interpretation is premature. It looks as if the relative frequencies of patterns and inconsistencies in the overall stimulus are more important than the mere fact that it contains errors or is inconsistent.17 Children have unacquired and unlearned mechanisms that regularize partial patterns that they detect - whether linguistic or not. That (if true) is certainly an inherent fact about children that assists in their language acquisition; but it does not imply possession of ULUs.
5 Conclusion
We have argued that, as of today, to maintain that linguistic nativism has triumphed over non-nativism demands tacitly accepting at least one of three rather unappealing views.
The first is to accept that linguistic nativism is the negation of reductive empiricism
- that is, to depict all contemporary non-nativists as defenders of John Locke and
B. F. Skinner - and declare victory. But that will necessitate ignoring the content of
actual contemporary non-nativist theories.
The second is to take it on faith that one day an appropriate sense of "innate"
will be discovered that makes it a coherent and contentful theoretical term with all
the relevant specialized language acquisition mechanisms in its extension. But the
meanings of "innate" that are in current use in linguistics are not all empirically
equivalent, and the currently hypothesized mechanisms of language acquisition do
not fall under a definite concept.
And the third strategy is to posit parameters, set by triggering (in some sense of
"triggering"), for absolutely every parochial peculiarity of every natural language,
even lexical generalizations. But if the set of posited parameters tracks the set of
parochial features of natural languages, the theory is rendered vacuous as a theory
of language acquisition: instead of an explanation of how language is acquired we
get just a list of ways natural languages can differ.
None of these three strategies looks productive to us. But the defender of the claim that linguistic nativism has vanquished rival non-nativist views is in the unfortunate position of tacitly accepting at least one of them.
Acknowledgments
This work was supported in part by the Radcliffe Institute for Advanced Study at Harvard
University, where the authors completed the final version of the chapter. We are grateful to
Eve Clark, Shalom Lappin, Ivan Sag, and Stuart Shieber for useful discussions and correspondence; to Claire Creifield for highly intelligent copy-editing assistance; and to Jeff Elman, Mark Liberman,
Diana Raffman, J. D. Trout, and Robert Stainton for their valuable comments on an earlier
draft. No one mentioned here should be assumed to agree with the views we express, of course.
Notes
1 Before you charge us with being unfair, take a look at some quotes: "Chomsky's demonstration . . . is the existence proof for the possibility of a cognitive science" (Fodor, 1981, p. 258); "How can a system of ideas like this have succeeded in capturing the intellectual allegiance of so many educated people?" (Sampson, 1997, p. 159); "A glance at any textbook shows that . . . generative syntax has uncovered innumerable such examples" (Smith, 1999, p. 42); "Is the idea supposed to be that there is no (relevant) difference between my granddaughter, her pet kitten, a rock, a chimpanzee?" (Chomsky, quoted by Smith, 1999, pp. 169-70); "Her rhetorical stance . . . invites comparison with creationists' attacks on the hegemony of evolution" (Antony, 2001, p. 194, referring to Cowie, 1999). There is more wild-eyed stuff where this came from, and it is spouted by both sides.
2 Several commentators seem to have assumed this about Pullum and Scholz 2002, which is actually framed as an effort at stimulating nativists to present evidence that would actually count in favor of their view. Scholz and Pullum (2002) reply to several critics and try to make the goals clearer.
3 Of course, it's premature to celebrate a non-nativist victory too. Geoffrey Sampson's announcement in a newspaper article that nativism has collapsed (Sampson, 1999) is an example of excess at the other extreme.
4 We note that Hauser, Chomsky, and Fitch, 2002, claim that the core language faculty comprises just "recursion" and nothing else, apparently accepting such a pyrrhic nativism; but they are answered on this point in great detail by Pinker and Jackendoff 2005.
5 "The linguistic literature is full of arguments of this type" (Lightfoot, 1998, p. 585); "A glance at any textbook shows that half a century of research in generative syntax has uncovered innumerable such examples" (Smith, 1999, p. 42).
6 Notice, we take "linguistic nativism" to denote a claim, not just a strategy. Some psycholinguists clearly differ. Fodor and Crowther (2002), for example, think linguistic nativism is a methodology that "assumes everything to be innate that could be innate." This would presumably contrast with a non-nativist methodology that assumes everything to be acquired that could be acquired. But these are not the forms of linguistic nativism and non-nativism we address.
7 The principles and parameters approach is basically abandoned in the controversial recent development known as the "minimalist program"; see Pinker and Jackendoff 2005 for a critique from a standpoint that is decidedly skeptical but nonetheless clearly nativist.
8 "Parochial" here means varying between natural languages, rather than being true of all of them.
9 "Interesting" is used here in a sense stemming from formal language theory, where finite or specially gerrymandered classes are not interesting, but broad and mathematically natural classes such as the regular languages or the context-free languages are interesting. An excellent introduction to both the mathematics and the linguistic and psycholinguistic relevance can be found in Levelt 1974.
10 Janet Fodor (1989) wrestles with this issue, without arriving at a satisfying resolution. See Culicover (1999, p. 15) for remarks with which we agree: "Since human beings acquire both the general and the idiosyncratic, there must be a mechanism or mechanisms that can accommodate both . . . Even if we assume that the most general correspondences are instantiations of linguistic universals that permit only simple parametric variation, the question of how the rest of linguistic knowledge is acquired is left completely unexplored."
11 The SRN is a "three-layer feed-forward network - made up of the input, hidden, and output layers - augmented by a context layer." We should note that Prinz 2002 - a work which unfortunately we encountered only after this chapter was almost completed - describes Elman's work as showing "that a dumb pattern detector can pick up on structural relations" (p. 206). This seems overstated. Prinz seems unaware of the growing literature suggesting an alternative interpretation: that children are very capable and sophisticated learners of transition probabilities. Elman's computational models are particularly important in light of the discovery of children's stochastic learning capacities.
12 We're ignoring one complication, as other discussions generally do: if there is no main clause auxiliary verb, the auxiliary verb do is required.
13 There is actually a great deal to be made clear about just what the higher-order property of "structure-dependence" is. The statement in (9) is not universal: other languages do not form interrogative sentences in the same way as English. What could perhaps be universal is some metaprinciple about the form of suitable candidates for tentative consideration as principles of grammar. No one has ever really made this precise. We will ignore the matter here.
14 It is worth pointing out that there is a deep inconsistency in the nativist literature concerning the kind of stimulus that is relevant to showing that the stimulus for auxiliary inversion is impoverished. On the one hand, nativists often claim that only the characteristics of child-directed speech are relevant for child language acquisition, since children acquire language primarily from child-directed speech. On the other hand, it is often pointed out that in some cultures adults do not direct speech to children until they are verbally fluent, so ex hypothesi, in these cultures the relevant stimulus is not speech directed specifically toward children. The reason this is important is that how impoverished the stimulus is depends on what stimuli are counted as relevant. For an informed discussion see Clark 2003.
15 Fodor 1981 ignores stochastically based learning. Pessimistic results like those of Gold 1967 simply do not apply under the assumption that linguistic input is modeled as a stochastic process and not text (see Scholz, 2004). Elsewhere, Fodor claims that stochastic learning can do nothing but recapitulate the distributional properties of the input. Elman's SRN is a counterexample to that claim. However, this is not the place to reply to Fodor's criticisms of connectionism.
16 Prinz (2002, pp. 209-10) takes Singleton and Newport (2004) to show that "children can turn a moderate statistical regularity into a consistently applied rule." But Simon's regularization of morphological structure need not be seen as evidence for the acquisition of rules in the linguist's sense. Simon's accomplishment is to exhibit more statistical regularity than his input did. This does not of itself tell us that any rules were acquired (though they might have been).
17 There is more to be said about sign language, though we have insufficient space here. Findings about language acquisition from inconsistent stimuli have played an important role in research on cross-cohort syntactic change in Nicaraguan Sign Language, where there is iterated regularization across successive stages in the development of the language. Here part of what is being investigated is syntactic regularization across a range of different inconsistent stimuli. And this line of research promises to provide insights into creolization, another controversial topic relating to language acquisition (see Bickerton, 1984). Notice, inconsistency and error were not features of the input to the learner considered by Lewis and Elman; they assumed that the consistent and correct data lacked instances of one particular kind of sentence.
References
Culicover, P. W. (1999). Syntactic Nuts: Hard Cases, Syntactic Theory, and Language Acquisition. Oxford: Oxford University Press.
Elman, J. L. (2003). Generalization from sparse input. Proceedings of the 38th Annual Meeting of the Chicago Linguistic Society. Chicago: Chicago Linguistic Society.
- (2005). Connectionist models of cognitive development: Where next? Trends in Cognitive Sciences, 9, 111-17.
Fodor, J. A. (1981). The current status of the innateness controversy. In Representations: Philosophical Essays on the Foundations of Cognitive Science. Cambridge, MA: MIT Press.
Fodor, J. D. (1989). Learning the periphery. In R. J. Matthews and W. Demopoulos (eds.), Learnability and Linguistic Theory. Dordrecht: Kluwer.
Fodor, J. D. and Crowther, C. (2002). Understanding stimulus poverty arguments. The Linguistic Review, 19, 105-45.
Gibson, E. and Wexler, K. (1994). Triggers. Linguistic Inquiry, 25, 407-54.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447-74.
Gomez, R. L. (2002). Variability and detection of invariant structure. Psychological Science, 13, 431-6.
Griffiths, P. E. (2002). What is innateness? The Monist, 85, 70-85.
Hauser, M. D., Chomsky, N., and Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569-79.
Jain, S., Osherson, D., Royer, J. S., and Sharma, A. (1999). Systems That Learn (2nd edn.). Cambridge, MA: MIT Press.
Johnson, D. E. and Postal, P. M. (1980). Arc Pair Grammar. Princeton, NJ: Princeton University Press.
Kornblith, H. (1993). Inductive Inference and Its Natural Ground: An Essay in Naturalistic Epistemology. Cambridge, MA: MIT Press.
Laurence, S. and Margolis, E. (2001). The poverty of the stimulus argument. British Journal for the Philosophy of Science, 52, 217-76.
Levelt, W. J. M. (1974). Formal Grammars in Linguistics and Psycholinguistics (3 vols). The Hague: Mouton.
Lewis, J. D. and Elman, J. L. (2001). Learnability and the statistical structure of language: Poverty of stimulus arguments revisited. Proceedings of the 26th Annual Boston University Conference on Language Development. Somerville, MA: Cascadilla Press.
Li, M. and Vitanyi, P. (1997). An Introduction to Kolmogorov Complexity and Its Applications (2nd edn.). New York: Springer.
Lidz, J., Gleitman, H., and Gleitman, L. (2003). Understanding how input matters: Verb learning and the footprint of Universal Grammar. Cognition, 87, 151-78.
Lightfoot, D. (1998). Promises, promises: General learning algorithms. Mind and Language, 13, 582-7.
Lipton, P. (1991). Inference to the Best Explanation. New York: Routledge.
- (1998). Induction. In M. Curd and J. A. Cover (eds.), Philosophy of Science: The Central Issues. New York: Norton. (Reprint of a passage from Lipton, 1991.)
Marler, P. (1999). On innateness: Are sparrow songs "learned" or "innate"? In M. D. Hauser and M. Konishi (eds.), The Design of Animal Communication. Cambridge, MA: MIT Press.
Matthews, R. J. and Demopoulos, W. (eds.) (1989). Learnability and Linguistic Theory. Dordrecht: Kluwer.
Newport, E. L. and Aslin, R. N. (2000). Innately constrained learning: blending old and new approaches to language acquisition. In S. C. Howell, S. A. Fish, and T. Keith-Lucas (eds.), Proceedings of the 24th Annual Boston University Conference on Language Development. Somerville, MA: Cascadilla Press.
CHAPTER FIVE
The Case for Linguistic Nativism
Robert J. Matthews
Linguistic nativists hold that child-learners come to the language acquisition task equipped with certain domain-specific innate knowledge that enables them to succeed in this task in which they would otherwise fail. Many of these nativists further hold that this innate knowledge is in fact knowledge of certain grammatical principles true of all natural languages. These principles, the set of which they dub "universal grammar" (UG, for short), are said to constrain severely the class of possible natural languages, thereby making successful acquisition possible on the basis of the limited empirical evidence available to child-learners in the learning environment about the language to be learned.1 In the years since the demise of behaviorism in the late 1950s and early 1960s, linguistic nativism has gradually become the received view within cognitive science on matters concerning the innate contribution of the learner to language acquisition,2 though there continues to be significant empirical debate among nativists as to just what exactly is innate and how it is to be characterized.
There also continue to be a number of linguists, philosophers, and psychologists who either reject linguistic nativism out of hand in favor of what might be described as a broadly empiricist conception of language acquisition or else argue that a compelling case has yet to be made for linguistic nativism.3 These critics do not have anything that could be described as a reasonably well-developed alternative to the increasingly detailed models of child language acquisition presented by nativists and based on nativist principles. Rather they tend to focus critically on the arguments advanced in support of linguistic nativism, most notably on so-called poverty of the stimulus arguments (discussed below). These critics allege that for various reasons these arguments fail to establish the nativist conclusion that they are intended to establish; they argue that these arguments are formally invalid, or they fail to rise to a sufficient standard of proof, or they rest on certain dubious unstated assumptions, or they are crucially vague at critical points, or they depend on empirical premises that are either false or at least not empirically proven. So at the very least, according to these critics, the case for linguistic nativism has yet to be made, and empiricism in these matters is still a live option.
The arguments for linguistic nativism are certainly not apodictic, but, then, virtually no arguments for any claim of any importance in empirical science ever are. Nevertheless these arguments are considerably stronger than anti-nativist critics admit. The case for linguistic nativism is compelling - enough so that researchers are certainly justified in attempting to work out the details of a nativist account of language acquisition. Of course, many such details remain to be worked out, and currently accepted hypotheses about the learning environment, learning mechanisms, and what is acquired will undoubtedly suffer the fate of most other empirical scientific hypotheses, turning out to be at best only rough approximations of the truth. But the attempt to work out a nativist account is not the fool's errand that some empiricists have attempted to make it out to be. And if bets were being taken on how the
debate between nativists and anti-nativists will turn out, the smart money would be
on the nativists.
In the present paper we examine the case for linguistic nativism, focusing first on the so-called "poverty of the stimulus" arguments on which linguistic nativists commonly rely. We consider the reasons that anti-nativists find these arguments unconvincing, concluding that while anti-nativists typically hold these arguments to an unreasonably high standard of proof, they are right to complain that these arguments, as they are actually presented, are often lacking in crucial empirical detail. Without such detail, these arguments are best construed, we argue, as a kind of "demonstration" argument for linguistic nativism. We conclude our examination of the case for linguistic nativism by considering a kind of argument provided by formal learning theory that is arguably more empirically robust and thus less vulnerable to anti-nativist criticisms. We begin our discussion by providing some historical background for the current debate between nativists and anti-nativists.
of such a triangle, in both cases because we are innately endowed with that knowledge. Seventeenth- and eighteenth-century empiricists such as Locke, Berkeley, and Hume, for their part, argued that such knowledge could be, and in fact was, acquired on the basis of sensory experience alone, using the domain-general inductive learning mechanisms hypothesized by empiricists.
The issue separating rationalists and empiricists has never been, as some believe,
that empiricists failed to credit the mind with any innate structure.5 Empiricists clearly
assumed that the hypothesized perceptual apparatus and domain-general inductive
learning mechanisms were innate. But unlike rationalists, empiricists assumed that
this innate structure imposed no substantive restrictions on the knowledge that could
be acquired. Empiricists identified knowledge with complex ideas constructed out of sensory experience, and there were, as they saw it, no restrictions on the sorts of complex ideas that the hypothesized innate inductive mechanisms could cobble out of the deliveries of innate sensory mechanisms. Rationalist accounts, by contrast, denied that knowledge acquisition involved inductive generalization over the deliveries of the senses. They regarded knowledge acquisition as a non-inferential, brute causal process that mapped a course of sensory experience into a body of knowledge. On the rationalist account, sensory experience played a quite different role in knowledge acquisition than empiricists imagined: specific sensory experiences served to occasion specific innate knowledge that was latently present in the mind. This innate knowledge effectively constrained what one could possibly learn and thus know, for one could come to know only what was innately (and latently) present in the mind.
For all the polemics and spilling of ink that characterized the debate between seventeenth- and eighteenth-century rationalists and empiricists, the debate was ultimately inconclusive, both because the issues in dispute were not framed with sufficient precision, and because the relevant empirical evidence was not in hand. Neither party had anything like a concrete proposal for how we come to know what we know; indeed, neither party had a precise, empirically well-supported specification of what it is that is learned and hence known. As well, neither party had anything more than the glimmer of an idea of what aspects of sensory experience were relevant to the acquisition of specific sorts of knowledge. Thus, neither party was in a position to decide the crucial question of whether sensory experience in combination with inductive learning strategies was even in principle sufficient to account for the knowledge that we in fact have. But all this began to change with the advent of modern generative linguistics in the late 1950s and early 1960s. First, linguists developed a reasonably precise characterization of one particular domain of human knowledge, viz., what it is one knows when one knows a natural language.
Subsequently, developmental psycholinguists working largely within the generative linguistics tradition began to develop increasingly precise characterizations of the primary linguistic data available to the child-learner. Learning theorists were finally in a position to begin to address fruitfully the crucial question of whether, as empiricists claimed, domain-general learning strategies were sufficient to account for the ability of child-learners to acquire any natural language on the basis of their access to data, or whether, as rationalists (now calling themselves nativists) claimed, successful language acquisition required that child-learners come to the acquisition task equipped with certain innate domain-specific knowledge about the language they would learn, knowledge that would effectively constrain the class of languages that they could learn (on the basis of available primary linguistic data). Much of the discussion and debate, especially within linguistics, focused on so-called poverty of the stimulus arguments, which linguistic nativists such as Chomsky argued provided compelling empirical support for their position.
The conclusion of the argument, it should be noticed, is not that knowledge of these grammatical principles is innate; rather it is that in order to come to know what they do on the basis of such impoverished data, learners must come to the learning task with certain domain-specific knowledge (specifically, of grammatical principles) - knowledge that will enable them to acquire any natural language on the basis of the relevant, impoverished data for that language. The PoS argument itself is noncommittal as to whether this knowledge that the successful learner must bring to the learning task is innate, or whether, as some (e.g., Piaget) believe, it is acquired earlier, perhaps on the basis of nonlinguistic sensory experience.
The argument for linguistic nativism therefore involves more than the direct inference from the poverty of linguistic data to the innateness of linguistic knowledge. Rather the argument involves two steps: (i) a PoS argument from the poverty of linguistic data to the conclusion that the learner must come to the learning task with certain domain-specific knowledge about what is to be learned, and (ii) a second argument, not necessarily a PoS argument, to the effect that this antecedent domain-specific knowledge could not itself be learned and must therefore be innate. And if, as Chomskyans claim, this innate knowledge takes the form of a universal grammar (UG), i.e., a set of grammatical principles true of all possible natural languages, then there will have to be a third argument to the effect that (iii) this innate, domain-specific knowledge is properly characterized in terms of such universal principles.
Nativists, including Chomsky himself, have tended to focus almost exclusively on the first step in this argument for linguistic nativism, which establishes only that the learner must come to the learning task with certain domain-specific knowledge. They do this, not because they think that this is all there is to the argument for linguistic nativism, but rather because they think, correctly it seems, that recognizing the need for this domain-specific knowledge is the crucial step in the argument for linguistic nativism that empiricists have traditionally and stubbornly resisted. As long as empiricists remain convinced that language acquisition can be accounted for in terms of domain-general learning mechanisms, they will find any sort of nativism unmotivated. But if they can be brought to recognize the need for domain-specific knowledge, they will be forced, nativists assume, to face the question that will eventually drive them to nativism, namely, how could such knowledge possibly be acquired.7
Chomsky's own conclusion that the domain-specific knowledge that learners bring to the learning task is innate, i.e., that it is determined endogenously and not on the basis of sense experience, rests on the following argument from theoretical parsimony:
Chomsky presents just this argument from theoretical parsimony in his reply to Piaget, who famously held that the required constraints were "constructions of sensorimotor intelligence":
I see no basis for Piaget's conclusion. There are, to my knowledge, no substantive proposals involving "constructions of sensorimotor intelligence" that offer any hope of accounting for the phenomena of language that demand explanation. Nor is there any plausibility to the suggestion, so far as I can see. . . . The expectation that constructions of sensorimotor intelligence determine the character of a mental organ such as language seems to me hardly more plausible than a proposal that the fundamental properties of the eye or the visual cortex or the heart develop on this basis. Furthermore, when we turn to specific properties of this mental organ, we find little justification for any such belief, so far as I can see. (Chomsky, 1980b, pp. 36-7)
Other nativists have offered a different sort of argument for the claim that the domain-specific knowledge is innate.8 This argument, which might be dubbed the "argument from structure dependence," turns on sentence pairs such as the following:
(1) The man is wearing a jacket.
(2) Is the man wearing a jacket?
On the basis of such pairs, child-learners come to know that (2) is formed by "moving" the auxiliary "is" to the front of the sentence. But then when faced with the task of forming the polar interrogative corresponding to (3), they unerringly produce sentences such as (4), but never (5), despite the fact that they have, Chomsky claims, never heard sentences such as (4):
(3) The man who is smoking is wearing a jacket.
(4) Is the man who is smoking wearing a jacket?
(5) *Is the man who smoking is wearing a jacket?
Child-learners seemingly know that it is the main clause auxiliary that is to be moved, despite the fact that based solely upon the linguistic data available to them, viz., sentences such as (1) and (2), (5) is equally plausible. Chomsky concludes that the child-learner comes to the learning task knowing that grammatical rules and principles are structure dependent, so that in constructing the rule for polar interrogatives, the child-learner never entertains the possibility that the relevant rule is something like "move the first auxiliary to the front of the phrase," despite the fact that such a rule is consistent with the empirical data available to the child; instead, the child-learner presumes, correctly, that the rule is something like "move the main clause auxiliary to the front of the phrase," since this rule is consistent both with the available data and the knowledge that grammatical rules and principles are structure dependent.
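The difference between the two candidate rules is easy to exhibit mechanically. In the sketch below, the sentence and the hand-marking of the main-clause auxiliary are our own illustration: the point is that the structure-dependent rule needs information (which clause an auxiliary belongs to) that is not present in the bare word string.

# The two candidate auxiliary-fronting rules on a toy representation.
# Lowercase "is" = auxiliary inside the relative clause; uppercase "IS" =
# main-clause auxiliary (marking it presupposes structural knowledge).
decl = ["the", "man", "who", "is", "smoking", "IS", "wearing", "a", "jacket"]

def front_first_auxiliary(words):
    # Structure-independent rule: move the linearly first auxiliary.
    i = next(k for k, w in enumerate(words) if w.lower() == "is")
    return [words[i]] + words[:i] + words[i + 1:]

def front_main_clause_auxiliary(words):
    # Structure-dependent rule: move the main-clause auxiliary.
    i = words.index("IS")
    return [words[i]] + words[:i] + words[i + 1:]

print(" ".join(front_first_auxiliary(decl)))
# is the man who smoking IS wearing a jacket   <- the error children never make
print(" ".join(front_main_clause_auxiliary(decl)))
# IS the man who is smoking wearing a jacket   <- what children actually produce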
Other well-known examples of PoS arguments endeavor to show: (i) that child-learners of English must come to the acquisition task knowing the structure of complex noun phrases, based upon their understanding that an anaphoric pronoun such as the English "one" can have as its referent a complex noun phrase, despite the fact that linguistic data available to the learner may include only sentences in which the anaphoric "one" takes a noun as its referent (see Hornstein and Lightfoot, 1981); and (ii) that child-learners of English come to the acquisition task knowing the possible parametric variation exhibited by natural languages (as regards binding relations, word order, null subjects, and so on) as well as the default values of each of the parameters that define a language (see, e.g., Wexler, 1991; Gibson and Wexler, 1994).
Anti-nativist critics have challenged these PoS-based arguments for linguistic nativism on a number of different (but quite predictable) grounds. Some, like Piaget, have been prepared to concede the conclusion of the PoS argument to the effect that success in these learning tasks requires that the learner come to the task with certain domain-specific knowledge of language, but they deny that this antecedent knowledge need be innate. The relevant knowledge, they argue, could have been acquired elsewhere, on the basis of nonlinguistic experience. Others concede that the PoS arguments establish that the child-learner must come to the learning task with certain innate domain-specific biases, but they deny that these biases need take the form of innate knowledge. Rather child-learners are said to come to the learning task furnished with certain innate learning mechanisms, rather than knowledge, that in some manner impose certain domain-specific learning biases.10 Most anti-nativist critics, however, are not prepared simply to concede the conclusions of PoS arguments; they recognize the difficulty of blunting the nativist implications of these arguments once conceded. They challenge the PoS arguments themselves, arguing on one ground or another that the arguments fail to establish what they claim to establish, namely, that the child-learner comes to the learning task with certain domain-specific knowledge or biases that make language learning possible. For most anti-nativist critics, there is in fact no poverty of the stimulus, either because (i) the linguistic data upon which the learner acquires language is not as impoverished as nativists claim, or because (ii) the language acquired is not as complex as nativists claim, or both. Or if there is a poverty of the stimulus, it is one that (iii) a learner could remedy in ways other than by a domain-specific contribution on the part of the learner. Most critics adopt the first of these three positions, arguing that contrary to what nativists claim, the relevant evidence that would allow acquisition of the relevant knowledge is, as a
matter of empirical fact, available to the learner in the learning environment, so that at the very least nativists have not made a case for a poverty of the stimulus. Thus, for example, Cowie (1999) and Pullum and Scholz (2002) argue against the assumption, widely held by developmental psycholinguists, that child-learners acquire language on the basis of positive evidence only, i.e., on the basis of data drawn from the language to be acquired; they argue that at the very least there is what they call "indirect" negative evidence.11 With respect to Chomsky's PoS argument from polar interrogatives, they argue that contrary to what Chomsky claims, learners do in fact have access in the learning environment to sentences such as (4), and they present as evidence for this claim the fact that such sentences appear with some frequency in the Wall Street Journal. They don't actually establish the frequency with which such sentences appear in the linguistic corpus to which child-learners are exposed, much less that child-learners actually register and make use of such sentences as are available in the learning corpus.12 It is enough to discredit the argument, they assume, simply to show that the child-learner might have access to the relevant data. Anti-nativists also argue against PoS arguments by challenging nativist characterizations of what is learned, arguing that PoS arguments presume defective, or at least contentious, grammatical characterizations of what's learned, so that one can have no grounds for confidence in any nativist conclusions based on them.
The aim in virtually every case is to discredit PoS arguments by challenging the nativist's characterization of input data or output knowledge. Anti-nativists rarely claim to have proven the nativist's characterizations to be false, which of course would be a decisive refutation of their arguments; rather they claim, more modestly, to have shown that nativists have failed to shoulder the burden of proof to the anti-nativist's satisfaction. For all the nativist has shown, these anti-nativists argue, the characterizations of input data and output knowledge, and hence the conclusion of the PoS arguments, might be false, and this bare possibility, they argue, leaves open the possibility both that nativism is false and (consequently) that some form of empiricism is true.
Anti-nativist critics often write as if they imagine that PoS-based arguments are intended to convert the committed anti-nativist. This is simply not the case. It would be a fool's errand to undertake to convert anyone who holds PoS arguments to the unreasonable standard of having conclusions that are not possibly false, for no empirical argument can meet that standard. But even if PoS arguments are evaluated under some more reasonable standard of acceptability, anti-nativists will probably remain unconvinced, and for basically two reasons. First, as presented these arguments typically do not provide adequate empirical evidence in support of the crucial premises about what is acquired and the linguistic evidence on the basis of which it is acquired, thus permitting anti-nativists to question the truth of the premises. Second, in themselves these arguments provide no reason to suppose that their conclusions are empirically robust, in the sense of continuing to hold under "perturbations," i.e., different reformulations, of the premises that reflect our uncertainty about the relevant empirical facts. Put another way, given our uncertainty about the relevant empirical facts, anti-nativists question whether we can have any confidence in an argument based on any particular specification of these admittedly uncertain facts.
This lack of detailed accompanying empirical support for the premises, coupled with the lack of demonstrated empirical robustness of the conclusions, makes it difficult to regard these arguments, at least as they have been presented, as anything more than demonstration arguments, i.e., arguments that are intended to demonstrate the sort of reasoning that leads linguistic nativists to their view. Given the obvious theoretical and empirical complexity of the acquisition problem, specifically, the difficulty of specifying (i) what precisely is acquired in acquiring a language, (ii) the data on the basis of which whatever is acquired is acquired, and (iii) the cognitive processes, including any innate biases, that effect the acquisition - such arguments alone cannot make a compelling case for linguistic nativism. Indeed, they cannot make a compelling case even for the crucial claim that learners come to the learning task with certain domain-specific knowledge that makes successful acquisition possible. Nothing less than the following two theoretical developments would turn the trick: (i) a theoretically well-developed, empirically well-supported nativist account, i.e., one that makes essential use of nativist assumptions, of how child-learners acquire the languages that they do on the basis of their access to linguistic data, and (ii) the concomitant failure of empiricist efforts to develop a similarly theoretically well-developed, empirically well-supported account which does not make essential use of nativist assumptions.13
Although there has been considerable progress over the last 25 years in the development of nativist computational accounts of natural language acquisition, at present neither nativists nor empiricists have accounts that are sufficiently well developed theoretically and well supported empirically to bring final closure to the dispute over linguistic nativism. There has been a great deal of empirical work within generative linguistics to specify in precise terms what it is that a child-learner acquires when he or she acquires a natural language, and these specifications have become increasingly sensitive over the years to the obvious requirement that whatever is acquired must be the sort of thing that provably can be acquired on the basis of the child-learner's given access to data in the learning environment. There has also been considerable empirical work within developmental linguistics to specify precisely what data is available to learners regarding the specific linguistic constructions that they master. There has, at the same time, been a growing body of research in "formal learning theory" (discussed below) that attempts to integrate these specifications of what is learned and the data on the basis of which it is learned into a computationally explicit account of the cognitive processes that map the latter into the former.14 It is not possible to survey this work here, but suffice it to say that nativist assumptions underpin it at every turn (see Wexler, 1991). PoS-based considerations, for example, guide the development of the computational account of acquisition processes, suggesting to researchers the sorts of biases that have to be built into these processes if they are to succeed in their task. During this same period in which nativist assumptions have so dominated psycholinguistic research, nothing has emerged that could plausibly be described as even the beginnings of a non-nativist account of language acquisition. In the absence of such an account, anti-nativists have been reduced to playing the role of a loyal opposition (some would say fighting a rearguard action), criticizing nativist PoS arguments, objecting to specific nativist proposals, pointing out research results that might possibly favour a non-nativist account, and so on. Thus, for example, one finds many anti-nativist criticisms of the well-known poster examples of PoS arguments, criticisms that, as I said above, typically
focus on establishing that these demonstration arguments are not decisive disproofs of the possibility of an anti-nativist account (e.g., Cowie, 1999; Pullum and Scholz, 2002). One finds arguments to the effect that certain connectionist architectures, e.g., simple recurrent networks, hold promise as a way of explaining how learners might exhibit the learning biases that they do, though without the intervention of domain-specific antecedent knowledge (e.g., Elman et al., 1996, but see Sharkey et al., 2000). One similarly finds arguments to the effect that learners are able to compensate for the apparent poverty of the stimulus by employing certain stochastic procedures (again, Cowie, 1999; Pullum and Scholz, 2002). But as suggestive as these criticisms and results may be, they don't really add up to anything that suggests that empiricist accounts are a live option at this point. All this of course could change, but at this point in time there is simply no evidence of an impending empiricist renaissance. Perhaps the most that anti-nativists might reasonably hope for is that as the nativist research program is elaborated and modified in response to the usual theoretical and empirical pressures that drive research programs we might eventually reach a point where the resulting account of language acquisition becomes unrecognizable as either nativist or empiricist, at least as we presently understand these terms. In such event, the question of which view turned out to be correct will be moot, and anti-nativists might take some solace in the fact that nativism, as currently conceived, turned out not to be correct.15
Formal learning theory, as the name suggests, studies the learnability of different classes of formal objects (languages, grammars, theories, etc.) under different formal models of learning.16 The specification of such a model, which specifies in formally precise terms (i) a learning environment (i.e., the data which the learner uses to learn whatever is learned), (ii) a learning function, and (iii) a criterion for successful learning, determines (iv) a class of formal objects (e.g., a class of languages), namely, the class that can be acquired to the level of the specified success criterion by a learner implementing the specified learning function in the specified learning environment.17
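The components of such a model can be stated compactly. The following schematic rendering in Python is ours, meant only to fix ideas about what the three parameters are and what the success criterion quantifies over; it is not any particular system from the FLT literature.

# The three parameters of a formal learning model, schematically.
from typing import Callable, Iterator, Optional

Sentence = str
Language = frozenset                    # here a language is just a set of sentences

# (i) a learning environment: an infinite text enumerating the target language
Text = Iterator[Sentence]

# (ii) a learning function: maps each finite initial segment of the text to a
#      hypothesis about the language (or to no hypothesis yet)
Learner = Callable[[list[Sentence]], Optional[Language]]

# (iii) a success criterion, identification in the limit: past some finite
#       point the hypothesis is correct and never changes again. On a finite
#       record of hypotheses we can only check convergence so far:
def converged(hypotheses: list, target: Language) -> bool:
    # True if some tail of the recorded hypothesis sequence is constantly the target.
    return any(all(h == target for h in hypotheses[i:])
               for i in range(len(hypotheses)))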
Much of the early work in FLT concerned itself with extensions and generalizations of the so-called Gold paradigm, initiated by Gold's 1967 seminal paper "Language identification in the limit." In this paper Gold examined the learnability of different classes of formal languages on the basis of two different data formats, under a success criterion of strict identification in the limit. Gold proved a number of important learnability results, most famously an unsolvability theorem for text presentation whose interpretation and import for linguistic nativism has been the subject of continuing debate within cognitive science.18 Subsequent work in FLT has examined learning models that differ widely in their specification of all three parameters of the model (viz., learning environment, learning function, and success criterion). Formal learning-theoretic results typically compare models that differ only in their specification of one of the three parameters, showing that a class of languages learnable on one specification of the parameter in question is not learnable on a different specification of that same parameter. Many of the results are unsurprising: there are learnable classes of languages that cannot be learned by computable learning functions, that cannot be learned on noisy text (i.e., text that includes sentences drawn from the complement of the language to be acquired), that cannot be learned on incomplete text, and so on. But there are also some very surprising results, some of which refute very basic theoretical assumptions within psychology. (For example, there is a widely held assumption that a "conservative" learning-on-errors strategy, of abandoning a hypothesis only when it fails to explain the data, is not restrictive as regards what can be learned. FLT shows this to be false.)19
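A toy instance shows how identification in the limit on text works when the hypothesis space is suitably restricted. The three-language class below is invented for illustration and has no bearing on natural language; it merely exhibits the Gold-paradigm mechanics, including a conservative "guess the smallest consistent language" strategy, which happens to succeed on this inclusion chain.

# Identification in the limit on text, for a toy class of three finite languages.
import itertools

L1 = frozenset({"a"})
L2 = frozenset({"a", "ab"})
L3 = frozenset({"a", "ab", "abb"})
CLASS = [L1, L2, L3]                        # hypothesis space, ordered by inclusion

def learner(data_so_far):
    # Guess the smallest language in the class consistent with the data seen.
    seen = set(data_so_far)
    for L in CLASS:
        if seen <= L:
            return L
    return None

text_for_L2 = itertools.cycle(["a", "ab"])  # a text: an enumeration of L2's sentences

data, hypotheses = [], []
for sentence, _ in zip(text_for_L2, range(8)):
    data.append(sentence)
    hypotheses.append(learner(data))

print([sorted(h) for h in hypotheses])
# Once "ab" appears, the guess locks onto L2 and never changes: identification
# in the limit. Gold's theorem shows that no learner manages this for every
# text over rich classes such as the context-free languages.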
There is no direct or immediate application of these various learnability results to the current debate regarding linguistic nativism; in themselves these results do not make a case for linguistic nativism. But they do serve to map out the conceptual space within which any plausible theory of natural language acquisition must be articulated, and they do provide researchers familiar with these results with a pretty good sense, "intuitions" as mathematicians might put it, as to the learnability of particular classes of languages under different specifications of the three parameters that define a learning model. And it is here, arguably, that the FLT case for linguistic nativism begins to emerge.
From studying the learnability consequences of varying the three parameters that define a particular formal learning model, FLT theorists develop a rather clear understanding of how changes to the different parameters interact with one another to affect the learnability of broad classes of formal languages. As they study the learnability properties of what linguists take to be the class of possible natural languages, these theorists also develop a pretty clear sense of the sort of restrictions that must be imposed on that class (in the form of restrictions on the hypotheses that a learner
Conclusion
In conclusion, five points deserve emphasis. (i) The case for linguistic nativism finds independent support from both PoS arguments and FLT-based arguments. (ii) These arguments are not apodictic; like any argument based on empirical premises, they are only as good as their premises. But (iii) the preponderance of available evidence suggests that these arguments are generally sound and empirically robust, and hence (iv) child-learners do come to the learning task with antecedent domain-specific knowledge, and the most plausible explanation of how learners come by this knowledge is that it is innate. But (v) any final verdict on linguistic nativism must await the development of a theoretically well-developed, empirically well-supported account of natural language acquisition, one that satisfies the minimal adequacy condition imposed by FLT, namely, that the hypothesized learning procedures provably be able to acquire the class of natural languages, to the empirically appropriate success criterion, on the basis of the sort of evidence that child-learners in fact employ in the course of language acquisition.
Notes
4 See Chomsky, 1966 for his perspective on the seventeenth-century debate. See also Cowie 1999, pp. 3-66.
5 See Katz, 1967, pp. 240-68.
6 In an earlier formulation, Chomsky put the argument this way:
It seems plain that language acquisition is based on the child's discovery of what from a formal point of view is a deep and abstract theory - a generative grammar of his language - many of the concepts and principles of which are only remotely related to experience by long and intricate chains of quasi-inferential steps. A consideration of . . . the degenerate quality and narrowly limited extent of the available data . . . leave[s] little hope that much of the structure of the language can be learned by an organism initially uninformed as to its general character. (Chomsky, 1965, p. 58)
7 Failure to appreciate that PoS arguments are intended to establish only that successful learners must come to the learning task with certain domain-specific knowledge has led empiricist critics such as Cowie (1999) to complain that these arguments establish less than they are supposed to establish, namely that learners come to the learning task with certain innate knowledge of language. It has led nativist supporters such as Nowak et al. (2002) to conclude that by "innate" linguistic nativists really mean "before data," i.e., knowledge that learners bring with them to the learning task.
8 See, e.g., Matthews, 2001.
9 See, e.g., Chomsky, 1975, pp. 30-3; 1980a, pp. 114-15; 1980b, pp. 41-7; 1986, pp. 7-13; also Lightfoot, 1991, pp. 3-4 and Pinker, 1994, pp. 233-4. For a nuanced discussion of PoS arguments, see Wexler, 1991.
10 Whether there is a substantive issue here depends crucially on how, computationally speaking, such knowledge is realized. Most critics who have pressed this issue presume without argument a representationalist theory of propositional attitudes (cf. Fodor, 1987) according to which having an attitude toward some proposition (e.g., knowing that P) is a matter of having a mental representation with the propositional content P that plays the appropriate functional/causal role in cognition. This presumption is especially evident in the arguments of Elman et al. (1996) against the position that they term "representational nativism."
11 For a defense of the standard assumption, see Marcus, 1993.
12 For discussion, see Matthews, 2001 and Crain and Pietroski, 2002.
13 But even this might not be enough to bring the committed anti-nativist on board. If the history of science is any indication, many anti-nativists would remain unconverted. For as historians of science are fond of pointing out, scientific progress generally comes not through conversion of an entrenched scientific community to a new view, but through attrition: defenders of discredited theories simply die off, leaving the field to younger proponents of the replacing view.
14 Wexler and Culicover, 1980, and Berwick, 1985 are early examples of such work. More recent examples include Gibson and Wexler, 1994; Bertolo, 1995; Niyogi and Berwick, 1996; Sakas and Fodor, 2001; and Yang, 2002.
15 I am reminded here of the remark, cited by Hilary Putnam, of a famous Nobel laureate in chemistry who wryly noted that contrary to what historians of science always teach, the existence of phlogiston was never disproved, by Joseph Priestley or anyone else; on the contrary, phlogiston turned out to be valence electrons!
16 A class of languages is counted as learnable just in case every language in the class is learnable, to the specified success criterion, on the basis of the specified kind of data for that language.
17 For a general introduction, see Feldman, 1972; Valiant, 1984; Kelly, 1996; and Jain et al., 1999. For applications to the problem of natural language acquisition, see, e.g., Wexler and Culicover, 1980; Berwick, 1985; Matthews and Demopoulos, 1989; and Bertolo, 2001.
18 For a careful discussion of Gold's unsolvability result and its implications for the current nativist/empiricist debate, see Johnson (2004). See also Matthews, 1984; Demopoulos, 1989; and Nowak et al., 2002.
19 See Jain et al., 1999 and Osherson et al., 1984.
References
Bertolo, S. (1995). Maturation and learnability in parametric systems. Language Acquisition, 4, 277-318.
- (ed.) (2001). Language Acquisition and Learnability. Cambridge: Cambridge University Press.
Berwick, R. (1985). The Acquisition of Syntactic Knowledge. Cambridge, MA: MIT Press.
Chomsky, N. (1965). Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
- (1966). Cartesian Linguistics. New York: Harper & Row.
- (1975). Reflections on Language. New York: Pantheon.
- (1980a). Rules and Representations. New York: Columbia University Press.
- (1980b). On cognitive structures and their development: a reply to Piaget. In M. Piattelli-Palmarini (ed.), Language and Learning: The Debate between Jean Piaget and Noam Chomsky. Cambridge, MA: Harvard University Press.
- (1986). Knowledge of Language. Westport, CT: Praeger.
- (1988). Generative grammar: Its basis, development and prospects. Studies in English Linguistics and Literature, special issue. Kyoto: Kyoto University of Foreign Studies. (Reprinted as On the nature, acquisition, and use of language, in W. Lycan (ed.), Mind and Cognition. Oxford: Blackwell, 1990.)
Cowie, Fiona (1999). What's Within: Nativism Reconsidered. Oxford: Oxford University Press.
Crain, S. and Pietroski, P. (2002). Why language acquisition is a snap. The Linguistic Review, 19, 163-83.
Demopoulos, W. (1989). On applying learnability theory to the rationalism-empiricism controversy. In R. Matthews and W. Demopoulos (eds.), Learnability and Linguistic Theory. Dordrecht: Kluwer.
Elman, J., Bates, E., Johnson, M., Karmiloff-Smith, A., Parisi, D., and Plunkett, K. (1996). Rethinking Innateness. Cambridge, MA: MIT Press.
Feldman, J. (1972). Some decidability results on grammatical inference and complexity. Information and Control, 20, 244-62.
Fodor, J. (1987). Psychosemantics. Cambridge, MA: MIT Press.
Gibson, E. and Wexler, K. (1994). Triggers. Linguistic Inquiry, 25, 407-54.
Gold, E. M. (1967). Language identification in the limit. Information and Control, 10, 447-74.
Hornstein, N. and Lightfoot, D. (1981). Introduction. In Explanation in Linguistics: The Logical Problem of Language Acquisition. London: Longman.
Jain, S., Osherson, D., Royer, J. S., and Sharma, A. K. (1999). Systems That Learn (2nd edn.). Cambridge, MA: MIT Press.
Johnson, K. (2004). Gold's theorem and cognitive science. Philosophy of Science, 71, 571-92.
Katz, J. (1967). Philosophy of Language. New York: Harper & Row.
Kelly, K. (1996). The Logic of Reliable Inquiry. Oxford: Oxford University Press.
Lightfoot, D. (1991). How to Set Parameters. Cambridge, MA: MIT Press.
Marcus, G. (1993). Negative evidence in language acquisition. Cognition, 46, 53-85.
CHAPTER SIX
On the Innateness of Language
James McGilvray
Introduction
Barbara Scholz and Geoffrey Pullum in this volume (chapter 4, IRRATIONAL NATIVIST EXUBERANCE) review some of the current discussion on innateness of language. They express frustration at the nativist's lack of sympathy with what they call "sense-based" approaches to language learning. Specifically, they suggest looking at the studies of language learning found in two works, Newport and Aslin, 2000, and Lewis and Elman, 2001. Both claim a role for statistical analysis in language acquisition. That seems to be what Scholz and Pullum have in mind by "sense-based" approaches.
I want to emphasize the kinds of considerations that lead some of those who work on natural languages and their acquisition to adopt what I call, following Chomsky, a "rationalist" research program.1 Scholz and Pullum's suggestions help in this, for, I suggest, Newport and Aslin's work can be understood as a contribution to a rationalist project, while Lewis and Elman's has a very different, empiricist aim. Distinguishing the two reveals what is at stake.
The question of the innateness of language should, I think, be seen as a matter of choosing the most fruitful research program for the science of language. This is an empirical question, decided by which program yields the best theory,2 or naturalistic understanding of language. It is clear what Chomsky's rationalist choice was and is (Chomsky, 2000, 2005): treat linguistics as naturalistic scientific inquiry, using the same methodology as that of physics and biology, and think of language as a natural object, a kind of organ that grows under normal conditions to a state that the biophysical nature of that organ permits. That invites thinking of the innateness of language as a matter of internal biophysical machinery that sets an internally determined course of language growth or development for the child's mind. Biophysical machinery selects incoming signals according to its inbuilt criteria of relevance, "looking" for patterns and elements. It also sets an agenda for a normal
anywhere. Perhaps friends of SRNs can get SRNs to manage this, to some degree. But no matter how close they might come to convincing people that a device "speaks English" (Chinese, etc.), their effort has no demonstrable relevance to the issue of how a child actually does acquire a language.
coordinating what the child receives. The child's mind must have a biophysical system that "contains" the principles and options for any natural language. The child's
mind must have an internal data-selection and option-setting device.
Given sufficient time and devoted effort, a connectionist empiricist story might possibly be told about SRNs with external guidance exploiting whatever statistical distributions are to be found in an extremely large set of signals of an English-appearing sort, so as to eventually produce sets of behaviors that some judge or another (who decides who is to judge?) says shows that it has learned English. To the rationalist, that would be about as interesting as being told that someone has decided that such-and-such a program yields a machine that thinks. The rationalist sees the production of behaviors that "pass" as thinking as a matter of decision about usage of think. So too of learned English: unless it is demonstrated that children's developing minds are SRNs and that they do learn in the way the connectionist empiricist story says they do, it is of no interest to the natural science of language and of how language develops. It might interest software engineers who want to produce "talking" robots. But that is quite a different enterprise.
To someone trying to find out how children's minds actually acquire a language,
facts such as that they seem to employ statistical sampling in "looking for" local
minima in linguistic sounds and in tracking the extent of available data in setting
parametric options (Saffran et al., 1996; Yang, 2004) are interesting. The first
indicates that the mind has a device that seeks local minima in language-like signals. It
is interesting because only children's minds automatically recognize the patterns and
their relevance. Note that detection must be at least bimodal (auditory and spatial
for speech and sign), and must be somehow exploited by a growth control system
that brings the detected patterns to bear in the earliest forms of linguistic production
- among others, babbling (Petitto et al., 2000; Petitto, 2005). How is a statistical
sampling system - likely located in the superior temporal gyrus - made to perform
this task, keeping in mind that among organisms, only human minds seem to "look"
for the patterns and "know" how to put them to use? There must be some kind of
selection and control mechanism that is "prepared" for information of this sort, "seeks"
it and in this sense selects it, and quickly sets to putting it to use, in either speech
or sign, or both (Petitto, 2005).
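The kind of computation at issue can be made concrete with a small sketch. What follows illustrates transitional-probability segmentation in the spirit of Saffran et al., 1996; it is not a reconstruction of any model cited above, and the syllable stream and fixed two-character syllables are invented.

from collections import Counter

# Hypothetical stream: three nonsense "words" repeated without pauses.
# A boundary-finding device can exploit the fact that the transitional
# probability P(next syllable | this syllable) dips at word edges.
stream = "tupiro golabu bidaku golabu tupiro bidaku tupiro golabu".replace(" ", "")
syllables = [stream[i:i + 2] for i in range(0, len(stream), 2)]

pair_counts = Counter(zip(syllables, syllables[1:]))
first_counts = Counter(syllables[:-1])

for a, b in sorted(pair_counts):
    tp = pair_counts[(a, b)] / first_counts[a]   # estimate of P(b | a)
    edge = "  <- candidate word edge" if tp < 1.0 else ""
    print(f"{a} -> {b}: {tp:.2f}{edge}")

In this toy stream the within-word transitions are deterministic, so every dip below 1.0 marks a word edge; the signals infants face are far noisier, which is why the selection and control system described above has real work to do.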
Consider next that a child's mind sets parameters in a sequence that tracks the
extent of available data; that process too shows the mind "anticipating" what is
relevant. Children developing English sometimes produce constructions that English
does not allow. Their errors extinguish gradually, not with a sharp cutoff; that
indicates that something like the statistical distribution of data available in a child's
experience not only plays a role in "choosing," but in setting the order in which
parameters are set. Crucially, though, the process is not random; children's minds are
constrained to producing constructions available only in other natural languages.
A child developing English might at one stage of development (2 years and 7 months,
according to one study - Crain and Thornton, 1998) say What do you think what
pigs eat - fine in some dialects of German, but not in English. Or a child might
produce pro-drop or topic-drop constructions - fine under specific conditions in
some languages (Italian, Chinese), but ruled out of English (cf. Wang, 2004, p. 254).
The choice space seems to be closed. The only plausible way to make sense of that
as internal, not external, principles of growth regulation and data relevance. The nativist
says that we can tell a similar tale about language. With language, of course, several
courses of development are allowed - French, Swahili, Miskito, etc. So the internal
system must allow for options - described, in current work, by parameters. These
aside, and ignoring "Saussurean arbitrariness" with lexical items (not something a
naturalistic theory can deal with), grammar itself (largely Merge6) may well be uniform
and extremely simple (Hauser et al., 2002; Chomsky, 2005).
And finally, there is a problem with connectionist hardware and "wiring"
commitments. At this stage it is premature at best to adopt research programs that
assume that human brains are sufficiently like SRNs wired up in the way connectionist
empiricists insist so that we can take seriously their use in proving or even suggesting
that brains learn language in the way Lewis and Elman suggest. It is not an
accident that rationalist approaches to the science of language leave decisions about
what hardware/instantiation computes language aside until a future date - perhaps
the distant future. It is unwise to be confident at this stage about what the brain is,
not to mention how the brain is "wired up" to understand and produce language.
Basically, we really don't know very much - as indicated by the remarks about the
(typically) single-track system vision, where genetic control (Pax-6, Notch ...) and
who knows what physical developmental constraints seem to play roles, but no one
knows exactly which, or how, or when. Yes, we know something about brains and
wiring them up - although almost nothing about language in this regard. But we
know a considerable amount about language and how it develops.
These last points - perhaps especially the one about parallels to visual system growth
- give me an opportunity to emphasize that the advent of the principles and
parameters approach to grammar in the early 1980s and recent developments in
what Chomskyans call "minimalism" have eliminated with respect to "core grammar"
the prima facie implausibility of linguistic nativism. The implausibility arose from
early Chomskyan proposals - for example, those found in Chomsky, 1965. In them,
innate endowment for grammar was assumed to consist in a "format" for a possible
grammar and, to make acquisition possible in real time, it had to be assumed that
the child's mind was genetically endowed with a "rich and articulate" (Chomsky, 2005)
set of language-specific principles and elements that make up the format. Parameters
changed that. By the time of Hauser et al., 2002 and Chomsky, 2005, perhaps
all that must be language-specific and genetically specified for grammar is Merge -
the one clear case of a computational capacity that humans have that no other
organism does. Chomsky (2005) calls this the "first factor" in language growth.
Parameters might be written into our genes; but they might also be written into another
factor in language growth - what Chomsky (2005) calls the "third factor" that includes
general principles of data analysis and (perhaps most important) "principles of
structural architecture and developmental constraints" that canalize, set up organic
form and development, provide for aspects of efficient computation, etc. So if
"language specific" means "written into our genes" (first factor) and core grammar
is language specific and unique to us (Hauser et al., 2002), perhaps only Merge is
written into our genes. On the other hand, the entire process of language development
- that yields core grammar, concepts and sounds - is biophysically "guided";
I-languages are products of a process of growth. There must be input (the "second
factor"), of course. But what, when, and how input contributes is determined by an
internal biophysical system (flfSt and third factors UG+).7
But is UG+ a single system : is it an internal biophysical selection and agenda
setting device? The set of biophysically possible human visual systems is fIxed. The
set of biophysically possible human languages is too. If you think a person's vision
is the product of a biophysical system of growth, you should think a person's 1language is " the product of a single system too.
=
systems inside the head. Syntactic theory says what this information is, and describes
the means by which an internal organ can yield (compute) this information in
the form of potentially infinite sets of sentences. In current terminology, linguistic
meanings (semantic information) are SEMs; linguistic sounds are PHONs. All the
information in SEMs and PHONs is intrinsic to these linguistic representations - it is
"inside" a PHON or SEM. So the syntactic theory of language provides a syntactic theory
of meaning in Goodman's sense - a syntactic classification of what a sentence
provides at SEM. This internalist syntactic program is ambitious: it aims to
characterize all possible sounds, meanings, and structures and how they are
integrated in all the heads of all speakers of natural languages, those who were and now
are, and those who might be. It does that by saying what languages are, and how
they grow.
If one is careful, this notion of representation that is non-re-presenting can also
be extended to the idea that computations are represented in some biophysical
state(s)/event(s) in the mind. That is, they are embodied in whatever we discover to
be the "machinery" that carries out a derivation/computation. The linguist constructing
a naturalistic theory of language describes the various states and possible
configurations of this machinery - presumably biophysical machinery - with a (computational)
theory of linguistic syntax. This seems to be what Chomsky has in mind when
he speaks of a theory of natural languages as biology (plus - as Chomsky, 2005,
emphasizes - various background physical and information-processing factors that
constrain and guide language growth) at an abstract level of description.
This points to how to view natural-language "ideas," information, etc., as innate.
They are contained in a system that grows and develops according to its own agenda.
The innateness of I-languages - the information concerning merging elements,
recursion, structures, sounds, and meanings - is the innateness of linguistic syntax
and the factors that serve as selectors and guide its growth. It is the innateness of
a biophysical system - on which, see above. The overall theory of linguistic growth
- UG+ - "says" what a biophysically possible human language is. What a particular
person has in his or her head at a specific stage of development - his or her
I-language - is likely to be unique to him or her: lexical items, parametric settings,
etc., yield any number of different I-languages. Yet the theory of growth says what
any one of them can be.
An informal way to make sense of this notion of innateness is found in Ralph
Cudworth's work (1688). Descartes's notion of innateness as something like a
congenital disease is too passive, as is the terminology of dispositions. Cudworth, a
seventeenth-century Cambridge Platonist, did better. He said that the human mind
has an "innate cognoscitive power" that, when "occasioned" by sense, manufactures the
required idea or concept, perhaps thereafter placing it in a memory store. He suggested
thinking of this power as anticipating (he spoke of "prolepsis") what is needed for that
occasion. This is a useful metaphor for UG+ and its development into a specific
relatively steady state, an I-language: the growing child's language faculty anticipates
its possible future states in that the powers of the faculty consist in selecting
relevant data, allowing various options, and determining the potential developed states.
A UG+-type theory describes the powers involved in acquiring and producing
linguistic sounds, meanings, and their complexes - expressions. UG+ "says" what
some mysterious, innate faculty of the mind, perhaps beyond the reach of scientific
inquiry. Or it is language itself, and language is innate. It must be one of these -
and surely the latter - otherwise children could not display linguistic creativity and
factors that depend on it at such an early age. If they had to "learn" language
in the way Morris et al. (2000) suggest, they would have to spend a lot of time in
regimented training sessions. Creativity - were it to arise at all - would come very
late indeed.
There is also advice in favor of linguistic internalism, and a modular form of it
at that. The cognitive perspectives the child uses are not tied to input, either
external or internal. They must be freely provided by a modular system that operates
without external control. And the study of them, if it is to be accomplished at all,
must be offered in terms of the internal system that provides them. If objectivity is
characteristic of scientific inquiry, scientific study of language had better try to offer
an account of how it can readily arise in the mind of any child, how it reaches a
relatively steady state - an I-language - and how it "computes" sentences. There is
little alternative but to adopt a research strategy that constructs a biophysical
UG+-type theory that deals with all human languages.
can put to use. Consider words - sound-meaning pairs. Linguistic sounds have
qualities and features that it takes special training and theoretical terminology to
characterize and understand, and yet infants manage to distinguish "Englishy" sounds
from Japanese (and others) quickly. Concepts/meanings have structure and detail revealed
in the ways people use and understand language, although they have never been
instructed in any of it. One of the more interesting cases in this respect is the
concept WATER - a philosopher's favorite, often appearing in identity sentences with
H2O. My point has nothing to do with the fact that our concept WATER - the one(s)
we express in our natural languages - has no single scientific counterpart,12 although
that is true and belies the (misguided) hope of finding the meaning (content) of water
in some supposed "natural" object H2O. It is that WATER, the concept we have, is a
complex with structure and richness we have only begun to appreciate (for discussion,
see Moravcsik, 1975; Chomsky, 1995, 2000). WATER is sensitive to matters of
agency: put a tea bag in water, and it becomes tea. It is sensitive to source: if from
a lake, a river, or even a tap, it's water. Tap water is water, even though it has been
filtered through tea. Heavily polluted, if in a river, it's water: the St Lawrence is filled
with it (a quirky feature of RIVER - why isn't the river the water, rather than being
filled with water?), supposedly - although the water there includes an unhealthy
proportion of other substances too. And so on. WATER is a single example; there are
thousands more.
I mentioned early in this chapter that PoS observations are almost always
misunderstood; now I can explain what I mean. They are not an argument; they present
a problem for the theorist. PoS observations do not stand to theories that try to explain
them as arguments and assumptions stand to conclusions. They commend to theorists
an internalist, nativist research strategy, rather than an empiricist account like
the connectionist story that Scholz and Pullum would like to tell. It is almost a
commonplace that Chomsky offers a poverty of the stimulus "argument for the
innateness of language." But when he speaks of PoS observations as constituting an
argument - as in his discursive Rules and Representations (1980) - he speaks informally.
Like Descartes in "Comments on a certain broadsheet" (to which Chomsky, 1980, referred
when speaking of an "argument from the poverty of the stimulus"), he is saying
that an internalist, nativist program looks more fruitful than an empiricist effort.
"Language is innate" is not something that the scientist offers as an hypothesis, much
less tests.13 The only scientific hypothesis at issue is the (naturalistic, scientific)
theory that - prompted by what the CALU and PoS observations suggest - one
constructs to describe and explain and make sense of these observed facts, plus any
number of other facts having to do with grammaticality, the character of use, other
languages, truth indications, language impairment . . . - an expanding and changing
set as the theory develops.
The would-be scientist of language or any other domain tries what Peirce called
"abduction" or hypothesis-formation. Abduction is not argument. People now are fond
of speaking of an inference to the best explanation, but "inference" fares little better
than "argument" : the mystery is relocated to "inference." If scientists had reliable
inferential principles to follow when they constructed hypotheses, they would have
discovery procedures. If they did, we would either be done by now, or have found
that our principles are useless. Instead, we proceed incrementally, making jumps here
and there. The process of constructing and refining a theory goes on - usually for
centuries - with many contributing. And we feel justified in taking that course of
research if it looks like there is improvement.
There is, however, an empirical issue. For language, it is whether, having adopted
the internalist nativist strategy characteristic of rationalist (internalist nativist) views
of language and its study, one gets successful scientific theories. On this, surely the
record of the last fifty years or so indicates that the answer is yes. The internalist
and nativist science of language that results does not, of course, explain language
use. But the CALU observations indicate that it would be unwise to try.
Summary
In sum, rationalist nativism is a nativism of "machinery" - biophysical machinery
of language growth and syntactic machinery of I-languages. Rationalists attempt
to construct a natural science of that machinery. They want to get to know how
children acquire languages and what I-languages are. The rationalist (nativist,
internalist) research program seems to be on the right track; it speaks to the facts and
aims to meet the standards of natural science. And it has made progress. While there
are many outstanding issues - coming to understand exactly what is language-specific,
for example, and developing an account of just how languages grow - we have reached
the point that we can now begin to address the question of why language is the way
it is ("principled explanation," Chomsky, 2005).
There are no obvious challenges to the rationalist program. Newport and Aslin
(2000) contributes to it. Lewis and Elman's (2001) empiricist (anti-nativist,
externalist) program seems to be devoted to trying to show that there is a different way to
produce language-like behavior. Perhaps there is. But what they propose is not
relevant to the science of human language.
Acknowledgments
I am very grateful to Rob Stainton for comments on an early draft and to Sam Epstein, Paul
Pietroski, and Steve McKay for comments on others.
Notes
1 A rationalist program is an empirical program that maintains that acquisition of rich
cognitive capacities such as language depends on innate resources. Rationalists are typically
also - with a few exceptions, such as Fodor - internalists. Rationalist programs contrast with
anti-nativist and externalist empiricist programs. For discussion, see Chomsky, 1966/2002;
McGilvray, 1999; Hornstein, 2005.
2 Best by the standards of successful scientific projects: these are descriptively and
explanatorily adequate to their domains, simple, formal, allow for accommodation to other
sciences, and make progress in all these respects.
3 The commitments of connectionist views of language, data (usage), mind, learning, and the
role of the experimenter are not obvious in Lewis and Elman, 2001. They are in Morris et al.,
2000: children "learn grammatical relations over time, and in the process accommodate
to whatever language-specific behaviors . . . [their] target language exhibits." Further: "From
beginning to end this is a usage-based acquisition system. It starts with rote-acquisition
of verb-argument structures, and by finding commonalities, it slowly builds levels of
abstraction. Through this bottom-up process, it accommodates to the target language." Taking
apart all the commitments in these passages is a project for a much longer paper.
Children never "front" the fIrst is in the sentence The dog that is shedding is under the table
to form a question. They never say Is the dog that shedding is under the table; they always
say Is the dog that is shedding under the table. Chomsky often uses examples like this to
emphasize that children's minds must implicitly know the principle(s) that govern the rele
vant phenomena.
5 A related point: SRNs are designed by experimenters to "solve" virtually any pattern-
acquisition "problem"; they are made to be plastic, an enduring trait of empiricist
doctrine. Because of this, they can acquire non-natural language patternings as easily
- or not - as natural. Children acquire natural languages easily; they do not acquire
non-natural languages easily. Machinery in their minds "anticipates" natural languages, not
other kinds of system. It excludes other possibilities. For example, children's minds assign
The billionaire called the representative from Alberta readings on which the call or the
representative is from Alberta, but excludes a reading on which the billionaire is from
Alberta (Pietroski, 2002).
6 Intuitively, Merge is a structure-building operation. It has two forms, External Merge and
Internal Merge. One can think of the former as an operation that takes two lexical items
and concatenates them, puts them together, or joins them; essentially, one starts with
{x} and {y} and gets {x, y}. The second puts one item of the set "at the edge"; it replaces
what Chomsky used to call "Move." External Merge yields sentential argument structure;
Internal yields various discourse effects. (A toy rendering of the two operations follows
these notes.)
7 In case concepts and sounds look to you to be fertile areas for empiricist (externalist)
acquisition stories, see the next two sections and Chomsky, 1966/2002, 2000 and
McGilvray, 1998, 1999, 2005. That said, a lot needs to be done with concepts before
anyone can plausibly say that as natural scientists they know what they are.
8 The externalist nativist must deal with two problems - establishing connections to external
contents and dealing with the fact that reference/denotation is a form of human free
action, not a naturalistic "locking" (Fodor, 1998). Moving contents inside the head but
outside the linguistic representation avoids the agent control problem. But if the nativist
continued to hold that contents were located somewhere else in the head (in Fodor's
language of thought, perhaps), s/he would still have to deal with the problem of establishing
connections. Chomskyan nativists locate their "contents" in linguistic representations
themselves, avoiding both problems.
9 That said, this object and the "information" it contains constrain its possible uses and
what would count as its correct uses. Meaning guides ("instructs") use, in this sense.
10 In their (2002) discussion of PoS "arguments for innateness," Scholz and Pullum seem
to accept without qualm (2002, p. 31) that the informational "content" of many words
such as house is easily learned at an early age. They should be worried by that: word
acquisition offers any number of examples of easy and apparently automatic acquisition
of rich forms of packaged "information" - phonological and semantic. On that, see Chomsky,
2000 and McGilvray, 1998, 1999, 2005. For a good description of lexical acquisition
timing, although little on sounds and meanings themselves, see Gleitman and Fisher,
2005.
11 That is, core competence: they have fully developed I-languages, but haven't contended
with all the irregular constructions and their root vocabularies are smaller than most adults'.
12 There are various states of H2O as the scientist conceives it. See, for example, Ruan et
al., 2004; Wernet et al., 2004; and Zubavicus and Grunze, 2004.
13 In Chomsky, 1980, to which Scholz and Pullum refer for what they take to be Chomsky's
canonical view of the "argument from the poverty of the stimulus," the term "argument
from the poverty of the stimulus" is introduced by reference to Descartes and his view
that you can't find colors, things with the structure of our ideas, triangles, etc. "in the
world" in the form you find them in our heads, so they must be contributed by what we
have in our heads. The rest of the discussion proceeds in a similarly informal fashion.
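As promised in note 6, here is a toy rendering of the two forms of Merge in ordinary set notation; it is an informal sketch, not a fragment of any actual minimalist grammar, and the lexical items are invented.

def external_merge(x, y):
    # External Merge: take two syntactic objects and form the set {x, y}.
    return frozenset([x, y])

def atoms(so):
    # Collect the lexical atoms contained anywhere in a syntactic object.
    if isinstance(so, frozenset):
        return {a for part in so for a in atoms(part)}
    return {so}

def internal_merge(so, item):
    # Internal Merge: re-merge material already inside the object "at the
    # edge" - roughly what Chomsky used to call "Move."
    assert item in atoms(so), "Internal Merge re-uses existing material"
    return frozenset([item, so])

vp = external_merge("eat", "what")     # {eat, what}
cp = external_merge("did", vp)         # {did, {eat, what}}
q = internal_merge(cp, "what")         # {what, {did, {eat, what}}}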
References

Lewis, J. D. and Elman, J. (2001). Learnability and the statistical structure of language:
Poverty of stimulus arguments revisited. Proceedings of the 26th Annual Boston University
Conference on Language Development, 359-70. (Electronic version at
http://crl.ucsd.edu/elman/Papers/morris.pdf.)
McGilvray, J. (1998). Meanings are syntactically individuated and found in the head. Mind
and Language, 13, 225-80.
- (1999). Chomsky: Language, Mind, and Politics. Cambridge: Polity.
- (2005). Meaning and creativity. In J. McGilvray (ed.), The Cambridge Companion to
Chomsky. Cambridge: Cambridge University Press.
Moravcsik, J. (1975). Aitia as generative factor in Aristotle's philosophy of language. Dialogue,
14, 622-36.
Morris, W., Cottrell, G., and Elman, J. (2000). A connectionist simulation of the empirical
acquisition of grammatical relations. In S. Wermter and R. Sun (eds.), Hybrid Neural Systems
Integration. Heidelberg: Springer-Verlag. (Pagination is from the electronic version at
http://crl.ucsd.edu/elman/Papers/morris.)
Newport, E. and Aslin, R. (2000). Innately constrained learning: blending old and new
approaches to language acquisition. In S. Howell, S. Fish, and T. Keith-Lucas (eds.),
Proceedings of the Annual Boston University Conference on Language Development, 24, 1-21.
Onuma, Y., Takahashi, S., Asashima, M., Kurata, S., and Gehring, W. J. (2002). Conservation
of Pax 6 function and upstream activation by Notch signaling in eye development of frogs
and flies. Proceedings of the National Academy of Sciences, 99/4, 2020-5.
Petitto, L. (2005). How the brain begets language. In J. McGilvray (ed.), The Cambridge Companion
to Chomsky. Cambridge: Cambridge University Press.
Petitto, L., Zatorre, R., Gauna, K., Nikelski, E. J., Dostie, D., and Evans, A. C. (2000). Speech-
like cerebral activity in profoundly deaf people while processing signed languages: implications
for the neural basis of all human language. Proceedings of the National Academy of
Sciences, 97/25, 13961-6.
Pietroski, P. (2002). Meaning before truth. In G. Preyer and G. Peter (eds.), Contextualism in
Philosophy. Oxford: Oxford University Press.
Pietroski, P. and Crain, S. (2002). Why language acquisition is a snap. Linguistic Review, 19,
63-83.
- and - (2005). Innate ideas. In J. McGilvray (ed.), The Cambridge Companion to Chomsky.
Cambridge: Cambridge University Press.
Ruan, C.-Y., Lobastov, V. A., Vigliotti, F., Chen, S., and Zewail, A. H. (2004). Ultrafast electron
crystallography of interfacial water. Science, 304, 80-4.
Saffran, J. R., Aslin, R. N., and Newport, E. L. (1996). Statistical learning by 8-month-old infants.
Science, 274, 1926-8.
Scholz, B. and Pullum, G. (2002). Empirical assessment of stimulus poverty arguments.
Linguistic Review, 19, 9-50.
Thompson, D'Arcy W. (1917). On Growth and Form. Cambridge: Cambridge University Press.
Wernet, Ph., Nordlund, D., Bergmann, U., et al. (2004). The structure of the first coordination
shell in liquid water. Science, 304, 995-9.
Yang, C. D. (2002). Knowledge and Learning in Natural Language. New York: Oxford University
Press.
- (2004). Universal Grammar, statistics, or both? Trends in Cognitive Sciences, 8/10, 1-6.
Zubavicus, Y. and Grunze, M. (2004). New insights into the structure of water with ultrafast
probes. Science, 304, 974-6.
CHAPTER SEVEN
Bounded and Rational
Gerd Gigerenzer
At first glance, Homo sapiens is an unlikely contestant for taking over the world.
"Man the wise" would not likely win an Olympic medal against animals in wrestling,
weight lifting, jumping, swimming, or running. The fossil record suggests that Homo
sapiens is perhaps 400,000 years old, and is currently the only existing species of
the genus Homo. Unlike our ancestor, Homo erectus, we are not named after our
bipedal stance, nor are we named after our abilities to laugh, weep, and joke. Our
family name refers to our wisdom and rationality. Yet what is the nature of that
wisdom? Are we natural philosophers equipped with logic in search of truth? Or
are we intuitive economists who maximize our expected utilities? Or perhaps moral
utilitarians, optimizing happiness for everyone?
Why should we care about this question? There is little choice, I believe. The nature
of sapiens is a no-escape issue. As with moral values, it can be ignored yet will
nonetheless be acted upon. When psychologists maintain that people are
unreasonably overconfident and fall prey to the base rate fallacy or to a litany of other
reasoning errors, each of their claims is based on an assumption about the nature
of sapiens - as are entire theories of mind. For instance, virtually everything that
Jean Piaget examined, the development of perception, memory, and thinking, is depicted
as a change in logical structure (Gruber and Voneche, 1977). Piaget's ideal image of
sapiens was logic. It is not mine.
Disputes about the nature of human rationality are as old as the concept of
rationality itself, which emerged during the Enlightenment (Daston, 1988). These
controversies are about norms, that is, the evaluation of moral, social, and intellectual judgment
(e.g., Cohen, 1981; Lopes, 1991). The most recent debate involves four sets of scholars,
who think that one can understand the nature of sapiens by (i) constructing as-if
theories of unbounded rationality, (ii) constructing as-if theories of optimization
under constraints, (iii) demonstrating irrational cognitive illusions, or (iv) studying
ecological rationality. Being engaged in this controversy, I am far from dispassionate,
and have placed my bets on ecological rationality. Yet I promise that I will try to
be as impartial as I can.
Unbounded rationality
The demon's nearest relative is a being with "unbounded rationality" or "full
rationality." For an unboundedly rational person, the world is no longer fully predictable,
that is, the experienced world is not deterministic. Unlike the demon, unboundedly
rational beings make errors. Yet it is assumed that they can find the optimal (best)
strategy - that is, the one that maximizes some criterion (such as correct predictions,
monetary gains, or happiness) and minimizes error. The seventeenth-century French
mathematicians Blaise Pascal and Pierre Fermat have been credited with this more
modest view of rationality, defined as the maximization of the expected value, later
changed to the maximization of expected utility by Daniel Bernoulli (Hacking, 1975;
Gigerenzer et al., 1989). In unbounded rationality, optimization (such as maximization)
replaces determinism, whereas the assumptions of omniscience and unlimited
computational power are maintained. I will use the term "optimization" in the
following way:
a perfect one, for it can lead to errors). To refer to a strategy as optimal, one
must be able to prove that there is no better strategy (although there can be
equally good ones).
Because of their lack of psychological realism, theories that assume unbounded
rationality are often called as-if theories. They do not aim at describing the actual
cognitive processes, but are only concerned with predicting behavior. In this program of
research, the question is: If people were omniscient and had all the necessary time
and computational power to optimize, how would they behave? The preference for
unbounded rationality is widespread. This is illustrated by some consequentialist
theories of moral action, which assume that people consider (or should consider) the
consequences of all possible actions for all other people before choosing the action
with the best consequences for the largest number of people (Williams, 1988). It is
illustrated by theories of cognitive consistency, which assume that our minds check
each new belief for consistency with all previous beliefs encountered and perfectly
memorized; theories of optimal foraging, which assume that animals have perfect
knowledge of the distribution of food and of competitors; and economic theories that
assume that actors or firms know all relevant options, consequences, benefits, costs,
and probabilities.
an ecological view of rationality (see next section), which was revolutionary in
thinking about norms, not just behavior (Simon, 1956; Selten, 2001; Gigerenzer, 2004b).
all factors given the time constraint of a few seconds, but would know the optimal
formula to compute the trajectory given these constraints. These are as-if theories.
Real humans perform poorly in estimating the location where the ball will strike the
ground (Saxberg, 1987; Babler and Dannemiller, 1993). Note also that as-if theories
have limited practical use for instructing novice players or building a robot player.
The heuristics-and-biases program might respond with an experiment demonstrating
that experienced players make systematic errors in estimating the point where the
ball lands, such as underestimating the distance to the player. Tentatively, these errors
might be attributed to the player's overconfidence bias or optimism bias (underestimating
the distance creates a false certainty that one can actually catch a ball). The
demonstration of a discrepancy between judgment and norm, however, does not lead to
an explanation of how players actually catch a ball. One reason for this is that the
critique is merely descriptive, that is, what players can do, and does not extend to
the norm, that is, what players should do. The norm is the same - to compute the
trajectory correctly, requiring knowledge of the causal factors.
Yet it is time to rethink the norms, such as the ideal of omniscience. The
normative challenge is that real humans do not need a full representation and unlimited
computational power. In my view, humans have an adaptive toolbox at their
disposal, which may contain heuristics that can catch balls in a fast and frugal way.
Thus, the research question is: Is there a fast and frugal heuristic that can solve the
problem? Experiments have shown that experienced players in fact use several
heuristics (e.g., McLeod and Dienes, 1996). One of these is the gaze heuristic, which works
only when the ball is already high up in the air:
Gaze heuristic: Fixate your gaze on the ball, start running, and adjust your
speed so that the angle of gaze remains constant.
The angle of gaze is between the eye and the ball, relative to the ground. The gaze
heuristic ignores all causally relevant factors when estimating the ball's trajectory.
It attends to only one variable: the angle of gaze. This heuristic belongs to the class
of one-reason decision making heuristics. A player relying on this heuristic cannot
predict where the ball will land, but the heuristic will lead him to that spot. In other
words, computing the trajectory is not necessary; it is not an appropriate norm. The
use of heuristics crosses species borders. People rely on the gaze heuristic and related
heuristics in sailing and flying to avoid collisions with other boats or planes; and bats,
birds, and flies use the same heuristics for predation and pursuit (Shaffer et al., 2004).
To repeat, the gaze heuristic consists of three building blocks: fixate your gaze on
the ball, start running, and adjust running speed. These building blocks can be part
of other heuristics, too.
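To make the control loop concrete, here is a minimal simulation of the heuristic as stated above. The projectile physics, the gain, the starting positions, and the speed cap are illustrative assumptions, not parameters from the studies cited.

dt = 0.05                                 # simulation step (s)
g = 9.81
ball_x, ball_y = 0.0, 0.0
ball_vx, ball_vy = 12.0, 18.0             # a hypothetical hit; lands near x = 44 m
player_x = 40.0                           # player starts short of the landing spot
target = None                             # gaze "slope" to be held constant

while ball_y >= 0.0:
    ball_x += ball_vx * dt
    ball_vy -= g * dt
    ball_y += ball_vy * dt
    if target is None and ball_vy < 0:    # ball is now high up and falling
        target = ball_y / (player_x - ball_x)
    if target is not None and ball_y > 0:
        d = max(player_x - ball_x, 0.1)   # horizontal distance to the ball
        error = ball_y / d - target       # gaze angle too high? then back up
        speed = max(-8.0, min(8.0, 20.0 * error))   # cap running speed (m/s)
        player_x += speed * dt

print(f"ball lands near x = {ball_x:.1f} m; player ends at x = {player_x:.1f} m")

The point of the sketch is that the player never estimates where the ball will come down; holding one perceptual variable roughly constant is enough to bring him near the right spot.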
case, to track a moving object against a noisy background. It is easy for humans to
do this; 3-month-old babies can already hold their gaze on moving targets (Rosander
and Hofsten, 2002). Tracking objects, however, is difficult for a robot; a computer
program that can track objects as well as a human mind can does not yet exist. Thus,
the gaze heuristic is simple for humans but not for today's generation of robots. The
standard definition of optimization as computing the maximum or minimum of a
function, however, ignores the "hardware" of the human brain. In contrast, a heuristic
exploits hard-wired or learned cognitive and motor processes, and these abilities make
it simple. This is the first reason why fast and frugal heuristics can, in the real world,
be superior to some optimization strategy.
Situatedness: heuristics exploit structures of environments. The rationality of
heuristics is not logical, but ecological. The study of the ecological rationality of a
heuristic answers the normative question concerning the environments in which a
heuristic will succeed and in which it will fail. It specifies the class of problems
a given heuristic can solve (Martignon and Hoffrage, 1999; Goldstein et al., 2001).
Ecological rationality implies that a heuristic is not good or bad, rational or
irrational per se, but only relative to an environment. It can exploit certain structures
of environments or change them. For instance, the gaze heuristic transforms the
complex trajectory of the ball in the environment into a straight line.
Note that as-if optimization theories - because they ignore the human mind - are
formulated more or less independently of the hardware of the brain. Any computer
can compute the maximum of a function. Heuristics, in contrast, exploit the specific
hardware and are dependent on it. Social heuristics, for instance, exploit the evolved
or learned abilities of humans for cooperation, reciprocal altruism, and identification
(Laland, 2001). The principles of embodiment and situatedness are also central for
"New AI" (Brooks, 2002). For an introduction to the study of fast and frugal heuristics,
see Payne et al., 1993; Gigerenzer et al., 1999; and Gigerenzer and Selten, 2001.
works in environments that change slowly, but not in environments under rapid change
(Boyd and Richerson, 2001). Ecological rationality can also be analyzed in
quantitative terms: a heuristic will make 80 percent correct predictions in environment E,
and requires only 30 percent of the information.
The ecological rationality of a heuristic is conditional on an environment.
Content-blind norms, in contrast, are defined without consideration of any environment.
Ecological rationality is comparative or quantitative; it is not necessarily about the
best strategy. This provides it with an advantage: ecological rationality can be
determined in all situations where optimization is out of reach, and one does not need
to edit the problem so that optimization can be applied. In what follows, I describe the
ecological rationality of three classes of heuristics. For a more extensive analysis, see
Gigerenzer et al., 1999; Goldstein et al., 2001; and Smith, 2003.
Recognition heuristic
Daniel Goldstein and I asked American and German students the following question
(Goldstein and Gigerenzer, 2002):
Which city has more inhabitants: San Diego or San Antonio?
Sixty-two percent of Americans answered correctly: San Diego. The Germans knew
little of San Diego, and many had never heard of San Antonio. What percentage of
the more ignorant Germans found the right answer? One hundred percent. How can
people who know less make more correct inferences? The answer is that the Germans
used the recognition heuristic:
If you recognize one city, but not the other, infer that it has the larger
population.
The Americans could not use this heuristic. They knew too much. The Americans had
heard of both cities, and had to rely on their recall knowledge. Exploiting the wisdom
in partial ignorance, the recognition heuristic is an example of ignorance-based
decision making. It guides behavior in a large variety of situations: rats choose food
they recognize on the breath of a fellow rat and tend to avoid novel food; children
tend to approach people they recognize and avoid those they don't; teenagers tend
to buy CDs of bands whose name they have heard of; adults tend to buy products
whose brand name they recognize; participants in large conferences tend to watch
out for faces they recognize; university departments sometimes hire professors by
name recognition; and institutions, colleges, and companies compete for a place in
the public's recognition memory through advertisement.
Like all heuristics, the recognition heuristic works better in certain environments than
in others. The question of when it works is the question of its ecological rationality:
The recognition heuristic is ecologically rational in environments where the
recognition validity α is larger than chance: α > 0.5.
The validity α is defined as the proportion of cases where a recognized object has
a higher criterion value (such as population) than the unrecognized object, for a
given set of objects. This provides a quantitative measure for ecological rationality.
For instance, α is typically around 0.8 for inferring population (Goldstein and
Gigerenzer, 2002), 0.7 for inferring who will win a Grand Slam tennis match (Serwe
and Frings, 2004), and 0.6 for inferring disease prevalence (Pachur and Hertwig, 2004).
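Both the heuristic and its validity are easy to state exactly. The following sketch uses invented recognition data and made-up population figures (chosen so that San Diego comes out larger, as in the quiz), and computes α as defined above.

from itertools import combinations

population = {"San Diego": 1.3, "San Antonio": 1.2,   # millions; invented figures
              "Berlin": 3.4, "Bielefeld": 0.3}
recognized = {"San Diego", "Berlin"}                  # a hypothetical German student

def recognition_heuristic(a, b):
    # If exactly one of the two objects is recognized, infer that it has
    # the larger value; otherwise the heuristic does not apply.
    if (a in recognized) != (b in recognized):
        return a if a in recognized else b
    return None

# Recognition validity: the proportion of applicable pairs in which the
# recognized object really has the higher criterion value.
pairs = [(a, b) for a, b in combinations(population, 2)
         if recognition_heuristic(a, b) is not None]
hits = sum(population[recognition_heuristic(a, b)] == max(population[a], population[b])
           for a, b in pairs)
print(recognition_heuristic("San Antonio", "San Diego"))   # San Diego
print(f"alpha = {hits / len(pairs):.2f}")                  # 1.00 for these toy data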
Search rule: Search through cues in order of their validity. Look up the cue
values of the cue with the highest validity first.
Stopping rule: If one object has a positive cue value and the other does not
(or is unknown), then stop search and proceed to Step 3. Otherwise exclude
this cue and return to Step 1. If no more cues are found, guess.
Decision rule: Predict that the object with the positive cue value has the higher
value on the criterion.
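Read as an algorithm, the three rules amount to a few lines of code. Here is a minimal sketch; the cue profiles are invented, the cues are assumed to be listed in order of validity already, and 1, 0, and None stand for positive, negative, and unknown cue values.

import random

def take_the_best(cues_a, cues_b):
    # Search rule: go through the cues in order of validity (highest first).
    for a, b in zip(cues_a, cues_b):
        # Stopping rule: stop at the first cue on which one object is
        # positive and the other is negative or unknown.
        if a == 1 and b != 1:
            return "A"               # Decision rule: the positive object wins.
        if b == 1 and a != 1:
            return "B"
    return random.choice(["A", "B"])  # no cue discriminates: guess

city_a = [1, 1, 0]                    # hypothetical cue profiles for two cities
city_b = [1, None, 1]
print(take_the_best(city_a, city_b))  # the second cue decides: "A"

One-reason decision making in this sense means the inference rests on the single discriminating cue; the remaining cues are never looked up.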
Wj >
L Wk
k >j
<
logzN,
where M and N are the number of cues and objects, respectively (Martignon and
Hoffrage, 2002). An example of scarce information is a sample with 30 objects
measured on 5 cues or predictors (\log_2 30 < 5). Related environmental structures are
discussed in Hogarth and Karelaia, 2005. Consistent with these results, Take The Best
and other one-reason decision making heuristics have been proven to be, on
average, more accurate than multiple regression in making various economic, demographic,
environmental, and health forecasts, as well as in the prediction of heart attacks (Green
and Mehr, 1997; Czerlinski et al., 1999). Todorov (2003) showed that Take The Best
predicted the outcomes of basketball games during the 1996/97 NBA season as
accurately as Bayes's rule did, but with less information. Chater et al. (2003) demonstrated
that Take The Best matched highly complex computational methods such as a three-
layer feedforward connectionist network, Quinlan's (1993) decision tree algorithm,
and two exemplar-based models, Nearest Neighbor and Nosofsky's (1990)
Generalized Context Model. When the environment had scarce information, specifically when
the training set included less than 40 percent of all objects, Take The Best was more
accurate than any of these computationally expensive strategies. The effectiveness of
one-reason decision making has been demonstrated, among others, in the diagnosis
of heart attacks (Green and Mehr, 1997) and in the simple heuristic for prescribing
antibiotics to children (Fischer et al., 2002).
1 the observer and the demonstrators of the behavior are exposed to similar
environments, such as social systems;
2 the environment is stable or changing slowly rather than quickly;
3 the environment is noisy and consequences are not immediate, that is, it is
hard or time-consuming to figure out whether a choice is good or bad, such
as which political or moral system is preferable (Boyd and Richerson, 1985;
Goldstein et al., 2001).
In environments where these conditions do not hold, copying the behavior of the
majority can lead to disaster. For instance, copying the production and distribution
systems of traditional firms can be detrimental when an economy changes from local
to globalized.
Cognitive Luck
In this volume, Matheson discusses the study of ecological (bounded) rationality as
a way to overcome the epistemic internalism of the Enlightenment tradition. But he
raises a worry:
If cognitive virtue is located outside the mind in the way that the Post-Enlightenment
Picture suggests, then it turns out to be something bestowed on us by features of the
world not under our control: it involves an intolerable degree of something analogous
to what theoretical ethicists call "moral luck" . . . [cf. Williams, 1981; Nagel, 1993]
- "cognitive luck," we might say. (Matheson, chapter 8, p. 143)
This worry is based on the assumption that internal ways to improve cognition are
under our control, whereas the external ones are not.
This assumption, however, is often incorrect, and reveals a limit of an
internalist view of cognitive virtue. I conjecture that changing environments can in fact
be easier than changing minds. Consider the serious problem of innumerate
physicians, as illustrated by screening for colorectal cancer. A man tests positive on
the FOB (fecal occult blood) test and asks the physician what the probability is that
he actually has cancer. What do physicians tell that worried man? We (Hoffrage
and Gigerenzer, 1998) gave experienced physicians the best estimates of base rate
(0.3 percent), sensitivity (50 percent), and false positive rate (3 percent), and asked
them to estimate the probability of colorectal cancer given a positive test. Their
estimates ranged between 1 percent and 99 percent. If patients knew about this
variability, they would be rightly scared.
This result illustrates a larger problem: When physicians try to draw a conclusion
from probabilities, their minds typically cloud over (Gigerenzer, 2002). What can be
done to correct this? An internalist might recommend training physicians to use Bayes's
rule in order to compute the posterior probability. In theory, this training should work
wonders, but in reality, it does not. One week after students successfully passed such
a course, for instance, their performance was already down by 50 percent, and it
continued to fade away week by week (Sedlmeier and Gigerenzer, 200 1 ) . Moreover,
the chance of convincing physicians to take a statistics course in the flfSt place is
almost nil ; most have no time or little motivation, while others believe they are incur
ably innumerate. Are we stuck for eternity with innumerate physicians? No. In the
ecological view, thinking does not happen simply in the mind, but in interaction
between the mind and its environment. This adds a second, and more effIcient, way
to improve the situation: to edit the environment. The relevant part of the environ
ment is the representation of the information, because the representation does part
of the Bayesian computation. Natural (non-normalized) frequencies are such an effIcient
representation; they mimic the way information has been encountered before the advent
of writing and statistics, throughout most of human evolution. For instance : 3 0 out
of every 1 0,000 people have colorectal cancer, 1 5 of these will have a positive test; of
the remaining people without colorectal cancer, 300 will still have a positive test.
When we presented the numerical information in natural frequencies as opposed to
conditional probabilities, then the huge variability in physicians' judgments dis
appeared. They all gave reasonable estimates, with the majority hitting exactly on
the Bayesian posterior of about 5 percent, or 1 in 20.
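The arithmetic behind these figures is worth making explicit. The following sketch simply recomputes the natural-frequency tally above and checks it against Bayes's rule applied to the probabilities given earlier.

base_rate = 0.003        # 0.3 percent have colorectal cancer
sensitivity = 0.50       # 50 percent of those test positive
false_alarm = 0.03       # 3 percent of the healthy also test positive

# Natural frequencies: out of 10,000 people ...
sick = 10_000 * base_rate                     # 30 with cancer
true_pos = sick * sensitivity                 # 15 of them test positive
false_pos = (10_000 - sick) * false_alarm     # about 300 healthy positives

posterior = true_pos / (true_pos + false_pos)
print(f"P(cancer | positive test) = {posterior:.3f}")   # about 0.048, i.e. 1 in 20

# The same result from Bayes's rule with conditional probabilities:
bayes = (base_rate * sensitivity) / (
    base_rate * sensitivity + (1 - base_rate) * false_alarm)
assert abs(posterior - bayes) < 1e-12

The representation does the computational work: with whole-number frequencies the posterior is a single division, whereas the probability format invites exactly the confusion the physicians displayed.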
Similarly, by changing the environment, we can make many so-called cognitive
illusions largely disappear (Gigerenzer, 2000), enable fifth and sixth graders to solve
Bayesian problems before they even heard of probabilities (Zhu and Gigerenzer,
forthcoming), and help judges and law students understand DNA evidence (Hoffrage et al.,
2000). Thus, an ecological view actually extends the possibilities to improve judgment,
whereas an internal view limits the chances. To summarize, the term "cognitive luck"
only makes sense from an internalist view, where luck in fact refers to the theory's
ignorance concerning the environment, including the social environment. From an
ecological view, environmental structures, not luck, directly influence cognition and
can be designed to improve it. Cognitive virtue is, in my view, a relation between
a mind and its environment, very much like the notion of ecological rationality
(see also Bishop, 2000).
References
Anderson, J. R. (1990). The Adaptive Character of Thought. Hillsdale, NJ: Erlbaum.
Arrow, K. J. (2004). Is bounded rationality unboundedly rational? Some ruminations. In
M. Augier and J. G. March (eds.), Models of a Man: Essays in Memory of Herbert A. Simon.
Cambridge, MA: MIT Press.
Babler, T. G. and Dannemiller, J. L. (1993). Role of image acceleration in judging landing
location of free-falling projectiles. Journal of Experimental Psychology: Human Perception
and Performance, 19, 15-31.
Bishop, M. A. (2000). In praise of epistemic irresponsibility: How lazy and ignorant can you
be? Synthese, 122, 179-208.
Boyd, R. and Richerson, P. J. (1985). Culture and the Evolutionary Process. Chicago: University
of Chicago Press.
- and - (2001). Norms and bounded rationality. In G. Gigerenzer and R. Selten (eds.), Bounded
Rationality: The Adaptive Toolbox. Cambridge, MA: MIT Press.
Brooks, R. (2002). Robot: The Future of Flesh and Machines. London: Penguin Books.
Chater, N., Oaksford, M., Nakisa, R., and Redington, M. (2003). Fast, frugal, and rational: How
rational norms explain behavior. Organizational Behavior and Human Decision Processes,
90, 63-86.
Cohen, L. J. (1981). Can human irrationality be experimentally demonstrated? Behavioral and
Brain Sciences, 4, 317-70.
Cosmides, L. (1989). The logic of social exchange: Has natural selection shaped how humans
reason? Studies with the Wason selection task. Cognition, 31, 187-276.
Cosmides, L. and Tooby, J. (1992). Cognitive adaptations for social exchange. In J. H. Barkow,
L. Cosmides, and J. Tooby (eds.), The Adapted Mind: Evolutionary Psychology and the Generation
of Culture. New York: Oxford University Press.
Czerlinski, J., Gigerenzer, G., and Goldstein, D. G. (1999). How good are simple heuristics? In
G. Gigerenzer, P. M. Todd, and the ABC Research Group, Simple Heuristics that Make Us
Smart. New York: Oxford University Press.
Daston, L. (1988). Classical Probability in the Enlightenment. Princeton, NJ: Princeton
University Press.
Dawkins, R. (1989). The Selfish Gene (2nd edn.). Oxford: Oxford University Press.
Dugatkin, L. A. (1992). Sexual selection and imitation: Females copy the mate choice of
others. The American Naturalist, 139, 1384-9.
Fillenbaum, S. (1977). Mind your p's and q's: the role of content and context in some uses of
and, or, and if. Psychology of Learning and Motivation, 11, 41-100.
Fischer, J. E., Steiner, F., Zucol, F., et al. (2002). Use of simple heuristics to target macrolide
prescription in children with community-acquired pneumonia. Archives of Pediatrics and
Adolescent Medicine, 156, 1005-8.
Gigerenzer, G. (1996). On narrow norms and vague heuristics: A reply to Kahneman and Tversky
(1996). Psychological Review, 103, 592-6.
- (2000). Adaptive Thinking: Rationality in the Real World. New York: Oxford University
Press.
- (2001). Content-blind norms, no norms, or good norms? A reply to Vranas. Cognition, 81,
93-103.
- (2002). Calculated Risks: How to Know When Numbers Deceive You. New York: Simon
and Schuster. (Published in UK as Reckoning with Risk: Learning to Live with Uncertainty.
London: Penguin Books.)
- (2004a). Fast and frugal heuristics: The tools of bounded rationality. In D. Koehler and
N. Harvey (eds.), Handbook of Judgment and Decision Making. Oxford: Blackwell.
- (2004b). Striking a blow for sanity in theories of rationality. In M. Augier and J. G.
March (eds.), Models of a Man: Essays in Memory of Herbert A. Simon. Cambridge, MA: MIT
Press.
Gigerenzer, G. and Hug, K. (1992). Domain-specific reasoning: Social contracts, cheating, and
perspective change. Cognition, 43, 127-71.
Gigerenzer, G. and Selten, R. (eds.) (2001). Bounded Rationality: The Adaptive Toolbox.
Cambridge, MA: MIT Press.
Gigerenzer, G., Swijtink, Z., Porter, T., Daston, L., Beatty, J., and Krüger, L. (1989). The Empire
of Chance: How Probability Changed Science and Everyday Life. Cambridge: Cambridge
University Press.
Gigerenzer, G., Todd, P. M., and the ABC Research Group (1999). Simple Heuristics that Make
Us Smart. New York: Oxford University Press.
Gilovich, T., Vallone, R., and Tversky, A. (1985). The hot hand in basketball: On the
misconception of random sequences. Cognitive Psychology, 17, 295-314.
Goldstein, D. G. and Gigerenzer, G. (2002). Models of ecological rationality: The recognition
heuristic. Psychological Review, 109, 75-90.
Goldstein, D. G., Gigerenzer, G., Hogarth, R. M., et al. (2001). Group report: Why and when
do simple heuristics work? In G. Gigerenzer and R. Selten (eds.), Bounded Rationality: The
Adaptive Toolbox. Cambridge, MA: MIT Press.
Green, L. and Mehr, D. R. (1997). What alters physicians' decisions to admit to the coronary
care unit? Journal of Family Practice, 45, 219-26.
Gruber, H. E. and Voneche, J. J. (1977). The Essential Piaget. New York: Basic Books.
Hacking, I. (1975). The Emergence of Probability. Cambridge: Cambridge University Press.
Hoffrage, U. and Gigerenzer, G. (1998). Using natural frequencies to improve diagnostic
inferences. Academic Medicine, 73, 538-40.
Hoffrage, U., Lindsay, S., Hertwig, R., and Gigerenzer, G. (2000). Communicating statistical
information. Science, 290, 2261-2.
Hogarth, R. M. and Karelaia, N. (2005). Ignoring information in binary choice with
continuous variables: When is less "more"? Journal of Mathematical Psychology, 49, 115-25.
Kahneman, D. and Frederick, S. (2002). Representativeness revisited: Attribute substitution in
intuitive judgment. In T. Gilovich, D. Griffin, and D. Kahneman (eds.), Heuristics and Biases:
The Psychology of Intuitive Judgment. New York: Cambridge University Press.
Kahneman, D. and Tversky, A. (1982). On the study of statistical intuitions. In D. Kahneman,
P. Slovic, and A. Tversky (eds.), Judgment Under Uncertainty: Heuristics and Biases.
Cambridge: Cambridge University Press.
- and - (1996). On the reality of cognitive illusions: A reply to Gigerenzer's critique.
Psychological Review, 103, 582-91.
Kahneman, D., Slovic, P., and Tversky, A. (eds.) (1982). Judgment Under Uncertainty:
Heuristics and Biases. Cambridge: Cambridge University Press.
Krueger, J. I. and Funder, D. C. (2004). Towards a balanced social psychology: Causes,
consequences, and cures for the problem-seeking approach to social behavior and cognition.
Behavioral and Brain Sciences, 27, 313-27.
Laland, K. (2001). Imitation, social learning, and preparedness as mechanisms of bounded
rationality. In G. Gigerenzer and R. Selten (eds.), Bounded Rationality: The Adaptive Toolbox.
Cambridge, MA: MIT Press.
Lopes, L. L. (1991). The rhetoric of irrationality. Theory and Psychology, 1, 65-82.
Martignon, L. and Hoffrage, U. (1999). Why does one-reason decision making work? A case
study in ecological rationality. In G. Gigerenzer, P. M. Todd, and the ABC Research Group,
Simple Heuristics that Make Us Smart. New York: Oxford University Press.
- and - (2002). Fast, frugal and fit: Lexicographic heuristics for paired comparison. Theory and
Decision, 52, 29-71.
McLeod, P. and Dienes, Z. (1996). Do fielders know where to go to catch the ball or only how
to get there? Journal of Experimental Psychology: Human Perception and Performance, 22,
531-43.
Nagel, T. (1993). Moral luck. In D. Statman (ed.), Moral Luck. Albany, NY: State University of
New York Press.
Nosofsky, R. M. (1990). Relations between exemplar-similarity and likelihood models of
classification. Journal of Mathematical Psychology, 34, 393-418.
Pachur, T. and Hertwig, R. (2004). How Adaptive is the Use of the Recognition Heuristic? Poster
session presented at the Annual Meeting of the Society for Judgment and Decision Making,
Minneapolis, MN.
Payne, J. W., Bettman, J. R., and Johnson, E. J. (1993). The Adaptive Decision Maker. Cambridge:
Cambridge University Press.
Piattelli-Palmarini, M. (1994). Inevitable Illusions: How Mistakes of Reason Rule our Minds.
New York: Wiley.
Quinlan, J. R. (1993). C4.5: Programs for Machine Learning. Los Altos, CA: Morgan Kaufmann.
Rips, L. J. (1994). The Psychology of Proof: Deductive Reasoning in Human Thinking.
Cambridge, MA: MIT Press.
- (2002). Circular reasoning. Cognitive Science, 26, 767-95.
Rosander, K. and Hofsten, C. von (2002). Development of gaze tracking of small and large
objects. Experimental Brain Research, 146, 257-64.
Samuels, R., Stich, S., and Bishop, M. (2004). Ending the rationality wars: How to make
disputes about human rationality disappear. In R. Elio (ed.), Common Sense, Reasoning and
Rationality. New York: Oxford University Press.
Sargent, T. J. (1993). Bounded Rationality in Macroeconomics. New York: Oxford University
Press.
Saxberg, B. V. H. (1987). Projected free fall trajectories: I. Theory and simulation. Biological
Cybernetics, 56, 159-75.
Sedlmeier, P. and Gigerenzer, G. (2001). Teaching Bayesian reasoning in less than two hours.
Journal of Experimental Psychology: General, 130, 380-400.
Selten, R. (2001). What is bounded rationality? In G. Gigerenzer and R. Selten (eds.), Bounded
Rationality: The Adaptive Toolbox. Cambridge, MA: MIT Press.
Serwe, S. and Frings, C. (2004). Predicting Wimbledon. Poster presented at the 4th Summer
Institute for Bounded Rationality in Psychology and Economics, Berlin.
Shaffer, D. M., Krauchunas, S. M., Eddy, M., and McBeath, M. K. (2004). How dogs navigate
to catch frisbees. Psychological Science, 15, 437-41.
Simon, H. A. (1956). Rational choice and the structure of environments. Psychological Review,
63, 129-38.
- (1990). Invariants of human behavior. Annual Review of Psychology, 41, 1-19.
Smith, V. L. (2003). Constructivist and ecological rationality in economics. The American Economic
Review, 93, 465-508.
Stigler, G. J. (1961). The economics of information. Journal of Political Economy, 69, 213-25.
Todorov, A. (2003). Cognitive procedures for correcting proxy-response biases in surveys. Applied
Cognitive Psychology, 17, 215-24.
Trivers, R. L. (1971). The evolution of reciprocal altruism. Quarterly Review of Biology, 46,
35-57.
- (2002). Natural Selection and Social Theory: Selected Papers of Robert Trivers. New York:
Oxford University Press.
Tversky, A. and Kahneman, D. (1983). Extensional versus intuitive reasoning: The conjunction
fallacy in probability judgment. Psychological Review, 90, 293-315.
Wald, A. (1950). Statistical Decision Functions. New York: Wiley.
Wason, P. C. (1966). Reasoning. In B. M. Foss (ed.), New Horizons in Psychology. London:
Penguin Books.
Wason, P. C. and Johnson-Laird, P. N. (1972). Psychology of Reasoning: Structure and Content.
Cambridge, MA: Harvard University Press.
Williams, B. (1981). Moral Luck. Cambridge: Cambridge University Press.
Gerd Gigerenzer
- ( 1 988). Consequentialism and integrity. In S. Scheffler (ed.), Consequentialism and its Critics.
New York: Oxford University Press. (Reprinted from B. Williams and J. J. C. Smart,
Utilitarianism: For and Against. Cambridge: Cambridge University Press, 1 973).
Wundt, W. ( 1 9 1 2/ 1 973). An Introduction to Psychology (R. Pintner, trans.). New York: Arno.
Zhu, 1. and Gigerenzer, G. (forthcoming). Children can solve Bayesian problems: The role of
representation in computation. Cognition.
CHAPTER EIGHT

Bounded Rationality and Cognitive Virtue

David Matheson

Introduction
Broadly speaking, a virtue is a feature of something that enables it to perform well
along a relevant dimension of evaluation. Sympathy and courage are moral virtues
because, all else being equal, the sympathetic and courageous tend to act morally
more often than the cold-hearted and cowardly. Ease of handling and fuel efficiency
are automotive virtues because automobiles that possess them are better suited to
serve the typical interests of drivers than those that do not. Clarity and relevance are
virtues of communication because their absence impedes the flow of information.
A cognitive virtue pertains to the mental processes or mechanisms whereby we
acquire our representations of (e.g., beliefs and judgments about) the world: it is a
feature of these mechanisms that aids in the felicitous acquisition of representations.
But what precisely is the dimension of cognitive evaluation picked out by "felicitous"
here? Along one dimension, the immediate end of cognition is just the acquisition
of numerous representations of the world, and the exercise of our representational
mechanisms is evaluated simply in terms of how many representations they generate.
Cognitive virtue as highlighted in this dimension is of a pretty weak sort, however,
for its connection to representational accuracy is attenuated: here, representational
mechanisms that consistently yield a large number of wildly false judgments about
the world, for example, might count as no more vicious than ones that consistently
yield a comparable number of true judgments.
Another dimension is concerned with how well our representational mechanisms
do when it comes to the acquisition of prudentially (or practically) useful representa
tions of the world. Here again, the weakness of the connection to accuracy makes
for a rather weak form of cognitive virtue - no stronger, it would seem, than the
sort at play in discussions about how well our representational mechanisms do when
it comes to generating emotionally satisfying representations of the world. It may
not seem very virtuous from the cognitive point of view, given the disheartening
evidence afforded by the medical experts under whose care she falls, for a subject
to believe that she will survive a recently diagnosed illness. But, for all that, it may
be quite virtuous of her, along a prudential dimension of evaluation, to go ahead
and form the belief. After all, the belief, against the evidence, may predictably result
in a better quality of life for whatever time she has left.
A stronger dimension of cognitive evaluation, then, takes the immediate end of
cognition to be the acquisition of accurate representations, and human representa
tional mechanisms are now evaluated in terms of how well they generate representa
tions of that sort. But this is still not a very strong evaluative dimension. For along
this dimension a subject could be performing fully virtuously by the use of such
representational mechanisms as wishful thinking and mere guesswork, provided only
she happens to be lucky enough to have her representations so acquired turn out to
be accurate.
Stronger yet is a dimension according to which the immediate end of cognition
is the stable (alternatively, non-accidental) acquisition of accurate representations, and
our representational mechanisms are evaluated in terms of how well they stably
generate accurate representations. Wishful thinking and mere guesswork may with
the right sort of luck generate accurate representations, but not stably: over the long
haul, chances are that these representational mechanisms will yield as many (or more)
inaccurate as accurate representations of the world.
This paper will be concerned with human cognitive virtue from the point of view
of the last, strongest dimension of cognitive evaluation. There is a certain picture of
cognitive virtue, so understood - call it the Enlightenment Picture - that portrays it as inhabiting the mind alone. In this picture, to borrow the words of Charles Taylor, the cognitively virtuous human agent is "ideally disengaged . . . rational to the extent that he has fully distinguished himself from the natural and social worlds, so that his identity is no longer to be defined in terms of what lies outside him in these worlds" (Taylor, 1995, p. 7). The alternative picture portrays cognitive virtue as inhabiting a broader domain: on the Post-Enlightenment Picture, cognitive virtue can be
located only by looking simultaneously at mind and world.
The current psychological debate about the "boundedness" of human rationality
- and more specifically about whether rational human judgment under uncertainty
is "bounded" - has important implications for the respective plausibility of these
competing pictures of cognitive virtue. On one side of the debate, we have those who
read various recent results in experimental psychology as indicative of widespread
irrationality in human judgment under uncertainty. On the other side, there are theor
ists like Gigerenzer who look at those experimental results and draw the conclusion
that the judgments are generally rational but bounded in some important sense. One
problem students of cognitive science are likely to encounter in their examination
of the literature on this debate is that of finding a succinct formulation of what the
relevant boundedness claim amounts to. In the next section, I will provide it, before
going on to locate the debate against the background of the competing pictures
of cognitive virtue. Given the nature of their boundedness claim, I will show, the
boundedness theorists are committed to a rejection of the Enlightenment Picture. In
my third section I will point to a reason for pause about the boundedness theorists'
Bounded Rationality and Cognitive Virtue
rejection: it seems to carry with it a worrisome consequence about the extent to which
cognitive virtue is under the control of the subject who manifests it. Though I will
offer some suggestions about how to mitigate the worrisome nature of this con
sequence, I would be most interested in hearing Gigerenzer's own.
With the competing theses of Unbounded and Bounded Rationality in hand, we are
now in a position to begin to understand both what motivates theorists like Gigerenzer
to adopt the Bounded Rationality Thesis, and why this commits them to a rejection of
the Enlightenment Picture. From the early 1970s onward, a number of psychological
experiments have been conducted that seem to show that ordinary adult humans do
not arrive at their judgments under uncertainty (hereafter, uncertainty judgments) in
accord with a class of rules that fit the description given in the Unbounded Rationality Thesis, viz., the entrenched rules of the probability calculus (cf. Kahneman and Tversky, 1972; Tversky and Kahneman, 1973a, 1973b, 1974, 1982a, 1982b, 1982c, 1983; Nisbett and Ross, 1980). Thus, for example, in the well-known experiment reported by Tversky and Kahneman, test subjects were presented with the following "Cab Problem":
A cab was involved in a hit-and-run accident at night. Two cab companies, the Green and the Blue, operate in the city. You are given the following data:

(a) 85 percent of the cabs in the city are Green and 15 percent are Blue.
(b) A witness identified the cab as Blue. The court tested the reliability of the witness under the same circumstances that existed on the night of the accident and concluded that the witness correctly identified each one of the two colors 80 percent of the time and failed 20 percent of the time.

What is the probability that the cab involved in the accident was Blue rather than Green? (Tversky and Kahneman, 1982b, pp. 156-7)
As compared to the result delivered by the probability calculus, test subjects performed
very poorly when confronted with this problem. The probability calculus delivers the
result that the probability that the cab involved in the accident was Blue is 0.41;
test subjects' answers ranged wildly away from that answer, and their median was 0.80.
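The 0.41 figure can be checked with a minimal sketch in Python (ours, purely illustrative) of the computation the probability calculus prescribes:

```python
# Bayes' theorem applied to the Cab Problem.
# H: the cab was Blue; E: the witness says "Blue."
p_blue = 0.15               # (a) base rate of Blue cabs
p_green = 0.85              # (a) base rate of Green cabs
p_say_blue_if_blue = 0.80   # (b) witness correctly says "Blue" given Blue
p_say_blue_if_green = 0.20  # (b) witness wrongly says "Blue" given Green

posterior = (p_say_blue_if_blue * p_blue) / (
    p_say_blue_if_blue * p_blue + p_say_blue_if_green * p_green
)
print(round(posterior, 2))  # 0.41 - far from the median answer of 0.80
```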
This experimental result, in the eyes of many, including Tversky and Kahneman
themselves, served as further evidence for the conclusion that ordinary adult humans do
not follow the entrenched rules of the probability calculus in arriving at their uncer
tainty judgments, such as the sort of probability judgment asked for in the Cab Problem.
The reasoning behind this conclusion is as straightforward as it is compelling.
The first step is just that test subjects are wrong much more often than not with
respect to their uncertainty judgments in probabilistic reasoning tasks like the Cab
Problem. The second step involves some sort of inference to the best available explana
tion: the best available explanation of why test subjects are wrong much more
often than not with respect to their judgments is that they do not follow the entrenched
rules of the probability calculus in arriving at those judgments. Hence, test subjects
do not follow the entrenched rules of the probability calculus in arriving at their
judgments in such probabilistic reasoning tasks. But now, the reasoning continues,
on the plausible assumption that not only are the subjects representative of ordinary
adult humans but also their uncertainty judgments in such probabilistic reasoning tasks
are relevantly representative of their uncertainty judgments in general, we get the
further conclusion that ordinary adult humans (hereafter, humans) do not follow the
entrenched rules of the probability calculus in arriving at their uncertainty judgments.
So much seems relatively uncontroversial. The more controversial bit comes
upon consideration of whether the foregoing conclusion carries any particularly dis
turbing implications with respect to the rationality of human uncertainty judgment.
If we add a premise to the effect that human uncertainty judgment is rational just in
case it follows the entrenched rules of the probability calculus, we seem straightaway
to land ourselves with a very disturbing, skeptical conclusion about the rationality
of human uncertainty judgment, viz., that such judgment is typically not rational.
Notice, however, that the added premise - that human uncertainty judgment is
rational just in case it follows the entrenched rules of the probability calculus - is
an instance of the Unbounded Rationality Thesis. This is because the entrenched
rules of the probability calculus are specifiable independently of any reference to
the environment of the judging subject. To see why, consider one central entrenched
rule of the probability calculus, built upon Bayes's Theorem. We may formulate the theorem as follows:

Prob(H|E) = Prob(E|H) × Prob(H) / [Prob(E|H) × Prob(H) + Prob(E|¬H) × Prob(¬H)]

The corresponding entrenched rule can then be put thus: a subject S rationally judges that Prob(H|E) = n only if S judges

(a) that Prob(E|H) = o,
(b) that Prob(H) = p,
(c) that Prob(E|¬H) = q,
(d) that Prob(¬H) = r, and
(e) that op/(op + qr) = n.

And, clearly, this is a rule of Unbounded Rationality: nothing in its specification involves any reference to the subject S's environment.
If we accept the Unbounded Rationality Thesis in the form of such rules, then, we
are committed to the skeptical conclusion that human uncertainty judgment is for
the most part not rational. For, as the experimental evidence shows, humans simply
do not typically follow the entrenched rules of the probability calculus such as the one
based on Bayes's Theorem in forming their uncertainty judgments. The lesson to be
learned from the Cab Problem experiment, for instance, is that humans fail to satisfy (a)-(e) of that rule in forming their conditional probability judgments. Since rational
uncertainty judgment is supposed to require following the entrenched rules of the
probability calculus, humans do not normally form rational uncertainty judgments.
To summarize the discussion thus far: results based on such experiments as the
Cab Problem pull strongly in favor of (1):

1 Human uncertainty judgment does not follow the entrenched rules of the probability calculus.
If we add to this the premise captured by (2),
2 Human uncertainty judgment is rational just in case it follows the entrenched rules of the probability calculus,

we land ourselves with the skeptical conclusion (3):

3 Therefore, human uncertainty judgment is typically not rational.

Boundedness theorists like Gigerenzer block this reasoning by rejecting (2): they replace the absolute, entrenched rules with rules relativized to the judging subject's information-representation environment, in accordance with the Bounded Rationality Thesis. The relativized counterpart of the entrenched Bayesian rule runs as follows: in the information-representation environment I, a subject S rationally judges that Prob(H|E) = n only if

(a) S judges, in the representation format F of I, that Prob(E|H) = o;
(b) S judges, in the representation format F of I, that Prob(H) = p;
(c) S judges, in the representation format F of I, that Prob(E|¬H) = q;
(d) S judges, in the representation format F of I, that Prob(¬H) = r; and
(e) S judges, in the representation format F of I, that op/(op + qr) = n.

A relativized rule of this sort is perfectly consistent with (1).
The consistency stems from the fact that this rule amounts to a rule for rational uncertainty judgment specifiable only by reference to the subject S's environment. The information-representation environment I is the external context in which S is presented with the relevant probabilistic information available to her. The representation format F of that environment is the way - the guise - in which I typically encodes that information. There may well be a significant difference between the representation format of our (ordinary adult humans') natural probabilistic information-representation environments and the representation format(s) at play in the artificial information-representation environments of the Tversky and Kahneman experimental settings. In the natural environments, for example, probabilistic information may typically be represented in a natural frequency format; in the artificial environments, the information may typically be represented in another format - e.g., that of the entrenched
rules of the probability calculus. Thus, writes Gigerenzer:
Consider numerical information as an example of external representation [i.e., of the repre
sentation format of a subject's environment]. Numbers can be represented in Roman,
Arabic, and binary systems, among others. These representations can be mapped one to
one onto each other and in this sense are mathematically equivalent. But the form of
representation can make a difference for an algorithm that does, say, multiplication.
The algorithms of our pocket calculators are tuned to Arabic numbers as input data and
would fail badly if we entered binary numbers. Similarly, the arithmetic algorithms acquired
by humans are designed for particular representations . . . Contemplate for a moment long
division in Roman numerals. (Gigerenzer, 2000, p. 94)
The experimenters who have amassed the apparently damning body of evidence that humans
fail to meet the norms of Bayesian inference have usually given their research participants
information in the standard probability format . . . Results from these [experiments] . . .
have generally been taken as evidence that the human mind does not reason with Bayesian
algorithms. Yet this conclusion is not warranted . . . One would be unable to detect a
Bayesian algorithm within a system by feeding it information in a representation that
does not match the representation with which the algorithm works [i.e., in a format dif
ferent from that typical of the system's natural information-representation environment].
In the last few decades, the standard probability format has become a common way
to communicate information ranging from medical and statistical textbooks to psychology
experiments. But we should keep in mind that it is only one of many mathematically
equivalent ways of representing information; it is, moreover, a recently invented
notation. [T]he standard probability format [was not] used in Bayes's . . . original essay.
[W]ith natural frequencies one does not need a pocket calculator to estimate the Bayesian
posterior. (Gigerenzer, 2000, p. 98)
Suppose that people do have reliably developed mechanisms that allow them to apply
a calculus of probability, but that these mechanisms are "frequentist": they are designed
to accept probabilistic information when it is in the form of a frequency, and to pro
duce a frequency as their output. Let us then suppose that experimental psychologists
present subjects with problems that ask for the "probability" of a single event, rather
than a frequency, as output, and that they present the information necessary to solve
the problem in a format that is not obviously a frequency. Subjects' answers to such
problems would not appear to have been generated by a calculus of probability, even
though they have been designed to do just that. (Cosmides and Tooby, 1996, p. 18)
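To illustrate what such a frequentist format looks like, here is a hedged sketch (ours) of the Cab Problem recast in natural frequencies; in this format the Bayesian answer falls out of simple counting, with no explicit appeal to Bayes's theorem:

```python
# The Cab Problem in natural frequencies: imagine 1,000 comparable accidents.
blue_cabs = 150            # 15 percent of 1,000
green_cabs = 850           # 85 percent of 1,000
blue_said_blue = 120       # the witness correctly identifies 80% of the 150
green_said_blue = 170      # the witness wrongly calls "Blue" 20% of the 850

# Of all the cabs the witness calls "Blue," how many really are Blue?
print(blue_said_blue / (blue_said_blue + green_said_blue))  # ~0.41
```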
If we adopt the Bounded Rationality Thesis and as a result replace absolute rules like
those of the entrenched probability calculus with relativized ones like the one just
articulated above, how does the skeptical conclusion (3) get blocked? In place of premise
(2), we would substitute a claim to the effect that human uncertainty judgment is
rational relative to an information-representation environment I just in case it fol
lows relativized-to-I (in the sense of the Bounded Rationality Thesis) rules of the prob
ability calculus. But now the experimentally well-supported (1), even when combined with the new premise about relativized rationality, will not yield the skeptical conclusion (3), at least not in any unqualified way. The most that will follow from the combination of (1) and the new premise is that human uncertainty judgment is not
rational relative to the artificial information-representation environment of the exper
imental settings. At least, that's all that will follow given this instance of the Bounded Rationality Thesis: human uncertainty judgment is rational relative to the artificial
information-representation environment of the experimental settings just in case it
follows the entrenched rules of the probability calculus. But then again, the irra
tionality of judgment relative to the artificial settings is perfectly consistent with its
rationality relative to the natural information-representation environment.
Moreover, the claim that human uncertainty judgment is rational in this way - relative to its natural information-representation environment - would follow from yet another instance of the Bounded Rationality Thesis perfectly consistent with the previous one - viz., that human uncertainty judgment is rational relative to its natural information-representation environment just in case it follows rules of the probability calculus specifiable only by reference to that natural information-representation environment - together with the assumption that humans do in fact generally follow rules of the probability calculus specifiable only by reference to the natural information-representation environment. And, interestingly enough, recent experimental results seem to confirm that this assumption is correct. (See, e.g., Gigerenzer and Hoffrage, 1995, 1998; Cosmides and Tooby, 1996; Gigerenzer, 2000; cf. Gigerenzer, 1996; Kahneman and Tversky, 1996.)
Thus, in effect, boundedness theorists like Gigerenzer would have us replace the
reasoning given above in (1)-(3) with the reasoning captured below by (4)-(6) - where, notice, the uncontroversial (1) is still accepted: it's identical with (5).
4 Relative to the artificial information-representation environment of the experimental settings, human uncertainty judgment is rational just in case it follows the entrenched rules of the probability calculus.
5 Human uncertainty judgment does not follow the entrenched rules of the probability calculus.
6 Therefore, relative to the artificial information-representation environment of the experimental settings, human uncertainty judgment is not rational.
And they would have us keep in mind that the conclusion (6) is not to be confused
with the earlier, skeptical conclusion (3), because it is perfectly consistent with the
manifestly non-skeptical conclusion (9), as supported by (7) and (8):
7 Relative to its natural information-representation environment, human uncertainty judgment is rational just in case it follows rules of the probability calculus specifiable only by reference to the natural information-representation environment.
8 Human uncertainty judgment follows rules of the probability calculus specifiable only by reference to its natural information-representation environment.
9 Therefore, relative to its natural information-representation environment, human
uncertainty judgment is rational.
By adopting the Bounded Rationality Thesis in the form of (4) and (7), as opposed
to its Unbounded counterpart in the form of (2), Gigerenzer and other boundedness
theorists thereby block the troubling skeptical conclusion that human uncertainty judg
ment is generally irrational, period, as (3) would have it. But notice that they also
thereby maintain that what distinguishes rational from irrational uncertainty judg
ment - following rules for uncertainty judgment that are specifiable only by refer
ence to the judging subject's external environment - is at least partly outside of the
mind. And, since it is eminently plausible to treat this as precisely what serves to
distinguish stably accurate probabilistic representational mechanisms from merely accur
ate probabilistic representational mechanisms, it follows (given the general notion
of cognitive virtue articulated in my introduction) that one very important form of
cognitive virtue - cognitive virtue with respect to our probabilistic representational
mechanisms - is not wholly within the mind. By pushing us in the direction of the
Bounded Rationality Thesis, therefore, theorists like Gigerenzer push us away from
the Enlightenment Picture. For, according to that picture, cognitive virtue - in what
ever form - is supposed to be located entirely within the mind.
The worry has been raised in one form or another by Kaplan (1985) and Cruz and Pollock (1999), and can be stated roughly as follows. If cognitive virtue is located
outside the mind in the way that the Post-Enlightenment Picture suggests, then it
turns out to be something bestowed on us by features of the world not under our
control: it involves an intolerable degree of something analogous to what theoret
ical ethicists call "moral luck" (cf. Statman, 1993) - "cognitive luck," we might say.
In that case, there is little we can do to improve our cognitive abilities, for - so the
worry continues - such improvement requires manipulation of what it is within our
power to change, and these external, cognitively fortuitous features are outside
that domain. But consider: one important goal of the enterprise of cognitive science
is precisely the improvement of our cognitive abilities. After all, surely one central
reason why cognitive scientists are so concerned with understanding the nature
of our representational mechanisms is that this understanding promises ultimately
to lead to an increased ability to detect and correct faulty - unstable - deployments
of those mechanisms (cf. Goldman, 1978, 1986). Hence, if the Post-Enlightenment
Picture turns out to be accurate, one important goal of cognitive science goes by the
wayside.
One response to the worry consists of pointing out that, even on the assumption
that the relevant features outside the mind are not under our control, there is noth
ing in the Post-Enlightenment Picture that suggests that these are all of what makes
for cognitive virtue. That the rationality of human uncertainty judgment involves
following rules for such judgment that are specifiable only by reference to one's environment does not mean that it involves following rules specifiable exclusively by
reference to one's environment; that cognitive virtue cannot be located entirely within
the mind does not imply that it is located entirely outside the mind. So, perhaps there
is a great deal we can do to improve our cognitive abilities, by manipulating those
features of the mind that, together with those outside it, make for cognitive virtue.
Another response to the worry under consideration might be this: we ought not
to accept the assumption that those features outside the mind that partly make for
cognitive virtue are beyond our control. Consider, for example, one such represent
ative feature: having the input to one's visual-belief-acquisition mechanisms - visual
experience - caused under appropriate lighting and distance conditions. This is clearly
a feature of the world outside the mind. But it is perhaps equally clearly a feature
over which we can exercise control: we just have to take pains to ensure that our
bodily location is apt when we deploy these visual-belief-acquisition mechanisms.
Or consider another representative feature: having the informational input to our
uncertainty-judgment-acquisition mechanisms structured in (say) a natural fre
quency format. Again, this is a feature outside the mind that is to some extent under
our control. It is precisely by the manipulation of such features that the experimental
results reported in Cosmides and Tooby 1996, inter alia, are generated.
Acknowledgments
I would like to express my thanks to Robert Stainton, Joseph Shieber, and Patrick Rysiew for
their helpful feedback during my preparation of this chapter.
References
Cosmides, L. and Tooby, J. (1996). Are humans good intuitive statisticians after all? Cognition, 58, 1-73.
Cruz, J. and Pollock, J. (1999). Contemporary Theories of Knowledge (2nd edn.). Totowa, NJ: Rowman & Littlefield.
Gigerenzer, G. (1996). On narrow norms and vague heuristics: A reply to Kahneman and Tversky. Psychological Review, 103, 592-6.
- (2000). Adaptive Thinking: Rationality in the Real World. Oxford: Oxford University Press.
Gigerenzer, G. and Hoffrage, U. (1995). How to improve Bayesian reasoning without instruction: Frequency formats. Psychological Review, 102, 684-704.
- and - (1998). AIDS counselling for low-risk clients. AIDS Care, 10, 197-211.
Gigerenzer, G. and Selten, R. (2001). Rethinking rationality. In G. Gigerenzer and R. Selten (eds.), Bounded Rationality: The Adaptive Toolbox. Cambridge, MA: MIT Press.
Goldman, A. (1978). Epistemics: The regulative theory of cognition. Journal of Philosophy, 75, 509-23.
- (1986). Epistemology and Cognition. Cambridge, MA: Harvard University Press.
Kahneman, D. and Tversky, A. (1972). Subjective probability: A judgment of representativeness. Cognitive Psychology, 3, 430-54.
- and - (1996). On the reality of cognitive illusions. Psychological Review, 103, 582-91.
Kaplan, M. (1985). It's not what you know that counts. Journal of Philosophy, 82, 350-63.
Nisbett, R. and Ross, L. (1980). Human Inference: Strategies and Shortcomings of Social Judgment. Englewood Cliffs, NJ: Prentice Hall.
Quine, W. V. O. (1969). Natural kinds. In Ontological Relativity and Other Essays. New York: Columbia University Press.
Selten, R. (2001). What is bounded rationality? In G. Gigerenzer and R. Selten (eds.), Bounded Rationality: The Adaptive Toolbox. Cambridge, MA: MIT Press.
Simon, H. A. (1972). Theories of bounded rationality. In C. B. McGuire and R. Radner (eds.), Decision and Organization. Amsterdam: North-Holland Publishing.
Statman, D. (ed.) (1993). Moral Luck. Albany: State University of New York Press.
Taylor, C. (1995). Overcoming epistemology. In C. Taylor, Philosophical Arguments. Cambridge, MA: Harvard University Press.
Tversky, A. and Kahneman, D. (1973a). Availability: A heuristic for judging frequency and probability. Cognitive Psychology, 5, 207-32.
- and - (1973b). On the psychology of prediction. Psychological Review, 80, 237-51.
- and - (1974). Judgment under uncertainty: Heuristics and biases. Science, 185, 1124-31.
- and - (1982a). Causal schemas in judgments under uncertainty. In D. Kahneman, P. Slovic, and A. Tversky (eds.), Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
- and - (1982b). Evidential impact of base rates. In D. Kahneman, P. Slovic, and A. Tversky (eds.), Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
- and - (1982c). Judgments of and by representativeness. In D. Kahneman, P. Slovic, and A. Tversky (eds.), Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
- and - (1983). Extensional versus intuitive reasoning: The conjunction fallacy in probability judgment. Psychological Review, 90, 293-315.
ARE RULES AND REPRESENTATIONS NECESSARY TO EXPLAIN SYSTEMATICITY?

CHAPTER NINE

Cognition Needs Syntax but not Rules

Terence Horgan and John Tienson
Human cognition is rich, varied, and complex. In this chapter we argue that because
of the richness of human cognition (and human mental life generally), there must
be a syntax of cognitive states, but because of this very richness, cognitive processes
cannot be describable by exceptionless rules.
The argument for syntax, in section 1, has to do with being able to get around in any number of possible environments in a complex world. Since nature did not know where in the world humans would find themselves - nor, within pretty broad limits, what the world would be like - nature had to provide them with a means of "representing" a great deal of information about any of indefinitely many locations. We see no way that this could be done except by way of syntax - that is, by a systematic way of producing new, appropriate representations as needed. We discuss what being systematic must amount to, and what, in consequence, syntax should mean. We hold that syntax does not require a part/whole relationship.
The argument for the claim that human cognitive processes cannot be described by exceptionless rules, in section 2, appeals to the fact that there is no limit to the exceptions that can be found to useful generalizations about thought and behavior, and to the fact that there is no limit in principle to the number of factors one might consider in coming to a belief or deciding what to do.
We argue not on the basis of models in cognitive science, but instead from reflections
on thought and behavior. We do not take our argument to have obvious short-term
implications for cognitive science practice; we are not seeking to tell cognitive scient
ists what specific sorts of modeling projects they should be pursuing in the immediate
future. We do think that our arguments have longer-term implications for cognitive
science, however, because if we are right about human cognition then adequate
models would ultimately need to somehow provide for cognitive processes that (i) are
too complex and subtle to conform to programmable, exceptionless rules for manip
ulating mental representations, (ii) employ syntactically structured representations,
about them as physical objects.) And, of course, I have beliefs about many other kinds of things in the office: furniture, decorations, supplies, files, etc.
Whenever you take up a new location, even for a short while, you acquire many
new beliefs of this kind. Think, for example, of a campsite or hotel room, or a restaur
ant. On being seated in a restaurant, you quickly form numerous beliefs about the
locations of typical restaurant items in this restaurant, about particular patrons, etc.
You also, of course, have knowledge of the locations and properties of things in other
familiar locations. Think, for example, of how much you know about items in your
own kitchen. (And of course, you can quickly acquire a good deal of that same kind
of information about a kitchen where you are visiting.)
So you have had an enormous number of beliefs about items in your vicinity.
Such beliefs are obviously necessary to get around in the world and to make use of
what the world provides. Most of the beliefs that any natural cognizer has in the
course of its career are of this kind - beliefs about objects that it does or may have
to deal with.
But reflect further. Nature did not know where on Earth you were going. You could
have gone to different places; then you would have had different beliefs about your
locale. Nature had to provide you with the capacity to take in - have beliefs about
- the configuration of objects any place on Earth. But also, nature did not know
exactly what Earth was going to be like. You have the capacity to take in many pos
sible Earth-like locales that have never actually existed. You have, indeed, the capa
city to take in many more possible locales that are not particularly Earth-like - witness
e.g., Star Trek. The beliefs about your immediate environment that you are capable
of having - and about potential immediate environments that you are capable of
inhabiting - far outstrip the beliefs that you will ever actually have.
Of course, what we have been saying is true of every human being. And, to a
significant extent, it must be true of most any cognitive system that is capable of getting around and surviving in the world. Any successful natural cognitive system must be capable of representing the situation in its locale for an indefinite number
of possible locales - and representing the situation means representing a very large
number of things and their numerous properties and relations.
The beliefs a cognizer forms about items in its environment must have content
appropriate causal roles - otherwise they would be of no help in survival. Thus,
cognitive systems must be set up in such a way that each potential belief, should
it occur, will have an appropriate causal role.1
1.2 What is syntax?
One might, perhaps, provide ad hoc for states with semantically appropriate causal
roles in a simple system by wiring in all the potentially necessary states and causal
relations, but that would hardly be possible for the complex cognitive systems we
actually find in nature. The only way a cognitive system could have the vast supply
of potential representations about the locations and properties of nearby objects that
it needs is by having the capacity to produce the representations that are "needed"
when they are needed. The system must generate beliefs from more basic content.
Thus, the following must be at least approximately true: (i) For each individual i and
property P that the system has a way of representing, a way of representing "that i
has P" is automatically determined for that system. And the representation that "i has
P" must automatically have its content-appropriate causal role. Thus, (ii) whenever
the system discovers a new individual, potential states predicating old properties
to the new individual must automatically be determined, and when the system
discovers a new property, states predicating that property to old individuals must be
determined. And, again, those new states must have their content-appropriate causal
roles. (i) and (ii) are possible on such a vast scale only if representations that predic
ate different properties of the same individual are systematically related, and repres
entations that predicate the same property of different individuals are systematically
related. Then representations can be "constructed" when needed on the basis of those
systematic relations. In no other way could there be the vast supply of potential
representations that natural cognizers need in order to track their surroundings.
Any representational system with such relations has a syntax, and any such system
is a language of thought. Syntax is simply the systematic and productive encoding
of semantic relationships, understood as follows:
Systematic. When different representational states predicate the same property or
relation to individuals, the fact that the same property or relation is predicated must
be encoded within the structure of representations. And when different representa
tional states predicate different properties or relations to the same individual, the fact
that it is the same individual must be encoded in the structure of representations.
Productive. When a representation of a new property or relation is acquired, the
representations that predicate that property of each individual must be automatic
ally determined. When a representation for a new individual is acquired, the complex
representations that predicate properties of that individual must be automatically
determined.
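To make the two requirements vivid, here is a minimal Python sketch - an illustration added for concreteness, not the authors' own proposal - of a representational system that is systematic and productive. Note that this toy happens to realize syntax classically, via part/whole structure (tuples with shared parts), which is one way, though on the authors' view not the only way, of encoding the relevant relationships.

```python
# A toy representational system that is systematic and productive:
# the representation "that i has P" is determined automatically for
# every known individual i and property P, rather than being wired
# in case by case.
class RepSystem:
    def __init__(self):
        self.individuals, self.properties = set(), set()

    def add_individual(self, i):
        self.individuals.add(i)

    def add_property(self, p):
        self.properties.add(p)

    def represent(self, p, i):
        # Systematic: representations predicating the same property
        # share a part; representations about the same individual
        # share a part.
        assert p in self.properties and i in self.individuals
        return (p, i)

s = RepSystem()
s.add_property("red")
s.add_individual("cup")
s.add_individual("chair")            # Productive: a newly discovered
print(s.represent("red", "chair"))   # individual is immediately predicable
```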
In order for a system of syntactically structured representations to be a language
of thought, one more thing is necessary. The representations must have semantically
appropriate causal roles within a cognitive system. The causal roles of representa
tions depend in part upon the semantic relationships of their constituents, which are
encoded in syntactic structure. Thus, the fact that a representation has a particular
syntactic structure must be capable of playing a causal role within the system - a causal
role appropriate to the representational content that is syntactically encoded in the
representation. Syntax in a language of thought must be, as we will say, effective.
When philosophers think of language, they tend to think of the languages of logic.
In a theory formulated in the first-order predicate calculus the fact that two sen
tences of the theory predicate the same property of different individuals is encoded
in the structure of representations by the presence of the same predicate at corres
ponding places in the two sentences. So a first-order theory is systematic. And first-order theories are productive. When a new predicate representing a new property
is added to the theory, new sentences predicating that property of each individual
represented in the theory are automatically determined. In logic, and in the syntax
of classical cognitive science, systematicity and productivity are achieved by encoding
identity of reference and identity of predication by shared parts of strings; syntax is
realized as a part/whole relation. We will call such part/whole based syntax classical
syntax (following Fodor and McLaughlin, 1990).
that generates representations that can be put to use in determining responses, either
typical or atypical, depending on other factors.
Furthermore, each of these different potential responses would be appropriate to
the current situation. They would be recognizably basketball responses, if not optimal
basketball responses. It would be impossible to be capable of so many different
appropriate responses to a given situation without actually representing a great deal
of the situation and without being capable of representing whatever might go into
helping determine one of those responses. In general, physical skills can be exercised
in a variety of ways, including novel ways, depending on present goals and pur
poses. This would be impossible if physical skills were simply learned systems of
responses to stimuli. Basketball, like all physical skills, requires the capacity to bring
a large amount of non-perceptual propositional, hence syntactically structured,
information to bear on the present task.
1.4 A remark about the argument
We want to distinguish the tracking argument, rehearsed in this section, from an
important argument in linguistics, known as the productivity argument. It goes as
follows. Every speaker knows many ways of taking a given sentence and making a longer sentence. Hence, there is no longest sentence of a natural language; that is, every natural language contains infinitely many sentences. But speakers are finite. The only way a finite speaker could have such an infinite capacity is by having a capacity to construct sentences from a finite vocabulary by a finite set of processes - i.e., by having a system of rules that generates sentences from a finite vocabulary (i.e., a syntax).
The tracking argument differs from productivity arguments - and from similar argu
ments concerning infinite linguistic capacity. These arguments appeal to recursive
processes and logically complex representations. The tracking argument, however,
appeals only to states of the sort that would most naturally be represented in the
predicate calculus as (monadic or n-adic) atomic sentences, without quantifiers or
connectives - that is, to representations that predicate a property of an individual or
a relation to two or three individuals. Thus, the vastness of potential representations
we are talking about in the tracking argument depends not upon recursive features
of thought, but only upon the vastness of potential simple representations.4 The
tracking argument therefore is not committed to recursiveness in the representational
systems of cognitive agents to which the argument applies - which means that denying
recursiveness would not be a basis for repudiating the argument.
2 No Exceptionless Rules
In section 1 we argued that cognition requires syntax, understood as any systematic
and productive way of generating representations with appropriate causal roles. The
most familiar, and perhaps most natural, setting for syntactically structured repres
entations is so-called "classical" cognitive science - which construes cognition on the
model of the digital computer, as the manipulation of representations in accordance
Terence Horgan and John Tienson
with exceptionless rules of the sort that could constitute a computer program. In this
way of conceiving cognition, the rules are understood as applying to the representa
tions themselves on the basis of their syntactic structures. Let us call these rules
"programmable, representation-level rules" (PRL rules).
We maintain that human cognition is too rich and varied to be described by PRL
rules. Hence, we hold that the classical, rules and representations, paradigm in
cognitive science is ultimately not a correct model of human cognition. In this
section we briefly present two closely related kinds of considerations that lead us to
reject PRL rules. First, there is no limit to the exceptions that can be found to useful
generalizations about human thought and behavior. Second, there is no limit to the
number of considerations that one might bring to bear in belief formation or decision
making. Since any PRL rules that might be proposed to characterize, explain, or predict
some kind of human behavior would essentially impose a limit on relevant factors and
on possible exceptions, such rules would go contrary to the limitless scope, in actual
human cognition, of potential exceptions and potentially relevant considerations.
2.1 Basketball, again
These features of cognition are apparent in the basketball example. Players play dif
ferently in the same circumstances depending on their present aims and purposes,
broadly speaking. As we said above, one's immediate purposes are influenced by
factors such as game plan, score, time, and so forth. How one responds to a situ
ation can also change when one goes to a new team, when the team gets a new coach,
when one is injured, etc. We will mention two descriptions of factors that can influence
how one plays in certain circumstances. Each of these descriptions in fact covers
indefinitely many possible influencing factors of that type. First, one can attempt to play the role of a player on an upcoming opponent's team for the sake of team practice. Clearly, there are indefinitely many possible opponents one could attempt to
imitate. Second, one's play may be influenced by one's knowledge of the health con
ditions of players in the game. Making such adjustments is fairly natural for human
beings. But there is no way that a system based on PRL rules could do it with the
open-ended range that humans seem to have; rather, the PRL rules would have to
pre-specify - and thereby limit - what counts as a suitable way to mimic a player
on another team, or what informational factors about opposing players can influence
how one plays against them.
2.2 Ceteris paribus
Imagine being at someone's house at a gathering to discuss some particular topic, a
philosophical issue, say, or the policy or actions of an organization. And imagine
that you are very thirsty and know that refreshments have been provided in the kitchen.
There is a widely accepted generalization concerning human action that is certainly
at least approximately true:
(1) If a person S wants something W, and believes that she can get W by doing A, then (other things equal) she will do A.
In the case at hand, if you want something to drink, and believe you can get something to drink in the kitchen, you will go to the kitchen. Neither (1) nor its typical instances are exceptionless, however - which is signaled by the parenthetical "other things equal." You might not go to the kitchen because you do not want to offend the person speaking; or you might not want to miss this particular bit of the conversation; or you might be new to the group and not know what the expected etiquette is about such things; or there might be someone in the kitchen whom you do not want to talk to. One could go on and on. And so far we have only mentioned factors that
say, the house caught on fire.
Generalizations such as (1) are frequently called ceteris paribus (all else equal)
generalizations. Some have suggested that they should be named "impedimentus
absentus generalizations." Given your thirst, you will go to the kitchen in the absence
of impediments. (The impediments can, of course, include your other beliefs and desires.)
One thing that is important to see is that (1) is a generalization of a different logical form from the familiar universal generalizations of categorical logic and the predicate calculus. It is not like "All dogs have four legs," which is falsified by a single three-legged dog. It is like "Dogs have four legs." This is not falsified by dogs with
more or less than four legs, provided the exceptions have explanations.5
Thus, (1) or something very like it is true; exceptions are permitted by the "other things equal" clause. The point we are emphasizing is that there is no end to the list of acceptable exceptions to typical instances of (1) such as the one about you going
to the kitchen. One could, indeed, go on and on listing possible exceptions.
Furthermore, the example we have used is just one of an indefinitely large number that we could have used. PRL rules are not the kinds of generalizations that apply to this aspect of human cognition. PRL rules cannot allow endless possible exceptions. Finally, it bears emphasis that instances of (1) can be used to explain and predict human behavior. Very often it is possible to know that ceteris is paribus.6
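To dramatize the point, consider a sketch (ours, with invented circumstances) of what a PRL rule corresponding to (1) would have to look like. Any such program must fix its exceptions in advance, whereas the acceptable exceptions to (1) are open-ended:

```python
# A PRL-style rendering of generalization (1). The exception list must
# be written down in advance - and that is exactly the problem.
KNOWN_EXCEPTIONS = {
    "would offend the speaker",
    "would miss this bit of the conversation",
    "unsure of the etiquette",
    "someone to avoid is in the kitchen",
    "house on fire",
}  # ...extendable, but always finite and fixed at programming time

def will_go_to_kitchen(wants_drink, believes_drink_in_kitchen, circumstances):
    if not (wants_drink and believes_drink_in_kitchen):
        return False
    return not any(c in KNOWN_EXCEPTIONS for c in circumstances)

print(will_go_to_kitchen(True, True, {"house on fire"}))      # False, as desired
print(will_go_to_kitchen(True, True, {"sudden earthquake"}))  # True - wrongly,
# because this perfectly acceptable exception was never pre-specified
```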
2.3 Multiple simultaneous soft constraints
Think of ordering dinner in a restaurant. What you order is influenced by what kind
of food you like, by how hungry you are, by the mood you are in, and perhaps by
price. But what you order can be influenced by many other factors: what others in
your party are ordering, what you had for lunch, what you expect to have later in
the week, whether you have been to this restaurant before and what you thought of
it, whether you think you might return to this restaurant. Again, we could go on
and on.
There is no limit to the factors that could influence what you order. Each factor
is a "constraint," pushing for some decisions and against others. But (typically) each
of the constraints is "soft" - any one of them might be violated in the face of the
others. The phenomenon of multiple, simultaneous, soft constraints seems to be
ubiquitous in cognition. It is typical of "central processes": deciding, problem solving, and belief fixation.7 (The classic detective story is a paradigm case in the realm of belief fixation, with multiple suspects and multiple defeasible clues.) In such cases
one is often conscious of the factors one is considering - though often one is not
aware of how they conspired to lead to the final decision. If there are too many factors to keep in mind and the matter is important enough, one may recycle the factors mentally or write them down. (Imagine deciding what job to take - supposing you are fortunate enough to have a choice - or what car to buy, for example.) But in other
kinds of cases, such as social situations, physical emergencies, and playing basket
ball, multiple simultaneous soft-constraint satisfaction appears to occur quite rapidly
(and often unconsciously).
In many instances, multiple simultaneous soft-constraint satisfaction permits
indefinitely many factors to be influential at once - and generates outcomes that are appropriate to the specific mix of indefinitely many factors, however numerous and varied and heterogeneous they might be. Exceptionless rules cannot take account of indefinitely many heterogeneous factors and the indefinitely many ways they can combine to render some specific outcome most appropriate, because such rules would have to pre-specify - and thereby limit - all the potentially relevant factors and all the ways they might jointly render a particular outcome the overall most appropriate one.
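For concreteness, here is a minimal sketch (ours; the options, weights, and scores are invented) of the flavor of multiple simultaneous soft-constraint satisfaction: each factor merely pushes for or against the options, no factor is decisive on its own, and the choice falls out of their joint influence.

```python
# Ordering dinner under multiple simultaneous soft constraints.
options = ["pasta", "steak", "salad"]

constraints = [  # (weight, score each option gets from this factor)
    (2.0, {"pasta": 1.0, "steak": 0.2, "salad": 0.5}),  # what I like
    (1.0, {"pasta": 0.3, "steak": 1.0, "salad": 0.1}),  # how hungry I am
    (1.5, {"pasta": 0.4, "steak": 0.1, "salad": 1.0}),  # what I had for lunch
    (0.5, {"pasta": 0.9, "steak": 0.2, "salad": 0.6}),  # price
    # ...indefinitely many further factors could be added
]

def total(option):
    # Every constraint is "soft": any one of them can be outweighed.
    return sum(weight * scores[option] for weight, scores in constraints)

print(max(options, key=total))  # the jointly most satisfying option
```

Connectionist networks implement something like this weighted summation at a much finer grain; the point of the sketch is only that the outcome is fixed by the whole mix of constraints rather than by any exceptionless rule.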
3 Concluding Remarks
We have argued that complex cognition must utilize representations with syntactic
structure because that is the only way for there to be the vast supply of representa
tions that cognition requires. But we have also argued that many of the processes
these representations undergo are too rich and flexible to be describable by excep
tionless rules.
The heart of classical modeling is programs, hence exceptionless rules. Many
connectionist models, on the other hand - and more generally, many dynamical
systems models - do not employ syntax. We have no quarrel with, and in fact applaud,
(i) models that handle information-processing tasks via the classical rules and representations mode, and (ii) connectionist and dynamical systems models that handle
information-processing tasks without resorting to syntax. Much can be learned from
such models that is relevant to understanding human-like cognition. But if the argu
ments of this chapter are right, then neither kind of model can be expected to scale
up well from handling simplified "toy problems" to handling the kinds of complex
real-life cognitive tasks that are routinely encountered by human beings. Scaling up
will require forms of multiple soft-constraint satisfaction that somehow incorporate
effective syntax (as do many extant classical models), while also eschewing the pre
supposition of PRL rules (as do many extant connectionist and dynamical systems
models, in which rules operate only locally and sub-representationally whereas rep
resentations are highly distributed).8
Notes
1 For any given belief, causal role is a complicated matter. If I believe there is a coffee cup just
out of my sight on the table to my left, this may influence my behavior in many different
ways. If I want some coffee, I may reach out, pick up the cup, and sip from it. (Note that
I may well not have to look at the cup.) If I want to take my time answering a question,
I may pick up the cup, bring it to my lips, pause, and put it down again. If I want to create
a distraction, I might "inadvertently" knock over the cup. And so forth.
2 Connectionist tensor product representations are relevantly similar to the chord language. Cf. Horgan and Tienson, 1992; 1996, pp. 74-81.
3 Do drawings or computer-generated images have syntax, in the sense here described?
It depends on whether, and how, such items are used as representations by a given repres
entational system. For instance, a system might use the height of stick-figure representations of basketball players to represent players' heights, might use the colors of the
stick figures to represent players' team membership, etc. In such a case there is a systematic,
productive, way of constructing representations - which counts as syntax.
4 Thus, we believe, the argument applies not only to human cognition but also to non-human animals, for whom it is difficult to find evidence for more than a minimum of recursive structure in representations. Birds and mammals have languages of thought.
5 It is probably not a good idea to call such generalizations impedimentus absentus generalizations, since one important class of such generalizations is simple moral principles: lying is wrong, stealing is wrong, etc., which have exceptions that are not impediments.
6 For more on this theme, see Horgan and Tienson, 1990; 1996, ch. 7.
7 Jerry Fodor (1983, 2001) has compared belief fixation to theory confirmation in science. Theory confirmation is "holistic"; the factors that might be relevant to the confirmation (or disconfirmation) of a particular theory may come from any part of science. (This, he suggests, is why confirmation theory is so underdeveloped compared to deductive logic. Deductive validity is a strictly local matter, determined by nothing but the premises and conclusion of the argument.) Likewise, anything one believes might turn out, under some circumstances, to be relevant to anything else one should believe - the potential relevance of anything to anything. And there are no general principles for determining in advance what might be relevant.
8 For a book-length articulation of one way of filling in some details of this sketch of a conception of human cognition, see Horgan and Tienson, 1996.
References

Fodor, J. (1983). The Modularity of Mind: An Essay on Faculty Psychology. Cambridge, MA: MIT Press.
- (2001). The Mind Doesn't Work That Way: The Scope and Limits of Computational Psychology. Cambridge, MA: MIT Press.
Fodor, J. and McLaughlin, B. (1990). Connectionism and the problem of systematicity: Why Smolensky's solution doesn't work. Cognition, 35, 183-204. (Reprinted in T. Horgan and J. Tienson (eds.), Connectionism and the Philosophy of Mind, 1991. Dordrecht: Kluwer.)
Horgan, T. and Tienson, J. (1990). Soft laws. Midwest Studies in Philosophy, 15, 256-79.
- and - (1992). Structured representations in connectionist systems? In S. Davis (ed.), Connectionism: Theory and Practice. New York: Oxford University Press.
- and - (1996). Connectionism and the Philosophy of Psychology. Cambridge, MA: MIT Press.
CHAPTER TEN

Phenomena and Mechanisms: Putting the Symbolic, Connectionist, and Dynamical Systems Debate in Broader Perspective

Adele Abrahamsen and William Bechtel
Cognitive science is, more than anything else, a pursuit of cognitive mechanisms. To make headway towards a mechanistic account of any particular cognitive phenomenon, a researcher must choose among the many architectures available to guide and constrain the account. It is thus fitting that this volume on contemporary debates in cognitive science includes two issues of architecture, each articulated in the 1980s but still unresolved:

Just how modular is the mind? (Part 1) - a debate initially pitting encapsulated mechanisms (Fodorian modules that feed their ultimate outputs to a nonmodular central cognition) against highly interactive ones (e.g., connectionist networks that continuously feed streams of output to one another).

Does the mind process language-like representations according to formal rules? (Part 4) - a debate initially pitting symbolic architectures (such as Chomsky's generative grammar or Fodor's language of thought) against less language-like architectures (such as connectionist or dynamical ones).
Our project here is to consider the second issue within the broader context of where
cognitive science has been and where it is headed. The notion that cognition in
general - not just language processing - involves rules operating on language-like
representations actually predates cognitive science. In traditional philosophy of mind,
mental life is construed as involving propositional attitudes - that is, such attitudes
towards propositions as believing, fearing, and desiring that they be true - and
logical inferences from them. On this view, if a person desires that a proposition be
true and believes that if she performs a certain action it will become true, she will
make the inference and (absent any overriding consideration) perform the action.
This is a prime exemplar of a symbolic architecture, and it has been claimed that
all such architectures exhibit systematicity and other crucial properties. What gets
debated is whether architectures with certain other design features (e.g., weighted
connections between units) can, in their own ways, exhibit these properties. Or more
fundamentally: What counts as systematicity, or as a rule, or as a representation?
Are any of these essential? Horgan and Tienson offer their own definition of systematicity and also of syntax, arguing that syntax in their sense is required for cognition,
but not necessarily part/whole constituent structures or exceptionless rules. They
leave to others the task of discovering what architecture might meet their criteria
at the scale needed to seriously model human capabilities.
Our own goal is to open up the debate about rules and representations by situating it within a framework taken from contemporary philosophy of science rather than philosophy of mind. First, we emphasize the benefits of clearly distinguishing phenomena from the mechanisms proposed to account for them. One might, for example, take a symbolic approach to describing certain linguistic and cognitive phenomena but a connectionist approach to specifying the mechanism that explains them. Thus, the mechanisms may perform symbolic activities but by means of parts that are not symbolic and operations that are not rules. Second, we point out that the mechanisms proposed to account for phenomena in cognitive science often do not fit the pure types debated by philosophers, but rather synthesize them in ways that give the field much of its energy and creativity. Third, we bring to the table a different range of phenomena highly relevant to psychologists and many other cognitive scientists that have received little attention from philosophers of mind or even of cognitive science - those that are best characterized in equations that relate variables. Fourth,
we offer an inclusive discussion of the impact of dynamical systems theory on cognitive science. It offers ways to characterize phenomena (in terms of one or a few equations), explain them mechanistically (using certain kinds of lower-level models involving lattices or interactive connectionist networks), and obtain new understandings of development.
Phenomena in the physical sciences are typically characterized in empirical laws that express idealized regularities in the relations between variables over a set of observations. According to the Boyle-Charles law, for example, the pressure, volume, and temperature of a gas are in the relation pV = kT. In psychology, there is a similar focus on relations between variables, but these relations are less likely to be quantified and designated as laws. Instead, psychologists probe for evidence that one variable causes an effect on another variable. Cummins (2000) noted that psychologists tend to call such relations effects, offering as an example the Garcia effect: animals tend to avoid distinctive foods which they ate prior to experiencing nausea, even if the actual cause of the nausea was something other than the food (Garcia et al., 1968).
In the traditional deductive-nomological (D-N) model (Hempel, 1965), characterizations of phenomena are regarded as explanatory. For example, determining that an individual case of nausea is "due to" the Garcia effect explains that case. On a more contemporary view, such as that of Cummins, identifying the relevant phenomenon is just a preliminary step towards explanation. The actual explanation typically involves one of two approaches: re-characterizing the phenomenon in terms of more fundamental laws or principles, or specifying a mechanism that produces it.
The phenomena of cognitive science are so varied that every kind of characterization is encountered: symbolic rules and representations, descriptive equations comparable to the empirical laws of classical physics, statements that certain variables are in a cause-effect relation, and perhaps more. Individual cognitive scientists tend to have a preferred mode of characterization and also a preferred mode of explaining the phenomena they have characterized. Those who propose theoretical laws or principles typically work in different circles than the somewhat larger number who
formation to illustrate the conflict between the symbolic and connectionist architectural frameworks for mechanistic explanation in cognitive science. More importantly, we then make the case that these architectures are not polar opposites for which one winner must emerge. After conceptually reframing the symbolic-connectionist debate in this way, we finish by pointing to two research programs that achieve a more integrative approach: optimality theory in linguistics and statistical learning of rules in psycholinguistics.
There are further details, such as distinguishing among three allomorphs of the past tense affix ed, but the key point is that the mechanisms are at the same scale as the phenomenon. Operations like rule application and lexical look-up are assumed to directly modify symbolic representations.
The other approach is to explain past-tense formation by means of a single mechanism situated at a finer-grained level that is sometimes called subsymbolic. The best known subsymbolic models of cognition and language are feedforward connectionist networks. Architecturally, these originated in networks of formal neurons that were proposed in the 1940s and, in the guise of Frank Rosenblatt's (1962) perceptrons, shown to be capable of learning. Overall phenomena of pattern recognition were seen to emerge from the statistics of activity across numerous identical fine-grained units that influenced each other across weighted connections. Today these are sometimes called artificial neural networks (ANNs). The standard story is that network and symbolic architectures competed during the 1960s, ignored each other during the 1970s (the symbolic having won dominance), and began a new round of competition in the 1980s. We suggest, though, that the symbolic and network approaches were both at their best when contributing to new blended accounts. Notably, in the 1980s they came together in connectionism when a few cognitive scientists pursued the idea that a simple ANN architecture could (i) provide an alternative explanation for well-known
[Figure 10.1 Feedforward network for past-tense formation, after Plunkett and Marchman. Input and output layers encode each of three phonemes on banks of units for consonant/vowel (c/v), voicing (voi), manner, and place; the output layer adds suffix units, with some units unused. c/v = consonant/vowel; voi = voiced.]
human symbolic capabilities such as past-tense formation while also (ii) explaining additional phenomena of graceful degradation, constraint satisfaction, and learning that had been neglected (Rumelhart and McClelland, 1986b). Connectionist networks generally are construed as having representations across units, but no rules. The very idea that these representations are subsymbolic signals the ongoing relevance of the symbolic approach. Connectionists, unlike many other ANN designers, are grappling with the problem of how humans' internal networks function in a sea of external symbols - words, numbers, emoticons, and so forth. In fact, at least one of the pioneers of connectionism had pursued a different blend in the 1970s that leaned more towards the symbolic side - semantic networks - but became critical of their brittleness (see Norman and Rumelhart, 1975).
The first connectionist model of past-tense formation (Rumelhart and McClelland, 1986a) performed impressively, though not perfectly, and received such intense scrutiny that its limitations have long been known. It explored some intriguing ideas about representation (e.g., coarse-coding on context-dependent units), but for some years has been superseded by a sleeker model using a familiar network design. As illustrated in figure 10.1, Plunkett and Marchman's (1991, 1993) feedforward network represents verb stems subsymbolically as activation patterns across the binary units of its input layer. It propagates activation across weighted connections first from the input to hidden layer and then from the hidden to output layer. Each unit in one layer is connected to each unit in the next layer, as illustrated for two of the hidden units, and every such connection has its own weight as a result of repeated adaptive adjustments during learning (via back-propagation). An activation function, which typically is nonlinear, determines how the various weighted activations coming into a unit will be combined to determine its own activation. In this way, the network transforms the input representation twice - once for each pair of layers - to arrive at a subsymbolic representation of the past-tense form on the output layer. Although all three layers offer subsymbolic representations, it is the encoding scheme on the input layer that most readily illustrates this concept. The verb stems, which would be treated as morphemes by a rule appending ed in a symbolic account, are replaced here with a lower-level encoding in terms of the distinctive features of each constituent phoneme in three-phoneme stems. For example, the representation of dez (for convenience, they used artificial stems) would begin with an encoding of d as 011110 (0 = consonant, 1 = voiced, 11 = manner: stop, 10 = place: alveolar). With e encoded on the next six units and z on the last six units, dez is represented as a binary pattern across 18 subsymbols rather than symbolically as a single morpheme. Moreover, as the pattern gets transformed on the hidden and output layers, it is no longer binary but rather a vector of 20 real numbers, making the mapping of stem to past tense a statistical tendency. Connectionist networks are mechanisms - they have organized parts and operations - but the homogeneity, fine grain, and statistical functioning of their components make them quite distinct from traditional symbolic mechanisms. (See Bechtel and Abrahamsen, 2002, for an introduction to connectionist networks in chapters 2 and 3 and discussion of past-tense networks and their challenge to rules in chapter 5.)
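To make the two-transformation forward pass concrete, here is a minimal Python sketch of such a network. It is our illustration, not Plunkett and Marchman's implementation: the random weights stand in for weights that back-propagation would tune, and the feature codes for e and z are invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative layer sizes: 18 input units (3 phonemes x 6 features),
# 20 hidden units, and 20 output units, echoing the text's description.
n_in, n_hid, n_out = 18, 20, 20

# Each unit in one layer connects to each unit in the next; every
# connection has its own weight. Random values stand in for weights
# that back-propagation would tune during training.
W1 = rng.normal(0.0, 0.5, (n_hid, n_in))
W2 = rng.normal(0.0, 0.5, (n_out, n_hid))

def sigmoid(z):
    # A typical nonlinear activation function.
    return 1.0 / (1.0 + np.exp(-z))

def forward(stem_bits):
    # Transform the binary subsymbolic encoding twice - once per pair
    # of layers - into a real-valued output vector.
    hidden = sigmoid(W1 @ stem_bits)
    return sigmoid(W2 @ hidden)

# dez: d = 011110 (from the text); the codes for e and z are made up.
dez = np.array([0,1,1,1,1,0,  1,0,0,1,0,1,  0,1,1,0,1,1], dtype=float)
print(forward(dez))  # a vector of 20 real numbers, not a binary pattern
```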
The trained network in figure 10.1 can stand on its own as a mechanistic model accounting for past-tense formation by adult speakers. However, it is the network's
Figure 10.2 Some nonlinear curves: (a) sigmoidal (e.g., skill as a function of practice); (b) positively accelerated exponential (e.g., early vocabulary size as a function of time); (c) negatively accelerated exponential (e.g., number of items learned on a list as a function of list repetitions); (d) U-shaped acquisition (e.g., irregular past tense as a function of age).
behavior during training that has captured the greatest attention, due to claims that it provides a mechanistic explanation of an intriguing developmental phenomenon, U-shaped acquisition. It has long been known that in acquiring the regular past tense, children overgeneralize it to some of their irregular verbs - even some that had previously been correct (Ervin, 1964; Brown, 1973). For example, a child might correctly produce went and sat at age 2, switch to goed and sitted at age 3, and gradually return to went and sat at ages 4-5. When the percentage of opportunities on which the correct form was used is graphed against age, a U-shaped acquisition function is obtained. This contrasts with typical learning curves, which tend to be sigmoidal or exponential. The phenomenon of interest is that acquisition of irregulars cannot be described by any of the usual functions but rather is U-shaped, as illustrated in figure 10.2.
Advocates of the symbolic approach have interpreted this as favoring the two-mechanism account (applying a rule for regulars and looking up the verb in a mental lexicon for irregulars); specifically, they attribute the decline in performance on irregulars to the replacement of lexical look-up (which gives the correct form) with overgeneralization of the rule (yielding the regular past tense that is inappropriate for these verbs). The alternative proposal, initially advanced by Rumelhart and McClelland (1986a), acknowledged competition but relocated it within a single mechanism - their connectionist network in which the same units and connection weights were responsible for representing and forming the past tense of all verbs. Especially when an irregular verb was presented to the network, activation patterns appropriate to different past-tense forms would compete for dominance. Like children, their network showed a U-shaped acquisition curve for irregular verbs across its training epochs. Pinker and Prince (1988) and others objected to discontinuities in the input and to shortcomings in the performance of Rumelhart and McClelland's network relative to that of human children (based not only on linguistic analyses but also on detailed data gathered by Kuczaj, 1977, and later by Pinker, Marcus and others). Subsequent modeling efforts (Plunkett and Marchman, 1991, 1993) addressed some of the criticisms, but the critics responded with their own fairly specific, though unimplemented, model. A readable, fairly current exchange of views is available in a series of papers in Trends in Cognitive Sciences (McClelland and Patterson, 2002a, 2002b; Pinker and Ullman, 2002a, 2002b).
When competing groups of researchers are working within architectural frameworks as distinct as the symbolic and connectionist alternatives, data alone rarely generate a resolution.
That is, the two competing approaches to past-tense formation might be given complementary roles. One way of construing this is to appreciate linguistic rules as well suited to characterizing the phenomenon of past-tense formation but to prefer feedforward networks as a plausible mechanism for producing the phenomenon. Alternatively, both architectures might be viewed as suitable for mechanistic accounts, but at different levels - one coarse-grained and symbolic, the other fine-grained and statistical. Whatever the exact roles, providing a place for more than one approach can move inquiry towards how they complement each other rather than seeking a winner. In particular, both approaches need not directly satisfy every evaluative criterion. For example, considerations of systematicity proceed most simply (though not necessarily exclusively) with respect to symbolic accounts, and graceful degradation is one of the advantages offered by a fine-grained statistical account.
Looking beyond the Chomskian-connectionist axis of the past-tense debate, an
alternative linguistic theory exists that has been very amenable to - even inspired
by - the idea that symbolic and subsymbolic approaches each have a role to play in
an integrated account. Optimality theory emerged from the collaboration between two
cognitive scientists who were opponents in the 1980s: connectionist Paul Smolensky and linguist Alan Prince. In a 1993 technical report (published as Prince and Smolensky, 2004), they showed that the constraint-satisfaction capabilities of networks could also be realized, though in a somewhat different way, at the linguistic level. Specifically, they succeeded in describing various phonological phenomena using a single universal set of soft rule-like constraints to select the optimal output among a large number of candidate outputs. A given language has its own rigid rank ordering of these constraints, which is used to settle conflicts between them. For example (see Tesar et al., 1999, for the full five-constraint version): the constraint NOCODA is violated by any syllable ending in a consonant (the coda), and the constraint NOINSV is violated if a vowel is inserted in the process of forming syllables (the output) from a phoneme string (the input). The input string /apot/ would be syllabified as .a.pot. in a language that ranks NOINSV higher (e.g., English), but in a vowel-final form like .a.po.to. in a language that ranks NOCODA higher (e.g., Japanese).
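As a toy rendering of this selection procedure (our sketch, not Prince and Smolensky's system; the two-candidate set and the violation counting are simplified assumptions), ranked constraints can be compared lexicographically:

```python
# Toy Optimality Theory evaluation: each candidate output is scored by
# how often it violates each constraint; candidates are then compared
# lexicographically down the language's constraint ranking.

def violations(candidate, input_string='apot'):
    syllables = candidate.strip('.').split('.')
    # NOCODA: one violation per syllable ending in a consonant.
    nocoda = sum(1 for s in syllables if s and s[-1] not in 'aeiou')
    # NOINSV: one violation per vowel inserted relative to the input
    # (approximated here as extra segments in the output).
    extra = sum(len(s) for s in syllables) - len(input_string)
    return {'NOCODA': nocoda, 'NOINSV': max(extra, 0)}

def optimal(candidates, ranking):
    # The winner has the lowest violation profile, read in ranking order.
    return min(candidates,
               key=lambda c: tuple(violations(c)[k] for k in ranking))

candidates = ['.a.pot.', '.a.po.to.']            # simplified set for /apot/
print(optimal(candidates, ['NOINSV', 'NOCODA'])) # English-like: .a.pot.
print(optimal(candidates, ['NOCODA', 'NOINSV'])) # Japanese-like: .a.po.to.
```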
Optimality theory (OT) offers such an elegant explanation of diverse phenomena that a substantial number of phonologists have adopted it over classic rule-based theories. (Uptake in syntax has been slower.) For reasons difficult to explain without presenting OT in more detail, it is implausible as an explicit mechanistic account. Those with the ambition of integrating OT with an underlying mechanistic account have tended to assume a connectionist architecture. Prince and Smolensky (1997) posed, but did not solve, the most tantalizing question invited by such a marriage: Why would the computational power and flexibility offered by a statistical mechanism like a network be funneled into solutions (languages) that all exhibit the rigid rank-ordering phenomenon that makes OT a compelling theory?
A similar dilemma has been raised by a team of cognitive scientists whose integrative inclinations have operated on different commitments (their linguistic roots are Chomskian, and they regard learning as induction rather than adjustments to weights in a network). They have offered provocative evidence that the language acquisition mechanism is highly sensitive to distributional statistics in the available language input (Saffran et al., 1996), but seek to reconcile this with their view that the product of learning is akin to the rules and representations of linguistic theory. That is, a statistical learning mechanism is credited with somehow producing a nonstatistical mental grammar. This brings them into disagreement with symbolic theorists on one side, who deny that the learning mechanism operates statistically (see Marcus, 2001), and with connectionists on the other side, who deny that the product of learning is nonstatistical. In support of their nuanced position, Newport and Aslin (2000) cited the finding that children acquiring a signed language like American Sign Language from non-native signers get past the inconsistent input to achieve a more native-like morphological system. On their interpretation, "the strongest consistencies are sharpened and systematized: statistics are turned into 'rules'" (p. 13). They and their collaborators have also contributed a growing body of ingenious studies of artificial language learning by infants, adults, and primates, from which they argue that the statistical learning mechanism has selectivities in its computations that bias it towards the phenomena of natural language (see Newport and Aslin, 2000, 2004).
Attempts like these to integrate connectionist or other statistical approaches with
symbolic ones offer promising alternatives to polarization. We mentioned, though,
that there is a second way of reframing the discussion. Looking at the rise of connectionism in the early 1980s, it is seen to involve the confluence of a number of research streams. Among these are mathematical models, information processing models, artificial neural networks, and symbolic approaches to the representation of knowledge - especially semantic networks but extending even to the presumed foe, generative grammar. Some of these themselves originated in interactions between previously distinct approaches; for example, ANNs were a joint product of neural and computational perspectives in the 1940s, and semantic network models arose when an early artificial intelligence researcher (Quillian, 1968) put symbols at the nodes of networks rather than in rules. Some of the research streams leading to connectionism also had pairwise interactions, as when mathematical modeling was used to describe operations within components of information processing models. Finally, some of these research streams contributed not only to connectionism but also, when combined with other influences, to quite different alternatives. Most important here is that dynamical systems theory (DST) took shape in a quirky corner of mathematical modeling focused on nonlinear physical state changes. It found a place in cognitive science when combined with other influences, such as an emphasis on embodiment, and some of the bends in DST's path even intersected with connectionism when it was realized that such concepts as attractor states shed light on interactive networks.
Another example (not discussed in this chapter) is that information processing models, neuroimaging, and other research streams in the cognitive and neural sciences came together in the 1990s, making cognitive neuroscience a fast-moving field both on its own and within cognitive science. As well, the idea that cognition is distributed not only within a single mind but also on a social scale gave rise to socially distributed cognition as a distinct approach in cognitive science (see Bechtel et al., 1998, Part 3). Thus, an exclusive focus on polar points of contention would give a very distorted picture of cognitive science. This interdisciplinary research cluster in fact is remarkable for its protean nature across both short and long time-frames.
If one is seeking maximal contrast to symbolic rules and representations, it is to be found not in the pastiche of connectionism but rather within the tighter confines of one of its tributaries, mathematical modeling. Yet, except for DST, this approach has been mostly neglected in philosophical treatments of psychology and of the cognitive sciences more generally. In the next section we consider how mainstream mathematical psychology exemplifies the quantitative approach to characterizing and explaining phenomena in cognitive science. This provides conceptual context for then discussing DST and its merger with other commitments in the dynamical approach to perception, action, and cognition.
Mathematical psychology
In its simplest form, mathematical psychology offers the means of characterizing phenomena involving quantitative relations between variables. This is a very different class
of the theory is substantially smaller than the number of degrees of freedom (e.g., independent variables) in the data. (Falmagne, 2002, p. 9405)
The Markov models of William Estes (1950) and R. R. Bush and F. Mosteller (1951) satisfied this criterion and energized the field by elegantly accounting for a variety of empirical data and relationships. Their new mathematical psychology surpassed Hull in successfully arriving at equations that were more akin to the explanatory theoretical laws of physics than to its descriptive empirical laws.
On their own, equations are not mechanistic models. One equation might characterize a psychological phenomenon, and another might re-characterize it so as to provide a highly satisfactory theoretical explanation. Equations do not offer the right kind of format, however, for constructing a mechanistic explanation: they specify neither the component parts and operations of a mechanism nor how these are organized so as to produce the behavior. This suited the mathematical psychologists of the 1950s who, like other scientifically oriented psychologists, avoided any proposals that hinted at mentalism. When the computer metaphor made notions of internal information processing respectable, though, they began to ask what sorts of cognitive mechanisms might be responsible for phenomena that could be characterized, but not fully explained, using equations alone.
assumption that a short time after several items have been memorized, they can be immediately and simultaneously available for expression in recall or in other responses, rather than having to be retrieved first" (p. 652). Instead, it appeared that each item was retrieved and compared to the probe in succession (a process that has been called serial search, serial retrieval, serial comparison, or simply scanning). Moreover, because the reaction time functions were almost identical for trials in which there was and was not a match, Sternberg contended that this process was not only serial but also exhaustive. If, to the contrary, the process terminated once a match was found, positive trials should have had a shallower slope and averaged just half the total reaction time of negative trials for a given set size.
Sternberg's deceptively simple paper illustrates several key points.

The linear equation initially played the same straightforward role as Fechner's logarithmic equation: it precisely characterized a phenomenon.

Sternberg aspired to explain the phenomenon that he characterized, and departed from the mathematical psychology of the 1950s by proposing a mechanism, rather than a more fundamental equation, that could produce the phenomenon of a linear relation between set size and RT.

In the proposed mechanism - one of the earliest information processing models - the most important parts were a short-term memory store and mental symbols representing the probe and list items, the most important operation was retrieval, and the system was organized such that retrieval was both serial and exhaustive.

The mechanism combined computational and symbolic approaches, in that its operations were performed on discrete mental symbols.

In addition to the computational and symbolic approaches, the mechanistic explanation incorporated the quantitative approach of mathematical psychology as well. That is, Sternberg used his linear equation not only to characterize the phenomenon, but also as a source of detail regarding an operation. Specifically, he interpreted the slope (37.9 msec.) as the time consumed by each iteration of the retrieval operation (a fit along these lines is sketched after this list).

The underdetermination of explanation is hard to avoid. For example, although Sternberg regarded his data as pointing to a mechanism with serial processing, competing mechanisms based on parallel processing have been proposed that can account for the same data. The slope of the linear function is interpreted quite differently in these accounts.
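As a concrete illustration of this use of the equation, a least-squares fit of RT = intercept + slope x set size recovers a per-item time. The reaction times below are invented for the example, loosely patterned after Sternberg's roughly 38-msec slope; they are not his data.

```python
import numpy as np

# Hypothetical mean reaction times (msec) for memory-set sizes 1-6,
# invented to echo a roughly 38-msec-per-item slope.
set_size = np.array([1, 2, 3, 4, 5, 6])
rt = np.array([435, 476, 512, 552, 589, 627])

# Least-squares fit of RT = intercept + slope * set_size.
slope, intercept = np.polyfit(set_size, rt, 1)
print(f"intercept = {intercept:.1f} msec, slope = {slope:.1f} msec/item")
# On the serial-exhaustive interpretation, the slope estimates the time
# consumed by one iteration of the retrieval-and-comparison operation.
```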
dx/dt = (A - By)x
dy/dt = (Cx - D)y

The notation dx/dt refers to the rate of change in the number of prey x over time t and dy/dt to the rate of change in the number of predators y over time t. Figure 10.3a shows just three of the cyclic trajectories that can be obtained from these equations by specifying different initial values of x and y. They differ in the size of the population swings, but share the fate that whichever trajectory is embarked upon will be repeated ad infinitum. If the initial values are at the central equilibrium point, also shown, there are no population swings - it is as though births continuously replace deaths in each group. A more interesting situation arises if the equations are modified by adding "ecological friction" (similar to the physical friction that brings
Figure 10.3 Two kinds of trajectories through state space for a predator and prey population: (a) cyclic trajectory (equilibrium point is not an attractor) and (b) spiraling trajectory (equilibrium point is a point attractor). [Axes: number of predators plotted against number of prey.]
a pendulum asymptotically to rest by damping its oscillations). Now the swings decrease over time as the system spirals towards, but never quite reaches, its point of equilibrium. In this modified system the point of equilibrium is an attractor state (specifically, a point attractor in state space, or equivalently, a limit point in phase space) - one of the simplest concepts in DST and very influential in some parts of cognitive science. Figure 10.3b shows a point attractor and one of the trajectories spiraling towards it. The type of graphic display used in figure 10.3, called a phase portrait, is one of DST's useful innovations. Time is indicated by the continuous sequence of states in each sample trajectory. This leaves all dimensions available for showing the state space, at a cost of not displaying the rate (or changes in rate) at which the trajectory is traversed. If graphic display of the rate is desired, a more traditional plot with time on the abscissa can be used.
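A minimal simulation sketch may help fix ideas. The parameter values, the forward-Euler integration, and the particular damping term standing in for "ecological friction" are our illustrative assumptions, not part of the original analysis.

```python
# Forward-Euler integration of the predator-prey equations. The term
# -f*x*x is one simple way to add "ecological friction" (damping); with
# f = 0 the pure cyclic case is recovered. All values are illustrative.
A, B, C, D, f = 1.0, 0.1, 0.075, 1.5, 0.02
dt, steps = 0.01, 50_000

x, y = 30.0, 4.0                 # initial prey and predator counts
for _ in range(steps):
    dx = (A - B * y) * x - f * x * x
    dy = (C * x - D) * y
    x, y = x + dt * dx, y + dt * dy

# With friction on, the trajectory spirals in towards the equilibrium
# point (here x = D/C = 20, y = (A - f*D/C)/B = 6) instead of cycling.
print(round(x, 2), round(y, 2))
```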
In the 1980s DST captured the attention of a few psychologists who recognized its advantages for characterizing phenomena of perception, motor behavior, and development. By the 1990s DST was having a broader impact in cognitive science. As in the case of mathematical psychology before it, the use of DST progressed through phases:
Figure 10.4 Watt's centrifugal governor for a steam engine. Drawing from Farley (1827).
prediction of the angle of the spindle arms given any particular speed at which the
flywheel rotates. This is very useful, but does not imply holism.
In addition to their holism, these dynamicists also have a preference for parsimony that is similar to that of classic mathematical psychologists (those who had not yet adopted a mechanistic information processing framework within which to apply their mathematical techniques). A simple dynamical account typically targets phenomena involving a small number of quantitative variables and parameters. One of the earliest examples relevant to cognitive science is Scott Kelso's account of the complex dynamics evoked in a deceptively simple task used to study motor control. Participants were asked to repeatedly move their index fingers either in phase (both move up together, then down together) or antiphase (one moves up while the other moves down) in time with a metronome (Kelso, 1995). As the metronome speed increases, people can no longer maintain the antiphase movement and involuntarily switch to in-phase movement. In DST terms, the two moving fingers are coupled oscillators. Slow speeds (low frequencies) permit a stable coupling either in-phase or antiphase: the state space has two attractors. At high speeds only in-phase coupling is possible: the state space has only one attractor. Kelso offered a relatively simple equation that provides for characterization of the attractor landscape (V) as parameter values change:
V = -δω·φ - a cos φ - b cos 2φ

Here φ is the degree of asynchrony between the two fingers (relative phase), δω reflects the difference between the metronome's frequency and the spontaneous oscillation frequency of the fingers, and a and b indirectly reflect the actual oscillation frequency of the fingers. When the ratio b/a is small, oscillation is fast and only the in-phase attractor exists. When it is large, there are two attractors: people can produce in-phase or antiphase movement as instructed or voluntarily traverse a trajectory between them.
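A small numerical sketch of this landscape (our illustration; the parameter values are assumptions) locates the attractors as the local minima of V:

```python
import numpy as np

def V(phi, a, b, d_omega=0.0):
    # Kelso's landscape: V = -d_omega*phi - a*cos(phi) - b*cos(2*phi)
    return -d_omega * phi - a * np.cos(phi) - b * np.cos(2 * phi)

# Sample relative phase over one full cycle (periodic, so wrap around).
phi = np.linspace(-np.pi, np.pi, 720, endpoint=False)

for b_over_a in (0.2, 1.0):          # small vs. large ratio b/a
    v = V(phi, a=1.0, b=b_over_a)
    is_min = (v < np.roll(v, 1)) & (v < np.roll(v, -1))
    print(f"b/a = {b_over_a}: attractors near phi =", np.round(phi[is_min], 2))
# Small b/a: only phi = 0 (in-phase). Large b/a: phi = 0 plus phi = -pi
# (equivalently +pi, i.e., antiphase) - the two-attractor regime.
```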
Things get interesting when the ratio b/a is intermediate between values that clearly provide either one or two attractors. The attractors disappear but each "leaves behind a remnant or a phantom of itself" (Kelso, 1995, p. 109). The system's trajectory now exhibits intermittency, approaching but then swinging away from the phantom attractors. Putting it more concretely, the two index fingers fluctuate chaotically between in-phase and antiphase movement. Although such intermittency may not seem particularly important, if we shift to a different domain we can recognize it as a significant characteristic of cognition. Most people have had the experience, when looking long enough at an ambiguous figure such as the Necker cube, of an irregular series of shifts between the two interpretations. The temporal pattern of these spontaneous shifts is chaotic - a technical concept in DST that refers to trajectories in state space in which no point is revisited (such trajectories appear random but in fact are deterministic). Kelso remarked on the similarity between the finger-movement and Necker cube tasks, not only in the switching-time data he displayed but also in the quantitative and graphical analyses of the systems. Given the wide range of applicability of these analyses, their elegance, and their depth, they clearly go beyond characterizing phenomena to explaining them via re-characterization. Nonetheless, they provide only one of the two types of explanation we have discussed. In the next section we introduce an alternative dynamical approach that offers intriguing new ideas about mechanistic explanation.
A dynamical approach with subsymbolic mechanisms
Kelso chose to focus on switching-time phenomena and center his explanation on a single dynamical equation with variables capturing aspects of the system as a whole. Here we look at a mechanistic explanation for the same phenomenon. Cees van Leeuwen and his collaborators (van Leeuwen et al., 1997) developed a model that is both dynamical and mechanistic by using a dynamical equation to govern (not only to describe) the operations of the fine-grained component parts of a mechanism. It is similar to a connectionist model insofar as its component parts are numerous homogeneous units, some with pairwise connections, that become activated in response to an input. However, in this case the units are more sparsely connected and are designed as oscillators that can synchronize or desynchronize their oscillations. Particular patterns of synchronization across the units are treated as constituting different interpretations of the input.
More specifically, van Leeuwen et al. employed a coupled map lattice (CML) of the sort first explored by Kaneko (1990). A lattice is a sparsely connected network in which only neighboring units are connected (coupled); the basic idea is illustrated by a large construction from Tinkertoys or a fishnet. A map is a type of function in which values are iteratively determined in discrete time. Kaneko employed the logistic equation

x_{t+1} = A x_t(1 - x_t)

to govern the activation (x) of units at a future time t + 1 on the basis of their current activation. Depending on the value of the parameter A, such units will oscillate between a small number of values or behave chaotically. For nearby units to
influence each other, there must be connections between them. Although van Leeuwen's proposed mechanistic model used 50 x 50 arrays of units, with each unit coupled to four neighbors, his analysis of a CML with just two units suffices for illustration. In his account, the net input to unit x is determined from the activation of x and the other unit y:

netinput_x = C a_y + (1 - C) a_x

The logistic function is then applied to the resulting net input to determine the activation of unit x:

a_{x,t+1} = A netinput_x(1 - netinput_x)

The behavior of the resulting network is determined by the two parameters, A and C. For a range of values of A relative to C, the two units will come to change their activation values in synchrony. Outside this range, however, the system exhibits the same kind of intermittency as did Kelso's higher-level system. For some values, the two units approach synchrony, only to spontaneously depart from synchrony and wander through state space until they again approach synchrony. With larger networks, one constellation of nearby units may approach synchrony only to break free; then another constellation of units may approach synchrony. These temporary approaches to synchrony were interpreted by van Leeuwen et al. as realizing a particular interpretation of an ambiguous figure. Thus, there are echoes of Kelso not only in the achievement of intermittency but also in its interpretation. Nonetheless, van Leeuwen et al.'s full-scale CML is clearly mechanistic: it explains a higher-level phenomenon as emerging from the dynamics of numerous, finer-grained parts and operations at a lower level. This contrasts with the more typical dynamical approach, exemplified by Kelso, of describing a phenomenon at its own level using just a few variables in a global equation.
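A minimal two-unit rendering of these update equations, with parameter values chosen by us (illustratively) near the synchronization threshold, exhibits the approach-and-departure pattern described:

```python
import numpy as np

# Two coupled logistic-map units following the equations in the text:
# netinput_x = C*a_y + (1 - C)*a_x, then a_x <- A*netinput*(1 - netinput).
# A and C are illustrative choices near the synchronization threshold.
A, C = 3.8, 0.15
a = np.array([0.31, 0.72])           # initial activations of units x and y

for t in range(1, 3001):
    net = C * a[::-1] + (1 - C) * a  # each unit blends in its neighbor
    a = A * net * (1 - net)          # logistic update
    if t % 300 == 0:
        # Small gaps mean the units are near synchrony; large gaps mean
        # they have wandered apart again (intermittency).
        print(t, "gap:", round(abs(a[0] - a[1]), 4))
```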
dy/dt = by

where dy/dt is a notation indicating that we are concerned with the number of additional words to be added per unit of time, b is the percentage increase across that time (which does not change), and y is the current number of words (which increases with t). This equation arguably is explanatory, in that it re-characterizes the first equation so as to explicate that a constant percentage increase is responsible for its exponential absolute increase.
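For readers who want the step spelled out, separating variables gives the closed form (a standard calculus result, not spelled out in the original text):

```latex
\frac{dy}{dt} = by
\;\Longrightarrow\; \int \frac{dy}{y} = \int b\,dt
\;\Longrightarrow\; \ln y = bt + c
\;\Longrightarrow\; y(t) = y_0 e^{bt}
```

So a constant proportional rate b produces an exponentially increasing vocabulary from initial size y_0; with b = 0.2 per week, for instance, vocabulary doubles roughly every ln 2 / 0.2, or about 3.5 weeks.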
Elman et al. emphasize that a single mechanism can produce slow initial vocabulary growth as well as a later vocabulary burst. This does not mean nothing else can be involved; they themselves go on to consider a more complicated version of their initial account. The point is that the burst, as such, does not require any special explanation such as a naming insight; "the behavior of the system can have different characteristics at different times although the same mechanisms are always operating" (p. 185). In another part of their book (pp. 124-9), Elman et al. describe an auto-associative connectionist network that, though limited to simplified representations, suggests a mechanistic explanation. It replicates not only the quantitative phenomenon of a vocabulary burst (exponential growth), but also such phenomena as a tendency to underextension prior to the burst and overextension thereafter.
Abrupt transitions are not uncommon in development, and connectionist networks excel at producing this kind of nonlinear dynamic. Another example discussed by Elman et al. targets the transitions observed by Robert Siegler (1981) when children try to solve balance-scale problems. They appear to progress through stages in which different rules are used (roughly: weight, then distance, then whichever of these is more appropriate). McClelland and Jenkins (1991) created a connectionist network that offered a simplified simulation of the abrupt transitions between these three stages. While intrigued by this capability, Elman et al. commented that "simulating one black box with another does us little good. What we really want is to understand exactly what the underlying mechanism is in the network which gives rise to such behavior" (p. 230).
Figure 10.5 Simple recurrent network as proposed by Jeffrey Elman (1990). For clarity, fewer units are shown than would be typical. The solid arrows represent feedforward connections used to connect each unit in one layer with each unit in the next layer. The dotted arrows indicate the recurrent (backwards) connections that connect the hidden units to specialized input units called context units.
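A minimal sketch of one time step of such a network may be useful; the layer sizes and random weights below are illustrative assumptions of ours, not Elman's.

```python
import numpy as np

rng = np.random.default_rng(1)
n_in, n_hid, n_out = 4, 6, 4                  # illustrative sizes only

W_ih = rng.normal(0.0, 0.5, (n_hid, n_in))    # input -> hidden
W_ch = rng.normal(0.0, 0.5, (n_hid, n_hid))   # context -> hidden
W_ho = rng.normal(0.0, 0.5, (n_out, n_hid))   # hidden -> output

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

context = np.zeros(n_hid)                     # context units start empty

def step(x):
    # Hidden units see the current input plus a copy of their own
    # previous activations (the context units), giving the network a
    # memory for sequential structure.
    global context
    hidden = sigmoid(W_ih @ x + W_ch @ context)
    context = hidden.copy()                   # recurrent copy-back
    return sigmoid(W_ho @ hidden)

for x in np.eye(n_in):                        # a toy four-step sequence
    print(np.round(step(x), 3))
```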
Conclusions
Our strategy through this paper has been to show that the range of phenomena for which mechanistic models are sought is extremely varied and to illustrate briefly some of the kinds of models of mechanisms that have been advanced to account for different phenomena. The focus in the philosophical literature on systematicity and other general properties of cognitive architectures presents a distorted view of the actual debates in the psychological literature over the types of mechanisms required to account for cognitive behavior. Even in the domain of language, where systematicity is best exemplified, many additional phenomena claim the attention of cognitive scientists. We discussed two that are quantitative in nature: the U-shaped acquisition of irregular past-tense forms and the exponential acquisition of early vocabulary. Beyond language we have alluded to work targeting the rich domains of perceptual and motor behavior, memory, and problem solving. Phenomena in all of these domains are part of the explanandum of mechanistic explanation in cognitive science. Such explanatory attempts, which like the phenomena themselves often are quantitative, go back as far as Weber's psychophysics and currently are moving forward in dynamical approaches to perception, cognition, and development.
We have also emphasized that cognitive science, despite its many disputes, has progressed by continually combining and recombining a variety of influences. The use of equations both in characterizing and explaining phenomena is among these. When combined with other influences and commitments, the outcomes discussed here have ranged from information processing models with quantified operations to connectionist networks to both global and mechanistic dynamical accounts. Each of these approaches has provided a different answer to the question of whether the mind processes language-like representations according to formal rules, and we have argued that the overall answer need not be limited to just one of these. Cognitive science takes multiple shapes at a given time, and is protean across time.

I am large, I contain multitudes.
(Walt Whitman, Leaves of Grass)
Note

According to Estes (2002), the publication of the three-volume Handbook of Mathematical Psychology (Luce et al., 1963-5) galvanized development of professional organizations for mathematical psychologists. The Journal of Mathematical Psychology began publishing in 1964. The Society for Mathematical Psychology began holding meetings in 1968, although official establishment of the society and legal incorporation only occurred in 1977.
References
Abraham, R. H. and Shaw, C. D. (1992). Dynamics: The Geometry of Behavior. Redwood City, CA: Addison-Wesley.
Baldwin, D. A. (1989). Establishing word-object relations: a first step. Child Development, 60, 381-98.
Bechtel, W. (1997). Dynamics and decomposition: are they compatible? Proceedings of the Australasian Cognitive Science Society.
Bechtel, W. and Abrahamsen, A. (2002). Connectionism and the Mind: Parallel Processing, Dynamics, and Evolution in Networks (2nd edn.). Oxford: Blackwell.
- and - (2005). Explanation: A mechanist alternative. Studies in History and Philosophy of Biological and Biomedical Sciences, 36, 421-41.
Bechtel, W. and Richardson, R. C. (1993). Discovering Complexity: Decomposition and Localization as Scientific Research Strategies. Princeton, NJ: Princeton University Press.
Bechtel, W., Abrahamsen, A., and Graham, G. (1998). The life of cognitive science. In W. Bechtel and G. Graham (eds.), A Companion to Cognitive Science. Oxford: Blackwell.
Brown, R. (1973). A First Language: The Early Stages. Cambridge, MA: Harvard University Press.
Bush, R. R. and Mosteller, F. (1951). A mathematical model for simple learning. Psychological Review, 58, 313-23.
Cummins, R. (2000). "How does it work?" versus "what are the laws?": Two conceptions of psychological explanation. In F. Keil and R. Wilson (eds.), Explanation and Cognition. Cambridge, MA: MIT Press.
Dennett, D. C. (1978). Brainstorms. Cambridge, MA: MIT Press.
Ebbinghaus, H. (1885). Über das Gedächtnis: Untersuchungen zur experimentellen Psychologie. Leipzig: Duncker & Humblot.
Elman, J. L. (1990). Finding structure in time. Cognitive Science, 14, 179-211.
Elman, J. L., Bates, E. A., Johnson, M. H., Karmiloff-Smith, A., Parisi, D., and Plunkett, K. (1996). Rethinking Innateness: A Connectionist Perspective on Development. Cambridge, MA: MIT Press.
Ervin, S. (1964). Imitation and structural change in children's language. In E. Lenneberg (ed.), New Directions in the Study of Language. Cambridge, MA: MIT Press.
Estes, W. K. (1950). Towards a statistical theory of learning. Psychological Review, 57, 94-107.
- (2002). Mathematical psychology, history of. In International Encyclopedia of the Social and Behavioral Sciences. New York: Elsevier.
Falmagne, J.-C. (2002). Mathematical psychology. In International Encyclopedia of the Social and Behavioral Sciences. New York: Elsevier.
Farley, J. (1827). A Treatise on the Steam Engine: Historical, Practical, and Descriptive. London: Longman, Rees, Orme, Brown, and Green.
Fechner, G. T. (1860). Elemente der Psychophysik. Leipzig: Breitkopf und Härtel.
Flynn, J. R. (1987). Massive IQ gains in 14 nations: What IQ tests really measure. Psychological Bulletin, 101, 171-91.
Garcia, J., McGowan, B. K., Ervin, F. R., and Koelling, R. A. (1968). Cues: Their relative effectiveness as a function of the reinforcer. Science, 160, 794-5.
Glennan, S. (1996). Mechanisms and the nature of causation. Erkenntnis, 44, 50-71.
- (2002). Rethinking mechanistic explanation. Philosophy of Science, 69, S342-53.
Gopnik, A. and Meltzoff, A. N. (1987). The development of categorization in the second year and its relation to other cognitive linguistic developments. Child Development, 58, 1523-31.
Hempel, C. G. (1965). Aspects of scientific explanation. In C. G. Hempel (ed.), Aspects of Scientific Explanation and Other Essays in the Philosophy of Science. New York: Macmillan.
Hull, C. L. (1943). Principles of Behavior. New York: Appleton-Century-Crofts.
Kaneko, K. (1990). Clustering, coding, switching, hierarchical ordering, and control in a network of chaotic elements. Physica D, 41, 137-42.
Kelso, J. A. S. (1995). Dynamic Patterns: The Self-Organization of Brain and Behavior. Cambridge, MA: MIT Press.
CHAPTER ELEVEN

Can Consciousness and Qualia Be Reduced?

William G. Lycan
Explananda
I shall address just three notions of consciousness, the three that seem to occupy most theorists in contemporary philosophy of mind:

1 Conscious awareness, and "conscious states" in the sense of mental states whose subjects are directly aware of being in them.¹
2 "Phenomenal consciousness," as Block (1995) calls it, viz., being in a sensory state that has a distinctive qualitative or phenomenal character.
3 The matter of "what it is like" for the subject to experience a particular phenomenal property - which many writers (including Block) fail to distinguish from (2). It too is sometimes called "phenomenal consciousness."
The main uses of the term "qualia" (singular, "quale") parallel (2) and (3). A quale in the sense of C. I. Lewis (1929) is the sort of qualitative or phenomenal property mentioned in (2): the color of an after-image, or of a more ordinary patch in one's visual field; the pitch or volume or timbre of a subjectively heard sound; the smell of an odor; a particular taste; the texture encountered by touch in feeling some object in the dark.

Some writers, in my view inaccurately, have used the phrase "what it is like" (hereafter "w.i.l."), brought to the fore by Farrell (1950) and Nagel (1974), to mean just a quale in the sense just mentioned. But the more common use is as in (3) above: what it is like for the subject to experience a particular phenomenal property or quale, a higher-order property of the quale itself. Just as unfortunately, "qualia" and "quale" themselves have been used in the higher-order sense of (3). I shall here reserve "quale" for (2) and "w.i.l." for (3).
It is important to see that those two uses differ. First, just to make the obvious point, if w.i.l. is a higher-order property of a quale, it cannot be the quale itself. Second, as Carruthers (2000) points out, a quale presents itself as part of the world, not as part of one's mind. It is the apparent color of an apparently physical object (or, if you are a Russellian, the color of a sense-datum that you have happened to encounter as an object of consciousness); as Carruthers says, a quale is what the world is or seems like. But w.i.l. to experience that color is what your first-order perceptual state is like, quintessentially mental and experienced as such.

Third, a quale can be described in one's public natural language, while what it is like to experience the quale seems to be ineffable. If you are experiencing a reddish after-image and I ask you, "What color does it look to you?", you will reply, "It looks red." But if I persevere and ask you w.i.l. to experience that "red" look, you can only go comparative: "It looks the same color as that," pointing to some nearby object. If I press further and insist on an intrinsic characterization, you will probably go blank. In one way, you can describe the phenomenal color, as "red," but when asked w.i.l. to experience that red, you cannot describe that directly. So there is a difference between "w.i.l." in the bare sense of the quale, the phenomenal color that can be described using ordinary color words, and w.i.l. to experience that phenomenal color, which cannot easily be described in public natural language at all.

And fourth, a bit more tendentiously, Armstrong (1968), Nelkin (1989), Rosenthal (1991), and Lycan (1996) have argued that qualia can fail to be conscious in the earlier sense (1) of awareness. That is, a quale can occur without its being noticed by its subject. But in such a case, it would not be like anything for the subject to experience that quale. So even when one is aware of a quale, the "w.i.l." requires awareness and so is something distinct from the quale itself.
Reduction
Reduction is sometimes understood as a relation between theories, typically as a derivational relation between sets of sentences or propositions. But in our title question it is applied to consciousness and qualia themselves, so its sense is ontological. It may apply to individual entities, to properties, to events, processes, or states of affairs.

Let us further distinguish type reduction from token reduction. Token reduction is the weaker notion: A's are token-reducible to B's iff every A is entirely constituted by some arrangement of B's. Thus, chess pieces are token-reducible to the microentities of physics, because every individual chess piece is entirely constituted by such entities. But chess pieces are not type-reducible to physics, because there is no correlation between rooks (or for that matter between chess pieces as such) and any particular type of configuration of subatomic particles described in the language of physics. A's are type-reducible to B's iff every standard type of A is constituted by a specific type of configuration described in B language. Thus, lightning is type-reducible to high-voltage electrical discharge in the sky, table salt to the compound NaCl, and so on.

The identity of properties as conceived here is a matter of type-reduction of their instances. The property of being salt reduces to the property of being NaCl, since quantities of salt are type-reducible to quantities of NaCl; the property of being cash is that of having been made legal or quasi-legal tender in a given population; etc. (But type-reduction of instances as defined above does not entail property identity, since it may involve only constitution rather than strict identity.)

Consciousness(es) and qualia are even less likely to be type-reduced to physics than are chess pieces, but I will suggest that they are token-reducible, and that they can be type-reduced to functional and information-theoretic properties located at various higher levels of a creature's physical organization.
This objection demands a careful response. White (1987) and Lycan (1996) have tried to rebut it.

Computational/cognitive overload. Carruthers (2000) points out that, given the richness of a person's conscious experience at a time, the alleged higher-order representing agencies would be kept very busy. The higher-order representations would have to keep pace with every nuance of the total experience. The complexity of the experience would have to be matched in every detail by a higher-order perception or thought. It is hard to imagine that a human being would have so great a capacity for complex higher-order representation, much less that a small child or a nonhuman animal would have it. Carruthers concludes that if any HOR theory is true, then to say the least, few if any creatures besides human adults have conscious experiences.

Some HOR theorists can live with that apparent consequence. Lycan (1999a) maintains, contra Carruthers' premise, that in the present sense of "conscious," very few of our mental states are conscious states.
experience is unveridical. After-images are, after all, optical illusions. The quale, the redness of the blob, is, like the blob itself, an intentional inexistent. And that is how the dilemma is resolved: There is a red thing that I am experiencing, but it is not an actual thing. (I like to say: in defending sense-data, Russell mistook a nonactual physical thing for an actual nonphysical thing.) The "there is" in "There is a red thing that . . ." is the same existentially noncommittal "there is" that occurs in "There is an allegedly Christian god that is not immaterial, but lives on a real planet somewhere in the universe" and in "There is a growth effect that a few loony economists believe in but that doesn't exist and never will."

The main argument in favor of the representational theory has in effect already been given: that otherwise our dilemma succeeds in refuting materialism, and commits us to actual Russellian sense-data. Of course that is only grist to the mill of someone who points to qualia in sense (2) by way of arguing against materialism, but representationalism is a reductive strategy, and shows that such a refutation does not succeed as it stands, and so contributes to an affirmative answer to the question "Can consciousness and qualia be reduced?"
A second argument is Harman's (1990) transparency argument: We normally "see right through" perceptual states to external objects and do not even notice that we are in perceptual states; the properties we are aware of in perception are attributed to the objects perceived.

Look at a tree and try to turn your attention to intrinsic features of your visual experience. I predict you will find that the only features there to turn your attention to will be features of the presented tree, including relational features of the tree "from here". (Harman, 1990, p. 39)
Tye (1995) extends this argument to bodily sensations such as pain. It can be extended further to the hallucinatory case. Again I am looking at a real, bright yellow banana in normal conditions. Suppose also that I simultaneously hallucinate a second, identical banana to the left of the real one. The relevant two regions of my visual field are phenomenally just the same; the banana and color appearances are just the same in structure. The yellowness inhering in the second-banana appearance is exactly the same property as that inhering in the first. But if we agree that the yellowness perceived in the real banana is just the actual color of the banana itself, then the yellowness perceived in the hallucinated banana is just the yellowness of the hallucinated banana itself. And that accounts for the yellow quale involved in the second banana appearance.
I believe the transparency argument shows that visual experience represents
external objects and their apparent properties. But that is something we really knew
anyway. What the argument does not show, but only claims, is that experience
has no other qualitative properties that pose problems for materialism. The obvious
candidate for such a property is a w.i.l. property - sense (3) - but there are other
candidates as well (Block, 1 996); see below.
A variation on the transparency argument begins with our need to explain the
distinction between veridical and unveridical visual experiences. My veridical experi
ence of the banana is as of a yellow banana, and has yellowness as one of its
would never do. (Does some brain state have the function of indicating ashtrays? angels? Ariel? Anabaptists? Arcturus? algebra? aleph-null?) More generally, though I cannot argue this here, I judge that psychosemantics is in very bad shape. Though naturalist and reductionist by temperament, I am not optimistic about the ontological reduction of intentionality to matter. Now, like HOR theories of conscious awareness, the representational theory reduces qualia, not to matter itself, but to intentionality. If intentionality cannot be reduced, then we will not have reduced either conscious awareness or qualia to the material.
Still, it is a prevalent opinion that although intentionality is a problem for the materialist, consciousness and qualia are a much greater difficulty. So it is worth arguing that consciousness and qualia can be reduced to representation, even if in the end representation remains unreduced.
Counterexamples. Allegedly there are cases in which either two experiences share their intentional content and differ in their qualia or they differ entirely in their intentional content but share qualia. Peacocke (1983) gave three examples of the former kind, Block (1990) one of the latter. (Block, 1995 and 1996, also offers some of the former kind; for discussion, see Lycan, 1996.)
In Peacocke's first example, you experience two (actual) trees, at different distances from you but as being of the same physical height and breadth; "[y]et there is also some sense in which the nearest tree occupies more of your visual field than the more distant tree" (Peacocke, 1983, p. 12). Peacocke maintains that that sense is qualitative, and the qualitative difference is unmatched by any representational difference. The second and third examples concern, respectively, binocular vision and the reversible-cube illusion.
In each case, Tye (1995) and Lycan (1996) have rejoined that there are after all identifiable representational differences constituting the qualitative differences. Tye (2003) continues this project vis-à-vis further alleged counterexamples, to good effect.
Block appeals to an "Inverted Earth," a planet exactly like Earth except that its real physical colors are (somehow) inverted with respect to ours. The Twin Earthlings' speech sounds just like English, but their intentional contents in regard to color are inverted relative to ours: When they say "red," they mean green (if it is green Twin objects that correspond to red Earthly objects under the inversion in question), and green things look green to them even though they call those things "red." Now, an Earthling victim is chosen by the usual mad scientists, knocked out, fitted with color-inverting lenses, transported to Inverted Earth, and repainted to match that planet's human skin and hair coloring. Block contends that after some length of time, short or very long, the victim's word meanings and propositional-attitude contents and all other intentional contents would shift to match the Twin Earthlings' contents, but, intuitively, the victim's qualia would remain the same. Thus, qualia are not intentional contents.
The obvious representationalist reply is to insist that if the intentional contents would change, so too would the qualitative contents. Block's nearly explicit argument to the contrary is that qualia are "narrow" in that they supervene on head contents (on this view, two molecularly indistinguishable people could not experience different qualia), while the intentional contents shift under environmental pressure precisely because they are wide. If qualia are indeed narrow, and all the intentional
contents are wide and would shift, then Block's argument succeeds. (Stalnaker, 1996, gives a version of Block's argument that does not depend on the assumption that qualia are narrow; Lycan, 1996, rebuts it.)
Three rejoinders are available. The first is to insist that not all the intentional contents would shift. Word meanings would shift, but it does not follow that visual contents ever would. Lycan (1996) argues that we have no reason to think that visual contents would shift. The second rejoinder is to hold that although all the ordinary intentional contents would shift, there is a special class of narrow though still representational contents underlying the wide contents; qualia can be identified with the special narrow contents. That view has been upheld by Shoemaker (1994a), Tye (1994), and especially Rey (1998). Rey argues vigorously that qualia are narrow, and then offers a narrow representational theory. (But it turns out that Rey's theory is not a theory of qualia in sense (2); see below.)
The third rejoinder is to deny that qualitative content is narrow and to argue that it is wide, i.e., that two molecularly indistinguishable people could indeed experience different qualia. This last is the position that Dretske (1996) has labeled "phenomenal externalism." It is maintained by Dretske, Tye (1995), and Lycan (1996, 2001). A number of people - even Tye himself (1998) - have since called the original contrary assumption that qualia are narrow a "deep/powerful/compelling intuition," but it proves to be highly disputable.
If the representational theory is correct, then qualia are determined by whatever determines a perceptual state's intentional content. In particular, the color properties represented are taken to be physical properties instanced in the subject's environment. What determines a psychological state's intentional content is given by a psychosemantics. But every even faintly plausible psychosemantics makes intentional contents wide. Of course, the representational theory is just what is in question; but this argument does not beg that question; it merely points out that the anti-representationalist is not entitled to the bare assumption that qualia are narrow. And notice that that assumption is a positive modal claim, a claim of necessitation; we representationalists have no a priori reason to accept it.
Although until recently the assumption that qualia are narrow had been tacit and
entirely undefended, opponents of representationalism have since begun defending
it with vigor. Here are two of their arguments, with sample replies.
Introspection. An Earthling suddenly fitted with inverting lenses and transported to Inverted Earth would notice nothing introspectively, despite a change in representational content; so the qualia must remain unchanged and so are narrow.
Reply: The same goes for propositional attitudes, i.e., the transported Earthling would
notice nothing introspectively. Yet the attitude contents are still wide. Wideness does
not predict introspective change under transportation. (Which perhaps is odd.)
Modes of presentation (Rey, 1998). There is no such thing as representation without
a mode of presentation. If a quale is a represented property, then it is represented
under some mode of presentation, and modes of presentation may be narrow even
when the representational content itself is wide. Indeed, many philosophers of
mind take modes of presentation to be internal causal or functional roles played
by the representations in question. Surely they are strong candidates for qualitative
content. So are they not narrow qualia?
Reply: Remember, qualia in sense (2) are properties like phenomenal yellowness
and redness, which according to the representational theory are representata. The mode
or guise under which yellowness and redness are represented in vision is something
else again. (It can plausibly be argued that such modes and guises are qualitative or
phenomenal properties of some sort, perhaps higher-order properties. See the next
section.)
There are at least ten more such arguments, and few will be convinced by all of the externalist replies to them. But no one should find it obvious that qualia are narrow.
The representationalist reduction of qualia is subject to a more general sort of objection that has been raised against many forms of materialism: a conceivability argument. On those, see Gertler (chapter 12, CONSCIOUSNESS AND QUALIA CANNOT BE REDUCED).5
"What it is Li ke"
It is "w.i.l." in sense (3) that fIgures in anti-materialist arguments from subjects'
"knowing what it is like," primarily Nagel's ( 1974) "Bat" argument and Jackson's ( 1 982)
"Knowledge" argument. Jackson's character Mary, a brilliant color scientist trapped
in an entirely black-and-white laboratory, nonetheless becomes omniscient as
regards the physics and chemistry of color, the neurophysiology of color vision, and
every other public, objective fact conceivably relevant to human color experience.
Yet when she is fmally released from her captivity and emerges into the outside world,
she sees colored objects for the fIrSt time, and learns something: she learns what it
is like to see red and the other colors. Thus she seems to have learned a new fact,
the fact of w.i.l. to see red. By hypothesis, that fact is not a public, objective one,
but is an intrinsically perspectival fact. This is what would refute materialism, since
materialism has it that every fact about every human mind is ultimately a public,
objective fact.
Upon her release, Mary has done two things: She has at last hosted a red quale
in sense (2), and she has learned what it is like to experience a red quale. The fact
she has learned has the ineffability characteristic of w.i.l. in sense (3); were Mary to
try to pass on her new knowledge to a still color-deprived colleague, she would not
be able to express it in English.
As there are representational theories of qualia in sense (2), there are representational theories of w.i.l. in sense (3). The most common answer to the arguments of Nagel and Jackson is what may be called the "perspectivalist" reply (Horgan, 1984; Churchland, 1985; Van Gulick, 1985; Tye, 1986, 1995; Lycan, 1987, 1996, 2003; Loar, 1990; Rey, 1991; Leeds, 1993). The perspectivalist notes that a knowledge difference does not entail a difference in fact known, for one can know a fact under one representation or mode of presentation but fail to know one and the same fact under a different mode of presentation. Someone might know that lightning is flashing but not know that electrical discharge was taking place in the sky, and vice versa; someone might know that person X is much gossiped about without knowing that she herself is much gossiped about, even if she herself is in fact person X. So, from Mary's knowledge difference following her release, Jackson is not entitled to infer the existence of a new, weird fact, but at most that of a new way of representing a fact that
Mary already knew in a different guise. She has not learned a new fact, but has only acquired a new, introspective or first-person way of representing one that she already knew under its neurophysiological aspect.
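The perspectivalist point can be put compactly; the notation is my gloss, not Lycan's. Where m1 and m2 are modes of presentation of one and the same fact f, the conjunction

\[
K_{\mathrm{Mary}}(f \text{ under } m_1) \wedge \neg K_{\mathrm{Mary}}(f \text{ under } m_2)
\]

is perfectly consistent: knowledge attributions are sensitive to guises, so a difference in knowledge does not entail a difference in facts. On this reading, Mary's release supplies a new m, not a new f.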
This response to Nagel and Jackson requires that the first-order qualitative state itself be represented (else how could it be newly known under Mary's new mode of presentation?). And that hypothesis in turn encourages a HOR theory of awareness and introspection. Since we have seen that HOR theories of awareness face significant objections, the perspectivalist must either buy into such a theory despite its drawbacks or find some other way of explicating the idea of an introspective or first-person perspective without appealing to higher-order representation. But I myself am a happy HOR partisan, and find the idea of an introspective perspective entirely congenial despite its unfortunate assonance.
Notes
1 Important: the phrase "conscious state" has been used in at least one entirely different sense, as by Dretske (1993). Failure to keep these senses straight has led to much confusion. The present use is as in "conscious memory" or "conscious decision."
2 For extensive general discussion of HOR views, see Gennaro, 2004. A more recent competitor is Van Gulick's (2001) "HOGS" (Higher-Order Global State) theory.
3 Lycan (1998) answers this objection, not very convincingly.
4 On this view, perhaps surprisingly, qualia turn out not to be properties of the experiences that present them: qualia are represented properties of represented objects, and so they are only intentionally present in experiences. The relevant properties of the experiences are just their representative properties, such as representing yellow.
5 Lycan (2004) rebuts one of the latest conceivability arguments.
References

Dennett, D. C. (1978). Why you can't make a computer that feels pain. In Brainstorms. Montgomery, VT: Bradford Books.
Dretske, F. (1993). Conscious experience. Mind, 102, 263-83.
- (1995). Naturalizing the Mind. Cambridge, MA: Bradford Books/MIT Press.
- (1996). Phenomenal externalism. In E. Villanueva (ed.), Philosophical Issues, vol. 7: Perception. Atascadero, CA: Ridgeview Publishing.
Farrell, B. A. (1950). Experience. Mind, 59, 170-98.
Fodor, J. A. (1987). Psychosemantics. Cambridge, MA: Bradford Books/MIT Press.
Gennaro, R. (1995). Consciousness and Self-Consciousness. Philadelphia: John Benjamins.
- (ed.) (2004). Higher-Order Theories of Consciousness. Philadelphia: John Benjamins.
Harman, G. (1990). The intrinsic quality of experience. In J. E. Tomberlin (ed.), Philosophical Perspectives, vol. 4: Action Theory and Philosophy of Mind. Atascadero, CA: Ridgeview Publishing. (Reprinted in W. G. Lycan (ed.), Mind and Cognition (2nd edn.). Oxford: Blackwell.)
Heil, J. (1988). Privileged access. Mind, 97, 238-51. (Reprinted in W. G. Lycan (ed.), Mind and Cognition (2nd edn.). Oxford: Blackwell.)
Hintikka, K. J. J. (1969). On the logic of perception. In N. S. Care and R. H. Grimm (eds.), Perception and Personal Identity. Cleveland, OH: Case Western Reserve University Press.
Horgan, T. (1984). Jackson on physical information and qualia. Philosophical Quarterly, 34, 147-52.
Jackson, F. (1982). Epiphenomenal qualia. Philosophical Quarterly, 32, 127-36. (Reprinted in W. G. Lycan (ed.), Mind and Cognition (2nd edn.). Oxford: Blackwell.)
Kim, J. (1995). Mental causation: What, me worry? In E. Villanueva (ed.), Philosophical Issues, vol. 6: Content. Atascadero, CA: Ridgeview Publishing.
Kraut, R. (1982). Sensory states and sensory objects. Nous, 16, 277-95.
Leeds, S. (1993). Qualia, awareness, Sellars. Nous, 27, 303-30.
Lewis, C. I. (1929). Mind and the World Order. New York: C. Scribner's Sons.
Lewis, D. (1983). Individuation by acquaintance and by stipulation. Philosophical Review, 92, 3-32.
- (1997). Naming the colours. Australasian Journal of Philosophy, 75, 325-42.
Loar, B. (1990). Phenomenal states. In J. E. Tomberlin (ed.), Philosophical Perspectives, vol. 4: Action Theory and Philosophy of Mind. Atascadero, CA: Ridgeview Publishing.
Lycan, W. G. (1987). Consciousness. Cambridge, MA: Bradford Books/MIT Press.
- (1996). Consciousness and Experience. Cambridge, MA: Bradford Books/MIT Press.
- (1998). In defense of the representational theory of qualia. (Replies to Neander, Rey, and Tye.) In J. E. Tomberlin (ed.), Philosophical Perspectives, vol. 12: Language, Mind, and Ontology. Atascadero, CA: Ridgeview Publishing.
- (1999a). A response to Carruthers' "natural theories of consciousness." Psyche, 5. http://psyche.cs.monash.edu.au/v5/psyche-5-11-lycan.html.
- (ed.) (1999b). Mind and Cognition (2nd edn.). Oxford: Blackwell.
- (2001). The case for phenomenal externalism. In J. E. Tomberlin (ed.), Philosophical Perspectives, vol. 15: Metaphysics. Atascadero, CA: Ridgeview Publishing.
- (2003). Perspectivalism and the knowledge argument. In Q. Smith and A. Jokic (eds.), Consciousness: New Philosophical Perspectives. Oxford: Oxford University Press.
- (2004). Vs. a new a priorist argument for dualism. In E. Sosa and E. Villanueva (eds.), Philosophical Issues, vol. 13. Oxford: Blackwell.
Nagel, T. (1974). What is it like to be a bat? Philosophical Review, 83, 435-50.
Neander, K. (1998). The division of phenomenal labor: A problem for representational theories of consciousness. In J. E. Tomberlin (ed.), Philosophical Perspectives, vol. 12: Language, Mind, and Ontology. Atascadero, CA: Ridgeview Publishing.
CHAPTER
TWELVE
Consciousness and Qualia Cannot Be Reduced
Brie Gertler
Reductionism is currently the favored position in the philosophy of mind, and I surmise that it enjoys an even wider margin of popularity among cognitive scientists. Accordingly, my aspirations here are limited. I hope to show that reductionism faces serious difficulties, and that anti-reductionism may be more palatable than usually believed.
Preliminaries
I will focus on consciousness in Lycan's sense (2): qualia. (See chapter 11, CONSCIOUSNESS AND QUALIA CAN BE REDUCED.) Qualia are the first-order phenomenal properties of (some) mental states. They are also called "qualitative features," and are said to make up a state's "phenomenal character." I won't address the other two senses of "consciousness" that Lycan more briefly discusses; these are (1) "conscious awareness" and (3) the second-order property of qualia that he terms "what it's like." For I think that it is the first-order phenomenal features of mental states that generate the core of what Chalmers (1996) aptly calls "the hard problem of consciousness." In any case, it is first-order qualia that pose the hardest part of the philosophical problem.
I will argue that qualia cannot be reduced to physical, functional, and/or computational ("information-theoretic," in Lycan's phrase) properties of the organism. What exactly does it mean to say that qualia are thus irreducible? As Lycan says, the notion of reduction at issue here is an ontological one; it does not chiefly concern concepts or predicates, and there is no guarantee that truths about reduction are available to a priori reflection. While the precise requirements for reduction are a matter of controversy, one point is clear. Any plausible theory of reduction will gloss the relation between the reduced properties (or tokens) and the reducing properties (or tokens) as necessary. That is, in a genuine case of reduction the relationship between what is reduced and the reductive base cannot be mere correlation, or even lawlike correlation. For an anti-reductionist can allow that physical properties cause irreducibly nonphysical qualia, and that these causal connections are lawlike.1 Compare: someone who believes in ghosts can claim that certain physical processes - those involved in a seance, say - causally suffice for the presence of ghosts. But clearly this does not commit her to physicalist reductionism about ghosts. I will not attempt to offer necessary and sufficient conditions for reduction. Instead, because I will be arguing against reduction, I will specify one necessary condition that, I think, cannot be met in the current case. This necessary condition involves necessary supervenience. It says that Q-properties are reducible to P-properties only if Q-properties necessarily supervene on P-properties. That is,

Q-properties are reducible to P-properties only if, for all x and all y, x is exactly similar to y as regards P-properties => x is exactly similar to y as regards Q-properties.
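For readers who want the condition in symbols, here is one way to formalize it; the notation, including the choice of a metaphysical-necessity box over the whole conditional, is my gloss rather than Gertler's own formula:

\[
\mathrm{Red}(Q,P) \;\rightarrow\; \Box\,\forall x\,\forall y\,\bigl[\forall P_i\,(P_i x \leftrightarrow P_i y) \rightarrow \forall Q_j\,(Q_j x \leftrightarrow Q_j y)\bigr]
\]

where Red(Q, P) abbreviates "the Q-properties are reducible to the P-properties." The box is what distinguishes reduction from mere lawlike correlation: a correlation holding in all worlds that share our natural laws still falls short of it.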
Put simply, reduction requires that any difference in reduced properties must be grounded in some difference in reducing properties. So if qualia are reducible to physical (functional, computational) properties, then two persons that are physically, functionally, and computationally alike must be alike in their qualia properties. The necessary supervenience of qualia on physical properties is sometimes expressed by saying "Once God fixed the physical facts, the qualia facts were also fixed." This way of expressing necessary supervenience illustrates why reduction requires necessary supervenience. For if qualia facts reduce to physical facts, fixing the former should require no work beyond fixing the latter.
While reduction requires necessary supervenience, necessary supervenience may not suffice for reduction. For instance, many have worried that the fact (if it is a fact2) that a variety of physical properties can realize qualia - the "multiple realizability" of qualia - may thwart attempts at reduction. And one may consistently accept necessary supervenience while rejecting reduction because of concerns about multiple realizability. The fact that necessary supervenience is a relatively weak notion, as compared with the more full-blooded notion of reduction, allows the strategy to sidestep the issues about multiple realizability that are, I suspect, behind Lycan's qualms about type reduction to the physical. For my approach allows the argument to apply to type- and token-reduction equally, since both require necessary supervenience. Lycan's criterion for the token-reducibility of A's to B's is that "every A is entirely constituted by some arrangement of B's." Taking A's to be entities individuated by their Q-properties, and B's to be individuated by their P-properties, the relation of constitution demands that there is no difference in Q-properties without a corresponding difference in P-properties. (The requirement of necessary supervenience is even clearer in the case of type reduction, which, as Lycan notes, is a more stringent relation than token reduction.)
physicalist reductionism fails so long as qualia can vary independently of the physical (functional, computational). Because evaluating a claim about what is (or is not) possible requires considering non-actual scenarios, conceivability arguments are indispensable in evaluating the prospects for reductionism. Or so I shall argue here.
The most straightforward conceivability arguments against reductionism use one of two claims. (i) We can conceive of a scenario in which particular qualia are present, but the physical, functional, and/or computational properties to which they allegedly reduce are absent. (ii) We can conceive of a scenario in which qualia are absent, but the allegedly reducing properties are present. The argument that uses (i) is what I'll call the "disembodiment argument": I can conceive of my qualia tokens occurring in the absence of a body; whatever I can conceive is possible; so my mind is not reducible to any body (Descartes, 1641/1984).3 The argument that uses (ii) is the "zombie argument": I can conceive of a creature that shares my physical, functional, and computational properties, but that lacks qualia altogether; so qualia are not reducible to those properties (Kirk, 1974; Chalmers, 1996). A third anti-reductionist argument, the "knowledge argument," is less straightforward but very influential (Jackson, 1982). Jackson asks us to imagine Mary, a neuroscientist of the future who knows all of the physical (functional, computational) facts about color vision, but who has spent her entire life in a black-and-white room, and has never perceived any other colors. He argues that Mary would not be in a position to determine which qualia are associated with any particular neurophysiological state. As its name suggests, this argument is more explicitly epistemic than the disembodiment and zombie arguments. But I will argue below that this difference is not as significant as it may at first appear, for the disembodiment and zombie arguments also have a crucial epistemic dimension.
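As a rough schematization (mine, not the author's), the disembodiment and zombie arguments share one modal skeleton, with C for "is conceivable" and a diamond for "is possible":

\[
\begin{array}{ll}
1. & C(\text{Q instantiated without P}) \quad \text{(disembodiment; zombies use } C(\text{P without Q})\text{)}\\
2. & C(\varphi) \rightarrow \Diamond\varphi \quad \text{(conceivability entails possibility)}\\
3. & \Diamond(\text{Q without P}) \quad \text{(from 1, 2)}\\
4. & \text{reduction requires } \Box(\text{no Q-difference without a P-difference})\\
5. & \text{so the Q-properties are not reducible to the P-properties (3, 4).}
\end{array}
\]

The knowledge argument reaches step 3 by a more indirect, epistemic route, and the "objections from ignorance" discussed below all target step 2.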
Before assessing these particular arguments, I want to address one source of qualms about using conceivability arguments. The worry is that conceivability arguments tell us, at most, about our concepts of qualia (or whatever else they concern). But qualia themselves may not neatly fit our concepts. So it's illegitimate to use conceivability arguments to determine the nature of qualia, including whether qualia are ultimately reducible. I think that conceivability arguments are not only legitimate, but also indispensable for evaluating reductionism. First, as explained above, the relation between a reducing property and a reduced property is necessary. This means that we cannot evaluate a claim about reducibility simply by examining actual correlations between P-properties and Q-properties. For such correlations may be accidental or, more plausibly, they may be a result of natural law. So assessing reductionist claims requires considering whether non-actual scenarios are possible; and this is precisely the method of conceivability arguments. Now in a sense it's true that conceivability arguments are based in our concepts. But it is only by exercising our concepts that we are able to think about the world; and it is only by exercising our concept of qualia that we are able to think about qualia. Of course, our concepts may be seriously incomplete, and so we may not have clear intuitions about whether a described scenario is possible.4 More to the point, conceivability arguments are plainly fallible. We may be using concepts that nothing satisfies; and inattention or sloppiness may lead us to mistakenly think that we are conceiving of a particular scenario when in fact we are conceiving of a slightly different scenario. But even if our methods are imperfect, they may nonetheless be indispensable. Conceivability is the only guide to necessity;
our concepts, and the intuitions about possibility that derive from them, provide our only grip on modal claims.5
Finally, it's worth mentioning that modal intuitions - intuitions about what is possible and impossible, which it is the aim of conceivability arguments to reveal - are as important to arguments for reductionism as they are to anti-reductionist claims. Again, reductionism entails that it's impossible for the reduced property to vary independently of the reducing property. Since a claim of impossibility cannot be established by considering the actual world alone (though of course it can be refuted in this way), the reductionist must consider whether certain non-actual scenarios are possible. And the only way to determine this is to use the method of conceivability.
zombies are possible; and Lycan (this volume, chapter 11) contends that it is ignorance about the relation between phenomenal and physical modes of presentation that leads us to mistakenly conclude, from Mary's inability to determine the phenomenal properties of physical states, that qualia are irreducible to the physical. Since these objections target the epistemic dimension of anti-reductionist conceivability arguments by contending that ignorance blocks the inference from conceivability to possibility, I'll call them "objections from ignorance."
The key responses to objections from ignorance maintain that, even if our current understanding of qualia and/or physical (functional, computational) properties is limited, these limitations do not block the inference to the envisaged ontological conclusion. For instance, Descartes himself argued that there is much we do not know about the mental: in particular, introspection may not afford knowledge of its causal etiology.6 But he did think that introspection yields a complete and adequate picture of the essential features of mentality. Others have argued that, even if we don't understand the physical in all its specificity, we do have a concept of the physical that, in its very nature, seems to exclude the phenomenal (Chalmers, 1996). And I have argued that the "same fact, different mode of presentation" objection to the knowledge argument at best re-describes the knowledge Mary gains, leaving an epistemic gap that is best explained ontologically (Gertler, 1999).
Before turning to specific objections to each of the three conceivability arguments, I want to make two general remarks about objections from ignorance. First, appeals to ignorance must be carefully restricted. In particular, objections from ignorance cannot claim that we have no clear notion of qualia, or of the physical, or of the relation between the two; otherwise, the claim that qualia are (or are not) reducible to the physical is empty. And so it is entirely legitimate to offer, in defense of these conceivability arguments, some gloss of our concept of qualia and/or the physical. In fact, it's hard to see how there could be any basis for evaluating a claim about reductionism, either in favor of it or against it, that doesn't use at least a tentative outline of these notions. This isn't to say that we need a thoroughly detailed description. But we do need some criterion, even if this is simply "physical properties are whatever properties will explain rocks, trees, and planets."7 Moreover, a blanket skepticism about the relations between qualia and physical properties would block reductionist and anti-reductionist arguments equally.
Second, there appears to be a tension between the objections from ignorance and the larger impetus for physicalist reduction. One advantage that physicalist reduction purportedly confers is epistemic: if qualia are physically reducible, then the scientific methods used in explaining physical phenomena will explain qualia as well. But now suppose that we are in fact ignorant about the physical, as some of the objections from ignorance maintain. In that case, the scientific methods used to explain physical phenomena are either limited or faulty; hence, the idea of applying these methods to qualia seems less promising.
I can see two ways for the reductionist to respond to this tension. I'll address each of these in turn.
First reductionist response. Reductionism holds out the hope that all of reality is ultimately explainable by the methods used by natural science. For despite the fact that we don't have the explanations in hand, such explanations are in principle possible
so long as reductionism is true. And surely the idea that a single set of methods could
yield a coherent account of all of reality is epistemologically promising.
Reply. To assess this response, we must understand what is meant by "the methods used by natural science." If this phrase is interpreted narrowly, to refer to the particular techniques of current science, then the claim that such methods will explain all of reality loses plausibility. For the history of science has shown us that a thorough account of reality often demands the development of new investigative methods. Now suppose that we interpret this phrase more liberally, so that it encompasses specific innovations within the larger, relatively stable framework of what might be called the "scientific approach" to the world. In that case, the anti-reductionist will argue that a core tenet of the scientific approach is that one must follow the data where they lead, without prejudging the outcome of an experiment; and the conceivability arguments provide data that call for explanation. Of course, reductionists do offer alternative explanations of that data, and I will consider some of these below. My point is just that an appeal to the general epistemological benefits of scientific methodology in no way diminishes the force of the conceivability arguments.
The anti-reductionist will make a similar reply to a reductionist appeal to a general methodological principle, like Occam's razor. Occam's razor prohibits multiplying entities (and, by extension, properties) without need; but of course what is at issue here is precisely whether there is need. Finally, the reductionist may claim that physicalistic theories have more explanatory power, since they allow for a greater simplicity in laws, etc., and thereby yield more powerful explanations of observed phenomena. But anti-reductionists will maintain that, inasmuch as reductionist theories fail to explain qualia, they are less explanatorily powerful. Again, my point here is just that this sort of general methodological stance does not eliminate the need for specific responses to the conceivability arguments.
Second reductionist response. Even if reductionism isn't supported on general epistemic grounds, it carries ontological benefits. Specifically, it allows us to avoid "spooky" properties like irreducibly nonphysical qualia, and thus allows for a larger naturalistic picture of the world.
Reply. Whether naturalism supports physicalist reductionism depends, of course, on what is meant by "naturalism." Suppose that naturalism is understood methodologically, as the claim that empirical science represents the only legitimate method of understanding the universe. In that case, my reply to the previous response applies: either this is unduly restrictive, or a naturalist account of reality must accommodate the data revealed by the conceivability arguments. Alternatively, naturalism may be given an ontological gloss: in this sense, naturalistic properties are (reducible to) physical properties. But an appeal to naturalism in that sense is, of course, question-begging. Relatedly, irreducibly nonphysical qualia should not be dismissed on the grounds that they're "spooky" unless it can be shown that there are less spooky alternatives. As I will explain below, anti-reductionists claim that reductionism is committed to brute, unexplained, necessary correlations between qualia and physical properties - what Chalmers (1999) calls "strong necessities" - that are, they think, spookier than irreducible qualia. Much more can be said about these issues. But I hope that these brief remarks illustrate that objections from ignorance and appeals to naturalism do not easily defeat anti-reductionism.
For he says that when we introspect qualia, we are presented with properties of qualia: where qualia are first-order properties, introspection reveals higher-order properties, "what it's like" to have qualia. And the higher-order properties grasped through introspection need not yield a complete grasp of the first-order qualia.
As I see it, introspection can yield direct knowledge of first-order qualia properties. (This is not to say that introspection always does yield direct knowledge; only that, in optimal cases, it can.) For I believe that, through introspection, one does not encounter only second-order properties of qualia that are distinct from the first-order qualia; rather, one can directly encounter the qualia themselves. I have defended an account of introspecting qualia along these lines (Gertler, 2001a; see also Chalmers, 2003). But the basic idea that introspection gives us unmediated access to qualia doesn't depend on the details of that account; it is familiar from several sources.
Pain . . . is not picked out by one of its accidental properties; rather it is picked out by the property of being pain itself, by its immediate phenomenological quality. (Kripke, 1980, pp. 152-3)
[T]here is no appearance/reality distinction in the case of sensations. (Hill, 1991, p. 127)
When introspecting our mental states, we do not take canonical evidence to be an intermediary between properties introspected and our own conception of them. We take evidence to be properties introspected. (Sturgeon, 2000, p. 48)
On this conception, our introspective grasp of qualia does not involve any mode of presentation distinct from the qualia themselves. This means that there is no intermediary to limit our introspective grasp of qualia. But we still need to show that this introspective grasp is adequate to determining whether the introspected properties could be present in the absence of physical properties. To make this case, the proponent of the disembodiment argument claims that the essential features of qualia are exhausted by the features in principle available to introspection (together with rational reflection on those features). Moreover, this is a conceptual truth about qualia: e.g., by the quale I might express with the term "burnt orange," I mean this property, where the property is that to which I am introspectively attending.8 The idea that qualia concepts allow for this sort of exhaustive grasp of essential properties, a grasp sufficient to allow the subject to determine the possibility of counterfactual scenarios, was argued by Kripke (1980). As Kripke notes, this feature of qualia concepts, like "pain," provides a stark contrast with physical concepts like "water." For conceptual reflection does not reveal the scientific essence of water - whether it is H2O or XYZ, for instance. And so the apparent conceivability of water that is not H2O does not provide reason to deny that water is (reducible to) H2O. By contrast, the nature of our qualia concepts means that the apparent conceivability of this (introspectively attended to) property (pain, say) being present in the absence of any physical properties, does warrant denying that pain is reducible to any physical property.
There is much more to be said about the operation of qualia concepts. But the basic point is just this: while our concept of a physical property like being water is tied to causal intermediaries (its characteristic look, feel, etc.) that constitute a mode of presentation distinct from the property that constitutes its scientific essence (being
H2O), our concept of pain is not tied to causal intermediaries. Instead, our concepts of pain and other qualia derive from the qualia themselves, grasped introspectively. So the essence of qualia is not a matter for empirical scientists to determine, but is instead available, at least in principle, to anyone who attentively considers her own qualia (though of course there is plenty of room for ignorance about various features of qualia, including their causal relations). Hence, conceivability intuitions rooted in these qualia concepts can reveal what is genuinely possible.
a physics professor may have exhaustive propositional knowledge about how to hit a home run - he might know the speed and the angle at which one should swing the bat - but yet be unable to perform this action. This analysis defuses Jackson's argument, since his conclusion requires that Mary learns a new fact.
Reply. The ability in question must be construed narrowly, for as Conee (1994) has shown, the ability to imagine or to remember seeing red isn't perfectly correlated with knowing what it's like to see red. The ability that is, plausibly, conferred on Mary's release is just the ability to recognize "seeing red" experiences by their phenomenal features. (She was already able to recognize them by their neurophysiological features.)
But does this ability constitute what Mary gains? While it's true that acquiring a piece of propositional knowledge doesn't always suffice for acquiring the associated ability, there are lots of cases in which it does. For instance, given my background knowledge and abilities, I can gain the ability to find your house simply by learning that you live at 123 Main Street. This seems to me a much more accurate model of what Mary gains upon release: she acquires propositional knowledge that "seeing red" experiences are like this (where the "this" is filled out by a quale to which one introspectively attends); and this propositional knowledge explains her subsequent ability to recognize such experiences by their phenomenal quality. But if her propositional knowledge explains this ability, then the ability doesn't exhaust what she gains upon release. (I develop this argument in Gertler, 1999.)
I want to mention an aspect of the knowledge argument that is often missed. Although Jackson stipulates that Mary hasn't had color sensations before her release, it seems to me that the argument doesn't depend on this. The basis of the argument is simply Mary's inability to link color sensations with the physical, including their neurophysiological bases and their standard causes (such as seeing a fire engine). Her lack of color experience simply blocks one way of making such links. For suppose that Mary, before her release, has hallucinations involving the quale red. She might later be able to recall those hallucinations, and she might even give the phenomenal character a name. Still, she couldn't correlate it with neurophysiological states or standard causes, and so she couldn't recognize it as "red." The key point is that these correlations, however they are effected, remain brute. And brute correlations are as troublesome as strong necessities.
The perspectivalist reply. What Mary gains isn't knowledge of a new fact, but just a new way to grasp a fact she already knew. That is, she learns a new, phenomenal mode of presentation of the fact that she previously grasped only under its neurophysiological mode of presentation. The special nature of qualia concepts explains why knowledge of qualia can't be derived from full knowledge of physical (functional, computational) properties. Reductionists have provided a variety of candidates for what is special about qualia concepts; most involve the idea that qualia concepts pick out their referents directly, e.g., "via a species of simulation, without invoking any descriptions" (Papineau, 1998); or "without the use of any descriptive, reference-fixing intermediaries" (Tye, 1999). The lack of a descriptive component would explain why physicalist descriptions of reality won't yield phenomenal knowledge: for such phenomenal knowledge must take descriptive form if it is to be inferred from physical knowledge.
Reply. It seems to me that these proposals face two threats. First, they seem to simply relocate the mystery at the heart of the Mary case. If the non-descriptive nature of phenomenal concepts is what prevents Mary from knowing her subjects' qualia on the basis of their neurophysiological states, her epistemic situation upon leaving the room seems little better. While she can now know which qualia are associated with which neurophysiological states, this is simply knowledge of a correlation. Hence, this analysis of the case doesn't escape the charge that reductionism about qualia is committed to brute necessities.
In a recent article, Lycan acknowledges that this analysis of the knowledge argument leaves intact the "explanatory gap" (Levine, 1983) between qualia and the physical. But, he claims, this gap derives from the fact that phenomenal information is "intrinsically perspectival," and the gap is as "metaphysically harmless" as the gap between non-indexical information contained on a map, and the indexical knowledge "I am here" that cannot be inferred from simply reading the map (Lycan, 2003, p. 391, n. 8). The assumption is that the latter gap does not tempt us to introduce a new ontological category, the perspectival, for my location necessarily supervenes on the non-perspectival.
While this line of reasoning raises a host of issues that I cannot address here, I do want to suggest one anti-reductionist objection to it. Arguably, perspectival facts require consciousness. For imagine: if there were no conscious beings in the world, then while there would be facts such as "the largest elm tree is at location x," there wouldn't be any perspectival facts.13 Now if perspectival facts somehow derive from consciousness, then an anti-reductionism about consciousness would provide an elegant explanation of the gap between perspectival and non-perspectival facts. Anti-reductionists can appeal to the ontological difference between qualia and the physical to explain the special operation of qualia concepts. For instance, they can say that qualia are grasped directly because there are no metaphysical intermediaries between the qualia concept and the qualia themselves. And if my above suggestion is on the right track, this explanation may also yield a satisfyingly unified explanation of the gap between the perspectival and the non-perspectival. In any case, the perspectivalist reply seems, at best, to shift our focus from one gap to another.14
The second threat faced by the perspectivalist reply is that reductionist accounts of what is special about qualia concepts don't do justice to our knowledge of our own qualia. For they use the claim that qualia concepts are non-descriptive to limit our introspective knowledge of qualia, including their descriptive physical features. But I think that what is so epistemically striking about qualia concepts is that they allow for an exhaustive introspective grasp of qualia, as explained in defense of the disembodiment argument above. And I think that this fact cannot be accommodated by the perspectivalist reply, which claims that introspection allows us to grasp qualia only through a non-exhaustive mode of presentation (in Lycan's view, a second-order property of the qualia).
Representationalism
Before closing, let me briefly comment on representationalism, the view that qualitative features of experience are reducible to intentional features. Grant, for the moment,
Conclusion
Anti-reductionists are not opposed to naturalism, or to the scientific method of inquiry. The choice between reductionism and anti-reductionism depends on which data one finds most in need of explanation. Anti-reductionists, myself included, believe that the costs of reductionism - commitment to brute necessities, and disloyalty to intuitions and introspective evidence - cancel out much of its benefits, and outweigh those that remain.
Notes
3 ... thought generally; and (ii) Descartes uses God's abilities to show that what is conceivable is possible, whereas I'm presenting a secular version.
4 On some views of concepts, concepts may be revised. I myself find the notion of genuine conceptual change problematic; on my view, most alleged cases of concept change are better described as either abandoning a concept for a closely related one, or of realizing that one misunderstood one's own concept all along. A good example of the latter is the reaction to Gettier's argument (Gettier, 1963) against analyzing "knowledge" as "justified true belief" ("JTB"). Those who were previously inclined to accept that analysis, but were persuaded by Gettier to reject it, were in effect correcting a prior misconstrual of their own concept. The compellingness of Gettier's argument lay in the fact that the cases he described so obviously failed to be knowledge, though they met the JTB condition.
5 If "conceivability" is used in a factive sense, what I term "conceivability" here is better expressed as "apparent conceivability." I will continue to use "conceivability" in a non-factive sense.
6 At the point in the Meditations where Descartes's version of the disembodiment argument occurs, the meditator has not yet ruled out the possibility that his mental states were produced by the machinations of a malicious genius.
7 This is a version of what Stoljar (2002) calls the "object-based conception" of the physical. On that conception, "a physical property is a property which either is the sort of property required by a complete account of the intrinsic nature of paradigmatic physical objects and their constituents or else is a property which metaphysically (or logically) supervenes on" that sort of property (Stoljar, 2002, p. 313).
8 In fact, a term like "burnt orange" will likely underdetermine the quale I'm referring to, for qualia individuation may be very fine-grained.
9 Suitably altered, this sort of objection may be made to the disembodiment argument as well: even if it's conceivable that I lose my physical properties and become a disembodied seat of qualia, this doesn't mean that it's possible that I do so. My reply can be modified to defend the disembodiment argument.
10 This description expresses what Chalmers calls the primary intension; Jackson (1998) calls it the A-intension.
11 Interestingly, while Jackson (1998) propounds the view that metaphysical necessities must be explained by conceptual necessities, his current position, in that book and elsewhere, is reductionist. This marks a turnabout from his 1982 position.
12 Compare Jackson: "Only in that way [through conceptual analysis] do we define our subject as the subject we folk suppose is up for discussion" (Jackson, 1998, p. 42).
13 We might describe such a world by using indexicals, but of course that doesn't mean that the facts in that world are perspectival.
14 Strikingly, Searle (1992) is a thoroughgoing physicalist, but claims that there is an ontological distinction between the perspectival and the non-perspectival. On his view, some biological properties are "ontologically subjective."
- (2003). The content and epistemology of phenomenal belief. In Q. Smith and A. Jokic (eds.),
Consciousness: New Philosophical Perspectives. Oxford: Oxford University Press.
Conee, E. (1994). Phenomenal knowledge. Australasian Journal of Philosophy, 72, 136-50.
Descartes, R. (1641/1984). Meditations on First Philosophy. In J. Cottingham, R. Stoothoff, and D. Murdoch (eds.), The Philosophical Writings of Descartes, vol. 2. Cambridge: Cambridge University Press.
Gertler, B. (1999). A defense of the knowledge argument. Philosophical Studies, 93, 317-36.
- (2001a). Introspecting phenomenal states. Philosophy and Phenomenological Research, 63, 305-28.
- (2001b). The explanatory gap is not an illusion. Mind, 110, 689-94.
Gettier, E. (1963). Is justified true belief knowledge? Analysis, 23, 121-3.
Hill, C. (1991). Sensations: A Defense of Type Materialism. Cambridge: Cambridge University Press.
Horgan, T. and Tienson, J. (2002). The intentionality of phenomenology and the phenomenology of intentionality. In D. Chalmers (ed.), Philosophy of Mind: Classical and Contemporary Readings. Oxford: Oxford University Press.
Jackson, F. (1982). Epiphenomenal qualia. The Philosophical Quarterly, 32, 127-36.
- (1998). From Metaphysics to Ethics: A Defense of Conceptual Analysis. Oxford: Oxford University Press.
Kirk, R. (1974). Zombies vs. materialists. Aristotelian Society Supplement, 48, 135-52.
Kripke, S. (1980). Naming and Necessity. Cambridge, MA: Harvard University Press.
Levine, J. (1983). Materialism and qualia: The explanatory gap. Pacific Philosophical Quarterly, 64, 354-61.
- (2001). Purple Haze. Oxford: Oxford University Press.
Lewis, D. (1988). What experience teaches. Proceedings of the Russellian Society. Sydney: University of Sydney. (Reprinted in W. G. Lycan, Mind and Cognition. Oxford: Blackwell, 1990.)
Lycan, W. G. (1996). Consciousness and Experience. Cambridge, MA: Bradford Books/MIT Press.
- (2003). Perspectival representation and the knowledge argument. In Q. Smith and A. Jokic (eds.), Consciousness: New Philosophical Perspectives. Oxford: Oxford University Press.
Nagel, T. (1974). What is it like to be a bat? Philosophical Review, 83, 435-50.
Nemirow, L. (1990). Physicalism and the cognitive role of acquaintance. In W. G. Lycan (ed.), Mind and Cognition. Oxford: Blackwell.
Papineau, D. (1998). Mind the gap. Philosophical Perspectives, 12, 373-88.
Pitt, D. (2004). The phenomenology of cognition; or what is it like to think that p? Philosophy and Phenomenological Research, 69, 1-36.
Searle, J. (1992). The Rediscovery of the Mind. Cambridge, MA: MIT Press.
Shapiro, L. (2004). The Mind Incarnate. Cambridge, MA: MIT Press.
Siewert, C. (1998). The Significance of Consciousness. Princeton, NJ: Princeton University Press.
Stoljar, D. (2002). Two conceptions of the physical. In D. J. Chalmers (ed.), The Philosophy of Mind: Classical and Contemporary Readings. Oxford: Oxford University Press. (Originally published 2001 in Philosophy and Phenomenological Research, 62, 253-81.)
Sturgeon, S. (2000). Matters of Mind: Consciousness, Reason, and Nature. London: Routledge.
Tye, M. (1999). Phenomenal consciousness: The explanatory gap as a cognitive illusion. Mind, 108, 705-25.
DOES COGNITIVE SCIENCE NEED EXTERNAL CONTENT AT ALL?
CHAPTER
THIRTEEN
Locating Meaning in the Mind (Where it Belongs)
Ray Jackendoff
approach for about 30 years. Here I want to go through some aspects of it. Some date back to older work (Jackendoff, 1976, 1983, 1987); some are new to Foundations of Language.
I take the basic problem of a mentalistic semantic theory to be:
How can we characterize the messages/thoughts/concepts that speakers
express/convey by means of using language?
How does language express/convey these messages?
I leave the terms "messages/thoughts/concepts" and "express/convey" deliberately vague
for the moment. Part of our job is to sharpen them. In particular, we have to ask:
What makes these mental entities function the way meanings intuitively should?
Unfortunately, the intellectual politics begins right here: this is not the way everyone construes the term "semantics." Rather than engage in arguments based on terminological imperialism, I will use conceptualist semantics as a term of art for this
enterprise. (My own particular set of proposals, which I have called Conceptual Semantics (Jackendoff, 1990), is an exemplar of the approach but not the only possible one.)
Above all, I don't want to get trapped in the question: Is this enterprise really a kind
of semantics or not? The relevant questions are: Is this enterprise a worthwhile
way of studying meaning? To what extent can it incorporate intuitions and insights
from other approaches, and to what extent can it offer insights unavailable in other
approaches? It is important to see that the messages/thoughts/concepts conveyed by
language serve other purposes as well. At the very least, they are involved in the
cognitive processes illustrated in figure 13.1.
Linguists spend a lot of time accounting for the combinatoriality of phonology and syntax. But it is assumed (although rarely articulated) that these systems serve the purpose of transmitting messages constructed from an equally combinatorial system of thoughts: a sentence conveys a meaning built combinatorially out of the meanings of its words. This combinatorial system is represented in figure 13.1 by the component "formation rules for thoughts," which defines the class of possible thoughts or conceptual structures. (An important terminological point: one use of the term syntax
[Figure 13.1. Surviving labels: formation rules for thoughts; Noises - Language - Concepts; Perception; Objects; Inference; Action; Integration; Knowledge base; MIND/BRAIN; WORLD.]
pertains to any sort of combinatorial system. In this sense phonology has a syntax, music has a syntax, chess has a syntax, and of course thought has a syntax. In the use favored by linguists, however, syntax refers specifically to the combinatorial organization made up of such units as noun phrase and verb phrase. In this sense thought is definitely not syntax.)
In the framework of Foundations of Language (Jackendoff, 2002), the combinatorial structures for thought are related to the purely linguistic structures of syntax and phonology by so-called interface rules (double arrows in figure 13.1). In particular, different languages have different interfaces, so that the same thought can be mapped into expressions of different languages, within tolerances, allowing for the possibility of reasonably good translation among languages. An important part of the interface rules is the collection of words of the language. A word is a long-term memory association of a piece of phonology (its pronunciation), a set of syntactic features (its part of speech and contextual properties such as subcategorization), and a piece of conceptual structure (its meaning). Thus each word in an utterance establishes a part of the utterance's sound-grammar-meaning correspondence; other parts of the correspondence are mediated by interface rules that map between syntactic structure and combinatorial structure in semantics.
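To make the shape of such a lexical entry concrete, here is a minimal sketch in Python; the field names and the toy entry for "cat" are my illustration, not Jackendoff's own notation:

from dataclasses import dataclass

@dataclass
class LexicalEntry:
    """A word: a long-term-memory association of three structures."""
    phonology: str   # a piece of phonology - the pronunciation
    syntax: dict     # syntactic features - part of speech, subcategorization, etc.
    meaning: dict    # a piece of conceptual structure - the meaning

# A toy entry; the particular feature inventory is invented for illustration.
cat = LexicalEntry(
    phonology="/kæt/",
    syntax={"category": "N", "count": True},
    meaning={"TYPE": "OBJECT", "CATEGORY": "ANIMAL", "KIND": "CAT"},
)

On this picture, interface rules would be operations over such triples: comprehension works from the phonology field toward the meaning field, production the other way around.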
These two parts of figure 13.1 - the combinatorial system of meaning and its interfaces to linguistic expression - are closest to what is often called "linguistic semantics." Now consider the other interfaces. The use of thoughts/concepts to produce further thoughts/concepts is what is typically called "inference" or "reasoning." Since we are interested in the study of real people and not just ideals, this interface must include not only logical reasoning but also making plans and forming intentions to act, so-called "practical reasoning" (Kahneman et al., 1982; Bratman, 1987; Gigerenzer, 2000) and "social reasoning" (Fiske, 1991; Tooby and Cosmides, 1992).
We also must account for the integration of thoughts conveyed by language with previous knowledge or beliefs. Part of previous knowledge is one's sense of the communicative context, including one's interlocutor's intentions. Thus the work of this interface is closest to what is often called "pragmatics."
The interfaces to the perceptual systems are what permit one to form thoughts
based on observing the world. In turn, by using such thoughts as the input to language
production, we can talk about what we see, hear, taste, and feel. These interfaces
operate in the other direction as well: language can be used to direct attention to
some particular part of the perceptual field. The interface with the action system is
what permits one to carry out intentions - including carrying out intentions formed
in response to a command or request.
In order to make possible the kinds of interactions just enumerated, all these
interfaces need to converge on a common cognitive structure. If we look at thought
through the lens of language alone, we don't have enough constraints on possible
theories. A richer, more demanding set of boundary conditions emerges from insisting
that thought must also make contact with inference, background knowledge, perception,
and action.
In fact, this view of thought permits us to make contact immediately with
evolutionary considerations as well. Suppose we erase the interface to language from
figure 13.1. We then have an architecture equally suitable - at some level of
approximation - for nonlinguistic organisms such as apes. They too display complex
integration of perception, action, inference, and background knowledge, in both
physical and social domains (Köhler, 1927; Goodall, 1971; Byrne and Whiten, 1988;
Hauser, 2000). They just can't talk about it. It makes evolutionary sense to suppose
that some of the fundamental parts of human thought are a heritage of our primate
ancestry (Hauser et al., 2002; Pinker and Jackendoff, 2005).
To presume that we can invoke evolutionary considerations, of course, is also to
presume that some of the overall character of thought is determined by the genome.
I see at least three major domains of thought that suggest an innate genetic basis.
The first is the understanding of the physical world: the identification of objects,
their spatial configurations with respect to each other, the events in which they take
part and interact, and the opportunities they offer for action on and with them. The
second is the understanding of the social world: the identification of persons, the
characterization of the beliefs and motivations of other persons (so-called "theory of
mind"), and the ability to understand the social roles of oneself and others, including
such issues as kinship, dominance, group membership, obligations, entitlements, and
morals (not as a universal system, but as the underpinning for all cultural variation
in social understanding) (Jackendoff, forthcoming). The third domain that I think must
be innate is a basic algebra of individuation, categorization, grouping, and
decomposition that undergirds the two systems just mentioned as well as many others.
In short, conceptualist semantics should aspire to offer a common meeting ground
for multiple traditions in studying cognition, including not only linguistic semantics
but also pragmatics, perceptual understanding, embodied cognition, reasoning and
planning, social/cultural understanding, primate cognition, and evolutionary psychology.
A high aspiration, but certainly one worth pursuing.
[Figures 13.2 and 13.3 (diagrams): figure 13.2 places Language out in the WORLD, connected to objects; figure 13.3 adds the MIND, whose mental grammar and lexicon "grasp" the language in the world.]
have much to do with the mind. Some in formal semantics, such as David Lewis, are
very explicit about this; others are more or less agnostic.
How can this realist view of language be reconciled with the mentalist approach
of generative grammar? One approach would be to jettison the mentalism of generative
linguistics, but retain the formal mechanisms: to take the position that there is an
objective "language out there in the world," and that this is in fact what generative
grammar is studying. Some people have indeed taken this tack (e.g., Katz, 1981). But
it disconnects generative linguistics from all sources of evidence based on processing,
acquisition, genetics, and brain damage. And it forces us to give up the fundamental
motivation for positing Universal Grammar and for exploring its character. I personally
think that's too high a price to pay.
An alternative tack might be Frege's (1892): language is indeed "out in the world"
and it refers to "objects in the world"; but people use language by virtue of their
grasp of it, where "grasp" is a transparent metaphor for "the mind holding/
understanding/making contact with" something in the world. Figure 13.3 might
schematize such an approach. Generative linguistics, it might then be said, is the study
of what is in the mind when it grasps a language. This way we could incorporate all
the mentalistic methodology into linguistics while preserving a realist semantics.
But what sense are we to make of the notion of "grasping" an abstract object? We
know in principle how the mind "grasps" concrete objects: it constructs cognitive
structures in response to inputs from the senses. This is a physical process: the sense
organs respond to impinging light, vibration, pressure, and so forth by emitting nerve
impulses that enter the brain. But what inputs give rise to the "grasp" of an abstract
object? An abstract object by definition has no physical manifestations that can impinge
on the nervous system. So how does the nervous system "grasp" them? Without a
careful exegesis of the term - which no one provides - we are ineluctably led toward
a quasi-mystical interpretation of "grasping," a scientific dead end.
[Figure 13.4 (diagram): Language and Concepts in the MIND; dashed lines connect Concepts to Objects in the WORLD.]
Figure 13.4 Concepts in the mind that are "about" objects in the world.
One way to eliminate the problem of how the mind grasps language is to push
language entirely into the mind - as generative grammar does. We might then arrive
at a semantic theory structured like figure 13.4. This is Jerry Fodor's position, I think
(Fodor, 1975, 1983, 1990, 1998): for him, language is a mental faculty that accesses
combinatorially structured concepts (expressions in the "Language of Thought"). In
turn, concepts have a semantics: they are connected to the world by virtue of being
"intentional." The problem for Fodor is to make naturalistic sense of intentionality.
But intentionality suffers from precisely the same difficulty as "grasping" language
in figure 13.3: there is no physically realizable causal connection between concepts
and objects.
The upshot of these explorations is that there seems to be no way to combine a
realist semantics with a mentalist view of language, without invoking some sort of
transcendental connection between the mind and the world. The key to a solution,
I suggest, lies in examining the realist's notion of "objects in the world."
Figure 13.5a
Figure 13.5b
Here is the point of these examples: the commonsense view of reference asserts
that we refer to "objects in the world" as if this is completely self-evident. It is
self-evident if we think only of reference to middle-sized perceivable physical objects
like tables and refrigerators. But as soon as we explore the full range of entities to
which we actually refer, "the world" suddenly begins to be populated with all sorts
of curious beasts whose ontological status is far less clear. For each of the types of
entities cited above, some more or less elaborate story can be constructed, and some
of them have indeed evoked an extensive philosophical literature. But the effect in
each case is to distance the notions of reference and "the world" from direct intuition.
The cumulative effect of considering all of them together is a "world" in which direct
intuition applies only to a very limited class of instances.
That is, in the conceptualist theory, the speaker's judgment and conceptualization
play a critical role.
As initial motivation for exploring the conceptualist position, let's observe that a
language user cannot refer to an entity without having some conceptualization of it.
Consider an example like (9).

(9)

In order to utter (9) (and mean it), the speaker must have conceptualized some
relevant entity, though certainly without a full characterization. That is,
conceptualization of a referent is a necessary condition for a speaker to refer. However,
being in the real world is not a necessary condition: speakers can refer without
difficulty to entities like Sherlock Holmes and the unicorn in my dream last night.
Furthermore, an entity's being in the real world is not sufficient for reference either:
one has to conceptualize it in at least some minimal way. In short, an entity's being
in the real world is neither a necessary nor a sufficient condition for a speaker's being
able to refer to it. Rather, the crucial factor is having conceptualized an entity of the
proper sort.
Still, I would not blame the reader for being a bit suspicious of this expression
"the world as conceptualized by the language user." It smacks of a certain solipsism
or even deconstructionism, as though language users get to make up the world any
way they want, as though one is referring to one's mental representations rather
than to the things represented. And indeed, there seems little choice. Figure 13.1, the
conceptualist position, has no direct connection between the form of concepts and
the outside world. On this picture our thoughts seem to be trapped in our own brains.
This outcome, needless to say, has come in for harsh criticism, from many different
quarters, for example:

But how can mapping a representation onto another representation explain what a
representation means? . . . [E]ven if our interaction with the world is always mediated by
representation systems, understanding such systems will eventually involve considering
what the systems are about, what they are representations of. (Chierchia and
McConnell-Ginet, 1990, p. 47)

. . . words can't have their meanings just because their users undertake to pursue some
or other linguistic policies; or, indeed, just because of any purely mental phenomenon,
anything that happens purely "in your head." For "John" to be John's name, there must
be some sort of real relation between the name and its bearer . . . something has to
happen in the world. (Fodor, 1990, pp. 98-9)

But we can know the Markerese translation of an English sentence [i.e. its conceptual
structure] without knowing the first thing about the meaning of the English sentence:
namely, the conditions under which it would be true. (Lewis, 1972, p. 169)
How is it possible to escape this attack? I think the only way is to go deeper into
psychology, and to deal even more carefully with the notion of thought. Think about
it from the standpoint of neuropsychology: the neural assemblies responsible for
storing and processing conceptual structures indeed are trapped in our brains. They
have no direct access to the outside world. A position like Fodor's says that the
"Language of Thought" is made of symbols that have meaning with respect to the
outside world. I would rather say that conceptual structures are not made of symbols
- they don't symbolize anything - and they don't have meanings. Rather, I want to
say that they are meaning: they do exactly the things meaning is supposed to do,
such as support inference and judgment. Language is meaningful, then, because it
connects to conceptual structures. Such a statement is of course anathema to many
semanticists and philosophers, not to mention to common sense. Still, let's persist and
see how far we can go with it.
The deictic pronoun that has almost no intrinsic descriptive content; its semantics
is almost purely referential. In order to understand (10), the hearer not only has to
process the sentence but also has to determine what referent the speaker intends by
that. This requires going out of the language faculty and making use of the visual
system. Within the visual system, the hearer must process the visual field and visually
establish an individual in it that can serve as referent of that. The retinal image alone
cannot do the job of establishing such an individual. The retina is sensitive only to
distinctions like "dark point in bright surround at such-and-such a location on retina."
The retina's "ontology" contains no objects and no external location. Nor is the
situation much better in the parts of the brain most directly fed by the retina: here
we find things like local line and edge detectors in various orientations, all in
retinotopic format (Hubel and Wiesel, 1968) - but still no objects, no external world.
And this is all the contact the brain has with the outside world; inboard from here
it's all computation.
However this computation works, it eventually has to construct a cognitive structure
that might be called a "percept." The principles and neural mechanisms that construct
percepts are subjects of intensive research in psychology and neuroscience, and are
far from understood. The outcome, however, has to be a neurally instantiated cognitive
structure that distinguishes individuals in the perceived environment and that permits
one to attend to one or another of them. This cognitive structure that gives rise to
perceived individuals is nonlinguistic: human infants and various animals can be
shown experimentally to identify and track individuals more or less the way we do,
so the best hypothesis is that they have percepts more or less like ours.
Of course percepts are trapped inside the brain too. There is no magical direct
route between the world and the percept - only the complex and indirect route via
the retina and the lower visual areas. Hence all the arguments that are directed against
conceptualist semantics apply equally to percepts. This may bother some philosophers,
but most psychologists and neuroscientists take a more practical approach: they see
the visual system as creating a cognitive structure which constitutes part of the
organism's understanding of reality, and which helps the organism act successfully
in its environment (Marr, 1982; Koch, 2004). If there is any sense to the notion of
"grasping" the world perceptually, this wildly complex computation is it; it is far from
a simple unmediated operation.
And of course a visual percept is what is linked to the deictic that in (9) and (10),
through the interfaces between conceptual structure and the "upper end" of the visual
system. Thus language has indeed made contact with the outside world - but through
the complex mediation of the visual system rather than through some mysterious
mind-world relation of intentionality. Everything is scientifically kosher.
A skeptic may still be left grumbling that something is missing: "We don't perceive
our percepts in our heads, we perceive objects out in the world." Absolutely correct.
However, as generations of research in visual perception have shown, the visual system
populates "the world" with all sorts of "objects" that have no physical reality, for
instance the things in example (3): the square subtended by four dots and the "amodally
completed" horizontal rectangle. So we should properly think of "the perceptual world"
(or "phenomenal world" in the sense of Koffka, 1935) not as absolute reality but as
the "reality" constructed by our perceptual systems in response to whatever is "really
out there."
Naturally, the perceptual world isn't totally out of synch with the "real world."
The perceptual systems have evolved in order that organisms may act reliably in the
real world. They are not concerned with a "true model of the world" in the logical
sense, but with a "world model" that is good enough to support the planning of actions
that in the long run lead to better propagation of the genes. Like other products of
evolution, the perceptual systems are full of "cheap tricks," which is why we see virtual
objects: these tricks work in the organism's normal environment. It is only in the
context of the laboratory that their artificiality is detected.
Thus the perceptual world is reality for us. Apart from the sensory inputs, percepts
are entirely "trapped in the brain"; they are nothing but formal structures instantiated
in neurons. But the perceptual systems give us the sense, the feeling, the affect, of
objects being out there. We experience objects in the world, not percepts in our heads.
That's the way we're built (Dennett, 1991; Koch, 2004).
In short, the problem of reference for the intuitively clear cases is not at bottom
a problem for linguistic theory; it is a problem for perceptual theory: how do the
mechanisms of perception create for us the experience of a world "out there"?
I suspect some readers may find this stance disquieting. My late friend John
Macnamara, with whom I agreed on so much, used to accuse me of not believing
there is a real world. But I think the proper way I should have replied to him is that
we are ultimately concerned with reality for us, the world in which we lead our lives.
Isn't that enough? (Or at least, isn't that enough for linguistic semantics?) If you
want to go beyond that and demand a "more ultimate reality," independent of human
cognition, well, you are welcome to, but that doesn't exactly render my enterprise
pointless.
Consider Frege's (1892) famous example The morning star is the evening star. In
his analysis, two senses are attached to the same reference. We can understand Frege's
example in present terms as reporting the merger of indexicals associated with different
perceptual features, on the basis of some discovery. So Frege's problem is not a uniquely
linguistic problem; rather it lies in a more general theory of how the mind keeps track
of individuated entities.
The reverse, indexical splitting, can also occur. I used to think there was one
literary/cultural theorist named Bloom, until one day I saw a new book by Bloom
and was surprised because I had thought he had been dead for several years. It
suddenly dawned on me, to my embarrassment, that there were two Blooms, Allan
and Harold - so my indexical came to be split into two. Think also of discovering
that someone you have seen around the neighborhood is actually twins.
More generally, indexical features play a role altogether parallel to the discourse
referents in various approaches within formal semantics such as Discourse
Representation Theory (Kamp and Reyle, 1993), File Change Semantics (Heim, 1989),
and Dynamic Semantics (Groenendijk et al., 1996). Hence many of the insights of
these approaches can be taken over here, with the proviso that the "model" over which
reference is defined should be a psychologically motivated one.
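To make the bookkeeping behind merging and splitting vivid, here is a minimal sketch - my own illustration, not a formalism from the chapter - of indexicals as mental "files" that can be merged on a discovery of identity or split on a discovery of distinctness:

    from dataclasses import dataclass, field

    @dataclass
    class Indexical:
        """A mental 'file' for one tracked individual; its features are
        whatever perceptual or descriptive information is attached."""
        features: set = field(default_factory=set)

    def merge(a, b):
        # Frege's case: the 'morning star' and 'evening star' files turn
        # out to track one individual, so their features are pooled.
        return Indexical(a.features | b.features)

    def split(x, fits_first):
        # The two-Blooms case: one file is divided into two, each keeping
        # the features that fit the individual it now tracks.
        kept = {f for f in x.features if fits_first(f)}
        return Indexical(kept), Indexical(x.features - kept)

    morning = Indexical({"bright at dawn"})
    evening = Indexical({"bright at dusk"})
    venus = merge(morning, evening)  # one referent, two modes of presentation

Nothing in the sketch fixes what the files are about, which is exactly the conceptualist point.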
(11)
(a)
(b)
(c)
(d)
(e)
(f) Pro-measure expression: The fish that got away was this long [demonstrating]. [Distance]
(g) Pro-measure expression: There were about this many people at Joe's party too [gesturing toward the assembled throng]. [Amount]
(h) Pro-time-PP: You may start . . . right . . . now! [clapping] [Time]
The deictic expressions here refer to entities in the conceptualized world that can be
picked out with the aid of the accompanying gesture. But the entities referred to are
not objects, they are sounds, sensations, locations, directions, and so on.
In order to accommodate these possibilities for reference, it is useful to introduce
a kind of "ur-feature" that classifies the entity being referred to into an ontological
type. Each of the ontological types - objects, sounds, actions, locations, and so forth
- has its own characteristic conditions of identity and individuation. It is a task both
for natural language semantics and for cognitive psychology/neuroscience to work
out the logic and the characteristic perceptual manifestations of each type.
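Purely as an illustration of the idea - the type inventory and field names below are my own assumptions, not the chapter's - the ur-feature can be pictured as a tag that fixes which identity and individuation conditions apply to a referent:

    from dataclasses import dataclass
    from enum import Enum, auto

    class OntologicalType(Enum):
        """Illustrative, deliberately partial inventory of types."""
        OBJECT = auto()
        SOUND = auto()
        ACTION = auto()
        LOCATION = auto()
        DISTANCE = auto()
        AMOUNT = auto()
        TIME = auto()

    @dataclass
    class Referent:
        ur_feature: OntologicalType  # selects identity/individuation conditions
        features: set                # perceptual or inferential features

    fish_length = Referent(OntologicalType.DISTANCE, {"about this long"})

Each type would then carry its own logic; the sketch deliberately leaves that logic unspecified, which is precisely the joint research task the text describes.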
Are all the sorts of entities in (11) "in the world"? They are certainly not like
refrigerators - you can't touch them or move them. In fact, it is odd to say they all
exist in the way that refrigerators exist ("the length of the fish exists"??). Yet (11)
shows that we can pick them out of the perceptual field and use them as the referents
of deictic expressions. So we must accord them some dignified status in the perceived
world - the "model" that serves as the basis for linguistic reference.
Now notice what has just happened. Up to a moment ago I was concerned with
reference to objects, and I used perceptual theory to ground the theory of reference.
Now all of a sudden I have turned the argument on its head: if this is the way reference
relates to perception, perception must be providing a far richer range of entities than
had previously been suspected. It is now a challenge for perceptual theory to describe
how the perceptual systems accomplish this. In other words, examples like (11) open
the door for fruitful cooperation between linguistic semantics and research in
perception.
This repertoire of ontological types seems to me a good candidate for a very
skeletal unlearned element of cognition. Again, because it is central not just to language
but to perception and action, we need not call it part of Universal Grammar, the
human specialization for learning language. But we do have to call it part of the
innate human cognitive endowment.
So far, I've only talked about perceivable entities. What makes them perceivable
is that they have features that connect to the perceptual interfaces in figure 13.1.
Now suppose that conceptual structure contains other features that do not pertain to
a perceptual modality, but which connect instead to the inferential system in figure
13.1. Such features would provide "bridges" to other concepts but no direct connection
to perception. That is, they are used in reasoning rather than in identification. Let's
call these inferential features, by contrast with perceptual features.
What would be candidates for inferential features? Consider an object's value. This
is certainly not perceivable, but it influences the way one reasons about the object,
including one's goals and desires concerning the object. Value can be established by
all sorts of means, mostly very indirect. Thus value seems like a prime candidate for
an inferential feature.
8 Satisfaction and Truth
An important piece has been left out of the story so far. Linguistic expressions used
in isolation cannot refer: they can only purport to refer. For example, suppose I'm
talking to you on the phone and say Hey, will you look at THAT! You understand
that I intend to refer to something; but you can't establish the reference and therefore
can't establish the contextualized meaning of the utterance.
A referential expression succeeds in referring for the hearer if it is satisfied by
something that can serve as its referent. Remember: in realist semantics, satisfaction
is a relation between a linguistic expression and an entity in the world; but in
conceptualist semantics, the entity is in [the world as conceptualized by the language
user]. It is this latter notion of satisfaction that we have to work out here.
To work out a conceptualist notion of satisfaction, we invoke one component in
figure 13.1 that I haven't yet discussed: the integration of concepts with the knowledge
base. Suppose I say to you: I talked to Henk Verkuyl today in the supermarket. In
isolation, the proper name Henk Verkuyl purports to refer; you assume that I intended
it actually to refer. If you know Henk Verkuyl (i.e., he is part of your knowledge base),
you can establish the intended reference in your own construal of the world. If you
do not know him, the proper name is unsatisfied for you.
More generally, four different situations can arise.
(12)
(a)
(b)
(c)
(d)
This is all clear with the reference of noun phrases. Next let's look at the reference
of sentences. The standard custom in the formal semantics tradition, going back to
Frege, is that the intended referent of a (declarative) sentence is a truth-value. I must
confess I have never understood the argument for this position (e.g., Chierchia and
McConnell-Ginet, 1990, ch. 2). I am going to explore instead the alternative position
(proposed in Jackendoff, 1972) that the intended reference of a declarative sentence
is a situation (an event or a state of affairs). Traditionalists should not worry:
truth-values will get their due shortly.
The view that sentences refer to situations is motivated largely by linguistic parallels
to referentiality in noun phrases. This is a kind of evidence not frequently cited in
the literatures of philosophy of language and formal semantics - although Situation
Semantics (Barwise and Perry, 1984) made a good deal of it. First notice how noun
phrases and sentences are parallel in the way they can be used to accompany deictic
reference:

(13)
(a)
(b)

(13)(a) draws your attention to an object in the environment; (13)(b) to an event - not
to a truth-value. Next, notice that discourse pronouns can co-refer with sentences as
well as with noun phrases.
(14)
(a)
(b)

Next consider embedded that-clauses in a context where they alternate with noun
phrases.

(15)
(a)
(b)
(a) Your knowledge base includes the event of the Red Sox winning, and this
satisfies the intended referent of the clause.
(b) Your knowledge base does not include the event of the Red Sox winning, so
you add this to your knowledge base as the referent of the clause.
(c) Your knowledge base includes something in conflict with the purported event
of the Red Sox winning (say, your take on the world is that the Red Sox didn't
play). Then you have to engage in some repair strategy.
(d) The features of the purported event are inherently in conflict, so that there is
no possible referent. In such a case, for instance That the square is a circle
astounded Max, the clause is judged anomalous, and you again have to resort
to repair.
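A minimal sketch of this integration step, assuming the knowledge base is represented as a set of situations and that the conflict and contradiction checks are supplied from outside - all names here are my own illustrative choices, not the chapter's:

    def integrate(situation, knowledge_base, conflicts_with, is_contradictory):
        """Return which of the four outcomes integration produces."""
        if is_contradictory(situation):                      # case (d)
            return "anomalous: repair needed"
        if any(conflicts_with(situation, s) for s in knowledge_base):
            return "conflict: repair needed"                 # case (c)
        if situation in knowledge_base:                      # case (a)
            return "satisfied by existing knowledge"
        knowledge_base.add(situation)                        # case (b)
        return "added as new referent"

    kb = {"the Red Sox won"}
    print(integrate("the Red Sox won", kb,
                    lambda a, b: False, lambda s: False))
    # -> satisfied by existing knowledge

The ordering of the checks is one design choice among several; the text itself does not fix an order.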
There are, of course, other cases of that-clauses that don't come out this way, notably
in so-called intensional (with an s) contexts such as the complement of believe.
However, noun phrases in this position are subject to the same distortions of
referentiality:

(17)
(a)
(b)

In both of these, the speaker makes no commitment to the existence of the tooth fairy
or square circles.
So far, then, I've tried to convince you that it makes sense to regard a clause as
referentially satisfied by a conceptualized situation. What about truth-values, then?
The judgment of a declarative sentence's truth-value follows from how it is referentially
satisfied.

(18)
(a)
(b)
(c)
(d)

Thus truth is defined in terms of reference and satisfaction, just as proposed by Tarski
(1936/1983).
In short, the parallelism in the reference of noun phrases and sentences lies in the
parallelism between conceptualized objects and conceptualized situations. The notion
of satisfaction applies identically to both. However, sentences have an additional layer
of evaluation, in which they are characterized as true or false on the basis of how
they are referentially satisfied.
In this approach, then, the problem of characterizing the conditions under which
a sentence is judged true does not go away. It is just demoted from the paramount
problem of semantic theory to one among many problems. What seems more basic
here is the conditions of satisfaction for referential constituents and how they interact
with the knowledge base. In fact, a great deal of research in "truth-conditional"
semantics can easily be reconstrued as addressing this issue. For instance, the question
of whether sentence S1 entails sentence S2 has nothing to do with their truth-values
- sentences may describe thoroughly fictional or hypothetical situations. Rather, S1
entails S2 if adding the situation referred to by S1 to an otherwise blank knowledge
base enables the situation referred to by S2 to be satisfied. The factors involved in
such satisfaction, and the form of the rules of inference, may remain essentially
unchanged from a truth-conditional account.
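On this construal, entailment can be sketched in one line - again purely as an illustration, with the real work delegated to a satisfaction procedure that is not modeled here:

    def entails(s1, s2, satisfied_in):
        """s1 entails s2 iff a knowledge base containing only the situation
        referred to by s1 enables the situation referred to by s2 to be
        satisfied. 'satisfied_in' stands in for the rules of inference."""
        return satisfied_in(s2, {s1})

Truth-values play no role in the definition, which is the point of the paragraph above.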
Above all, the conceptualist approach shifts the focus of semantics from the question
"What makes sentences true?" to what I take to be the more ecologically sound
question, "How do we humans understand language?" where I mean "ecologically
sound" in the sense that it permits us to integrate semantics with the other human
sciences. I take this to be a positive step.
There are obviously many further problems in establishing a mentalistic semantics
and in particular a mentalistic theory of reference and truth. Foundations of Language
(Jackendoff, 2002) addresses some of these and not others. But here is where we are
so far. We have not ended up with the rock-solid rigid notion of truth that the realists
apparently want. Rather, I think we have begun to envision something that has the
promise of explaining our human sense of truth and reference - and why philosophical
and commonsensical disputes about truth and reference so often proceed the way
they do. I have no illusions that this work is over, but it strikes me as a path well
worth exploring.
References
Barwise, J. and Perry, J. (1984). Situations and Attitudes. Cambridge, MA: MIT Press.
Bratman, M. (1987). Intention, Plans, and Practical Reason. Cambridge, MA: Harvard University Press.
Byrne, R. W. and Whiten, A. (eds.) (1988). Machiavellian Intelligence: Social Expertise and the Evolution of Intellect in Monkeys, Apes, and Humans. Oxford: Clarendon Press.
Chierchia, G. and McConnell-Ginet, S. (1990). Meaning and Grammar: An Introduction to Semantics. Cambridge, MA: MIT Press.
Chomsky, N. (1995). The Minimalist Program. Cambridge, MA: MIT Press.
Dennett, D. (1991). Consciousness Explained. New York: Little, Brown.
Fiske, A. (1991). Structures of Social Life. New York: Free Press.
Fodor, J. A. (1975). The Language of Thought. Cambridge, MA: Harvard University Press.
- (1983). The Modularity of Mind. Cambridge, MA: MIT Press.
- (1990). A Theory of Content and Other Essays. Cambridge, MA: MIT Press.
- (1998). Concepts: Where Cognitive Science Went Wrong. Oxford: Oxford University Press.
Frege, G. (1892). Über Sinn und Bedeutung. Zeitschrift für Philosophie und Philosophische Kritik, 100, 25-50. (English translation in P. Geach and M. Black (eds.), Translations from the Philosophical Writings of Gottlob Frege. Oxford: Blackwell, 1952.)
Gigerenzer, G. (2000). Adaptive Thinking: Rationality in the Real World. New York: Oxford University Press.
Goodall, J. van L. (1971). In the Shadow of Man. New York: Dell.
Groenendijk, J., Stokhof, M., and Veltman, F. (1996). Coreference and Modality. In S. Lappin (ed.), The Handbook of Contemporary Semantic Theory. Oxford: Blackwell.
Hauser, M. D. (2000). Wild Minds: What Animals Really Think. New York: Henry Holt.
Hauser, M. D., Chomsky, N., and Fitch, W. T. (2002). The faculty of language: What is it, who has it, and how did it evolve? Science, 298, 1569-79.
Heim, I. (1989). The Semantics of Definite and Indefinite Noun Phrases in English. New York: Garland.
Hubel, D. and Wiesel, T. (1968). Receptive fields and functional architecture of monkey striate cortex. Journal of Physiology (London), 195, 215-43.
Jackendoff, R. (1972). Semantic Interpretation in Generative Grammar. Cambridge, MA: MIT Press.
- (1976). Toward an explanatory semantic representation. Linguistic Inquiry, 7, 89-150.
- (1983). Semantics and Cognition. Cambridge, MA: MIT Press.
- (1987). Consciousness and the Computational Mind. Cambridge, MA: MIT Press.
- (1990). Semantic Structures. Cambridge, MA: MIT Press.
- (2002). Foundations of Language. Oxford: Oxford University Press.
- (forthcoming). Language, Culture, Consciousness: Essays on Mental Structure. Cambridge, MA: MIT Press.
Kahneman, D., Slovic, P., and Tversky, A. (eds.) (1982). Judgment under Uncertainty: Heuristics and Biases. Cambridge: Cambridge University Press.
Kamp, H. and Reyle, U. (1993). From Discourse to Logic. Dordrecht: Kluwer.
Katz, J. J. (1981). Language and other Abstract Objects. Totowa, NJ: Rowman & Littlefield.
Koch, C. (2004). The Quest for Consciousness: A Neurobiological Approach. Englewood, CO: Roberts and Company.
Koffka, K. (1935). Principles of Gestalt Psychology. New York: Harcourt, Brace & World.
Köhler, W. (1927). The Mentality of Apes. London: Routledge & Kegan Paul.
Kripke, S. (1972). Naming and Necessity. In D. Davidson and G. Harman (eds.), Semantics for Natural Language. Dordrecht: Reidel.
Lakoff, G. (1987). Women, Fire, and Dangerous Things. Chicago: University of Chicago Press.
Langacker, R. (1987). Foundations of Cognitive Grammar, vol. 1. Stanford, CA: Stanford University Press.
Lewis, D. (1972). General Semantics. In D. Davidson and G. Harman (eds.), Semantics for Natural Language. Dordrecht: Reidel.
Marr, D. (1982). Vision. San Francisco: Freeman.
Montague, R. (1973). The proper treatment of quantification in ordinary English. In J. Hintikka, J. Moravcsik, and P. Suppes (eds.), Approaches to Natural Language. Dordrecht: Reidel.
Pinker, S. and Jackendoff, R. (2005). The faculty of language: What's special about it? Cognition, 95, 201-36.
Stalnaker, R. (1984). Inquiry. Cambridge, MA: MIT Press.
Tarski, A. (1936/1983). The establishment of scientific semantics. In A. Tarski, Logic, Semantics, Metamathematics (J. H. Woodger, trans.; 2nd edn., ed. J. Corcoran). Indianapolis: Hackett.
Tooby, J. and Cosmides, L. (1992). The psychological foundations of culture. In J. H. Barkow, L. Cosmides, and J. Tooby (eds.), The Adapted Mind. Oxford: Oxford University Press.
CHAPTER FOURTEEN

The Intentional Inexistence of Language

Georges Rey
In the short space here, I want to address an issue about the reality of language
and the ordinary external world that Jackendoff raises in his chapter of the present
volume (chapter 13, LOCATING MEANING IN THE MIND, WHERE IT BELONGS),1 and that has
been a persistent theme in his work for the last 20 years.
It would seem to be a commonplace that people, when they talk, produce tokens
of such things as words, sentences, morphemes, phonemes, and phones - I'll call tokens
of all such types "standard linguistic entities" ("SLEs"). Part and parcel of this
commonplace would be the presumption ("physical tokenism," hereafter abbreviated
to "PT") that these entities can be identified with some sorts of acoustic phenomena,
e.g., wave patterns in space and time. For instance, Devitt and Sterelny write:2

[PT]
Tokens are datable, placeable parts of the physical world . . . Inscription types and
sound types are identifiable by their overt physical characteristics and so we might
call them "physical types." (Devitt and Sterelny, 1987, p. 59)
Over the years, however, this latter presumption has been repeatedly challenged by
linguists, such as Saussure (1916/1966), Sapir (1933/1963), Chomsky and Halle (1968),
Jackendoff (1983, 1987), and Chomsky (2000). In his textbook on phonetics, for
example, John Laver writes:

The stream of speech within a single utterance is a continuum. There are only a few
points in this stream which constitute natural breaks, or which show an articulatory,
auditorily or acoustically steady state being momentarily preserved, and which could
therefore serve as the basis for analytical segmentation of the continuum into "real"
phonetic units. . . . The view that such segmentation is mostly an imposed analysis, and
not the outcome of discovering natural time-boundaries in the speech continuum, is a
view that deserves the strongest insistence. (Laver, 1993, p. 101)
This view about language has sometimes led these linguists to deny the existence
not only of SLEs, but, by a kind of parity of reasoning, many of the ordinary
nonlinguistic things we take ourselves to perceive and discuss. In a number of places,
for example, Chomsky (2000, pp. 129ff., pp. 180ff.) has suggested that the ontology
for semantics be modeled on that of phonology, and has even gone so far as to
consider "naming as a kind of 'world-making,' in something like Nelson Goodman's
sense" (2000, p. 181), expressing sympathy with various forms of seventeenth- and
eighteenth-century idealism, according to which "The world as known is the world
of ideas," which he (2000, p. 182) quotes approvingly from Yolton's (1994) account
of that period.
The linguist who has most explicitly endorsed this sort of idealism is Ray
Jackendoff. In Jackendoff (1983) he reasonably argues that "musical and linguistic
structure must be thought of ultimately as products of the mind; they do not exist
in the absence of human creators" (1983, pp. 27-8). But he then rapidly generalizes
the view to a familiar form of idealism:

We have conscious access only to the projected world - the world as unconsciously
organized by the mind; and we can talk about things only insofar as they have achieved
mental representation through these processes of organization. Hence the information
conveyed by language must be about the projected world. (Jackendoff, 1983, p. 29)
The view persists in his contribution to the present volume, where he proposes
"abandoning the unexamined notion of 'objects in the world,' and, for purposes of
the theory of reference, pushing 'the world' into the mind of the language user too,
right along with language." He does worry that his proposal

smacks of a certain solipsism or even deconstructionism, as though language users get
to make up the world any way they want, as though one is referring to one's mental
representations rather than to the things represented.

But, he claims, "there seems little choice . . . On this picture our thoughts seem to be
trapped in our own brains." Indeed "the perceptual world is reality for us" (emphasis
his). But he thinks this is not so bad, since

[W]e are ultimately concerned with reality for us, the world in which we lead our lives.
Isn't that enough? (Or at least, isn't that enough for linguistic semantics?) If you want
to go beyond that and demand a "more ultimate reality," independent of human
cognition, well, you are welcome to, but that doesn't exactly render my enterprise
pointless.
we have conscious access and can convey information "only" about them, not about
the real one (isn't Jackendoff purporting to tell us about real people? If not, why
should we pay any attention to him?3). What I want to address here is whether any
of his or other linguists' arguments actually drive us to such extravagant views. Along
with Devitt and Sterelny (1987), I think they don't. However, unlike Devitt and Sterelny,
I think the linguists are quite right to be skeptical about the existence of, specifically,
SLEs. The project of this chapter is to try to sort these issues out.
In section 1 of what follows, I'll set out some distinctions that I hope will help
clarify the discussion. Within the broadly intentionalist commitments of contemporary
cognitive science (section 1.1), I want to call attention to "existential" versus "(purely)
intentional" uses of our intentional vocabulary, particularly of the crucial terms
"represent" and "representation" (section 1.2), as well as to the related distinction
between "internalist" and "externalist" theories of intentional content (section 1.3).
These distinctions will help sharpen the linguist's argument in section 2 for denying
PT. Externalism in particular provides a useful condition for serious "existence,"
which I'll argue is not met by at least the representative SLEs I'll consider: sentences
(section 2.1) and phones (section 2.3). Unlike, for example, the structural properties
of cars (section 2.2), the structural properties of SLEs are not and need not be
realized in space or time. At best, they are "psychologically real," but this kind of
reality is no reality at all - at any rate, not the kind of serious "independent" reality
that would be required for content externalism (section 2.4).
In section 3 I consider Jackendoff's idealist proposal, showing how it can't easily
be restricted merely to SLEs, and that it leads to a general idealism that is problematic
in familiar ways. These and other philosophical proposals are, to use a term Chomsky
uses in this connection, "wheel-spinning," completely inessential to the theoretical
tasks of linguistics. Unlike Chomsky, I just think this is true of all the proposals,
including the idealist one he himself sometimes endorses. And this is because it seems
to me that the sensible thing is not to try to find some peculiar ontological status
for SLEs, but simply to deny that they exist at all.
But how could anyone sensibly deny the existence of, e.g., utterances of words,
sentences, poems, speeches, and the like? What do I think is going on when people
talk? The hypothesis that I find implicit in at least much phonological research is
what I will call the "folie à deux" view: the human ability to speak natural languages
is based on the existence of a special faculty that includes a system for the intended
production and recovery of SLEs. To a first approximation, instructions issue from
speakers' phonological systems to produce certain SLEs, and these instructions cause
various motions in their articulatory systems, which in turn produce various
wave-forms in the air. These wave-forms turn out, however, not to reliably correspond
to the SLEs specified in the instructions. All that seems to be true is that when they
impinge on the auditory system of an appropriate hearer, this hearer's phonological
system will be able to make an extremely good guess about the intentional content
of the speaker's instructions, not about any actual SLEs, which, ex hypothesi, never
actually got uttered. Indeed, this sort of guessing in general is so good, and the
resulting perceptual illusion so vivid, that it goes largely unnoticed, and speakers and
hearers alike take themselves to be producing and hearing the SLEs themselves. It is
in this way that it's a kind of folie à deux (or à n, for the n speakers of a common
language): the speaker has the illusion of producing an SLE that the hearer has the
illusion of hearing, with however the happy result that the hearer is usually able to
determine precisely what the speaker intended to utter. Indeed, were SLE tokens
actually to exist, it would be something of an accident. Their existence is completely
inessential to the success of normal communication and to the needs of linguistic
theory.
But what of the way we and linguists all the time talk of SLEs? How are we to
understand such claims as that, for example, a certain sentence is ambiguous, is
pronounced differently by different people, that "rhyme" rhymes with "slime," or that
a pronoun is co-indexed with a certain NP? I recognize the temptation to provide
for them a kind of nonphysical "psychological reality," of the sort that Jackendoff
proposes. However, I conclude in section 4, this temptation - and the problems and
paradoxes it invites - can easily be resisted simply by not treating SLEs as having
any kind of being at all - not in the actual world, nor any possible world, nor in any
sort of "experiential," "phenomenal," or "perceptual" world either. To be sure, there
are stable intentional contents, which facilitate the kind of folies à n that play important
roles in our lives. But the apparent objects projected from these folies are best
regarded as (in Franz Brentano's 1874/1973 phrase) "intentional inexistents":
"things" that we think about, but that (we often know very well) don't actually exist,
such as Santa Claus and Zeus. Just how systematically to understand talk about such
"things" is a topic of some interest generally. Along lines that parallel some of
Jackendoff's, I think such talk is useful - maybe indispensable - to a great deal of
psychology, for example, in the case of "mental images," "mental models," even qualia
and consciousness (see Rey, 1981; 1997, chapter 11; and forthcoming).
Unlike Jackendoff, however, I don't think such talk needs to be invoked for
everything. In particular, for all the importance of intentional inexistents in our lives,
I also think we sometimes happily succeed in seeing and referring to real things:
e.g., space, time, material objects, people, probably some abstractions, like numbers
and sets - not to mention a case that I will take as representative, my automobile,
a most reliable Honda Accord.
1 Intentionalism

1.1 Intentional content
I join Jackendoff in taking for granted the (surprisingly controversial) claim that
people have minds, that some of their states are the subject matter of cognitive
science, and that we may take cognitive scientists at their word when they propose
to characterize those states in terms of internal representations, as they nearly
ubiquitously do. Thus, linguists regularly discuss the character of the language faculty's
representations of SLEs; vision theorists, the visual system's representations of, e.g.,
edges, surfaces, and shapes; and animal psychologists, birds' and bees' representations
of, e.g., time, distance, and stars (see Gallistel, 1990). And, of course, anyone interested
in the full mental life of human beings will be concerned with representations of
anything they can perceive, think about, and discuss. Representation goes hand in
Physical process it undoubtedly is, but it by no means follows that anyone knows
how to explain it in physical terms; and this in part is due to the difficulty of
specifying in such terms the intentional content of a cognitive or perceptual structure
- i.e., a representation - or even deciding just what that content is - whether, for
example, the content of a certain state of the visual system concerns edges, surfaces,
objects, (Euclidean?) lines and distance, or merely mathematics.4 But this lack of an
account in either the abstract or concrete case is no cause for panic. It's just not
true that

without a careful exegesis of the term - which no one provides - we are ineluctably
led toward a quasi-mystical interpretation of "grasping," a scientific dead end. (ibid.)

The history of science is replete with uses of terms at one time that had to await
theoretical clarification much later. In any case, Jackendoff is certainly wrong to
suggest that "no one provides" any efforts in this direction. Intentionality has received
considerable philosophical attention in the last hundred years (see, e.g., the recent
work discussed in section 1.2), and, although providing an adequate account of it
has turned out to be surprisingly difficult, I see no reason to despair and regard the
phenomenon as "quasi-mystical."
that is, there is no real thing that "Zeus" represents - there being no Zeus. Now
obviously an expression like "Zeus" could represent something in the first usage -
"Zeus" is not a meaningless expression - without representing anything in the second.
For reasons that will emerge, I shall call the first usage the "(purely) intentional" use,
the second the "existential."
However, on the assumption that "Zeus" is meaningful, a problem arises about
just what in that case it does represent on the intentional usage. Notice that the two
uses involve different direct objects. If a speaker is talking about things whose
existence she expects she and her audience take for granted, she can say:

(1) "Bush" represents Bush.

where she is using the right-hand "Bush" as her way of talking about the very real
man George Bush. But if she is talking about something whose existence she does
not expect her or her audience to take for granted, then she will (ordinarily) still say,
e.g.,

(2) "Zeus" represents Zeus.

but she will simply not take the right-hand "Zeus" to be a way of talking about any
real god, Zeus.
So how are we to understand (2)? Well, some people (e.g., Millikan, 2000, pp. 175-6)
might balk at it, but it seems to me perfectly OK; it's just that we have to be careful
about what we say. One apparently very tempting thing to say is that, in (2), "Zeus
is an idea in your head" (cf. Quine, 1953/1980). But, literally understood, this can't
be true. Whatever else is in your head, there is no bearded god there; Zeus is certainly
not supposed to be a mere idea, and, moreover, if he were, well, then, he'd exist after
all, since (let us assume) surely the idea exists! The tempting thing to say is absurd.
But it is revealing. I submit that what happens without our ordinarily noticing is
that, when we say things like (2), where we know we aren't referring to any actual
thing, we resort to the intentional use of "represent," shifting our concern from any
actual referent to the mere idea of it. That is, when we use "represent" purely
intentionally, all that seems to be involved in our use is something like the "sense,"
not the "reference," of the terms (although, pace Frege, we don't thereby refer to the
sense, or an idea, since, again, someone thinking about Zeus is not thinking about a
sense or an idea).
With these distinctions in hand, we can now (provisionally) define the (intentional)
content of a representation in terms of the intentional use, roughly as follows:

An (intentional) content is whatever we understand x to be when we use the
idiom "represent(ation of) x" purely intentionally.

I.e.:

An (intentional) content is whatever we understand x to be when we use the
idiom "represent(ation of) x" but there is no real x.
On some views, e.g., Devitt (1996) and Millikan (2000), the causal relation is an
actual historical, causal relation to a real phenomenon (the mental representation was
actually caused by the phenomenon); on other views, e.g., Dretske (1988) and Fodor
(1990), it is merely counterfactually co-variant with it (the representation would
co-vary with the phenomenon under certain counterfactual circumstances). In either
case, there is a commitment to the actual or counterfactual reality of the phenomenon
that provides the intentional content of the non-logical components of propositional
attitudes. It is this commitment that I will argue is brought into question in the case
of SLEs.
One might think that Strong Externalism is vulnerable to plenty of commonplace
examples of contents that at least do not seem to involve any actual external
phenomena: ghosts, spirits, gods, characters in dreams and fiction. But there are many
moves externalists make to deal with such cases, the main strategy being to claim
that purely intentional usage of "represent" involves only logically complex
representations, not any atomic ones - indeed, for some, any apparently atomic cases
(e.g., "angel") have no genuine content at all (see, e.g., Fodor, 1990, 1998; Millikan,
2000, pp. 175-6; 2004; and Taylor, 2003, pp. 207-12)! What interests me about the
examples of SLEs is that these moves seem particularly implausible in their case, a
case that may involve expressions that, as components of a dedicated language module,
may well be conceptually or contentfully atomic, and one, moreover, in which the
issues are motivated by serious theoretical considerations and not merely ordinary talk.
Indeed, a benefit of this debate between these different theories of content is that it
provides a way to sharpen the discussion about the existence of SLEs. As Jackendoff
(chapter 13) and Chomsky (2000, pp. 177-8) rightly emphasize, ordinary talk of
"existence" and "objects" is pretty wild, involving a kind of existential promiscuity:
we refer to "things" like "rainbows," "the sky," "the wind," "a person's reputation," but
would be hard put to specify just what these "things" might be. I am not, however,
concerned here with ordinary talk. Rather, I want to press the purely theoretical issue
about the existence of SLEs as serious explanatory posits: do such real entities play
any essential explanatory role in any linguistic or psychological theory? In order to
give a certain bite to the discussion, I will defend the denial of the existence of SLEs
in the context of these Strong Externalist theories of mental content. I do this partly
because of the widespread influence of such theories of content, which I think needs
to be checked, but also in part because the claims such theories make provide a
useful constraint on the ordinary promiscuity regarding existence. In particular, if
the Strong Externalist theory is to provide an explanatorily adequate account of
content, then the external phenomena to which it appeals had better be specifiable
independently of the state whose content it purports to provide. This latter constraint
is what I believe linguists have given us reason to think cannot be satisfied by SLEs,
which is why it is plausible to suppose they don't exist.
2 SLEs versus Cars

2.1 Sentences
One of the linguist's arguments for the nonexistence of SLEs is in a way extremely
short. Consider some ordinary English sentences that we orthographically represent as:
(U1) John seems to Bill to want to help himself.

[Tree diagram in the original; approximately: [IP [NP John1] [VP seems [PP to Bill] [IP t1 [I' -finite [VP want [IP PRO1 [VP help [NP himself1]]]]]]]]]
Thus, not only is there an elaborate tree structure in (U1), there are also "empty"
categories: trace (t1) and PRO, which indicate a node in the tree that for one reason
or another has no SLE attached to it, for example, because an SLE has been "moved"
to a different node in some "transformation" of the sentence, and/or needs to be
co-indexed with elements at other nodes: thus t1 is co-indexed with John, PRO, and
himself, capturing the fact that the subjects of want and help are both John, and
not Bill.
Now what actual thing in the world possesses this structure? For example, does
the utterance that is purportedly named by "(U1)" above possess this structure? Well,
what is that (purported) utterance? A token of an English sentence. According to PT,
this is something in space/time. But does anything I actually produced in space and
time have the above structure? I think not.
2.2 Cars
A useful contrast here is my automobile, as I said, a most reliable Honda Accord.
Cars, like (purported) linguistic entities, are artifacts, arguably tokens of types,
produced by human beings with certain specific intentions. It seems to me very
probably true that, had there been no people, there would have been no cars. But, of
course, this doesn't for a moment imply that my Honda isn't real - e.g., that it doesn't
exist in the garage when no one's looking or thinking about it (I expect that, hardy
car that it is, it might well persist long after all of us are gone). Indeed, I submit it's
absolutely crucial to the explanation of why my Honda is so reliable that it in fact
has (or realizes) a certain causal structure: the pistons fit snugly into the cylinders,
so that when the gas is ignited by the sparks from the plugs, they are pushed down
with sufficient force to turn the crankshaft, which, through the transmission, turns
the axles and the tires, which, in contact with the road, allow it to scoot from here
to there.
Now, does anything in space/time have the structure ascribed to (U1) in the way
that my car has the structure of my Honda? It would seem not. Whereas examining
the spatiotemporal region of my car clearly reveals the existence of the intended
structure, examining the spatiotemporal region of my utterance reveals nothing
remotely like the linguistic structure that I presumably intended. And, if my folie view
is correct, nothing in the explanation of the success of linguistic communication
depends upon there being such a structure. One can imagine building a kind of
Rube-Goldberg tree structure, replete with little ornamental morphemes suspended
from each terminal node, and with wires and pulleys that permitted some movements
or connections here and prevented others there. But it is an interesting fact that the
noises we produce when we intend to utter sentences are nothing like this, and don't
need to be. There is nothing we ordinarily produce that has a tree structure with
items that have in fact been "moved" or even in fact "cross-indexed" from one node
to another, leaving a nevertheless still existing node "empty." As we will see shortly,
there are not even boundaries in the acoustic stream that could serve to causally
distinguish the physical parts of words, let alone NPs, VPs, and the like. In "uttering
(U1)," I simply didn't succeed in producing something with an actual linguistic
structure in the way that the technicians at Honda produced something with an
automotive one.
Someone (e.g., Ned Block) might claim that the structure of an SLE is "more
abstract," in the way that, for example, a computer can have an abstract computational
structure not easily visible in the array of transistors and the pieces of metal and
plastic of which it's made. However, there is an important difference between the two
cases: in the case of the computer, and for at least some of the algorithms it can
literally be said to be running (e.g., the operating system), there are all sorts of causal
claims and true dispositional counterfactuals about what would happen if you were
to apply a certain current to certain points in the network, and it's these causal claims
and counterfactuals that make it true, if it is true, that the computer is executing a
certain algorithm. Indeed, the steps in the algorithm can, if the computer is actually
executing it, be clearly mapped in an explanatorily illuminating way to specific physical
states of the machine. Nothing analogous seems to be true of an SLE - at least not
independently of human perceivers (to whom we shall return shortly).
There is not space to explain here the variety of reasons for discrepancies between the linguistic and acoustic streams (see also Fodor et al., 1972, pp. 279-313; and Jackendoff, 1987, pp. 57ff.). Suffice it to note a particularly interesting phenomenon for my folie view: cue displacement. Consider:
(U2) Proust was a great writer of prose but a poor rider of horses.
A typical speaker of American English will issue a command to produce a /t/ in the course of trying to produce "writer" and a /d/ in the course of trying to produce "rider"; however, the command will in both cases produce exactly the same acoustic wave front, or "formant," at the supposed point of the /t/ and the /d/. But this seems a little surprising, since most hearers of (U2) wouldn't notice the indistinguishability of the /t/ and /d/ here until their attention was explicitly drawn to it. How do they manage to keep things straight? What happens is that the effect of the command to produce a /t/ in the context of the other intended phones of /writer/ is to shorten the duration of sound associated with the immediately preceding vowel /i/. So the speaker did not in fact succeed in producing the /t/ that he commanded; nor did the hearer hear an actual /t/. But both speaker and hearer may have thought that a token of the word /writer/ was produced (a similar dislocation occurs in the case of "latter" and "ladder"; see Fodor et al., 1972, pp. 292-3 and Jackendoff, 1987, pp. 61-2).
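To make the displaced cue vivid, here is a minimal sketch in Python; the durations and the threshold are invented for illustration, not measured values. Since the flaps in "writer" and "rider" are acoustically identical, the intended /t/ or /d/ has to be read off the length of the preceding vowel:

```python
# Hypothetical durations: a /t/ command shortens the preceding vowel,
# a /d/ command leaves it long; the flap itself is the same in both words.
VOWEL_THRESHOLD_MS = 120.0  # illustrative cutoff, not a measured value

def intended_consonant(preceding_vowel_ms):
    """Classify an ambiguous flap by the duration of the vowel before it."""
    return "t" if preceding_vowel_ms < VOWEL_THRESHOLD_MS else "d"

for word, vowel_ms in (("writer", 95.0), ("rider", 150.0)):
    print(f"{word}: flap heard as intended /{intended_consonant(vowel_ms)}/")
```

Nothing in this little classifier inspects the flap itself; the "clue" lies elsewhere in the signal, which is just the point of the example.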
In a useful analogy, Fodor et al. (1972) compare the formants to the clues left by a criminal: the hearer is in the position of a detective, inferring the identity of the criminal from the clues - but not by identifying the criminal with the clues:

The acoustic representative of a phone turns out to be quite unlike a fingerprint. It is more like the array of disparate data from which Sherlock Holmes deduces the identity of the criminal. (Fodor et al., 1972, p. 301; see also Laver, 1993, p. 106)
Note that what "clues" and "signals" a speaker provides a hearer will vary according to the speaker's estimation of what the hearer in a particular context will need: articulating slowly and distinctly for children, foreigners, and noisy telephones; speeding up and employing copious contractions with familiar friends and colleagues; and proceeding at a breakneck pace in the highly stylized speech of an auctioneer. And, of course, clues and evidence would seem to be the order of the day when we consider the tactile "Tadoma" language developed for the deaf and blind (whereby SLEs are detected, to some 80 percent accuracy, by merely touching the face and neck of a speaker!). According to the folie à deux I am urging, we're all, as it were, acoustic Tadomists.
2 .4 "Psychological reality"
As I said at the end of section 2.2, one might, of course, appeal to perceivers
and claim that the boundaries and categories are "psychologically real." Certainly,
standard linguistic data regarding which strings of phonemes are and which are not
"acceptable," which things "co-refer" (as "John" and "himself" do in (U 1 ) ), present
pretty impressive evidence for the reality of how the mind represents those strings.
But is something's being "psychologically real" sufficient for its being real tout court?
Consider an example very different in this regard from the examples of Hondas and computers: Kanizsa figures.
Human beings standardly seem to see the figure above as a triangle - so vividly in fact that novices often have to put their fingers over the apexes to see that it's an illusion. That is to say, despite the figure's being "psychologically real," there is in fact no triangle there, nothing with the standard three-sided structure of these things. Indeed, Kanizsa figures are only particularly dramatic examples illustrating the point, familiar since Plato, that, strictly speaking, no one has seen or could ever really see a genuine triangle, since all the proffered examples are manifestly defective (the sides aren't perfectly straight, much less one-dimensional). In any case, "psychologically real" needs to be understood along the lines of "suspected criminal." The burden is on those who would claim that something that is "psychologically real" is genuinely real - real tout court.
A number of people (e.g., Sue Carey, Zenon Pylyshyn) have replied to me at this point that there is a triangle even in this Kanizsa case, for isn't there a triangular region of space (defined perhaps in terms of distances between point locations) where the triangle appears to be? Well, no, strictly speaking there probably is not, since it is likely that our ordinary conceptions of geometrical phenomena are Euclidean, and real space is Riemannian. But, putting that difficulty aside, on such a view space would be chock full of continuum-many squares, diamonds, triangles, etc., everywhere one looks - it's just that one perceives (at best) very few, if any, of them. This seems to me a desperate and needlessly profligate move, and, in any case, it would ill serve the purposes of the Strong Externalist, since it would only be perceptible triangles (if such exist) that would be the cause of certain representations, which would then seem to have their content restricted accordingly.
Moreover, it's important to bear in mind the threat of circularity here: if contents of representations are to be substantially identified with the causes of them, those causes had better be identifiable independently of that content, i.e., independently of what people take to be a triangle - or an SLE. But this is precisely what seems difficult in the case of Kanizsa figures: as figure (a) nicely illustrates, the type of figure that might be defined by the apexes depends entirely on how it is oriented to the perceiver, who is caused to "take" it as a triangle.
Orthographic SLEs are at least as badly off as Kanizsa figures, since there are entire Kanizsa "alphabets" consisting of them (see Ninio, 1998/2001, p. 90), and one can imagine a group of people recording information and communicating quite effectively by the use of them. But acoustic SLEs are even worse, for here one doesn't even have recourse to anything like the background geometry of space. The "heard boundaries" - not to mention the tree structure and empty nodes - of the sentences one takes oneself to utter and hear are even more obviously "imposed" by the perceiver. One would certainly be hard put to identify an SLE as a cause of an SLE representation independently of whether a subject "took" something to be one.
"But," someone (e.g., Louise Antony) might remind us, "Isn't it crucial to
Chomskyan theories that the sentences a child hears are in fact structured along the
lines he suggests?" To be sure, a person's understanding of a language does involve,
inter alia, being able automatically to hear an acoustic blast as manifesting an intended
structure. However, manifestation of an intentional content is one thing; instantia
tion of it quite another. As many would-be novelists can attest, one can manifest
an intention to produce something without actually ever producing it. Evidently
language acquisition does depend upon there being some fairly systematic mani
festation of linguistic structure, but this does not entail that the acoustic blast in fact
possess or display that structure: it may be enough that it provides merely sufflcient
clues to it, enough bells and whistles to serve as indicators of the intended structure,
along the lines of the folie view I am recommending.7
3 SLE Idealism
Many philosophers make a living providing ingenious analyses of problematic entities that render their existence unproblematic. Since Jackendoff does not defend any of them, I won't consider them here (but see Rey, forthcoming). I will simply comment on the idealist proposal that he does defend, viz., that SLEs - like all the things we can conceive - don't exist in the real world, but rather in a "phenomenal," "experiential" world that is somehow partly constituted by our thought.
The chief problem is, of course, to make sense of exactly what such "objects" are supposed to be, and how any of the things we believe to be true of SLEs could be true of them. For starters, it's hard to see how they could be uttered or be otherwise produced and then heard by an audience if they don't exist in real space/time. And if they exist only in a phenomenal space/time, is, e.g., the c-command structure causally efficacious in it? How does that work, in such a way that other, actual people are caused to hear it? Or are "other people" also only phenomenal, not actual? Do they only exist insofar as we conceptualize or experience them? And their bodies? . . . It would appear that idealism about SLEs soon threatens to engulf the world, and we are left with the usual paradoxes that I mentioned in passing earlier: why we should take Jackendoff's account seriously, since, by its own lights, none of it is about the actual world.
Note that Jackendoff offends here not only against these sorts of standard philosophical reflections, but also against the reflections of an ordinary "us" for whom he claims to speak: it is simply not the case that "the perceptual world is reality for us" (emphasis his), or that "[W]e are ultimately concerned with reality for us, the world in which we lead our lives." Speaking at least for myself (but surely for many others), it is precisely when I concern myself with "ultimately" what's real that I abstract from the odd errors of my perceptual and cognitive system and acknowledge that there really are no Kanizsa figures (indeed, no genuine Euclidean figures at all) nor any SLEs. It's only when I don't care how things ultimately are - which, to be fair to Jackendoff, is a very good deal of the time - that I acquiesce in ordinary talk and "refer" to things that I fully believe don't exist.
In any case, if one looks at the theoretical work SLEs are supposed to perform in a theory, their actual existence is entirely needless and extraneous to linguistic theory. For recall how linguistics discusses SLEs. Outside of explicitly acoustic phonetics, linguists almost never discuss acoustic phenomena. Rather, they discuss how SLEs enter into complex sequences: phones into phonemes, phonemes into morphemes, morphemes into words and syllables, words into phrases and sentences, and sentences into discourse. There are abundant theories and disputes about the specific structures, rules, principles, and parameters for these structures, which provide the substance of linguistics, about which I wish to remain as neutral as possible. The only issue that concerns me is what the structures and/or sequences are structures and sequences of, if, at the basic level of them, phones can't be identified with particular acoustic phenomena. The structures would seem to be enormous and intricate buildings resting on - nothing. Or, if there turned out to be something, this would be entirely accidental. As Chomsky (2000, p. 129) says of various proposals to identify SLEs with acoustic phenomena, they are all needless "wheel-spinning."8
A proposal that I've found many people inclined to make at this point is to claim that SLEs should be regarded as merely "abstract" objects, like numbers. Without entering into the in-house controversies in this regard that I said I would avoid (n. 2), let me just note the following crucial difference between the two cases: our best theories of the nonpsychological world seem to be committed to numbers, e.g., as the magnitudes of various parameters like mass or energy, quite independently of whether anyone has represented them: the magnitude of the mass of the moon causes a certain magnitude of tides whether anyone ever thinks about it. But in "psychological" domains such as linguistics, I submit that the entities have no role to play independently of our representations of them. As Cartwright nicely put it with reference to purely "fictional objects":
Must not dragons have some mode of being, exist in some universe of discourse? To [such] rhetorical questions, it is sufficient to reply with another: What, beyond the fact that it can be referred to, is said of something when it is said to have some mode of being or to exist in a universe of discourse? The alleged universes are so admirably suited to perform their function that they are not above suspicion of having been invented for the purpose. (Cartwright, 1960/1987, p. 30)
This seems to me precisely the suspicion one ought to have about SLEs and many other of what might be called "representation-dependent" entities: there's no reason to believe in them over and above the reasons one has for believing that there are simply representations with the corresponding content.
4 Conclusion
In sum, what I take to be the linguistic argument for the nonexistence of SLEs consists of noting that nothing in the external world has the properties that linguistic theory plausibly identifies as the properties of SLEs. Those properties are specified by the language faculty, not as a result of any causal interaction with those properties in the world, but as a result of categories the faculty itself provides and, so to say, "imposes" upon the physical input. As philosophers and psychologists at least since Kant have been reminding us, our understanding of the world is as much a product of our minds as of any input from it. Moreover, in some cases, for example, SLEs and (Kanizsa) triangles, there turns out on examination to be nothing corresponding to the mind's categories: nothing in the external world has the properties that these categories require.
However - and this is the first point at which I depart from Jackendoff's generalization of the Kantian point - in other cases, notably my car, things do manage to have the required properties. And at least one distinctive reason to believe this is that the properties of cars play crucial causal-explanatory roles in an account of how things work in the world, in a way that the properties of SLEs do not. The folie à n that might work for SLEs pretty clearly won't work for cars.
What leads Jackendoff to think otherwise is, I suspect, his rightly noting that the same mechanisms of mind, whereby concepts are imposed upon input, are likely at work in all our categorizations of the world. But he wrongly concludes from this that the world itself depends upon our categorization. One can see the temptation to such a view in the case of SLEs, and I'll return to that temptation in a moment. But there's no reason to yield to this temptation when our efforts at categorization turn out to succeed. From the fact that our concepts might be constituted in part by material provided by our minds, it doesn't follow that the things themselves picked out by those concepts are constituted in part by that structure. Surely it's possible that what our minds impose corresponds to what's actually there, as, I've argued, it does in the case of cars. But cars are constituted of metal and plastic, not ideas. Moreover, if cars do exist, then we're entitled to switch from the purely intentional to the existential use of "represent" and say that our representations relate us not to mere ideas of cars, but to the cars themselves.9
However, the second point at which I depart from Jackendoff and Chomsky is in seriously thinking of the "experiential world" or "world of ideas" as a world even in the case of nonexistent things like SLEs. There's of course something right about the thought: especially in such cases as SLEs and Kanizsa figures, the illusions are so wonderfully stable, not only across time, but across people, that it's easy to proceed "as if" the things existed, and even "discuss" them socially: most people would surely spontaneously agree (as they might put it) that "those Kanizsa figures reproduced above are astonishingly vivid, their surfaces seeming more luminous than their surrounds." But the thought of a separate world is over-thought if it invites claims about a rival world in addition to (and even superseding) the actual one. A better thought is that it's like being in a play, and knowing we are, but being unable to completely disengage ourselves from it: we keep "referring to Polonius," with smashing social
success, even though we know that "he" is a fiction. If you'll pardon my French, my folie à n is a kind of façon de parler, or, more accurately, a façon de penser, since it's surprisingly difficult even to think such stable contents without engaging in this odd sort of "projection" of the nonexistent things to which they seem to refer.
It's here that I think Brentano's talk of "intentional inexistents" may be of help. When there is a sufficiently stable system of representations with a content [x], it is useful - almost ineluctable - to talk about x's, even when (we know) there are none, and this stability licenses talking of them as "intentional inexistents." Thus, Hamlet, Polonius, Zeus, and the other gods, as well as Kanizsa triangles and SLEs, all would qualify - but not, as Cartwright (1960/1987) notes, carnivorous cows, since (so far) there's no such stable system with [carnivorous cow] as its content. This sort of talk seems harmless enough, a piece of the "existential promiscuity" noted earlier; who would be so philosophically puritanical as to object, if it can be paraphrased away in terms of stable representations with certain content?
The only error would be to suppose that we have to accord such things some kind of genuine being, distinct from actual existence, one that, moreover, threatens to usurp our references to actual existent things in ways that Jackendoff and Chomsky suggest. Indeed, I'll end here as Cartwright did: "unreality is just that: it is not another reality" (Cartwright, 1960/1987, p. 30).
Acknowledgments
I am indebted to Ray Jackendoff, who graciously made available to me the manuscript of his
chapter in the present volume. I am also indebted to discussions with too many people
to mention here ; I will just express particular gratitude to Paul Pietroski and Dan Blair for
discussions of the linguistic material.
Notes
presumably consciously thinking the things he writes, and those things appear to be expressed
in English and so will surely be subject to the constraints of English. If he thinks English
can't express x, then he'd better not try to express x in English! In any case, I would have
thought that one of the constraints on any language is at least a disquotational principle,
that, e.g., "tables" is about tables. Whatever one might think about the ontological status of tables, it can't even be tempting to think that in one's use of "table" "one is referring to one's mental representations [of tables] rather than to the things [the tables] represented."
Jackendoff seems more certain than I think he is entitled to be about our understanding of early vision and its contents: see Burge, 1986; Segal, 1991; Davies, 1991; and Egan, 1992 for interesting controversy. It is for resolving these sorts of controversies that a theory of the "mysterious mind-world relation of intentionality" is required.
A case in point is Chomsky (2000), who claims that, as he understands them, "[internal] representations are postulated entities, to be understood in the manner of the mental image of a rotating cube, whether it be the result of . . . a real rotating cube . . . or imagined" (2000, pp. 159-60), but then proceeds to deny that his own use of representations commits him to any intentionality (2000, pp. 22-3, 105; 2003, p. 279). I suspect a similar denial in Jackendoff when he writes, "language has . . . made contact with the outside world - but through the complex mediation of the visual system rather than through some mysterious mind-world relation of intentionality." As I argue in Rey, 2003, these denials are incompatible both with the claims of practicing linguists (including Chomsky and Jackendoff themselves!), and with making sense of their explanatory project.
These are the examples of beings that are physical duplicates, but whose thought contents seem to depend upon which environment they inhabit. For example, two internally identical people might be referring to different individuals in their use of a proper name, in part because they are living in different worlds. Note that all that such examples themselves support is Weak, not Strong, Externalism.
Note that, apart from occasional deliberate (and untypical) standardizations, written language presents many of the same problems. Note also that mechanical speech recognizers do not function by responding to well-defined SLEs in the input, but by elaborate statistical inferences, exploiting "Hidden Markov Models" ("HMMs") by means of which unobserved sequences of probabilistically related states are "inferred" as the statistically "best explanation" of observed acoustic events (see Rabiner, 1989). Leaving aside the degree to which the success of these machines depends upon humans setting many of their parameters, such "prototypical" SLEs have no more reality than "the average face" - which nevertheless may provide a perfectly good way of identifying faces.
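For concreteness, here is a minimal Python sketch of the kind of inference such recognizers perform - Viterbi decoding over a toy two-state HMM with made-up probabilities (none of this is from Rabiner's tutorial): an unobserved sequence of word hypotheses is "inferred" as the statistically best explanation of the observed frames, with no well-defined SLE anywhere in the input.

```python
import numpy as np

# Hypothetical toy HMM: two word hypotheses explain frames of acoustic evidence.
states = ["writer", "rider"]
observations = ["short_vowel", "long_vowel", "flap"]  # coarse frame labels

start_p = np.array([0.5, 0.5])
trans_p = np.array([[0.9, 0.1],      # word hypotheses tend to persist across frames
                    [0.1, 0.9]])
emit_p = np.array([[0.5, 0.1, 0.4],  # "writer": shortened vowel, then a flap
                   [0.1, 0.5, 0.4]]) # "rider": long vowel, then the same flap

def viterbi(obs):
    """Most probable hidden state sequence, given observed frame indices."""
    T, N = len(obs), len(states)
    v = np.zeros((T, N))
    back = np.zeros((T, N), dtype=int)
    v[0] = start_p * emit_p[:, obs[0]]
    for t in range(1, T):
        for s in range(N):
            scores = v[t - 1] * trans_p[:, s]
            back[t, s] = int(np.argmax(scores))
            v[t, s] = scores.max() * emit_p[s, obs[t]]
    path = [int(np.argmax(v[-1]))]
    for t in range(T - 1, 0, -1):
        path.append(back[t, path[-1]])
    return [states[s] for s in reversed(path)]

# A shortened vowel followed by an ambiguous flap is best explained by "writer".
print(viterbi([0, 2]))   # -> ['writer', 'writer']
```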
Chomsky (2000, 2003), himself, seems to identify at least some SLEs with items "in the head," whose nature he expects to be spelled out by some future biology. I consider and reject this option in Rey (2003, sections 5-6). To make that long story short: the explanatory force of at least the kind of computational/representational theory of thought that seems everywhere presupposed in cognitive science derives from Turing's brilliant strategy for reducing all computation to operations defined in terms of local physical properties. SLEs are manifestly not such properties. Therefore a computational theory of language processing will need to proceed via computations on representations of SLEs. See Rey (1997, section 3.4) for general discussion.
Note that Jackendoff sometimes recognizes this point, surprisingly agreeing with his "skeptic" that "we don't perceive percepts in our heads, we perceive objects out in the world," indeed, that "the perceptual world isn't totally out of synch with the 'real world'." One aim of the present chapter is to provide Jackendoff a way of avoiding this embarrassed use of quotation marks around "real world" - and the inconsistency (noted in n. 3) between these claims and the idealism he otherwise espouses.
Rabiner, L. (1989). A tutorial on hidden Markov models. Proceedings of the IEEE, 77, 257-86.
Rey, G. (1981). What are mental images? In N. Block (ed.), Readings in the Philosophy of Psychology, vol. 2. Cambridge, MA: Harvard University Press.
(1997). Contemporary Philosophy of Mind: A Contentiously Classical Approach. Oxford: Blackwell.
(2003). Intentionality and a Chomskyan linguistics. In A. Barber (ed.), Epistemology of Linguistics. Oxford: Oxford University Press.
(2004). Millikan's (un?)compromised externalism. In R. Schantz (ed.), The Externalist Challenge: New Studies on Cognition and Intentionality. New York: de Gruyter.
(forthcoming). Representing Nothing: Phones, Feels and Other Intentional Inexistents. Oxford: Oxford University Press.
Sapir, E. (1933/1963). The psychological reality of the phoneme. In D. Mandelbaum (ed.), The Selected Writings of Edward Sapir. Berkeley: University of California Press.
Saussure, F. (1916/1966). Course in General Linguistics. New York: McGraw-Hill.
Segal, G. (1991). Defense of a reasonable individualism. Mind, 100/4, 485-93.
(2000). A Slim Book About Narrow Content. Cambridge, MA: MIT Press.
Taylor, K. (2003). Reference and the Rational Mind. Stanford: CSLI Publications.
Yolton, J. (1994). Perceptual Acquaintance. Minneapolis: University of Minnesota Press.
CHAPTER FIFTEEN
IS THE AIM OF PERCEPTION TO PROVIDE ACCURATE REPRESENTATIONS?
Kirk Ludwig
I take the title question to be the question whether the instrumental or biological
function of perceptual systems is to provide us with perceptual experiences that are
by and large accurate representations of our environments. I will argue that the answer
to this question is "yes."
The assumption that perception yields experiences that represent the world around us by and large accurately is deeply embedded in the tradition in the philosophy and psychology of perception. This is the representational theory of perception. On this view, when we perceive objects, we have perceptual experiences, which represent our environments, differently in different modes, but typically through a field medium, as in the paradigmatic case of visual experience, which represents a spatial field filled in in various ways. We form perceptual beliefs on this basis, and our beliefs are accurate, since they in effect abstract from the richer representational medium of experience, just insofar as our experiences are accurate.
This traditional view has been challenged on the basis of recent experimental work in psychology on "change blindness"1 and "inattentional blindness,"2 among other phenomena, and the accumulating body of knowledge about the neurophysiology of vision, which is taken to suggest at the least that our ordinary views about the degree of accuracy and completeness of our visual representations of the world before our eyes are vastly exaggerated, and, perhaps, indeed, that some even more radical overhaul of our traditional picture of perception is required. It has been suggested, in particular, that

Research in this area calls into question whether we really enjoy perceptual experiences which represent the environment in rich detail. If we do not enjoy experiences of this sort, then we need to rethink the idea that perception is a process aiming at the production of such experiences. (Noë, 2002b, preface)
In recent discussions this has come to be called "the grand illusion," though it is not clear that all participants in the debate express the same view with this phrase.3
There are, in this challenge to the traditional view, two different strains to be distinguished. The first and more radical, hinted at in the above passage, raises the question whether perception involves representations at all. If it does not, then the question of their accuracy, and whether they aim at accuracy, does not even come up. The second admits that perception involves representations, but argues that they are largely non-veridical or largely inaccurate. We will take up both sorts of objection in what follows.
I will begin by sketching the view that is taken to be under attack, and then sketch some of the evidence that has been advanced against it. Then I take up the radical view, motivated in part by these empirical findings, that denies that perception involves representations at all. I will then return to the question of whether the traditional view and the view under attack are the same, whether we have been in any interesting sense subject to a grand illusion, and whether the empirical findings undermine the view that perceptual experiences are by and large accurate representations of our environment. Finally, I will address in the light of the discussion whether perception aims at accurate representation.
According to many psychologists and philosophers of perception, and even ordinary people - it is said - visual perceptual experience in particular represents the visual scene in uniform, rich detail. This conception of visual experience has been called "the snapshot conception of experience," according to which "you open your eyes and - presto - you enjoy a richly detailed picture-like experience of the world, one that represents the world in sharp focus, uniform detail, and high resolution from the center out to the periphery" (Noë, 2002a, p. 2). Attention ranges over this present rich detail, which is stored in some sense as an internal representation of the environment. It is because of this richly detailed internal representation of the visual scene that we have the impression of a complete and detailed field of vision and a richly detailed world.
On closer examination, however, the snapshot conception of experience begins to look more like a myth. Phenomenologically, it is clear that the visual field (how things are subjectively presented to us in visual experience) is not uniformly detailed and in sharp focus from the center right out to the periphery. Our representation of the scene before us fades as it were toward the edges of the visual field, with far less detail being represented at the periphery than at the center. This is what we should expect given that the sensory receptors on the retina are sparser toward the edge than in the foveal region. We get much less information4 about what goes on in the regions around us by way of signals generated by light falling on the portions of the retina which are at the periphery. This is strikingly illustrated in a famous experiment, in which subjects were seated before a computer screen and asked to read a page of text while their eye movements were tracked. They had the impression of a
stable page of intelligible text. But outside the area their eyes were focused on, about 18 characters in width, only junk characters were displayed. That is, the intelligible text was shifted with their eye movements, and everywhere else junk characters were displayed.5 In addition, it is well known that the eye saccades three to four times a second, and during movement there is no information transmitted to the brain. Yet this seems not to be represented in visual experience. We likewise fail to notice the blind spot in the visual field corresponding to the location of the optic nerve on the retina, where there are no rods or cones. The information the brain gets, therefore, is sparse over much of the scene before us, intermittent, and gappy. How could experience be any better? And surely the fact that these things do not come to our attention without investigation shows that we had thought experience provided a richer and more accurate representation of our surroundings than in fact it does.
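The logic of the gaze-contingent display is simple enough to sketch in a few lines of Python; the 18-character window is from the experiment described above, while the sample line and the simulated fixation positions are invented for illustration.

```python
import random

# A minimal sketch of the "moving window" display: only a window of text
# around the current fixation is intelligible; everything else is junk.
WINDOW_CHARS = 18
JUNK = "xqzvkwjbg"

def render_frame(text, fixation, width=WINDOW_CHARS):
    """One display frame: real text inside the window, junk letters outside."""
    half = width // 2
    frame = []
    for i, ch in enumerate(text):
        if abs(i - fixation) <= half or not ch.isalpha():
            frame.append(ch)                   # within the window, or a space
        else:
            frame.append(random.choice(JUNK))  # outside the window: junk
    return "".join(frame)

line = "the subject has the impression of a stable page of intelligible text"
for fixation in (10, 35, 60):                  # successive simulated fixations
    print(render_frame(line, fixation))
```

Each rendered frame is intelligible only within the window, yet because the window travels with the eyes, the reader's impression is of a uniformly intelligible page.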
In addition, the recently investigated phenomena of change blindness6 and inattentional blindness,7 among others, have been cited as showing that we are radically mistaken about our experience representing everything that goes on in our field of vision. In experiments on change blindness, many subjects fail to notice quite significant changes in the visual scene they are looking at (for example, in the color of flowers and cars or the structure of a house or position of a rail in the background) if the changes are made during eye saccades or during the simultaneous display of visual distractions.8 More specifically, many subjects fail to report noticing any changes when asked. Subjects thus appear to be blind to changes that occur during such events.
Inattentional blindness is a related phenomenon. Subjects fail to notice what seem to be quite significant features of the visual scene if their attention is directed elsewhere. In one study subjects were asked to concentrate on a cross in the middle of a screen presented briefly, then masked, and presented again. They were asked to estimate which arm of the cross was longer. On the third or fourth trial they were presented with a new stimulus, a colored square or moving bar. When the stimulus was presented close to the fixation point, 75 percent of the subjects failed to report it when asked if they noticed anything different. In another widely discussed experiment, subjects were shown a video of two basketball teams, one in black and one in white, each passing basketballs.9 They were to count the number of passes for one team. They were asked afterwards if they noticed anything unusual. Forty-five seconds into the video an intruder walks through the players, a woman holding an umbrella or a man in a gorilla suit. In some trials the intruder was semi-transparent, and in some fully opaque. Seventy-three percent of the subjects reported nothing unusual in the semi-transparent trials, and 23 percent in the fully opaque trials. The conclusion we are invited to draw is that subjects may fail to see or perhaps visually represent what they don't pay attention to.
I will return to what these phenomena show about the traditional view and their relevance to our question in a moment. I first consider a radical alternative to the traditional view motivated by them, the sensorimotor view of perception advocated by Noë and O'Regan (Noë and O'Regan, 2001). Officially, this view holds that perception does not involve representations at all, and seeks to exhibit it as a matter of a pattern of engagement with the environment. We will be concerned with how tenable it is, and whether it really represents as radical an alternative as is suggested.
The sensorimotor theory holds that "Visual experience [for example] . . . does not consist in the occurrence of 'qualia' or such like. Rather it is a kind of give-and-take between you and the environment" (Noë and O'Regan, 2001, p. 80). "[P]erceivers have sensations in a particular sense modality, when they exercise their mastery of the sensorimotor laws that govern the relation between possible actions and the resulting changes in incoming information in that sense modality" (p. 82). It is "in this sense to be 'attuned' to the ways in which one's movements will affect the character of input" (p. 84). It is a "form of practical knowledge" (p. 84). According to the sensorimotor view of perception, then, visual experience is not to be understood in terms of representational states at all, but rather in terms of patterns of behavior and their connection with sensory influx embodied in "the whole neurally enlivened body" (p. 85). This point of view is expressed clearly in the following passage:
both the representationalist and sensationalist [about visual experience], make a . . . fundamental error. Each relies on a conception of visual experience according to which experiences are internal items of which we become conscious when we undergo them . . . momentarily occurring, internal states of consciousness. . . . As against this conception, I have proposed that perceptual experiences are not internal, momentarily occurring states of this sort. I advocate that we think of experience rather as a form of active engagement with the environment. Perceptual experience is a form of integration with the environment as governed by patterns of sensorimotor contingency. (Noë, 2002c, p. 74)
Sometimes the thesis is put as if it were actual behavior that was crucial - "it is a give and take between you and the environment" (Noë and O'Regan, 2001, p. 80); "experience is not something that happens in us but something that we do" (p. 99); "sensation occurs when a person exercises mastery of those sensorimotor contingencies" (p. 99; emphasis added). I will call this the activity theory. Sometimes the view is put in a way that suggests it is not actual activity that is required, but a disposition of a certain sort. I will call this the dispositional theory. Saying that visual experience is a form of practical knowledge suggests the dispositional, rather than the activity, view. And at one point exercising one's mastery of sensorimotor contingencies is characterized as consisting itself in "our practical understanding that if we were to move our eyes or bodies" (p. 84) there would be appropriate resulting changes of such things as "the influx from monochromatically tuned rod photoreceptors taking over as compared to influx from the three different cone types present in central vision" (p. 83). Thus, in visual experience, there is a characteristic change in neural influx when we step toward an object and away from it, when we turn our heads to the right or left, or when we close our eyes, or blink. On this view, the sum of all these sensorimotor contingencies and our attunement to them associated with a particular sensation, e.g., of red or yellow, is the visual experience or sensation of red or yellow.
The activity interpretation is untenable, as Noë and O'Regan recognize. One may perceive a surface as red so briefly that one has no time to move. Actually engaging in activity cannot be a requirement on perception. Moreover, it seems clearly to be a category mistake to talk of perceptual experience, when we have in mind
our representations of our environment are veridical. Our perceptual beliefs, being abstractions from the contents of our perceptual representations, are correct only to the extent to which our perceptual experiences are veridical. Our perceptual beliefs in turn form the basis for our empirical investigations into the world, our inductive practices, the formation and testing of empirical theories about physical law and the neurophysiology of perception. Any empirical argument to the effect that our experiences did not represent the world by and large correctly, at least with respect to those features that form the basis for our scientific theorizing, would be self-defeating, because its premises would be established by the very perceptual mechanisms it called into question. If it were sound, then there could be no reason to accept its premises.
We could at most be justified empirically in thinking that in some circumscribed respects our perceptual experiences did not correctly represent the nature of the world around us. For any argument in favor of that would presuppose that in other respects, determined by the standpoint from which the skeptical argument was given, our perceptual experiences were largely veridical. One traditional example of this kind of circumscribed skepticism is skepticism about the reality of color, and other so-called secondary qualities. Recently, very general considerations have been advanced to show that the conditions necessary for the success of such a skeptical argument cannot be met (Stroud, 2000, esp. ch. 7). The difficulty is that to identify a general illusion about color, we must simultaneously be able to attribute color experiences and beliefs to people, and to establish that nothing is colored. But the practices which make sense of attributing color experiences and beliefs to people depend upon identifying what they believe relative to the objective features of objects in their environments to which they generally respond. If we can make sense of attributing color experiences and beliefs to people only if we can find those beliefs and experiences to be generally responsive in the right way to colored objects in the environment, then there would be no way coherently and simultaneously to identify color experiences and beliefs and to deny the reality of color. The line of thought here is connected with the application of the Principle of Charity in interpretation, which enjoins one, as a condition on the possibility of finding another person interpretable as a speaker at all, to find him to have beliefs which are about the conditions in the environment that prompt them (Davidson, 2001; Rawling, 2003). If this line of argument can be sustained, then we would have established the stronger conclusion that we cannot show we are mistaken in there being things falling in the fundamental categories we represent. The world illusion interpretation of the grand illusion hypothesis, according to which the world that perceptual experience represents to us is largely illusory, or illusory in certain fundamental respects, would be shown to be fundamentally in error.
I turn now to the perception illusion interpretation of the grand illusion hypothesis, according to which the illusion lies in our misrepresentations not of the world but of the character of our perceptual experiences. The perception illusion interpretation is directly relevant to our overall question only to the extent to which the evidence cited calls into question the general accuracy of perceptual representations. Let us take up first the challenges to the snapshot model of visual experience. The falsity of the snapshot model, at least if the representations we are interested in
What about the suggestion that visual experience misrepresents because, while signals to the brain are interrupted by saccades, our visual experience does not appear to be intermittent, and we do not represent a hole in the visual field where there is a blind spot?
The fact that the signal to the brain is interrupted during saccades does not show that visual experience of objects in the environment is non-veridical or inaccurate, any more than the fact that a film of a current event has a maximum frame rate, and so could not be said to be capturing changes in what is being filmed continuously, shows that it is non-veridical or inaccurate. The visual experience is representing the environment, not the mechanisms which implement it. Like a film, it has a maximum sensitivity to change. Changes that fall below the threshold are not represented. That perceptual experience has a limit to its temporal resolution is a matter of everyday experience. If I snap my fingers while watching, I do not see the movement of thumb and finger, only their starting and end positions. But this does not mean that we misrepresent what happens when things move faster than we can see. When the movement falls below the resolution of visual perception, we fail to represent it, but this is not to misrepresent it.
In the case of the blind spot, there is no question of a misrepresentation with binocular vision, because for each eye the other covers the portion of the visual field it receives no signals from. In the case of monocular vision, the question whether a misrepresentation is involved depends on whether the visual field is filled in in the portion corresponding to the optic nerve or not. If it is, then it is at least sometimes a misrepresentation; if not, it is the absence of representation.14 Neither case looks to show something significant about whether visual experience generally provides accurate representations of the visual scene. In binocular vision, there is no representational deficit due to the blind spot. At most, in monocular vision, there is a lack of representation of a small area in the visual scene or sometimes a misrepresentation.
Let me turn to evidence for the grand illusion hypothesis drawn from studies of change and inattentional blindness, both of which, I will suggest, are rather ordinary phenomena, and do little to support either the grand illusion hypothesis or the thesis that perceptual experience is inaccurate.
Inattentional blindness, that is, failure to notice or recall things that one was not paying attention to, though these things did clearly physically affect our sensory organs, both intermodally and intramodally, is familiar from ordinary experience. When we concentrate on a visual task - reading, or writing, or painting a design on a cup - we often fail to notice even quite significant aural events in our environment. Similarly, when listening to the radio, or a conversational partner at a cocktail party, we may miss most of what goes on in front of our eyes. And it is a commonplace that one often fails to notice somatic sensation when engaged in a difficult task or when one's attention is directed elsewhere - as pickpockets are well aware - even to the extent of not noticing that one has cut or bruised oneself or any sensations associated with that. Likewise, intramodally, one may, in concentrating on what one person is saying, fail to notice what her companion is saying though it is at the same volume. Or one may in keeping track of a sprinter not notice or be able to recall the color of the jersey of the runner next to her or much else about the visual scene. Change
blindness too is a pervasive feature of everyday life. We often fail to notice all the changes in scenes in front of us even as we look at them. Some movie-goers I know have failed to notice that in Luis Buñuel's film That Obscure Object of Desire the female protagonist is played by two different actresses. Many card tricks are based on our failure to be able to recall in detail facts about cards we see. Prepare a deck of cards by placing the seven of diamonds and the eight of hearts on the top of the deck, and the seven of hearts and the eight of diamonds on the bottom of the deck. Shuffle the deck in front of the subject without disturbing the two cards on the top and bottom. Ask the subject to take two cards off the top, look at them so that he can recall them, and then place them anywhere in the middle of the deck. Shuffle the cards a number of times, without disturbing the two bottom cards. Place the deck on the table, tap it twice, and then deal the two bottom cards onto the table face up. The subject of the trick will take the two cards dealt out to be those which he had memorized. In this case, clearly it is not a matter of failing to pay attention to the cards which explains why one fails to see that they are not the cards one initially looked at. In drawing attention to these things I do not mean to disparage the systematic study of the phenomena of inattentional blindness and change blindness, but only to point out that it is systematic study of phenomena we are already familiar with. If there were a case to be made for a grand illusion involving inattentional or change blindness, it is a case that could be made independently of psychological studies.
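Purely for illustration, here is a minimal Python simulation of the trick as just described (the deck handling is my own hypothetical stage direction, not from any source). It makes plain why the dealt cards pass for the memorized ones: they match in rank and color, differing only in suit.

```python
import random

# Index 0 is the top of the deck; the "shuffles" below spare the cards
# on the top and bottom, exactly as the trick stipulates.
def prepared_deck():
    ranks = [str(n) for n in range(2, 11)] + ["J", "Q", "K", "A"]
    suits = ["hearts", "diamonds", "clubs", "spades"]
    special = [("7", "diamonds"), ("8", "hearts"),   # to go on top
               ("7", "hearts"), ("8", "diamonds")]   # to go on the bottom
    deck = [(r, s) for r in ranks for s in suits]
    rest = [c for c in deck if c not in special]
    random.shuffle(rest)
    return special[:2] + rest + special[2:]

deck = prepared_deck()
memorized = deck[:2]                        # subject takes and memorizes the top two
deck = deck[2:]
mid = len(deck) // 2
deck = deck[:mid] + memorized + deck[mid:]  # subject returns them to the middle
body, bottom = deck[:-2], deck[-2:]
random.shuffle(body)                        # shuffles that spare the bottom two
deck = body + bottom
print("memorized:", memorized)              # ('7', 'diamonds'), ('8', 'hearts')
print("dealt:    ", deck[-2:])              # ('7', 'hearts'), ('8', 'diamonds')
# Same ranks, same colors, different suits - which is why the swap goes unnoticed.
```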
What do these phenomena show, first of all, about the extent to which we are subject to an illusion about the completeness of experience? Second, what do they show about the veridicality or accuracy of perception?
In the case of inattentional blindness, it has been claimed that the evidence shows that "there is no conscious perception at all in the absence of attention and therefore no perceptual object can exist preattentively" (Mack and Rock, 1998, p. 227). If this were true, then I think it would be fair to say that we were subject to a kind of illusion that we were conscious of things in our visual or auditory fields about which later we cannot report in much detail. But is it true? Is paying visual attention to something phenomenally like having tunnel vision? Does the rest of the visual field disappear or shrink, so that, except for what you are paying attention to, phenomenally the scene in front of you is just like the scene behind your head? This is an experiment which one can perform without a laboratory, and for my part I can report that it is just not so. I am paying attention at the moment to the words that are appearing on my computer screen as I type. But I do not experience a sudden shrinking of the visual field, even if I would not be able to tell you much about the detail of the visual scene outside the area of my attention.15 Similarly for the intermodal case. In paying attention to the words, my body does not suddenly go numb, I do not suddenly go deaf, etc. It is quite easy to imagine how one's whole experience would be different if in paying visual attention to something one simply ceased to have somatic or proprioceptive or auditory experience. A restricted ability to report on things one is not paying attention to does not impugn the view that if they affect one's sensory organs in a way that usually leads to some effect on the phenomenal character of one's visual or auditory experience, etc., then they have a similar effect on the phenomenal character of the appropriate portion of the visual or auditory
field even when one is not paying attention. For the ability to recall or report that one was in a certain complex phenomenal state and one's being in that state are not the same thing, and it is no surprise that we are better able to recall matters involving, and experiences of, things we are paying attention to than things we are not paying attention to.
Why would anyone suggest attention was necessary for consciousness? Mack and Rock reach their conclusion by identifying being able to recall or report something later with having at the time been conscious of it: "A perception is . . . conscious if subjects can recall or recognize it" (Mack and Rock, 1998, p. 233). However, while this is plausibly a sufficient condition for having been conscious of it, it is not a necessary condition, at least if we mean conscious or phenomenal experience. One could defend Mack and Rock's conclusion by introducing an operational definition of "conscious," which does not aim to capture the ordinary meaning, and is tailored to their experimental results. But the air of excitement goes out of the announcement when we take "no conscious perception" to be shorthand for "no or limited ability to recall in the experimental conditions."
This point applies equally to change blindness. Change blindness does not directly show that we do not at time t and at time t + ε, after the change, correctly represent features which have changed. What is shown at most is that one may have limited ability to notice a change in the scene, and by extension in the representation.16 For change in the world is represented in experience by a corresponding change in what the experience represents, and so in the experience itself. If an object is blue at one time, then red, one's experience represents that change if before the change it represented the object as blue and after the change it represented it as red. To notice that one's experience has represented a change requires taking note of a difference between one representation and another. The results of change blindness experiments do not suggest that before and after the change one's experience does not correctly represent. So they do not suggest that one's experience does not represent a change. The experimental results suggest only that we may fail to notice changes in our experience when they occur during saccades, or blinks, or when there are simultaneous distracting events in the visual scene. Given this, it is a mistake to suppose that people thinking that they would notice such changes shows that they are subject to an illusion about the accuracy or veridicality of their experience.17 Rather, they overestimate the extent to which they are able to attend to changes in their experience, and to remember the character of their experiences at later times.
It is easy enough to explain why we take ourselves to be better than we are at noticing changes in the special situations that elicit change blindness. As Jonathan Cohen has noted, "all the inductive evidence available to perceivers supports their belief in their ordinary capacity to notice ambient events" (Cohen, 2002, p. 152). We typically do notice changes in our environments that are important to us. It is natural then that we should be surprised when we fail to notice some changes that in retrospect seem obvious, in circumstances that we do not know are rigged to test the limits of our abilities.18 But as Cohen remarks, this should no more incline us to say we are subject to a grand illusion than the fact that we are surprised that we are mistaken in the Müller-Lyer or Ponzo or Ebbinghaus illusions. The "grand illusion"
is an instance of a very general phenomenon: ordinary subjects are ignorant about the limitations on their cognitive and perceptual capacities, and when controlled experimental conditions make these limitations apparent, they (and we) learn something new. (Cohen, 2002, p. 155)
Given that perceptual experience does by and large provide accurate representations of the environment, the question whether that is its aim is straightforward. Experience has the instrumental function of providing accurate representations if its doing so helps us achieve our aims. It is clear that knowing about the environment is important to our achieving many of our aims. This requires correct perceptual beliefs about the environment. And since these abstract from our perceptual experiences, this requires accurate perceptual experiences. Accurate perceptual experience therefore helps us achieve our aims. Perception therefore has the instrumental function of providing accurate representations. Any answer to the question of whether the biological function of perceptual experience is to provide accurate representations is more speculative, since it is an empirical question whose confirmation depends upon historical facts we have only indirect evidence about. Yet it seems overwhelmingly plausible that accurate representations of the environment tailored to an organism's needs provide a selectional advantage. Given this, we may safely conclude that it is also a biological function of perceptual experience to provide accurate representations.
Notes

1 See O'Regan et al., 1996, 1999; Simons, 2000; Simons and Levin, 1997.
2 Mack and Rock, 1998.
3 The phrase was introduced into the literature in Noë et al., 2000. A recent issue of The Journal of Consciousness Studies (Noë, 2002b) has been devoted to it.
4 There is a dangerous ambiguity in "information" which it would be well to note here. In the text, I use "information" in the sense of a physical signal which together with appropriate laws and background conditions enables one who knows the laws, background conditions, and signal, to infer something about its cause. In this sense, rings in the trunk of a tree carry information about its age. This does not mean that they carry information in the sense in which a newspaper does. A newspaper carries information in two senses, in the signal sense, and in the sense that it represents that certain things have occurred, that is, it contains representations that have intentional content and are true or false. Tree rings are not intentional and are not true or false.
5 O'Regan, 1990.
6 See Simons, 2000 for a recent review.
7 See Mack and Rock, 1998.
8 O'Regan, 1992; O'Regan et al., 1996.
9 Neisser, 1979; Simons and Chabris, 1999.
10 "Experience" has an event as well as a state reading. However, it is the state reading which is at issue in the question whether the sensorimotor view provides an adequate analysis of perceptual experience in the sense in which we speak of my visual or auditory experience at a given moment in time.
11 The same point can be made by the more traditional thought experiments of the brain in a vat, and the disembodied mind all of whose experiences are determined by an evil demon.
12 Consider in this respect Daniel Dennett's example of walking into a room and seeing wallpaper, in his example, of identical photographic portraits of Marilyn Monroe. Dennett says that "you would see in a fraction of a second that there were 'lots and lots of identical, detailed, focused portraits of Marilyn Monroe'", but that since "your eyes saccade four or five times a second at most, you could foveate only on one or two Marilyns in the time it takes you to jump to the conclusion and thereupon to see hundreds of identical Marilyns" (Dennett, 1991, p. 354). Dennett says rightly that we do not represent in detail more than we actually foveate on. But then he goes on to say: "Of course it does not seem that way to you. It seems to you as if you are actually seeing hundreds of identical Marilyns" (Dennett, 1991, p. 355). But this needs to be handled carefully. You do see a wall on which there are hundreds of portraits of Marilyn Monroe which are detailed. And it seems to you as if you do. But does it or would it seem to you that your visual experience represented all of that detail? I don't think that anyone would be under the illusion that it did. It is just that we know that wallpaper involves repetition of a pattern, and if we see the pattern, we know we are seeing a wall on which the pattern is repeated in all its detail. There is no illusion, and no surprise, in any of this.
13 Space constraints prevent a detailed discussion of Kathleen Akins's interesting argument that the peripheral thermoreceptor system does not provide veridical representations (Akins, 1996). The argument is based on an observation and an assumption. The observation is that the warm and cold spots that respond to temperature and temperature change are distributed unevenly over the skin, and have both static and dynamic responses that are nonlinear. The assumption is that intensity of felt sensation represents surface skin temperature if anything. It is the assumption that I would question. We treat sensations of heat and cold as providing information about distal objects and objects we are in contact with, not our skins, and, as in the case of visual experience, the relation between the subjective features of experience and the representation of objectively unvarying properties in the environment may be quite complex. The angle subtended by an object on the retina is not directly correlated either with its represented size or shape, which depends in addition on the represented distance and viewing angle. We may look for a similar interplay between what is represented and a variety of different sorts of information, including cross-temporal information, in the case of sensations of heat and cold. For example, when we step into a hot bath, we know that the intensity of the sensation of heat will diminish after a moment. But we do not take this to be a representation of the bath water cooling down - we do not suddenly plunge the rest of the body in after the feet have ceased to complain.
14 Dennett claims there is no filling in (Dennett, 1991), but see Pessoa et al., 1998 for discussion and some contrary evidence.
15 Fortunately, drivers do not go blind when they are talking on a mobile phone, though they are apt to do very poorly in reporting on the visual scene before them.
16 See Simons et al., 2002 for some recent experimental work that suggests under probing subjects often can recover information about a scene that it seemed initially that they had not taken note of. This suggests that "change blindness" as defined operationally in these experiments does not correspond to failure to be able to recall and report on the change at all, but failure in response to open-ended questions to make comparisons that would have called the change to mind.
17 See Levin, 2002 for studies of the extent to which people overestimate their ability to detect change.
18 In an informal survey I have found that people have difficulty picking out the difference between the pair of photographs reproduced in Blackmore et al., 1995 when viewing them at the same time, though they pick out the difference easily on being told what to look for. No wonder subjects can fail to notice a change when they are presented one after another with an intervening eye movement.
References

Akins, K. (1996). Of sensory systems and the "aboutness" of mental states. Journal of Philosophy, 93/7, 337-72.
Ballard, D. H. (2002). Our perception of the world has to be an illusion. Journal of Consciousness Studies, 9/5-6, 54-71.
Blackmore, S. J. (2002). There is no stream of consciousness. Journal of Consciousness Studies, 9/5-6, 17-28.
Blackmore, S. J., Brelstaff, G., Nelson, K., and Troscianko, T. (1995). Is the richness of our visual world an illusion? Transsaccadic memory for complex scenes. Perception, 24, 1075-81.
Bridgeman, B. (2002). The grand illusion and petit illusions: interactions of perception and sensory coding. Journal of Consciousness Studies, 9/5-6, 29-34.
Clark, A. (2002). Is seeing all it seems? Action, reason and the grand illusion. Journal of Consciousness Studies, 9/5-6, 181-202.
Cohen, J. (2002). The grand grand illusion illusion. Journal of Consciousness Studies, 9/5-6, 141-57.
Davidson, D. (2001). Radical interpretation. In D. Davidson, Inquiries into Truth and Interpretation (2nd edn.). New York: Clarendon Press.
Dennett, D. (1991). Consciousness Explained. Boston: Little, Brown.
Durgin, F. H. (2002). The Tinkerbell effect: Motion perception and illusion. Journal of Consciousness Studies, 9/5-6, 88-101.
Levin, D. T. (2002). Change blindness as visual metacognition. Journal of Consciousness Studies, 9/5-6, 111-30.
Mack, A. (2002). Is the visual world a grand illusion? Journal of Consciousness Studies, 9/5-6, 102-10.
Mack, A. and Rock, I. (1998). Inattentional Blindness. Cambridge, MA: MIT Press.
Neisser, U. (1979). The control of information pickup in selective looking. In A. Pick (ed.), Perception and its Development. Hillsdale, NJ: Erlbaum.
Noë, A. (2001). Experience and the active mind. Synthese, 129/1, 41-60.
- (2002a). Is the visual world a grand illusion? Journal of Consciousness Studies, 9/5-6, 1-12.
- (ed.) (2002b). Is the Visual World a Grand Illusion? Special issue of Journal of Consciousness Studies, 9/5-6. Charlottesville, VA: Imprint Academic.
- (2002c). On what we see. Pacific Philosophical Quarterly, 83/1, 57-80.
Noë, A. and O'Regan, J. K. (2001). What it is like to see: A sensorimotor theory of perceptual experience. Synthese, 129/1, 79-103.
Noë, A., Thompson, E., and Pessoa, L. (2000). Beyond the grand illusion: What change blindness really teaches us about vision. Visual Cognition, 7/1-3, 93-106.
O'Regan, J. K. (1990). Eye movements and reading. In E. Kowler (ed.), Eye Movements and Their Role in Visual and Cognitive Processes. Amsterdam: Elsevier.
- (1992). Solving the "real" mysteries of visual perception: The world as an outside memory. Canadian Journal of Psychology, 46/3, 461-88.
O'Regan, J. K., Rensink, R. A., and Clark, J. J. (1996). "Mud splashes" render picture changes invisible. Investigative Ophthalmology and Visual Science, 37, 213.
-, -, and - (1999). Change-blindness as a result of "mud splashes." Nature, 398, 34.
Pessoa, L., Thompson, E., and Noë, A. (1998). Finding out about filling in: a guide to perceptual completion for visual science and the philosophy of perception. Behavioral and Brain Sciences, 21/6, 723-802.
Rawling, P. (2003). Radical interpretation. In K. Ludwig (ed.), Donald Davidson. New York: Cambridge University Press.
Siewert, C. (2002). Is visual experience rich or poor? Journal of Consciousness Studies, 9/5-6, 131-40.
Simons, D. J. (2000). Current approaches to change blindness. Visual Cognition, 7/1-3, 1-15.
Simons, D. and Chabris, C. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28, 1059-74.
Simons, D. J. and Levin, D. T. (1997). Change blindness. Trends in Cognitive Sciences, 1/7, 261-7.
Simons, D., Chabris, C., Schnur, T., and Levin, D. T. (2002). Evidence for preserved representations in change blindness. Consciousness and Cognition, 11/1, 78-97.
Strawson, G. (1994). Mental Reality. Cambridge, MA: MIT Press.
Stroud, B. (2000). The Quest for Reality: Subjectivism and the Metaphysics of Colour. New York: Oxford University Press.
CHAPTER
SIXTEEN
Is the Aim of Perception to Provide Accurate Representations? A Case for the No Side
Christopher Viger

1 Introduction
Is the aim of perception to provide accurate representations? My view is that the multiple modes of perception and their varied roles in guiding behavior are so diverse that there is no univocal answer to this question. The aim of perception is to help guide action via planning and directly responding to a dynamic and often hostile environment; sometimes that requires accurate representations and, I shall argue, sometimes it does not. So in many circumstances the appropriate answer to our question is "no," and understanding the conditions in which it is not the aim of perception to provide accurate representations is an important development in cognitive science.

First I present cases in which our perceptual systems do not provide accurate representations and argue that their usefulness derives from their inaccuracy. In these cases, we represent the world as being different from how it actually is to make important environmental differences more salient. I then show that there is an ambiguity in the notion of representation and argue that on the more natural reading there are cases in which it is not the aim of perception to provide representations at all. Moreover, the ambiguity does not arise from careless use of language but from the structure of perception itself, as shown by Melvyn Goodale and David Milner's dual route model. This empirically well-supported model reveals another class of cases in which perceptual representations are not accurate, nor is it their aim to be so. In the final section, I discuss how our question is related to a very similar but importantly different question, namely, "Is the visual world a grand illusion?"
are determined by the brain based on the differential responses between the clusters of cone types. The details of cell organization to effect this processing are beyond the scope of this chapter, but the system is organized so as to maximize contrast between green/red and blue/yellow stimuli (Kandel et al., 1995, pp. 453-68). The upshot for our discussion is that phenomenologically we do see the biggest contrasts between red/green and blue/yellow, consistent with, indeed caused by, our physiological organization.1 This is not the actual structure of the electromagnetic radiation we visibly detect, however. The wavelengths of the light we see as red are much closer to those of green than blue, even though red and blue seem more similar to us. In fact, our red and green cones are maximally sensitized to wavelengths less than 30 nm apart (Kandel et al., 1995, p. 456).
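To put rough numbers on the mismatch (an illustrative sketch using approximate textbook wavelength values; these figures are my gloss, not taken from Kandel et al.): light we call red, green, and blue peaks at roughly

\[
\lambda_{\text{red}} \approx 650\ \text{nm}, \qquad \lambda_{\text{green}} \approx 530\ \text{nm}, \qquad \lambda_{\text{blue}} \approx 470\ \text{nm},
\]

so that

\[
|\lambda_{\text{red}} - \lambda_{\text{green}}| \approx 120\ \text{nm} \;<\; |\lambda_{\text{red}} - \lambda_{\text{blue}}| \approx 180\ \text{nm},
\]

even though red looks more like blue than like green. Likewise the "red" (L) and "green" (M) cone peak sensitivities, at roughly 560 nm and 530 nm, sit only about 30 nm apart.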
Not only does our color perception fail to provide accurate representations, it may be adaptive that it does not. Austen Clark (2000, p. 245) and Gonzalo Munévar (forthcoming, ch. 9) both make the claim that color perception is like "false color" used in satellite telemetry. Arbitrary physical features that have nothing more in common than that they affect some of the satellite's detectors in the same way are represented as having the same color. From the vantage point of the satellite, we would not perceive the scene as it is represented, but the representation makes real physical differences more salient.

I think that all colours are, in this sense, false colours. Colour similarities may not represent any resemblance of physical properties other than the propensity of disparate classes of objects to affect our receptors in the same way. Nonetheless, colour differences - borders - can be informative about real physical differences in the scene. (Clark, 2000, p. 245)
Motion is a typical way of bringing about change in our environment and, not
surprisingly, it is something to which we are sensitively attuned. However, we are
so sensitive to motion that in some cases we perceive motion even when there is
none. The advantage here is simple; it is far better to make a false positive identification than a false negative one, since failing to notice motion can be fatal. A simple experiment demonstrates how false motion is perceived when similar items are seen in quick succession. When a red spot is flashed briefly on a blank computer
The actual difference in this case is the length of the vowel, but the difference is exaggerated in auditory perception by our hearing distinct consonants - to the extent that our orthography reflects the illusion. A curious feature of such phonological illusions is that they are language specific. This suggests that in lexical acquisition the auditory system is trained to alter what we hear, thereby making differences more salient, and interpretation easier. Again, accuracy is not the aim, discrimination is.3
The final case I consider in this section is our perception of temperature. There is some real property of the world, the mean kinetic energy of molecules, which we can determine independently of our perceptual access to temperature through our thermoreceptors by using thermometers. As with color vision and hearing, we know enough about temperature and how our thermoreceptors work to conclude that our thermoreceptors do not provide accurate representations. Thermoreceptors sensitive to colder temperatures, so-called "cold spots," respond with exactly the same firing rate for different skin temperatures. That is, the firing rate of a cold spot as a function of temperature is a many-one function. In providing exactly the same representation for different stimuli, the representations cannot be said to be accurate. Warm spots - thermoreceptors sensitive to warmer temperatures - though not many-one in their response rates, nonetheless do not respond in a linear fashion to linear changes in temperature either. So neither type of thermoreceptor preserves the structure of the external stimuli. Furthermore, how a thermoreceptor responds to a change in temperature depends on the starting temperature. Our thermal requirements are such that a fixed temperature change within a moderate range is less relevant than the same change near the extremes of tolerance. Warm spots give a dynamic response only to increases in temperature, cold spots only to decreases, and in both cases the strength of the response is determined by the starting temperature; changes from moderate starting temperatures yield small responses, whereas increases from hot starting temperatures and decreases from cold starting temperatures yield very strong responses, thereby meeting our thermal needs. We are better served with information that our left hand is getting too hot than we are by the information that the external temperature has increased by ten degrees. Thermoreceptors are not thermometers.4
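These response properties are easy to exhibit in a toy model. The following sketch is purely illustrative - the functional forms and all constants are my assumptions, not empirical values from Akins (1996) - but it displays the two features just described: a many-one static response, and a dynamic response whose strength depends on the starting temperature.

```python
# Toy model of the cold-spot properties described above. All functional
# forms and constants are illustrative assumptions, not empirical values.

def cold_spot_static_rate(skin_temp_c: float) -> float:
    """Static firing rate of a toy "cold spot" (spikes per second).

    The rate peaks at an intermediate temperature, so the same rate is
    produced by two different skin temperatures: a many-one function.
    The rate alone therefore cannot accurately represent skin temperature.
    """
    peak_temp, peak_rate, width = 25.0, 10.0, 8.0  # assumed constants
    return max(0.0, peak_rate * (1.0 - ((skin_temp_c - peak_temp) / width) ** 2))

def cold_spot_dynamic_response(start_temp_c: float, delta_c: float) -> float:
    """Dynamic response of a toy "cold spot" to a temperature change.

    Cold spots respond dynamically only to decreases, and a fixed decrease
    evokes a stronger response from an extreme starting temperature than
    from a moderate one.
    """
    if delta_c >= 0.0:
        return 0.0  # no dynamic response to warming
    neutral_skin_temp = 33.0  # assumed "moderate" skin temperature
    return abs(delta_c) * (1.0 + 0.3 * abs(start_temp_c - neutral_skin_temp))

# Many-one: two different skin temperatures produce the same static rate.
assert cold_spot_static_rate(21.0) == cold_spot_static_rate(29.0)

# Starting-temperature dependence: the same 2-degree drop evokes a much
# stronger response when the skin is already cold than when it is moderate.
assert cold_spot_dynamic_response(15.0, -2.0) > cold_spot_dynamic_response(33.0, -2.0)
```

The point of the sketch is structural: a receptor whose output collapses distinct stimuli (the first assertion) or rescales a fixed change by context (the second) is useful precisely because it does not mirror the external magnitude.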
The cases of perception just considered are not intended to be exhaustive, nor
are they offered as the general case from which our understanding of the nature of
perception should be drawn. They simply serve as examples in which perception does
not provide accurate representations. And careful consideration reveals that this is
not just a failing of the perceptual systems to get things right in these cases. It is
advantageous to creatures to distort what they perceive in these ways. So in such
cases the aim of perception is not to provide accurate representations because they
are less useful than systematically distorted representations. "Evolutionary fitness rests upon cognitive functions that are useful, and usefulness does not necessarily track the truth" (Bermúdez, 1999, p. 80). When usefulness and truth come apart, as they do
in the perceptual cases discussed above, it is not the aim of perception to provide
accurate representations.
The lesson we can take from the examples presented above is that it is not precise
enough to ask whether the aim of perception, tout court, is to provide accurate
representations. Even restricted to a single modality the question is not precise enough.
Sometimes perception is accurate, but when the usefulness of perception relies on a
distortion of the underlying structure being perceived, such as in the case of color
vision or language parsing, it is not. The question, "Is the aim of perception to provide
accurate representations?" must be answered on a case-by-case basis as we learn
enough about our world, using specially developed tools, to understand its propert
ies and how they are related, and enough about our perceptual systems to under
stand how they represent those relations. And in many cases the question will be
answered negatively because in those cases our perceptual systems do not provide
accurate representations.
Before switching gears let's take stock of where we are. So far I have presented cases in which we know enough about our own perceptual systems and independently we know enough about the properties we perceive to conclude that our perceptions are not accurate. In these cases it is not an accident that our perceptions are not accurate; by being inaccurate our perceptual systems make salient important environmental differences, thereby sacrificing accuracy for usefulness. As a result, our question must be considered on a case-by-case basis and since we already know of cases favoring positive and negative responses no univocal answer can be forthcoming; the issues our question raises are too complex to yield a single answer.
As complex as the question is, based on what we have considered so far, answering it in full is even more complex because the question itself is ambiguous. There are two notions of representation and how the question is answered will depend on how we interpret it. In the next section, I present the two notions of representation at play and argue that on the more natural reading there are cases in which it is not the aim of perception to provide representations at all. Moreover, this ambiguity is not a simple matter of careless use of terminology. There is an impressive amount of empirical evidence for a dual model of perception, which predicts that there are distinct kinds of representations produced within the visual system. So we cannot stipulate a use of language to decide our question. And the situation is more complex still because the dual route model of perception reveals yet another class of cases in which perception is not accurate, nor is it the aim of perception to be so. I turn now to discussing these issues.
on the visual index theory there are some cases in which it wouldn't even make sense to ask if the sensorimotor representation were accurate because no property is represented. It is merely a demonstrative representation. And in these very cases, when immediate real-time action is required - in such circumstances as ducking, blocking, catching, etc. - it is not only not the aim of perception to provide accurate representations, their production is detrimental to acting. Processing conceptual representations is too slow in these cases to produce real-time responses. In the context of immediate real-time responses our question only makes sense if we interpret it as asking about conceptual representations, in which case the answer is that in these cases the aim of perception is not to provide a (conceptual) representation at all.
As I mentioned above, the ambiguity in the notion of representation is not a simple consequence of sloppy usage. There is significant and compelling empirical data supporting David Milner and Melvyn Goodale's (1995, 1998) dual route model of perception. This model reveals the source of the ambiguity, showing that our term is locked on to two distinct processes in perception. Let's look at the details of the model.
If some sort of dual route model is correct - and evidence from "neuroimaging experiments, human neuropsychology, monkey neurophysiology, and human psychophysical experiments" (Goodale and Westwood, 2004, abstract, p. 1) in its favor is impressive - then the two streams of visual processing are the source of our two notions of representation. "According to the Goodale and Milner model, the dorsal and ventral streams both process information about the structure of objects and about their spatial locations, but they transform this information into quite different outputs" (Goodale and Westwood, 2004, p. 2). That is to say, the representations produced by each stream are quite different. The dorsal stream produces sensorimotor representations for immediate responses, whereas the ventral stream produces conceptual representations in which objects are recognized as such.
The difference in function between the two streams is extremely important to determining the accuracy of the representations each produces. The dorsal stream must produce representations ready for use by the motor system. Its representations are egocentric in providing relational information about how our effectors are positioned relative to a target object and often5 highly accurate to the point of "seeing through" visual illusions. For example, in the "Titchener circles" illusion we judge a disk surrounded by smaller circles to be larger than the same size disk surrounded by larger circles. However, when we grasp for the disks we open our hands to the same size, showing that the dorsal stream is insensitive to the illusion (Aglioti et al., 1995, reported in Goodale and Westwood, 2004, p. 3). On the other hand, ventral stream processing is the source of the illusion since it "has no such requirement for absolute metrics, or egocentric coding. Indeed, object recognition depends on the ability to see beyond the absolute metrics of a particular visual scene; for example, one must be able to recognise an object independent of its size and its momentary orientation and position" (Goodale and Westwood, 2004, p. 3). Object recognition requires abstracting away from the egocentric particulars of our perceptual circumstances, and so is often inaccurate, as evidenced in various illusions.6 But giving up accuracy has the benefit of making the representations available to more central7 cognitive systems. The payoff for giving up accuracy in the ventral stream is that "we can intentionally act in the very world we experience" (Clark, 2002, p. 201). Yet again, we have cases in which (conceptual) representations are useful in being inaccurate.
Before concluding, it is worth distinguishing our question from a closely related
one: Is the visual world a grand illusion?
notice a man in a gorilla suit that walks into the middle of the scene (Simons and Chabris, 1999).
Ludwig (this volume, chapter 15) characterizes the grand illusion hypothesis as having two interpretations: the "world illusion" and the "perception illusion."8 According to the world illusion interpretation, the world is different in some respect from how we represent it. On this interpretation the visual world being a grand illusion amounts to vision being inaccurate. Since I have discussed this at length in section 2, I won't say more about it here. The perception illusion interpretation is the claim that there is a mismatch between our perceptual experiences and our judgments of those experiences.9 Note that this is not the same as our question because it is possible that we judge our perceptions to be inaccurate even though they are accurate.
The arguments for a perception illusion are based on an assumption, known as the snapshot conception of experience, that we - lay observers - judge our visual experience to be like a photograph in being richly and uniformly detailed, with equally high resolution from center to periphery. The arguments proceed by producing evidence that undermines the snapshot conception of experience. The phenomena of change blindness and inattentional blindness are offered as evidence, being taken to demonstrate that we are profoundly mistaken about the nature of our visual experiences. Experience is gappy; color and resolution diminish outside the fovea; our eye position changes approximately three times a second; and, during each saccade no information is transmitted. The rich, detailed, evenly colored experience we seem to have is an illusion; the snapshot conception of experience is badly in error.
In response to the line of reasoning above, many writers, including Alva Noë and Kirk Ludwig, reject the first premise, namely that ordinary observers hold the snapshot conception of experience.
We peer, squint, lean forward, adjust lighting, put on glasses, and we do so automatically. The fact that we are not surprised by our lack of immediate possession of detailed information about the environment shows that we don't take ourselves to have all that information in consciousness all at once. (Noë, 2002a, p. 7)
Precisely because it is so obvious that the snapshot model does not correspond to the phenomenology of visual perception, however, it seems doubtful that ordinary people have been suffering under an illusion about this . . . If we have an interest in seeing what something is like, we turn our heads or eyes toward it, even if it is already visible, and focus on it, and approach it if necessary to examine it more closely. If we suffered from a grand illusion because we embraced the snapshot view of perception, we would expect it to show up in our behavior, but it does not. (Chapter 15, p. 266; emphasis in original)
them noticing. This surprise betrays their expectations to the contrary, but for Noë and Ludwig, that our attention is more limited than we take it to be hardly merits the title "grand illusion." After all, on a moment's reflection, we would never be aware of what we are not attending to and so in ordinary experience would never notice that we missed something. To have what we missed pointed out to us is therefore surprising, but "the grand illusion is more aptly regarded as a sort of banal surprise" (Cohen, 2002, p. 151).
However banal the surprise of failing to notice salient aspects of our environment, we are surprised. And the source of our surprise is a tension in our phenomenology. We do have some sense of things to which we are not attending. For example, we may habituate to the ticking of a clock and so not seem to hear it, yet we notice if it stops. Similarly, the periphery of a visual scene is not invisible even if much of the detail is lost. As I read the page in front of me I still see the wall behind it. Curiously, we experience the world as richly detailed even though we do not take in the detail all at once.10 "O'Regan and Noë . . . are right that it need not seem to people that they have a detailed picture of the world in their heads. But typically it does" (Dennett, 2002, p. 16). How can it be that we experience the world as richly detailed even as we move to get a better look? Noë refers to this tension as the problem of perceptual presence.
With the dual route model in hand, the resolution to the problem of perceptual presence is clear. Distinct representations with distinct functions are simultaneously produced. As a result of dorsal stream processing, we make adjustments such as moving to get a better look, squinting, etc. to improve our impoverished perceptual input; our sensorimotor knowledge is in the dorsal stream. Ventral stream processing, on the other hand, leads to object recognition, so that when we so much as glimpse a wall, we often recognize it qua wall; i.e., as the sort of thing that is not made of cream cheese, the sort of thing we can't walk through, etc. In particular, in recognizing a wall we recognize it as being a richly detailed object, of which we are familiar with many of the details, and thereby experience it as such, even though we are not taking in all that detail at once, as evidenced by dorsal stream guided movements.
Noë takes a different approach to resolving the problem of perceptual presence, which is important because, besides being an interesting theory of perception, if correct it forces a "no" answer to our question, since perception never involves conceptual representations. Developing the skill theory presented in O'Regan and Noë (2001), Noë (2002a, 2004) offers the enactive model of perception in place of the snapshot model as a solution to the problem of perceptual presence. On this view, "perceiving is a way of acting. Perception is not something that happens to us, or in us. It is something we do. Think of a blind person tap-tapping his or her way around a cluttered space, perceiving that space by touch, not all at once, but through time, by skillful probing and movement. That is, or at least ought to be, our paradigm of what perceiving is" (Noë, 2004, p. 1). According to Noë, perceivers do not experience the rich detail of the world; they have access to that detail because of the sensorimotor knowledge they possess. For example, if we see a cat partly occluded by a fence, it does not seem to us as if we see an entire cat, since some of it is obscured. However, it does not seem to us that we see detached cat parts, either. Rather, it seems as if we have access to an entire cat; i.e., it seems that there is an entire cat that is partly occluded. We know from being embodied observers that if we move slightly, hidden parts will come into view and other parts will cease to be visible (Noë, 2002a, p. 10).
Naturally, Ludwig sees insurmountable difficulties with Noë's view. First he argues that it is a category mistake to suppose that actual activity constitutes perception, since "[a] perceptual experience . . . is a state, not an event" (chapter 15, p. 263). Ludwig concludes that Noë's position requires a dispositional reading. On this interpretation, possessing sensorimotor skills and practical knowledge is constitutive of being a perceiver. But Ludwig offers several reasons why having a perceptual experience cannot be being in a dispositional state, two of which are relevant to our question. First, he argues that Noë's account leaves out the phenomenal character of perceptual experiences, which is occurrent rather than dispositional. Second, he points out that the view cannot accommodate our having hallucinations that we know are hallucinations. The reason is that in such a case we would not have the usual expectations derived from our practical knowledge about how our own movement would alter the sensory input. Knowing something to be a hallucination, we wouldn't expect sensorimotor knowledge to be relevant. But without such expectations we would fail to have an experience.11
What is of interest to us in these objections is that something is missing from a purely dispositional account. Something occurrent must accompany whatever dispositional state we are in to explain why we are having the perceptions we are experiencing when they are experienced. In short, what's missing is a conceptual representation.12 Noë's model leaves out ventral stream processing. Just as perception is too varied and complex for a univocal "yes" answer to our question, it is also too complex for a simple "no" as suggested by Noë's model. Nonetheless, in developing our understanding of sensorimotor representation, Noë's view may well yield new cases in which it is not the aim of perception to provide (conceptual) representations at all, accurate or otherwise.
6 Conclusion
I have argued that there are circumscribed cases in which our sensory systems do not provide accurate representations, and that those representations are useful because they are not accurate, highlighting and making salient subtle but important differences in our environment. I have also pointed out an ambiguity in the notion of representation and shown that on the more natural reading of representation as conceptual representation - more natural since those are the representations of which we are aware - there are cases in which it is not the aim of perception to provide representations at all. Finally, there are at least some cases in which abstracting away from the particular perceptual circumstances to produce conceptual representations that can be used by more central cognitive processes results in inaccurate representations. So there are many circumstances in which it is not the aim of perception to provide accurate representations. In sum, perception is too complex and varied to yield a univocal answer to our question.
Acknowledgments
Thanks to Dan Blair, Kirk Ludwig, Gonzalo Munévar, John Nicholas, Robert Stainton, and Catherine Wearing for helpful feedback on these issues.
Notes

1 Our phenomenology regarding color similarity is represented in the color solid. A clear description of the color solid is given by Palmer (1999).
2 Durgin (2002) gives another example of perceived visual motion where there is none.
3 My thanks to Dan Blair for bringing these phonological examples to my attention.
4 The content of this discussion of thermoreceptors is based on Akins, 1996.
5 Cases in which sensorimotor representations are not accurate, such as those produced by the thermoreceptors, were presented in section 2.
6 It need not be the case that abstraction leads to inaccuracy since the process may produce accurate but less detailed representations. However, empirical results in the form of illusions such as the Titchener circles or the Müller-Lyer lines show that in many cases the representations are, in fact, inaccurate.
7 Central in Fodor's sense of being non-modular as opposed to the homuncular sense of central that Dennett (1991) rejects.
8 Schwitzgebel (2002, p. 35) makes the same distinction.
9 Many of the papers in Noë, 2002b address this issue.
10 Dennett (1991, p. 354) invites us to imagine entering a room covered with pictures of Marilyn Monroe. We sense being in a room full of Marilyns despite only attending to a very few. Yet, having the impression, we might continue to scan the room.
11 My view is that Noë can respond to Ludwig's objections by augmenting his view along the lines suggested in the text or those in Andy Clark, 2002. Ludwig also objects that the relevant dispositions cannot be conceptually necessary for perceptual experiences, but surely that was never Noë's claim. He is making an empirical claim not a conceptual one. This makes the relevant notion of possibility physical and not conceptual and takes much of the sting out of Ludwig's "fatal blow" to the sensorimotor model of perception. A full discussion of these issues is beyond the scope of this chapter.
12 Note that in characterizing Noë's view as not involving representations, Ludwig (chapter 15) is committed to taking our question to be about conceptual representations.
References
Bermúdez, J. L. (1999). Naturalism and conceptual norms. Philosophical Quarterly, 49/194, 77-85.
Clark, Andy (1997). Being There: Putting Brain, Body, and World Together Again. Cambridge, MA: Bradford/MIT Press.
- (2002). Is seeing all it seems? Action, reason and the grand illusion. Journal of Consciousness Studies, 9/5-6, 181-202.
Clark, Austen (2000). A Theory of Sentience. New York: Oxford University Press.
Cohen, J. (2002). The grand grand illusion illusion. Journal of Consciousness Studies, 9/5-6, 141-57.
Dennett, D. (1991). Consciousness Explained. Boston: Little, Brown.
- (2002). How could I be wrong? How wrong could I be? Journal of Consciousness Studies, 9/5-6, 13-16.
Durgin, F. H. (2002). The Tinkerbell effect: Motion perception and illusion. Journal of Consciousness Studies, 9/5-6, 88-101.
Geldard, F. A. and Sherrick, C. E. (1972). The cutaneous "rabbit": A perceptual illusion. Science, 178, 178-9.
Goodale, M. and Westwood, D. (2004). An evolving view of duplex vision: Separate but interacting cortical pathways for perception and action. Current Opinion in Neurobiology, 14, 1-9.
Kandel, E., Schwartz, J., and Jessell, T. (1995). Essentials of Neural Science and Behavior. Stamford, CT: Appleton & Lange.
Kenstowicz, M. (1994). Phonology in Generative Grammar. Malden, MA: Blackwell.
Kolers, P. A. and von Grünau, M. (1976). Shape and color in apparent motion. Vision Research, 16, 329-35.
Levin, D. T. (2002). Change blindness as visual metacognition. Journal of Consciousness Studies, 9/5-6, 111-30.
Mack, A. (2002). Is the visual world a grand illusion? A response. Journal of Consciousness Studies, 9/5-6, 102-10.
Mack, A. and Rock, I. (1998). Inattentional Blindness. Cambridge, MA: MIT Press.
Milner, A. D. and Goodale, M. (1995). The Visual Brain in Action. Oxford: Oxford University Press.
- and - (1998). The visual brain in action (précis). Psyche, 4, 1-12.
Munévar, G. (forthcoming). A Theory of Wonder.
Noë, A. (2001). Experience and the active mind. Synthese, 129/1, 41-60.
- (2002a). Is the visual world a grand illusion? Journal of Consciousness Studies, 9/5-6, 1-12.
- (ed.) (2002b). Is the Visual World a Grand Illusion? Special issue of Journal of Consciousness Studies, 9/5-6. Charlottesville, VA: Imprint Academic.
- (2002c). On what we see. Pacific Philosophical Quarterly, 83/1, 57-80.
- (2004). Action in Perception. Cambridge, MA: MIT Press.
Noë, A., Thompson, E., and Pessoa, L. (2000). Beyond the grand illusion: What change blindness really teaches us about vision. Visual Cognition, 7/1-3, 93-106.
O'Regan, J. K. and Noë, A. (2001). What it is like to see: A sensorimotor theory of perceptual experience. Synthese, 129/1, 79-103.
Palmer, S. (1999). Color, consciousness, and the isomorphism constraint. Behavioral and Brain Sciences, 22/6, 923-43.
Pylyshyn, Z. (2001). Visual indexes, preconceptual objects, and situated vision. Cognition, 80/1-2, 127-58.
- (2003). Seeing and Visualizing: It's Not What You Think. Cambridge, MA: MIT Press.
Schwitzgebel, E. (2002). How well do we know our own conscious experience? The case of visual imagery. Journal of Consciousness Studies, 9/5-6, 35-53.
Siewert, C. (2002). Is visual experience rich or poor? Journal of Consciousness Studies, 9/5-6, 131-40.
Simons, D. and Chabris, C. (1999). Gorillas in our midst: Sustained inattentional blindness for dynamic events. Perception, 28, 1059-74.
CHAPTER
SEVENTEEN
Can Mental States, Knowledge in Particular, Be Divided into a Narrow Component and a Broad Component?
Timothy Williamson
broad cognitive states as themselves strictly mental for the purposes of cognitive
science. Internalism is here the view that (for the purposes of cognitive science) all
mental states are narrow. Externalism is the negation of internalism. Of course, what
matters is not so much how we use the word "mental" as whether the factorizing
strategy succeeds at all.
For present purposes let us not worry about exactly what counts as internal (internal to the brain or merely to the body, and "internal" in what sense?). A radical view is that there is no natural boundary between the internal and the external: they somehow flow into each other. That would be bad news for internalism rather than externalism, since only the former attributes a central significance to the internal-external boundary with respect to the mental. Thus it is legitimate to assume a natural internal-external boundary for the sake of an argument against internalism.2 But given that plausible assumption, internalism may seem an obviously attractive or even compelling view. Isn't the aim of cognitive science precisely to identify the internal mechanisms that are the contribution of mind, and to understand how their interaction with the external environment produces knowledge?
The aim of this chapter is to sketch an argument that the internalist program of factorization rests on a myopic view of what it would take to understand the role of cognition in guiding action. The cognitive states that explain action at an appropriate level of generality are not merely broad; they are broad in a way that makes their factorization into narrow and environmental constituents impossible in principle.

The internalism-externalism debate has ramified in many directions over recent decades. It would be hopeless to attempt here to survey all the arguments that have been advanced in favour of internalism. Instead, one central line of externalist thought will be developed. That should give the reader a sufficient clue as to where the internalist arguments go wrong.3
Section 1 considers two main ways in which the intentional states of folk psychology as delineated by natural language predicates are broad. Internalists may respond by denying that those folk psychological states are strictly mental in the sense relevant to cognitive science and attempting to factorize them into narrow and environmental components. Section 2 sketches some abstract structural considerations, to explain why the factorizing strategy is problematic in principle. Section 3 develops that argument in more detail by reference to some examples. Section 4 suggests a more general moral about cognition.
is holding a glass of water (actually it is gin). One cannot know something false,
although one can believe falsely that one knows it. Thus the state of knowing that
one is holding a glass of water is broad.
Knowing is a factive attitude, in the sense that, necessarily, one has it only to
truths. Other factive attitudes include seeing and remembering (in the usual senses
of those terms). Unless the glass contains water, you cannot see that it contains water;
if you are misperceiving, you only think that you can see that it contains water.
Unless you drank water yesterday, you cannot remember that you drank water
yesterday; if you are misremembering, you only think that you remember that you
drank water yesterday. By contrast, the attitude of believing is non-factive; false
belief is possible, indeed actual and widespread. Arguably, knowing is the most general factive attitude, the one that subsumes all others: any factive attitude is a form of knowing.
In unfavorable circumstances, for instance when they are being hoaxed, agents falsely believe that P, and believe that they know that P, without being in a position to know that they do not know that P. Thus agents are not always in a position to know which knowledge states they are in; such states lack first-person transparency. If first-person transparency were a necessary condition for being a mental state, knowledge states would not be mental. But why impose any such condition on mentality? There is no obvious inconsistency in the hypothesis that someone has dark desires without being in a position to know that he has them: for example, the opportunity to put them into effect may not arise. It is a commonplace of cognitive science that agents' beliefs as to their mental states are sometimes badly mistaken. That is not to deny that we often know without observation whether we are in a given mental state: but then we often know without observation whether we are in a given knowledge state. The partial transparency of knowing is consistent with its being a mental state.
Do non-factive attitudes such as believing yield narrow mental states? Whatever the glass contains, one can believe that it contains water. Nevertheless, famous externalist arguments show that the content of intentional states can yield a broad mental state even when the attitude to that content is non-factive. Indeed, the current debate between internalism and externalism in the philosophy of mind originally arose with respect to the contents of the attitudes rather than the attitudes themselves.
We can adapt Hilary Putnam's classic example to argue that even the state of believing that one is holding a glass of water is broad.4 Imagine a planet Twin-Earth exactly like Earth except that the liquid observed in rivers, seas, and so on is not H2O but XYZ, which has the same easily observable properties as H2O but an utterly different chemical composition. Thus XYZ is not water. However, the thought experiment is set in 1750, before the chemical composition of water was known. Oscar on Earth and Twin-Oscar on Twin-Earth are internal duplicates of each other; they are in exactly the same narrow states. Oscar is holding a glass of H2O; Twin-Oscar is holding a glass of XYZ. Since H2O is water and XYZ is not:

1 Oscar is holding a glass of water.
2 Twin-Oscar is not holding a glass of water.

They both express a belief by saying "I am holding a glass of water." Since Oscar uses the word "water" to refer to water, we can report his belief thus:

3 Oscar believes that he is holding a glass of water.

Suppose (for a reductio ad absurdum) that believing that one is holding a glass of water is a narrow state. By (3), Oscar is in that state. By hypothesis, Oscar and Twin-Oscar are in exactly the same narrow states. Therefore Twin-Oscar is also in that state:

4 Twin-Oscar believes that he is holding a glass of water.
One believes truly only what is the case, and falsely what is not:

5 If (1) and (3), then Oscar truly believes that he is holding a glass of water.
6 If (2) and (4), then Twin-Oscar falsely believes that he is holding a glass of water.

From (1), (3), and (5), and from (2), (4), and (6) respectively, it follows that:

7 Oscar truly believes that he is holding a glass of water.
8 Twin-Oscar falsely believes that he is holding a glass of water.

But (7) and (8) imply an implausible asymmetry between Oscar and Twin-Oscar. There is no more reason to impute error to Twin-Oscar than to Oscar. They manage equally well in their home environments. Since (1), (2), (3), (5), and (6) are clearly true, the culprit must be (4). Since (4) follows from (3) and the assumption that believing that one is holding a glass of water is a narrow state, we conclude that believing that one is holding a glass of water is a broad state, not a narrow one.
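Schematically (my compression of the argument, not Williamson's own notation; B abbreviates the state of believing that one is holding a glass of water, and Narrow(B) the supposition that B is narrow):

\[
\text{Narrow}(B) \wedge (3) \Rightarrow (4), \qquad (1) \wedge (3) \wedge (5) \Rightarrow (7), \qquad (2) \wedge (4) \wedge (6) \Rightarrow (8),
\]

\[
(7) \wedge (8) \Rightarrow \text{an implausible asymmetry}, \quad \text{so} \quad \neg\text{Narrow}(B).
\]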
The Putnam-inspired argument generalizes from believing to all sorts of other propositional attitudes, such as wondering, desiring, and intending. It also generalizes beyond natural kind concepts like water to a huge array of contents that turn out to depend on the agent's natural or social environment, or simply on the identity of the agent.
For example, the qualitatively identical twins Tweedledum and Tweedledee both speak truly when they simultaneously utter the words "Only I believe that I won"; each correctly self-attributes a belief the other lacks. Similarly with perceptual demonstratives: both twins speak truly when they point at each other and simultaneously utter the words "Only he believes that he won." Intentional content as attributed in natural languages depends on reference, which in turn depends on causal and other relations of the agent that are left undetermined by all internal qualities. The broadness of reference extends further to folk psychological relational states such as seeing Vienna, thinking about Vienna, and loving or hating Vienna. If intentionality (aboutness) is the mark of the mental, and our thoughts are typically about the external environment, then our mental states are typically broad.
Thus both the contents of folk psychological intentional states and the attitudes to those contents make the states broad. Of course, that does not prevent internalists from postulating a core of narrow mental states for the purposes of cognitive science and trying to identify them by more theoretical means. But it is not obvious that such a core exists. Granted, when human agents are in an intentional state they are also in some underlying narrow physical state or other, but it does not follow that the latter is intentional in any useful sense. We can clarify the issue by considering the role of cognitive states in the causal explanation of action.
as specified in the intention? The internalist may reply that the factorizing strategy must be applied to the agent's intention too. Alternatively, the internalist explanation may be restricted to a class of basic actions more primitive than either drinking from this stream here or making as if to drink from a stream of this qualitative appearance.

Even if actions are narrowly individuated, other problems arise for the attempt to explain them on the basis of the agent's prior internal state. Action is typically not instantaneous, and not merely because there is no such moment as the one immediately preceding the moment of action if time is dense. It takes time even to bend one's head until one's lips touch the water and more time to drink. That is enough time to spot piranha fish in the water and withdraw before completing the action. Thus the internal state of the agent at t does not even determine whether she will go through the sequence of internal states corresponding to drinking in some brief interval shortly after t. Again, the internalist may respond by attempting to analyze extended actions such as bending and drinking into sequences of more basic actions, each to be explained in terms of an "immediately" preceding internal state of the agent.
To restrict ourselves to concatenating explanations of basic actions would drastically curtail our explanatory ambitions. Consider a tourist in a strange town whose camera and passport have just been stolen. We may be able to explain on general grounds why he will sooner or later get to the local police station, whose location he does not yet know, without assuming anything specific about the layout of the town, his current position in it, or his beliefs about those things. By contrast, if we concatenate explanations of the basic actions by which, step by step, he actually gets to the local police station in terms of "immediately" preceding internal states, we shall need to invoke a mass of detailed extra assumptions about the sequence of his mental states (for example, his perceptions of the streets around him and of the responses to his questions about the way to the nearest police station) far beyond anything that the more general explanation need assume. Thus by moving down to the level of basic actions we lose significant generalizations that are available at the higher level. If the layout of the town, his position in it, or the responses to his questions had been different, he would have gone through a different sequence of basic actions, but he would still have got to the police station in the end. To understand cognition at an appropriate level of generality, we often need to understand non-basic actions themselves, not merely the sequences of basic actions that realize them on particular occasions. Since the performance of non-basic actions depends on the environment as well as the agent, they are not to be explained solely in terms of the agent's internal states.
Appeals to the locality of causation do not motivate the idea that only narrow
states of the agent at t are causally relevant to effects that are not complete until
some time after t, for environmental states at t are also causally relevant to those
effects. Rather, the issue is whether the causally relevant states at t can be factorized
into narrow states and environmental states.
We should therefore assume that when we are trying to explain the agent's action in terms of the state of the world at t, both the state of the agent at t and the state of the external environment at t are potentially relevant. Nevertheless, it might be argued, the factorization strategy is bound to succeed, for the total (maximally specific) narrow state of the agent at t and the total environmental state (state of the external environment) at t together determine the total state of the whole world at t (no difference in the latter without a difference in one of the former), which in turn determines the action (insofar as it is determined at t at all). Let us grant the determination claim, although it is not completely uncontentious - in principle there might be global states of the whole world not determined by the combination of states local to the agent and states local to the external environment. Even so, when we explain an action we do not assume a particular maximally specific state of the world at t. It is not just that we cannot know exactly what the state of the world was at t. Even if we did know, it would be methodologically undesirable to build all that knowledge into our explanation of the action, because the maximal specificity of the assumption implies the minimal generality of the explanation. It would not apply to any of those alternative states of the world that would have led to the same action. A significant generalization would again have been missed. Compare Putnam's example:

A peg (1 inch square) goes through a 1 inch square hole and not through a 1 inch round hole. Explanation: (7) The peg consists of such-and-such elementary particles in such-and-such a lattice arrangement. By computing all the trajectories we can get by applying forces to the peg (subject to the constraint that the forces must not be so great as to distort the peg or the holes) in the fashion of the famous Laplacian super-mind, we determine that some trajectory takes the peg through the square hole, and no trajectories take it through the round hole. (Covering laws: the laws of physics.) (Putnam, 1978, p. 42)
PRIME A state S is prime if and only if some narrow state I and some
environmental state E are separately compatible with S but I&E is incompatible
with S.7
Suppose that we can find possible combinations of narrow states I1 and I2 and environmental states E1 and E2 such that I1&E1 and I2&E2 imply the cognitive state C while I1&E2 excludes C. Then I1 and E2 are compatible with C while I1&E2 is incompatible with C. Consequently, by PRIME, C is prime. Furthermore, PRIME implies that for each prime state there is a pair like I1 and E2.
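Reading states as propositions, with compatibility as joint possibility, the definition and the two-scenario test can be put formally (the modal notation is my gloss, not Williamson's):

\[
\text{PRIME}(S) \iff \exists I\,\exists E\,\big[\Diamond(I \wedge S) \;\wedge\; \Diamond(E \wedge S) \;\wedge\; \neg\Diamond(I \wedge E \wedge S)\big]
\]

\[
\text{If } (I_1 \wedge E_1) \Rightarrow C,\quad (I_2 \wedge E_2) \Rightarrow C,\quad \text{and } \neg\Diamond(I_1 \wedge E_2 \wedge C),\ \text{then } \text{PRIME}(C).
\]

The schema makes vivid why primeness blocks factorization: no conjunction I&E of a narrow state and an environmental state can be equivalent to a prime state, since each conjunct is separately compatible with it while the conjunction is not.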
For a simple example, let C be the unspecific state of knowing which direction home is in (in egocentric space). Consider two possible scenarios. In scenario 1, you know which direction home is in by knowing that home is straight in front of you. Thus home is straight in front of you, and you believe that it is. Call your narrow and environmental states in scenario 1, I1 and E1 respectively. In scenario 2, you know which direction home is in by knowing that home is straight behind you. Thus home is straight behind you, and you believe that it is. Call your narrow and environmental states in scenario 2, I2 and E2 respectively. Narrow state I1 is compatible with C because you are simultaneously in I1 and C in scenario 1. Environmental state E2 is compatible with C because you are simultaneously in E2 and C in scenario 2. But the conjunctive state I1&E2 is incompatible with C, because if I1&E2 obtains you do not know which direction home is in; rather, you believe that home is straight in front of you (by I1) while in fact it is straight behind you (by E2); you have a mere false belief as to which direction home is in. Consequently, by PRIME, C is prime. Knowing which direction home is in is not equivalent to the conjunction of a narrow state and an environmental state.
For another example, let C* be the state of consciously thinking about Rover (not necessarily under the verbal mode of presentation "Rover"). Again, consider two possible scenarios. In scenario 1*, you can see two very similar dogs, Rover on your right and Mover on your left. You are consciously thinking about Rover, visually presented as "that dog on my right" and in no other way; you are not consciously thinking about Mover at all. Call your narrow and environmental states in scenario 1*, I1* and E1* respectively. In scenario 2*, you can see Rover on your left and Mover on your right. You are consciously thinking about Rover, visually presented as "that dog on my left" and in no other way; you are not consciously thinking about Mover at all. Call your narrow and environmental states in scenario 2*, I2* and E2* respectively. Narrow state I1* is compatible with C* because you are simultaneously in I1* and C* in scenario 1*. Environmental state E2* is compatible with C* because you are simultaneously in E2* and C* in scenario 2*. But the conjunctive state I1*&E2* is incompatible with C*, because if I1*&E2* obtains you are not consciously thinking about Rover; rather, you are consciously thinking about Mover, visually presented as "that dog on my right." Consequently, by PRIME, C* is prime. Consciously thinking about Rover is not equivalent to the conjunction of a narrow state and an environmental state.

It is not difficult to multiply such examples. In general, the folk psychological intentional states individuated by natural language predicates tend to be prime. But do those states matter for purposes of cognitive science?
nothing at all (in case of hallucination). For example, you may move to the right.
That may differ from what you do if you realize the state of consciously thinking
about Rover (in scenario 2*) by thinking "that dog on my left." Thus what intern
alists conceive as the narrow state of consciously thinking with that visual mode
of presentation may be more relevant to short-term effects than is the broad state of
consciously thinking about Rover. But the long-term effects of consciously thinking
with that visual mode of presentation may depend critically on whether doing so
constitutes thinking about Rover. In one case, the effect may be that you buy Rover;
in the other, that you buy Mover, a dog of similar visual appearance but totally
different personality, or discover that you were hallucinating. From then onwards,
further developments are liable to diverge increasingly, even with respect to your
narrow states. Similarly, the long-term effects of consciously thinking about Rover
may be the same whether you think of him as "that dog on my right" or "that dog
on my left." You may be more likely to buy him either way.
Obviously, the long-term effects under discussion are highly sensitive to the
agent's other mental states, such as desires. Nevertheless, a pattern emerges. The more
short term the effects we consider, the greater the explanatory relevance of narrow
states. As we consider longer-term effects, narrow states tend to lose that explan
atory advantage and the intentional states of folk psychology come into their own.
Those states are typically not just broad but prime. We have already seen that if we
ignore long-term effects we fail to capture significant general patterns in cognition.
Thus prime cognitive states are no mere curiosity of folk psychology: they are
central to the understanding of long-term cognitive effects.
When we investigate cognitive effects that depend on continuous feedback from
a complex environment, we cannot expect to lay down strict laws. We can hope to
identify probabilistic tendencies, perhaps comparable to those of evolutionary biology
at species level. For example, we might want to investigate the general cognitive
effects of literacy (as a means of public communication and not just of private
note-taking). In doing so, we seek generalizations that hold across different languages and
scripts. But literacy is itself a prime cognitive state, for it involves a kind of match
between the individual's dispositions to produce and respond to written marks and
those prevalent in the appropriate social environment. The narrow states that go with
mastery of written communication in English-speaking countries today did not go
with mastery of written communication in Babylon 3000 years ago, and vice versa.
Knowing how to read is not a narrow or even composite state; it is prime and broad.9
Perhaps recent developments in cognitive science in the study of embodied, situated,
and distributed cognition, particularly cognition that relies on continuous feedback
loops into the external environment, can be interpreted as investigations of prime
cognitive states.10
Strategies
We have seen that the two main sources of broadness in the intentional states of
folk psychology - factive attitudes and environmentally determined contents - are
also sources of primeness, in ways that make those states especially fit to explain
long-term effects at an appropriate level of generality.11 In particular, such states as
knowing, seeing, remembering, and referring play key roles in those explanations.
The appeal to such states in understanding cognition raises a general question about
the nature of the theoretical enterprise. For words like "know," "see," "remember,"
and "refer" are success terms. They describe what happens when cognition goes well.
Even if we replace the ordinary language terms by more theoretical ones for the pur
poses of cognitive science, the argument of previous sections suggests that some of
those more theoretical terms will need to have a relevantly similar character. To give
success terms a central role in our theorizing about cognition is to understand it in
relation to its successes. That is not to ignore the failures; rather, we understand them
as failures, deviations from success. For example, to a first approximation, we can
treat merely believing that P as merely being in a state with the content that P that
the agent cannot distinguish from knowing that P; having it merely visually appear
to one that P as merely being in a state with the content that P that one cannot
distinguish from seeing that P; and misremembering that P as merely being in a state
with the content that P that one cannot distinguish from remembering that P.12 Again,
we might understand cases of reference failure as cases merely indistinguishable by
the agent from cases of reference. That is to employ a sort of teleological strategy
for understanding cognition. The internalist follows the opposite strategy, starting
from states that are neutral between success and failure, and then trying to dis
tinguish the two classes by adding environmental conditions.
The externalist's point is emphatically not to deny that the successes and the fail
ures have something in common. Indeed, the failures were all described as merely
indistinguishable by the agent from successes; since everything is indistinguishable
from itself, it follows that both successes and failures are indistinguishable by the
agent from successes. The point is rather that the failures differ internally among
themselves, and that what unifies them into a theoretically useful category with the
successes is only their relation to those successes. For example, many different total
internal states are compatible with its perceptually appearing to an agent that there
is food ahead; what they have in common, on this view, is their relation to cases in
which the agent perceives that there is food ahead ("perceive" is factive).
To use a traditional analogy, consider the relation between real and counterfeit
money. Uncontroversially, a counterfeit banknote can in principle be an exact
internal duplicate of a real banknote. The internalist strategy corresponds to taking
as theoretically fundamental not the category of (real) money but the category that
contains both real money and all internal duplicates of it.13 One would then have to
circumscribe the real money by further constraints (presumably concerning origin
and economic role). But that strategy seems quite perverse, for being real money
cannot usefully be analyzed as having a certain intrinsic property and in addition
satisfying some further constraints : those further constraints do all the theoretical
work. Indeed, the property of being money is prime, in the sense that by a criterion
like PRIME it is not the conjunction of a purely intrinsic property and a purely extrinsic
one, for being money is compatible with being gold, and it is compatible with
being in a social environment in which only silver counts as money, but it is incom
patible with the conjunction of those two properties. It is no use complaining that
real money and its internal duplicates have the same causal powers. For purposes of
economic theory, the category of real money is primary, the category of counterfeit
money must be understood as parasitic on it, and the category of all internal duplic
ates of real money is of no interest whatsoever.14
Of course, the analogy is not decisive. But it does show that the factorizing strat
egy is not always compelling or even attractive. Each application of it must be argued
on its individual merits. The argument of this chapter has been that, for the study
of cognition, the factorizing strategy is often inappropriate. Sometimes, we need to
use concepts like knowledge and reference (or money) in our explanations, and we
cannot replace them without loss of valuable generality by conjunctions of purely
internal and purely external constituents.
Acknowledgments
Thanks to Rob Stainton for helpful comments on an earlier draft.
Notes
1 On some alternative definitions, a state S is "narrow" if and only if S has no existential
implications outside the subject of S (Putnam influentially defined "methodological solipsism"
as "the assumption that no psychological state presupposes the existence of any
individual other than the subject to whom that state is ascribed," 1975, p. 220). Even
when clarified, such definitions awkwardly deprive the class of "narrow" states of closure
properties required by the conception of such states as forming a self-enclosed domain.
In particular, they permit the conjunction of two "narrow" states not to be "narrow": if
N1 and N2 are two incompatible "narrow" states, and B a non-"narrow" state, then the
disjunctive states N1 ∨ B and N2 ∨ B are also "narrow" by such definitions, since
N1 and N2 imply any existential implication of N1 ∨ B and N2 ∨ B respectively; but
(N1 ∨ B) & (N2 ∨ B) is not "narrow," for it is equivalent to B. By contrast, definitions
like that in the text, on which the narrow is whatever supervenes on the internal,
automatically make conjunctions of narrow states narrow.
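(The propositional equivalence claimed here can be checked by a brute-force truth table; the following snippet is offered only as a quick illustration, skipping valuations on which the incompatible states N1 and N2 both obtain.)

```python
from itertools import product

# With N1 and N2 incompatible, (N1 or B) and (N2 or B) is equivalent to B.
for n1, n2, b in product([False, True], repeat=3):
    if n1 and n2:
        continue  # N1 and N2 are incompatible, so never both true
    assert ((n1 or b) and (n2 or b)) == b
```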
2 In the same spirit, the text treats both narrow states and environmental states as
synchronic. That is a significant over-simplification. Whether one is in a given folk
psychological intentional state is typically sensitive to causal origins. Although reference
cannot be defined in causal terms, reference to particulars and kinds in the environment is
still normally carried by causal connections to them through memory and perception.
Similarly, although knowledge cannot be defined in causal terms, the difference between
knowing and merely believing is often partly constituted by the presence or absence of
an appropriate causal connection. In applying the internal-external distinction, we must
therefore decide how to classify the agent's past internal history. Since the rationale
for drawing the distinction depends on the assumed locality of causation, which is both
spatial and temporal, the natural ruling is that the past history of both agent and
environment counts as external for purposes of distinguishing broad from narrow. This
makes the conception of the internal and the external as independent dimensions harder
to maintain, since the external includes all the causal antecedents of the internal. That
complication is consistent with the conclusions in the text.
3 Williamson, 2000 develops the argument of this chapter in greater detail with reference
to epistemology, and responds to some internalist challenges.
4 Compare Putnam, 1975, where the argument (formulated rather differently) is directed
only against internalism about linguistic meaning, and takes internalism about psychological
states for granted. Burge, 1979 made the natural generalization to psychological
states, which Putnam later accepted. See Pessin et al., 1996 for more on the debate.
5 The gloss "necessary and sufficient for the action to be taken" is indeed very rough,
for we do not expect an explanation of an outcome to generalize to cases in which the
same outcome occurred for completely different reasons. Satisfying explanations have a
certain unity and naturalness; they do not rope together essentially disparate cases.
Nevertheless, subject to this vague constraint, the point stands that generality is an explanatory
virtue. For more on the importance of generality in causally relevant properties
and the trade-off between generality and naturalness see Yablo, 1992, 1997, 2005.
6 More generally, if there are m possible maximally specific narrow states and n possible
maximally specific environmental states, and the possible maximally specific states of
the world correspond to all pairs of the former and the latter, then there are 2^m narrow
states, 2^n environmental states, and 2^(mn) states of the world altogether. For simplicity,
the calculation includes both the universal state (which always obtains) and the null state
(which never obtains); they are the only states that are both narrow and environmental.
The null state is composite; every non-null composite state corresponds to a unique combination
of a non-null narrow state and a non-null environmental state, so there are
1 + (2^m − 1)(2^n − 1) composite states altogether. In the toy model in the text, m = n = 3.
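(These counts can be confirmed by brute-force enumeration; the following sketch, offered only as an illustration, builds the toy model with m = n = 3.)

```python
from itertools import combinations, product

m = n = 3
WORLDS = list(product(range(m), range(n)))  # maximally specific world states

def all_subsets(xs):
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

STATES = all_subsets(WORLDS)
NARROW = {frozenset(w for w in WORLDS if w[0] in rows)
          for rows in all_subsets(range(m))}
ENV = {frozenset(w for w in WORLDS if w[1] in cols)
       for cols in all_subsets(range(n))}
COMPOSITE = {i & e for i in NARROW for e in ENV}

assert len(STATES) == 2 ** (m * n)                    # 512 states of the world
assert len(NARROW) == 2 ** m and len(ENV) == 2 ** n   # 8 narrow, 8 environmental
assert len(COMPOSITE) == 1 + (2**m - 1) * (2**n - 1)  # 50 composite states
```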
7 To establish PRIME, we assume Free Recombination, the principle that any possible narrow
state is compatible with any possible environmental state. Although this principle is
contentious, it is natural for the internalist to assume it, since it reflects the internalist
analysis of the state of the world into two logically independent dimensions, the internal
state of the agent and the external state of the environment. Thus it is fair to the
internalist to assume Free Recombination. Moreover, it is independently plausible that
Free Recombination holds at least to a first approximation. We can now argue for PRIME.
(⇐) Suppose that some narrow state I and environmental state E are separately compatible
with S but I&E is not, yet S is not prime. Thus for some narrow state I* and environmental
state E*, S is equivalent to I*&E*. Hence I is compatible with I*&E*, so I&I* is
a possible narrow state. Similarly, since E is compatible with I*&E*, E&E* is a possible
environmental state. Therefore, by Free Recombination, I&I* is compatible with E&E*, so
I&E is compatible with I*&E*, and so with S, contrary to hypothesis. (⇒) Suppose that
for every narrow state I and environmental state E, if I and E are separately compatible
with S, so is I&E. Consider all conjunctions of the form I&E that imply S, where I and
E are possible maximally specific narrow and environmental states respectively. Let I* be
the (infinite) disjunction of all first conjuncts of such conjunctions and E* the (infinite)
disjunction of all second conjuncts. Thus I* is narrow and E* environmental. S implies
I*&E*, for if S obtains so does some conjunction I&E that implies S, where I and E are
possible maximally specific narrow and environmental states respectively; hence I implies
I* and E implies E*, so I&E implies I*&E*, so I*&E* obtains. Conversely, I*&E* implies S,
for if I*&E* obtains then for some possible maximally specific narrow states I and I** and
environmental states E and E**, I&E also obtains and both I&E** and I**&E entail S; both
I&E** and I**&E are possible by Free Recombination, so I and E are both compatible with
S; therefore, by hypothesis, I&E is compatible with S; since I and E are maximally specific,
I&E entails S, so S obtains. Thus S is equivalent to I*&E* and so is not prime.
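(The biconditional can also be verified exhaustively in a small model; the following brute-force check, offered only as an illustration, confirms it for the sixteen states generated by two maximally specific narrow states and two maximally specific environmental states.)

```python
from itertools import combinations, product

WORLDS = list(product(['i1', 'i2'], ['e1', 'e2']))  # Free Recombination

def all_subsets(xs):
    return [frozenset(c) for r in range(len(xs) + 1)
            for c in combinations(xs, r)]

STATES = all_subsets(WORLDS)
# narrow states are closed under the environmental coordinate,
# environmental states under the narrow coordinate
NARROW = [s for s in STATES
          if all(w in s for (i, _) in s for w in WORLDS if w[0] == i)]
ENV = [s for s in STATES
       if all(w in s for (_, e) in s for w in WORLDS if w[1] == e)]

def composite(s):
    # S is composite iff it equals some conjunction of a narrow state
    # and an environmental state
    return any(s == i & e for i in NARROW for e in ENV)

def closed(s):
    # every narrow I and environmental E separately compatible with S
    # have their conjunction I & E compatible with S
    return all(not (i & s) or not (e & s) or bool(i & e & s)
               for i in NARROW for e in ENV)

assert all(composite(s) == closed(s) for s in STATES)  # PRIME, exhaustively
```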
8 Gettier, 1963 has classic examples of true beliefs that fail to constitute knowledge because
they are essentially based on false premises.
9 See Stanley and Williamson, 2001 for a general argument that knowing how is a species
of knowing that. If so, the discussion of propositional knowledge applies to knowledge
how as a topic for cognitive science.
10 See Clark, 1997; Hurley, 1998; and Gigerenzer et al., 1999. For example, "smart" as used in
Gigerenzer's title presumably refers to a prime state, one that depends on the appropriateness
of the agent's simple heuristics to the nature of the environment.
11 Factiveness and reference need not be independent sources of broadness, for
knowledge-based constraints may play a constitutive role in the determination of
reference (Williamson, 2004).
12 This idea is arguably the core of the so-called Disjunctive Theory of Perception. For recent
discussion see Martin, 2004 and other papers in the same volume. See Williamson, 2000,
pp. 45-8 for necessary qualifications.
13 The internalist category contains much beyond real and counterfeit money: for instance,
if some distant society uses internal duplicates of my paperclips as money, my own paperclips
fall into the category, without being either real or counterfeit money.
14 One cannot even assume that counterfeit money that is an exact internal duplicate of
real money is undetectable; it may be detected by extrinsic properties like location.
CHAPTER
EIGHTEEN

Internal and External Components of Cognition

Ralph Wedgwood
In his chapter in this volume, Timothy Williamson presents several arguments that
seek to cast doubt on the idea that cognition can be factorized into internal and
external components. In the first section of my chapter, I shall attempt to evaluate
these arguments. My conclusion will be that they establish several highly
important points, but in the end fail to cast any doubt either on the
idea that cognitive science should be largely concerned with internal mental processes,
or on the idea that cognition can be analyzed in terms of the existence of a
suitable connection between internal and external components.
In the second and third sections of the chapter, I shall present an argument
for the conclusion that cognition involves certain causal processes that are entirely
internal - processes in which certain purely internal states and events cause certain
other purely internal states and events. There is every reason to think that at least
a large part of cognitive science will consist in the study of these purely internal
causal processes.
Williamson argues for two main points. First, broad mental states play an important
explanatory role (especially in explaining the "long-term effects" of our mental states). Second,
many of the mental states that play this important explanatory role are not just broad
but prime: that is, they are not equivalent to any conjunction of a narrow state and
an environmental state.
As I shall argue, Williamson's arguments for these points are entirely sound. Their
conclusions are both important and true. Nonetheless, the conclusions of these argu
ments in fact imply much less than one might at first think. They are quite compatible
both with a fairly robust version of "internalism" or "methodological solipsism" (see
Fodor, 1980) about cognitive science, and with the view that broad mental states
can be illuminatingly analyzed in terms of the existence of an appropriate sort of
connection between internal and external elements. The internalist or methodo
logical solipsist about cognitive science can happily accept all of the arguments that
Williamson makes here.
I shall focus on Williamson's arguments for the claim that the factive attitudes (like
the attitude of knowing) play a crucial role in certain causal explanations. If these
arguments are sound, then it should be relatively straightforward to see how to adapt
them into arguments for the claim that broad mental states of other kinds play
a crucial role in causal explanations. The clearest sort of case in which such broad
factive attitudes play an important explanatory role is in the explanation of actions
that last a significant amount of time and involve the agent's interacting with his
environment. To take one of Williamson's examples (1995), suppose that a burglar
spends the whole night ransacking a certain house, even though by ransacking the
house so thoroughly he runs a higher risk of being caught. Offhand, it seems quite
possible that the explanation of the burglar's acting in this way might have been
that he knew that the house contained a certain extraordinarily valuable diamond.1
As Williamson points out, we would not get such a good explanation of the
burglar's behavior by appealing merely to the fact that he believed that the house
contained the diamond. This is because one crucial difference between knowing
something and believing something is that knowledge involves a robust connection
to the truth. If you know a proposition p, then it is not only the case that p is true
and you believe p, but it is also not the case that you might easily encounter some
(misleadingly) defeating evidence that would lead you to abandon your belief in p.2
Thus, if the burglar had merely believed that the house contained the diamond,
he might easily have discovered evidence in the house that would have led him to
abandon his belief that the diamond was there. For example, he might have believed
that the diamond was in the house because he believed (i) that if the diamond had
not yet fallen into his arch-rival's possession, then it could only be in a certain sugar
bowl in the house, and (ii) that the diamond had not yet fallen into his rival's pos
session. Then he would not have ransacked the house at all: he would have fled the
house as soon as he found that the diamond was not in the sugar bowl. Thus, the
probability of the burglar's ransacking the house for the whole night given that he
knew that it contained the diamond is greater than the probability of his doing so
given only that he believed that it contained the diamond.3
A second alternative explanation of the burglar's behavior might appeal, not
just to the burglar's belief that the house contained the diamond, but to all of the
burglar's background beliefs and all of his perceptual experiences during the course
of his ransacking the house. It may well be that the probability of the burglar's
ransacking the house given that he knew that it contained the diamond is no higher
than the probability of his ransacking the house given that he had all those
background beliefs and perceptual experiences. But as Williamson points out, this
second alternative explanation has a different defect. It is vastly less general than
the original explanation. Even if the burglar had had a slightly different set of
background beliefs or a slightly different sequence of perceptual experiences, so long
as he had still known that the house contained the diamond, he would still have
ransacked the house. This second alternative explanation is overloaded with too many
specifIc details; these details are not really necessary to explain why the burglar
ransacked the house. Thus, the original explanation, which appealed to the burglar's
knowing that the house contained the diamond, seems preferable to this second altern
ative explanation as well.
It seems to me that this point will in fact generalize to most cases where we are
interested in explaining actions. According to a plausible philosophy of action, which
is due to Al Mele (2000), actions typically involve a causal feedback loop between
(i) the agent's perception of what she is doing and (ii) her intentions to move her
limbs in such a way as to realize her goals. Typically, the agent is perceptually
monitoring her behavior and its immediate environmental effects, and continually
adjusting her behavior so that its effects are in line with her goals. Moreover, prac
tically all the actions that we are interested in explaining take more than an instant
to be performed. (Think of such actions as cooking a meal, washing one's laundry,
writing an email message, paying one's bills, and so on.) So the fact that one
performs such an action is itself a fact about a complicated interaction between one's
desires and perceptions, one's bodily movements, and one's environment. It is only
to be expected that if the effect or explanandum involves this sort of interaction between
an agent and her environment, the cause or explanans will also involve the agent's
relation to her environment. Thus, it is not surprising that broad mental states (such
as the agent's perceptual knowledge of her immediate environment) will feature in
the explanation of most actions.
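A schematic rendering of such a feedback loop may help fix ideas; the following sketch is a toy illustration only (the goal, state, and gain are made-up stand-ins, not part of Mele's account):

```python
def act(goal, world_state, gain=0.5, tolerance=0.01, steps=100):
    """Perceive the current effects of one's behavior and keep adjusting
    until the effects are in line with the goal."""
    for _ in range(steps):
        perceived = world_state          # (i) perception of what one is doing
        error = goal - perceived
        if abs(error) < tolerance:
            break                        # effects now match the goal
        world_state += gain * error      # (ii) adjust one's bodily movements
    return world_state

assert abs(act(goal=1.0, world_state=0.0) - 1.0) < 0.01
```

Even this crude loop makes the point vivid: the outcome is fixed jointly by what the agent does and by how the world responds, so the explanandum is itself a relation between agent and environment.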
For these reasons, then, Williamson's main point seems to be correct: an agent's
factive attitudes, like the burglar's knowing that the house contains the diamond,
may indeed play a crucial role in causally explaining the agent's behavior. As
I mentioned above, there seems to be no reason why this point should not also hold
of other broad mental states as well, like my belief that this glass contains water,
or that Tashkent is the capital of Uzbekistan. Broad mental states play a crucial explan
atory role.
Although this central point seems to me correct, it does not entail the conclusion
that Williamson seems to be seeking to support - namely, the conclusion that the
"internalist" position that the only mental states that are important for the purposes
of cognitive science are narrow states is false. Admittedly, once we accept that broad
mental states play a crucial role in explaining behavior, it certainly becomes plaus
ible that these broad mental states will play an important role in certain branches
of psychology and social theory. It certainly seems plausible that social psychology
and the social sciences (including various forms of anthropology, sociology, economics,
and political science) will find it useful to appeal to such broad mental states.4 It
also seems overwhelmingly plausible that historians will invoke broad mental states
in explaining historical events. But it is not clear that the same point will apply to
cognitive science as it is usually understood.
Williamson says at the beginning of his chapter that cognitive science is "the
science of cognition," and suggests that cognition is the "process of acquiring,
retaining and applying knowledge." But this is too specific to do justice to the broad
array of inquiries that are pursued by cognitive scientists. A better statement of the goal
of cognitive science would just be to say that it is to "understand how the mind works."
There are many ways in which one could interpret this goal of "understanding
how the mind works." But one sort of understanding that cognitive scientists
are often interested in achieving is analogous to the understanding that one would
have of a clock if one could identify each of its functional parts (its springs and
cogwheels, its pendulum, and so on), and the way in which all these parts interact
to bring it about that the clock has a reliable disposition to tell the correct time. As
Hobbes put it:
For everything is best understood by its constitutive causes. For as in a watch, or
some such small engine, the matter, figure and motion of the wheels cannot well be
known, except it be taken insunder and viewed in parts; so to make a more curious
search into the rights of states and duties of subjects, it is necessary, (I say, not to take
them insunder, but yet that) they be so considered as if they were dissolved . . . (Hobbes,
1651, preface, section 3)
This sort of understanding of how clocks work is quite different from the under
standing of clocks that one would have if one studied the impact of clocks on human
society, or the economics of clock production, or the stylistic properties of ornamental
clocks (from the standpoint of art history). An analogous understanding of how
a computer works would involve an understanding of the structure of its electrical
circuits and of the logical structure of its programming code. If this is the sort of
understanding that cognitive science is particularly interested in, that would help to
explain why cognitive scientists are so interested in actually trying to build machines
that can do some of the things that minds can do.
Thus, at least one of the goals of cognitive science will be to explain the micro
level processes that are characteristic of the mind. These are processes in which one
mental event or state is caused by another mental event or state that precedes it as
closely as one mental event can precede another. None of the examples of psycho
logical explanations that Williamson focuses on are explanations of processes of this
sort. These micro-level processes are precisely not processes in which one mental
state causes "long-term effects" by a complicated and extensive interaction between
the thinker's mind and his environment. Thus, it is not clear that Williamson's
arguments cast much doubt on the idea that the mental states that cognitive science
is interested in will very often be narrow states.
Still, someone might think that Williamson's argument that these explanatorily
important broad states are typically "prime" states (that is, they are not equivalent
to any conjunction of narrow states and environmental states) establishes that broad
states cannot be analyzed in terms of any relation between narrow mental states and
other non-mental factors. (Williamson himself does not claim that his argument estab
lishes that broad states are unanalyzable in this way, but some of his readers might
think that his argument does show this.) If broad states cannot be analyzed in this
way, then given the importance of broad mental states to the explanation of action,
any science of the mind that ignores these broad mental states will, as Williamson
puts it, "lose sight of the primary object of its study."
In fact, however, even if broad states are prime, they could still very well be
analyzable in terms of some relation between narrow states and non-mental factors.
This is because a state is prime just in case it is not equivalent to any conjunction
of narrow states and environmental states. But obviously there could be an analysis
of broad states that does not take the form of a conjunction.
In fact, many of the most promising attempts that philosophers have made on the
project of analyzing knowledge (to take just the most prominent example of a broad
mental state that philosophers have sought to analyze) have not taken the form of
conjunctions at all. Instead, they have taken the form of existential quantifications.
Thus, for example, at first glance it might seem that Nozick's (1981, ch. 3) analysis
of what it is for an agent to know p is just a conjunction of a number of
conditions. That is, it might seem that Nozick's analysis is this:
An agent knows a proposition p if and only if

1 p is true,
2 the agent believes p,
3 if p were not true, the agent would not believe p, and
4 if p were true, the agent would believe p.
Conditions (3) and (4) are summed up by saying that the agent's belief in p "tracks
the truth."
On closer inspection, it is clear, however, that when Nozick comes to present the
most carefully considered version of his analysis, these four conditions fall within
the scope of an existential quantifier. In the final version of his account, Nozick (1981,
p. 179) offers first an analysis of what it is for an agent to know p via method (or
way of believing) M:
1 p is true, and the agent believes p via method M,
2 if p were not true and the agent were to use M to arrive at a belief whether
(or not) p, the agent would not believe p via M, and
3 if p were true and the agent were to use M to arrive at a belief whether (or
not) p, the agent would believe p via M.
Then Nozick (1981, p. 182) uses this notion of knowing p via method M to define
what it is for an agent to know p simpliciter:
An agent knows p if and only if there is some method M such that (a) the
agent knows p via M, and (b) if there are any other methods Mi via which the
agent believes p but does not know p, then these methods are "outweighed"
by M.
Ignoring some of these complications, we may say that according to Nozick's
analysis, for an agent to know p is just for there to be some method M such that
M tracks the truth, and the agent believes p via M.
What has this to do with Williamson's claim that knowing p is a "prime" state?
Let us assume - just for the sake of argument - that the state of believing p via a
particular method M is a narrow state ; and let us also assume that the state of being
in a situation in which method M tracks the truth is an environmental state. Still,
Nozick's analysis will guarantee that the state of knowing p is not equivalent to the
conjunction of any pair of narrow and environmental states of this sort. The reason
for this is that there are usually many methods that one could use to arrive at a
belief about whether (or not) p is true, and for almost all such methods, there are
possible situations in which they track the truth, and other possible situations in which
they do not.
Consider a simple model in which there are just two relevant methods, M1 and
M2, and two relevant possible situations, S1 and S2. Suppose that in both situations,
S1 and S2, both method M1 and method M2 will lead one to believe p. However, in
situation S1, method M1 tracks the truth while method M2 does not; and in situation
S2, method M2 tracks the truth while method M1 does not.

Then, given Nozick's analysis, knowing p will not be equivalent to the conjunction
of believing p via M1 and M1's tracking the truth - since one might know p
even if one were not in this conjunctive state, if one believed p via M2 in situation
S2, in which M2 tracks the truth. Similarly, knowing p is also not equivalent to the
conjunction of believing p via M2 and M2's tracking the truth - for one might know
p even if one were not in that conjunctive state, if one believed p via M1 in situation
S1, in which M1 tracks the truth. Moreover, knowing p is not equivalent to the
conjunction of believing p via either M1 or M2 in a situation in which either M1 or
M2 tracks the truth, since one might be in that conjunctive state even if one did not
know p, if one believed p via M1 in S2, in which M1 does not track the truth (or if
one believed p via M2 in S1, in which M2 does not track the truth). And finally, knowing
p is obviously not equivalent to the conjunction of believing p via either M1 or
M2 in a situation in which both M1 and M2 track the truth, or to the conjunction
of believing p via both M1 and M2 in a situation in which both M1 and M2 track
the truth.
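The point can be checked mechanically. In the following sketch (a toy rendering; "believing p via such-and-such methods" is assumed narrow and "being in a situation in which such-and-such a method tracks the truth" is assumed environmental, as in the text), knowing p coincides with no conjunction of those states:

```python
from itertools import combinations, product

METHODS, SITUATIONS = ['M1', 'M2'], ['S1', 'S2']
CASES = set(product(METHODS, SITUATIONS))  # (method used, situation); p is
                                           # believed via the method in every case
TRACKS = {'S1': 'M1', 'S2': 'M2'}          # which method tracks the truth where

# Knowing p on the tracking analysis: believing p via a method that
# tracks the truth in one's situation.
KNOW = {(m, s) for (m, s) in CASES if TRACKS[s] == m}

def believes_via(methods):
    # assumed narrow: believing p via one of these methods
    return {(m, s) for (m, s) in CASES if m in methods}

def tracking_env(methods):
    # assumed environmental: being in a situation in which some method
    # in the given set tracks the truth
    return {(m, s) for (m, s) in CASES if TRACKS[s] in methods}

def subsets(xs):
    return [set(c) for r in range(len(xs) + 1) for c in combinations(xs, r)]

assert not any(believes_via(ms) & tracking_env(es) == KNOW
               for ms in subsets(METHODS) for es in subsets(METHODS))
```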
So it seems that Nozick's analysis of knowledge implies that knowing p is not
equivalent to any such conjunction at all. At best, it is equivalent to an open-ended
disjunction of conjunctions "Either: believing p via M1 while M1 tracks the truth; or
believing p via M2 while M2 tracks the truth; or . . . " But we are assuming here, for
the sake of argument, that the state of believing p via method M is a narrow state,
and being in a situation in which M tracks the truth is an environmental state. According
to Nozick's analysis, it is states of this sort that determine whether or not one knows
p. So, if knowing p were a "composite" state according to Nozick's analysis, then
to conclude, not just that there is a mental state that is present in both of the two
cases that the argument focuses on, but that this is a mental state of a very special
kind, with a very special object (such as a "sense datum") or a special sort of con
tent (such as a special "narrow content" different from the sort of content that ordin
ary mental states have). As I shall formulate it, the argument from hallucination
does not itself try to establish any of these further claims: its conclusion is simply
that there is a mental state that is present in both of the two cases, neither more nor
less.
Of course, if there is a mental state that is present in both of these two cases, it
is natural to ask further questions about this mental state: What sort of mental state
is this? And what is the relation between this mental state, which is present in both
these two cases, and those mental states that are present in one but not the other of
these two cases? However, there is a wide range of answers that could be given
to these further questions. While it would indeed be an objection to the argument
from hallucination if there were no plausible answer that could be given to those
further questions, the argument itself is not tied to any specifIc answer to those further
questions.
The first of the two examples of the argument from hallucination that I shall present
here starts with a pair of cases that consists of (i) a genuine perception and
(ii) a hallucination. (One of the differences between these two cases is that a percep
tion is a factive state: if one perceives that p is the case, then p is the case: for
example, if you see that the window is broken, then the window must indeed be
broken.) Let us take the pair of cases that Mark Johnston (2004) invokes in his
statement of the argument from hallucination. You are undergoing brain surgery,
while quite conscious, under local anesthetic. The surgeon
applies electrical stimulation to a well-chosen point on your visual cortex. As a result,
you hallucinate dimly illuminated spotlights in a ceiling above you. . . . As it happens,
there really are spotlights in the ceiling at precisely the places where you hallucinate
lights.
Then:
the surgeon stops stimulating your brain. You now genuinely see the dimly lit spot
lights in the ceiling. From your vantage point there on the operating table these dim
lights are indistinguishable from the dim lights you were hallucinating. The transition
from . . . hallucination to . . . veridical perception could be experientially seamless. Try as
you might, you would not notice any difference, however closely you attend to your
visual experience.7 (Johnston, 2004, p. 122)
What does it mean to say that "from your vantage point," the dim lights that you
see in the ceiling are "indistinguishable from the dim lights you were hallucinating"?
It seems to mean this: you lack any reliable ability to respond to the hallucination
by forming different beliefs and judgments from the beliefs and judgments that you
would form in response to the genuine perception. And the reason why this is the
case seems to be that in each of these two cases, you are disposed to form almost
exactly the same beliefs and judgments - that is, the same beliefs (and the same
doubts and uncertainties) about what is going on in your environment, about your
own mental states, and so on.8
What can explain this remarkable fact that these two cases are so extraordin
arily similar with respect to the beliefs and judgments that you are disposed to form
in those cases? One plausible explanation is that there is a mental state that is pre
sent in both of these two cases, and it is this common mental state that disposes you
to form those beliefs and judgments. As I noted above, I do not have to take a definite
stand on the further question of what exactly this common mental state is. Many
different answers to this further question are possible. For example, one possible answer
is that in this pair of cases, the mental state that is common to both cases might be
an experience as of there being dimly illuminated lights in a ceiling above you.
Some philosophers deny that there is any common mental state. According to these
philosophers, the two cases involve fundamentally different mental states - in the
one case a hallucination, and in the other a genuine perception; all that these cases
have in common is that both cases involve the disjunction of these two mental states -
that is, they both involve the disjunctive state of either hallucinating spotlights in a
ceiling or seeing spotlights in the ceiling.9 However, this "disjunctivist" response clearly
fails to provide any explanation of something that surely cries out for explanation -
namely, how it can be that these two cases are so similar with respect to the beliefs
and judgments that one is disposed to form in those cases. After all, any two cases
in an agent's mental life, no matter how dissimilar these cases may be from each
other, will both involve the disjunction of some mental state involved in the first
case and some mental state involved in the second. For example, consider one case
in which I am in excruciating agony, and another in which I am listening to some
beautiful music. These two cases have in common that they both involve the dis
junctive state of either being in excruciating agony or listening to some beautiful
music. But that the two cases have this much in common would hardly explain any
other similarity that they might have (such as a striking similarity in the beliefs and
judgments that one is disposed to form in those cases). Disjunctivism does not begin
to engage seriously with the explanatory problem that is raised by the argument from
hallucination.
The argument from hallucination can be generalized to other cases as well. In
particular, it can also be applied to two cases where your mental states differ in
content. There are several different theories about what determines the reference of
terms like our term "water" and of the concepts that they express. According to most
of these theories, such terms refer to the natural kind that actually causes the thinker
(or members of the thinker's community) to use the term in the thinker's normal
environment. Now suppose that you are transported from Earth to Twin-Earth in your
sleep, and that you then remain on Twin-Earth for the rest of your life. At some
point, it will be Twin-Earth, rather than Earth, that counts as your normal environ
ment, and it will be a community on Twin-Earth, rather than any community on
Earth, that counts as your community. At that point, then, your terms and concepts
switch from referring to the objects and kinds of Earth to referring to the objects
and kinds of Twin-Earth. But it is striking that you do not notice any switch in the
content of your thoughts. This change seems to leave everything else about your
mental states and dispositions unchanged. But that is an extraordinary fact. How can
the contents of all your thoughts change so thoroughly and yet leave so much intact?
You might even move back and forth between Earth and Twin-Earth several times,
in which case the contents of your thoughts might change back and forth several
times. How is it possible for such repeated cognitive revolutions to escape your
attention?
The best explanation of this, it seems to me, is that there is a mental state that
is common to both the Earth case and the Twin-Earth case. In saying that there is
a "mental state" present in both cases, I just mean that there is a mental property
that you have in both cases. I am not requiring that this mental property should take
the form of standing in a definite mental relation to a particular content. Again,
I do not need to take a definite stand on the further question of what exactly this
common mental property is. But one plausible answer to this further question may
be that the common mental state is a state such as that of believing a content of
such-and-such a type. Even if there is no such thing as "narrow content" - that is,
even if all intentional contents depend on the thinker's relations to her environment -
there may still be narrow types of content. That is, it may be that purely internal
facts about the thinker are enough to determine that she is indeed believing a content
of such-and-such a type, even though it is not enough to determine precisely which
content of this type she is believing. (For example, for a content to be of such a
narrow type might be for the content to be composed in such-and-such a way out
of concepts of such-and-such types - such as concepts that have such-and-such basic
conceptual roles. But it does not matter for my purposes exactly how these narrow
types of content are defined - only that there are such narrow types of content.)
I shall suppose then that the argument from hallucination succeeds in showing
that there is a mental state that is common to both cases in all these pairs of cases.
But does it really show that these common mental states are narrow states? As
I noted at the beginning of section 1, there is some initial unclarity about how exactly
we should draw the boundary between internal states and external states. I suggest
that we can use these pairs of cases that the argument from hallucination appeals
to - the pair consisting of the case of genuine perception and the case of hallucina
tion, the pair consisting of the case on Earth and the case on Twin-Earth, and so
on - in order to clarify where this boundary between the internal and the external
should be drawn. Admittedly, I have not given a precise account of what all these
pairs of cases have in common. Giving such an account, it seems to me, would require
much further investigation (possibly including empirical psychological investigation);
and I shall not try to anticipate the results of such an investigation here. But to fix
ideas, here is a suggestion that seems plausible, at least on first inspection: in each
of these pairs of cases, the broad states are uncontroversially different between the
two cases, but if the thinker shifts from one case to the other and back again, she
will not notice any change; and the reason for this seems to be that all the thinker's
mental dispositions are unaffected by the difference between the two cases (except
of course the thinker's dispositions with respect to the broad mental states that differ
between the two cases).
At all events, once we have a grasp on what these pairs of cases have in com
mon, then we can just stipulate that the states that are present in both cases in all
these pairs of cases all count as "internal states."10 If a state is present in all these
cases, despite the enormous difference of environmental states between all these cases,
this makes it reasonable to call these states "internal states"; and a state that super
venes on these internal states is what I am calling a "narrow state."
At least when the notion of a "narrow state" is understood in this way, it seems
to me that it is indeed plausible that the argument from hallucination provides a
strong reason to accept the conclusion that there are indeed narrow mental states.
As I noted above, this conclusion does not depend on the correctness of any particu
lar answers to the further questions about what sort of states these narrow mental
states are, or what their relation is to the broad mental states that are also present
in these cases. But to fix ideas, it may be helpful for me to suggest some possible
answers to these further questions. In answer to the first of these further questions,
I have already suggested that these narrow states consist in standing in non-factive
mental relations towards certain narrow types of content. For example, such narrow
states would include: having an experience with a content of such-and-such a type;
having a belief with a content of such-and-such a type; and so on.
What about the second of these further questions? What is the relation between
broad states and narrow states? For example, what is the relationship between the
broad state of knowing p and the narrow state of believing a content of such
and-such a type (where the content p is in fact of such-and-such a type)? It seems
plausible that the relationship is one of one-way strict implication: necessarily, if
one is in the broad state of knowing p, then one is in the narrow state of believing
a content of such-and-such a type; but the converse does not hold. This makes
it plausible that the relationship is that of a determinate to a determinable, as the
property of being scarlet is a determinate of the determinable property of being red,
and the property of being an equilateral triangle is a determinate of the determinable
property of being a triangle. Thus, for example, the relation of knowing is a deter
minate of the determinable relation of believing; the content p is a determinate
of such-and-such a determinable narrow type of content; and the state of knowing
p is a determinate of the determinable property of believing a content of such
and-such a narrow type.
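In schematic form (my notation): necessarily, knowing p implies believing a content of the relevant narrow type T, but not conversely, just as being scarlet implies being red:

\[
\Box\bigl(\mathrm{Know}(p) \rightarrow \mathrm{Believe}_T(p)\bigr),
\qquad
\neg\,\Box\bigl(\mathrm{Believe}_T(p) \rightarrow \mathrm{Know}(p)\bigr).
\]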
First, the explanations that I am concerned with here are explanations of why a
certain agent forms a certain new belief or intention, or revises
an old belief or intention, on an occasion on which the agent forms or revises her
attitudes in this way for a reason. In a very broad sense, then, these are all explana
tions of pieces of reasoning. The piece of reasoning in question may be either theor
etical reasoning (the upshot of which is that the agent forms or revises her beliefs),
or practical reasoning (the upshot of which is that the agent forms or revises her
intentions about what to do), or any other kind of reasoning that there may be. What
this class of explanations excludes, then, are explanations of cases where an agent
comes to have a mental state, but not for any reason - such as cases where an agent
comes to feel thirsty, or to have a certain sensory experience (on the assumption that
these are not mental states that the agent comes to have for a reason).
Second, the explanations that I am concerned with are explanations that seek to
break down a process of reasoning into its basic steps. (As Hobbes would say, we
are trying to understand the mental process's "constitutive causes.") A basic step of
this sort would be a mental process that cannot itself be analyzed, at the relevant
level of psychological explanation, into any other mental sub-processes at all. Thus,
suppose that there is a basic step that leads from one's having a sensory experience
as of p's being the case to one's coming to believe p. Then one's having this experi
ence is (at least part of) the proximate psychological explanation of one's coming to
hold this belief. There are no intervening steps, between the experience and the
belief, that can be captured at the relevant level of psychological explanation.
In this section, I shall argue that in a case of this kind, if the explanandum
consists of the fact that the agent acquires (or ceases to have) a narrow mental state,
then the proximate explanation will always also consist in some fact about the agent's
narrow mental states.11 I shall argue for this in two stages. First, I shall argue that
in any case of this kind, the proximate psychological explanation of an agent's
acquiring a mental state is always some fact about that agent's mental states. Then
I shall argue that when the mental state in question is a narrow state, then the
proximate explanation of the agent's acquiring that state is always a narrow mental
state of the agent.
In arguing for the first point, I am not denying that it is ever correct to explain
the fact that an agent acquires a mental state through reasoning on the basis of
something other than a fact about the agent's mental states. For example, the fact
that I come to believe that Fermat's last theorem is true could surely be explained
by the fact that I have been told by a reliable informant that Fermat's last theorem
is true - even though the fact that I have been told by a reliable informant that
Fermat's last theorem is true is not a fact about my mental states. This explanation
may be quite correct. It just does not identify the proximate psychological explana
tion of my coming to believe that Fermat's last theorem is true.
Intuitively, it seems, if this is a correct explanation, there must also be a more
detailed correct explanation, in which my coming to believe that the theorem is true
is not directly explained by my being told by a reliable informant that Fermat's last
theorem is true, but is instead explained by some intervening fact about my mental
states. For example, perhaps my coming to believe that Fermat's last theorem is true
is explained by my having the belief that I have been told by a reliable informant
that the theorem is true; and my having this belief (that I have been told by a
reliable informant that the theorem is true) is itself explained by my having an experi
ence as of someone (whom I take to be a reliable informant) telling me that the
theorem is true.
Suppose that I claim that an agent's acquiring a certain belief is explained by a
certain external fact that is not a fact about that agent's mental states; and suppose
that the context does nothing to make it clear how there could be any more detailed
correct explanation in which the link between that external fact and the acquisition
of that belief is mediated by any intervening facts about the thinker's mental states.
For example, suppose that I say, "I once lived in Edinburgh, so George W. Bush believes
that I once lived in Edinburgh." It would be natural for you to reply, "But how does
Bush know anything about you at all? Did you meet him and talk about your life?
Did he have you investigated by the CIA? Or what?" In asking these questions, you
seem to reveal that you would not accept this explanation unless it is plausible to
you that this link, between the fact that I once lived in Edinburgh and Bush's believ
ing that I once lived in Edinburgh, is mediated by intervening facts about Bush's
mental states.
In general, then, if an agent acquires a mental state through reasoning, the
proximate psychological explanation of her acquiring this mental state on this
occasion will be some fact about her mental states. In fact, it is plausible that this
is one of the distinctive features of reasoning - the process of forming or revising
one's mental states for a reason - in contrast to mental processes of other kinds:
reasoning involves some change in a thinker's beliefs or intentions or other attitudes
the proximate explanation of which is some other fact about the thinker's mental
states.
So far, I have only argued that the proximate explanation of an agent's acquir
ing a mental state through reasoning must involve some fact about the agent's
mental states. I have not yet argued that if the explanandum consists in the fact that
the agent acquires a certain narrow mental state through reasoning, the explanans
must also consist in a fact about the agent's narrow mental states as well. Ironically,
my argument will rely on the very same principle that Williamson relied on to defend
the causal efficacy of the state of knowing p: if the explanandum consists of the
fact that the agent acquired a certain narrow mental state, we will achieve a more
general explanation by appealing to another fact about the agent's narrow mental
states than by appealing to a fact about the agent's broad states.
In this second stage of the argument of this section, I shall rely on the idea that
I suggested at the end of the previous section, that the relation between a broad
mental state and the corresponding narrow state is the relation of a determinate to
a determinable. Thus, for example, the broad state of knowing p is a determinate of
the determinable narrow state of believing a content of such-and-such a type (where
p is a content of the relevant type).
If narrow states are related to broad states as determinables to determinates,
then it is plausible that whenever one is in a narrow state, one is also in some more
determinate broad state. For example, whenever one believes a content of narrow
type T, one either knows p or falsely believes q (where p and q are both contents of
type T) or has some other broad state of this kind. Suppose that in fact one knows
p. Thus, the event of one's coming to believe a content of type T occurs at exactly
the same place and time as the event of one's coming to know p. Some philosophers
will want to conclude that these events are in fact identical. But I have been assum
ing that entities that enter into explanatory relations, either as the thing that gets
explained (the explanandum) or as the thing that does the explaining (the explanans),
are facts rather than events. It surely is plausible that even if the event of one's
coming to believe a content of type T occurs at exactly the same time and place as
the event of one's coming to know p, the fact that one comes to believe a content
of type T is not the same fact as the fact that one comes to know p. After all, even
though in fact both of these facts obtain, it could easily happen that the first fact
obtains (you come to believe a content of type T) but the second fact does not (this
belief does not count as a case of knowing p). Since they are distinct facts, I shall
assume that they may have distinct explanations.
So, consider a case in which the explanandum - the fact that we are trying to
explain - is the fact that an agent acquires a certain narrow mental state through
reasoning. Specifically, suppose that this explanandum is the fact that the agent acquires
a belief in a content of a certain narrow type T1. Now consider two rival explanations
of this fact. According to the first of these explanations, the agent acquires this
narrow mental state because she is in a certain antecedent broad mental state - say,
the state of knowing a certain propositional content p. According to the second explana
tion, she acquires this narrow mental state because she is in a certain antecedent
narrow state - where this narrow state is in fact a determinable narrow state of which
the broad state cited in the first explanation is a determinate. Thus, if the broad state
cited in the first explanation is the state of knowing p, the narrow state cited in the
second explanation might be the state of believing a content of type T2 - where the
propositional content p is a content of type T2, and knowing is a type of believing.
Now it seems quite possible that the fact that the agent is in the narrow mental
state that is cited in the second explanation will be just as close to being causally
sufficient for the explanandum as the fact that she is in the broad mental state that
is cited in the first explanation. The probability that the agent will acquire a belief
in a content of type T1 is just as high given that she is in the antecedent narrow
state of believing a content of type T2 as the probability that she will acquire such
a belief given that she is in the antecedent broad state of knowing p.
However, the second explanation will obviously be more general than the first.
Consider a case in which you are not in the broad state that is cited in the first
explanation, but you are still in the narrow state that is cited in the second explanation
(which is a determinable of which the broad state cited in the first explanation
is a determinate). Surely you would still acquire the narrow mental state (such
as the belief in a content of type T1), which is the fact that both explanations sought
to explain. After all, the argument for postulating such narrow states in the first
place - the argument from hallucination - was precisely that such narrow states were
needed to explain certain striking similarities in the short-term mental effects of cer
tain pairs of cases. Thus, I argued that there must be a narrow mental state present
both in the case of hallucination and in the case of genuine perception because both
cases had such similar short-term mental effects (so that it was possible for an agent
to shift from the case of hallucination to the case of genuine perception without
noticing any difference at all). Similarly, I argued that there must be a narrow
Internal and External Components of Cognition
mental state present both in the Earth case and in the Twin-Earth case to explain
the striking similarities in the mental causal effects of the two cases (and to explain
why one would not notice the contents of one's thoughts change as one is trans
ported back and forth between Earth and Twin-Earth in one's sleep).
It is plausible that, other things being equal, we should prefer the more general of two explanations that otherwise count as equally good explanations of the same effect, from the same temporal distance. This point is especially plausible if the fact cited as the explanans in the more general explanation is a determinable of which the fact cited as the explanans in the less general explanation is a determinate. Here is a simple illustration of this point. Suppose that we want to explain why a certain code-protected door opened for the heroine. One explanation that we could give would be to say that the door opened because the heroine drew an equilateral triangle with each side measuring three inches, using her right index finger. A second explanation that we could give would be to say that the door opened because she drew a triangle. Now suppose that in fact any triangle drawn on the code-pad would have succeeded in opening the door. In that case, the second explanation is a better explanation, because it is more general than the first.12
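The structure of the example can also be set out in a minimal sketch (the code below is purely illustrative, and every name in it is invented for the purpose): the door's mechanism tests only for the determinable property of being a triangle, so the determinate details cited in the first explanation - the side length, the shape's being equilateral, the finger used - could all vary without affecting the outcome.

# Illustrative sketch of the code-pad example (all names invented).

def is_triangle(shape):
    """The determinable property that the door's mechanism actually tests."""
    return shape["kind"] == "triangle"

def door_opens(shape):
    # Any triangle drawn on the code-pad opens the door.
    return is_triangle(shape)

# The heroine's determinate drawing: an equilateral triangle with
# three-inch sides, drawn with her right index finger.
heroines_drawing = {
    "kind": "triangle",
    "sides": "equilateral",
    "side_length_inches": 3,
    "finger": "right index",
}

# A counterfactual drawing that differs in every determinate respect
# while sharing the determinable property of being a triangle.
variant_drawing = {
    "kind": "triangle",
    "sides": "scalene",
    "side_length_inches": 7,
    "finger": "left thumb",
}

assert door_opens(heroines_drawing)  # covered by both explanations
assert door_opens(variant_drawing)   # covered only by the more general one

The second explanation cites exactly the property that the mechanism is sensitive to; the extra determinate detail could be stripped away without making the explanans any less sufficient.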
For these reasons, then, it seems highly plausible that the proximate psychological explanation of these cases in which an agent acquires a narrow mental state through reasoning is itself always a fact about the agent's narrow mental states. This is not to say that broad states never play a role in psychological explanations. As we saw when I endorsed Williamson's arguments in section 1, knowledge does seem to play such a role in the explanation of certain actions. Here, however, the explanandum - such as the burglar's ransacking the house that contains the diamond - consists in an agent's interacting with his environment in a certain way. It is only to be expected that the explanans - the burglar's knowing that the house contains the diamond - will also consist in the agent's standing in a certain relation to his environment. This does not show that such broad states will figure in the explanation of the fact that the agent acquires a narrow mental state (such as the fact that a thinker comes to believe a content of type T1 at time t). A narrow mental fact of this sort is surely more likely to have a correspondingly narrow mental explanation. In general, the overall effect of the principle about explanation that I am appealing to here is that in any correct explanation there must be a certain sort of proportionality between the explanandum and the explanans. The explanans must be sufficient in the circumstances to produce the explanandum; but it also must not contain any irrelevant elements that could be stripped away without making it any less sufficient to produce the explanandum. This proportionality principle makes it plausible that narrow mental states will be particularly well placed to explain other narrow mental states, and broad mental states will be particularly well placed to explain other broad mental states (as well as actions, which like broad mental states also depend, in part, on the agent's relations to her environment).
For these reasons, then, it seems plausible to me that there are correct explanations in which both the fact that is explained (the explanandum) and the fact that does the explaining (the explanans) are facts about the agent's narrow mental states. At least one branch of cognitive science could be devoted to ascertaining precisely which explanations of this sort are correct. This branch of cognitive science would be a theory of the nature of internal mental processes of this kind, which would be largely independent of the agent's relationship to her wider environment.
Moreover, as we saw in considering Nozick's definition of knowledge, Williamson's arguments do not show that broad states, like the state of knowing p, cannot be analyzed in terms of the existence of appropriate connections between narrow states and the non-mental objects, properties, and kinds in the agent's environment. It may be that some such analysis can be given of all such broad mental states.13 If that is the case, then a correct account of all the causal relations, actual and counterfactual, among both the agent's narrow states and the non-mental objects, properties, and kinds in the agent's environment, will in fact state all the same facts as a correct account that overtly appeals to broad mental states. In that case, a form of cognitive science that focused purely on explaining narrow mental states in terms of other narrow states (when supplemented by an account of the causal relations between these narrow mental states and various non-mental features in the agent's environment) would be able to capture everything that can be stated in terms of such broad states; and a cognitive science of this kind could not fairly be accused of "losing sight of the primary object of its study."
I cannot undertake to settle the question here of whether all broad mental states
can be analyzed in such terms. Even if broad mental states cannot be analyzed in
this way, a form of cognitive science that restricts itself to studying purely internal
cognitive processes would still be investigating some pervasive and genuinely
cognitive phenomena. But if broad mental states can be analyzed in this way, then
such a form of cognitive science could truly claim to be seeking the answer to the
question of "how the mind works."
Acknowledgments
This chapter was written with the support of a Research Leave award from the UK Arts and
Humanities Research Board, for which I should like to express my gratitude. I should also like
to thank Alex Byrne, Timothy Williamson, and Rob Stainton, the editor of this volume, for
very helpful comments on an earlier draft.
Notes
1 See also Williamson (2000, pp. 60-4, 75-88). Of course, if we can give a noncircular definition of knowledge in terms of other folk-psychological notions - for example, if knowledge can be defined as a rational belief that is in a certain sense "reliable," as I believe - then knowledge would not play an indispensable role in any of these explanations. I shall touch on this question in the last section.
2 This point is due to Harman (1973, pp. 143-4); for further discussion, see Wedgwood, 2002a.
3 We would also not get such a good explanation of the burglar's behavior by appealing to the fact that the burglar truly believed that the diamond was in the house. Even if the diamond was in the house (say, hidden inside the grand piano), the burglar might have believed that it was in the house only because he believed that it was in the sugar bowl, in which case he would still not have ransacked the house for the whole night.
4 For example, see Diamond (1997, p. 143): "an entire field of science, termed ethnobiology, studies people's knowledge of the wild plants and animals in their environment."
5 Of course, this is a big "if." Almost no one thinks that Nozick's analysis is exactly right as it stands. But I actually think that an analysis that is at least a little like Nozick's does succeed; see Wedgwood, 2002a.
6 For such criticisms, see McDowell, 1994 and Dancy, 1995.
7 Johnston actually focuses on three cases: a hallucination whose content is false or non-veridical, a veridical hallucination, and a genuine perception. It seems to me, however, that this additional sophistication is not strictly necessary for the argument.
8 I say "almost exactly the same beliefs and judgments" because strictly speaking demonstrative judgments (such as the judgment that those lights there are dim) will be different in the two cases, as we can see from the fact that such demonstrative judgments will have different truth conditions in the two cases.
9 This is the view of a school of thought known as "disjunctivism." For some canonical statements of this disjunctivist view, see Hinton, 1973; Snowdon, 1981; and McDowell, 1994. For criticism of some of the arguments that are used to support this disjunctive view, see Millar, 1996.
10 One question that requires further investigation is whether these "internal" states supervene on intrinsic features of the agent's brain or whether their supervenience base must include something about the wider environment. It may be, for example, that a brain in a vat that had never been connected to a body that was capable of acting in a normal environment could not have any mental states at all. If so, then these "internal" states will not supervene on intrinsic features of the agent's brain, but only on a slightly wider supervenience basis, which might include certain highly general and unspecific features of the agent's environment. Nonetheless, the supervenience basis for these internal states would presumably be much narrower than the broad states that Williamson focuses on.
11 The argument that I give here is a generalization of an argument that I gave elsewhere (Wedgwood, 2002b).
12 I owe this example to the editor of this volume. For some further discussion of this principle about why we should under certain circumstances prefer the more general causal explanations, see Yablo, 1992a, pp. 413-23; 1992b; 1997.
13 Indeed, I believe that it can be argued that there would be something wildly puzzling and mysterious about broad mental states if they were not analyzable in some such way. But there is not enough space to present this argument here.
References

Mele, A. (2000). Goal-directed action: Teleological explanations, causal theories, and deviance. Philosophical Perspectives, 14, 279-300.
Millar, A. (1996). The idea of experience. Proceedings of the Aristotelian Society, 96, 75-90.
Nozick, R. (1981). Philosophical Explanations. Cambridge, MA: Harvard University Press.
Putnam, H. (1975). The meaning of "meaning." In H. Putnam, Mind, Language and Reality: Philosophical Papers, vol. 2. Cambridge: Cambridge University Press.
Snowdon, P. F. (1981). Perception, vision and causation. Proceedings of the Aristotelian Society, 81, 175-92.
Wedgwood, R. (2002a). The aim of belief. Philosophical Perspectives, 16, 267-97.
- (2002b). Internalism explained. Philosophy and Phenomenological Research, 65, 349-69.
Williamson, T. (1995). Is knowing a state of mind? Mind, 104, 533-65.
- (2000). Knowledge and Its Limits. Oxford: Clarendon Press.
Yablo, S. (1992a). Cause and essence. Synthese, 93, 403-49.
- (1992b). Mental causation. Philosophical Review, 101, 245-80.
- (1997). Wide causation. Philosophical Perspectives, 11, 251-81.
Index
innateness, 66
judgments, 315-16
knowledge, 215 n. 4, 292-3, 301-2, 305 n. 8, 308, 323 n. 1
location, 148-9
mental states, 293
modules, 16
perceptual, 265, 309-10
philosophical, 148-9
Quechua speakers, 26
representation, 134-5
theory confirmation, 158 n. 7
Berkeley, G., 83
Bermudez, J. L., 279
Bernoulli, D., 116
Big Computer view, 4, 41-2
binocular vision, 268
biological approach, 8-11, 41, 42-3, 162
biophysical machinery approach, 97-8, 100, 101, 102-3, 109
Blackmore, S. J., 272-3 n. 18
blind spot, 261, 268
blindness, 4, 26, 29, 247
  see also change blindness; inattentional blindness
Block, N. J., 189-90, 195, 196-7, 246
bounded rationality
  Enlightenment Picture, 136-42
  Gigerenzer, G., 135-6, 137, 139, 142
  optimization, 118
  probability calculus, 141-2
Boyd, R. N., 62
Boyle-Charles law, 161, 170
brain damage, people with, 281
brain surgery example, 315
Brentano, F., 240, 252
Broca's area, 23, 24, 29
Bruner, J., 52
Buñuel, L., 269
Burge, T., 243
burglary example, 309-10, 322, 323 n. 3
Bush, R. R., 171
Buss, D., 46
Cab Problem example, 137-8, 139
camel concept, 28
car example, 245-6
carbohydrate metabolism, 162
card tricks, 269
Carey, S., 248
Carruthers, P.
  encapsulation, 30, 33, 45
  massive mental modularity, 48
  modularity, xiii, 22, 37, 40
  practical reasoning, 16
  qualia, 190
Cartwright, R., 250, 252
central systems, 37, 41-2
ceteris paribus generalizations, 155-6
Chalmers, D. J., 207, 210
change blindness, 270
  examples, 268-9
  experiments, 259-60, 272 n. 16
  illusion, 264, 283
  visual field, 261
Chierchia, G., 227
children
  antibiotics prescriptions, 126
  environment for learning, 92
  I-language, 98
children's language acquisition
  American Sign Language, 168
  Chomsky, N., 72, 84, 86
  deafness, 73-4
  domain specific knowledge, 93
  errors, 107
  Gold, E. M., 69-70
  innateness, 65, 83-4
  internalist reasoning, 99-100
  interrogatives, 86-7, 110 n. 4
  past-tense formation, 166
  sentence constructions, 101
  unacquired linguistic universals, 71-2
Chomsky, N.
  Cartesian Linguistics, 106
  children's language acquisition, 72, 84, 86
  core language, 75 n. 4
  creative aspect of language use, 106-7
  E-languages, 98-9
  existence of objects, 244
  experiential world, 251
  factors of language growth, 102
  generative grammar, 159
  I-language, 98
  language acquisition, 62, 75 n. 1, 84, 94 n. 6
  linguistic nativism, 90
  natural languages, 92, 105
  phonetics, 237
  poverty of the stimulus, 111 n. 13
data/information, 11
database searching, 32-3
Dawkins, R., 119
deafness, 4, 73-4, 98, 247
decision making, 41-2, 120
decision rule, 125
decision tree algorithm, 126
declarative clauses, 71
decomposition, functional, 33-4, 37, 54 n. 7
deductive-nomological model, 161
deictic entities, 230-2
Dennett, D. C., 175, 191, 272 n. 12, 284, 286 n. 10
Descartes, R.
  beliefs, 66
  Chomsky, N., 108
  innateness, 105
  introspection, 206
  knowledge, 82
  mechanistic approach, 106, 162
  Meditations, 214-15 n. 3, 215 n. 6
  meditator, 205
determinate/determinable, 318, 320-2
determinism, 25-7, 116
developmental psycholinguistics, 83-4
Devitt, M., 237, 239, 244
Diamond, J., 324 n. 4
Discourse Representation Theory, 230
disembodiment argument, 204, 205, 208-10, 214-15 n. 3
disjunctivism, 306 n. 12, 316, 324 n. 9
domain specificity, 5-6, 39
  arguments against, 49-51
  degrees of, 54 n. 6
  evolutionary psychology, 19 n. 2
  Fodor-modules, 4-5
  innateness, 27-30, 85-6
  knowledge, 85-6, 93
  localization, 24
  massive mental modularity, 6
Dretske, F., 195, 197, 199 n. 1, 244
dual route model, 275, 281-2
duck-rabbit figure, 31
Dugatkin, L. A., 126-7
dynamical systems theory
  cognitive development, 178-81
  cognitive science, 174-5
  holistic, 175-7
  nonlinear progression, 180
evolution, 9, 10, 42-3
evolutionary psychology, 7, 10-11, 19 n. 2, 22
experience, 26, 263, 271 n. 10
  see also visual experience
experiential world, 238, 251
explanandum/explanans, 310, 321, 322-3
externalism
  cognitive states, 301-2
  existence, 239
  generalizations, 295-6
  internalism, 243-4, 298-9
  weak/strong, 243-4, 248, 253 n. 6
eye movements, 260-1, 270
face recognition, 10, 49
factive attitudes, 302-3, 309
factorization, 292, 303-4, 307
Farrell, B. A., 190
fecal occult blood, 127
Fechner, G., 170
feedback loop, 302, 310
feedforward connectionist networks, 163, 164, 165
Fermat, P., 116, 319-20
Fitch, W. T., 67, 75 n. 4
fitness effects, 8-9
Flynn effect, 173
Fodor, J. A., 227
  acoustic representation, 247
  belief, 158 n. 7
  Big Computer, 4
  causality, 244
  cognition, 19 n. 3
  computational tractability, 13
  encapsulation, 30-1
  language acquisition, 61, 72-3
  language of thought, 159, 224
  linguistic representation, 103
  locality, 14
  modularity, 22, 23, 39, 41, 50
  Modularity of Mind, 22
  psychosemantics, 195-6
  stochastic learning theory, 76 n. 15
  see also Fodor-modules
Fodor, J. D., 76 n. 10
Fodor-modules, 4-5, 18, 25, 33-4, 159
folie à deux view, 239-40, 247, 251
folk biology, 46
folk physics, 26, 30
linguistic nativism
  Chomsky, N., 90
  formal learning theory, 90-3
  implausibility, 102
  language acquisition, 81-2
  Lipton, P., 67
  poverty of the stimulus, 70-1
  syntax, 59-60
linguistics, 103, 105-6
Lipton, P., 67
localization, 23-4
location, 14-15, 148-9
Locke, J., 61, 74, 83, 191
Lotka-Volterra model, 173
Ludwig, K., xiv, 276, 283, 285
Lycan, W. G.
  consciousness, xiii, 193
  HOP, 191
  knowledge argument, 213
  presentation modes, 206
  qualia, 190, 197
  representation, 195, 196, 214
macaque monkeys, 281
McClelland, J. L., 166, 180
McConnell-Ginet, S., 227
McGilvray, J., xiii
Mack, A., 264, 269-70
Macnamara, J., 229
Marchman, V., 164, 165
Marcus, G., 12
Margolis, E., 67
Markov models, 171
Marler, P., 67
Marslen-Wilson, W., 31
massive mental modularity, 4, 37-8
  arguments against, 47-8
  biological approach, 8-11
  bottleneck argument, 44
  Carruthers, P., 48
  computational tractability, 13-15
  domain specificity, 6
  evolution, 42-3
  evolutionary psychology, 22
  heuristics, 19 n. 10
  support for, 7-15
  task specificity, 11-13, 43-4
mathematical psychology, 169-72
mathematico-deductive theory, 170
Matheson, D., xiv, 127
parallelism, 44, 101
parsimony, 170-1, 176
Pascal, B., 116
past-tense formation, 27, 151, 160, 163-7, 179
pattern recognition, 163
Pax-6 gene, 101, 102
Peacocke, C., 196
peer group pressure, 126
Peirce, C. S., 108
perceivers/reality, 247-8
percept, 228-30
perception, xiv
  beliefs, 265, 309-10
  cognitive influence, 52
  computational models, 49
  disjunctive theory, 306 n. 12
  dual route model, 275, 281-2
  encapsulation, 31
  Fodor, J. A., 30-1
  hallucination, 276
  Higher Order, 191
  memory, 179
  motion, 277-8
  non-modular, 22
  phonology, 278
  reference, 231-2
  representation, 275
  sensorimotor view, 261-2, 286 n. 11
  top-down influences, 51-2
  veridicality, 269
  see also visual perception
perceptrons, 163
perceptual experience, 263
  accuracy, 267
  dispositional, 285
  evolutionary function, 266-7
  illusion, 264-6, 270-1, 283
  inaccuracy, 276-80
  occurrent/dispositional, 263
  state/event, 285
perceptual systems, 221, 259-60
perspectivalism, 198-9, 212-13
phantom limb pain, 26, 32
phenomena
  cognitive science, 161-2
  externalism, 197
  physics, 160-1
  qualia, 193-8
  reconstituted, 167
phenotypes, 9
philosophy of action, 310
philosophy of language, xiii-xiv
philosophy of mind, xiii-xiv, 159-60, 202
philosophy of science, 161
phonetics, 237, 246-7
phonology, 40, 220-1, 239, 278
PHONs, 105
photoreceptors, 276-7
physicians, 127-8
physics, 160-1, 170
Piaget, J.
  learning, 84
  logic, 115, 118, 119
  object concept task, 179
  poverty of the stimulus, 87
  sensorimotor intelligence, 85
Pietroski, P., 101
Pinker, S., 73, 166
Plato, 248-9
  Meno, 64, 82
Plunkett, K., 164, 165
Plurality Thesis, 40-1
Pollock, J., 143
Ponzo illusion, 270
possible worlds, 222
poverty of the stimulus, 107-9
  anti-nativism, 86-90
  Chomsky, N., 111 n. 13
  empirical, 101
  innateness, 108
  language acquisition, 72
  linguistic nativism, 70-1, 82, 84-6
  Piaget, J., 87
predator/prey population, 173, 174
prefixes, 69
presentation modes, 197, 206
prime state, 298, 299-302
priming, semantic, 25
Prince, A., 166, 168
principles and parameters theory, 63-5, 75 n. 7, 102
Prinz, J., xiii, 37, 49, 50-1, 76-7 n. 16
probabilistic inference, 41, 302
probabilities, 116, 128, 138, 141
probability calculus, 137-8, 141-2
productivity argument, 154
pronouns
  anaphoric, 86-7
  deictic, 227-9
reasoning
  explanation, 319
  internalist, 99-100, 128, 239
  mental states, 320, 322
  modularity, 41-2
  practical, 16, 221
  social, 221
reciprocity, 122-3
recognition, visual, 25
  see also face recognition
recognition heuristic, 124-5
reduction, type/token, 190-1, 203
reductionism
  conceivability argument, 203-5
  natural science, 206-7
  physicalist, 207
  qualia, 202-3
reductivism, 61-2
reference
  conceptualist theory, 226-7
  deictic, 227-9, 233-4
  intended, 233
  perception, 231-2
  purported, 232-5
  realist theory, 226
  truth, 235
  truth-value, 222-3
relevance module, 48
representation, 118
  accuracy, 266, 276-80
  beliefs, 134-5
  causes, 248
  Chomsky, N., 253 n. 5
  conceptual, 280-1
  contentful/architectural, 103-6
  existential, 241-3
  Higher Order, 191-2, 193, 195, 196, 199
  intentionality, 239, 241-3
  internal, 240-1
  Lycan, W. G., 195, 196, 214
  mental systems, 53 n. 1
  non-existent things, 241-2
  noun phrase, 104
  perception, 275
  perceptual systems, 259-60
  psychosemantics, 195-6
  sensorimotor, 280-1
  standard linguistic entities, 240-1
  thermoreceptors, 278-9
  visual experiences, 260
representation-world relations, 104
representationalism, 193, 197, 213-14
retina, 25, 227-8
Rey, G., xiv, 192, 197, 252 n.
Rips, L., 123
Rock, I., 269-70
Rosenblatt, F., 163
Rosenthal, D., 190, 191
rules
  application of, 163
  exceptionless, 147-8, 154-5, 157
  language processing, 159
  programmable, 155
Rumelhart, D. E., 166
Saffran, J. R., 63
Sampson, G., 72
Samuels, R., xiii, 12, 64, 65, 129
Sapir, E., 237
satisfaction, 232-5
satisficing, 15
Saussure, F. de, 237
Saussurean arbitrariness, 102
Scholz, B. C., xiii, 72, 88, 97, 101, 103
science, laws, 161
scientific approach, 206-7
scope ambiguity, 16
search rules, 32, 125
Searle, J., 215 n. 14
Segal, G., 6
selection processor, 47
selection task, 122
semantics
  Chomsky, N., 238
  conceptualist, 219-22
  dynamic, 230
  priming, 25
  situation, 233
  syntax, 150-1
  truth conditional, 235
SEMs (linguistic meanings), 105
senses, 26, 59, 62, 194
sensorimotor intelligence, 85, 284-5
sensorimotor model, 261-3, 280-1, 286 n. 11
sensory receptors, 260-1
sentences, 100-1, 233-4, 235, 244-5
shallowness, modularity, 39
Shoemaker, S., 192, 197
Siegler, R., 180
Stigler, G. J., 117
stochastic learning theory, 63, 76 n. 15
Stoljar, D., 205-6, 215 n. 7
Stone, V. E., 24
stopping-rules, 15-16, 125
Strawson, G., 264
Sturgeon, S., 209
subsymbolic level, 163, 177-8
suffixes, 69
supervenience, 203, 208, 210, 324 n. 10
symbolic approach, 160, 163, 167
symbolic-connectionism, 163-7
symbolic mechanistic models, 171-2
synesthesia, 32
syntax
  acquisition, 59-60
  classical, 150-1
  cognition, 147-8, 154-5, 160
  concepts, 31-2
  defined, 149-51
  drawings, 158 n. 3
  phonology, 220-1
  semantic relationships, 150-1
  theory, 105
  tracking argument, 148-9
systematicity, 160
Tadoma language, 247
Tager-Flusberg, M., 29
Take The Best heuristic, 125-6
Tarski, A., 234
tasks, 11-13, 43-4, 47
Taylor, C., 135
temporal gyrus, superior, 100
that-clauses, 234
That Obscure Object of Desire, 269
Thelen, E., 178
theory confirmation, 158 n. 7
thermoreceptors, xiv, 272 n. 13, 278-9
thermostats, 280
Thomas, D., 259
Thomas, M. S. C., 27
Thornton, R., 100-1
thought processes, 6, 191
Tichener circles illusion, 282, 286 n. 6
Tienson, J., 160
time, as variable, 173
tokenism, physical, 237, 252 n. 2
Tooby, J., 39, 40, 53 n. 1, 140-1, 143
top-down influences, 31-2, 51-2
touch/sound, 32
tracking argument, 148-9, 151-2, 154
tracking objects task, 121
transparency argument, 194
Trends in Cognitive Sciences (McClelland and Patterson, Pinker and Ullman), 166
triggered learning, 65, 74
Trivers, L. R., 122
trumping hypothesis, 51
truth, 232-5, 309-10
truth-tables, 14
truth-value, 222-3, 233, 234-5
Tversky, A., 118, 129, 137
Twin-Earth, 293-5, 308, 316-17, 322
Tye, M., 194, 195, 196, 197, 212
Tyler, L., 31
unacquired linguistic universals (ULUs), 67-70, 71-2
unbounded rationality, 116-17, 119-20, 128-9, 136-7, 139
uncertainty, 135-6, 137-8, 139, 141-2
underdetermination, 162
understanding, 249, 251, 311
  see also reasoning
Universal Grammar, 60, 67, 72, 81, 84-5, 101, 219
Universal Grammar+, 105-6
unsolvability theorem, 91, 95 n. 18
Uttal, W. R., 23, 24
utterance, 237, 239, 246, 247
Van Gelder, T., 175
Van Leeuwen, C., 175, 177-8
verbs
  conjugation, 25, 29
  past-tense formation, 27, 151, 160, 163, 179
Viger, C., xiv
virtues, 134-5
visual cortex, 315
visual experience
  details, 267
  misrepresentation, 267-8
  representation, 260
  sensorimotor view, 262-3
  snapshot model, 265-6, 283-4
  veridical, 194-5
visual field, 260-1, 284