Anderson, Lightfoot 2002 The Language Organ
Challenging and original, The Language Organ treats human language as the
manifestation of a faculty of the mind, a mental organ whose nature is deter-
mined by human biology. Its functional properties should be explored just
as physiology explores the functional properties of physical organs. The
authors argue that linguistics investigates cognition, taking as its object mental
representations and processes rather than externally observed grammatical pat-
terns (which constitute evidence, but are not themselves the object of study).
Such a view has untraditional consequences for the kinds of argument and
analysis that can be offered in understanding the nature of language. The book
surveys the nature of the language faculty in its various aspects: the systems
of sounds, words, and syntax; the development of language in the child and
historically; and what is known about its relation to the brain. It discusses the
kinds of work that can be carried out in these areas that will contribute to an
understanding of the human language organ. The book will appeal to students
and researchers in linguistics, and is written to be accessible to colleagues in
other disciplines dealing with language as well as to readers with an interest
in general science and the nature of the human mind.
Stephen R. Anderson
Yale University
David W. Lightfoot
Georgetown University
The Pitt Building, Trumpington Street, Cambridge, United Kingdom
http://www.cambridge.org
Preface page ix
3 Syntax 41
3.1 The emergence of syntax within linguistics 41
3.2 Successive merger and deletion 43
3.3 Case 61
3.4 Conclusion 66
7 Morphology 131
7.1 The lexicon 132
7.2 Words and “morphemes” 134
7.3 Productivity 152
7.4 Conclusions about lexical organization 155
References 244
Index 257
Preface
A standard dictionary definition of physiology runs as follows:
[t]he branch of biology dealing with the processes, activities, and phenomena incidental
to and characteristic of life or of living organisms; the study of the functions of the
organs, tissues, cells, etc. during life, as distinct from anatomy, which deals with their
structure. The final analysis of these processes and phenomena is mainly physical and
chemical. The phenomena of mental life are usually regarded as outside the ordinary
scope of physiology (see psychology).
Like many dictionary definitions, this one combines the central conceptual
content of the word with other characteristics that are as much the product of
historical accident as of basic meaning. The core notion seems to be that of the
study of functions and processes as opposed to static structure, but that by itself
does not tell us much about just what sorts of object might properly fall within
the scope of this science.
A central aim of this book is to explore the notion that physiology should
extend, if it is to be productive, beyond the study of isolable physical structures
and organs. As we see it, the range of “phenomena . . . characteristic of life”
whose functional organization can be usefully attributed to the biology of the
organism exhibiting them, and substantively studied, goes well beyond those
for which we have a “final analysis” which is “mainly physical and chemical.”
Excluding “the phenomena of mental life” from physiology surely reflects, at
least in part, the ancient Cartesian dualism of mind and body in rejecting the
study of cognition from the biological domain. However, this dualism was soon
undermined, largely because of what we came to learn about the “body” not long
after Descartes. Newton reintroduced “occult” causes and qualities, to his own
dismay, and the Cartesian theory of body was shown to be untenable. Hence,
there is no absolute distinction between body and mind, because our notions
of the body need to be enriched beyond the simple mechanisms envisioned
by Descartes.
Accepting the notion that “the mind is what the brain does,” as much mod-
ern cognitive science does, entails that cognitive phenomena essential to the
life of an organism are good candidates for biological study: as with other or-
gans, we can study them at an appropriate functional level and not just in the
narrow terms of a physical or chemical analysis of living tissue. There is no
longer any motivation for a sharp delineation between functional organization
that is associated with discrete anatomy and that whose relation to physical
implementation is less obvious and discrete. In all cases, we need to identify
an appropriate level of abstraction at which significant generalizations about
biologically determined function can be stated. We shall illustrate such a level
of abstraction: the study of “I-language.”
Does it make sense to take linguistics as a paradigm for this sort of analysis,
though? In modern academic life, linguistics as a discipline suffers from a
lack of clear general perception. All too commonly, the notion that there is
anything very complex or technical about the study of language seems contrary
to common sense: after all, just about everyone (even little kids) can speak
a language perfectly well, so what could be that intricate about it? Of course,
when you learn a new language you have to learn a lot of words, and little details
like where the verb goes in a German sentence, but surely that’s not science. If
there is something to study there, this attitude continues, it must be things about
the histories of languages (Which ones are related to which others? Where did
the word boondocks come from?), or else about the way we use language in
society (Does bilingual education make sense? How do men and women do
different things with the same words?).
With the work of Noam Chomsky, and his colleagues and students, however,
a completely new focus arose. Chomsky stressed from an early point that if
we want to understand the essential nature of language, a study of its external
manifestations in speech, texts, communicative behavior, etc. may provide rel-
evant evidence, but that should not be the fundamental object of inquiry in the
field. What we want to understand, that is, is not a text or a spoken utterance,
but the system of knowledge within a speaker that underlies (significant and
well-defined aspects of) it. Linguistics is thus in its most essential aspect a
component of the study of the mind, the study of a system of human know-
ledge (the language organ) that forms a part of our larger cognitive organization.
It has developed a level of abstraction that appears to be appropriate for the
formulation of significant generalizations about language: what we will call
“I-language” below, an individual, internal system. We know something of the
properties of I-language, though certainly not all, and an appropriate level of
abstraction seems to be emerging which permits productive research. Efforts
toward detailing the properties of I-language in a variety of areas constitute the
subject matter of this book.
As modern cognitive science has come increasingly to reject the dualism of
mind and body, and to see our mental life as the product of physical processes
and events, it has become possible at least to consider the possibility that aspects
of cognition have a structure in themselves that is determined as much in its
form by our biological nature as is the structure and functioning of physical
organs like the liver or the skin. And when we look into the matter with care,
we find that there are indeed many strong reasons to consider that the cognitive
organization underlying our ability to acquire and use a language is as much
a part of our genetically determined biology as the structure of our eyes or
skeleton.
This might lead us to expect that the right way to study language might be
to identify and characterize discrete, separable regions of the brain that are
responsible for it, and then to explore the operation of this tissue in precise
detail. It would surely be at least premature to take this approach, however.
We are far from being able to associate specific neural tissue, except at the
very grossest levels, with specific cognitive functions. We can observe that
injuries to certain brain regions result in particular cognitive deficits, and that
shows that the tissue involved must indeed be serving some purpose that is
essential to the cognitive function in question, but that is not at all the same
thing as saying that this is the specific region which “computes” the relevant
cognitive property. Even when rapidly developing imaging techniques allow
us to identify metabolic activity in specific regions of the (normal) brain that
is associated with language-related activities, we are still far from being able
to relate those activities in any direct way to molecular or cellular events and
processes.
We have tried to document our discussion from the standard literature. These references
are of two general sorts: some are intended to make it possible for skeptics,
or for those who simply want to know more about a particular point, to see
where our position is grounded. The reader may in general presume that such
additional exploration is not crucial to understanding the point at issue, and
regard these citations as so much baroque ornamentation. Other references are
provided specifically to point readers to fuller discussions of areas which we
cannot go into here in adequate detail; these are generally signalled as such,
and should again be considered as supplementary rather than required reading
assignments.
We have tried, overall, to make this book as self-contained as possible, defin-
ing our terms at least to an extent that should make our usage comprehensible to
a non-specialist who is willing to make the effort to understand us. The reader
who has read carefully Steven Pinker’s very useful book The Language Instinct
should have all of the tools that are necessary to make it through this one. Some
parts will be slower going than others, however. The reader may perhaps take
comfort in the fact that most chapters can be appreciated in the absence of a
complete understanding of some of the others.
Linguistics is a field with many subareas, and the basic concepts and methods
of each are to some extent distinct. We have not by any means attempted to
cover the entire field (lexical semantics, for example, is not treated at all, nor
is parsing, or computational analysis, etc.), but we have tried to provide a
representative piece from several of the major areas of grammatical theory, and
from some other areas that are of special importance for the kind of questions
we wish to address. In all cases, we see that we need to get beyond the everyday
notions of “languages,” “words,” “sounds,” and the like – external, extensional
notions from E-language – and to work at a higher level of abstraction in order
to capture significant generalizations.
Chapter 1 summarizes the way ideas have changed over time concerning the
basic object of inquiry in linguistics. Philologists thought of texts as the essen-
tial reality; in the nineteenth century, the neogrammarians looked at individual
sounds and words; structuralists assumed that language should be studied in
itself, but thought of structure as somehow immanent in an external, social real-
ity. American structuralists thought they could characterize linguistic behavior
from a purely external point of view, until Chomsky’s review of Skinner and
the subsequent triumph of mentalism over behaviorism in the study of cog-
nition showed us that the real object we ought to be studying is the nature,
development, and organization of linguistic knowledge – the language organ.
Against the background of this understanding of what linguistics is centrally
concerned with, chapter 2 sets the stage for more detailed consideration of the
various subparts of the study of the language organ. We argue that the properties
of language cannot be understood if we take it to develop by the inductive
application of general learning strategies to the data available to the child, and
present evidence that the language faculty is a species-specific, biologically
determined capacity.
Chapter 3 deals in more detail with the domain where the cognitive nature of
the subject was clearest from the start, syntax. Prior to the late 1950s, linguistics
focused almost entirely on the smallest units of language, sounds, words, and
minimal meaningful elements (“morphemes”), where the model of the Saus-
surian sign has most plausibility. “Syntax” was largely a promissory note to
the effect that such sign-based analysis would eventually encompass the larger
units of phrases, sentences, etc. When the productive mechanisms of syntac-
tic formation came under scrutiny with the rise of transformational generative
grammar, however, the challenge to the notion that language is essentially an
inventory of signs became apparent. The resulting insights had profound effects
in all areas of the field.
Chapter 4 deals with phonology, the study of what Edward Sapir meant by
the title of his paper “Sound Patterns in Language” (Sapir 1925). The study
of the linguistic organization of sound led, in structuralist linguistics, to the
notion of the “phoneme” as a minimal unit of contrast. This idea, in turn, was
seen as fundamental by students of several other disciplines, and for many
constitutes the main claim of linguistics to scientific status. A focus on the
definition of this unit, however, resulted in a resolutely surface-oriented view
of language, which was only with great difficulty replaced by a more internalist
picture. In fact, generative phonology more or less accidentally discovered the
right alternative to the externalism of phonemics. In what is generally seen to
be the foundational moment in the development of a generative approach to
phonology, Morris Halle offered an argument against classical phonemics that
had profound effects on the field. What is of interest here is the fact that the true
force of Halle’s argument against phonemes only becomes apparent when the
object of inquiry in the study of language is taken to be a form of knowledge
(I-language), not the properties of sounds, words, sentences, etc.
The concern of chapter 4 is with the nature of representations in phonol-
ogy, the description of the way languages use sound properties to distinguish
linguistic elements from one another and the changes that have taken place
in our conception of the relation between phonological form and surface pho-
netic form. Chapter 5 continues the discussion of phonology by asking how
the regularities of linguistic knowledge in this domain are to be characterized.
We discuss reasons to believe that the kind of configuration-specific rules in
terms of which early generative phonology operated can profitably be replaced
with systems of more general constraints. We view the rise of theories of this
sort (such as “Optimality Theory”) as being of a piece with the more general
abandonment in syntax and elsewhere of construction-specific rules in favor of
general principles.
referred to a language “organ.” The result was Anderson and Lightfoot 1999,
which serves as the basis for chapter 2 and much of chapter 10 of the present
book. We are grateful to Dr. Hoffmann not only for giving us an opportunity to
discuss this issue for a discerning and sophisticated audience of non-linguists,
but also for making us think about the importance of the basic issues raised by
that question and the need to present them to a broader audience.
We have received useful comments on various drafts of parts of this book
from Dana Boatman, Norbert Hornstein, Ray Jackendoff, and Charles Yang, as
well as from referees for Cambridge University Press. Some of the research on
which chapters below are based was supported by awards (SBR 9514682 and
BCS 9876456) from the National Science Foundation to Yale University.
Figure 6.1 is reprinted from J.-M. Hombert, ‘Consonant types, vowel height
and tone in Yoruba,’ UCLA Working Papers in Phonetics 33 (1976), 40–54.
Figure 10.1 is reprinted from Richard L. Gregory (ed.), The Oxford Companion
to the Mind (Oxford: Oxford University Press, 1987), 620.
Figure 10.2 is reprinted from D. Purves et al. (eds.), Neuroscience
(Sunderland, MA: Sinauer), 159.
1 Studying the human language faculty
If you meet someone at a cocktail party and tell them you are a carpenter, or
a veterinarian, or an astronomer, they are likely to be quite satisfied with that,
and the subsequent evolution of the conversation will depend, at least in part, on
the depth of their interest in woodworking, animals, or the universe. But if you
tell them you are a linguist, this is unlikely to satisfy whatever curiosity they
may have about you: “Oh, so how many languages can you speak?” is the most
common reply at this point. But in fact, many – probably even most – linguists
actually speak few if any languages in addition to their native tongue, in any
practical sense. A “linguist,” at least in academic disciplinary terms, is not a
person who speaks many languages, but rather someone concerned with the
scientific study of language more generally.
That still doesn’t settle matters, though. As we will discuss below, different
generations of scholars have had rather different notions of what was important
enough about language to warrant study. Languages have histories, and relation-
ships with one another that at least superficially parallel genetic connections,
and one can study those things. Most often, languages are spoken, and it is
possible to study the anatomical, acoustic, and perceptual aspects of speech.
Different spoken forms can mean different things, and we might study the kinds
of things we can “mean” and the ways differences in the forms of words are
related to differences in their meanings. Literature, rhetoric, and the texture of
ordinary verbal interchange show us that we can do many different things with
words, and we might take an understanding of these various potential uses as
the goal of a scientific study of language.
All of these approaches to language, however, assume that language is an
essentially ubiquitous activity of human beings. As an infant, every human
being normally acquires a knowledge of (at least one) language, knowledge
that is acquired while exposed to a limited amount of speech in the child’s
surroundings and that allows him or her to participate in the verbal life of
the community within a relatively short time. Surely the most basic questions
to ask if one wants to understand the phenomenon of language concern the
nature and form of that knowledge, the way it arises and the way it relates to
other aspects of human cognition.
The study of language and cognition during the past several decades has given
increasing credibility to the view that human knowledge of natural language
results from – and is made possible by – a biologically determined capacity spe-
cific both to this domain and to our species. An exploration of the functional
properties of this capacity is the basic program of the present book. These de-
velop along a regular maturational path, such that it seems appropriate to speak
of our knowledge of our language as “growing” rather than as “being learned.”
As with the visual system, much of the detailed structure that we find seems to
be “wired in,” though interaction with relevant experience is necessary to set
the system in operation and to determine some of its specific properties. We
can refer to experience that plays such a role as triggering the organization of
the system, exploiting a term taken from ethology.
The path of development which we observe suggests that the growth of
language results from a specific innate capacity rather than emerging on a purely
inductive basis from observation of the language around us. The mature system
incorporates properties that could not have been learned from observation or any
plausibly available teaching. The deep similarity among the world’s languages
also provides support for the notion that they are the product of a common
human faculty.
These fundamental, apparently native properties are shared by the gestural
languages which develop spontaneously in Deaf communities, quite indepen-
dently of one another or of the language of the surrounding hearing community.
We must conclude that they are neither the result of simple shared history nor
necessary consequences of the articulatory/acoustic/auditory modality of lan-
guage in its most familiar form, spoken language. The development of struc-
turally deficient pidgins into the essentially normal linguistic systems found
in creoles, as a result of transmission through the natural language learning
process in new generations of children, provides additional evidence for the
richness of that process.
The domain-specificity of the language faculty is supported by the many
dissociations that can be observed between control of language structure and
other cognitive functions. Focal brain lesions can result in quite specific lan-
guage impairments in the presence of otherwise normal cognitive abilities;
and vice versa. Natural as well as acquired disorders of language also support
the proposal that the human language faculty is a product of our genetically
determined biological nature: there is evidence that certain language deficits
show a clear distribution within families, patterns that epidemiological and
other studies show to be just what would be predicted of relatively simple
heritable traits (Gopnik and Crago 1991, Tomblin 1997).
Finally, the species-specificity of the human language faculty is supported by
the very fact that (absent severe pathology) every human child exposed in even
limited ways to the triggering experience of linguistic data develops a full, rich
capacity which is usually more or less homogeneous with that of the surrounding
community. Meanwhile, efforts to teach human languages to individuals of
other species, even those closest to us, have uniformly failed. While a certain
capacity for arbitrary symbolic reference can be elicited in some higher apes
(and perhaps even in other animals, such as parrots), syntactic systems even
remotely comparable to those of human languages seem to be quite outside the
capacity of non-humans, despite intensive and highly directed training.
These considerations make it plausible that human language arises in biologi-
cally based ways that are quite comparable to those directing other aspects of the
structure of the organism. The language organ, though, is not to be interpreted
as having an anatomical localization comparable to that of, say, the kidney. Our
understanding of the localization of cognitive function in brain tissue is much
too fragmentary and rudimentary. Certain cortical and subcortical areas can be
shown to subserve functions essential to language, in the sense that lesions in
these regions disrupt language functioning (sometimes in remarkably specific
ways), but an inference from this evidence to a claim that “language is located
in Broca’s (and/or Wernicke’s) area” is quite unwarranted. The linguistic capac-
ity which develops naturally in every normal human being appears to be best
understood in functional and not anatomical terms, at least for the time being.
We will return to these issues of the physical basis of linguistic knowledge in
chapter 10; until then, let us take it as given that some such physical basis must
exist, and concentrate on the nature of linguistic capacities.
since the most important questions about the nature of language do not really re-
spond to the assumptions and methods of the disciplines concerned with group
phenomena that arise fundamentally as a part of social reality (anthropology,
economics, sociology, political science, etc.). The problem of relating group
behavior to the properties of the individual, and its appropriate resolution, was
already prefigured some time ago in an even broader form by Edward Sapir, a
notable visionary in his approach to language and to culture more generally.
In his classic article in the first number of the journal Psychiatry (Sapir 1938)
and in many other writings, Sapir urged that a true understanding of the nature
and effects of culture must necessarily be founded on an understanding of the
individuals who participate in culture and society: “In the psychological sense,
culture is not the thing that is given us. The culture of a group as a whole is not
a true reality. What is given – what we do start with – is the individual and his
behavior.”1 And the central term in this understanding is the nature of the mind
and personality of the individual, not an external characterization of his actions
and responses or some system that somehow exists outside of any particular
person.
Trained by Franz Boas as a cultural anthropologist, Sapir devoted most of his
professional life to the study of language and the development of the nascent
discipline of linguistics (see Anderson 1985, chap. 9 and Darnell 1990 for
sketches of his personal and professional life). For Boas, as for Sapir, language
was a key to all other understanding of cultural realities, since it is only through
an appreciation of the particularities of an individual’s language that we can
hope to gain access to his thoughts and conception of the world, both natural
and social. Sapir, indeed, is widely associated with the notion of “linguistic
relativity,” according to which the structure of an individual’s language not only
reflects but even contributes to determining the ways in which he construes his
world.2 Language thus occupies a central place among the phenomena that can
lead us to an understanding of culture; and it must follow that the way to study
language is in terms of the knowledge developed in individual speakers, not
in terms of such externalities as collections of recorded linguistic acts. In the
history of linguistics, Sapir is remembered especially as one who emphasized
the need to study what speakers know and believe (perhaps unconsciously)
about their language, not simply what they do when they speak.
In addition to his primary focus on linguistics, Sapir also wrote widely of more
general issues in the nature of society, culture, and personality. His Psychiatry
piece was far from isolated in the sympathy it showed with the project of psy-
chiatry and psychoanalysis. This interest in psychiatric issues and approaches
1 Sapir 1994, p. 139. The quotation is from a recent reconstruction of Sapir’s lectures on these
topics.
2 This is the so-called “Sapir–Whorf Hypothesis.”
was certainly not isolated from his work as a linguist and anthropologist. On
the contrary, as reflected in the title of his article (“Why cultural anthropology
needs the psychiatrist”), he felt that the mode of understanding essayed by
the psychiatrist was really the only path to a true appreciation of cultural phe-
nomena, given the claim above that the individual is the basic reality in this study.
Why the psychiatrist, in particular? One must remember that the 1930s, when
Sapir was active in general studies of personality and culture, was the time when
ideas from the positivist and behaviorist traditions predominated in scientific
investigations of psychological questions, and of course these were precisely
antagonistic to a sympathetic investigation of the nature and contents of the mind
and personality. Psychiatry, in contrast, was centrally occupied with exactly this,
and so the fundamental place of the individual in language and culture entailed a
need for the kind of light that could only be shed on core issues by psychiatric
investigation and understanding.
While no one denied Sapir’s stunning brilliance as a linguist, both as a
theoretician and as an analyst, many of his colleagues at the time considered
this “mentalist” aspect of his thought to be an eccentricity – even an aberration –
something to be excused rather than imitated. After all, linguistics was on its
way to attaining genuine status as a science precisely through adopting the be-
haviorism of the day, focusing on purely mechanical methods for collecting and
arranging linguistic data so as to arrive at a purely external analysis of linguistic
behavior, eschewing all metaphysical talk about “minds” and such-like unob-
servables (cf. Bloomfield 1933). Over time, however, the field of linguistics
has arrived at essentially the same conclusion Sapir did, by its own path and
making only a little use of the insight he had to offer.
Words are transmitted from one generation to the next, and they may change
their form in the course of that transmission. Latin pater “father” became padre,
père, patre, pai, etc. in the modern Romance languages. One could characterize
such changes by writing “sound laws” such as the principle that a dental stop
came to be pronounced with vocal fold vibration (that is, [t] came to be pro-
nounced as [d]) between a vowel and a vocalic r at some point in the transition
from Latin to Italian and Spanish. The important observation which made it
possible to envision a science of this sort of thing was the fact that while such
changes affected particular words, they were in principle formulable in terms
of phonetic environments alone, without direct reference to the words them-
selves. Changes in the set of words, then, were the consequence of the working
of sound laws, and linguistics could potentially develop a science of these that
could hope to explain the development of languages over time.
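The word-independence of such laws can be made concrete in a small program sketch. The rule below is a toy rendering of the voicing law just described, with ordinary spelling standing in for phonetic transcription; it mentions only the phonetic environment, never particular words.

```python
import re

# A sound law as a context-sensitive rewrite: the change ([t] comes to be
# voiced, i.e. pronounced as [d]) is stated over an environment (between
# a vowel and r), with no reference to particular words. Spelling is a
# rough stand-in for phonetic transcription.
def voice_t_before_r(form: str) -> str:
    # Rewrite "t" only when preceded by a vowel and followed by "r".
    return re.sub(r"(?<=[aeiou])t(?=r)", "d", form)

# The law applies to whatever words happen to contain the environment.
for latin in ["patre", "matre", "petra"]:
    print(latin, "->", voice_t_before_r(latin))
# patre -> padre, matre -> madre, petra -> pedra
```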
In this view, languages are the basic objects of reality, entities existing in their
own right “out there,” waiting to be acquired by speakers. Linguists sought to
determine and quantify the degree of historical relatedness among sets of lan-
guages, and this relatedness was expressed through tree diagrams or cladograms
such as that in (1.1), introduced by August Schleicher (1861–62).
(1.1)             Proto-Indo-European
                 /                    \
    Aryan–Greek–Italo-Celtic     Germano-Slavic
                |
       Greek–Italo-Celtic
(After Schleicher 1861–62; note that many of the details in this model would be contested
by scholars today.)
Italian than to German, and French grammars may be more similar to German
than to Spanish.4 There is no reason to believe that structural similarity should
be even an approximate function of historical relatedness – assuming that there
is in fact a non-circular notion of historical relatedness to be discovered.
Nonetheless, nineteenth-century linguists focused on languages, seen as in-
ventories of words composed from basic inventories of sounds, which could
change over time. Languages so conceived appeared to change in systematic
ways which could be formulated as regular correspondences, each the product
of some combination of sound changes that could be expressed as sound laws
independent of the identities of the individual words affected. By the end of
the nineteenth century, linguists knew that this was not the entire story: there
were other regularities of language change which could not be stated in purely
phonetic terms, suggesting that at least in these cases it was not the language
(construed as an inventory) or its sounds that were changing, but rather some
kind of more abstract system. This was dealt with in a terminological move:
there were regularities of “sound change,” but there could be other sorts of
change that worked differently. These were called “analogical change,” and
were assumed to be governed by quite a different, more mysterious, kind of
regularity.
Nineteenth-century linguists focused on the surface forms of words, which
are the products of human behavior, rather than on internal processes that un-
derlie and shape that behavior. They thus dealt with E-language rather than
I-language in the terminology of Chomsky 1986. Not all aspects of language
have satisfying accounts in these terms, though, and it is often necessary to
invoke underlying processes and abstract systems that are not manifest parts of
the observable facts of E-language. This is true for such famous instances of
regular sound change as Grimm’s Law, which affected many types of conso-
nants in a related way in a sort of cycle (cf. (1.2a)); or the Great Vowel Shift
in English (cf. (1.2b)), which changed all of the long vowels in another sort of
cycle, raising all vowels one step and making diphthongs of the highest vowels.
Thus, [swe:t] “sweet” (pronounced with a vowel similar to that of modern skate)
became modern English [swi:t]; [ti:m] “time” (pronounced rather like modern
team) became [taim]; [hu:s] “house” (with a vowel similar to that of modern
loose) became [haus], etc. Grimm’s Law and the Great Vowel Shift affect
many sounds at the same time, and represent changes in systems, not simply in
inventories.
4 Linguists idealize and speak of French grammars, but a “French grammar” has much the same
status as a “French liver,” a convenient fiction. Individuals have livers and individuals have
grammars. Grammars may be similar to one another, and we may seek similarities between the
grammars of “English speakers,” whoever they may be exactly. In doing so, we must be prepared
to find differences between the grammars of a man in Houston and a woman in Leeds.
(1.2) a. [Diagram: Grimm’s Law cycle, relating the voiced, voiceless, and aspirated
stops to the fricatives]
b. [Diagram: the Great Vowel Shift – the high vowels i: and u: become the diphthongs
ai and au, while e:, o:, ε:, ɔ:, and a: each raise one step]
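The systemic character of the Great Vowel Shift can likewise be rendered as a single mapping over the inventory of long vowels, rather than as a list of word-by-word replacements. The sketch below uses ASCII stand-ins (E: and O:) for the open-mid vowels; the correspondences themselves are the ones just described.

```python
# The Great Vowel Shift as one mapping over the long-vowel system:
# the highest vowels diphthongize, and every other long vowel raises
# one step. "E:" and "O:" stand in for the open-mid vowels.
GREAT_VOWEL_SHIFT = {
    "i:": "ai",   # [ti:m] "time" -> [taim]
    "u:": "au",   # [hu:s] "house" -> [haus]
    "e:": "i:",   # [swe:t] "sweet" -> [swi:t]
    "o:": "u:",
    "E:": "e:",
    "O:": "o:",
    "a:": "E:",
}

def shift(vowel: str) -> str:
    """Return the post-shift value of a Middle English long vowel."""
    return GREAT_VOWEL_SHIFT.get(vowel, vowel)

assert shift("e:") == "i:"   # sweet
assert shift("i:") == "ai"   # time
assert shift("u:") == "au"   # house
```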
Because the historical linguists of this period were working with the external
forms of language, rather than with internal, underlying processes and abstract
systems, it makes sense that their attention was confined to phonological and
morphological aspects of language, and they paid little attention to change in
syntactic systems. It makes no sense to think of (sets of) sentences, products of
behavior, as being transmitted from one generation to another, because language
acquisition is clearly not a matter of learning sets of sentences. This limitation
was not seen as a matter of concern: they simply worked on what they felt they
had useful tools to elucidate. The debates of the time were primarily about the
nature and causes of sound change.
A part of what made the study of sound change appealingly scientific was
the fact that its systematicities could be related to those of another domain:
the physics and physiology of speech. When we describe Proto-Indo-European
/p,t,k/ as becoming /f,θ,χ/ in the Germanic languages (e.g., prehistoric pater
becoming modern English father, tres becoming three, etc.), we can unify
these by saying that one kind of formational occlusion (complete blockage
of the oral cavity followed by a rapid release, accompanied by a brief burst of
noisy airflow) is replaced by another (incomplete blockage of airflow, allowing
for noisy, turbulent flow during the period of occlusion). With this unitary
formulation in hand, we can hope to find reasons why one sort of occlusion
should be replaced by another, a rather more hopeful task than that of finding
reasons why one arbitrary set of symbols should be replaced by another equally
arbitrary one.
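This unitary formulation lends itself to a sketch in which segments are classified by articulatory properties, so that the change can be stated once, over the class of voiceless stops, rather than three times over arbitrary symbols. The classification below is a standard idealization introduced here for illustration.

```python
# Grimm's Law stated over articulatory properties rather than symbols:
# one change (complete occlusion becomes incomplete, i.e. a stop becomes
# a fricative) covers /p/, /t/, /k/ at once.
VOICELESS_STOPS = {"p": "labial", "t": "dental", "k": "velar"}
FRICATIVE_AT = {"labial": "f", "dental": "θ", "velar": "χ"}

def grimm(segment: str) -> str:
    """Map a voiceless stop to the fricative at the same place of articulation."""
    place = VOICELESS_STOPS.get(segment)
    if place is not None:
        # The unitary change: occlusion -> noisy, turbulent airflow,
        # holding place of articulation constant.
        return FRICATIVE_AT[place]
    return segment

print([grimm(s) for s in "ptk"])   # ['f', 'θ', 'χ']
```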
It is natural that the development of rigorous methods in historical linguistics
more or less coincided with the beginnings of serious investigation of the
sentences like pears like me became modern I like pears) to the greater interest
taken in persons than in things at one stage in the history of English. For other
examples, see Lightfoot 1999. Hardly anyone was satisfied that the result of
such speculations provided a genuine science of (the history of) language.
On that view, scientific investigation can only concern itself with external observ-
ables. Properties of mind (if indeed there are such) are intrinsically beyond the
reach of science. Bloomfield in particular (contrary to his subsequent reputa-
tion in some circles) did not deny the existence of the mind, but maintained that
its properties were so far beyond the reach of (current) science as to make its
investigation pointless.
What was consistently true through this period was that the primary object
of inquiry in the field was not something historical, but something present to
contemporary speakers of a given language. This was always something ex-
ternal, and the basic question of the field had become: what are the properties
of the sets of sounds, words, sentences, etc. recorded in attested acts of speak-
ing? Whether thought of as deriving from social conventions or as the external
responses corresponding to particular stimuli, linguistic objects were consis-
tently seen in this external mode. The commitment to an E-language view of
what linguistics studies was thus axiomatic and complete.
sentences from non-sentences, detect ambiguities, etc., apparently forces us to the con-
clusion that this grammar is of an extremely complex and abstract character, and that the
young child has succeeded in carrying out what from the formal point of view, at least,
seems to be a remarkable type of theory construction. Furthermore, this task is accom-
plished in an astonishingly short time, to a large extent independently of intelligence,
and in a comparable way by all children. Any theory of learning must cope with these
facts. (Chomsky 1959, p. 57)
These remarks had a number of profound consequences. For one, they made
it clear that the generalized mechanisms of behaviorist learning theory, based on
attributing as little structure as possible to the organism in particular domains,
were quite incapable of dealing with the acquisition of human language. For
another, they brought the problem of learning to the forefront in the study
of language: where previously linguists had generally been content to char-
acterize the language itself, Chomsky made it clear that an understanding of
language cannot proceed without asking how a speaker’s ability to use it arises
in development.
But for linguists, the most profound effect of these arguments was a shift
in our conception of the object of study in the field. Chomsky stressed that
the basic problem is not one of characterizing what people do: it is rather one
of characterizing what they know. The central reality of language is the fact
that we call someone a speaker of, say, Japanese, because of a certain kind of
knowledge that he or she has. If that is the case, linguists need to find a way
to study the structure of this knowledge, and while the things people say and
do can constitute important evidence, that is not all there is, or even the most
important thing. This knowledge is what we are calling a person’s language
organ.
In this focus on the nature of language as a form of knowledge, an aspect of
the structure of the mind, linguists have thus returned to a conception much like
Sapir’s of the centrality of the individual to an understanding of linguistic and
cultural phenomena. The decline of behaviorist assumptions in the last several
decades of the twentieth century necessarily led to a much broader consensus
about the need to understand the mind in its own terms: if Bloomfield was
indeed right that the mind was beyond the reach of current science, the thing to
do was to develop the relevant science, not study something else.
Much of academic psychology still finds itself preoccupied with externalist
issues, and for one reason or another rejects the validity or utility of conceiving
its object in terms of the minds of individuals described at some appropriate level
of abstraction. The result has been the rise of cognitive science as a discipline
whose goal is precisely a science of the mind. Combining ideas from linguistics,
computer science, philosophy, anthropology, and cognitive psychology, this
emerging field focuses squarely on the nature of mental and cognitive life.
2 Language as a mental organ
In this chapter, we pursue an important source of evidence for the claim that
human language has a specialized basis in human biology: the relation between
what a speaker of a language can be said to “know” and the evidence that is
available to serve as the basis of this knowledge. The apparently common-sense
notion that an adult speaker’s knowledge of his/her language arises by simple
“learning,” that is, as a direct generalization of experience, turns out to pose
a logical paradox. We begin with two brief examples that illustrate this point,
and then explore the consequences of this for the mechanisms that must in fact
underlie the development of language organs in normal human speakers.
2.1 We know more than we learn
the reduced form in certain places. Yet, all children typically attain the ability
to use the forms in the adult fashion, and this ability is quite independent of
intelligence level or educational background. Children attain it early in their
linguistic development. More significantly, children do not try out the non-
occurring forms as if testing a hypothesis, in the way that they “experiment”
by using forms like goed and taked. The ability emerges perfectly and as if by
magic.
Another example. Pronouns like she, her, he, him, his sometimes may refer
back to a noun previously mentioned in a sentence (2.1a–c). However, one can
only understand (2.1d) as referring to two men, Jay and somebody else; here
the pronoun may not refer to Jay, unlike (2.1a–c).
(2.1) a. Jay hurt his nose.
b. Jay’s brother hurt him.
c. Jay said he hurt Ray.
d. Jay hurt him.
To extend this point, consider some more complex examples, as in (2.2):
(2.2) a. When Jay entered the room, he was wearing a yellow shirt.
b. Jay was wearing a yellow shirt when he entered the room.
c. When he entered the room, Jay was wearing a yellow shirt.
d. He was wearing a yellow shirt when Jay entered the room.
e. His brother was wearing a yellow shirt when Jay entered the
room.
In all of the sentences in (2.2) the pronoun (he or his) may refer to some
other individual, not mentioned. It may also refer to Jay – in all cases, that
is, except (2.2d), where the wearer of the yellow shirt can only be understood
to be someone other than Jay. Again, all speakers understand these sentences
in the same way, but we may legitimately be puzzled at the source of this
commonality. It is quite unlikely to have come from any explicit instruction: as
far as we know, these points about the interpretation of pronouns had not been
systematically noted, even by grammarians, prior to the late 1960s (Ross 1967,
Langacker 1969, Reinhart 1976, McCawley 1999).
As adults we generalize that a pronoun may refer to another noun within the
same sentence except under very precise conditions (as in (2.1d) or (2.2d)). But
then, how did we all acquire the right generalization, particularly knowledge of
the exceptions? In the case of (2.2d), we might be tempted to say that it is only
natural that a pronoun should not be able to refer to an individual mentioned only
later in the sentence, but the evidence of (2.2c,e) shows that such “backwards
anaphora” is in fact possible under some circumstances. Furthermore, we will
see in chapter 9 that even very young children know when backwards anaphora
is possible and when it is not.
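The force of this point can be seen by encoding the judgments in (2.2) as data and testing the simplest linear hypothesis against them, as in the sketch below (ours, for illustration); the structural conditions that actually govern these cases are taken up in chapter 9.

```python
# The judgments in (2.2) as data, and a naive linear rule tested against
# them: predict that coreference with Jay is possible iff the pronoun
# does not precede Jay. The rule fails on (2.2c) and (2.2e), so linear
# order cannot be the generalization that learners acquire.
judgments = {
    "(2.2a)": {"pronoun_first": False, "coreference_ok": True},
    "(2.2b)": {"pronoun_first": False, "coreference_ok": True},
    "(2.2c)": {"pronoun_first": True,  "coreference_ok": True},
    "(2.2d)": {"pronoun_first": True,  "coreference_ok": False},
    "(2.2e)": {"pronoun_first": True,  "coreference_ok": True},
}

def linear_rule(pronoun_first: bool) -> bool:
    """A purely linear hypothesis: coreference is blocked when the pronoun comes first."""
    return not pronoun_first

for name, d in judgments.items():
    predicted = linear_rule(d["pronoun_first"])
    mark = "ok" if predicted == d["coreference_ok"] else "WRONG"
    print(f"{name} predicted={predicted} actual={d['coreference_ok']} {mark}")
# WRONG for (2.2c) and (2.2e): backwards anaphora is sometimes possible.
```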
than Tim is? Why do the sentences (2.2a–c,e) not provide an analogical basis for
coreference between Jay and he in (2.2d)? The point is that language learners
arrive at certain very specific generalizations, and fail to arrive at certain other
logically possible ones, in ways that cannot be founded on any independent
general notion of induction or analogy.
A variant on this “solution” is to claim that children learn not to say the deviant
forms because they are corrected by their elders. Alas, this view offers no better
insight for several reasons. First, it would take an acute observer to detect and
correct the error. Second, where linguistic correction is offered, young children
are highly resistant and often ignore or explicitly reject the correction. Third,
in the examples discussed, children do not overgeneralize and therefore parents
have nothing to correct; this will become clearer when we discuss experimental
work on young children later in this chapter and in chapter 9.
So the first “easy” solution to the poverty-of-stimulus problem is to deny
that it exists, to hold that the environment is rich enough to provide evidence
for where the generalizations break down. But the problem is real, and this
“solution” does not address it.
The second “easy” answer would be to deny that there is a problem on the
grounds that a person’s language is fully determined by genetic properties. In
that case, there would be nothing to be learned. Yet this answer also cannot
be right, because people speak differently, and many of the differences are
environmentally induced. There is nothing about a person’s genetic inheritance
that makes her a speaker of English; if she had been raised in a Dutch home,
she would have become a speaker of Dutch.
The two “easy” answers either attribute everything to the environment or
everything to the genetic inheritance, and we can see that neither position is
tenable. Instead, language emerges through an interaction between our genetic
inheritance and the linguistic environment to which we happen to be exposed.
English-speaking children learn from their environment that the verb is may
be pronounced [ɪz] or [z], and native principles prevent the reduced form from
occurring in the wrong places. Likewise, children learn from their environment
that he, his, etc. are pronouns, while native principles entail that pronouns may
not refer to a preceding noun under specific circumstances. The interaction of
the environmental information and the native principles accounts for how the
relevant properties emerge in an English-speaking child.
We will sketch some relevant principles below. It is worth pointing out that
we are doing a kind of Mendelian genetics here, in the most literal sense. In
the mid-nineteenth century, Mendel postulated genetic “factors” to explain the
variable characteristics of his pea plants, without the slightest idea of how these
factors might be biologically instantiated. Similarly, linguists seek to identify
information which must be available independently of experience, in order for
analyze incoming speech signals, and more. At the center is the biological notion
of a language organ, a grammar.
properties. It occurs where verbs occur and not where nouns occur: I want to
visit Chicago, but not *the visit Chicago nor *we discussed visit Chicago. So
the expression visit Chicago is a verb phrase (VP), where the V visit is the head
projecting the VP. This can be represented as a labeled bracketing (2.3a) or
as a tree diagram (2.3b). The verb is the head of the VP and the noun is the
complement.
(2.3) a. [VP V visit N Chicago]
b.        VP
         /  \
        V    N
      visit  Chicago
(2.4) [CP [DP [D what] [NP cityj]] [CP [C willi] [IP [DP [D the] [NP student]]
[IP [I ei] [VP [V visit] [DP ej]]]]]]
(the structure of What city will the student visit?, with ei and ej marking the
positions from which willi and the phrase containing cityj have been displaced)
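Structures like (2.3) and (2.4) are naturally represented as recursive objects in which a head projects its phrase label. The following minimal sketch (ours, for illustration) builds such trees and prints them in the labeled-bracketing notation used above.

```python
# Constituent structure as a recursive data type: a phrase is a labeled
# node, and the head (here V) projects the phrase label (VP).
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class Node:
    label: str                             # category label: "VP", "V", "N", ...
    word: Optional[str] = None             # terminal material, e.g. "visit"
    children: List["Node"] = field(default_factory=list)

    def bracketed(self) -> str:
        """Render as a labeled bracketing, e.g. [VP [V visit] [N Chicago]]."""
        if self.word is not None:
            return f"[{self.label} {self.word}]"
        return f"[{self.label} " + " ".join(c.bracketed() for c in self.children) + "]"

vp = Node("VP", children=[Node("V", word="visit"), Node("N", word="Chicago")])
print(vp.bracketed())   # [VP [V visit] [N Chicago]]
```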
What differentiates the two cases in (2.5), such that the reduced form ’s
is possible in the one instance but not in the other? One potential difference
emerges if we consider the structures of the sentences in (2.5) in more de-
tail. In particular, if we take into account the notion that the element who
has been displaced from its natural position in the sentence, we see that the
position from which this movement takes place differs in the two cases:4
(2.6) a. I wonder whoi the professor is/*’s ei here.
b. I wonder whoi the professor is/’s talking about ei now.
We can notice that in example (2.6a) the verb is/’s immediately precedes
the trace of the displaced element who, while in (2.6b) it does not. The correct
generalization seems to be that reduced forms of auxiliary elements cannot
immediately precede empty elements, such as the trace of displacement in
these cases.
This does indeed describe many of the relevant circumstances under which
reduced auxiliaries are impossible, but it is too narrow to be satisfying. There
is no direct evidence for the situation in which reduced forms are excluded,
so some native principle must be involved, but it seems unlikely that such a
principle as “reduced auxiliary verbs cannot appear before an empty element
such as a trace” forms a distinct part of the human genetic endowment for
language; or rather, to the extent that it does, it must surely be as a by-product
of something more general and principled.
We might attempt to improve on this hypothesis as follows. Note that, in a
sentence like Kim’s happy, the auxiliary element ’s is grammatically the head
of the IP, taking the adjective phrase (AdjP) happy as its complement. In pro-
nunciation, however, it forms part of a single unbroken unit with the preceding
word Kim, as the apostrophe in the conventional spelling ’s suggests, despite
the fact that the sequence Kim+’s does not constitute a phrase syntactically.
In the case that interests us, reduced forms of auxiliaries such as ’s (as well as
’ve, ’re, and the reduced forms of am, will, would, shall, and should) do not have
enough phonological “substance” to be words on their own, and thus necessarily
combine with a word on their left to make up a single phonological word in the
pronunciation of sentences in which they occur. In terms of pronunciation, that
is, Kim’s in Kim’s happy is just as indissoluble a unit as birds in Birds fly.
4 As above, we indicate the source of a copied element with an empty symbol “e,” generally
referred to in the syntactic literature as a trace. The identical subscripts attached to the copied
element and its trace indicate that both have the same reference.
Not every element that functions as a unit from the point of view of the
syntax corresponds to a whole word in pronunciation. Syntactic units that do
not contain enough phonetic material to make up a whole word by themselves,
such as the reduced auxiliaries in English, are referred to as (simple) clitics.
Clitics are little words which occur in many, perhaps all languages, and have
the property of not being able to stand alone.
In some languages, these elements attach systematically to the word to their
left; in others, to the right, and in others the direction of attachment depends
on details of the syntactic and/or phonological structure. This is a property of
particular languages that is plausibly learned from overt data: the child need
only determine that some element is a clitic (in the sense above), and then
see whether it is pronounced indissolubly with the word on its left or on its
right. What is consistently the case, however, is that syntactic elements that
do not constitute words in their own right must attach to some other word as
clitics in order to be pronounced at all. This much plausibly reflects properties
of UG.
The phonological evidence clearly supports the claim that in English, clitics
attach to their left, not their right. This direction of association is supported
by the fact that the pronunciation of the clitic varies from [s] to [z] to [əz] as
a function of the final sound of the preceding word (2.7a); and is completely
insensitive to the shape of the word on its right. This variation in shape is exactly
the same as that which we can observe in the shape of the regular plural ending
(spelled (e)s but pronounced in the same three ways as ’s), again as a function of
the final sound of the preceding word (2.7b). Similarly, while the third person
singular present ending of verbs is always spelled -s, it shows the same variation
in pronunciation (2.7c), as does the ending of possessive forms (2.7d).
(2.7) a. Pat’s ([s]) leaving, Kim’s ([z]) coming in, and Chris’s ([əz])
replacing Jan.
b. packs ([s]), pals ([z]), passes ([əz])
c. infects ([s]), cleans ([z]), induces ([əz])
d. Pat’s ([s]), Kim’s ([z]) (or) Chris’s ([əz]) corkscrew
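The conditioning in (2.7) is regular enough to state as a small procedure: the choice among [s], [z], and [əz] depends only on the final sound of the host word on the left. In the sketch below, spelling crudely stands in for phonetic transcription, so the sound classes are approximations; the conditioning logic is the one just described.

```python
# The shared conditioning in (2.7): clitic 's, plural -(e)s, third person
# singular -s, and possessive 's all pick their pronunciation from the
# final sound of the word on their LEFT, never from the word that follows.
SIBILANT_ENDINGS = ("s", "z", "x", "sh", "ch")    # rough, spelling-based
VOICELESS_ENDINGS = ("p", "t", "k", "f")

def s_allomorph(host: str) -> str:
    """Choose [s], [z], or [əz] for an -s element attaching to host."""
    w = host.lower()
    if w.endswith(SIBILANT_ENDINGS):
        return "əz"      # Chris's, passes
    if w.endswith(VOICELESS_ENDINGS):
        return "s"       # Pat's, packs
    return "z"           # Kim's, pals

for host in ["Pat", "Kim", "Chris", "pack", "pal", "pass"]:
    print(f"{host} + s -> [{s_allomorph(host)}]")
```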
In Kim’s happy, the element ’s is a clitic in terms of its pronunciation. This
does not alter the fact that in syntactic terms, it is a verb which serves as the head
of the phrase [IP ’s happy], a simple structure which involves no displacement
or deletion. Compare this with the case of the underlined is in Kim’s happier
than Tim is, though. This latter element is also a verb that serves as the head of
a phrase, though what it is followed by within that phrase is something which
is understood but not pronounced. Syntactically, the phrase has the structure
[IP is e], where the unpronounced element e is understood as happy.
Now recall that our problem is to account for the fact that not only are
sentences like (2.8a) impossible in English, but language learners know this
without ever having been explicitly told not to use reduced auxiliary forms in
such positions.
(2.8) a. *Tim’s happier than Kim’s.
b. Tim’s happier than Kim is.
What differentiates the bad sentence (2.8a) from its well-formed near-twin
(2.8b)? The only difference is between the final phrase [IP ’s e], in (2.8a), vs. the
phrase [IP is e], in (2.8b). While reflecting no distinction of syntactic structure or
meaning, this difference does have one consequence: in each of the sentences in
(2.8) the final phrase contains only a single element, but in (2.8a) that element
is a clitic (’s), while in (2.8b) it is the word is.
We already know that a clitic cannot (by definition) constitute a word by
itself. A rather natural part of the theory of sound structure is the prosodic
hierarchy (see Nespor and Vogel 1986 for basic discussion of this notion),
according to which utterances are composed of phrases, phrases of words,
words of smaller units called feet, which are themselves composed of syllables.5
It stands to reason that since a clitic by itself cannot constitute a word, a phrase
consisting only of a clitic would contain no word, and would thus be ill-formed.
We might, then, attribute the ill-formedness of (2.8a) in English to the fact that
it contains the ill-formed phrase [IP ’s e].
It may appear that we have played a sort of shell game here, trading on the
fact that the word “phrase” has more than one sense: on the one hand, it refers
to a unit of pronunciation, associated with particular intonational contours,
possibilities for pause, etc. and composed of smaller units of pronunciation,
(phonological) words. On the other hand, however, a “phrase” can be interpreted
as a unit of syntactic organization, composed of units of grammar and meaning
whose actual pronunciation is, for syntactic purposes, quite irrelevant. Thus the
final syntactic phrase in both sentences in (2.8) consists of two elements: a verb
(’s or is) and an adjective (e, interpreted as “happy”). The explanation we are
proposing requires us to say that because [IP ’s e] is syntactically a phrase, its
pronunciation as a phonological phrase would be ill-formed because it would
not contain enough phonological material to make up even a single word. But
what justifies us in the apparent equation of syntactic and phonological notions
of “phrase”?
We must note that syntactic phrasing is not always faithfully reflected in
phonological phrasing. In a sentence like This is the cat, that chased the rat,
that ate the cheese, . . . the boundaries of phonological phrases coincide roughly
with the commas. As a result, that chased the rat is a phonological phrase here,
but not a syntactic phrase: syntactically the phrasing is something like [DP the
cat [IP that chased [DP the rat [IP that ate the cheese . . .]]]].
5 Various authors have proposed slightly different inventories of prosodic constituent types, but
without affecting the overall point here in any material way.
*I wonder who the professor’s here, *I wonder where the concert’s on Wednesday,
*Kim’s happier than Tim’s, etc., suggest that clitics like ’s cannot host a deleted
(understood) item in this way.
How do these facts interact with the observation we made above that clitic
elements are in fact attached to material on their left, forming part of a phrase
with that material in actual pronunciation? Such “migration” of clitic material
from one phrase to another does not actually alter the force of (2.9). In (2.10a),
for example, happy is the complement of is and therefore reduced ’s may at-
tach to the preceding word without leaving the phrase it heads phonologically
deficient. The same is true for the first is of (2.10b). However, Tim is not the
complement of the underlined is in (2.10b); in this case, the subject Tim and
the copula verb is have permuted. As a result, the underlined is is the only overt
representative of its phrase, and cannot be reduced.
(2.10) a. Kim’s happy.
b. Kim is (/’s) happier than is(/*’s) Tim.
But while the principle in (2.9) is plausibly derivable from general require-
ments that are part of UG, and a consequence of (2.9) is that a reduced is may
not be the only phonological material representing the syntactic phrase of which
it is the head, this is not the whole story. Consider the following examples:
(2.11) a. John, my dear’s, a bastard.
b. *John’s, my dear, a bastard.
c. He is/*’s too going to fix it.
d. Fred is tired of Spinoza, just as Mary is/*’s of Schopenhauer.
e. She’s a better scientist than he is/*’s an engineer.
In all of these cases, the reduced auxiliary is impossible in circumstances
where the syntactic phrase it heads contains other material that ought to satisfy
(2.9). There is, however, a property that characterizes the sentences in (2.11):
the fact that each involves a construction with special intonation. In each of the
bad cases, the material immediately following the auxiliary is/’s is set off as
(part of) a separate (phonological) phrase in pronunciation.
At this point, we know enough to provide an account of the circumstances
under which reduced auxiliaries are impossible. Suppose that we start with the
syntactic structure of the sentence, and then identify the corresponding pro-
nunciation, in terms of phonological phrases. Simplifying somewhat, we can
say that the most natural phrasing is one that mirrors the syntax: syntactic
phrases correspond to phonological phrases. Some constructions, however, are
exceptions to this in that they enforce a phonological phrase boundary in a
place where one might not be motivated syntactically. These include the vari-
ous parenthetical insertions, emphases, and ellipses illustrated in (2.11). Once
phonological phrases have been delimited, we can say that any such phrase that
contains some phonological material, but not enough to constitute at least one
phonological word, is ipso facto ill-formed. Since clitics do not qualify as full
phonological words, any phrase consisting of a clitic alone will suffer such a
fate. In actual pronunciation, finally, prosodically deficient elements (simple
clitics) are attached phonologically to the word on their left.
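Since this account is in effect a small procedure, it can also be stated schematically. The sketch below (ours; the list-of-lists representation and the split notation are simplifying inventions) follows the three steps just described: phrase the utterance on the model of the syntax, force extra boundaries for the special constructions, and reject any phrase lacking a full phonological word.

    def is_clitic(tok):
        return tok.startswith("'")

    def phonological_phrases(syntactic_phrases, forced_splits=()):
        """Mirror the syntactic phrasing, then split where a parenthetical,
        emphasis, or ellipsis forces a phonological phrase boundary."""
        phrases = [list(p) for p in syntactic_phrases]
        for i, j in sorted(forced_splits, reverse=True):  # split phrase i before token j
            phrases[i:i + 1] = [phrases[i][:j], phrases[i][j:]]
        return [p for p in phrases if p]

    def well_formed(phrases):
        """Every pronounced phrase must contain at least one full word."""
        return all(any(not is_clitic(t) for t in p) for p in phrases)

    # Kim's happy: the clitic is phrased with its complement, which supplies a word.
    print(well_formed(phonological_phrases([["Kim"], ["'s", "happy"]])))   # True
    # *John's, my dear, a bastard: the parenthetical (omitted from the tokens
    # for simplicity) forces a boundary right after the auxiliary, stranding
    # 's in a wordless phrase.
    print(well_formed(phonological_phrases([["John"], ["'s", "a", "bastard"]],
                                           forced_splits=[(1, 1)])))       # False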
Though fairly intricate, this account provides an answer to the problem
sketched at the outset: how to account for the fact that the child comes to know
when the reduced forms of the auxiliary elements are or are not possible, in the
absence of direct evidence. In order to achieve this, the child needs to learn that
in English (a) the elements am, is, has, etc. have simple clitic alternants ’m, ’s,
etc.; (b) clitics attach to their left; and (c) certain constructions (parenthetical
insertions, emphasis, ellipses) get phrased in a special way phonologically. All
of these facts are directly attested in the linguistic data available to the child,
and can thus be learned without difficulty.
But these observable phenomena interact with principles of UG, which are
part of the child’s initial linguistic endowment, and thus do not have to be
learned, principles which are not language-particular but rather apply to all
languages: (a) notions of the phonological word and phrase, together with the
requirement that phonological phrases must be made up of at least one phono-
logical word; and (b) the preference for parallel structure in syntax and in
prosodic organization. Jointly these will entail that a syntactic phrase with
a clitic element such as ’s as its only phonological content will be excluded,
without requiring the child to have overt evidence for the badness of such ex-
amples, as we have seen above.
Part of what a child growing a grammar needs to do is to determine the clitics
in his or her linguistic environment, knowing in advance of any experience that
“clitics” are small, unstressed items attached phonologically to an adjacent word
in ways that may be contrary to the syntactic relations they bear to surrounding
material. This predetermined knowledge – the nature of clitics and the fact that
they cannot by themselves satisfy the requirement that phrases be represented
by at least one phonological word – is contributed by the linguistic genotype
and is part of what the child brings to language acquisition. The environment
provides examples such as Pat’s happy, Bob’s happy, and Alice’s happy too.
The child can observe that the three instances of ’s in these cases vary in their
pronunciation ([s] after Pat, [z] after Bob, and [əz] after Alice). This variation
is quite systematic, and, as noted, follows the same principles as those that
determine the form of the plural ending in cats, knobs, palaces among other
endings in English. These facts confirm that ’s must be a clitic, and must attach
phonologically to its left.
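That alternation can itself be stated as a small rule, given here in a rough orthographic rendering (the segment classes are abbreviated for the illustration; a serious account would operate on phonological features rather than letters):

    SIBILANTS = {"s", "z", "sh", "zh", "ch", "j"}
    VOICELESS = {"p", "t", "k", "f", "th"}

    def s_allomorph(final_segment):
        """Form of clitic 's (or of plural -s) given the host's final segment."""
        if final_segment in SIBILANTS:
            return "əz"    # Alice's; palaces
        if final_segment in VOICELESS:
            return "s"     # Pat's; cats
        return "z"         # Bob's; knobs

    for host, final in [("Pat", "t"), ("Bob", "b"), ("Alice", "s")]:
        print(host + "'s:", s_allomorph(final))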
Under this approach, the child is faced with a chaotic environment and in
scanning it, identifies clitics . . . among many other things, of course (Lightfoot
1999). This is the answer that we provide to our initial problem, and it is an
answer of the right shape. It makes a general claim at the genetic level (“clitic”
is a predefined category) and postulates that the child arrives at a plausible
analysis on exposure to a few simple expressions. The analysis that the child
arrives at predicts no reduction for the underlined is in Kim is happier than Tim
is, I wonder who the professor is here, and countless other cases, and the child
needs no correction in arriving at this system. The very fact that ’s is a clitic,
a notion defined in advance of any experience, dictates that it may not occur
in certain contexts. It is for this reason that the generalization that is may be
pronounced as ’s breaks down at certain points and does not hold across the
board.
Consider now the second problem, the reference of pronouns. An initial
definition might propose that pronouns refer to a preceding noun, but the data
of (2.1) and (2.2) showed that this is both too strong and too weak. It is too strong
because, as we saw, in (2.1d) him may not refer to Jay; in (2.1b) him may refer
to Jay but not to Jay’s brother. The best account of this complex phenomenon
seems to be to invoke a native principle which says, approximately (2.12).
(2.12) Pronouns may not refer back to a higher nominal element
contained in the same clause or in the same DP.
In (2.13) we give the relevant structure for the corresponding sentences of
(2.1). In (2.13b) the DP Jay’s brother is contained in the same clause as him
and so him may not refer back to that DP: we express this by indexing them
differently. On the other hand, Jay is contained inside the DP and is not “higher”
than him, so those two nouns do not need to be indexed differently – they may
refer to the same person and they may thus be coindexed. Again we see the
constituent structure illustrated earlier playing a central role in the way in which
the indexing computations are carried out. In (2.13d) Jay is in the same clause
as him and so the two elements may not be coindexed; they may not refer to the
same person. In (2.13c) Jay is not contained in the same clause as he: Jay and
he may thus refer either to the same person or to different people. In (2.13a) his
is contained inside a DP and may not be coindexed with anything else within
that DP; what happens outside the DP is not systematic; so his and Jay may
corefer and do not need to be indexed differently.
(2.13) a. [IP Jayi hurt [DP hisi/j nose]]
b. [IP [DP Jayi ’s brother]k hurt himi/j/∗k ]
c. [IP Jayi said [IP hei/j hurt Ray]]
d. [IP Jayi hurt himj/∗i ]
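The indexing computation can be made concrete. In the sketch below (a deliberate simplification of ours, not binding theory proper), each nominal is tagged only with its minimal containing clause or DP, and a pronoun is barred from coreference with a nominal in that same minimal domain; the notion "higher" is abstracted away, since these toy cases do not need it.

    # Minimal containing domain (clause or DP) for each nominal in (2.13a-d).
    examples = {
        "a": {"Jay": "IP", "his": "DP"},                        # (2.13a)
        "b": {"Jay's brother": "IP", "Jay": "DP", "him": "IP"}, # (2.13b)
        "c": {"Jay": "IP-main", "he": "IP-embedded"},           # (2.13c)
        "d": {"Jay": "IP", "him": "IP"},                        # (2.13d)
    }

    def may_corefer(pronoun, antecedent, domains):
        """(2.12), roughly: a pronoun may not refer back to a nominal
        within its own minimal clause or DP."""
        return domains[pronoun] != domains[antecedent]

    print(may_corefer("his", "Jay", examples["a"]))             # True
    print(may_corefer("him", "Jay", examples["b"]))             # True
    print(may_corefer("him", "Jay's brother", examples["b"]))   # False
    print(may_corefer("he", "Jay", examples["c"]))              # True
    print(may_corefer("him", "Jay", examples["d"]))             # False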
The idea that pronouns refer to a preceding noun is shown to be too weak
because sometimes, as in (2.2c,e), the pronoun refers to a following noun. In
this case, the relevant principle seems to be that such “backwards anaphora” is
not possible if the pronoun not only precedes the noun, but is also “higher” (in
a precise sense whose details are not relevant to our present concerns) in the
syntactic structure than the noun which is to serve as its antecedent.6 In (2.2c),
the pronoun precedes Jay, but this is acceptable because the pronoun appears
within a subordinate clause, and thus is not relevantly “higher.” In (2.2e), the
pronoun is subordinated by virtue of appearing as a possessor within a larger
DP. In (2.2d), however, the pronoun appears as subject of the main clause, and
is thus (in the relevant structural sense) syntactically higher than the following
noun, which therefore cannot serve as its antecedent.
We could have illustrated these points equally well with data from French or
from Dutch, or from many other languages, because the principles apply quite
generally, to pronouns in all languages. If we assume a native principle, available
to the child independently of any actual experience, language acquisition is
greatly simplified. Now the child does not need to “learn” why the pronoun
may refer to Jay in (2.13a) or (2.13b,c) but not in (2.13d); in (2.2a–c,e) but
not in (2.2d), etc. Rather, the child raised in an English-speaking setting has
only to learn that he, his, him are pronouns, i.e. elements subject to Principle B
(see note 6). This can be learned by exposure to a simple sentence like (2.1c)
(structurally (2.13c)), uttered in a context where he refers to Jay; that suffices to
show that he is neither an anaphor nor a name – the other possible noun types,
according to most current views – and hence must be a pronoun.7
One way of thinking of the contribution of the linguistic genotype is to view it
as providing invariant principles and option-points or parameters. There are
invariant principles, such as that clitics attach phonologically to adjacent words
by virtue of their prosodically “weak” character; that phonological phrases are
based on words, which are based on smaller prosodic units; that phonological
and syntactic phrases are generally related in a particular way; that pronouns
cannot be locally coindexed and that names may not be coindexed with a higher
DP, etc. Taken together, these have consequences, such as the principle in (2.9)
which requires phrases that are pronounced at all to contain at least one full
phonological word. Meanwhile, there are also options: direct objects may pre-
cede the verb in some grammars (German, Japanese) and may follow it in others
(English, French); some constructions may have special intonation associated
with them; clitics in some grammars attach to the right and in others to the
left, etc. These are parameters of variation and the child sets these parameters
one way or another on exposure to particular linguistic experience. As a re-
sult a grammar emerges in the child – a language organ, part of the linguistic
phenotype. The child has learned that ’s is a clitic and that he is a pronoun;
the genotype ensures that ’s cannot be the only phonological material within a
syntactic phrase and that he is never used in a structurally inappropriate context.
6 The principle alluded to in the last paragraph, that pronouns may not be locally coindexed, is
Principle B of the binding theory; (2.12) is only an informal (and partly inaccurate) rendering of
Principle B. Here we allude to Principle C, that names may not be coindexed with a higher DP
anywhere. In chapter 9 we discuss the acquisition of Principle C.
7 Anaphors are elements that are locally coindexed, according to Principle A of the binding theory,
while names, by Principle C, are never coindexed with a higher DP.
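The division of labor just described, invariant principles plus open parameters fixed by experience, can be caricatured in a few lines (a cartoon of the idea under invented trigger sentences and parameter names, not a serious model of acquisition):

    def grow_grammar(triggers):
        """Fix parameter settings from simple trigger sentences; the invariant
        principles themselves come with the genotype and are not listed here."""
        grammar = {}
        if "visit what city" in triggers:    # objects follow their verb
            grammar["object_order"] = "VO"
        if "Kim's happy" in triggers:        # 's leans on the word to its left
            grammar["clitic_attachment"] = "left"
        return grammar

    print(grow_grammar(["Kim's happy", "visit what city"]))
    # {'object_order': 'VO', 'clitic_attachment': 'left'}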
from their linguistic environment. Bates and Elman provide a recent and partic-
ularly striking instance of this line, claiming that artificial neural networks can
learn linguistic regularities from imperfect but “huge computerized corpora of
written and spoken language.”
Nobody denies that the child must extract information from the environment;
it is no revelation that there is “learning” in that technical sense. After all,
children learn to speak one way when surrounded by speakers of French and
another when surrounded by speakers of Italian. Our point is that there is more
to language acquisition than this. Children react to experience in accordance
with specific principles.
The problem demanding explanation is compounded by other factors. Despite
variation in background and intelligence, people’s mature linguistic capacity
emerges in fairly uniform fashion, in just a few years, without much apparent
effort, conscious thought, or difficulty; and it develops with only a narrow
range of the logically possible “errors.” Children do not test random hypotheses,
gradually discarding those leading to “incorrect” results and provoking parental
correction. In each language community the non-adult sentences formed by very
young children seem to be few in number and quite uniform from one child
to another, which falls well short of random (see chapter 9). Normal children
attain a fairly rich system of linguistic knowledge by five or six years of age and
a mature system by puberty. In this regard, language is no different from, say,
vision, except that vision is taken for granted and ordinary people give more
conscious thought to language.
These, then, are the salient facts about language acquisition, or more properly,
language growth. The child masters a rich system of knowledge without sig-
nificant instruction and despite an impoverished stimulus; the process involves
only a narrow range of “errors” and takes place rapidly, even explosively be-
tween two and three years of age. The main question is how children acquire
so much more than they experience and how the growth takes place.
A grammar, the language organ, represents what a speaker comes to know,
subconsciously for the most part, about his or her native language. It represents
the fully developed linguistic capacity, and is therefore part of an individual’s
phenotype. It is one expression of the potential defined by the genotype. Speak-
ers know what an infinite number of sentences mean and the various ways in
which they can be pronounced and rephrased. Most of this largely subconscious
knowledge is represented in a person’s grammar. The grammar may be used for
various purposes, from everyday events like expressing ideas, communicating,
or listening to other people, to more contrived functions like writing elegant
prose or lyric poetry, or compiling and solving crossword puzzles, or writing a
book about the language organ.
We do not want to give the impression that all linguists adopt this view
of things. In fact, people have studied language with quite different goals in
mind, ranging from the highly specific (to describe Dutch in such a way that it
Each of the items in the triplet – trigger, UG, and grammar – must meet
various demands. The trigger or PLD must consist only of the kinds of things that
children routinely experience and includes only simple structures (see chapter 9
for discussion). The theory of grammar or UG is the one constant and must
hold universally such that any person’s grammar can be attained on the basis
of naturally available trigger experiences. The mature grammar must define an
infinite number of expressions as well-formed, and for each of these it must
specify at least the sound and the meaning. A description always involves these
three items and they are closely related; changing a claim about one of the items
usually involves changing claims about the other two.
8 The notion of a trigger is from ethologists’ work on the emergence of behavioral patterns in
young animals.
The grammar is one subcomponent of the mind, a mental organ which inter-
acts with other cognitive capacities or organs. Like the grammar, each of the
other organs is likely to develop in time and to have distinct initial and mature
states. So the visual system recognizes triangles, circles, and squares through
the structure of the circuits that filter and recompose the retinal image (Hubel
and Wiesel 1962). Certain nerve cells respond only to a straight line sloping
downward from left to right within a specific, narrow range of orientations;
other nerve cells to lines sloped in different directions. The range of angles that
an individual neuron can register is set by the genetic program, but experience
is needed to fix the precise orientation specificity (Sperry 1968).
In the mid-1960s David Hubel, Torsten Wiesel, and their colleagues devised
an ingenious technique to identify how individual neurons in an animal’s vi-
sual system react to specific patterns in the visual field (including horizontal
and vertical lines, moving spots, and sharp angles). They found that particular
nerve cells were set within a few hours of birth to react only to certain visual
stimuli, and, furthermore, that if a nerve cell is not stimulated within a few
hours, it becomes totally inert in later life. In several experiments on newborn
kittens, it was shown that if a kitten spent its first few days in a deprived optical
environment (a tall cylinder painted only with vertical stripes), only the neurons
stimulated by that environment remained active; all other optical neurons be-
came inactive because the relevant synapses degenerated, and the kitten never
learned to see horizontal lines or moving spots in the normal way.
We see learning as a similarly selective process: parameters are provided by
the genetic equipment, and relevant experience fixes those parameters (Piattelli-
Palmarini 1986, 1989). A certain mature cognitive structure emerges at the ex-
pense of other possible structures which are lost irretrievably as the inactive
synapses degenerate. The view that there is a narrowing down of possible con-
nections out of an overabundance of initially possible ones is now receiving
more attention in the light of Hubel and Wiesel’s Nobel Prize winning success.
On the evidence available, this seems to be a more likely means of fine tuning
the nervous system as “learning” takes place than the earlier view that there is
an increase in the connections among nerve cells.
So human cognitive capacity is made up of identifiable properties that are
genetically prescribed, each developing along one of various preestablished
routes, depending on the particular experience encountered during the individ-
ual’s early life. These genetic prescriptions may be extremely specialized, as
Hubel and Wiesel showed for the visual system. They assign some order to our
experience. Experience elicits or triggers certain kinds of specific responses but
it does not determine the basic form of the response.
This kind of modularity is very different from the view that the cognitive fac-
ulties are homogeneous and undifferentiated, that the faculties develop through
general problem-solving techniques. In physical domains, nobody would sug-
gest that the visual system and the system governing the circulation of the blood
are determined by the same genetic regulatory mechanisms. Of course, the pos-
sibility should not be excluded that the linguistic principles postulated here may
eventually turn out to be special instances of principles holding over domains
other than language, but before that can be established, much more must be
known about what kinds of principles are needed for language acquisition to
take place under normal conditions and about how other cognitive capacities
work. The same is of course true for other aspects of cognitive development.
Only on such a basis can meaningful analogies be detected. Meanwhile,
we are led to expect that each region of the central nervous system has its own special
problems that require different solutions. In vision we are concerned with contours and
directions and depth. With the auditory system, on the other hand, we can anticipate a
galaxy of problems relating to temporal interactions of sounds of different frequencies,
and it is difficult to imagine that the same neural apparatus deals with all of these
phenomena . . . for the major aspects of the brain’s operation no master solution is likely.
(Hubel 1978, p. 28)
adults find reduced forms unacceptable with that of non-adult reduced forms
more generally would indicate whether or not children’s grammars contained
the hypothetical genetic constraint. If the genetic constraint is at work, there
should be a significant difference in frequency; otherwise, not.
The target productions were evoked by the following protocols, in which
Thornton and Crain provided children with a context designed to elicit
questions.
(2.15) Protocols for cliticization:
a. Experimenter: Ask Ratty if he knows what that is doing
up there.
Child: Do you know what that’s doing up there?
Rat: It seems to be sleeping.
b. Experimenter: Ask Ratty if he knows what that is up there.
Child: Do you know what that is up there?
Rat: A monkey.
There is, of course, much more to be said about grammars and their ac-
quisition, and we will return to this topic in chapter 9 below. There is also
an enormous technical literature, but here we have briefly illustrated the kind
of issue that work on real-time acquisition can address under our I-language
approach.
2.5 Conclusion
Recent theoretical developments have brought an explosive growth in what
we know about human languages. Linguists can now formulate interesting hy-
potheses and account for broad ranges of facts in many languages with elegant
abstract principles, as we shall see. They understand certain aspects of lan-
guage acquisition in young children and can model some aspects of speech
comprehension.
Work on human grammars has paralleled work on the visual system and has
reached similar conclusions, particularly with regard to the existence of highly
specific computational mechanisms. In fact, language and vision are the areas of
cognition that we know most about. Much remains to be done, but we can show
how children attain certain elements of their language organs by exposure to
only an unorganized and haphazard set of simple utterances; for these elements
we have a theory which meets basic requirements. Eventually, the growth of
language in a child will be viewed as similar to the growth of hair: just as hair
emerges at a particular point in development with a certain level of light, air, and
protein, so, too, a biologically regulated language organ necessarily emerges
under exposure to a random speech community.
From the perspective sketched here, our focus is on grammars, not on the
properties of a particular language or even of general properties of many
or all languages. A language (in the sense of a collection of things people
within a given speech community can say and understand) is on this view an
epiphenomenon, a derivative concept, the output of certain people’s grammars
(perhaps modified by other mental processes). A grammar is of clearer status:
the finite system that characterizes an individual’s linguistic capacity and that
is represented in the individual’s mind/brain, the language organ. No doubt the
grammars of two individuals whom we regard as speakers of the same language
will have much in common, but there is no reason to worry about defining “much
in common,” or about specifying precise conditions under which the outputs of
two grammars could be said to constitute one language. Just as it is unimportant
for most work in molecular biology whether two creatures are members of the
same species (as emphasized, for example, by Dawkins 1976), so too the notion
of a language is not likely to have much importance if our biological perspec-
tive is taken and if we explore individual language organs, as in the research
program we have sketched here and which we elaborate in later chapters.
3 Syntax
sentences as they become mature speakers. Perhaps they acquire some set of
basic structures and construction-types, which may then be elaborated into some
open-ended set, and there were efforts to build models along those lines. But that
involved positing some kind of system, an open-ended, recursive system, and the
available models did not have that capacity. Certain “typological” approaches,
inspired by Greenberg (1963), adopted E-language formulations in identifying
harmonic properties of languages which have “predominant subject–object–
verb” word order, for example, asking to what extent such languages would
show, say, noun–adjective order.
When the productive mechanisms of syntactic structure came under serious
scrutiny, with the development of generative grammar, it became apparent that
even though a description of the words of a language is certainly necessary, it is
not sufficient: languages cannot be viewed simply as inventories of signs. The
resulting insights had profound effects in all areas of the field. There has been
a tremendous amount of work over the last forty years, yielding discoveries in
language after language and theories which apply productively to wide ranges
of phenomena in many languages. Since pregenerative work on syntax was so
scant, there is little to be learned from a comparison. Rather, in this chapter we
shall take two features of current models – deletion and Case theory – and we
shall show some of the ramifications of these ideas, how they are shaped by
the cognitive, I-language nature of our analyses, how they capture details about
language structure, remembering that God is in the details, and the devil, too.
The details are fascinating in themselves; they represent distinctions which
are not reported in standard language textbooks, they are not taught to second-
language learners, and, indeed, for the most part they were not known until
rather recently, when they were discovered by theoreticians. There is no way
that these distinctions could be communicated directly to young children as
they develop their language capacity. However, the distinctions we shall dis-
cuss are not the object of our inquiry, but data which provide evidence about
the inner mechanisms of the mind. Our problem is to pick one or two more
or less self-contained illustrations of those mechanisms, parts of the internal
systems represented in individual brains and acquired under normal childhood
conditions.
The earliest attempts to carry out the program of generative grammar quickly revealed
that even in the best studied languages, elementary properties had passed unrecognized,
that the most comprehensive traditional grammars and dictionaries only skim the sur-
face. The basic properties of languages are presupposed throughout, unrecognized and
unexpressed. This is quite appropriate if the goal is to help people to learn a second
language, to find the conventional meaning and pronunciation of words, or to have some
general idea of how languages differ. But if our goal is to understand the language
faculty and the states it can assume, we cannot tacitly presuppose “the intelligence of
the reader.” Rather, this is the object of inquiry. (Chomsky 2000, p. 6)
[Tree diagram for (3.1): an IP whose specifier is a DP; the head I takes a VP complement, and within VP the verb takes a DP complement.]
and . . .), complementation (Reuben said that Phil said that Fred thought
that . . .), and relativization (This is the cow that kicked the dog that chased the
cat that caught the mouse that ate the cheese that . . .).2 It is always possible to
construct a more complex sentence.
Instead of considering how the language capacity deals with the (literal)
infinity of possible structures, let us explore the property of “displacement”:
words and phrases may occur in displaced positions, positions other than those
corresponding to their interpretation. In an expression What did you see in
Berlin?, what is understood as the direct object of see (cf. We saw the Reichstag
in Berlin), but it has been displaced. It is pronounced at the front of the sentence
but understood in the position of a direct object, to the right of the verb. English-
speaking children hear sentences like these and consequently the displacement
operation is learnable; children learn from experience that a wh-phrase typically
occurs at the front of its clause, even though it is understood elsewhere. Chinese
children have no such experiences and their grammars have no comparable
displacement. They hear questions like “You saw what in Berlin?” in Chinese,
where (the equivalent of) “what” is pronounced in the same position in which
it is understood.
Displacement can be viewed as a special case of merger: an item already
in the structure is copied and merged, with subsequent deletion of the copied
element. There are other ways of viewing the displacement effect, of course,
but let us pursue this account. For an example, let us continue with the structure
of (3.1b). The I element will, which is already present in the structure, might
now be copied and merged with (3.1b) to yield the more complex (3.2a), with
a complementizer phrase (CP) headed by the copied will; the lower will is
subsequently deleted to yield (3.2b). Now another step: the DP what city, the
direct object of visit, may be copied and merged, yielding (3.2c), (where what
city is the specifier of the CP), with subsequent deletion to yield (3.2d), the
structure of the sentence What city will the student visit?
(3.2) a. [CP C will [IP [DP the student] [IP I will [VP visit what city]]]]
b. [CP C will [IP [DP the student] [VP visit what city]]]
c. [CP [SpecCP what city] [CP C will [IP [DP the student][VP visit
what city]]]]
d. [CP [SpecCP what city][CP C will [IP [DP the student][VP visit]]]]
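The derivation in (3.2) is algorithmic enough to simulate directly: copy an item already present, merge the copy at the root, and mark the original occurrence for deletion. A sketch over flat token lists (our simplification; the real objects are labeled trees):

    def copy_and_merge(structure, item):
        """Copy `item` (already in `structure`), merge the copy at the
        front, and mark the original occurrence as deleted."""
        i = structure.index(item)
        return [item] + structure[:i] + ["<deleted:" + item + ">"] + structure[i + 1:]

    s = ["the student", "will", "visit", "what city"]    # (3.1b), flattened
    s = copy_and_merge(s, "will")                        # (3.2a) then (3.2b)
    s = copy_and_merge(s, "what city")                   # (3.2c) then (3.2d)
    print([t for t in s if not t.startswith("<deleted:")])
    # ['what city', 'will', 'the student', 'visit']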
We shall focus here on a condition for deleting elements after they have
been copied: the manner in which elements may be deleted limits the way that
2 It has been suggested by Hale 1976 that in some (Australian aboriginal) languages, the functions
of relative clauses are actually filled by a kind of “adjoined” structure which is more like a form
of coordination than it is like the embedded relative clauses of, e.g., English. This does not
compromise the point that in these languages, too, sentences can be arbitrarily extended by the
introduction of the functional equivalents of English relative clauses.
examples over the next few pages. This matters if grammars are elements of
individual cognition, emerging in people’s brains on exposure to some linguis-
tic experience: we need to tease apart what can plausibly be learned from the
environment and what cannot. Our current generalization, “Delete that,” breaks
down in that certain instances of that may not be deleted. Consider (3.4), where
only the structure with that occurs in speech: *It was apparent to us yesterday
Kay had left, *The book arrived Kay wrote, *Kay had left was obvious to all of
us, *Fay believes, but Ray doesn’t, Kay had left (a boldface “*e” indicates an
illicit empty item, a place where that may not be deleted).
that may be deleted only if its clause is the complement of an adjacent, overt
word.
Somehow this limitation must be derived from intrinsic, native properties; it
cannot be a direct consequence of environmental information, because children
are not informed about what does not occur. This is crucial. “Delete that”
is a very simple generalization, easily learnable by children on the basis of
normal experience, but we see that a UG principle is implicated; we need some
sort of internal principle to prevent the system from over-generating, yielding
structures like those of (3.4) with that deleted, which do not occur in normal
speech. Let us formulate the principle in question as (3.5).
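The generalization just stated, that the clause must be the complement of an adjacent, overt word, can be checked mechanically over the examples in (3.4). The encoding below is our invention for illustration, with one well-formed example of our own added for contrast; "host" names the word whose complement the clause is.

    # For each embedded clause: the word whose complement the clause is
    # (if any), whether that word is overt, and whether it is adjacent.
    cases = {
        "Peter said (that) Kay left":
            dict(host="said", overt=True, adjacent=True),
        "*It was apparent to us yesterday (that) Kay had left":
            dict(host="apparent", overt=True, adjacent=False),
        "*The book arrived (that) Kay wrote":
            dict(host=None),   # extraposed relative, not a complement
        "*(That) Kay had left was obvious to all of us":
            dict(host=None),   # subject clause, no host at all
        "*Fay believes, but Ray doesn't, (that) Kay had left":
            dict(host="gapped verb", overt=False, adjacent=True),
    }

    def that_deletable(info):
        """that deletes only if its clause is the complement of an
        adjacent, overt word."""
        return (bool(info.get("host"))
                and info.get("overt", False)
                and info.get("adjacent", False))

    for sentence, info in cases.items():
        print(that_deletable(info), "--", sentence)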
4 It is crucial that movement is local: a wh-phrase moves first to the front of its own clause. From
there it may move on to the front of the next clause up.
5 There are various kinds of conjoined structures, and there is more to be said – in fact, much
more – about ellipsis operations. For example, while the sentence corresponding to (3.7c) is
ill-formed, one encounters sentences like Which man did Jay introduce to Ray and Jim to Kim?
and Which man did Jay introduce to Ray and which woman to Tim? We have no intention of
providing a comprehensive account of ellipsis operations here. Our goal is rather to illustrate
some poverty-of-stimulus problems and the kind of reasoning that is involved in solutions to
them.
The same point holds for a deleted that to the right of a gapped verb (3.8b)
and a deletion at the front of an embedded clause (3.8c). (3.8a) is a well-formed
structure yielding the normal, everyday sentence Jay thought Kay hit Ray and
Jim that Kim hit Tim. However, (3.8b,c) are ill-formed; the deleted that in
(3.8b) and the deletion at the front of the indicated CP in (3.8c) fail to meet
our condition, not being in a complement of an adjacent, overt word; so the
structures do not occur, nor do the corresponding sentences *Jay thought Kay
hit Ray and Jim Kim hit Tim (3.8b), *Who did Jay think Kay hit and who (did)
Jim (that) Kim hit? (3.8c).
(3.8) a. Jay thought Kay hit Ray and Jim gap [CP that Kim hit Tim].
b. *Jay thought Kay hit Ray and Jim gap [CP *e Kim hit Tim].
c. *Whoi did Jay think Kay hit ei and whoj (did) Jim gap [CP
*ej (that) [CP Kim hit ej ]]?
Our UG condition (3.5) captures these distinctions with a plausible bifur-
cation between internal properties and information which is learned from the
environment. Children learn from their environment that a sentence-introducing
that may be deleted, that a copied wh-phrase may be deleted, that verbs may
be omitted in conjoined sentences, that large DPs may be displaced to the far
right of their clause. Our internal UG condition (3.5) guarantees that these
learned operations do not over-generate to yield non-occurring structures and
their corresponding sentences.
There are more subtleties holding of the speech of every mature speaker of
English, which follow from this particular condition of UG, making deletion
possible only in certain structures. Again, these are distinctions known sub-
consciously to every speaker of English: to know English means having these
distinctions, even though they were not taught to us and they could not be
derived entirely from our direct experience.
A simple possessive phrase like Jay’s picture is three ways ambiguous. Jay
might be the owner or the painter of the picture, or the person portrayed, i.e.
the object. Traditional grammarians label the last of these readings an “objec-
tive genitive.” Following the traditional intuition, modern linguists say that the
structure for the reading in which Jay is the object, the person portrayed, is
(3.9): Jay is copied from the complement position, where it is understood, to
the possessive position, where it is pronounced. Its copy is deleted in the usual
fashion, where it is adjacent to and the complement of the noun picture.
(3.9) a. [DP Jayi ’s [NP picture ei ]]
But now consider The picture of Jay’s. Here the ambiguity is different and
the phrase is only two ways ambiguous. It means that Jay is the owner or the
painter of the picture, but not the object: the expression cannot refer to a picture
in which Jay is portrayed, somewhat surprisingly. Similarly in A picture of the
Queen’s, the Queen can be the owner or the painter but not the person portrayed.
This is something that most adults are not aware of: indeed, it was not observed
until recently and certainly is not something we impart explicitly to children.
Again, a condition of UG must be involved and it is our condition on deletion
(3.5). Possessive elements like Jay’s only occur introducing DPs and therefore
must be followed by an NP even if nothing is pronounced (see note 1). If
nothing is pronounced, then the noun heading the NP must be null (e). So The
picture of Jay’s, where Jay is the owner or the painter, would have the well-
formed structure of (3.10a), where the indicated NP is empty and understood as
“picture.” If The picture of Jay’s were to have the impossible reading where Jay
is the person portrayed, the “objective genitive,” then the structure would be
(3.10b) and the copied element to be deleted (ei ) is the complement of another
empty element, the same empty noun understood as “picture.” The deletion of
the copied Jay is illicit because there is no adjacent overt word (cf. 3.9). Similarly
for The picture is Jay’s (3.10c) and The picture which is Jay’s (3.10d), which
also lack the objective genitive reading for Jay and whose structures involve an
illicit deletion.
Data like these are not the input to language acquisition; rather, they emerge
from the properties of the system which is triggered by simple, accessible
data available in children’s everyday experience and now we understand how.
Children hear expressions like Jay’s picture, understanding it to mean “picture of
Jay,” and thereby learn that English, unlike many other languages, has displaced
objective genitives; the UG condition then blocks structures like (3.10b,c,d).
For a further illustration of the utility of our UG condition (3.5), the deviant
(3.11c) has an illicit deletion right-adjacent to the gapped verb. Because the
verb is a gap, deletion of hei there (the top-most element of the complement)
is illicit, as is now familiar. That is the only relevant difference from (3.11a),
which is perfectly comprehensible and straightforward, involving no ill-formed
deletion. (3.11b) is also well-formed: here there is no gapped verb and known
licenses the deleted copy.
(3.11) a. It is known that Jay left but it isn’t gap that he went to the
movies.
b. Jay is known [ei to have left] but hei isn’t known [ei to have
gone to the movies].
c. *Jayi is known [ei to have left] but hei isn’t gap [*ei to have
gone to the movies].
Before we change gear, consider one last distinction. The sentence corres-
ponding to structure (3.12a), The crowd is too angry to organize a meeting, is
ambiguous: the understood subject of the embedded clause, written here as “e,”
may refer back to the crowd (the “anaphoric” reading, indicated by coindexing),
or it may be arbitrary in reference, meaning “anybody” (indicated by the “arb”
index): either the crowd is going to organize the meeting or somebody else,
unspecified.
research, as is usual in scientific work, and phenomena of the kind that we have
discussed constitute evidence for the properties of the model.
6 Here is a nice distinction: Max is dancing in London and Mary is in New York is ambiguous: Mary
might be dancing in New York or perhaps running a restaurant. However, the same sentence with
a reduced is is unambiguous: Max is dancing in London and Mary’s in New York means only
that Mary is in New York, not necessarily dancing there. If Mary were dancing in New York,
the structure would be . . . and Mary is [e] in New York and the empty verb would require a full
phonological word as its host, for which the reduced ’s would not suffice.
represented by (2.11) do not involve deletion, and so (3.5) would have nothing
to say about them. On the other hand, some examples involving displacements
(and thus deletion) would not fall under (2.9) either:
(3.13) Tiredi as he [IP is/*’s [AdjP ei of his job at the car-wash]], Fred
won’t go looking for something better.
In many examples involving deletion affecting the entire complement of an
auxiliary, the two conditions converge to rule out the reduction of auxiliaries,
but each has a separate role to play in other examples.
If we adopt this view, then the notion of a host for a deletion site illumi-
nates more distinctions, which have to do with the extractability of subjects of
embedded clauses. Many years ago Joan Bresnan postulated a Fixed Subject
Constraint to capture the observation that subjects of finite embedded clauses
seemed to be fixed, unmovable. Now we can dispense with a distinct Fixed
Subject Constraint and relate the cases it was intended to account for to what
we have discussed in this section, explaining them through our UG condition
on deletion.
We noted that English embedded clauses are introduced by a complementizer
which may or may not be pronounced; so, sentences corresponding to (3.14a)
occur with and without that. This is also true if a wh-item is copied from
the embedded object position (3.14b): that may or may not be present. The
deleted complementizer in (3.14a) incorporates into the adjacent verb thought,
as indicated. Similarly the deleted wh-word at the front of the (embedded)
clause in (3.14b) incorporates into the adjacent think, whether or not that is
present. The same goes for the deleted wh-word which is the complement of
saw in (3.14b). In each case, the deleted element is the top-most item of the
host’s complement.
how (3.15a), the ill-formed structure for the non-occurring sentence *Who do
you wonder how solved the problem? It cannot be the case that complementizers
like that and how are not appropriate hosts for deletion sites because they are not
full, phonological words in some sense, because (apart from the phonologically
unmotivated nature of this move) the same is also true for expressions like what
time in indirect questions like (3.15c), *Who were you wondering what time
finished the exam?
(3.16) a. *This is the student whoi I wonder [CP whatj [IP *ei bought ej]].
We have argued for our UG condition (3.5) here entirely on the basis of
poverty-of-stimulus facts from English, but the condition holds at the level of
UG, and therefore of all grammars; and we expect to be able to argue for the
condition from the perspective of any language showing displacement prop-
erties. Indeed, we can and probably should make those arguments, but to do
so would exhaust the tolerance of our publisher if not our readers, so we will
demonstrate some consequences for linguistic diversity more succinctly. Rather
than give comparable arguments from several languages, we shall illustrate one
effect as it is manifested in various languages.
Luigi Rizzi (1990) identified three kinds of strategies used in different lan-
guages to circumvent the UG ban on extracting the subject of a tensed clause,
here subsumed under our condition on deletion (3.5). Each strategy employs an
ad hoc, learned device which licenses extraction from a subject position. The
particular devices are quite different from language to language, and our UG
condition on deletion helps us understand that diversity.
(3.18) Three strategies to license an extracted subject:
a. Adjust the complementizer so as to license the extraction
b. Use a resumptive pronoun in the extraction site
c. Move the subject first to a non-subject position and then
extract
English exploits strategy (3.18a) and permits extraction of a subject if the
complementizer that is adjusted – in fact, not present, as in Who do you think
saw Fay?, which has the structure (3.19). Recall that who originates as the
subject of saw. Because it is a wh-expression, it is copied at the front of its
local clause, and then at the front of the upstairs clause. So two copies need
to be deleted in the positions indicated. The lowest deletion (the subject) is
licensed by the higher coindexed (agreeing) position at the front of the CP (and
is incorporated into the coindexed position),9 and the higher deletion at the front
of the CP is licensed by and incorporated into the verb think (being the top-
most item in think’s complement). In the comparable (3.14c) and (3.15a), there
was no host for the deleted item. In other words, subjects of tensed clauses in
English are movable only if the CP contains only an empty, unpronounced, co-
indexed or “agreeing” item: that permits a subject wh-word to be incorporated
appropriately.10 That and how, on the other hand, in these positions are not
appropriate hosts, as we saw in connection with (3.14c) and (3.15a).
9 Example (3.16a) illustrates the importance of indexing, a complex matter. Indexed elements are
effectively overt. This can be seen in a language where verbs are displaced. French Qui voyez-
vous? “Who do you see?” has the structure (i), where ej is the deletion site of voyez and hosts
ei , the deleted qui.
Here we see a very specific, ad hoc device, in this case an operation changing
que to qui, whose sole motivation is to permit deletion of a subject DP. In French
the agreeing complementizer is an overt qui, while in English the comparable
form is the deleted, unpronounced complementizer.
Rizzi identified similar devices in a variety of languages, which host deleted
subjects. West Flemish (a language like German and Japanese and unlike
English and French, where direct objects precede their verb; hence the right-
ward cliticization of the direct object in (3.21a)) behaves similarly to French:
the usual form of the complementizer is da (3.21a) but a special “agreeing”
form die occurs where a deleted subject needs to be hosted (3.21b).
10 In chapter 9 we discuss the course of language acquisition by children who use “medial”
wh-items, as in What do you think what Cookie Monster eats? (Thornton 1995). Such chil-
dren retain medial wh-items longest in contexts where the wh-word is extracted from a sub-
ject position and where it acts as a kind of “agreeing” complementizer: Who do you think
who’s under there?
(3.21) a. Den vent dai Pol peinst [CP ei da Marie ei getrokken heet]
the man that Pol thinks that Marie photographed has
The man that Pol thinks that Marie has photographed
b. Den vent dai Pol peinst [CP ei die ei gekommen ist]
the man that Pol thinks that come is
The man that Pol thinks has come
In these four languages we see copied subjects being deleted if the comple-
mentizer is adjusted in some fashion: deleted in English, amended to an agree-
ing form in French, or with a special form in West Flemish and Norwegian.
In all cases there is a kind of agreement. Hebrew is a bit different. Hebrew also
does not allow deletion of a subject DP (3.23a), although objects extract freely
(3.23b), as is now familiar. Subjects are extractable under special circumstances.
A special device adjusts the complementizer, in this case cliticizing the com-
plementizer še onto an adjacent head (3.23c). In (3.23c) the complementizer
cliticizes rightward onto lo, vacating the complementizer position and permit-
ting the subject to be incorporated leftwards, as in the analysis of (3.23d) (see
Shlonsky 1988). Because the complementizer position has been vacated, the
subject is extractable; this is reminiscent of the English device of emptying the
complementizer position not by cliticization but by deletion.
(3.24) a. Vilket ordi visste ingen [CP hur det /*ei stavas]?
which word knew no one how it/e is spelled
Which word did no one know how it is spelled?
b. Kallei kan jag slå vad om [CP ei /*han kommer att klara sig].
Kalle can I bet about e/he is going to succeed
Kalle, I can bet (*he) is going to succeed.
The West African language Vata adopts the same strategy, but here even for
local movement in a simple, unembedded clause. Again we see the familiar
subject–object asymmetry: an extracted subject has a resumptive pronoun in its
underlying position, never a deletion (3.25a), while the opposite is true for an
extracted object (3.25b). To express English Who ate rice?, one says “Who did
he eat rice?,” with a resumptive pronoun in the subject position, and not “Who
ate rice?”; to express English What did Kofi eat?, interrogating the direct object,
one says “What Kofi ate?,” with no resumptive pronoun (the lower “what” is
incorporated into its verb le), and not “What Kofi ate it?” The resumptive
pronoun is used only where a wh-word may not be incorporated.
(3.25) a. Álói *(òi ) le saká la?
who (he) eat rice wh
Who ate rice?
b. Yii Kòfı́ le (*mı́i ) la?
what Kofi eat (it) wh
What did Kofi eat?
Italian, on the other hand, manifests a third strategy: moving the subject
first to a non-subject position (3.18c). Subjects may occur to the right of the
VP (3.26a): Credo che abbia telefonato Gianni “I think that Gianni has tele-
phoned.” Here wh-words may be incorporated leftward into an adjacent verb,
and that is the position from which they are copied; so (3.26b) is the structure
for a sentence like Chi credi che abbia telefonato? “Who do-you-think has
telephoned?”
license them through an agreeing form (English, French, West Flemish, Hebrew,
Norwegian).
The UG constraint explains the need for ad hoc, language-specific devices.
Each of the devices we have examined is learnable, assuming children are
prohibited genetically from extracting embedded subjects in the normal case.
That is, children are exposed to positive, accessible data which demonstrate the
language-specific operation that adults use: the deletability of that in English,
the operation changing que to qui in French, the need for a resumptive pronoun
only in subject positions in Swedish and Vata, etc. The conditions under which
these apply follow from our UG condition (3.5). We therefore have accounts for
the specific languages, which meet our basic requirements. We also see that the
consequences of a condition of the linguistic genotype may be circumvented
sometimes in the interest of expressivity, and we understand why there is such
diversity in these cases.
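For reference, the language-particular devices surveyed in this section can be collected in a single table (our informal summary of the discussion above, not an exhaustive typology):

    # Devices licensing extraction of an embedded subject, by language.
    SUBJECT_EXTRACTION_DEVICE = {
        "English":      "empty the complementizer position (delete that)",
        "French":       "replace que with the agreeing form qui",
        "West Flemish": "use the agreeing complementizer die rather than da",
        "Norwegian":    "use a special agreeing complementizer form",
        "Hebrew":       "cliticize the complementizer še onto an adjacent head",
        "Swedish":      "leave a resumptive pronoun in the subject position",
        "Vata":         "leave a resumptive pronoun in the subject position",
        "Italian":      "move the subject first to a postverbal position",
    }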
Our UG principle on deletion (3.5), requiring that deleted elements are in-
corporated in a clitic-like fashion into an adjacent, overt head, was postulated
on the basis of poverty-of-stimulus arguments. It provided a way to tease apart
generalizations that a child might induce from her environment and the limits to
those generalizations, preventing them from over-generating and yielding non-
occurring structures. We know that the principle has similar effects in other
languages, but we have not illustrated that here. Instead, we have illustrated
some effects in a variety of languages relating to the extractability of subjects.
That suffices to show that our principle is operative in other languages, al-
though it certainly has many more consequences than we have illustrated here
for analyses of French, West Flemish, and so on.
3.3 Case
Let us turn now to a second aspect of UG, which connects with matters central
for traditional grammarians, the existence of cases. As it turns out, this notion –
familiar to anyone who has studied German, Russian, Latin, Greek, etc. – is
both closely similar to and subtly different from a more abstract relation on
syntactic structure which plays a central role in the theory of UG.
            Plural
Nominative  þā stānas    þā scipu     þā tala    þā naman
Accusative  þā stānas    þā scipu     þā tala    þā naman
Genitive    þāra stāna   þāra scipu   þāra tala  þāra namena
Dative      þǣm stānum   þǣm scipum   þǣm talum  þǣm namum
Figure 3.1 Old English (nominal) case inflection
Morphological cases are not just ornamental decorations but they interact
with core syntactic operations. Polish, like Russian, shows an accusative mark-
ing on direct objects (3.29a), but the marking is genitive if the verb is negated
(3.29b).
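The interaction can be stated as a one-line rule (a sketch that deliberately ignores everything except negation):

    def polish_object_case(verb_negated):
        """Direct-object case in Polish: accusative normally, genitive
        under sentential negation (the 'genitive of negation')."""
        return "GEN" if verb_negated else "ACC"

    print(polish_object_case(False))   # ACC, the pattern of (3.29a)
    print(polish_object_case(True))    # GEN, the pattern of (3.29b)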
11 One position in which a DP is not Case-marked and may not be pronounced is the complement
to an adjective: *Kay is proud Ray. Such structures are salvaged not through movement but
through the insertion of the meaningless preposition of, whose sole function is to assign Case
to the DP. The preposition for plays a similar role in salvaging the subject of an infinitival verb:
*him to learn Greek would be difficult vs. for him to learn Greek would be difficult.
these are not Case-marked positions. Again, the Case-marked positions are the
complement of verbs and prepositions and the specifier of a DP. In (3.30) Kay
does not originate in such a position; in (3.30a) Kay originates as the comple-
ment not of a verb arrest but of a participle arrested, and passive participles
(unlike verbs) are not Case assigners. In (3.30b) Kay originates as the comple-
ment of a noun, not a Case assigner, and in (3.30c) the original position isn’t
the complement of anything nor the specifier of a DP nor the subject of a finite
clause. Consequently Kay may not be pronounced in those positions and must
move to a position where it does receive Case: one does not find anything like
*Ray was arrested Kay, *Ray’s picture Kay, *Ray seems Kay to like Jay, or *it
seems Kay to like Jay, because, in each example, Kay is in a position which is
not Case-marked.
(3.30) a. Kayi was participle arrested ei .
b. Kayi ’s N picture ei . (meaning “picture of Kay”)
c. Kayi seems [ ei to like Ray].
If a DP originates in a Case-marked position, on the other hand, then it does
not move. Kay originates as the complement of a verb in (3.31a) and may not
move (3.31aii), as the complement of a preposition in (3.31b) and may not
move (3.31bii), and as the subject of a tensed verb in (3.31c) and may not move
(3.31cii). Compare the analogous structures of (3.30), which differ crucially in
that Kay originates there in non-Case-marked positions and must move.
(3.31) a. i. Somebody V arrested Kay.
ii. *Kayi V arrested ei . (intended to mean “Kay arrested herself”)
b. i. Picture P of Kay.
ii. *Kayi ’s picture P of ei .
c. i. It seems Kay likes Ray.
ii. *Kayi seems [ ei likes Ray].
If we are going to distinguish the positions from which movement takes place
in modern English in terms of case ((3.30) vs. (3.31)), we need an abstract notion
of Case, defined independently of morphological endings, because morpholog-
ical case does not exist in the language outside the pronoun system. Abstract
Case is what is at work in the distinctions of (3.30)–(3.33).
In (3.30a), the deleted Kay (ei ) is not the complement of a verb or preposition
or any other Case assigner, but of the participle arrested, and it has no Case. Sim-
ilarly in (3.30b), ei is the complement of the noun picture and has no Case. And
in (3.30c) ei is the subject of an infinitive verb and is Caseless. As a result, the
deletion sites in (3.30) are positions from which the DP Kay moves to another DP
position and receives Case there. In (3.31a), on the other hand, the deletion site
is the complement of a transitive verb, of a preposition in (3.31b), and in (3.31c)
it is the subject of a tensed verb. All these positions are Case-marked, DPs may
be pronounced in these positions, and these are not positions from which they
must move to other DP positions; indeed, they may not move. We draw the
relevant distinctions in terms of an abstract notion of Case.
Conversely, wh-movement shows the mirror image: a wh-phrase moves to
the specifier of CP, not to another DP position, and it may not move there from
Caseless positions (3.32), but only from Case-marked positions (3.33). (3.32)
and (3.33) correspond to (3.30) and (3.31), respectively. In (3.33a) the deletion
site is the complement of the verb (as in (3.31a)), in (3.33b) the complement of
the preposition of (as in (3.31b)), and in (3.33c) the deletion site is the subject
of a tensed verb (as in 3.31c).
(3.32) a. *Whoi was Kay participle arrested ei ?
b. *Whoi did you see a N picture ei ?
c. *Whoi did it seem [ ei to like Ray]?
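The pattern in (3.30)–(3.33) reduces to two complementary checks on the position from which movement launches. A toy classifier follows (the position labels are ours, and "Case-marked" here means abstract Case, not a morphological ending):

    CASE_MARKED = {"complement of V", "complement of P",
                   "subject of finite clause", "specifier of DP"}
    CASELESS = {"complement of participle", "complement of N",
                "subject of infinitive"}

    def dp_must_move(origin):
        """A DP must (and may only) move to a Case position from a
        Caseless origin (3.30 vs. 3.31)."""
        return origin in CASELESS

    def wh_move_ok(origin):
        """Wh-movement targets SpecCP, not a Case position, so the
        wh-phrase needs Case at its origin (3.32 vs. 3.33)."""
        return origin in CASE_MARKED

    print(dp_must_move("complement of participle"))   # True:  Kay was arrested e (3.30a)
    print(dp_must_move("complement of V"))            # False: *Kay arrested e (3.31a-ii)
    print(wh_move_ok("complement of participle"))     # False: *Who was Kay arrested e? (3.32a)
    print(wh_move_ok("complement of V"))              # True:  cf. (3.33a)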
12 We have assumed but not demonstrated that copied elements must be deleted. Therefore, if they
cannot be deleted, not meeting the requirements for incorporation into an appropriate host, the
resulting structure is ill-formed. We adopt the analysis of Nunes 1995: copied elements must
be deleted because if they weren’t, there would be two instances of, say, what and those two
instances are non-distinct. If they are non-distinct, they cannot be linearized in the phonology.
This would take us further into grammatical theory than we want to go here, but interested
readers can follow the details in Nunes 1995.
the adjacent noun picture in (3.30b). In (3.30c), the trace ei is the top-most
element of the complement of seems, and thus these are all legitimate deletion
sites. That is also true of the deletion sites in (3.31): ei is the complement of
the adjacent verb arrested in (3.31a), the adjacent of in (3.31b) and it is the
top-most element of the complement of seems in (3.31c), all legitimate deletion
sites. The problem with the structures of (3.31) is that they violate, not the
condition on deletion (3.5), but Case theory: Kay receives Case in its original
position and therefore may not move. Case theory interacts with our principle of
deletion (3.5) and together they characterize the possibilities for moving DPs,
i.e. copying and deleting them. We will leave it as an exercise for the reader
to determine that the deletions in (3.32) and (3.33) also meet the requirements
discussed in section 3.2.
3.4 Conclusion
Here we have put a searchlight on two aspects of current syntactic theory,
deletion and Case, and shown how they help to capture distinctions typical
of English speakers, and how they distinguish what a child learns from the
environment from what she knows independently of experience. It is these
detailed distinctions which make up the subconscious knowledge that people
have when they are speakers of some form of some language. That knowledge is
characterized by the kinds of grammars that people have, by their cognitive
systems. It is too gross to say merely that structures are made up of subunits or
that languages with case systems tend to have freer word order than languages
without a rich morphology. Modern work takes us beyond E-language bromides
like this.
4 Sound patterns in language
In this chapter and the two following ones, we turn from issues of syntactic
organization in natural language to the systematicities of sound structure. There
is a conventional division between phonetics, or the study of sounds in speech,
and phonology, the study of sound patterns within particular languages. As
we will see, there is a reasonably clear conceptual distinction here, and we will
follow it in devoting most of this chapter and the next to the more obviously
linguistic domain of phonology while postponing substantive discussion of
the nature of phonetics until chapter 6, after some necessary preliminaries in
section 4.1. We will attempt to tease apart these notions, but that process will
reveal that questions of sound structure, seemingly concrete and physical in
their nature, are actually abstract matters of cognitive organization – aspects of
I-language and not measurable external events.
Ideally, we should broaden our scope a bit: signed languages also have a
“phonology” (and a “phonetics”) despite the fact that this is not based on sound,
although we cannot go into the implications of that within the scope of this
book.1 In recent years, the study of signed languages has revealed the fact that
their systems of expression are governed by principles essentially homologous
with those relevant to spoken language phonology and phonetics. This close
parallelism reinforces the conclusion that we are dealing here with the structure
of the mind, and not simply sound, the vocal tract, and the ears (or the hands
and the eyes).
A rough way to distinguish phonetics from phonology is as follows: phonetics
provides us with a framework for describing the capacities of the organism – the
range of articulatory activities humans use (or can use) in speech, the properties
of the sounds that result, and the way the peripheral auditory system deals with
those sounds. A learner still has to determine how these capacities are deployed
in the language of the environment, but the capacities themselves and their
relation to physical events develop in the organism independent of particular
1 A basic survey of the formational system of American Sign Language is provided in Klima and
Bellugi 1979. Diane Brentari (1995) discusses the relation of this system to the phonological
systems of spoken languages; some more technical papers on this topic will be found in Coulter
1993.
we infer from judgments and intuitions: sentences do not come with little trees
on them (representing their syntactic organization) which we could study di-
rectly, for instance. Because it deals with observables, phonetics seems to many
people not to be much of a “theory” at all: just a sort of neutral observation
language in which we describe utterances. On that view, a phonetic description
might be more or less accurate, but there is no other sensible way to evaluate it.
In fact, in the nineteenth and early twentieth century, phoneticians proceeded
along just this line: they attempted to refine their techniques for measuring as
many dimensions as possible of speech with maximal precision. As equipment
got better, and researchers’ observations could be more and more fine-grained
and accurate, there was an explosion of data – a result that had the paradoxical
effect of convincing most students of language that they were on the wrong
track.
Much of what was being turned up, for instance, followed from the observa-
tion that speech is continuous: what is taking place at any particular moment
is at least a little bit different from what is going on just before or just after
that moment. As a result, if the phonetician attempts to measure everything
at every moment where there are distinct values to be recorded, there is no
limit in principle to the amount of measuring to be done. It is clear that a full
characterization of an utterance as a physical event requires us to recognize an
unlimited number of points in time – but it is also clear that our understanding
of the utterance in linguistic terms is not thereby improved.
In fact, we usually represent the phonetic structure of an utterance in a much
simpler way: as if it were a sequence of separate units, like beads on a string,
each one a “sound” representing about the same amount of phonetic material as
a letter in common writing systems. A phonetic representation usually has the
form of a sequence of segments, where each of these is characterized as a point
in a multi-dimensional space. The dimensions of this space are the phonetically
relevant properties of sounds. These include articulatory details such as the
location of the highest point of the tongue body, presence (vs. absence) of
vibration of the vocal cords, rounding of the lips, etc. This is rather like a
sequence of snapshots, one per segment, where each one shows just the detail
about the articulatory apparatus that we deem relevant.
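As a concrete illustration of this picture (ours; the feature names and values are illustrative rather than a standard inventory), the word pit might be rendered as a sequence of such snapshots:

# A hedged sketch of a segmental phonetic representation: each segment is a
# point in a multi-dimensional space of phonetically relevant properties.
pit = [
    {"segment": "pʰ", "voiced": False, "place": "labial",  "aspirated": True},
    {"segment": "ɪ",  "voiced": True,  "height": "high",   "front": True,
     "rounded": False, "nasal": False},
    {"segment": "t",  "voiced": False, "place": "coronal", "aspirated": False},
]

# Each dict is one "snapshot" of the articulatory apparatus, recording only
# the detail deemed linguistically relevant and omitting everything else.
for snapshot in pit:
    print(snapshot)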
This characterization may seem to be simply a direct observation statement
about the utterance – after all, if the representation is really phonetic, there
surely couldn’t be anything too abstract about it, right? And isn’t it the case that
an utterance of the English word pit “really” consists of the sound p, followed
by the sound i, followed by the sound t?
But in fact while a phonetic representation has a certain claim to “objec-
tivity,” it is not just a mechanical recording of utterances. Rather (like any
serious record of an observation), it is a theory of what it is about the physical
form of this utterance that is of linguistic significance: it deals in abstractions.
say that the acoustic facts do not have discontinuities: they often do, but usually
not where we want them. For instance, the acoustic structure of a stop consonant
like the initial [pʰ] at the beginning of pit looks like three separate parts, not
one – and the last of these is integrated with the properties of the following
vowel in such a way as to make it impossible to separate them.
Secondly, there is the abstraction of segmental independence. The seg-
mental picture suggests that all of the properties of a segment are located at the
same point, and that segments do not overlap. But in fact, this is not at all the
way speech works. Instead, there is an enormous amount of co-articulation, or
overlapping of articulatory gestures. This is both anticipatory (right-to-left) and
perseverative (left-to-right), and it smears the segments together so as to make
it impossible to define their identity except relative to their particular context.
Another consequence of coarticulation is the fact that what counts as the
“same” sound may be heavily context-dependent. For instance, the [g] sounds
in ghoul and geek are quite different, as a function of differences in the vowel
that follows. This kind of intersegmental interaction is quite typical of speech
articulation: at any given point in time, what is going on is likely to be the
product not just of one but of several of the “segments” in the utterance. Even if
we could find discontinuities, then, we would not in general be able to present
a snapshot that is physically real and consists exclusively of the (invariant)
properties of a single segment we wanted to characterize.
So why do we do this (segmentation)? Because we have learned that this
idealization is actually more appropriate and useful than the literal truth. The
segmental representation is really the only basis for finding the regularities that
obtain in languages with respect to the forms shown by individual words. Of
course, it is only adequate once we specify the relation between such a picture
and the vastly more complex reality that we could (in principle) measure, and
phoneticians therefore take on (at least implicitly) the responsibility of describ-
ing the ways in which the representations they work with are implemented in
all of their messy detail.
The phoneticians of the nineteenth century were concerned to say how transi-
tions between segments fill in the intervening values of a continuous function – a
function which we only specify at some small finite number of points (corre-
sponding to the distinct segments we recognize in the utterance). Having done
that, though, the segmented result is more enlightening for further linguistic
study than a full reconstruction of the actual continuous nature of the facts. It
is an appropriate level of abstraction from those facts because it organizes the
data in a way that is more coherent: a way that allows us to see their structure,
otherwise obscured by detail.
There is a third idealization involved in phonetic representations: these are
also abstract in that they choose some things to characterize at the expense of
others. We might describe the sound [i] (roughly the vowel of Pete), for instance,
as a “high front unrounded oral vowel.” This tells us about some things involved
in the production of [i]: the position of the highest point of the tongue body, the
configuration of the lips, and the position of the velum. It says nothing about
what the rest of the tongue is doing (except that it is “lower” than the point we
described), or what the epiglottis is doing, or how wide the nose itself is opened,
what the facial muscles are doing, how loudly the speaker is talking, what his
(or her) precise vocal range is, etc. These omissions are not because the values
of these other properties cannot be determined: rather, they represent an implicit
claim that these and other aspects of the physical event are not linguistically
significant (or else that they can be predicted from the other things that we have
described).
No language, for example, ever seems to contrast utterances on the basis
of degree of spreading of the nostrils, or of loudness of the voice. Nor do
languages differ systematically in this respect (such that, e.g., language A is
spoken louder than language B, and to speak language A softly gives you a
foreign accent). Linguistically relevant transcriptions, then (as opposed to what
might be of interest to a psychiatrist or other specialist in non-verbal behavior)
do not need to record the width of the nasal opening, or the loudness of the
voice. In fact, we can go further, and say that a fortiori, such a transcription
should not record these properties – at least not if “phonetics” refers to the
linguistically relevant dimensions of speech. The choice of a set of parameters
(or distinctive features) is (part of) a phonetic theory: the data alone
do not establish their own meaning, or impose a particular choice of a set of
features as a matter of physical necessity.
What is required is a representation that will describe all and only those
aspects of sound production that can play a part in the system of language.
This is linguistic phonetics, after all, and not physics, or physiology. Any
degree of freedom in the representation should correspond to something that
at least potentially could be under linguistic control. The segmented phonetic
representation actually stands at a level of abstraction some distance away from
the physical reality, and there must be an explicit statement of the relation
between the two, a matter which will occupy us further in chapter 6. Theory
kicks in, even at the level of representing the sounds of language.
Phonetics connects with claims about innateness and Universal Grammar
(UG), etc.: linguistic phonetics is an attempt to delineate exactly the aspects
of sound structure that are available for use in natural languages – the things
that one has to be able to control (potentially) as part of learning a particular
language, matters which fall within the domain of the language organ. There
are indeed a number of independent dimensions with this property, but certainly
not an infinite number – and more to the point, not everything that one could
measure is a candidate. And if a discrete, segmented representation is close to
what is linguistically significant, the dimensions of control in speech are rather
abstract and removed from the continuous, concrete physical reality.
What is part of UG, then, and thus innate, is not the actual sound of any words,
but rather a set of abstractly organized possibilities. The child approaching the
task of learning a language knows in advance what to pay attention to, and what
range of possibilities might be important. In fact, as we will discuss in chapter 9,
children attend to these matters in astonishingly structured detail from the very
beginning: according to some research, indeed, even before birth. If the terms
of analysis are given in advance, then we can begin to understand how humans
cope with the infinite variability of experience in a common fashion, in such a
way that we arrive at shared representations and analyses of speech events.
linguistic content, not simply their physical properties. For reasons that will
become apparent, we will often refer to the phonological representation as the
“underlying form” of an utterance. And to further confuse matters, the history of
this kind of representation results in a tendency to call its elements “phonemes”
and thus to speak of a “phonemic” representation. There are subtle distinctions
to be made here, but by and large “phonological,” “underlying” and “phonemic”
representations are all ways of talking about the same thing.
The essence of what this representation is all about is expressed by the
following principle:
Thus, in English, phonetic representations contain both [pʰ] (as in pit, phonet-
ically [pʰɪt]) and [p] (as in spit, phonetically [spɪt]); but we do not want to allow
two phonological representations to differ only in that one has a [pʰ] where
the other has a [p], since it is never the case that this difference alone serves to
distinguish meanings in English. Simplifying slightly, aspirated [pʰ] appears
always and only at the very beginning of stressed syllables, while unaspirated
[p] appears elsewhere (after a syllable-initial s or under conditions of reduced
stress). A similar generalization applies to other voiceless stops (t, k) in English.
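That distributional statement can be sketched as a small function (our illustration; the context parameters simplify syllable structure to just the cases described):

# Hedged sketch of the English aspiration regularity: voiceless stops are
# aspirated at the very beginning of a stressed syllable, plain elsewhere.
def realize_stop(stop, syllable_initial, stressed, after_s):
    """Predict the phonetic realization of /p, t, k/ from its context."""
    assert stop in {"p", "t", "k"}
    if syllable_initial and stressed and not after_s:
        return stop + "ʰ"   # aspirated, as in pit [pʰɪt]
    return stop             # unaspirated, as in spit [spɪt]

print(realize_stop("p", syllable_initial=True,  stressed=True, after_s=False))  # pʰ
print(realize_stop("p", syllable_initial=False, stressed=True, after_s=True))   # p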
The point is that a speaker’s knowledge of English, her language organ,
includes some characterization of a set of principles defining the notion of
“phonological representation in English” that exclude two such representations
from both being possible, even though the definition of “phonological represen-
tation in Punjabi” (for instance), which constitutes a part of a Punjabi speaker’s
language organ, does allow two representations that differ only in this way.
Just as the notion of “phonetic representation” provides us with an implicit
definition of “possible human linguistic utterance” or the like, a phonological
representation is the basis of an account of what the possible linguistically dis-
tinct utterances are in a given language. The phonological principles that are
thus inherently particular to a specific language form a part of I-language, and
it is part of the task of learning a language to determine the appropriate set of
such principles from among those made available by phonological theory.
We have formulated the principle in (4.4) in extremely general terms, in
order to try to accommodate the many different ways in which linguists have
thought of phonological representations. We can distinguish at least three variant
conceptions of how to satisfy the requirement of the phonemic principle (4.4).
One approach, which we can call a theory of incompletely specified
representations, proposes that where some property is predictable within a
given language, we omit it altogether from phonological representations within
2 Note that the usual convention is to put phonological representations within slashes (/ /) and
phonetic representations within square brackets.
only in that one has a pʰ where the other has a p, one or the other of these is
going to be ill-formed because it violates some constraint, and is thus ruled out
as a phonological representation in English.
If all of these approaches to phonological representation can satisfy the basic
desiderata for such a notion, as expressed in (4.4), does it make any difference
which one we choose? In fact here, as elsewhere, the issue of an appropriate
level of abstraction arises, and the determination of the properties of that level
is ultimately an empirical question. If all three of the varieties of phonological
representation just considered can be made to fare equally with respect to the
basic principle in (4.4), there may still be other ways in which they are distinct in
their ability to facilitate an insightful account of the complete reality constituted
by sound structure in I-language. When we consider not simply the nature of
phonological form itself, but also the systematic formulation of its relation to
other aspects of language (including phonetic form, word structure, and other
matters), differences will emerge.
that one had to have a complete account of the latter before trying to identify
the former. But in that case, the phonetic data are all the analyst has to go on
when constructing an analysis, so the analysis has to be the sort of thing that
can be attained on that basis. The conceptual mistake that follows is to identify
what the language “must be” with a particular path by which the linguist can
discover its structure. This has the (presumably unintended) consequence that
linguistic theory winds up describing the behavior and knowledge of linguists,
rather than that of speakers.
Finally, many felt that what the phonemic principle “really” required was
that phonological representations encode linguistic contrast. If we take contrast
to be something that can be operationally determined by asking speakers
“do the utterances U1 and U2 contrast or not?” (or any one of a thousand
variants of this), the principle again follows, since all and only those differences
that can be unambiguously recovered from the phonetic representation alone will
correspond to contrasts in this sense.
As this view became more explicit, however, it became clear that it failed to do
justice to the facts of language. First, students of the nature of perception came
to entertain the possibility that rather than simply registering its input, the mind
actively constructs hypotheses about the world, and compares them with the
incoming data to validate a particular interpretation. Perhaps the purest form of
this picture is the notion of “analysis by synthesis,” according to which percepts
are quite generally constructed by the perceptual system itself (the “synthesis”)
and then confirmed (as the “analysis”) to the extent that the available data do
not contradict them.4 The kind of “motor theory” of speech perception which
we will presume in chapter 6 below generally makes somewhat similar assump-
tions. These and other active views of perception made it clear that the perceptual
system might well involve (at least in part) “top-down” generation and testing of
hypotheses, and not only simple, bottom-up identification of the acoustic signal.
If that is the case, though, more than one such hypothesis might be consistent
with the same surface facts, and thus more than one phonological representation
might represent a speaker’s interpretation of the same phonetic form.
Secondly, when we recognize that there are in fact substantive universals
of language, it becomes clear that linguists are not limited in their grammar-
writing to simple induction over collections of surface data, and also that the
analysis of different parts of the grammar (e.g., sound structure and word
structure) can proceed hand-in-hand. Thus, there is no reason to believe that
the surface facts alone directly constrain the range of possible grammars.
Finally, it turns out that “contrast” is a lot harder to define operationally than
it initially appears. True, we do need to be able to define this notion (insofar
4 See Stevens and Halle 1967 for the specific application of this view to the problem of speech
perception.
as it is a coherent one), but it does not follow that contrast ought to be the
only thing determining what a phonological representation should be like. A
coherent view of this relation, in fact, along with the other matters discussed
just above, emerges only when we realize that the basic object of inquiry in the
study of sound structure (as with the rest of language) is the language organ, a
form of knowledge, rather than a direct characterization of external events.
In isolation, though (as for instance in the citation forms of the same words),
all of these stems are pronounced in the same way: as [nat].
The matter of how to analyze the object case forms in (4.7) phonologically is
straightforward, since all of the segments in the chart are in contrast, and we can
set up phonological representations essentially equivalent to the phonetic ones
(though we still need to abstract away from the environmentally conditioned
variation in voicing in p/b, t/d, etc.). But what about the phonological analysis
of [nat]? We seem to have two choices.
On the one hand, we might say that [nat] always has the same phonological
representation (e.g., /nat/), one which satisfies biuniqueness (4.5), because it
can always be uniquely, if uninformatively, recovered from the phonetic form.
The alternative is to say that there are at least five (and potentially eight) different
phonological forms that all correspond to the surface form [nat], such that it
is not possible to tell in isolation which phonological representation should be
associated with any given phonetic one (hence violating biuniqueness).
Most linguists are tempted to posit phonological representations /nat/, /natʰ/,
/nas/, /načʰ/, /nač/, etc., and to say that there is a principle by which any final or
preconsonantal coronal obstruent (in phonological representation) is replaced
by (phonetic) [t] (with corresponding rules for labials and for velars). This
violates biuniqueness, but it seems to express something real about the language.
What is the basis of this feeling?
In essence, the situation is the following. For any given lexical element of the
language, the prevocalic variant in which it occurs is idiosyncratic, but constant:
e.g., the word for “day” appears with the same consonant in [naǰe] “in the
daytime” as in [naǰɨl] “day (obj),” as well as in any other form where the stem-
final consonant is followed by a vowel. In contrast, forms of, e.g., “sickle” in
which the stem is immediately followed by a vowel always show a stem-final [s].
Given the form that occurs before a vowel, the final or preconsonantal form is
predictable, but not vice versa. That is, we can establish as a general principle
of the language that any segment which is [+Obstruent, +Coronal] in medial
position (i.e. [t, tʰ, s, č] etc.) will correspond to [t] in final position. From the
non-medial form, however, we cannot predict the medial one in a unique way.
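A brief sketch (ours; the stem inventory is abbreviated) makes the asymmetry concrete: the medial-to-final mapping is a function, but its inverse is not:

# Hedged sketch of Korean coronal neutralization: many phonological stems,
# one phonetic form in final or preconsonantal position.
NEUTRALIZE = {"t": "t", "tʰ": "t", "s": "t", "č": "t", "čʰ": "t"}

def final_form(stem_final):
    """Predictable direction: medial consonant -> final [t]."""
    return NEUTRALIZE[stem_final]

print({c: final_form(c) for c in NEUTRALIZE})   # all map to 't'

# The inverse is one-to-many, so a final [t] cannot be uniquely resolved:
inverse = [c for c, f in NEUTRALIZE.items() if f == "t"]
print(inverse)  # ['t', 'tʰ', 's', 'č', 'čʰ'] -- biuniqueness (4.5) fails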
Given a biunique notion of the relation between phonological representations
and phonetic form, /nas/, /nac/, etc. cannot be the phonological representations
of words all pronounced [nat], since there would be no way to recover one
of these (as opposed to any of the others) from a given instance of [nat]. Any
phonological representation is a representation of the sound properties of a
message, but biuniqueness further limits the sound properties that can potentially
differentiate one message from another to ones that are overtly realized in the
phonetic form of utterances expressing that message – an E-language notion.
But what else could “sound properties” possibly mean? In fact, what dif-
ferentiates e.g. /nas/ from /nac/ as the phonological representation for [nat] is
something a native speaker of Korean knows about the form the item in question
takes in general and not just in this utterance. As a part of speakers’ knowledge
of the language, these relations clearly belong in a full account of the I-language
Korean, part of a Korean speaker’s language organ. Any theory of phonologi-
cal form which prevents their expression is thereby seen to be inadequate as a
general account of the structure of the language faculty.
Note that the proposal to distinguish /nas/, /nat/, /nač/, etc. (all realized pho-
netically as [nat]) is still a matter of the “sound properties differentiating one
(potential) message from another,” however. We have not, for instance, pro-
posed a way to give different representations for, e.g., pair and pear in English,
since these words never differ in their sound properties. While the differences
among Korean /nas/, /nat/, /nač/, etc. are manifested in some (though not all)
phonological environments, there are no such differences among English pare,
pear, pair.
When American structuralists in the 1940s and 1950s came up against this
sort of example, they could not incorporate it into a (biunique) phonemic rep-
resentation. Since their focus was on an E-language conception of the object of
their study, and for other reasons sketched above, they considered biuniqueness
a valuable requirement a priori on phonological form, and were not particularly
troubled by this conclusion. On the other hand, examples such as the Korean
one here (and the widely discussed case of final de-voicing in German, Russian,
and a variety of other languages) made it clear that even if these facts did not
belong in the “phonology,” there was still more to be said to include them in an
overall description.
To allow for this, they constructed a new kind of object: a “morphophonemic”
representation. This was assumed to represent a more abstract characterization
of linguistic elements, related in a non-biunique way to phonemic representa-
tions as in (4.8).
(4.8) |nat|, |natʰ|, |nas|, |nač|, . . . → /nat/ → [nat]
From our perspective, however, it can readily be seen that these representations actu-
ally articulate something important that a speaker knows, hence properties of
that individual’s language organ: they (and not the biunique phonemic forms)
uniquely determine the full range of shapes a given linguistic element will dis-
play across environments, which is surely part of the characterization of the
“sound properties” that oppose that element to other elements. An account
of sound structure as I-language must thus include these matters; while the
necessity of the biunique phonemic representation, in contrast, remains to be
established.
In fact, the example Halle refers to is not analogous in the relevant details to
the Russian case. Matthews shows that /b,t,k/ are replaced by [m,n,ŋ] in syllable-
final position after a nasal vowel in this form of Dakota. But he also makes it
clear (Matthews 1955, p. 57, note 3) that “[m, n] do not otherwise occur in this
position.” This is thus a case of “partial overlapping,” rather than neutralization,
and the example does not necessitate the loss of a generalization in order to
maintain a biunique analysis. Since the nasalization rule for stops following a
nasal vowel does not involve a mixture of neutralization and allophonic effects,
but can be formulated entirely as a relation between phonemic and phonetic
forms, its unitary nature is not compromised by the principle requiring phonemic
forms to be uniquely recoverable from the surface phonetics. Other examples
that do have this character, however, were known at the time, as we will note.
There were actually several examples that had been discussed in the literature
before Halle’s paper that involve facts whose logic is entirely parallel to that
of Russian voicing assimilation. It is instructive to look at the treatment they
received, because it shows us something about the extent to which linguists of
the period held to the principles of their theory.
One way of dealing with such facts is illustrated by Bloomfield’s discussion
of Menomini. Here, as in Russian, we have an apparent generalization which
(when applied to morphophonemic forms) involves a mixture of phonemic and
subphonemic effects. But instead of concluding that this showed the inadvis-
ability of phonemic representations, Bloomfield interprets the facts as showing
that the allophonic variation is probably phonemic too, after all. “If it looks
like a phoneme, walks like a phoneme, quacks like a phoneme, it must be a
phoneme” (with apologies to Walter Reuther).
If postconsonantal y, w, or any one of the high vowels, i, ı̄, u, ū, follows anywhere in
the word, the vowels ē and ō are raised to ı̄ and ū, and the vowel o in the first syllable
of a glottal word is raised to u: mayı̄čekwaʔ that which they eat, cf. mayēček that which
he eats; ātεʔnūhkuwεw he tells him a sacred story, cf. ātεʔnōhkεw . . . Since ū occurs
only in this alternation, it is not a full phoneme. (Bloomfield 1939, §35)
“Not a full phoneme.” What does that mean? In the inventory of “the actual
Menomini phonemes,” the element ū appears in parentheses, and is identified
as a “semi-phoneme” (Bloomfield 1939, §5). Bloomfield must have been some-
what uncomfortable with this analytic result, because in a later treatment, his
posthumously published grammar of Menomini (edited by Charles Hockett),
he gives some rather marginal arguments that ū is a (“full”) phoneme after all.
Since the occurrence of ū is normally confined to the forms in which it replaces ō
under the regular alternation of 1.8 [referring to the rule above], it might be viewed as
a mere positional variant of ō. In this alternation, however, the difference of ō and ū
is parallel with that of ē and ī, two sounds which unmistakably figure as separate
phonemes. Moreover, the difference of ō and ū is maintained by persons in whose
speech this alternation has lost its regularity. Also, the sound of ū (and never of ō) is . . .
Fairly clearly, the invocation of such marginal evidence (the speech of in-
dividuals who do not really control the phonology of the language and the
pronunciation of a synchronically non-Menomini word) stretches the intuitive
notion of what phonology is about in order to maintain consistency.
A somewhat different response, and a real triumph of honesty in dealing
with such facts, is illustrated by Bernard Bloch. Discussing (in Bloch 1941) an
example from English that is logically just like Halle’s, he notices exactly the
same point Halle makes: an apparently unitary rule must be broken in two as a
result of the requirements for a phonemic representation. But does he conclude
that (biunique) phonemes should be discarded? Not at all. Instead, he concludes
that science has saved us from a seductive but ultimately false generalization.
In essence, he denies the intuitively obvious analysis of the facts on the basis
of a priori theoretical considerations.
These reactions are among the more principled. In fact, when we look at the
examples that began to accumulate by the 1950s which suggested that phone-
mic representations had properties that led to incorrect or otherwise deficient
analyses, we see that linguists of the time found various ways to preserve their
principles in the face of the apparent facts. On an issue other than biuniqueness,
this can be illustrated from reactions to the famous example of writer/rider,
where the surface contrast is in the “wrong” place as illustrated in (4.9). For
many speakers, the pronunciations differ in terms of the length of the first vowel
and not in terms of the medial stop, which is pronounced in the same way
(typically as a sort of “flap”) in both words.
(4.9) [rajD] “writer” vs. [ra·jD] “rider” = /rajtr/ vs. /rājtr/ or /rajtr/ vs. /rajdr/
One possible way to deal with such a situation is to force the theory to provide
the correct result. When the principles lead to absurdity, adapt the principles
so that they will yield what you know intuitively to be correct. An example of
this approach is provided by Harris’ (1951) procedures of “rephonemicization,”
which allow the linguist to massage the analysis in a variety of ways so as to
arrive at a satisfying analysis even though the basic premises of the theory do
not naturally provide one.
An alternative is to follow the principles consistently, and if they lead to
absurdity, then deny the facts. With respect to the specific facts in (4.9), this is
illustrated by Householder’s (1965, p. 29) conviction that “I can tell you from
experience that you will, if the words are in fact consistently distinguished,
invariably find one or more of several other differences [between the flaps of
writer and rider].” That is, even though all of the apparent evidence suggests
that the difference between writer and rider (in the relevant dialect) is a matter
of the quantity or quality of the stressed vowel, a sufficiently assiduous search
for phonetic detail will uncover some basis for assigning the difference to the
medial consonant (where it intuitively “belongs”) and treating the patent vowel
difference as allophonic.
The difficulties that emerged for the phonemic theory of this period follow
directly from the fact that it was a theory of E-language. The biuniqueness con-
dition (4.5) and the approach to language that motivated it forced the terms of
the theory to limit themselves to descriptions of the external observables sup-
posedly provided by phonetics. As a result, facts indicating that speakers’ know-
ledge of a language is not limited in this way had to be dealt with in uncomfortable
ways or not at all.
A phonological representation is, by its very nature, a characterization of
the sound properties that distinguish linguistic objects for a speaker of a
given language. In order to translate this notion into a description of external
reality, however, phonemicists found it necessary to rebuild it on the basis of
observable properties and operational tests, ideas that turn out to be quite prob-
lematic in practice and to lead to a host of difficulties that in fact have nothing
important to do with phonology itself. As we will see in chapter 6, even the
notion of a purely E-language approach to phonetic description turns out to be
inadequate.
When we ask why Halle’s argument should have been so earth-shaking, it is
hard to say. Not only did it not involve a completely novel complex of facts,
it is not even the case that it shows biunique phonemic analyses in general to
lead to loss of generalization. This is a point that several authors have made,
with respect to earlier theories such as that of Trubetzkoy (1939), one of the
founders of the European variety of phonological theory to which generative
phonology often traces its roots.7
Principled discussion in the 1940s and 1950s of facts that were embarrassing
for phonemic theory did not in general consider, as Halle did, the possibility
that the appropriate conclusion to be drawn was that the basic premises of
structuralist phonology were misconceived. On the other hand, Halle’s argument
when it was presented in 1957/1959 was of a sort that had been offered in
substance before; and in any event, it did not really suffice to prove its point
in a fully general form. So why, then, did it have such major consequences,
while other similar cases had little or no effect? It appears that the special force
of Halle’s argument came from the fact that it was embedded in a theory that
was not limited to representations and the alphabets of elements that compose
them:
7 For discussion of alternatives to Halle’s analysis within Trubetzkoy’s framework – involving the
notion of the “Archi-phoneme” – and also within current generative phonology, see Anderson
2000, from which the present section is derived.
[T]he effectiveness of Halle’s argument . . . lay in the emphasis it put on the centrality of
rules in a phonological description. Note that the entire argument rests on the observation
that, in certain situations, a level meeting the conditions of bi-uniqueness requires some
unitary regularity of the language (here, voicing assimilation) to be split up into two ef-
fectively unrelated rules. Now in a theory (such as American structuralist phonemics) in
which only the representations of forms have “real” status, such an argument is nonsen-
sical or at best irrelevant: the principles relating one representation to another (the rules)
are simply parts of the definitions of individual elements of representations, and have
no independent status whatsoever in the grammar. If they can be formulated in a simple
and concise way, so much the better: but the notion that the elements of representations
themselves should be chosen for the convenience of the rules was inconceivable.
The immediate consequence of Halle’s discussion was a change in phonology in
the direction of much more abstract representations than those permitted within a the-
ory which concentrated on biunique phonemics. But it must be emphasized that this
move was, in an important sense, an ancillary consequence of a more fundamental re-
orientation in phonological research: a shift from a concentration on the properties of
phonological representations and their elements to a much greater stress on the rules
of a grammar. Naturally, concern with questions of representations and their nature did
not disappear overnight. Nonetheless, the recognition was dawning that rules as well
had to be taken seriously as part of a grammar if language was to be examined as a
complex cognitive system rather than an inventory of phonemes, morphemes, words,
and constructions. Since the study of rules, their properties, and their organization into
linguistic systems was virtually unexplored territory, this reorientation had a much
more important effect on the nature of phonological research than the mere fact that
generative underlying representations are more abstract than biunique phonemic ones.
(Anderson 1985, pp. 321f.)
Halle’s innovation, on this view, was the focus he put on the need to get
the rules right in the statement of a language’s phonology, and not simply
to provide the right representations. These rules, as part of the content of a
speaker’s language organ, are intrinsically an aspect of I-language. So long as
linguistic theory remained focused on the (E-language) issue of how to represent
utterances in a principled alphabet, though, an argument based on the need to do
justice to the rules could have no real force, since the content of the statements
that relate phonology to phonetics had no independent external (E-language)
status of the sort the utterances themselves have.
Ultimately, the shift of attention from alphabets (inventories of basic repre-
sentational elements) and representations based on them to rules is significant
because it reflects a more profound shift in the object of inquiry, from the study
of the properties of observable linguistic events, the forms, to the study of
the knowledge speakers have of their language that underlies their production
and perception of such events. Rules are preeminently a characterization of
speakers’ knowledge, while the representations are in some sense primarily a
characterization of the forms. The change is thus a shift from the study of lan-
guage as an external, physical or social reality to the study of the structure and
organization of an aspect of human cognition: from E-language to I-language.
Now during the heyday of American structuralism, it was pretty much out
of bounds to study internalized knowledge: all there was to study was the
observable external form. But by the 1950s the world was gradually coming to
be more receptive to talk about minds, and so such a shift was at least logically
possible. The link between rules and individual cognition is quite explicit, at
least by the time of Chomsky and Halle’s fundamental statement in The Sound
Pattern of English (Chomsky and Halle 1968, pp. 3f):
The person who has acquired knowledge of a language has internalized a system of rules
that determines sound–meaning connections for indefinitely many sentences . . . [W]e
use the term “grammar” to refer both to the system of rules represented in the mind
of the speaker-hearer . . . and to the theory that the linguist constructs as a hypothesis
concerning the actual internalized grammar of the speaker-hearer.
As late as 1965, when Fred Householder provided Chomsky and Halle with
a debating platform for use in going through the bases of alternative approaches
to phonology, it is clear that at least a significant fraction of the field did not (and
perhaps could not) understand the notion that linguistics might have speakers’
knowledge, rather than the properties of linguistic forms, as its proper object.
Householder was certainly a very intelligent man, and an experienced linguist,
but the very idea of linguistics as the study of an aspect of the mind was
quite incomprehensible to him. In discussing the claim of Chomsky (1964) that
“A grammar that aims for descriptive adequacy is concerned to give a correct
account of the linguistic intuition of the native speaker,” Householder (1965,
p. 14) finds that “[o]nly . . . ‘observational adequacy’ is intelligible (at least to
me) . . . it is sheer braggadocio to talk about descriptive adequacy, even if one
knew how to discover what a ‘correct account of the linguistic intuition of the
native speaker’ is.”
By the mid to late 1960s, as new generations of students appeared whose
training originated in the work of Chomsky, Halle, and their colleagues at
Massachusetts Institute of Technology, the basic point about the central im-
portance of rules – the need to get those right because they are really what the
study of language is all about – came to be more generally appreciated. But
recall that the persuasiveness of Halle’s original argument rests crucially on
one’s willingness to take seriously this need to get the rules right. And in fact
it took ten years or so after Halle’s original presentation for this to become a
generally accepted notion,8 so it is clear that whatever was responsible for the
rise of generative phonology, it probably was not simply the logic of Halle’s
conclusion about the obstructive role of phonemes in a descriptively adequate
account of Russian voicing assimilation.
So what in fact did happen to change the direction of phonologizing in
the early 1960s? A part of the responsibility undoubtedly should be laid to a
8 See Anderson 2000 for a more detailed account of the relevant history.
principle that “plus c’est la même chose, plus ça change.” That is, by the end of
the 1950s, phonemic theory had increasingly become a settled discipline within
which only quite minor adjustments seemed necessary (or possible). With little
left to do, new generations of students inevitably looked for new challenges –
and new approaches that would provide them. While the fundamentally distinct
scientific premises of the new theory of generative grammar may have been
apparent to its originators, students did not have to appreciate these differences
to see that something quite new and different was going on, and that they could
make real contributions to it.
It is important to understand the content of our creation myths, since they tell
us something about the structure we actually give to our world. On the other
hand, it is also important not to confuse them with explanations of how the
world actually came to be the way we find it. In the end, Halle’s argument about
Russian voicing assimilation probably did not in itself persuade the linguists
of the time to drop their externalist presumptions, their phonemes and their
exclusive focus on representations, so as to become mentalists focusing on
rules as the expression of internalized knowledge. But on the other hand, it is
exactly in the context of that development that we still have to see the logical
force of the original argument. We really only come to appreciate the sense
of this important argument after the shift in point of view that it supposedly
produced has been achieved.
It is not particularly satisfying to discover that a field can change its character
fairly rapidly for reasons that are primarily pragmatic, and not purely principled.
But on the other hand, this case is instructive, not just in its own right, but
because it suggests that the same kind of influence may have been responsible,
on a smaller scale, for a number of the changes we have seen since then (and
probably many times before).
For example, phonological theory in the period immediately before and af-
ter the publication of Chomsky and Halle 1968 was intensely occupied with
highly formal concerns, issues such as rule ordering and the role of formally
defined notational conventions in producing an explanatorily adequate theory of
grammar.9 Within a rather short period in the late 1970s, these were almost com-
pletely abandoned in favor of the study of “auto-segmental representations” –
a notion of the organization of phonological (and phonetic) representations
in terms of the synchronization of properties with respect to the time-course
of speaking. This considerable shift of attention did not come about because
auto-segmentalists solved the earlier problems, however, or even showed that
they were misconceived. Rather, it happened because auto-segmental work
9 That is, a theory which has enough internal deductive structure to ensure that for any given
set of empirical facts, exactly one grammar will be provided – and that grammar will be the
“descriptively adequate” one, in the sense introduced by Chomsky above of providing a “correct
account of the linguistic intuition of the native speaker.”
Figure 5.1 The sounds of French: consonants and vowels (chart not reproduced)
These segmental units appear within syllables that may have zero, one or
two consonants in their onset, possibly followed by a glide, preceding an oral
or nasal vowel, followed optionally by one or two additional consonants as a
coda. Monosyllabic words illustrating some of these various possibilities are in
(5.1), where “V” represents the vowel of the syllable and “C” and “G” represent
a consonant and a glide (or semi-vowel) respectively.
(5.1) V eau [o] “water”
GV oui [wi] “yes”
CV peaux [po] “skin”
CGV moi [mwa] “me”
1 Fula is a language spoken in various forms by some 12–15 million people across West Africa.
In much of this region, the colonial language was French, and use of French in a variety of
circumstances continues in many countries. The discussion here is based on the work of Carole
Paradis and her colleagues (e.g., Paradis and LaCharité 1997); where dialect differences are
relevant, the forms are those of a Futa Toro speaker.
The sounds of Fula are rather different from those of French. These are given
in figure 5.2.
Figure 5.2 The sounds of Fula
Consonants: (chart not reproduced)
Vowels: i, u, ɛ, ɔ, a (long and short); ɛ and ɔ have closed variants [e] and
[o] in the vicinity of a high vowel or another raised mid vowel.
When French words are borrowed into Fula, their forms are generally changed
so as to adapt to the possibilities offered by the language. For instance, Fula
2 The segment [w] appears in both the labial and the dorsal columns, because it involves both
lip and tongue body in its articulation. The duality has consequences for the way the segment
behaves with respect to regularities of the language, as shown in Anderson 1976.
Similarly, Fula does not have the sound [v]. When French words with [v] are
borrowed, this sound is replaced (unpredictably) with one of [w,b,f] as in (5.5).
Since Fula does not have nasal vowels, French words with those sounds
are borrowed with a sequence of oral vowel followed by nasal consonant. The
nasal consonant is articulated at the same point of articulation as a following
consonant; or, if there is no following consonant, the velar nasal [ŋ] appears.
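A hedged sketch of this adaptation pattern (ours; the place-of-articulation table is simplified and the input transcriptions are toy forms):

# Hedged sketch: adapting a French nasal vowel for Fula, per the text.
PLACE_OF = {"p": "labial", "b": "labial", "t": "coronal", "d": "coronal",
            "k": "dorsal", "g": "dorsal"}
NASAL_AT = {"labial": "m", "coronal": "n", "dorsal": "ŋ"}

def adapt_nasal_vowel(oral_vowel, next_consonant=None):
    """Return oral vowel + nasal consonant; the nasal is homorganic with a
    following consonant, and [ŋ] appears when no consonant follows."""
    place = PLACE_OF.get(next_consonant)
    nasal = NASAL_AT[place] if place else "ŋ"
    return oral_vowel + nasal

print(adapt_nasal_vowel("o", "t"))  # -> "on" before a coronal
print(adapt_nasal_vowel("o"))       # -> "oŋ" with no following consonant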
the basic form of the regular past ending is /-d/, as suggested by vowel-final
verbs such as delay/delayed. The devoicing of the past tense ending in picked,
leaked, etc. is clearly due to the regularity that English final clusters of obstruents
must agree in voicing, as enforced by the devoicing rule in (5.12).
Now consider leave/left, lose/lost, and other similar verbs. These involve
an alternative form of the past ending, one which we also find in mean/meant,
deal/dealt, etc., and which we might represent as basic /-t/. But then in left (from
/lijv#t/), lost (from /luwz#t/), we seem to have quite a different rule applying
from the devoicing rule in (5.12): one that devoices the end of the stem, not
the suffix consonant. Despite this difference in effect, though, the two clearly
both enforce the same generalization: that of (5.10a). Somehow the grammar
containing these rules is not actually capturing this overall regularity. This is
a variation on the same kind of insufficiency we saw before in the ability of a
language’s rule system to account for patterns of adaptation in borrowing.
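The two formally independent rules can be sketched as follows (our illustration; transcriptions are simplified and the vowel changes in left and lost are ignored):

VOICELESS = set("ptkfsšč")

def devoice_suffix(stem, suffix):
    """Rule (5.12)-style: the /-d/ ending devoices after a voiceless stem."""
    if suffix == "d" and stem[-1] in VOICELESS:
        return stem + "t"          # picked: /pik + d/ -> [pikt]
    return stem + suffix

def devoice_stem(stem, suffix):
    """The left/lost pattern: a basic /-t/ ending devoices the stem's end."""
    if suffix == "t" and stem[-1] not in VOICELESS:
        return stem[:-1] + {"v": "f", "z": "s"}.get(stem[-1], stem[-1]) + "t"
    return stem + suffix

print(devoice_suffix("pik", "d"))   # pikt  -- the ending devoices
print(devoice_stem("lijv", "t"))    # lijft -- the stem devoices (cf. left)
print(devoice_stem("luwz", "t"))    # luwst -- likewise (cf. lost)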
(5.15) a. Abstract away from surface differences that are due to reg-
ularities of the language; and
The rules of the grammar, then, exist (a) to state the regularities of sound
structure in the language; and (b) to relate the abstract phonological forms of
linguistic elements to their phonetic realizations in various contexts. The idea
pursued by Prince and Smolensky and which has driven OT is that this approach
is ultimately unproductive in satisfying the main goal of phonological theory:
to provide a substantive definition of what constitutes a possible phonologi-
cal system for a natural language. While it is obviously important to provide
accurate descriptions of individual languages, the task of understanding UG
requires us to provide a more general account of the content and organization
of I-language.
Traditional generative phonology thinks of a grammar as a collection of
rules, each of the form A → B / C __ D. Such a rule looks for input sequences
of the form CAD and performs an operation of the form A → B (“A takes on
property B”) on them. But “[f]or this format to be worth pursuing, there must
be an interesting theory which defines the class of possible predicates CAD
(Structural Descriptions) and another theory which defines the class of possible
operations (Structural Changes).” These theories have proven to be “loose and
uninformative,” and thus we should conclude that “the locus of explanatory
action is elsewhere” (Prince and Smolensky 1993, p. 3).
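For concreteness, the traditional rule format can be sketched as a one-line rewrite (our illustration, using regular-expression lookaround for the context C __ D):

import re

def apply_rule(a, b, c, d, form):
    """Apply A -> B / C __ D: rewrite a as b when preceded by c, followed by d."""
    return re.sub(f"(?<={c}){a}(?={d})", b, form)

# Toy illustration (invented forms): voice t between vowels, t -> d / a __ a.
print(apply_rule("t", "d", "a", "a", "pata"))  # -> "pada"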
The point here is that the rules themselves do not really seem to be very
useful in arriving at generalizations about universal properties of phonologi-
cal form. We can try to establish generalizations about what sorts of things
rules can do, but all such theories seem to allow for the formulation of lots of
things we “know” to be impossible. This suggests we should look elsewhere
for explanations.
Furthermore, theories of rules have been limited to theories of individual
rules. Even the best theory of the Structural Descriptions and Structural Changes
of particular rules misses the overall nature of phonologies: that sets of rules
have a coherence that cannot be seen in the individual rules themselves.
As illustrated by the two rules for the pronunciation of consonant clusters in
inflectional forms proposed above, devoicing can affect either the stem-final
consonant or the ending itself depending on which form the ending has. These
two rules are in principle quite independent of one another, but together clearly
express a single regularity.
Now in the 1970s and 1980s there were a variety of proposals made to the
effect that the basis of phonological rules was to be sought in their effects:
that is, that there were various regularities of surface pattern that provided the
5.3 Constraint-based theories of phonological knowledge 101
motivation for the differences between underlying and surface form that are
expressed by individual rules. A language does not have an epenthesis rule
because it likes epenthesis, but rather because as a result of this rule, it will
avoid ill-formed clusters (as in English inflection, discussed above, or the rule
of Spanish that avoids initial /sC/ by inserting an initial [e] in words like España).
A language has assimilation rules not because of the way they work, but because
as a result, all clusters will be homogeneous in some property.
Often it seems to be the case that a language has multiple rules, each of
which by itself is only part of the picture, but which taken together have the
effect that some pattern exists on the surface. Thus English assimilates voice
progressively in inflection (/kæt + z/ → [kʰæts]), but regressively in some other
cases of inflection (lose/lost), as we have seen just above. The two formally
quite different rules have one surface effect, a matter which we will take up in
more detail below.
These “conspiracies” (a term introduced by Kisseberth 1970a, 1970b) seem
to have a natural formulation as ways to satisfy some constraint on surface
representations. Suppose we take this effect as the locus of explanation in
phonology. Then we can attempt to develop a theory of how representational
well-formedness determines the assignment of phonological structure: a theory
of constraints and their interaction, as opposed to a theory of rules.
The nature of these constraints has been the subject of intense investigation
in recent years. An important basic notion is that constraints are instantiations
of universal aspects of sound structure – hence, they are the stuff of UG. Con-
straints address representational well-formedness (rather than the mechanics
of converting one representation into another), and it is presumed that most of
the content of this notion is due to the structure of the human language faculty,
rather than to arbitrary interlinguistic variation.
A conceptually important difference between OT and related theories lies in
the claim that constraints can be operative in a language even when they are
not necessarily true (or satisfied) in every form. The members of a given set of
constraints are typically in conflict, and not mutually consistent: satisfying one
constraint may require the violation of another. The way a particular language
resolves these conflicts is what characterizes its particular phonology as opposed
to those of other languages.
We can make this concrete by suggesting that there are two fundamentally
conflicting demands in sound structure:
(5.17) *kʷ
In a language like English, where there are no labialized velars, the effects
of this constraint are absolute. We could express this by saying that “even if
the lexical representation of an English word had a lexical labialized velar, it
would be pronounced without labialization.” Such a markedness constraint in-
evitably comes into conflict with the basic faithfulness property, expressed by
a constraint to the effect that lexical values must be preserved. A language like
English can then be described by saying that in such a language, the marked-
ness constraint (5.17) takes precedence over (or “dominates”) the faithfulness
constraint (5.18).5
4 The content of this statement is that a surface representation is disfavored to the extent it contains
instances of rounded velar consonants.
5 The content of this constraint is that a representation is disfavored to the extent underlying or
input values of the property [Round] are not preserved in the output.
5.3 Constraint-based theories of phonological knowledge 103
We have as yet said nothing about just how a system of constraints (such
as (5.17), (5.18), etc.) allows us to compute a surface form corresponding to a
given input. In essence, this process consists in a comparison among all of the
formally possible surface forms that might correspond to that input (a set of
“candidates”), resulting in the selection of that candidate that best conforms to
the system of ranked constraints. The grammar thus consists of two components,
called Gen and Eval. Gen operates on input representations to produce a
set of candidates; these, in turn, are assessed by Eval. The candidate with the
highest degree of harmony (i.e., the one which violates highly ranked constraints
to the smallest possible degree) is (by definition) optimal, and is thus chosen
as the output.
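
Since Eval amounts to picking the candidate with the least serious violations
under a strict ranking, its logic can be sketched in a few lines of Python. This
is our own illustration, not a formalism proposed in the OT literature, and the
constraint definitions are crude stand-ins for (5.17) and (5.18):

    def no_kw(candidate):
        # markedness, cf. (5.17): penalize labialized velars
        return candidate.count("kʷ")

    def ident_round(underlying):
        # faithfulness, cf. (5.18): penalize altering the input
        return lambda candidate: 0 if candidate == underlying else 1

    def eval_ot(candidates, ranking):
        # the optimal candidate has the lexicographically smallest
        # profile of violations under the ranked constraints
        return min(candidates, key=lambda c: tuple(con(c) for con in ranking))

    underlying = "kʷat"               # hypothetical input with a labialized velar
    candidates = ["kʷat", "kat"]      # Gen's candidate set, drastically pruned
    # an English-type grammar: markedness dominates faithfulness
    print(eval_ot(candidates, [no_kw, ident_round(underlying)]))   # prints 'kat'

Reranking the two constraints (faithfulness above markedness) would instead
preserve the labialized velar, which is exactly how this architecture models
variation across languages.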
When we investigate languages from this perspective, what we find is that the
same set of constraints can describe a number of different systems, depending
on their relation to one another. In any given language, the constraints are
organized in a hierarchy, and then contribute to the determination of correct
surface forms via principles of “harmony” that include those of (5.21).
How does Gen do its work? This is a matter of some dispute among practi-
tioners of OT. One view is that Gen produces, for any input, the full range of
possible well-formed expressions over the alphabet of phonological represen-
tations. This requires Eval to be able to screen out vast masses of irrelevant
candidates with particular focus on the few that are legitimate possibilities.
While it has never really been shown how this is computationally possible with
finite resources, something along these lines is commonly assumed in theoret-
ical discussion.
Another possibility (one implicit in the notion that Gen “operates on input
representations”) is that Gen is smart enough to know, for a given input form,
what forms could conceivably be at least possible output candidates. But that in
turn requires that Gen incorporate some intelligence, mechanisms that sound
suspiciously like a set of phonological rules. And of course if the same old set
of rules turns out to be necessary as a (covert) part of the grammar, it is not
obvious how different this system would be from that of classical generative
phonology. These issues remain to be clarified, especially if OT is to be taken
seriously as providing a model of how speakers bring what they know to bear
on what they do, but at least the conceptual outlines of such a constraint-based
theory are relatively clear.
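
A toy computation makes the scale of the first view's problem vivid. Even with
a crude length bound and a three-symbol alphabet (both assumptions of ours,
purely for illustration), the candidate space grows exponentially, and the Gen
of that view imposes no length bound at all:

    from itertools import product

    alphabet = ["p", "a", "k"]
    for n in range(1, 8):
        # all strings of length n over the alphabet: 3**n of them
        print(n, sum(1 for _ in product(alphabet, repeat=n)))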
Of course, apart from the constraint system provided by UG (whose inter-
nal ranking characterizes an individual language), a speaker’s knowledge also
includes information about individual linguistic forms: the lexicon, a set of
representations including phonological, morphological, syntactic, and seman-
tic information. We will discuss the organization and content of the lexicon
in chapter 7, but one strictly phonological issue in the structure of the lexicon
forms an important aspect of OT. The goal of this theory is to characterize the
phonological knowledge a speaker has of his or her language entirely in terms
of a system of (ranked) constraints applying to output forms. It follows, there-
fore, that no constraints of this sort should crucially hold at the level of input
forms. This notion (known as the hypothesis of the richness of the base)
has quite interesting consequences for the form of our description.
For instance, in English, a language with no labiovelars, the absence of such
segments follows entirely from the constraint ranking in (5.22).
A related principle, known as lexicon optimization, says that where several
distinct inputs would all yield the same output, we pick (as the lexical repre-
sentation) the one whose output incurs the fewest violations. What that means
in practice is
that the input which is closest in its properties to the output excludes any of the
others from the lexicon.
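
The selection procedure can be sketched directly: run each prospective input
through the grammar, and list only the input whose output incurs the fewest
violations. The toy grammar below is our own, continuing the labialized-velar
illustration, and stands in for a full Gen/Eval computation:

    def grammar(inp):
        # toy grammar: remove labialization, counting one faithfulness
        # violation if the output differs from the input
        out = inp.replace("kʷ", "k")
        return out, (0 if out == inp else 1)

    def lexicon_optimize(inputs):
        # these inputs all surface identically; keep the one whose
        # output incurs the fewest violations
        return min(inputs, key=lambda i: grammar(i)[1])

    print(lexicon_optimize(["kʷat", "kat"]))   # 'kat': closest to the output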
A further constraint, part of UG, disprefers structures like those of (5.24), since
in each case the property [Voice] is associated with only one member of the
sequence to the exclusion of the other.
(5.24) a.  [Voice]
              |
           [+Obstruent] [+Obstruent]

       b.                [Voice]
                            |
           [+Obstruent] [+Obstruent]
Let us see how we might invoke this apparatus to derive a familiar pattern of
phonological variation (known as voicing assimilation) found in English
inflections such as the plural or past tense, as discussed above. For input
/kæt + z/, /lajk + d/ we should prefer the output forms [kæts] and [lajkt]. The fact
that the lexical value of voicing is not generally preserved here suggests that a
faithfulness constraint must be violated. Let us call it IdentIO(Laryngeal).
6 Including, among others, [Voice], a (possibly complex) property characterizing the activity of
the vocal folds.
7 This is presumably related to the articulatory fact that airflow conditions during the production
of obstruent consonants are such as to inhibit vocal fold vibration, and some additional gesture
is necessary if voicing is to be maintained under these circumstances. Vowels and sonorant
consonants, in contrast, are articulated in such a way that vocal fold vibration is a normally
expected concomitant, and indeed some additional gesture is necessary in these sounds if voicing
is to be inhibited. Here part of the content of UG can be grounded in – though probably not
mechanically deduced from – properties of the speech production system.
Both [lajkt] and [lajgd] alter the voicing of just one segment of the input; the
two are thus tied with respect to their faithfulness to the input, and the choice be-
tween them falls to (5.26), which tells us to prefer [lajkt] over [lajgd] since the
former contains no voiced obstruents.
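
The comparison can be laid out as a standard OT tableau. This is our own
reconstruction, with our own constraint labels: Agree(Voice) stands in for the
requirement that clusters be uniform in voicing, and *VoicedObstruent for the
markedness constraint (5.26) against voiced obstruents; ">" marks the winner.

    /lajk + d/      Agree(Voice)   IdentIO(Laryngeal)   *VoicedObstruent
    a.   [lajkd]        *!                                     *
    b. > [lajkt]                          *
    c.   [lajgd]                          *                   **!

The fully faithful candidate (a) fatally violates the top-ranked agreement
requirement; (b) and (c) each change one voicing value, so the lower-ranked
markedness constraint decides in favor of (b).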
If we were to formulate this effect in terms of a rule, it would look something
like (5.27).
(5.27) [+Obstruent] → [−Voice] / [−Voice] + ___
Such a rule would quite adequately express the fact that underlyingly voiced
affixes are devoiced when added to a stem ending in a voiceless segment. We
can show, however, that it would not suffice to describe the knowledge about
voicing assimilation which English speakers bring to bear in determining the
forms of their language.
As we mentioned above, English regular verbs form their past tense (and
past participle) by adding a uniform ending whose basic shape appears to be
/-d/ (although it also appears as [t] or [əd] when required by the shape of
the stem). Among the classes of irregular verbs, however, we find a number
(e.g., learn/learnt, burn/burnt, mean/meant, deal/dealt) which involve a similar
but distinct ending whose basic shape appears to be /-t/. This ending shows a
different pattern of assimilation from the “regular” /-d/: instead of altering
the shape of the ending to assure compliance with a requirement that voicing
be uniform in clusters, in this case it is the stem that changes, in verbs like
leave/left, lose/lost. Notice this direction of assimilation is also what we need
for derived forms such as describe/descriptive (cf. retain/retentive for the form
of the ending), absorb/absorption, five/fifth, etc.
The rule in (5.27) is not adequate to describe this situation, although we can
easily formulate a rule that is.
(5.28) [+Obstruent] → [−Voice] / ___ + [−Voice]
(5.28) is clearly a different rule from (5.27), formally quite distinct despite the
fact that both are basically ways of enforcing a single regularity. Such dupli-
cation, where we need multiple formally unrelated rules to express a single
generalization, seems quite unsatisfactory if our aim is to describe the know-
ledge a speaker of English has about the language.
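
The duplication is easy to see if we render the two rules as procedures: nothing
in the formalism connects them, even though both enforce the same surface
regularity. The segment inventory and transcriptions below are crude ASCII
stand-ins of our own devising.

    VOICELESS = set("ptkfs")
    DEVOICE = {"b": "p", "d": "t", "g": "k", "v": "f", "z": "s"}

    def rule_5_27(stem, suffix):
        # progressive: devoice a suffix obstruent after a voiceless segment
        if stem[-1] in VOICELESS:
            suffix = "".join(DEVOICE.get(c, c) for c in suffix)
        return stem + suffix

    def rule_5_28(stem, suffix):
        # regressive: devoice a stem-final obstruent before a voiceless one
        if suffix[0] in VOICELESS:
            stem = stem[:-1] + DEVOICE.get(stem[-1], stem[-1])
        return stem + suffix

    print(rule_5_27("lajk", "d"))   # 'lajkt', the regular past
    print(rule_5_28("lijv", "t"))   # 'lijft', cf. leave/left (vowel change aside)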
But now notice that this redundancy or duplication can be eliminated if we
base our account on constraints rather than rules. To deal with outputs such as
[lεft] corresponding to input /lijv + t/,8 we do not in fact have to add anything
to the constraint system as we have elaborated it above. As we saw, this system
will prefer outputs with uniform voicing to ones that are strictly faithful to their
8 In accounting for these verbs, we must obviously also include an explanation of the alternation
in the stem vowel ([ij]↔[ε]), but that issue is orthogonal to the one involving voicing which we
are discussing here.
6 Phonetics and the I-linguistics of speech
The utility of some sort of phonetic representation for linguistic items and
utterances has generally not been in
doubt. Many readers may be surprised to learn, though, that the status of
phonetic representations themselves in linguistic theory has not always been
quite so clear.
To be sure, there has often been some question about the extent to which pho-
netics is properly part of linguistics at all. If this kind of investigation of the artic-
ulatory, acoustic, and perceptual properties of concrete acts of speaking is essen-
tially a matter of more and more precise measurement of physiological, physical,
and neurological events, it seems to have little to do with linguistic structure
per se, especially if we construe the latter as primarily cognitive in its basis.
Phonetics would have the status of an auxiliary discipline – overlapping with,
but properly included within, physics, physiology, and the neurophysiology of
the auditory system – that simply described the externally observable proper-
ties of the abstract objects with which linguistics is concerned. As Trubetzkoy
(1939) put it, “phonetics is to phonology as numismatics is to economics.”
We argued in section 4.1 of chapter 4 that the kind of representation
generally called “phonetic” is a significant abstraction from the raw physical
facts. Nonetheless, few would question the premise that acts of speaking do
have some observable properties, and that the business of phonetics is to settle
the facts of the matter as to what these are, as far as the language system
is concerned. Relevant results of such observations can then be presented
in some appropriate form, and who could question that such a “phonetic
representation” describes the things phonologists have to account for?
Leonard Bloomfield, in contrast, argued that there is no linguistic signifi-
cance to phonetic representations (cf. Anderson 1985, pp. 262ff.). His point was
that insofar as these deviate in any way from a full physical record of the speech
event (such as might be provided by a tape recording, supplemented with
cineradiographic and other records of the articulatory details), they represent an
arbitrary selection of some properties to the exclusion of others and cannot be
said to be based on theoretically interesting principles. As such, they serve more
as a record of biographical details about the phonetician (what properties he or
she has learned to record and what to omit) than as a theoretically significant
record of a linguistic event. Somewhat similar objections have been revived
(on different grounds) in an updated form by Pierrehumbert (1990).
Bloomfield’s objection is a very serious one, and one to which linguists have
not always devoted enough attention – if only to be clear in their own minds
about why they reject it. Why, after all, should we attribute to some particular
subset of the physical properties of utterances the status of a fundamental
characterization of language? The essential nature of language is that of a
system of tacit knowledge as represented by the language organ, an aspect
of the organization of the mind and the brain. In that light, the selection of
some external properties of linguistic utterances as systematic to the potential
exclusion of others requires at least some justification.
In this chapter, we will defend the claim that there is a significant notion
of “phonetic representation,” one that is distinct both from a phonological
representation and from a complete physical record of a speech event. This is
part of our I-language system, and thus it merits the attention of linguists. The
kind of representation to which we wish to attribute this status, however, is
at least at first blush rather different from the sort of thing linguists typically
teach their students to produce in a beginning phonetics course.1
Let us begin by asking about the factors that contribute to determining the
physical properties of an act of speaking. A talker, let us assume, has something
to say and initiates a sequence of gestures of the vocal organs which affect the
surrounding air and are thus conveyed, perhaps, to the ears of potential listeners.
Any particular speech event of this sort can be regarded as resulting from the
interaction of a number of logically distinguishable aspects of the system that
implements that intention:
a. The talker’s intention to produce a specific utterance (i.e., the properties that
characterize the particular linguistic items – words, etc. – that compose it);
b. The fact that the utterance is produced by a speaker of a particular language
(i.e., patterns of neuromuscular activity characteristic of the sound pattern
of the particular language being spoken);
c. The fact that the utterance is a speech event (i.e., that its production invokes
neurophysiological and motor control mechanisms that are brought into play
in speech in general, as opposed, for instance, to the control regimes that are
relevant to swallowing, breathing, etc.); and
d. The physical and physiological properties of the speech apparatus, the acous-
tics of such systems, etc.
Since we can decompose speech into its articulatory, acoustic, and perceptual
aspects, we might envision (at least) three separate representations, one in each
domain. Alternatively, we might seek a single representation that unifies all of
these sorts of property in terms of one set of independent variables. Without
going into the matter in more detail here, we should make it clear that we are
quite persuaded by the arguments of advocates of a Motor Theory of Speech Per-
ception (see Liberman and Mattingly 1985, Mattingly and Studdert-Kennedy
1991 for discussion) to the effect that the primes of phonetic specification lie in
the articulatory domain, and not (directly) in the acoustics or in perception. This
decision goes against much work in automated speech recognition, for example,
which tends to be resolutely grounded in a “bottom up” approach to recovering
linguistic structure on the basis of the structure of the acoustic signal alone.
Even granting the apparent difficulties that arise in the effort to specify an
architecture for perception that implements a motor theory, we think that the
1 Another recent proposal involving a notable expansion of the notion of “phonetics” beyond the
traditional is that of Kingston and Diehl 1994. The analyses and proposals of these authors are
in some ways similar to the point of view presented below, though they differ in other technical
and substantive respects that we do not go into here.
problem for those who would model perception is to find a way to implement
such an architecture. The tendency in much technologically oriented work on
speech perception and recognition is rather to retreat into the notion that some-
how the invariants must be out there in the acoustics if we will only keep
looking, because perception would be more straightforward if they were. This
is simply another example, we feel, of the drunk who persists in looking for his
keys under the streetlight because the light is best there. We could refine our
view of just what a Motor Theory is, but for present purposes that is not really
necessary. Our reason for bringing the matter up at all is simply to be explicit
about the assumption that it is articulatory activity that we wish to characterize.
As far as we can tell, epiglottal activity is not something we manipulate per se,
nor a dimension which is controlled differently in some languages than in others.
This is perhaps a controversial thing to assert, but let us be clear on why:
some phoneticians have reported that in some languages epiglottal activity is
manipulated for linguistic purposes. But in the absence of clear support for such
claims, we would want to exclude the epiglottis from phonetic representation
in spite of its physiological and acoustic importance unless and until it can be
shown that epiglottal activity is independent of other articulatory events, events
which correspond to dimensions of irreducible linguistic importance.
What we seek, then, is a representation of all and only those aspects of a
speech event that are under linguistic control, in the sense of being managed by
the language organ: the system of linguistic knowledge that the talker brings
to bear in performance. Another way to put this is to say that we want to
characterize everything in the talker’s linguistic intention, as opposed to aspects
of the event that follow from the physical, neurophysiological, and other extra-
linguistic properties of the apparatus that is employed in talking. Providing
such a representation would respond substantively to Bloomfield’s objection,
by grounding the properties attributed to it in their role in the cognitive system
that constitutes our linguistic knowledge. It would certainly be distinct from a
full physical record of the speech event, since it would explicitly abandon the
attempt to describe everything that is true of this event in favor of a description
of everything that is linguistically determined about it.
One particular view of language in which the place of such a representation
is clear is that of Chomsky’s recent m inim alis t analyses (Chomsky 1995).
In that approach, there are only three significant kinds of representation. Post-
poning one of these, the nature of lexical items, for chapter 7, the two key
representations are the interfaces to sound, on the one hand (phonologi cal
fo r m, or PF), and to meaning, on the other (logical form, or LF). The first
of these is intended to characterize all and only the aspects of an utterance’s
form that are managed by the linguistic computational system, and it must be
(exactly) adequate to serve as an interface to the language-independent systems
of articulation and perception. PF in these terms is exactly the representation
we seek.
2 See Lehiste 1970 for a review of the classic literature establishing these facts.
After the release of a voiced consonant, the slackness of the vocal folds
disappears with some particular latency and elasticity,
and it seems reasonable to attribute the particular contour of pitch that results
in a language like, say, English to general mechanisms of speech motor control,
not to the grammar of English.
In general, we find essentially the same pitch contours in post-consonantal
vowels across many languages. In some languages, however, such as Yoruba
and Thai, where tone is linguistically significant, we find that the pitch contours
of post-obstruent vowels are much sharper. We might suggest that in these cases,
the language specifies an independent compensatory gesture that has the effect
of bringing the vowel to its intended (and significant) pitch value more rapidly
than in English. Compare the contours in figure 6.1. In English, as represented
by the graphs in figure 6.1a, we see that there is still a difference in pitch in
the vowel following a voiced consonant (e.g., [b]) as opposed to a voiceless
one ([p]) more than 100 milliseconds after the stop release. In Yoruba, on the
other hand, there are three contrastive tonal levels. We see in figure 6.1b that
a vowel following voiced vs. voiceless consonants displays the same (high,
mid, or low) pitch value within roughly 50 milliseconds after the stop release
(slightly longer in the case of a high tone, but still sooner than in the non-tonal
language English). Hombert (1976, p. 44) argues “that there is a tendency in tone
languages (which does not exist in non-tonal languages) to actively minimize
the intrinsic effect of prevocalic consonants.”
[Figure 6.1 appeared here: F0 (Hz) plotted against time (msec) for vowels
following voiced vs. voiceless stops; panel a: English, panel b: Yoruba.]
Figure 6.1 Post-obstruent pitch contours in tonal vs. non-tonal languages
(from Hombert 1976)
Vowel durations also differ systematically adjacent to voiceless as opposed to
voiced obstruents: the talker “intends” the same thing
with respect to the vowels in both cases, but the implementation of that intention
leads to an unintended consequence as a result of the coordination of the vowel
and consonant gestures. In this case, then, the phonetic representations of vowels
ought to be uniform, despite the fact that acoustically the vowel portions of
utterances differ quite systematically by 20–30 milliseconds, depending on
the environment. Paradoxically, vowels of equal duration may thus come to be
specified differently (the Saudi Arabic case) while ones of different duration will
not differ in their phonetic representation where the differences are not intended
per se. On the other hand, where the durational differences considerably exceed
what can be attributed to the mechanical effects (as in the English case), we
must again indicate the presence of some specific phonetic intention responsible
for this.
These sounds are described as voiceless unaspirated initially, but voiced in-
tervocalically. Measurement of the time course of opening of the glottis (Kagaya
1971) during their production suggests that the voicelessness in initial position
is actually a gradual transition from the open glottis position associated with
the neutral position for speech in that environment to the approximated position
required for the following vowel; and that intervocalic voicing represents merely
the maintenance of the position appropriate to the consonant’s surroundings.
The “unaspirated” stops, then, have no value of their own for the position of the
larynx: they are phonetically underspecified for this property.
The vowels [i] and [u], for example, are in complementary distribution: there are
no environments in which we might potentially find either sound, with
the difference between them serving as the basis of a contrast between words.
Furthermore, the consonants surrounding [i] vs. [u] are not arbitrary: those
favoring [u] are ones whose articulation is similar to that of an [u] in involving
lip rounding and a relatively retracted tongue position, while those favoring [i]
involve no rounding, but a relatively high, front tongue body position. Similar
observations can be made for the environments in which the other phonetic
vowels are found.
These facts suggest that in Kabardian, the wide phonetic variety of vowels
corresponds to little or no difference in linguistic intention: essentially, the
vowels consist of a vocalic transition from the position of a preceding consonant
to that of a following one. In any given environment, there are in general only
two possibly distinct qualities that can occur, which we can differentiate by
saying that the transition is made with or without an accompanying deflection
of the tongue body downward. The presence or absence of this deflection is
thus the only way in which the articulation of one vowel differs from that of
another per se.
This kind of analysis was examined more deeply by John Choi (1992). Choi
explored a variety of possibilities quite closely, and examined articulatory and
acoustic data on languages of this type. One such language which he analyzed
in detail is Marshallese. Choi concluded that the front–back dimension in the
realization of Marshallese vowels does not correspond to anything that is under
independent linguistic control: no gesture of fronting or backing intervenes
in the relatively smooth transition from preceding to following tongue body
position as determined by the surrounding consonants, although tongue height
is independently controlled. This state of affairs is thus intermediate between the
almost completely underspecified vowels of Kabardian and a fully determined
system.
4 See Byrd and Saltzman 1998 and Byrd et al. 2000 for one approach to a phonological model of
speech dynamics.
And of course, until we know what aspects of speech organization are subject
to language-particular determination, we do not know what we need to have a
principled theory of in order to describe this aspect of the language organ.
Introduction of an appropriate specification of rhythmic and temporal effects
into the description of the aspects of speech that fall under linguistic control
is thus an important refinement of the range of problems to which phonolo-
gists should attend. We have assumed above that the form this might take is
simply a specification of appropriate aspects of the time course of particular
gestures in speech, but there is another aspect to the problem. Not only can the
implementation of particular gestures over time fall under linguistic control, but
independent of this, their synchronization and relative organization can provide
yet another dimension of possible variability. Browman and Goldstein and their
colleagues (Browman and Goldstein 1998, Browman et al. 1998) develop this
point in suggesting that otherwise comparable gestures in English and Tashlhiyt
Berber, for example, differ in their “bonding” or relative synchronization
more than in their individual component timings. These ways of describing
timing relations need further refinement and application to a variety of language
types.
5 Some references on this topic include Graff-Radford et al. 1986, Blumstein et al. 1987, Kurowski,
Blumstein, and Alexander 1996, and Carbary, Patterson, and Snyder 2000. The interpretation of
this disorder as one of a prosodic, rather than segmental, nature, as suggested below, has come to
be fairly widely accepted, though details vary considerably among both patients and researchers.
A particularly useful review with respect to this issue is provided by Berthier et al. 1991.
6.4 Conclusion
If we take the lessons of the sections above to heart, the representation we
arrive at for “PF” – the interface of the language organ with the mechanics
of articulation7 – is quite distant from the sort of fine phonetic transcription
using symbols of the IPA (or some other, similar system) which most current
linguists learned as students (and generally continue to teach). An accurate PF
representation:
a. fails to indicate some properties that are perfectly true, measurable, and sys-
tematic about utterances, insofar as these are the consequences, not specifi-
cally intended, of other aspects of speech;
b. indicates gestures that are intended even under circumstances where those
gestures do not have their intended auditory (or perhaps even articulatory)
effect;
c. specifies gestures with associated temporal dynamics rather than just time-
less points, gestures which may be related to one another in hierarchical ways;
d. indicates gradient temporal relations rather than mere succession;
e. indicates gradient gestural magnitudes and not only categorical presence vs.
absence; and
f. indicates the overall pattern of articulatory dynamics within which the ges-
tural intentions of the talker are realized.
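
To make properties (a) through (f) concrete, one might imagine a PF represen-
tation structured roughly as follows. This is a minimal sketch with invented
field names, not a formalism proposed by the phonetic literature or defended
here; it merely shows what a gesture-based, gradient, temporally organized
representation could look like as a data structure.

    from dataclasses import dataclass, field

    @dataclass
    class Gesture:
        articulator: str       # e.g. "tongue body", "glottis"
        target: str            # e.g. "closure", "wide"
        magnitude: float       # gradient, not merely present/absent, cf. (e)
        onset_ms: float        # gradient location in time, cf. (d)
        duration_ms: float     # gestures have a time course, cf. (c)

    @dataclass
    class PFRepresentation:
        gestures: list                                # intended even if masked, cf. (b)
        phasing: dict = field(default_factory=dict)   # relative synchronization, cf. (f)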
A representation with these characteristics obviously cannot be taken to be the
kind of physical observation language phonetics is often presumed to provide.
It is rather a description of the cognitively real (though largely unconscious)
representations that underlie speech motor control and that are in some sense
recovered in perception. Despite its non-traditional character, it does seem to
serve the larger goal of characterizing what is linguistically significant about
the facts of speech. It is neither a full physical record of speech events, nor
a restricted characterization of the minimal distinctive core that distinguishes
higher-level linguistic elements from one another. As such, it serves as the output
(or perhaps better, as the implicature) of the sound-structural regularities of a
language. We claim that these are the things that ought to be demanded of a
“phonetic representation” by those for whom such a notion finds its significance
in the theory of language and the mind, rather than in physics or physiology.
That is, it is an appropriate way of characterizing PF, the connection between
the language organ and the organs of speech and hearing.
7 Morphology
If you ask a naive person-in-the-street – the kind of person the British call “the
man on the Clapham omnibus” – what the central thing is that has to be learned in
order to “know” a language, the chances are that a major part of the answer will
be “the words of the language.” This notion that the words of a language are the
essence of its identity is reinforced by standard language courses, which devote
great attention to the systematic presentation of vocabulary. Indeed, much of
what passes for “grammar” in many language courses is actually a subpart of
the theory of words: what has to be learned about things like conjugation and
inflection is first and foremost how to form inflected words. Compared with the
effort usually devoted to drilling vocabulary and word formation, the amount of
attention devoted to exemplifying the uses of the various forms and providing
usage notes is usually quite limited, and the space given to fundamental matters
of syntactic structure virtually none at all.
So if the set of words is such an important property of, say, English, how do
we determine what that set is? A standard answer is provided by a dictionary
(though that, of course, simply puts the problem off by one step: how did the
dictionary makers know what to include?). Most speakers behave as if the
question “Is [such and such] a word of English?” has a determinate answer,
but if so, the dictionary probably does not provide that answer, at least in the
general case. For instance, overlook “disregard” is clearly a word of English,
and is recognized as such in dictionaries, but what about overlookable “subject
to being disregarded”? Surely this is also a word of English, though it will not be
found in any of the standard dictionaries, even the Oxford English Dictionary.
And this ignores the fact that new words enter the language all of the time: if a
word like frobnicate “to manipulate or adjust, to tweak” did not exist prior to
the second half of the twentieth century, how did that change? And why, as soon
as we recognize frobnicate, do we have no trouble at all accepting prefrobnicate
“to manipulate or adjust something prior to performing some other operation
on it”? Clearly a dictionary of a language reflects the language’s words, but
equally clearly no dictionary can be taken as a definition of what constitutes a
word of the language.
In contrast to this notion, Aronoff shows that the notion of the lexicon as
an inventory of “members of a major lexical category” is presumed in much
writing on generative grammar. Evidently these two conceptions are not coex-
tensive: on the one hand, some idiosyncratic items (including both members
of the set of grammatical items like determiners, pre- or post-positions, etc.;
and idiomatic phrasal constructions) are not members of open word classes;
and on the other, many words that are members of the classes noun, verb, and
adjective (or whatever the correct set might be) will, in many languages, be
completely compositional formations composed of more than one morpheme,
such as overlookability. Quite apart from these differences, the a priori interest
of the lexicon defined in either way is not self-evident.
What these two ways of construing the notion of “lexicon” have in common
is that they are both kinds of list. Perhaps by analogy with dictionaries in the
real world, it seems often to be taken for granted that the lexicon is a kind of
set or database.
Now of course no one would take seriously the notion that the “syntax” of a
language is a list of its sentences (whether the unpredictable ones or all those
of some given type), or that the “phonology” is a list of sound combinations.
Both of these aspects of grammar are generally construed as kinds of knowledge
speakers have about their language: in the one case, knowledge of the patterns by
which words can be organized into larger structures, and in the other, knowledge
of how sound units of various sorts combine and the modifications they undergo
within such larger combinations. These are all components of our language
organ, which develop in particular ways on the basis of our exposure to data
characteristic of a given language. It seems productive to interpret the “lexicon”
of a language in a similar sense as, roughly, “the knowledge a speaker has of
how words can instantiate positions in a syntactic structure.”
When construed in this I-language fashion, the lexicon is not just a list of
items. Of course, much of what we know about the words of our language
does have the character of individual and rather local stipulations, like “cat is
a noun pronounced /kæt/ and meaning ‘a carnivorous mammal (Felis catus)
long domesticated and kept by man as a pet or for catching rats and mice.’”
But in addition, a speaker’s lexical capacity must also include a system of
rules or principles by which words are related to one another, insofar as these
relations are (at least partially) systematic and thus part of what we know
about words, qua speakers of a particular language. Such regularities are also
implicated in describing the formation of new words not hitherto part of a
speaker’s explicit knowledge of the language but implicit in its principles of
word formation. Even when a language has a rich set of general principles for
forming new words, it is typically the case that the question of whether or not
a given possible word is also an actual one remains a lexical issue – not, say, a
syntactic one.
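
The contrast between the lexicon as list and the lexicon as knowledge can be
made concrete. In the sketch below (our own; the entries, transcriptions, and
rule are purely illustrative), listed entries carry local stipulations, while a
relational rule licenses possible words like overlookable without listing them:

    lexicon = {
        "cat":      {"cat": "N", "phon": "kaet",     "sem": "CAT"},
        "overlook": {"cat": "V", "phon": "owverluk", "sem": "DISREGARD"},
    }

    def able(verb_entry):
        # V -> A in -able: "subject to being V-ed"
        return {"cat": "A",
                "phon": verb_entry["phon"] + "ebl",
                "sem": ("SUBJECT-TO-BEING", verb_entry["sem"])}

    # 'overlookable' need not be stored: the rule system licenses it
    print(able(lexicon["overlook"]))

Whether such a licensed word is also an actual word remains, as the text notes,
a lexical fact recorded separately.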
1 “Eskimo” is a name to be avoided, since (like the names of many other groups around the world) it
was assigned to the people involved by speakers of another language, and is considered offensive.
In this case, the word “Eskimo” was apparently supplied by speakers of an Algonquian language
who served as guides to early French and English speaking explorers. It means roughly “people
who eat really disgusting stuff.”
2 We have essentially nothing to say in this book about semantics, the study of meaning, but that
certainly does not mean we think it is unimportant, or not related to the nature and structure of the
language organ. This omission simply reflects the limitations of our own research competence.
The work of Ray Jackendoff, for instance (including Jackendoff 1983, 1990 among others)
sheds important light on the way the language organ associates linguistic form with meaning.
Representations in small capitals are intended to refer to elements of meaning: while these are
not quite as arbitrary as they may appear, we will not attempt to defend claims about them here.
kind of metaphor, it has some rather strong consequences. Some of these are
the following. (a) Since the relation between a (complex) word and another
from which it is derived consists exactly in the addition of another minimal
sound–meaning complex, sign composition must always be strictly monotonic
(additive, concatenative). (b) Since the basis of the sign is the indissoluble
unity of sound and meaning (and (morpho)syntax), there should be a one-
to-one association between form elements and content elements. That is, every
chunk of form ought to correspond to exactly one chunk of meaning, and vice
versa. Exceptions to this are unsystematic instances of accidental homophony
or synonymy. (c) Derivational relations are directional, in the sense that (all of
the) properties of the base form are presupposed by a derived form that involves
an additional marker. We will examine in turn each of these implicit empirical
claims of the morpheme-based theory of lexical structure, and conclude that
none of them are consistent with the facts of natural language in general. This
will lead us to the conclusion that the lexical component of the language organ
should be viewed as a system of rule-like relations among words, rather than
just an inventory of minimal signs (the morphemes).
It should be stressed that the notion that words are exhaustively composed
of morphemes, and that the theory of word structure is essentially a kind of
“syntax of morphemes,” has a long history in linguistics, and a good deal of
initial appeal. Nonetheless, we suggest that this is essentially an E-language
notion that ought to be replaced by a somewhat different conception if the
structure of the language organ is to be properly understood.3
3 Again, the work of Jackendoff, especially Jackendoff 1975, furnishes important precedents for
the ideas developed here.
Morpheme-based descriptions tend, however, to pass over just the kinds of case
that pose problems, providing the illusion that these have somehow
been solved. In fact, though, such cases form a solid core set in which a
morphological relation between two forms is not signalled by the addition of
new material, as the morphemic view would require.
One large group of examples of this sort can be collected as instances
of apophony, or relation by replacement. English pairs such as sell/sale,
sing/song, blood/bleed, food/feed, etc., as well as the familiar examples of
man/men, woman/women, mouse/mice, etc., indicate relations such as those
between a verb and a corresponding noun or between singular and plural by
changing the main stressed vowel, not by adding an affix of some sort. In some
languages, such as the Semitic languages Arabic and Hebrew, replacive oper-
ations of this sort (e.g., Modern Hebrew semel “symbol,” simel “symbolize”;
xašav “think,” xišev “calculate,” etc.) can be treated as the association of a basic
set of consonants with one word pattern or another (see McCarthy 1981), but
this analysis has no appeal for English examples like those above.
Other apophonic relations involve consonants, rather than vowels, and
are sometimes called instances of mutation (as opposed to “Ablaut” or
“Umlaut” where vowel changes are concerned). Consider English pairs
such as believe/belief, prove/proof, speak/speech, bath/bathe, breath/breathe,
glass/glaze “provide with glass.” Again, the relations involved are signalled (in
the modern language, at least) not by the addition of some marker, but rather
by a change from one consonant to another.
Some languages (though not English) go so far as to indicate a class of derived
words not by adding material to a base, but rather by subtraction. An interesting
class of nouns is derived from verbs in Icelandic by deleting the final -a that
marks the infinitive: thus, hamr [hamr] “hammering” from hamra [hamra] “to
hammer”; pukr [pü:kr] “concealment” from pukra [pü:kra] “make a secret of.”
One might think that the infinitive is here derived from the (shorter) noun form
by adding the ending -a, but we can see that that is not the case from the vowel
length in forms like pukr [pü:kr]. The fact that this vowel is long appears to
contradict the rules for vowel length in Icelandic; but it makes sense in terms
of the form of the untruncated infinitive ([pü:kra]). Forms derived from this
infinitive preserve its vowel length, but this explanation requires us to assume
that pukr is derived from pukra – by deletion of the final -a.4
4 See Orešnik and Pétursson 1977, Kiparsky 1984 for details and discussion.
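
As a derivation, the Icelandic pattern is literally subtractive, something no
inventory of purely additive morphemes can express directly. A toy rendering
(transcriptions simplified, details of vowel length left to comments):

    def deverbal_noun(infinitive):
        # derive the noun by deleting the infinitival -a; vowel length
        # (as in [pü:kr]) is simply carried over from the longer base form
        assert infinitive.endswith("a")
        return infinitive[:-1]

    print(deverbal_noun("hamra"))   # 'hamr'  "hammering"
    print(deverbal_noun("pukra"))   # 'pukr'  "concealment"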
(7.5) a. During their captivity, Kim and Sandy were tortured (by
thoughts of home and family).
b. When called on in class, Terry always looks tortured (by
insecurity).
When no explicit agent is given (for instance, in a by phrase), a sentence like
(7.5a) still carries the implication that there was some cause for the suffering,
even if only mental; and this cause can be made explicit. Even in the adjectival
form (7.5b), the source of the suffering, though not as strongly implied as in the
verbal form, can be made explicit. In Icelandic, however, there is no possibility
of adding an af phrase (the equivalent of an English by phrase, used with true
passives to indicate an agent), and there is no necessary implication that the
suffering has a determinate cause. Any such source can only be mentioned in
the form of an adverbial adjunct phrase.
(7.6) a. *Jón kveljast af tannpı́nu
          John is-tortured by toothache
          "John is tortured by toothache"
      b. Jón kveljast (ı́ tannpı́nu)
          John suffers from toothache
          "John suffers (from toothache)"
English passives thus have a place for reference to a causer – a place that
may be represented explicitly (with a by phrase), or implicitly, through the in-
troduction of a non-specific or indefinite semantic operator. Icelandic -st verbs,
in contrast, do not involve such reference (even though ordinary passive forms
in Icelandic do, like English). The right semantic analysis of the relation be-
tween the basic and the -st verbs in (7.3) must therefore involve not the binding
(by an abstract or impersonal operator) of an agentive argument, but rather the
omission of the entire layer of semantic structure associated with agentivity.
Semantically, the verbs that serve as bases for this formation have the schematic
form in (7.7).

(7.7) [CAUSE (x, [BECOME (y, STATE)])]
That is, “(subject) causes (object) to become tired, miserable, started, open,
etc.” The addition of the ending -st has the effect of deleting the highest predi-
cate (cause x,y) from this structure. This means the corresponding argument
disappears from the syntax, so that a basic transitive verb becomes intransitive;
and also that the role played by this argument is no longer present in the se-
mantics of the derived verb. In modern Icelandic, “suffering” is not a matter of
being tortured by someone/something, even though such a meaning may well
provide the etymology of the word.
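
The operation can be pictured as structure removal. In the toy rendering below,
nested tuples of our own devising stand in for the logical conceptual structure
of (7.7); adding -st to the form goes with deleting the topmost semantic layer:

    # schematic meaning of a causative base verb, cf. (7.7)
    base = ("CAUSE", ("x", "y"), ("BECOME", "y", "MISERABLE"))

    def add_st(sem):
        # phonologically additive, semantically subtractive: strip the
        # causative layer, so the causer argument simply disappears
        operator, _args, result = sem
        assert operator == "CAUSE"
        return result

    print(add_st(base))   # ('BECOME', 'y', 'MISERABLE'): "suffer", not "be tortured"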
What does this mean? Simply that there is no way to describe the verbs in
(7.3) as being related by the addition of a “morpheme” -st, because there is no
way to characterize this morpheme’s contribution to meaning as the addition
of something to the meaning of the base verb. The relation between the two
columns of (7.3) is relatively straightforward: speakers know that when a given
verb involves causation of this sort, a related verb ending with -st may describe
the same resulting state, but without reference to its causation. This I-language
characterization, however, has no clear correspondent in terms of the E-language
conception of words as built by combining morphemes.
This example is actually typical of a large class of cases. In many languages,
we find some morphological element that converts transitive verbs to intran-
sitives, often with no remaining trace (in the form of “understood” material)
of the missing argument(s). The subtractive nature of this operation is usually
concealed by grammar writers, who give the element in question a superficially
additive interpretation “detrans” or the like. But we must not lose sight of the
fact that the actual content of “detrans” may be precisely the suppression of
some part of the base verb’s content – an operation just like that of phonologi-
cal subtraction (truncation), when Icelandic pukra “make a secret of ” becomes
pukr “concealment.”
In other cases, the subtractive operation in question alters the syntactic infor-
mation in the lexical items affected.
English has a large array of nominals formed with the ending -er, many of
which are derived from related verbs. An interesting point about these concerns
the extent to which the nominal does or does not appear associated with the same
arguments as the basic verb. Hovav and Levin 1992 distinguish two classes of
-er nominals: some (typically with an agentive interpretation) inherit the argu-
ment structure of the base verb, while others (typically with an instrumental
interpretation) do not. Consider, for example, two senses of the noun wrapper
in relation to the syntax of the base verb wrap, as illustrated in (7.8). Here the
difference in prepositions reflects a difference in argument structures, a vari-
ation that is associated with the difference between agentive and instrumental
readings of the nominal.
(7.8) a. The best job for Fred would take advantage of his experi-
ence as a wrapper {of/*for} presents in fancy gold paper at
Tiffany’s.
b. The best use I can think of for The New York Times is as a
wrapper {for/*of} fish that didn’t keep well overnight.
Note that, in association with nominals, of marks complement arguments (as
in (7.8a) – see similar examples in chapter 3), while various other prepositions
(like for in (7.8b)) mark adjuncts.
Hovav and Levin draw an analogy with two types of derived nominals, dis-
tinguished by Grimshaw (1990) among others:
(7.9) a. The examination (of/*for the graduating seniors) lasted three
hours.
b. The examination/exam (for/*of prospective students) was
three pages long.
“Event”-derived nominals such as (7.9a) refer to events, and they can take
complements, as the possibility of (7.9a) with of shows. Non-event-derived
nominals such as (7.9b) refer to objects, results of actions, etc. rather than
events, and these do not take complements, as the impossibility of (7.9b) with
of rather than (the adjunct-marking) for shows.
Hovav and Levin propose to unify the differences found in (7.8) and (7.9)
in the following way. They suggest that the syntax of basic verbs like wrap,
examine includes a reference to an event (of wrapping, examining).5 Both in
the -er cases (7.8b) and the others (7.9b), the derived word involves a change
in the “event” position, as a correlate of the derivational relations illustrated
5 We do not attempt to justify this analysis here, but refer the reader to Hovav and Levin’s work
for the motivation and details.
by (7.8b) and (7.9b). The relations illustrated by (7.8a) and (7.9a), in contrast,
involve the same formal marker, but quite different semantics; and here the argu-
ment structure of the basic verb, including its “event” argument, are preserved
in the derived noun.
If this account is correct in its essence, it provides an example of “subtrac-
tive” morphosyntax in association with “additive” phonological marking: the
formation of instrumental and non-event-derived nominals involves the dele-
tion of syntactic argument structure with respect to the properties of the base
verb. Again, this makes sense if we think of the derivation of one lexical item
from another as based on a relation which is part of the I-language that
develops as the language organ of speakers of (in this case) English. It is much
harder to understand if we attempt to represent this part of the language by an
E-language-based inventory of morphemes.
6 There is an extensive literature on the structure of “ergative” languages, and a major point of
contention is the nature of grammatical relations within basic clauses in such a language. Since
the resolution of those issues does not concern us here, our use of “subject” and “object” in
referring to Warlpiri can be taken as presystematically referring to the arguments that would
correspond to the subject and object of a literal translation of the sentence into English.
7 See Anderson 1988 among much other literature for some discussion.
(7.10). The English expressions that gloss the examples in (7.11) – dig yams,
dig the ground vs. dig for yams, dig at the ground, etc. – illustrate the same kind
of contrast.
It happens that the (quite systematic, familiar and productive) relation be-
tween these two patterns is not marked in Warlpiri by an overt verbal affix, but
in various other languages (including one to which we will turn shortly, West
Greenlandic) entirely comparable relations are overtly marked. We assume that
the relation between these two patterns in Warlpiri, as in languages where there
is an overt affix, is appropriately established in the lexicon, since there are two
distinct morphosyntactic behaviors correlated with a difference in meaning.
There have been a number of proposals in the literature as to how to express
the difference between the members of such pairs. Mary Laughren (1988) makes
an interesting suggestion about the Warlpiri case. She suggests that the affected
verbs have a semantic interpretation involving more than one layer of structure,
and that the morphosyntactic relation is correlated with a reorganization of
those layers.
(7.12) a. i.  [V [erg][abs]], "I got yams by I dig"
          ii. [V [erg][dat]], "I dug in order to I get yams"
       b. i.  [V [erg][abs]], "I broke up earth by I dig"
          ii. [V [erg][double-dat]], "I dug in order to I break
              up earth"
If Laughren’s account is more or less on target, we have an example of what
we might call “semantic metathesis”8 as the concomitant of a lexical relation.
That is, the relation between two lexical variants of the same verb involves no
phonological change, but only the morphosyntactic replacement of one case
frame by another, and the reorganization of subordination relations within the
logical conceptual structure that represents the verb’s semantics.
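
On Laughren's proposal, then, the lexical relation amounts to swapping which
subevent is main and which subordinate, with a corresponding change of case
frame. A schematic rendering, in a nested-tuple notation of our own:

    def metathesize(lcs):
        # "A by B" becomes "B in-order-to A": the subordinate event is promoted
        operator, main, subordinate = lcs
        assert operator == "BY"
        return ("IN-ORDER-TO", subordinate, main)

    conative = ("BY", ("GET", "yams"), ("DIG",))   # the [erg][abs] frame, (7.12a.i)
    print(metathesize(conative))                   # the [erg][dat] frame, (7.12a.ii)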
A somewhat different take on a very similar construction is provided in an
interesting paper by Maria Bittner (1987). In West Greenlandic (an “Eskimo”
language), the “object-demoting” or anti-passive construction is overtly
marked – there are at least five different suffixes, in fact, that have this ef-
fect. Bittner argues, contrary to previous accounts, that these affixes are in fact
independent. But what interests us is the change in the morphosyntax of case
marking that accompanies each of the affixes.
We illustrate first with some simple cases, to show the form of the alternation.
The first member of each pair in (7.13) is formally transitive, while the second
is intransitive and “anti-passive” (AP) with a complement in the instrumental
case (inst).
8 metathesis is a change whereby two elements switch position, without otherwise changing their
form. An example is provided by English dialects in which the verb ask is pronounced [æks].
Much the same can be said, again, for the formal patterns of Semitic mor-
phology. Arabic verbs in the C₁aC₂C₂aC₃ pattern, for example, include both
intensives (e.g. qattal “massacre”; cf. qatal “kill”), and causatives (e.g. ʕallam
“teach”; cf. ʕalim “know”). While the pattern presents a single category of
word form, that category is associated with quite different content in different
instances.
These facts suggest that within the lexicon of a language, we need to recognize
a collection of formal classes (like -er nouns), and a collection of content-based
classes (like “derived nominals”), independently of one another. Typically a
given form-based class will be largely coextensive with a particular content-
based class (or at least with a small number of these), and vice versa; but since
the two have independent bases, there is no need for the structure of the two
domains to map onto one another in any simple way.
Let us take a look at one such content-based class, to see how it is related
to formal structure. English contains a great many “agent nouns” derived from
verbs, such as [N [V bake]-r], [N [V preach]-er], etc. Some other agent nouns
appear to end in -er, but not an -er which has been added to a more basic verb:
rather, if there is a related verb, it is homophonous with the noun, and should
probably be regarded as derived from it, as in the cases of [V [N butcher]], [V
[N huckster]], and (perhaps) [V [N minister]]. In yet other cases, the noun ends
(phonetically, at least) in the same way as other agent nouns, but there is no
related verb at all: [N carpenter], [N soldier], [N janitor], [N bursar]. We also have
agent nouns that apparently display the same ending, but where the related word
that serves as the base is another noun, rather than a verb: messenger, adulterer,
lawyer, astronomer, furrier, clothier, hatter, etc.
The sources of agent nouns in -er can thus be quite diverse, but that is not at
all the end of the story. Other nouns seem to be just as validly “agent nouns”
in terms of their content, but do not display -er at all: poet, musician, artist,
linguist. Many such nouns are related to verbs in the language, but where such
a non-er agent noun exists, the expected regular formation with -er added to
the verb is not well-formed: cf. cook (*cooker in the sense “one who cooks”:
cf. Kiparsky 1982), judge (*judger), student (*studier), representative (*repre-
senter), correspondent (*corresponder).
Notice that these regular forms are not blocked per se as possible words: in
fact, cooker exists, at least in British English, but as an instrumental (meaning
“oven”), rather than an agentive. The regularity is rather that once a verb is
associated with one agent noun in the lexicon, we cannot create another syn-
onymous agent noun for that verb by applying the regular process. This is quite
parallel to the fact that a single verb has a single associated action nominal, as
we observed above, though this may come from any one of a wide range of
formal types. The same form can be employed for two different functions – e.g.
wrapper, which can be either an agent or an instrument – but not vice versa.
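
Blocking of this sort is easy to model if word formation consults the lexicon
before applying: the regular -er agentive is available only when no synonymous
agent noun is already listed. A sketch (the listed forms are ours, for
illustration only):

    listed_agent_nouns = {"cook": "cook", "judge": "judge", "study": "student"}

    def agent_noun(verb):
        # the regular process applies only where no listed synonym blocks it
        return listed_agent_nouns.get(verb, verb + "er")

    print(agent_noun("preach"))   # 'preacher': regular formation
    print(agent_noun("cook"))     # 'cook': blocks *'cooker' as an agentive
                                  # (British 'cooker' survives only as "oven")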
11 In terms of the number of distinctions they show among consonants, the West Circassian lan-
guages are among the most elaborate in the world. The reader for whom the transcriptions
below seem like gibberish should not be concerned, since these details of pronunciation are not
material to the point at issue.
12 We ignore here a separate “gender” which applies only to one or two locative words.
In Maasai, the relation between the feminine and masculine forms of a noun can
go either way, in the sense that either the masculine form or the feminine might
be taken as basic, depending on the example.
The simplest cases are those in which the relation is perfectly symmetric;
and those in which only one gender is possible.
(7.17) a. εnk-apυtánı̀ “wife’s mother”; ɔl-apυtánı̀ “wife’s father”
b. ε-mɔ́dáı́ “female fool”; ɔl-módáı́ “male fool”
c. εnk-áı́ “God”; *ɔlk-áı́
d. *ε-mεná; ɔl-mεná “contempt”
In other cases, however, it appears that the feminine is basic, and the mascu-
line derived.
(7.18) a. en-kı́né “goat; female goat”; ol-kı́né “male goat”
b. εn-kέráı́ “child (either gender)”; ɔl-kέráı́ “large male child”
c. εnk-anáshὲ “sister”; ɔlk-anáshὲ “very large sister (pejorative)”
d. en-tı́t!o “girl”; ol-tı́t!o “large, shapeless hulk of a woman
(pejorative)”
e. en-kitók “woman”; ol-kitók “very respected man”
In a number of other cases, there is a relation of relatively small/relatively
large between the two forms, and in these instances the feminine seems sec-
ondary with respect to the masculine.
(7.19) a. εn-dóı́nyó “hill”; ol-dóı́nyó “mountain”
b. εnk-álέm “knife”; ɔl-álέm “sword”
c. εnk-aláshὲ “weak brother (pejorative)”; ɔl-aláshὲ “brother”
d. εnk-abáánı̀ “female or small doctor, quack”; ɔl-abáánı̀
“(male) doctor, healer”
e. εn-dεkέt “ineffectual curse”; ɔl-dεkέt “curse”
f. ε-lέε “man (pejorative)”; ɔ-lέε “man”
It seems that in this instance, we should recognize a basically symmetric
relation between the two genders:
(7.20)  feminine                       masculine
        (female, relatively small) ⇔  (male, relatively large)
        /e(n)-/                        /o(l)-/
In some cases, basic items have the same status (but different sense) in either
class. In other cases, the lexical item has a “basic” gender, and a shift in either
direction may imply a pejorative value. There is still no sense in which one of
the genders is in general derived from the other, however.
The theme being developed here should be familiar by now. E-language-
based views have assumed that the lexicon of a language can be characterized
by a set of items (signs) whose properties of sound, meaning, and morphosyntax
are simply listed, item by item.
7.3 Productivity
Let us summarize what seems to be true about our knowledge of words from
what we have seen thus far. First, it is quite impossible to maintain that that
knowledge takes the form of a list of full words we have learned individually.
In fact, we could not possibly have just memorized a list of all the complex
words – certainly not in languages like West Greenlandic or Siberian Yupik,
where much of the expressive power of the language is built into the system
that forms new words, so the creative aspect of language is not limited to the
syntax. Indeed, this is also the case for German or English.
A list would not represent our knowledge for several reasons. First, we can
clearly make new compounds (e.g., PlayStation, ThinkPad, earbud [part of
a hands-free cell phone]) and derived words (prefrobnicate, overlookable),
and at least the first time we use them, such forms would not be on the list.
Secondly, in languages with extensive inflectional paradigms (such as Finnish,
Hungarian, and many others), every lexical item may have thousands of inflected
forms, many of which a given speaker might never have encountered for a
particular word, but which can all be produced and recognized when required.
We recognize new inflected forms of words, and may not even know whether we
have ever seen them before. We can even provide such forms for nonsense words
(as in Jabberwocky). Most generally, such a list does not take advantage of or
express our knowledge of systematic regularities that may be more complicated
than what can be expressed merely as the addition of a “morpheme”; and there
are a great many such regular relations to take advantage of among the words
we know. All of this suggests that our knowledge of words is better represented
by something more like a system of rules than (just) a list. Of course, we have
to have a list of the irreducible bits – the fact that [kʰæt] is a noun that means
“cat,” for instance – but we also have to have a system of rules or the like to
represent our knowledge that expresses – and indeed goes well beyond – the
ways in which these bits are combined.
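This division of labor, a list of irreducible bits plus a system of rules, can be given concrete shape. The sketch below is our own illustration (the entries and the single -able rule are simplified stand-ins for a much richer system):

```python
# A minimal sketch (ours): lexical knowledge as stored "irreducible bits"
# plus rules that generate forms which need not be listed in advance.
LEXICON = {
    "cat":      {"category": "N", "gloss": "cat"},   # the stored fact about [kʰæt]
    "overlook": {"category": "V", "gloss": "overlook"},
}

def able(verb: str) -> dict:
    """Fully productive rule: V -> A in -able, 'such that it can be V-ed'."""
    entry = LEXICON[verb]
    assert entry["category"] == "V"
    return {"form": verb + "able", "category": "A",
            "gloss": f"such that it can be {entry['gloss']}ed"}

# New derived words fall out of the rule system rather than a memorized list:
print(able("overlook"))   # {'form': 'overlookable', 'category': 'A', ...}
```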
We can clearly see that our knowledge of lexical relations can be more or
less exhaustive. This is known in the literature on morphology as the issue of
“productivity”: the formation of new adjectives in -able from verbs is essentially
completely productive, but the relation between spit and spittle is essentially
limited to this pair (and perhaps prick/prickle) in modern English, since there
is no reason to assume any relation at all between such superficially similar
pairs as cod and coddle. Many other connections fall somewhere in between,
applying in a number of instances but not really providing a warrant for novel
creations.
At least a lot of what goes by the name of productivity in the study of
morphology is probably a reflection of the extent to which what we know
about lexical relations determines the properties of new instances. That is, the
properties of a class may sometimes be sufficient to allow us to predict all of
the characteristics of a potential new member, but in other instances the internal
diversity of the class might leave much underdetermined.
A case where our knowledge is clearly only partial, and where the relation
in question is only marginally productive, is that of English adjectives in a-:
e.g., ablaze, abroad, aground, afoot, afield, ajar, alight, asleep, askew, astride,
aspread, awash, aghast, etc. These are derived from verbs (e.g., ablaze), nouns
(e.g., afield), or other adjectives (abroad). Some do not correspond to more
“basic” stems from any class: ajar, aghast (but ghastly?). The class of these
adjectives displays a sort of family resemblance semantically, but no consistent
semantics: Marchand (1969) suggests that they mean “in a state or position
of . . . ,” but that is not interestingly distinct from what we could say about the
class of adjectives as a whole. And of course the conclusion we can draw is that
our inability to make up new adjectives of this class – the lack of productivity of
“a- adjectives” – follows from the fact that the existing relations do not provide
enough limitations to determine what such a word would mean: if we were to
coin the word awalk, we would not know what use to put it to.
Much derivational morphology is lexically isolated in this way. Aronoff
(1976) suggests that productivity might be reduced, when properly under-
stood, to the transparency of a derivational relation. Relations are more or less
productive depending on how completely they determine the properties of a
potential word as a function of those of an existing base, an account which
clearly rests on a conception of word structure as an elaborate and articu-
lated kind of knowledge, rather than just a set of basic elements together with
rules for combining them. Surely some account of the differences among word
formation patterns in terms of their productivity is necessary as part of a de-
scription of what a speaker knows about a language, and it seems inevitable
that any such account will need to be based on the I-language notion of word
structure relations, rather than on the E-language notion of an inventory of
signs.
7.4 Conclusions about lexical organization
The move from a conception of the lexicon as an inventory of items to a
more complex relational one goes along with a shift in the nature of linguistics
from a focus on items and the properties of a collection of linguistic objects to
a focus on language as a cognitive system, and a kind of knowledge.
We are really only at the beginning of an understanding of how the language
organ accommodates and organizes knowledge of words and their relations
with one another, and of how principles of UG both limit and enable the kinds
of things that can occur in particular languages. We have not even begun to
address issues such as those raised in chapter 5, concerning whether a system
of rewriting rules is the most appropriate way to formulate a description of the
systematicities of I-language in this domain.
It seems self-evident that there is much more variation among languages in
their lexicons than elsewhere in grammar, but perhaps, once the contribution of
UG to lexical structure is properly understood, that will turn out to be illusory,
just as it has turned out that languages are much more similar in their syntactic
organization than was once thought. Any approach to such problems, however,
must start from a clearer picture of just what lexical structure involves than
is provided if we concentrate simply on the identification of a collection of
irreducible basic signs: a “lexicon” in the sense of a “dictionary.”
8 Language change
One line of work on global, long-term change invokes the notion of
“drift,” originally due to Sapir (1921, ch. 7). Sapir dealt with long-term change
by postulating drifts. A drift represented the unconscious selection of those
individual variations that are cumulative in some direction. So he attributed the
replacement of English whom by who to three drifts:
a. the leveling of the subject/object distinction;
b. the tendency to fixed word order;
c. the development of the invariant word.
Sapir was concerned that in positing a “canalizing” of such “forces” one
might be imputing a certain mystical quality to this history. Certainly the modern
work confirms that fear. Robin Lakoff (1972), for example, examined changes
in various Indo-European languages which yield a more “analytic” surface
syntax, and she sought to combine Sapir’s three drifts into one. The phenomenon
cannot be described, she pointed out, by talking about individual changes in
transformational rules or other aspects of a grammar.
Bauer (1995) pursues a similar idea for the development from left-branching
Latin to French, which has its head to the left (il conduisit l’armée “he led the army,” le don
des dieux “the gift of the gods”). She explains the change through “an evolu-
tionary concept of language change: . . . languages evolve in the direction of
features that are acquired early” (Bauer 1995, p. 170). She says that “Latin must
have been a difficult language to master, and one understands why this type of
language represents a temporary stage in linguistic development” (Bauer 1995,
p. 188), but she gives no reasons to believe this and she gives no reason why
early languages should have exhibited structures which are hard to acquire.1
If a diachronic change is “adaptive,” one needs to show how the environment
has changed in such a way that the new phenomenon is adaptive in a way that
it wasn’t before. However, proponents of this kind of evolutionary explana-
tion do not do this; instead, they set up universal “tendencies” by which any
change is “adaptive,” such as a tendency for left-branching languages to become
right-branching, and, like the typologists, they postulate inexorable, historical
tendencies as explanatory forces.
Another line of work, again focusing on how languages change in some
global fashion, has similarly emphasized the alleged unidirectionality of change.
Accounts of “Grammaticalization” also treat languages as external objects “out
there,” subject to change in certain inevitable ways. Grammaticalization, a
notion first introduced by Antoine Meillet in the 1930s, is taken to be a semantic
tendency for an item with a full lexical meaning to be bleached over time and
to come to be used to mark a grammatical function. Such changes are said to
be quite general and unidirectional; one does not find changes proceeding in
the reverse direction, so it is said.
We shall discuss an instance of grammaticalization in section 8.3 and there are
many examples that have been described this way in the literature (for a survey,
see Hopper and Traugott 1993). One which is often cited concerns negative
markers in French. In Old French, the negative particle ne was often reinforced
with an appropriate noun. With motion verbs, ne was reinforced with the noun
pas “step.” Over time, pas began to occur even where there was no motion, and
eventually some reinforcing noun became effectively obligatory. As a result,
the reinforcing nouns like pas (others were point “point,” mie “crumb,” gote
“drop”) underwent grammaticalization from noun to negative particle.
Grammaticalization is a real phenomenon, but it is quite a different matter to
claim that it is general and unidirectional, or an explanatory force. If there were a
universal tendency to grammaticalize, there would be no counter-developments,
by which bound forms become independent lexical items (affixes becoming clit-
ics or independent words – we mention an example of this later in this chapter,
1 The same logic, another throwback to nineteenth-century thinking, shows up in the evolutionary
explanations of Haspelmath 1999b; see the commentary on this paper by Dresher and Idsardi
1999.
8.2 Grammars and time
2 Janda 2001 offers many references. He also has good critical discussion of how fundamental the
issue of unidirectionality is for grammaticalizationists and how cavalier some of them have been
in dismissing changes which appear to run counter to their predispositions. Imperious blanket
denials that such changes occur, as in the writings of Haspelmath (1999a, 1999c), do not remove
them from history. Newmeyer (1998, ch. 5) provides an excellent general discussion of grammat-
icalization and examines studies which use reconstructions as evidence for “grammaticalization
theory,” despite the fact that it was assumed in the very reconstruction.
3 Nor is it appropriate to explain the change by invoking some principle of UG which favors the
new grammar ( pace Roberts 1993; see Lightfoot 1999, section 8.3 for discussion).
Rather than appealing to properties of the linguistic system itself, we explain grammatical change through the nature of the acquisi-
tion process, as we indicated in chapter 2. A grammar grows in a child from some
initial state (Universal Grammar, or UG), when she is exposed to primary ling-
uistic data (PLD) (schematically, as in (8.1)). So the only way a different gram-
mar may grow in a different child is when that child is exposed to significantly
different primary data.
e. Il le peut.
he it can
He can (do) it [e.g., understand the chapter].
Furthermore, not only may languages differ in this regard, but also different
stages of one language. Sentences along the lines of the non-existent utterances
of (8.2) were well-formed in earlier English. If the differences between Old and
modern English were a function of separate features with no unifying factor
(Ross 1969), we would expect these features to come into the language at dif-
ferent times and in different ways. On the other hand, if the differences between
Old and modern English reflect a single property, a categorical distinction, then
we would expect the trajectory of the change to be very different. And that is
what we find. If the differences between can and understand were a function
of the single fact that understand is a verb while can is a member of a different
category, inflection (I), then we are not surprised to find that (8.2ci), (8.2di),
(8.2ei), (8.2fi), and (8.2gi) dropped out of people’s language in parallel, at the
same time.
In Middle English Kim can understand the chapter had the structure (8.4a)
and in present-day English (8.4b). If in present-day English can is an I element,
as in (8.4b), then one predicts that it cannot occur to the left of a perfective or
present participle, as in (8.2ci), (8.2di) (those participial markers are generated
in Spec VP), that it is mutually exclusive with the infinitival marker to (which
also occurs in I) (8.2eii), that there may only be one modal per VP (8.2fi),
and that a modal may not be followed by a complement DP (8.2gi). Simply
postulating the structure of (8.4b) accounts for the data of (8.2c–g) in present-
day English. Earlier English had structures like (8.4a), where can is a verb and
behaves like understand.
(8.4) a. Middle English: [IP Kim [I [VP can [VP understand the chapter]]]]
b. Present-day English: [IP Kim [I can [VP understand the chapter]]]
In earlier grammars can was a verb which could move to I, but later grammars had structures like (8.4b), where can is an I item, drawn
from the lexicon and merged into a structure as an instance of I. As a result,
sentences like (8.2ci–gi) dropped out of the language and no longer occurred
in texts.
The second stage was that the grammars of English speakers lost the operation
moving verbs to a higher I position (e.g., in 8.4a). This change was completed
only in the eighteenth century, later than is generally supposed (Warner 1997).
At this point, sentences with a finite verb moved to some initial position (8.2aii)
or to the left of a negative (8.2bii) became obsolete and were replaced by equiv-
alent forms with the periphrastic do: Does Kim understand this chapter? Kim
does not understand this chapter, etc. Also sentences with an adverb between
the finite verb and its complement became obsolete: Kim reads always the
newspapers. This change has been discussed extensively and Lightfoot (1999,
section 6.3) argues that it was caused by prior changes in PLD, most notably the
recategorization of the modal verbs just discussed and the rise of periphrastic do
forms (above). These changes had the effect of greatly reducing the availability
of the relevant cue, [I V], i.e. a verb occurring in an I position.
The two changes are, presumably, related in ways that we do not entirely un-
derstand: first, the Inflection position was appropriated by a subclass of verbs,
the modal auxiliaries and do, and the V-to-I operation no longer applied gener-
ally to all tensed clauses. Somewhat later, the V-to-I movement operation was
lost for all verbs other than the exceptional be and have (see below) and I was
no longer a position to which verbs might move. We pass over the details of
this change here, in order to discuss something else.
An intriguing paper by Anthony Warner (1995) shows that there is a third
stage to the history of English auxiliaries, involving changes taking place quite
recently affecting the copula be, and this turns out to be of current theoretical
interest. It has often been observed that VP ellipsis is generally insensitive to
morphology. So one finds ellipses where the understood form of the missing
verb differs from the form of the antecedent (8.6).
(8.6) a. Kim slept well, and Jim will [sc. sleep well] too.
b. Kim seems well behaved today, and she often has [sc. seemed
well behaved] in the past, too.
c. Although Kim went to the store, Jim didn’t [sc. go to the
store].
There is a kind of sloppy identity at work here. One way of thinking of this is
that in (8.6a) slept is analyzed as [past+V sleep] and the understood verb of the
second conjunct accesses the verb sleep, ignoring the tense element. However,
Warner noticed that the verb be works differently. Be may occur in elliptical
constructions, but only under conditions of strict identity with the antecedent
form (8.7). In (8.7a,b) the understood form is identical to the antecedent, but
not in the non-occurring (8.7c,d,e).
(8.7) a. Kim will be here, and Jim will [sc. be here] too.
b. Kim has been here, and Jim has [sc. been here] too.
c. *Kim was here, and Jim will [sc. be here] too.
d. *If Kim is well behaved today, then Jim probably will [sc. be
well behaved] too.
e. *Kim was here yesterday, and Jim has [sc. been here] today.
This suggests that was is not analyzed as [past + V be], analogously to slept,
and be may be used as an understood form only where there is precisely a be
available as an antecedent; not was or is, but just be, as in (8.7a). Similarly for
been; compare (8.7b) and (8.7e). And similarly for am, is, are, was, were.
Warner goes on to note that the ellipsis facts of modern English were not
always so and one finds forms like (8.7c,d,e) in earlier times. Jane Austen was
one of the last writers to use such forms and she used them in her letters and in
speech in her novels, but she did not use them in narrative prose (8.8a,b). These
forms also occur in the work of eighteenth century writers (8.8c), and earlier,
when verbs still moved to I (8.8d).
(8.8) a. I wish our opinions were the same. But in time they will [sc. be
the same]. (1816 Jane Austen, Emma, ed. by R.W. Chapman,
London: OUP, 1933. 471)
b. And Lady Middleton, is she angry? I cannot suppose it pos-
sible that she should [sc. be angry]. (1811 Jane Austen, Sense
and Sensibility, ed. by C. Lamont, London: OUP, 1970. 237)
c. I think, added he, all the Charges attending it, and the Trouble
you had, were defray’d by my Attorney: I ordered that
they should [sc. be defrayed]. (1740–1 Samuel Richardson,
Pamela, London. 3rd edition 1741. Vol. II, 129)
d. That bettre loved is noon, ne never schal. (c. 1370 Chaucer,
A Complaint to his Lady, 80. “So that no one is better loved,
or ever shall [sc. be].”)
These forms may be understood if were in (8.8a) was analyzed as subjunctive+be
and the be was accessed by the understood be. In other words, up until the
early nineteenth century, the finite forms of be were decomposable, just like
ordinary verbs in present-day English. This is what the ellipsis facts suggest.
Warner then points to other differences between present-day English and
the English of the early nineteenth century. Present-day English shows quite
idiosyncratic restrictions on particular forms of the verb be, which did not exist
before the late eighteenth century. For example, only the finite forms of be may
be followed by to+infinitive with a modal sense of obligation (8.9a); only been
may occur with a directional preposition phrase, effectively meaning “gone”
(8.9b); and being, unlike any other form of be, has a special restriction that it
does not permit an ing complement (8.9c).
As Warner puts it, the relation among these forms became different not only from regular verbs loved:love, etc. but also from that of irregular or
suppletive verbs (slew:slay, went:go), which are in some sense essentially compositional,
as the contrast of behavior in ellipsis shows. (Warner 1995, p. 538)
Because slept is morphologically complex, the verb sleep within it can serve as antecedent for ellipsis in (8.6a) Kim slept well, and Jim will [sc. sleep well]
too. This reveals how elements are stored in the mental lexicon: is is stored
in just that form while slept is stored as V sleep with the form slept created
morphologically by the attachment of an affix. If all verbs were treated the same
way, as in Chomsky 1995, there would be no obvious way to make the distinction
between those which may be antecedents for ellipsis under conditions of sloppy
identity (sleep, etc.), and those which may not (is, are, and other forms of be).
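The storage distinction at issue can be made explicit. The following sketch is ours, under the assumption, drawn from the discussion above rather than from Warner’s own notation, that decomposed forms expose a stem for sloppy-identity ellipsis while atomically stored forms do not:

```python
# Sketch under our own assumptions (not Warner's notation): decomposed forms
# expose a stem for "sloppy" ellipsis; atomically stored forms do not.
DECOMPOSED = {"slept": ("sleep", "past")}    # slept = [past + V sleep]
ATOMIC = {"be", "been", "is", "are", "was", "were", "am"}

def antecedent_stem(form: str) -> str:
    """What an elliptical VP may access as its antecedent."""
    if form in DECOMPOSED:
        stem, _tense = DECOMPOSED[form]
        return stem                # tense is ignored: sloppy identity
    return form                    # atomic storage: strict identity only

# "Kim slept well, and Jim will [sc. sleep well]": antecedent sleep available.
assert antecedent_stem("slept") == "sleep"
# "*Kim was here, and Jim will [sc. be here]": was yields was, never be.
assert antecedent_stem("was") != "be"
```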
Lasnik (1999, ch. 5) drew a similar distinction between “affixal” and
“featural” verbs and keyed the distinction to whether the verb moves: if a
verb moves in the syntax (e.g., be forms and all finite verbs in French), then it
already has its inflectional features attached when it is merged into the syntactic
structure and is “featural,” but if a verb does not move to a higher inflectional
position, then it is “affixal” and has affixes lowered onto it in the syntax. How-
ever, this correlation is not general and there is more to the story than this.
Modal elements are featural and are generated in I, not moving there. Finite
be, on the other hand, clearly moves to I, because be may also occur in other,
non-finite positions if I is filled with a modal (8.11).
(8.11) Kim might still be reading that chapter.
So forms of be (and have) move to I; they are and always have been featural.
They have always moved to I at all stages of their history but it was only in the
late eighteenth century that they came to be stored atomically and developed the
odd properties discussed here. We conclude that if a verb is featural, it moves
to I. However, a featural item may be base-generated in I (modern modals) and
may or may not be stored atomically: was is a verb and it is stored atomically
in modern grammars.
What is important about this story is that, while the changes we have discussed
only involve the verb be, they have the hallmarks of grammatical change. There
are several surface changes, all involving be, which can be attributed to one
analytical notion. The changes reflect quite general properties of the grammar.
One can identify the structural property which is relevant and we can tell a
plausible and rather elegant story about why and how the grammatical change
might have come about. We distinguish how items are stored in the lexicon.
We see, again, that morphology has syntactic effects. It is particularly impor-
tant in defining category membership; children assign items to categories on the
basis of their morphology. We have explained the third change by pointing to
changes in the trigger experience which led to the new morphological structure
of be forms. Those changes in the trigger are a function of prior grammatical
shifts, relating to the change in category membership of the modal auxiliaries
and the loss of V-to-I movement; there are links among the three changes and
we have another domino effect. Again we have local causes and we do not need
to appeal to internal motivating factors.
The problem with postulating general historical tendencies is that they are too “Cyclopean”
(to adopt a useful term from Calvert Watkins’ (1976) critique of typological
analyses) and too gross to be enlightening, and they predict that languages
should undergo parallel historical changes.
If changes in category membership are relatively common (whatever that
means), they still need local causes. Identifying local causes enables us to
understand the details of the change, as we have illustrated here. This case
study suggests that category changes may result from morphological changes.
Not many of the world’s languages have a richly recorded history, but many
that do have undergone morphological simplification, sometimes with category
changes. If our historical records included languages with increasing morpho-
logical complexity, we would be in a stronger position to relate morphological
and categorial changes. However, given the records that we have, we can see
the precariousness and uselessness of seeking to explain categorial changes by
general historical tendencies.
4 The loss of morphological case is discussed more fully in Lightfoot 1999, ch. 5, from which this
study is drawn, with some revisions.
Thematic roles are a function of the meaning of the verb and are “assigned”
by the verb, so the DPs are thematically linked to the verbs. In a sentence like
Kay drove to New York, New York is thematically linked to the preposition to
and not to the verb drove; in a phrase John’s mother’s house, the DP John’s
mother is thematically related to house but the smaller DP John is thematically
related only to mother.
If UG stipulates that heads may assign Case to the left or to the right in
accordance with the head-order parameter, as we indicated in chapter 3, one is
not surprised to find Old English nouns assigning Case to the left and to the
right. There is good reason to believe that the head-order parameter was shifting
in late Old English: one finds verbs preceding and following their complement,
object–verb order alternating with verb–object. There is independent evidence
that OE nouns assigned genitive Case not only to the left (8.13a) but also
to the right (8.13b). One finds possessive–head order alternating with head–
possessive. Old English has a very simple analysis. It is more or less a direct
manifestation of this UG theory of Case: nouns assigned Case to the left and
to the right, and only to DPs with which they were thematically related, as we
shall see. Case was assigned in that fashion and then was realized on both sides
of the noun with the morphological, genitive suffix. Lof assigns a thematic role
to god in (8.13ai) and lufu to god and mann in (8.13bi).
If Old English nouns assigned Case to the left and to the right, and if in both
positions it was realized as a morphological genitive, then one is not surprised
to find that Old English also manifested “split genitives” (the term is Eilert
Ekwall’s (1943)). These were split in that a single genitive phrase occurred
on both sides of the head noun. In (8.14) we see an example where the split
element occurring to the right of the noun was a conjunct. Jespersen (1909,
p. 300) notes that with conjuncts, splitting represents the usual word order in
Old English.
In addition, appositional elements, where two DPs are in parallel, were usu-
ally split: the two elements occurred on either side of the head noun (8.15a–c),
although (8.15d) was also possible, where Ælfredes cyninges is not split.
These grammars had an overt genitive case on the right or on the left of the
head noun; and they had split genitives, where the head noun assigned the same
5 Cynthia Allen (2002) argues that cyninges is an adjunct to godsune rather than a complement.
This raises interesting questions which we shall not discuss here.
thematic role and Case in both directions. So much for splitting in Old English
grammars.
Now for the mysterious changes. Middle and early Modern English also
manifested split genitives but they included forms which are very different
from the split genitives of Old English, as the examples of (8.17) show.
(8.17) a. The clerkes tale of Oxenford. (Chaucer, Clerk’s Tale,
Prologue)
b. The Wive’s Tale of Bath. (Chaucer, Wife of Bath’s Tale,
Prologue)
c. Kyng Priamus sone of Troy. (Chaucer, Troilus & Cressida,
I, 2)
d. This kynges sone of Troie. (Chaucer, Troilus & Cressida,
III,1715)
e. The Archbishop’s Grace of York. (Shakespeare, 1 Henry IV,
III.ii.119)
The meaning is “The clerk of Oxford’s tale,” “King Priam of Troy’s son,”
etc., and the genitive is split in the same sense as in Old English grammars: the
rightmost part of the genitive phrase (italicized) occurs to the right of the head
noun which the genitive phrase modifies. Mustanoja (1960, p. 78) notes that
“the split genitive is common all through ME [Middle English]” and is more
common than the modern “group genitive,” The clerk of Oxford’s tale. Jespersen
(1909, p. 293), exaggerating a little, calls this splitting “the universal practice
up to the end of the fifteenth century.” However, these Middle English split
forms are different from those of Old English grammars, because the rightmost
element is neither a conjunct nor appositional, and it has no thematic relation
with the head noun, tale, son, Grace, but rather with the item to the left: clerk,
wife, etc. How did these new split forms emerge and become so general?
We can understand the development of the new Middle English split genitives
in light of the loss of the overt morphological case system and the theory of
Case related to thematic role. Culicover (1997, pp. 37f.) discusses the “thematic
case thesis,” under which abstract Case realizes thematic-role assignment quite
generally. This is where we seek to connect work on abstract Case with the
morphological properties discussed by earlier grammarians.
Old English had four cases (nominative, accusative, genitive, and dative) and
a vestigial instrumental, but they disappear in the period of the tenth to thirteenth
century, the loss spreading through the population from the north to the south
probably under the influence of the Scandinavian settlements (O’Neil 1978). In
early Middle English, grammars emerged which lacked the morphological case
properties of the earlier systems, in particular lacking a morphological genitive.
Put yourself now in the position of a child with this new, caseless grammar;
your grammar has developed without morphological case. You are living in
the thirteenth century; you would hear forms such as (8.15a) Ælfredes godsune
cyninges, but the case endings do not register: that is what it means not to have
morphological case in one’s grammar. You are not an infant and you are old
enough to have a partial analysis, which identifies three words. Ælfredes was
construed as a “possessive” noun in the specifier of DP.
The modern “possessive” is not simply a reflex of the old genitive case.
Morphological case generally is a property of nouns. On the other hand,
“possessive” in modern English is a property of the DP and not of nouns: in
(8.18a) My uncle from Cornwall’s cat the possessor is the whole DP My uncle
from Cornwall. Allen (1997) shows that the ’s is a clitic attached to the preced-
ing element and that the group genitive, where the clitic is attached to a full DP,
is a late Middle English innovation.
(8.18) a. [DP [DP my uncle from Cornwall]’s cat]
b. Poines his brother. (Shakespeare, 2 Henry IV, 2.4.308)
c. For Jesus Christ his sake. (1662 Book of Common Prayer)
d. Mrs. Sands his maid. (OED, 1607)
e. Job’s patience, Moses his meekness, and Abraham’s faith.
(OED, 1568)
As the case system was lost, the genitive ending es was reanalyzed as
something else, a Case-marking clitic. If ’s comes to be a clitic in Middle
English, which Case-marks DPs, this would explain why “group genitives” be-
gin to appear only at that time, as Allen argued. Allen’s analysis also predicts
Jespersen’s observation that splitting was the universal practice until the clitic
became available.
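The difference between a case suffix on a noun and a Case-marking clitic on a full DP can be shown schematically. This fragment is our own toy rendering of the reanalysis, not Allen’s formal analysis:

```python
# Toy illustration (ours): a morphological case suffix attaches to the noun
# itself, while the Middle English clitic 's attaches to the edge of a whole
# DP, which is what makes "group genitives" possible.
def oe_genitive(noun: str) -> str:
    return noun + "es"                    # case morphology on the noun

def me_group_genitive(dp_words: list) -> str:
    return " ".join(dp_words) + "'s"      # clitic on the full DP

print(oe_genitive("Ælfred"))                                # Ælfredes
print(me_group_genitive(["the", "clerk", "of", "Oxford"]))  # the clerk of Oxford's
```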
It is likely that there was another parallel reanalysis of the genitive es ending,
yielding the his-genitives which were current in the sixteenth and seventeenth
centuries (8.18b,c) for “Poines’ brother,” “Christ’s sake,” etc. The genitive
ending in ’s was sometimes spelled his, and this form occurs even with females
(8.18d), and occurs alongside possessive clitics (8.18e).
UG dictates that every phonetic DP has Case, as we sketched in chapter 3.
The new caseless children reanalyzed the old morphological genitive suffix
es as a clitic, which was recruited as a Case-marker. The clitic ’s Case-marks
the element in the specifier of the containing DP. So Ælfred has Case and the
Case is realized through the ’s marker (usually analyzed as the head D, as in
the structure given for (8.19a); see also chapter 3). In short, the Ælfredes of the
parents is reanalyzed as Ælfred’s, although orthographic forms like Ælfredes
occur in texts when mental grammars surely yielded Ælfred’s. Orthographic ’s
is a recent innovation. So far, so good.
What about cyninges in (8.15a)? The evidence suggests that the phrase
became (8.19a) Ælfred’s godsune king. One finds phrases of just this form
in (8.19b,c), where the post-nominal noun is not overtly Case-marked, and
Jespersen (1909, pp. 283f.) notes that these forms are common in Middle
English.
(8.19) a. Ælfred’s godsune king
[DP [DP Ælfred] D ’s [NP godsune [king]]]
b. The kynges metynge Pharao
Pharaoh the king’s dream (Chaucer, Book of the Duchess,
282)
c. The Grekes hors Synoun
Sinon the Greek’s horse (Chaucer, Squire’s Tale, 209)
The forms of (8.19), where the rightmost element is appositional, are direct
reflexes of OE split genitives like (8.15), corresponding exactly, except that the
split element, Pharao, Synoun, has no overt Case. Despite the absence (for us
new, caseless children – remember our thought experiment) of an overt, mor-
phological genitive case, UG prescribes that the post-nominal DP must carry
some abstract Case. After the loss of the morphological case system, it can no
longer be realized as a genitive case ending. That means that there must be
another way of marking/realizing the abstract Case in (8.19). Perhaps Pharao
receives its Case by coindexing with the Case-marked kynges; the two forms
are in apposition and therefore are coindexed and share the same thematic role.
This is what one would expect if there is a one-to-one relationship between
Case and thematic role, the key element of our theory of Case. In that event, no
independent Case-marker is needed for Pharao.
There is another option for realizing Case on the rightmost element. The
dummy preposition of could be used as a Case-marker, as it is in (8.17)
(see chapter 3, note 11). This is not possible in Ælfred’s godsune king or the
phrases of (8.19), because if of were to Case-mark the DP, one would expect
it also to assign a thematic role (given a one-to-one relation between Case and
thematic role) and in that event the DP could not be interpreted as an apposi-
tional element. The sentences of (8.17), on the other hand, are not like those of
(8.19) and have different meanings. In (8.17b), for example, Wive and Bath are
not appositional, not coindexed, and therefore an independent Case-marker and
thematic-role assigner is needed; this is the function of of.6 Under this view, the
emergence in Middle English of the new N of DP forms (8.17) is an automatic
consequence of the loss of the morphological case system: of was introduced
in order to Case-mark a DP which would not otherwise be Case-marked. In
particular, the DP could not be Case-marked like the rightmost item in (8.19),
which carries the same Case as Ælfred’s because it has the same thematic role.
Of assigns Case to a DP only if it has an independent thematic role.
6 Nunnally (1985, p. 21) finds no genitival of phrases in his study of the OE translation of
St. Matthew’s Gospel (of was used frequently to show origin or agency, best translated by
modern from or by).
The new properties emerged in children who (a) heard forms such as those
in (8.15), (b) no longer had the morphological case system of their parents, and (c) were subject to a Case
theory requiring all DPs to have Case (assigned and realized) and linking Case
with the assignment of thematic roles. We have a tight explanation for the new
properties of Middle English grammars. In particular, we explain the distinction
between (8.17) and (8.19), with of occurring where there is no thematic relation
with the head noun (8.17), but not where there is such a relation (8.19). We
see that change is bumpy; if one element of a grammar changes, there may be
many new phenomena (8.17). Children do not just match what they hear and
they may produce innovative forms, as required by UG. UG defines the terrain,
the hypothesis space, and a change in initial conditions (loss of morphological
case) may have syntactic effects.7
This is an explanation for the form of the split genitives of (8.17) in Middle
English. They were around for four centuries and then dropped out of the
language. This was probably a function of the newly available clitic ’s which
made possible group genitives like The clerk of Oxford’s tale; these became
possible only when ’s was construed as a clitic, which Case-marked DPs, and
that in turn was a function of the loss of morphological cases, including the
genitive in es.
Here we have taken a notion (“case”) from traditional grammar, and con-
strued Case as an element in cognitive grammars, in people’s language organs.
Phonetic DPs, DPs which are pronounced, have an abstract Case which must
be realized somehow. This is required by UG, and abstract Cases are often
realized as morphological cases. Children scan their linguistic environment for
morphological cases and, if they find them, they serve to realize abstract Cases.
If children do not find morphological cases, then different grammars emerge. In
that event, a P or V (or other categories) may Case-mark a complement DP. We
have examined here what happens when everything else remains constant. There
came a point in the history of English when children ceased to find morpholog-
ical cases. Those children were exposed to much the same linguistic experience
as their parents, but the transparency of overt case endings had dropped below a
threshold such that they were no longer attained. Given a highly restrictive the-
ory of UG, particularly one linking Case-assignment by nouns to thematic-role
assignment and requiring Cases to be realized on phonetic DPs, other things
then had to change.
In this way our abstract theory of Case enables us to understand how some
of the details of Middle English grammars were shaped, why things changed
as they did and why Middle English grammars had their odd split genitives.
7 Our account leaves open the question of why these extended split genitives (8.17) should have
arisen. Lightfoot (1999) appeals to the reanalysis of one special type of Old English split geni-
tive, those involving double names like Thomasprest Doreward (= priest of Thomas Doreward)
and, crucially, those where the second part begins with of : Rogereswarenner of Beauchamp
(= warrener of Roger of Beauchamp), which may have triggered the new split genitives.
8.5 Chaos
From the Greeks to Newton, people have believed in a predictable universe.
Where unpredictable behavior was observed, for example in weather, the un-
predictability was attributed to lack of knowledge: if we just knew more, we
would have better weather forecasts. Pierre Simon Laplace said that he could
specify all future states if he could know the position and motion of all particles
in the cosmos at any moment. Recently, however, scientists in various fields
have found that many systems are unpredictable despite the fact that they follow
courses prescribed by deterministic principles. The key to understanding how
systems may be both determinate and unpredictable – an oxymoron from the
point of view of classical science – lies in the notion of sensitive dependence
on initial conditions.
Predicting final outcomes – or indeed anything beyond the very short-term –
becomes impossible for many types of system. Chaos incorporates elements of
chance, but it is not random disorder. Rather, chaos theory tries to understand
the behavior of systems that do not unfold over time in a linearly predictable
manner. When viewed as a whole, these systems manifest definite patterns and
structures. However, because the evolution of a chaotic system is so hugely
complex and so prone to perturbation by contingent factors, it is impossible to
discern its underlying pattern – its attractor – by looking at a single small event
at a single point in time. At no single point can future directions be predicted
from past history.
So it is with the emergence of a new species in evolutionary change, with
changes in the political and social domain, and in grammar change. Change
is not random, but we are dealing with contingent systems and we offer
retrospective explanations, not predictions. Grammatical change is highly con-
tingent, sensitive to initial conditions, chaotic in a technical sense. Linguists
can offer satisfying explanations of change in some instances, but there is no
reason to expect to find a predictive theory of change, offering long-term, linear
predictions.
The emergence of a grammar in a child is sensitive to the initial conditions of
the primary linguistic data. If those data shift a little, there may be significant
consequences for the abstract system. A new system may be triggered, which
generates a very different set of sentences and structures. There is nothing
principled to be said about why the data should shift a little; those shifts often
represent chance, contingent factors. Contingent changes in the distribution of
the data (more accurately, changes in the “cues”: Lightfoot 1999) may trigger a
grammar which generates significantly different sentences and structures, and
that may have some domino effects, as we have seen.
Changes in languages often take place in clusters: apparently unrelated
superficial changes may occur simultaneously or in rapid sequence. Such clusters
manifest a single theoretical choice which has been taken differently. The
singularity of the change can be explained by the appropriately defined
theoretical choice. The principles of UG and the definition of the cues
constitute the laws which guide change in grammars, defining the available
terrain. Any given phenomenal change is explained if we show, first, that the
linguistic environment has changed in such a way that some theoretical choice
has been taken differently (say, a change in the inflectional properties of verbs),
and, second, that the new phenomenon (may, must, etc. being categorized as I
elements, for example) must be the way that it is because of some principle of
the theory and the new inflectional system.
Sometimes we can explain domino effects of this type. Linguists have
argued that a changing stress pattern may leave word-final inflection mark-
ings vulnerable to neutralization and loss. Loss of inflectional markings may
have consequences for category membership and changes in category member-
ship may have consequences for computational operations moving verbs to an
I position. In that event, one establishes a link between a change in stress pat-
terns and changes in the positions of finite verbs. Benjamin Franklin would
understand: “For want of a nail, the shoe was lost; for want of a shoe the horse
was lost; for want of a horse, the rider was lost.” However, to say that there
may be domino effects is not to say that there is a general directionality of
the kind sought by nineteenth-century linguists and by modern typologists and
grammaticalizationists.
What we cannot explain, in general, is why the linguistic environment should
have changed in the first place (as emphasized by Lass 1997 and others).
Environmental changes are often due to what we have called chance factors,
effects of borrowing, changes in the frequency of forms, stylistic innovations,
which spread through a community and, where we are lucky, can sometimes be
documented by variation studies. Changes of this type need not reflect changes
in grammars. But with a theory of language acquisition which defines the range
of theoretical choices available to the child and specifies how the child may take
those choices, one can predict that a child will converge on a certain grammar
when exposed to certain environmental elements. This is where prediction is
possible, in principle. We thus have a determinist theory of language acquisition,
but not a determinist theory of history or of language change.
We have an interplay of chance and necessity, and appropriately so: changes
are due to chance in the sense that contingent factors influence a child’s PLD and
make the triggering experience somewhat different from what the child’s parent
was exposed to. Necessity factors, the principles of UG and the cues, define the
range of available options for the new grammar. We take a synchronic approach
to history. Historical change is a kind of finite-state Markov process, where
each state is influenced only by the immediately preceding state: changes have
only local causes and, if there is no local cause, there is no change, regardless
of the state of the grammar or the language at some previous time.
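The finite-state, locally caused character of change can be pictured with a small simulation. The sketch below is ours; the states and transition probabilities are invented for illustration and correspond only loosely to the English changes discussed above:

```python
import random

# Our illustrative sketch (states and probabilities are invented): historical
# change as a Markov process, where the next state depends only on the
# current state, never on earlier history.
TRANSITIONS = {
    "V-to-I":      {"V-to-I": 0.90, "modals-in-I": 0.10},
    "modals-in-I": {"modals-in-I": 0.85, "no-V-to-I": 0.15},
    "no-V-to-I":   {"no-V-to-I": 1.00},   # no local cause, so no change
}

def next_state(state: str) -> str:
    """Sample the next generation's grammar from the current state alone."""
    options = TRANSITIONS[state]
    return random.choices(list(options), weights=list(options.values()))[0]

state = "V-to-I"
for generation in range(1, 11):
    state = next_state(state)
    print(generation, state)
```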
In that way, the emergence of a grammar in an individual child is sensitive
to the initial conditions, to the details of the child’s experience. So language
change is chaotic, in a technical sense, in the same way that weather patterns
are chaotic. The historian’s explanations are based on available acquisition
theories, and in some cases our explanations are quite tight and satisfying.
Structural changes are interesting precisely because they have local causes.
Identifying structural changes and the conditions under which they took place
informs us about the conditions of language acquisition; we have indeed learned
things about properties of UG and about the nature of acquisition by the careful
examination of diachronic changes. Under this synchronic approach to change,
there are no principles of history; history is an epiphenomenon and time is
immaterial.
9 “Growing” a language
We have argued throughout this book that the cognitive system underlying a
person’s language capacity has intrinsic properties which are there by biolog-
ical endowment. Those properties interact with contingencies resulting from
exposure to a particular linguistic environment and the interaction yields a final
state in which the person may communicate, perhaps some form of French.
In that case, the person, Brigitte, will have incorporated from her environment
the contingent lexical properties that livre is a word to refer to the novel she is
reading and cooccurs with forms like le and bon (being “masculine”), and that père may
refer to her father. She has also incorporated contingent structural properties:
interrogative phrases like quel livre may be displaced to utterance-initial posi-
tion, verbs raise to a higher functional position, and so on. We have described
ways in which linguists have teased apart the intrinsic properties common to
the species and the contingent properties resulting from individual experience.
That work has been guided by the kind of poverty-of-stimulus arguments that
we have discussed, by theoretical notions of economy and elegance, and by the
specific phenomena manifested by the mature grammar under investigation.
Viewing a person’s language capacity in this way and focusing on what we
have called I-language leads one to ask novel questions about children and their
linguistic development. The perspective we have sketched has already led to
productive research and we have learned a great deal about the linguistic minds
of young children. In many ways, results concerning the attainment of syntactic
knowledge present the most dramatic examples of the bearing of this research
on our overall thesis, and we focus on this area for much of the present chapter.
Work in recent years has also shown remarkable things about the path by which
children attain knowledge of the expression system of their language, however,
and we turn briefly to those matters in section 9.5.
9.1 Principles of Universal Grammar: active early
Children prefer the reduced form in questions like (9.1a), but correctly resist this preference when it conflicts with
UG principles. They use the reduced form in asking questions like (9.1a) but
not in questions like (9.1b), so they manifest the hypothetical genetic constraint
at a stage when their spontaneous production manifests very few instances of
long-distance wh-movement. The ingenuity of the experiment shows that even
at this stage the relevant principles are operating (Crain 1991).
The experiments we have described deal with elicited production, but com-
prehension studies also show that hypothetical genetic constraints are in effect
in very young children, at the earliest stage where they can be tested. Thornton
(1994) reported children’s comprehension of yes/no questions containing nega-
tion, such as (9.3). The difference between the two forms lies in the structural
position of the negative: in (9.3ai) the negative is inside the IP (partial structure
given in (9.3aii)) but in (9.3bi) it has formed a word with did and moved out of
the IP to C (9.3bii).
(9.3) a. i. Did any of the turtles not buy an apple?
ii. [CP did [IP any of the turtles not buy an apple]]
b. i. Didn’t any of the turtles buy an apple?
ii. [CP didn’t [IP any of the turtles buy an apple]]
The position of the negative corresponds to two distinct interpretations. That
correspondence between meaning and structural position follows from prin-
ciples of UG, which we need not go into here; essentially, a negative has an
effect on any element within its complement; logicians say that negatives have
sc o pe over certain elements. The phenomenon is clear. Suppose that turtles A
and B bought an apple but turtle C did not. Then if somebody asked question
(9.3ai), an appropriate answer would be that turtle C did not. If somebody asked
(9.3bi), then the appropriate answer would be very different: turtles A and B
did. So children’s responses to questions like (9.3ai,bi) reveal how they interpret
negatives. In particular, responses to (9.3bi) show whether children interpret the
negative in the higher structural position. This is worth testing because Thornton
found that all her children produced non-adult negative questions. Most dou-
bled the auxiliary verb (What do you don’t like?) and one failed to move the
auxiliary to the position of C: What you don’t like?
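The two structures in (9.3) pair with two answer patterns, and the pairing can be made explicit in a few lines. The sketch below is ours; the truth values simply encode the turtle scenario just described:

```python
# Our sketch of the scenario in the text: turtles A and B bought an apple,
# turtle C did not; the position of the negative fixes which turtles an
# appropriate answer names.
bought = {"A": True, "B": True, "C": False}

# (9.3a) "Did any of the turtles not buy an apple?": negative inside IP,
# scoping over "buy an apple", so the answer names the non-buyers.
answer_a = [t for t, b in bought.items() if not b]

# (9.3b) "Didn't any of the turtles buy an apple?": negative raised to C,
# scoping over the whole IP, so the answer names the buyers.
answer_b = [t for t, b in bought.items() if b]

print(answer_a)   # ['C']
print(answer_b)   # ['A', 'B']
```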
In testing comprehension, Thornton found that the children had no difficulty
interpreting negative questions in the adult fashion; significantly, all children
were able to access interpretations like (9.3bi), where the negative needs to be
interpreted in the position of C. She tested children between the ages of three
and a half and four and a half. The comprehension test used a modified form
of the Truth Value Judgement task (Crain 1991). A story was acted out by one
experimenter and watched by the child and a second experimenter, who was
playing the role of a puppet, in this case “Snail.” At the end of each story,
the experimenter asked Snail a targeted question. Snail had difficulty with the
question (“That’s a hard one . . .”), and requested help from the child. If the
child was cooperative, she answered the question for Snail.1 The scenarios used
to test children’s comprehension of questions like (9.3ai) and (9.3bi) were de-
signed so that either (9.3ai) or (9.3bi) could be asked appropriately; children’s
answers, however, indicate their analysis of the structural position of the neg-
ative. Thornton found that, while these children made production errors with
expressions like adult What don’t you like?, their comprehension was adult-like
and manifested the UG principles which determine the scope of negatives.
So there is a clear production/comprehension asymmetry, which, of course, is
no surprise under the modular view of mind that we have articulated. Whatever
it is that causes the delay in producing the adult forms, the fact that children
interpret the negative questions in adult fashion shows that they have access
to whatever principles of UG assign scope relations. The difficulty evidently
lies with the behavior of the element n’t: children produce non-adult questions
which retain the n’t in the IP until they figure out that n’t may form part of a
word with did2 and move with it outside the IP to C.
Evidence suggests, then, that at least some principles of UG are available
and operative at the earliest stage where they might be tested. The same might
turn out to be true of all UG principles. But perhaps not. Some principles of UG
may turn out to be subject to a puberty-like clock, emerging only at a certain
age in a kind of “maturation.” There is nothing implausible about that view and
we know that it holds of some genetic properties in the physiological domain;
so why not also in the cognitive domain?
In fact, the case that grammars mature in this fashion has been made. Some
have argued that very young children have “semantically based” grammars and
that they graduate to “syntactically based” grammars at a certain age just as
tadpoles develop into frogs (Bowerman 1973; for discussion, see deVilliers and
deVilliers 1985 and Gleitman and Wanner 1982). Others have argued that prin-
ciples of the binding theory are not operative until they mature in the organism
(Borer and Wexler 1987, Manzini and Wexler 1987). A little later we shall con-
sider a claim that the principles of the binding theory are based on notions of
linear precedence in very young children, later becoming structurally based. We
do not find these maturation claims convincing and will not discuss them here,
but there is nothing implausible in the general idea of maturation of genetic
properties.
1 The child was not asked the question directly, in order to alleviate any feeling she might have
of being tested; in this setup, Snail is being quizzed, not the child. Here we are giving only the
briefest description of the experiments, but the experimental techniques used for this kind of work
require great ingenuity and are of enormous interest in themselves. For excellent discussion, see
Crain and Thornton 1998.
2 The most common account of this is to say that n’t is a clitic, and attaches to did by cliticization.
Zwicky and Pullum 1983, however, show that words like didn’t actually represent inflected forms
which exist only for a limited set of verbs, rather than cliticization. The difference between these
two analyses does not bear on the point under consideration here, though.
9.2 New phenomena
3 The distinction is a function of a Subjacency condition which blocks movement across more than
one bounding node (typically IP and DP in English grammars) and the claim that a wh-phrase
moves to the specifier of CP (see chapter 3).
At this optional infinitive stage, children know that finite verbs may not oc-
cur (more technically, may not be “checked”) in clause-final position in matrix
clauses, because they do not produce clause-final inflected verbs: *Pappa schoe-
nen wast, *Pappa nieuwe scooter koopt. And they know that non-finite verbs
may not occur in an inflectional position and do not produce forms like *Pappa
wassen schoenen or *Ik lezen ook.
Comparable phenomena are found in young French children (Pierce 1992).
They alternate between the non-finite, non-adult forms of (9.11), where the verb
stays in its first, base-generated position inside the VP and to the right of the
negative marker pas, and the finite forms of (9.12), where the verb is finite and
therefore occurs in the I position where its finite features are checked, to the
left of pas.
Sometimes a pronoun may precede its antecedent and sometimes not. This is the phenomenon of
“backwards anaphora.” So in (9.13a) (repeated from chapter 2), he may refer
to Jay, but not in (9.13b). In both instances, he precedes Jay, but in (9.13b) he is
also structurally higher than Jay in a sense that we shall not make precise here.
It is Principle C of the binding theory which prevents Jay being coindexed with
he in (9.13b), allowing it in (9.13a).4
(9.13) a. When he entered the room, Jay was wearing a yellow shirt.
b. He was wearing a yellow shirt, when Jay entered the room.
Given what we said in the last section, we would expect children to conform
to Principle C and to produce and understand sentences like (9.13) in the adult
fashion. However, that is not what the early literature suggests. Larry Solan
(1983) and Susan Tavakolian (1978) discussed experiments with children acting
out sentences like (9.13), when provided with suitable toys.5 The children were
three- to eight-year-olds; not so young. For adults, (9.13a) may be interpreted as
referring to two men, Jay and somebody unnamed; (9.13b) must be interpreted
that way, involving two men. That interpretation is also open to children and
the act-out studies found that children interpreted both types of sentences that
way most commonly: two thirds of the time, in fact. This led to the conclusion
that backwards anaphora does not exist in young children, that the conditions
on coreference are purely linear and not structural in the early stages. Put
differently, children do not permit the interpretation allowed by Principle C
and are not subject to Principle C but rather only to a linear condition that a
pronoun may never precede a noun to which it refers. However, the conclusion
was premature.
First, one third of the responses, in fact, permitted backwards anaphora,
with a pronoun referring to a noun to its right. Second, even if all responses
disallowed backwards anaphora, that would not show that children were not
subject to Principle C: perhaps they are only displaying a strong preference to
have the pronoun refer to some second person unnamed in the sentence. To test
adherence to Principle C one needs an experiment which shows that children
sometimes allow backwards anaphora and that they reject it in the appropriate
circumstances. Or not.
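The structural, rather than linear, character of Principle C can be made concrete as a computation over trees. The following sketch is our own illustration; the tree encodings and the c-command test are simplified stand-ins for the definitions discussed in note 4:

```python
# Illustrative sketch (ours): Principle C as a structural, c-command-based
# condition, not a linear-precedence condition. Trees are (label, children)
# pairs; coindexed items share an index suffix, as in he-1/Jay-1.
TREE_A = ("S", [("CP", [("when", []), ("he-1", []), ("entered the room", [])]),
                ("IP", [("Jay-1", []), ("was wearing a yellow shirt", [])])])
TREE_B = ("S", [("He-1", []),
                ("VP", [("was wearing a yellow shirt", []),
                        ("CP", [("when", []), ("Jay-1", []),
                                ("entered the room", [])])])])

def nodes(tree):
    """Yield every subtree of a tree."""
    yield tree
    for child in tree[1]:
        yield from nodes(child)

def c_commands(tree, a, b):
    """True iff a c-commands b: b sits inside a sister of a."""
    _label, children = tree
    if any(child[0] == a for child in children):
        for sister in children:
            if sister[0] != a and any(n[0] == b for n in nodes(sister)):
                return True
    return any(c_commands(child, a, b) for child in children)

# Principle C: a name may not be coindexed with a c-commanding pronoun.
print(c_commands(TREE_A, "he-1", "Jay-1"))  # False: backwards anaphora allowed
print(c_commands(TREE_B, "He-1", "Jay-1"))  # True: coreference blocked
```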
4 The binding theory, one of the more resilient aspects of grammatical theory, divides nouns into
three types: anaphors, pronouns, and everything else (sometimes called “names”). Anaphors are
subject to Principle A and must be locally coindexed with a “higher” element (the technical
notion is a “c-commanding” item). A pronoun must not be locally coindexed with a higher
element, by Principle B. And Principle C requires that names not be coindexed with a higher
element anywhere. See Chomsky 1981. In (9.13b) he is higher than Jay in the relevant sense, and
therefore they may not be coindexed (Principle C); in (9.13a), on the other hand, he is contained
in a subordinate clause and is not higher than Jay and therefore Principle C is irrelevant.
5 The actual sentences used included For him to kiss the lion would make the duck happy, That he
kissed the lion made the duck happy.
Crain and McKee (1986) constructed a Truth Value Judgement task, of the
kind we saw earlier. Children were exposed to sentences like (9.14) (analogous
to (9.13)), in situations where the pronoun was backwards anaphoric and in
situations where the pronoun had another referent not mentioned in the sentence.
They judged the truth value of the sentences.
(9.14) a. While he was dancing, the Ninja Turtle ate pizza.
       b. He ate pizza while the Ninja Turtle was dancing.
Sentence (9.14a) was presented twice, once where the Ninja Turtle was danc-
ing and eating pizza and once where somebody else was dancing while the
Ninja Turtle was eating pizza. Similar conditions attended the presentation of
(9.14b), although no adult would use such a sentence in a situation in which the
Ninja Turtle was dancing and eating pizza. For each scene, Kermit the Frog said
what he thought happened in that trial, using sentences like (9.14a) or (9.14b). If
Kermit said something appropriate, children could feed him something he liked;
if Kermit said the wrong thing, children could get Kermit to eat something
“yucky,” like a rag or a cockroach, or to do push-ups. There is much to be said
about the design of the experiment, but the results clearly showed that children
correctly accepted the backwards anaphoric reading in sentences like (9.14a)
about two thirds of the time. In addition, 90 percent of the time sentences like
(9.14b) were correctly judged to be wrong in contexts displaying coreference.
Thus even two- and three-year-olds allow backwards anaphora and reject it
when structural conditions dictate that they should. Kermit ate some pretty
unpleasant things as a consequence of the fact that these children behaved in
accordance with Principle C.
The construction of a good experiment is far from trivial. We cannot sim-
ply observe children’s spontaneous expressions, because it might be several
years before a child is confronted with a situation which draws on her know-
ledge of Principle C in a way that can be measured. Crain and Thornton (1998,
chs. 27–30) provide an excellent discussion of the design features of experi-
ments relating to Principle C.
The literature is full of experiments showing that children do not know such-
and-such at some early age, only coming to acquire some principle of UG
at a later age. Since children’s language differs from that of adults, one can
easily allow extraneous factors to distort findings, as in the early experiments
on Principle C. Confounding factors have led many researchers to conclude too
hastily that children do not know certain principles of UG. In fact, we know of
no good demonstration to this effect.
Consider another example. Hornstein and Lightfoot 1981 argued that children
know in advance of any experience that phrase structure is hierarchical and not
flat. So the structure of an expression the second striped ball would be that of
(9.15a) and the flat structure of (9.15b) would not be a candidate, precluded
by UG.6
(9.15) a. [DP D [NP A [NP A [NP N]]]]
          (e.g., [DP the [NP second [NP striped [NP ball]]]])
       b. [DP D A A N]
          (e.g., [DP the second striped ball])
There were various reasons for this analysis and for the claim that flat struc-
tures are generally unavailable. In particular Hornstein and Lightfoot argued that
if children allowed a flat structure they would never be exposed to positive data
enabling them to acquire the correct hierarchical structure of adult language.
Nonetheless, it was argued, for example by Matthei (1982), that children must
operate with flat structures, because they have difficulty interpreting phrases
like the second striped ball. Confronted with an array like that in figure 9.1
and asked to identify “the second striped ball,” adults invariably identify the
third ball but children often identify the second ball, which is also striped.
6 Hornstein and Lightfoot stated their claims in terms of then-current notions of phrase structure,
in which NP included determiners and intermediate N elements. Twenty years later, we translate
their claims into present-day frameworks using DP, etc.
Hamburger and Crain 1984 then showed that the difficulty that children have
in identifying “the second striped ball” in an array does not relate to their
grammars; in fact, it cannot. They hypothesized that the difficulty might lie in
the pragmatic complexity of identifying “the second striped ball” rather than
in syntactic complexity. A dramatic improvement in children’s responses was
effected by two changes. First, children were given a pre-test session where they
handled and counted sets of striped and unstriped balls. Second, if children were
first asked to identify the first striped ball, forcing them to plan and execute part
of what is involved in identifying the second striped ball, they then performed
much better when asked to identify the second striped ball. As Crain put it,
“these simplifying maneuvers made it possible for children to reveal mastery
of the syntax and semantics of such expressions” (1991, p. 609).
Furthermore, Hornstein and Lightfoot had shown that the pronoun one refers
to an NP and not a noun head (see note 6). Hamburger and Crain noted that if
children employ such hierarchical structures, then, once they know the meaning
of ordinals, they will behave in adult fashion when confronted with an array like
that of figure 9.2 and asked “Point to the first striped ball; point to the second
one,” using the pronoun one. In fact, children pointed consistently to the fifth
object in the array. Using one in this way indicates hierarchical structures like
(9.15a) and is quite incompatible with the flat structure hypothesis ((9.15b)), if,
as is generally assumed, pronouns corefer with syntactic constituents (because
second striped ball is not a constituent in (9.15b); consequently children would
be expected to identify the second ball, not the second striped ball).
Here we have shown more instances where we can see that the language of
young children manifests properties of UG at the earliest stages that we can
test . . . if we do the experiments properly and are careful to tease out their syn-
tactic capacities, designing the experiment to exclude extraneous confounding
factors. As we learn more about experimental technique, so we shall learn more
about children’s linguistic capacities, and vice versa.
9.4 The nature of the trigger
Let us now consider the nature of children’s initial experience, the primary
linguistic data (PLD) that trigger the development of a mature grammar.
(9.16) primary linguistic data (Universal Grammar → grammar)
The primary linguistic data are positive and robust. That is, they consist of
actual expressions which occur frequently enough for any child to hear them.
As we observed in chapter 2, a rough-and-ready formulation would say that the
PLD do not include negative data, information about what does not occur, nor
exotic expressions, nor expressions used in exotic circumstances. Nor do they
include paraphrase relations or indications of the scope of quantifiers. These
are legitimate data, of course, part of what a grammar should characterize, part
of the output of the emerging system, but plausibly not part of the input to
language acquisition and so not “primary.” These seem to be plausible assump-
tions about the input, but the proof will come if they support successful models
of the form of (9.16), yielding optimal claims about UG and about mature
grammars.
One might also argue that the PLD are structurally simple, that children do not
need to hear complex expressions in order to develop a normal, mature grammar.
Children do hear complex expressions, of course, and they may understand
them, but they may not be a necessary part of experience. In that event, the
question arises what one means by “simple.”
Lightfoot (1991, 1994) has argued that children need exposure to simple,
unembedded clauses and the front of an embedded clause, but not to any-
thing more complex than that. This is degree-0 learnability, the idea that
grammars are learnable by exposure only to unembedded structures. The rel-
evant “unembedded” structures are defined in terms of the I-language, gram-
matical notion of a binding domain. So the PLD are drawn from unembedded
binding domains.
Children need to learn, for example, which verbs are transitive and which are
intransitive, and therefore need access to VPs which may or may not include
a complement. Not everything, however, can be learned from simple, unem-
bedded clauses. English speakers learn that the complementizer that may be
omitted from the front of an embedded finite clause, unlike French que or Dutch
dat, and learning this must require exposure to the front of an embedded clause, where
sometimes that occurs and sometimes it does not. Similarly some verbs require
that their complement clause be topped by a wh-item (I wonder who she saw)
and others do not: I believe that she saw Reuben. This can only be learned,
it would seem, by exposure to the front of an embedded clause. Furthermore,
in English, some verbs allow a lexical subject in their infinitival complement,
while their counterparts in other languages do not: I expect/want her to see
Reuben. This fact about English verbs also can only be learned if children have
access to embedded clauses.
7 In note 4, we observed that an anaphor must be locally coindexed with a higher element. Com-
parably, pronouns are subject to the same local requirement. “Locally” means within its binding
domain. An element’s binding domain is usually the clause (CP) or DP which contains it. If the
element is in a high structural position (namely in C, Spec of CP, or the subject of a non-finite IP),
then its binding domain is the next CP up. See Chomsky 1981 for the details.
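As a rough rendering of this definition (a sketch of ours, not the book's), one
can compute an element's binding domain from the labels of the nodes above it;
the path encoding and the numeric return value are our own simplifications.

    def binding_domain(path, high_position=False):
        """path lists node labels from the element's mother up to the root,
        e.g. ["VP", "IP", "CP", "IP", "CP"].  Returns the index in path of
        the element's binding domain.  high_position: the element is in C,
        in the Spec of CP, or the subject of a non-finite IP (note 7)."""
        skip_one_cp = high_position
        for i, label in enumerate(path):
            if label in ("CP", "DP"):
                if not high_position:
                    return i          # smallest CP or DP containing the element
                if label == "CP":
                    if not skip_one_cp:
                        return i      # the next CP up
                    skip_one_cp = False
        return None                   # element is in the root clause

    # An anaphor inside an embedded finite clause: its domain is that clause.
    print(binding_domain(["VP", "IP", "CP", "IP", "CP"]))                # -> 2
    # The subject of a non-finite IP: its domain is the next CP up.
    print(binding_domain(["IP", "CP", "IP", "CP"], high_position=True))  # -> 3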
A potential problem for degree-0 learnability: speakers of Italian and French
learn that IP is not a bounding node in those grammars (see note 3), unlike in the grammars of
English speakers. Furthermore, this fact could be learned on exposure to com-
plex sentences like those of (9.18), where there are two levels of embedding.
(9.18) a. Tuo fratello [CP a cui mi domando [CP che storie abbiano
raccontato]] era molto preoccupato.
Your brother, to whom I wonder which stories they have
told, is very troubled.
b. C’est à mon cousin [CP que je sais [CP lequel offrir]].
It is to my cousin that I know which to offer.
Another apparent problem involved Swedish, where Andersson and Dahl had claimed
that the finite auxiliary ha “have” may be deleted only in embedded clauses,
something which could seemingly be learned only from embedded material.
Lightfoot showed that Andersson and Dahl’s generalization was incorrect and
that ha may be deleted in a matrix clause if the second position to which it
would ordinarily move is already filled by an adverb like kanske “perhaps,” as
in (9.20d). This suggests that the correct generalization is that ha may be deleted
quite generally, but in fact it fails to be deleted when moved to C (Swedish is a
verb-second language, like Dutch and German, in which finite verbs typically
move to C in matrix clauses). The restriction that it may not be deleted in C
may then be understood in terms of the binding theory: if ha were deleted in
that position, its deleted copy (trace) would not be bound – it would lack an
antecedent. So ha may be deleted only in its original, base-generated position,
where it is not needed as an antecedent for a deleted copy.
This would also explain the non-deletability of a moved do or modal auxiliary
in English, as in (9.21); compare the non-moved can in (9.21c), which may be
deleted. Under this analysis, the Swedish facts are not as peculiar as one might
have thought and there are no special conditions to be learned, and nothing that
requires children to learn anything from embedded binding domains.
(9.21) a. *Who did Jay greet and who Ray treat?
b. *Who can Jay visit and who Ray eat with?
c. Jay can visit Fay and Ray eat with Kay.
More productive syntax of this type followed, where better analyses were
found by following through on the assumption that learning is based only on
structures from unembedded binding domains. What is relevant here is that the
notion of a binding domain is itself a grammatical, I-language notion, and that
is what we need to define the limits to children’s trigger experience. I-language
notions are implicated in analyses of language acquisition from the outset.
In fact, there is a still more fundamental point: not only must the PLD be drawn
from simple structures, but they are abstract structures themselves, not unana-
lyzed sentences. This is so-called cue-based acquisition, which we shall discuss
below, but first let us consider other, more E-language-based approaches.
Chomsky’s Aspects of the theory of syntax (1965), now a classic, viewed
children as endowed with a metric evaluating the grammars which can generate
the primary data to which they are exposed, along with appropriate structural
descriptions for those data. The evaluation metric picks the grammar which
conforms to the principles of UG and is most successful in generating those
data and those structural descriptions. The child selects a grammar whose output
matches her input as closely as possible.
The same point holds for more recent models. Gibson and Wexler (1994)
posit a Triggering Learning Algorithm (TLA), under which the child–learner
uses grammars to analyze incoming sentences and eventually converges on the
correct grammar. If the child–learner cannot analyze a given sentence with the
current grammar, then she follows a procedure to change one of the current
parameter settings and then tries to reprocess the sentence using the new para-
meter values. If analysis is now possible, then the new parameter value is
adopted, at least for a while. So the TLA is error-driven and permits the child
to reset a parameter when the current grammar does not give the right results.
This model has the child seeking grammars which permit analysis of incoming
data, where the data consist of more or less unanalyzed sentences.
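The logic of such an error-driven learner can be sketched as follows. This is
our simplification in Python, in the spirit of the TLA rather than Gibson and
Wexler's own formulation; the parser can_parse is a hypothetical stand-in for
whatever analysis routine the child brings to the input.

    import random

    def tla_step(grammar, sentence, can_parse):
        # grammar: a tuple of binary parameter settings
        if can_parse(grammar, sentence):
            return grammar                      # no error: keep the grammar
        i = random.randrange(len(grammar))      # change a single parameter
        candidate = grammar[:i] + (1 - grammar[i],) + grammar[i + 1:]
        if can_parse(candidate, sentence):
            return candidate                    # adopt it, at least for a while
        return grammar                          # otherwise stay put

    def learn(grammar, corpus, can_parse, steps=10000):
        for _ in range(steps):
            grammar = tla_step(grammar, random.choice(corpus), can_parse)
        return grammar

Note that both characteristic features of the proposal appear here: only one
parameter is changed at a time, and a change survives only if it lets the
learner analyze the sentence that caused the error.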
Clark (1992) offers a similar kind of model, but one which differs from that
of Gibson and Wexler in that the child does not revise particular parameter
settings. Clark posits a Darwinian competition between grammars needed to
parse sets of sentences. All grammars allowed by UG are available to each child,
and some grammars are used more than others in parsing what the child hears.
A “genetic algorithm” picks those grammars whose elements are activated most
often. A Fitness Metric compares how well each grammar fares, and the fittest
grammars go on to reproduce in the next generation, while the least fit die out.
Eventually the candidate grammars are narrowed to the most fit, and the child
converges on the correct grammar.
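The Darwinian picture can be rendered in a toy form as well (again our sketch,
not Clark's algorithm; the fitness function below is a crude stand-in for his
Fitness Metric, and can_parse is the same hypothetical parser as above).

    import random

    def fitness(grammar, corpus, can_parse):
        # stand-in Fitness Metric: how much of the sample the grammar parses
        return sum(can_parse(grammar, s) for s in corpus)

    def evolve(population, corpus, can_parse, generations=50, keep=0.5):
        pop = list(population)
        for _ in range(generations):
            ranked = sorted(pop, key=lambda g: fitness(g, corpus, can_parse),
                            reverse=True)
            fittest = ranked[:max(1, int(len(ranked) * keep))]
            # the fittest grammars "reproduce" into the next generation;
            # the least fit die out
            pop = [random.choice(fittest) for _ in range(len(pop))]
        return max(pop, key=lambda g: fitness(g, corpus, can_parse))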
What these models have in common is that learners eventually match their
input, in the sense that they select grammars which generate the sentences of
the input. It is only accurate grammars of this type which are submitted to
Chomsky’s (1965) evaluation metric, and Gibson and Wexler’s error-driven
children react to inaccurate grammars by seeking new parameter settings until
a sufficient degree of accuracy is achieved. The child converges on a gram-
mar which analyzes the input successfully, where the input consists of sets of
sentences.
There are problems with these input-matching, E-language-based ap-
proaches. First, the models will need great elaboration to deal with the fact,
observed several times in this chapter, that children produce non-adult forms.
That is, they operate with inaccurate grammars which do not match the input.
The models will need to explain why certain inaccuracies are tolerated and
others not.
Second, the models require extensive appeals to memory, because children
resetting a parameter need to know the full set of sentences which required
earlier resettings, lest they now be lost by picking the wrong parameter to reset.
Third, it is hard to see how these input-matching models can succeed when
children are exposed to unusual amounts of artificial or degenerate data, which
in fact are not matched. In particular, it is hard to see how they could ac-
count for the early development of creole languages, as described by Bickerton
(1984, 1999) and others. In these descriptions, early creole speakers are not
matching their input, which typically consists to a large degree of pidgin data.
Pidgins are primitive communication systems, cobbled together from fragments
of two or more languages. They are not themselves natural languages, and they
tend not to last long, before giving way to a creole with all the hallmarks of a
natural grammar. The first speakers of creoles go far beyond their input in some
ways, and in other ways fail to reproduce what they heard from their models,
arriving at grammars which generate sentences and structural descriptions quite
different from those of their input.
Nowadays we can observe these effects in the development of deaf children
acquiring various kinds of signing grammar. Some 90 percent of deaf children are
born into hearing homes and are exposed initially to degenerate pidgin-like data,
as their parents and older siblings learn an artificial gestural system in order
to communicate in primitive fashion. Nonetheless, like early creole speakers,
these deaf children go far beyond their models and develop natural systems.
Goldin-Meadow and Mylander (1990) show how deaf children go beyond their
models in such circumstances and “naturalize” the system, altering the code and
inventing new forms which are more consistent with what one finds in natural
languages.8
Fourth, severe feasibility problems arise if acquisition proceeds by evaluating
the capacity of grammars to generate sets of sentences. If UG provides binary
parameters of variation, then 12 parameters permit 4,096 grammars, and 33
allow 8–9 billion grammars, and so on. The child eliminates grammars which
fail to generate the sentences encountered. If we assume thirty to forty structural
parameters and if children converge on a grammar by, say, age seven, then they
eliminate grammars at a fantastic rate, several in each waking second on average.
As if that problem were not enough, it is by no means clear that parameters can be
kept down to between thirty and forty. If there are only thirty to forty structural
parameters, then they must look very different from present proposals.
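The arithmetic is easily checked; the figure for waking hours is our rough
assumption.

    n_params = 30                           # thirty binary parameters ...
    grammars = 2 ** n_params                # ... yield 1,073,741,824 grammars
    waking_seconds = 7 * 365 * 14 * 3600    # seven years of ~14 waking hours a day
    print(grammars / waking_seconds)        # roughly 8 grammars per waking second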
Cue-based acquisition offers a very different, more I-language-based ap-
proach. Under this view, advocated by Dresher 1999, Fodor 1998, Lightfoot
1999, and others, children do not evaluate grammars against sets of sentences.
Rather, UG specifies a set of “cues” and children scan their linguistic environ-
ment for these cues and converge on a grammar accordingly. A cue is some
kind of structure, an element of grammar, which is derived from the input. The
cues are to be found in the mental representations which result from hearing,
understanding, and “parsing” utterances (“parsing” means assigning structure
and meaning to incoming speech signals). As a person understands an utterance,
even partially, he or she has some kind of mental representation of the utterance.
Similarly for children, but a child may only have a partial understanding of what
is heard, hence a partial parse. The child scans those mental representations,
derived from the input, and seeks the designated cues.
The child scans the linguistic environment for cues only in simple syntactic
domains (this is the degree-0 learnability just discussed). Learners do not try
to match the input; rather, they seek certain abstract structures derived from
the input, and this shapes the emerging grammar without regard to the final
result. That is, a child seeks cues and may or may not find them, regardless
of the sentences that the emerging grammar can generate; the output of the
grammar is entirely a by-product of the cues the child finds, and the success of
the grammar is in no way evaluated on the basis of the set of sentences that it
generates, unlike in the input-matching models.
So, for example, a child scans her environment for nouns and determiners.
She would find the nouns livre, idée, and vin and the determiners le, la, and mon,
if she lives in Paris; she finds book, idea, and wine, if she lives in Boston, and the
determiners the, that, and my. Our Bostonian would also find a determiner ’s,
8 Newport (1999) extends these ideas and Kegl, Senghas, and Coppola (1999) report on the spec-
tacular emergence of Nicaraguan Sign Language over the last twenty years.
which has no counterpart in French. She would find that this determiner
also assigns Case to a preceding DP; she would discover this on exposure
to an expression the player’s hat, analyzed partially as [DP [DP the player] D ’s
[NP hat]]. This partial analysis is possible, of course, only after the child has
identified player and hat as separate words, both nouns projecting an NP, etc. In
this way, the order in which cues are identified, the “learning path” (Lightfoot
1989), follows from dependencies among cues and from their internal
architecture.
Our Parisian would also find the cue I V, that is, instances of verbs occurring
in an inflectional position (cf. chapter 8). She would find this on exposure to
sentences like Elle lit toujours les journaux and understanding that les journaux
is the complement of the finite verb lit. In that case, since verbs are first generated
adjacent to their complement, there must be a partial analysis of (9.22a), which
represents the movement of lit out of the verb phrase, across the adverb toujours,
to the higher inflectional position. She would also find the cue when confronted
with and partially understanding an expression Lit-elle les journaux?, which
requires the partial analysis of (9.22b).
(9.22) a. Elle [I V lit_i] [VP toujours [V e_i] les journaux].
       b. [V Lit_i] [IP elle [I e_i] [VP [V e_i] les journaux]]?
Our Bostonian, on the other hand, would not be confronted with any such
expressions and would never be exposed to sentences which required postulating
a verb raised to a higher inflectional position. Since she never finds the cue I V,
her grammar would never have such a verb-raising operation.
So children scan the environment for instances of I V. This presupposes prior
analysis: children may scan for this cue only after they have identified a class of
verbs and when their grammars have a distinct inflectional position, I. The cue
must be represented robustly in the PLD. The approach is entirely I-language
based and children do not test or evaluate grammars against sets of sentences; in
fact, the set of sentences generated by the emerging grammar is quite irrelevant –
the chips fall where they fall.
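The contrast with the input-matching models can be put schematically as
follows (our sketch only: the cue labels, the representation of partial parses
as sets, and the robustness threshold are all illustrative assumptions).

    CUES = ("I-V", "D-'s")   # e.g., a verb in I; the genitive determiner 's

    def scan_for_cues(partial_parses, threshold=5):
        """partial_parses: the structures a child finds in analyzing her
        input, here just sets of cue labels.  A cue enters the grammar only
        if robustly attested (the threshold is an arbitrary stand-in)."""
        counts = {cue: 0 for cue in CUES}
        for parse in partial_parses:
            for cue in CUES:
                if cue in parse:
                    counts[cue] += 1
        return {cue for cue, n in counts.items() if n >= threshold}

    # A Parisian child's parses of "Elle lit toujours les journaux" and the
    # like contain "I-V"; a Bostonian child's never do, so her grammar never
    # has verb raising, whatever sentences it does or does not generate.
    parisian_input = [{"I-V"}] * 20
    bostonian_input = [{"D-'s"}] * 20
    print(scan_for_cues(parisian_input))     # {'I-V'}
    print(scan_for_cues(bostonian_input))    # {"D-'s"}

Nothing in this procedure ever compares the emerging grammar's output with
the input; the learner simply looks for the designated structures.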
Cue-based acquisition finesses the feasibility problems which arise for input-
matching models. We are free to postulate 100 or 200 cues, if that is what analy-
ses of different grammars require. That does not raise comparable feasibility
problems for the child learner. Our child would not be evaluating quadrillions
of grammars against sets of sentences, rejecting hundreds every waking second.
Rather, the child would be scanning her environment for the 100 or 200 cues,
much in the way that she scans her environment and identifies irregular past-
tense verbs like took, went, fit, spoke, and so on. That task may raise difficulties
that we do not now understand, but it does not raise the particular, devastating
feasibility problems of input-matching parametric systems.
The cue-based approach has been productive for phonologists concerned
with the parameters of stress systems (Dresher 1999) and it comports well with
work on the visual system, which develops as organisms are exposed to very
specific visual stimuli, such as horizontal lines (Hubel 1978, Hubel and Wiesel
1962, Sperry 1968). Current theories of the immune system are similar; specific
antigens amplify preexisting antibodies. In fact, this kind of thing is typical of
selective learning quite generally.
9 See also Jusczyk 1997 and Kuhl 1999 for further amplification and references.
9.5 Acquiring sound patterns
It has been known for many years that infants follow a regular progression
in their sound productions during the first year or so of life. Between one
and about five months, their vocalizations are often referred to as cooing: at
this stage, they produce sounds resembling vowels and begin to control the
process of phonation. Around seven months, they begin the stage known as
babbling: early on, this consists of repetitive productions that rhythmically
alternate consonants and vowels (“babababa,” “mamamama,” etc.). By around
eight months, the vowels produced in babbling begin to approach those specific
to the language spoken by their parents and others around them, and may
alternate different syllables rather than repeating the same one. The intonation
contours of babbling also begin to resemble those of their (soon to be) native
language, and by the age of about ten months, it is possible to differentiate
children on the basis of the linguistic environment in which they have developed.
This development is a crucial part of the infant’s attaining control of language.
We can see this in part from the behavior of congenitally deaf babies: up until
the onset of babbling, their vocalizations are entirely comparable to those of
hearing babies, but deaf babies do not babble. Around the age of seven months
or so, their vocalizations diminish, and do not reemerge until later, at a point
where they are dominated by syllables with labial consonants that the baby can
see how to pronounce.
The absence of vocal babbling in deaf infants, however, does not mean that
this stage of language development is absent. For deaf children raised among
signing adults, there is indeed a linguistic environment: it is simply in another
modality from that of hearing children. And it has been shown (Petitto and
Marentette 1991) that these babies do indeed engage in manual “babbling” that
evokes the constituent elements of signed language, a kind of manual activity
that is qualitatively different from that of other children and that serves the same
sort of function of attunement to a linguistic world as oral babbling does for the
hearing child.
By around 10–15 months, children have arrived at a selection of vowel and
consonant types appropriate to their native language. While babbling may per-
sist, in the production of nonsense repetitions (generally with appropriate sen-
tential intonation), the first stable words begin to appear by the end of the first
year, and infants now come to use a consistent phonetic form to refer to an
object. Around 20–24 months, when most (though by no means all) children
have a vocabulary of roughly 250 to 300 words, they begin to combine words
in meaningful ways and produce their first sentences.
A child’s early productions are of course available for observation, and the
path of development sketched above has been documented for some time. It is
naturally rather harder to study the sequence of events in the development of
perception, since we cannot directly observe what is going on in the mind of
the prelinguistic child. Recent years have seen the emergence and refinement
of experimental methods for probing infants’ perception of speech.
10 For a survey of the classic cases relevant to this issue, and their interpretation, see Curtiss 1988.
On the other hand, “parrots seem to have achieved a similar end by their own
neuroanatomically distinct neural system, apparently independently evolved”
(Marler 1999, p. 296). The neural organization in hummingbirds shows some
initial similarities to that of oscines, but little evidence has been collected on
these species.
Song may range from a simple series of a few more or less identical notes
through long arias that may last 10 seconds or more. The difference between
songs and calls is only in part a categorial one. Marler (1999, pp. 295f.) describes
songs (as opposed to calls) as
especially loud, longer in duration than calls, often highly patterned, with a variety of
acoustically distinct notes . . . often a male prerogative, with many functions, the most
obvious of which are signaling occupation of a territory and maintenance of sexual
bonds. Songs are sometimes seasonal and sometimes given year round . . . Some learned
11 There is a vast literature on song and other acoustic communication in birds, including the
development of this ability as a function of differences among species. Our presentation here
relies especially on work by Peter Marler (1970, 1991, 1999, and elsewhere), along with other
research represented by chapters in Kroodsma and Miller 1996. Marler has explored the parallels
(and differences) between the development of song in birds and of language in human infants
in great detail, and his work has given rise to a great deal of further exploration of these issues.
birdsongs are relatively simple, on a par with those that are innate. Other learned songs
are extraordinarily complex, with individual repertoires numbering in the tens, hundreds,
and in a few cases, even in the thousands.
Songs are quite distinctive from one species to another, of course. The song
itself is made up of a number of separate notes, of different types. These occur
in a particular sequence, and the sequence is essentially the same across repeti-
tions. These matters are important: female song sparrows, for instance, respond
preferentially to songs that are (a) composed of “song sparrow” notes, and
(b) follow “song sparrow” patterns. Experiments show that female receptive-
ness is sensitive to both of these dimensions.
In fact, as Marler observes, the same bird will typically have a repertoire of
several different songs (two to ten, or even many more in other species), gen-
erally similar but distinct. All of these songs serve the same purpose, however:
to claim territory. Different songs do not convey different messages. Females
appreciate the variety, though: diversity of the male’s repertoire helps attract
mates, even though it cannot be shown to be linked to any other objectively
valuable genetic characteristic.
Songs often – maybe usually – display differences of dialect. That is, there
may be local variations that characterize the song. These are not genetic: if you
move a baby bird into a different area, he will learn the local dialect. Females
may even prefer locally appropriate variation, providing evidence that although
in most species they do not sing, females do some song learning too.
The role of learning is quite diverse across species. In some species (e.g.
cuckoo), even a rather complicated song is clearly innate, since it is sung without
learning. This is adaptive for this bird, since cuckoos typically lay their eggs in
other birds’ nests, with the result that the babies would not have other cuckoos
around to serve as models for song learning. In other species, though, learning is
definitely involved. This may consist in identifying some particular conspecific
individual’s song and copying it, perhaps picking up the right pattern from
within a range of possible choices (chaffinch), perhaps relatively free learning
(bullfinch).
The course of learning seems to involve four periods.12 First is the acquisition
of a song model from experience during days 15–35 after birth. The bird does
not sing yet, but it is during this time that he is listening – and song models heard
at this time are internalized and saved. This is followed by a period in which
the bird produces subsong, from roughly 25–40 days, consisting of relative
soft, broad band, unstructured sounds. This stage is thought to be the process by
which a young bird calibrates his vocal instrument, akin to babbling in human
infants. At about 35 to 80 days, the bird begins to produce plastic song, and
12 The timings cited here for these periods are for white-crowned sparrows, the bird Marler studied
first in detail. Other species will vary somewhat from these precise ages, while following the
same overall sequence of development.
this period is marked by the gradual approximation of the young male’s song
output to the stored model(s). Finally, at about 90 days, the adult song is fixed
(or “crystallized”) in its permanent form. In some species, several different
songs are sung during plastic song and all but one are dropped at the stage of
crystallization.
Much of this is actually highly parallel to human language acquisition, given
the fact that the bird only has to learn to speak, as it were, and not to say anything:
he has to develop a command of phonology, but not syntax. The stages of song
learning, and the role of a critical (or “sensitive”) period (during which input
has to be available, or else song will not develop normally) are just like what
we find in human infants. Quite clearly, a given bird has a specific range of
systems that it is capable of learning, providing an obvious parallel with the
role of UG in determining the range of human languages that are accessible to
the child. The bird does not have to learn how song works: it only has to learn
which song to sing, within a narrowly constrained range.
Similarly, we can note that at least some birds, during the “plastic song”
phase, produce song elements that are not the same as any they have actually
heard, but which still fall within the range of possible song for that species.
This is comparable to the fact that children’s “errors” during language learning
correspond to different possible grammatical systems, at a stage where they are
still working out just what system the language they are learning instantiates.
In birds, these “errors” may well persist in adult song as creative innovations
(a form of “sound change”?).
This sequence shows us how a particular kind of learning takes place. It
seems quite incontrovertible that this learning sequence is determined by the
bird’s biology: vary the species, and the range of systems that can be acquired
changes, regardless of the nature of input. Song sparrows cannot learn to be
bluebirds, although they can learn to sing somewhat like swamp sparrows
(a closely related species with a rather different song) if that is the only model
available.
While the localization of human language functions in the brain is known
only sketchily, as we will see in chapter 10, the corresponding issues in the
neuroanatomy of birds are understood in much greater detail. The control of
song is centered in connections among several specific, neuroanatomically
distinct areas (especially those known as HVc, MAN, and RA). In song
birds, there are basically four functions that are subserved by this specialized
apparatus:
13 Birds do not have the kind of contralateral organization that connects the right side of the human
body to the left hemisphere, and vice versa.
once song is crystallized. This shows that specialized brain physiology is inti-
mately connected with the learning process, which is of course related to the
notion that song learning (though not the specific song) is innate.
While human infants display a relatively uniform pattern of linguistic devel-
opment, different species of birds behave somewhat differently with respect to
learning. In most species (e.g., zebra finch), song is learned once, in the first
year, and stays constant through the rest of life. In other species, though, new
songs are learned each year: this is the case with the canary. It turns out that
when we look at the birth and death of neurons in the song-relevant parts of
the bird brain, cell birth and death are associated with song learning and song
forgetting, respectively. Interestingly, when we compare canaries with zebra
finches, we find that neurogenesis occurs in close-ended learners (that is, those
who learn their song(s) once and for all) as well, but that, in contrast to open-ended
learners, the process is generally arrested after the first year of life. These observations
provide a neurological basis for the observation that learning is associated with
critical or sensitive periods, and that the timing of these is a consequence of
physical changes that play themselves out in the stages of a bird’s maturation.
No one would seriously doubt that the control of birdsong is largely organized
as a function of the bird’s neuroanatomy, and thus of its biology. In some birds,
the whole development is innate, since the bird can come to sing its song
with little or no environmental input. In the birds we have been discussing,
though, the song control system develops on the basis of an interaction with
data provided by the environment. The productivity of this interaction is rather
precisely dependent on physical changes in the bird’s neuroanatomy, changes
that are clearly controlled by its specific genetic program.
All of this is strongly consistent with the picture of human language learning
as similarly driven by human biology: normal language acquisition in humans,
also, takes place preferentially during a specific stage of maturation. But just as
what a bird is capable of learning is a function of its species-specific biology,
so also what we are capable of learning as a first language is undoubtedly
determined by our genetic program. Our language organ is distinctly biological
in nature.
9.6 Conclusion
There is, of course, much more to be said about grammars and their acquisition,
as well as the development of phonology (including more detailed aspects than
we have attended to above); and there is an enormous technical literature.
Here we have tried to show how the biological view of grammars, focusing on
the internal representations occurring in individual brains, influences the way
one studies the acquisition of linguistic knowledge in young children. In this
connection, we have outlined a rigorously I-language-based approach.
10 The organic basis of language
Our ability to speak and understand a natural language results from – and is made
possible by – a richly structured and biologically determined capacity specific
both to our species and to this domain. In this chapter we review arguments
that show that the language faculty is a part of human biology, tied up with the
architecture of the human brain, and distinct at least in significant part from
other cognitive faculties. We also discuss some of the work that has tried to link
the language organ with specific brain tissue and its activity.
Previous chapters have explored the structure of various components of our
language organ, and some aspects of the course by which that structure arises.
Some component of the mind must be devoted to language, and in its original
state (determined by Universal Grammar (UG)), prior to any actual linguistic
experience, it seems predisposed to infer certain quite specific sorts of system
on the basis of limited and somewhat degenerate data. This is what we mean
when we say that our language organ can be described by a grammar, and
the shape of particular grammars is determined by the system of UG as this
interprets the primary linguistic data available during the period of growth of
the language organ.
Thus far, our description is largely an abstract or functional one: that is, it
does not depend on the specific properties of the physical system that realizes
it. For a parallel, consider the nature of multiplication. We can characterize the
function of multiplication over the natural numbers in terms of some general
properties (commutativity, associativity, etc.), together with some specific re-
sults. Any computation that produces those results, consistent with the general
properties of multiplication, counts as multiplication: repeated addition, binary
shift-and-add strategies, the kind of logarithmic addition that used to be imple-
mented on slide rules, etc. Multiplication remains the same function, regardless
of the algorithm by which we compute it.1
Suppose we take a specific algorithm, for concreteness’ sake – perhaps the
standard one we learn in grade school, by which we carry out multi-digit
1 The discussion here is based on the kind of analysis of cognitive functions proposed by David
Marr (1982).
multiplications one place at a time, with carrying, etc. That algorithm seems
quite clearly specified, but in fact it can be implemented in various ways: with
the paper and pencil technique we learned in school (at least prior to the ubiquity
of calculators), on mechanical adding machines, old-fashioned moving-wheel
adding machines, an abacus, digital computers, etc. The inner workings of all
of these devices differ in various ways – even specifying the algorithm does not
tell us exactly how the system does it, at the level of implementation.
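The point about the same function being computed by different algorithms can
be made concrete (a sketch of ours, not from the text): each of the following
computes multiplication over the natural numbers, yet their inner workings
differ.

    def mult_repeated_addition(a, b):
        total = 0
        for _ in range(b):       # add a to itself b times
            total += a
        return total

    def mult_shift_and_add(a, b):
        total = 0
        while b:
            if b & 1:            # low bit of b set: include this copy of a
                total += a
            a <<= 1              # shift: a doubles ...
            b >>= 1              # ... while b halves
        return total

    assert mult_repeated_addition(6, 7) == mult_shift_and_add(6, 7) == 42

At the level of the function computed, the two are indistinguishable; they
differ only at the algorithmic level, and each algorithm could in turn be
realized by very different physical devices.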
Much of what we have seen in previous chapters concerning language remains
at the functional level. We can determine properties of languages, often very
abstract and surprising ones; and we can establish the properties which UG
must have in order to allow the learner to establish a particular instance of the
“language function” on the basis of the kind of data available, but there are
myriad ways in which this function could be “computed” and (at least as far
as we know) few limits in principle on the kind of mechanism with which that
computation could be carried out.2 Silicon-based forms of artificial life (such
as contemporary digital computers) cannot at present fully replicate human
knowledge of language, let alone the acquisition of that knowledge, but we
have no particular reason to believe that the existing limitations reflect some
special property of the stuff of which we (as opposed to our workstations) are
made.
Nonetheless, when we ask how our knowledge of language is implemented,
there is some fact of the matter: that is, there is some physical aspect of human
beings that realizes the knowledge in question, and something that renders
this knowledge accessible in the observed way in the cases of real, physical
human beings. We maintain that there is no serious alternative to the notion of a
language organ: that is, a highly specialized aspect of our cognitive organization
which is common to all of us, particular to humans, specialized for the tasks it
performs (i.e., specific to the domain of language), and determined by human
biology – specifically including (though not limited to) the organic structure of
the human brain.
A commonly posed alternative to this position is the notion that language
is a product not of human biology but of human culture. On that view, lang-
uage might be seen as something that develops within every human society,
for reasons having to do with the imperatives of social interaction, but without
further determination beyond the constraints imposed on effective communi-
cation in naturally occurring human societies. Every known human society has a language.
2 In the terms above, the study of the “algorithms” by which humans carry this out is referred to
as the study of language or speech processing, and forms a part of the discipline of psycho-
linguistics. We have said virtually nothing about processing in this book, which is not at all meant
to deny its significance. There are many different views of language processing that could all
correspond to the same picture of the properties of the language organ in the sense we have been
pursuing; and furthermore, for any particular theory of processing, there are in principle many
ways in which it could be carried out by actual physical systems.
One might wonder, for instance, whether the hand should count as
an organ or whether this designation should be reserved for one of its fingers. If
we think of an organ as a distinguishable aspect of an organism’s structure, in-
trinsic to and determined by its biological nature and implementing a particular
function, then the human capacity for language can be viewed in those terms.
5 See Klima and Bellugi 1979, Perlmutter 1991 and a rich technical literature. Note that there
is considerable controversy about the extent to which the input to which the apes in these
experiments were exposed actually constituted a natural language like ASL, as opposed to a
system that lacked essential properties of ASL.
6 For discussion of this work during the period when it was particularly prominent, see Petitto and
Seidenberg 1979, Terrace et al. 1979, Wallman 1992. Unfortunately, the negative conclusions
that arose from careful examination of these experiments held little appeal for the general public,
and the impression has gradually grown that some higher apes have been successfully taught
to use a human natural language. Despite the fascinating abilities that these apes have in fact
displayed, such a conclusion is simply not correct. See Anderson (forthcoming) for review of
these and other issues from the more general perspective of animal communication systems.
One might, of course, reject the validity of this analogy, and say that humans
lack certain skills of bats or electric fish because they lack the appropriate
sensory organs. But what, after all, is an “appropriate sensory organ”? Precisely
some bit of tissue which is specialized, as a result of the animal’s biological
organization, to be sensitive to certain environmental events. But that is exactly
what we claim for the language organ: it is a biologically determined aspect of
certain tissue (primarily in the brain), rendering it uniquely sensitive to linguistic
events in the environment, and allowing the development of a highly specialized
capacity as a consequence of that environmental interaction.
The importance of quite species-specific determination in these matters cannot
be overstated: we must not imagine that an animal’s capacities develop
simply through the application of very generally defined cognitive systems to
the problems posed by its Lebenswelt. Many animals have noses, and olfac-
tory cortex, but no matter how much a man may desire truffles, he cannot find
them by their scent without the aid of his dog or pig. Similarly, non-human
primates certainly have ears to hear with and (with more qualification) a vocal
tract capable of producing sound; or hands and eyes capable of producing and
identifying signs; but this does not appear to endow even the cleverest of them
with the capacity for human language.
Of course, if one defines “language” generally enough, perhaps simply as
“structured communication” or the like, there is every reason to think that a
vast range of organisms display something of the sort, but we have seen in
earlier chapters that the human language organ has a much more specific and
substantive content than that – a content which in the final analysis has no
significant parallels in any other species.
10.2 Language is a function of the brain
Physicians noticed very early the connection between
brain injury and the fact that the affected individual suffered a loss of speech.
In later theorizing, though, the relation of the brain to cognition (including lan-
guage) became somewhat more obscure: Aristotle, for instance, taught that the
brain was really a large sponge, whose function was to serve as a radiator to
cool the blood.
Only fairly recently (in historical terms), with the development of good
microscopes – and then later, with the development of more elaborate imag-
ing facilities – did it become possible to see enough of the structure of the
brain as an organ to develop any kind of coherent picture of its organiza-
tion and functions. The notion that the brain, and the nervous system more
generally, is a collection of cells that communicate with one another at spec-
ific connection points (the neuron doctrine) dates only from the nineteenth
century.
On a larger scale, there have been two distinct views of the brain that have
been in contention for much of recent history. One of these has its modern
origins in the theories of Franz Gall. Gall believed that the brain is composed
of a large number of very specific faculties, each specialized for a very limited
sort of function. He offered specific charts of this kind of organization (see
figure 10.1).
Gall also held the view that exercising a particular mental faculty caused
the corresponding part of the brain to enlarge, pressing against the skull and
producing corresponding bulges (or, in the case of underutilized faculties, de-
pressions). The result was supposed to be palpable irregularities in the surface
of the skull, which an appropriately trained interpreter could use as indices of
personality traits. That gave rise to the pseudo-science of phrenology, which
tended to give “faculty psychology” a bad name from which it has still not
completely recovered. There is no logical necessity, however, to the connection
between this sort of thing and the notion that the mind (as represented in the
brain) has a number of distinct components, modules, or faculties . . . among
them, the language organ.
At the beginning of the nineteenth century, the neurologist Marie-Jean-Pierre
Flourens (1824) performed experiments on animals in which he excised parts
of their brains and then looked to see what specific abilities were affected. By
and large, what he found was that all of his experimental subjects were affected
globally: that is, rather than losing just one particular function, Flourens’ ex-
perimental subjects were all turned into vegetables. From this he concluded
(Flourens 1846), in explicit reaction against Gall’s views, that cognitive capac-
ities are not localized in the brain, but rather distributed globally, so that injury
to any part results in a general degradation. This view that the brain represents
a single very general faculty, rather than a lot of individual and specific ones, is
sometimes called the “aggregate field” view of the brain: the view caricatured
by Pinker (1994) as that of the “brain as meatloaf.”
Figure 10.1 The personality organs of the human brain, according to Gall
and his followers Johann Spurzheim and George Combe
While the aggregate field view seems somewhat implausible in many ways,
it is quite attractive on one particular notion of the nature of human beings. If
we view most higher mental faculties as actually properties of an incorporeal
soul, or something of the sort, it seems natural to say that the brain is just a
large, functionally amorphous structure that serves as the locus for the effects
of the soul on the body (and perhaps vice versa), but nothing more precise
than that. If, on the other hand, we hold that specific faculties are anatomically
localized, this increases the extent to which we see those faculties as grounded
in the properties of specific material (brain tissue) in the physical world; and
as a result, the materialist view of the basis of the mind becomes much more
congenial. Indeed, in 1802, Gall was forced by conservative Catholic authorities
and the Habsburg emperor to stop lecturing in Vienna (and soon after, to
leave Austria) precisely because he was seen to be “championing materialism,
atheism, and fatalism bordering on heresy” (Finger 2000, p. 124).
In the late nineteenth and early twentieth century, however, an accumulation
of results suggested that the localist view is correct, at least in part. As people
looked into the question in more detail, it became clear that very specific in-
juries can result in rather specific functional impairments, rather than general
degradation (as in the case of Flourens’ animal subjects). And in this connection,
the study of language has played a central role.
Studies of deaf signers show that they use the
same parts of their brains for language as speakers. Sign aphasias are entirely
comparable to conditions affecting spoken language, and we can find evidence
that the linguistic use of visual information is the responsibility of different parts
of the brain from its non-linguistic use (Poizner, Klima, and Bellugi 1987). ASL
is produced and perceived in the visual spatial modality, as opposed to spoken
languages, and right hemisphere lesions often produce marked visuospatial
deficits. Nonetheless, signers (like speakers) with left hemisphere strokes often
display significant problems with language, while those with right hemisphere
strokes display essentially normal (signed or spoken) language in the presence
of such spatial deficits as left hemi-neglect (essentially ignoring the existence
of the left half of their visual field). We can thus see that it is language and not
(just) speech which is primarily based in the left cerebral hemisphere.
However, this overall localization of language function in the left hemisphere
has been seen in recent years to be a significant oversimplification (Kosslyn
et al. 1999). Rather more language function can be identified in the right hemi-
sphere (by the kinds of arguments suggested above) than was once thought, and
there is considerable individual variation in these matters.
8 These differences are important in determining appropriate sites for surgery to ameliorate epilep-
tic conditions.
9 We certainly cannot cover all of this emerging field here, nor is it necessary to do so to make
our overall point about the biological nature of the language organ. The papers in Brown and
Hagoort 1999 provide a summary of recent views and results, in much greater depth.
learned about, say, the visual system by studying cats and monkeys: removing
selective bits of brain tissue and observing the resultant deficits, implanting
electrodes directly into individual neurons in the intact brain and measuring
its activity, disturbing neural development in various ways and observing the
results, etc. We have learned much about these systems in animals evolutionarily
close to us, and our understanding of the role of various cortical areas in visual
processing is relatively precise. These animal models, in turn, have generally
turned out to be valid in their essentials for the human visual system as well.
The corresponding studies have not been carried out directly on humans, for
obvious moral and ethical reasons, but it appears we can largely dispense with
them, given what we know about other animals.
When we come to language, however, this path to knowledge is closed to us,
for the precise reason that no other animal besides homo sapiens possesses a
language organ, so far as we know. To understand the precise role of the brain in
language, we must start from the very beginning – and our options are relatively
limited, for the same reasons we cannot study the human visual system in the
kind of detail the monkey visual system has been explored. With few excep-
tions, we are limited to the evidence provided by non-intrusive experimental
methods, or to interpreting what we might call the “experiments of nature”: the
consequences of localized brain damage, as caused by stroke, trauma, or other
accidents.
Broca went on to examine a series of similar patients, whose damage fell in the
same overall area (though it appears the actual lesions were rather diverse). This
led Broca to announce that “Nous parlons avec l’hémisphère gauche!” More
specifically, a particular kind of language deficit could now be associated with
damage to a relatively specific part of the brain, now referred to as “Broca’s
area” (though see Uylings et al. 1999 for some of the difficulties in defining
that notion more precisely in anatomical terms).
This result provided important impetus for the development of brain science,
and in the 1870s others discovered that electrical stimulation of the brain could
result in movements of very specific parts of an animal’s anatomy. In fact,
it was possible to correlate particular locations in the cortex with particular
motor activities, resulting in a topographic map of the primary motor strip that
shows an obvious resemblance to the body – with interesting differences, largely
related to the fact that some small body parts (the vocal organs, e.g.) require
disproportionately large amounts of controlling tissue, while other body parts
(the trunk and abdomen, e.g.) require less than their relative size would suggest.
A familiar and amusing exercise is to relate the brain tissue subserving
primary motor control with the parts of the body, resulting in a sort of
“homunculus” analogy (cf. figure 10.2).
More recent research suggests there are actually several of these “body maps”
in the brain, rather more diffused, and with some variation in structure in differ-
ent instances. Similar maps link areas of brain tissue bit by bit with the visual
field, with the frequency response range of the cochlea, etc. Neuroscientific re-
search is usually based on the view that activities are rather precisely localized.
And of course, once the plausibility of this position was well established, many
other researchers got busy looking for the specific localization of a variety of
functions – such as language.
In the late 1870s, Carl Wernicke proposed another rather specific relation
between brain lesion and functional deficit. This involved a class of patients
whose production is relatively fluent but who have a deficit in comprehension.
This pattern was associated with left posterior temporal lobe lesions, affecting
the area now known as Wernicke’s area. Wernicke noted that Broca’s area is
quite near the motor areas responsible for control of speech articulators, and
“Broca’s aphasics” have trouble producing fluent speech. “Wernicke’s area,”
on the other hand, is close to areas of the brain responsible for auditory
comprehension, and “Wernicke’s aphasics” have comprehension difficulties.
Wernicke therefore suggested that language production is controlled by Broca’s
area, and comprehension by Wernicke’s area.
But Wernicke’s main contribution was the suggestion that only some of
the brain basis of cognitive structures such as language is to be found in the
localization of specific functions. Much of the way cognitive systems work,
according to him, involves interconnections among these specific functions.
Thus, on this view both the localists and the aggregate field people were partly
right: some functions are local, but mental functioning involves making overall
patterns of connections among these functions.

Figure 10.2 Topographic map of the body musculature in the primary motor
cortex (from Purves et al. 1997, p. 316)
On this basis, Wernicke predicted another kind of aphasia, one that would
be based not on a lesion in Broca’s area or Wernicke’s area, but rather on
impaired connections between the two. Such a patient ought to have relatively
spared production and comprehension, but to be very bad at tasks that involved
connecting the two, such as repetition. Patients displaying just this sort of
deficit (“conduction aphasia”) were later indeed found, providing one of those
instances, all too rare in science, of the confirmation of an empirical prediction
made on theoretical grounds.
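Wernicke's reasoning is worth making concrete. The following toy sketch is our own illustration, not a clinical model: two localized centers plus the pathway linking them (classically identified with the arcuate fasciculus), where repetition – but not production or comprehension alone – requires the whole circuit.

    # A schematic sketch (our illustration, not a clinical model) of the
    # classical centers-plus-connections picture. "Lesioning" only the
    # pathway spares production and comprehension but breaks repetition:
    # conduction aphasia.
    intact = {"broca": True, "wernicke": True, "pathway": True}

    def produce() -> bool:        # fluent speech requires Broca's area
        return intact["broca"]

    def comprehend() -> bool:     # comprehension requires Wernicke's area
        return intact["wernicke"]

    def repeat() -> bool:
        # Repetition routes heard speech from the comprehension center to
        # the production center: both centers and the link are needed.
        return intact["wernicke"] and intact["pathway"] and intact["broca"]

    intact["pathway"] = False     # a purely "disconnecting" lesion
    print(produce(), comprehend(), repeat())  # True True False

On this toy picture, as on Wernicke's, the deficit is a property of a connection rather than of any localized function.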
How much can we refine and extend the information provided by cases
of language pathology in order to derive a more precise picture? As one
might expect, clinicians have gone far beyond the classification of “Broca’s,”
“Wernicke’s,” and “conduction” aphasics. Many clinical “syndromes” have been
identified, and one might hope to look to the locus of brain damage in each case
to get a sense of what goes on where.
Unfortunately, as more and more patients are documented, it becomes clearer
that each is an individual, and the deficits patients show are subtly specific.
In part, the interpretive difficulties arise from the fact that lesions are rarely
precise, and a variety of functions are inevitably implicated when a large area
of brain tissue is affected. But some neuropsychologists argue that so much
individual variation exists among brain-damaged individuals that the only way
to learn about cognition is to study each one as an individual, sui generis.
Lumping patients with similar but distinct lesions together and averaging over
them – e.g., treating a population of “Broca’s aphasics” as being comparable –
necessarily loses important detail on this view. Categories like “Broca’s” or
“agrammatic” aphasia may conceal as much as they reveal, by averaging out
the differences among distinct patients.
Indeed, aphasic symptoms can be exquisitely detailed. This is particularly
obvious in the anomias: patients can lose the ability to name quite precise
categories, such as animate objects, abstract objects, plants . . . one of us recalls
once hearing the neuropsychologist Freda Newcombe describe a patient who
had lost the ability to name green vegetables, but nothing else.
Given the diffuse and contradictory evidence from attempting to correlate
anomias with lesion sites, it is vanishingly unlikely that this will ever lead to
the discovery of “the green vegetable center” in the brain; but even so, we learn
something more general from the cumulative effect of such cases. They suggest
that our knowledge of words is organized in such a way that semantically similar
items share some brain mechanisms.
Another bit of evidence in the same direction is provided by Tourette’s Syn-
drome patients. These people display a variety of involuntary tics, including
occasional episodes in which they suddenly shout out strings of blasphemies
and obscenities. Since there seems to be no reason to assume that these
outbursts have any specific semantic content, it would appear that the patient’s
brain organizes obscenities together as a class, just as it groups items similar
in meaning – at least if we take Tourette’s Syndrome to be a localized brain
disorder of some sort.
A major problem in studying the brain through the deficits of aphasic patients
is that the ability to localize is rather poor, since most aphasias result from
rather comprehensive lesions. Also, the system as we study it in an aphasic is
an abnormal one: perhaps just the normal system with something missing (as the logic
of most research seems to assume), but more likely one with something missing and
something else added in the mind/brain’s attempt to compensate for the deficits. It
would clearly be much better if we could study people without brain damage,
and in ways that let us see finer spatial detail.
In summary, evidence from brain lesions suggests that a number of important
language functions are largely based in certain areas of the left hemisphere,
especially around the Sylvian fissure. There is good reason to believe that
subcortical structures have considerable importance as well, but these have been
much less accessible to study and so are much less well understood. There seems
little hope that we can go much further than this toward localizing components
of the language organ on the basis of deficit studies, however.
Fortunately, techniques developed in recent years make it possible, at least
under some conditions, to observe the normal brain at work: some image the
hemodynamic (blood flow and metabolic) correlates of neural activity, while
others record its electromagnetic signature directly. We discuss the potential
results of these two sorts of study in turn.10
10 See Rugg 1999 for a good survey of these methods, their possibilities and their limitations.
PET (positron emission tomography) images blood flow in the brain by tracking
a radioactively labeled tracer; because the signal accumulates only slowly, such
images reflect activity in an overall fashion that averages over long periods of time and many events. Jaeger
and her colleagues (1996) studied the processing of regular vs. irregular verbs
in English using PET, and concluded that these are stored in the brain in ways
that imply different cognitive mechanisms. Such studies are still a matter of
static pictures, though: the experiment involves having several subjects each do
the same thing over and over, which yields a single picture that represents the
whole period. The possible inferences to the functions of the language organ are
limited, at best. Poeppel (1996) discusses a series of PET experiments that pur-
port to identify brain regions active in phonological processing, and concludes
that a combination of weak experimental designs and intrinsic limitations of
the method yields essentially no secure results.
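The contrast at stake in the Jaeger study can be stated schematically. The sketch below is our own; the verb lists and the rendering as a lookup table plus a default rule are illustrative assumptions, not anything from the study itself. The idea: irregular past tenses behave like stored lexical entries, while regular ones look like the output of a fully productive rule.

    # A toy rendering of the "dual-mechanism" idea: irregular pasts are
    # retrieved from an associative store; regulars are generated by rule.
    IRREGULAR_PASTS = {"go": "went", "sing": "sang", "bring": "brought"}

    def past_tense(verb: str) -> str:
        # Memory route: a stored irregular form pre-empts the rule.
        if verb in IRREGULAR_PASTS:
            return IRREGULAR_PASTS[verb]
        # Rule route: productive -ed suffixation (spelling simplified).
        return verb + "d" if verb.endswith("e") else verb + "ed"

    print(past_tense("walk"), past_tense("sing"))  # walked sang

If something like these two routes is right, one would expect them to be separable in the brain – which is what studies of the Jaeger type attempt to test.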
Another way of imaging blood flow is by the use of f(unctional)MRI: the
same method noted above, but with much more powerful magnets than those
normally used for static imaging of tissue. Somewhat paradoxically, the blood
draining from very active tissue is richer in oxygen than that draining from less
active tissue. The extent to which hemoglobin is affected by a strong external
magnetic field depends on its oxygen content, so these differences in metabolic
activity can result in an effect that can be measured from outside the head in
the presence of a powerful magnet.
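In a simplified sketch of the underlying physics – standard magnetic resonance theory rather than anything argued in this book – the recorded gradient-echo signal at echo time TE decays as

    S(\mathrm{TE}) \approx S_0 \, e^{-\mathrm{TE} \cdot R_2^{*}}

where the effective relaxation rate R_2^* increases with the local concentration of paramagnetic deoxygenated hemoglobin. Since activation raises blood flow more than oxygen consumption, deoxyhemoglobin is diluted in active tissue, R_2^* falls, and the measured signal rises slightly: the “BOLD” (blood-oxygen-level-dependent) effect.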
The images that are possible with fMRI have even better spatial resolution
than PET images, and since they can be acquired within about 100 milliseconds
for a single “slice,” their temporal resolution is better as well. In addition, since
they involve no radiation exposure, there is no resulting limit on the number of
images that can be collected from a single subject, thus avoiding the problems
of intersubject normalization and averaging.
Despite these advantages, fMRI is still some distance from providing a
satisfactory approach to cognitive events, because its temporal resolution is
still somewhat too coarse to allow us to capture the dynamics of what brain
tissue is doing at an appropriate level. It also presents some problems of its
own (extreme sensitivity to slight movements, non-uniform imaging of tissue
in different regions, high noise level of the apparatus) which limit its usefulness.
Nonetheless, as experimental designs are developed which allow the study of
more sophisticated cognitive functions (“event-related fMRI”), its virtues, especially
its spatial resolution, will contribute to studies of the localization of
language function in the brain.
One important problem with the majority of studies conducted using these
hemodynamic techniques is that they are designed by researchers who have
little or no familiarity with the concepts, methods, results, and research issues
of those working on language from a linguistic point of view. Most are based on
studies of individual words, rather than active use of language; they typically
involve visual presentation of written material, rather than the linguistically
more natural mode of aural presentation; and in general they make almost
no contact with the concerns of linguistic theory. Consider speech perception,
which involves transforming the acoustic signal into representations that can enter
into further computation. One might imagine that it would be fairly straightfor-
ward to identify the functional neuroanatomy of speech perception: it is tightly
linked to a single sensory modality,11 unlike higher-level lexical or syntactic
processes. However, it has proved surprisingly difficult to characterize. Poeppel
argues that the difficulty stems in part from the fact that the neural systems
supporting “speech processing” vary as a function of the task: the neural sys-
tems involved in discriminating syllables under laboratory conditions overlap
only partially with those involved in natural, on-line comprehension of spoken
language. Progress can be made in disentangling these results on the basis of
a variety of imaging techniques, but only in the presence of well-worked-out
theories of the cognitive functions involved. The images do not, as it were, bear
their interpretations on their sleeves. Similarly, Phillips (2001) discusses the
complexity of relating acoustic signals to higher-level lexical representations,
using electro-physiological results to refine theories of the abstract, functional
representations.
On a more cautionary note, let us not forget that none of these techniques
gives us a way to examine what neurons are actually doing: although we
talk about neurons “firing,” current flow is not the same thing as the complex
electro-chemistry of real neurons, where a variety of neurotransmitters with
diverse effects are in play. There is accordingly no reason to believe that, if the
functioning of the language organ is somehow reducible to (or representable
in terms of) the activity of neurons, blood or current flow is the right level of
analysis. And, so far, we have no way of getting information from outside the
head about these more detailed matters.
A more fundamental point here is that the representations provided by non-
invasive imaging techniques are not simply the unvarnished reality of brain
activity, but a significant abstraction from it: they represent some features of
these events (metabolic or electromagnetic correlates of the coordinated ac-
tivity of a large number of spatially associated neurons), but omit much else.
There is no reason a priori to believe that the specific abstractions provided by
these technologies will correspond in a particularly close way to the kinds of
abstraction we need to understand our functional descriptions of the language
organ. We have already seen in chapter 6, for example, that the dimensions of
a linguistically relevant (phonetic) representation of speech correspond only in
a rather indirect way to what is measurable in the acoustic signal. There is no
particular reason to expect that the fit between brain imaging data and other
11 This is true to the extent we disregard work suggesting that non-auditory evidence for the activity
of the speech apparatus – visual observation of the talker’s face, for instance, as in the classical
“McGurk effect” (McGurk and MacDonald 1976) – can also enter into our perception of speech.
And of course the kind of “analysis-by-synthesis” view of perception discussed in chapter 4
further complicates the extent to which we can expect speech perception to be simply a matter
of transforming the information provided by the peripheral auditory system alone.
aspects of language function need be any closer than this. We may, of course,
be able to make inferences about functional organization from evidence of this
sort – at least, if we have sufficiently articulated theories in both domains to
make it possible to confront each with the other. Even adopting a thoroughly
materialist view of the language organ as a cognitive function arising from the
activity of the brain and nervous tissue, though, we are hardly committed to the
proposition that the kinds of physical phenomena we can measure are exactly
and exhaustively coextensive with that function.
The conclusion to be drawn from the remarks above is that while we can be
quite sure that the language organ is a biologically determined aspect of human
physiology, and that it arises primarily as an aspect of the electro-chemical
activity of the brain (more specifically, in most individuals, of the left cerebral
hemisphere), we do not yet have techniques that would allow us to relate specific
cognitive events to specific neurophysiological (electro-chemical) events in
particular assemblages of neurons with any certainty. What evidence we have
suggests that the relevant events are rather complex and implicate substantial
regions of cortical and subcortical tissue, at least some of which subserves other
functions as well as language. And in any event, the cognitive interpretation of
that activity can only be offered in the presence of a fully articulated picture of
its functional nature: of the language organ, as we have interpreted that notion
in this book.
Just as intelligence and language are dissociable, so also is it possible to separate linguis-
tic ability and Theory of Mind, with autistic subjects lacking in the latter but (potentially,
especially in the case of Asperger’s Syndrome – see Frith 1991) language being retained
within normal limits. Some Down’s Syndrome children provide a contrary scenario,
with their Theory of Mind being intact, but their linguistic ability moderately to severely
degraded.
Sieratzki and Woll in press). Edwards and Bastiaanse (1998) address this issue
for some aphasic speakers, seeking to distinguish deficits in the computational
system from deficits in the mental lexicon.
We also know that focal brain lesions can result in quite specific language
impairments in the presence of normal cognitive abilities, and vice versa
(Caplan 1987). Friedmann and Grodzinsky (1997) argue that agrammatic apha-
sics may be unable to compute certain abstract structural elements (“functional
categories”), while Grodzinsky (2000) identifies much of agrammatism with a
disorder specifically impairing the computation of movement relations, local-
ized in the classical “Broca’s area.” Ingham (1998) describes a young child in
similar terms, arguing that she lacked one particular functional category.
This modular view runs contrary to a long tradition, often associated with
Jean Piaget, which claims that language is dependent on prior general cogni-
tive capacities and is not autonomous and modular (Piaget and Inhelder 1968,
Piattelli-Palmarini 1980). Such a claim is undermined by the kinds of dissoci-
ations that have been observed, however. Bellugi et al. (1993) have shown, for
example, that Williams Syndrome children consistently fail to pass seriation
and conservation tests but nonetheless use syntactic constructions whose ac-
quisition is supposedly dependent on those cognitive capacities. Clahsen and
Almazan (1998) demonstrate that Williams Syndrome children have good con-
trol of the rule-governed aspects of syntax and word formation, but are severely
impaired in certain irregular, memory-based functions, while SLI children dis-
play an essentially symmetrical pattern of affected and spared abilities. More
generally, language and other cognitive abilities dissociate in development just
as they do in acquired pathology (Curtiss 1988).
10.4 Conclusions
The psychologist Eleanor Rosch once said that she wanted her new field to be
empirical but not barbarically so. The key to successful scientific work is to
find a level of abstraction at which one can state interesting generalizations.
The point of our book has been to demonstrate that generative grammarians
have found an appropriate level of abstraction in the notion of I-language,
which has enabled them to understand the language faculty better. The details
of that abstraction make up the focus of empirical work of the kind that we have
described.
Does the appropriate level of abstraction incorporate the deletion operations
that we discussed in chapter 3? Perhaps yes, if what we sketched endures, as
ideas about constituent structure and the binding theory have endured now for
long periods. Or perhaps not. In fact, most likely not. As work progresses, our
ideas about deletion will probably come to be seen in a different light, related
to other ideas whose connections we do not now see, and the technical
machinery we now employ may come to look quite different.
In fact, people work at levels where they believe that they can make significant
progress. Strikingly, over the last forty years work on syntax and phonology has
blossomed and led to countless discoveries. A good case can be made that studies
of children’s language have blossomed more recently, say over the last ten years,
and that we are learning a great deal along the lines of what we described in
chapter 9. On the other hand, it is less clear that comparable productivity has
been achieved in studies of aphasia over recent years. All this can change:
aphasiologists may yet discover productive tools – we would bet on the
grammatical and acquisitional literature, though they may come from elsewhere –
which will enable them, over the next decade perhaps, to make discoveries
comparable to what has happened in syntax and language acquisition.
Hunches about future progress and the
discovery of useful tools guide people in figuring out what they want to study
in graduate school, as they prepare to undertake careers in the cognitive sciences.
So there are different levels at which one might work. E-language levels do
not look at all promising for work on the language faculty as an element of
human cognition. I-language approaches of the kind we have described have
been very productive, we contend. Brain-based approaches have so far not led
to much understanding of the language faculty, but there has been some progress;
now that we have better tools, it is likely that we shall learn much more in
this domain in the near future, and we may hope for some correspondence
between what we find about brain processes and the nature of I-language
structures and their acquisition.
What we seek quite broadly is a theory of the mind, seen as an element of
nature. Chomsky (2000, ch. 4) has construed the kind of work we have described
as a development of the rational psychology of the seventeenth century: there
are “principles or notions implanted in the mind [that] we bring to objects from
ourselves [as] a direct gift of Nature, a precept of natural instinct . . . common
notions [and] intellectual truths [that are] imprinted on the soul by the dictates
of Nature itself, [which, though] stimulated by objects [are not] conveyed” by
them (Herbert of Cherbury 1624/1937, p. 133). We try to discover the “principles
or notions implanted in the mind” that are a “direct gift” of nature. We begin
with common-sense formulations. If we observe that Juan knows Spanish, we
focus on a state of the world, including a state of Juan’s brain. He knows
how to interpret certain linguistic signals, certain expressions. We might try to
characterize that knowledge and to ask how his brain reached this state, how his
language organ developed in the interplay of nature and nurture. Inquiry leads to
empirical hypotheses about biological endowment, information derived from
the environment, the nature of the state attained, how it interacts with other
systems of the mind, and so on.
One conducts that inquiry as best one can, typically invoking deeper and
deeper abstractions as one moves beyond common-sense formulations.
References
Allen, C. 1997. Investigating the origins of the “group genitive” in English. Transactions
of the Philological Society 95. 111–131.
2002. Case and Middle English genitive noun phrases. In Lightfoot 2002. 57–80.
Anderson, S. R. 1974. The organization of phonology. New York: Academic Press.
1976. On the description of consonant gradation in Fula. Studies in African Linguistics
7. 93–136.
1978. Syllables, segments and the northwest Caucasian languages. In Syllables and
segments, ed. by A. Bell and J. B. Hooper. 47–58. Amsterdam: North-Holland.
1981. Why phonology isn’t “natural.” Linguistic Inquiry 12. 493–539.
1985. Phonology in the twentieth century. Chicago: University of Chicago Press.
1988. Objects (direct and not so direct) in English and other languages. In On lan-
guage: a Festschrift for Robert Stockwell, ed. by C. Duncan-Rose, T. Vennemann
and J. Fisiak. 279–306. Beckenham, Kent: Croom-Helm.
1990. The grammar of Icelandic verbs in -st. In Icelandic syntax, ed. by J. Maling and
A. Zaenen. Syntax & Semantics 24. 235–273. New York: Academic Press.
1992. A-morphous morphology. Cambridge: Cambridge University Press.
1993. Linguistic expression and its relation to modality. In Current issues in ASL
phonology, ed. by G. R. Coulter. Phonetics and Phonology 3. San Diego: Academic
Press.
2000. Reflections on “on the phonetic rules of Russian.” Folia Linguistica 34. 1–17.
forthcoming. Doctor Dolittle’s delusion: animal communication and the nature of
human language.
Anderson, S. R. and D. W. Lightfoot. 1999. The human language faculty as an organ.
Annual Review of Physiology 62. 697–722.
Andersson, A-B. and O. Dahl. 1974. Against the penthouse principle. Linguistic Inquiry
5. 451–454.
Aronoff, M. 1976. Word formation in generative grammar. Cambridge, MA: MIT Press.
1988. Two senses of lexical. Proceedings of the Eastern States Conference on
Linguistics 5. 13–23.
Baker, M. 2001. The atoms of language. New York: Basic Books.
Barbosa, P., D. Fox, P. Hagstrom, M. McGinnis and D. Pesetsky, eds. 1998. Is the best
good enough? Cambridge, MA: MIT Press.
Barbour, J. 2000. The end of time. Oxford: Oxford University Press.
Bates, E. A. and J. L. Elman. 1996. Learning rediscovered. Science 274. 1849–1850.
Bauer, L. 1995. The emergence and development of SVO patterning in Latin and French.
Oxford: Oxford University Press.
Macdonell, A. A. 1916. A Vedic grammar for students. Oxford: Oxford University Press.
Manzini, R. and K. Wexler. 1987. Parameters, binding theory, and learnability. Linguistic
Inquiry 18. 413–444.
Marantz, A., Y. Miyashita and W. O’Neil, eds. 2000. Image, language, brain. Cambridge,
MA: MIT Press.
Marchand, H. 1969. The categories and types of present-day English word formation.
München: Beck. [Second edition.]
Marler, P. 1970. Birdsong and human speech: could there be parallels? American
Scientist 58. 669–674.
1991. Song-learning behavior: the interface with neuroethology. Trends in Neuro-
sciences 14. 199–206.
1999. On innateness: are sparrow songs “learned” or “innate”? In Hauser and Konishi
1999. 293–318.
Marr, D. 1982. Vision. San Francisco: Freeman.
Matthei, E. 1982. The acquisition of prenominal modifier sequences. Cognition 11.
301–332.
Matthews, G. H. 1955. A phonemic analysis of a Dakota dialect. International Journal
of American Linguistics 21. 56–59.
Mattingly, I. G. and M. Studdert-Kennedy, eds. 1991. Modularity and the motor theory
of speech perception. Hillsdale, NJ: Lawrence Erlbaum.
Mayeux, R. and E. R. Kandel. 1991. Disorders of language: the aphasias. In Principles
of neural science, ed. by E. Kandel, J. H. Schwartz and T. M. Jessel. 839–851.
New York: Elsevier.
McCarthy, J. J. 1981. A prosodic theory of non-concatenative morphology. Linguistic
Inquiry 12. 373–418.
McCawley, J. D. 1999. Why surface syntactic structure reflects logical structure as much
as it does, but only that much. Language 75. 34–62.
McGurk, H. and J. MacDonald. 1976. Hearing lips and seeing voices. Nature 264.
746–748.
Medawar, P. 1967. The art of the soluble. London: Methuen.
Monrad-Krohn, G. H. 1947. Dysprosody or altered “melody of language.” Brain 70.
405–415.
Morris Jones, J. 1913. A Welsh grammar. Oxford: Clarendon.
Mustanoja, T. 1960. A Middle English syntax. Helsinki: Société Néophilologique.
Nash, D. 1986. Topics in Warlpiri grammar. New York: Garland.
Nespor, M. and I. Vogel. 1986. Prosodic phonology. Dordrecht: Foris.
Newmeyer, F. J. 1998. Language form and language function. Cambridge, MA: MIT
Press.
Newport, E. L. 1999. Reduced input in the acquisition of signed languages: contributions
to the study of creolization. In DeGraff 1999. 161–178.
Niyogi, P. 2002. The computational study of diachronic linguistics. In Lightfoot 2002.
351–365.
Niyogi, P. and R. Berwick. 1997. A dynamical systems model for language change.
Complex Systems 11. 161–204.
Nunes, J. 1995. Linearization of chains and sideward movement. PhD thesis. University
of Maryland. College Park, MD.
Nunnally, T. 1985. The syntax of the genitive in Old, Middle, and Early Modern English.
PhD thesis. University of Georgia.
Ojemann, G., J. Ojemann, E. Lettich and M. Berger. 1989. Cortical language organization
in left, dominant hemisphere. Journal of Neurosurgery 71. 316–326.
O’Neil, W. 1978. The evolution of the Germanic inflectional systems: a study in the
causes of language change. Orbis 27. 248–285.
Orešnik, J. and M. Pétursson. 1977. Quantity in modern Icelandic. Arkiv för Nordisk
Filologi 92. 155–171.
Paradis, C. and D. LaCharité. 1997. Preservation and minimality in loanword adaptation.
Journal of Linguistics 33. 379–430.
Paul, H. 1880. Prinzipien der Sprachgeschichte. Tübingen: Niemeyer.
Payne, D. L. 1998. Maasai gender in typological and applied perspective. Read at 29th
Annual Conference on African Linguistics, Yale University.
Pepperberg, I. M. 2000. The Alex studies: cognitive and communicative abilities of grey
parrots. Cambridge, MA: Harvard University Press.
Perlmutter, D. M. 1991. The language of the Deaf. New York Review of Books 38(6).
65–72. [Review of Sacks 1989.]
Petitto, L. A. and P. F. Marentette. 1991. Babbling in the manual mode: evidence for the
ontogeny of language. Science 251. 1493–1496.
Petitto, L. and M. Seidenberg. 1979. On the evidence for linguistic abilities in signing
apes. Brain and Language 8. 162–183.
Phillips, C. 2001. Levels of representation in the electrophysiology of speech perception.
Cognitive Science 25. 711–731.
Piaget, J. and B. Inhelder. 1968. The psychology of the child. London: Routledge.
Piattelli-Palmarini, M., ed. 1980. Language and learning: the debate between Jean
Piaget and Noam Chomsky. London: Routledge and Kegan Paul.
1986. The rise of selective theories: a case study and some lessons from immunol-
ogy. In Language learning and concept acquisition: foundational issues, ed. by
W. Demopoulos and A. Marras. 117–130. Norwood, NJ: Ablex.
1989. Evolution, selection, and cognition: from “learning” to parameter setting in
biology and the study of language. Cognition 31. 1–44.
Pierce, A. 1992. Language acquisition and syntactic theory: a comparative analysis of
French and English child grammars. Dordrecht: Kluwer.
Pierrehumbert, J. 1990. Phonological and phonetic representation. Journal of Phonetics
18. 375–394.
Pierrehumbert, J. and D. Talkin. 1992. Lenition of /h/ and glottal stop. In Docherty and
Ladd 1992. 90–117.
Pinker, S. 1994. The language instinct. New York: William Morrow.
Poeppel, D. 1996. A critical review of PET studies of phonological processing. Brain
and Language 55. 317–351.
Poeppel, D. and A. Marantz. 2000. Cognitive neuroscience of speech processing. In
Marantz et al. 2000. 29–50.
Poeppel, D. and K. Wexler. 1993. The full competence hypothesis of clause structure in
early German. Language 69. 1–33.
Poizner, H., E. S. Klima and U. Bellugi. 1987. What the hands reveal about the brain.
Cambridge, MA: MIT Press.
Premack, D. 1978. Chimpanzee problem-solving: a test for comprehension. Science
202. 532–535.
1980. Representational capacity and accessibility of knowledge: the case of chim-
panzees. In Piattelli-Palmarini 1980. 205–221.
1990. Words: What are they, and do animals have them? Cognition 37. 197–212.
Prince, A. and P. Smolensky. 1993. Optimality theory: constraint interaction in generative
grammar. Manuscript, Rutgers University and University of Colorado.
Purves, D., G. J. Augustine, D. Fitzpatrick, L. C. Katz, A.-S. LaMantia and J. O.
McNamara, eds. 1997. Neuroscience. Sunderland, MA: Sinauer.
Rask, R. K. 1818. Undersøgelse om det gamle Nordisk eller Islandske sprogs oprindelse.
Copenhagen: Gyldendals.
Reinhart, T. 1976. The syntactic domain of anaphora. PhD thesis. Massachusetts Institute
of Technology.
Rizzi, L. 1982. Violations of the wh-island constraint and the subjacency condition. In
Issues in Italian syntax, ed. by L. Rizzi. 49–76. Dordrecht: Foris.
1990. Relativized minimality. Cambridge, MA: MIT Press.
Roberts, I. 1993. A formal account of grammaticalization in the history of Romance
futures. Folia Linguistica Historica 13. 219–258.
Ross, J. R. 1967. On the cyclic nature of English pronominalization. To honor Roman
Jakobson. The Hague: Mouton.
1969. Auxiliaries as main verbs. In Studies in philosophical linguistics, ed. by
W. Todd. Vol. I. Evanston: Great Expectations.
Rugg, M. D. 1999. Functional neuroimaging in cognitive neuroscience. In Brown and
Hagoort 1999. 15–36.
Sacks, O. 1989. Seeing voices. Berkeley: University of California Press.
Sapir, E. 1921. Language. New York: Harcourt, Brace & World.
1925. Sound patterns in language. Language 1. 37–51.
1929. The status of linguistics as a science. Language 5. 207–214.
1938. Why cultural anthropology needs the psychiatrist. Psychiatry 1. 7–12.
1994. The psychology of culture. Berlin: Mouton de Gruyter. Reconstructed and edited
by Judith T. Irvine.
Savage-Rumbaugh, E. S. 1986. Ape language: from conditioned response to symbol.
New York: Columbia University Press.
1987. Communication, symbolic communication, and language: reply to Seidenberg
& Petitto. Journal of Experimental Psychology: General 116. 288–292.
Savage-Rumbaugh, E. S., K. McDonald, R. A. Sevcick, W. D. Hopkins and E. Rupert.
1986. Spontaneous symbol acquisition and communicative use by pygmy
chimpanzees (Pan paniscus). Journal of Experimental Psychology: General 115.
211–235.
1993. Language comprehension in ape and child. Monographs of the Society for
Research in Child Development 58. 1–221.
Savage-Rumbaugh, E. S., S. G. Shanker and T. J. Taylor. 1998. Apes, language, and the
human mind. New York: Oxford University Press.
Schleicher, A. 1848. Über die Bedeutung der Sprache für die Naturgeschichte des
Menschen. Weimar: Hermann-Böhlau.
1861–62. Compendium der vergleichenden Grammatik der indogermanischen
Sprachen. Weimar: Hermann-Böhlau.
1863. Die Darwinische Theorie und die Sprachwissenschaft. Weimar: Hermann-
Böhlau.
Seidenberg, M. and L. Petitto. 1987. Communication, symbolic communication, and
language: comment on Savage-Rumbaugh et al. Journal of Experimental Psychol-
ogy: General 116. 279–287.
Vennemann, T. 1975. An explanation of drift. In Word order and word order change,
ed. by C. N. Li. 269–302. Austin: University of Texas Press.
Wallman, J. 1992. Aping language. Cambridge: Cambridge University Press.
Warner, A. R. 1995. Predicting the progressive passive: parametric change within a
lexicalist framework. Language 71. 533–557.
1997. The structure of parametric change and V movement in the history of English.
In Parameters of morphosyntactic change, ed. by A. van Kemenade and N. Vincent.
380–393. Cambridge: Cambridge University Press.
Watkins, C. 1976. Toward Proto-Indo-European syntax: problems and pseudo-problems.
In Diachronic syntax, ed. by S. Steever, C. Walker and S. Mufwene. 305–326.
Chicago: Chicago Linguistic Society.
Weverink, M. 1989. The subject in relation to inflection in child language. Master’s
thesis. Utrecht University.
Wexler, K. 1994. Optional infinitives, head movement, and the economy of derivations.
In Verb movement, ed. by D. W. Lightfoot and N. Hornstein. 305–350. Cambridge:
Cambridge University Press.
Wexler, K., M. Rice and P. Cleave. 1995. Specific language impairment as a period of
extended optional infinitives. Journal of Speech and Hearing Research 38. 850–
863.
Williams, H. and F. Nottebohm. 1985. Auditory responses in avian vocal motor neurons:
a motor theory for song perception in birds. Science 229. 279–282.
Wright, J. 1910. Grammar of the Gothic language. Oxford: Clarendon.
Yang, C. D. 2002. Grammar competition and language change. In Lightfoot 2002.
367–380.
Zwicky, A. M. and G. K. Pullum. 1983. Cliticization vs. inflection: English n’t. Language
59. 502–513.