Word Grammar

New Perspectives on a Theory of Language Structure



edited by Kensei Sugayama and Richard Hudson

Continuum
The Tower Building, 11 York Road, London SE1 7NX
15 East 26th Street, New York, NY 10010

© Kensei Sugayama and Richard Hudson 2005

All rights reserved. No part of this publication may be reproduced or transmitted
in any form or by any means, electronic or mechanical, including photocopying,
recording, or any information storage or retrieval system, without prior
permission in writing from the publishers.

First published 2006

British Library Cataloguing-in-Publication Data


A catalogue record for this book is available from the British Library.

ISBN: 0-8264-8645-2 (hardback)

Library of Congress Cataloguing-in-Publication Data


To come

Typeset by BookEns Ltd, Royston, Herts.


Printed and bound in Great Britain by MPG Books Ltd, Bodmin, Cornwall
The problem of the word has worried general linguists for the
best part of a century.
-P. H. Matthews
Contents
Contributors xi
Preface xiii
Kensei Sugayama
Introduction 1
1. What is Word Grammar? 3
Richard Hudson
1. A Brief Overview of the Theory 3
2. Historical Background 5
3. The Cognitive Network 7
4. Default Inheritance 12
5. The Language Network 13
6. The Utterance Network 15
7. Morphology 18
8. Syntax 21
9. Semantics 24
10. Processing 27
11. Conclusions 28

Part I
Word Grammar Approaches to Linguistic Analysis:
Its explanatory power and applications 33
2. Case Agreement in Ancient Greek: Implications for a
theory of covert elements 35
Chet Creider and Richard Hudson
1. Introduction 35
2. The Data 35
3. The Analysis of Case Agreement 41
4. Non-Existent Entities in Cognition and in Language 42
5. Extensions to Other Parts of Grammar 46
6. Comparison with PRO and pro 49
7. Comparison with Other PRO-free Analyses 50
8. Conclusions 52
3. Understood Objects in English and Japanese with
Reference to Eat and Taberu: A Word Grammar
account 54
Kensei Sugayama
1. Introduction 54
2. Word Grammar 56

3. Eat in English 58
4. Taberu in Japanese 60
5. Conclusion 63
4. The Grammar of Be To: From a Word Grammar
point of view 67
Kensei Sugayama
1. Introduction and the Problem 67
2. Category of Be 68
3. Modal Be in Word Grammar 69
4. Morphological Aspects 70
5. Syntactic Aspects 71
6. Semantics of the Be To Construction 72
7. Should To be Counted as Part of the Lexical Item? 75
8. A Word Grammar Analysis of the Be To Construction 77
9. Conclusion 81
5. Linking in Word Grammar 83
Jasper Holmes
1. Linking in Word Grammar: The syntax semantics principle 83
2. The Event Type Hierarchy: The framework; event types;
roles and relations 103
3. Conclusion 114
6. Word Grammar and Syntactic Code-Mixing Research 117
Eva Eppler
1. Introduction 117
2. Constituent Structure Grammar Approaches to Intra-Sentential
Code-Mixing 118
3. A Word Grammar Approach to Code-Mixing 121
4. Word Order in Mixed and Monolingual 'Subordinate' Clauses 128
5. Summary and Conclusion 139
7. Word Grammar Surface Structures and HPSG Order
Domains 145
Takafumi Maekawa
1. Introduction 145
2. A Word Grammar Approach 146
3. An Approach in Constructional HPSG: Ginzburg and Sag 2000 154
4. A Linearization HPSG Approach 160
5. Concluding Remarks 165

Part II
Towards a Better Word Grammar 169
8. Structural and Distributional Heads 171
Andrew Rosta
1. Introduction 171
2. Structural Heads 172

3. Distributional Heads 172


4. That-Clauses 174
5. Extent Operators 174
6. Surrogates versus Proxies 177
7. Focusing Subjuncts: just, only, even 179
8. Pied-piping 181
9. Degree Words 181
10. Attributive Adjectives 182
11. Determiner Phrases 182
12. The type of Construction 184
13. Inside-out Interrogatives 185
14. 'Empty Categories' 187
15. Coordination 189
16. Correlatives 191
17. Dependency Types 191
18. Conclusion 199
9. Factoring Out the Subject Dependency 204
Nikolas Gisborne
1. Introduction 204
2. Dimensions of Subjecthood 205
3. The Locative Inversion Data 210
4. Factored Out Subjects 216
5. Conclusions 222
Conclusion 225
Kensei Sugayama

Author Index 227


Subject Index 229
Contributors
RICHARD HUDSON is Professor Emeritus of Linguistics at University College
London. His research interest is the theory of language structure; his main
publications in this area are about the theory of Word Grammar, including
Word Grammar (1984, Oxford: Blackwell); English Word Grammar (1990,
Oxford: Blackwell) and a large number of more recent articles. He has also
taught sociolinguistics and has a practical interest in educational linguistics.
Website: www.phon.ucl.ac.uk/home/dick/home.htm
Email: dick@linguistics.ucl.ac.uk

KENSEI SUGAYAMA, Professor of English Linguistics at Kobe City University of
Foreign Studies. Research interests: English Syntax, Word Grammar, Lexical
Semantics and General Linguistics. Major publications: 'More on unaccusative
Sino-Japanese complex predicates in Japanese' (1991), UCL Working Papers in
Linguistics 3; 'A Word-Grammatic account of complements and adjuncts in
Japanese' (1994), Proceedings of the 15th International Congress of Linguists;
'Speculations on unsolved problems in Word Grammar' (1999), The Kobe City
University Journal 50.7; Scope of Modern Linguistics (2000, Tokyo: Eihosha);
Studies in Word Grammar (2003, Kobe: Research Institute of Foreign Studies,
KCUFS).
Email: ken@inst.kobe-cufs.ac.jp

CHET CREIDER, Professor and Chair, Department of Anthropology, University
of Western Ontario, London, Ontario, Canada. Research interests: morphology,
syntax, African languages. Major publications: Structural and Pragmatic
Factors Influencing the Acceptability of Sentences with Extended Dependencies in
Norwegian (1987, University of Trondheim Working Papers in Linguistics 4); The
Syntax of the Nilotic Languages: Themes and variations (1989, Berlin: Dietrich
Reimer); A Grammar of Nandi (1989, with J. T. Creider, Hamburg: Helmut
Buske); A Grammar of Kenya Luo (1993, ed.); A Dictionary of the Nandi
Language (2001, with J. T. Creider, Köln: Rüdiger Köppe).
Email: creider@uwo.ca

ANDREW ROSTA, Senior Lecturer, Department of Cultural Studies, University
of Central Lancashire, UK. Research interests: all aspects of English grammar.
Email: a.rosta@v21.me.uk

NIKOLAS GISBORNE is a lecturer in the Department of Linguistics and English
Language at the University of Edinburgh. His research interests are in lexical
semantics and syntax, and their interaction in argument structure.
Website: www.englang.ed.ac.uk/people/nik.html
Email: n.gisborne@ed.ac.uk

JASPER W. HOLMES is a self-employed linguist who has worked with many large
organizations on projects in lexicography, education and IT. Teaching and
research interests include syntax and semantics, lexical structure, corpuses and
other IT applications (linguistics in computing, computing in linguistics),
language in education and in society, the history of English and English as a
world language. His publications include 'Synonyms and syntax' (1996, with
Richard Hudson, And Rosta and Nik Gisborne), Journal of Linguistics 32; 'The
syntax and semantics of causative verbs' (1999), UCL Working Papers in
Linguistics 11; 'Re-cycling in the encyclopedia' (2000, with Richard Hudson), in
B. Peeters (ed.), The Lexicon-Encyclopedia Interface (Amsterdam: Elsevier);
'Constructions in Word Grammar' (2005, with Richard Hudson), in Jan-Ola
Östman and Mirjam Fried (eds), Construction Grammars: Cognitive Grounding and
Theoretical Extensions (Amsterdam: Benjamins).
Email: jasper.holmes@gmail.com

EVA EPPLER, Senior Lecturer in English Language and Linguistics, School of
Arts, University of Roehampton, UK. Research interests: morpho-syntax of
German and English, syntax-pragmatics interface, code-mixing, bilingual
processing and production, sociolinguistics of multilingual communities. Recent
main publication: ''... because dem Computer brauchst es ja nicht zeigen.':
because + German main clause word order', International Journal of
Bilingualism 8.2 (2004), pp. 127-44.
Email: evieppler@hotmail.com

TAKAFUMI MAEKAWA, PhD student, Department of Language and Linguistics,
University of Essex. Research interests: Japanese and English syntax, Head-
Driven Phrase Structure Grammar and lexical semantics. Major publication:
'Constituency, Word Order and Focus Projection' (2004), The Proceedings of
the 11th International Conference on Head-Driven Phrase Structure Grammar,
Center for Computational Linguistics, Katholieke Universiteit Leuven, August
3-6.
Email: maekawa@btinternet.com
Preface

This volume comes from a three-year (April 2002-March 2005) research
project on Word Grammar supported by the Japan Society for the Promotion
of Science, the goal of which is to bring together Word Grammar linguists
whose research has been carried out in this framework but whose approaches
to it reflect differing perspectives on Word Grammar (henceforth WG). I
gratefully acknowledge support for my work in WG from the Japan Society for
the Promotion of Science (grant-in-aid Kiban-Kenkyu C (2), no. 14510533 from
April 2002-March 2005). The collection of papers was planned so as to
introduce readers to this theory and to include a diversity of languages, to
which the theory is shown to be applicable, along with critique from different
theoretical orientations.
In September 1994 Professor Richard Hudson, the founder of Word
Grammar, visited Kobe City University of Foreign Studies to give a lecture in
WG as part of his lecturing trip to Japan. His talks were centred on advances
in WG at that time, which refreshed our understanding of the theory. Professor
Hudson has been writing in a very engaging and informative way for about two
quarters of a century in the world linguistics scene.
Word Grammar is a theory of language structure which Richard Hudson,
now Emeritus Professor of Linguistics at University College London, has been
building since the early 1980s. It is still changing and improving in detail, yet the
main ideas remain the same. These ideas themselves developed out of two
other theories that he had tried: Systemic Grammar (now known as Systemic
Functional Grammar), due to Michael Halliday, and then Daughter-
Dependency Grammar, his own invention.
Word Grammar fills a gap in the study of dependency theory. Dependency
theory may not belong to the mainstream in the Western World, especially not
in America, but it is gaining more and more attention, which it certainly
deserves. In Europe, dependency has been better known since the French
linguist Lucien Tesnière's study in the 1950s (cf. Hudson, this volume). I will
mention here just France, Belgium, Germany and Finland. Dependency theory
now also rules Japan in the shape of WG. Moreover, the notion of head, the
central idea of dependency, has been introduced into virtually all modern
linguistic theories. In most grammars, dependency and constituency are used
simultaneously. However, this carries the risk of making these grammars too
powerful. WG's challenge is to eliminate constituency from grammar except in
coordinate structures, although certain dependency grammars, especially the
German ones, refuse to accept constituency for coordination.
Richard Hudson's first book was the first attempt to write a generative
(explicit) version of Systemic Grammar (English Complex Sentences: An
Introduction to Systemic Grammar, North Holland, 1971); and his second book
was about Daughter-Dependency Grammar (Arguments for a Non-transforma-
tional Grammar, University of Chicago Press, 1976). As the latter title indicates,
Chomsky's transformational grammar was very much 'in the air', and both
books accepted his goal of generative grammar but offered other ideas about
sentence structure as alternatives to his mixture of function-free phrase structure
plus transformations. In the late 1970s when Transformational Grammar was
immensely influential, Richard Hudson abandoned Daughter-Dependency
Grammar (in spite of its drawing a rave review by Paul Schachter in Language
54, 348-76). His explorations of various general ideas that had not yet come
together crystallized into a coherent alternative theory called Word Grammar,
first described
in the 1984 book Word Grammar and subsequently improved and revised in
the 1990 book English Word Grammar. Since then the details have been
worked out much better and there is now a workable notation and an
encyclopedia available on the internet (cf. Hudson 2004). The newest version
of Word Grammar is now on its way (Hudson in preparation).
The time span between the publication of Richard Hudson's Word Grammar
(1984) and this volume is more than two decades (21 years to be precise). The
intervening years have seen impressive developments in this theory by WG
grammarians, as well as developments in other competing linguistic theories
such as the Minimalist Programme, Head-driven Phrase Structure Grammar (HPSG),
Generalized Phrase Structure Grammar (GPSG), Lexical Functional Grammar
(LFG), Construction Grammar and Cognitive Grammar.
Here are the main ideas, most of which come from the latest version of the
WG homepage (Hudson 2004), together with an indication of where they came
from:
• It is monostratal - only one structure per sentence, no transformations.
(From Systemic Grammar.)
• It uses word-word dependencies - e.g. a noun is the subject of a verb.
(From John Anderson and other users of Dependency Grammar, via
Daughter-Dependency Grammar; a reaction against Systemic Grammar,
where word-word dependencies are mediated by the features of the mother
phrase.)
• It does not use phrase structure - e.g. it does not recognize a noun phrase
as the subject of a clause, though these phrases are implicit in the
dependency structure. (This is the main difference between Daughter-
Dependency Grammar and Word Grammar.)
• It shows grammatical relations/functions by explicit labels - e.g. 'subject'
and 'object'. (From Systemic Grammar.)
• It uses features only for inflectional contrasts - e.g. tense and number, but
not transitivity. (A reaction against the excessive use of features in both
Systemic Grammar and current Transformational Grammar.)
• It uses default inheritance as a very general way of capturing the contrast
between 'basic' or 'underlying' patterns and 'exceptions' or 'transformations'
- e.g. by default, English words follow the word they depend on, but
exceptionally subjects precede it; particular cases 'inherit' the default pattern
unless it is explicitly overridden by a contradictory rule. (From Artificial
Intelligence.)
• It views concepts as prototypes rather than 'classical' categories that can be
defined by necessary and sufficient conditions. All characteristics (i.e. all
links in the network) have equal status, though some may for pragmatic
reasons be harder to override than others. (From Lakoff and early
Cognitive Linguistics, supported by work in sociolinguistics.)
• It presents language as a network of knowledge, linking concepts about
words, their meanings, etc. - e.g. twig is linked to the meaning 'twig', to the
form /twig/, to the word-class 'noun', etc. (From Lamb's Stratificational
Grammar, now known as Neurocognitive Linguistics.)
• In this network there are no clear boundaries between different areas of
knowledge - e.g. between 'lexicon' and 'grammar', or between 'linguistic
meaning' and 'encyclopedic knowledge'. (From early Cognitive Linguistics
- and the facts.)
• In particular, there is no clear boundary between 'internal' and 'external'
facts about words, so a grammar should be able to incorporate
sociolinguistic facts - e.g. the speaker of jazzed is an American. (From
Sociolinguistics.)
In this theory, word-word dependency is a key concept, upon which the syntax
and semantics of a sentence build. Dependents of a word are subcategorized
into two types, i. e. complements and adjuncts. These two types of dependents
play an important role in this theory of grammar.
Let me give you a flavour of the syntax and semantics in WG, as shown in
Figure 1:

Figure 1

Contributors to this volume are primarily WG grammarians across the world
who participated in the research that I organized, and I am also grateful for
being able to include critical work by Maekawa of the University of Essex, who
is working in a different paradigm.
All the papers here manifest what I would characterize as the theoretical
potential of WG, exploring how powerful WG is in offering analyses of
linguistic phenomena in various languages. The papers we have collected come
from varying perspectives (formal, lexical-semantic, morphological, syntactic,
semantic) and include work on a number of languages, including English,
Ancient Greek, Japanese and German. Phenomena studied include verbal
inflection, case agreement, extraction, constructions, code-mixing, etc.
The papers in this volume span a variety of topics, but there is a common
thread running through them: the claim that word-word dependency is
fundamental to our analysis and understanding of language. The collection
starts with a chapter on WG by Richard Hudson which serves to introduce the
newest version of WG. The subsequent chapters are organized into two
sections:
Part I: Word Grammar Approaches to Linguistic Analysis: its explanatory
power and applications
Part II: Towards a Better Word Grammar
Part I contains seven chapters, which contribute to recent developments in
WG and explore how powerful WG is in analysing linguistic phenomena in
various languages. They deal with formal, lexical, morphological, syntactic and
semantic matters. In this way, these papers give a varied picture of the
possibilities of WG.
In Chapter 2, Creider and Hudson provide a theory of covert elements,
which is a hot issue in linguistics. Since WG has hitherto denied the existence
of any covert elements in syntax, it has to deal with claims such as the one that
covert case-bearing subjects are possible in Ancient Greek. As the authors say
themselves, their solution is tantamount to an acceptance of some covert
elements in syntax, though in every case the covert element can be predicted
from the word on which it depends. The analysis given is interesting because
the argument is linked to dependency. It is more sophisticated than the simple
and undefined Chomskyan notion of PRO element.
In Chapter 3, Sugayama joins Creider and Hudson in detailing an analysis of
understood objects in English and Japanese, albeit at the level of semantics
rather than syntax. He studies an interesting contrast between English and
Japanese concerning understood objects. Unlike English and most other
European languages, Japanese is unique in allowing its verbs to miss out
their complements on the condition that the speaker assumes that they are
known to the addressee. The reason seems to be that in the semantic structure
of the sentences, there has to be a semantic argument which should be, but is
not, mapped onto syntax as a syntactic complement. The author adduces a WG
solution that is an improvement on Hudson's (1990) account.
Sugayama shares with the preceding chapter an in-depth lexical-semantic
analysis in order to address the relation between a word and the construction.
In Chapter 4, he attempts to characterize the be to construction within the WG
framework. He has shown that a morphological, syntactic and semantic analysis
of be in the be to construction provides evidence for the category of be in this
construction. Namely, be is an instance of a modal verb in terms of morphology
and syntax, while the sense of the whole construction is determined by the
sense of 'to'.
In Chapter 5, Holmes, in a very original approach, develops an account of
the linking of syntactic and semantic arguments in WG. Under
the WG account, both thematic and linking properties are determined at both
the specific and the general level. This is obviously an advantage.
In Chapter 6, Eppler draws on experimental studies concerning code-
mixing and successfully extends WG to an original and interesting area of
research. Constituent-based models have difficulties accounting for mixing
between SVO and SOV languages like English and German. A dependency
(WG) approach is imperative here. A word's requirements do not project to
larger units like phrasal constituents. The Null-Hypothesis, then, formulated in
WG terms, assumes that each word in a switched dependency satisfies the
constraints imposed on it by its own language. The material is taken from
English/German conversations of Jewish refugees in London.
Maekawa continues the sequence in this collection towards more purely
theoretical studies. In Chapter 7, he looks at three different approaches to the
asymmetries between main and embedded clauses with respect to the elements
in the left periphery of a clause: the dependency-based approach within WG,
the Constructional HPSG approach, and the Linearization HPSG analysis.
Maekawa, an HPSG linguist, argues that the approaches within WG and the
Constructional HPSG have some problems in dealing with the relevant facts,
but that Linearization HPSG provides a straightforward account of them.
Maekawa's analysis suggests that linear order should be independent, to a
considerable extent, of combinatorial structure, such as dependency or
phrase structure.
Following these chapters are more theoretical chapters which help to
improve the theory and clarify what research questions must be undertaken
next.
Part II contains two chapters that examine two key theoretical concepts in
WG: head and dependency. They are intended to help us progress a few steps
forward in revising and improving the current WG, together with Hudson (in
preparation).
The notion of head is a central one in most grammars, so it is natural that it
is discussed and challenged by WG and other theorists. In Chapter 8, Rosta
distinguishes between two kinds of head and claims that every phrase has both a
distributional head and a structural head, although he agrees that normally the
same word is both distributional and structural head of a phrase. Finally,
Gisborne's Chapter 9 then challenges Hudson's classification of dependencies.
The diversification of heads (different kinds of dependency) plays a role in
WG as well. Gisborne is in favour of a more fine-grained account of
dependencies than Hudson's 1990 model. He focuses on a review of the
subject-of dependency, distinguishing between two kinds of subjects, which
seems promising. Gisborne's thesis is that word order is governed not only by
syntactic information but also by discourse-presentational facts.
I hope this short overview will suggest to the prospective reader that our
attempt at introducing a dependency-based grammar was successful.
By means of this volume, we hope to contribute to the continuing
cooperation between linguists working in WG and those working in other
theoretical frameworks. We look forward to future volumes that will further
develop this cooperation.
The editors gratefully acknowledge the work and assistance of all those
contributors whose papers are incorporated in this volume, including one non-
WG linguist who contributed a paper from his own theoretical viewpoint and
helped shape the volume you see here.
Last but not least, neither the research in WG nor the present volume would
have been possible without the general support of both the Japan Society for
the Promotion of Science and the Daiwa Anglo-Japanese Foundation, whose
assistance we gratefully acknowledge here. In addition, we owe a special debt of
gratitude to Jenny Lovel for assisting with preparation of this volume in her
normal professional manner. We alone accept responsibility for all errors in
the presentation of data and analyses in this volume.

Kensei Sugayama

References
Hudson, R. A. (1971), English Complex Sentences: An Introduction to Systemic Grammar.
Amsterdam: North Holland.
— (1976), Arguments for a Non-transformational Grammar. Chicago: University of
Chicago Press.
— (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2004, July 1 - last update), 'Word Grammar', Available:
www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 18 April 2005).
— (in preparation), Advances in Word Grammar. Oxford: Oxford University Press.
Pollard, C. and Sag, I. A. (1987), Information-Based Syntax and Semantics. Stanford:
CSLI.
Schachter, P. (1978), 'Review of Arguments for a Non-Transformational Grammar'.
Language, 54, 348-76.
Sugayama, K. (ed. ) (2003), Studies in Word Grammar. Kobe: Research Institute of
Foreign Studies, KCUFS.
Tesnière, L. (1959), Éléments de syntaxe structurale. Paris: Klincksieck.
Introduction
1 What is Word Grammar?
RICHARD HUDSON

Abstract
The chapter summarizes the Word Grammar (WG) theory of language structure
under the following headings: 1. A brief overview of the theory; 2. Historical
background; 3. The cognitive network: 3. 1 Language as part of a general network;
3. 2 Labelled links; 3. 3 Modularity; 4. Default inheritance; 5. The language
network; 6. The utterance network; 7. Morphology; 8. Syntax; 9. Semantics; 10.
Processing; and 11. Conclusions.

1. A Brief Overview of the Theory


Word Grammar (WG) is a general theory of language structure. Most of the
work to date has dealt with syntax, but there has also been serious work in
semantics and some more tentative explorations of morphology, sociolinguistics,
historical linguistics and language processing. The only areas of linguistics that
have not been addressed at all are phonology and language acquisition (but even
here see van Langendonck 1987). The aim of this article is breadth rather than
depth, in the hope of showing how far-reaching the theory's tenets are.
Although the roots of WG lie firmly in linguistics, and more specifically in
grammar, it can also be seen as a contribution to cognitive psychology; in terms
of a widely used classification of linguistic theories, it is a branch of cognitive
linguistics (Lakoff 1987; Langacker 1987; 1990; Taylor 1989). The theory has
been developed from the start with the aim of integrating all aspects of language
into a single theory which is also compatible with what is known about general
cognition. This may turn out not to be possible, but to the extent that it is
possible it will have explained the general characteristics of language as 'merely'
one instantiation of more general cognitive characteristics.
The overriding consideration, of course, is the same as for any other
linguistic theory: to be true to the facts of language structure. However, our
assumptions make a great deal of difference when approaching these facts, so it
is possible to arrive at radically different analyses according to whether we
assume that language is a unique module of the mind, or that it is similar to
other parts of cognition. The WG assumption is that language can be analysed
and explained in the same way as other kinds of knowledge or behaviour unless
there is clear evidence to the contrary. So far this strategy has proved productive
and largely successful, as we shall see below.

As the theory's name suggests, the central unit of analysis is the word, which
is central to all kinds of analysis:

• Grammar. Words are the only units of syntax (section 8), as sentence
structure consists entirely of dependencies between individual words; WG
is thus clearly part of the tradition of dependency grammar dating from
Tesnière (1959; Fraser 1994). Phrases are implicit in the dependencies, but
play no part in the grammar. Moreover, words are not only the largest units
of syntax, but also the smallest. In contrast with Chomskyan linguistics,
syntactic structures do not, and cannot, separate stems and inflections, so
WG is an example of morphology-free syntax (Zwicky 1992: 354). Unlike
syntax, morphology (section 7) is based on constituent-structure, and the
two kinds of structure are different in other ways too.
• Semantics. As in other theories words are also the basic lexical units
where sound meets syntax and semantics, but in the absence of phrases,
words also provide the only point of contact between syntax and semantics,
giving a radically 'lexical' semantics. As will appear in section 9, a rather
unexpected effect of basing semantic structure on single words is a kind of
phrase structure in the semantics.
• Situation. We shall see in section 6 that words are the basic units for
contextual analysis (in terms of deictic semantics, discourse or sociolinguistics).

Words, in short, are the nodes that hold the 'language' part of the human
network together. This is illustrated by the word cycled in the sentence I cycled to
UCL, which is diagrammed in Figure 1.

Figure 1

Table 1 Relationships in cycled

related concept C          relationship of C to cycled    notation in diagram
the word I                 subject                        's'
the word to                post-adjunct                   '>a'
the morpheme {cycle}       stem                           straight downward line
the word-form {cycle+ed}   whole                          curved downward line
the concept 'ride-bike'    sense                          straight upward line
the concept 'event e'      referent                       curved upward line
the lexeme CYCLE           cycled isa CYCLE               triangle resting on CYCLE
the inflection 'past'      cycled isa 'past'              triangle resting on 'past'
me                         speaker                        'speaker'
now                        time                           'time'

As can be seen in this diagram, cycled is the meeting point for ten
relationships which are detailed in Table 1. These relationships are all quite
traditional (syntactic, morphological, semantic, lexical and contextual), and
traditional names are used where they exist, but the diagram uses notation
which is peculiar to WG. It should be easy to imagine how such relationships
can multiply to produce a rich network in which words are related to one
another as well as to other kinds of element including morphemes and various
kinds of meaning. All these elements, including the words themselves, are
'concepts' in the standard sense; thus a WG diagram is an attempt to model a
small part of the total conceptual network of a typical speaker.

2. Historical Background
The theory described in this article is the latest in a family of theories which have
been called 'Word Grammar' since the early 1980s (Hudson 1984). The present
theory is very different in some respects from the earliest one, but the continued
use of the same name is justified because we have preserved some of the most
fundamental ideas - the central place of the word, the idea that language is a
network, the role of default inheritance, the clear separation of syntax and
semantics, the integration of sentence and utterance structure. The theory is still
changing and further changes are already identifiable (Hudson, in preparation).
As in other theories, the changes have been driven by various forces - new
data, new ideas, new alternative theories, new personal interests; and by the
influence of teachers, colleagues and students. The following brief history may
be helpful in showing how the ideas that are now called 'Word Grammar'
developed during my academic life.
The 1960s. My PhD analysis of Beja used the theory being developed by
Halliday (1961) under the name 'Scale-and-Category' grammar, which later
turned into Systemic Functional Grammar (Butler 1985; Halliday 1985). I
spent the next six years working with Halliday, whose brilliantly wide-ranging
analyses impressed me a lot. Under the influence of Chomsky's generative
grammar (1957, 1965), reinterpreted by McCawley (1968) as well-formedness
conditions, I published the first generative version of Halliday's Systemic
Grammar (Hudson 1970). This theory has a very large network (the 'system
network') at its heart, and networks also loomed large at that time in the
Stratificational Grammar of Lamb (1966; Bennett 1994). Another reason why
stratificational grammar was important was that it aimed to be a model of
human language processing - a cognitive model.
The 1970s. Seeing the attractions of both valency theory and Chomsky's
subcategorization, I produced a hybrid theory which was basically Systemic
Grammar, but with the addition of word-word dependencies under the
influence of Anderson (1971); the theory was called 'Daughter-Dependency
Grammar' (Hudson 1976). Meanwhile I was teaching sociolinguistics and
becoming increasingly interested in cognitive science (especially default
inheritance systems and frames) and the closely related field of lexical
semantics (especially Fillmore's Frame Semantics 1975, 1976). The result was a
very 'cognitive' textbook on sociolinguistics (Hudson 1980a, 1996a). I was also
deeply influenced by Chomsky's 'Remarks on nominalization' paper (1970),
and in exploring the possibilities of a radically lexicalist approach I toyed with
the idea of 'pan-lexicalism' (1980b, 1981): everything in the grammar is 'lexical'
in the sense that it is tied to word-sized units (including word classes).
The 1980s. All these influences combined in the first version of Word
Grammar (Hudson 1984), a cognitive theory of language as a network which
contains both 'the grammar' and 'the lexicon' and which integrates language
with the rest of cognition. The semantics follows Lyons (1977), Halliday (1967-
8) and Fillmore (1976) rather than formal logic, but even more controversially,
the syntax no longer uses phrase structure at all in describing sentence structure,
because everything that needs to be said can be said in terms of
dependencies between single words. The influence of continental depen-
dency theory is evident but the dependency structures were richer than those
allowed in 'classical' dependency grammar (Robinson 1970) - more like the
functional structures of Lexical Functional Grammar (Kaplan and Bresnan
1982). Bresnan's earlier argument (1978) that grammar should be compatible
with a psychologically plausible parser also suggested the need for a parsing
algorithm, which has led to a number of modest Natural Language Processing
(NLP) systems using WG (Fraser 1985, 1989, 1993; Hudson 1989; Shaumyan
1995). These developments provided the basis for the next book-length
description of WG, 'English Word Grammar' (EWG, Hudson 1990). This
attempts to provide a formal basis for the theory as well as a detailed application
to large areas of English morphology, syntax and semantics.
The 1990s. Since the publication of EWG there have been some important
changes in the theory, ranging from the general theory of default inheritance,
through matters of syntactic theory (with the addition of 'surface structure', the
virtual abolition of features and the acceptance of 'unreal' words) and
morphological theory (where 'shape', 'whole' and 'inflection' are new), to
details of analysis, terminology and notation. These changes will be described
below. WG has also been applied to a wider range of topics than previously:
• lexical semantics (Gisborne 1993, 1996, 2000, 2001; Holmes 2004;
Hudson and Holmes 2000; Hudson 1992, 1995, forthcoming; Sugayama
1993, 1996, 1998);
• morphology (Creider 1999; Creider and Hudson 1999);
• sociolinguistics (Hudson 1996a, 1997b; Eppler 2005);
• language processing (Hudson 1993a, b, 1996b; Hiranuma 1999, 2001).

Most of the work done since the start of WG has applied the theory to English,
but it has also been applied to the following languages: Tunisian Arabic (Chekili
1982); Greek (Tzanidaki 1995, 1996a, b); Italian (Volino 1990); Japanese
(Sugayama 1991, 1992, 1993, 1996; Hiranuma 1999, 2001); Serbo-Croatian
(Camdzic and Hudson 2002); and Polish (Gorayska 1985).
The theory continues to evolve, and at the time of writing a 'Word Grammar
Encyclopedia' which can be downloaded via the WG website
(www.phon.ucl.ac.uk/home/dick/wg.htm) is updated in alternate years.

3. The Cognitive Network

3.1 Language as part of a general network


The basis for WG is an idea which is quite uncontroversial in cognitive science:

The idea is that memory connections provide the basic building blocks through
which our knowledge is represented in memory. For example, you obviously know
your mother's name; this fact is recorded in your memory. The proposal to be
considered is that this memory is literally represented by a memory connection,...
That connection isn't some appendage to the memory. Instead, the connection is the
memory.... all of knowledge is represented via a sprawling network of these
connections, a vast set of associations. (Reisberg 1997: 257-8)

In short, knowledge is held in memory as an associative network (though
we shall see below that the links are much more precisely defined than the
unlabelled 'associations' of early psychology and modern connectionist
psychology). What is more controversial is that, according to WG, the same
is true of our knowledge of words, so the sub-network responsible for words is
just a part of the total 'vast set of associations'. Our knowledge of words is our
language, so our language is a network of associations which is closely integrated
with the rest of our knowledge.
However uncontroversial (and obvious) this view of knowledge may be in
general, it is very controversial in relation to language. The only part of language
which is widely viewed as a network is the lexicon (Aitchison 1987: 72), and a
fashionable view is that even here only lexical irregularities are stored in an
associative network, in contrast with regularities which are stored in a
fundamentally different way, as 'rules' (Pinker and Prince 1988). For example,
we have a network which shows for the verb come not only that its meaning is
'come' but that its past tense is the irregular came, whereas regular past tenses
are handled by a general rule and not stored in the network. The WG view is
that exceptional and general patterns are indeed different, but that they can
both be accommodated in the same network because it is an 'inheritance
network' in which general patterns and their exceptions are related by default
inheritance (which is discussed in more detail in section 4). To pursue the last
example, both patterns can be expressed in exactly the same prose:

(1) The shape of the past tense of a verb consists of its stem followed by -ed.
(2) The shape of the past tense of come consists of came.

The only difference between these rules lies in two places: 'a verb' versus come,
and 'its stem followed by -ed' versus came. Similarly, they can both be
incorporated into the same network, as shown in Figure 2 (where the triangle
once again shows the 'isa' relationship by linking the general concept at its base
to the specific example connected to its apex):

Figure 2

Once the possibility is accepted that some generalizations may be expressed
in a network, it is easy to extend the same treatment to the whole grammar, as
we shall see in later examples. One consequence, of course, is that we lose the
formal distinction between 'the lexicon' and 'the rules' (or 'the grammar'), but
this conclusion is also accepted outside WG in Cognitive Grammar (Langacker
1987) and Construction Grammar (Goldberg 1995). The only parts of linguistic
analysis that cannot be included in the network are the few general theoretical
principles (such as the principle of default inheritance).

3.2 Labelled links
It is easy to misunderstand the network view because (in cognitive psychology)
there is a long tradition of 'associative network' theories in which all links have
just the same status: simple 'association'. This is not the WG view, nor is it the
view of any of the other theories mentioned above, because links are
classified and labelled - 'stem', 'shape', 'sense', 'referent', 'subject', 'adjunct'
and so on. The classifying categories range from the most general - the 'isa' link
- to categories which may be specific to a handful of concepts, such as 'goods'
in the framework of commercial transactions (Hudson forthcoming). This is a
far cry from the idea of a network of mere 'associations' (such as underlies
connectionist models). One of the immediate benefits of this approach is that it
allows named links to be used as functions, in the mathematical sense of
Kaplan and Bresnan (1982: 182), which yield a unique value - e.g. 'the referent
of the subject of the verb' defines one unique concept for each verb. In order to
distinguish this approach from the traditional associative networks we can call
these networks 'labelled'.
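
To make this concrete, here is a minimal sketch (in Python, and not WG's own formalism) of a labelled network; the node and link names are invented for illustration. Because links are labelled, each label can be used as a function returning a unique value, so a composite description such as 'the referent of the subject of the verb' is just function composition:

```python
# A toy labelled network: each node maps link-labels to target nodes.
# Node and label names are illustrative, not an official WG inventory.
network = {
    'saw':  {'isa': 'verb', 'subject': 'John', 'object': 'Mary',
             'sense': 'seeing-event'},
    'John': {'isa': 'noun', 'referent': 'person-John'},
    'Mary': {'isa': 'noun', 'referent': 'person-Mary'},
}

def follow(node, label):
    """Treat a labelled link as a function: one node in, one node out."""
    return network[node][label]

# 'The referent of the subject of the verb' as function composition:
print(follow(follow('saw', 'subject'), 'referent'))  # -> person-John
```

In an unlabelled associative network the two lookups above would be indistinguishable from any other association, which is exactly the point made about John saw Mary below.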
Even within linguistics, labelled networks are controversial because the labels
themselves need an explanation or analysis. Because of this problem some
theories avoid labelled relationships, or reduce labelling to something more
primitive: for example, Chomsky has always avoided functional labels for
constituents such as 'subject' by using configurational definitions, and the
predicate calculus avoids semantic role labels by distinguishing arguments in
terms of order.
There is no doubt that labels on links are puzzlingly different from the labels
that we give to the concepts that they link. Take the small network in Figure 2
for past tenses. One of the nodes is labelled 'COME:past', but this label could
in fact be removed without any effect because 'COME:past' is the only concept
which isa 'verb:past' and which has came as its shape. Every concept is uniquely
defined by its links to other concepts, so labels are redundant (Lamb 1996,
1999: 59). But the same is not true of the labels on links, because a network with
unlabelled links is a mere associative network which would be useless in
analysis. For example, it is no help to know that in John saw Mary the verb is
linked, in some way or other, to the two nouns and that its meaning is linked,
again in unspecified ways, to the concepts 'John' and 'Mary'; we need to know
which noun is the subject, and which person is the see-er. The same label may
be found on many different links - for example, every word that has a sense
(i.e. virtually every word) has a link labelled 'sense', every verb that has a subject
has a 'subject' link, and so on. Therefore the function of the labels is to classify
the links as same or different, so if we remove the label we lose information. It
makes no difference whether we show these similarities and differences by
means of verbal labels (e.g. 'sense') or some other notational device (e.g.
straight upwards lines); all that counts is whether or not our notation classifies
links as same or different. Figure 3 shows how this can be done using first
conventional attribute-value matrices and second, the WG notation used so far.
This peculiarity of the labels on links brings us to an important characteristic
of the network approach which allows the links themselves to be treated like the
concepts which they link - as 'second-order concepts', in fact. The essence of a
network is that each concept should be represented just once, and its multiple
links to other concepts should be shown as multiple links, not as multiple
copies of the concept itself. Although the same principle applies generally to
attribute-value matrices, it does not apply to the attributes themselves. Thus
Figure 3

there is a single matrix for each concept, and if two attributes have the same
value this is shown (at least in one notation) by an arc that connects the two
value-slots. But when it comes to the attributes themselves, their labels are
repeated across matrices (or even within a single complex matrix). For example,
the matrix for a raising verb contains within it the matrix for its complement
verb; an arc can show that the two subject slots share the same filler but the only
way to show that these two slots belong to the same (kind of) attribute is to
repeat the label 'subject'.
In a network approach it is possible to show both kinds of identity in the
same way: by means of a single node with multiple 'isa' links. If two words are
both nouns, we show this by an isa link from each to the concept 'noun'; and if
two links are both 'subject' links, we put an isa link from each link to a single
general 'subject' link. Thus labelled links and other notational tricks are just
abbreviations for a more complex diagram with second-order links between
links. These second-order links are illustrated in Figure 4 for car and bicycle, as
well as for the sentence Jo snores.

Figure 4
This kind of analysis is too cumbersome to present explicitly in most
diagrams, but it is important to be clear that it underlies the usual notation
because it allows the kind of analysis which we apply to ordinary concepts to be
extended to the links between them. If ordinary concepts can be grouped into
larger classes, so can links; if ordinary concepts can be learned, so can links.
And if the labels on ordinary concepts are just mnemonics which could, in
principle, be removed, the same is true of the labels on almost all kinds of link.
The one exception is the 'isa' relationship itself, which reflects its fundamental
character.

3.3 Modularity
The view of language as a labelled network has interesting consequences for the
debate about modularity: is there a distinct 'module' of the mind dedicated
exclusively to language (or to some part of language such as syntax or
inflectional morphology)? Presumably not if a module is defined as a separate
'part' of our mind and if the language network is just a small part of a much
larger network. One alternative to this strong version of modularity is no
modularity at all, with the mind viewed as a single undifferentiated whole; this
seems just as wrong as a really strict version of modularity. However there is a
third possibility. If we focus on the links, any such network is inevitably
'modular' in the much weaker (and less controversial) sense that links between
concepts tend to cluster into relatively dense sub-networks separated by
relatively sparse boundary areas.
Perhaps the clearest evidence for some kind of modularity comes from
language pathology, where abilities are impaired selectively. Take the case of
Pure Word Deafness (Altmann 1997: 186), for example. Why should a person
be able to speak and read normally, and to hear and classify ordinary noises,
but not be able to understand the speech of other people? In terms of a WG
network, this looks like an inability to follow one particular link-type ('sense') in
one particular direction (from word to sense). Whatever the reason for this
strange disability, at least the WG analysis suggests how it might apply to just this
one aspect of language, while also applying to every single word: what is
damaged is the general relationship 'sense', from which all particular sense
relationships are inherited. A different kind of problem is illustrated by patients
who can name everything except one category - e.g. body-parts or things
typically found indoors (Pinker 1994: 314). Orthodox views on modularity
seem to be of little help in such cases, but a network approach at least explains
how the non-linguistic concepts concerned could form a mental cluster of
closely-linked and mutually defining concepts with a single super-category. It is
easy to imagine reasons why such a cluster of concepts might be impaired
selectively (e.g. that closely related concepts are stored close to each other, so a
single injury could sever all their sense links), but the main point is to have
provided a way of unifying them in preparation for the explanation.
In short, a network with classified relations allows an injury to apply to
specific relation types so that these relations are disabled across the board. The
approach also allows damage to specific areas of language which form clusters
with strong internal links and weak external links. Any such cluster or shared
linkage defines a kind of 'module' which may be impaired selectively, but the
module need not be innate: it may be 'emergent', a cognitive pattern which
emerges through experience (Karmiloff-Smith 1992; Bates et al. 1998).

4. Default Inheritance
Default inheritance is just a formal version of the logic that linguists have always
used: true generalizations may have exceptions. We allow ourselves to say that
verbs form their past tense by adding -ed to the stem even if some verbs don't,
because the specific provision made for these exceptional cases will
automatically override the general pattern. In short, characteristics of a general
category are 'inherited' by instances of that category only 'by default' - only if
they are not overridden by a known characteristic of the specific case. Common
sense tells us that this is how ordinary inference works, but default inheritance
only works when used sensibly. Although it is widely used in artificial
intelligence, researchers treat it with great caution (Luger and Stubblefield 1993:
386-8). The classic formal treatment is Touretzky (1986).
Inheritance is carried by the 'isa' relation, which is another reason for
considering this relation to be fundamental. For example, because snores isa
'verb' it automatically inherits all the known characteristics of 'verb' (i.e. of 'the
typical verb'), including, for example, the fact that it has a subject; similarly,
because the link between Jo and snores in Jo snores isa 'subject' it inherits the
characteristics of 'subject'. As we have already seen, the notation for 'isa'
consists of a small triangle with a line from its apex to the instance. The base of
the triangle which rests on the general category reminds us that this category is
larger than the instance, but it can also be imagined as the mouth of a hopper
into which information is poured so that it can flow along the link to the
instance.
The mechanism whereby default values are overridden has changed during
the last few years. In EWG, and also in Fraser and Hudson (1992), the
mechanism was 'stipulated overriding', a system peculiar to WG; but since then
this system has been abandoned. WG now uses a conventional system in which
a fact is automatically blocked by any other fact which conflicts and is more
specific. Thus the fact that the past tense of COME is came automatically blocks
the inheritance of the default pattern for past tense verbs. One of the
advantages of a network notation is that this is easy to define formally: we always
prefer the value for 'R of C' (where R is some relationship, possibly complex,
and C is a concept) which is nearest to C (in terms of intervening links). For
example, if we want to find the shape of the past tense of COME, we have a
choice between came and comed, but the route to came is shorter than that to
comed because the latter passes through the concept 'past tense of a verb'. (For
detailed discussions of default inheritance in WG, see Hudson 2000a, 2003b.)
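
The 'nearest value wins' computation is easy to simulate. The following Python sketch (an illustration of the idea, not WG's published algorithm; the concept names follow the COME/came example) walks up the isa chain from a concept and returns the first value it finds for a relationship, so a fact stored on the specific concept automatically blocks the default stored higher up:

```python
# Each concept records its isa-parent and any locally stored facts.
concepts = {
    'verb:past': {'isa': None,        'shape': '<stem>+ed'},
    'COME:past': {'isa': 'verb:past', 'shape': 'came'},
    'WALK:past': {'isa': 'verb:past'},  # no local shape: uses the default
}

def inherit(concept, relation):
    """Return the value for `relation` nearest to `concept` along isa links.

    Because the search starts at the concept itself, a more specific
    (locally stored) fact is found first and overrides the default."""
    while concept is not None:
        facts = concepts[concept]
        if relation in facts:
            return facts[relation]
        concept = facts['isa']
    return None

print(inherit('COME:past', 'shape'))  # -> came      (the exception wins)
print(inherit('WALK:past', 'shape'))  # -> <stem>+ed (the default is inherited)
```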
Probably the most important question for any system that uses default
inheritance concerns multiple inheritance, in which one concept inherits
from two different concepts simultaneously - as 'dog' inherits, for example,
both from 'mammal' and from 'pet'. Multiple inheritance is allowed in WG, as
in unification-based systems and the programming language DATR (Evans and
Gazdar 1996); it is true that it opens up the possibility of conflicting information
being inherited, but this is a problem only if the conflict is an artefact of the
analysis. There seem to be some examples in language where a form is
ungrammatical precisely because there is an irresoluble conflict between two
characteristics; for example, in many varieties of standard English the
combination *I amn't is predictable, but ungrammatical. One explanation for
this strange gap is that the putative form amn't has to inherit simultaneously
from aren't (the negative present of BE) and am (the I-form of BE); but these
models offer conflicting shapes (aren't, am) without any way for either to
override the other (Hudson 2000a). In short, WG does allow multiple
inheritance, and indeed uses it a great deal (as we shall see in later sections).
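
A small extension of the same sketch shows why *amn't is blocked: with two parents, inheritance can deliver two equally specific, conflicting values, and neither is nearer than the other. (Again, the encoding is a hypothetical illustration, not the WG formalism itself.)

```python
# Multiple inheritance: a concept may have several isa-parents.
concepts = {
    "aren't": {'isa': [], 'shape': "aren't"},  # negative present of BE
    'am':     {'isa': [], 'shape': 'am'},      # the I-form of BE
    "amn't":  {'isa': ["aren't", 'am']},       # must inherit from both
}

def inherited_values(concept, relation):
    """Collect the most specific values for `relation` up the isa links."""
    facts = concepts[concept]
    if relation in facts:
        return {facts[relation]}          # a local fact overrides the parents
    values = set()
    for parent in facts['isa']:
        values |= inherited_values(parent, relation)
    return values

shapes = inherited_values("amn't", 'shape')
print(shapes)  # two conflicting shapes, neither able to override the other:
               # the form is predictable but ungrammatical
```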

5. The Language Network


According to WG, then, language is a network of concepts. The following more
specific claims flesh out this general idea.
First, language is part of the same general conceptual network which contains
many concepts which are not part of language. What distinguishes the language
area of this network from the rest is that the concepts concerned are words and
their immediate characteristics. This is simply a matter of definition: concepts
which are not directly related to words would not be considered to be part of
language. As explained in section 3.3, language probably qualifies as a module
in the weak sense that the links among words are denser than those between
words and other kinds of concept, but this does not mean that language is a
module in the stronger sense of being 'encapsulated' or having its own special
formal characteristics. This is still a matter of debate, but we can be sure that at
least some of the characteristics of language are also found elsewhere - the
mechanism of default inheritance and the isa relation, the notion of linear
order, and many other formal properties and principles.
As we saw in Table 1, words may have a variety of links to each other and to
other concepts. This is uncontroversial, and so are most of the links that are
recognized. Even the traditional notions of 'levels of language' are respected in
as much as each level is defined by a distinct kind of link: a word is linked to its
morphological structure via the 'stem' and 'shape' links, to its semantics by the
'sense' and 'referent' links, and to its syntax by dependencies and word classes.
Figure 5 shows how clearly the traditional levels can be separated from one
another. In WG there is total commitment to the 'autonomy' of levels, in the
sense that the levels are formally distinct.
The most controversial characteristic of WG, at this level of generality, is
probably the central role played by inheritance (isa) hierarchies.
Inheritance hierarchies are the sole means available for classifying concepts,
which means that there is no place for feature-descriptions. In most other
theories, feature-descriptions are used to name concepts, so that instead of
Figure 5 (levels: semantics, syntax, morphology, phonology, graphology)

'verb' we have '[+V, -N]' or (changing notation) '[Verb: +, Noun: -,
SUBCAT: <NP>]' or even 'S/NP'. This is a fundamental difference because, as
we saw earlier, the labels on WG nodes are simply mnemonics and the analysis
would not be changed at all if they were all removed. The same is clearly not
true where feature-descriptions are used, as the name itself contains crucial
information which is not shown in any other way. In order to classify a word as
a verb in WG we give it an isa link to 'verb'; we do not give it a feature-
description which contains that of 'verb'.
The most obviously classifiable elements in language are words, so in
addition to specific, unique, words we recognize general 'word-types'; but we
can refer to both simply as 'words' because (as we shall see in the next section)
their status is just the same. Multiple inheritance allows words to be classified
on two different 'dimensions': as lexemes (DOG, LIKE, IF, etc.) and as
inflections (plural, past, etc.). Figure 6 shows how this cross-classification can be
incorporated into an isa hierarchy. The traditional word classes are shown on
the lexeme dimension as classifications of lexemes, but they interact in complex
ways with inflections. Cross-classification is possible even among word-classes;
for example, English gerunds (e.g. Writing in Writing articles is fun.) are both
nouns and verbs (Hudson 2000b), and in many languages participles are
probably both adjectives and verbs.
Figure 6

Unlike other theories, the classification does not take words as the highest
category of concepts - indeed, it cannot do so if language is part of a larger
network. WG allows us to show the similarities between words and other kinds
of communicative behaviour by virtue of an isa link from 'word' to
'communication', and similar links show that words are actions and events.
This is important in the analysis of deictic meanings which have to relate to the
participants and circumstances of the word as an action.
This hierarchy of words is not the only isa hierarchy in language. There are
two more for speech sounds ('phonemes') and for letters ('graphemes'), and a
fourth for morphemes and larger 'forms' (Hudson 1997b; Creider and Hudson
1999), but most important is the one for relationships - 'sense', 'subject' and so
on. Some of these relationships belong to the hierarchy of dependents which
we shall discuss in the section on syntax, but there are many others which do
not seem to comprise a single coherent hierarchy peculiar to language (in
contrast with the 'word' hierarchy). What seems much more likely is that
relationships needed in other areas of thought (e.g. 'before', 'part-of') are put to
use in language.
To summarize, the language network is a collection of words and word-parts
(speech-sounds, letters and morphemes) which are linked to each other and to
the rest of cognition in a variety of ways, of which the most important is the 'isa'
relationship which classifies them and allows default inheritance.

6. The Utterance Network
A WG analysis of an utterance is also a network; in fact, it is simply an
extension of the permanent cognitive network in which the relevant word
tokens comprise a 'fringe' of temporary concepts attached by 'isa' links, so the
utterance network has just the same formal characteristics as the permanent
network. For example, suppose you say to me 'I agree.' My task, as hearer, is to
segment your utterance into the two words I and agree, and then to classify each
of these as an example of some word in my permanent network (my grammar).
This is possible to the extent that default inheritance can apply smoothly; so, for
example, if my grammar says that I must be the subject of a tensed verb, the
same must be true of this token, though as we shall see below, exceptions can
be tolerated. In short, a WG grammar can generate representations of actual
utterances, warts and all, in contrast with most other kinds of grammar which
generate only idealized utterances or 'sentences'. This blurring of the boundary
between grammar and utterance is very controversial, but it follows inevitably
from the cognitive orientation of WG.
The status of utterances has a number of theoretical consequences both for
the structures generated and for the grammar that generates them. The most
obvious consequence is that word tokens must have different names from the
types of which they are tokens; in our example, the first word must not be
shown as I if this is also used as the name for the word-type in the grammar.
This follows from the fact that identical labels imply identity of concept,
whereas tokens and types are clearly distinct concepts. The WG convention is
to reserve conventional names for types, with tokens labelled 'w1', 'w2' and so
on through the utterance. Thus our example consists of w1 and w2, which isa 'I'
and 'AGREE: pres' respectively. This system allows two tokens of the same type
to be distinguished; so in I agree I made a mistake, w1 and w3 both isa 'I'. (For
simplicity WG diagrams in this chapter only respect this convention when it is
important to distinguish tokens from types.)
Another consequence of integrating utterances into the grammar is that
word types and tokens must have characteristics such that a token can inherit
them from its type. Obviously the token must have the familiar characteristics
of types - it must belong to a lexeme and a word class, it must have a sense and
a stem, and so on. But the implication goes in the other direction as well: the
type may mention some of the token's characteristics that are normally
excluded from grammar, such as characteristics of the speaker, the addressee
and the situation. This allows a principled account of deictic meaning (e.g. I
refers to the speaker, you to the addressee and now to the time of speaking), as
shown in Figure 1 and Table 1. Perhaps even more importantly, it is possible
to incorporate sociolinguistic information into the grammar, by indicating the
kind of person who is a typical speaker or addressee, or the typical situation of
use.
Treating utterances as part of the grammar has two further effects which are
important for the psycholinguistics of processing and of acquisition. As far as
processing is concerned, the main point is that WG accommodates deviant
input because the link between tokens and types is guided by the rather liberal
'Best Fit Principle' (Hudson 1990: 45ff): assume that the current token isa the
type that provides the best fit with everything that is known. The default
inheritance process which this triggers allows known characteristics of the token
to override those of the type; for example, a misspelled word such as mispelled
can isa its type, just like any other exception, though it will also be shown as a
deviant example. There is no need for the analysis to crash because of an error.
(Of course a WG grammar is not in itself a model of either production or
perception, but simply provides a network of knowledge which the processor
can exploit.) Turning to learning, the similarity between tokens and types
means that learning can consist of nothing but the permanent storage of tokens
minus their utterance-specific content.
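
A rough sketch of this token-type relation, again in illustrative Python rather than WG notation: the token stores only its observed, utterance-specific facts, and anything those facts do not override is inherited from the type, so deviant input is tolerated rather than causing a crash.

```python
# Sketch: a word token isa its type; observed facts override inherited ones.
# The names ('MISSPELL:past', the 'spelling' property) are illustrative only.

class Node:
    def __init__(self, label, isa=None, **facts):
        self.label, self.isa, self.facts = label, isa, facts

    def get(self, prop):
        if prop in self.facts:                    # token-level facts win
            return self.facts[prop]
        return self.isa.get(prop) if self.isa else None

misspell_past = Node('MISSPELL:past', spelling='misspelled', sense='misspell')
w2 = Node('w2', isa=misspell_past, spelling='mispelled')   # a deviant token
print(w2.get('sense'))      # 'misspell' - inherited from the type
print(w2.get('spelling'))   # 'mispelled' - the observed, overriding fact
```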
These remarks about utterances are summarized in Figure 7, which
speculates about my mental representation for the (written) 'utterance' Yous
mispelled it. According to this diagram, the grammar supplies two kinds of
utterance-based information about w1:

• that its referent is a set whose members include its addressee;
• that its speaker is a 'northerner' (which may be inaccurate factually, but is
roughly what I believe to be the case).

It also shows that w2 is a deviant token of the type 'MISSPELL: past'. (The
horizontal line below 'parts' is short-hand for a series of lines connecting the
individual letters directly to the morpheme, each with a distinct part name: part
1, part 2 and so on.)

Figure 7

7. Morphology
As explained earlier, the central role of the word automatically means that the
syntax is 'morphology-free'. Consequently it would be fundamentally against the
spirit of WG to follow transformational analyses in taking Jo snores as Jo 'tense'
snore. A morpheme for tense is not a word in any sense, so it cannot be a
syntactic node. The internal structure of words is handled almost entirely by
morphology. (The exception is the pattern found in clitics, which we return to
at the end of this section.)
The WG theory of inflectional morphology has developed considerably in
the last few years (Creider and Hudson 1998; Hudson 2000a) and is still
evolving. In contrast with the views expressed in EWG, I now distinguish sharply
between words, which are abstract, and forms, which are their concrete (visible
or audible) shapes; so I now accept the distinction between syntactic words and
phonological words (Rosta 1997) in all but terminology. The logic behind this
distinction is simple: if two words can share the same form, the form must be a
unit distinct from both. For example, we must recognize a morpheme {bear}
which is distinct from both the noun and the verb that share it (BEAR noun and
BEAR verb). This means that a word can never be directly related to phonemes
and letters, in contrast with the EWG account where this was possible (e.g.
Hudson 1990: 90: 'whole of THEM = <them>'). Instead, words are mapped
to forms, and forms to phonemes and letters. A form is the 'shape' of a word,
and a phoneme or letter is a 'pronunciation' or 'spelling' of a form. In
Figure 7, for example, the verb MISSPELL has the form {misspell} as its stem (a
kind of shape), and the spelling of {misspell} is <misspell>.
In traditional terms, syntax, form and phonology define different 'levels of
language'. As in traditional structuralism, their basic units are distinct words,
morphemes and phoneme-type segments; and as in the European tradition,
morphemes combine to define larger units of form which are still distinct from
words. For example, {misspell} is clearly not a single morpheme, but it exists as
a unit of form which might be written {mis+spell} - two morphemes combining
to make a complex form - and similarly for {mis+spell+ed}, the shape of the
past tense of this verb. Notice that in this analysis {...} indicates forms, not
morphemes; morpheme boundaries are shown by '+'.
Where does morphology, as a part of the grammar, fit in? Inflectional
morphology is responsible for any differences between a word's stem - the
shape of its lexeme - and its whole - the complete shape. For example, the
stem of misspelled is {misspell}, so inflectional morphology explains the extra
suffix. Derivational morphology, on the other hand, explains the relations
between the stems of distinct lexemes - in this case, between the lexemes
SPELL and MISSPELL, whereby the stem of one is contained in the stem of
the other. The grammar therefore contains the following 'facts':

• the stem of SPELL is {spell};
• the stem of MISSPELL is {mis+spell};
• the 'mis-verb' of a verb has a stem which contains {mis} + the stem of this
verb;
• the whole of MISSPELL: past is {mis+spell+ed};
• the past tense of a verb has a whole which contains its stem + {ed}.

In more complex cases (which we cannot consider here) the morphological
rules can handle vowel alternations and other departures from simple
combination of morphemes.
A small sample of a network for inflectional morphology is shown in
Figure 8. This diagram shows the default identity of whole and stem, and the
default rule for plural nouns: their shape consists of their stem followed by {s}.
No plural need be stored for regular nouns like DUCK, but for GOOSE the
irregularity must be stored. According to the analysis shown here, geese is
doubly irregular, having no suffix and having an irregular stem whose vowel
positions (labelled here simply '1' and '2') are filled by (examples of) <e>
instead of the expected <o>. In spite of the vowel change the stem of geese isa
the stem of GOOSE, so it inherits all the other letters, but had it been
suppletive a completely new stem would have been supplied.

Figure 8
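
The logic of Figure 8 can be sketched in a few lines; the fragment below is a toy illustration under the assumptions just stated (the default whole is the stem plus {s}, and GOOSE stores an irregular whole), not an implementation of WG.

```python
# A toy version of Figure 8's default rule for plural nouns (whole = stem + {s})
# with GOOSE storing an irregular whole that overrides it. Names are invented.

STEMS = {'DUCK': 'duck', 'GOOSE': 'goose'}
IRREGULAR_PLURALS = {'GOOSE': 'geese'}    # stored exception; DUCK stores nothing

def plural_whole(lexeme):
    # Default inheritance: a stored exception beats the general rule.
    if lexeme in IRREGULAR_PLURALS:
        return IRREGULAR_PLURALS[lexeme]
    return STEMS[lexeme] + 's'

print(plural_whole('DUCK'))    # 'ducks' - nothing stored, the default applies
print(plural_whole('GOOSE'))   # 'geese' - the stored irregular whole wins
```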

This analysis is very similar to those which can be expressed in terms of
'network morphology' (Brown et al. 1996), which is also based on multiple
default inheritance. One important difference lies in the treatment of
syncretism, illustrated by the English verb's past participle and passive participle
which are invariably the same. In network morphology the identity is shown by
specifying one and cross-referring to it from the other, but this involves an
arbitrary choice: which is the 'basic' one? In WG morphology, in contrast, the
syncretic generalizations are expressed in terms of 'variant' relations between
forms; for example, the past participle and passive participle both have as their
whole the 'en-variant' of their stem, where the en-variant of {take} is {taken} and
that of {walk} is {walked}. The en-variant is a 'morphological function'
which relates one form (the word's stem) to another, allowing the required
combination of generalization (by default a form's en-variant adds {ed} to a copy
of the form) and exceptionality.
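
A sketch of such a morphological function, with invented Python names: the en-variant adds {ed} to its input by default but can store exceptions, and both participles can point at the same function without either being treated as 'basic'.

```python
# A sketch of the 'en-variant' as a morphological function: stem -> form,
# adding {ed} by default, with stored exceptions. Names are invented.

EN_VARIANT_EXCEPTIONS = {'take': 'taken'}

def en_variant(stem):
    return EN_VARIANT_EXCEPTIONS.get(stem, stem + 'ed')

# Past and passive participles share their whole by sharing this function,
# so neither has to be treated as 'basic':
for stem in ('walk', 'take'):
    print(stem, '->', en_variant(stem))    # walk -> walked, take -> taken
```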
As derivational morphology is responsible for relationships between
lexemes, it relates one lexeme's stem to that of another by means of exactly
the same apparatus of morphological functions as is used in inflectional
morphology - indeed, some morphological functions may be used both in
inflection and in derivation (for example, the one which is responsible for
adding {ing} is responsible not only for present participles but also for
nominalizations such as flooring). Derivational morphology is not well
developed in WG, but the outlines of a system are clear. It will be based on
abstract lexical relationships such as 'mis-verb' (relating SPELL to
MISSPELL) and 'nominalization' (relating it to SPELLING); these abstract
relations between words are realized, by default, by (relatively) concrete
morphological functions, so, for example, a verb's nominalization is typically
realized by the ing-variant of that verb's stem. Of course, not all lexical
relationships are realized by derivational morphology, in which related lexemes
are partly similar in morphology; the grammar must also relate lexemes where
morphology is opaque (e. g. DIE - KILL, BROTHER - SISTER). The
network approach allows us to integrate all these relationships into a single
grammar without worrying about boundaries between traditional sub-disciplines
such as derivational morphology and lexical semantics.
I said at the start of this section that clitics are an exception to the generally
clear distinction between morphology and syntax. A clitic is a word whose
realization is an affix within a larger word. For example, in He's gone, the clitic 's
is a word in terms of syntax, but its realization is a mere affix in terms of
morphology. Clitics are atypical because typical words are realized by an entire
word-form; but the exceptionality is just a matter of morphology. In the case of
's, I suggest that it isa the word 'BE: present, singular' with the one exceptional
feature that its whole isa the morpheme {s} - exactly the same morpheme as we
find in plural nouns, other singular verbs and possessives. As in other uses, {s}
needs to be part of a complete word-form, so it creates a special form called a
'host-form' to combine it with a suitable word-form to the left.
In more complex cases ('special clitics' - Zwicky 1977) the position of the
clitic is fixed by the morphology of the host-form and conflicts with the
WHAT IS WORD GRAMMAR? 21

demands of syntax, as in the French example (3) where en would follow deux
(*Paul mange deux en) if it were not attached by cliticization to mange, giving a
single word-form en mange.

(3) Paul en mange deux
Paul of-them eats two
'Paul eats two of them.'

Once again we can explain this special behaviour if we analyze en as an ordinary
word EN whose shape (whole) is the affix {en}. There is a great deal more to be
said about clitics, but not here. For more detail see Hudson (2001) and
Camdzic and Hudson (2002).

8. Syntax
As in most other theories, syntax is the best developed part of WG, which
offers explanations for most of the 'standard' complexities of syntax such as
extraction, raising, control, coordination, gapping and agreement. However the
WG view of syntax is particularly controversial because of its rejection of phrase
structure. WG belongs to the family of 'dependency-based' theories, in which
syntactic structure consists of dependencies between pairs of single words. As
we shall see below, WG also recognizes 'word-strings', but even these are not
the same as conventional phrases.
A syntactic dependency is a relationship between two words that are
connected by a syntactic rule. Every syntactic rule (except for those involved in
coordination) is 'carried' by a dependency, and every dependency carries at
least one rule that applies to both the dependent and its 'parent' (the word on
which it depends). These word-word dependencies form chains which link
every word ultimately to the word which is the head of the phrase or sentence;
consequently the individual links are asymmetrical, with one word depending
on the other for its link to the rest of the sentence. Of course in some cases the
direction of dependency is controversial; in particular, published WG analyses
of noun phrases have taken the determiner as head of the phrase, though this
analysis has been disputed and may turn out to be wrong (Van Langendonck
1994; Hudson 2004). The example in Figure 9 illustrates all these
characteristics of WG syntax.

Figure 9 (key: word classes and dependency types)
A dependency analysis has many advantages over one based on phrase
structure. For example, it is easy to relate a verb to a lexically selected
preposition if they are directly connected by a dependency, as in the pair
consists of in Figure 9; but it is much less easy (and natural) to do so if the
preposition is part of a prepositional phrase. Such lexical interdependencies are
commonplace in language, so dependency analysis is particularly well suited to
descriptions which focus on 'constructions' - idiosyncratic patterns not covered
by the most general rules (Holmes and Hudson 2005). A surface dependency
analysis (explained below) can always be translated into a phrase structure by
building a phrase for each word consisting of that word plus the phrases of all
the words that depend on it (e.g. a sentence; of a sentence; and so on); but
dependency analysis is much more restrictive than phrase-structure analysis
because of its total flatness. Because one word can head only one phrase it is
impossible to build a dependency analysis which emulates a VP node or 'unary
branching'. This restrictiveness is welcome, because it seems that such analyses
are never needed.
In contrast, the extra richness of dependency analysis lies partly in the
labelled dependency links, and partly in the possibility of multiple
dependencies. In a flat structure, in contrast with phrase structure, it is
impossible to distinguish co-dependencies (e.g. a verb's subject and object) by
configuration, so labels are the only way to distinguish them. There is clearly a
theoretical trade-off between phrase structure and labelled functions: the more
information is given in one, the less needs to be given in the other. The general
theory of WG is certainly compatible with phrase structure - after all, we
undoubtedly use part-whole structures in other areas of cognition, and they play
an important role in morphology - but it strongly favours dependency analysis
because labelled links are ubiquitous in the cognitive network, both in
semantics, and elsewhere. If knowledge is generally organized in terms of
labelled links, why not also in syntax? But if we do use labelled links
(dependencies) in syntax, phrase structure is redundant.
Syntactic structures can be much more complex than the example in
Figure 9. We shall briefly consider just three kinds of complication: structure-
sharing, coordination and unreal words. Structure-sharing is found when
one word depends on more than one other word - i.e. when it is 'shared' as a
dependent. The notion is familiar from modern phrase-structure analyses,
especially Head-driven Phrase Structure Grammar (HPSG) (Pollard and Sag
1994: 19), where it is described as 'the central explanatory mechanism', and it is
the main device in WG which allows phrases to be discontinuous. (In
recognizing structure-sharing, WG departs from the European tradition of
dependency analysis which generally allows only strictly 'projective', continuous
structures such as Figure 9.) Figure 10 illustrates two kinds of structure-sharing
- in raising (you shared by have and been) and in extraction (what shared by have,
been, looking and at). The label 'x<' means 'extractee', and 's' means 'sharer'
(otherwise known as 'xcomp' or 'incomplement').

Figure 10

This diagram also illustrates the notion 'surface structure' mentioned above.
Each dependency is licenced by the grammar network, but when the result is
structure-sharing, just one of these dependencies is drawn above the words; the
totality of dependencies drawn in this way constitutes the sentence's surface
structure. In principle any of the competing dependencies could be chosen, but
in general only one choice is compatible with the 'geometry' of a well-formed
surface structure, which must be free of 'tangling' (crossing dependencies - i.e.
discontinuous phrases) and 'dangling' (unintegrated words). There are no such
constraints on the non-surface dependencies. (For extensive discussion of how
this kind of analysis can be built into a parsing algorithm, see Hudson 2000c;
for a comparison with phrase-structure analyses of extraction, see Hudson
2003c.)
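
The 'geometry' of a well-formed surface structure can be checked mechanically. The sketch below is an illustration only (the encoding of dependencies as index pairs is invented for the example): it rejects structures with crossing links ('tangling') and with more than one unintegrated word ('dangling').

```python
# A sketch of the well-formedness check on a surface structure: one parent
# per word, no crossing links, and exactly one unintegrated root word.

def well_formed_surface(n_words, deps):
    """deps: list of (dependent, parent) word-index pairs."""
    integrated = {d for d, p in deps}
    if len(integrated) != n_words - 1:       # all but the root must be integrated
        return False
    for i, (d1, p1) in enumerate(deps):      # no two links may cross
        a1, b1 = sorted((d1, p1))
        for d2, p2 in deps[i + 1:]:
            a2, b2 = sorted((d2, p2))
            if a1 < a2 < b1 < b2 or a2 < a1 < b2 < b1:
                return False
    return True

# 'Short sentences make good examples', indices 0-4, root = make (index 2):
print(well_formed_surface(5, [(0, 1), (1, 2), (3, 4), (4, 2)]))   # True
print(well_formed_surface(4, [(0, 2), (1, 3), (2, 3)]))           # False: tangling
```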
The second complication is coordination. The basis of coordination is
that conjuncts must share their 'external' dependencies - dependencies (if any)
to words outside the coordination. The structure of the coordination itself (in
terms of 'conjuncts' and 'coordinators') is analyzed in terms of 'word-strings',
simple undifferentiated strings of words whose internal organization is
described in terms of ordinary dependencies. A word string need not be a
phrase, but can consist of two (or more) mutually independent phrases as in the
example of Figure 11, where the coordination and conjuncts are bounded by
brackets: {[...] [...]}.

Figure 11
Unreal words are the WG equivalent of 'empty categories' in other
theories. Until recently I have rejected such categories for lack of persuasive
evidence; for example, my claim has always been that verbs which appeared to
have no subject really didn't have any subject at all. So an imperative (Hurry!)
had no subject, rather than some kind of covert subject. However I am now
convinced that, for at least some languages, this is wrong. The evidence comes
from case-agreement between subjects and predicatives (WG sharers) in
languages such as Icelandic and Ancient Greek (Hudson 2003a); and the
conclusion is that some words have no realization (Creider and Hudson, this
volume). In this new analysis, therefore, an imperative verb does have a subject:
the word you. This is the ordinary word you, with its ordinary meaning, but
exceptionally, it is unrealized because this is what imperative verbs require of
their subjects. As Creider and I show, unrealized words may explain a wide
range of syntactic facts.
This discussion of syntax merely sets the scene for many other syntactic
topics, all of which now have reasonably well-motivated WG treatments: word
order, agreement, features, case-selection, 'zero' dependents. The most
important point made is probably the claim that in syntax the network
approach to language and cognition in general leads naturally to dependency
analysis rather than to phrase structure.

9. Semantics
As in any other theory, WG has a compositional semantics in which each word
in a sentence contributes some structure that is stored as its meaning. However,
these meanings are ordinary concepts which, like every other concept, are
defined by a network of links to other concepts. This means that there can be
no division between 'purely linguistic' meaning and 'encyclopedic' meaning.
For instance the lexemes APPLE and PEAR have distinct senses, the ordinary
concepts 'apple' and 'pear', each linked to its known characteristics in the
network of general knowledge. It would be impossible to distinguish them
merely by the labels 'apple' and 'pear' because (as we saw in section 3.2) labels
on concepts are just optional mnemonics; the true definition of a concept is
provided by its various links to other concepts. The same is true of verb
meanings: for example, the sense of EAT is defined by its relationships to other
concepts such as 'put', 'mouth', 'chew', 'swallow' and 'food'. The underlying
view of meaning is thus similar to Fillmore's Frame Semantics, in which lexical
meanings are defined in relation to conceptual 'frames' such as the one for
'commercial transaction' which is exploited by the definitions of 'buy', 'sell' and
so on. (See Hudson forthcoming for a WG analysis of commercial transaction
verbs.)
Like everything else in cognition, WG semantic structures form a network
with labelled links like those that are widely used in Artificial Intelligence. As in
Jackendoff's Conceptual Semantics (1990), words of all word classes contribute
the same kind of semantic structure, which in WG is divided into 'sense'
(general categories) and 'referent' (the most specific individual or category
referred to). The contrast between these two kinds of meaning can be
compared with the contrast in morphology (section 7) between stem and whole:
a word's lexeme provides both its stem and its sense, while its inflection
provides its whole and its referent. For example, the word dogs is defined by a
combination of the lexeme DOG and the inflection 'plural', so it is classified as
'DOG: plural'. Its lexeme defines the sense, which is 'dog', the general concept
of a (typical) dog, while its inflection defines the referent as a set with more than
one member. As in other theories the semantics cannot identify the particular
set or individual which a word refers to on a particular occasion of use, and
which we shall call simply 'set-s'; this identification process must be left to the
pragmatics. But the semantics does provide a detailed specification for what that
individual referent might be - in this case, a set, each of whose members is a
dog. One WG notation for the two kinds of meaning parallels that for the two
kinds of word-form: a straight line for the sense and the stem, which are both
retrieved directly from the lexicon, and a curved line for the referent and the
shape, which both have to be discovered by inference. The symmetry of these
relationships can be seen in Figure 12.

Figure 12

The way in which the meanings of the words in a sentence are combined is
guided by the syntax, but the semantic links are provided by the senses
themselves. Figure 13 gives the semantic structure for Dogs barked, where the
link between the word meanings is provided by 'bark', which has an 'agent' link
(often abbreviated 'er' in WG) to its subject's referent. If we call the particular
act of barking that this utterance refers to 'event-e', the semantic structure must
show that the agent of event-e is set-s. As with nouns, verb inflections contribute
directly to the definition of the referent, but a past-tense inflection does this by
limiting the event's time to some time ('t1') that preceded the moment of
speaking ('now'). Figure 13 shows all these relationships, with the two words
labelled 'w1' and 'w2'. For the sake of simplicity the diagram does not show
how these word tokens inherit their characteristics from their respective types.

Figure 13

The analysis of Dogs barked illustrates an important characteristic of WG
semantic structures. A word's 'basic' sense - the one that is inherited from its
lexeme - is modified by the word's dependents; and the result of this
modification is a second sense, more specific than the basic sense but more
general than the referent. This intermediate sense contains the meaning of the
head word plus its dependent, so in effect it is the meaning of that phrase. In
contrast with the syntax, therefore, the semantic structure contains a node for
each phrase, as well as nodes for the individual words - in short, a phrase
structure. Moreover, this phrase structure must be strictly binary because there
are reasons for believing that dependents modify the head word one at a time,
each defining a distinct concept, and that the order of combining may
correspond roughly to the bracketing found in conventional phrase structure.
For example, although subjects and objects are co-dependents, subjects seem to
modify the concepts already defined by objects, rather than the other way
round, so Dogs chase cats defines the concepts 'chase cats' and 'dogs chase cats',
but not 'dogs chase' - in short, a WG semantic structure contains something
like a VP node. This step-wise composition of word meanings is called
'semantic phrasing'.
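
As a toy illustration of semantic phrasing, the fragment below composes senses one dependent at a time, objects before subjects, so that 'chase cats' and 'dogs chase cats' are built but 'dogs chase' never is. Real WG senses are network nodes rather than strings; the representation here is invented for the example.

```python
# A toy illustration of semantic phrasing: dependents modify the head's sense
# one at a time, each step defining a new, more specific concept, so the
# composition is strictly binary. Senses are shown as strings for simplicity.

def modify(sense, dependent_sense):
    return f'({sense} + {dependent_sense})'

basic = 'chase'                             # basic sense, from the lexeme
with_object = modify(basic, 'cats')         # '(chase + cats)'
with_subject = modify(with_object, 'dogs')  # '((chase + cats) + dogs)'
print(with_subject)   # no concept 'dogs chase' is ever constructed
```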
This brief account of WG semantics has described some of the basic ideas,
but has not been able to illustrate the analyses that these ideas permit. In the
WG literature there are extensive discussions of lexical semantics, and some
explorations of quantification, definiteness and mood. However it has to be said
that the semantics of WG is much less well researched than its syntax.

10. Processing
The main achievements on processing are a theory of parsing and a theory of
syntactic difficulty; but current research is focused on a general theory of cognitive
processing in which language processing falls out as a particular case (Hudson, in
preparation). In this theory, processing is driven by a combination of spreading
activation, default inheritance and binding. Like any other psychological model it
needs to be tested, and one step towards this has been taken by building two
computer systems called WGNet++ (see www.phon.ucl.ac.uk/home/WGNet/wgnet++.htm)
and Babbage (www.babbagenet.org) for experimenting with
complex networks.
The most obvious advantage of WG for a parser, compared with
transformational theories, is the lack of freely-occurring 'invisible' words (in
contrast with the unrealized words discussed above, which can always be
predicted from other realized words such as imperative verbs); but the
dependency basis also helps by allowing each incoming word to be integrated
with the words already processed without the need to build (or rebuild) higher
syntactic nodes.
A very simple algorithm guides the search for dependencies in a way that
guarantees a well-formed surface structure (in the sense defined in section 8):
the current word first tries to 'capture' the nearest non-dependent word as its
dependent, and if successful repeats the operation; then it tries to 'submit' as a
dependent to the nearest word that is not part of its own phrase (or, if
unsuccessful, to the word on which this word depends, and so on recursively up
the dependency chain); and finally it checks for coordination. (More details can
be found in Hudson 2000c.) The algorithm is illustrated in the following
sequence of 'snapshots' in the parsing of Short sentences make good examples,
where the last word illustrates the algorithm best. The arrows indicate syntactic
dependencies without the usual labels; and it is to be understood that the
semantic structure is being built simultaneously, word by word. The structure
after the dash is the output of the parser at that point.

(4) a w1 = short. No progress - w1.
    b w2 = sentences. Capture - w1 w2.
    c w3 = make. Capture - w1 w2 w3.
    d w4 = good. No progress - w1 w2 w3, w4.
    e w5 = examples. Capture - w4 w5.
    f Submit - w1 w2 w3 (w4 <-) w5.
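
The sketch below is one way to make the capture-and-submit cycle concrete in Python. It assumes a grammar oracle licenses(head, dependent) supplied from outside, and it omits coordination, semantics and non-surface dependencies, so it is an illustration of the algorithm rather than a WG parser.

```python
# A simplified sketch of the capture/submit algorithm (after Hudson 2000c).
# 'words' is a list of word strings; the result maps each dependent's index
# to its head's index. The toy grammar below covers only example (4).

def parse(words, licenses):
    parent = {}
    for i in range(len(words)):
        # Capture: repeatedly take the nearest preceding non-dependent word
        # as a dependent of the current word.
        while True:
            free = [j for j in range(i) if j not in parent]
            if free and licenses(words[i], words[free[-1]]):
                parent[free[-1]] = i
            else:
                break
        # Submit: offer the current word as a dependent to the nearest word
        # outside its own phrase, then on up the dependency chain.
        free = [j for j in range(i) if j not in parent]
        k = free[-1] if free else None
        while k is not None and not licenses(words[k], words[i]):
            k = parent.get(k)
        if k is not None:
            parent[i] = k
    return parent

def licenses(head, dep):
    return (head, dep) in {('sentences', 'short'), ('make', 'sentences'),
                           ('examples', 'good'), ('make', 'examples')}

print(parse(['short', 'sentences', 'make', 'good', 'examples'], licenses))
# {0: 1, 1: 2, 3: 4, 4: 2} - the surface structure built step by step in (4)
```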

The familiar complexities of syntax are mostly produced by discontinuous
patterns. As explained in section 8, the discontinuous phrases are shown by
dependencies which are drawn beneath the words, leaving a straightforward
surface structure. For example, subject-raising in He has been working is shown
by non-surface subject links from both been and working to he. Once the surface
structure is in place, these extra dependencies can be inferred more or less
mechanically (bar ambiguities), with very little extra cost to the parser.
The theory of syntactic complexity (Hudson 1996b) builds on this
incremental parsing model. The aim of the parser is to link each word as a
dependent to some other word, and this link can most easily be established
while both words are still active in working memory. Once a word has become
inactive it can be reconstructed (on the basis of the meaning that it contributed),
but this is costly. The consequence is that short links are always preferred to
long ones. This gives a very simple basis for calculating the processing load for a
sentence (or even for a whole text): the mean 'dependency distance' (calculated
as the number of other words between a word and the word on which it
depends). Following research by Gibson (1998) the measure could be made
more sophisticated by weighting intervening words, but even the simple
measure described here gives plausible results when applied to sample texts
(Hiranuma 2001). It is also supported by a very robust statistic about English
texts: that dependency links tend to be very short. (Typically 70 per cent of
words are adjacent to the word on which they depend, with 10 per cent
variation in either direction according to the text's difficulty.)
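
Mean dependency distance is trivial to compute from a surface structure. The sketch below uses the same dependent-to-head mapping as the parser sketch above; the encoding is illustrative, not a standard one.

```python
# A sketch of mean dependency distance: the number of words intervening
# between each word and the word on which it depends, averaged.

def mean_dependency_distance(parent):
    distances = [abs(dep - head) - 1 for dep, head in parent.items()]
    return sum(distances) / len(distances)

# Surface structure of 'Short sentences make good examples':
print(mean_dependency_distance({0: 1, 1: 2, 3: 4, 4: 2}))   # (0+0+0+1)/4 = 0.25
```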

11. Conclusions
WG addresses questions from a number of different research traditions. As in
formal linguistics, it is concerned with the formal properties of language
structure; but it also shares with cognitive linguistics a focus on how these
structures are embedded in general cognition. Within syntax, it uses
dependencies rather than phrase structure but also recognizes the rich
structures that have been highlighted in the phrase-structure tradition. In
morphology it follows the European tradition which separates morphology
strictly from syntax, but also allows exceptional words which (thanks to
cliticization) contain the forms of smaller words. And so on through other areas
of language. Every theoretical decision is driven by two concerns: staying true to
the facts of language, and providing the simplest possible explanation for these
facts. The search for new insights is still continuing, and more cherished beliefs
may well have to be abandoned; but the most general conclusion so far seems
to be that language is mostly very much like other areas of cognition.

References
Aitchison, Jean (1987), Words in the Mind: An Introduction to the Mental Lexicon.
Oxford: Blackwell.
Altmann, Gerry (1997), The Ascent of Babel: An Exploration of Language, Mind and
Understanding. Oxford: Oxford University Press.
Anderson, John (1971), 'Dependency and grammatical functions'. Foundations of
Language, 7, 30-7.
Bates, Elizabeth, Elman, Jeffrey, Johnson, Mark, Karmiloff-Smith, Annette, Parisi,
Domenico and Plunkett, Kim (1998), 'Innateness and emergentism', in William
Bechtel and George Graham (eds), A Companion to Cognitive Science. Oxford:
Blackwell, pp. 590-601.
Bennett, David (1994), 'Stratificational Grammar', in Ronald Asher (ed.), Encyclopedia
of Language and Linguistics. Oxford: Elsevier, pp. 4351-56.
Bresnan, Joan (1978), 'A realistic transformational grammar', in Morris Halle, Joan
Bresnan, and George Miller (eds), Linguistic Theory and Psychological Reality.
Cambridge, MA: MIT Press, pp. 1-59.
Brown, Dunstan, Corbett, Greville, Fraser, Norman, Hippisley, Andrew and
Timberlake, Alan (1996), 'Russian noun stress and network morphology'.
Linguistics, 34, 53-107.
Butler, Christopher (1985), Systemic Linguistics: Theory and Application. London:
Arnold.
Camdzic, Amela and Hudson, Richard (2002), 'Clitics in Serbo-Croat-Bosnian'. UCL
Working Papers in Linguistics, 14, 321-54.
Chekili, Ferid (1982), 'The Morphology of the Arabic Dialect of Tunis'. (Unpublished
doctoral dissertation, University of London).
Chomsky, Noam (1957), Syntactic Structures. The Hague: Mouton.
— (1965), Aspects of the Theory of Syntax. Cambridge, MA: MIT Press.
— (1970), 'Remarks on nominalization', in Rodney Jacobs and Peter Rosenbaum (eds),
Readings in Transformational Grammar. London: Ginn, pp. 184-221.
Creider, Chet (1999), 'Mixed categories in Word Grammar: Swahili infinitival nouns'.
Linguistica Atlantica, 21, 53-68.
Creider, Chet and Hudson, Richard (1999), 'Inflectional morphology in Word
Grammar'. Lingua, 107, 163-87.
— (Ch. 2 in this volume), 'Case agreement in Ancient Greek: implications for a theory
of covert elements'.
Eppler, Eva (2005), 'The Syntax of German-English code-switching'. (Unpublished
doctoral dissertation, UCL).
Evans, Roger and Gazdar, Gerald (1996), 'DATR: a language for lexical knowledge
representation'. Computational Linguistics, 22, 167-216.
Fillmore, Charles (1975), 'An alternative to checklist theories of meaning'. Proceedings of
the Berkeley Linguistics Society, 1, 123-31.
— (1976), 'Frame semantics and the nature of language'. Annals of the New York
Academy of Sciences, 280, 20-32.
Fraser, Norman (1985), 'A Word Grammar Parser' (Unpublished doctoral dissertation,
University of London).
— (1989), 'Parsing and dependency grammar'. UCL Working Papers in Linguistics, 1,
296-319.
— (1993), 'Dependency Parsing' (Unpublished doctoral dissertation, UCL).
— (1994), 'Dependency Grammar', in Ronald Asher (ed.), Encyclopedia of Language and
Linguistics. Oxford: Elsevier, pp. 860-4.
Fraser, Norman and Hudson, Richard (1992), 'Inheritance in Word Grammar'.
Computational Linguistics, 18, 133-58.
Gibson, Edward (1998), 'Linguistic complexity: locality of syntactic dependencies'.
Cognition, 68, 1-76.
Gisborne, Nikolas (1993), 'Nominalisations of perception verbs'. UCL Working Papers
in Linguistics, 5, 23-44.
— (1996), 'English Perception Verbs'. (Unpublished doctoral dissertation, UCL).
— (2000), 'The complementation of verbs of appearance by adverbs', in Ricardo
Bermudez-Otero, David Denison, Richard Hogg and C. McCully (eds), Generative
Theory and Corpus Studies: A Dialogue from 10 ICEHL. Berlin: Mouton de Gruyter,
pp. 53-75.
— (2001), 'The stative/dynamic contrast and argument linking'. Language Sciences, 23,
603-37.
Goldberg, Adele (1995), Constructions: A Construction Grammar Approach to Argument
Structure. Chicago: University of Chicago Press.
Gorayska, Barbara (1985), 'The Semantics and Pragmatics of English and Polish with
Reference to Aspect' (Unpublished doctoral dissertation, UCL).
Halliday, Michael (1961), 'Categories of the theory of grammar'. Word, 17, 241-92.
— (1967-8), 'Notes on transitivity and theme in English'. Journal of Linguistics, 3, 37-
82, 199-244; 4, 179-216.
— (1985), An Introduction to Functional Grammar. London: Arnold.
Hiranuma, So (1999), 'Syntactic difficulty in English and Japanese: A textual study'.
UCL Working Papers in Linguistics, 11, 309-21.
— (2001), 'The Syntactic Difficulty of Japanese Sentences' (Unpublished doctoral
dissertation, UCL).
Holmes, Jasper (2004), 'Lexical Properties of English Verbs' (Unpublished doctoral
dissertation, UCL).
Holmes, Jasper and Hudson, Richard (2005), 'Constructions in Word Grammar', in
Jan-Ola Ostman and Mirjam Fried (eds), Construction Grammars: Cognitive Grounding
and Theoretical Extensions. Amsterdam: Benjamins, pp. 243-72.
Hudson, Richard (1964), 'A Grammatical Analysis of Beja' (Unpublished doctoral
dissertation, University of London).
— (1970), English Complex Sentences: An Introduction to Systemic Grammar. Amsterdam:
North-Holland.
— (1976), Arguments for a Non-transformational Grammar. Chicago: University of
Chicago Press.
— (1980a), Sociolinguistics. Cambridge: Cambridge University Press.
— (1980b), 'Constituency and dependency'. Linguistics, 18, 179-98.
— (1981), 'Panlexicalism'. Journal of Literary Semantics, 10, 67-78.
— (1984), Word Grammar. Oxford: Blackwell.
— (1989), 'Towards a computer-testable Word Grammar of English'. UCL Working
Papers in Linguistics, 1, 321-39.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'Raising in syntax, semantics and cognition', in Iggy Roca (ed.), Thematic
Structure: Its Role in Grammar. The Hague: Mouton, pp. 175-98.
— (1993), 'Do we have heads in our minds?', in Greville Corbett, Scott McGlashen and
Norman Fraser (eds), Heads in Grammatical Theory. Cambridge: Cambridge
University Press, pp. 266-91.
— (1995), Word Meaning. London: Routledge.
— (1996a), Sociolinguistics (2nd edition). Cambridge: Cambridge University Press.
— (1996b), 'The difficulty of (so-called) self-embedded structures'. UCL Working Papers
in Linguistics, 8, 283-314.
— (1997a), 'The rise of auxiliary DO: verb non-raising or category-strengthening?'.
Transactions of the Philological Society, 95, 41-72.
— (1997b), 'Inherent variability and linguistic theory'. Cognitive Linguistics, 8, 73-108.
— (1998), English Grammar. London: Routledge.
— (2000a), '*I amn't'. Language, 76, 297-323.
— (2000b), 'Gerunds and multiple default inheritance'. UCL Working Papers in
Linguistics, 12, 303-35.
— (2000c), 'Discontinuity'. Traitement Automatique des Langues, 41, 15-56.
— (2001), 'Clitics in Word Grammar'. UCL Working Papers in Linguistics, 13, 293-4.
— (2003a), 'Case-agreement, PRO and structure-sharing'. Research in Language, 1, 7-33.
— (2003b), 'Mismatches in default inheritance', in Elaine Francis and Laura Michaelis
(eds), Linguistic Mismatch: Scope and Theory. Stanford: CSLI, pp. 269-317.
— (2003c), 'Trouble on the left periphery'. Lingua, 113, 607-42.
— (2004), 'Are determiners heads?'. Functions of Language, 11, 7-43.
— (forthcoming), 'Buying and selling in Word Grammar', in Jozsef Andor and Peter
Pelyvas (eds), Empirical, Cognitive-Based Studies In The Semantics-Pragmatics
Interface. Oxford: Elsevier.
— (in preparation) Advances in Word Grammar. Oxford: Oxford University Press.
Hudson, Richard and Holmes, Jasper (2000) 'Re-cycling in the Encyclopedia', in Bert
Peeters (ed.), The Lexicon/Encyclopedia Interface. Oxford: Elsevier, pp. 259-90.
Jackendoff, Ray (1990), Semantic Structures. Cambridge, MA: MIT Press.
Karmiloff-Smith, Annette (1992), Beyond Modularity: A developmental perspective on
cognitive science. Cambridge, MA: MIT Press.
Kaplan, Ron and Bresnan, Joan (1982), 'Lexical-functional Grammar: a formal system
for grammatical representation', in Joan Bresnan (ed.), The Mental Representation of
Grammatical Relations. Cambridge, MA: MIT Press, pp. 173-281.
Kreps, Christian (1997), 'Extraction, Movement and Dependency Theory' (Unpublished
doctoral dissertation, UCL).
Lakoff, George (1987), Women, Fire and Dangerous Things: What Categories Reveal
about the Mind. Chicago: University of Chicago Press.
Lamb, Sidney (1966), An Outline of Stratificational Grammar. Washington, DC:
Georgetown University Press.
— (1999), Pathways of the Brain: The Neurocognitive Basis of Language. Amsterdam:
Benjamins.
Langacker, Ronald (1987), Foundations of Cognitive Grammar I. Theoretical Prerequisites.
Stanford: Stanford University Press.
— (1990), Concept, Image and Symbol. The Cognitive Basis of Grammar. Berlin: De Gruyter.
Luger, George and Stubblefield, William (1993), Artificial Intelligence. Structures and
Strategies for Complex Problem Solving. Redwood City, CA: Benjamin/Cummings
Pub. Co.
Lyons, John (1977), Semantics. Cambridge: Cambridge University Press.
McCawley, James (1968), 'Concerning the base component of a transformational
grammar'. Foundations of Language, 4, 243-69.
Pinker, Steven (1994), The Language Instinct. Harmondsworth: Penguin Books.
Pinker, Steven and Prince, Alan (1988), 'On language and connectionism: Analysis of a
Parallel Distributed Processing model of language acquisition'. Cognition, 28, 73-193.
Pollard, Carl and Sag, Ivan (1994), Head-driven Phrase Structure Grammar. Chicago:
University of Chicago Press.
Reisberg, Daniel (1997), Cognition. Exploring the Science of the Mind. New York: W. W.
Norton.
Robinson, Jane (1970), 'Dependency structures and transformational rules'. Language,
46, 259-85.
Rosta, Andrew (1994), 'Dependency and grammatical relations'. UCL Working Papers
in Linguistics, 6, 219-58.
— (1996), 'S-dependency'. UCL Working Papers in Linguistics, 8, 387-421.
— (1997), 'English Syntax and Word Grammar Theory' (Unpublished doctoral
dissertation, UCL).
Shaumyan, Olga (1995), 'Parsing English with Word Grammar' (Unpublished MSc
dissertation, Imperial College London).
Sugayama, Kensei (1991), 'More on unaccusative Sino-Japanese complex predicates in
Japanese'. UCL Working Papers in Linguistics, 3, 397-415.
— (1992), 'A word-grammatic account of complements and adjuncts in Japanese
(interim report)'. Kobe City University Journal, 43, 89-99.
— (1993), 'A word-grammatic account of complements and adjuncts in Japanese'.
Proceedings of the 15th International Congress of Linguistics, Vol. 2, Universite Laval,
pp. 373-6.
— (1996), 'Semantic structure of eat and its Japanese equivalent taberu: a Word-
Grammatic account'. Translation and Meaning, 4, 193-202.
Taylor, John (1989), Linguistic Categorisation: An Essay in Cognitive Linguistics. Oxford:
Oxford University Press.
Tesniere, Lucien (1959), Elements de Syntaxe Structurale. Paris: Klincksieck.
Touretzky, David (1986), The Mathematics of Inheritance Systems. Los Altos, CA: M.
Kaufmann Publishers.
Tzanidaki, Dimitra (1995), 'Greek word order: towards a new approach'. UCL Working
Papers in Linguistics, 7, 247-77.
— (1996a), 'Configurationality and Greek clause structure'. UCL Working Papers in
Linguistics, 8, 449-84.
— (1996b), 'The Syntax and Pragmatics of Subject and Object Position in Modern
Greek' (Unpublished doctoral dissertation, UCL).
van Langendonck, Willy (1987), 'Word Grammar and child grammar'. Belgian Journal
of Linguistics, 2, 109-32.
— (1994), 'Determiners as heads?'. Cognitive Linguistics, 5, 243-59.
Volino, Max (1990), 'Word Grammar, Unification and the Syntax of Italian Clitics'
(Unpublished doctoral dissertation, Edinburgh University).
Zwicky, Arnold (1977), On Clitics. Bloomington: IULC.
— (1992), 'Some choices in the theory of morphology', in Robert Levine (ed. ), Formal
Grammar: Theory and Implementation. Oxford: Oxford University Press, pp. 327-71.
Part I
Word Grammar Approaches to Linguistic Analysis:
Its explanatory power and applications
2. Case Agreement in Ancient Greek: Implications for
a theory of covert elements

CHET CREIDER AND RICHARD HUDSON

Abstract
In Ancient Greek a predicative adjective or noun agrees in case with the subject
of its clause, even if the latter is covert. This provides compelling evidence for
'empty' (i. e. covert) elements in syntax, contrary to the tradition in WG theory.
We present an analysis of empty elements which exploits a feature unique to
WG, the separation of 'existence' propositions from propositions dealing with
other properties; an empty word has the property of not existing (or, more
technically, a quantity of 0). We contrast this theory with the Chomskyan PRO and
the Head-driven Phrase Structure Grammar (HPSG) 'potential' SUBCAT list.

1. Introduction
Case agreement in Ancient Greek1 has attracted a small but varied set of
treatments in the generative tradition (Andrews 1971; Lecarme 1978; Quicoli
1982). In this literature the problems were framed and solved in transforma-
tional frameworks. In the present chapter we wish to consider the data from the
point of view of the problems they pose for a theory of case assignment and
phonologically empty elements in a modern, declarative framework - Word
Grammar (WG; Hudson 1990). We present an analysis of empty elements
which exploits a feature unique to WG, the separation of existence propositions
from propositions dealing with other properties; and we contrast it with earlier
WG analyses in which these 'empty' elements are simply absent and with
Chomskyan analyses in terms of PRO, a specific linguistic item which is always
covert. The proposed analysis is similar in some respects to the one proposed
by Pollard and Sag (1994) for HPSG.

2. The Data
We confine our attention to infinitival constructions. The infinitive in Ancient
Greek is not inflected for person, number or case and hence, when predicate
adjectives and predicate nominals appear as complements of infinitives, it is
necessary to account for the case of these elements. One purpose of this
discussion is to show that traditional grammars are right to explain the case of
predicates in terms of agreement with the subject, but this analysis works most
naturally if we also assume some kind of 'null' subject for some infinitives. The
examples that support the null subject take the accusative case, and are
discussed in section 2.1; there are well-known exceptions which are traditionally
described in terms of 'attraction' and which are discussed in section 2.2.

2.1. Accusative subjects
Traditional grammars of Greek state that the subject of an infinitive takes the
accusative case.
Examples are usually given of the accusative plus infinitive construction as in
the following:

(1) ekeleuon autous poreuesthai
they-ordered them(acc) to-proceed
'they ordered that they should proceed' (Smyth 1956: 260, X.A. 4.2.12)
(2) phe:si tous andras apelthein
s/he-says the(acc) men(acc) to-go-away
's/he says that the men went away' (Goodwin 1930: 196)

A partial syntactic analysis of (2) is shown in Figure 1. In this analysis the
infinitive is a dependent (object) of the main verb, and it has a dependent
(subject) which bears the accusative case. We assume a standard analysis in
which the definite determiner is a head with respect to a 'determiner phrase'.

Figure 1

Since the subject is accusative, elements (predicate nouns and adjectives)
which agree with it are accusative (in contrast with the nominative case found
when the subject is nominative):

(3) a Klearkhos phugas e:n
Clearchus(nom) exile(nom) was (contrast phugada, 'exile(acc)')
'Clearchus was an exile.' (X.A. 1.1.9)
b nomizo: gar huma:s emoi einai kai patrida
I-think for you(acc) me(dat) to-be and fatherland(acc)
kai philous
and friends(acc)
'for I think you are to me both fatherland and friends' (X.A. 1.3.6)

The agreement of a predicative with the subject can be conveniently
diagrammed as in Figure 2. (In words, whatever a verb's subject and
predicative may be, their case must be the same.)

Figure 2

However, note that the predicative may be accusative even when the accusative
subject is itself absent:

(4) philanthro:pon einai dei
humane(acc) to-be must
'(one) must be humane' (I. 2.15)
(5) oud' ara po:s e:n en pantess' ergoisi dae:mona
not then in-any-way was in all works skilled(acc)
pho:ta genesthai
man(acc) to-become
'(one) could not then in any way become a man skilled in all works'
(H. Il. 23.670-71)

This can also be true even when there is a coreferential element in the higher
clause:

(6) exarkesei soi turannon genesthai
it-will-suffice you(dat) king(acc) to-become
'it will be enough for you to become king' (Liddell and Scott 1971,
P. Alc. 2.141.a)

Figure 3

A partial structure for (6) is presented in Figure 3. Such examples raise urgent
questions about the status of 'understood' subjects. If an understood subject is
simply one which is 'understood' in the semantics but entirely absent from the
syntax, it is hard to explain the case of the predicative in these examples. We
return to these questions below.
When the subject of the infinitive is identical to that of the main verb, it is
normally not expressed:

(7) all' hod' ane:r ethelei peri panto:n emmenai
but this man(nom) he-wishes above all(gen) to-be
allo:n
others(gen)
'but this man wishes to be above all others' (H. Il. 1.287)

Agreeing elements may nevertheless appear in the accusative:

(8) enth' eme men pro:tisth' hetaroi lissonto
then me(acc) then first-of-all companions(nom) they-begged
epeesin turo:n ainumenous ienai palin
words(dat) cheeses(gen) taking(acc-pl) to-go back
'thereupon my companions then first of all begged me with words to take (i.e.
that they might take) some of the cheeses (and) to depart' (H. Od. 9.224-5)

Examples like (8) are hard to explain without assuming some kind of accusative
subject for the infinitive with which the predicative participle ('taking') can
agree, as hinted at in Figure 4; but of course there is no overt accusative subject.

Figure 4

In situations of emphasis, an infinitival subject may be expressed (Chantraine
1953: 312; Kühner and Gerth 1955, 2: 30-31; Smyth 1956: 439). When
expressed it appears in the accusative case:

(9) ego: men toinun eukhomai prin tauta epidein huph' humo:n
I(nom) as-for therefore pray (that) before these-things to-see by you
genomena murias eme ge kata te:s ge:s orguias
having-become 10,000 me(acc) indeed under the earth fathoms
genesthai
to-become
'for my part, therefore, I pray that before I see these things having been brought
about by you, I may be ten thousand fathoms under the earth' (Kühner and
Gerth 1955: 31, X.A. 7.1.30)
(10) hoi Aiguptioi enomizon heo:utous pro:tous genesthai
the(nom) Egyptians(nom) they-thought themselves(acc) first(acc) to-be
panto:n anthro:po:n
all(gen) human-beings(gen)
'the Egyptians used to think they were the first of all human beings' (Kühner and
Gerth 1955: 31, Hdt. 2.2)
(11) to:n d' allo:n eme phe:mi polu propheresteron einai
of-those others me(acc) I-say by-far better(acc) to-be
'but of those others I say I am better by far' (H. Od. 8.221)

(Other examples for Homeric Greek in Il. 7.198, 13.269, 20.361 - Chantraine
1953: 312.)
The emphasis need not be strong, as the following example, with unstressed
clitic pronoun, shows:

(12) kai te me phe:si makhe: Tro:essi are:gein
and in-fact me(acc) s/he-says battle(dat) Trojans(dat) to-help
'and she says that I help the Trojans in battle' (H. Il. 1.521)

When the infinitive is used in exclamations with an overt subject, the latter
appears in the accusative:

(13) eme pathein tade
me(acc) to-suffer this
'That I should suffer this!' (A. Eum. 837)

These examples with overt accusative subjects strongly support the traditional
rule that infinitives have accusative subjects, so the question is how to allow this
generalization to extend to infinitives which appear to have no subject at all in
order to explain the accusative cases found on predicatives in infinitival clauses
in examples such as (4) and (5).

2.2 Non-accusative subjects
Greek provides a number of interesting alternatives to the possibilities given in
section 2.1. These are traditionally discussed under two headings, although the
process is the same in both cases:

Very many of the verbs which take the infinitive also have with them a personal
object, which stands in the case that the verb requires... If adjectival or substantival
predicate specifications are added to the infinitive, these stand either, by attraction,
in the same case as the personal object, or, with the attraction disregarded, in the
accusative (Kühner and Gerth 1955: 24)3
But if the subject of the governing verb is at the same time also the subject of the
infinitive, the subject of the infinitive is... omitted, and if adjectival or substantival
predicate specifications stand with the infinitive, these are put into the nominative
by attraction (ibid.: 29)4

In short, the predicative of an infinitive may have a case which is 'attracted' to
that of a nominal in the higher clause, whether its object or its subject.
Examples:

(14) emoi de ke kerdion eie: seu aphamartouse: khthona
me(dat) but would better it-be you(gen) losing(dat) earth(acc)
dumenai
to-go (beneath)
'but for me it would be better losing you to die' (H. Il. 6.410-11)
(15) dokeo: he:min Aigine:teo:n deesthai ton theon khre:sai
I-think us(dat) Aeginetans(gen) to-beg the(acc) god(acc) to-advise
timo:re:te:ro:n genesthai
helpers(gen) to-become
'I think the god has advised us to beg the Aeginetans to become (our) helpers'
(Hdt. 5.80)

Examples of attraction show that some infinitives do not have accusative
subjects, but they do not undermine the generalization that many do. The
analysis of attraction is tangential to our present concern, but is easily
accomplished via 'structure-sharing',5 where the higher nominal doubles up as
subject of the infinitival clause - for example, in (15) the genitive noun
'Aeginetans' is not only the complement of the higher verb 'beg' but also
subject of the lower infinitive. The proposed structure for this remarkably
complicated sentence is shown in Figure 5.

Figure 5

This analysis easily explains why the lower nominal predicate 'helpers' has
the same case as this shared nominal, but it does not help with examples where
even the higher nominal is merely implicit, as in (16). (We give an explanation
for examples of this type in section 6. )

(16) ethelo: de toi e:pios einai
I-wish but you(dat) kind(nom) to-be
'but I wish to be kind to you' (H. Il. 8.40)
CASE AGREEMENT IN ANCIENT GREEK 41

According to Chantraine (1953: 313) the relative frequency of attraction


increased from Homer forward into Attic authors and in the Attic period it
appears to have been obligatory in cases like (16), i. e. there are no examples
like (8) in Attic Greek. This may have been the reason traditional grammars
discuss attraction under two headings, one for nominative cases and the other
for oblique cases.

3. The Analysis of Case Agreement


First, note that morphological case, unlike gender or number, is a purely
morpho-syntactic property, so it is available to words and not to their meanings.
One consequence is that it is independent of reference.

(17) I saw him yesterday while he was on his way to the beach.

In (17) the three pronouns share a common set of semantic features (gender
and number) and have a common referent, but occur respectively in the
'objective', 'subjective' and 'possessive' cases (to use the terms of traditional
grammar). So far as we know, semantic co-reference between a nominal and a
pronoun never triggers case agreement in the latter, though it often requires
agreement for number and gender. A further consequence is that the only
possible 'target' for case agreement is a word (or other syntactic element); this
rules out a semantic account of case agreement, however attractive such an
account might be for number and gender.
Thus, faced with examples such as (18=6), where an infinitive has an
accusative predicative but no overt subject, we cannot explain the predicative's
case merely by postulating a subject argument in the semantic structure without
a supporting syntactic subject.

(18) exarkesei soi turannon genesthai
it-will-suffice you(dat) king(acc) to-become
'it will be enough for you to become king'

The argument X in the semantic structure 'X becoming a king' cannot by itself
carry the accusative case; there must also be some kind of accusative subject
nominal in the syntactic structure. Nor does a control analysis help, because the
controlling nominal is the pronoun soi, 'to you', which is dative; as expected, its
case has nothing to do with that of coreferential nominals.
The analysis seems to show that a specifically syntactic subject must be
present in order to account for the accusative case seen on the predicate
nominal. We accept this conclusion, though somewhat reluctantly because it
conflicts with the stress on concreteness which we consider an important
principle of Word Grammar. We are sceptical about the proliferation of empty
nodes in Chomskyan theories, and have always felt that the evidence for such
nodes rested heavily on theory-internal assumptions which we did not share. In
contrast, the evidence from case agreement strikes us as very persuasive, so we
now believe that syntactic structure may contain some 'null' elements which are
not audible or visible, such as a case-bearing subject in Ancient Greek infinitival
clauses; this is the analysis that we shall develop in the rest of this chapter. (For a
fuller statement of this argument and conclusion, see Hudson 2003.)
Fortunately, it seems that this evidence is supported by completely
independent data from other languages. For example, Welsh soft mutation
and agreement are much easier to explain if we allow syntactic subjects to be
inaudible (Borsley 2005). Welsh mutation applies to verb dependents which
are separated from the verb. Normally these are objects as in (19):

(19) Gweles (i) gi.
saw-1SG (I) dog
'I saw a dog.'

Here gi is the mutated form of ci, 'dog', whose form shows it to be object rather
than subject even when the optional subject i is omitted. Conversely, however,
subjects are also mutated if they are delayed, as in (20):

(20) a. Mae ci yn yr ardd.
is dog in the garden
'A dog is in the garden.'
b. Mae yn yr ardd gi.

In sentence (a), ci is in the unmutated form expected of a subject, but it is
mutated in (b) because it has been separated from the verb mae. The
generalization seems to be that if a subject or object dependent is separated
from the verb, it is mutated; but this generalization presupposes that there is
always a syntactic subject in examples like (19), even when no subject is audible.
The same conclusion also simplifies the treatment of verb agreement, which is
confined to verbs whose subject is a personal pronoun. In (19), the suffix {es}
on gweles can be said to agree with a first-person singular subject even when this
is covert. In short, inaudible subjects make the grammar of Welsh simpler and
more explanatory, a possibility which we assume occurs to naive learners of the
language as well as to linguists.
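
The Welsh generalization lends itself to a toy rule. The sketch below (in
Python) is only an illustration under the assumptions just stated; the helper
names and the one-word mutation 'lexicon' are invented here and are not
Borsley's analysis. A subject or object is mutated exactly when material,
possibly an unrealized subject, intervenes between it and the verb.

def soft_mutate(form):
    # Toy lexicon: soft mutation of the initial consonant, for 'ci' only.
    return {'ci': 'gi'}.get(form, form)

def realize(dependent, intervening):
    # A subject or object separated from its verb is mutated; the
    # intervening material may itself be an unrealized subject.
    return soft_mutate(dependent) if intervening else dependent

print(realize('ci', intervening=False))  # 'ci', as in (20a) Mae ci yn yr ardd
print(realize('ci', intervening=True))   # 'gi', as in (19) and (20b)
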
In the rest of this chapter we explore the notion 'null element' within the
theoretical framework of Word Grammar. What exactly does it mean to say
that an element is 'null' in a cognitive theory of language which maximizes the
similarities between language and other kinds of cognition? Having introduced
the relevant ideas we shall contrast our view of null elements with the more
familiar ideas about elements such as PRO and pro, as well as with other
proposals from the WG and HPSG traditions.

4. Non-Existent Entities in Cognition and in Language


One of the rather obvious facts about everyday knowledge is that we know
things about entities which we know not to exist. For example, we know that
Father Christmas brings presents, wears a red coat and has a beard; but we also
know that he doesn't exist. How can we know the characteristics of a non-existent
object? The answer must be that 'existence' is somehow separable
from other kinds of characteristic. However there is a serious danger of an
internal contradiction because it is also clear that the concept of Father
Christmas does exist, complete with the links to beards, red coats and presents,
even for those of us who know he does not exist. How can this contradiction
be avoided?
One possible answer follows from a basic assumption of Word Grammar:
that tokens and types are distinct concepts, each with a separate
representation in the mind (Hudson 1984: 24; Hudson 1990: 31-2). Tokens
exist in the realm of ongoing experience, while types exist in permanent
knowledge; in other words, roughly speaking tokens are represented in working
memory and types in long-term memory. For example, when we see a bird, we
assume that we introduce a new concept to represent it in our minds, a token
concept which is distinct from every permanent concept we have for birds or
bird species. Having introduced this distinct concept we can then classify it (e.g.
it's a robin), notice unusual features and remember it. None of this is possible if
tokens and types share the same mental nodes.
Another difference between tokens and types is that tokens are part of our
ongoing experience; in short, they are 'real', whereas types are merely
memories and may even be fictional. For example, we have a permanent
concept for Father Christmas complete with a list of attributes. This concept is
just like those for other people except that we know he's not real; in other
words, we know we will never meet a token of Father Christmas (even if we do
meet tokens of people pretending to be tokens of Father Christmas). This
contrast between real and unreal types can be captured in WG by an attribute
which we call 'quantity' (the 'quantitator' of Hudson 1990: 23). If every token of
experience has a quantity of 1, then real and unreal types can be distinguished
by their quantities - 1 for real and 0 for unreal. If Father Christmas has a
quantity of 0, then any putative token of Father Christmas would be (highly)
exceptional.
The example of Father Christmas is rather isolated because most of our
concepts are based firmly on experience. However, there is an even more
important role for the quantity variable, which is in inherited attributes. For
example, the default person has a foot on each leg, but some unfortunate
individuals have lost a foot. The quantity variable provides a mechanism for
stating the default (each leg has one foot), and then for overriding it in the case
of specified individuals (e.g. the pantomime character Long John Silver has no
foot on his left leg). Potentially inheritable attributes of this kind are a common
part of experience and provide an important role for the quantity variable.
Strictly speaking, quantity is a function like any other, but it plays such a basic
role that it is convenient to abbreviate it as a mere number on the node itself.
Using this notation (together with the triangles standardly used in WG notation
for the 'is-a' relation) the facts about feet may be shown in a diagram such as
Figure 6. In prose, a typical person has one left leg, which has one foot; but
although Long John Silver has the expected number of left legs (shown as a
mere dot, which inherits the default quantity), this leg has zero feet.

Figure 6
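
The workings of default inheritance with a quantity attribute can be sketched
in a few lines of Python. The class and attribute names below are illustrative
stand-ins for the network of Figure 6, not WG notation; the only point is that
a locally stipulated value overrides an inherited one.

class Concept:
    def __init__(self, name, isa=None):
        self.name = name
        self.isa = isa      # the model of which this concept is an instance
        self.attrs = {}     # locally stipulated attributes

    def get(self, attr):
        # Default inheritance: a local value overrides an inherited one.
        if attr in self.attrs:
            return self.attrs[attr]
        if self.isa is not None:
            return self.isa.get(attr)
        return None

person = Concept('person')
person.attrs['quantity'] = 1             # real: tokens are expected
person.attrs['left-foot quantity'] = 1   # default: one foot on the left leg

long_john = Concept('Long John Silver', isa=person)
long_john.attrs['left-foot quantity'] = 0    # override: zero feet

father_christmas = Concept('Father Christmas', isa=person)
father_christmas.attrs['quantity'] = 0       # unreal: no tokens expected

print(long_john.get('quantity'))             # 1, inherited by default
print(long_john.get('left-foot quantity'))   # 0, the local override
print(father_christmas.get('quantity'))      # 0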

Returning to the analysis of grammatical structure, we see no reason to
recognize fictional words as such - after all, how could one learn them? (Notice
that we learn about Father Christmas via verbal and visual representations, for
which there is no parallel in language.) In other words, we see no justification
for lexical items such as the Chomskyan PRO which are inherently inaudible.
However, we do see an important role for dependents that are merely potential
(like a missing left foot). For example, take syntactic objects. On the one hand,
we know that they are typically nouns, that they typically express the patient or
theme of the action, that they follow the verb, and so on; it is essential to state all
these properties just once, at the highest level (i.e. as a characteristic of the
typical verb or even of the typical word). But on the other hand, we also know
that some verbs require an object, others merely allow one, and others again
refuse one. It is essential to be able to separate these statements of 'existence'
from all the other properties of objects, and the obvious mechanism is the
quantity variable introduced above. The default object may have a quantity
which is compatible with either 1 or 0, but this default is overridden for
individual verbs that are obligatorily transitive (quantity 1) or obligatorily
intransitive (quantity 0). Similar remarks apply
to subjects, the topic of this chapter, but first we must distinguish two different
kinds of 'null' dependent.
On the one hand, there are dependents that are optional in the semantics as
well as in the syntax; for example, many verbs allow a beneficiary (e.g. make her
a cake), but in the absence of a syntactic beneficiary there is no reason to
assume a semantic one. In a sentence such as She made a cake, therefore, there is
no beneficiary dependent although one is possible. In this case, therefore, the
dependent itself has quantity 0. In many other cases, on the other hand, the
null dependent does contribute to the semantics; for instance, even when
DRINK has no object (He drank till he fell asleep), its semantic structure
certainly includes some liquid which by default is alcohol. In this case, we
assume that the syntax does contain an object noun, an ordinary noun such as
ALCOHOL complete with its ordinary meaning; but exceptionally, it has no
realization in form - no stem or fully inflected form.
Null subjects in English are always of this second type: specific words which
have their usual meaning but which are deprived of any form because their
form's quantity is 0. This quantity varies with the verb's inflectional category;
e.g. finite verbs generally have a subject, but imperatives (a kind of finite)
normally have an unrealized YOU. (See section 5.3 for more discussion.) A
notation for unrealized words in a written example would help to distinguish
them conceptually from PRO; we suggest square brackets round the ordinary
orthography for the missing word. Using this notation, we might write an
English imperative as follows:

(21) [You] hurry up!

The relevant grammar is in Figure 7, which includes the very general default
tendency for words to have a realization.

Figure 7
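
A minimal sketch of the realization default in Figure 7, again with invented
attribute names rather than WG notation: a word's realization has quantity 1
by default, and the subject of an imperative simply overrides that value.

# Default: a word's realization has quantity 1 (it is audible/visible).
WORD_DEFAULTS = {'realization quantity': 1}

def realization_quantity(word):
    # A word's own value overrides the general default.
    return word.get('realization quantity',
                    WORD_DEFAULTS['realization quantity'])

ordinary_you = {'lexeme': 'YOU'}
imperative_subject = {'lexeme': 'YOU', 'realization quantity': 0}

print(realization_quantity(ordinary_you))        # 1: realized as 'you'
print(realization_quantity(imperative_subject))  # 0: '[You] hurry up!'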

In the case of Ancient Greek infinitives and participles, what distinguishes
those with overt subjects from those without is simply the quantity of the
subject's realization. Even if an infinitive has no overt subject, it still has a
subject, and this subject is an ordinary word (probably a personal pronoun)
which has the full range of inheritable syntactic and semantic properties. And
crucially, it has a case which may trigger case agreement in a predicate. The
relevant part of the grammar is sketched in Figure 8. According to this diagram,
a verb's subject is normally nominative and has optional realization - in other
words, Greek is a (so-called) pro-drop language. (We discuss this point further
in the next section. ) However, infinitives override the default pattern by
demanding the accusative case, so even 'null' subjects of infinitives have the
accusative case.

Figure 8
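
The pattern in Figure 8 can likewise be sketched as two layers of defaults and
overrides. The function names below are illustrative assumptions, not part of
the grammar itself: a verb's subject is nominative by default and its realization
is optional (pro-drop), an infinitive overrides the case to accusative, and
predicative agreement simply reads the case off the subject, realized or not.

def subject_of(verb_type):
    # Default for Greek verbs: nominative subject, optionally realized.
    subject = {'case': 'nominative', 'realization quantity': (0, 1)}
    if verb_type == 'infinitive':
        subject['case'] = 'accusative'  # the override demanded by infinitives
    return subject

def predicative_case(verb_type):
    # Case agreement targets the subject word, whether realized or null.
    return subject_of(verb_type)['case']

print(predicative_case('finite'))      # nominative, as in tensed clauses
print(predicative_case('infinitive'))  # accusative, even with a null subject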

To summarize, then, we are proposing an attribute 'quantity' which controls
the way in which we map stored concepts to items of experience, as types to
tokens. Any item of experience has the value 1 for this attribute, so it will only
match stored concepts which have the same value. In the case of words, what
allows us to experience them is their realization, so by definition the quantity
for the realization of a word token is 1. However the grammar of a language
allows the default 1 for realization to be overridden in the case of dependents of
specific types of words, such as infinitives. But although these words have no
realization, they do have all the other properties expected of them, including
grammatical properties such as case. In Greek, it is the case of unrealized
subjects that explains the agreement patterns described in the first section.

5. Extensions to Other Parts of Grammar

Since the proposed system applies equally to our knowledge of Father
Christmas and to the subjects of Greek infinitives, it would not be surprising if it
turned out to be relevant in other parts of grammar as well. The following list
suggests a number of other areas where 'understood' elements can be handled
in the same way.

5.1 Null subjects of tensed verbs in 'pro-drop languages'


Whereas English requires tensed verbs to have a realized subject, pro-drop
languages allow it to be unrealized. This is helpful in Ancient Greek, where
predicatives have the nominative case in tensed clauses even when the subject is
unrealized, and similarly a virtual subject is as likely as an overt one to 'attract'
the predicative of a lower clause to its nominative case. A relevant example is
(22)=(16), which we noted above as an outstanding problem for a 'structure-
sharing' analysis of attraction. If we assume that the main verb ethelo:, 'I wish',
has a nominative (but unreal) pronoun as its subject, the nominative on the
lower predicative is as expected because this unreal pronoun is also the subject
of the lower clause:

(22) ethelo: de toi e:pios einai
I-wish but you(dat) kind(nom) to-be
'but I wish to be kind to you' (H. Il. 8.40)

The subject-verb agreement on the verb is easy to explain if there is always a
subject, real or unreal. Without this assumption, however, a rule of agreement
does not extend easily to examples where there is no overt higher subject.

5.2 'Object pro-drop'
(23) ou dei tois paidotribais enkalein oud' ekballein ek to:n poleo:n
not necessary the(dat) trainers(dat) to-accuse nor to-banish from the(gen) cities(gen)
'it is not necessary to accuse the trainers nor to banish them from the cities'
(P. G. 460 d.)

This phenomenon, less common than 'subject pro-drop' but very common in
Greek, is traditionally analyzed under the rubric of 'object-sharing' and has no
agreed modern analysis. Treating the 'omitted' object as unrealized provides a
natural and simple account. Note that the traditional shared object analysis
would incorrectly associate the dative case with the object of ekballein (normally
accusative in this context).

5.3 Subjects of imperatives
In languages where these are usually absent, such as English, the identity of the
unreal subject is very clear: as we assumed above, it must be the pronoun you
for second-person imperatives, and we for first-person plural ones. This is clear
not only from the meaning but also from the choice of pronoun in the tag
question:

(24) Hurry up, will you?
(25) Let's go now, shall we?

Moreover, where a language offers a choice between intimate and distant
second-person pronouns (such as the French pair tu and vous), the same choice
applies, with the same social consequences, to imperatives even though there is
no overt pronoun (e.g. Viens! or Venez! for 'Come!'). Without unrealized
pronouns as subject it is hard to extend the rule for choosing pronouns so that it
applies to the choice of imperative forms as well; but with unreal pronouns the
choice of pronoun automatically triggers the correct agreement on the verb.

5.4 Complements of certain definite pronouns in English


The argument here rests on the assumption that 'pluralia tantum' such as
trousers and scales are singular in meaning but plural in syntax; the assumption
has been challenged (Wierzbicka 1988) but we still find it especially plausible
for examples such as scales (plural) contrasting with balance (singular). The
relevant datum is that the choice between this and these matches the syntactic
number when the complement noun is overt (so this balance but these scales),
but the same choice is made even when there is no overt complement:

(26) I need some scales to weigh myself on, but these (*this) are (*is) broken.

If pluralia tantum really are singular in meaning, we cannot explain this choice
in terms of meaning, and the most attractive explanation is that the choice is
forced in the same way as in the overt case, by the presence of an unrealized
example of the noun scales (or trousers or whatever). We might also consider
extending this explanation to another curious fact about the demonstrative
pronouns, which is that the singular form can only refer to a thing:

(27) Do you take this *(woman) to be your lawfully wedded wife?

The explanation would be that only one unrealized noun is possible in the
singular: the noun thing. The analyses that we are suggesting are of course
controversial and may be wrong, but if they are correct then they show that the
unrealized word may be a specific lexical noun rather than a general-purpose
pronoun as in the earlier examples.

5.5 Complements of certain verbs such as auxiliaries in English

This covers the territory of so-called 'VP deletion' but also other kinds of
anaphoric ellipsis:

(28) I don't know whether I'm going to finish my thesis this year, but I may.
(29) I may finish my thesis this year, but I don't know for sure.

If the complement of may is allowed to be unrealized, then may in (28) actually
has a complement verb, whose properties are (more or less) copied from its
antecedent (namely, (I) finish my thesis this year); and similarly know in (29) has
an object which would have been realized as whether I'll finish my thesis this
year. This analysis combines the flexibility of a purely semantic analysis with the
ability of a syntactic analysis to accommodate syntactic detail such as extraction
out of an elided complement:

(30) OK, you didn't enjoy that book, but here's one which you will.

If will has no syntactic complement at all, the extraction of which in (30) is very
hard to explain; but if its complement is an unrealized enjoy, the rest of the
syntactic structure can be exactly as for here's one which you will enjoy.6
In all these examples the omitted element is redundant and easy to recover,
so the option of leaving it unsaid obviously helps both the speaker and the
hearer. The familiar functional pressure to minimize effort thus explains why
the choice between 'realized' and 'unrealized' exists in the grammar. On the
other hand, it does not explain why languages allow it in different places - e.g.
why some languages allow tensed verbs to have a null subject while others do
not. This variation must be due to different ways of resolving the conflict
between this functional pressure and others which push in the opposite
direction, such as the pressure to make syntactic relations reliably identifiable
(whether by word order as in English or by inflectional morphology as in
Greek).

6. Comparison with PRO and pro


The analysis that we are proposing is different from the more familiar ones
which invoke null pronouns such as PRO and pro, and we believe that the
differences are important:

• PRO and pro are special pronouns which combine the peculiarity of always
being covert with the equally exceptional property of covering all persons,
numbers and genders. The fact that they are exceptional in two such major
respects should arouse suspicion. In contrast, our unrealized pronouns are
the ordinary pronouns - he, me, us, and so on - which just happen to be
unrealized. Even if we count this as an exceptional feature it is their only
exceptional feature, in contrast with the double exceptionality of PRO and
pro.
• In our account, a pronoun may be realized for emphasis in contexts where
it would normally be unrealized; this accounts in a simple way for the
examples in (9) to (12) where the pronoun is emphatic. If the covert
pronoun is always PRO, why should it always alternate with an ordinary
pronoun?
• Our unrealized words need not be pronouns, unlike PRO and pro. As
explained in the previous section, this allows us to extend the same
explanation to other kinds of unexpressed words, such as unrealized
common nouns acting as complement of a pronoun/determiner such as
this, or virtual complements of verbs such as auxiliary verbs. In other words,
our proposal subsumes null subjects under a much broader analysis which
covers ellipsis in general.
• Unrealized words are identified by the quantity feature which applies
outside language (e.g. to Father Christmas and feet) as well as inside. In
contrast, the difference between PRO or pro and other words is specific to
language, involving (presumably) the absence of a phonological entry. Any
explanation which involves machinery that is available independently of
language is preferable to one which involves special machinery.
• In the standard analysis with PRO and pro the difference between these two
is important because both abstract 'Case' and surface case are supposed to
be impossible for PRO but obligatory for pro; this contrast is also claimed to
correlate with the contrast between subjects of non-finite and finite verbs.
More recently, PRO has been claimed to have a special 'null' case
(Chomsky and Lasnik 1993). The empirical basis for these claims was
undermined long ago (e.g. Sigurdsson 1991), and our analysis does not
recognize the distinction between PRO and pro. Unrealized pronouns all
take case (or lack it) just like realized ones in the language concerned.

These differences between our proposal and the PRO/pro system all seem to
favour our proposal.

7. Comparison with Other PRO-free Analyses


In this section we compare our proposal with two other approaches to null
elements, neither of which invokes a 'covert' element such as PRO. The first
approach is in the WYSIWYG spirit of earlier versions of Word Grammar,
where it was assumed that null elements were simply absent. This assumption
was only workable because of the possibility of structure-sharing. In this
analysis, the missing subject is specified as (i.e. supplied by) the subject of the
higher verb (see Hudson 1990: 235ff for details). For Greek, as we indicated in
section 2, this approach is adequate for the cases traditionally described under
the rubric of 'attraction', but it fails for the default situation, where the subject of
the infinitive (and other elements dependent on the lower verb) display
accusative case. On the early WG assumptions, the only possible analysis is
simply to stipulate, for the case where there is no infinitival subject, that
predicates of infinitives are accusatives. But this approach fails to be
explanatory: why should these elements bear the accusative case rather than
the general default nominative? (Contrast the principled explanation of Figure
8 and accompanying text.)
Moreover, this no-null-element, stipulative analysis suffers from an even
graver defect: the relation 'subject' is a collecting point for a large number of
different patterns in semantics and morphology as well as syntax (Keenan
1976). A verb's subject is the nominal that has the following properties (among
others):

• its referent is the 'active argument' of the verb as defined by the latter's
lexical entry - for instance, with RUN/TREKHO: it is the runner, with
FALL/PIPTO: it is the faller, with LIKE/PHILEO: it is the liker, and so
on;
• in English, it typically stands before the verb;
• it is the typical antecedent of a reflexive object of the same verb;
• the verb agrees with it;
• in English, it is obligatory if the verb is tensed;
• it is also the verb's object if the verb is passive;
• in Greek, its case is typically nominative.

As soon as some nominal is defined as the verb's subject, it immediately
inherits all these characteristics en bloc. But in the absence of a subject there is
nothing to bring them all together. For example, if himself is the object of hurt it
is tied anaphorically to the hurter via the 'subject' link, but if there is no subject
this link disappears. And yet the fact is that the anaphoric relations are exactly
the same regardless of whether or not there is an overt subject; for example, the
'understood' subject of hurt in Don't hurt yourself! binds yourself in exactly the
same way as the overt one in You may hurt yourself.
The analysis that we are proposing solves these problems by moving towards
the standard view that every verb does indeed have a subject, whether or not
this is overt. Similar problems face the earlier WG approach in other areas of
grammar, and can be solved in the same way. In section 5 we outlined a range
of phenomena that seem to call for analysis in terms of unrealized words, and
which more traditional WG analyses have treated in terms of dependents that
are simply absent.
Another attempt to handle null subjects without invoking PRO is proposed
by Pollard and Sag (1994: 123-45) in the framework of HPSG. As with the
early WG analysis just described, this proposal applies only where syntactic
structure-sharing is not possible. They propose a structure for 'Equi' verbs such
as the following for try (ibid.: 135):

CAT | SUBCAT <NP1, VP[inf, SUBCAT <NP1>]>

The infinitive's subject is the italicized 'NP' in its 'SUBCAT' (valency) list. This
NP merely indicates the need for a subject, and would normally be 'cancelled'
(satisfied) by unification with an NP outside the VP; for example, in They
worked hard the verb needs a subject, which is provided by they. However at
least the intention of this entry is to prevent the need from being satisfied, so
that the infinitive's subject remains unrealized, as in our proposed WG analysis.
Moreover, this unrealized subject in the SUBCAT list may carry other
characteristics which are imposed both by the infinitive and by try; for example,
the subscripts in the entry for try show that it must be coreferential with the
subject of try - i.e. a sentence such as They try to work hard has only one
meaning, in which they are the workers as well as the try-ers. Most importantly
for the analysis of Greek, the unrealized subject can carry whatever case may be
imposed on it by the infinitive (Henniss 1989). Consequently it can be the
target of predicative case agreement, so Ancient Greek case-agreement would
be no problem.

This approach is clearly very similar to ours. In both theories:

• the infinitive's subject is an ordinary noun(-phrase) rather than a special
pronominal (PRO or pro);
• the subject's status (overt or covert) is handled by a separate mechanism
from its other properties;
• the subject's properties include those inherited from the infinitive;
• the possibility of null realization is determined by the head, rather than
inherent in the unrealized nominal.

However there are also significant differences between the two proposals.

• The null NP in HPSG is purely schematic, so all null subjects have the
same syntax (bar any specific syntactic demands imposed by the infinitive).
They are also schematic in their semantics, in spite of the coreference
restriction, because reference is distinct from semantics (e.g. the winner may
be coreferential with someone I met last night, but these phrases obviously
have different semantic structures). In contrast, WG null subjects are
ordinary lexical nouns and pronouns.
• So far as we can see, the HPSG machinery for distinguishing overt and
covert valents does not appear to generalize beyond language; and indeed,
many advocates of HPSG might argue that it should not do so. In contrast,
we showed in section 4 that our proposal does; it can explain the 'non-
occurrence' of Father Christmas in just the same way as that of the subject
of an infinitive.

Whether or not the proposals differ in terms of specifically linguistic analyses
remains to be seen.

8. Conclusions
The most important conclusion is that where there is strong empirical
evidence for null elements, they can easily be included even in a 'surfacist'
grammar such as WG. This can be done by exploiting the existing WG
machinery for determining 'quantity', a variable which guides the user in
applying knowledge to experience; for example, one of the properties that we
attribute to Father Christmas is zero quantity - i.e. we expect no tokens in
experience. In these terms, a 'null word' is an ordinary word whose realization
has the quantity 0 - an unrealized word. This (or something like it) is
generally available in cognition both for distinguishing fact and fiction and for
cases where an expected attribute is exceptionally absent, so it comes 'for
free', and it is preferable to inventing special linguistic inaudibilia such as
PRO or pro.

References to classical works


Aeschylus, Eumenides (A. Eum.)
Herodotus (Hdt.)
Homer, Iliad (H. Il.)
Homer, Odyssey (H. Od.)
(LSJ: see Liddell & Scott in References)
Isocrates (I.)
Plato, Alcibiades (P. Alc.)
Plato, Gorgias (P. G.)
Xenophon, Anabasis (X. A.)

References
Andrews, A. (1971), 'Case agreement of predicate modifiers in Ancient Greek'.
Linguistic Inquiry, 2, 127-51.
Borsley, R. (2005), 'Agreement, mutation and missing NPs in Welsh'. Available:
http://privatewww.essex.ac.uk/~rborsley/Agreement-paper.pdf (Accessed: 19 April 2005).
Chantraine, P. (1953), Grammaire Homérique. Vol. 2. Paris: Klincksieck.
Chomsky, N. and Lasnik, H. (1993), 'The theory of principles and parameters', in J.
Jacobs, A. v. Stechow, W. Sternefeld and T. Vennemann (eds), Syntax: An International
Handbook of Contemporary Research. Berlin: Walter de Gruyter, pp. 506-69.
Goodwin, W. W. (1930), Greek Grammar (rev. Charles Burton Gulick). Boston: Ginn.
Henniss, K. (1989), '"Covert" subjects and determinate case: evidence from
Malayalam', in J. Fee and K. Hunt (eds), Proceedings of the West Coast Conference
on Formal Linguistics. Stanford: CSLI, pp. 167-75.
Hudson, R. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2003), 'Case-agreement, PRO and structure sharing'. Research in Language, 1, 7-33.
Keenan, E. L. (1976), 'Towards a universal definition of "subject"', in Charles Li (ed. ),
Subject and Topic. New York: Academic Press, pp. 303-33.
Kühner, R. and Gerth, B. (1955), Ausführliche Grammatik der griechischen Sprache.
Leverkusen: Gottschalksche Verlagsbuchhandlung.
Lecarme, J. (1978), 'Aspects Syntaxiques des Complétives du Grec' (Unpublished
doctoral dissertation, University of Montreal).
Liddell, H. G. and Scott, R. (1971), A Greek-English Lexicon (9th edn, rev. H. Jones and
R. McKenzie, suppl. by E. Barber). Oxford: Clarendon Press.
Pollard, C. J. and Sag, I. A. (1994), Head-Driven Phrase Structure Grammar. Chicago:
University of Chicago Press.
Quicoli, A. C. (1982), The Structure of Complementation. Ghent: Story-Scientia.
Sigurdsson, H. (1991), 'Icelandic case-marked PRO and the licensing of lexical
arguments'. Natural Language & Linguistic Theory, 9, 327-63.
Smyth, H. W. (1956), Greek Grammar (rev. G. Messing). Cambridge, MA: Harvard
University Press.
Wierzbicka, A. (1988), The Semantics of Grammar. Amsterdam: Benjamins.

Notes
1 By Ancient Greek we mean the Greek of early epic poetry ('Homeric Greek') down
to the Attic prose of the 5th and 4th centuries B.C.E.
2 A list of abbreviated references to classical authors can be found at the end of this
paper.
3 Very many of the verbs which take the infinitive also take a personal object which
stands in the case that the verb requires... If the infinitive also has an adjectival or
nominal predicate, this stands in the same case as the personal object by (an)
attraction, or in the absence of attraction, in the accusative.
4 However if the subject of the governing verb is at the same time the subject of the
infinitive, the subject of the infinitive is omitted, and if adjectival or nominal
predicates accompany the infinitive, these are put in the nominative by attraction.
5 This felicitous term is taken from the work of Pollard and Sag (1994), but the
analysis was worked out in full detail for English infinitives (and other constructions)
in Hudson (1990: 235-9).
6 We owe this point to Andrew Rosta.
3 Understood Objects in English and Japanese with
Reference to Eat and Taberu: A Word Grammar
account

KENSEI SUGAYAMA

Abstract
The author argues that there is a semantic difference in the suppressed object
between eat and its Japanese equivalent taberu. What kind of semantic structure,
then, would the Japanese verb taberu have? This chapter is an attempt to answer this
question in the framework of Word Grammar (Hudson 1984, 1990, 1998, 2005).

1. Introduction
Unlike English and perhaps most other European languages, Japanese allows
its transitive verbs to miss out their complements (e. g. subject, object) on the
condition that the speaker assumes that they are known to the addressee.1 This
is instantiated by the contrast in (1) and (2):

(1) A: mo keeki-wa yaki-mashita-ka
already cake-TP baked-Q
'Did you bake the cake?'
B: hai, yaki-mashita
yes baked
'Yes, (I) baked it'
(2) A: *mo, yaki-mashita-ka
already baked-Q (* unless the object is situationally recovered)
'Did you bake it?' (intended meaning)

The following sentences are also possible in Japanese.2

(3) hyah! ugoita
Interj moved
'Ouch! It moved.'
(4) kondo-wa yameyoo
next-time-TP stop
'I won't do it again.'
(5) kanojo-wa yubiwa-o oita
she-TP-Sb ring-Ob put
'She put the ring there'

Sentences (3), (4) and (5) are colloquial and quite often used in standard
spoken Japanese. In this sense they are not marked sentences. In (3) only the
subject is left out, while in (4) both the subject and object are left out as shown
in the word-for-word translation. Sentence (5) involves the transitive verb oita, a
past tense form of oku 'put', which corresponds to put in English. Oku is a three-
place predicate which semantically requires three arguments [agent, theme,
place]. These three arguments are mapped syntactically to subject, object and
place adverbial, respectively. Quite interestingly, (5) shows that the place
element, which is also considered to be a complement (or adjunct-complement)
of the verb oku, is optional when it is known to the addressee, which is virtually
impossible with its counterpart put in English. Although these complements are
in fact missed out (i. e. unexpressed or ungrammaticalized), the addressee
eventually will come to an appropriate interpretation of each sentence where
unexpressed complements are supplied semantically or pragmatically and they
are no doubt given full interpretation. Why is this possible? A possible answer
comes from the assumption that in the semantic structure of the sentences
above, there has to be a semantic argument which should be, but is not actually,
mapped onto the syntactic structure (i.e. grammaticalized as a syntactic
complement in my terms).
Turning to English, on the other hand, it is possible to leave indefinite
objects suppressed for semantically limited verbs such as eat, drink, read, etc.3
Thus, following Hudson (2005), the syntactic and semantic structure of John ate
will be something like the one shown in Figure 1.

Figure 1

The links between syntactic and semantic structures in Figure 1 are shown by
the vertical solid and curved lines. The categories enclosed by single quotation
marks (e. g. 'John', 'John ate') are concepts which are part of the sentence's
semantic structure; the numbers are arbitrary. Detailed explanation about
syntactic and semantic dependencies will be given in the next section.
But this kind of semantic structure does not seem to be a viable one for the
Japanese verb taberu, because, as I will argue later, there is a semantic difference
in the semantic feature of the suppressed object between eat and taberu, which
does not seem to be properly reflected in the semantic structure of those two
verbs in Word Grammar (WG). Then what kind of semantic structure will the
Japanese verb taberu, the equivalent of eat, have? This chapter is an
attempt to answer this question in the framework of Word Grammar.
The rest of the chapter is organized in the following way. Section 2
introduces Word Grammar and deals with the relevant notions used in WG to
deal with the problem of a covert object. Section 3 discusses the analysis of an
intransitive use of the eat type verbs. Section 4 discusses the Japanese verb
taberu, an equivalent of eat in English. It also discusses the interpretation of
taberu which lacks an overt object, using the syntactic and semantic analysis in
WG. Section 5 offers my own account of how taberu is more adequately
described in the semantic structure in WG.

2. Word Grammar
Before continuing any further, let us first have a brief look at the basic
framework of WG and its main characteristics. WG, which is fully developed
and formalized in Hudson (1984, 1990), subsequently revised by Rosta (1997),
is to be taken as a lexicalist grammatical theory because the word is central -
hence the name of the theory, basically making no reference to any
grammatical unit larger than a word.4 In his recent comparison of WG with
Head-driven Phrase Structure Grammar (HPSG), Hudson (1995b: 4) gives a
list of the common characteristics between the two theoretically different
grammars, some relevant ones of which are repeated here for convenience:

(6) a. both (i.e. WG and HPSG) include a rich semantic structure parallel
with the syntactic structure;
b. both are monostratal;
c. both make use of inheritance in generating structures;
d. neither relies on tree geometry to distinguish grammatical functions;
e. both include contextual information about the utterance event (e.g. the
identities of speaker and hearer) in the linguistic structure.

In WG, syntactic structure is based on grammatical relations within a general
framework of dependency theory rather than on constituent structure. So a
grammatical relation is defined as a dependency relation between the head and
its dependents which include complements and adjuncts. In this framework,
the syntactic head of a sentence, as well as its semantic head, is therefore a finite
verb on which its dependents such as subject, object and so forth depend. To
take a very simple example, the grammatical analysis of Vera lives in Altrincham
can be partially shown by the diagram in Figure 2.5 Each arrow in this diagram
shows a dependency between words and it points from a head to one of its
dependents, but it is most important here that there are no phrases or clauses.6
Thus in terms of dependency, lives is the root of the sentence on which Vera
and in depend as a subject and a complement respectively. In turn, in is the
head of Altrincham. Semantically, 'Vera', which is a referent of Vera, is linked
as a live-er to a semantic concept 'Vera lives in Altrincham', an instance of
'live' and a referent of lives at once, and 'Altrincham', which is a referent of
Altrincham, is also linked to 'Vera lives in Altrincham' as a place. The
curved lines connecting in and Altrincham mean that the referent of in and that
of Altrincham are the same (i.e. 'Altrincham'). 'Live-er' and 'place' are names
of semantic relations. A convenient way to diagram the model-instance relation
is by using a triangle with its base along the general category (= model) and its
apex pointing at the member (= instance), with an extension line to link the two.
In Figure 2, then, the diagram shows the relation between the sense of the word
lives, 'live', and its instance 'Vera lives in Altrincham'.

Figure 2

A WG grammar generates a semantic structure which parallels the syntactic
structure described above. The parallels are in fact very close as in Figure 2.
Virtually every word is linked to a single element of the semantic structure, and
the dependency relations between the words are typically matched by one of
the relations between their semantic concepts: dependency (shown as a line
with a point). The familiar distinction between 'referent' and 'sense' is used
much the same way as in other linguistic theories. Therefore, in WG a word's
sense is understood to be some general category (e. g. the typical chair), while its
referent is some particular instance of this category.
Apart from the diagram we have seen, how are the syntactic and semantic
structure of a sentence represented in WG? WG consists of an unordered set
of propositions called facts. All WG propositions have just two arguments and a
relator between them, and all relations can be theoretically reduced to a single
one represented as '=', although for convenience 'isa' and 'has' are also used.
Hudson (1990: 256), for instance, gives a fairly complete lexical entry for eat in
terms of propositions (or facts). Some of them considered to be most important
for the present purpose are given in (7):

(7) a. EAT isa verb.
b. sense of EAT = eat
c. EAT has [0-1] object.
d. referent of object of EAT = eat-ee of sense of it

These facts are self-explanatory except for a few technical expressions: 'A isa B'
means 'A is an instance of B' and [0-1] in front of object means 'at least 0 and
at most 1' (i.e. zero or one in this particular case). An element in italics is meant to
be the antecedent of the pronoun.
Propositions in (7) partially represent the syntactic and semantic structures of
John ate potatoes diagrammed in Figure 3 with eat replaced by its past tense form
ate.

Figure 3

In Figure 3, as mentioned earlier, an item enclosed by single quotation


marks represents a concept in the semantic structure. In this case, 'Fred',
'potatoes', etc. are referents, whereas 'ate' and 'potato' are senses. X-er, X-
ee, etc., labelled on a semantic dependency, are the semantic relations which
complements are considered to bear to their head.
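
The facts in (7) lend themselves to a direct encoding as data. The sketch below
uses subject-relation-value triples of my own devising (not Hudson's notation)
and checks the valency fact (7c): zero or one object is allowed, two are not.

# Hypothetical encoding of the facts in (7); the triple format is illustrative.
facts = [
    ('EAT', 'isa', 'verb'),                                   # (7a)
    ('EAT', 'sense', 'eat'),                                  # (7b)
    ('EAT', 'object quantity', (0, 1)),                       # (7c)
    ('object of EAT', 'referent', 'eat-ee of sense of EAT'),  # (7d)
]

def object_range(verb):
    for subject, relation, value in facts:
        if subject == verb and relation == 'object quantity':
            return value

low, high = object_range('EAT')
for n_objects in (0, 1, 2):
    allowed = low <= n_objects <= high
    print(n_objects, 'object(s):', 'allowed' if allowed else 'excluded')
# 0 and 1 are allowed (Fred ate / Fred ate potatoes); 2 is excluded.
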
This outline of WG brings us now to the syntactic and semantic analysis of
the English verb eat.

3. Eat in English
Let us consider those English transitive verbs that optionally appear without
their object. Examples of such verbs, among others, include dress, shave, lose,
win, eat and read, as in (8a) to (8f):

(8) a. William dressed/shaved Andrew.
b. Manchester United won/lost the game.
c. John read the book/ate the shepherd's pie.
d. William dressed/shaved.
e. Manchester United won/lost.
f. John ate/read.

While these six verbs will take part in the same syntactic alternation, the
intransitive verbs are obviously interpreted in different ways. In the examples
(8d)-(8f), we can identify three verb classes according to differences in
interpretation of the unexpressed object. The paired sentences in (9) illustrate
these differing interpretations.

(9) a. William shaved = William shaved himself
b. Manchester United won = Manchester United won the game
c. John ate = John ate something (edible) or other

For the shave type verbs, the object, if omitted, is interpreted as being coreferent
with the subject or the subject's specific body part. For the win type verbs, the
surface intransitive verb form signals a severe narrowing of the range of possible
referents of the implicit object, roughly speaking, 'a specific game' to be
recoverable from the context. For the eat type verbs, the intransitive form
means a lack of commitment by the speaker to the referent of the object.
With the eat type verbs, the identity of the referent of the unexpressed object
may be non-specific, i. e. literally unknown to the speaker, because the
sentences in (10) do make sense.

(10) a. I saw Oliver eating, but I don't know what he was eating.
b. When I peeked into Oliver's room, he was reading; now I wonder what he was reading.

In both sentences in (10), the identity of what was eaten/read is asked in the
second part. This implies that the patient (or eat-ee/read-ee) argument of eat/
read, which may be grammaticalized as the object at surface structure, does not
have to be definite.
There is other evidence that supports the indefiniteness of the suppressed
object of eat. Consider the following dialogues:

(11) A: What happened to my scones?
B: *The dog ate.
(12) A: Did you eat your kippers?
B: *Yes, I ate.

In both (11) and (12), speaker B cannot reply to speaker A by using ate without
its object. What (11) and (12) suggest is that eat, when its object is suppressed,
cannot have its null object referring to the element in the previous discourse,
which, as I will explain very shortly, is in fact possible in Japanese. What the
ungrammaticality of the utterances by speaker B indicates is that the understood
object has to be indefinite if eat is used as an intransitive verb.7
Here is another interesting piece of evidence supporting this claim. Observe
the following dialogue:

(13) A: I'm starving, let's eat.
B: What would you like to eat?
A: Doesn't matter, anything, I'm just so hungry.

When speaker A first uses the intransitive eat, it is clear that he/she does not
have a definite object (or the referent of a definite object) in mind, and is just
expressing his/her desire to consume something or other. As our previous
arguments predict, this is exactly a case where the intransitive eat should appear,
because there is no antecedent available in this context. However, the semantic
structure of eat in this example is considered to have the patient argument, as
the lexical semantics of eat requires two arguments whether its object is definite
or not. Then the question arises why this argument does not appear at surface
structure.
Now let us reconsider a WG representation of Fred ate, the diagram of
which is repeated here for convenience.

Figure 4

By now it is clear that this semantic representation is inadequate for eat (ate).
It is not so difficult to see that the important semantic information of the
suppressed object is missing in Figure 4. In Fred ate, there is no object, which
implies that it should be indefinite. Regrettably, this important semantic
information does not seem to be incorporated in Figure 4. Therefore, my
proposal is that we have to revise the diagram so that it is enriched with the
semantic information of the unexpressed object; a more accurate representation
will then be something like the one in Figure 5.

Figure 5
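
The revision can be made concrete in a small sketch. The dictionary below is
an illustrative stand-in for the enriched diagram in Figure 5, not an official WG
encoding: even with no syntactic object, ate keeps its eat-ee argument, now
explicitly marked as indefinite.

# Hypothetical semantic structure for 'Fred ate' after the revision.
fred_ate = {
    'sense': 'eat',
    'eat-er': {'referent': 'Fred'},
    'eat-ee': {
        'referent': 'something edible or other',
        'definiteness': 'indefinite',   # the added semantic information
        'syntactic object': None,       # unexpressed at surface structure
    },
}

print(fred_ate['eat-ee']['definiteness'])   # indefinite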

4. Taberu in Japanese
Let us now turn to a Japanese counterpart of eat, taberu. The picture of taberu,
an equivalent of eat in Japanese, is quite different from that of English eat,
which we have just discussed. As stated in section 1, complements are usually
missed out in Japanese as far as they are accessible to the speaker and the
addressee (or recoverable) in the context. This generalization applies to the
verb taberu in Japanese.
Before analyzing the structure of taberu, which can be used with the
suppressed complement as in (14) and (15), let us consider what kind of
grammatical structure WG would give to taberu. Like English eat, taberu in
Japanese takes two arguments in its semantic structure, the agent (eat-er), which
is realized as subject, and the patient (eat-ee) which is realized as object. Thus
WG gives the syntactic and semantic structures of Shota ga ringo o tabeta 'Shota
ate apples' as diagrammed in Figure 6.

Figure 6

In passing, one of the advantages of using WG to analyze the syntactic
structure of Japanese is that by doing so, it is quite easy to explain the
phenomenon of 'free word order as long as the parent is at the end' in
Japanese. As is well-known, Japanese is a verb-final language which implies that
the order of the subject, object and other dependents is not fixed as far as they
are before the verb, which is at the end of a sentence. Thus Shota ga ringo o
tabeta has an alternative version of ringo o Shota ga tabeta. A rather free order of
two complements in the sentence is explained in WG by saying that these two
elements are co-dependents of the head taberu; the order of the complements is
therefore free as long as they precede the head.
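
The ordering constraint itself is easy to state as a check over word strings. The
function below is a toy illustration under the assumption, stated above, that
every dependent must precede its head, which stands last.

def well_ordered(words, head):
    # All dependents precede the head, and the head comes last.
    return words[-1] == head and head not in words[:-1]

print(well_ordered(['Shota-ga', 'ringo-o', 'tabeta'], 'tabeta'))  # True
print(well_ordered(['ringo-o', 'Shota-ga', 'tabeta'], 'tabeta'))  # True
print(well_ordered(['Shota-ga', 'tabeta', 'ringo-o'], 'tabeta'))  # False
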
As stated above, complements of taberu can be missed out providing that they
are recovered from the context. The following examples illustrate this point:

(14) hayaku tabero
quick eat
'Eat it quick'
(15) moo tabe-mashita-ka?
already ate-Q
'Did you eat it already?'

In both sentences above the definite object is apparently suppressed.
Presumably the suppressed objects can be expressed as (definite) pronouns
without any change in meaning as in the following sentences:

(16) hayaku sore-o tabero
quick it-Ob eat
'Eat it quick'
(17) moo sore-o tabemashita-ka?
already it-Ob ate-Q
'Did you eat it already?'

In the last section I argued that the suppressed object of eat is indefinite
because it cannot refer to its antecedent even when it is available in the
preceding context. Interestingly enough the opposite is true with taberu.
Consider the following dialogues corresponding to (11) and (12), which we
discussed in the last section:

(18) A: watashino sukohn-wa doo-shimashi-ta?
my scones-TP how-did
'What happened to my scones?'
B: inu-ga tabeta
dog-Sb ate
'The dog ate them'
(19) A: kippahzu-wa tabeta?
kippers-TP-Sb ate
'Did you eat your kippers?'
B: ee tabeta
yes ate
'Yes, I ate them'

In (18) and (19), the definite object referring to an element in the previous
context is left out in B's utterance. In (19), the subject referring to the speaker is
also missing in B's utterance. These cases show that the suppressed object of
taberu is definite. In contrast, when the object of taberu is indefinite, there are in
fact cases where it has to be expressed as an indefinite noun as in (20):

(20) A: himana toki-wa nani-o shite-imasu-ka?
spare time-TP what-Ob do-ing-Q
'What do you do in your spare time?'
B: taitei nanika tabete-imasu
usually something eat-ing
'Usually I eat'

These arguments make it very clear that WG should represent the syntactic and
semantic structures of inu-ga tabeta in (18) as diagrammed in Figure 7.

Figure 7

As I stated in section 2, Hudson (1995b) claims that one of the key
characteristics of WG is taken as 'including contextual information about the
utterance event' (e.g. the identities of a speaker and a hearer) in the linguistic
structure. However, as it stands, the syntactic and semantic structures in Figures
1 and 4 do not seem to include as much contextual information as he suggests
that a WG does. Revisions I have made in this section surely contribute to
increasing contextual information in the grammatical representation in WG.

5. Conclusion
Considering the fact, mentioned above, that definite objects are deletable given a
proper context in Japanese, it seems reasonable to add the following rule to the
grammar of Japanese to explain the proper semantic structure of taberu:
(21) Knower of eat-ee of sense of taberu = addressee of it.
Taking into account the arguments above, I conclude that the syntax of missing
complements in Japanese can be given a more satisfactory description by
introducing a parameter of 'default definiteness'. I do not want to enter into
details now but simply suggest that to distinguish between complements and
adjuncts in Japanese one needs another parameter such as 'default
definiteness'. To put it differently, by default the definiteness of a covert
complement is [+definite] and that of a covert adjunct is [+/-definite], as in (22):
(22) • Covert complement of verb = definite
• Covert adjunct of verb = indefinite
• Knower of referent of complement of verb = addressee of it
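
These defaults can be summarized in a toy lookup. The function and its
language parameter are illustrative assumptions added here, with the Japanese
values taken from (22) and the English value from the discussion of eat in
section 3.

def covert_definiteness(role, language='Japanese'):
    # Hypothetical defaults for unexpressed dependents.
    if language == 'English' and role == 'complement':
        return 'indefinite'    # e.g. the suppressed object of eat
    if role == 'complement':
        return 'definite'      # (22): covert complement of verb
    return 'indefinite'        # (22): covert adjunct of verb

print(covert_definiteness('complement'))             # definite (taberu)
print(covert_definiteness('complement', 'English'))  # indefinite (eat)
print(covert_definiteness('adjunct'))                # indefinite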

References
Allerton, D. J. (1982), Valency and the English Verb. London: Academic Press.
Cote, S. A. (1996), 'Grammatical and Discourse Properties of Null Elements in
English'. (Unpublished doctoral dissertation, University of Pennsylvania).
64 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

Fillmore, Ch. J. (1986), 'Pragmatically controlled zero anaphora'. BLS 12, 95-107.
Groefsema, M. (1995), 'Understood arguments: A semantic/pragmatic approach',
Lingua 96, 139-61.
Haegeman, L. (1987a), 'The interpretation of inherent objects in English', Australian
Journal of Linguistics, 7, 223-48.
— (1987b), 'Register variation in English'. Journal of English Linguistics, 20, (2), 230-
48.
Halliday, M. A. K. and Hasan, R. (1976), Cohesion in English. London: Longman.
Hudson, R. A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'Raising in syntax, semantics and cognition', in Rocca I. (ed. ), Thematic
Structure: Its Role in Grammar. Berlin: Mouton de Gruyter, pp. 175-98.
— (1994), 'Word Grammar', in Asher, R. E. (ed. ), The Encyclopedia of Language and
Linguistics. Vol. 9. Oxford: Pergamon Press Ltd, pp. 4990-3.
— (1995a), 'Really bare phrase-structure=dependency structure'. Eigo Goho Bunpoh
Kenkyu (Studies in English Language Usage and English Language Teaching), 17, 3-17.
- (1995b), HPSG without PS? Ms.
— (1995c), Word Meaning. London: Routledge.
- (1996, October 28), 'Summary: Watch', (LINGUIST List 7.1525), Available:
http://linguistlist.org/issues/7/7-1525.html#?CFID=4038808&CFTOKEN=16874386
(Accessed: 21 April 2005).
— (1998), English Grammar. London: Routledge.
- (2000), '*I amn't'. Language, 76, (2), 297-323.
— (2005), 'An Encyclopedia of English Grammar and Word Grammar', (Word
Grammar), Available: www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 21 April
2005).
Kilby, D. (1984), Descriptive Syntax and the English Verb. London: Croom Helm.
Lambrecht, K. (1996, October 30), 'Re: 7.1525, Sum: Watch', (LINGUIST List
7.1534), Available: http://linguistlist.org/issues/7/7-1534.html#?CFID=4038808&CFTOKEN=16874386
(Accessed: 21 April 2005).
Langacker, R. W. (1990), Concept, Image and Symbol. Berlin: Mouton de Gruyter.
Larjavaara, M. (1996, November 3), 'Disc: Watch', (LINGUIST List 7.1552),
Available: http://linguistlist.org/issues/7/7-1552.html#?CFID=4038808&CFTOKEN=16874386
(Accessed: 21 April 2005).
Lehrer, A. (1970), 'Verbs and deletable objects'. Lingua, 25, 227-53.
Levin, B. (1993), English Verb Classes and Alternations. Chicago: University of Chicago
Press.
Massam, D. (1987), 'Middle, tough, and recipe context constructions in English'. NELS
18, 315-32.
— (1992), 'Null objects and non-thematic subjects'. Journal of Linguistics, 28, (1), 115-
37.
Massam, D. and Y. Roberge. (1989), 'Recipe context null objects in English'. Linguistic
Inquiry, 20, 134-9.
Quirk, R., Greenbaum, S., Leech, G. N. and Svartvik, J. (1985), A Comprehensive
Grammar of the English Language. Harlow: Longman.
Rispoli, M. (1992), 'Discourse and the acquisition of eat'. Journal of Child Language, 19,
581-95.
Rizzi, L. (1986), 'Null objects in Italian and the theory of pro'. Linguistic Inquiry, 17,
501-57.
Roberge, Y. (1991), 'On the recoverability of null objects', in D. Wanner and D. A.
Kibbee (eds), New Analyses in Romance Linguistics. Amsterdam: John Benjamins, pp.
299-312.
Rosta, A. (1994), 'Dependency and grammatical relations'. UCL Working Papers in
Linguistics, (6), 219-58.
— (1997), 'English Syntax and Word Grammar Theory'. (Unpublished doctoral
dissertation, University of London).
Sperber, D. and Wilson, D. (1995), Relevance (2nd edn). Oxford: Blackwell.
Sugayama, K. (1993), 'A Word-Grammatic account of complements and adjuncts in
Japanese', in A. Crochetiere, J.-C. Boulanger and C. Ouellon (eds), Actes du XVe
Congrès International des Linguistes Vol. 2. Sainte-Foy, Québec: Les Presses de
l'Université Laval, pp. 373-76.
— (1994), 'Eigo no "missing objects" ni tsuite (Notes on missing objects in English)',
Eigo Goho Bunpoh Kenkyu (Studies in English Language Usage and English Language
Teaching), 1, 91-104.
— (1999), 'Speculations on unsolved problems in Word Grammar'. Kobe City University
Journal, 50, (7), 5-24.
Thomas, A. L. (1979), 'Ellipsis: the interplay of sentence structure and context'. Lingua,
47, 43-68.

Notes
1 Notice here that Word Grammar, in the framework of which the arguments are
developed, assumes that complements include subject as well as object, contrary to
most of the phrase-structure-based theories. For details about the distinction between
complements and adjuncts in Japanese, see Sugayama (1993, 1999).
2 The following symbols for grammatical markers are used in the gloss: TP (Topic),
Sb (Subject), Ob (Object), Q (Question marker). The symbol 0 is used for a zero
pronominal in examples when necessary.
3 For a complete list of some 50 verbs having this feature, together with references, see
Levin (1993: 33). Lehrer (1970) also gives a similar list.
Unspecified Object Alternation in Levin's terms (Levin, 1993: 33), which applies
to eat, embroider, hum, hunt, fish, iron, knead, knit, mend, milk, mow, nurse, pack, paint,
play, plough, polish, read, recite, sew, sculpt, sing, sketch, sow, study, sweep, teach, type,
vacuum, wash, weave, whittle, write, etc., has the following pattern:
a. Mike ate the cake.
b. Mike ate. (= Mike ate a meal or something one typically eats.)
4 For some recent advances in WG, see Hudson (1992, 1995a, 1995b, 1998, 2000,
2005) and Rosta (1994, 1997).
5 Notice here that C stands for a complement, rather than a complementizer in the
diagram.
6 No phrases or clauses are assumed in WG except for some constructions (e. g.
coordinate structures).
7 Things are not so straightforward. Surprisingly, younger children seem to have a
different grammar in which (9) and (10) are grammatical and are actually used.
Rispoli (1992) looked at the acquisition of Direct Object omissibility with eat in
young children's natural production. In terms of GB, eat is one of the many English
verbs with which the internal argument can be saturated. He found that at an earlier
stage of development children frequently omitted a Direct Object with eat when the
understood object referred to something in the discourse context, as in this exchange
between a parent (P) and a child (C):
(i) Child (2;7) (Talking about a pencil)
P: Well I see you already ate the eraser off of it. That's one of the first things you
hadta do.
C: I eat. (four times)
P: I know you ate the eraser, so you don't need a candy bar now.
(Rispoli 1992: 590)
4 The Grammar of Be To: From a Word Grammar
point of view1

KENSEI SUGAYAMA

Abstract
This chapter is an attempt to characterize the be to construction within a Word
Grammar framework. First (section 2), a concise account of previous studies into the
category of be in the construction is followed by a description of be in Word
Grammar (section 3). Section 4, section 5 and section 6, then, present a
morphological, syntactic and semantic discussion of the be to construction,
respectively. Section 7 gives a detailed discussion of the question whether be to is a
lexical unit or not The analysis is theoretically framed in section 8, where it is shown
how Word Grammar offers a syntactico-semantic approach to the construction.

1. Introduction and the Problem


In contemporary English there is a construction instantiated by the sentences in
(1):2

(1) a. You are to report here at 6 a.m.


b. What am I to do?
c. I am to leave tomorrow.
d. That young boy was to become President of the United States.

I shall call this construction the be to construction.3 Previous studies have
analyzed be in this construction in three different ways: (i) as a modal (e.g.
Huddleston 1980); (ii) with 'be + to' analyzed as a quasi-auxiliary; (iii) as
intermediate between the two, a semi-auxiliary.
These approaches, however, did not give enough evidence to justify their
analyses. In this chapter I will argue that be in this construction is an instance of
a modal verb and that 'be + to' is not a lexical (or possibly even a syntactic) unit,
as it is often treated in reference grammars.4
My argument is framed within the theory of Word Grammar (hence a Word
Grammar account of the problem) and rests on ample evidence that there is a
syntactic and semantic gap between the two elements, i.e. be and to. In the
following sections, I will provide a characterization of the be to construction
within a Word Grammar framework, as outlined above in the abstract.

2. Category of Be
Before we go into our analysis, let us have a brief look at what characteristics the
modal be shares with other auxiliary verbs, e.g. can and have.
The table below compares the modal be, the prototypical modal can and the
prototypical perfective auxiliary have with respect to the 30 criteria used in
Huddleston (1980) to characterize auxiliaries and modals.
Table 1 Characteristics of be, can and have

                                        BE       CAN      HAVE
                                      (modal)            (perfect)
 1  Non-catenative use                  -a        -         -
 2  Inversion                           +         +         +
    POLARITY
 3  Negative forms                      R         +         +
 4  Precedes not                        +         +         +
 5  Emphatic positive                   +         +         +
    STRANDING
 6  So/neither tags                     +         +         +
 7  SV order, verb given                +         +         +/R
 8  SV order, verb new                  -         +         +/R
 9  Complement fronting                 -         +         +
10  Relativized complement              +         +         +/R
11  Occurrence with DO                  -         -         -
12  Contraction                         +         +         +
    POSITION OF PREVERBS
13  Precedes never                      +         +         +
14  Precedes epistemic adverb           +         +         +
15  Precedes subject quantifier         +         +         +
    INDEPENDENCE OF CATENATIVE
16  Temporal discreteness               +         +         R
17  Negative complement                 +         +         +
18  Relation with subject               -         -         -
    TYPE OF COMPLEMENTATION
19  Base-complement                     =         +         -
20  to-complement                       +         -a        -
21  -en complement                      -         -         +
22  -ing complement                     -         -         -
    INFLECTIONAL FORMS
23  3rd singular                        +         -         +
24  -en form                            -         -         -
25  -ing form                           =         -         R
26  Base form                           -         -         +
27  Past tense                          +         +         +
28  Unreal mood: protasis               +         +         +
29  Unreal mood: apodosis               -         +         -
30  Unreal mood: tentative              -         +         -

NB: R means that the verb has the given property but under restricted conditions.

Clearly, at the outset we can say that be in the current construction shares quite a
lot of features with a typical modal can. In what follows I concentrate on the
extent to which this claim holds.

3. Modal Be in Word Grammar


In this section, we look at what Word Grammar says about aspects of be. Word
Grammar (WG), which is fully developed and formalized in Hudson (1984,
1990), is a lexicalist grammatical theory because the word is central (hence the
name of the theory): it makes essentially no reference to any grammatical unit
larger than the word.
WG uses a small triangle to show the model-instance relation: the general
category (i.e. the model) sits on the base, the apex points at the member (i.e. the
instance), and an extension line links the two.
Now, let us examine in more detail the properties of be as in (1), to assess the
claim that it should be categorized as a modal in WG terms. Consider the
diagram in Figure 1.

Figure 1 BEto in Word Grammar

What Hudson (1996) basically claims in WG, using the model-instance relation,
is the following:

• word is an independent entity in grammar;
• verb is an instance of word;
• auxiliary verb is an instance of verb;
• modal verb is an instance of auxiliary verb, along with other instances (e.g.
  HAVEs, DOs and BE);
• be in this construction (represented as BEto in Figure 1) is an instance both
  of modal verb and of BE.

This analysis implies that BEto may inherit characteristics of modal verbs and at
the same time those of BE, by Hudson's Inheritance Principle, although these
characteristics are not always necessarily inherited:

Inheritance Principle (final version, Hudson 2005):

If fact F contains C, and C' is an instance of C, then it is possible to infer a second
fact F' in which C' replaces C provided that:
a. F does not contain 'is an instance of ...', and
b. there is no other fact which contradicts F' and which can also be inherited by C'.

The notion of 'contradicting' can be spelt out more precisely, but the idea here
should be clear. In a nutshell, the Inheritance Principle says that a fact about
one concept C can be inherited by any instance C' of C unless it is contradicted
by another, more specific fact about C'.
In sum, WG analyzes be in the be to construction as an instance of modal
verb and be, allowing it to inherit characteristics from both heads in the model-
instance hierarchy in Figure 1.
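The mechanics of default inheritance lend themselves to a brief computational
sketch. The following Python fragment is purely illustrative and not part of the
WG formalism: each concept here records a single model and the facts stated
directly about it, and a fact is inherited from the nearest model unless a more
specific fact overrides it. (WG's multiple models, as with BEto, would need a
list of models rather than a single isa link.)

    # A minimal sketch of default inheritance over a model-instance
    # ('isa') chain. Class and attribute names are assumptions of the
    # sketch, not WG notation.
    class Concept:
        def __init__(self, name, isa=None, **facts):
            self.name = name
            self.isa = isa        # the model this concept instantiates
            self.facts = facts    # facts stated directly about it

        def inherit(self, attribute):
            # Walk up the isa chain; the first (most specific) fact
            # found wins, so specific facts override inherited defaults.
            concept = self
            while concept is not None:
                if attribute in concept.facts:
                    return concept.facts[attribute]
                concept = concept.isa
            return None

    word = Concept('word')
    verb = Concept('verb', isa=word)
    aux = Concept('auxiliary verb', isa=verb, accepts_nt=True)
    modal = Concept('modal verb', isa=aux, tensed_only=True,
                    complement='bare infinitive')
    # BE_to inherits tensed_only and accepts_nt but overrides the
    # default complement type of modals (cf. section 5.2 below):
    be_to = Concept('BE_to', isa=modal, complement='to-infinitive')

    print(be_to.inherit('tensed_only'))  # True, from modal verb
    print(be_to.inherit('complement'))   # 'to-infinitive', overriding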

4. Morphological Aspects
I claim that be in (1) should be considered a modal because it shares most of its
properties with a prototypical modal in morphology, syntax and semantics.
This claim is supported in Figure 1 by the fact that BEto has a multiple head,
thus inheriting features from both heads, i.e. modal verb and BE. However, a
further semantic analysis of the construction shows that the sense of the
construction is derived from the sense of the infinitive clause rather than that of
be. Let us start by having a look at the morphological characteristics of be in the
construction and its similarities with other modals.

4.1 Like modals
Consider the following examples:

• In Standard English a modal is always tensed, i.e. either present or past.

This is compatible with the behaviour of be in the be to construction, as in (2):

(2) *It is a shame for John to be to leave here tomorrow. [Warner]
    *To be to leave is sad. [Pullum & Wilson]
    *I expect him to be to leave. [Pullum & Wilson]
    *He could be to leave. [Pullum & Wilson]
    *She might be to win the prize.
    *I don't like our being to leave tomorrow. [Warner]
    *I am being to put in a lot of overwork these days. [Seppanen]
    *I have always been to work together. [Seppanen]
    *Don't be to leave by midnight. [Pullum & Wilson]
    *Be to leave it till later. [Seppanen]

Clearly each of the examples in (2) shows that be in this construction cannot be
non-finite. The presence of tense is what be shares with modal verbs.

• Only (tensed) auxiliary verbs accept -n't

Be in this construction allows negative contraction, which is shown only by
tensed auxiliaries; this again implies that be is a modal, because auxiliaries
include modals.

(3) Her novel wasn't to win that year.

4.2 Unlike modals
• Be has a distinct s-form controlled by subject-verb agreement

(4) {Her novel is/They were/I am/She was/We are} to win the prize.

5. Syntactic Aspects
The second aspect is related to syntax.

5.1 Like modals
• Only (tensed) auxiliary verbs allow a dependent not to follow them.

Be in this construction also has this feature as in (5).

(5) He is not to leave this room. [They are/*get not tired. ]

• It must share its subject with the following verb (i.e. it is a raising verb)5
The behaviour of be is the same as that of a typical raising verb seem as in (6).

(6) He is to go/*He is (for) John to go


He can speak/*He can (for) John speak
*Mary seemed (for) John to be enjoying himself

• Voice-neutral in many circumstances

This again strongly suggests that be in the be to construction is a raising verb.

(7) You are to take this back to the library at once ~ This is to be taken back to the
library at once.6

• It cannot co-occur with other modals

This feature is critical in that two members of the same category of modal verbs
cannot appear consecutively. (8) shows that might and be to belong in the same
category of modal verbs.

(8) *She might be to win the prize.

• It can precede perfective/progressive/passive auxiliary

When be appears with a perfective, progressive or passive auxiliary verb, it
always appears in the left-most slot reserved for modal verbs, i.e. immediately
before these auxiliaries.

(9) Her novel {was to have won/was to be going on display/was to be considered}
that year.

• It is an operator, i.e. it has the NICE properties in the sense of Quirk et al.
(1985) [but Code only if the same verb is given in the previous clause, as in
(11)]

Auxiliaries have the NICE properties, and the NICE properties are also shared
by modals.

(10) Was her novel to win the prize? Mine was.
(11) *Joe's novel would win a prize that year. Mine wasn't.

5.2 Unlike modals
• It takes a to-infinitive rather than a bare infinitive as its complement.

(12) That young boy was *(to) become President of the United States.

6. Semantics of the Be To construction


This be to has an array of meanings: arrangement, obligation, and predestined
future, 'future in the past', possibility, purpose ('to be intended to') and
hypothetical condition.
It must be noted here that be in this construction has both epistemic and non-
epistemic meanings, which is again a diagnostic that typical modals show. The
sentences in (13)-(19) involve a non-epistemic instance of the construction:

(13) She is to see the dean tomorrow at 4 p.m.
(14) You are to sit down and keep quiet.
(15) You're to marry him within the next six months.
(16) Their daughter is to be married soon. [Quirk et al.]
(17) They are to be married in June. [OALD6]
(18) The Prime Minister is to get a full briefing on the release of the hostages next
week.
(19) Ministers are to reduce significantly the number of examinations taken by
pupils in their first year in the sixth form as the result of an official review to be
published later this week. The review will recommend dismantling the modular
system of assessment that is at the heart of the new sixth-form curriculum. [The
Times]

Although be to has several different meanings, its basic (or core) meaning can be
stated as follows:

• The agent has been set or scheduled to do something by some external
(outside) forces, and is thus obliged. However, the agent's commitment to
the obligation is left open.

Here the key points are the arrangeability of the event described and the
openness of the agent's commitment to the obligation. The first point is most
easily detected when be to occurs with an event that cannot be arranged. Consider
the following examples:

(20) ?The sun is to rise at 5.15 a.m. tomorrow morning.
(21) The sun will rise at 5.15 a.m. tomorrow morning.
(22) You are to take these four times a day.

A straightforward example of the use of be to in a context where an event
cannot be arranged can be found in (20), which is odd in comparison with the
other two sentences. The fact that the sun's rising is normally not (and cannot
be) arranged is indeed the reason why (20) is low in acceptability.7 In contrast,
(21) is fine, with will implying the speaker's subjective prediction. One can also
utter a sentence like (22), which depicts an arrangeable event.
One might take it for granted that the be to necessarily implies the
arrangement of an event, but that would be missing the more general point that
there is no need to express the agent as in (23) or (24):

(23) There's to be an official inquiry. [Quirk et al.]
(24) Regional accents are still acceptable but there is to be a blitz on incorrect
grammar. [COBUILD2]

What is needed for non-epistemic meaning of the be to construction is that the


sentence expresses an arrangeable event or activity.
There is another use of be to representing 'predestined future' as in (25)-(30):

(25) They are to stay with us when they arrive. [CLD]
(26) You are to be back by 10 o'clock. [Quirk et al.]
(27) A clean coal-fired power plant is to be built at Bilsthorpe Colliery. [COBUILD]
(28) You are to take this back to the library at once ~ This is to be taken back to the
library at once.
(29) 'They are to be seen and displayed on walls and floors both in museums and
domestically.' [COBUILD WB, UK written]
(30) I've also learned that in these difficult times it truly is important that we're all
thinking together about what is to be done and how best to move. [US spoken]

All these examples assert the speaker's high certainty at the speech time of the
event happening in the (near) future.

Related to this usage type is 'future in the past', where the speech time is
transferred to some point in the past:

(31) a. After dinner they were to go to a movie. [COBUILD3]
     b. Then he received a phone call that was to change his life ... [COBUILD4]
(32) He was eventually to end up in the bankruptcy court. [Quirk et al.]
(33) The meeting was to be held the following week. [Quirk et al.]
(34) Her novel was to win the prize.
(35) Worse was to follow.
(36) This episode was to be a taste of what was to come in the following couple of weeks.

Different or varied meanings such as 'compulsion', 'plan', 'destiny', etc. can
derive from the core meaning according to the context be to appears in, as in
(37a) and (37b). In (37), the part before and/as is the same in both sentences.
Nevertheless the interpretation of this part at the level of sentence meaning is
quite different. Where does this difference come from? The context, more
precisely the following context in this particular case, is responsible for this
difference. What is interesting is that the meaning (sense) of be to is determined
by the following context.

(37) a. You aren't to marry him, and that's an order.
     b. You aren't to marry him, as I read it in the cards.

(37a) is obviously interpreted as an order, while (37b) has an epistemic
predictive sense. It is manifested quite clearly above that the array of
connotations is pragmatically determined.
On the other hand, be to has epistemic meanings, illustrated in (38)-(42).
On the other hand, the be to has epistemic meanings illustrated in (38)-(42).

(38) Such an outcome is to be expected.
(39) These insects are to be found in NSW. [Huddleston 1980: 66; Seppanen]

Furthermore it can be used in conditionals in English as in (40)-(42).

(40) And the free world has reacted quickly to this momentous process and must
continue to do so if it is to help and influence events. [ICE-GB:S1B-054 #17:1:B]
(41) the system is totally dependent on employee goodwill if it is to produce good
information. [ICE-GB:W2A-016 #118:1]
(42) However, in nerves regeneration is essential if there is to be a satisfactory
functional outcome. [ICE-GB:W2A-026 #15:1]

There is arguably a clear-cut distinction between be to and the epistemic
modals in their use in conditionals. It is practically ruled out, or catalogued as a
performance error, for speakers of English to select an epistemic modal for the
protasis of a conditional, even though the meaning of this modal is conceptually
quite compatible with the functioning of either part (protasis or apodosis) of a
conditional. The contrast in (43) serves as a most relevant observation to
explain this phenomenon.

(43) a. ??If it may rain, you should take your umbrella.
     b. If it is possible that it will rain, you should take your umbrella.

According to Lyons (1977: 805-86), 'conditional clauses are incompatible with
subjective epistemic modal expressions'. In (43a), may in the protasis if it may
rain expresses a figment of the speaker's imagination and merely presents
possibility as non-factual, which is in conflict with the other possible world
created by if. By contrast, the possibility expressed by the non-modal expression
in an acceptable utterance like (43b) refers to possibility as actuality, independent
of the speaker; here possibility is categorically asserted and is therefore factual.
In passing, non-modal expressions can express modal-like meanings as in
(44) and (45):

(44) a. It's your duty to visit your ailing parents.
     b. You ought to visit your ailing parents.
(45) a. Jessica is possibly at home now.
     b. Jessica may be at home.

In the end, there are of course differences between be to and modal verbs.
Still, be in (1) shares enough properties with modal verbs to be categorized as a
modal. Even so, the sense of the be to construction is best analyzed as the
existence of a situation in which the event is represented by the VP of the
infinitive. The modal-like meanings of the construction are derived from the
sense of to rather than that of be, as is attested in the following section.

7. Should To be Counted as Part of the Lexical Item?


Let me take pieces of evidence one by one to argue for my proposal.

• Inversion

In a yes-no question, what moves to the front is not be to but be, which suggests
that be behaves like a modal (operator), with to being an infinitive marker.

(46) He should go
Should he go?
(47) He ought to go.
* Ought to he go?
Ought he to go?
(48) He is to go.
*Is to he go?
Is he to go?

• VP fronting (Gazdar et al., 1982)

The impossibility of VP fronting, as in (49f), shows that be itself isn't a modal. If
it were, it should behave as will does in (49b):

(49) a. *... and went he
     b. ... and go he will
c. ... and going he is
d. ... and gone he has
e. ... and taken by Sandy he was
f. *... and to go he is
g. *... and to go he wants
h. *... and be going he will
i. *... and have gone he will
j. ... and being evasive he was

• Be may be separated from to. Therefore, be to isn't a syntactic unit.

(50) We are, I believe, to start tomorrow.
(51) The most severe weather is yet/still to come. [Quirk et al.: 143]
(52) He was eventually to end up in the bankruptcy court. [Quirk et al.: 218]

• To may be missed out in the tag.

If be to were a unit, to would have to be retained.

(53) He was to have gone, wasn't he?

• Unlike ought to and have to, the to doesn't have to be retained in be to when a
VP that follows to is deleted.

Since deletion of the VP after to appears always to be possible, whether the
relevant verb is a modal or not, as in (55), this contrast tells us nothing about the
category of the item before to. What (54) does imply, however, is that the VP
deletion is dependent on be in be to, rather than on to, which in turn suggests
that be is a modal in its own right.

(54) Bill is to leave at once, and Alice is (to) also. [McCawley]
     Bill has to leave at once, and Alice has *(to) also. [McCawley]
     We don't save as much money these days as we {ought (to)/used to}. [Quirk
     et al.: 909]
(55) I've never met a Klingon, and I wouldn't want to. [Pullum, 1982: 185]

• Unlike ought to and have to, there is no phonetic fusion with be to.

Though it is not fully clear whether the examples in (56) are zeugmatic or not,
the to-infinitive may be coordinated with a wide range of conjuncts of different
categories (Warner 1993). My informants, however, say that they are all right
without a zeugmatic reading. If this is the case, the to-infinitive is an
independent unit in be to and there has to be a syntactic gap between be and to.

(56) He was new to the school and to be debagged the following day.
The old man is an idiot and to be pitied.
You are under military discipline and to take orders only from me.
You are a mere private and not to enter this mess without permission.
He was an intelligent boy and soon to eclipse his fellows.

If this is the case, and there is a one-to-one relation between syntax and
semantics as is maintained in WG, the to-infinitive is semantically as well as
functionally an independent element in the be to construction, and therefore
be to is obviously not a syntactic unit, although there has to be some
grammatical relation between the two elements (i.e. be and the to-infinitive). It
seems that what the to-infinitives in (56) have in common is the function of
predication; otherwise they could not be coordinated with the conjuncts before
and in (56). This implies that be is an ordinary predicative be, followed by an
infinitival clause with to of predicative function.
Sentences (57) and (58) give evidence supporting this predicative function
of the infinitive clause, because there is no be found in the examples in (57).
Nevertheless, the NPs in (57) and (58) all express predication, although they
are NPs as a whole in syntactic terms:

(57) Prudential to float its Egg online bank
     Woman to head British Library
Teeth to be farmed
Naked swim teacher to sue
Vestey grandson to stand trial
Hayward Gallery to be refurbished and extended
Tendulkar to stand down as captain
(58) ... the quaint aspects of working class life to be found in many major novelists
[Palmer]

This predicative analysis of the to-infinitive is also supported by (59a), where the to-
infinitive is a part of a small clause with the preposition with. It is a well-attested fact that
with heads a small clause expressing predication as in (59b). Therefore, the to-infinitive
functions as a predicative in (59a).

(59) a. ... in a consultation paper agreed with Dublin to be released at the ...
        [COBUILD]
     b. With Peter the referee we might as well not play the match. [Aarts 1992: 42]

All these arguments show that be to is not a lexical unit.

8. A Word Grammar Analysis of the Be To Construction


In this section, I will show how a WG analysis makes it possible to give the
same syntactic and semantic structure to the epistemic and non-epistemic
meanings of the construction, based on the evidence that be is an instance of a
raising verb in both cases. As far as I know, there has been very little research,
if any, on the mapping between the semantic and syntactic structures of core
and marginal modals, including the present construction. In this sense, my
approach is quite a valuable one. In these linguistic circumstances, I suggest that
the question to be asked is: what must be the mapping between the semantic
and syntactic structure of what is represented by the be to construction? I now
present an answer framed in WG terms. Before giving a detailed analysis, let us
have a quick look at WG in a nutshell.
In WG, a syntactic structure is based on grammatical relations within a
general framework of dependency theory, rather than on constituent structure.
Accordingly, a grammatical relation is defined as a dependency relation
between a head and its dependents, which include complements and adjuncts
(alias modifiers). In this framework, the syntactic head of a sentence, as well as
its semantic head, is therefore a finite verb on which dependents such as the
subject, object and so forth depend. To take a very simple example, the
grammatical analysis of you are reading a Word Grammar paper is partially shown
by the diagram in Figure 2.

Figure 2 Syntax and semantics in WG

Each arrow in this diagram shows a dependency between words and points
from a head to one of its dependents; what is most important here is that
there are no phrases or clauses in the sense of constituency grammars. Thus in
terms of dependency, are is the root of the sentence (represented by a bold
arrow), on which you and reading depend as subject and sharer (a kind of
complement),8 respectively. In turn, a is the head of paper; Grammar depends
on paper, Word depends on Grammar, and so on. Turning to the semantics of
this sentence, 'you', the referent of you, is linked as the read-er (i.e. agent) to
the semantic concept 'you read a Word Grammar paper', an instance of 'read'.9
The curved (vertical) lines point to the referent of a word. 'Er' and 'ee' are
names of semantic relations, or thematic roles. The small triangle representing the
model-instance relation is the same as in earlier figures: a convenient way to
diagram the model-instance relation is a triangle with its base along the
general category (= model) and its apex pointing at the member (= instance),
with an extension line linking the two. In Figure 2, then, the diagram shows the
relation between the sense of the word read, 'read', and its instance 'you read a
WG paper'.
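The dependency analysis in Figure 2 can also be made concrete with a small
data structure. The sketch below is an illustrative encoding under my own
assumptions (the field names are not WG notation): each word records its head
and the label of the dependency linking it to that head, and the sentence is
simply the set of words, with no phrasal nodes.

    # An illustrative encoding of the dependency analysis of 'You are
    # reading a Word Grammar paper' (Figure 2). Field names are
    # assumptions of the sketch.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Word:
        form: str
        head: Optional[int]      # index of the head word; None = root
        relation: Optional[str]  # dependency linking word to its head

    sentence = [
        Word('You', 1, 'subject'),        # subject of 'are'
        Word('are', None, None),          # root of the sentence
        Word('reading', 1, 'sharer'),     # a kind of complement of 'are'
        Word('a', 2, 'object'),           # object of 'reading'
        Word('Word', 5, 'dependent'),     # depends on 'Grammar'
        Word('Grammar', 6, 'dependent'),  # depends on 'paper'
        Word('paper', 3, 'complement'),   # complement of 'a'
    ]

    # No phrase or clause nodes are needed: the structure is exhausted
    # by word-to-word dependencies, with 'are' as the root.
    root = next(w for w in sentence if w.head is None)
    print(root.form)  # 'are'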
Based on the arguments in the preceding sections, the diagrams in (60a) and
(60b) offer a view of the syntax and semantics of the be to construction, which
has both epistemic and non-epistemic senses. Translated into WG schematic
features, this means that in the syntax of both senses it has a raising structure,
represented as the main subject functioning both as the subject of be and as
that of to or the infinitival verb.
Thus the WG configuration posits by and large the same semantic structure
for both senses of the construction, headed by the semantic concept 'Be'. The
detailed analysis of the epistemic structure is given in (60a):
(60) a.

What this diagram shows about the semantic structure is:

• epistemic sense is an instance of modality;
• the sense of aren't is 'be' (neglecting the negation);
• 'be' is an instance of epistemic modality;
• 'be' has a proposition as a dependent.

Similarly in (60b), containing the non-epistemic sense of be to:

(60) b.

Here 'be' is an instance of non-epistemic modality, because (60b) means that
some event is arranged or planned, without any sense of the speaker's
judgement on the proposition embedded in the utterance (sentence). In both
cases, the meaning (i.e. sense) of 'be' needs as a dependent a proposition which
expresses an activity or event, which is the sense of the verb.
Abstracting away from the technical markers, the diagram in (60c) represents
a WG analysis of the coordinate structure in (56). This diagram schematizes the
idea that the same predicative function (a dependency relation) holds between
was and the first conjunct new to the school, enclosed in square brackets, on the
one hand, and between was and the second conjunct to be debagged ... on the
other, at the same time.
(60) c.

9. Conclusion
In this chapter, I have shown that a morphological, syntactic and semantic
analysis of be in the be to construction provides evidence for the category of be
proposed here. Namely, be is an instance of a modal verb in terms of
morphology and syntax, while the sense of the whole construction is
determined by the sense of to. The analysis also explains why be to does not
constitute a lexical unit. Finally, the WG account presented here gives the same
syntactic and semantic structures to both senses of the construction, reducing
the complexity of the mapping between the two levels of structure of the
modal-like expression be to.

References
Aarts, Bas (1992), Small Clauses in English. Berlin: Mouton de Gruyter.
Bybee, John (1994), The Evolution of Grammar. Chicago: The University of Chicago
Press.
Celce-Murcia, Marianne and Larsen-Freeman, Diane (1999), The Grammar Book.
Boston, MA: Heinle & Heinle.
Collins Cobuild English Language Dictionary for Advanced Learners. (1995), Glasgow:
Harper Collins Publishers.
Collins Cobuild English Language Dictionary for Advanced Learners. (2001), Glasgow:
Harper Collins Publishers.
Collins Cobuild Advanced Learner's English Dictionary. (2003), Glasgow: Harper Collins
Publishers.
Gazdar, Gerald, Pullum, Geoffrey K. and Sag, Ivan A. (1982), 'Auxiliaries and related
phenomena in a restrictive theory of grammar'. Language, 58, 591-638.
Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (eds), (1980), Studies in
English Linguistics for Randolph Quirk. London: Longman.
Huddleston, Rodney (1976), 'Some theoretical issues in the description of the English
verb'. Lingua, 40, 331-383.
— (1980), 'Criteria for auxiliaries and modals', in Greenbaum, Sidney, et al. (eds),
Studies in English Linguistics for Randolph Quirk. London: Longman, pp. 65-78.
Hudson, Richard A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1996), A Word Grammar Encyclopedia (Version of 7 October 1996). University
College London.
— (2005, February 17 - last update), 'An Encyclopedia of English Grammar and Word
Grammar', (Word Grammar), Available: www.phon.ucl.ac.uk/home/dick/wg.htm.
(Accessed: 21 April 2005).
Kreidler, Charles W. (1998), Introducing English Semantics. London: Routledge.
Lampert, Günther and Lampert, Martina (2000), The Conceptual Structure(s) of
Modality. Frankfurt am Main: Peter Lang.
Lyons, John (1977), Semantics. Cambridge: Cambridge University Press.
McCawley, James D. (1988), The Syntactic Phenomena of English. Chicago: University
of Chicago Press.
Napoli, Donna Jo (1989), Predication Theory. Cambridge: Cambridge University Press.
Palmer, Frank Robert (1990), Modality and the English Modals (2nd edn). Harlow:
Longman.
— (2001), Mood and Modality (2nd edn). Cambridge: Cambridge University Press.
Perkins, Michael. R. (1983), Modal Expressions in English. London: Frances Pinter.

Pullum, Geoffrey K. (1982), 'Syncategorematicity and English infinitival to'. Glossa, 16,
(2), 181-215.
Pullum, Geoffrey K. and Wilson, Deirdre (1977), 'Autonomous syntax and the analysis
of auxiliaries'. Language, 53, 741-88.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars
(1985), A Comprehensive Grammar of the English Language. Harlow: Longman.
Seppanen, A. (1979), 'On the syntactic status of the verb be to in Present-day English'.
Anglia, 97, 6-26.
Sugayama, Kensei (1996), 'Semantic structure of eat and its Japanese equivalent taberu: a
Word-Grammatic account', in Barbara Lewandowska-Tomaszczyk and Marcel
Thelen (eds), Translation and Meaning, Part 4. Maastricht: Universitaire Pers
Maastricht, pp. 193-202.
— (1998), 'On be in the be to Construction', in Yuzaburo Murata (ed.), Grammar and
Usage in Contemporary English. Tokyo: Taishukan, pp. 169-77.
Warner, Anthony R. (1993), English Auxiliaries. Cambridge: Cambridge University
Press.

Notes
1 This is a revised and expanded version of my paper of the same title read at the
International Conference 'Modality in Contemporary English' held in Verona, Italy
on 6-8 September 2001. I am most grateful for the comments from the audience at
the conference. Remaining errors are, however, entirely my own. The analysis
reported here was partially supported by grants from the Daiwa Anglo-Japanese
Foundation (Ref: 02/2030). Their support is gratefully acknowledged.
2 Here we shall not take into account similar sentences as in (i), which are considered
to have a different grammatical structure from the one we are concerned with in this
chapter.
(i) My dream is to visit Florence before I die.
3 The idea of construction here is quite a naive one, different from the technical
definition of the one used in Goldberg's Construction Grammar.
4 Palmer (1990: 164), among others, claims that 'is to' is formally a modal verb.
5 Be to can be used in the there construction as in (i).
(i) Regional accents are still acceptable but there is to be a blitz on incorrect
grammar. [COBUILD2]
This suggests that be is a raising verb because there is no semantic relation between
there and 'be to'.
Example (i) can be construed as a counter-example to viewing this construction as
one involving subject control, as in (ii).
(ii) Mary is [PRO to leave by 5]. [Napoli]
6 The possibility sense is found only in the passive, so there is no active counterpart
for These insects are to be found in NSW. [Huddleston 1980: 66; Seppanen]
7 Quite exceptionally, it could be arranged by God or other supernatural beings.
Otherwise it cannot be.
8 Sharer is a grammatical relation in Word Grammar.
9 Here we disregard the tense and aspect of the sentence.
5 Linking in Word Grammar
JASPER HOLMES

Abstract
In this chapter I shall develop an account of the linking of syntactic and semantic
arguments in the Word Grammar (WG) framework. The WG account is shown
to have some of the properties of role-based approaches and some of the
properties of class-based approaches.

1. Linking in Word Grammar: The syntax semantics principle

1.1 Introduction
Any description of linguistic semantics must be able to account for the way in
which words and their meanings combine in sentences. Clearly, this
presupposes an account of the regular relationships between syntactic and
semantic structures: a description of the mechanisms involved in linking.
The search for an adequate account of linking has two further motivations: it
makes it possible to explain the syntactic argument-taking properties of words
(and therefore obviates the need for valency lists or other stipulative
representations of subcategorization facts); and it provides a framework for
dealing with words whose argument-taking properties vary regularly with the
word's meaning (many such cases are treated below and in the work of other
writers in the field of lexical semantics including Copestake and Briscoe 1996;
Croft 1990; Goldberg 1995; Lemmens 1998; Levin 1993; Levin and Rappaport
Hovav 1995; Pustejovsky 1995; and Pustejovsky and Boguraev 1996).
Levin and Rappaport Hovav provide yet another reason to seek an account
of argument linking: that it is an intrinsic part of the structure of language. In
their introduction, they make the following claim:

To the extent that the semantic role of an argument is determined by the meaning of
the verb selecting it, the existence of linking regularities supports the idea that verb
meaning is a factor in determining the syntactic structure of sentences. The striking
similarities in the linking regularities across languages suggest that
they are part of the architecture of language. (1995: 1, my emphasis)

Of course, it is not the meanings of verbs alone that are relevant in determining
semantic structure. It should also be clear that I do not share Levin and
Rappaport Hovav's conviction of the similarities across languages in the
details of argument linking. However, I accept readily that the fact of
argument linking, and the mechanism that controls it, must be shared across
languages.
The linking regularities that we seek are generalizations over correspondences
between syntactic and semantic relationships. In the WG framework
(Hudson 1984, 1990, 1994, 2004; Holmes 2005), they take the form of
specializations or refinements of the Syntax Semantics Principle (SSP) (Hudson
1990: 132). This is represented schematically in Figure 1 and given in prose in
(1). The SSP, as shown here, corresponds to the bijection principle of Lexical
Functional Grammar (Bresnan 1982) and to the projection principles and
θ-criterion of Government and Binding Theory (GB) (Chomsky 1981: 36, 38).

Figure 1 Syntax Semantics Principle

(1) Syntax Semantics Principle (SSP): A word's dependent refers to an
associate of its sense.1

Specific linking rules for specific relationships link classes of syntactic
dependency with classes of semantic associate. These classes gather together
the relevant syntactic and semantic properties. By way of exemplification, I
begin with the structure associated with the indirect object relationship. I go on
to discuss the properties of objects and subjects.
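The division of labour between the SSP and its specializations can be previewed
in a toy computational sketch. In the fragment below, the rule table and names
are my own illustrative assumptions (WG states the rules as specializations of
the SSP in a network, not as a lookup table): the SSP itself only requires that
each dependent's referent fill some associate role of the verb's sense, while the
specific linking rules say which role goes with which dependency type.

    # A toy illustration of the SSP plus specific linking rules. The
    # rule table and function names are assumptions of the sketch.

    # Specific linking rules pair classes of syntactic dependency with
    # classes of semantic associate (developed in sections 1.2-1.3).
    LINKING_RULES = {
        'subject': 'er',                 # e.g. the giver
        'object': 'ee',                  # e.g. the givee
        'indirect object': 'recipient',  # the haver in the result
    }

    def link(dependents):
        """Apply the SSP: every dependent's referent fills an
        associate role of the verb's sense, as named by a rule."""
        semantics = {}
        for dependency, referent in dependents.items():
            role = LINKING_RULES[dependency]
            semantics[role] = referent
        return semantics

    # 'We gave her some flowers'
    print(link({'subject': 'we',
                'object': 'some flowers',
                'indirect object': 'her'}))
    # {'er': 'we', 'ee': 'some flowers', 'recipient': 'her'}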

1.2 Indirect objects
Figure 2 shows some of the syntactic and semantic structure that needs to be
associated lexically with GIVE: the verb has a subject (s in the diagram), an
object (o in the diagram) and an indirect object (io in the diagram), all of which
are nouns. The sense of the verb, Giving, is an event with an 'er' (the referent of
the subject; the properties of 'ers' and 'ees' are discussed shortly), an 'ee' (the
referent of the object), a recipient (the referent of the indirect object) and a result,
an example of Having which shares its arguments with its parent. The giver has
agentive control over the event, being in possession of the givee beforehand and
willing the transfer of possession. The givee and the recipient are more passive
participants: the former undergoes a change of possession, but nothing else; the
latter simply takes possession of the givee. Being a haver presupposes other
properties (centrally humanity), but those are not shown here.

Figure 2 GIVE

Clearly, not all this information is specific to GIVE. Volitional involvement,
control and instigation are semantic properties associated with many other
subject relationships, even those of verbs that have no (indirect) objects; the
passive role and affectedness of the object also apply in many other cases; and
many other verbs can appear with indirect objects, with similar semantic
properties. Levin provides the following two groups of verbs permitting indirect
objects (1993: 45-49), distinguished from each other by semantic properties
(those in (2) alternate, according to Levin's analysis, with constructions with the
preposition TO, those in (3) with constructions with FOR). The question of
alternation, as well as the difference between the two groups, is dealt with shortly.

(2) ADVANCE, ALLOCATE, ALLOT, ASK, ASSIGN, AWARD, BARGE,
BASH, BAT, BEQUEATH, BOUNCE, BRING, BUNT, BUS, CABLE,
CARRY, CART, CATAPULT, CEDE, CHUCK, CITE, CONCEDE,
DRAG, DRIVE, E-MAIL, EXTEND, FAX, FEED, FERRY, FLICK,
FLING, FLIP, FLOAT, FLY, FORWARD, GIVE, GRANT, GUARAN-
TEE, HAND, HAUL, HEAVE, HEFT, HIT, HOIST, HURL, ISSUE,
KICK, LEASE, LEAVE, LEND, LOAN, LOB, LUG, MAIL, MODEM,
NETMAIL, OFFER, OWE, PASS, PAY, PEDDLE, PHONE, PITCH,
POSE, POST, PREACH, PROMISE, PULL, PUNT, PUSH, QUOTE,
RADIO, READ, REFUND, RELAY, RENDER, RENT, REPAY, ROLL,
ROW, SATELLITE, SCHLEP, SELL, SEMAPHORE, SEND, SERVE,
SHIP, SHOOT, SHOVE, SHOW, SHUTTLE, SIGN, SIGNAL, SLAM,
SLAP, SLIDE, SLING, SLIP, SMUGGLE, SNEAK, TAKE, TEACH,
TELECAST, TELEGRAPH, TELEPHONE, TELEX, TELL, THROW,
TIP, TOSS, TOTE, TOW, TRADE, TRUCK, TUG, VOTE, WHEEL,
WILL, WIRE, WIRELESS, WRITE, YIELD.
(3) ARRANGE, ASSEMBLE, BAKE, BLEND, BLOW, BOIL, BOOK,
BREW, BUILD, BUY, CALL, CARVE, CASH, CAST, CATCH,
CHARTER, CHISEL, CHOOSE, CHURN, CLEAN, CLEAR, COM-
PILE, COOK, CROCHET, CUT, DANCE, DESIGN, DEVELOP, DIG,
DRAW, EARN, EMBROIDER, FASHION, FETCH, FIND, FIX,
FOLD, FORGE, FRY, GAIN, GATHER, GET, GRILL, GRIND,
GROW, HACK, HAMMER, HARDBOIL, HATCH, HIRE, HUM,
IRON, KEEP, KNIT, LEASE, LEAVE, LIGHT, MAKE, MINT, MIX,
MOLD, ORDER, PAINT, PHONE, PICK, PLAY, PLUCK, POACH,
POUND, POUR, PREPARE, PROCURE, PULL, REACH, RECITE,
RENT, RESERVE, ROAST, ROLL, RUN, SAVE, SCRAMBLE,
SCULPT, SECURE, SET, SEW, SHAPE, SHOOT, SING, SLAUGH-
TER, SOFTBOIL, SPIN, STEAL, STITCH, TOAST, TOSS, VOTE,
WASH, WEAVE, WHISTLE, WHITTLE, WIN, WRITE.

The set of verbs that can take an indirect object, of either kind, is in
principle unlimited in size, since it is possible to extend it in one of two ways.
First, membership is open to new verbs which refer to appropriate activities:

(4) We radioed/phoned/faxed/emailed/texted/SMSed them the news.
(5) We posted/mailed/couriered/FedExed™ them the manuscript.
(6) Boil/coddle/microwave/Breville™ me an egg.

Second, and even more tellingly, existing verbs can be used with indirect
objects, with novel meanings contributed by the semantics of the indirect object:

(7) The colonel waggled her his bid with his ears.
(8) Dust me the chops with flour.

Examples like (7) and (8) are acceptable to the extent that the actions they
profile can be construed as having the appropriate semantic properties. For
example a bottle of beer can be construed as having been prepared for
someone if it has been opened for them to drink from, but a door is not
construed as prepared when it has been opened for someone to pass through:
(9) Open me a bottle of pils/*the door.

It is clear from this, and from the fact, noted by Levin (1993: 4-5) with
respect to the middle construction, that speakers make robust judgements
about the meanings of unfamiliar verbs in constructions on the basis of the
construction's meaning (see also (10)), that the meaning of the construction
must be represented in a schematic form in the mind of the language user.

(10) Flense me a whale.

This schematic representation must pair the semantic properties of the
construction with its syntactic and formal (phonological/graphological)
properties. Goldberg (2002) provides a powerful further argument for treating
constructions as symbolic units in this way. This argument, which she traces to
constructions as symbolic units in this way. This argument, which she traces to
Chomsky (1970) and to Williams (1991) (where it is called the 'target syntax
argument'), holds that where the properties of supposedly derived structures
(here the creative indirect objects) match those of non-derived ones (here the
lexically selected indirect objects), the generalization over the two sorts of
structure is most effectively treated as an argument structure construction in its
own right.
In English, which does not have a rich inflectional morphology, the formal
properties of the indirect object relationship are limited to the fact that personal
pronouns in the indirect object position appear in their default form ({me} not
{I}), which is also true of direct objects. Other languages show more variation,
marking the presence of an indirect object in the form of the verb, as in (11) from
Indonesian (Shibatani 1996: 171), or assigning different case to nouns in indirect
object position than those functioning as direct objects, as in German (12).

(11) Saya membunuh-kan Ana lipas.
     I kill-BEN [name] centipede
     'I killed a centipede for Ana.'
(12) a. Gib ihr/*sie Blumen.
        give her flowers
        'Give her flowers.'
     b. Küss *ihr/sie.
        kiss her
        'Kiss her.'

Some syntactic properties of the indirect object (in English) are given by
Hudson (1992). These include the possibility of merger with subject in passive
constructions (13), the obligatoriness of direct objects in indirect object
constructions (14) and its position immediately following the verb (15).

(13) She was given some flowers.
(14) We gave (her) *(some flowers).
(15) a. We gave her some flowers/*some flowers her.
b. We sent her some flowers over/her over some flowers/*some flowers
her over/*over her some flowers.
88 WORD GRAMMAR: PERSPECTIVES ON LANGUAGE STRUCTURE

The semantic property common to all indirect objects is that they refer to
havers: in the case of the verbs taking 'dative' indirect objects in (2), the result of
the verb's sense is that the referent of the indirect object comes into possession
of something; in the case of those taking 'benefactive' indirect objects in (3), the
verb profiles an act of creating or preparing something intended to be given to
the referent of the indirect object.

Figure 3 Some verbs have indirect objects

Figure 3 shows the various properties associated with indirect objects. First,
the diagram shows that indirect objects are nouns, and that it is verbs, and
more particularly verbs with objects, that have indirect objects: Ditransitive,
the category of verbs with indirect objects, isa Transitive, the category of verbs
with direct objects. This is enough by itself to represent the fact that the direct
object is obligatory with indirect objects (14), but the object relationship is
nevertheless also shown in the ditransitive structure, since it appears in the
word order rule (indirect objects precede objects). The referent of the object
also appears in the semantic structure, along with that of the indirect object,
since without it the semantic structure cannot be interpreted. I show the two
referents as coarguments of the result of the verb's sense, though the
semantics is worked out more clearly in the discussion of Figure 4. The fact
that indirect objects can merge with subjects in passive constructions is dealt
with in the following section.
Indirect objects may have one of two slightly different semantic structures,
each associated with a separate category of ditransitive verbs. In both, the
referents of the two dependents are 'er' and 'ee' of a Having, but the role of that
Having differs somewhat between the two. The two structures are given in
Figure 4.

Figure 4 Two kinds of indirect object

Ditransitive/1 is exemplified in (16):

(16) We baked her a cake.

The sense of the verb isa Making, and its result (therefore) isa Being (is a state)
and the argument of that Being is the referent of the direct object: baking a cake
results in that cake's existence, baking a potato results in that potato's being
ready. The Having that connects the referents of the two arguments is the
purpose of the verb's sense: the purpose of the baking of the cake is that it
should belong to her (the referent of the indirect object). This concept is
connected to the sense of the verb by the beneficiary relationship (labelled ben/
fy). Ditransitive/2 (17) has as its sense a Giving event, which straightforwardly
has as its result the Having that connects the referents of the two arguments.
The referent of the indirect object is connected to the sense of the verb by the
recipient relationship.

(17) We passed her a parcel.
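The contrast between the two structures in Figure 4 can be summed up in a
small sketch. The nested dicts below are an illustrative stand-in for the WG
network (my own encoding, not WG notation): both subtypes relate the two
referents through a Having, but in Ditransitive/1 the Having is the purpose of
the event, while in Ditransitive/2 it is the result.

    # An illustrative encoding of the two ditransitive semantic
    # structures in Figure 4; the dict layout is an assumption of
    # the sketch.

    def ditransitive_1(er, ee, beneficiary):
        """'We baked her a cake': the sense isa Making, its result is
        a Being of the object's referent, and the Having is the
        purpose of the event (beneficiary reading)."""
        return {'isa': 'Making', 'er': er, 'ee': ee,
                'result': {'isa': 'Being', 'er': ee},
                'purpose': {'isa': 'Having',
                            'er': beneficiary, 'ee': ee}}

    def ditransitive_2(er, ee, recipient):
        """'We passed her a parcel': the sense isa Giving, and the
        Having is straightforwardly its result (recipient reading)."""
        return {'isa': 'Giving', 'er': er, 'ee': ee,
                'result': {'isa': 'Having',
                           'er': recipient, 'ee': ee}}

    print(ditransitive_1('we', 'a cake', 'her'))
    print(ditransitive_2('we', 'a parcel', 'her'))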

Once these two semantic structures are established, they can be used in the
treatment of the relationship between the indirect object construction and
constructions with the prepositions TO and FOR. Simply, TO has the same sense
as Ditransitive/2 and FOR the same as Ditransitive/1 (with some differences: see
(18)). This synonymy can, though it need not, be treated as a chance occurrence:
no explanation is necessary for the relationship between constructions with
indirect objects and those with TO. The case of FOR and the difference seen in
(18) certainly supports the idea that the two constructions converge on a single
meaning by chance, since the two meanings are in fact different. The use of the
indirect object to refer to the beneficiary of an act of preparation is only possible
where the prepared item is prepared so it can be owned (or consumed) by the
beneficiary; this constraint does not apply to beneficiary FOR.

(18) a. Open a bottle of pils/the door for me.
     b. Open me a bottle of pils/*the door.

The pattern in Figure 3 (and Figure 4) represents a symbolic relationship.
Lexical structures include specifications of the meanings of individual lexemes
and of classes of lexemes defined by common properties of all sorts. A lexeme
has a form and a range of syntactic properties which identify the syntactic pole
of the symbolic relationship; it also has a sense, which provides the connection
to a range of semantic properties. Similarly, inflectional and other classes of
lexemes share formal, syntactic and semantic properties. And similarly,
syntactic dependencies are associated with a range of formal and syntactic
properties (chiefly constraints on the elements at either end of the dependency)
and semantic properties (represented in the semantic relationship between the
meanings of the two elements). Figure 5 shows, by way of an example, partial
lexical structures for the lexeme OPEN, the inflectional category Past and the
indirect object relationship.

Figure 5 Schematic representation of OPEN, Past, indirect object

The pattern in Figure 3 (and Figure 4) is a generalization over verbs taking
indirect objects. A verb appearing in a construction with an indirect object
instantiates the more general model. The model represents the properties of
the construction in the same way as a lexeme represents the properties of a
particular word. In the case of a novel use of the construction (19), the fact that
the sentence conforms to the formal properties entails that it also conforms to
the semantic properties of the construction. In fact the construction can also be
used to constrain the set of verbs that may take an indirect object, since only

those verbs that can conform to the properties of the construction can appear in
it: *Skate me a half-pipe/*Run me a mile, etc.

(19) Waggle me your bid.

Examples like (19) represent cases of multiple inheritance: the verb
instantiates both WAGGLE (from which it gets its form and much of its
meaning) and Ditransitive (from which it gets the indirect object and
concomitant semantic properties). This is the same mechanism that mediates
verbal inflection: the past tense of a verb inherits from the verb's lexeme and
from the category Past at the same time.
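A hedged sketch of this multiple inheritance can reuse the host language's own
class mechanism as a stand-in for WG's multiple models (class and attribute
names below are assumptions of the sketch): a token of WAGGLE used
ditransitively inherits from both the lexeme and the Ditransitive category, just
as a past-tense form inherits from its lexeme and from Past.

    # A toy illustration of multiple inheritance, with Python classes
    # standing in for WG's multiple models. Names are assumptions of
    # the sketch.

    class Waggle:
        form = 'waggle'          # from the lexeme WAGGLE
        sense = 'Waggling'

    class Ditransitive:
        dependents = ('subject', 'object', 'indirect object')
        result = 'Having'        # the construction's semantics: the
                                 # indirect object's referent comes to
                                 # have the object's referent

    class WaggleDitransitive(Waggle, Ditransitive):
        """'Waggle me your bid' (19): a contextual specialization that
        inherits its form and core meaning from WAGGLE, and the
        indirect object plus its semantics from Ditransitive."""

    token = WaggleDitransitive()
    print(token.form, token.dependents, token.result)
    # waggle ('subject', 'object', 'indirect object') Having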
Because of this possibility, it is not necessary to include all of the structure in
the diagrams in the lexical specification even of a verb like GIVE, since (when it
is used ditransitively) the relevant properties follow from the general properties
of ditransitive verbs. These verbs, whose use with an indirect object seems
unexceptional compared to those like (19), probably are lexically associated
with the indirect object construction. GIVE, for example, might be separated
into two sub-types, one of which isa Ditransitive, and the other of which takes
TO as a complement. By contrast, a verb like ACCORD, that never appears
without an indirect object, inherits all the properties of Ditransitive.
Figure 6 shows a part of the lexical structure of ACCORD and GIVE. All
cases of ACCORD have indirect objects, so the whole category is subsumed
under Ditransitive. GIVE, by contrast, is divided into two subcategories: one
which isa Ditransitive, and one which isn't (this category has TO as a
complement). The diagram also shows that creative use of the indirect object as
in (19) can be mediated by a contextual (= non-lexical) specialization of the
relevant lexeme, that inherits also from the 'inflectional' category Ditransitive.

Figure 6 ACCORD, GIVE, Waggle me your bid



Some of the features of the ditransitive model are nevertheless often
repeated or overridden in the structures associated with verbs that are
specializations of it. For example, Lending and Loaning are special in that their
result is temporary, Donating because the recipient is a charitable organization,
and Denying in that the intention is that the recipient should not receive from
the givee. These specializations of/divergences from the model must be
represented in the individual lexical structures of the verbs concerned.
A classification hierarchy consisting of classes defined by properties that
distinguish them from other categories is a commonplace in many approaches
to knowledge representation (and elsewhere). In linguistics the idea is found in
the work of structuralist semanticists (Weisgerber 1927; Trier 1931; Cruse
1986), among others.

1.3 Objects
Biber et al. (1999: 126-8) give a number of syntactic properties for English
objects, as follows (the properties assigned to objects and to subjects are all
taken from Biber et al.; some details may be disputed, but the general point
remains the same):

a. found with transitive verbs only
b. is characteristically an NP, but may be a nominal clause
c. is in accusative case (when a pronoun)
d. typically follows immediately after the VP (though there may be an intervening
   indirect object)2
e. may correspond to the subject in passive paraphrases

• The first two syntactic properties refer to the classes of the words at either
end of the object relationship: some verbs (the transitive verbs) lexically
select an object; the objects themselves are generally nouns.
• The third property concerns the form of the object: when it is a pronoun, it
takes the 'accusative' form (what I have above called the default form).
• The fourth property concerns its relative position in the sentence: objects
generally follow their parents, and only a limited set of other dependents of
the parent may intervene (any number of predependents of the object may
intervene) (20). Biber et al. note that indirect objects may come between the
object and the parent (21); this possibility is also open to particles (22).

(20) Philly filleted (skillfully) the fish.
(21) We gave her a new knife.
(22) She threw away the old one.

• The final syntactic property refers to passive constructions. Under the WG


analysis (see Hudson 1990: 336-53), the subject of a passive verb is at the
same time its object (23) (or indirect object (24)), the merger of dependents
being licensed by the passive construction itself.

(23) The camel hair coat was given to Cathy.
(24) Cathy was given the camel hair coat.

Figure 7 shows how these syntactic properties can be represented in a lexical
structure.

Figure 7 Syntactic properties of objects

The parent in an object relationship isa Verb and the dependent isa Noun. In
this way, the object relationship defines a class of transitive verbs (verbs that
have objects). Verbs that only appear in transitive constructions inherit all
properties from this class (DEVOUR isa Transitive, just as ACCORD isa
Ditransitive). The word order properties are represented by the next
relationship: the form of the dependent is the next of that of the parent.
The diagram also shows the category Ditransitive (see Figure 4), where the
word order properties are somewhat different (the form of the object is the next
of that of the indirect object). Also represented in the diagram is the class of
passive verbs (the category Passive). These verbs are defined by their formal
properties: the form of a passive verb consists of its base plus a suitable ending
(not shown). There are two classes of passive verb: one, which also isa
Transitive, in which the subject is merged with the object, and one, which also
isa Ditransitive, in which the subject is merged with the indirect object.
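
These classes can likewise be rendered as a small sketch. Here Transitive,
Ditransitive and the two Passive classes are encoded as property bundles with
default inheritance, and the passive's merger of dependents is expressed by
naming the two slots that are identified; all labels are illustrative, not WG
notation:

    # 'object_follows' stands in for the 'next' relation: the object's
    # form comes directly after the named dependent's form.
    CLASSES = {
        'Transitive':      {'object_follows': 'parent'},
        'Ditransitive':    {'isa': 'Transitive',
                            'object_follows': 'indirect object'},
        'Passive/trans':   {'isa': 'Transitive',
                            'merge': ('subject', 'object')},
        'Passive/ditrans': {'isa': 'Ditransitive',
                            'merge': ('subject', 'indirect object')},
    }

    def inherited(cls, prop):
        # Walk up the isa chain until a value is found (default inheritance).
        while cls:
            if prop in CLASSES[cls]:
                return CLASSES[cls][prop]
            cls = CLASSES[cls].get('isa')
        return None

    print(inherited('Passive/ditrans', 'object_follows'))  # 'indirect object'
    print(inherited('Passive/trans', 'merge'))              # ('subject', 'object')
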
The full lexical structure of the object relationship must also include its
semantic properties. In line with the approach outlined above for indirect
objects, the semantic properties of the object are related to its syntax through a
specialization of the SSP. Biber et al. (1999: 126-8) also identify a range of
possible semantic relationships that correspond with the object relationship (see
a-g), and the lexical semantic representation of the object relationship should be
general over all of these.

a. affected (bake a potato)
b. resultant (bake a cake)
c. locative (swim the Ohio)
d. instrumental (kick your feet)
e. measure (weigh 100 tons)
f. cognate object (laugh a sincere laugh)
g. eventive (have a snooze)

Properties a, b and d can be quite straightforwardly collected under a general
treatment, in terms of their force-dynamic properties: in each case, the sense of
the verb has a result which is a further event having the referent of the object as
an argument (when you bake a potato, the potato becomes soft and edible;
when you bake a cake, the cake comes into existence; when you kick your feet,
the feet move). This is represented in Figure 8: a verb's object refers to the 'er'
of the result of the verb's sense. This two-stage relationship is further
represented in a direct relationship between the verb's sense and the referent of
its object, labelled 'ee'.

Figure 8 Affected/effected objects
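
This two-stage structure and its conflation can be walked through with 'bake a
potato'. The frame layout below is an assumed, simplified encoding (the
dictionary keys are mine): linking the object's referent fills the 'er' slot of
the result and the direct 'ee' shortcut at the same time:

    # 'bake a potato': the sense has a result event; the object's
    # referent is the 'er' of that result, and the same referent is
    # also linked directly to the sense as its 'ee'.
    baking = {'result': {'isa': 'Becoming soft and edible', 'er': None}}

    def link_object(sense_frame, referent):
        sense_frame['result']['er'] = referent  # two-stage route via the result
        sense_frame['ee'] = referent            # conflated direct 'ee' link

    link_object(baking, 'the potato')
    print(baking['ee'], '=', baking['result']['er'])  # the potato = the potato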

Notice that a similar conflation of a two-stage relationship into a direct one
was used above in the semantic structure of indirect objects. In fact, when a verb
has an indirect object, the recipient relationship overrides the 'ee' relationship
in being assigned to the 'er' of the result, in much the same way as the word
order properties of the indirect object override those of the object. This is
determined in the semantics by the nature of the resulting state: where this state
isa Being, its 'er' is the 'ee' of the verb's sense; where it isa Having, its 'er' is the
recipient of the verb's sense, rather than its 'ee', and the 'ee' of the verb's sense
is the same as the 'ee' of the result (see Figure 4 above).
'Locational' objects, as in c, do not refer to affected arguments, but to parts
of a path. The example in c defines the beginning and end of the path (on
opposite sides of the river), but other examples may profile the beginning (25a),
middle (25b) or end (25c) of the path.

(25) a. The express jumped the rails. (from Biber et al. (1999: 127))
b. Vinny vaulted the horse.
c. Elly entered the room.

The set of verbs that can appear with an object of this kind is (naturally) limited
to those that can refer to a motion event and in this sense the 'locative' object is
lexically selected by its parent. Notice also that the verb (often) determines
which part or parts of the path may be profiled by such an object. Because of
this, these arguments must appear in the lexical structures of quite specific
categories (at the level of the lexeme or just above). The relevant categories are
subsumed under Transitive, since the syntactic properties are the same as those
of the affected/effected objects, but it is arguable whether they need to be
collected under a category 'locative object verb'. This category is justified to the
extent that generalizations can be made over the relevant constructions.
There seems to be little semantically in common between locative objects
and affected/effected objects, though there is some relationship. For example,
Dowty's (1991) incremental theme is a property of both kinds of object: the
event in both cases is bounded by the theme:

(26) a. Barry baked a potato/*potatoes in five minutes.
b. Sammy swam the Ohio/*rivers in five minutes.

When the sense of the verb is an unbounded event, a measure expression can
be used to define a bounded path: Sammy swam five miles. It is not entirely clear
that arguments of this sort are indeed objects. Some certainly are not; in (27) the
object of pushed is the pea and not the measure expression.

(27) Evans pushed the pea five miles with his nose.

The 'measure' objects are also confined to a limited class of verbs, by which
they are semantically selected (weigh five tons, measure five furlongs). They also
have little in common semantically with the other types of object, since their
semantics is so heavily constrained by the verb.
'Cognate' objects (28, 29) are also associated with a very small class of verbs
(Levin gives 47 (1993: 95-6), out of a total of 3107 verbs). They have
something in common semantically with effected objects, but the semantics is
constrained by the verb, which may also go so far as to select a particular
lexeme. Levin notes:

Most verbs that take cognate objects do not take a wide range of objects. Often they
only permit a cognate object, although some verbs will take as object anything that is
a hyponym of a cognate object (1993: 96).

The verb and its object refer jointly to a performance of some kind.

(28) She sang a sweet song.
(29) Deirdre died a slow and painful death.

'Eventive' objects are confined to an even smaller class, the 'light' verbs. In
these cases the event structure is determined by the verb, but the details of the
semantics are supplied by the noun. In light verb constructions with HAVE, the
object refers to an event (have a bath/meal/billiards match); light DO, in contrast,
refers to an affective/effective event, the precise nature of which is determined
by the semantics of the (affected) object:

(30) a. I'll do the beds, ['dig them/make them up']
b. I'll do the potatoes, ['peel them']
c. I'll do the cake, ['bake it']

Figure 9 collects together the various semantic properties of objects. The
category Transitive is the same as appeared in Figure 7: it is the locus of the
syntactic properties of objects (these are represented schematically here):

• The majority of objects are subsumed under the affective/effective category.
In the diagram this is represented by the semantic concept Making.
• I show two subcategories. Making' (as in The cold made our lips blue) and
Creating are schematic for the senses of the affective object verbs and the
effective object verbs respectively.
• Making is schematic for all affective/effective events, and as such provides a
sense for 'light' DO (shown as DO/light in the diagram). 'Light' HAVE is
shown as a simple transitive verb that corefers with its object (the shared
referent being an event).
• The set of verbs taking 'locative' objects is represented by a class having
Moving' as its sense. This concept, which is a subcategory of ordinary
Moving, subsumes cases of moving with respect to some landmark. The
landmark appears in the semantic structure (labelled lm).
• The types of Moving' are classified here according to whether the landmark
is construed as the middle of a path (Passing), an obstacle (Traversing), an
end point (Entering) or a source (Leaving).
• Finally, the diagram shows that some nouns which are objects refer to
Measurements, and they define a property of the 'er' of their parent's sense.

The lexical structure given in Figure 9 integrates the syntactic properties
identified above (Figure 7) with the semantic properties of the various types of
object. Figure 9 is schematic for all 'transitive constructions' in that verbs with
objects inherit (some of) their properties from the category Transitive (usually
by way of one of the subclasses) and nouns that are objects inherit some of their
properties from the category that fills the relevant slot in the structure (perhaps also
by way of one of its subclasses: the diagram does not show inheritance relationships
between the object noun in the most general case and those in the subcases, but
these relationships are nevertheless implicit in the inheritance structure).

Figure 9 Semantic properties of objects

1.4 Subjects
Biber et al. (1999: 123-5) give the following syntactic properties for English
subjects:

a. found with all types of verbs
b. is characteristically an NP, but may be a nominal clause
c. is in nominative case (when a pronoun and in a finite clause)
d. characteristically precedes the VP, except in questions where it follows, except
where the subject is a Wh word itself
e. determines the form of present tense verbs (and of past tense BE)
f. may correspond to a by phrase in passive paraphrases

• Again, the first two syntactic properties concern the classes of words that
participate in the relationship: verbs have subjects, which are generally
nouns. Any verb may have a subject, so the class of 'subject verbs' is less
constrained than the class of transitive verbs. It is perhaps for this reason
that the semantic roles played by subjects are so much more diverse (see
below). All tensed verbs have subjects, so the class Tensed is shown as a
subset of the subject verbs. (See Figure 10.)
• The 'nominative' form of personal pronouns consists of the five words I,
SHE, HE, WE and THEY, which are subcases of the relevant pronouns
that are used only in subject position. (See Figure 10.)

Figure 10 Some syntactic properties of subjects

• The word order properties of subjects are slightly more complicated. Generally
the subject precedes its parent, but some subjects follow their parents and in
many of these cases the referent of the verb is questioned (the construction
forms a yes/no question); these cases are represented in the subclass of subject
verbs Inverted. The word order properties of Wh questions are determined in
part by the lexical properties of the category Wh (schematic over Wh words).
This category is always the extractee (x< in the diagram) of its parent and so
precedes it. Where the Wh word is not the subject of the verb, the verb and
subject are also inverted (the complement of Wh isa Inverted).

Figure 11 Word order properties of subjects



• Subject-verb agreement is a property of the categories participating in the
subject relationship. Present verbs (Present is a subcase of Tensed) must
have the same agreement value as their subjects. Those with the agreement
singular have a form consisting of their base plus an {s}. Notice that this
requires that the pronouns I and YOU have agreement plural (or have no
agreement value) (I like / she likes). Subject-verb agreement is dealt with at
length by Hudson (1999); a toy rendering is sketched after this list.

Figure 12 Subject-verb agreement

• The final syntactic property is more properly semantic in WG: just as there
is overlap between the semantics of the indirect object relationship and that
of the preposition TO, so there is considerable overlap between the
semantics of the subject relationship and that of the preposition BY.
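
The toy rendering promised above: present verbs copy their subject's agreement
value, and the value singular adds {s} to the base. The pronoun values follow
the text's treatment of I and YOU as plural; everything else is an illustrative
simplification:

    AGREEMENT = {'I': 'plural', 'you': 'plural', 'we': 'plural',
                 'they': 'plural', 'she': 'singular', 'he': 'singular',
                 'it': 'singular'}

    def present_form(base, subject_word):
        # A present verb shares its subject's agreement value; the
        # value singular makes its form base + {s}.
        return base + 's' if AGREEMENT[subject_word] == 'singular' else base

    assert present_form('like', 'I') == 'like'     # I like
    assert present_form('like', 'she') == 'likes'  # she likes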

The semantic properties of subjects are explored more fully in the following
section, but some general remarks can be made here. Biber et al. (1999: 123-5)
give the following possible semantic roles for subjects:

a. agent/willful initiator (She kicked a bottle cap at him)
b. external causer (The wind blew the plane off course)
c. instrument (Tactics can win you these games)
d. with stative verbs:
• recipient (I know it, She could smell petrol)
• source (You smell funny)
• positioner (She sat against a wall)
e. affected (It broke, An escapee drowned)
f. local (The first floor contains sculptures)
g. eventive (A post mortem examination will take place)
h. empty (It rained)

• The first three roles (a-c) can be collected together by virtue of the force-
dynamic properties they share: agents, causes and instruments all precede
the event in the force-dynamic chain.
• I argue below that affected subjects (e) are similarly controlled by the force-
dynamic structures of the verbs that take them.
• The semantic roles played by the subjects of stative verbs are chiefly
determined by the lexical (semantic) structure of the individual lexeme,
though some semantic classification is possible (see Figure 13).
• 'Local' and 'eventive' subjects are controlled by the lexical structures of the
verbs that take them.
• Since every verb can have a subject, the number of different semantic roles
open to the referents of subjects is limited only by the number of different
event types denoted by verbs. This can be seen particularly clearly in the
case of 'dummy' subjects (h).

Figure 13 Semantic properties of subjects



Figure 13 collects together the possible semantic roles associated with the
subject relationship, and relates them symbolically to the syntactic properties
identified above (given schematically in the diagram). The various semantic
types of subject are glossed by the 'er' relationship introduced above. A full
account of this relationship and of the 'ee' relationship linked with objects is
provided in the following section. Four kinds of stative predicate are shown,
covering the three possibilities under (d) and the 'local' subjects in (f). Some of
these semantic classes are dealt with in more detail in following chapters; each
makes different requirements of its 'er'. A class of 'eventive verbs' is also
included; these corefer with their subjects.

1.5 Three linking rules
On the basis of the above discussion we can construct general linking rules for
the three relationships subject, object and indirect object. These linking rules
link sets of syntactic properties, associated with the relevant dependency class,
with sets of semantic properties, associated with classes of semantic association.
The linking rule for subjects is given in Figure 14 and in prose in (31). The
rule pairs the syntactic relationship subject with the semantic relationship 'er'.
The former gathers together the syntactic properties of subjects (Figure 10-Figure
12) and the latter the semantic properties associated with them (Figure 13).

Figure 14 Subject linking rule

(31) A word's subject refers to the 'er' of its sense.

A linking rule for objects is given in Figure 15 and in (32). The rule pairs the
syntactic relationship object with the semantic relationship 'ee' (this is the
pattern given above for DO/light, which is followed by most transitive verbs).
The former gathers together the syntactic properties of objects (Figure 7) and
the latter the semantic properties associated with them (Figure 8).

Figure 15 Object linking rule

(32) A word's object refers to the 'ee' of its sense.

Finally, abstracting away from Figure 4 (and using 'beneficiary' as schematic
over recipients and beneficiaries) gives us the following linking rule for indirect
objects. This rule gathers together, in the two associations indirect object and
beneficiary respectively, the syntactic and semantic properties of indirect object
constructions, as identified above.

Figure 16 Indirect object linking rule

(33) A word's indirect object refers to the beneficiary of its sense.
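
Taken together, rules (31)-(33) amount to a small lookup from dependency
types to semantic arguments. A minimal sketch, in which the frame given for
Giving is an illustrative stand-in for the fuller semantic structures discussed
above:

    LINKING = {'subject': 'er', 'object': 'ee',
               'indirect object': 'beneficiary'}

    def interpret(dependents, sense_args):
        # Each dependent refers to the semantic argument its
        # dependency type is paired with.
        return {dep: sense_args[LINKING[dep]] for dep in dependents}

    giving = {'er': 'the giver', 'ee': 'the gift',
              'beneficiary': 'the recipient'}
    print(interpret(['subject', 'indirect object', 'object'], giving))
    # {'subject': 'the giver', 'indirect object': 'the recipient',
    #  'object': 'the gift'}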

Now, semantic relationships like recipient or beneficiary are quite
straightforwardly understood in terms of more complex semantic structures:
if a concept C has a result which is an example of Having, then that result's first
argument is the recipient of C. The relationships 'er' and 'ee', however, which
are linked to subject and object respectively, are less straightforward, so the
status of the subject and object linking rules is at least open to question. In the
second part of this chapter, I address the outstanding issues, providing more
detailed linking rules for subjects and objects.

2. The Event Type Hierarchy: The framework; event types; roles and relations

2.1 The framework
In the first part of this chapter I sketched a linking mechanism within the WG
framework, based on generalizations over grammatical relations (specializations
of the Syntax Semantics Principle). The details are fleshed out in this part.
The linking regularities presented above consist of symbolic structures which
link specific syntactic relationships (subject, object, indirect object, etc.) with
specific semantic relationships ('er', 'ee', recipient, etc.). The syntactic
relationships are identified by a set of word-level (syntactic, morphological,
phonological, etc.) properties which, by default, are inherited by all cases of the
dependency: unless otherwise specified, subjects precede their parents and
determine their form, objects follow and permit no intervening codependents,
and so on. The semantic relationships are identified by a set of concept-level
(thematic, force-dynamic, etc.) properties, which likewise constitute the default
model for the relationship. The syntactic and semantic properties taken
together constitute the lexical structure of the relevant relationship, and can be
seen as a gestalt.
As I argue above, semantic relationships like recipient and result are quite
straightforwardly understood in terms of more complex semantic structures.
The relationships 'er' and 'ee', however, which are linked to subject and object
respectively, are less straightforward. An account is provided here in which the
properties of 'ers' and 'ees' are defined by a hierarchy of event types (notice that
an event type (Having) played a role in the definition of result and recipient).
Since most of the event types are defined by a single exceptional argument
relationship, and since the linking regularities are still stated in terms of single
roles, the WG approach outlined here combines the properties of role-based
and class-based approaches. The linking regularities presented above are
generalizations over the linking properties of all subjects, objects, etc. While
each syntactic dependency always maps onto the same semantic argument, the
exact nature of the role played by that argument is determined by the wider
conceptual structure associated with the parent's sense (as represented partly by
its event type). The distinction between words and constructions is an emergent
property of the network structure.
The categories in the event type hierarchy are defined by their semantic
(conceptual) properties, including force-dynamic properties (but not including
aspectual properties: see Holmes (2005: 176-211)). Many of the event types
function as the senses of words, though some do not. The categories support a
number of associations (more at the more specific levels), including those
mentioned in the linking regularities. The roles of those arguments are defined
by the rest of the conceptual structure associated with the lexical category.

2.2 Event types
Figure 17 shows the event type hierarchy. The various types are shown, but
most of their properties are not (they are given in the following diagrams). The
category at the top of the hierarchy is labelled Predicate; this is not an entirely
satisfactory name for this concept, but it has the benefit of subsuming both
states and events. The names of the concepts in the hierarchy are intended to
be the senses of lexical words, and for this reason it is perhaps surprising that no
readily useable term exists for the highest category, though it might be argued
that this concept does not have much use as an element in the normal use of
language. The event type hierarchy should more properly be called the
predicate type hierarchy.

Figure 17 Predicate type hierarchy

Predicates are divided into states (State) and events (Event), the latter
consisting of a series of (more or less transient) states. The most general
category, Predicate, is shown with a single argument, labelled 'er', and this
association is inherited (implicitly) by the two subclasses. The states are divided
into Being and Having; the latter and some of the former have a second
argument, labelled 'ee'. Further properties of these categories are explored
shortly. The events include processes like Laughing and Yawning as well as
the further categories Becoming and Affecting. The first of these is telic (it has a
result which is a state); the second has an 'ee' as well as an 'er'. Affecting
includes transitive processes like Pushing and Beating ('hitting' not
'defeating') as well as the category Making which subsumes two further
categories, Creating, which is telic since its result is an example of Being (or
Existing), and Making', which is telic in that its result isa Becoming and the
result of this second event is a state.
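
The hierarchy and its role inheritance can be stated as data. The two
dictionaries below are my encoding of Figure 17, not an official notation; each
type contributes its own associations and inherits the rest up the isa chain:

    ISA = {'State': 'Predicate', 'Event': 'Predicate',
           'Being': 'State', 'Having': 'State',
           'Becoming': 'Event', 'Affecting': 'Event',
           'Making': 'Affecting', 'Creating': 'Making',
           "Making'": 'Making'}

    ROLES = {'Predicate': {'er'}, 'Having': {'ee'}, 'Affecting': {'ee'},
             'Becoming': {'result'}, 'Making': {'result'}}

    def roles(event_type):
        # Collect the associations a type supports, walking up the isa chain.
        found = set()
        while event_type:
            found |= ROLES.get(event_type, set())
            event_type = ISA.get(event_type)
        return found

    print(sorted(roles('Creating')))  # ['ee', 'er', 'result']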
Figure 18 shows in more detail the properties of the states. Being defines a
property of its 'er'. For example, Big functions as the size of its 'er' (Drunk is
also shown as an example of the way in which the semantic network represents
all aspects of meaning). Other subcases of Being include Feeling, which
subsumes psychological states (see Figure 19), and At, which subsumes
locations (see Figure 20). The inclusion of the traditional semantic roles theme
and actor pre-empts the discussion of the difference between Being and Having
in Figure 21 and of the relationship between the argument positions and
traditional semantic roles in the following section.

Figure 18 Hierarchy of states

Figure 19 shows in more detail the properties of Feeling. This category
subsumes one- and two-argument psychological states. In both cases the 'er'
must be sentient. One of each kind of state is shown as an example. A single
semantic relationship is shown for each; this stands for a fuller characterization
of the words' meanings which would include for example the relationship
between Happy and Smiling (the 'er' of Happy is often the 'er' of Smiling too).

Figure 19 Feeling

Figure 20 shows the properties of At, the category subsuming locations, and
the sense of AT. The 'ee' of At defines the place of its 'er', which is therefore
understood as the theme of a state defined by the 'ee'. For this reason, the 'ee' is
also shown as the Landmark (see Figure 20). Two subcases of At are shown, In
and On, the senses of the prepositions IN and ON respectively. These two differ
from At in that the place of the 'er' is not the same as the 'ee', but is rather the
same as the place of a part of the 'ee'. In the case of In, this part is the interior;
in the case of On, it is the surface. The diagram also shows that Containing and
Supporting are the converses of In and On respectively (if a is in b then b
contains a; if a is on b then b supports a). These facts are integral parts of the
meanings of the prepositions.

Figure 20 At

Figure 21 shows the properties of Having, the sense of HAVE. As I have
shown in Figure 18, the arguments of Having and those of Being have different
properties. In the case of Having the 'er' is also its actor and the 'ee' its theme (see
section 2.3); in the case of Being the 'er' is the theme, and the 'ee', if there is one,
is a landmark, or plays some other role (in the case of the psychological states it is
often called a stimulus). Figure 21 shows that Supporting and Containing are
subcases of Having (subsumed under a general category labelled 'Locating').
This explains why these categories assign their arguments in the opposite
way to the corresponding concepts On and In, which inherit their argument
structure from Being (by way of At). It may also help to explain the way in
which some languages use verbs corresponding to English BE and HAVE with
different sets of verbs in perfect constructions and perhaps also explain the
relationship between passive and perfect constructions even in English. This
possibility needs to be explored in future work.

Figure 21 Having

The correspondence between Being and Having also suggests an alternative
to the most usual analyses for verbs like GIVE and the indirect object (see
Holmes 2005: 46-54). It is often claimed that the more specific semantics of
indirect objects overrides the usual principle that the 'ee' of a causative event is
assigned as the 'er' of its result (the gift, which is the 'ee' of Giving, is the 'ee'
rather than the 'er' of the result, if this is to be a case of Having). However, it is
also possible that the result of Giving is instead a case of Being (more
specifically, it isa At), which would preserve the default arrangement. This
would also provide a means of describing the contrast between verbs like GIVE
and those like EQUIP (rare in English) that show the opposite linking
arrangement. This view is supported by the prepositions that are used with
these verbs. GIVE selects TO (in the absence of an indirect object), which in
other constructions refers to a path terminating in a location; EQUIP selects
WITH, which has Having as its sense.
This suggestion is sketched in Figure 22. The result of Giving isa At; its 'er'
(the thing located) is the 'ee' of Giving and its 'ee' (the location) is the recipient.
The result of Equipping isa Having; its 'er' (the possessor) is the 'ee' of Equipping
and its 'ee' is the 'equipment'.

Figure 22 Giving, Equipping
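
The contrast can also be restated as data: the type of the result state fixes
which argument of the verb's sense surfaces as the result's 'er'. The frames
below are an illustrative paraphrase of Figure 22, with my own labels:

    GIVING = {'result': {'isa': 'At',  # a location state, i.e. a Being
                         'er': "the 'ee' of Giving (the gift)",
                         'ee': 'the recipient (a location)'}}

    EQUIPPING = {'result': {'isa': 'Having',
                            'er': "the 'ee' of Equipping (the recipient)",
                            'ee': 'the equipment'}}

    # In both frames the default arrangement is preserved: the verb's
    # 'ee' turns up as the 'er' of the result; only the result's type,
    # At versus Having, differs.
    for verb, frame in (('GIVE', GIVING), ('EQUIP', EQUIPPING)):
        print(verb, '-> result isa', frame['result']['isa'])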


Figure 23 shows the properties of the non-states. Event inherits the 'er'
relationship from the Predicate category, and passes it down to the subclasses.
Becoming has additionally a result which is a state which shares its 'er'; the class
is telic and provides the semantic schema for unaccusative constructions. Dying
is shown as an example (see Figure 27). Affecting has additionally an 'ee', which
is a patient. Pushing is shown as an example (see Figure 25). Making represents
telic affective events (it has a result). Two subclasses of Making are shown.
Creating provides the model for effective constructions and Making' for
causative (affective) ones. In both cases it is the 'ee' that functions as the 'er' of
the result. Killing is shown as an example of Making' (see Figure 26).

Figure 23 Events

Figure 24 Yawning isa Event

Figure 25 Pushing isa Affecting

Figure 26 Killing isa Making'



Figure 27 Dying isa Becoming

2.3 Semantic roles and semantic relationships
I have given above a hierarchical classification of predicate types defined by
their properties (see Figure 17, Figure 18, Figure 23). Note that the senses of
particular words (not just verbs: prepositions and adjectives refer to events, as
do some nouns like DESTRUCTION, WEDDING, etc.) are arranged in the
same hierarchy since they simply instantiate the more general predicate types.
The properties of the predicate types determine the number and nature of the
semantic relationships associated with these senses and the linking of those
associations to syntactic dependencies; alternatively, the number and nature of
semantic associations and the linking of those associations determines the
position of the sense in the predicate type hierarchy.
In the first part I provided linking regularities that link more or less
schematic semantic associations with more or less schematic syntactic ones.
There I gave linking rules for subject, object and indirect object as well as the
more general Syntax Semantics Principle (SSP). The semantic associations
referred to in these rules are the same as those supported by the various
predicate types. In fact the linking rules themselves form part of this hierarchy,
appearing at the highest relevant level.
As noted above, semantic associations like recipient are fairly straightfor-
wardly characterized in terms of other semantic relationships (in terms of their
meanings) but 'er' and 'ee', the two relationships involved in subject and object
linking, are not. The 'ers' and 'ees' of particular events (or event classes) are
instantiations of the more general 'er' and 'ee' that appear in the linking
regularities (note that the 'er' of Predicate (Figure 23) is the most general one
there is, so this is the locus of the subject linking rule and all other 'ers' are
instantiations of this one). The properties of 'ers' and 'ees' of more specific
categories are determined at the appropriate level in the predicate type
hierarchy and it is here that the most semantic information is found.
In the preceding section I define the semantics of these relationships by
relating them to named thematic roles (actor, patient, theme, landmark),
but this begs the question in the absence of a fuller semantic definition of these
roles. Indeed, as discussed below, once the thematic roles have definitions, it
may no longer be necessary, or desirable, to keep the relationships agent,
theme etc. in lexical structure.3

A number of problems with thematic roles have been identified in the
literature. The most immediate practical difficulty is that different writers (and
even different works by the same writer) use the same terms with different
meanings; this is a particular problem for the terms Goal, Patient and Theme
(see below). But there is also the non-monotonicity of argument linking ((34)-
(36) are from Davis and Koenig (2000: 58)), which leads to proposals like
Jackendoff's (1990) hierarchical argument linking.

(34) a. Mary owns many books.
b. This book belongs to Mary.
(35) a. We missed the meaning of what he said.
b. The meaning of what he said escaped/eluded us.
(36) a. Oak trees plague/grace/dot the hillsides.
b. The hillsides boast/sport/feature oak trees.

A further, theoretical, difficulty (raised by Dowty 1991) is the open-ended
nature of the set of roles to be used. Goldberg considers this only an empirical
problem, since in principle the set of thematic roles need not be finite, the
nature of the roles being determined by the set of predicate types recognized in
the language:

[P]hrasal constructions that capture argument structure generalizations have
argument roles associated with them; these often correspond roughly to traditional
thematic roles... At the same time, because they are defined in terms of the
semantic requirements of particular constructions, argument roles in this framework
are more specific and numerous than traditional thematic roles. (2002: 342)

Since the semantic relationships supported by the senses of words instantiate
(isa) those of more general categories, the senses of different words (or
constructions) may elaborate the more general models in different ways, so that
the set of thematic roles at the more specific levels can be very large indeed. In
Figure 23, I used the thematic role actor as schematic over the first arguments
of all non-states (including processes (37) and causative (38) and unaccusative
(39) events).

(37) a. The flag fluttered in the breeze.
b. The tourist yawned.
c. The flag distracted the tourist.
d. Perry pushed a pea with his nose.
(38) a. Perry pushed a pea to Peterborough.
b. The flag angered the tourist.
c. The judges made a cake.
d. Perry opened a bottle.
(39) a. The pea vanished.
b. The ice melted.
c. The band disbanded.

Trask defines actor as 'that argument NP exercising the highest degree of
independent action in the clause' (1993: 6), noting that this is a simple
extension of the category agent to fit other kinds of subject-linked arguments.
This extension covers verbs referring to changes undergone by their single
argument (unaccusative verbs), whose arguments therefore may have few or no
agentive properties (note, however, that some are agents (39c)). In fact, the actors
of other one- or two-argument events are also not agents ((37a), (37c), (38b)).
Agency is a property of some actors, determined by the thematic properties
of the event, so the thematic role agent ('the semantic role borne by an NP
which is perceived as the conscious instigator of an action' (ibid.: 11)) is not called
for. Actor, then, corresponds roughly to Dowty's (1991) proto-agent: it is
defined by properties like volitional involvement, causal instigation etc., but not
all cases share all these properties. Dowty's proto-agent wills the event, is
sentient, causes an event or change of state, moves and has independent
existence; the WG treatment presented here accepts all of these but the fourth,
movement.
Patient ('the semantic role borne by an NP which expresses the entity
undergoing an action' Trask (1993: 202)) is schematic over the second
argument of transitive events. Affecting, which is the most general such event,
subsumes processes (like pushing a pea or patting a dog) and causative events
(like pushing a pea to Peterborough or angering a tourist). The patient is the
affected (or effected) argument, even in some of the transitive processes.
Processes have a temporal profile that consists of a set of repeated events.
These events may themselves be causative (Pushing consists of a set of repeated
causative actions on an object), though they may also be states (Patting consists
of a set of repeated locative states) in which case the patient is the theme of the
state (see below).
Dowty's (1991) proto-patient undergoes a change of state, is an incremental
theme, is causally affected by another participant, does not move and does not
have independent existence. Again, the WG analysis accepts all these but the
fourth, concerning movement. The incremental theme is a product of the
aspectual structure of affective events (see Holmes 2005).
States have themes, and some have actors. Actors of states share the
properties of those of non-states. The theme is the argument that the state is
predicated of (theme is also used with similar meaning as the name of a
discourse function, where it contrasts with rheme, as topic does with comment).
Trask gives 'an entity which is in a state or a location or which is undergoing
motion' (1993: 278), a definition which subsumes some patients, as defined
above; Trask also notes that the terms theme and patient are used more or less
interchangeably. However, in the current framework the two are separate:
patients undergo some affective/effective process or change; themes have some
stable property. Locative states also have a landmark: the argument whose
position defines that of the theme.
The above definitions of the thematic roles are given in terms of semantic
properties. For example, an actor wills the event, is sentient, causes an event or
change of state and has independent existence. These semantic properties of
actor are shown in Figure 28.

Figure 28 Actor

In the linking framework outlined here, syntactic associations are linked to
semantic ones in a regular way (subjects refer to 'ers', objects to 'ees', indirect
objects to beneficiaries, etc. ), and those semantic associations are defined by
(structural) semantic properties. The relationships 'er' and 'ee' are defined by
the categories of the predicate type hierarchy, and linked there to the various
properties of actors, patients, themes and landmarks, like those in Figure 28. It
is an empirical question whether it is necessary to keep hold of the relationships
actor, patient, etc.: theoretically the 'ers' of non-states could simply be linked
directly to the structure shown in Figure 28 without the mediation of the actor
relationship.
The contrast between Having and Being (the former has an actor-er and a
theme-ee, the latter a theme-er and in some cases a landmark-ee, see Figure 18)
demonstrates that 'er' and 'ee' are distinct from the thematic roles. This
separation of properties is found in other frameworks also. For example, in
Goldberg's (2002) Construction Grammar the lexical structures of grammatical
constructions are separated from those of specific words. Semantic relation-
ships like Actor, Theme, etc. (participant roles), which are supported by
the senses of words, instantiate the argument roles of phrasal constructions
(these correspond to my 'er' and 'ee'), which are therefore schematic over
them. The separation, in lexical structure and in the structures of sentences
(constructs), of the two argument structures allows different verbs to elaborate
different constructions differently: the argument structure of the construction
may add or take away participant roles from the verb, or vice versa.
The WG framework, however, represents the distinction differently: rather
than being properties of two different kinds of elements, the participant roles
and the argument roles are simply different kinds of association supported by
the same elements (events). In Holmes (2005) I show this property of the WG
framework to be crucial in the treatment of specific examples, since it becomes
clear there that both words and constructions may select both argument and
participant roles.
Since the participant roles are defined in terms of sets of default properties,
it is possible for more than one argument of a verb's sense to fit the bill for one
or other participant role. This is the case for the verbs SPRAY and LOAD. As
is well known, these two verbs can be used with objects referring to a thing or
substance moved or to the place it is moved to. These two possibilities reflect
two ways of interpreting the roles of the participants (of choosing which
participant best fits the patient model, and is therefore linked to 'ee' and thence
to object).
In these cases the lexical properties of the syntactic relationship (here object)
can be added to those of the verb. Where the two are not in conflict, they are
simply merged. For example, since LOAD does not select either of its non-
subject arguments as an incremental theme, this property is assigned to the
object-linked argument by the semantics of the 'ee' relationship (40) (the
mechanics of this example are discussed in Holmes 2005: 206ff).

(40) a. Larry loaded *(the) lorries with (the) lollies in 2 hours.
b. Larry loaded *(the) lollies on (the) lorries in 2 hours.

When there is a conflict between the lexical properties of the construction and
those of the verb, the construct is (usually) rendered incoherent. The two
examples in (41) are unacceptable because the lexical structure of POUR
specifies that the 'ee' of its sense is a liquid (that is how the manner of pouring is
defined) and that of COVER specifies that the 'ee' of the sense ends up
underneath something. These two requirements clash with the semantics of the
construction.

(41) a. *Polly poured the pot with water.
b. *Corrie covered the quilt over the baby.
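
The clash in (41) can be modelled as a failed unification of two property sets.
This is a minimal sketch; the role values are illustrative stand-ins for the
richer semantic structures described in the text:

    def unify(verb_frame, construction_frame):
        # Merge the construction's requirements with the verb's; any
        # conflicting value renders the construct incoherent (None).
        merged = dict(construction_frame)
        for role, value in verb_frame.items():
            if role in merged and merged[role] != value:
                return None
            merged[role] = value
        return merged

    POUR = {'ee': 'a liquid'}                  # the manner of pouring requires this
    with_construction = {'ee': 'a container'}  # what the construction demands
    print(unify(POUR, with_construction))      # None: *Polly poured the pot with water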

3. Conclusion
In the first part of this chapter I sketched the linking mechanisms of WG.
Syntactic and semantic associative relationships participate in symbolic
relationships: syntactic dependencies have meanings, which serve to determine
the interpretations of compositional structures, as well as to constrain the
possibilities for composition. Just as the (default) properties of syntactic
associations are given in terms of a network of related concepts and properties
surrounding the dependency class they instantiate, so are the (default)
properties of semantic associations.
In the second part I distinguished two kinds of semantic association:
participant roles, which carry thematic content; and argument roles, which are
determined by the force-dynamic properties of the event class.

References
Biber, Douglas, Johansson, Stig, Leech, Geoffrey, Conrad, Susan and Finegan, Edward
(1999), Longman Grammar of Spoken and Written English. Harlow, Essex: Longman.
Bresnan, Joan W. (1982), The Mental Representation of Grammatical Relations.
Cambridge, MA: MIT Press.
Chomsky, Noam (1970), 'Remarks on nominalization', in Roderick A. Jacobs and
Peter S. Rosenbaum (eds), Readings in English Transformational Grammar. Waltham
MA: Ginn and Company, pp. 184-221.
— (1981), Lectures on Government and Binding. Dordrecht: Foris.
Copestake, Ann and Briscoe, Ted (1996), 'Semi-productive polysemy and sense
extension', in James Pustejovsky and Branimir Boguraev, Lexical Semantics: the
Problem of Polysemy. Oxford: Clarendon Press, pp. 15-68.
Croft, William (1990), 'Possible verbs and the structure of events', in Savas L.
Tsohatzidis (ed.), Meanings and Prototypes: Studies in Linguistic Categorization.
London: Routledge, pp. 48-73.
Cruse, David A. (1986), Lexical Semantics. Cambridge: Cambridge University Press.
Davis, Anthony R. and Koenig, Jean-Pierre (2000), 'Linking as constraints on word
classes in a hierarchical lexicon'. Language, 76, 56-91.
Dowty, David R. (1991), 'Thematic proto-roles and argument selection'. Language, 67,
547-619.
Goldberg, Adele E. (1995), Constructions: a Construction Grammar Approach to Argument
Structure. Chicago: University of Chicago Press.
— (2002), 'Surface generalizations'. Cognitive Linguistics, 13, 327-56.
Holmes, Jasper W. (2005), 'Lexical Properties of English Verbs' (Unpublished doctoral
dissertation, University of London).
Hudson, Richard A. (1984), Word Grammar. Oxford: Blackwell.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1992), 'So-called "double objects" and grammatical relations'. Language, 68, 251-76.
— (1994), 'Word Grammar', in Ronald Asher (ed.), The Encyclopedia of Language and
Linguistics. Oxford: Pergamon Press, pp. 4990-93.
— (1999), 'Subject-verb agreement in English'. English Language and Linguistics, 3, 173-207.
— (2004, July 1 - last update), 'Word Grammar', [Word Grammar]. Available:
www.phon.ucl.ac.uk/home/dick/wg.htm (Accessed: 18 April 2005).
Jackendoff, Ray S. (1990), Semantic Structures. Cambridge, MA: MIT Press.
Lemmens, Maarten (1998), Lexical Perspectives on Transitivity and Ergativity: Causative
Constructions in English. Amsterdam: J. Benjamins.
Levin, Beth (1993), English Verb Classes and Alternations: a Preliminary Investigation.
Chicago: University of Chicago Press.
Levin, Beth and Rappaport Hovav, Malka (1995), Unaccusativity: at the Syntax-Lexical
Semantics Interface. Cambridge, MA: MIT Press.
Pustejovsky, James (1995), The Generative Lexicon. Cambridge, MA: MIT Press.
— (2001), 'Type construction and the logic of concepts', in Pierette Bouillon and
Federica Busa (eds), The Language of Word Meaning. Cambridge: Cambridge
University Press, pp. 91-123.
Pustejovsky, James and Branimir Boguraev (1996), 'Introduction: lexical semantics in
context', in James Pustejovsky and Branimir Boguraev, Lexical Semantics: the
Problem of Polysemy. Oxford: Clarendon Press, pp. 1-14.
Shibatani, Masayoshi (1996), 'Applicatives and benefactives: a cognitive account', in
Masayoshi Shibatani and Sandra A. Thompson (eds), Grammatical Functions: their
Form and Meaning. Oxford: Clarendon Press, pp. 157-94.

Trask, Robert L. (1993), A Dictionary of Grammatical Terms in Linguistics. London:
Routledge.
Trier, Jost (1931), Der Deutsche Wortschatz im Sinnbezirk des Verstandes. Von den
Anfängen bis zum 13. Jahrhundert. Heidelberg: Winter.
Weisgerber, Leo (1927), 'Die Bedeutungslehre - ein Irrweg der Sprachwissenschaft'.
Germanisch-Romanische Monatsschrift, 15, 161-83.
Williams, Edwin (1991), 'Meaning categories of NPs and Ss'. Linguistic Inquiry, 22,
584-7.

Notes
1 The SSP given here turns out not to be able to account for all cases of linking. It is
revised in Holmes (2005: 44).
2 Note that in Biber et al. the VP category subsumes the 'verbal complex' (main verb
and any auxiliaries), but not any complements or other postdependents of the verb.
3 Of course, if a particular speaker knows the words ACTOR and THEME (as
metalinguistic terms), then they must have these relationships in their lexicon, since
they are (or should be!) the meanings of the relevant terms.
6 Word Grammar and Syntactic Code-Mixing
Research

EVA EPPLER

Abstract
This chapter aims to show that WG is preferable to other linguistic theories
for the study of bilingual speech. Constituent-based models have difficulties
accounting for intra-sentential code-mixing because the notions of government
and functional categories are too powerful and rule out naturally occurring
examples. Properties of WG which make this syntactic theory particularly well
suited for code-mixing research are the central role of the word, the dependency
analysis, and several consequences of the view of language as a network which is
integrated with the rest of cognition. A qualitative and quantitative analysis of
because and weil clauses shows that code-mixing patterns can be studied
productively in WG.

1. Introduction
Intra-sententially CODE-MIXED data, i.e. utterances constructed from words
from more than one language, pose an interesting problem for syntactic
research as two grammars interact in one utterance. Based on a German/
English bilingual corpus,1 I will show in section 2 of this chapter that constraints
on code-switching formulated within Phrase Structure Grammar frameworks
(Government and Binding, Principles and Parameters, Minimalism) are too
restrictive in that they rule out naturally occurring examples of mixing.
In section 3 I will discuss aspects of WG that make it particularly well suited
for the syntactic analysis of intra-sententially mixed data. WG facilitates the full
syntactic analysis of sizeable corpora and allows us to formulate hypotheses on
code-switching which can subsequently be tested on data. All findings are
supported by quantitative data.
As the word order contrast between German and English is most marked in
subordinate clauses, I focus on examples of this construction type in section 4. I
will show that code-mixing patterns can be studied productively in terms of
WG: WG rules determining the word order in German/English mixed clauses
hold in relation to my corpus and are supported by evidence from other
corpora. The main section of this chapter focuses on because and weil clauses. A
comparison of the mixed and monolingual clauses reveals that German/English
bilinguals who engage in code-mixing recognize and utilize structural
congruence at the syntax-pragmatics interface. They predominantly mix in a
construction type in which the word order contrast between German (SOV)
and English (SVO) is neutralized.

2. Constituent Structure Grammar Approaches to Intra-Sentential Code-Mixing
The question underlying grammatical code-switching research is whether there
are syntactic constraints on code-mixing. Some of the hypotheses on intra-
sentential code-switching have been formulated in informal frameworks of
traditional grammatical notions; others are derived from assumptions under-
lying specific modern syntactic theories. In this section I will review the main
phrase structure grammar approaches to code-mixing and show that the
constraints formulated within them do not account for the data.
DiSciullo, Muysken and Singh (1986) propose to constrain code-switching
by government, the traditional assumption behind X-bar theory. They initially
used the Chomsky (1981: 164) formulation of government: 'α governs γ in [β
... γ ... α ... γ ... ], where α = X, and α and γ are part of the same maximal
projection'. The X-bar assumption that syntactic constituents are endocentric is
important for the formulation and working of the government constraint.
Heads not only project their syntactic features onto the constituent they govern,
but also their language index. The language index is assumed to be something
specified in the lexicon (DiSciullo et al. 1986: 6), since the lexicon is a language-
specific collection of elements. For code-switching purposes the Government
Constraint was formalized in DiSciullo et al. (1986: 6) as *[Xp Yq], where X
governs Y, and p and q are language indices. The nodes in a tree must
dominate elements drawn from the same language when there is a government
relation holding between them.
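
Stated as a predicate over single dependency pairs, the constraint is easy to
test against corpus examples. In this toy sketch the word records and their
'lang' field are my stand-in for the lexicon's language index:

    def violates_government(governor, governed):
        # *[Xp Yq]: a governed word must carry its governor's language index.
        return governor['lang'] != governed['lang']

    # cf. example (3) below: German 'gedacht' governing an English clausal
    # complement, a naturally occurring mix the constraint wrongly bans.
    gedacht = {'form': 'gedacht', 'lang': 'de'}
    is_en = {'form': 'is', 'lang': 'en'}
    print(violates_government(gedacht, is_en))  # True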
The Government Constraint predicts that ungoverned elements, such as
discourse markers, tags, exclamations, interjections and many adverbs, can
easily be switched. This prediction is also supported by my data (see also
Eppler 1999) and most other bilingual corpora. However, the Government
Constraint also predicts that switches between verbs and their objects and/or
clausal complements, and switches between prepositions and their NP
complements, are ungrammatical. Examples violating these predictions from
my corpus are:

(1) *TRU: so [/] so you have eine Übersicht. Jen2.cha, line 133
an overview
(2) *DOR: I wonder, wem sie nachgradt. Jen2.cha, line 1531
whom she takes after

or in the other direction, i.e. from a German verb to an English clausal
complement:

(3) *MEL: ich hab(e) gedacht there is going to be a fight. Jen1.cha, line 987
I have thought

(4) *TRU: der hat über faith + healing gesprochen. Jen2.cha, line 2383
he has about spoken

The original inclusion of functional categories in the class of governors ruled
out code-switches which are also documented in my data, e.g. between
complementizers and clauses that depend on them, as in (5):

(5) TRU: to buy yourself in means that +... Jen1.cha, lines 977ff
DOR: du kannst dich nochmal einkaufen.
you can yourself once more buy in

and the domain of government was too large. The above formulation of the
government constraint includes the whole maximal projection and thus, for
example, bans switching between verbs and location adverbs, again contrary to
the evidence. Therefore a limited definition of government, involving only the
immediate domain of the lexical head, including its complements but not its
modifiers/adjuncts, was adopted and the Government Constraint was re-
phrased (Muysken 1989) in terms of L-marking:

*[Xp Yq], where X L-marks Y, and p and q are language indices

Muysken and collaborators thus shifted from an early and quite general
definition of government to the more limited definition of L-marking in their
formulation of the Government Constraint. L-marking restricts government to
the relation between a lexical head and its immediate complements. Even the
modified version of the government constraint in terms of L-marking is
empirically not borne out, as we see from the following example:

(6) TRU: das ist painful. Jen3.cha, line 1879
this is

Muysken (2000: 25) identifies two main reasons why the Government
Constraint, even in its revised form, is inadequate. The main reason is that
CATEGORIAL EQUIVALENCE2 undoes the effect of the government restriction.
The Government Constraint is furthermore assumed to insufficiently acknowl-
edge the crucial role functional categories are supposed to play in code-mixing.
Functional categories feature prominently in several approaches to code-
mixing. Joshi, for example, proposes that 'Closed class items (e.g. determiners,
quantifiers, prepositions, possessive, Aux, Tense, helping verbs, etc.) cannot be
switched' (1985: 194). Myers-Scotton and Jake (1995: 983) assume that in
mixed constituents, all SYSTEM MORPHEMES3 that have grammatical relations
external to their head constituent (i.e. participate in the sentence's thematic
grid) will come from the language that sets the grammatical frame in the unit of
analysis (CP). And Belazi, Rubin and Toribio (1994) propose the Functional
Head Constraint.
Their model is embedded in the principles and parameters approach.
Belazi, Rubin and Toribio (1994) propose to restrict code-mixing by the
feature-checking process of f-selection. In Belazi, Rubin and Toribio's model,
language is a feature4 of FUNCTIONAL heads that needs checking like all other
features. The Functional Head Constraint (Belazi, Rubin and Toribio (1994:
228)) is formulated as follows:

The language feature of the complement F-selected by a functional head, like all
other relevant features, must match the corresponding feature of that functional
head.

Code switching between a lexical head and its complement proceeds
unimpeded in this model.
Because many inflectional morphemes were treated as independent
functional heads in the principles and parameters approach, Belazi, Rubin
and Toribio (1994) subsume the FREE MORPHEME CONSTRAINT5 (Sankoff
and Poplack 1981) under their functional head constraint: switching is
disallowed between an inflectional morpheme and a word-stem. A counter-
example to this restriction from my corpus would be:

(7) *DOR: wir suffer-n da alle. Jen2.cha, line 904
we suffer-INFL MP all

Like all researchers working on Spanish/English and Arabic/French code-
mixing, Belazi, Rubin and Toribio (1994) have to deal with the different
placement of adjectives pre- or post-modifying nouns in the language pairs they
are working on. Their data indicate that switching is possible when the
adjectives and nouns obey the grammars of the languages from which they are
drawn. This leads them to supplement the Functional Head Constraint with the
WORD-GRAMMAR INTEGRITY COROLLARY,6 which states that 'a word of
language X, with grammar G, must obey grammar G' (Belazi, Rubin and
Toribio 1994: 232).
Like the Government Constraint, the Functional Head Constraint rules out
switches between complementizers and their clausal complements. Therefore
example (5) provides counter-evidence to this constraint. It also rules out
switches between infinitival to and its verbal complement, examples of which
are also attested in my corpus.

(8) *LIL: you don't need to wegwerfen. Jen2.cha, line 2555
throw away

The Functional Head Constraint furthermore rules out switches between
determiners (including quantifiers and numerals) and nouns. As nouns are the
most frequently borrowed or switched word class, counterexamples abound in
my and many other corpora.
MacSwan (1999: 188), working within the minimalist framework, also
assumes that code-switching within a PF component is not possible. This PF
Disjunction Theorem amounts to the same effect as the Free Morpheme
Constraint (Sankoff and Poplack 1981) and the various restrictions on switching
between stems and morphologically bound inflectional material. Examples (7)
and (9) are therefore clear violations of the PF Disjunction Theorem.

(9) *DOR: sie haben einfach nicht ge#bother-ed. Ibron.cha, lines 1012, 14
they have simply not

The minimalist framework he is working in forces MacSwan (1999) to preserve
constituent structure, but he acknowledges the advantages of a system of
lexicalized parameters for the analysis of code-switching.
In this section I reviewed approaches to code-mixed data that crucially
depend on constituency structure/maximal projections (DiSciullo, Muysken
and Singh 1986) and functional categories (Belazi, Rubin and Toribio 1994). I
showed that these constraints and models are too restrictive in that they rule out
naturally occurring examples of intra-sentential code-mixing. The 'government
constraints' (DiSciullo, Muysken and Singh 1986; Muysken 1989) were found
to be too restrictive when tested against natural language data because the
government domain was too large. Models, approaches and constraints based
on functional categories (Joshi 1985; Myers-Scotton 1993; Belazi, Rubin and
Toribio 1994) fall short of accounting for the data available and are
unsatisfactory because none of the definitions of functional categories that
have been offered (in terms of function words, closed class items, system
morphemes or non-thematicity) work. They either define fuzzy categories
where a sharp distinction would be needed, or they conflict with the data.
Complementizers and determiners, the two most commonly quoted examples
of functional categories, provide most of the counterexamples.
For these reasons a syntactic theory that rejects constituency structure and
does not recognize functional categories (Hudson 2000) seems an interesting
and promising option to explore. In the next section I will review other
characteristics of WG which are perceived to make this theory of sentence
structure more suitable for the analysis of (monolingual and) code-mixed data
than other theories.

3. A Word Grammar Approach to Code-Mixing


The main reason why I chose WG for the syntactic analysis of my data is
because this theory of sentence structure takes the word as a central unit of
analysis. In WG, syntactic structures are analysed in terms of dependency
relations between single words,7 a parent and a dependent. Phrases are defined
by dependency structures which consist of a word plus the phrases rooted in
any of its dependents. In other words, WG syntax does not use phrase structure
in describing sentence structure, because everything that needs to be said about
sentence structure can be formulated in terms of dependencies between single
words. For intra-sententially switched data this is seen as an advantage over
other syntactic theories because each parent only determines the properties of
its immediate dependent. Language-specific requirements are thus satisfied if
the particular pair of words, i.e. the parent and the dependent, satisfies them. A
word's requirements do not project to larger units like maximal projections/phrasal
constituents. If we want to formulate constraints on code-switching


within WG, they have to be formulated for individual types of dependency
relations. Because they do not affect larger units, they are less likely to be too
restrictive than constraints affecting whole phrasal constituents. One of the main
problems of constituency-based models, i.e. over-generalization through
phenomena like government chains, therefore cannot occur in a WG
approach to code-mixing.
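The point can be made concrete with a small sketch. The following fragment is my own illustration, not part of the WG formalism; the names Word, Dependency and is_switched are invented. It represents each word with a language tag and links words pairwise, so that every dependency can be checked individually rather than at the level of a phrasal constituent:

```python
# Illustrative sketch only: words carry a language tag and are linked
# pairwise by dependencies that can be inspected one at a time.
from dataclasses import dataclass

@dataclass
class Word:
    form: str
    lang: str          # 'E' (English) or 'G' (German)
    word_class: str    # e.g. 'verb', 'subordinator'

@dataclass
class Dependency:
    parent: Word
    dependent: Word
    relation: str      # e.g. 'complement', 'subject'

def is_switched(dep: Dependency) -> bool:
    """A dependency is code-switched if its two words differ in language."""
    return dep.parent.lang != dep.dependent.lang

# 'I forgot, dass wir ... angefangen haben' (cf. example 13 below)
forgot = Word('forgot', 'E', 'verb')
dass = Word('dass', 'G', 'subordinator')
haben = Word('haben', 'G', 'verb')

for dep in (Dependency(forgot, dass, 'complement'),
            Dependency(dass, haben, 'complement')):
    print(dep.relation, 'switched' if is_switched(dep) else 'monolingual')
```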
The central role of the word in WG moreover means that words are not
only the largest units of WG syntax, but also the smallest. In contrast with
Chomskyan linguistics, syntactic structures do not, and cannot, separate stems
and inflections. Furthermore, at least as far as overt words are concerned, WG
rejects the notion of functional category. Hudson (2000) shows that this notion
is problematic, because it has never been defined coherently and because all
the individual categories that have been given as examples (e.g. complementizers)
raise serious problems. For the same reasons, constraints on intra-sentential
code-switching based on functional categories (Joshi 1985; Belazi,
Rubin and Toribio 1994) and models of code-switching that crucially depend
on the distinction between system and content morphemes (Myers-Scotton
1993) run into serious empirical difficulties (see section 2). Because WG is an
example of 'morphology-free syntax' (Zwicky 1992: 354) which rejects the
notion of functional categories, a WG approach to intra-sentential code-
switching cannot over-emphasize the role of inflectional morphemes.
Words being the only and central unit of analysis in Word Grammar
furthermore benefits code-mixing research in a purely pragmatic way. The
majority of research in this area is based on sizable natural language corpora.
Because the only units that need to be processed in WG are individual words
and larger units are built by dependency relations between two words which can
be looked at individually, a WG approach to intra-sentential code-mixing
requires less analysis than constituency-based models. This facilitates the analysis
of large-ish corpora. Eppler (2004), for example, is based on a WG analysis of a
22,000-word corpus.
For intra-sententially mixed sentences a dependency analysis is furthermore
seen as an advantage over phrase structure grammar frameworks because it
highlights the functional relations between words (from the same or different
languages) rather than code-switch points. Constituency-based models describe
and/or constrain intra-sentential code-switching by disallowing switches
between, for example, PP and NP (see section 2). A WG analysis would
note a switched complement relation, which is grammatical if the preposition
and the determiner/(pro-)noun involved in it satisfy the constraints imposed on
them by their own language. To start to understand what is going on in intra-
sentential code-switching, it seems more beneficial to gain an insight into which
syntactic relations are frequently or rarely switched, rather than to increase our
knowledge about points in sentences where switching does not occur.
Another characteristic of WG is that dependency analyses have a totally flat
structure. A single, completely surface structure analysis (with extra
dependencies being drawn below the sentence-words) is seen as benefiting
WG over other theories of language structure for code-mixing research: linguists working on code-mixing during times when Chomskyan frameworks
still stressed the difference between surface and deep structure did not know
what to do with D-structure, because code-switching clearly seems to be a
surface structure phenomenon. Romaine (1989: 145) concludes her discussion
of the government constraint with the statement 'data such as these [code-
mixing data] have no bearing on abstract principles such as government [... ]
because code-switching sites are properties of S-structure, which are not base
generated and therefore not determined by X-bar theory'. This problem does
not emerge when one works with WG because of its totally flat, i. e. surface,
analysis. A syntactic theory that shares properties of the linguistic phenomenon
under investigation appears to be preferable to other syntactic theories; i. e. for a
surface-structure phenomenon like code-mixing, a syntactic model that allows a
single, completely surface analysis seems to be well suited.
Other aspects of WG which make this theory of sentence structure more
suitable for the analysis of code-mixed data than other theories are derived
from the WG view of language as a network which contains both the grammar
and the lexicon and which integrates language with the rest of cognition.
This cognitive view of language as a labelled network has consequences for a
controversial issue in psycholinguistic bilingualism research: the lexicons debate,
i. e. whether bilinguals' lexical items/lemmas are stored in one or two lexicons.
The network idea offers the advantage of viewing a bilingual's two languages as
sub-networks, with denser links between lexical items from the same language
and looser connections between lexical items from different languages.
This view of the bilingual lexicon(s) in combination with the multiple default
inheritance system which WG operates on could possibly have enormous
benefits for writing a psycholinguistically realistic grammar of a bilingual. The
following exploration is just a sketchy idea as to how this could work and
requires fleshing out, but the basic idea seems to work. Default inheritance
allows us to build a maximally efficient system for bilinguals by locating the
shared properties of words which 'belong' to different languages higher up the
is-a hierarchy and the language specific properties lower down in this hierarchy.
English come and German kommen, for example, are both verbs (is-a verb). They
therefore share certain characteristics: they have a similar meaning ('move
towards'), they both have tense (present or past), they have a subject and the
subject tends to be a pre-dependent noun, etc. All these generalizable facts
about German and English verbs can be located fairly high up in the is-a
hierarchy. The features in which our two example words differ, for example
that they have a different form (/kɔmən/ and /kʌm/ respectively), and that
German kommen, when it is the complement of a subordinating conjunction or
an auxiliary/modal, would be placed in clause final position, would be lower in
the is-a hierarchy. Because of the way default inheritance works, characteristics
of a general category are 'inherited' by instances of that category only if they are
not overridden by a more specific (e. g. language-specific) characteristic. A fact
located lower down in the inheritance hierarchy of entities or relations takes
priority over one located above it. Thus we could maximize the bilingual system
by allowing generalization by default inheritance and ensure that the language-specific properties would automatically override the general pattern. For
bilinguals this system would have the advantage that the grammatical system of a
Castilian/Catalan bilingual, for example, would have fewer overriding/blocking
language-specific properties listed than that of a German/English bilingual.8
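A minimal sketch of this inheritance idea, using nothing more than ordinary attribute lookup; the class and attribute names are invented for illustration and do not reproduce a published WG formalization:

```python
# Facts shared by English and German verbs sit high in the is-a hierarchy;
# language-specific facts lower down override them by default inheritance.
class Verb:
    has_tense = True
    subject_position = 'pre-dependent'
    verb_position = 'after subject'              # the shared default

class Come(Verb):
    stem = 'come'                                # /kʌm/; inherits all defaults

class Kommen(Verb):
    stem = 'kommen'                              # /kɔmən/

class KommenAfterSubordinator(Kommen):
    # a more specific, language-specific fact overrides the inherited default
    verb_position = 'clause-final'

print(Come().verb_position)                      # 'after subject' (default)
print(KommenAfterSubordinator().verb_position)   # 'clause-final' (override)
```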
WG furthermore aims at integrating all aspects of language into a single
theory which is also compatible with what is known about general cognition;
that is, language can be analyzed and explained in the same way as other kinds
of knowledge or behaviour. For example, it is widely acknowledged that code-
mixing is influenced by social and psychological factors (Muysken 2000) and a
syntactic model that allows us to incorporate this kind of information is better
suited to describe language contact phenomena than theories that deal
exclusively with language. Knowledge of more than one language, and the use
of more than one language in one sentence, can be analyzed and explained in
the same way as knowledge of one language and monolingual language use. In
other words, code-mixing is not seen as 'deviant'. Because WG aims to explain
and analyze language in the same way as other kinds of social and psychological
knowledge or behaviour, it is perceived to be more suitable for research into
bilingualism than other models of syntax.
The WG view of language as a network of associations which is closely
integrated with the rest of our knowledge lends itself particularly well to code-
mixing research for another reason. It is a well accepted fact in this research
paradigm that adult bilinguals know, first of all, which language the words they
use belong to. Second, they know when to code-switch and when not to (code-
switching as a MARKED or UNMARKED choice,9 for example, Myers-Scotton and
Jake 1995), or when they should be in MONOLINGUAL SPEECH MODE or when
they can go into BILINGUAL MODE (Grosjean 1995). Third, bilinguals also know
which mixing patterns are acceptable in their speech community and which are
not (SMOOTH versus FLAGGED code-switching,10 for example, Poplack and
Meechan 1995). This knowledge about language use is obviously closely
integrated with other types of (social) knowledge and a syntactic theory that
views language as a part of the total associative network is clearly more suitable
to explain these phenomena than other theories.
Viewing language as a sub-network (responsible for words) which is just a
part of the total associative network creates another advantage of WG for the
research paradigm under discussion in this chapter. This benefit is related to
the fact that most code-mixing research is based on natural language corpora. 11
In contrast with most other kinds of grammar which generate only idealized
utterances or sentences, WG grammar can generate representations of actual
utterances. A WG analysis of an utterance is also a network; it is simply an
extension of the permanent cognitive network in which the relevant word
tokens comprise a fringe of temporary concepts attached by 'is-a' links; so the
utterance network has just the same formal characteristics as the permanent
network. This blurring of the boundary between grammar and utterance is
quite controversial, but it follows from the cognitive orientation of WG. For
work based on natural speech data it is seen as another crucial advantage of
WG over other theories which can only generate syntactic structures for
sentences. From the examples quoted so far, it is obvious that the audio data
this study is based on are transcribed as utterances, i. e. units of conversational
structure. For the grammatical analysis, however, I assume that conversational
speech consists of the instantiation of linguistic units, i. e. sentences. In other
words, every conversational utterance is taken to be a token of a particular type
of linguistic unit, the structural features of that unit being defined by the
grammatical rules of either German or English. When using a WG approach to
code-mixed data, one does not have to 'edit' the corpus prior to linguistic
analysis. Any material that cannot be taken as a token of either a German or
English word-form can be left in the texts, but if it cannot be linked to other
elements in the utterance via a relationship of dependency, it is not included in
the syntactic analysis. That is, all the words in a transcribed utterance that are
related to other words by syntactic relationships constitute the sentences the
grammatical analysis is based on. As far as I am aware, WG is the only syntactic
theory that can (and wants to) generate representations of actual utterances, and
facilitates the grammatical analysis of natural speech data without prior editing.
Another consequence of integrating utterances into the grammar is that a
word token must be able to inherit from its type. Obviously the token must
have the typical features of its type - it must belong to a lexeme and a word
class, it must have a sense and a stem, and so on. But the implication goes in the
other direction as well: the type may mention some of the token's
characteristics that are normally excluded from grammar, such as characteristics
of the speaker, the addressee and the situation. For example, we can say that
the speaker is a German/English bilingual and so is the addressee; the situation
thus allows code-mixing. This aspect of WG theory thus allows us to
incorporate sociolinguistic information into the grammar, by indicating the kind
of person who is a typical speaker or addressee, or the typical situation of use.
Treating utterances as part of the grammar has further effects which are
important for the psycholinguistics of processing. The main point here is that
WG accommodates deviant input because the link between tokens and types is
guided by the 'Best Fit Principle' (Hudson 1990: 45ff.): assume that the current
token is-a the type that provides the best fit with everything that is known. The
default inheritance process which this triggers allows known characteristics of
the token to override those of the type. Let's take the deviant word /bʌsə/ in
the following example:

(10a) *TRU: xxx and warum waren keine bus(s)e [%pho: bʌsə]? Jen3.cha, line 331
            why were there no buses

/bʌsə/ is phonologically deviant for German (Busse is pronounced /bʊsə/),
and morphologically deviant for English, because the English plural suffix is
-(e)s, not -e. Although this word is deviant,12 it can is-a its type, just like any
other exception. But it will be shown as a deviant example. There is no need
for the analysis to crash because of an 'error'. The replies to *TRU's question
clearly show that the conversation does not crash:

(10b) *LIL: xxx [>] wegen einer bombe.
            because of a bomb
      *MEL: xxx [>] a bomb scare. Jen3.cha, lines 332-333

This is obviously a big advantage of WG for natural speech data.
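A toy illustration of how the Best Fit Principle might classify /bʌsə/; the feature sets and the simple match-counting score are my own invention, not Hudson's formulation:

```python
# The token is-a whichever stored type matches most of what is known about
# it, instead of being rejected as an error.
types = {
    'Busse (German)':  {'stem': 'bus', 'plural_suffix': '-e',
                        'stressed_vowel': 'ʊ', 'context': 'German'},
    'buses (English)': {'stem': 'bus', 'plural_suffix': '-(e)s',
                        'stressed_vowel': 'ʌ', 'context': 'English'},
}

# the token in (10a): German-style suffix, English-style vowel, uttered in
# an otherwise German clause
token = {'stem': 'bus', 'plural_suffix': '-e',
         'stressed_vowel': 'ʌ', 'context': 'German'}

def fit(token, features):
    """Count how many stored features the token matches."""
    return sum(token[attr] == value for attr, value in features.items())

best = max(types, key=lambda name: fit(token, types[name]))
print(best, '-', fit(token, types[best]), 'of', len(token), 'features match')
# the mismatching feature (the vowel) is recorded as a token-specific
# override; the analysis does not crash
```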


Another characteristic of natural speech data - and code-mixed data in
particular - is that they are inherently variable. Most syntactic theories aim at
describing and explaining regularized and standardized linguistic data and
therefore disregard inherent variability. Hudson (1997) outlines how a
prototype-based network theory that is based on default inheritance and uses
entrenchment, like WG, can incorporate variation.
One of the strengths of the network approach is that it allows links to have
different 'strength'; these are an essential ingredient of the model of spreading
activation, and are highly relevant to quantitative work. Hudson (1997)
stipulates that a language user who observes variation will arrive at
generalizations about this variation. Each part of a variable network structure
has some degree of 'entrenchment' which reflects the experiences of the person
concerned. The degree of entrenchment of a concept can be presented as a
probability of that concept being preferred to any relevant alternatives. This is
presented for word-final variable t/d loss in Figure 1, where the figures13 in
angled brackets present the probabilities.

Figure 1 (Hudson 1997: Figure 5)

This analysis of variation is declarative and non-procedural and requires just two elementary operations: pattern-matching and default inheritance. Speakers
and hearers need to know that alternative forms can be used instead of the basic
form, and in a real life context the choice between them is influenced by the
linguistic and social context. Figure 2 just hints at how these extra variables
could be introduced.

Figure 2 (Hudson 1997: Figure 6)

This model of inherent variability is possible because WG assumes that
linguistic concepts are closely linked to non-linguistic concepts and carry
quantitatively different entrenchment values. The reason why I find the
proposed model so appealing is that it is a model of competence - not
performance. Inherent variability is generally (rightly or wrongly) associated
with performance, and to my knowledge there is no other model that presents
variability and sociolinguistic information as part of a speaker's competence. I
believe that linguistic variation that is influenced by social factors is part of
every speaker's competence and a (more fleshed out) model of how speakers
exploit their sociolinguistic competence is therefore required within linguistic
theory.
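As a rough illustration, entrenchment values can be read as weights in a probabilistic choice between stored variants; the weights below are placeholders, not the figures from Hudson's Figures 1 and 2:

```python
# Entrenchment as choice probability between competing stored forms.
import random

def choose(variants: dict) -> str:
    """Pick a variant with probability proportional to its entrenchment."""
    forms = list(variants)
    return random.choices(forms, weights=[variants[f] for f in forms], k=1)[0]

# word-final t/d as in 'last': full form vs. t/d-deleted form
last = {'last': 0.7, "las'": 0.3}
print(choose(last))

# conditioning on context, as hinted at in Figure 2: the weights themselves
# can depend on linguistic or social variables (here a toy 'formal' factor)
def choose_in_context(base: dict, formal: bool) -> str:
    adjusted = {form: weight * (2.0 if formal and form == 'last' else 1.0)
                for form, weight in base.items()}
    return choose(adjusted)

print(choose_in_context(last, formal=True))
```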
In the following main section of this chapter I will present a quantitative/
variationist and qualitative analysis of monolingual and code-mixed subordinate
clauses. As none of the syntactic restrictions on code-switching proposed in the
literature hold absolutely and universally, several recent studies in the field
(Mahootian and Santorini 1996; MacSwan 1999; Eppler 1999) have reverted to
the null hypothesis. I take the same approach. Formulated in WG terms, the
null hypothesis assumes that each word in a switched dependency satisfies the
constraints imposed on it by its own language.
Subordination was chosen as an area of investigation because the two
languages in contact in this particular situation, German and English, display
surface word order differences: English subordinate clauses are SVO whereas
German subordinate clauses are SOV. The contrasting word order rules for
English and German, stated in Word Grammar rules, are:

E1) In English any verb follows its subject but precedes all its other
    dependents. This holds true for main as well as subordinate clauses and
    gives rise to SVO order in both clause types.
E2) Subordinators, e.g. because, require a following finite verb as their
    complement. A word's complement generally follows it.14

For German the most relevant rules15 concerning word order in main and
subordinate clauses are:

G1) A default finite verb follows one of its dependents but precedes all other
    dependents. This gives rise to verb second (V2) word order in German
    main clauses.
G2) A finite verb selected by a lexical subordinator/complementizer takes all
    its non-verb dependents to the left, i.e. it is a 'late'16 verb.
G3) Subordinators/complementizers, e.g. daß, select a 'late' finite verb as their
    complement.17 According to G2, finite 'late' verbs follow all their non-verb
    dependents.

An example illustrating rules G1-G3 would be:

(11) Ich glaube nicht, daß wir die Dorit schon gekannt haben
     I think not that we Dorit already known have
     Jen3.cha, line 83

The utterance initial main clause displays V2 word order. The finite auxiliary
haben, which depends on the subordinator daß, on the other hand, is in clause
final position following all other constituents including non-finite verbs like
gekannt. In English finite verbs in subordinate clauses do not behave differently
from finite verbs in main clauses. Therefore we do not have to override the
default rule E1 in the 'is-a hierarchy' of grammar rules. Because German finite
verbs depending on a subordinator take a different word order position from
'independent' finite verbs, we need a more specific rule (G2) that overrides the
default rule (G1) in the cases stated, i.e. for finite verbs selected by German
subordinators.
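One possible procedural paraphrase of rules E1/E2 and G1-G3 (the function and its return strings are my own encoding, not a published WG rule set) makes the override relation explicit:

```python
# Predict the position of a finite verb from its language and from whether
# it is selected by a German subordinator/complementizer.
def expected_verb_position(lang: str, selected_by_german_subordinator: bool) -> str:
    if lang == 'E':
        # rule E1: after the subject, before all other dependents (SVO)
        return 'after subject, before other dependents'
    if selected_by_german_subordinator:
        # rules G2/G3 override the German default: the verb is 'late'
        return 'clause-final (late)'
    # rule G1, the German default: one dependent precedes the verb (V2)
    return 'second position (one dependent to the left)'

# example (11): haben is selected by the German subordinator daß,
# so G2/G3 override G1 and haben is clause-final
print(expected_verb_position('G', True))   # clause-final (late)
# a German finite verb not selected by a German subordinator keeps V2
print(expected_verb_position('G', False))  # second position
```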
The pre-minimalism constituent-based models discussed in section 2 all
have difficulties accounting for mixing between SVO and SOV languages
because of the opposite setting of the branching parameter. I will show in the
next section that this code-mixing pattern can be studied productively in terms
of WG.

4. Word Order in Mixed and Monolingual 'Subordinate' Clauses


Code-switching between main and subordinate clauses was chosen as a research
area for several reasons. First, it is interesting from a syntactic point of view. If
German-English bilinguals want to code-switch subordinate clauses, they need
to resolve the problem that English is SVO whereas German finite verbs
depending on subordinating conjunctions are generally placed in clause-final
position (SOV).18 How this word order contrast is resolved is relevant to the
underlying question in all grammatical code-switching research, i.e. whether there are syntactic constraints on code-mixing. Second, the code-switched
corpus contains a considerable number of switches between main and
subordinate clauses (37), not including the 27 switches involving because
discussed in more detail below. Third, code-switching at clause boundaries has
attracted much attention in the research area.
As complementizers are one of the most commonly quoted examples of word
classes that are functional categories in constituent-based models of syntax, the
government and functional head constraints discussed in section 2 all rule out
switching between C and the remainder of the CP. Gumperz (1982) also
proposes that subordinate conjunctions must always be in the same code as the
conjoined sentence. Sankoff and Poplack (1981: 34), on the other hand, observe
that in their Spanish/English corpus subordinate conjunctions tend to remain in
the language of the head element on which they depend. Bentahila and Davies'
(1983) corpus of Arabic/French yields numerous examples of switches at various
types of clause boundary: switches between main clauses and subordinate clauses,
switching between complementizers and the clauses they introduce, and
examples where the conjunction is in a different language from both clauses.
Although my corpus also contains switches at all the points discussed by
Bentahila and Davies (1983), my data largely support Gumperz' (1982)
'constraint', that is, subordinate conjunctions (apart from because) tend to be in
the language of the subordinate clause that depends on them, and not the head
element on which they depend. Examples illustrating switches between main
and various types of subordinate clauses in both directions are:

(12) *MEL: ich hab(e) gedacht, there is going to be a fight. Jen1.cha, line 987
     I have thought
(13) *MEL: I forgot, dass wir alle wieder eine neue partie angefangen haben.
     that we all again a new game started have
     Jen1.cha, line 2541
(14) *TRU: die mutter wird ihr gelernt haben, how to keep young.
     her mother would her taught have Jen1.cha, line 2016
(15) *DOR: wenn du short bist, you wouldn't talk.
     when you are
     *DOR: aber wenn man geld hat, you talk.
     but when one money has Jen3.cha, lines 581-2
(16) *TRU: er schreibt fuenfzehn, if you leave it in your hand.
     he counts fifteen Jen2.cha, line 932
(17) *LIL: das haengt davon ab, what 'nasty' is(t).
     that depends on Jen2.cha, line 1062

Note that the null hypothesis is borne out in examples (12)-(17) and in the vast
majority of monolingual and mixed dependencies19 in the German-English
corpus. The WG rules determining the word order in main and subordinate
clauses also hold. These findings are furthermore supported by the quantitative
analysis of 1,350 monolingual and 690 mixed dependency relations in a 2,000-word
monolingual sample corpus and a 7,000-word code-mixed corpus (see
Eppler 2004).

This study particularly focuses on because and weil clauses. Several researchers (Gardner-Chloros 1991; Salmons 1990; Treffers-Daller 1994; Bolle 1995; Boumans 1998) studying code-mixing between SVO and SOV
Bolle 1995; Boumans 1998) studying code-mixing between SVO and SOV
languages noticed that the clauses depending on switched conjunctions are
frequently not SOV but V2. The conjunction in these examples, furthermore,
is frequently the causal conjunction because, parce que and omdat. This led
Boumans (1998: 121) to hypothesize that '... it is possible that foreign
conjunctions do not trigger verb-final in Dutch and German clauses simply
because they are used in functions that require main clause order'. He,
however, found it 'hardly feasible to examine this hypothesis in relation to the
published examples because these are for the most part presented out of
context' (Boumans 1998: 121). I will show that a fully (LIDES20) transcribed
corpus of German and English data allows us to verify this hypothesis.
Both types of analysis, qualitative structural and quantitative distributional,
are considered to be necessary for a comprehensive description of the data,
because different structural patterns are used to different degrees and for
different purposes. The variation in the data can best be described
quantitatively; the qualitative analysis provides an explanation for the structural
patterns found. This combination of methodologies furthermore enables us to
address Muysken's (2000: 29) statement that '... we do not yet know enough
about the relation between frequency distributions of specific grammatical
patterns in monolingual speech data and properties of the grammar to handle
frequency in bilingual data'. I will compare the because- and weil-clauses in
mixed utterances with monolingual German and English examples and show
that we do know enough about the syntax and pragmatics of this construction to
explain both the frequency distribution of causal conjunctions and the use of
verb second (rather than verb final) word order.

4.1 The empirical issues

4.1.1 ASYMMETRY BETWEEN CONJUNCTIONS OF REASON


The distribution of German and English subordinators/complementizers in
the corpus is approximately 60: 40, which is in accordance with the general
distribution of word tokens from the two languages in the data. If, however,
we focus on because and the translation equivalent from the same word class,
the subordinating causal conjunction weil, we get a very different picture.
The corpus yields twice as many tokens of the English subordinator as it
does of weil (see Table 1). A typical use of because, especially for speaker
DOR, is:

(18) DOR: es war unsere [...] Schuld because man fühlt sich
     it was our fault one feels
     mit den eigenen Leuten wohler.
     with the own people happier. Ibron.cha, line 221

Because in the above example can be argued to be a single lexical item inserted in otherwise German discourse. This particular usage of the English causal subordinator is not restricted to speaker DOR:

(19) LIL: because er ist ein aufbrausender Irishman.
     he is a hot-blooded
     Jen1.cha, line 389

Because also enters syntactic relations where the word on which it depends is
English (eat) and its dependent is German (schmeckt), as in:

(20) DOR: eat it with der Hand-! because das schmeckt ganz anders.
the hand it tastes very differently
Ibron. cha, line 2214

or vice versa, e.g. because has a German head verb (habe) but an English
complement (know):

(21) MEL: ich hab's nicht einmal gezahlt because I know I'm going to lose.
     I have it not even counted
     Jen1.cha, line 881

The German subordinator of reason, weil, on the other hand, only enters into
monolingual dependency relations:

(22) DOR: dann ist sie, weil sie so unglücklich war, dort gestorben.
     then has she, because she so unhappy was, there died
     Ibron.cha, line 1002

So there is not only an asymmetry in the number of tokens each subordinator yields, but also in the language distribution of the immediate syntactic relations which because and weil enter into, i.e. their main clause head verb and the
subordinate dependent verb. The results are summarized in Table 1.

Table 1: Language of head and dependent of because and weil

              headE-depE   headE-depG   headG-depG   headG-depE   total
because           86            5            16            6       123
weil               0            0            59            0        59

The phenomenon of single lexical item subordinate conjunctions in other-language contexts is not uncommon in the code-mixing literature.21 As far as
directionality of the switch is concerned, the situation in my corpus is in sharp
contrast with the findings of Clyne (1973) who studies German/English code-
mixing among the Jewish refugee community in Melbourne, Australia. He
reports that 'the words transferred from German to English are mainly
conjunctions (denn, ob, und, weil, wie, wo)' (Clyne 1973: 104). The corpus from
the refugee community in London also shows a high propensity for switching
conjunctions; however, the vast majority of them are English conjunctions in otherwise German discourse. Lexical transfer of the same word class thus
seems to work in the opposite direction in two bilingual communities with a
very similar sociolinguistic profile mixing the same language pair.
To rule out the possibility that English because is used in place of another
German causal conjunction, I will now look at the other possibilities. Da is
another causal subordinator, producing identical word order effects to
weil, but normally used in more formal contexts. The whole corpus yields only
one example of German da used as a subordinating conjunction. This token is
embedded in formal discourse and was produced by a speaker who does not
use the mixed code as a discourse mode. Denn is a causal coordinating
conjunction. It was used once by a speaker from the group recordings (not
DOR) and three times by a speaker in a more formal setting. Denn has
increasingly gone out of use in colloquial German (Pasch 1997; Uhmann
1998); however, since it is used by my informants, we need to consider it as a
possible translation equivalent of because. This possibility is interesting because
it involves word order issues: as a coordinating conjunction, denn always takes
V2 order in the clause following it. The relations between weil and denn will be
discussed further in section 4.2.2 on word order.

4.1.2 VERB SECOND WORD ORDER AFTER BECAUSE AND WEIL


Examples (18)-(20) also demonstrate the structural feature under investigation:
German finite verbs occur in main clause word order position in subordinate
clauses introduced by because. In actual fact not one German finite verb
depending on because is in clause final position (as in monolingual German
subordinate clauses with an overt German subordinator; see example 22).
Furthermore, not all finite dependent verbs follow their subject. Some of
them follow fronted indirect objects as in (23), others follow adverbials as in
(24):

(23) DOR: because dem Computer brauchst' es nicht zeigen.
     the computer need you it not show.
     Jen2.cha, line 729
(24) LIL: is' wahr -? because bei mir hat schon +...22
     it's true at my place has already
     Jen1.cha, line 298

The word order in subordinate clauses after because is summarized in Table 2.

Table 2: Word order in subordinate clauses after because

              dependent English        dependent German
                    SVX               SVX    XVS    SOV
because              92                15      6      0

What are supposed to be German dependent verbs occur in second position after because, which shows that because, at least for my informants, has not taken over the syntactic characteristics of the German subordinating conjunction weil, which requires its dependent verbs to be clause final.
Let us now take a closer look at this subordinator. Table 1 illustrates that
weil only has German complements. According to the rules of standard
German (rules G2 & G3), finite verbs depending on an overt subordinator
should follow all their dependents, i.e. be clause final. This is not borne out in
the corpus. Note, however, that 58 per cent of dependent verbs are in final
position after weil, whereas none is in this position after because. Table 3
summarizes the position of the dependent finite verb in weil clauses from my
corpus. In order to see whether verb second after weil is a parochial convention
of my data or not, I also give the distribution of V2 and Vf from several other
corpora of monolingual spoken German23 for comparison.

Table 3: Verb position after weil, partly based on Uhmann (1998: 98)

                          Vf      V2      Vf       V2
Eppler (2004)             34      25      58%      42%
BYU (Vienna)              62      11      85%      15%
Farrar (1998) BYU       1147     517      69%      31%
Schlobinski (1992)        74      22      70%      23%
Uhmann (1998)             24      19      56%      44%
Dittmar (1997)            99      29      77.3%    22.7%

Table 3 shows that between 15 per cent and 44 per cent of dependent verbs in
these corpora are not in final position. So weil+V2 word order is not just a
peculiarity of the German spoken by my bilingual informants.
We thus have two problems to solve: 1) the asymmetrical distribution of
because and weil in the corpus; and 2) the word order variation in both mixed
and monolingual causal clauses introduced by because and weil. In the next
section I will suggest possible solutions to these two problems.

4.2 Possible explanations

4.2.1 FOR THE ASYMMETRY OF BECAUSE AND WEIL


The frequencies with which because and weil occur in dependency relations
(summarized in Table 1) suggest that for the asymmetry between because and
weil a probabilistic perspective is required.
Fourteen out of the sixteen tokens of because in an otherwise German
context were produced by one speaker (DOR). This is even more significant if
we remember that this speaker is German dominant. The data from this
speaker only contain seven tokens of the German subordinator weil (and no
denn). Because thus seems to replace weil for specific uses in the speech of this
speaker. This use of the causal conjunctions is also to be found among the
close-knit network of bilinguals who use the mixed code as a discourse mode
(speakers TRU, MEL and LIL); but there is no significant asymmetrical relation between because and weil in the rest of the corpus.
Reasons for the discrepancy between the British and Australian corpora will
have to remain speculative for the moment. I will, however, come back to this
point at the end of section 4.2.2. Why German-speaking Jewish refugees in
Australia incorporate German conjunctions into their English - and the
directionality of lexical transfer is reversed among the same speakers in
Britain - could be due to the Australian corpus having been collected
approximately 20 years before the London corpus. Michael Clyne collected
data from this speech community in the 1970s. My corpus was collected in
1993. An additional two decades of exposure to English of the London-based
refugees may be a possible explanation for this discrepancy. Data from
American/German dialects that have been in contact with English for up to two
centuries support this assumption. See example (25) from Salmons (1990:
472):

(25) Almost jedes mol is Suppe gewen, because mir han kei
     every time is it soup be we have no
     Zeit khat fer Supper recht essen.
     time had for soup properly to eat

Treffers-Daller (1994: 192-5) discusses (25) and (26) and suggests analyzing the
conjunctions in these two examples as coordinators. For monolingual English
Schleppegrell (1991: 323) argues that 'a characterisation of all because clauses as
subordinate clauses [...] is inadequate'. The possibility of a paratactic24 function
for because will be discussed in the next section.
Gardner-Chloros's (1991) French/Alsatian data also offer an interesting
example of two Alsatian clauses linked by a French causal conjunction.

(26) Un noh isch de Kleinmann nunter, parce que ich hab
     and now is the Kleinmann down there I have
     mi dort mue melde.
     myself there must check in.

The German verbs selected by the English and French conjunctions in examples (25) and (26) follow just one dependent, in these cases their subjects.
I will discuss the not strictly causal/subordinating use of English because,
German weil and French parce que in the next section.

4.2.2 V2 AFTER BECAUSE AND WEIL


The clearest result of the quantitative analysis presented in Table 2 is that all
German finite verbs in clauses after because are in second position and none in
clause final position.
The Word Grammar rules stated in section 3 account for the empirical data
because English subordinators only require finite verbs as their complements
(rule E2). German subordinators (rule G3), on the other hand, provide a
specific context that requires dependent verbs to take all their dependents to
the left. As because is an English subordinator which does not specify that its
complement has to be a clause final verb, we get main clause word order (SVO
in monolingual English or V2 in mixed utterances).
Supporting evidence for this interpretation comes from the six instances
where the finite verb follows a dependent other than its subject (cf. examples
23-24 and 27 below).

(27) DOR: I lost because # dreimal gab sie mir drei Konige.
     three times gave she me three kings
     Jen1.cha, line 817

In the above example the verb is in second position, but the clause is clearly not
SVO. The finite verb is preceded by an adverbial but followed by the subject.
In other words, the clause displays the V2 order expected in German main
clauses.
But how do we know that because and the because-clause are used in a
restrictive subordinating way in examples (23), (24) and (27)? This question
needs to be addressed because research conducted by, amongst others,
Rutherford (1970), Schleppegrell (1991) and Sweetser (1990) casts doubt on
the characterization of all because-clauses as subordinate clauses. Especially in
spoken discourse, because can be used in a variety of non-subordinating and not
strictly causal functions.
Several criteria have been proposed to distinguish between restrictive (i.e.
subordinating25) and non-restrictive because-clauses (Rutherford 1970). In
sentences containing restrictive because clauses yes/no questioning of the whole
sentence is possible; pro-ing with so or neither covers the entire sentence; they
can occur inside a factive nominal; and if another because clause were added,
the two causal clauses would occur in simple conjunction. In semantic terms the
main and the subordinate clause form one complex proposition and the
because-clause provides the cause or reason for the proposition itself. This
causal relationship is one of 'real-world' causality (Sweetser 1990: 81). Chafe
(1984) asserts that restrictive because clauses have a reading that presupposes the
truth of the main clause and asserts only the causal relation between the clauses.
These clauses tend to have a commaless intonational pattern.
I will now apply these characteristics to some of the causal clauses
introduced by because in the corpus cited so far. Utterance (27) passes all of
Rutherford's (1970) syntactic criteria for restrictive because-clauses. The main
and because-clauses form one complex proposition with a reading in which 'her
giving the speaker three kings' is the real world reason for the speaker losing the
game of cards. The truth of the sentence-initial clause is presupposed and the
causal relation between the two clauses is asserted. These properties of (27)
speak for a restrictive analysis. The intonational contour of the utterance,
however, displays a short pause after the conjunction.26 Note furthermore that
the causal clause in (27) contains a pre-posed constituent that triggers inversion,
i. e. a main clause phenomenon (Green 1976). So there are indicators for both
a restrictive/subordinate reading but also syntactic and intonational clues that
point to a non-restrictive/epistemic reading in which the speaker's knowledge causes the conclusion. The latter interpretation suggests non-subordination,
which would justify the V2 word-order pattern.
Example (18), repeated here with more context (to facilitate the
interpretation) and prosodic information as (28), contains the English
conjunction because but is otherwise lexified with German words:

(28) DOR: wir waren nie mit richtige Englaender zusammen.
     'we never mixed with "real" English people'
     DOR: man hatte konnen # man hat nicht wollen.
     'we could have # but we didn't want to'
     DOR: es war unsere [...] Schuld-.
     it was our fault
     because man fühlt sich mit den eigenen Leuten wohler.
     one feels oneself with the own people better
     Ibron.cha, lines 217-22

This example passes none of Rutherford's (1970) 'tests'. The intonational drop
before the conjunction, which intonationally separates the two clauses, also
suggests a non-subordinate analysis for (28). A restrictive reading of the whole
construction is awkward if not unacceptable: feeling relaxed in the company of
fellow compatriots is not the cause or reason for feeling guilty. The non-
restrictive reading in which the because clause provides the reason why the
speaker said 'it was our own fault' is far more plausible. The because clause,
furthermore, indicates an interpretative link between clauses that are several
utterances apart: the last utterance in (28) provides a 'long-distance' reason for
the first utterance in this sequence. Schleppegrell (1991: 333) calls these uses of
because 'broad-scope thematic links'. They can only be identified when a corpus
provides the relevant context for the example. The wider context also identifies
the clause preceding the causal clause as presupposed and thematic. The
information provided in the causal clause is new and asserted.
The analysis so far suggests that because is used in non-restrictive and non-
subordinating functions in code-mixed utterances in my corpus. Without
repeating them, I will now briefly discuss the other examples in which because
introduces a clause with predominantly German lexical items (Examples 19-20
and 23-24). Example (19) is a response to a preceding wh-question and thus an
independent utterance, the information presented in the reply is not
informationally subordinated, it forms the focus of the discourse and provides
new information (Schleppegrell 1991: 31). Example (20) has two intonational
contours. The intonational rise and the verb first order mark the initial clause as
a command or suggestion, i. e. an independent proposition; the following
because clause then represents an elaboration of that proposition. The content
of the causal clause is therefore not presupposed. Example (20) displays all the
characteristics of an 'epistemic' (Sweetser 1990) because, which indicates
'elaboration and continuation in non-subordinating and non-causal contexts'
(Schleppegrell 1991: 323). The because clause in example (23) is preceded by a
short pause, contains a main clause phenomenon (extraction), and is reflexive
on the previous discourse; finally, the because clause in (24) follows a rising
intonation of the initial tag, and again explicitly mentions the speaker's
knowledge state ('it's true').
We can conclude that those clauses in which because has a German (V2)
verb as its complement display more characteristics of 'non-restrictive'
(Rutherford 1970) clauses and should therefore be analyzed as paratactic
rather than subordinating. The Word Grammar rules formulated in section 3
still account for the data because if because is not analyzed as a subordinator, the
default rule G1 is not overridden and G2 and G3 do not get activated.
The analysis of the code-mixed data discussed so far indicates that the
predominantly German clauses introduced by because fulfil functions that are
not strictly causal but rather epistemic, broad-scope thematic link, etc. This
distinct usage is also reflected in their structural and intonational patterns. We
can therefore assume that we are dealing with non-restrictive because that is non-
subordinate and thus triggers main clause (V2) word order.
However, we also need to consider the monolingual data. The monolingual
German data from my corpus are more worrying at first sight. Like because, weil
was traditionally analyzed as a subordinating conjunction with causal meaning
which takes a finite verb as its complement. These grammar rules are not
absolutely adhered to by my informants and monolingual speakers of German.
Only 58 per cent of verbs depending on weil in the speech of my informants are
in clause final ('late') position. Table 3 shows, furthermore, that in corpora of
similar, i.e. southern, varieties of German only 56-85 per cent (with an average of
approximately 67 per cent) of the subordinate clauses introduced by weil are
grammatical according to the rules for monolingual German as stated in section 3.
The recent German literature on weil constructions (Günthner 1993, 1996;
Pasch 1997; Uhmann 1998), however, suggests an explanation for the
monolingual German data and opens up the possibility of an interesting
interpretation of the mixed data. There is agreement among the above named
researchers that a) there is considerable variation in the use of weil + V2 or weil
+ Vf; b) weil + V2 is most frequent in southern German dialects; and c) weil
clauses with verb final placement and weil clauses with main clause (V2) word
order are found to show systematic but not absolute differences. In a nutshell,
the analysis for German weil is similar to the analysis proposed for English
because: there are two types of weil clauses, one strictly subordinating one, and
several non-restrictive paratactic uses. The factor that best seems to account for
the data is the information structure of the construction. If pragmatics and
syntax, which in German is a much clearer indicator than in English, fail to
provide clear criteria as to which type of weil-construction we are dealing with,
intonation can once again help to disambiguate. Example (29) from my corpus
illustrates epistemic weil + V2:

(29) LIL: sie hat sich gedacht, die [/] die muss doch Wien kennenlernen,
     'she thought she needs to get to know Vienna'
     weil die eltern sind beide aus Wien.
     because parents are both from Vienna
     Jen3.cha, lines 107-8

Note that in (29) weil could be replaced by the German coordinating conjunction denn. Pasch (1997) and Uhmann (1998) agree that the non-restrictive weil seems to take the position of Standard German denn in the system of conjunctions of reason in colloquial German.
In the analysis so far it has been established that there are 'restrictive' and
'non-restrictive' because clauses in English and 'restrictive' and 'non-restrictive'
weil clauses in German. A cross-linguistic comparison of these clause types
revealed that they share many of their discourse-pragmatic, syntactic and
intonational characteristics. My informants use both clause types from both
languages in monolingual contexts. In addition to this, they employ because in
code-mixed contexts. They treat English because as the translation equivalent of
the non-restrictive weil+V2 or denn. Their linguistic competence tells them that
these constructions are equivalent in syntax and pragmatic content.
This was demonstrated for the quoted examples and also holds true for the
because + V2 examples not reproduced in this chapter.
Furthermore, if we apply this analysis to the quantitative asymmetry found in
the corpus between the two conjunctions because and weil and add the 21
tokens of because+V2 to the weil tokens, this asymmetry shrinks to a figure (80
weil : 102 because) which is in line with the general language distribution in the
corpus. In addition to the syntactic and pragmatic reasons for using this
'congruence approach' (Sebba 1998: 1) to switching at clause boundaries, my
informants may also be dialectally pre-disposed to the weil+V2 construction
because all of them are L1 speakers of a southern German variety.
I will now briefly return to the discrepancy between the Australian (Clyne
1973) and London corpora mentioned in sections 4.1.1 and 4.2.1. The
question was why German speaking Jewish refugees in Australia incorporate
German conjunctions into their English, and the directionality of lexical transfer
is reversed among the same speakers in Britain. I hypothesized that duration of
language contact may have something to do with it. At the time of data
collection, German speaking refugees in Australia had been mixing German
and English for approximately 30 years. In London, on the other hand, these
two languages had been in contact for more than half a century when I collected
my data. Another situation where we can witness long-term contact between the
two languages under investigation is German-American dialects. Note,
furthermore, that example (25) from these data (Salmons 1990) also has main
clause word order after because.
The development in Pennsylvania German (Louden 2003) is particularly
interesting in this respect. Louden (2003) illustrates the causal conjunction
paradigm in Pennsylvania German (PG) data from the 19th century onwards.
In the second half of this century he found the standard German distribution of
weil + verb final and dann (< Germ. denn) + V2. In data from the beginning of
the 20th century PG still has verbs depending on weil in final position; dann,
however, has been replaced by fer (< Engl. for) + V2. In modern sectarian PG
weil is backed up with (d)ass, a historical merger of dass with comparative als,
and for (originally dann < Germ. denn) has been replaced with because + V2.
This development is interesting for several reasons: PG, in the late 19th and
early 20th century, went through a phase that mirrors present-day English in the
distribution between because and for. In modern Pennsylvania German, weil
does not seem to be able to function as a subordinator in its own right any longer
and has to be backed up by another complementizer to trigger verb final
placement. This supports rule G2 (section 3), which implicitly proposes a
subordinate feature on lexical complementizers. Modern PG weil seems to have lost
this feature and therefore needs to be 'backed up' by another subordinator to
trigger verb final word order.
Dann in modern PG, on the other hand, after having gone through the stage
of fer (< Engl. for), is eventually replaced by because, as in my data. This
development not only backs up the speculation voiced in section 4.1.2, i.e. that
the discrepancy between my and Clyne's (1973) German-English corpora might
be due to prolonged language contact, but also the qualitative analysis presented
in section 4.2.2.
The WG stipulation of a subordinate feature on German complementizers
(Rule G3) is furthermore supported by data from another language contact
situation with word order contrasts in subordinate clauses: French-Dutch
contact in Brussels. The most frequently borrowed subordinator in Brussels
Dutch is tandis que. Treffers-Daller (1994: 191) observes that the Dutch
equivalent of tandis que, terwijl, is rarely used in her corpus. In those cases that
do occur, the Dutch conjunction is followed by the Dutch complementizer dat.
Like weil in Pennsylvania German, Brussels Dutch terwijl may also have lost
the subordinate feature and require an 'obvious' complementizer to trigger verb
final placement.

5. Summary and Conclusion


In section 2 of this chapter I illustrated why syntactic constraints on intra-
sentential code-mixing formulated within Phrase Structure Grammar frameworks
are empirically not borne out. They are too restrictive because the
domain of government, i.e. maximal projections or constituents, was too large,
and because of the problematic distinction between lexical and functional
categories.
In section 3 I discussed the advantages of WG over other linguistic theories
for code-mixing research. They are seen to be:

• Word Grammar requires less analysis than constituency-based models
  because the only units that need to be processed are individual words.
  Larger units are built by dependency relations between two words which
  can be looked at individually.
• As syntactic structure consists of dependencies between pairs of single
words, constraints on code-mixing are less prone to over-generalization than
constraints involving notions of government and constituency.
• Word Grammar allows a single, completely surface analysis (with extra
dependencies drawn below the sentence-words). Code-mixing seems to be a
surface-structure phenomenon, so this property of WG fits the data.

• Knowledge of language is assumed to be a particular case of more general
  types of knowledge. Word Grammar accommodates sophisticated socio-
linguistic information about speakers and speech communities. This is
important for language contact phenomena that are influenced by social
and psychological factors.
• In contrast with most other syntactic theories, Word Grammar recognizes
utterances.
• WG is a competence model which can handle inherent variability.

I do not claim that the present work illuminates theories of language structure
but it confronts a linguistic theory, Word Grammar, with statistical data, and
shows that this theory of language structure can be successfully and
illuminatingly used for the analysis of monolingual and code-mixed constructions.
The WG formulation of the null hypothesis is borne out with just a
handful of exceptions, and the WG rules determining word order in
monolingual German or English and code-mixed clauses also hold.
The investigation of word order in subordinate clauses, furthermore, shows
that the null hypothesis seems to be correct even in cases where we would
expect restrictions on code-switching due to surface word order differences
between the two grammars involved in mixing. The quantitative analysis of
monolingual and code-mixed because and weil clauses revealed that a) the core
group of informants favour the English causal conjunction because over German
weil or denn; the use of weil and denn is restricted to monolingual German
contexts, and because is also used to introduce mixed utterances; b) the word
order in weil clauses varies between verb final, as required in subordinate
clauses, and verb second, the main clause order; the coordinating conjunction
denn only occurs once and with main clause order, as expected; mixed clauses
introduced by because invariably have verb second structure. Independent
research on the syntactic, intonational, semantic and pragmatic properties of
monolingual because and weil clauses has shown that these properties cluster to
form two main types of causal clauses: restrictive and non-restrictive (Rutherford
1970). The qualitative analysis of the monolingual causal clauses in the corpus
revealed that they also fall into these two types and that the mixed utterances
introduced by because predominantly have the grammatical properties of non-
restrictive clauses. Thus Boumans' (1998: 121) hypothesis that 'foreign
conjunctions do not trigger verb-final in German clauses simply because they
are used in functions that require main clause order' could be verified. The
quantitative analysis of because and weil clauses has furthermore demonstrated
how frequency distributions of a specific grammatical pattern in monolingual
speech data can be combined with our knowledge about syntactic and pragmatic
properties of grammars to handle frequency in bilingual data (Muysken 2000).
The WG analysis of German (and Dutch) lexical subordinators having a
'subordinate' feature which triggers verb final placement was furthermore
supported by data from two other language contact situations (Pennsylvania
German and Brussels Dutch) in which certain subordinators seem to have lost
this feature and therefore to require 'backing up' from overt complementizers.
References
Belazi, H. M., Rubin, E. J. and Toribio, A. J. (1994), 'Code switching and X-bar theory:
The functional head constraint'. Linguistic Inquiry, 25, 221-37.
Bentahila, A. and Davies, E. E. (1983), 'The syntax of Arabic-French code-switching'.
Lingua, 59, 301-30.
Bolle, J. (1995), 'Mengelmoes: Sranan and Dutch language contact', in Papers from the
Summer School Code-switching and Language Contact. Ljouwert/Leeuwarden: Fryske
Akademie, pp. 290-4.
Boumans, L. (1998), The Syntax of Codeswitching: Analysing Moroccan Arabic/Dutch
Conversations. Tilburg: Tilburg University Press.
Chafe, W. L. (1984), 'How people use adverbial clauses'. Berkeley Linguistics Society, 10,
437-49.
Chomsky, N. (1981), Lectures on Government and Binding. Dordrecht: Foris.
Clyne, M. G. (1973), 'Thirty years later: Some observations on "Refugee German" in
Melbourne', in H. Scholler and J. Reidy (eds), Lexicography and Dialect Geography,
Festgabe für Hans Kurath. Wiesbaden: Steiner, pp. 96-106.
— (1987), 'Constraints on code-switching: how universal are they?' Linguistics, 25, 739-
64.
DiSciullo, A.-M., Muysken, P. and Singh, R. (1986), 'Government and Code-Mixing'.
Journal of Linguistics, 22, 1-24.
Eppler, E. (1999), 'Word order in German-English mixed discourse'. UCL Working
Papers in Linguistics, 11, 285-308.
— (2004), '... Because dem Computer brauchst' es ja nicht zeigen': Because + German
main clause word order'. International Journal of Bilingualism, 8, 127-43.
— German/English LIDES database <http://talkbank.org/data/LIDES/Eppler.zip>.
Gardner-Chloros, P. (1991), Language Selection and Switching in Strasbourg. Oxford:
Clarendon Press.
Green, G. M. (1976), 'Main clause phenomena in subordinate clauses'. Language, 52,
382-97.
Grosjean, F. (1995), 'A psycholinguistic approach to codeswitching', in P. Muysken and
L. Milroy (eds), One Speaker, Two Languages. Cambridge: Cambridge University
Press, pp. 259-75.
Gumperz, J. J. (1982), Discourse Strategies. Cambridge: Cambridge University Press.
Günthner, S. (1993), "'... weil man kann es ja wissenschaftlich untersuchen':
Diskurspragmatische Aspekte der Wortstellung in weil-Sätzen'. Linguistische Berichte,
143, 37-59.
— (1996), 'From subordination to coordination?'. Pragmatics, 6, 323-56.
Hudson, R. A. (1980), Sociolinguistics. Cambridge: Cambridge University Press.
— (1990), English Word Grammar. Oxford: Blackwell.
— (1997), 'Inherent variability and linguistic theory'. Cognitive Linguistics, 8, 73-108.
— (2000), 'Grammar without functional categories', in R. Borsley (ed.), The Nature and
Function of Syntactic Categories. New York: Academic Press, pp. 7-35.
Joshi, A. K. (1985), 'Processing of sentences with intrasentential code-switching', in D.
Dowty, L. Karttunen and A. M. Zwicky (eds), Natural Language Parsing.
Cambridge: Cambridge University Press, pp. 190-205.
Lehmann, Ch. (1988), 'Towards a typology of clause linkage', in J. Haiman and S.
Thompson (eds), Clause combining in grammar and discourse. Amsterdam/
Philadelphia: John Benjamins, pp. 181-226.
Louden, M. L. (2003), 'Subordinate clause structure in Pennsylvania German'. FGLS/
SGL Joint Meeting. London, 2003.
MacSwan, J. (1999), A Minimalist Approach to Intrasentential Codeswitching. New York
and London: Garland.
Mahootian, S. and Santorini, B. (1996), 'Code-switching and the complement/adjunct
distinction'. Linguistic Inquiry, 27, 3, 464-79.
Muysken, P. (1989), 'A unified theory of local coherence in language contact', in P.
Nelde (ed. ), Language Contact and Conflict. Brussels: Centre for the Study of
Multilingualism, pp. 123-9.
— (2000), Bilingual Speech. A Typology of Code-Mixing. Cambridge: Cambridge
University Press.
Myers-Scotton, C. (1993), Duelling Languages: Grammatical Structure in Code-Switching.
Oxford: Oxford University Press.
Myers-Scotton, C. and Jake, J. L. (1995), 'Matching lemmas in a bilingual language
competence and production model: Evidence from intrasentential code switching'.
Linguistics, 33, 981-1024.
Pasch, R. (1997), 'Weil mit Hauptsatz-Kuckucksei im denn-Nest?'. Deutsche Sprache, 25,
252-71.
Poplack, S. (1980), 'Sometimes I'll start a sentence in Spanish y termino en español'.
Linguistics, 18, 581-618.
Poplack, S. and Meechan, M. (1995), 'Orphan categories in bilingual discourse: A
comparative study of adjectivization strategies in Wolof/French and Fongbe/French'.
Language Variation and Change, 7, 2, 169-94.
Romaine, S. (1989), Bilingualism. Malden, Mass.: Blackwell.
Rutherford, W. E. (1970), 'Some observations concerning subordinate clauses in
English'. Language, 46, 97-115.
Salmons, J. (1990), 'Bilingual discourse marking: Code switching, borrowing, and
convergence in some German-American dialects'. Linguistics, 28, 453-80.
Sankoff, D. and Poplack, S. (1981), 'A Formal Grammar for Code-Switching'. Papers in
Linguistics, 14, 3-46.
Schleppegrell, M. J. (1991), 'Paratactic because'. Journal of Pragmatics, 16, 323-37.
Schlobinski, P. (1992), 'Nexus durch weil', in P. Schlobinski (ed.), Funktionale
Grammatik und Sprachbeschreibung. Opladen: Westdeutscher Verlag, pp. 315-44.
Scotton, C. M. (1990), 'Code-switching and borrowing: Interpersonal and macro-level
meaning', in R. Jacobson (ed.), Codeswitching as a Worldwide Phenomenon. New York:
Peter Lang, pp. 85-110.
Sebba, M. (1998), 'A congruence approach to the syntax of Codeswitching'. International
Journal of Bilingualism, 2, 1-19.
Sweetser, E. (1990), From Etymology to Pragmatics. Metaphorical and Cultural Aspects of
Semantic Structure. Cambridge: Cambridge University Press.
Thorne, J. P. (1986), 'Because', in D. Kastovsky and A. Szwedek (eds) Linguistics across
Historical and Geographical Boundaries (Vol. 12). Berlin: Mouton. 1063-6.
Treffers-Daller, J. (1994), Mixing Two Languages: French-Dutch Contact in a
Comparative Perspective. Berlin: de Gruyter.
Uhmann, S. (1998), 'Verbstellungsvariation in weil-Sätzen'. Zeitschrift für
Sprachwissenschaft, 17, 29-139.
Zwicky, Arnold (1992), 'Some choices in the theory of morphology', in Robert Levine
(ed.), Formal Grammar: Theory and Implementation. Oxford: Oxford University Press,
pp. 327-71.
Notes
1 The corpus was collected in 1993 from German-speaking Jewish refugees residing
in London. All transcripts are available on <http://talkbank.org/data/LIDES/
Eppler.zip>.
2 Categorial equivalence is 'when the switched element has the same status in the
two languages, is morphologically encapsulated, shielded off by a functional
element from the matrix language, or could belong to either language' (Muysken
2000: 31).
3 Myers-Scotton and Jake (1995: 985) define system morphemes as morphemes that
do not participate in the thematic structure of a sentence, i.e. they are specified as
[-theta-role assigner/receiver]. A second feature characteristic of 'most' system
morphemes is the feature [+Quantification]. A morpheme has a plus setting for
quantification within the Matrix Language Frame model if it restricts possible
referents of a lexical category. Myers-Scotton and Jake (1995: 985) give tense and
aspect as examples for [+Q]. Tense and aspect restrict the possible reference of
predicates (i.e. verbs and adjectives). Prototypical system morphemes are inflections
and most function words.
4 The WG approach of incorporating different syntactic properties of WORDS in
isa-hierarchies seems more economical and convincing.
5 The Free Morpheme Constraint (Sankoff and Poplack 1981: 5) prohibits switching
between a bound morpheme (pre- or suffix) and a lexical form unless the latter has
been phonologically integrated into the language of the bound morpheme.
6 Note the similarity of this corollary with the WG null hypothesis this study is based
on.
7 Constituency analysis is applied only to coordinate structures.
8 This system implies that code-mixing ought to be less frequent among typologically
quite different language pairs.
9 According to the theory of markedness (Scotton 1990: 90), speakers know that for a
particular conventionalized exchange in their community, a certain code choice will
be the unmarked realization of an expected rights and obligations set between
participants. They also know that other possible choices are more or less marked
because they are indexical of other than the expected rights and obligations set.
10 Smooth code-switches are unmarked by false starts, hesitations, lengthy pauses, etc.;
flagged switches are accompanied by discourse markers and other editing
phenomena (Poplack 1980).
11 A database is particularly important for studies of codes that do not have 'native'
speakers who can provide fairly reliable grammaticality judgements. A corpus is also
an essential test for the constraints on and models of code-mixing.
12 An alternative analysis of this example would be that it is ambiguous, i.e. it conforms
to two different models. The stem conforms to the English phonological model and
the suffix conforms to the German plural suffix; i.e. it is a morphologically
integrated borrowed stem.
13 The figures for individual cases need not be the same; cases of lexical diffusion
would seem to suggest the contrary (Hudson 1980: 168ff). And presumably the
entrenchment value for the general rule in such cases could be different from all the
individual rules.
14 Default inheritance rules apply to the few English constructions in which the
complement comes before the head.
15 These rules are not intended to cover scrambling, double infinitive constructions
and other well-known word order intricacies of German.
16 The term 'late' was chosen instead of 'final' because finite dependent auxiliaries in
double infinitive constructions can be followed by their non-finite dependents; cf.
endnote 15.
17 Support for this analysis comes from the fact that German subordinate clauses
lacking a subordinator/complementizer are V2 (or verb initial). Cf.:
Sie sagte, sie kennen Doris vs. Sie sagte, daß sie Doris kennen
She said they know Doris        She said that they Doris know
According to G3, it is only subordinators/complementizers that select 'late' finite
verbs. So if a verb depends directly on another verb (kennen directly depending on
sagte and not daß) the default rule need not be overridden.
18 Exceptions to this rule are extraposition and double-infinitive constructions.
19 The null hypothesis is violated in five tokens of two construction types: word-order
violations of objects and negatives (see Eppler 2004).
20 The data this study is based on are transcribed in the LIDES (Language Interaction
Data Exchange) system. More information on the transcription system can be found
on <www.ling.lancs.ac.uk/staff/mark/lipps/>.
21 See for example Clyne (1987), Gardner-Chloros (1984), Salmons (1990), Treffers-
Daller (1994).
22 Example (24) is an incomplete subordinate clause. This does not affect the analysis
because the word order position of the relevant finite dependent verb is clear.
23 Since all my informants are from Vienna, I used only the examples from the ten
Viennese informants in the Brigham Young University (BYU) corpus. Farrar (1998)
counted all occurrences of weil in the speakers of southern German dialects from
the BYU corpus. Schlobinski's (1992) data are standard Bavarian; and the Uhmann
(1998) corpus is 'alemannisch-bairisch'.
24 Lehmann (1988) suggests that for clauses that are linked in a relationship of
sociation rather than dependency, 'parataxis' is a more appropriate term than
'coordination'.
25 Two clauses (X and Y) have been defined as being in a subordination relationship
'if X and Y form an endocentric construction with Y as the head' (Lehmann 1988:
182).
26 Note that in the English literature, Rutherford (1970) and Thorne (1986), the
comma intonation is assumed to precede the conjunction. Schleppegrell (1991:
333) mentions the possibility of because followed by a pause.
7 Word Grammar Surface Structures and HPSG
Order Domains*

TAKAFUMI MAEKAWA

Abstract
In this chapter, we look at three different approaches to the asymmetries between
main and embedded clauses with respect to the elements in the left periphery of
a clause: the dependency-based approach within Word Grammar (Hudson
2003), the Constructional Head-driven Phrase Structure Grammar (HPSG)
approach along the lines of Ginzburg and Sag (2000), and the Linearization
HPSG analysis by Chung and Kim (2003). We argue that the approaches within
WG and the Constructional HPSG have some problems in dealing with the
relevant facts, but that Linearization HPSG provides a straightforward accounta of
them. This conclusion suggests that linear order should be independent to a
considerable extent from combinatorial structure, such as dependency or phrase
structure.

1. Introduction
There are two ways to represent the relationship between individual words:
DEPENDENCY STRUCTURE and PHRASE STRUCTURE. The former is a pure
representation of word-word relationships while the latter includes additional
information that words are combined to form constituents. If all work can be
done just by means of the relationship between individual words, phrase
structure is redundant and hence dependency structure is preferable to it. It
would therefore be worth considering whether all work can really be done with
just dependencies. We will look from this perspective at certain linear order
asymmetries between main clauses and subordinate clauses. One example of
such asymmetries can be seen in the contrast of (1) and (2). The former shows
that a topic can precede a fronted wh-element in a main clause:

(1) a. Who had ice-cream for supper?
    b. For supper who had ice-cream?

(2) illustrates, however, that this is not possible in an embedded clause:

(2) a. Who had ice-cream for supper is unclear.
    b. * For supper who had ice-cream is unclear.
It is clear that main clauses are different from subordinate clauses with respect
to the possibility of topicalization. It has been noted by a number of researchers
that elements occurring in the left periphery of the clause, such as interrogative
and relative pronouns, topic and focused elements, show such linear order
asymmetries (see Haegeman 2000; Rizzi 1997; and works cited therein).
The purpose of this overview chapter is to take a critical look at the current
treatment of such asymmetries within the frameworks of WORD GRAMMAR
(WG) and HEAD-DRIVEN PHRASE STRUCTURE GRAMMAR (HPSG),
and ask how they should be represented in the grammar. We compare the WG
approach developed in Hudson (2003; see also Hudson 1995, 1999) with two
relatively recent versions of HPSG: what can be called CONSTRUCTIONAL HPSG,
in which grammars include hierarchies of phrase types (Sag 1997; Ginzburg and
Sag 2000), and so-called LINEARIZATION-BASED HPSG (or LINEARIZATION
HPSG), in which linear order is independent to a considerable extent
from phrase structure and is analysed in terms of a separate level of 'ORDER
DOMAINS' (Pollard et al. 1994; Reape 1994; Kathol 2000, etc.).1 It will be argued
that the WG and the Constructional HPSG approaches have some problems, but
that Linearization HPSG can provide a straightforward account of the facts.
The organization of this chapter is as follows. In the next section we consider
how a WG approach might accommodate the asymmetry between main and
subordinate wA-interrogatives. Section 3 then looks at a Construction HPSG
analysis along the lines of Ginzburg and Sag (2000). In section 4 we shall
outline a Linearization HPSG analysis developed by Chung and Kim (2003).
In the final section, we offer some concluding remarks.

2. A Word Grammar Approach


Before looking at the WG analysis of the phenomenon under discussion, we
should briefly outline how word order, wh-constructions and extractions are
treated in WG. In WG word order is controlled by two kinds of rule: general
rules that control the geometry of dependencies, and word-order rules that
control the order of a word in relation to other word(s): its LANDMARK(S)
(Hudson 2005). In simple cases a word's landmark is its PARENT: the word it
depends on. In the cases where the word has more than one parent, only the
'higher' parent becomes its landmark (PROMOTION PRINCIPLE; see Hudson
2003a). For example, let us consider the sentence It was raining. The raised
subject it depends on two verbs, was and raining, so it has two parents. In this
case, only was is eligible as its landmark. This is because raining depends on was, so
the latter is the 'higher' of the two. In a WG notation, It was raining is
represented as shown below.

(3)
The fact that it is the subject of the two verbs is indicated by the two arrows
labelled 's' (subject). The arrow labelled 'r' indicates that raining is a 'SHARER'
of was. This is so named since it shares the subject with the parent verb. In the
notation adopted here, the dependencies that do not provide landmarks are
drawn below the words. Therefore, one of the 's' arrows, the one from raining
to it, is drawn below the words. We thus pick out a sub-set of the total
dependencies of a sentence and draw them above the words. This sub-set is
called SURFACE STRUCTURE. Word-order rules are applied to it, and determine
the positioning of a word in relation to its landmark or landmarks. Thus, the
surface structure is the dependencies which are relevant for determining word
order. A word-order rule specifies that a subject normally precedes its
landmark, and another rule specifies that a sharer normally follows its
landmark, as illustrated by the representation in (3).
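To make this concrete, here is a minimal sketch of the encoding just described, in Python (our illustration, not WG notation): dependencies are parent-dependent pairs with labels, a sub-set of them counts as surface structure, and a word's landmark is its parent in that sub-set.

```python
# A minimal sketch (ours, not part of WG) of the encoding described
# above. Words are indices into the sentence; each dependency records
# its parent, its dependent, a label, and whether it is drawn above
# the words, i.e. belongs to surface structure.

from dataclasses import dataclass

@dataclass(frozen=True)
class Dep:
    parent: int
    dependent: int
    label: str      # e.g. 's' (subject), 'r' (sharer)
    surface: bool   # True if the dependency is in surface structure

# "It was raining": 'it' (0) depends on both 'was' (1) and 'raining' (2),
# but only the dependency from the higher parent, 'was', is in surface
# structure, so 'was' is the landmark of 'it'.
words = ["It", "was", "raining"]
deps = [
    Dep(parent=1, dependent=0, label="s", surface=True),
    Dep(parent=1, dependent=2, label="r", surface=True),
    Dep(parent=2, dependent=0, label="s", surface=False),
]

def landmark(word: int) -> int | None:
    """A word's landmark is its parent in surface structure."""
    for d in deps:
        if d.surface and d.dependent == word:
            return d.parent
    return None

assert words[landmark(0)] == "was"
```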
Among the rules that control the surface structure, THE NO-TANGLING
PRINCIPLE is the most important for our purpose: dependency arrows in the
surface structure must not tangle.2 This principle excludes the ungrammatical
sentence (4b):

(4) a. He lives on green peas.
    b. * He lives green on peas.

The dependency structures of this pair are shown in (5):

(5)

(5b) includes tangling of the arrows. Its ungrammaticality is predicted by the
No-Tangling Principle.
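The principle can be stated as a simple check on pairs of surface arcs over word positions; the sketch below (our encoding, not WG's) flags two arcs as tangled when they properly interleave.

```python
# A sketch of the No-Tangling Principle: two surface arcs tangle when
# exactly one endpoint of each lies strictly inside the other arc.
from itertools import combinations

def tangles(arc1: tuple[int, int], arc2: tuple[int, int]) -> bool:
    (a, b), (c, d) = sorted(arc1), sorted(arc2)
    return a < c < b < d or c < a < d < b

def obeys_no_tangling(arcs: list[tuple[int, int]]) -> bool:
    return not any(tangles(x, y) for x, y in combinations(arcs, 2))

# (5a) "He(0) lives(1) on(2) green(3) peas(4)": no arcs cross.
assert obeys_no_tangling([(1, 0), (1, 2), (2, 4), (4, 3)])
# (5b) "*He(0) lives(1) green(2) on(3) peas(4)": lives->on crosses
# peas->green, so the structure is correctly excluded.
assert not obeys_no_tangling([(1, 0), (1, 3), (3, 4), (4, 2)])
```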
Let us turn to the WG treatment of wh-interrogatives. Consider the
dependency structure of What happened?, for example. As in the case of an
ordinary subject such as it in (3), the grammatical function of the wh-pronoun
what to the verb happened is a subject. Therefore, what depends on happened,
and this situation can be represented as follows.

(6)
On the other hand, Hudson (1990: 361-82; 2003) argues that the verb is a
complement of the wh-pronoun and thus depends on it:

(7)

The evidence for the headedness of the wh-pronoun includes the following
phenomena (Hudson 2003). First, the pronoun can occur without the verb
in sluicing constructions:

(8) a. Pat: I know he's invited a friend. Jo: Oh, who [has he invited]?
b. I know he's invited a friend, but I'm not sure who [he's invited].

Second, the pronoun is what is selected by the higher verb. In (9) wonder and
sure require a subordinate interrogative clause as their complement. For a
clause to be a subordinate interrogative, the presence of either a wh-pronoun
or whether or if is required.

(9) a. I wonder *(who) came.
    b. I'm not sure *(what) happened.

Third, the pronoun selects the verb's characteristics such as finiteness and
whether or not it is inverted. (10) illustrates that why selects a finite or an
infinitive verb as its complement, but when only selects a finite verb:

(10) a. Why/when are you glum?
     b. Why/*when be glum?

(11) indicates that why selects an inverted verb as its complement whereas how
come selects a non-inverted verb:

(11) a. Why are you so glum?
     b. * Why you are so glum?
     c. * How come are you so glum?
     d. How come you are so glum?

(12) illustrates that what, who and when select a to-infinitive, but why does not:

(12) I'm not so sure what/who/when/*why to visit.

Hudson (2003) argues that all of these phenomena are easily accounted for if
the wh-pronoun is a parent of the next verb. In the framework of WG,
therefore, there is no reason to rule out either of (6) and (7); the sentence is
syntactically ambiguous. Thus, in What happened? what and happened depend on
each other, and the dependency structure may be either of (13a) and (13b):
(13) a.

b.

Thus, wh-interrogatives may involve a mutual dependency. In (13b), happened is
the parent and the dependency labelled 's' is put in surface structure. In (13a),
however, what is the parent, and the dependency labelled 'c' (complement) is
put in surface structure.
Finally, we outline how extraction is dealt with in WG. Let us consider (14a)
with a preposed adjunct in sentence-initial position:

(14) a. Now we need help.
     b. We need help now.

The preposed adjunct now would otherwise follow its parent need as in (14b),
but it precedes it. This situation is represented in WG by adding an extra
dependency 'EXTRACTEE' to now.

(15)

The arrow from need to now is labelled 'x<, >a', which means 'an adjunct
which would normally be to the right of its parent (">a") but which in this case
is also an extractee ("x<")'. Thus the adjunct now is to the left of the parent
verb need.
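A composite label of this kind can be read off mechanically; the sketch below (our encoding, with the labels used in this chapter) shows how the extractee component wins and places the word before its parent:

```python
# Illustrative only: which side of its landmark a dependent takes,
# given the dependency labels used in this chapter. 's' and 'x<'
# place a word before its landmark; 'r', 'c' and '>a' after it.

BEFORE = {"s", "x<"}

def side(label: str) -> str:
    """A composite label such as 'x<, >a' counts as 'before' as soon
    as any of its components does: the extractee relation repositions
    the word to the left of its parent."""
    parts = {p.strip() for p in label.split(",")}
    return "before" if parts & BEFORE else "after"

assert side(">a") == "after"       # (14b) We need help now.
assert side("x<, >a") == "before"  # (14a) Now we need help.
```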
With this background in mind, let us now turn to the asymmetry between
main and subordinate clauses in question: adverb-preposing is not possible in
subordinate interrogatives although it is possible in main interrogatives.

(16) a. Now what do we need?
     b. * He told us now what we need.

As stated above, a wh-pronoun and its parent are mutually dependent. In (16a)
do is the complement of what whereas what is the extractee of do. Thus, the
dependency structure for (16a) would be either of (17a) and (17b). In the
former, what is the parent, and the dependency labelled 'c' from what to do is
put in surface structure. In the latter, however, do is the parent and the
dependency labelled 'x<' from do to what is put in surface structure. The
preposed adjunct now is labelled 'x <, > a', and precedes its parent do. As the
diagram shows, the 'x <, > a' arrow from do to now tangles with the vertical
arrow in (17a). Thus, it violates the No Tangling Principle. On the other hand,
there is no tangling in (17b), so it is the only correct WG analysis of (16a).

(17) a.

b.

Let us turn to the subordinate wh-interrogative in (16b). In (16b) what is the
object and the extractee of need while need is the complement of what. It has the
structure represented in (18). What is the clause's subordinator and it has to be
the parent of the subordinate clause. The dependency labelled 'c' should be
put in surface structure since if the arrow labelled 'x <, o' were in the surface
structure, what would have two parents and violate THE NO-DANGLING PRIN-
CIPLE: words should not have more than one parent in surface structure
(Hudson 2005). As the diagram shows, the arrow from need to now is tangled
with the one from told to what. Unlike the main clause case in (17), it has no
alternative structure, so (16b) is ungrammatical.

(18)
Thus, WG can capture the linear order asymmetries of the main and
subordinate clauses in terms of dependencies in surface structure and general
principles on dependencies.
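Like the No-Tangling Principle, the No-Dangling Principle amounts to a mechanical check; the sketch below (our encoding, with word positions for (16b) 'He(0) told(1) us(2) now(3) what(4) we(5) need(6)') illustrates it:

```python
# A sketch of the No-Dangling Principle: in surface structure, no word
# may have more than one parent. Arcs are (parent, dependent) pairs.

def obeys_no_dangling(surface_arcs: list[tuple[int, int]]) -> bool:
    dependents = [dep for _, dep in surface_arcs]
    return len(dependents) == len(set(dependents))

# With the 'c' arc what(4)->need(6) in surface structure, each word
# has a single surface parent: fine.
assert obeys_no_dangling([(1, 4), (4, 6)])
# If the 'x<, o' arc need(6)->what(4) were surface too, 'what' would
# have two parents (told and need), violating the principle.
assert not obeys_no_dangling([(1, 4), (4, 6), (6, 4)])
```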
Although the WG analysis looks successful in accommodating the
asymmetry between the main and subordinate clauses, there are some
weaknesses. As surveyed above, the WG approach states that adjunct preposing
is possible out of main wh-interrogatives because the preposed adjunct avoids
violation of the No-Tangling Principle due to the fact that it is a co-dependent of
the wh-element (Hudson 2003: 636). The argument along these lines would
suggest that extraction is allowed as long as it does not violate the No-Tangling
Principle. However, there are cases in which extraction out of an embedded
wh-interrogative is excluded although it does not violate the No-Tangling Principle.
The data come from the SUBJECT-AUXILIARY INVERSION (SAI) structures
illustrated by (19).3

(19) Under no circumstances would I go into the office during the vacation.

In WG, a preposed operator of the SAI clauses, such as under no circumstances
in (19), is a kind of extractee (Hudson 2005), so we should expect that it
behaves like a preposed adjunct. As expected, SAI operators cannot be
extracted out of subordinate wh-interrogatives, as illustrated by the following
examples. A WG approach would suggest that this is due to the No-Tangling
Principle.

(20) a. * Lees wonders under no circumstances at all why would Robin
        volunteer.
     b. * I wonder only with great difficulty on which table would she put the
        big rock.
     (Chung and Kim 2003)

With this in mind, let us consider the main wh-interrogative clause. A WG
approach would predict that preposing of an SAI operator is possible out of
main wh-interrogatives because it should not involve violation of the No-
Tangling Principle, as in the case of adjunct preposing. However, it is actually
ungrammatical.

(21) a. * In no way, why would Robin volunteer?
     b. * Only with great difficulty on which table would she put the big rock?
     (Chung and Kim 2003)

Here the preposed SAI operator precedes the wh-element. Note that the
situation is completely parallel to the case of adjunct preposing in (16a), which
is repeated here:

(22) Now what do we need?

As we have seen, (22) is grammatical because it does not violate the No-
Tangling Principle. However, the sentences in (21) are ungrammatical though
they do not violate the same principle. This makes the analysis in terms of the
No-Tangling Principle less plausible.
As we saw at the outset of this section, word order is controlled by two
kinds of rule in WG: general rules, such as the No-Tangling Principle that
control the geometry of dependencies; and word order rules that control the
order of a word in relation to its landmark or landmarks (Hudson 2005).
Someone might suggest that the No-Tangling Principle is simply irrelevant in
(21) and that a word-order rule could exclude the ill-formed order. However,
there are some problems in this approach. Let us suppose that WG has a rule
which excludes the OP(ERATOR) < WH order. It is natural to suggest that
the same rule could apply not only to main clauses but also to subordinate
clauses. As predicted, the subordinate clauses with the same elements in the
same order as (21) are ungrammatical. This is actually illustrated by (20)
above. Now we should recall that they can also be excluded by the No-
Tangling Principle as well; a preposed operator is extracted from the
subordinate w/z-interrogative clause. The situation is entirely on a parallel with
(16b). A question arises: for which reason are the sentences in (20) excluded,
by the No-Tangling Principle or by a word-order rule? If we took the first
option, then the ungrammaticality of (20) would be accounted for by the No-
Tangling Principle, whereas that of the corresponding main clauses would be
explained by a word-order rule. If we took the second option, then both main
and subordinate clauses would be excluded by a word-order rule. It is clear
that we cannot take the first option: it forces the word-order rule to refer to
main clauses. Note that WG does not have a unit larger than a word, so it
does not recognize clauses (Hudson 2005). It does not, therefore, have a way
to distinguish main and subordinate clauses, apart from the assumption that
the latter has a subordinator and a parent outside of the clause. (Hudson
1990: 375-6). It is, then, impossible for WG rules to refer to any clause.
What about the second option, a word-order rule approach, where each ill-
formed case is excluded by a rule which bans a particular word order? It
could indeed account for the ungrammaticality of the OP < WH order in
both main and subordinate clauses. However, this option also has a problem,
to which we will turn in the following paragraph.
Consider the following pair, which shows another asymmetry between main
and subordinate clauses:

(23) a. * Why, in no way would Robin volunteer?
     b. Lees wonders why under no circumstances would Robin volunteer.

A wh-extractee precedes a preposed negative operator in the main clause in
(23a), and it is ungrammatical. However, the same order is allowed in the
subordinate clause, as in (23b). There is clearly an asymmetry between a main
and a subordinate clause. The same sort of asymmetry can be observed in the
case of a wh-extractee and a topic extractee as well. In (24), a wh-extractee
precedes a topic extractee in main clauses, and they are ungrammatical:
(24) a. * To whom, a book like this, would you give?
        (Koizumi 1995)
     b. * For what kind of jobs during the vacation would you go into the office?
        (Baltin 1982)

On the other hand, the same permutation of a wh-element and a topic is allowed
in subordinate clauses, as in (25):

(25) a. the man to whom, liberty, we could never grant.
     b. ? I wonder to whom this book, Bill should give.
        (Chung and Kim 2003)
     c. I was wondering for which job, during the vacation, I should go into
        the office.

Here we have yet another asymmetry between a main and a subordinate clause.
Our observation in (23)-(25) indicates that the word order which is grammatical
in subordinate clauses is ungrammatical in main clauses. Note that the No-
Tangling Principle cannot exclude the ungrammatical cases since they are all in
main clauses. Therefore, the only option we have is to specify word-order rules
to exclude ill-formed cases. Now the same problem as (21) and (20) arises
again. Such word-order rules would have to state that it is applied to a main
clause but not to a subordinate clause. However, it is impossible for WG rules
to refer to a clause since WG does not recognize any unit larger than a word.
Thus, we cannot adopt a word-order rule approach, either.
We have pointed out that the No-Tangling Principle is not effective enough
to accommodate the cases of preposing of an SAI operator, another asymmetry
between main and subordinate wh-interrogatives. Recall that the most important
assumption for a WG approach is that the wh-pronoun is the parent of the
subordinate wh-interrogatives. We should note that this assumption itself is not
without problems. Consider the examples in (26a) and (26b); the former is the
one cited by Hudson himself as problematic data for his analysis (Hudson 1990:
365):4

(26) a. Which students have failed is unclear.
     b. Who shot themselves is unclear.

In the WG treatment of the wh-pronoun, which and who are not only the subject of
have and shot, respectively, but also the subject of is. The verb should agree in
number with its subject, so have/shot and is should both agree with which/who.
Which in (26a) should share its plurality with students since the former is a
determiner of the latter; who in (26b) should share its plurality with themselves
since the former is the antecedent of the latter. This does not explain the
morphology of the copula verb in both sentences, which requires the singular
subject. This analysis would predict sentences like the following to be
grammatical:

(27) a. * Which students have failed are unclear.
     b. * Who shot themselves are unclear.
The copular verb is are, not is, agreeing with its subject which in (a) and who in
(b). These sentences are, however, ungrammatical. Thus, the assumption that
the wh-pronoun is the parent of the subordinate interrogatives has a weakness.
We should also note that there are some cases where an extractee is allowed
to precede the complementizer. The following examples are from Ross (1986):

(28) a. Handsome though Dick is, I'm still going to marry Herman.
b. The more that you eat, the less that you want.

In (28a), the first clause is the subordinate clause, and the adjective handsome, a
complement of is, is in front of the complementizer though. In (28b) the more,
which is an object of eat and want, is followed by the complementizer that.5 It
would be natural to assume the fronted elements in these examples to be an
extractee in WG's terms; but if so, the dependency arrow from the verb to the
extractee would tangle with the vertical arrow to the complementizer, and hence
the resulting structure in (29) violates the No-Tangling Principle.6

(29)

It seems, then, that a WG approach to the asymmetry between main and
subordinate wh-interrogatives has some problems.

3. An Approach in Constructional HPSG: Ginzburg and Sag 2000


We will now consider how adjunct preposing in main and subordinate wh-
interrogatives might be accommodated within the framework of HPSG. In
HPSG, lexical and phrasal descriptions are formulated in terms of FEATURE
STRUCTURES like (30):

(30)
The value of the feature PHONOLOGY (PHON) represents phonological
information of a sign. The value of SYNTAX-SEMANTICS (SYNSEM) is of
type synsem, a feature structure containing syntactic and semantic information.
The SLASH feature is for representing information about long-distance
dependencies, which we will consider further below. The value of LOCAL
(LOC) contains the subset of syntactic and semantic information shared in
long-distance dependencies. The syntactic properties of a sign are represented
under the path SYNSEM|LOC|CAT(EGORY). The HEAD value contains
information standardly shared between a phrase and its head, information
such as part of speech. The semantic properties of a sign are represented
under SYNSEM|LOC|CONT(ENT). The value of the ARG-ST (ARGU-
MENT-STRUCTURE) is a list of synsem objects corresponding to the
dependents which a lexical item selects for, including certain types of adverbial
phrases (Abeille and Godard 1997; Bouma et al. 2001; Kim and Sag 2002; van
Noord and Bouma 1994; Przepiorkowski 1999a, 1999b).
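As a rough picture only (nested dictionaries are our encoding, not HPSG's feature logic, and the concrete values are invented), such a sign can be written out as attribute-value pairs:

```python
# An illustrative sketch (ours) of an HPSG-style sign as nested
# attribute-value pairs, following the feature names in the text.

sign = {
    "PHON": ["visit"],
    "SYNSEM": {
        "LOC": {
            "CAT": {"HEAD": {"part-of-speech": "verb"}},
            "CONT": {"relation": "visit"},
        },
        "SLASH": set(),  # long-distance dependency information
    },
    "ARG-ST": ["NP[subj]", "NP[obj]"],  # selected dependents, schematic
}

def path(fs: dict, *attrs: str):
    """Follow a path such as SYNSEM|LOC|CAT|HEAD through the sign."""
    for a in attrs:
        fs = fs[a]
    return fs

assert path(sign, "SYNSEM", "LOC", "CAT", "HEAD")["part-of-speech"] == "verb"
```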
Sag (1997) and Ginzburg and Sag (2000) hypothesize that a rich network of
phrase-structure constructions with associated constraints is part of the
grammars of natural languages. The hierarchies allow properties that are
shared between different phrasal types to be spelled out just once. The portion
of the hierarchy which will be relevant to adjunct preposing is represented in
(31).

(31)

Phrases are classified along two dimensions: clausality and headedness. The
clausality dimension distinguishes various kinds of clauses. Clauses are subject
to the constraint that they convey a message. Core clauses are one subtype,
defined not to be modifiers and to be headed by finite verbal forms or the
auxiliary to. The headedness dimension classifies phrases on the basis of their
head-dependent properties, i.e. whether they are headed or not, what kind of
daughters they have, etc. A general property of headed phrases (hd-ph) is the
presence of a head daughter, and this phrasal type is constrained as follows:

(32) Generalized Head Feature Principle (GHFP)

     hd-ph:  [SYNSEM /[1]] → ... H[SYNSEM /[1]] ...

The GENERALIZED HEAD FEATURE PRINCIPLE (GHFP) states that the
SYNSEM value of the mother of a headed phrase and that of its head daughter
should be identical by default. A subtype of hd-ph, head-filler-phrase (hd-fill-ph),
is associated with the following constraint:

(33) hd-fill-ph:

This constraint requires the following properties. First, the head daughter must
be a verbal projection. Second, one member of the head daughter's SLASH set
is identified with the LOCAL value of the filler daughter. Third, other elements
that might be in the head daughter's SLASH must constitute the SLASH value
of the mother. Ginzburg and Sag (2000) treat the topicalization constructions as
a subtype of hd-fill-ph, and posit a type topicalization-clause (top-cl). It is also
assumed to be a subtype of core-cl. The type top-cl is subject to the construction-
particular constraint which takes the following form:

(34) top-cl:

Topicalized clauses have an independent ([IC +]) finite clause as a head
daughter. Consider (35) for example:

(35) a. Problems of this sort, our analysis would never account for.
b. * She subtly suggested [problems of this sort, our analysis would never
account for].
(Ginzburg and Sag 2000: 50)

The topicalized sentence in (35a) is an independent clause (i.e. [INDEPEN-
DENT-CLAUSE (IC) +]), hence its head daughter our analysis would never
account for has [IC +]. A clause has the [IC -] specification in an embedded
environment, and hence the embedded clause in (35b) is [IC -].
Topicalization of such a clause is ruled out by (34). The filler daughter of the
topicalized clause is constrained to be [WH {}], the effect of which is to prevent
any wh-words from appearing as the filler or an element contained within the
filler. The constraints introduced above are unified to characterize the
topicalized clause constructions.
Given the above constraints, a sentence with a preposed adjunct will have
something like the following structure (Bouma et al. 2001; Kim and Sag 2002):

(36)

As noted above, certain types of adverbial phrases are selected by the verbal
head and listed in the ARG-ST list, along with true arguments. Thus, adjunct-
preposing and standard cases of topicalization can be given a unified treatment.
The ARG-ST of the verb visit thus contains an adverbial element, whose synsem
is specified as a gap-ss. Gap-ss, a subtype of synsem, is specified to have a
nonempty value for the feature SLASH. Its LOC value corresponds to its
SLASH value, as indicated by the shared value [1]. The ARGUMENT REALI-
ZATION PRINCIPLE ensures that all arguments, except for a gap-ss, are realized
on the appropriate valence list (i.e. SUBJ(ECT), COMP(LEMENT)S or
SP(ECIFIER)), and hence are selected by a head. Note that in (36) the gap-ss in
the ARG-ST list of visit does not appear in a COMPS list. The nonempty
SLASH value is incorporated into the verb's SLASH value.7 The verb's
SLASH value is projected upwards in a syntactic tree from the head daughter to
mother, due to the GHFP. The termination of this transmission, which is
effected by subtypes of the hd-fill-ph constructions, occurs at an appropriate
point higher in the tree: a dislocated constituent specified as [LOC [1]]
combines with the head that has the property specified in the constraint for hd-
fill-ph in (33).
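The life cycle of a SLASH value, from introduction at the gap to termination at the filler, can be sketched procedurally; the set-of-strings encoding below is ours and purely illustrative:

```python
# A sketch of the SLASH mechanism just described: a gap puts its LOCAL
# value into SLASH, SLASH passes from head daughter to mother (GHFP),
# and a head-filler phrase removes the member matched by the filler.

def project_slash(head_slash: set[str]) -> set[str]:
    """GHFP (by default): the mother's SLASH is the head daughter's."""
    return set(head_slash)

def head_filler(filler_local: str, head_slash: set[str]) -> set[str]:
    """(33): one member of the head's SLASH is identified with the
    filler's LOCAL value; the rest becomes the mother's SLASH."""
    assert filler_local in head_slash
    return head_slash - {filler_local}

slash = {"AdvP[now]"}        # introduced by the gap in ARG-ST
slash = project_slash(slash)  # percolates up the verbal projection
assert head_filler("AdvP[now]", slash) == set()  # terminated by filler
```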
Now we can consider how this approach might accommodate the asymmetry
between main and subordinate wh-interrogatives. The data observed in the last
section can be summarized as (37):

(37) Distribution of SAI operator, wh-element and topic (based on Chung and
Kim 2003)

              Main clause    Embedded clause
  TOP < WH    ok (16a)       * (16b)
  WH < TOP    * (24)         ok (25)
  OP < WH     * (21)         * (20)
  WH < OP     * (23a)        ok (23b)

We will begin with the asymmetry in terms of the interaction of a topic and a
wh-element. The relevant data is repeated here for convenience with the labels
and brackets added:

(38) a. [S1 Now [S2 what do we need]]?
     b. * He told us [S1 now [S2 what we need]].

S1 is composed of the topic filler and the clausal head, S2. S2 of the two
sentences in (38) is of the type ns-wh-int-cl. What we need to do is to check
the compatibility of a clause of the type top-cl and that of the type ns-wh-int-cl,
with the latter being the head of the former. We saw above that clauses of the
type top-cl are constrained by various constraints; the unification of the
constraints is represented as follows:

(39)

Of note here is that the LOC value of the mother [2] is shared with that of the
head, due to the GHFP (32). This means that the head daughter, in this case a
clause of the type ns-wh-int-cl, should have a finite verb as its head and its IC
value is +. According to the hierarchy in (31), a clause of this type is
characterized as the unification of core-cl, int-cl, hd-ph, hd-fill-ph, wh-int-cl and
ns-wh-int-cl. The following structure is the result of the unification, but is
simplified with the details irrelevant to the discussion omitted:

(40)

The shared value between the features IC and INV(ERTED) guarantees that if
the clause of this type is inverted ([INV +]) then its IC value is +, that is, it
appears in a main clause; if it is uninverted ([INV -]) then it should be in an
embedded clause ([IC -]). The S2 of (38a), the head daughter of the whole
clause, is inverted, so its INV value is +, and hence its IC value is +. This
satisfies the requirement stated in (39) that the head daughter of a topicalization
construction is an independent clause.
The S2 in (38b) is an instance of ns-wh-int-cl as in the previous case, but it is
not inverted (i.e. [INV -]) in this case. The S2 should then be specified as [IC
-] due to constraint (40). As we saw above, the head daughter of a
topicalization construction should be [IC +]. This is the reason why the
embedded interrogative does not allow topicalization. Under Ginzburg and
Sag's (2000) analysis, the asymmetry between main and subordinate wh-
interrogatives in terms of adjunct preposing is thus due to the conflict between
the requirements of the topicalization constructions and the embedded
interrogative constructions: the former requires [IC +] while the latter is
specified as [IC -].
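Because the interaction reduces to two boolean features, it can be simulated directly. The sketch below (our encoding) implements just the two constraints at stake: (40) identifies IC with INV for an ns-wh-int-cl, and (39) with the GHFP requires the head daughter of a top-cl to be [IC +]:

```python
# An illustrative reduction (ours) of the constraints discussed above.
# For an ns-wh-int-cl head daughter, (40) makes IC share INV's value;
# (39) plus the GHFP then demands [IC +] of the head of a top-cl.

def topicalization_licensed(head_inverted: bool, embedded: bool) -> bool:
    ic = head_inverted          # (40): IC and INV are identified
    if embedded and ic:         # embedded clauses are [IC -], so an
        return False            # inverted head cannot occur there
    return ic                   # (39)/GHFP: top-cl needs [IC +]

# (38a) "Now what do we need?": inverted head in a main clause.
assert topicalization_licensed(head_inverted=True, embedded=False)
# (38b) "*He told us now what we need": uninverted, hence [IC -],
# conflicting with the [IC +] demanded by top-cl.
assert not topicalization_licensed(head_inverted=False, embedded=True)
```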
We will move on to the data problematic for a WG approach. Let us first
consider how Ginzburg and Sag's (2000) approach might deal with the
asymmetry in terms of the order WH < TOPIC. As we observed in (24), the
WH < TOPIC order is ungrammatical in main clauses. The data is repeated
in (41), with the square brackets and the labels added:

(41) a. * [S1 To whom, [S2 a book like this, would you give?]]
     b. * [S1 For what kind of jobs [S2 during the vacation would you go into the
        office?]]

As we observed in (25), however, the same order is acceptable in subordinate
clauses. The data is repeated in (42):

(42) a. the man [S1 to whom, [S2 liberty, we could never grant]]
     b. ? I wonder [S1 to whom [S2 this book, Bill should give.]]
     c. I was wondering [S1 for which job, [S2 during the vacation, I should go
        into the office.]]
In (41) and (42), S1 is an instance of ns-wh-int-cl, and its head daughter S2 is of
the type top-cl, so what we need to do is to check the compatibility of top-cl as a
head of ns-wh-int-cl. (40) states that the CAT value of ns-wh-int-cl should be
shared by that of its head, top-cl in this case. The v and clausal specifications are
compatible with those of top-cl. S1 in the sentences in (41) has [IC +] since it is
a main clause. Its clausal head S2, therefore, should also have [IC +], according
to (40). This indicates that the feature structure description given for the head
daughter does not violate the constraint for top-cl in (39). Thus, their analysis
makes the wrong prediction that the sentences in (41) are grammatical.
Since subordinate interrogatives cannot appear independently, S1 in (42)
has the [IC -] specification, and so does its head daughter top-cl. As we saw
above, however, top-cl has [IC +]. Therefore, the ungrammaticality of (42) is
predicted; so we have the wrong prediction again.
We will next turn to the interaction of a preposed operator of SAI clauses and
a wh-element. As we observed in (20) and (21), the OP < WH order is excluded in
both main and subordinate clauses. We also observed in (23) that the WH <
OP order is excluded in main clauses, but grammatical in subordinate clauses.
The relevant data is repeated here in (43) and (44), with square brackets and
labels added for expository purposes:

(43) a. * [S1 In no way, [S2 why would Robin volunteer]]?
     b. * I wonder [S1 only with great difficulty [S2 on which table would she put
        the big rock]].

(44) a. * [S1 Why, [S2 in no way would Robin volunteer]]?
     b. Lees wonders [S1 why [S2 under no circumstances would Robin
        volunteer]].

It is not clear exactly what sort of constraints preposed operators must satisfy in
Ginzburg and Sag's (2000) system, but it is clear that the S2 in (43a, b) and the S1
in (44a, b) are clauses of the type ns-wh-int-cl. Therefore, they should at least
satisfy constraint (40). As we saw above, this constraint guarantees that the clause
of this type is inverted ([INV +]) if it is in a main clause ([IC +]) and that it is
uninverted ([INV -]) if it is in an embedded clause ([IC -]). All the
occurrences of ns-wh-int-cl in (43) and (44) are inverted, so they all should be
independent ([IC +]), and that means they cannot appear in subordinate clauses.
This correctly predicts that (43b) is ungrammatical, but it is problematic for (44b);
we have here an example of a clause of the type ns-wh-int-cl which appears in a
subordinate clause ([IC -]) but is inverted ([INV +]). Nothing in Ginzburg and
Sag's (2000) constraints rules out the (a) examples in (43) and (44).
It seems, then, that an approach to the asymmetry between main and
subordinate wh-interrogatives within the framework of Ginzburg and Sag (2000)
has some problems.

4. A Linearization HPSG Approach


An analysis of English left peripheral elements given by Chung and Kim (2003)
is based on a version of HPSG, the so-called Linearization-based HPSG.
In this framework, word order is determined not at the level of the local tree,
but in a separate level of 'order domains', an ordered list of elements that
contain at least phonological and categorical information (see, e. g. Pollard et al.
1994; Reape 1994; and Kathol 2000). The list can include elements from
several local trees. Order domains are given as the value of the attribute
DOM(AIN). At each level of syntactic combination, the order domain of the
mother category is computed from the order domains of the daughter
constituents. The domain elements of a daughter may be COMPACTED to form a
single element in the order domain of the mother or they may just become
elements in the mother's order domain. In the latter case the mother has more
domain elements than the daughters. For example, let us consider the following
representation for the sentence Is the girl coming? (Borsley and Kathol 2000):

(45)

The VP is coming has two daughters and its domain contains two elements, one
for is and one for coming. The top S node also has two daughters, but its order
domain contains three elements. This is because the VP's domain elements
have just become elements in the S's order domain, whereas those of the NP
are compacted into one single domain element, which ensures the continuity of
the NP. Discontinuity is allowed if the domain elements are not compacted: is
and coming are discontinuous in the order domain of the S.
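The composition step can be pictured with a short sketch (ours, and deliberately simplified: we merely concatenate the merged elements, whereas in Linearization HPSG their relative order is fixed by linear precedence constraints):

```python
# A sketch of order-domain composition: a daughter's domain elements
# either become elements of the mother's domain directly or are first
# COMPACTED into a single element, keeping the daughter continuous.

def compact(domain: list[str]) -> list[str]:
    """Fuse a daughter's domain elements into one domain element."""
    return [" ".join(domain)]

def mother_domain(daughters: list[tuple[list[str], bool]]) -> list[str]:
    result: list[str] = []
    for dom, is_compacted in daughters:
        result.extend(compact(dom) if is_compacted else dom)
    return result

# "Is the girl coming?": the NP's domain is compacted, so 'the girl'
# stays continuous; the VP's elements 'is' and 'coming' enter the S's
# domain separately and may therefore end up discontinuous.
np, vp = ["the", "girl"], ["is", "coming"]
assert mother_domain([(vp, False), (np, True)]) == ["is", "coming", "the girl"]
```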
The notable feature of Chung and Kim's (2003) analysis is that each element
of a clausal order domain is uniquely marked for the region that it belongs to
(Borsley and Kathol 2000; Kathol 2000, 2002; Penn 1999). The positional
assignment is determined by the following constructional constraints:

(46) a.

     b.

     c.
Wh-elements are assigned to position 3 in main clauses, and those in
embedded (interrogative and relative) clauses are put in position 1. Topic
elements are always assigned to position 2, and operators are always
assigned to position 3.8 Thus, left peripheral elements in English have the
following distribution:

(47) Distribution of English left peripheral elements (Chung and Kim 2003)

                     Marker field    Topic field    Focus field
                     1               2              3
  Main clause                        TOP            WH/OP
  Embedded clause    WH/COMP        TOP            OP

An embedded wh-phrase competes for position 1 with a complementizer. This
competition accounts for the fact that these two elements never co-occur in
English (cf. Chomsky and Lasnik 1977). They further assume the TOPOLOGI-
CAL LINEAR PRECEDENCE CONSTRAINT, a linear precedence constraint which
is imposed on the elements in order domains:

(48) Topological Linear Precedence Constraint

     1 < 2 < 3

(48) states that the elements in position 1 should precede those in 2, which
should in turn precede those in 3.
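Taken together, the positional assignments in (46)/(47) and the constraint in (48) make predictions that can be checked mechanically; the integer encoding below is our own illustration:

```python
# A sketch of (46)/(47) plus (48): each left-peripheral element gets a
# position, and a sequence is licensed only if the positions increase
# strictly (so no two elements compete for the same position).

def position(element: str, embedded: bool) -> int:
    if element == "WH":
        return 1 if embedded else 3  # embedded vs. main clause wh
    if element == "TOP":
        return 2                     # topics: position 2 throughout
    if element == "OP":
        return 3                     # operators: position 3 throughout
    raise ValueError(element)

def licensed(order: list[str], embedded: bool) -> bool:
    pos = [position(e, embedded) for e in order]
    return all(p < q for p, q in zip(pos, pos[1:]))

assert licensed(["TOP", "WH"], embedded=False)      # (16a)
assert not licensed(["TOP", "WH"], embedded=True)   # (16b)
assert not licensed(["OP", "WH"], embedded=False)   # (21): 3 before 3
assert licensed(["WH", "OP"], embedded=True)        # (23b): 1 < 3
assert not licensed(["WH", "TOP"], embedded=False)  # (24): 3 before 2
assert licensed(["WH", "TOP"], embedded=True)       # (25): 1 < 2
```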
Now let us consider how this approach might accommodate the asymmetry
between main and subordinate wh-interrogatives. The summary of the relevant
data given in (37) is repeated here in (49):

(49) Distribution of SAI operator, wh-element and topic

              Main clause    Embedded clause
  TOP < WH    ok (16a)       * (16b)
  WH < TOP    * (24)         ok (25)
  OP < WH     * (21)         * (20)
  WH < OP     * (23a)        ok (23b)

As introduced above, Chung and Kim's approach assumes that a topic is in
position 2 and a wh-element is in position 3 in main clauses. This accounts for
the grammaticality of the TOP < WH order in main clauses since it has the
following representation:

(50)
This order domain does not violate the Topological Linear Precedence
Constraint in (48), and hence accounts for the grammaticality of (16a), repeated
here for convenience:

(51) Now what do we need?

Let us turn to embedded clauses, where the TOP < WH order is
ungrammatical. (46c) states that wh-elements are assigned to position 1 in
embedded clauses, whereas topic elements are always in position 2, whether
embedded or not. Thus, the TOP < WH order in embedded clauses leads to
the following order domain:

(52)

(52) violates the Topological Linear Precedence Constraint since its DOM
element marked 2 precedes that marked 1. This explains the ungrammaticality
of (16b).

(53) *He told us now what we need.

Thus, Chung and Kim's (2003) approach can accommodate the asymmetry
between main and embedded clauses with respect to a topic and a wh-element.
The fact that WH < TOP is excluded from the main clauses is accounted for
along the same lines. This linear order leads to the following order domain:

(54)

Here, the element with 3 precedes that with 2, in violation of (48); this
accounts for the ungrammaticality of the sentences in (24).

(55) a. * To whom, a book like this, would you give?
     b. * For what kind of jobs during the vacation would you go into the
        office?

For embedded clauses, on the other hand, (46a) and (46c) require that a topic
should be in position 2 and a wh-phrase in position 1, respectively. The
resulting order is (56):

(56)
This conforms to constraint (48), which correctly predicts the grammaticality of
the WH < TOP order in embedded clauses, illustrated by (25), which is
repeated below:

(57) a. the man to whom, liberty, we could never grant
     b. ? I wonder to whom this book, Bill should give.
     c. I was wondering for which job, during the vacation, I should go into
        the office.

Constraint (46b) states that wh-elements and operators are both assigned to
position 3 in main clauses. This accounts for the ungrammaticality of the WH
< OP and the OP < WH orders in main clauses: the competition for a single
position between these two elements entails that they cannot co-occur.

(58) a. * In no way, why would Robin volunteer?
     b. * Only with great difficulty on which table would she put the big rock?

(59) * Why, in no way would Robin volunteer?

A wh-phrase is assigned to position 1 in embedded clauses while operators are
assigned to position 3, embedded or not. This accounts for the grammaticality
of the WH < OP order since its order domain has the 1 < 3 linear order.

(60) Lees wonders why under no circumstances would Robin volunteer.

The OP < WH order is correctly excluded since it entails 3 < 1, which violates
(48).

(61) a. * Lees wonders under no circumstances at all why would Robin
        volunteer.
     b. * I wonder only with great difficulty on which table would she put the
        big rock.

Thus, a linearization-based HPSG approach by Chung and Kim (2003) can
provide an account for all the relevant data, including those problematic for an
approach in Word Grammar and for the framework of Ginzburg and Sag (2000).
Another advantage of Chung and Kim's (2003) approach is that it can also
predict the grammaticality pattern with respect to TOP < OP and OP < TOP. The
positional assignment represented in (47) predicts that a topic precedes an
operator in both main and embedded clauses, and it also predicts, with the
Topological Linear Precedence Constraint (48), the ungrammaticality of the
OP < TOP order in both types of clauses. This is borne out by the following
examples, which illustrate that TOP < OP is no problem but OP < TOP is
ungrammatical in main clauses (62) and in embedded clauses (63):

(62) a. To John, nothing would we give.
     b. * Nothing, to John, would we give.

(63) a. He said that beans, never in his life, had he been able to stand.
     b. * He said that never in his life, beans, had he been able to stand.
5. Concluding Remarks
In this chapter, we have looked at three different approaches to the
asymmetries between main and embedded clauses with respect to the elements
in the left periphery of a clause. We compared the dependency-based
approach developed within WG (Hudson 2003) with the Constructional
HPSG approach along the lines of Ginzburg and Sag (2000), and the
Linearization HPSG analysis by Chung and Kim (2003), and argued that the
approaches within WG and the Constructional HPSG have some problems in
dealing with the relevant facts, but that Linearization HPSG provides a
straightforward account of them.
As we discussed at the outset of this chapter, dependency structure is simpler
than phrase structure in that the former only includes information on the
relationship between individual words, but the latter involves additional
information about constituency. Other things being equal, simpler representa-
tions are preferable to more complex representations. This might lead to the
conclusion that WG is potentially superior to HPSG. We have shown,
however, that neither the dependency-based analysis in WG nor the constituency-
based analysis in Constructional HPSG is satisfactory in accounting for the
linear order facts. These two frameworks follow the traditional distinction
between the rules for word order and the rules defining the combinations of
elements.9 We should note, however, that the rules for word order are applied
to local trees in Constructional HPSG and to dependency arrows in WG.
Sisters must be adjacent in Constructional HPSG whereas in WG the parent
and its dependent can only be separated by elements that directly or indirectly
depend on one of them. This means that the linear order is still closely tied to
the combinatorial structure. That these frameworks cannot accommodate
certain linear order facts suggests that neither dependency structure nor phrase
structure is appropriate as the locus of linear representation. We saw above that
the Linearization HPSG analysis gives a satisfactory account of the linear order of
elements in the left periphery. This conclusion suggests that we need to
separate linear order from combinatorial mechanisms more radically than the
above traditional separation of the rules.

References
Abeillé, Anne and Godard, Danièle (1997), 'The syntax of French negative adverbs',
in Danielle Forget, Paul Hirschbühler, France Martineau and Maria L. Rivero
(eds), Negation and Polarity: Syntax and Semantics. Amsterdam: John Benjamins, pp.
1-17.
Baltin, Mark (1982), 'A landing site for movement rules'. Linguistic Inquiry, 13, 1-38.
Borsley, Robert D. (2004), 'An approach to English comparative correlatives', in Stefan
Müller (ed.), Proceedings of the HPSG04 Conference. Stanford: CSLI Publications, pp.
70-92.
Borsley, Robert D. and Kathol, Andreas (2000), 'Breton as a V2 language'. Linguistics,
38, 665-710.
Borsley, Robert D. and Przepiorkowski, Adam (eds) (1999), Slavic in Head-Driven
Phrase Structure Grammar. Stanford: CSLI Publications.
Bouma, Gosse, Malouf, Rob and Sag, Ivan A. (2001), 'Satisfying constraints on
extraction and adjunction'. Natural Language and Linguistic Theory, 19, 1-65.
Chomsky, Noam and Lasnik, Howard (1977), 'Filters and control'. Linguistic Inquiry, 8,
425-504.
Chung, Chan and Kim, Jong-Bok (2003), 'Capturing word order asymmetries in English
left-peripheral constructions: A domain-based approach', in Stefan Müller (ed.),
Proceedings of the 10th International Conference on Head-Driven Phrase Structure
Grammar. Stanford: CSLI Publications, pp. 68-87.
Ginzburg, Jonathan and Sag, Ivan A. (2000), Interrogative Investigations. Stanford: CSLI
Publications.
Haegeman, Liliane (2000), 'Inversion, non-adjacent inversion and adjuncts in CP'.
Transactions of the Philological Society, 98, 121-60.
Hudson, Richard A. (1990), English Word Grammar. Oxford: Blackwell.
— (1995), 'HPSG without PS?'. Available: www.phon.ucl.ac.uk/home/dick/unpub.htm.
(Accessed: 21 April 2005).
— (1999), 'Discontinuity'. Available: www.phon.ucl.ac.uk/home/dick/discont.htm.
(Accessed: 21 April 2005).
— (2003), 'Trouble on the left periphery'. Lingua, 113, 607-42.
— (2005, February 17 - last update), 'An Encyclopedia of English Grammar and Word
Grammar', (Word Grammar). Available: www.phon.ucl.ac.uk/home/dick/wg.htm.
(Accessed: 21 April 2005).
Kathol, Andreas (2000), Linear Syntax. Oxford: Oxford University Press.
— (2002), 'Linearization-based approach to inversion and verb-second phenomena in
English', in Proceedings of the 2002 LSK International Summer Conference Volume II:
Workshops on Complex Predicates, Inversion, and OT Phonology, pp. 223-34.
Kim, Jong-Bok and Sag, Ivan A. (2002), 'Negation without head-movement'. Natural
Language and Linguistic Theory, 20, 339-412.
Koizumi, Masatoshi (1995), 'Phrase Structure in Minimalist Syntax'. (Unpublished
doctoral dissertation, MIT).
van Noord, Gertjan and Bouma, Gosse (1994), 'Adjuncts and the processing of lexical
rules', in Fifteenth International Conference on Computational Linguistics (COLING
'94), pp. 250-6.
Penn, Gerald (1999), 'Linearization and WH-extraction in HPSG', in R. D. Borsley and
A. Przepiórkowski (eds), Slavic in Head-Driven Phrase Structure Grammar. Stanford:
CSLI Publications, pp. 149-82.
Pollard, Carl, Kasper, Robert and Levine, Robert (1994), Studies in Constituent Ordering:
towards a Theory of Linearization in Head-driven Phrase Structure Grammar. Research
Proposal to the National Science Foundation, Ohio State University.
Pollard, Carl and Sag, Ivan A. (1994), Head-Driven Phrase Structure Grammar. Chicago:
University of Chicago Press.
Przepiórkowski, Adam (1999a), 'On complements and adjuncts in Polish', in R. D.
Borsley and A. Przepiórkowski (eds), Slavic in Head-Driven Phrase Structure
Grammar. Stanford: CSLI Publications, pp. 183-210.
Przepiórkowski, Adam (1999b), 'On case assignment and "adjuncts as complements"',
in Gert Webelhuth, Jean-Pierre Koenig and Andreas Kathol (eds), Lexical and
Constructional Aspects of Linguistic Explanation. Stanford: CSLI Publications, pp.
231-45.
Reape, Michael (1994), 'Domain union and word order variation in German', in John
Nerbonne, Klaus Netter and Carl J. Pollard (eds), German in Head-Driven Phrase
Structure Grammar. Stanford: CSLI Publications, pp. 151-98.
Rizzi, Luigi (1997), 'On the fine structure of the left-periphery', in Liliane Haegeman
(ed.), Elements of Grammar. Dordrecht: Kluwer Academic Publishers, pp. 281-337.
Ross, John R. (1986), Infinite Syntax! New Jersey: Ablex Publishing Corporation.
Sag, Ivan A. (1997), 'English relative clause constructions'. Journal of Linguistics, 33,
431-84.
Webelhuth, Gert, Koenig, Jean-Pierre and Kathol, Andreas (eds) (1999), Lexical and
Constructional Aspects of Linguistic Explanation. Stanford: CSLI Publications.

Notes
* I would like to thank Bob Borsley and Kensei Sugayama for their helpful
comments. Any errors are those of the author.
1 For comparison of WG with an earlier version of HPSG (Pollard and Sag 1994),
see Hudson (1995).
2 In the current version of WG (Hudson 2005), the No-Tangling Principle has been
replaced with ORDER CONCORD, whose effects are essentially the same as for its
predecessor. In this chapter we will refer to the No-Tangling Principle.
3 The examples in the rest of this section are cited from Haegeman (2000) unless
otherwise indicated.
4 (26b) was provided for me by Bob Borsley (p.c.).
5 (28b) is not acceptable to some speakers (Borsley 2004).
6 The data in (28) could be accommodated in WG if we assumed a dependency
relation between the complementizer and the extractee (Borsley (p.c.); Sugayama
(p.c.)). Needless to say, however, an argument along these lines would
need to clarify the nature of this apparently ad hoc grammatical relation.
7 This amalgamation of the SLASH values is due to the SLASH-Amalgamation
Constraint of Ginzburg and Sag (2000: 169).

8 See Kathol (2002) for an alternative analysis for English clausal domains.
9 In constituency-based grammars such as HPSG, these two rule-types are LINEAR
PRECEDENCE RULES and IMMEDIATE DOMINANCE RULES.
Part II

Towards a Better Word Grammar


8. Structural and Distributional Heads
ANDREW ROSTA

Abstract
Heads of phrases are standardly diagnosed by both structural and distributional
criteria. This chapter argues that these criteria often conflict and that the notion
'head of a phrase' is in fact a conflation of two wholly distinct notions, 'structural
head' (SH) and 'distributional head' (DH). The SH is the root of the phrase and
is diagnosed by structural criteria (mainly word order and ellipsis). Additionally,
the distribution of the phrase may be conditioned by one or more words in the
phrase: these are DHs. The SH is often a DH, but there are many English
constructions in which a DH is not the SH and is instead a word subordinated
within the phrase. The chapter discusses a variety of these constructions,
including: that-clauses; pied-piping; degree words; attributive adjectives; determiners;
just, only, even; not, almost, never, all but; the type-of construction;
coordination; correlatives; adjuncts; subjects; empty categories.

1. Introduction
The central contention of this chapter is that a number of constructions in
English oblige us to recognize that the distribution of a phrase may be
determined by a word subordinated within the phrase, rather than by, as
commonly taken for granted, the structural head of a phrase - i.e. the highest
lexical node in the phrase. By a phrase's 'distribution' is meant the range of
environments - positions - in which it can occur. An example of such a
construction is pied-piping (discussed in section 8), as in (1a). The root of the
phrase in the midst of which throng of admirers is in, but it is by virtue of
containing which that it occupies its position before the inverted auxiliary, for
(1a) alternates with (1b), but not with (2a-2b).
(1) a. In the midst of which throng of admirers was she finally located?
b. Which throng of admirers was she finally located in the midst of?
(2) a. *In the midst of this throng of admirers was she finally located,
b. *This throng of admirers was she finally located in the midst of.
The head of a phrase is normally understood to be defined, and hence
diagnosed, by both structural and distributional criteria. But the notion 'head of a
phrase' is in fact a conflation of two wholly distinct notions: the distributional, or
'external', head, and the structural, or 'internal', head. These two types of head are
explained in sections 2-3. Although I use the term 'phrase' in a mostly theory-
neutral way, it is important to realize that it doesn't entail the Phrase Structure
Grammar notion that phrases are nonlexical nodes. In Word Grammar (WG),
which is the grammatical model that serves as a framework for the discussion of
grammatical analysis in this chapter, all nodes are lexical, and WG defines a
phrase as a word plus all the words that are subordinate to it.1 (A word's
'subordinates' are its 'descendants' in the syntactic tree, the nodes below it; its
'superordinates' are its 'ancestors', the nodes above it.) The words in a sentence
comprise all the nodes of a tree, and every subtree of the sentence tree is a phrase.

2. Structural Heads
A phrase's structural head (henceforth 'SH') is, as stated above, to be defined
as the highest lexical node in the phrase. In a model such as Word Grammar,
in which all nodes are lexical, the SH is, therefore, the root of the phrase's
tree structure. For determining which word is the root of a phrase, the
principal diagnostic is word order. Take the phrase eat chocolate: if chocolate is
the root then there cannot be a dependency between eat and a word that
follows chocolate; and if eat is the root then there cannot be a dependency
between chocolate and a word that precedes eat. The test shows that eat is the
root of eat chocolate:
(3) a. *Do Belgian eat chocolate. ['Do eat Belgian chocolate.']
b. *Do your eat chocolate. ['Do eat your chocolate.']
(4) a. Eat chocolate today,
b. Don't eat chocolate.
These restrictions follow from the general (and probably inviolable) principle of
grammar that requires phrases to be continuous: no parts of a phrase can be
separated from one another by an element that is not itself contained within the
phrase. The principle is discussed further in section 15. Diagrammatically, the
principle can conveniently be captured as a prohibition against a branch in the
syntactic tree structure crossing another, as in (5).
(5) *Do Belgian eat chocolate. ['Do eat Belgian chocolate.'] (diagrammed with crossing branches)
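The continuity principle lends itself to an operational restatement. The sketch below is my own illustrative encoding (Python; the representation and function name are not part of WG): a sentence is a sequence of word positions, deps maps each dependent's position to its parent's, and a phrase is discontinuous exactly when two branches cross.

    # A minimal sketch of the no-crossing-branches test.
    # Words are positions 0, 1, 2, ...; deps maps dependent -> parent.

    def crossing_branches(deps):
        """Return True if two dependency branches cross when drawn
        above the words, i.e. if some phrase is discontinuous."""
        arcs = [(min(d, p), max(d, p)) for d, p in deps.items()]
        return any(a < c < b < d
                   for (a, b) in arcs for (c, d) in arcs)

    # (5) '*Do Belgian eat chocolate': do(0) Belgian(1) eat(2) chocolate(3);
    # eat depends on do, chocolate on eat, Belgian on chocolate.
    print(crossing_branches({2: 0, 3: 2, 1: 3}))   # True: branches cross

    # 'Do eat Belgian chocolate': same dependencies, licit order.
    print(crossing_branches({1: 0, 3: 1, 2: 3}))   # False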

3. Distributional Heads
The distribution of a word (or phrase) is the range of grammatical environments
it can occur in. In the broadest sense, this includes a word's co-occurrence with
both its dependents, e.g. the fact that eat can occur with an object (eat chocolate),
and its regent, e.g. the fact that eat can be complement of an auxiliary (will eat).
(The term 'regent' is used in this chapter as the converse of 'dependent'.) But in
the narrower and more usual sense employed here, a word's distribution
essentially concerns what it can be a dependent of. 'Distribution' in the latter
sense contrasts with 'Valency' (or 'Selection'), which concerns what a word can
be regent of. As a first approximation, we can therefore say that the distribution
of X is the product of rules stating that such-and-such a regent permits or requires
a dependent belonging to a category that X belongs to. But the topic of this
chapter is such that instead of that first approximation we need, at least
pretheoretically, to formulate this in terms of the notion 'distributional head':
when a word permits or requires a dependent of category X, it permits or
requires a dependent that is a phrase whose distributional head is of category X.
Models of syntax have generally held that something is a distributional head
(henceforth 'DH') if and only if it is an SH - in other words, that a phrase has
just one sort of head, and that this single head determines both the structural
and the distributional properties of the phrase. But my first aim is to show that
the two sorts of head must be distinguished. Normally the two sorts of head
coincide, so that one word is both SH and DH of a phrase - i.e. that the root of
a phrase determines its distribution. This is generally known as 'endocentricity'.
But a fair number of constructions in English suggest that that norm cannot be
exceptionless. (And, as we will see later, once we acknowledge that the norm
has exceptions, there is reason to question whether it is in fact even much of a
norm at all. ) In these constructions, the SH is not the DH. This is exocentricity.
But the constructions involve a very particular kind of exocentricity: in them,
the DH is subordinate to the SH. That is, the distribution of the phrase is
determined not by the root of the phrase but by a word subordinated more or
less deeply within the phrase. I will call this phenomenon 'hypocentricity', since
the DH is below the SH in the tree.
Although the notion 'structural head', defined as the root of a phrase, has a
role in the formal analysis of hypocentricity, the notion 'distributional head'
does not, and is purely descriptive. This is because it turns out that a phrase
may have many distributional heads. This can be illustrated as follows. Section
11 argues that in 'determiner phrases' (i.e. 'noun phrases' in the traditional
sense), the determiner is SH and the noun is DH. This is illustrated in (6a),
where, as in subsequent examples, small capitals indicate the SH and italics the
DH. And section 8 argues that in pied-piping in wh-relative clauses, the wh-
word is DH, so SH and DH are as indicated in (6b-6c). But in (6c) the locus of
the DH also follows the pattern of (6a), giving (6d), where there is one SH, the,
and two DHs, news and which.
(6) a. [THE news] had just reached us
b. [NEWS of which] had just reached us
c. [THE news of which] had just reached us
d. [THE news of which] had just reached us
In the formal analysis of hypocentricity introduced in section 4 and presented
in full in section 6, a phrase's DHs are defined relationally relative to the SH.
So, although I said in section 1, in framing the discussion of hypocentricity,
that the notion 'head of phrase' is a conflation of two sorts of phrasal head, the
structural and the distributional head, it would be more accurate to say that the
traditional notion 'head of a phrase' remains valid, but that it is defined by
structural criteria, as the phrase root, and, contrary to what is usually thought,
not by distributional criteria. The distribution of a phrase may be conditioned
by categorial properties of its (structural) head, but it may equally well be
conditioned instead by categorial properties of words more or less deeply
subordinated within the phrase. As we will see in section 15 and section 17,
once the criteria for identifying the phrasal head are solely structural and not
distributional, we are led to transmogrify the familiar WG analyses of the
structure of many constructions into radically new but more satisfactory forms.
In the following sections I discuss a number of constructions where there is
prima facie reason to think that they might be hypocentric. It is beyond the
scope of this chapter to agonize over the details of the structure of each
construction, so by and large my identification of the SH in each construction
will rest more on prima facie plausibility than on detailed argumentation.

4. That-Clauses
In a that-clause, the SH is that, which explains why it must be at the extreme left
edge of the clause. But the DH of the that-clause is the finite complement of
that. The evidence for this is that the clausal complement of certain verbs, such
as require and demand, must be subjunctive.2 So the DH is the subjunctive word;
it is the presence of the subjunctive word that satisfies the selectional
requirements of require/demand:
(7) a. I require [THAT she be/*is here on time].
b. I demand [(THAT) she give/*gives an immediate apology].
A satisfactory analysis of this phenomenon is provided in Rosta (1994, 1997)
(from whose terminology I deviate in this chapter without further comment).
That is defined (in its lexical entry) as 'surrogate' of its complement, the finite
word. As a general rule, every word is also surrogate of itself; so the finite word
is surrogate of itself. Require/demand select for a complement that is surrogate of
a finite word. Since the surrogates of a finite word are itself and that (if it is
complement of that), the selectional requirements of require/demand can be
satisfied by that or by a finite word. Surrogacy accounts for some hypocentric
constructions, but not all. We return to this point in section 6.

5. Extent Operators
I adopt 'extent operator' as an ad hoc term to cover such items as all but, more
than, other than, almost, not, never, which do not necessarily form a natural
grammatical class but do at least have certain shared properties that warrant
their being discussed together here. Reasons will be given why, when an extent
operator modifies a predicative word, as in (8a-8f), or a number word, as in
(9a-9f), the extent operator appears to be SH and the number or predicative
word to be DH.
(8) a. She had [ALL but expired].
b. My rent has [MORE than doubled].
c. She was [OTHER than proud of herself].
d. My rent has [ALMOST doubled].
e. Her having [NOT had a happy childhood], he was inclined to be patient
with her.
f. Her having [NEVER seen the sea before], this was a real treat.
(9) a. [MORE than thirty] went.
b. [ALMOST thirty] went.
c. [BARELY thirty] went.
d. [OVER/UNDER thirty people] went.
e. [NOT many] know that.
f. [NOT two minutes] had elapsed before the bell rang.
The identification of the DH is probably not very controversial, but the
justification for it is most apparent in (8a, b, d, e, f), where auxiliary have requires a
past participle as its complement, and it is the DH that satisfies this
requirement. Note also that as demonstrated by (10a-10b), verbal number
inflection is triggered by the number of the DH rather than the SH or the
meaning of the whole phrase:
(10) a. [MORE than one] is/*are.
b. [LESS/FEWER than two] are/*is.
More controversial is the identification of the SH, and the evidence for this will
now be presented.
First of all, there is the evidence of meaning: the bracketed phrases could all
be described as 'semantically exocentric'. For instance, (8b, 8d) don't refer to
an event of doubling, and (9a-9d) don't refer to a quantity of 30. Rather,
meanings are roughly thus:
(8b, 8d): 'My rent has increased by a factor that is more/slightly less than 2'
(8c): 'She was in a state that is other than a state of being proud of herself'
(9a-d): 'a set whose cardinality is a number more than/almost/barely/over/
under thirty'
(9e): 'a set whose cardinality is a number that is not many'
(9f): 'a set (of minutes) whose cardinality is a number that is not two', or 'a
period that is not two minutes'
There is no prior theoretical reason to suppose that the 'semantic head' should
in general be the SH rather than the DH; if anything, one would expect the DH
to align with the semantic head, given that it is the DH that seems to be the
more visible from outside the phrase. But it remains the case that in these
constructions the extent operator is closest to being the semantic head. For
example all but in (8a) might be taken to mean 'an act that stops slightly short of
outright expiring', and in (9d), over/under might be taken to mean 'a number - a
place in number space - that is over/under thirty', just as under the table means
'the place under the table' in (11a-11c).3
(11) a. Under the table is an obvious place to hide.
b. Let's paint under the table black.
c. The pupils always cover under the table with chewing gum.

The second kind of evidence for the identification of the SH comes from
certain of the extent operators' more core variants that have a nominal
complement, notably all but NP, more than NP and over/under NP. It is easy to
demonstrate that in these variants, all, more and over/under are both SH and DH.
For instance, in (12a) we have me rather than I and are rather than am because
the subject is all rather than me/I.4
(12) a. All but me/*I are/*am to come.
b. More than me/*I are/*am here.
c. Over/Under us/*we seems/*seem unsuitable for the storage of
radioactive waste.
The third and most telling sort of evidence comes from ellipsis. Ellipsis
involves the deletion of the phonological content of some syntactic structure,
and it seems to operate rather as if (the phonology of) a branch of the syntactic
tree were snipped off. Thus if the phonological content of one node is deleted,
then so must be the phonological content of all nodes subordinate to it.5 So, if
we have established that a branch links two nodes, X and Y, and X's phonology
remains when Y's is deleted, it must follow that Y is subordinate to X. And we
find that with certain extent operators, including not, nonstandard never
(meaning 'not' rather than 'nowhen') and almost, their phonology can remain
when the phonology of the DH is deleted. (Words whose phonology is deleted
are shown here in square brackets.)
(13) a. %I would prefer that you not [be so rude].
b. %I know you want to do it, but try to not [do it].
c. Would I do it? I wouldn't not [do it].
d. We'll make him not [do it].
e. I know it's unmanly to flinch, but how can you stand there and not [flinch]?
f. You can't go out without knickers - not [go out without knickers] and still stay decent.
g. %She never [stole your cigarette lighter]. ['She didn't']
h. %Did she do it? No, but she almost [did it].
(13g) is dialectal, and I have also marked (13a, b, h) as subject to variant
judgements, because some speakers reject them, but all of (13a-13h) are
acceptable for some speakers, and that is what matters here. The conclusion is
that the deleted DH is subordinate to the extent operator, which is therefore
the SH.6
We have established, then, that the internal structure of these phrases is as
shown in (14a-14b). This raises two questions. The first, which is addressed in
section 6, concerns the structure of (15a-15b): how can structures (14a-14b) be
reconciled with the fact that it is perished that satisfies the selectional
requirements of had?

(14) a. all but perished
b. almost perished
(15) a. She had all but perished,
b. She had almost perished.
The second question concerns the structure of (16a-16b). Other things being
equal, we would expect (16a-16b) to be ungrammatical due to illicit word
order, as diagrammed in (17a-17b), while we would expect (18a-18b), whose
word order remedies the apparent illicitness of (17a-17b), to be grammatical.
(16) a. I know she all but perished,
b. I know she almost perished.
(17) a. I know she all but perished.
b. I know she almost perished.
(18) a. *I know all but she perished. ['She all but perished.']
b. *I know almost she perished. ['She almost perished.']

It seems, then, that (16a-16b) must involve something along the lines of
obligatory 'leftwards-extraposition' of the subject; the subject moves from its
ordinary position and ends up as a subordinate of the extent operator, as
diagrammed in (19a-19b).7 We return to this matter in section 17.2, where a
far more satisfactory solution is provided.

(19) a. I know she all but perished.
b. I know she almost perished.
(in both, the SUBJECT dependency runs from the extent operator to she)

6. Surrogates versus Proxies


In section 4 we observed that require/demand requires as its complement the
surrogate of a finite word. This requirement is satisfied by the finite word itself,
(20a), by clausal that, (20b), and by extent operators (20c). (The list is not
exhaustive. )
(20) a. She demanded he go.

b. She demanded that he go.

c. She demanded he almost go.



But as (21a-21c) show, the complement of clausal that can be a finite word, or
an extent operator, but not another that. The same pattern holds for
complements of extent operators, (22a-22f). (Structurally, almost in (21c) and
(22d) occurs in the position where the complement of but and that is expected.
For this reason, I conclude that almost (rather than perished) is indeed the
complement of but and that. And, as pointed out in section 5, almost is the
semantic head in almost perished: it means 'an event near to being an event of
perishing'.) Hence it cannot be the case that the selectional requirements of that
or of extent operators are such that any surrogate of a finite word will satisfy
them.

(21) a. I know that she went.

b. * I know that that she went.

c. I know that she almost went.

(22) a. I know that she almost perished.

b. *I know she almost that perished.

c. I know that she almost almost perished.

d. I know that she all but almost perished.

e. I know that she almost all but perished.

f. Anybody not not happy should raise their hands now.

(20-22) show that there are two types of hypocentric phrase. In one type, the
SH is surrogate of the DH, and the SH can be that, an extent operator, or the
DH, the finite word. In the other type, the SH can be an extent operator or the
DH, but not that. To capture this pattern, which, as we will see in later sections,
generalizes across many diverse constructions, we need to posit a subtype of
the Surrogate relation, which I will call 'Proxy'. So, whereas require/demand
select for a complement that is surrogate of a subjunctive word, that selects for a
complement that is proxy of a finite word. Likewise for extent operators: almost
and but (in all but) select for a complement that is proxy of a finite word (or of
whatever other sorts of word extent operators can modify).
In general, the format for selectional rules will be not (23a), but rather (23b-
23c). Rules of type (23a) seem to be surprisingly scarce: I am currently aware of
only one instance, which is discussed in section 16 (in examples (73a-73c)).

(23) a. the complement of X is a word of category Y
b. the complement of X is proxy of a word of category Y
c. the complement of X is surrogate of a word of category Y
Given that rules of form (23b-23c) are more cumbersome than the rules of
form (23a) that we are accustomed to, we will introduce an abbreviating
equivalent for (23b-23c), and say that X 'targets' Y for its complement (but,
implicitly, will accept Y's surrogate or proxy in lieu of Y); Y is X's 'complement
target'.
The key rules defining Surrogate and Proxy are (24a-24c):
(24) a. If X is proxy of Y, then X is surrogate of Y.
b. X is proxy of X.
c. If X is surrogate of Y, and Y is surrogate of Z, then X is surrogate of Z.
More specific rules of the grammar define what is surrogate or proxy of what:
the grammar defines that as surrogate (but not proxy) of its finite target, and it
defines extent operators as proxy of the modified word.
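Stated this way, (24a-24c) simply take the reflexive-transitive closure of the stipulated Proxy and Surrogate pairs, which can be made concrete with a short sketch. The encoding below is my own invention for illustration (including the name close_surrogate); the two stipulated pairs encode just the lexical facts stated in the previous paragraph.

    # Rules (24a-24c) as a closure computation.
    # Relations are sets of (X, Y) pairs, 'X is proxy/surrogate of Y'.

    def close_surrogate(proxy, surrogate, words):
        """Derive the full Surrogate relation from stipulated pairs."""
        s = set(surrogate) | set(proxy)   # (24a): every proxy is a surrogate
        s |= {(w, w) for w in words}      # (24b): X is proxy, hence surrogate, of X
        changed = True
        while changed:                    # (24c): close under transitivity
            changed = False
            for x, y in list(s):
                for y2, z in list(s):
                    if y == y2 and (x, z) not in s:
                        s.add((x, z))
                        changed = True
        return s

    words = {"that", "almost", "went"}
    proxy = {("almost", "went")}          # extent operator: proxy of modified word
    surrogate = {("that", "went")}        # clausal that: surrogate of finite target
    s = close_surrogate(proxy, surrogate, words)
    print(("almost", "went") in s)        # True
    print(("that", "went") in s)          # True

On this encoding, a selectional rule of type (23c) is satisfied by any X such that (X, Y) is in the derived Surrogate relation, whereas a rule of type (23b) consults only the (reflexive) Proxy pairs.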
Informally, we can distinguish between different degrees of hypocentricity.
In its strongest form, phrases [X[Y]] and [Y] are in free variation: i.e. [Z[X[Y]]]
is possible if and only if [Z[Y]] is possible. The analysis for this sort of case is
that X is proxy of Y, and the complement of Z must be proxy of Y. Extent
operators are examples of strong hypocentricity. In a weaker form of
hypocentricity, phrases [X[Y]] and [Y] are in free variation only in certain
environments, e. g. [Z[Y]] alternates with [Z[X[Y]]], but [W[Y]] does not
alternate with *[W[X[Y]]]. The analysis for this sort of case is that X is
surrogate of Y, and the complement of Z must be surrogate of Y. That-clauses
are an example of this weaker form. But there is potentially a still weaker form
of hypocentricity, 'quasihypocentricity', in which [Z[X[Y]]] does not alternate
with [Z[Y]] at all, but nonetheless Z and Y are sensitive to each other's
presence; for instance, it might be that X is eligible to be complement of Z only
if Y is complement of X, or it might be that the presence of Z is a condition for
Y's inflectional form, or vice versa. Some (somewhat rarefied) examples of
quasihypocentricity are discussed in sections 17.2-17.3, where the analysis of
quasihypocentricity is elaborated on a little.

7. Focusing Subjuncts: just, only, even


Just, only and even, called 'focusing subjuncts' in Quirk et al. (1985), behave
rather like extent operators with regard to hypocentricity. The DH of even Sophy
in (25a) is Sophy, and it is to the DH that the inflectional morphology is
sensitive, as (25b) shows. The DH is complement target of the subjunct, and
the subjunct is proxy of its complement target:

(25) a. [EVEN Sophy] would.
b. Even I/*me am/*is.


The identification of the focusing subjunct as the SH needs some justification.
If the subjunct is not SH, then the structure is one of (26a-26b). (Classing
focusing subjuncts as adverbials, as Quirk et al. (1985) do, implies (26b).)

(26) a. Even Sophy would. (with even as dependent of Sophy)
b. Even Sophy would. (with even as dependent of would)

(26a) fails to account for why the subjunct must be at the extreme edge of the
focused phrase. With the subjunct as SH, as in (25a), the ungrammaticality of
(27a) is predicted by the crossing branches. But no such prediction is made
with structure (26a), as in (27b):

(27) a. *Sophy even's parents would.
b. Sophy even's parents would.


(26b), with the subjunct as dependent of the verb, incorrectly predicts that (28-
29) should be ungrammatical. In (28a-28b) there are crossing branches. In
(29a-29b), Edgar does not occupy the position immediately following its regent
(gave and will), even though that normally results in ungrammaticality, as with
(30a-30b). But the incorrect predictions vanish if the structures are as in (31-
32).

(28) a. She stepped in only two puddles.
b. Pictures only of eyelashes were recommended.
(29) a. She gave even Edgar flowers.
b. Will even Edgar relent?
(30) a. *She gave today Edgar flowers.
b. *Will today Edgar relent?
(31) a. She stepped in only two puddles.
b. Pictures only of eyelashes were recommended.
(32) a. She gave even Edgar flowers.
b. Will even Edgar relent?

8. Pied-piping
The roots of the bracketed phrases in (33a-33b) are under and on. But the
phrases occupy their position before the inverted auxiliary by virtue of
containing a negative element, no, or an interrogative wh-word, which. The root
of the bracketed phrase in (33c) is on or should, depending on which is
subordinate to which in one's analysis, but the DH is which: the rule for finite
wh-relative clauses is that they contain a subject or topic phrase that contains a
relative wh-word. In contrast to certain other hypocentric constructions, the
semantic head of the phrase appears to be the DH, the wh-word, since
semantically it is the relationship of binding or equation that connects the
relative clause to the modificand.
(33) a. [UNDER no circumstances] would she consent.
b. [ON the corner of which streets] should we meet?
c. streets [on the corner of which we should meet]
As with the inside-out interrogative construction discussed in section 13, the SH
in pied-piping is surrogate of the DH, and - in relative clause pied-piping at
least - the surrogate relation is long-distance, i.e. there is no upper limit to the
number of nodes that can be on the path up the tree from DH to SH.

9. Degree Words
In the phrases bracketed in (34a-34d), the degree words (too, more, as) modify
the adjectives that are the distributional heads. If the DH were also the SH, we
would expect complements of the degree word to appear on the same side of
the adjective as the degree word does, as in the ungrammatical (35a-35d), in
order to avoid the sort of crossing branches diagrammed in (36).

(34) a. This is [TOO heavy for him to lift].


b. He is [TOO tough to shed the odd tear].
c. She is [MORE sophisticated than him].
d. She is [AS sophisticated as him].
(35) a. *This is too for him to lift heavy.
b. *He is too to shed the odd tear tough.
c. *She is more than him sophisticated.
d. *She is as as him sophisticated.

(36) He is too tough to shed the odd tear.

A standard solution to this problem would have the adjective as SH and the
complement of the degree word obligatorily extraposed. But as far as I am
aware, this solution is motivated solely by the lack of any mechanism for
handling hypocentricity, and not by independent evidence. By purely structural
and not distributional criteria, it is the degree word that is prime candidate for
being SH. Accordingly, we take the modificand to be complement target of the
degree word, and the degree word to be surrogate of its complement target.

10. Attributive Adjectives


The same kind of argument that suggests that degree words are structurally
superordinate to the adjectives they modify also suggests that attributive
adjectives are structurally superordinate to the common nouns they modify. In
(37a), to read is complement of easy, but if book is SH then we would expect the
word order to be impossible, (37b). The word order favours easy as SH, (37c).
Again as with degree words, a construction-specific rule of obligatory
extraposition is an alternative solution.

(37) a. an [EASY book to read]
b. an [EASY book to read] (with book as SH, giving crossing branches)
c. an [EASY book to read] (with easy as SH)

The nonextrapositional analysis copes well when a degree word modifies an
attributive adjective, for it correctly predicts that a complement of the degree
word must follow the noun modified by the adjective, as (38):

(38) a more sophisticated person than him

To rule out (39), the extrapositional analysis would have to posit that than him
first extraposes to become some sort of postdependent of sophisticated, in line
with the extraposition rule that applies to degree words, and then extraposes
again to become a postdependent of the noun, in line with the extraposition
rule that applies to attributive adjectives.
(39) *a more sophisticated than him person
The suggested analysis is that attributive adjectives are not adjuncts. Rather,
they take the proxy of a noun as their complement, and the adjective is proxy of
its complement. Further evidence for this analysis comes from ellipsis:
(40) She chose an easy puzzle and he chose a difficult [puzzle].

11. Determiner Phrases


It was suggested in section 10 that attributive adjectives target a noun for their
complement, the resulting phrase having the adjective as SH and the noun as
DH. In this section I present other examples of hypocentric NPs (or DPs).
Hudson (2004) notes the contrast (41a-41b). It is the presence of an
instance of the lexeme WAY that allows the noun phrase to be an adverbial.
Hence way is the DH:
(41) a. She did it this way.
b. *She did it this manner.
Generalizing beyond this example, the DH of the determiner phrase is the
noun that is its complement target. Evidence for this comes from extraposition
out of subjects, for only dependents of the DH can extrapose out of a subject.
For example, (42a) can yield (42b) by extraposition, since the extraposee is a
dependent of statement, the DH of the subject phrase. But in (42c), the DH of
the subject phrase is author, so extraposition of a dependent of statement is
ungrammatical. On the assumption, justified below, that the determiner is SH,
(42d) shows that it is not the case that dependents of the actual subject (the SH,
those) can extrapose. And the apparent exception presented by the
grammaticality of (42e), where a dependent of statement is extraposed but at
first glance the DH would seem to be sort, in fact serves to confirm that, as
argued below in section 12, sort of phrases are hypocentric, so statement is DH
of sort of statement and of the whole subject phrase.
(42) a. A statement that denies the allegation has been released.
b. A statement _ has been released [that denies the allegation].
c. *The author of a statement _ has been arrested [that denies the
allegation].
d. * Those _ have been released [statements that deny the allegation],
e. A curious sort of statement _ has been released [that denies the
allegation].
But it is the determiner that is SH. The evidence for this comes from word
order and ellipsis. The word order evidence is that the determiner always
occurs at the extreme left of the phrase, a fact that follows automatically if the
noun is subordinate to the determiner but that would be unexplained if the
determiner is subordinate to the noun. The ellipsis evidence is that the noun
but not the determiner can be deleted. Thus (43a-43b) are synonymous,
whereas (44a-44b) are not, and there is no reason to suppose that any
determiner is present in (44b).
(43) a. These scales are not working properly,
b. These [scales] are not working properly.
(44) a. This milk is off.
b. Milk is off.
Indeed, the ellipsis can perfectly well apply to way, as in (45a), the phonological
visibilia of a syntactic structure whose lexical content is given in (45b).
(45) a. You do it your way and I'll do it mine.
b. You do it you's way and I will do it me's [way].
There is not much in the way of argument against treating the determiner as
SH. Van Langendonck (1994) argues that treating the determiner as head in
this book/that book fails to capture the analogy with the adjective phrases this big/
that big. But section 9 has argued that in this big/that big, the SH is this/that, and
big is DH, so the analogy is captured.
There are other examples that can be used to make the point made by (41a-
41b), but none that are quite so convincing. The verb crane appears to require
X's neck as its object, but it's hard to prove that this is not merely the consequence
of the verb's meaning, which specifies that the cranee is the neck of the craner. A
somewhat more convincing example is (46): wreak for many speakers requires an
object whose DH is havoc, and no synonym will suffice in its stead.
(46) The storm will wreak the usual havoc/%devastation.
The same appears to hold for cognate objects:
(47) a. She smiled her usual smile/*grin.
b. She slept a deep and placid sleep/* slumber/* somnolence/*kip.
But the contrast in (48a-48c) shows havoc and cognate objects to be unlike way
adverbials, and makes it hard to maintain that the presence of havoc and smile in
(48b-48c) is a syntactic requirement.8
(48) a. *She did it something that fell short of a wholly sensible way.
b. The storm will wreak something that falls short of outright havoc.
c. She smiled something that fell short of the sweet smile we had come to
expect from her.
The relationship between determiner and noun is analogous to that between
clausal that and finite word. Just as multiple that is ungrammatical (See that
(*that) she does), so are multiple determiners: the (*the) book, or, more
plausibly, *a my book ['a book of mine']. Clausal that takes the proxy of a finite
word as its complement, and is surrogate of its complement. Likewise, the
determiner takes the proxy of a common noun as its complement, and is
surrogate of its complement.
This analysis predicts that (49) should be ungrammatical. I am not sure that
that prediction is correct, though.
(49) ?She did it the opposite of a sensible way.

12. The Type-of Construction
The bracketed phrase in (50a) is hypocentric. The SH and DH are as shown.
The SH is proxy of the DH. The of in this construction takes as its complement
the proxy of a common noun. Since determiners are surrogate but not proxy of
their complement target, this rules out (50b) (at least as an instance of this
hypocentric construction).
(50) a. these [TYPES/KINDS/SORTS/VARIETIES/MANNERS/CLASSES of dog]
b. * these types of a dog
Some evidence for the hypocentricity of this construction has already been
given in section 11. Further evidence is as follows.
First, there is the grammaticality of (51a-51b). The adverbial function of the
noun phrase is licensed by virtue of having way as its DH:
(51) a. Do it the usual sort of way.
b. Do it the same kind of way you always do.
Second, dog in (50) needn't have the coerced mass interpretation that it gets in
There was dog all over the road. Normally, a noun can receive a count interpretation
only if it is the complement target of a determiner; bare, determinerless nouns
must receive a mass interpretation.9 If types in (50) is proxy of dog, then dog can be
complement target of these and hence receive a count interpretation.
It seems that in type of X, type is optionally rather than obligatorily proxy of
X. (52a) is ambiguous between a reading equivalent to (52b), with cake receiving
a count interpretation, and a reading equivalent to (52c), with cake receiving a
mass interpretation. When it receives the count interpretation, cake (and type) is
complement target of a determiner (presumably a), and type is proxy of cake/
brick. When it receives the mass interpretation, type is not proxy of cake/brick,
and the only complement target of a is type.
(52) a. A strange type of cake was on display.
b. A cake of a strange type was on display.
c. Cake of a strange type was on display.
The third and last piece of evidence for the identification of the DH is as
follows. (53a) is paraphrasable as (53b), (54a) as (54b), and (56a) as (56b-
56c).10 But (55a) is trickier to paraphrase. (55b) is ungrammatical for some
reason.11 (55c)/(56b) is a possible paraphrase of (55a), but it is ambiguous,
because it also paraphrases (56a). The only unambiguous paraphrase of (55a) is
(55d). And in (55d) we find that these agrees in number with cakes but not type.
Hence cakes is DH of type of cakes.
(53) a. (a) cake of this type
b. this type of cake
(54) a. cake of these types
b. these types of cake
(55) a. cakes of this type
b. *this type of cakes
c. these types of cakes
d. %these type of cakes
(56) a. cakes of these types
b. these types of cakes
c. these types of cake

13. Inside-out Interrogatives


The italicized phrases in (57a-57f) are instances of what I will call the 'inside-
out interrogative' construction.
(57) a. She always chooses nobody can ever guess which item from the menu.
b. It was hidden in the middle of nobody could tell where.
c. She's been going out with I've no idea who.
d. She managed to escape nobody was able to fathom how.
e. She smokes goodness only knows how many cigarettes a day.
f. The drug makes you you can never be sure how virile.
The construction is functionally motivated by the impossibility of relativizing
out of an interrogative clause, as in (58), so from a functional if not a structural
perspective it ought to be seen as a kind of relative clause.
(58) *in the middle of somewhere_i nobody could tell which place __i was.
By all appearances these phrases have the internal structure of a clause that
itself contains an interrogative clause in which sluicing has occurred, as in (59a-
59f). This is why the structure is 'inside-out': (57b) means something like, if not
(59b), then at least 'It was hidden in the middle of a place such that nobody
could tell which place it was'.
(59) a. Nobody can ever guess which item from the menu [she always chooses].
b. Nobody could tell where [it was hidden in the middle of].
c. I've no idea who [she's been going out with].
d. Nobody was able to fathom how [she managed to escape].
e. Goodness only knows how many cigarettes a day [she smokes].
f. You can never be sure how virile [the drug makes you].
But inside-out interrogative phrases have a distribution equivalent to that of
the interrogative wh-word they contain, as in (60a-60f) - setting aside for a
moment the shift to question-meaning.
(60) a. She always chooses which item from the menu?
b. It was hidden in the middle of where?
c. She's been going out with who?
d. She managed to escape how?
e. She smokes how many cigarettes a day?
f. The drug makes you how virile?
To put it another way, this in (61a-61f) can be replaced by an inside-out
interrogative to yield (57a-57f).
(61) a. She always chooses this item from the menu.
b. It was hidden in the middle of this place.
c. She's been going out with this person.
d. She managed to escape this way.
e. She smokes this many cigarettes a day.
f. The drug makes you this virile.
The SH of the inside-out interrogative is the root of the clause, i.e. a proxy of
a finite word. The essence of the construction is that this proxy of a finite word
is licensed to be surrogate of an interrogative wh-word that is complement of (a
subordinate of) the finite word. Since the SH is surrogate of the wh-word, the
SH is also surrogate of whatever the wh-word is surrogate of. The key surrogacy
relations in (57a-57f) are indicated by the dotted arrows in (62a-62f), which
assume that where, who and manner (but not degree) how are the phonological
expression of what is, syntactically, which place, which body (meaning 'person', as
in somebody) and which way. The SH of the bracketed phrases is in small
capitals.

(62) a. She always chooses [nobody CAN ever guess which item from the menu].

b. It was hidden in the middle of [nobody COULD tell which place].

c. She's been going out with [I'VE no idea which body].

d. She managed to escape [nobody WAS able to fathom which way].

e. She smokes [goodness only KNOWS how many cigarettes a day].

f. The drug makes you [you CAN never be sure how virile].

Section 11 explains why which is surrogate of items/place/body/way. Section 9
explains why how is surrogate of many and virile. Section 10 explains why many
is surrogate of cigarettes (on the assumption that many is some kind of attributive
adjective). Because how is surrogate of many, and many is surrogate of cigarettes,
how is surrogate of cigarettes. Hence, in (62a, b, c, e), the SH in small capitals is
surrogate of a noun, thus satisfying the requirement of chooses, of and with for a
complement that is surrogate of a noun. In (62d), the SH is surrogate of way,
which makes it eligible to function as a manner adverbial. In (62f), the SH in
small capitals is surrogate of an adjective, thus satisfying the requirement of
makes for a complement that is surrogate of an adjective.12

14. 'Empty Categories'


WG has so far not embraced the empty categories so beloved of other models,
chiefly Transformational Grammar. But there is no fundamental incompatibility
between WG and empty categories, if empty categories are taken to be
phonologyless words. As I will briefly detail below, empty categories would be a
beneficial enhancement to WG, so it is worth considering how they would work in
WG. As also explained below, though, they do raise a certain problem, but this
problem is solvable by means of the Proxy relation, though not by means of
hypocentricity. This is why empty categories warrant a short section in this chapter.
Since all nodes in WG are words, the WG counterpart of empty categories
would be a word, an instance of the lexical item '<e>', which is phonologyless
and has the semantic property of expressing a variable.13 (Phonologyless words
are notated within angle brackets.)
Positing <e> affords both better analyses of the data, and significant
simplifications to the overall model. The principal simplification comes if
syntactically bound <e> occurs in positions where, in a transformational
model, traces (or other bound empty categories) occur. This would give us the
sort of structure shown in (63a), in contrast to the traditional WG analysis
shown in (63b). ('Binder' is, needless to say, a syntactical relation between a
word and another word that binds it.)
(63) a. What did she say he had been hoping to eat <e>? (OBJECT from eat to <e>; BINDER from what to <e>)
b. What did she say he had been hoping to eat? (OBJECT from eat to what)

Traditional WG makes a distinction between dependencies - dependency


tokens, that is, not dependency types - that form branches of the sentence tree,
and dependencies that don't. For example the object dependency from eat to
what in (63b) doesn't form a branch in the tree. Word order rules apply only to
dependencies that form branches. (64b) is ungrammatical because the indirect
object (every child in the class) must follow its regent, give. (64c) is ungrammatical
because the indirect object must precede the direct object (a gold star). But
(64d-64e) are grammatical, even though the indirect object does not follow
given in (64d) and does not precede the direct object in (64e), because the
indirect object is not a branch dependent of given.

(64) a. The teacher will give every child in the class a gold star.
b. *The teacher will every child in the class give a gold star.
c. *The teacher will give a gold star every child in the class.
d. Every child in the class was given a gold star.
e. Also given a gold star were all the children in the class.

But the <e> analysis allows us to do away with the distinction between branch
and nonbranch dependencies: with the sole exception of Binder, all
dependencies are branches. The syntactic structure of a sentence is just a
tree with labelled branches, supplemented by nonbranch relations of types
Binder, Surrogate and Proxy. (Even the branch labels are potentially
redundant, given that a branch is distinguished from its siblings by its position.)
Thus, the whole apparatus of syntactic structure can be significantly simplified,
for the price of merely one extra lexical item among thousands.
On the assumption that unbound <e> is interpreted as 'something/
someone', we are then in a position to posit structures for (65a-65d)14 that yield
the meaning that the sentences actually have. Furthermore, the presence of
<e> in (65c-65d) provides a way to capture the fact that even though there is
no overt or deleted object of keep or subject of alive, semantically the object of
keep is still understood to be the subject of alive. ((65c) is the structure one
would have if unbound <e> is added to otherwise orthodox WG. (65d) is the
structure I am proposing.)

(65) a. She was reading <e>.
b. Thou shalt not kill <e>...
c. ... but need'st not strive officiously to keep <e> alive. (with <e> as SUBJECT of alive)
d. ... but need'st not strive officiously to keep <e> <e> alive. (the first <e>, object of keep, is BINDER of the second, subject of alive)

The main snag with <e> has to do with the phenomenon of connectivity,
whereby traces have to have the categorial properties of their binder, i.e. of what
they're traces of. An adjective leaves an adjectival trace, a noun leaves a nominal
trace, and so forth. This is so that the trace can satisfy the categorial selectional
requirements imposed on the position the trace occupies. For example, the
subject-raising verb wax requires an adjectival complement. So if <e> is
complement of wax, as in (66), it must somehow count as adjectival, like
wroth.

(66) How wroth did she wax <e>? (BINDER from how to <e>)

If we had to introduce invisible words of every conceivable word class, and add
rules requiring them to agree in word class with their binder, then this would be
very much the opposite of a simplification to the grammar. But the Proxy
relation provides a simple solution, if <e> is proxy of its binder. The
selectional requirements of wax are that it takes a complement that is surrogate
of an adjective, and this requirement is satisfied in (66), since (i) how is binder of
<e> and hence <e> is proxy of how; (ii) being a degree word, how is
surrogate of wroth; and (iii) <e> is therefore surrogate of wroth.
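The three-step derivation is mechanical, and can be replayed in the style of the closure sketch from section 6 (the encoding remains my own, purely illustrative):

    # Connectivity in (66): (i) <e> is proxy of its binder 'how';
    # (ii) 'how', a degree word, is surrogate of 'wroth'; hence (iii)
    # <e> is surrogate of 'wroth', which is what wax selects for.
    proxy = {("<e>", "how")}                    # (i)
    surrogate = {("how", "wroth")} | proxy      # (ii), plus rule (24a)
    step = {(x, z) for x, y in surrogate for y2, z in surrogate if y == y2}
    print(("<e>", "wroth") in (surrogate | step))   # True -- (iii)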
In all the other cases discussed in this chapter where the Proxy relation is to
be found, it occurs in a hypocentric phrase, where the SH is proxy of the DH,
which is subordinate to the SH. But this clearly does not apply to the proxy
relation holding between <e> and its binder. The conclusion to be drawn
from this is that rather than the Surrogate and Proxy relations being merely
convenient ways to formalize hypocentricity, they are in fact fundamental, and
hypocentricity is merely a convenient label for phrases whose root is surrogate
of one of its subordinates.

15. Coordination
Since its beginnings, WG has analyzed coordination as exceptional to major
and otherwise exceptionless principles. The first exception is that whereas the
rest of syntax consists solely of dependencies between lexical nodes, i.e.
words, coordinate structures employ nonterminal, nonlexical nodes, which
are linked to other nodes not by dependencies but by part-whole relations.
The nonterminal nodes are of types Conjunct and Coordination. For
example, in (67), the coordination node, marked by curly brackets, is mother
of two conjunct nodes, marked by angle brackets, and of and. The first
conjunct node is mother of Sophy and of roses, and the second is mother of
Edgar and of tulips.

(67) Give {< [Sophy] [roses] > and < [Edgar] [tulips] >}.

Coordination is thus an exception to the principle of Node Lexicality, which
requires all nodes to be lexical.
The second exception is that branches in the tree can cross only where there
is coordination, as shown in (68):

(68) He thinks she made her excuses and left.

Latterly, WG has handled this exception by doing away with a No Crossing
Branches principle, and replacing it with a principle of 'Precedence Concord',
which states that if X is a subordinate of Y, and Y is a subordinate of Z, then the
precedence of X relative to Z must be the same as the precedence of Y relative
to Z; so if X precedes Z then so must Y, and if X follows Z then so must Y. To
this principle, (67-68) are not exceptions. But neither is (69) an exception to it,
and (69) is ungrammatical, due to the crossing branches. Hence a
principle of No Crossing Branches is still required, and (68) is still an exception
to it.

(69) *Give students tulips of linguistics. ['Give students of linguistics tulips.']
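Precedence Concord can be given the same kind of operational restatement as the continuity principle in section 2 (again, the encoding and function names are my own, for illustration only). Running the check on (69) confirms the point just made: the principle is satisfied there even though the branches cross, which is why No Crossing Branches is still needed.

    # Precedence Concord: if X is subordinate to Y and Y to Z,
    # then X and Y must fall on the same side of Z.

    def subordinates(deps, head):
        """Positions that directly or indirectly depend on `head`."""
        out, frontier = set(), {head}
        while frontier:
            frontier = {d for d, p in deps.items() if p in frontier} - out
            out |= frontier
        return out

    def violates_precedence_concord(deps):
        for z in set(deps) | set(deps.values()):
            for y in subordinates(deps, z):
                for x in subordinates(deps, y):
                    if (x < z) != (y < z):
                        return True
        return False

    # (69) '*Give students tulips of linguistics':
    # give(0) students(1) tulips(2) of(3) linguistics(4);
    # students, tulips depend on give; of on students; linguistics on of.
    print(violates_precedence_concord({1: 0, 2: 0, 3: 1, 4: 3}))   # False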

Third, coordination is an exception to the principle of 'Branch Uniqueness',
which requires each dependency between a word and its dependents to be of a
different type. Hence a word cannot have more than one subject or more than
one object, and so on.15 But in WG's analysis of (68), give has two indirect
objects and two direct objects. It is easy enough to reformulate Branch
Uniqueness so that it applies only to dependents that aren't conjoined with each
other, but that just raises the question of why there should be such an
exception.
Other things being equal, WG's model of grammar would be both simpler
and more plausible if the principles of Node Lexicality, No Crossing and
Branch Uniqueness were exceptionless. This calls for a wholly different analysis
of coordination.16 I propose that coordinations are hypocentric phrases whose
SH is the conjunction. Each conjunct is DH. The conjuncts are dependents of
the conjunction, and the conjunction is proxy of its dependents.

(70) She ate [apples AND oranges].

At a stroke, the exceptions to Node Lexicality and Branch Uniqueness are
eradicated. There are no nonlexical nodes. Branch Uniqueness is preserved,
because ate in (70) has only one object, namely and. As for No Crossing, and
the analysis of (68), we return to this in section 17.2, which provides an analysis
that does not violate No Crossing.
The obvious glaring objection to this analysis of coordination comes from
complex coordination, as in (71a), where the conjuncts appear not to be single
phrases. (As pointed out in Hudson (1976), the position of the correlative
shows that (71a) cannot be derived by deletion from (71b).) But the objection
can be turned on its head, and (71a) can be taken as evidence that Sophy tulips
is in fact a single phrase. Section 17.4 provides a vague sketch of how this could
be.

(71) a. Give both Sophy tulips and Edgar roses.


b. * Give both Sophy tulips and give Edgar roses.

16. Correlatives
Another instance of hypocentricity in coordination arises with correlatives (both,
either, neither). The correlative's position at the extreme edge of the phrase
follows if it is SH. The conjunction is complement of the correlative, and the
correlative is proxy of the conjunction.

(72) a. She eats [BOTH apples and oranges].


b. She eats [EITHER apples or oranges].
c. She eats [NEITHER apples nor oranges].

Correlatives are one of the very rare instances, mentioned in section 6, of words
whose complement is their complement target rather than a proxy of their
target. This can be seen from the ungrammaticality of (73a) in contrast to (73b-
73c). In (73a), the complement of both is or, which is proxy of each and: this is
ungrammatical, because the complement of both must be and.

(73) a. *Find both [[Alice and Bill] or [Carol and Dave]].


b. Find (either) both Alice and Bill or both Carol and Dave.
c. Find Alice and Bill or Carol and Dave.

17. Dependency Types


In WG, dependencies are of different types, such as Subject and Object. In
section 141 suggested that these types could be reduced to labels on branches
in the sentence tree. But in this section I will argue that branches are
unlabelled, and that there is no distinction between branches and
dependencies; and so-called 'dependency types' are in fact lexical items in
their own right. Thus, instead of X being subject of Y, there is a word, an
instance of the lexical item <PREDICATION>, that has two dependents, X (the
subject) and Y (the predicate). These words that take over the job of
grammatical relations ('GRs'), I will call 'GR-words'. GR-words belong to a class
of function word characterized, in part, by phonologylessness.
This proposal is relevant to this chapter for two reasons. First, the phrases
whose root is a GR-word are strongly or weakly hypocentric. And second, many
of the other analyses made elsewhere in the chapter converge on the GR-word
analysis as a more or less inescapable conclusion.
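
As a rough illustration of what a GR-word looks like (the encoding is mine, purely expository, and not the chapter's formal notation), <PREDICATION> can be modelled as a phonologyless word whose two dependents play the roles that the labels 'subject' and 'predicate' used to mark:

    # Sketch: a GR-word is an ordinary word node with no phonology; the old
    # dependency labels survive only as the ORDER of its two dependents.
    class GRWord:
        phonology = None  # GR-words are phonologyless by definition

        def __init__(self, name, first, second):
            self.name = name              # e.g. '<PREDICATION>'
            self.dependents = (first, second)

    # 'she snores': both words depend on an instance of <PREDICATION>;
    # the first dependent is the subject, the second the predicate.
    pred = GRWord('<PREDICATION>', 'she', 'snores')
    subject, predicate = pred.dependents
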

17.1 Adjuncts
Semantically, adjuncts are the converse of complements, in that whereas X is a
semantic argument of Y when X is a complement of Y, Y is a semantic
argument of X when X is an adjunct of Y. For instance, in She snoozed during the
interval, the snoozing is the theme argument of 'during'. A natural logical
corollary of the fact that the modificand is an argument of the modifier is that
modification is recursive: after one modifier has been added to the modificand,
another modifier (of the same type or another type) can always be added. This
is, of course, because a predicate's argument 'place' ('attribute') can have only
one 'filler' ('value'), but the filler of one argument place can also be the filler of
many others. We would therefore predict that recursibility is a default property
of adjunction. One cannot rule out, a priori, the possibility of special rules
prohibiting adjunct recursion in certain constructions, but it is hard to imagine
what systemic or functional motivation there could be for such a prohibition. The null hypothesis is therefore that all adjuncts are recursible.17
If adjuncts were simply dependents of the word they modify, then the
principle of Branch Uniqueness ought to make them irrecursible. I propose
instead that Adjunct is not a dependency but rather a GR-word. Rather than X
being adjunct of Y, X and Y are dependents of an Adjunction GR-word; X is
the modifier dependent and Y is the modificand dependent. Adjunction is a
word class; the words it contains are the different kinds of adjuncts, such as
< manner-adverbial >, < depictive >, and so forth. Adjunction phrases are
hypocentric: the adjunction (the SH) is proxy of first dependent, the
modificand (the DH). This can be seen from (74), where it is dozed that
satisfies the requirement of had for a past participle as its complement target:

(74) She had [<ADJUNCTION> [dozed off] [during the interval]].

The adjunction serves as the locus of constructional meaning. For example, (75a) has the meaning (75b) and the structure (75c), <depictive> being an adjunction. It is the word <depictive> that adds the meaning 'while', i.e. that the relationship between her going to bed and her being agitated is that the former occurs during the latter.

(75) a. She went to bed agitated.


b. She went to bed while (she was) agitated.
c. She [<depictive> [went to bed] [agitated]].

A further merit of adjunctions is that they explain what Hudson (1990) calls
'semantic phrasing'. For example, (76a-76b) are not synonymous. (76a) says
that what happens provocatively is her undressing slowly, while (76b) says that
what happens slowly is her undressing provocatively. This nuance of meaning is
reflected directly in the structure, (77a-77b).

(76) a. She undressed slowly provocatively,


b. She undressed provocatively slowly.

(77) a. She [<adjunction> [<adjunction> [undressed] [slowly]] [provocatively]].

     b. She [<adjunction> [<adjunction> [undressed] [provocatively]] [slowly]].
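
The scope difference in (77) can be made explicit with a small schematic encoding (mine, not the chapter's notation), in which each <adjunction> node pairs a modificand with a modifier and outer nodes take scope over inner ones:

    # Sketch: nested adjunction nodes for (77); recursion is unbounded because
    # each new modifier attaches to a fresh adjunction node, not to the verb.
    def adjunction(modificand, modifier):
        return ('<adjunction>', modificand, modifier)

    # (77a): what happens provocatively is [her undressing slowly]
    a = adjunction(adjunction('undressed', 'slowly'), 'provocatively')
    # (77b): what happens slowly is [her undressing provocatively]
    b = adjunction(adjunction('undressed', 'provocatively'), 'slowly')
    # Another modifier can always be added to the result:
    c = adjunction(a, 'yesterday')
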

Noun+noun premodification structures present a conundrum soluble only by means of adjunctions. The conundrum rests on the difficulty of reconciling evidence from word order with evidence from ellipsis. Ellipsis, as in (78), demonstrates that the modifying noun cannot be a dependent of the modified noun, since the modified noun can delete while the modifying noun remains:

(78) On one reading, it receives a count interpretation and on the other reading it receives a mass interpretation.
We might for a moment suppose that the modifying noun is like an attributive
adjective, and its complement is a proxy of the modified noun (cf. Section 10).
In this case, (79a-79b) would have the indicated dependency structure. Their
ambiguity would then hinge not on the dependency structure but on the
complement targets, shown in (80-81) by dotted arrows pointing to the
complement target.

(79) a. old clothes bag

b. work clothes bag

(80) a. old clothes bag ['bag for old clothes']

b. work clothes bag ['bag for work clothes']



(81) a. old clothes bag ['old bag for clothes']

b. work clothes bag ['work bag for clothes', 'clothes bag for work']

But then we find that this analysis falls foul of word order evidence. The
structure given to (82a) fails to rule out (82b):

(82) a. revenge kitchen implement attack ['revenge attack with kitchen implement']

     b. *kitchen revenge implement attack ['revenge attack with kitchen implement']

But the traditional WG analysis, where the modifying noun is a dependent of the modified noun, makes the right predictions here, as shown in (83a-83b), even though it is incompatible with the ellipsis facts:

(83) a. revenge kitchen implement attack ['revenge attack with kitchen implement']

     b. *kitchen revenge implement attack ['revenge attack with kitchen implement']

The solution is to be found if this construction involves an adjunction, '<n+n>'. This adjunction allows its second dependent to delete, as in (84). It gives the structures in (85-86). And these structures succeed in excluding (87b) as a No Crossing violation.

(84) On one reading, it receives a count interpretation and on the other reading it receives a [<n+n> [mass] [interpretation]].
(85) a. [<n+n> [old [clothes]] [bag]] ['bag for old clothes']
     b. [<n+n> [<n+n> [work] [clothes]] [bag]] ['bag for work clothes']
(86) a. [old [<n+n> [clothes] [bag]]] ['old bag for clothes']
     b. [<n+n> [work] [<n+n> [clothes] [bag]]] ['work bag for clothes', 'clothes bag for work']

(87) a. <n+n> revenge <n+n> <n+n> kitchen implement attack ['revenge attack with kitchen implement']

     b. *<n+n> <n+n> <n+n> kitchen revenge implement attack ['revenge attack with kitchen implement']

17.2 Subjects
Conjoined predicates, as in (88a), present a problem. If the structure is as in
(88b), then No Crossing is violated. If the structure is as in (88c) or (88d), then
some kind of rule of leftwards extraposition of subjects is required:

(88) a. He thinks she made her excuses and left,


b. He thinks she made her excuses and left.

c. He thinks she made her excuses and left.

d. He thinks she <e> made her excuses and <e> left.

As we saw in section 5, exactly the same problem arises with extent operators:

(89) a. He thinks she made her excuses and left.

b. He knows she all but perished.

c. He knows she all but <e> perished.

d. He knows she all but perished.

But under the GR-word analysis, the problem evaporates. The GR-word <predication> has two dependents,18 the first corresponding to the subject and the second corresponding to the predicate:19

(90) a. He thinks <predication> she made her excuses and left.

b. He knows <predication> she all but perished.

<Predication> phrases are quasihypocentric in the sense defined in section 6. A phrase [X [<predication> [Y] [Z]]] does not freely alternate with [X [Z]]. In nontechnical and atheoretical terms, predicative phrases of category C do not freely alternate with nonpredicative phrases of category C. But X and Z are nevertheless sensitive to one another's presence, as can be seen from (91a-91b). The complement of auxiliary have must be a <predication> whose second dependent is proxy of a past participle. The complement of wax must be a <predication> whose second dependent is proxy of an adjective.

(91) a. She had [<PREDICATION> <e> perished].

     b. She waxed [<PREDICATION> <e> wroth].

That <predication> is not proxy of its second dependent can be seen from the fact that one <predication> cannot be second dependent of another, i.e. that multiple subjects cannot occur.

(92) a. *She he went.
     b. *<predication> She <predication> he went.

As it stands, the analysis makes it look coincidental that it is only the second ('predicate') dependent of <predication>, and not the first ('subject'), that has DH-like properties. Therefore the grammar should perhaps formally accord the second dependent in this construction a special status. Let us therefore call <predication> the 'guardian' of its second dependent, the metaphor being that the second dependent is a legal minor and its intercourse with its superordinates must always be mediated by its guardian. And let us add rules ((93a-93b); (93b) replaces (24c)):
(93) a. If X is surrogate of Y, then X is guardian of Y.
b. If X is guardian of Y, and Y is guardian of Z, then X is guardian of Z.
In this case, the complement of auxiliary have must be a <predication> that is guardian of a past participle, and the complement of wax must be a <predication> that is surrogate of an adjective.
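
Rules (93a-93b) amount to saying that Guardian contains Surrogate and is closed under transitivity. A small sketch (the relation names are from the text; the code is mine and purely illustrative):

    # Close the Guardian relation under rules (93a) and (93b).
    def guardian_closure(surrogate_pairs, guardian_pairs):
        closure = set(guardian_pairs) | set(surrogate_pairs)  # rule (93a)
        changed = True
        while changed:                                        # rule (93b)
            changed = False
            for (x, y) in list(closure):
                for (y2, z) in list(closure):
                    if y == y2 and (x, z) not in closure:
                        closure.add((x, z))
                        changed = True
        return closure

    # (91a): <predication> is guardian of 'perished', so the requirement of
    # auxiliary 'have' for a guardian of a past participle is satisfied.
    pairs = guardian_closure(set(), {('<predication>', 'perished')})
    assert ('<predication>', 'perished') in pairs
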

17.3 Topics and finiteness


Topics are phrases, like white chocolate in (94), that have been moved to the
position immediately preceding the preverbal subject.

(94) White chocolate, I can't help gorging myself on _.

Like Subject and Predicate, both the topic and the 'comment' phrase can be a coordination.

(95) a. White chocolate, she keeps on giving me and I can't help gorging
myself on.
b. Both white chocolate and Cheshire cheese, I can't help gorging myself
on.

As with <predication>, these facts motivate a GR-word for the topic-comment structure, its first dependent being the topic and its second the comment. The second dependent of this GR-word is a <predication> that is guardian of a verb or auxiliary.

(96) [<'topic-comment'> [White chocolate], [<predication> [I] [can't help gorging myself on]]].

Topics occur only in finite clauses. On the unproblematic assumption that the structure of It is is (97), it can also be maintained that all finite clauses contain topics.

(97) [<'topic-comment'> [it]z [<predication> [<e>z] [is]]]

<Topic-comment> can therefore be equated with finiteness: a finite clause is one that contains <topic-comment>, which we could equally well call <finite>.
I leave for future investigation issues about the relationship between <finite> and mood and tense, about verbal inflection, and about whether mood exists as a grammatical category in English.20 At any rate, it is clear that <finite> phrases are at most weakly hypocentric. If Indicativity and Subjunctivity are subtypes of <finite>, then <finite> phrases are not hypocentric at all, since know can select for a surrogate of <indicative>, require for a surrogate of <subjunctive> and insist for a surrogate of <finite>. If, on the other hand, the mood distinctions are located lower down within the <finite> phrase, then know/require/insist will select for a surrogate of <finite>, but know and require will further stipulate that their complement must be guardian of wherever the appropriate mood distinction is located.

17.4 Complements
An inescapable corollary of the proposed analysis of coordination is that
conjuncts are complete phrases.21 In this section I sketch how this must work,
though the sketch is of a solution strategy rather than an analysis worked
through in detail.

(98a) shows that - uncontroversially - eats cheese is a complete phrase. But (98b) shows that core is a complete phrase too. So in a verb+object construction, V+O is a complete phrase, but so too is V on its own. How can this be?

(98) a. She will [[eat cheese] and [drink wine]],


b. She will [[core] and [peel]] the apples.

The answer has to be that there is an extra GR-word present, whose function is
like that of an X' node, uniting into a single phrase its two separate dependents,
V and O. This gives the structures in (99a-99c).

(99) a. She will [[<X'> eat cheese] and [<X'> drink wine]],

b. She will <X'> [[core] and [peel]] the apples.

c. She will <X'> eat [cheese and bread].
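
As an expository sketch (the helper and the encoding are mine), the role of <X'> can be pictured as that of a GR-word which bundles verb and object into one phrase, which is what lets whole V+O sequences be conjuncts in (98a) while V alone is a conjunct in (98b):

    # Sketch of (99a): <X'> unites its two dependents, V and O, into a phrase.
    def gr(name, *dependents):
        return (name,) + dependents

    conjunct1 = gr("<X'>", 'eat', 'cheese')
    conjunct2 = gr("<X'>", 'drink', 'wine')
    coordination = gr('and', conjunct1, conjunct2)  # conjunction as SH, per the
                                                    # coordination analysis above
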

Similarly, (100a) shows that Sophy is a complete phrase in (100d), and (100b) shows that roses is a complete phrase in (100d). But (100c) shows also that Sophy roses is a complete phrase in (100d).

(100) a. She will give Sophy and Edgar roses.


b. She will give Sophy roses and tulips.
c. She will give Sophy roses and Edgar tulips.
d. She will give Sophy roses.

There must therefore be an additional GR-word present - let's call it '<ditransitive>'. (100a-100d) must have structures (101a-101d).

(101) a. She will <X'> give <ditransitive> Sophy and Edgar roses.

      b. She will <X'> give <ditransitive> Sophy roses and tulips.

      c. She will <X'> give <ditransitive> Sophy roses and <ditransitive> Edgar tulips.

      d. She will <X'> give <ditransitive> Sophy roses.



(102a) shows that roses today is a phrase. Today is an adjunct, but not of roses: it is not the roses that occur today but rather their being given. Today must be an adjunct of a GR-word that marks the second object, as in (102b). (103a) therefore has structure (103b).

(102) a. She will give Sophy roses today and tulips tomorrow.
      b. She will <X'> give <ditr> Sophy <adj> <GR> roses today and <adj> <GR> tulips tomorrow.

(103) a. She will give Sophy roses.

b. She will <X'> give <ditr> Sophy <GR> roses.

With the exception of <transitive>, the GR-words involved in complementation would be guardians rather than surrogates or proxies, since the GR-words are not freely omissible in the way that surrogates and proxies are. As for the kind of hypocentricity, if any, involved with <X'>, I leave this for future investigation.

18. Conclusion
This chapter has demonstrated the existence - and indeed the prevalence - of
hypocentricity, the syntactic phenomenon whereby the distribution of a phrase
is determined not by the root of the phrase but by a word subordinate to the
phrase root. Hypocentricity comes in different 'strengths'. In the strongest form
of hypocentricity, a phrase with a given SH is in free variation with a version of
the phrase with the SH absent. These are the hypocentric constructions that
involve the Proxy relation. The instances discussed in this chapter involve (i)
'extent operators' like almost, not and all but; (ii) 'focusing subjuncts' like even;
(iii) attributive adjectives; (iv) the type-of construction; (v) coordinating
conjunctions; (vi) correlatives like both; and (vii) adjunctions, which are the
invisible words that link adjuncts to their modificands. Apart from coordination, these could all be called 'modifier constructions'. In addition it has been
suggested that invisible bound variables are proxy of their binder, even though,
exceptionally, the binder would not be a subordinate of its proxy.
In hypocentricity of 'intermediate strength', a phrase with a given SH is in
distributional alternation with a version of the phrase with the SH absent, but
the variation is limited to certain environments. These are the hypocentric
constructions that involve the Surrogate relation. The instances discussed in this
chapter involve (i) clausal that; (ii) inside-out interrogative clauses, which behave
like clausal determiners; (iii) pied-piping; (iv) degree words; and (v)
determiners.
In the weakest form of hypocentricity, there is no distributional alternation,
but the DH is nevertheless sensitive to material external to the hypocentric phrase. These are the hypocentric constructions that involve the Guardian relation. The instances discussed in this chapter involve the invisible GR-words <predication>, which is the root of the subject+predicate construction, and <finite>, which is the root of the topic+comment construction, and various other GR-words that form the structural basis of complementation.
The relations Proxy and Surrogate are initially motivated as mechanisms that
provide an analysis for constructions that cannot otherwise be satisfactorily
handled by WG. Once this mechanism is admitted, it opens the way - or the
floodgates - for a series of increasingly radical (and increasingly sketchy and
programmatic) analyses of coordination and of grammatical relations, which
aim to simplify WG by drastically reducing the range of devices from which
syntactic structure is constituted, while still remaining consistent with WG's
basic tenets. The devices that are done away with are (i) exceptions to the principle of Node Lexicality, i.e. nonlexical phrasal nodes, which orthodox WG uses for coordination; (ii) exceptions to the No Crossing principle barring crossing branches in the sentence tree; (iii) dependencies that are not associated with branches in the sentence tree; and, perhaps, (iv) dependency types tout court. In their most extreme form, these changes result in a syntactic structure consisting of nothing but words linked by unlabelled branches forming a tangle-free tree, supplemented by Binder, Proxy, Surrogate and Guardian relations.
While I believe the necessity for Proxy and Surrogate relations is demonstrated
fairly securely by the earlier sections of the chapter, their extended application
in the analysis of coordination, empty variables and grammatical relations is of a
far more speculative nature. But my aim in discussing these analyses in this
chapter has been to point out how they are possible within a WG model and
why they are potentially desirable.

References
Cormack, Annabel and Breheny, Richard (1994), 'Projections for functional categories'.
UCL Working Papers in Linguistics, 6, 35-62.
Hudson, Richard (1976), 'Conjunction reduction, gapping and right node raising'.
Language, 52, 535-62.
— (1990), English Word Grammar. Oxford: Blackwell.
— (2004), 'Are determiners heads?'. Functions of Language 11, 7-42.
Jaworska, Ewa (1986), 'Prepositional phrases as subjects and objects'. Journal of
Linguistics, 22, 355-74.
Payne, John (1993), 'The headedness of noun phrases: Slaying the nominal hydra', in
Greville G. Corbett, Norman M. Fraser and Scott McGlashan (eds), Heads in
Grammatical Theory. Cambridge: Cambridge University Press, pp. 114-39.
Quirk, Randolph, Greenbaum, Sidney, Leech, Geoffrey N. and Svartvik, Jan Lars (1985),
A Comprehensive Grammar of the English Language. Harlow: Longman.
Rosta, Andrew (1994), 'Dependency and grammatical relations'. UCL Working Papers
in Linguistics, 6, 219-58.
— (1997), 'English Syntax and Word Grammar Theory'. (Unpublished doctoral
dissertation, University of London).
Van Langendonck, Willy (1994), 'Determiners as Heads?'. Cognitive Linguistics, 5,
243-59.

Notes
1 In standard Phrase Structure Grammar, lexical nodes are terminal and nonlexical
nodes are nonterminal. If a nonterminal node is defined as one that contains others,
then in WG all nodes are terminal. But this terminology is a bit misleading, since in
a WG tree structure, terminal nodes are ones that have no subordinates. Hence it is
more perspicuous to define WG as maintaining that all nodes are lexical.
2 This problem posed by require was pointed out by Payne (1993); cf. also Cormack
and Breheny (1994).
3 See Jaworska (1986) for examples of such nonpredicative prepositions.
4 These judgements are for conservative Standard English. Admittedly there is the
famous line 'The boy stood on the burning deck, whence all but he had fled'
(Felicia Hemans, 'Casabianca'), but that could be a solecism induced by the
register, hypercorrectively, and by the disconcerting unfamiliarity of the word-
sequence him had in contrast to he had. It is also true that for many speakers of
contemporary English, the rules for the incidence of personal pronouns' subjective
forms, especially in less colloquial registers, seem to be pretty much of the make-them-up-as-you-go-along or when-in-any-doubt-use-the-subjective-form sort.
5 It has to be admitted that this claim flies in the face of a certain amount of evidence
to the contrary, notably determiner-complement ellipsis, as in (i), and
pseudogapping, as in (ii).
(i) Do students of Lit tend to be brighter than those students of Lang?
(ii) She will do her best to bring the food, as will he do his best to bring me wine.
6 (13f) raises its own analytical curiosities, which I won't investigate further here.
Logically, the structure is 'not [[go out without knickers] and [still stay decent]]'; that is, the ellipsis is of one conjunct. Egregiously unexpected though such a phenomenon is, I find that the analogous structure in (i) is, also surprisingly, acceptable.
(i) Nobody likes to complain, but she should [[complain] and [be the happier for it]].
7 The arrows below the words represent dependencies that don't form branches in
the tree structure. (See section 14 on the eradication of such dependencies.)
8 More generally, I would maintain that open class lexemes are invisible to syntax
and hence that selectional rules cannot refer to them. Only word classes are visible
to syntax and can be involved in selectional rules. WG has always held that
lexemes are word classes, but my contention is that this is true only of closed class
lexemes, in that closed class lexemes are word classes that are associated with
particular phonological forms, whereas open class lexemes are morphological
stems (which is why processes of derivational morphology, which output stems, can
output only open class lexemes). Way (and a few other similar words, such as
place) would be a subclass of Common Noun.
9 More precisely, the rule is that by default, nouns receive a mass interpretation, but
the complement of certain determiners, such as an, must be proxy of a noun that
receives a count interpretation.
10 I don't know why (56c) is grammatical. Cake has a count interpretation, so ought to
be complement target of a determiner, but if it is complement target of these then it
ought to agree in number with these.
11 I suggest that the reason is that common nouns take plural inflection only when
complement of a plural determiner. (This supposes that bare plurals are
complement of a phonologically invisible plural an.) In (55b) there is no plural
determiner present to trigger the plural inflection on cakes.

12 This is an approximation. Make actually requires a complement that is surrogate of a predicative word. In the analysis of predicativity given in section 17.2, <predicative> would be complement of make, and the surrogate of virile would be dependent of <predicative>.
13 By 'empty category', I mean traces and suchlike, that have a privileged syntactic
status, and are empty not only of phonological content but also of ordinary lexical
content. As noted in section 5, it is also possible for ordinary words in particular
environments to lack phonology; cf. also Creider and Hudson (this volume).
14 (65b-65c) from Arthur Hugh Clough's 'The latest decalogue'.
15 Besides coordination, adjuncts appear to constitute an exception to Branch
Uniqueness. A word can have more than one adjunct - indeed, that is part of the
definition of the Adjunct relation. But section 17.1 provides an analysis of adjuncts
that removes the exception to Branch Uniqueness.
16 A further problem with the WG analysis of coordination is that it cannot easily
accommodate the fact that the definitional boundary between coordination and
subordination is gradient rather than clearcut, as one would expect were coordination and subordination handled by fundamentally different mechanisms. (See Rosta 1997 for a full demonstration of this point.)
17 This null hypothesis stands up extremely well to the data, but there are some
constructions where a dependent is irrecursible but is not subject to selectional
restrictions and is not an argument of the modificand and hence has selectional
and semantic properties more typical of adjuncts than of complements. But such
dependents are best seen as atypical complements. One example is the result
dependent in the resultative construction, e.g. soggy in (i). Another is the indirect
object me in (ii).
(i) She sneezed the hankie soggy.
(ii) Fry me some bacon.
Another example is bare relative clauses (BRCs), in those dialects in which BRCs
are not recursible.
(iii) %The book [I'd been asking for _] [she finally bought me _] turned out to be
crap.
Another respect in which BRCs are unlike adjuncts is that, as (iv) shows, they don't extrapose, though this point should be taken as suggestive rather than conclusive, since it is not clear how consistently extraposability distinguishes adjuncts from complements or other nonadjuncts.
(iv) [That book _] has arrived [*(that) you ordered _].
18 In the absence of evidence to the contrary, GR-words are assumed to precede their
dependents, since this is the default order for English. Indeed, for English it may
eventually turn out to be an exceptionless principle that dependents follow their
regent. But there are plenty of exceptions that are hard to explain away, such as too
'also' ({[She] [too] went]), degree enough, possessive 's ([[Sophy]'s [father]), and
nonfmal dependents of conjunctions.
19 A corollary of this analysis is that the 'inverted subject' in subject-auxiliary inversion is not in fact a subject. Rather, it is an object (of the auxiliary) that has not been raised to subject position.
20 The data to be accounted for is summarized in (i-ix).
(i) though she is/ was/ %be/ *were/ goes/ %go/ went mad
(ii) if she is/ was/ %be/ were/ goes/ %go/ went mad
(iii) insist that she is/ was/ be/ *were/ goes/ go/ went mad
(iv) require that she *is/ *was/ be/ were/ goes/ go/ went mad

(v) know that she is/ was/ *be/ *were/ goes/ *go/ went mad
(vi) She would, *is/ %was/ *?be/ %were/ *goes/ *go/ *went she mad.
(vii) She would, is/ %was/ *?be/ %were/ *goes/ *go/ *went you to.
(viii) I would prefer that you not be/%be/*is/*is.
(ix) She almost *be/*be/is/%is.
21 In determining what sorts of phrase can be coordinated, it is important to factor
out the extraneous but distorting effects of Right Node Raising-type operations,
which delete the phonology of part of one conjunct. See Rosta (1997).
9 Factoring Out the Subject Dependency

NIKOLAS GISBORNE

Abstract
This chapter offers a revision to the English Word Grammar (EWG) model by
factoring out different kinds of dependency. This is because the information
encoded in the EWG model of dependencies is not organized at the appropriate
level of granularity. It is not enough to say, for example, that by default the
referent of a subject is the agent of the event denoted by the verb.

1. Introduction
The English Word Grammar model treats dependencies as asymmetrical
syntactic relations (Hudson 1990: 105-8), where the critical information is the
asymmetry and the relative ordering of head and dependent. Hudson (1990:
120-1) goes on to treat grammatical relations as a particular subclass of
dependency relation, and to identify certain semantic roles as being
prototypically linked to certain grammatical relations. For the English Word
Grammar model, therefore, dependencies are labelled asymmetrical syntactic
relations which are also triples of semantic information, syntactic relation
information and word-order information bound together by default inheritance.
The theory is syntactically minimalist: all syntactic phenomena are analyzed in
terms of dependency relations and the categorization of the words that the
dependencies relate to.
The result is a highly restrictive model of grammar, where all relationships
are strictly local. It differs from other lexicalist frameworks, such as Lexical
Functional Grammar (LFG), in that there are not different domains of structure
which represent different kinds of information. All grammatically relevant
information for WG is read off the lexicon and the dependency information.
Within this theory of grammar, Hudson (1990) uses an inventory of
dependencies which is pretty much what you find making up the set of
grammatical relations in both traditional grammar and classical transformational
grammar.
In this chapter, I offer a revision to the English Word Grammar model by
factoring out different kinds of subject dependency. This is because the
information encoded in the EWG model is not organized at the appropriate level
of granularity. It is not enough to say, for example, that by default the referent of a
subject is the agent of the event denoted by the verb. This is because there are at

least three kinds of subject in English: subjects triggered by finiteness (on the
grounds that English is not a 'pro-drop' language); subjects triggered by
predicative complementation; and 'thematic' or 'lexical' subjects such as the
subjects of gerunds and other inherently predicating expressions. Subjects
triggered by finiteness are not required to be in any kind of semantic relationship
with the event denoted by the verb. Similar observations about the non-uniformity of the subject relationship are found in McCloskey (1997, 2001).
In this chapter, therefore, I review the inventory of dependencies in Word
Grammar, and establish a more fine-grained account of subjecthood than the
model of Hudson (1990) envisages. I focus on data introduced in Bresnan
(1994). Bresnan (1994) explores locative inversion, shown in (1), and shows
that in inverted sentences like (Ib), the subject properties are split between the
italicized PP and the emboldened NP:

(1) a. A lamp was in the corner.


b. In the corner was a lamp.1

Bresnan's (1994) account explains the split subject properties in terms of the
parallel architecture of LFG where grammatical information is handled in terms
of a-structure, f-structure and c-structure. I show that the revised Word
Grammar account can capture the same kind of data as LFG within a more
parsimonious ontology.
The chapter is organized into 5 sections. In section 2, I discuss the different dimensions of subjecthood, and explore the different properties that subjects
have been claimed to display since Keenan (1976). In section 3, I lay out the
data that needs to be discussed (drawn from Bresnan 1994), and explain the
problems that this data presents. In section 4, I present the refined view of
subjecthood that this chapter argues for, and show how it accounts for the data.
The final section, section 5, presents the conclusions and some prospects for
future research.

2. Dimensions of Subjecthood
Subject properties have been gathered up in several different places - for
example, Keenan (1976), Keenan and Comrie (1977), and Andrews (1985).
Subjects have been shown to have diverse properties across languages, and it
has been shown that not every subject property is always displayed by all
subjects in a given language. It is this observation that drives Falk's (2004: 1)
claim that 'a truly explanatory theory of subjecthood has yet to be constructed'.
In this section, I itemize and exemplify some of the major features of
subjecthood, which are generally held to apply crosslinguistically, and present
three diagnostics which apply parochially to English. I have relied on the
presentation of these properties in Falk (2004: 2-5), where they are usefully
gathered together. Not all of the subject properties laid out here are directly
relevant to the analysis of the split-subject phenomena found in locative
inversion, but they are relevant to the broader conclusions about subjecthood
that this case study takes us to, and which are laid out in section 5.

2.1 Typical subject properties


Subjects are typically the dependent which expresses the agent argument in the
active voice. This is shown in (2):

(2) a. The dog chased the cat.


b. The cat was chased by the dog.

In (2a), the subject is also the agent of the action. In order for the subject not to
have to be the agent, passive voice is available as in (2b). Voice phenomena are
devices for re-arranging the arguments of the verb so that the agent no longer
has to be presented as the subject. Of course, it is not always the case that
subjects are agents, because there are verbs that do not have agentive subjects,
as in (3), but many linguists follow Jackendoff (1990) in assuming a hierarchy of
semantic roles, where the most agent-like is always the one which links to the
subject.

(3) a. Jimmy weighs 90kg.


b. The glass shattered.

So we can use semantic role as a subject-diagnostic.


The second diagnostic is that sole arguments of intransitives typically show
(other) subject properties. For example, tag questions and subject-auxiliary
inversion are diagnostics of subjecthood in English, and the subject of (3b)
above and (4a) can have the relevant diagnostic applied to it.

(4) a. The glass shattered, did it/*he/*she/*they?


b. Did the glass shatter?

From this, we can say that the glass in (3b) is shown to be the subject by the
diagnostics in (4).
The addressee of an imperative is a subject. In the following examples, the
addressee has the status of the subject, irrespective of its semantic role.

(5) a. Go away!
b. Be miserable, see if I care!

From the imperative examples, we can see that it is also possible for subjects to
be covert.2
One widely noted diagnostic is to do with anaphora. There is a subject-object asymmetry, which becomes evident when subject and object are co-referential. In the case of co-reference, it is the object which is expressed as a (reflexive)3 pronoun. This is shown in (6):

(6) a. Jane hurt herself.


b. * Herself hurt Jane.

There is cross-linguistic variation in this construction. In English, there is a hierarchy of grammatical functions so that the reflexive pronoun has to be lower in the hierarchy than its antecedent. In some other languages, only subjects may be antecedents of reflexive pronouns.
The subject is the only argument which may be shared in a predicative
complementation structure (in fact, in both varieties of predicative complementation - raising and control). This is shown by the examples in (7) for
'control' verbs, and (8) for 'raising' verbs. The subject of the xcomp is shared
with either the object or the subject of the matrix verb.

(7) a. Jane persuaded the doctor to see Peter.


b. Jane persuaded Peter to be seen by the doctor.
c. The doctor was persuaded to see Peter.4
d. *Jane persuaded Peter the doctor to see.

In (7a), the doctor is shared between persuaded and to see Peter, because to is the
xcomp of persuade. The passivization facts in (7b) show us that the doctor in (7a)
is the subject of to (and see). The passivization facts in (7c) show us that the doctor
is an argument that is shared with persuaded, because it is also the object of
persuaded. The ungrammatical (7d) shows that Peter cannot be the object of
persuaded and of see at the same time. Therefore, the property of being sharable
with a higher predicate is a property of subjects, not other arguments.

(8) a. It seems that Jane likes Peter.


b. Jane seems to like Peter.
c. Peter seems to be liked by Jane.
d. *Peter seems Jane to like _.

The relationship between (8a) and (8b) shows that in (8b), Jane is the subject of
both seems and to like. The example in (8c) shows that the passive subject of (to
be) liked can also be shared with seems. The ungrammatical (8d) shows that it is
not possible to exploit the object of like as the shared subject of seems.
Falk (2004) claims that the subject is the only argument which can be shared
in coordination. The examples in (9) show that a subject can be shared by two
conjoined verbs, but not an object:

(9) a. Jane kissed Peter and hugged Jim.


b. *Jane kissed Peter and Cassandra hugged _.

However, this observation is not quite right. In Right Node Raising, the object
of the second conjunct can be shared, as in Jane kissed, and Cassandra hugged,
Peter. Right Node Raising needs to be treated as a special construction type
because, among other things, it comes with particular intonation - indicated
here by the commas - which is not a necessary part of the argument sharing in
(9a). But, it is also the case that it is possible to say Cassandra peeled and ate a
grape. Here, both the object and the subject are shared by the conjoined verbs.
There are two remaining properties of subjects which can be stated very
generally. The first is that in many languages the subject is obligatory, as it is in

English (except in the case of imperatives). This observation gives rise to the
Projection Principle of Chomsky (1981), and its later incarnation as the
Extended Projection Principle (EPP). The second fact is that subjects are
usually discourse topics.
In the next section, I identify some subject properties that are found
parochially in English.

2.2 Parochial subject diagnostics for English


The first is that subject-inversion is found in main-clause interrogatives.

(10) a. Jane was running.


b. Was Jane running?
c. Jane ran.
d. Did Jane run?

As (10a-10b) show, where there is an auxiliary in the corresponding declarative clause, it inverts with the subject in interrogatives. The examples in (10c-10d) show that where there is no auxiliary in the corresponding declarative clause, one has to be supplied in the interrogative.
The next diagnostic for English is that tag-questions have properties that are
unique to subjecthood: the pronoun in a tag question has to agree with the
person, number and gender 'features' of the noun or noun phrase in the matrix
clause it replaces. If we look back at the example in (4a), we can see that the
only legal pronoun in the tag question is it, which has features appropriate to the
glass.
The last diagnostic that I want to look at concerns extraction. There are two
main properties in English: the Condition on Extraction Domains shows us that
it is easier to extract out of complements than out of adjuncts, which in turn are
easier to extract out of than subjects. And the THAT-trace effect shows that - in
general terms - English subjects resist extraction. (In other languages subjects
are often more extractable than other arguments. Keenan and Comrie (1977)
showed that in terms of the THAT-trace effect, English is atypical: they found
that cross-linguistically, subjects were more, not less, likely to be extracted.)
There are relevant data in (11):

(11) a. Jane thinks that Peter is a drunk.


b. *Who does Jane think that _ is a drunk?
c. Who does Jane think _ is a drunk?
d. What does Jane think that Peter is _?

The example in (11a) gives the basic declarative sentence, (11b) shows that it is impossible to extract a subject after that, even though (11c) shows that it is possible to extract a subject out of a finite complement clause when there is no that, and (11d) shows that it is possible to extract other arguments out of a finite complement clause, like the object.

2.3 Subject-verb agreement
Subject-verb agreement is not universally found as a subject property; however,
it is not a parochial property of English either. As a phenomenon, agreement is
complex - some languages have agreement that works across a range of
dimensions, whereas English only shows agreement in terms of number
(Hudson 1999). Although English has subject-verb agreement, which is very
common in Semitic, Bantu and Indo-European languages, some other
languages have no agreement morphology at all - for example the modern
Scandinavian languages and the Sinitic languages. English subject-verb
agreement is shown in (12).5

(12) a. The girl likes the dog.


b. The girls like the dog.
c. *The girl like the dog.
d. *The girl like the dogs.

The examples in (12a-12b) show that the number feature of the finite verb co-varies with the number of the subject. If the subject is plural, so is the verb: girls triggers like. The example in (12c) shows that a singular subject cannot take a plural verb, and the example in (12d) shows that English does not have agreement with objects, so a plural object cannot rescue a plural verb that has a singular subject. The agreement phenomena of English are significant in the discussion of locative inversion that follows.
of locative inversion that follows.
This, then, completes the review of subject properties. A number of these
properties were exploited by Bresnan in her (1994) article, which is discussed
in the next section, but before we turn to section 3, I shall just summarize the
subject properties in three bullet-point lists here:

General subject properties.


• subjects are typically the dependent which expresses the Agent argument in
the active voice;
• the sole arguments of intransitives typically show (other) subject properties;
• the addressee of an imperative is a subject;
• there is a subject-object asymmetry, such that where subject and object
are co-referential, it is the object which is expressed as a (reflexive)
pronoun;
• the subject is the only argument of an xcomp which may be shared in a
predicative complementation structure (in both raising and control);
• the subject can be the shared argument in coordination;
• the subject is often obligatory;
• subjects are usually the discourse topic.

Parochial subject diagnostics for English


• English main-clause interrogatives show subject-inversion;
• tag questions show agreement between the pronoun tag and the subject;

• English subjects resist extraction. (In other languages, subjects are often
more extractable than other arguments).

Agreement
• subject-verb agreement: subjects agree with their verb.

The issue, at least in as much as the locative inversion data constitute a problem for
a story of subjecthood, is to do with which of these subject properties belong
together. In the next section, I look at the locative inversion data presented in
Bresnan (1994), and then in section 4, I look at these subject properties in the light
of Bresnan's findings about the arguments in the locative inversion construction.

3. The Locative Inversion Data


Bresnan (1994) presents an account of locative inversion which carefully details
the circumstances within which locative inversion can take place, and which also
describes the discourse factors, as well as the grammatical factors, which affect
locative inversion. Locative inversion in English is shown in (1) above, and (13)
and (14) below.

(13) a. My friend Rose was sitting among the guests.


b. Among the guests was sitting my friend Rose.
(14) a. The tax collector came back to the village,
b. Back to the village came the tax collector.

As Bresnan (1994: 75) puts it, 'locative inversion involves the preposing of a
locative PP and the postposing of the subject NP after the verb. The positions
of the locative and subject arguments are inverted without changing the
semantic role structure of the verb.' Bresnan (1994: 75-6) sets out the limits of
locative inversion, excluding other kinds of inversion around be from the
discussion, and limiting the phenomenon to examples like those in (15).

(15) a. Crashing through the woods came a wild boar.


b. Coiled on the floor lay a one-hundred-and-fifty-foot length of braided nylon
climbing rope three-eighths of an inch thick.

The examples in (15) can be included in the set of locative inversion data
because the inverted VPs involve a locative PP, and the verbs COME and LIE
number among the verbs which support locative inversion. Bresnan goes on to
demonstrate that the verbs which allow locative inversion are unaccusative -
thus the grammaticality difference between (16a) and (16b) - or passive (but
with the BY phrase suppressed), as in (17):

(16) a. Among the guests was sitting my friend Rose.


b. * Among the guests was knitting my friend Rose.
(17) a. My mother was seated among the guests of honour,
b. Among the guests of honour was seated my mother.

From these data, Bresnan concludes that locative inversion 'can occur just in
case the subject can be interpreted as the argument of which the location,
change of location or direction expressed by the locative argument is
predicated' (1994: 80) - to put this another way, the subject must be a 'theme'
in the terms of Jackendoff (1990). This is consistent with Bresnan's account of
unaccusativity, where it is claimed that unaccusativity is not a syntactic
phenomenon, but one where the unaccusative subject's referent is always the
theme of the sense of the verb.
The final aspect of the grammar of locative inversion is that the locative PP is
always an argument of the verb, not an adjunct. The argument/adjunct
distinction is hard to draw in the case of locative expressions, and I do not want
to get bogged down in the debate, but the evidence that Bresnan (1994: 82-3)
brings to bear on the issue is compelling enough. She shows that adjuncts can
be preposed before the subjects of questions, although arguments cannot, and
she uses the so-anaphora test to show that adjuncts can be excluded from the
interpretation of so-anaphora, whereas locative arguments cannot.
To summarize, locative inversion:

• occurs with unaccusative verbs or passivized verbs;


• requires the subject NP's referent to be the theme of the sense of the verb;
• requires the locative PP to be an argument of the verb.

There are other facts that apply in the treatment of locative inversion. These
are:

• presentational focus;
• sentential negation;
• other subject properties.

Presentational focus is not strictly syntactic. I shall not return to this. Sentential
negation is more important. Bresnan gives the examples in (18). The
significance is that in (18a), sentential negation is not possible, whereas in (18b),
constituent negation of the postverbal NP is possible:

(18) a. *On the wall never hung a picture of U. S. Grant.


b. On the wall hangs not a picture of U. S. Grant but one of Jefferson Davis.

Bresnan (1994: 88) quotes Aissen (1975: 9) as saying that this restriction is due
to the way in which the locative expression sets a backdrop for a scene.
Negating the main clause undermines this discourse function, whereas
contrastive negation on the postverbal NP does not have such an effect. Bresnan (1994: 88), on the other hand, contrasting English with Chichewa,
argues that sentential negation in Chichewa excludes the subject, so the
restriction comes down to a statement about the scope of negation.

3.1 Evidence that the subject properties are split between the locative PP and the postposed NP
We shall see in this section that by a number of different diagnostics for
subjecthood, the subject properties are split between the locative PP and the
postposed NP that expresses the theme argument.

Agreement
In the case of agreement, we see that the locative PP does not agree with the
finite verb.

(19) a. In the swamp was/*were found a child.

b. In the swamp were/*was found two children.

In English, agreement is with the theme NP.

Control of attributive VPs (participial relatives)


In this case too, we see that the locative PP cannot be the controller (or subject)
of an attributive participle. As Bresnan (1994: 95) points out, this constitutes a
difference between English and other languages: Chichewa does allow
examples like (21b). In borrowing Bresnan's examples, I have also taken her
representational system.6

(20) a. On the corner stood a woman [cp who was standing near another woman]cp.
     b. On the corner stood a woman [0 standing near another woman]cp.

Note that the locative PP cannot control the participle in the participial relative.

(21) a. She stood on the corner [cp on which was standing another woman]cp.
     b. *She stood on the corner [0 standing another woman].

Subject-raising
However, English does allow apparent subject-raising of locative PPs as in (22).

(22) a. Over my windowsill seems to have crawled an entire army of ants.


b. On that hill appears to be located a cathedral.
c. In these villages are likely to be found the best examples of this cuisine.

Bresnan (1994: 96) observes that only subjects can be raised in English. She
compares the two examples in (23) as evidence of this:

(23) a. It seems that John, you dislike,


b. *John seems you to dislike.

In (23a), John is the focused, and leftward-moved, object of dislike. This movement is entirely acceptable in the context of the finite verb. In (23b), however, we can see that it is not possible for the object to be focused and then raised over seems as its subject. From this, she concludes that any word or
phrase which is the subject of a predicate like seem is also the subject of the
xcomp of seem.

Tag questions
The argument from tag questions is a negative one: the claim is that the NP
theme cannot be the subject by this diagnostic. In English tag-questions, a
declarative clause expressing a statement is followed by an auxiliary verb and a
pronoun which expresses a questioning of the prepositional content of the
main clause. Examples are given in (24). The pronoun must agree with the
subject of the main clause.

(24) a. Mary fooled John, didn't she/*he?


b. John was fooled by Mary, wasn't he/*she?

As Bresnan points out, tags are in general unacceptable with locative inversion.
The examples in (25) show this:

(25) a. ?Into the garden ran John, didn't he?


b. *Into the garden ran a man, didn't one/he?7

The example in (25a) is less unacceptable than that in (25b). Bresnan (1994:
97) quotes Bowers (1976: 237) who gives the example in (26) and argues
that this shows that the postposed NP in locative inversions cannot be the
subject.

(26) In the garden is a beautiful statue, isn't there?

The claim is that in (26) there is coreferential with [i]n the garden, which
indicates, if anything, that in the garden is a more likely candidate for subject
status than the postposed NP.8 Bresnan also quotes *A man arrived, didn't one/he? - an example from Gueron (1980: 661) - to show that tags are in general difficult to establish with locatives even when they do not involve inversion. However, this example does seem to be set up to make the situation more, rather than less, problematic: replace a man by the man, and the problem of pronoun choice vanishes. With appropriate context, as in the train arrived at 3, didn't it?, a tag question is fine with locative inversion.
The tag-question data are difficult to interpret, therefore; it seems that the
best solution is to put them on one side as inconclusive.

Subject extraction/THAT-trace effect


In this section, I simply quote part of Bresnan's (1994: 97) section 8.2, although with renumbered examples. Bresnan is discussing the THAT-trace effect.

[The] preposed locatives in locative inversion show the constraints on subject extraction adjacent to complementizers:
(27) a. It's in these villages that we all believe _ can be found the best examples of this cuisine.
     b. *It's in these villages that we all believe that _ can be found the best examples of this cuisine.
Nonsubject constituents are unaffected by this restriction, as we can see by comparing extraction of the uninverted locatives:
(28) a. It's in these villages that we all believe the finest examples of this cuisine can be found _.
     b. It's in these villages that we all believe that the finest examples of this cuisine can be found _.
Only subjects show the effect.

(29) a. It's this cuisine that we all believe _ can be found in these villages.
     b. *It's this cuisine that we all believe that _ can be found in these villages.

Extraction from coordinate constituents


Bresnan (1994: 98) gives the examples in (31-32) which show the constraint in (30):

(30) 'subject gaps at the top level of one coordinate constituent cannot occur with any other kind of gap in the other coordinate constituent.'
(31) a. She's someone that _ loves cooking and _ hates jogging.
     b. She's someone that cooking amuses _ and jogging bores _.

(31a) has two subject gaps; (31b) has two non-subject gaps.

(32) a. *She's someone that cooking amuses _ and _ hates jogging.
     b. She's someone that cooking amuses _ and I expect _ will hate jogging.

In (32a), a non-subject gap is coordinated with a subject gap, leading to ungrammaticality. In (32b), we see that a non-subject gap can be coordinated with an embedded subject gap, hence the careful formulation of the constraint in (30).
Bresnan (1994: 98) suggests that judgments are delicate with examples like those in (33)-(34), which involve locative inversion examples, but gives these examples:

(33) a. That's the old graveyard, in which _ is buried a pirate and _ is likely to be buried a treasure. [subject-subject]
     b. That's the old graveyard, in which workers are digging _ and a treasure is likely to be buried _. [nonsubject-nonsubject]
(34) a. ??That's the old graveyard, in which workers are digging _ and _ is likely to be buried a treasure. [nonsubject-subject]
     b. That's the old graveyard, in which workers are digging _ and they say _ is buried a treasure. [nonsubject-embedded subject]

The crucial point here is that the examples in (33) show that 'the inverted
locative PPs show the extraction patterning of subjects' (Bresnan 1994: 98). As
Bresnan points out, (34a) is fine with there in the subject gap 'which channels
the extraction to a nonsubject argument (the oblique)', which argues in favour
of a subject treatment of the locative PP.

Another diagnostic for subjecthood is inversion in interrogatives (Bresnan, 1994: 102). As the examples in (35) show, it is clearly the case that the locative PP is not a subject by this criterion. We shall use this fact in the next section where I argue that the locative inversion data can be best handled by treating syntactic subjects as distinct from morphosyntactic subjects.

(35) a. *Did over my windowsill crawl an entire army of ants?


b. *Did on that hill appear to be located a cathedral?

As these examples show, the locative PPs are not able to appear as the subjects
of auxiliary do in closed interrogatives. Moreover, as we can see in (36), they
cannot occur as subjects in open interrogatives, either:

(36) a. *When did over my windowsill crawl an entire army of ants?


b. *Why did on that hill appear to be located a cathedral?
c. Why did an entire army of ants crawl over my windowsill?
d. Why did a cathedral appear on that hill?

The examples in (36) show that the locative PP cannot appear as the subject in
(36a-36b) although the theme NP can in (36c-36d) in non-inverted examples.
However, Bresnan (1994: 102), in a section arguing against a null expletive
subject analysis, provides the following data, which make the situation reported
here more complex:

(37) a. Which portrait of the artist hung on the wall?


b. * Which portrait of the artist did hang on the wall?

The examples in (37) show that when a subject itself is questioned, as in (37a),
which is the interrogative correlate of a portrait of the artist hung on the wall,
subject-inversion is not triggered. In fact, as (37b) shows, auxiliaries cannot
occur. We see the same facts with locatives:

(38) a. On which wall hung a portrait of the artist?


b. *On which wall did hang a portrait of the artist?

The examples in (38) correlate to on the wall hung a portrait of the artist. In these
examples, on which wall behaves just like a subject in a subject-interrogative.

3.2 Results and conclusions


It is possible to organize these results into a table - we can then explore the
hypothesis that the split in subject properties shown in locative inversion
corresponds to a split in subject properties which can be explored elsewhere in
grammar. Table 1 shows whether a given subject property applies to the locative
PP or the theme NP in a locative inversion structure; for this reason the table
says that it is not possible for the NP to undergo subject-to-subject raising in the
case of into the room ran a child.

Table 1 Subject property applied to the locative PP or the theme NP in a locative inversion structure

Subject property                              Found on PP?   Found on NP?
Agreement                                     ✗              ✓
Subject of participial relative               ✗              ✓
Subject raising                               ✓              ✗
Tag questions9                                ✓              ✗
THAT-trace effect                             ✓              ✗10
Extraction from coordinated constituents      ✓              ✗11
Inversion in interrogatives                   ✗              ✗
Subject interrogatives                        ✓              ✗

The evidence from Bresnan's paper, which I have reviewed in this section,
shows that several subject properties may occur on either the locative PP or the
postposed subject NP. Only three properties are not able to occur on the PP:
these are agreement, being the subject of a participial relative, and inversion
in non-subject interrogatives. The tag question data appears to favour a subject
analysis of the locative PP rather than the postposed NP.
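
To make this patterning concrete, Table 1 can be encoded as a simple lookup. The following Python fragment is purely expository - it is no part of the WG formalism - and merely retrieves the properties found on each element:

    # Table 1 as a lookup: for each subject property, whether it is found
    # on the preposed locative PP and on the postposed theme NP.
    TABLE_1 = {
        'agreement':                                (False, True),
        'subject of participial relative':          (False, True),
        'subject raising':                          (True, False),
        'tag questions':                            (True, False),
        'THAT-trace effect':                        (True, False),
        'extraction from coordinated constituents': (True, False),
        'inversion in interrogatives':              (False, False),
        'subject interrogatives':                   (True, False),
    }

    def properties_on(element):
        """List the subject properties found on 'PP' or 'NP'."""
        index = 0 if element == 'PP' else 1
        return [p for p, found in TABLE_1.items() if found[index]]

    print(properties_on('PP'))  # raising, tags, THAT-trace, extraction, subject interrogatives
    print(properties_on('NP'))  # agreement, subject of participial relative
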
In the next section, I set out to accommodate those facts within Word
Grammar. As I stated in the introduction, these facts are problematic for the
model put forward in Hudson (1990), because there is only a single account of
subjects in that theory. I first review some of the dimensions of subjecthood in the light of the discussion in section 2, and go on to argue that we
need to split subjects into three kinds - lexical subjects, syntactic subjects and
morphosyntactic subjects - and that this division can take account of the pattern
of data reported in section 3.

4. Factored Out Subjects


In this section, I relate the data in section 3 to the more general discussion of subjects presented in section 2. The problem for the English Word Grammar typology of dependencies is that in this model there is only one dependency which bears the 'subject-of' label - and yet in the locative inversion data, as we have seen, there are two candidates for the subject dependency. If we start with the general observations made about subjects in section 2, we can see that the subject properties detailed there gather around three poles: there are subject properties which are - broadly speaking - lexical, syntactic and morphosyntactic.12 We can put the subject properties into three lists (a schematic encoding follows the lists below); bracketed items feature in more than one list.13

Lexical properties
• Subjects are typically the dependent which expresses the Agent argument in the active voice.
• (The sole arguments of intransitives typically show (other) subject properties.)

• (The addressee of an imperative is a subject.)

Syntactic properties
• The subject can be the shared argument in coordination.
• The subject is the only argument of an xcomp which may be shared in a
predicative complementation structure (in both raising and control).
• English subjects resist extraction. (In other languages subjects are often more extractable than other arguments.)
• Auxiliary inversion fails with a subject interrogative.
• Subjects are usually the discourse topic.

Morphosyntactic properties
• (The addressee of an imperative is a subject.)
• The subject is often obligatory.
• Subject-verb agreement.
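
The three-way split just listed can also be rendered schematically, as promised above. The Python sketch below is expository only - WG relations are nodes in a network, not classes - and the strings simply abbreviate the properties in the three lists:

    # Three sub-types of the 'subject-of' dependency; bracketed items in
    # the lists above appear in more than one property set.
    class Subject:
        properties: set = set()

    class LexicalSubject(Subject):
        properties = {'expresses the Agent argument in the active voice',
                      'sole argument of an intransitive',
                      'addressee of an imperative'}

    class SyntacticSubject(Subject):
        properties = {'shared argument in coordination',
                      'shared argument of an xcomp',
                      'resists extraction (in English)',
                      'no auxiliary inversion in subject interrogatives',
                      'usually the discourse topic'}

    class MorphosyntacticSubject(Subject):
        properties = {'addressee of an imperative',
                      'often obligatory',
                      'subject-verb agreement'}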

As we can see, the lexical properties are those which are primarily to do with the mapping of semantic roles to grammatical functions. The first two items listed as lexical properties concern the mapping of semantic roles, in particular in nominative-accusative languages (like English) rather than absolutive-ergative languages (like West Greenlandic). I have bracketed the second two items in the lexical list because they are shared with other parts of the grammar: the identification of the subject of imperatives is not only a lexical property. What makes it a 'lexical' subject is that the subject of an imperative picks out the same semantic role as the first two criteria - the linking facts apply here as well. But this subject-criterion could also be morphosyntactic, because the imperative is a mood, and mood is a morphosyntactic feature. Excepting certain well-known construction types, it is only the imperative mood that permits subjects not to be represented by an overt noun or noun phrase in English.
The syntactic properties are those that have to do with the grammatical
phenomena that are commonly called 'movement' or 'deletion'. The first three
can be subsumed under the descriptive generalization that the subject is the
element which can be an argument of more than one predicate. The English
extraction data are at odds with the more typical extraction data, but they still
show that subjects can be identified by their extraction properties. I have
included the non-inversion of subject interrogatives as a syntactic fact (rather
than a morphosyntactic one) on the grounds that subject-inversion is a word
order rule, and is therefore syntactic. For this reason, the fact that PP locatives
behave like other subjects shows that they have the same syntactic properties as
other subjects: they resist subject-inversion in interrogatives.
The final observation listed under syntactic subjects is arguably not even
grammatical - but there are constructional interactions between topic and
syntactic structure and, indeed, focus and syntactic structure. Again, there is a
descriptive generalization to be captured, that subjects generally are topics.14
The morphosyntactic properties of subjects tend to be language-specific. Tenseless languages will not show any morphosyntactic subject properties, and as Hudson (1999) shows, such properties are in decay in English. I have already discussed the issue to do with the imperative. I have put the obligatoriness criterion here because it is related to agreement: in languages which have a highly developed agreement morphology in both the verbal conjugation system and the nominal declension system it is possible for subjects to be omitted. English has obligatory subjects - which can be expletive - and they appear to be obligatory because of the impoverished inflectional morphology.15
It is possible, on the basis of this discussion, to make some general
predictions about what might be found cross-linguistically. The lexical
properties of subjects will vary according to whether the language is nominative
or ergative. The morphosyntactic properties of subjects will vary according to
whether the language has a rich inflectional system, an impoverished
inflectional system, or no inflectional system. And the syntactic properties of
subjects should be relatively consistent across languages: to the extent that there
is variation in the syntactic properties of subjects, it should be attributable to the
interaction between this dimension of subjecthood and one of the other
dimensions.
From the point of view of Word Grammar, these observations about
subjects are only salient if factored-out subjects can do two things: exist in
mismatch structures, so that no single grammatical relation can be held to
obtain between a verb and another element in a clause; and help capture
descriptive facts better than simply treating subject-of as a single unitary relation.
If we return to the data presented in Table 1, we can see that all of the
subject properties that are found on the inverted PP are syntactic subject
properties in that they are all subject properties that are relevant to the ability of
a single noun or noun phrase to be construed as an argument of more than one
predicate. The properties that were found on the locative PP were:

• subject raising;
• tag questions;
• subject extraction;
• extraction from coordinated constituents;
• subject interrogatives.

Excepting the tag-question data (which are arguably morphosyntactic, and which were, in any case, moot), all of these subject properties are properties that are related to the ability of a single subject entity to be an argument of more than one predicate. What we find is that locative PPs can behave like syntactic subjects, and that when they do behave like syntactic subjects, the NP theme argument of the verb cannot behave like a syntactic subject.
This is half of an argument that subject properties are split. The other half of
the argument is that the morphosyntactic properties are found on the NP. The
properties that were found on the NP in English, but not on the preposed PP,
were:

• agreement;
• subject of participial relative.

Agreement is clearly morphosyntactic, and in any case, it is not possible for a category which does not have number to show agreement. The more difficult case is that of being subject of a participial relative. Bresnan (1994: 94) introduces this diagnostic because in Chichewa it is possible for a PP to be the subject of a participial relative. Given the cross-linguistic variation, this has to be assigned to an arbitrary difference between languages. It is probably due to the fact that participial relatives are adjuncts of the noun they modify, and in the semantics of English adjuncts take their head as their argument. The adjuncts which do not take their heads as their arguments are limited in number to a small set of exceptions: adjective adjuncts of verbs in, for example, resultative constructions like Jane ran her trainers threadbare, where threadbare is an adjunct of ran, and its 'er' is (her) trainers. The Chichewa data suggest that, in general, the participial relative facts need to be treated as syntactic rather than lexical or morphosyntactic.
The datum which I have not discussed is the inability of PP subjects to
undergo subject-inversion in interrogatives. I repeat examples (35) and (36)
here:

(35) a. *Did over my windowsill crawl an entire army of ants?
     b. *Did on that hill appear to be located a cathedral?

(36) a. *When did over my windowsill crawl an entire army of ants?
     b. *Why did on that hill appear to be located a cathedral?
     c. Why did an entire army of ants crawl over my windowsill?
     d. Why did a cathedral appear on that hill?

The examples in (35) and (36a-36b) show that the locative PP in locative inversion cannot undergo subject-inversion. I think that the crucial thing here is that this is not a syntactic property of subjecthood, but a morphosyntactic one.16 It is not a syntactic one, because the syntactic constraints were generally concerned with the ability of a single phrase to occur as an argument of more than one predicate. The restriction in (35) and (36) is different; in fact, it is attributable to the PP's lack of morphosyntactic properties. I take it that an auxiliary cannot invert with a phrase that it cannot agree in number with.17
However, the examples in (37) and (38), which I repeat here, show that the
locative PP behaves like a subject with respect to subject interrogatives:

(37) a. Which portrait of the artist hung on the wall?
     b. *Which portrait of the artist did hang on the wall?

(38) a. On which wall hung a portrait of the artist?
     b. *On which wall did hang a portrait of the artist?

The split between morphosyntactic subjects and syntactic subjects permits an elegant account of the interrogative data. In locative inversion, the locative PP patterns with subjects in subject interrogatives, simply because this property is a word-order property, which belongs in the domain of syntactic subjecthood. But the locative PP does not undergo subject-inversion, because this is a morphosyntactic property.
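
The role of agreement in blocking inversion can be caricatured in a few lines of code. This is a hedged sketch, not a WG formalization: the dictionary representation and the 'number' feature are invented for the illustration.

    # An auxiliary inverts only with a phrase it can agree in number with,
    # so a locative PP, which carries no number feature, can never invert.
    def can_invert(phrase):
        return phrase.get('number') is not None

    pp = {'form': 'over my windowsill'}                            # no number
    np = {'form': 'an entire army of ants', 'number': 'singular'}
    print(can_invert(pp), can_invert(np))                          # False True
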
I propose that, in the case of locative inversion, we reject the data from tag
questions on the grounds that even uninverted theme NPs cannot be
antecedents for the pronouns in tag-questions, as we saw in the discussion of
(24)-(26) above. Tag questions should be diagnostic of morphosyntactic
subjecthood, given that the auxiliary verb behaves like a resumptive pronoun
with respect to the main verb, and given that the agreement pattern should
reflect that of the main clause.

4.1 Summary and discussion


The locative inversion data argue for a differentiation between morphosyntactic subjects and syntactic subjects - a split which is supported by an examination of subject properties more generally. The situation is slightly more complicated in that the locative inversion data argue for a two-way split, but the more general discussion of subject properties suggests that there ought to be a three-way split. However, in other general discussions of subjecthood, such as Dixon (1994), Falk (2004), and Manning (1996), only a two-way split between 'subjects' and 'pivots' is maintained. In that work, 'subjects' correspond to my lexical subjects, and the subject, as opposed to the pivot, is the argument where linking is controlled. On the other hand, in Anderson (1997) a distinction between morphosyntactic subjects and syntactic subjects, such as I have been arguing for here, is maintained.

I think that the way forward is to treat the investigation of subjecthood in a similar way to the commutation-series approach to the phoneme inventory of a language. We have seen from the locative inversion data that morphosyntactic and syntactic subjects can be factored out from each other. I propose, briefly, to show that lexical and morphosyntactic subjects can be factored out, and then that lexical and syntactic subjects too can be factored out.
The account presented here contrasts with Bresnan's in two dimensions. Bresnan (1994: 103-5) argues that the locative PP is a subject in LFG's domain of f-structure, but that it does not occupy a subject position in LFG's domain of c-structure.18 The clause-initial property of the locative PP in locative inversion is attributed to its also being a topic in f-structure. The other part of Bresnan's (1994: 105) analysis is that the postposed NP is identified as the f-structure object, for both English and Chichewa. The evidence and arguments that Bresnan puts forward for her analysis are largely that because it is a PP, the locative PP cannot fulfil certain structural roles associated with subjecthood, which are contingent on the property of being a nominal category. PPs do not normally have the distribution of NPs - it is only in the particular construction of locative inversion, with the additional overlay of topichood, that locative PPs may behave as subjects.19 Bresnan claims that the f-structure element in her analysis accounts for what I have described here as syntactic subject properties, and the c-structure part accounts for the morphosyntactic subject properties.

Within Word Grammar, we cannot exploit a c-structure/f-structure mismatch. Nor is it possible to assert that certain subject properties reside in one domain of structure. However, by splitting the subject properties in the way I have here, we can account for the same phenomena within a single domain of structure: the dependency relation. This buys an advantage over Bresnan's account: as we have seen, the postposed NP has the properties of a morphosyntactic subject. Bresnan treats the postposed NP as an object, but this analysis cannot account for its agreement properties. By treating it as a morphosyntactic subject, and not an object, we can account for the agreement facts without assuming, for example, that English has object agreement.
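
Such a single-domain mismatch can be pictured as a record in which one verb has two distinct subject dependents. The relation names below are my own illustrative labels, not Hudson's (1990) notation:

    # 'Into the room ran a child': the PP bears the syntactic-subject
    # dependency, while the NP bears the morphosyntactic-subject dependency
    # and hence controls agreement on the verb.
    clause = {
        'verb': 'ran',
        'syntactic_subject': 'into the room',
        'morphosyntactic_subject': 'a child',
    }
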
In the next section, I discuss the distinction between lexical and other
subjects a little further. This discussion does not add to the analysis of locative
PPs in locative inversion, but it does complete the discussion of a three-way split
in subject properties.

4.1.1 LEXICAL AND OTHER SUBJECTS


We can see that lexical and morphosyntactic subjects must be factored out from each other by looking at raising and gerunds. The examples in (39) show raising; the examples in (40) show gerunds.

(39) a. Jane seems to be running.
     b. I expect Jane to be running.

In both examples in (39), Jane is the 'er' of 'run'. However, in neither example can Jane be thought of as the morphosyntactic subject, because there is no agreement: the infinitive does not have a feature-value 'number'.
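
The point can be put schematically as follows; the field names are invented for the illustration and are not WG notation:

    # 'Jane seems to be running': Jane is the 'er' of RUN - a lexical
    # subject - but the infinitive has no value for 'number', so no
    # morphosyntactic subject relation can hold.
    infinitive = {
        'form': 'to be running',
        'number': None,             # infinitives lack the agreement feature
        'lexical_subject': 'Jane',  # the 'er' of 'run'
    }
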
The examples in (40) are even more acute: the gerund running also has no
value for number, but it does have a subject - the pronoun me in (40a) and the
pronoun my in (40b). Again, in neither case can this be thought of as
morphosyntactic subjecthood.

(40) a. Everyone laughed at me running.
     b. My running was funny.

Additionally, the evidence from gerunds shows that lexical subjects have to be
distinguished from syntactic subjects: in (40a), me is the head of running, and in
(40b), my is the head of running. From these examples, we have to conclude
that there are cases where lexical subjecthood has to be distinguished from
syntactic subjecthood and from morphosyntactic subjecthood.
Another example, although a negative one, comes from weather IT. The
example in (41) shows that weather IT can be simultaneously a morphosyntactic
and a syntactic subject, even though it cannot be a lexical subject given that the
verb rain does not have any semantic arguments.

(41) It seems to be raining and to be sleeting.

The example in (41) is also important because it shows that the property of being a syntactic subject is not co-extensive with being a topic. Both Bresnan (1994) and Falk (2004) argue that syntactic subjects are topics of some kind, but an expletive pronoun is not a candidate for topichood.

5. Conclusions
In this section, I argue that the treatment of the data presented in this chapter handles the facts more satisfactorily than the mismatch account of Bresnan (1994), and that it is more compatible with other general assumptions about the architecture of grammar that Word Grammar adopts. On the basis of the locative inversion evidence, I have made a distinction between morphosyntactic and syntactic subjects, and on the basis of further evidence from other constructions I have made a further distinction which separates lexical subjects out from the other kinds of subject.
Bresnan also argues for a three-way distinction, but in her case the
factorization of subjecthood is over three of the domains of structure that LFG
recognizes: a-structure, f-structure and c-structure. She effectively argues that the
locative PPs can only be construed as subjects because they are also topics. The
problem with this account is that it treats 'topic' as fundamentally syntactic,
located in f-structure, when it is clear (a) that subjects need not be topics; and (b)
that some subjects cannot be topics. Furthermore, we have seen that the
properties of subjects itemized in section 4 do not require there to be a separate
dimension of topichood - it is simply the case that some subjects are syntactic
rather than morphosyntactic.
In some senses, the different approaches of this chapter and of Bresnan (1994) are due to the underlying assumptions of the two models, which make them different from each other.
mapping of more than one f-structure relation between two elements; Word
Grammar does not distinguish between argument structure and the instantiated
dependencies in a given construction. But it is also the case that the WG
account espoused here allows the theory of subjects to be elaborated so that it
can account for a wide range of differences in the spectrum of subject
properties.
There are some obvious avenues for future research: for example, both
West Greenlandic and Mandarin are tenseless. For this reason, Mandarin has
been argued not to have the subject and object dependencies that are witnessed
in other languages. However, while Mandarin has long-distance reflexives, West
Greenlandic does not. One salient difference is that Mandarin is a nominative
language while West Greenlandic is an ergative language, and so the question arises whether these facts are attributable to differences in lexical subjects in these languages.
Certainly more research is required on the cross-linguistic typology of
dependencies. Meanwhile, it is clear that the English Word Grammar model
needs to be revised, to admit at least three different kinds of subject.

References
Aissen, J. (1975), 'Presentational there-insertion: a cyclic root transformation'. Chicago Linguistics Society, 11, 1-14.
Anderson, J. M. (1997), A Notional Theory of Syntactic Categories. Cambridge: Cambridge University Press.
Andrews, A. (1985), 'The major functions of the noun phrase', in T. Shopen (ed.), Language Typology and Syntactic Description, vol. 1: Clause Structure. Cambridge: Cambridge University Press, pp. 62-154.
Bowers, J. S. (1976), 'On surface structure grammatical relations and the structure-preserving hypothesis'. Linguistic Analysis, 2, 584-6.
Bresnan, J. W. (1994), 'Locative inversion and the architecture of grammar'. Language, 70, 72-131.
Chomsky, N. (1981), Lectures on Government and Binding. Dordrecht: Foris.
Dixon, R. M. W. (1994), Ergativity. Cambridge: Cambridge University Press.
Falk, Y. (2004), 'Explaining subjecthood' (unpublished manuscript, Hebrew University of Jerusalem).
Gueron, J. (1980), 'The syntax and semantics of PP-extraposition'. Linguistic Inquiry, 11, 637-78.
Hudson, R. A. (1990), English Word Grammar. Oxford: Blackwell.
— (1999), 'Subject-verb agreement in English'. English Language and Linguistics, 3, 173-207.
Jackendoff, R. S. (1990), Semantic Structures. Cambridge, MA: MIT Press.
Keenan, E. L. (1976), 'Towards a universal definition of "subject"', in Charles Li (ed.), Subject and Topic. New York: Academic Press, pp. 303-33.
Keenan, E. L. and Comrie, B. (1977), 'Noun phrase accessibility and universal grammar'. Linguistic Inquiry, 8, 63-99.
McCloskey, J. (1997), 'Subjecthood and subject position', in L. Haegeman (ed.), Elements of Grammar: Handbook of Generative Syntax. Dordrecht: Kluwer, pp. 197-235.
— (2001), 'The distribution of subject properties in Irish', in W. D. Davies and S. Dubinsky (eds), Objects and Other Subjects. Dordrecht: Kluwer, pp. 157-92.
Manning, C. (1996), Ergativity. Stanford, CA: CSLI Publications.

Notes
1 Unless I explicitly state otherwise, the examples in section 1 and section 3 (where I
present Bresnan's locative inversion data) are taken from Bresnan's (1994) paper,
and the grammaticality judgements are hers. I have, however, silently amended
Bresnan's spelling to British English norms.
2 This is a pre-formal statement, and I do not intend it to commit me to any particular
theoretical position.
3 English makes a distinction between disjoint pronouns - the forms him, her, me and
so forth, and anaphoric pronouns like himself, herself, myself and so forth. Not all
languages make this distinction, and English has not always made the same
distinction in the course of its history.
4 The underscore represents the subject position for to. I do not mean by this representation to suggest that there is actual movement - like Hudson (1990) I reject a movement account. The representation is intended to be pre-formal, and is borrowed from Bresnan (1994), whose examples I borrow in section 3 - and in borrowing some of these examples, I import the representation.
5 I adopt the analysis of subject-verb agreement presented in Hudson (1999), which argues, in summary, that present-day English represents a transitional stage where, except in the case of be, number is the only remaining agreement feature.
6 The italicized part of the sentence shows that in (22a) [o]ver my windowsill appears
to have been raised from the subject position of to have crawled an entire army of ants
into the subject position of seems. However, in borrowing this representation, I do
not commit myself to a movement analysis of these data.
7 These examples are not drawn from Bresnan's paper.
8 This claim is debatable: there in this example is not the deictic there of there it is, but the empty one of there's a problem. It might make more sense to say that it was co-referential if it were the deictic there.
9 The table shows Bresnan's (1994) evaluation of this evidence, although it seems
clear that the tag-question data is rather more moot than her presentation would
suggest. I return to this in section 4 below.
10 This is not, purely, a subject diagnostic: the point is that the parallelism shows that if
the extracted PP is to be treated as a subject in one conjunct, it cannot be an object
or other argument in the second conjunct, which suggests strongly that it is actually a
subject.
11 Of course, when the theme NP is in the normal subject position it can undergo
inversion: the point of these cells in the table is that neither the NP nor the PP can
undergo inversion when it is the PP that is in subject position.
12 I am using these terms pre-formally as a descriptive heuristic. I refine the terms, and
the analysis, below.
13 I am leaving the reflexive binding facts out of the lists. These facts could
theoretically belong in all three lists - and for that reason, more research needs to
be done about the relationship between reflexive binding and the dimensions of
subjecthood. I shall come back to this briefly in section 5; here it suffices to point
out that binding has been treated in terms of clause domains, which is either syntactic or morphosyntactic, depending on how clauses are defined, and in terms of hierarchies of arguments, which is clearly lexical.
14 Of course, a subject need not be a topic: weather it, the it of extraposition, and
expletive THERE cannot be treated as topics given that they are not referential.
15 It might be objected that Mandarin has no inflectional morphology whatever, yet
subjects can be omitted in Mandarin when they can be pragmatically recovered.
This point is, however, consistent with my observation: Mandarin has no tense;
indeed, it could be argued that it has no finiteness at all. Given that lack of
morphosyntax in Mandarin, it is unsurprising that it does not have the category of
morphosyntactic subjects. And given that subjects can be omitted in languages with
a rich morphosyntactic system, as well as in languages lacking a morphosyntactic
system, we can deduce that obligatory subjecthood is neither a lexical property, nor
a syntactic property.
16 Hudson (1990: 240) also treats subject-inversion as a morphosyntactic property.
17 This claim argues for a treatment of THERE-insertion where THERE acquires its
number from its 'associate', given that THERE can invert with an auxiliary.
18 She goes on to reduce the typological differences between English and Chichewa to a difference in the c-structure representations: in Chichewa, the f-structure PP subject is also a c-structure subject.
19 For this reason, Bresnan (1994: 110) distinguishes between 'place' and 'time'
denoting PPs, which can have the same distribution as nominal elements, and the
PPs found in locative inversion.
Conclusion

KENSEI SUGAYAMA

The movement of Word Grammar began largely as an approach to the analysis of grammatical structure and linguistic meaning in response to constituency-based grammar and generative grammar. In this book, we have focused on the analyses of morphology, syntax, semantics, and discourse based on the fundamental hypotheses presented in the Preface and Chapter 1: WG is monostratal; it uses word-word dependencies; it does not use phrase structure; and language is viewed as a network of knowledge, linking concepts about words, their meanings, etc. We conclude our survey by pointing out some of the ways Word Grammar has gone, and should go, beyond its boundaries.
The monostratal character of WG is an advantage, especially the absence of
transformations, even of movement rules. Their role has been taken over by
the acceptance of double dependency within certain limits.
Although word-word dependencies are difficult for a number of grammarians to accept, they make the grammar simpler and are also important in determining the default word order of a language. The notion of phrase is not completely lost in WG, since phrases can be seen as dependency chains. It is also a good idea to see grammatical relations (subject, object, etc.) as a subclass of dependents.
WG presents language as a network of knowledge, linking concepts about words, their meanings, etc. In this network, there are no clear boundaries between different areas of knowledge - e.g. between 'lexicon' and 'grammar', or between 'linguistic meaning' and 'encyclopedic knowledge'. It is not widely known that this hypothesis was advanced much earlier than the contemporary movement of cognitive linguistics. Thus WG has implied since its very start in the early 1980s that conceptual structures and processes proposed for language should be essentially the same as those found in nonlinguistic human cognition. It uses 'default inheritance' as a very general way of capturing the relation between 'model' or 'prototype' concepts and 'instances' or 'peripheral' concepts. 'Default inheritance' and especially 'prototypes' are now widely accepted among linguists.
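
For readers who think computationally, default inheritance is the familiar override pattern of object-oriented languages. The following Python fragment is a loose analogy rather than WG notation:

    # A model (prototype) supplies defaults; a peripheral instance
    # overrides them - the essence of default inheritance.
    class Bird:                     # the model or prototype
        can_fly = True

    class Penguin(Bird):            # a peripheral instance: default overridden
        can_fly = False

    print(Bird.can_fly, Penguin.can_fly)   # True False
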
As Richard Hudson puts it in his conclusion to Chapter 1, WG addresses questions from a number of different research traditions. As in formal linguistics, it is concerned with the formal properties of language structure; but it also shares with cognitive linguistics a focus on how these structures are embedded in general cognition. Within syntax, it uses dependencies rather than phrase structure, but also recognizes the rich structures that have been highlighted in the phrase-structure tradition. In morphology it follows the European tradition which separates morphology strictly from syntax, but also allows exceptional words which contain the forms of smaller words. And so on through other areas of language. Every theoretical decision is driven by two concerns: staying true to the facts of language, and providing the simplest possible explanation for these facts.
The search for new illuminating insights is still under way, and more
widespread beliefs may well have to be abandoned; but the most general
conclusion so far, as Richard Hudson says, seems to be that language is mostly
very much like other areas of cognition. Thus, Word Grammar in its
architecture has the potential to make a contribution to a theory of cognition
that goes beyond language.