A Very Short Introduction to CCG

Mark Steedman
Draft, November 1, 1996

This paper is intended to provide the shortest possible introduction to Combinatory
Categorial Grammar.

1 Combinatory Grammars.
In Combinatory Categorial Grammar (CCG, Steedman 1987, 1996b), as in other varieties
of Categorial Grammar reviewed by Wood 1993 and exemplified in the bibliography below,
elements like verbs are associated with a syntactic “category” which identifies them
as functions, and specifies the type and directionality of their arguments and the type of
their result. We here use the “result leftmost” notation in which a rightward-combining
functor over a domain β into a range α is written α/β, while the corresponding
leftward-combining functor is written α\β.¹ α and β may themselves be function categories.
For example, a transitive verb is a function from (object) NPs into predicates, that is,
into functions from (subject) NPs into S:
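(1) likes := (S\NP)/NP
Such functions combine with arguments of the appropriate type and position by the
following rules of functional application:
(2) Forward Application: (>)
    X/Y  Y  ⇒  X
(3) Backward Application: (<)
    Y  X\Y  ⇒  X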

These rules have the form of very general binary PS rule schemata. In fact, pure categorial
grammar is just context-free grammar written in the accepting, rather than the producing,
direction, with a consequent transfer of the major burden of specifying particular grammars
from the PS rules to the lexicon. While it is now convenient to write derivations as in a,
below, they are equivalent to conventional phrase structure derivations b:
 The research was supported in part by NSF grant nos. IRI91-17110, IRI95-04372, ARPA grant no.
N66001-94-C6043, and ARO grant no. DAAH04-94-G0426.
¹ There is an alternative “result on top” notation due to Lambek 1958, according to which the latter category
is written β\α.


(4) a.  Mary     likes     musicals        b.  Mary   likes   musicals
         NP    (S\NP)/NP      NP                NP      V        NP
               --------------------->                 ----------------
                       S\NP                                  VP
        ----------------------------<           ----------------------
                     S                                    S
It is important to note that such tree-structures are simply a representation of the process of
derivation. They are not structures that need to be built by a processor, nor do they provide
the input to any rules of grammar.
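As an illustration only, the application rules 2 and 3 can be sketched in a few lines of Python. The class names `Atom` and `Func` and the function names are assumptions made for this sketch, not part of any CCG implementation discussed in the paper. The sketch derives the categories S\NP and S for derivation 4a without building a tree.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass(frozen=True)
class Atom:
    """A primitive category such as S or NP (hypothetical helper class)."""
    name: str

    def __str__(self) -> str:
        return self.name


@dataclass(frozen=True)
class Func:
    """A function category in result-leftmost notation: result/arg or result\\arg."""
    result: "Cat"
    slash: str  # "/" looks rightward for its argument, "\\" looks leftward
    arg: "Cat"

    def __str__(self) -> str:
        return f"({self.result}{self.slash}{self.arg})"


Cat = Union[Atom, Func]


def forward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Rule 2:  X/Y  Y  =>  X"""
    if isinstance(left, Func) and left.slash == "/" and left.arg == right:
        return left.result
    return None


def backward_apply(left: Cat, right: Cat) -> Optional[Cat]:
    """Rule 3:  Y  X\\Y  =>  X"""
    if isinstance(right, Func) and right.slash == "\\" and right.arg == left:
        return right.result
    return None


S, NP = Atom("S"), Atom("NP")
likes = Func(Func(S, "\\", NP), "/", NP)   # (S\NP)/NP, as in example 1

vp = forward_apply(likes, NP)   # likes + musicals
s = backward_apply(NP, vp)      # Mary + [likes musicals]
print(vp, s)                    # prints: (S\NP) S
```

Because the two functions return only the result category, nothing resembling a tree is stored; the derivation exists only as the sequence of calls.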
Such categories can be regarded as encoding the semantic type of their translation, and
this translation can be made explicit in the following expanded notation, which associates
a translation with the entire syntactic category, via the colon operator, which is assumed
to have lower precedence than the categorial slash operators. (Agreement features are also
included in the syntactic category, represented as subscripts, much as in Bach 1983. The
feature 3s is “underspecified” for gender and can combine with the more specified 3sm by
a standard unification mechanism that we will pass over here – cf. Shieber 1986.)²
(5) likes := (S\NP₃ₛ)/NP : like′
We must also expand the rules of functional application in the same way:
(6) Forward Application: (>)
    X/Y : f   Y : a   ⇒   X : fa
(7) Backward Application: (<)
    Y : a   X\Y : f   ⇒   X : fa
They yield derivations like the following:
(8)      Mary                 likes                 musicals
     NP₃ₛₘ : mary′    (S\NP₃ₛ)/NP : like′      NP : musicals′
                      --------------------------------------->
                           S\NP₃ₛ : like′ musicals′
     --------------------------------------------------------<
                    S : like′ musicals′ mary′
The derivation yields an S with a compositional interpretation, equivalent under a
convention of left associativity to (like′ musicals′) mary′.
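The way rules 6 and 7 build the interpretation in step with the syntax can be sketched as follows. This is a minimal illustration under stated assumptions: categories are encoded as nested tuples `(result, slash, argument)`, logical constants such as like′ are modelled as curried Python functions that return a tuple recording predicate-argument structure, and agreement features are left out. None of these encodings comes from the paper.

```python
def like(obj):
    """like': takes the object first, then the subject (assumed encoding)."""
    return lambda subj: ("like", obj, subj)


def is_functor(cat, slash):
    """True if cat is a function category with the given slash direction."""
    return isinstance(cat, tuple) and cat[1] == slash


def forward_apply(left, right):
    """Rule 6:  X/Y : f   Y : a   =>   X : f a"""
    (lcat, f), (rcat, a) = left, right
    if is_functor(lcat, "/") and lcat[2] == rcat:
        return (lcat[0], f(a))
    return None


def backward_apply(left, right):
    """Rule 7:  Y : a   X\\Y : f   =>   X : f a"""
    (lcat, a), (rcat, f) = left, right
    if is_functor(rcat, "\\") and rcat[2] == lcat:
        return (rcat[0], f(a))
    return None


# Lexicon for derivation 8: each entry is a (category, interpretation) pair.
mary = ("NP", "mary")
musicals = ("NP", "musicals")
likes = ((("S", "\\", "NP"), "/", "NP"), like)

vp = forward_apply(likes, musicals)   # S\NP : like' musicals'
s = backward_apply(mary, vp)          # S : like' musicals' mary'
print(s)                              # prints: ('S', ('like', 'musicals', 'mary'))
```

The final tuple lists the object before the subject, mirroring the left-associative reading (like′ musicals′) mary′.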
Coordination might be included in CG via the following rule, allowing constituents of
like type to conjoin to yield a single constituent of the same type:³
² This notation follows Steedman 1987. Another notation, used in Steedman 1990, associates a unifiable
logical form with each primitive category, so that the same transitive verb appears as follows:
(i) likes := (S : like′ y x\NP₃ₛ : x)/NP : y
The advantage is that the predicate-argument structure is built directly by the unification, and that the
combination rules need no further modification. Otherwise the choice is largely a matter of notational convenience.
³ The semantics of this rule, or rather rule schema, is somewhat complex, and is omitted here. The rule is
also simplified syntactically in several respects for the present purpose.

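(9) Coordination: (<&>)
    X  conj  X  ⇒  X
(10)  I     loathe      and      detest     opera
      NP   (S\NP)/NP    CONJ   (S\NP)/NP      NP
           ------------------------------<&>
                     (S\NP)/NP
           ----------------------------------------->
                          S\NP
      ----------------------------------------------<
                          S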
In order to allow coordination of contiguous strings that do not form constituents, CCG
allows certain further operations on functions related to the combinators of Curry and Feys 1958.
For example, functions may nondeterministically compose, as well as apply, under the following
rule:
(11) Forward Composition: (>B)
     X/Y  Y/Z  ⇒  X/Z
The most important single property of combinatory rules like this is that their semantics is
completely determined under the following principle:⁴
(12) The Principle of Combinatory Transparency: The semantic interpretation of
the category resulting from a combinatory rule is uniquely determined by the
interpretation of the slash in a category as a mapping between two sets.
In the above case, the category X/Y is a mapping of Y into X and the category Y/Z is that
of a mapping from Z into Y. Since the two occurrences of Y identify the same set, the result
category X/Z is that mapping from Z to X which constitutes the composition of the input
functions. It follows that the only semantics that we are allowed to assign, when the rule is
written in full, is as follows:
(13) Forward Composition: (>B)
     X/Y : f   Y/Z : g   ⇒   X/Z : λx.f(gx)
No other interpretation is allowed. It is worth noticing that this principle would follow
automatically if we were using the alternative unification-based notation discussed in note
2 and the composition rule as it is given in 11.
⁴ This principle is stated differently in Steedman 1996b but is in fact identical.
The operation of this rule in derivations is indicated by an underline indexed >B (because
Curry called his composition combinator B). Its effect can be seen in the derivation
of sentences like I requested, and would prefer, musicals, which crucially involves the
composition of two verbs to yield a composite of the same category as a transitive verb (the
rest of the derivation is given in the simpler notation). It is important to observe that
composition also yields an appropriate interpretation for the composite verb would prefer, as
λx.λy.will′(prefer′ x) y, an object which if applied to an object musicals and a subject I
yields the proposition will′ (prefer′ musicals′) me′. The coordination will therefore yield an
appropriate semantic interpretation.⁵
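(14)  I    requested    and         would               prefer         musicals
      NP   (S\NP)/NP   CONJ   (S\NP)/VP : will′    VP/NP : prefer′        NP
                              -------------------------------------->B
                               (S\NP)/NP : λx.λy.will′(prefer′ x) y
           ---------------------------------------------------------<&>
                                  (S\NP)/NP
           ------------------------------------------------------------------------>
                                       S\NP
      -----------------------------------------------------------------------------<
                                         S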
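The composition step for would prefer in derivation 14 can be sketched as below. The tuple encoding of categories and the modelling of will′ and prefer′ as Python functions that build tuples are assumptions of this sketch only; the rule itself is rule 13.

```python
def forward_compose(left, right):
    """Rule 13:  X/Y : f   Y/Z : g   =>   X/Z : lambda x. f(g x)"""
    (lcat, f), (rcat, g) = left, right
    both_rightward = (
        isinstance(lcat, tuple) and isinstance(rcat, tuple)
        and lcat[1] == "/" and rcat[1] == "/"
    )
    # The argument Y of the left functor must match the result Y of the right.
    if both_rightward and lcat[2] == rcat[0]:
        return ((lcat[0], "/", rcat[2]), lambda x: f(g(x)))
    return None


PRED = ("S", "\\", "NP")   # the predicate category S\NP

# Assumed encodings: will' takes a VP meaning and then a subject;
# prefer' takes an object and yields a VP meaning.
would = ((PRED, "/", "VP"), lambda p: lambda subj: ("will", p, subj))
prefer = (("VP", "/", "NP"), lambda obj: ("prefer", obj))

cat, sem = forward_compose(would, prefer)
print(cat == (PRED, "/", "NP"))   # the composite has the transitive verb category
print(sem("musicals")("me"))      # prints: ('will', ('prefer', 'musicals'), 'me')
```

The composite's category is the same as that of requested, which is what allows rule 9 to conjoin the two.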
Combinatory grammars also include type-raising rules, which turn arguments into func-
tions over functions-over-such-arguments. These rules allow arguments to compose, and
thereby take part in coordinations like I dislike, and Mary likes, musicals. For example, the
following rule allows the conjuncts to form as below (again, the remainder of the derivation
is given in the briefer notation):
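(15) Subject Type-raising: (>T)
     NP : a  ⇒  T/(T\NP) : λf.fa
(16)     I         dislike      and        Mary               likes            musicals
         NP       (S\NP)/NP    CONJ         NP              (S\NP)/NP             NP
                                         : mary′         : λx.λy.like′xy
     -------->T                        ---------->T
     S/(S\NP)                           S/(S\NP)
                                       : λf.f mary′
     --------------------->B           -------------------------------->B
              S/NP                                  S/NP
                                             : λx.like′x mary′
     ------------------------------------------------------------------<&>
                                   S/NP
     ------------------------------------------------------------------------------>
                                        S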
Rule 15 has an “order-preserving” property. That is, it turns the NP into a rightward-looking
function over leftward-looking functions, and therefore preserves the linear order of subjects
and predicates.
Like composition, type-raising rules are required by the Principle of Combinatory
Transparency 12 to be transparent to semantics. This fact ensures that the raised subject NP has
an appropriate interpretation, and can compose with the verb to produce a function that can
either coordinate with a transitive verb or reduce with an object musicals to yield
like′ musicals′ mary′.
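The semantic side of this claim can be checked with a small sketch. It assumes, for illustration only, that like′ is modelled as a curried Python function returning a tuple; `type_raise` and `compose` are hypothetical names for the interpretations that rules 15 and 13 assign.

```python
def type_raise(a):
    """Semantics of rule 15:  NP : a  =>  T/(T\\NP) : lambda f. f a"""
    return lambda f: f(a)


def compose(f, g):
    """Semantics of rule 13:  lambda x. f(g x)"""
    return lambda x: f(g(x))


def like(obj):
    """like': object first, then subject (assumed encoding)."""
    return lambda subj: ("like", obj, subj)


mary_raised = type_raise("mary")          # S/(S\NP) : lambda f. f mary'
mary_likes = compose(mary_raised, like)   # S/NP : lambda x. like' x mary'

# Reducing the fragment with the object gives the same interpretation
# as the ordinary derivation 8.
print(mary_likes("musicals"))             # prints: ('like', 'musicals', 'mary')
```

The result is identical to `like("musicals")("mary")`, so the nonstandard fragment Mary likes introduces no new predicate-argument structure.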
Since complement-taking verbs like think, VP/S, can in turn compose with fragments
like Mary likes, S/NP, we correctly predict the fact that right-node raising is unbounded, as
in a, below, and also provide the basis for an analysis of the similarly unbounded character of
leftward extraction, as in b (see the earlier papers and Steedman 1991a, 1996b for details,
including ECP effects and other extraction asymmetries, and the involvement of similar
fragments in intonational phrasing):
(17) a. [I dislike]S/NP and [you think Mary likes]S/NP musicals.
     b. The musicals which [you think Mary likes]S/NP.
⁵ The analysis begs some syntactic and semantic questions about the coordination rule and the interpretation
of modals. See Steedman 1990, 1996b for more complete accounts of both.
This apparatus has been applied to a wide variety of coordination phenomena, including
English “argument-cluster coordination”, “backward gapping” and verb-raising construc-
tions in Germanic languages, and English gapping. The first of these is relevant to the
present discussion, and is illustrated by the following analysis, from Dowty 1988:⁶
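(18)  introduce            Bill               to Sue       and     Harry to George
      (VP/PP)/NP   --------------------<T   ----------<T   CONJ   ---------------<B
                   (VP/PP)\((VP/PP)/NP)     VP\(VP/PP)            VP\((VP/PP)/NP)
                   ------------------------------------<B
                            VP\((VP/PP)/NP)
                   ---------------------------------------------------------------<&>
                                          VP\((VP/PP)/NP)
      ----------------------------------------------------------------------------<
                                               VP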
The important feature of this analysis is that it uses “backward” rules of type-raising <T
and composition <B that are the exact mirror-image of the two “forward” versions
introduced as examples 11 and 15. These rules similarly guarantee that the semantics of
nonstandard constituents like Bill to Sue is such as to reduce appropriately with a ditransitive
verb like give. It is in fact a prediction of the theory that such a construction can exist in
English, and its inclusion in the grammar requires no additional mechanism whatsoever.
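A sketch of the semantics of the argument cluster follows. It assumes that backward composition is interpreted as the mirror image of rule 13, that is, Y\Z : g  X\Y : f  ⇒  X\Z : λx.f(gx), which the text implies but does not write out. It also assumes, for illustration, that introduce′ takes the direct object and then the PP argument, and that the resulting VP meaning is simply recorded as a tuple with the subject left out.

```python
def type_raise(a):
    """Semantics of type-raising (either direction): lambda f. f a"""
    return lambda f: f(a)


def backward_compose(g, f):
    """Assumed mirror image of rule 13:
    Y\\Z : g   X\\Y : f   =>   X\\Z : lambda x. f(g x)"""
    return lambda x: f(g(x))


def introduce(obj):
    """introduce': direct object first, then the PP argument (assumed encoding)."""
    return lambda goal: ("introduce", obj, goal)


# Each conjunct of example 18 is a composition of two raised arguments.
bill_to_sue = backward_compose(type_raise("bill"), type_raise("sue"))
harry_to_george = backward_compose(type_raise("harry"), type_raise("george"))

# Backward application of a cluster to the verb reduces to an ordinary VP meaning.
print(bill_to_sue(introduce))       # prints: ('introduce', 'bill', 'sue')
print(harry_to_george(introduce))   # prints: ('introduce', 'harry', 'george')
```

Each cluster is a function still waiting for the verb, which is why the two clusters have the same category and can be conjoined before the verb is applied.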
The earlier papers show that no other non-constituent coordinations of dative-accusative
NP sequences are allowed in any language with the English verb categories, given the as-
sumptions of CCG. Thus the following are ruled out in principle, rather than by stipulation:
(19) a. *Bill to Sue and introduce Harry to George
b. *Introduce to Sue Bill and to George Harry
A number of related well-known cross-linguistic generalisations concerning the depen-
dency of so-called “gapping” upon lexical word-order are also captured (see Dowty 1988
and Steedman 1985, 1990). In English the phenomenon shows up in all constructions that
can be assumed to involve multiple arguments of the same functor:
(20) a. I gave Deadeye Dick a sugar stick, and Mexican Pete a bun.
b. I saw Keats yesterday, and Chapman the day before.
c. I saw Gilbert arrive and George leave.
d. I persuaded Sid to take a bath and Nancy to have a wash.
e. I promised Mutt to go to the movies and Jeff to go to the play.
f. I told Shem I lived in London and Shaun I lived in Philadelphia.
g. I bet Sammy sixpence I would win and Rosie a dollar I would lose.
⁶ In more recent work, Dowty has disowned this analysis, because it apparently entails an “intrinsic” use of
logical form to account for binding phenomena. This issue is discussed in Steedman 1996b.

Phenomena like the above immediately suggest that all complements of verbs bear type-
raised categories. However, we do not want anything else to type-raise. In particular,
we do not want raised categories to raise again, or we risk infinite regress in our rules.
One way to deal with this problem is to explicitly restrict the two type-raising rules to
the relevant arguments of verbs, as follows, a restriction that is a natural expression of
the resemblance of type-raising to some generalized form of (nominative, accusative, etc)
grammatical case—cf. Steedman 1985, 1990.
(21) Forward Type-raising: (>T)
     X : a  ⇒  T/(T\X) : λf.fa
     where X ∈ {NP}
(22) Backward Type-raising: (<T)
     X : a  ⇒  T\(T/X) : λf.fa
     where X ∈ {NP, PP, AP, VP, VP′, S, S′}
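The effect of the restrictions in 21 and 22 can be illustrated with a short sketch. The tuple encoding of categories, the explicit choice of the result category T as a parameter, and the set and function names are assumptions of this sketch. Because a raised category is not itself in either set, it cannot be raised again, which blocks the regress mentioned above.

```python
# Categories that each rule may apply to, following the conditions in 21 and 22.
FORWARD_RAISABLE = {"NP"}
BACKWARD_RAISABLE = {"NP", "PP", "AP", "VP", "VP'", "S", "S'"}


def forward_raise(x, t):
    """Rule 21 (syntax only):  X  =>  T/(T\\X), where X is in FORWARD_RAISABLE."""
    if x not in FORWARD_RAISABLE:
        return None
    return (t, "/", (t, "\\", x))


def backward_raise(x, t):
    """Rule 22 (syntax only):  X  =>  T\\(T/X), where X is in BACKWARD_RAISABLE."""
    if x not in BACKWARD_RAISABLE:
        return None
    return (t, "\\", (t, "/", x))


subject = forward_raise("NP", "S")    # S/(S\NP), as in derivation 16
print(subject)
print(forward_raise("PP", "S"))       # None: only NP raises forward
print(backward_raise(subject, "S"))   # None: a raised category does not raise again
```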

The other solution is to simply expand the lexicon by incorporating the raised categories
that these rules define, so that categories like NP have raised categories, and all
functions into such categories, like determiners, have the category of functions into raised
categories.
These two tactics are essentially equivalent, because in some cases we need both raised
and unraised categories for complements. (The argument is developed in Steedman 1996b,
and depends upon the observation that any category that is not a barrier to extraction must
bear an unraised category, and any argument that can take part in argument-cluster
coordination must be raised.) The correct solution from a linguistic point of view, insofar as it
captures the fact that some languages appear to lack certain unraised categories (notably PP
and S′), is probably the lexical solution. However, the restricted rule-based solution makes
derivations easier to read and causes them to take up less space. We will therefore follow it
here without further discussion.
Since categories like NP can be raised over a number of different functor categories,
such as predicate, transitive verb, ditransitive verb etc., and since the resulting raised
categories S\(S/NP), (S\NP)\((S\NP)/PP), etc. of NPs, PPs, etc. are quite hard to read, it
is sometimes convenient to abbreviate the raised categories as a schema written NP↑, PP↑,
etc.⁷
⁷ In a computational implementation one would in fact want to schematise type-raised categories in this
way; see Steedman 1991c for further discussion.

Bibliography
Ades, Anthony E. and Mark J. Steedman: 1982, “On the Order of Words,” Linguistics and Philosophy,
4, 517-558.
Bach, Emmon: 1979, “Control in Montague Grammar,” Linguistic Inquiry, 10, 513-531.
Bach, Emmon: 1980, “In Defense of Passive,” Linguistics and Philosophy, 3, 297-341.

Bach, Emmon: 1983, “On the Relationship between Word-grammar and Phrase-grammar,” Natural
Language and Linguistic Theory, 1, 65-89.
Bach, Emmon: 1988, “Categorial Grammars as Theories of Language,” in Oehrle et al. 1988,
17-34.
Chierchia, Gennaro: 1985, “Formal Semantics and the Grammar of Predication,” Linguistic Inquiry,
16, 417-443.
Chierchia, Gennaro: 1988, “Aspects of a Categorial Theory of Binding,” in Oehrle et al, 1988,
153-98.
Chierchia, Gennaro: 1989, “Structured Meanings, Thematic Roles, and Control,” in G. Chierchia,
B. Partee, and R. Turner, (eds.), Properties, Types and Meanings, Kluwer, Dordrecht. 131-166.
Cresswell, Max: 1973, Logics and Languages, Methuen.
Curry, Haskell and Robert Feys: 1958, Combinatory Logic, North Holland, Amsterdam.
Dowty, David: 1982, “Grammatical Relations and Montague Grammar,” in Pauline Jacobson and
Geoffrey K. Pullum (eds.), The Nature of Syntactic Representation, Reidel, Dordrecht. 79-130.
Dowty, David: 1988, Type-Raising, Functional Composition, and Non-Constituent Coordination, in
Richard T. Oehrle, E. Bach and D. Wheeler, (eds), Categorial Grammars and Natural Language
Structures, Reidel, Dordrecht, 153–198.
Dowty, David: 1992, “‘Variable-free’ Syntax, Variable-binding Syntax, the Natural Deduction Lambek
Calculus, and the Crossover Constraint,” in Proceedings of the 11th West Coast Conference on
Formal Linguistics, Stanford CA 1992, SLA Stanford CA, 161-176.
Hepple, Mark: 1990, The Grammar and Processing of Order and Dependency: a Categorial
Approach, Ph.D dissertation, University of Edinburgh.
Hoffman, Beryl: 1995, Computational Analysis of the Syntax and Interpretation of “Free” Word-
order in Turkish, unpublished PhD thesis, IRCS Report 95-17, IRCS, U. Penn.
Jacobson, Pauline: 1990, “Raising as Function Composition,” Linguistics & Philosophy, 13, 423-
476.
Jacobson, Pauline: 1992, “The Lexical Entailment Theory of Control and the Tough Construction,”
in Ivan Sag & Anna Szabolcsi, (eds.) Lexical Matters, CSLI/Chicago UP, Chicago, 269-300.
Kirkeby-Garstad, Trond and Krisztina Polgárdi: 1994, “Against Prosodic Composition,” G. Bouma
and G. van Noord, (eds.), Computational Linguistics in the Netherlands, IV, Vakgroep Alfa-
Informatica, Rijksuniversiteit Groningen, Groningen. 73-86.
Lambek, Joachim: 1958, “The Mathematics of Sentence Structure,” American Mathematical
Monthly, 65, 154-170.
Moortgat, Michael: 1988, Categorial Investigations, Ph.D dissertation, Universiteit van Amsterdam,
publ. Foris, Dordrecht, 1989.
Morrill, Glyn: 1994, Type-logical Grammar, Kluwer, Dordrecht.
Oehrle, Richard. 1987. “Boolean Properties in the Analysis of Gapping.” In Syntax and Semantics,
20, 203–240. Geoffrey Huck and Almerindo Ojeda, eds., New York: Academic Press.
Oehrle, Richard T: 1988, “Multidimensional Compositional Functions as a Basis for Grammatical
Analysis,” in Richard T. Oehrle, E. Bach and D. Wheeler, (eds), Categorial Grammars and Natural
Language Structures, Reidel, Dordrecht. 349-390.

Prevost, Scott and Mark Steedman: 1994, “Specifying Intonation from Context for Speech Synthe-
sis,” Speech Communication, 15, 139-153.
Prevost, Scott: 1995, A Semantics of Contrast and Information Structure for Specifying Intonation
in Spoken Language Generation, PhD dissertation, IRCS TR 96-01, University of Pennsylvania.
Shieber, Stuart: 1986, An Introduction to Unification-based Approaches to Grammar, CSLI/Chicago
University Press, Chicago.
Steedman, Mark: 1985a. Dependency and Coordination in the Grammar of Dutch and English,
Language 61.523-568.
Steedman, Mark: 1987. Combinatory grammars and parasitic gaps. Natural Language & Linguistic
Theory, 5, 403-439.
Steedman, Mark: 1990. “Gapping as Constituent Coordination,” Linguistics & Philosophy, 13,
207-263.
Steedman, Mark: 1991a, Structure and Intonation, Language, 68, 260-296.
Steedman, Mark: 1991b, “Surface Structure, Intonation, and Focus,” in Ewan Klein and F. Veltman
(eds.), Natural Language and Speech, Proceedings of the ESPRIT Symposium, Brussels, Nov. 1991.
21-38, 260-296.
Steedman, Mark: 1991c, “Type-raising and Directionality in Combinatory Grammar,” in Proceedings
of the 28th Annual Meeting of the Association for Computational Linguistics, Berkeley, July
1991, 71-78.
Steedman, Mark: 1995, “Temporality,” in J. van Benthem & A. ter Meulen (eds.) Handbook of
Logic and Linguistics, North Holland, Amsterdam.
Steedman, Mark: 1996a, “Representing Discourse Information for Spoken Dialogue Generation”,
Proceedings of the 2nd International Symposium on Spoken Dialogue, Workshop at the International
Conference on Spoken Language Processing (ICSLP-96), Philadelphia October 1996, 89-92.
Steedman, Mark: 1996b, Surface Structure and Interpretation, MIT Press, Cambridge (in press).
Stone, Matthew and Christy Doran: 1996, “Paying Heed to Collocations,” ms. University of Penn-
sylvania.
Szabolcsi, Anna: 1989, “Bound Variables in Syntax: Are there any?” in R. Bartsch, J. van Benthem,
and P. van Emde Boas (eds.), Semantics and Contextual Expression, 295-318, Foris, Dordrecht.
Szabolcsi, Anna: 1992a, “On Combinatory Grammar and Projection from the Lexicon” in Ivan Sag
& Anna Szabolcsi, (eds.), Lexical Matters, CSLI, Stanford CA. 241-268.
Szabolcsi, Anna, and Franz Zwarts. 1990. Semantic Properties of Composed Functions and the
Distribution of Wh-phrases. Proceedings of the Seventh Amsterdam Colloquium, ed. by Stokhof
and Torenvliet. 529-555. Amsterdam: ILLI.
Wood, Mary McGee: 1993, Categorial Grammar, Routledge.
Mark Steedman
Dept. of Computer and Information Science
University of Pennsylvania
200 South 33rd Street
Philadelphia PA 19104-6389
([email protected])
