Appendix E: Combinatory Categorial Grammar
In this chapter, we provide an overview of categorial grammar (Ajdukiewicz 1935, Bar-Hillel 1953), an early lexicalized grammar model, as well as an important modern extension, combinatory categorial grammar, or CCG (Steedman 1996, Steedman 1989, Steedman 2000). CCG is a heavily lexicalized approach motivated by both syntactic and semantic considerations. It is an exemplar of a set of computationally relevant approaches to grammar that emphasize putting grammatical information in a rich lexicon, including Lexical-Functional Grammar (LFG) (Bresnan, 1982), Head-Driven Phrase Structure Grammar (HPSG) (Pollard and Sag, 1994), and Tree-Adjoining Grammar (TAG) (Joshi, 1985).
E.1 CCG Categories
The categorial approach consists of three major elements: a set of categories, a lexicon that associates words with categories, and a set of rules that govern how categories combine in context.
Categories are built from a small set of atomic types together with two slash operators: if X and Y are categories, then (X/Y) and (X\Y) are categories as well. The slash notation is used to define the functions in the grammar. It specifies the type of the expected argument, the direction in which it is expected to be found, and the type of the result. Thus, (X/Y) is a function that seeks a constituent of type Y to its right and returns a value of type X; (X\Y) is the same except that it seeks its argument to its left.
The set of atomic categories is typically very small and includes familiar elements such as sentences and noun phrases. Functional categories include verb phrases and complex noun phrases, among others.
E.2 The Lexicon
The lexicon in the categorial approach consists of assignments of categories to words, as in the following examples:
flight : N
Miami : NP
cancel : (S\NP)/NP
Nouns and proper nouns like flight and Miami are assigned to atomic categories,
reflecting their typical role as arguments to functions. On the other hand, a transitive
verb like cancel is assigned the category (S\NP)/NP: a function that seeks an NP on
its right and returns as its value a function with the type (S\NP). This function can,
in turn, combine with an NP on the left, yielding an S as the result. This captures
subcategorization information with a computationally useful, internal structure.
Ditransitive verbs like give, which expect two arguments after the verb, would have the category ((S\NP)/NP)/NP: a function that combines with an NP on its right to yield yet another function of the transitive-verb category (S\NP)/NP, like the one given above for cancel.
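To make this notation concrete, here is a minimal sketch, not from the original text, that encodes categories as Python data: atomic categories are strings, and a functional category is a (slash, result, argument) tuple, so (S\NP)/NP becomes ('/', ('\\', 'S', 'NP'), 'NP'). The helper names are our own.

# Minimal sketch: CCG categories as nested Python tuples.
def fwd(result, arg):    # X/Y: a function seeking arg to its right
    return ('/', result, arg)

def bwd(result, arg):    # X\Y: a function seeking arg to its left
    return ('\\', result, arg)

# A tiny lexicon mirroring the entries above.
LEXICON = {
    'flight': 'N',
    'Miami':  'NP',
    'cancel': fwd(bwd('S', 'NP'), 'NP'),             # (S\NP)/NP
    'give':   fwd(fwd(bwd('S', 'NP'), 'NP'), 'NP'),  # ((S\NP)/NP)/NP
}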
E.3 Rules
The rules of a categorial grammar specify how functions and their arguments combine. The following two rule templates constitute the basis for all categorial grammars.
X/Y Y ⇒ X (E.1)
Y X\Y ⇒ X (E.2)
The first rule applies a function to its argument on the right, while the second looks to the left for its argument. We'll refer to the first as forward function application, and the second as backward function application. The result of applying either of these rules is the category specified as the value of the function being applied.
Given these rules and a simple lexicon, let’s consider an analysis of the sentence
United serves Miami. Assume that serves is a transitive verb with the category
(S\NP)/NP and that United and Miami are both simple NPs. Using both forward
and backward function application, the derivation would proceed as follows:
United serves Miami
NP (S\NP)/NP NP
>
S\NP
<
S
Categorial grammar derivations are illustrated as growing down from the words; rule applications are shown with a horizontal line that spans the elements involved, with the type of the operation indicated at the right end of the line. In this example, there are two function applications: one forward function application indicated by the > that applies the verb serves to the NP on its right, and one backward function application indicated by the < that applies the result of the first to the NP United on its left.
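The two application rules are easy to state over the tuple encoding from the earlier sketch. The following is an illustrative sketch, not the text's implementation; it replays the United serves Miami derivation.

# Forward application (E.1): X/Y Y => X
def forward_apply(left, right):
    if isinstance(left, tuple) and left[0] == '/' and left[2] == right:
        return left[1]
    return None

# Backward application (E.2): Y X\Y => X
def backward_apply(left, right):
    if isinstance(right, tuple) and right[0] == '\\' and right[2] == left:
        return right[1]
    return None

serves = ('/', ('\\', 'S', 'NP'), 'NP')  # (S\NP)/NP
vp = forward_apply(serves, 'NP')         # serves Miami -> S\NP  (>)
s  = backward_apply('NP', vp)            # United [serves Miami] -> S  (<)
assert vp == ('\\', 'S', 'NP') and s == 'S'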
English permits the coordination of two constituents of the same type, resulting
in a new constituent of the same type. The following rule provides the mechanism:
X CONJ X ⇒ X (E.3)
This rule states that when two constituents of the same category are separated by a
constituent of type CONJ they can be combined into a single larger constituent of
the same type. The following derivation illustrates the use of this rule.
We flew to Geneva and drove to Chamonix
NP (S\NP)/PP PP/NP NP CONJ (S\NP)/PP PP/NP NP
> >
PP PP
> >
S\NP S\NP
<Φ>
S\NP
<
S
Here the two S\NP constituents are combined via the conjunction operator <Φ>
to form a larger constituent of the same type, which can then be combined with the
subject NP via backward function application.
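In the same sketch style (our own helper, not the text's), the coordination rule simply checks that the two conjuncts share a category, given a CONJ marker:

# Coordination (E.3): X CONJ X => X
def coordinate(left, conj, right):
    if conj == 'CONJ' and left == right:
        return left
    return None

vp = ('\\', 'S', 'NP')                   # the two S\NP conjuncts
assert coordinate(vp, 'CONJ', vp) == vp  # flew to Geneva and drove to Chamonix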
These examples illustrate the lexical nature of the categorial grammar approach.
The grammatical facts about a language are largely encoded in the lexicon, while the
rules of the grammar are boiled down to a set of three rules. Unfortunately, the basic
categorial approach does not give us any more expressive power than we had with
traditional CFG rules; it just moves information from the grammar to the lexicon. To move beyond these limitations, CCG includes operations that operate over functions.
The first pair of operators permit us to compose adjacent functions:
X/Y Y/Z ⇒ X/Z (E.4)
Y\Z X\Y ⇒ X\Z (E.5)
The first rule, called forward composition, can be applied to adjacent constituents where the first is a function seeking an argument of type Y to its right, and the second is a function that provides Y as a result. This rule allows us to compose these two functions into a single one with the type of the first constituent and the argument of the second. Although the notation is a little awkward, the second rule, backward composition, is the same, except that we're looking to the left instead of to the right for the relevant arguments. Both kinds of composition are signalled by a B in CCG diagrams, accompanied by a < or > to indicate the direction.
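Over the tuple encoding used in the earlier sketches, both composition rules reduce to matching the shared inner category Y and rebuilding a new slash category. Again, this is a sketch rather than an official implementation:

# Forward composition (>B): X/Y Y/Z => X/Z
def forward_compose(left, right):
    if (isinstance(left, tuple) and left[0] == '/'
            and isinstance(right, tuple) and right[0] == '/'
            and left[2] == right[1]):
        return ('/', left[1], right[2])
    return None

# Backward composition (<B): Y\Z X\Y => X\Z
def backward_compose(left, right):
    if (isinstance(left, tuple) and left[0] == '\\'
            and isinstance(right, tuple) and right[0] == '\\'
            and right[2] == left[1]):
        return ('\\', right[1], left[2])
    return None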
The next operator is type raising. Type raising elevates simple categories to the status of functions. More specifically, type raising takes a category and converts it to a function that seeks as an argument a function that takes the original category as its argument. The following schemas show two versions of type raising: one for arguments to the right, and one for the left.
X ⇒ T/(T\X) (E.6)
X ⇒ T\(T/X) (E.7)
The category T in these rules can correspond to any of the atomic or functional categories already present in the grammar.
A particularly useful example of type raising transforms a simple NP argument in subject position to a function that can compose with a following VP. To see how this works, let's revisit our earlier example of United serves Miami. Instead of classifying United as an NP that can serve as an argument to the function attached to serves, we can use type raising to reinvent it as a function in its own right as follows.
NP ⇒ S/(S\NP)
Combining this type-raised constituent with the forward composition rule (E.4) permits the following alternative to our previous derivation.
United serves Miami
NP (S\NP)/NP NP
>T
S/(S\NP)
>B
S/NP
>
S
By type raising United to S/(S\NP), we can compose it with the transitive verb
serves to yield the (S/NP) function needed to complete the derivation.
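Continuing the sketch (with our own helper names), forward type raising plus forward composition reproduces the intermediate S/NP constituent of this derivation:

# Forward type raising (E.6): X => T/(T\X)
def type_raise_forward(x, t):
    return ('/', t, ('\\', t, x))

def forward_compose(left, right):           # X/Y Y/Z => X/Z, as before
    if left[0] == '/' and right[0] == '/' and left[2] == right[1]:
        return ('/', left[1], right[2])
    return None

united = type_raise_forward('NP', 'S')      # NP => S/(S\NP)   (>T)
serves = ('/', ('\\', 'S', 'NP'), 'NP')     # (S\NP)/NP
partial = forward_compose(united, serves)   # United serves => S/NP  (>B)
assert partial == ('/', 'S', 'NP')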
There are several interesting things to note about this derivation. First, it provides a left-to-right, word-by-word derivation that more closely mirrors the way humans process language. This makes CCG a particularly apt framework for psycholinguistic studies. Second, this derivation involves the use of an intermediate unit of analysis, United serves, that does not correspond to a traditional constituent in English. This ability to make use of such non-constituent elements provides CCG with the ability to handle the coordination of phrases that are not proper constituents, as in the following example.
(E.8) We flew IcelandAir to Geneva and SwissAir to London.
Here, the segments that are being coordinated are IcelandAir to Geneva and
SwissAir to London, phrases that would not normally be considered constituents, as
can be seen in the following standard derivation for the verb phrase flew IcelandAir
to Geneva.
flew IcelandAir to Geneva
(VP/PP)/NP NP PP/NP NP
> >
VP/PP PP
>
VP
In this derivation, there is no single constituent that corresponds to IcelandAir
to Geneva, and hence no opportunity to make use of the <Φ> operator. Note that
complex CCG categories can get a little cumbersome, so we’ll use VP as a shorthand
for (S\NP) in this and the following derivations.
The following alternative derivation provides the required element through the
use of both backward type raising (E.7) and backward function composition (E.5).
flew IcelandAir to Geneva
(VP/PP)/NP NP PP/NP NP
<T >
(VP/PP)\((VP/PP)/NP) PP
<T
VP\(VP/PP)
<B
VP\((VP/PP)/NP)
<
VP
Applying the same analysis to SwissAir to London satisfies the requirements for the <Φ> operator, yielding a parallel derivation for our original example (E.8).
Finally, let's examine how these advanced operators can be used to handle long-distance dependencies (also referred to as syntactic movement or extraction). As mentioned in Appendix D, long-distance dependencies arise from many English constructions including wh-questions, relative clauses, and topicalization. What these constructions have in common is a constituent that appears somewhere distant from its usual, or expected, location. Consider the following relative clause as an example.
the flight that United diverted
Here, divert is a transitive verb that expects two NP arguments, a subject NP to its
left and a direct object NP to its right; its category is therefore (S\NP)/NP. However,
in this example the direct object the flight has been “moved” to the beginning of the
clause, while the subject United remains in its normal position. What is needed is a
way to incorporate the subject argument, while dealing with the fact that the flight is
not in its expected location.
The following derivation accomplishes this, again through the combined use of
type raising and function composition.
the flight that United diverted
NP/N N (NP\NP)/(S/NP) NP (S\NP)/NP
> >T
NP S/(S\NP)
>B
S/NP
>
NP\NP
<
NP
As we saw with our earlier examples, the first step of this derivation is type raising United to the category S/(S\NP), allowing it to combine with diverted via forward composition. The result of this composition is S/NP, which preserves the fact that we are still looking for an NP to fill the missing direct object. The second critical piece is the lexical category assigned to the word that: (NP\NP)/(S/NP). This function seeks, to its right, a sentence missing its direct object (S/NP), and transforms it into an NP modifier (NP\NP) that seeks the NP it modifies to its left, precisely where we find the flight.
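The category bookkeeping in this derivation can be checked mechanically with the same tuple encoding. The following sketch (ours, not the book's) verifies that composing type-raised United with diverted yields exactly the S/NP that the category of that expects:

S_NP     = ('\\', 'S', 'NP')                            # S\NP
united   = ('/', 'S', S_NP)                             # type-raised: S/(S\NP)
diverted = ('/', S_NP, 'NP')                            # (S\NP)/NP
that     = ('/', ('\\', 'NP', 'NP'), ('/', 'S', 'NP'))  # (NP\NP)/(S/NP)

# Forward composition (>B): S/(S\NP)  (S\NP)/NP  =>  S/NP
united_diverted = ('/', united[1], diverted[2])
assert united_diverted == ('/', 'S', 'NP')

# Forward application (>): that consumes the S/NP, yielding NP\NP, which
# then combines with "the flight" to its left by backward application.
assert that[2] == united_diverted
relative_clause = that[1]                               # NP\NP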
E.4 CCGbank
As with phrase-structure approaches, treebanks play an important role in CCG-based approaches to parsing. CCGbank (Hockenmaier and Steedman, 2007) is the largest and most widely used CCG treebank. It was created by automatically translating phrase-structure trees from the Penn Treebank via a rule-based approach. The method produced successful translations of over 99% of the trees in the Penn Treebank, resulting in 48,934 sentences paired with CCG derivations. It also provides a lexicon of 44,000 words with over 1200 categories. Appendix C will discuss how these resources can be used to train CCG parsers.
E.5 Ambiguity in CCG
As with more traditional grammars, CCG licenses multiple analyses for many inputs. Consider a prepositional phrase like to Reno, which might modify either a preceding noun phrase or a preceding verb phrase. In a CFG, this attachment ambiguity is reflected in a choice among rules such as the following:
Nominal → Nominal PP
VP → VP PP
VP → Verb NP PP
In CCG, the same ambiguity resides in the lexicon. Assigning to the category (NP\NP)/NP makes it a noun-phrase modifier: this category expects to find two arguments, one to the right as with a traditional preposition, and one to the left that corresponds to the NP to be modified.
Alternatively, we could assign to to the category (S\S)/NP, which permits a derivation in which to Reno modifies the preceding verb phrase. This lexical ambiguity is compounded by spurious ambiguity: because of composition and type raising, CCG typically licenses many distinct derivations that assign the same categories, and the same meaning, to a sentence.
E.6 CCG Parsing
While CCG parsers are still subject to ambiguity arising from the choice of grammar rules, including the kind of spurious ambiguity discussed above, it should be clear that the choice of lexical categories is the primary problem to be addressed in CCG parsing.
E.6.1 Supertagging
Chapter 8 introduced the task of part-of-speech tagging, the process of assigning the correct lexical category to each word in a sentence. Supertagging is the corresponding task for highly lexicalized grammar frameworks, where the assigned tags often dictate much of the derivation for a sentence (Bangalore and Joshi, 1999).
CCG supertaggers rely on treebanks such as CCGbank to provide both the overall set of lexical categories as well as the allowable category assignments for each word in the lexicon. CCGbank includes over 1000 lexical categories; however, in practice, most supertaggers limit their tagsets to those tags that occur at least 10 times in the training corpus. This results in a total of around 425 lexical categories available for use in the lexicon. Note that even this smaller number is large in contrast to the 45 POS types used by the Penn Treebank tagset.
As with traditional part-of-speech tagging, the standard approach to building a CCG supertagger is to use supervised machine learning to build a sequence labeler from hand-annotated training data. To find the most likely sequence of tags given a sentence, it is most common to use a neural sequence model, either RNN or Transformer.
It's also possible, however, to use the CRF tagging model described in Chapter 8, using similar features: the current word w_i, its surrounding words within l words, local POS tags and character suffixes, and the supertag from the prior timestep, training by maximizing log-likelihood of the training corpus and decoding via the Viterbi algorithm as described in Chapter 8.
Unfortunately, the large number of possible supertags combined with high per-word ambiguity leads the naive CRF algorithm to error rates that are too high for practical use in a parser. The single best tag sequence T̂ will typically contain too many incorrect tags for effective parsing to take place. To overcome this, we instead return a probability distribution over the possible supertags for each word in the input: each word is associated with the probability of each of its possible supertags, in the context of the input sentence.
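For illustration, such a distribution for United serves Denver might look like the following Python structure. The values are hypothetical, chosen only to approximately reproduce the edge costs in the worked A* example later in this section (e.g., 0.1 for serves tagged N); each word's remaining probability mass falls on supertags not shown.

# Hypothetical per-word supertag distributions for "United serves Denver".
supertag_dist = [
    {'N/N': 0.56, 'NP': 0.42},       # United
    {'(S\\NP)/NP': 0.71, 'N': 0.1},  # serves
    {'NP': 0.64, 'N': 0.22},         # Denver
]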
To get the probability of each possible word/tag pair, we’ll need to sum the
probabilities of all the supertag sequences that contain that tag at that location. This
can be done with the forward-backward algorithm that is also used to train the CRF,
described in Appendix A.
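To illustrate what is being computed, the following sketch sums sequence scores by brute-force enumeration; the forward-backward algorithm computes the same marginals efficiently. The scorer interface and toy numbers are our own, not from the text.

from itertools import product

def tag_marginals(seq_score, tagsets):
    """For each position i and tag t, sum the scores of all tag sequences
    with t at position i, then normalize. Exponential-time illustration only."""
    total = 0.0
    marginals = [{t: 0.0 for t in tags} for tags in tagsets]
    for seq in product(*tagsets):
        p = seq_score(seq)
        total += p
        for i, t in enumerate(seq):
            marginals[i][t] += p
    return [{t: p / total for t, p in m.items()} for m in marginals]

# Toy scorer that treats positions independently:
dist = [{'N/N': 0.6, 'NP': 0.4}, {'(S\\NP)/NP': 0.7, 'N': 0.3}]
score = lambda seq: dist[0][seq[0]] * dist[1][seq[1]]
print(tag_marginals(score, [list(d) for d in dist]))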
CCG parsing can be framed as heuristic search using the A* algorithm. In this framing, each state in the search corresponds to an edge: the span of the input it covers, its grammatical category, and its f-cost. Here, the g component represents the current cost of an edge and the h component represents an estimate of the cost to complete a derivation that makes use of that edge. The use of A* for phrase structure parsing originated with Klein and Manning (2003), while the CCG approach presented here is based on the work of Lewis and Steedman (2014).
Using information from a supertagger, an agenda and a parse table are initialized with states representing all the possible lexical categories for each word in the input, along with their f-costs. The main loop removes the lowest cost edge from the agenda and tests to see if it is a complete derivation. If it reflects a complete derivation it is selected as the best solution and the loop terminates. Otherwise, new states based on the applicable CCG rules are generated, assigned costs, and entered into the agenda to await further processing. The loop continues until a complete derivation is discovered, or the agenda is exhausted, indicating a failed parse. The algorithm is given in Fig. E.1.
supertags ← SUPERTAGGER(words)
for i ← 1 to LENGTH(words) do
  for all {A | (words[i], A, score) ∈ supertags}
    edge ← MAKEEDGE(i − 1, i, A, score)
    table ← INSERTEDGE(table, edge)
    agenda ← INSERTEDGE(agenda, edge)
loop do
  if EMPTY?(agenda) return failure
  current ← POP(agenda)
  if COMPLETEDPARSE?(current) return table
  table ← INSERTEDGE(table, current)
  for each rule in APPLICABLERULES(current) do
    successor ← APPLY(rule, current)
    if successor ∉ agenda or table
      agenda ← INSERTEDGE(agenda, successor)
    else if successor ∈ agenda with higher cost
      agenda ← REPLACEEDGE(agenda, successor)
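For concreteness, here is a compact Python version of the same loop using heapq as the agenda. It is a sketch under our own assumptions: edges are (start, end, category) triples, costs are negative log probabilities, the goal is an S spanning the whole input, and apply_rules stands in for the applicable CCG combinators.

import heapq
from itertools import count

def astar_parse(supertags, apply_rules, n_words, h, goal_cat='S'):
    """supertags: per-position lists of (category, g_cost) pairs.
    h(edge, g): admissible estimate of the cost to complete a derivation
    containing edge. apply_rules(edge, g, table) yields (new_edge, new_g)
    pairs produced by the applicable CCG rules."""
    agenda, best_g, table, tie = [], {}, [], count()
    for i, tags in enumerate(supertags):
        for cat, g in tags:
            edge = (i, i + 1, cat)
            heapq.heappush(agenda, (g + h(edge, g), g, next(tie), edge))
    while agenda:
        f, g, _, edge = heapq.heappop(agenda)
        if best_g.get(edge, float('inf')) <= g:
            continue                    # a cheaper copy was already expanded
        best_g[edge] = g
        if edge == (0, n_words, goal_cat):
            return edge, g              # complete derivation found
        table.append((edge, g))
        for new_edge, new_g in apply_rules(edge, g, table):
            heapq.heappush(agenda, (new_g + h(new_edge, new_g), new_g,
                                    next(tie), new_edge))
    return None                         # agenda exhausted: failed parse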
To better fit with the traditional A* approach, we'd prefer to have states scored by a cost function where lower is better (i.e., we're trying to minimize the cost of a derivation). To achieve this, we'll use negative log probabilities to score derivations; this results in the following equation, which we'll use to score completed CCG derivations: the cost of a derivation assigning supertag sequence T to the input words is the sum of the negative log probabilities of its supertags.
Cost(T, words) = ∑_{i=1..n} −log P(t_i | words) (E.13)
Given this model, we can define our f -cost as follows. The f -cost of an edge is
the sum of two components: g(n), the cost of the span represented by the edge, and
h(n), the estimate of the cost to complete a derivation containing that edge (these
are often referred to as the inside and outside costs). We’ll define g(n) for an edge
using Equation E.13. That is, it is just the sum of the costs of the supertags that
comprise the span.
For h(n), we need a score that approximates but never overestimates the actual cost of the final derivation. A simple heuristic that meets this requirement assumes that each of the words in the outside span will be assigned its most probable supertag. If these are the tags used in the final derivation, then its score will equal the heuristic. If any other tags are used in the final derivation the f-cost will be higher since the new tags must have higher costs, thus guaranteeing that we will not overestimate.
Putting this all together, we arrive at the following definition of a suitable f-cost for an edge spanning words i through j:
f(n) = g(n) + h(n)
     = ∑_{k=i..j} −log P(t_k | words) + ∑_{k<i or k>j} min_t (−log P(t | words)) (E.14)
As an example, consider an edge representing the word serves with the supertag N
in the following example.
(E.15) United serves Denver.
The g-cost for this edge is just the negative log probability of this tag, −log10 (0.1),
or 1. The outside h-cost consists of the most optimistic supertag assignments for
United and Denver, which are N/N and NP respectively. The resulting f -cost for
this edge is therefore 1.443.
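A quick check of this arithmetic, using the numbers from the text:

import math

g = -math.log10(0.1)    # inside cost of the serves: N supertag -> 1.0
h = 0.443               # outside estimate for United (N/N) and Denver (NP)
print(round(g + h, 3))  # 1.443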
E.6.4 An Example
Fig. E.2 shows the initial agenda and the progress of a complete parse for this example. After initializing the agenda and the parse table with information from the supertagger, it selects the best edge from the agenda: the entry for United with the tag N/N and f-cost 0.591. This edge does not constitute a complete parse and is therefore used to generate new states by applying all the relevant grammar rules. In this case, applying forward application to United: N/N and serves: N results in the addition of the edge United serves: N[0,2], 1.795 to the agenda.
Skipping ahead, at the third iteration an edge representing the complete derivation United serves Denver, S[0,3], .716 is added to the agenda. However, the algorithm does not terminate at this point since the cost of this edge (.716) does not place it at the top of the agenda. Instead, the edge representing Denver with the category NP is popped. This leads to the addition of another edge to the agenda (type-raising Denver). Only after this edge is popped and dealt with does the earlier state representing a complete derivation rise to the top of the agenda where it is popped, goal tested, and returned as a solution.
Figure E.2 Example of an A* search for the example "United serves Denver". The circled numbers on the blue boxes indicate the order in which the states are popped from the agenda. The costs in each state are based on f-costs using negative log10 probabilities. [Figure not reproduced; the states, in pop order, were: (1) United: N/N, .591; (2) serves: (S\NP)/NP, .591; (3) serves Denver: S\NP[1,3], .591; (4) Denver: NP, .591; (5) Denver: S/(S\NP)[0,1], .591; (6) United serves Denver: S[0,3], .716 (goal state). Agenda entries never popped include United: NP (.716), United: S/S (1.1938), and United serves: N[0,2] (1.795).]
The remaining states (with initial lexical category assignments not explicitly shown) reflect states in the search space that never made it to the top of the agenda and, therefore, never contributed any edges to the final table. This is in contrast to the PCKY approach, where the parser systematically fills the parse table with all possible constituents for all possible spans in the input, filling the table with myriad constituents that do not contribute to the final analysis.
E.7 Summary
This chapter has introduced combinatory categorial grammar (CCG):
• Combinatory categorial grammar (CCG) is a computationally relevant lexicalized approach to grammar and parsing.
• Much of the difficulty in CCG parsing is disambiguating the highly rich lexical
entries, and so CCG parsers are generally based on supertagging.
• Supertagging is the equivalent of part-of-speech tagging in highly lexicalized
grammar frameworks. The tags are very grammatically rich and dictate much
of the derivation for a sentence.