Chart Parsing and Bottom-Up Chart Parsing
Chart Parsing
General search methods are poorly suited to syntactic parsing because the same
syntactic constituent may be rederived many times as part of larger constituents.
For example, an NP could be part of different VPs or PPs.
Chart parsing uses a chart to keep track of partial derivations so nothing has to be
rederived.
Chart parsers also use an agenda to prioritize the constituents to be processed.
The agenda can be implemented as a stack or a queue to simulate depth-first or
breadth-first search, or as a priority queue for best-first search.
Bottom-up Chart Parsing
1. If the agenda is empty, look up the next word (scanning the sentence
left to right) in the lexicon and add its possible parts-of-speech to the
agenda. Include its position in the input sentence.
For example, if the first word is an article then it would be
added as ART (p0-p1).
2. Select a constituent C from the agenda.
3. Add C to the chart using the arc extension algorithm.
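A minimal Python sketch of this loop, under representations I am assuming (the Constituent record, the lexicon format, and the arc_extend callback are mine; the arc extension algorithm itself is filled in after the next slide):

```python
from collections import namedtuple

# A constituent is a category spanning positions, e.g. ART over p0-p1.
Constituent = namedtuple("Constituent", ["cat", "start", "end"])

def bottom_up_parse(words, lexicon, arc_extend):
    """Skeleton of the bottom-up chart parsing loop (a sketch).
    arc_extend(c, chart, agenda) should implement arc extension."""
    agenda, chart, pos = [], [], 0
    while pos < len(words) or agenda:
        # Step 1: if the agenda is empty, scan the next word left to
        # right and add each possible part of speech with its positions.
        if not agenda:
            for cat in lexicon[words[pos]]:
                agenda.append(Constituent(cat, pos, pos + 1))
            pos += 1
            continue
        # Step 2: select a constituent C from the agenda (stack order
        # here, which simulates depth-first search).
        c = agenda.pop()
        # Step 3: add C to the chart using the arc extension algorithm.
        arc_extend(c, chart, agenda)
    return chart
```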
The Arc Extension Algorithm
To add a constituent C that spans positions pi-pj:
1. INSERT C into the chart in positions pi-pj.
2. SEARCH for grammar rules that begin with the constituent C.
For each such rule, add an active arc to the chart of the form
X → C * X1 X2 ... Xn, spanning positions pi-pj.
3. EXTEND all active arcs of the form X → X1 ... * C ... Xn spanning
positions pm-pi by adding a COPY of the arc to the chart that includes
the new constituent: X → X1 ... C * ... Xn, spanning positions pm-pj.
Do not delete the original active arc, always create a copy!!
4. For each NEWLY COMPLETED ARC of the form X → X1 ... Xn C *, spanning
positions pm-pj, add a new constituent of type X onto the agenda for
positions pm-pj.
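Continuing the sketch above, arc extension might be implemented as follows; the Arc record (rule, dot position, span) and the flat chart list are my assumptions, not the slides':

```python
from collections import namedtuple

Constituent = namedtuple("Constituent", ["cat", "start", "end"])
# Active arc for lhs -> rhs[:dot] * rhs[dot:], spanning start-end.
Arc = namedtuple("Arc", ["lhs", "rhs", "dot", "start", "end"])

def make_arc_extend(grammar):
    """grammar: list of (lhs, rhs) rules, e.g. ("NP", ("ART", "N"))."""
    def arc_extend(c, chart, agenda):
        # 1. INSERT the constituent C into the chart.
        chart.append(c)
        new_arcs = []
        # 2. SEARCH for rules whose right-hand side begins with C's
        #    category; seed an arc lhs -> C * X2 ... Xn over C's span.
        for lhs, rhs in grammar:
            if rhs[0] == c.cat:
                new_arcs.append(Arc(lhs, rhs, 1, c.start, c.end))
        # 3. EXTEND active arcs that expect C next and end where C
        #    starts: add a COPY with the dot advanced; the original
        #    arc is never deleted or modified.
        for a in [x for x in chart if isinstance(x, Arc)]:
            if a.dot < len(a.rhs) and a.rhs[a.dot] == c.cat and a.end == c.start:
                new_arcs.append(Arc(a.lhs, a.rhs, a.dot + 1, a.start, c.end))
        for a in new_arcs:
            if a.dot == len(a.rhs):
                # 4. A newly COMPLETED arc puts a constituent of
                #    type lhs on the agenda for the arc's span.
                agenda.append(Constituent(a.lhs, a.start, a.end))
            else:
                chart.append(a)
    return arc_extend
```

Note that a one-symbol rule such as VP → V completes immediately in step 2 and goes straight to the agenda.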
An Example: the grammar and lexicon
Sentence: The smelly dog drank the water.
Grammar
S → NP VP
NP → ART N
NP → ADJ N
NP → ART ADJ N
VP → V
VP → V NP
Lexicon
the: ART
smelly: ADJ
dog: V N
drank: V
water: V N
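Encoded as plain Python data for the sketches above (the tuple and dict formats are my assumptions), the grammar and lexicon become:

```python
GRAMMAR = [
    ("S",  ("NP", "VP")),
    ("NP", ("ART", "N")),
    ("NP", ("ADJ", "N")),
    ("NP", ("ART", "ADJ", "N")),
    ("VP", ("V",)),
    ("VP", ("V", "NP")),
]

LEXICON = {
    "the": ["ART"], "smelly": ["ADJ"],
    "dog": ["V", "N"], "drank": ["V"], "water": ["V", "N"],
}

# With the two sketches above, a run might look like:
#   chart = bottom_up_parse("the smelly dog drank the water".split(),
#                           LEXICON, make_arc_extend(GRAMMAR))
#   print([c for c in chart if getattr(c, "cat", None) == "S"])
# which includes an S constituent spanning positions 0-6.
```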
Top-down vs. Bottom-up Chart Parsing
Bottom-up chart parsing checks the input and builds each constituent
exactly once. Avoids duplication of effort!
But bottom-up chart parsing may build constituents that cannot be
used legally.
For example, given "the can", bottom-up parsing will build a VP
for "can" even if no grammar rules allow a VP to follow an
article.
Top-down chart parsing is highly predictive. Only grammar rules that
can be legally applied will be put on the chart.
Top-Down Chart Parsing Algorithm (Earley Algorithm)
Initialize: For each grammar rule with S as its left-hand side, add the
arc S → * X1 ... Xk to the chart using the top-down arc introduction
algorithm.
While there are input words left:
1. If the agenda is empty, add the next word to the agenda.
2. Select a constituent C from the agenda.
3. Combine C with every active arc on the chart using the arc extension
algorithm. Add any new constituents to the agenda.
4. For any active arcs created by step 3, add them to the chart using the
top-down arc introduction algorithm.
Top-Down Arc Introduction Algorithm
When adding an arc Y → C1 ... * Ci ... Cn ending at position j:
For each rule Ci → X1 ... Xk in the grammar, recursively add a new arc
Ci → * X1 ... Xk from position j to j.
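A sketch of the initialization and this recursive introduction step, reusing the assumed Arc record from the bottom-up sketch; the duplicate check that stops infinite recursion on recursive rules is my addition:

```python
from collections import namedtuple

Arc = namedtuple("Arc", ["lhs", "rhs", "dot", "start", "end"])

def introduce_arc(arc, grammar, chart):
    """Top-down arc introduction (a sketch). For the category Ci just
    after the dot, add Ci -> * X1 ... Xk as an empty arc at the arc's
    end position, and recurse on each new arc. The membership test is
    my addition: it keeps recursive rules like NP1 -> NP1 NP1 from
    looping forever."""
    chart.append(arc)
    if arc.dot >= len(arc.rhs):
        return
    next_cat, j = arc.rhs[arc.dot], arc.end
    for lhs, rhs in grammar:
        if lhs == next_cat:
            new = Arc(lhs, rhs, 0, j, j)
            if new not in chart:        # avoid re-adding the same arc
                introduce_arc(new, grammar, chart)

def initialize(grammar, chart):
    """One empty arc per S rule at position 0, as in the slide."""
    for lhs, rhs in grammar:
        if lhs == "S":
            introduce_arc(Arc(lhs, rhs, 0, 0, 0), grammar, chart)
```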
Chart Parsing: Pros and Cons
Pros
Relatively efficient because each constituent is generated exactly
once.
Easy to generate a single parse, N parses, or all possible parses.
If no complete parse is found, it is easy to gather the pieces and
construct a partial parse.
Could search for only certain types of constituents, such as NPs and
VPs.
Cons
Parsing is still not cheap, especially if the grammar is large.
The chart can become quite large in some cases.
Modifying a Chart Parser to Handle Features
Given an arc on the chart of the form:
X → X1 ... * Xi ... Xn
and a constituent of type C that can extend the arc:
1. Find an instantiation of the variables such that all features specified
in Xi are satisfied by C.
2. Create a copy of the arc with the variables instantiated as specified
by the previous step.
3. Add the copy to the chart in the usual fashion.
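In Python, the instantiation step might look like the sketch below, treating each feature's value as a set of possibilities and intersecting it with what the constituent supplies; this representation is my assumption and is much simpler than full unification:

```python
def instantiate(arc_feats, const_feats):
    """Try to instantiate an arc's feature variables against a
    constituent. Features map names to sets of allowed values,
    e.g. {"agr": {"3s", "3p"}}. Returns the restricted bindings,
    or None if some feature cannot be satisfied. (A sketch under
    assumed representations, not full unification.)"""
    bindings = {}
    for feat, allowed in arc_feats.items():
        # A feature the constituent leaves unspecified is unconstrained.
        values = allowed & const_feats.get(feat, allowed)
        if not values:     # no consistent value: C cannot extend the arc
            return None
        bindings[feat] = values
    return bindings
```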
An Example
Suppose the following arc is on the chart:
NP[agr ?w{3s,3p}] → (ART[agr ?w{3s,3p}]) * (N[agr ?w{3s,3p}])
and the arc can be extended with the constituent:
N[root table, agr 3s]
The new arc would be:
NP[agr ?w{3s}] → (ART[agr ?w{3s}]) (N[agr ?w{3s}]) *
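Tracing the example with plain Python sets (the encoding is mine):

```python
# The arc's variable ?w ranges over {3s, 3p}; the noun "table"
# supplies agr = {3s}. Instantiation intersects the two.
arc_w = {"3s", "3p"}
noun_agr = {"3s"}
print(arc_w & noun_agr)   # {'3s'}: the new arc carries agr ?w{3s}
```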
Syntactic Ambiguities
Syntactic ambiguities can cause the number of possible parse trees to
explode.
PP attachment ambiguity:
I saw the man on the hill with a telescope.
Noun phrase bracketing:
plastic cat food can cover
Conjunctions and appositives:
Rover, my fish, and Fluffy
A medium-sized sentence with a moderate grammar can have over 1000
legal parse trees!
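The growth is easy to quantify: under a binary rule like NP1 → NP1 NP1, the number of distinct parses of an n-noun compound is the (n-1)st Catalan number, a standard combinatorial fact (the code below is only an illustration):

```python
from math import comb

def catalan(n):
    """Number of binary bracketings of a sequence of n+1 items."""
    return comb(2 * n, n) // (n + 1)

# Parse counts for an n-noun compound under NP1 -> NP1 NP1 | noun:
for n in range(2, 9):
    print(n, "nouns:", catalan(n - 1), "parses")
# "plastic cat food can cover" (5 nouns) already has 14 bracketings;
# 8 nouns have 429, and the growth is exponential.
```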
Dealing with Syntactic Ambiguity
One approach to minimizing syntactic ambiguity is to modify the grammar.
Consider this grammar:
NP → NP1
NP1 → noun
NP1 → NP1 NP1
versus this grammar rule (if the Kleene + operator is legal):
NP → noun+
or, if the Kleene + operator is not legal:
NP → noun
NP → noun noun
NP → noun noun noun
NP → noun noun noun noun
These grammar rules look more complicated but eliminate all structural
ambiguity!
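The flat rules correspond to a single deterministic scan, so exactly one analysis exists; a toy sketch (the chunk format is mine):

```python
def flat_np(tags):
    """Consume a maximal run of nouns as ONE flat NP (NP -> noun+).
    There is only one way to do this, so no structural ambiguity."""
    i = 0
    while i < len(tags) and tags[i] == "noun":
        i += 1
    return ("NP", tags[:i]) if i > 0 else None

print(flat_np(["noun"] * 5))   # a single flat NP over all five nouns
```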
Packing
Another way to reduce structural ambiguity is to use a packed chart.
Packing combines constituents of the same type that include the
same words, no matter how they were derived.
For example, [[cat food] can] and [cat [food can]]
would be combined into the same NP.
You can keep track of the different derivations for a constituent even
when the chart contains only one item representing them all.
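A packed chart might be keyed by (category, start, end), with all derivations collected under one entry; the representation below is an assumption for illustration:

```python
from collections import defaultdict

# One chart entry per (category, start, end); derivations pack inside.
packed = defaultdict(list)

def add_packed(cat, start, end, derivation):
    """Add a constituent; duplicates of the same (cat, span) merge
    into one entry, but every distinct derivation is remembered."""
    key = (cat, start, end)
    is_new = key not in packed    # only a NEW entry goes back on an agenda
    packed[key].append(derivation)
    return is_new

# [[cat food] can] and [cat [food can]] pack into one NP over words 0-3:
add_packed("NP", 0, 3, ("NP(0,2)", "N(2,3)"))
add_packed("NP", 0, 3, ("N(0,1)", "NP(1,3)"))
print(len(packed[("NP", 0, 3)]))   # 2 derivations, 1 chart entry
```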
Shallow Parsing
Instead of trying to generate a complete parse tree for a sentence,
shallow parsers generate fragments representing local syntactic
constituents. (Also called partial parsers or chunkers.)
Shallow parsers typically try to identify NPs, VPs, and PPs (and
occasionally other constituents).
These local syntactic constituents can be identified (relatively) reliably
using simple grammar rules and heuristics.
Most shallow parsers use finite state machines to recognize a regular
grammar.
Shallow Parsing
Shallow parsers usually produce a flat syntactic representation of
non-recursive constituents (sometimes called chunks).
The process is often implemented as a series of cascaded FSMs.
The election in the U.S. will occur in November
[NP: The election] in [NP: the U.S.] will occur in [NP: November]
[NP: The election] [PP: in [NP: the U.S.]] will occur [PP: in [NP:
November]]
[NP: The election] [PP: in [NP: the U.S.]] [VP: will occur] [PP: in
[NP: November]]
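A toy version of such a cascade in Python, using regular expressions over word/TAG tokens; the tag set and the patterns are my assumptions and are far cruder than a real chunker:

```python
import re

# POS-tagged input as "word/TAG" tokens (tags assumed for illustration).
sent = "The/ART election/N in/P the/ART U.S./N will/AUX occur/V in/P November/N"

# Cascade of three finite-state passes, innermost chunks first.
np = re.sub(r"(\S+/ART )?\S+/N\b", r"[NP: \g<0>]", sent)    # pass 1: NPs
pp = re.sub(r"\S+/P \[NP: [^\]]*\]", r"[PP: \g<0>]", np)    # pass 2: PPs over NPs
vp = re.sub(r"\S+/AUX \S+/V\b", r"[VP: \g<0>]", pp)         # pass 3: VPs
print(vp)
# [NP: The/ART election/N] [PP: in/P [NP: the/ART U.S./N]]
# [VP: will/AUX occur/V] [PP: in/P [NP: November/N]]
```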
Shallow Parsing as Classification
Shallow parsers can be built with supervised learning techniques using an
annotated training corpus.
The trick is to view shallow parsing as a classification or tagging problem.
A common scheme is IOB tagging, where B=Beginning, I=Internal, and
O=Outside.
John Smith gave Mary a book about NLP.
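For example, marking only NP chunks, the sentence above could be tagged as below; the tag assignments and the small decoder are my illustration:

```python
# word         IOB tag (B = begins an NP, I = inside, O = outside)
tagged = [
    ("John",  "B-NP"), ("Smith", "I-NP"),
    ("gave",  "O"),
    ("Mary",  "B-NP"),
    ("a",     "B-NP"), ("book",  "I-NP"),
    ("about", "O"),
    ("NLP",   "B-NP"), (".",     "O"),
]

# Recovering chunks from IOB tags: a B tag opens a chunk, I extends it.
chunks, current = [], []
for word, tag in tagged:
    if tag.startswith("B"):
        if current: chunks.append(current)
        current = [word]
    elif tag.startswith("I"):
        current.append(word)
    else:
        if current: chunks.append(current)
        current = []
if current: chunks.append(current)
print(chunks)   # [['John', 'Smith'], ['Mary'], ['a', 'book'], ['NLP']]
```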
Benefits of Shallow Parsing
Deep syntactic structure may not be important for some NLP
applications.
Some ambiguity issues can be ignored if they are not critical for
identifying the fragments.
Some structural issues can be delayed and left for semantic analysis.
Shallow parsers are more robust with ungrammatical or ill-formed
input.
Shallow parsers are usually much faster than full parsers.
Weaknesses of Shallow Parsing
Usually does not handle embedded relative clauses well.
Ex: I gave the boy that was sick some medicine.
Often has trouble recognizing reduced relative clauses.
Ex: The woman killed last night was an important diplomat.
Attachments are usually not attempted.