Lecture 06

Natural Language Processing
Lecture 6: Parsing with Context Free Grammars I. CKY algorithm

11/8/2020

COMS W4705
Yassine Benajiba
Formal Grammar and Parsing
• Formal grammars are used in linguistics, NLP, and programming languages.

• We want to build a compact model that describes a complete language.

• We need efficient algorithms to determine if a sentence is in the language or not (recognition problem).

• We also want to recover the structure imposed by the grammar (parsing problem).
Syntactic Parsing
• Formalisms like CFGs and Finite State Automata define the (possibly infinite) set of legal strings of a language.

• Parsing algorithms determine if an input string is part of this language or not. For CFGs, they assign each string one or more syntactic analyses.
Two Approaches to Parsing
• Bottom-up: Start at the words (terminal symbols) and see which subtrees you can build. Then combine these subtrees into larger trees. (Driven by the input sentence.)
Example: the CKY algorithm, which requires grammars in Chomsky Normal Form.

• Top-down: Start at the start symbol (S) and try to apply production rules that are compatible with the input. (Driven by the grammar; next week.)
Example: the Earley algorithm.

• Both approaches can be seen as a kind of search problem (next week).
Chomsky Normal Form
• A CFG G=(N, Σ, R, S) is in Chomsky Normal Form (CNF) if the rules take one of the following forms:

• A → B C, where A ∈ N, B ∈ N, C ∈ N.

• A → b, where A ∈ N, b ∈ Σ.

Example grammar in CNF:

S → NP VP      V → saw
VP → V NP      P → with
VP → VP PP     D → the
PP → P NP      N → cat
NP → D N       N → tail
NP → NP PP     N → student

Any CFG can be converted to an equivalent grammar in CNF that expresses the same language.
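The two rule forms can be checked mechanically. The following is a minimal sketch, assuming rules are stored as (left-hand side, right-hand side tuple) pairs; the function name `is_cnf` and the variable names are illustrative and not part of the lecture.

```python
def is_cnf(nonterminals, terminals, rules):
    """Return True if every rule has the form A -> B C or A -> b."""
    for lhs, rhs in rules:
        if lhs not in nonterminals:
            return False
        # A -> B C: exactly two nonterminals on the right-hand side
        binary = len(rhs) == 2 and all(sym in nonterminals for sym in rhs)
        # A -> b: exactly one terminal on the right-hand side
        lexical = len(rhs) == 1 and rhs[0] in terminals
        if not (binary or lexical):
            return False
    return True


# The example grammar from this slide (representation is an assumption).
NONTERMINALS = {"S", "NP", "VP", "PP", "V", "P", "D", "N"}
TERMINALS = {"saw", "with", "the", "cat", "tail", "student"}
RULES = [
    ("S", ("NP", "VP")), ("VP", ("V", "NP")), ("VP", ("VP", "PP")),
    ("PP", ("P", "NP")), ("NP", ("D", "N")), ("NP", ("NP", "PP")),
    ("V", ("saw",)), ("P", ("with",)), ("D", ("the",)),
    ("N", ("cat",)), ("N", ("tail",)), ("N", ("student",)),
]

print(is_cnf(NONTERMINALS, TERMINALS, RULES))
```

A rule such as NP → D N PP (three symbols) or NP → N (a single nonterminal) would make the check fail; converting such rules is what the CNF transformation is for.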
Cocke-Kasami-Younger (CKY) Algorithm - Motivation
• A nonterminal A covers a sub-span [i,j] of the input string s if the rules in the grammar can derive s[i,j] from A.
Let π[i,j] be the set of nonterminals that cover [i,j].

• The string is recognized by the grammar if S ∈ π[0,n], where n is the length of s.

• Approach: Compute π[i,j] for all sub-spans bottom-up, using dynamic programming.

Example:

s = 0 the 1 student 2 saw 3 the 4 cat 5 with 6 the 7 tail 8

π[0,8] = {S}
π[2,8] = {VP}
π[0,5] = {S}
π[2,5] = {VP}   π[5,8] = {PP}
π[0,2] = {NP}   π[3,5] = {NP}   π[6,8] = {NP}
π[0,1] = {D}   π[1,2] = {N}   π[2,3] = {V,N}   π[3,4] = {D}   π[4,5] = {N}   π[5,6] = {P}   π[6,7] = {D}   π[7,8] = {N}
CKY Data Structure
• Use a 2-dimensional “parse table” to represent π[i,j].

0 she 1 saw 2 the 3 cat 4 with 5 glasses 6

Grammar:

S → NP VP      NP → she
VP → V NP      NP → glasses
VP → VP PP     D → the
PP → P NP      N → cat
NP → D N       N → glasses
NP → NP PP     V → saw
               P → with

Parse table (each cell i,j holds π[i,j]):

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  0,1   0,2   0,3   0,4   0,5   0,6
i=1        1,2   1,3   1,4   1,5   1,6
i=2              2,3   2,4   2,5   2,6
i=3                    3,4   3,5   3,6
i=4                          4,5   4,6
i=5                                5,6
CKY Initialization
• For i=0…length(s)-1:
      π[i, i+1] = {A | A → s[i:i+1] ∈ R}

0 she 1 saw 2 the 3 cat 4 with 5 glasses 6

Parse table after initialization (grammar as on the CKY Data Structure slide; cells showing i,j are not yet computed):

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP    0,2   0,3   0,4   0,5   0,6
i=1        V     1,3   1,4   1,5   1,6
i=2              D     2,4   2,5   2,6
i=3                    N     3,5   3,6
i=4                          P     4,6
i=5                                NP,N
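The initialization step can be sketched in a few lines. This is only an illustration, assuming the lexical rules are stored as a dictionary from words to the nonterminals that produce them; `LEXICON` and `init_chart` are hypothetical names.

```python
# Lexical rules A -> word of the example grammar, indexed by word
# (this layout is an assumption made for the sketch).
LEXICON = {
    "she": {"NP"},
    "glasses": {"NP", "N"},
    "the": {"D"},
    "cat": {"N"},
    "saw": {"V"},
    "with": {"P"},
}


def init_chart(words, lexicon):
    """Fill pi[i, i+1] with every A such that A -> words[i] is a rule."""
    pi = {}
    for i, word in enumerate(words):
        # Unknown words simply get an empty set.
        pi[(i, i + 1)] = set(lexicon.get(word, set()))
    return pi


chart = init_chart("she saw the cat with glasses".split(), LEXICON)
for span in sorted(chart):
    print(span, sorted(chart[span]))
```

Only the six cells on the diagonal are filled at this point; all longer spans are computed later from these.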
CKY - finding the split
• CKY requires the grammar to be in CNF.

• Assume subspan [i,j] of length at least 2 is covered by nonterminal A.

• Then this nonterminal was recognized by some production of the form A → B C, where A ∈ N, B ∈ N, C ∈ N (grammar is in CNF).

• Span [i,j] can be split into two parts: [i,k], which is covered by B, and [k,j], which is covered by C.

(Figure: A spans [i,j]; below it, B spans [i,k] and C spans [k,j].)
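CKY - Recursive Definition
• To compute π[i, j], try all possible split points k, such that i < k < j.

• For each k, check if the nonterminals in π[i,k] and π[k,j] match any of the rules in the grammar.

• Recursive definition for π[i, j]:

$$\pi[i,j] = \bigcup_{k=i+1}^{j-1} \{\, A \mid A \to B\,C \in R,\ B \in \pi[i,k],\ C \in \pi[k,j] \,\}$$

with the base case π[i, i+1] = {A | A → s[i] ∈ R}.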


CKY Full Algorithm
• Input: Grammar G=(N, Σ, R, S), input string s of length n.

• for i=0…n-1:   (initialization)
      π[i, i+1] = {A | A → s[i] ∈ R}

• for length=2…n:   (main loop)
      for i=0…(n-length):
          j = i+length
          for k=i+1…j-1:
              π[i,j] = π[i,j] ∪ {A | A → B C ∈ R, B ∈ π[i,k], C ∈ π[k,j]}

• if S ∈ π[0, n] return True, otherwise False
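The pseudocode maps fairly directly onto Python. The following is a minimal recognizer sketch, assuming binary rules are stored as (A, B, C) triples and lexical rules as (A, word) pairs; `cky_recognize`, `BINARY`, and `LEXICAL` are names chosen here, not taken from the lecture.

```python
from collections import defaultdict

# Grammar from the "she saw the cat with glasses" example.
BINARY = [
    ("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
    ("PP", "P", "NP"), ("NP", "D", "N"), ("NP", "NP", "PP"),
]
LEXICAL = [
    ("NP", "she"), ("NP", "glasses"), ("D", "the"), ("N", "cat"),
    ("N", "glasses"), ("V", "saw"), ("P", "with"),
]


def cky_recognize(words, binary, lexical, start="S"):
    """Return (recognized, pi), where pi maps spans (i, j) to nonterminal sets."""
    n = len(words)
    pi = defaultdict(set)
    # Initialization: spans of length 1.
    for i, word in enumerate(words):
        pi[i, i + 1] = {a for a, w in lexical if w == word}
    # Main loop: longer spans are built from shorter ones.
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            for k in range(i + 1, j):          # split point
                for a, b, c in binary:          # rule A -> B C
                    if b in pi[i, k] and c in pi[k, j]:
                        pi[i, j].add(a)
    return start in pi[0, n], pi


ok, pi = cky_recognize("she saw the cat with glasses".split(), BINARY, LEXICAL)
print(ok, sorted(pi[1, 6]))
```

Looping over all rules for every split keeps the sketch close to the pseudocode; a practical implementation would more likely index the binary rules by their right-hand side (B, C).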


CKY Algorithm (length=2)
for i=0…(n-length):
    j = i+length
    for k=i+1…j-1:
        ….

Input: 0 she 1 saw 2 the 3 cat 4 with 5 glasses 6 (grammar as on the CKY Data Structure slide).

• i=0, k=1, j=2: π[0,1]={NP}, π[1,2]={V}; no rule matches, so π[0,2] stays empty.
• i=1, k=2, j=3: π[1,2]={V}, π[2,3]={D}; no rule matches, so π[1,3] stays empty.
• i=2, k=3, j=4: π[2,3]={D}, π[3,4]={N}; NP → D N applies, so π[2,4]={NP}.
• i=3, k=4, j=5: π[3,4]={N}, π[4,5]={P}; no rule matches, so π[3,5] stays empty.
• i=4, k=5, j=6: π[4,5]={P}, π[5,6]={NP,N}; PP → P NP applies, so π[4,6]={PP}.

Parse table after length=2 (∅ = computed and empty; i,j = not yet computed):

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP    ∅     0,3   0,4   0,5   0,6
i=1        V     ∅     1,4   1,5   1,6
i=2              D     NP    2,5   2,6
i=3                    N     ∅     3,6
i=4                          P     PP
i=5                                NP,N
CKY Algorithm (length=3)
for i=0…(n-length):
    j = i+length
    for k=i+1…j-1:
        ….

• i=0, k=1, j=3: π[0,1]={NP}, π[1,3]=∅; nothing to combine.
• i=0, k=2, j=3: π[0,2]=∅; nothing to combine, so π[0,3] stays empty.
• i=1, k=2, j=4: π[1,2]={V}, π[2,4]={NP}; VP → V NP applies, so π[1,4]={VP}.
• i=1, k=3, j=4: π[1,3]=∅; nothing added.
• i=2, k=3, j=5: π[2,3]={D}, π[3,5]=∅; nothing to combine.
• i=2, k=4, j=5: π[2,4]={NP}, π[4,5]={P}; no rule matches, so π[2,5] stays empty.
• i=3, k=4, j=6: π[3,4]={N}, π[4,6]={PP}; no rule matches.
• i=3, k=5, j=6: π[3,5]=∅; nothing to combine, so π[3,6] stays empty.

Parse table after length=3 (∅ = computed and empty; i,j = not yet computed):

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP    ∅     ∅     0,4   0,5   0,6
i=1        V     ∅     VP    1,5   1,6
i=2              D     NP    ∅     2,6
i=3                    N     ∅     ∅
i=4                          P     PP
i=5                                NP,N
CKY Algorithm (length=4)
for i=0…(n-length):
    j = i+length
    for k=i+1…j-1:
        ….

• i=0, k=1, j=4: π[0,1]={NP}, π[1,4]={VP}; S → NP VP applies, so π[0,4]={S}.
• i=0, k=2, j=4: π[0,2]=∅; nothing added.
• i=0, k=3, j=4: π[0,3]=∅; nothing added.
• i=1, k=2, j=5: π[1,2]={V}, π[2,5]=∅; nothing to combine.
• i=1, k=3, j=5: π[1,3]=∅; nothing to combine.
• i=1, k=4, j=5: π[1,4]={VP}, π[4,5]={P}; no rule matches, so π[1,5] stays empty.
• i=2, k=3, j=6: π[2,3]={D}, π[3,6]=∅; nothing to combine.
• i=2, k=4, j=6: π[2,4]={NP}, π[4,6]={PP}; NP → NP PP applies, so π[2,6]={NP}.
• i=2, k=5, j=6: π[2,5]=∅; nothing added.

Parse table after length=4 (∅ = computed and empty; i,j = not yet computed):

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP    ∅     ∅     S     0,5   0,6
i=1        V     ∅     VP    ∅     1,6
i=2              D     NP    ∅     NP
i=3                    N     ∅     ∅
i=4                          P     PP
i=5                                NP,N
CKY Algorithm (length=5)
for i=0…(n-length):
    j = i+length
    for k=i+1…j-1:
        ….

• i=0, k=1, j=5: π[0,1]={NP}, π[1,5]=∅; nothing to combine.
• i=0, k=2, j=5: π[0,2]=∅; nothing to combine.
• i=0, k=3, j=5: π[0,3]=∅; nothing to combine.
• i=0, k=4, j=5: π[0,4]={S}, π[4,5]={P}; no rule matches, so π[0,5] stays empty.
• i=1, k=2, j=6: π[1,2]={V}, π[2,6]={NP}; VP → V NP applies, so π[1,6]={VP}.
• i=1, k=3, j=6: π[1,3]=∅; nothing added.
• i=1, k=4, j=6: π[1,4]={VP}, π[4,6]={PP}; VP → VP PP applies, so VP is derived for [1,6] a second time.
• i=1, k=5, j=6: π[1,5]=∅; nothing added.

! We can build VP over [1,6] in two ways!

Parse table after length=5 (∅ = computed and empty; i,j = not yet computed):

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP    ∅     ∅     S     ∅     0,6
i=1        V     ∅     VP    ∅     VP
i=2              D     NP    ∅     NP
i=3                    N     ∅     ∅
i=4                          P     PP
i=5                                NP,N
CKY Algorithm (length=6)
for i=0…(n-length):
    j = i+length
    for k=i+1…j-1:
        ….

• i=0, k=1, j=6: π[0,1]={NP}, π[1,6]={VP}; S → NP VP applies, so π[0,6]={S}.
• i=0, k=2, j=6: π[0,2]=∅; nothing added.
• i=0, k=3, j=6: π[0,3]=∅; nothing added.
• i=0, k=4, j=6: π[0,4]={S}, π[4,6]={PP}; no rule matches.
• i=0, k=5, j=6: π[0,5]=∅; nothing added.

Final parse table (∅ = empty). Since S ∈ π[0,6], the sentence is recognized.

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP    ∅     ∅     S     ∅     S
i=1        V     ∅     VP    ∅     VP
i=2              D     NP    ∅     NP
i=3                    N     ∅     ∅
i=4                          P     PP
i=5                                NP,N
CKY Runtime
• Input: Grammar G=(N, Σ, R, S), input string s of length n.

• for i=0…n-1:                                  O(n × |R|)
      π[i, i+1] = {A | A → s[i] ∈ R}

• for length=2…n:                               O(n)
      for i=0…(n-length):                       O(n)
          j = i+length
          for k=i+1…j-1:                        O(n)
              check all rules A → B C ∈ R       O(|R|)

• if S ∈ π[0, n] return True, otherwise False

Total: O(n³ × |R|)
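To make the cubic term concrete, the sketch below counts how often the innermost loop body runs for a string of length n, ignoring the grammar. The helper name `count_split_checks` is made up for this illustration. The count has the closed form (n³ − n)/6, and each of these steps costs up to |R| rule checks.

```python
def count_split_checks(n):
    """Count the (length, i, k) combinations visited by the CKY main loop."""
    count = 0
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                count += 1
    return count


for n in (6, 12, 24):
    print(n, count_split_checks(n), (n**3 - n) // 6)
```

For the six-word example sentence this gives 35 split checks; doubling the sentence length multiplies the count by roughly eight.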


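Syntactic Ambiguity
With the grammar from the CKY Data Structure slide, "she saw the cat with glasses" has two parse trees. They differ in how VP[1,6] is built.

Tree 1, VP[1,6] → VP[1,4] PP[4,6] (the PP attaches to the verb phrase):
(S (NP she) (VP[1,6] (VP[1,4] (V saw) (NP (D the) (N cat))) (PP[4,6] (P with) (NP glasses))))

Tree 2, VP[1,6] → V[1,2] NP[2,6] (the PP attaches to the noun phrase):
(S (NP she) (VP[1,6] (V[1,2] saw) (NP[2,6] (NP (D the) (N cat)) (PP (P with) (NP glasses)))))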
Backpointers
• The CKY algorithm presented so far determines if a sentence is recognized by a grammar.

• Also want to retrieve the parse trees!

• Instead of a set of nonterminals, store a list of instantiated rules and backpointers.

0 she 1 saw 2 the 3 cat 4 with 5 glasses 6

     j=1   j=2   j=3   j=4   j=5   j=6
i=0  NP                S           0,6
i=1        V           VP          VP
i=2              D     NP          NP
i=3                    N
i=4                          P     PP
i=5                                NP,N

Entry for VP in cell [1,6]:

{ VP[1,6] → V[1,2] NP[2,6],
  VP[1,6] → VP[1,4] PP[4,6] }
Retrieving Parse-Trees
• Start at the [0,n] entry and recursively follow the backpointers.
Return a set of subtrees from the recursion.
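One possible way to store backpointers and read the trees back out is sketched below. The representation is an assumption made for this example: each cell maps a nonterminal to a list of backpointers, where a backpointer is either the word itself (for A → b) or a triple (B, C, k) meaning A → B[i,k] C[k,j]. The names `cky_parse` and `trees` are illustrative.

```python
from collections import defaultdict

BINARY = [
    ("S", "NP", "VP"), ("VP", "V", "NP"), ("VP", "VP", "PP"),
    ("PP", "P", "NP"), ("NP", "D", "N"), ("NP", "NP", "PP"),
]
LEXICAL = [
    ("NP", "she"), ("NP", "glasses"), ("D", "the"), ("N", "cat"),
    ("N", "glasses"), ("V", "saw"), ("P", "with"),
]


def cky_parse(words, binary, lexical):
    """Run CKY, storing for each (span, nonterminal) a list of backpointers."""
    n = len(words)
    back = defaultdict(lambda: defaultdict(list))
    for i, word in enumerate(words):
        for a, w in lexical:
            if w == word:
                back[i, i + 1][a].append(word)       # leaf: the word itself
    for length in range(2, n + 1):
        for i in range(0, n - length + 1):
            j = i + length
            for k in range(i + 1, j):
                for a, b, c in binary:
                    if b in back[i, k] and c in back[k, j]:
                        back[i, j][a].append((b, c, k))  # A -> B[i,k] C[k,j]
    return back


def trees(back, a, i, j):
    """Recursively follow backpointers; return all trees for A over [i, j]."""
    result = []
    for bp in back[i, j].get(a, []):
        if isinstance(bp, str):
            result.append((a, bp))
        else:
            b, c, k = bp
            for left in trees(back, b, i, k):
                for right in trees(back, c, k, j):
                    result.append((a, left, right))
    return result


words = "she saw the cat with glasses".split()
back = cky_parse(words, BINARY, LEXICAL)
parses = trees(back, "S", 0, len(words))
for tree in parses:
    print(tree)
```

Each tree is a nested tuple (label, left, right) or (label, word). Enumerating all trees this way can take exponential time for highly ambiguous sentences, so it is only meant to illustrate the idea.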
