Brief review CKY algorithm Chomsky Normal Form (CNF) Homework2

CKY Algorithm, Chomsky Normal Form


Scott Farrar CLMA, University of Washington

January 13, 2010


Today's lecture

Brief review
CKY algorithm
Chomsky Normal Form (CNF)
Homework 2

Parsing strategies

Name one reason why bottom-up parsing is inefficient.
    The [search for Spock] was successful.
And for top-down?
    Which would you like? That one.
And what makes naive search so inefficient?
    There's no way to store intermediate solutions.

CKY algorithm
Cocke-Kasami-Younger (CKY) algorithm: a fast bottom-up parsing algorithm that avoids some of the inefficiency associated with purely naive search with the same bottom-up strategy.
Intermediate solutions are stored.
Only intermediate solutions that contribute to a full parse are further pursued.
The CKY algorithm is picky about what type of grammar it accepts.
We require that our grammar be in a special form, known as Chomsky Normal Form (CNF).
The rationale is to fill in a chart with the solutions to the subproblems encountered in the bottom-up parsing process.

Dynamic programming
Definition
Dynamic programming: a method of reducing the runtime of algorithms by discovering solutions to subproblems along the way to the solution of the main problem.
    used to optimally plan a multi-stage process
    good for problems with overlapping subproblems
    generally involves the caching of partial results in a table for later retrieval
    many applications (outside of NLP)
What are the subproblems for the parsing task?
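A small illustration of why caching partial results matters for parsing: the number of binary-branching analyses of a span grows quickly with its length, yet a memoized count only solves each subproblem once. This is a sketch only; the helper name count_bracketings is an assumption made for illustration and does not come from the lecture.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def count_bracketings(n: int) -> int:
    """Count the binary-branching trees over a span of n words."""
    if n <= 1:
        return 1  # a single word has exactly one (trivial) analysis
    # Split the span into a left part of k words and a right part of n - k words.
    # Each sub-span count is computed once and then served from the cache.
    return sum(count_bracketings(k) * count_bracketings(n - k) for k in range(1, n))


if __name__ == "__main__":
    for n in (2, 3, 4, 7):
        print(n, count_bracketings(n))
```

Without the cache, the same sub-span counts would be recomputed many times; with it, each span length is solved once, which is the same idea the CKY chart applies to constituents.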

Well-formed substring table (WFST)

Definition
A well-formed substring table is a data structure containing partial constituency structures. It may be represented as either a chart or a graph.

Example
the brown dog
NP → DT Nom, Nom → JJ NN, DT → the, etc.

the     brown   dog
DT1             NP5
        JJ2     Nom4
                NN3

Numbers indicate the order in which each symbol was entered into the table.

Setting up the CKY algorithm

For an input of length n, create an (n+1) x (n+1) matrix, indexed from 0 to n.
Each cell [i, j] in the matrix is the set of all categories of constituents spanning from position i to j.
The algorithm forces you to fill in the table in the most efficient way.
Process cells left to right (across columns), bottom to top (backwards across rows).
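The fill order described above can be sketched as follows. The function names empty_chart and cky_fill_order are hypothetical, chosen only for this illustration; the chart is represented as a nested list of sets, which is one of several reasonable choices.

```python
def empty_chart(n):
    """Create an (n+1) x (n+1) matrix of empty category sets, indexed 0..n."""
    return [[set() for _ in range(n + 1)] for _ in range(n + 1)]


def cky_fill_order(n):
    """List the cells (i, j) in the order they are processed."""
    order = []
    for j in range(1, n + 1):            # columns, left to right
        for i in range(j - 1, -1, -1):   # rows, bottom to top
            order.append((i, j))
    return order


if __name__ == "__main__":
    print(cky_fill_order(3))
```

For a three-word input the cell for each word is visited first in its column, followed by the longer spans that end at that word.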

CKY: assumptions

Critical observation: any portion of the input string spanning i to j can be split at k, and structure can then be built using sub-solutions spanning i to k and sub-solutions spanning k to j.

Example
0 the 1 brown 2 dog 3
k = 1: possible constituents are [0,1] and [1,3]
k = 2: possible constituents are [0,2] and [2,3]
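The split observation can be written down directly. The helper below is a minimal sketch with a hypothetical name (splits); it simply enumerates every k strictly between i and j.

```python
def splits(i, j):
    """All ways to divide the span [i, j] into [i, k] and [k, j]."""
    return [((i, k), (k, j)) for k in range(i + 1, j)]


if __name__ == "__main__":
    # Spans for "0 the 1 brown 2 dog 3"
    for left, right in splits(0, 3):
        print(left, right)
```

A span covering a single word has no split point, which is why those cells are filled from the lexicon instead.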

Simple grammar
S → NP VBZ
S → NP VP
VP → VP PP
VP → VBZ NP
VP → VBZ PP
VP → VBZ NNS
VP → VBZ VP
VP → VBP NP
VP → VBP PP
NP → DT NN
NP → DT NNS
PP → IN NP
DT → the
NN → chef
NNS → fish
NNS → chopsticks
VBP → fish
VBZ → eats
IN → with

0 the 1 chef 2 eats 3 fish 4 with 5 the 6 chopsticks 7


Build an (n+1) x (n+1) matrix, where n = number of words in the input. The [i,j]s represent spans. Notice how the spans (e.g., [1,2]) differ from the word indices (e.g., chef, 2).

The cells are filled in the following order:

 1. [0,1] DT: "the" is labelled DT
 2. [1,2] NN: "chef" is labelled NN
 3. [0,2] NP: found an NP: [0,1], [1,2]
 4. [2,3] VBZ: "eats" is labelled VBZ
 5. [0,3] S: found an S: [0,2], [2,3]
 6. [3,4] NNS: "fish" is labelled NNS
 7. [3,4] VBP: "fish" is labelled VBP
 8. [2,4] VP: found a VP: [2,3], [3,4]
 9. [0,4] S: found an S: [0,2], [2,4]
10. [4,5] IN: "with" is labelled IN
11. [5,6] DT: "the" is labelled DT
12. [6,7] NNS: "chopsticks" is labelled NNS
13. [5,7] NP: found an NP: [5,6], [6,7]
14. [4,7] PP: found a PP: [4,5], [5,7]
15. [3,7] VP: found a VP: [3,4], [4,7]
16. [2,7] VP1: found a VP: [2,3], [3,7]
17. [2,7] VP2: found another VP: [2,4], [4,7]
18. [0,7] S1: found an S node: [0,2], [2,7]
19. [0,7] S2: found a second S node: also [0,2], [2,7]

Completed chart (rows are start positions i, columns are end positions j):

    the       chef      eats      fish      with      the       chopsticks
    1         2         3         4         5         6         7
0   DT        NP        S         S                             S1, S2
1             NN
2                       VBZ       VP                            VP1, VP2
3                                 NNS, VBP                      VP
4                                           IN                  PP
5                                                     DT        NP
6                                                               NNS

Recognition algorithm returns True when a root node is found in [0,n].

The CKY Algorithm (recognition)


function CKY-Parse(words, grammar) returns table
    for j ← 1 to length(words) do:                       (loop over columns)
        table[j-1, j] ← {A | A → words[j] ∈ grammar}     (add POS)
        for i ← j-2 downto 0 do:                         (loop over rows, backwards)
            for k ← i+1 to j-1 do:                       (loop over split points)
                table[i, j] ← table[i, j] ∪ {A | A → B C ∈ grammar, B ∈ table[i, k], C ∈ table[k, j]}
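The recognition pseudocode can be turned into a short runnable sketch. The version below is one possible rendering, not the lecture's own code: the names BINARY, LEXICAL and cky_recognize are assumptions, the grammar is the one from the "Simple grammar" slide split by hand into binary and lexical rules, and the chart is a dictionary keyed by (i, j) rather than a matrix.

```python
from collections import defaultdict

# Binary rules as (parent, left child, right child).
BINARY = [
    ("S", "NP", "VBZ"), ("S", "NP", "VP"),
    ("VP", "VP", "PP"), ("VP", "VBZ", "NP"), ("VP", "VBZ", "PP"),
    ("VP", "VBZ", "NNS"), ("VP", "VBZ", "VP"),
    ("VP", "VBP", "NP"), ("VP", "VBP", "PP"),
    ("NP", "DT", "NN"), ("NP", "DT", "NNS"), ("PP", "IN", "NP"),
]
# Lexical rules as word -> set of parts of speech.
LEXICAL = {
    "the": {"DT"}, "chef": {"NN"}, "fish": {"NNS", "VBP"},
    "chopsticks": {"NNS"}, "eats": {"VBZ"}, "with": {"IN"},
}


def cky_recognize(words, binary=BINARY, lexical=LEXICAL, root="S"):
    """Return (accepted, table) for a grammar in CNF."""
    n = len(words)
    table = defaultdict(set)                      # (i, j) -> set of categories
    for j in range(1, n + 1):                     # loop over columns
        # Python lists are 0-based, so word j of the pseudocode is words[j - 1].
        table[j - 1, j] |= lexical.get(words[j - 1], set())
        for i in range(j - 2, -1, -1):            # loop over rows, backwards
            for k in range(i + 1, j):             # loop over split points
                for parent, left, right in binary:
                    if left in table[i, k] and right in table[k, j]:
                        table[i, j].add(parent)
    return root in table[0, n], table


if __name__ == "__main__":
    sentence = "the chef eats fish with the chopsticks".split()
    accepted, table = cky_recognize(sentence)
    print(accepted, sorted(table[0, len(sentence)]))
```

Because each cell is a set, the two ways of building the VP over [2,7] collapse into a single VP entry; telling them apart requires the back-pointers discussed for parsing.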

CKY recognition vs. parsing

Returning the full parse requires storing more in a cell than just a node label.
We also require back-pointers to the constituents of that node.
We could also store whole trees, but that is less space efficient.
For parsing, we must add an extra step to the algorithm: follow the pointers and return the parse.

The CKY Algorithm (parsing)


function CKY-Parse(words, grammar) returns parses
    for j ← 1 to length(words) do:                                   (loop over columns)
        table[j-1, j] ← table[j-1, j] ∪ {A} for all {A | A → words[j] ∈ grammar}    (add all POS)
        for i ← j-2 downto 0 do:                                     (loop over rows, backwards)
            for k ← i+1 to j-1 do:                                   (loop over split points)
                for all {A | A → B C ∈ grammar, B ∈ table[i, k], C ∈ table[k, j]}:  (all productions)
                    table[i, j] ← table[i, j] ∪ {A}
                    back[i, j, A] ← back[i, j, A] ∪ {(k, B, C)}      (add back pointer)
    return buildtree(back[0, length(words), S]), table[0, length(words), S]    (follow back pointers)
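A back-pointer version can be sketched as below. This is an illustration under stated assumptions rather than the lecture's implementation: cky_parse and build_trees are hypothetical names, trees are returned as nested tuples, and the grammar is again the "Simple grammar" from the earlier slide, written out by hand.

```python
from collections import defaultdict

BINARY = [
    ("S", "NP", "VBZ"), ("S", "NP", "VP"),
    ("VP", "VP", "PP"), ("VP", "VBZ", "NP"), ("VP", "VBZ", "PP"),
    ("VP", "VBZ", "NNS"), ("VP", "VBZ", "VP"),
    ("VP", "VBP", "NP"), ("VP", "VBP", "PP"),
    ("NP", "DT", "NN"), ("NP", "DT", "NNS"), ("PP", "IN", "NP"),
]
LEXICAL = {
    "the": {"DT"}, "chef": {"NN"}, "fish": {"NNS", "VBP"},
    "chopsticks": {"NNS"}, "eats": {"VBZ"}, "with": {"IN"},
}


def cky_parse(words, binary=BINARY, lexical=LEXICAL, root="S"):
    """Return every parse tree rooted in `root` that spans the whole input."""
    n = len(words)
    # back[i, j, A] lists how A was built over [i, j]:
    # a word (lexical rule) or a (k, B, C) triple (binary rule).
    back = defaultdict(list)
    for j in range(1, n + 1):
        for pos in lexical.get(words[j - 1], ()):
            back[j - 1, j, pos].append(words[j - 1])
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for parent, left, right in binary:
                    if (i, k, left) in back and (k, j, right) in back:
                        back[i, j, parent].append((k, left, right))
    return build_trees(back, 0, n, root)


def build_trees(back, i, j, label):
    """Follow the back-pointers for `label` over [i, j]."""
    trees = []
    for entry in back.get((i, j, label), []):
        if isinstance(entry, str):                 # lexical entry
            trees.append((label, entry))
        else:                                      # (k, B, C) back-pointer
            k, left, right = entry
            for left_tree in build_trees(back, i, k, left):
                for right_tree in build_trees(back, k, j, right):
                    trees.append((label, left_tree, right_tree))
    return trees


if __name__ == "__main__":
    for tree in cky_parse("the chef eats fish with the chopsticks".split()):
        print(tree)
```

Storing (k, B, C) triples keeps each cell small; the trees are only materialised at the end, which is the space saving mentioned above compared with storing whole trees in the chart.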

Issues with CKY

Efficiency
The CKY algorithm can be performed in cubic time: O(n^3), where n = number of words in the sentence.
The complexity of the innermost loop is bounded by the square of the number of non-terminals.
The more rules, the less efficient; but this cost is a grammar-dependent factor L = r^2 that does not grow with sentence length, where r is the number of non-terminals.
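To make the cubic bound concrete, the sketch below counts how many (i, k, j) combinations the three nested loops visit. The function name count_split_steps is hypothetical; the closed form in the comment is the number of ways to choose three distinct positions out of n + 1.

```python
def count_split_steps(n):
    """Count the (i, k, j) triples visited by the CKY loops for n words."""
    steps = 0
    for j in range(1, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                steps += 1
    return steps


if __name__ == "__main__":
    for n in (3, 7, 20):
        # Closed form: (n + 1) * n * (n - 1) / 6, which grows as n^3.
        print(n, count_split_steps(n), (n + 1) * n * (n - 1) // 6)
```

Each of these steps then checks the grammar rules, which is where the grammar-dependent factor enters.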

Grammar requirements
The basic algorithm requires a binary grammar, in fact a grammar in Chomsky Normal Form.
The basic algorithm can be extended to account for arbitrary CFGs.
However, transforming a grammar into a CNF grammar is easier and more efficient than parsing with an arbitrary grammar.
Later, we'll look at the Earley Algorithm for parsing arbitrary CFGs.

Binary tree

[Figure: binary tree]

Chomsky Normal Form grammar


Definition
CNF grammar: a context-free grammar where the RHS of each production rule is restricted to be either two non-terminals or one terminal, and no empty productions are allowed.
There can be:
    no mixed rules (NP → the NN)
    no unit productions (NP → NNP), except for lexical rules such as NN → dog
    no right-hand sides of more than two non-terminals (VP → VBZ NP PP).
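The restrictions in the definition can be checked rule by rule. The sketch below is illustrative only: classify_rule is a hypothetical helper, rules are represented as a left-hand side plus a tuple of right-hand-side symbols, and the set of non-terminals is passed in explicitly rather than inferred.

```python
def classify_rule(lhs, rhs, nonterminals):
    """Label a production as OK for CNF or name the restriction it violates."""
    if len(rhs) == 0:
        return "empty production"
    if len(rhs) == 1:
        # A single non-terminal is a unit production; a single terminal is fine.
        return "unit production" if rhs[0] in nonterminals else "OK"
    if any(sym not in nonterminals for sym in rhs):
        return "mixed"
    if len(rhs) > 2:
        return "non-binary"
    return "OK"


if __name__ == "__main__":
    nonterminals = {"NP", "NN", "NNP", "VP", "VBZ", "PP", "DT"}
    examples = [
        ("NP", ("the", "NN")),
        ("NP", ("NNP",)),
        ("NN", ("dog",)),
        ("VP", ("VBZ", "NP", "PP")),
        ("NP", ("DT", "NN")),
    ]
    for lhs, rhs in examples:
        print(lhs, "->", " ".join(rhs), ":", classify_rule(lhs, rhs, nonterminals))
```

A grammar is in CNF under this definition when every rule is labelled OK.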

Grammar equivalence
Any CFG can be converted to a weakly equivalent grammar in CNF.

Definition
Weak equivalence: Two grammars are weakly equivalent if they generate the same set of strings (sentences). Transforming a grammar to CNF results in a new grammar that is weakly equivalent.

Definition
Strong equivalence: Two grammars are strongly equivalent if they generate the same set of strings AND the same structures over those strings. If only the variable names differ, then the grammars are said to be isomorphic.

Symbol naming conventions

Use new symbols (binarization): X1, X2, ..., Y3
    S → NP VP PUNC becomes: S → NP X1, X1 → VP PUNC
Delete a symbol (unary collapsing):
    SBAR → S, S → NP VP becomes SBAR → NP VP

CNF conversion algorithm


1. Removing unit-productions (unary collapsing):
    while there is a unit-production A → B,
        Remove A → B.
        foreach B → u, add A → u.

2. Remove terminals from mixed rules:
    foreach production A → B1 B2 ... Bk, containing a terminal x
        Add new non-terminal/production X1 → x (unless it has already been added)
        Replace every Bi = x with X1

3. Remove rules with more than two nonterminals on the RHS (binarization):
    foreach rule p of form A → B1 B2 ... Bk
        replace p with A → B1 X1, X1 → B2 X2, X2 → B3 X3, ..., X(k-2) → B(k-1) Bk
    (The Xi's are new variables.)
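The three conversion steps can be sketched in code as follows. This is a minimal illustration, not a definitive implementation: all function names are hypothetical, rules are (lhs, rhs-tuple) pairs, the naming scheme for new symbols (X_word for lifted terminals, X1, X2, ... for binarization) is an assumption, and the sketch assumes those names do not already occur in the grammar and that there are no empty productions.

```python
def remove_unit_productions(rules, nonterminals):
    """Step 1: replace each A -> B (B a non-terminal) by A -> u for every B -> u."""
    rules, removed = set(rules), set()
    while True:
        unit = next(((a, rhs) for a, rhs in sorted(rules)
                     if len(rhs) == 1 and rhs[0] in nonterminals), None)
        if unit is None:
            return rules
        rules.remove(unit)
        removed.add(unit)                     # never re-add a removed unit rule
        a, (b,) = unit
        for lhs, rhs in list(rules):
            if lhs == b and (a, rhs) not in removed:
                rules.add((a, rhs))


def lift_terminals(rules, nonterminals):
    """Step 2: in rules with a longer RHS, replace terminal x by a new symbol X_x."""
    out = set()
    for lhs, rhs in rules:
        if len(rhs) > 1:
            new_rhs = []
            for sym in rhs:
                if sym not in nonterminals:
                    out.add((f"X_{sym}", (sym,)))   # new production X_x -> x
                    sym = f"X_{sym}"
                new_rhs.append(sym)
            rhs = tuple(new_rhs)
        out.add((lhs, rhs))
    return out


def binarize(rules):
    """Step 3: A -> B1 B2 ... Bk becomes A -> B1 X1, X1 -> B2 X2, ..."""
    out, counter = set(), 0
    for lhs, rhs in sorted(rules):
        while len(rhs) > 2:
            counter += 1
            new = f"X{counter}"
            out.add((lhs, (rhs[0], new)))
            lhs, rhs = new, rhs[1:]
        out.add((lhs, rhs))
    return out


def to_cnf(rules, nonterminals):
    rules = remove_unit_productions(rules, nonterminals)
    rules = lift_terminals(rules, nonterminals)
    return binarize(rules)


if __name__ == "__main__":
    nonterminals = {"S", "NP", "VP", "PUNC", "DT", "NN", "VBZ"}
    rules = {
        ("S", ("NP", "VP", "PUNC")), ("S", ("S", "and", "S")),
        ("NP", ("DT", "NP")), ("NP", ("NN",)),
        ("NN", ("dog",)), ("NN", ("cat",)),
        ("VP", ("VBZ", "NP")), ("VP", ("VBZ",)),
        ("VBZ", ("sleeps",)), ("VBZ", ("eats",)), ("DT", ("the",)),
    }
    for lhs, rhs in sorted(to_cnf(rules, nonterminals)):
        print(lhs, "->", " ".join(rhs))
```

The steps run in the order listed above. Note that the unit-production step keeps the original B → u rules and only adds the collapsed A → u rules alongside them.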

Binarization

[Figure: binarization]

A sample CFG

S → NP VP PUNC     (non-binary)
S → S and S        (mixed)
NP → DT NP         (OK)
NP → NN            (unit production)
NN → dog           (OK)
NN → cat           (OK)
VP → VBZ NP        (OK)
VP → VBZ           (unit production)
VBZ → sleeps       (OK)
VBZ → eats         (OK)
DT → the           (OK)

Conversion of CFG to CNF: Step 1

Non-CNF grammar    CNF grammar     Action
NP → NN
NN → dog           NP → dog        (collapse rule)
NN → cat           NP → cat        (collapse rule)
VP → VBZ
VBZ → sleeps       VP → sleeps     (collapse rule)
VBZ → eats         VP → eats       (collapse rule)
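Step 1 can be sketched in Python as follows. This is an illustration under assumptions, not the lecture's implementation: rules are encoded as `(lhs, rhs)` tuples, and the name `collapse_units` is hypothetical. For every unit production A → B, the sketch emits A → rhs for each non-unit rule B → rhs, following chains of unit productions if there are any.

```python
def collapse_units(rules, nonterminals):
    """Return the rules created by collapsing unit productions A -> B."""

    def is_unit(rhs):
        return len(rhs) == 1 and rhs[0] in nonterminals

    collapsed = []
    for lhs, rhs in rules:
        if not is_unit(rhs):
            continue
        # Follow A -> B -> C ... while guarding against cycles.
        pending, seen = [rhs[0]], {lhs}
        while pending:
            target = pending.pop()
            if target in seen:
                continue
            seen.add(target)
            for lhs2, rhs2 in rules:
                if lhs2 != target:
                    continue
                if is_unit(rhs2):
                    pending.append(rhs2[0])
                else:
                    collapsed.append((lhs, rhs2))
    return collapsed


step1_rules = [
    ("NP", ("NN",)),
    ("NN", ("dog",)),
    ("NN", ("cat",)),
    ("VP", ("VBZ",)),
    ("VBZ", ("sleeps",)),
    ("VBZ", ("eats",)),
]
new_rules = collapse_units(step1_rules, {"NP", "NN", "VP", "VBZ"})
for lhs, rhs in new_rules:
    print(lhs, "->", " ".join(rhs))
```

The sketch only returns the newly created rules. Whether the original lexical rules (for example VBZ → sleeps) are kept alongside them is a separate decision that depends on whether the symbol is still used elsewhere in the grammar.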

Conversion of CFG to CNF: Step 2

Non-CNF grammar    CNF grammar     Action
S → S and S        S → S X1        (new symbol)
                   X1 → X2 S       (new symbol)
                   X2 → and
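The treatment of the mixed rule can be sketched as below. This is one possible reading of the slide, with hypothetical names (`to_binary`, `fresh`): the rule is first split so that each right-hand side has two symbols, then any terminal that sits in a binary rule is replaced by a fresh symbol that rewrites to that terminal. The X1, X2 numbering simply follows the order in which the generator hands out names.

```python
from itertools import count


def to_binary(lhs, rhs, nonterminals, fresh):
    """Rewrite lhs -> rhs into binary rules plus lexical rules."""
    known = set(nonterminals)
    split = []
    rhs = tuple(rhs)
    # Peel off the first symbol until only two symbols remain.
    while len(rhs) > 2:
        rest = next(fresh)
        known.add(rest)
        split.append((lhs, (rhs[0], rest)))
        lhs, rhs = rest, rhs[1:]
    split.append((lhs, rhs))

    out = []
    for left, right in split:
        lexical, new_right = [], []
        for sym in right:
            if len(right) == 2 and sym not in known:
                # A terminal inside a binary rule gets its own symbol.
                pre = next(fresh)
                lexical.append((pre, (sym,)))
                new_right.append(pre)
            else:
                new_right.append(sym)
        out.append((left, tuple(new_right)))
        out.extend(lexical)
    return out


fresh = (f"X{i}" for i in count(1))  # yields X1, X2, ...
step2 = to_binary("S", ("S", "and", "S"), {"S"}, fresh)
for lhs, rhs in step2:
    print(lhs, "->", " ".join(rhs))
```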

Conversion of CFG to CNF: Step 3

Non-CNF grammar    CNF grammar      Action
S → NP VP PUNC     S → NP X3        (new symbol)
                   X3 → VP PUNC
NP → DT NP         NP → DT NP       (carry over)
VP → VBZ NP        VP → VBZ NP      (carry over)
DT → the           DT → the         (carry over)
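Step 3 amounts to one pass over the remaining rules: anything with more than two symbols on the right is split using a new symbol, and everything else is carried over unchanged. A short sketch, assuming the `(lhs, rhs)` tuple encoding and a hypothetical `split_long_rules` helper whose `start` argument continues the X numbering from the previous step:

```python
def split_long_rules(rules, start=1):
    """Split rules with more than two RHS symbols; carry the rest over."""
    n = start
    out = []
    for lhs, rhs in rules:
        rhs = tuple(rhs)
        while len(rhs) > 2:
            new = f"X{n}"
            n += 1
            out.append((lhs, (rhs[0], new)))  # (new symbol)
            lhs, rhs = new, rhs[1:]
        out.append((lhs, rhs))  # (carry over) when nothing was split
    return out


step3 = split_long_rules(
    [
        ("S", ("NP", "VP", "PUNC")),
        ("NP", ("DT", "NP")),
        ("VP", ("VBZ", "NP")),
        ("DT", ("the",)),
    ],
    start=3,
)
for lhs, rhs in step3:
    print(lhs, "->", " ".join(rhs))
```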

CFG in CNF

NP → dog
NP → cat
VP → sleeps
VP → eats
S → S X1
X1 → X2 S
X2 → and
S → NP X3
X3 → VP PUNC
NP → DT NP
VP → VBZ NP
DT → the
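With the grammar in CNF, a CKY recognizer only needs two lookup tables: one for lexical rules and one for binary rules. The sketch below is an illustration, not the homework solution. The names `LEXICAL`, `BINARY`, `cky_chart`, and `recognize` are hypothetical, and the lexical rule PUNC → "." is an assumption added here, because the grammar on the slide gives no expansion for PUNC.

```python
from collections import defaultdict

# Lexical rules A -> word, indexed by word.
LEXICAL = {
    "dog": {"NP"},
    "cat": {"NP"},
    "sleeps": {"VP"},
    "eats": {"VP"},
    "and": {"X2"},
    "the": {"DT"},
    ".": {"PUNC"},  # ASSUMPTION: not on the slide, added for illustration
}

# Binary rules A -> B C, indexed by (B, C).
BINARY = {
    ("S", "X1"): {"S"},
    ("X2", "S"): {"X1"},
    ("NP", "X3"): {"S"},
    ("VP", "PUNC"): {"X3"},
    ("DT", "NP"): {"NP"},
    ("VBZ", "NP"): {"VP"},
}


def cky_chart(words):
    """Fill chart[i, j] with the nonterminals spanning words[i:j]."""
    n = len(words)
    chart = defaultdict(set)
    for i, word in enumerate(words):
        chart[i, i + 1] |= LEXICAL.get(word, set())
    for width in range(2, n + 1):
        for i in range(n - width + 1):
            j = i + width
            for k in range(i + 1, j):  # every split point
                for b in chart[i, k]:
                    for c in chart[k, j]:
                        chart[i, j] |= BINARY.get((b, c), set())
    return chart


def recognize(words, start="S"):
    return start in cky_chart(words)[0, len(words)]


print(recognize("the dog sleeps .".split()))
print(recognize("the dog sleeps . and the cat eats .".split()))
print(recognize("the dog eats the cat .".split()))
```

Under these assumptions the first two sentences are recognized and the third is not: the grammar as listed has no lexical rule that introduces VBZ, so VP → VBZ NP can never apply.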

Homework 2 discussion

Homework: CKY and toCNF

Symbol naming conventions

Refer to NLTK treetransforms module


Create new symbols from old (binarization):
S → NP VP PUNC becomes: S → NP S|VP-PUNC, S|VP-PUNC → VP PUNC

Create new symbols from old (unary collapsing):
SBAR → S, S → NP VP becomes: SBAR+S → NP VP
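The two naming conventions can be sketched in plain Python as follows. This does not call NLTK; it only mimics the names shown on the slide, and the helper names `binarize_named` and `collapse_named` are hypothetical.

```python
def binarize_named(lhs, rhs):
    """Split lhs -> rhs, naming each new symbol 'PARENT|CHILD-CHILD...'."""
    rules = []
    parent, current = lhs, lhs
    rhs = list(rhs)
    while len(rhs) > 2:
        # The new symbol records the parent and the children it covers.
        new = f"{parent}|{'-'.join(rhs[1:])}"
        rules.append((current, (rhs[0], new)))
        current, rhs = new, rhs[1:]
    rules.append((current, tuple(rhs)))
    return rules


def collapse_named(parent, child, child_rhs):
    """Merge parent -> child and child -> child_rhs as 'PARENT+CHILD'."""
    return (f"{parent}+{child}", tuple(child_rhs))


print(binarize_named("S", ["NP", "VP", "PUNC"]))
print(collapse_named("SBAR", "S", ["NP", "VP"]))
```

Building the new name from the old symbols, rather than using X1, X2, and so on, keeps enough information in the label to undo the transformation later.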
