
Chomsky Notes

The CKY algorithm parses a string to determine whether it can be generated by a given context-free grammar. It uses dynamic programming and bottom-up parsing to fill a table recording how the string can be generated. CKY runs in polynomial time by avoiding repeated computation while still finding all the pieces of an exponential number of parse trees. It works on context-free grammars in Chomsky Normal Form, building a table in which each non-terminal is placed in the cell for the span it covers, by combining sub-parts from smaller spans.

Uploaded by

Abhishek Chauhan
Copyright
© Attribution Non-Commercial (BY-NC)

Dynamic Programming

We need a method that fills a table with partial results, does not do (avoidable) repeated work, and can find all the pieces of an exponential number of trees in polynomial time. Two popular methods are CKY and Earley.

CKY Algorithm

The Cocke-Younger-Kasami (CYK) algorithm (alternatively called CKY) determines whether a string can be generated by a given context-free grammar and, if so, how it can be generated. This is known as parsing the string. The algorithm employs bottom-up parsing and dynamic programming.

The standard version of CYK operates on context-free grammars given in Chomsky normal form (CNF). Any context-free grammar may be transformed to a CNF grammar expressing the same language.

In the theory of computation, the importance of the CYK algorithm stems from the fact that it constructively proves that it is decidable whether a given string belongs to the formal language described by a given context-free grammar, and the fact that it does so quite efficiently. The worst-case running time of CYK is O(n^3 · |G|), where n is the length of the parsed string and |G| is the size of the CNF grammar G. In worst-case terms, this makes it one of the most efficient algorithms for recognizing general context-free languages.

The algorithm in pseudocode is as follows:

    Input: a string of n words
    Create a 2D table chart of size (n+1) x (n+1); each cell holds a set of non-terminals
    For i = 0 to n-1
        Add A to chart[i][i+1] if there is a rule A -> a and input[i] = a
    For j = 2 to n
        For i = j-2 down to 0
            For k = i+1 to j-1
                Add A to chart[i][j] if there is a rule A -> B C, B is in chart[i][k], and C is in chart[k][j]
    Return yes if chart[0][n] contains the start symbol; else return no
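The pseudocode can be turned into a short runnable recognizer. The following is a minimal sketch in Python, not a definitive implementation: the function name `cky_recognize`, the rule encoding (pairs for A -> a, triples for A -> B C), and the toy grammar for strings of the form a^n b^n are all assumptions made for illustration.

```python
from collections import defaultdict


def cky_recognize(words, lexical_rules, binary_rules, start="S"):
    """Return True if `words` can be derived from `start` under a CNF grammar.

    lexical_rules: set of (A, a) pairs, one per rule A -> a
    binary_rules:  set of (A, B, C) triples, one per rule A -> B C
    """
    n = len(words)
    # chart[(i, j)] holds every non-terminal that spans words[i:j].
    chart = defaultdict(set)

    # Spans of width 1: apply the lexical rules A -> a.
    for i, word in enumerate(words):
        for lhs, terminal in lexical_rules:
            if terminal == word:
                chart[(i, i + 1)].add(lhs)

    # Wider spans: j is the end, i the start, k the split point.
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for lhs, left, right in binary_rules:
                    if left in chart[(i, k)] and right in chart[(k, j)]:
                        chart[(i, j)].add(lhs)

    return start in chart[(0, n)]


# Toy CNF grammar (an assumption, not from the notes): a^n b^n for n >= 1.
LEXICAL = {("A", "a"), ("B", "b")}
BINARY = {("S", "A", "B"), ("S", "A", "X"), ("X", "S", "B")}

print(cky_recognize(list("aabb"), LEXICAL, BINARY))  # True
print(cky_recognize(list("aab"), LEXICAL, BINARY))   # False
```

The three nested loops over j, i, and k give the n^3 factor in the running time, and the innermost loop over the binary rules gives the |G| factor.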

The CKY (Cocke-Kasami-Younger) algorithm requires the grammar to be in Chomsky Normal Form (CNF). All rules must have one of the following forms:

    A -> B C
    A -> w

Any grammar can be converted automatically to Chomsky Normal Form.

Converting to CNF

Rules that mix terminals and non-terminals: introduce a new dummy non-terminal that covers the terminal. For example, INFVP -> to VP is replaced by:

    INFVP -> TO VP
    TO -> to

Rules that have a single non-terminal on the right (unit productions): rewrite each unit production with the right-hand sides of the rules that expand that non-terminal.

Rules whose right-hand side is longer than 2: introduce dummy non-terminals that spread the right-hand side over several binary rules.

Automatic Conversion to CNF
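Two of the three conversion steps (lifting terminals out of mixed rules, and splitting long right-hand sides) can be sketched in a few lines of Python. This is an illustrative sketch under stated assumptions, not a complete converter: unit productions are not handled, the function name `to_cnf_partial` is hypothetical, dummy non-terminals for terminals are named by upper-casing the terminal (as in TO -> to), and dummies for long rules are named X1, X2, and so on. Both naming schemes assume the generated names do not clash with existing non-terminals.

```python
def to_cnf_partial(rules, nonterminals):
    """Apply two CNF steps: lift terminals out of mixed rules, then binarize.

    rules: list of (lhs, rhs) pairs, where rhs is a tuple of symbols.
    nonterminals: set of symbols to treat as non-terminals.
    Unit productions (A -> B) are NOT rewritten by this sketch.
    """
    out = []
    fresh = 0
    for lhs, rhs in rules:
        # Step 1: in a rule with more than one right-hand symbol, replace
        # each terminal with a dummy non-terminal that covers it.
        if len(rhs) > 1:
            new_rhs = []
            for sym in rhs:
                if sym in nonterminals:
                    new_rhs.append(sym)
                else:
                    dummy = sym.upper()  # assumed naming scheme
                    if (dummy, (sym,)) not in out:
                        out.append((dummy, (sym,)))
                    new_rhs.append(dummy)
            rhs = tuple(new_rhs)
        # Step 2: while the right-hand side is longer than 2, fold its
        # first two symbols into a fresh dummy non-terminal.
        while len(rhs) > 2:
            fresh += 1
            dummy = f"X{fresh}"  # assumed naming scheme
            out.append((dummy, rhs[:2]))
            rhs = (dummy,) + rhs[2:]
        out.append((lhs, rhs))
    return out


# The mixed rule from the notes.
mixed = to_cnf_partial([("INFVP", ("to", "VP"))], {"INFVP", "VP"})
print(mixed)  # [('TO', ('to',)), ('INFVP', ('TO', 'VP'))]

# A three-symbol rule, chosen here only for illustration.
long_rule = to_cnf_partial([("S", ("Aux", "NP", "VP"))], {"S", "Aux", "NP", "VP"})
print(long_rule)  # [('X1', ('Aux', 'NP')), ('S', ('X1', 'VP'))]
```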

Sample Grammar

CKY Parsing

Assume the rules are in CNF and consider the rule A -> B C. If there is an A in the input, then there must be a B followed by a C in the input. If the A spans from i to j in the input, then there must be some k such that i < k < j; that is, B splits from C someplace.

So let's build a table so that an A spanning from i to j in the input is placed in cell [i,j] of the table. A non-terminal spanning the entire string will then sit in cell [0,n]. If we build the table bottom-up, we'll know that the parts of the A must go from i to k and from k to j. That means that for a rule like A -> B C we should look for a B in [i,k] and a C in [k,j].

In other words, if we think there might be an A spanning i,j in the input AND A -> B C is a rule in the grammar, THEN there must be a B in [i,k] and a C in [k,j] for some i < k < j. So just loop over the possible k values.

CKY Example

    S -> NP VP
    VP -> V NP
    NP -> NP PP
    VP -> VP PP
    PP -> P NP
    NP -> John, Mary, Denver
    V -> called
    P -> from
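To make the "loop over the possible k values" step concrete, the sketch below checks a single cell against a single rule. It assumes the sentence "John called Mary from Denver" (the notes give only the grammar, so the sentence is an assumption) and a chart whose smaller spans were filled in by hand from the example grammar. The helper name `split_points` is hypothetical.

```python
def split_points(chart, i, j, rule):
    """Return every k with i < k < j where rule A -> B C builds an A over [i, j]."""
    _, left, right = rule
    return [
        k
        for k in range(i + 1, j)
        if left in chart.get((i, k), set()) and right in chart.get((k, j), set())
    ]


# Word positions: 0 John 1 called 2 Mary 3 from 4 Denver 5
# Cells for the smaller spans, filled by hand from the example grammar.
chart = {
    (0, 1): {"NP"}, (1, 2): {"V"}, (2, 3): {"NP"}, (3, 4): {"P"}, (4, 5): {"NP"},
    (1, 3): {"VP"},  # called Mary
    (3, 5): {"PP"},  # from Denver
    (0, 3): {"S"},   # John called Mary
    (2, 5): {"NP"},  # Mary from Denver
}

# Two different rules put a VP in cell [1,5], each at a different split.
vp_via_v_np = split_points(chart, 1, 5, ("VP", "V", "NP"))
vp_via_vp_pp = split_points(chart, 1, 5, ("VP", "VP", "PP"))
print(vp_via_v_np)   # [2]
print(vp_via_vp_pp)  # [3]

# Once VP is recorded in [1,5], S -> NP VP covers the whole string.
chart[(1, 5)] = {"VP"}
s_splits = split_points(chart, 0, 5, ("S", "NP", "VP"))
print(s_splits)  # [1]
```

The two ways of building the VP over [1,5] correspond to attaching "from Denver" to "Mary" or to "called Mary".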

Ambiguity

Both CKY and Earley will produce multiple S structures for the [0,n] table entry when the input is ambiguous. Both efficiently store the sub-parts that are shared between multiple parses. But neither can tell us which parse is right.

Not a parser, a recognizer

The presence of an S state with the right attributes in the right place indicates a successful recognition. But no parse tree means no parser. That's how we solve (not) an exponential problem in polynomial time.

Converting CKY from Recognizer to Parser

With the addition of a few pointers we have a parser: augment each new entry in the chart to point to where we came from, that is, to the entries it was built from.
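One way to add those pointers is sketched below in Python, using the example grammar from the CKY Example. This is an assumption-laden illustration rather than the method the notes prescribe: the names `cky_parse` and `trees`, the backpointer encoding (a word for lexical entries, a (k, B, C) triple for binary entries), and the test sentence "John called Mary from Denver" are all chosen here for illustration.

```python
from collections import defaultdict

# Example grammar from the notes, encoded for this sketch.
LEXICAL = {"John": {"NP"}, "Mary": {"NP"}, "Denver": {"NP"},
           "called": {"V"}, "from": {"P"}}
BINARY = [("S", "NP", "VP"), ("VP", "V", "NP"), ("NP", "NP", "PP"),
          ("VP", "VP", "PP"), ("PP", "P", "NP")]


def cky_parse(words):
    """Fill a chart in which every entry remembers how it was built."""
    n = len(words)
    # back[(i, j)][A] lists the ways A was built over words[i:j]:
    # a word (str) for A -> w, or a (k, B, C) triple for A -> B C split at k.
    back = defaultdict(lambda: defaultdict(list))
    for i, word in enumerate(words):
        for lhs in LEXICAL.get(word, ()):
            back[(i, i + 1)][lhs].append(word)
    for j in range(2, n + 1):
        for i in range(j - 2, -1, -1):
            for k in range(i + 1, j):
                for lhs, left, right in BINARY:
                    if left in back[(i, k)] and right in back[(k, j)]:
                        back[(i, j)][lhs].append((k, left, right))
    return back


def trees(back, i, j, symbol):
    """Follow the backpointers to yield every tree for `symbol` over [i, j]."""
    for entry in back[(i, j)][symbol]:
        if isinstance(entry, str):
            yield (symbol, entry)
        else:
            k, left, right = entry
            for left_tree in trees(back, i, k, left):
                for right_tree in trees(back, k, j, right):
                    yield (symbol, left_tree, right_tree)


words = "John called Mary from Denver".split()
back = cky_parse(words)
parses = list(trees(back, 0, len(words), "S"))
print(len(parses))  # 2
for tree in parses:
    print(tree)
```

Filling the chart stays polynomial because each entry stores only pointers to its parts. The exponential cost appears only if every tree is enumerated, as `trees` does here.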
