
Appendix F. CYK Algorithm

The CYK algorithm decides whether a string of length $n$ is in the language of a context-free grammar in $O(n^3)$ time. It fills an upper triangular matrix in which each entry $V_{ij}$ is the set of nonterminals that can derive the substring $w_{ij}$. Each entry is computed by splitting $w_{ij}$ into two adjacent parts and looking for a rule whose right-hand side derives the two parts. If the start symbol is in the entry $V_{1n}$, which covers the whole string, the string is accepted as part of the language.

Uploaded by

kims3515354178
Copyright
© Attribution Non-Commercial (BY-NC)

Appendix F. CYK Algorithm for the Membership Test for CFL

The membership test problem for context-free languages is, for a given arbitrary
CFG G, to decide whether a string w is in the language L(G) or not. If it is, the
problem commonly also asks for a sequence of rules that derives w. A brute force
technique is to generate all possible parse trees yielding a string of length |w|, and
check whether any of them yields w. Since the number of such trees can grow
exponentially with |w|, this approach is too slow to be practical.
Here we present the well-known CYK algorithm (named for Cocke, Younger and
Kasami, who first developed it). This algorithm, which takes $O(n^3)$ time, is based on
dynamic programming. It assumes that the given CFG is in Chomsky normal form (CNF),
that is, every rule has the form $A \to BC$ or $A \to a$.
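Since the algorithm relies on the CNF assumption, it can help to check a grammar before running it. The following is a minimal sketch, assuming rules are stored as (head, body) pairs; the name `is_cnf` and the sample rules are illustrative and not part of the original slides. It ignores the special rule $S \to \varepsilon$ that some CNF definitions allow.

```python
def is_cnf(rules, nonterminals):
    """Return True if every rule has the form A -> B C or A -> a."""
    for head, body in rules:
        if head not in nonterminals:
            return False
        if len(body) == 2 and all(sym in nonterminals for sym in body):
            continue  # A -> B C with two nonterminals on the right
        if len(body) == 1 and body[0] not in nonterminals:
            continue  # A -> a with a single terminal on the right
        return False
    return True


# Hypothetical example rules, written as (head, body) pairs.
cnf_rules = [("S", ("A", "B")), ("A", ("a",)), ("B", ("b",))]
non_cnf_rules = [("S", ("a", "S", "b")), ("S", ("a", "b"))]

print(is_cnf(cnf_rules, {"S", "A", "B"}))  # True
print(is_cnf(non_cnf_rules, {"S"}))        # False
```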
Let $w = a_1 a_2 \cdots a_n$, $w_{ij} = a_i a_{i+1} \cdots a_j$, and $w_{ii} = a_i$. Let $V_{ij}$ be the set of nonterminal
symbols that can derive the string $w_{ij}$, i.e.,

$$V_{ij} = \{\, A \mid A \Rightarrow^{*} w_{ij},\ A \text{ is a nonterminal symbol of } G \,\}$$


Construct an upper triangular matrix whose entries are $V_{ij}$, as shown below. In the matrix, $j$
corresponds to the position of the input symbol (the column), and $i$ corresponds to the diagonal number.

    w =  a1   a2   a3   a4   a5   a6
         V11  V22  V33  V44  V55  V66
              V12  V23  V34  V45  V56
                   V13  V24  V35  V46
                        V14  V25  V36
                             V15  V26
                                  V16

Clearly, by definition, if $S \in V_{16}$, then the string $w \in L(G)$.


The entries $V_{ij}$ can be computed from the entries in the $i$-th diagonal and those in the
$j$-th column, going along the direction indicated by the two arrows in the following
figure. If $A \in V_{ii}$ (which implies $A$ can derive $a_i$), $B \in V_{(i+1)j}$ (implying $B$ can derive
$a_{i+1} \cdots a_j$), and $C \to AB$ is a rule of $G$, then put $C$ in the set $V_{ij}$. If $D \in V_{i(i+1)}$ (which implies $D$ can
derive $a_i a_{i+1}$), $E \in V_{(i+2)j}$ (implying $E$ can derive $a_{i+2} \cdots a_j$), and $F \to DE$ is a rule of $G$, then put $F$ in
the set $V_{ij}$, and so on.
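The step for a single split can be sketched as a small set operation. This is only an illustration: the function name `combine` and the (head, body) rule encoding are assumptions, and the two rules are the generic $C \to AB$ and $F \to DE$ from the text.

```python
def combine(left, right, binary_rules):
    """Heads of rules X -> Y Z with Y in `left` and Z in `right`."""
    return {head for head, (y, z) in binary_rules if y in left and z in right}


# The two generic rules from the text: C -> A B and F -> D E.
binary_rules = [("C", ("A", "B")), ("F", ("D", "E"))]

print(combine({"A"}, {"B"}, binary_rules))  # {'C'}
print(combine({"D"}, {"E"}, binary_rules))  # {'F'}
print(combine({"A"}, {"E"}, binary_rules))  # set()
```

Each call handles one way of cutting $w_{ij}$ in two; the full entry $V_{ij}$ is the union of the results over all cut points.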

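[Figure: the entries $V_{ii}, V_{i(i+1)}, \ldots, V_{i(j-1)}$ on the $i$-th diagonal are paired, in order, with the entries $V_{(i+1)j}, V_{(i+2)j}, \ldots, V_{jj}$ in the $j$-th column to compute $V_{ij}$.]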


For example, the set $V_{25}$ is computed as follows. Let $A$, $B$ and $C$ be nonterminals of $G$.

$$
\begin{aligned}
V_{25} ={} & \{\, A \mid B \in V_{22},\ C \in V_{35},\ \text{and } A \to BC \,\} \\
 \cup{} & \{\, B \mid C \in V_{23},\ A \in V_{45},\ \text{and } B \to CA \,\} \\
 \cup{} & \{\, C \mid B \in V_{24},\ A \in V_{55},\ \text{and } C \to BA \,\} \\
 \cup{} & \cdots
\end{aligned}
$$

(Recall that $G$ is in CNF.)


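In general,

$$V_{ij} = \bigcup_{i \le k \le j-1} \{\, A \mid B \in V_{ik},\ C \in V_{(k+1)j},\ \text{and } A \to BC \,\}$$

[Figure: the entries on the $i$-th diagonal and in the $j$-th column that are combined to form $V_{ij}$, for $w_{ij} = a_i a_{i+1} \cdots a_j$.]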
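The general formula can be read as a loop over the cut point $k$. The sketch below assumes the table is a dictionary keyed by 1-based index pairs $(i, j)$; the function name `cell` and the one-rule grammar fragment ($D \to AD$ on the string aaa) are hypothetical and chosen only to keep the example small.

```python
def cell(V, i, j, binary_rules):
    """Union over k = i..j-1 of {A | B in V[i,k], C in V[k+1,j], A -> B C}."""
    result = set()
    for k in range(i, j):  # k runs from i to j-1
        for head, (b, c) in binary_rules:
            if b in V[(i, k)] and c in V[(k + 1, j)]:
                result.add(head)
    return result


# Hypothetical fragment: the single binary rule D -> A D, input "aaa",
# with each diagonal entry V[i,i] already set to {A, D}.
binary_rules = [("D", ("A", "D"))]
V = {(1, 1): {"A", "D"}, (2, 2): {"A", "D"}, (3, 3): {"A", "D"}}

V[(1, 2)] = cell(V, 1, 2, binary_rules)
V[(2, 3)] = cell(V, 2, 3, binary_rules)
V[(1, 3)] = cell(V, 1, 3, binary_rules)
print(V[(1, 3)])  # {'D'}
```

The entries must be filled in an order that makes $V_{ik}$ and $V_{(k+1)j}$ available before $V_{ij}$, which is why the shorter substrings are computed first here.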

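Example: $w = aaaabb$

CFG G:

    S → aSb | aDb
    D → aD | a

CNF CFG:

    S → AB | AC
    A → a
    B → SB | b
    C → DB
    D → AD | a

Note that, as written, the CNF grammar also derives ab (via $S \to AB$, $A \to a$, $B \to b$), which $G$ does not; the string $w = aaaabb$ tested here is derived by both grammars.

Entries of the matrix, row by row, with the rules that produce them in parentheses:

    V11 = V22 = V33 = V44 = {A, D}     V55 = V66 = {B}
    V12 = V23 = V34 = {D}  (D → AD)    V45 = {S, C}  (S → AB, C → DB)    V56 = {}
    V13 = V24 = {D}  (D → AD)          V35 = {S, C}  (S → AC, C → DB)    V46 = {B}  (B → SB)
    V14 = {D}  (D → AD)                V25 = {S, C}  (S → AC, C → DB)    V36 = {S, B, C}  (S → AB, C → DB, B → SB)
    V15 = {S, C}  (S → AC, C → DB)     V26 = {S, B, C}  (S → AB, S → AC, C → DB, B → SB)
    V16 = {S, B, C}  (S → AB, S → AC, C → DB, B → SB)

Since $S \in V_{16}$, we have $w \in L(G)$.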
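The example can be checked mechanically. The following sketch fills the whole table for the CNF grammar of the example and the input aaaabb; the function name `cyk_table` and the split of the grammar into `unit_rules` and `binary_rules` are illustrative choices, not notation from the slides.

```python
def cyk_table(w, unit_rules, binary_rules):
    """Return a dict V with V[(i, j)] = nonterminals deriving w[i..j] (1-based)."""
    n = len(w)
    V = {}
    for i in range(1, n + 1):
        V[(i, i)] = {head for head, a in unit_rules if a == w[i - 1]}
    for j in range(2, n + 1):
        for i in range(j - 1, 0, -1):
            V[(i, j)] = set()
            for k in range(i, j):
                for head, (b, c) in binary_rules:
                    if b in V[(i, k)] and c in V[(k + 1, j)]:
                        V[(i, j)].add(head)
    return V


# CNF grammar of the example: S -> AB | AC, A -> a, B -> SB | b, C -> DB, D -> AD | a
unit_rules = [("A", "a"), ("B", "b"), ("D", "a")]
binary_rules = [
    ("S", ("A", "B")), ("S", ("A", "C")), ("B", ("S", "B")),
    ("C", ("D", "B")), ("D", ("A", "D")),
]

V = cyk_table("aaaabb", unit_rules, binary_rules)
for length in range(1, 7):  # one printed row per substring length
    print([sorted(V[(i, i + length - 1)]) for i in range(1, 8 - length)])
print("S" in V[(1, 6)])  # True
```

Each printed row corresponds to one row of the matrix, from the single symbols down to the entry for the whole string.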



Here is pseudocode for the algorithm.

    // Initially all sets Vij are empty.
    // Input: x = a1 a2 ... an.
    for ( i = 1; i <= n; i++ )
        Vii = { A | A → ai };
    for ( j = 2; j <= n; j++ )
        for ( i = j-1; i >= 1; i-- )
            for ( k = i; k <= j-1; k++ )
                Vij = Vij ∪ { A | B ∈ Vik, C ∈ V(k+1)j and A → BC };
    if ( S ∈ V1n ) output "yes"; else output "no";

The number of sets $V_{ij}$ is $O(n^2)$, and for a fixed grammar it takes $O(n)$ steps to compute each $V_{ij}$. Thus
the time complexity of the algorithm is $O(n^3)$.
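The cubic bound can be illustrated by counting how often the innermost loop body runs. The sketch below mirrors the three nested loops of the pseudocode and compares the count with the closed form $(n^3 - n)/6$, which follows from summing $d(n-d)$ over the split counts $d = 1, \ldots, n-1$; the function name `inner_iterations` is illustrative.

```python
def inner_iterations(n):
    """Count executions of the innermost loop body for an input of length n."""
    count = 0
    for j in range(2, n + 1):
        for i in range(j - 1, 0, -1):
            for k in range(i, j):
                count += 1
    return count


for n in (6, 12, 24):
    # The two numbers on each line agree: the count equals (n^3 - n) / 6.
    print(n, inner_iterations(n), (n**3 - n) // 6)
```

Doubling $n$ multiplies the count by roughly eight, as expected for a cubic algorithm; the cost per iteration depends on the number of rules, which is constant for a fixed grammar.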
