Chapter 17: Context-Free Languages
Peter Cappello, Department of Computer Science, University of California, Santa Barbara, Santa Barbara, CA 93106, [email protected]
Please read the corresponding chapter before attending this lecture. These notes are not intended to be complete. They are supplemented with figures and with material that arises during the lecture period in response to questions.
Based on Theory of Computing, 2nd Ed., D. Cohen, John Wiley & Sons, Inc.
Closure Properties
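Theorem: CFLs are closed under union
If $L_1$ and $L_2$ are CFLs, then $L_1 \cup L_2$ is a CFL.
Proof
1. Let $L_1$ and $L_2$ be generated by the CFGs $G_1 = (V_1, T_1, P_1, S_1)$ and $G_2 = (V_2, T_2, P_2, S_2)$, respectively.
2. Without loss of generality, subscript each nonterminal of $G_1$ with a 1, and each nonterminal of $G_2$ with a 2 (so that $V_1 \cap V_2 = \emptyset$).
3. Define the CFG $G$ that generates $L_1 \cup L_2$ as follows, where $S$ is a new start symbol: $G = (V_1 \cup V_2 \cup \{S\}, T_1 \cup T_2, P_1 \cup P_2 \cup \{S \to S_1 \mid S_2\}, S)$.
4. A derivation starts with either $S \Rightarrow S_1$ or $S \Rightarrow S_2$.
5. Subsequent steps use productions entirely from $G_1$ or entirely from $G_2$.
6. Each word generated thus is either a word in $L_1$ or a word in $L_2$. Conversely, every word in $L_1$ or in $L_2$ is generated by $G$: start with the corresponding first step, then follow the word's derivation in $G_1$ or $G_2$.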
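Example
Let $L_1$ be PALINDROME, defined by:
$S \to aSa \mid bSb \mid a \mid b \mid \Lambda$
Let $L_2$ be $\{a^n b^n \mid n \ge 0\}$, defined by:
$S \to aSb \mid \Lambda$
Then the union language is defined by:
$S \to S_1 \mid S_2$
$S_1 \to aS_1a \mid bS_1b \mid a \mid b \mid \Lambda$
$S_2 \to aS_2b \mid \Lambda$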
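The union construction can be sketched in a few lines of Python. This is only an illustration under assumptions made here, not something from the original notes: productions are stored in a dictionary from nonterminal to a list of right-hand sides, the empty tuple stands for the empty word, and the helper names `subscript` and `union_grammar` are hypothetical.

```python
def subscript(productions, tag):
    """Rename every nonterminal X to X+tag so two grammars share no nonterminals."""
    nonterminals = set(productions)

    def rename(symbol):
        return symbol + tag if symbol in nonterminals else symbol

    return {
        head + tag: [tuple(rename(sym) for sym in body) for body in bodies]
        for head, bodies in productions.items()
    }


def union_grammar(p1, start1, p2, start2, new_start="S"):
    """Build productions for L1 union L2: subscript both grammars, add S -> S1 | S2."""
    g1 = subscript(p1, "1")
    g2 = subscript(p2, "2")
    combined = {**g1, **g2}
    # The new start symbol must not collide with any renamed nonterminal.
    assert new_start not in combined
    combined[new_start] = [(start1 + "1",), (start2 + "2",)]
    return combined


# The two example grammars; () is the empty word.
PALINDROME = {"S": [("a", "S", "a"), ("b", "S", "b"), ("a",), ("b",), ()]}
ANBN = {"S": [("a", "S", "b"), ()]}

UNION = union_grammar(PALINDROME, "S", ANBN, "S")
for head, bodies in UNION.items():
    print(head, "->", " | ".join("".join(body) or "Λ" for body in bodies))
```

The renaming step is what makes step 5 of the proof work: once the nonterminal sets are disjoint, a derivation that begins in one grammar can never pick up a production of the other.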
Theorem: CFLs are closed under concatenation
If $L_1$ and $L_2$ are CFLs, then $L_1 L_2$ is a CFL.
Proof
1. Let $L_1$ and $L_2$ be generated by the CFGs $G_1 = (V_1, T_1, P_1, S_1)$ and $G_2 = (V_2, T_2, P_2, S_2)$, respectively.
2. Without loss of generality, subscript each nonterminal of $G_1$ with a 1, and each nonterminal of $G_2$ with a 2 (so that $V_1 \cap V_2 = \emptyset$).
3. Define the CFG $G$ that generates $L_1 L_2$ as follows: $G = (V_1 \cup V_2 \cup \{S\}, T_1 \cup T_2, P_1 \cup P_2 \cup \{S \to S_1 S_2\}, S)$.
4. Each word generated thus is a word in $L_1$ followed by a word in $L_2$.
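Example
Let $L_1$ be PALINDROME, defined by:
$S \to aSa \mid bSb \mid a \mid b \mid \Lambda$
Let $L_2$ be $\{a^n b^n \mid n \ge 0\}$, defined by:
$S \to aSb \mid \Lambda$
Then the concatenation language is defined by:
$S \to S_1 S_2$
$S_1 \to aS_1a \mid bS_1b \mid a \mid b \mid \Lambda$
$S_2 \to aS_2b \mid \Lambda$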
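One way to sanity-check the concatenation grammar is to enumerate its short words and compare them with a palindrome followed by a word of the form a^n b^n. The sketch below is an illustration only: the dictionary encoding, the empty tuple for the empty word, and the helper `words_up_to` are assumptions made here. The search expands the leftmost nonterminal and prunes any sentential form that already has too many terminals. That is enough to terminate for this grammar, but it is not a general-purpose CFG enumerator.

```python
from collections import deque

# Productions of the concatenation grammar from the example; () is the empty word.
CONCAT = {
    "S": [("S1", "S2")],
    "S1": [("a", "S1", "a"), ("b", "S1", "b"), ("a",), ("b",), ()],
    "S2": [("a", "S2", "b"), ()],
}


def words_up_to(productions, start, max_len):
    """Collect every terminal word of length <= max_len via leftmost derivations."""
    seen = {(start,)}
    words = set()
    queue = deque([(start,)])
    while queue:
        form = queue.popleft()
        # Position of the leftmost nonterminal, or None if the form is a word.
        idx = next((i for i, sym in enumerate(form) if sym in productions), None)
        if idx is None:
            words.add("".join(form))
            continue
        for body in productions[form[idx]]:
            new_form = form[:idx] + body + form[idx + 1:]
            # Terminals never disappear, so forms with too many can be dropped.
            terminal_count = sum(1 for sym in new_form if sym not in productions)
            if terminal_count <= max_len and new_form not in seen:
                seen.add(new_form)
                queue.append(new_form)
    return words


short_words = words_up_to(CONCAT, "S", 3)
print(sorted(short_words, key=lambda w: (len(w), w)))
```

Every word found this way splits into a palindrome followed by a word of the form a^n b^n, for example `aab` = `a` followed by `ab`.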
Theorem: CFLs are closed under Kleene star
If $L_1$ is a CFL, then $L_1^*$ is a CFL.
Proof
1. Let $L_1$ be generated by the CFG $G_1 = (V_1, T_1, P_1, S_1)$.
2. Without loss of generality, subscript each nonterminal of $G_1$ with a 1.
3. Define the CFG $G$ that generates $L_1^*$ as follows: $G = (V_1 \cup \{S\}, T_1, P_1 \cup \{S \to S_1 S \mid \Lambda\}, S)$.
4. Each word generated is either $\Lambda$ or some sequence of words in $L_1$.
5. Every word in $L_1^*$ (i.e., some sequence of 0 or more words in $L_1$) can be generated by $G$.
Example
Let $L_1$ be $\{a^n b^n \mid n \ge 0\}$, defined by:
$S \to aSb \mid \Lambda$
Then $L_1^*$ is generated by:
$S \to S_1 S \mid \Lambda$
$S_1 \to aS_1b \mid \Lambda$
None of these example grammars is necessarily the most compact CFG for the language it generates.
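To see how the star grammar strings together words of L1, the following sketch replays one leftmost derivation of the word abaabb (that is, ab followed by aabb). It is an illustration only: the list encoding of sentential forms, the helper `apply_production`, and the choice of word are assumptions made here.

```python
# Star grammar from the example; [] is the empty word.
STAR = {
    "S": [["S1", "S"], []],
    "S1": [["a", "S1", "b"], []],
}


def apply_production(form, head, body):
    """Rewrite the leftmost occurrence of nonterminal `head` in `form` with `body`."""
    assert body in STAR[head], "not a production of the star grammar"
    i = form.index(head)
    return form[:i] + body + form[i + 1:]


# Each step is (nonterminal to rewrite, right-hand side used).
STEPS = [
    ("S", ["S1", "S"]),        # open a slot for the first word of L1
    ("S1", ["a", "S1", "b"]),
    ("S1", []),                # first word finished: ab
    ("S", ["S1", "S"]),        # open a slot for the second word of L1
    ("S1", ["a", "S1", "b"]),
    ("S1", ["a", "S1", "b"]),
    ("S1", []),                # second word finished: aabb
    ("S", []),                 # stop: no more words
]

form = ["S"]
trace = ["S"]
for head, body in STEPS:
    form = apply_production(form, head, body)
    trace.append("".join(form) or "Λ")

derived_word = "".join(form)
print(" => ".join(trace))
```

Each use of S -> S1 S adds one more word of L1, and the final S -> Λ ends the sequence, which mirrors steps 4 and 5 of the proof.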
Theorem: CFLs are not closed under complement
If $L_1$ is a CFL, then $\overline{L_1}$ may not be a CFL.
Proof
CFLs are closed under union. If they were also closed under complement, then they would be closed under intersection, which is false. More formally,
1. Assume the complement of every CFL is a CFL.
2. Let $L_1$ and $L_2$ be two CFLs.
3. Since CFLs are closed under union, and we are assuming they are closed under complement, $L_1 \cap L_2 = \overline{\overline{L_1} \cup \overline{L_2}}$ is a CFL.
4. However, we know there are CFLs whose intersection is not a CFL.
5. Therefore, our assumption that CFLs are closed under complement is false.
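The De Morgan step in this proof can be checked by brute force on a bounded universe of words. The sketch below is an illustration only. It uses a standard pair of CFLs that is assumed here rather than taken from these notes: A = {a^n b^n a^m} and B = {a^m b^n a^n} with n, m >= 1, whose intersection is {a^n b^n a^n}. The names `universe`, `blocks`, and `complement` are hypothetical helpers.

```python
import re
from itertools import product


def universe(max_len, alphabet="ab"):
    """All words over the alphabet with length <= max_len."""
    return {"".join(w) for n in range(max_len + 1) for w in product(alphabet, repeat=n)}


def blocks(word):
    """Return (p, q, r) if word = a^p b^q a^r with p, q, r >= 1, else None."""
    m = re.fullmatch(r"(a+)(b+)(a+)", word)
    return tuple(len(g) for g in m.groups()) if m else None


U = universe(7)
A = {w for w in U if blocks(w) and blocks(w)[0] == blocks(w)[1]}  # a^n b^n a^m
B = {w for w in U if blocks(w) and blocks(w)[1] == blocks(w)[2]}  # a^m b^n a^n


def complement(language):
    """Complement relative to the bounded universe U."""
    return U - language


intersection = A & B
via_de_morgan = complement(complement(A) | complement(B))
print(sorted(intersection, key=len))
```

Within the bound, both routes give the same set, and its members all have the shape a^n b^n a^n. A finite check like this illustrates the identity but proves nothing about context-freeness.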
Example
This does not mean that the complement of a CFL is never a CFL.
Let $L_1 = \{a^n b^n a^n \mid n \ge 1\}$, which is not a CFL. $\overline{L_1}$ is a CFL. We show this by constructing it as the union of 5 CFLs (throughout, $p, q, r, n \ge 1$):
$M_{pq} = (a^+)(a^n b^n)(a^+) = \{a^p b^q a^r \mid p > q\}$
$M_{qp} = (a^n b^n)(b^+)(a^+) = \{a^p b^q a^r \mid p < q\}$
$M_{qr} = (a^+)(b^+)(b^n a^n) = \{a^p b^q a^r \mid q > r\}$
$M_{rq} = (a^+)(b^n a^n)(a^+) = \{a^p b^q a^r \mid q < r\}$
$M = \overline{a^+ b^+ a^+}$ = all words not of the form $a^p b^q a^r$.
Let $L = M \cup M_{pq} \cup M_{qp} \cup M_{qr} \cup M_{rq}$.
Since $M \subseteq L$, $\overline{L}$ contains only words of the form $a^p b^q a^r$.
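$\overline{L}$ cannot contain words of the form $a^p b^q a^r$ where $p < q$.
$\overline{L}$ cannot contain words of the form $a^p b^q a^r$ where $p > q$.
Therefore $\overline{L}$ only contains words of the form $a^p b^q a^r$ where $p = q$.
$\overline{L}$ cannot contain words of the form $a^p b^q a^r$ where $q < r$.
$\overline{L}$ cannot contain words of the form $a^p b^q a^r$ where $q > r$.
Therefore $\overline{L}$ only contains words of the form $a^p b^q a^r$ where $q = r$.
Since $p = q$ and $q = r$, $\overline{L}$ contains exactly the words of the form $a^n b^n a^n$. That is, $\overline{L} = L_1$, which is not context-free, while $L = \overline{L_1}$ is a union of CFLs and hence a CFL.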
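The claim that the five languages together make up the complement of {a^n b^n a^n} can be checked mechanically on all words up to a fixed length. The sketch below is an illustration only. It assumes n >= 1 everywhere (so each a+ and b+ factor is nonempty), it builds each language directly from its concatenation form, and the bound `MAX_LEN` and the names `M_rq` and `bounded` are choices made here.

```python
import re
from itertools import product

MAX_LEN = 8
# Bounded universe: every word over {a, b} of length <= MAX_LEN.
U = {"".join(w) for n in range(MAX_LEN + 1) for w in product("ab", repeat=n)}
R = range(1, MAX_LEN + 1)


def bounded(words):
    """Keep only the words that fit in the bounded universe."""
    return {w for w in words if len(w) <= MAX_LEN}


# Each language is built from its concatenation form, not from the p/q/r description.
M_pq = bounded("a" * i + "a" * n + "b" * n + "a" * k for i in R for n in R for k in R)
M_qp = bounded("a" * n + "b" * n + "b" * i + "a" * k for i in R for n in R for k in R)
M_qr = bounded("a" * i + "b" * k + "b" * n + "a" * n for i in R for n in R for k in R)
M_rq = bounded("a" * i + "b" * n + "a" * n + "a" * k for i in R for n in R for k in R)
M = {w for w in U if not re.fullmatch(r"a+b+a+", w)}

L = M | M_pq | M_qp | M_qr | M_rq
L1 = bounded("a" * n + "b" * n + "a" * n for n in R)

print(sorted(U - L, key=len))
```

Within the bound, the only words missing from the union are those of the form a^n b^n a^n, which is the content of the argument above.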
Theorem: The intersection of a CFL and an RL is a CFL.
If $L_1$ is a CFL and $L_2$ is regular, then $L_1 \cap L_2$ is a CFL.
Proof
1. We construct a PDA $I$ that accepts the intersection, based on a PDA $A$ for $L_1$ and an FA $F$ for $L_2$.
2. Convert $A$, if necessary, so that all input is read before accepting.
3. Construct a set $Y$ of all $A$'s states $y_1, y_2, \ldots$, and a set $X$ of all $F$'s states $x_1, x_2, \ldots$.
4. Construct the set of states of $I$: $\{(y, x) \mid y \in Y, x \in X\}$.
5. The start state of $I$ is $(y_0, x_0)$, where $y_0$ is $A$'s start state and $x_0$ is $F$'s initial state.
6. Regarding the next-state function, the $x$ component changes only when the PDA is in a READ state:
   If in $(y_i, x_j)$ and $y_i$ is not a READ state, its successor is $(y_k, x_j)$, where $y_k$ is the appropriate successor of $y_i$.
   If in $(y_i, x_j)$ and $y_i$ is a READ state, reading $a$, its successor is $(y_k, x_l)$, where $y_k$ is the appropriate successor of $y_i$ on an $a$, and $\delta(x_j, a) = x_l$.
7. $I$'s ACCEPT states are those where the $y$ component is ACCEPT and the $x$ component is final. If the $y$ component is ACCEPT and the $x$ component is not final, the state in $I$ is REJECT (or omitted, implying a crash).
Example
Let $L_1$ be the CFL EQUAL of words with an equal number of a's and b's. Draw its PDA.
Let $L_2 = (a + b)^* a$. Draw its FA.
Perform the construction of the intersection PDA.
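The product construction can also be tried out in code. The sketch below is an illustration only and simplifies the model: instead of the flowchart PDA with separate READ, PUSH, and POP states, each transition is a tuple `(state, read, pop, next_state, push)`, where `read` is `None` for a move that consumes no input (the non-READ case). The particular PDA for EQUAL, the bottom marker `Z`, the state names, and the reading of L2 as the words ending in a are all assumptions made here.

```python
# PDA for EQUAL: the stack holds the surplus of a's (as A) or b's (as B) above Z.
PDA_TRANSITIONS = [
    ("q", "a", "Z", "q", "AZ"),
    ("q", "a", "A", "q", "AA"),
    ("q", "a", "B", "q", ""),
    ("q", "b", "Z", "q", "BZ"),
    ("q", "b", "B", "q", "BB"),
    ("q", "b", "A", "q", ""),
    ("q", None, "Z", "ACCEPT", "Z"),  # no surplus left: move to ACCEPT
]

# FA for words ending in a: x1 is the only final state.
FA_STATES = ["x0", "x1"]
FA_DELTA = {("x0", "a"): "x1", ("x0", "b"): "x0", ("x1", "a"): "x1", ("x1", "b"): "x0"}
FA_FINAL = {"x1"}


def product_pda(pda_transitions, fa_states, fa_delta):
    """Pair every PDA move with every FA state; the FA moves only when input is read."""
    result = []
    for (y, read, pop, y_next, push) in pda_transitions:
        for x in fa_states:
            x_next = x if read is None else fa_delta[(x, read)]
            result.append(((y, x), read, pop, (y_next, x_next), push))
    return result


def accepts(transitions, start, accept_states, word, bottom="Z"):
    """Search all configurations (state, input position, stack with top at index 0)."""
    frontier = [(start, 0, bottom)]
    seen = set(frontier)
    while frontier:
        state, pos, stack = frontier.pop()
        if state in accept_states and pos == len(word):
            return True
        if not stack:
            continue
        for (s, read, pop, s_next, push) in transitions:
            if s != state or pop != stack[0]:
                continue
            if read is None:
                nxt = (s_next, pos, push + stack[1:])
            elif pos < len(word) and word[pos] == read:
                nxt = (s_next, pos + 1, push + stack[1:])
            else:
                continue
            if nxt not in seen:
                seen.add(nxt)
                frontier.append(nxt)
    return False


PRODUCT = product_pda(PDA_TRANSITIONS, FA_STATES, FA_DELTA)
PRODUCT_ACCEPT = {("ACCEPT", x) for x in FA_FINAL}


def in_intersection(word):
    return accepts(PRODUCT, ("q", "x0"), PRODUCT_ACCEPT, word)


for w in ["ba", "ab", "abba", "aba"]:
    print(w, in_intersection(w))
```

The word ab is in EQUAL but is rejected because the FA component ends in the non-final state x0. This corresponds to the case in step 7 where the y component is ACCEPT but the x component is not final.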