Formal Languages and Automata: Simplification of Context-Free Grammars and Normal Forms
Department of Computer Science and Information Engineering National Taipei University of Technology Taipei, TAIWAN
Objectives
Study several transformations and substitutions. Investigate normal forms for context-free grammars (cfgs): Chomsky normal form and Greibach normal form. Discuss useless productions, $\lambda$-productions, and unit-productions.
Contents
Methods for Transforming Grammars
Two Important Normal Forms
A Membership Algorithm for cfgs
Example 1
Consider $G = (\{A, B\}, \{a, b, c\}, A, P)$ with productions $A \to a \mid aaA \mid abBc$, $B \to abbA \mid b$. What is $\hat{G}$ as described in the above theorem (the substitution rule)? Note: 1. The substitution rule discussed here requires that $A$ and $B$ be distinct. What happens if $A = B$? 2. Consider the productions associated with $B$ after the substitution.
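The substitution asked for in Example 1 can be carried out mechanically. The following is a minimal sketch, not the slides' own method: it assumes single-character symbols with uppercase letters as variables, and it replaces only the first occurrence of the substituted variable in each right-hand side. The function name `substitute` and the dictionary encoding are assumptions made for illustration.

```python
def substitute(productions, a, b):
    """Replace variable b in the right-hand sides of a's productions
    by every right-hand side of b (first occurrence only)."""
    if a == b:
        raise ValueError("the substitution rule assumes distinct variables")
    result = {var: list(rhss) for var, rhss in productions.items()}
    result[a] = []
    for rhs in productions[a]:
        if b in rhs:
            # A -> x1 B x2 becomes A -> x1 y x2 for each B -> y.
            result[a].extend(rhs.replace(b, y, 1) for y in productions[b])
        else:
            result[a].append(rhs)
    return result


# Grammar of Example 1; each string is one right-hand side.
example_1 = {"A": ["a", "aaA", "abBc"], "B": ["abbA", "b"]}
g_hat = substitute(example_1, "A", "B")
print(g_hat["A"])  # ['a', 'aaA', 'ababbAc', 'abbc']
```

The productions of $B$ are left unchanged by this sketch, which is the situation the second note asks the reader to examine.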
Example 2
Consider $G = (\{A, B, S\}, \{a, b\}, S, P)$ with productions $S \to A$, $A \to aA \mid \lambda$, $B \to bA$. The variable $B$ is useless, and $B \to bA$ is a useless production. Note: There are two reasons why a variable can be useless: 1. it cannot be reached from the start symbol; 2. it cannot derive a terminal string.
Example 3
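Eliminate useless variables and productions from $G = (\{A, B, C, S\}, \{a, b\}, S, P)$ with $P$ consisting of $S \to aS \mid A \mid C$, $A \to a$, $B \to aa$, $C \to aCb$. Solution: $C$ cannot derive a terminal string and $B$ cannot be reached from $S$, so removing both leaves $S \to aS \mid A$, $A \to a$.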
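The two reasons for uselessness suggest a two-pass procedure: first discard variables that cannot derive a terminal string, then discard variables that cannot be reached from the start symbol. The sketch below applies this to the grammar of Example 3. It assumes single-character symbols with uppercase letters as variables and the empty string standing for $\lambda$; the name `remove_useless` is hypothetical.

```python
def remove_useless(productions, start):
    def usable(rhs, good):
        # True if every variable in rhs is already known to be "good".
        return all(not s.isupper() or s in good for s in rhs)

    # Pass 1: variables that can derive a terminal string.
    generating = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in productions.items():
            if var not in generating and any(usable(r, generating) for r in rhss):
                generating.add(var)
                changed = True
    kept = {
        var: [r for r in rhss if usable(r, generating)]
        for var, rhss in productions.items()
        if var in generating
    }

    # Pass 2: variables reachable from the start symbol.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in kept:
            continue
        reachable.add(var)
        stack.extend(s for rhs in kept[var] for s in rhs if s.isupper())
    return {var: rhss for var, rhss in kept.items() if var in reachable}


example_3 = {"S": ["aS", "A", "C"], "A": ["a"], "B": ["aa"], "C": ["aCb"]}
print(remove_useless(example_3, "S"))  # {'S': ['aS', 'A'], 'A': ['a']}
```

The order of the passes matters in this sketch: removing non-generating variables first can make further variables unreachable, which the second pass then catches.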
Rule 1
Theorem: Let $G = (V, T, S, P)$ be a context-free grammar. Then there exists an equivalent grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ that does not contain any useless variables or productions.
Removing $\lambda$-Productions
Definition: ($\lambda$-production) Any production of a cfg of the form $A \to \lambda$ is called a $\lambda$-production. Definition: (Nullable Variable) Any variable $A$ for which the derivation $A \Rightarrow^* \lambda$ is possible is called nullable.
Example 4
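Consider the grammar $S \to aS_1b$, $S_1 \to aS_1b \mid \lambda$, which generates the $\lambda$-free language $\{a^n b^n : n \ge 1\}$. The $\lambda$-production $S_1 \to \lambda$ can be removed after adding the productions $S \to ab$, $S_1 \to ab$, obtained by substituting $\lambda$ for $S_1$ wherever it occurs on a right-hand side.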
Rule 2
Theorem: Let $G = (V, T, S, P)$ be a context-free grammar with $\lambda \notin L(G)$. Then there exists an equivalent grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ that does not contain any $\lambda$-productions.
Example 5
Find a cfg without $\lambda$-productions equivalent to the grammar defined by $S \to ABaC$, $A \to BC$, $B \to b \mid \lambda$, $C \to D \mid \lambda$, $D \to d$.
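One way to approach Example 5 is to compute the nullable variables first and then, for every production, emit each variant in which some nullable symbols are dropped, skipping the empty right-hand side. The sketch below follows that idea under stated assumptions: single-character symbols, uppercase letters as variables, and the empty string standing for $\lambda$. The function names are hypothetical.

```python
from itertools import product


def nullable_variables(productions):
    """Variables A with A =>* lambda, found by fixed-point iteration."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for var, rhss in productions.items():
            if var not in nullable and any(
                all(s in nullable for s in rhs) for rhs in rhss
            ):
                nullable.add(var)
                changed = True
    return nullable


def remove_lambda_productions(productions):
    nullable = nullable_variables(productions)
    result = {}
    for var, rhss in productions.items():
        new_rhss = []
        for rhs in rhss:
            # Each nullable symbol is either kept or replaced by lambda.
            choices = [(s, "") if s in nullable else (s,) for s in rhs]
            for combo in product(*choices):
                candidate = "".join(combo)
                if candidate and candidate not in new_rhss:
                    new_rhss.append(candidate)
        result[var] = new_rhss
    return result


example_5 = {"S": ["ABaC"], "A": ["BC"], "B": ["b", ""], "C": ["D", ""], "D": ["d"]}
print(sorted(nullable_variables(example_5)))  # ['A', 'B', 'C']
for var, rhss in remove_lambda_productions(example_5).items():
    print(var, "->", " | ".join(rhss))
```

For this grammar the sketch yields eight right-hand sides for $S$, three for $A$, and one each for $B$, $C$, and $D$. The result still contains unit-productions such as $C \to D$, which are handled separately.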
Removing Unit-Productions
Definition: (Unit-Productions) Any production of a cfg of the form $A \to B$, where $A, B \in V$, is called a unit-production.
Rule 3
Theorem: Let $G = (V, T, S, P)$ be a context-free grammar without $\lambda$-productions. Then there exists an equivalent cfg $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ that does not contain any unit-productions.
Example 6
Remove all unit-productions from $S \to Aa \mid B$, $B \to A \mid bb$, $A \to a \mid bc \mid B$.
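A common way to carry out the removal in Example 6 is to find, for each variable $A$, all variables $B$ with $A \Rightarrow^* B$ using unit-productions only, and then give $A$ every non-unit right-hand side of those variables. The sketch below assumes single-character symbols and a grammar without $\lambda$-productions; `remove_unit_productions` is a hypothetical name.

```python
def remove_unit_productions(productions):
    variables = set(productions)

    def is_unit(rhs):
        # With single-character symbols, a unit right side is one variable.
        return rhs in variables

    result = {}
    for var in productions:
        # Variables reachable from var through unit-productions only.
        closure, stack = {var}, [var]
        while stack:
            current = stack.pop()
            for rhs in productions[current]:
                if is_unit(rhs) and rhs not in closure:
                    closure.add(rhs)
                    stack.append(rhs)
        new_rhss = []
        for other in sorted(closure):
            for rhs in productions[other]:
                if not is_unit(rhs) and rhs not in new_rhss:
                    new_rhss.append(rhs)
        result[var] = new_rhss
    return result


example_6 = {"S": ["Aa", "B"], "B": ["A", "bb"], "A": ["a", "bc", "B"]}
for var, rhss in remove_unit_productions(example_6).items():
    print(var, "->", " | ".join(rhss))
```

In the output, $S$ receives $a$, $bc$, $bb$, and $Aa$, while $A$ and $B$ each receive $a$, $bc$, and $bb$. Note that $B$ is no longer reachable from $S$ afterwards, so a useless-symbol pass could remove it.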
Theorem
Let $L$ be a context-free language that does not contain $\lambda$. Then there exists a cfg that generates $L$ and that does not have any useless productions, $\lambda$-productions, or unit-productions.
Example 7
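The grammar $S \to AS \mid a$, $A \to SA \mid b$ is in Chomsky normal form. The grammar $S \to AS \mid AAS$, $A \to SA \mid aa$ is not in Chomsky normal form, because the right-hand sides $AAS$ and $aa$ are neither a pair of variables nor a single terminal.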
Theorem
Any cfg $G = (V, T, S, P)$ with $\lambda \notin L(G)$ has an equivalent grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ in Chomsky normal form.
Example 8
Convert the grammar with productions $S \to abA$, $A \to aab$, $B \to Ac$ to Chomsky normal form.
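The conversion in Example 8 can be sketched in two steps: replace each terminal in a long right-hand side by a new variable that derives it, then break right-hand sides of length greater than two into a chain of new variables. The code below is an illustration under stated assumptions: right-hand sides are tuples of symbols, the grammar has no $\lambda$-productions or unit-productions, and the generated names `T_a`, `D_1`, and so on are hypothetical and assumed not to clash with existing variables.

```python
def to_chomsky_normal_form(productions, terminals):
    new_productions = {var: [] for var in productions}
    fresh_count = 0
    for var, rhss in productions.items():
        for rhs in rhss:
            if len(rhs) == 0 or (len(rhs) == 1 and rhs[0] not in terminals):
                raise ValueError("lambda- and unit-productions must be removed first")
            if len(rhs) == 1:
                # A -> a is already in the required form.
                new_productions[var].append(rhs)
                continue
            # Step 1: replace each terminal a by a variable T_a with T_a -> a.
            symbols = []
            for s in rhs:
                if s in terminals:
                    new_productions.setdefault(f"T_{s}", [(s,)])
                    symbols.append(f"T_{s}")
                else:
                    symbols.append(s)
            # Step 2: split right sides longer than two into a chain.
            head = var
            while len(symbols) > 2:
                fresh_count += 1
                fresh = f"D_{fresh_count}"
                new_productions.setdefault(head, []).append((symbols[0], fresh))
                head, symbols = fresh, symbols[1:]
            new_productions.setdefault(head, []).append(tuple(symbols))
    return new_productions


example_8 = {"S": [("a", "b", "A")], "A": [("a", "a", "b")], "B": [("A", "c")]}
cnf = to_chomsky_normal_form(example_8, {"a", "b", "c"})
for var, rhss in cnf.items():
    print(var, "->", " | ".join(" ".join(rhs) for rhs in rhss))
```

For Example 8 this produces $S \to T_a D_1$, $D_1 \to T_b A$, $A \to T_a D_2$, $D_2 \to T_a T_b$, $B \to A T_c$, together with $T_a \to a$, $T_b \to b$, $T_c \to c$.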
Example 9
The grammar $S \to AB$, $A \to aA \mid bB \mid b$, $B \to b$ is not in Greibach normal form.
Example 10
Convert the grammar $S \to abSb \mid aa$ to Greibach normal form.
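In Example 10 every right-hand side already begins with a terminal, so only the remaining terminals need to be replaced by new variables. The sketch below covers this special case only and is not a general Greibach conversion. Right-hand sides are tuples of symbols, and the function name and the generated variable names `T_a`, `T_b` are hypothetical.

```python
def replace_inner_terminals(productions, terminals):
    """Handle the easy case: every right side starts with a terminal."""
    result = {var: [] for var in productions}
    for var, rhss in productions.items():
        for rhs in rhss:
            if not rhs or rhs[0] not in terminals:
                raise ValueError("this sketch needs a leading terminal")
            tail = []
            for s in rhs[1:]:
                if s in terminals:
                    # Hypothetical fresh variable T_s with T_s -> s.
                    result.setdefault(f"T_{s}", [(s,)])
                    tail.append(f"T_{s}")
                else:
                    tail.append(s)
            result[var].append((rhs[0], *tail))
    return result


example_10 = {"S": [("a", "b", "S", "b"), ("a", "a")]}
gnf = replace_inner_terminals(example_10, {"a", "b"})
for var, rhss in gnf.items():
    print(var, "->", " | ".join(" ".join(rhs) for rhs in rhss))
```

The result is $S \to a T_b S T_b \mid a T_a$, $T_a \to a$, $T_b \to b$, in which every right-hand side is a terminal followed by zero or more variables.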
Theorem
Any cfg $G = (V, T, S, P)$ with $\lambda \notin L(G)$ has an equivalent grammar $\hat{G} = (\hat{V}, \hat{T}, S, \hat{P})$ in Greibach normal form.
CYK Algorithm
The CYK algorithm decides whether a given string belongs to the language generated by a given cfg. It follows the dynamic programming design paradigm and requires that the given cfg $G = (V, T, S, P)$ be in Chomsky normal form. Given a string $w = a_1 a_2 \cdots a_n$, we define substrings $w_{ij} = a_i \cdots a_j$ and subsets $V_{ij} = \{A \in V : A \Rightarrow^* w_{ij}\} \subseteq V$. Clearly, $w \in L(G) \iff S \in V_{1n}$.
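Computing $V_{ij}$
Consider the two forms of production allowed in Chomsky normal form. For each $i$, $A \in V_{ii} \iff (A \to a_i) \in P$. For $j > i$, $A \Rightarrow^* w_{ij}$ if and only if there is a production $A \to BC$ with $B \Rightarrow^* w_{ik}$ and $C \Rightarrow^* w_{k+1,j}$ for some $i \le k < j$. In other words,
$$V_{ij} = \bigcup_{k \in \{i, i+1, \ldots, j-1\}} \{A : A \to BC, \text{ with } B \in V_{ik},\ C \in V_{k+1,j}\}.$$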
Bottom-up Approach
Use a bottom-up approach to compute all $V_{ij}$ with the equations discussed: compute $V_{11}, V_{22}, \ldots, V_{nn}$; then compute $V_{12}, V_{23}, \ldots, V_{n-1,n}$; then compute $V_{13}, V_{24}, \ldots, V_{n-2,n}$; and so on, until $V_{1n}$ is reached.
Algorithm - Pseudocode
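MEMBERSHIP($G$, $w$)
    $n \leftarrow |w|$
    for $i \leftarrow 1$ to $n$
        do $V_{ii} \leftarrow \{A \in V : (A \to a_i) \in P\}$
    for $l \leftarrow 2$ to $n$
        do for $i \leftarrow 1$ to $n - l + 1$
            do $j \leftarrow i + l - 1$
               $V_{ij} \leftarrow \emptyset$
               for $k \leftarrow i$ to $j - 1$
                   do for each production $A \to BC$ with $B \in V_{ik}$ and $C \in V_{k+1,j}$
                       do $V_{ij} \leftarrow V_{ij} \cup \{A\}$
    if $S \in V_{1n}$
        then return $w \in L(G)$
        else return $w \notin L(G)$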
Example 11
Determine whether the string $w = aabbb$ is in the language generated by the grammar $S \to AB$, $A \to BB \mid a$, $B \to AB \mid b$.
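The question in Example 11 can be checked by filling in the sets $V_{ij}$ bottom-up, shortest substrings first. The following is a minimal sketch of that computation, assuming the grammar is already in Chomsky normal form and is encoded as a dictionary from variables to lists of tuples. The names `cyk_table` and `is_member` are hypothetical.

```python
def cyk_table(productions, w):
    """Return a dict mapping (i, j) to V_ij, with 1-based inclusive indices."""
    n = len(w)
    table = {}
    # Substrings of length 1: A is in V_ii exactly when A -> a_i is a production.
    for i in range(1, n + 1):
        table[(i, i)] = {
            var for var, rhss in productions.items() if (w[i - 1],) in rhss
        }
    # Longer substrings, in order of increasing length.
    for length in range(2, n + 1):
        for i in range(1, n - length + 2):
            j = i + length - 1
            cell = set()
            for k in range(i, j):
                for var, rhss in productions.items():
                    for rhs in rhss:
                        if (
                            len(rhs) == 2
                            and rhs[0] in table[(i, k)]
                            and rhs[1] in table[(k + 1, j)]
                        ):
                            cell.add(var)
            table[(i, j)] = cell
    return table


def is_member(productions, start, w):
    if not w:
        return False  # a Chomsky normal form grammar cannot derive lambda
    return start in cyk_table(productions, w)[(1, len(w))]


example_11 = {
    "S": [("A", "B")],
    "A": [("B", "B"), ("a",)],
    "B": [("A", "B"), ("b",)],
}
table = cyk_table(example_11, "aabbb")
print(sorted(table[(1, 5)]))                 # ['B', 'S']
print(is_member(example_11, "S", "aabbb"))   # True
```

Since $S \in V_{15}$, the string $aabbb$ is in the language. The three nested loops over $l$, $i$, and $k$ give a running time cubic in $|w|$ for a fixed grammar.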