We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 21
Context Free Languages
Context Free Grammars
Parsing Arithmetic Expression Removing λ-productions Normal forms
January 4, 2024 Formal Language Theory 1
Context Free Grammar (CFG)
Definition: A CFG, G, is a PSG G=(N, T, P, S) with
productions of the form A β, A Є N, β Є (NUT)*. CFGs are used in defining the syntax of programming languages and in parsing arithmetic expressions. A language generated from CFG is called Context Free Language (CFL). Ex. a) S aB b) S aB|A B bA|b A aA|a|CBA Aa Bλ Cc
January 4, 2024 Formal Language Theory 2
CFG: cont’d
Let G be a CFG, then x Є L(G) iff S x in
zero or more steps over G. x Є L(G) can as well be obtained from a derivation tree or parse tree. The root of the tree is S and x is the collection of leaves from left to right. Left most derivation: employs the reduction of the left most non-terminal Right most derivation: employs the reduction of the right most non-terminal
January 4, 2024 Formal Language Theory 3
CFG: cont’d If a derivation of a string x has two different left most derivations, then the grammar is said to be ambiguous. Otherwise unambiguous. (i.e. a grammar is ambiguous if it can produce more than one parse tree for a particular sentence. Ex. 1. G1 = (N, T, P, S) with productions: S AB A aA|a B bB|b let x = aaabbb a) find a left most and right most derivations for x b) draw the parse tree for x
January 4, 2024 Formal Language Theory 4
CFG: cont’d
2. G2 = (N, T, P, S) with productions:
S SbS|ScS|a let x = abaca Є L(G2) a) find a left most and right most derivations for x b) draw the parse tree for x 3. Is G1 ambiguous? Is G2?
January 4, 2024 Formal Language Theory 5
Parsing Arithmetic Expression
Consider the following grammar:
ET|E+T|E–T T F | T * F | T/F F a | b | c | (E)
Draw parse trees for
a) a*b+c b) a+b*c c) (a+b)*c d) a-b-c
January 4, 2024 Formal Language Theory 6
Removing λ-productions
Let G be a CFG and A α, A Є N
1. If α = λ, then A α is a λ-production 2. If α ≠ λ for all productions, then G is λ-free 3. If A => λ in zero or more steps, then A is called nullable or λ- generating non-terminal Ex. Let G=(N, T, P, S) be a CFG with productions: S ABaC AB B b| λ C c| λ Find the non-terminals which are nullable. January 4, 2024 Formal Language Theory 7 Removing λ-productions: cont’d
Let G=(N, T, P, S) be a CFG with λ-productions.
Construct G’=(N, T, P’, S), a λ-free CFG as follows: 1. Put all non λ-productions of P in P’ 2. For all nullable non-terminals, put productions in P’ by removing one or more nullable non-terminals on the right side productions. Thus, L(G) \ {λ} = L(G’) Ex1: Construct G’ for the previous example. Ex2: Construct G’ for a grammar G with productions: S aS | AB Aλ Bλ Db
January 4, 2024 Formal Language Theory 8
Removing λ-productions: cont’d
Theorem: For any CFG G, there exists a
CFG G’ with λ-free productions such that L(G) \{λ} = L(G’). A production A α is called relevant/useful iff there exists a derivation of some xЄL(G) that uses the production. Otherwise, it’s called irrelevant/useless. Ex. S aSb | A | λ A aA | λ January 4, 2024 Formal Language Theory 9 Normal Forms
When the productions in a CFG G satisfy certain
restrictions, G is said to be in a normal form. We’ll see two normal forms: CNF and GNF
1. Chomsky Normal Form (CNF)
Let G=(N, T, P, S) be a CFG and A α be a production of G. 1. If α = B, B in N, then A α is called a Unit production 2. If |α|>1 and there exists a terminal substring of α, then A α is called a Secondary production 3. If α contains more than two non-terminals, then A α is called a Tertiary production
January 4, 2024 Formal Language Theory 10
Chomsky Normal Form (CNF)
Definition: Let G = (N, T, P, S) be a λ-free CFG, then
G is said to be in CNF if all its productions are of the form: A BC where A, B, C Є N OR A a, a Є T, A Є N Ex. G with productions S AB Aa Bb
January 4, 2024 Formal Language Theory 11
CNF: cont’d
Theorem: Any λ-free CFL can be generated by a
CFG in CNF. proof: Let G = (N, T, P, S) be a λ-free grammar such that L = L(G). Construct a grammar G which is in CNF. Steps: 1. Replace all Unit productions as follows: For any A Є N on the LHS of the Unit production, denote U(A) and non-Unit productions of A by N(A) For each A Є N and U(A) ≠ Ø replace U(A) by {A α | A=>B in one or more steps, and Bα Є N(B)}
January 4, 2024 Formal Language Theory 12
CNF: cont’d 2. Replace all Secondary productions as follows: For any a Є T, a substring of a Secondary production, replace a by Aa where Aa is a new non-terminal and Aaa. 3. Replace all Tertiary productions as follows: If A B1B2…Bm, m>2, then replace the production by: A B1B1’ B1’ B2B2’ B2’ B3B3’ … Bm-2’ Bm-1Bm, where B1’, …, Bm-2’ are all unique new non-terminals that do not appear in any other production.
January 4, 2024 Formal Language Theory 13
CNF: cont’d
Ex. Convert the following grammars to CNF
1. Let G be a CFG with productions: S A | ABA A aA | a| B B bB | b
2. Let G be a CFG with productions:
S aAD A aB | bAB Bb Dd
January 4, 2024 Formal Language Theory 14
Identification of non-CFLs
Pumping lemma for CFLs
(Reading assignment)
January 4, 2024 Formal Language Theory 15
Greibach Normal Form (GNF)
Let G=(N, T, P, S) be a λ-free CFG, then if all the
productions of G are of the form A aα, A Є N, a Є T, α Є N* then G is said to be in GNF
Theorem G1: If A α1Bα2 is a production in a CFG
G and B β1|β2|β3|…|βk are all productions with B on the LHS, then A α1B α2 can be replaced by A α1β1α2| α1β2α2| … | α1βkα2 without affecting L(G).
January 4, 2024 Formal Language Theory 16
GNF: cont’d
Theorem G2: If in a CFG there is a production A
Aα1 | Aα2 | … | Aαn | β1|β2|…|βm, such a production is called left recursive, and A β1|β2|β3|…|βm are the remaining productions with A on the LHS. Then an equivalent grammar can be constructed by introducing a new non-terminal, A’, and replacing all these productions by: A β1|β2|β3|…|βm| β1A’| β2A’|…| βmA’ A’ α1| α2|…| αn| α1A’| α2A’|…| αnA’
January 4, 2024 Formal Language Theory 17
GNF: cont’d Theorem GNF: Any λ-free CFG G can be converted into a grammar in GNF. Proof: Let G be a λ-free CFG. To convert G into GNF use the steps below: 1. Convert G into CNF, G’ 2. Rename the non-terminals in G’ as A1, A2, …, Am (m>=1) 3. Convert all the productions into Ai aα or Ai Ajα with j > i To convert to the form Ai Ajα with j > i, do the following: Substitute for Aj according to Theorem G1. If there exist left recursive productions with Ai on the LHS, then introduce a new non-terminal Ai’ and apply Theorem G2.
January 4, 2024 Formal Language Theory 18
GNF: cont’d
4. After the 3rd step, the productions will be
of the form i. Ai Ajα, j > i, αЄ(NUN’)* where N’ stands for the new non-terminals Ai introduced. ii. Ai aα, a Є T, αЄ(NUN’)* or
iii. Ai’ xα, xЄ(NUT), αЄ(NUN’)*
Replace (i) by using Theorem G1
(iii) by using Theorem G2
January 4, 2024 Formal Language Theory 19
GNF: cont’d
Ex. Convert to GNF
1. Let G be with productions S AB A BS | b B SA | a 2. Let G be with productions A 1 A 2A 2 | a A 2 A 1A 2 | b January 4, 2024 Formal Language Theory 20 Closure Properties of CFGs