Chapter3 CFG

It describes how CFG works

Uploaded by

wubalem

Chapter 3

Context Free Languages

Contents
– Context free languages
– Parsing and ambiguity
– Sentential forms
– Derivation tree or parse tree
– Leftmost and rightmost derivations
– Simplification of context free grammar
– Methods for transforming grammars

Hierarchy of languages (Revision)

Regular Languages → Finite State Machines, Regular Expressions
Context Free Languages → Context Free Grammar, Push-down Automata

[Figure: nested regions, listed from outermost to innermost: Non-Recursively Enumerable Languages, Recursively Enumerable Languages, Recursive Languages, Context-Free Languages, Regular Languages]

Context Free Languages (CFL)
• The pumping lemma showed there are languages that are not regular
– There are many classes “larger” than that of regular languages
– One of these classes is called the “Context Free” languages
• Described by Context-Free Grammars (CFG)
– Why named context-free?
– Property that we can substitute strings for variables regardless of context (implies context sensitive languages exist)
• CFGs are useful in many applications
– Describing syntax of programming languages
– Parsing
– Structure of documents, e.g. XML
• Analogy of the day:
– DFA : Regular Expression as Pushdown Automata : CFG
CFG
• A CFG is a notation used to specify the syntax of a language. Context-free grammars are used to design parsers.
• The lexical analyzer generates a string of tokens, which is given to the parser to construct a parse tree. Before the parse tree is constructed, the tokens are grouped so that each group is a valid construct of the language. Specifying these constructs requires a notation that is precise and easy to understand. This notation is the context-free grammar.

Cont…

Definition. Given a context-free grammar G = (Σ, NT, R, S), the language generated or derived from G is the set:

L(G) = { w ∈ Σ* : S ⇒* w }

Definition. A language L is context-free if there is a context-free grammar G = (Σ, NT, R, S) such that L is generated from G.
Informal Comments
• A context-free grammar is a notation for describing languages.
• It is more powerful than finite automata or RE’s, but still cannot
define all possible languages.
• Useful for nested structures, e.g., parentheses in programming
languages.
• Basic idea is to use “variables” to stand for sets of strings (i.e.,
languages).
• These variables are defined recursively, in terms of one another.
• Recursive rules (“productions”) involve only concatenation.
• Alternative rules for a variable allow union.
CFG Formalism
• Terminals:- symbols of the alphabet of the language being
defined.
• Variables (non-terminals ) = a finite set of other symbols, each
of which represents a language.
• Start symbol:- the variable whose language is the one being
defined.
• A production has the form variable -> string of variables and
terminals.
• Convention:
– A, B, C,… are variables.
– a, b, c,… are terminals.
– …, X, Y, Z are either terminals or variables.
– …, w, x, y, z are strings of terminals only.
– α, β, γ,… are strings of terminals and/or variables.
Context Free Grammar
Notation
Definition − A context-free grammar (CFG) consisting of a finite set of grammar rules is a quadruple (N, T, P, S) where
• N is a set of non-terminal symbols.
• T is a set of terminals where N ∩ T = ∅.
• P is a set of rules, P: N → (N ∪ T)*, i.e., the left-hand side of a production rule is a single non-terminal and does not have any right context or left context.
• S is the start symbol, S ∈ N.

Example
• The grammar ({A}, {a, b, c}, P, A), P: A → aA, A → abc.
• The grammar ({S}, {a, b}, P, S), P: S → aSa, S → bSb, S → ε
• The grammar ({S, F}, {0, 1}, P, S), P: S → 00S | 11F, F → 00F | ε
Examples
1. Write down a grammar for the language L = {a^n | n ≥ 1}
Solution
Let G = (V, Σ, P, S)
V = {S}
Σ = {a}
P = { S → aS | a }
2. Construct a CFG for the language L = {w c w^R | w ∈ {a, b}*}, where w^R is the reverse of w.
The grammar could be:
S → aSa (rule 1)
S → bSb (rule 2)
S → c (rule 3)

Cont…
Construct a CFG for the language L = {a^n b^{2n} | n ≥ 1}.
Solution:
The strings in this language are {abb, aabbbb, aaabbbbbb, ...}.
The grammar could be:
S → aSbb | abb
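A proposed grammar can be sanity-checked by listing the short strings it generates. The following is a minimal sketch, not part of the original slides: the function name `generate` and the string encoding of productions are illustrative assumptions, and the pruning only terminates if every production adds at least one terminal (true for S → aSbb | abb).

```python
from collections import deque

def generate(productions, start, max_len):
    """Return all terminal strings of length <= max_len derivable from start.

    productions maps a variable (one character) to a list of right-hand
    sides; every other character is treated as a terminal.
    Assumption: every right-hand side contains at least one terminal.
    """
    seen, results = {start}, set()
    queue = deque([start])
    while queue:
        form = queue.popleft()
        # Locate the leftmost variable in the sentential form.
        pos = next((i for i, s in enumerate(form) if s in productions), None)
        if pos is None:
            results.add(form)          # only terminals left: a generated word
            continue
        for rhs in productions[form[pos]]:
            new = form[:pos] + rhs + form[pos + 1:]
            # Terminals never disappear, so too many terminals is a dead end.
            terminals = sum(1 for s in new if s not in productions)
            if terminals <= max_len and new not in seen:
                seen.add(new)
                queue.append(new)
    return results

grammar = {"S": ["aSbb", "abb"]}
words = generate(grammar, "S", 9)
print(sorted(words, key=len))
```

Every listed word has the shape a^n b^{2n}, which agrees with the strings given in the solution.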

Sample CFG

1. E → I // Expression is an identifier
2. E → E+E // Add two expressions
3. E → E*E // Multiply two expressions
4. E → (E) // Add parentheses
5. I → L // Identifier is a Letter
6. I → ID // Identifier + Digit
7. I → IL // Identifier + Letter
8. D → 0|1|2|3|4|5|6|7|8|9 // Digits
9. L → a|b|c|…|z|A|B|…|Z // Letters

Note: Identifiers are regular; they could be described as (letter)(letter + digit)*
Derivations – Intuition
• We derive strings in the language of a CFG by starting with
the start symbol, and repeatedly replacing some variable A by
the right side of one of its productions.
– That is, the “productions for A” are those that have A on the
left side of the ->.
• We say αAβ ⇒ αγβ if A → γ is a production.
• Example: S → 01; S → 0S1.
• S ⇒ 0S1 ⇒ 00S11 ⇒ 000111.
Definition. v is one-step derivable from u, written u ⇒ v, if:
• u = x y z
• v = x w z
• y → w is in R
Definition. v is derivable from u, written u ⇒* v, if:
there is a chain of one-step derivations of the form:
u ⇒ u1 ⇒ u2 ⇒ … ⇒ v
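The two definitions translate almost directly into code. This is a small sketch under stated assumptions: rules are (left side, right side) string pairs, the names `one_step` and `derives` are illustrative, and the length-based pruning in `derives` assumes no rule shrinks a string (no ε-rules).

```python
def one_step(u, rules):
    """Return every v with u => v, i.e. u = x y z and v = x w z for a rule y -> w."""
    successors = set()
    for lhs, rhs in rules:
        start = u.find(lhs)
        while start != -1:
            successors.add(u[:start] + rhs + u[start + len(lhs):])
            start = u.find(lhs, start + 1)
    return successors

def derives(u, v, rules, max_steps):
    """Check u =>* v using at most max_steps one-step derivations.

    Assumption: no rule shortens a string, so forms longer than v are dropped.
    """
    frontier = {u}
    for _ in range(max_steps + 1):
        if v in frontier:
            return True
        frontier = {w for form in frontier for w in one_step(form, rules)
                    if len(w) <= len(v)}
    return False

rules = [("S", "01"), ("S", "0S1")]
print(one_step("0S1", rules))
print(derives("S", "000111", rules, 3))
```

With the example rules S → 01 and S → 0S1, the chain S ⇒ 0S1 ⇒ 00S11 ⇒ 000111 is found within three steps.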
Example of Derivation
Grammar 1: S → SS | (S)S | () | ε
S
⇒ SS
⇒ (S)SS
⇒ (S)S(S)S
⇒ (S)S(())S
⇒ ((S)S)S(())S
⇒ ((S)())S(())S
⇒ ((())())S(())S
⇒ ((())())(())S
⇒ ((())())(())

Grammar 2: E → E O E | (E) | id, O → + | - | * | /
E
⇒ E O E
⇒ (E) O E
⇒ (E O E) O E
⇒* ((E O E) O E) O E
⇒ ((id O E) O E) O E
⇒ ((id + E) O E) O E
⇒ ((id + id) O E) O E
⇒* ((id + id) * id) + id
Generation of Derivation Tree
• A derivation tree or parse tree is an ordered rooted tree that graphically represents how a string is derived from a context-free grammar.
Representation Technique
• Root vertex − Must be labeled by the start symbol.
• Interior vertex − Labeled by a non-terminal symbol.
• Leaves − Labeled by a terminal symbol or ε.
If S → x1x2 …… xn is a production rule in a CFG, then the parse tree / derivation tree has root S with children x1, x2, ……, xn in left-to-right order.

Cont…
There are two different approaches to draw a derivation tree −
Top-down Approach −
• Starts with the starting symbol S
• Goes down to tree leaves using productions
Bottom-up Approach −
• Starts from tree leaves
• Proceeds upward to the root which is the starting symbol S
Derivation or Yield of a Tree
The derivation or the yield of a parse tree is the final string obtained by concatenating the labels of the leaves of the tree from left to right, ignoring the nulls. However, if all the leaves are null, the yield is null.
Let a CFG (N, T, P, S) be
N = {S}, T = {a, b}, Starting symbol = S, P = S → SS | aSb | ε
One derivation from the above CFG is “abaabb”:
S ⇒ SS ⇒ aSbS ⇒ abS ⇒ abaSb ⇒ abaaSbb ⇒ abaabb
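The yield can be computed by a simple left-to-right walk over the leaves. The sketch below assumes a hypothetical tree encoding, (label, list of children), chosen only for illustration; the tree built here corresponds to the derivation of “abaabb” above.

```python
def tree_yield(node):
    """Concatenate leaf labels from left to right, skipping ε leaves."""
    label, children = node
    if not children:                       # a leaf
        return "" if label == "ε" else label
    return "".join(tree_yield(child) for child in children)

def leaf(symbol):
    return (symbol, [])

# S => SS; the left S yields "ab", the right S yields "aabb".
left = ("S", [leaf("a"), ("S", [leaf("ε")]), leaf("b")])
right = ("S", [leaf("a"),
               ("S", [leaf("a"), ("S", [leaf("ε")]), leaf("b")]),
               leaf("b")])
tree = ("S", [left, right])

print(tree_yield(tree))
```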
Cont…
Sentential Form and Partial Derivation Tree

Any string of variables and/or terminals derived from the start symbol is called a sentential form.
A partial derivation tree is a sub-tree of a derivation tree/parse tree such that, for each of its vertices, either all of the vertex's children are in the sub-tree or none of them are.
Example
If in any CFG the productions are −
S → AB, A → aaA | ε, B → Bb | ε
then, for example, the tree with root S and children A and B, where neither A nor B is expanded further, is a partial derivation tree.
Derivation Trees

S → A | AB
A → ε | a | Ab | AA
B → b | bc | Bc | bB
w = aabb
[Figure: three derivation trees for w = aabb]
Other derivation trees for this string? Infinitely many others possible.
Cont…
Leftmost and Rightmost Derivation of a String
• Leftmost derivation − A leftmost derivation is obtained by
applying production to the leftmost variable in each step.
• Rightmost derivation − A rightmost derivation is obtained by
applying production to the rightmost variable in each step.

Example
Let any set of production rules in a CFG be
X → X+X | X*X |X| a over an alphabet {a}.
The leftmost derivation for the string "a+a*a" may be −
X → X+X → a+X → a + X*X → a+a*X → a+a*a
[Figure: stepwise leftmost derivation of "a+a*a"]
The rightmost derivation for the above string "a+a*a" may be −
X → X*X → X*a → X+X*a → X+a*a → a+a*a
[Figure: stepwise rightmost derivation of "a+a*a"]
Example 2
S → AB
A → Aa | a
B → b
Let’s draw leftmost and rightmost derivations of the above grammar to get the string “aab”.
Leftmost: S ⇒ AB ⇒ AaB ⇒ aaB ⇒ aab
Rightmost: S ⇒ AB ⇒ Ab ⇒ Aab ⇒ aab
Leftmost Derivation
• Each step of the derivation is a replacement of the leftmost nonterminal in a sentential form.
E
⇒ E O E
⇒ (E) O E
⇒ (E O E) O E
⇒ (id O E) O E
⇒ (id + E) O E
⇒ (id + id) O E
⇒ (id + id) * E
⇒ (id + id) * id

Rightmost Derivation
• Each step of the derivation is a replacement of the rightmost nonterminal in a sentential form.
E
⇒ E O E
⇒ E O id
⇒ E * id
⇒ (E) * id
⇒ (E O E) * id
⇒ (E O id) * id
⇒ (E + id) * id
⇒ (id + id) * id
Sample Parse / derivation Tree
• Using a leftmost derivation generates the parse tree for a*(a+b1)
• Does using a rightmost derivation produce a different tree?
• The yield of the parse tree is the string that results when we concatenate the leaves from left to right (e.g., doing a leftmost depth first search).
– The yield is always a string that is derived from the root and is guaranteed to be a string in the language L.

E
├─ E ─ I ─ L ─ a
├─ *
└─ E
   ├─ (
   ├─ E
   │  ├─ E ─ I ─ L ─ a
   │  ├─ +
   │  └─ E ─ I
   │         ├─ I ─ L ─ b
   │         └─ D ─ 1
   └─ )
Example
G = ({A, B, C, S}, {a, b, c}, P, S)
P:
(1) S → ABC
(2) A → aA
(3) A → ε
(4) B → bB
(5) B → ε
(6) C → cC
(7) C → ε
or, in shorthand, A → aA | ε, B → bB | ε, C → cC | ε
Derivations:
S ⇒ ABC (1) ⇒ BC (3) ⇒ C (5) ⇒ ε (7)
S ⇒ ABC (1) ⇒ aABC (2) ⇒ aaABC (2) ⇒ aaBC (3) ⇒ aabBC (4) ⇒ aabC (5) ⇒ aabcC (6) ⇒ aabc (7)

Example CFG for {0^k 1^k | k ≥ 0}:
G = ({S}, {0, 1}, P, S) // Remember: G = (V, T, P, S)
P:
(1) S → 0S1
(2) S → ε
or just simply S → 0S1 | ε
Example derivations:
S ⇒ ε (2)
S ⇒ 0S1 (1) ⇒ 01 (2)
S ⇒ 0S1 (1) ⇒ 00S11 (1) ⇒ 000S111 (1) ⇒ 000111 (2)
• Derivation of aabb (with productions S → aSb | ε)
S ⇒ aSb ⇒ aaSbb ⇒ aabb
• Derivation tree
[Figure: root S with children a, S, b; the inner S has children a, S, b; the innermost S has the single child ε]
Examples
G = ({S, A, B}, {a, b}, {S → AB, A → aA | ε, B → Bb | ε}, S)
L(G) = L(a*b*)

Leftmost Derivation:
S ⇒ AB ⇒ aAB ⇒ aB ⇒ aBb ⇒ ab
Rightmost Derivation:
S ⇒ AB ⇒ ABb ⇒ Ab ⇒ aAb ⇒ ab

Derivation Tree (abstracts derivation)
[Figure: root S with children A and B; A has children a and A, where the inner A has child ε; B has children B and b, where the inner B has child ε]
Parse Trees and Derivations…..Cont…
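Leftmost derivation (steps numbered in preorder of the parse tree):
E ⇒ E + E (1)
⇒ id + E (2)
⇒ id + E * E (3)
⇒ id + id * E (4)
⇒ id + id * id (5)
Rightmost derivation (steps numbered in reverse postorder of the parse tree):
E ⇒ E + E (1)
⇒ E + E * E (2)
⇒ E + E * id (3)
⇒ E + id * id (4)
⇒ id + id * id (5)
[Figure: parse trees for id + id * id with interior nodes E1 to E5, numbered in preorder in the first tree and in reverse postorder in the second]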
Examples
• C++ identifier names. Check if _var2 is a valid identifier name.
Answer:
• <id> ::= <letter> <rest> | <underscore> <rest>
• <rest> ::= <letter> <rest> | <underscore> <rest> | <digits> <rest> | ε
• <letter> ::= a|b|c|…|z|A|B|…|Z
• <digits> ::= 0|1|…|9
• <underscore> ::= _
• Changing it to CFG:
• I → LR | UR
• R → LR | UR | DR | ε
• L → a | b | c | … | z | A | B | … | Z
• D → 0 | 1 | … | 9
• U → _
• _var2 is valid: I ⇒ UR ⇒ _R ⇒ _LR ⇒ _vR ⇒ _vLR ⇒ _vaR ⇒ _vaLR ⇒ _varR ⇒ _varDR ⇒ _var2R ⇒ _var2
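The identifier grammar can be turned into a tiny recognizer with one function per variable. This is a sketch only: the function names are illustrative, and it checks the grammar above, not the full C++ rules (for instance, reserved words are not rejected).

```python
import string

LETTERS = set(string.ascii_letters)   # L -> a | ... | z | A | ... | Z
DIGITS = set(string.digits)           # D -> 0 | ... | 9
UNDERSCORE = {"_"}                    # U -> _

def parse_rest(text, pos):
    """R -> LR | UR | DR | ε : consume letters, underscores and digits."""
    while pos < len(text) and text[pos] in LETTERS | DIGITS | UNDERSCORE:
        pos += 1
    return pos

def is_identifier(text):
    """I -> LR | UR : a letter or underscore first, then R up to the end."""
    if not text or text[0] not in LETTERS | UNDERSCORE:
        return False
    return parse_rest(text, 1) == len(text)

print(is_identifier("_var2"))
print(is_identifier("2var"))
```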
Examples: CFGs and CFLs
Find out language generated by Grammar.
G = ({S}, {a, b, c}, {S → aSa, S → bSb, S → c}, S)
Solution
Production rules:
S → aSa
S → bSb
S → c
Sample derivation:
S ⇒ aSa ⇒ abSba ⇒ abbSbba ⇒ abbcbba
L = {w c w^R | w ∈ {a, b}*}
S → aSa | aBa
B → bB | b
L(B) = {b^m | m > 0}
L(S) = {a^n b^m a^n | n > 0 ∧ m > 0}

S → aSa | B
B → bB | ε
L(S) = {a^n b^m a^n | n ≥ 0 ∧ m ≥ 0}

S → abSc | ε
L(S) = {(ab)^n c^n | n ≥ 0}

S → AB, A → aA | a, B → bB | ε
or equivalently S → aS | aB, B → bB | ε
L(S) = {a^n b^m | m ≥ 0 ∧ n > 0} = L(a⁺b*)

S → abScB | ε
B → bB | b
L(S) = {(ab)^n c b^{m_1} c b^{m_2} … c b^{m_n} | n ≥ 0 ∧ (∀i : 1 ≤ i ≤ n ⇒ m_i > 0)}

S → AbAbA, A → aA | ε
L(S) = L(a*ba*ba*)
The same language with left to right generation of the string:
S → aS | B, B → bA, A → aA | bC, C → aC | ε

L = {a^m b^n | m ≥ n}
S → B | ε
B → aBb | A
A → aA | ε

Cont…
L = {w ∈ {a, b}* | length(w) is EVEN}
E → ε | aaE | abE | baE | bbE
or, with a variable O for odd length:
E → ε | aO | bO
O → aE | bE

L = {w ∈ {a, b}* | w has an EVEN number of b's}
E → ε | aE | bO
O → aO | bE

L = {a^m b^n c^{m+n} | m, n ≥ 0}
Rewrite as {a^m b^n c^n c^m | m, n ≥ 0}:
S → S’ | aSc
S’ → ε | bS’c
CFG for L = {a^n b^m | n ≤ m+3, n, m ≥ 0}
S → aSb | A
A → ε | a | aa | aaa | B
B → bB | ε

CFG for L2 = {a^n b^m c^k | n + k = m}
S → S1S2
S1 → aS1b | ε
S2 → bS2c | ε

Cont…
Left and Right Recursive Grammars
In a context-free grammar G, if there is a production of the form X → Xa where X is a non-terminal and ‘a’ is a string of terminals, it is called a left recursive production. A grammar having a left recursive production is called a left recursive grammar.
Similarly, if there is a production of the form X → aX where X is a non-terminal and ‘a’ is a string of terminals, it is called a right recursive production. A grammar having a right recursive production is called a right recursive grammar.

Ambiguity in Context-Free Grammars
• Context Free Grammars(CFGs) are classified based on:
• Number of Derivation trees
• Number of strings
• Depending on the Number of Derivation trees, CFGs are sub-
divided into 2 types:
• Ambiguous grammars
• Unambiguous grammars

Definition: A CFG G = (V, T, P, S) is said to be ambiguous if and only if there exists a string in T* that has more than one parse tree, where
• V is a finite set of variables.
• T is a finite set of terminals.
• P is a finite set of productions of the form A → α, where A is a variable and α ∈ (V ∪ T)*.
• S is a designated variable called the start symbol.
Ambiguity
CFG ambiguous ⇔ any of the following equivalent statements:
– ∃ string w with multiple derivation trees.
– ∃ string w with multiple leftmost derivations.
– ∃ string w with multiple rightmost derivations.
This defines ambiguity of a grammar, not of a language.
Ambiguity in Context-Free Grammars
If a context free grammar G has more than one derivation tree for
some string w ∈ L(G), it is called an ambiguous grammar. There
exist multiple right-most or left-most derivations for some string
generated from that grammar.
Problem
Check whether the grammar G with production rules −
X → X+X | X*X |X| a
is ambiguous or not.
Solution
Let’s find the derivation trees for the string "a+a*a". It has two leftmost derivations.
Derivation 1 − X ⇒ X+X ⇒ a+X ⇒ a+X*X ⇒ a+a*X ⇒ a+a*a
Parse tree 1 − the root applies X → X+X, and the right child X expands by X → X*X.
Derivation 2 − X ⇒ X*X ⇒ X+X*X ⇒ a+X*X ⇒ a+a*X ⇒ a+a*a
Parse tree 2 − the root applies X → X*X, and the left child X expands by X → X+X.
Since there are two parse trees for a single string "a+a*a", the grammar G is ambiguous.
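Counting parse trees makes the ambiguity concrete. The sketch below is hard-wired to X → X+X | X*X | a; the unit rule X → X is deliberately left out, as an assumption of this example, because a rule that rewrites X to itself would give every string infinitely many trees. The function name is illustrative.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of tokens under X -> X+X | X*X | a (X -> X omitted)."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees that derive tokens[i:j] from X.
        if j - i == 1:
            return 1 if tokens[i] == "a" else 0
        total = 0
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                # tokens[k] is the operator applied at the root of this subtree.
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens)) if tokens else 0

print(count_parse_trees("a+a*a"))
```

Any count above 1 for some string shows that the grammar is ambiguous; a count of 1 for a few strings does not prove the opposite.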

Exercise
Check whether the following grammars are ambiguous or not:
• S → aS | Sa | ε
• E → E+E | E*E | id
• A → AA | (A) | a
• S → SS | AB, A → Aa | a, B → Bb | b

How to find out whether a grammar is ambiguous or not?
As a rule of thumb, if the same non-terminal is, directly or indirectly, both left recursive and right recursive, then the grammar is ambiguous.

Example − S → SaS | ε
In this grammar S is both left and right recursive, so the grammar is ambiguous.
We can make more than one parse tree/derivation tree for the input string aa.
Cont….

• If left and right recursion are not both present in a grammar, is the grammar unambiguous? Explain with an example.
• Ans − No, the grammar can still be ambiguous. If both left and right recursion are present for the same non-terminal, then the grammar is ambiguous, but the converse is not always true.

Cont….
Example −
S → aB | ab
A → AB | a
B → Abb | b
In the above example, left and right recursion are not both present, but for the string ab we can make more than one parse tree: S ⇒ ab directly, and S ⇒ aB ⇒ ab.
We can see that even if left and right recursion are not both present in a grammar, the grammar can be ambiguous.

Cont….
1. State whether the grammar is ambiguous or not.
S -> SAB | Є
A -> AaB | a
B -> AS | b
Ans – The grammar is Ambiguous.
If we put
B -> AS in S -> SAB
Then we get S -> SAAS and the grammar clearly contains both
left and right recursion. Hence the grammar is ambiguous.

2. Is the following grammar ambiguous?
S → AS | ε
A → A1 | 0A1 | 01
Cont…
Example: Check whether the following grammar is ambiguous or not
S → AB | C
A → aAb | ab
B → cBd | cd
C → aCd | aDd
D → bDc | bc
Solution
We can draw more than one parse tree for the string w = aabbccdd:
Parse tree 1: S ⇒ AB ⇒ aAbB ⇒ aabbB ⇒ aabbcBd ⇒ aabbccdd
Parse tree 2: S ⇒ C ⇒ aCd ⇒ aaDdd ⇒ aabDcdd ⇒ aabbccdd
As the string w = aabbccdd can be derived through two different parse trees, the given grammar is ambiguous.
CFL Closure Property

Context-free languages are closed under −
• Union
• Concatenation
• Kleene Star operation
Union
Let L1 and L2 be two context free languages. Then L1 ∪ L2 is also context free.
Example
Let L1 = { a^n b^n , n > 0}. Corresponding grammar G1 will have P: S1 → aS1b | ab
Let L2 = { c^m d^m , m ≥ 0}. Corresponding grammar G2 will have P: S2 → cS2d | ε
Union of L1 and L2, L = L1 ∪ L2 = { a^n b^n } ∪ { c^m d^m }
The corresponding grammar G will have the additional production S → S1 | S2
Cont…
Concatenation
If L1 and L2 are context free languages, then L1L2 is also context free.
Example
Product of the languages L1 and L2, L = L1L2 = { a^n b^n c^m d^m }
The corresponding grammar G will have the additional production S → S1 S2
Kleene Star
If L is a context free language, then L* is also context free.
Example
Let L = { a^n b^n , n ≥ 0}. Corresponding grammar G will have P: S → aSb | ε
Kleene Star L1 = { a^n b^n }*
The corresponding grammar G1 will have the additional productions S1 → SS1 | ε
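The three closure constructions only add productions for a fresh start symbol. The sketch below assumes a hypothetical encoding (a grammar is a dict from variable to a list of bodies, each body a list of symbols, with the empty list standing for ε) and assumes the two grammars use disjoint variable names.

```python
def _merge(g1, g2, start):
    """Combine two production dicts; variable names must not clash."""
    if set(g1) & set(g2) or start in g1 or start in g2:
        raise ValueError("variable names must be disjoint and start must be fresh")
    return {**g1, **g2}

def union(g1, s1, g2, s2, start="S"):
    """Grammar for L1 ∪ L2: add S -> S1 | S2."""
    g = _merge(g1, g2, start)
    g[start] = [[s1], [s2]]
    return g, start

def concatenation(g1, s1, g2, s2, start="S"):
    """Grammar for L1 L2: add S -> S1 S2."""
    g = _merge(g1, g2, start)
    g[start] = [[s1, s2]]
    return g, start

def kleene_star(g1, s1, start="S*"):
    """Grammar for L1*: add S* -> S1 S* | ε."""
    g = _merge(g1, {}, start)
    g[start] = [[s1, start], []]
    return g, start

G1 = {"S1": [["a", "S1", "b"], ["a", "b"]]}   # L1 = {a^n b^n | n > 0}
G2 = {"S2": [["c", "S2", "d"], []]}           # L2 = {c^m d^m | m >= 0}

g_union, s_union = union(G1, "S1", G2, "S2")
g_concat, s_concat = concatenation(G1, "S1", G2, "S2")
g_star, s_star = kleene_star(G1, "S1")
print(g_union[s_union], g_concat[s_concat], g_star[s_star])
```

The original productions are kept unchanged in every case, which is why the constructions need the variable sets to be disjoint.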
Cont…
Context-free languages are not closed under −
• Intersection − If L1 and L2 are context free languages, then L1 ∩ L2 is not necessarily context free.
• Complement − If L1 is a context free language, then L1’ may not be context free.
However, context-free languages are closed under −
• Intersection with Regular Language − If L1 is a regular language and L2 is a context free language, then L1 ∩ L2 is a context free language.

Simplification of CFG

As we have seen, many languages can be represented by a context-free grammar. Not every grammar is optimized: a grammar may contain extra symbols (non-terminals) that unnecessarily increase its length. Simplification of a grammar means reducing it by removing useless symbols.

In a CFG, it may happen that not all the production rules and symbols are needed for the derivation of strings. Besides, there may be some null productions and unit productions. Elimination of these productions and symbols is called simplification of CFGs.
Cont…
Simplification essentially comprises the following steps −
• Reduction of CFG
• Eliminate ambiguity.
• Eliminate “useless” variables.
• Eliminate ε-productions: A → ε.
• Eliminate unit productions: A → B.
• Eliminate redundant productions.
• Trade left- & right-recursion.
Reduction of CFG
CFGs are reduced in two phases −
Phase 1 − Derivation of an equivalent grammar, G’, from the
CFG, G, such that each variable derives some terminal string.
Cont…
Derivation Procedure −
Step 1 − Include all symbols, W1, that derive some terminal
and initialize i=1.
Step 2 − Include all symbols, Wi+1, that derive Wi.
Step 3 − Increment i and repeat Step 2, until Wi+1 = Wi.
Step 4 − Include all production rules that have Wi in it.
Phase 2 − Derivation of an equivalent grammar, G”, from the
CFG, G’, such that each symbol appears in a sentential form.
Derivation Procedure −
Step 1 − Include the start symbol in Y1 and initialize i = 1.
Step 2 − Include all symbols, Yi+1, that can be derived
from Yi and include all production rules that have been applied.
Step 3 − Increment i and repeat Step 2, until Yi+1 = Yi.
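The two phases can be sketched as two fixed-point loops. The encoding below is an assumption made for illustration: productions are (head, body) pairs with the body a tuple of symbols, and the set of variables is passed explicitly because a variable such as B may appear only on right-hand sides. The sample grammar is S → AC | B, A → a, C → c | BC, E → aA | e.

```python
def reduce_grammar(productions, variables, start):
    """Phase 1 keeps variables that derive a terminal string,
    Phase 2 keeps symbols reachable from the start symbol."""
    def usable(body, good):
        return all(s in good or s not in variables for s in body)

    # Phase 1: grow the set W until it stops changing.
    generating, changed = set(), True
    while changed:
        changed = False
        for head, body in productions:
            if head not in generating and usable(body, generating):
                generating.add(head)
                changed = True
    kept = [(h, b) for h, b in productions
            if h in generating and usable(b, generating)]

    # Phase 2: grow the set Y from the start symbol until it stops changing.
    reachable, changed = {start}, True
    while changed:
        changed = False
        for head, body in kept:
            if head in reachable:
                for s in body:
                    if s not in reachable:
                        reachable.add(s)
                        changed = True
    return [(h, b) for h, b in kept if h in reachable]

P = [("S", ("A", "C")), ("S", ("B",)), ("A", ("a",)), ("C", ("c",)),
     ("C", ("B", "C")), ("E", ("a", "A")), ("E", ("e",))]
reduced = reduce_grammar(P, {"S", "A", "B", "C", "E"}, "S")
print(reduced)
```

For this grammar, B is dropped in Phase 1 because it derives no terminal string, and E is dropped in Phase 2 because it cannot be reached from S.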
Cont…
Example
Find a reduced grammar equivalent to the grammar G, having
production rules, P: S → AC | B, A → a, C → c | BC, E → aA |
e
Solution
Phase 1 −
T = { a, c, e }
W1 = { A, C, E } from rules A → a, C → c and E → aA
W2 = { A, C, E } U { S } from rule S → AC
W3 = { A, C, E, S } U ∅
Since W2 = W3, we can derive G’ as −
G’ = { { A, C, E, S }, { a, c, e }, P, {S}}
where P: S → AC, A → a, C → c , E → aA | e
Cont…
Phase 2 −
Y1 = { S }
Y2 = { S, A, C } from rule S → AC
Y3 = { S, A, C, a, c } from rules A → a and C → c
Y4 = { S, A, C, a, c }
Since Y3 = Y4, we can derive G” as −
G” = { { A, C, S }, { a, c }, P, {S}}, where P: S → AC, A → a, C → c
Removal of Unit Productions
Any production rule of the form A → B where A, B ∈ Non-terminal is called a unit production.
Removal Procedure −
1 − To remove A → B, add production A → x to the grammar whenever B → x occurs in the grammar. [x ∈ Terminal, x can be Null]
2 − Delete A → B from the grammar.
3 − Repeat from step 1 until all unit productions are removed.
Cont….
Example
Remove unit production from the following −
S → XY, X → a, Y → Z | b, Z → M, M → N, N → a
Solution −
There are 3 unit productions in the grammar −
Y → Z, Z → M, and M → N
At first, we will remove M → N.
As N → a, we add M → a, and M → N is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → M, M → a, N → a
Now we will remove Z → M.
As M → a, we add Z→ a, and Z → M is removed.
The production set becomes
S → XY, X → a, Y → Z | b, Z → a, M → a, N → a
Cont…
Now we will remove Y → Z.
As Z → a, we add Y→ a, and Y → Z is removed.
The production set becomes
S → XY, X → a, Y → a | b, Z → a, M → a, N → a
Now Z, M, and N are unreachable, hence we can remove those.
The final CFG is unit production free −
S → XY, X → a, Y → a | b
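The worked example can be reproduced in code. Note that this sketch uses a common variant rather than the step-by-step deletion shown above: it first computes, for each variable A, every variable B reachable through unit productions only, and then copies B's non-unit bodies to A. The dict encoding and function names are illustrative assumptions.

```python
def remove_unit_productions(productions, variables):
    """productions: dict variable -> set of bodies (tuples of symbols)."""
    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    # unit_pairs[A] = all B with A =>* B using unit productions only.
    unit_pairs = {a: {a} for a in variables}
    changed = True
    while changed:
        changed = False
        for a in variables:
            for b in list(unit_pairs[a]):
                for body in productions.get(b, ()):
                    if is_unit(body) and body[0] not in unit_pairs[a]:
                        unit_pairs[a].add(body[0])
                        changed = True

    return {a: {body for b in unit_pairs[a]
                for body in productions.get(b, ()) if not is_unit(body)}
            for a in variables}

def drop_unreachable(productions, start):
    """Keep only the variables that can be reached from the start symbol."""
    reachable, stack = {start}, [start]
    while stack:
        for body in productions.get(stack.pop(), ()):
            for s in body:
                if s in productions and s not in reachable:
                    reachable.add(s)
                    stack.append(s)
    return {a: bodies for a, bodies in productions.items() if a in reachable}

P = {"S": {("X", "Y")}, "X": {("a",)}, "Y": {("Z",), ("b",)},
     "Z": {("M",)}, "M": {("N",)}, "N": {("a",)}}
no_units = remove_unit_productions(P, set(P))
final = drop_unreachable(no_units, "S")
print(final)
```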
Removal of Null Productions
In a CFG, a non-terminal symbol ‘A’ is a nullable variable if there is a production A → ε or there is a derivation that starts at A and finally ends up with ε: A ⇒ … ⇒ ε

Cont….
Removal Procedure
Step 1 − Find out the nullable non-terminal variables, i.e. those which derive ε.
Step 2 − For each production A → a, construct all productions A → x where x is obtained from ‘a’ by removing one or more of the nullable non-terminals found in Step 1.
Step 3 − Combine the original productions with the result of step 2 and remove the ε-productions.
Example
Remove null productions from the following −
S → ASA | aB | b, A → B, B → b | ε
Solution −
There are two nullable variables − A and B
At first, we will remove B → ε.
Cont….
After removing B → ε, the production set becomes −
S → ASA | aB | b | a, A → B | b | ε, B → b
Now we will remove A → ε.
After removing A → ε, the production set becomes −
S → ASA | aB | b | a | SA | AS | S, A → B | b, B → b
This is the final production set without null productions.
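The removal procedure can be sketched as follows. The dict encoding (variable to list of bodies, with the empty tuple standing for ε) and the function names are assumptions for illustration. The sketch handles all nullable variables in one pass instead of one at a time, and if the start symbol itself is nullable, ε is lost from the language unless S → ε is added back separately.

```python
from itertools import product

def nullable_variables(productions):
    """productions: dict variable -> list of bodies (tuples); () is the ε-body."""
    nullable, changed = set(), True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                    all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def remove_null_productions(productions):
    nullable = nullable_variables(productions)
    result = {}
    for head, bodies in productions.items():
        new_bodies = set()
        for body in bodies:
            # A nullable symbol may be kept or dropped; other symbols are kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                candidate = tuple(s for part in choice for s in part)
                if candidate:              # never emit an ε-body
                    new_bodies.add(candidate)
        result[head] = new_bodies
    return result

P = {"S": [("A", "S", "A"), ("a", "B"), ("b",)],
     "A": [("B",)],
     "B": [("b",), ()]}
result = remove_null_productions(P)
print(nullable_variables(P), result)
```

The S-productions agree with the final set above; the extra A → b listed in the slides is redundant, since A ⇒ B ⇒ b already derives it.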
Remove Useless productions
The productions that can never take part in the derivation of any string are called useless productions. Similarly, a variable that can never take part in the derivation of any string is called a useless variable.
For e.g.
S → abS | abA | abB
A → cd
B → aB
C → dc
Useless productions are B → aB (B never derives a terminal string) and C → dc (C cannot be reached from S).
Chomsky Normal Form
A CFG is in Chomsky Normal Form if the productions are in the following forms −
• A → a
• A → BC
• S → ε
where A, B, and C are non-terminals and a is a terminal.

Algorithm to Convert into Chomsky Normal Form −

Step 1 − If the start symbol S occurs on some right side, create
a new start symbol S’ and a new production S’→ S.
Step 2 − Remove Null productions. (Using the Null production
removal algorithm discussed earlier)
Step 3 − Remove unit productions. (Using the Unit production
removal algorithm discussed earlier)
Cont…
Step 4 − Replace each production A → B1…Bn where n > 2 with A → B1C where C → B2…Bn. Repeat this step for all productions having more than two symbols on the right side.
Step 5 − If the right side of any production is of the form A → aB where a is a terminal and A, B are non-terminals, then the production is replaced by A → XB and X → a. Repeat this step for every production of the form A → aB.
Example
Convert the following CFG into CNF
S → ASA | aB, A → B | S, B → b | ε

Cont…
Solution
(1) Since S appears on the R.H.S., we add a new start symbol S0, and S0 → S is added to the production set, which becomes −
S0 → S, S → ASA | aB, A → B | S, B → b | ε
(2) Now we will remove the null productions −
B → ∈ and A → ∈
After removing B → ε, the production set becomes −
S0→S, S→ ASA | aB | a, A → B | S | ∈, B → b
After removing A → ∈, the production set becomes −
S0→S, S→ ASA | aB | a | AS | SA | S, A → B | S, B → b

Cont…
(3) Now we will remove the unit productions.
After removing S → S, the production set becomes −
S0→S, S→ ASA | aB | a | AS | SA, A → B | S, B → b
After removing S0→ S, the production set becomes −
S0→ ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A → B | S, B → b
After removing A→ B, the production set becomes −
S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A→S|b
B→b
After removing A→ S, the production set becomes −
S0 → ASA | aB | a | AS | SA, S→ ASA | aB | a | AS | SA
A → b |ASA | aB | a | AS | SA, B → b
Cont…
(4) Now we will find productions with more than two variables on the R.H.S.
Here, S0 → ASA, S → ASA, A → ASA violate the limit of two non-terminals on the R.H.S.
Hence we will apply step 4 to get the following production set −
S0 → AX | aB | a | AS | SA
S → AX | aB | a | AS | SA
A → b | AX | aB | a | AS | SA
B → b
X → SA
(5) We have to change the productions S0 → aB, S → aB, A → aB.
The final production set, which is in CNF, becomes −
S0 → AX | YB | a | AS | SA
S → AX | YB | a | AS | SA
A → b | AX | YB | a | AS | SA
B → b
X → SA
Y → a
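One reason to convert to CNF is that membership can then be tested with the CYK algorithm, which is not covered in these slides and is shown here only as an illustration. The sketch below encodes the final production set above; the table layout and helper names are assumptions, and the empty string is simply rejected because S0 → ε is not part of this grammar.

```python
def cyk(word, unit_rules, pair_rules, start):
    """CYK membership test for a grammar in CNF.

    unit_rules: terminal -> set of variables A with A -> terminal
    pair_rules: (B, C)  -> set of variables A with A -> BC
    """
    n = len(word)
    if n == 0:
        return False
    # table[i][l] = variables that derive word[i:i+l]
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, ch in enumerate(word):
        table[i][1] = set(unit_rules.get(ch, ()))
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for b in table[i][split]:
                    for c in table[i + split][length - split]:
                        table[i][length] |= pair_rules.get((b, c), set())
    return start in table[0][n]

RULES = {
    "S0": ["AX", "YB", "a", "AS", "SA"],
    "S":  ["AX", "YB", "a", "AS", "SA"],
    "A":  ["b", "AX", "YB", "a", "AS", "SA"],
    "B":  ["b"],
    "X":  ["SA"],
    "Y":  ["a"],
}
unit_rules, pair_rules = {}, {}
for head, bodies in RULES.items():
    for body in bodies:
        if len(body) == 1:
            unit_rules.setdefault(body, set()).add(head)
        else:
            pair_rules.setdefault((body[0], body[1]), set()).add(head)

print(cyk("ab", unit_rules, pair_rules, "S0"))
```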

Greibach Normal
Form
A CFG is in Greibach Normal Form if the productions are in the following forms −
• A → b
• A → bD1…Dn
• S → ε
where A, D1, ..., Dn are non-terminals and b is a terminal.
Algorithm to Convert a CFG into Greibach Normal Form
1. If the start symbol S occurs on some right side, create a new start
symbol S’ and a new production S’ → S.
2. Remove Null productions. (Using the Null production removal
algorithm discussed earlier)
3. Remove unit productions. (Using the Unit production removal
algorithm discussed earlier)
4. Remove all direct and indirect left-recursion.
5. Do proper substitutions of productions to convert it into the proper form of GNF.
Cont…
Example
Convert the following CFG into GNF
S → XY | Xn | p
X → mX | m
Y → Xn | o
Solution
Here, S does not appear on the right side of any production and there are no unit or null productions in the production rule set. So, we can skip Step 1 to Step 3.
Step 4
There is no left recursion to remove. After replacing X in S → XY | Xn | p with mX | m we obtain
S → mXY | mY | mXn | mn | p.
And after replacing X in Y → Xn | o with the right side of X → mX | m we obtain
Y → mXn | mn | o.
Step 5
The terminal n still appears after the first symbol of some right sides, so a new production N → n is added to the production set, and we come to the final GNF as the following −
S → mXY | mY | mXN | mN | p
X → mX | m
Y → mXN | mN | o
N → n