Automata Theory Computability - M3
3.1 CONTEXT FREE GRAMMAR
Grammar:
What is a grammar?
A rewrite system that is used to define a language is called a grammar.
If G is a grammar which generates a language L, then the language is written as L(G). A grammar works on
a set of symbols of two types: non-terminal symbols and terminal symbols.
Non-terminal and Terminal Symbols
What are non-terminal and terminal symbols?
Non-terminal symbols act as working symbols while the grammar is carrying out a derivation.
They disappear once the grammar has completely derived the string w of L(G).
Terminal symbols come from the input alphabet ∑. The derived string w of L(G) consists only of these symbols.
Every grammar needs one special non-terminal called the start symbol. It is normally denoted by S.
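To turn a DFA transition from state S on input ‘a’ back to state S into a production, the state S is written
on the LHS, the transition is replaced by →, and the input symbol ‘a’ and the next state S are concatenated
and written on the RHS of the CFG, ie: S → aS.
For a final state we have to include ‘ɛ’ on the RHS of the CFG, ie: S → ɛ.
Therefore the CFG for the language L = { a^n | n ≥ 0 } is given by: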
S → aS.
S → ɛ.
S → aSbb
S→ɛ
Here for every ‘a’ two ‘b’s are generated. This is obtained by suffixing ‘aS’ with ‘bb’.
The minimum string is ɛ.
If n ≥ 1 then CFG becomes:
S → aSbb
S → abb
Generation of context free grammar for the language represented using regular expression.
i. Obtain CFG for the language L = (a + b)*
The language contains every string of a’s and b’s, including ɛ.
S → aS | bS | ɛ
ii. Obtain CFG for the language L = { x ab y | x, y ∈ (a+b)* } OR
Obtain CFG for the language containing strings of a’s and b’s with substring ‘ab’
The language can be re-written as A ab A,
where the A production represents any number of a’s and b’s and is given by:
A → aA | bA | ɛ
Therefore the resulting grammar is G = ( V, T, P, S) where,
V = { S, A }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → AabA
A → aA | bA | ɛ
iii. Obtain CFG for the language L = { ( 011 + 1) * 01 }
L can be re-written as:
iv. Obtain CFG for the language L = { w | w ∈ (0+1)* with at least one occurrence of ‘101’ }.
The regular expression corresponding to the language is (0+1)* 101 (0+1)*,
where the A production represents any number of 0’s and 1’s and is given by:
A → 0A | 1A | ɛ
Therefore the resulting grammar is G = ( V, T, P, S) where,
V = { S, A }, T = { 0, 1 }, S is the start symbol, and P is the set of production rules shown below:
S → A101A
A → 0A | 1A | ɛ
v. Obtain CFG for the language L = { w ab | w ∈ (a+b)* }. OR
Obtain CFG for the language containing strings of a’s and b’s ending with ‘ab’.
The resulting grammar is G = ( V, T, P, S) where,
V = { S, A }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → Aab
A → aA | bA | ɛ
vi. Obtain CFG for the language containing strings of a’s and b’s ending with ‘ab’ or ‘ba’.
OR
Obtain the context free grammar for the language L = { XY | X ∈ (a+b)* and Y ∈ (ab + ba) }
The regular expression corresponding to the language is w (ab + ba) where w is in (a+b)*
X → aX | bX | ɛ
Y → ab | ba
The resulting grammar is G = ( V, T, P, S) where,
V = { S, X, Y }, T = { a, b }, S is the start symbol, and P is the set of production rules shown
below:
S → XY
X→ aX | bX | ɛ
Y → ab | ba
a). Obtain the CFG for the language L = { w | Na(w) = Nb(w), w ∈ (a+b)* }
OR
Obtain the CFG for the language containing strings of a’s and b’s with equal number of a’s and
b’s.
Answer:
To get an equal number of a’s and b’s, there are 3 cases:
i. The empty string ɛ has an equal number of a’s and b’s.
ii. A string with equal numbers of a’s and b’s enclosed between a leading ‘a’ and a trailing ‘b’.
iii. A string with equal numbers of a’s and b’s enclosed between a leading ‘b’ and a trailing ‘a’.
The corresponding productions for these 3 cases can be written as
S→ ɛ
S→ aSb
S→ bSa
Using these productions, strings such as ɛ, ab, ba, aabb, abab, baba, ababab, ... can be
generated.
But strings such as abba and baab, which start and end with the same symbol, cannot be
generated from these productions. To generate them, we concatenate two strings that each
have equal numbers of a’s and b’s (for example abba = ab · ba). The corresponding production
is S → SS.
The resulting grammar for the language with an equal number of a’s and b’s is
G = ( V, T, P, S) where,
V = { S }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S→ɛ
S→ aSb
S → bSa
S → SS
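The four productions above can be checked by brute force. The sketch below is only an illustration: it explores leftmost derivations breadth-first and collects every terminal string up to a chosen length. The function name `generate` and the `slack` bound on the length of sentential forms are assumptions made for this example, not part of the notes; the bound keeps the search finite in the presence of S → SS and S → ɛ.

```python
from collections import deque

# Productions of S; the empty string "" stands for ɛ.
PRODUCTIONS = {"S": ["", "aSb", "bSa", "SS"]}

def generate(max_len, slack=2):
    """Collect terminal strings of length <= max_len reachable by leftmost
    derivations whose sentential forms never exceed max_len + slack symbols."""
    seen, found = {"S"}, set()
    queue = deque(["S"])
    while queue:
        form = queue.popleft()
        # Position of the leftmost non-terminal, or None if the form is a sentence.
        pos = next((i for i, ch in enumerate(form) if ch in PRODUCTIONS), None)
        if pos is None:
            found.add(form)
            continue
        for body in PRODUCTIONS[form[pos]]:
            new = form[:pos] + body + form[pos + 1:]
            terminals = sum(ch not in PRODUCTIONS for ch in new)
            if terminals <= max_len and len(new) <= max_len + slack and new not in seen:
                seen.add(new)
                queue.append(new)
    return found

strings = generate(4)
print(sorted(strings, key=lambda w: (len(w), w)))
# Every generated string has as many a's as b's.
print(all(w.count("a") == w.count("b") for w in strings))
```

For length 4 the search finds abba and baab, which need the production S → SS, alongside the strings produced by S → aSb and S → bSa alone.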
b). Obtain the CFG for the language L = { w | Na(w) = Nb(w) + 1, w ∈ (a+b)* }
The language contains strings of a’s and b’s in which the number of a’s is one more than the number of b’s.
Here the one extra ‘a’ may appear at the beginning, at the end, or in the middle.
We can write the A production with an equal number of a’s and b’s as
d. Obtain the CFG for the language L = {set of all non-palindromes over {a, b}}
Answer:
A non-palindrome has at least one pair of symbols, at equal distance from the two ends, that differ.
The mismatched pair is generated by: A → aBb | bBa
where B corresponds to any number of a’s and b’s, ie: B → aB | bB | ɛ
Finally, non-palindrome strings are generated by placing the A production inside the
palindrome-style production S, ie: S → aSa | bSb | A
The resulting grammar corresponding to the language L = { set of all non-palindromes } is
G = ( V, T, P, S) where,
V = { S, A, B }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → aSa | bSb | A
A→ aBb | bBa
B→ aB | bB |ɛ
e. Obtain the CFG for the language L = { w w^R | w ∈ (a+b)* }
Answer:
NOTE: w w^R generates palindrome strings of a’s and b’s of even length.
That means we can remove the odd length palindrome strings such as ‘a’ and ‘b’ from the
palindrome problem(c).
The resulting grammar corresponding to the language L = { w w^R } is G = ( V, T, P, S) where,
V = { S }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → aSa | bSb | ɛ
f. Obtain the CFG for the language L = { w | w = w^R, w ∈ (a+b)* }
OR
L = { palindrome strings over {a, b} }
Note: w = w^R indicates that the string w and its reversal w^R are always equal. That means the strings
of the language are palindromes (of either even or odd length).
Obtain the CFG for the language containing all positive odd integers up to 999.
The resulting grammar corresponding to the language L = {all positive odd integers up to 999 }
is G = ( V, T, P, S) where,
V = { S, C, D }, T = { 0,1, 2,3,4,5,6,7,8,9}, S is the start symbol, and P is the production rule is
as shown below:
S → C | DC | DDC
C → 1|3|5|7|9
D → 0| 1|2|3|4|5|6|7|8|9
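Because this grammar is not recursive, its language is finite and can be listed completely. The sketch below is an illustration only (the helper name `expand` is an assumption of the example): it expands every production and compares the numeric values of the derived strings with the odd integers from 1 to 999.

```python
from itertools import product

# Productions of the grammar above; each body is a tuple of symbols.
GRAMMAR = {
    "S": [("C",), ("D", "C"), ("D", "D", "C")],
    "C": [(d,) for d in "13579"],
    "D": [(d,) for d in "0123456789"],
}

def expand(symbol):
    """Return every terminal string derivable from `symbol`.
    This terminates only because the grammar has no recursion."""
    if symbol not in GRAMMAR:            # a terminal derives only itself
        return [symbol]
    results = []
    for body in GRAMMAR[symbol]:
        parts = [expand(s) for s in body]
        results.extend("".join(combo) for combo in product(*parts))
    return results

strings = expand("S")
values = {int(s) for s in strings}
print(len(strings))                          # number of derived strings
print(values == set(range(1, 1000, 2)))      # exactly the odd integers 1..999
```

Since D may be 0, the grammar also derives strings with leading zeros such as 007; these have the same numeric value as shorter strings, which is why there are more derived strings than distinct integers.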
Athmaranjan K Dept of ISE Page 10
Automata Theory and Computability Design of CFG
We know the CFG corresponding to the language { 0^m 1^m | m ≥ 1 } by referring to the basic building
block grammar of { a^n b^n | n ≥ 1 }.
The equivalent A production is:
A → 0A1
A → 01
Here B represents any number of 2’s with at least one 2 (n ≥ 1), which is similar to the a^n grammar.
The equivalent B production is:
B → 2B
B→2
So the context free grammar for the language L = { 0^m 1^m 2^n | m, n ≥ 1 } is G = ( V, T, P, S)
where,
V = { S, A, B }, T = { 0, 1, 2 }, S is the start symbol, and P is the set of production rules shown
below:
S → AB
A → 0A1 | 01
B → 2B | 2
Obtain the context free grammar for the language L = { a^{2n} b^m | m, n ≥ 0 }
Answer:
Since ‘a’ is represented in terms of ‘n’ and ‘b’ is represented in terms of ‘m’, we can re-write the
language as the concatenation of (aa)^n and b^m. The resulting grammar is G = ( V, T, P, S) where,
V = { S, A, B }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → AB
A → aaA |ɛ
B → bB |ɛ
Obtain the context free grammar for the language L = { 0^i 1^j 2^k | i = j or j = k where i, j, k ≥ 0 }
Case 1: when i = j
The given language becomes 0^i 1^i 2^k
The resultant productions are given by: A → 0A1 | ɛ and B → 2B | ɛ
Therefore case 1 results in productions
S→ AB
A→ 0A1| ɛ
B→ 2B| ɛ
Case 2: when j = k
Where C→ 1C| 1
So the context free grammar for the language L = { 0^i 1^j | i ≠ j where i, j ≥ 0 }
is G = ( V, T, P, S) where,
V = { S, A, B, C }, T = { 0, 1 }, S is the start symbol, and P is the set of production rules shown
below:
S→ AB |BC
A→ 0A| 0
B→ 0B1| ɛ
C→ 1C| 1
Obtain the context free grammar for the language L = { a^n b^m | n = 2m where m ≥ 0 }
Answer:
By substituting n = 2m we have
L = { a^{2m} b^m | m ≥ 0 }
Here for every two ‘a’s one ‘b’ has to be generated. This is obtained by suffixing ‘aaS’ with one
‘b’. The minimum string is ɛ.
So the context free grammar for the language L = { a^n b^m | n = 2m where m ≥ 0 }
is G = ( V, T, P, S) where,
V = { S }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → aaSb
S→ɛ
Obtain the context free grammar for the language L = { a^n b^m | n ≠ 2m where n, m ≥ 1 }
Answer:
Here n ≠ 2m means n > 2m or n < 2m, which results in two possible cases of the language L.
Case 1: when n > 2m, we re-write the language L by taking n = 2m + 1
L = { a^{2m+1} b^m | m ≥ 1 }; by referring to the basic building block grammar example, the
recursive production (for a^{2m} b^m) is given by:
A → aaAb
The minimum string when m = 1 is ‘aaab’.
ie: A → aaab
Therefore A → aaAb | aaab
Case 2: when n < 2m, we re-write the language L by taking n = 2m - 1
L = { a^{2m-1} b^m | m ≥ 1 }; by referring to the basic building block grammar example, the
recursive production (for a^{2m} b^m) is given by:
B → aaBb
The minimum string when m = 1 is ‘ab’.
ie: B → ab
Therefore B → aaBb | ab.
So the context free grammar for the language L = { a^n b^m | n ≠ 2m where n, m ≥ 1 }
is G = ( V, T, P, S) where,
V = { S, A, B }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
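S → A | B
A → aaAb | aaab
B → aaBb | ab
Note: with these productions A derives only a^{2m+1} b^m and B derives only a^{2m-1} b^m, so the
grammar covers the two representative cases n = 2m ± 1 rather than every pair with n ≠ 2m.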
Obtain the context free grammar for the language L = { a^n b^m | n ≠ 2m where n, m ≥ 0 }
Note: The answer is the same as that of the previous problem, except for the minimum string value,
ie: when m = 0.
For the A production the minimum string when m = 0 is ‘a’:
A → a
For the B production the minimum string is obtained when m = 1, ie: ‘ab’:
B → ab
So the context free grammar for the language L = { a^n b^m | n ≠ 2m where n, m ≥ 0 }
is G = ( V, T, P, S) where,
V = { S, A, B }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S →A | B
A → aaAb | a
B → aaBb | ab
Obtain the context free grammar for the language L = { 0^i 1^j 2^k | i + j = k where i, j ≥ 0 }
Answer:
When i + j = k, the given language becomes: L = 0^i 1^j 2^{i+j}
L = 0^i 1^j 2^j 2^i
Note: For this type of language we select the middle string 1^j 2^j as a substring (A) and
insert this substring inside the start production, which generates 0^i 2^i around it (where the
middle term A is ignored).
The equivalent substring A production is given by: A → 1A2 | ɛ
The start production is S → 0S2 | A; here the minimum value when i = 0 is A
So the context free grammar for the language L = { 0^i 1^j 2^k | i + j = k where i, j ≥ 0 }
is G = ( V, T, P, S) where,
V = { S, A }, T = { 0, 1, 2 }, S is the start symbol, and P is the set of production rules shown below:
S→ 0S2| A
A→ 1A2| ɛ
L=
is G = ( V, T, P, S) where,
V = {S, A}, T = {a, b, 0, 1}, S is the start symbol, and P is the production rule is as shown
below:
S → aSb | aaAbb
A→ 0A0| 1A1 |ɛ
S1 production is ; S1 → AB
A → aAb |ɛ
B → cB | c
S2 production is ; S2 → AC
C → cCd | ɛ
So the context free grammar for the language L = { a^n b^n c^i | n ≥ 0, i ≥ 1 } ∪ { a^n b^n c^m d^m | n, m ≥ 0 }
is G = ( V, T, P, S) where,
V = { S, S1, S2, A, B, C }, T = { a, b, c, d }, S is the start symbol, and P is the set of production
rules shown below:
S → S1 | S2
S1 → AB
A → aAb |ɛ
B → cB | c
S2 → AC
C → cCd | ɛ
Obtain the context free grammar for the language L1L2 where L1 = { a^n b^n c^i | n ≥ 0, i ≥ 1 } and
L2 = { 0^n 1^{2n} | n ≥ 0 }
Answer:
S1 production is ; S1 → AB
A → aAb |ɛ
B → cB | c
S2 production is: S2 → 0 S2 11 | ɛ
So the context free grammar for the language L1L2, where L1 = { a^n b^n c^i | n ≥ 0, i ≥ 1 } and
L2 = { 0^n 1^{2n} | n ≥ 0 },
is G = ( V, T, P, S) where,
V = { S, S1, S2, A, B }, T = { a, b, c, 0, 1 }, S is the start symbol, and P is the set of production
rules shown below:
S → S1 S2
S1 → AB
A → aAb |ɛ
B → cB | c
S2 → 0 S2 11 | ɛ
******* Obtain the context free grammar for the language L = { a^{n+2} b^m | n ≥ 0, m > n }
It is clear from the above language that the set of strings that can be generated is:
n = 0 (m = 1, 2, 3, ...): aab, aabb, aabbb, ...
n = 1 (m = 2, 3, 4, ...): aaabb, aaabbb, aaabbbb, ...
n = 2 (m = 3, 4, 5, ...): aaaabbb, aaaabbbb, aaaabbbbb, ...
General form: a a^n b^n b*, with n ≥ 1
We observe that the above language consists of strings of a’s and b’s which start with one ‘a’,
followed by a^n b^n with n ≥ 1, which is in turn followed by any number of b’s (b*).
A → aAb | ab
B → bB | ɛ ; and the S production is S → aAB
So the context free grammar for the language L = { a^{n+2} b^m | n ≥ 0, m > n } is G = ( V, T, P, S)
where,
V = { S, A, B }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → aAB
A → aAb |ab
B→ bB |ɛ
******* Obtain the context free grammar for the language L = { a^n b^m | n ≥ 0, m > n }
The strings of the language are: ɛ b+, ab b+, aabb b+, ...
ie: a^n b^n b+ where n ≥ 0
We observe that the above language consists of strings with n a’s followed by n b’s, which are
in turn followed by any number of b’s with at least one b:
L = { a^n b^n b+ | n ≥ 0 }
******* Obtain the context free grammar for the language L = { a^n b^{n-3} | n ≥ 3 }
Answer:
L = { aaa, aaaab, aaaaabb, aaaaaabbb, ... }
So we can re-write the language as:
L = { aaa a^n b^n | n ≥ 0 }
So the context free grammar for the language L = { a^n b^{n-3} | n ≥ 3 } is G = ( V, T, P, S) where,
V = { S, A }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → aaaA
A → aAb | ɛ
Modulo – k Problems: Writing CFG by constructing DFA:
******* Obtain the context free grammar for the language L = { w ∈ (a)* | |w| mod 3 ≠ |w| mod 2 }
Answer:
Here mod 3 results in 3 remainders: 0, 1 and 2, and mod 2 results in 2 remainders: 0 and 1.
The possible states are: (0, 0), (0, 1), (1, 0), (1, 1), (2, 0), (2, 1)
The equivalent DFA visits these states in the order (0, 0), (1, 1), (2, 0), (0, 1), (1, 0), (2, 1) as each
‘a’ is read; the states whose two remainders differ are final.
The corresponding grammar is G = ( V, T, P, S) where,
V = { S, A, B, C, D, E }, T = { a }, S is the start symbol, and P is the set of production rules shown
below:
S → aA
A→ aB
B→ aC | ɛ
C→ aD | ɛ
D→ aE| ɛ
E→ aS |ɛ
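The grammar above is right-linear, and each variable has exactly one production that consumes an ‘a’, so a derivation of a^n can be followed like a DFA run. The sketch below is an illustration under that observation (the table `RULES` and the function `derives` are names chosen for the example): it checks the grammar against the condition |w| mod 3 ≠ |w| mod 2 for short strings.

```python
# For each variable: (variable reached after consuming one 'a', has an ɛ-production?)
RULES = {
    "S": ("A", False),
    "A": ("B", False),
    "B": ("C", True),
    "C": ("D", True),
    "D": ("E", True),
    "E": ("S", True),
}

def derives(n):
    """True if the grammar derives a^n: apply X -> aY n times, then X -> ɛ must exist."""
    var = "S"
    for _ in range(n):
        var = RULES[var][0]
    return RULES[var][1]

for n in range(12):
    print(n, derives(n), n % 3 != n % 2)
```

The two boolean columns agree for every n, and the pattern repeats with period 6, matching the six states of the DFA.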
******* Obtain the context free grammar for the language L = { w ∈ (a+b)* | |w| mod 3 ≠ |w| mod 2 }
DFA:
C→ aD |bD | ɛ
D→ aE| bE |ɛ
E→ aS |bS| ɛ
******* Obtain the context free grammar for the language L = { w : Na(w) mod 2 = 0 where
w ∈ (a+b)* }
Answer:
Na(w) mod 2 = 0 means the string contains an even number of a’s and any number of b’s, and the
b’s may appear anywhere (for example ‘aba’ is in L).
Following the DFA approach, two states are enough: S (an even number of a’s read so far, final
state) and A (an odd number of a’s read so far). Reading ‘b’ keeps the state; reading ‘a’ switches it.
The S productions are: S → bS | aA | ɛ
The A productions are: A → bA | aS
So the context free grammar for the language L = { w : Na(w) mod 2 = 0 where w ∈ (a+b)* }
is G = ( V, T, P, S) where,
V = { S, A }, T = { a, b }, S is the start symbol, and P is the set of production rules shown below:
S → bS | aA | ɛ
A → bA | aS
Write a CFG for the language L of balanced parentheses. OR
L = { w ∈ { (, ) }* | the parentheses in w are balanced }
So the context free grammar is G = ( V, T, P, S) where,
V = { S }, T = { (, ) }, S is the start symbol, and P is the set of production rules shown below:
S → (S)
S → SS
S→ ɛ
Derivation
Define the following terms:
i. Derivation
ii. Left Most Derivation
iii. Right Most Derivation.
iv. Sentential Form
v. Left Sentential Form
Derivation: The process of obtaining a string of terminals and/or non-terminals from the start
symbol by applying some or all of the production rules is called derivation.
If a string is obtained by applying only one production, it is called a one-step derivation.
Example: Consider the productions: S → AB, A → aAb | ɛ, B → bB | ɛ
S => AB
  => aAbB
  => abB
  => abbB
  => abb
Note: The derivation process ends whenever one of the following things happens.
i. The working string no longer contains any non-terminal symbols (including, as a special case,
when the working string is ε), ie: a string of the language has been generated.
ii. There are non-terminal symbols in the working string, but none of them matches the left-hand
side of any rule in the grammar. For example, if the working string were AaBb, this would
happen if the only left-hand side were C.
Left Most Derivation (LMD): In the derivation process, if the leftmost variable is replaced at every
step, then the derivation is said to be leftmost.
Example: E → E+E | E*E | a | b
Let us derive the string a+b*a by applying LMD.
E => E*E
  => E+E*E
  => a+E*E
  => a+b*E
  => a+b*a
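The leftmost derivation of a+b*a can be replayed mechanically. The sketch below is a small illustration (the helper `leftmost_step` and the list `steps` are names introduced for this example): at every step it replaces the leftmost variable of the current sentential form with the chosen right-hand side.

```python
def leftmost_step(form, variables, body):
    """Replace the leftmost variable occurring in `form` with `body`."""
    for i, ch in enumerate(form):
        if ch in variables:
            return form[:i] + body + form[i + 1:]
    raise ValueError("no variable left to expand")

VARIABLES = {"E"}
# Right-hand sides applied in order; each is a body of E -> E+E | E*E | a | b.
steps = ["E*E", "E+E", "a", "b", "a"]

form = "E"
trace = [form]
for body in steps:
    form = leftmost_step(form, VARIABLES, body)
    trace.append(form)

print(" => ".join(trace))
```

A rightmost derivation can be replayed in the same way by scanning the sentential form from the right instead of from the left.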
Right Most Derivation (RMD): In the derivation process, if the rightmost variable is replaced at
every step, then the derivation is said to be rightmost.
Example: E → E+E | E*E | a | b
Let us derive the string a+b*a by applying RMD.
E => E+E
  => E+E*E
  => E+E*a
  => E+b*a
  => a+b*a
Sentential form: For a context free grammar G, any string w in (V U T)* that can be derived from
the start symbol (ie: that appears in some step of a derivation) is called a sentential form. A
sentential form consisting only of terminals is called a sentence.
Sentential forms can be generated in two ways:
i. Left sentential form
ii. Right sentential form
Example: S => AB
  => aAbB
  => abB
  => abbB
  => abb
Here {S, AB, aAbB, abB, abbB, abb } can be obtained from start symbol S, Each string in the set
is called sentential form.
Left Sentential form: For a context free grammar G, any string w in (V U T)* which appears in
a step of a Left Most Derivation is called a left sentential form.
Example: E => E*E
  => E+E*E
  => a+E*E
  => a+b*E
  => a+b*a
Left sentential form = {E, E*E, E+E*E, a +E*E, a+b*E, a+b*a }
Right Sentential form: For a context free grammar G, any string w in (V U T)* which appears
in a step of a Right Most Derivation is called a right sentential form.
Example: E => E+E
  => E+E*E
  => E+E*a
  => E+b*a
  => a+b*a
Right sentential form = {E, E+E, E+E*E, E +E*a, E+ b*a, a + b * a }
PARSE TREE: ( DERIVATION TREE)
What is parse tree?
The derivation process can be shown in the form of a tree. Such trees are called derivation trees
or Parse trees.
Example: E → E+E | E*E | a | b
The Parse tree for the LMD of the string a+b*a is as shown below:
YIELD OF A TREE:
What is Yield of a tree?
The yield of a tree is the string of terminal symbols obtained by only reading the leaves of the
tree from left to right without considering the ɛ symbols.
Example:
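As an illustration of the definition, the sketch below computes the yield of the parse tree for the derivation S => AB => aAbB => abB => abbB => abb used earlier. The tree encoding (a pair of label and list of children) and the helper names `leaf` and `tree_yield` are assumptions of this example.

```python
EPS = "ɛ"

def leaf(symbol):
    """A leaf node: a label with no children."""
    return (symbol, [])

def tree_yield(node):
    """Concatenate the leaf labels from left to right, skipping ɛ leaves."""
    label, children = node
    if not children:
        return "" if label == EPS else label
    return "".join(tree_yield(child) for child in children)

# Parse tree for S -> AB, A -> aAb | ɛ, B -> bB | ɛ deriving 'abb'.
tree = ("S", [
    ("A", [leaf("a"), ("A", [leaf(EPS)]), leaf("b")]),
    ("B", [leaf("b"), ("B", [leaf(EPS)])]),
])

print(tree_yield(tree))
```

The two ɛ leaves contribute nothing, so the yield is the derived string abb.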
Branching factor
Define the branching factor of a CFG
The branching factor of a grammar G is the length (the number of symbols) of the longest right-
hand side of any rule in G.
Then the branching factor of any parse tree generated by G is less than or equal to the branching
factor of G.
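The branching factor follows directly from the production list. A minimal sketch, assuming each right-hand side is stored as a list of symbols (the function name `branching_factor` is chosen for the example), applied to E → E+E | E*E | a | b:

```python
def branching_factor(productions):
    """Length (number of symbols) of the longest right-hand side of any rule."""
    return max(len(body) for bodies in productions.values() for body in bodies)

G = {
    "E": [["E", "+", "E"], ["E", "*", "E"], ["a"], ["b"]],
}

print(branching_factor(G))
```

Here the longest right-hand sides, E+E and E*E, have three symbols, so no node in any parse tree of this grammar has more than three children.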
NOTE:
1. Every leaf node is labelled with a terminal symbol or ɛ.
2. The root node is labelled with the start symbol S.
3. Every interior node is labelled with some element of V.
Problem 1:
Consider the following grammar G:
S → aAS |a
A→ SbA |SS |ba
Obtain: i) LMD ii) RMD iii) Parse tree for LMD iv) Parse tree for RMD for the string
‘aabbaa’
Problem 2:
Design a grammar for valid expressions over operator – and /. The arguments of expressions are
valid identifier over symbols a, b, 0 and 1. Derive LMD and RMD for string w = (a11 – b0) /
(b00 – a01). Write parse tree for LMD
Grammar for valid expression:
E → E – E | E / E | (E) |I
I → a | b | Ia |Ib | I0 |I1
Problem 3:
Consider the following grammar G:
E → + EE | * EE | - EE | x | y
Find the: i) LMD; ii. RMD iii. Parse tree for the string ‘+*-xyxy’
Answer:
E → + EE | * EE | - EE | x | y
LMD: RMD:
Problem 4:
Show the derivation tree for the string ‘aabbbb’ with grammar:
S → AB |ɛ
A → aB
B → Sb
Give a verbal description of the language generated by this grammar.
Answer: Derivation tree:
Answer:
Problem 6:
Consider the following grammar:
S → AbB
A →aA |ɛ
B → aB | bB |ɛ
Give LMD, RMD and parse tree for the string aaabab
LMD: RMD:
Obtain the context free grammar for generating integers and derive the integer 1278 by applying
LMD.
The context free grammar corresponding to the language containing the set of integers is G = ( V, T,
P, I) where, V = { I, S, N, D }, T = { +, -, 0, 1, 2, ..., 9 }, I is the start symbol, S generates the
optional sign, and P is the set of production rules shown below:
I → N | SN
S → + | - | ε
N → D | DN | ND
D → 0 | 1 | 2 | 3 | ... | 9
LMD for the integer 1278:
I => N
  => ND
  => NDD
  => NDDD
  => DDDD
  => 1DDD
  => 12DD
  => 127D
  => 1278
AMBIGUOUS GRAMMAR:
Sometimes a context free grammar may produce more than one parse tree for some (or all) of
the strings it generates. When this happens, we say that the grammar is ambiguous. More
precisely, a grammar G is ambiguous iff there is at least one string in L(G) for which G
produces more than one parse tree.
***What is an ambiguous grammar?
A context free grammar G is an ambiguous grammar if and only if there exists at least one string
w in L(G) for which the grammar G produces more than one parse tree.
Show how ambiguity in grammars is verified with an example.
Ambiguity in a CFG is tested by the following rules:
i. Obtain the string ‘w’ in L(G) by applying LMD twice and construct the parse trees. If the
two parse trees are different, then the grammar is ambiguous.
ii. Obtain the string ‘w’ in L(G) by applying RMD twice and construct the parse trees. If the
two parse trees are different, then the grammar is ambiguous.
iii. Obtain the string ‘w’ by LMD and obtain the same string ‘w’ by RMD, and construct
the parse tree for each derivation. If the two parse trees are different,
then the grammar is ambiguous.
Show that the following grammar is ambiguous for the string w = (()()()):
S → SS
S → (S) | ɛ
The given string has two different parse trees, obtained by applying LMD twice, so the grammar
is ambiguous.
Associativity and Precedence Priority in CFG:
Example:
E → E+E| E-E
E →E*E
E →a|b|c
Associativity:
Let us consider the string : a + b + c
Parse Tree for LMD1: Parse Tree for LMD2:
Two different parse trees exist because the grammar does not fix the associativity of the
operator. In the string a + b + c, the operand ‘b’ has an operator on either side, and the grammar
does not say which of them ‘b’ should be associated with: the operator on its left (left
associative) or the operator on its right (right associative). Under the normal left-associative
convention the first parse tree is the correct one, since there the leftmost ‘+’ is evaluated first.
How to resolve associativity:
E → E+E
E → a|b|c
Here the grammar does not restrict the direction of growth, ie: the parse tree can grow in either
the left direction or the right direction.
The first parse tree grows in the left direction, which means it is left associative. The
second parse tree grows in the right direction, ie: it is right associative.
The normal associativity rule is left associative, so we have to restrict the growth of the parse tree
in the right direction by modifying the above grammar as:
E →E+I|I
I→ a | b | c
The parse tree corresponding to the string: a+b+c:
The parse tree grows in the left direction since the grammar is left recursive, therefore it is
left associative. Only one parse tree exists for the given string, so the grammar is
unambiguous.
Note: For the operators to be left associative, grammar should be left recursive. Also for the
operators to be right associative, grammar should be right recursive.
Left Recursive grammar: A production in which the leftmost symbol of the body is same as
the non-terminal at the head of the production is called a left recursive production.
Example: E → E + T
Right Recursive grammar: A production in which the rightmost symbol of the body is same as
the non-terminal at the head of the production is called a right recursive production.
Example: E → T + E
The first parse tree is valid, because the higher precedence operator ‘*’ is evaluated before
‘+’. (See the lower level of the parse tree, where ‘*’ is evaluated first.) The second
parse tree is not valid, since the expression containing ‘+’ is evaluated first. So here we got two
parse trees because precedence is not taken care of.
So if we take care of the associativity and precedence of operators in a CFG, then the grammar is
un-ambiguous.
NOTE:
Normal precedence rule: If we have operators such as +, -, *, / and ↑ (exponentiation), then the
highest precedence operator ↑ is evaluated first.
Next the operators * and /, which have the next highest precedence, are evaluated. Finally the least
precedence operators + and – are evaluated.
Normal associativity rule: The grammar should be left associative.
Un-Ambiguous Grammar:
For a grammar to be un-ambiguous we have to resolve two properties:
i. Associativity of operators: resolved by writing the grammar with the appropriate (left or right) recursion.
ii. Precedence of operators: resolved by writing the grammar in different levels.
Is the following grammar ambiguous?
If the grammar is ambiguous, obtain the un-ambiguous grammar assuming normal precedence
and associativity.
E →E+E
E →E*E
E →E/E
E →E-E
E → (E ) | a | b| c
Answer:
Let us consider the string: a + b * c
LMD 1 for the string: a+b*c LMD 2 for the string: a+b*c
For the given string there exist two different parse trees, obtained by applying LMD twice. So the
above grammar is ambiguous.
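The number of parse trees can also be counted directly. The sketch below is an illustration that covers only the parenthesis-free part of the grammar, E → E op E | a | b | c with op in {+, -, *, /}; the function name `count_parse_trees` is an assumption of the example. A string is split at every operator that could label the root, and the counts for the two sides are multiplied.

```python
from functools import lru_cache

OPERATORS = set("+-*/")
OPERANDS = set("abc")

def count_parse_trees(s):
    """Number of parse trees of `s` under E -> E op E | a | b | c (no parentheses)."""
    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of trees deriving the substring s[i:j] from E.
        if j - i == 1:
            return 1 if s[i] in OPERANDS else 0
        total = 0
        for k in range(i + 1, j - 1):       # s[k] is the operator at the root
            if s[k] in OPERATORS:
                total += count(i, k) * count(k + 1, j)
        return total
    return count(0, len(s))

print(count_parse_trees("a+b*c"))
print(count_parse_trees("a+b"))
```

A count greater than 1 for any string, as for a+b*c, is enough to show that the grammar is ambiguous; a string with a single operator such as a+b has only one tree.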
The equivalent un-ambiguous grammar is obtained by writing all the operators as left associative
and writing the operators +, – at the first level and *, / at the next level.
Equivalent un-ambiguous grammar:
E →E+T|E–T|T
T →T*F|T/F|F
F → ( E) | a | b | c
Problem 1:
Consider the grammar:
S → aS | aSbS | ɛ
Is the above grammar ambiguous? Show in particular that the string ‘aab’ has two:
i. Parse trees
ii. Left Most Derivations
iii. Right Most Derivations.
Also obtain the equivalent un-ambiguous grammar.
OR
Define ambiguous grammar. Prove that the following grammar is ambiguous.
S → aS | aSbS| ɛ
LMD 1 for the string ‘aab’: LMD 2 for the string ‘aab’:
RMD 1 for the string ‘aab’: RMD 2 for the string ‘aab’:
The above grammar is ambiguous, since we are getting two parse trees for the same string ‘aab’
by applying LMD twice.
The equivalent un-ambiguous grammar:
S → aS | aAbS | ɛ
A → aAbA | ɛ
Problem 2:
Consider the grammar:
S → S +S | S * S | (S) | a
Show that string a + a * a has two
i. Parse trees
ii. Left Most Derivations
Find an un-ambiguous grammar G’ equivalent to G and show that L(G) = L(G’) and that G’ is
un-ambiguous.
Two different parse trees for the string a+a*a :
Two LMDs:
That means the evaluation starts from right side; therefore the operator is right associative.
Show that the following grammar is ambiguous. Also find the un-ambiguous grammar
equivalent to the grammar by normal precedence and associativity rules.
E → E + E | E - E
E → E * E | E / E
E → E ↑ E
E → ( E ) | a | b
Answer:
We already proved that the above grammar is ambiguous.
Equivalent un-ambiguous grammar:
E → E + T | E – T | T
T → T * F | T / F | F
F → G ↑ F | G
G → ( E ) | a | b
Show that the following grammar is ambiguous using the string ‘ibtibtaea’
S → iCtS | iCtSeS | a
C → b
Answer:
Here i stands for if, C stands for condition, S stands for statement, t stands for then and e stands
for else.
String w = ibtibtaea (if condition b then if condition b then assignment statement a else assignment
statement a).
The given string has two parse trees by applying LMD twice so the grammar is ambiguous.
Elimination of Ambiguity:
Normally an else is matched with the closest previous unmatched if.
A statement can be a matched pair or an unmatched pair.
A matched pair means the if statement has an else part, and an unmatched pair means the if
statement has no else part.
Hence the un-ambiguous grammar will be:
S →M|U
M → iCtMeM | a
U→ iCtS | iCtMeU
C→b
Simplification of CFG: Elimination of ε production rules
A variable A is a nullable variable if A ⇒* ε. If A is nullable, then a production of the form
B → A gives another nullable variable B.
Procedure:
1. For the given grammar G, identify all productions of the form A → ε; each such A is
a nullable variable.
2. Identify the productions of the form B → C1C2C3…Ck, where each Ci is
a nullable variable; then B is also nullable.
3. Construct a new CFG without ε-productions: add all non-ε-productions of the given grammar
to the new grammar. Then, for every production whose body contains nullable variables, add
every production obtained by omitting any combination of those nullable variables, without
creating a new ε-production.
Sometimes the language L(G) generated by the CFG contains ɛ and it is important to retain it.
The algorithm to handle this situation is as follows:
1. Let G’ be the grammar after eliminating ɛ-productions.
2. If the start symbol S of the original grammar G is a nullable variable then:
2.1 Create a new symbol S’ in G’
2.2 Add two production rules in G’ as S’ → ɛ and S’ → S where S is the start symbol of the
grammar after eliminating ɛ-rules.
3. Return G’
C → D|ε
D → d
Nullable variables:
Obviously B and C are directly nullable, since
B → ε
C → ε
Next we find A is nullable variable because of the rule
A → BC
Nullable variables are { A, B, C }
Thus the grammar without ε-productions is given by:
S → AbaC | Aba | baC | ba
A → BC | B | C
B → b
C → D
D → d
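The nullable-variable computation and the construction of the ε-free grammar can be sketched as follows. This is an illustration, not a definitive implementation: the function names are chosen for the example, bodies are stored as tuples with the empty tuple standing for ɛ, and the start rule S → AbaC is an assumption inferred from the resulting S productions listed above.

```python
from itertools import product

def nullable_variables(grammar):
    """Fixed point: a variable is nullable if some body consists only of nullable variables."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head not in nullable and any(all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    return nullable

def eliminate_epsilon(grammar):
    """Drop ɛ-bodies and add every variant obtained by omitting nullable occurrences."""
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = set()
        for body in bodies:
            # Each nullable symbol may be kept or dropped; other symbols are always kept.
            options = [((s,), ()) if s in nullable else ((s,),) for s in body]
            for choice in product(*options):
                variant = tuple(sym for part in choice for sym in part)
                if variant:                   # never add an ɛ-body
                    new_bodies.add(variant)
        result[head] = new_bodies
    return result

G = {
    "S": [("A", "b", "a", "C")],   # assumed start rule (see the note above)
    "A": [("B", "C")],
    "B": [("b",), ()],
    "C": [("D",), ()],
    "D": [("d",)],
}

print(sorted(nullable_variables(G)))
for head, bodies in eliminate_epsilon(G).items():
    print(head, "->", " | ".join("".join(b) for b in sorted(bodies)))
```

The printed productions can be compared with the grammar without ε-productions given above; note that B → ɛ and C → ɛ no longer appear.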
B → bBB | ε
S’ → S
S → AB | B | A
A → aAA | aA |a
B → bBB | bB|b
NOTE:
Sometimes the removal of ɛ-productions also adds production rules of the form A → B (unit
productions), which are of no use by themselves.
Example:
S → A
A → B
B → C
C → d
E → T|E+T
T → F|T*F
F → I | (E)
I → a | b| Ia | Ib | I0 | I1
The unit productions are:
E → T
T → F
F → I
The unit production of the form
F → I
results in a new non unit production for F such as F → a | b| Ia | Ib | I0 | I1 | (E)
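Unit productions can be eliminated for all variables at once by first finding, for each variable, the variables it can reach using unit productions only. The sketch below is an illustration under that approach (the function name `eliminate_unit_productions` and the tuple encoding of bodies are assumptions of the example), applied to the expression grammar above.

```python
def eliminate_unit_productions(grammar):
    """Replace unit productions A -> B by the non-unit bodies of every variable
    reachable from A through unit productions."""
    variables = set(grammar)

    def is_unit(body):
        return len(body) == 1 and body[0] in variables

    result = {}
    for head in grammar:
        # Variables reachable from head using only unit productions (head included).
        reachable, stack = {head}, [head]
        while stack:
            var = stack.pop()
            for body in grammar[var]:
                if is_unit(body) and body[0] not in reachable:
                    reachable.add(body[0])
                    stack.append(body[0])
        result[head] = {body for var in reachable
                        for body in grammar[var] if not is_unit(body)}
    return result

G = {
    "E": [("T",), ("E", "+", "T")],
    "T": [("F",), ("T", "*", "F")],
    "F": [("I",), ("(", "E", ")")],
    "I": [("a",), ("b",), ("I", "a"), ("I", "b"), ("I", "0"), ("I", "1")],
}

new = eliminate_unit_productions(G)
for head in G:
    print(head, "->", " | ".join("".join(b) for b in sorted(new[head])))
```

For F the result is (E) together with the six bodies of I, as stated above; T additionally gets T*F, and E additionally gets E+T.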
The variable S is generating because of the rule S → aBC, and A → aS makes A another generating
symbol. Thus the symbols which generate strings of terminals are { a, b, S, A, B, C }, and D is
not a generating symbol, so we can eliminate that symbol and its production rules.
By eliminating D and those productions involving D we get the grammar:
S → aAa | aBC
A → aS
B → aBa |b
C → abb
Identify the reachable symbols:
From the above grammar we observe that all the variables and terminals S, A, B, C, “a” and “b”
are reachable from S. So the resultant grammar without useless productions is:
S → aAa | aBC
A → aS
B → aBa |b
C → abb
For the grammar given below eliminate all useless productions
S → AB | CA
A → a
B → BC |AB
C → aB | b
Identify the generating symbols (Variables)
Initially we have
A → a
C → b
generate strings of terminals a and b, so { A, C } are the initial generating symbols.
Again from
S → CA
we can generate a string of terminals, so S is also a generating symbol.
Finally {S, A, C} are generating symbols and B is not a generating symbol, so we can eliminate
that symbol.
By eliminating B and those productions involving B we get the grammar:
S → CA
A → a
C → b
Identify the reachable symbols:
From the above grammar we observe that only S, A, C, “a” and “b” are reachable from S. So the
grammar without useless symbols and productions are:
S → CA
A → a
C → b
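The two steps used in this example, finding the generating symbols and then the reachable symbols, can be sketched as below. This is an illustration only: the function name `remove_useless` is chosen for the example, and it assumes that every variable appears as a key of the dictionary (a variable without productions would be listed with an empty list), so that any other symbol is treated as a terminal.

```python
def remove_useless(grammar, start):
    variables = set(grammar)

    # Step 1: generating variables, computed as a fixed point.
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            if head in generating:
                continue
            if any(all(s not in variables or s in generating for s in body) for body in bodies):
                generating.add(head)
                changed = True
    # Keep only productions that mention generating variables and terminals.
    pruned = {
        head: [b for b in bodies if all(s not in variables or s in generating for s in b)]
        for head, bodies in grammar.items() if head in generating
    }

    # Step 2: variables reachable from the start symbol in the pruned grammar.
    reachable, stack = set(), [start]
    while stack:
        var = stack.pop()
        if var in reachable or var not in pruned:
            continue
        reachable.add(var)
        stack.extend(s for body in pruned[var] for s in body if s in pruned)
    return {head: bodies for head, bodies in pruned.items() if head in reachable}

G = {
    "S": [("A", "B"), ("C", "A")],
    "A": [("a",)],
    "B": [("B", "C"), ("A", "B")],
    "C": [("a", "B"), ("b",)],
}

for head, bodies in remove_useless(G, "S").items():
    print(head, "->", " | ".join("".join(b) for b in bodies))
```

The order of the two steps matters: removing non-generating symbols first can make further symbols unreachable, so reachability is checked on the pruned grammar.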
For the grammar given below eliminate all useless productions
S → abA | bB
A → aA |d
B → bB
D → ab| Ea
E → aC | a
Identify the generating symbols (Variables)
A → d
D → ab
E → a
Initially { A,D, E } are generating symbols, from the rule
S → abA
S is also a generating symbol. Finally { S, A, D, E } are generating symbols and B is not a
generating symbol, so we can eliminate that symbol. Also eliminate the symbol C, which has no
production rule and is therefore not generating.
By eliminating B, C and those productions involving B and C we get the grammar:
S → abA
A → aA |d
D → ab| Ea
E → a
S → aAa
A → Sb | bCC
C → abb
E → ac
S → ABa | BC
A → aC | BCC
C → a
B → bcc
D → E
E → d
F → e
Note: In case the grammar has ε-productions, unit productions and useless productions,
perform the following operations one after the other.
1. Eliminate all ε-productions.
2. Eliminate all unit productions.
3. Finally eliminate all useless productions.
Remove all useless productions, unit productions and all ε-productions from the grammar
S → aA | aB
A → aaA| B | ε
B → b | bB
D → B
Nullable variables = { A }
After eliminating ε-productions, the given grammar becomes:
S → aA | a | aB
A → aaA| aa | B
B → b | bB
D → B
It has two unit productions, A → B and D → B.
After eliminating unit productions, the given grammar becomes:
S → aA | a | aB
A → aaA| aa | b |bB
B → b | bB
D → b | bB
From the above grammar, {S, A, B, D} are generating symbols.
The reachable symbols are S, A, B, a and b.
By eliminating the unreachable (useless) symbol D and its productions, the resultant grammar is:
S → aA | a | aB
A → aaA| aa | b |bB
B → b | bB
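The first two steps of the example above (ε-elimination, then unit-production elimination) can be reproduced with a short Python sketch. It assumes single-character symbols, upper-case variables, and the empty string "" standing for ε. The helper names are hypothetical.

```python
from itertools import product


def nullable_variables(grammar):
    """Variables that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in grammar.items():
            # An empty body is nullable because all() of nothing is True.
            if head not in nullable and any(all(s in nullable for s in b) for b in bodies):
                nullable.add(head)
                changed = True
    return nullable


def eliminate_epsilon(grammar):
    nullable = nullable_variables(grammar)
    result = {}
    for head, bodies in grammar.items():
        new_bodies = []
        for body in bodies:
            # Every nullable symbol may either be kept or dropped.
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result


def is_unit(body):
    return len(body) == 1 and body.isupper()


def eliminate_unit(grammar):
    result = {}
    for head in grammar:
        # Collect the variables reachable from head through unit productions only.
        closure, stack = [head], [head]
        while stack:
            v = stack.pop()
            for body in grammar.get(v, []):
                if is_unit(body) and body not in closure:
                    closure.append(body)
                    stack.append(body)
        # head inherits every non-unit body of those variables.
        bodies = []
        for v in closure:
            for body in grammar.get(v, []):
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result


g = {"S": ["aA", "aB"], "A": ["aaA", "B", ""], "B": ["b", "bB"], "D": ["B"]}
no_eps = eliminate_epsilon(g)
no_unit = eliminate_unit(no_eps)
print(no_eps)
print(no_unit)
```

The sketch does not add a new start symbol when the start variable itself is nullable, so it applies as written only when ε is not in the language, as in this example.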
D → CX
E → AD
F → AX
X → a
A → a| c | YC
B → c | YC
C → YC| c
X → a
Y → c
CNF Examples:
Convert the following grammar to CNF
S → aSb | ab | Aa
A → aab
The given grammar has no ε-productions, unit productions or useless productions.
After removing mixed productions:
S → XSY | XY | AX
A → XXY
X→a
Y→b
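Removing mixed productions, followed by the long-production step that is still needed for S → XSY and A → XXY, can be sketched in Python as below. The sketch assumes the input grammar has no ε-productions, unit productions or useless productions, and that symbols are single characters. The fresh names T_a, T_b, N1, N2 are hypothetical. T_a and T_b play the role of X and Y above.

```python
def to_cnf(grammar):
    """Convert a grammar (dict: variable -> list of body strings) to CNF.

    Output bodies are tuples of symbols, because fresh variable names
    are longer than one character.
    """
    variables = set(grammar)
    term_var = {}  # terminal -> fresh variable deriving only that terminal
    rules = {}

    # Step 1: in every body of length >= 2, replace terminals by variables.
    for head, bodies in grammar.items():
        rules[head] = []
        for body in bodies:
            if len(body) == 1:
                rules[head].append(tuple(body))  # A -> a is already in CNF
                continue
            new_body = []
            for s in body:
                if s in variables:
                    new_body.append(s)
                else:
                    new_body.append(term_var.setdefault(s, "T_" + s))
            rules[head].append(tuple(new_body))
    for terminal, var in term_var.items():
        rules[var] = [(terminal,)]

    # Step 2: break bodies longer than 2 into chains of binary rules.
    result = {}
    counter = 0
    for head, bodies in rules.items():
        result.setdefault(head, [])
        for body in bodies:
            current = head
            while len(body) > 2:
                counter += 1
                fresh = f"N{counter}"
                result.setdefault(current, []).append((body[0], fresh))
                current, body = fresh, body[1:]
            result.setdefault(current, []).append(body)
    return result


g = {"S": ["aSb", "ab", "Aa"], "A": ["aab"]}
cnf = to_cnf(g)
for head, bodies in cnf.items():
    print(head, "->", " | ".join(" ".join(b) for b in bodies))
```

Every rule in the result is either of the form variable → terminal or variable → two variables, which is the shape Chomsky Normal Form requires.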
To convert the resultant grammar into CNF, we need to remove mixed and long production rules:
After eliminating mixed productions, the resultant grammar is:
S → XACX |XCX| XAX| XX
A → a | c | YC
C→ YC |c
X→ a
Y→ c
After eliminating long production rules, the resultant grammar is in Chomsky Normal Form:
S → XF | XD | XE | XX
A → a | c | YC
C → YC | c
X → a
Y → c
D → CX
E → AX
F → AD
Begin with the grammar G:
S → ABC | BaB
A → aA| BaC | aaa
B → bBb | a | D
C → CA | AC
D → ε
i. Eliminate all ε - productions
ii. Eliminate any unit - productions in the resulting grammar
iii. Eliminate any useless - productions in the resulting grammar
iv. Put the resulting grammar into Chomsky Normal Form
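Nullable variables: {D, B}
After eliminating ε-productions, unit productions and useless productions (C is not generating, so
every rule containing C is dropped, which in turn makes A unreachable), the resultant grammar is: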
S → BaB| Ba|aB|a
B → bBb | bb| a
To convert the resultant grammar into CNF, we need to remove mixed and long production rules:
After eliminating mixed productions, the resultant grammar is:
S → BXB| BX | XB |a
B → YBY |YY| a
X → a
Y → b
After eliminating long production rules, the resultant grammar is in Chomsky Normal Form:
S → BZ| BX | XB |a
B → YP |YY| a
X → a
Y → b
Z → XB
P → BY
S → ASB | AB
A → XAS | XA | a
B → SYS | SY | YS | b | XAS | XA | a |YY
X → a
Y → b
After removing long production rules, the resultant grammar is in CNF:
S’ → ɛ
S’ → AC | AB
S → AC | AB
A → XD | XA | a
B → SE | SY | YS | b | XD | XA | a |YY
X → a
Y → b
C → SB
D → AS
E → YS
Convert the following CFG to CNF:
E → E + T | T * F | (E) | a | b | Ia |Ib | I0 | I1
T → T * F | (E) | a | b | Ia |Ib | I0 | I1
F → (E) | a | b | Ia |Ib | I0 | I1
I → a | b | Ia |Ib | I0 | I1
Solution:
The given grammar has no ε-productions, unit productions or useless productions, so after removing
mixed production rules the resultant grammar G is:
E → E P T | T M F | LER | a | b | IA |IB | IZ | IO
T → T M F | LER | a | b | IA |IB | IZ | IO
F → LER | a | b | IA |IB | IZ | IO
I → a | b | IA |IB | IZ | IO
A→a
B→b
Z→ 0
O→1
P→+
M→*
L→(
R →)
After removing the long productions, the resultant grammar is in CNF (Q is a new variable, since P
already stands for +):
E → E C | T D | LQ | a | b | IA | IB | IZ | IO
T → T D | LQ | a | b | IA | IB | IZ | IO
F → LQ | a | b | IA | IB | IZ | IO
I → a | b | IA | IB | IZ | IO
C → PT
D → MF
Q → ER
A → a
B → b
Z → 0
O → 1
P → +
M → *
L → (
R → )
Begin with the grammar G:
S → ABaC
A → BC
B → b |ε
C → D|ε
D → d
C → cCD
D → add
Now eliminating useless– productions, the resultant grammar is:
Generating symbols {a, d, B, D, A, S }, so eliminate symbol C and its rule.
Reachable symbols are { S, A, B, a} so eliminate symbol D and its rule. The resultant grammar:
S → a | aA | Aa
A → aB
B → Aa | a
After removing mixed rules:
S → a | XA | AX
A → XB
B → AX | a
X → a
Since there are no long productions, all the rules are already in CNF:
S → a | XA | AX
A → XB
B → AX | a
X → a
Convert the grammar G into CNF:
S → ABa
A → aab
B → Ac
The grammar in CNF:
S → AP
A → XR
B → AZ
X→a
Y→b
Z→ c
P → BX
R→ XY
Convert the grammar G into CNF:
S → aBa | abba
A → ab | AA
B → aB | a
There are no ε-productions or unit productions, but the grammar contains the useless symbol A, which
is not reachable from S. After removing useless productions the resultant grammar G is:
S → aBa | abba
B → aB | a
The grammar in CNF:
S → XP | XT
B → XB | a
X→a
Y→b
P → BX
R → YX
T → YR
Convert the grammar G into CNF:
S → ~S | [S∩S] | p |q ( S being the only variable)
After removing mixed productions:
S → NS | LSISR| p | q
N→ ~
I→ ∩
L→[
R →]
Grammar in CNF:
S → NS | LZ | p | q
X→SR
Y →IX
Z→SY
N→ ~
I→ ∩
L→[
R →]
Exercises
Let Σ = {a, b}. For the languages that are defined by each of the following grammars, do each of
the following:
i. List five strings that are in L.
ii. List five strings that are not in L
Answer:
a)
i. L = { ε, a, b, aaabbbb, ab }
ii. L = { ba, bbaa, bbbbba, ababab, aba }
b)
i. L = { a, b, aaa, bbabb, aaaabaaaa }
ii. L= {ε, ab, bbbbbbba, bb, bbbaaa}
c)
i .L ={ ε, a, aa, aaa, ba}
ii. There aren’t any over the alphabet {a, b}.
d)
i. L= {ε, a, aaa, aaba, aaaabbbb}
ii. L= {b, bbaa, abba, bb}
Consider the following grammar G:
S→0S1 |SS | 10
Show a parse tree produced by G for the string: 010110.
a is either an input symbol in Σ or a = ε, the empty string, which is assumed not to be an input
symbol. δ is the transition function, which maps K × (Σ ∪ {ε}) × Γ → 2^(K × Γ*).
A finite state control reads inputs, one symbol at a time. The PDA is allowed to observe the
symbol at the top of the stack and to base its transition on its current state, the input symbol and
the symbol at the top of stack. In one transition, the PDA:
1. Consumes the input symbol that it uses in the transition. If ε is used for the input, then
no input symbol is consumed.
2. Goes to a new state, which may or may not be the same as the previous state.
3. Replaces the symbol at the top of the stack by any string. The string could be ε, which
corresponds to a pop of the stack. It could be the same symbol that appeared at the top of
the stack previously, which leaves the stack unchanged.
A pushdown automaton chooses a transition by indexing a table by input symbol, current state, and the
symbol at the top of the stack. In a deterministic PDA these three parameters completely determine the
transition that is chosen; in a nondeterministic PDA they determine a set of possible transitions.
Finally the given input string is either accepted by reaching some final state or rejected, as
decided by the output unit.
(q, aw, Zα) ⊢* (p, w, βα) means that the current configuration of the PDA is (q, aw, Zα) and
after applying zero or more transitions, the PDA enters the new configuration
(p, w, βα).
Acceptance by final state: let M = ( K, Σ, Γ, δ, s, Z0, A ) be a PDA. Then the language accepted by
PDA M by final state is L(M) = { w | (q0, w, Z0) ⊢* (q, ε, α) } for some state q in A, start state s =
q0 and any stack string α.
Acceptance by empty stack: let M = ( K, Σ, Γ, δ, s, Z0, A ) be a PDA. Then the language
accepted by PDA M by empty stack is L(M) = { w | (q0, w, Z0) ⊢* (q, ε, ε) } from s = q0 to any
state q. That is, L(M) is the set of inputs w that PDA M can consume and at the same time empty
its stack.
Logic: Since the language contains strings of ‘n’ number of a’s followed by ‘n’ number of b’s, the
machine can read n number of a’s in the start state. Push each scanned input symbol ‘a’ onto
the stack. When the machine encounters the input symbol ‘b’, for each ‘b’ there
should be a corresponding symbol ‘a’ on the stack, which is popped. Finally, if there is no input (ε)
and the stack is empty (only Z0 remains), the string scanned has n number of a’s followed by n number of b’s.
PDA to accept L = { a^n b^n | n ≥ 0 } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, a, Z0) = (q0, aZ0)
δ(q0, a, a) = (q0, aa)
δ(q0, b, a) = (q1, ε)
δ(q1, b, a) = (q1, ε)
δ(q1, ε, Z0) = (qf, Z0)
δ(q0, ε, Z0) = (qf, Z0) (for the minimum string; when n = 0)
K = { q0, q1, qf }, s = q0 is the start state, Z0 is the initial stack symbol,
Σ = { a, b }, Γ = { a, Z0 } and A = { qf }
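To check a transition function like the one above against sample strings, a small simulator for acceptance by final state can be sketched in Python. This is an illustrative sketch under stated assumptions: transitions are stored in a dictionary keyed by (state, input symbol or "" for ε, top of stack), the stack is a tuple whose first element is the top (matching the notation aZ0), and the search is cut off after a fixed number of configurations. The function name accepts and the cut-off parameter are hypothetical.

```python
from collections import deque


def accepts(delta, start, z0, finals, word, max_configs=10_000):
    """Breadth-first search over configurations (state, input position, stack)."""
    start_cfg = (start, 0, (z0,))
    seen = {start_cfg}
    queue = deque([start_cfg])
    while queue and len(seen) <= max_configs:
        state, i, stack = queue.popleft()
        if i == len(word) and state in finals:
            return True  # all input consumed in a final state
        if not stack:
            continue  # no move is possible on an empty stack
        top, rest = stack[0], stack[1:]
        moves = []
        if i < len(word):
            moves += [(i + 1, m) for m in delta.get((state, word[i], top), ())]
        moves += [(i, m) for m in delta.get((state, "", top), ())]  # ε-moves
        for j, (new_state, push) in moves:
            cfg = (new_state, j, tuple(push) + rest)
            if cfg not in seen:
                seen.add(cfg)
                queue.append(cfg)
    return False


# Transition function of the PDA for a^n b^n, n >= 0, as listed above.
delta = {
    ("q0", "a", "Z0"): {("q0", ("a", "Z0"))},
    ("q0", "a", "a"): {("q0", ("a", "a"))},
    ("q0", "b", "a"): {("q1", ())},
    ("q1", "b", "a"): {("q1", ())},
    ("q1", "", "Z0"): {("qf", ("Z0",))},
    ("q0", "", "Z0"): {("qf", ("Z0",))},
}

for w in ["", "ab", "aabb", "aab", "abb", "ba"]:
    print(repr(w), accepts(delta, "q0", "Z0", {"qf"}, w))
```

Storing each entry as a set of moves lets the same simulator handle nondeterministic PDAs, where one key has several possible moves.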
Graphical representation ( Transition diagram) :
Logic: Push every scanned input symbol ‘(’ onto the stack. When the machine encounters the input
symbol ‘)’, there should be a corresponding symbol ‘(’ on the stack, which is popped. If a ‘(’
arrives after some ‘)’ have been matched (as in ‘()()’), the machine goes back to q0 and pushes it.
Finally, if there is no input (ε) and the stack is empty (only Z0 remains), every ‘(’ has been
matched by a later ‘)’, so the string is balanced.
PDA to accept L = { w | w ∈ { (, ) }* where w is a string of balanced parentheses } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, (, Z0) = (q0, (Z0)
δ(q0, (, ( ) = (q0, (( )
δ(q0, ), ( ) = (q1, ε)
δ(q1, ), ( ) = (q1, ε)
δ(q1, (, ( ) = (q0, (( )
δ(q1, (, Z0) = (q0, (Z0)
δ(q1, ε, Z0) = (qf, Z0)
δ(q0, ε, Z0) = (qf, Z0) (for the empty string)
K = { q0, q1, qf }, s = q0 is the start state, Z0 is the initial stack symbol,
Σ = { (, ) }, Γ = { (, Z0 } and A = { qf }
Design a PDA to accept the language L = { a^n b^2n | n ≥ 0 }. Draw the transition diagram and also
write the moves made by the PDA for the string ‘aabbbb’.
Logic: Since the language contains strings of ‘n’ number of a’s followed by ‘2n’ number of b’s, the machine
can read n number of a’s in the start state. For each input symbol ‘a’, push two ‘a’s onto the stack. When the
machine encounters the input symbol ‘b’, for each ‘b’ there should be a
corresponding symbol ‘a’ on the stack, which is popped. Finally, if there is no input (ε) and the stack is empty,
the string scanned has n number of a’s followed by 2n number of b’s.
Design a PDA to accept the language L = { 0^2n 1^n | n ≥ 1 }. Draw the transition diagram for the
constructed PDA. Also show the moves made by the PDA for the string ‘000011’.
Logic: Since the language contains strings of ‘2n’ number of 0’s followed by ‘n’ number of 1’s, the machine
can read 2n number of 0’s in the start state. In the start state q0, push each scanned input symbol ‘0’
onto the stack. When it reads ‘1’ in q0, change the state to q1 and pop one ‘0’ from the stack. In state q1,
without consuming any input (ε), change the state to q2 and pop one more ‘0’ from the stack. In state q2,
when the machine reads the input symbol ‘1’, change the state to q1 and pop one ‘0’ from the stack; this
process is repeated. When there is no more input (ε) in state q2 and the stack is empty, change the
state to the final state qf. This indicates that the string scanned has 2n number of 0’s followed by n
number of 1’s.
Design a PDA to accept the language L = { w | w ∈ (a + b)* and Na(w) = Nb(w) }. Draw the
transition diagram for the constructed PDA. Also show the moves made by the PDA for the string
‘abbaaabb’.
Procedure: The first scanned input symbol is either ‘a’ or ‘b’; push that symbol onto the stack. From
this point onwards, if the scanned input symbol and the top of stack symbol are the same, push the
current input symbol onto the stack. If the input symbol and the top of stack symbol are different, pop
one symbol from the stack. Repeat the process. Finally, when the end of the string is encountered, if the
stack is empty, the string w has an equal number of a’s and b’s; otherwise the numbers of a’s and b’s are
different.
PDA to accept L = { w | w ∈ (a + b)* and Na(w) = Nb(w) } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, a, Z0) = (q0, aZ0)
δ(q0, b, Z0) = (q0, bZ0)
δ(q0, a, a) = (q0, aa)
δ(q0, b, b) = (q0, bb)
δ(q0, a, b) = (q0, ε)
δ(q0, b, a) = (q0, ε)
δ(q0, ε, Z0) = (qf, Z0)
K = { q0, qf }, s = q0 is the start state, Z0 is the initial stack symbol,
Σ = { a, b }, Γ = { a, b, Z0 } and A = { qf }
Transition diagram:
Design a PDA to accept the language L = { w | w ∈ (a + b)* and Na(w) > Nb(w) }. Draw the
transition diagram for the constructed PDA. Also show the moves made by the PDA for the string
‘baaabbaa’.
Note: The procedure remains the same as in the previous problem; only the final-state transitions
change. Once the end of the input string is encountered (ε), the stack should contain at least one
‘a’. From this point onwards change the state to q1 and keep popping the symbol ‘a’ from the stack until
the stack is empty. When the stack is empty (Z0), the input is already empty, so go to the final state and
accept the string.
PDA to accept L = { w | w ∈ (a + b)* and Na(w) > Nb(w) } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, a, Z0) = (q0, aZ0)
δ(q0, b, Z0) = (q0, bZ0)
δ(q0, a, a) = (q0, aa)
δ(q0, b, b) = (q0, bb)
δ(q0, a, b) = (q0, ε)
δ(q0, b, a) = (q0, ε)
δ(q0, ε, a) = (q1, ε)
δ(q1, ε, a) = (q1, ε)
δ(q1, ε, Z0) = (qf, Z0)
K = { q0, q1, qf }, s = q0 is the start state, Z0 is the initial stack symbol,
Σ = { a, b }, Γ = { a, b, Z0 } and A = { qf }
Transition diagram:
Design a PDA to accept the language L = { w | w ∈ (a + b)* and Na(w) < Nb(w) }. Draw the
transition diagram for the constructed PDA. Also show the moves made by the PDA for the string
‘aabbbbab’.
Note: The procedure remains the same as that of the Na(w) = Nb(w) problem; only the final-state
transitions change. Once the end of the input string is encountered (ε), the stack should contain
at least one ‘b’. From this point onwards change the state to q1 and keep popping the symbol ‘b’ from the
stack until the stack is empty. When the stack is empty (Z0), the input is already empty, so go to the
final state and accept the string.
PDA to accept L = { w | w ∈ (a + b)* and Na(w) < Nb(w) } is given by:
M = ( K, Σ, Γ , δ, s, Z0, A ) where δ is given by
δ(q0, a, Z0) = (q0, aZ0)
δ(q0, b, Z0) = (q0, bZ0)
Design a PDA to accept the language L = { wCw^R | w ∈ (a + b)* }. Draw the transition diagram and
also write the moves made by the PDA for the string “baaCaab”.
Logic: To check for a palindrome, push all scanned input symbols onto the stack until we encounter
the letter C. Once we pass the middle of the string, if the string is a palindrome, for each scanned
input symbol there should be a corresponding symbol (same as the input symbol) on the stack. Finally, if
there is no input and the stack is empty, the given string is a palindrome.
δ(q1, a, a) = (q1, ε)
δ(q1, b, b) = (q1, ε)
δ(q1, ε, Z0) = (qf, Z0)
K = { q0, q1, qf }, s = q0 is the start state, Z0 is the initial stack symbol; Σ = { a, b, C}, Γ = { a, b, Z0 }
and A = { qf }
Transition diagram:
Design an NPDA to accept the language L = { ww^R | w ∈ (a + b)* }. Draw the transition diagram and
also write the moves made by the PDA for the string “baaaab”.
Logic: To check for a palindrome, push all scanned input symbols onto the stack until we reach
the midpoint. There is no marker for the midpoint, so the NPDA guesses it nondeterministically. Once we
pass the middle of the string, if the string is a palindrome, for each scanned input symbol there should
be a corresponding symbol (same as the input symbol) on the stack. Finally, if there is no input and the
stack is empty, the given string is a palindrome.
In state q0 the machine pushes each input symbol onto the stack, whatever the top of stack symbol is, as
long as the midpoint has not been reached. When it guesses that the midpoint has been reached, it changes
to state q1 without consuming input, and from then on pops one symbol from the stack for each matching
input symbol.
PDA to accept L = { ww^R | w ∈ (a + b)* } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, a, Z0) = (q0, aZ0)
δ(q0, b, Z0) = (q0, bZ0)
δ(q0, ε, Z0) = (qf, Z0)
δ(q0, a, a) = (q0, aa)
δ(q0, b, b) = (q0, bb)
δ(q0, a, b) = (q0, ab)
δ(q0, b, a) = (q0, ba)
δ(q0, ε, a) = (q1, a)
δ(q0, ε, b) = (q1, b)
δ(q1, a, a) = (q1, ε)
δ(q1, b, b) = (q1, ε)
δ(q1, ε, Z0) = (qf, Z0)
K = { q0, q1, qf }, s = q0 is the start state, Z0 is the initial stack symbol; Σ = { a, b }, Γ = { a, b, Z0 }
and A = { qf }
Moves made by the PDA for the string baaaab:
(q0, baaaab, Z0) ⊢ (q0, aaaab, bZ0) ⊢ (q0, aaab, abZ0) ⊢ (q0, aab, aabZ0) ⊢ (q1, aab, aabZ0) ⊢ (q1, ab,
abZ0) ⊢ (q1, b, bZ0) ⊢ (q1, ε, Z0) ⊢ (qf, ε, Z0)
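The sequence of moves above can be recovered mechanically by a backtracking search over the nondeterministic choices. The Python sketch below is illustrative only. It assumes the transition function is a dictionary keyed by (state, input symbol or "" for ε, top of stack), that the stack is a tuple with the top first, and that the PDA has no cycle of ε-moves (true for this machine, where ε-moves only lead from q0 to q1 or to qf). The name find_run is hypothetical.

```python
def find_run(delta, state, word, stack, finals):
    """Return the list of configurations of one accepting run, or None."""
    config = (state, word, stack)
    if not word and state in finals:
        return [config]
    if not stack:
        return None
    top, rest = stack[0], stack[1:]
    options = []
    if word:  # moves that consume the next input symbol
        options += [(word[1:], m) for m in delta.get((state, word[0], top), [])]
    # ε-moves, e.g. the guess "the midpoint is here"
    options += [(word, m) for m in delta.get((state, "", top), [])]
    for remaining, (new_state, push) in options:
        run = find_run(delta, new_state, remaining, push + rest, finals)
        if run is not None:
            return [config] + run
    return None  # every choice failed: backtrack


delta = {
    ("q0", "a", "Z0"): [("q0", ("a", "Z0"))],
    ("q0", "b", "Z0"): [("q0", ("b", "Z0"))],
    ("q0", "", "Z0"): [("qf", ("Z0",))],
    ("q0", "a", "a"): [("q0", ("a", "a"))],
    ("q0", "b", "b"): [("q0", ("b", "b"))],
    ("q0", "a", "b"): [("q0", ("a", "b"))],
    ("q0", "b", "a"): [("q0", ("b", "a"))],
    ("q0", "", "a"): [("q1", ("a",))],
    ("q0", "", "b"): [("q1", ("b",))],
    ("q1", "a", "a"): [("q1", ())],
    ("q1", "b", "b"): [("q1", ())],
    ("q1", "", "Z0"): [("qf", ("Z0",))],
}

run = find_run(delta, "q0", "baaaab", ("Z0",), {"qf"})
for state, remaining, stack in run:
    print(f"({state}, {remaining or 'ε'}, {''.join(stack)})")
```

For “baaaab” only the guess made after reading “baa” leads to acceptance, so the run found is the one listed above; a string such as “ab” has no accepting run and the function returns None.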
Transition diagram:
Design a PDA to accept the language L = { 0^n 1^m 0^n | m, n ≥ 1 }. Draw the transition diagram and
also write the moves made by the PDA for the string “0011100”.
Procedure: Initially (q0) the machine reads n number of 0’s and pushes each scanned input symbol ‘0’ onto
the stack. When the machine reads ‘1’ in the start state, it changes the state to q1 and does not alter the
content of the stack. In state q1 the machine reads the remaining 1’s and ignores them. When the machine
reads ‘0’ in state q1, for each scanned input symbol ‘0’ there should be a corresponding symbol ‘0’ on the
stack, so it changes the state to q2 and pops one ‘0’ from the stack; in q2 it pops one ‘0’ for every
further ‘0’ read. Finally, if there is no input (ε) and the stack is empty, the string w has n number of
0’s followed by ‘m’ number of 1’s followed by ‘n’ number of 0’s.
PDA to accept L = { 0^n 1^m 0^n | m, n ≥ 1 } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, 0, Z0) = (q0, 0Z0)
δ(q0, 0, 0) = (q0, 00)
δ(q0, 1, 0) = (q1, 0)
δ(q1, 1, 0) = (q1, 0)
δ(q1, 0, 0) = (q2, ε)
δ(q2, 0, 0) = (q2, ε)
Design a PDA to accept the language L = { 0^n 1^m 0^m 1^n | m, n ≥ 1 }. Draw the transition diagram and
also write the moves made by the PDA for the string “0011100011”.
Procedure: Initially (q0) the machine reads n number of 0’s and pushes each scanned input symbol ‘0’ onto
the stack. When the machine reads ‘1’ in the start state, it changes the state to q1 and pushes that input
symbol onto the stack. In state q1 the machine reads the remaining 1’s and pushes each of them onto the
stack. When the machine reads ‘0’ in state q1, for each scanned input symbol ‘0’ there should be a
corresponding symbol ‘1’ on the stack, so it changes the state to q2 and pops one ‘1’ from the stack. In q2
the machine reads the remaining 0’s, and for each scanned input symbol ‘0’ there should be a
corresponding symbol ‘1’ on the stack, which is popped. If the machine reads ‘1’ in q2, it changes the state
to q3; for each scanned input symbol ‘1’ there should be a corresponding symbol ‘0’ on the stack, which is
popped. In q3 the machine reads the remaining 1’s and each time pops one ‘0’ from the stack. Finally, in q3,
if there is no input (ε) and the stack is empty, the string w has n number of 0’s followed by ‘m’ number
of 1’s followed by ‘m’ number of 0’s followed by ‘n’ number of 1’s.
PDA to accept L = { 0^n 1^m 0^m 1^n | m, n ≥ 1 } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
Is the PDA that accepts the language L = { w | w ∈ (a + b)* and Na(w) = Nb(w) } by final
state deterministic?
PDA to accept L = { w | w ∈ (a + b)* and Na(w) = Nb(w) } is given by:
M = ( K, Σ, Γ, δ, s, Z0, A ) where δ is given by
δ(q0, a, Z0) = (q0, aZ0)
δ(q0, b, Z0) = (q0, bZ0)
δ(q0, a, a) = (q0, aa)
δ(q0, b, b) = (q0, bb)
δ(q0, a, b) = (q0, ε)
δ(q0, b, a) = (q0, ε)
δ(q0, ε, Z0) = (qf, Z0)
K = { q0, qf }, s = q0 is the start state, Z0 is the initial stack symbol,
Σ = { a, b }, Γ = { a, b, Z0 } and A = { qf }
1. The first condition for a PDA to be deterministic is that δ(q, a, X) has at most one element
for each q in K, a in Σ ∪ {ε} and X in Γ. In the above transition functions every δ(q, a, X) that is
defined contains exactly one transition, so the first condition is satisfied.
2. The second condition is that if δ(q, a, X) is defined (non-empty) for some a in Σ, then δ(q, ε, X)
must not be defined. Here δ(q0, a, Z0) = (q0, aZ0) is defined and δ(q0, ε, Z0) = (qf, Z0) is also
defined, so the second condition fails.
Since the second condition fails, the given PDA is non-deterministic.
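The two conditions can be checked mechanically. The following Python sketch assumes the transition function is a dictionary keyed by (state, input symbol or "" for ε, top of stack) whose values are lists of moves. The function name violations and the result format are hypothetical.

```python
def violations(delta):
    """List every place where one of the two determinism conditions fails."""
    found = []
    for (state, symbol, top), moves in delta.items():
        # Condition 1: at most one move for each (state, input, stack top).
        if len(moves) > 1:
            found.append(("condition 1", state, symbol, top))
        # Condition 2: an input move and an ε-move must not share (state, top).
        if symbol != "" and moves and delta.get((state, "", top)):
            found.append(("condition 2", state, symbol, top))
    return found


# PDA for Na(w) = Nb(w) as listed above; each move is (state, pushed string).
delta = {
    ("q0", "a", "Z0"): [("q0", "aZ0")],
    ("q0", "b", "Z0"): [("q0", "bZ0")],
    ("q0", "a", "a"): [("q0", "aa")],
    ("q0", "b", "b"): [("q0", "bb")],
    ("q0", "a", "b"): [("q0", "")],
    ("q0", "b", "a"): [("q0", "")],
    ("q0", "", "Z0"): [("qf", "Z0")],
}

problems = violations(delta)
print(problems)
print("deterministic" if not problems else "non-deterministic")
```

For this PDA the check reports condition 2 twice, once for input a and once for input b with Z0 on top of the stack, which agrees with the argument above.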
• The appearance of a terminal symbol c on the top of the stack means that G is attempting to
generate c.
• M only wants to pursue paths that generate its input string w. So at that point, it pops the top
symbol off the stack, reads its next input character, and compares the two.
• If they match, the derivation that M is pursuing is consistent with generating w and the
process continues.
• If they don't match, the path that M is currently following ends without accepting.
• So at each step, PDA M either applies a grammar rule, without consuming any input, or it
reads an input character and pops one terminal symbol off the stack.
• When the stack is empty and PDA M has read all the characters of string w, G can generate
string w, so PDA M accepts.
Conversion of CFG to PDA using Top Down Parsing approach: Procedure
Formally, PDA M = ( {p, q, qf}, Σ, Γ, δ, p, Z0, {qf} )
where δ contains:
1. The start-up transition δ(p, ε, Z0) = (q, S), which pushes the start symbol of G onto the stack
and goes to state q.
2. For each production rule in G of the form
X → γ1 γ2 γ3 … γn
the transition in state q:
δ(q, ε, X) = (q, γ1 γ2 γ3 … γn)
3. For each terminal symbol or character c ∈ Σ, the transition:
δ(q, c, c) = (q, ε)
Construct an NPDA corresponding to the grammar given below by using Top down parsing
S → aA
A → aABC | bB | a
B → b
C → c
Show the sequence of moves made by NPDA in processing the string ‘aaabc’
For each Non terminal S, A, B and C, the transition functions are:
δ(q, ε, S) = { (q,aA) }
δ(q, ε, A) = { (q,aABC), (q,bB), (q,a) }
δ(q, ε, B) = { (q, b) }
δ(q, ε, C) = { (q, c) }
For each terminal a,b and c, the transition functions are:
δ(q, a, a) = { (q, ε) }
δ(q, b, b) = { (q, ε) }
δ(q, c, c) = { (q, ε) }
Sequence of moves for the string aaabc:
ID: (q, aaabc, S) ⊢ (q, aaabc, aA) ⊢ (q, aabc, A) ⊢ (q, aabc, aABC) ⊢ (q, abc, ABC) ⊢ (q, abc, aBC)
⊢ (q, bc, BC) ⊢ (q, bc, bC) ⊢ (q, c, C) ⊢ (q, c, c) ⊢ (q, ε, ε)
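The top-down construction and the acceptance test on ‘aaabc’ can be sketched in Python. The sketch below assumes single-character symbols, a stack written as a string with the top at the left, and a grammar without ε-productions. Under that last assumption a configuration whose stack is longer than the remaining input can be discarded, which keeps the search finite. The start-up transition is skipped: the search begins in state q with S on the stack. The function names are hypothetical.

```python
def cfg_to_pda(grammar, terminals):
    """Build the transitions of state q: expand variables, match terminals."""
    delta = {}
    for head, bodies in grammar.items():
        delta[("q", "", head)] = [("q", body) for body in bodies]
    for c in terminals:
        delta[("q", c, c)] = [("q", "")]
    return delta


def accepts_by_empty_stack(delta, start_symbol, word):
    frontier = [(word, start_symbol)]  # configurations (remaining input, stack)
    seen = set(frontier)
    while frontier:
        remaining, stack = frontier.pop()
        if not remaining and not stack:
            return True
        # Without ε-productions each stack symbol yields at least one terminal.
        if not stack or len(stack) > len(remaining):
            continue
        top, rest = stack[0], stack[1:]
        # Expand a variable on top of the stack (no input consumed) ...
        successors = [(remaining, body + rest) for _, body in delta.get(("q", "", top), [])]
        # ... or match a terminal on top of the stack against the input.
        if ("q", remaining[0], top) in delta:
            successors.append((remaining[1:], rest))
        for cfg in successors:
            if cfg not in seen:
                seen.add(cfg)
                frontier.append(cfg)
    return False


grammar = {"S": ["aA"], "A": ["aABC", "bB", "a"], "B": ["b"], "C": ["c"]}
delta = cfg_to_pda(grammar, "abc")
for key, moves in delta.items():
    print(key, "->", moves)
print(accepts_by_empty_stack(delta, "S", "aaabc"))
```

Each expansion step corresponds to applying one grammar rule in a leftmost derivation, so an accepting run spells out a leftmost derivation of the input.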
Obtain a PDA equivalent to the following grammar using Top down parsing approach.
E → E+E
E → E*E
E → E/E
E → E–E
E → (E) | a | b
For the non-terminal E the transition functions are:
δ(q, ε, E) = { (q, E + E), (q, E * E), (q, E / E), (q, E - E), (q, (E)), (q, a), (q, b) }
For each terminal +, *, /, -, (, ), a and b the transition functions are:
δ(q, +, +) = { (q, ε) }
δ(q, *, *) = { (q, ε) }
δ(q, /, /) = { (q, ε) }
δ(q, -, -) = { (q, ε) }
δ(q, (, ( ) = { (q, ε) }
δ(q, ), ) ) = { (q, ε) }
δ(q, a, a) = { (q, ε) }
δ(q, b, b) = { (q, ε) }
Obtain a PDA equivalent to the following grammar.
S → aABB
A → aBB | a
B → bBB | A
C → a
For each Non terminal S, A,B and C, the transition functions are:
δ(q, ε, S) = { (q, aABB) }
δ(q, ε, A) = { (q, aBB) , (q, a) }
δ(q, ε, B) = { (q, bBB) , (q, A) }
δ(q, ε, C) = { (q, a) }
For each terminal a and b the transition functions are:
δ(q, a, a) = { (q, ε) }
δ(q, b, b) = { (q, ε) }
Obtain a PDA equivalent to the following grammar that accept the same language by empty stack.
S → 0S1 | A
A → 1A0 | S | ε
For each Non terminal S, and A, the transition functions are:
δ(q, ε, S) = { (q, 0S1) , (q, A) }
δ(q, ε, A) = { (q, 1A0) , (q, S), (q, ε) }
For each terminal 0 and 1 the transition functions are:
δ(q, 0, 0) = { (q, ε) }
δ(q, 1, 1) = { (q, ε) }
Conversion of CFG to PDA using Bottom Up Parsing approach:
• PDA M can read an input symbol and shift it onto the stack.
• Whenever a sequence of elements at the top of the stack matches, in reverse, the right-hand
side of some rule in P, M can pop that sequence and replace it by the left-hand side of that rule.
When this happens, we say that M has reduced by that rule.
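The shift and reduce moves described above can be tried out with a small nondeterministic search in Python. This is only a sketch: it assumes single-character symbols and a grammar without ε-productions, and it writes the stack as a string with the top at the right end, so a right-hand side appears on the stack in its original order rather than reversed. The function name is hypothetical.

```python
def shift_reduce_accepts(grammar, start, word):
    """Search over (remaining input, stack); accept on empty input and stack == start."""
    rules = [(head, body) for head, bodies in grammar.items() for body in bodies]
    frontier = [(word, "")]
    seen = set(frontier)
    while frontier:
        remaining, stack = frontier.pop()
        if not remaining and stack == start:
            return True
        successors = []
        if remaining:
            # Shift: move the next input symbol onto the stack.
            successors.append((remaining[1:], stack + remaining[0]))
        for head, body in rules:
            # Reduce: the top of the stack holds the right-hand side of a rule.
            if body and stack.endswith(body):
                successors.append((remaining, stack[: len(stack) - len(body)] + head))
        for cfg in successors:
            if cfg not in seen:
                seen.add(cfg)
                frontier.append(cfg)
    return False


# Grammar S -> aA, A -> aA | bA | a | b
grammar = {"S": ["aA"], "A": ["aA", "bA", "a", "b"]}
for w in ["ab", "aba", "a", "ba"]:
    print(w, shift_reduce_accepts(grammar, "S", w))
```

Because no rule has an empty right-hand side, the stack never grows beyond the number of shifted symbols, so the set of configurations is finite and the search terminates.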
Obtain a PDA equivalent to the following grammar using Bottom up parsing approach.
S → aA
A → aA | bA | a | b
PDA M = ( {p, q}, Σ, Γ, δ, p, Z0, {q} ) where δ contains:
Obtain a PDA equivalent to the following grammar using Bottom Up parsing approach.
S → aABB
A → aBB | a
B → bBB | A
C → a|ε
PDA M = ( {p, q}, Σ, Γ, δ, p, Z0, {q} ) where δ contains:
Obtain a PDA equivalent to the following grammar using Bottom Up parsing approach
E → E+E
E → E*E
E → E/E
E → E–E
E → (E) | a | b
Find a CFG that generates the language accepted by NPDA P = ( {q0, q1}, {a, b}, {A, Z}, δ, q0, Z,
{q1} )
with transitions δ(q0, a, Z) = { (q0, AZ) }
δ(q0, b, A) = { (q0, AA) }
δ(q0, a, A) = { (q1, ε) }
Solution: For the transition function δ(q0, a, A) = { (q1, ε) } introduce the production
(q0 A q1) → a
For the transitions δ(q0, a, Z) = { (q0, AZ) }
δ(q0, b, A) = { (q0, AA) } introduce the productions
(q0 Z q0) → a (q0 A q0) (q0 Z q0) | a (q0 A q1) (q1 Z q0)
(q0 Z q1) → a (q0 A q0) (q0 Z q1) | a (q0 A q1) (q1 Z q1)
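The productions for each transition can be enumerated with a short Python sketch. It assumes the transition function is a dictionary keyed by (state, input symbol, top of stack) with moves of the form (next state, tuple of pushed symbols), and it represents a grammar variable (q X p) as a Python tuple. It generates only the productions that come from transitions, not the start productions. The function name is hypothetical.

```python
from itertools import product


def pda_to_cfg(states, delta):
    """For a move (r, Y1...Yk) in δ(q, a, X), emit
    (q X pk) -> a (r Y1 p1)(p1 Y2 p2)...(p(k-1) Yk pk) for all states p1..pk."""
    productions = []
    for (q, a, X), moves in delta.items():
        for r, push in moves:
            for chain in product(states, repeat=len(push)):
                ends = (r,) + chain  # r, p1, ..., pk
                head = (q, X, ends[-1])
                body = (a,) + tuple(
                    (ends[i], push[i], ends[i + 1]) for i in range(len(push))
                )
                productions.append((head, body))
    return productions


def show(symbol):
    # Variables are triples, terminals are plain strings.
    return "(" + " ".join(symbol) + ")" if isinstance(symbol, tuple) else symbol


states = ["q0", "q1"]
delta = {
    ("q0", "a", "Z"): [("q0", ("A", "Z"))],
    ("q0", "b", "A"): [("q0", ("A", "A"))],
    ("q0", "a", "A"): [("q1", ())],
}

productions = pda_to_cfg(states, delta)
for head, body in productions:
    print(show(head), "→", " ".join(show(s) for s in body))
```

A move that pushes k symbols yields one production for every choice of k intermediate states, so each of the two pushing transitions here contributes four productions and the popping transition contributes one.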