
Unit Iii

Uploaded by

dhurgadevi
Copyright
© All Rights Reserved

UNIT III CONTEXT FREE GRAMMAR AND CONTEXT FREE LANGUAGE

Context Free Grammars and Languages– Derivations – Ambiguity- Relationship between derivation
and derivation trees – Pumping Lemma for CFL – Problems based on Pumping Lemma -
Simplification of CFG – Elimination of Useless symbols - Unit productions - Null productions

CONTEXT FREE GRAMMARS AND LANGUAGES


CFG stands for context-free grammar. It is a formal grammar used to generate all possible
strings of a given formal language. A context-free grammar G is defined by a 4-tuple:
G = (V, T, P, S)
Where,
G is the grammar, which consists of a set of production rules. It is used to generate the strings
of a language.
T is a finite set of terminal symbols. They are denoted by lower case letters.

V is a finite set of non-terminal symbols. They are denoted by capital letters.


P is a set of production rules. Each rule replaces a non-terminal symbol (on the left side
of the production) with a string of terminal and/or non-terminal symbols (on the right side of
the production).
S is the start symbol, from which every derivation begins. We derive a string by repeatedly
replacing a non-terminal by the right-hand side of one of its productions until all non-terminals
have been replaced by terminal symbols.
Example 1:
Construct the CFG for the language having any number of a's over the set ∑= {a}.
Solution:
As we know the regular expression for the above language is
1. r.e. = a*
Production rule for the Regular expression is as follows:
1. S → aS rule 1
2. S → ε rule 2
Now if we want to derive the string "aaaaaa", we start with the start symbol S.
1. S
2. aS rule 1
3. aaS rule 1
4. aaaS rule 1
5. aaaaS rule 1
6. aaaaaS rule 1
7. aaaaaaS rule 1
8. aaaaaaε rule 2
9. aaaaaa
The r.e. a* generates the set of strings {ε, a, aa, aaa, ...}. The empty string is included
because S is the start symbol and rule 2 gives S → ε.
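As a rough illustration, the 4-tuple and the derivation above can be written down in Python. This is only a sketch: the encoding of P as a dictionary and the helper name `derive_a_star` are assumptions made here, not part of the notes.

```python
# Hypothetical encoding of G = (V, T, P, S) for Example 1.
V = {"S"}                  # non-terminals
T = {"a"}                  # terminals
P = {"S": ["aS", ""]}      # S -> aS | ε (ε is written as the empty string)
S = "S"                    # start symbol

def derive_a_star(n):
    """Apply rule 1 (S -> aS) n times, then rule 2 (S -> ε)."""
    form = S
    steps = [form]
    for _ in range(n):
        form = form.replace("S", P["S"][0], 1)
        steps.append(form)
    form = form.replace("S", P["S"][1], 1)
    steps.append(form)
    return steps

print(derive_a_star(3))  # ['S', 'aS', 'aaS', 'aaaS', 'aaa']
```

Calling `derive_a_star(6)` reproduces the sentential forms of the derivation of "aaaaaa" shown above.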
Example 2:
Construct a CFG for the regular expression (0+1)*
Solution:
The CFG can be given by,
Production rules (P):
1. S → 0S | 1S
2. S → ε
The rules build any combination of 0's and 1's from the start symbol. Since (0+1)* denotes
{ε, 0, 1, 01, 10, 00, 11, ....} and ε is in this set, we include the rule S → ε.
Example 3:
Construct a CFG for the language L = {w c w^R | w ∈ {a, b}*}.
Solution:
Strings that can be generated for the given language include {aacaa, bcb, abcba, bacab,
abbcbba, ....}
The grammar could be:
1. S → aSa rule 1
2. S → bSb rule 2
3. S→c rule 3
Now if we want to derive the string "abbcbba", we start with the start symbol S.
1. S ⇒ aSa from rule 1
2. ⇒ abSba from rule 2
3. ⇒ abbSbba from rule 2
4. ⇒ abbcbba from rule 3
Thus any string of this kind can be derived from the given production rules.
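The three rules S → aSa | bSb | c can be mirrored by a small recursive membership test. The function name `in_wcwr` is hypothetical and the code is only a sketch of the idea: peel a matching outer pair of symbols, as rules 1 and 2 do, until only the centre marker c remains (rule 3).

```python
def in_wcwr(s):
    """Return True if s can be derived from S -> aSa | bSb | c."""
    if s == "c":                       # rule 3: S -> c
        return True
    # rules 1 and 2: the same symbol (a or b) must wrap a shorter string
    if len(s) >= 3 and s[0] == s[-1] and s[0] in "ab":
        return in_wcwr(s[1:-1])
    return False

print(in_wcwr("abbcbba"))  # True
print(in_wcwr("abcab"))    # False
```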
Example 4:
Construct a CFG for the language L = {a^n b^{2n} | n ≥ 1}.
Solution:
Strings that can be generated for the given language include {abb, aabbbb, aaabbbbbb, ....}.
The grammar could be:
1. S → aSbb | abb
Now if we want to derive the string "aabbbb", we start with the start symbol S.
1. S ⇒ aSbb
2. ⇒ aabbbb
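A quick way to convince oneself that S → aSbb | abb produces exactly strings of the form a^n b^{2n} is to apply the rules mechanically. The helper below is an illustrative sketch (the name `generate` is an assumption): it applies S → aSbb n − 1 times and then finishes with S → abb.

```python
def generate(n):
    """Derive a^n b^(2n) for n >= 1 using S -> aSbb | abb."""
    form = "S"
    for _ in range(n - 1):
        form = form.replace("S", "aSbb")   # recursive rule
    return form.replace("S", "abb")        # terminating rule

print(generate(2))  # aabbbb
```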

DERIVATIONS
A derivation is a sequence of applications of production rules that produces the input string
from the start symbol. During parsing, we have to make two decisions at each step:
o We have to decide which non-terminal is to be replaced.
o We have to decide the production rule by which that non-terminal will be replaced.
There are two standard ways to decide which non-terminal is replaced.
1. Leftmost Derivation:
In a leftmost derivation, the leftmost non-terminal of the current sentential form is replaced
at every step, so the terminal string is produced from left to right.
Example:
Production rules:
1. E → E + E
2. E → E - E
3. E → a | b
Input
1. a - b + a
The leftmost derivation is:
1. E ⇒ E + E
2. ⇒ E - E + E
3. ⇒ a - E + E
4. ⇒ a - b + E
5. ⇒ a - b + a
2. Rightmost Derivation:
In a rightmost derivation, the rightmost non-terminal of the current sentential form is
replaced at every step, so the terminal string is produced from right to left.
Example
Production rules:
1. E → E + E
2. E → E - E
3. E → a | b
Input
1. a - b + a
The rightmost derivation is:
1. E ⇒ E - E
2. ⇒ E - E + E
3. ⇒ E - E + a
4. ⇒ E - b + a
5. ⇒ a - b + a
The leftmost derivation and the rightmost derivation may produce the same string. The choice
of derivation order does not change which strings the grammar can generate.
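The two derivation orders can be replayed mechanically. The sketch below assumes a grammar whose only non-terminal is E (as in the example above); the function names `step` and `derive` are illustrative. Each call replaces either the leftmost or the rightmost E with a chosen right-hand side.

```python
def step(form, rhs, leftmost=True):
    """Replace the leftmost (or rightmost) E in a sentential form by rhs."""
    i = form.find("E") if leftmost else form.rfind("E")
    if i < 0:
        raise ValueError("no non-terminal left to replace")
    return form[:i] + rhs + form[i + 1:]

def derive(rhs_sequence, leftmost=True):
    """Apply the given right-hand sides one after another, starting from E."""
    form = "E"
    forms = [form]
    for rhs in rhs_sequence:
        form = step(form, rhs, leftmost)
        forms.append(form)
    return forms

left = derive(["E+E", "E-E", "a", "b", "a"], leftmost=True)
right = derive(["E-E", "E+E", "a", "b", "a"], leftmost=False)
print(left)   # ['E', 'E+E', 'E-E+E', 'a-E+E', 'a-b+E', 'a-b+a']
print(right)  # ['E', 'E-E', 'E-E+E', 'E-E+a', 'E-b+a', 'a-b+a']
```

Both runs end in the same string a-b+a, although the intermediate sentential forms differ.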
Examples of Derivation:
Example 1:
Derive the string "abb" for leftmost derivation and rightmost derivation using a CFG given by,
1. S → AB | ε
2. A → aB
3. B → Sb
Solution:
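Leftmost derivation:
1. S
2. AB S → AB
3. aBB A → aB
4. aSbB B → Sb
5. abB S → ε
6. abSb B → Sb
7. abb S → ε
Rightmost derivation:
1. S
2. AB S → AB
3. ASb B → Sb
4. Ab S → ε
5. aBb A → aB
6. aSbb B → Sb
7. abb S → ε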
Example 2:
Derive the string "aabbabba" for leftmost derivation and rightmost derivation using a CFG
given by,
1. S → aB | bA
2. A → a | aS | bAA
3. B → b | bS | aBB
Solution:
Leftmost derivation:
1. S
2. aB S → aB
3. aaBB B → aBB
4. aabB B→b
5. aabbS B → bS
6. aabbaB S → aB
7. aabbabS B → bS
8. aabbabbA S → bA
9. aabbabba A→a
Rightmost derivation:
1. S
2. aB S → aB
3. aaBB B → aBB
4. aaBbS B → bS
5. aaBbbA S → bA
6. aaBbba A→a
7. aabSbba B → bS
8. aabbAbba S → bA
9. aabbabba A→a
Example 3:
Derive the string "00101" for leftmost derivation and rightmost derivation using a CFG given
by,
1. S → A1B
2. A → 0A | ε
3. B → 0B | 1B | ε
Solution:
Leftmost derivation:
1. S
2. A1B
3. 0A1B
4. 00A1B
5. 001B
6. 0010B
7. 00101B
8. 00101
Rightmost derivation:
1. S
2. A1B
3. A10B
4. A101B
5. A101
6. 0A101
7. 00A101
8. 00101
A derivation tree is a graphical representation of a derivation in a given CFG. It is a simple
way to show how a string can be derived from a given set of production rules. The derivation
tree is also called a parse tree.
A parse tree reflects the precedence of operators. The deepest sub-tree is evaluated first, so
the operator in a parent node has lower precedence than the operator in its sub-tree.
A parse tree has the following properties:
1. The root node is always labelled with the start symbol.
2. The derived string is read from the leaves, left to right.
3. The leaf nodes are always terminal symbols (or ε).
4. The interior nodes are always non-terminal symbols.
Example 1:
Production rules:
1. E = E + E
2. E = E * E
3. E = a | b | c
Input
1. a * b + c
[Figures: Steps 1 to 5 of the step-by-step construction of the derivation tree for a * b + c]

Note: We can draw a derivation tree step by step or directly in one step.
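Since the figures are easier to follow with something concrete, here is a sketch of one parse tree for a * b + c written as nested Python tuples. The representation (a pair of symbol and child list) and the name `tree_yield` are assumptions for illustration. Reading the leaves from left to right gives back the input string.

```python
# A node is (symbol, children); a leaf has an empty child list.
# Root: E -> E + E, where the left E expands by E -> E * E.
tree = ("E", [
    ("E", [
        ("E", [("a", [])]),
        ("*", []),
        ("E", [("b", [])]),
    ]),
    ("+", []),
    ("E", [("c", [])]),
])

def tree_yield(node):
    """Concatenate the leaf labels from left to right."""
    symbol, children = node
    if not children:
        return symbol
    return "".join(tree_yield(child) for child in children)

print(tree_yield(tree))  # a*b+c
```

In this tree the * node sits deeper than the + node, which matches the statement that the operator in the sub-tree has higher precedence.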
Example 2:
Draw a derivation tree for the string "bbabb" from the CFG given by
1. S → bSb | a | b
Solution:
Now, the derivation tree for the string "bbabb" is as follows:

The above tree is a derivation tree drawn for deriving the string bbabb. By simply reading the
leaf nodes from left to right, we can obtain the desired string. The same tree can also be denoted by,

Example 3:
Construct a derivation tree for the string aabbabba for the CFG given by,
1. S → aB | bA
2. A → a | aS | bAA
3. B → b | bS | aBB
Solution:
To draw the tree, we first obtain a derivation of the string aabbabba, for example the leftmost
derivation S ⇒ aB ⇒ aaBB ⇒ aabB ⇒ aabbS ⇒ aabbaB ⇒ aabbabS ⇒ aabbabbA ⇒ aabbabba.
Now, the derivation tree is as follows:

Example 4:
Show the derivation tree for string "aabbbb" with the following grammar.
1. S → AB | ε
2. A → aB
3. B → Sb
Solution:
To draw a tree we will first try to obtain derivation for the string aabbbb

Now, the derivation tree for the string "aabbbb" is as follows:


AMBIGUITY
A grammar is said to be ambiguous if there exists more than one leftmost derivation, more
than one rightmost derivation, or more than one parse tree for some input string. If the
grammar is not ambiguous, then it is called unambiguous.
An ambiguous grammar is not good for compiler construction. No general method can
automatically detect and remove ambiguity (the problem is undecidable for CFGs in general),
but in many cases we can remove ambiguity by rewriting the grammar.
Example 1:
Let us consider a grammar G with the production rule
1. E → I
2. E → E + E
3. E → E * E
4. E → (E)
5. I → ε | 0 | 1 | 2 | ... | 9
Solution:
For the string "3 * 2 + 5", the above grammar can generate two parse trees by leftmost
derivation:
Since there are two parse trees for a single string "3 * 2 + 5", the grammar G is ambiguous.
Example 2:
Check whether the given grammar G is ambiguous or not.
1. E → E + E
2. E → E - E
3. E → id
Solution:
From the above grammar, the string "id + id - id" can be derived in 2 ways:
First leftmost derivation
1. E ⇒ E + E
2. ⇒ id + E
3. ⇒ id + E - E
4. ⇒ id + id - E
5. ⇒ id + id - id
Second leftmost derivation
1. E ⇒ E - E
2. ⇒ E + E - E
3. ⇒ id + E - E
4. ⇒ id + id - E
5. ⇒ id + id - id
Since there are two leftmost derivations for the single string "id + id - id", the grammar G is
ambiguous.
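The ambiguity can also be checked by counting. The sketch below counts the parse trees of a token list under E → E + E | E - E | id by trying every operator position as the top-level split; each parse tree corresponds to exactly one leftmost derivation. The function name `count_parse_trees` and the token-list input format are assumptions made for this illustration.

```python
from functools import lru_cache

def count_parse_trees(tokens):
    """Count parse trees of tokens under E -> E + E | E - E | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # number of ways E derives tokens[i:j]
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        for k in range(i + 1, j - 1):          # position of the top-level operator
            if tokens[k] in ("+", "-"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["id", "+", "id", "-", "id"]))  # 2
```

A count greater than 1 for any single string is enough to show that the grammar is ambiguous.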
Example 3:
Check whether the given grammar G is ambiguous or not.
1. S → aSb | SS
2. S → ε
Solution:
For the string "aabb" the above grammar can generate two parse trees
Since there are two parse trees for a single string "aabb", the grammar G is ambiguous.
Example 4:
Check whether the given grammar G is ambiguous or not.
1. A → AA
2. A → (A)
3. A → a
Solution:
For the string "a(a)aa" the above grammar can generate two parse trees:

Since there are two parse trees for a single string "a(a)aa", the grammar G is ambiguous.
Unambiguous Grammar
A grammar is unambiguous if it does not contain ambiguity, that is, if no input string has
more than one leftmost derivation, more than one rightmost derivation, or more than one
parse tree.
To convert an ambiguous grammar into an unambiguous grammar, we can apply the following rules:
1. If left-associative operators (+, -, *, /) are used in the production rule, then apply left
recursion in the production rule. Left recursion means that the leftmost symbol on the right side
is the same as the non-terminal on the left side. For example,
1. X → Xa
2. If a right-associative operator (^) is used in the production rule, then apply right recursion in
the production rule. Right recursion means that the rightmost symbol on the right side is the
same as the non-terminal on the left side. For example,
1. X → aX
Example 1:
Consider a grammar G is given as follows:
1. S → AB | aaB
2. A → a | Aa
3. B → b
Determine whether the grammar G is ambiguous or not. If G is ambiguous, construct an
unambiguous grammar equivalent to G.
Solution:
Let us derive the string "aab"

As there are two different parse trees for deriving the same string, the given grammar is
ambiguous.
An unambiguous grammar will be:
1. S → AB
2. A → Aa | a
3. B → b
Example 2:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar.
1. S → ABA
2. A → aA | ε
3. B → bB | ε
Solution:
The given grammar is ambiguous because we can derive two different parse trees for the string aa.

The unambiguous grammar is:


1. S → aXY | bYZ | ε
2. Z → aZ | a
3. X → aXY | a | ε
4. Y → bYZ | b | ε
Example 3:
Show that the given grammar is ambiguous. Also, find an equivalent unambiguous grammar.
1. E → E + E
2. E → E * E
3. E → id
Solution:
Let us derive the string "id + id * id"

As there are two different parse trees for deriving the same string, the given grammar is
ambiguous.
An unambiguous grammar will be:
1. E → E + T
2. E → T
3. T → T * F
4. T → F
5. F → id
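To see how the layered grammar E → E + T | T, T → T * F | F, F → id fixes precedence and associativity, here is a small hand-written parser sketch. It is not a direct transcription of the grammar: the left-recursive rules are realised as loops (a common technique, assumed here), and the result is an abstract operator tree rather than the full parse tree. All function names are illustrative.

```python
def parse(tokens):
    """Parse a token list under E -> E + T | T, T -> T * F | F, F -> id."""
    pos = 0

    def factor():                      # F -> id
        nonlocal pos
        if pos < len(tokens) and tokens[pos] == "id":
            pos += 1
            return "id"
        raise SyntaxError("expected id")

    def term():                        # T -> T * F | F, as a loop
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():                        # E -> E + T | T, as a loop
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return tree

print(parse(["id", "+", "id", "*", "id"]))
# ('+', 'id', ('*', 'id', 'id'))
```

Only one tree is produced for "id + id * id", with * grouped below +, which is the point of introducing T and F.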
Example 4:
Check whether the given grammar is ambiguous or not. Also, find an equivalent unambiguous
grammar.
1. S→S+S
2. S → S * S
3. S → S ^ S
4. S → a
Solution:
The given grammar is ambiguous because a string such as a + a * a can be represented by more
than one parse tree:

Unambiguous grammar will be:


1. S → S + A | A
2. A → A * B | B
3. B → C ^ B | C
4. C → a

RELATIONSHIP BETWEEN DERIVATION AND DERIVATION TREES


A derivation tree is a graphical representation of a derivation under the production rules
of a context-free grammar (CFG). It shows how a string can be obtained from a given set of
production rules. It is also called a parse tree.
A derivation replaces a non-terminal in a given string by the right-hand side of one of its
production rules. The sequence of rule applications that produces a complete string of
terminals from the start symbol is known as a derivation. The parse tree is the pictorial
representation of a derivation; therefore, it is also known as a derivation tree. The derivation
tree is independent of the order in which the productions are applied.

PUMPING LEMMA FOR CFL


Lemma
If L is a context-free language, there is a pumping length p such that any string w ∈ L of
length ≥ p can be written as w = uvxyz, where vy ≠ ε, |vxy| ≤ p, and for all i ≥ 0,
u v^i x y^i z ∈ L.
Applications of Pumping Lemma
The pumping lemma is used to show that a language is not context free. Let us take an
example and show how this is done.
Problem
Find out whether the language L = {x^n y^n z^n | n ≥ 1} is context free or not.
Solution
Assume that L is context free. Then L must satisfy the pumping lemma.
First, let n be the pumping length given by the lemma. Then take z = 0^n 1^n 2^n, writing
0, 1 and 2 for the three symbols of the language so that they do not clash with the names of
the pieces below.
Break z into uvwxy, where
|vwx| ≤ n and vx ≠ ε.
Then vwx cannot involve both 0s and 2s, since the last 0 and the first 2 are at least (n+1)
positions apart. There are two cases:
Case 1: vwx has no 2s. Then vx has only 0s and 1s. Then uwy, which would have to be in L,
has n 2s, but fewer than n 0s or fewer than n 1s.
Case 2: vwx has no 0s. Then vx has only 1s and 2s, so uwy has n 0s, but fewer than n 1s or
fewer than n 2s.
In both cases uwy is not in L, so a contradiction occurs.
Hence, L is not a context-free language.
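For one small value of n, the case analysis can be checked by brute force. The sketch below tries every split z = uvwxy with |vwx| ≤ n and vx ≠ ε and tests whether u v^i w x^i y stays in the language for i = 0..3. This only illustrates the argument for a fixed n and is not a proof; the function names and the bound `max_i` are assumptions.

```python
def in_L(s):
    """Membership test for {0^n 1^n 2^n | n >= 1}."""
    n = len(s) // 3
    return n >= 1 and s == "0" * n + "1" * n + "2" * n

def some_split_pumps(z, n, in_lang, max_i=3):
    """True if some split z = u v w x y with |vwx| <= n and vx != ε
    keeps u v^i w x^i y in the language for every i in 0..max_i."""
    size = len(z)
    for a in range(size + 1):                              # u = z[:a]
        for b in range(a, size + 1):                       # v = z[a:b]
            for c in range(b, size + 1):                   # w = z[b:c]
                for d in range(c, min(size, a + n) + 1):   # x = z[c:d]
                    v, x = z[a:b], z[c:d]
                    if not v and not x:
                        continue
                    u, w, y = z[:a], z[b:c], z[d:]
                    if all(in_lang(u + v * i + w + x * i + y)
                           for i in range(max_i + 1)):
                        return True
    return False

print(some_split_pumps("000111222", 3, in_L))  # False
```

For contrast, the same search succeeds on 0^3 1^3 for the context-free language {0^n 1^n}, where v = 0 and x = 1 can be pumped together.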

Problems
1. L = {a^k | k is a prime number}
Proof by contradiction:
Let us assume L is regular. Clearly L is infinite (there are infinitely many prime numbers).
From the pumping lemma for regular languages, there exists a number n such that any string
w ∈ L of length at least n has a “repeatable” substring generating more strings in the
language L.
Let us consider the first prime number p ≥ n. For example, if n was 50 we could use p = 53.
From the pumping lemma the string a^p has a “repeatable” substring.
We will assume that this substring is of length k ≥ 1.
Hence:

a^p ∈ L and
a^{p+k} ∈ L as well as
a^{p+2k} ∈ L, etc.
It should be relatively clear that p + k, p + 2k, etc., cannot all be prime, but let us add k p
times. Then we must have a^{p+pk} ∈ L, and of course p + pk = p(k + 1),

so this would imply that p(k + 1) is prime, which it is not since it is divisible by both p and
k + 1. Hence L is not regular.
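The arithmetic in this argument is easy to check numerically. The following sketch uses the value n = 50 from the text and verifies that p + pk is never prime for 1 ≤ k ≤ n; the helper names are illustrative, and the primality test is a plain trial division.

```python
def is_prime(m):
    """Trial-division primality test (fine for small m)."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True

def first_prime_at_least(n):
    p = max(n, 2)
    while not is_prime(p):
        p += 1
    return p

n = 50
p = first_prime_at_least(n)
print(p)  # 53

# Pumping the substring of length k a total of p extra times gives length
# p + p*k = p*(k + 1), which has the proper factors p and k + 1.
pumped_lengths_prime = [is_prime(p + p * k) for k in range(1, n + 1)]
print(any(pumped_lengths_prime))  # False
```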
2. L = {a^n b^{n+1}}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b^{p+1}. Its
length is 2p + 1 ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for
some k > 0. From the pumping lemma a^{p-k} b^{p+1} must also
be in L but it is not of the right form. Hence the language is not regular.

Note that the repeatable string needs to appear in the first n symbols to avoid the following
situation:
assume, for the sake of argument, that n = 20 and you choose the string a^10 b^11, which is of
length larger than 20. Then |xy| ≤ 20 allows xy to extend past the first b, which means that y
could contain some b’s. In such a case we can no longer conclude that y consists only of a’s,
and the simple argument above does not apply.
3. L = {a^n b^{2n}}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b^{2p}. Its
length is 3p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for some
k > 0. From the pumping lemma a^{p-k} b^{2p} must also be in L but it is not of the right form.
Hence the language is not regular.
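The pattern used in these proofs (every split w = xyz with |xy| ≤ p and |y| > 0 breaks under pumping) can be checked by brute force for a fixed p. The sketch below is an illustration for one chosen p, not a proof, and the names `survives_pumping` and `in_anb2n` are assumptions.

```python
def survives_pumping(w, p, in_lang, max_i=3):
    """True if some split w = xyz with |xy| <= p and |y| > 0 keeps
    x y^i z in the language for all i in 0..max_i."""
    for end_x in range(p + 1):
        for end_y in range(end_x + 1, min(p, len(w)) + 1):
            x, y, z = w[:end_x], w[end_x:end_y], w[end_y:]
            if all(in_lang(x + y * i + z) for i in range(max_i + 1)):
                return True
    return False

def in_anb2n(s):
    """Membership test for {a^n b^(2n)}."""
    n = len(s) // 3
    return s == "a" * n + "b" * (2 * n)

p = 4
print(survives_pumping("a" * p + "b" * (2 * p), p, in_anb2n))  # False
```

For a regular language such as a*b*, the same search finds a split that survives, for example y = a.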

4. TRAILING-COUNT = any string s followed by a number of a’s equal to the length of s.


Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose b^p a^p. Its
length is 2p ≥ p. Since the length of xy cannot exceed p, y must be of the form b^k for some
k > 0. From the pumping lemma b^{p+k} a^p must also be in L but it is not of the right form:
its second half would have to consist only of a’s, yet it contains at least one b (Note that you
must add, not subtract; otherwise some a’s could be shifted into s and the string would be OK).
Hence the language is not regular.

5. EVENPALINDROME = {all words in PALINDROME that have even length}
Same as #2 above, choose a^n b b a^n.
6. ODDPALINDROME = {all words in PALINDROME that have odd length}
Same as #2 above, choose a^n b a^n.

7. DOUBLESQUARE = {a^n b^n where n is a square}


Assume DOUBLESQUARE is regular. From the pumping lemma there exists a p such that
every w ∈ L with |w| ≥ p can be represented as xyz with |y| > 0 and
|xy| ≤ p. Let us choose a^{p*p} b^{p*p}. Its length is 2p^2 ≥ p. Since the length of xy cannot
exceed p, y must be of the form a^k for some k > 0. Let us add y p times. From the
pumping lemma a^{p*p+pk} b^{p*p} = a^{p(p+k)} b^{p*p} must also be in L but it is not of the
right form. Hence the language is not regular.

9. L = {w | w ∈ {a, b}*, w = w^R}
Proof by contradiction:


Assume L is regular. Then the pumping lemma applies.
From the pumping lemma there exists an n such that every w ∈ L of length at least n can be
represented as xyz with |y| > 0 and |xy| ≤ n.
Let us choose the palindrome a^n b a^n.
Again notice that we were clever enough to choose a string which:
a. has a center mark which is not a (otherwise when we remove or add y we would be left with
an acceptable string)
b. has a first portion of length n which is all a’s (so that when we remove or add y it will create
an imbalance).
Its length is 2n + 1 ≥ n. Since the length of xy cannot exceed n, y must be of the form a^k for
some k > 0. From the pumping lemma a^{n-k} b a^n must also be in L but it is not a palindrome.
Hence L is not regular.

10. L = {w ∈ {a, b}* | w has an equal number of a’s and b’s}


Let us show this by contradiction: assume L is regular. We know that the
language generated by a*b* is regular. We also know that the intersection of two regular
languages is regular. Let M = {a^n b^n | n ≥ 0} = L(a*b*) ∩ L. Therefore, if L is regular, M
would also be regular. But we know that M is not regular. Hence, L is not regular.

11. L = {0^n | n is a power of 2}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose
n = 2^p. Since the length of xy cannot exceed p, y must be of the form 0^k for some
0 < k ≤ p. From the pumping lemma 0^m where m = 2^p + k must also be in L. We have

2^p < 2^p + k ≤ 2^p + p < 2^{p+1}
Hence this string is not of the right form. Hence the language is not regular.
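The chain of inequalities can be sanity-checked for small p. This is only a numerical illustration of the bound 2^p < 2^p + k < 2^{p+1} for 0 < k ≤ p; the function names are assumptions.

```python
def is_power_of_two(m):
    """True if m is 1, 2, 4, 8, ..."""
    return m > 0 and (m & (m - 1)) == 0

def pumped_lengths_miss_powers(max_p):
    """Check that 2^p + k lies strictly between 2^p and 2^(p+1) for 0 < k <= p."""
    for p in range(1, max_p + 1):
        for k in range(1, p + 1):
            m = 2 ** p + k
            if not (2 ** p < m < 2 ** (p + 1)) or is_power_of_two(m):
                return False
    return True

print(pumped_lengths_miss_powers(12))  # True
```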
13. L = {a^{2k} w | w ∈ {a, b}*, |w| = k}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^{2p} b^p. Its
length is 3p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for some k
> 0. From the pumping lemma a^{2p-k} b^p must
also be in L but it is not of the right form since the number of a’s cannot be twice the
number of b’s (Note that you must subtract, not add; otherwise some a’s could be shifted
into w). Hence the language is not regular.
14. L = {a^k w | w ∈ {a, b}*, |w| = k}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b^p. Its length
is 2p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for some k > 0.
From the pumping lemma a^{p-k} b^p must also
be in L but it is not of the right form since the number of a’s cannot be equal to the number
of b’s (Note that you must subtract, not add; otherwise some a’s could be shifted into w).
Hence the language is not regular.
15. L = {a^n b^l | n ≤ l}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b^p. Its
length is 2p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for some k
> 0. From the pumping lemma a^{p+k} b^p must
also be in L but it is not of the right form since the number of a’s exceeds the number of b’s
(Note that you must add, not subtract; otherwise the string would be OK). Hence the
language is not regular.
16. L = {a^n b^l a^k | k = n + l}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b a^{p+1}. Its
length is 2p+2 ≥ p. Since the length of xy cannot exceed p, y must be of the form a^m for
some m > 0. From the pumping lemma a^{p-m} b a^{p+1}
must also be in L but it is not of the right form. Hence the language is not regular.

17. L = {v a^{k+1} | v ∈ {a, b}*, |v| = k}


Assume L is regular. From the pumping lemma there exists an n such that every w ∈ L with
|w| ≥ n can be represented as xyz with |y| > 0 and |xy| ≤ n. Let us choose b^n a^{n+1}. Its
length is 2n+1 ≥ n. Since the length of xy cannot exceed n, y must be of the form b^k for
some k > 0. From the pumping lemma, if we add two
y’s to the original string, b^{n+2k} a^{n+1} must also be in L. But that string is of length
2n+2k+1, so v would have to be b^{n+k} to fit the pattern, and the rest of the string would then
be b^k a^{n+1}, which is not of the right form. Hence the language is not regular.

18. L = {v a^{2k} | v ∈ {a, b}*, |v| = k}


Assume L is regular. From the pumping lemma there exists an n such that every w ∈ L with
|w| ≥ n can be represented as xyz with |y| > 0 and |xy| ≤ n. Let us choose b^n a^{2n}. Its
length is 3n ≥ n. Since the length of xy cannot exceed n, y must be of the form b^k for some k
> 0. From the pumping lemma b^{n+k} a^{2n} must
also be in L but it is not of the right form: v would have to contain all n + k b’s, so the
suffix would need at least 2(n + k) a’s, and we cannot move any b’s to the a side (Note that
you must add, not subtract; otherwise the string would be OK by shifting a’s to the b side).
Hence the language is not regular.

19. L = {ww | w ∈ {a, b}*}


Assume L is regular. From the pumping lemma there exists an n such that every w ∈ L with
|w| ≥ n can be represented as xyz with |y| > 0 and |xy| ≤ n. Let us choose a^n b^n a^n b^n. Its
length is 4n ≥ n. Since the length of xy cannot exceed n, y must be of the form a^k for some
k > 0. From the pumping lemma a^{n+k} b^n a^n b^n
must also be in L but it is not of the right form, since the middle of the string would fall in
the middle of the first block of b’s, which prevents a match with the beginning of the string.
Hence the language is not regular.

20. L = {a^{n!} | n ≥ 0}
Proof by contradiction:
Let us assume L is regular. From the pumping lemma, there exists a number p such that any
string w ∈ L of length at least p has a “repeatable” substring, within its first p symbols,
generating more strings in the language L. Let m = p (unless p < 3, in which case we choose
m = 3) and consider w = a^{m!}. From the pumping lemma the string w has a “repeatable”
substring. We will assume that this substring is of length k, with 1 ≤ k ≤ p ≤ m.
From the pumping lemma a^{m!-k} must also be in L. For this to be true there must
be j such that j! = m! - k. But this is not possible, since when m > 2 and k ≤ m we have
(m - 1)! < m! - k < m!
Hence L is not regular.

21. L = {a^n b^l | n ≠ l}
Proof by contradiction:
Let us assume L is regular. From the pumping lemma, there exists a number p
such that any string w ∈ L of length at least p has a “repeatable” substring y, within its first
p symbols, generating more strings in the language L. Let us consider n = p! and l = (p+1)!,
that is, the string a^{p!} b^{(p+1)!}. From the pumping lemma
the string is of length larger than p and has a “repeatable” substring, which consists only of
a’s. We will assume that this substring is of length k, with 1 ≤ k ≤ p.
From the pumping lemma we can add y i-1 times for a total of i y’s. If we can find
an i such that the resulting number of a’s is the same as the number of b’s we have won,
because that string is not in L. This means we must find i such that:
p! + (i - 1) k = (p + 1)! or
(i - 1) k = (p + 1) p! - p! = p * p! or
i = (p * p!) / k + 1

but since k ≤ p we know that k must divide p! and that (p * p!) / k must be an integer.
This proves that we can choose i to obtain the above equality.
Hence L is not regular.
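The choice of i in problem 21 can be verified with a few lines of arithmetic. The sketch below computes i for every repeatable length k between 1 and p and checks that pumping a^{p!} up to i copies of y yields exactly (p+1)! a’s. The function name `pump_count` is an assumption.

```python
from math import factorial

def pump_count(p, k):
    """Return i such that p! + (i - 1) * k == (p + 1)!, for 1 <= k <= p."""
    extra = p * factorial(p)          # (p + 1)! - p!
    if extra % k != 0:
        raise ValueError("k does not divide p * p!")
    return extra // k + 1

def check_all(max_p):
    for p in range(1, max_p + 1):
        for k in range(1, p + 1):
            i = pump_count(p, k)
            if factorial(p) + (i - 1) * k != factorial(p + 1):
                return False
    return True

print(pump_count(3, 2))  # 10
print(check_all(7))      # True
```

With p = 3 and k = 2, for instance, 3! + 9 * 2 = 24 = 4!, so the pumped string has equally many a’s and b’s and falls outside L.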

22. L = {a^n b^l a^k | k > n + l}


Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b a^{p+2}. Its
length is 2p+3 ≥ p. Since the length of xy cannot exceed p, y must be of the form a^m for
some m > 0. From the pumping lemma a^{p+2m} b a^{p+2}
must also be in L but it is not of the right form since p+2m+1 > p+2. Hence the language is
not regular.

23. L = {a^n b^l c^k | k ≠ n + l}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose
a^{p!} b^{p!} c^{(p+1)!}. Its
length is 2p!+(p+1)! ≥ p. Since the length of xy cannot exceed p, y must be of the form a^m
for some m > 0. From the pumping lemma
any string of the form x y^i z must always be in L. If we can show that it is always possible to
choose i in such a way that we will have k = n + l for one such string we will have shown a
contradiction. Indeed we can have
p! + (i-1)m + p! = (p+1)!
if we have i = 1 + ((p+1)! - 2 p!) / m. Is that possible? Only if m divides
(p+1)! - 2 p!. Now
(p + 1)! - 2 p! = (p + 1 - 2) p! = (p - 1) p!, and since m ≤ p, m is guaranteed to divide p!.

Hence i exists and the language is not regular.

24. L = {a^n b^l a^k | n = l or l ≠ k}
Proof by contradiction:


Let us assume L is regular. From the pumping lemma, there exists a number p such that any
string w ∈ L of length at least p has a “repeatable” substring, within its first p symbols,
generating more strings in the language L. Let us consider w = a^p b^p a^p.

From the pumping lemma the string w, of length larger than p, has a “repeatable” substring y.
We will assume that this substring is of length m ≥ 1. From the
pumping lemma we can remove y and the resulting string should be in L.
However, if we remove y we get a^{p-m} b^p a^p. But this string is not in L since p-m ≠ p and
p = p. Hence L is not regular.
25. L = {a^n b a^{3n} | n ≥ 0}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b a^{3p}. Its
length is 4p+1 ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for
some k > 0. From the pumping lemma a^{p-k} b a^{3p}
must also be in L but it is not of the right form. Hence the language is not regular.
26. L = {a^n b^n c^n | n ≥ 0}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b^p c^p. Its
length is 3p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for some
k > 0. From the pumping lemma a^{p-k} b^p c^p must
also be in L but it is not of the right form. Hence the language is not regular.

27. L = {a^i b^n | i, n ≥ 0, i = n or i = 2n}


Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose a^p b^p. Its
length is 2p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for
some k > 0. From the pumping lemma a^{p-k} b^p must
also be in L but it is not of the right form. Hence the language is not regular.

28. L = {0^k 1 0^k | k ≥ 0}
Assume L is regular. From the pumping lemma there exists an n such that every w ∈ L with
|w| ≥ n can be represented as xyz with |y| > 0 and |xy| ≤ n. Let us choose 0^n 1 0^n. Its
length is 2n+1 ≥ n. Since the length of xy cannot exceed n, y must be of the form 0^p for
some p > 0. From the pumping lemma 0^{n-p} 1 0^n must
also be in L but it is not of the right form. Hence the language is not regular.

29. L = {0^n 1^m 2^n | n, m ≥ 0}
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L with
|w| ≥ p can be represented as xyz with |y| > 0 and |xy| ≤ p. Let us choose 0^p 1 2^p. Its
length is 2p+1 ≥ p. Since the length of xy cannot exceed p, y must be of the form 0^k for
some k > 0. From the pumping lemma 0^{p-k} 1 2^p must
also be in L but it is not of the right form. Hence the language is not regular.

Simplification of CFG
As we have seen, various languages can be represented by a context-free
grammar. Not every grammar is optimized: a grammar may contain extra symbols
(non-terminals) that unnecessarily increase its length.
Simplification of a grammar means reducing the grammar by removing useless
symbols. The properties of a reduced grammar are given below:
1. Each variable (i.e. non-terminal) and each terminal of G appears in the derivation of some
word in L.
2. There should not be any production of the form X → Y where X and Y are non-terminals.
3. If ε is not in the language L, then there need not be any production X → ε.
Let us study the reduction process in detail.
Removal of Useless Symbols
A symbol is useless if it does not take part in the derivation of any terminal string from the
start symbol. This happens when the symbol cannot be reached from the start symbol, or
when it can never lead to a string of terminals. A variable with this property is known as a
useless variable.
For Example:
1. T → aaB | abA | aaT
2. A → aA
3. B → ab | b
4. C → ad
In the above example, the variable 'C' will never occur in the derivation of any string,
because 'C' can never be reached from the starting variable 'T'. So the production C → ad is
useless and we eliminate it.
The production A → aA is also useless because there is no way to terminate it. If it never
terminates, then it can never produce a string. Hence this production can never take part in
any derivation.
To remove the useless production A → aA, we first find all the variables which will
never lead to a terminal string, such as the variable 'A'. Then we remove all the productions
in which the variable 'A' occurs.
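The two checks described above (does a variable lead to a terminal string, and can it be reached from the start variable) can be sketched in Python. The encoding is an assumption made for this illustration: productions are a dictionary from a non-terminal to a list of right-hand-side strings, upper-case letters are non-terminals and everything else is a terminal.

```python
def remove_useless(productions, start):
    """Drop variables that never yield a terminal string or are unreachable."""
    # Phase 1: variables that derive some terminal string ("generating").
    generating = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head in generating:
                continue
            for body in bodies:
                if all(not s.isupper() or s in generating for s in body):
                    generating.add(head)
                    changed = True
                    break
    kept = {
        head: [b for b in bodies
               if all(not s.isupper() or s in generating for s in b)]
        for head, bodies in productions.items() if head in generating
    }
    # Phase 2: variables reachable from the start variable.
    reachable, todo = {start}, [start]
    while todo:
        head = todo.pop()
        for body in kept.get(head, []):
            for s in body:
                if s.isupper() and s not in reachable:
                    reachable.add(s)
                    todo.append(s)
    return {head: bodies for head, bodies in kept.items() if head in reachable}

grammar = {
    "T": ["aaB", "abA", "aaT"],
    "A": ["aA"],
    "B": ["ab", "b"],
    "C": ["ad"],
}
print(remove_useless(grammar, "T"))
# {'T': ['aaB', 'aaT'], 'B': ['ab', 'b']}
```

On the example grammar, A is removed because it never terminates and C is removed because it cannot be reached from T, leaving T → aaB | aaT and B → ab | b.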
Elimination of ε Production
Productions of the form S → ε are called ε productions. These productions can be removed
completely only from grammars that do not generate ε.
Step 1: First find all nullable non-terminals, i.e. those which derive ε.
Step 2: For each production A → α, construct all productions A → x, where x is obtained
from α by removing one or more of the nullable non-terminals found in step 1.
Step 3: Now combine the result of step 2 with the original productions and remove the ε
productions.
Example:
Remove the ε productions from the following CFG while preserving its meaning.
1. S → XYX
2. X → 0X | ε
3. Y → 1Y | ε
Solution:
While removing ε productions, we delete the rules X → ε and Y → ε. To preserve
the meaning of the CFG, we substitute ε for X and Y wherever they appear on a
right-hand side.
Let us take
1. S → XYX
If the first X on the right-hand side is ε, then
1. S → YX
Similarly, if the last X on the right-hand side is ε, then
1. S → XY
If Y = ε then
1. S → XX
If Y and one X are ε then,
1. S → X
If both X are replaced by ε
1. S → Y
Now, together with the original production,
1. S → XYX | XY | YX | XX | X | Y
Now let us consider
1. X → 0X
If we substitute ε for X on the right-hand side then,
1. X → 0
2. X → 0X | 0
Similarly Y → 1Y | 1
Collectively we can rewrite the CFG with the ε productions removed as
1. S → XYX | XY | YX | XX | X | Y
2. X → 0X | 0
3. Y → 1Y | 1
Since the original grammar also generates ε, the new grammar generates the same language
except for the empty string.
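The three steps of ε-production removal can be sketched as follows. The encoding is an assumption for illustration: right-hand sides are strings, ε is the empty string, and upper-case letters are non-terminals. For every right-hand side the code keeps or drops each nullable symbol in all possible ways and discards the empty result.

```python
from itertools import product

def eliminate_epsilon(productions):
    """Return productions without ε bodies (ε itself is no longer derivable)."""
    # Step 1: find the nullable non-terminals.
    nullable = set()
    changed = True
    while changed:
        changed = False
        for head, bodies in productions.items():
            if head not in nullable and any(
                    all(s in nullable for s in body) for body in bodies):
                nullable.add(head)
                changed = True
    # Steps 2 and 3: expand every body, keeping the original and dropping ε.
    result = {}
    for head, bodies in productions.items():
        new_bodies = []
        for body in bodies:
            options = [(s, "") if s in nullable else (s,) for s in body]
            for choice in product(*options):
                candidate = "".join(choice)
                if candidate and candidate not in new_bodies:
                    new_bodies.append(candidate)
        result[head] = new_bodies
    return result

grammar = {"S": ["XYX"], "X": ["0X", ""], "Y": ["1Y", ""]}
print(eliminate_epsilon(grammar))
# {'S': ['XYX', 'XY', 'XX', 'X', 'YX', 'Y'], 'X': ['0X', '0'], 'Y': ['1Y', '1']}
```

Note that the original body XYX stays in the result, as step 3 requires, alongside the shortened alternatives.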
Removing Unit Productions
Unit productions are productions in which one non-terminal gives another single
non-terminal. Use the following steps to remove unit productions:
Step 1: To remove X → Y, add the production X → a to the grammar whenever Y → a
occurs in the grammar.
Step 2: Now delete X → Y from the grammar.
Step 3: Repeat step 1 and step 2 until all unit productions are removed.
For example:
1. S → 0A | 1B | C
2. A → 0S | 00
3. B → 1 | A
4. C → 01
Solution:
S → C is a unit production. To remove S → C we have to consider what C gives, and
add those right-hand sides to S.
1. S → 0A | 1B | 01
Similarly, B → A is also a unit production so we can modify it as
1. B → 1 | 0S | 00
Thus finally we can write the CFG without unit productions as
1. S → 0A | 1B | 01
2. A → 0S | 00
3. B → 1 | 0S | 00
4. C → 01
Note that C can no longer be reached from S, so C → 01 is now a useless production and
can be removed as well.
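Unit-production removal can be sketched in the same style. Instead of repeating steps 1 and 2 one production at a time, the code below collects, for each non-terminal, every non-terminal it can reach through unit productions alone and copies over their non-unit right-hand sides; this closure-based variant is an assumption chosen here for brevity and gives the same result on the example. Upper-case single letters are taken to be non-terminals.

```python
def remove_unit_productions(productions):
    """Replace unit productions X -> Y by the non-unit bodies of Y."""
    def is_unit(body):
        return len(body) == 1 and body.isupper()

    result = {}
    for head in productions:
        # Non-terminals reachable from head using unit productions only.
        closure = [head]
        for current in closure:            # the list grows while we iterate
            for body in productions.get(current, []):
                if is_unit(body) and body not in closure:
                    closure.append(body)
        bodies = []
        for var in closure:
            for body in productions.get(var, []):
                if not is_unit(body) and body not in bodies:
                    bodies.append(body)
        result[head] = bodies
    return result

grammar = {
    "S": ["0A", "1B", "C"],
    "A": ["0S", "00"],
    "B": ["1", "A"],
    "C": ["01"],
}
print(remove_unit_productions(grammar))
# {'S': ['0A', '1B', '01'], 'A': ['0S', '00'], 'B': ['1', '0S', '00'], 'C': ['01']}
```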
