Unit II
Regular Language- Regular Expression- Equivalence of finite Automaton and regular expressions –
Minimization of DFA - Pumping Lemma for Regular sets – Problems based on Pumping Lemma
Regular Languages:
A language is regular if it can be expressed by a regular expression.
(a ∪ e ∪ i ∪ o ∪ u) denotes {a, e, i, o, u}: the set of vowels.
(a.b*) denotes {a, ab, abb, abbb, abbbb, ….}: a followed by 0 or more b's.
v*.c* (where v stands for vowels and c for consonants) denotes {ε, a, aou, aiou, b, abcd, …..}: any
number of vowels followed by any number of consonants, where ε represents the empty string (the case
of 0 vowels and 0 consonants).
o The language accepted by a finite automaton can be described by simple expressions called
regular expressions. They are a compact way to represent such languages.
o The languages described by regular expressions are referred to as regular languages.
o A regular expression can also be described as a pattern that defines a set of strings.
o Regular expressions are used to match character combinations in strings. String-searching
algorithms use such patterns to find matches in a string.
For instance:
In a regular expression, x* means zero or more occurrences of x. It can generate {ε, x, xx, xxx,
xxxx, .....}
In a regular expression, x+ means one or more occurrences of x. It can generate {x, xx, xxx,
xxxx, .....}
Operations on Regular Language
The various operations on regular language are:
Union: If L and M are two regular languages, then their union L ∪ M is also a regular language.
L ∪ M = {s | s is in L or s is in M}
Intersection: If L and M are two regular languages, then their intersection L ∩ M is also a regular
language.
L ∩ M = {s | s is in L and s is in M}
Kleene closure: If L is a regular language, then its Kleene closure L* is also a regular
language.
L* = zero or more occurrences of strings of the language L, concatenated.
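The three operations can be tried out on small finite languages. The following is only a sketch: the function names are made up for illustration, languages are modelled as Python sets of strings, and the Kleene closure is cut off at a maximum length because L* is infinite in general.

```python
def union(L, M):
    """Strings that are in L or in M."""
    return L | M

def intersection(L, M):
    """Strings that are in both L and M."""
    return L & M

def kleene_star(L, max_len):
    """Finite slice of L*: concatenations of strings of L with total length <= max_len."""
    result = {""}        # zero occurrences gives the empty string
    frontier = {""}
    while frontier:
        # Extend every newly found string by one more string of L.
        frontier = {s + t for s in frontier for t in L
                    if len(s + t) <= max_len} - result
        result |= frontier
    return result

L = {"a", "ab"}
M = {"ab", "b"}
print(sorted(union(L, M)))                       # ['a', 'ab', 'b']
print(sorted(intersection(L, M)))                # ['ab']
print(sorted(kleene_star({"ab"}, 6), key=len))   # ['', 'ab', 'abab', 'ababab']
```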
Example 1:
Write the regular expression for the language accepting all combinations of a's, over the set ∑ =
{a}
Solution:
All combinations of a's means a may be zero, single, double and so on. If a is appearing zero times,
that means a null string. That is we expect the set of {ε, a, aa, aaa, ....}. So we give a regular
expression for this as:
R = a*
That is, the Kleene closure of a.
Example 2:
Write the regular expression for the language accepting all combinations of a's except the null
string, over the set ∑ = {a}
Solution:
The regular expression has to be built for the language
L = {a, aa, aaa, ....}
This set indicates that there is no null string. So we can denote regular expression as:
R = a+
Example 3:
Write the regular expression for the language accepting all the strings containing any number of a's
and b's.
Solution:
The regular expression will be:
r.e. = (a + b)*
This will give the set L = {ε, a, aa, b, bb, ab, ba, aba, bab, .....}, any combination of a and b.
The (a + b)* denotes any combination of a and b, including the null string.
Examples of Regular Expression
Example 1:
Write the regular expression for the language accepting all the strings that start with 1 and
end with 0, over ∑ = {0, 1}.
Solution:
In the regular expression, the first symbol should be 1, and the last symbol should be 0. The r.e. is as
follows:
R = 1 (0+1)* 0
Example 2:
Write the regular expression for the language of strings starting and ending with a and having any
combination of b's in between.
Solution:
The regular expression will be:
R = a b* a
Example 3:
Write the regular expression for the language of strings starting with a but not having consecutive b's.
Solution: The regular expression has to be built for the language:
L = {a, aa, ab, aaa, aab, aba, abab, .....}
The regular expression for the above language is:
R = (a + ab)+
The closure must be + rather than *, since the null string does not start with a.
Example 4:
Write the regular expression for the language accepting all the strings in which any number of a's is
followed by any number of b's, which is followed by any number of c's.
Solution: As we know, any number of a's means a*, any number of b's means b*, and any number of c's
means c*. As given in the problem statement, b's appear after a's and c's appear after b's. So the
regular expression could be:
R = a* b* c*
Example 5:
Write the regular expression for the language over ∑ = {0} consisting of the strings of even length.
Solution:
The regular expression has to be built for the language:
L = {ε, 00, 0000, 000000, ......}
The regular expression for the above language is:
R = (00)*
Example 6:
Write the regular expression for the language of strings that have at least one 0 and
at least one 1.
Solution:
The regular expression will be:
R = [(0 + 1)* 0 (0 + 1)* 1 (0 + 1)*] + [(0 + 1)* 1 (0 + 1)* 0 (0 + 1)*]
Example 7:
Describe the language denoted by the following regular expression:
r.e. = (b* (aaa)* b*)*
Solution:
The language can be read off the regular expression by finding the meaning of its parts. We will
first split the regular expression as:
r.e. = ((any number of b's) (aaa)* (any number of b's))*
L = {the strings in which a's appear in groups of three, so that every run of consecutive a's has a
length that is a multiple of 3; there is no restriction on the number of b's}
Example 8:
Write the regular expression for the language L over ∑ = {0, 1} such that no string contains
the substring 01.
Solution:
The language is as follows:
L = {ε, 0, 1, 00, 11, 10, 100, .....}
The regular expression for the above language is as follows:
R = (1* 0*)
Example 9:
Write the regular expression for the language containing the strings over {0, 1} in which there are at
least two occurrences of 1's between any two occurrences of 0's.
Solution: At least two 1's between two occurrences of 0's can be denoted by (0111*0)*.
Similarly, if there is no occurrence of 0's, then any number of 1's is also allowed. Hence the r.e.
for the required language is:
R = (1 + (0111*0))*
This expression only generates strings in which the 0's come in pairs. To also cover strings such as
0 or 0110110, one can use R = (1 + 011)* (ε + 0 + 01).
Example 10:
Write the regular expression for the language containing the strings in which every 0 is immediately
followed by 11.
Solution:
The regular expression will be:
R = (011 + 1)*
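Answers of this kind can be sanity-checked by brute force: enumerate every string up to some length and compare the regular expression against a direct test of the stated property. The sketch below uses Python's re module, where union is written | instead of +; the helper names and the length bound of 8 are arbitrary choices, and agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

def strings(alphabet, max_len):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for symbols in product(alphabet, repeat=n):
            yield "".join(symbols)

def agrees(pattern, predicate, alphabet="01", max_len=8):
    """True if regex and predicate accept exactly the same strings up to max_len."""
    rx = re.compile(pattern)
    return all(bool(rx.fullmatch(s)) == predicate(s)
               for s in strings(alphabet, max_len))

def every_zero_followed_by_11(s):
    return all(s[i + 1:i + 3] == "11" for i, ch in enumerate(s) if ch == "0")

checks = {
    "starts with 1 and ends with 0": agrees(
        r"1(0|1)*0", lambda s: len(s) >= 2 and s[0] == "1" and s[-1] == "0"),
    "at least one 0 and at least one 1": agrees(
        r"(0|1)*0(0|1)*1(0|1)*|(0|1)*1(0|1)*0(0|1)*",
        lambda s: "0" in s and "1" in s),
    "no substring 01": agrees(r"1*0*", lambda s: "01" not in s),
    "every 0 immediately followed by 11": agrees(
        r"(011|1)*", every_zero_followed_by_11),
}
for name, ok in checks.items():
    print(name, ok)
```

A False result would point to a string on which the expression and the description disagree, which is a quick way to catch an expression that is too narrow or too broad.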
Conversion of RE to FA
To convert an RE to an FA, we use a method called the subset method. Its steps are given below:
Step 1: Design a transition diagram for given regular expression, using NFA with ε moves.
Step 2: Convert this NFA with ε to NFA without ε.
Step 3: Convert the obtained NFA to equivalent DFA.
Example 1:
Design an FA from the given regular expression 10 + (0 + 11)0* 1.
Solution: First we will construct the transition diagram for the given regular expression.
Steps 1 to 5: [transition diagrams not reproduced]
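Now we have an NFA without ε. To convert it into the required DFA, we first write the transition
table for this NFA. The row for the start state q0 is read off the regular expression: 0 leads to q3,
and 1 leads to q1 or q2.
State 0 1
→q0 q3 {q1, q2}
q1 qf ϕ
q2 ϕ q3
q3 q3 qf
*qf ϕ ϕ
The transition table of the equivalent DFA is:
State 0 1
→[q0] [q3] [q1, q2]
[q1] [qf] ϕ
[q2] ϕ [q3]
[q3] [q3] [qf]
[q1, q2] [qf] [q3]
*[qf] ϕ ϕ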
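Step 3 of the method (NFA to DFA) can be sketched in code. The block below is an illustration, not the notes' own procedure: it assumes a start state q0 that goes to q3 on 0 and to q1 or q2 on 1, as the expression 10 + (0 + 11)0*1 implies, represents each DFA state as a frozenset of NFA states, and lets the empty set play the role of ϕ.

```python
from collections import deque

# NFA without ε-moves for 10 + (0 + 11)0*1 (the q0 entries are an assumption
# derived from the regular expression).
NFA = {
    ("q0", "0"): {"q3"},
    ("q0", "1"): {"q1", "q2"},
    ("q1", "0"): {"qf"},
    ("q2", "1"): {"q3"},
    ("q3", "0"): {"q3"},
    ("q3", "1"): {"qf"},
}
START, FINALS, ALPHABET = "q0", {"qf"}, "01"

def subset_construction(nfa, start, finals, alphabet):
    """Return (transitions, start, finals) of a DFA whose states are sets of NFA states."""
    dfa_start = frozenset({start})
    dfa, seen, queue = {}, {dfa_start}, deque([dfa_start])
    while queue:
        state = queue.popleft()
        for sym in alphabet:
            # Union of the NFA moves of every member state on this symbol.
            target = frozenset(t for q in state for t in nfa.get((q, sym), ()))
            dfa[(state, sym)] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return dfa, dfa_start, {s for s in seen if s & finals}

def accepts(dfa, start, finals, word):
    """Run the DFA on `word` and report whether it ends in a final state."""
    state = start
    for sym in word:
        state = dfa[(state, sym)]
    return state in finals

DFA, D_START, D_FINALS = subset_construction(NFA, START, FINALS, ALPHABET)
for word in ["10", "01", "111", "1101", "11", "100"]:
    print(word, accepts(DFA, D_START, D_FINALS, word))
```

Only subsets reachable from [q0] are generated, so the singleton states [q1] and [q2] do not appear in the result; they are reached only together, as [q1, q2].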
Example 2:
Design an NFA from the given regular expression 1 (1* 01* 01*)*.
Solution: The NFA for the given regular expression is as follows:
Steps 1 to 3: [transition diagrams not reproduced]
Example 3:
Construct the FA for regular expression 0*1 + 10.
Solution:
We will first construct FA for R = 0*1 + 10 as follows:
Steps 1 to 4: [transition diagrams not reproduced]
Finite automata can be used to generate strings in a regular language. A finite automaton for a
particular language is “programmed,” in a way, to generate the strings of that language through
its states and transition function. You can walk through a finite state machine to see which strings
it can produce, and which are therefore part of the language the machine describes, or you can feed it
an input string to see whether the machine accepts it.
Note: The symbol epsilon, ε, represents transitions on the empty string, sometimes called
"null transitions." The machine can take these transitions without reading a symbol from the input.
A regular expression is one way to represent a regular language as a string. For example, the
regular expression 0*1 | 1*0 describes the strings that consist of either any number of 0’s
followed by a single 1, or any number of 1’s followed by a single 0. This regular expression can be
represented by the following finite state machine:
[Figure: finite state machine for 0*1 | 1*0, not reproduced]
The regular expression (01)*11(10)*, which describes the strings starting with any number of “01”
substrings, followed by two 1’s and then any number of “10” substrings, can be represented with the
following finite state machine:
[Figure: finite state machine for (01)*11(10)*, not reproduced]
Minimization of DFA
Minimization of a DFA means reducing the number of states of the given FA. Thus, after minimizing the
FSM (finite state machine), we get an FSM without redundant states.
We follow these steps to minimize the DFA:
Step 1: Remove all the states that are unreachable from the initial state via any sequence of
transitions of the DFA.
Step 2: Draw the transition table for the remaining states.
Step 3: Now split the transition table into two tables T1 and T2. T1 contains all final states, and T2
contains non-final states.
Step 4: Find similar rows in T1, that is, two states q and r such that for every input symbol a:
δ (q, a) = p
δ (r, a) = p
That means, find two states that have the same next state for every input symbol, remove one of
them, and replace it by the other in the remaining rows.
Step 5: Repeat step 4 until there are no similar rows left in the transition table T1.
Step 6: Repeat step 4 and step 5 for table T2 also.
Step 7: Now combine the reduced T1 and T2 tables. The combined transition table is the transition
table of the minimized DFA.
Kleene Closure / Plus
Definition − The set ∑+ is the infinite set of all possible strings of all possible lengths over
∑, excluding λ.
Representation − ∑+ = ∑^1 ∪ ∑^2 ∪ ∑^3 ∪ …….
∑+ = ∑* − { λ }
Example − If ∑ = { a, b }, ∑+ = { a, b, aa, ab, ba, bb, ……….. }
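The representation as a union of ∑^1, ∑^2, ∑^3, … can be enumerated directly. A minimal sketch, with a hypothetical function name and a length cap because the full set is infinite:

```python
from itertools import product

def sigma_plus(alphabet, max_len):
    """Strings of Σ+ with length 1..max_len, shortest first (Σ^1, then Σ^2, ...)."""
    return ["".join(t)
            for n in range(1, max_len + 1)
            for t in product(alphabet, repeat=n)]

print(sigma_plus("ab", 2))   # ['a', 'b', 'aa', 'ab', 'ba', 'bb']
```

Starting the range at 0 instead of 1 would add the empty string and give the corresponding slice of ∑*.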
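Example:
[Figure: the DFA to be minimized, not reproduced]
Solution:
Step 1: In the given DFA, q2 and q4 are unreachable states, so remove them.
Step 2: Draw the transition table for the rest of the states.
State 0 1
→q0 q1 q3
q1 q0 q3
*q3 q5 q5
*q5 q5 q5
Step 3: Now divide the rows of the transition table into two sets:
1. One set contains the rows that start from non-final states:
State 0 1
q0 q1 q3
q1 q0 q3
2. The other set contains the rows that start from final states:
State 0 1
q3 q5 q5
q5 q5 q5
Step 4: Set 1 has no similar rows, so it stays as it is.
Step 5: In set 2, the rows for q3 and q5 are similar, since both go to q5 on 0 and on 1. Remove q5 and
replace q5 by q3 in the remaining rows:
State 0 1
q3 q3 q3
Step 6: Combine set 1 and set 2. The transition table of the minimized DFA is:
State 0 1
→q0 q1 q3
q1 q0 q3
*q3 q3 q3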
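For comparison, here is a sketch of a different technique, partition refinement (Moore's algorithm), run on the four reachable states from the example's transition table. This is not the row-comparison method used above; the function name and data layout are illustrative choices. States are grouped by finality first, then split whenever two states in a group lead to different groups on some symbol.

```python
# Reachable part of the example DFA (q2 and q4 already removed).
DFA = {
    "q0": {"0": "q1", "1": "q3"},
    "q1": {"0": "q0", "1": "q3"},
    "q3": {"0": "q5", "1": "q5"},
    "q5": {"0": "q5", "1": "q5"},
}
FINALS = {"q3", "q5"}

def equivalent_blocks(dfa, finals):
    """Return the groups of indistinguishable states (Moore-style refinement)."""
    # Initial split: non-final states versus final states.
    partition = [b for b in (set(dfa) - finals, set(dfa) & finals) if b]
    while True:
        index = {s: i for i, block in enumerate(partition) for s in block}
        refined = {}
        for state, moves in dfa.items():
            # Signature: own block, then the block reached on each symbol.
            key = (index[state],) + tuple(index[moves[a]] for a in sorted(moves))
            refined.setdefault(key, set()).add(state)
        if len(refined) == len(partition):   # no block was split: done
            return list(refined.values())
        partition = list(refined.values())

print(sorted(sorted(b) for b in equivalent_blocks(DFA, FINALS)))
# [['q0', 'q1'], ['q3', 'q5']]
```

On this input the refinement also groups q0 with q1: each goes to the other on 0 and to q3 on 1, so no input string tells them apart. Comparing rows literally does not reveal this, because the two rows name different states in the 0 column.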
Pumping Lemma
1. L = { a^k b^k | k ≥ 0 }
see notes
2. [The statement of this problem and the start of its argument are missing; the surviving fragment
reads a^{p+2k}.]
It should be relatively clear that p + k, p + 2k, etc., cannot all be prime. Let us add k p
times; then we must have:
a^{p+pk} ∈ L, and of course a^{p+pk} = a^{p(k+1)},
so this would imply that (k + 1)p is prime, which it is not, since it is divisible by both p and
k + 1.
3. L = { a^n b^{n+1} }
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L such
that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us choose a^p b^{p+1}. Its
length is 2p + 1 ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for
some k > 0. From the pumping lemma a^{p-k} b^{p+1} must also
be in L, but it is not of the right form. Hence the language is not regular.
Note that the repeatable string needs to appear in the first n symbols (n being the pumping
length) to avoid the following situation:
assume, for the sake of argument, that n = 20 and you choose the string a^10 b^11, which is of
length larger than 20, but |xy| ≤ 20 allows xy to extend into the b's, which means that y could
contain some b’s. In such a case, removing y (or adding more y’s) could lead to strings which
still belong to L.
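The shape of this argument can be checked mechanically for one concrete value of p. The sketch below picks p = 5 arbitrarily, enumerates every split of a^p b^{p+1} allowed by the lemma, and tests whether any split survives pumping. The function names are illustrative, and a check for a single p is a demonstration of the argument, not a proof.

```python
def in_L(w):
    """Membership in L = { a^n b^(n+1) | n >= 0 }."""
    n = len(w) - len(w.lstrip("a"))        # number of leading a's
    return w == "a" * n + "b" * (n + 1)

def splits(w, p):
    """All decompositions w = xyz with |xy| <= p and |y| > 0."""
    for i in range(p + 1):                  # i = |x|
        for j in range(i + 1, p + 1):       # j = |xy|
            yield w[:i], w[i:j], w[j:]

def survives_pumping(w, p, max_i=3):
    """True if some split keeps x y^i z in L for every i in 0..max_i."""
    return any(all(in_L(x + y * i + z) for i in range(max_i + 1))
               for x, y, z in splits(w, p))

p = 5                                       # hypothetical pumping length
w = "a" * p + "b" * (p + 1)
print(in_L(w), survives_pumping(w, p))      # True False
```

Every allowed y lies inside the a's, so removing it (i = 0) already leaves too few a's, which is exactly the step used in the proof.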
4. L = { a^n b^{2n} }
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L such
that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us choose a^p b^{2p}. Its
length is 3p ≥ p. Since the length of xy cannot exceed p, y must be of the form a^k for some
k > 0. From the pumping lemma a^{p-k} b^{2p} must also be in L,
but it is not of the right form. Hence the language is not regular.
[The statement of this problem is missing; the argument concerns a language of palindromes.]
From the pumping lemma there exists an n such that every w ∈ L longer than n can be
represented as xyz with |y| ≠ 0 and |xy| ≤ n. Let us choose a^n b a^n.
Its length is 2n + 1 ≥ n. Since the length of xy cannot exceed n, y must be of the form a^k for
some k > 0. From the pumping lemma a^{n-k} b a^n must also be in L, but it is not a palindrome.
12. L = { 0^n | n is a power of 2 }
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L
such that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us choose
n = 2^p. Since the length of xy cannot exceed p, y must be of the form 0^k for some
0 < k ≤ p. From the pumping lemma 0^m where m = 2^p + k must also be in
L. We have
2^p < 2^p + k ≤ 2^p + p < 2^{p+1}
Hence this string is not of the right form. Hence the language is not regular.
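The chain of inequalities is the heart of this proof, and it is easy to spot-check numerically. The following sketch (function names and the bound of 15 are arbitrary) confirms for small p that 2^p + k falls strictly between two consecutive powers of 2 for every 0 < k ≤ p.

```python
def is_power_of_two(m):
    """True for 1, 2, 4, 8, ... (exactly one bit set)."""
    return m > 0 and (m & (m - 1)) == 0

def check_gap(max_p):
    """Verify 2^p < 2^p + k <= 2^p + p < 2^(p+1) for 1 <= k <= p <= max_p."""
    for p in range(1, max_p + 1):
        for k in range(1, p + 1):
            m = 2**p + k
            if not (2**p < m <= 2**p + p < 2**(p + 1)):
                return False
            if is_power_of_two(m):
                return False
    return True

print(check_gap(15))   # True
```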
16. L = { a^n b^l a^k | k = n + l }
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L
such that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let
us choose a^p b a^{p+1}. Its length is 2p + 2 ≥ p. Since the length of xy cannot exceed p, y must
be of the form a^m for some m > 0. From the pumping lemma a^{p-m} b a^{p+1}
must also be in L, but it is not of the right form. Hence the language is not regular.
20. L = { a^{n!} | n ≥ 0 }
Proof by contradiction:
Let us assume L is regular. From the pumping lemma, there exists a number p such that
any string w ∈ L of length greater than p has a “repeatable” substring generating more
strings in the language L. Let us consider a^{m!} with m = p (unless p < 3, in which case we
choose m = 3). From the pumping lemma the string w has a “repeatable” substring. We will assume
that this substring is of length k, with 1 ≤ k ≤ p ≤ m.
From the pumping lemma a^{m! - k} must also be in L. For this to be true there must
be a j such that j! = m! - k. But this is not possible, since when m > 2 and k ≤ m we have
(m - 1)! < m! - k < m!
Hence L is not regular.
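The key inequality of problem 20 says that, after removing k symbols, the length lands strictly between two consecutive factorials. A small numeric spot-check, writing m for the index of the chosen factorial (with m > 2) and k ≤ m for the length of the removed substring; the function name and the range tested are arbitrary:

```python
from math import factorial

def between_factorials(m, k):
    """True if (m-1)! < m! - k < m!, so m! - k cannot itself be a factorial."""
    return factorial(m - 1) < factorial(m) - k < factorial(m)

print(all(between_factorials(m, k)
          for m in range(3, 12)
          for k in range(1, m + 1)))   # True
```

The restriction m > 2 matters: for m = 2 and k = 1 the value m! - k equals 1 = 1!, which is why the proof switches to 3! when the pumping length is below 3.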
21. L = { a^n b^l | n ≠ l }
Proof by contradiction:
Let us assume L is regular. From the pumping lemma, there exists a number p
such that any string w ∈ L of length greater than p has a “repeatable” substring generating
more strings in the language L. Let us consider n = p! and l = (p+1)!, that is, w = a^{p!} b^{(p+1)!}.
From the pumping lemma this string, which is of length larger than p, has a “repeatable” substring y.
Since |xy| ≤ p, y consists only of a's; we will assume that it is of length k, with 1 ≤ k ≤ p.
From the pumping lemma we can add y i - 1 times for a total of i y's. If we can find
an i such that the resulting number of a’s is the same as the number of b’s, we have
won. This means we must find i such that:
p! + (i - 1) * k = (p + 1)! or
(i - 1) * k = (p + 1) * p! - p! = p * p! or
i = (p * p!) / k + 1
Since k ≤ p, we know that k must divide p!, and so (p * p!) / k must be an
integer. This proves that we can choose i to obtain the above equality; the pumped string
a^{(p+1)!} b^{(p+1)!} is not in L, a contradiction.
Hence L is not regular.
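The arithmetic in problem 21 can be replayed for small values. In this sketch the pumping length is called p and the length of the repeated block of a's is k, with 1 ≤ k ≤ p; the function name is made up for illustration. It computes the number of copies i and confirms that the pumped string has as many a's as b's.

```python
from math import factorial

def pump_count(p, k):
    """Copies i of y = a^k that turn a^(p!) b^((p+1)!) into a^((p+1)!) b^((p+1)!)."""
    extra, rem = divmod(p * factorial(p), k)   # solves (i - 1) * k = p * p!
    assert rem == 0, "every k <= p divides p!"
    return extra + 1

for p in range(1, 8):
    for k in range(1, p + 1):
        i = pump_count(p, k)
        # Number of a's after pumping equals the number of b's.
        assert factorial(p) + (i - 1) * k == factorial(p + 1)

print(pump_count(3, 2))   # 10
```

For p = 3 and k = 2 this gives i = 10: the string a^6 b^24 becomes a^24 b^24 after nine extra copies of aa.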
23. L = { a^n b^l c^k | k ≠ n + l }
Assume L is regular. From the pumping lemma there exists a p such that every w ∈ L
such that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us choose
a^{p!} b^{p!} c^{(p+1)!}, which is in L for p ≥ 2 (a pumping length can always be increased). Its
length is 2p! + (p+1)! ≥ p. Since the length of xy cannot exceed p, y must be of
the form a^m for some m > 0. From the pumping lemma
any string of the form x y^i z must always be in L. If we can show that it is always
possible to choose i in such a way that we will have k = n + l for one such string, we will
have shown a contradiction. Indeed we can have
p! + (i - 1) m + p! = (p + 1)!
if we have i = 1 + ((p + 1)! - 2 p!) / m. Is that possible? Only if m divides
(p + 1)! - 2 p!. We have
(p + 1)! - 2 p! = (p + 1 - 2) p! = (p - 1) p!, and since m ≤ p, m is guaranteed to divide p!.
Hence i exists and the language is not regular.
24. L = { a^n b^l a^k | n = l or l ≠ k }
Proof by contradiction:
Let us assume L is regular. From the pumping lemma, there exists a number p such
that any string w ∈ L of length greater than p has a “repeatable” substring generating more
strings in the language L. Let us consider w = a^p b^p a^p, which is in L since n = l. From
the pumping lemma the string w, of length larger than p, has a “repeatable” substring y within its
first p symbols. We will assume that this substring is of length m ≥ 1. From the
pumping lemma we can remove y and the resulting string should be in L.
However, if we remove y we get a^{p-m} b^p a^p. But this string is not in L, since n = p - m ≠ p = l
and l = p = k.
Hence L is not regular.
25. L = { a^n b a^{3n} | n ≥ 0 }
Assume L is regular. From the pumping lemma there exists a p such that every w
∈ L such that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us
choose a^p b a^{3p}. Its length is 4p + 1 ≥ p. Since the length of xy cannot exceed p, y
must be of the form a^k for some k > 0. From the pumping lemma a^{p-k} b a^{3p}
must also be in L, but it is not of the right form. Hence the language is not
regular.
26. L = { a^n b^n c^n | n ≥ 0 }
Assume L is regular. From the pumping lemma there exists a p such that every w ∈
L such that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us
choose a^p b^p c^p. Its length is 3p ≥ p. Since the length of xy cannot exceed p, y must
be of the form a^k for some k > 0. From the pumping lemma a^{p-k} b^p c^p must
also be in L, but it is not of the right form. Hence the language is not regular.
28. L = { 0^k 1 0^k | k ≥ 0 }
Assume L is regular. From the pumping lemma there exists an n such that every w
∈ L such that |w| ≥ n can be represented as xyz with |y| ≠ 0 and |xy| ≤ n. Let us
choose 0^n 1 0^n. Its length is 2n + 1 ≥ n. Since the length of xy cannot exceed n, y
must be of the form 0^p for some p > 0. From the pumping lemma 0^{n-p} 1 0^n must
also be in L, but it is not of the right form. Hence the language is not regular.
29. L = { 0^n 1^m 2^n | n, m ≥ 0 }
Assume L is regular. From the pumping lemma there exists a p such that every w
∈ L such that |w| ≥ p can be represented as xyz with |y| ≠ 0 and |xy| ≤ p. Let us
choose 0^p 1 2^p. Its length is 2p + 1 ≥ p. Since the length of xy cannot exceed p, y
must be of the form 0^k for some k > 0. From the pumping lemma 0^{p-k} 1 2^p must
also be in L, but it is not of the right form. Hence the language is not regular.