
Toc Unit 2

Copyright
© © All Rights Reserved

UNIT II REGULAR EXPRESSIONS AND LANGUAGES

Regular Expressions – FA and Regular Expressions – Proving Languages not to be regular – Closure
Properties of Regular Languages – Equivalence and Minimization of Automata.

2. REGULAR EXPRESSIONS AND LANGUAGES

Definition:
1. Regular expressions are a structural notation for describing the languages that can be represented by
finite automata.
2. In other words, the languages accepted by finite automata are easily described by simple
expressions called regular expressions.
3. Regular expressions serve as the input language for many systems that process strings.
Examples include:
1. Search commands such as the UNIX grep, or equivalent commands for finding strings in web
browsers or text-formatting systems.
2. Lexical-analyzer generators, such as Lex or Flex.
2.1. The Operators of Regular Expression:
The operators of regular expressions represent three operations on languages:
1. Union:
The union of two languages L and M, denoted by L ⋃ M, is the set of strings that are in either L or
M, or both.
Example:
L = {001,10,111}
M = {ϵ, 001}
L⋃M = {ϵ,10,001,111}
2. Concatenation:
The concatenation of languages L and M is the set of strings that can be formed by taking any
string in L and concatenating it with any string in M
Example:
L = {001,10,111}
M = {ϵ, 001}
L.M = {001,10,111,001001,10001,111001}
3. Closure:
The closure (or star, or Kleene closure) of a language L is denoted by L* and represents the set of
those strings that can be formed by taking any number of strings from L, possibly with repetition and
concatenating all of them.
Example
L = {0,1}
L* is the set of all strings of 0’s and 1’s.
L* is the infinite union ⋃ (i ≥ 0) L^i, where L^0 = {ϵ}, L^1 = L and, for i > 1, L^i = LL…L (the
concatenation of i copies of L).
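The three operations can be tried out on small finite languages. The following is a minimal sketch in Python, assuming languages are represented as sets of strings; the function names union, concat and star are chosen here for illustration, and star is cut off at a maximum length because L* is usually infinite.

```python
from itertools import product

def union(L, M):
    """Strings that are in L or in M (or both)."""
    return L | M

def concat(L, M):
    """Every string of L followed by every string of M."""
    return {x + y for x, y in product(L, M)}

def star(L, max_len):
    """Strings of L* whose length is at most max_len."""
    result = {""}            # L^0 = {epsilon}
    frontier = {""}
    while frontier:
        # Extend the newest strings by one more string of L.
        frontier = {w for w in concat(frontier, L)
                    if len(w) <= max_len and w not in result}
        result |= frontier
    return result

L = {"001", "10", "111"}
M = {"", "001"}              # "" plays the role of epsilon
print(sorted(union(L, M)))
print(sorted(concat(L, M)))
print(sorted(star({"0", "1"}, 2)))
```

With L and M as in the examples above, union and concat reproduce the sets listed for L ⋃ M and L.M.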

2.2. Building Regular Expressions:


Let ∑ be an alphabet. The regular expressions over ∑ and the languages they denote are defined as
follows:
Basis:
1. The constant φ is a regular expression denoting the empty language, L(φ) = { } (no strings at all).
2. The constant ϵ is a regular expression denoting the language L(ϵ) = {ϵ}.
3. If a is any symbol, then a is a regular expression. This expression denotes the language {a}, i.e.,
L(a) = {a}.
4. A variable, usually capitalized and italic such as L, represents any language.
Induction: There are four parts in the inductive step, one for each of the three operators and one for parentheses.
1. If E and F are regular expressions, then E+F is a regular expression denoting the union of L(E) and
L(F). That is, L(E+F) = L(E) U L(F).
2. If E and F are regular expressions, then EF is a regular expression denoting the concatenation of
L(E) and L(F). That is, L(EF) = L(E) L(F).
3. If E is a regular expression, then E* is a regular expression denoting the closure of L(E). That is,
L(E*) = (L(E))*.
4. If E is a regular expression, then (E), a parenthesized E, is also a regular expression, denoting the
same language as E. Formally L((E)) = L(E).
2.3. Precedence of Regular-Expression Operators:
The regular-expression operators have an order of “precedence”, which means that operators are associated
with their operands in a particular order.
The following is the order of precedence for the operators:
1. The star operator is the highest precedence.
2. Next in precedence comes the concatenation or “dot” operator.
3. Finally, all unions (+ operators) are grouped with their operands.
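Descriptions:
1. ϵ - denotes the empty string.
L(ϵ) = {ϵ}
[Figure: transition diagram with the single state q0]

2. 0* - denotes the language of strings of any number of zeros, including ϵ.
L(0*) = {ϵ, 0, 00, 000, …}
[Figure: transition diagram with the single state q0]

3. 01* - The set of all strings that begin with a 0 followed by any number of ones.
L(01*) = {0, 01, 011, …}
[Figure: transition diagram with states q0, q1, q2 and arcs labelled 0 and 1]

4. 0+1 - denotes the language consisting of the two strings ‘0’ and ‘1’.
L(0+1) = {0, 1}
[Figure: transition diagram with states q0, q2 and an arc labelled 0,1]

5. (0+1)* - The set of all strings made up of any combination of 0’s and 1’s, including ϵ.
L((0+1)*) = {ϵ, 0, 1, 00, 11, 010, 11100, …}
[Figure: transition diagram with states q0, q1 and arcs labelled 0,1]
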

Regular set:
Regular sets are the sets which are accepted by finite automata.
Ex:
L((0+1)*) = {ϵ, 0, 1, 00, 11, 010, 11100, …}
Problems:
1. Design a regular expression for the language containing any number of a’s and b’s.
R = (a+b)* - ϵ is accepted.
R = (a+b)(a+b)*, also written (a+b)^+ - ϵ is not accepted, since at least one symbol is required.
2. Construct a regular expression for the language of all strings over ∑ = {0,1} that end with 00.
R = (any number of 0’s and 1’s) 00
= (0+1)* 00
L = {00, 000, 100, 0100, …}

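3. Construct an R.E. for the set of all strings of 0’s and 1’s not containing 101 as a substring.
Once a 1 has been read, it must not be followed by a single 0 and then another 1. Equivalently, every run of
0’s that lies between two 1’s must have length at least 2.
R = (0* 1* 00)* 0* 1* (0+ϵ)
The final factor (0+ϵ) allows a string to end with a single 0 after a 1, as in 10 or 0110. Without it, the expression (0* 1* 00)* 0* 1* does not generate such strings.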
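A candidate expression for this problem can be tested by brute force over all short strings. The sketch below is one way to do that in Python, assuming the standard re module as the matcher; the helper names to_python and counterexamples are hypothetical, and the translation only handles the notation used here ('+' for union, 'ϵ' for the empty string).

```python
import re
from itertools import product

def to_python(regex):
    """Translate textbook notation into Python re syntax."""
    return regex.replace(" ", "").replace("+", "|").replace("ϵ", "")

def counterexamples(regex, max_len=8):
    """Strings over {0,1} on which the regex disagrees with 'has no 101'."""
    pattern = re.compile(to_python(regex))
    bad = []
    for n in range(max_len + 1):
        for chars in product("01", repeat=n):
            s = "".join(chars)
            wanted = "101" not in s
            if bool(pattern.fullmatch(s)) != wanted:
                bad.append(s)
    return bad

# Without the optional final 0, strings such as "10" are missed.
print(counterexamples("(0* 1* 00)* 0* 1*")[:3])
# With the factor (0+ϵ) no disagreement is found up to the length bound.
print(counterexamples("(0* 1* 00)* 0* 1* (0+ϵ)"))
```

A bounded search of this kind cannot prove an expression correct, but it finds small counterexamples quickly.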
4. For the regular expression ab(a+b)* a, generate the language.
L = Set of all strings that begin with ab and end with a.
= L(a) L(b) (L(a) ⋃ L(b))* L(a)
= { ab w a | w ∈ {a,b}* }
5. Construct the language for the given R.E. a + a(a+b)* a
L = L(a) ⋃ ( L(a) (L(a) ⋃ L(b))* L(a) )
= {a} ⋃ { a w a | w ∈ {a,b}* }
6. Write a regular expression for the following language.
a) The set of all strings over the alphabet {a, b, c} containing at least one ‘a’ and at least one ‘b’.
Solution:
The set of all strings = (a+b+c)*
The set of all strings containing
At least one ‘a’ = (a+b+c)* a (a+b+c)*
At least one ‘b’ = (a+b+c)* b (a+b+c)*
The ‘a’ may come before or after the ‘b’, so both orders must be covered.
Result,
R.E = (a+b+c)* a (a+b+c)* b (a+b+c)* + (a+b+c)* b (a+b+c)* a (a+b+c)*

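7. Describe the language denoted by the following regular expression (b*(aaa)* b*)*
The language consists of strings in which a’s appear in groups of three (every maximal run of a’s has a length that is a multiple of 3); there is no restriction on the number of
b’s.
8. Give English descriptions of the languages of the following regular expressions.
a. (0+10)* 1*
- Set of all strings of 0’s and 1’s in which consecutive 1’s occur only at the end, i.e., no pair 11 is followed by a 0.
b. (0* 1*)* 000 (0+1)*
- Set of all strings of 0’s and 1’s that contain 000 as a substring. (The first factor needs the outer star; 0* 1* alone does not generate every possible prefix.)
c. (1+ϵ)(00*1*)*0*
- Set of all strings of 0’s and 1’s that do not begin with 11: an optional leading 1, followed either by nothing or by a 0 and then any string of 0’s and 1’s.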

2.4 FA and Regular Expression:


2.4.1 Converting Regular Expression to Automata:
Theorem:
Every language defined by a regular expression is also defined by finite automaton.
(or)
Let r be a regular expression, then there exists an NFA with ϵ transitions that accepts L(r).
Proof:
Suppose L =L(R) for a regular expression R. We show that L = L(E) for some ϵ-NFA E with:
1. Exactly one accepting state.
2. No arcs into the initial state.
3. No arcs out of the accepting state.
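Basis: (Zero operators)
If r = ϵ, M is
[Figure: a start state and an accepting state joined by a single arc labelled ϵ]

If r = φ, M is
[Figure: a start state and an accepting state with no arc between them, so no string is accepted]

If r = a for some a ∈ ∑, M is
[Figure: a start state and an accepting state joined by a single arc labelled a]
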

Induction: (One or more operators)


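In the induction, the regular expression contains one or more operators.
There are three cases:
1. Union
2. Concatenation
3. Closure
Case 1: Union
Let r = r1 + r2, where r1 and r2 are regular expressions. By the inductive hypothesis there exist two NFA’s
M1 = (Q1, ∑1, δ1, q1, {f1}) and
M2 = (Q2, ∑2, δ2, q2, {f2})
such that L(M1) = L(r1), i.e., the language represented by the R.E. r1 is the same as the language accepted by M1.
Similarly, L(M2) = L(r2).
Q1 and Q2 are the sets of states of M1 and M2 respectively, and are assumed to be disjoint.
Form a new NFA

M = (Q1 U Q2 U {q0, f0}, ∑1 U ∑2, δ, q0, {f0})

where q0 is a new start state and f0 is a new accepting state. δ contains all transitions of δ1 and δ2, together with ϵ-transitions from q0 to q1 and to q2, and from f1 and f2 to f0.

[Figure: NFA M for r = r1 + r2. q0 has ϵ-arcs to q1 (start of M1) and to q2 (start of M2); f1 and f2 have ϵ-arcs to f0]

Every path from q0 to f0 passes through either M1 or M2, so
L(M) = L(M1) U L(M2) = L(r1) U L(r2) = L(r1 + r2)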
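Case 2: Concatenation
Let r = r1 r2, where r1 and r2 are regular expressions. As in Case 1, there exist two NFA’s M1 and M2 with L(M1) = L(r1) and L(M2) = L(r2).
Form a new NFA
M = (Q1 U Q2, ∑1 U ∑2, δ, q1, {f2})
where δ contains all transitions of δ1 and δ2 together with an ϵ-transition from f1 to q2.

[Figure: NFA M for r = r1 r2. M1 (from q1 to f1) is followed by an ϵ-arc from f1 to q2, then M2 (from q2 to f2)]

Therefore, L(M) = L(M1) L(M2)
= L(r1) L(r2)
= L(r1 r2)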
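Case 3: Closure
Let r = r1*, where r1 is a regular expression.
The NFA M1 = (Q1, ∑1, δ1, q1, {f1}) is such that L(M1) = L(r1).
Form a new NFA
M = (Q1 U {q0, f0}, ∑1, δ, q0, {f0})
where δ contains all transitions of δ1 together with ϵ-transitions from q0 to q1, from f1 to f0, from f1 back to q1 (to repeat r1) and from q0 to f0 (to accept ϵ).

[Figure: NFA M for r = r1*. q0 has an ϵ-arc to q1, f1 has an ϵ-arc to f0, and further ϵ-arcs allow M1 to be repeated or skipped]

Therefore,
L(M) = (L(M1))* = L(r1*)
Hence, every language defined by a regular expression is also defined by a finite automaton.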
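The inductive construction in this proof translates fairly directly into code. The following Python sketch is one possible rendering, not the only one: it assumes a regular expression is given as a nested tuple such as ("union", e, f), ("cat", e, f), ("star", e), ("sym", "0"), ("eps",) or ("empty",), and the names NFA, build and accepts are chosen here for illustration. ϵ-arcs are stored under the label None.

```python
from collections import defaultdict
from itertools import count

class NFA:
    def __init__(self):
        self.delta = defaultdict(set)   # (state, label) -> set of states
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def add(self, p, label, q):
        self.delta[(p, label)].add(q)

def build(nfa, e):
    """Return (start, accept) of an eps-NFA fragment for expression e."""
    kind = e[0]
    if kind == "cat":
        s1, f1 = build(nfa, e[1])
        s2, f2 = build(nfa, e[2])
        nfa.add(f1, None, s2)           # eps-arc joins the two machines
        return s1, f2
    s, f = nfa.new_state(), nfa.new_state()
    if kind == "eps":
        nfa.add(s, None, f)
    elif kind == "sym":
        nfa.add(s, e[1], f)
    elif kind == "union":
        for sub in e[1:]:
            s1, f1 = build(nfa, sub)
            nfa.add(s, None, s1)
            nfa.add(f1, None, f)
    elif kind == "star":
        s1, f1 = build(nfa, e[1])
        nfa.add(s, None, s1)
        nfa.add(f1, None, f)
        nfa.add(f1, None, s1)           # repeat the inner machine
        nfa.add(s, None, f)             # or skip it entirely
    # kind == "empty": no arcs, so nothing is accepted
    return s, f

def eclose(nfa, states):
    """All states reachable from `states` by eps-arcs only."""
    stack, seen = list(states), set(states)
    while stack:
        p = stack.pop()
        for q in nfa.delta.get((p, None), ()):
            if q not in seen:
                seen.add(q)
                stack.append(q)
    return seen

def accepts(e, w):
    nfa = NFA()
    start, accept = build(nfa, e)
    current = eclose(nfa, {start})
    for a in w:
        moved = set()
        for p in current:
            moved |= nfa.delta.get((p, a), set())
        current = eclose(nfa, moved)
    return accept in current

zero_or_one = ("union", ("sym", "0"), ("sym", "1"))
r = ("cat", ("cat", ("star", zero_or_one), ("sym", "1")), zero_or_one)
print(accepts(r, "0110"), accepts(r, "0101"))
```

Each fragment has exactly one accepting state, no arcs into its start state and no arcs out of its accepting state, which is what lets the cases be combined freely.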

Converting Regular Expression to Automata: (Thompson’s Construction)
Examples:
1. R = 0 + 1
[Figure: ϵ-NFA for 0 + 1]
2. R = a*
[Figure: ϵ-NFA for a*]
3. R = (0+1)*
[Figure: ϵ-NFA for (0+1)*]
Problems:
1. Convert the regular expression (0+1)* 1 (0+1) to an ϵ-NFA.
[Figures: ϵ-NFA’s for (0+1) and for (0+1)*, and the complete automaton constructed for R = (0+1)* 1 (0+1)]
2. Construct an NFA equivalent to the regular expression (0+1)* (00+11).
2.4.2 Converting DFA’s to Regular Expression:
Theorem :
If L = L(A) for some DFA A, Then there is a regular expression R such that L = L(R).
Proof:
Let A be a DFA which defines the language L.
Let the states of A be {1, 2, 3, …, n}. We construct regular expressions of the following form.
Let Rij(k) be the name of a regular expression whose language is the set of strings w that label a path from state i
to state j in A, such that the path has no intermediate node whose number is greater than k.
We construct the expressions Rij(k) by induction on k.
Basis:
If k = 0, there can be no intermediate nodes between i and j. There are two kinds of path:
i) An arc from state i to state j
ii) An arc from a node to itself, or the path of length 0 that consists of the node alone (possible only if i = j)
[Figure: i) an arc from node i to node j; ii) a loop on a single node]
Examine the DFA A and find those input symbols ‘a’ such that there is a transition from state i to state j on a.
If i ≠ j then
a) If there is no such symbol a, then Rij(0) = φ
b) If there is exactly one such symbol a, then Rij(0) = a
c) If there are symbols a1, a2, …, ak that label arcs from state i to state j, then
Rij(0) = a1 + a2 + … + ak
If i = j (Ex: R11(0)), consider all loops from i to itself, together with the path of length 0, which contributes ϵ.
a. If there is no such symbol a, then Rii(0) = ϵ.
b. If there is exactly one such symbol a, then Rii(0) = ϵ + a.
c. If there are symbols a1, a2, …, ak that label arcs from state i to itself, then Rii(0) = ϵ + a1 + a2 + … + ak
Induction:
Suppose k ≥ 1 and there is a path from state i to state j that goes through no state higher than k.
There are two cases.
1. The path does not go through state k at all. In this case, the label of the path is in the language of
Rij(k-1).
2. The path goes through state k at least once. We can break the path into several pieces as follows:
a. The first piece goes from state i to state k without passing through k.
b. Each middle piece goes from k to itself without passing through k.
c. The last piece goes from k to j without passing through k.

[Figure: path i → k → k → … → k → j. The first piece is in Rik(k-1), the middle pieces are zero or more strings in Rkk(k-1), and the last piece is in Rkj(k-1)]

When we combine the expressions for the paths of the two types, we get the expression

Rij(k) = Rij(k-1) + Rik(k-1) [Rkk(k-1)]* Rkj(k-1)

Computing these expressions for k = 1, 2, …, n gives Rij(n) for all i and j. The regular expression for the language of the automaton is then the sum
(union) of all expressions Rij(n) such that i is the start state and j is an accepting state.
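The recurrence for Rij(k) can be carried out mechanically. The sketch below is a rough Python illustration under stated assumptions: states are numbered 1..n, the transition function is a dictionary from (state, symbol) to state, None stands for φ and the empty string for ϵ, and the result is written in Python re syntax ('|' instead of '+') so that it can be checked with the standard re module. The function names are hypothetical and no simplification of the expressions is attempted.

```python
import re
from itertools import product

def union(a, b):
    """a + b, where None is phi and "" is epsilon."""
    if a is None:
        return b
    if b is None:
        return a
    if a == b:
        return a
    if a == "":
        return f"(?:{b})?"
    if b == "":
        return f"(?:{a})?"
    return f"(?:{a}|{b})"

def concat(*parts):
    if any(p is None for p in parts):    # phi R = R phi = phi
        return None
    return "".join(parts)

def star(a):
    if a is None or a == "":             # phi* = epsilon* = epsilon
        return ""
    return f"(?:{a})*"

def dfa_to_regex(n, delta, start, finals):
    # Basis: k = 0, only direct arcs (plus epsilon when i = j).
    R = {}
    for i in range(1, n + 1):
        for j in range(1, n + 1):
            r = "" if i == j else None
            for (p, a), q in delta.items():
                if p == i and q == j:
                    r = union(r, re.escape(a))
            R[i, j] = r
    # Induction: allow state k as an intermediate node.
    for k in range(1, n + 1):
        R = {(i, j): union(R[i, j],
                           concat(R[i, k], star(R[k, k]), R[k, j]))
             for i in range(1, n + 1) for j in range(1, n + 1)}
    result = None
    for f in finals:
        result = union(result, R[start, f])
    return result

# Two-state DFA: state 1 loops on 1 and moves to 2 on 0; state 2 loops on 0, 1.
delta = {(1, "1"): 1, (1, "0"): 2, (2, "0"): 2, (2, "1"): 2}
rx = dfa_to_regex(2, delta, 1, [2])
words = ("".join(c) for n in range(7) for c in product("01", repeat=n))
print(all(bool(re.fullmatch(rx, w)) == ("0" in w) for w in words))
```

The unsimplified output is much longer than a hand-derived expression such as 1*0(0+1)*, which is why the algebraic identities for regular expressions are applied at every step when working by hand.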
Basic Formulas for Regular Expressions:

1. (ϵ + 1)* = 1*
2. 1*(ϵ + 1) = 1*
3. (ϵ + 1) + 1* = 1*
4. 0 + 01* = 01*
5. 0 + 1*0 = 1*0
6. ϵ + 00* = 0*
7. (1 + 0)* = (1*0*)*
8. φR = Rφ = φ
9. φ + R = R
10. ϵR = Rϵ = R
11. R*R + R = R*R
12. R + R = R
13. RR* = R*R, and ϵ + RR* = R*
14. ϵ* = ϵ
15. ϵ + R* = R*
16. φ* = ϵ
Problems:

1. Convert the following DFA to a regular expression.

[Figure: DFA with states 1 (start) and 2 (accepting). State 1 has a loop on 1 and an arc on 0 to state 2; state 2 has a loop on 0,1]

Solution:
If k = 0
R11(0) = 1 + ϵ
R12(0) = 0
R21(0) = φ
R22(0) = ϵ + 0 + 1
If k = 1
Rij(k) = Rij(k-1) + Rik(k-1) [Rkk(k-1)]* Rkj(k-1)
Rij(1) = Rij(0) + Ri1(0) [R11(0)]* R1j(0)
R11(1) = R11(0) + R11(0) [R11(0)]* R11(0)
= (1+ϵ) + (1+ϵ) (1+ϵ)* (1+ϵ)
= (1+ϵ) + (1+ϵ) 1* (1+ϵ)
= (1+ϵ) + 1*
= 1*
R12(1) = R12(0) + R11(0) [R11(0)]* R12(0)
= 0 + (1+ϵ) (1+ϵ)* 0
= 0 + (1+ϵ) 1* 0
= 0 + 1*0
= 1*0
R21(1) = R21(0) + R21(0) [R11(0)]* R11(0)
= φ + φ (1+ϵ)* (1+ϵ)
= φ + φ
= φ
R22(1) = R22(0) + R21(0) [R11(0)]* R12(0)
= (ϵ+1+0) + φ (1+ϵ)* 0
= (ϵ+1+0) + φ
= ϵ+1+0
If k = 2
R11(2) = R11(1) + R12(1) [R22(1)]* R21(1)
= 1* + 1*0 (ϵ+1+0)* φ
= 1* + φ
= 1*
R12(2) = R12(1) + R12(1) [R22(1)]* R22(1)
= 1*0 + 1*0 (ϵ+1+0)* (ϵ+1+0)
= 1*0 + 1*0 (1+0)*
= 1*0 (1+0)*          [R + RS* = RS*, since ϵ is in L(S*)]
R21(2) = R21(1) + R22(1) [R22(1)]* R21(1)
= φ + (ϵ+1+0) (ϵ+1+0)* φ
= φ + φ
= φ
R22(2) = R22(1) + R22(1) [R22(1)]* R22(1)
= (ϵ+1+0) + (ϵ+1+0) (ϵ+1+0)* (ϵ+1+0)
= (ϵ+1+0) + (ϵ+1+0) (0+1)* (ϵ+1+0)
= (ϵ+1+0) + (ϵ+1+0) (0+1)*
= (ϵ+1+0) + (0+1)*
= (0+1)*
All the Rij(k) have now been constructed.
The start state is 1 and the accepting state is 2, so the required R.E. is Rij(n) with i = 1, j = 2, n = 2:
R12(2) = 1*0 (1+0)*


2. Convert the following DFA to a regular expression.

[Figure: DFA with states q1 (start), q2 and q3 (accepting). q1 has a loop on 1 and an arc on 0 to q2; q2 has an arc on 1 to q3; q3 has a loop on 0 and an arc on 1 to q2]

If k = 0
R11(0) = 1+ϵ
R12(0) = 0
R13(0) = φ
R21(0) = φ
R22(0) = ϵ
R23(0) = 1
R31(0) = φ
R32(0) = 1
R33(0) = 0+ϵ
If k = 1
Rij(k) = Rij(k-1) + Rik(k-1) [Rkk(k-1)]* Rkj(k-1)
R11(1) = R11(0) + R11(0) [R11(0)]* R11(0)
= (1+ϵ) + (1+ϵ) (1+ϵ)* (1+ϵ)
= (1+ϵ) + (1+ϵ) 1* (1+ϵ)
= (1+ϵ) + 1*
= 1*
R12(1) = R12(0) + R11(0) [R11(0)]* R12(0)
= 0 + (1+ϵ) (1+ϵ)* 0
= 0 + (1+ϵ) 1* 0
= 0 + 1*0
= 1*0
R13(1) = R13(0) + R11(0) [R11(0)]* R13(0)
= φ + (1+ϵ) (1+ϵ)* φ
= φ + φ
= φ
R21(1) = R21(0) + R21(0) [R11(0)]* R11(0)
= φ + φ (1+ϵ)* (1+ϵ)
= φ + φ
= φ
R22(1) = R22(0) + R21(0) [R11(0)]* R12(0)
= ϵ + φ (1+ϵ)* 0
= ϵ + φ
= ϵ
R23(1) = R23(0) + R21(0) [R11(0)]* R13(0)
= 1 + φ (1+ϵ)* φ
= 1 + φ
= 1
R31(1) = R31(0) + R31(0) [R11(0)]* R11(0)
= φ + φ (1+ϵ)* (1+ϵ)
= φ + φ
= φ
R32(1) = R32(0) + R31(0) [R11(0)]* R12(0)
= 1 + φ (1+ϵ)* 0
= 1 + φ
= 1
R33(1) = R33(0) + R31(0) [R11(0)]* R13(0)
= ϵ+0 + φ (1+ϵ)* φ
= ϵ+0 + φ
= ϵ+0
If k = 2
Rij(k) = Rij(k-1) + Rik(k-1) [Rkk(k-1)]* Rkj(k-1)
R11(2) = R11(1) + R12(1) [R22(1)]* R21(1)
= 1* + 1*0 [ϵ]* φ
= 1* + φ
= 1*
R12(2) = R12(1) + R12(1) [R22(1)]* R22(1)
= 1*0 + 1*0 [ϵ]* ϵ          [ϵ* = ϵ]
= 1*0 + 1*0                [ϵR = Rϵ = R]
= 1*0                      [R + R = R]
R13(2) = R13(1) + R12(1) [R22(1)]* R23(1)
= φ + 1*0 [ϵ]* 1
= 1*01
R21(2) = R21(1) + R22(1) [R22(1)]* R21(1)
= φ + ϵ [ϵ]* φ
= φ
R22(2) = R22(1) + R22(1) [R22(1)]* R22(1)
= ϵ + ϵ [ϵ]* ϵ
= ϵ + ϵ
= ϵ
R23(2) = R23(1) + R22(1) [R22(1)]* R23(1)
= 1 + ϵ [ϵ]* 1             [ϵR = R]
= 1 + 1
= 1                        [R + R = R]
R31(2) = R31(1) + R32(1) [R22(1)]* R21(1)
= φ + 1 [ϵ]* φ
= φ
R32(2) = R32(1) + R32(1) [R22(1)]* R22(1)
= 1 + 1 [ϵ]* ϵ
= 1 + 1
= 1
R33(2) = R33(1) + R32(1) [R22(1)]* R23(1)
= (ϵ+0) + 1 [ϵ]* 1
= ϵ+0+11
The required R.E. is Rij(n), where i = 1 (start state), j = 3 (accepting state) and n = 3:
R13(3) = R13(2) + R13(2) [R33(2)]* R33(2)
= 1*01 + 1*01 (ϵ+0+11)* (ϵ+0+11)
= 1*01 + 1*01 (0+11)*
R13(3) = 1*01 (0+11)*      [R + RS* = RS*]

2.4.3 Converting Finite Automata to Regular Expression:

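Converting Finite Automata to Regular Expression by Eliminating States:

1. Consider a generic state s about to be eliminated, with predecessor states q1, …, qk and successor states p1, …, pm.
2. The labels on all edges are regular expressions: Rij labels the arc from qi to pj, Qi the arc from qi to s, Pj the arc from s to pj, and S the loop on s.
3. When s is eliminated, the arc from qi to pj is relabelled Rij + Qi S* Pj, which accounts for every path that went from qi to pj through s.

[Figure: a generic state s about to be eliminated]

The strategy for constructing a regular expression from a finite automaton is as follows:

1. For each accepting state q, apply the above reduction process to produce an equivalent automaton
with regular-expression labels on the arcs. Eliminate all states except q and the start state q0.
For example, consider the following automaton:

[Figure: example automaton before and after eliminating a state; the new arc is labelled 1+00*1]

2. If q ≠ q0, then we are left with a two-state automaton that looks like the figure below.

Fig: A generic two-state automaton (the start state has a loop labelled R and an arc labelled S to the accepting state; the accepting state has a loop labelled U and an arc labelled T back to the start state)

We can describe this automaton as: (R+SU*T)*SU*


3. If the start state is also an accepting state, then we must also perform a state elimination from the
original automaton that gets rid of every state but the start state. This leaves the following:

Fig: A generic one-state automaton (a single state, both start and accepting, with a loop labelled R)

We can describe this automaton as: R*.


4. The desired regular expression is the sum (union) of all the expressions derived from the reduced
automata for each accepting state, by rules (2) and (3).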
Problem:
1. Find the regular expression for the given FA by the state elimination technique.

[Figure: the given FA, with states A (start), B, C and D]

Solution:
Step 1: Our first step is to convert it to an automaton with regular-expression labels on its arcs.

[Figure: the FA with regular-expression labels]

Step 2: Eliminate state B, since this state is neither accepting nor the start state.

The arc that replaces the path through B is labelled R11 + Q1 S* P1, with

R11 = Ø, Q1 = 1, S = Ø, P1 = 0 + 1
R11 + Q1 S* P1 = Ø + 1 Ø* (0 + 1)
= 1 Ø* (0 + 1)
= 1 ε (0 + 1)
= 1 (0 + 1)

Fig: Eliminating state B


Step 3: Eliminate state C. The mechanics are similar to those performed above to eliminate state B,
and the resulting automaton becomes

Fig: A two-state automaton with states A and D.

The regular expression has the form (R+SU*T)*SU* (rule for a two-state automaton), with
R = 0 + 1, S = 1(0+1)(0+1), T = Ø, U = Ø
The resulting expression is
R1 = (0 + 1)* 1(0+1)(0+1)
Step 4: Start again from the automaton of Step 2 and eliminate state D instead of C, which leaves a two-state automaton with states A and C.

Thus, we can apply the rule for two-state automata and simplify the expression to get
R2 = (0+1)* 1 (0+1)
Step 5: All that remains is to sum the two expressions to get the expression for the entire automaton.
The regular expression is
R = R1 + R2
R = (0+1)* 1 (0+1)(0+1) + (0+1)* 1 (0+1)
2.4.4 Arden’s Theorem:
1. It is used to find the regular expression for a given finite automaton.
2. Let P and Q be two regular expressions over Σ. If P does not contain ε, then the equation
R = Q + RP has a unique solution, namely R = QP*.
The method based on this theorem assumes:
1. The finite automaton does not have ε-moves.
2. The FA has only one start state, say q1.
3. Its states are q1, q2, …, qn.
4. R is the regular expression (regex) representing the set of strings accepted by the FA. In the equations below, qi stands for the set of strings that take the FA from q1 to qi.
5. αij denotes the set of labels of edges from qi to qj. If there is no such edge, αij = Ø.
We then get the equations
q1 = q1 α11 + q2 α21 + …… + qn αn1 + ε
q2 = q1 α12 + q2 α22 + …… + qn αn2
…
qn = q1 α1n + q2 α2n + …… + qn αnn
By repeated substitution and application of Arden’s theorem, we can express the R.E. in terms of the αij.

Problem:
1. Construct a regular expression for the given FA using Arden’s Theorem.

[Figure: FA with states q1 (start and final state), q2 and q3. q1 has a loop on 0 and an arc on 1 to q2; q2 has a loop on 1 and an arc on 0 to q3; q3 has an arc on 0 to q1 and an arc on 1 to q2]

Solution:
We can obtain the regex (or) regular expression by applying Arden’s Theorem.
Step 1: i) Check that the FA does not have ε-moves.
ii) Check that it has only one start state.
Step 2: Express each state in terms of its incoming transitions.
The transitions required to reach q1 from the states are

q1 = q1 0 + q3 0 + ε ----------- (1)
Similarly, q2 = q2 1 + q1 1 + q3 1 ----- (2)
q3 = q2 0 ------- (3)
Substitute (3) in (2):
q2 = q2 1 + q1 1 + q2 01
q2 = q2 (1 + 01) + q1 1
This has the form R = Q + RP with R = q2, Q = q1 1 and P = 1 + 01, so by Arden’s theorem
q2 = q1 1 (1 + 01)*
Substitute (3) in (1):
q1 = q1 0 + q2 00 + ε (because q3 = q2 0)
= q1 0 + q1 1 (1 + 01)* 00 + ε
= q1 (0 + 1 (1 + 01)* 00) + ε
By applying Arden’s theorem with Q = ε and P = 0 + 1 (1 + 01)* 00,
q1 = ε (0 + 1 (1 + 01)* 00)*
= (0 + 1 (1 + 01)* 00)*
As q1 is the final state, the regular expression corresponding to the given FA is
R.E = (0 + 1 (1 + 01)* 00)*
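The solution q1 = (0 + 1(1 + 01)*00)* given by Arden’s theorem (note the outer star) can be cross-checked against the automaton. The following Python sketch assumes the transitions read off from equations (1) to (3): each term qi a on the right-hand side of the equation for qj is an arc from qi to qj on a. The names DELTA, dfa_accepts and agree are chosen here for illustration.

```python
import re
from itertools import product

# Arcs implied by the three state equations.
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "1"): "q2", ("q2", "0"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}

def dfa_accepts(w, start="q1", finals=("q1",)):
    state = start
    for a in w:
        state = DELTA[(state, a)]
    return state in finals

# (0 + 1(1 + 01)*00)* in Python re syntax.
ARDEN = re.compile(r"(?:0|1(?:1|01)*00)*")

def agree(max_len=10):
    """True if regex and DFA agree on every string up to max_len."""
    return all(
        dfa_accepts("".join(c)) == bool(ARDEN.fullmatch("".join(c)))
        for n in range(max_len + 1)
        for c in product("01", repeat=n)
    )

print(agree())
# The unstarred expression would reject "0100", which the FA accepts.
print(dfa_accepts("0100"), bool(re.fullmatch(r"0|1(?:1|01)*00", "0100")))
```

The second line of output illustrates why the star from the solution R = QP* cannot be dropped.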

2.5 Closure properties of Regular languages:


If certain languages are regular, and a language L is formed from them by certain operations (e.g., L
is the union of two regular languages), then L is also regular. These properties are called closure properties of
regular languages.
The closure properties express the idea that when one or more languages are regular, then certain related
languages are also regular.
The principal closure properties for regular languages are:
1. The union of two regular languages is regular.
2. The intersection of two regular languages is regular.
3. The complement of a regular language is regular.
4. The difference of two regular languages is regular.
5. The reversal of a regular language is regular.
6. The closure (star) of a regular language is regular.
7. The concatenation of two regular languages is regular.
8. A homomorphism (substitution of strings for symbols) of a regular language is regular.
9. The inverse homomorphism of a regular language is regular.
I. Closure of Regular languages under Boolean operations:
It includes Union, intersection, and complementation.
Closure under union:
Theorem:
If L and M are regular languages, then so is L U M.
Proof:
Since L and M are regular, they are the languages of some regular expressions R and S
respectively:
L = L(R)
M = L(S)
Then L U M = L(R+S).
Here R and S are regular expressions, therefore R+S is also a regular expression by the definition of the ‘+’ operator, and so L U M is regular.
Intersection:
Theorem: If L and M are regular languages, then so is L ∩ M.
Proof:
Let L and M be the languages of the DFA’s
AL = (QL, ∑, δL, qL, FL)
AM = (QM, ∑, δM, qM, FM)
Construct a new automaton
A = (QL x QM, ∑, δ, (qL, qM), FL x FM)
where δ((p,q),a) = (δL(p,a), δM(q,a))

By induction on |w|,
δ((qL, qM),w) = (δL(qL,w), δM(qM,w))
A accepts w if and only if δ((qL, qM),w) is a pair of accepting states, i.e.,
δL(qL,w) ∈ FL and
δM(qM,w) ∈ FM
Hence w is accepted by A if and only if both AL and AM accept w. Thus, A accepts the intersection
of L and M.
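The product construction is short enough to write out in full. The sketch below is a minimal Python version, assuming each DFA is a tuple (states, alphabet, delta, start, finals) over the same alphabet with a total transition dictionary keyed by (state, symbol); the example automata and the names product_dfa and accepts are chosen here for illustration.

```python
def product_dfa(A, B):
    """DFA whose states are pairs and which runs A and B in parallel."""
    QA, sigma, dA, sA, FA = A
    QB, _, dB, sB, FB = B
    states = {(p, q) for p in QA for q in QB}
    delta = {((p, q), a): (dA[p, a], dB[q, a])
             for (p, q) in states for a in sigma}
    finals = {(p, q) for p in FA for q in FB}   # both components accept
    return states, sigma, delta, (sA, sB), finals

def accepts(dfa, w):
    _, _, delta, state, finals = dfa
    for a in w:
        state = delta[state, a]
    return state in finals

# A accepts strings containing a 0; B accepts strings containing a 1.
A = ({"p", "q"}, "01",
     {("p", "0"): "q", ("p", "1"): "p", ("q", "0"): "q", ("q", "1"): "q"},
     "p", {"q"})
B = ({"r", "s"}, "01",
     {("r", "0"): "r", ("r", "1"): "s", ("s", "0"): "s", ("s", "1"): "s"},
     "r", {"s"})

C = product_dfa(A, B)
print(accepts(C, "01"), accepts(C, "00"))
```

Replacing the accepting set by the pairs in which at least one component accepts would give an automaton for the union instead.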

Complement:
We could find a regular expression for its complement as follows:
1. Convert the regular expression to an ϵ- NFA.
2. Convert that ϵ-NFA to a DFA by the subset construction.
3. Complement the accepting states of DFA.
4. Turn the complement DFA back into Regular expression.
Theorem:
If L is a regular language over alphabet ∑, then its complement L̄ = ∑* - L is also regular.
Proof:
Let L = L(A) for some DFA
A = (Q, ∑, δ, q0, F). Then L̄ = L(B), where B is the DFA B = (Q, ∑, δ, q0, Q-F).
B is exactly like A, but the accepting states of A have become non-accepting states of B, and vice versa.
Then w is in L(B) if and only if δ(q0,w) is in Q - F, which occurs if and only if w is not in L(A).
Difference:
The difference of L and M is the set of strings that are in language L but not in language M.
Theorem: If L and M are regular languages, then so is L – M.
Proof:
L – M = L ⋂ M̄, where M̄ is the complement of M.
By the previous theorems, M̄ and L ⋂ M̄ are regular, therefore L – M is regular.
Reversal:
The reversal of a string a1 a2 … an is the string written backwards, that is, an an-1 … a1. We use w^R for
the reversal of string w.
e.g., 0010^R = 0100
The reversal of a language L, written L^R, is the language consisting of the reversals of all its strings.
e.g., if L = {001, 10, 111}, then L^R = {100, 01, 111}

Given a finite automaton A for L, it is possible to construct an automaton for L^R.
1. Reverse all the arcs in the transition diagram for A.
2. Make the start state of A be the only accepting state.
3. Create a new start state p0 with transitions on ϵ to all the accepting states of A.
The result is an automaton that simulates A “in reverse”, and accepts a string w if and only if A
accepts w^R.

Theorem: If L is a regular language, so is L^R.

Proof:
Assume L is defined by the regular expression E. We show by induction on E that there is a regular expression E^R such that
L(E^R) = (L(E))^R, i.e., the language of E^R is the reversal of the language of E.
Basis: If E is ϵ, ф or a for some symbol a, then E^R is the same as E.
{ϵ}^R = {ϵ}, ф^R = ф, {a}^R = {a}
Induction:
There are three cases,
1. E = E1 + E2, then E^R = E1^R + E2^R
2. E = E1 E2, then E^R = E2^R E1^R (the order of the two parts is reversed as well)
3. E = E1*, then E^R = (E1^R)*

Homomorphism:
A string Homomorphism is a function on strings that works by substituting a particular string for
each symbol.
e.g., h(0) = abb
h(1) = ba
h(1011) = baabbbaba
Theorem: If L is a regular language over alphabet ∑, and h is a homomorphism on ∑, then h(L) is also regular.
Proof:
Let L = L(R) for some regular expression R. In general, if E is a regular expression with symbols in
∑, let h(E) be the expression obtained by replacing each symbol a of ∑ in E by h(a). We show that L(h(E)) = h(L(E)).
Basis:
If E is ф or ϵ, then h(E) is the same as E, since h does not affect the string ϵ or the language ф. Thus
L(h(E)) = L(E) = h(L(E))
If E = a for a symbol a in ∑, then L(E) = {a}.
So h(L(E)) = {h(a)}. Also, h(E) is the regular expression that is the string of symbols h(a), so
L(h(E)) = {h(a)}
So we conclude
L(h(E)) = h(L(E))
Induction:
There are three cases: 1) Union 2) Concatenation 3) Closure. The union case is shown here; the other two are similar.
i) E = F + G
Applying the homomorphism to the expression,
h(E) = h(F+G)
= h(F) + h(G)
so, by the definition of ‘+’ in regular expressions,
L(h(E)) = L(h(F) + h(G))
= L(h(F)) U L(h(G))
Applying the homomorphism to the language,
h(L(E)) = h(L(F) U L(G))
= h(L(F)) U h(L(G))
By the inductive hypothesis L(h(F)) = h(L(F)) and L(h(G)) = h(L(G)), so L(h(E)) = h(L(E)).
The conclusion is that L(h(R)) = h(L(R)),
i.e., applying the homomorphism h to the regular expression for language L results in a regular expression
that defines language h(L).
Problem:
1. Consider the homomorphism h from alphabet {0,1,2} to {a,b} defined by: h(0) = ab, h(1) = b, h(2) = aa.
a) What is h(0210)?
b) What is h(2201)?
c) If L is the language 1*02*, what is h(L)?
Soln:
Given: h(0) = ab, h(1) = b, h(2) = aa.
a) h(0210) = h(0) h(2) h(1) h(0) = ab aa b ab = abaabab
b) h(2201) = h(2) h(2) h(0) h(1) = aa aa ab b = aaaaabb
c) Given L = 1*02*
h(L) = L(b* ab (aa)*)
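Applying a homomorphism is a symbol-by-symbol substitution, both on strings and on regular expressions. The following Python sketch illustrates this for the h of this problem; the names apply_h and h_regex are hypothetical, and h_regex assumes an expression that uses only symbols, '*' and parentheses, so that the result is also valid syntax for Python's re module.

```python
import re

def apply_h(h, w):
    """h(w): replace each symbol of w by its image under h."""
    return "".join(h[a] for a in w)

def h_regex(h, expr):
    """h(E): replace each symbol of the expression by (h(a))."""
    return "".join(f"({h[c]})" if c in h else c for c in expr)

h = {"0": "ab", "1": "b", "2": "aa"}

print(apply_h(h, "0210"))
print(apply_h(h, "2201"))

pattern = h_regex(h, "1*02*")        # expression for h(L), L = 1*02*
print(pattern)
# The image of a string of L matches the expression for h(L).
print(bool(re.fullmatch(pattern, apply_h(h, "11022"))))
```

Each image is wrapped in parentheses so that a following star applies to the whole of h(a), as in (aa)*.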
Inverse Homomorphism:
Homomorphism may also be applied “backward” and in this mode they also
preserve regular languages.
Definition: Suppose h is a homomorphism from some alphabet ∑ to alphabet T, and let L
be a language over alphabet T. Then h-1(L) is the set of strings w in ∑* such that h(w) is in L.
Example:
Let h(0) = ab; h(1) = ε
Let L = {abab}
h-1 (L) = the language with two 0’s and any number of 1’s = L(1*01*01*).
Theorem: If h is a homomorphism from alphabet ∑ to alphabet T, and L is a regular language over T,
then h-1 (L) is also a regular language.
Proof: The proof starts with a DFA A for L. We construct from A and h a DFA for h-1 (L), as the figure
shows.
Formally, let L be L(A), where DFA A = (Q,T,δ,q0,F). Define a DFA
B = (Q,∑, ϒ, q0,F)
where the transition function ϒ is constructed by the rule ϒ(q,a) = δ(q,h(a)). That is, the transition B makes
on input a is the result of the sequence of transitions that A makes on the string of symbols h(a).

[Figure: the input a is passed through h, and the resulting string h(a) is fed to A, which accepts or rejects]
Fig: The DFA for h-1 (L) applies h to its input, and then simulates the DFA for L.
By induction on |w|, ϒ(q,w) = δ(q,h(w)). Since the start state and the accepting states of A and B are the same, B accepts w if
and only if A accepts h(w), i.e., B accepts exactly h-1 (L).

2.6. Proving Languages not to be regular:


A powerful technique for proving that certain languages are not regular is the Pumping
Lemma.
The Pumping Lemma for Regular Languages:
Theorem:
Let L be a regular language. Then there exists a constant n (which depends on L) such that for
every string w in L with |w| ≥ n, we can break w into three strings, w = xyz, such that:
1. y ≠ ϵ
2. |xy| ≤ n
3. For all k ≥ 0, the string x y^k z is also in L.
Note: We can always find a non-empty string y not too far from the beginning of w that can be
“pumped”, i.e., repeating y any number of times, or deleting it (the case k = 0), keeps the resulting string in the
language L.
Proof:

If L is regular, then L = L(A) for some DFA A. Suppose A has n states.
Consider any string w in L of length n or more,
say w = a1 a2 … am, where m ≥ n and each ai is an input symbol. For i = 0, 1, …, n define
pi = δ(q0, a1 a2 … ai), where δ is the transition function of A and q0 is the start state of A.
That is, pi is the state A is in after reading the first i symbols of w. Note that p0 = q0.
By the pigeonhole principle, it is not possible for the n+1 states pi, i = 0, 1, 2, …, n, to be
distinct, since there are only n different states.
Thus there are two integers i and j, 0 ≤ i < j ≤ n, such that pi = pj. We can break w = xyz as follows:
1. x = a1 a2 … ai
2. y = ai+1 … aj
3. z = aj+1 … am
i.e., x takes us to pi once; y takes us from pi back to pi (since pi = pj), and z is the balance of w. Here y ≠ ϵ because i < j, and |xy| = j ≤ n.
Consider what happens when the automaton A receives the input x y^k z for any k ≥ 0.
If k = 0, A goes from q0 to pi on x and, since pi = pj, from pi to an accepting state on z. So A accepts xz.
If k > 0, A goes from q0 to pi on x, loops from pi back to pi k times on y^k, and then goes to an accepting state on z. So x y^k z is accepted by A,
i.e., x y^k z is in the language L.
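The pigeonhole argument in this proof can be run on a concrete DFA: follow the states visited on w and cut w at the first repeated state. The Python sketch below is only an illustration; the three-state DFA and the names run and pump_split are assumptions made for the example, not part of the proof.

```python
def run(delta, start, w):
    """State reached from `start` after reading w."""
    state = start
    for a in w:
        state = delta[state, a]
    return state

def pump_split(delta, start, w):
    """Split w = xyz at the first state that repeats on the run over w."""
    seen = {start: 0}        # state -> number of symbols read on first visit
    state = start
    for pos, a in enumerate(w, start=1):
        state = delta[state, a]
        if state in seen:    # p_i = p_j with i = seen[state], j = pos
            i, j = seen[state], pos
            return w[:i], w[i:j], w[j:]
        seen[state] = pos
    return None              # no repeat: w is shorter than the state count

# Example DFA with n = 3 states; q1 is the start and the accepting state.
DELTA = {
    ("q1", "0"): "q1", ("q1", "1"): "q2",
    ("q2", "1"): "q2", ("q2", "0"): "q3",
    ("q3", "0"): "q1", ("q3", "1"): "q2",
}
FINALS = {"q1"}

x, y, z = pump_split(DELTA, "q1", "1100")
print((x, y, z))
# Every pumped string x y^k z is still accepted.
print(all(run(DELTA, "q1", x + y * k + z) in FINALS for k in range(6)))
```

Because the cut is made at the first repetition, the split always satisfies y ≠ ϵ and |xy| ≤ n.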
Problems:
1. Show that the language L consisting of all palindromes over {0,1} is not regular.
Solution:
a) Assume that the given language L is regular, and let n be the constant of the pumping lemma.
b) Choose the string
w = 0^i 1 0^i with i ≥ n (w is a palindrome, so w is in L)
c) Length of the string: |w| = 2i + 1 ≥ n
d) So we can break w into 3 strings, w = xyz, with |xy| ≤ n and y ≠ ε.
Since |xy| ≤ n ≤ i, both x and y lie within the first block of 0’s, so
xy = 0^m
y = 0^j
z = 0^(i-m) 1 0^i
To check:
xyz = 0^m 0^(i-m) 1 0^i
= 0^i 1 0^i
So the decomposition is correct.
i) |xy| ≤ n
|0^m| ≤ n → m ≤ n
ii) y ≠ ε (or) |y| ≥ 1
0^j ≠ ε (or) |0^j| = j ≥ 1
If L were regular then, for all k ≥ 0, the string x y^k z would also be in L.
x y^k z = 0^(m-j) (0^j)^k 0^(i-m) 1 0^i
x y^k z = 0^(i+j(k-1)) 1 0^i
Put k = 0,
x y^0 z = 0^(i+j(0-1)) 1 0^i
= 0^(i-j) 1 0^i
This is not a palindrome since j ≥ 1, so it is not in L.
Put k = 1,
x y^1 z = 0^(i+j(1-1)) 1 0^i
= 0^i 1 0^i, which is in L.
Put k = 2,
x y^2 z = 0^(i+j(2-1)) 1 0^i
= 0^(i+j) 1 0^i
This is not a palindrome since j ≥ 1, so it is not in L.
Since for k = 0 and k = 2 we obtain strings that do not belong to L, the pumping lemma is contradicted, so the language L of
palindromes is not regular.

2. Show that the set L = { a^(i^2) | i ≥ 1 } is not regular. (or) L = { a^m | m = i^2, i ≥ 1 } is not regular.
a) Assume that the given language L is regular, and let n be the constant of the pumping lemma.
b) Choose the string
w = a^(n^2), which is in L
c) Length of the string: |w| = n^2 ≥ n
d) So we can break w into 3 strings, w = xyz, with
xy = a^m
y = a^j
z = a^(n^2 - m)
To check:
xyz = a^m a^(n^2 - m)
= a^(n^2)
So the decomposition is correct.
i) |xy| ≤ n
|a^m| ≤ n → m ≤ n
ii) y ≠ ε (or) |y| ≥ 1
a^j ≠ ε (or) |a^j| = j ≥ 1, and j ≤ m ≤ n
If L were regular then, for all k ≥ 0, the string x y^k z would also be in L.
x y^k z = a^(m-j) (a^j)^k a^(n^2 - m)
= a^(n^2 + j(k-1))
Put k = 1,
x y^1 z = a^(n^2), which is in L.
Put k = 2,
x y^2 z = a^(n^2 + j)
Since 1 ≤ j ≤ n, we have n^2 < n^2 + j ≤ n^2 + n < (n+1)^2. So n^2 + j lies strictly between two consecutive perfect squares and is not itself a perfect square. Hence x y^2 z is not in L.

Since for k = 2 we obtain a string that does not belong to L, the pumping lemma is contradicted, so the language L = { a^m | m =
i^2, i ≥ 1 } is not regular.

3. Show that the language L = { a^n b^n | n ≥ 1 } is not regular.

a) Assume that the given language L is regular, and let n be the constant of the pumping lemma.
b) Choose the string
w = a^i b^i with i ≥ n, which is in L
c) Length of the string: |w| = 2i ≥ n
d) So we can break w into 3 strings, w = xyz, with |xy| ≤ n and y ≠ ε.
Since |xy| ≤ n ≤ i, both x and y consist of a’s only, so
xy = a^m
y = a^j
z = a^(i-m) b^i
To check:
xyz = a^m a^(i-m) b^i
= a^i b^i
So the decomposition is correct.
i) |xy| ≤ n
|a^m| ≤ n → m ≤ n
ii) y ≠ ε
a^j ≠ ε, i.e., j ≥ 1
If L were regular then, for all k ≥ 0, the string x y^k z would also be in L.
x y^k z = a^(m-j) (a^j)^k a^(i-m) b^i
x y^k z = a^(i+j(k-1)) b^i
Put k = 0,
x y^0 z = a^(i+j(0-1)) b^i
= a^(i-j) b^i
This has fewer a’s than b’s since j ≥ 1, so it is not in L.
Put k = 1,
x y^1 z = a^(i+j(1-1)) b^i
= a^i b^i, which is in L.
Put k = 2,
x y^2 z = a^(i+j(2-1)) b^i
= a^(i+j) b^i
This has more a’s than b’s since j ≥ 1, so it is not in L.

Since for k = 0 and k = 2 we obtain strings that do not belong to L, the pumping lemma is contradicted, so the language L = { a^n b^n | n
≥ 1 } is not regular.

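4. Prove L = { a^p | p is a prime } is not regular.

Solution:
Assume L is regular, and let n be the constant of the pumping lemma.
Choose a prime p ≥ n + 2 and let w = a^p, so that w is in L and |w| ≥ n.
By the pumping lemma, w = xyz with y ≠ ε and |xy| ≤ n. Let |y| = m, so that 1 ≤ m ≤ n and |xz| = p - m.
Now consider x y^k z with k = p - m:
|x y^(p-m) z| = |xz| + (p - m) |y|
= (p - m) + (p - m) m
= (p - m)(1 + m)
Both factors are greater than 1: 1 + m ≥ 2 since m ≥ 1, and p - m ≥ (n + 2) - n = 2 since m ≤ n.
So |x y^(p-m) z| is not a prime number, and x y^(p-m) z is not in L. This contradicts the pumping lemma. Thus L is
not a regular language.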

2.7. Equivalence and Minimization of Automata:


Testing Equivalence of states:
1. The states p and q are equivalent if, for all input strings w, δ(p,w) is an accepting state if and only
if δ(q,w) is an accepting state.
2. If two states are not equivalent, then they are distinguishable.
3. State p is distinguishable from state q if there is at least one string w such that one of δ(p,w) and
δ(q,w) is accepting and the other is not.
Table-Filling Algorithm:
The table-filling algorithm is used to find which pairs of states are equivalent and which are distinguishable.
Step 1: If q0, q1, …, qn are the states, then label all states except qn on the x-axis and all states except q0 on the y-axis, so that the table has one box for each pair of distinct states.
Step 2: If qi is a final state and qj is not a final state, then mark X in the box for the pair (qi, qj).
Step 3: If (qi, qj) is a pair already marked with X, and (p, q) is a pair of states such that, for some input symbol ‘a’,
δ(p,a) = qi and δ(q,a) = qj, then mark the pair (p, q) with X. Repeat this step until no more pairs can be marked.
Step 4: All pairs which are not marked are equivalent states, and the remaining pairs are distinguishable states.
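The four steps can be written as a short fixed-point loop. The following Python sketch is one way to do so, assuming a total transition dictionary keyed by (state, symbol); the function names are hypothetical, and the example data is the seven-state automaton (states A to G, accepting states D, F and G) of the worked problem in this section.

```python
from itertools import combinations

def distinguishable_pairs(states, alphabet, delta, finals):
    # Step 2: a final and a non-final state are distinguishable.
    marked = {frozenset(pair) for pair in combinations(states, 2)
              if (pair[0] in finals) != (pair[1] in finals)}
    # Step 3: propagate marks backwards until nothing changes.
    changed = True
    while changed:
        changed = False
        for p, q in combinations(states, 2):
            pair = frozenset((p, q))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[p, a], delta[q, a]))
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return marked

def equivalent_pairs(states, alphabet, delta, finals):
    # Step 4: the unmarked pairs are the equivalent ones.
    marked = distinguishable_pairs(states, alphabet, delta, finals)
    return {frozenset(pair) for pair in combinations(states, 2)
            if frozenset(pair) not in marked}

rows = {"A": "BC", "B": "DE", "C": "FG", "D": "DE",
        "E": "FG", "F": "DE", "G": "FG"}
delta = {(s, a): rows[s][i] for s in rows for i, a in enumerate("01")}
pairs = equivalent_pairs("ABCDEFG", "01", delta, {"D", "F", "G"})
print(sorted(sorted(p) for p in pairs))
```

Merging each group of equivalent states into a single state then gives the minimum-state automaton.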
Minimization of DFA’s:
The algorithm for minimization of DFA’s is
1. First, eliminate any state that cannot be reached from start state.
2. Then, partition the remaining states in to blocks, so that all states in the same block are equivalent, and
no pairs of states from different blocks are equivalent.

Problem:
1. Find the equivalent states of the following FA and minimize it.

[Figure: transition diagram of the FA with states A to G; its transitions are listed in the table below]

Solution:
Transition table:

State   0   1
A       B   C
B       D   E
C       F   G
D       D   E
E       F   G
F       D   E
G       F   G

Finding equivalence states:


D, F and G are accepting states and A, B, C and E are non-accepting states. An accepting state and a non-accepting state cannot be equivalent. Therefore the pairs {A,D}, {B,D}, {C,D},
{E,D}, {A,F}, {B,F}, {C,F}, {E,F}, {A,G}, {B,G}, {C,G}, {E,G}
are marked as not equivalent (distinguishable) states.

The remaining pairs are examined one by one.

{A,B}
δ(A,0) = B – Non-final state
δ(B,0) = D – Final state
{A,B} are not equivalent, i.e., they are distinguishable states.

{A,C}
δ(A,1) = C – Non-final state
δ(C,1) = G – Final state
{A,C} are not equivalent, i.e., they are distinguishable states.

{A,E}
δ(A,0) = B – Non-final state
δ(E,0) = F – Final state
{A,E} are not equivalent, i.e., they are distinguishable states.

{B,C}
δ(B,1) = E – Non-final state
δ(C,1) = G – Final state
{B,C} are not equivalent, i.e., they are distinguishable states.

{B,E}
δ(B,1) = E – Non-final state
δ(E,1) = G – Final state
{B,E} are not equivalent, i.e., they are distinguishable states.

{D,G} and {F,G}
δ(D,1) = δ(F,1) = E – Non-final state
δ(G,1) = G – Final state
{D,G} and {F,G} are not equivalent, i.e., they are distinguishable states.

{C,E}
δ(C,0) = F – Final state
δ(E,0) = F – Final state
δ(C,1) = G – Final state
δ(E,1) = G – Final state
δ(C,010) = F – Final state
δ(E,010) = F – Final state
C and E go to the same state on every input symbol, so every string w leads them to the same state; therefore {C,E} are equivalent states.

{D,F}
δ(D,0) = D – Final state
δ(F,0) = D – Final state
δ(D,1) = E – Non-final state
δ(F,1) = E – Non-final state
δ(D,10) = F – Final state
δ(F,10) = F – Final state
D and F go to the same state on every input symbol, so every string w leads them to the same state; therefore {D,F} are equivalent states.

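Table-Filling Algorithm:

B   X
C   X   X
D   X   X   X
E   X   X   *   X
F   X   X   X   *   X
G   X   X   X   X   X   X
    A   B   C   D   E   F
(X = distinguishable pair, * = equivalent pair)
The equivalent states are {C,E} and {D,F}.

The minimum-state finite automaton has the states A, B, {C,E}, {D,F} and G, where {D,F} and G are accepting. Its transitions, read off from the transition table above, are:

State     0       1
A         B       {C,E}
B         {D,F}   {C,E}
{C,E}     {D,F}   G
{D,F}     {D,F}   {C,E}
G         {D,F}   G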
