
Unit-III (Regular Expression)

Uploaded by

Sabu Dahal

Unit III

Regular Expressions

Prepared By:
Ghanashyam BK

1
Regular Language
A language is said to be a REGULAR LANGUAGE if
and only if some finite state machine (FSM) recognizes it.
So which languages are NOT REGULAR?
The languages
that are not recognized by any FSM,
that require unbounded memory to recognize (for example, counting an arbitrary number of symbols).

2
Regular Expressions
Regular expressions are algebraic expressions used for representing
regular languages, the languages accepted by finite
automata.
They offer a declarative way to express the strings we want
to accept.
Many systems use regular expressions as an input
language:
Search commands such as UNIX grep.
Lexical analyzer generators such as LEX or FLEX.
A lexical analyzer is the component of a compiler that breaks
the source program into logical units called tokens.
3
Regular Expressions
 Each regular expression ‘r’ denotes a language L(r)
 The defining rules specify how L(r) is formed by combining the languages of
the subexpressions of r.
Method:
 Let Σ be an alphabet, the regular expression over the alphabet Σ
are defined inductively as follows:
Basic steps:
Φ is a regular expression representing the empty language.
Є is a regular expression representing the language containing only the empty
string, i.e. {Є}.
If ‘a’ is a symbol in Σ, then ‘a’ is a regular expression
representing the language {a}.
4
Regular Expressions
The following operations on basic regular expressions define
more complex regular expressions:
If ‘r’ and ‘s’ are regular expressions representing the
languages L(r) and L(s), then
r U s is a regular expression denoting the language L(r) U
L(s).
r.s is a regular expression denoting the language L(r).L(s).
r* is a regular expression denoting the language (L(r))*.
(r) is a regular expression denoting the language L(r).
Note: any expression obtained from Φ, Є and symbols a ∈ Σ using the above
operations, with parentheses where required, is a regular
expression.
5
Regular Operators
Basically, there are three operators that are used to
generate the regular languages.
Union (U, also written | or +): If L1 and L2 are any two regular
languages, then
L1 U L2 = {s | s ∈ L1 or s ∈ L2}
For example:
L1 = {00, 11}, L2 = {Є, 10}; then
L1 U L2 = {Є, 00, 11, 10}

6
Regular Operators
Concatenation (.):
If L1 and L2 are any two regular languages, then
L1.L2 = {l1.l2 | l1 ∈ L1 and l2 ∈ L2}
For example:
L1 = {00, 11} and L2 = {Є, 10}; then
L1.L2 = {00, 11, 0010, 1110}
L2.L1 = {00, 11, 1000, 1011}
So L1.L2 ≠ L2.L1

7
Regular Operators
Kleene closure (*):
If L is any regular language, then
L* = ⋃ L^i (i ≥ 0) = L^0 U L^1 U L^2 U …, where L^0 = {Є} and L^i = L.L^(i-1).
Precedence of regular operators:
The star operator has the highest precedence, i.e. it applies
only to the smallest well-formed RE immediately to its left.
Concatenation has the next highest precedence.
Finally, unions are taken. For example, 01* + 1 means (0(1*)) + 1.
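The three operators can be tried out on small finite languages. The following is a minimal sketch, assuming languages are modelled as Python sets of strings with "" standing for Є; the helper names (union, concat, star_upto) are illustrative only, and the star is cut off at a maximum length because L* is usually infinite.

```python
def union(l1, l2):
    """L1 U L2: strings that are in either language."""
    return l1 | l2

def concat(l1, l2):
    """L1.L2: every string of L1 followed by every string of L2."""
    return {x + y for x in l1 for y in l2}

def star_upto(lang, max_len):
    """Strings of L* whose length is at most max_len."""
    result = {""}            # L^0 = {Є}
    frontier = {""}
    while frontier:
        # extend the newest strings by one more factor from lang
        frontier = {x + y for x in frontier for y in lang
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

L1 = {"00", "11"}
L2 = {"", "10"}              # "" stands for Є

print(sorted(union(L1, L2)))
print(sorted(concat(L1, L2)))
print(sorted(concat(L2, L1)))
print(sorted(star_upto({"0", "11"}, 3)))
```

Running it reproduces the union and concatenation examples from the slides and shows that concat(L1, L2) and concat(L2, L1) differ.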

8
Regular Languages
Let Σ be an alphabet. The class of regular languages
over Σ is defined inductively as:
Φ, the empty language, is a regular language.
{Є}, the language containing only the
empty string, is a regular language.
For each a ∈ Σ, {a} is a regular language.
If L1, L2, …, Ln are regular languages, then so is
L1 U L2 U … U Ln.
If L1, L2, L3, …, Ln are regular languages, then so
is L1.L2.L3…Ln.
If L is a regular language, then so is L*.

9
Applications of Regular Languages
Validation:
Determining that a string complies with a set of
formatting constraints. Like email address validation,
password validation etc.
Search and Selection:
Identifying a subset of items from a larger set on the
basis of a pattern match.
 Tokenization:
Converting a sequence of characters into words, tokens
(like keywords, identifiers) for later interpretation.
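The three applications can be illustrated with Python's standard re module. This is only a sketch: the e-mail pattern and the token classes (kw, id, num, op) are hypothetical and deliberately simplified, not a complete e-mail or programming-language grammar.

```python
import re

# Hypothetical, simplified patterns chosen for illustration only.
EMAIL = re.compile(r"[A-Za-z0-9._]+@[A-Za-z0-9]+(\.[A-Za-z]+)+")
TOKEN = re.compile(
    r"\s*(?:(?P<kw>if|else|while)\b|(?P<id>[A-Za-z_]\w*)"
    r"|(?P<num>\d+)|(?P<op>[-+*/=()<>]))"
)

def is_valid_email(s):
    """Validation: the whole string must match the pattern."""
    return EMAIL.fullmatch(s) is not None

def search_lines(pattern, lines):
    """Search and selection: keep the lines that contain a match (like grep)."""
    return [ln for ln in lines if re.search(pattern, ln)]

def tokenize(src):
    """Tokenization: split src into (token class, text) pairs."""
    tokens, pos = [], 0
    src = src.rstrip()
    while pos < len(src):
        m = TOKEN.match(src, pos)
        if m is None:
            raise ValueError(f"unexpected character at position {pos}")
        tokens.append((m.lastgroup, m.group(m.lastgroup)))
        pos = m.end()
    return tokens

print(is_valid_email("user@example.com"))
print(search_lines(r"ab+", ["xabbb", "ac", "ab"]))
print(tokenize("if x1 = 42"))
```

The keyword alternative is listed before the identifier alternative so that "if" is classified as a keyword, while "ifx" still falls through to the identifier pattern.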

10
Algebraic Rules for Regular Expressions
Commutativity:
Commutativity of an operator means we can switch the order
of its operands and get the same result.
The union of regular expressions is commutative (r + s = s + r), but
concatenation of regular expressions is not commutative (in general r.s ≠ s.r).
Associativity:
Both union and concatenation of regular
expressions are associative,
i.e. if t, r, s are regular expressions representing the regular
languages L(t), L(r) and L(s), then
t+(r+s) = (t+r)+s and t.(r.s) = (t.r).s

11
Algebraic Rules for Regular Expressions
Distributive law:
For any regular expressions r, s, t representing the regular
languages L(r), L(s) and L(t),
r(s+t) = rs+rt ------ left distribution
(s+t)r = sr+tr ------ right distribution
Identity law:
Φ is the identity for union, i.e. for any regular expression r
representing the regular language L(r),
r + Φ = Φ + r = r, i.e. Φ U r = r.
Є is the identity for concatenation, i.e. Є.r = r = r.Є

12
Algebraic Rules for Regular Expressions
Annihilator:
An annihilator for an operator is a value such that when
the operator is applied to the annihilator and some other
value, the result is the annihilator.
Φ is the annihilator for concatenation,
i.e. Φ.r = r.Φ = Φ
Idempotent law of union:
For any regular expression r representing the regular
language L(r), r + r = r.

13
Algebraic Rules for Regular Expressions
Law of closure:
For any regular expression r representing the regular
language L(r),
(r*)* = r*
Closure of Φ: Φ* = Є
Closure of Є: Є* = Є
Positive closure of r: r^+ = rr*.
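The algebraic rules can be sanity-checked on concrete languages. The sketch below assumes finite languages modelled as Python sets of strings ("" stands for Є, the empty set for Φ) and compares closures only up to a length bound, so it illustrates the laws on one example rather than proving them. The sample languages R, S, T and the helper names are arbitrary choices.

```python
def concat(a, b):
    return {x + y for x in a for y in b}

def star_upto(lang, max_len):
    """Strings of lang* of length at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {x + y for x in frontier for y in lang
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

EMPTY = set()        # Φ
EPS = {""}           # {Є}
R, S, T = {"0", "01"}, {"1"}, {"", "10"}
N = 5                # length bound used for the closure checks

laws = {
    "union is commutative": R | S == S | R,
    "concatenation is not commutative here": concat(R, S) != concat(S, R),
    "union is associative": T | (R | S) == (T | R) | S,
    "concatenation is associative":
        concat(T, concat(R, S)) == concat(concat(T, R), S),
    "left distribution": concat(R, S | T) == concat(R, S) | concat(R, T),
    "right distribution": concat(S | T, R) == concat(S, R) | concat(T, R),
    "Φ is the identity for union": R | EMPTY == R,
    "Є is the identity for concatenation": concat(EPS, R) == R == concat(R, EPS),
    "Φ is the annihilator for concatenation":
        concat(EMPTY, R) == EMPTY == concat(R, EMPTY),
    "union is idempotent": R | R == R,
    "Φ* = Є": star_upto(EMPTY, N) == EPS,
    "Є* = Є": star_upto(EPS, N) == EPS,
    "(r*)* = r* (up to length N)":
        star_upto(star_upto(R, N), N) == star_upto(R, N),
    "r+ = rr* (up to length N)":
        {w for w in concat(R, star_upto(R, N)) if len(w) <= N}
        == star_upto(R, N) - EPS,
}

for name, holds in laws.items():
    print(f"{name}: {holds}")
```

The last check relies on Є not being in R; for a language containing Є the positive closure would also contain Є.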

14
Regular Expressions Examples
Consider Σ = {0, 1}; then some regular expressions over
Σ are:
0*10* is the RE that represents the language {w | w contains a
single 1}
Σ*1Σ* is the RE for the language {w | w contains at least one
1}
Σ*001Σ* is the RE for {w | w contains the string 001 as a substring}
(ΣΣ)* or ((0+1).(0+1))* is the RE for {w | w is a string of
even length}
1*(01*01*)* is the RE for {w | w is a string containing an even
number of zeros}
15
Regular Expressions Examples
0*10*10*10* is RE for {w|w is a string with exactly
three 1’s}
For string that have substring either 001 or 100, the
regular expression is (1+0)*.001.(1+0)*+(1+0)*.(100).
(1+0)*
For strings that have at most two 0’s with in it, the
regular expression is 1*.(0+Є).1*.(0+Є).1*
For the strings ending with 11, the regular expression
is (1+0)*.(11)
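The example expressions above can be checked by brute force. The sketch below assumes the textbook notation is translated into Python re syntax (+ becomes |, Σ becomes [01], Є becomes an empty alternative) and compares each pattern with a plain predicate on every binary string up to a small length. Agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product

def words(max_len, alphabet="01"):
    """All strings over the alphabet of length 0..max_len."""
    for n in range(max_len + 1):
        for t in product(alphabet, repeat=n):
            yield "".join(t)

# (pattern in Python syntax, predicate describing the intended language)
CASES = [
    (r"0*10*",           lambda w: w.count("1") == 1),
    (r"[01]*1[01]*",     lambda w: "1" in w),
    (r"[01]*001[01]*",   lambda w: "001" in w),
    (r"([01][01])*",     lambda w: len(w) % 2 == 0),
    (r"1*(01*01*)*",     lambda w: w.count("0") % 2 == 0),
    (r"0*10*10*10*",     lambda w: w.count("1") == 3),
    (r"[01]*001[01]*|[01]*100[01]*", lambda w: "001" in w or "100" in w),
    (r"1*(0|)1*(0|)1*",  lambda w: w.count("0") <= 2),
    (r"[01]*11",         lambda w: w.endswith("11")),
]

def agrees(pattern, predicate, max_len=8):
    """True if pattern and predicate accept the same strings up to max_len."""
    rx = re.compile(pattern)
    return all((rx.fullmatch(w) is not None) == predicate(w)
               for w in words(max_len))

for pattern, predicate in CASES:
    print(pattern, agrees(pattern, predicate))
```

Note that re.fullmatch is used rather than re.search, because an RE in the formal sense describes whole strings, not substrings.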

16
Finite Automata and Regular expression
In order to show that REs define the same class of
languages as finite automata, we must show that:
Any language defined by one of these finite automata is
also defined by some RE.
Every language defined by an RE is also defined by one of
these finite automata.

17
Reduction of Regular Expression to ε –
NFA
We can show that every language L(R) for some RE R
is also the language L(E) of some Є-NFA E.
This shows that every language represented by an RE can also
be represented by an Є-NFA.
Theorem 1
For any regular expression r, there is an Є-NFA that
accepts the same language represented by r.
Proof:
Let L = L(r) be the language of the regular expression r.
We have to show that there is an Є-NFA E such that
L(E) = L.
18
Reduction of Regular Expression to ε –
NFA
 The proof can be done through structural induction on r, following the
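recursive definition of regular expressions.
For the basis, Φ, Є and ‘a’ are regular expressions representing the
languages Φ (the empty language), {Є} (the language containing only the empty string)
and {a} respectively.
The Є-NFAs accepting these languages each have a start state and a separate accepting state:
for Φ there is no transition between them, for Є there is a single Є-transition from the
start state to the accepting state, and for ‘a’ there is a single transition labeled a.
This forms the basis step.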


Reduction of Regular Expression to ε –
NFA
The induction step is shown below.
Let r1 and r2 be regular expressions for the languages
L(r1) and L(r2).
For union ‘+’: By the induction hypothesis we can construct
Є-NFAs for r1 and r2. Let these Є-NFAs be M1 and M2
respectively.

20
Reduction of Regular Expression to ε –
NFA
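Then the Є-NFA for r = r1 + r2 can be constructed by adding a new start state with
Є-transitions to the start states of M1 and M2, and a new accepting state reached by
Є-transitions from the accepting states of M1 and M2.
The language of this automaton is L(r1) U L(r2), which
is also the language represented by the expression r1 + r2.
For concatenation ‘.’: The Є-NFA for r = r1.r2 can be
constructed by taking the start state of M1 as the start state, the accepting state of M2
as the accepting state, and adding an Є-transition from the accepting state of M1 to the
start state of M2.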

21
Reduction of Regular Expression to ε –
NFA
Here, a path from the start state to the accepting state goes first
through the automaton for r1, where it must follow a
path labeled by a string in L(r1), and
then through the automaton for r2, where it follows a
path labeled by a string in L(r2).
Thus, the language accepted by the above automaton is
L(r1).L(r2).

22
Reduction of Regular Expression to ε –
NFA
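For * (Kleene closure):
The Є-NFA for r* can be constructed from the Є-NFA for r by adding a new start state and a
new accepting state, with Є-transitions from the new start state to the old start state and
to the new accepting state, and from the old accepting state to the old start state and to
the new accepting state.
Clearly the language of this Є-NFA is L(r*), as it accepts
Є as well as the strings in L(r), L(r)L(r), L(r)L(r)L(r)
and so on, thus covering all strings in L(r*).
This completes the proof.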
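The inductive construction in the proof can be sketched in code. The version below is one common Thompson-style variant and may differ in small details from the diagrams on the slides (for example, it always adds a fresh start and accepting state, also for concatenation). Regular expressions are given as nested tuples instead of being parsed, and the names ENFA, build, closure and accepts are hypothetical.

```python
from itertools import count, product

class ENFA:
    def __init__(self):
        self.edges = []              # (source, label, target); label None is Є
        self._ids = count()

    def new_state(self):
        return next(self._ids)

    def add(self, src, label, dst):
        self.edges.append((src, label, dst))

def build(nfa, r):
    """Add states for expression r; return its (start, accept) pair."""
    kind = r[0]
    s, f = nfa.new_state(), nfa.new_state()
    if kind == "empty":              # Φ: no path from s to f
        pass
    elif kind == "eps":              # Є
        nfa.add(s, None, f)
    elif kind == "sym":              # a single symbol
        nfa.add(s, r[1], f)
    elif kind == "union":            # r1 + r2
        for sub in r[1:]:
            s1, f1 = build(nfa, sub)
            nfa.add(s, None, s1)
            nfa.add(f1, None, f)
    elif kind == "cat":              # r1.r2
        s1, f1 = build(nfa, r[1])
        s2, f2 = build(nfa, r[2])
        nfa.add(s, None, s1)
        nfa.add(f1, None, s2)
        nfa.add(f2, None, f)
    elif kind == "star":             # r1*
        s1, f1 = build(nfa, r[1])
        nfa.add(s, None, s1)
        nfa.add(f1, None, f)
        nfa.add(s, None, f)          # zero repetitions
        nfa.add(f1, None, s1)        # one more repetition
    else:
        raise ValueError(kind)
    return s, f

def closure(nfa, states):
    """All states reachable from the given states by Є-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in nfa.edges:
            if src == q and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(r, w):
    nfa = ENFA()
    start, accept = build(nfa, r)
    current = closure(nfa, {start})
    for ch in w:
        moved = {dst for src, label, dst in nfa.edges
                 if src in current and label == ch}
        current = closure(nfa, moved)
    return accept in current

ZERO_OR_ONE = ("union", ("sym", "0"), ("sym", "1"))
# (0+1)*1(0+1): strings whose second-to-last symbol is 1
EXAMPLE = ("cat", ("cat", ("star", ZERO_OR_ONE), ("sym", "1")), ZERO_OR_ONE)
print(accepts(EXAMPLE, "0110"), accepts(EXAMPLE, "0101"))
```

Simulating the automaton with Є-closures, as accepts does, avoids converting it to a DFA first.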

23
Examples (Conversion from RE to Є-NFA)
For regular expression (1+0) the Є-NFA is:

for (0+1)*, the Є-NFA is:

24
Examples (Conversion from RE to Є-NFA)
For the regular expression (00+1)*10, the Є-NFA is:

Now, find the Є-NFA for the whole regular expression
(0+1)*1(0+1)

25
Equivalence of Regular Expression and
Finite Automata
Discussed in class.

26
Conversion of DFA to Regular Expression
Arden’s Theorem:
Let p and q be regular expressions over the
alphabet Σ. If L(p) does not contain the empty string Є, then the equation
r = q + rp has the unique solution r = qp*.
Proof:
Here, r = q + rp ……………… (i)
Substituting r = q + rp for r on the right-hand side
of relation (i) gives
r = q + (q + rp)p
r = q + qp + rp^2 ……………… (ii)

27
Conversion of DFA to Regular Expression
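Again substituting r = q + rp in relation (ii), we get
r = q + qp + (q + rp)p^2
r = q + qp + qp^2 + rp^3
Continuing in the same way, after i substitutions we get
r = q + qp + qp^2 + … + qp^i + rp^(i+1)
Since Є is not in L(p), every string in rp^(i+1) has length at least i + 1, so a string of
length at most i belongs to r exactly when it belongs to q(Є + p + p^2 + … + p^i).
As i is arbitrary,
r = q(Є + p + p^2 + p^3 + …)
Thus r = qp*. Proved.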
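Arden's theorem can be checked numerically on small languages. The sketch below assumes finite languages as Python sets of strings, truncates everything at a length bound n, and verifies only that r = qp* satisfies r = q + rp up to that bound (it does not demonstrate uniqueness). The function names and the sample q and p are arbitrary.

```python
def concat(a, b):
    return {x + y for x in a for y in b}

def star_upto(lang, max_len):
    """Strings of lang* of length at most max_len."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {x + y for x in frontier for y in lang
                    if len(x + y) <= max_len} - result
        result |= frontier
    return result

def truncate(lang, n):
    return {w for w in lang if len(w) <= n}

def arden_holds(q, p, n):
    """Check r = q + rp for r = qp*, comparing strings of length <= n."""
    if "" in p:
        raise ValueError("Arden's theorem requires that Є is not in L(p)")
    r = truncate(concat(q, star_upto(p, n)), n)        # r = qp*
    return r == truncate(q | concat(r, p), n)          # r = q + rp

print(arden_holds({"1"}, {"0", "01"}, 6))
```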

28
Conversion of DFA to Regular Expression
Use of Arden’s rule to find the regular expression
for a DFA:
To convert a given DFA into a regular expression,
we make the following assumptions about the
transition system:
The transition diagram has no Є-transitions.
There is only one initial state.
The vertices (states) of the DFA are
q1, q2, …, qn (any qi may be a final state).

29
Conversion of DFA to Regular Expression
wij denotes the regular expression representing the set
of labels of the edges from qi to qj.
With q1 as the initial state, we can write the equations
q1 = q1 w11 + q2 w21 + q3 w31 + … + qn wn1 + Є
q2 = q1 w12 + q2 w22 + q3 w32 + … + qn wn2
q3 = q1 w13 + q2 w23 + q3 w33 + … + qn wn3
…
qn = q1 w1n + q2 w2n + q3 w3n + … + qn wnn
Solving these equations for the final states qi in terms of the wij gives the
regular expression equivalent to the given DFA.
30
Conversion of DFA to Regular Expression
 Examples: Convert the following DFA into regular
expression.

Let the equations be:

q1 = Є + q2.1 + q3.0 ………. (i)
q2 = q1.0 ………………… (ii)
q3 = q1.1 ………………… (iii)
q4 = q2.0 + q3.1 + q4.0 + q4.1 …… (iv)

31
Conversion of DFA to Regular Expression
Putting the values of q2 and q3 in (i):
q1 = q1.0.1 + q1.1.0 + Є
i.e. q1 = q1(01 + 10) + Є
i.e. q1 = Є + q1(01 + 10) (which has the form r = q + rp)
i.e. q1 = Є(01 + 10)* (using Arden’s rule)
Since q1 is the final state, the regular expression for
the DFA is
Є(01 + 10)*
= (01 + 10)*
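The result can be cross-checked by simulating the DFA. The transition table below is an assumption: it is read back from equations (i) to (iv), since the diagram itself is not reproduced in this text (for example, q2 = q1.0 is taken to mean an edge labeled 0 from q1 to q2). The sketch compares the DFA with the Python pattern (01|10)* on all binary strings up to a small length.

```python
import re
from itertools import product

# Transitions reconstructed from equations (i)-(iv); q4 is a dead state.
DELTA = {
    ("q1", "0"): "q2", ("q1", "1"): "q3",
    ("q2", "1"): "q1", ("q2", "0"): "q4",
    ("q3", "0"): "q1", ("q3", "1"): "q4",
    ("q4", "0"): "q4", ("q4", "1"): "q4",
}
START, FINALS = "q1", {"q1"}

def dfa_accepts(w):
    q = START
    for ch in w:
        q = DELTA[(q, ch)]
    return q in FINALS

RESULT = re.compile(r"(01|10)*")     # (01 + 10)* in Python syntax

def same_language(max_len=10):
    """Compare the DFA and the RE on every string of length <= max_len."""
    for n in range(max_len + 1):
        for t in product("01", repeat=n):
            w = "".join(t)
            if dfa_accepts(w) != (RESULT.fullmatch(w) is not None):
                return False
    return True

print(same_language())
```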

32
Exercises
Convert the following DFA into RE.

33
Representation of Languages
Representations can be formal or informal.
Example (formal): represent a language by an RE or
a DFA defining it.
Example (informal): a logical or prose statement
about its strings:
{0^n 1^n | n is a nonnegative integer}
The set of strings consisting of some number of 0’s
followed by the same number of 1’s.

34
Properties of Regular Languages
 Language classes have two important kinds of properties:
 Decision properties.
A decision property for a class of languages is an algorithm that
takes a formal description of a language (e.g., a DFA) and tells
whether or not some property holds.
Example: Is language L empty?
 Closure properties.
A closure property of a language class says that given languages
in the class, an operator (e.g., union) produces another language in
the same class.
Example: the regular languages are closed under union,
concatenation, and (Kleene) closure. This follows directly from the RE representation of
languages.

35
Pumping Lemma
We have seen that the class of regular
languages has at least four different descriptions:
they are the languages accepted by DFAs, by NFAs and
by Є-NFAs, and the languages defined by REs.
Not every language is regular.
To show that a language is not regular, a powerful
technique known as the Pumping Lemma is used.

36
Pumping Lemma
Statement:
Let L be a regular language. Then there exists an
integer constant n such that every x ∈ L with |x| ≥ n
can be written as x = uvw, where
v ≠ Є (equivalently |v| > 0),
|uv| ≤ n, and
u v^k w ∈ L for all k ≥ 0.
Note: Here v is the string that can be pumped, i.e.
repeating v any number of times, or deleting it (k = 0), keeps
the resulting string in the language.
37
Pumping Lemma
Proof:
Suppose L is a regular language. Then L is accepted by
some DFA M. Let M have n states, and let x be any string
in L with |x| = m, where m ≥ n. (If L has no such string,
the statement holds trivially.)
Now suppose
x = a1 a2 a3 … am, where each ai ∈ Σ is an input
symbol to M. For j = 0, 1, …, m, let qj = δ̂(q0, a1 a2 … aj) be the
state of M after reading the first j symbols, where δ̂ is the extended transition function.

38
Pumping Lemma
Then,
δ̂(q0, x) = δ̂(q0, a1 a2 … am) [q0 being the start state of M]
= δ̂(q1, a2 … am)
= …
= δ̂(qm, Є) = qm [qm being a final state, since x ∈ L]

Since M has only n states, by the pigeonhole principle two of the n + 1 states
q0, q1, …, qn must coincide: there exist i and j with 0 ≤ i < j ≤ n such that qi = qj.

39
Pumping Lemma
Now we can break x = uvw as
u = a1 a2 … ai
v = a(i+1) … aj
w = a(j+1) … am
Here v ≠ Є because i < j. The string a(i+1) … aj takes M from state qi
back to itself, since qi = qj. So M accepts
a1 a2 … ai (a(i+1) … aj)^k a(j+1) … am
for all k ≥ 0.
Hence, u v^k w ∈ L for all k ≥ 0.

40
Application of Pumping Lemma
To prove that a language is not regular.
For example: Show that the language L = {0^r 1^r | r ≥ 0} is
not a regular language.
Solution:
Suppose L is a regular language. Then by the pumping lemma,
every sufficiently long x ∈ L can be split as x = uvw with |v| ≥ 1 such that u v^k w ∈ L for all
k ≥ 0. Take such an x in L and consider the possible forms of v.

41
Application of Pumping Lemma
Case I:
Let v contain 0’s only. Then
suppose u = 0^p, v = 0^q, w = 0^r 1^s.
Since uvw ∈ L, we must have p + q + r = s, and
q > 0.
Now, u v^k w = 0^p (0^q)^k 0^r 1^s = 0^(p+qk+r) 1^s.
This string belongs to L only when p + qk + r = s, i.e. only for k = 1.
For k ≠ 1 it is not in L, which contradicts the pumping lemma.

42
Application of Pumping Lemma
Case II:
Let v contain 1’s only. Then u = 0^p 1^q, v = 1^r, w = 1^s,
where p = q + r + s and r > 0.
Now, u v^k w = 0^p 1^q (1^r)^k 1^s = 0^p 1^(q+rk+s).
This string belongs to L only when p = q + rk + s, i.e. only for k = 1.
For k ≠ 1 it is not in L, which again contradicts the pumping lemma.

43
Application of Pumping Lemma
Case III:
Let v contain both 0’s and 1’s. Then suppose
u = 0^p, v = 0^q 1^r, w = 1^s, where
p + q = r + s, q > 0 and r > 0.
Now, u v^k w = 0^p (0^q 1^r)^k 1^s.
For k ≥ 2 this string contains a 0 after a 1, so it does not belong to L.
(Note that (0^q 1^r)^k is not equal to 0^(qk) 1^(rk) when k ≥ 2.)
Every possible form of v therefore contradicts the pumping lemma.
Hence the language is not regular.
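The argument can also be explored by brute force. The sketch below takes a slightly different route from the three cases above: it fixes x = 0^n 1^n and uses the condition |uv| ≤ n from the lemma, so every allowed v consists of 0's only. It tries every such split and checks a few values of k; the function names are hypothetical, and a finite search like this only illustrates the proof for the values of n that are tried.

```python
def in_L(w):
    """Membership test for L = {0^r 1^r | r >= 0}."""
    half = len(w) // 2
    return w == "0" * half + "1" * half

def pumpable(x, n, max_k=3):
    """True if some split x = uvw with |uv| <= n and |v| > 0
    keeps u v^k w in L for every k in 0..max_k."""
    for i in range(0, n + 1):
        for j in range(i + 1, n + 1):
            u, v, w = x[:i], x[i:j], x[j:]
            if all(in_L(u + v * k + w) for k in range(max_k + 1)):
                return True
    return False

for n in range(1, 8):
    x = "0" * n + "1" * n
    print(n, pumpable(x, n))
```

For each n no split survives, because deleting v (k = 0) always removes some 0's while leaving all n of the 1's.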

44
Closure Properties of Regular Languages
The union of two regular languages is regular.
The intersection of two regular languages is regular.
The complement of a regular language is regular.
The difference of two regular languages is regular.
The reversal of a regular language is regular.
The closure (star) of a regular language is regular.
The concatenation of two regular languages is regular.

45
Properties of Regular Languages over
Union(U)
Theorem: If L and M are regular languages, then so
is L U M.
Proof:
Since L and M are regular, they have regular
expressions,
say L = L(R) and M = L(S). Then
L U M = L(R+S) by the definition of the + operator for
regular expressions.
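Closure under union can also be seen at the level of automata. The sketch below uses the product construction on two DFAs, which is a different technique from the RE-based proof above; the same construction with "and" in place of "or" gives intersection, and swapping final and non-final states gives the complement. The two example DFAs (even number of 0's, ends with 1) and all names are hypothetical.

```python
from itertools import product

def run(dfa, w):
    delta, start, finals = dfa
    q = start
    for ch in w:
        q = delta[q, ch]
    return q in finals

def complement(dfa, states):
    """Same DFA with final and non-final states swapped."""
    delta, start, finals = dfa
    return delta, start, set(states) - set(finals)

def product_dfa(d1, d2, alphabet, combine):
    """Run d1 and d2 in parallel; combine decides which pairs are final."""
    (t1, s1, f1), (t2, s2, f2) = d1, d2
    start = (s1, s2)
    delta, finals, todo, seen = {}, set(), [start], {start}
    while todo:
        p, q = state = todo.pop()
        if combine(p in f1, q in f2):
            finals.add(state)
        for a in alphabet:
            nxt = (t1[p, a], t2[q, a])
            delta[state, a] = nxt
            if nxt not in seen:
                seen.add(nxt)
                todo.append(nxt)
    return delta, start, finals

# Even number of 0's: states "e" (even, final) and "o" (odd).
EVEN0 = ({("e", "0"): "o", ("e", "1"): "e",
          ("o", "0"): "e", ("o", "1"): "o"}, "e", {"e"})
# Ends with 1: states "n" (no) and "y" (yes, final).
ENDS1 = ({("n", "0"): "n", ("n", "1"): "y",
          ("y", "0"): "n", ("y", "1"): "y"}, "n", {"y"})

UNION = product_dfa(EVEN0, ENDS1, "01", lambda a, b: a or b)
INTER = product_dfa(EVEN0, ENDS1, "01", lambda a, b: a and b)
ODD0 = complement(EVEN0, {"e", "o"})

print(run(UNION, "001"), run(INTER, "01"), run(ODD0, "0"))
```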

46
Properties of Regular Languages over
Complement

47
Minimization of Finite State Machines:
Table Filling Algorithm
Given a DFA M that accepts a language L(M), we
construct an equivalent DFA M’ with fewer states by merging equivalent states. The
minimization involves identifying the equivalent
states and the distinguishable states of M.
Equivalent states: Two states p and q are called
equivalent, denoted by p ≡ q, if and only if for
each input string x,
δ̂(p, x) is a final state if and only if δ̂(q, x) is a final
state.
Distinguishable states: Two states p and q are said to be
distinguishable if there exists a string
x such that exactly one of δ̂(p, x) and δ̂(q, x) is a
final state.
Minimization of Finite State Machines:
Table Filling Algorithm
The steps of the algorithm for identifying the distinguishable
pairs (p, q) with p ≠ q are:
List all the pairs of states (p, q) for which p ≠ q.
Make a sequence of passes through the pairs.
On the first pass, mark each pair in which exactly one
element is a final state.
On each subsequent pass, mark the pair (r, s) if for some a
∈ Σ, δ(r, a) = p and δ(s, a) = q and (p, q) is already
marked.
After a pass in which no new pairs are marked, stop.
The marked pairs (p, q) are those for which p and q are
not equivalent, and the unmarked pairs are those for which p
≡ q.
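The marking passes described above can be sketched as follows. This is a minimal illustration, assuming the DFA is given as a dictionary delta[(state, symbol)]; the three-state example DFA (strings ending in 1, with a redundant state) is hypothetical and is not the DFA from the slides.

```python
from itertools import combinations

def table_filling(states, alphabet, delta, finals):
    """Return the set of unmarked (equivalent) pairs as frozensets."""
    all_pairs = {frozenset(p) for p in combinations(states, 2)}
    # First pass: mark pairs in which exactly one state is final.
    marked = {frozenset((p, q)) for p, q in combinations(states, 2)
              if (p in finals) != (q in finals)}
    changed = True
    while changed:                   # repeat passes until nothing new is marked
        changed = False
        for r, s in combinations(states, 2):
            pair = frozenset((r, s))
            if pair in marked:
                continue
            for a in alphabet:
                succ = frozenset((delta[r, a], delta[s, a]))
                # a pair of identical successors can never be marked
                if len(succ) == 2 and succ in marked:
                    marked.add(pair)
                    changed = True
                    break
    return all_pairs - marked

# Hypothetical DFA for "strings ending in 1"; s0 and s2 behave identically.
STATES = ["s0", "s1", "s2"]
DELTA = {
    ("s0", "0"): "s0", ("s0", "1"): "s1",
    ("s1", "0"): "s2", ("s1", "1"): "s1",
    ("s2", "0"): "s2", ("s2", "1"): "s1",
}
FINALS = {"s1"}

print(table_filling(STATES, "01", DELTA, FINALS))
```

Merging each unmarked pair into a single state then gives the reduced DFA; here s0 and s2 collapse into one state.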
Minimization of Finite State Machines:
Table Filling Algorithm
Example

50
Minimization of Finite State Machines:
Table Filling Algorithm
Now, to solve this problem, we first determine
whether each pair is distinguishable or not.

51
Minimization of Finite State Machines:
Table Filling Algorithm
For pair (b, a)
(δ(b, 0 ), δ(a, 0)) = (g, h) – unmarked
(δ(b, 1), δ(a, 1)) = (c, f) – marked
For pair (d, a)
(δ(d, 0), δ(a, 0)) = (c, b) – marked
Therefore (d, a) is distinguishable.
For pair (e, a)
(δ(e, 0), δ(a, 0)) = (h, h) – unmarked.
(δ(e, 1), δ(a, 1)) = (f, f) –unmarked.
[(e, a) is not distinguishable]

52
Minimization of Finite State Machines:
Table Filling Algorithm
For pair (g, a)
(δ(g, 0), δ( a, 0)) = (a, g) – unmarked.
(δ(g, 1), δ(a, 1)) = (e, f) – unmarked
For pair (h, a)
(δ(h, 0), δ(a, 0)) = (g, h) –unmarked
(δ(h, 1), δ(a, 1)) = (c, f) – marked
Therefore (h, a) is distinguishable.
For pair (d, b)
(δ(d, 0), δ(b,0)) = (c, g) – marked
Therefore (d, b) is distinguishable.

53
Minimization of Finite State Machines:
Table Filling Algorithm
For pair (e, b)
(δ(e, 0), δ(b,0)) = (h, g) –unmarked
(δ(e, 1), δ(b, 1)) = (f, c) – marked.
For pair (f, b)
(δ(f, 0), δ(b,0)) = (c, g) – marked
For pair (g, b)
(δ(g, 0), δ(b, 0)) = (g, g) – unmarked
(δ(g, 1), δ(b, 1)) = (e, c) – marked
For pair (h, b)
(δ(h, 0), δ(b, 0)) = (g, g) – unmarked
(δ(h,1), δ(b,1)) = (c, c) - unmarked.

54
Minimization of Finite State Machines:
Table Filling Algorithm
For pair (e, d)
(δ(e, 0), δ(d, 0)) = (h, c) – marked
(e, d) is distinguishable.
For pair (f, d)
(δ(f, 0), δ(d, 0)) = (c, c) – unmarked
(δ(f,1), δ(d,1)) = (g, g) - unmarked.
For pair (g, d)
(δ(g, 0), δ(d, 0)) = (g, c) – marked
For pair (h, d)
(δ(h, 0), δ(d, 0)) = (g, c) – marked

55
Minimization of Finite State Machines:
Table Filling Algorithm
For pair (f, e)
(δ(f, 0), δ(e, 0)) = (c, h) – marked
For pair (g, e)
(δ(g, 0), δ(e, 0)) = (g, h) – unmarked
(δ(g,1), δ(e,1)) = (e, f) -marked.
For pair (h, e)
(δ(h, 0), δ(e, 0)) = (g, h) – unmarked
(δ(h,1), δ(e,1)) = (c, f) -marked.
For pair (g, f)
(δ(g, 0), δ(f, 0)) = (g, c) – marked

56
Minimization of Finite State Machines:
Table Filling Algorithm
For pair (h, f)
(δ(h, 0), δ(f, 0)) = (g, c) – marked
For pair (h, g)
(δ(h, 0), δ(g, 0)) = (g, g) – unmarked
(δ(h,1), δ(g,1)) = (c, e) -marked.
Thus (a, e), (b, h) and (d, f) are equivalent pairs of
states.

57
Minimization of Finite State Machines:
Table Filling Algorithm
Hence the minimized DFA is

58
