0% found this document useful (0 votes)
77 views23 pages

Regular Expression

The document defines regular expressions and describes how to construct non-deterministic finite automata (NFAs) from regular expressions using Thompson's construction method. Key points include: - Regular expressions can be used to denote languages and are composed of symbols, empty string, concatenation, union, closure - Thompson's construction method builds an NFA from a regular expression recursively by applying theorems for union, concatenation, closure, and parentheses - The method shows that any language with a regular expression representation can be recognized by an NFA, proving their equivalence in expressive power

Uploaded by

qwdfgh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
77 views23 pages

Regular Expression

The document defines regular expressions and describes how to construct non-deterministic finite automata (NFAs) from regular expressions using Thompson's construction method. Key points include: - Regular expressions can be used to denote languages and are composed of symbols, empty string, concatenation, union, closure - Thompson's construction method builds an NFA from a regular expression recursively by applying theorems for union, concatenation, closure, and parentheses - The method shows that any language with a regular expression representation can be recognized by an NFA, proving their equivalence in expressive power

Uploaded by

qwdfgh
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 23

Regular Expression

• ∑ be an alphabet which is used to denote the


input set. The regular expression over ∑ can
be defined as follows.
• Ø is a regular expression which denotes the
empty set.
• Ε is a regular expression, standing for the
language {ε} and it is a null string.
• a for some a in the alphabet ∑, standing for
the language {a}
• If r and s are regular expressions denoting the
languages L1 and L2 respectively, then
• r+s is equivalent to L1 U L2 (Union)
• rs is equivalent to L1L2 (Concatenation)
• r* is equivalent to L1* (closure)
• The r* is known as kleen closure or closure
which indicates occurrence of r for number of
times.
• For example if ∑={a} and we have regular
expression R=a*, then R is a set denoted by
R={є,a,aa,aaa,aaaa….}

• That is R includes any number of a’s as well as


empty string which indicates zero number of
a’s appearing, denoted by є character.
• Positive closure L+ denotes set of all strings
except є or null string .
• If ∑={a} and if we have regular expression R=a+
then R is a set denoted by
• R={a,aa,aaa,aaaa…}
• We can construct L*= є. L+
PRECEDENCE
• Tightest ( )
• Star (“*”)
• Concatenation (“.”, “”)

• Loosest Union (“∪”, “+”, “|”)


• EXAMPLE
• R1*R2 ∪ R3 = ((R1*)R2)UR3
Write RE for the given Language

• Beginning with 1: 1(0+1)*


• Ends with 1 : (0+1)* 1
• Substring 1: (0+1)* 1 (0+1)*
• Even number of a’s: (b+ab*a)*
• Write r.e for the language accepting the
strings which are starting with 1 and ending
with 0, over the set ∑={0,1}.

• the r.e to denote the language L over ∑={a,b}


such that all the strings do not contain the
substring “ab”.
• Construct r.e for the language L over the set
∑={a,b} in which the total number of a’s are
divisible by 3.
• Solution:
– The total number of a’s are divisible by 3. So valid
strings can be

L={baaa,babaa,bababab,ababab,…..}

The r.e can be,

R=(b*.a.b*.a.b*.a.b*)*
• Write r.e to denote a language L which accepts
all the strings which begin or end with either
00 or 11.
Solution:
The r.e can be categorized into two subparts.
R=L1+L2
L1= The strings which begin with 00 or 11.
L2= The strings which end with 00 or 11.
Let us find out L1 and L2:
L1=(00+11)(any number of 0’s and 1’s)
L1=(00+11)(0+1)*
• Similarly,
L2=(any number of 0’s and 1’s)(00+11)
L2=(0+1)*(00+11)

Hence

R=[(00+11)(0+1)*]+[(0+1)*(00+11)]
EQUIVALENCE
• L can be represented by a regexp
• ⇔
• L is a regular language

• Given regular expression R, we show there


exists NFA N such that R represents L(N)
Given regular expression R, we show there
exists NFA N such that R represents L(N)
Induction on the length of R:
Base Cases (R has length 1):
R=a a
(matches a single symbol)

R=ε ε
(matches the empty string)

R=∅
(matches nothing)
THOMPSON’S CONSTRUCTION
Inductive Step:
Assume R has length k > 1 and that any regular
expression of length < k represents a language
that can be recognized by an NFA

Four possibilities for R:


R=R∪S (Union Theorem)
R = RS (Concatenation Theorem)
R = R* (Closure Theorem)
R = ( R) (Parenthesized R)
THOMPSON’S CONSTRUCTION
UNION THEOREM - (R+S)

ε R ε

ε S ε
THOMPSON’S CONSTRUCTION

CONCATENATION THEOREM - (R·S)

ε
R S
THOMPSON’S CONSTRUCTION
CLOSURE OPERATION OF R- R*

R
ε ε

ε
THOMPSON’S CONSTRUCTION
PARENTHESIZED R- ( R)

R
Have Shown

L can be represented by a regexp



L is a regular language
Transform (1(0 ∪ 1))* to an NFA
ε 0 ε

1
ε ε

ε 0 ε
ε 1 5 6
ε ε
1 2 3 4 9 10
1
ε 7 8 ε

ε
THOMPSON’S CONSTRUCTION METHOD

RE to ε-NFA Example
• Convert R= (ab+a)* to an NFA
– We proceed in stages, starting from simple
elements and working our way up
a
a
b
b
a ε b
ab
THOMPSON’S CONSTRUCTION

RE to ε-NFA Example (2)


ab+a
a ε b
ε ε

a
ε ε

(ab+a)* a ε b
ε ε
ε ε

a
ε ε

You might also like