0% found this document useful (0 votes)

72 views51 pages

ATC Module 2

The document discusses regular expressions (REs) and finite state machines (FSMs). It defines the basic rules for forming REs and their semantic interpretation. REs can be used to define languages and analyze simple expressions. The document also explains how to build an equivalent FSM from a given RE using a step-by-step construction process. Additionally, it describes a heuristic algorithm for building a equivalent RE from a given FSM by iteratively selecting and removing states.

Uploaded by

Palguni DS

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

72 views51 pages

ATC Module 2

Uploaded by

Palguni DS

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 51

Chapter-6

Regular Expressions
Regular Expression (RE)

A RE is a string that can be formed according to the following rules:

1. ø is a RE.
2. ε is a RE.
3. Every element in ∑ is a RE.
4. Given two REs α and β,αβ is a RE.
5. Given two REs α and β, α U β is a RE.
6. Given a RE α, α* is a RE.
7. Given a RE α, α+ is a RE.
8. Given a RE α, (α) is a RE.

if ∑ = {a,b}, the following strings are regular expressions:

ø, ε, a,b, (a U b)*, abba U ε.

Semantic interpretation function L for the language of regular expressions:

1. L (ø) = ø, the language that contains no strings.

2. L (ε) = {ε}, the language that contains empty string.
3. For any cϵ∑, L(c) = {c}, the language that contains single character string c.
4. For any regular expressions α and β, L (αβ) = L (α) L (β).
5. For any regular expressions α and β, L (α U β) = L (α) U L (β).
6. For any regular expression α, L (α*) = (L (α))*.
7. For any regular expression α, L (α+) = L (αα*) = L (α) (L (α))*
8. For any regular expression α, L ((α)) = L (α).

Analysing Simple Regular Expressions

1.L( (a U b)*b) = L((a U b)*)L(b)
= (L((a U b)))*L(b)

1
= (L(a) U L(b))*L(b)
=({a} U {b})*{b}
= {a,b}*{b}
(a U b)*b is the set of all strings over the alphabet {a, b} that end in b.

2. L( ((a U b) (a U b))a(a U b)*)

= L(((a U b)(a U b)))L(a) L((a U b)*)

= L((a U b)(a U b)) {a} (L((a U b)})*
= L((a U b))L((a U b)) {a} {a,b}*
= {a, b} { a, b} {a} {a, b}*
• ((a U b)(a U b))a(a U b)* is
{xay : x and y are strings of a's and b's and lxl = 2}.

Finding RE for a given Language

1. Let L = {w ϵ {a, b }*: |w| is even}.
L = {aa,ab,abba,aabb,ba,baabaa, -------}
RE = ((a U b)(a U b))* or ( aa U ab U ba U bb )*

2. Let L = {w ϵ {a, b }*: w starting with string abb}.

L = {abb,abba,abbb,abbab ------ }
RE = abb(a U b)*

3. Let L = {w ϵ {a, b }*: w ending with string abb}.

L = {abb,aabb,babb,ababb ------ }
RE = (a U b)*abb

4. L = {w ϵ {0, 1}* : w have 001 as a substring}.

L = {001,1001,000101, ------ }
RE = (0 U 1)*001(0 U 1)*

5. L = {w ϵ {0, 1}* : w does not have 001 as a substring}.

L = {0,1,010,110,101, ---- }
RE = (1 U 01)*0*

2
6. L = {w ϵ {a, b}* : w contains an odd number of a's}.

L = {a,aaa,ababa,bbaaaaba ----- }
RE = b*(ab*ab*)* a b* or b*ab*(ab*ab*)*

7. L = {w ϵ {a, b}* :#a(w) mod 3 = 0}.

L = {aaa,abbaba,baaaaaa, -- }
RE = (b*ab*ab*a)*b*

8. Let L = {w ϵ {a, b }*:#a(w) <= 3}.

L = {a,aa,ba,aaab,bbbabb,------- }
RE = b*(a U ε)b*(a U ε)b*(a U ε)b*

9. L = {w ϵ {0, 1}* : w contains no consecutive 0’s}

L={0, ε,1,01,10,1010,110,101, ---- }
RE = (0 U ε)(1 U 10)

10. L = {w ϵ {0, 1}* : w contains at least two 0’s}

L={00,1010,1100,0001,1010,100,000, ---- }
RE = (0 U 1)*0(0 U 1)*0(0 U 1)*

11.L = { anbm / n>=4 and m<= 3}

RE= (aaaa)a*(ε U b U bb U bbb)

12.L = { anbm / n<=4 and m>= 2}

RE= (ε U a U aa U aaa U aaaa)bb(b)*

13. L = { a2nb2m / n>=0 and m>= 0}

RE= (aa)*(bb)*

14. L = { anbm:(m+n) is even}

(m+n) is even when both a’s and b’s are even or both odd.
RE = (aa)*(bb)* U a(aa)*b(bb)*

3
Three operators of RE in precedence order(highest to lowest)

1. Kleene star
2. Concatenation
3. Union

Eg: (a U bba) is evaluated as (a U (b(b)a))

Kleene's Theorem

Theorem 1:
Any language that can be defined by a regular expression can be accepted by some
finite state machine.
Theorem 2:
Any language that can be accepted by a finite state machine can be defined by
some regular expressions.
Note: These two theorems are proved further.

Buiding an FSM from a RE

Theorem 1:For Every RE, there is an Equivalent FSM.

Proof: The proof is by construction.
We can show that given a RE α,
we can construct an FSM M such that L (α) = L (M).
Steps:
1. If α is any cϵ∑ ,we construct simple FSM shown in Figure(1)

Figure (1)

4
2. If α is any ø, we construct simple FSM shown in Figure(2).

Figure (2)
3. If α is ε,we construct simple FSM shown in Figure(3).

Figure (3)
4. Let β and γ be regular expressions.
If L(β) is regular,then FSM M1 = (K1, ∑ , δ1, s1, A1).
If L(γ) is regular,then FSM M2 = (K2, ∑ , δ2, s2, A2).
If α is the RE β U γ, FSM M3=(K3, ∑ , δ3, s3, A3) and
L(M3)=L(α)=L(β) U L(γ)
M3 = ({S3} U K1 U K2, ∑ , δ3, s3, A1 U A2), where
δ3 = δ1 U δ2 U { ((S3, ε), S1),((S3, ε),S2)}.

α=βUγ

5. If α is the RE βγ, FSM M3=(K3, ∑ , δ3, s3, A3) and

L(M3)=L(α)=L(β)L(γ)
M3 = (K1 U K2, ∑ , δ3, s1, A2), where

5
δ3 = δ1 U δ2 U { ((q, ε), S2):qϵA1}.

α = βγ

6. If α is the regular expression β*, FSM M2 = (K2, ∑, δ2 s2, A2) such that
L (M2) = L (α)) = L (β )*.
M2 = ({S2} U K1, ∑, δ2,S2,{S2} U A1), where
δ2 = δ1 U {((S2, ε ),S1)} U {((q, ε ),S1):q ϵ A1}.

α = β*

Algorithm to construct FSM, given a regular expression α

regextofsm(α : regular expression) =

Beginning with the primitive subexpressions of α and working
outwards until an FSM for an of α has been built do:
Construct an FSM as described in previous theorem.

Building an FSM from a Regular Expression

1. Consider the regular expression (b U ab )*.

An FSM for b

6
An FSM for a

An FSM for ab

An FSM for (b U ab)

7
An FSM for (b U ab)*

2. Construct FSM for the RE (b(a U b)b)*

8
9
Building a Regular Expression from an FSM

Building an Equivalent Machine M

Algorithm for FSM to RE(heuristic)

fsmtoregexheuristic(M: FSM) =

1. Remove from M-any unreachable states.

2. No accepting states then return the RE ø.

3. If the start state of M is has incoming transitions into it, create a new start
state s.

4. If there is more than one accepting state of M or one accepting state with
outgoing transitions from it, create a new accepting state.

5. M has only one state, So L (M} = { ε } and return RE ε.

6. Until only the start state and the accepting state remain do:

6.1. Select some state rip of M.

6.2. Remove rip from M.

6.3. Modify the transitions. The labels on the rewritten

transitions may be any regular expression.

7. Return the regular expression that labels from the

start state to the accepting state.

10
Example 1 for building a RE from FSM

Let M be:

Step 1:Create a new start state and a new accepting state and link them to M

After adding new start state 4 and accepting state 5

Step 2: let rip be state 3

11
After removing rip state 3

1-2-1:ab U aaa*b

1-2-5:a

Step 3: Let rip be state 2

After removing rip state 2

4-1-5 : (ab U aaab)(a U ε)

Step 4: Let rip be state 1

After removing rip state 1

RE = (ab U aaa*b)*(a U ε)

12
Theorem 2 :For Every FSM ,there is an equivalent regular expression

Statement : Every regular language can be defined with a regular expression.

Proof : By Construction

Let FSM M = (K,∑,δ,S,A),construct a regular expression α such that

L(M) = L(α)

Collapsing Multiple Transitions

{C1,C2,C3. ..... Cn} - Multiple Transition

Delete and replace by {C1 U C2 U C3. ..... U Cn}

If any of the transitions are missing, add them without changing L(M) by labeling
all of the new transitions with the RE ø.

13
Select a state rip and remove it and modify the transitions as shown below.

Consider any states p and q.once we remove rip,how can M get from p to q?

Let R(p,q) be RE that labels the transition in M from P to Q.Then the new machine
M' will be removing rip,so R'(p,q)

R'(p,q) = R(p,q) U R(p,rip)R(rip,rip)*R(rip,q)

Ripping States out one at a time

R'(1,3) = R(1,3) U R(1,rip)R(rip,rip)*R(rip,3)

= R(1,3) U R(1,2)R(2,2)*R(2,3)

= ø U ab*a

= ab*a

Algorithm to build RE that describes L(M) from any FSM M = (K,∑,δ,S,A)

Two Sub Routines:

1. standardize : To convert M to the required form

2. buildregex : Construct the required RE from

modified machine M

1. Standardize (M:FSM)

i. Remove unreachable states from M

ii. Modify start state

iii. Modify accepting states

iv. If there is more than one transition between states p and q ,collapse them to
single transition

v. If there is no transition between p and q and p ∉A, q ∉S,then create a

transiton between p and q labled Φ

14
2. buildregex(M:FSM)

i. If M has no accepting states then return RE Φ

ii. If M has only one accepting state ,return RE ε

iii. until only the start state and the accepting state remain do:

a. Select some state rip of M

b. Find R'(p,q) = R(p,q) U R(p,rip).R(rip,rip)*.R (rip,q)

c. Remove rip on d all transitions into ad out of it

iv. Return the RE that labels from start state to the accepting state

Example 2: Build RE from FSM

Step 1: let RIP be state 4

1-4-2 : bb

After removing rip state 4

Step 2: Collapse multiple transitions from state 1 to state 2

1-2: a U bb

After collapsing multiple transitions from state 1 to state 2

15
Step 3: let rip be state 2

1-3: (a U bb)b*a

After removing rip state 2

RE = (a U bb)b*a

Example 3: Build RE From FSM

Step 1: Remove state s as it is dead state

After removing state s

Step 2: Add new start state t and new accepting state u

16
After adding t and u

Step 3: Let rip be state q

p-q-p: 01

After removing rip state q

Step 4: Let rip be state r

p-r-p: 10

After removing rip state r

RE = (01 U 10)*

17
Example 4:A simple FSM with no simple RE

L = {w ε {a,b}* : w contains an even no of a's and an odd number of b's}

[3] even a's odd b's

18
19
20
Building DFSM

• It is possible to construct a DFSM directly from a set of patterns

• Suppose we are given a set K of n keywords and a text string s.

• Find the occurences of s in keywords K

• K can be defined by RE

(Σ(K1 U K2 U ....... U Kn)Σ)+

• Accept any string in which at least one keyword occurs

Algorithm- buildkeywordFSM

• To build dfsm that accepts any string with atleast one of the specified
keywords

Buildkeyword(K:Set of keywords)

• Create a start state q0

• For each element k of K do

Create a branch corresponding to k

21
• Create a set of transitions that describe what to do when a branch dies

• Make the states at the end of each branch accepting

Applications of Regular Expressions

• Many Programming languages and scripting systems provide support for
regular expression matching

• Re's are used in emails to find spam messages

• Meaningful words in protein sequences are called motifs

• Used in lexical analysis

• To Find Patterns in Web

• To Create Legal passwords

• Regular expressions are useful in a wide variety of text processing tasks,

22
• More generally string processing, where the data need not be textual.

• Common applications include data validation, data scraping (especially web

scraping), data wrangling, simple parsing, the production of syntax
highlighting systems, and many other tasks.

RE for Decimal Numbers

RE = -? ([0-9]+(\.[0-9]*)? | \.[0-9]+)
• (α)? means the RE α can occur 0 or 1 time.

• (α)* means the RE α can repeat 0 or more times.

• (α)+ means the RE α can repeat 1 or more times.

24.23,-24.23, .12, 12 ------ are some examples

Requirements for legal password

• A password must begin with a letter

• A password may contain only letters numbers and a underscore character

• A password must contain atleast 4 characters and no more than 8 characters

((a-z) U (A-Z))

((a-z) U (A-Z) U (0-9) U _)

((a-z) U (A-Z) U (0-9) U _ U ε)

Very lengthy regular expression

23
Different notation for writing RE

• α means that the pattern α must occur exactly once.

• α* means that the pattern may occur any number of times(including zero).

• α+ means that the pattern α must occur atleast once.

• α{n,m} means that the pattern must occur atleast n times but not more than
m times

• α{n} means that the pattern must occur n times exactly

• So RE of a legal password is :

RE = ((a-z) U (A-Z))((a-z) U (A-Z) U (0-9) U _){3,7}

• RE for an ip address is :

RE = ((0-9){1,3}(\.(0-9){1,3}){3})
Examples: 121.123.123.123

118.102.248.226

10.1.23.45

Manipulating and Simplifying Regular Expressions

Let α, β, ү represent regular expressions and we have the following identities.

1. Identities involving union

2. Identities involving concatenation

3. Identities involving Kleene Star

Identities involving Union

• Union is Commutative

αUβ=βUα

24
• Union is Associative

(α U β) U ү = α U (β U ү)

• Φ is the identity for union

αUΦ=ΦUα=α

• union is idempotent

αUα=α

• For any 2 sets A and B, if B ⊆ A, then A U B = A

a* U aa = a, since L(aa) ⊆ L(a).

Identities involving concatenation

• Concatenation is associative

(αβ)ү = α(βү)

• ε is the identity for concatenation

αε = εα = α

• Φ is a zero for concatenation.

αΦ = Φα = Φ

• Concatenation distributes over union

(α U β)ү = (αү) U (βү)

ү(α U β) = (үα) U (үβ)

Identities involving Kleene Star

• Φ* = ε

• ε* = ε

• (α*)* = α*

• α*α* = α*
25
• If α* ⊆ β* then α*β* = β*

• Similarly If β* ⊆ α* then αβ = α*

a(a U b) = (a U b), since L(a) ⊆ L((a U b)*).

• (α U β)* = (α*β*)*

• If L(β) ⊆ L(α) then (α U β)* = α*

(a U ε)* = a,since {ε} ⊆ L(a).

Simplification of Regular Expressions

1. ((a* U Φ)* U aa) = (a) U aa //L(Φ) ⊆ L(a*)

= a* U aa //(α*)* = α*

= a* // L(aa) ⊆ L(a*)

2. (b U bb)b = bb //L(bb) ⊆ L(b*)

= b* // α*α* = α*

3. ((a U b)* b* U ab)*

= ((a U b)* U ab)* //L(b) ⊆ L(a U b)

= (a U b)* //L(a) ⊆ L(a U b))

4. ((a U b)* (a U ε )b* = (a U b)* //L((a U ε )b) ⊆ L(a u b)

5. (Φ* U b)b* = (ε U b)b* //Φ* = ε

= b* //L(ε U b) ⊆ L(b*)

6. (a U b)a U b = (a U b)* U b // L(a) ⊆ L((a U b))

= (a U b)* // L(b) ⊆ L((a U b)*)

7.((a U b)+)* = (a U b)*

26
Chapter-7

Regular Grammars

Regular grammars sometimes called as right linear grammars.

A regular grammar G is a quadruple (V, ∑ , R, S)

• V is the rule alphabet which contains nonterminals

and terminals.

• ∑ (the set of terminals) is a subset of V

• R (the set of rules) is a finite set of rules of the form

XY

• S (the start symbol) is a nonterminal.

All rules in R must:

• Left-hand side should be a single nonterminal.

• Right-hand side is ε or a single terminal or a single terminal followed by a

single nonterminal.

Legal Rules

Sa

Sε

TaS

Not legal rules

SaSa

STT

aSaT

ST

27
• The language generated by a grammar G = (V, ∑ , R, S) denoted by L( G) is
the set of all strings w in ∑* such that it is possible to start with S.

• Apply some finite set of rules in R, and derive w.

• Start symbol of any grammar G will be the symbol on the left-hand side of
the first rule in RG

Example of Regular Grammar

Example 1:Even Length strings

Let L = {wϵ {a, b }*: lwl is even}.

The following regular expression defines L:

((aa) U (ab) U (ba) U (bb))* or ((a U b)(a U b))*

DFSM accepting L

Regular Grammar G defining L

Sε

SaT

SbT

TaS

TbS

Derivation of string using Rules

Derivation of string “abab”

28
S => aT

=> abT

=> abaS

=> ababS

=> abab

Regular Grammars and Regular Languages

THEOREM

Regular Grammars Define Exactly the Regular Languages

Statement:

The class of languages that can be defined with regular grammars is exactly the
regular languages.

Proof: Regular grammar  FSM

FSM  Regular grammar

The following algorithm constructs an FSM M from a regular grammar G = (V,

∑ , R, S) and assures that

L (M) = L (G):

Algorithm-Grammar to FSM

grammartofsm ( G: regular grammar) =

1. Create in M a separate state for each nonterminal in V.

2. Make the state corresponding to S the start state.

3. If there are any rules in R of the form Xw, for some

w ϵ ∑, then create an additional state labeled #.

4. For each rule of the form X wY,

29
add a transition from X to Y labeled w.

5. For each rule of the form Xw, add a transition from X

to # labeled w.

6. For each rule of the form Xε, mark state X as

accepting.

7. Mark state # as accepting.

8. If M is incomplete then M requires a dead state.

Add a new state D. For every (q, i) pair for which no

transition has already been defined, create a transition

from q to D labeled i. For every i in Σ, create a transition

from D to D labeled i.

Example 2:GrammarFSM

Strings that end with aaaa

Let L = {wϵ {a, b }*: w end with the pattern aaaa}.

RE = (a U b)*aaaa

Regular Grammar G

SaS

SbS

SaB

BaC

CaD

Da

30
Example 3:The Missing Letter Language

Let ∑ = {a, b, c}.

LMissing = { w : there is a symbol a € ∑ not appearing in w}.

Grammar G generating LMissing

31
32
Algorithm FSM to Grammar

1. Make M deterministic (to get rid of ε-transitions).

2. Create a nonterminal for each state in the new M.

3. The start state becomes the starting nonterminal.

4. For each transition δ(T, a) = U, make a rule of the form T → aU.

5. For each accepting state T, make a rule of the form T → ε.

Example 7: Build grammar from FSM

33
RE = (a U bb)b*a

Grammar

AaB

AbD

BbB

BaC

DbB

Cε

Derivation of string “aba”

A => aB

=> abB

=> abaC

=> aba

Derivation of string “bba”

A => bB

=> bbB

=> bbaC

=> bba

Example 8:A simple FSM with no simple RE

L = {w ε {a,b}* : w contains an even no of a's and an odd

number of b's}

34
Grammar

AaB

AbC

BaA

BbD

CbA

CaD

DbB

DaC

Cε

Derivation of string “ababb”

A => aB

=> abD

=> abaC

=> ababA

=> ababbC

=> ababb

35
Satisfying Multiple Criteria

Let L = { wϵ {a, b }*: w contain an odd number of a’s and

w ends in a}.

SbS
SaT
T ε
TaS
TbX
XaS
XbX

36
Conclusion on Regular Grammars

• Regular grammars define exactly the regular languages.

• But regular grammars are often used in practice as FSMs and REs are easier
to work.

• But as we move further there will no longer exist a technique like regular
expressions.

• So we discuss about context-free languages and context-free-grammars are

very important to define the languages of push-down automata.

37
Chapter-8

Regular and Nonregular Languages

• The language ab is regular.

• The language AnBn = {anbn :n>=0} is not regular.

• The language {w Є {a,b}*:every a is immediately followed by b} is regular.

• The language {w Є {a, b}*:every a has a matching b somewhere and no b

matches more than one a} is not regular.

• Given a new language L, how can we know whether or not it is regular?

Theorem 1: The Regular languages are countably infinite

Statement:

There are countably infinite number of regular languages.

Proof:

• We can enumerate all the legal DFSMs with input alphabet ∑.

• Every regular language is accepted by at least one of them.

• So there cannot be more regular languages than there are DFSMs.

• But the number of regular languages is infinite

because it includes the following infinite set of

languages:

{a}, { aa} , { aaa}, { aaaa}. { aaaaa}, { aaaaaa } ----

• Thus there are at most a countably infinite number of

regular languages.

38
Theorem 2 : The finite Languages

Statement: Every finite language is regular.

Proof:

• If L is the empty set, then it is defined by the R.E Ø and so

is regular.

• If it is any finite language composed of the strings

s1,s2,….sn for some positive integer n, then it is defined by

the R.E: s1 U s2 U …U sn

• So it too is regular

 Regular expressions are most useful when the elements of L match one or
more patterns.

 FSMs are most useful when the elements of L share some simple structural
properties.

39
Examples:

• L1 = {w Є {0-9}*: w is the social security number of the

current US president}.

L1 is clearly finite and thus regular. There exists a simple

FSM to accept it.

• L2 = {1 if Santa Claus exists and 0 otherwise}.

• L3 = {1 if God exists and 0 otherwise}.

L 2 and L3 are perhaps a little less clear.

So either the simple FSM that accepts { 0} or the simple

FSM that accepts { 1} and nothing else accepts L2 and L3.

• L = {1 if there were people in north America more than
4
10000 years age and 0 otherwise}.
• L = {1 if there were people in north America more than
5
15000 years age and 0 otherwise}.
L is clear. It is the set { 1}.
4

L is also finite and thus regular.

5
• L = {w Є {0-9}*: w is the decimal representation, without
6
leading 0’s, of a prime Fermat number}

• Fermat numbers are defined by

Fn = 22n + 1 , n >= 0.

• The first five elements of F are {3, 5, 17, 257,65537}.

• All of them are prime. It appears likely that no other Fermat numbers are
prime. If that is true,then L6

is finite and thus regular.

40
• lf it turns out that the set of Fermat numbers is infinite,then it is almost
surely not regular.

Four techniques for showing that a language L(finite or infinite) is regular:

1. Exhibit a R.E for L.

2. Exhibit an FSM for L.

3. Show that the number of equivalence of ≈L is finite.

4. Exhibit a regular grammar for L.

Closure Properties of Regular Languages

The Regular languages are closed under

• Union

• Concatenation

• Kleene star

• Complement

• Intersection

• Difference

• Reverse

• Letter substitution

Closure under Union, Concatenation and Kleene star

Theorem: The regular languages are closed under union,

concatenation and Kleene star.

Proof: By the same constructions that were used in the

proof of Kleene’s theorem.

41
Closure under Complement
Theorem:

The regular languages are closed under complement.

Proof:

• If L1 is regular, then there exists a DFSM M 1=(K,∑,δ,s,A)

that accepts it.

• The DFSM M2=(K, ∑,δ,s,K-A), namely M1 with accepting

and nonaccepting states swapped, accepts ¬(L(M1)

because it rejects all strings that M 1 accepts and rejects

all strings that M1 accepts.

Steps:

1. Given an arbitrary NDFSM M1,construct an equivalent

DFSM M' using the algorithm ndfsmtodfsm.

2. If M1 is already deterministic, M' = M1.

3. M' must be stated completely, so if needed add dead

state and all transitions to it.

4. Begin building M2 by setting it equal to M'.

5. Swap accepting and nonaccepting states. So

M2=(K, ∑,δ,s,K-A)

Example:

• Let L = {w Є {0,1}* : w is the string ending with 01}

RE = (0 U 1)*01

• The complement of L(M) is the DFSM that will accept

strings that do not end with 01.

42
Closure under Intersection

Theorem:

The regular languages are closed under intersection.

Proof:

• Note that

L(M1) ∩ L(M2) = ¬ (¬L(M1) U ¬L(M2)).

• We have already shown that the regular languages are closed under both
complement and union.

• Thus they are closed under intersection.

• Example:

• Fig (a) is DFSM L1 which accepts strings that have 0.

• Fig(b) is DFSM L2 which accepts strings that have 1.

43
• Fig(c) is Intersection or product construction which accepts that have both 0
and 1.

The Divide and Conquer Approach

• Let L = {w Є {a,b}* : w contains an even number of a’s and an odd number

of b’s and all a’s come in runs of three }.

• L is regular because it is the intersection of two regular languages,

L = L1 ∩ L2, where

• L1 = {w Є {a,b}* : w contains an even number of a’s

and an odd number of b’s},and

L2 = {w Є {a,b}*: all a’s come in runs of three}.

• L1 is regular as we have an FSM accepting L1

• L2 = {w Є {a,b}*: all a’s come in runs of three}.

• L2 is regular as we have an FSM accepting L2

44
L = {w Є {a,b}* : w contains an even number of a’s and an odd number of b’s and
all a’s come in runs of three }.

L is regular because it is the intersection of two regular languages,L = L1 ∩ L2

Closure under Set difference

Theorem:

The regular languages are closed under set difference.

Proof:

L(M1) - L(M2) = L(M1) ∩ ¬L(M2)

• Regular languages are closed under both complement

and intersection is shown.

• Thus regular languages are closed under set difference.

Closure under Reverse

Theorem:

The regular languages are closed under reverse.

Proof:

• LR = { w Є ∑* : w = xR for some x Є L}.

Example:

1. Let L = {001,10,111} then LR = {100,01,111}

2. Let L be defined by RE (0 U 1)0* then LR is 0*(0 U 1)

reverse(L) = {x ∈ Σ* : x = wR for some w ∈ L}.

By construction.

• Let M = (K, Σ, δ, s, A) be any FSM that accepts L.

• Initially, let M′ be M.

45
• Reverse the direction of every transition in M′.

• Construct a new state q. Make it the start state of M′.

• Create an ε-transition from q to every state that was an accepting state in M.

• M′ has a single accepting state, the start state of M.

Closure under letter substitution or Homomorphism

• The regular languages are closed under letter substitution.

• Consider any two alphabets, ∑1 and ∑2.

• Let sub be any function from ∑1 to ∑2*.

• Then letsub is a letter substitution function from L1 to L2 iff letsub(L1) = {

w Є ∑2*:Ǝy Є L1(w = y except that every character c of y has been replaced
by sub(c))}.

• Example 1

Consider ∑1 = {a,b} and ∑2 = {0,1}

Let sub be any function from ∑1 to ∑2*.

sub(a) = 0, sub(b) = 11

letsub(anbn : n >= 0}) = { 0n12n : n >= 0}

• Example 2

Consider ∑1 = {0,1,2} and ∑2 = {a,b}

Let h be any function from ∑1 to ∑2*.

h(0) = a, h(1) = ab, h(2) = ba

h(0120) = h(0)h(1)h(2)h(0)

= aabbaa

46
h(01*2) = h(0)(h(1))*h(2)

= a(ab)*ba

Long Strings Force Repeated States

Theorem: Let M=(K,∑,δ,s,A) be any DFSM. If M accepts any string of length |K |

or greater, then that string will force M to visit some state more than once.

Proof:

• M must start in one of its states.

• Each time it reads an input character, it visits some state. So ,in processing a
string of length n, M creates a total of n+1 state visits.

• If n+1 > | K |, then, by the pigeonhole principle, some state must get more
than one visit.

• So, if n>= | K |,then M must visit at least one state more than once.

The Pumping Theorem for Regular Languages

Theorem: If L is regular language, then:

Ǝk >= 1 (∀strings w ϵ L, where |w| >= k ( Ǝx, y, z ( w = xyz,

|xy| <= k,

y ≠ ε,and

∀q >= 0(xyqz ϵ L)))).

Proof:

• If L is regular then it is accepted by some DFSM M=(K,∑,δ,s,A).

Let k be |K|

• Let w be any string in L of length k or greater.

• By previous theorem to accept w, M must traverse some loop at least once.

47
• We can carve w up and assign the name y to the first substring to drive M
through a loop.

• Then x is the part of w that precedes y and z is the part of w that follows y.

• We show that each of the last three conditions must then hold:

• |xy| <= k

M must not traverse thru a loop.

It can read k - 1 characters without revisiting any states.

But kth character will take M to a state visited before.

• y≠ε

Since M is deterministic, there are no loops traversed by ε.

• ∀q >= 0 (xyqz ϵ L)

y can be pumped out once and the resulting string must be in L.

Steps to prove Language is not regular by contradiction method.

1. Assume L is regular.

2. Apply pumping theorem for the given language.

3. Choose a string w, where w ϵ L and IwI >= k.

4. Split w into xyz such that |xy| <= k and y ≠ ε.

5. Choose a value for q such that xyqz is not in L.

6. Our assumption is wrong and hence the given language is not regular.

48
Problems on Pumping theorem (Showing that the language is not regular)
1. Prove that the following language is not regular
L = {anbn : n ≥ 1 }
Proof :
1. Let us assume the given L is regular.
2. Consider W = aaa…aabb…bbb
n n
3. Split W = xyz

x y z
W = aaa…a..a..bb..bbb

(1) |xy| ≤ k |x|+|y| ≤ n (true)

(2) |y| ≥ 1 |y| ≥ 1 (true )
(3) for all q ≥ 0: xyqz ∈ L if q=0 then xy0z ∉ L (fails)

(because number of a’s and b’s will not be equal )

Hence it is a contradiction to the assumption made so the given Language is not
regular.

2. Prove that the following language is not regular

L = {wwR: W ϵ ∑* }
Proof :
1. Assume the given L is regular.
2. Consider W = ambmbmam , where |w| ≥ m
3. Split w = xyz

49
1) |xy| ≤ m (true)

2) |y| ≥ 1 (true)

3) for all k ≥ 0: xykz ∈ L , if k =2 then xy2z ∉ L

4 ) Prove that the following language is not regular

L = {an! | n ≥ 0 }
Proof :
1. Let us assume the given L is regular.
2. Split W = xyz
3. Consider W = an! = aj (ak )an!-j-k
x y z

|xy| ≤ n and |y|≥ 1

i) |x|+|y| ≤ n (True)
ii) |y| ≥ 1 (True)
iii) if i=0 then xyiz (fails)
aj (ak )i a n!-j-k
aj an!-j-k
an! - k
n! > n!-k
Hence it is contradiction to the assumption made so the given Language is not
regular.
50
5 ) Prove that the following language is not regular
L = {0n | n is prime }
Proof :
1. Let us assume the given L is regular.
2. Split W = xyz
3. Consider W = 0n = 0j (0k) 0n-j-k
x y z

|xy| ≤ n and |y|≥ 1

i) |x|+|y| ≤ n (True)
ii) |y| ≥ 1
iii) if i=0 then xyiz ∉ L
0j (0k)i 0n-j-k ∉ L
= j+ki+n-j-k
= n+k(i-1) substitute i= n+1
= n+kn
= n(k+1) if k=1 it will not be prime

Flat - Unit-2
No ratings yet
Flat - Unit-2
32 pages
Flat-Unit-2 Notes
No ratings yet
Flat-Unit-2 Notes
23 pages
Chapter 3 Regular Expression
No ratings yet
Chapter 3 Regular Expression
25 pages
Regular Expression
No ratings yet
Regular Expression
14 pages
Regular Expressions
No ratings yet
Regular Expressions
169 pages
TOC-Module 2
No ratings yet
TOC-Module 2
50 pages
HN ATC Notes Module 2
No ratings yet
HN ATC Notes Module 2
19 pages
Atc
No ratings yet
Atc
50 pages
Regular Expressiontzzz
No ratings yet
Regular Expressiontzzz
46 pages
Unit II Regular Expression
No ratings yet
Unit II Regular Expression
176 pages
ATC RE Theorems
No ratings yet
ATC RE Theorems
4 pages
Chapter 2 REGULAR EXPRESSION
No ratings yet
Chapter 2 REGULAR EXPRESSION
26 pages
Flat Unit
No ratings yet
Flat Unit
18 pages
CH 3 - Regular Languages Amd Regular Grammars
No ratings yet
CH 3 - Regular Languages Amd Regular Grammars
67 pages
@vtucode - in Module 2 ATC 5th Semester 2021 Scheme
0% (1)
@vtucode - in Module 2 ATC 5th Semester 2021 Scheme
49 pages
TOC Unit2
No ratings yet
TOC Unit2
87 pages
AT Module 2
100% (1)
AT Module 2
87 pages
Wa0014.
No ratings yet
Wa0014.
85 pages
Chapter Two Regular Expression
No ratings yet
Chapter Two Regular Expression
41 pages
4 Re
No ratings yet
4 Re
78 pages
CS351 Regular Expressions
No ratings yet
CS351 Regular Expressions
14 pages
cs212 Lect02 63 Inter
No ratings yet
cs212 Lect02 63 Inter
39 pages
FLAT Lec - 3
No ratings yet
FLAT Lec - 3
34 pages
Regular Expressions (Re) : Res: Formal Definition
No ratings yet
Regular Expressions (Re) : Res: Formal Definition
12 pages
TAFL Unit 2 Processed
No ratings yet
TAFL Unit 2 Processed
32 pages
18CS54 ATC Question Bank Module 4 and 5
No ratings yet
18CS54 ATC Question Bank Module 4 and 5
2 pages
2.1regular Expression-UNIT - II
No ratings yet
2.1regular Expression-UNIT - II
31 pages
Toc 2
No ratings yet
Toc 2
26 pages
Chapter 2
No ratings yet
Chapter 2
25 pages
Regular Expression: Operations On Regular Language
No ratings yet
Regular Expression: Operations On Regular Language
33 pages
ATC Notes Module 4
No ratings yet
ATC Notes Module 4
23 pages
FLAT-Regular Expression and Language
No ratings yet
FLAT-Regular Expression and Language
69 pages
Class 3
No ratings yet
Class 3
52 pages
Spring 2024 Compiler Constructoin A Lab 3-2
No ratings yet
Spring 2024 Compiler Constructoin A Lab 3-2
16 pages
3-5 Simplified Note
No ratings yet
3-5 Simplified Note
32 pages
TOC Module-2 Notes
No ratings yet
TOC Module-2 Notes
24 pages
Regular Expressions
No ratings yet
Regular Expressions
22 pages
Flat Unit 2 - 17.9.20
No ratings yet
Flat Unit 2 - 17.9.20
22 pages
Automata Chapter 2
No ratings yet
Automata Chapter 2
15 pages
Regular Expressions
No ratings yet
Regular Expressions
34 pages
Unit-Ii Regular Expressions and Languages Definition
No ratings yet
Unit-Ii Regular Expressions and Languages Definition
34 pages
Practice Questions Modules 2 Answer
No ratings yet
Practice Questions Modules 2 Answer
33 pages
Unit 2: Can Be Recursively Defined As Follows
No ratings yet
Unit 2: Can Be Recursively Defined As Follows
28 pages
WINSEM2024-25 ISWE203L TH VL2024250503413 2025-01-20 Reference-Material-I
No ratings yet
WINSEM2024-25 ISWE203L TH VL2024250503413 2025-01-20 Reference-Material-I
30 pages
WINSEM2024-25 BCSE304L TH VL2024250501632 2025-02-12 Reference-Material-I
No ratings yet
WINSEM2024-25 BCSE304L TH VL2024250501632 2025-02-12 Reference-Material-I
13 pages
Regular Expressions (RE) 3.1
100% (3)
Regular Expressions (RE) 3.1
53 pages
Unit 2
No ratings yet
Unit 2
35 pages
Regular Expression
No ratings yet
Regular Expression
21 pages
Chapter Two Regular Expression and Regular Language
No ratings yet
Chapter Two Regular Expression and Regular Language
30 pages
Toc Unit-2
No ratings yet
Toc Unit-2
109 pages
Flat Unit-3
No ratings yet
Flat Unit-3
10 pages
Regular Expressions
No ratings yet
Regular Expressions
30 pages
CH 14
No ratings yet
CH 14
13 pages
FLAT-unit II
No ratings yet
FLAT-unit II
18 pages
Toc Unit 2 Regular Expression
No ratings yet
Toc Unit 2 Regular Expression
109 pages
Regular-Expressions: LECT-2
No ratings yet
Regular-Expressions: LECT-2
12 pages
Chapter 2 RegularExpressions
No ratings yet
Chapter 2 RegularExpressions
95 pages
Ch2 Automata & Complexity Theory2016
No ratings yet
Ch2 Automata & Complexity Theory2016
44 pages
Chapter 2 RegularExpressions
No ratings yet
Chapter 2 RegularExpressions
95 pages
Books Doubtnut Question Bank
No ratings yet
Books Doubtnut Question Bank
9 pages
Subject Description Units Pre-Requisites First Year, 1 Semester
No ratings yet
Subject Description Units Pre-Requisites First Year, 1 Semester
4 pages
Unit-2 ATCD
No ratings yet
Unit-2 ATCD
65 pages
Aristotle's Empiricism
No ratings yet
Aristotle's Empiricism
473 pages
18CS55 ADP Notes Module 4 and 5
100% (1)
18CS55 ADP Notes Module 4 and 5
72 pages
BBM - 978 94 009 3707 9 - 1
No ratings yet
BBM - 978 94 009 3707 9 - 1
54 pages
Adjustment Theory
No ratings yet
Adjustment Theory
46 pages
Descriptive Statistics
No ratings yet
Descriptive Statistics
240 pages
526 Syllabus PDF
No ratings yet
526 Syllabus PDF
5 pages
Session6 Student
No ratings yet
Session6 Student
19 pages
Dsa Mod1@Azdocuments - in
No ratings yet
Dsa Mod1@Azdocuments - in
44 pages
ATC (18CS54) Module-4: 14.1 The Decidable Questions
No ratings yet
ATC (18CS54) Module-4: 14.1 The Decidable Questions
23 pages
FTS-Phase-3 Test-3 Code-A Paper-1 21.04.2024 (Maths)
No ratings yet
FTS-Phase-3 Test-3 Code-A Paper-1 21.04.2024 (Maths)
25 pages
SAT Suite Question Bank - Geometry & Trigonometry With Answers
No ratings yet
SAT Suite Question Bank - Geometry & Trigonometry With Answers
166 pages
Moment 1
No ratings yet
Moment 1
7 pages
ATC Module 3
No ratings yet
ATC Module 3
40 pages
Discrete Mathematics: (ONLY Textbook I Allowed. No Notes or Cheat Sheet Is Allowed.)
No ratings yet
Discrete Mathematics: (ONLY Textbook I Allowed. No Notes or Cheat Sheet Is Allowed.)
6 pages
Reading: December 28, 2012 17:33 WSPC/0219-8762 196-IJCM 1250055
No ratings yet
Reading: December 28, 2012 17:33 WSPC/0219-8762 196-IJCM 1250055
15 pages
Solutions To Laplace's Equations-II
No ratings yet
Solutions To Laplace's Equations-II
10 pages
Mensuration of Plane Figures (Triangle)
No ratings yet
Mensuration of Plane Figures (Triangle)
5 pages
HW2 Solution
No ratings yet
HW2 Solution
4 pages
Machine Learning and Artificial Intelligence
No ratings yet
Machine Learning and Artificial Intelligence
4 pages
OOP-CS252 - Programming Test 1
No ratings yet
OOP-CS252 - Programming Test 1
2 pages
Game Theory
No ratings yet
Game Theory
27 pages
Elementary Row Operations and Rank of Matrix (Handout)
No ratings yet
Elementary Row Operations and Rank of Matrix (Handout)
8 pages
The Modulus Function - Definations
No ratings yet
The Modulus Function - Definations
4 pages
Power in RLC Circuit
No ratings yet
Power in RLC Circuit
2 pages
Calculus Mock Exam
No ratings yet
Calculus Mock Exam
3 pages
2nd PT - Math
86% (7)
2nd PT - Math
5 pages
18CS55 ADP Question Bank Module 3, 4 and 5
No ratings yet
18CS55 ADP Question Bank Module 3, 4 and 5
5 pages
2025 Grade 12 Mathematics Question by Question Analysis
No ratings yet
2025 Grade 12 Mathematics Question by Question Analysis
4 pages
Kapsabet - Maths
No ratings yet
Kapsabet - Maths
28 pages
Word Problems Involving Direct, Partitive and Inverse
No ratings yet
Word Problems Involving Direct, Partitive and Inverse
20 pages
Math Resources Trigonometric Formulas
No ratings yet
Math Resources Trigonometric Formulas
6 pages
Course: MES Course Code:18CSCS44 Program & Semester: CS-4 Max. Marks: 30 Date: 03-08-2021 Time: 02:00-3:00 (60 Minutes)
No ratings yet
Course: MES Course Code:18CSCS44 Program & Semester: CS-4 Max. Marks: 30 Date: 03-08-2021 Time: 02:00-3:00 (60 Minutes)
1 page
46 - Math and Logic Games
No ratings yet
46 - Math and Logic Games
3 pages
Vac Syl
No ratings yet
Vac Syl
3 pages
Student Solutions Manual to Accompany Economic Dynamics in Discrete Time, secondedition
From Everand
Student Solutions Manual to Accompany Economic Dynamics in Discrete Time, secondedition
Yue Jiang
4.5/5 (2)
A First Course in Functional Analysis
From Everand
A First Course in Functional Analysis
Martin Davis
No ratings yet
Differential Equations: A Concise Course
From Everand
Differential Equations: A Concise Course
H. S. Bear
5/5 (3)

ATC Module 2

Uploaded by

ATC Module 2

Uploaded by

Chapter-6

A RE is a string that can be formed according to the following rules:

if ∑ = {a,b}, the following strings are regular expressions:

ø, ε, a,b, (a U b)*, abba U ε.

Semantic interpretation function L for the language of regular expressions:

1. L (ø) = ø, the language that contains no strings.

Analysing Simple Regular Expressions

2. L( ((a U b) (a U b))a(a U b)*)

= L(((a U b)(a U b)))L(a) L((a U b)*)

Finding RE for a given Language

2. Let L = {w ϵ {a, b }*: w starting with string abb}.

3. Let L = {w ϵ {a, b }*: w ending with string abb}.

4. L = {w ϵ {0, 1}* : w have 001 as a substring}.

5. L = {w ϵ {0, 1}* : w does not have 001 as a substring}.

7. L = {w ϵ {a, b}* :#a(w) mod 3 = 0}.

8. Let L = {w ϵ {a, b }*:#a(w) <= 3}.

9. L = {w ϵ {0, 1}* : w contains no consecutive 0’s}

10. L = {w ϵ {0, 1}* : w contains at least two 0’s}

11.L = { anbm / n>=4 and m<= 3}

12.L = { anbm / n<=4 and m>= 2}

13. L = { a2nb2m / n>=0 and m>= 0}

14. L = { anbm:(m+n) is even}

Eg: (a U bb*a) is evaluated as (a U (b(b*)a))

Buiding an FSM from a RE

Theorem 1:For Every RE, there is an Equivalent FSM.

5. If α is the RE βγ, FSM M3=(K3, ∑ , δ3, s3, A3) and

Algorithm to construct FSM, given a regular expression α

regextofsm(α : regular expression) =

Building an FSM from a Regular Expression

1. Consider the regular expression (b U ab )*.

An FSM for (b U ab)

2. Construct FSM for the RE (b(a U b)b)*

Building an Equivalent Machine M

Algorithm for FSM to RE(heuristic)

1. Remove from M-any unreachable states.

2. No accepting states then return the RE ø.

5. M has only one state, So L (M} = { ε } and return RE ε.

6.1. Select some state rip of M.

6.2. Remove rip from M.

6.3. Modify the transitions. The labels on the rewritten

transitions may be any regular expression.

7. Return the regular expression that labels from the

start state to the accepting state.

After adding new start state 4 and accepting state 5

Step 2: let rip be state 3

Step 3: Let rip be state 2

After removing rip state 2

4-1-5 : (ab U aaa*b)*(a U ε)

Step 4: Let rip be state 1

After removing rip state 1

Statement : Every regular language can be defined with a regular expression.

Let FSM M = (K,∑,δ,S,A),construct a regular expression α such that

Collapsing Multiple Transitions

{C1,C2,C3. ..... Cn} - Multiple Transition

Delete and replace by {C1 U C2 U C3. ..... U Cn}

R'(p,q) = R(p,q) U R(p,rip)R(rip,rip)*R(rip,q)

Ripping States out one at a time

R'(1,3) = R(1,3) U R(1,rip)R(rip,rip)*R(rip,3)

Algorithm to build RE that describes L(M) from any FSM M = (K,∑,δ,S,A)

Two Sub Routines:

1. standardize : To convert M to the required form

2. buildregex : Construct the required RE from

i. Remove unreachable states from M

ii. Modify start state

iii. Modify accepting states

v. If there is no transition between p and q and p ∉A, q ∉S,then create a

i. If M has no accepting states then return RE Φ

ii. If M has only one accepting state ,return RE ε

a. Select some state rip of M

b. Find R'(p,q) = R(p,q) U R(p,rip).R(rip,rip)*.R (rip,q)

c. Remove rip on d all transitions into ad out of it

Example 2: Build RE from FSM

Step 1: let RIP be state 4

After removing rip state 4

Step 2: Collapse multiple transitions from state 1 to state 2

After collapsing multiple transitions from state 1 to state 2

Eg: (a U bba) is evaluated as (a U (b(b)a))

4-1-5 : (ab U aaab)(a U ε)

(Σ(K1 U K2 U ....... U Kn)Σ)+

a* U aa = a, since L(aa) ⊆ L(a).

• Similarly If β* ⊆ α* then αβ = α*

a(a U b) = (a U b), since L(a) ⊆ L((a U b)*).

(a U ε)* = a,since {ε} ⊆ L(a).

1. ((a* U Φ)* U aa) = (a) U aa //L(Φ) ⊆ L(a*)