0% found this document useful (0 votes)
23 views36 pages

Regular Expressions and Languages

This document discusses regular expressions and finite automata. [1] It explains that regular expressions provide a declarative way to express string patterns, while finite automata are more machine-like in accepting or rejecting input strings. [2] It then reviews the basic operators of regular expressions: union, concatenation, and Kleene closure. [3] Examples are provided to illustrate how regular expressions define languages.

Uploaded by

Atik
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
23 views36 pages

Regular Expressions and Languages

This document discusses regular expressions and finite automata. [1] It explains that regular expressions provide a declarative way to express string patterns, while finite automata are more machine-like in accepting or rejecting input strings. [2] It then reviews the basic operators of regular expressions: union, concatenation, and Kleene closure. [3] Examples are provided to illustrate how regular expressions define languages.

Uploaded by

Atik
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 36

Regular Expressions and

Languages
1

4/11/2020
Regular Expressions vs. Finite Automata
2

⚫ Offers a declarative way to express the pattern of any string


we want to accept
⚪ E.g., 01*+ 10*

⚫ Automata => more machine-like


< input: string , output: [accept/reject] >
⚫ Regular expressions => more program syntax-like

⚫ Unix environments heavily use regular expressions


⚪ E.g., bash shell, grep, vi & other editors

⚫ Lexical analyzers such as Lex or Flex

4/11/2020
Regular Expressions
3

⚫ Operators of Regular Expressions


⚪ Review of three operations on languages L and M:
Union --- L∪M = {x | x∈L or x∈M}
Example: If L={001, 10, 111} and M={ε, 001} then
L∪M={ε, 10, 001, 111 }
Concatenation --- LM = {xy | x∈L, y∈M}
Example: If L={001, 10, 111} and M={ε, 001} then
LM={001, 10, 111, 001001, 10001, 111001 }

4/11/2020
Kleene Closure (the * operator)
4

⚫ Kleene Closure of a given language L:


⚪ L 0 = { ε}
⚪ L1= {w | for some w ∈ L}
⚪ L2= { w1w2 | w1 ∈ L, w2 ∈ L (duplicates allowed)}
⚪ Li= { w1w2…wi | all w’s chosen are ∈ L (duplicates allowed)}
⚪ (Note: the choice of each wi is independent)
⚪ L* = Ui≥0 Li (arbitrary number of concatenations)
Example:
⚫ Let L = { 1, 00}
0
⚪ L = { ε}
⚪ L1= {1,00}
⚪ L2= {11,100,001,0000}
⚪ L3= {111,1100,1001,10000,000000,00001,00100,0011}
⚪ L* = L0 U L1 U L2 U …

4/11/2020
Regular Expressions
5

Building Regular Expressions


⚪ Recursive definition of a regular expression
(RE) E and the language , L(E):
Basis:
Constants ε and φ are RE’s, defining languages {ε} and φ,
respectively ⇒ L(ε) = {ε}, L(φ) = φ.
If a is a symbol, then a is an RE, defining the language {a}
⇒ L(a) = {a}. (note: a is of bold face)

4/11/2020
Regular Expressions
6

Building Regular Expressions


⚪ Induction: given two RE’s E and F, then
E + F is an RE such that L(E + F) = L(E) ∪ L(F)
(union)
EF is an RE such that L(EF) = L(E)L(F)
(concatenation)
E * is an RE such that L(E*) = (L(E))* (closure)

(E) is an RE such that L((E)) = L(E)


(parenthesization).

4/11/2020
Regular Expressions
7

⚫ Examples
⚪ RE F = 1 “expresses” the language L(1) = {1}.

*
⚪ RE E = 1

Language expressed by E ---


L = L(E) = L(1*) = (L(1))* = ({1})*
(closure of language)
= {ε, 1, 11, 111, 1111, …}
= {1n | n ≥ 0}

4/11/2020
Regular Expressions
8

⚫ Examples
*
⚪ RE G = 01

Language expressed by G ---


L = L(G) = L(01*) = L(0)L(1*) (concatenation)
= {0}{ε, 1, 11, 111, 1111, …}
= {0, 01, 011, 0111, …}
= {01n | n ≥ 0}

4/11/2020
Regular Expressions
9

⚫ Examples
*
⚪ RE H = 1 + 01

Language expressed by H ---


L = L(H) = L(1 + 01*) = L(1) U L(01*)
= {1} U {0, 01, 011, 0111, …}
= {1, 0, 01, 011, 0111, …}
= {1}U{01n | n ≥ 0}

4/11/2020
Regular Expressions
10

⚫ Examples
⚪ RE K = ε + a*
Language expressed by K ---
L = L(K) = L(ε + a*) = L(ε ) U L(a*)
= {ε} U {ε, a, aa, aaa, …}
= {ε, a, aa, aaa, …}
= L(a*)
That is, we have the following RE equalities:

ε + a* = a* = a* + ε

4/11/2020
Regular Expressions
11
⚫ Example
⚪ A RE defining a language of strings of alternating 0’s and 1’s
(including none) is one of the two below:

(01)* + (10)* + 0(10)* + 1(01)*


(0…1 1…0 0…0 1…1)
(ε + 1)(01)*(ε + 0)

4/11/2020
Regular Expressions
12

Precedence of RE operators
⚪ Precedence
*
Highest --- (closure)
Next--- . (concatenation) (left to right)
Last--- + (union) (left to right)
Use parentheses anywhere to resolve ambiguity

4/11/2020
Regular Expressions
13

Precedence of RE operators
⚪ Example

Three ways to interpret 01* + 1:


(0(1*)) + 1 by precedence above (= 01* + 1)
(01)* + 1 (another meaning)
0(1* + 1) (a third meaning)

4/11/2020
FA’s & RE’s
14

⚫ Important Theorems
⚪ Every language defined by a DFA is also defined by an RE.

⚪ Every language defined by an RE is also defined by an ε-NFA.

4/11/2020
FA’s & RE’s
15

ε-NFA NFA

RE DFA

4/11/2020
FA’s & RE’s
16

From DFA’s to RE’s


If L = L(A) for some DFA A, then there is an RE
R such that L = L(R).
Prove by constructing progressively string sets
defined by a certain RE form Rij(k) until the entire set
of acceptable strings (i.e., language L(A)) is
obtained.
Assume the states are {1, 2, ..., n} (1 is the start
state).

4/11/2020
FA’s & RE’s
17

⚫ Meaning of Rij(k) ---

⚪ Rij(k) is a regular expression


Language is set of strings w
w is the label of a path from state i to state j of DFA
A
The path has no intermediate state greater than k

4/11/2020
FA’s & RE’s
18

⚫ Meaning of Rij(k) ---


⚪ T construct Rij(k), we use induction, starting at k = 0
and stop at k = n (the largest state number).

Then, when k = n, i =1, and j specifies an accepting


state, then Rij(k) defines a set of strings accepted by
DFA A, with each string forming a path starting
from the start state to the accepting state.

4/11/2020
FA’s & RE’s
19

(k)
⚫ Meaning of R ---
ij
Basis:
⚪ when k = 0, since all state numbers ≥ 1, and so there is no
intermediate state in path i to j;
2 cases to consider:
(1) an arc (a transition) from i to j;
(2) a path from i to i itself.

4/11/2020
FA’s & RE’s
20

(k)
⚫ Meaning of R ---
ij

Basis (cont’d):
⚪ If i ≠ j, only (case 1) is possible:
no symbol for such a transition ⇒ Rij(0) = φ
one symbol a for the transition ⇒ Rij(0) = a
multiple symbls a1, a2, ..., am for the transition,
• ⇒ Rij(0) = a1 + a2 + ... + am

4/11/2020
FA’s & RE’s
21
(k)
⚫ Meaning of R ---
ij

Basis (cont’d) i ≠ j: qi qj

Rij(0) = φ
a
qi qj
Rij(0) = a
a1+…+am
(0)
qi qj
Rij = a1 + a2 + ... + am

4/11/2020
FA’s & RE’s
22
(k)
⚫ Meaning of R ---
ij
⚪ If i = j, only (case 2) is possible, which means there
exists at least a path ε from i to i itself, in addition to
the 3 cases:
no symbol for such a transition ⇒ Rij(0)= ε
one symbol a for the transition ⇒ Rij(0)= ε + a
multiple symbls a1, a2, ..., am for the transition,
• ⇒ Rij(0) = ε + a1 + a2 + ... + am

4/11/2020
FA’s & RE’s
(k) 23
⚫ Meaning of R ---
ij ε

Basis (cont’d) i = j: qi

Rij(0) = ε ε+a

Rij(0) = ε + a qi

Rij(0) = ε + a1 + a2 + ... + am ε+a1+…+am

qi

4/11/2020
FA’s & RE’s
24

R
Induction (to compute ij(k) ):
⚪ Suppose there is a path from i to j that goes through no state
numbered higher than k. Then, two cases should be
considered:
(1) the path does not go through k ⇒ Rij(k-1)
(2) the path goes through k at least once, then the path may be
broken into 3 pieces:
through i to k without passing k ⇒ Rik(k-1)
from k to k itself ⇒ (Rkk(k-1))* (recusive);
from k to j without passing k ⇒ Rkj(k-1).

4/11/2020
FA’s & RE’s
25

⚫ Illustration of paths represented by Rij(k) :

(Rkk(k-1))*
…… circulating zero
or more times

i … k … j
Rik (k-1)
Rkj(k-1)

4/11/2020
FA’s & RE’s
26

Induction (cont’d):

⚪ The three pieces are concatenated to be


Rik(k-1)(Rkk(k-1))*Rkj(k-1).

⚪ Combining (1) & (2), we get the RE defining “all the paths from
i to j that go through no state higher than k” as
Rij(k) = Rij(k-1) + Rik(k-1)(Rkk(k-1))*Rkj(k-1).

4/11/2020
FA’s & RE’s
27

⚫ Example
⚪ Convert the following DFA into an RE.
1 0, 1

0
start 1 2
⚪ Rij(0) may be constructed to be (details in the next page):

R11(0) ε+1
R12(0) 0
R21(0) φ
R22(0) (ε + 0 + 1)
4/11/2020
FA’s & RE’s
28

⚫ Example
1 0, 1

0
start 1 2

⚪ R11(0) = ε + 1 because δ(1, 1) = 1 & going back to itself


⚪ R12(0) = 0 because δ(1, 0) = 2 (going out to state 2)
⚪ R21(0) = φ because there is no path from state 2 to 1
⚪ R22(0) = (ε + 0 + 1) because δ(2, 0) = 2 & δ(2, 1) = 2 &
going back to itself

4/11/2020
FA’s & RE’s
29

⚫ Example (cont’d) 0, 1
1

0
start 1 2

⚪ We can then compute all Rij(k) for k=1 & k=2.

⚪ However, we may alternatively compute only necessary


terms of Rij(k) backward from the final states, to save time.

4/11/2020
FA’s & RE’s
30

⚫ Example (cont’d)
⚪ There is only one final state 2, so only have to compute
R12(2) = R12(1) + R12(1)(R22(1))*R22(1).

⚪ Only have to compute R12(1) and R22(1), without computing


R21(1) and R11(1).

⚪ To compute each of these terms, we need some RE equalities


to simplify intermediate results.

4/11/2020
FA’s & RE’s
31

⚫ Some equalities (R is an RE):


1. φR=Rφ=φ (φ=annihilator for concatenation)

2. φ + R = R + φ = R (φ=identity for union)

3. εR = Rε = R (ε = identity for concatenation)

4. (ε + a)* = a* = (a + ε)*

5. (ε + a)a* = (εa* + aa*) = a* + a+ = a*

6. a*(ε + a) = (a*ε + a*a) = a* + a+ = a*

(all provable by easy deduction)

4/11/2020
FA’s & RE’s
32

⚫ To compute
R12(2) = R12(1) + R12(1)(R22(1))*R22(1)
⚪ R12(1) = R12(0) + R11(0)(R11(0))*R12(0)
= 0 + (ε + 1)(ε + 1)*0 (by substitutions)
= 0 + (ε + 1)1*0 (by 4. in last slide )
= 0 + 1* 0 (by 5.)
= (ε + 1*)0 (by distributive law)
= 1*0 (by 4.)
4/11/2020
FA’s & RE’s
33

⚫ To compute
R12(2) = R12(1) + R12(1)(R22(1))*R22(1)
⚪ R22(1) = R22(0) + R21(0)(R11(0))*R12(0)
= (ε + 0 + 1) + φ(ε + 1)*0 (by substitutions)
= (ε + 0 + 1) + φ (by 1.)
=ε+0+1 (by 2.)

4/11/2020
FA’s & RE’s
34

⚫ To compute
R12(2) = R12(1) + R12(1)(R22(1))*R22(1)
⚪ Finally, R12(2)
= 1*0 +1*0(ε + 0 + 1)*(ε + 0 + 1) (by subst.)
= 1*0 +1*0(0 + 1)*(ε + 0 + 1) (by 4.)
= 1*0 +1*0(0 + 1)* (by 6.)
=1*0(ε + (0 + 1)*) (by distributive law)
= 1*0(0 + 1)* (by 4.)

4/11/2020
FA’s & RE’s
35

⚫ Check the correctness of the final result


R12(2) = 1*0(0 + 1)*

1 0, 1

0
start 1 2

It is a language that begins with zero or more 1’s


than have a zero and then any string of zero’s
and 1’s
correct (by looking at the diagram directly)!
4/11/2020
Acknowledgement
36

⚫ Tania Akter Setu


⚫ Lecturer, Dept. of CSE , UITS

4/11/2020

You might also like