Compiler - Chap.2.part 3


Regular Expression

• As discussed earlier, a* generates Λ, a, aa, aaa, … and a+ generates
a, aa, aaa, aaaa, …, so the languages L1 = {Λ, a, aa, aaa, …} and
L2 = {a, aa, aaa, aaaa, …} can simply be expressed
by a* and a+, respectively.
a* and a+ are called the regular expressions (RE)
for L1 and L2, respectively.
Note: a+, aa* and a*a all generate L2.
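The note above can be checked on short strings. The following is a minimal sketch using Python's `re` module, where the postfix `*` and `+` have the same meaning as on the slide; the helper name `language` and the length bound of 6 are assumptions made for illustration, and a bounded check is evidence rather than a proof.

```python
import re
from itertools import product

def language(pattern, alphabet="a", max_len=6):
    """Return the strings over `alphabet`, up to `max_len` letters, that `pattern` fully matches."""
    compiled = re.compile(pattern)
    words = set()
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            word = "".join(letters)
            if compiled.fullmatch(word):
                words.add(word)
    return words

L1 = language("a*")   # includes the empty string, which plays the role of Λ
L2 = language("a+")   # excludes the empty string

# a+, aa* and a*a should describe the same language as L2.
same = language("aa*") == L2 == language("a*a")
print(sorted(L1, key=len))
print(same)
```

The only string in L1 that is missing from L2 is the empty string, which is the difference between a* and a+.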

1
Recursive definition of Regular Expression (RE)
Step 1: Every letter of Σ, and also Λ, is a regular
expression.
Step 2: If r1 and r2 are regular expressions then
1. (r1)
2. r1 r2
3. r1 + r2 and
4. r1*
are also regular expressions.
Step 3: Nothing else is a regular expression.
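The three steps can be mirrored by a small recursive function. This is only a sketch under an assumed encoding: a regular expression is represented as a nested tuple, with the hypothetical tags "cat", "alt" and "star" standing for r1 r2, r1 + r2 and r1*, and a one-letter string (or "" for Λ) for Step 1. Parentheses need no tag because the nesting of the tuples already records the grouping.

```python
def generate(r, max_len):
    """Strings of length <= max_len generated by the expression tree `r`.

    Assumed tree shapes: a one-letter string, or "" for Λ (Step 1);
    ("cat", r1, r2), ("alt", r1, r2), ("star", r1) (Step 2).
    """
    if isinstance(r, str):
        return {r} if len(r) <= max_len else set()
    op = r[0]
    if op == "alt":
        return generate(r[1], max_len) | generate(r[2], max_len)
    if op == "cat":
        left, right = generate(r[1], max_len), generate(r[2], max_len)
        return {u + v for u in left for v in right if len(u + v) <= max_len}
    if op == "star":
        base = generate(r[1], max_len)
        result, frontier = {""}, {""}
        # Keep appending strings of `base` until no new short string appears.
        while frontier:
            frontier = {u + v for u in frontier for v in base
                        if len(u + v) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown operator: {op!r}")

a_or_b_star = ("star", ("alt", "a", "b"))          # (a + b)*
print(sorted(generate(a_or_b_star, 2), key=lambda w: (len(w), w)))
print(sorted(generate(("cat", "a", ("star", "b")), 3)))  # ab*
```

Anything that cannot be built from these four shapes is rejected with an error, which corresponds to Step 3.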
2
Defining Languages (continued)…
• Method 3 (Regular Expressions)
• Consider the language L = {Λ, x, xx, xxx, …} of
strings, defined over Σ = {x}.
We can write this language as the Kleene star
closure of the alphabet Σ, i.e. L = Σ* = {x}*.
This language can also be expressed by the
regular expression x*.
• Similarly, the language L = {x, xx, xxx, …}, defined
over Σ = {x}, can be expressed by the regular
expression x+.

3
• Now consider another language L, consisting of
all possible strings defined over Σ = {a, b}.
This language can also be expressed by the
regular expression
(a + b)*.
• Now consider another language L, of strings
having exactly one double a (aa) and no other a,
defined over Σ = {a, b}; its regular expression may be
b*aab*

4
• Now consider another language L, of strings of even
length, defined over Σ = {a, b}; its regular
expression may be
((a+b)(a+b))*
• Now consider another language L, of strings of odd length,
defined over Σ = {a, b}; its regular
expression may be
(a+b)((a+b)(a+b))* or
((a+b)(a+b))*(a+b)

5
• Example:
• The language, defined over
Σ = {a, b}, of words having at least one a may be
expressed by the regular expression
(a+b)*a(a+b)*.
• The language, defined over
Σ = {a, b}, of words having at least one a and
at least one b may be expressed by the regular
expression
(a+b)*a(a+b)*b(a+b)* + (a+b)*b(a+b)*a(a+b)*.
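Both expressions can be compared against their verbal descriptions on all short words. The sketch below uses Python's `re` module, in which the union written "+" on the slides is written "|"; the helper names `all_words` and `agrees` and the length bound of 8 are assumptions for illustration.

```python
import re
from itertools import product

# Slide notation "+" (union) is written "|" in Python's `re` module.
AT_LEAST_ONE_A = re.compile(r"(a|b)*a(a|b)*")
ONE_A_AND_ONE_B = re.compile(r"(a|b)*a(a|b)*b(a|b)*|(a|b)*b(a|b)*a(a|b)*")

def all_words(max_len, alphabet="ab"):
    """Yield every string over `alphabet` of length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def agrees(pattern, predicate, max_len=8):
    """True if `pattern` accepts exactly the words satisfying `predicate`, up to `max_len`."""
    return all(bool(pattern.fullmatch(w)) == predicate(w) for w in all_words(max_len))

print(agrees(AT_LEAST_ONE_A, lambda w: "a" in w))
print(agrees(ONE_A_AND_ONE_B, lambda w: "a" in w and "b" in w))
```

The second expression needs both alternatives because the first a may come before or after the first b.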
6
• Consider the language, defined over
Σ={a, b}, of words starting with double a and
ending in double b then its regular expression
may be aa(a+b)*bb
• Consider the language, defined over
Σ={a, b} of words starting with a and ending in
b OR starting with b and ending in a, then
its regular expression may be
a(a+b)*b+b(a+b)*a

7
TASK
• Consider the language, defined over
Σ={a, b}, of words beginning with a; its
regular expression may be a(a+b)*

• Consider the language, defined over
Σ={a, b}, of words beginning and ending in the same
letter; its regular expression may be
(a+b)+a(a+b)*a+b(a+b)*b

8
TASK
• Consider the language, defined over
Σ={a, b} of words ending in b, then its regular
expression may be (a+b)*b.
• Consider the language, defined over
Σ={a, b}, of words not ending in a; its regular
expression may be (a+b)*b + Λ. It is to be noted that
this language may also be expressed by ((a+b)*b)*.

9
Task
• Determine the RE of the language, defined over Σ={a,
b} of words beginning with a.
Solution:
The required RE may be a(a+b)*
• Determine the RE of the language, defined over Σ={a, b}
of words beginning with and ending in same letter.
Solution:
The required RE may be (a+b)+a(a+b)*a+b(a+b)*b

10
Task Continued …
• Determine the RE of the language, defined over Σ={a, b}
of words ending in b.
Solution:
The required RE may be
(a+b)*b.
• Determine the RE of the language, defined over
Σ={a, b} of words not ending in a.
Solution: The required RE may be
(a+b)*b + Λ Or ((a+b)*b)*

11
An important example
The Language EVEN-EVEN :
Language of strings, defined over Σ={a, b} having
even number of a’s and even number of b’s. i.e.
EVEN-EVEN = {Λ, aa, bb, aaaa,aabb,abab, abba,
baab, baba, bbaa, bbbb,…} ,
its regular expression can be written as
(aa+bb+(ab+ba)(aa+bb)*(ab+ba))*
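The EVEN-EVEN expression can be compared with a direct count of letters. This is a bounded check only (all words up to length 8, an arbitrary choice), written with "|" in place of the slide's "+"; the helper name `is_even_even` is an assumption.

```python
import re
from itertools import product

# EVEN-EVEN in Python `re` syntax ("|" replaces the slide's union "+").
EVEN_EVEN = re.compile(r"(aa|bb|(ab|ba)(aa|bb)*(ab|ba))*")

def is_even_even(word):
    """Direct definition: an even number of a's and an even number of b's."""
    return word.count("a") % 2 == 0 and word.count("b") % 2 == 0

# Collect every short word on which the expression and the definition disagree.
disagreements = [
    "".join(letters)
    for n in range(9)
    for letters in product("ab", repeat=n)
    if bool(EVEN_EVEN.fullmatch("".join(letters))) != is_even_even("".join(letters))
]
print(disagreements)
```

The expression reads the word two letters at a time: aa and bb keep both counts even, while a mixed pair (ab or ba) must eventually be balanced by another mixed pair.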

12
Note
• It is important to be clear about the
difference between the following regular
expressions
r1=a*+b*
r2=(a+b)*
Here r1 generates only strings consisting entirely of a’s
or entirely of b’s, so it does not generate any string
that mixes a and b (such as ab), while r2 generates
such strings.

13
Equivalent Regular
Expressions
• Definition:
Two regular expressions are said to be
equivalent if they generate the same language.
Example:
Consider the following regular expressions
r1= (a + b)* (aa + bb)
r2= (a + b)*aa + ( a + b)*bb then
both regular expressions define the language of
strings ending in aa or bb.
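Equivalence can be probed by searching for a short word that one expression accepts and the other rejects. The function below is a sketch with an assumed name (`first_difference`) and an assumed length bound; finding no difference up to the bound suggests equivalence but does not prove it.

```python
import re
from itertools import product

def first_difference(p1, p2, alphabet="ab", max_len=8):
    """Return the shortest word accepted by exactly one pattern, or None if none is found."""
    r1, r2 = re.compile(p1), re.compile(p2)
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            w = "".join(letters)
            if bool(r1.fullmatch(w)) != bool(r2.fullmatch(w)):
                return w
    return None

# r1 = (a+b)*(aa+bb) and r2 = (a+b)*aa + (a+b)*bb, with "|" for the slide's "+".
print(first_difference(r"(a|b)*(aa|bb)", r"(a|b)*aa|(a|b)*bb"))  # None
# The earlier pair a*+b* and (a+b)* is not equivalent.
print(first_difference(r"a*|b*", r"(a|b)*"))                     # ab
```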
14
Note

• If r1 =(aa + bb) and r2=( a + b) then


1. r1+r2 =(aa + bb) + (a + b)
2. r1r2 =(aa + bb) (a + b)
=(aaa + aab + bba + bbb)
3. (r1)* =(aa + bb)*

15
Regular Languages
• Definition:
The language generated by any regular expression
is called a regular language.
It is to be noted that if r1, r2 are regular
expressions corresponding to the languages L1
and L2, then the languages generated by r1 + r2, r1r2
(or r2r1) and r1* (or r2*) are also regular languages.

16
Note
• It is to be noted that if L1 and L2 are expressed by r1 and r2,
respectively, then the language expressed by
1) r1 + r2 is the language L1 + L2 or L1 U L2
2) r1r2 is the language L1L2, of strings obtained by following
every string of L1 with every string of L2
3) r1* is the language L1*, of strings obtained by
concatenating the strings of L1, including the null string.
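The three operations can be written directly on sets of strings. This is a minimal sketch: the function names are assumptions, and because L1* is usually infinite, `star` only lists its strings up to a chosen length.

```python
def union(l1, l2):
    """L1 + L2: every string that belongs to L1 or to L2."""
    return l1 | l2

def concat(l1, l2):
    """L1L2: every string of L1 followed by every string of L2."""
    return {u + v for u in l1 for v in l2}

def star(l, max_len):
    """Strings of L* no longer than max_len, including the null string."""
    result, frontier = {""}, {""}
    while frontier:
        frontier = {u + v for u in frontier for v in l if len(u + v) <= max_len} - result
        result |= frontier
    return result

L1 = {"aa", "bb"}   # language of r1 = aa + bb
L2 = {"a", "b"}     # language of r2 = a + b
print(sorted(union(L1, L2)))
print(sorted(concat(L1, L2)))
print(sorted(star(L1, 4), key=lambda w: (len(w), w)))
```

Note that `concat(L1, L2)` and `concat(L2, L1)` generally differ, which is why r1r2 and r2r1 are listed separately.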

17
Example
• If r1=(aa+bb) and r2=(a+b) then the language of strings
generated by r1+r2, is also a regular language, expressed by
(aa+bb)+(a+b)
• If r1=(aa+bb) and r2=(a+b) then the language of strings
generated by r1r2, is also a regular language, expressed by
(aa+bb)(a+b)
• If r=(aa+bb) then the language of strings generated by r*, is
also a regular language, expressed by (aa+bb)*

18
All finite languages are
regular.
Example:
Consider the language L, defined over Σ={a,b}, of
strings of length 2, starting with a, then
L={aa, ab}, may be expressed by the regular
expression aa+ab. Hence L, by definition, is a
regular language.

19
Note
It may be noted that even if a language contains a
thousand words, its RE may be expressed by placing
‘+’ between all the words.
Here the special structure of the RE is not important.
Consider the language L={aaa, aab, aba, abb, baa,
bab, bba, bbb}, that may be expressed by a RE
aaa+aab+aba+abb+baa+bab+bba+bbb, which is
equivalent to (a+b)(a+b)(a+b).
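Placing "+" between all the words is easy to mechanize. The sketch below builds such an expression in Python `re` syntax (where the union is "|") and compares it with the compact form on short words; the function name `finite_language_regex` is an assumption.

```python
import re
from itertools import product

def finite_language_regex(words):
    """Join the words with '|' (the slide's '+'); an empty word stands for Λ."""
    return "|".join(re.escape(w) for w in sorted(words))

L = {"aaa", "aab", "aba", "abb", "baa", "bab", "bba", "bbb"}
listed = re.compile(finite_language_regex(L))
compact = re.compile(r"(a|b)(a|b)(a|b)")

# Compare both expressions with plain set membership on all words up to length 4.
candidates = ["".join(p) for n in range(5) for p in product("ab", repeat=n)]
print(finite_language_regex({"aa", "ab"}))
print(all(bool(listed.fullmatch(w)) == bool(compact.fullmatch(w)) == (w in L)
          for w in candidates))
```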

20
Introduction to Finite Automaton

• Consider the following game board that contains 64 boxes

21
Finite Automaton
Continued …
There are some pieces of paper. Some are white
while others are black. The number of
pieces of paper is 64 or fewer. The possible
arrangements in which these pieces of paper can
be placed in the boxes are finite. To start the game,
one of the arrangements is taken to be the initial
arrangement. There is a pair of dice that can
generate the numbers 2, 3, 4, …, 12. For each number
generated, a unique arrangement is associated
among the possible arrangements.

22
Finite Automaton
Continued …
It shows that the total number of transition rules between
arrangements is finite. One or more arrangements
can be taken to be winning arrangements. It
can be observed that winning the game
depends on the sequence in which the numbers are
generated. This structure of the game can be considered
to be a finite automaton.

23
Defining Languages
(continued)…
• Method 4 (Finite Automaton)
Definition:
A finite automaton (FA) is a collection of the following
1) Finite number of states, having one initial and some
(maybe none) final states.
2) Finite set of input letters (Σ) from which input strings are
formed.
3) Finite set of transitions i.e. for each state and for each
input letter there is a transition showing how to move
from one state to another.

24
Example
• Σ = {a,b}
• States: x, y, z where x is an initial state and z is final state.
• Transitions:
1. At state x reading a go to state z,
2. At state x reading b go to state y,
3. At state y reading a, b go to state y
4. At state z reading a, b go to state z

25
Example Continued …

• These transitions can be expressed by the following table, called the transition table

Old States    New States
              Reading a    Reading b
x-            z            y
y             y            y
z+            z            z
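The transition table translates directly into a lookup structure. The following is a minimal sketch that stores the table as a nested dictionary and runs the FA on a word; the names `TRANSITIONS`, `INITIAL`, `FINALS` and `accepts` are assumptions chosen for illustration.

```python
# Transition table from the example: rows are states, columns are input letters.
TRANSITIONS = {
    "x": {"a": "z", "b": "y"},
    "y": {"a": "y", "b": "y"},
    "z": {"a": "z", "b": "z"},
}
INITIAL = "x"    # marked x- in the table
FINALS = {"z"}   # marked z+ in the table

def accepts(word, transitions=TRANSITIONS, initial=INITIAL, finals=FINALS):
    """Run the FA on `word` and report whether it stops in a final state."""
    state = initial
    for letter in word:
        state = transitions[state][letter]
    return state in finals

for w in ["", "a", "ab", "b", "ba"]:
    print(repr(w), accepts(w))
```

Once the FA reaches y it can never leave, so any word beginning with b is rejected, and once it reaches z every continuation is accepted.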
26
Note
• It may be noted that the information of an FA,
given in the previous table, can also be depicted by
the following diagram, called the transition
diagram, of the given FA

[Transition diagram: x– goes to z+ on a and to y on b; y loops on a,b; z+ loops on a,b]
27
Remark

• The previous transition diagram is an FA accepting
the language of strings, defined over Σ={a, b},
starting with a. It may be noted that this language
may be expressed by the regular expression
a (a + b)*

28
Recap Lecture 4
• Regular expression of EVEN-EVEN language,
Difference between a* + b* and (a+b)*, Equivalent
regular expressions; sum, product and closure of
regular expressions; regular languages, finite
languages are regular, introduction to finite
automaton, definition of FA, transition table,
transition diagram.

29
Note
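• It may be noted that the initial state can also be
indicated by placing an arrowhead before it, and
the final state by drawing it with a double circle, as shown
below. It is also to be noted that while expressing an
FA by its transition diagram, the labels of states are
not necessary.

[Transition diagram: unlabeled states, the initial one marked by an arrowhead and the final one by a double circle; extracted edge labels: a, b and a, b]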

30
Example
• Σ = {a,b}
States: x, y, where x is both initial and final state.
Transitions:
1.At state x reading a or b go to state y.
2.At state y reading a or b go to state x.

31
Example Continued …

• These transitions can be expressed by the following transition table

Old States    New States
              Reading a    Reading b
x±            y            y
y             x            x

32
Example Continued …
• It may be noted that the previous transition table
may be depicted by the following transition
diagram.

[Transition diagram: x± goes to y on a, b; y goes back to x± on a, b]

33
Example Continued …
• The previous transition diagram is an FA accepting
the language of strings, defined over Σ={a, b} of
even length. It may be noted that this language
may be expressed by the regular expression

((a+ b) (a + b))*

34
TASK
Build an FA for the language L of strings,
defined over Σ={a, b}, of odd length.

35
Solution of Task

[Transition diagram: initial state – goes to final state + on a,b; + goes back to – on a,b]

36
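Example: Consider the language L of strings, defined
over Σ={a, b}, starting with b. The language L may be
expressed by the RE b(a + b)* and may be accepted by the
following FA
[Transition diagram: initial state – goes to final state + on b and to a non-final state on a; both of these states loop on a,b]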
37
• Example:
Consider the language L of strings, defined over
Σ={a, b}, ending in a. The language L may be
expressed by RE
(a+b)*a
This language may be accepted by the following FA

38
Example Continued …

[Transition diagram: initial state – and final state +; extracted edge labels: b, a, a]

There may be another FA corresponding to the given language.
39
Example continued …

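[Transition diagram: a second FA for the same language, with initial state – and final state +; extracted edge labels: a, a, a, b, b]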

40
Note

• It may be noted that corresponding to a given
language there may be more than one FA accepting
that language, but for a given FA there is a unique
language accepted by that FA.

41
Note

• It is to be noted that given the languages L1 and L2, where
L1 = The language of strings, defined over Σ={a,
b}, beginning with a
L2 = The language of strings, defined over Σ={a,
b}, not beginning with b
the null string Λ does not belong to L1 while it does belong to
L2. This fact may be depicted by the corresponding
transition diagrams of L1 and L2.
42
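FA1 Corresponding to L1
[Transition diagram: initial state – goes to final state + on a and to a non-final state on b; both of these states loop on a,b]

• The language L1 may be expressed by the regular
expression a(a + b)*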
43
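FA2 Corresponding to L2
[Transition diagram: state ± (both initial and final) goes to final state + on a and to a non-final state on b; both of these states loop on a,b]

• The language L2 may be expressed by the regular
expression a (a + b)* + Λ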

44
Example
• Consider the language L of strings of length two or
more, defined over Σ = {a, b}, beginning with and ending
in the same letter.
The language L may be expressed by the following
regular expression
a (a + b)* a + b (a + b)* b
It is to be noted that if the condition on the length of
the string is not imposed on the above language, then the
strings a and b will also belong to the language.
This language L may be accepted by the following FA

45
Example Continued …

[Transition diagram: FA with two final states (+); the edge layout is not recoverable from the extracted text]

46
Task

• Build an FA accepting the language L of strings,
defined over Σ = {a, b}, beginning with and ending in
the same letter.

47
TASK
Build an FA for the language L of strings,
defined over Σ={a, b}, of odd length.
Solution: The language L may be
expressed by the RE (a+b)((a+b)(a+b))*
or ((a+b)(a+b))*(a+b)
This language may be accepted by the
following FA

48
Solution continued …

[Transition diagram: state 1– goes to state 2+ on a,b; state 2+ goes back to state 1– on a,b]

49
Task
• Build an FA accepting the language L of strings,
defined over Σ = {a, b}, beginning with and ending in
the same letter.
Solution: The language L may be expressed by the
following regular expression
(a+b)+a(a + b)*a + b(a + b)*b
This language L may be accepted by the following FA

50
Solution continued …

[Transition diagram: states 1– (initial), 2+, 3+, 6+ and 7+ (final), 4 and 5; the edge layout is not recoverable from the extracted text]

51
Example

Consider the language L of strings, defined over Σ
= {a, b}, beginning with and ending in different
letters.
The language L may be expressed by the following
regular expression
a (a + b)* b + b (a + b)* a
This language may be accepted by the following FA

52
Example Continued …

[Transition diagram: states 1– (initial), 2, 3, 4+ and 5+ (final); the edge layout is not recoverable from the extracted text]

53
Example
• Consider the language L, defined over Σ=
{a, b}, of all strings including Λ. The language L may
be accepted by the following FA
[Transition diagram: state 1± goes to state 2+ on a,b; state 2+ loops on a,b]

• The language L may also be accepted by the
following FA

54
Example Continued …
[Transition diagram: a single state looping on a,b]

• The language L may be expressed by the following
regular expression

(a + b)*

55
Example
• Consider the language L, defined over Σ=
{a, b}, of all non-empty strings. The language L may
be accepted by the following FA

[Transition diagram: initial state – goes to final state + on a,b; + loops on a,b]

The above language may be expressed by the
following regular expression (a + b)+, i.e. (a + b)(a + b)*

56
Example

• Consider the following FA, defined over Σ = {a, b}
[Transition diagram: initial state – loops on a,b; final state + loops on a,b; no edge connects the two states]

• It is to be noted that the above FA does not accept
any string. It does not even accept the null string, as
there is no path starting from the initial state and ending
in the final state.
57
Equivalent FAs

• It is to be noted that two FAs are said to be
equivalent if they accept the same language,
as shown in the following FAs.

58
Equivalent FAs Continued
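FA1: [Transition diagram: initial state – loops on a,b; final state + loops on a,b; no edge connects the two states]

FA2: [Transition diagram: initial state – and states 1 and 2, with edges labeled a,b; no final state]

FA3: [Transition diagram: states 1– (initial), 2 and 3+ (final), with edges labeled a,b; no path leads from state 1 to state 3]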

59
Note (Equivalent FAs)
• FA1 has already been discussed, while in FA2 there
is no final state, and in FA3 there is a final state but
it cannot be reached from the initial state, as the
states 2 and 3 are disconnected.
It may also be noted that the language of strings
accepted by FA1, FA2 and FA3 is the
empty set, i.e.
{ } OR Ø
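Whether an FA accepts the empty set can be decided by checking which states are reachable from the initial state. The sketch below uses hypothetical encodings written in the spirit of FA1, FA2 and FA3 (the exact edges of the original diagrams are assumptions), with the function names `reachable_states` and `accepts_nothing` chosen for illustration.

```python
from collections import deque

def reachable_states(transitions, initial):
    """States reachable from `initial` by following any sequence of transitions."""
    seen, queue = {initial}, deque([initial])
    while queue:
        state = queue.popleft()
        for target in transitions[state].values():
            if target not in seen:
                seen.add(target)
                queue.append(target)
    return seen

def accepts_nothing(transitions, initial, finals):
    """True if no final state can be reached, i.e. the accepted language is Ø."""
    return not (reachable_states(transitions, initial) & set(finals))

# Assumed encodings: (transition table, initial state, set of final states).
fa1 = ({1: {"a": 1, "b": 1}, 2: {"a": 2, "b": 2}}, 1, {2})   # final state never reached
fa2 = ({1: {"a": 2, "b": 2}, 2: {"a": 1, "b": 1}}, 1, set())  # no final state at all
fa3 = ({1: {"a": 2, "b": 2}, 2: {"a": 2, "b": 2}, 3: {"a": 3, "b": 3}}, 1, {3})

print([accepts_nothing(*fa) for fa in (fa1, fa2, fa3)])
```

All three FAs are therefore equivalent to each other, even though their diagrams look different.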

60
Example
Consider the language L of strings, defined
over Σ = {a, b}, containing double a.
The language L may be expressed by the
following regular expression
(a+b)* (aa) (a+b)*.
This language may be accepted by the following
FA

61
Example Continued …

[Transition diagram: states 1– (initial), 2 and 3+ (final); extracted edge labels: b, a, a and a,b]

62
Example

Consider the language L of strings, defined over
Σ={0, 1}, having double 0’s or double 1’s. The
language L may be expressed by the regular
expression (0+1)* (00 + 11) (0+1)*
This language may be accepted by the following FA

63
Example Continued …
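[Transition diagram: initial state –, intermediate states x and y, final state +; extracted edge labels: 0, 0, 0, 1, 1, 1 and 0,1]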
64
Example
Consider the language L of strings, defined over
Σ={a, b}, having triple a’s or triple b’s. The
language L may be expressed by RE
(a+b)* (aaa + bbb) (a+b)*
This language may be accepted by the following FA

65
Example Continued …
[Transition diagram: states 1– (initial), 2, 3, 4, 5 and 6+ (final); the edge layout is not recoverable from the extracted text]
66
Example
• Consider the EVEN-EVEN language, defined over
Σ={a, b}. As discussed earlier, the EVEN-EVEN
language can be expressed by the regular
expression (aa+bb+(ab+ba)(aa+bb)*(ab+ba))*
The EVEN-EVEN language may be accepted by the
following FA

67
Example Continued …
[Transition diagram: states 1, 2, 3 and 4; b-edges connect 1 with 3 and 2 with 4, a-edges connect 1 with 2 and 3 with 4]
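A four-state FA for EVEN-EVEN only has to remember whether the number of a's and the number of b's read so far are even or odd. The sketch below assumes a state numbering in which a switches between 1 and 2 and between 3 and 4, b switches between 1 and 3 and between 2 and 4, and state 1 is both initial and final; this numbering is an assumption about the diagram, not something the extracted text confirms.

```python
from itertools import product

# Assumed numbering: 1 = (even a, even b), 2 = (odd a, even b),
#                    3 = (even a, odd b),  4 = (odd a, odd b).
TRANSITIONS = {
    1: {"a": 2, "b": 3},
    2: {"a": 1, "b": 4},
    3: {"a": 4, "b": 1},
    4: {"a": 3, "b": 2},
}

def accepts_even_even(word):
    """Run the FA from state 1 and accept only if it returns to state 1."""
    state = 1
    for letter in word:
        state = TRANSITIONS[state][letter]
    return state == 1

# Compare the FA with a direct count on every word up to length 8.
consistent = all(
    accepts_even_even("".join(p))
    == (p.count("a") % 2 == 0 and p.count("b") % 2 == 0)
    for n in range(9)
    for p in product("ab", repeat=n)
)
print(consistent)
```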

68
FA corresponding to finite
languages
• Example
Consider the language
L = {Λ, b, ab, bb}, defined over
Σ ={a, b}, expressed by
Λ + b + ab + bb OR Λ + b (Λ + a + b).
The language L may be accepted by the following
FA

69
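Example continued …
[Transition diagram: states 1 (initial), 2+, 3, 4+, 5+ and the states x and y, each of which loops on a,b; the remaining edge layout is not recoverable from the extracted text]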
70
Example Continued …
• It is to be noted that the states x and y are called
Dead States, Waste Baskets or Davy Jones' Lockers,
because once the FA enters one of these states
there is no way to leave it.
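Dead states can be found mechanically: they are the states from which no final state can be reached. The sketch below uses a hypothetical encoding of an FA for L = {Λ, b, ab, bb}; the state names follow the example, but the individual edges are assumptions since they are not confirmed by the extracted diagram.

```python
# Assumed FA for L = {Λ, b, ab, bb}; x and y absorb every word outside L.
TRANSITIONS = {
    "1": {"a": "3", "b": "2"},
    "2": {"a": "x", "b": "5"},
    "3": {"a": "y", "b": "4"},
    "4": {"a": "y", "b": "y"},
    "5": {"a": "x", "b": "x"},
    "x": {"a": "x", "b": "x"},
    "y": {"a": "y", "b": "y"},
}
FINALS = {"1", "2", "4", "5"}

def dead_states(transitions, finals):
    """States from which no final state can be reached (the 'waste baskets')."""
    # Work backwards: start from the final states and collect everything that can reach them.
    alive = set(finals)
    changed = True
    while changed:
        changed = False
        for state, row in transitions.items():
            if state not in alive and any(t in alive for t in row.values()):
                alive.add(state)
                changed = True
    return set(transitions) - alive

print(sorted(dead_states(TRANSITIONS, FINALS)))
```

State 3 is not final, yet it is not dead either, because reading b from it leads to the final state 4.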

71
Note
• It is to be noted that when building an FA accepting a
language with only a few strings, a tree
structure may also help, as can be
observed in the following transition diagram for the
language L discussed in the previous example
72
[Transition diagram (tree structure): states 1±, 2+, 3, 4, 5+, 6, 7+ and 8; the edge layout is not recoverable from the extracted text]
73
Example

Consider the language
L = {aa, bab, aabb, bbba}, defined over
Σ = {a, b}, expressed by
aa + bab + aabb + bbba OR aa (Λ + bb) + b (ab + bba)
The above language may be accepted by the
following FA

74
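Example Continued …
[Transition diagram: states 1– (initial), 2, 3+, 4, 5+, 6, 7, 8, 9+, 10, 11+ and the states x and y, each of which loops on a,b; the remaining edge layout is not recoverable from the extracted text]
75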
Example

Consider the language L =
{w belongs to {a,b}*: length(w) >= 2 and w neither
ends in aa nor bb}.
The language L may be expressed by the regular
expression
(a+b)*(ab+ba)
This language may be accepted by the following FA

76
Example Continued …

[Transition diagram: states 1– (initial), 2, 3, 4+ and 5+ (final); the edge layout is not recoverable from the extracted text]

77
Note
• It is to be noted that building an FA corresponding to
the language L, discussed in the previous example,
seems to be quite difficult, but the same can be done
using a tree structure along with the technique
discussed in the book
Introduction to Languages and the Theory of
Computation, by J. C. Martin
so that the strings ending in aa, ab, ba and bb
end in the states labeled aa, ab, ba and bb,
respectively, as shown in the following FA

78
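Example continued …
[Transition diagram: states labeled Λ, a, b, aa, ab, ba and bb, with edges labeled a and b; the edge layout is not recoverable from the extracted text]
79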
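The idea of letting each state record the last letters read can be generated programmatically. The following sketch builds the seven states Λ, a, b, aa, ab, ba, bb (with "" standing for Λ) and moves to the state named by the last two letters after each input; the function names `suffix_dfa` and `accepts` are assumptions, and the code illustrates the idea rather than reproducing Martin's presentation.

```python
from itertools import product

def suffix_dfa(is_final, alphabet="ab", k=2):
    """Build an FA whose state is the last `k` letters read (fewer at the start)."""
    states = ["".join(p) for n in range(k + 1) for p in product(alphabet, repeat=n)]
    # Reading letter c in state s leads to the last k letters of s + c.
    transitions = {s: {c: (s + c)[-k:] for c in alphabet} for s in states}
    finals = {s for s in states if is_final(s)}
    return transitions, "", finals

def accepts(word, fa):
    transitions, state, finals = fa
    for letter in word:
        state = transitions[state][letter]
    return state in finals

# L = {w : length(w) >= 2 and w ends in neither aa nor bb}: final states are ab and ba.
fa = suffix_dfa(lambda s: s in {"ab", "ba"})
print(sorted(fa[0], key=lambda s: (len(s), s)))
print([w for w in ["", "a", "ab", "baa", "abba"] if accepts(w, fa)])
```

Only the choice of final states depends on the language, so the same seven states serve any language defined by a condition on the last two letters.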
Example
• Consider the language
L = {w belongs to {a,b}*: w does not end in aa}.
The language L may be expressed by the regular
expression
Λ + a + b + (a+b)*(ab+ba+bb)
This language may be accepted by the following FA

80
Example Continued …
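[Transition diagram: states labeled Λ, a, b, aa, ab, ba and bb, with edges labeled a and b; the edge layout is not recoverable from the extracted text]
81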
Task

• Using the technique discussed by Martin, build an
FA accepting the following language
L = {w belongs to {a,b}*: length(w) >= 2 and the second
letter of w from the right is a}.

82
Task
• Using the technique discussed by Martin, build an
FA accepting the following language
L = {w belongs to {a,b}*: w neither ends in ab nor
ba}.

83
Defining Languages (continued)…

• Method 5 (Transition Graph)

Definition: A transition graph (TG) is a collection of the
following
1) Finite number of states, at least one of which is a start state
and some (maybe none) of which are final states.
2) Finite set of input letters (Σ) from which input strings are
formed.
3) Finite set of transitions that show how to go from one
state to another based on reading specified substrings of
input letters, possibly even the null string Λ.
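Because a TG edge may read a whole substring, or Λ, a word is accepted if at least one path from a start state to a final state spells it out. The sketch below searches over pairs (state, position in the word); the function name `tg_accepts` and the example TG (strings beginning with aa or bb) are hypothetical and chosen only to illustrate the definition.

```python
def tg_accepts(word, edges, starts, finals):
    """Search for any path that spells `word` from a start state to a final state.

    `edges` is a list of (source, label, target); a label is a substring of
    input letters, with "" standing for Λ.
    """
    seen = set()
    stack = [(s, 0) for s in starts]
    while stack:
        state, pos = stack.pop()
        if (state, pos) in seen:
            continue  # also prevents looping forever on Λ-edges
        seen.add((state, pos))
        if pos == len(word) and state in finals:
            return True
        for source, label, target in edges:
            if source == state and word.startswith(label, pos):
                stack.append((target, pos + len(label)))
    return False

# Hypothetical TG: reads "aa" or "bb" in one step, then any letters,
# then takes a Λ-edge to the final state f.
EDGES = [("s", "aa", "m"), ("s", "bb", "m"),
         ("m", "a", "m"), ("m", "b", "m"), ("m", "", "f")]

for w in ["aab", "bb", "ab", ""]:
    print(repr(w), tg_accepts(w, EDGES, {"s"}, {"f"}))
```

Unlike an FA, a TG need not have an edge for every letter in every state; a word with no matching path is simply rejected.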

84
