Module - 2 (Compiler)

Uploaded by

ankurvatsa3

CSI2005

Principles of Compiler Design
MODULE – 1
Dr. WI. Sureshkumar
Associate Professor
School of Computer Science and Engineering (SCOPE)
VIT Vellore
SJT413A34
Regular Expression
Let Σ be a given alphabet. Then
1) ∅, ε, and any a ∈ Σ are all regular expressions. These are called
primitive regular expressions.
2) If r1 and r2 are regular expressions, then
- r1 / r2 is a regular expression.
- r1 . r2 is a regular expression.
- r1* is a regular expression.
- (r1) is a regular expression.
3) A string is a regular expression if and only if it can be derived from the
primitive regular expressions by a finite number of applications of the rules in
(2).
Regular Expression
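Check whether the given string is a regular expression:
Σ = {a, b, c}, the string (a / b . c)* . (c / ∅)
r1 = a, r2 = b, r3 = c
r4 = r2 . r3 = b . c
r5 = r1 / r4 = a / b . c
r6 = (r5) = (a / b . c)
r7 = r6* = (a / b . c)*
r8 = ∅
r9 = r3 / r8 = c / ∅
r10 = (r9) = (c / ∅)
r11 = r7 . r10 = (a / b . c)* . (c / ∅)
Each step applies one rule from (2) to primitive or previously derived
expressions, so the given string is a regular expression.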
Languages associated with regular
expressions
The language L(r) denoted by a regular expression r is defined by the
following rules:
1) ∅ is a regular expression denoting the empty set: L(∅) = {}
2) ε is a regular expression denoting the set containing only the empty string: L(ε) = {ε}
3) For any a ∈ Σ, a is a regular expression denoting the set L(a) = {a}
If r1 and r2 are regular expressions, then
4) L(r1 / r2) = L(r1) U L(r2)
5) L(r1 . r2) = L(r1) . L(r2)
6) L((r1)) = L(r1)
7) L(r1*) = (L(r1))*
Languages associated with regular
expressions
Exhibit the language L(a* . (a / b)) in set notation.
L(a* . (a / b)) = L(a*) . L((a / b))
= (L(a))* . L(a / b)
= (L(a))* . (L(a) U L(b))
= {a}* . ({a} U {b})
= {ε, a, aa, aaa, . . .} . {a, b}
= {a, aa, aaa, aaaa, . . . , b, ab, aab, aaab, . . .}
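The rules for L(r) can be read as a recursive procedure. The following is a minimal sketch in Python, assuming a tuple encoding of expressions that is hypothetical (it does not come from the slides) and a length bound max_len so that the star stays finite. Rule 6 needs no case of its own because parentheses are implicit in the nesting of tuples.

```python
def lang(r, max_len):
    """Return the strings of L(r) whose length is at most max_len."""
    kind = r[0]
    if kind == "empty":                      # rule 1: L(∅) = {}
        return set()
    if kind == "eps":                        # rule 2: L(ε) = {ε}
        return {""}
    if kind == "sym":                        # rule 3: L(a) = {a}
        return {r[1]} if len(r[1]) <= max_len else set()
    if kind == "union":                      # rule 4: L(r1) U L(r2)
        return lang(r[1], max_len) | lang(r[2], max_len)
    if kind == "cat":                        # rule 5: L(r1) . L(r2)
        left, right = lang(r[1], max_len), lang(r[2], max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if kind == "star":                       # rule 7: (L(r1))*
        base = lang(r[1], max_len)
        result, frontier = {""}, {""}
        while frontier:                      # stops once no new short string appears
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise ValueError(f"unknown node kind: {kind}")


# a* . (a / b), the expression from the worked example
expr = ("cat", ("star", ("sym", "a")), ("union", ("sym", "a"), ("sym", "b")))
print(sorted(lang(expr, 3), key=lambda s: (len(s), s)))
# ['a', 'b', 'aa', 'ab', 'aaa', 'aab']
```

The printed strings are the members of {a, aa, aaa, . . . , b, ab, aab, . . .} of length at most 3.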
Languages associated with regular
expressions
r = (a / b)* . (a / bb), find L(r)
L(r) = L((a / b)* . (a / bb))
= L((a / b)*) . L(a / bb)
= (L(a / b))* . L(a / bb)
= (L(a) U L(b))* . (L(a) U L(bb))
= {a, b}* . ({a} U {b}.{b})
= {a, b}* . ({a} U {bb})
= {a, b}* . {a, bb}
= {ε, a, b, aa, ab, ba, bb, . . .} . {a, bb}
= {a, aa, ba, aaa, aba, baa, bba, . . . , bb, abb, bbb, aabb, abbb, . . .}
Examples
Describe the following sets by regular expressions
1) L1 = the set of all strings of 0’s and 1’s ending with 00
2) L2 = the set of all strings of 0’s and 1’s beginning with 0 and ending
with 1
3) L3 = {ε, aa, aaaa, . . . }
4) The set of all strings of 0’s and 1’s with at least two consecutive zeros
5) The set of all strings of a’s and b’s whose length is divisible by 6
6) The set of all strings of a’s and b’s whose 5th last symbol is b
7) The expression r = (aa)* (bb)*b denotes the set of strings with an even
number of a’s followed by an odd number of b’s
8) L4 = {a^n b^m : n ≥ 4, m ≤ 3}
Examples
1) (0 / 1)*00
2) 0(0 / 1)*1
3) (aa)*
4) (0 / 1)*00(0 / 1)*
5) [(a / b)^6]*
6) (a / b)*b(a / b)^4
7) L(r) = {a^(2n) b^(2m+1) : n ≥ 0, m ≥ 0}
8) aaaaa*(ε / b / bb / bbb)
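These answers can be sanity-checked by brute force. The sketch below rewrites each expression in the syntax of Python's re module (| for union, {k} for a k-fold power, an empty alternative for ε) and compares it against a direct description of the set on every string up to length 8. The helper names are illustrative only, and agreement on short strings is evidence, not a proof.

```python
import re
from itertools import product


def strings(alphabet, max_len):
    """Yield every string over alphabet with length 0..max_len."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)


def agrees(pattern, predicate, alphabet, max_len=8):
    """True if the regex and the predicate accept exactly the same short strings."""
    rx = re.compile(pattern)
    return all((rx.fullmatch(s) is not None) == predicate(s)
               for s in strings(alphabet, max_len))


def is_an_bm(s):
    """a^n b^m with n >= 4 and m <= 3."""
    n = len(s) - len(s.lstrip("a"))          # number of leading a's
    rest = s[n:]
    return n >= 4 and len(rest) <= 3 and set(rest) <= {"b"}


checks = [
    (r"(0|1)*00", lambda s: s.endswith("00"), "01"),                         # 1
    (r"0(0|1)*1", lambda s: s.startswith("0") and s.endswith("1"), "01"),    # 2
    (r"(aa)*", lambda s: len(s) % 2 == 0, "a"),                              # 3
    (r"(0|1)*00(0|1)*", lambda s: "00" in s, "01"),                          # 4
    (r"((a|b){6})*", lambda s: len(s) % 6 == 0, "ab"),                       # 5
    (r"(a|b)*b(a|b){4}", lambda s: len(s) >= 5 and s[-5] == "b", "ab"),      # 6
    (r"aaaaa*(|b|bb|bbb)", is_an_bm, "ab"),                                  # 8
]
results = [agrees(pattern, pred, alphabet) for pattern, pred, alphabet in checks]
print(results)
```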
Regular expression to ε-NFA
Theorem: Let r be a regular expression. Then there exists some NFA
with ε-moves that accepts L(r). Consequently, L(r) is a regular language.
Proof: We begin with automata that accept the languages of the primitive
regular expressions ∅, ε, and any a ∈ Σ.
i) a) NFA accepting ∅: states q0 and qf with no transition between them
b) NFA accepting ε: q0 --ε--> qf
c) NFA accepting a: q0 --a--> qf
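Regular expression to ε-NFA
ii) NFA accepting L(r)
[Figure: schematic box M(r) with one initial state and one final state.]
iii) NFA accepting L(r1 / r2)
[Figure: M(r1) and M(r2) in parallel, joined by ε-moves from a common initial state and into a common final state.]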
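Regular expression to ε-NFA
iv) NFA accepting L(r1 . r2)
[Figure: M(r1) followed by M(r2), linked by ε-moves.]
v) NFA accepting L(r1*)
[Figure: M(r1) with additional ε-moves that allow it to be skipped or repeated.]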
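The constructions for union, concatenation and star can be sketched in code. The version below follows the style of Thompson's construction, which is an assumption: the exact ε-wiring drawn on the slides may differ, although the accepted language is the same. The class and function names are hypothetical, and a label of None stands for an ε-move.

```python
import itertools

_ids = itertools.count()                     # source of fresh state numbers


class NFA:
    """An ε-NFA with one start state and one final state."""
    def __init__(self):
        self.start = next(_ids)
        self.final = next(_ids)
        self.edges = []                      # (source, label, target); None = ε


def empty():                                 # accepts ∅: no transitions at all
    return NFA()


def eps():                                   # accepts ε
    m = NFA()
    m.edges.append((m.start, None, m.final))
    return m


def sym(a):                                  # accepts the single symbol a
    m = NFA()
    m.edges.append((m.start, a, m.final))
    return m


def union(m1, m2):                           # L(r1 / r2)
    m = NFA()
    m.edges = m1.edges + m2.edges + [
        (m.start, None, m1.start), (m.start, None, m2.start),
        (m1.final, None, m.final), (m2.final, None, m.final)]
    return m


def concat(m1, m2):                          # L(r1 . r2)
    m = NFA()
    m.edges = m1.edges + m2.edges + [
        (m.start, None, m1.start), (m1.final, None, m2.start),
        (m2.final, None, m.final)]
    return m


def star(m1):                                # L(r1*)
    m = NFA()
    m.edges = m1.edges + [
        (m.start, None, m1.start), (m1.final, None, m.final),
        (m.start, None, m.final),            # skip M(r1) entirely
        (m1.final, None, m1.start)]          # repeat M(r1)
    return m


def closure(m, states):
    """All states reachable from states using ε-moves only."""
    seen, stack = set(states), list(states)
    while stack:
        s = stack.pop()
        for src, label, dst in m.edges:
            if src == s and label is None and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen


def accepts(m, text):
    current = closure(m, {m.start})
    for ch in text:
        moved = {dst for src, label, dst in m.edges
                 if src in current and label == ch}
        current = closure(m, moved)
    return m.final in current


# (a / b)* a b b
abb = concat(concat(concat(star(union(sym("a"), sym("b"))), sym("a")),
                    sym("b")), sym("b"))
print(accepts(abb, "aabb"), accepts(abb, "ab"))   # True False
```

Each sub-automaton should be built fresh for every use, since reusing one NFA object in two places would make the two copies share states.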


Examples
1) The set of integers
(1+2+. . . +9)(0+1+. . . +9)*
2) The set of decimal numbers (the isolated . is the decimal point)
(1+2+. . . +9)(0+1+. . . +9)* . (0+1+. . . +9)*
3) Strings over {a, b} and length multiple of 3
[(a + b)(a + b)(a + b)]*
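As an illustration, the first two expressions can be written in the syntax of Python's re module, where [1-9] abbreviates (1+2+. . . +9) and \. is a literal decimal point. This is a sketch under that reading of the slide, with hypothetical helper names. As written, the integer expression rejects 0 and leading zeros, and the decimal expression allows an empty fractional part.

```python
import re

INTEGER = re.compile(r"[1-9][0-9]*")            # (1+...+9)(0+...+9)*
DECIMAL = re.compile(r"[1-9][0-9]*\.[0-9]*")    # integer part, point, digits


def is_integer(s):
    return INTEGER.fullmatch(s) is not None


def is_decimal(s):
    return DECIMAL.fullmatch(s) is not None


print(is_integer("120"), is_integer("012"))     # True False
print(is_decimal("3.14"), is_decimal(".5"))     # True False
```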
Examples
1) (0 + 01)*
2) (a + b)*b(a + b)*
3) (a+b)*abb
4) aa* + ab a*b*
5) (abab)* + (aa*+ b)*
6) ((00*)*1)*
7) (01)* + 1(01)* + (01)*0 + 1(01)*0
1) (0 / 01)*
[Figure: step-by-step ε-NFA construction, showing the automata for 0 and 1, then 01, then 0 / 01, and finally (0 / 01)*.]
(a/b)*
[Figure: ε-NFA for (a/b)*.]
(a / b)*b(a / b)*
[Figure: ε-NFA for (a / b)*b(a / b)*.]
Direct Method – (RE to DFA)
• Let r be the regular expression. The augmented regular expression,
denoted r#, is r followed by the end marker #.
For example, r = (a/b)*bb gives r# = (a/b)*bb#
• Construct a syntax tree for r#
• Traverse the tree to compute the following functions:
nullable() , firstpos() , lastpos() , followpos()
• Construct the DFA from these functions, in the manner of the subset construction.
Example: for (a/b)*abb, the augmented regular expression is (a/b)*abb#
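[Figure: syntax tree for (a/b)*abb#. The leaves are numbered left to right: a = 1, b = 2, a = 3, b = 4, b = 5, # = 6.]

firstpos and lastpos at each node of the tree:

Node              firstpos     lastpos
a (position 1)    {1}          {1}
b (position 2)    {2}          {2}
a / b             {1, 2}       {1, 2}
(a/b)*            {1, 2}       {1, 2}
a (position 3)    {3}          {3}
(a/b)*a           {1, 2, 3}    {3}
b (position 4)    {4}          {4}
(a/b)*ab          {1, 2, 3}    {4}
b (position 5)    {5}          {5}
(a/b)*abb         {1, 2, 3}    {5}
# (position 6)    {6}          {6}
(a/b)*abb#        {1, 2, 3}    {6}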
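The firstpos and lastpos sets for (a/b)*abb# can be reproduced with a short recursive traversal. The sketch below uses the standard textbook rules for nullable, firstpos and lastpos, which the slides apply without stating, and a hypothetical tuple encoding of the syntax tree. Leaves for ε are left out because this example has none.

```python
def analyse(node):
    """Return (nullable, firstpos, lastpos) for a syntax-tree node."""
    kind = node[0]
    if kind == "leaf":                       # ("leaf", symbol, position)
        return False, {node[2]}, {node[2]}
    if kind == "or":                         # ("or", c1, c2)
        n1, f1, l1 = analyse(node[1])
        n2, f2, l2 = analyse(node[2])
        return n1 or n2, f1 | f2, l1 | l2
    if kind == "cat":                        # ("cat", c1, c2)
        n1, f1, l1 = analyse(node[1])
        n2, f2, l2 = analyse(node[2])
        first = (f1 | f2) if n1 else f1      # c2 can start only if c1 is nullable
        last = (l1 | l2) if n2 else l2       # c1 can end only if c2 is nullable
        return n1 and n2, first, last
    if kind == "star":                       # ("star", c1)
        _, f, l = analyse(node[1])
        return True, f, l
    raise ValueError(f"unknown node kind: {kind}")


# Syntax tree for (a/b)*abb# with positions 1..6
star = ("star", ("or", ("leaf", "a", 1), ("leaf", "b", 2)))
tree = ("cat",
        ("cat",
         ("cat",
          ("cat", star, ("leaf", "a", 3)),
          ("leaf", "b", 4)),
         ("leaf", "b", 5)),
        ("leaf", "#", 6))

print(analyse(tree))                         # (False, {1, 2, 3}, {6})
```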
Computing followpos()
The function followpos(i) gives the set of positions that can follow position i
in the syntax tree. To compute followpos(i), we need two rules:
1) If n is a cat-node with left child c1 and right child c2, and i is a position in
lastpos(c1), then all positions in firstpos(c2) are in followpos(i).
2) If n is a star-node and i is a position in lastpos(n), then all positions
in firstpos(n) are in followpos(i).
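The two rules translate almost directly into code. In the sketch below, each cat-node is represented by the pair (lastpos(c1), firstpos(c2)) and each star-node by the pair (lastpos(n), firstpos(n)). This representation and the function name are assumptions made for the illustration; the sets are the ones from the (a/b)*abb# syntax tree.

```python
def followpos_table(positions, cat_nodes, star_nodes):
    """Apply the two followpos rules and return {position: set of positions}."""
    follow = {p: set() for p in positions}
    for last_c1, first_c2 in cat_nodes:      # rule 1: cat-nodes
        for i in last_c1:
            follow[i] |= first_c2
    for last_n, first_n in star_nodes:       # rule 2: star-nodes
        for i in last_n:
            follow[i] |= first_n
    return follow


# (a/b)*abb#: one pair per cat-node, from the innermost node to the root
cat_nodes = [({1, 2}, {3}), ({3}, {4}), ({4}, {5}), ({5}, {6})]
star_nodes = [({1, 2}, {1, 2})]

follow = followpos_table(range(1, 7), cat_nodes, star_nodes)
for position in sorted(follow):
    print(position, sorted(follow[position]))
# 1 [1, 2, 3]
# 2 [1, 2, 3]
# 3 [4]
# 4 [5]
# 5 [6]
# 6 []
```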
A = {1, 2, 3}    position(a) = 1, 3    position(b) = 2, 4, 5
Dtran[A, a] = followpos(1) U followpos(3)
= {1, 2, 3} U {4} = {1, 2, 3, 4} = B
Dtran[A, b] = followpos(2)
= {1, 2, 3} = A

B = {1, 2, 3, 4}
Dtran[B, a] = followpos(1) U followpos(3)
= {1, 2, 3} U {4} = {1, 2, 3, 4} = B
Dtran[B, b] = followpos(2) U followpos(4)
= {1, 2, 3} U {5} = {1, 2, 3, 5} = C

C = {1, 2, 3, 5}
Dtran[C, a] = followpos(1) U followpos(3)
= {1, 2, 3} U {4} = {1, 2, 3, 4} = B
Dtran[C, b] = followpos(2) U followpos(5)
= {1, 2, 3} U {6} = {1, 2, 3, 6} = D

D = {1, 2, 3, 6}
Dtran[D, a] = followpos(1) U followpos(3)
= {1, 2, 3} U {4} = {1, 2, 3, 4} = B
Dtran[D, b] = followpos(2)
= {1, 2, 3} = A

[Figure: transition diagram of the resulting DFA.] The transitions computed above are:

State    a    b
A        B    A
B        B    C
C        B    D
D        B    A

A = firstpos(root) is the start state. D is the accepting state because it contains position 6, the end marker #.
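The Dtran computation for (a/b)*abb# can be automated with a worklist over sets of positions. This is a minimal sketch, assuming the followpos table and the symbol at each position are already known and are supplied as plain dictionaries; build_dfa is a hypothetical name, not part of the slides.

```python
from collections import deque


def build_dfa(start, followpos, symbol_at, end_pos):
    """Return (dtran, accepting) where states are frozensets of positions."""
    start = frozenset(start)
    dtran, seen, queue = {}, {start}, deque([start])
    while queue:
        state = queue.popleft()
        # Input symbols that label some position in this state (not the end marker)
        symbols = sorted({symbol_at[p] for p in state if p != end_pos})
        for sym in symbols:
            # Union of followpos(p) over the positions p in state labelled sym
            target = frozenset().union(
                *(followpos[p] for p in state if symbol_at[p] == sym))
            dtran[state, sym] = target
            if target not in seen:
                seen.add(target)
                queue.append(target)
    accepting = {s for s in seen if end_pos in s}
    return dtran, accepting


# Data for (a/b)*abb#
followpos = {1: {1, 2, 3}, 2: {1, 2, 3}, 3: {4}, 4: {5}, 5: {6}, 6: set()}
symbol_at = {1: "a", 2: "b", 3: "a", 4: "b", 5: "b", 6: "#"}

dtran, accepting = build_dfa({1, 2, 3}, followpos, symbol_at, end_pos=6)
A, B = frozenset({1, 2, 3}), frozenset({1, 2, 3, 4})
C, D = frozenset({1, 2, 3, 5}), frozenset({1, 2, 3, 6})
print(dtran[A, "a"] == B, dtran[C, "b"] == D, accepting == {D})  # True True True
```

Using frozensets as state names keeps the correspondence with the hand computation visible: each DFA state is exactly one of the position sets A, B, C, D.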
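Example: for (a/b)*a(a/b), the augmented regular expression is (a/b)*a(a/b)#

[Figure: syntax tree for (a/b)*a(a/b)#. The leaves are numbered left to right: a = 1, b = 2, a = 3, a = 4, b = 5, # = 6.]

firstpos and lastpos at each node of the tree:

Node                    firstpos     lastpos
a (position 1)          {1}          {1}
b (position 2)          {2}          {2}
a / b (positions 1, 2)  {1, 2}       {1, 2}
(a/b)*                  {1, 2}       {1, 2}
a (position 3)          {3}          {3}
(a/b)*a                 {1, 2, 3}    {3}
a (position 4)          {4}          {4}
b (position 5)          {5}          {5}
a / b (positions 4, 5)  {4, 5}       {4, 5}
(a/b)*a(a/b)            {1, 2, 3}    {4, 5}
# (position 6)          {6}          {6}
(a/b)*a(a/b)#           {1, 2, 3}    {6}

[Figure: followpos graph over positions 1 to 6.]
followpos(1) = followpos(2) = {1, 2, 3}, followpos(3) = {4, 5}, followpos(4) = followpos(5) = {6}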
A = {1, 2, 3}    position(a) = 1, 3, 4    position(b) = 2, 5
Dtran[A, a] = followpos(1) U followpos(3)
= {1, 2, 3} U {4, 5} = {1, 2, 3, 4, 5} = B
Dtran[A, b] = followpos(2)
= {1, 2, 3} = A

B = {1, 2, 3, 4, 5}
Dtran[B, a] = followpos(1) U followpos(3) U followpos(4)
= {1, 2, 3} U {4, 5} U {6} = {1, 2, 3, 4, 5, 6} = C
Dtran[B, b] = followpos(2) U followpos(5)
= {1, 2, 3} U {6} = {1, 2, 3, 6} = D

C = {1, 2, 3, 4, 5, 6}
Dtran[C, a] = followpos(1) U followpos(3) U followpos(4)
= {1, 2, 3} U {4, 5} U {6} = {1, 2, 3, 4, 5, 6} = C
Dtran[C, b] = followpos(2) U followpos(5)
= {1, 2, 3} U {6} = {1, 2, 3, 6} = D

D = {1, 2, 3, 6}
Dtran[D, a] = followpos(1) U followpos(3)
= {1, 2, 3} U {4, 5} = {1, 2, 3, 4, 5} = B
Dtran[D, b] = followpos(2)
= {1, 2, 3} = A

[Figure: transition diagram of the resulting DFA.] The transitions computed above are:

State    a    b
A        B    A
B        C    D
C        C    D
D        B    A

A = firstpos(root) is the start state. C and D are the accepting states because they contain position 6, the end marker #.
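The DFA obtained for (a/b)*a(a/b) can be run directly from its transitions. The sketch below hard-codes the Dtran results computed above (with C and D accepting, since both contain position 6) and compares the automaton against a plain description of the language, namely that the second-to-last symbol is a, on every string up to length 8. The function names are illustrative.

```python
from itertools import product

# Transitions taken from the Dtran computation for (a/b)*a(a/b)#
DTRAN = {
    ("A", "a"): "B", ("A", "b"): "A",
    ("B", "a"): "C", ("B", "b"): "D",
    ("C", "a"): "C", ("C", "b"): "D",
    ("D", "a"): "B", ("D", "b"): "A",
}
ACCEPTING = {"C", "D"}                       # states containing position 6


def run(text, start="A"):
    """Return True if the DFA accepts text."""
    state = start
    for ch in text:
        state = DTRAN[state, ch]
    return state in ACCEPTING


def second_last_is_a(s):
    return len(s) >= 2 and s[-2] == "a"


mismatches = [
    "".join(letters)
    for n in range(9)
    for letters in product("ab", repeat=n)
    if run("".join(letters)) != second_last_is_a("".join(letters))
]
print(mismatches)                            # []
```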
Construct DFA for the following regular expressions:
1) (a/b)*a(a/b)(a/b)
2) (a/b)*abb(a/b)*
3) (a/b)*a(a/b)(a/b)(a/b)
4) (a/b)*a(a/b)*