Compiler Design CS - 2
BITS Pilani
Pilani | Dubai | Goa | Hyderabad
Contact Session - 2
Introduction to Lexical Analyzer
Lexical Analyzer (a.k.a. Scanner)
• The only part of a compiler that looks at every character of
the source text, performing a linear (left-to-right) scan
• Reads source text and produces TOKENS
• Also keeps track of the source coordinates of each token:
file name, line number and position
– (This is useful for debugging and error reporting.)
• Advantages of a separate Lexical Analyzer:
– Keeps compiler design simple
– Improves efficiency
– Increases portability
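The idea of producing tokens together with their source coordinates can be illustrated with a minimal Python sketch. The token classes (`ID`, `NUM`, `OP`), the `Token` record and the file name `demo.src` are assumptions made for this illustration only; they are not taken from the slides.

```python
import re
from typing import Iterator, NamedTuple


class Token(NamedTuple):
    kind: str
    text: str
    filename: str
    line: int    # 1-based line number
    column: int  # 1-based position within the line


# Hypothetical token classes, chosen only for this illustration.
TOKEN_RE = re.compile(
    r"(?P<ID>[A-Za-z_]\w*)|(?P<NUM>\d+)|(?P<NEWLINE>\n)|(?P<SKIP>[ \t]+)|(?P<OP>.)"
)


def scan(source: str, filename: str = "<input>") -> Iterator[Token]:
    """Scan the text once, left to right, yielding tokens with coordinates."""
    line, line_start = 1, 0
    for m in TOKEN_RE.finditer(source):
        kind = m.lastgroup
        if kind == "NEWLINE":
            # A newline starts a new line; remember where that line begins.
            line += 1
            line_start = m.end()
        elif kind != "SKIP":
            yield Token(kind, m.group(), filename, line, m.start() - line_start + 1)


tokens = list(scan("x = 10\ny = x + 2\n", "demo.src"))
for tok in tokens:
    print(tok)
```

Each token carries the file name, line and column where it starts, which is the information a later phase would need to report an error at the right place.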
BITS Pilani, Deemed to be University under Section 3 of UGC Act, 1956
The Role of a Lexical Analyzer
Lexical analyzer
Solution
1) (b | bab)*
2) ((a | b)(a | b))* or (aa | ab | ba | bb)*
3) (0 | 1)(0 | 1) 0 (0 | 1)*
4) b* (ab*ab*)* a b* or b* a b* (ab*ab*)*
5) (1 | 01)* 00 1*
6) a (a | b)* a | b (a | b)* b
7) (1 | 01)* (0 | ϵ) (0 | 10)* (1 | ϵ)
8) (ab | ba | aa | bb)* (a | b)
Regular Expression Construction
• Problem: Specify the set of unsigned numbers as a
regular expression. (Examples: 1997, 19.97)
• Observations on numbers:
1. A number can be made up of one or more digits from the set
(0 – 9)
2. It can optionally have a decimal point after those digits,
followed by zero or more digits: "." (0 – 9)*
3. A number can also start with a decimal point followed by one or
more digits
[ (0 – 9)+ [ "." (0 – 9)* ]? ] | [ "." (0 – 9)+ ]
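The expression above can be checked by transcribing it into Python's `re` syntax. This is only a sketch: the names `UNSIGNED` and `is_unsigned_number` are made up for the illustration, and `fullmatch` is used so that the whole string must be a number.

```python
import re

# (0-9)+ [ "." (0-9)* ]?   |   "." (0-9)+
UNSIGNED = re.compile(r"(?:[0-9]+(?:\.[0-9]*)?|\.[0-9]+)")


def is_unsigned_number(text: str) -> bool:
    """Return True when the entire string matches the expression."""
    return UNSIGNED.fullmatch(text) is not None


for sample in ["1997", "19.97", "19.", ".97", ".", "1.2.3", "-5"]:
    print(repr(sample), is_unsigned_number(sample))
```

The three observations correspond to the three parts of the pattern: the digit run, the optional fractional tail, and the alternative that begins with the point. A lone "." is rejected because each alternative requires at least one digit.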
Regular Expressions for Some Tokens of a Programming Language
Regular Expression                                  Token Type
if                                                  return IF
[ a – z ] [ a – z 0 – 9 ]*                          return ID
( [ 0 – 9 ]+ "." [ 0 – 9 ]* ) | ( "." [ 0 – 9 ]+ )  return REAL
.                                                   return ERROR
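One way to see how these rules interact is a small Python sketch of a rule-driven scanner. It assumes the convention used by Lex-style tools, which the slide does not state explicitly: the longest match wins, and on a tie the rule listed first wins. Skipping whitespace between tokens is a further assumption, since the table has no rule for it.

```python
import re

# Rules in the order given in the table; order matters for tie-breaking.
RULES = [
    ("IF", re.compile(r"if")),
    ("ID", re.compile(r"[a-z][a-z0-9]*")),
    ("REAL", re.compile(r"[0-9]+\.[0-9]*|\.[0-9]+")),
    ("ERROR", re.compile(r".", re.DOTALL)),  # any single character
]


def tokenize(text):
    """Return (token type, lexeme) pairs using longest match, first rule on ties."""
    tokens = []
    pos = 0
    while pos < len(text):
        if text[pos].isspace():  # assumption: whitespace only separates tokens
            pos += 1
            continue
        best_kind, best_len = None, 0
        for kind, pattern in RULES:
            m = pattern.match(text, pos)
            # Strictly longer matches replace the current best, so an
            # earlier rule keeps priority when lengths are equal.
            if m and len(m.group()) > best_len:
                best_kind, best_len = kind, len(m.group())
        tokens.append((best_kind, text[pos:pos + best_len]))
        pos += best_len
    return tokens


print(tokenize("if iffy 3.14 .5 ?"))
```

With these conventions `if` is reported as IF because it ties with ID and comes first, while `iffy` is an ID because the ID rule matches more characters. The ERROR rule acts as a catch-all that always consumes one character, so the scanner never gets stuck.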
• Regular definition
– gives names to regular expressions so that more complicated
regular expressions can be built from them.
d1 -> r1
d2 -> r2
…
dn -> rn
– example:
letter -> A | B | C | … | Z | a | b | … | z
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
identifier -> letter (letter | digit) *
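The same naming idea can be mimicked in Python by building a pattern out of named string fragments. This is a sketch only; the constant and function names are chosen for the illustration.

```python
import re

# Each name stands for a regular expression, as in d_i -> r_i.
LETTER = r"[A-Za-z]"
DIGIT = r"[0-9]"
# identifier -> letter (letter | digit)*
IDENTIFIER = rf"{LETTER}(?:{LETTER}|{DIGIT})*"

identifier_re = re.compile(IDENTIFIER)


def is_identifier(text: str) -> bool:
    """Return True when the entire string is an identifier."""
    return identifier_re.fullmatch(text) is not None


for sample in ["count", "x1", "1x", "a_b"]:
    print(repr(sample), is_identifier(sample))
```

Later definitions refer only to names defined earlier, which keeps the composed expression an ordinary regular expression.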
Example
More than One Regular Expression for a Language
RegEx - idioms
Analyzing a Simple Regular Expression
Another Simple Regular Expression
Given a Language, Find a Regular Expression
Regular Expressions
• https://www.javatpoint.com/examples-of-regular-expression
• https://www.w3schools.com/python/python_regex.asp
Thank you