SSC Module 2: Lexical Analysis
Outline
Role of lexical analyzer
Input buffering
Specification of tokens
Recognition of tokens
The role of lexical analyzer
Figure: the source program is read by the Lexical Analyzer; each time the Parser calls getNextToken, the analyzer returns the next token, which the parser passes on to semantic analysis; both components consult the Symbol table.
Why separate lexical analysis from parsing?
1. Simplicity of design
2. Improving compiler efficiency
3. Enhancing compiler portability
Tokens, Patterns and Lexemes
A token is a pair consisting of a token name and an optional attribute value
A pattern is a description of the form that the lexemes
of a token may take
A lexeme is a sequence of characters in the source
program that matches the pattern for a token
Example
Token        Informal description                  Sample lexemes
if           characters i, f                       if
else         characters e, l, s, e                 else
comparison   < or > or <= or >= or == or !=        <=, !=
if(a>=b)
    printf("total=%d\n", a);
Lexemes: if  (  a  >=  b  )  printf  (  "total=%d\n"  ,  a  )  ;
Tokens : IF  LP  id  relop  id  RP  id  LP  literal  COMMA  id  RP  SEMI
Attributes for tokens
E = M * C ** 2
<id, pointer to symbol table entry for E>
<assign-op>
<id, pointer to symbol table entry for M>
<mult-op>
<id, pointer to symbol table entry for C>
<exp-op>
<number, integer value 2>
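One way such pairs can be represented in an implementation is sketched below; the struct, the union and the names TokenName, SymtabEntry and Token are illustrative assumptions, not part of these slides.

/* Sketch: a token as a (name, optional attribute) pair. */
typedef enum { ID, ASSIGN_OP, MULT_OP, EXP_OP, NUMBER } TokenName;

typedef struct SymtabEntry SymtabEntry;   /* opaque symbol-table entry */

typedef struct {
    TokenName name;               /* token name, e.g. ID or NUMBER             */
    union {
        SymtabEntry *entry;       /* for id: pointer to its symbol-table entry */
        int          value;       /* for number: the integer value, here 2     */
    } attr;                       /* optional attribute value                  */
} Token;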
Lexical errors
Some errors are beyond the power of the lexical analyzer to recognize:
fi (a == f(x)) …
However, it may be able to recognize errors like:
d = 2r
Such errors are recognized when no pattern for tokens
matches a character sequence
Error recovery
Panic mode: successive characters are ignored until we reach a well-formed token (a minimal sketch follows this list)
Other possible recovery actions:
Delete one character from the remaining input
Insert a missing character into the remaining input
Replace a character by another character
Transpose two adjacent characters
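A minimal sketch of panic-mode recovery, assuming a forward pointer into the input and a hypothetical is_token_start() test: characters are discarded one by one until something that could begin a well-formed token is seen.

#include <ctype.h>

/* Hypothetical test: can this character begin a token? */
static int is_token_start(char c)
{
    return isalnum((unsigned char)c) || c == '_' ||
           c == '(' || c == ')' || c == '<' || c == '=' || c == '>';
}

/* Panic mode: skip characters until a possible token start (or end of input). */
static const char *panic_recover(const char *forward)
{
    while (*forward != '\0' && !is_token_start(*forward))
        forward++;
    return forward;               /* scanning resumes here */
}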
Input buffering
Sometimes the lexical analyzer needs to look ahead several characters beyond the next lexeme before it can decide which token to return
In C: after seeing -, = or <, we need to look at the next character to decide which token to return
In Fortran: DO 5 I = 1.25 (blanks are insignificant, so this is the assignment DO5I = 1.25, not the start of a DO loop)
We need to introduce a two-buffer scheme to handle large lookaheads safely; a sketch of the buffer pair follows the figure below
E = M * C * * 2 eof
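A sketch of the buffer pair itself; N and the pointer names lexemeBegin and forward follow the conventional scheme (only forward also appears in the code on the next slide).

#define N 4096                    /* each buffer half holds N input characters    */

static char buffer[2 * N];        /* the buffer pair, reloaded one half at a time */
static char *lexemeBegin;         /* marks the start of the current lexeme        */
static char *forward;             /* scans ahead until a token pattern is matched */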
Sentinels
E = M eof * C * * 2 eof eof
switch (*forward++) {
case eof:
    if (forward is at end of first buffer) {
        reload second buffer;
        forward = beginning of second buffer;
    }
    else if (forward is at end of second buffer) {
        reload first buffer;
        forward = beginning of first buffer;
    }
    else /* eof within a buffer marks the end of input */
        terminate lexical analysis;
    break;
cases for the other characters;
}
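A hedged sketch of what "reload" could look like with sentinels, assuming an input stream src and '\0' standing in for eof: each half now has one extra slot, receives up to N characters, and is terminated with the sentinel (which also marks the true end of input when fewer than N characters remain).

#include <stdio.h>

#define N 4096
#define EOF_SENTINEL '\0'          /* sentinel character standing in for eof */

static char  buffer[2 * (N + 1)];  /* two halves, each ending in a sentinel  */
static FILE *src;                  /* assumed input stream                   */

/* Fill one half of the buffer and terminate it with the sentinel. */
static char *reload(char *half)
{
    size_t n = fread(half, 1, N, src);
    half[n] = EOF_SENTINEL;        /* sentinel; real end of input if n < N   */
    return half;
}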
Specification of tokens
Strings and Languages
Operations on Languages
Regular Expressions
In the theory of compilation, regular expressions are used to formalize the specification of tokens
Regular expressions are a means of specifying regular languages
Example:
letter_(letter_ | digit)*
Each regular expression is a pattern specifying the
form of strings
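As an aside, the identifier pattern above can be tried out with the POSIX regex API; the sketch below is not from the slides and simply checks a few candidate lexemes against letter_(letter_ | digit)* written in POSIX syntax.

#include <regex.h>
#include <stdio.h>

int main(void)
{
    regex_t re;
    /* letter_(letter_ | digit)* in POSIX extended syntax */
    const char *pattern = "^[A-Za-z_][A-Za-z_0-9]*$";
    const char *samples[] = { "count_1", "_tmp", "2bad" };

    if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB) != 0)
        return 1;

    for (int i = 0; i < 3; i++)
        printf("%-8s %s\n", samples[i],
               regexec(&re, samples[i], 0, NULL, 0) == 0 ? "identifier" : "no match");

    regfree(&re);
    return 0;
}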
Regular Expressions
Ɛ is a regular expression, L(Ɛ) = {Ɛ}
If a is a symbol in ∑, then a is a regular expression, L(a) = {a}
(r) | (s) is a regular expression denoting the language
L(r) ∪ L(s)
(r)(s) is a regular expression denoting the language
L(r)L(s)
(r)* is a regular expression denoting (L(r))*
(r) is a regular expression denoting L(r)
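For example, over the alphabet ∑ = {a, b}: a|b denotes {a, b}; (a|b)(a|b) denotes {aa, ab, ba, bb}; a* denotes {Ɛ, a, aa, aaa, …}; and (a|b)* denotes the set of all strings of a's and b's, including Ɛ.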
Algebraic laws for regular expressions
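Commonly listed laws: | is commutative and associative (r|s = s|r, r|(s|t) = (r|s)|t); concatenation is associative (r(st) = (rs)t) and distributes over | (r(s|t) = rs|rt, (s|t)r = sr|tr); Ɛ is the identity for concatenation (Ɛr = rƐ = r); Ɛ is guaranteed in a closure (r* = (r|Ɛ)*); and * is idempotent (r** = r*).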
Regular Definitions
d1 -> r1
d2 -> r2
…
dn -> rn
Example:
C Identifiers
letter_ -> A | B | … | Z | a | b | … | z | _
digit -> 0 | 1 | … | 9
id -> letter_ (letter_ | digit)*
Example:
Unsigned numbers (integer or floating point)
Strings such as 5280, 0.01234, 6.336E4, or 1.89E-4
Extensions
One or more instances: (r)+
Zero or one instances: r?
Character classes: [abc]
Example: C Identifiers
letter_ -> [A-Za-z_]
digit -> [0-9]
id -> letter_(letter_ | digit)*
Unsigned numbers
digit -> [0-9]
digits -> digit+
number -> digits (. digits)? ( E [+-]? digits )?
Recognition of tokens
The starting point is the grammar of the language, from which we identify the tokens:
stmt -> if expr then stmt
| if expr then stmt else stmt
|Ɛ
expr -> term relop term
| term
term -> id
| number
Recognition of tokens (cont.)
The next step is to formalize the patterns:
digit -> [0-9]
digits -> digit+
number -> digits (. digits)? (E [+-]? digits)?
letter -> [A-Za-z_]
id -> letter (letter | digit)*
if -> if
then -> then
else -> else
relop -> < | > | <= | >= | = | <>
We also need to handle whitespaces:
ws -> (blank | tab | newline)+
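The transition-diagram code at the end of this module returns tokens whose names and attributes correspond to these patterns; one possible set of constants is sketched below (the names are illustrative, except that RELOP and the attribute GT also appear in getRelop() later).

/* Possible token-name codes for the patterns above (illustrative). */
enum TokenName { IF, THEN, ELSE, RELOP, ID, NUMBER, WS };

/* Attribute values a RELOP token might carry (GT is used by getRelop() later). */
enum RelopAttr { LT, LE, EQ, NE, GT, GE };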
Transition diagrams
Transition diagram for relop
Transition diagrams (cont.)
Transition diagram for reserved words and identifiers
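One common way to realize this diagram is to let a single diagram accept every letter-starting lexeme and then consult a table preloaded with the reserved words to decide between keyword and identifier. The sketch below assumes the token codes and keyword tables shown; installing an identifier in the symbol table is only noted in a comment.

#include <string.h>
#include <stddef.h>

enum { TOK_IF, TOK_THEN, TOK_ELSE, TOK_ID };          /* illustrative token codes */

static const char *keywords[] = { "if", "then", "else" };
static const int   kwcode[]   = { TOK_IF, TOK_THEN, TOK_ELSE };

/* After the id/keyword diagram accepts a lexeme, decide what it is. */
static int id_or_keyword(const char *lexeme)
{
    for (size_t i = 0; i < sizeof keywords / sizeof keywords[0]; i++)
        if (strcmp(lexeme, keywords[i]) == 0)
            return kwcode[i];                          /* reserved word           */
    /* not reserved: install lexeme in the symbol table (omitted) */
    return TOK_ID;                                     /* ordinary identifier     */
}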
Transition diagrams (cont.)
Transition diagram for unsigned numbers
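The same diagram can be written directly as code. The sketch below (match_number is an assumed name, not from the slides) follows digits (. digits)? (E [+-]? digits)? and retracts when an optional part fails to complete.

#include <ctype.h>

/* Returns the number of characters of s matched as an unsigned number, 0 on failure. */
static int match_number(const char *s)
{
    const char *p = s;

    if (!isdigit((unsigned char)*p)) return 0;            /* digits            */
    while (isdigit((unsigned char)*p)) p++;

    if (*p == '.') {                                      /* optional fraction */
        p++;
        if (!isdigit((unsigned char)*p)) return (int)(p - 1 - s);   /* retract '.'      */
        while (isdigit((unsigned char)*p)) p++;
    }
    if (*p == 'E') {                                      /* optional exponent */
        const char *mark = p++;
        if (*p == '+' || *p == '-') p++;
        if (!isdigit((unsigned char)*p)) return (int)(mark - s);    /* retract "E[+-]"  */
        while (isdigit((unsigned char)*p)) p++;
    }
    return (int)(p - s);
}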
Transition diagrams (cont.)
Transition diagram for whitespace
Architecture of a transition-diagram-based lexical analyzer
TOKEN getRelop()
{
    TOKEN retToken = new(RELOP);
    while (1) { /* repeat character processing until a
                   return or failure occurs */
        switch (state) {
        case 0: c = nextchar();
            if (c == '<') state = 1;
            else if (c == '=') state = 5;
            else if (c == '>') state = 6;
            else fail(); /* lexeme is not a relop */
            break;
        case 1: …
        …
        case 8: retract();
            retToken.attribute = GT;
            return (retToken);
        }
    }
}
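fail() is only named here. A hedged sketch of one conventional implementation: reset forward to the start of the lexeme and move to the start state of the next transition diagram, reporting a lexical error once every diagram has been tried (the start-state numbers below are illustrative; state, forward and lexemeBegin are the scanner's globals assumed by the code above).

extern int   state;
extern char *forward, *lexemeBegin;

static int start_states[] = { 0, 9, 12, 22 };   /* relop, id, number, ws (illustrative)    */
static int diagram = 0;                         /* which diagram is currently being tried  */

void fail(void)
{
    forward = lexemeBegin;                      /* give back any lookahead                 */
    if (diagram + 1 < (int)(sizeof start_states / sizeof start_states[0]))
        state = start_states[++diagram];        /* retry with the next diagram             */
    else
        state = -1;                             /* no pattern matches: lexical error       */
}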