Lect2 Lexical

The document discusses the phases of a compiler, with a focus on the lexical analysis phase. It describes how the lexical analyzer reads input characters and produces tokens by recognizing patterns specified by regular expressions. The lexical analyzer works with the parser to break down the source code into meaningful tokens that can be analyzed. Key tasks of the lexical analyzer include specifying tokens using patterns, and implementing a nexttoken() routine to recognize tokens based on the specifications.

Uploaded by

Ricky


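Review: Compiler Phases

Source program
  -> Lexical analyzer              (front end)
  -> Syntax analyzer               (front end)
  -> Semantic analyzer             (front end)
  -> Intermediate code generator   (front end)
  -> Code optimizer                (back end)
  -> Code generator                (back end)

The symbol table manager and the error handler interact with all of the phases.
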
Chapter 3: Lexical Analysis
Lexical analyzer: reads input characters and produces a
sequence of tokens as output (nexttoken()).
Its job is to identify each basic element of a program.
Token: a group of characters having a collective meaning.
const pi = 3.14159;

Token 1: (const, -)
Token 2: (identifier, pi)
Token 3: (=, -)
Token 4: (realnumber, 3.14159)
Token 5: (;, -)
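
The token list above can be reproduced with a small scanner. This is a minimal sketch, not the lecture's implementation: the table TOKEN_SPEC, the function tokenize, and the "skip" rule for whitespace are hypothetical names introduced only for this illustration, and None plays the role of the "-" attribute.

```python
import re

# Hypothetical token specification: (token name, regular expression).
# Order matters: the reserved word is tried before the identifier rule.
TOKEN_SPEC = [
    ("const",      r"const\b"),
    ("realnumber", r"[0-9]+\.[0-9]+"),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("=",          r"="),
    (";",          r";"),
    ("skip",       r"\s+"),
]

def tokenize(text):
    """Return a list of (token, attribute) pairs; attribute is None when unused."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in TOKEN_SPEC:
            match = re.compile(pattern).match(text, pos)
            if match:
                break
        else:
            raise SyntaxError(f"unexpected character {text[pos]!r} at {pos}")
        pos = match.end()
        if name == "skip":
            continue
        # Only identifiers and numbers carry their lexeme as an attribute.
        attribute = match.group() if name in ("identifier", "realnumber") else None
        tokens.append((name, attribute))
    return tokens

print(tokenize("const pi = 3.14159;"))
```

Whitespace is matched and then discarded, so it never reaches the token list.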
Interaction of the lexical analyzer with the parser

Source program -> Lexical analyzer -- token --> Parser
                  Lexical analyzer <-- nexttoken() -- Parser

Both the lexical analyzer and the parser use the symbol table.
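
One way to picture this interaction is a parser that asks for one token at a time. The sketch below is an assumption-laden illustration: the class Lexer, the toy routine parse_const_declaration, the "eof" token, and the dictionary used as a symbol table are all hypothetical; only the name nexttoken() comes from the slides.

```python
import re

class Lexer:
    """Hands out one token per nexttoken() call and records identifiers."""
    PATTERN = re.compile(
        r"\s*(?:(const\b)|([A-Za-z][A-Za-z0-9]*)|([0-9]+\.[0-9]+)|([=;]))"
    )

    def __init__(self, text, symbol_table):
        self.text = text
        self.pos = 0
        self.symbol_table = symbol_table

    def nexttoken(self):
        if not self.text[self.pos:].strip():
            return ("eof", None)
        match = self.PATTERN.match(self.text, self.pos)
        if match is None:
            raise SyntaxError(f"bad input at position {self.pos}")
        self.pos = match.end()
        keyword, ident, real, symbol = match.groups()
        if keyword:
            return ("const", None)
        if ident:
            self.symbol_table.setdefault(ident, {})  # enter the name once
            return ("identifier", ident)
        if real:
            return ("realnumber", real)
        return (symbol, None)

def parse_const_declaration(lexer):
    """Toy parser: accepts  const identifier = realnumber ;"""
    expected = ["const", "identifier", "=", "realnumber", ";"]
    attributes = []
    for kind in expected:
        token, attribute = lexer.nexttoken()  # the parser drives the lexer
        if token != kind:
            raise SyntaxError(f"expected {kind}, got {token}")
        attributes.append(attribute)
    name, value = attributes[1], attributes[3]
    lexer.symbol_table[name]["value"] = float(value)
    return name

symbols = {}
lexer = Lexer("const pi = 3.14159;", symbols)
declared = parse_const_declaration(lexer)
print(declared, symbols)
```

The lexer never runs ahead on its own here: a token is produced only when the parser requests it.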
Some terminology:
Token: a group of characters having a collective
meaning. A lexeme is a particular instance of a token.
E.g. token: identifier, lexeme: pi.
Pattern: the rule describing how a token can be formed.
E.g. identifier: ([a-z]|[A-Z]) ([a-z]|[A-Z]|[0-9])*
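
To make the three terms concrete, the identifier pattern can be tried against a few candidate lexemes. A small sketch using Python's re module; the helper name is_identifier and the sample strings are chosen only for illustration.

```python
import re

# The identifier pattern from the text, written in Python regex syntax.
IDENTIFIER = re.compile(r"([a-z]|[A-Z])([a-z]|[A-Z]|[0-9])*")

def is_identifier(lexeme):
    """True when the whole lexeme matches the identifier pattern."""
    return IDENTIFIER.fullmatch(lexeme) is not None

for lexeme in ["pi", "x1", "Count", "2fast", "a_b", ""]:
    print(repr(lexeme), is_identifier(lexeme))
```

Here "pi", "x1" and "Count" are lexemes of the token identifier, while "2fast", "a_b" and the empty string are rejected because the pattern requires a leading letter and allows only letters and digits afterwards.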

Lexical analysis does not have to be a separate phase,
but making it a separate phase simplifies the
design and improves efficiency and
portability.
Two issues in lexical analysis:
How to specify tokens (patterns)?
How to recognize tokens given a token specification (how to
implement the nexttoken() routine)?

How to specify tokens:

All the basic elements in a language must be
tokens so that they can be recognized.
main() {
    int i, j;
    for (i = 0; i < 50; i++) {
        printf("i = %d", i);
    }
}
Token types: constant, identifier, reserved word, operator, and
miscellaneous symbol.
Tokens are specified by regular expressions.
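
As a rough illustration of the five token types, the sketch below classifies the lexemes of one line adapted from the sample program. The patterns in SPEC, the set RESERVED, and the function classify are assumptions made for this example and cover only what the sample line needs.

```python
import re

RESERVED = {"int", "for"}  # assumption: only the reserved words this sample uses

# Hypothetical patterns, tried in order, one per token type named in the text.
SPEC = [
    ("constant",   r'[0-9]+|"[^"\n]*"'),
    ("identifier", r"[A-Za-z][A-Za-z0-9]*"),
    ("operator",   r"\+\+|[=<]"),
    ("misc",       r"[(){};,]"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in SPEC))

def classify(source):
    """Return (token type, lexeme) pairs; whitespace is skipped."""
    result = []
    pos = 0
    while pos < len(source):
        if source[pos].isspace():
            pos += 1
            continue
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r}")
        kind, lexeme = match.lastgroup, match.group()
        # Reserved words look like identifiers, so they are separated by lookup.
        if kind == "identifier" and lexeme in RESERVED:
            kind = "reserved word"
        result.append((kind, lexeme))
        pos = match.end()
    return result

SAMPLE = 'for (i = 0; i < 50; i++) { printf("i = %d", i); }'
for kind, lexeme in classify(SAMPLE):
    print(f"{kind:14} {lexeme}")
```

Looking up reserved words after matching the identifier pattern is one common design choice; listing each reserved word as its own pattern would work as well.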
Some definitions
Alphabet: a finite set of symbols. E.g. {a, b, c}
A string over an alphabet is a finite sequence of symbols drawn
from that alphabet (sometimes a string is also called a sentence or a
word).
A language is a set of strings over an alphabet.
Operations on languages (sets of strings):
Union of L and M: L U M = {s | s is in L or s is in M}
Concatenation of L and M:
LM = {st | s is in L and t is in M}

Kleene closure of L:
$L^* = \bigcup_{i \ge 0} L^i$
Positive closure of L:
$L^+ = \bigcup_{i \ge 1} L^i$

Example:
L = {aa, bb, cc}, M = {abc}
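
The operations can be tried out on this example with Python sets. Because L* and L+ are infinite, the sketch only builds a finite approximation up to a chosen power; the function names and the max_power cutoff are assumptions of this illustration.

```python
from itertools import product

def union(L, M):
    return L | M

def concat(L, M):
    return {s + t for s, t in product(L, M)}

def power(L, i):
    """L^i: L concatenated with itself i times; L^0 is the set {""}."""
    result = {""}
    for _ in range(i):
        result = concat(result, L)
    return result

def closure(L, max_power, positive=False):
    """Finite approximation of L* (or L+ when positive=True) up to L^max_power."""
    start = 1 if positive else 0
    result = set()
    for i in range(start, max_power + 1):
        result |= power(L, i)
    return result

L = {"aa", "bb", "cc"}
M = {"abc"}
print(sorted(union(L, M)))     # ['aa', 'abc', 'bb', 'cc']
print(sorted(concat(L, M)))    # ['aaabc', 'bbabc', 'ccabc']
print(sorted(closure(M, 2)))   # ['', 'abc', 'abcabc']
print("" in closure(L, 2), "" in closure(L, 2, positive=True))  # True False
```

The last line shows the only difference between the two closures for this L: the empty string belongs to L* but not to L+.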

Formal definition of regular expressions:
Given an alphabet Σ,
(1) ε is a regular expression that denotes {ε}, the
set that contains the empty string.
(2) For each a ∈ Σ, a is a regular expression that
denotes {a}, the set containing the string a.
(3) If r and s are regular expressions denoting the
languages (sets) L(r) and L(s), then
(r) | (s) is a regular expression denoting L(r) U L(s)
(r)(s) is a regular expression denoting L(r)L(s)
(r)* is a regular expression denoting (L(r))*
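
The inductive definition maps directly onto a recursive function: one case per rule. The sketch below is only an illustration; the class names (Epsilon, Symbol, Union, Concat, Star) and the max_len cutoff, which keeps the result finite, are assumptions and not part of the definition.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Epsilon:
    pass

@dataclass(frozen=True)
class Symbol:
    a: str

@dataclass(frozen=True)
class Union:
    r: object
    s: object

@dataclass(frozen=True)
class Concat:
    r: object
    s: object

@dataclass(frozen=True)
class Star:
    r: object

def language(expr, max_len):
    """Strings of length <= max_len in L(expr), following rules (1) to (3)."""
    if isinstance(expr, Epsilon):                      # rule (1)
        return {""}
    if isinstance(expr, Symbol):                       # rule (2)
        return {expr.a} if len(expr.a) <= max_len else set()
    if isinstance(expr, Union):                        # rule (3), (r) | (s)
        return language(expr.r, max_len) | language(expr.s, max_len)
    if isinstance(expr, Concat):                       # rule (3), (r)(s)
        left, right = language(expr.r, max_len), language(expr.s, max_len)
        return {x + y for x in left for y in right if len(x + y) <= max_len}
    if isinstance(expr, Star):                         # rule (3), (r)*
        base = language(expr.r, max_len) - {""}
        result, frontier = {""}, {""}
        while frontier:
            frontier = {x + y for x in frontier for y in base
                        if len(x + y) <= max_len} - result
            result |= frontier
        return result
    raise TypeError(f"not a regular expression: {expr!r}")

a, b = Symbol("a"), Symbol("b")
print(sorted(language(Union(a, b), 2)))            # ['a', 'b']
print(sorted(language(Concat(a, Star(b)), 3)))     # ['a', 'ab', 'abb']
```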

A regular expression is defined together with the
language it denotes.
Examples:
Let Σ = {a, b}
a|b
(a | b) (a | b)
a*
(a | b)*
a | a*b
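
To see which languages these expressions denote, one can list the short strings over {a, b} that each one matches. A sketch using Python's re module, whose syntax happens to agree with the notation used here for these five expressions; the helper names and the length limit of 3 are arbitrary choices for the illustration.

```python
import re
from itertools import product

def strings_up_to(alphabet, max_len):
    """All strings over the alphabet with length 0..max_len, shortest first."""
    for n in range(max_len + 1):
        for letters in product(alphabet, repeat=n):
            yield "".join(letters)

def denoted(regex, alphabet="ab", max_len=3):
    """Members of L(regex) among the strings of length <= max_len."""
    compiled = re.compile(regex)
    return [s for s in strings_up_to(alphabet, max_len) if compiled.fullmatch(s)]

for regex in ["a|b", "(a|b)(a|b)", "a*", "(a|b)*", "a|a*b"]:
    print(f"{regex:12} {denoted(regex)}")
```

For example, a|a*b yields a, b, ab and aab within the length limit: either the single string a, or any number of a's followed by one b.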

We assume that * has the highest precedence and is
left associative. Concatenation has the second highest
precedence and is left associative, and | has the lowest
precedence and is left associative.
(a) | ((b)*(c)) = a | b*c
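
The effect of these precedence rules can be checked by comparing the languages of the fully parenthesized and the abbreviated expression. The sketch relies on Python's re module following the same precedence conventions; the helper language and the length limit of 3 are assumptions of the example.

```python
import re
from itertools import product

def language(regex, alphabet="abc", max_len=3):
    """Strings over the alphabet, up to max_len long, that the regex matches."""
    compiled = re.compile(regex)
    words = ("".join(p) for n in range(max_len + 1)
             for p in product(alphabet, repeat=n))
    return {w for w in words if compiled.fullmatch(w)}

implicit = language("a|b*c")           # relies on precedence
explicit = language("(a)|((b)*(c))")   # fully parenthesized
other = language("(a|b)*c")            # a different grouping

print(implicit == explicit)   # True
print(sorted(implicit))       # ['a', 'bbc', 'bc', 'c']
print(implicit == other)      # False
```

Grouping the expression as (a|b)*c gives a different language (it contains ac but not a), which is why the conventions matter.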
Regular definitions
A regular definition gives names to regular expressions so that the names
can be used to construct more complicated regular expressions.
d1 -> r1
d2 -> r2
...
dn -> rn
Example:
letter -> A | B | C | ... | Z | a | b | ... | z
digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9
identifier -> letter (letter | digit)*
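
A regular definition can be mimicked by textual substitution: each name is replaced by the expression it stands for before the final pattern is used. In this sketch the {name} reference syntax (loosely modeled on Lex-style tools), the list DEFINITIONS, and the function expand are assumptions made for the illustration.

```python
import re

# Each definition may refer only to names defined before it,
# as in d1 -> r1, ..., dn -> rn.
DEFINITIONS = [
    ("letter",     "[A-Za-z]"),
    ("digit",      "[0-9]"),
    ("identifier", "{letter}({letter}|{digit})*"),
]

def expand(definitions):
    """Replace each {name} by the already expanded expression it stands for."""
    expanded = {}
    for name, body in definitions:
        for known, regex in expanded.items():
            body = body.replace("{" + known + "}", "(" + regex + ")")
        expanded[name] = body
    return expanded

REGEX = expand(DEFINITIONS)
print(REGEX["identifier"])   # ([A-Za-z])(([A-Za-z])|([0-9]))*
print(bool(re.fullmatch(REGEX["identifier"], "count2")))   # True
print(bool(re.fullmatch(REGEX["identifier"], "2count")))   # False
```

The substituted expression is wrapped in parentheses so that precedence inside a definition cannot leak into the expression that uses it.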

More examples: integer constants, string constants, reserved
words, operators, real constants.
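
The slides do not spell these definitions out, so the patterns below are purely illustrative guesses at what such definitions might look like; a real language fixes its own rules for each token class. The dictionary PATTERNS and the helper kinds are hypothetical names.

```python
import re

# Illustrative definitions only (assumptions, not taken from the lecture).
PATTERNS = {
    "integer constant": r"[0-9]+",
    "real constant":    r"[0-9]+\.[0-9]+",
    "string constant":  r'"[^"\n]*"',
    "reserved word":    r"const|int|for",
    "operator":         r"\+\+|[=<+]",
}

def kinds(lexeme):
    """Names of all patterns that match the whole lexeme."""
    return [name for name, regex in PATTERNS.items()
            if re.fullmatch(regex, lexeme)]

for lexeme in ["50", "3.14159", '"i = %d"', "for", "++", "pi"]:
    print(f"{lexeme:10} {kinds(lexeme)}")
```

The lexeme pi matches none of these patterns; it would be covered by the identifier definition given earlier.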
