
Compiler Design Part 2

This document discusses compiler design and lexical analysis. It covers the following key points: - Lexical analysis is the first phase of a compiler that identifies lexemes in source code and produces tokens. It is implemented using patterns and finite automata. - The main tasks of a lexical analyzer are to read source characters, group them into lexemes, and produce tokens that are passed to the parser. - Regular expressions are used to define patterns for tokens. Finite automata are constructed to recognize the regular languages defined by patterns. - Attributes may be associated with tokens to provide additional information to later compiler phases. Lexical errors are also handled. - Methods for minimizing the number of states in a DFA are also described.

Uploaded by KDCreatives
Copyright © All Rights Reserved

COMPILER DESIGN

INTRODUCTION
Finite automata and lexical Analysis:
⚫A lexical analyzer can be built automatically by specifying the
lexeme patterns to a lexical-analyzer generator and
compiling those patterns into code that functions as a
lexical analyzer.
⚫This also speeds up the process of implementing the
lexical analyzer, since the programmer specifies the
software at the very high level of patterns and relies on
the generator to produce the detailed code.
⚫A widely used lexical-analyzer generator is called Lex.
The Role of the Lexical Analyzer:
⚫As the first phase of a compiler, the main task of the lexical
analyzer is to read the input characters of the source program,
group them into lexemes, and produce as output a sequence
of tokens for each lexeme in the source program. The stream
of tokens is sent to the parser for syntax analysis.
⚫Commonly, the interaction is implemented by having the
parser call the lexical analyzer.
⚫The call, suggested by the getNextToken command, causes
the lexical analyzer to read characters from its input until it
can identify the next lexeme and produce for it the next
token, which it returns to the parser.
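The interaction described above can be sketched in code. The following is a minimal illustration (not from the slides): the names `Lexer`, `get_next_token`, and the token specification are hypothetical, chosen only to show how the parser-side call returns one token per lexeme.

```python
import re

# Hypothetical token specification: each entry is (token name, pattern).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),          # digits form a number lexeme
    ("ID",     r"[A-Za-z_]\w*"), # identifiers
    ("OP",     r"[+\-*/=]"),     # single-character operators
    ("SKIP",   r"\s+"),          # whitespace separates lexemes; never returned
]
MASTER = re.compile("|".join(f"(?P<{n}>{p})" for n, p in TOKEN_SPEC))

class Lexer:
    def __init__(self, text):
        self.pos, self.text = 0, text

    def get_next_token(self):
        # Read characters until the next lexeme is identified,
        # then return its token to the caller (the parser).
        while self.pos < len(self.text):
            m = MASTER.match(self.text, self.pos)
            self.pos = m.end()
            if m.lastgroup != "SKIP":
                return (m.lastgroup, m.group())
        return ("EOF", "")

lex = Lexer("count = count + 1")
tokens = []
while True:
    tok = lex.get_next_token()
    tokens.append(tok)
    if tok[0] == "EOF":
        break
# tokens: ID, OP, ID, OP, NUMBER, then EOF
```

Each call to `get_next_token` consumes exactly one lexeme, which is the pull-style interaction the slide describes.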
The Role of the Lexical Analyzer:

⚫ Since the lexical analyzer is the part of the compiler that reads the source
text, it may perform certain other tasks besides identification of lexemes.
⚫ One such task is stripping out comments and whitespace (blank, newline,
tab, and perhaps other characters that are used to separate tokens in the
input).
Sometimes lexical analyzers are divided into a cascade of two
processes:

⚫ Scanning consists of the simple processes that do not require
tokenization of the input, such as deletion of comments and
compaction of consecutive whitespace characters into one.
⚫ Lexical analysis proper is the more complex portion, which
produces tokens from the output of the scanner.
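The scanning half of the cascade can be sketched as a small preprocessing function. This is an illustrative sketch, assuming C-style `/* */` and `//` comments; the function name `scan` is hypothetical.

```python
import re

def scan(source):
    # Scanning stage: delete comments and compact runs of
    # consecutive whitespace characters into a single blank.
    source = re.sub(r"/\*.*?\*/", " ", source, flags=re.DOTALL)  # /* ... */
    source = re.sub(r"//[^\n]*", " ", source)                    # // to end of line
    return re.sub(r"\s+", " ", source).strip()

cleaned = scan("x = 1;  /* init */\n// done\ny = 2;")
# cleaned == "x = 1; y = 2;"
```

Lexical analysis proper would then tokenize `cleaned` without ever seeing comments or whitespace runs.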
A) Lexical Analysis Versus Parsing
⚫ There are a number of reasons why the analysis portion of a
compiler is normally separated into lexical analysis and parsing
(syntax analysis) phases.
I. Simplicity of design is the most important consideration.
The separation of lexical and syntactic analysis often
allows us to simplify at least one of these tasks.
II. Compiler efficiency is improved.
III.Compiler portability is enhanced. Input-device-specific
peculiarities can be restricted to the lexical analyzer.
B) Tokens, Patterns, and Lexemes
⚫ A token is a pair consisting of a token name and an optional attribute
value. The token name is an abstract symbol representing a kind of
lexical unit. The token names are the input symbols that the parser
processes.
⚫ A pattern is a description of the form that the lexemes of a token
may take. In the case of a keyword as a token, the pattern is just the
sequence of characters that form the keyword. For identifiers and
some other tokens, the pattern is a more complex structure that is
matched by many strings.
⚫ A lexeme is a sequence of characters in the source program that
matches the pattern for a token and is identified by the lexical
analyzer as an instance of that token.
Example
⚫ To see how these concepts are used in practice, in the C
statement
printf("Total = %d\n", score);
both printf and score are lexemes matching the pattern for token
id, and "Total = %d\n" is a lexeme matching literal.
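The token/pattern/lexeme distinction can be made concrete with two patterns. The sketch below is illustrative: the patterns for the tokens **id** and **literal** are simplified assumptions, not a full C lexer.

```python
import re

ID_PATTERN = re.compile(r"[A-Za-z_]\w*")  # pattern for token "id"
LITERAL_PATTERN = re.compile(r'"[^"]*"')  # simplified pattern for token "literal"

stmt = r'printf("Total = %d\n", score);'

# The string literal is one lexeme matching the pattern for "literal".
literal = LITERAL_PATTERN.search(stmt).group()

# Remove it first so identifiers inside the quotes are not matched,
# then every remaining match of ID_PATTERN is a lexeme of token "id".
ids = ID_PATTERN.findall(stmt.replace(literal, ""))
# ids == ["printf", "score"]
```

Here `"id"` and `"literal"` are token names, the regexes are patterns, and `printf`, `score`, and `"Total = %d\n"` are the lexemes.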
C) Attributes for Tokens
⚫ When more than one lexeme can match a pattern, the lexical analyzer
must provide the subsequent compiler phases additional information
about the particular lexeme that matched.
⚫ Example: the token names and associated attribute values for the
statement

E = M * C**2

⚫ are written below as a sequence of pairs:

<id, pointer to symbol-table entry for E>
<assign_op>
<id, pointer to symbol-table entry for M>
<mult_op>
<id, pointer to symbol-table entry for C>
<exp_op>
<number, integer value 2>
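A sketch of how such pairs might be produced follows. The function and token names are hypothetical; the attribute of an **id** is taken to be its index in a toy symbol table, and the attribute of a **number** is its value.

```python
import re

symtab = []  # toy symbol table: a list of identifier names

def attr_tokens(text):
    out = []
    # "**" must be tried before "*" so exponentiation wins.
    for m in re.finditer(r"\*\*|[A-Za-z_]\w*|\d+|=|\*", text):
        lx = m.group()
        if lx == "=":
            out.append(("assign_op", None))
        elif lx == "**":
            out.append(("exp_op", None))
        elif lx == "*":
            out.append(("mult_op", None))
        elif lx.isdigit():
            out.append(("number", int(lx)))       # attribute = numeric value
        else:
            if lx not in symtab:
                symtab.append(lx)
            out.append(("id", symtab.index(lx)))  # attribute = symbol-table index
    return out

pairs = attr_tokens("E = M * C ** 2")
```

Tokens like `assign_op` carry no attribute, while `id` and `number` carry the extra information later phases need.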


D) Lexical Errors
⚫ It is hard for a lexical analyzer to tell, without the aid of other
components, that there is a source-code error.
⚫ For instance, if the string fi is encountered for the first time in a C
program in the context:
fi ( a == f(x)) ...
⚫ A lexical analyzer cannot tell whether fi is a misspelling of the
keyword if or an undeclared function identifier.
⚫ Other possible error-recovery actions are:
a) Delete one character from the remaining input.
b) Insert a missing character into the remaining input.
c) Replace a character by another character.
d) Transpose two adjacent characters.
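The four recovery actions above can be sketched as single-edit repairs of a lexeme. This is an illustrative sketch, assuming a toy keyword set; the helper name `single_edit_repairs` is hypothetical.

```python
import string

KEYWORDS = {"if", "else", "while"}  # toy keyword set for the sketch

def single_edit_repairs(w):
    letters = string.ascii_lowercase
    deletes    = [w[:i] + w[i+1:] for i in range(len(w))]                       # a)
    inserts    = [w[:i] + c + w[i:] for i in range(len(w) + 1) for c in letters]  # b)
    replaces   = [w[:i] + c + w[i+1:] for i in range(len(w)) for c in letters]  # c)
    transposes = [w[:i] + w[i+1] + w[i] + w[i+2:] for i in range(len(w) - 1)]   # d)
    return set(deletes + inserts + replaces + transposes)

# Transposing the two adjacent characters of "fi" recovers the keyword "if".
repairs = single_edit_repairs("fi") & KEYWORDS
# repairs == {"if"}
```

In practice such repairs are attempted conservatively, since the analyzer cannot be sure which action the programmer intended.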
Regular Expression
⚫ Notations
If r and s are regular expressions denoting the
languages L(r) and L(s), then
⚫ Union : (r)|(s) is a regular expression denoting L(r) U
L(s)
⚫ Concatenation : (r)(s) is a regular expression
denoting L(r)L(s)
⚫ Kleene closure : (r)* is a regular expression denoting
(L(r))*
⚫ Parentheses : (r) is a regular expression denoting L(r)
Precedence and Associativity

⚫ *, concatenation (.), and | (pipe sign) are left
associative
⚫ * has the highest precedence
⚫ Concatenation (.) has the second highest
precedence.
⚫ | (pipe sign) has the lowest precedence of all.
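These precedence rules mean that a|bc* is read as a|(b(c*)). A quick check with Python's `re` module, which uses the same precedence:

```python
import re

# * binds tightest, then concatenation, then |, so a|bc* == a|(b(c*)).
pat = re.compile(r"a|bc*")

assert pat.fullmatch("a")            # left alternative
assert pat.fullmatch("b")            # c* matches zero c's
assert pat.fullmatch("bccc")         # c* matches many c's
assert pat.fullmatch("ac") is None   # NOT (a|b)c* and NOT (a|bc)*
```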
Finite Automata Construction
⚫ States : States of FA are represented by circles.
State names are written inside circles.
⚫ Start state : The state from which the automaton
starts is known as the start state. The start state has an
arrow pointed towards it.
⚫ Intermediate states : All intermediate states
have at least two arrows; one pointing to and
another pointing out from them.
⚫ Final state : If the input string is successfully parsed, the
automaton is expected to be in this state. A final state is
represented by double circles.
⚫ Transition : The transition from one state to another state
happens when a desired symbol in the input is found. Upon
transition, the automaton can either move to the next state or
stay in the same state. Movement from one state to another
is shown as a directed arrow, where the arrow points to the
destination state. If the automaton stays in the same state, an
arrow pointing from the state to itself is drawn.
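The construction above can be sketched as a transition table. The DFA below is an illustrative example (not from the slides): over the alphabet {0, 1} it accepts strings ending in 1, with start state s0 and final (double-circle) state s1.

```python
# Transition table: (current state, input symbol) -> next state.
DFA = {
    ("s0", "0"): "s0", ("s0", "1"): "s1",
    ("s1", "0"): "s0", ("s1", "1"): "s1",
}
START, FINAL = "s0", {"s1"}

def accepts(s):
    state = START
    for ch in s:                 # each input symbol triggers one transition
        state = DFA[(state, ch)]
    return state in FINAL        # accept iff we end in a final state

assert accepts("01101")          # ends in 1 -> accepted
assert not accepts("100")        # ends in 0 -> rejected
```

The dictionary plays the role of the arrows in the diagram: self-loops are entries whose next state equals the current state.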
Minimizing the Number of States of a DFA

⚫ There are two methods for minimizing the number of
states and transitions:

1. Empirical Method
2. Formal Method
Empirical Method
⚫ This method minimizes the number of states
and transitions according to the following
steps:-
❑ We merge two states in the DFA into one state if they
have the same important states. (An important
state is one with a non-ε out-transition.)
❑ We merge two states if they either both include or
both exclude accepting states of the NFA.
Example: Minimize the DFA using the Empirical
method.
2- Formal Method
⚫ In this method we do not need experience to decide
which states are legal to merge; instead
we use the following algorithm:-
⚫ Algorithm: - Minimizing the number of states of a DFA
⚫ Input: - A DFA M with set of states S, input alphabet Σ, transitions
defined for all states and inputs, initial state s0, and set
of final states F.
⚫ Output: - A DFA M' accepting the same language as M and
having as few states as possible.
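The formal method is usually realized by partition refinement: start with the partition {final states, non-final states} and split any group whose members disagree on which group some input symbol leads to. The sketch below illustrates this under assumed data; the example DFA (states A..E, final state E) is hypothetical, and `minimize` is a simple, unoptimized version of the algorithm.

```python
# Hypothetical example DFA over {0, 1}.
states = {"A", "B", "C", "D", "E"}
sigma = {"0", "1"}
delta = {("A", "0"): "B", ("A", "1"): "C",
         ("B", "0"): "B", ("B", "1"): "D",
         ("C", "0"): "B", ("C", "1"): "C",
         ("D", "0"): "B", ("D", "1"): "E",
         ("E", "0"): "B", ("E", "1"): "C"}
finals = {"E"}

def minimize(states, sigma, delta, finals):
    # Initial partition: final vs. non-final states.
    partition = [finals, states - finals]
    changed = True
    while changed:
        changed = False
        new_partition = []
        for group in partition:
            # Two states stay together only if, for every input symbol,
            # they transition into the same group of the current partition.
            buckets = {}
            for s in group:
                key = tuple(
                    next(i for i, g in enumerate(partition) if delta[(s, a)] in g)
                    for a in sorted(sigma)
                )
                buckets.setdefault(key, set()).add(s)
            new_partition.extend(buckets.values())
            if len(buckets) > 1:
                changed = True
        partition = new_partition
    return partition  # each group becomes one state of the minimized DFA M'

groups = minimize(states, sigma, delta, finals)
# A and C are indistinguishable and merge; the minimized DFA has 4 states.
```

Each surviving group becomes one state of M', so M' accepts the same language with as few states as possible.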
THANK YOU
