Compiler Design: N I E T

The document outlines the syllabus for a Compiler Design course at Anna University, detailing the structure and phases of compilers, including lexical analysis, syntax analysis, and semantic analysis. It emphasizes the role of the lexical analyzer in token generation and discusses compiler-construction tools and techniques. Additionally, it provides examples of tokens, patterns, and the implementation process of a compiler.

Uploaded by

Priya

CS3501

COMPILER DESIGN
(Anna University, Regulation 2021)

Mrs. S. Priya, M.E., (Ph.D)


Assistant Professor,
Nehru Institute of Engineering and
Technology
Accredited by NAAC, Recognized by UGC with Section 2(f) and 12(B),
NBA Accredited UG Courses : Aero and CSE
Thirumalayampalayam, Coimbatore- 641 105
[email protected]
Syllabus
UNIT I INTRODUCTION TO COMPILERS (9)

Structure of a compiler – Lexical Analysis – Role of Lexical
Analyzer – Input Buffering – Specification of Tokens – Recognition
of Tokens – Lex – Finite Automata – Regular Expressions to
Automata – Minimizing DFA.

TEXT BOOK:
Alfred V. Aho, Monica S. Lam, Ravi Sethi, Jeffrey D. Ullman,
“Compilers: Principles, Techniques and Tools”, Second Edition,
Pearson Education, 2009.

Mrs. S. Priya, AP- CSE, 2


NIET
Contents/ Topics:
1. The Role of the Lexical Analyzer
2. Input Buffering (Omit)
3. Specification of Tokens
4. Recognition of Tokens
5. The Lexical-Analyzer Generator Lex
6. Finite Automata
7. From Regular Expressions to Automata
8. Design of a Lexical-Analyzer Generator
9. Optimization of DFA-Based Pattern Matchers
1. The Role of the Lexical Analyzer
• As the first phase of a compiler, the lexical analyzer reads the input
characters of the source program, groups them into lexemes, and produces
as output a sequence of tokens, one for each lexeme in the source program.

Why Lexical Analysis and Parsing (Syntax Analysis) are Separate
• Simplifies the design of the compiler
– Without a separate lexical analyzer, LL(1) or LR(1) parsing with one token
of lookahead would not be possible, because the parser would have to match
multiple characters for each token
• Provides efficient implementation
– Systematic techniques exist to implement lexical analyzers by hand or
automatically from specifications
– Stream buffering methods can be used to scan the input
• Improves portability
– Non-standard symbols and alternate character encodings can
be normalized (e.g. UTF-8, trigraphs)
Tokens, Patterns, and Lexemes
• A token is a pair consisting of a token name and an optional
attribute value
– The token name is an abstract symbol representing a kind of lexical unit
– For example: id and num
• Lexemes are the specific character strings that make up a token
– For example: abc and 123
• Patterns are rules describing the set of lexemes belonging to a token
– For example: “letter followed by letters and digits” and
“non-empty sequence of digits”
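The two patterns above can be written as regular expressions. The following is a minimal sketch, assuming ASCII letters and digits only; the table name PATTERNS and the function classify are hypothetical names chosen for illustration.

```python
import re

# Hypothetical pattern table: token name -> regular expression (assumed ASCII only).
PATTERNS = {
    "id": r"[A-Za-z][A-Za-z0-9]*",   # letter followed by letters and digits
    "num": r"[0-9]+",                # non-empty sequence of digits
}

def classify(lexeme):
    """Return the name of the first token whose pattern matches the whole lexeme, else None."""
    for name, pattern in PATTERNS.items():
        if re.fullmatch(pattern, lexeme):
            return name
    return None

print(classify("abc"))   # id
print(classify("123"))   # num
```

Here abc and 123 are the lexemes, id and num are the token names, and each regular expression plays the role of the pattern.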
Examples of Tokens

Token Classes:
1. One token for each keyword
2. Tokens for the operators
3. One token representing all identifiers
4. One or more tokens representing constants
5. Tokens for each punctuation symbol
Attributes for Tokens
• When more than one lexeme can match a pattern, the lexical
analyzer must provide the subsequent compiler phases additional
information about the particular lexeme that matched
• Examples: lexemes, token names and associated attribute values for
the following statements.
printf ( "Total = %d\n", score ) ;

E = M * C ** 2
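As an illustration of token names with attribute values, the sketch below scans the second statement, E = M * C ** 2. It assumes that the attribute of an identifier is its index in a symbol table and the attribute of a number is its integer value. The regular expressions and the helper name tokenize are assumptions, and the string literal and punctuation in the printf statement are not covered.

```python
import re

# Token names follow the usual textbook convention; the regexes are assumptions.
TOKEN_SPEC = [
    ("number",    r"[0-9]+"),
    ("id",        r"[A-Za-z_][A-Za-z0-9_]*"),
    ("exp_op",    r"\*\*"),   # listed before mult_op so ** is not read as two *
    ("mult_op",   r"\*"),
    ("assign_op", r"="),
    ("ws",        r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))

def tokenize(source):
    """Return (tokens, symtab); each token is a (token name, attribute value) pair."""
    symtab = []   # symbol table: one entry per distinct identifier
    tokens = []
    pos = 0
    while pos < len(source):
        m = MASTER.match(source, pos)
        if m is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        pos = m.end()
        kind, lexeme = m.lastgroup, m.group()
        if kind == "ws":
            continue                                   # white space produces no token
        if kind == "id":
            if lexeme not in symtab:
                symtab.append(lexeme)
            tokens.append((kind, symtab.index(lexeme)))   # attribute: symbol-table index
        elif kind == "number":
            tokens.append((kind, int(lexeme)))            # attribute: integer value
        else:
            tokens.append((kind, None))                   # operators need no attribute
    return tokens, symtab

tokens, symtab = tokenize("E = M * C ** 2")
for token in tokens:
    print(token)
print(symtab)
```

The three identifiers share the token name id and are told apart only by their attribute values, which is the additional information the later phases need.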
Language Processors
• A compiler is a program that can read a program in one language (the source
language) and translate it into an equivalent program in another language (the target
language).
• An important role of the compiler is to report any errors in the source program that
it detects during the translation process.

[Figure: language processors]

Phases of a Compiler
[Figure: phases of a compiler]
Compiler-Construction Tools
Some commonly used compiler-construction tools include
1. Parser generators that automatically produce syntax analyzers from a grammatical
description of a programming language.
2. Scanner generators that produce lexical analyzers from a regular-expression description
of the tokens of a language.
3. Syntax-directed translation engines that produce collections of routines for walking a
parse tree and generating intermediate code.
4. Code-generator generators that produce a code generator from a collection of rules for
translating each operation of the intermediate language into the machine language for a
target machine.
5. Data-flow analysis engines that facilitate the gathering of information about how values
are transmitted from one part of a program to each other part. Data-flow analysis is a key
part of code optimization.
6. Compiler-construction toolkits that provide an integrated set of routines for constructing
various phases of a compiler.
Implementation of Compiler
The compiler implementation process is divided into two parts:
• Analysis of a source program
• Synthesis of a target program
Analysis examines the different constructs of the program. It consists of three phases:
• Lexical analysis (linear analysis, or scanning)
• Syntax analysis (hierarchical analysis)
• Semantic analysis
Synthesis of a target program includes three phases:
• Intermediate code generator
• Code optimizer
• Code generator
Lexical Analysis of Compiler
Lexeme: A lexeme is the smallest logical unit of a program. It
is the sequence of characters in the source program for which a
token is produced.
Token: A class of similar lexemes is identified by the same
token.
Pattern: A pattern is a rule that describes the set of lexemes belonging to a token.

Example: The pattern for an identifier is that it consists of
letters and digits, and the first character must be a letter.
In int x = 10; the lexeme int belongs to the token keyword, and the lexeme x
belongs to the token identifier.
Lexical Analysis of Compiler
Each repetition in which the compiler processes the entire source program
before generating code is referred to as a pass.
Most optimizing compilers use more than one pass:
• One pass for scanning and parsing
• One pass for semantic analysis and source-level optimization
• A third pass for code generation and target-level optimization

• Parser Generator - It produces syntax analyzers
Lexical Analysis of Compiler
• Scanner Generator - It generates lexical analyzers from a regular-expression
description of the tokens, i.e., it generates a finite automaton that
recognizes the regular expressions.
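To make the idea of a finite automaton concrete, here is a hand-written sketch of the kind of table-driven recognizer a scanner generator might produce for the identifier pattern letter (letter | digit)*. The state numbers, the table layout and the function names are assumptions for illustration, not the output of any particular tool.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table (ASCII only)."""
    if ch.isascii() and ch.isalpha():
        return "letter"
    if ch.isascii() and ch.isdigit():
        return "digit"
    return "other"

# Transition table of a DFA for the pattern: letter (letter | digit)*
# A missing entry means there is no transition, so the input is rejected.
TRANSITIONS = {
    (0, "letter"): 1,
    (1, "letter"): 1,
    (1, "digit"): 1,
}
START = 0
ACCEPTING = {1}

def accepts(text):
    """Run the DFA over text and report whether it ends in an accepting state."""
    state = START
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING

print(accepts("rate"))   # True
print(accepts("x1"))     # True
print(accepts("1x"))     # False
```

The recognizer looks at each character once, so its running time is proportional to the length of the input.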

Other Uses of Compiler
1. Scanning and Parsing Techniques
• Command language interpreters
• Scripting language interpretation (Unix shell, Perl, Python)
• XML parsing and document tree construction
• Database query interpreters
2. Program Analysis Techniques
• Converting a sequential loop to a parallel loop
• Program analysis to determine if programs are data-race free
• Profiling programs to determine busy regions
3. Applications of Compiler technology
• Parsers for HTML in web browsers
• Interpreters for JavaScript/Flash
4. Application-Nature of Compiler Algorithms
• Greedy algorithms and graph algorithms - register allocation
• Heuristic search - list scheduling
Data Structures of Compiler
• The interaction between the algorithms used by the phases of a
compiler and the data structures that support these phases is a strong one.
• Ideally, a compiler should compile a program in time proportional to the
size of the program:
Time ∝ Size
• That is, the time complexity should be O(n), where n is a measure of
program size (usually the number of characters).
Parse tree for position = initial + rate * 60
Lexical Analysis:

The characters of the statement are grouped into the following tokens:

1. The identifier position.
2. The assignment symbol =.
3. The identifier initial.
4. The plus sign.
5. The identifier rate.
6. The multiplication sign.
7. The number 60.
Parse tree for position = initial + rate * 60
Syntax Analysis:
The hierarchical structure of a program is expressed by recursive rules, i.e., by
context-free grammars.
The following rules define an expression:
1. Any identifier is an expression, i.e. E → id
2. Any number is an expression, i.e. E → num
3. If expression1 (E1) and expression2 (E2) are expressions, then so are
E1 + E2, E1 * E2 and (E1), i.e. E → E + E | E * E | ( E )
Rules (1) and (2) are basic rules, which are non-recursive, while rule (3)
defines expressions in terms of operators applied to other expressions.
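The grammar E → E + E | E * E | ( E ) | id | num is ambiguous as written, so a parser needs extra rules to choose one tree. The sketch below is a small recursive-descent parser that assumes the usual conventions, which the rules above do not state: * binds tighter than +, and both operators associate to the left. The tuple-based tree and the function names are hypothetical.

```python
import re

def tokenize(text):
    # Identifiers, integer literals, and the symbols + * ( ).
    # Characters outside these patterns are silently skipped in this sketch.
    return re.findall(r"[A-Za-z][A-Za-z0-9]*|[0-9]+|[+*()]", text)

def parse(tokens):
    """Parse a token list into a nested-tuple syntax tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        pos += 1

    def expr():                      # expr -> term ( '+' term )*
        node = term()
        while peek() == "+":
            advance()
            node = ("+", node, term())
        return node

    def term():                      # term -> factor ( '*' factor )*
        node = factor()
        while peek() == "*":
            advance()
            node = ("*", node, factor())
        return node

    def factor():                    # factor -> id | num | '(' expr ')'
        tok = peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if tok == "(":
            node = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return node
        if tok.isdigit():
            return ("num", int(tok))
        if tok[0].isalpha():
            return ("id", tok)
        raise SyntaxError(f"unexpected token {tok!r}")

    tree = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree

print(parse(tokenize("initial + rate * 60")))
# ('+', ('id', 'initial'), ('*', ('id', 'rate'), ('num', 60)))
```

The multiplication ends up deeper in the tree than the addition, so rate * 60 is evaluated first.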
Parse tree for position = initial + rate * 60
Semantic Analysis:
[Figure: semantic analysis of position = initial + rate * 60]
Parse tree for position = initial + rate * 60
Intermediate Code Generation:
• After semantic analysis, some compilers generate an explicit intermediate
representation (IR) of the source program.
• The IR can take a variety of forms:
– Three-address code
– Postfix notation
– Syntax tree
• The IR code for the given input is as follows:
temp1 = inttoreal ( 60 )
temp2 = id3 * temp1
temp3 = id2 + temp2
id1 = temp3
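The sketch below shows one way such three-address code could be produced by a post-order walk of the syntax tree. It assumes that id1, id2 and id3 stand for position, initial and rate, that all three are real-valued, and that every integer literal is therefore wrapped in inttoreal. The tree encoding and the function name gen_tac are hypothetical.

```python
def gen_tac(tree):
    """Generate three-address code for a tree of the form ("=", target, expression)."""
    code = []
    counter = 0

    def new_temp():
        nonlocal counter
        counter += 1
        return f"temp{counter}"

    def walk(node):
        kind = node[0]
        if kind == "id":
            return node[1]
        if kind == "num":
            # Assumption: identifiers are real, so integer literals are converted.
            temp = new_temp()
            code.append(f"{temp} = inttoreal ( {node[1]} )")
            return temp
        op, left, right = node           # binary operator node
        left_name = walk(left)
        right_name = walk(right)
        temp = new_temp()
        code.append(f"{temp} = {left_name} {op} {right_name}")
        return temp

    _, target, expression = tree
    result = walk(expression)
    code.append(f"{target} = {result}")
    return code

tree = ("=", "id1", ("+", ("id", "id2"), ("*", ("id", "id3"), ("num", 60))))
for line in gen_tac(tree):
    print(line)
```

Each instruction has at most one operator on its right-hand side, and a fresh temporary holds every intermediate result.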
Parse tree for position = initial + rate * 60
Code Optimization:
• This is the fifth phase of the compiler; its input is three-address code and its
output is optimized three-address code.
• The IR code for the given input is as follows:
temp1 = inttoreal ( 60 )
temp2 = id3 * temp1
temp3 = id2 + temp2
id1 = temp3
• The optimizer converts 60 to the floating-point constant 60.0 at compile time, so
inttoreal is no longer needed, and it removes the copy through temp3:
t1 = id3 * 60.0
id1 = id2 + t1
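The two improvements can be sketched as a single pass over the instruction list. This is a simplified illustration, assuming each instruction is a (dest, op, arg1, arg2) tuple and that every temporary is used exactly once, which holds for this example but not in general. The function name optimize is hypothetical, and the temporaries keep their original names instead of being renumbered to t1.

```python
def optimize(code):
    """Fold inttoreal of integer constants and remove a copy of the temporary just computed."""
    consts = {}   # temporary name -> folded constant, as text
    out = []
    for dest, op, a, b in code:
        # Replace operands that are known constants.
        a = consts.get(a, a)
        b = consts.get(b, b)
        if op == "inttoreal" and a.isdigit():
            consts[dest] = f"{int(a)}.0"   # do the conversion at compile time
            continue
        if op == "copy" and out and out[-1][0] == a and a.startswith("temp"):
            # Assumption: the temporary has no other use, so write into dest directly.
            _, prev_op, prev_a, prev_b = out.pop()
            out.append((dest, prev_op, prev_a, prev_b))
            continue
        out.append((dest, op, a, b))
    return out

code = [
    ("temp1", "inttoreal", "60", None),
    ("temp2", "*", "id3", "temp1"),
    ("temp3", "+", "id2", "temp2"),
    ("id1", "copy", "temp3", None),
]
for dest, op, a, b in optimize(code):
    print(f"{dest} = {a} {op} {b}")
```

The four instructions shrink to two, temp2 = id3 * 60.0 and id1 = id2 + temp2, which matches the optimized code above apart from the name of the temporary.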

EXAMPLE of Compiler

Example: (8 * x) / 2
Load a, x
Mult a, 8
Div a, 2
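To check that the three instructions compute (8 * x) / 2, the sketch below simulates them on a tiny register machine. The instruction semantics are an assumption read off the mnemonics: Load r, v copies v into register r, while Mult r, v and Div r, v update r in place. The function name run is hypothetical.

```python
def run(program, env):
    """Execute a list of 'Op reg, operand' lines and return the register contents."""
    regs = {}

    def value(operand):
        # An operand is a register, a variable from env, or an integer constant.
        if operand in regs:
            return regs[operand]
        if operand in env:
            return env[operand]
        return int(operand)

    for line in program:
        op, rest = line.split(None, 1)
        dst, src = [part.strip() for part in rest.split(",")]
        if op == "Load":
            regs[dst] = value(src)
        elif op == "Mult":
            regs[dst] *= value(src)
        elif op == "Div":
            regs[dst] /= value(src)
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return regs

program = ["Load a, x", "Mult a, 8", "Div a, 2"]
for x in (0, 3, 10):
    assert run(program, {"x": x})["a"] == (8 * x) / 2

print(run(program, {"x": 5})["a"])   # 20.0
```

The generated code loads x first and then multiplies by 8, which is valid because multiplication is commutative.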
