Automata Theory and Compiler Design (AT&CD) Vtu Sce 5th Sem 21cs51
Introduction
[Figure: a compiler. source program -> Compiler -> target program]
[Figure: an interpreter. source program, input -> Interpreter -> output]
[Figure: source program -> Translator]
[Figure: a language-processing system. source program -> Preprocessor -> Compiler -> Assembler]
[Figure: phases of a compiler. character stream -> Lexical Analyzer -> token stream -> Syntax Analyzer -> syntax tree -> Semantic Analyzer -> syntax tree -> Intermediate Code Generator -> intermediate representation -> Machine-Independent Code Optimizer -> intermediate representation -> Code Generator -> target-machine code -> Machine-Dependent Code Optimizer -> target-machine code]
The lexical analyzer reads the character stream of the source program
and groups the characters into meaningful sequences called lexemes. For each
lexeme, the lexical analyzer produces as output a token of the form
    <token-name, attribute-value>
    position = initial + rate * 60    (1.1)
The characters in this assignment could be grouped into the following lexemes
and mapped into the following tokens passed on to the syntax analyzer:
    <id,1> <=> <id,2> <+> <id,3> <*> <60>    (1.2)
In this representation, the token names =, +, and * are abstract symbols for
the assignment, addition, and multiplication operators, respectively.
1 Technically speaking, for the lexeme 60 we should make up a token like <number, 4>,
where 4 points to the symbol table for the internal representation of integer 60, but we shall
defer the discussion of tokens for numbers until Chapter 2. Chapter 3 discusses techniques
for building lexical analyzers.
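The grouping of characters into lexemes and tokens for statement (1.1) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the regular expressions, the function name tokenize, and the choice to store the integer value directly as the attribute of a number token (rather than a symbol-table index, as the footnote suggests) are all hypothetical and not taken from the text.

```python
import re

# Token patterns for the tiny example language (assumed for this sketch).
TOKEN_SPEC = [
    ("number", r"\d+"),
    ("id", r"[A-Za-z_]\w*"),
    ("op", r"[=+*]"),
    ("ws", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Return (tokens, symbol_table) for one line of source text."""
    symbol_table = []  # entry k (1-based) holds the k-th distinct identifier
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at {pos}")
        kind, lexeme = match.lastgroup, match.group()
        pos = match.end()
        if kind == "ws":
            continue  # whitespace separates lexemes but produces no token
        if kind == "id":
            if lexeme not in symbol_table:
                symbol_table.append(lexeme)
            # attribute-value is the 1-based symbol-table entry
            tokens.append(("id", symbol_table.index(lexeme) + 1))
        elif kind == "number":
            # simplification: keep the value itself as the attribute
            tokens.append(("number", int(lexeme)))
        else:
            # operators need no attribute; the token name is the symbol
            tokens.append((lexeme, None))
    return tokens, symbol_table


tokens, table = tokenize("position = initial + rate * 60")
print(tokens)
print(table)
```

The identifiers position, initial, and rate end up in symbol-table entries 1, 2, and 3, which is why the later intermediate code refers to them as id1, id2, and id3.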
THE STRUCTURE OF A COMPILER
[Figure: translation of the assignment statement position = initial + rate * 60 through the Lexical Analyzer, Semantic Analyzer, Code Optimizer, and Code Generator. Symbol table entries: 1 position, 2 initial, 3 rate.]
    t1 = inttofloat(60)
    t2 = id3 * t1
    t3 = id2 + t2
    id1 = t3                  (1.3)

    t1 = id3 * 60.0
    id1 = id2 + t1            (1.4)
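The step from sequence (1.3) to the shorter sequence (1.4) can be reproduced with two small local transformations. The sketch below is one possible way to do it, not the method of any particular compiler: the tuple encoding (dest, op, arg1, arg2), the "copy" operator name, and the function names are assumptions made for illustration. It also assumes each temporary is used exactly once, which holds for (1.3).

```python
import re


def is_temp(name):
    """True for compiler-generated temporaries such as t1, t2, ..."""
    return re.fullmatch(r"t\d+", name) is not None


def optimize(code):
    """Simplify a list of (dest, op, arg1, arg2) three-address instructions."""
    # Step 1: convert integer constants to floating point at compile time.
    constants = {}
    folded = []
    for dest, op, a, b in code:
        a = constants.get(a, a)
        b = constants.get(b, b)
        if op == "inttofloat" and a.isdigit():
            constants[dest] = f"{int(a)}.0"  # no instruction is emitted
            continue
        folded.append((dest, op, a, b))

    # Step 2: merge "x = t" into the preceding instruction that computes t.
    # Assumes the temporary t is not used again afterwards.
    merged = []
    for dest, op, a, b in folded:
        if op == "copy" and merged and merged[-1][0] == a and is_temp(a):
            _, prev_op, prev_a, prev_b = merged.pop()
            merged.append((dest, prev_op, prev_a, prev_b))
        else:
            merged.append((dest, op, a, b))

    # Step 3: renumber the surviving temporaries from t1 upward.
    names = {}
    result = []
    for dest, op, a, b in merged:
        a = names.get(a, a)
        b = names.get(b, b)
        if is_temp(dest):
            names[dest] = f"t{len(names) + 1}"
            dest = names[dest]
        result.append((dest, op, a, b))
    return result


# Sequence (1.3) in the tuple encoding assumed above.
CODE_1_3 = [
    ("t1", "inttofloat", "60", None),
    ("t2", "*", "id3", "t1"),
    ("t3", "+", "id2", "t2"),
    ("id1", "copy", "t3", None),
]

for dest, op, a, b in optimize(CODE_1_3):
    print(f"{dest} = {a} {op} {b}")
```

Run on (1.3), the sketch yields the two instructions of (1.4): the conversion of 60 disappears because it is done once at compile time, and the final copy disappears because t3 only served to carry a value into id1.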
The code in (1.5) loads the contents of address id3 into register R2, then multiplies it with
floating-point constant 60.0. The # signifies that 60.0 is to be treated as an
immediate constant. The third instruction moves id2 into register R1 and the
fourth adds to it the value previously computed in register R2. Finally, the value
in register R1 is stored into the address of id1, so the code correctly implements
the assignment statement (1.1). Chapter 8 covers code generation.
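To check that a five-instruction sequence of this kind really implements statement (1.1), it can be run on a toy register machine. The sketch below is a simulation under assumptions: the mnemonics LDF, MULF, ADDF, and STF, the destination-first operand order, and the function name run are chosen for illustration and should not be read as a definition of the target machine. Only the behavior of each step follows the description above.

```python
def run(program, memory):
    """Execute instructions against a memory dict and return the memory."""
    regs = {}

    def value(operand):
        # '#' marks an immediate constant; anything else names a register
        if operand.startswith("#"):
            return float(operand[1:])
        return regs[operand]

    for op, dst, *src in program:
        if op == "LDF":        # load memory location into register
            regs[dst] = memory[src[0]]
        elif op == "MULF":     # floating-point multiply
            regs[dst] = value(src[0]) * value(src[1])
        elif op == "ADDF":     # floating-point add
            regs[dst] = value(src[0]) + value(src[1])
        elif op == "STF":      # store register into memory location
            memory[dst] = regs[src[0]]
        else:
            raise ValueError(f"unknown instruction {op}")
    return memory


# Assumed encoding of the sequence described in the text.
PROGRAM = [
    ("LDF", "R2", "id3"),           # R2 = rate
    ("MULF", "R2", "R2", "#60.0"),  # R2 = rate * 60.0
    ("LDF", "R1", "id2"),           # R1 = initial
    ("ADDF", "R1", "R1", "R2"),     # R1 = initial + rate * 60.0
    ("STF", "id1", "R1"),           # position = R1
]

memory = run(PROGRAM, {"id1": 0.0, "id2": 10.0, "id3": 2.0})
print(memory["id1"])
```

With initial = 10.0 and rate = 2.0, the location id1 ends up holding 10.0 + 2.0 * 60.0, matching the source-level assignment.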
This discussion of code generation has ignored the important issue of storage
allocation for the identifiers in the source program. As we shall see in
Chapter 7, the organization of storage at run-time depends on the language being
compiled. Storage-allocation decisions are made either during intermediate
code generation or during code generation.