0% found this document useful (0 votes)
56 views

CD Introduction

The document describes the key phases of a compiler: 1) Lexical analysis converts characters into tokens which are passed to the syntax analyzer. 2) Syntax analysis generates a parse tree by checking token arrangements against the grammar. 3) Semantic analysis checks that rules like type compatibility are followed. 4) Intermediate code is generated in a form easier to convert to machine code. 5) Code optimization rearranges statements to improve performance before code generation.

Uploaded by

John Son
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
56 views

CD Introduction

The document describes the key phases of a compiler: 1) Lexical analysis converts characters into tokens which are passed to the syntax analyzer. 2) Syntax analysis generates a parse tree by checking token arrangements against the grammar. 3) Semantic analysis checks that rules like type compatibility are followed. 4) Intermediate code is generated in a form easier to convert to machine code. 5) Code optimization rearranges statements to improve performance before code generation.

Uploaded by

John Son
Copyright
© © All Rights Reserved
Available Formats
Download as PPTX, PDF, TXT or read online on Scribd
You are on page 1/ 32

Compiler Design

Compiler Construction 1
 The text book is:

Alfred V. Aho, Ravi Sethi and Jeffry D. Ulman,


Compilers Principles, Techniques and Tools, Addison
Wesley, 2007,

Compiler Construction 2
Language Processing System

Compiler Construction 3
Language Processors.

 A translator inputs and then converts a source program


into an object or target program.
 Source program is written in a source language
 Object program belongs to an object language

 A translators could be: Assembler, Compiler,


Interpreter

Assembler:
Assembler
source program object program
(in assembly language) (in machine
language)

Compiler Construction 4
Language Processors.

 a compiler is a program that can read a program in one language the source
language - and translate it into an equivalent program in another language - the
target language;
 An important role of the compiler is to report any errors in the source program that
it detects during the translation process

 If the target program is an executable machine-language program, it can then be


called by the user to process inputs and produce outputs;

Compiler Construction 5
Interpreter

•An interpreter is another common kind of language processor. Instead of producing a


target program as a translation, an interpreter appears to directly execute the operations
specified in the source program on inputs supplied by the user

The machine-language target program produced by a compiler is usually much faster


than an interpreter at mapping inputs to outputs . An interpreter, however, can usually
give better error diagnostics than a compiler, because it executes the source program
statement by statement.

Compiler Construction 6
Definition of Compiler

- Compiler: translates a source program written in a High-


Level Language (HLL) such as Pascal, C++ into
computer’s machine language (Low-Level Language
(LLL)).
* The time of conversion from source program into
object program is called compile time
* The object program is executed at run time

Compiler Construction 7
Overview of Compilers

Compilation Process:

Data

Source Object Executing Results


Compiler
program program Computer
Compile time run time

Compiler Construction 8
Tasks of A Compiler

 A compiler must perform two tasks:


- analysis of source program: The analysis part breaks up
the source program into constituent pieces and imposes a
grammatical structure on them. It then uses this structure to
create an intermediate representation of the source program.
- synthesis of its corresponding program: constructs the
desired target program from the intermediate representation
and the information in the symbol table.
 The analysis part is often called the front end of the
compiler; the synthesis part is the back end.

Compiler Construction 9
Phases of Compiler

 The compilation process is a sequence of various phases.

 Each phase takes input from its previous stage, has its own
representation of source program, and feeds its output to the
next phase of the compiler.

 Let us understand the phases of a compiler.

Compiler Construction 10
Phases of Compiler

The phases of a compiler can be given as follows:

o • Lexical analysis.

o • Syntax analysis.

o • Semantic analysis.

o • Intermediate code generation.

o • Code optimization.

o • Code generation.

o Code optimization.

Compiler Construction 11
Phases of Compilation Process and Its Output

Error
handler

Compiler phases

Compiler Construction 12
Lexical Analysis (scanner): The first phase of a compiler

 Lexical analyzer reads the stream of characters making up the


source program and groups the characters into meaningful
sequences called lexeme
 For each lexeme, the lexical analyzer produces a token of the form
that it passes on to the subsequent phase, syntax analysis
(token-name, attribute-value)
 Token-name: an abstract symbol is used during syntax analysis
 attribute-value: points to an entry in the symbol table for this
token.

Compiler Construction 13
Example: position =initial + rate * 60

1.”position” is a lexeme mapped into a token (id, 1), where id is an abstract symbol
standing for identifier and 1 points to the symbol table entry for position. The
symbol-table entry for an identifier holds information about the identifier, such as its
name and type.
2. = is a lexeme that is mapped into the token (=). Since this token needs no attribute-
value, we have omitted the second component. For notational convenience, the
lexeme itself is used as the name of the abstract symbol.
3. “initial” is a lexeme that is mapped into the token (id, 2), where 2 points to the
symbol-table entry for initial.
4. + is a lexeme that is mapped into the token (+).
5. “rate” is a lexeme mapped into the token (id, 3), where 3 points to the symbol-table
entry for rate.
6. * is a lexeme that is mapped into the token (*) .
7. 60 is a lexeme that is mapped into the token (60)
Blanks separating the lexemes would be discarded by the lexical analyzer.

14
Syntax Analysis (parser) : The second phase of the compiler

 The next phase is called the syntax analysis or parsing. It takes the token
produced by lexical analysis as input and generates a parse tree (or syntax
tree).
 In this phase, token arrangements are checked against the source code
grammar, i.e. the parser checks if the expression made by the tokens is
syntactically correct.

 A typical representation is a syntax tree in which each interior node


represents an operation and the children of the node represent the
arguments of the operation

15
Semantic Analysis: Third phase of the compiler

 Semantic analysis checks whether the parse tree constructed follows


the rules of language.

 For example, assignment of values is between compatible data types,


and adding string to an integer.

 Also, the semantic analyzer keeps track of identifiers, their types and
expressions; whether identifiers are declared before use or not etc.

 The semantic analyzer produces an annotated syntax tree as an output.

Compiler Construction 16
Intermediate Code Generation: three-address code

 After semantic analysis the compiler generates an intermediate


code of the source code for the target machine.

 It represents a program for some abstract machine.

 It is in between the high-level language and the machine


language.

 This intermediate code should be generated in such a way that


it makes it easier to be translated into the target machine code.

17
Code Optimization: to generate better target code

 The next phase does code optimization of the intermediate code.


Optimization can be assumed as something that removes unnecessary
code lines, and arranges the sequence of statements in order to speed up
the program execution without wasting resources (CPU, memory).

 The machine-independent code-optimization phase attempts to improve


the intermediate code so that better target code will result.

 Usually better means:

 faster, shorter code, or target code that consumes less power.

18
Code Generation: takes as input an intermediate representation of
the source program and maps it into the target language

 In this phase, the code generator takes the optimized representation of


the intermediate code and maps it to the target machine language.
 The code generator translates the intermediate code into a sequence of
(generally) re-locatable machine code.
 Sequence of instructions of machine code performs the task as the
intermediate code would do.

19
Symbol-Table Management:

 It is a data-structure maintained throughout all the phases


of a compiler.

 All the identifier's names along with their types are stored
here.

 The symbol table makes it easier for the compiler to


quickly search the identifier record and retrieve it.

 The symbol table is also used for scope management.

 Whenever an identifier is detected in any of the phases, it


is stored in the symbol table.
20
 Example

 int a, b; float c; char z;

Symbol name Type Address


A Int 1000
B Int 1002
C Float 1004
Z Char 1008

Compiler Construction 21
Error Handling

 • Each phase can encounter errors. After detecting an error,


a phase must handle the error so that compilation can
proceed.

 • In lexical analysis, errors occur in separation of tokens.

 • In syntax analysis, errors occur during construction of


syntax tree.

 • In semantic analysis, errors may occur at the following


cases:(i) When the compiler detects constructs that have
right syntactic structure but no meaning

22
 (ii) During type conversion.

 In code optimization, errors occur when the result is affected


by the optimization. In code generation, it shows error when
code is missing etc.

Compiler Construction 23
Translation of an assignment
statement

Compiler Construction 24
Parse Tree and Symbol Table

 Parse tree defines the program structure; how to combine parts of the
program to produce larger part and so on.

 Symbol table provides

- the associations between all occurrences of each

name given in the program.

- It provides a link between each name and it

declaration.

Compiler Construction 25
Grouping of Phases

The phases of a compiler can be grouped as:


Front end
o Front end of a compiler consists of the phases
o • Lexical analysis.
• Syntax analysis.
• Semantic analysis.
Back end
o Back end of a compiler contains
o • Code optimization.
• Code generation.

Compiler Construction 26
Front End

• Front end comprises of phases which are dependent on the input


(source language) and independent on the target machine (target
language).
• It includes lexical and syntactic analysis, symbol table management,
semantic analysis and the generation of intermediate code.
• Code optimization can also be done by the front end.
• It also includes error handling at the phases concerned.

Compiler Construction 27
Back End

 • Back end comprises of those phases of the compiler that are


dependent on the target machine and independent on the source
language.
• This includes code optimization, code generation.
• In addition to this, it also encompasses error handling and symbol
table management operations.

Compiler Construction 28
Common Back-end Compiling System

Fortran C/C++ Pascal Cobol

Common IR (e.g. Ucode)

Optimizer

Target Machine Code Gen

Compiler Construction 29
Compiling Passes

 Several phases can be implemented as a single pass


consist of reading an input file and writing an output
file.

 A typical multi-pass compiler looks like:

 First pass: preprocessing, macro expansion

 Second pass: syntax-directed translation, IR code


generation

 Third pass: optimization

 Last pass: target machine code generation 30


Passes
 • The phases of compiler can be implemented in a single pass by
marking the primary actions viz. reading of input file and writing to
the output file.
• Several phases of compiler are grouped into one pass in such a
way that the operations in each and every phase are incorporated
during the pass.
• (eg.) Lexical analysis, syntax analysis, semantic analysis and
intermediate code generation might be grouped into one pass. If so,
the token stream after lexical analysis may be translated directly
into intermediate code.

Compiler Construction 31
Reducing the Number of Passes

 • Minimizing the number of passes improves the time efficiency as


reading from and writing to intermediate files can be reduced.
• When grouping phases into one pass, the entire program has to be
kept in memory to ensure proper information flow to each phase
because one phase may need information in a different order than the
information produced in previous phase.
The source program or target program differs from its internal
representation. So, the memory for internal form may be larger than
that of input and output.

Compiler Construction 32

You might also like