Compiler Design: Dr. Eng. Ahmed Moustafa Elmahalawy

The document discusses the phases and components of a compiler. It describes the compiler as having front-end and back-end phases. The front-end phases include lexical analysis, syntax analysis, and semantic analysis. Lexical analysis breaks the source code into tokens. Syntax analysis groups tokens into syntactic structures like expressions and statements. Semantic analysis determines the meaning of the source program. The back-end phase generates target code. The overall goal is to analyze the source program and synthesize equivalent target code.

Compiler Design
Dr. Eng. Ahmed Moustafa Elmahalawy
Computer Science and Engineering Department
Lecture Charter

Mutual respect, switching off mobile phones, setting the goal, participation, punctuality


Chapter 2
Model of a Compiler
Compiler Design Chapter 2: Model of a Compiler

Contents:

1- Phases of a Compiler
2- Error Handler
3- Symbol Tables

The task of constructing a compiler for a particular source language is complex.

The complexity and nature of the compilation process depend, to a large extent, on the source language.

Compiler complexity can often be reduced if a programming-language designer takes various design factors into consideration.

2.1 Phases of a Compiler


Compilers are highly complex programs, and it is unreasonable to consider the translation process as occurring in a single step.

It is usual to regard it as divided into a series of phases.

The simplest breakdown recognizes that there is an analytic phase, in which the source program is analyzed to determine whether it meets the syntactic and static semantic constraints imposed by the language.

This is followed by a synthetic phase in which the corresponding object code is generated in the target language.

The components of the translator that handle these two major phases are said to comprise the front end and the back end of the compiler.

The front end is largely independent of the target machine; the back end depends very heavily on the target machine.

A basic model of a compiler is given in the following figure.

A compiler must perform two major tasks:

a) The analysis of a source program. The analysis task deals with the decomposition of the source program into its basic parts.

b) The synthesis of its corresponding object program. Using these basic parts, the synthesis task builds their equivalent object program modules.

Before the compiler begins, it must do the following.

The character handler is the section that communicates with the outside world, through the operating system, to read in the characters that make up the source text.

As character sets and file handling vary from system to system, this phase is often machine or operating system dependent.

A source program is a string of symbols, each of which is generally a letter, a digit, or one of certain special symbols such as +, -, (, and ).

A source program contains elementary language constructs such as variable names, labels, constants, keywords, and operators.

It is therefore desirable for the compiler to identify these various types as classes.

These language constructs are given in the definition of the language.

1- The lexical analyzer or scanner is the section that fuses characters of the source text into groups that logically make up the tokens of the language: symbols like identifiers, strings, numeric constants, keywords like while and if, operators like <=, and so on.

The source program is input to a lexical analyzer or scanner whose purpose is to separate the incoming text into pieces or tokens such as constants, variable names, keywords (such as DO, IF, and THEN in PL/I), and operators.

In essence, the lexical analyzer performs low-level syntax analysis.

For efficiency reasons, each class of tokens is given a unique internal representation number.

For example, a variable name may be given a representation number of 1, a constant a value of 2, a label the number 3, the addition operator (+) a value of 4, etc.
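
The numbering scheme above can be sketched as a tiny scanner. This is only an illustration: the function name `classify`, the regular expression, and the restriction to variables, integer constants, and `+` are assumptions, and labels (class 3) are left out because recognizing them depends on where they appear in a statement.

```python
import re

# Representation numbers taken from the example in the text.
VARIABLE, CONSTANT, LABEL, PLUS = 1, 2, 3, 4

# Hypothetical token pattern: optional blanks, then one token.
TOKEN_PATTERN = re.compile(
    r"\s*(?:(?P<variable>[A-Za-z][A-Za-z0-9]*)|(?P<constant>\d+)|(?P<plus>\+))"
)


def classify(source):
    """Return a list of (representation_number, lexeme) pairs."""
    tokens = []
    source = source.rstrip()
    pos = 0
    while pos < len(source):
        match = TOKEN_PATTERN.match(source, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        if match.group("variable"):
            tokens.append((VARIABLE, match.group("variable")))
        elif match.group("constant"):
            tokens.append((CONSTANT, match.group("constant")))
        else:
            tokens.append((PLUS, "+"))
        pos = match.end()
    return tokens
```

Later phases can then compare small integers instead of comparing character strings each time they need to know what kind of token they are looking at.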

For example, the statement

WHILE A > 3 * B DO A := A - 1 END

easily decodes into tokens as we read it from left to right, but the Fortran statement

10 DO 20 I = 1.30

is more deceptive.

Readers familiar with Fortran might see it as decoding into

while those who enjoy perversity might like to see it as it really is:
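
Fixed-form Fortran treats blanks as insignificant, so a scanner cannot decide what DO means until it has looked further along the statement. A minimal sketch of that decision follows; the helper name `classify_fortran_do` is hypothetical, and the sketch assumes the statement label (the leading 10) has already been stripped and that the statement contains no parentheses or string literals.

```python
import re


def classify_fortran_do(statement):
    """Classify a statement body as a DO loop or as an assignment.

    Assumption: only the two shapes discussed in the text are handled.
    """
    squeezed = statement.replace(" ", "")  # blanks carry no meaning
    # A DO loop needs a comma between the initial and final values.
    loop = re.fullmatch(r"DO(\d+)([A-Z][A-Z0-9]*)=([^,]+),(.+)", squeezed)
    if loop:
        return ("DO loop",) + loop.groups()
    # Otherwise everything left of '=' is a single variable name.
    assign = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", squeezed)
    if assign:
        return ("assignment",) + assign.groups()
    raise ValueError("statement shape not handled by this sketch")
```

Under these assumptions, `DO 20 I = 1.30` is classified as an assignment of 1.30 to the variable DO20I, whereas `DO 20 I = 1,30` is classified as a loop over I ending at the statement labelled 20.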

Note that in scanning the source statement and generating the representation number of each token we have ignored spaces (or blanks) in the statement.

Some scanners place constants, labels, and variable names in appropriate tables.

The lexical analyzer supplies tokens to the syntax analyzer.

These tokens may take the form of a pair of items:

- The first item gives the address or location of the token in some symbol table.

- The second item is the representation number of the token.
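
One way to picture these pairs is sketched below. The `SymbolTable` class, the `to_pairs` function, and the use of a single table for both names and constants are assumptions made for illustration; a real scanner may keep separate tables.

```python
VARIABLE, CONSTANT = 1, 2  # representation numbers from the earlier example


class SymbolTable:
    """Hypothetical table that stores each distinct lexeme once."""

    def __init__(self):
        self._index = {}
        self.entries = []

    def intern(self, lexeme):
        """Return the location of lexeme, adding it on first sight."""
        if lexeme not in self._index:
            self._index[lexeme] = len(self.entries)
            self.entries.append(lexeme)
        return self._index[lexeme]


def to_pairs(lexemes, table):
    """Turn lexemes into (table location, representation number) pairs."""
    pairs = []
    for lexeme in lexemes:
        kind = CONSTANT if lexeme.isdigit() else VARIABLE
        pairs.append((table.intern(lexeme), kind))
    return pairs
```

A repeated name maps to the same table location, so the parser and later phases can share whatever information is recorded there.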

2- The syntax analyzer or parser groups the tokens produced by the scanner into syntactic structures, which it does by parsing expressions and statements.

Often the parser is combined with the contextual constraint analyzer, whose job it is to determine that the components of the syntactic structures satisfy such things as scope rules and type rules within the context of the structure being analyzed.

The syntax analyzer is much more complex than the lexical analyzer.

Its function is to take the source program (in the form of tokens) from the lexical analyzer and determine the manner in which it is to be decomposed into its constituent parts.

The syntax analyzer determines the overall structure of the source program.

This process is analogous to determining the structure of a sentence in the English language.

In such an instance we are interested in identifying certain classes such as "subject," "predicate," "verb," "noun," and "adjective."

In syntax analysis we are concerned with grouping tokens into larger syntactic classes such as expression, statement, and procedure.

The syntax analyzer (or parser) outputs a syntax tree (or its equivalent) whose leaves are the tokens and in which every nonleaf node represents a syntactic class type.

For example, an analysis of the source statement

(A + B)*(C + D)

can produce the syntactic classes (factor), (term), and (expression) as exhibited in the syntax tree given in the next figure.

A set of rules known as a grammar is used to define the source language precisely.

A grammar can be used by the syntax analyzer to determine the structure of the source program.

This recognition process is called parsing, and consequently we often refer to syntax analyzers as parsers.
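
As a rough illustration of parsing against a grammar, the sketch below builds a syntax tree for expressions such as (A + B)*(C + D). The grammar it assumes is not given in the text: expression is one or more terms joined by +, term is one or more factors joined by *, and factor is an identifier or a parenthesized expression. The function name `parse` and the nested-tuple tree layout are also assumptions.

```python
def parse(tokens):
    """Recursive-descent parse of a token list into a nested-tuple tree."""
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expression():
        # expression ::= term { "+" term }
        children = [term()]
        while peek() == "+":
            children += [advance(), term()]
        return ("expression", *children)

    def term():
        # term ::= factor { "*" factor }
        children = [factor()]
        while peek() == "*":
            children += [advance(), factor()]
        return ("term", *children)

    def factor():
        # factor ::= identifier | "(" expression ")"
        token = peek()
        if token == "(":
            advance()
            inner = expression()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return ("factor", "(", inner, ")")
        if token is not None and token.isalpha():
            return ("factor", advance())
        raise SyntaxError(f"unexpected token {token!r}")

    tree = expression()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return tree
```

Calling `parse(list("(A+B)*(C+D)"))` yields a tree whose nonleaf nodes are labelled expression, term, and factor and whose leaves are the individual tokens.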

3- The syntax tree produced by the syntax analyzer is used by the semantic analyzer.

The function of the semantic analyzer is to determine the meaning (or semantics) of the source program.

Although it is conceptually desirable to separate the syntax of a source program from its semantics, the syntax and semantic analyzers work in close cooperation.

For an expression such as (A + B)*(C + D), for example, the semantic analyzer must determine what actions are specified by the arithmetic operators of addition and multiplication.

When the parser recognizes an operator such as "+" or "*," it invokes a semantic routine which specifies the action to be performed.

This routine may check that the two operands to be added have been declared, that they have the same type (if not, the routine would probably convert them to the same type), and that both operands have values.

The semantic analyzer often interacts with the various tables of the compiler in performing its task.
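
A semantic routine of this kind might look roughly like the sketch below. The function name `check_addition`, the table layout (a dictionary from names to "integer" or "real"), and the rule that mixed operands are widened to real are all assumptions; the check that both operands have values is omitted.

```python
def check_addition(left, right, symbols):
    """Check the operands of an addition and return the result type.

    symbols is a hypothetical symbol table mapping each declared
    name to its type, either "integer" or "real".
    """
    for name in (left, right):
        if name not in symbols:
            raise NameError(f"{name} has not been declared")
    left_type, right_type = symbols[left], symbols[right]
    if left_type == right_type:
        return left_type
    # Mixed operands: assume the integer operand is converted to real.
    return "real"
```

The lookup in `symbols` is the kind of interaction with the compiler's tables that the text describes.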

The semantic-analyzer actions may involve the generation of an intermediate form of source code.

For the expression (A+B)*(C+D), the intermediate source code might be the following set of quadruples:

(+, A, B, T1)
(+, C, D, T2)
(*, T1, T2, T3)



- (+, A, B, T1) is interpreted to mean "add A and B and place the result in temporary T1,"

- (+, C, D, T2) is interpreted to mean "add C and D and place this result in T2,"

- (*, T1, T2, T3) is interpreted to mean "multiply T1 and T2 and place the result in T3."
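
The quadruples can be produced by walking an expression tree from the bottom up. In the sketch below, the tree layout (nested tuples of the form operator, left, right) and the function name `generate_quadruples` are assumptions chosen for brevity.

```python
def generate_quadruples(tree):
    """Return quadruples (operator, left, right, result) for a tree.

    A tree is either an operand name (a string) or a tuple
    (operator, left_subtree, right_subtree).
    """
    quads = []

    def visit(node):
        if isinstance(node, str):
            return node  # an operand names itself
        operator, left, right = node
        left_name = visit(left)
        right_name = visit(right)
        temp = f"T{len(quads) + 1}"  # next unused temporary
        quads.append((operator, left_name, right_name, temp))
        return temp

    visit(tree)
    return quads
```

For the tree `("*", ("+", "A", "B"), ("+", "C", "D"))` this yields the three quadruples listed above, in the same order.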

An infix expression may be converted to an intermediate form called Polish notation.

Using this approach, the infix expression (A+B)*(C+D) would be converted to the equivalent suffix-Polish expression AB+CD+*.
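
One common way to perform this conversion is an operator stack (the shunting-yard approach); the text does not say which method is intended, so the sketch below is only one possibility. It assumes single-character operands, the operators + and * only, and a well-formed input expression.

```python
PRECEDENCE = {"+": 1, "*": 2}


def infix_to_suffix(expression):
    """Convert an infix expression to suffix-Polish form."""
    output, stack = [], []
    for symbol in expression.replace(" ", ""):
        if symbol.isalnum():
            output.append(symbol)  # operands go straight to the output
        elif symbol == "(":
            stack.append(symbol)
        elif symbol == ")":
            # Flush operators back to the matching parenthesis.
            while stack[-1] != "(":
                output.append(stack.pop())
            stack.pop()
        else:
            # Pop operators of equal or higher precedence first.
            while (stack and stack[-1] != "("
                   and PRECEDENCE[stack[-1]] >= PRECEDENCE[symbol]):
                output.append(stack.pop())
            stack.append(symbol)
    while stack:
        output.append(stack.pop())
    return "".join(output)
```

Here `infix_to_suffix("(A+B)*(C+D)")` returns `AB+CD+*`, and the parentheses are no longer needed because the operator order encodes the grouping.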

4- The output of the semantic analyzer is passed on to the code generator.

It uses the data structures produced by the earlier phases to generate a form of code, perhaps in the form of simple code skeletons or macros, or assembler or even high-level code for processing by an external assembler or separate compiler.

The major difference between intermediate code and actual machine code is that intermediate code need not specify in detail such things as the exact machine registers to be used, the exact addresses to be referred to, and so on.

At this point the intermediate form of the source-language program is usually translated to either assembly language or machine language.

As an example, the translation of the three quadruples for the previous expression can yield the following sequence of single-address, single-accumulator assembly-language instructions:

LDA A     Load the contents of A into the accumulator.
ADD B     Add the contents of B to that of the accumulator.
STO T1    Store the accumulator contents in temporary storage T1.
LDA C     Load the contents of C into the accumulator.
ADD D     Add the contents of D to that of the accumulator.
STO T2    Store the accumulator contents in temporary storage T2.
LDA T1    Load the contents of T1 into the accumulator.
MUL T2    Multiply the contents of T2 by that of the accumulator.
STO T3    Store the accumulator contents in temporary storage T3.
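
The translation from quadruples to this listing is mechanical: each quadruple becomes a load, an operation, and a store. A minimal sketch, assuming only the + and * operators and the hypothetical function name `generate_assembly`:

```python
OPCODES = {"+": "ADD", "*": "MUL"}


def generate_assembly(quadruples):
    """Expand each quadruple into three single-address instructions."""
    code = []
    for operator, left, right, result in quadruples:
        code.append(f"LDA {left}")                   # load left operand
        code.append(f"{OPCODES[operator]} {right}")  # combine with right
        code.append(f"STO {result}")                 # save the result
    return code
```

Applied to the three quadruples for (A+B)*(C+D), this reproduces the nine instructions shown above. It makes no attempt to avoid redundant stores and loads.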



Our example statement might produce intermediate code equivalent to

Then again, it might produce something like

5- The output of the code generator is passed on to a code optimizer.

This phase, present in more sophisticated compilers, may optionally be provided in an attempt to improve the intermediate code in the interests of speed or space or both.

Its purpose is to produce a more efficient object program.

Certain optimizations that are possible at a local level include the evaluation of constant expressions, the use of certain operator properties such as associativity, commutativity, and distributivity, and the detection of common sub-expressions.

To use the same example as before, obvious optimization would lead to code equivalent to

Because of the commutativity of the multiplication operator, the previous assembly code can be reduced to the following:
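
One plausible way to obtain such a reduction is a small peephole pass over the instruction list. The sketch below is an assumption about how the reduction could be done, not a reproduction of the original listing: it looks for the pattern STO t, LDA x, OP t with a commutative OP, and replaces it with OP x when t is not referenced afterwards. The function names are hypothetical.

```python
COMMUTATIVE = {"ADD", "MUL"}


def used_later(code, start, name):
    """True if name appears as an operand at or after index start."""
    return any(operand == name for _, operand in code[start:])


def remove_redundant_store_load(code):
    """Peephole pass over a list of (opcode, operand) pairs.

    STO t; LDA x; OP t  becomes  OP x  when OP is commutative,
    x differs from t, and t is never referenced again, because the
    accumulator already holds the value that was stored in t.
    """
    result = []
    i = 0
    while i < len(code):
        window = code[i:i + 3]
        if (len(window) == 3
                and window[0][0] == "STO"
                and window[1][0] == "LDA"
                and window[2][0] in COMMUTATIVE
                and window[2][1] == window[0][1]
                and window[1][1] != window[0][1]
                and not used_later(code, i + 3, window[0][1])):
            result.append((window[2][0], window[1][1]))
            i += 3
        else:
            result.append(code[i])
            i += 1
    return result
```

On the nine-instruction listing for (A+B)*(C+D), this pass drops the store to T2 and the reload of T1, leaving seven instructions: LDA A, ADD B, STO T1, LDA C, ADD D, MUL T1, STO T3.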

Let us examine some possible interactions between the lexical and syntax analyzers.

- One possibility is that the scanner generates a token for the syntax analyzer for processing. The syntax analyzer then "calls" the scanner when the next token is required.

- Another possibility is for the scanner to produce all the tokens corresponding to the source program before passing control to the syntax analyzer.

In this case the scanner has examined the entire source program; this is called a separate pass.
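
The difference between the two arrangements shows up in the order of events. In the sketch below, `scan`, `consume`, `on_demand`, and `separate_pass` are hypothetical names, tokens are assumed to be separated by blanks, and the "parser" merely records each token it receives.

```python
def scan(source, log):
    """Yield tokens one at a time, recording each scanning step."""
    for lexeme in source.split():
        log.append(f"scan {lexeme}")
        yield lexeme


def consume(tokens, log):
    """Stand-in for the parser: record each token as it is received."""
    for token in tokens:
        log.append(f"parse {token}")


def on_demand(source):
    """The parser pulls tokens from the scanner only when needed."""
    log = []
    consume(scan(source, log), log)
    return log


def separate_pass(source):
    """The scanner finishes the whole program before parsing starts."""
    log = []
    tokens = list(scan(source, log))
    consume(tokens, log)
    return log
```

With the input `A + B`, `on_demand` interleaves scanning and parsing token by token, while `separate_pass` records all three scan steps before the first parse step.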

Some compilers make as few as one pass, while other compilers have been known to make more than 30 passes (e.g., some of IBM's first PL/I compilers).

Factors which influence the number of passes to be used in a particular compiler include the following:

1. Available memory
2. Speed and size of compiler
3. Speed and size of object program
4. Debugging features required
5. Error-detection and -recovery techniques desired
6. Number of people and time required to accomplish the compiler-writing project
