Chapter 5 - ICG

Chapter Five of Compiler Design focuses on intermediate code generation, detailing its purpose, representation formats like syntax trees, postfix notation, and three address code. It explains the conversion process from syntax directed translation to three address code, the data structures used in its implementation, and techniques such as backpatching for handling control statements. The chapter also discusses various forms of three address code, including quadruples, triples, and indirect triples, emphasizing their roles in compiler design.

Uploaded by

lidelidetuwatiro

Compiler Design

Chapter Five: Intermediate code generation


The objectives of this chapter are as follows:
• Describe intermediate code generation and the various representation formats: syntax tree, postfix
notation and three-address code.
• Explain the conversion process from syntax-directed translation into three-address code.
• Explain the data structures used in the implementation of TAC: quadruples, triples and indirect
triples.
• Describe declarations and the backpatching technique used in code generation.
Intermediate code generation
• In a compiler, the front end translates a source program into an intermediate representation, and the
back end generates the target code from this intermediate representation.
• Intermediate code generation (ICG) is the final phase of the compiler front end.
• Goal: translate the program into the format expected by the compiler back end.
• Techniques for intermediate code generation can also be used for final code generation.
• The benefits of a machine-independent intermediate code (IC) are:
o retargeting to another machine is facilitated
o optimization can be done on the machine-independent code
• Intermediate languages can be represented in the form of
1. Syntax tree
2. Postfix notation
3. Three address code
Syntax tree
o While parsing the input, a syntax tree can be constructed. A syntax tree (abstract syntax tree) is a
condensed form of the parse tree, useful for representing language constructs.
o For example, for the string a+b, the parse tree in (a) below will be represented by the syntax tree shown
in (b); the keywords (syntactic sugar) that existed in the parse tree will not exist in the syntax tree.

Postfix notation
The postfix notation is practical for an intermediate representation as the operands are found just before the
operator. In fact, the postfix notation is a linearized representation of a syntax tree.
Example:
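1 + 2 * 3 will be represented in the postfix notation as 1 2 3 * + (multiplication binds tighter than
addition, so 2 * 3 is computed first and its result is then added to 1).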
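The claim that postfix notation is a linearized syntax tree can be checked with a short sketch. The `Leaf` and `Node` classes and the `to_postfix` function below are hypothetical names chosen for illustration, not part of the chapter; the sketch assumes binary operators only.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class Leaf:
    value: str            # a name or a constant


@dataclass
class Node:
    op: str               # a binary operator such as + or *
    left: "Tree"
    right: "Tree"


Tree = Union[Leaf, Node]  # illustrative type alias (assumption)


def to_postfix(tree: Tree) -> str:
    """Post-order walk: left operand, right operand, then the operator."""
    if isinstance(tree, Leaf):
        return tree.value
    return f"{to_postfix(tree.left)} {to_postfix(tree.right)} {tree.op}"


# 1 + 2 * 3: '*' binds tighter, so it is the right child of '+'.
expr = Node("+", Leaf("1"), Node("*", Leaf("2"), Leaf("3")))
print(to_postfix(expr))   # 1 2 3 * +
```

The same walk applied to the tree for (1 + 2) * 3 yields 1 2 + 3 *, which shows that postfix needs no parentheses to keep the two expressions apart.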
Three address code
The three-address code is a sequence of statements of the form:
X := Y op Z

Compiled by: Daniel T.

where X, Y and Z are names, constants or compiler-generated temporaries, and op is an operator such as
an integer or floating-point arithmetic operator, or a logical operator on Boolean data.
Important Notes:
o No built-up arithmetic expressions are permitted.
o Only one operator is allowed on the right side of an assignment, i.e. x + y + z is not possible.
o Like postfix notation, the three-address code is a linearized representation of a syntax tree.
o It is called three-address code because such an instruction usually contains three addresses (the
two operands and the result).

Types of three address statements


• As with an assembler statement, a three-address code statement can have symbolic labels, and there
are statements for control flow.
• Common three-address code statements:

Statement                           Format               Comments
1. Assignment (binary operation)    X := Y op Z          Arithmetic and logical operators used
2. Assignment (unary operation)     X := op Y            Unary -, not, conversion operators used
3. Copy statement                   X := Y
4. Unconditional jump               goto L
5. Conditional jump                 if X relop Y goto L
6. Function call:
   - Parameter specification        param X1             The parameters are specified using param
   - Calling the function           call P, N            The procedure P is called by indicating the number of parameters N
7. Indexed assignments              X := Y[I]            X is assigned the value at the address Y + I
                                    Y[I] := X            The value at the address Y + I is assigned X
8. Address and pointer assignments  X := &Y              X is assigned the address of Y
                                    X := *Y              X is assigned the element at the address Y
                                    *X := Y              The value at the address X is assigned Y

The choice of allowable operators is an important issue in the design of an intermediate form. It should be
rich enough to implement the operations of the source language, yet simple enough to be translated
easily into the target language.

o The three-address code for the input a:= x + y * z will be:


t1 := y * z
t2 := x + t1
a := t2
o TAC (Three Address Code) can range from high- to low-level, depending on the choice of operators.
o The general form is x := y op z, where "op" is an operator, x is the result, and y and z are operands.
x, y and z are variables, constants, or "temporaries". Each statement contains at most 3 addresses.
o The most common implementations of three-address code are quadruples, triples and indirect triples.
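The translation of a := x + y * z shown above can be reproduced with a small recursive walk over the syntax tree. This is a minimal sketch, not the chapter's translation scheme: the function name `gen_tac` and the tuple encoding of the tree are assumptions made for illustration, and only binary operators are handled.

```python
from itertools import count


def gen_tac(assignment):
    """Return three-address statements for an assignment (target, expr).

    expr is either a name/constant (a str) or a tuple (op, left, right).
    This encoding is an assumption made for the sketch.
    """
    code = []
    temps = count(1)                 # t1, t2, ... compiler-generated temporaries

    def walk(node):
        if isinstance(node, str):    # a name or constant needs no code
            return node
        op, left, right = node
        l = walk(left)               # operands are evaluated first ...
        r = walk(right)
        t = f"t{next(temps)}"        # ... then one new temporary per operator
        code.append(f"{t} := {l} {op} {r}")
        return t

    target, expr = assignment
    code.append(f"{target} := {walk(expr)}")
    return code


for line in gen_tac(("a", ("+", "x", ("*", "y", "z")))):
    print(line)
# t1 := y * z
# t2 := x + t1
# a := t2
```

Each interior node of the tree produces exactly one statement with one operator, which is why no built-up expression ever appears on a right-hand side.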
Quadruples
A quadruple is a record structure with four fields: one field to store the operator op, two fields to store
the operands (arguments) arg1 and arg2, and one field to store the result res:
res = arg1 op arg2
Example 1: in a = b + c, b is represented as arg1, c as arg2, + as op and a as res.
Unary operators like '-' do not use arg2. Operators like param use neither arg2 nor res. For conditional
and unconditional jumps, res holds the target label. arg1, arg2 and res are pointers to the symbol table
or literal table entries for the names.
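The four-field record can be sketched as follows. The class name `Quad` is hypothetical, and plain strings stand in for the symbol-table and literal-table pointers that a real compiler would store; the label `L1` is likewise made up for the example.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One quadruple. Strings stand in for symbol-table pointers (assumption)."""
    op: str
    arg1: Optional[str] = None
    arg2: Optional[str] = None
    res: Optional[str] = None


quads = [
    Quad("+", "b", "c", "a"),      # a = b + c : all four fields used
    Quad("-", "b", None, "t1"),    # t1 = -b   : unary operator, arg2 unused
    Quad("param", "t1"),           # param t1  : neither arg2 nor res used
    Quad("goto", res="L1"),        # goto L1   : res holds the target label
]

for q in quads:
    print(q.op, q.arg1, q.arg2, q.res)
```

Because every result has its own named field, statements can be reordered freely without rewriting the operands of other statements.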

Example: a = -b * d + c + (-b) * d
Three address code for the above statement is as follows:
t1 = - b
t2 = t1 * d
t3 = t2 + c
t4 = - b
t5 = t4 * d
t6 = t3 + t5
a = t6
Quadruples for the above example:
op    arg1  arg2  res
-     b           t1
*     t1    d     t2
+     t2    c     t3
-     b           t4
*     t4    d     t5
+     t3    t5    t6
=     t6          a
TRIPLES
Triples use only three fields in the record structure: one field for the operator and two fields for the
operands, named arg1 and arg2. The value of a temporary variable is accessed by the position of the
statement that computes it, and not by a named location as in quadruples.
Example: a = -b * d + c + (-b) * d
Triples for the above example are as follows:
      op    arg1  arg2
(0)   -     b
(1)   *     (0)   d
(2)   +     (1)   c
(3)   -     b
(4)   *     (3)   d
(5)   +     (2)   (4)
(6)   =     a     (5)
arg1 and arg2 may be pointers to the symbol table for program variables, pointers to the literal table for
constants, or pointers into the triple structure for intermediate results.
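The relation between the two representations can be made concrete by converting quadruples into triples: every temporary is replaced by the position of the triple that computes it. The function name `quads_to_triples` and the tuple layout are assumptions for this sketch. An integer operand stands for a pointer into the triple structure, and operands keep the order they have in the three-address code.

```python
def quads_to_triples(quads):
    """Convert (op, arg1, arg2, res) quadruples to (op, arg1, arg2) triples.

    Integers in the output are positions of earlier triples; strings are
    names or constants. This encoding is an assumption of the sketch.
    """
    position = {}                        # temporary name -> index of its triple
    triples = []
    for op, arg1, arg2, res in quads:
        a1 = position.get(arg1, arg1)    # replace a temporary by its position
        a2 = position.get(arg2, arg2)
        if op == "=":
            # copy statement: the target goes in arg1, the value in arg2
            triples.append((op, res, a1))
        else:
            position[res] = len(triples)
            triples.append((op, a1, a2))
    return triples


# Quadruples for a = -b * d + c + (-b) * d
quads = [
    ("-", "b", None, "t1"),
    ("*", "t1", "d", "t2"),
    ("+", "t2", "c", "t3"),
    ("-", "b", None, "t4"),
    ("*", "t4", "d", "t5"),
    ("+", "t3", "t5", "t6"),
    ("=", "t6", None, "a"),
]

for i, triple in enumerate(quads_to_triples(quads)):
    print(f"({i})", *triple)
```

The temporaries t1 to t6 disappear from the result entirely, which is the space saving that triples offer over quadruples.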
INDIRECT TRIPLES
Indirect triples consist of a listing of pointers to triples, rather than a listing of the triples themselves.
An optimizing compiler can move an instruction by reordering the instruction list, without affecting the
triples themselves.
Instruction  op    arg1  arg2
(0)          -     b
(1)          *     (0)   d
(2)          +     (1)   c
(3)          -     b
(4)          *     (3)   d
(5)          +     (2)   (4)
(6)          =     a     (5)
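The separation between the instruction list and the triples can be sketched as below. The names `order`, `move` and `refs_defined_before_use` are hypothetical helpers for illustration. Integers inside a triple are positions in the triple table, and operands follow the order of the three-address code (t2 = t1 * d).

```python
# Triple table for a = -b * d + c + (-b) * d (never modified below).
triples = [
    ("-", "b", None),   # (0)
    ("*", 0, "d"),      # (1)
    ("+", 1, "c"),      # (2)
    ("-", "b", None),   # (3)
    ("*", 3, "d"),      # (4)
    ("+", 2, 4),        # (5)
    ("=", "a", 5),      # (6)
]

# The instruction list: execution order, as pointers (indices) into `triples`.
order = [0, 1, 2, 3, 4, 5, 6]


def move(order, src, dst):
    """Return a new list with the entry at position src moved to position dst."""
    new_order = order.copy()
    new_order.insert(dst, new_order.pop(src))
    return new_order


def refs_defined_before_use(order, triples):
    """Check that every triple referenced as an operand is executed earlier."""
    seen = set()
    for index in order:
        _, arg1, arg2 = triples[index]
        if any(isinstance(a, int) and a not in seen for a in (arg1, arg2)):
            return False
        seen.add(index)
    return True


# Move the second -b and its multiplication ahead of the first addition.
reordered = move(move(order, 3, 2), 4, 3)
print(reordered)                                     # [0, 1, 3, 4, 2, 5, 6]
print(refs_defined_before_use(reordered, triples))   # True
```

Only the list of pointers changed. With plain triples the same move would renumber the statements and force every operand that refers to a moved triple to be rewritten.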
Declarations
The compiler uses a declaration as a source of type information, which it stores in the symbol table.
While processing the declaration, the compiler reserves a memory area for the variables and stores the
relative address of each variable in the symbol table. The relative address consists of an offset from
the base of the static data area.
This section uses a number of variables, attributes and procedures that help in processing
declarations. The compiler maintains a global variable offset that indicates the first address not yet
allocated. Initially, offset is set to 0. Each time an address is allocated to a variable, offset is
incremented by the width of the data object denoted by the name.
The procedure enter(name, type, address) creates a symbol table entry for name, and gives it the type
type and the relative address address.
The synthesized attributes type and width of the non-terminal T indicate the type and the number of
memory units taken by objects of that type.
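The bookkeeping described above can be sketched in a few lines. The `enter` procedure and the `offset` variable come from the text; the `SymbolTable` class, the `declare` helper and the widths in `WIDTH` (4 units for integer, 8 for real) are assumptions chosen for the example.

```python
# Assumed widths in memory units; the chapter does not fix these values.
WIDTH = {"integer": 4, "real": 8}


class SymbolTable:
    def __init__(self):
        self.offset = 0      # first relative address not yet allocated
        self.entries = {}    # name -> (type, relative address)

    def enter(self, name, type_, address):
        """Create a symbol table entry for name with its type and address."""
        self.entries[name] = (type_, address)

    def declare(self, name, type_):
        """Process one declaration (hypothetical helper): allocate, then advance offset."""
        self.enter(name, type_, self.offset)
        self.offset += WIDTH[type_]


table = SymbolTable()
for name, type_ in [("i", "integer"), ("x", "real"), ("j", "integer")]:
    table.declare(name, type_)

print(table.entries)   # i at 0, x at 4, j at 12
print(table.offset)    # 16
```

Each variable receives the current value of offset as its relative address, so the addresses depend only on the order of the declarations and the widths of the types.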

Backpatching
The main problem in generating code for control statements in a single pass is that, at the time the jump
statements are generated, we may not yet know the labels to which control must go. We can solve this
problem by generating jump statements whose targets are temporarily left unspecified. Each such
statement is put on a list of goto statements whose labels will be filled in once they are determined.
This technique is called backpatching, and it is widely used in three-address code generation for
boolean expressions and control statements in one pass. The idea is to maintain lists of incomplete
jumps, where all the jump instructions on a list have the same target. When the target becomes known,
all the instructions on its list are completed by filling in the target.
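The list handling can be illustrated with a minimal sketch for the statement if a < b or c < d then x := 1. The helper names `makelist`, `merge` and `backpatch` are the conventional textbook names, not names given in this chapter, and `emit` together with the two-element instruction layout is an assumption made for the example.

```python
code = []   # each instruction is [text, target]; target None means "not yet filled"


def emit(text, target=None):
    """Append an instruction and return its index."""
    code.append([text, target])
    return len(code) - 1


def makelist(i):
    """A new list containing only the incomplete jump at index i."""
    return [i]


def merge(p1, p2):
    """Concatenate two lists of jumps that will share one target."""
    return p1 + p2


def backpatch(jumps, target):
    """Fill in target for every incomplete jump on the list."""
    for i in jumps:
        code[i][1] = target


# First operand of 'or': a < b
e1_true = makelist(emit("if a < b goto"))
e1_false = makelist(emit("goto"))
backpatch(e1_false, len(code))          # if a < b fails, test the second operand

# Second operand of 'or': c < d
e2_true = makelist(emit("if c < d goto"))
e2_false = makelist(emit("goto"))

true_list = merge(e1_true, e2_true)     # both jumps go to the then-part
backpatch(true_list, len(code))         # the then-part starts at the next index
emit("x := 1")
backpatch(e2_false, len(code))          # the false exit skips the then-part

for i, (text, target) in enumerate(code):
    print(i, text, "" if target is None else target)
```

The two true exits are generated before the address of x := 1 is known. They sit on one merged list and are completed by a single backpatch call once that address becomes available.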
