4 - Intermediate Code Generation
Introduction
• The front end of a compiler translates a source program into machine-independent intermediate code.
• The back end uses the intermediate code to generate the target code.
01/06/2025 2
Why Generate Intermediate Code
• Enhances portability
  • Without intermediate code, a complete new compiler would be required for each different machine.
• Only one optimizer is needed for n machines.
  • Without intermediate code, we would need n optimizers for n different machines.
• It is easier to improve performance, because code-improving transformations can be applied to the intermediate code.
Types of Intermediate Code
• Design of an intermediate representation varies.
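• Linear form
  • Postfix notation
  • Three-address code
• Tree form
  • Syntax tree
  • Directed acyclic graph (DAG)
• Example: (a+b)*(a+b+c)
  • Postfix notation: ab+ab+c+*
  • Three-address code: begins with t1 = a + b
  • Syntax tree: root * with children + (a, b) and + (+ (a, b), c)
  • DAG: the same tree, but the common subexpression a + b is a single node shared by both parents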
Three-Address Code
• A three-address statement is an abstract form of intermediate code.
• Three-address code is a sequence of statements of the form x = y op z, where x, y, and z are names, constants, or compiler-generated temporaries, and op is an operator.
Three-Address Code
• Any source-language expression is converted into a sequence of three-address instructions.
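As a minimal sketch of this conversion, the following Python function walks an expression tree and emits one instruction per operator. It borrows Python's own `ast` module as the parser, which is an assumption made only for illustration. The function name `to_three_address` and the temporary names `t1`, `t2`, … are hypothetical.

```python
import ast
from itertools import count

# Operator symbols for the AST node types handled by this sketch.
OPS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def to_three_address(expr):
    """Translate an arithmetic expression into a list of three-address instructions."""
    code = []
    temps = count(1)

    def visit(node):
        # Returns the address (name, constant, or temporary) that holds the value.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        if isinstance(node, ast.BinOp) and type(node.op) in OPS:
            left = visit(node.left)
            right = visit(node.right)
            temp = f"t{next(temps)}"
            code.append(f"{temp} = {left} {OPS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    visit(ast.parse(expr, mode="eval").body)
    return code

for line in to_three_address("x + y * z"):
    print(line)
```

Each interior node of the tree gets its own temporary, so a repeated subexpression is computed again each time it appears.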
Three-Address Code
• Three-address code is a linearized representation of a syntax tree or a
DAG.
• Explicit names correspond to the interior nodes of the graph.
DAG for a + a * (b - c) + (b - c) * d, with the node for b - c shared, and the corresponding three-address code:
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
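The linearization of a DAG can be sketched in Python as follows. The nested-tuple encoding of the expression and the function name `dag_to_three_address` are assumptions for illustration. A dictionary keyed on (operator, left address, right address) stands in for the DAG nodes, so a shared subexpression is emitted only once.

```python
def dag_to_three_address(expr):
    """expr is a leaf name (str) or a nested tuple (op, left, right)."""
    code = []
    node_of = {}  # (op, left address, right address) -> temporary; one entry per interior DAG node

    def visit(e):
        if isinstance(e, str):
            return e
        op, left, right = e
        key = (op, visit(left), visit(right))
        if key not in node_of:
            # First time this node is seen: give it an explicit name and emit its instruction.
            node_of[key] = f"t{len(node_of) + 1}"
            code.append(f"{node_of[key]} = {key[1]} {op} {key[2]}")
        return node_of[key]

    visit(expr)
    return code

b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
for line in dag_to_three_address(expr):
    print(line)
```

For this input the sketch emits five instructions, with `t1 = b - c` appearing once even though b - c is used twice.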
Addresses and Instructions
• Three-address code is built from two concepts
  • Addresses
  • Instructions
• An address can be
  • A name
    • In an implementation, a source-program name is replaced by a pointer to its symbol-table entry.
  • A constant
  • A compiler-generated temporary
Instruction Forms
• Assignment instructions of the form x = y op z (binary operator) or x = op y (unary operator)
Instruction Forms
• Copy instructions of the form x = y
• An unconditional jump goto L
Instruction Forms
• Conditional jumps of the form if x goto L, ifFalse x goto L, or if x relop y goto L
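Instruction Forms
• Procedure calls and returns are implemented using the following instructions
  • param x for parameters
  • call p, n for procedure call
  • y = call p, n for function call
  • return y, where y is optional
• Typical use
  param x1
  param x2
  …
  param xn
  call p, n
• Indexed copy instructions of the form x = y[i] and x[i] = y
  • The instruction x[i] = y sets the content of the location i units beyond x to the value of y.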
Instruction Forms
• Address and pointer assignments of the form x = &y, x = *y, and *x = y
Implementation of Three-Address Code
• The instructions are implemented as objects or as records
  • With fields for operator and operands.
• Three such representations
  • Quadruples
  • Triples
  • Indirect Triples
Quadruples
• A quadruple has four fields
  • op
  • arg1
  • arg2
  • result
Quadruples
• Exceptions to the rule
  • Instructions with unary operators, like x = minus y
    • arg2 is not used.
  • Assignment operation x = y
    • arg2 is not used
    • op is =.
  • Operators like param use neither arg2 nor result
  • Conditional and unconditional jumps put the target label in result.
Quadruples
• Consider the statement
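A quadruple list can be sketched in Python as a list of four-field records. The statement a = b * -c + b * -c is an assumed example chosen for illustration, and the names `Quad` and `show` are hypothetical.

```python
from typing import NamedTuple, Optional

class Quad(NamedTuple):
    op: str
    arg1: Optional[str]
    arg2: Optional[str]   # None for unary operators and copies
    result: Optional[str]

# Assumed example statement: a = b * -c + b * -c
quads = [
    Quad("minus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("minus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad("=", "t5", None, "a"),
]

def show(q):
    """Render a quadruple as a three-address instruction."""
    if q.op == "=":                      # copy: op is '=', arg2 unused
        return f"{q.result} = {q.arg1}"
    if q.arg2 is None:                   # unary operator: arg2 unused
        return f"{q.result} = {q.op} {q.arg1}"
    return f"{q.result} = {q.arg1} {q.op} {q.arg2}"

for q in quads:
    print(show(q))
```

Because every result has an explicit name, an instruction can be moved without editing the others that refer to it.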
Triples
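• A triple contains only three fields: op, arg1, and arg2.
• The result of an operation is referred to by its position in the triple list, rather than by an explicit temporary name.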
Triples
• Consider the statement
[Table: the triples for the statement, at positions 0 to 5]
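A minimal Python sketch of triple generation follows, assuming the illustrative statement a = b * -c + b * -c (an assumption, chosen because it yields six triples at positions 0 to 5). The nested-tuple input format and the name `to_triples` are hypothetical. An integer operand refers to the result of the triple at that position.

```python
def to_triples(stmt):
    """stmt is ('=', target, expr); expr is a leaf name or (op, arg) or (op, left, right)."""
    triples = []

    def visit(e):
        if isinstance(e, str):
            return e
        op, *args = e
        operands = [visit(a) for a in args]
        operands += [None] * (2 - len(operands))   # pad unary operators with an unused arg2
        triples.append((op, operands[0], operands[1]))
        return len(triples) - 1                    # the position stands for the result

    _, target, expr = stmt
    value = visit(expr)
    triples.append(("=", target, value))
    return triples

neg_c = ("minus", "c")
stmt = ("=", "a", ("+", ("*", "b", neg_c), ("*", "b", neg_c)))
triples = to_triples(stmt)
for position, triple in enumerate(triples):
    print(position, triple)
```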
Triples
• A ternary operation like x[i] = y requires two entries in the triple structure.
• The same holds for x = y[i]: each of the two statements occupies entries 0 and 1 of its triple table.
Indirect Triples
• Consist of a listing of pointers to triples, rather than a listing of triples
themselves.
Instruction list (address, pointer to triple):
35 (0)
36 (1)
37 (2)
38 (3)
39 (4)
40 (5)
The pointers refer to entries 0 to 5 of the triple table.
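The separation between the pointer list and the triple table can be sketched in Python. The triple contents assume the illustrative statement a = b * -c + b * -c, and the helper name `move` is hypothetical. The addresses 35 to 40 follow the listing above.

```python
# Triple table for the assumed statement a = b * -c + b * -c.
# An int operand refers to the result of the triple at that position.
triples = [
    ("minus", "c", None),  # 0
    ("*", "b", 0),         # 1
    ("minus", "c", None),  # 2
    ("*", "b", 2),         # 3
    ("+", 1, 3),           # 4
    ("=", "a", 4),         # 5
]

BASE = 35  # address of the first instruction
# instruction_list[k] is the position of the triple executed at address BASE + k.
instruction_list = [0, 1, 2, 3, 4, 5]

def move(instrs, src, dst):
    """Move the instruction at address src to address dst by reordering pointers only."""
    pointer = instrs.pop(src - BASE)
    instrs.insert(dst - BASE, pointer)

snapshot = list(triples)
move(instruction_list, 37, 36)   # run triple 2 before triple 1; the two are independent
print(instruction_list)
```

Only the pointer list changes. The triples, and the positions they use to refer to each other, stay as they were.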
Comparison of the Implementations
• In quadruples, we can move instructions around without changing any other instruction, because results are named by explicit temporaries.
  • But quadruples require more space than the other representations.
• In indirect triples, we can move instructions by reordering the pointer list, without touching the actual triples.
  • But each instruction then requires two memory accesses.
Backpatching
• Three-address code has no loop constructs.
• Instead we will use jumps.
• But where should the program jump?
• We can use labels to indicate jump positions.
• A jump target may not be known when the jump is generated, so we will use a method named backpatching to fill in the labels afterwards.
Backpatching
• Consider the statement
if (a<b) then t = 1
else t = 0
Backpatching (Conditional Statement)
• Consider the next example
a<b and c<d or e<f
• The target of each jump is filled in by backpatching.
Address Code
1 if a<b goto 4
2 t1 = 0
3 goto 5
4 t1 = 1
5 if c<d goto 8
6 t2 = 0
7 goto 9
8 t2 = 1
9 if e<f goto 12
10 t3 = 0
11 goto 13
12 t3 = 1
13 t4 = t1 and t2
14 t5 = t4 or t3
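The backpatching steps for this example can be sketched in Python. Each jump is first emitted with a blank target, and the blank is filled in once the target address is known. The helper names `emit`, `backpatch`, `next_addr`, and `relational` are hypothetical.

```python
def generate():
    code = []  # code[k] holds the instruction at address k + 1

    def next_addr():
        return len(code) + 1

    def emit(instr):
        code.append(instr)
        return len(code)  # address of the instruction just emitted

    def backpatch(addr, target):
        # Fill the blank target of the jump stored at addr.
        code[addr - 1] = code[addr - 1].replace("____", str(target))

    def relational(cond, temp):
        """Emit code that sets temp to 1 if cond holds and to 0 otherwise."""
        jump_true = emit(f"if {cond} goto ____")  # target not known yet
        emit(f"{temp} = 0")
        jump_end = emit("goto ____")              # target not known yet
        backpatch(jump_true, next_addr())
        emit(f"{temp} = 1")
        backpatch(jump_end, next_addr())

    relational("a<b", "t1")
    relational("c<d", "t2")
    relational("e<f", "t3")
    emit("t4 = t1 and t2")
    emit("t5 = t4 or t3")
    return code

listing = [f"{addr} {instr}" for addr, instr in enumerate(generate(), start=1)]
print("\n".join(listing))
```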
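Backpatching (while Loop)
• Consider a while loop with condition E and body S: E is tested; if it is true (T), S executes and control returns to the test; if it is false (F), the loop exits.
• Testing for a false condition
L: if E == 0 goto L1
   S
   goto L
L1: …
• Testing for a true condition
L: if E goto L1
   goto L2
L1: S
   goto L
L2: …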
Backpatching (while Loop)
• Consider the statement while (a<b) do x = y+z
Address Code
101 if a<b goto 103
102 goto 106
103 t = y+z
104 x = t
105 goto 101
106 ……
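A minimal Python sketch of the same loop follows, assuming a small code buffer that starts at address 101. The class `CodeBuffer` and the function `gen_while` are hypothetical names. The two jumps at the top of the loop are emitted with blank targets and backpatched once the body and the exit address are known.

```python
class CodeBuffer:
    def __init__(self, start):
        self.start = start
        self.lines = []

    def next_addr(self):
        return self.start + len(self.lines)

    def emit(self, instr):
        addr = self.next_addr()
        self.lines.append(instr)
        return addr

    def backpatch(self, addr, target):
        index = addr - self.start
        self.lines[index] = self.lines[index].replace("____", str(target))

    def listing(self):
        return [f"{self.start + k} {line}" for k, line in enumerate(self.lines)]

def gen_while(buf, cond, body):
    begin = buf.next_addr()                   # address of the loop test
    enter = buf.emit(f"if {cond} goto ____")  # jump into the body, target unknown
    leave = buf.emit("goto ____")             # jump out of the loop, target unknown
    buf.backpatch(enter, buf.next_addr())     # the body starts here
    for instr in body:
        buf.emit(instr)
    buf.emit(f"goto {begin}")                 # back to the test
    buf.backpatch(leave, buf.next_addr())     # first address after the loop

buf = CodeBuffer(101)
gen_while(buf, "a<b", ["t = y+z", "x = t"])
print("\n".join(buf.listing()))
```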
Backpatching (for Loops)
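[Flowchart of a for loop with condition E2, update E3, and body S: E2 is tested; if it is true (T), S executes, then E3, and control returns to E2; if it is false (F), the loop exits.]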
Backpatching (for Loops)
• Consider a for statement that initializes i = 0 and loops while i<10
Address Code
320 i = 0
321 if i<10 goto 323
322 goto 328
323-326 (loop body, then the update of i)
327 goto 321
328 ……
Backpatching (switch case)
• Consider the switch-case statement
switch (i+j){
  case(1):
    a = b+c
    break;
  case(2):
    p = q+r
    break;
  default:
    x = y+z
}
Address Code
1 t = i+j
2 goto 12
3-4 (code for a = b+c)
5 goto 15
6-7 (code for p = q+r)
8 goto 15
9-10 (code for x = y+z)
11 goto 15
12 if t==1 goto 3
13 if t==2 goto 6
14 goto 9
15 ……
Addressing Array Elements
• In general, array elements are numbered 0, 1, …, n-1 for an array with n elements.
• If the width of each array element is w, then the ith element of array A begins in location
base + i × w
where base is the relative address of the storage allocated for the array (the relative address of A[0]).
Addressing Array Elements
• The formula can be generalized for multi-dimensional arrays.
• For a two-dimensional array, the location of A[i1][i2] is
base + i1 × w1 + i2 × w2
where w1 is the width of a row and w2 is the width of an element in a row.
Addressing Array Elements
• Array elements need not be numbered starting at 0.
• In a one-dimensional array, elements are numbered low, low+1, …, high and base is the relative address of A[low].
• Then the address of A[i] is
base + (i - low) × w
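The one-dimensional address calculation can be sketched in Python. The function names and the sample values (base 100, width 4) are assumptions for illustration. The second function shows the usual rearrangement in which the part that does not depend on i is computed once, ahead of time.

```python
def element_address(base, i, w, low=0):
    """Relative address of A[i] for elements of width w numbered from low."""
    return base + (i - low) * w

def element_address_precomputed(base, i, w, low=0):
    # base - low * w does not depend on i, so a compiler can compute it once.
    c = base - low * w
    return i * w + c

print(element_address(100, 3, 4))          # elements numbered from 0
print(element_address(100, 3, 4, low=1))   # elements numbered from 1
```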
Addressing Array Elements
• A multi-dimensional array is always stored in a one-dimensional memory.
  • So its elements must be laid out in a linear order.
• A two-dimensional array is normally stored in one of two forms
  • Row-major order
  • Column-major order
Addressing Array Elements
• Consider the array A with 2 rows and 3 columns
0,0 0,1 0,2
1,0 1,1 1,2
Row-major order   Column-major order
A[0,0]            A[0,0]
A[0,1]            A[1,0]
A[0,2]            A[0,1]
A[1,0]            A[1,1]
A[1,1]            A[0,2]
A[1,2]            A[1,2]
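The two layouts can be sketched as address functions in Python, assuming indices that start at 0. The function and variable names are hypothetical. Sorting the cells of the 2 × 3 array by address gives the storage order under each layout.

```python
def row_major_address(base, i, j, cols, w):
    """Address of A[i, j] when rows are stored one after another."""
    return base + (i * cols + j) * w

def column_major_address(base, i, j, rows, w):
    """Address of A[i, j] when columns are stored one after another."""
    return base + (j * rows + i) * w

ROWS, COLS = 2, 3
cells = [(i, j) for i in range(ROWS) for j in range(COLS)]

# Storage order under each layout (base 0, element width 1).
row_major_order = sorted(cells, key=lambda c: row_major_address(0, c[0], c[1], COLS, 1))
column_major_order = sorted(cells, key=lambda c: column_major_address(0, c[0], c[1], ROWS, 1))

print(row_major_order)
print(column_major_order)
```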
Addressing Array Elements
x = A[i][j]
Intermediate Code for Procedures
• Consider the function
Syntax Directed Translation to Intermediate Code
• Now let’s see how we can generate intermediate code from the annotated parse tree.
Postfix Notation
• Consider the grammar with semantic rules
E → E + T   { print('+') }
E → T       { }
T → T * F   { print('*') }
T → F       { }
F → num     { print(num.val) }
Postfix Notation
• Let’s try with an example: 2 + 3 * 4
• We will first create the parse tree.
E
  E
    T
      F
        num
  +
  T
    T
      F
        num
    *
    F
      num
Postfix Notation
• Let’s traverse the parse tree of 2 + 3 * 4 and see how postfix notation is created.
• Each semantic action prints when its production is reduced.
Output Buffer
2 3 4 * +
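The translation scheme can be sketched in Python. Because the grammar is left-recursive, this sketch replaces the recursion with loops in a recursive-descent parser, which is an assumption and not the parsing method of the slides. Each append happens at the point where the corresponding production would be reduced. The function name `to_postfix` is hypothetical, and the tokenizer simply skips characters it does not recognize.

```python
import re

def to_postfix(source):
    tokens = re.findall(r"\d+|[+*]", source)
    pos = 0
    out = []  # plays the role of the output buffer

    def parse_F():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isdigit():
            raise SyntaxError("number expected")
        out.append(tokens[pos])      # F -> num   { print(num.val) }
        pos += 1

    def parse_T():
        nonlocal pos
        parse_F()                    # T -> F     { }
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            parse_F()
            out.append("*")          # T -> T * F { print('*') }

    def parse_E():
        nonlocal pos
        parse_T()                    # E -> T     { }
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            parse_T()
            out.append("+")          # E -> E + T { print('+') }

    parse_E()
    if pos != len(tokens):
        raise SyntaxError("unexpected token")
    return out

print(" ".join(to_postfix("2 + 3 * 4")))
```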
Syntax Tree
• mknode: function to create a new node.
• nptr: pointer to a node.
• Node layout: left child pointer | value | right child pointer
• Let’s re-design the semantic rules.
Grammar      Semantic Rules
T → F        T.nptr = F.nptr
F → num      F.nptr = mknode(null, num.val, null)
Syntax Tree
• Let’s use the previous example.
Grammar      Semantic Rules
E → E + T    E.nptr = mknode(E1.nptr, '+', T.nptr)
T → F        T.nptr = F.nptr
F → num      F.nptr = mknode(null, num.val, null)
• Nodes created as the reductions are applied (address: left child pointer | value | right child pointer)
100: null | 2 | null
200: null | 3 | null
300: null | 4 | null
400: 200 | * | 300
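The node construction can be sketched in Python for the expression 2 + 3 * 4. The `Node` class and the `postorder` helper are assumptions for illustration, and object references stand in for the node addresses. The calls to `mknode` are written in the order in which the reductions would apply them.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Node:
    left: Optional["Node"]
    value: str
    right: Optional["Node"]

def mknode(left, value, right):
    """Create a new node and return a reference to it."""
    return Node(left, value, right)

# Leaves, created by the rule for F -> num.
n2 = mknode(None, "2", None)
n3 = mknode(None, "3", None)
n4 = mknode(None, "4", None)
# Interior nodes, created when T -> T * F and E -> E + T are reduced.
mul = mknode(n3, "*", n4)
root = mknode(n2, "+", mul)

def postorder(node):
    """Visit children before the node itself."""
    if node is None:
        return []
    return postorder(node.left) + postorder(node.right) + [node.value]

print(" ".join(postorder(root)))
```

A postorder walk of the finished tree yields the same postfix string as the output buffer of the earlier translation.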
Three-Address Code
• The semantic rules use a function that generates a three-address code.
Three-address Code
• Let’s generate code for the expression x = a + b * c
• Parse tree
S
  id (x)
  =
  E
    E
      T
        F
          id (a)
    +
    T
      T
        F
          id (b)
      *
      F
        id (c)
Three-address Code
• Generated code
t1 = b * c
t2 = a + t1
x = t2
• Annotated parse tree for x = a + b * c (place attribute of each node)
S
  id (x)
  =
  E  E.place = t2
    E  E.place = a
      T  T.place = a
        F  F.place = a
          id (a)
    +
    T  T.place = t1
      T  T.place = b
        F  F.place = b
          id (b)
      *
      F  F.place = c
        id (c)
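The use of the place attribute can be sketched in Python with one method per production, each returning the place of its left-hand side. The class `Translator` and the helper names `newtemp` and `gen` are assumptions for illustration. The methods are called in the order in which the reductions for x = a + b * c would occur.

```python
class Translator:
    def __init__(self):
        self.code = []
        self.temp_count = 0

    def newtemp(self):
        """Return a fresh temporary name."""
        self.temp_count += 1
        return f"t{self.temp_count}"

    def gen(self, instr):
        """Append one three-address instruction to the output."""
        self.code.append(instr)

    def F_id(self, name):            # F -> id: place is the name itself
        return name

    def T_F(self, f_place):          # T -> F
        return f_place

    def E_T(self, t_place):          # E -> T
        return t_place

    def T_mul(self, t_place, f_place):   # T -> T * F
        place = self.newtemp()
        self.gen(f"{place} = {t_place} * {f_place}")
        return place

    def E_add(self, e_place, t_place):   # E -> E + T
        place = self.newtemp()
        self.gen(f"{place} = {e_place} + {t_place}")
        return place

    def S_assign(self, name, e_place):   # S -> id = E
        self.gen(f"{name} = {e_place}")

tr = Translator()
e_place = tr.E_T(tr.T_F(tr.F_id("a")))                   # E.place = a
t_place = tr.T_mul(tr.T_F(tr.F_id("b")), tr.F_id("c"))   # T.place = t1
e_place = tr.E_add(e_place, t_place)                     # E.place = t2
tr.S_assign("x", e_place)
print("\n".join(tr.code))
```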
Three-address Code (while loop)
Production Semantic Rule
S.begin
S.after
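A hedged Python sketch of a semantic rule for a while loop follows. It assumes that S.begin labels the loop test and S.after labels the first instruction after the loop, and it follows the "if E == 0 goto" layout shown earlier for while loops. The function names `newlabel` and `gen_while`, the label names, and the sample condition and body are hypothetical.

```python
from itertools import count

_labels = count(1)

def newlabel():
    """Return a fresh label name."""
    return f"L{next(_labels)}"

def gen_while(cond_code, cond_place, body_code):
    begin = newlabel()   # plays the role of S.begin
    after = newlabel()   # plays the role of S.after
    return (
        [f"{begin}:"]
        + cond_code                                  # code that evaluates the condition
        + [f"if {cond_place} == 0 goto {after}"]     # leave the loop when the condition is false
        + body_code                                  # code for the loop body
        + [f"goto {begin}", f"{after}:"]
    )

code = gen_while(["t1 = a < b"], "t1", ["t2 = y + z", "x = t2"])
print("\n".join(code))
```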
Self Study
• Syntax Directed Definitions for Flow-of-Control Statements
End