Intermediate Code Generation
Generation?
compiler
details of source language are confined to the front end and the details
Figure: front end (Parser, Static Checker, Intermediate Code Generator) followed by back end (Code Generator)
Expressions.
We can easily show the common sub-expressions and then use that
Figure: DAG with operator nodes *, *, - and leaves a, b, d
Production      Semantic Rules
E -> E1 + T     E.node = new Node(+, E1.node, T.node)
E -> E1 - T     E.node = new Node(-, E1.node, T.node)
E -> T          E.node = T.node
T -> (E)        T.node = E.node
T -> id         T.node = new Leaf(id, id.entry)
T -> num        T.node = new Leaf(num, num.val)

Example
1)  p1 = Leaf(id, entry-a)
2)  p2 = Leaf(id, entry-a) = p1
3)  p3 = Leaf(id, entry-b)
4)  p4 = Leaf(id, entry-c)
5)  p5 = Node(-, p3, p4)
6)  p6 = Node(*, p1, p5)
7)  p7 = Node(+, p1, p6)
8)  p8 = Leaf(id, entry-b) = p3
9)  p9 = Leaf(id, entry-c) = p4
10) p10 = Node(-, p3, p4) = p5
11) p11 = Leaf(id, entry-d)
12) p12 = Node(*, p5, p11)
13) p13 = Node(+, p7, p12)
Figure: DAG for the expression a + a * (b - c) + (b - c) * d
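The node-sharing behaviour in the example steps can be sketched in Python as follows. This is a minimal illustration, not the slides' implementation: the class name DagBuilder and its methods leaf and node are hypothetical stand-ins for Leaf and Node, and nodes are identified by their position in a list rather than by pointers.

```python
class DagBuilder:
    """Creates DAG nodes, returning an existing node when an identical one exists."""

    def __init__(self):
        self.nodes = []   # node i is stored as a tuple, e.g. ("-", 1, 2)
        self.index = {}   # maps a node's tuple to its position in self.nodes

    def _lookup_or_create(self, key):
        if key not in self.index:
            self.index[key] = len(self.nodes)
            self.nodes.append(key)
        return self.index[key]

    def leaf(self, kind, value):
        return self._lookup_or_create((kind, value))

    def node(self, op, left, right):
        return self._lookup_or_create((op, left, right))


dag = DagBuilder()
# a + a * (b - c) + (b - c) * d, following steps 1 to 13 of the example
p1 = dag.leaf("id", "a")
p2 = dag.leaf("id", "a")        # reuses p1
p3 = dag.leaf("id", "b")
p4 = dag.leaf("id", "c")
p5 = dag.node("-", p3, p4)
p6 = dag.node("*", p1, p5)
p7 = dag.node("+", p1, p6)
p8 = dag.leaf("id", "b")        # reuses p3
p9 = dag.leaf("id", "c")        # reuses p4
p10 = dag.node("-", p8, p9)     # reuses p5
p11 = dag.leaf("id", "d")
p12 = dag.node("*", p5, p11)
p13 = dag.node("+", p7, p12)
print(len(dag.nodes))           # number of distinct nodes actually created
```

Looking a node up by its (op, left, right) signature before creating it is what turns the syntax tree into a DAG: the second request for b - c finds the node built in step 5.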
Figure: nodes of the DAG for i = i + 10 stored in an array (id with a pointer to the entry for i, num 10, +, =)
t1 = b - c
t2 = a * t1
t3 = a + t2
t4 = t1 * d
t5 = t3 + t4
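One way to obtain this sequence from the shared expression is a post-order walk that remembers which subexpressions already have a temporary. The sketch below assumes expressions are given as nested Python tuples (op, left, right) with names as strings; the function name three_address is hypothetical.

```python
def three_address(expr):
    """Return three-address instructions for a nested (op, left, right) tuple."""
    code = []
    temp_of = {}   # already-translated subexpression -> its temporary

    def visit(e):
        if isinstance(e, str):       # a name serves as its own address
            return e
        if e in temp_of:             # common subexpression: reuse the temporary
            return temp_of[e]
        op, left, right = e
        l = visit(left)
        r = visit(right)
        t = f"t{len(temp_of) + 1}"
        temp_of[e] = t
        code.append(f"{t} = {l} {op} {r}")
        return t

    visit(expr)
    return code


b_minus_c = ("-", "b", "c")
expr = ("+", ("+", "a", ("*", "a", b_minus_c)), ("*", b_minus_c, "d"))
for line in three_address(expr):
    print(line)
```

Because b - c is found in temp_of on its second visit, t1 is computed once and used in both t2 and t4.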
Addresses and Instructions
instructions.
An address can be one of the following:
A name - we allow source-program names to appear as
y.
4. An unconditional jump goto L. The three-address instruction with
instructions: param x for parameters; call p, n and y = call p, n for procedure and function calls, respectively; and return y, where y, representing a returned value, is optional. Their typical use is as the sequence of three-address instructions
    param x1
    param x2
    ...
    param xn
    call p, n
= y.
Example
do i = i+1; while (a[i] < v);

Symbolic labels
L:   t1 = i + 1
     i = t1
     t2 = i * 8
     t3 = a[t2]
     if t3 < v goto L

Position numbers
100: t1 = i + 1
101: i = t1
102: t2 = i * 8
103: t3 = a[t2]
104: if t3 < v goto 100
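The step from symbolic labels to position numbers can be sketched as a two-pass rewrite: first record the position of every label, then replace each jump target. The function name resolve_labels, the (label, instruction) pair format and the starting position 100 are assumptions made for this illustration.

```python
def resolve_labels(lines, start=100):
    """lines is a list of (label_or_None, instruction) pairs."""
    # Pass 1: record the position number of every labelled instruction.
    position = {}
    for offset, (label, _) in enumerate(lines):
        if label is not None:
            position[label] = start + offset

    # Pass 2: replace symbolic jump targets by position numbers.
    resolved = []
    for offset, (_, instr) in enumerate(lines):
        words = instr.split()
        if len(words) >= 2 and words[-2] == "goto" and words[-1] in position:
            words[-1] = str(position[words[-1]])
        resolved.append(f"{start + offset}: {' '.join(words)}")
    return resolved


loop = [
    ("L", "t1 = i + 1"),
    (None, "i = t1"),
    (None, "t2 = i * 8"),
    (None, "t3 = a[t2]"),
    (None, "if t3 < v goto L"),
]
for line in resolve_labels(loop):
    print(line)
```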
Triples
Temporaries are not used and instead references
Indirect triples
In addition to triples we use a list of pointers to triples
Quadruples
A quadruple (quad) has four fields, which we call op, arg1, arg2, and result.
Triples
A triple has only three fields, which we call op, arg1, and arg2
Using triples, we refer to the result of an operation x op y
structure itself.
Positions or pointers to positions were called value numbers.
Example
a = b * minus c + b * minus c
Quadruples
op      arg1    arg2    result
minus   c               t1
*       b       t1      t2
minus   c               t3
*       b       t3      t4
+       t2      t4      t5
=       t5              a
Triples
        op      arg1    arg2
0       minus   c
1       *       b       (0)
2       minus   c
3       *       b       (2)
4       +       (1)     (3)
5       =       a       (4)
t1 = minus c
t2 = b * t1
t3 = minus c
t4 = b * t3
t5 = t2 + t4
a = t5
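The relationship between the two representations can be sketched by converting the quadruples of this example into triples. The function name quads_to_triples is hypothetical, and the sketch assumes that every result other than the target of the final copy is a compiler temporary, which holds for this example but not for arbitrary code.

```python
quads = [
    ("minus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("minus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    ("=", "t5", None, "a"),
]


def quads_to_triples(quads):
    position = {}   # temporary name -> position of the triple that computes it
    triples = []

    def ref(arg):
        # A temporary is replaced by a reference to the triple that defines it.
        return f"({position[arg]})" if arg in position else arg

    for op, arg1, arg2, result in quads:
        if op == "=":
            # A copy keeps its target in arg1 and its source in arg2.
            triples.append(("=", result, ref(arg1)))
        else:
            position[result] = len(triples)
            triples.append((op, ref(arg1), ref(arg2)))
    return triples


for i, triple in enumerate(quads_to_triples(quads)):
    print(i, triple)
```

The result field disappears because each temporary is now named by the position of the triple that computes it.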
Indirect Triples
List of pointers to triples
35      (0)
36      (1)
37      (2)
38      (3)
39      (4)
40      (5)

        op      arg1    arg2
0       minus   c
1       *       b       (0)
2       minus   c
3       *       b       (2)
4       +       (1)     (3)
5       =       a       (4)
x = 1               x1 = 1
y = 2               y1 = 2
a = x + y           a1 = x1 + y1
a = a + 3           a2 = a1 + 3
b = x + y           b1 = x1 + y1
(a) Three-address code    (b) SSA form
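For straight-line code like this example, the renaming can be sketched as keeping a version counter per variable. The function name to_ssa is hypothetical, and the sketch deliberately ignores control flow, so it does not insert the φ-functions that full SSA construction needs where paths join.

```python
import re


def to_ssa(code):
    """Rename variables in straight-line assignments of the form 'x = expr'."""
    version = {}   # variable name -> number of its latest definition
    out = []

    def use(match):
        name = match.group(0)
        # A use refers to the latest version defined so far.
        return f"{name}{version[name]}" if name in version else name

    for line in code:
        target, rhs = [part.strip() for part in line.split("=", 1)]
        rhs = re.sub(r"[A-Za-z_]\w*", use, rhs)
        # Each definition creates a fresh version of the target.
        version[target] = version.get(target, 0) + 1
        out.append(f"{target}{version[target]} = {rhs}")
    return out


program = ["x = 1", "y = 2", "a = x + y", "a = a + 3", "b = x + y"]
for line in to_ssa(program):
    print(line)
```

The right-hand side is renamed before the target's counter is bumped, which is why a = a + 3 becomes a2 = a1 + 3.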
Translation Applications
From the type of a name, a compiler can determine the storage that will be
Type Expressions
Types have structure, which we shall represent using
language to be checked.
The array type int [2][3] can be read as "array of 2
Type Expressions
A basic type is a type expression
A type name is a type expression
A type expression can be formed by applying the array type constructor to a
types
If s and t are type expressions, then their Cartesian product s*t is a
type expression
Type expressions may contain variables whose values are type
expressions
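Type expressions of this kind can be sketched as small immutable records. The class names Basic, Array and Product below are hypothetical, chosen to mirror the basic types, the array constructor and the Cartesian product listed above; Python's generated dataclass equality compares fields recursively, so two values are equal exactly when they were built by the same constructors from equal parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Basic:
    name: str          # e.g. "integer", "float"


@dataclass(frozen=True)
class Array:
    size: int          # number of elements
    elem: object       # type expression of each element


@dataclass(frozen=True)
class Product:
    left: object       # s in the Cartesian product s*t
    right: object      # t in the Cartesian product s*t


integer = Basic("integer")
int_2_3 = Array(2, Array(3, integer))          # the array type int[2][3]
pair = Product(integer, Basic("float"))        # integer * float

print(int_2_3 == Array(2, Array(3, Basic("integer"))))
print(int_2_3 == Array(3, Array(2, integer)))
```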
Type Equivalence
When type expressions are represented by graphs (DAG),
are formed by applying the same constructor to
Declarations
type.
A basic type, such as a character, integer, or float, requires an integral number of bytes.
For easy access, storage for aggregates such as arrays and classes
Sequences of Declarations
Languages such as C and Java allow all the declarations in a
with its relative address set to the current value of offset, which
is then incremented by the width of the type of x.
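The offset bookkeeping described here can be sketched as follows. The widths in WIDTH (1 byte for char, 4 for int, 8 for float) and the function name layout are assumptions for illustration; real widths depend on the target machine.

```python
WIDTH = {"char": 1, "int": 4, "float": 8}   # assumed widths in bytes


def layout(declarations):
    """declarations is a list of (type_name, variable_name) pairs."""
    offset = 0
    relative_address = {}
    for type_name, name in declarations:
        # The name is entered with the current value of offset ...
        relative_address[name] = offset
        # ... which is then incremented by the width of the name's type.
        offset += WIDTH[type_name]
    return relative_address, offset


table, total = layout([("int", "x"), ("float", "y"), ("char", "c")])
print(table, total)
```

The final value of offset is the total storage needed for the whole sequence of declarations.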
Sequences of Declarations
following production
The fields in this record type are specified by the sequence of
declarations generated by D.
The field names within a record must be distinct; that is, a name
The use of a name x for a field within a record does not conflict with
q.
t2, . . .
Incremental Translation
Code attributes can be long strings, so they are usually
generated incrementally.
In the incremental approach, gen not only constructs a
Incremental Translation
the array.
location of A[i]
any number)
name.
L.array.base is used to determine the actual l-value of an array reference after
all the index expressions are analyzed.
Control Flow
The translation of statements such as if-else-statements and
Boolean Expressions
Boolean expressions are composed of the boolean operators
(which we denote &&, ||, and !, for the operators AND, OR, and
NOT, respectively) applied to elements that are boolean variables
or relational expressions.
Relational expressions are of the form E1 rel E2, where E1 and E2
Short-Circuit Code
Flow-of-Control Statements
Syntax-directed definition
    if x < 5 goto L2
    goto L3
L3: if x > 10 goto L4
    goto L1
L4: if x == y goto L2
    goto L1
L2: x = 3
L1:
There are three extra gotos.
One is a goto to the next statement.
Two others could be eliminated by using ifFalse.
Using the above rules, if ( x < 100 || x > 200 && x != y ) x = 0; translates to
    if x < 100 goto L2
    ifFalse x > 200 goto L1
    ifFalse x != y goto L1
L2: x = 0
L1:
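The unoptimized jumping code, with a goto after every test, can be sketched by passing a true label and a false label down the boolean expression. Everything named here (Labels, jumping_code, translate_if, the tuple encoding of expressions) is hypothetical, labels are emitted on their own lines, and the ifFalse improvement is not implemented.

```python
class Labels:
    """Hands out fresh labels L1, L2, ..."""

    def __init__(self):
        self.count = 0

    def new(self):
        self.count += 1
        return f"L{self.count}"


def jumping_code(b, true, false, labels, out):
    op = b[0]
    if op == "||":
        b1_false = labels.new()            # where to continue if B1 is false
        jumping_code(b[1], true, b1_false, labels, out)
        out.append(f"{b1_false}:")
        jumping_code(b[2], true, false, labels, out)
    elif op == "&&":
        b1_true = labels.new()             # where to continue if B1 is true
        jumping_code(b[1], b1_true, false, labels, out)
        out.append(f"{b1_true}:")
        jumping_code(b[2], true, false, labels, out)
    else:                                  # relational expression (rel, e1, e2)
        out.append(f"if {b[1]} {op} {b[2]} goto {true}")
        out.append(f"goto {false}")


def translate_if(cond, body):
    labels, out = Labels(), []
    s_next = labels.new()                  # label after the whole statement
    b_true = labels.new()                  # label of the statement body
    jumping_code(cond, b_true, s_next, labels, out)
    out.append(f"{b_true}:")
    out.append(body)
    out.append(f"{s_next}:")
    return out


cond = ("||", ("<", "x", "5"), ("&&", (">", "x", "10"), ("==", "x", "y")))
for line in translate_if(cond, "x = 3"):
    print(line)
```

The right operand of || and && is only reached through the fresh label, which is what makes the evaluation short-circuit.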
S -> id = E ; | if ( E ) S | while ( E ) S | S S
E -> E || E | E && E | E rel E | E + E | ( E ) | id | true | false
E.n represents syntax tree node for E.
Two methods jump and rvalue are used in evaluating expressions.
jump to generate jumping code at an expression node
Backpatching
A key problem when generating code for boolean expressions
incomplete, with the label field unfilled. These incomplete jumps are
placed on lists pointed to by B.truelist and B.falselist, as appropriate
array of instructions
Merge(p1,p2): concatenates the lists pointed to by p1 and p2 and
&& x != y
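The list manipulation behind backpatching can be sketched on the expression x < 100 || x > 200 && x != y. Only Merge and the truelist/falselist attributes appear in this section; the helpers emit, makelist, backpatch and rel, the "_" placeholder for an unfilled target, and the 0-based instruction numbering are assumptions made for this illustration.

```python
instructions = []                 # the array of instructions being generated


def emit(text):
    instructions.append(text)
    return len(instructions) - 1  # index of the instruction just emitted


def makelist(i):
    return [i]


def merge(p1, p2):
    return p1 + p2


def backpatch(p, target):
    # Fill in the missing target of every jump on list p.
    for i in p:
        instructions[i] = instructions[i][:-1] + str(target)


def rel(cond):
    # Two incomplete jumps; returns (truelist, falselist).
    t = emit(f"if {cond} goto _")
    f = emit("goto _")
    return makelist(t), makelist(f)


b1 = rel("x < 100")
m1 = len(instructions)            # first instruction of the right operand of ||
b2 = rel("x > 200")
m2 = len(instructions)            # first instruction of the right operand of &&
b3 = rel("x != y")

# B2 && B3: when B2 is true, control continues at B3.
backpatch(b2[0], m2)
and_true, and_false = b3[0], merge(b2[1], b3[1])

# B1 || (B2 && B3): when B1 is false, control continues at B2.
backpatch(b1[1], m1)
truelist, falselist = merge(b1[0], and_true), and_false

print(instructions, truelist, falselist)
```

The jumps still ending in "_" are exactly those on truelist and falselist; they are filled in later, once the enclosing statement knows where true and false exits lead.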
Flow-of-Control Statements
Translation of a switch-statement
Readings
Chapter 6 of the book