0% found this document useful (0 votes)
100 views41 pages

06.intermediate Code Generation

The document discusses intermediate code generation in compilers. It explains that intermediate code facilitates retargeting compilers to different architectures and allows for machine-independent optimization. It presents various forms of intermediate code representations like syntax trees, postfix notation, and three-address code. Three-address code uses assignment statements of the form x := y op z. The document outlines how to generate three-address code from syntax trees and handle control flow statements using labels. It also discusses implementing three-address code using quads, triples or implicit triples and managing symbol tables and scope.

Uploaded by

Pavan Kumar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
100 views41 pages

06.intermediate Code Generation

The document discusses intermediate code generation in compilers. It explains that intermediate code facilitates retargeting compilers to different architectures and allows for machine-independent optimization. It presents various forms of intermediate code representations like syntax trees, postfix notation, and three-address code. Three-address code uses assignment statements of the form x := y op z. The document outlines how to generate three-address code from syntax trees and handle control flow statements using labels. It also discusses implementing three-address code using quads, triples or implicit triples and managing symbol tables and scope.

Uploaded by

Pavan Kumar
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 41

Intermediate Code Generation

dr. Hristo Nikolov


LIACS
Leiden University

Leiden, Fall 2011

Why Intermediate Code?


Front-end: generates intermediate representation
Back-end: generates target code
Why Intermediate code?
Retargeting is facilitated
A machine-independent code optimizer can be applied

Parser

Static
Checker

Intermediate
Code
Generator

Intermediate
Code

Code
Generator

Intermediate Languages
Syntax trees, Postfix notation,

Three-address code (x := y op z)
Example:

a := b * -c + b * -c
assign

Syntax
tree:

assign

*
b

uniminus

DAG:

uniminus

Postfix notation: a b c uniminus * b c uniminus * + assign

uniminus

Three-Address Code
Sequence of statements:

x := y op z
x+y*z
=>
Example:

t1 := y * z
t2 := x + t1

a := b * -c + b * -c

t1 := - c
t2 := b *t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
Linearized representation
of a syntax tree

t1
t2
t5
a

:=
:=
:=
:=

- c
b *t1
t2 + t2
t5

Linearized representation
of a syntax DAG

Types of Three-Address Statements

Assignment statements:
Indexed assignments:
Pointer assignments:
Copy statements:
Unconditional jumps:
Conditional jumps:
Function calls:
p(x1,..., xn) =>

x := y op z and x := op y
x := y[ i ] and x[ i ] := y
x := & y, x := * y, * x := y
x := y
goto L
if x relop y goto L
param x, call p, n, and return y
param x1
. . .
param xn
call p, n

Syntax-directed Translation into


Three-Address Code
Temporary names are created

E E1 + E2
e.g.

=> t := E1 + E2,
t5 := t2 + t4
a := t5

When expression is an identifier no new temporary


Nonterminal E has two attributes:
E.place the name which holds the value of E
E.code three-address code sequence

Syntax-directed Translation
To produce three-address code for assignments
Production
S id := E

Semantic Rule
S.code := E.code || gen( id.place := E.place )

E E 1 + E2 E.place := newtemp();
E.code := E1.code || E2.code || gen( E.place := E1.place + E2.place )
E E1 * E2

E.place := newtemp();
E.code := E1.code || E2.code || gen( E.place := E1.place * E2 .place )

E - E1

E.place := newtemp();
E.code := E1.code || gen( E.place := uminus E1.place )

E ( E1 )

E.place := E1.place
E.code := E.code

E id

E.place := id.place
E .code :=

Function newtemp() returns distinct names (t1, t2)


gen(x := y op z) => x := y op z

Flow-of-control Statements
While statements
Production
S while E do S1

S.begin :

Semantic Rule
S.begin := newlabel()
S.after := newlabel()
S.code := gen( S.begin : ) ||
E.code ||
gen ( if E.place = 0 goto S.after ) ||
S1.code ||
gen ( goto S.begin) ||
gen( S.after : )

E.code
...
if E.place = 0 goto S.after
S1.code
...
goto S.begin

S.after :

...

Flow-of-control Statements
Example
Source program
fragment

Three-address
code sequence

i := 2 * n + k
while i do
i := i - k

t1 := 2
t2 := t1 * n
t3 := t2 + k
i := t3
L1: if i = 0 goto L2
t4 := i - k
i := t4
goto L1
L2:

Implementation (1)
Quadruples, triples, and idirect triples

Quadruple is record: op, arg 1, arg 2, result


a := b * -c + b * -c
t1
t2
t3
t4
t5
a

:=
:=
:=
:=
:=
:=

- c
b *t1
- c
b * t3
t2 + t4
t5

Three-address code

op

arg 1

arg 2

result

(0)

uniminus

(1)

(2)

uniminus

(3)

t3

t4

(4)

t2

t4

t5

(5)

:=

t5

t1
t1

t2
t3

Quads (quadruples)
Pro: easy to rearrange code for global optimization
Cons: lots of temporaries

Implementation (2)
Quadruples, triples, and idirect triples

Triple is record: op, arg 1, arg 2


a := b * -c + b * -c
t1
t2
t3
t4
t5
a

:=
:=
:=
:=
:=
:=

- c
b *t1
- c
b * t3
t2 + t4
t5

Three-address code

op

arg 1

arg 2

(0)

uniminus

(1)

(2)

uniminus

(3)

(2)

(4)

(1)

(3)

(5)

:=

(4)

(0)

Triples
Pro: temporaries are implicit
Cons: difficult to rearrange code

Implementation (3)
Quadruples, triples, and idirect triples

Implicit Triples: pointers to triples


a := b * -c + b * -c
t1
t2
t3
t4
t5
a

:=
:=
:=
:=
:=
:=

- c
b *t1
- c
b * t3
t2 + t4
t5

Three-address code

statement

op

arg 1

arg 2

(0)

(14)

(14) uniminus

(1)

(15)

(15)

(2)

(16)

(16) uniminus

(3)

(17)

(17)

(16)

(4)

(18)

(18)

(15)

(17)

(5)

(19)

(19)

:=

(18)

Implicit Triples

c
b
c

Triple Container

Pro: temporaries are implicit and it is


easier to rearrange code

(14)

Names and Scope


The three-address code is simplistic
it assumes that the names of variables can be easily
resolved by the back end in global or local variables
We need local symbol tables to
record global and local declarations in procedures,
blocks, and structs to resolve names

Declarations in a Procedure
PMD
M

{ offset := 0 }

DD;D
D id : T

{ enter( id.name, T.type, offset);


offset := offset + T.width }

T integer

{ T.type := integer;
T.width := 4 }

T real

{ T.type := real;
T.width := 8 }

T array [ num ] of T1

{ T.type := array( num.val, T1.type );


T.width := num.val x T1.width }

T ^ T1

{ T.type := pointer( T1.type );


T.width := 4 }

Keeping Track of Scope Information


Nested procedures
PD
D D ; D | id : T | proc id ; D ; S
D proc id ; D1 ; S: then a new symbol table is created
The new table points to the enclosing procedures table

Procedure enter() knows in which table to make an entry

Symbol Table Operations


mktable( previous ): returns a pointer to a new table

which is linked to a previous table in the outer scope


enter( table, name, type, offset ): creates a new entry

in table
addwidth( table, width ): accumulates the total width

of all entries in table


enterproc( table, name, newtable ): creates a new

entry in table for procedure with local scope newtable

Processing Declarations
Nested Procedures
PMD

{ addwidth( top( tblptr ), top( offset ) );


pop( tblptr ); pop( offset ) }

{ t := mktable( nil );
push( t, tblptr ); push( 0, offset ) }

D D1 ; D2
D proc id ; N D1 ; S

{ t := top( tblptr );
addwidth( t, top( offset ) );
pop( tblptr ); pop( offset );
enterproc( top( tblptr ), id.name, t ) }

D id : T

{ enter( top ( tblptr ), id.name, T.type, top( offset ) );


top( offset ) := top( offset ) + T.width }

{ t := mktable( top( tblptr ) );


push( t, tblptr ); push( 0, offset ) }

Assignment Statements
A Translation Scheme
S id := E

{ p := lookup( id.name );
if p <> nil then
emit( p := E.place )
else error }

E E1 + E2

{ E.place := newtemp;
emit( E.place := E1.place + E2.place ) }

E E1 * E2

{ E.place := newtemp;
emit( E.place := E1.place * E2.place ) }

E - E1

{ E.place := newtemp;
emit( E.place := uniminus E1.place )}

E ( E1 )

{ E.place := E1.place }

E id

{ p := lookup( id.name );
if p <> nil then
E.place := p
else error }

Reusing Temporary Names


Each time newtemp() generates a new temporary name
Can we reuse temporaries?
E E1 + E2 results in:
evaluate E1 into t1
evaluate E2 into t2
t3 := t1 + t2
If t1 is no longer used, can reuse t1
instead of using new temp t3

Modify newtemp() to use a stack :


Keep a counter c, initialized to 0
newtemp() increments c and returns temporary $c
Decrement counter on each use of a $i in a three-address
statement

Reusing Temporary Names


Example: x := a * b + c * d e * f
Statement

Value of c
0

$0 := a * b

$1 := c * d

$0 := $0 + $1

$1 := e * f

$0 := $0 - $1

:= $0

Addressing Array Elements


One-Dimensional Arrays
A : array [low .. high] of type;

ith element of A begins in location

base + (i low) x W
base = A[low], W = sizeof( type )

t1
t2
t3

:=
:=
:=
:=

c
i * W
t1[t2]
t3

Compile time evaluation if rewritten as

i x W + ( base low x W )
c = base low x W is stored in symbol-table entry for A
A[i] = i x W + c

Addressing Array Elements


Two-Dimensional Arrays... (1)
A : array [1..2, 1..3] of integer;
First
Row

A[1,1]

A[1,1]

A[1,2]

A[2,1]

A[1,3]

A[1,2]

A[2,1]

A[2,2]

A[2,2]

A[1,3]

A[2,3]

A[2,3]

Row-major

Column-major

First
Column

In Row-major form A[i,j] is equivalent to A[i][j]

Addressing Array Elements


Two-Dimensional Arrays... (2)
A : array [1..2, 1..3] of integer; // Row-major form
A[i,j]= base+((ilowi)x Nj+j-lowj)x W
Nj = highj-lowj+1
((i x Nj) + j) x W + c, where
c = base - ((lowi x Nj) + lowj) x W
t1
t1
t2
t3
t4

:=
:=
:=
:=
:=
:=

i * 3
t1 + j
c
t1 * 4
t2[t3]
t4

Addressing Array Elements


Grammar
L := E
E+E
|(E)
|L
L
Elist ]
| id
Elist Elist , E
| id [ E
S
E

Synthesized Attributes:
E.place

Elist.array
Elist.place
Elist.ndim
L.place
L.offset

null

name of temp holding value of E


array name
name of temp holding index value
number of array dimensions
lvalue (=name of temp)
index into array (=name of temp)
indicates non-array simple id

Addressing Array Elements


A Translation Scheme ... (1)
S L := E

{ if L.offset = null then /* L is a simple id */


emit( L.place := E.place )
else
emit( L.place [ L.offset ] := E.place ) }

E E1 + E2

{ E.place := newtemp;
emit( E.place := E1.place + E2.place ) }

E ( E1 )

{ E.place := E1.place }

EL

{ if L.offset = null then /* L is a simple id */


E.place := L.place
else begin
E.place := newtemp;
emit( E.place :=L.place [ L.offset ] )
end }

Addressing Array Elements


A Translation Scheme... (2)
L Elist ]

{ L.place := newtemp; L.offset := newtemp


emit( L.place := c( Elist.array ) )
emit( L.offset := Elist.place * width( Elist.array )) }

L id

{ L.place := id.place;
L.offset := null }

Elist Elist 1, E

{ t := newtemp; m := Elist 1.ndim +1;


emit( t := Elist 1.place * limit( Elist 1.array, m ));
emit( t := t + E.place );
Elist.array := Elist 1.array;
Elist.place := t;
Elist.ndim := m }

Elist id [ E

{ Elist.array := id.place;
Elist.place := E.place;
Elist.ndim := 1 }

Translating Boolean Expressions


Composed by boolean operators and, or, not
Applied to boolean variables or relational expressions
Relational expression: E1 relop E2, where
E1 and E2 are arithmetic expressions
The grammar:
E E or E | E and E | not E | ( E ) | id relop id | true | false

Attribute op: <, , =, , >, represented by relop

Translating Boolean Expressions


Numerical representation
True = 1, false = 0
Example:
a or b and not c

t1 := not c
t2 := b and t1
t3 := a or t2

a < b: if a<b then 1 else 0


100:
101:
102:
103:
104:

if a
t :=
goto
t :=

< b goto 103


0
104
1

Translating Boolean Expressions


A Translation Scheme
E E1 or E2

{ E.place := newtemp;
emit( E.place := E1.place or E2.place ) }

E E1 and E2

{ E.place := newtemp;
emit( E.place := E1.place and E2.place ) }

E not E1

{ E.place := newtemp;
emit( E.place := not E1.place ) }

E ( E1 )

{ E.place := E1.place }

E id1 relop id2

{ E.place := newtemp;
emit( if id1.place relop.op id2.place goto nextstat + 3 );
emit( E.place := 0 );
emit( goto nextstat + 2 );
emit( E.place := 1 ) }

E true

{ E.place := newtemp;
emit( E.place := 1 ) }

E false

{ E.place := newtemp;
emit( E.place := 0 ) }

Backpatching
Code generation problem:
Labels (addresses) that control must go to may not be
known at the time the jump statements are generated
A solution - backpatching:
Generate branching statements with empty jumps
Put the statements in a goto list
Fill the labels of these statements when the proper label
is determined

Backpatching
Quadruples in a quadruple array
Labels are indexes into the array

Manipulate the list of lables:


makelist( i ) creates a new list containing three-address
location i, returns a pointer to the list
merge( p1 , p2 ) concatenates lists pointed to by p1 and

p2, returns a pointer to the concatenates list


backpatch( p, i ) inserts i as the target label for each of

the statements in the list pointed to by p

Backpatching
Grammar for boolean expressions
E E1 or M E2
| E1 and M E2
| not E1
| ( E1 )
| id1 relop id2
| true
| false
M

Synthesized Attributes:
E.code

E.truelist
E.falselist
M.quad

three-address code
backpatch list for jumps on true
backpatch list for jumps on false
location of current three-address
quad

Backpatching
A Translation Scheme... (1)
E E1 or M E2

{ backpatch( E1.falselist, M.quad );


E.truelist := merge( E1.truelist, E2.truelist );
E.falselist := E2.falselist }

E E1 and M E2

{ backpatch( E1.truelist, M.quad );


E.truelist := E2.truelist;
E.falselist := merge( E1.falselist, E2.falselist ) }

E not E1

{ E.truelist := E1.falselist;
E.falselist := E1.truelist }

E ( E1 )

{ E.truelist := E1.truelist;
E.falselist := E1.falselist }

Backpatching
A Translation Scheme... (2)
E id1 relop id2

{ E.truelist := makelist( nextquad );


E.falselist := makelist( nextquad + 1 );
emit( if id1.place relop.op id2.place goto_ );
emit( goto_ ) }

E true

{ E.truelist := makelist( nextquad );


emit( goto_ ) }

E false

{ E.falselist := makelist( nextquad );


emit( goto_ ) }

{ M.quad := nextquad }

Backpatching
Example: a<b or c<d and e<f
E id1 relop id2
E E1 or M E2
(M.quad := 102)
E E1 and M E2
(M.quad := 104)

backpatch( {102}, 104 )


backpatch( {101}, 102 )

100:
101:
102:
103:
104:
105:

if a < b goto_
goto_
if c < d goto_
goto_
if e < f goto_
goto_
backpatch( {102}, 104 )

backpatch( {101}, 102 )


100:
101:
102:
103:
104:
105:

if a
goto
if c
goto
if e
goto

< b goto TRUE


102
< d goto 104
FALSE
< f goto TRUE
FALSE

Flow-of-control Statements
Grammar
Synthesized Attributes:
S if E then S
| if E then S else S
| while E do S
| begin L end
| A
LL;S
| S

E.code
E.truelist
E.falselist
M.quad

S.nextlist
L.nextlist

three-address code
backpatch list for jumps on true
backpatch list for jumps on false
location of current three-address
quad
backpatch list for jumps to the
next statement after S (or nil)
backpatch list for jumps to the
next statement after L (or nil)

Flow-of-Control Statements
A Translation Scheme using backpatching... (1)
S if E then M1 S1 N
else M2 S2

{ backpatch( E.truelist, M1.quad );


backpatch( E.falselist, M2.quad );
S.nextlist := merge(S1.nextlist,
merge(N.nextlist,S2.nextlist) }

{ N.nextlist := makelist( nextquad );


emit( goto_) }

{ M.quad := nextquad }

S if E then M S1

{ backpatch( E.truelist, M.quad );


S.nextlist := merge( E.falselist, S1.nextlist) }

S wihle M1 E do
M2 S1

{ backpatch( S1.nextlist, M1.quad );


backpatch( E.truelist, M2.quad );
S.nextlist := E.falselist;
emit( goto_ M1.quad ) }

Flow-of-Control Statements
A Translation Scheme using backpatching... (2)
S begin L end

{ S.nextlist := L.nextlist }

SA

{ S.nextlist := nil }

L L1 ; M S

{ backpatch( L1.nextlist, M.quad );


L.nextlist := S.nextlist }

LS

{ L.nextlist := S.nextlist }

Intermediate Code Generation


Chapters for reading:
8.1 to 8.6

You might also like