
Intermediate Code Generation

The document discusses intermediate code generation, focusing on three-address code, its types, and representations such as quadruples, triples, and indirect triples. It explains the syntax-directed translation of abstract syntax trees into intermediate representations and the semantic rules for generating three-address code for various expressions and declarations. Additionally, it covers the role of symbol tables in managing variable scopes and types during code generation.

Intermediate Code Generation

Types of Three-Address Code, Representation, Declarations

Intermediate Code Generation
• Facilitates retargeting: enables attaching a back end for the new machine to an existing front end

Front end → Intermediate code → Back end → Target machine code

• Enables machine-independent code optimization


Intermediate Representations
• Graphical representations
  • AST
• Postfix notation: operations on values stored on operand stack
  • JVM bytecode
• Three-address code: x := y op z
  • Variation of three-address code, two-address code: x := op y
Syntax-Directed Translation of Abstract Syntax Trees

Production        Semantic Rule
S → id := E       S.nptr := mknode(‘:=’, mkleaf(id, id.entry), E.nptr)
E → E1 + E2       E.nptr := mknode(‘+’, E1.nptr, E2.nptr)
E → E1 * E2       E.nptr := mknode(‘*’, E1.nptr, E2.nptr)
E → - E1          E.nptr := mknode(‘uminus’, E1.nptr)
E → ( E1 )        E.nptr := E1.nptr
E → id            E.nptr := mkleaf(id, id.entry)
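
The mknode and mkleaf rules can be sketched in a few lines of Python. This is only an illustration: the Node class, the show helper, and the use of plain strings in place of symbol-table entries are assumptions made for the example, not part of the slides.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    op: str
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    entry: Optional[str] = None  # stands in for id.entry (assumption: a plain name)


def mkleaf(op, entry):
    """Leaf for an identifier, as in E -> id."""
    return Node(op, entry=entry)


def mknode(op, left, right=None):
    """Interior node; right stays None for unary operators such as uminus."""
    return Node(op, left, right)


def show(n):
    """Render the tree in prefix form so its shape is visible."""
    if n.op == "id":
        return n.entry
    kids = [show(c) for c in (n.left, n.right) if c is not None]
    return f"({n.op} {' '.join(kids)})"


# a * (b + c): the rule E -> ( E1 ) just passes E1.nptr up, so no node is built
# for the parentheses.
tree = mknode("*", mkleaf("id", "a"),
              mknode("+", mkleaf("id", "b"), mkleaf("id", "c")))
print(show(tree))  # (* a (+ b c))
```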
Abstract Syntax Trees
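[Figure: for a * (b + c), the parse tree annotated with E.nptr attributes, shown next to the resulting abstract syntax tree in which * has children a and +, and + has children b and c.]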
Abstract Syntax Trees versus DAGs

a := b * -c + b * -c

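[Figure: Tree versus DAG for this assignment. In the tree, + has two separate * subtrees, each with children b and uminus c. In the DAG, both operands of + point to one shared * node for b * -c.]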
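
One common way to obtain the DAG instead of the tree is to have mknode return an existing node whenever an identical one was already built. The sketch below assumes that approach; the nodes list, the index dictionary, and the term helper are names invented for this example.

```python
nodes = []   # node number i is described by the tuple nodes[i]
index = {}   # tuple -> node number, used to detect repeated subexpressions


def mknode(op, *kids):
    """Return the number of an existing identical node, or create a new one."""
    key = (op, *kids)
    if key not in index:
        index[key] = len(nodes)
        nodes.append(key)
    return index[key]


def mkleaf(name):
    return mknode("id", name)


def term():
    """Build b * -c; the second call finds every node already present."""
    return mknode("*", mkleaf("b"), mknode("uminus", mkleaf("c")))


# a := b * -c + b * -c
root = mknode(":=", mkleaf("a"), mknode("+", term(), term()))
print(len(nodes))  # 7 nodes, where the plain tree would need 11
```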
Postfix Notation

a := b * -c + b * -c

Postfix: a b c uminus * b c uminus * + assign
Postfix notation represents operations on a stack.

Bytecode (for example):
  iload 2    // push b
  iload 3    // push c
  ineg       // uminus
  imul       // *
  iload 2    // push b
  iload 3    // push c
  ineg       // uminus
  imul       // *
  iadd       // +
  istore 1   // store a
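
A post-order walk of the syntax tree yields the postfix string, and a small stack machine can then evaluate it. The following is a minimal sketch under stated assumptions: the tree is written as nested tuples, and the values b = 2 and c = 3 are made up for the demonstration.

```python
def postfix(node):
    """Post-order walk: operands first, then the operator."""
    if isinstance(node, str):
        return [node]
    op, *kids = node
    out = []
    for kid in kids:
        out += postfix(kid)
    return out + [op]


def evaluate(tokens, env):
    """Run the postfix tokens on an operand stack; names are resolved on use."""
    stack = []

    def val(x):
        return env[x] if isinstance(x, str) else x

    for tok in tokens:
        if tok == "uminus":
            stack.append(-val(stack.pop()))
        elif tok in ("+", "*"):
            right, left = val(stack.pop()), val(stack.pop())
            stack.append(left + right if tok == "+" else left * right)
        elif tok == "assign":
            value, target = val(stack.pop()), stack.pop()
            env[target] = value
        else:
            stack.append(tok)  # an identifier is pushed as a name
    return env


tree = ("assign", "a", ("+", ("*", "b", ("uminus", "c")),
                             ("*", "b", ("uminus", "c"))))
print(" ".join(postfix(tree)))  # a b c uminus * b c uminus * + assign
print(evaluate(postfix(tree), {"b": 2, "c": 3})["a"])  # -12
```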
Three-Address Code

a := b * -c + b * -c

Linearized representation of a syntax tree:
  t1 := - c
  t2 := b * t1
  t3 := - c
  t4 := b * t3
  t5 := t2 + t4
  a := t5

Linearized representation of a syntax DAG:
  t1 := - c
  t2 := b * t1
  t5 := t2 + t2
  a := t5
Three address code
• In three-address code there is at most one operator on the right side of an instruction
• Example: a + a * (b - c) + (b - c) * d, whose DAG shares the node for b - c

  t1 = b - c
  t2 = a * t1
  t3 = a + t2
  t4 = t1 * d
  t5 = t3 + t4
Types of three address codes
• x := y op z
• x := op y
• x := y
• goto L
• if x goto L and ifFalse x goto L
• if x relop y goto L
Types of three address code
• Procedure calls using:
• param x
• call p, n
• y := call p, n
• return y
• x := y[i] and x[i] := y
• x := &y
• x := *y and *x := y
Example
• do i = i+1; while (a[i] < v);

Symbolic labels            Position numbers
L: t1 = i + 1              100: t1 = i + 1
   i = t1                  101: i = t1
   t2 = i * 8              102: t2 = i * 8
   t3 = a[t2]              103: t3 = a[t2]
   if t3 < v goto L        104: if t3 < v goto 100
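
Going from symbolic labels to position numbers is a two-pass job: record where each label lands, then patch the jump targets. The sketch below is illustrative only; the resolve function and its text-based instruction format are assumptions, and a real compiler would work on instruction records rather than strings.

```python
def resolve(lines, base=100):
    """Replace symbolic labels by position numbers starting at base."""
    position = {}
    body = []
    # Pass 1: strip leading "L:" labels and remember their positions.
    for line in lines:
        first, _, rest = line.strip().partition(" ")
        if first.endswith(":"):
            position[first[:-1]] = base + len(body)
            line = rest
        body.append(line.strip())
    # Pass 2: patch every "goto <label>" with the recorded position.
    out = []
    for i, instr in enumerate(body):
        words = instr.split()
        if len(words) >= 2 and words[-2] == "goto" and words[-1] in position:
            words[-1] = str(position[words[-1]])
        out.append(f"{base + i}: {' '.join(words)}")
    return out


code = ["L: t1 = i + 1", "i = t1", "t2 = i * 8", "t3 = a[t2]", "if t3 < v goto L"]
for line in resolve(code):
    print(line)
```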


Representing three address codes
• Quadruples
• Has four fields: op, arg1, arg2 and result
• The contents of arg1, arg2, and result are usually pointers to the symbol table
entries for the names represented by these fields
• Unary operators like x := -y do not use arg2
• Operators like param use neither arg2 nor result
• Conditional and unconditional jumps put the target label in result
Three address code

Example: a := b * -c + b * -c

Three-address code:
• t1 := - c
• t2 := b * t1
• t3 := - c
• t4 := b * t3
• t5 := t2 + t4
• a := t5

Quadruples:
      Op       Arg1   Arg2   Result
(0)   uminus   c             t1
(1)   *        b      t1     t2
(2)   uminus   c             t3
(3)   *        b      t3     t4
(4)   +        t2     t4     t5
(5)   :=       t5            a
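
As a data structure, a quadruple is just a four-field record. The sketch below mirrors the table above; the Quad class and the to_text helper are names chosen for this example, and plain strings stand in for symbol-table pointers.

```python
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    op: str
    arg1: Optional[str]
    arg2: Optional[str]    # None for unary operators and copies
    result: Optional[str]


quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("uminus", "c", None, "t3"),
    Quad("*", "b", "t3", "t4"),
    Quad("+", "t2", "t4", "t5"),
    Quad(":=", "t5", None, "a"),
]


def to_text(q):
    """Turn a quadruple back into a three-address statement."""
    if q.op == ":=":
        return f"{q.result} := {q.arg1}"
    if q.arg2 is None:
        return f"{q.result} := {q.op} {q.arg1}"
    return f"{q.result} := {q.arg1} {q.op} {q.arg2}"


for q in quads:
    print(to_text(q))
```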
Representing three address codes
• Triples
• To avoid entering temporary names into the symbol table, we can refer to a
temporary value by the position of the statement that computes it
• Has 3 fields: op, arg1, and arg2
• arg1, and arg2 are either pointers to the symbol table (for programmer-
defined names or constants) or pointers to the triple structure (for temporary
values)
• Parenthesized numbers represent pointers into the triple structure itself
Example
• Example: a := b * -c + b * -c

Three-address code:
• t1 := - c
• t2 := b * t1
• t3 := - c
• t4 := b * t3
• t5 := t2 + t4
• a := t5

Triples:
      Op       Arg1   Arg2
(0)   uminus   c
(1)   *        b      (0)
(2)   uminus   c
(3)   *        b      (2)
(4)   +        (1)    (3)
(5)   :=       a      (4)
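
Triples can be derived from quadruples by replacing every temporary name with the position of the triple that computes it. This is a sketch of that idea, not a prescribed algorithm; the to_triples function is hypothetical, and Python integers play the role of the parenthesized positions.

```python
def to_triples(quads):
    """Drop the result field; refer to temporaries by triple position."""
    where = {}     # temporary name -> index of the triple that computes it
    triples = []
    for op, arg1, arg2, result in quads:
        a1 = where.get(arg1, arg1)
        a2 = where.get(arg2, arg2)
        if op == ":=":
            # Copy: target goes in arg1, the value (a position) in arg2.
            triples.append((op, result, a1))
        else:
            where[result] = len(triples)
            triples.append((op, a1, a2))
    return triples


quads = [
    ("uminus", "c", None, "t1"),
    ("*", "b", "t1", "t2"),
    ("uminus", "c", None, "t3"),
    ("*", "b", "t3", "t4"),
    ("+", "t2", "t4", "t5"),
    (":=", "t5", None, "a"),
]
triples = to_triples(quads)
for i, t in enumerate(triples):
    print(f"({i})", t)
```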
Triples for arrays – ternary operation
A ternary operation such as x[i] := y needs two entries in the triple structure.

x[i] := y

      op       arg1   arg2
(0)   []=      x      i
(1)   assign   (0)    y

x := y[i]

      op       arg1   arg2
(0)   =[]      y      i
(1)   assign   x      (0)
Representing three address codes
• Indirect triples
• In addition to the triples themselves, we keep a separate list of pointers to triples
• The list names the triples in the desired execution order
Example

Statement list        Triples
                            Op       Arg1   Arg2
(30)  (0)             (0)   uminus   c
(31)  (1)             (1)   *        b      (0)
(32)  (2)             (2)   uminus   c
(33)  (3)             (3)   *        b      (2)
(34)  (4)             (4)   +        (1)    (3)
(35)  (5)             (5)   :=       a      (4)
Comparison of Representations
• The benefit of quadruples over triples shows up in an optimizing compiler, where instructions are often moved around.
• With quadruples, if we move an instruction that computes a temporary t, then the instructions that use t require no change.
• With triples, the result of an operation is referred to by its position, so moving an instruction may require us to change all references to that result.
• This problem does not occur with indirect triples: an optimizing compiler can move an instruction by reordering the instruction list, without affecting the triples themselves.
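
The point about indirect triples can be checked with a toy interpreter: reordering the statement list changes the execution order while the triple table stays untouched. The run function, the order lists, and the input values b = 2 and c = 3 are assumptions made for this sketch.

```python
triples = [
    ("uminus", "c", None),   # (0)
    ("*", "b", 0),           # (1)
    ("uminus", "c", None),   # (2)
    ("*", "b", 2),           # (3)
    ("+", 1, 3),             # (4)
    (":=", "a", 4),          # (5)
]


def run(triples, order, env):
    """Execute the triples in the order given by the statement list."""
    value = {}   # triple position -> computed value

    def val(x):
        return value[x] if isinstance(x, int) else env[x]

    for i in order:
        op, a1, a2 = triples[i]
        if op == "uminus":
            value[i] = -val(a1)
        elif op == "*":
            value[i] = val(a1) * val(a2)
        elif op == "+":
            value[i] = val(a1) + val(a2)
        elif op == ":=":
            env[a1] = val(a2)
    return env


original = [0, 1, 2, 3, 4, 5]
moved = [0, 2, 1, 3, 4, 5]   # triple (2) now runs before triple (1)
print(run(triples, original, {"b": 2, "c": 3})["a"])  # -12
print(run(triples, moved, {"b": 2, "c": 3})["a"])     # -12
```

With plain triples, physically moving entry (2) in front of entry (1) would change both positions, so the references in (3) and (4) would have to be rewritten.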
SDT into Three address code
• Three-address code is constructed from the constructs of the grammar
• Non-terminals have attributes
• Code – sequence of three-address statements evaluating the expression
• Place – address/name that holds the value
• Value
Three address code for expression
Production        Semantic Rules
S → id := E ;     S.code = E.code || gen(top.get(id.lexeme) ‘:=’ E.addr)
E → E1 + E2       E.addr = new Temp()
                  E.code = E1.code || E2.code ||
                           gen(E.addr ‘:=’ E1.addr ‘+’ E2.addr)
E → - E1          E.addr = new Temp()
                  E.code = E1.code || gen(E.addr ‘:=’ ‘uminus’ E1.addr)
E → ( E1 )        E.addr = E1.addr
                  E.code = E1.code
Three address code
Production        Semantic Rules
E → E1 * E2       E.addr = new Temp()
                  E.code = E1.code || E2.code ||
                           gen(E.addr ‘:=’ E1.addr ‘*’ E2.addr)
E → id            E.addr = top.get(id.lexeme)
                  E.code = ‘’
Example
• a := b + c * d
S ⇒ id := E ⇒ id := E + E ⇒ id := E + E * E ⇒ id := id + id * id
The corresponding syntax tree has := at the root with children a and +; + has children b and *; * has children c and d.
Example
* node – new temp t1
  E1.addr = c, E2.addr = d
  t1 := c * d
+ node – new temp t2
  E1.addr = b, E2.addr = t1
  t2 := b + t1
Root node
  a := t2
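
The E.addr and E.code rules translate almost directly into a recursive function that returns an (addr, code) pair for each subexpression. The following is a minimal sketch; the translate function, the nested-tuple tree format, and the t1, t2, ... naming of temporaries are assumptions for the example.

```python
import itertools


def translate(assign):
    """Three-address code for (target, E), following the E.addr / E.code rules."""
    counter = itertools.count(1)

    def expr(e):
        """Return (E.addr, E.code) for the subexpression e."""
        if isinstance(e, str):                      # E -> id: no code
            return e, []
        if e[0] == "uminus":                        # E -> - E1
            addr1, code1 = expr(e[1])
            addr = f"t{next(counter)}"              # new Temp()
            return addr, code1 + [f"{addr} := uminus {addr1}"]
        op, e1, e2 = e                              # E -> E1 op E2
        addr1, code1 = expr(e1)
        addr2, code2 = expr(e2)
        addr = f"t{next(counter)}"                  # new Temp()
        return addr, code1 + code2 + [f"{addr} := {addr1} {op} {addr2}"]

    target, e = assign
    addr, code = expr(e)
    return code + [f"{target} := {addr}"]           # S -> id := E


for line in translate(("a", ("+", "b", ("*", "c", "d")))):
    print(line)
```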
Declarations – Three address code
• Declarations can appear inside a procedure, so we need to track the scope of variables and need a symbol table
• Computing the addresses of variables and other names is done by the semantic rules related to three-address code
• Type, width, offset
Declarations
Production                   Semantic rules
P → D                        { offset = 0 }   (set before D is processed)
D → D ; D
D → id : T                   { enter(id.name, T.type, offset);
                               offset = offset + T.width }
T → integer                  T.type = integer;
                             T.width = 4;
T → real                     T.type = real;
                             T.width = 8;
T → array [ num ] of T1      T.type = array(num.val, T1.type);
                             T.width = num.val * T1.width;
Declarations
Production                   Semantic rules
T → ↑ T1                     T.type = pointer(T1.type);
                             T.width = 4;
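
The offset bookkeeping in these rules amounts to a running sum of type widths. Below is a small sketch using the widths from the slides (integer 4, real 8, pointer 4); the width and layout functions and the tuple encoding of types are assumptions introduced for the example.

```python
def width(t):
    """Width in bytes of a type term, using the sizes from the rules above."""
    if t == "integer":
        return 4
    if t == "real":
        return 8
    if t[0] == "pointer":
        return 4
    if t[0] == "array":
        _, num, elem = t
        return num * width(elem)      # T.width = num.val * T1.width
    raise ValueError(f"unknown type: {t!r}")


def layout(decls):
    """Assign each declared name its offset, like enter(id.name, T.type, offset)."""
    table, offset = [], 0
    for name, t in decls:
        table.append((name, t, offset))
        offset += width(t)
    return table, offset


decls = [
    ("i", "integer"),
    ("x", "real"),
    ("a", ("array", 10, "integer")),
    ("p", ("pointer", "real")),
]
table, total = layout(decls)
for name, t, offset in table:
    print(name, offset)
print("total width:", total)
```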
Declarations
• P → D
• D → D ; D | id : T | proc id ; D ; S
A new symbol table is created when proc id ; D ; S is encountered
Symbol Table Functions
• mktable(previous) creates a new symbol table and returns a pointer
to the new table that is linked to a previous table in the outer scope.
The pointer previous is placed in the header of the new symbol table.
• enter(table, name, type, offset) creates a new entry for name in the symbol table pointed to by table
  • Arguments: address of the current table, variable name, type, and offset
• addwidth(table, width) records the cumulative width of all entries in
table in the header associated with this symbol table
Symbol Table functions
• enterproc(table, name, newtable) creates a new entry in table for procedure name with local scope newtable
  • Arguments: existing table, name of the procedure, address of the new table
• lookup(table, name) returns a pointer to the entry in the table for
name by following linked tables
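
The five functions can be sketched with a small class that keeps the link to the enclosing table and the width in its header. This is only one possible shape: the Table class, the dictionary used for entries, and the sample names x, t, and swap are assumptions made for the illustration.

```python
class Table:
    def __init__(self, previous):
        self.previous = previous   # header: link to the table of the outer scope
        self.width = 0             # header: cumulative width, set by addwidth
        self.entries = {}


def mktable(previous):
    return Table(previous)


def enter(table, name, type_, offset):
    table.entries[name] = {"type": type_, "offset": offset}


def addwidth(table, width):
    table.width = width


def enterproc(table, name, newtable):
    table.entries[name] = {"type": "proc", "table": newtable}


def lookup(table, name):
    """Search this table, then the enclosing ones, following the links."""
    while table is not None:
        if name in table.entries:
            return table.entries[name]
        table = table.previous
    return None


globals_ = mktable(None)
enter(globals_, "x", "integer", 0)
addwidth(globals_, 4)

local = mktable(globals_)          # scope of a procedure declared in globals_
enter(local, "t", "integer", 0)
addwidth(local, 4)
enterproc(globals_, "swap", local)

print(lookup(local, "x"))          # found through the link to the outer table
print(lookup(globals_, "t"))       # None: inner names are not visible outside
```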
Example
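struct S
{ int a;
  int b;
} s;

void swap(int a, int b)
{ int t;
  t = a;
  a = b;
  b = t;
}

[Figure: symbol tables for this program. Table nodes: globals (prev=nil) with entries s (0), swap, foo; Trec S (prev=nil) with entries a (0), b (4); Tfun swap (prev) with entries a (0), b (4), t (8) and width [12]; Tfun foo (prev) with width [0]. The two prev=nil tables carry the header widths [4] and [8] in the figure. Type nodes: Tint, Tref. Legend: (offset), [width].]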
Example
void foo()
{ …
swap(s.a, s.b);

}
Example function call
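Void foo( )            Void A()
{ call A()             { call D()
  call B()             }
  call C()
}

Calling stack
[Figure: snapshots of the calling stack while foo runs. Main and foo are at the bottom of every snapshot; above foo appear A (with D on top of A while A calls D), then B, then C.]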
Symbol Table semantic rules
Productions                  Semantic Rule
P → M D ; S                  { addwidth(top(tblptr), top(offset));
                               pop(tblptr); pop(offset) }
M → ε                        { t := mktable(nil); push(t, tblptr); push(0, offset) }
D → id : T                   { enter(top(tblptr), id.name, T.type, top(offset));
                               top(offset) := top(offset) + T.width }
D → proc id ; N D1 ; S       { t := top(tblptr); addwidth(t, top(offset));
                               pop(tblptr); pop(offset);
                               enterproc(top(tblptr), id.name, t) }
Symbol Table creation
Productions                  Semantic Rule
N → ε                        { t := mktable(top(tblptr)); push(t, tblptr); push(0, offset) }
D → D1 ; D2
T → integer                  { T.type := ‘integer’; T.width := 4 }
T → real                     { T.type := ‘real’; T.width := 8 }
T → array [ num ] of T1      { T.type := array(num.val, T1.type);
                               T.width := num.val * T1.width }
T → ^ T1                     { T.type := pointer(T1.type); T.width := 4 }
Symbol table tracking
• The tblptr stack keeps track of the currently open symbol tables
• When a new procedure declaration is entered, a symbol table is created and its pointer is pushed onto this stack, together with a fresh offset of 0
• When the procedure has been processed, this tblptr is popped
Declarations and Records in Pascal

Production                   Semantic rule
T → record L D end           { T.type := record(top(tblptr)); T.width := top(offset);
                               addwidth(top(tblptr), top(offset)); pop(tblptr);
                               pop(offset) }
L → ε                        { t := mktable(nil); push(t, tblptr); push(0, offset) }


SDT’s of Statements
• S → S ; S
• S → id := E
{ p := lookup(top(tblptr), id.name);
if p = nil then
error()
else if p.level = 0 then // global variable
emit(id.place ‘:=’ E.place)
else // local variable in subroutine frame
emit(fp[p.offset] ‘:=’ E.place) }
Assignment statements
• Names are kept in the symbol table
• Variables are referred to by their addresses (symbol-table entries)
• lookup(id.name) finds the entry for variable id in the symbol table
• emit writes three-address statements to the output file
Translation scheme
Production        Semantic Rules
S → id := E ;     { p := lookup(id.name);
                    if p ≠ nil then emit(p ‘:=’ E.place) else error }
E → E1 + E2       { E.place := newtemp();
                    emit(E.place ‘:=’ E1.place ‘+’ E2.place) }
E → E1 * E2       { E.place := newtemp();
                    emit(E.place ‘:=’ E1.place ‘*’ E2.place) }
E → - E1          { E.place := newtemp();
                    emit(E.place ‘:=’ ‘uminus’ E1.place) }
E → ( E1 )        { E.place := E1.place }
E → id            { p := lookup(id.name); if p ≠ nil then E.place := p else error }
Reusing Temporary Names
For E1 + E2, generate:
  Evaluate E1 into t1
  Evaluate E2 into t2
  t3 := t1 + t2

If t1 is no longer used, we can reuse t1 instead of using a new temp t3.
Modify newtemp() to use a “stack”:
• Keep a counter c, initialized to 0
• newtemp() returns temporary $c and then increments c
• Decrement the counter on each use of a $i in a three-address statement
Reusing temporary name

x := a * b + c * d - e * f

Statement            c
                     0
$0 := a * b          1
$1 := c * d          2
$0 := $0 + $1        1
$1 := e * f          2
$0 := $0 - $1        1
x := $0              0
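
The counter scheme can be sketched as a recursive code generator that decrements c for every $i operand it consumes and then hands out $c as the result temporary. The translate function and the nested-tuple expression format are assumptions for this example; it handles binary operators only.

```python
def translate(target, expr):
    """Generate code for target := expr, reusing temporaries via a counter."""
    c = 0
    code = []

    def gen(e):
        nonlocal c
        if isinstance(e, str):
            return e
        op, e1, e2 = e
        a1 = gen(e1)
        a2 = gen(e2)
        for a in (a1, a2):
            if a.startswith("$"):
                c -= 1            # each use of a $i frees its slot
        temp = f"${c}"            # newtemp(): return $c, then increment c
        c += 1
        code.append(f"{temp} := {a1} {op} {a2}")
        return temp

    addr = gen(expr)
    if addr.startswith("$"):
        c -= 1                    # the final copy also uses a temporary
    code.append(f"{target} := {addr}")
    return code, c


tree = ("-", ("+", ("*", "a", "b"), ("*", "c", "d")), ("*", "e", "f"))
code, c = translate("x", tree)
for line in code:
    print(line)
```

Only two temporaries, $0 and $1, are needed for this statement, where a fresh name per operation would use five.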
