Lecture Notes Compiler Design Chapter-6
Intermediate Code Generation
Outline
• Intermediate representations
• Intermediate code generation
• Intermediate languages
• Syntax-Directed Translation of Abstract Syntax Trees
• Abstract Syntax Trees versus DAGs
• Three-Address Code
• Three-Address Statements
• Syntax-Directed Translation into Three-Address Code
• Implementation of Three-Address Statements:
– Quads, triples, indirect triples
• Three address code for an assignment statement and an
expression
Intermediate
Representations
• In a compiler, the front end translates the source program into an intermediate representation, and the back end generates the target code from this intermediate representation.
• The use of a machine-independent intermediate code (IC) is shown in the following pipeline:
[Figure: token stream → parser → syntax tree → type checker → parse tree → IC gen → IC code. A second diagram groups the parser, the type checker, and IC gen into ONE PASS.]
Intermediate Code
Generation
• The intermediate language can be one of many different languages; the designer of the compiler decides which one to use.
– A syntax tree can be used as an intermediate language.
– Postfix notation can be used as an intermediate language.
– Three-address code (quadruples) can be used as an intermediate language.
• We will use three-address code to discuss intermediate code generation.
• Three-address statements are close to machine instructions, but they are not actual machine instructions.
– Some programming languages have well-defined intermediate languages.
• Java – Java Virtual Machine
• Prolog – Warren Abstract Machine
• In fact, there are byte-code emulators that execute instructions in these intermediate languages.
Types of Intermediate Representations
Three major categories:
• Structural
– Graphically oriented
– Heavily used in source-to-source translators
– Tend to be large
– Examples: trees, DAGs
• Linear
– Pseudo-code for an abstract machine
– Level of abstraction varies
– Simple, compact data structures
– Easier to rearrange code
– Examples: three-address code, stack machine code
• Hybrid
– Combination of graphs and linear code
– Example: control-flow graph
Intermediate languages
• Syntax tree
• While parsing the input, a syntax tree can be constructed.
• A syntax tree (abstract syntax tree) is a condensed form of the parse tree that is useful for representing language constructs.
• For example, for the string a+b, the parse tree in (a) below can be represented by the syntax tree shown in (b).
• The keywords (syntactic sugar) that existed in the parse tree no longer exist in the syntax tree.
[Figure: (a) parse tree: E with children E, +, E, whose leaves are a and b; (b) abstract syntax tree: + with children a and b.]
Abstract Syntax Trees
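[Figure: parse tree and abstract syntax tree for a*(b+c). Parse tree: E with children E, *, E; the left E derives a, and the right E derives ( E ), whose inner E has children E + E deriving b and c. Abstract syntax tree: * with children a and +, where + has children b and c.]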
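Abstract Syntax Trees versus DAGs
[Figure: tree and DAG for a := b * -c + b * -c. In the tree, := has children a and +, and + has two separate subtrees of the form * (b, uminus (c)). In the DAG, both operands of + point to one shared node * (b, uminus (c)).]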
Syntax Tree representation
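[Figure: the syntax tree for a := b * -c + b * -c next to its node representation. The assignment node points to a leaf (id, a) and to a + node; each * node points to a leaf (id, b) and to a uminus node, which points to a leaf (id, c).]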
Three-Address Code
a := b * -c + b * -c
Linearized representation of a syntax tree:
t1 := - c
t2 := b * t1
t3 := - c
t4 := b * t3
t5 := t2 + t4
a := t5
Linearized representation of a syntax DAG:
t1 := - c
t2 := b * t1
t5 := t2 + t2
a := t5
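The two linearizations above can be reproduced with a short Python sketch. It is only an illustration, not the notes' own algorithm: the tuple encoding of expressions and the function name linearize are assumptions, and the DAG is obtained by caching the temporary of each subexpression that has already been translated.

```python
# Hypothetical encoding: ("id", name), ("uminus", e), ("*", l, r), ("+", l, r).

def linearize(target, expr, share):
    """Return three-address code for `target := expr`.

    With share=False every subtree gets its own temporaries (syntax tree).
    With share=True an already-translated subexpression is reused (DAG).
    """
    code = []
    cache = {}  # subexpression -> temporary that already holds its value

    def visit(e):
        if e[0] == "id":
            return e[1]
        if share and e in cache:
            return cache[e]
        if e[0] == "uminus":
            rhs = f"- {visit(e[1])}"
        else:
            left, right = visit(e[1]), visit(e[2])
            rhs = f"{left} {e[0]} {right}"
        temp = f"t{len(code) + 1}"  # one statement is emitted per temporary
        code.append(f"{temp} := {rhs}")
        if share:
            cache[e] = temp
        return temp

    result = visit(expr)
    code.append(f"{target} := {result}")
    return code


term = ("*", ("id", "b"), ("uminus", ("id", "c")))
expr = ("+", term, term)
tree_code = linearize("a", expr, share=False)
dag_code = linearize("a", expr, share=True)
print("\n".join(tree_code))
print("\n".join(dag_code))
```

The sketch numbers temporaries consecutively, so in the DAG version the sum is held in t3 rather than t5; the statements are otherwise the same as in the listing above.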
Three-Address Code
• A three-address statement has the form:
x := y op z
where x, y and z are names, constants or compiler-generated temporaries; op is any operator.
• For example, the expression x + y * z is translated into:
t1 := y * z
t2 := x + t1
Three-Address Statements
Binary Operator:
op y,z,result or result := y op z
Three-Address Statements (cont.)
Unary Operator:
op y,,result or result := op y
Three-Address Statements (cont.)
Conditional Jumps:
jmprelop y,z,L or if y relop z goto L
If the result of y relop z is true, control jumps to the three-address statement labeled L and execution continues from that statement. If the result is false, execution continues with the statement following the conditional jump.
Ex: jmpgt y,z,L1 // jump to L1 if y>z
jmpgte y,z,L1 // jump to L1 if y>=z
jmpe y,z,L1 // jump to L1 if y==z
jmpne y,z,L1 // jump to L1 if y!=z
Three-Address Statements (cont.)
Procedure Parameters: param x,, or param x
Procedure Calls: call p,n, or call p,n
where x is an actual parameter and the procedure p is invoked with n parameters.
Ex: the call p(x1,...,xn) is translated into:
param x1,,
param x2,,
...
param xn,,
call p,n,
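As a rough illustration of the param/call pattern, the following Python sketch emits one param statement per actual parameter followed by the call. The function name gen_call is an assumption, and the arguments are assumed to be names or temporaries that have already been evaluated.

```python
def gen_call(proc, args):
    """Emit `param x` for each actual parameter, then `call p, n`."""
    code = [f"param {arg}" for arg in args]
    code.append(f"call {proc}, {len(args)}")
    return code


call_code = gen_call("p", ["x1", "x2", "x3"])
print("\n".join(call_code))
```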
Three Address Statements
(summary)
• Assignment statements: x := y op z, x := op y
• Indexed assignments: x := y[i], x[i] := y
• Pointer assignments: x := &y, x := *y, *x := y
• Copy statements: x := y
• Unconditional jumps: goto L
• Conditional jumps: if y relop z goto L
• Function calls: param x… call p, n
return y
Syntax-Directed Translation into Three-
Address Code
• Syntax directed translation can be used to generate the
three-address code.
• Generally, either:
• the three-address code is generated as an attribute of
the attributed parse tree or
• the semantic actions have side effects that write the
three-address code statements in a file.
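Syntax-Directed Translation into Three-Address Code
• The following functions are used to generate three-address code (they appear in the semantic actions later in this chapter):
– newtemp(): returns a new temporary name (t1, t2, ...)
– newlabel(): returns a new label (L1, L2, ...)
– gen(...): produces a three-address statement from its arguments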
Syntax-Directed Translation into 3-
address code
• Deal with assignments.
• Use attributes:
– E.place: the name that will hold the value of E
• An identifier is assumed to already have the place attribute defined.
– E.code: holds the three-address code statements that evaluate E (this is the ‘translation’ attribute).
Syntax-Directed Translation into Three-
Address Code
Productions:
S → id := E
| while E do S
E → E + E
| E * E
| - E
| ( E )
| id
| num
Attributes used in the semantic rules:
S.code: three-address code for S
S.begin: label to start of S or nil
S.after: label to end of S or nil
E.code: three-address code for E
E.place: a name holding the value of E
Code generation: a call such as gen(E.place ‘:=’ E1.place ‘+’ E2.place) produces a statement such as
t3 := t1 + t2
Implementation of Three-Address
Statements:
• The description of three-address instructions specifies
the components of each type of instruction.
• However, it does not specify the representation of these
instructions in a data structure.
• In a compiler, these statements can be implemented as
objects or as records with fields for the operator and the
operands.
x [i] = y x = y [i]
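The outline names quads, triples and indirect triples as possible record layouts. The Python sketch below shows one way these could look for the DAG code of a := b * -c + b * -c. The class names, field names and the helper quads_to_triples are assumptions, and the conversion assumes every non-copy result is a compiler temporary that is assigned exactly once.

```python
from typing import NamedTuple, Optional, Union


class Quad(NamedTuple):
    op: str
    arg1: Optional[str]
    arg2: Optional[str]
    result: str  # quads name their result explicitly


class Triple(NamedTuple):
    op: str
    arg1: Union[str, int, None]  # an int refers to the triple at that position
    arg2: Union[str, int, None]


def quads_to_triples(quads):
    """Replace temporary names by the positions of the triples computing them."""
    position = {}  # temporary name -> index of the defining triple
    triples = []
    for q in quads:
        a1 = position.get(q.arg1, q.arg1)
        a2 = position.get(q.arg2, q.arg2)
        if q.op == ":=":
            triples.append(Triple(":=", q.result, a1))
        else:
            position[q.result] = len(triples)
            triples.append(Triple(q.op, a1, a2))
    return triples


quads = [
    Quad("uminus", "c", None, "t1"),
    Quad("*", "b", "t1", "t2"),
    Quad("+", "t2", "t2", "t5"),
    Quad(":=", "t5", None, "a"),
]
triples = quads_to_triples(quads)
# Indirect triples: a separate list of positions gives the execution order,
# so statements can be reordered without renumbering the triples themselves.
statement_order = list(range(len(triples)))
for index in statement_order:
    print(index, triples[index])
```

Quads take more space because of the result field, but moving a statement does not invalidate any operand; triples are more compact, and the extra order list of indirect triples restores the ability to reorder.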
Syntax-Directed Translation into Three-
Address Code
Three address code for an assignment statement and an
expression
Productions and semantic actions:
S → id := E
    S.code := E.code || gen (id.lexeme, ‘:=’, E.place); S.begin := S.after := nil
E → E1 + E2
    E.place := newtemp();
    E.code := E1.code || E2.code || gen (E.place, ‘:=’, E1.place, ‘+’, E2.place)
E → E1 * E2
    E.place := newtemp();
    E.code := E1.code || E2.code || gen (E.place, ‘:=’, E1.place, ‘*’, E2.place)
E → - E1
    E.place := newtemp();
    E.code := E1.code || gen (E.place, ‘:= uminus’, E1.place)
E → ( E1 )
    E.place := E1.place;
    E.code := E1.code
E → id
    E.place := id.lexeme;
    E.code := ‘’ /* empty code */
E → num
    E.place := newtemp();
    E.code := gen (E.place, ‘:=’, num.value)
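The semantic actions for assignments and expressions can be sketched in Python as a recursive function that returns the pair (E.place, E.code). This is a minimal illustration under assumptions: expressions are given as nested tuples rather than parsed text, parentheses are already resolved by the tuple nesting, and the class name Translator is hypothetical.

```python
import itertools


class Translator:
    def __init__(self):
        self._temps = itertools.count(1)

    def newtemp(self):
        """Return a fresh temporary name t1, t2, ..."""
        return f"t{next(self._temps)}"

    def expr(self, e):
        """Return (E.place, E.code) for an expression node."""
        kind = e[0]
        if kind == "id":      # E -> id: no code, place is the identifier
            return e[1], []
        if kind == "num":     # E -> num: load the constant into a temporary
            place = self.newtemp()
            return place, [f"{place} := {e[1]}"]
        if kind == "uminus":  # E -> - E1
            p1, c1 = self.expr(e[1])
            place = self.newtemp()
            return place, c1 + [f"{place} := uminus {p1}"]
        if kind in ("+", "*"):  # E -> E1 + E2 | E1 * E2
            p1, c1 = self.expr(e[1])
            p2, c2 = self.expr(e[2])
            place = self.newtemp()
            return place, c1 + c2 + [f"{place} := {p1} {kind} {p2}"]
        raise ValueError(f"unknown expression node: {kind!r}")

    def assign(self, name, e):
        """S -> id := E: E.code followed by the copy into id."""
        place, code = self.expr(e)
        return code + [f"{name} := {place}"]


# A := C * (B + D)
tree = ("*", ("id", "C"), ("+", ("id", "B"), ("id", "D")))
assign_code = Translator().assign("A", tree)
print("\n".join(assign_code))
```

The sketch allocates a temporary only after both operands have been translated, which is the bottom-up order in which the attributes become available.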
Syntax-Directed Translation (cont.)
Code layout for the statement while E do S1:
S.begin: E.code
if E.place = 0 goto S.after
S1.code
goto S.begin
S.after: ….
Syntax-Directed Translation (cont.)
S → while E do S1
    S.begin = newlabel();
    S.after = newlabel();
    S.code = gen(S.begin ‘:’) || E.code ||
             gen(‘jmpf’ E.place ‘,,’ S.after) || S1.code ||
             gen(‘jmp’ ‘,,’ S.begin) ||
             gen(S.after ‘:’)
S → if E then S1 else S2
    S.else = newlabel();
    S.after = newlabel();
    S.code = E.code ||
             gen(‘jmpf’ E.place ‘,,’ S.else) || S1.code ||
             gen(‘jmp’ ‘,,’ S.after) ||
             gen(S.else ‘:’) || S2.code ||
             gen(S.after ‘:’)
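The rules for while and if-then-else can be sketched in Python as follows. This is an illustration only: statements and expressions are given as nested tuples, the function names are assumptions, and the false-jump is written in the form if E.place = 0 goto L rather than as a jmpf quadruple.

```python
import itertools

_temps = itertools.count(1)
_labels = itertools.count(1)


def newtemp():
    return f"t{next(_temps)}"


def newlabel():
    return f"L{next(_labels)}"


def expr(e):
    """Return (place, code) for ("id", name) or (op, left, right)."""
    if e[0] == "id":
        return e[1], []
    op, left, right = e
    p1, c1 = expr(left)
    p2, c2 = expr(right)
    place = newtemp()
    return place, c1 + c2 + [f"{place} := {p1} {op} {p2}"]


def stmt(s):
    """Return the three-address code for a statement node."""
    kind = s[0]
    if kind == "assign":  # ("assign", name, expression)
        place, code = expr(s[2])
        return code + [f"{s[1]} := {place}"]
    if kind == "while":   # ("while", condition, body)
        begin, after = newlabel(), newlabel()
        place, cond = expr(s[1])
        return ([f"{begin}:"] + cond + [f"if {place} = 0 goto {after}"]
                + stmt(s[2]) + [f"goto {begin}", f"{after}:"])
    if kind == "if":      # ("if", condition, then_part, else_part)
        else_label, after = newlabel(), newlabel()
        place, cond = expr(s[1])
        return (cond + [f"if {place} = 0 goto {else_label}"] + stmt(s[2])
                + [f"goto {after}", f"{else_label}:"] + stmt(s[3])
                + [f"{after}:"])
    raise ValueError(f"unknown statement node: {kind!r}")


# while a < b do a := (a + b) * c
loop = ("while", ("<", ("id", "a"), ("id", "b")),
        ("assign", "a", ("*", ("+", ("id", "a"), ("id", "b")), ("id", "c"))))
while_code = stmt(loop)
print("\n".join(while_code))
```

The sketch keeps each label on its own line; a real generator might instead attach the label to the statement that follows it.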
Exercises
• Draw the decorated parse tree and generate
three-address code by using the translation
schemes given:
a) A := B + C
b) A := C * ( B + D)
c) while a < b do a := (a + b) * c
d) while a < b do a := a + b
e) a:= b * -c + b * -c
Three address code for A := B + C
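Decorated parse tree (children are indented under their parent):
S
  id (id.lexeme = A)
  :=
  E (E.place = t1; E.code: t1 = E1.place + E2.place, that is, t1 = B + C)
    E (E.place = B; E.code = “ ”)
      id (id.lexeme = B)
    +
    E (E.place = C; E.code = “ ”)
      id (id.lexeme = C)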
Three address code for A := C * (B + D)
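Decorated parse tree (children are indented under their parent):
S (S.code => t1 = B + D, t2 = C * t1, A = t2)
  id (id.lexeme = A)
  :=
  E (E.place = t2; E.code = t1 = B + D, t2 = C * t1)
    E (E.place = C; E.code = “ ”)
      id (id.lexeme = C)
    *
    E (E.place = E1.place = t1; E.code = E1.code)
      (
      E (E.place = t1; E.code = t1 = B + D)
        E (E.place = B; E.code = “ ”)
          id (id.lexeme = B)
        +
        E (E.place = D; E.code = “ ”)
          id (id.lexeme = D)
      )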
Exercises
Example 1:
while a < b do
a := (a + b) * c
Three-address code:
L1: t1 := a < b
if t1 = 0 goto L2
t2 := a + b
t3 := t2 * c
a := t3
goto L1
L2:
Example 2:
i := 2 * n + k
while i do
i := i - k
Three-address code:
t1 := 2
t2 := t1 * n
t3 := t2 + k
i := t3
L1: if i = 0 goto L2
t4 := i - k
i := t4
goto L1
L2:
How is this code obtained? Draw the decorated parse tree.
Three address code for Declarations
• A declaration is used by the compiler as a source of type information, which it stores in the symbol table.
Three address code for
Declarations
• The compiler maintains a global variable offset that indicates the first address not yet allocated.
• Initially, offset is set to 0.
• Each time an address is allocated to a variable, offset is incremented by the width of the data object denoted by the name.
• The procedure enter(name, type, address) creates a symbol table entry for name, giving it the type type and the relative address address.
• The synthesized attributes type and width of the non-terminal T indicate the type and the number of memory units taken by objects of that type.
Translation scheme for declaration
P → M D
M → ε { offset = 0 }
D → D ; D
D → id : T { enter(id.name, T.type, offset); offset = offset + T.width }
T → int { T.type = int; T.width = 4 }
T → real { T.type = real; T.width = 8 }
T → array[num] of T1 { T.type = array(num.val, T1.type); T.width = num.val * T1.width }
T → ^ T1 { T.type = pointer(T1.type); T.width = 4 }
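The effect of this translation scheme on a list of declarations can be sketched in Python. The tuple encoding of types, the function names width_of and layout, and the use of a dictionary as the symbol table are assumptions made for illustration; the widths are the ones given in the scheme.

```python
def width_of(t):
    """Return T.width for a type: int=4, real=8, pointer=4, array=count*element."""
    if t == "int":
        return 4
    if t == "real":
        return 8
    if t[0] == "pointer":      # ("pointer", target_type)
        return 4
    if t[0] == "array":        # ("array", count, element_type)
        _, count, element = t
        return count * width_of(element)
    raise ValueError(f"unknown type: {t!r}")


def layout(declarations):
    """Assign relative addresses to declared names, as enter() and offset do."""
    symtab = {}
    offset = 0                          # M -> epsilon { offset = 0 }
    for name, t in declarations:        # D -> id : T
        symtab[name] = (t, offset)      # enter(id.name, T.type, offset)
        offset += width_of(t)           # offset = offset + T.width
    return symtab, offset


symtab, total = layout([
    ("i", "int"),
    ("x", "real"),
    ("a", ("array", 10, "int")),
    ("p", ("pointer", "real")),
])
for name, (t, address) in symtab.items():
    print(name, t, address)
```

After the four declarations, offset holds the total storage needed, which is also the first address not yet allocated.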
Addressing array elements
• Elements of arrays can be accessed quickly if the elements are
stored in a block of consecutive locations.
A one-dimensional array A:
… …
Addressing Array elements(cont.)
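• Example for an array declared as A : array [10..20] of integer;
• If it is stored at address 100 and each integer is 4 units wide, the address of A[15] is
100 + (15 – 10) * 4 = 120
• In general, the address of A[i] is base + (i – low) * width. This can be rewritten as (base – low * width) + i * width, so the constant part c = base – low * width can be computed at compile time. The access A[i] is then translated into:
t1 := c // c = baseA – 10 * 4
t2 := i * 4
t3 := t1[t2]
…:= t3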
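The address arithmetic can be checked with a few lines of Python. The function names are assumptions; the numbers are the ones from the example (base 100, low 10, width 4).

```python
def element_address(base, low, width, i):
    """Address of A[i] computed directly: base + (i - low) * width."""
    return base + (i - low) * width


def constant_part(base, low, width):
    """Compile-time constant c = base - low * width."""
    return base - low * width


address = element_address(100, 10, 4, 15)
c = constant_part(100, 10, 4)
print(address)     # address of A[15]
print(c + 15 * 4)  # same address via the precomputed constant
```

Both expressions give the same address for every valid index, which is why only the multiplication i * width has to be done at run time.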
Addressing Array elements: Grammar
Grammar:
S → L := E
E → E + E
| E * E
| - E
| ( E )
| L
L → id [ E ]
| id
Synthesized attributes:
E.place: name of temp holding value of E
L.place: lvalue (= name of temp)
L.offset: index into array (= name of temp); null indicates a non-array simple id
Three-address code for assignment statement and
expressions (including array references)
S → L := E
    if L.offset = nil then /* L is a simple id */
      S.code := L.code || E.code || gen (L.place, ‘:=’, E.place)
    else
      S.code := L.code || E.code || gen (L.place, ‘[’, L.offset, ‘] :=’, E.place)
E → E1 + E2
    E.place := newtemp();
    E.code := E1.code || E2.code || gen (E.place, ‘:=’, E1.place, ‘+’, E2.place)
E → E1 * E2
    E.place := newtemp();
    E.code := E1.code || E2.code || gen (E.place, ‘:=’, E1.place, ‘*’, E2.place)
E → - E1
    E.place := newtemp();
    E.code := E1.code || gen (E.place, ‘:= uminus’, E1.place)
Three-address code for assignment statement and
expressions…
E → ( E1 )
    E.place := E1.place;
    E.code := E1.code
E → L
    if L.offset = nil then /* L is simple */
    begin
      E.place := L.place;
      E.code := L.code
    end
    else
    begin
      E.place := newtemp();
      E.code := L.code || gen (E.place, ‘:=’, L.place, ‘[’, L.offset, ‘]’)
    end
Three-address code for assignment statement and
expressions…
L → id [ E ]
    L.place := newtemp();
    L.offset := newtemp();
    L.code := E.code || gen (L.place, ‘:=’, base (id.lexeme) - width (id.lexeme) * low (id.lexeme)) || gen (L.offset, ‘:=’, E.place, ‘*’, width (id.lexeme))
L → id
    p := lookup (id.lexeme);
    if p <> nil then L.place := p.lexeme else error;
    L.offset := nil; /* for simple identifier */
    L.code := ‘’ /* empty code */
Example
• Three-address code generation for the input x := A[y]
• A is stored at address 100, its values are integers (width = 4), and low = 1.
• The semantic actions generate the following three-address code, where 96 = 100 - 4 * 1 is base - width * low:
t1 := 96
t2 := y * 4
t3 := t1 [t2]
x := t3
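The example can be reproduced with a small Python sketch of the rules L → id [ E ], E → L and S → L := E. It is a simplified illustration: the function name, the dictionary used as a symbol table, and the restriction to an index that is already a simple name are all assumptions.

```python
import itertools

# Hypothetical symbol-table entry holding base address, element width, lower bound.
arrays = {"A": {"base": 100, "width": 4, "low": 1}}


def translate_array_read(target, array, index, symtab):
    """Emit code for `target := array[index]`, where index is a simple name."""
    temps = itertools.count(1)

    def newtemp():
        return f"t{next(temps)}"

    info = symtab[array]
    place = newtemp()   # L.place: holds base - width * low
    offset = newtemp()  # L.offset: holds index * width
    code = [
        f"{place} := {info['base'] - info['width'] * info['low']}",
        f"{offset} := {index} * {info['width']}",
    ]
    value = newtemp()   # E.place for E -> L with an array reference
    code.append(f"{value} := {place}[{offset}]")
    code.append(f"{target} := {value}")
    return code


array_code = translate_array_read("x", "A", "y", arrays)
print("\n".join(array_code))
```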