Intermediate Code Generation

Download as pdf or txt
Download as pdf or txt
You are on page 1of 23

Intermediate Code Generation

Part I
Compiler Architecture

Intermediate
Code
Static Intermediate Code Target
Parser
Checker Code generator Generator language

Front End Back End

• m × n compliers can be built by writing m front ends


and n back ends – save considerable amount of effort
• We assume parsing, static checking and IC
generation is done sequentially
– These can be combined and done during parsing
• Static checking
– Operator operand compatibility
– Proper placement of break/continue keywords etc.
Intermediate Code (IC)

• The given program in a source language is converted


to an equivalent program in an intermediate language
by the IC generator.
• Ties the front and back ends together
• Language and Machine neutral
• Many forms
• Level depends on how being processed
• More than one intermediate language may be used by
a compiler
• Intermediate language can be many different languages, and
the designer of the compiler decides this intermediate language.
– syntax trees can be used as an intermediate language.
– postfix notation can be used as an intermediate language.
– three-address code (Quadraples) can be used as an intermediate
language
• we will use quadraples to discuss intermediate code generation
• quadraples are close to machine instructions, but they are not
actual machine instructions.
– some programming languages have well defined intermediate
languages.
• java – java virtual machine
• prolog – warren abstract machine
• In fact, there are byte-code emulators to execute instructions in
these intermediate languages.
Intermediate language levels

• High • Medium • Low


T1 a[i,j+2] t1 j+2 r1 [fp-4]
t2 i * 20 r2 r1 + 2
t3 t1 + t2 r3 [fp-8]
t4 4 * t3 r4 r3*20
t5 addr a r5 r4 + r2
t6 t5 + t 4 r6 4 * r5
t7 *t6 r7 fp – 216
f1 [r7+r6]
Intermediate Languages Types

• Graphical IRs:
– Abstract Syntax trees
– Directed Acyclic Graphs (DAGs)
– Control Flow Graphs
• Linear IRs:
– Stack based (postfix)
– Three address code (quadruples)
Graphical IRs

• Abstract Syntax Trees (AST) – retain essential


structure of the parse tree, eliminating unneeded
nodes.

• Directed Acyclic Graphs (DAG) – compacted AST to


avoid duplication – smaller footprint as well

• Control flow graphs (CFG) – explicitly model control


flow
ASTs and DAGs:

a := b *-c + b*-c
:= :=
a + a +

* * *
b - (uni) b - (uni) b - (uni)

c c c

AST DAG
Implementation of DAG/AST: Value Number Method
Three-Address Code

• A three-address code is:


x := y op z
where x, y and z are names, constants or compiler-generated
temporaries; op is any operator.

• But we may also the following notation for three-address code


(it looks like a machine code instruction)
op y,z,x
apply operator op to y and z, and store the result in x.

• We use the term “three-address code” because each statement


usually contains three addresses (two for operands, one for the
result).
Linearized Representation of DAG/AST

• Source Code
– a = b * -c + b * -c
• Three address code

• Tree Representation
Three-Address Statements

Binary Operator: op y,z,result or result :=


y op z
where op is a binary arithmetic or logical operator. This binary
operator is applied to y and z, and the result of the operation is
stored in result.
Ex: add a,b,c
gt a,b,c
addr a,b,c
addi a,b,c

Unary Operator: op y,,result or result :=


op y
where op is a unary arithmetic or logical operator. This unary
operator is applied to y, and the result of the operation is stored
in result.
Ex: uminus a,,c
not a,,c
Three-Address Code

• Two concepts
– Address
– Instruction
• Address
– Name: source-program names to appear as addresses
– Constant: Different types of constants
– Compiler Generated temporary:
Three-Address Instruction

Assignment Type 1: x := y op z
op is a binary arithmetic or logical operation
x, y and z are addresses

Assignment Type 2: x := op z
op is a unary arithmetic or logical operation
x and z are addresses

Copy Instruction: x := y
x and z are addresses and x is assigned the value of y
Three-Address Instructions

Unconditional Jump: goto L


We will jump to the three-address code with the label L, and the
execution continues from that statement.
Ex: goto L1 // jump to L1
jmp 7 // jump to the statement 7

Conditional Jump 1: if x goto L and ifFalse x goto L


We will jump to the three-address code with the label L if x is TRUE
and FALSE, respectively. Otherwise, the following three-address
instruction in sequence is executed next.

Conditional Jump 2: if x relop y goto L


We will jump to the three-address code with the label L if the result of
y relop z is true, and the execution continues from that statement.
If the result is false, the execution continues from the statement
following this conditional jump statement.
Three-Address Statements (cont.)

Procedure Parameters: param x


Procedure Calls: call p,n
where x is an actual parameter, we invoke the procedure p with
n parameters.
Ex: param x1
param x2
…… p(x1,...,xn)
param xn
n is necessary because call
call p,n,
can be nested

f(x+1,y) add x,1,t1


param t1,,
param y,,
call f,2,
Three-Address Statements (cont.)

Indexed Assignments:
x := y[i]
sets x to the value in location i memory units beyond location y
y[i] := x
sets contents of the location i memory units beyond location y to the
value of x
Address and Pointer Assignments:
x := &y
sets the r-value of x to l-value of y
x := *y where y is a pointer whose r-value is a location
sets the r-value of x equal to the contents of that location
*x := y
sets the r-value of the object pointed by x to the r-value of y
Three address code example

do i=i+1; while (a[i] < v)

L: t1=i+1 100: t1 = I +1
i=t1 101: i = t1
t2=i*8 102: t2=i*8
t3=a[t2] 103: t3=a[t2]
if t3 < v goto L 104: if t3 < v goto 100

(A) Symbolic Labels (B) Position Numbers


Representing 3-Address Statements

op, arg1, arg2, result

• x = minus y
– Does not use arg2
• x = y
– Op is =
• param a1
– Uses neither arg2 nor result
• Conditional/Unconditional jumps
– Put the target label in result
Quadruples

• a = b * -c + b * -c
Quadruples

• Store each fields directly


Triples

op arg1 arg2
0 [ ]= x i
1 := 0 y
Indirect Triples

You might also like