0% found this document useful (0 votes)
57 views

Intermediate Code Generation

The document discusses intermediate code generation in compilers. It describes: 1) An intermediate compiler architecture with front-end, back-end, and intermediate code generation in between. 2) Different forms intermediate code can take, including syntax trees, postfix notation, and three-address code (triples/quadruples). 3) How three-address code works by representing statements with an operation and up to three operands - two for arguments and one for the result. Common three-address instructions are described.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
0% found this document useful (0 votes)
57 views

Intermediate Code Generation

The document discusses intermediate code generation in compilers. It describes: 1) An intermediate compiler architecture with front-end, back-end, and intermediate code generation in between. 2) Different forms intermediate code can take, including syntax trees, postfix notation, and three-address code (triples/quadruples). 3) How three-address code works by representing statements with an operation and up to three operands - two for arguments and one for the result. Common three-address instructions are described.
Copyright
© © All Rights Reserved
We take content rights seriously. If you suspect this is your content, claim it here.
Available Formats
Download as PDF, TXT or read online on Scribd
You are on page 1/ 23

Intermediate Code Generation

Part I
Compiler Architecture

Intermediate
Code
Static Intermediate Code Target
Parser
Checker Code generator Generator language

Front End Back End

• m × n compliers can be built by writing m front ends


and n back ends – save considerable amount of effort
• We assume parsing, static checking and IC
generation is done sequentially
– These can be combined and done during parsing
• Static checking
– Operator operand compatibility
– Proper placement of break/continue keywords etc.
Intermediate Code (IC)

• The given program in a source language is converted


to an equivalent program in an intermediate language
by the IC generator.
• Ties the front and back ends together
• Language and Machine neutral
• Many forms
• Level depends on how being processed
• More than one intermediate language may be used by
a compiler
• Intermediate language can be many different languages, and
the designer of the compiler decides this intermediate language.
– syntax trees can be used as an intermediate language.
– postfix notation can be used as an intermediate language.
– three-address code (Quadraples) can be used as an intermediate
language
• we will use quadraples to discuss intermediate code generation
• quadraples are close to machine instructions, but they are not
actual machine instructions.
– some programming languages have well defined intermediate
languages.
• java – java virtual machine
• prolog – warren abstract machine
• In fact, there are byte-code emulators to execute instructions in
these intermediate languages.
Intermediate language levels

• High • Medium • Low


T1 a[i,j+2] t1 j+2 r1 [fp-4]
t2 i * 20 r2 r1 + 2
t3 t1 + t2 r3 [fp-8]
t4 4 * t3 r4 r3*20
t5 addr a r5 r4 + r2
t6 t5 + t 4 r6 4 * r5
t7 *t6 r7 fp – 216
f1 [r7+r6]
Intermediate Languages Types

• Graphical IRs:
– Abstract Syntax trees
– Directed Acyclic Graphs (DAGs)
– Control Flow Graphs
• Linear IRs:
– Stack based (postfix)
– Three address code (quadruples)
Graphical IRs

• Abstract Syntax Trees (AST) – retain essential


structure of the parse tree, eliminating unneeded
nodes.

• Directed Acyclic Graphs (DAG) – compacted AST to


avoid duplication – smaller footprint as well

• Control flow graphs (CFG) – explicitly model control


flow
ASTs and DAGs:

a := b *-c + b*-c
:= :=
a + a +

* * *
b - (uni) b - (uni) b - (uni)

c c c

AST DAG
Implementation of DAG/AST: Value Number Method
Three-Address Code

• A three-address code is:


x := y op z
where x, y and z are names, constants or compiler-generated
temporaries; op is any operator.

• But we may also the following notation for three-address code


(it looks like a machine code instruction)
op y,z,x
apply operator op to y and z, and store the result in x.

• We use the term “three-address code” because each statement


usually contains three addresses (two for operands, one for the
result).
Linearized Representation of DAG/AST

• Source Code
– a = b * -c + b * -c
• Three address code

• Tree Representation
Three-Address Statements

Binary Operator: op y,z,result or result :=


y op z
where op is a binary arithmetic or logical operator. This binary
operator is applied to y and z, and the result of the operation is
stored in result.
Ex: add a,b,c
gt a,b,c
addr a,b,c
addi a,b,c

Unary Operator: op y,,result or result :=


op y
where op is a unary arithmetic or logical operator. This unary
operator is applied to y, and the result of the operation is stored
in result.
Ex: uminus a,,c
not a,,c
Three-Address Code

• Two concepts
– Address
– Instruction
• Address
– Name: source-program names to appear as addresses
– Constant: Different types of constants
– Compiler Generated temporary:
Three-Address Instruction

Assignment Type 1: x := y op z
op is a binary arithmetic or logical operation
x, y and z are addresses

Assignment Type 2: x := op z
op is a unary arithmetic or logical operation
x and z are addresses

Copy Instruction: x := y
x and z are addresses and x is assigned the value of y
Three-Address Instructions

Unconditional Jump: goto L


We will jump to the three-address code with the label L, and the
execution continues from that statement.
Ex: goto L1 // jump to L1
jmp 7 // jump to the statement 7

Conditional Jump 1: if x goto L and ifFalse x goto L


We will jump to the three-address code with the label L if x is TRUE
and FALSE, respectively. Otherwise, the following three-address
instruction in sequence is executed next.

Conditional Jump 2: if x relop y goto L


We will jump to the three-address code with the label L if the result of
y relop z is true, and the execution continues from that statement.
If the result is false, the execution continues from the statement
following this conditional jump statement.
Three-Address Statements (cont.)

Procedure Parameters: param x


Procedure Calls: call p,n
where x is an actual parameter, we invoke the procedure p with
n parameters.
Ex: param x1
param x2
…… p(x1,...,xn)
param xn
n is necessary because call
call p,n,
can be nested

f(x+1,y) add x,1,t1


param t1,,
param y,,
call f,2,
Three-Address Statements (cont.)

Indexed Assignments:
x := y[i]
sets x to the value in location i memory units beyond location y
y[i] := x
sets contents of the location i memory units beyond location y to the
value of x
Address and Pointer Assignments:
x := &y
sets the r-value of x to l-value of y
x := *y where y is a pointer whose r-value is a location
sets the r-value of x equal to the contents of that location
*x := y
sets the r-value of the object pointed by x to the r-value of y
Three address code example

do i=i+1; while (a[i] < v)

L: t1=i+1 100: t1 = I +1
i=t1 101: i = t1
t2=i*8 102: t2=i*8
t3=a[t2] 103: t3=a[t2]
if t3 < v goto L 104: if t3 < v goto 100

(A) Symbolic Labels (B) Position Numbers


Representing 3-Address Statements

op, arg1, arg2, result

• x = minus y
– Does not use arg2
• x = y
– Op is =
• param a1
– Uses neither arg2 nor result
• Conditional/Unconditional jumps
– Put the target label in result
Quadruples

• a = b * -c + b * -c
Quadruples

• Store each fields directly


Triples

op arg1 arg2
0 [ ]= x i
1 := 0 y
Indirect Triples

You might also like