CSC3201 - Compiler Construction (Part II) - Lecture 5 - Code Generation
Ahmad Abba Datti
Overview
Runtime Environment
Code Generation
Code Generation
Outline
Figure: position of the code generator (Front end -> Code optimizer -> Code generator)
Issues in the Design of Code Generator
x=y+z
LD R0, y
ADD R0, R0, z
ST x, R0

a=b+c
d=a+e
LD R0, b
ADD R0, R0, c
ST a, R0
LD R0, a
ADD R0, R0, e
ST d, R0
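The statement-by-statement translation above can be sketched in a few lines of Python. This is only an illustration: the function name `gen_naive` and the tuple encoding `(dst, left, right)` for `dst = left + right` are assumptions, not part of the lecture.

```python
def gen_naive(statements):
    """Translate each addition dst = left + right in isolation.

    Every statement gets the same LD / ADD / ST template, always through R0,
    with no knowledge of what the previous statement left in the register.
    """
    code = []
    for dst, left, right in statements:
        code.append(f"LD R0, {left}")        # load the first operand
        code.append(f"ADD R0, R0, {right}")  # add the second operand
        code.append(f"ST {dst}, R0")         # store the result
    return code


if __name__ == "__main__":
    for line in gen_naive([("a", "b", "c"), ("d", "a", "e")]):
        print(line)
```

For a=b+c followed by d=a+e, the fourth instruction produced is LD R0, a, which reloads a value that was stored from R0 by the instruction just before it. A template-based generator like this produces correct but wasteful code.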
Register allocation
Two subproblems
Register allocation: selecting the set of variables that will reside in
registers at each point in the program
Register assignment: selecting the specific register that a variable will reside in
Complications imposed by the hardware architecture
Example: register pairs for multiplication and division
t=a+b
t=t*c
t=t/d
L R1, a
A R1, b
M R0, c
D R0, d
ST R1, t

t=a+b
t=t+c
t=t/d
L R0, a
A R0, b
A R0, c
SRDA R0, 32
D R0, d
ST R1, t
A simple target machine model
Unconditional jumps: BR L
variable name: x
LD R1, i //R1 = i
ST b, R2 //b = R2
a[j] = c
LD R1, c //R1 = c
LD R2, j // R2 = j
ST a(R2), R1 //contents(a+contents(R2))=R1
x=*p
LD R1, p //R1 = p
LD R2, 0(R1) //R2 = contents(0+contents(R1))
ST x, R2 // x=R2
conditional-jump three-address instruction
If x<y goto L
LD R1, x // R1 = x
LD R2, y // R2 = y
SUB R1, R1, R2 // R1 = R1 - R2
BLTZ R1, M // if R1 < 0 jump to M
costs associated with the addressing modes
LD R0, R1 cost = 1
LD R0, M cost = 2
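The two costs above fit a simple rule: one unit for the instruction itself plus one unit for each operand that needs an extra word (a memory address or constant). A minimal sketch of that rule follows. The helper name `instruction_cost` is hypothetical, and it assumes registers are always written as R followed by digits.

```python
import re

# Assumption: registers are named R0, R1, ...; anything else needs an extra word.
REGISTER = re.compile(r"^R\d+$")


def instruction_cost(instruction):
    """Return 1 plus the number of operands that are not plain registers."""
    _opcode, _, rest = instruction.partition(" ")
    operands = [op.strip() for op in rest.split(",") if op.strip()]
    extra_words = sum(1 for op in operands if not REGISTER.match(op))
    return 1 + extra_words


if __name__ == "__main__":
    for ins in ("LD R0, R1", "LD R0, M"):
        print(ins, "cost =", instruction_cost(ins))
```

Summing this cost over a generated sequence gives a rough way to compare two candidate translations of the same statement.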
Three-address statements considered:
call callee
return
halt
action (a placeholder for other three-address statements)
Target program for a sample call and return
Stack Allocation
Return to caller
in Callee: BR *0(SP)
in caller: SUB SP, SP, #caller.recordsize
Target code for stack allocation
Basic blocks and flow graphs
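We wish to determine for each three-address statement x=y+z what the next
uses of x, y and z are.
Algorithm: scan the basic block backward, starting from its last statement.
For each statement i: x=y+z, do the following in order:
1. Attach to statement i the information currently found in the symbol table regarding
the next use and liveness of x, y, and z.
2. In the symbol table, set x to "not live" and "no next use."
3. In the symbol table, set y and z to "live" and the next uses of y and z to i.
Steps 2 and 3 must not be swapped, because x may be the same variable as y or z.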
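The next-use computation can be sketched as a single backward pass over the block. In this sketch the statement encoding `(x, y, z)`, the function name `next_use_info`, and the `live_on_exit` parameter are assumptions made for illustration. The symbol table is modelled as a dictionary mapping each name to a `(live, next_use)` pair.

```python
def next_use_info(block, live_on_exit):
    """block is a list of (x, y, z) triples, each standing for x = y op z.

    Returns one dict per statement mapping x, y and z to the (live, next_use)
    pair that was in the symbol table when the backward scan reached it.
    """
    names = {name for stmt in block for name in stmt}
    # Initial table: only variables live on exit are live; nothing has a next use.
    table = {name: (name in live_on_exit, None) for name in names}
    info = [None] * len(block)
    for i in range(len(block) - 1, -1, -1):      # scan backward
        x, y, z = block[i]
        info[i] = {name: table[name] for name in (x, y, z)}  # attach current info
        table[x] = (False, None)                 # x: not live, no next use
        table[y] = (True, i)                     # y and z: live, next use is i
        table[z] = (True, i)
    return info


if __name__ == "__main__":
    block = [("t", "a", "b"), ("u", "t", "c")]   # t = a + b ; u = t + c
    for i, entry in enumerate(next_use_info(block, live_on_exit={"u"})):
        print(i, entry)
```

The assignment to `table[x]` comes before the assignments for y and z, so a statement such as x = x + 1 ends with x marked live.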
DAG representation of basic blocks
There is a node in the DAG for each of the initial values of the variables appearing
in the basic block.
There is a node N associated with each statement s within the block. The children
of N are those nodes corresponding to statements that are the last definitions,
prior to s, of the operands used by s.
Node N is labeled by the operator applied at s, and also attached to N is the list of
variables for which it is the last definition within the block.
Certain nodes are designated output nodes. These are the nodes whose variables
are live on exit from the block.
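A minimal sketch of the DAG construction follows. The `Node` class, the `build_dag` function, and the statement encoding `(x, op, y, z)` are hypothetical names. The sketch also reuses an existing node when the operator and both children match. That is a common refinement for detecting local common subexpressions, and it is not spelled out in the rules above.

```python
class Node:
    def __init__(self, label, children=()):
        self.label = label              # operator, or "v0" for an initial value
        self.children = tuple(children)
        self.names = []                 # variables whose last definition is this node


def build_dag(block):
    """block is a list of (x, op, y, z) tuples, each standing for x = y op z."""
    last_def = {}   # variable -> node that currently holds its value
    cache = {}      # (op, id(left), id(right)) -> existing interior node
    nodes = []

    def node_for(name):
        # First use of a name before any definition: leaf for its initial value.
        if name not in last_def:
            leaf = Node(name + "0")
            nodes.append(leaf)
            last_def[name] = leaf
        return last_def[name]

    for x, op, y, z in block:
        left, right = node_for(y), node_for(z)
        key = (op, id(left), id(right))
        node = cache.get(key)
        if node is None:
            node = Node(op, (left, right))
            cache[key] = node
            nodes.append(node)
        old = last_def.get(x)
        if old is not None and x in old.names:
            old.names.remove(x)         # x is no longer defined by the old node
        node.names.append(x)
        last_def[x] = node
    return nodes, last_def


if __name__ == "__main__":
    block = [("a", "+", "b", "c"), ("b", "-", "a", "d"),
             ("c", "+", "b", "c"), ("d", "-", "a", "d")]
    nodes, last_def = build_dag(block)
    for n in nodes:
        print(n.label, n.names, [c.label for c in n.children])
```

In the example block, the second and fourth statements compute the same expression a - d on the same operand nodes, so b and d end up attached to a single node.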
Code improving transformations
For each available register, a register descriptor keeps track of the variable names
whose current value is in that register. Since we shall use only those registers that
are available for local use within a basic block, we assume that initially, all
register descriptors are empty. As the code generation progresses, each register
will hold the value of zero or more names.
For each program variable, an address descriptor keeps track of the location or
locations where the current value of that variable can be found. The location
might be a register, a memory address, a stack location, or some set of more than
one of these. The information can be stored in the symbol-table entry for that
variable name.
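The two descriptors can be modelled as a pair of dictionaries. The sketch below shows one plausible set of update rules for loads, stores, and computed results. The class name `Descriptors` and its method names are assumptions, and a variable's own name stands for its home memory location.

```python
from collections import defaultdict


class Descriptors:
    def __init__(self, variables):
        self.reg = defaultdict(set)               # register -> names it holds
        self.addr = {v: {v} for v in variables}   # name -> locations of its value

    def _evict(self, r):
        # Whatever was in r is about to be overwritten.
        for v in self.reg[r]:
            self.addr[v].discard(r)
        self.reg[r] = set()

    def load(self, r, x):
        """Effect of LD r, x: r holds only x, and r is a new location for x."""
        self._evict(r)
        self.reg[r] = {x}
        self.addr[x].add(r)

    def store(self, x, r):
        """Effect of ST x, r: the memory home of x is up to date again."""
        self.addr[x].add(x)

    def compute(self, rx, x):
        """Effect of OP rx, ...: rx holds only x, and x lives only in rx."""
        self._evict(rx)
        self.reg[rx] = {x}
        self.addr[x] = {rx}


if __name__ == "__main__":
    d = Descriptors(["a", "b", "t"])
    d.load("R1", "a")
    d.load("R2", "b")
    d.compute("R2", "t")      # t = a + b, result placed in R2
    print(dict(d.reg), d.addr)
```

After `compute`, the memory copy of t is stale, which is why its address descriptor lists only the register until a store is issued.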
Machine Instructions for Operations
Use getReg(x = y + z) to select registers for x, y, and z. Call these Rx, Ry and Rz.
Selection of the register Rx
1. Since a new value of x is being computed, a register that holds only x is always
an acceptable choice for Rx.
2. If y is not used after instruction I, and Ry holds only y after being loaded, Ry
can also be used as Rx. A similar option holds regarding z and Rz.
Possibilities for value of R
Suppose R is a candidate register for an operand, and v is one of the variables that the
register descriptor says is in R. Before R is reused, the value of v must be safe elsewhere
or no longer needed:
1. If the address descriptor for v says that v is somewhere besides R, then we are OK.
2. If v is x, the value being computed by instruction I, and x is not also one of the other
operands of instruction I (z in this example), then we are OK. The reason is that in this
case, we know this value of x is never again going to be used, so we are free to ignore it.
3. Otherwise, if v is not used later (that is, after the instruction I, there are no further uses
of v, and if v is live on exit from the block, then v is recomputed within the block), then
we are OK.
4. If none of the first three cases applies, then we need to generate the store
instruction ST v, R to place a copy of v in its own memory location. This operation is
called a spill.
Since R may hold several variables at once, the check is repeated for each such v.
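The case analysis above can be sketched as a function that lists the spill stores needed before register R is reused. The function name `stores_needed` and all parameter names are hypothetical. `used_later` is assumed to be the precomputed set of variables that are still needed after instruction I.

```python
def stores_needed(r, reg_desc, addr_desc, x, operands, used_later):
    """Return the ST instructions required before register r can be overwritten.

    reg_desc   : register -> set of variable names currently held
    addr_desc  : variable -> set of locations holding its current value
    x          : variable being computed by the instruction
    operands   : the source operands of the instruction (for example (y, z))
    used_later : variables that are still needed after the instruction
    """
    stores = []
    for v in sorted(reg_desc.get(r, ())):
        if addr_desc.get(v, set()) - {r}:
            continue                      # v is also somewhere besides r
        if v == x and v not in operands:
            continue                      # this value of x is about to be replaced
        if v not in used_later:
            continue                      # v is never needed again
        stores.append(f"ST {v}, {r}")     # spill v to its memory location
    return stores


if __name__ == "__main__":
    reg_desc = {"R1": {"a", "t"}}
    addr_desc = {"a": {"a", "R1"}, "t": {"R1"}}
    print(stores_needed("R1", reg_desc, addr_desc, "u", ("b", "c"), {"t"}))
```

The number of stores returned is a natural score for a candidate register: a register selector would prefer the register with the fewest.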
Characteristic of peephole optimizations
Redundant-instruction elimination
Flow-of-control optimizations
Algebraic simplifications
LD R0, a
ST a, R0
if debug == 1 goto L1
goto L2
L1: print debugging information
L2:
Algebraic simplifications
x=x+0
x=x*1
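Two of these peephole patterns are easy to sketch. The code below assumes the target syntax used elsewhere in these notes (LD reg, addr and ST addr, reg) and assumes that no instruction in the window carries a label. The function names `peephole` and `is_algebraic_noop` are hypothetical.

```python
def peephole(instructions):
    """Drop a store that immediately follows a load of the same register/address pair."""
    out = []
    for ins in instructions:
        parts = ins.replace(",", " ").split()
        if out:
            prev = out[-1].replace(",", " ").split()
            redundant = (
                len(prev) == 3 and len(parts) == 3
                and prev[0] == "LD" and parts[0] == "ST"
                and prev[1] == parts[2]      # same register
                and prev[2] == parts[1]      # same memory location
            )
            if redundant:
                continue                     # the value in memory is already current
        out.append(ins)
    return out


def is_algebraic_noop(dst, left, op, right):
    """True for statements of the form x = x + 0 or x = x * 1."""
    return dst == left and ((op == "+" and right == "0") or (op == "*" and right == "1"))


if __name__ == "__main__":
    print(peephole(["LD R0, a", "ST a, R0", "ADD R0, R0, b"]))
    print(is_algebraic_noop("x", "x", "+", "0"))
```

If the store carried a label, another instruction could jump to it without having executed the load, so the deletion would no longer be safe. That is why the no-label assumption matters.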
Register Allocation and Assignment
Usage Counts
For the loops we can approximate the saving by register allocation as:
Sum over all blocks (B) in a loop (L)
For each use of x before any definition in the block, we add one unit of saving
If x is live on exit from B and is assigned a value in B, then we add 2 units of saving
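The usage-count estimate can be sketched as a sum over the blocks of the loop. The block encoding is an assumption made for illustration: each block is a dictionary with `use` (count of uses of each variable before any definition in that block), `defs` (variables assigned in the block), and `live_out` (variables live on exit).

```python
def savings(x, loop_blocks):
    """Approximate saving from keeping x in a register for the whole loop."""
    total = 0
    for block in loop_blocks:
        total += block["use"].get(x, 0)               # 1 unit per use before a definition
        if x in block["live_out"] and x in block["defs"]:
            total += 2                                # 2 units: the store at block exit is avoided
    return total


if __name__ == "__main__":
    # Hypothetical two-block loop.
    loop = [
        {"use": {"b": 1, "c": 1}, "defs": {"a"}, "live_out": {"a", "b", "c"}},
        {"use": {"a": 1, "b": 1}, "defs": set(), "live_out": {"a"}},
    ]
    for name in ("a", "b", "c"):
        print(name, savings(name, loop))
```

Variables with the highest totals are the natural candidates for the registers reserved across the loop.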
Flow graph of an inner loop
Code sequence using global register assignment
Register allocation by Graph coloring