Code Generation:
Good code generation is difficult.
A careful code-generation algorithm can easily produce code that runs
perhaps twice as fast as code produced by an ill-considered algorithm.
Input to the code generator:
An intermediate-language program, which may be a sequence of quadruples, a sequence
of triples, a tree, or a postfix Polish string.
Output of the code generator:
An object program, which can take one of several forms:
an absolute machine-language program
a relocatable machine-language program
an assembly-language program
Assumptions: We assume that the code generator is presented with the intermediate
text as quadruples or as a parse tree.
Problems in code generation:
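1) What instructions should we generate?
Many machines can carry out the same computation in more than one way.
Consider the statement A := A + 1. On a machine with an add-one-to-storage
instruction (AOS), the single instruction AOS A suffices. Otherwise we need the sequence:
LOAD A
ADD 1
STORE A
So we have to decide which machine instructions are the best ones to generate.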
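A minimal sketch of this choice in Python. The function name select_increment and the flag has_aos are hypothetical, introduced only for illustration; the mnemonics are the ones used above.

```python
def select_increment(name, has_aos):
    """Return instructions for `name := name + 1`.

    `has_aos` is an assumed flag saying whether the target machine
    offers an add-one-to-storage (AOS) instruction.
    """
    if has_aos:
        # One instruction performs the whole statement.
        return [f"AOS {name}"]
    # Otherwise fall back to the three-instruction sequence.
    return [f"LOAD {name}", "ADD 1", f"STORE {name}"]


print(select_increment("A", has_aos=True))   # ['AOS A']
print(select_increment("A", has_aos=False))  # ['LOAD A', 'ADD 1', 'STORE A']
```

A real code generator would make this kind of decision for every statement, weighing the cost of each alternative.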
2) In what order should we perform computations?
Some orders of computation require fewer registers to hold intermediate results than others.
Deciding the best order is a very difficult problem.
3) What registers should we use?
This depends on which registers are available, and their number is limited.
The following addressing modes will be assumed:
1. r (register mode): Register r contains the operand.
2. *r (indirect register mode): Register r contains the address of the operand.
3. #X (immediate): The instruction contains the literal operand X.
4. X (absolute): The address of X follows the instruction.
Cost: We take the length of an instruction, in words, to be its cost.
1. MOV R1, R0 has cost 1, as it occupies only one word of memory.
2. MOV M, R5 has cost 2, as the address of memory location M is in the word
following the instruction.
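Example:
Consider A := B + C, where A, B, and C are simple variables in distinct memory
locations denoted by these names.
A variety of code sequences can be generated:
1. MOV R0, B
   ADD R0, C
   MOV A, R0
   Cost = 6
2. Assuming R0, R1, and R2 contain the addresses of A, B, and C, respectively:
   MOV *R0, *R1
   ADD *R0, *R2
   Cost = 2
3. Assuming R1 and R2 contain the values of B and C, respectively, and the
   value of B is not needed after the assignment:
   ADD R1, R2
   MOV A, R1
   Cost = 3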
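The cost figures above can be checked mechanically. The following Python sketch counts one word per instruction plus one word for each absolute operand. The helper names, the register naming pattern (R0, R1, ...), and the treatment of an immediate operand as one extra word are assumptions made for illustration.

```python
import re

REGISTER = re.compile(r"R\d+$")  # assumed register naming: R0, R1, ...


def operand_words(operand):
    """Extra words an operand adds to the instruction."""
    if REGISTER.match(operand):
        return 0  # r: the register holds the operand
    if operand.startswith("*") and REGISTER.match(operand[1:]):
        return 0  # *r: the register holds the operand's address
    # X: the address follows the instruction (one extra word).
    # #X: assumed here to take one extra word as well.
    return 1


def instruction_cost(instruction):
    """One word for the instruction itself plus the words its operands add."""
    _opcode, _, operands = instruction.partition(" ")
    return 1 + sum(operand_words(o.strip())
                   for o in operands.split(",") if o.strip())


def sequence_cost(instructions):
    return sum(instruction_cost(i) for i in instructions)


print(instruction_cost("MOV R1, R0"))                          # 1
print(instruction_cost("MOV M, R5"))                           # 2
print(sequence_cost(["MOV R0, B", "ADD R0, C", "MOV A, R0"]))  # 6
print(sequence_cost(["MOV *R0, *R1", "ADD *R0, *R2"]))         # 2
print(sequence_cost(["ADD R1, R2", "MOV A, R1"]))              # 3
```

The cheaper sequences are cheaper only because some values or addresses are already in registers, which is why the generator must track register contents.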
Register Descriptors:
We shall maintain a register descriptor that keeps track of what is currently
in each register.
We shall consult this register descriptor whenever we need a new register.
We assume that initially the register descriptor shows that all registers are
empty.
Address Descriptors:
For each name, an address descriptor keeps track of the location (or locations)
where the current value of the name can be found at run time.
The location might be a register, a stack location, a memory address, or
some set of these.
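One possible representation of the two descriptors, sketched in Python with dictionaries of sets. The function names are hypothetical, and the sketch assumes that every name starts out in a memory location denoted by the name itself.

```python
def new_descriptors(registers, names):
    """Create empty register descriptors and initial address descriptors."""
    reg_desc = {r: set() for r in registers}  # register -> names it holds
    addr_desc = {n: {n} for n in names}       # name -> locations of its value
    return reg_desc, addr_desc


def record_load(reg_desc, addr_desc, reg, name):
    """Record that `reg` now holds a copy of `name` and nothing else."""
    for old in reg_desc[reg]:
        addr_desc[old].discard(reg)           # old contents are overwritten
    reg_desc[reg] = {name}
    addr_desc[name].add(reg)


def find_empty_register(reg_desc):
    """Return the first empty register, or None if all are in use."""
    for reg, held in reg_desc.items():
        if not held:
            return reg
    return None


reg_desc, addr_desc = new_descriptors(["R0", "R1"], ["A", "B"])
record_load(reg_desc, addr_desc, "R0", "A")
print(sorted(addr_desc["A"]))         # ['A', 'R0']
print(find_empty_register(reg_desc))  # R1
```

Keeping both mappings makes each lookup direct: the register descriptor answers "what is in this register?" and the address descriptor answers "where is this value?".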
The Code-Generation Algorithm:
We are now ready to outline the code generation algorithm. We are given a sequence
of quadruples constituting a basic block.
For each quadruple A:= B op C we perform the following actions:
1. Invoke a function GETREG() to determine the location L where the computation
B op C should be performed. L will usually be a register, but it could also be a
memory location.
2. Consult the address descriptor for B to determine B', one of the current
locations of B. Prefer a register for B' if the value of B is currently both in
memory and in a register. If the value of B is not already in L, generate the
instruction MOV L, B' to place a copy of B in L.
3. Generate the instruction OP L, C', where C' is one of the current locations of C,
again preferring a register if C is in both memory and a register. Update
the address descriptor of A to indicate that A is in location L. If L is a register,
update its descriptor to indicate that it will contain at run time the value of A.
4. If the current values of B and/or C have no next uses, are not live on exit from the
block, and are in registers, alter the register descriptor to indicate that, after
execution of A := B op C, those registers no longer will contain B and/or C,
respectively.
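The four steps can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: the GETREG policy shown (reuse B's register when B is no longer needed, otherwise take an empty register, with no spilling), the SUB mnemonic, the final store of names live on exit, and all function names are assumptions. The sketch is run on the block T := A - B; U := A - C; V := T + U; W := V + U with W live on exit.

```python
def getreg(b, b_needed, reg_desc, addr_desc):
    """Assumed GETREG policy: reuse B's register if B is dead, else an empty one."""
    for loc in sorted(addr_desc.get(b, ())):
        if loc in reg_desc and reg_desc[loc] == {b} and not b_needed:
            return loc
    for reg, held in reg_desc.items():
        if not held:
            return reg
    raise RuntimeError("no free register; spilling is not modelled here")


def generate(block, registers, live_on_exit):
    """Translate quadruples (A, op, B, C), meaning A := B op C."""
    reg_desc = {r: set() for r in registers}  # register -> names it holds
    addr_desc = {}                            # name -> locations of its value
    code = []

    def locations(name):
        # Assumption: an unseen name lives in memory under its own name.
        return addr_desc.setdefault(name, {name})

    def best(name):
        # Prefer a register when the value is in both memory and a register.
        regs = sorted(loc for loc in locations(name) if loc in reg_desc)
        return regs[0] if regs else name

    for i, (a, op, b, c) in enumerate(block):
        later = block[i + 1:]

        def needed(name):
            used_later = any(name in (q[2], q[3]) for q in later)
            return used_later or name in live_on_exit

        L = getreg(b, needed(b), reg_desc, addr_desc)   # step 1
        if L not in locations(b):                       # step 2
            code.append(f"MOV {L}, {best(b)}")
        code.append(f"{op} {L}, {best(c)}")             # step 3
        for held in reg_desc.values():
            held.discard(a)                 # older copies of A are stale
        for name in reg_desc[L]:
            addr_desc[name].discard(L)      # L no longer holds these names
        reg_desc[L] = {a}
        addr_desc[a] = {L}
        for name in (b, c):                             # step 4
            if name != a and not needed(name):
                for reg in registers:
                    if name in reg_desc[reg]:
                        reg_desc[reg].discard(name)
                        addr_desc[name].discard(reg)

    # Assumed block-exit step: store live names that exist only in registers.
    for name in sorted(live_on_exit):
        locs = addr_desc.get(name, {name})
        if name not in locs:
            code.append(f"MOV {name}, {sorted(locs)[0]}")
            locs.add(name)
    return code


block = [("T", "SUB", "A", "B"), ("U", "SUB", "A", "C"),
         ("V", "ADD", "T", "U"), ("W", "ADD", "V", "U")]
for line in generate(block, ["R0", "R1"], {"W"}):
    print(line)
```

With two registers the sketch emits seven instructions; T and V are computed in R0 in turn because each is dead once its successor has been formed.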
An Example:
The assignment W := (A - B) + (A - C) + (A - C) might be translated into the following
three-address code sequence:
T := A - B
U := A - C
V := T + U
W := V + U
with W live at the end of the block.
Statements    Code generated    Register descriptor    Address descriptor