Code Generation and Optimization
The code generation techniques presented below can be used whether or not an
optimizing phase occurs before code generation.
The source code, written in a higher-level language, is translated into a
lower-level language, producing lower-level object code. The object code should
have the following minimum properties:
It should carry the exact meaning of the source code.
It should be efficient in terms of CPU usage and memory management.
Issues in the design of a code generator
The code generator converts the intermediate representation of the source code
into a form that can be readily executed by the machine.
The following issues arise during the code generation phase:
1. Input to code generator
2. Target program
3. Memory management
4. Instruction selection
5. Register allocation
6. Evaluation order
TARGET MACHINE
Familiarity with the target machine and its instruction set is a prerequisite for
designing a good code generator.
Instructions of the target machine have the form
op source, destination
where op is an op-code, and source and destination are data fields.
It has the following op-codes:
MOV (move source to destination)
For example:
MOV R0, M stores the contents of register R0 into memory location M.
The register descriptors show that initially all the registers are empty.
An address descriptor stores the location where the current value of the name
can be found at run time.
A code-generation algorithm:
• The algorithm takes as input a sequence of three-address statements
constituting a basic block. For each three-address statement of the form
x := y op z, perform the following actions:
1. Invoke a function getreg to determine the location L where the result of the
computation y op z should be stored.
2. Consult the address descriptor for y to determine y’, the current location of y.
Prefer the register for y’ if the value of y is currently both in memory and in a
register. If the value of y is not already in L, generate the instruction MOV y’, L to
place a copy of y in L.
3. Generate the instruction OP z’, L where z’ is a current location of z.
Prefer a register to a memory location if z is in both. Update the address descriptor
of x to indicate that x is in location L. If L is a register, update its descriptor to
indicate that it contains the value of x, and remove x from all other register descriptors.
4. If the current values of y or z have no next uses, are not live on exit from the
block, and are in registers, alter the register descriptor to indicate that, after
execution of x := y op z, those registers will no longer contain y or z.
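The four steps above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a definitive implementation: the statement layout (x, y, op, z), the helper names has_next_use, generate and best_loc, the simplified getreg strategy (reuse y's register, else take an empty register, else spill the first register), the two registers R0 and R1, and the ADD/SUB/MUL op-codes are all chosen for the example.

```python
OPCODES = {"+": "ADD", "-": "SUB", "*": "MUL"}  # assumed target op-codes

def has_next_use(block, i, name, live_out):
    """True if name is used after statement i or is live on block exit."""
    for x, y, _op, z in block[i + 1:]:
        if name in (y, z):
            return True
        if x == name:          # redefined before any further use
            return False
    return name in live_out

def generate(block, live_out, registers=("R0", "R1")):
    reg_desc = {r: set() for r in registers}   # register -> names it holds
    addr_desc = {}                             # name -> set of locations
    code = []

    def locs(name):
        # A name not seen before is assumed to live in its memory location.
        return addr_desc.setdefault(name, {name})

    def best_loc(name):
        # Prefer a register over memory when the value is in both.
        for loc in sorted(locs(name)):
            if loc in reg_desc:
                return loc
        return name

    def getreg(i, y):
        # Case 1: y sits alone in a register and has no next use.
        for r in sorted(locs(y)):
            if (r in reg_desc and reg_desc[r] == {y}
                    and not has_next_use(block, i, y, live_out)):
                return r
        # Case 2: an empty register is available.
        for r in registers:
            if not reg_desc[r]:
                return r
        # Case 3: spill the first register to memory.
        r = registers[0]
        for name in sorted(reg_desc[r]):
            if name not in locs(name):
                code.append(f"MOV {r}, {name}")
                locs(name).add(name)
            locs(name).discard(r)
        reg_desc[r] = set()
        return r

    for i, (x, y, op, z) in enumerate(block):
        L = getreg(i, y)                                  # step 1
        y_loc = best_loc(y)                               # step 2
        if y_loc != L:
            code.append(f"MOV {y_loc}, {L}")
        code.append(f"{OPCODES[op]} {best_loc(z)}, {L}")  # step 3
        for name in reg_desc[L]:
            locs(name).discard(L)
        reg_desc[L] = {x}
        for r in registers:
            if r != L:
                reg_desc[r].discard(x)
        addr_desc[x] = {L}
        for name in (y, z):                               # step 4
            if name != x and not has_next_use(block, i, name, live_out):
                for r in registers:
                    if name in reg_desc[r]:
                        reg_desc[r].discard(name)
                        locs(name).discard(r)

    # Store names that are live on exit and only held in a register.
    for name in sorted(live_out):
        if name not in locs(name):
            code.append(f"MOV {best_loc(name)}, {name}")
            locs(name).add(name)
    return code

block = [("t", "a", "-", "b"), ("u", "a", "-", "c"),
         ("v", "t", "+", "u"), ("d", "v", "+", "u")]
for line in generate(block, live_out={"d"}):
    print(line)
```

The spill strategy here is deliberately naive; a production getreg would choose the register whose contents are needed furthest in the future.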
Generating Code for Assignment Statements:
• The assignment d := (a - b) + (a - c) + (a - c) might be translated into the
following three-address code sequence:
t := a - b
u := a - c
v := t + u
d := v + u
• Spill code is additional instructions inserted into the generated code to move variables between registers and
memory.
t2 := a * b
t3 := 2 * t2
t4 := t1 + t3
t5 := b * b
t6 := t4 + t5
A DAG for a basic block is a directed acyclic graph with the following labels on
nodes:
1. Leaves are labeled by unique identifiers, either variable names or constants.
2. Interior nodes are labeled by an operator symbol.
3. Nodes are also optionally given a sequence of identifiers for labels to store the computed values.
DAGs are useful data structures for implementing transformations on basic blocks.
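As an illustration of the labeling rules above, the following Python sketch builds a DAG from a basic block. It is a simplified version under stated assumptions: statements are (x, y, op, z) tuples, every statement is a binary operation, and the names build_dag, node_for, current and interior are hypothetical helpers introduced for the example.

```python
def build_dag(block):
    nodes = []     # each node: {"label", "children", "ids"}
    current = {}   # name -> index of the node holding its current value
    interior = {}  # (op, left, right) -> index of an existing interior node

    def node_for(name):
        # Create a leaf the first time a name or constant is referenced.
        if name not in current:
            nodes.append({"label": name, "children": (), "ids": []})
            current[name] = len(nodes) - 1
        return current[name]

    for x, y, op, z in block:
        key = (op, node_for(y), node_for(z))
        if key not in interior:
            # No node computes this value yet: add an interior node.
            nodes.append({"label": op, "children": key[1:], "ids": []})
            interior[key] = len(nodes) - 1
        n = interior[key]
        # x now names node n; detach it from the node it labeled before.
        if x in current and x in nodes[current[x]]["ids"]:
            nodes[current[x]]["ids"].remove(x)
        nodes[n]["ids"].append(x)
        current[x] = n
    return nodes

block = [("a", "b", "+", "c"), ("b", "a", "-", "d"),
         ("c", "b", "+", "c"), ("d", "a", "-", "d")]
for i, node in enumerate(build_dag(block)):
    print(i, node["label"], node["children"], node["ids"])
```

In this block the second and fourth statements compute the same value, so they share one interior node that carries both identifiers b and d. Sharing of this kind is how a DAG exposes common subexpressions.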
The optimization process should not delay the overall compiling process.
• When to Optimize?
Optimization of the code is often performed at the end of the development stage,
because it reduces readability and adds code whose only purpose is to improve performance.
• Why Optimize?
Optimizing the algorithm itself is beyond the scope of the code optimization phase,
so the optimizer works on the program as written. Optimization may also involve
reducing the size of the code.
So optimization helps:
1. Reduce the space consumed and increase the execution speed of the compiled program.
2. Manually analyzing datasets takes a lot of time, so we use software tools for data
analysis. Similarly, performing optimization manually is tedious and is better done
using a code optimizer.
3. Optimized code often promotes re-usability.
Types of Code Optimization
1. Machine Independent Optimization
• This code optimization phase attempts to improve the intermediate code to get a better
target code as the output.
• The part of the intermediate code that is transformed here does not involve any CPU
registers or absolute memory locations.
• Equivalently, it transforms a program to improve the target code without taking into
consideration any properties of the target machine.
2. Machine Dependent Optimization
• Machine-dependent optimization is done after the target code has been generated and
when the code is transformed according to the target machine architecture.
• It involves CPU registers and may have absolute memory references rather than relative
references.
• It is based on register allocation and utilization of special machine-instruction sequences.
Code Optimization
The criteria for code improvement transformations:
1. The transformation must preserve the meaning of programs.
2. The transformation must, on average, speed up programs by a measurable amount.
3. The transformation must be worth the effort.
Function-Preserving Transformations
There are a number of ways in which a compiler can improve a program without
changing the function it computes:
Copy propagation
Dead-code elimination
Constant folding
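The three transformations listed above can be illustrated on a basic block with a short Python sketch. It is a simplified illustration under stated assumptions: statements are (dest, op, arg1, arg2) tuples, "copy" is a made-up op for plain assignments, integer arguments stand for constants, and the function name optimize and the live_out parameter are hypothetical.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub, "*": operator.mul}

def optimize(block, live_out):
    # Forward pass: copy propagation and constant folding.
    copies = {}    # name -> the name or constant it currently copies
    forward = []
    for dest, op, a, b in block:
        a = copies.get(a, a)          # copy propagation on both operands
        b = copies.get(b, b)
        if op in OPS and isinstance(a, int) and isinstance(b, int):
            # Constant folding: evaluate the operation at compile time.
            op, a, b = "copy", OPS[op](a, b), None
        # dest is redefined, so forget every copy relation involving it.
        copies = {k: v for k, v in copies.items()
                  if k != dest and v != dest}
        if op == "copy" and a != dest:
            copies[dest] = a
        forward.append((dest, op, a, b))

    # Backward pass: dead-code elimination.
    live = set(live_out)
    kept = []
    for dest, op, a, b in reversed(forward):
        if dest in live:
            live.discard(dest)
            live.update(v for v in (a, b) if isinstance(v, str))
            kept.append((dest, op, a, b))
    kept.reverse()
    return kept

block = [
    ("x", "*", 2, 3),            # x := 2 * 3
    ("y", "copy", "x", None),    # y := x
    ("z", "+", "y", "a"),        # z := y + a
]
print(optimize(block, live_out={"z"}))
```

With only z live on exit, folding turns the first statement into x := 6, propagation rewrites the third statement to z := 6 + a, and the assignments to x and y then become dead and are removed.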
Where to apply Optimization?
1. Source program
Optimizing the source program involves making changes to the algorithm or changing the
loop structures. The user is the actor here.
2. Intermediate Code
Optimizing the intermediate code involves changing the address calculations and
transforming the procedure calls involved. Here the compiler is the actor.
3. Target Code
Optimizing the target code is done by the compiler. The use of registers and the selection
and movement of instructions are part of the optimization involved in the target code.
Phases of Optimization
• There are generally two phases of optimization:
1. Global Optimization:
Transformations are applied to large program segments that include functions,
procedures and loops.
2. Local Optimization:
Transformations are applied to small blocks of statements.
• A simple but effective technique for improving the target code is peephole
optimization, a method for trying to improve the performance of the target program
by examining a short sequence of target instructions (called the peephole) and
replacing these instructions by a shorter or faster sequence, whenever possible.
Peephole Optimization Techniques
• The code in the peephole need not be contiguous, although some implementations do
require this. It is characteristic of peephole optimization that each improvement may
spawn opportunities for additional improvements.
Characteristics of peephole optimizations:
Redundant-instructions elimination
Flow-of-control optimizations
Algebraic simplifications
Unreachable Code
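A small Python sketch can show how the four kinds of peephole improvement listed above might be applied with a sliding window over the target code. It rests on several assumptions: instructions are tuples in the "op source, destination" form, ("LABEL", name) and ("JMP", name) are made-up encodings for labels and unconditional jumps, "#0" and "#1" denote immediate constants, and removing ADD #0 or MUL #1 is taken to be safe (that is, no condition flags are read afterwards). The function name peephole is hypothetical.

```python
IDENTITY = {("ADD", "#0"), ("MUL", "#1")}   # operations that change nothing

def peephole(code):
    changed = True
    while changed:            # one improvement may expose further ones
        changed = False
        out = []
        for ins in code:
            prev = out[-1] if out else None
            # Algebraic simplification: x + 0 and x * 1 are no-ops.
            if ins[:2] in IDENTITY:
                changed = True
                continue
            # Redundant-instruction elimination: MOV a, b directly
            # followed by MOV b, a moves the same value back.
            if (prev and ins[0] == "MOV" and prev[0] == "MOV"
                    and ins[1:] == (prev[2], prev[1])):
                changed = True
                continue
            if prev and prev[0] == "JMP":
                # Unreachable code: an unlabeled instruction right
                # after an unconditional jump can never execute.
                if ins[0] != "LABEL":
                    changed = True
                    continue
                # Flow-of-control: a jump to the next instruction.
                if ins[1] == prev[1]:
                    out.pop()
                    changed = True
            out.append(ins)
        code = out
    return code

code = [
    ("MOV", "R0", "a"),
    ("MOV", "a", "R0"),
    ("ADD", "#0", "R0"),
    ("JMP", "L1"),
    ("MOV", "b", "R1"),
    ("LABEL", "L1"),
    ("MUL", "#1", "R0"),
]
print(peephole(code))
```

The outer loop repeats the scan until nothing changes, reflecting the observation that each improvement may create opportunities for additional ones. In this example, removing the unreachable MOV makes the jump target the next instruction, which in turn allows the jump itself to be removed.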