compiler_optimization-notes-revised
Intermediate code (IC) optimization is an optional phase in most compilers, enabled by the user when
desired. For example, in gcc, optimization is enabled with the switch “-Ox”, where O denotes
optimization and x is an integer level from 0 to 3. Level 0 indicates no optimization, while levels 1, 2
and 3 enable progressively more optimizations. Level 2 is the optimization level most commonly
deployed. Level 3 adds more aggressive optimizations, which tend to pay off on architectures with
advanced features, such as vector processors, multiple processors, etc.
[Figure: Source Program → Front end → Int code → Optimization Phase → Int code → Back end → Target Program]
As shown in the figure above, the main objective of the optimization phase is to improve the quality of
intermediate code generated by the front end of the compiler.
We have already observed, during the SDTS of various linguistic features, that intermediate code
generation concentrates on semantic correctness and equivalence to the source code, not on efficient
code. Therefore the IC generated during SDTS often has redundant code fragments that can be
eliminated or rewritten through analysis of the IC. For example,
• use of a large number of temporaries to evaluate an expression, with no concern for
reusing temporaries that already hold evaluated expression values,
• generation of a large number of “goto ____” statements during partial evaluation of conditional
expressions in the translation of control flow structures,
• repeated evaluation of references to the same multi-dimensional array element, etc.
1. Processing of Intermediate Code to extract control flow information
The SDTS for all the features considered by us, except for declaration processing, results in generation
of intermediate code. As illustrated in the figure above, the optimization phase of a compiler, which is
optional, is entrusted with the task of transforming the naive intermediate code to more efficient
intermediate code by a sequence of transformations.
Before the transformation process starts, the first task in the optimization phase is to extract control
flow information present in the intermediate code. The result is a directed graph, known as control flow
graph.
Consider the following source code and the intermediate code produced for it by our SDTS.
A C source program is given in the first column and its IC, in simplified form, in the second column.
The source has a few assignments and uses two control structures, an “if-then-else” within an enclosing
for-loop. The presence of control flow statements will make the generated IC more interesting for the
purpose of analysis. The call and return statements are special instances of unconditional control flow
statements.
Intermediate code for the entire source code is given in the second column of the following table. The
IC is very similar to what the SDTS for assignment statements and control structures developed in the
course would produce. The only difference is that fewer temporaries are used and hence fewer
statements are generated. The control flow is, however, exactly the same. Note the translation of the
for-loop using the template defined earlier. The IC for the if-then-else construct follows the SDTS used
in the course.
Source program                                  Three Address Intermediate Code
int main()                                      50: a := 2
{ int a = 2, b = 3, c = 40, d, i, j;            51: b := 3
  int x[10];                                    52: c := 40
  x[0] = 10; x[4] = 20; x[7] = 30;              53: x[0] := 10
  for ( i = 1; i < 11; i++ )                    54: x[4] := 20
  { d = a * 15;                                 55: x[7] := 30
    if (i < 5)                                  56: i := 1
    { x[i] = x[i-1] + d; b = a*c;}              57: goto 70
    else                                        58: d := a*15
    { b = a * c;                                59: if i < 5 goto 61
      d = a * c;                                60: goto 66
      c = b + d;}                               61: t1 := x[i-1]
  };                                            62: t2 := t1 + d
  printf(" a = %d b = %d d= %d x[5] = %d \n",   63: x[i] := t2
         a, b, d, x[5]);                        64: b := a*c
  return 0;                                     65: goto 69
}                                               66: b := a*c
                                                67: d := a*c
                                                68: c := b+d
                                                69: i := i+1
                                                70: if i < 11 goto 58
                                                71: goto 72
                                                72: param format-string
                                                73: param a
                                                74: param b
                                                75: param d
                                                76: param x[5]
                                                77: call printf
                                                78: t3 := 0
                                                79: return t3
The third column of the table below gives the contents of each basic block as the range of contiguous
TAC statements it contains. Statements that consist of a single unconditional goto, such as 60 and 71,
are not included in the basic blocks, because we can represent them by an edge in the flow graph. The
numbering of the basic blocks does not affect the structure of the control flow graph.
Basic block   Header   Contents   Edges
B1            50       50 : 57    B1→B2
B2            70       70 : 70    B2→B3, B2→B7
B3            58       58 : 59    B3→B4, B3→B5
B4            61       61 : 65    B4→B6
B5            66       66 : 68    B5→B6
B6            69       69 : 69    B6→B2
B7            72       72 : 76    B7→B8
B8            77       77 : 77    B8→B9
B9            78       78 : 78    B9→B10
B10           79       79 : 79
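[Figure: Control Flow Graph for Example. B1 is the entry node and leads to B2 (if i < 11). From B2 the
True edge goes to B3 (if i < 5) and the False edge to B7. From B3 the True edge goes to B4 and the
False edge to B5. B4 and B5 both lead to B6, which loops back to B2. B7 → B8 → B9 → B10, and B10
is the exit node.]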
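As a rough illustration of how basic block boundaries can be found mechanically, the following C sketch applies the classical "leader" rules to a list of three-address statements: the first statement, every jump target, and every statement that follows a jump start a new block. The `Instr` type, the `OpKind` values and the function name `mark_leaders` are hypothetical and are not part of the course material. The sketch also does not reproduce the table above exactly, since the table folds lone gotos into edges and places the call and return statements in blocks of their own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, simplified view of one three-address statement. */
typedef enum { OP_PLAIN, OP_GOTO, OP_IF_GOTO } OpKind;

typedef struct {
    int label;   /* statement number, e.g. 50 */
    OpKind kind; /* plain statement, "goto L" or "if ... goto L" */
    int target;  /* label jumped to; unused for OP_PLAIN */
} Instr;

/*
 * Sets leader[i] to true when code[i] starts a basic block and
 * returns the number of leaders found.
 */
size_t mark_leaders(const Instr *code, size_t n, bool *leader)
{
    assert(n == 0 || (code != NULL && leader != NULL));
    size_t count = 0;

    for (size_t i = 0; i < n; i++)
        leader[i] = false;
    if (n > 0)
        leader[0] = true;                 /* rule 1: first statement */

    for (size_t i = 0; i < n; i++) {
        if (code[i].kind == OP_PLAIN)
            continue;
        for (size_t j = 0; j < n; j++)    /* rule 2: target of a jump */
            if (code[j].label == code[i].target)
                leader[j] = true;
        if (i + 1 < n)                    /* rule 3: statement after a jump */
            leader[i + 1] = true;
    }

    for (size_t i = 0; i < n; i++)
        if (leader[i])
            count++;
    return count;
}
```

Applied to statements 50 to 79 of the example, these rules mark 50, 58, 60, 61, 66, 69, 70, 71 and 72 as leaders. Each block then runs from one leader up to, but not including, the next.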
3. Use of Basic Blocks / control flow graph in Optimization: The control flow graph exhibits the
control structure of the program explicitly. Loops in the control flow graph depict parts of the code that
are expected to execute repeatedly. Therefore, improving at compile time the contents of basic blocks
that are part of one or more loops should result in significant savings at execution time.
• The cfg is useful for another significant optimization, known as detection of unreachable code.
Unreachable code comprises one or more basic blocks which are not reachable from the
entry node (also called the start node) of the cfg. A basic block x, other than the start node, is
unreachable if there is no path from start to x; a block with predecessor(x) = ∅ is the simplest
case. Finding all unreachable nodes in a cfg is a simple graph theoretic problem in directed
graphs and can be solved by a traversal from the start node. All such nodes
can be safely removed from the cfg without changing the semantics of the underlying program.
This optimization is named elimination of unreachable code.
• Detecting all loops in a cfg, including nested loops, is also a well known graph theoretic
problem in directed graphs. One may use DFS to detect and extract the loops in a cfg in terms
of the set of basic blocks that constitute each loop. For the cfg of our example, there is exactly
one loop, described by the set of basic blocks {<bb 2>, <bb 3>, <bb 4>, <bb 5>, <bb 6>},
which can only be entered through the block <bb 2>.
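The reachability test described above can be sketched in C as a depth-first search from the start node. The adjacency-matrix representation, the `MAX_BLOCKS` limit and the name `mark_reachable` are assumptions made for this illustration only.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

/*
 * Marks every block reachable from `start` by an iterative depth-first
 * search over the adjacency matrix `edge` (edge[u][v] means u -> v).
 * Returns the number of reachable blocks.
 */
int mark_reachable(int n, bool edge[MAX_BLOCKS][MAX_BLOCKS], int start,
                   bool reachable[MAX_BLOCKS])
{
    assert(n > 0 && n <= MAX_BLOCKS && start >= 0 && start < n);
    int stack[MAX_BLOCKS];
    int top = 0, count = 0;

    for (int v = 0; v < n; v++)
        reachable[v] = false;

    reachable[start] = true;
    stack[top++] = start;
    while (top > 0) {
        int u = stack[--top];
        count++;
        for (int v = 0; v < n; v++) {
            if (edge[u][v] && !reachable[v]) {
                reachable[v] = true;   /* mark on push: each block is pushed once */
                stack[top++] = v;
            }
        }
    }
    return count;
}
```

Every block left unmarked is unreachable and can be deleted. This includes blocks that do have predecessors when those predecessors are themselves unreachable, for example two blocks that only jump to each other.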
The main purpose of the cfg is to support optimizations on the intermediate code in order to improve
its efficiency. Analysis of the structure of the cfg is known as control flow analysis. Control flow
analysis is used in optimizations, popularly known as control flow transformations, that may result in
addition / deletion of nodes or edges, but usually do not change the contents of the basic blocks.
Removal of unreachable code and detection of loops are examples of control flow analysis
and transformations.
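As one possible illustration of loop detection, the C sketch below collects the blocks of a loop once a back edge (an edge from a block at the end of the loop body to the loop header) is known. It assumes the back edge has already been identified, for instance during a DFS, and that the header is the only entry to the loop. The name `natural_loop` and the matrix representation are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_BLOCKS 16

/*
 * Collects the loop of the back edge tail -> head: the head plus every
 * block that can reach `tail` without passing through `head`.
 * Returns the number of blocks in the loop.
 */
int natural_loop(int n, bool edge[MAX_BLOCKS][MAX_BLOCKS], int tail, int head,
                 bool in_loop[MAX_BLOCKS])
{
    assert(n > 0 && n <= MAX_BLOCKS);
    assert(tail >= 0 && tail < n && head >= 0 && head < n);
    assert(edge[tail][head]);
    int stack[MAX_BLOCKS];
    int top = 0, count = 1;

    for (int v = 0; v < n; v++)
        in_loop[v] = false;
    in_loop[head] = true;              /* the head stops the backward walk */
    if (!in_loop[tail]) {
        in_loop[tail] = true;
        stack[top++] = tail;
        count++;
    }

    while (top > 0) {
        int v = stack[--top];
        for (int p = 0; p < n; p++) {  /* visit the predecessors of v */
            if (edge[p][v] && !in_loop[p]) {
                in_loop[p] = true;
                stack[top++] = p;
                count++;
            }
        }
    }
    return count;
}
```

For the example cfg, with blocks B1 to B10 stored at indices 0 to 9, the back edge B6→B2 yields the five blocks B2, B3, B4, B5 and B6.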
Exercise : Write a function in a source language such that the control flow graph for the function
contains unreachable basic blocks that can be detected in the graph.
4. Machine Independent Optimizations
Analysis of the contents of basic blocks is known as data flow analysis. If the analysis is confined to
a single basic block, also called intra-basic block analysis, it is called local data flow analysis. In
contrast, when the contents of all basic blocks are analysed for collecting useful information about
the program as a whole, it is called global data flow analysis.
<BB1> (before local optimization)     <BB1-modified> (after local optimization)
43 : b = a * c;                       43  : t1 = a * c;
44 : d = a * c;                       44  : b = t1;
45 : c = b + d;                       44a : d = t1;
                                      45  : c = t1 + t1;
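The two versions of the block can be written as small C functions to check that they compute the same values. This kind of rewrite, where a repeated expression such as a * c is evaluated once and reused, is commonly called local common subexpression elimination. The `BlockState` type and the function names are hypothetical and serve only this illustration.

```c
#include <assert.h>

/* Values of b, c and d on exit from the block. */
typedef struct { int b, c, d; } BlockState;

/* Original block: a * c is evaluated twice. */
BlockState block_before(int a, int c)
{
    BlockState s;
    s.b = a * c;
    s.d = a * c;
    s.c = s.b + s.d;
    return s;
}

/* Modified block: a * c is evaluated once into t1 and then reused. */
BlockState block_after(int a, int c)
{
    BlockState s;
    int t1 = a * c;
    s.b = t1;
    s.d = t1;
    s.c = t1 + t1;
    return s;
}

/* Checks that both versions leave b, c and d with the same values. */
void verify_blocks(int a, int c)
{
    BlockState x = block_before(a, c);
    BlockState y = block_after(a, c);
    assert(x.b == y.b && x.c == y.c && x.d == y.d);
}
```

The modified block performs one multiplication instead of two. The rewrite is valid here because neither a nor c is redefined between the two occurrences of a * c.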
[Figure: two sets of definitions, (a = 2, b = 3) and (a = 10, b = 2), followed by the statement t1 = a * b]
Analysis : The compiler has to analyse the IC to determine, at the program point of interest,
say p, the definitions of all operands that reach the point p. This analysis
is known as “Reaching Definitions Analysis” and is beyond the scope of this course. However, once
the analysis has been performed, the compiler can easily check whether each of the two operands has
a single definition reaching p and whether that definition assigns a constant. In this case it can safely
perform constant folding.
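The safety check described above might be sketched in C as follows. The sketch assumes that the set of definitions reaching p has already been computed and is handed in as an array; the `Def` record and the function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical record of one definition that reaches the point p. */
typedef struct {
    char var;      /* variable being defined */
    bool is_const; /* true when the right-hand side is a literal */
    int value;     /* the literal, meaningful only if is_const */
} Def;

/* True when `var` has exactly one reaching definition and it is a constant. */
static bool single_constant(const Def *reaching, size_t n, char var, int *out)
{
    size_t seen = 0;
    bool is_const = false;

    for (size_t i = 0; i < n; i++) {
        if (reaching[i].var != var)
            continue;
        seen++;
        is_const = reaching[i].is_const;
        *out = reaching[i].value;
    }
    return seen == 1 && is_const;
}

/*
 * Tries to fold "t = x * y" at point p. Succeeds only if x and y each have
 * a single reaching definition and that definition assigns a constant.
 */
bool try_fold_mul(const Def *reaching, size_t n, char x, char y, int *folded)
{
    assert(reaching != NULL && folded != NULL);
    int vx = 0, vy = 0;

    if (!single_constant(reaching, n, x, &vx) ||
        !single_constant(reaching, n, y, &vy))
        return false;
    *folded = vx * vy;
    return true;
}
```

With the definitions a = 2 and b = 3 alone, t1 = a * b folds to 6. If a = 10 and b = 2 also reach the same point, the function refuses to fold, since the value of t1 then depends on the path taken at run time.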
Homework : Analyse and report whether all the instances of constant folding in the example given are
safe.
It is to be noted that a compiler performs an optimization not once but as many times as it finds it
profitable. For example, after constant folding more opportunities for constant propagation exist. This
is shown in prog4.c.
This optimization is known as Constant Propagation. The main objective is to propagate constants
detected at compile time for further optimizations including constant folding.
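The interplay between the two optimizations can be sketched in C for the simplest case, a single basic block of straight-line code, where one forward pass suffices. The `Operand` and `Stmt` types and the function names are hypothetical, variables are restricted to the letters 'a' to 'z', and overflow is ignored. Code with branches needs the reaching definitions information discussed earlier.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool is_const;
    int value;  /* used when is_const is true */
    char var;   /* 'a'..'z', used when is_const is false */
} Operand;

typedef struct {
    char dst;   /* 'a'..'z' */
    char op;    /* '+', '*', or '=' for a plain copy of lhs */
    Operand lhs, rhs;
} Stmt;

/* Constant propagation: replace a variable operand by its known constant. */
static void propagate(Operand *o, const bool *known, const int *val)
{
    if (o->is_const)
        return;
    assert(o->var >= 'a' && o->var <= 'z');
    if (known[o->var - 'a']) {
        o->is_const = true;
        o->value = val[o->var - 'a'];
    }
}

/*
 * One forward pass over a basic block. Returns the number of '+' or '*'
 * statements that were folded into "dst = constant".
 */
size_t propagate_and_fold(Stmt *code, size_t n)
{
    bool known[26] = { false };
    int val[26] = { 0 };
    size_t folded = 0;

    for (size_t i = 0; i < n; i++) {
        Stmt *s = &code[i];
        assert(s->dst >= 'a' && s->dst <= 'z');
        propagate(&s->lhs, known, val);
        if (s->op != '=')
            propagate(&s->rhs, known, val);

        if (s->lhs.is_const && (s->op == '=' || s->rhs.is_const)) {
            int v = s->lhs.value;
            if (s->op == '+')
                v += s->rhs.value;
            else if (s->op == '*')
                v *= s->rhs.value;
            if (s->op != '=') {        /* constant folding */
                s->op = '=';
                s->lhs.value = v;
                folded++;
            }
            known[s->dst - 'a'] = true;
            val[s->dst - 'a'] = v;
        } else {
            known[s->dst - 'a'] = false;  /* dst no longer a known constant */
        }
    }
    return folded;
}
```

For the block a = 2; c = 40; b = a * c; d = a * 15; c = b + d the pass rewrites the last three statements to b = 80, d = 30 and c = 110. Folding b and d is what makes the final statement foldable, which is the repeated application described above.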
Optimization 3 : While the two optimizations discussed earlier are useful, larger benefits accrue if
some computations within loops can be safely moved out of a loop to a place outside the loop. This
optimization is known as Loop Invariant Code Motion. This optimization involves two tasks,
• detect a computation that is invariant in the loop in which it is placed, that is, it does not
depend on the surrounding loop and has exactly the same value on every iteration of the loop, and
• find a place outside the enclosing loop to which the loop invariant code can be moved.
This optimization also, like all others that we discuss, must not change the semantics of the IC. For the
modified program, prog5.c, three opportunities are highlighted using 3 different colors in the second
column of the following table. These computations have been pulled out of the for-loop and placed
just before the loop.
Column 1 : After constant folding & propagation : prog3.c

int main()
{ int a = 2, b = 3, c = 40, d, i, j;
  int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
  for ( i = 1; i < 11; i+=2 )
  { if (i%2)
    { d = 6; x[i] = x[i-1] + 6; c = 300; }
    else
    {d = 4;

Column 2 : Situations for optimization : prog4.c

int main()
{ int a = 2, b = 3, c = 40, d, i, j;
  int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
  d = 6; c = 300; d = 4;
  for ( i = 1; i < 11; i+=2 )
  { if (i%2)
    { // d = 6; loop invariant
      x[i] = x[i-1] + 6;
Statement of Loop Invariant Code Motion : Given a loop L of the cfg and a computation or IC of the
form “a = b op c” at some point p in some basic block of L, determine if “a = b op c” is a loop
invariant of L. An IC “a = b op c” is a loop invariant of a loop L if all the definitions of the
operands b and c that reach p are placed outside L. Such code can then be moved out of L to a suitable
predecessor / successor of L, provided the move does not change the semantics of the IC.
Benefits : Movement of loop invariant code outside a loop at compile time reduces the computation
effort of m*x (where m is the number of iterations of loop at run time, and x is one time cost of
execution of the code), to x.
Constraints : All operands involved in a loop invariant code must have their definitions from outside
the given loop.
Analysis : Requires detection of a loop L in cfg, followed by analysis of all basic blocks that are part of
the loop L.
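The constraint stated above, that no operand may be defined inside the loop, can be checked with a short C sketch. The `LoopStmt` record, in which a source of 0 stands for a constant operand, and the function names are assumptions for this illustration. The loop body is assumed to be given as a flat list of the scalar statements of all blocks in L.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement "dst = src1 op src2"; a source of 0 is a constant. */
typedef struct { char dst; char src1; char src2; } LoopStmt;

/* True when some statement of the loop body defines `var`. */
static bool defined_in_loop(const LoopStmt *body, size_t n, char var)
{
    for (size_t i = 0; i < n; i++)
        if (body[i].dst == var)
            return true;
    return false;
}

/* True when no operand of body[idx] is defined anywhere inside the loop. */
bool is_loop_invariant(const LoopStmt *body, size_t n, size_t idx)
{
    assert(body != NULL && idx < n);
    const LoopStmt *s = &body[idx];

    if (s->src1 != 0 && defined_in_loop(body, n, s->src1))
        return false;
    if (s->src2 != 0 && defined_in_loop(body, n, s->src2))
        return false;
    return true;
}
```

For the loop of the first example, d = a * 15 passes the test because a is never assigned inside the loop, while b = a * c fails because c is redefined by c = b + d. Passing this test only shows that the computation is invariant. Whether it can actually be moved still depends on the semantics being preserved; for instance, d has a second definition inside that loop, which a compiler has to take into account before hoisting d = a * 15.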
Optimization 4 : Dead Code Elimination is yet another optimization performed by a compiler. As the
name suggests, code whose result is not used in the program after its definition at some point p may be
safely removed without changing the semantics of the original program. This optimization involves two
tasks,
• detect a computation, defined at program point p, whose result is not used on any path starting
from p to the exit point of the program, and
• change the IC by removing the corresponding statement.
This optimization also, like all others that we discuss, must not change the semantics of the IC. For the
modified program, prog6.c, three definitions that are highlighted are dead from their point of definition
and hence can be safely removed.
Statement of Dead Code Elimination : IC statements whose results are not used after their definition at
point p on any path in the cfg starting from p to its exit block can be safely removed.
Benefits : Saves memory and also reduces execution time.
Constraints : The variable defined in the code needs to have at least one use on some path starting
from the point of interest p in order not to be declared dead at p. Variables that have at least one use
on a path from the point of interest p are referred to as live variables at p. A definition of a variable
that is not live at a point p becomes a candidate for dead code.
Analysis : Live Variable Analysis is performed by the compiler to detect all variables that are live (that
is, have some use at that point or later) at every program point in the program. Variables that are not
found to be live are dead, and their definitions can be safely eliminated. The details of performing live
variable analysis algorithmically are beyond the scope of this course.
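Within a single basic block, the idea can be sketched in C as a backward scan that maintains the set of live variables. The sketch assumes the set of variables live on exit from the block is already known (computing it is the global analysis mentioned above) and that statements have no side effects. The `BlockStmt` record, in which a source of 0 stands for a constant, and the name `mark_dead` are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical statement "dst = src1 op src2"; a source of 0 is a constant. */
typedef struct { char dst; char src1; char src2; } BlockStmt;

/*
 * Scans a basic block backwards. On entry live[v - 'a'] says whether v is
 * live on exit from the block; on return it describes liveness on entry.
 * dead[i] is set for every statement whose result is never used.
 * Returns the number of dead statements.
 */
size_t mark_dead(const BlockStmt *code, size_t n, bool live[26], bool *dead)
{
    size_t count = 0;

    for (size_t i = n; i-- > 0; ) {
        const BlockStmt *s = &code[i];
        assert(s->dst >= 'a' && s->dst <= 'z');
        if (!live[s->dst - 'a']) {
            dead[i] = true;             /* result is not used later */
            count++;
            continue;                   /* a removed statement has no uses */
        }
        dead[i] = false;
        live[s->dst - 'a'] = false;     /* dst gets its value here */
        if (s->src1 != 0)
            live[s->src1 - 'a'] = true; /* operands are used here */
        if (s->src2 != 0)
            live[s->src2 - 'a'] = true;
    }
    return count;
}
```

For the block d = a * 15; b = a * c; d = a * c; c = b + d with only b and d live on exit, the scan marks the last statement dead because c is not used afterwards, and the first statement dead because d is overwritten before it is read.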
Statement of Unreachable Code Elimination : Statements that are unreachable from the start node of
the cfg are known as “unreachable code”. This may result either from control flow analysis, or due to
the application of optimizations which lead to some conditional expression to have a constant value,
True or False, at compile time.
Benefits : Saves both memory and execution time. Also identifies potential unintentional bugs in the
design / code.
Constraints : Correctness depends on the compile time evaluations done under various
optimizations by the compiler, which have resulted in some code being marked as unreachable.
Analysis : Control flow analysis is the key analysis performed by the compiler to detect code that is
unreachable in the cfg. Global data flow analysis is required to detect code that is
reachable from the start block but will never be executed at run time because some conditional
expression has been analysed and found to have a constant value at compile time.
In summary, six different optimizations were applied above in some order to convert the given source
program to the final form shown in the following table. Readers are urged to compile and run all the six
different versions of the program, from prog1.c to prog6.c. Verify whether all the programs produce the
same output.
Homework : As a take-home exercise, students are urged to try to define an optimization,
other than those mentioned above, and apply it to the final version, prog6.c, in order to make it even
more efficient.
Give a suitable name to the optimization, and describe it in terms of the components : statement of the
optimization, benefits, constraints and analysis required. Provide an example of application of your
optimization giving the program code before and after the optimization.