
compiler_optimization-notes-revised

Intermediate code optimization is an optional phase in compilers aimed at enhancing the efficiency of the generated intermediate code. The optimization process involves transforming naive intermediate code into more efficient forms by eliminating redundancies and analyzing control flow information to create a control flow graph. Basic blocks are defined as sequences of statements executed sequentially, and their structure aids in optimizing loops and improving execution efficiency.

Uploaded by

rb292983

Introduction to Intermediate Code Optimization

Intermediate code (IC) optimization is an optional phase in most compilers, enabled by the
user if desired. In gcc, for example, optimization is enabled with the switch “-Ox”, where O
denotes optimization and x is an integer level from 0 to 3. Level 0 indicates no optimization,
while levels 1, 2 and 3 request progressively more optimization. Level 2 is the optimization
level most commonly deployed. Level 3 enables more aggressive optimizations, which pay off
mainly on architectures with advanced features, such as vector processors, multiple processors, etc.

[Figure: Source Program → Front end → Int code → Optimization Phase → Int code → Back end → Target Program]

As shown in the figure above, the main objective of the optimization phase is to improve the quality of
intermediate code generated by the front end of the compiler.
We have already observed, during the SDTS of various linguistic features, that intermediate code
generation concentrates on semantic correctness and equivalence to the source code, not on efficient
code. The IC generated during SDTS therefore often has redundant code fragments that can be
eliminated or rewritten through analysis of the IC. For example,
• use of a large number of temporaries in the code that evaluates an expression, with no concern for
reusing temporaries that already hold evaluated expression values,
• generation of a large number of “goto ____” statements during partial evaluation of conditional
expressions in the translation of control flow structures,
• repeated references to the same multi-dimensional array element, etc.
1. Processing of Intermediate Code to extract control flow information
The SDTS for all the features considered by us, except for declaration processing, results in generation
of intermediate code. As illustrated in the figure above, the optimization phase of a compiler, which is
optional, is entrusted with the task of transforming the naive intermediate code to more efficient
intermediate code by a sequence of transformations.
Before the transformation process starts, the first task in the optimization phase is to extract control
flow information present in the intermediate code. The result is a directed graph, known as control flow
graph.
Consider the following source code and the intermediate code produced for it by our SDTS.

CS333/Intermediate Code Optimization/Supratim/1


SDTS generates a large number of statements involving temporaries, even for assignment statements.
For an assignment statement, the IC has more statements than the source.
However, the statements are executed strictly in sequential order in both codes; there is no transfer
of control within them. Therefore, to reduce the code size without compromising on the
control transfers in the source or the IC, we shall continue with the source assignment statements
instead of their equivalent verbose IC statements. For small source programs, however, the
processing described in this section can be applied directly to the IC as well.
Consider the translation template for a for-loop of C given below, where the names “init”, “test”,
“increment”, “body” and “next” have their intuitive meaning. The template is the one that is used in
gcc, but one can define other templates with equivalent semantics.

for-loop structure                  Translation template

for ( init; test; increment)                init
{                                           goto test
    body                            body:   body
}                                           increment
next                                test:   test
                                            true  : goto body
                                    next:   false : next

A C source program is given in column 1, and its IC in simplified form in column 2, of the table given later in this section.
The source has a few assignments and uses two control structures, an “if-then-else” within an enclosing
for-loop. The presence of control flow statements will make the generated IC more interesting for the
purpose of analysis. The call and return statements are special instances of unconditional control flow
statements.

Control flow                    Intermediate code form used in the course

Conditional control flow        if a relop b goto label 1
                                goto label 2
Unconditional control flow      goto label 3
call statement                  call function_name
return statement                return or return x

• For a function call, such as printf(" a = %d b = %d d= %d x[5] = %d \n", a , b , d , x[5]); we shall
generate intermediate code that is fairly close to the structure of assembly code.

Source code
printf (" a = %d b = %d d= %d x[5] = %d \n", a, b, d, x[5]);

Intermediate Code structure used
t1 := x[5]
param " a = %d b = %d d= %d x[5] = %d \n"
param a
param b
param d
param t1
call printf

Intermediate Code for the entire source code is given in the 2nd column of the following table. The IC is
very similar to the SDTS for assignment statement and control structure statements developed in the
course. The only difference is that there are fewer temporaries used and hence fewer statements
generated. The control flows are however exactly the same. Note the translation of the for-loop using
the template defined earlier. The IC for the if-then-else construct follows the SDTS used in the course.

Source program                                    Three Address Intermediate Code
int main()                                        50: a := 2
{ int a = 2, b = 3, c = 40, d, i, j;              51: b := 3
  int x[10];                                      52: c := 40
  x[0] = 10; x[4] = 20; x[7] = 30;                53: x[0] := 10
  for ( i = 1; i < 11; i++ )                      54: x[4] := 20
  { d = a * 15;                                   55: x[7] := 30
    if (i < 5)                                    56: i := 1
    { x[i] = x[i-1] + d; b = a*c;}                57: goto 70
    else                                          58: d := a*15
    { b = a * c;                                  59: if i < 5 goto 61
      d = a * c;                                  60: goto 66
      c = b + d;}                                 61: t1 := x[i-1]
  };                                              62: t2 := t1 + d
  printf(" a = %d b = %d d= %d x[5] = %d \n",     63: x[i] := t2
         a, b, d, x[5]);                          64: b := a*c
  return 0;                                       65: goto 69
}                                                 66: b := a*c
                                                  67: d := a*c
                                                  68: c := b+d
                                                  69: i := i+1
                                                  70: if i < 11 goto 58
                                                  71: goto 72
                                                  72: param format-string
                                                  73: param a
                                                  74: param b
                                                  75: param d
                                                  76: param x[5]
                                                  77: call printf
                                                  78: t3 := 0
                                                  79: return t3

To make the control flow in the function explicit, a compiler creates a graph from the generated IC. The
nodes of the graph contain statements with no transfer of control within the node. An edge between a
pair of nodes represents the transfer of control from one node to the other. The resulting graph is
directed, since the control transfer statements, whether conditional or unconditional, give the
direction of the transfer of control. This graph is called a “control flow graph” and its nodes are called
“basic blocks”.
What constitutes a basic block?
A basic block is a maximal contiguous sequence of IC statements, such that
• the statements in the block are executed sequentially, starting from the first IC statement to the
last one within the block,
• transfer of control is restricted as follows: a) control may enter the
basic block only at the first statement in the block, and b) only the last statement in
the block may transfer control outside the block,
• it is the largest set of IC statements satisfying the properties listed.
A control flow graph (cfg) is formed by creating a node in the graph for each block and an edge
between blocks bi and bj, denoted by (bi, bj), when the last statement in bi transfers control
to the first statement of bj, or when control simply falls through from the end of bi to bj, the
block that immediately follows it.
2. Construction of Basic Blocks
Algorithm : Partition intermediate code statements into basic blocks and construct a control
flow graph
Input : A sequence of three-address code (TAC) statements generated after semantic analysis
Output : Control flow graph
Method outline :
1. Identify the set of headers (also called leaders), which are the first statements of
basic blocks. Headers of basic blocks are identified as follows.
• The first IC statement is a header, by definition.
• Any statement that is the target of a conditional or unconditional goto is a header.
• Any statement that immediately follows a goto or conditional goto statement is a
header.
2. Extract the body of each basic block. For each header, its basic block consists of the header
and all statements up to but not including the next header or the end of the program.
3. Insert edges from a basic block to its successor blocks, based on transfer of control.

We shall examine the TAC sequentially, starting with the first statement at label 50, and would like to
identify all the headers in a single pass, if possible. The TAC examined is the code at labels 50 to 79
of the example.

Identification of Headers of Basic Blocks
1. 50: a := 2 is a header – 1st TAC
2. 70: if i < 11 goto 58 is a header, being a target of 57
3. 58: d := a*15 is a header as it follows 57
4. 61: t1 := x[i-1] is a header as it is a target of 59
5. 60: goto 66 is a header since it follows 59
6. 66: b := a*c is a header because it is a target of 60
7. 69: i := i+1 is a header because of 65: goto 69
8. 71: goto 72 is a header because it follows 70
9. 72: param format-string is a header because it is a target of 71
10. 77: call printf is a header – unconditional control transfer
11. 78: t3 := 0 is a header as it follows 77
12. 79: return t3 is a header – unconditional control transfer
There are no more headers in the TAC. Header statements in sorted order of label numbers :
50, 58, 60, 61, 66, 69, 70, 71, 72, 77, 78, 79

Headers to Basic Blocks
Basic blocks will be numbered from B1 onwards.
B1 (header 50) = {50 to 57}
B2 (header 70) = {70}
B3 (header 58) = {58 to 59}
B4 (header 61) = {61 to 65}
B5 (header 66) = {66 to 68}
B6 (header 69) = {69}
B7 (header 72) = {72 to 76}
B8 (header 77) = {77}
B9 (header 78) = {78}
B10 (header 79) = {79}

The entries under “Headers to Basic Blocks” give the contents of the basic blocks in terms of the contiguous TAC
statements they contain. The two statements that consist of a single unconditional goto, 60 and 71, are not included
in the basic blocks, because we can represent them by an edge in the flow graph. The numbering of the basic blocks
does not change the structure of the control flow graph.

Basic block   Header   Contents   Edges
B1            50       50 : 57    B1→B2
B2            70       70 : 70    B2→B3, B2→B7
B3            58       58 : 59    B3→B4, B3→B5
B4            61       61 : 65    B4→B6
B5            66       66 : 68    B5→B6
B6            69       69 : 69    B6→B2
B7            72       72 : 76    B7→B8
B8            77       77 : 77    B8→B9
B9            78       78 : 78    B9→B10
B10           79       79 : 79

Summary of observations for construction of a control flow graph
• There are 11 directed edges and 10 nodes in the cfg.
• The unique entry node of the graph is B1 and B10 is the unique exit node.
• A control flow graph is normally constructed at function level. However, the concept easily
extends to program level. For example, a program with n distinct function definitions will
have n control flow graphs, one for each function. The call and return statements connect
the cfgs of two functions that are related by a caller-callee relationship. Such a graph is called a
program flow graph. In this course we are concerned with the cfg of a single function only.
• An unconditional goto statement at the end of a basic block can be safely removed from the
basic block contents, as the outgoing edge from the block to its successor block gives the desired
effect.
• A basic block that ends in a conditional goto has exactly two successors, one for the
true branch and the other for the false branch.
Using these observations, the control flow graph for main() is drawn below.

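[Figure: Control Flow Graph for Example. Entry node B1 (50 : 56) → B2 (70 : 70, if i < 11).
B2 True → B3 (58 : 59, if i < 5); B2 False → B7 (72 : 76).
B3 True → B4 (61 : 64); B3 False → B5 (66 : 68).
B4 → B6 and B5 → B6 (69 : 69); B6 → B2.
B7 → B8 (77 : 77) → B9 (78 : 78) → B10 (79 : 79), the exit node.]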

Properties of a basic block :
• All statements in a basic block are executed sequentially.
• Transfers from other basic blocks are permitted only to the header of the block, that is, the
first statement in the block.
• Transfer out of a basic block is not permitted, except at the last statement.
A basic block is the largest sequence of statements that can be executed sequentially once control
reaches the header of the block.

3. Use of Basic Blocks / control flow graph in Optimization: The control flow graph exhibits the
control structure of the program explicitly. Loops in the control flow graph depict parts of the code that
are expected to execute repeatedly. Therefore, improving at compile time the contents of basic blocks
that are part of one or more loops should result in significant savings at execution time.
• The cfg is useful for another significant optimization, the detection of unreachable code.
Unreachable code consists of one or more basic blocks that are not reachable from the
entry node (also called the start node) of the cfg. A basic block x, other than the start node, that is x
≠ start, is certainly unreachable from start if predecessor(x) = ∅. More generally, x is unreachable if
there is no path from start to x, which also covers blocks all of whose predecessors are unreachable.
Finding all unreachable nodes in a cfg is a simple graph theoretic problem in directed graphs and can
be solved by a traversal from the entry node. All such nodes
can be safely removed from the cfg without changing the semantics of the underlying program.
This optimization is named elimination of unreachable code.
• Detecting all loops in a cfg, including nested loops, is also a well known graph theoretic
problem in directed graphs. One may use DFS to detect and extract the loops in a cfg in terms
of the sets of basic blocks that constitute them. For the cfg of our example, there is exactly
one loop, described by the set of basic blocks {<bb 2>, <bb 3>, <bb 4>, <bb 5>, <bb 6>},
which can only be entered through the block <bb 2>.
The main purpose of the cfg is to support optimizations on the intermediate code that improve
its efficiency. Analysis of the structure of the cfg is known as control flow analysis. Control flow
analysis is used in optimizations, popularly known as control flow transformations, that may result in
addition / deletion of nodes or edges, but usually do not change the contents of the basic blocks.
Removal of unreachable code and detection of loops are examples of control flow analysis
and transformations.
Exercise : Write a function in a source language such that the control flow graph of the function
contains basic blocks that can be detected as unreachable.
4. Machine Independent Optimizations
Analysis of the contents of basic blocks is known as data flow analysis. If the analysis is confined to
a single basic block (intra-basic block), it is called local data flow analysis. In
contrast, when the contents of all basic blocks are analysed to collect useful information about
the entire flow graph, the resulting analysis is known as global data flow analysis.
Analysis, whether local or global, is concerned with collecting precise information about the
computation that is carried out by the code in the blocks. Analysis is a pre-requisite for applying
transformations (also called optimizations) on the IC to improve its performance at run time.
Analysis and optimization go hand in hand and are interdependent, in the sense that analysis opens up
opportunities for optimization, and after an optimization has been successfully performed, the resulting
code is further analysed for more information. Optimization and analysis form an iterative process, and the
compiler repeats them in tandem until no further substantial benefit is expected.

Data Flow Optimizations are broadly divided into 2 classes.

1. Local Optimization: Improvements in the contents within a basic block are usually
termed local optimization. The scope of a local optimization is a single basic block.
The analysis required for local optimization is limited, and therefore such optimizations
can be performed efficiently, with O(number of statements in block) effort. Examples
are elimination of redundant computations, constant
folding, constant propagation, etc., which are described later.
An instance of local code optimization exists in the basic block <BB1> shown below, together
with the result of removing the redundant computation. Whether the new
assignments to b and d are used in the rest of the cfg requires global data flow analysis and
will not be handled by the local optimization.

<BB1>                          <BB1-modified>
Before local Optimization      After local Optimization

43 : b = a * c;                43  : t1 = a * c;
44 : d = a * c;                44  : b = t1;
45 : c = b + d;                44a : d = t1;
                               45  : c = t1 + t1;

2. Global Optimization : Improvements in the contents of basic blocks made by examining the
entire cfg, including all its basic blocks, fall under the scope of global optimization. Such
computations are expensive since they require analysing the entire control flow graph and are
usually O(n²), where n is the number of nodes in the cfg. Examples are given later.
While they are expensive to implement, the savings are significant when opportunities
for such optimizations are detected.

Research on optimization of intermediate code is an ongoing effort that has been pursued for the past
50 years. There exist interesting practical problems in this domain that have not been solved
satisfactorily to date. The literature on this topic is large enough to span more than two
full semester courses.
In this course the intent is to give you a feel for some of the well known transformations or
optimizations performed by modern compilers, such as gcc. The optimizations will be introduced at the
source level instead of at the intermediate code level, where they are actually performed in practice.
Illustration of Optimization through Example
The approach we take is to illustrate the capability of the optimization phase through examples. All the
optimizations shall be performed on the source code itself, though we know that in reality they are
performed by the compiler on the intermediate code (IC). The reason for the choice of source code is to
communicate the intent and spirit of the optimizations without using the volume and verbosity of IC.
Consider the following C program which has a main() function. An integer array and few scalar
variables are defined here. There is a for-loop which has an if-then-else statement within the body of
the for-loop. Another if-then-else statement follows the for-loop.
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
for ( i = 1; i < 11; i+=2 )
{ if ( i%2)
{ d = a * b; x[i] = x[i-1] + d; c = b*100; }
else
{d = a * a; x[(i+d)%10] = x[i]+x[i-1];}
};
if (a < 2) for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
return 0;
}
The following exercise may be done by the readers to independently examine the effect of
optimizations versus no optimization performed by gcc on the program above. Let us name the file as
“prog1.c”. The switch, “-fverbose-asm” asks the compiler to add annotations to the generated
assembly, which is generally more readable.
• generate assembly code without optimization
$ gcc -S -fverbose-asm prog1.c
$ mv prog1.s prog1_unopt.s

• generate assembly code at optimization level 2
$ gcc -S -O2 -fverbose-asm prog1.c
• Compare the two assembly codes, “prog1_unopt.s” and “prog1.s” and make your own
observations.
Optimization 1 : Can a compiler perform some computations at compile time and save the cost
incurred at run time ? Let us explore situations in prog1.c which permit such computations.

Source Code
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
for ( i = 1; i < 11; i+=2 )
{ if ( i%2)
{ d = a * b;
x[i] = x[i-1] + d;
c = b*100;
}
else
{d = a * a; x[(i+d)%10] = x[i]+x[i-1];}
};
if (a < 2) for (j = 1; j < 11; j++)
x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
return 0;
}

Opportunities for improvement
Note that there are variables that have been initialized at declaration time. For example, a and b
are both initialized.
The compiler can propagate constant values in operands, provided it can ascertain that the operand
does not get a value from any other definition.
Opportunities are present in the highlighted parts, namely the expressions a * b, b*100, a * a and a < 2.
Performing constant propagation, that is, substituting constant values for operands in expressions,
transforms prog1.c (the program above) into prog2.c :
// prog2.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
for ( i = 1; i < 11; i+=2 )
{ if ( i%2)
{ d = 2*3; x[i] = x[i-1] + d; c = 3*100;}
else
{d = 2*2; x[(i+d)%10] = x[i]+x[i-1];}
};
if (2 < 2) for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
return 0;
}

This optimization is known as Constant Propagation. The main objective is to propagate constants
detected at compile time for further optimizations, such as constant folding discussed next.
Statement of Constant Propagation :
In an intermediate code statement of the form “a := b op c”, an operand, b or c, is replaced by its
value if it is found to be constant at compile time; both are replaced if both are constant. For example, if b
has the value v1, then the IC is changed to “a := v1 op c”.
Benefits : At the least, a memory access is saved at run time. This optimization may also expose more
instances of other optimizations.
Constraints : The definition of the operand that reaches the statement must be
unique and must assign a constant. Counterexamples can be created for this optimization too, where despite
an operand being assigned a constant, it is not safe to perform this optimization.
Analysis : “Reaching Definitions Analysis”, whose task is to determine at each program point all
the definitions that reach that point, is used to perform this optimization safely. The algorithms for it
are outside the scope of this course.
Home work : Analyse and report whether all the instances of constant propagation in the example
given are safe. Can you make one change in the source that will render some constant propagation
unsafe ?
Optimization 2: This is commonly known as Constant Folding.
The transformed program after constant propagation contains several expressions both of whose
operands are constants. The compiler can safely do these computations at compile time and not incur
the runtime cost of the calculations. For the example program, constant
folding transforms prog2.c into prog3.c :
// prog3.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
for ( i = 1; i < 11; i+=2 )
{ if ( i%2)
{ d = 6; x[i] = x[i-1] + d; c = 300;}
else
{d = 4; x[(i+d)%10] = x[i]+x[i-1];}
};
if (False) for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
return 0;
}

Statement of Constant Folding :
An intermediate code statement of the form “a := b op c”, whose operands b and c have the constant
values v1 and v2 respectively, is replaced by the new code “a := v3”, where v3 is the result of
v1 op v2, computed at compile time itself.
Benefits : An operation is saved at run time, as the computed value can be used directly. The savings
are larger for complex operations, such as *, /, %, etc., as they consume more cycles during
execution.
Constraints : This optimization requires that the operand values reaching the statement are unique. A
counterexample where this optimization would change the semantics of the original program is shown
below. It is not possible to fold the computation “t1 = a * b”, even though both operands are
constants on both the paths reaching the point, because the values differ between the paths (2*3 = 6
on one, 10*2 = 20 on the other) and no single folded value is correct for both.

Path 1        Path 2
a = 2         a = 10
b = 3         b = 2
     \        /
       …...
     t1 = a * b

Analysis : The compiler has to perform an analysis of the IC to determine, at the program point of interest,
say p, the definitions of all its operands that reach the point p. The analysis performed by the compiler
is known as “Reaching Definitions Analysis” and is beyond the scope of this course. However, once
the analysis has been performed, the compiler can easily check whether each of the operands has a single
reaching definition, and in this case can safely perform constant folding.
Home work : Analyse and report whether all the instances of constant folding in the example given are
safe.
Note that the compiler performs an optimization not once but as many times as it finds it
profitable. For example, after constant folding more opportunities for constant propagation exist. This
is shown in prog4.c.

prog4.c : prog3.c (after constant folding) after a further round of constant propagation
// prog4.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
for ( i = 1; i < 11; i+=2 )
{ if (i%2)
{ d = 6; x[i] = x[i-1] + 6; // propagate const value of d
c = 300;
}
else {d = 4;
x[(i+4)%10] = x[i]+x[i-1];} // propagate value of d
};
if (false)
for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
return 0;
}

This optimization is known as Constant Propagation. The main objective is to propagate constants
detected at compile time for further optimizations including constant folding.
Optimization 3 : While the two optimizations discussed earlier are useful, larger benefits accrue if
some computations within loops can be safely moved out of the loop to a place outside it. This
optimization is known as Loop Invariant Code Motion. This optimization involves two tasks,
• detect a computation that is invariant in the loop in which it is placed, that is, it does not depend on the
loop surrounding it and has exactly the same value on every iteration of the loop, and
• find a place to move the loop invariant code outside the enclosing loop.
This optimization also, like all others that we discuss, must not change the semantics of the IC. In
prog4.c there are three such opportunities, the assignments d = 6, c = 300 and d = 4. In the modified
program, prog5.c, shown below, these computations have been pulled out of the for-loop and placed
just before the loop; their original positions are marked by comments. Moving assignments out of the
branches of a conditional, as done here, is not safe in general; it preserves the semantics of this
program only because c and d are not used after these assignments.
prog5.c : prog4.c (after constant folding & propagation) after loop invariant code motion
// prog5.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
d = 6; c = 300; d = 4;
for ( i = 1; i < 11; i+=2 )
{ if (i%2)
{ // d = 6; loop invariant
x[i] = x[i-1] + 6;
// c = 300; loop invariant
}
else
{// d = 4; loop invariant
x[(i+4)%10] = x[i]+x[i-1];}
};
if (false)
for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
return 0;
}

Statement of Loop Invariant Code Motion : Given a loop L of the cfg and a computation or IC of the
form “a = b op c” at some point p in some basic block of L, determine whether “a = b op c” is a loop
invariant of L. An IC “a = b op c” is a loop invariant of a loop L if all the definitions of its
operands b and c are placed outside L (constant operands satisfy this trivially). Such code can be
safely moved out of L to a suitable predecessor / successor of L.
Benefits : Moving loop invariant code outside a loop at compile time reduces the computation
effort from m*x (where m is the number of iterations of the loop at run time, and x is the one-time cost of
executing the code) to x.
Constraints : All operands involved in a loop invariant code must have their definitions outside
the given loop.
Analysis : Requires detection of a loop L in the cfg, followed by analysis of all basic blocks that are part of
the loop L.

Optimization 4 : Dead Code Elimination is yet another optimization performed by a compiler. As the
name suggests, code whose result is not used in the program after its definition at some point p may be
safely removed without changing the semantics of the original program. This optimization involves two
tasks,
• detect a computation, defined at program point p, whose result is not used on any path
from p through the rest of the program to its exit point,
• change the IC by removing the corresponding statement.
This optimization also, like all others that we discuss, must not change the semantics of the IC. In the
modified program, prog6.c, the three definitions marked as dead (d = 6, c = 300 and d = 4) are dead
from their point of definition and hence can be safely removed.

prog6.c : prog5.c (after loop invariant code motion) with the dead definitions marked
// prog6.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
// d = 6; dead variable
// c = 300; dead variable
// d = 4; dead variable
for ( i = 1; i < 11; i+=2 )
{ if (i%2)
{ x[i] = x[i-1] + 6; }
else
{ x[(i+4)%10] = x[i]+x[i-1];} };
if (false)
for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
else
for (j = 0; j < 10; j++)
printf(" x[%d] = %d \n", j, x[j]);
return 0;
}

Statement of Dead Code Elimination : IC statements whose results are not used after their definition
at point p on any path in the cfg starting from p to the rest of the cfg till its exit block, can be
safely removed.
Benefits : Saves memory and also reduces execution time.
Constraints : The variable defined in the code needs to have at least one use on some path starting
from the point of interest p, in order not to be declared dead at p. Variables that have at least one
use from the point of interest p are referred to as live variables at p. A definition of a variable that
is not live at a point p becomes a candidate for dead code.
Analysis : Live Variable Analysis is performed by the compiler to detect all variables that are live
(that is, have some use at that point or later) at every program point in the program. Variables that
are not found to be live are dead, and their definitions can be safely eliminated. The details of
performing live variable analysis algorithmically are beyond the scope of this course.
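Although the general algorithm is not covered here, the idea can be illustrated on a single basic block. The following is a minimal sketch under strong simplifying assumptions: the block is straight-line code, every instruction has the form “dst = src1 op src2” with no side effects, variables are single lowercase letters, and the set of variables live at the end of the block is given. The type Instr and the function mark_dead are hypothetical names introduced for this sketch and are not part of any real compiler.

```c
#include <stdbool.h>

/* Hypothetical three-address instruction: dst = src1 op src2.
   Variables are single lowercase letters; 0 means "operand is a constant". */
typedef struct { char dst, src1, src2; } Instr;

/* Scan one basic block backwards, keeping track of which variables are live.
   live_out[v - 'a'] tells whether v is live at the end of the block.
   dead[i] is set to true if instruction i defines a variable that is not
   live immediately after it, so the instruction is a candidate for removal.
   Assumes the instructions have no side effects. */
void mark_dead(const Instr *code, int n, const bool *live_out, bool *dead)
{
    bool live[26];
    for (int v = 0; v < 26; v++)
        live[v] = live_out[v];

    for (int i = n - 1; i >= 0; i--) {
        int d = code[i].dst - 'a';
        dead[i] = !live[d];
        if (dead[i])
            continue;               /* a removed instruction contributes no uses */
        live[d] = false;            /* the definition ends the live range of dst */
        if (code[i].src1) live[code[i].src1 - 'a'] = true;
        if (code[i].src2) live[code[i].src2 - 'a'] = true;
    }
}
```

For the block “d = a + b; c = d + d; d = a + a; y = a + b” with only y live at the end, the scan marks the first three instructions as dead. Note that “d = a + b” is found dead only because its sole use, in “c = d + d”, was itself found dead first.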

Optimization 5 : Elimination of Unreachable Code


This optimization involves elimination of unreachable code from IC. In prog6.c, the conditional
“if(false)” is always false, so control never reaches the for-loop in the then part of the conditional.
These statements may therefore be safely removed without changing the underlying semantics. Code
fragments that are unreachable due to the evaluation of conditional expressions in control flow
constructs, or statements that immediately follow an unconditional transfer of control, such as “goto”,
“return”, “break”, etc., are examples of such code.



After Dead Code Elimination : prog5.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
  int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
  for ( i = 1; i < 11; i+=2 )
  { if (i%2) { x[i] = x[i-1] + 6; }
    else
    { x[(i+4)%10] = x[i]+x[i-1];} };
  if (false)
    for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
  else
    for (j = 0; j < 10; j++) printf(" x[%d] = %d \n", j, x[j]);
  return 0;
}

Situations for optimization : prog6.c
int main()
{ int a = 2, b = 3, c = 40, d, i, j;
  int x[10]={10, 20, 30, 40, 50, 60, 70, 80, 90, 100};
  for ( i = 1; i < 11; i+=2 )
  { if (i%2) { x[i] = x[i-1] + 6; }
    else
    { x[(i+4)%10] = x[i]+x[i-1];} };
  // if (false) // always false unreachable code
  // for (j = 1; j < 11; j++) x[(j+5)%10]=x[j]+5;
  //else
  for (j = 0; j < 10; j++)
    printf(" x[%d] = %d \n", j, x[j]);
  return 0;
}

Statement of Unreachable Code Elimination : Statements that are unreachable from the start node of
the cfg are known as “unreachable code”. Such code may be discovered either directly by control flow
analysis, or after the application of optimizations which cause some conditional expression to have a
constant value, True or False, at compile time.
Benefits : Saves both memory and execution time. Also identifies potential unintentional bugs in the
design / code.
Constraints : Correctness depends on the compile time evaluations done under various
optimizations by the compiler, which have resulted in some code being marked as unreachable.
Analysis : Control flow analysis is the key analysis performed by the compiler to detect code that is
unreachable in the cfg. Global data flow analysis is required to determine code that is
reachable from the start block but will never be executed at run time because some conditional
expression has been analysed and found to have acquired a constant value at compile time.
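The control flow part of this analysis amounts to a traversal of the cfg from the start block. The following is a minimal sketch, assuming a hypothetical cfg representation in which every basic block has at most two successors; the type Cfg, the constants MAX_BLOCKS and MAX_SUCCS, and the function mark_reachable are names introduced for this illustration only.

```c
#include <stdbool.h>

#define MAX_BLOCKS 16
#define MAX_SUCCS  2

/* Hypothetical cfg: succ[b][k] is the k-th successor of basic block b,
   or -1 if block b has no such successor. */
typedef struct {
    int nblocks;
    int succ[MAX_BLOCKS][MAX_SUCCS];
} Cfg;

/* Depth-first traversal from block b. After the call with b = start block,
   every block whose entry in reachable[] is still false is unreachable
   and can be removed from the cfg. */
void mark_reachable(const Cfg *g, int b, bool *reachable)
{
    if (b < 0 || b >= g->nblocks || reachable[b])
        return;
    reachable[b] = true;
    for (int k = 0; k < MAX_SUCCS; k++)
        mark_reachable(g, g->succ[b][k], reachable);
}
```

As a usage example, consider four blocks modelled on the conditional in prog6.c: block 0 is the test, block 1 the then part, block 2 the else part and block 3 the return. Once “if (false)” has been evaluated at compile time, block 0 keeps only its edge to block 2, and a traversal from block 0 marks blocks 0, 2 and 3 while leaving block 1 unmarked.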
In summary, six different optimizations were applied above in some order to convert the given source
program to the final form shown in the following table. Readers are urged to compile and run all the six
different versions of the program, from prog1.c to prog6.c. Verify whether all the programs produce the
same output.
Homework : As a take home exercise, students are urged to find out whether they can define an
optimization, other than those mentioned above, and apply it to the final version, prog6.c, in order
to make it even more efficient.
Give a suitable name to the optimization, and describe it in terms of the components : statement of the
optimization, benefits, constraints and analysis required. Provide an example of application of your
optimization giving the program code before and after the optimization.
End of Document
