Code Optimization

Code optimization is a technique aimed at improving code efficiency by reducing resource consumption and increasing speed through program transformations. Optimizations are categorized into machine-independent and machine-dependent, with local and global transformations being key methods. Techniques such as common sub-expression elimination, copy propagation, dead-code elimination, and constant folding are essential for enhancing performance, while data flow analysis and peephole optimization further refine the code.

Introduction
 Code Optimization is a program transformation technique that tries to improve the code by making it consume fewer resources (i.e., CPU, memory) and deliver higher speed.
 The code produced by straightforward compiling algorithms can often be made to run faster or take less space, or both.
 This improvement is achieved by program transformations that are traditionally called optimizations. Compilers that apply code-improving transformations are called optimizing compilers.
 Optimizations are classified into two categories:
 Machine-independent optimizations are program transformations that improve the target code without taking into consideration any properties of the target machine.
 Machine-dependent optimizations are based on register allocation and utilization of special machine-instruction sequences.
Principal Sources of Optimization
 A transformation of a program is called local if it can be performed by looking only at
the statements in a basic block; otherwise, it is called global.
 Many transformations can be performed at both the local and global levels.
 Local transformations are usually performed first.
1. Function-Preserving Transformations (or) Local Optimizations:
 There are a number of ways in which a compiler can improve a program without
changing the function it computes.
Examples:
 a. Common sub expression elimination
 b. Copy propagation,
 c. Dead-code elimination
 d. Constant folding

 The other transformations come up primarily when global optimizations are performed.
 Frequently, a program will include several calculations of the same offset in an array. Some of these duplicate calculations cannot be avoided by the programmer because they lie below the level of detail accessible within the source language.

 a. Common Sub-expression Elimination:
 An occurrence of an expression E is called a common sub-expression if E was previously computed, and the values of the variables in E have not changed since the previous computation. We can avoid recomputing the expression if we can use the previously computed value.

 For example
t1: = 4*i
t2: = a [t1]
t3: = 4*j
t4: = 4*i
t5: = n
t6: = b [t4] +t5
 The above code can be optimized using the common sub-expression elimination as
t1: = 4*i
t2: = a [t1]
t3: = 4*j
t5: = n
t6: = b [t1] +t5

 The common sub-expression t4: = 4*i is eliminated, as its computation is already available in t1 and the value of i has not changed between its definition and use.
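The three-address code above can be mirrored in C to check that both versions compute the same result (the array sizes, bounds, and function names are illustrative assumptions, not part of the original example):

```c
#include <assert.h>

int a[40], b[40]; /* hypothetical arrays standing in for a[] and b[] */

/* Before elimination: 4*i is computed twice (as t1 and as t4). */
int cse_before(int i, int n) {
    int t1 = 4 * i;
    int t2 = a[t1];
    int t4 = 4 * i;      /* common sub-expression, recomputed */
    int t6 = b[t4] + n;
    return t2 + t6;
}

/* After elimination: t4 is removed and t1 is reused, since i has
   not changed between the two computations of 4*i. */
int cse_after(int i, int n) {
    int t1 = 4 * i;
    int t2 = a[t1];
    int t6 = b[t1] + n;
    return t2 + t6;
}
```

Both functions return the same value for any in-range i; only the redundant multiplication differs.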

 b. Copy Propagation:
 Assignments of the form f := g are called copy statements, or copies for short. The idea behind the copy-propagation transformation is to use g for f wherever possible after the copy statement f := g. Copy propagation means the use of one variable instead of another. This may not appear to be an improvement on its own, but as we shall see, it gives us an opportunity to eliminate the copied variable (x in the example below).
 For example:
x=Pi;
A=x*r*r;
 The optimization using copy propagation can be done as follows:
A=Pi*r*r;
Here the variable x is eliminated.

 c. Dead-Code Elimination:
 Dead code is one or more code statements that are:
 either never executed (unreachable),
 or, if executed, produce output that is never used.
 It is possible that a program contains a large amount of dead code.
 This can happen when variables are declared and defined once and never removed, in which case they serve no purpose.
 A variable is live at a point in a program if its value can be used subsequently; otherwise, it is dead at that point. A related idea is dead or useless code: statements that compute values that never get used. While the programmer is unlikely to introduce dead code intentionally, it may appear as the result of previous transformations.
 Example:

i = 0;
if (i == 1)
{
a = b + 5;
}
 Here, the body of the 'if' statement is dead code because the condition can never be satisfied (i is always 0).
 In the example below, the value assigned to i is never used, and the dead store can
be eliminated. The first assignment to global is dead, and the third assignment to
global is unreachable; both can be eliminated.
int global;
void f ()
{
int i;
i = 1; /* dead store */
global = 1; /* dead store */
global = 2;
return;
global = 3; /* unreachable */
}

 Below is the code fragment after dead code elimination.


int global;
void f ()
{
global = 2;
return;
}

 d. Constant Folding:
 Expressions with constant operands can be evaluated at compile time, thus improving run-time performance and reducing code size by avoiding evaluation at run time.
 Example:
 In the code fragment below, the expression (3 + 5) can be evaluated at compile time
and replaced with the constant 8.
int f (void)
{
return 3 + 5;
}

 Below is the code fragment after constant folding.

int f (void)
{
return 8;
}

 For example,
a = 3.14157/2 can be replaced by
a = 1.570785, thereby eliminating a division operation.

2. Global Optimizations or Loop Optimizations:


 Programs tend to spend the bulk of their time in loops, especially inner loops. The running time of a program may therefore be improved if the number of instructions in an inner loop is decreased, even if the amount of code outside that loop increases.

 Three techniques are important for loop optimization:
o (i) Code motion, which moves code outside a loop;
o (ii) Induction-variable elimination, which removes redundant induction variables from inner loops;
o (iii) Reduction in strength, which replaces an expensive operation by a cheaper one, such as a multiplication by an addition.

 (i)Code Motion:
 An important modification that decreases the amount of code in a loop is code
motion.
 This transformation takes an expression that yields the same result independent of
the number of times a loop is executed (a loop-invariant computation) and places
the expression before the loop. Note that the notion “before the loop” assumes the
existence of an entry for the loop. For example, evaluation of limit-2 is a loop-
invariant computation in the following while-statement:
 Example:
while (i<= limit-2) /* statement does not change limit*/
 Code motion will result in the equivalent of
t= limit-2;
while (i<=t) /* statement does not change limit or t */
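The while-statement above can be rendered as a pair of C functions to confirm the transformation preserves behavior (the function names and trivial loop body are assumptions for illustration):

```c
#include <assert.h>

/* Before code motion: limit - 2 is re-evaluated on every iteration,
   even though the loop body never changes limit. */
int count_before(int limit) {
    int i = 0;
    while (i <= limit - 2)
        i++;
    return i;
}

/* After code motion: the loop-invariant computation is hoisted
   before the loop, exactly as t = limit - 2 above. */
int count_after(int limit) {
    int i = 0;
    int t = limit - 2;   /* evaluated once, before the loop */
    while (i <= t)
        i++;
    return i;
}
```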

 (ii) Induction Variables:
 Induction-variable elimination can reduce the number of additions (or subtractions) in a loop, improving both run-time performance and code space.
 Some architectures have auto-increment and auto-decrement instructions that can sometimes be used instead of induction-variable elimination.
 (OR) An induction variable is a variable that is increased or decreased by a fixed amount on every iteration of a loop, or is a linear function of another induction variable.
 Example:
 The original notes refer to a code fragment with three induction variables (i1, i2, and i3) that can be replaced with a single induction variable, thus eliminating two of them; the fragment itself is not reproduced here.

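Since the original fragments are missing, the following is a minimal sketch of what a loop with three induction variables and its optimized form might look like; the array size and step amounts are assumptions:

```c
#include <assert.h>

#define N 10
int arr[N];

/* Before: three induction variables. i2 and i3 each change by a
   fixed amount per iteration, so both are linear functions of i1. */
void fill_before(void) {
    int i1, i2 = 0, i3 = 0;
    for (i1 = 0; i1 < N; i1++) {
        arr[i3] = i2;
        i2 += 2;   /* i2 = 2 * i1 */
        i3 += 1;   /* i3 = i1     */
    }
}

/* After: i2 and i3 are rewritten in terms of the single remaining
   induction variable i1, eliminating two of the three. */
void fill_after(void) {
    int i1;
    for (i1 = 0; i1 < N; i1++)
        arr[i1] = 2 * i1;
}
```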

 (iii)Reduction in Strength:
 Reduction in strength replaces expensive operations by equivalent cheaper ones on
the target machine. Certain machine instructions are considerably cheaper than
others and can often be used as special cases of more expensive operators.
 For example, x² is invariably cheaper to implement as x*x than as a call to an
exponentiation routine. Fixed-point multiplication or division by a power of two is
cheaper to implement as a shift. Floating-point division by a constant can be
implemented as multiplication by a constant, which may be cheaper.

 Note:
 Strength reduction replaces an expensive operation by a cheaper one on the target machine.
 Addition of a constant is cheaper than a multiplication, so we can replace a multiplication with an addition within the loop.
 Multiplication is cheaper than exponentiation, so we can replace exponentiation with multiplication within the loop.
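As a sketch of the multiplication-to-addition case, a multiplication by a loop index can be replaced by a running sum (the function names and the constant 4 are illustrative assumptions):

```c
#include <assert.h>

/* Before: one multiplication per iteration. */
int sum_before(int n) {
    int s = 0;
    for (int j = 0; j < n; j++)
        s += 4 * j;          /* expensive: multiply every time */
    return s;
}

/* After strength reduction: t4 tracks the value of 4*j and is
   updated by a cheaper addition on each iteration. */
int sum_after(int n) {
    int s = 0, t4 = 0;
    for (int j = 0; j < n; j++) {
        s += t4;
        t4 += 4;             /* cheaper: addition only */
    }
    return s;
}
```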

Optimization of Basic Blocks

 The optimization process can be applied to a basic block.
 While optimizing, we must not change the set of expressions computed by the block.
 There are two types of basic-block optimization. These are as follows:
1. Structure-Preserving Transformations
2. Algebraic Transformations

 1. Structure-Preserving Transformations:

 The primary structure-preserving transformations on basic blocks are as follows:

o Common sub-expression elimination

o Dead code elimination

o Renaming of temporary variables

o Interchange of two independent adjacent statements


(a) Common sub-expression elimination:

 A common sub-expression need not be computed over and over again. Instead, it can be computed once and kept in store, from where it is referenced when it is encountered again.

(b) Dead-code elimination

 Dead code is one or more code statements that are:
 either never executed (unreachable),
 or, if executed, produce output that is never used.
 It is possible that a program contains a large amount of dead code.
 This can happen when variables are declared and defined once and never removed, in which case they serve no purpose.

Example:
// Program with dead code
int main()
{
int x = 2;
if (x > 2)
cout << "code"; // Dead code
else
cout << "Optimization";
return 0;
}
// Optimized program without dead code
int main()
{
int x = 2;
cout << "Optimization"; // Dead code eliminated
return 0;
}
(c) Renaming temporary variables

 A statement t := b + c, where t is a temporary variable, can be changed to u := b + c, where u is a new temporary variable. All instances of t can then be replaced with u without changing the value of the basic block.

(d) Interchange of statements

 Suppose a block has the following two adjacent statements:

1. t1 := b + c
2. t2 := x + y

 These two statements can be interchanged without affecting the value of the block, provided neither statement uses the value computed by the other (t1 does not appear in x + y, and t2 does not appear in b + c).
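A small C illustration of this condition (operand values chosen arbitrarily): because the two temporaries are computed from disjoint operands, either order yields the same block value.

```c
#include <assert.h>

/* Original order of the two independent statements. */
int block_original(int b, int c, int x, int y) {
    int t1 = b + c;
    int t2 = x + y;
    return t1 * t2;   /* combine both temporaries */
}

/* The same two statements interchanged: t2 does not depend on t1. */
int block_interchanged(int b, int c, int x, int y) {
    int t2 = x + y;
    int t1 = b + c;
    return t1 * t2;
}
```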

 2. Algebraic transformations:
 Algebraic identities represent another important class of optimizations on basic blocks. This includes simplifying expressions or replacing expensive operations by cheaper ones, i.e., reduction in strength.
 Ex. 1: The expression 2*3.14 would be replaced by 6.28.
 Ex. 2: The expression 5*2.7 would be replaced by 13.5.
 Ex. 3: In an algebraic transformation, we can change a set of expressions into an algebraically equivalent set. Thus the statements x := x + 0 or x := x * 1 can be eliminated from a basic block without changing the set of expressions it computes.
 The algebraic transformation on basic blocks includes:
1. Constant Folding
2. Copy Propagation
3. Strength Reduction
1. Constant Folding:
Evaluate constant subexpressions at compile time so that the compiled program does not need to evaluate them at run time.
Example:
x = 2 * 3 + y ⇒ x = 6 + y (optimized code)
2. Copy Propagation:
It is of two types: variable propagation and constant propagation.
Variable propagation:
x = y; z = x + 2 ⇒ z = y + 2 (optimized code)
Constant propagation:
x = 3; z = x + a ⇒ z = 3 + a (optimized code)
3. Strength Reduction:
Replace an expensive statement/instruction with a cheaper one.
x = 2 * y (costly) ⇒ x = y + y (cheaper)
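The three transformations above can be applied by hand to one hypothetical fragment to check that the meaning is preserved (the function names and the combining return value are assumptions):

```c
#include <assert.h>

/* Before: contains a foldable constant, a propagatable copy of x,
   and a strength-reducible multiplication. */
int alg_before(int y) {
    int x = 2 * 3 + y;   /* constant folding: 2*3 -> 6     */
    int z = x + 2;       /* copy propagation target         */
    int w = 2 * y;       /* strength reduction: 2*y -> y+y  */
    return x + z + w;
}

/* After applying all three transformations by hand. */
int alg_after(int y) {
    int x = 6 + y;
    int z = 6 + y + 2;
    int w = y + y;
    return x + z + w;
}
```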

Introduction to Global Data Flow Analysis


 To optimize the code efficiently, the compiler collects information about the program as a whole and distributes this information to each block of the flow graph. This process is known as data-flow analysis.
 Certain optimizations can only be achieved by examining the entire program; they cannot be achieved by examining just a portion of the program.
 Use-definition (ud) chaining is one particular problem of this kind.
 Here, for each use of a variable, we try to find out which definitions of that variable can apply at that statement.
 Based on the local information a compiler can perform some optimizations. For
example, consider the following code:

1. x = a + b;
2. x = 6 * 3;

o In this code, the first assignment to x is useless: the value computed for x is never used in the program.
o At compile time the expression 6*3 will be computed, simplifying the second assignment statement to x = 18;

 Some optimization needs more global information. For example, consider the
following code:

1. a = 1;
2. b = 2;
3. c = 3;
4. if (....) x = a + 5;
5. else x = b + 4;
6. c = x + 1;
 In this code, the assignment at line 3 (c = 3) is useless because c is reassigned at line 6, and the expression x + 1 can be simplified to the constant 7, since x is 6 on both branches (a + 5 = 1 + 5 and b + 4 = 2 + 4).
 But it is less obvious how a compiler can discover these facts by looking only at one or two consecutive statements. A more global analysis is required, so that the compiler knows the following things at each point in the program:

o Which variables are guaranteed to have constant values?


o Which variables will be used before being redefined?

 Data flow analysis is used to discover this kind of property. The data flow
analysis can be performed on the program's control flow graph (CFG).
 In order to do code optimization and a good job of code generation, the compiler needs to collect information about the program as a whole and to distribute this information to each block in the flow graph. For example, a compiler could take advantage of "reaching definitions", such as knowing where a variable like debug was last defined before reaching a given block, in order to perform transformations; this is just one example of the data-flow information that an optimizing compiler collects by a process known as data-flow analysis.
 Data-flow information can be collected by setting up and solving systems of
equations of the form:

out [S] = gen [S] U ( in [S] - kill [S] )

 This equation can be read as "the information at the end of a statement is either generated within the statement, or enters at the beginning and is not killed as control flows through the statement." Such equations are called data-flow equations.

 We say that the beginning and end points of the dummy blocks at the entry and exit of a statement's region are the beginning and end points, respectively, of the statement. The equations are an inductive, or syntax-directed, definition of the sets in[S], out[S], gen[S], and kill[S] for all statements S. gen[S] is the set of definitions "generated" by S, while kill[S] is the set of definitions that never reach the end of S.

 Consider the following data-flow equations for reaching definitions :

Fig. 5.8 (a) Data flow equations for reaching definitions

 Observe the rules for a single assignment to a variable a. Surely that assignment is a definition of a, say d. Thus
gen[S] = {d}
 On the other hand, d "kills" all other definitions of a, so we write
kill[S] = Da - {d}
 where Da is the set of all definitions in the program for variable a.
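The equation out[S] = gen[S] U (in[S] - kill[S]) can be exercised on a straight-line example using bit masks; the four definitions and the variables they assign are made up for illustration:

```c
#include <assert.h>

/* Definitions d0..d3 are represented as bits in an unsigned mask.
   d0: a = 1   d1: b = 2   d2: a = 3   d3: c = b
   gen[s] is {ds}; kill[s] is the set of other definitions of the
   same variable (d0 and d2 both define a, so they kill each other). */
static unsigned gen_[]  = { 1u << 0, 1u << 1, 1u << 2, 1u << 3 };
static unsigned kill_[] = { 1u << 2, 0u,      1u << 0, 0u      };

/* For straight-line code, in[S] of each statement is simply out[S]
   of the previous statement. */
unsigned reaching_out(int nstmts) {
    unsigned in = 0, out = 0;
    for (int s = 0; s < nstmts; s++) {
        out = gen_[s] | (in & ~kill_[s]);  /* the data-flow equation */
        in = out;
    }
    return out;
}
```

After all four statements, d0 no longer reaches the end (it is killed by d2), so the result is {d1, d2, d3}.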
Peephole Optimization
 Peephole optimization is a simple and effective technique for locally improving target code.
 This technique improves the performance of the target program by examining a short sequence of target instructions (called the peephole) and replacing these instructions by a shorter or faster sequence whenever possible.
 Peephole is a small, moving window on the target program.
 Characteristics of peephole optimizations:
(1) Redundant-instruction elimination
(2) Unreachable code
(3) Flow-of-control optimizations
(4) Algebraic simplifications
(5) Reduction in strength
(6) Use of machine idioms

 Explanation:

(1) Redundant-instruction elimination: In this technique, redundant instructions are eliminated.
 Initial code:
y = x + 5;
i = y;
z = i;
w = z * 3;
 Optimized code:
y = x + 5;
i = y;
w = y * 3;

 (2) Unreachable Code: Another opportunity for peephole optimization is the removal of unreachable instructions. An unlabeled instruction immediately following an unconditional jump may be removed. This operation can be repeated to eliminate a sequence of instructions. For example, for debugging purposes, a large program may have within it certain segments that are executed only if a variable debug is 1. In C, the source code might look like:

#define debug 0
if ( debug )
{
print debugging information
}
 In the intermediate representation the if-statement may be translated as:
if debug = 1 goto L1
goto L2
L1: print debugging information
L2: ………………………… (a)

 One obvious peephole optimization is to eliminate jumps over jumps. Thus, no matter what the value of debug, (a) can be replaced by:

if debug ≠ 1 goto L2
print debugging information
L2: …………………………… (b)

 If debug is known to be the constant 0, constant propagation transforms (b) into:

if 0 ≠ 1 goto L2
print debugging information
L2: …………………………… (c)

 As the condition in (c) always evaluates to true, the conditional jump can be replaced by goto L2. Then all the statements that print debugging aids are manifestly unreachable and can be eliminated one at a time.

 (3) Flow-of-Control Optimizations: Unnecessary jumps can be eliminated in either the intermediate code or the target code by the following kinds of peephole optimization. We can replace the jump sequence
Syntax:
goto L1
….
L1: goto L2 (d)

by the sequence
goto L2
….
L1: goto L2

 If there are now no jumps to L1, then it may be possible to eliminate the statement L1: goto L2, provided it is preceded by an unconditional jump.

 Example:Similarly, the sequence

if a < b goto L1
….
L1: goto L2 (e)

can be replaced by

if a < b goto L2
….
L1: goto L2

 Finally, suppose there is only one jump to L1 and L1 is preceded by an unconditional goto.
 Then the sequence

goto L1
….
L1: if a < b goto L2 (f)
L3:
 may be replaced by

if a < b goto L2
goto L3
…….
L3:

 While the number of instructions in (e) and (f) is the same, we sometimes skip the unconditional jump in (f), but never in (e). Thus (f) is superior to (e) in execution time.

 (4) Algebraic Simplification: There is no end to the amount of algebraic simplification that can be attempted through peephole optimization. Only a few algebraic identities occur frequently enough that it is worth considering implementing them.
 For example, statements such as
x := x + 0 or x := x * 1
are often produced by straightforward intermediate code-generation algorithms, and they can be eliminated easily through peephole optimization.

 Thus:
t1 = x + 0 simplifies to t1 = x
and
t2 = x * 1 simplifies to t2 = x

 (5) Reduction in Strength: Reduction in strength replaces expensive operations by equivalent cheaper ones on the target machine. Certain machine instructions are considerably cheaper than others and can often be used as special cases of more expensive operators.
 For example, x² is invariably cheaper to implement as x*x than as a call to an exponentiation routine. Fixed-point multiplication or division by a power of two is cheaper to implement as a shift. Floating-point division by a constant can be implemented as multiplication by a constant, which may be cheaper.
 x² → x*x

 (6) Use of Machine Idioms: The target machine may have hardware instructions to implement certain specific operations efficiently. For example, some machines have auto-increment and auto-decrement addressing modes. These add or subtract one from an operand before or after using its value. The use of these modes greatly improves the quality of code when pushing or popping a stack, as in parameter passing. These modes can also be used in code for statements like i := i + 1.

Ex. 1: i := i + 1 → i++
Ex. 2: i := i - 1 → i--
