UNIT V
CODE OPTIMIZATION AND CODE GENERATION
5.1 PRINCIPAL SOURCES OF CODE OPTIMIZATION
A transformation of a program is called local if it can be performed by looking only at
the statements in a basic block; otherwise, it is called global. Many transformations can be
performed at both the local and global levels. Local transformations are usually performed first.
[Figure: initial flow graph]
Types of Transformation
5.1.1 Function-Preserving Transformations
Common sub-expression elimination
Copy propagation
Dead-code elimination
Constant folding
5.1.2 Loop Optimization
Code Motion
Induction Variables
Reduction In Strength
Common sub expression elimination
An occurrence of an expression E is called a common sub-expression if E was previously computed and the values of the variables in E have not changed since that computation. In that case there is no need to recompute the expression; the previously computed value can be used instead.
For example
t1 := 4*i
t2 := a[t1]
t3 := 4*j
t4 := 4*i
t5 := n
t6 := b[t4] + t5
The above code can be optimized using common sub-expression elimination as:
t1 := 4*i
t2 := a[t1]
t3 := 4*j
t5 := n
t6 := b[t1] + t5
The common sub-expression t4 := 4*i is eliminated, since its value is already computed in t1 and i has not been changed between that definition and this use.
[Figure: flow graph after common sub-expression elimination in B5 and B6]
Copy Propagation
Assignments of the form f := g are called copy statements. The idea behind the copy-propagation transformation is to use g in place of f wherever possible after the copy statement f := g.
For example:
x = Pi;
A = x * r * r;
The optimization using copy propagation can be done as follows:
A = Pi * r * r;
Here the variable x is eliminated.
Dead Code Elimination
Dead code is a variable or the result of an expression that the programmer computed but that has no further use. By eliminating such useless code, the program gets optimized. For example:
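The example figure is not reproduced in these notes; a typical illustration (variable names are chosen only for this sketch) is:
i = 0;
if (i == 1)
{
a = b + 5;
}
The condition i == 1 can never be true here, so the assignment a = b + 5 is dead and the entire if-statement can be eliminated.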
Constant folding
Deducing at compile time that the value of an expression is a constant, and using that constant instead, is known as constant folding. Expressions whose operands are all known at compile time are evaluated by the compiler itself.
For example
Initial code:
x = 2 * 3;
Optimized code:
x = 6;
Loop Optimizations: Programs tend to spend the bulk of their time in loops, especially inner loops. The running time of a program may therefore be improved by decreasing the number of instructions in an inner loop, even if the amount of code outside that loop increases. Some loop optimization techniques are:
Frequency Reduction (Code Motion)
In frequency reduction, the amount of code inside the loop is decreased. A statement or expression that can be moved outside the loop body without affecting the semantics of the program is moved outside the loop.
For example
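The example is missing from the source; a standard illustration (names are illustrative) moves the loop-invariant computation max - 1 out of the loop:
Before optimization:
while (i <= max - 1)
{
sum = sum + a[i];
i = i + 1;
}
After optimization:
n = max - 1;
while (i <= n)
{
sum = sum + a[i];
i = i + 1;
}
The expression max - 1 is now evaluated once before the loop instead of on every iteration.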
Induction-variable elimination
A variable x is said to be an "induction variable" if there is a positive or negative constant c such that each time x is assigned, its value increases by c. For instance, i and t1 are induction variables in the loop containing B2.
[Figure: flow graph after induction-variable elimination]
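The flow-graph figure is not reproduced; the transformation can be sketched on a loop body like B2 (the three-address names here are illustrative):
Before:
i := i + 1
t1 := 4 * i
if t1 < v goto B2
After eliminating i (assuming i has no other uses):
t1 := t1 + 4
if t1 < v goto B2
Since t1 always equals 4*i, incrementing t1 by 4 has the same effect as recomputing it from the incremented i, and the loop test on i can be carried out on t1 instead.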
Reduction in Strength
The strength of certain operators is higher than that of others; for example, the strength of * is higher than that of +. Replacement of a higher-strength operator by an equivalent lower-strength operator is called strength reduction.
For example
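The example that followed is missing from the source; a simple illustration (variable names are chosen only for this sketch) replaces a multiplication by a power of two with a cheaper shift:
Before optimization:
val = i * 8;
After optimization:
val = i << 3;
Shifts and additions are typically cheaper than multiplication on most targets.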
5.2 PEEPHOLE OPTIMIZATION
A statement-by-statement code-generation strategy often produces target code with redundant instructions and suboptimal constructs. Peephole optimization improves such code by examining a short sequence of instructions (the peephole) and replacing it with a shorter or faster equivalent sequence whenever possible. Some characteristic peephole optimizations are:
1. Redundant Instruction Elimination: Redundant loads and stores, such as a store to a location immediately followed by a load from the same location, can be deleted.
2. Constant Folding: Expressions whose operands are constants are evaluated at compile time.
Initial code:
x = 2 * 3;
Optimized code:
x = 6;
3. Strength Reduction: Operators that consume more execution time are replaced by operators that consume less execution time.
For example:
Initial code:
y = x * 2;
Optimized code:
y = x + x;
4. Algebraic Simplification: Peephole optimization is an effective technique for algebraic simplification. Statements such as x = x + 0 or x = x * 1 can be eliminated by peephole optimization.
5. Machine Idioms: The target machine may have single instructions that implement certain operations efficiently. Replacing sequences of target instructions by such equivalent machine instructions improves efficiency. For example, some machines have auto-increment or auto-decrement addressing modes that perform an add or subtract as part of another instruction.
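For instance (using schematic target instructions), the statement a = a + 1 might be compiled as
MOV a, R0
ADD #1, R0
MOV R0, a
but on a machine with an increment instruction it can be replaced by the single instruction
INC a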
5.3 DIRECTED ACYCLIC GRAPH
The Directed Acyclic Graph (DAG) is used to represent the structure of basic blocks, to
visualize the flow of values between basic blocks, and to provide optimization techniques in
the basic block. Basic block is a sequence of statements without jump or halt where control enters
at the beginning and leaves at the end. A DAG for basic block is a directed acyclic graph with the
following labels on nodes:
1. The leaves of the graph are labeled by unique identifiers, which can be variable names or constants.
2. Interior nodes of the graph are labeled by an operator symbol.
3. Nodes may also be given a sequence of identifiers as labels, recording the variables that hold the computed value.
Algorithm for construction of DAG
Input: A basic block.
Output: A DAG for the basic block, with a label for each node (an identifier for leaves, an operator for interior nodes) and a list of attached identifiers for each node.
Method: For each three-address statement of the form x := y op z in the block:
1. If node(y) is undefined, create a leaf labeled y; do the same for z.
2. Determine whether there is an existing node labeled op whose left child is node(y) and whose right child is node(z); this step detects common sub-expressions. If there is no such node, create one.
3. Delete x from the list of attached identifiers of its old node, append x to the list of identifiers of the node found in step 2, and set node(x) to that node.
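A minimal C sketch of this method for statements of the form x := y op z is shown below. All names (Node, find_id, leaf, stmt) are invented for this illustration; array indexing and copy statements are not handled.
#include <stdio.h>
#include <string.h>

struct Node {
    char op[8];        /* operator symbol; "" marks a leaf      */
    int  l, r;         /* child indices; -1 for leaves          */
    char label[16];    /* identifier/constant label of a leaf   */
    char ids[8][16];   /* identifiers attached to this node     */
    int  nids;
};
static struct Node dag[64];
static int n = 0;

/* node(x): the node whose value x currently names, if any */
static int find_id(const char *x) {
    for (int i = n - 1; i >= 0; i--) {
        for (int k = 0; k < dag[i].nids; k++)
            if (strcmp(dag[i].ids[k], x) == 0) return i;
        if (dag[i].op[0] == '\0' && strcmp(dag[i].label, x) == 0) return i;
    }
    return -1;
}

/* Step 1: create a leaf for x unless node(x) already exists */
static int leaf(const char *x) {
    int i = find_id(x);
    if (i >= 0) return i;
    dag[n].op[0] = '\0'; dag[n].l = dag[n].r = -1;
    strcpy(dag[n].label, x); dag[n].nids = 0;
    return n++;
}

/* Step 3 (first half): remove x from its old node's identifier list */
static void detach(const char *x) {
    for (int i = 0; i < n; i++)
        for (int k = 0; k < dag[i].nids; k++)
            if (strcmp(dag[i].ids[k], x) == 0) {
                dag[i].nids--;
                if (k < dag[i].nids) strcpy(dag[i].ids[k], dag[i].ids[dag[i].nids]);
                return;
            }
}

/* Process one statement x := y op z (steps 1-3 of the method) */
static void stmt(const char *x, const char *y, const char *op, const char *z) {
    int ly = leaf(y), lz = leaf(z), i;
    for (i = 0; i < n; i++)   /* step 2: reuse an existing op node (CSE) */
        if (dag[i].op[0] && strcmp(dag[i].op, op) == 0 &&
            dag[i].l == ly && dag[i].r == lz) break;
    if (i == n) {             /* no match: create a new interior node */
        strcpy(dag[i].op, op); dag[i].l = ly; dag[i].r = lz;
        dag[i].nids = 0; dag[i].label[0] = '\0'; n++;
    }
    detach(x);                /* step 3: x now labels the node found */
    strcpy(dag[i].ids[dag[i].nids++], x);
}

int main(void) {
    stmt("S1", "4", "*", "i");
    stmt("S3", "4", "*", "i");   /* common sub-expression: same node as S1 */
    printf("nodes: %d\n", n);    /* prints 3: leaves 4, i and one * node   */
    return 0;
}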
[Figures: DAGs for Expression 2 and Expression 3]
Example 2:
T1 := 4 * I0
T2 := a[T1]
T3 := 4 * I0
T4 := b[T3]
T5 := T2 * T4
T6 := prod + T5
prod := T6
T7 := I0 + 1
I0 := T7
if I0 <= 20 goto (1)
Example 3:
Consider the following three-address statements:
S1 := 4 * i
S2 := a[S1]
S3 := 4 * i
S4 := b[S3]
S5 := S2 * S4
S6 := prod + S5
prod := S6
S7 := i + 1
i := S7
if i <= 20 goto (1)
Stages:
[Figures: step-by-step construction of the DAG for the above block]
Applications of DAGs
Several pieces of useful information can be obtained while executing the preceding algorithm:
First, we can automatically detect common sub-expressions.
Second, we can determine which identifiers have their values used in the block.
Third, we can determine which statements compute values that could be used outside the block.
5.4 OPTIMIZATION OF BASIC BLOCKS
Basic block is a sequence of statements without jump or halt where control enters at
the beginning and leaves at the end. Optimization is applied to the basic blocks after the
intermediate code generation phase of the compiler. Optimization is the process of transforming a program so that it produces the same results while consuming fewer resources.
There are two types of basic block optimizations:
1. Structure preserving transformations
2. Algebraic transformations
The structure-preserving transformations on basic blocks include:
o Dead Code Elimination
o Common Sub expression Elimination
o Renaming of Temporary variables
o Interchange of two independent adjacent statements
Algebraic transformations on basic blocks include:
o Constant Folding
o Copy Propagation
o Strength Reduction
1. Structure-Preserving Transformations
Dead Code Elimination
Dead code is a variable or the result of an expression that the programmer computed but that has no further use. By eliminating such useless code, the program gets optimized (see the example under Section 5.1).
Copy Propagation
Assignments of the form f := g are called copy statements. The idea behind the copy-propagation transformation is to use g in place of f wherever possible after the copy statement f := g.
For example:
x = Pi;
A = x * r * r;
The optimization using copy propagation can be done as follows:
A = Pi * r * r;
Here the variable x is eliminated.
Reduction in Strength
The strength of certain operators is higher than that of others; for example, the strength of * is higher than that of +. Replacement of a higher-strength operator by an equivalent lower-strength operator is called strength reduction. For example:
Before optimization:
for (i = 1; i <= 10; i++)
{
sum = i * 7;
printf("%d", sum);
}
After optimization:
temp = 0;
for (i = 1; i <= 10; i++)
{
temp = temp + 7;
sum = temp;
printf("%d", sum);
}
Here the multiplication i * 7 inside the loop is replaced by a running addition (temp is initialized to 0 so that both versions print 7, 14, ..., 70).
5.5 GLOBAL DATA FLOW ANALYSIS AND ALGORITHM
To optimize the code efficiently, the compiler collects information about the whole program and distributes this information to each block of the flow graph. This process is known as data-flow analysis. Certain optimizations can be achieved only by examining the entire program; they cannot be achieved by examining just a portion of it.
Below are some basic terminologies related to data flow analysis.
Definition Point- A definition point is a point in a program that defines a data item.
Reference Point- A reference point is a point in a program that contains a reference to
a data item.
Evaluation Point- An evaluation point is a point in a program that contains an
expression to be evaluated.
[Figure: example of a definition point, a reference point, and an evaluation point in a program]
Data-flow information can be collected by setting up and solving systems of equations of the
form :
out[S] = gen[S] U (in[S] - kill[S])
where
out[S] is the information at the end of the statement S,
gen[S] is the information generated by the statement S,
in[S] is the information at the beginning of the statement S, and
kill[S] is the information killed or removed by the statement S.
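As a small worked instance (the statement labels d1 and d2 are invented for illustration): for reaching definitions, a statement d1: a := b + c has gen[d1] = {d1}, and kill[d1] is the set of all other definitions of a in the program, say {d2}. If in[d1] = {d2}, the equation gives out[d1] = {d1} U ({d2} - {d2}) = {d1}.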
Points and Paths:
Within a basic block, we talk of the point between two adjacent statements, as well as the
point before the first statement and after the last. Thus, block B1 has four points: one before
any of the assignments and one after each of the three assignments.
Now let us take a global view and consider all the points in all the blocks. A path from p1 to pn is a sequence of points p1, p2, ..., pn such that for each i between 1 and n-1, either
1. pi is the point immediately preceding a statement and pi+1 is the point immediately following that statement in the same block, or
2. pi is the end of some block and pi+1 is the beginning of a successor block.
Reaching Definitions:
A definition D reaches a point x if D is not killed or redefined before that point. Reaching definitions are generally used in variable/constant propagation. In the example, D1 is a reaching definition for block B2, since the value of x is not changed (it is still 2), but D1 is not a reaching definition for block B3, because the value of x is changed to x + 2 before B3. That is, D1 is killed or redefined by D2.
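The figure for this example is not reproduced; the situation it describes can be sketched as follows (the block contents are assumed for illustration):
B1: D1: x = 2
B2: D2: x = x + 2
B3: y = x
D1 reaches the beginning of B2, but only D2 reaches B3.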
Live Variable
A variable x is said to be live at a point p if the value it holds at p may still be used later, i.e., it is not killed or redefined before such a use. If the variable's value is killed or redefined before being used, it is said to be dead.
Liveness information is generally used in register allocation and dead-code elimination.
In the above example, the variable a is live in blocks B1, B2, B3, and B4, but is killed in block B5, where its value is changed from 2 to b + c. Similarly, variable b is live in block B3 but is killed in block B4.
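The figure is again not reproduced; a sketch consistent with the description (block contents assumed for illustration) is:
B1: a = 2        (a becomes live)
B2: ... uses a ...
B3: b = a + 1    (b becomes live)
B4: b = 7        (kills the previous value of b)
B5: a = b + c    (kills the previous value of a)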
Computation of in and out:
Many data-flow problems can be solved by synthesized translations that compute gen and kill bottom-up over the statement syntax. This can be used, for example, to determine reaching definitions.
The set out[S] is defined similarly for the end of S. It is important to note the distinction between out[S] and gen[S]: the latter is the set of definitions that reach the end of S without following paths outside S. Assuming we know in[S], we compute out[S] by the equation
out[S] = gen[S] U (in[S] - kill[S])
Consider a cascade of two statements S1; S2, as in the second case. We start by observing that in[S1] = in[S]. Then we recursively compute out[S1], which gives us in[S2], since a definition reaches the beginning of S2 if and only if it reaches the end of S1. Now we can compute out[S2], and this set equals out[S]. Next consider the if-statement. Since we conservatively assume that control can follow either branch, a definition reaches the beginning of S1 or S2 exactly when it reaches the beginning of S. That is,
in[S1] = in[S2] = in[S]
A definition reaches the end of S if and only if it reaches the end of one or both substatements; i.e.,
out[S] = out[S1] U out[S2]
Data-flow analysis of structured programs:
Flow graphs for control flow constructs such as do-while statements have a useful property.
There is a single beginning point at which control enters and a single end point that
control leaves from when execution of the statement is over.
We exploit this property when we talk of the definitions reaching the beginning and the
end of statements with the following syntax.
S → id := E | S ; S | if E then S else S | do S while E
E → id + id | id
We define a portion of a flow graph called a region to be a set of nodes N that includes
a header, which dominates all other nodes in the region.
All edges between nodes in N are in the region, except for some that enter the header.
Some control structures and their data-flow equations are shown below.
[Figure: data-flow equations for a cascade of two statements S1; S2]
Under what circumstances is a definition d generated by S = S1; S2? First of all, if it is generated by S2, then it is surely generated by S. If d is generated by S1, it reaches the end of S provided it is not killed by S2. Thus, we write
gen[S] = gen[S2] U (gen[S1] - kill[S2])
Similar reasoning applies to the killing of a definition, so we have
kill[S] = kill[S2] U (kill[S1] - gen[S2])
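As a small worked instance (labels invented for illustration): if S1 is d1: a := 1 and S2 is d2: a := 2, then gen[S1] = {d1}, gen[S2] = {d2}, and kill[S2] contains d1, so gen[S] = {d2} U ({d1} - {d1}) = {d2}; only the later definition of a survives to the end of S.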
An iterative algorithm
The most common way of solving the data-flow equations is by using an iterative algorithm.
It starts with an approximation of the in-state of each block. The out-states are then computed
by applying the transfer functions on the in-states. From these, the in-states are updated by
applying the join operations. The latter two steps are repeated until we reach the so-called fixpoint: the situation in which the in-states (and, in consequence, the out-states) do not change.
A basic algorithm for solving data-flow equations is the round-robin iterative algorithm:
for i ← 1 to N
    initialize node i
while (sets are still changing)
    for i ← 1 to N
        recompute sets at node i
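A compact, runnable C sketch of this round-robin scheme for reaching definitions, using bit vectors, is given below. The four-block flow graph and the gen/kill sets are invented purely for illustration.
#include <stdio.h>

#define NB 4   /* number of basic blocks B1..B4 */

int main(void) {
    /* predecessors of each block, terminated by -1
       (graph: B1->B2, B2->B3, B3->B4, B4->B2 back edge) */
    int pred[NB][NB + 1] = { {-1}, {0, 3, -1}, {1, -1}, {2, -1} };
    /* bit k represents definition d(k+1) */
    unsigned gen[NB]  = { 0x1, 0x2, 0x4, 0x8 };
    unsigned kill[NB] = { 0x2, 0x1, 0x8, 0x4 };
    unsigned in[NB] = { 0 }, out[NB] = { 0 };

    int changed = 1;
    while (changed) {                          /* iterate until fixpoint */
        changed = 0;
        for (int i = 0; i < NB; i++) {
            unsigned nin = 0;
            for (int k = 0; pred[i][k] != -1; k++)
                nin |= out[pred[i][k]];        /* in[B] = union of out[preds] */
            unsigned nout = gen[i] | (nin & ~kill[i]);
            if (nin != in[i] || nout != out[i]) changed = 1;
            in[i] = nin; out[i] = nout;
        }
    }
    for (int i = 0; i < NB; i++)
        printf("B%d: in=0x%x out=0x%x\n", i + 1, in[i], out[i]);
    return 0;
}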
A typical machine-learning-based optimization workflow has four stages: stage (a) investigates data structures that may be useful, which are then summarized as feature vectors in stage (b). In stage (c), training examples consisting of feature vectors and the correct answer are passed to a machine learning tool. In stage (d), the learned model is inserted into the compiler.
Recent research has shown that machine learning (ML) can unlock more opportunities in compiler optimization by replacing complicated heuristics with ML policies, since heuristics become increasingly difficult to improve over time.
Heuristics are algorithms that, empirically, produce reasonably good results for hard problems within pragmatic constraints (e.g., running "reasonably fast"). In compilers, heuristics are widely used in optimization passes, even those leveraging profile feedback, such as inlining and register allocation. Such passes have a significant impact on the performance of a broad variety of programs. These problems are often NP-hard, and searching for optimal solutions may require exponentially increasing amounts of time or memory.
5.6.3 Machine Learning Guided Optimizations (MLGO)
MLGO is an industrial-grade general framework for integrating ML techniques systematically into LLVM (an open-source industrial compiler infrastructure that is ubiquitous for building mission-critical, high-performance software). MLGO uses reinforcement learning (RL) to train neural networks to make decisions that can replace heuristics in LLVM. Two MLGO optimizations have been described for LLVM: (1) reducing code size with inlining decisions, and (2) improving code performance with register allocation.