CD Unit-V

This document discusses machine-independent optimization techniques in compiler design. It covers the principal sources of optimization like common subexpression elimination, copy propagation, dead code elimination, and constant folding. It also introduces data flow analysis techniques like constant propagation and partial redundancy elimination that are used to optimize programs. Loops are an important part of programs and the document discusses loop optimizations like code motion, induction variable elimination, and reduction in strength.

UNIT – V

Machine-Independent Optimization: The Principal Sources of Optimization, Introduction to Data-Flow Analysis, Foundations of Data-Flow Analysis, Constant Propagation, Partial-Redundancy Elimination, Loops in Flow Graphs.

5.1 THE PRINCIPAL SOURCES OF OPTIMIZATION:

 A transformation of a program is called local if it can be performed by looking only at the statements in a basic block; otherwise, it is called global.

 Many transformations can be performed at both the local and global levels. Local transformations are usually performed first.
Function-Preserving Transformations
 There are a number of ways in which a compiler can improve a program without changing the function it computes:

 Common subexpression elimination

 Copy propagation

 Dead-code elimination, and

 Constant folding

Common Subexpression Elimination

An occurrence of an expression E is called a common subexpression if E was previously computed and the values of the variables in E have not changed since the previous computation. We can avoid recomputing the expression if we can use the previously computed value.
For example
t1 := 4*i
t2 := a[t1]
t3 := 4*j
t4 := 4*i
t5 := n
t6 := b[t4] + t5

The above code can be optimized using common subexpression elimination as
t1 := 4*i
t2 := a[t1]
t3 := 4*j
t5 := n
t6 := b[t1] + t5
The common subexpression t4 := 4*i is eliminated, since its value is already available in t1 and the value of i has not changed between the definition and the use.
Copy Propagation
Assignments of the form f := g are called copy statements, or copies for short. The idea behind the copy-propagation transformation is to use g for f wherever possible after the copy statement f := g. Copy propagation means using one variable instead of another. This may not appear to be an improvement by itself, but as we shall see, it gives us an opportunity to eliminate x.
For example:
x = Pi;
……
A = x*r*r;
The optimization using copy propagation can be done as follows:
A = Pi*r*r;
Here the variable x is eliminated.
Dead-Code Eliminations:
A variable is live at a point in a program if its value can be used subsequently; otherwise, it is
dead at that point. A related idea is dead or useless code, statements that compute values that
never get used. While the programmer is unlikely to introduce any dead code intentionally, it
may appear as the result of previous transformations. An optimization can be done by
eliminating dead code.
For example:
i=0;
if(i==1)
{
a=b+5;
}
Here, the ‘if’ statement is dead code because the condition i==1 can never be satisfied: i was just assigned 0 and is not changed in between.
Constant folding:
Deducing at compile time that the value of an expression is a constant, and using the constant instead, is known as constant folding. One advantage of copy propagation is that it often turns the copy statement into dead code, which can then be eliminated.

Page 2 of 10
For example:
a = 3.14157/2 can be replaced by
a = 1.570785, thereby eliminating a division operation at run time.
Loop Optimizations:
We now give a brief introduction to a very important place for optimizations, namely loops,
especially the inner loops where programs tend to spend the bulk of their time. The running
time of a program may be improved if we decrease the number of instructions in an inner
loop, even if we increase the amount of code outside that loop.
Three techniques are important for loop optimization:
 Code motion, which moves loop-invariant code outside a loop;

 Induction-variable elimination, which eliminates redundant induction variables from inner loops; and

 Reduction in strength, which replaces an expensive operation by a cheaper one, such as a multiplication by an addition.

Code Motion:
An important modification that decreases the amount of code in a loop is code motion. This transformation takes an expression that yields the same result independent of the number of times a loop is executed (a loop-invariant computation) and places the expression before the loop. Note that the notion “before the loop” assumes the existence of an entry for the loop.
For example:
Evaluation of limit-2 is a loop-invariant computation in the following while-statement:
while (i <= limit-2) /* statement does not change limit*/
Code motion will result in the equivalent of
t= limit-2;
while (i<=t) /* statement does not change limit or t */
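The transformation above can be sketched as a runnable before/after pair. The summing loop body below is invented purely for illustration; only the hoisting of limit - 2 mirrors the example in the text.

```python
# A minimal before/after sketch of code motion.  Only the hoisting of
# limit - 2 corresponds to the text; the loop body is illustrative.

def before(limit, items):
    total, i = 0, 0
    while i <= limit - 2:       # limit - 2 is re-evaluated on every test
        total += items[i]
        i += 1
    return total

def after(limit, items):
    total, i = 0, 0
    t = limit - 2               # loop-invariant computation hoisted out
    while i <= t:
        total += items[i]
        i += 1
    return total

assert before(6, [1, 2, 3, 4, 5, 6]) == after(6, [1, 2, 3, 4, 5, 6]) == 15
```

Both versions compute the same sum; the second simply evaluates the invariant expression once, before the loop.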

Induction Variables :
Loops are usually processed inside out. For example, consider the loop around B3. Note that the values of j and t4 remain in lock-step: every time the value of j decreases by 1, that of t4 decreases by 4, because 4*j is assigned to t4. Such identifiers are called induction variables.

When there are two or more induction variables in a loop, it may be possible to get rid of all but one, by the process of induction-variable elimination. For the inner loop around B3 in the figure, we cannot get rid of either j or t4 completely; t4 is used in B3 and j in B4. However, we can illustrate reduction in strength and a part of the process of induction-variable elimination. Eventually j will be eliminated when the outer loop of B2 - B5 is considered.
Example:
As the relationship t4 := 4*j surely holds after such an assignment to t4, and t4 is not changed elsewhere in the inner loop around B3, it follows that just after the statement j := j-1 the relationship t4 = 4*j - 4 must hold. We may therefore replace the assignment t4 := 4*j by t4 := t4-4. The only problem is that t4 does not have a value when we enter block B3 for the first time. Since we must maintain the relationship t4 = 4*j on entry to the block B3, we place an initialization of t4 at the end of the block where j itself is initialized.

The replacement of a multiplication by a subtraction will speed up the object code if multiplication takes more time than addition or subtraction, as is the case on many machines.
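The replacement described above can be sketched as two equivalent loops. The loop bodies are invented for illustration; the names j and t4 follow the running example, and t4 is initialized before the loop, mirroring the initialization placed at the end of j's block.

```python
# Sketch of induction-variable strength reduction: t4 tracks 4*j, so the
# per-iteration multiplication can be replaced by a subtraction.

def with_multiplication(n):
    results = []
    j = n
    while j > 0:
        t4 = 4 * j              # recomputed on every iteration
        results.append(t4)
        j = j - 1
    return results

def with_strength_reduction(n):
    results = []
    j = n
    t4 = 4 * j                  # initialized once, before the loop
    while j > 0:
        results.append(t4)
        j = j - 1
        t4 = t4 - 4             # maintains the invariant t4 == 4*j
    return results

assert with_multiplication(5) == with_strength_reduction(5)
```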
Reduction In Strength:
Reduction in strength replaces expensive operations by equivalent cheaper ones on the target
machine. Certain machine instructions are considerably cheaper than others and can often be
used as special cases of more expensive operators. For example, x² is invariably cheaper to
implement as x*x than as a call to an exponentiation routine. Fixed-point multiplication or
division by a power of two is cheaper to implement as a shift. Floating-point division by a
constant can be implemented as multiplication by a constant, which may be cheaper.
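The replacements mentioned above can be illustrated with plain Python expressions; this is only a hedged sketch of the idea, since a real compiler applies these rewrites to target-machine instructions rather than to source code.

```python
# Illustrations of strength reduction at the expression level.

x = 7.5
assert x * x == x ** 2        # x^2 as a multiplication, not a power routine

n = 80
assert (n >> 3) == n // 8     # fixed-point division by 2^3 as a right shift

d = 4.0
# Division by a constant as multiplication by its reciprocal; exact here
# because 4.0 and 0.25 are powers of two.
assert 0.25 * d == d / 4.0
```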

5.2 INTRODUCTION TO DATA-FLOW ANALYSIS:
In order to do code optimization and a good job of code generation, a compiler needs to collect information about the program as a whole and to distribute this information to each block in the flow graph. For example, a compiler could take advantage of “reaching definitions”, such as knowing where a variable like debug was last defined before reaching a given block, in order to perform transformations; this is just one example of the data-flow information that an optimizing compiler collects by a process known as data-flow analysis. Data-flow information can be collected by setting up and solving systems of equations of the form:

out[S] = gen[S] U (in[S] – kill[S])

This equation can be read as “ the information at the end of a statement is either generated
within the statement , or enters at the beginning and is not killed as control flows through the
statement.”

 The notions of generating and killing depend on the desired information, i.e., on the data-flow analysis problem to be solved. Moreover, for some problems, instead of proceeding along with the flow of control and defining out[s] in terms of in[s], we need to proceed backwards and define in[s] in terms of out[s].

 Since data flows along control paths, data-flow analysis is affected by the constructs in a program. In fact, when we write out[s] we implicitly assume that there is a unique end point where control leaves the statement; in general, equations are set up at the level of basic blocks rather than statements, because blocks do have unique end points.

 There are subtleties that go along with such statements as procedure calls,
assignments through pointer variables, and even assignments to array variables.
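The equation above can be sketched directly with Python sets, one set of definition labels per block; the labels d1..d4 below are made up purely for illustration.

```python
# Sketch of the data-flow equation out[S] = gen[S] U (in[S] - kill[S]).

def out_set(gen, kill, in_):
    """Information at the end of S: generated in S, or entering and not killed."""
    return gen | (in_ - kill)

in_  = {"d1", "d2", "d3"}   # definitions reaching the start of S
gen  = {"d4"}               # definitions generated inside S
kill = {"d1"}               # definitions S overwrites

assert out_set(gen, kill, in_) == {"d2", "d3", "d4"}
```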

5.3 CONSTANT PROPAGATION


Constants assigned to a variable can be propagated through the flow graph and substituted at
the use of the variable.
Example:
In the code fragment below, the value of x can be propagated to the use of x.
x = 3;
y = x + 4;
Below is the code fragment after constant propagation and constant folding.

x = 3;
y = 7;
Notes:
Some compilers perform constant propagation within basic blocks; some compilers perform
constant propagation in more complex control flow.
Some compilers perform constant propagation for integer constants, but not floating-point
constants.
Few compilers perform constant propagation through bitfield assignments.
Few compilers perform constant propagation for address constants through pointer
assignments.
5.4 LOOPS IN FLOW GRAPHS
A graph representation of three-address statements, called a flow graph, is useful for
understanding code-generation algorithms, even if the graph is not explicitly constructed by a
code-generation algorithm. Nodes in the flow graph represent computations, and the edges
represent the flow of control.
Dominators:
In a flow graph, a node d dominates node n if every path from the initial node of the flow graph to n goes through d. This is denoted d dom n. The initial node dominates all the remaining nodes in the flow graph, and the entry of a loop dominates all nodes in the loop. Similarly, every node dominates itself.

Example:
 In the flow graph below,
 Initial node, node 1, dominates every node.
 Node 2 dominates only itself.
 Node 3 dominates all but 1 and 2.
 Node 4 dominates all but 1, 2 and 3.
 Nodes 5 and 6 dominate only themselves, since flow of control can skip around either by going through the other.
 Node 7 dominates 7, 8, 9 and 10.
 Node 8 dominates 8, 9 and 10.
 Nodes 9 and 10 dominate only themselves.

Fig. 5.3(a) Flow graph (b) Dominator tree

The way of presenting dominator information is in a tree, called the dominator tree, in which
• The initial node is the root.
• The parent of each other node is its immediate dominator.
• Each node d dominates only its descendants in the tree.

The existence of the dominator tree follows from a property of dominators: each node n has a unique immediate dominator m that is the last dominator of n on any path from the initial node to n. In terms of the dom relation, the immediate dominator m has the property that if d != n and d dom n, then d dom m.
D(1)={1}
D(2)={1,2}
D(3)={1,3}
D(4)={1,3,4}
D(5)={1,3,4,5}
D(6)={1,3,4,6}
D(7)={1,3,4,7}
D(8)={1,3,4,7,8}
D(9)={1,3,4,7,8,9}
D(10)={1,3,4,7,8,10}
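The D(n) sets above can be reproduced by the standard iterative algorithm: initialize D(entry) = {entry} and D(n) = the set of all nodes for every other n, then repeatedly replace D(n) by {n} U the intersection of D(p) over all predecessors p, until nothing changes. The edge list below is a reconstruction of the ten-node flow graph of Fig. 5.3, which is not reproduced in the text; it is chosen to be consistent with the dominator sets and back edges listed in these notes.

```python
# Iterative dominator computation (sketch).  The edges reconstruct the
# ten-node flow graph of Fig. 5.3, consistent with the D(n) sets above.

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5), (4, 6),
         (5, 7), (6, 7), (7, 4), (7, 8), (8, 3), (8, 9), (8, 10),
         (9, 1), (10, 7)]
nodes = list(range(1, 11))
preds = {n: {a for (a, b) in edges if b == n} for n in nodes}

dom = {n: set(nodes) for n in nodes}   # start with "every node dominates n"
dom[1] = {1}                           # ...except for the initial node

changed = True
while changed:
    changed = False
    for n in nodes[1:]:                # every node but the initial one
        new = {n} | set.intersection(*(dom[p] for p in preds[n]))
        if new != dom[n]:
            dom[n], changed = new, True

assert dom[4] == {1, 3, 4}
assert dom[10] == {1, 3, 4, 7, 8, 10}
```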
Natural Loops:
One application of dominator information is in determining the loops of a flow graph suitable
for improvement. There are two essential properties of loops:

 A loop must have a single entry point, called the header. This entry point dominates all nodes in the loop, or it would not be the sole entry to the loop.

 There must be at least one way to iterate the loop, i.e., at least one path back to the header.

One way to find all the loops in a flow graph is to search for edges in the flow graph whose heads dominate their tails. If a → b is an edge, b is the head and a is the tail. These types of edges are called back edges.

Example:
In the above graph,
7 → 4    4 DOM 7
10 → 7   7 DOM 10
4 → 3    3 DOM 4
8 → 3    3 DOM 8
9 → 1    1 DOM 9
Each of these back edges gives rise to a loop in the flow graph. Given a back edge n → d, we define the natural loop of the edge to be d plus the set of nodes that can reach n without going through d. Node d is the header of the loop.
Algorithm: Constructing the natural loop of a back edge.
Input: A flow graph G and a back edge n → d.
Output: The set loop consisting of all nodes in the natural loop of n → d.
Method: Beginning with node n, we consider each node m ≠ d that we know is in loop, to make sure that m’s predecessors are also placed in loop. Each node in loop, except for d, is placed once on stack, so its predecessors will be examined. Note that because d is put in loop initially, we never examine its predecessors, and thus find only those nodes that reach n without going through d.
procedure insert(m):
    if m is not in loop then begin
        loop := loop U {m};
        push m onto stack
    end;

/* main program */
stack := empty;
loop := {d};
insert(n);
while stack is not empty do begin
    pop m, the first element of stack, off stack;
    for each predecessor p of m do insert(p)
end
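The procedure above can be written as runnable Python; the edge list is the same reconstruction of the Fig. 5.3 flow graph used for the dominator example, and for the back edge 7 → 4 it collects {4, 5, 6, 7, 8, 10}.

```python
# Runnable sketch of the natural-loop construction for a back edge n -> d.

def natural_loop(preds, n, d):
    """Nodes of the natural loop of back edge n -> d, with header d."""
    loop = {d}                  # d goes in first, so we never pass through it
    stack = []

    def insert(m):
        if m not in loop:
            loop.add(m)
            stack.append(m)

    insert(n)
    while stack:
        m = stack.pop()
        for p in preds[m]:      # predecessors of m also reach n
            insert(p)
    return loop

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5), (4, 6),
         (5, 7), (6, 7), (7, 4), (7, 8), (8, 3), (8, 9), (8, 10),
         (9, 1), (10, 7)]
preds = {n: {a for (a, b) in edges if b == n} for n in range(1, 11)}

assert natural_loop(preds, 7, 4) == {4, 5, 6, 7, 8, 10}
```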
Inner loops:
If we use the natural loops as “the loops”, then we have the useful property that unless two loops have the same header, they are either disjoint or one is entirely contained in the other. Thus, neglecting loops with the same header for the moment, we have a natural notion of inner loop: one that contains no other loop.

When two natural loops have the same header, but neither is nested within the other, they are
combined and treated as a single loop.
Pre-Headers:
Several transformations require us to move statements “before the header”. Therefore we begin the treatment of a loop L by creating a new block, called the pre-header. The pre-header has only the header as successor, and all edges which formerly entered the header of L from outside L instead enter the pre-header. Edges from inside loop L to the header are not changed. Initially the pre-header is empty, but transformations on L may place statements in it.

Fig. 5.4 Two loops with the same header

Fig. 5.5 Introduction of the preheader

Reducible flow graphs:
Reducible flow graphs are special flow graphs for which several code-optimization transformations are especially easy to perform: loops are unambiguously defined, dominators can be easily calculated, and data-flow analysis problems can be solved efficiently. Exclusive use of structured flow-of-control statements such as if-then-else, while-do, continue, and break produces programs whose flow graphs are always reducible.

The most important properties of reducible flow graphs are that


1. There are no jumps into the middle of loops from outside;
2. The only entry to a loop is through its header.

Definition:
A flow graph G is reducible if and only if we can partition the edges into two disjoint groups,
forward edges and back edges, with the following properties.

1. The forward edges form an acyclic graph in which every node can be reached from the initial node of G.
2. The back edges consist only of edges whose heads dominate their tails.

Example:

The above flow graph is reducible. If we know the relation DOM for a flow graph, we can find and remove all the back edges; the remaining edges are forward edges. If the forward edges form an acyclic graph, then the flow graph is reducible. In the above example, removing the five back edges 4→3, 7→4, 8→3, 9→1 and 10→7, whose heads dominate their tails, leaves an acyclic graph.
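This check can be sketched end to end, again assuming the reconstructed edge list for the graph of Fig. 5.3: compute dominator sets, classify as back edges those edges whose heads dominate their tails, and test the remaining forward edges for acyclicity with a Kahn-style topological sort.

```python
# Reducibility check (sketch): remove back edges, then verify that the
# forward edges form an acyclic graph.

edges = [(1, 2), (1, 3), (2, 3), (3, 4), (4, 3), (4, 5), (4, 6),
         (5, 7), (6, 7), (7, 4), (7, 8), (8, 3), (8, 9), (8, 10),
         (9, 1), (10, 7)]
nodes = list(range(1, 11))
preds = {n: {a for (a, b) in edges if b == n} for n in nodes}

# Dominator sets, computed iteratively as in Section 5.4.
dom = {n: set(nodes) for n in nodes}
dom[1] = {1}
changed = True
while changed:
    changed = False
    for n in nodes[1:]:
        new = {n} | set.intersection(*(dom[p] for p in preds[n]))
        if new != dom[n]:
            dom[n], changed = new, True

back = {(a, b) for (a, b) in edges if b in dom[a]}   # head dominates tail
forward = [e for e in edges if e not in back]

assert back == {(4, 3), (7, 4), (8, 3), (9, 1), (10, 7)}

# Kahn's algorithm: the forward-edge graph is acyclic iff every node can be
# removed in topological order, so the flow graph is reducible.
indeg = {n: 0 for n in nodes}
for (_, b) in forward:
    indeg[b] += 1
ready = [n for n in nodes if indeg[n] == 0]
removed = 0
while ready:
    n = ready.pop()
    removed += 1
    for (a, b) in forward:
        if a == n:
            indeg[b] -= 1
            if indeg[b] == 0:
                ready.append(b)

assert removed == len(nodes)   # all nodes removed: acyclic, hence reducible
```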
