
Dependence-Based Program Analysis

Richard Johnson
Keshav Pingali

Department of Computer Science


Cornell University, Ithaca, NY 14853

Abstract

Program analysis and optimization can be speeded up through the use of the dependence flow graph (DFG), a representation of program dependence which generalizes def-use chains and static single assignment (SSA) form. In this paper, we give a simple graph-theoretic description of the DFG and show how the DFG for a program can be constructed in O(EV) time. We then show how forward and backward dataflow analyses can be performed efficiently on the DFG, using constant propagation and elimination of partial redundancies as examples. These analyses can be framed as solutions of dataflow equations in the DFG. Our construction algorithm is of independent interest because it can be used to construct a program's control dependence graph in O(E) time and its SSA representation in O(EV) time, which are improvements over existing algorithms.

1 Introduction

A number of recent papers have focused attention on the problem of speeding up program optimization [FOW87, BMO90, CCF91, PBJ+91, CFR+91, DRZ92]. Most optimization algorithms are based on dataflow analysis. Classic examples are Kildall's constant propagation algorithm [Kil73], and Morel and Renvoise's algorithm for elimination of partial redundancies [MR79]. These algorithms are usually implemented using vectors of boolean, integer or real values to represent sets of assertions, such as "x is 5 here" or "y+z is available here". One vector is associated with each point in the control flow graph and initialized appropriately. Vector values are computed iteratively by propagating information from the inputs of statements to their outputs in the case of forward analysis, and from outputs to inputs in the case of backward analysis. The analysis terminates when all statements have a consistent set of input and output assertions. Although easy to implement, this approach has a number of disadvantages.

● Information is propagated throughout the control flow graph, not just to where it is needed for optimization. For example, in constant propagation it suffices to propagate information from definitions of variables to their uses. In common subexpression elimination, it is unnecessary to propagate availability of an expression to points where the variables of the expression are dead.
● When the vector at some point in the program is updated, the entire control flow graph below that point (or above it, in backward analysis) may be re-analyzed, even if there are few points in that region affected by the update.
● Many optimizations benefit from analysis performed in stages, but this is difficult to do in the standard approach. Consider redundancy elimination in the following program. To deduce that the computation of y is redundant, we must first deduce that the computation of w is redundant. This kind of analysis in stages is contrary to the standard approach, which considers all assertions simultaneously.

    ...
    z := a + b
    w := a + b
    ...
    x := z + 1
    y := w + 1
    ...

Def-use chains provide a partial solution to these problems. They permit information to flow directly between definitions and uses without going through unrelated statements. However, def-use chains suffer from three drawbacks. First, def-use chains cannot be used for backward dataflow problems, such as the elimination of redundant computations, because they do not incorporate sufficient information about the control structure of the program. Second, this lack of control flow information in def-use chains affects the precision of analysis even in forward dataflow problems such as constant propagation [WZ85, PBJ+91]. Finally, the worst-case size of def-use chains is O(E²V), where E is the number of edges in the control flow graph and V is the number of variables [RT81].

¹ This research was supported by an NSF Presidential Young Investigator award CCR-8958543, NSF grant CCR-90138526, ONR grant N00014-93-1-0103, and a grant from Hewlett-Packard Corporation.

Permission to copy without fee all or part of this material is granted provided that the copies are not made or distributed for direct commercial advantage, the ACM copyright notice and the title of the publication and its date appear, and notice is given that copying is by permission of the Association for Computing Machinery. To copy otherwise, or to republish, requires a fee and/or specific permission.
ACM-SIGPLAN-PLDI-6/93/Albuquerque, N.M.
© 1993 ACM 0-89791-598-4/93/0006/0078...$1.50

The size problem can be overcome by using a "factored" representation of def-use chains called static single assignment (SSA) form, which has worst-case asymptotic size O(EV) [CFR+89, CFR+91]. However, SSA form cannot be used for backward dataflow problems. A generalization of the SSA approach, called the sparse dataflow evaluation graph, has been proposed to address this problem, but sparse graphs take O(N³) time to construct, where N is the number of nodes in the control flow graph [CCF91, DRZ92].

In this paper, we show how these problems can be solved using the dependence flow graph (DFG), which can be viewed as a generalization of def-use chains and SSA form. The DFG was suggested to us by the work of Cartwright and Felleisen, who showed the advantages of an executable representation of program dependence [CF89]. We previously introduced the DFG using a dataflow machine style operational semantics [PBJ+91, Bec92]. Dataflow machine graphs are also the basis of the program dependence web (PDW) of Ballance, Maccabe and Ottenstein [BMO90], as well as the original SSA graphs due to Shapiro and Saint [SS70]. However, our experience in implementing and using a representation based on these ideas is that a full-blown dataflow graph representation is neither necessary nor desirable.

The main contribution of this paper is the distillation and incorporation of the essence of the dataflow graph representation into a traditional optimizing compiler framework. We accomplish this as follows.

In Section 2, we give a simple graph-theoretic characterization of dependence flow graphs. This characterization permits the exorcism of the dataflow execution model from the description of DFGs. It also brings out key connections between our work and prior work on representing control and data dependence.

In Section 3, we describe how to construct DFGs. An important step in this construction is determining when two nodes in a control flow graph have the same control dependence. We describe how to do this in O(E) time. This algorithm is of independent interest since it can be used to build a program's control dependence graph in O(E) time and its SSA representation in O(EV) time, which are improvements over existing algorithms [CFS90, CFR+89].

In Section 4, we show how to use the DFG in a forward dataflow problem: constant propagation with dead code elimination. This algorithm is faster than the standard control flow graph algorithm, yet it does as good a job of optimizing programs.

In Section 5, we show how to solve a backward dataflow problem: anticipatability of expressions, which is an important step in the elimination of partial redundancies [MR79].

Finally, in Section 6, we describe what we have learned so far in our implementation.

2 A graph-theoretic characterization of dependence flow graphs

We give a graph-theoretic characterization of def-use chains, static single assignment form, and the dependence flow graph. This characterization formalizes the relationship between these program representations, permits the design of an efficient construction algorithm for DFGs, and aids in proving optimization algorithms correct.

2.1 Terminology

Definition 1 A control flow graph (CFG) is a directed graph with distinguished nodes start and end such that all nodes are reachable from start and all nodes have a path to end. start is the only node with no predecessors, and end is the only node with no successors.

For convenience in describing algorithms that operate on CFGs, we introduce explicit switch and merge nodes to separate branching and merging of control flow from computation. A switch node is essentially a conditional jump that redirects control flow to one of multiple outgoing edges based on the value of an expression computed within the node. A merge node performs no computation but simply serves as the target of multiple control flow edges. An assignment statement node performs any general, non-branching computation.

It is useful to extend the standard notions of dominance, postdominance and control dependence so that they apply to edges as well as to nodes in the CFG.

Definition 2 A node or edge x is said to dominate node or edge y in a directed graph if every path from start to y includes x.
A node or edge x is said to postdominate node or edge y in a directed graph if every path from y to end includes x.
A node or edge x is said to be control dependent on node n if x postdominates all edges on some path from n to x, but x does not postdominate n. (Intuitively, n is a conditional branch that determines if control will pass through x.)

2.2 Def-use chains

We begin by recalling the standard definition of def-use chains.

Definition 3 A definition of variable x is said to reach a use of x if there is a control flow path from the definition to the use that does not pass through any other definition of x.
A def-use chain for variable x is a node pair (n1, n2) such that n1 defines x, n2 uses x, and the definition of x at n1 reaches the use of x at n2.

For our purpose, it is convenient to recast this in terms of control flow edges rather than nodes.
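As a concrete illustration of Definition 3, here is a minimal sketch (our own illustration, not taken from the paper) that computes def-use chains by iterating the reaching-definitions equations to a fixed point. The three-node example graph and the per-node `defs`/`uses` encoding are hypothetical.

```python
from collections import defaultdict

def def_use_chains(succ, defs, uses):
    """Compute def-use chains (Definition 3) by iterating the
    reaching-definitions equations to a fixed point.
    succ: node -> list of successors; defs: node -> variable
    defined there (or None); uses: node -> set of variables used."""
    reach_in = defaultdict(set)          # node -> {(def_node, var)}
    changed = True
    while changed:
        changed = False
        for n in succ:
            # a definition of x kills every other definition of x
            out = {(d, v) for (d, v) in reach_in[n] if v != defs[n]}
            if defs[n] is not None:
                out.add((n, defs[n]))
            for s in succ[n]:
                if not out <= reach_in[s]:
                    reach_in[s] |= out
                    changed = True
    # a chain pairs a reaching definition with a use of its variable
    return {(d, n) for n in succ
            for (d, v) in reach_in[n] if v in uses[n]}

# n1 defines x, n2 defines y, n3 uses both.
succ = {'n1': ['n2'], 'n2': ['n3'], 'n3': []}
defs = {'n1': 'x', 'n2': 'y', 'n3': None}
uses = {'n1': set(), 'n2': set(), 'n3': {'x', 'y'}}
print(sorted(def_use_chains(succ, defs, uses)))
# → [('n1', 'n3'), ('n2', 'n3')]
```

The sketch visits nodes in arbitrary order; a reverse-postorder schedule would converge faster but is omitted for brevity.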
[Figure omitted: three panels showing the running example as (a) a CFG with def-use chains, (b) SSA form, and (c) DFG form; the legend distinguishes control flow edges, dependences for x, and dependences for y.]

Figure 1: A Comparison of Program Representations

Definition 4 A def-use chain for variable x is an edge pair (e1, e2) such that
1. the source of e1 defines x,
2. the destination of e2 uses x, and
3. there is a control flow path from e1 to e2 with no assignment to x.

Figure 1(a) shows a control flow graph with def-use chains. The limitations of def-use chains have been discussed extensively in the literature [WZ85, PBJ+91], and we summarize them here. Consider the problem of constant propagation using def-use chains: the standard algorithm replaces a use of a variable with a constant if the right hand side of every definition reaching that use is that constant [ASU86]. In Figure 1(a), this algorithm determines that the use of x in the conditional branch can be replaced by the constant 1, and this fact is determined without propagating the value of x through the assignment to y that is between the definition and use of x. Similarly, it determines that the right hand side of the statement y := y+1 can be replaced by the constant 3. However, this algorithm cannot determine that the last use of y in this program can be replaced by the constant 3, since there are two def-use edges carrying different constants that reach this use. By contrast, the standard dataflow analysis algorithm for constant propagation on the control flow graph will find this constant, since it deduces correctly that the false side of the conditional branch is dead.²

² Details of this algorithm are discussed in Section 4.

To summarize, algorithms using def-use chains can produce less optimized code than algorithms performing dataflow analysis directly on the control flow graph. In addition, def-use chains cannot be used for backward dataflow analysis, and the worst-case size of def-use chains is O(E²V) [RT81], which is rather large.

2.3 Static single assignment form

Static single assignment form solves the size problem of def-use chains by introducing a so-called φ-function to combine def-use edges having the same destination [CFR+89, CFR+91]. In an SSA representation, each use of a variable is reached by exactly one definition or φ-function. Figure 1(b) shows the SSA form for the previous example. Notice that the def-use edges for variable y are combined by a φ-function placed at the merge in the control flow graph. SSA edges have the following graph-theoretic characterization.

Definition 5 An SSA edge for variable x corresponds to an edge pair (e1, e2) such that
1. there exists a definition of x that reaches e1,
2. there exists a use of x reachable from e2,
3. there is no assignment to x on any control flow path from e1 to e2, and
4. e1 dominates e2.

The first two conditions assert that an SSA edge connects two points on some path from a definition of x to a use
of x reached by that definition. Conditions 3 and 4 ensure that the only definitions that reach e2 are those that reach e1; otherwise there would be φ-functions between e1 and e2, and there would be no SSA edge from e1 directly to e2. The worst-case size of the SSA representation is O(EV). This solves the size problem of def-use chains, but the SSA form cannot be used directly in backward dataflow analysis problems.

2.4 Dependence flow graph

Figure 1(c) shows the dependence flow graph for the running example. Unlike def-use chains, which go directly from definitions to uses, a DFG edge for a variable x can bypass a region of the control flow graph only if this region is a single-entry single-exit region that does not contain an assignment to x, since such a region has neither data nor control information that is of interest to program analysis. The following theorem defines single-entry single-exit regions formally, and states an important connection between control dependence and DFGs.

Theorem 1 The following are equivalent.
● e1 and e2 enclose a single-entry single-exit region.
● e1 dominates e2, e2 postdominates e1, and every cycle containing e1 also contains e2 and vice versa.
● e1 and e2 have the same control dependence.

For lack of space, we omit the proof of this theorem. A related structure called a hammock has been discussed in the literature [Kas75]. Hammocks are not the same as single-entry single-exit regions since the exit node in a hammock can be the target of edges outside the hammock; besides, the algorithm for finding hammocks is O(EN).

Just as def-use chains are "intercepted" by φ-functions in the SSA representation, they are intercepted by switch and merge operators in the DFG. DFG edges can be characterized as follows:

Definition 6 A DFG edge for variable x corresponds to an edge pair (e1, e2) such that
1. there exists a definition of x that reaches e1,
2. there exists a use of x reachable from e2,
3. there is no assignment to x on any control flow path from e1 to e2,
4. e1 dominates e2,
5. e2 postdominates e1, and
6. every cycle containing e1 also contains e2 and vice versa.

Conditions 1 through 4 are the same as in the SSA representation. In the DFG, the merge operator plays the same role as φ-functions do in SSA form. Conditions 4 through 6 formally specify that the region of the control flow graph between e1 and e2 must be a single-entry single-exit region. For example, in Figure 1(c) the region of the control flow graph between the assignment to x and the use of x is a single-entry single-exit region containing no definition of x. Thus there is a dependence edge from the definition of x directly to the use of x. However, this region contains a definition of y, so dependence edges for y cannot bypass this region; they are intercepted by a switch operator at the conditional branch. In this way, dependences in the DFG are intercepted by merges and switches at merge points and branches respectively in the control flow graph.

3 Constructing DFGs

We now describe the construction algorithm for dependence flow graphs. We first outline our algorithm for identifying single-entry single-exit regions and then show how this information is used to build the DFG.

3.1 Finding single-entry single-exit regions

Abstractly, decomposing a control flow graph into single-entry single-exit regions can be viewed as providing a "parse tree" of the control flow structure. In structured programs, syntactic constructs such as if-then-else and while loops provide information for determining single-entry single-exit regions. For general control flow graphs, however, we need an efficient algorithm for discovering this information.

Consider any two single-entry single-exit regions. It is easy to show that if they overlap, then either one is nested within the other, or the intersection is itself a single-entry single-exit region. If we only consider regions that are not formed by sequentially composing smaller regions, then we can show that single-entry single-exit regions are pairwise either nested, disjoint, or sequentially ordered. Thus these regions give a hierarchical decomposition of a control flow graph's structure.

For lack of space, we sketch our O(E) algorithm for finding single-entry single-exit regions. A longer paper giving the details of this algorithm can be obtained from the authors. Given a control flow graph, we want to find sets of edges having the same control dependence. Such edges are totally ordered, and each pair of consecutive edges are the entry/exit of a single-entry single-exit region.

To find sets of edges having the same control dependence, note that we can insert a dummy node on each edge and then compute the property for nodes.³ We then reduce the problem to one of finding sets of cycle equivalent nodes in a related graph.

Definition 7 Control flow nodes a and b are said to be cycle equivalent if every cycle containing a also contains b and vice versa.

³ Adding E nodes does not change the O(E) time complexity of our algorithm.
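Definition 7 (and Claim 1 below, which adds the edge end → start) can be illustrated with a deliberately naive check, which is our own sketch and nothing like the O(E) algorithm outlined above: a node a lies on a cycle avoiding b exactly when some successor of a can reach a without visiting b. The diamond-shaped example CFG is hypothetical.

```python
def reaches(succ, src, dst, avoid):
    """Is there a path src ->* dst that never visits `avoid`?"""
    stack, seen = [src], set()
    while stack:
        n = stack.pop()
        if n == avoid or n in seen:
            continue
        if n == dst:
            return True
        seen.add(n)
        stack.extend(succ.get(n, ()))
    return False

def has_cycle_avoiding(succ, a, b):
    """Does some cycle contain a but not b?"""
    return any(reaches(succ, s, a, b) for s in succ.get(a, ()))

def cycle_equivalent(succ, a, b):
    # Definition 7: every cycle containing a contains b, and vice versa.
    return (not has_cycle_avoiding(succ, a, b)
            and not has_cycle_avoiding(succ, b, a))

# Claim 1 setup: add the edge end -> start so the CFG becomes
# strongly connected; cycle equivalence then captures
# control dependence equivalence.  A diamond CFG:
cfg = {'start': ['a'], 'a': ['b', 'c'], 'b': ['d'], 'c': ['d'],
       'd': ['end'], 'end': ['start']}
print(cycle_equivalent(cfg, 'a', 'd'))   # True: both execute unconditionally
print(cycle_equivalent(cfg, 'b', 'c'))   # False: opposite branch arms
```

Each query here costs a graph traversal per successor, so this is only a specification-level check, useful for testing a real implementation.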
[Figure omitted: three panels showing (a) an example CFG, (b) the base-level DFG, and (c) the DFG after region bypassing and dead edge removal; the legend distinguishes control flow edges from dependence edges.]

Figure 2: An Illustration of DFG Construction

Claim 1 Nodes a and b have the same control dependence if and only if a and b are cycle equivalent in the strongly connected component formed by adding the edge end → start to the CFG.

The proof is straightforward using Theorem 1. Computing cycle equivalence in directed graphs appears difficult, but we further reduce the problem to that of finding cycle equivalence in a related undirected graph.

Claim 2 Nodes a and b are cycle equivalent in a strongly connected component S if and only if the corresponding nodes a′ and b′ are cycle equivalent in the undirected graph G formed from S as follows:
● expand S into S′ by splitting each node n into nodes n_i, n_t and n_o such that in-edges to n connect to n_i, out-edges from n originate from n_o, and there are directed edges n_i → n_t and n_t → n_o;
● undirect all edges in S′ to form G.

Note that any cycle in S has a corresponding cycle in G, and although G contains cycles not in S, these cycles do not affect the cycle equivalence relation on nodes in G that correspond to nodes in S. Our algorithm for finding undirected cycle equivalence is based on depth-first search and runs in O(E) time; the details are omitted. In a companion paper, we show that this algorithm can be used to construct the factored control dependence graph of a program in O(E) time. A surprising aspect of this algorithm is that it does not require the computation of the dominator or postdominator relations.

3.2 The DFG construction algorithm

Suppose single-entry single-exit regions are discovered by the algorithm sketched above. The following steps outline the DFG construction algorithm and are illustrated in Figures 2(b)-(c). In the example, each assignment statement is a single-entry single-exit region, as is the if-then-else construct.

1. Determine the variables defined within each single-entry single-exit region. This is accomplished by an "inside-out" traversal of the regions. This information will be used in a later step to allow dependence flow paths to bypass regions not relevant to the dependence path. In our example, each assignment statement defines one variable, and the if-then-else defines y.
2. Create a base-level DFG with no region bypassing. Simply insert V dependence edges in parallel with each control flow edge. Figure 2(b) shows the base-level DFG.
3. Perform region bypassing using the information found in Step 1. Use a forward flow algorithm that maintains the most recent dependence source for each variable. When a region is bypassed, some dependences are cut.
4. Remove dead flow edges generated by bypassing. Use a backward propagation starting from edges cut during
the region bypassing. Figure 2(c) shows the graph after region bypassing and dead edge removal.

3.3 Discussion

Multiedges: Due to region bypassing, it is often the case that many DFG edges for a single variable share a common source. For example, in Figure 2(c) two dependence edges start at the assignment x := 1. We find it convenient to refer to such a collection as a multiedge. We refer to the source of a multiedge as its tail, and its successors as heads. We will use the term edge to refer to the tail and a particular head of a multiedge. From Theorem 1, it follows that the tail and all the heads of a multiedge are totally ordered by dominance/postdominance. In the following sections on dataflow analysis using DFGs, our use of multiedges allows us to separate the flow of information between a node and its dependence flow successors into two parts: propagation between a node and a multiedge tail, and propagation between a multiedge tail and its multiple heads.

Control edges: It is convenient to ensure that the DFG is connected and rooted at start. To do this, we introduce a dummy variable defined at start and used in each statement that has no other variables on its right hand side. Note that these additional edges are simply control edges indicating a node's control dependence region. In Figure 2(c), the dependences for this dummy variable are indicated by a solid arc with a circle.

Region Bypassing: Bypassing single-entry single-exit regions of the control flow graph is useful because it speeds up optimization. However, the DFG-based optimization algorithms described in this paper work correctly even if some or no bypassing at all is performed. Abstractly, this means that any equivalence relation on CFG edges that is finer than control dependence equivalence can be used to construct the DFG. For example, we can use a relation in which two edges are equivalent if and only if they are in the same basic block; this will permit bypassing of assignment statements but not of control structures.

Constructing the SSA Representation: If the SSA representation of a program is desired, we can construct it in O(EV) time by first building the DFG representation and then eliding switches and converting merges to φ-functions. Unlike the standard algorithm [CFR+89], our algorithm does not require computation of the dominance relation or dominance frontiers and is therefore much simpler to implement.

    (a) all-paths:              (b) possible-paths:

    if (p) then                 p := true
      { z := 1;                 if (p) then
        x := z+2 }                { x := 1 }
    else                        else
      { z := 2;                   { x := 2 }
        x := z+1 }              y := x
    y := x

Figure 3: Examples of Constant Propagation

4 Forward dataflow analysis

In this section, we present constant propagation as an example of forward dataflow analysis using the DFG.

Consider Figure 3(a). The first use of z can be replaced by 1 and the second by 2. The right hand sides of the two definitions of x can now be simplified to the constant 3, and the final use of x can be replaced by 3. Most constant propagation algorithms in the literature, such as the def-use chain algorithm [ASU86], discover such all-paths constants. However, additional constants may be found if we ignore definitions inside dead regions of code. In Figure 3(b), the predicate of the conditional can be determined to be constant. By ignoring the definition on the unexecuted branch, the use of x in the last statement can be determined to have value 1. Such possible-paths constants are common in code generated from inline expansion of procedures or macros [WZ85], but algorithms that use def-use chains alone do not find these constants.

We will discuss the standard CFG algorithm, which solves a set of dataflow equations in the control flow graph, and the DFG algorithm, which solves a set of equations in the dependence flow graph. Both algorithms find all-paths and possible-paths constants, but the DFG algorithm is asymptotically faster by a factor of O(V).

We use Kildall's framework for constant propagation [Kil73]. Define a lattice consisting of all constant values and two distinguished values ⊤ and ⊥. Uses of variables are assigned values from the lattice during constant propagation. Initially, each use is mapped to ⊥, meaning that we have no information yet about the values of the variable at runtime. A use is mapped to ⊤ when the algorithm cannot determine that the use is a constant (e.g. if the use is reached by two definitions whose right hand sides are 3 and 4). At the end of constant propagation, the interpretation of the lattice value assigned to a use of a variable x is as follows:

⊥  This use was never examined during constant propagation; it is dead code.
c  This use of x has the value c in all executions.
⊤  This use of x may have different values in different executions.

4.1 The CFG algorithm

At each edge, we maintain a vector of lattice values having an entry for each variable. Intuitively, these vectors summarize the possible values of variables at each program point. These vectors are initialized to σ⊥, the vector with ⊥ in every entry, and they are updated monotonically as the algorithm
(a) control flow graph dataflow equations (σA denotes the vector of lattice values at CFG edge A; an assignment x := e has input edge A and output edge B; a switch on predicate p has input A, true output B and false output C; a merge has inputs A and B and output C):

    start:       σA = σ⊤
    assignment:  σB = σA[x ↦ e{σA}]
    switch:      σB = σA if p[σA] = true ∨ p[σA] = ⊤, σ⊥ otherwise
                 σC = σA if p[σA] = false ∨ p[σA] = ⊤, σ⊥ otherwise
    merge:       σC = σA ⊔ σB

(b) dependence flow graph dataflow equations (V denotes the value at the tail of a multiedge and Vi a head; Vi and Vo are the input and output of an assignment; Vp, Vt and Vf are the predicate, true and false values at a switch; V1 and V2 are the inputs of a merge):

    start:       Vi = ⊤
    assignment:  Vo = e{Vi}
    multiedge:   Vi = V  (each head receives the tail's value)
    switch:      Vt = V if Vp = true ∨ Vp = ⊤, ⊥ otherwise
                 Vf = V if Vp = false ∨ Vp = ⊤, ⊥ otherwise
    merge:       V = V1 ⊔ V2

Figure 4: Dataflow Equations for Constant Propagation
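To make the flavor of the CFG equation scheme concrete, here is a minimal worklist solver. This is our own sketch, not the paper's implementation: it handles only assignment nodes and the pointwise join where environments flow into a shared successor, it omits switches and the σ⊤ initialization at start, and all node and variable names are hypothetical.

```python
BOT, TOP = '_|_', 'T'                    # lattice bottom and top

def join(a, b):
    """Least upper bound in the constant-propagation lattice."""
    if a == BOT: return b
    if b == BOT: return a
    return a if a == b else TOP

def eval_expr(expr, env):
    """expr is an int constant, a variable name, or ('+', l, r)."""
    if isinstance(expr, int):
        return expr
    if isinstance(expr, str):
        return env.get(expr, BOT)
    op, l, r = expr                      # only '+' in this sketch
    lv, rv = eval_expr(l, env), eval_expr(r, env)
    if BOT in (lv, rv): return BOT
    if TOP in (lv, rv): return TOP
    return lv + rv

def const_prop(nodes, succ, entry):
    """nodes: node -> (var, expr); returns node -> lattice value of
    that node's right-hand side once the equations reach a fixed point."""
    env_in = {n: {} for n in nodes}      # one environment per node input
    work = [entry]
    while work:
        n = work.pop()
        var, expr = nodes[n]
        out = dict(env_in[n])
        out[var] = eval_expr(expr, env_in[n])
        for s in succ.get(n, ()):        # merge = pointwise join
            merged = {v: join(env_in[s].get(v, BOT), out.get(v, BOT))
                      for v in set(env_in[s]) | set(out)}
            if merged != env_in[s]:
                env_in[s] = merged       # monotone update
                work.append(s)
    return {n: eval_expr(nodes[n][1], env_in[n]) for n in nodes}

# x := 1 followed by y := x + 2: y's right-hand side folds to 3.
nodes = {'n1': ('x', 1), 'n2': ('y', ('+', 'x', 2))}
print(const_prop(nodes, {'n1': ['n2']}, 'n1'))
# → {'n1': 1, 'n2': 3}
```

When two definitions carrying different constants meet at a join, the pointwise `join` yields ⊤, matching the merge rule σC = σA ⊔ σB above.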

proceeds.

Computing these vectors can be viewed as solving a system of dataflow equations. One equation describing the output vector(s) in terms of the input vector(s) is associated with each node. Figure 4(a) shows the equation scheme for constant propagation on the control flow graph. In this figure, σA represents the vector of lattice values stored at edge A. Since the values of variables at start are unknown, we set the vector at start to ⊤ for all variables. The output vector of an assignment statement x := e is obtained by evaluating expression e using the values available at the statement input, and updating the x entry of the output vector to reflect this value. Expression e evaluates to ⊥ (or ⊤) if any operand of e is ⊥ (or ⊤) in the input vector. At a switch, the output vector is identical to the input vector for each direction that control flow can take; the output vector is σ⊥ for directions that control flow cannot possibly take. At a merge, the output vector is simply the least upper bound of the input vectors.

These equations can be solved using a standard worklist algorithm. Unfortunately, the asymptotic complexity of this algorithm is poor. If V is the number of program variables and E is the number of CFG edges, then the algorithm requires O(EV) space and O(EV²) time. The inefficiency arises because lattice values must be propagated along control flow paths from definitions of variables to their uses.

4.2 The DFG algorithm

The DFG algorithm propagates values for each variable separately along edges in the DFG. The set of DFG dataflow equations has one equation for each DFG edge. As discussed in Section 2, a DFG edge d can be viewed as representing a pair of CFG edges (e1, e2), and the dataflow equation for d computes the lattice value of the associated variable in the region between e1 and e2. Figure 4(b) shows the equation scheme for solving constant propagation using the DFG. These equations are similar to the ones for the control flow graph algorithm; the only new feature is the rule that propagates the value at the tail of a DFG multiedge to its heads. For example, in the program of Figure 1(c), this rule can be used to propagate the value of x from the assignment x := 1 to the two uses of x in the program.

As with the control flow algorithm, a simple worklist-based algorithm can be used to solve the dataflow equations. Whereas the control flow algorithm performed O(V) work each time a node is processed, the DFG algorithm performs work only for the relevant dependences at each node. Therefore, the asymptotic complexity of the DFG algorithm is O(EV). In addition, the algorithm avoids propagating information through single-entry single-exit regions in which there are no assignments to the relevant variable.

A variety of enhancements to this algorithm are possible. For example, the Multiflow compiler performed predicate analysis to determine additional constants: if the predicate at a switch is x=1, we can propagate the constant 1 for x on the true side of the conditional even if we cannot determine the
value of x for the false side [LFK+93]. It is easy to extend
both the DFG and CFG algorithms to accomplish this, but this
extension seems difficult in SSA-based algorithms [WZ91]
since SSA edges bypass switches in the CFG.

We omit the proof of correctness of the DFG algorithm.
Given the structural properties of DFG edges, it is a simple
matter to project values from the DFG onto the corresponding
CFG edges and then show that these values are consistent
with the values determined by the CFG algorithm.

5 Backward dataflow analysis

In this section, we describe the use of the DFG in performing
backward dataflow analysis, using the computation of
anticipatable expressions as an example. We then show how
anticipatable expressions can be used in a powerful
optimization called elimination of partial redundancies, which
subsumes common subexpression elimination and loop-invariant
removal [MR79].

5.1 Anticipatability

Definition 8 An expression e is totally (partially) anticipatable
at a point p if, on every (some) path in the CFG from p to
end, there is a computation of e before an assignment to any
of the variables in e. We denote total (partial) anticipatability
as ANT (PAN).

ANT and PAN are usually computed for all expressions in
the program simultaneously, but we will focus attention on
a single expression to keep the discussion simple. The CFG
equations for the computation of anticipatability are shown
in Figure 5. The solution of the equations for ANT can be
obtained iteratively, starting with an initial approximation in
which ANT is true everywhere in the program except at end.
This initial approximation permits ANT to propagate through
loops correctly while ensuring that the "boundary condition"
at end is satisfied. Similarly, the equations for PAN can be
solved iteratively, starting with an initial approximation in
which PAN is false everywhere in the program.

We now discuss the solution of dataflow equations for ANT
and PAN in the DFG. We first discuss expressions involving
a single variable (such as x+1, y*3) and then generalize
to multivariable expressions (such as x+y, x*y).

Figure 5 shows the DFG equations for ANT and PAN
computations for an expression x+1. These equations are
similar to the CFG equations. The rule for multiedges propagates
anticipatability information from the heads to the multiedge
tail. The intuition behind this rule is the following: by the
definition of the DFG, the heads of a multiedge postdominate
its tail, and there can be no definitions of variable x in the
portion of the CFG between the tail and any of the heads;
therefore, if the expression is totally (partially) anticipatable
at any head, then it is also totally (partially) anticipatable at
the tail. Another way of looking at this is that the tail of
a multiedge accumulates the contributions to anticipatability
from nodes with the same set of control dependences.

In the solution of the CFG equations for ANT, the initial
approximation has ANT true everywhere except at end, which
provides the "boundary condition" where ANT is false. DFG
edges do not go to end, but the role of end in providing the
boundary condition can be played by statements that use x
but do not compute the expression x+1 — dependences for
x at these statements are set to false. Similarly, if a variable
x is live on one side of a conditional branch but dead on the
other, then the dependence for x is initialized to false on the
dead side of the switch.

Once the DFG propagation is done, the values of ANT at
points in the CFG can be found by projecting from the DFG
into the CFG: simply set ANT to true at every point in the
single-entry single-exit region between the head and tail
of every dependence edge for which ANT is true at the head.

An example of ANT computation is shown in Figure 6.
We start with ANT being true at all DFG edges except at
expressions other than x+1 that use x; in the example, ANT
is true at all edges except d4, where it is false. The values
at d4 and d5 are combined together by the multiedge rule,
so ANT at d2 is true. Since ANT is true at d6, ANT at d3
remains true. ANT at d2 and d3 are true, so ANT is true at
d1. Projecting ANT onto the CFG sets ANT to true at every
point between the definition of x and the two computations
of x+1.

We can reduce the problem of computing ANT for
multivariable expressions like x+y to the single-variable ANT
discussed above as follows.

Definition 9 An expression e is totally anticipatable relative
to variable x at a point p if, on every path in the CFG from
p to end, there is a computation of e before an assignment
to x.

It follows that an expression is totally anticipatable at a
point p if it is totally anticipatable relative to all of its
variables at p. A similar notion can be defined for partial
anticipatability.

To compute ANT for x+y, we initialize all dependences
for x and y to true, except for the boundary dependences,
which are set to false. Propagation along dependences for x
and y proceeds independently using the single-variable rules
described earlier. Once DFG propagation is completed, we
project ANT relative to x and ANT relative to y onto the
CFG edges as before, and assert that ANT is true wherever it
is true relative to both x and y separately.

Figure 7 gives an example of multivariable ANT for the
expression x+y. We solve the single-variable ANT problems
relative to x and y independently. Solving single-variable
ANT relative to x yields true at d1 and d3 and false at d2;
projecting the results onto control flow edges yields ANT
true relative to x at e2 through e7. Single-variable ANT
relative to y is true at d6 through d8 and is false at d4 and
d5; projecting the results onto control flow edges yields ANT
(Figure 5 is a table of dataflow equations with one column each for
statement, expression, multiedge, switch, and merge nodes: (a) the
control flow graph dataflow equations for ANT and PAN, and (b) the
dependence flow graph equations, among whose legible entries are
PP(A) = ANT_C . AV_B, PP(B) = ANT_C . AV_A, AV_C = AV_A . AV_B at a
merge, DELETE_A = AV_A, and INSERT_A = PP_A . not AV_A.)

Figure 5: Dataflow Equations for ANT/PAN and EPR
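The iterative CFG solution described in Section 5.1 — start with ANT true everywhere except at end, then reapply the equations until a fixed point is reached — can be sketched as follows. This is our illustrative sketch, not code from the paper: the graph encoding and the `computes`/`kills` flags are assumptions, and a node carrying both flags is assumed to compute the expression before the assignment.

```python
# Sketch of the iterative CFG solution for total anticipatability (ANT)
# of a single expression. Graph shape and node flags are illustrative.

def ant_solve(succ, computes, kills, end):
    """ANT holds at a node's entry iff every path from it to `end`
    computes the expression before assigning to any of its variables."""
    nodes = set(succ) | {end}
    # Optimistic initial approximation: ANT true everywhere except at
    # `end`, the boundary condition; this lets ANT propagate through
    # loops correctly.
    ant = {n: n != end for n in nodes}
    changed = True
    while changed:
        changed = False
        for n in nodes:
            if n == end:
                continue  # boundary condition stays false
            if computes[n]:
                new = True   # expression computed here first
            elif kills[n]:
                new = False  # a variable of the expression is assigned
            else:
                new = all(ant[s] for s in succ[n])  # meet over successors
            if new != ant[n]:
                ant[n], changed = new, True
    return ant
```

For a diamond in which only one arm computes the expression, this yields ANT false at the branch point, since total anticipatability demands a computation on every path; replacing `all` with `any` and starting from an all-false approximation would give PAN instead.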

Figure 6: An Example of Single-Variable Anticipatability

Figure 7: An Example of Multivariable Anticipatability

true relative to y at e5 through e7. We combine these separate
results to get x+y anticipatable at e5, e6 and e7.

In our approach, the propagation of ANT occurs only in
that portion of the program in which at least one of the
variables in the expression is live. More elaborate approaches
are possible. For example, we can avoid propagating false
from expressions other than x+y that use x or y by
computing ANT in two phases as is done in some CFG-based
dataflow analyses [DRZ92]. It is also possible to compute
ANT directly by simultaneously following dependences for
all variables used in the expression, relying on a depth-first
numbering scheme to order these dependences. We will not
discuss these more complex alternatives any further due to
lack of space.

5.2 Elimination of partial redundancies

Partial and total anticipatability can be used in a powerful
optimization called elimination of partial redundancies
(epr), which subsumes common subexpression elimination
and loop-invariant removal. A computation is said to be
redundant if it follows a computation of the same value on
some execution path. If a redundant computation is preceded
by computations of the same value on all execution paths, we
say the computation is totally redundant; otherwise we say
the computation is partially redundant. A classic dataflow
algorithm for the removal of partial redundancies is due to
Morel and Renvoise [MR79]. We complete our discussion
of DFG-based analysis by showing how we can implement
epr.

The basic idea is to insert new computations into the CFG
where it is safe and profitable to do so, thereby making
partially redundant computations totally redundant. Totally
redundant computations may be replaced by a use of a new
temporary variable that is properly assigned at the preceding
computations. After removing these totally redundant
computations, no execution path will contain more instances
of a computation than it did originally, and some paths will
contain fewer instances of the computation.

A program point is safe for insertion of a computation of
e if e is anticipatable at that point, but what about
profitability? Inserting computations of e wherever it is anticipatable,
and then deleting computations of e wherever it has become
available, eliminates partial redundancies; however, this
strategy may perform superfluous code motion. For example, in
Figure 6 this strategy will place x+1 immediately after the
assignment to x and delete the other computations of x+1,
even though there is no redundancy in the original program.
There has been much discussion in the literature about code
motion strategies [DS88, Dha91, KRS92], but to our
knowledge there is no experimental data showing the superiority of
any single strategy.

Our approach to epr has the virtue of simplicity. Figure 5
shows the dataflow equations for the dependence-based
algorithm. ANT and PAN are backward dataflow problems,
while AV is a forward problem. PP, INSERT, and DELETE
are local definitions that do not propagate. The rationale
behind these rules is as follows. Once we have computed ANT
and PAN as described in Section 5.1, we determine where
it may be profitable to place (PP) computations. There are
two rules. The merge rule inserts a computation into a region
if it is anticipatable and partially available at the output of
the merge — after insertion, the expression becomes totally
available at the output of the merge, so computations below
the merge can be deleted. The multiedge rule eliminates
redundancies within a control region: it is profitable to place
a computation at the tail of a multiedge if the expression is
anticipatable at the tail and partially anticipatable at two or
more heads. This is equivalent to saying that placement is
profitable at the tail if the computation is totally anticipatable
at one head and partially anticipatable at another head.
Finally, we insert a computation at a point if placement is
profitable but the expression is not available, and we delete
computations where the expression is available.

The rule for multiedges can be refined further to fine-tune
the placement of code. Instead of hoisting a computation to
the tail of the multiedge, it may be desirable to place it at the
head where the value of the expression is first used in some
other computation, since that is the earliest place where the
computation must be available. This kind of refinement is
easy to do in the DFG since the dependence edges tell us
exactly where we must look to find the desired information.
We are evaluating these heuristics and we refer the interested
reader to the forthcoming thesis of one of the authors [Joh93].

Our epr algorithm is simple in part because it is edge-based
rather than node-based like conventional presentations
of epr. Placing computations at nodes is complicated by
the presence of control flow edges whose source is a switch
and whose destination is a merge, such as the back edge
of repeat-until loops. This complication is eliminated
by adding empty basic blocks to split such edges [MR79],
but these blocks must later be removed if no code is moved
into them. DFG algorithms are naturally edge-based and
avoid these complications. Our epr algorithm propagates
information only through the portion of the control flow graph
where the variables in the expression are live. It can also
skip over single-entry single-exit regions of the control flow
graph where there is no definition or use of the variables in the
expression. This is not possible in CFG-based approaches.
Finally, the DFG is built only once prior to optimization (it
is, of course, updated as optimization is performed). This
approach is simpler to implement than other approaches that
build a special-purpose graph for each expression for which
partial redundancies must be eliminated [DRZ92].
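The PP, INSERT, and DELETE rules read off from Figure 5 can be transcribed almost directly as boolean definitions. The following is our schematic sketch, not the paper's code: function names are ours, the inputs stand for ANT and AV values produced by the propagation phases, and A and B denote the two inputs of a merge with C its output.

```python
# Schematic transcription of the non-propagating epr rules: PP, INSERT
# and DELETE are local definitions computed from ANT and AV.

def merge_pp(ant_at_output, av_along_other_input):
    # PP_A = ANT_C . AV_B: placing on merge input A is profitable if the
    # expression is anticipatable at the merge output C and available
    # along the other input B (and symmetrically for PP_B).
    return ant_at_output and av_along_other_input

def merge_av(av_a, av_b):
    # AV_C = AV_A . AV_B: available after a merge only if available
    # along both inputs.
    return av_a and av_b

def insert_here(pp, av):
    # INSERT_A = PP_A . not AV_A: insert where placement is profitable
    # but the expression is not yet available.
    return pp and not av

def delete_here(av):
    # DELETE_A = AV_A: delete a computation where the expression is
    # already available.
    return av
```

After an insertion at a profitable merge input, the expression becomes totally available at the merge output, so the downstream computation is then picked up by the DELETE rule.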

on def-use chains, static single assignment form, and control
dependence. Then we sketched a fast algorithm for
constructing the DFG, which is of independent interest because it
can be used to construct a factored control dependence graph
of a program in O(E) time, a factor of N improvement over
the best existing algorithm. Finally, we showed how the DFG
can be used for both forward and backward dataflow
analyses. Our approach avoids inefficiencies by (1) propagating
information for a variable x only where needed, bypassing
single-entry single-exit regions of the graph which contain
no definition of x, and by (2) performing work proportional
to the number of variable references at each assignment
statement.

We have not addressed the problem of optimization in
stages described in Section 1. Note that performing
redundancy elimination in "dependence order" in the example of
Section 1 achieves the desired ordering. The general picture
is more complex because of merges, but we believe that a
dependence-based approach is the right one.

In this paper, we have focused on the use of the DFG
for optimizations. For parallelization, the simple picture of
the DFG in this paper can be extended to include aliasing,
data structures, anti- and output dependences, loop
recognition, and distance/direction information for loop-carried
dependences. Our treatment of aliasing, and anti- and
output dependences, is discussed in an earlier paper [BJP91].
We are implementing a DFG tool-kit for parallelization and
optimization, and we will report on experimental results in
another paper. We conclude with a discussion of what we
have learned so far from our implementation.

First, we have found that it is neither necessary nor
desirable to use a full-blown dataflow graph representation of
imperative language programs as an intermediate form. Our
current representation retains control flow information and
permits us to expose only relevant dependences in any phase
of the compiler; for example, an optimization phase could
expose only def-use information, as we have done in this
paper. We are aware of successful functional language
compilers (for dataflow machines) that represent programs as
full-blown dataflow graphs, but we attribute their success to
the relative simplicity of functional languages and the
closeness between the intermediate and machine languages.

Second, we believe that renaming of variables to
accomplish single assignment (static or dynamic) is orthogonal to
dependence representation. In our opinion, single assignment
has nothing to do with the DFG (or for that matter, static
single assignment form) — it is best to view these
representations as a way of knitting control dependence information
with def-use information. This view has two advantages.
First, aliasing can be handled very simply [BJP91]. Second,
control dependence can also be combined with anti- and
output dependences without the need for a new conceptual
framework.

Finally, in the context of optimization, control dependence
equivalence is more important than control dependence per
se. The construction of the DFG requires an equivalence
relation on control flow edges, and control dependence
equivalence is the coarsest equivalence relation that can be used
for this purpose. Realizing this led us to invent an algorithm
that computes control dependence equivalences directly, and
this in turn led us to an O(E) algorithm for constructing a
factored control dependence graph.

To place our work in perspective, it is useful to understand
the differences between the DFG and the program
dependence graph (PDG) [FOW87]. The PDG of a program is the
union of its control and data dependences. There have been
many efforts to give a formal semantics to the PDG, with the
objective of using the semantics in correctness proofs of
program transformations [HPR88, Sel89, CF89]. However, this
has proved to be difficult. For example, it has been shown
that two programs with the same PDG have the same
input-output behavior, but the proof is long and intricate even for
structured programs. Other efforts give a constructive
definition of the PDG by transforming the denotational semantics
of an imperative language, but the construction and the final
result are hard to decipher. These difficulties in giving
semantics to a program, once it has been broken down into its
control and data dependences, are not unlike the difficulties in
giving semantics to a program once it has been broken down
into assignments and GOTOs. Like quarks in nuclei [GM64]
or conditional jumps in the stored program computer, control
dependence is a deep and fundamental notion. However,
quarks do not exist in isolation, and conditional jumps are
implicit and hidden in modern programming language control
structures. Similarly, in the context of optimization, control
and data dependences should be fused together to give
structure to dependences as we have done in the DFG. It
is straightforward to give a semantics to the DFG if one is
desired, but more importantly, such a semantic account is
unnecessary since DFG edges have precise structural properties
that can be used in correctness proofs. Put simply, the DFG
gives a way of propagating information in the control flow
graph while bypassing uninteresting regions, and the amount
of bypassing is a useful compromise between too much and
too little.

7 Acknowledgements

Thanks to Mayan Moudgill for early work on DFG-based
optimizations and to Micah Beck for his stimulating input
throughout the entire project. Thanks to Richard Huff, Wei
Li, Mayan Moudgill, Paul Stodghill and Mark Charney for
their comments on the paper.

References

[ASU86] A. V. Aho, R. Sethi, and J. D. Ullman. Compilers: Principles, Techniques, and Tools. Addison-Wesley, Reading, MA, 1986.

[Bec92] Micah Beck. Translating FORTRAN to Dataflow Graphs. PhD thesis, Department of Computer Science, Cornell University, May 1992.

[BJP91] Micah Beck, Richard Johnson, and Keshav Pingali. From control flow to dataflow. Journal of Parallel and Distributed Computing, 12:118–129, 1991.

[BMO90] Robert A. Ballance, Arthur B. Maccabe, and Karl J. Ottenstein. The Program Dependence Web: A representation supporting control-, data-, and demand-driven interpretation of imperative languages. In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 257–271, June 20–22, 1990.

[CCF91] Jong-Deok Choi, Ron Cytron, and Jeanne Ferrante. Automatic construction of sparse data flow evaluation graphs. In Conference Record of the 18th Annual ACM Symposium on Principles of Programming Languages, pages 55–66, January 21–23, 1991.

[CF89] Robert Cartwright and Matthias Felleisen. The semantics of program dependence. In Proceedings of the SIGPLAN '89 Conference on Programming Language Design and Implementation, pages 13–27, June 21–23, 1989.

[CFR+89] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. An efficient method of computing static single assignment form. In Conference Record of the 16th Annual ACM Symposium on Principles of Programming Languages, pages 25–35, January 11–13, 1989.

[CFR+91] Ron Cytron, Jeanne Ferrante, Barry K. Rosen, Mark N. Wegman, and F. Kenneth Zadeck. Efficiently computing static single assignment form and the control dependence graph. ACM Transactions on Programming Languages and Systems, 13(4):451–490, October 1991.

[CFS90] Ron Cytron, Jeanne Ferrante, and Vivek Sarkar. Compact representations for control dependence. In Proceedings of the SIGPLAN '90 Conference on Programming Language Design and Implementation, pages 337–351, June 20–22, 1990.

[Dha91] Dhananjay M. Dhamdhere. Practical adaptation of the global optimization algorithm of Morel and Renvoise. ACM Transactions on Programming Languages and Systems, 13(2):291–294, April 1991.

[DRZ92] Dhananjay M. Dhamdhere, Barry K. Rosen, and F. Kenneth Zadeck. How to analyze large programs efficiently and informatively. In Proceedings of the SIGPLAN '92 Conference on Programming Language Design and Implementation, pages 212–223, June 17–19, 1992.

[DS88] K.-H. Drechsler and M. P. Stadel. A solution to a problem with Morel and Renvoise's "Global Optimization by Suppression of Partial Redundancies". ACM Transactions on Programming Languages and Systems, 10(4):635–640, October 1988.

[FOW87] J. Ferrante, K. J. Ottenstein, and J. D. Warren. The program dependence graph and its use in optimization. ACM Transactions on Programming Languages and Systems, 9(3):319–349, June 1987.

[GM64] Murray Gell-Mann. A schematic model of baryons and mesons. Physics Letters, 8(3):214–215, February 1964.

[HPR88] Susan Horwitz, Jan Prins, and Thomas Reps. On the adequacy of program dependence graphs for representing programs. In Conference Record of the 15th Annual ACM Symposium on Principles of Programming Languages, pages 146–157, January 13–15, 1988.

[Joh93] Richard Johnson. Dependence-Based Compilation (working title). PhD thesis, Department of Computer Science, Cornell University, 1993. Expected in September.

[Kas75] V. N. Kas'janov. Distinguishing hammocks in a directed graph. Soviet Math. Doklady, 16(5):448–450, 1975.

[Kil73] Gary A. Kildall. A unified approach to global program optimization. In Conference Record of the ACM Symposium on Principles of Programming Languages, pages 194–206, October 1–3, 1973.

[KRS92] Jens Knoop, Oliver Rüthing, and Bernhard Steffen. Lazy code motion. In Proceedings of the SIGPLAN '92 Conference on Programming Language Design and Implementation, pages 224–234, June 17–19, 1992.

[LFK+93] P. Geoffrey Lowney, Stefan M. Freudenberger, Thomas J. Karzes, W. D. Lichtenstein, Robert P. Nix, John S. O'Donnell, and John C. Ruttenberg. The Multiflow trace scheduling compiler. Journal of Supercomputing, 7(1/2), January 1993.

[MR79] Etienne Morel and Claude Renvoise. Global optimization by suppression of partial redundancies. Communications of the ACM, 22(2):96–103, February 1979.

[PBJ+91] Keshav Pingali, Micah Beck, Richard Johnson, Mayan Moudgill, and Paul Stodghill. Dependence Flow Graphs: An algebraic approach to program dependencies. In Conference Record of the 18th Annual ACM Symposium on Principles of Programming Languages, pages 67–78, January 21–23, 1991.

[RT81] John H. Reif and Robert E. Tarjan. Symbolic program analysis in almost-linear time. SIAM Journal on Computing, 11(1):81–93, February 1981.

[Sel89] Rebecca P. Selke. A rewriting semantics for program dependence graphs. In Conference Record of the 16th Annual ACM Symposium on Principles of Programming Languages, pages 12–24, January 11–13, 1989.

[SS70] R. M. Shapiro and H. Saint. The representation of algorithms. Technical Report CA-7002-1432, Massachusetts Computer Associates, February 1970.

[WZ85] Mark N. Wegman and F. Kenneth Zadeck. Constant propagation with conditional branches. In Conference Record of the 12th Annual ACM Symposium on Principles of Programming Languages, pages 291–299, January 14–16, 1985.

[WZ91] Mark N. Wegman and F. Kenneth Zadeck. Constant propagation with conditional branches. ACM Transactions on Programming Languages and Systems, 13(2):181–210, April 1991.
