
Notes on Graph Algorithms

Used in Optimizing Compilers

Carl D. Offner

University of Massachusetts Boston


March 31, 2013
Contents

1 Depth-First Walks
  1.1 Depth-First Walks
  1.2 A Characterization of DAGS
  1.3 A Characterization of Descendants
  1.4 The Path Lemma
  1.5 An Application: Strongly Connected Components
  1.6 Tarjan’s Original Algorithm

2 Flow Graphs
  2.1 Flow Graphs
  2.2 Dominators
  2.3 Depth-First Spanning Trees

3 Reducible Flow Graphs
  3.1 Intervals
  3.2 Algorithms for Constructing Intervals
  3.3 Reducible Flow Graphs
  3.4 A Subgraph of Nonreducible Flow Graphs
  3.5 Back Arcs in Intervals
  3.6 Induced Depth-First Walks
  3.7 Characterizations of Reducibility
  3.8 The Loop-Connectedness Number
  3.9 Interval Nesting

4 Testing for Reducibility
  4.1 Finite Church-Rosser Transformations
  4.2 The Transformations T1 and T2
  4.3 Induced Walks from T1 and T2
  4.4 Kernels of Intervals
  4.5 Reachunder Sets and Tarjan’s Algorithm

5 Applications to Data-Flow Analysis
  5.1 Four Data-Flow Problems
  5.2 Data-Flow Equations
  5.3 Indeterminacy of Solutions
  5.4 Abstract Frameworks for Data-Flow Analysis
  5.5 Solutions of Abstract Data Flow Problems
  5.6 Two More Data-Flow Analysis Problems
  5.7 How Many Iterations are Needed?
  5.8 Algorithms Based on Reducibility
    5.8.1 Allen and Cocke’s Algorithm
    5.8.2 Schwartz and Sharir’s Algorithm
    5.8.3 Dealing With Non-Reducible Flow Graphs
Chapter 1

Depth-First Walks

We will be working with directed graphs in this set of notes. Such graphs are used in compilers for modeling
internal representations of programs being compiled, and also for modeling dependence graphs. In these
notes, however, we will be concerned mainly with the graph theory; relations to compiler optimization will
appear as applications of the theory.
All graphs in these notes are finite graphs. This fact may or may not be mentioned, but it should always be
assumed.
The elements of a directed graph G are called nodes, points, or vertices. If x and y are nodes in G, an arc
(edge) from x to y is written x → y. We refer to x as the source or tail of the arc and to y as the target or
head of the arc.1
To be precise, we should really denote a directed graph by ⟨G, A⟩, where A is the set of arcs (i.e. edges) in the graph. However, we will generally omit reference to A.
Similarly, a sub graph of a directed graph ⟨G, A⟩ should really be denoted ⟨H, E⟩, where E is a collection
of edges in A connecting elements of H. However, in general we denote a sub graph of G simply by H. We
will make the convention that the edges in the sub graph consist of all edges in the flow graph connecting
members of H.
The one exception to this convention is when the sub graph H of G is a tree or a forest of trees. In this case,
the edges of H are understood to be only the tree edges, ignoring any other edges of G between members of
H.

1.1 Depth-First Walks


Algorithm A in Figure 1.1 performs a depth-first walk of a directed graph G and constructs

1. A depth-first spanning forest D of G.

2. A pre-order numbering of the nodes of G.


1 While this usage is now standard, Tarjan’s early papers use “head” and “tail” in the opposite sense.


3. A reverse post-order numbering of the nodes of G.

A depth-first walk is also often referred to as a depth-first traversal, or a depth-first search.


The numbering of the nodes of G by a depth-first walk is a powerful tool that can be used both to analyse
structural properties of G and to understand the workings of graph algorithms on G. We will denote the
pre-order number of a node x by pre[x], and the reverse post-order number of x by rpost[x].

procedure DFWalk(G: graph)


begin
Mark all elements of G “unvisited”;
i ← 1;
j ← number of elements of G;
while there is an “unvisited” element x ∈ G do
call DFW(x);
end while;
end
procedure DFW(x: node)
begin
Mark x “visited”;
pre[x] ← i; i ← i + 1;
for each successor y of x do
if y is “unvisited” then
Add arc x → y to D;
call DFW(y);
end if ;
end for;
rpost[x] ← j; j ← j − 1;
end

Figure 1.1: Algorithm A: Depth-First Walk and Numbering Algorithm

The successive calls to DFW from DFWalk yield a partition of G into sets of nodes, each with its own
spanning tree. In particular, each call of DFW(x) from DFWalk yields a spanning tree of a set of nodes of G.
Different choices of the order in which the successors y are processed in the FOR loop of DFW will yield the same set of nodes of G but a different spanning tree for them.
Different choices of the next x to process in the WHILE loop of DFWalk will in general yield a different partition of G into sets of nodes.
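
As a concrete illustration, here is a direct transcription of Algorithm A into Python (a sketch only; the dictionary-of-successor-lists graph representation and the names df_walk and dfw are illustrative assumptions, not part of the original algorithm):

def df_walk(graph):
    # Depth-first walk of a directed graph (Algorithm A, Figure 1.1).
    # `graph` maps each node to a list of its successors.
    # Returns pre[], rpost[] and the tree arcs of the spanning forest D.
    pre, rpost, tree_arcs = {}, {}, []
    i, j = 1, len(graph)

    def dfw(x):
        nonlocal i, j
        pre[x] = i; i += 1
        for y in graph[x]:
            if y not in pre:              # y is "unvisited"
                tree_arcs.append((x, y))  # add the arc x -> y to D
                dfw(y)
        rpost[x] = j; j -= 1

    for x in graph:                       # the WHILE loop of DFWalk
        if x not in pre:
            dfw(x)
    return pre, rpost, tree_arcs

Different iteration orders over graph[x] and over the outer loop correspond to the different walks, partitions, and spanning trees just discussed.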
The terms successor and predecessor always refer to the relation defined by the edges of the directed graph
G.
On the other hand, the terms ancestor , child , and descendant are always to be understood with reference
to a particular spanning forest D for G. So in particular, a node x may be an ancestor of another node y
with respect to one spanning forest for G, but not with respect to another. Figure 1.2 shows two different
walks of a graph G. The edges of the depth-first spanning forest in each case are indicated by thick arcs in
the graph.


Each node is labeled with an ordered pair of numbers (x, y), where x is the pre-order number of the
node and y is the reverse post-order number of the node. The thick arcs are the arcs of the depth-first
spanning forest.

Figure 1.2: Two depth-first forests for a directed graph.

If G is a tree, the pre-order and reverse post-order numberings are essentially equivalent; each node is visited
before all its children. Thus, both pre-order and reverse post-order numberings of a tree topologically sort
the partial order determined by the tree.
In the case of a DAG (directed acyclic graph), however, these orderings are not equivalent: the pre-ordering
corresponds to a walk which visits a node not before all its children, but only before all those children which
were previously unvisited. Thus, in a pre-order walk of a DAG, a node can be visited after one or more of
its children. Reverse post-ordering does not suffer from this defect: it’s probably intuitively obvious, but we will show precisely below (see Lemma 1.8), that in a post-order walk a node is visited after all its children (whether the children were previously visited or not). Therefore, in a reverse post-order walk on a DAG, a node comes before all its children, and therefore a reverse post-order numbering of a DAG topologically sorts the DAG’s partial order.
As an example, if in Figure 1.2, we remove the curved arc, the directed graph becomes a DAG. We see
that, with respect to the first depth-first spanning forest (on the left), the pre-order numbering does not
topologically sort the DAG’s partial order, but the reverse post-order numbering does. (This is only an
example, of course—as we’ve said, the actual proof for the general case of any DAG will come later.)
1.1 Exercise Construct a DAG and a depth-first walk of the DAG such that the pre-order numbering corresponding to the walk does not topologically sort the DAG. Show that the reverse post-order numbering, however, does.

The arcs in the depth-first spanning tree created by Algorithm A are all arcs that were originally in G. We
call these arcs tree arcs. The remainder of the arcs in G can be classified as follows:

• Forward arcs: arcs that go from ancestors to descendants in D. We exclude, however, any arc that
goes from a node to itself.
• Back arcs: arcs that go from descendants to ancestors in D. This includes any arc that goes from a
node to itself.

• Cross arcs: all other arcs.2

Figure 1.3 illustrates a directed graph with arcs labelled as determined by a particular depth-first forest; in
this case, the forest is actually a tree.


The ordered pairs are the pre-order and reverse post-order numbers of the nodes. The thick arcs are the arcs of the
depth-first spanning tree (i.e. they are the tree arcs). The remaining arcs are labelled F (forward arc),
B (back arc), and C (cross arc).

Figure 1.3: Classification of arcs in a depth-first walk.

As we remarked above, there are in general different depth-first spanning forests D of G. Corresponding to
each such D, there is a different classification of arcs.
By examining the way Algorithm A works, we can write down some characterizations of these arcs. Let us
distinguish between reaching a node and processing a node: a node may be reached several times in the
algorithm (i.e., once for each predecessor), but is only processed once (the first time it is reached). The first
time a node is reached is when a tree arc is created with that node as target. When a node x is processed, it
is (figuratively, at least) placed on the stack by the recursion and pre[x] is assigned. When the processing is
complete (i.e. after all its successors have been processed), it is taken off the stack and rpost[x] is assigned.
Thus we see that node x is an ancestor of node y in D iff x is on the stack when y is first reached.
Another way to look at this is to modify Algorithm A so that instead of computing pre- and post-order
numbers for each node, it computes the “times” start[x] and finish[x] that the node started and finished
being processed; i.e. the times that it was pushed onto and popped from the stack. This algorithm is shown
in Figure 1.4.
The relation of these time stamps to the numberings computed in Algorithm A is this:

• The numbers start[x] are just the pre-order numbers with gaps in them—every number finish[x] is a number missing from the sequence {start[x]}.
2 The names for arcs given here differ from those used in some of the original papers:

Here Tarjan[19] Tarjan[20]


forward arc reverse frond forward arc
back arc frond cycle arc
cross arc cross-link cross arc

procedure DFWalk(G: graph)


begin
Mark all elements of G “unvisited”;
i ← 1;
while there is an “unvisited” element x ∈ G do
call DFW(x);
end while;
end
procedure DFW(x: node)
begin
Mark x “visited”;
start[x] ← i; i ← i + 1;
for each successor y of x do
if y is “unvisited” then
Add arc x → y to D;
call DFW(y);
end if ;
end for;
finish[x] ← i; i ← i + 1;
end

Figure 1.4: Depth-First Walk with Time Stamps

Thus, if the sequence {start[x]} is “squashed” to remove the gaps, the number assigned to node x by the squashed sequence is pre[x]. In particular, the following statements are equivalent:

pre[x] < pre[y]


start[x] < start[y]

x is pushed on the stack before y is.

• The numbers finish[x] are just the post-order numbers with gaps in them. Squashing this sequence,
as above, assigns to each node its post-order number. In particular, the following statements are
equivalent:

rpost[x] > rpost[y]


finish[x] < finish[y]

x is popped from the stack before y is.
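
The following small sketch (again in Python, with assumed names) computes the time stamps of Figure 1.4 and then recovers pre and rpost by “squashing” the gaps, exactly as described above:

def df_walk_times(graph):
    # Depth-first walk recording start[] and finish[] time stamps (Figure 1.4).
    start, finish = {}, {}
    clock = 0
    def dfw(x):
        nonlocal clock
        clock += 1; start[x] = clock
        for y in graph[x]:
            if y not in start:
                dfw(y)
        clock += 1; finish[x] = clock
    for x in graph:
        if x not in start:
            dfw(x)
    return start, finish

def squash(start, finish):
    # pre[x] is the rank of start[x] among all start times; rpost[x] is the
    # rank of finish[x] when the finish times are taken in decreasing order
    # (recall that rpost[x] > rpost[y] iff finish[x] < finish[y]).
    pre = {x: k + 1 for k, x in enumerate(sorted(start, key=lambda v: start[v]))}
    rpost = {x: k + 1 for k, x in enumerate(sorted(finish, key=lambda v: -finish[v]))}
    return pre, rpost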

In implementations, pre- and reverse post-order numbers are more useful than time stamps, because there
are no gaps in the sequences of those numbers. Time stamps are useful, not so much in themselves, but
because they make it somewhat easier to deduce properties of pre- and post-order numberings. This is
because, in addition to the relations just noted, they satisfy two additional properties which we state in the
form of two lemmas:
1.2 Lemma For each node x, start[x] < finish[x].

Proof. This just says that x is always pushed on the stack before it is popped off the stack.

1.3 Lemma (Parenthesis Nesting Property) For any two nodes x and y, the start and finish numbers of x
and y nest as parentheses. That is, only the following relations are possible (see Figure 1.5):

start[x] < finish[x] < start[y] < finish[y]


start[x] < start[y] < finish[y] < finish[x]
start[y] < start[x] < finish[x] < finish[y]
start[y] < finish[y] < start[x] < finish[x]

Proof. If start[x] < start[y] < finish[x] then x is on the stack when y is first reached. Therefore the
processing of y starts while x is on the stack, and so it also must finish while x is on the stack: we have
start[x] < start[y] < finish[y] < finish[x]. The case when start[y] < start[x] < finish[y] is handled in the same
way.

Another way to state the parenthesis nesting property is that given any two nodes x and y, the intervals
[start[x], finish[x]] and [start[y], finish[y]] must be either nested or disjoint.

1.4 Lemma x is an ancestor of y iff both pre[x] < pre[y] and rpost[x] < rpost[y].

Proof. x is an ancestor of y iff x is first reached before y is and the processing of y is complete before the
processing of x is. This in turn is true iff start[x] < start[y] < finish[y] < finish[x].

All four of the relations illustrated in Figure 1.5 are possible. However, when there is an arc in the graph G
from x to y, the last one can never occur. This is the content of the next theorem:
1.5 Theorem The following table characterizes the pre- and reverse post-order numberings of arcs x → y in a
directed graph G with respect to a depth-first walk:

Tree Arc pre[x] < pre[y] rpost[x] < rpost[y]


Forward Arc pre[x] < pre[y] rpost[x] < rpost[y]
Back Arc pre[x] ≥ pre[y] rpost[x] ≥ rpost[y]
Cross Arc pre[x] > pre[y] rpost[x] < rpost[y]

Proof. The first three lines of the table follow from Lemma 1.4. As for the last line of the table, there are
two possibilities:

• The processing of y began before the processing of x began. That is, start[y] < start[x]. Since y is not an ancestor of x, we must also have finish[y] < start[x] (< finish[x]), and this is just the last line of the table.
Figure 1.5: Parenthesis-like nesting of start-finish intervals for nodes x and y. (For each of the four possible orderings listed in Lemma 1.3, the figure shows when x and y are pushed onto and popped from the stack.)

• y was not reached before the processing of x began. But in this case, since y is a successor of x, y
will be reached while x is on the stack. That is, we have start[x] < start[y] < finish[x], and so y is a
descendant of x, contradicting the assumption that x → y is a cross arc.

Following Tarjan[19], we modify Algorithm A by initializing both the arrays pre[ ] and rpost[ ] to 0. So at
any point in the algorithm, pre[x] = 0 iff x has not yet been reached for the first time. (This does away with
the need to mark nodes as “visited”.) In addition, rpost[x] = 0 iff the processing of x is not finished. (It
might not even have started, but at any rate, it’s not finished.)
The modified algorithm, which we call Algorithm B (Figure 1.6), walks G and simultaneously constructs the
pre-ordering and reverse post-ordering of G, classifies all the arcs in G (thereby constructing D as well), and
computes the number of descendants ND[x] of each node x in the tree D. (Note: ND[x] is the number of
elements in the subtree of D which is rooted at x. This reflects the fact that conventionally, any node is a
descendant of itself.)
1.6 Theorem Algorithm B generates a spanning forest D of G, computes the number of descendants in D of
each node x, and correctly classifies arcs in G relative to D.
Proof. First, every node except the nodes x selected in the WHILE loop of DFWalk is reached exactly once by a tree arc. So the tree arcs form a forest of trees rooted at those nodes x.
By the construction in the algorithm, if x is a terminal node in D then ND[x] = 1, and otherwise

    ND[x] = 1 + Σ { ND[y] : x → y is an arc in D }

which is what we mean by the number of descendants of x.


Theorem 1.5 then shows that the rest of the arcs are labelled correctly.

1.2 A Characterization of DAGS


Using these properties of arcs, we can characterize directed graphs which are DAGS:
1.7 Theorem If G is a directed graph, the following are equivalent:

1. G is a DAG.
2. There is a depth-first walk of G with no back arcs. (More precisely, there is a depth-first walk of G
with respect to which no arc in G is a back arc.)

3. No depth-first walk of G has back arcs. (More precisely, there is no arc in G which is a back arc with
respect to any depth-first walk of G.)

Proof. 1 =⇒ 3: Since there are no cycles in a DAG, there can be no back arcs with respect to any depth-first
walk.
3 =⇒ 2: because 3 is stronger than 2.
2 =⇒ 1: By assumption, there is a depth-first walk of G with no back arcs. Let the pre-order and reverse
post-order numbering of the nodes in G be determined by this walk. Now if G is not a DAG, then G contains
a cycle. But then every edge x → y in the cycle would satisfy rpost[x] < rpost[y], which is impossible.

procedure DFWalk(G: graph)


begin
i ← 1;
j ← number of elements of G;
for each node x ∈ G do
pre[x] ← 0;
rpost[x] ← 0;
end for;
while there is an element x ∈ G with pre[x] = 0 do
call DFW(x);
end while;
end

procedure DFW(x: node)


begin
pre[x] ← i; i ← i + 1;
ND[x] ← 1; -- if x has no successors, ND[x] = 1
for each successor y of x do
if pre[y] = 0 then -- y reached for the first time
Label x → y a tree arc; -- Add arc x → y to D
call DFW(y);
ND[x] ← ND[x] + ND[y];
else if rpost[y] = 0 then -- y is already on the stack
Label x → y a back arc;
else if pre[x] < pre[y] then -- y’s processing finished
Label x → y a forward arc;
else -- y’s processing finished
Label x → y a cross arc;
end if ;
end for;
rpost[x] ← j; j ← j − 1; -- x’s processing finished
end

Figure 1.6: Algorithm B: Labelling and Numbering for a Depth-First Walk.
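
A Python rendering of Algorithm B might look as follows (a sketch; the graph representation and the returned dictionaries are assumptions made for illustration):

def df_label(graph):
    # Algorithm B (Figure 1.6): number the nodes, classify every arc,
    # and compute ND[x], the number of descendants of x in D.
    pre = {x: 0 for x in graph}
    rpost = {x: 0 for x in graph}
    nd = {}
    label = {}      # maps an arc (x, y) to "tree", "back", "forward" or "cross"
    i, j = 1, len(graph)

    def dfw(x):
        nonlocal i, j
        pre[x] = i; i += 1
        nd[x] = 1
        for y in graph[x]:
            if pre[y] == 0:            # y reached for the first time
                label[(x, y)] = "tree"
                dfw(y)
                nd[x] += nd[y]
            elif rpost[y] == 0:        # y is still on the stack
                label[(x, y)] = "back"
            elif pre[x] < pre[y]:      # y's processing is finished
                label[(x, y)] = "forward"
            else:
                label[(x, y)] = "cross"
        rpost[x] = j; j -= 1           # x's processing is finished

    for x in graph:
        if pre[x] == 0:
            dfw(x)
    return pre, rpost, nd, label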

As a corollary, we can now see what we promised above:

1.8 Lemma A reverse post-order numbering of a DAG topologically sorts the DAG’s partial order.

Proof. Using the fact that in a depth-first walk of a DAG there are no back arcs, we see immediately from the table in Theorem 1.5 that the reverse post-order number of any node is less than that of each of its successors. And that immediately implies that the reverse post-order numbering topologically sorts the DAG’s partial order.
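
Theorem 1.7 and Lemma 1.8 translate directly into a test for acyclicity together with a topological sort. The sketch below (Python, assumed names) computes a reverse post-ordering and then checks the arc condition from the table in Theorem 1.5:

def topological_order(graph):
    # Returns (is_dag, order): order lists the nodes by increasing rpost,
    # which topologically sorts the graph whenever it is a DAG (Lemma 1.8).
    # By Theorems 1.5 and 1.7, the graph is a DAG iff no arc x -> y has
    # rpost[x] >= rpost[y], i.e. iff this walk produces no back arcs.
    rpost, visited = {}, set()
    j = len(graph)
    def dfw(x):
        nonlocal j
        visited.add(x)
        for y in graph[x]:
            if y not in visited:
                dfw(y)
        rpost[x] = j; j -= 1
    for x in graph:
        if x not in visited:
            dfw(x)
    is_dag = all(rpost[x] < rpost[y] for x in graph for y in graph[x])
    return is_dag, sorted(graph, key=lambda v: rpost[v])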

1.3 A Characterization of Descendants


Again we let G be a directed graph with a depth-first walk which generates a spanning forest D by Algorithm
B. We will characterize descendants in D. (Actually, we could just start with a spanning forest D and a
depth-first walk of that forest, since the descendant relation is determined only by arcs in D.)
We already know by Lemma 1.4 that if z is a descendant of x, then pre[x] < pre[z] and rpost[x] < rpost[z].

1.9 Lemma

1. If z is a descendant of x and if pre[x] < pre[y] < pre[z], then y is a descendant of x.

2. Similarly, if z is a descendant of x and if rpost[x] < rpost[y] < rpost[z], then y is a descendant of x.

Proof. 1: This follows from the Parenthesis Nesting Property. Since z is a descendant of x, we must have

    start[x] < start[z] < finish[z] < finish[x].

Now since pre[x] < pre[y] < pre[z], we must have one of the two following situations:

    start[x] < start[y] < start[z] < finish[z] < finish[y] < finish[x]

    start[x] < start[y] < finish[y] < start[z] < finish[z] < finish[x]

and so y is a descendant of x.
2: The proof is similar.

1.10 Lemma The pre-order (resp. reverse post-order) numbers of all the descendants of a node x form an interval
in Z whose left-hand endpoint is pre[x] (resp. rpost[x]).

Proof. We know that the pre-order numbers of all the descendants of x fall to the right of pre[x] in Z. The
preceding lemma then shows that there are no gaps in the set of these numbers. The proof for the reverse
post-order numbers is similar.

1.11 Theorem The following are equivalent:

1. x is an ancestor of y.

2. pre[x] < pre[y] and rpost[x] < rpost[y].

3. pre[x] < pre[y] < pre[x] + ND[x].

4. rpost[x] < rpost[y] < rpost[x] + ND[x].

Proof. 1 ⇐⇒ 2 is just Lemma 1.4. 1 ⇐⇒ 3 and 1 ⇐⇒ 4 by the preceding lemma.
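
Theorem 1.11 is what makes the pre-order numbering and the descendant counts so useful in practice: ancestry in D can be tested in constant time. A minimal Python sketch (assuming pre and nd as computed by, say, the df_label sketch given after Figure 1.6):

def is_ancestor(x, y, pre, nd):
    # Theorem 1.11: x is a (proper) ancestor of y in D
    # iff pre[x] < pre[y] < pre[x] + nd[x].
    return pre[x] < pre[y] < pre[x] + nd[x]

def is_descendant_or_self(y, x, pre, nd):
    # The "every node is a descendant of itself" convention: allow equality.
    return pre[x] <= pre[y] < pre[x] + nd[x]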

1.4 The Path Lemma


1.12 Lemma If there is a path x = s0 → s1 → . . . → sn = y such that pre[x] < pre[sj ] for 1 ≤ j ≤ n, then y is a
descendant of x.

Remarks 1. By the same token, every element of the path is a descendant of x.


2. The condition asserts that at the time that x is first reached, no element of the path from x to y has yet
been reached.

Proof. We will show by induction on j that each element sj is a descendant of x. Of course this is true for
j = 0. Now say it is true for a particular value j; i.e. say we know that sj is a descendant of x. If j = n, of
course, we are done. Otherwise, we have to show that sj+1 is also a descendant of x. To do this, we look at
the arc sj → sj+1 .
If the arc is a tree arc or a forward arc, then sj+1 is a descendant of sj , and we are done.
Otherwise, by Theorem 1.5, we have

pre[x] < pre[sj+1 ] < pre[sj ]

so again sj+1 is a descendant of x, by Lemma 1.9.

1.13 Lemma (Tarjan’s “Path Lemma”) If pre[x] < pre[y], then any path from x to y must contain a common
ancestor of x and y.

Proof. This is an immediate corollary of Lemma 1.12: Let m be the node on the path such that pre[m] is
a minimum. By the Lemma, m is an ancestor of y. Further, either m = x (in which case m is trivially an
ancestor of x), or we have pre[m] < pre[x] < pre[y], so m is also an ancestor of x by Lemma 1.9.

Note that Lemma 1.12 also follows easily from Lemma 1.13.

1.5 An Application: Strongly Connected Components


In this section we present an efficient algorithm for computing the strongly connected components of an
arbitrary directed graph G.

The reverse graph Gr of G is the graph with the same nodes as G but with each edge reversed (i.e. “pointing
in the opposite direction”). It is immediate that the strongly connected components of G are the same as
those of Gr .

1.14 Lemma If {Ti } is the spanning forest of trees produced by a depth-first traversal of G, then (the set of nodes
of ) each strongly connected component of G is contained in (the set of nodes of ) one of the Ti .

Remark Intuitively, once a depth-first walk reaches one node of a strongly connected component, it will
reach all other nodes in that component in the same depth-first tree of the walk. The path lemma enables
us to give a quick proof of this fact:

Proof. If x and y are contained in the same strongly connected component of G, let us suppose without
loss of generality that pre[x] < pre[y]. Since we know there is a path from x to y, some element u of that
path is an ancestor of x and y. That is, x, y, and u must all belong to the same tree in the forest. Since x
and y are arbitrary, each element in the strongly connected component must belong to that same tree.

procedure SCC(G: graph)


begin
Perform a depth-first traversal of G, assigning to each node x its reverse-post-order number rpost[x].
Perform a depth-first traversal of Gr . Each time a choice is necessary in the while loop of the driving
procedure DFWalk, choose the node x which has not yet been visited and for which rpost[x] is
least.
Return the family of trees produced by this second walk. Each tree is a strongly connected component
of G.
end

Figure 1.7: Algorithm for computing strongly-connected components.
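
A sketch of the two-pass algorithm of Figure 1.7 in Python (the graph representation, recursion, and helper names are assumptions; the algorithm itself is the Kosaraju/Sharir algorithm described in the text):

def strongly_connected_components(graph):
    # First pass: depth-first walk of G, recording rpost[].
    rpost, seen = {}, set()
    j = len(graph)
    def dfw(x):
        nonlocal j
        seen.add(x)
        for y in graph[x]:
            if y not in seen:
                dfw(y)
        rpost[x] = j; j -= 1
    for x in graph:
        if x not in seen:
            dfw(x)

    # Build the reverse graph.
    reverse = {x: [] for x in graph}
    for x in graph:
        for y in graph[x]:
            reverse[y].append(x)

    # Second pass: walk the reverse graph, always starting from the unvisited
    # node with least rpost.  Each tree of this walk is a strongly connected
    # component.
    components, assigned = [], set()
    def collect(x, comp):
        assigned.add(x)
        comp.append(x)
        for y in reverse[x]:
            if y not in assigned:
                collect(y, comp)
    for x in sorted(graph, key=lambda v: rpost[v]):
        if x not in assigned:
            comp = []
            collect(x, comp)
            components.append(comp)
    return components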

1.15 Theorem The algorithm in Figure 1.7 computes the strongly connected components of G.

Proof. By the lemma, each strongly connected component lies entirely within some tree T in the forest
produced by the second depth-first traversal (i.e., the traversal of Gr ). So we only have to prove that each
tree in the forest is strongly connected. To do this, it is enough to prove that if r is the root of a tree T in
the forest and x is an element of T different from r, then x and r are in the same component; i.e. that there
exist paths in G (or in Gr , for that matter) from x to r and from r to x. We may assume that x ≠ r.
Since x is an element of the tree T rooted at r, there is a path in Gr from r to x (namely the path of tree
arcs from r to x). The reverse path is therefore a path in G from x to r. We note that this path lies entirely
in T r .

Note that what we are denoting here by T r is not a tree. It is just the tree T with all the edges reversed.
These reversed edges are all edges in the original graph G.

Thus there is a path in G from x to r. It remains to show that there is a path in G from r to x.

To show this, we will show that in fact r is an ancestor of x in the original depth-first walk (the one over G
in the algorithm).
Now for r to be an ancestor of x, it is necessary and sufficient that both

pre[r] < pre[x]


rpost[r] < rpost[x]

Further, by the construction, we know that rpost[r] < rpost[x]. (This is true because of the choice made in
the second step of the algorithm.)
Therefore r is an ancestor of x iff pre[r] < pre[x].
Suppose then this is not true. Then (since x ≠ r) pre[x] < pre[r], and so by the Path Lemma, there is an
element u on the path in T r from x to r which is an ancestor of both r and x. Hence rpost[u] ≤ rpost[r], and
since u is an ancestor of x (but by assumption r is not), u ≠ r, so in fact rpost[u] < rpost[r], contradicting
the choice of r made in the second step of the algorithm.

This algorithm was published in Aho, Hopcroft, and Ullman[1], and is originally due to R. Kosaraju (unpublished) and Sharir[17].
Figure 1.8 gives an example of the workings of this algorithm.

1.6 Tarjan’s Original Algorithm


Tarjan was the first person to give an algorithm for finding strongly connected components in time proportional to the size of G. Although this algorithm is more intricate than the algorithm presented in the
previous section, it is quite easy to program, and is possibly somewhat more efficient.
We begin with a strengthening of Lemma 1.14:
1.16 Lemma If T is any one of the trees generated by a depth-first search of G, then the intersection of the nodes
and edges of T with those of a strongly connected component of G is again a tree.

Proof. Let R be a strongly connected component of G. By Lemma 1.14, either R is disjoint from T or its
set of nodes is completely contained in T . Further, from the proof of Lemma 1.14, we know that given any
two elements x and y of R, they have a common ancestor in R. By iterating the process of finding common
ancestors, we can thus find an element r of R which is a common ancestor of every other element in R.
If y is any node in the component, and if r →∗ x →∗ y is a path (x being any node in G) in T , then since
there must be a path in G from y to r, x must also be in the component. Thus, the intersection of T with
the component is itself a tree rooted at r.

We refer to the node r as the root of the strongly connected component. r can be characterized as the node
in the component whose pre-order number is least. Since it is an ancestor of every node in the component,
its reverse post-order number is also least.
Note that r is not a unique entry node of R: a strongly connected component could have more than one entry
node; and different depth-first searches of G could yield different roots of a strongly connected component.

A depth-first walk in a directed graph. The ordered pairs by each node show the pre-order and reverse post-order numbers determined by the walk.

Figure 1.8: The reverse graph, showing the strongly connected components as found by the algorithm of
Section 1.5. The numbers show the order in which the components are discovered.

Now for any node x in a strongly connected component R, consider the set of paths P(x) of the form
x = x0 → x1 → . . . → xn → y where

1. all the nodes in the path are in R,


2. all the arcs in the path except the last one are tree arcs (and so in particular, each arc is an arc from
a node to a descendant of that node),
3. the arc xn → y is either a back arc or a cross arc.

Let L(x) denote the lesser of pre[x] and the minimum value of pre[y] where y is reachable from x by such a path. So in particular, L(x) ≤ pre[x].
Note that L(x) would have the same value if the arcs up to the last arc could also be forward arcs. Such
a forward arc could always be replaced by a sequence of tree arcs. For our purposes, it is simpler to ignore
the forward arcs.
1.17 Lemma A node x is the root r of a strongly connected region ⇐⇒ L(x) = pre[x].

Proof. =⇒ : This is immediate from the fact that pre[y] ≥ pre[r] for any node y in R.
⇐= : Say x ≠ r. Again, denote the depth-first spanning tree by T . By the previous lemma, T ∩ R is a tree TR rooted at r. Since x ≠ r, the subtree of TR rooted at x (call it TR,x ) does not contain r. There is a path from x to r (and certainly every element of this path is in R). Let y be the first element of this path not in TR,x . Then the predecessor p of y on this path can be reached by a series of arcs in TR,x , and so that series of arcs followed by the arc p → y constitutes a path P in P(x).
Now if pre[y] > pre[x], then by the Path Lemma, x would be an ancestor of y, which is impossible because y ∉ TR,x . Hence pre[y] < pre[x], and so L(x) < pre[x].

The problem with this Lemma is that without knowing R to begin with, condition 1 is impossible to evaluate.
Therefore, we shall show below how to replace this condition with one that is easily determined. This is
Tarjan’s key point.
1.18 Lemma If R is a strongly connected component of G with root r, and if r → x1 → x2 → . . . → xn → y
is a path with all arcs except the last being either tree arcs or forward arcs, and if pre[y] < pre[r] (so in
particular, y does not lie in R), then if s is the root of the strongly connected component containing y, we
have rpost[s] > rpost[r].

Proof. An equivalent statement is this: if R and S are distinct strongly connected components with roots
r and s respectively, and if x ∈ R and y ∈ S and there is an arc x → y, and if pre[y] < pre[r], then
rpost[s] > rpost[r].
We know in any case that pre[s] ≤ pre[y] < pre[r]. If also rpost[s] < rpost[r] then r would be a descendant
of s. Since there is a path from r to s (containing the arc x → y), this would mean that R and S are really
the same strongly connected component, a contradiction.

So here is how the algorithm works: We perform a depth-first search of the graph. We maintain an auxiliary
stack σ of nodes of G. σ will be managed just like the implicit stack of the depth-first search, except that
nodes are not automatically popped off it when their processing is finished.

To be precise, σ is managed as follows: As each node is first reached, it is pushed on σ. (That is, the
nodes of G are pushed on σ in pre-order.) Thus the descendants of each node lie above the node on σ.
In particular, the roots of the strongly connected components of G will eventually be on σ with the other
elements of their component above them. When the processing (i.e. in the depth-first walk) for a root r is
finished, we pop every element in the stack down to and including r. (For this to happen, of course, we have
to be able to identify the nodes which are roots of strongly connected regions. We deal with this below.)
No other elements of other strongly connected regions can be among the elements popped. This is because
if y were such an element, it would have been placed on the stack after r was first reached and before the
processing of r was finished, and so would be a descendant of r. So by the preceding lemma, the root s of
the component of y would also be a descendant of r, and so would have been placed above r on the stack.
Since its processing would have been finished before r’s processing, s and all nodes higher than it on the
stack, including y, would already have been popped off the stack before we pop r.
As noted above, for this algorithm to work, we have to be able to identify the roots of the strongly connected
regions. Using the stack σ, we can do this as follows: for each node x being processed, compute the attribute
L[x] (which replaces the L(x) defined above), whose value is defined to be the lesser of pre[x] and the minimum
of pre[y] taken over all y such that there is a path x = x0 → x1 → . . . → xn → y where

1. Each arc up to xn is a tree arc.

2. The arc xn → y is a back arc or a cross arc.

3. y is on σ at some point during the processing of x.

Here condition 3 substitutes for the previous condition 1.


The same argument as in Lemma 1.17 shows that if x is not a root, then L[x] < pre[x].
On the other hand, let r be a root, and let y be a node for which pre[y] = L[r]. Now the only way we could have L[r] < pre[r] is if y belonged to a strongly connected component with root s such that rpost[s] > rpost[r] (this is Lemma 1.18). Since we also have pre[s] ≤ pre[y] = L[r] < pre[r], we have

start[s] < start[r]


finish[s] < finish[r]

and so by the parenthesis nesting property, finish[s] < start[r]; i.e., the processing of s is finished before the
processing of r starts, which means that s (and hence y) is not on the stack at any time during the processing
of r. Hence L[r] = pre[r].
Thus in the algorithm, we compute L[x] for each node as it is processed. When the processing of the node is
finished, if L[x] = pre[x], it is known to be a root, and it and all elements above it on the stack are popped
and constitute a strongly connected component.
The complete algorithm is shown in Figure 1.9. Figure 1.10 shows the order in which the strongly connected
components of the graph in Figure 1.8 are discovered, with the same depth-first walk as in that figure.

function SCCwalk(G: graph): set of sets of nodes


σ: stack of nodes;
SetOfComponents: set of sets of nodes;
begin
Initialize the stack σ to be empty;
SetOfComponents← ∅;
i ← 1;
for each node x ∈ G do
L[x] ← 0;
pre[x] ← 0;
end for;
while there is an element x ∈ G with pre[x] = 0 do
call SCC(x);
end while;
return SetOfComponents;
end

procedure SCC(x: node)


begin
pre[x] ← i; i ← i + 1;
L[x] ← pre[x];
push x onto σ;
for each successor y of x do
if pre[y] = 0 then -- x → y is a tree arc.
call SCC(y);
if L[y] < L[x] then
L[x] ← L[y];
end if ;
else if pre[x] < pre[y] then -- x → y is a forward arc.
Do nothing;
else -- x → y is a back or cross arc.
if (y is on σ) and (pre[y] < L[x]) then
L[x] ← pre[y];
end if ;
end if ;
end for;
if L[x] = pre[x] then
C ← ∅;
repeat
pop z from σ;
Add z to C;
until z = x;
Add C to SetOfComponents;
end if ;
end

Figure 1.9: Tarjan’s original algorithm for finding strongly connected components.
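
For comparison with Figure 1.9, here is a compact Python sketch of Tarjan’s algorithm (assumed names; an explicit on_sigma set is used so that the test “y is on σ” is constant time, and forward arcs need no special case because they can never lower L[x]):

def tarjan_scc(graph):
    # Tarjan's algorithm (Figure 1.9).  A node x is the root of its strongly
    # connected component iff L[x] = pre[x] when its processing finishes.
    pre, L = {}, {}
    sigma, on_sigma = [], set()
    components = []
    counter = 0

    def scc(x):
        nonlocal counter
        counter += 1
        pre[x] = L[x] = counter
        sigma.append(x); on_sigma.add(x)
        for y in graph[x]:
            if y not in pre:              # tree arc
                scc(y)
                L[x] = min(L[x], L[y])
            elif y in on_sigma:           # arc to a node still on sigma
                # For a back or cross arc this may lower L[x]; for a forward
                # arc pre[y] > pre[x] >= L[x], so the min changes nothing.
                L[x] = min(L[x], pre[y])
        if L[x] == pre[x]:                # x is the root of a component
            comp = []
            while True:
                z = sigma.pop(); on_sigma.discard(z)
                comp.append(z)
                if z == x:
                    break
            components.append(comp)

    for x in graph:
        if x not in pre:
            scc(x)
    return components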

Figure 1.10: The same directed graph as in Figure 1.8. Given the same depth-first walk of this graph, this figure shows the order in which the strongly connected components are discovered by the algorithm of Section 1.6.
Chapter 2

Flow Graphs

2.1 Flow Graphs

A flow graph is a finite directed graph G together with a distinguished element s (“start”, the “initial
element”) such that every element in G is reachable by a path from s. Henceforth, G will always be a flow
graph with initial element s. As a notational convenience, we may sometimes write ⟨G, s⟩ to denote this flow
graph.
A flow graph must have at least one node (i.e., s). A flow graph with only one node is called the trivial flow
graph.
If P is a path x0 → x1 → . . . → xn in G, we call the nodes {x1 , . . . , xn−1 } the interior points of the path.
A loop or cycle is a path which starts and ends at the same node. A self-loop is a loop of the form x → x.
We say that the path P is simple or cycle-free iff all the nodes {xj : 0 ≤ j ≤ n} in it are distinct.
If there is a path from x to y, then there is a simple path from x to y. (This is a simple proof by induction
on the number of elements in the path. If the path is not simple, then there must be a cycle in it; removing
the cycle is the inductive step.)
Most of our conventional usage for directed graphs applies in particular to flow graphs:

• If x → y is an arc in a flow graph, we say that x is a predecessor of y, and that y is a successor of x.

• The terms ancestor , child , and descendant have different meanings: they always refer to relations
within a tree (e.g. a dominator tree, or the spanning tree of a flow graph).

• To be precise, we should really denote a flow graph by ⟨G, A, s⟩, where A is the set of arcs (i.e. edges)
in the graph. However, we will generally omit reference to A.

• Similarly, a sub graph of a flow graph ⟨G, A, s⟩ should really be denoted ⟨H, E⟩, where E is a collection
of edges in A connecting elements of H. However, in general we denote a sub graph of G simply by
H. We will make the convention that the edges in the sub graph consist of all edges in the flow graph
connecting members of H.


• The one exception to this convention is when the sub graph H of G is a tree or a forest of trees. In this
case, the edges of H are understood to be only the tree edges, ignoring any other edges of G between
members of H.

2.2 Dominators
If x and y are two elements in a flow graph G, then x dominates y (x is a dominator of y) iff every path
from s to y includes x. If x dominates y, we will write x≫y. For example, in Figure 2.1 (modified from a
graph in Tarjan[19]), C dominates J, but C does not dominate I.


Figure 2.1: A flow graph in which we wish to find dominators.

In particular, if x is any element of G, then x dominates itself, and s dominates x. If x≫y and x ≠ y, we say that x strongly (or strictly) dominates y. We write x ≫̸ y for “x does not dominate y”.
2.1 Lemma The dominance relation is a partial order on G. That is, for all x, y, and z in G,

1. x≫x.
2. If x≫y and y≫x, then x = y.
3. If x≫y and y≫z, then x≫z.

Proof. 1 and 3 are obvious.


To prove 2, let P be a simple path from s to y. Then P must include x. But then the initial part of the
path from s to x must itself include y. So unless x = y, the path includes y in two different positions, and
is not simple, a contradiction.

2.2 Lemma If x and y both dominate z then either x≫y or y≫x.



Proof. Let P be a simple path from s to z. P must contain both x and y. If y occurs after x, then x≫y,
for if not, there would be a path P ′ from s to y which did not include x. Replacing that part of P from s
to y with P ′ yields a path from s to z not including x, a contradiction. Similarly, if x occurs after y in P ,
then y≫x.

Thus, the set of dominators of a node x is linearly ordered. In particular, every element x of G except s has an immediate dominator; i.e. an element y that strongly dominates x and such that every z that strongly dominates x satisfies z≫y. We denote the immediate dominator of x by idom(x).
Now let us define a graph T on the elements of G as follows: there is an arc from x to y iff x is the immediate
dominator of y.
2.3 Lemma T is a tree rooted at s.
Proof. s has no predecessors in T . (s has no dominators other than s itself.)
The in-degree of each element except s is 1. (There is only 1 arc leading to any x ≠ s; namely the arc from
its immediate dominator.)
Every element in G is reachable from s in T . (s dominates every element in G, and following the immediate
dominator chain of any element x back from x leads to s.)

T is called the dominator tree of G. Figure 2.2 shows the dominator tree for the flow graph of Figure 2.1.


Figure 2.2: The dominator tree for the flow graph of Figure 2.1.
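
No algorithm for computing dominators is given at this point in the notes, but the definition itself already yields a simple (if inefficient) procedure: x dominates y exactly when deleting x makes y unreachable from s. The following Python sketch (names and representation assumed) is intended only as an illustration of the definition:

def dominators(graph, s):
    # dom[y] = set of nodes that dominate y.  x dominates y (for x != y)
    # iff y is unreachable from s once x is removed; every node dominates itself.
    def reachable_without(removed):
        seen = set()
        stack = [s] if s != removed else []
        while stack:
            u = stack.pop()
            if u in seen:
                continue
            seen.add(u)
            for v in graph[u]:
                if v != removed:
                    stack.append(v)
        return seen

    dom = {y: {y} for y in graph}
    for x in graph:
        for y in set(graph) - reachable_without(x):
            dom[y].add(x)
    return dom

From dom, the immediate dominator of a node y ≠ s can be read off as the element of dom[y] − {y} with the largest dominator set, since the dominators of y are linearly ordered (Lemma 2.2).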

There are some relations between the arcs in ⟨G, s⟩ and the arcs in its dominator tree T . An arc in T may
correspond to an arc in ⟨G, s⟩, or it may not. (For instance, the arc S → I in the dominator tree in Figure 2.2
does not correspond to an arc in the flow graph in Figure 2.1.) In general, an arc in ⟨G, s⟩ falls into one of
the following three categories:

1. Arcs x → y where x is an ancestor of y in T (i.e. x dominates y). In this case, x must in fact be the
parent of y in T (i.e. x = idom(y)), since otherwise there would be a path from s to x and then to
y which avoided idom(y) in T , a contradiction. (As we have seen, however, not every arc in T has to
come from an arc in ⟨G, s⟩.)

2. Arcs x → y where y is an ancestor of x in T (i.e. where y dominates x). (These we may call “back arcs”; cf. the beginning of the proof of Theorem 3.25.)
3. All other arcs x → y. y cannot be s, since then y≫x, which was handled by the previous case.
Therefore, y has an immediate dominator z = idom(y). Further, z must be an ancestor of x in T (i.e.
z = idom(y)≫x), since otherwise there would be a path from s to x, and then to y, which avoided z,
again a contradiction.

Figure 2.1 contains examples of each of these three types of arcs. In any case, we see that if x → y is an arc
in G then either y is an ancestor of x in T or the parent of y in T is an ancestor of x in T . That is, either y
dominates x or the immediate dominator of y dominates x. We will give a name to this result:
2.4 Lemma (Dominator Lemma) If x → y is an arc in a flow graph G, then either

1. y≫x, or

2. y ≠ s and idom(y)≫x.

Equivalently, either y = s or idom(y)≫x.

2.3 Depth-First Spanning Trees


We will perform depth-first walks of flow graphs. All such walks will start at s; hence the spanning forests
they generate will all be trees. Algorithm B of Chapter 1 (Figure 1.6) thus can be rewritten a little more
simply as shown in Figure 2.3.
2.5 Theorem The pre-order and reverse post-order numberings corresponding to any depth-first walk of a flow
graph G each topologically sort the dominance partial ordering in G.

Proof. If x dominates y and x ≠ y, then x must be an ancestor of y in any depth-first walk of G: when y is first reached, the tree arcs leading to y form a path in G from s to y, and its nodes other than y are exactly the nodes then on the stack; since x dominates y, x lies on this path, so x is an ancestor of y. Hence pre[x] < pre[y] and rpost[x] < rpost[y] by Lemma 1.4.

The study of flow graphs inevitably revolves about the study of loops—if there are no loops in a flow graph,
the graph is a DAG, and in some sense the flow of control in the computer program it represents is trivial.
So one of the main problems that has to be faced is how to get a handle on the looping structure of a flow
graph. The simplest idea is undoubtedly to look at strongly connected subgraphs of the flow graph. In fact,
it is evident that every flow graph can be uniquely decomposed into maximal strongly connected subgraphs.
The problem with this is twofold:

1. If the graph is a DAG, every node in the graph is a separate maximal strongly connected subgraph.
This is too much detail—we really don’t care about all those nodes individually.
2. If the program contains a triply nested loop, for instance, all three of those loops will be contained in
the same maximal strongly connected subgraph. This is too little detail. We want the decomposition
to reflect the structure of loop inclusion.

We will see in chapter 3 how we can expose the looping structure of a flow graph by means of a decomposition
into intervals.

procedure DFWalk(G: graph)


begin
i ← 1;
j ← number of elements of G;
for each node x ∈ G do
pre[x] ← 0;
rpost[x] ← 0;
end for;
call DFW(s);
end

procedure DFW(x: node)


begin
pre[x] ← i; i ← i + 1;
ND[x] ← 1; -- if x has no successors, ND[x] = 1
for each successor y of x do
if pre[y] = 0 then -- y reached for the first time
Label x → y a tree arc; -- Add arc x → y to D
call DFW(y);
ND[x] ← ND[x] + ND[y];
else if rpost[y] = 0 then -- y is already on the stack
Label x → y a back arc;
else if pre[x] < pre[y] then -- y’s processing finished
Label x → y a forward arc;
else -- y’s processing finished
Label x → y a cross arc;
end if ;
end for;
rpost[x] ← j; j ← j − 1; -- x’s processing finished
end

Figure 2.3: Algorithm B: Labelling and Numbering for a Depth-First Walk.


Chapter 3

Reducible Flow Graphs

The definitions that follow are all motivated by the goal of finding intrinsic characterizations of single-entry
loops; that is, loops that can be entered at only one flow graph node. Such loops can of course be nested—
programmers do this all the time. We need to find a way of representing this nesting in a way that can be
analyzed efficiently.

3.1 Intervals
If H is a subset of nodes of a flow graph ⟨G, s⟩, then an element h of H is called an entry node of H iff either

1. h = s; or

2. there is an edge x → h in G with x ∉ H.

A region in a flow graph is a sub graph H with a unique entry node h.


In particular, a region is a sub flow graph. We can write ⟨H, h⟩ to denote a region with entry h. We note
that since H has a unique entry node, if s ∈ H, we must have s = h. The term “header” is also used to
describe an entry node of a region.
A pre-interval in a flow graph is a region ⟨H, h⟩ such that every cycle in H includes h. That is, a pre-interval
in a flow graph is a sub graph H satisfying the following properties:

1. H has a unique entry node h.


2. Every cycle in H includes h.

An interval in a flow graph is a maximal pre-interval.1

3.1 Lemma Every element in a region ⟨H, h⟩ is reachable from h by a path in H.


1 In the literature, no distinction is customarily made between an interval and what here is called a pre-interval, although
sometimes an interval is referred to as a “maximal interval”. I have introduced the term pre-interval to make the distinction
explicit. Pre-intervals are useful in proofs and in understanding the algorithms, but intervals are the objects of real interest.


Proof. If x is any element of H, then since ⟨G, s⟩ is a flow graph, there is a path

P : s = x0 → x1 → x2 → . . . → xn = x

Let xj−1 be the last element of this path not in H. Then since xj−1 → xj is an arc in G, xj is an entry node of H and so must equal h. Thus the path

xj → . . . → xn = x

is a path in H from h to x.

3.2 Lemma If ⟨H, h⟩ is a region and x ∈ H, any simple path from h to x lies entirely within H.

Proof. If the path h = x0 → x1 → . . . → xn = x does not lie entirely within H, then as in the proof of the
previous lemma, one of the points xj with j ≥ 2 is an entry node of H, and hence must be h. But then the
path is not simple.

3.3 Lemma If ⟨H, h⟩ is a region, then h dominates (in G) every other node in H.

Proof. If s ∈ H, then h = s, and so h dominates every element in G and so also in H.


Otherwise, there is a path from s to some node x in H which avoids h and which is not entirely contained in
H. Let y be the last element on this path which is not in H, and let z be the next element on this path (so
z ∈ H; z might be x, in particular). Then as before, z is an entry node of H but z ≠ h, a contradiction.

3.4 Theorem For each element h ∈ G, there is a region R(h) with header h which includes all other regions
with header h. R(h) consists of all elements of G which are dominated by h.

Remark The region is not a “maximal region” unless h = s; in fact, the only maximal region is ⟨G, s⟩ itself.
But the region is maximal among all regions with header h.

Proof. Let ℋ denote the family of regions with header h. Since {h} ∈ ℋ, ℋ ≠ ∅. The union of any two members of ℋ is in ℋ. But then since ℋ is finite, the union of all the members of ℋ is in ℋ. Thus R(h) exists.
Let H = {x : h≫x}. By Lemma 3.3, R(h) ⊂ H.
To prove the converse inclusion, we will show that H is a region with header h—this will show that H ⊂ R(h).
If H is not such a region, then it has an entry node x different from h. Therefore there is an arc y → x
with y ∉ H. Since then y is not dominated by h, there is a path P from s to y which avoids h. But then
appending the arc y → x to the end of P yields a path from s to x which avoids h, a contradiction, because
h dominates x. Thus H has h as its unique entry node, and is therefore a region, and so H ⊂ R(h).

3.5 Lemma A region is a convex subset of the dominator tree. That is, if ⟨H, h⟩ is a region, if x and y are in
H, and if there is an element m ∈ G such that x≫m≫y, then m ∈ H.

Proof. If P is a simple path from s to y, then since h≫x, that part of P which starts at h is a simple path
from h to y which includes x and m. By Lemma 3.2, it lies entirely within H, so m ∈ H.

Remark Not every convex subset of the dominator tree is a region, however. For instance, the union of two
disjoint subtrees of the dominator tree is not a region.

3.6 Lemma If ⟨H, h⟩ and ⟨K, k⟩ are regions with a non-empty intersection, then either k ∈ H or h ∈ K. If k ∈ H, then ⟨H ∪ K, h⟩ is a region.

Proof. There must be a node x ∈ H ∩ K. Since h and k both dominate x, either h dominates k or k
dominates h. Say h dominates k. Then by Lemma 3.5, k ∈ H. But then H ∪ K must be a region with
header h, for any path entering H ∪ K either enters H (at h) or K (at k). But since k ∈ H, to reach k it
must first enter H at h.

3.7 Lemma If ⟨H, h⟩ and ⟨K, k⟩ are pre-intervals with a non-empty intersection, then either k ∈ H or h ∈ K. If k ∈ H, then ⟨H ∪ K, h⟩ is a pre-interval.

Proof. By the previous lemma, we already know that either k ∈ H or h ∈ K and that H ∪ K is a region.
Without loss of generality, let us assume that k ∈ H. We will show that ⟨H ∪ K, h⟩ is a pre-interval.
Let C = {x0 → x1 → . . . → xn = x0 } be a cycle in H ∪ K. If C ⊂ H, then C must include h. Otherwise,
there is a node xj ∈ K − H, and there are two possibilities:

Case I: C ⊂ K. Then C includes k, which is in H, and so C enters H (since xj ∉ H), so C must enter H at h; i.e. h ∈ C.

Case II: C 6⊂ K. Again, C must re-enter H (since it is not completely contained in K), so it must contain
h.

So in any case, h ∈ C, and so ⟨H ∪ K, h⟩ is a pre-interval.

3.8 Theorem For each element h ∈ G, there is a pre-interval with header h which includes all other pre-intervals
with header h.

Remark The pre-interval is not necessarily a maximal pre-interval (i.e. is not necessarily an interval). But
the pre-interval is maximal among all pre-intervals with header h.

Proof. As in the proof of Theorem 3.4, let ℋ denote the family of pre-intervals with header h. Since {h} ∈ ℋ, ℋ ≠ ∅. Lemma 3.7 shows that the union of any two members of ℋ is in ℋ. But then since ℋ is finite, the union of all members of ℋ is in ℋ.

3.9 Theorem If ⟨H, h⟩ and ⟨K, k⟩ are intervals in the flow graph ⟨G, s⟩, then they are either identical or disjoint.

Proof. If H and K are not disjoint, then by Lemma 3.7, H ∪ K is a pre-interval. But then by maximality,
H = H ∪ K = K.

3.10 Theorem G can be uniquely partitioned into (disjoint) intervals.

Proof. Any single node in G by itself constitutes a pre-interval, and since G is finite, every pre-interval is contained in a maximal one. Hence every node in G lies in some interval. That interval is unique, and distinct intervals are disjoint, by the previous theorem.

This last theorem is an essential and key fact concerning intervals.



function MakePI(h: node)


begin
A ← {h}; -- initialize
while there exists a node m ∉ A with m ≠ s, all of whose predecessors are in A do
A ← A ∪ {m};
end while;
return A;
end

Figure 3.1: Algorithm C


Compute the Largest Pre-Interval with Header h

3.2 Algorithms for Constructing Intervals


For each h ∈ G, let P I(h) be the largest pre-interval with header h. By Theorem 3.8, P I(h) exists and
contains every pre-interval with header h. We will show that Algorithm C in Figure 3.1 computes P I(h):

3.11 Theorem Algorithm C computes P I(h).

Proof. 1. Since all the predecessors of each node m which is added to A lie in A already, by induction A has h as its unique entry point.
2. If there is a loop x0 → x1 → . . . → xn = x0 in A which avoids h, let xj be the last element in this loop which was added to A by the algorithm. Then xj+1 must have already been in A. But this is a contradiction, for since xj+1 ≠ h, it must have been added to A by the algorithm, and when this happened, one of its predecessors would have been xj , which was not yet in A.
Thus A is a pre-interval with header h. We must show that it is the maximal such pre-interval. So say B is
a strictly larger pre-interval with header h. By the construction in the algorithm, every element of B − A
must have a predecessor in B − A; otherwise it would have been added to A. (Note that s cannot be in
B − A, since B has header h.) But then, since B is finite, if we take any element x ∈ B − A and find a
predecessor in B − A, and iterate this process, we must eventually create a cycle in B − A. This cycle does
not contain h, which contradicts the assumption that B is a pre-interval with header h.
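
Algorithm C translates almost line for line into Python. In this sketch (assumed names) the predecessor lists are computed up front, and the loop carries the requirement m ∉ A so that it terminates:

def make_pre_interval(graph, h, s):
    # Algorithm C (Figure 3.1): the largest pre-interval PI(h) with header h.
    preds = {x: [] for x in graph}
    for x in graph:
        for y in graph[x]:
            preds[y].append(x)
    A = {h}
    changed = True
    while changed:
        changed = False
        for m in graph:
            if m != s and m not in A and all(p in A for p in preds[m]):
                A.add(m)
                changed = True
    return A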

Now Algorithm D in Figure 3.2 can be used to partition G into intervals. The algorithm uses an auxiliary
set H (a worklist of headers) and builds a family J of intervals.
3.12 Theorem Algorithm D partitions G into intervals.

Proof. Since all elements of G are reachable from s, the algorithm at least partitions G into pre-intervals.
So we only have to show that each pre-interval created by the algorithm is an interval. We will proceed by
induction.
The first pre-interval created is P I(s). This must be an interval, because any larger pre-interval would also
have header s; but P I(s) includes all pre-intervals with header s.
Now assume that we are about to add P I(h) to J ; by the inductive assumption all the pre-intervals already
added to J were actually intervals. We must show that P I(h) is also an interval.

procedure MakeIntervals
begin
-- Initialize H and J :
H ← {s};
J ← ∅;
Mark all nodes of G “unselected”.
while H ≠ ∅ do
Extract a node h from H;
Compute P I(h) by algorithm C;
Mark all nodes of P I(h) “selected”;
Add P I(h) to J ;
Add to H any node m ∈ G which is not selected but which has a selected predecessor;
end while;
return J ;
end

Figure 3.2: Algorithm D: Partition a Flow Graph into Intervals

In any case, P I(h) is contained in an interval H, say. Since intervals are disjoint, H is disjoint from all the
intervals in J . Since h has a predecessor in one of the intervals in J , h must be the unique entry point (i.e.
the header) of H. But we know that P I(h) contains every pre-interval with header h, so P I(h) = H. Hence
P I(h) is an interval, and we are done.

If P I(h) is actually an interval, we shall refer to it also as I(h).
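Continuing the sketch, Algorithm D can be written directly on top of make_pre_interval above; again the identifiers are illustrative only.

    def make_intervals(succ, s):
        """Algorithm D (sketch): partition the flow graph into intervals, keyed by their headers."""
        pred = {v: set() for v in succ}
        for u in succ:
            for v in succ[u]:
                pred[v].add(u)
        selected = set()
        worklist = [s]                 # the worklist H of candidate headers
        intervals = {}                 # the family J, as a map header -> interval
        while worklist:
            h = worklist.pop()
            if h in selected:          # h may have been put on the worklist more than once
                continue
            interval = make_pre_interval(succ, s, h)
            intervals[h] = interval
            selected |= interval
            for m in succ:             # any unselected node with a selected predecessor is a new header
                if m not in selected and pred[m] & selected:
                    worklist.append(m)
        return intervals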

3.3 Reducible Flow Graphs

Corresponding to the flow graph G, we can create a derived flow graph I(G) as follows:

1. The nodes of I(G) are the intervals of G.

2. The initial node of I(G) is P I(s).

3. If I and J are two distinct intervals of G, there is an arc I → J in I(G) iff there is an arc in G from
some element of I to the header of J. (In particular, I(G) contains no self-loops.)

Thus, each node in G has an image node in I(G) (namely, the interval to which it belongs), and each path
in G has an image path in I(G). In general, each subgraph H of G has an image, which we will denote by
I(H), and which is a subgraph of I(G). Going in the other direction, every path in I(G) is the image of at
least one path in G.

Note that the image of a simple path may not be simple. (A path can start in an interval but not at its
header, leave that interval, and end by reentering the interval at its header. The path could be simple but
the image would not be.)
On the other hand, every path in I(G) is the image of a path in G, and a simple path in I(G) can be taken
to be the image of a simple path in G.
Now this process can be repeated: If we set G0 = G and for all j > 0 we set Gj = I(Gj−1 ), then the sequence
G0 , G1 , . . . is called the derived sequence for G.
The number of elements of I(Gj ) is always ≤ the number of elements of Gj . Further, I(Gj ) has the same
number of elements as Gj iff each element of Gj constitutes an interval of Gj , in which case I(Gj ) = Gj .
Thus, after a finite number of terms, each term of the derived sequence is a fixed flow graph Î(G) called the
limit flow graph of G. Every node of Î(G) is an interval.
G is said to be reducible iff Î(G) is the trivial flow graph (i.e. the flow graph with only one node). A flow
graph which is not reducible is said to be nonreducible. Figure 3.3 shows the sequence of reductions of a
reducible flow graph.

Figure 3.3: Derived Sequence of a Reducible Flow Graph

For example, if G is a directed acyclic graph (DAG), then G is reducible: since there are no cycles, hG, si is
itself an interval, so I(G) already is the trivial flow graph.
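These definitions translate almost verbatim into code. The sketch below builds the derived graph (representing each node of I(G) by the header of its interval) and iterates it to test reducibility; it reuses make_intervals from the earlier sketch and is meant only to illustrate the definitions, not to be efficient.

    def derived_graph(succ, s):
        """Build I(G): one node per interval of G, each represented by the interval's header."""
        intervals = make_intervals(succ, s)
        owner = {v: h for h, nodes in intervals.items() for v in nodes}
        dsucc = {h: set() for h in intervals}
        for u in succ:
            for v in succ[u]:
                if owner[u] != owner[v]:          # an arc into a different interval enters at its header
                    dsucc[owner[u]].add(owner[v])
        return dsucc, s                           # the initial node of I(G) is the interval PI(s)

    def is_reducible(succ, s):
        """G is reducible iff the derived sequence ends at the trivial flow graph."""
        while len(succ) > 1:
            new_succ, s = derived_graph(succ, s)
            if new_succ == succ:                  # I(G) = G: a non-trivial limit flow graph
                return False
            succ = new_succ
        return True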
We will say that a non-trivial limit flow graph is an irreducible flow graph. (Note, however, that in the
literature, the terms irreducible and nonreducible have sometimes been used interchangeably.)

Flow graphs which are generated by programs written in languages with structured flow of control statements
such as if-then-else constructions and while loops but no goto statements are automatically reducible. And
in fact, it has been observed that competent programmers who do use goto statements virtually always
produce code which leads to reducible flow graphs anyway. So nonreducible flow graphs really are in some
sense pathological.
The concept of reducibility seems to be an intrinsically important one, as there are many striking character-
izations of reducible flow graphs, as we will see below. Right now, we will prove a useful result relating the
derived sequence to the dominance partial order:

3.13 Lemma If ⟨H, h⟩ and ⟨K, k⟩ are intervals in G and if x ∈ K, the following are equivalent:

1. H≫K in I(G).

2. h≫k in G.

3. h≫x in G.

Proof. 1 =⇒ 3: Otherwise, there would be a path from s to x which avoids h. But then this path cannot
include any member of H, and so its image in I(G) avoids H; hence H does not dominate K in I(G).
3 =⇒ 2: Otherwise there would be a path from s to k not including h. Adjoining to this path the path in
K from k to x gives a path from s to x not including h, a contradiction.
2 =⇒ 1: Any path from s to k must go through h, which means that any path in I(G) from I(s) to K must
go through H, so H≫K in I(G).

3.14 Theorem If x → y is an arc in G, and if the images of x and y in an element Gj of the derived sequence
of G are xj and yj respectively and if xj ≠ yj , then y≫x ⇐⇒ yj ≫xj .

Proof. Let us say the images of x and y in the elements Gk of the derived sequence of G are xk and yk
respectively.
Since xj ≠ yj , we must have yk ≠ xk for 0 ≤ k ≤ j. Therefore the arc x → y has an image arc xk → yk in
Gk for 0 ≤ k ≤ j. For 0 ≤ k < j, xk and yk cannot be in the same interval of Gk , and hence yk must be
an interval header in Gk . (This is where we use the fact that x → y is an arc.) Hence applying Lemma 3.13
repeatedly, we get

y≫x ⇐⇒ y1 ≫x1 ⇐⇒ . . . ⇐⇒ yj ≫xj .

3.4 A Subgraph of Nonreducible Flow Graphs


We will show that a flow graph is nonreducible iff it contains a subgraph of the form in Figure 3.4, where

1. the paths P , Q, R, S, and T are all simple.


2. the paths P , Q, R, S, and T have pairwise disjoint interiors.
3. nodes s, a, b, and c are distinct, unless a = s, in which case there is no path S in the subgraph.

4. nodes s, a, b, and c do not occur in the interiors of any of the paths P , Q, R, S, and T .

Let us call this subgraph (∗).

[Figure: node a, reached from s by the path S (absent when a = s); paths R from a to b and T from a to c; paths P from b to c and Q from c to b.]

Figure 3.4: A Subgraph of Nonreducible Flow Graphs.

3.15 Lemma If two paths in G have disjoint interiors, then their images in I(G) have disjoint interiors.

Proof. If not, then both paths in G have an interior node in the same interval in G, and furthermore, the
starting node of each path cannot be in that interval (for otherwise, that interval would not be an interior
point of the corresponding image path). But then each path must have entered that interval at an interior
point of the path, which means that the header node of the interval must be an interior point of each path,
a contradiction.

3.16 Lemma If (∗) is a subgraph of I(G), then (∗) is a subgraph of G.

Proof. For a, b, and c in G, take the headers of the (three distinct) intervals in G which are the vertices of
(∗) in I(G). The paths in G will be composed of arcs between intervals (which correspond to the arcs in the
paths of I(G)), and arcs totally within intervals. Since no interval in G which corresponds to a node on any
of the paths of (∗) in I(G) occurs twice, these paths will all have disjoint interiors, and the nodes s, a, b,
and c will not occur on the interior of any of these paths. Further, the paths can all be taken to be simple.

3.17 Lemma If (∗) is a subgraph of G, then (∗) is a subgraph of I(G).

Proof. b and c cannot be in the same interval, since if they were, the paths P and Q would form a cycle
which would have to include the header of that interval (either because it was completely contained within
the interval or because it left the interval and re-entered it). But the path SR also enters the interval, and
by assumption, S ∪ R intersects the cycle P ∪ Q only at b, so b would have to be the header of the interval.
But similarly, c would have to be the header of the interval, and b 6= c, a contradiction.

a and b cannot be in the same interval, since if they were, the header of that interval would have to lie in S.
(If a = s, the header of course would be a.) But then the cycle P Q could not be contained in that interval,
since it does not contain the header, and it could not leave and re-enter the interval for the same reason.
Similarly, a and c cannot be in the same interval. Therefore, a, b, and c have distinct images in I(G).
Lemma 3.15 then shows that the images of P , Q, R, and T have disjoint interiors.
The image of a cannot lie on the interior of the image of S, for then S would have to enter the interval
containing a twice and so would not be simple.
The image of a cannot lie on the interior of the image of R, for then the path SR would have to enter the
interval containing a twice, which would again be a contradiction. Similarly, the image of a cannot lie on
the interior of the image of T .
The image of a cannot lie on the interior of the image of P , for then the path SRP would enter the interval
containing a twice, again a contradiction. And similarly, the image of a cannot lie on the interior of the
image of Q.
Similar arguments show that the images of b and c cannot lie on the interior of the images of S, R, T , P , or
Q.
And finally, the images of all the paths are simple, by analogous arguments.
3.18 Theorem A flow graph ⟨G, s⟩ is reducible iff it does not contain a subgraph (∗).

Proof. If G contains (∗) as a subgraph, then by Lemma 3.17, so does I(G). Iterating, the limit Î(G) must
contain (∗) as a subgraph, which means it is not the trivial flow graph, so G is nonreducible.
Conversely, if G is nonreducible, then by Lemma 3.16, it is enough to show that I(G) contains (∗). So we
may assume that in fact, G is a limit flow graph. We will prove G contains (∗) by induction on the number
of nodes of G.
First, any flow graph with 1 or 2 nodes is reducible. For flow graphs of 3 nodes, a simple enumeration of
possibilities shows that the nonreducible ones contain (∗). So let us assume that it has been shown that for
3 ≤ j ≤ n − 1, any limit flow graph with j nodes contains (∗), and say G has n nodes. In particular, since
G is a limit flow graph, every node in G is an interval.
With any depth-first spanning tree D for G, with its associated reverse post-order numbering of nodes of G,
let x be the child of s whose reverse post-order number is smallest. (In fact, it must be 2.) There must be
an arc y → x with y 6= s, since otherwise Algorithm C would show that x was in the interval containing s.
Since y ≠ s and y ≠ x, we must have rpost[y] > 2 = rpost[x], so y → x is a back arc, and so x is an ancestor
of y in D. Thus, there is a simple path A in D from x to y. We consider two cases:

1. x dominates y: Then R(x) (see Theorem 3.4 on page 25) is a region including y but not including s.
Any interval in this region (considered as a sub flow graph) must be a pre-interval in G. Hence the
only intervals in R(x) are single nodes: R(x) considered as a sub flow graph is a limit flow graph with
fewer than n nodes, and hence by induction contains (∗) with x corresponding to s or a. Prepending
the arc s → x yields a copy of (∗) in G.
2. x does not dominate y: Then there is a simple path B from s to y not containing x. Let z be the
first point on B (starting from s) which is also on A. z could be y, but it cannot be x. In either case,
{s, z, x} forms an example of (∗) in G.

3.5 Back Arcs in Intervals


3.19 Lemma If ⟨H, h⟩ is a pre-interval, and if x and y are nodes in H and x is an ancestor of y with respect to
some depth-first walk of G, then there is a path P in H from x to y such that h ∉ P unless x = h.

Proof. By assumption, the depth-first spanning tree D of G which is created by the depth-first walk
contains a simple path from s to x to y. This path must enter H at h, so h precedes x in P , and so that
part of the path from x to y can contain h only if x = h.

3.20 Theorem If ⟨H, h⟩ is an interval, then an arc x → y between nodes of H is a back arc with respect to a
depth-first spanning tree iff y = h.

Remark Thus, the back arcs in an interval do not depend on which depth-first walk we use to construct a
spanning tree of that interval. We will see below that this property is true of a flow graph as a whole iff the
flow graph is reducible.

Proof. If the arc were a back arc but y 6= h, then by Lemma 3.19, there would be a path P from y to
x in H avoiding h. This path together with the back arc would constitute a cycle in H avoiding h, which
contradicts the fact that hH, hi is a pre-interval.
Conversely, if x → h is an arc, then since h dominates x, h is an ancestor of x, and so the arc is a back arc.

3.21 Corollary The target of any back arc is an interval header.

Proof. If the arc is contained in an interval, the result follows from the theorem. Otherwise, the target of
the arc must be an entry point of an interval, and so is the interval header.

It is not true that every interval header is the target of a back arc. Figure 3.5, for instance, gives a
counterexample.


Arcs are labeled T (tree arc), C (cross arc), and B (back arc). Intervals are indicated by dotted
boxes. There is no back arc to the header of the left bottom interval.

Figure 3.5: An Interval Which is not the Target of a Back Arc.



3.6 Induced Depth-First Walks

3.22 Lemma If D is a spanning tree for G, then I(D) is a spanning tree for I(G).

Proof. If ⟨H, h⟩, ⟨I, i⟩, and ⟨K, k⟩ are three distinct intervals in G and H → K and I → K are arcs in
I(G) which come from arcs in D, then there must be arcs x → k and y → k in D with x ∈ H and y ∈ I.
Since H and I are disjoint, k has two parents in D, which is impossible, since D is a tree. Hence I(D) is a
tree, and it spans I(G) because it enters every interval in G.

We see that if ⟨H, h⟩ and ⟨K, k⟩ are intervals in G (i.e. nodes of I(G)), and if H → K is an arc in I(G),
then H → K is an arc in I(D) iff there is a tree arc x → k with x ∈ H.
Now if we have a depth-first walk of G, there is an induced depth-first walk of I(G), generating the spanning
tree I(D), which is produced by walking the nodes of I(D) in the order determined by the pre-ordering of
their headers in G.
Figure 3.6 shows the derived sequence of the flow graph of Figure 2.1 together with the induced depth-first
walks on each derived graph.
With the pre and post orderings of I(G) determined by this walk, we have (letting ⟨H, h⟩ and ⟨K, k⟩ be any
two nodes in I(G), and referring to the processing in Algorithm B on page 23),

pre[H] < pre[K] ⇐⇒ H is first reached before K
                ⇐⇒ h is first reached before k
                ⇐⇒ pre[h] < pre[k]

rpost[H] < rpost[K] ⇐⇒ all children of K are finished being processed before all children of H are
                    ⇐⇒ all nodes reachable from k are finished being processed before all nodes reachable from h are
                    ⇐⇒ rpost[h] < rpost[k]

3.23 Lemma If ⟨H, h⟩ and ⟨K, k⟩ are disjoint intervals in G, the following are equivalent:

1. There is a back arc in G from H to K.

2. There is at least one arc in G from H to K, and all such arcs are back arcs.

3. There is a back arc H → K in I(G).

Proof. If there is an arc in G from H to K, then its target must be k, so it is of the form x → k, where
x ∈ H.
If x → k is not a back arc then rpost[x] < rpost[k]. Since in any case rpost[h] < rpost[x] (since h dominates
x), we have rpost[h] < rpost[k], and hence rpost[H] < rpost[K], so H → K is not a back arc.


Each node is labeled with an ordered pair of numbers (x, y), where x is the pre-order number of
the node and y is the reverse post-order number of the node. Each arc is labeled T (tree arc),
F (forward arc), C (cross arc), or B (back arc). Tree arcs are drawn as solid vectors; all other
arcs are drawn as dashed vectors. Intervals are indicated by dotted boxes. The flow graph is not
reducible.

Figure 3.6: The derived sequence for the flow graph of Figure 2.1.

Conversely, if x → k is a back arc, then k is an ancestor of x, and so there is a path in D from k to
x which must therefore pass through h. That is, k is an ancestor of h. Hence rpost[h] > rpost[k], so
rpost[H] > rpost[K], and so H → K must be a back arc in I(G).

3.24 Corollary Back arcs in G fall into two disjoint classes:

1. Arcs within intervals in G whose target is the head of the interval.

2. Arcs whose image in I(G) is a back arc in I(G).

3.7 Characterizations of Reducibility


In this section we will give some further characterizations of reducible flow graphs. They are all due to Hecht
and Ullman ([6] and [8]; also contained in [5]).

3.25 Theorem If ⟨G, s⟩ is a flow graph, the following are equivalent:

1. ⟨G, s⟩ is reducible.

2. The back arcs of ⟨G, s⟩ are unique; that is, all depth-first walks yield the same set of back arcs.

3. An arc x → y is a back arc ⇐⇒ y≫x.

Proof. First, if y≫x and x → y is an arc, then since y is an ancestor of x in any tree, the arc must be a
back arc in any case. Thus condition 3 really is just

3’ : x → y is a back arc =⇒ y≫x.

If hG, si is any flow graph, then its back arcs (under any depth-first walk) fall into two classes, by the previous
corollary:
1. All arcs within intervals in G whose target is the head of the interval. These arcs have no image in I(G).
2. Arcs whose image in I(G) is a back arc in I(G).
The arcs in class 1 are uniquely determined by the interval structure of hG, si—they will be the same for
any depth-first walk of G. In addition, if x → y is a class 1 arc, then we must have y≫x.
Now we iterate this process. Precisely, let

G = G0 , G1 , . . . , Gn = Î(G)

be the derived sequence of G. At step j of the iteration (j ≥ 1), we will consider the family of back arcs in
Gj which are images of class 2 arcs x → y in Gj−1 . Let Fj denote the set of arcs in this family which are
class 1 arcs in Gj . Fj corresponds to a family of back arcs Bj in G, and this correspondence is determined
by the interval structure of G, not by any particular walk of G. The arcs in Fj , and hence also the arcs in
Bj are uniquely determined by the interval structure of Gj . Similarly, for each arc xj → yj in Fj we have
yj ≫xj . Hence by Theorem 3.14, for the corresponding arc x → y in Bj (x and y being in G), we have y≫x.

If G is reducible, then Î(G) is the trivial flow graph, which has no back arcs. Hence every back arc in G will
be in one of the sets Bj . Thus 1 =⇒ 2 and 3.
Conversely, if hG, si is not reducible, then it contains a (∗) sub flow graph. If we begin a depth-first spanning
tree by following the paths SRP Q, then all the arcs which make up path P will be tree arcs, and the last
arc in Q will be a back arc. On the other hand, if we begin the tree by following the paths ST QP , then all
the arcs which make up path Q will be tree arcs and the last arc in P will be a back arc. Thus, the back
arcs will not be unique. Similarly, it is clear that property 3 does not hold for either of the back arcs in P
or Q. So properties 2 and 3 each imply property 1.

If hG, si is not reducible, then not only is the set of back arcs not unique, the number of back arcs may not
be unique. For instance, Figure 3.7 shows an irreducible flow graph together with two walks which yield
different numbers of back arcs.


This irreducible flow graph has either 1 or 2 back arcs, depending on which way it is walked. As
before, the pairs of numbers at each node are the pre-order and reverse post-order numberings
of that node.

Figure 3.7: Two Walks of an Irreducible Flow Graph

It is also possible for an irreducible flow graph (i.e. a non-trivial limit flow graph) to have a back arc x → y
where y dominates x; it just can’t have that property for all its back arcs. See Figure 3.8 for an example.
We say that a DAG of a flow graph ⟨G, s⟩ is a maximal acyclic sub flow graph.
3.26 Lemma Any DAG ⟨H, E, s⟩ of a flow graph ⟨G, s⟩ includes all the nodes of G.

Proof. If not, there must be an arc x → y with x ∈ H and y ∉ H. But then adjoining y to H and x → y
to E still keeps ⟨H, E, s⟩ acyclic.

3.27 Theorem A flow graph ⟨G, s⟩ is reducible iff it has a unique DAG.


An irreducible flow graph. q must be a back arc, and the target of q dominates the source of
q. However (depending on which depth-first walk is used), either r or s is also a back arc, and
neither of the nodes of those arcs dominates the other.

Figure 3.8: Back arcs in an Irreducible Flow Graph

Proof. If hH, E, si is a DAG of G, the arcs in E are tree arcs, forward arcs, or cross arcs with respect to
any depth-first walk of hH, E, si. Any arc x → y of G which is not in E must be a back arc with respect to
such a walk, because otherwise it could be added to E and leave hH, E, si acyclic. Thus, E consists of all
the arcs in G except the back arcs. Hence hH, E, si is unique iff the set E is unique iff the set of back arcs
of G is unique iff G is reducible.

3.28 Theorem A flow graph ⟨G, s⟩ is reducible iff its arcs can be partitioned into two sets A1 and A2 such that
⟨G, A1 , s⟩ is a DAG of G and for each arc x → y in A2 , y dominates x in G.

Proof. If G is reducible, just let A2 be the (unique) set of back arcs of G.
Conversely, if G is not reducible, it has at least two DAGs, D1 and D2 , say. If x → y is an edge which is
in D1 but not in D2 (there must be at least one such edge), then y could not dominate x, for if it did, D1
would not be acyclic.

3.29 Theorem A flow graph ⟨G, s⟩ is reducible iff its arcs can be partitioned into two sets A1 and A2 such that
⟨G, A1 , s⟩ is a DAG D of G and for each arc x → y in A2 , y dominates x in D.

Remark The only difference between this theorem and the last is that we insist that y dominate x in D,
rather than in G. Since there are fewer edges in D than in G, this condition is stronger in one direction, and
weaker in another.

Proof. If G is reducible, we can get D as in the previous theorem. If x → y is in A2 , then y dominates x
in G, and hence in D.
To prove the converse, we need to show that if we have a partition of the edges of G into A1 and A2 where
the A1 edges form a DAG D, and if y dominates x in D for all the arcs x → y in A2 , then y dominates x
in G for all those arcs. We do this now:
Let the arcs in A2 be adjoined successively in some order to D to form the subgraphs D = H0 , H1 , . . . , Hn = G
of G. (Of course all these subgraphs contain the same vertices. It is only the edges that are different.)
Let j be the first number for which there is an arc x → y in A2 for which y does not dominate x in Hj (we
know j > 0 if it exists at all), and let Hj be formed from Hj−1 by adjoining the arc u → v, say. Then there
is a simple path P from s to x in Hj which avoids y, and this path must include the arc u → v exactly once.
Hence the initial part P1 of this path, from s to u is composed only of arcs in Hj−1 . Since P is simple, P1
cannot include v. But this means that v does not dominate u in Hj−1 , a contradiction.

3.8 The Loop-Connectedness Number


If G is reducible, the length dsl(G) of the derived sequence of G (the derived sequence length) can be thought
of as the maximum loop nesting depth in G. A famous empirical study performed by Knuth[14] showed that
dsl(G) was almost always bounded by 3. For the analysis of a data-structure algorithm below (see page 81),
we will find it useful to estimate a related quantity:
For any flow graph G, we define lc(G) to be the largest number of back arcs (with respect to a fixed depth-
first walk of G) that are contained in any cycle-free path of G. Since there are only finitely many cycle-free
paths in G, lc(G) is well-defined and finite. lc(G) is called the loop-connectedness number of G. Of course if
G is reducible, lc(G) is a property of G alone, and is independent of the particular depth-first walk.
As usual, I(G) will denote the derived graph of G.
3.30 Lemma lc(I(G)) ≥ lc(G) − 1.
Proof. Let P be a cycle-free path in G with m = lc(G) back arcs. Let the back arcs in P be (in order)

x1 → r1 , x2 → r2 , . . . , xm → rm
where ri occurs before xi+1 in P . Since P is simple, the {rj } are all distinct. By Corollary 3.21 (page 33),
each ri is therefore the header of a distinct interval.
Let P ′ be the sub-path of P from r1 to rm . The image of P ′ in I(G) must be simple, because otherwise P ′
(which starts at an interval header) would have to enter the same interval twice, and so would not be simple.
Since P is simple, and since for all i the sub-path from ri to ri+1 starts in I(ri ) and ends in I(ri+1 ), it must
enter I(ri+1 ) at ri+1 . Thus for 1 ≤ i < m, xi+1 is not in I(ri+1 ), and so the image of xi+1 → ri+1 in I(G)
is a back arc. Thus the image of P ′ in I(G) contains at least m − 1 back arcs.
3.31 Theorem If G is a reducible flow graph with loop-connectedness number lc(G) and derived sequence length
dsl(G), then

lc(G) ≤ dsl(G).
Proof. Apply the lemma repeatedly. Since G is reducible, after dsl(G) iterations, we will arrive at a trivial
flow graph for which both dsl and lc are 0. But we subtracted 1 at most dsl(G) times from lc(G) to get to
0.

In fact, lc(G) can be much less than dsl(G). For instance, Figure 3.9 shows two flow graphs. The graph on
the left corresponds to three nested repeat loops (in the sense of Pascal; the loop always iterates at least
once, and the test is at the bottom of the loop); the graph on the right corresponds to three nested while
loops (equivalently, Fortran do loops). In both cases, dsl(G) = 3. But for the repeat loop nest, lc(G) = 1,
while for the while loop nest, lc(G) = 3. Thus, while loops keep adding to lc(G), but nests of repeat loops
really have no effect on it.

The flow graph on the left represents three nested repeat loops (with the test at the bottom of
the loop); it has dsl(G) = 3 and lc(G) = 1. The flow graph on the right represents three nested
while loops (equivalently, Fortran do loops); it also has dsl(G) = 3 but has lc(G) = 3.

Figure 3.9: Nested Loops

3.9 Interval Nesting

Of course, there is no such thing as interval nesting. We know that intervals form a disjoint partition of G.
However, each interval in the derived graph I(G) is composed of a number of intervals in G, and so on; so
in this sense there is a nesting of lower order intervals in higher order ones. This use of the term “nesting”
reflects what we mean when we talk about loop nesting. See, for instance the flow graph on the left of
Figure 3.9.
Let

G = G0 , G1 , . . . , Gn = Î(G)

be the derived sequence of G. Let us refer to an interval in Gj as an interval of order j + 1. (It is an element
of Gj+1 .) Each such interval can be considered as being composed of a certain number of elements of G,
and in a similar way, the header of such an interval can be associated with an element of G (which is the
header of an ordinary, or order-1, interval of G). Naming intervals by their headers and their order, we can
indicate the nesting of lower order intervals in higher order intervals by means of a tree.
For instance, Figure 3.10 shows the derived sequence of a reducible flow graph. Figure 3.11 shows the
corresponding nesting tree for intervals of all orders.
Now let us identify intervals of different orders which share the same header. We will do this by removing


Figure 3.10: Reduction of a Flow Graph



[Tree diagram; reading from the root downward, the intervals by level are: S5; B4, S4; C3, B3, S3; O2, C2, B2, S2; O1, P1, C1, F1, L1, J1, B1, S1.]

Each interval is named by the node in G which corresponds to its header, followed by the number
which is the order of the interval.

Figure 3.11: Tree Showing the Nesting of Intervals of all Orders for the Reducible Flow Graph of Figure 3.10

from the tree in Figure 3.11 all but the highest order interval for each header. The tree that remains is called
the interval nesting tree, and is illustrated on the right of Figure 3.12. We can think of each interval in this
tree as consuming all its children.
So far as the interval nesting tree is concerned, we identify each header node in G with a corresponding
interval (which may be a higher order interval—for instance the interval corresponding to s is the higher
order interval which corresponds to the entire flow graph). Note that

1. If x is an ancestor of y in the interval nesting tree, then x must be an ancestor of y in the dominator
tree (that is, x must dominate y).

2. The converse is not true. For instance, F dominates L, but F is not an ancestor of L in the interval
nesting tree.

Let us define a map hdr on G such that hdr(x) is the header of the interval containing x. Then in contrast
to the above observation, we have:

3.32 Theorem If G is reducible and x → y is an arc in G with hdr(x) ≠ hdr(y) (so in particular, y must be an
interval header), then the following are equivalent:

1. y≫x.

2. y is an ancestor of hdr(x) in the interval tree.




Figure 3.12: The Dominator Tree and the Interval Nesting Tree for the Flow Graph of Figure 3.10

Proof. First of all, if z → w is any arc in G, let zi → wi denote its image in Gi . Of course, zi might equal
wi , and since G is reducible, we know that in any case zn = wn .
1 =⇒ 2: We know that x → y is a back arc. As in the proof of Theorem 3.25 (page 36), the back arcs of G
are partitioned into disjoint sets {Bj }, where each arc z → w in Bj has an image zj → wj in Gj such that
wj is an interval header in Gj and zj is in that interval. (That is, zj → wj is a back arc in an interval of
Gj .) Say x → y falls in the set Bk . Then y is the header of an interval Iy of order k + 1.
x is a member of the set xk , which is an element of Iy . Hence hdr(x) must also be a member of xk , and
so is a descendant of xk in the interval nesting tree. But we know that xk is a member of Iy , and so is a
descendant of y in the interval nesting tree.
2 =⇒ 1: Since y is an ancestor of hdr(x) in the interval nesting tree, y dominates hdr(x), and hence also
dominates x. (This is just the affirmative part of the observation made earlier.)

The arcs described by this theorem can be characterized as the arcs which are exits from intervals to larger
containing intervals (“interval” in this sense again meaning interval of any order).
Figure 3.13 shows the actual interval nesting in the flow graph corresponding to the interval nesting tree of
Figure 3.12 for the flow graph of Figure 3.10.


Figure 3.13: Interval Nesting for the Flow Graph of Figure 3.10
Chapter 4

Testing for Reducibility

4.1 Finite Church-Rosser Transformations


If S is a set and =⇒ is a relation on that set (written a =⇒ b), then we say that a sequence

a0 =⇒ a1 =⇒ . . . =⇒ ak

is a chain from a0 to ak of length k. (We think of these chains as chains of implications or of derivations.)

We write a =⇒* b to indicate that there is a chain from a to b.
We say that the relation =⇒ is finite iff for each a ∈ S there is a bound on the length of chains a =⇒
b =⇒ . . . beginning with a. That is, =⇒ is finite iff for each a ∈ S, there is an n ∈ N such that if

a =⇒ a1 =⇒ . . . =⇒ ak

with all ai distinct, then k ≤ n. We will call the maximum length of such chains the height of a, and denote
it by h(a). It is clear that if a =⇒ b then h(a) ≥ h(b) + 1. If h(x) = 0, we will refer to x as a limit point,
because there is no other point reachable from x via the relation =⇒ .

The completion =⇒c of the relation =⇒ is the relation defined by right-maximal chains; that is, a =⇒c b iff
there is a chain from a to b and there is no c ≠ b with b =⇒ c. Equivalently, a =⇒c b iff b is a limit point
and there is a chain from a to b.
We say that the relation =⇒ is a finite Church-Rosser transformation iff it is finite and =⇒c is a function.
That is, =⇒ is FCR iff it is finite and a =⇒c b and a =⇒c c implies b = c.

4.1 Theorem (Aho, Sethi, and Ullman[2]) A relation =⇒ on a set S is a finite Church-Rosser transfor-
mation iff

1. =⇒ is finite; and
2. for all a ∈ S, if a =⇒ b and a =⇒ c, then there is an x ∈ S such that b =⇒* x and c =⇒* x.


Proof. A finite Church-Rosser transformation certainly satisfies conditions 1 and 2. So we only have to
prove that a relation satisfying 1 and 2 is a finite Church-Rosser transformation.
We will prove this by induction on the height of a. That is, given 1 and 2, we will prove the following
statement:

(P ) If h(a) ≤ k, and if a =⇒c b and a =⇒c c, then b = c.

for all k.
(P) is trivially true for k = 1. Suppose it has been proved true for all k < n. Let a be any element of S with
height n, and say a =⇒c b and a =⇒c c. Thus we have chains

a =⇒ b1 =⇒ b2 =⇒ . . . =⇒ b

a =⇒ c1 =⇒ c2 =⇒ . . . =⇒ c

each of length ≤ n, where h(b) = h(c) = 0. By property 2 (since a =⇒ b1 and a =⇒ c1 ), we then have
chains

b1 =⇒ β2 =⇒ β3 =⇒ . . . =⇒ x

c1 =⇒ γ2 =⇒ γ3 =⇒ . . . =⇒ x

We may assume without loss of generality that x is a limit point: otherwise, we could simply continue both
chains in the identical manner until a limit point was attained. But now we have

b1 =⇒c b and b1 =⇒c x

and h(b1 ) < h(a) = n, so by the inductive assumption, b = x. Similarly, c = x, and so b = c, and we are
done.

4.2 The Transformations T 1 and T 2


We define two transformations T 1 and T 2 on flow graphs as follows:

1. T 1 is the removal of a self-loop. That is, if x → x is an arc in ⟨G, s⟩, then T 1⟨G, s⟩ is the flow graph
whose nodes are the nodes of G and whose arcs are all the arcs of ⟨G, s⟩ except for the arc x → x.

2. If x ≠ y and x is the only predecessor of y in ⟨G, s⟩, and if y ≠ s, then the transformation T 2 creates
a new flow graph G′ by “collapsing” y into x. Precisely, the nodes of G′ are all the nodes of G except
for y, and the arcs of G′ are
• all the arcs of G not involving y;
• an arc x → b for each arc y → b in G. (In particular, if there is an arc y → x in G, then G′
contains a self-loop x → x.)

When T 2 is applied in this fashion, we say that x consumes y.
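For concreteness, T 1 and T 2 can be sketched on the same dictionary-of-successors representation used in the earlier sketches; T2 below takes the consuming node x and the consumed node y, and the identifiers are again illustrative.

    def T1(succ, x):
        """T1 (sketch): remove a self-loop x -> x, if there is one."""
        succ[x].discard(x)

    def T2(succ, s, x, y):
        """T2 (sketch): x consumes y.  Assumes y != s, x != y, and x is y's only predecessor."""
        assert y != s and x != y
        succ[x].discard(y)          # the arc x -> y disappears
        succ[x] |= succ.pop(y)      # each arc y -> b becomes x -> b; a y -> x arc becomes a self-loop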

In general, we will say that the flow graph G′ is produced by collapsing the flow graph G if G′ is produced
from G by a series of transformations of the form T 1 and/or T 2.
Of course the transformations T 1 and T 2 are really not uniquely defined: if there are several self-loops in
hG, si, T 1 could be applied to any of them; and similarly for T 2. They actually define a relation on the set
of flow graphs. We can imagine applying these transformations repeatedly on a flow graph hG, si until we
can go no further. We will show that the flow graph that we get by doing this is always the limit flow graph
of hG, si, no matter in what order the transformations were applied. In particular, a flow graph is reducible
iff it can be transformed to the trivial flow graph by successive applications of T 1 and T 2, in any order.
The transformations T 1 and T 2 are due to Hecht and Ullman[6].

4.2 Theorem The relation =⇒ defined on the set of flow graphs by the transformations T 1 and T 2 is a finite
Church-Rosser transformation.

Proof. First, the relation is finite, since T 1 and T 2 each reduce the number of arcs by at least 1. So the
length of a chain starting at ⟨G, s⟩ can be at most the number of arcs in ⟨G, s⟩.
Now suppose we have G =⇒ L via T i and G =⇒ M via T j. We want to show that the resulting square can
be completed: there must be a flow graph H with

    L =⇒* H and M =⇒* H.

Case I: i = j = 1. If T 1 was applied to node x to yield L and to node y to yield M , then

• if x = y, then L = M , and we are done.


• otherwise, T 1 can be applied to y in L and to x in M , yielding the same graph H in each case.

Case II: i = j = 2. Say x consumes y to yield L and z consumes w to yield M :

    x        z
    ↓        ↓
    y        w

We know that in any case x ≠ y and z ≠ w.

• If y = w, then x must equal z, since y has a unique predecessor in ⟨G, s⟩. Hence L = M already.
In all the remaining cases, y ≠ w.
• If all four nodes are distinct, then x can consume y in M and z can consume w in L to yield the
same graph H.
• If x = z, then x can consume w in L and x can consume y in M to yield the same graph H.
• If x = w but z ≠ y, then z can consume x in L and z can consume y in M to yield the same
graph H.

All other possible cases are symmetric versions of these cases.



Case III: i 6= j. Say x consumes y to yield L (via T 2) and T 1 is applied to z to yield M . y cannot equal
z. Hence x can still consume y in M , and T 1 can be applied to z (which might be x) in L, in either
case yielding the same graph H.

Therefore, we see that it does not matter in which order (or to which nodes or arcs) we apply the transfor-
mations T 1 and T 2—if we start with a flow graph hG, si and keep on going until we can go no farther, we
will arrive at a unique graph depending only on G.

4.3 Lemma Any pre-interval can be collapsed to its header by a series of T 2 operations, followed by at most
one T 1 operation. Any arcs from the original pre-interval to other nodes become arcs from the header to
those same nodes.

Proof. Let ⟨H, h⟩ be any pre-interval in G. There must be at least one node x ≠ h in H such that x has
only one predecessor (y, say), for if each node in H other than h had two or more predecessors, then one of
them would not be h, and by following arcs backward through H, always choosing the predecessor which was
not h, we would eventually create a loop in H not containing h, a contradiction.
So if x is such a node, then T 2 can be applied to collapse x into y, turning H into H′ , say. Any arcs from x
leading outside H now become arcs from y ∈ H′ to the same target. After this application of T 2, ⟨H′ , h⟩ is
still a pre-interval, for it still has the unique entry node h, and if C is a cycle in H′ not including h, then C
must include y (otherwise C would have been a cycle in H). Say C looks like this:

c0 → c1 → . . . → ck−1 → y → ck+1 → . . . → cn = c0 .

There could not be an arc y → ck+1 in H, for then again C would be a cycle in H. Hence the arc y → ck+1
must come from an arc x → ck+1 in H. But then replacing y in C by y → x yields a cycle in H not including
h, a contradiction.
If we apply this process repeatedly, we will wind up with H collapsed to its header, with possibly a self-loop
at the header. This self-loop can be eliminated by applying T 1. Thus H is replaced by its header, and every
arc from an element of H to a target outside H is replaced by an arc from its header to the same target.

While the above proof was a pure existence proof (in fact, a proof by contradiction), we will see below
(Theorem 4.8 on page 50) that there actually is an algorithm for generating the sequence of T 2 reductions
of any pre-interval.

4.4 Lemma G =⇒* I(G).

Proof. Applying Lemma 4.3 in turn to each interval in G yields I(G).

4.5 Theorem G =⇒c Î(G).

Remark That is, the result of applying T 1 and T 2 repeatedly to G until neither can be applied any more
is the limit graph of G; and in particular, G is reducible iff it can be reduced to the trivial flow graph by
successive applications of T 1 and T 2.
Proof. Applying Lemma 4.4 repeatedly, we get G =⇒* Î(G). So all we have to prove is that Î(G) is a limit
point in the set of flow graphs under the transformations T 1 and T 2.

First, derived flow graphs by construction do not contain self-loops. Hence T 1 cannot be applied to Î(G).
Next, since Î(G) is a limit flow graph, all its pre-intervals are trivial (i.e. consist of just one node). If
there were an element x of Î(G) which had a unique predecessor y, then ⟨{y, x}, y⟩ would be a non-trivial
pre-interval in Î(G), a contradiction. Hence T 2 cannot be applied to Î(G).

4.3 Induced Walks from T 1 and T 2


Now let us say that G is transformed into G′ by an application of T 1 or T 2. If D is a depth-first spanning
tree of G (corresponding to a depth-first walk of G), then D has an image D′ in G′ .

Case I. T 1 was applied to get G′ . Since all self-loops are back arcs, D′ is the same as D, and the pre- and
post-orderings of the nodes of G′ are the same as those of G.

Case II. T 2 was applied to get G′ . Say T 2 was applied to the arc a → b in G. The arc a → b was the only
arc to b. Hence it must be a tree arc. So D′ is D without this arc (and with a and b identified). D′
is still a tree because every element except s in G′ has exactly one arc in D′ entering it. There is an
induced walk on D′ which can be described as follows:

• Each node which was visited before b is visited in the same order. This includes all the nodes
visited up to a, and those children of a which were visited before b.
• Then the descendants of b are visited in the same order.
• Then the subtrees rooted at the remaining children of a (after b) are visited in the same order.
• Finally, the rest of the tree is visited in the same order.

That is, the tree is walked until b would have been reached; it is skipped over, and the remaining
nodes are visited in the same order. Thus, if nodes in G are denoted by {x, y, . . .} and nodes in G′ are
denoted by {x′ , y ′ , . . .}, then

pre[x] < pre[b] =⇒ pre[x′ ] = pre[x]


pre[x] > pre[b] =⇒ pre[x′ ] = pre[x] − 1

If G has n nodes, then G′ has n − 1 nodes. If the post-order number of a node is denoted by post[x],
then we have

rpost[x] + post[x] = n; rpost[x′ ] + post[x′ ] = n − 1.

Now

rpost[x] < rpost[b] ⇐⇒ b is finished being processed before x


⇐⇒ post[b] is assigned before post[x]
⇐⇒ post[x′ ] = post[x] − 1
⇐⇒ rpost[x′ ] = rpost[x]

and similarly,

rpost[x] > rpost[b] ⇐⇒ x is finished being processed before b


⇐⇒ post[x] is assigned before post[b]
⇐⇒ post[x′ ] = post[x]
⇐⇒ rpost[x′ ] = rpost[x] − 1

4.6 Theorem If x and y are nodes in G whose images in G′ are x′ and y ′ , then

1. pre[x] < pre[y] =⇒ pre[x′ ] ≤ pre[y ′ ].

2. rpost[x] < rpost[y] =⇒ rpost[x′ ] ≤ rpost[y ′ ].

If x′ ≠ y ′ then

1. pre[x] < pre[y] ⇐⇒ pre[x′ ] < pre[y ′ ].

2. rpost[x] < rpost[y] ⇐⇒ rpost[x′ ] < rpost[y ′ ].

Proof. Follows immediately from the calculations above.

4.7 Theorem If x → y is an arc in G which corresponds to an arc x′ → y ′ in G′ , then either they are both back
arcs or neither is.

Proof. 1. If T 1 was applied to yield G′ , the pre- and post-orderings of all the nodes are unchanged, and
tree arcs are preserved, so the character of all the arcs (except of course the deleted one) is unchanged. So
let us assume that T 2 was applied to yield G′ .
2. If x′ and y ′ are identified in G′ , then a back arc x → y must become a self-loop x′ → x′ in G′ , which is
automatically a back arc. Similarly, if x′ and y ′ are identified in G′ , then a self-loop x′ → x′ in G′ could
only have come from a back arc in G.
3. If x′ and y ′ are distinct in G′ , then

x → y is a back arc in G ⇐⇒ pre[x] > pre[y] and rpost[x] > rpost[y]


⇐⇒ pre[x′ ] > pre[y ′ ] and rpost[x′ ] > rpost[y ′ ]
⇐⇒ x′ → y ′ is a back arc in G′

We also note that x′ is a descendant of y ′ in G′ (with respect to D′ ) iff x is a descendant of y in G (with
respect to D).
Now as promised, we can show that a pre-interval can be collapsed by T 2 transformations in an easily
calculated order: the order in fact is just reverse post-order:

4.8 Theorem If the flow graph ⟨G, s⟩ is numbered by a depth-first walk, then any pre-interval ⟨H, h⟩ in G can
be collapsed to its header by applying T 2 transformations successively to each node in H − {h} in reverse
post-order, followed possibly by one T 1 operation on h. Any arcs from the original pre-interval ⟨H, h⟩ to
other nodes become arcs from the header h to those same nodes.

Proof. First, since h dominates every element of H, rpost[h] < rpost[x] for all x ∈ H.
Let x be the element of H − {h} such that rpost[x] is smallest. If y → x is an arc, then rpost[y] < rpost[x],
since otherwise y → x would be a back arc, and the only back arcs in H have h as their target. But then
since rpost[x] is minimal, we must have y = h. Thus, the only arc whose target is x is h → x. Hence T 2 can
be applied to collapse x into h.
By Theorem 4.6, the reverse post-ordering of the nodes of H is not changed by this application of T 2. The
same argument can then be iterated to visit all the nodes of H − {h} in reverse post-order, collapsing each
in turn into h.
As in Lemma 4.3, at most one application of T 1 to h will then complete the collapse of ⟨H, h⟩ to the single
node h.
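Theorem 4.8 gives an immediate recipe for collapsing a pre-interval; here is a small sketch using the T1 and T2 sketches above, where rpost is assumed to map each node to its reverse post-order number under the chosen depth-first walk:

    def collapse_pre_interval(succ, s, H, h, rpost):
        """Collapse the pre-interval <H, h> into its header h (sketch of Theorem 4.8)."""
        for y in sorted(H - {h}, key=lambda v: rpost[v]):    # visit H - {h} in reverse post-order
            # by Theorem 4.8, h is the only remaining predecessor of y at this point
            T2(succ, s, h, y)
        T1(succ, h)        # remove the self-loop left by any back arc of the pre-interval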

4.4 Kernels of Intervals


In general, an interval consists of zero or more loops (each including the interval header), and additional
non-cyclic structures hanging off the loops (or off the header if there are no loops). The loops are the essential
part of the flow graph: without loops, computations of data flow are very easy. So let us refer to the loop
part of an interval as the “kernel” of an interval. First we show that the kernel exists unambiguously. Clearly
we want the loop part of an interval to be strongly connected and to include the interval header. We use
this idea to characterize the kernel.

4.9 Lemma The family H of strongly connected subsets of a pre-interval ⟨H, h⟩ which include h has a maximal
element which includes all the members of H.

Proof. Since {h} ∈ H, H ≠ ∅. Clearly the union of any two members of H is a member of H. Since H is
finite, the union of all the members of H is maximal and includes every member of H.

Now we define the kernel of a pre-interval ⟨H, h⟩ to be the maximal strongly connected subset of H which
includes h. We will use the notation k(H) or k⟨H, h⟩ to refer to the kernel of the pre-interval ⟨H, h⟩.

4.10 Lemma The kernel of a pre-interval ⟨H, h⟩ consists of all elements x of H for which there is a path in H
from x to h.

Proof. Call this set C. Since k(H) is strongly connected, k(H) ⊂ C. Conversely, if x ∈ C then since hH, hi
is a pre-interval, there is a path P in H from h to x. Every element y in this path is in C, since the path
can be followed from y to x, and then there is a path from x to h. Hence every element of C can be reached
from h by a path in C, and so C is strongly connected. By maximality, then, C ⊂ k(H).
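Lemma 4.10 translates into a backward search from h restricted to H; a minimal sketch, using the same graph representation as in the earlier sketches:

    def kernel(succ, H, h):
        """k(H) (sketch): the nodes of the pre-interval <H, h> from which h can be reached inside H."""
        preds_in_H = {v: set() for v in H}
        for u in H:
            for v in succ[u]:
                if v in H:
                    preds_in_H[v].add(u)
        reached = {h}
        stack = [h]
        while stack:                          # backward reachability from h, staying inside H
            v = stack.pop()
            for u in preds_in_H[v]:
                if u not in reached:
                    reached.add(u)
                    stack.append(u)
        return reached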

In particular, by Theorem 3.20, k(H) includes all the back arcs in H.

4.11 Lemma The kernel of a pre-interval with header h is a pre-interval with header h.

Proof. We know that h ∈ k(H). If a ∈ k(H), a ≠ h, and a is the target of an arc b → a, then b must be
in H. Since there is then a path in H from b to h, b must be in k(H). Thus, the only entry to k(H) is at h.
There can be no cycles in k(H) which do not include h since k(H) is a subset of H.

Thus the kernel of each interval can be collapsed by Lemma 4.3. After its kernel has been collapsed, each
interval has no remaining back arcs (by Theorem 4.7) and so is a DAG. There still may be back arcs in
the graph, however: these correspond to back arcs between intervals in the original graph. Let us denote
the graph generated by collapsing the kernel of each interval of G by J(G). Just as we iterated the map
G → I(G) to get a limit graph Î(G) when defining reducibility, we can iterate the map G → J(G) to get a
limit graph Ĵ(G).
Figure 4.1 shows the construction of Ĵ(G) for the flow graph of Figure 3.3.

Figure 4.1: Successive Kernel Reductions for the Graph of Figure 3.3

More generally, let T 3 be a transformation on a flow graph which consists of collapsing the kernel of 1 or
more intervals of that graph. In particular, J(G) is produced by the action of one version of T 3 on G, namely
the version of T 3 which collapses the kernels of all the intervals of G.
It seems likely to me that the family of T 3 transformations is a finite Church-Rosser system, and so any
T 3-limit graph of G is equal to Ĵ(G). For our purposes, however, we don't need this strong result. A simple
characterization of reducibility in terms of T 3-limit graphs is all we need:

4.12 Theorem G is reducible iff Ĵ(G) is a DAG.

Proof. First of all, by Theorem 4.5, G is reducible iff Ĵ(G) is reducible. So we have to show that Ĵ(G) is
reducible iff it is a DAG. Since any DAG is reducible (the whole graph is an interval), we only have to show
that if Ĵ(G) is reducible, it must be a DAG.
If Ĵ(G) is not a DAG, it contains a back arc (with respect to some depth-first walk). Let x → y be the back
arc such that pre[y] is maximal.1 Let H be the union of the nodes on all simple paths from y to x. Since
Ĵ(G) is assumed to be reducible, y must dominate x, and so y must dominate every node in H. Further,
there cannot be any cycles in H not including y, for if so, there would be a back arc a → b with a and b in
H and b ≠ y. But since y dominates b, pre[y] < pre[b], which contradicts the maximality of pre[y] among all
targets of back arcs. Hence H is a pre-interval with header y, and x ∈ k(H). But this means that Ĵ(G) is
not invariant under J, a contradiction.

4.13 Corollary G is reducible iff any sequence of applications of T 3 can be extended with 0 or more applications
of T 3 to yield a DAG.

Proof. Since T 3 consists of applications of T 1 and T 2, G is reducible iff it remains reducible after any
sequence of applications of T 3. Now apply the Theorem.

4.14 Corollary If ⟨G, s⟩ is a flow graph, the following are equivalent:

1. G is reducible.
2. Some T 3-limit graph formed starting with G is a DAG.
3. Every T 3-limit graph formed starting with G is a DAG.

In particular, we can test a flow graph for reducibility by collapsing the kernel of just one interval, then
collapsing the kernel of just one interval in the resulting graph, and so on. The graph will be reducible iff
we eventually come to a DAG.

4.5 Reachunder Sets and Tarjan’s Algorithm


Tarjan gave an algorithm to test for flow graph reducibility based on these ideas. His algorithm uses what we
will call a “reachunder” set, which is more or less equivalent to what we have called the interval kernel—it
actually could include the kernels of several intervals.
Definition Let ⟨G, s⟩ be a flow graph, and let there be a depth-first spanning tree chosen for ⟨G, s⟩. If x is
the target of a back arc in G, the reachunder set of x, ρ(x), is the set of those elements y such that there is
a simple path from y to x, the last arc of which is a back arc.2

In particular, x ∉ ρ(x). Of course, if there is no back arc whose target is x, then ρ(x) = ∅.
Reachunder sets can be used to characterize reducibility:
4.15 Theorem If ⟨G, s⟩ is a flow graph with a spanning tree D rooted at s, the following are equivalent:

1. G is reducible.
2. For all x ∈ G, x dominates every element in ρ(x).
1 That is, maximal with respect to all back arcs. This device is due to Tarjan.
2 The first use of the term “reachunder” appears to be in the paper by Schwartz and Sharir[16]. Tarjan doesn’t give a name
to the reachunder sets he uses; they occur in his paper (Tarjan[20]) as P (x). The idea of the set goes back at least to Hecht[5]
in his proof of our Theorem 3.18. It actually isn’t necessary for that proof, as we have seen.

3. For all x ∈ G, x is an ancestor (with respect to D) of every element in ρ(x).

Proof. 1 =⇒ 2: Say y ∈ ρ(x). If x does not dominate y, then there is a path in G from s to y avoiding x.
If we adjoin to the end of this path the path in ρ(x) from y to x, we get a path from s to x which does not
contain x in its interior. Therefore, since the last arc in that path is a back arc (z → x, say), we see that x
does not dominate z and hence G is not reducible.
2 =⇒ 3: If x dominates an element y, then x must be an ancestor of y with respect to each spanning tree
of G which is rooted at s.
3 =⇒ 1: If G is not reducible, then with respect to any given depth-first spanning tree D of G, there is a
back arc z → x for which x does not dominate z. We know that x ≠ s, since s dominates everything. Hence
there is a path in G from s to z which avoids x, and hence s ∈ ρ(x). But certainly x is not an ancestor of
s.

4.16 Lemma If x is the node in G with the greatest pre-order number which is a target of a back arc3 , then no
descendant of x is the target of a back arc.

Proof. If b is any descendant of x, we must have pre[x] < pre[b]. Since x is the node with highest pre-order
number which is the target of a back arc, b cannot be the target of a back arc.

4.17 Lemma If G is reducible and if x is the node in G with the greatest pre-order number which is the target of
a back arc, then ρ(x) ∪ {x} = k(I(x)).

Proof. Say y → x is a back arc. ρ(x) ∪ {x} must have x as its only entry point because x dominates the
source of every back arc (since G is reducible). Further, there can be no cycles in ρ(x) because otherwise
there would be a back arc in ρ(x), which is impossible by the last lemma. Hence ρ(x) ∪ {x} is a pre-interval
with header x, and is therefore contained in I(x). Since by construction it consists of all elements z of I(x)
for which there is a path in I(x) from z to x, Lemma 4.10 shows that ρ(x) ∪ {x} = k(I(x)).

We can now give Tarjan’s algorithm for testing reducibility (Figure 4.2). The algorithm first makes a list of
targets of back arcs in reverse pre-order, v1 , . . . , vk (so that pre[v1 ] is maximal, and the sequence {pre[vi ]}
is decreasing). Then it computes ρ(v1 ). If ρ(v1 ) contains a non-descendant of v1 , the algorithm returns
FALSE. Otherwise, ρ(v1 ) can be collapsed into v1 . Then ρ(v2 ) is computed (in the collapsed graph), and so
on.
When building up ρ, each node added to ρ is checked to make sure that x is an ancestor of it. Hence by
Lemma 4.16, there can be no back arcs to it. Therefore, when we are building up the set P which will
eventually be ρ(v1 ), we only have to look at tree arcs, forward arcs, and cross arcs entering P . After ρ(v1 )
has been finished and collapsed into v1 , v2 will be the node with the greatest pre-order number which is the
target of a back arc (in the collapsed graph), and so the same process can be used when building up ρ(v2 ).
Each reachunder set is identified with its header, which is always the target of one of the original back arcs
in G. As the algorithm proceeds, reachunder sets can be formed from nodes which themselves represent
reachunder sets. In order to manage these set unions, the algorithm uses two functions:

FIND(x) returns the set (i.e. the node in the reduced graph) containing x.
3 as in the proof of Theorem 4.12

function REDUCE(⟨G, s⟩: flow graph): boolean;


begin
Construct a depth-first spanning tree D of G, numbering vertices from 1 to n in pre-order and reverse
post-order and calculating ND[v] for each vertex v;
for each vertex v (in any order) do
Make lists of all back arcs, forward arcs, and cross arcs entering vertex v;
Construct a set named v containing v as its only element;
HIGHPT[v] ← 0;
end for;
for each vertex x in reverse pre-order do
P ← ∅;
for each back arc y → x do
add FIND(y) to P ;
end for;
Q ← P;
-- Now construct ρ(x) by exploring backward from vertices in Q
while Q ≠ ∅ do
Select a vertex t ∈ Q and delete it from Q;
for each forward arc, tree arc, or cross arc y → t do
-- All back arcs entering t have already been collapsed
y ′ ← FIND(y);
if pre[x] > pre[y ′ ] or pre[x] + ND(x) ≤ pre[y ′ ] then
return(FALSE); -- G is not reducible.
end if ;
if y ′ ∉ P and y ′ ≠ x then
Add y ′ to P and to Q;
end if ;
if HIGHPT(y ′ ) = 0 then
HIGHPT(y ′ ) ← pre[x];
end if ;
end for;
end while;
-- Now P = ρ(x) in the current graph
for s ∈ P do
UNION(s, x, x);
end for;
end for;
return(TRUE);
end

Figure 4.2: Tarjan’s Test for Reducibility



UNION(A, B, C) creates a set C equal to the union of the sets A and B (destroying A and B in the process).

In addition, an attribute HIGHPT[x] is defined for each node and computed to be the pre-order number of
the first vertex (which must be a target of a back arc) into which x is collapsed (and 0 if x is not collapsed).
For example, for all y ∈ ρ(v1 ), HIGHPT[y] = pre[v1 ].
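The two set-manipulation primitives are easy to sketch. The following minimal Python sketch (the class and method names are ours) shows one way to implement FIND and UNION with path compression; it is illustrative only, and is not the weighted disjoint-set structure whose analysis gives the near-linear time bound quoted below.

class DisjointSets:
    def __init__(self, vertices):
        # Initially each vertex is a singleton set named after itself.
        self.parent = {v: v for v in vertices}

    def find(self, x):
        # FIND(x): return the name of the set currently containing x,
        # compressing the path along the way.
        root = x
        while self.parent[root] != root:
            root = self.parent[root]
        while self.parent[x] != root:
            self.parent[x], x = root, self.parent[x]
        return root

    def union(self, a, b, c):
        # UNION(A, B, C): merge the sets named a and b into a set named c.
        # In REDUCE this is always called as UNION(s, x, x).
        ra, rb = self.find(a), self.find(b)
        self.parent[ra] = rb
        self.parent[rb] = c
        self.parent[c] = c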
Tarjan’s algorithm gives us a convenient way to specify how a reducible flow graph can be collapsed by
successive applications of the T 1 and T 2 transformations:
Let us define an ordering on the nodes of G, which we will call reduction ordering, as follows: x will precede
y iff

1. HIGHPT[x] > HIGHPT[y]; or

2. HIGHPT[x] = HIGHPT[y] and rpost[x] < rpost[y].

4.18 Theorem The flow graph ⟨G, s⟩ can be reduced by a sequence of T 1 and T 2 transformations as follows: visit
the nodes in reduction order.

1. Each node can be collapsed by an application of T 2 into some other node.

2. All nodes with the same value of HIGHPT are collapsed into the same node.

3. When all the nodes with the same value of HIGHPT have been collapsed (say into node h), then a
possible application of T 1 to h can be made.

Proof. Consider first all the nodes for which HIGHPT is greatest; this is the initial part of the sequence
of nodes in reduction order. These nodes constitute the first reachunder set formed by Tarjan’s algorithm.
By Theorem 4.8, these nodes can be collapsed in reverse post-order (which is the same as reduction order
on this set) by applications of T 2. After they have all been collapsed into their header, an application of T 1
to that header may be needed to delete any self-loop at the header.
We know that the reverse post-ordering of the nodes is not affected by the transformations T 1 and T 2 (that
is, the actual numbering may change, but the ordering induced by the numbering does not). Hence this
process can be iterated on each successive reachunder set until G is completely reduced.

Tarjan has shown that this algorithm can be performed in time

O(Eα(E, V ))

where V and E are the number of vertices and edges in ⟨G, s⟩ and where α is related to an inverse of
Ackermann’s function and grows so slowly that for all practical purposes it is bounded by 3.
An example of the working of this algorithm is shown in Figure 4.3.
Figure 4.4 shows Tarjan’s algorithm applied to the same reducible flow graph as in Figure 3.10. The limit
DAG is shown in the upper right-hand corner, and the nesting tree of reachunder sets (analogous to the
interval nesting tree) is shown at the lower right-hand corner. The root of the nesting tree is the limit DAG.

[Figure drawing omitted: a five-node flow graph with nodes labelled S (1,1,0), A (2,4,1), B (3,5,2), C (4,2,1), and D (5,3,4).]

The ordered triples are (pre[x], rpost[x], HIGHPT[x]). The reachunder sets are outlined in
dotted boxes. Reduction order is DBCAS.

Figure 4.3: Reduction Order Computed by Tarjan’s Algorithm



[Figure drawing omitted: a flow graph on the nodes S, A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, numbered 1–18 in pre-order, together with its limit graph and the nesting tree of its reachunder sets.]
The nodes are numbered in pre-order. Tree arcs are solid lines. Forward arcs (F), cross arcs (C),
and back arcs (B) are dashed lines and are labelled. The successive reachunder sets are indicated
by dotted boxes. The reachunder sets are found in the order JLPOFCBS.
The limit graph Ĵ(G) is shown at the upper right.
The nesting tree of reachunder sets is shown at the lower right.

Figure 4.4: Tarjan’s Algorithm Applied to the Reducible Flow Graph of Figure 3.10
Chapter 5

Applications to Data-Flow Analysis

5.1 Four Data-Flow Problems


In this chapter, ⟨G, s⟩ will be understood to be the flow graph of basic blocks in the intermediate represen-
tation of a computer program. In addition, although we have not assumed anything about exit nodes in
flow graphs (and although programs in general do not have to have unique exit nodes), we will assume that
⟨G, s⟩ has a unique exit node, which is reachable from every other node. This is no great restriction, for if
a program has more than one exit node, we can simply create a new one and add an arc from each of the
former exit nodes to the new one. And certainly we are not interested in analyzing programs which contain
nodes which cannot lead to an exit. Let us denote the exit node of the program by e (for “end” or “exit”).
We will denote such a flow graph by ⟨G, s, e⟩.
Note that a flow graph ⟨G, s, e⟩ is now defined perfectly symmetrically; and in fact, for each flow graph, there
is a reverse graph ⟨G, e, s⟩ formed from the same nodes as G by simply reversing the direction of each arc. s
and e then change roles. The reverse graph of a reducible flow graph is not necessarily reducible, however;
see, for instance, Figure 5.1.

[Figure drawing omitted: a small reducible flow graph ⟨G, s, e⟩ and its reverse graph ⟨G, e, s⟩, which is not reducible.]

Figure 5.1: The reverse graph of a reducible flow graph may not be reducible.

For the purposes of global optimization, we wish to gather information about the use of variables in the
program. The following are typical kinds of information:

Available Expressions An expression such as x + y is available at a point P in the program iff


• the expression has been computed on each path from s to P ; and


• if the expression were recomputed at P , its value would not be different from the last computed
value.
For instance, not only must x + y have been computed previously, but also the values of x and y must
not have changed since the last computation of x + y on any path to P . Any change to the values of
x or y is said to “kill” the value of the expression x + y.
If an expression is available at a point, then in principle, it does not need to be re-evaluated at that
point if it occurs there—its value may have been saved in a register. Thus, the more expressions we
know are available at a point, the more useful information we have.
Reaching Definitions A definition of a variable is a statement which can modify the value of the variable;
e.g. an assignment to the variable or a read statement. A definition of a variable v reaches the point
P in the program iff there is a path from the definition to P on which v can be shown to be unchanged
(i.e. on which there is no other definition of v; another definition of v would be said to “kill” the first
definition). Given a point P in the program, and a variable v, we want to compute the set of definitions
of v which reach P .
If only one definition of a variable reaches a point, then a use of that variable can be replaced by its
definition—in particular, the definition may assign a constant value to the variable, and the use of the
variable may thus be replaced by that constant. On the other hand, if more than one definition of
a variable reaches a point, then we can’t perform this kind of replacement. Thus (in contrast to the
situation in the problem of Available Expressions) the fewer definitions of a variable which reach a
point, the more useful information we have.
Another way that reaching definitions information can be used is this: Say E is an expression in the
body of a loop. If all the reaching definitions of E are outside the loop, then E is a loop invariant
and can be evaluated once before the loop is entered, rather than each time through the loop. Again,
finding fewer reaching definitions constitutes finding more useful information.
Live Variables A variable is live at a point P of the program iff its value at P could be used along some
path leaving P .
Liveness information is used in register allocation. A code generator may generate code assuming
an infinite number of “virtual” registers. Each virtual register then has to be assigned to an actual
machine register, and of course this has to be done in such a way that the assignments do not conflict.
A modern compiler has a register allocation phase that performs this assignment.
Virtual registers can be classified as local (used entirely within one basic block) or global (used in more
than one basic block).
If the machine we are compiling for only has a few registers, it is common to perform only local register
allocation—this means that we allocate registers within each basic block, but do not keep any values in
registers between basic blocks. Thus, any global register used in a basic block will have to be written
to memory at the end of the block, in order to safely preserve its value for use in some subsequent
basic block. However, any virtual register that is not live at the end of a basic block does not have to
be stored in memory—it can be allowed to “die” at the end of the block. So knowing that a virtual
register is not live is useful information. (Note that while the problem is conventionally referred to as
“live variables”, the variables in question are actually the virtual registers.)
If the machine has a sufficiently large number of registers, we may also perform global register allocation.
That is, we attempt to assign global virtual registers to machine registers consistently over larger
segments of the program. The key point is that a virtual register has to correspond to a machine

register only at those points in the program at which the virtual register is live. Global register
allocation thus is driven by liveness information. So in this case also, knowing that a virtual register
is not live is useful information, because it frees up a machine register for allocation to another virtual
register: if we have fewer live variables, we have more useful information.
Very Busy Expressions An expression is very busy at a point P of the program iff it is always subsequently
used before it is killed.
If an expression is very busy at P then it can be computed at P and subsequent computations along
all paths before its first use along each path can be deleted. This process is called “code hoisting”,
and saves space, though in general not time. (For this reason, this concept is now only of historical
interest.) The more very busy expressions we have, the more code hoisting we can do, and hence the
more useful information we have.

In each of these problems, we may not be able to compute all this useful information perfectly accurately.
However, if we compute approximations to the useful information which yield no more than the true useful
information (i.e. which are lower approximations to the true useful information), then we will at any rate not
perform any unjustified program transformations. Thus, lower approximations to the true useful information
in each case are safe or conservative approximations.

5.2 Data-Flow Equations


The points P of the program at which we are concerned with this information are the beginnings and the
ends of basic blocks. It turns out that there are some simple relationships or constraints which we can write
down. Each constraint relates four quantities:

in The information valid on entrance to a basic block.


out The information valid on exit from a basic block.
gen The information generated within the basic block.
kill The information killed or made invalid within the basic block.

In each case, the items gen and kill are quantities which we can compute in a straightforward manner for
each basic block (they are functions on the set of basic blocks), and the quantities in and out (which are
also functions on the set of basic blocks) are the final quantities we hope to arrive at.
For example, for the Available Expressions problem, for a given expression and a given basic block, in for
that basic block is a boolean quantity which is TRUE iff that expression is available on entrance to that
basic block.
In general, for each instance of the four examples of data-flow information given above (that is, for each
expression, or each definition, etc.), these four quantities are boolean functions on basic blocks which are
TRUE as follows (N.B. the precise definitions of gen and kill in each case are determined by the data-
flow equations subsequently presented; in particular, the definitions for the Live Variables and Very Busy
Expressions problems are tailored to reflect their use in “bottom-up” equations, as we will see below):

Available Expressions For a given expression,



in The expression is available on entrance to the basic block.


out The expression is available on exit from the basic block.
gen There is an explicit computation of the expression in the basic block which is not followed in the
same basic block by a subsequent change to the value of any variable in the expression (which
would change the value of the expression if it were recomputed).
kill The value of some variable in the expression changes within the basic block (so that if the expres-
sion were recomputed, its value might change), and the expression is not subsequently computed
in the basic block.
Reaching Definitions For a given definition,
in The definition reaches the entrance to the basic block.
out The definition reaches the exit from the basic block.
gen The definition is produced within the basic block and not subsequently killed in the same basic
block (by another definition of the same variable).
kill The definition is killed within the basic block (by another definition of the same variable within
that block).
Live Variables For a given variable,
in The variable is live on entrance to the basic block.
out The variable is live on exit from the basic block.
gen The variable is used within the basic block prior to any definition of that variable within the
block. Also called use.
kill The variable is defined within the basic block prior to any use of the variable within the basic
block. Also called def.
Very Busy Expressions For a given expression,
in The expression is very busy on entrance to the basic block.
out The expression is very busy on exit from the basic block.
gen The expression is used within the basic block prior to any definition of any variable in that
expression within the basic block. Also called use.
kill The value of some variable in the expression changes within the basic block (so that if the ex-
pression were recomputed, its value would be different) prior to any use of the expression in the
block.

Note that in every case, gen ∧ kill = FALSE. This is a convention: we don’t allow information to be both
generated and killed within one basic block.
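As a concrete illustration, here is a small Python sketch (the block representation — a list of (definition-id, variable) pairs — is invented for the example) of how gen and kill could be computed for one basic block in the Reaching Definitions problem:

def gen_kill_reaching_defs(block, all_defs):
    # block    -- list of (def_id, var) pairs, in execution order
    # all_defs -- dict mapping every def_id in the program to the variable it defines
    gen, kill = set(), set()
    for def_id, var in block:
        # A later definition of var kills any earlier one in the same block ...
        gen = {d for d in gen if all_defs[d] != var}
        # ... and kills every other definition of var elsewhere in the program.
        kill |= {d for d, v in all_defs.items() if v == var and d != def_id}
        gen.add(def_id)
    kill -= gen          # keep the convention gen ∧ kill = FALSE
    return gen, kill

all_defs = {"d1": "x", "d2": "x", "d3": "y", "d4": "x"}   # d4 lives in some other block
block_b1 = [("d1", "x"), ("d3", "y"), ("d2", "x")]
print(gen_kill_reaching_defs(block_b1, all_defs))          # gen = {d2, d3}, kill = {d1, d4}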
We now can write down some equations involving these quantities. For each basic block b, let pred(b) denote
the family of basic blocks which are predecessors of b (i.e. those basic blocks x for which x → b is an arc in
the flow graph); and let succ(b) denote the family of basic blocks which are successors of b. We have three
kinds of equations:

Boundary value equations These define values at the entry or exit blocks of the program.

Propagation equations These tell how values are propagated from one basic block to another. These are
also called confluence equations.

Transfer equations These tell how values are changed by a basic block.

There are two general types of data-flow problems: those in which information is propagated in the direction
of program execution (forward equations), and those in which information is propagated in the reverse direction
(backward equations). In general, we let in, out, etc. be represented as boolean vectors (usually referred
to as bit-vectors), with one dimension allocated to each expression, definition, or variable in the problem.
Since boolean vectors are equivalent to subsets, we can denote the vector which is all FALSE by ∅, 0, or ⊥.
Although we will not need it till later, we will denote the vector which is all TRUE by 1 or ⊤. Similarly,
the boolean operator ∨ will be denoted by ∪, and ∧ will be denoted by ∩. We can therefore interpret the
variables in, out, kill, and gen as the sets of expressions, definitions, or variables which are characterized by
the conditions stated above.

Forward Equations

Available Expressions and Reaching Definitions are both examples of data-flow problems which lead to
forward equations:
Available Expressions: The equations are

in(s) = ∅

in(b) = ⋂_{x∈pred(b)} out(x)    for b ≠ s

out(b) = (in(b) − kill(b)) ∪ gen(b)

Reaching Definitions: The equations are

in(s) = ∅

in(b) = ⋃_{x∈pred(b)} out(x)    for b ≠ s

out(b) = (in(b) − kill(b)) ∪ gen(b)

Thus, forward equations are characterized by passing information from the predecessors of a node to the
node itself, with a boundary condition on the entry node s. The transfer equation then computes out in
terms of in.
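As a small worked example (the three-block graph and its gen/kill sets are invented), the forward equations for Reaching Definitions can be iterated to a fixed point as follows; the iteration itself anticipates Algorithm E of Section 5.5:

# Flow graph: s -> a, a -> b, b -> a (a loop between a and b).
pred = {"s": [], "a": ["s", "b"], "b": ["a"]}
gen  = {"s": {"d1"}, "a": {"d2"}, "b": set()}
kill = {"s": set(), "a": {"d1"}, "b": set()}      # d1 and d2 define the same variable

IN  = {b: set() for b in pred}                    # in(s) = ∅ is the boundary condition
OUT = {b: set() for b in pred}

changed = True
while changed:
    changed = False
    for b in pred:
        if b != "s":
            IN[b] = set().union(*(OUT[x] for x in pred[b]))   # union over predecessors
        new_out = (IN[b] - kill[b]) | gen[b]                  # transfer equation
        if new_out != OUT[b]:
            OUT[b], changed = new_out, True

print(IN["a"], OUT["b"])   # {'d1', 'd2'} reach the entrance of a, but only {'d2'} leaves b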

Backward Equations

Live Variables and Very Busy Expressions are both examples of data-flow problems which lead to backward
equations:
Live Variables: The equations are

out(e) = ∅

out(b) = ⋃_{x∈succ(b)} in(x)    for b ≠ e

in(b) = (out(b) − kill(b)) ∪ gen(b)

Very Busy Expressions: The equations are

out(e) = ∅

out(b) = ⋂_{x∈succ(b)} in(x)    for b ≠ e

in(b) = (out(b) − kill(b)) ∪ gen(b)

Thus, backward equations are characterized by passing information from the successors of a node to the
node itself, with a boundary condition on the exit node e. The transfer equation then computes in in terms
of out.
Following Hecht [5], we can make a taxonomy of the four types of data-flow problems considered here so far
in the form of a table:

                              Set Intersection,           Set Union,
                              “and”, ∀-problems           “or”, ∃-problems

Top-Down Problems             Available Expressions       Reaching Definitions
(flow from predecessors)

Bottom-Up Problems            Very Busy Expressions       Live Variables
(flow from successors)

We note that for set intersection problems, larger sets give us more useful information, while for set union
problems, smaller sets give us more useful information.

Distributive Problems

Let us examine the equations for the Available Expressions Problem more closely. If for each basic block b
we write fb (x) = (x − kill(b)) ∪ gen(b), then the propagation and transfer equations for this problem can be

written together as
out(b) = fb ( ⋂_{x∈pred(b)} out(x) ).

What this says is that if we intersect the information coming in from each predecessor of b and apply fb to
that intersection, we will get the information leaving b. Actually, this may be too conservative—it might
happen in principle that each predecessor x of b transmitted distinct information out(x) which nevertheless
got mapped by fb to the same element. In this case, the equations we have written would yield out(b) = fb (0),
whereas in fact out(b) should really be ∩fb (out(x)), which might well properly include fb (0).
In fact, this is not a problem in the case of Available Expressions, because the function fb satisfies

(1) fb (x ∩ y) = fb (x) ∩ fb (y)

which means we can intersect the out(x) first before applying fb and get the same answer as if we had first
applied fb to each out(x) and then performed the intersection. Each of the four data-flow problems we have
considered leads to functions fb satisfying equations such as (1)—in the case of Reaching Definitions and
Live Variables, ∩ is replaced by ∪. Functions satisfying (1) are called distributive, and problems leading to
such functions are called distributive data-flow problems.
In general, solutions of the data-flow equations don’t yield correct solutions for non-distributive data-flow
problems. But they do give a conservative estimate, as we will see below.
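The distributivity of these functions is easy to check mechanically; the following Python sketch (with invented expression names) simply verifies equation (1) for one particular fb and one pair of inputs:

def make_transfer(gen, kill):
    return lambda s: (s - kill) | gen              # fb(x) = (x − kill(b)) ∪ gen(b)

fb = make_transfer(gen=frozenset({"a*b"}), kill=frozenset({"x+y"}))
x  = frozenset({"x+y", "u-v"})
y  = frozenset({"u-v", "p*q"})

assert fb(x & y) == fb(x) & fb(y)                  # equation (1), with ∧ = ∩
print(fb(x & y))                                   # frozenset({'u-v', 'a*b'})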

5.3 Indeterminacy of Solutions


The data-flow equations may not have unique solutions. For example, consider the flow graph in Figure 5.2.
For the Available Expressions problem restricted to one expression, the equations for this Figure are:

Figure 5.2: Flow Graph Illustrating Non-Unique Solution of Data-Flow Equations

in(s) = FALSE
in(e) = out(s) ∧ out(e)
out(s) = (in(s) ∧ ¬kill(s)) ∨ gen(s)
out(e) = (in(e) ∧ ¬kill(e)) ∨ gen(e)

Let us suppose that the expression we are concerned with is evaluated in s and never evaluated again, and
that it is never killed (i.e. no variable in it is ever modified after the expression is evaluated). Then we must
have

gen(s) = TRUE; kill(s) = FALSE

gen(e) = FALSE; kill(e) = FALSE

and the equations become

in(s) = FALSE
in(e) = out(s) ∧ out(e)
out(s) = in(s) ∨ TRUE = TRUE
out(e) = in(e) ∨ FALSE = in(e)

After a final round of substitution, we get

in(s) = FALSE
in(e) = TRUE ∧ out(e) = out(e) (which we knew anyway)
out(s) = TRUE
out(e) = in(e)

This system of equations has two solutions: one corresponding to out(e) = FALSE, and the other corre-
sponding to out(e) = TRUE. It is obvious, however, that out(e) = TRUE is the solution we need.
The data-flow equations, therefore, should be thought of only as a set of constraints. In order to find
the actual solution to the data-flow problem, we must either place further restrictions on the solutions or
formulate the problem differently.
In fact, there is a different formulation of this problem which makes the solution obvious: Let us consider
the set of all paths from s to e. There are an infinite number of them: s → e, s → e → e, s → e → e → e,
and so forth. Each of these paths has its own data-flow properties, which can be computed by the same
data flow equations, except that since each node has a unique predecessor, there is no intersection needed
when passing from one node to another. It is easy to see that for each path, the expression generated in s is
available at the end of the path (i.e. out is TRUE at the end of each path). So now here is the reformulation:
The expressions available on exit from e are just those expressions which are available on exit from every
path from s to e, and hence in this case, we have out(e) = TRUE. This is called the “meet-over-paths”
solution to the data flow problem. We will look into this further in the next section.

5.4 Abstract Frameworks for Data-Flow Analysis


Before we do this, let us first observe that in some sense, these four different types of data-flow problems
can be transformed into each other:

First, a backward problem can be thought of as a forward problem on the reverse flow graph. The boundary
condition then becomes a boundary condition on s, rather than on e, and the roles of in and out are reversed.
With these changes, the data-flow equations become identical to the forward equations.
Next, temporarily writing ¬in for the boolean complement of in, and similarly ¬out for that of out, the forward data-flow
equations for set union problems (e.g. Reaching Definitions) become

¬in(s) = 1

¬in(b) = ⋂_{x∈pred(b)} ¬out(x)    for b ≠ s

¬out(b) = (¬in(b) − gen(b)) ∪ kill(b)


which are just the forward equations for set intersection problems (e.g. Available Expressions), with the
exception that the boundary condition has become an assignment of 1 instead of 0, and the roles of gen and
kill have become interchanged.
With this transformation, larger values of in and out correspond in each case to more useful information.
We can now construct an abstract model for data-flow analysis problems. As we construct the model, we
will show that it includes the Available Expressions problem, and therefore the other three problems as well,
since as we have seen, they are formally equivalent to it. Our abstract model will not only handle all four
data-flow analysis problems presented here, but others as well which are not formally equivalent to them.
Instead of having in and out take values in the space of bit-vectors (or subsets), we will allow them to take
values in a general lower semi-lattice:
A lower semi-lattice is a nonempty set L together with a binary operator ∧ on L (called “meet”), which
is commutative, associative, and idempotent. For example, the lattice of subsets of a fixed set is a lower
semi-lattice with ∧ = ∩. If x and y are members of L, we say that x ≤ y iff x ∧ y = x.
5.1 Lemma ≤ is a partial order on L.
Proof. Reflexivity: x ≤ x means x ∧ x = x, and this is true because ∧ is idempotent.
Transitivity: If x ≤ y and y ≤ z then x∧y = x and y∧z = y. Hence x∧z = (x∧y)∧z = x∧(y∧z) = x∧y = x,
since ∧ is associative, and so x ≤ z; i.e. ≤ is transitive.
Anti-symmetry: If x ≤ y and y ≤ x then x = x ∧ y = y ∧ x = y, since ∧ is commutative.

In our model of data-flow analysis, L will be thought of as representing information, and to say that x ≤ y
will be to say that x represents less information than y does.
5.2 Lemma For all x and y in L, x ∧ y is the greatest lower bound of x and y.
Proof. 1. Using associativity and commutativity, x ∧ y ≤ y, and x ∧ y ≤ x.
2. If z ≤ x and z ≤ y then z ∧ (x ∧ y) = (z ∧ x) ∧ y = z ∧ y = z, so z ≤ x ∧ y.

By induction, we can show that if X = {x1 , x2 , . . . , xn } is a finite subset of L, then X has a greatest lower
bound, which we denote by ⋀X, and in fact

⋀X = x1 ∧ x2 ∧ . . . ∧ xn .
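To make the definitions concrete, here is a small Python sketch of the subset semi-lattice (with meet given by intersection) and the partial order derived from it, checked on a few invented elements:

meet = lambda x, y: x & y                     # ∧ on the lattice of subsets
leq  = lambda x, y: meet(x, y) == x           # x ≤ y  iff  x ∧ y = x

a, b, c = frozenset({1, 2}), frozenset({2, 3}), frozenset({2})

assert leq(meet(a, b), a) and leq(meet(a, b), b)       # a ∧ b is a lower bound of a and b
assert leq(c, a) and leq(c, b) and leq(c, meet(a, b))  # any common lower bound is ≤ a ∧ b
assert leq(a, a)                                       # ≤ is reflexive, by idempotence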

In practically all cases L is finite, but just in case it is not, we will go one step farther and assume that L
is complete, by which we mean that every non-empty subset (finite or not) X of L has an inf (a greatest
lower bound), which we denote by ⋀X. Completeness in general is a deep property of a lattice; it is what
distinguishes the real numbers from the rationals, for example; and it is not the sort of thing you can prove by
induction. All the lattices that arise in data-flow computations, however, can easily be seen to be complete.
(Of course, any finite semi-lattice L is automatically complete.) If L is a complete lattice, then in particular
L has a minimum element ⋀L, which we denote 0 or ⊥. Now it turns out that much more is true: the
assumption that L is complete is enough to ensure that L is actually a lattice, by which we mean that L
also is closed under the least upper bound operation ∨:

5.3 Lemma A complete lower semi-lattice is a complete lattice.

Proof. Let A be any subset of L. Let

M = {x ∈ L : for all a ∈ A, x ≥ a}.


Since L is complete, there is an element m = ⋀M.
Since each element a of A is a lower bound for M , it must be ≤ m. Hence m is an upper bound for A, and
by construction, it is the least upper bound for A.

In particular, L must also have a maximum element 1 = ⋁L.1 The maximum element is also sometimes
denoted by ⊤.
We will be dealing with sets of functions having values in L. If f and g are two functions on a common
domain having values in L, we say that f ≤ g iff f (x) ≤ g(x) for all x in their domain. We define the
meet f ∧ g of two functions pointwise: (f ∧ g)(x) = f (x) ∧ g(x). A function f : L → L is monotonic iff
x ≤ y =⇒ f (x) ≤ f (y).

5.4 Lemma f : L → L is monotonic iff for all x and y in L,

f (x ∧ y) ≤ f (x) ∧ f (y).

Proof. If f is monotonic, then since x ∧ y ≤ x and x ∧ y ≤ y, we must have f (x ∧ y) ≤ f (x) and


f (x ∧ y) ≤ f (y). Hence f (x ∧ y) ≤ f (x) ∧ f (y) by Lemma 5.2.
Conversely, if x ≤ y, then f (x) = f (x ∧ y) ≤ f (x) ∧ f (y) ≤ f (y), again by Lemma 5.2, so f is monotonic.

In the data-flow problems we have been considering, L is the family of subsets of a finite set (e.g. the set of
all expressions in a program). It is clearly complete, and ⊥ = ∅. Further, the relation ≤ is the same as the
relation ⊆.
Now for each basic block other than s in Available Expressions (our model problem) we have functions which
produce out from in:

out(b) = fb (in(b))
1 By the proof, this amounts to saying that 1 = ⋀∅. If this seems too cute, of course, one could simply adjoin a maximum
element to a semi-lattice which was otherwise complete.

where

fb (x) = (x − kill(b)) ∪ gen(b)

Let us interpret this slightly differently: think of fb as a map from L to L which reflects the effect of the
basic block b. Thus the functions {fb } contain all the information about each basic block and represent the
information contained in the transfer equations.
In fact, we will make a further generalization: let us associate functions with arcs in the graph, rather than
with nodes. For the problems we have been considering, the function fx→y will be the same as what we have
been denoting by fx —that is, it will be the function reflecting the effect of the basic block x. In general,
however, the functions fx→y will be allowed to be different for different y. This will mean that the effect of
node x may be different depending on which node y it exits to. This can be useful in two situations:

1. The test at the end of a basic block can give us information. For instance, if x exits to y if the variable
v = 4, and otherwise exits to z, then fx→y might reflect the fact that v is known to be 4, whereas fx→z
would not.

2. We want to allow for the possibility that the nodes in the flow graph are not themselves basic blocks
but have internal structure. For instance, the flow graph might be a derived graph of another flow
graph, in which case each node would actually be an interval in the other flow graph. This interval
could have several exits. Associating the functions f with edges leaving nodes does not completely
enable us to capture all the information due to the fact that a node could have several internal exits,
but it does help a little.

To include these functions in our formal model, we will assume that we have a family F of functions mapping
L to L such that

1. F contains the identity function I.

2. F is closed under function composition.

3. Each function in F is monotonic.

Such a structure ⟨L, ∧, F⟩ is called a monotone data flow analysis framework.


Let us show that Available Expressions is modeled by a monotone data flow analysis framework. The set of
functions F will be the set of all functions from L to L of the form

f (x) = (x − kill) ∪ gen

(Equivalently, f (x) = (x ∩ kill^c ) ∪ gen.2 ) We get the identity function when kill = gen = 0. The fact that
F is closed under function composition follows from the following identity, where we take f (x) = (x ∩ a) ∪ b
and g(x) = (x ∩ c) ∪ d:

f ◦ g(x) = (x ∩ (a ∩ c)) ∪ (a ∩ d) ∪ b

2 In general, if X is any set, X^c stands for the complement of that set.



Finally, every function of the form f (x) = (x ∩ a) ∪ b is clearly monotonic, since set inclusion is equivalent
to the relation ≤.
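The composition identity above is easy to confirm over subsets; the following Python sketch (with an invented six-element universe) checks it exhaustively for one pair of gen/kill functions:

import itertools

def gk(a, b):
    return lambda x: (x & a) | b                   # f(x) = (x ∩ a) ∪ b

U    = frozenset(range(6))                         # a small universe of "expressions"
a, b = frozenset({0, 1, 2}), frozenset({3})
c, d = frozenset({1, 2, 4}), frozenset({0, 5})

f, g = gk(a, b), gk(c, d)
h    = gk(a & c, (a & d) | b)                      # the claimed form of f ◦ g

for r in range(len(U) + 1):
    for xs in itertools.combinations(U, r):
        x = frozenset(xs)
        assert f(g(x)) == h(x)                     # f ◦ g(x) = (x ∩ (a ∩ c)) ∪ (a ∩ d) ∪ b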
An instance of a monotone data flow analysis framework ⟨L, ∧, F⟩ is a flow graph ⟨G, s, e⟩ together with

1. an assignment e → fe of a function fe ∈ F to each edge e of G.

2. a value λ ∈ L assigned to s. This is the boundary condition. Typically, λ is either 0 (as in Available
Expressions), or 1 (as in Reaching Definitions).

5.5 Solutions of Abstract Data Flow Problems


The Meet-Over-Paths Solution

Given an instance of a monotone data flow analysis framework, we can define a function f from paths
P : s = x0 → x1 → . . . → xn = b to L by

f (P ) = fxn−1 →xn ◦ fxn−2 →xn−1 ◦ . . . ◦ fx0 →x1 (λ).

Note that f (P ) does not include the effect of node xn —intuitively, we are calculating data flow up through
the end of xn−1 ; that is, up to the beginning of xn . In terms of the Available Expressions problem, we are
calculating what might be denoted inP (b).
The meet-over-paths solution (sometimes referred to as the mop solution) of an instance of a data flow
analysis framework is the function from G to L defined by

mop(b) = λ if b = s, and otherwise mop(b) = ⋀_{paths P from s to b} f (P ).

The number of paths may be infinite, but since L is complete, the expression on the right-hand side is
well-defined.
If we knew that every legal path in the program could actually be executed, then the meet-over-paths solution
would yield the true values of in at each basic block (and from that, we could get the values of out). In
general, however, it is an undecidable question whether a particular path in a program will ever be executed,
and so the meet-over-paths solution may not be strictly correct. If it is incorrect, however, it is only because
too many paths were included, and this would make the meet-over-paths solution too small. That is, the
meet-over-paths solution is a lower approximation to the real solution. As such, it contains less information
than the real solution, and so is a safe, or conservative, estimate of the real solution.

The Maximal Fixed-Point Solution

Since most computer programs contain loops, the number of paths to most nodes in the flow graph is infinite,
and hence in general the meet-over-paths solution is difficult or impossible to compute. We are always safe,
however, if we can compute a lower approximation to it, since we will then have a lower approximation to

the real solution. It turns out that instances of monotone data-flow analysis frameworks always have such
solutions, and the solutions are easily computable. These solutions can be computed in terms of the original
data-flow constraint equations.
First, note that the constraint equations for our model problem can be written

in(s) = λ

in(b) = ⋂_{x∈pred(b)} ((in(x) − kill(x)) ∪ gen(x))    for b ≠ s

where λ is a constant; it is 0 for the Available Expressions problem. Now the family M of functions mapping
G to L is a complete lower semi-lattice under the pointwise operations. It has a maximal element; namely,
the function m whose value m(b) at every point b is 1. If m is a typical member of M and if we define a
map T : M → M by

(T m)(s) = m(s)

(T m)(b) = ⋂_{x∈pred(b)} fx→b (m(x))    for b ≠ s

then the solutions to the model constraint equations are just those functions m which are fixed points of the
transformation T under the constraint m(s) = λ. Equivalently, let M0 be the subspace of M consisting of
all the functions m ∈ M such that m(s) = λ. M0 is invariant under T , and M0 is still a complete lower
semi-lattice under the pointwise operations. Its maximal element is the function m defined by

m(b) = λ if b = s, and m(b) = 1 if b ≠ s.

The solutions to the model constraint equations are then just the fixed points of T when restricted to M0 .
Since all the maps fx are monotonic, the transformation T is also.
5.5 Theorem (Tarski) If L is a complete lattice and T is a monotonic function on L, then

1. T has a fixed point.


2. The set of fixed points of T has a maximal element.

Proof. Let U denote the set {x ∈ L : T (x) ≥ x}. Thus any fixed points of T (if there are any) will lie in
U . Further, U ≠ ∅ since 0 ∈ U .
Let a = sup U .
For each u ∈ U , a ≥ u, and so T (a) ≥ T (u) ≥ u. So we have

a = sup{u : u ∈ U } ≤ sup{T (u) : u ∈ U } ≤ T (a)



which shows that a ∈ U .


Now we know that a ≤ T (a), and so T (a) ≤ T (T (a)); i.e., T (a) ∈ U . If in fact a ≠ T (a), then a < T (a) ∈ U ,
and so a cannot be the sup of U , a contradiction. Thus a is in fact a fixed point for T . a must be the
maximal fixed point, since U includes all the fixed points of T .

Thus the constraint equations have a maximal solution, which we call the maximal fixed-point solution and
denote mfp. The proof we have given is non-constructive. If we restrict the semi-lattice L further, however,
a constructive algorithm can be given for finding mfp. The algorithm is quite simple, and is really based on
the essential idea of the proof—start with an element, and iterate T . In this case, however, we start with
the maximal element 1 of L.3
The algorithm works for any complete lower semi-lattice L with a maximal element 1 which also satisfies
the descending chain condition (DCC): each descending sequence x1 > x2 > x3 > . . . has only finitely many
elements. (The number of elements does not have to be uniformly bounded over all chains, however.) Again,
any finite semi-lattice (such as those characterizing the four data-flow problems we have been considering)
certainly satisfies DCC; also, any semi-lattice which satisfies DCC is automatically complete.
Figure 5.3 gives an algorithm for finding the maximal fixed-point solution if L satisfies DCC.

begin
m(s) ← λ;
for each node b ∈ G − {s} do
m(b) ← 1;
end for;
repeat
for each b ∈ G − {s} do
m(b) ← ⋀_{x∈pred(b)} fx→b (m(x));
end for;
until there is no change to any m(b);
end

Figure 5.3: Algorithm E:


Algorithm for Computing the Maximal Fixed-Point Solution
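A direct transcription of Algorithm E into Python is short. In the sketch below, everything is parameterized and all names are ours: meet is the ∧ of L, top is 1, lam is λ, and edge_fn supplies the functions fx→b.

from functools import reduce

def algorithm_e(nodes, pred, edge_fn, meet, top, s, lam):
    # Round-robin iterative computation of the maximal fixed-point solution.
    m = {b: top for b in nodes}
    m[s] = lam
    changed = True
    while changed:
        changed = False
        for b in nodes:              # the visit order does not affect the result (Corollary 5.7 below)
            if b == s:
                continue
            new = reduce(meet, (edge_fn[(x, b)](m[x]) for x in pred[b]))
            if new != m[b]:
                m[b], changed = new, True
    return m

For Available Expressions, for instance, one would take top to be the set of all expressions in the program, lam = frozenset(), meet to be set intersection, and edge_fn[(x, b)] to be the gen/kill function of block x.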

5.6 Theorem If L satisfies DCC, then Algorithm E terminates and computes the maximal fixed-point solution
of the data-flow constraint equations.
Proof. Let m0 denote the function m as initialized, and let mj denote the value of m after the while loop
has been executed j times. By induction, mj+1 (b) ≤ mj (b) for all b and j. Since L satisfies DCC and G is
finite, the algorithm must terminate in a finite number of steps.
Let m be the final value computed by the algorithm. By construction, T m = m and m(s) = λ, so m is a
fixed point and a solution to the data-flow constraint equations. If a is any other solution, then since a ≤ m0 ,
we have a = T a ≤ T (m0 ) = m1 , and by induction, a ≤ m. Thus, m is the maximal fixed-point solution.
3 One could also prove Tarski’s theorem by approximating from above, but the proof becomes slightly more technical—letting
P denote the (possibly empty) set of fixed points of T , we define U to be the set of those points x ∈ L such that x ≥ T (x) and
x ≥ every element of P. The proof then goes through much as above. The proof we have given is the one given by Tarski in his
original paper [24]. Tarski actually proved somewhat more: he proved that the set P of fixed points of T was itself a lattice.

5.7 Corollary The result produced by Algorithm E is independent of the order in which the nodes are processed
in the second for loop.

We say that a function f : L → L is distributive (or a semi-lattice homomorphism) iff for all x and y in L,

f (x ∧ y) = f (x) ∧ f (y).
By Lemma 5.4, every distributive function is monotonic. A distributive data-flow analysis framework is one in
which all the functions in F are distributive. It is easy to check that the functions of the form f (x) = (x∩a)∪b
are distributive, so the four data-flow problems we have been considering are all distributive.
We can now derive the relationship between the mfp and mop solutions. The facts are that in general,
mfp ≤ mop, and if ⟨L, ∧, F⟩ is distributive and satisfies DCC, mfp = mop.
5.8 Lemma If ⟨L, ∧, F⟩ is a monotone data-flow analysis framework, and if f is any function in F and X is
any subset of L, then

f (⋀X) ≤ ⋀_{x∈X} f (x).

If in fact ⟨L, ∧, F⟩ is a distributive data-flow analysis framework which satisfies DCC, then

f (⋀X) = ⋀_{x∈X} f (x).

Proof. In any case, we have

⋀X ≤ x

for each x ∈ X, and so since f is monotonic,

f (⋀X) ≤ f (x)

for each x ∈ X, and therefore

f (⋀X) ≤ ⋀_{x∈X} f (x).

To prove the converse, since we are now assuming that ⟨L, ∧, F⟩ satisfies DCC, there is a finite subset
{x1 , x2 , . . . , xn } of X such that

⋀X = x1 ∧ x2 ∧ . . . ∧ xn .

Thus,

⋀_{x∈X} f (x) ≤ f (x1 ) ∧ . . . ∧ f (xn )
             = f (x1 ∧ . . . ∧ xn )    since f is distributive
             = f (⋀X)

5.9 Theorem (Kam and Ullman[10]) If ⟨L, ∧, F⟩ and ⟨G, s, e⟩ constitute an instance of a monotone data-
flow analysis framework, then mfp ≤ mop.

Proof. We let π(b) denote the set of all paths from s to b, and πj (b) denote the set of paths from s to b of
length ≤ j. Thus, π(b) = ⋃_j πj (b), and

mop(b) = ⋀_{P ∈π(b)} f (P ) = ⋀_j ⋀_{P ∈πj (b)} f (P ).

We will show by induction that

(∗)    mfp(b) ≤ ⋀_{P ∈πj (b)} f (P )

for all j, which will prove the theorem.

For j = 0, the right-hand side equals 1 except at s, where it equals λ = mfp(s).

Now suppose we know that (*) is true for j = n; we want to show it is true for j = n + 1. If b = s, there is
again nothing to prove. Otherwise, we have

mfp(b) = ⋀_{x∈pred(b)} fx→b (mfp(x))
       ≤ ⋀_{x∈pred(b)} fx→b ( ⋀_{P ∈πn (x)} f (P ) )
       ≤ ⋀_{x∈pred(b)} ⋀_{P ∈πn (x)} fx→b ◦ f (P )
       = ⋀_{P ∈πn+1 (b)} f (P )

Thus for instances of monotone data-flow analysis frameworks, the maximal fixed-point solution is easily
computable and is a conservative estimate for the true solution of the data-flow problem.

In fact, for the most common data-flow problems, including the ones we started with, the mfp and mop
solutions are equal, as we will now show:

5.10 Theorem In an instance of a distributive data-flow analysis framework satisfying DCC, mfp = mop.

Proof. In any case mop(s) = λ = mfp(s). For b ≠ s, and with the same notation as in the previous proof,

we have mop(b) = ⋀_{P ∈π(b)} f (P ). Hence

T (mop(b)) = ⋀_{x∈pred(b)} fx→b (mop(x))
           = ⋀_{x∈pred(b)} fx→b ( ⋀_{P ∈π(x)} f (P ) )
           = ⋀_{x∈pred(b)} ⋀_{P ∈π(x)} fx→b ◦ f (P )
           = ⋀_{P ∈π(b)} f (P ) = mop(b)

so mop(b) is a fixed point of T and so is ≤ mfp(b).

This theorem was in essence first proved by Kildall in his Ph.D. thesis ([12]; summarized in [13]).

5.6 Two More Data-Flow Analysis Problems


Constant Folding

To keep the notation simple, in this section, we will revert to the convention that transfer functions are
associated with nodes, rather than with edges.
A form of constant folding can be handled as a data-flow analysis problem.
Let K be the set of variables in the program we are considering. The set of possible constant values of any
variable in K will be denoted by V (so typically, V is defined to be R, Z, or C). We will adjoin a “bottom”
value ⊥ and a “top” value ⊤ to V to create an augmented value space V ∗ . V ∗ is ordered as follows: there is
no order relation between members of V , but ⊥ is less than every member of V and ⊤ is greater than every
member of V . (So in particular, any chain in V ∗ has length ≤ 3.)

[Figure drawing omitted: ⊤ lies above the constants . . . , −2, −1, 0, 1, 2, . . . , all of which lie above ⊥.]

Figure 5.4: The constant folding lattice V ∗ .



The semi-lattice L will consist of all functions from K to V ∗ . L will be ordered in the obvious way pointwise
(using the ordering of V ∗ defined above). 0 is just the function which is ⊥ everywhere, and 1 is the function
which is ⊤ everywhere. L has infinitely many elements. However, since there are finitely many variables in
the program—n, say—the length of any chain of elements in L is ≤ 2n + 1; and hence L satisfies DCC, and
in particular is complete.
The intuition for this is that if φ is a solution of our data-flow analysis problem, and if x is a variable in the
program, then φ(x) has a value in V iff x is known to be constant, and the value of φ(x) is that constant.
φ(x) = ⊥ will mean that the variable x is not constant. φ(x) = ⊤ will mean that nothing is known about
x—this happens only during the course of the algorithm; when the algorithm is finished the value ⊤ will
never occur. (The idea is that the value of φ at each local variable will be initialized to ⊤; clearly it would
be a semantic violation to use such a variable prior to assigning to it.)
Thus, every statement s in the program will have a function fs ∈ F associated with it, and the transfer func-
tion for a basic block will be the functional composition of those functions corresponding to the statements
in that block. fs will be the identity function unless s is an assignment statement of one of the following
forms (where x and yj denote variables in the program):

x ← a where a is a constant. Then fs is defined by

fs (φ)(w) = φ(w)    if w ≠ x
fs (φ)(w) = a       if w = x

x ← θ(y1 , . . . , yn ) where the right-hand side is an expression involving only the variables y1 , . . . , yn and
possibly some constants. Then fs is defined by

fs (φ)(w) = φ(w)                        if w ≠ x; and otherwise,
fs (φ)(w) = θ(φ(y1 ), . . . , φ(yn ))   if all φ(yj ) are in V
fs (φ)(w) = ⊥                           if any φ(yj ) = ⊥

We don’t have to bother considering what happens when any φ(yj ) = ⊤, since as we noted above, this
would be semantically invalid.

We let F be the set of functions generated by the identity and these functions and closed under functional
composition. To show that ⟨L, ∧, F⟩ is a monotone data-flow analysis framework, it will be enough to show
that each of these generating functions is monotonic.
Suppose then that φ ≤ ψ are members of L. First, say s is a statement of the form x ← a. Then we have

fs (φ)(w) = φ(w) ≤ ψ(w) = fs (ψ)(w)


if w ≠ x, and they are equal if w = x; hence fs is monotonic.
Next, say s is a statement of the form x ← θ(y1 , . . . , yn ). If w ≠ x then as before, fs (φ)(w) ≤ fs (ψ)(w).
Otherwise, w = x. Now if any of the φ(yj ) = ⊥, then fs (φ)(w) = ⊥ ≤ fs (ψ)(w). Otherwise, none of the
φ(yj ) = ⊥ and therefore none of the ψ(yj ) = ⊥. As we noted before, we can assume that all φ(yj ) and ψ(yj )
are in V . But then since each φ(yj ) ≤ ψ(yj ), we must have φ(yj ) = ψ(yj ) for all j, so fs (φ)(w) = fs (ψ)(w).
Thus ⟨L, ∧, F⟩ is a monotone data-flow analysis framework. It is not, however, distributive, as the program
in Figure 5.5 shows.

[Figure drawing omitted: two basic blocks, one containing x ← 2; y ← 3 and the other x ← 3; y ← 2, both flowing into a block containing z ← x + y. At the exits of the first two blocks, φ1 maps x → 2, y → 3, z → ⊤ and φ2 maps x → 3, y → 2, z → ⊤; at the block z ← x + y, φ3 maps x → ⊥, y → ⊥, and z → ?.]

Figure 5.5: Non-Distributivity for Constant Folding

In this figure, φ1 ∧ φ2 is ⊥ at both x and y and ⊤ at z. Hence, fz←x+y (φ1 ∧ φ2 )(z) = ⊥. On the other hand,
fz←x+y (φ1 )(z) = fz←x+y (φ2 )(z) = 5, so fz←x+y (φ1 ∧ φ2 ) ≠ fz←x+y (φ1 ) ∧ fz←x+y (φ2 ); i.e. the framework is
not distributive.
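The failure of distributivity in Figure 5.5 can be replayed directly. In the Python sketch below, BOT and TOP are our stand-ins for ⊥ and ⊤, and the meet and the transfer function for z ← x + y follow the definitions given above:

BOT, TOP = "bottom", "top"

def meet_val(a, b):                       # meet in the lattice V* of Figure 5.4
    if a == TOP: return b
    if b == TOP: return a
    return a if a == b else BOT

def meet_env(phi, psi):                   # pointwise meet on environments K -> V*
    return {v: meet_val(phi[v], psi[v]) for v in phi}

def assign_add(x, y, z):                  # transfer function for the statement z <- x + y
    def f(phi):
        out = dict(phi)
        out[z] = BOT if BOT in (phi[x], phi[y]) else phi[x] + phi[y]
        return out
    return f

f    = assign_add("x", "y", "z")
phi1 = {"x": 2, "y": 3, "z": TOP}
phi2 = {"x": 3, "y": 2, "z": TOP}

print(f(meet_env(phi1, phi2))["z"])                 # bottom
print(meet_val(f(phi1)["z"], f(phi2)["z"]))         # 5, so f(φ1 ∧ φ2) ≠ f(φ1) ∧ f(φ2)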

Finding Dominators

The problem of finding dominators in a flow graph can be cast into the form of a distributive data-flow
analysis problem, as follows:
Given a flow graph ⟨G, s, e⟩, the lattice L will consist of the set of all subsets of G, ordered by inclusion. In
particular, 0 = ∅, and 1 = G. The function fn corresponding to a node n ∈ G will be

fn (S) = S ∪ {n}.

where S is any subset of G.


It is easy to see that functions of this form (together with the identity function) constitute a distributive

data-flow analysis framework. It is then obvious that the meet-over-paths solution (with the boundary value
condition Ss = {s}) yields precisely the set of dominators at each node, since it consists of those nodes which
occur on every path from s to that node.
This gives us a reasonably good way of finding dominators in directed graphs. And although it is not the
most efficient way, it is surely the simplest one conceptually.
Also, if we make the lattice meet operation ∧ correspond to ∪ instead of ∩, then the same data-flow analysis
problem, instead of yielding all the dominators of a node, yields all the “ancestors” of the node (with respect
to the whole graph; that is, all the nodes from which there is a path to the node under consideration).
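A direct Python sketch of this formulation (the four-node graph at the bottom is an invented example): each node's function adds the node itself, the meet is set intersection, and the boundary value at s is {s}.

def dominators(nodes, pred, s):
    nodes = list(nodes)
    dom = {b: set(nodes) for b in nodes}      # start every node at the top element 1 = G
    dom[s] = {s}                              # boundary condition
    changed = True
    while changed:
        changed = False
        for b in nodes:
            if b == s:
                continue
            new = set.intersection(*(dom[x] for x in pred[b])) | {b}   # fn(S) = S ∪ {n}
            if new != dom[b]:
                dom[b], changed = new, True
    return dom

pred = {"s": [], "a": ["s"], "b": ["s"], "c": ["a", "b"]}   # s -> a, s -> b, a -> c, b -> c
print(dominators(pred.keys(), pred, "s")["c"])              # {'s', 'c'}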

5.7 How Many Iterations are Needed?


Many data-flow analysis frameworks have a property which enables us to show that the iterative Algorithm
E for finding mfp will converge quickly. There are various forms of this property, but they all say in one
form or another that in finding the mop solution to a data-flow problem, we can restrict ourselves to a finite
number of paths.
Let us relax our assumption that all paths start at the entry node s of G. In general, our iterative algorithms
start with an initial value α(x) ∈ L at each node x ∈ G, where α(s) = λ and otherwise α(x) = 1; for a path

P : x0 → x1 → . . . → xn ,
we define

f (P ) = fxn−1 →xn ◦ fxn−2 →xn−1 ◦ . . . ◦ fx0 →x1 (α(x0 )).

Now each P ∈ π(b) either starts at s or can be extended to a path from s to b: say it does not start at s and
the new path is

P ′ : s = y0 → y1 → . . . → ym = x0 → x1 → . . . → xn = b.
Let us write

Q : s = y0 → y1 → . . . → ym = x 0 .
Then we have f (Q) ≤ 1 = α(x0 ), and so f (P ) ≥ f (P ′ ).
Thus, adding all these new paths (the ones not starting at s) to the meet in computing mop cannot change
the value of mop, since each such new path is bounded below by a path already in the meet (i.e., already
starting at s).
Now suppose it is actually true that a finite number of paths suffices to compute mop; in fact, say we happen
to know that for each b ∈ G
mop(b) = ⋀_{P ∈π(b)} f (P )

where π(b) denotes some finite set of paths ending at b. (Note that this is a new and different use of the
notation π(b); the paths do not need to start at s.)

By what we just saw above, if the path P is an element of π(b) and {Qi } is a finite set of paths such that
⋀ f (Qi ) ≤ f (P ), then P can be replaced in π(b) by {Qi }.
In Algorithm E (page 72), there is a choice made in the order in which the nodes are visited in the main
loop. Let us fix such an order once and for all. (We will later see that there is a natural order to use.)
Let us modify Algorithm E to produce Algorithm F in Figure 5.6. Given a particular path

P : x0 → x1 → . . . → xn ,
Algorithm F computes f (P ). By comparing Algorithm E with Algorithm F, we see that if Algorithm F
computes f (P ) in n(P ) iterations of the while loop, then after n(P ) iterations of the while loop in Algorithm
E, we must have m(b) ≤ f (P ).

begin
m(s) ← λ;
for each node x ∈ G − {s} do
m(x) ← 1;
end for;
j ← 2;
while j ≤ |P | do
for each b ∈ G − {s} do
if (b = xj ) then
m(xj ) ← fxj−1 →xj (m(xj−1 ));
j ← j + 1;
end if ;
end for;
end while;
end

Figure 5.6: Algorithm F


Modified Algorithm E; Computes f (P )

Thus, if for each b ∈ G there is a finite set π(b) of paths ending at b (and not necessarily starting at s) such
that
mop(b) = ⋀_{P ∈π(b)} f (P )

and if N is the maximum number of iterations of the while loop in Algorithm F to compute f (P ) for all
P ∈ ∪π(b), then after N iterations of the while loop in Algorithm E, we have
m(b) ≤ ⋀_{P ∈π(b)} f (P ) = mop(b)

for all b ∈ G, and therefore Algorithm E can be stopped after N iterations. (If in fact, the data-flow problem
is distributive, then after N iterations, mfp(b) ≤ m(b) ≤ mop(b) = mfp(b), so Algorithm E actually ends
after N iterations.)

There is a useful class of data-flow problems for which a finite number of paths suffices to compute mop;
this is the class of problems for which mop can be computed solely in terms of the simple paths. Since every
flow graph is finite, there are only finitely many simple paths, and the analysis we have given above applies.
How can we characterize such data-flow problems? First, some notation: let us denote the sub-path from
some node xj to another node xi of the path

P : x0 → x1 → . . . → xn ,

by xj →∗ xi , and let us write fxj →∗ xi for

fxi−1 →xi ◦ fxi−2 →xi−1 ◦ . . . ◦ fxj →xj+1 .

Thus, we have

f (P ) = fx0 →∗ xn (α(x0 )) = fxk →∗ xn ◦ fx0 →∗ xk (α(x0 ))

The mop solution is the meet of f (P ) over all such paths P which start at s. If xk = xn (so the path has a
cycle in it), then we would like to split the path into two paths, eliminating a cycle in the process; we will
show below how to do this.
So suppose the path x0 →∗ xn is split into x0 →∗ xk and xk →∗ xn . We would like the expression for f (P ) to
be an upper bound for

fxk →∗ xn (α(xk )) ∧ fx0 →∗ xk (α(x0 )).

(If this is true, then as we have seen above, we can replace the path P by the two paths x0 →∗ xk and
xk →∗ xn in computing mop.)
In particular, we want

f ◦ g(λ) ≥ f (1) ∧ g(λ)

for all f and g in F. Equivalently, setting x = g(λ), we want

(2) f (x) ≥ f (1) ∧ x.

It turns out that for distributive frameworks, this condition is sufficient:

5.11 Theorem If ⟨L, ∧, F⟩ is a distributive framework satisfying (2), then the mop solution of any instance of
the framework can be found by forming the meet of f (P ) over all cycle-free paths.

Proof. Let

P : s = x0 → x1 → . . . → xn = b

be a path from s to b which contains at least one cycle. Let j be the highest number such that xj equals
some xk with k > j. Then x0 →∗ xj+1 and xj+1 →∗ xn together contain at least one less cycle than the

original path, and the second path is actually simple. We have

f (P ) = fxj+1 →∗ xn ◦ fx0 →∗ xj+1 (λ)
       ≥ fxj+1 →∗ xn (1) ∧ fx0 →∗ xj+1 (λ)


Hence we can replace any P : s → b which contains a cycle by a finite set of paths each of which has fewer
cycles than P . By iterating this process, we can ultimately replace P by a finite set of cycle-free paths, and
so mop(b) can be computed entirely in terms of cycle-free paths to b.

Since there are only finitely many cycle-free paths, we see that for a distributive data-flow framework satis-
fying (2), mop can be calculated in terms of a finite number of paths. In fact, in this case, we can do even
better, provided the graph is reducible.
For reducible graphs, the loop-connectedness number (see page 39) lc(G) is ≤ dsl(G), where dsl(G) is the
derived sequence length of G, and in general is ≤ 3. Now suppose that in Algorithm F above, we walk each
iteration of the nodes (in the while loop) in reverse postorder. Since for each path P , the nodes occur in
reverse postorder except for the back arcs in P , we see that it will take at most lc(G) + 1 iterations of the
while loop for algorithm F to compute f (P ) provided P is simple. (That is, it will take 1 iteration to reach
the first back arc, and 1 more iteration to get past that arc and reach the next one, or the end of the path.)
For instance, if a path in some graph has the form

2 → 3 → 6 → 16 → 17 → 20 → 30 → 12 → 32 → 40 → 5 → 9 → 19

where the numbers are the reverse post-order numbers of the nodes in the path, then on the first iteration
of the while loop in Algorithm F, nodes 2 through 30 will be visited. (The first back arc is 30 → 12.) On the
next iteration, nodes 12 to 40 will be visited (up to the back arc 40 → 5). And then on the third iteration
of the while loop, nodes 5 through 19 will be visited and f (P ) will have been computed. Here there are 2
back arcs in P , and it took 3 executions of the while loop to compute f (P ).
To sum up, for distributive data-flow problems satisfying (2), Algorithm E needs only lc(G) + 1 iterations
of the while loop to compute mop. Of course, if we don’t know lc(G), then we will need an extra iteration
to confirm the “no change” test on the while loop, making the number of iterations lc(G) + 2. This result is
due to Kam and Ullman[9].
Data-flow frameworks where the L is a semi-lattice of boolean arrays (or, equivalently, of sets), and where
the functions f ∈ F are of the form

f (x) = (x ∩ a) ∪ b

are called bit-vector frameworks, because they are implemented using bit vectors.

5.12 Theorem (Hecht and Ullman[7]) The while loop in Algorithm E is entered at most lc(G) + 2 times for
bit-vector data-flow problems on reducible flow graphs, provided the nodes are visited in reverse postorder.

Proof. All we have to show is that the family of functions of the form

f (x) = (x ∩ a) ∪ b

satisfies (2). But

f (x) = (x ∩ a) ∪ b
⊇ (x ∩ a) ∪ (x ∩ b)
= x ∩ (a ∪ b)
= x ∩ f (1)

Thus, the natural way to visit the nodes in Algorithm E is in reverse post-order; and if the graph is reducible,
then we can actually get an effective bound on the running time of the algorithm when applied to bit-vector
problems.
This kind of analysis can be extended to data-flow frameworks for which mop can be computed by considering
paths containing at most k cycles. Again, the conditions only apply to distributive frameworks. Frameworks
such as these are sometimes called fast frameworks. See for instance Tarjan’s papers [21], [23], and [22].
More general conditions, not requiring distributivity, have also been considered by Graham and Wegman [4]
and Rosen [15]. These conditions do not yield an efficient algorithm for mop 4 ; but they do yield an efficient
algorithm which computes a solution which is guaranteed to be between mfp and mop. These conditions,
however, are not broad enough to include the problem of constant propagation considered earlier.

5.8 Algorithms Based on Reducibility


The algorithm we have been using up to this point consists of iterating over all the nodes of G a certain
number of times. For this reason, it is called the iterative algorithm. For bit-vectoring problems where the
flow graph is known to be reducible, there are some algorithms which potentially can save execution time,
at the cost of managing more complicated data structures.
We will consider two such algorithms here. Both algorithms take their most natural form for reducible
graphs, but both can be extended to apply to general flow graphs. Let us start, however, by assuming that
G is reducible. Both these algorithms lean heavily on the fact that we can restrict ourselves to cycle-free
paths when computing the data-flow solutions.
Since the algorithms are based on the derived sequence of a reducible flow graph, we cannot consider backward
data-flow problems to be forward problems on the reverse flow graph—as we noted previously, the reverse
graph of a reducible flow graph is not necessarily reducible. Thus separate algorithms are necessary for
forward and backward problems.
Given a path Q in G of the form

Q : q0 → q1 → . . . → qs ,

let us write

fQ = fqs−1 →qs ◦ fqs−2 →qs−1 ◦ . . . ◦ fq0 →q1 .

The algorithms will be based on the following lemma:


4 In a negative direction, Kam and Ullman [10] have shown that there is no general algorithm which computes mop for all
non-distributive problems.

5.13 Lemma Let ⟨L, ∧, F⟩ and ⟨G, s⟩ be an instance of a distributive data-flow analysis framework satisfying
DCC and (2). If all simple paths from s to x in G pass through a node k, then

mop(x) = ( ⋀_{paths Q from k to x} fQ ) (mop(k)).

Proof. Each simple path P from s to x corresponds uniquely to a pair of paths R from s to k, and Q from
k to x, by the composition “P is R followed by Q”; and conversely. Correspondingly, we have fP = fQ ◦ fR .
Hence by Lemma 5.8,
    mop(x) = ∧_{paths P from s to x} fP (λ)
           = ∧_{paths Q from k to x} ∧_{paths R from s to k} (fQ ◦ fR)(λ)
           = ∧_{paths Q from k to x} fQ ( ∧_{paths R from s to k} fR (λ) )
           = ∧_{paths Q from k to x} fQ (mop(k))
           = ( ∧_{paths Q from k to x} fQ ) (mop(k)).
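To see the factoring of Lemma 5.13 at work, here is a small sketch (Python; the toy graph, its transfer functions, and the initial value λ are all invented for illustration). In the graph every path from s to x passes through k, and the meet-over-paths value at x is recovered by applying the k-to-x path functions to mop(k):

    from functools import reduce

    UNIVERSE = frozenset(range(4))
    LAMBDA = frozenset({0, 1})             # illustrative initial value λ at the entry

    def tf(a, b):                          # transfer function f(x) = (x ∩ a) ∪ b
        return lambda x: (x & a) | b

    # Edge transfer functions of a toy graph in which every path s→x goes through k:
    #     s → u → k → w → x      and      s → v → k → z → x
    f_su, f_uk = tf(frozenset({0, 1, 2}), frozenset()), tf(UNIVERSE, frozenset({3}))
    f_sv, f_vk = tf(frozenset({1, 2, 3}), frozenset()), tf(UNIVERSE, frozenset({0}))
    f_kw, f_wx = tf(frozenset({0, 2, 3}), frozenset()), tf(UNIVERSE, frozenset({1}))
    f_kz, f_zx = tf(frozenset({0, 1, 3}), frozenset()), tf(UNIVERSE, frozenset({2}))

    def compose(*fs):
        # fs listed in path order: the result applies them first-to-last,
        # i.e. it is f_{last arc} ∘ ... ∘ f_{first arc}.
        return lambda x: reduce(lambda acc, f: f(acc), fs, x)

    paths_s_to_k = [compose(f_su, f_uk), compose(f_sv, f_vk)]
    paths_k_to_x = [compose(f_kw, f_wx), compose(f_kz, f_zx)]

    mop_k = frozenset.intersection(*(fR(LAMBDA) for fR in paths_s_to_k))
    mop_x = frozenset.intersection(*(compose(fR, fQ)(LAMBDA)
                                     for fR in paths_s_to_k for fQ in paths_k_to_x))

    # Lemma 5.13: mop(x) = ( ∧_Q fQ )(mop(k)), the meet of functions being pointwise.
    assert mop_x == frozenset.intersection(*(fQ(mop_k) for fQ in paths_k_to_x))
    print(sorted(mop_x))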

5.8.1 Allen and Cocke’s Algorithm

This algorithm is essentially that presented in Allen and Cocke[3], although the notation is closer to that
used in the Schwartz and Sharir algorithm, which we present subsequently.

G = G0 , G1 , . . . , Gn = {g}
will be the derived sequence of G. The function φj will map each interval of Gj to its corresponding point in
Gj+1 . The function hj , applied to an interval in Gj , returns the element of Gj which is the header of that
interval.
Let us also define a function θj : Gj → G as follows: θ0 is just the identity function on G. If j ≥ 1, then
θj (x) is that element of G which represents the “ultimate header” of x. (For instance, in the example of
Figure 3.11 on page 42, θ3 (C3) = C.) Formally, we can define θj recursively as follows:

    θj(x) = x                               if j = 0
    θj(x) = θj−1(hj−1(φ⁻¹j−1(x)))           if j > 0

The functions fx→y are assumed to have already been computed, and are denoted as f0 (x, y).
These functions may seem complicated. The point to keep in mind is that they really come for free—they
just represent the information in the data structure representing the flow graph and its derived sequence.
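A minimal sketch of how θj can be computed by chasing headers down the derived sequence (Python; the maps interval_of and header_of are hypothetical stand-ins for φj⁻¹ and hj, and the tiny example graph is invented):

    # interval_of[j][x] -- the interval of G_j that collapsed to node x of G_{j+1}   (φ_j⁻¹)
    # header_of[j][K]   -- the header node (in G_j) of interval K of G_j             (h_j)

    def theta(j, x, interval_of, header_of):
        """Return θ_j(x): the node of G = G_0 that is the ultimate header of x ∈ G_j."""
        if j == 0:
            return x
        K = interval_of[j - 1][x]          # φ_{j-1}⁻¹(x), an interval of G_{j-1}
        return theta(j - 1, header_of[j - 1][K], interval_of, header_of)

    # Tiny made-up example: G0 = {A,B,C} -> G1 = {a,c} -> G2 = {g}.
    interval_of = [
        {"a": frozenset({"A", "B"}), "c": frozenset({"C"})},
        {"g": frozenset({"a", "c"})},
    ]
    header_of = [
        {frozenset({"A", "B"}): "A", frozenset({"C"}): "C"},
        {frozenset({"a", "c"}): "a"},
    ]
    print(theta(2, "g", interval_of, header_of))   # -> A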
Let us make the convention that P is a path in Gj of the form

P : p0 → p1 → . . . → pr

and Q is a path in G of the form

Q : q0 → q1 → . . . → qs .

ι will represent the identity function on the semi-lattice L.

The Forward Algorithm

The algorithm consists of two passes over the derived sequence. In the first pass, which proceeds from inner
to outer intervals (that is, from G0 to Gn ), we compute at each stage (that is for each Gj ) three sets of
functions:

• For each arc x → y in Gj , fj (x, y), which will be computed recursively from the computations in Gj−1 ,
will represent data flow from θj (x) to θj (y):

        fj(x, y) = ∧_{paths Q from θj(x) to θj(y)} fqs−1→qs ◦ fqs−2→qs−1 ◦ . . . ◦ fq0→q1

(where q0 = θj(x) and qs = θj(y)).

• For each interval ⟨K, k⟩ in Gj and each x ∈ K, gj(x) is the function which represents data flow from
k to x within K:

        gj(x) = ∧_{paths P from k to x} fj(pr−1, pr) ◦ fj(pr−2, pr−1) ◦ . . . ◦ fj(p0, p1)
              = ∧_{paths Q from θj(k) to θj(x)} fqs−1→qs ◦ fqs−2→qs−1 ◦ . . . ◦ fq0→q1

(where now p0 = k and pr = x, and q0 = θj(k) and qs = θj(x)).


Thus for all x ∈ K,

        gj(x) = ∧_{y ∈ predj(x), y ∈ K} fj(y, x) ◦ gj(y).

• For each successor x of K, Fj(K, x) is the function which represents data flow from k through K and
out to x:

        Fj(K, x) = ∧_{y ∈ predj(x), y ∈ K} fj(y, x) ◦ gj(y).

Fj will then be reinterpreted as fj+1 when Gj+1 is processed.

In computing gj(x), we can restrict ourselves to cycle-free paths from k to x. Since K is an interval, this
means that gj(x) can be computed in one pass over the nodes in K, except for paths containing a back arc.
The only such paths end at k; so gj(k) is computed separately, by a final meet over paths from its
predecessors.
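Note that for bit-vector problems all of the functions manipulated in this pass (the fj, gj, and Fj) remain of the form f(x) = (x ∩ a) ∪ b, since that family is closed under composition and meet; so each function can be stored simply as the pair (a, b). A minimal sketch of the closed-form meet and composition on such pairs (Python; not from the text, but easily checked against the definitions):

    def apply_bv(f, x):
        a, b = f
        return (x & a) | b                       # f(x) = (x ∩ a) ∪ b

    def meet(f, g):
        """Pointwise meet: (f ∧ g)(x) = f(x) ∩ g(x); the result is again a pair (a, b)."""
        (a1, b1), (a2, b2) = f, g
        return ((a1 | b1) & (a2 | b2), b1 & b2)

    def compose(g, f):
        """(g ∘ f)(x) = g(f(x)) = (x ∩ a1 ∩ a2) ∪ ((b1 ∩ a2) ∪ b2)."""
        (a1, b1), (a2, b2) = f, g
        return (a1 & a2, (b1 & a2) | b2)

    # Quick sanity check on arbitrary sets:
    f = (frozenset({0, 1, 2}), frozenset({4}))
    g = (frozenset({1, 2, 3}), frozenset({0}))
    x = frozenset({0, 2, 3})
    assert apply_bv(meet(f, g), x) == apply_bv(f, x) & apply_bv(g, x)
    assert apply_bv(compose(g, f), x) == apply_bv(g, apply_bv(f, x))
    # The update gj(x) ← ∧_y fj(y, x) ∘ gj(y) then costs a few set operations per arc.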
The second pass over the derived sequence proceeds from outer to inner intervals. mn(g) is set equal to λ.
Then for each interval ⟨K, k⟩ in Gj, mj(k) is set equal to mj+1(φj(K)). Finally, for each x ∈ K, mj(x) is
set equal to gj(x)(mj(k)). By induction, we see that at each step of this pass, for each interval ⟨K, k⟩ in Gj,

• mj(k) is set equal to the mop solution at θj(k).

• Then, since for each x ∈ K, any cycle-free path from s to θj(x) must pass through θj(k),
Lemma 5.13 shows that mj(x) is set equal to the mop solution at θj(x).

Hence, at the end of the algorithm (given in Figure 5.7), m0 is the mop solution to the data-flow problem.
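Read procedurally, the second pass is just two assignments per interval. Here is a minimal sketch of Step II (Python; the containers intervals_of, phi, and g, and the use of the string "g" to name the single node of Gn, are assumptions made for this sketch):

    def step_two(n, intervals_of, phi, g, lam):
        """Outer-to-inner propagation: returns m[0], the mop solution on G = G_0.

        intervals_of[j] -- iterable of (K, k) pairs: the intervals of G_j and their headers
        phi[j][K]       -- the node of G_{j+1} that interval K of G_j collapses to
        g[j][x]         -- the function g_j(x) computed in Step I (from the header to x)
        lam             -- the initial value λ at the single node of G_n
        """
        m = [dict() for _ in range(n + 1)]
        m[n]["g"] = lam                          # the single node of G_n (called g in the text)
        for j in range(n - 1, -1, -1):
            for K, k in intervals_of[j]:
                m[j][k] = m[j + 1][phi[j][K]]    # header gets the value of its image in G_{j+1}
                for x in K:
                    if x != k:
                        m[j][x] = g[j][x](m[j][k])
        return m[0]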

The Backward Algorithm

The modification of the Allen-Cocke algorithm for backwards data-flow propagation is due to Kennedy. His
version, and the original references, can be found in [11].
The backwards problem is complicated by the fact that, although intervals are single-entry, they are not
single-exit. Since we are following data-flow backwards through the flow graph, we have to take account of
the fact that intervals can be entered at different points.
As usual, the exit node of the flow graph is denoted by e. Let us denote the image of e in Gj by ej (so in
particular, e0 = e and en = g). Further, let us adjoin a dummy “post-exit” node δj to each Gj, together with
a single edge ej → δj. δj is not regarded as a member of Gj, and in particular is not a member of any interval
of Gj; its only purpose is as a technical convenience in the algorithm. We set fe0→δ0 ← ι.
Again we make two passes over the derived sequence. In the first pass, which again proceeds from inner to
outer intervals, (that is, from G0 to Gn ), we compute for each Gj three sets of functions:

• For each arc x → y in Gj , fj (x, y), which will be computed recursively from the computations in Gj−1 ,
will represent data flow from θj (y) backwards to θj (x):
        fj(x, y) = ∧_{paths Q from θj(x) to θj(y)} fq0→q1 ◦ fq1→q2 ◦ . . . ◦ fqs−1→qs

-- Step I: Inner-to-outer interval processing.
for j = 0 to n − 1 do
    if j ≠ 0 then
        -- f0 is already defined.
        for each arc x → y in Gj do
            fj(x, y) ← Fj−1(φ⁻¹j−1(x), hj−1(φ⁻¹j−1(y)));
        end for;
    end if;
    for each interval ⟨K, k⟩ in Gj do
        for each x ∈ K do
            initialize gj(x) to the constant function 1;
        end for;
        repeat
            for each x ∈ K − {k} do
                gj(x) ← ∧_{y ∈ predj(x), y ∈ K} fj(y, x) ◦ gj(y);
            end for;
        until no change;
        gj(k) ← ∧_{y ∈ predj(k), y ∈ K} fj(y, k) ◦ gj(y);
        for each successor x of K do
            Fj(K, x) ← ∧_{y ∈ predj(x), y ∈ K} fj(y, x) ◦ gj(y);
        end for;
    end for;
end for;

-- Step II: Outer-to-inner interval processing.
mn(g) ← λ;
for j = n − 1 to 0 step −1 do
    for each interval ⟨K, k⟩ in Gj do
        -- mj+1 is known.
        mj(k) ← mj+1(φj(K));
        for each x ∈ K − {k} do
            mj(x) ← gj(x)(mj(k));
        end for;
    end for;
end for;

The repeat loop only has to be executed once, provided the nodes in its for loop are visited in
reverse post-order.

Figure 5.7: Allen and Cocke’s Algorithm: Forward Data-Flow Propagation



(where q0 = θj(x) and qs = θj(y). Note the order in which the functions are composed to form each
fQ—this is backwards propagation.)

• For each interval ⟨K, k⟩ in Gj, each x ∈ K, and each y ∈ succ(K), gj(x, y) represents the data flow
backwards from the beginning of y (i.e., from the exit of K into y) back through K to x:

        gj(x, y) = ∧_{paths P from x to y} fj(p0, p1) ◦ fj(p1, p2) ◦ . . . ◦ fj(pr−1, pr)
                 = ∧_{paths Q from θj(x) to θj(y)} fq0→q1 ◦ fq1→q2 ◦ . . . ◦ fqs−1→qs

Thus for all x ∈ K,

        gj(x, y) = ∧_{x→w, w ∈ K or w = y} fj(x, w) ◦ gj(w, y),

with the convention that gj(y, y) = ι.

• For each successor y of K,

Fj (K, y) = gj (k, y).

In computing gj (x, y), if we visit the nodes x in K in reverse post-order, then at most two passes over the
interval will be necessary, since any simple path from x to a successor y of K can contain at most one back
arc (and this will be true iff the path contains the interval header k).
The second pass over the derived sequence proceeds from outer to inner intervals. We compute a function
γj : Gj → F. If x ∈ Gj , γj (x) will represent data flow backward from the exit e to θj (x) (that is, it will be
the meet of all functions fQ where Q runs over all paths from θj (x) to the exit).
γj is computed as follows: first, γn(g) is set equal to gn(g, δn). For each j (stepping down from n − 1 to 0),
for each interval ⟨K, k⟩ in Gj, we first set γj(k) to be γj+1 of the image of K in Gj+1. Then, again for each
interval ⟨K, k⟩ in Gj and for each node x ∈ K − {k}, we set

        γj(x) ← ∧_{y ∈ succ(K)} gj(x, y) ◦ γj(y).

We see by induction that for each interval ⟨K, k⟩ of Gj,

• γj (k) is the meet of all functions fQ where Q runs over all paths from θj (k) to the exit.

• Since for each x ∈ K, any cycle-free path from s to θj(x) must pass through θj(k), a minor variant
of Lemma 5.13 shows that γj(x) is the meet of all functions fQ where Q runs over all paths from θj(x)
to the exit.

Hence, at the end of the algorithm, the mop solution will be m(x) = γ0 (x)(λ) for all x ∈ G.
The complete algorithm is given in Figure 5.8.
Kennedy’s original algorithm did not use the fact that we know gj(x, y) for all x ∈ K, and therefore computed γj as follows:
after first computing γj for each interval header in Gj as above, for each ⟨K, k⟩ in Gj the nodes x ∈ K − {k} are visited in
post-order. For each such node x, we set

                                    ⎧ fj(x, y) ◦ γj(y)    if y ∈ K
        γj(x) ← ∧_{y ∈ succ(x)}     ⎨
                                    ⎩ gj(x, y) ◦ γj(y)    otherwise

Since the nodes x ∈ K are visited in post-order, each y ∈ succ(x) either

1. has already been visited in the interval; or

2. is the target of a back arc in the interval, in which case it is the interval header; or

3. is the target of an arc to another interval, in which case it is the header of that other interval.

In each case, γj(y) has already been assigned. Thus, one pass through each interval computes γj.

5.8.2 Schwartz and Sharir’s Algorithm

One inefficiency in the algorithms as just presented, is that there may be nodes in one of the derived graphs
which are unaffected by the next reduction, and so remain in the next derived graph. (See, for instance,
Figure 3.10 on page 41.) Certainly there is no reason to visit these intervals again.
Schwartz and Sharir [16] noticed that the tree of reachunder sets created by Tarjan’s algorithm for testing
reducibility could be used instead of the tree reflecting the nesting of intervals of all orders to obtain a more
efficient algorithm.
So instead of dealing with the derived sequence of G, we will deal with the sequence {ρj = ρj (kj ) : 1 ≤ j ≤ n}
defined as follows: for 1 ≤ j < n, ρj is the jth reachunder set produced by Tarjan’s algorithm, and ρn is the
limit graph produced by the algorithm. (Since G is assumed to be reducible, ρn is a DAG.) In each case, kj
is the header of ρj .
When a reachunder set ρj is identified by Tarjan’s algorithm, it is collapsed into a point pj , which becomes
an element of the next set ρj+1 . Instead of separate functions fj , we will have one function f (x, y), where x
and y may be points in G or elements pj , as appropriate. The forward algorithm is given in Figure 5.9.
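The bookkeeping in Step I of Figure 5.9 can be viewed as rewriting a single edge-function table f each time a reachunder set is collapsed. Here is a minimal sketch of one collapse step (Python; the names collapse, preds, and succs, and the representation of transfer functions as closures over sets, are assumptions of this sketch; the inner fixed-point computation of gj is taken as already done):

    from functools import reduce

    def fmeet(fs):
        """Pointwise meet of a non-empty collection of set-to-set transfer functions."""
        fs = list(fs)
        return lambda x: reduce(lambda acc, f: acc & f(x), fs[1:], fs[0](x))

    def fcompose(g, f):
        """(g ∘ f)(x) = g(f(x))."""
        return lambda x: g(f(x))

    def collapse(rho, k, p, f, g, preds, succs):
        """Collapse the reachunder set rho (with header k) into the new node p.

        f     -- dict mapping arcs (x, y) to transfer functions
        g     -- dict mapping each x in rho to g_j(x), the function from k to x within rho
        preds -- dict: node -> set of its current predecessors
        succs -- dict: node -> set of its current successors
        (Updating the adjacency sets for p and removing rho's nodes is elided here.)
        """
        # Arcs leaving rho: f(p, y) is the meet, over rho-internal predecessors x of y,
        # of f(x, y) ∘ g(x)  -- cf. the "for each successor" loop of Figure 5.9.
        for y in {w for x in rho for w in succs[x]} - rho:
            f[(p, y)] = fmeet(fcompose(f[(x, y)], g[x])
                              for x in rho if y in succs[x])
        # Arcs entering rho: rho is entered only at its header k, so f(x, p) = f(x, k).
        for x in preds[k] - rho:
            f[(x, p)] = f[(x, k)]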
For the backward algorithm, we only need one additional node δ, which, since it is not part of any reachunder
set (or in fact, part of G), is never collapsed into any of the points {pj }. We set f (e, δ) ← ι.
The backward algorithm is given in Figure 5.10.

5.8.3 Dealing With Non-Reducible Flow Graphs

The Schwartz-Sharir algorithm can easily be modified to handle non-reducible flow graphs, as follows. During
the computation of reachunder sets in Tarjan’s algorithm, the nodes kj which are targets of back arcs are
visited in reverse pre-order. If the reachunder set of kj includes s (so we know that G is not reducible), then
no action is taken other than marking kj as “improper”. Then the next reachunder set ρj+1 is computed as
usual, and the algorithm proceeds. If any later ρj (or the terminal DAG ρn) contains an “improper” node,
it is also marked as “improper”.

-- Step I: Inner-to-outer interval processing.
for j = 0 to n − 1 do
    if j ≠ 0 then
        -- f0 is already defined.
        for each arc x → y in Gj do
            fj(x, y) ← Fj−1(φ⁻¹j−1(x), hj−1(φ⁻¹j−1(y)));
        end for;
        fj(ej, δj) ← ι;
    end if;
    for each interval ⟨K, k⟩ in Gj do
        for each x ∈ K and y ∈ succ(K) do
            initialize gj(x, y) to the constant function 1;
            initialize gj(y, y) to ι;
        end for;
        repeat
            for each x ∈ K and y ∈ succ(K) do
                gj(x, y) ← ∧_{x→w, w ∈ K or w = y} fj(x, w) ◦ gj(w, y);
            end for;
        until no change;
        for each successor y of K do
            Fj(K, y) ← gj(k, y);
        end for;
    end for;
end for;

-- Step II: Outer-to-inner interval processing.
γn(g) ← gn(g, δn);
for j = n − 1 to 0 step −1 do
    for each interval ⟨K, k⟩ in Gj do
        γj(k) ← γj+1(φj(K));
    end for;
    for each interval ⟨K, k⟩ in Gj do
        for each x ∈ K − {k} do
            γj(x) ← ∧_{y ∈ succ(K)} gj(x, y) ◦ γj(y);
        end for;
    end for;
end for;

-- Step III: Final propagation step.
for each x ∈ G do
    m(x) ← γ0(x)(λ);
end for;

The repeat loop only has to be executed twice, provided the nodes in its for loop are visited in
reverse post-order.

Figure 5.8: Allen and Cocke’s Algorithm: Backward Data-Flow Propagation



-- Step I: Inner-to-outer interval processing.
for j = 0 to n − 1 do
    for each x ∈ ρj do
        initialize gj(x) to the constant function 1;
    end for;
    repeat
        for each x ∈ ρj − {kj} do
            gj(x) ← ∧_{y ∈ predj(x), y ∈ ρj} f(y, x) ◦ gj(y);
        end for;
    until no change;
    gj(kj) ← ∧_{y ∈ predj(kj), y ∈ ρj} f(y, kj) ◦ gj(y);
    for each successor x of ρj do
        f(pj, x) ← ∧_{y ∈ predj(x), y ∈ ρj} f(y, x) ◦ gj(y);
    end for;
    for each predecessor x of ρj do
        f(x, pj) ← f(x, kj);
    end for;
end for;

-- Step II: Outer-to-inner interval processing.
mn(pn) ← λ;
for j = n − 1 to 0 step −1 do
    -- mj+1 is known.
    mj(kj) ← mj+1(pj);
    for each x ∈ ρj − {kj} do
        mj(x) ← gj(x)(mj(kj));
    end for;
end for;

If G is reducible, the repeat loop only has to be executed once, provided the nodes in its for loop
are visited in reverse post-order.

Figure 5.9: Schwartz and Sharir’s Algorithm: Forward Data-Flow Propagation



-- Step I: Inner-to-outer interval processing.
for j = 0 to n − 1 do
    for each x ∈ ρj and y ∈ succ(ρj) do
        initialize gj(x, y) to the constant function 1;
        initialize gj(y, y) to ι;
    end for;
    repeat
        for each x ∈ ρj do
            gj(x, y) ← ∧_{x→w, w ∈ ρj or w = y} f(x, w) ◦ gj(w, y);
        end for;
    until no change;
    for each successor y of ρj do
        f(pj, y) ← gj(kj, y);
    end for;
    for each predecessor x of ρj do
        f(x, pj) ← f(x, kj);
    end for;
end for;

-- Step II: Outer-to-inner interval processing.
γn(g) ← gn(g, δ);
for j = n − 1 to 0 step −1 do
    γj(kj) ← γj+1(pj);
    for each x ∈ ρj − {kj} do
        γj(x) ← ∧_{y ∈ succ(ρj)} gj(x, y) ◦ γj(y);
    end for;
end for;

-- Step III: Final propagation step.
for each x ∈ G do
    m(x) ← γ0(x)(λ);
end for;

If G is reducible, the repeat loop only has to be executed twice provided the nodes in its for loop
are visited in reverse post-order; and when j = n, it only has to be executed once (since ρn is
known to be a DAG).

Figure 5.10: Schwartz and Sharir’s Algorithm: Backward Data-Flow Propagation



The only change that then has to be made in the algorithms is that in Step I of each, the repeat loop for an
improper ρj really does have to be repeated until no change occurs. That is, improper reachunder sets (or an
improper terminal DAG) are handled by the iterative algorithm.
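Concretely, the change amounts to guarding the early exit from the repeat loop. A minimal sketch of that control structure (Python; nodes_in_rpo, update, and the improper flag are placeholders for the corresponding pieces of Figures 5.9 and 5.10):

    def solve_region(nodes_in_rpo, update, improper, quick_bound):
        """One region's repeat loop (a reachunder set or the terminal DAG).

        nodes_in_rpo -- the region's nodes in reverse post-order
        update(x)    -- recompute the value/function at x; returns True if it changed
        improper     -- True if the region was marked improper
        quick_bound  -- 1 (forward) or 2 (backward): the pass bound valid for proper regions
        """
        passes = 0
        while True:
            changed = False
            for x in nodes_in_rpo:
                if update(x):
                    changed = True
            passes += 1
            if improper:
                if not changed:        # improper regions: iterate to a fixed point
                    return
            elif passes >= quick_bound:
                return                 # proper regions: the reducibility bound applies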
Bibliography

[1] Alfred V. Aho, John Hopcroft, and Jeffrey D. Ullman. Data Structures and Algorithms. Addison-Wesley,
Reading, Massachusetts, 1983.

[2] Alfred V. Aho, Ravi Sethi, and Jeffrey D. Ullman. Code optimization and finite Church-Rosser sys-
tems. In Randall Rustin, editor, Design and Optimization of Compilers, pages 89–106. Prentice-Hall,
Englewood Cliffs, New Jersey, 1971.

[3] F. E. Allen and John Cocke. A program data flow analysis procedure. Communications of the ACM,
19(3):137–147, 1976.

[4] Susan L. Graham and Mark Wegman. A fast and usually linear algorithm for global flow analysis.
Journal of the ACM, 23(1):172–202, January 1976.

[5] Matthew S. Hecht. Flow Analysis of Computer Programs. North-Holland, New York, 1977.

[6] Matthew S. Hecht and Jeffrey D. Ullman. Flow graph reducibility. SIAM Journal on Computing,
1(2):188–202, June 1972.

[7] Matthew S. Hecht and Jeffrey D. Ullman. Analysis of a simple algorithm for global flow problems.
In Conference Record of the First Annual ACM Symposium on Principles of Programming Languages,
Boston (Mass.), pages 207–217, October 1973.

[8] Matthew S. Hecht and Jeffrey D. Ullman. Characterizations of reducible flow graphs. Journal of the
ACM, 21(3):367–375, July 1974.

[9] John B. Kam and Jeffrey D. Ullman. Global data flow analysis and iterative algorithms. Journal of the
ACM, 23(1):158–171, January 1976.

[10] John B. Kam and Jeffrey D. Ullman. Monotone data flow analysis frameworks. Acta Informatica,
7(3):305–317, 1977.

[11] Ken Kennedy. A Survey of Data Flow Analysis Techniques. In Steven S. Muchnick and Neil D. Jones,
editors, Program Flow Analysis: Theory and Applications, pages 5–54. Prentice-Hall, Englewood Cliffs,
New Jersey, 1981.

[12] Gary Arlen Kildall. Global Expression Optimization During Compilation. PhD thesis, University of
Washington, Seattle, Washington, May 1972.


[13] Gary Arlen Kildall. Global expression optimization during compilation. In Conference Record of the
First Annual ACM Symposium on Principles of Programming Languages, Boston (Mass.), pages 194–
206, October 1973.

[14] Donald Knuth. An empirical study of FORTRAN programs. Software Practice and Experience,
1(12):105–134, 1974.

[15] Barry K. Rosen. Monoids for rapid data flow analysis. SIAM Journal on Computing, 9(1):159–196,
1980.

[16] J. T. Schwartz and M. Sharir. A design for optimizations of the bitvectoring class. Technical Report 17,
Courant Institute of Mathematical Sciences, Computer Science Department, September 1979.

[17] M. Sharir. A strong-connectivity algorithm and its applications in data flow analysis. Computation and
Mathematics with Applications, 7(1):67–72, 1981.

[18] M. Sharir. Structural analysis: A new approach to flow analysis in optimizing compilers. Computer
Languages, 5(3/4):141–153, 1981.

[19] Robert Endre Tarjan. Finding dominators in directed graphs. SIAM Journal on Computing, 3(1):62–89,
1974.

[20] Robert Endre Tarjan. Testing flow graph reducibility. J. Computer and System Sciences, 9:355–365,
1974.

[21] Robert Endre Tarjan. Iterative algorithms for global flow analysis. In J. F. Traub, editor, Algorithms
and Complexity: New Directions and Recent Results, pages 71–102. Academic Press, 1976.

[22] Robert Endre Tarjan. Fast algorithms for solving path problems. Journal of the ACM, 28(3):594–614,
1981.

[23] Robert Endre Tarjan. A unified approach to path problems. Journal of the ACM, 28(3):577–593, 1981.

[24] Alfred Tarski. A lattice-theoretical fixpoint theorem and its applications. Pacific Journal of Mathemat-
ics, 5:285–309, 1955.
Index

ancestor, 2, 19
arc
  back, 3
  cross, 4
  forward, 3
  tree, 3
available expression, 59
backward equations, 63
boundary value equations, 62
chain, 45
  height, 45
child, 2, 19
collapsing a flow graph, 47
confluence equations, 63
consumes, 47
cycle, 19
cycle-free, 19
data flow analysis framework
  monotone, 69
data-flow problem
  distributive, 65
  monotone, 69
derived sequence, 29
derived sequence length, 39
descendant, 2, 19
descending chain condition, 72
distributive, 73
dominator, 20
  immediate, 21
  strong, 20
dominator tree, 21
dsl, see derived sequence length
entry node, 24
finite Church-Rosser transformation, 45
finite relation, 45
flow graph, 19
  derived, 28
  kernel, 51
  limit, 29
  reverse, 59
  trivial, 19
forward equations, 63
framework
  bit-vector, 81
  fast, 82
  monotone, 69
function
  distributive, 65
  monotonic, 68
head, 1
header, 24
interior, 19
interval, 24
interval nesting, 40
interval nesting tree, 42
irreducible, 29
kernel, 51
lattice, 68
lc, see loop-connectedness number
limit point, 45
live variable, 60
loop, 19
loop-connectedness number, 39
lower semi-lattice, 67
  complete, 68
maximal fixed-point solution, 72
meet-over-paths solution, 70
mfp, see maximal fixed-point solution
monotone data flow analysis framework
  instance, 70
mop, see meet-over-paths solution
nonreducible, 29
Path Lemma, 11
pre-interval, 24
predecessor, 2, 19
propagation equations, 63
reaching definition, 60
reachunder set, 53
reducible, 29
reduction ordering, 56
region, 24
reverse graph, 12
root of a strongly connected component, 13
self-loop, 19
simple, 19
source, 1
successor, 2, 19
tail, 1
target, 1
transfer equations, 63
very busy expression, 61
