Optimal Sequential Partitions of Graphs (Brian W. Kernighan) (1971)
BRIAN W. KERNIGHAN
Bell Telephone Laboratories, Incorporated, Murray Hill, New Jersey
ABSTRACT. This paper presents an algorithm for finding a minimum cost partition of the nodes
of a graph into subsets of a given size, subject to the constraint that the sequence of the nodes
may not be changed, that is, that the nodes in a subset must have consecutive numbers. The
running time of the procedure is proportional to the number of edges in the graph. One possi-
ble application of this algorithm is in partitioning computer programs into pages for operation
in a paging machine. The partitioning minimizes the number of transitions between pages.
KEY WORDS AND PHRASES: graph partitioning, graph theory, partitioning, segmentation,
pagination
1. Introduction
Consider a graph G in which the n nodes have sizes and the edges have costs. A
partitioning problem is to divide G into subsets so that the sum of costs on edges
joining nodes in different subsets is minimum. This is often called "clustering"; it
is usually stated (equivalently) in terms of maximizing the sums of costs within
subsets.
A related, more difficult problem is to find minimum cost partitions in which the
size of any subset is limited or bounded. This occurs, for instance, in placing cir-
cuit components on boards to minimize interboard connections; the size limitation
arises because a circuit board can hold only a certain number of components. This
general problem may be solved exactly for very small n using such methods as
branch and bound [1]; heuristic methods yield approximate solutions for much
larger values [2].
This paper discusses a special case for which exact solutions may be found in a
time proportional to the number of edges of the graph. Suppose the nodes of G are
numbered 1, ..., n. Then the restriction is that the nodes in any subset must
have consecutive numbers; that is, they must follow the ordering imposed by
the graph.
As an example, let the nodes of G represent instructions in a computer program,
and the edges possible successors of the instructions. The costs on the edges will
be the relative frequencies of transitions. Then finding a minimum cost partition of
G into subsets of size less than or equal to p is essentially dividing the program
into pages of size p so as to minimize the frequency of interpage transitions. The
constraint against reordering the nodes corresponds to keeping the instructions in
the order originally produced by a programmer or compiler. In this case, the problem
is to find the best places in the sequence of instructions to start new pages.
An earlier version of this paper (Optimal segmentation points for programs) was presented
at the 2nd ACM Symposium on Operating Systems Principles, Princeton, N. J. (Oct. 1969).
Journal of the Association for Computing Machinery, Vol. 18, No. 1, January 1971, pp. 34-40.
The most closely parallel work is that reported independently by Kral [3]. A
distantly related version of the problem was also considered by Schurmann [4];
Berztiss [5] provides a faster version of Schurmann's algorithm.
2. Definitions
Let G be a directed graph of n nodes {1, 2, ..., n}. Let p be a positive number
called the block size. The vertices of G have weights or sizes {w1, w2, ..., wn} such
that 0 < wi ≤ p. (Typically p and the wi are integers, but this is not necessary.) The
size of a set S of nodes is Σ_{i∈S} wi.
Each edge (i, j) of G has a nonnegative cost cij. The cost of a partition is Σ cij,
summed over edges (i, j) with i and j in different blocks. An optimal partition is
an admissible partition of minimum cost.
If G contains edges (i, j) and (j, i), they may be replaced by the single edge
(i, j), i < j, with a cost c'ij = cij + cji. Hence, without loss of generality, we shall
assume that G contains only edges (i, j), i < j.
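As a sketch of this reduction (the function name and edge representation here are ours, not the paper's), antiparallel edge pairs can be merged in a single pass:

```python
from collections import defaultdict

def symmetrize(edges):
    """Merge each pair of directed edges (i, j) and (j, i) into a single
    edge (i, j) with i < j and cost c'_ij = c_ij + c_ji.  A break between
    i and j cuts both directions at once, so no information is lost."""
    merged = defaultdict(float)
    for i, j, cost in edges:
        merged[(min(i, j), max(i, j))] += cost
    return [(i, j, c) for (i, j), c in sorted(merged.items())]
```

For example, edges (1, 2) of cost 3 and (2, 1) of cost 4 become the single edge (1, 2) of cost 7.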
An alternate model for costs is derived from the theory of Markov chains. (See
Feller [6], for example.) For each edge (i, j) of G, we define a number 0 ≤ q(i, j) ≤
1, which is called the transition probability for the transition i → j. The values
q(i, j) must add up to 1 when summed over all j.
Using this approach, it is straightforward to compute the {cij} from the
{q(i, j)}. A discussion of this computation, and of several possible Markov models,
as related specifically to computer programs, can be found in Pinkerton [7]. Since
we use the {cij} as the fundamental quantities in this paper, we will not discuss
Markov models further.
The distance d(i, j) between two nodes i and j, i ≤ j, is the sum of the sizes of nodes i through j inclusive.
Let
C(x) = Σ_{i<x≤j} cij    (x = 1, 2, ..., n + 1).
C(x) is the sum of costs on all edges cut by a break point at x. Let T(x) be the
minimum partial cost (as far as node x) for any partition of a section of G {1, ..., x},
with a break point at x. Thus T(1) = 0. T(x) will be evaluated iteratively.
Intuitively, the algorithm operates roughly as follows. For x = 1, 2, ..., n + 1,
set T(x) = C(x) + T(y), where y is such that d(y, x − 1) ≤ p and where T(y) is
minimal over all such y. If more than one y yields a minimum T(y), choose the smallest
y.
This stage terminates after T(n + 1) has been computed; T(n + 1) is the cost
of the optimal segmentation. The particular y value which was used to compute
T(n + 1) defines the last break point bk. Now T(bk) was computed as C(bk) +
T(bk−1), where d(bk−1, bk − 1) ≤ p, so bk−1 defines the next break point. This
process continues until b1 = 1 is reached. At this time, a set of break points {b1, ...,
bk} has been found, and the total cost evaluated.
This algorithm is essentially a form of dynamic programming [8].
While the procedure above is correct in outline, there are some complicating
details which must be included in the formal definition of the algorithm. The most
important of these is the necessity of remembering when a term cij is in both T(y)
and C(x), so the computation of T(x) can be adjusted so as not to count it twice.
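The procedure, including this double-counting adjustment, can be sketched as follows. This is a direct transcription for clarity, recomputing distances and incremental costs naively rather than with the paper's linear-time bookkeeping; all names are ours:

```python
def sequential_partition(n, sizes, edges, p):
    """Minimum-cost sequential partition of nodes 1..n into blocks of
    consecutive nodes, each of total size at most p.  sizes[i-1] is the
    size of node i; edges is a list of (i, j, cost) with i < j."""
    INF = float("inf")

    def d(y, x):
        # d(y, x): total size of nodes y through x inclusive
        return sum(sizes[k - 1] for k in range(y, x + 1))

    def C(x, y):
        # Incremental cost of a break at x given a previous break at y:
        # edges cut at x, excluding those with i < y (already paid in T(y))
        return sum(c for (i, j, c) in edges if y <= i < x <= j)

    T = [INF] * (n + 2)   # T[x]: minimum partial cost with a break at x
    L = [0] * (n + 2)     # L[x]: the y that achieved T[x]
    T[1] = 0
    for x in range(2, n + 2):
        for y in range(1, x):
            if d(y, x - 1) > p:      # block y..x-1 would be too large
                continue
            cost = T[y] + C(x, y)
            if cost < T[x]:          # strict <: ties keep the smallest y
                T[x], L[x] = cost, y
    breaks = [n + 1]                 # walk the L chain backwards
    while breaks[-1] != 1:
        breaks.append(L[breaks[-1]])
    return T[n + 1], breaks[::-1]
```

On a small chain with unit sizes, p = 3, and edges (1, 2), (2, 3), (3, 4), (4, 5), (5, 6) of costs 1, 5, 1, 5, 1, this returns cost 1 with break points 1, 4, 7, i.e. the blocks {1, 2, 3} and {4, 5, 6}, cutting only the cheap middle edge.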
[FIG. 1. A chain of 14 unit-size nodes with the edge costs used in the example; consecutive edges carry costs such as 2, 42, 42, 41, 41, 1, ..., 2, 2, 0, and longer arcs of cost 40, 100, and 20 span nonadjacent nodes.]
3. An Example
Before proving optimality, let us examine a detailed example of the procedure in
operation. Figure 1 shows a graph of n = 14 nodes. The nodes are all of size 1, and
the block size p is to be 4. The costs are indicated on the figure.
The cost of edge (0, 1) is 0, and T(1) = 0. Edge (1, 2) has cost 2, so T(2) =
2. T(3) is c2,3 + c2,6 = 82. T(4) is also 82. T(5) is c4,5 + c4,9 + c2,6 = 82. L(1)
through L(5) are all 1. At node 6, a decision is necessary, since the distance from
node 6 to node 1 is greater than p. If T(3) or T(4) is chosen, C(6, 3) and C(6, 4)
are both 42, since the edge (2, 6) is included in both T(3) and T(4). If T(5) is
chosen, C(6, 5) is 41, since both (2, 6) and (4, 9) are included in T(5). If we choose
T(2), C(6, 2) is 82, since neither of the edges (2, 6) and (4, 9) is in T(2). The cost
for choosing T(2) is thus 84, and this is the minimum-cost choice. Therefore,
T(6) = 84.
A record is kept that the choice made at node 6 was node 2, by setting L(6) =
2. At node 7, C(7, 5) = C(7, 6) = 1 if T(5) or T(6) is chosen, and C(7, 3) =
C(7, 4) = 2 if T(3) or T(4) is chosen, depending on the inclusion or exclusion of
edge (4, 9). Clearly the minimum choice here is T(5), which makes T(7) = 83,
and L(7) = 5. Similarly nodes 8 and 9 have T(8) = T(9) = 82 + 201 = 283
and point back to node 5. Node 10 has T(10) = 85 and points back to 7. Node 11
is T(11) = 83 + 42 = 125 and points back to node 7. Node 12 has T(12) = 127,
pointing back to node 10. Nodes 13 and 14 have T values of 87 and point to node
10. Node 15 has T(15) = 87, which is the minimal cost. L(15) = 13. This sequence
is shown in Figure 2.
The break points are now determined by working backwards. Since L(15) =
13, node 13 corresponds to the minimal-cost T(i) in the range of nodes 11 through
14 (the "last block"), and node 13 is a break point. Node 13 refers back to 10, so
10 is a break point. 10 refers to 7, so 7 is a break point, and so on. This partition
has 5 blocks, with elements 1, 2, 3, 4; 5, 6; 7, 8, 9; 10, 11, 12; and 13, 14, for a total
cost of 87. Note that the minimum cost partition does not minimize the number of
blocks: the expansion factor in this example is 5/4.
4. Proof of Optimality
THEOREM 1. The procedure A(opt) finds an optimal admissible partition of a
graph.
[FIG. 2. The graph of Figure 1, with the computed values T(1) through T(15): 0, 2, 82, 82, 82, 84, 83, 283, 283, 85, 125, 127, 87, 87, 87.]
PROOF. Let
C(x, y) = Σ_{y≤i<x≤j} cij
as before. C(x, y) is the incremental cost of a break point at node x, excluding all
edges cij with i less than y. Let z0 and z1 be the first two break points defined by
the procedure. Then
T(z0) = T(z1) + C(z0, z1)
by definition, and so T(z0) is minimum at n + 1 only if T(z1) + C(z0, z1) is the
minimum cost value in the set of nodes within distance p of n. T(z1) in turn is
minimum only if T(z2) + C(z1, z2) is the minimum cost over nodes with d(z2, z1 − 1) ≤
p. But T(z2) + C(z1, z2) is minimum only if T(z3) + C(z2, z3) is minimum, and
so on, until T(zk) = T(1) is reached. T(1) is zero and obviously minimum. Thus,
by following the chain of implications back to T(z0), we conclude that the procedure
does indeed find a minimum-cost partition.
QED
THEOREM 2. The partition produced by A(opt) minimizes the expansion factor
over the set of all optimal partitions.
PROOF. Let B = {b1, ..., bk} be the set of break points for the optimal partition
produced by A(opt). Let B' = {b1', ..., bm'} be another partition with the
same or lower cost, such that m < k.
By construction, bk is the location of that break point with minimum total cost
in the range of nodes y such that d(y, n) ≤ p. Now if there is only one occurrence of
the minimum total cost value in this range, then bm' must necessarily equal bk.
(For otherwise, the total cost of B' would exceed that of B.) If, however, there are
several occurrences of the same minimal total cost, then bm' might be greater than
bk, but bm' can never be less than bk, since bk is the lowest numbered node for minimal
total cost. Therefore bm' ≥ bk.
Now bk−1 is similarly the location of the lowest-numbered minimal cost break
point in the range of nodes y with d(y, bk − 1) ≤ p, and by the same reasoning,
b'm−1 ≥ bk−1.
In the course of evaluating all T's, each edge is examined twice, once as a term of
the form c(x−1, x') (x' ≥ x) and once as c(y, x − 1) (d(y, x − 1) ≤ p).
QED
We shall now prove that in general it is not possible to state in advance whether
a particular node is a break point in an optimal partition without doing an analysis
of the entire graph. That is, no general algorithm can make decisions on the basis
of purely local information.
THEOREM 5. There exists a graph G of m nodes, of size 1, and block size p, such
that for any integers n and n', with n < m − p and n' < m, the set of break points
for nodes {1, ..., n} depends on the cost of the edge (n', n' + 1); in fact, there exist
two completely different sets of break points depending on the cost of the edge (n',
n' + 1).
Hence the break points for some subset of the nodes can depend on the value
of edges arbitrarily far from these nodes.
PROOF. Let the block size be p, and let n = rp. Consider the sequence of nodes
v1, v2, ..., v(r+1)p with costs as follows:
c(p−1, p) = 1,  c(p, p+1) = 2;
c(kp−1, kp) = c(kp, kp+1) = 2  for 2 ≤ k ≤ r + 1;
c(i, i+1) ≥ 3  for all other i in 1, ..., n + p − 1.
[Figure: the chain v1, ..., vm with edge costs 3 3 3 ... 3 1 2 3 ... 3 2 2 3 ... 3 2 2 3 ... 3 2 ? 3 ... 3; the cost-1 edge is (p − 1, p), cost-2 edges flank each multiple of p up to (r + 1)p, and ? marks the edge ((r + 1)p, (r + 1)p + 1).]
Now if the edge [(r + 1)p, (r + 1)p + 1] has a cost greater than 1, the optimal
partition will include among its break points the set {p, 2p, ..., rp}. But if the
value of the edge [(r + 1)p, (r + 1)p + 1] is less than 1, then the set of break
points will include {p + 1, 2p + 1, ..., rp + 1}, a set which is completely different
from the first one in the range (1, n).
QED
Thus, whether or not a node is a break point depends on the values of all other
edges in the graph. Break points cannot be determined locally.
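This nonlocality can be checked numerically. The sketch below runs the dynamic program on our own small instance of the construction (p = 3, r = 1, m = 10; these concrete values are illustrative, not from the paper). Because only consecutive edges are present, a break at x cuts exactly the edge (x − 1, x), so the recurrence needs no double-counting adjustment:

```python
def chain_breaks(costs, p):
    """Sequential-partition DP for a chain of unit-size nodes, where
    costs[i] is the cost of edge (i + 1, i + 2); returns break points.
    Only edge (x - 1, x) is cut by a break at x on a chain."""
    n = len(costs) + 1
    INF = float("inf")
    T = [INF] * (n + 2)
    L = [0] * (n + 2)
    T[1] = 0
    for x in range(2, n + 2):
        cut = costs[x - 2] if x <= n else 0
        for y in range(max(1, x - p), x):   # block y..x-1 has x-y <= p nodes
            if T[y] + cut < T[x]:           # ties keep the smallest y
                T[x], L[x] = T[y] + cut, y
    b = [n + 1]
    while b[-1] != 1:
        b.append(L[b[-1]])
    return b[::-1]

# Theorem 5's pattern with p = 3, r = 1, m = 10: the cost-1 edge is (2, 3),
# cost-2 edges flank the multiples of p, all other edges cost 3, and q is
# the cost of the far edge (6, 7) = ((r + 1)p, (r + 1)p + 1).
def instance(q):
    return [3, 1, 2, 3, 2, q, 3, 3, 3]
```

With q = 2 (greater than 1) the break points come out as {1, 3, 6, 8, 11}, containing p = 3; with q = 0 (less than 1) they are {1, 4, 7, 8, 11}, containing p + 1 = 4 instead. The early break points thus flip on the value of an edge well outside their range.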
5. Conclusions
This paper has presented an algorithm for finding a minimum cost partition of a
graph, subject to size constraints, for use in those situations where arbitrary re-
arrangement of the nodes is not feasible or not desirable. For instance, it may be
applied to sequences of instructions and data items in computer programs to im-
prove performance in a paged memory. (Limited experience with simple heuristic
partitioning algorithms [9] indicates that improvement in paging behavior is possible
by modifying program structure in this way.) The algorithm presented here has
the advantages of requiring linear time, and of providing a partition with minimum
expansion factor.
REFERENCES
1. GORINSHTEYN, L. L. The partitioning of graphs. Eng. Cybern. 7, 1 (Jan. 1969), 76-82.
2. KERNIGHAN, B. W., AND LIN, S. An efficient heuristic procedure for partitioning graphs. Bell System Tech. J. 49, 2 (Feb. 1970), 291-307.
3. KRAL, J. To the problem of segmentation of a program. In Information Processing Machines, Research Institute for Mathematical Machines, Prague, 1965, pp. 140-149.
4. SCHURMANN, A. The application of graphs to the analysis of distribution of loops in a program. Inform. Control 7, 3 (Sept. 1964), 275-282.
5. BERZTISS, A. T. A note on segmentation of computer programs. Inform. Control 12, 1 (Jan. 1968), 21-22.
6. FELLER, W. An Introduction to Probability Theory and Its Applications, Vol. 1. Wiley, New York, 1957.
7. PINKERTON, T. B. Program behavior and control in virtual storage computer systems. Tech. Rep. 4, U. of Michigan, 1968.
8. BELLMAN, R. Dynamic Programming. Princeton U. Press, Princeton, N. J., 1957.
9. DEARNLEY, F. H., AND NEWELL, G. S. Automatic segmentation of programs for a two-level store computer. Comput. J. 7, 3 (Oct. 1964), 185-187.