Partitioning
1. Iterative Improvement
The partitioning problem is the problem of breaking a circuit into two subcircuits.
Like many problems in VLSI design automation, we will solve this problem by a method
known as Iterative Improvement. To apply the iterative improvement technique, we need
to be able to do several things. These are listed below.
1. We need a way of generating an initial solution.
2. We need criteria for deciding whether a solution is acceptable.
3. We need criteria for measuring the goodness of a solution, so that solutions
can be compared with one another.
4. We need one or more techniques for generating new solutions from existing
solutions. The techniques must take the acceptability criteria (see 2 above) into
account so that only acceptable solutions are generated.
The basic idea of iterative improvement is to start with an initial solution and
generate new solutions iteratively until we have a solution that is as close to
optimal as we can get. Optimality is measured with respect to the goodness
criteria. It is important to note that most iterative improvement techniques do
not produce optimal solutions, but some of them come quite close. Most iterative
improvement techniques are greedy. When a new solution is generated from an
existing solution, we have two choices: we can discard the old solution and keep
the new one, or we can discard the new solution and try some other technique for
generating a new solution (or stop the process entirely!). In a greedy algorithm,
the new solution is accepted only if it is better than the old one (with respect
to the goodness criterion). Non-greedy methods (sometimes known as hill-climbing
algorithms) will sometimes accept a solution that is worse than the existing
solution. Hill-climbing algorithms are used to avoid getting trapped in a local
minimum. If solution A can be created from solution B in one step, then A and B
are called neighbors. It is possible for a non-optimal solution to be better than
all of its neighbors. When this occurs, we say that the solution is a local
minimum. A hill-climbing algorithm can sometimes climb out of a local minimum and
find an even better solution by temporarily accepting a solution that is worse
than the existing one.
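As a sketch, the greedy variant of this loop might look as follows in Python; the
generate_neighbor and goodness callbacks are hypothetical stand-ins for the
problem-specific pieces listed above, and the stopping rule is one simple choice
among many.

def greedy_improve(solution, generate_neighbor, goodness, max_failures=100):
    # Generic greedy iterative improvement (a sketch, not a prescribed method).
    # generate_neighbor must produce only acceptable solutions; goodness
    # returns a score where lower is better (e.g., a cut count).
    failures = 0
    while failures < max_failures:
        candidate = generate_neighbor(solution)
        if goodness(candidate) < goodness(solution):
            solution = candidate      # keep the improved solution
            failures = 0
        else:
            failures += 1             # discard the new solution and retry
    return solution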
2. The Kernighan-Lin Algorithm
The most basic approaches to the partitioning problem treat the circuit as a graph.
This is true of the first and most famous partitioning algorithm, the Kernighan-
Lin (KL) algorithm. This algorithm was originally designed for graph partitioning rather than
circuit partitioning, so to apply the algorithm, one must first convert the circuit into a
graph. To do this, each gate is treated as a vertex of the graph. If two gates are directly
connected by a net, then an edge is placed between the corresponding vertices of the
graph. Figure 1 shows the relationship between a circuit and the corresponding graph.
Figure 1. A circuit and the corresponding graph.
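As a hedged sketch of this conversion, the following Python fragment derives the
graph from a netlist; the (driver, sinks) net representation is this sketch's
assumption, not a format prescribed by the algorithm.

def circuit_to_graph(nets):
    # Derive the KL graph from a netlist. Each net is assumed to be a
    # (driver, sinks) pair: the driving gate and the gates it feeds.
    # Each fanout branch becomes one undirected edge between two gates.
    edges = set()
    for driver, sinks in nets:
        for sink in sinks:
            edges.add(frozenset((driver, sink)))
    return edges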
1. The initial solution is a random partition of the gates into two equal-sized
subcircuits, S1 and S2.
2. A solution is acceptable only if both subcircuits contain the same number of gates.
We will assume that the number of gates is even. The algorithm can be tweaked to
handle an odd number of gates.
3. The goodness of a solution is equal to the number of graph edges that are cut.
Suppose the edge (V,W) exists in the graph derived from the circuit. (Recall that V
and W are both gates.) There are two possibilities. V and W can be in different
subcircuits, or they can be in the same subcircuit. If V and W are in different
subcircuits, we say that the edge (V,W) is cut. Otherwise we say that (V,W) is uncut.
4. The technique for generating new solutions from old solutions is to select a subset of
gates from S1, and a subset of gates from S2 and swap them. To maintain
acceptability, we always select two subsets of the same size.
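To make the goodness criterion concrete, here is a minimal sketch of the
cut-count computation, reusing the edge set from the earlier fragment; the side
dictionary mapping each gate to 'S1' or 'S2' is this sketch's own representation.

def cut_count(edges, side):
    # An edge is cut when its two endpoints lie in different subcircuits.
    return sum(1 for e in edges if len({side[v] for v in e}) == 2)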
[Figure: the partitioning of the gates into subcircuits S1 and S2.]
The second tentative swap gives the changes to the gate-improvements and pair-
improvements listed in Figure 9 and Figure 10.
Pair    Improvement
A,C     (-2) + (-1) - 0 = -3
A,F     (-2) + (-3) - 0 = -5
G,C     (-2) + (-1) - 0 = -3
G,F     (-2) + (-3) - 0 = -5
Figure 10. Pair Improvements, Step 3.
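The arithmetic in Figure 10 appears to follow the standard KL gain formula: the
improvement for swapping a pair (V,W) is g(V) + g(W) - 2c(V,W), where g is each
gate's individual improvement and c(V,W) is 1 if the edge (V,W) exists. A sketch
of both computations (the function names are this sketch's own):

def gate_improvement(g, edges, side):
    # D-value of gate g: external edges minus internal edges. A positive
    # value means moving g across the cut would reduce the cut count.
    ext = sum(1 for e in edges if g in e and len({side[v] for v in e}) == 2)
    internal = sum(1 for e in edges if g in e and len({side[v] for v in e}) == 1)
    return ext - internal

def pair_improvement(v, w, edges, side):
    # KL gain for tentatively swapping v and w: D(v) + D(w) - 2*c(v,w).
    c = 1 if frozenset((v, w)) in edges else 0
    return (gate_improvement(v, edges, side)
            + gate_improvement(w, edges, side) - 2 * c)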
At this point, there are no more positive improvements to be made. Some enhanced
versions of the Kernighan-Lin algorithm are capable of detecting this condition and
stopping at this point. We're not that smart yet, so we will persevere by swapping gates A
and C, giving the tentative swap list of Figure 11.
Pair         Cut Count
Zero Swaps   7
(B,D)        3
(H,E)        1
(A,C)        4
Figure 11. The Tentative Swap List.
As Figure 11 shows, we are now moving backwards. The new improvements are
given in Figure 12 and Figure 13. Since there is only one pair left, (G,F), there is no
confusion about which we should select. In fact, it is not necessary for us to compute the
new improvements for this pair, because we know that the final swap must return us to
the original cut-count of 7.
Pair    Improvement
G,F     (-2) + (-1) - 0 = -3
Figure 13. Pair Improvements, Step 4.
After the final swap of G and F, we end up with the complete tentative swap list given
in Figure 14.
Pair         Cut Count
Zero Swaps   7
(B,D)        3
(H,E)        1
(A,C)        4
(G,F)        7
Figure 14. The Complete Tentative Swap List.
After creating the tentative swap list, we trace the list from beginning to end
searching for the minimum cut count. As Figure 14 shows, the minimum cut count is
obtained after swapping (B,D) and (H,E). The minimum cut-count is less than the zero-
swaps-count, so we go ahead and make the swaps, giving the improved partitioning
shown in Figure 15.
Figure 15. The improved partitioning after swapping (B,D) and (H,E).
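The trace over the tentative swap list can be sketched as follows; representing
the list as (pair, cut count) tuples is this sketch's assumption.

def best_prefix(zero_swap_cut, swap_list):
    # Scan the tentative swap list and return the prefix of swaps to make
    # permanent. Only a strict improvement over the zero-swap count counts.
    best_cut, best_k = zero_swap_cut, 0
    for k, (pair, cut) in enumerate(swap_list, start=1):
        if cut < best_cut:
            best_cut, best_k = cut, k
    return swap_list[:best_k]

swaps = best_prefix(7, [(('B', 'D'), 3), (('H', 'E'), 1),
                        (('A', 'C'), 4), (('G', 'F'), 7)])
# -> commits (B,D) and (H,E), matching the trace of Figure 14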
Performance
The Kernighan-Lin algorithm is probably the most intensely studied algorithm for
partitioning. There have been many improvements to the algorithm to improve its
running time. The unimproved running time is actually quite bad due to the need to
examine n²/4 pairs of gates. If we simplistically search through the list every time we
wish to swap a new pair of gates, the total running time of each iteration would be n³/4.
This could lead to a total running time of O(n⁴) or worse. Needless to say, there has been
considerable interest in algorithms that deal with individual gates rather than pairs of
gates.
3. The Fiduccia-Mattheyses Algorithm
The Fiduccia-Mattheyses (FM) algorithm deals with individual gates rather than
pairs of gates, and it works with the nets of the circuit rather than with a
derived graph. In KL, the two fanout branches of a multi-terminal net are
converted into two edges in the derived graph. In FM, the single net is
considered alone, so uncutting one branch of such a net is an improvement in KL,
but not in FM, where the net remains cut.
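The difference between the two cost measures can be sketched as follows, reusing
the (driver, sinks) net representation from the earlier fragments.

def kl_cost(nets, side):
    # Edge-cut: every cut fanout branch counts separately.
    return sum(1 for driver, sinks in nets
               for s in sinks if side[driver] != side[s])

def fm_cost(nets, side):
    # Net-cut: a net counts once, however many of its branches are cut.
    return sum(1 for driver, sinks in nets
               if any(side[driver] != side[s] for s in sinks))

side = {'A': 'S1', 'B': 'S2', 'C': 'S2'}
nets = [('A', ['B', 'C'])]            # one net with two fanout branches
print(kl_cost(nets, side), fm_cost(nets, side))   # 2 1
side['C'] = 'S1'                      # uncut one branch
print(kl_cost(nets, side), fm_cost(nets, side))   # 1 1: KL improves, FM does not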
4. Simulated Annealing
In many optimization problems, it is possible to get caught in a local minimum. In
other words, the algorithm will produce a locally minimal solution that is not even
close to the true minimum, but be unable to escape it, because the only way out is
to temporarily create a solution that is worse than the current one.
Computer algorithms are notoriously bad at doing bad things for the purpose of
getting something even better. One algorithm that was specifically designed to get
around this problem is Simulated Annealing (SA), a general iterative improvement
algorithm that can be used for many different purposes. In SA, it is necessary to
consider several thousand or even several million states, so computing a new state
from an old state must be very efficient. In partitioning, SA starts with a random
partition, just as the two previous algorithms do. A new state is computed by
selecting a gate at random from each of the two subsets and swapping them. As
before, the swap remains tentative until the quality of the new partitioning is
computed. The number of nets cut is the measure of goodness. If the new state is
better than the old state, it is accepted and the swap is made permanent. If the
new state is worse than the old state, it might be accepted and it might not.
The SA algorithm operates in a series of distinct phases called temperatures. An
actual temperature value is assigned to each phase. The algorithm begins with the
temperature set to a high value, and proceeds to lower and lower temperatures. A
predetermined number of moves is attempted at each temperature. When a bad move is
attempted, the algorithm computes an acceptance value that is based on the
temperature and on the badness of the solution. This acceptance value is compared
to a random number to determine whether the move will be accepted. The random
number is used to guarantee that there is always a non-zero probability that any
bad move will be accepted. The higher the temperature, the more likely it is that
a particular bad move will be accepted, and at a given temperature, the worse the
move, the less likely it is to be accepted. In most cases the acceptance value is
computed using the following function, where s is the change in quality and T is
the current temperature. For bad moves this function produces a value between 0
and 1. A random number between 0 and 1 is generated, and if the acceptance value
is larger than the generated random number, the bad move is accepted. Recall that
in partitioning, negative values of s are good and positive values are bad.
e^(-s/T)
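A minimal sketch of this acceptance test in Python (the function and parameter
names are this sketch's own):

import math
import random

def accept_move(s, T):
    # s is the change in cut count (positive = worse); T is the temperature.
    # Good moves are always accepted; a bad move is accepted with
    # probability e^(-s/T), which shrinks as T falls or s grows.
    if s <= 0:
        return True
    return math.exp(-s / T) > random.random()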
There are several parameters that must be determined experimentally. The first of
these is the starting temperature. This is usually chosen so that many bad moves will be
accepted. The second is the cooling schedule. This is the series of temperatures that will
be used. The change in temperature is seldom uniform. Usually temperature is changed in
large steps at high temperatures and in small steps at lower temperatures. A significant
amount of experimentation is usually necessary to determine the best cooling schedule.
The final parameter that must be determined is the number of moves to be made at each
temperature. This number of moves is generally based both on temperature and on the
number of gates in the input. One reasonable choice for a number of moves is 500 times
the number of gates in the input.
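Putting the pieces together, here is a hedged sketch of the whole loop, reusing
accept_move from the sketch above; the geometric cooling schedule and all of the
constants are illustrative choices, not values prescribed by the text.

def simulated_annealing(partition, random_swap, cut_count,
                        T_start=10.0, T_end=0.1, alpha=0.95,
                        moves_per_T=500):
    # random_swap tentatively swaps one random gate from each subset and
    # returns an undo function; cut_count measures goodness. In practice
    # moves_per_T would scale with the number of gates (e.g., 500 * gates)
    # and the schedule would be tuned experimentally, as described above.
    T = T_start
    cost = cut_count(partition)
    while T > T_end:
        for _ in range(moves_per_T):
            undo = random_swap(partition)     # tentative move
            new_cost = cut_count(partition)
            if accept_move(new_cost - cost, T):
                cost = new_cost               # make the swap permanent
            else:
                undo()                        # reject: restore the old state
        T *= alpha                            # simple geometric cooling
    return partition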
Simulated annealing generally does a very good job, but runs very, very slowly.