Multi-Objective Hypergraph Partitioning Algorithms For Cut and Maximum Subdomain Degree Minimization
constraint, and the requirement that a certain function is optimized is referred to as the partitioning objective.

A. The Multilevel Paradigm for Hypergraph Partitioning

[Figure: the multilevel partitioning paradigm; the coarsening phase transforms the original hypergraph G1 into the successively coarser hypergraph G2.]
try to reduce the maximum degree of the intermediate lower-k partitioning solutions.

For this reason, approaches based on direct k-way partitioning are better suited for the problem of minimizing the maximum subdomain degree, as they provide a concurrent view of the entire k-way partitioning solution. The ability of direct k-way partitioning to optimize objective functions that depend on knowing how the hyperedges are partitioned across all k partitions has been recognized by various researchers, and a number of different algorithms have been developed to minimize objective functions such as the sum-of-external-degrees, scaled cost, absorption, etc. [5], [10], [25], [30], [34]. Moreover, direct k-way partitioning can potentially produce much better solutions than a method that computes a k-way partitioning via recursive bisection. In fact, in the context of certain classes of graphs it was shown that recursive bisectioning can be up to an O(log n) factor worse than the optimal solution [33].

However, despite the inherent advantage of direct k-way partitioning to naturally model much more complex objectives, and the theoretical results which suggest that it can lead to superior partitioning solutions, a number of studies have shown that existing direct k-way partitioning algorithms for hypergraphs produce solutions that are in general inferior to those produced via recursive bisectioning [10], [25], [30], [34]. The primary reason is that computationally efficient k-way partitioning refinement algorithms are often trapped in local minima, and usually require much more sophisticated and expensive optimizers to climb out of them.

To overcome these conflicting requirements and characteristics, our algorithms for minimizing the maximum subdomain degree combine the best features of the recursive bisectioning and direct k-way partitioning approaches. We achieve this by treating the minimization of the maximum subdomain degree as a post-processing problem to be performed once a high-quality k-way partitioning has been obtained. Specifically, we use existing state-of-the-art multilevel-based techniques [19], [22] to obtain an initial k-way solution via repeated bisectioning, and then refine this solution using various k-way partitioning refinement algorithms that (i) explicitly minimize the maximum subdomain degree, (ii) ensure that the cut does not significantly increase, and (iii) ensure that the balancing constraints of the resulting k-way partitioning are satisfied.

This approach has a number of inherent advantages. First, by building upon a cut-based k-way partitioning, it leverages the huge body of existing research on this topic, and it can benefit from future improvements. Second, in terms of cut, its initial k-way solution is of extremely high quality, allowing us to focus primarily on minimizing the maximum subdomain degree without being overly concerned about the cut of the final solution (as long as the partitioning is not significantly perturbed). Third, it allows for a user-adjustable and predictable framework in which the user can specify how much (if any) deterioration of the cut he or she is willing to tolerate in order to reduce the maximum subdomain degree.

To actually perform the maximum subdomain-degree focused k-way refinement we developed two classes of algorithms. Both of them treat the problem as a multi-objective optimization problem, but they differ on the starting point of that refinement. The first algorithm, called Direct Multi-Phase Refinement, directly optimizes the multi-objective cost using the k-way V-cycle framework [19], while the second algorithm, called Aggressive Multi-Phase Refinement, utilizes refinement strategies that enable large-scale perturbations of the solution space. Details on the exact multi-objective formulation are provided in the rest of this section, and the two refinement algorithms are described in subsequent sections.

A. Multi-Objective Problem Formulation

In general, the objectives of producing a k-way partitioning that minimizes both the cut and the maximum subdomain degree are reasonably well correlated with each other, as partitionings with low cuts will also tend to have low maximum subdomain degrees. However, this correlation is not perfect, and these two objectives can actually be at odds with each other. That is, a reduction in the maximum subdomain degree may only be achieved if the cut of the partitioning is increased. This situation arises with vertices that are adjacent to vertices that belong to more than two subdomains. For example, consider a vertex v that belongs to the maximum degree partition Vi, and let Vq and Vr be two other partitions such that v is connected to vertices in Vi, Vq, and Vr. Now, if EDq(v) - IDi(v) < 0 and EDr(v) - IDi(v) < 0, then the move of v to either partition Vq or Vr will increase the cut; but if EDq(v) + EDr(v) - IDi(v) > 0, then moving v to either Vq or Vr will actually decrease Vi's subdomain degree. One such scenario is illustrated in Figure 3, in which vertex v from partition Vi is connected to vertices x, y, and z of partitions Vi, Vq, and Vr, respectively, and the weights of the respective edges are 6, 5, and 3. Moving vertex v from partition Vi to either partition Vq or Vr will reduce the subdomain degree of Vi; however, either of these moves will increase the overall cut of the partitioning. For example, if v moves to Vq, the subdomain degree of Vi will be reduced from 8 to 6, whereas the overall cut will increase from 8 to 9. This discussion suggests that in order to develop effective algorithms that explicitly minimize the maximum subdomain degree and the cut, these two objectives need to be coupled together into a multi-objective framework that allows the optimization algorithm to intelligently select the preferred solution.

The problem of multi-objective optimization within the context of graph and hypergraph partitioning has been extensively studied in the literature [1], [27], [29], [31], [36], and two general approaches have been developed for combining multiple objectives. The first approach keeps the different objectives separate and couples them by assigning different priorities to them. Essentially, in this scheme, a solution that optimizes the highest-priority objective the most is always preferred, and the lower-priority objectives are used as tie-breakers (i.e., used to select among equivalent solutions in terms of the higher-priority objectives). The second approach creates an explicit multi-objective function that numerically combines the individual functions. For example, a multi-objective function can be obtained as the weighted sum of the individual objective functions. In this scheme, the choice of
node internal to partition a, then v is not moved. If v is at the boundary, it is considered for movement to one of the partitions N(v) that the vertices adjacent to v belong to (the set N(v) is often referred to as the neighborhood of v). Let N'(v) be the subset of N(v) that contains all partitions b such that the movement of vertex v to partition b does not violate the balancing constraint. Now the partition b ∈ N'(v) that leads to the greatest positive reduction in the multi-objective function is selected, and v is moved to that partition.
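A minimal sketch of this greedy move step (our own illustration, not the paper's code): gain(v, b) stands in for the reduction in the multi-objective function obtained by moving v to partition b, and unit vertex weights are assumed for the balance check.

def greedy_move(v, part, adj, load, cap, gain):
    """Move v to the feasible neighboring partition with the largest positive gain."""
    src = part[v]
    N = {part[u] for u in adj[v]} - {src}       # neighborhood N(v)
    if not N:
        return None                             # v is internal to src: not moved
    Np = [b for b in N if load[b] + 1 <= cap]   # N'(v): balance-feasible targets
    if not Np:
        return None
    best = max(Np, key=lambda b: gain(v, b))    # greatest reduction in the objective
    if gain(v, best) <= 0:
        return None                             # no positive reduction: v stays put
    part[v] = best
    load[src] -= 1
    load[best] += 1
    return best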
[Figure: flowchart of the bottom-up macro-node construction scheme: collapse the partitions into macro-nodes; create x pairs using the 2x macro-nodes; swap macro-nodes to improve the pairs; convert the pairs into larger macro-nodes; repeat until k macro-nodes remain.]
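The loop in the flowchart can be summarized as follows (a hypothetical sketch under our own naming: pair_up, improve_pairs, and merge stand in for the steps named in the figure).

def bottom_up_macro_nodes(macros, k, pair_up, improve_pairs, merge):
    """Repeatedly pair and merge macro-nodes until only k remain."""
    while len(macros) > k:
        pairs = pair_up(macros)         # create x pairs from the 2x macro-nodes
        pairs = improve_pairs(pairs)    # swap macro-nodes between pairs to improve them
        macros = [merge(a, b) for a, b in pairs]   # each pair becomes a larger macro-node
    return macros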
V. AGGRESSIVE MULTI-PHASE REFINEMENT
One of the potential problems with the multi-objective
number of combinations that needs to be considered increases exponentially with l), the algorithm obtains the partitioning in a bottom-up fashion by repeatedly applying the above scheme l times. In addition, after each round of pairings, the macro-node-level partitioning is further refined by applying a pair-wise macro-node swapping algorithm described in Section V-A.1.

In the fourth step, the quality, in terms of the particular multi-objective function, of the resulting macro-node-level partitioning is improved using a pair-wise macro-node swapping algorithm (described in Section V-A.1). This algorithm operates at the macro-node level: it selects two macro-nodes, each one from a different partition, and swaps the partitions they belong to so as to improve the overall quality of the solution. Since, by construction, each macro-node is of approximately the same size, such swaps almost always lead to feasible solutions in terms of the balance constraint. The use of such a refinement algorithm was the primary motivation behind the development of the aggressive multi-phase algorithm, as it allows us to move large portions of the hypergraph between partitions without having to either violate the balancing constraints or rely on a sequence of small vertex moves to achieve the same effect. Moreover, because, by construction, each macro-node corresponds to a good cluster (as opposed to a random collection of nodes), such swaps can indeed lead to improved quality very efficiently.

Finally, in the fifth step, the macro-node-based partitioning is used to induce a partitioning of the original hypergraph, which is then further improved using the direct multi-phase refinement algorithm described in Section IV.

1) Macro-node Partitioning Refinement: We developed two algorithms for refining a partitioning solution at the macro-node level. The differences between the two algorithms are the method used to identify the pairs of macro-nodes to be swapped and the policy used in determining whether or not a particular swap will be accepted. Details on these two schemes are provided in the next two sections.

a) Randomized Pair-wise Node Swapping: In this scheme, two nodes belonging to different partitions are randomly selected and the quality of the partitioning resulting from their swap is evaluated in terms of the particular multi-objective function. If the swap leads to a better solution, it is performed; otherwise it is not. Swaps that neither improve nor degrade the particular multi-objective function are also allowed, as they often introduce desirable perturbations. The primary motivation for this algorithm is its low computational complexity, and in practice it produces very good results. Also, when there are two nodes per subdomain, the randomized pair-wise node swapping can be done quite efficiently by pre-computing the cut and degree of all possible pairings and storing them in a 2D table. This lookup-based swapping takes less than one second to evaluate the cost of one million swaps on a 1.5 GHz workstation.

b) Coordinated Sequence of Pair-wise Node Swaps: One of the limitations of the previous scheme is that it lacks the ability to climb out of local minima, as it does not allow any swaps that decrease the value of the objective function. To overcome this problem, we developed a heuristic refinement algorithm that can be considered an extension of the classical Kernighan-Lin algorithm [26] for k-way refinement that operates as follows.

The algorithm consists of a number of iterations. During each iteration it identifies and performs a sequence of macro-node swaps that improve the value of the objective function, and it terminates when no such sequence can be identified within a particular iteration. Each of these iterations is performed as follows. Let k be the number of partitions, m the number of macro-nodes, and q = m/k the number of macro-nodes per partition. Since each macro-node v in partition Vi can be swapped with any macro-node belonging to a different partition, there are a total of m(m - q) possible pairs of macro-nodes that can be swapped. For each of these swaps, the algorithm computes the improvement in the value of the objective function (i.e., the gain) achieved by performing it, and inserts all the m(m - q) possible swaps into a max-priority queue based on this gain value. Then it proceeds to repeatedly (i) extract from this queue the macro-node pair whose swap leads to the highest gain, (ii) modify the partitioning by performing the macro-node swap, (iii) record the current value of the objective function, and (iv) update the gains of the macro-node pairs in the priority queue to reflect the new partitioning. Once the priority queue becomes empty, the algorithm determines the point within this sequence of swaps that resulted in the best value of the objective function and reverts the swaps that it performed after that point. An outline of a single iteration of this hill-climbing swap algorithm is presented in Algorithm 1.

Algorithm 1 Hill-climbing algorithm for identifying a sequence of pair-wise macro-node swaps to reach a lower cost.
  Compute initial gain values for all possible pairs
  Insert them in a priority queue
  while pairs exist in the priority queue do
    Pop the highest-gain pair
    Make the swap
    Lock the pair
    if the cost is a new minimum then
      Record the roll-back point
      Record the new minimum cost
    end if
    if the maximum subdomain degree changed then
      Update the gain values of all pairs remaining in the priority queue
    else
      Update the gain values of the affected pairs remaining in the priority queue
    end if
  end while
  Roll back to the minimum-cost point (i.e., undo all swaps after the minimum-cost point in reverse order)

Due to the global nature of the maximum subdomain degree cost, if a macro-node swap changes the value of the maximum subdomain degree, the gains of all the pairwise swaps that are still in the priority queue need to be recomputed. However, if a swap does not change the value of the maximum subdomain degree, then only the gains of the macro-node pairs that are affected by the swap need to be updated, which makes a significant runtime difference.
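A compact rendering of one iteration of Algorithm 1 in Python, using heapq with negated gains as the max-priority queue. This is a sketch under our own simplifications: instead of updating gains in place (all pairs versus only the affected pairs, as the algorithm does), stale queue entries are lazily re-evaluated at pop time; gain, apply_swap, undo_swap, and cost are hypothetical callbacks, and pairs are tuples of comparable macro-node ids.

import heapq

def hill_climb_iteration(pairs, gain, apply_swap, undo_swap, cost):
    """One pass of the pair-wise macro-node swap refinement."""
    heap = [(-gain(p), p) for p in pairs]
    heapq.heapify(heap)                       # max-priority queue via negated gains
    locked = set()                            # macro-nodes already swapped this pass
    history = []                              # swaps performed, in order
    best_cost, best_len = cost(), 0
    while heap:
        neg_g, p = heapq.heappop(heap)
        if p[0] in locked or p[1] in locked:
            continue                          # each macro-node is swapped at most once
        g = gain(p)                           # lazy update: re-evaluate the gain
        if -neg_g != g:
            heapq.heappush(heap, (-g, p))     # stale entry: reinsert with a fresh gain
            continue
        apply_swap(p)                         # make the swap and lock the pair
        history.append(p)
        locked.update(p)
        c = cost()
        if c < best_cost:                     # record the roll-back point
            best_cost, best_len = c, len(history)
    for p in reversed(history[best_len:]):    # undo all swaps past the best point
        undo_swap(p)
    return best_cost

The lazy re-evaluation trades the selective gain updates of Algorithm 1 for simplicity; the sequence of accepted swaps and the final roll-back behave the same way.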
The top-down algorithm starts by computing a k-way partitioning that minimizes the cut using recursive bisectioning

TABLE II
THE CHARACTERISTICS OF THE HYPERGRAPHS USED TO EVALUATE OUR ALGORITHM.

TABLE III
DIRECT MULTI-PHASE REFINEMENT RESULTS.
the average and thus there is significantly more room for improvement.

Furthermore, the direct multi-phase refinement algorithm also leads to partitionings that on average have lower cut and average subdomain degree. Specifically, the cut tends to improve by 1% to 4%, whereas the average subdomain degree improves by 5% to 13%. Finally, comparing the different multi-objective formulations, we can see that in general there are very few differences between them, with both of them leading to comparable solutions.

B. Aggressive Multi-Phase Refinement

Our experimental evaluation of the aggressive multi-phase refinement schemes (described in Section V) focused along two directions. First, we designed a set of experiments to evaluate the effectiveness of the macro-node-level partitioning refinement algorithms used by these schemes, and second, we performed a series of experiments designed to evaluate the effectiveness of the bottom-up and top-down schemes within the context of aggressive refinement.

1) Evaluation of Macro-node Partitioning Refinement Algorithms: To directly evaluate the relative performance of the two refinement algorithms described in Section V-A.1, we performed a series of experiments using a simple version of the aggressive refinement schemes. Specifically, we computed a 2^l k-way partitioning, collapsed each partition into a macro-node, and obtained an initial k-way partitioning of these macro-nodes using a random assignment. This initial partitioning was then refined using the two macro-node partitioning refinement algorithms: randomized swap and hill-climbing swap. This experiment was performed for each one of the circuits in our benchmark suite, and the overall performance achieved by the two algorithms for k = 8, 16, 32 and l = 1, 2 relative to that obtained by hMETIS's recursive bisectioning algorithm is shown in Table IV. Note that for this set of experiments, the two objectives of maximum subdomain degree and cut were combined using a priority scheme, which uses the minimization of the maximum subdomain degree as the primary objective.

From these results we can see that, contrary to our initial expectations, the hill-climbing algorithm does not outperform the randomized-swapping algorithm on all three performance metrics. Specifically, in terms of the cut (RCut) and the average degree (RDeg), the hill-climbing algorithm is superior to the randomized algorithm. For example, for l = 2 both of these measures are over 10% better than the corresponding measures for the randomized algorithm. However, in terms of the maximum subdomain degree (measured by RMax), the hill-climbing algorithm provides little advantage. In fact, its overall performance is slightly worse than the randomized scheme, leading to solutions whose maximum subdomain degree is about 1% to 3% higher for l = 1 and l = 2, respectively.

The mixed performance of the hill-climbing algorithm and its inability to produce solutions that have lower maximum subdomain degree suggest that this type of refinement may not be well-suited to the step nature of the maximum subdomain degree objective. Since there are relatively few macro-node swaps that affect the maximum subdomain degree, the priority queue used by the hill-climbing algorithm forces it to order the moves based on their gains with respect to the cut (as it is the secondary objective). Because of this, this refinement is very effective in minimizing RCut and RDeg, but it does not affect RMax. In fact, as the results suggest, this emphasis on the cut may affect the ability of subsequent swaps to reduce the maximum subdomain degree. To see if this is indeed the case, we performed another sequence of experiments in which we modified the randomized algorithm so as to perform the moves using the same priority-queue-based approach used by the hill-climbing scheme, and terminated each inner iteration as soon as the priority queue contained negative-gain vertices (i.e., it did not perform any hill-climbing). Our experiments (not presented here) showed that this scheme produced results whose RMax was worse than that of the randomized and hill-climbing approaches, but its RCut and RDeg were between those obtained by the randomized and the hill-climbing schemes, verifying our hypothesis that, due to the somewhat conflicting nature of the two objectives, a greedy ordering scheme does not necessarily lead to better results.

The columns of Table IV labeled RTime show the amount of time required by the two refinement algorithms. As expected, the randomized algorithm is faster than the hill-climbing algorithm, and its relative runtime advantage improves as the number of macro-nodes increases. Due to the mixed performance of the hill-climbing algorithm and its considerably higher computational requirements for large values of l, our subsequent experiments used only the randomized refinement algorithm.
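For concreteness, the experimental set-up described above can be sketched as follows (our own illustration; partition_fn and collapse_fn are hypothetical stand-ins for the hMETIS-style cut-minimizing partitioner and the macro-node contraction step).

import random

def macro_level_setup(hg, k, l, partition_fn, collapse_fn, seed=0):
    """Build the initial macro-node-level k-way partitioning used above."""
    parts = partition_fn(hg, (2 ** l) * k)          # a 2^l * k-way min-cut partitioning
    macros = [collapse_fn(hg, p) for p in parts]    # collapse each part into a macro-node
    rng = random.Random(seed)
    assign = {i: rng.randrange(k) for i in range(len(macros))}  # random k-way assignment
    return macros, assign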
TABLE IV
ANALYSIS OF RANDOMIZED VS. HILL-CLIMB SWAPPING.
2) Evaluation of Bottom-up and Top-down Schemes: Table V shows the performance achieved by the bottom-up and top-down aggressive multi-phase refinement schemes for l = 1, . . . , 3 and k = 4, 8, . . . , 64 relative to that obtained by hMETIS's recursive bisectioning algorithm. Specifically, for each value of l and k, this table shows four sets of results. The first two sets (one for the bottom-up and one for the top-down scheme) were obtained using the priority-based multi-objective formulation, whereas the remaining two sets used the combining scheme. Due to space constraints, we only present results in which the two objectives were combined using a weight equal to k.

From these results, we can observe a number of general trends about the performance of the aggressive multi-phase refinement schemes and their sensitivities to the various parameters. In particular, as l increases from one to two (i.e., each partition is further subdivided into two or four parts), the effectiveness of the multi-objective partitioning algorithm in producing solutions with a lower maximum subdomain degree than the solutions obtained by hMETIS improves. In general, for l = 1, the multi-objective algorithm reduces the maximum subdomain degree by 7% to 28%, whereas for l = 2, the corresponding improvements range from 6% to 35%. However, these improvements lead to solutions in which the cut and the average subdomain degree obtained for l = 2 are somewhat higher than those obtained for l = 1. For example, for l = 1, the multi-objective algorithm is capable of improving the cut over hMETIS by 0% to 4%, whereas for l = 2, the multi-objective algorithm leads to solutions whose cut is up to 5% worse than those obtained by hMETIS. Note that these observations are to a large extent independent of the particular multi-objective formulation or the method used to obtain the initial macro-node-level partitioning.

For the reasons discussed in Section V-B, the trend of successive improvements in the maximum subdomain degree no longer holds for the bottom-up scheme when l = 3. In particular, the improvements in the maximum subdomain degree relative to hMETIS are in the range of 0% to 35%, which is somewhat lower than the corresponding improvements for l = 2. On the other hand, the top-down scheme is able to further reduce the maximum subdomain degree when l = 3, leading to results that are 10% to 36% lower than the corresponding results of hMETIS. Note that this trend continues for higher values of l as well (due to space constraints these results are not reported here). These results suggest that the top-down scheme is better than the bottom-up scheme for large values of l. However, a closer inspection of the results reveals that for l = 1 and l = 2 and large values of k, the bottom-up scheme actually leads to solutions that are somewhat better than those obtained by the top-down scheme. We believe that this is due to the fact that for small values of l, the macro-node pairing scheme used by the bottom-up scheme to derive the macro-node-level k-way partitioning (which takes into account all possible pairings of macro-nodes) is inherently more powerful than the macro-node-level refinement used by the top-down scheme. This becomes more evident for large values of k, for which there is considerably more room for alternate pairings, resulting in relatively better results.

C. Comparison of Direct and Aggressive Multi-phase Refinement Schemes

Comparing the results obtained by the aggressive multi-phase refinement with the corresponding results obtained by the direct multi-phase refinement algorithm (Tables V and III), we can see that in terms of the maximum subdomain degree, the aggressive scheme leads to substantially better solutions than those obtained by the direct scheme, whereas in terms of the cut and the average subdomain degree, the direct scheme is superior. These results are in agreement with the design principles behind these two multi-phase refinement schemes for the multi-objective optimization problem at hand, and illustrate that the former is capable of making relatively large perturbations to the initial partitioning obtained by recursive bisectioning, as long as these perturbations improve the multi-objective function. In general, the aggressive multi-phase refinement scheme with l = 1 dominates the direct scheme, as it leads to better improvements in terms of the maximum subdomain degree and still improves over hMETIS in terms of cut and average degree. However, if the goal is to achieve the highest reduction in the maximum subdomain degree, then the aggressive scheme with l = 2 should be the preferred choice, as it does so with relatively little degradation of the cut.

D. Runtime Complexity

Table VI shows the amount of time required by the various multi-objective partitioning algorithms using either direct or aggressive multi-phase refinement. For each value of k and each particular multi-objective algorithm, this table shows the total amount of time that was required to partition all 18 benchmarks relative to the amount of time required by hMETIS to compute the corresponding partitionings. From these results we can see that the multi-objective algorithm that uses the direct multi-phase refinement is the least computationally expensive, requiring around 50% more time than hMETIS does.
TABLE V
AGGRESSIVE MULTI-PHASE REFINEMENT RESULTS.

l = 1
                Prioritized                               Combined, weight = k
        Bottom-Up            Top-Down              Bottom-Up            Top-Down
 k    RMax  RCut  RDeg    RMax  RCut  RDeg      RMax  RCut  RDeg    RMax  RCut  RDeg
 4   0.927 0.990 0.958   0.911 0.956 0.929     0.904 0.972 0.941   0.897 0.957 0.927
 8   0.838 0.995 0.945   0.849 0.960 0.918     0.834 0.992 0.943   0.830 0.968 0.921
16   0.787 1.005 0.942   0.811 0.980 0.928     0.795 1.000 0.935   0.812 0.991 0.934
32   0.754 0.993 0.923   0.762 0.984 0.913     0.758 0.991 0.917   0.795 0.989 0.921
64   0.724 0.996 0.916   0.738 0.988 0.910     0.721 0.993 0.905   0.749 0.992 0.911

l = 2
                Prioritized                               Combined, weight = k
        Bottom-Up            Top-Down              Bottom-Up            Top-Down
 k    RMax  RCut  RDeg    RMax  RCut  RDeg      RMax  RCut  RDeg    RMax  RCut  RDeg
 4   0.938 1.021 0.991   0.901 0.943 0.917     0.905 0.992 0.963   0.883 0.956 0.924
 8   0.825 1.046 1.004   0.822 0.964 0.922     0.814 1.041 1.001   0.806 0.974 0.926
16   0.749 1.049 1.008   0.761 0.997 0.943     0.751 1.048 1.003   0.761 1.013 0.952
32   0.693 1.041 0.991   0.704 1.000 0.936     0.689 1.033 0.976   0.728 1.017 0.945
64   0.654 1.040 0.983   0.664 1.007 0.934     0.652 1.041 0.974   0.704 1.018 0.937

l = 3
                Prioritized                               Combined, weight = k
        Bottom-Up            Top-Down              Bottom-Up            Top-Down
 k    RMax  RCut  RDeg    RMax  RCut  RDeg      RMax  RCut  RDeg    RMax  RCut  RDeg
 4   1.007 1.121 1.091   0.899 0.953 0.927     0.950 1.058 1.029   0.877 0.950 0.919
 8   0.848 1.119 1.088   0.815 0.974 0.931     0.842 1.109 1.073   0.796 0.982 0.934
16   0.759 1.101 1.070   0.748 1.000 0.945     0.754 1.077 1.034   0.759 1.022 0.966
32   0.697 1.095 1.059   0.682 1.006 0.943     0.700 1.064 1.010   0.722 1.023 0.954
64   0.701 1.100 1.052   0.645 1.012 0.941     0.663 1.066 1.006   0.683 1.024 0.945

RMax, RCut, and RDeg are the average maximum subdomain degree, cut, and average subdomain degree, respectively, of the multi-objective solution relative to hMETIS. Numbers less than one indicate that the multi-objective algorithm produces solutions that have lower maximum subdomain degree, cut, or average subdomain degree than those produced by hMETIS.
Developing computationally scalable refinement algorithms that can successfully climb out of local minima for this type of objective is still an open research problem, whose solution can lead to even better results both for the partitioning problem addressed in this paper and for other objective functions that share similar characteristics. Also, our work so far has focused on producing multi-objective solutions that satisfy the same balancing constraints as those resulting from the initial recursive bisectioning based solution. However, additional improvements can be obtained by relaxing the lower-bound constraint. Our preliminary results with such an approach appear promising.

ACKNOWLEDGMENT

This work was supported in part by NSF CCR-9972519, EIA-9986042, ACI-9982274, ACI-0133464, and ACI-0312828; the Digital Technology Center at the University of Minnesota; and by the Army High Performance Computing Research Center (AHPCRC) under the auspices of the Department of the Army, Army Research Laboratory (ARL) under Cooperative Agreement number DAAD19-01-2-0014, the content of which does not necessarily reflect the position or the policy of the government, and no official endorsement should be inferred. Access to research and computing facilities was provided by the Digital Technology Center and the Minnesota Supercomputing Institute.

REFERENCES

[1] C. Ababei, N. Selvakkumaran, K. Bazargan, and G. Karypis. Multi-objective circuit partitioning for cutsize and path-based delay minimization. In Proceedings of ICCAD, 2002. Also available on WWW at URL http://www.cs.umn.edu/karypis.
[2] C. Alpert and A. Kahng. A hybrid multilevel/genetic approach for circuit partitioning. In Proceedings of the Fifth ACM/SIGDA Physical Design Workshop, pages 100-105, 1996.
[3] C. J. Alpert. The ISPD98 circuit benchmark suite. In Proc. of the Intl. Symposium of Physical Design, pages 80-85, 1998.
[4] C. J. Alpert, J. H. Huang, and A. B. Kahng. Multilevel circuit partitioning. In Proc. of the 34th ACM/IEEE Design Automation Conference, 1997.
[5] C. J. Alpert and A. B. Kahng. Recent directions in netlist partitioning. Integration, the VLSI Journal, 19(1-2):1-81, 1995.
[6] J. Babb, R. Tessier, and A. Agarwal. Virtual wires: Overcoming pin limitations in FPGA-based logic emulators. In IEEE Workshop on FPGAs for Custom Computing Machines, pages 142-151. IEEE Computer Society Press, 1993.
[7] S. T. Barnard and H. D. Simon. A fast multilevel implementation of recursive spectral bisection for partitioning unstructured problems. In Proceedings of the Sixth SIAM Conference on Parallel Processing for Scientific Computing, pages 711-718, 1993.
[8] T. Bui and C. Jones. A heuristic for reducing fill in sparse matrix factorization. In 6th SIAM Conf. Parallel Processing for Scientific Computing, pages 445-452, 1993.
[9] A. E. Caldwell, A. B. Kahng, and I. L. Markov. Improved algorithms for hypergraph bipartitioning. In Asia and South Pacific Design Automation Conference, pages 661-666, 2000.
[10] J. Cong and S. K. Lim. Multiway partitioning with pairwise movement. In Proceedings of ICCAD, pages 512-516, 1998.
[11] J. Cong and M. L. Smith. A parallel bottom-up clustering algorithm with applications to circuit partitioning in VLSI design. In Proc. ACM/IEEE Design Automation Conference, pages 755-760, 1993.
[12] R. Cooley, B. Mobasher, and J. Srivastava. Web mining: Information and pattern discovery on the World Wide Web. In International Conference on Tools with Artificial Intelligence, pages 558-567, Newport Beach, 1997. IEEE.
[13] C. M. Fiduccia and R. M. Mattheyses. A linear time heuristic for improving network partitions. In Proc. 19th IEEE Design Automation Conference, pages 175-181, 1982.
[14] S. Hauck and G. Borriello. An evaluation of bipartitioning techniques. In Proc. Chapel Hill Conference on Advanced Research in VLSI, 1995.
[15] B. Hendrickson and R. Leland. A multilevel algorithm for partitioning graphs. Technical Report SAND93-1301, Sandia National Laboratories, 1993.
[16] B. Hendrickson, R. Leland, and R. V. Driessche. Enhancing data locality by using terminal propagation. In Proceedings of the 29th Hawaii International Conference on System Science, 1996.
[17] B. Hu and M. Marek-Sadowska. Congestion minimization during placement without estimation. In Proceedings of ICCAD, pages 737-745, Nov 2002.
[18] G. Karypis. Multilevel hypergraph partitioning. In J. Cong and J. Shinnerl, editors, Multilevel Optimization Methods for VLSI, chapter 6. Kluwer Academic Publishers, Boston, MA, 2002.
[19] G. Karypis, R. Aggarwal, V. Kumar, and S. Shekhar. Multilevel hypergraph partitioning: Application in VLSI domain. IEEE Transactions on VLSI Systems, 20(1), 1999. A short version appears in the proceedings of DAC 1997.
[20] G. Karypis, E. Han, and V. Kumar. Chameleon: A hierarchical clustering algorithm using dynamic modeling. IEEE Computer, 32(8):68-75, 1999.
[21] G. Karypis and V. Kumar. Analysis of multilevel graph partitioning. In Proceedings of Supercomputing, 1995. Also available on WWW at URL http://www.cs.umn.edu/karypis.
[22] G. Karypis and V. Kumar. hMETIS 1.5: A hypergraph partitioning package. Technical report, Department of Computer Science, University of Minnesota, 1998. Available on the WWW at URL http://www.cs.umn.edu/metis.
[23] G. Karypis and V. Kumar. METIS 4.0: Unstructured graph partitioning and sparse matrix ordering system. Technical report, Department of Computer Science, University of Minnesota, 1998. Available on the WWW at URL http://www.cs.umn.edu/metis.
[24] G. Karypis and V. Kumar. A fast and high quality multilevel scheme for partitioning irregular graphs. SIAM Journal on Scientific Computing, 20(1), 1999. Also available on WWW at URL http://www.cs.umn.edu/karypis. A short version appears in Intl. Conf. on Parallel Processing 1995.
[25] G. Karypis and V. Kumar. Multilevel k-way hypergraph partitioning. VLSI Design, 2000.
[26] B. W. Kernighan and S. Lin. An efficient heuristic procedure for partitioning graphs. The Bell System Technical Journal, 49(2):291-307, 1970.
[27] P. Fishburn. Decision and Value Theory. J. Wiley & Sons, New York, 1964.
[28] A. Pothen, H. D. Simon, and K.-P. Liou. Partitioning sparse matrices with eigenvectors of graphs. SIAM Journal of Matrix Analysis and Applications, 11(3):430-452, 1990.
[29] R. Keeney and H. Raiffa. Decisions with Multiple Objectives: Preferences and Value Tradeoffs. J. Wiley & Sons, New York, 1976.
[30] L. Sanchis. Multiple-way network partitioning. IEEE Trans. on Computers, 38(1):62-81, 1989.
[31] K. Schloegel, G. Karypis, and V. Kumar. A new algorithm for multi-objective graph partitioning. In Proceedings of EuroPar '99, pages 322-331, 1999.
[32] S. Shekhar and D. R. Liu. Partitioning similarity graphs: A framework for declustering problems. Information Systems Journal, 21(4), 1996.
[33] H. D. Simon and S.-H. Teng. How good is recursive bisection? Technical Report RNR-93-012, NAS Systems Division, NASA, Moffett Field, CA, 1993.
[34] M. Wang, S. K. Lim, J. Cong, and M. Sarrafzadeh. Multi-way partitioning using bi-partition heuristics. In Proceedings of ASPDAC, pages 441-446. IEEE, January 2000.
[35] S. Wichlund and E. J. Aas. On multilevel circuit partitioning. In Intl. Conference on Computer Aided Design, 1998.
[36] P. Yu. Multiple-Criteria Decision Making: Concepts, Techniques, and Extensions. Plenum Press, New York, 1985.
[37] H. Zha, X. He, C. Ding, H. Simon, and M. Gu. Bipartite graph partitioning and data clustering. In CIKM, 2001.
[38] K. Zhong and S. Dutt. Algorithms for simultaneous satisfaction of multiple constraints and objective optimization in a placement flow with application to congestion control. In Proceedings of DAC, pages 854-859, 2002.