
SIAM J. COMPUT.
Vol. 43, No. 2, pp. 872–904
© 2014 Society for Industrial and Applied Mathematics

MIN-MAX GRAPH PARTITIONING AND SMALL SET EXPANSION∗



NIKHIL BANSAL†, URIEL FEIGE‡, ROBERT KRAUTHGAMER‡, KONSTANTIN MAKARYCHEV§, VISWANATH NAGARAJAN¶, JOSEPH (SEFFI) NAOR∥, AND ROY SCHWARTZ§

Abstract. We study graph partitioning problems from a min-max perspective, in which an input graph on n vertices should be partitioned into k parts, and the objective is to minimize the maximum number of edges leaving a single part. The two main versions we consider are where the k parts need to be of equal size, and where they must separate a set of k given terminals. We consider a common generalization of these two problems, and design for it an O(√(log n log k)) approximation algorithm. This improves over an O(log² n) approximation for the second version due to Svitkina and Tardos [Min-max multiway cut, in APPROX-RANDOM, 2004, Springer, Berlin, 2004], and roughly an O(k log n) approximation for the first version that follows from other previous work. We also give an O(1) approximation algorithm for graphs that exclude any fixed minor. Our algorithm uses a new procedure for solving the small-set expansion problem. In this problem, we are given a graph G and the goal is to find a nonempty set S ⊆ V of size |S| ≤ ρn with minimum edge expansion. We give an O(√(log n log(1/ρ))) bicriteria approximation algorithm for small-set expansion in general graphs, and an improved factor of O(1) for graphs that exclude any fixed minor.

Key words. sparse cut, balanced cut, expansion, spreading metrics

AMS subject classification. 68Q01

DOI. 10.1137/120873996

1. Introduction. We study graph partitioning problems from a min-max perspective. Typically, graph partitioning problems ask for a partitioning of the vertex
set of an undirected graph under some problem-specific constraints on the different
parts, e.g., balanced partitioning or separating terminals, and the objective is min-
sum, i.e., minimizing the total weight of the edges connecting different parts. In the
min-max variant of these problems, the goal is different—minimize the weight of the
edges leaving a single part, taking the maximum over the different parts. A canonical example, which we consider throughout the paper, is the min-max k-partitioning problem: given an undirected graph G = (V, E) with nonnegative edge weights and k ≥ 2, partition the vertices into k (roughly) equal parts S1, . . . , Sk so as to minimize max_i δ(S_i), where δ(S) denotes the sum of edge weights in the cut (S, V \ S).

∗ Received by the editors April 18, 2012; accepted for publication (in revised form) February 13,

2013; published electronically April 29, 2014. An extended abstract of this paper appeared in Pro-
ceedings of 52nd Symposium on Foundations of Computer Science, IEEE Computer Society, Los
Alamitos, CA, 2011.
http://www.siam.org/journals/sicomp/43-2/87399.html
† Eindhoven University of Technology, 5600 MB Eindhoven, The Netherlands ([email protected]).

Work supported in part by the Netherlands Organisation for Scientific Research (NWO) grant
639.022.211.
‡ Weizmann Institute of Science, Rehovot, Israel ([email protected], robert.krauthgamer@

weizmann.ac.il). The second author was supported in part by The Israel Science Foundation (grant
873/08). The third author was supported in part by the Israel Science Foundation (grant 452/08),
the US-Israel BSF (grant 2010418), and by a Minerva grant.
§ Microsoft Research, One Microsoft Way, Redmond, WA 98052 (konstantin.makarychev@

microsoft.com, [email protected]).
¶ IBM T.J. Watson Research Center, P.O. Box 218, Yorktown Heights, NY 10598 (viswanath@

us.ibm.com).
 Computer Science Dept., Technion, Haifa, Israel ([email protected]). Supported by ISF

grant 954/11 and BSF grant 2010426.



We design for this problem a bicriteria approximation algorithm. Throughout, let w : E → R+ denote the edge weights and let n = |V|.

Min-max partitions arise naturally in many settings. Consider the following appli-
cation in the context of cloud computing, which is a special case of the general graph-
mapping problem considered in [5] (and also implicit in other previous works [36, 35,
11]). There are n processes communicating with each other, and there are k ma-
chines, each having a bandwidth capacity C. The goal is to allocate the processes to
machines in a way that balances the load (roughly n/k processes per machine), and
meets the outgoing bandwidth requirement. Viewing the processes as vertices and the
traffic between them as edge weights, we get the min-max k-partitioning problem. In
general, balanced partitioning (either min-sum or min-max) is at the heart of many
heuristics that are used in a wide range of applications, including VLSI layout, circuit
testing and simulation, parallel scientific processing, and sparse linear systems.
Balanced partitioning, particularly in its min-sum version, has been studied exten-
sively during the last two decades, with impressive results and connections to several
fields of mathematics; see, e.g., [25, 13, 26, 4, 3, 20, 22, 9]. The min-max variants,
in contrast, have received much less attention. Kiwi, Spielman, and Teng [18] formu-
lated the min-max k-partitioning problem and designed algorithms achieving absolute
bounds for special classes of graphs such as meshes and planar graphs. Prior to our
work, no approximation algorithm for the min-max k-partitioning problem was given
explicitly for general graphs,
√ and the approximation that follows from known results
is not smaller than O(k log n).1 However, Raghavendra, Steurer, and Tulsiani [31,
Theorem IV.5] recently proved that a constant factor approximation for this problem,
even for k = 2, as well as a bicriteria approximation, is SSE-hard, meaning that it
would refute the so-called SSE hypothesis (which asserts a certain promise version of
the SSE problem defined below does not have a polynomial-time algorithm). A dif-
ferent min-max partitioning problem, min-max multiway cut was studied by Svitkina
and Tardos [34], who designed an O(log2 n) approximation algorithm.
An important tool in our result above is an approximation algorithm for the Small-
Set Expansion (SSE) problem. This problem was suggested recently by Raghavendra
and Steurer [32] (see also [30, 31]) in the context of the unique games conjecture.
Recall that the edge expansion of a subset S ⊆ V with 0 < |S| ≤ ½|V| is

    Φ(S) := δ(S) / |S|.

The input to the SSE problem is an edge-weighted graph and ρ ∈ (0, ½], and the goal is to compute

    Φρ := min{ Φ(S) : |S| ≤ ρn }.
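For concreteness, the following minimal Python sketch (ours, purely illustrative; the toy graph and weights are made up) evaluates Φ(S) and computes Φρ by brute force, which is of course only feasible on very small instances.

from itertools import combinations

def cut_weight(S, w):
    # delta(S): total weight of edges with exactly one endpoint in S.
    return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))

def expansion(S, w):
    # Phi(S) = delta(S) / |S|.
    return cut_weight(S, w) / len(S)

def small_set_expansion(V, w, rho):
    # Phi_rho: minimum expansion over nonempty sets S with |S| <= rho * |V|.
    best = float("inf")
    for size in range(1, int(rho * len(V)) + 1):
        for S in combinations(V, size):
            best = min(best, expansion(set(S), w))
    return best

# Toy instance: a 6-cycle with unit edge weights; the best set of size <= 3
# is a path of three consecutive vertices, giving Phi_rho = 2/3.
V = list(range(6))
w = {(i, (i + 1) % 6): 1.0 for i in range(6)}
print(small_set_expansion(V, w, rho=1/2))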

Raghavendra, Steurer, and Tetali [30] designed for the SSE problem in d-regular graphs an algorithm that approximates the expansion of small sets within a factor of O(√((d/Φρ) log(1/ρ))), while violating the bound on |S| by no more than a constant multiplicative factor, i.e., a bicriteria approximation.2 Notice that the approximation factor depends on Φρ; this is not an issue if every small set expands well, but in general Φρ can be as small as 1/poly(n), in which case this guarantee is quite weak.

1 One could reduce the problem to the min-sum version of k-partitioning. The latter admits a bicriteria approximation of O(√(log n log k)) [20], but the reduction loses another factor of k/2. Another possibility is to repeatedly remove n/k vertices from the graph, paying again a factor of k/2 on top of the approximation in a single iteration, which is, say, O(log n) by [29].
Using Räcke’s technique of converting a graph into a tree [29], one can obtain an O(log n) approximation for SSE for any value of ρ, without violating the bound |S| ≤ ρn.3 A better (bicriteria) approximation factor of O(√(log n)) is achieved by Arora, Rao, and Vazirani [3] at the price of slightly violating the size constraint. However, this latter result works only for ρ = Ω(1). In our context of min-max problems we need a solution to the case ρ = 1/k, where k = k(n) is part of the input. Therefore, it is desirable to extend the O(√(log n)) bound of [3] to a large range of values for ρ.
1.1. Main results. Our two main results are bicriteria approximation algo-
rithms for the min-max k-partitioning and the SSE problems; here, the term bicriteria
means that the output generated by our algorithms violates the size constraints by at
most a multiplicative constant, while the cost of the solution is compared to that of
an optimal solution (that does satisfy the size constraints). The notation Oε (t) hides
multiplicative factors depending on ε, i.e., stands for O(f (ε) · t).
Definition 1.1. An (α, β) bicriteria approximate solution for the min-max k-partitioning problem is a partition S1, . . . , Sk of the input graph such that
1. max_{1≤i≤k} δ(S_i) ≤ α · OPT (where OPT is the value of an optimal solution) and
2. max_{1≤i≤k} |S_i| ≤ β · n/k.
Theorem 1.1. For every positive constant ε > 0, min-max k-partitioning admits a bicriteria approximation of (Oε(√(log n log k)), 2 + ε).
Theorem 1.1 provides a polynomial-time algorithm that with high probability outputs a partition S1, . . . , Sk such that max_i |S_i| ≤ (2 + ε)·n/k and max_i δ(S_i) ≤ O(√(log n log k))·OPT. We note that the guarantee on part size can be improved slightly to 2 − 1/k + ε.
Definition 1.2. An (α, β) bicriteria approximate solution for the SSE problem is a nonempty cut S ⊆ V such that
1. Φ(S) ≤ α · OPT (where OPT is the value of an optimal solution) and
2. |S| ≤ β · ρn.

Theorem 1.2. For every constant ε > 0, SSE admits a bicriteria approximation of (Oε(√(log n log(1/ρ))), 1 + ε).
Theorem 1.2 provides a polynomial-time algorithm that with high probability outputs a cut S ⊆ V of size 0 < |S| ≤ (1 + ε)ρn whose edge expansion is δ(S)/|S| ≤ O(√(log n log(1/ρ)))·OPT. We point out that our algorithm actually handles a more general version, called weighted SSE, which is required in the proof of Theorem 1.1. So in a sense, Theorem 1.1 is proved by reducing min-max k-partitioning to weighted SSE. We defer the precise details to section 2.
1.2. Additional results and extensions.
1.2.1. ρ-unbalanced cut. Closely related to the SSE problem is the ρ-unbalanced cut problem. The input is again a graph G = (V, E) with nonnegative edge weights and a parameter ρ ∈ (0, ½], and the goal is to find a subset S ⊆ V of size |S| = ρn that minimizes δ(S). The relationship between this problem and SSE is similar to the one between balanced cut and sparsest cut.

Definition 1.3. An (α, γ, β) bicriteria approximate solution for the ρ-unbalanced cut problem is a cut S ⊆ V such that
1. δ(S) ≤ α · OPT (where OPT is the value of an optimal solution) and
2. γ · ρn ≤ |S| ≤ β · ρn.
Theorem 1.3. For every constant 0 < ε < 1, the ρ-unbalanced cut problem admits a bicriteria approximation of (Oε(√(log n log(1/ρ))), Ω(1), 1 + ε).
Theorem 1.3 provides a polynomial-time algorithm that with high probability finds a cut S ⊆ V of size Ω(ρn) ≤ |S| ≤ (1 + ε)ρn and value δ(S) ≤ Oε(√(log n log(1/ρ)))·OPT. This result generalizes the bound of [3] from ρ = Ω(1) to any value of ρ ∈ (0, ½]. Our factor is better than the O(log n) approximation ratio that follows from [29], at the price of slightly violating the size constraint. Our algorithm actually handles a more general version, called weighted ρ-unbalanced cut, which is required in Theorem 1.1. We defer the precise details to section 2.4.
1.2.2. Min-max-multiway-cut. We also consider the min-max-multiway-cut
problem [34]. The input is an undirected graph with nonnegative edge weights and k
terminal vertices t1 , . . . , tk . The goal is to partition the vertices into k parts S1 , . . . , Sk
(not necessarily balanced), under the constraint that each part contains exactly one
terminal, so as to minimize max_i δ(S_i). Svitkina and Tardos designed an O(α log n) approximation algorithm for this problem, where α is the approximation factor known for minimum bisection. Plugging in α = O(log n), due to Räcke [29], the algorithm of Svitkina and Tardos achieves an O(log² n) approximation. Using an algorithm similar to the one in Theorem 1.1, we obtain a better approximation factor.
Theorem 1.4. Min-max-multiway-cut admits an O(√(log n log k)) approximation algorithm.
Somewhat surprisingly, we show that removing the dependence on n for min-
max-multiway-cut appears hard, even though no balance is required. This stands
in contrast to its min-sum version, known as multiway cut, which admits an O(1)
approximation [7, 17]. In particular, we show that if there is a guarantee independent
of n, then it would imply a similar (independent of n) guarantee for the min-sum
version of k-partitioning. Thus, for large but constant k = k(ε), we would get an
(O(1), O(1)) bicriteria approximation for min-sum k-partitioning,4 which is signifi-
cantly better than current state of the art approximation algorithms [3, 2, 20], and is
known, in fact, to be SSE-hard [31, Theorem IV.5].
Theorem 1.5. If there is a k^{1−ε} approximation algorithm for min-max-multiway-cut for some constant ε > 0, then there is a (k/2, γ) bicriteria approximation algorithm for min-sum k-partitioning with γ ≤ 3^{2/ε}.
1.2.3. Min-max cut. We also consider a common generalization of min-max k-
partitioning and min-max-multiway-cut, which we call min–max cut. In fact we obtain
Theorems 1.1 and 1.4 as a special case of our result for min–max cut. The input for
the min–max cut problem is an undirected graph G = (V, E) with nonnegative edge
weights, a collection of disjoint terminal sets T1 , T2 , . . . , Tk ⊆ V (possibly empty), and
parameters ρ ∈ [1/k, 1] and C, D > 0. The goal is to find (if possible) a partition
S1 , . . . , Sk of the input graph such that
1. Ti ⊆ Si ∀i ∈ [k],
2. |Si | ≤ ρn ∀i ∈ [k],
3. max_{1≤i≤k} δ(S_i) ≤ C,
4. Σ_{i=1}^k δ(S_i) ≤ D.

4 Bicriteria approximation for min-sum k-partitioning is defined analogously to Definition 1.1; see also section 5.

This problem models the aforementioned cloud computing scenario, where, in addition, certain processes are preassigned to machines (each set Ti maps to machine
i ∈ [k]). The goal is to assign the processes V to k machines according to the
preassignment and machine load constraints, while bounding both the bandwidth per
machine C and the total volume of communication D.
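To make the four requirements concrete, here is a small Python checker (our own illustrative helper, not part of the paper's algorithms) that tests whether a proposed assignment of processes to machines satisfies them.

def minmax_cut_feasible(parts, terminals, n, rho, C, D, w):
    # parts: a list of k disjoint vertex sets covering V; terminals: the sets T_i.
    # w: dict mapping edges (u, v) to nonnegative weights.
    def delta(S):
        # total weight of edges leaving S
        return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))
    for T, S in zip(terminals, parts):
        if not T <= S:                    # condition 1: T_i is contained in S_i
            return False
        if len(S) > rho * n:              # condition 2: |S_i| <= rho * n
            return False
    cuts = [delta(S) for S in parts]
    # condition 3: bandwidth per machine; condition 4: total communication volume
    return max(cuts) <= C and sum(cuts) <= D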
Theorem 1.6. There is a randomized polynomial time algorithm that given any
feasible instance of the min–max cut problem with parameters k, ρ, C, D and any con-
stant ε > 0, finds a partition S1 , . . . , Sk with the following properties:
1. Ti ⊆ Si ∀i ∈ [k];
2. |Si| ≤ (2 + ε)ρn ∀i ∈ [k];
3. E[max_{1≤i≤k} δ(S_i)] ≤ Oε(√(log n log k))·C;
4. E[Σ_{i=1}^k δ(S_i)] ≤ Oε(√(log n log k))·D.

1.2.4. Excluded-minor graphs. Finally, we obtain an improved approximation—a constant factor—for SSE in graphs excluding a fixed minor.
Theorem 1.7. For every constant ε > 0, SSE admits
1. a bicriteria approximation of (Oε(r²), 1 + ε) on graphs excluding a Kr,r-minor;
2. a bicriteria approximation of (Oε(log g), 1 + ε) on graphs of genus g ≥ 1.
These bounds extend to the ρ-unbalanced cut problem, and by plugging them into the proofs of Theorems 1.1, 1.4, and 1.6 we achieve an improved (bicriteria) approximation ratio of O(r²) for min-max k-partitioning, min-max-multiway-cut, and min–max cut in graphs excluding a Kr,r-minor, and an improved approximation ratio of O(log g) for the same problems in genus g graphs.

1.3. Techniques. For clarity, we restrict the discussion here mostly to our main application, min-max k-partitioning. Our approach has two main ingredients. First, we reduce the problem to a weighted version of SSE, showing that an α (bicriteria) approximation for the latter can be used to achieve an O(α) (bicriteria) approximation for min-max k-partitioning. Second, we design an Oε(√(log n log(1/ρ))) (bicriteria) approximation for weighted SSE (our applications will use ρ = 1/k).
The SSE problem. Let us first examine SSE, and assume for simplicity of presentation that ρ = 1/k. Note that SSE bears obvious similarity to both balanced cut and min-sum k-partitioning—its solution contains a single cut (vertex subset) with a size condition, as in balanced cut, but the size of this cut is n/k, similarly to the k pieces in min-sum k-partitioning. Thus, our algorithm is inspired by, but different from, the approximation algorithms known for these two problems [3, 20]. As in these two problems, we use a semidefinite programming (SDP) relaxation to compute an ℓ₂² metric on the graph vertices. However, new spreading constraints are needed since SSE is highly asymmetric in its nature—it contains only a single cut of size n/k. We devise a randomized rounding procedure based on the orthogonal separator technique, first introduced by Chlamtac, Makarychev, and Makarychev [10] in the context of the unique games problem. These ideas lead to an algorithm that computes a cut S of expected size |S| ≤ O(n/k) and of expected cost δ(S) ≤ O(√(log n log k)) times the SDP value. An obvious concern is that both properties occur only in expectation and might be badly correlated, e.g., the expected edge expansion E[δ(S)/|S|] might be extremely large. Nevertheless, we prove that with good probability, both |S| = O(n/k) and δ(S)/|S| is sufficiently small.


Excluded-minor and bounded-genus graphs. For SSE on excluded-minor and bounded-genus graphs, we give a constant factor approximation guarantee, by extending the notion of orthogonal separators to linear programs (LPs) and designing such “LP separators” of low distortion for these special graph families. The proof uses the probabilistic decompositions of Klein, Plotkin, and Rao [19] and Lee and Sidiropoulos [23]. We believe that this result may be of independent interest—the LP formulation for SSE is nontrivial and requires new spreading constraints. We remark that even on planar graphs, the decomposition of Räcke [29] suffers an Ω(log n) loss in the approximation guarantee, and thus does not yield an o(log n) ratio for SSE on this class of graphs.
From SSE to min-max partitioning. Several natural approaches for designing an
approximation algorithm for min-max k-partitioning fail. First, reducing the problem
to trees à la Räcke [29] is not very effective, because there might not be a single tree
in the distribution that preserves all k cuts simultaneously. Standard arguments show
that the loss might be a factor of O(k log n) in the case of k different cuts. Second, one
can try and formulate a relaxation for the problem. However, the natural linear and
semidefinite relaxations both have large integrality gaps. As a case study, consider
for a moment min-max-multiway-cut. The standard linear relaxation of Calinescu,
Karloff, and Rabani [7] was shown by Svitkina and Tardos [34] to have an integrality
gap of k/2. In Appendix A we extend this gap to the semidefinite relaxation that
includes all ℓ₂² triangle inequality constraints. A third attempt is a greedy approach
that repeatedly removes from the graph, using SSE, pieces of size Θ(n/k). However,
by removing the “wrong” vertices from the graph, this process might arrive at a
subgraph in which every cut of Θ(n/k) vertices has edge weight greater by a factor
of Θ(k) than the original optimum (see Appendix B for details). Thus, a different
approach is needed.
Our approach. Our approach is to use multiplicative weight updates on top of the algorithm for weighted SSE. This yields a collection S of sets S, each of size |S| = Θ(n/k) and cost δ(S) ≤ O(√(log n log k))·OPT, that covers every vertex v ∈ V at least Ω(|S|/k) times. Alternatively, by assigning each set S ∈ S a fractional value of k/|S|, we can view this as a fractional covering of vertices by valid configurations, where valid configurations correspond to sets S with |S| = Θ(n/k) and δ(S) = O(√(log n log k))·OPT.
Next, we randomly sample sets S1, . . . , St from S until V is covered, and derive a partition given by P1 = S1, P2 = S2 \ S1, and in general Pi = Si \ (∪_{j<i} Sj). This method of converting a cover into a partition was previously used in [1, Theorem 2], although it is somewhat counterintuitive, since the sets Pi may have a very large cost δ(Pi). We use the fact that S1, . . . , St are chosen randomly (and in particular are in random order) to show that the total expected boundary of the partition is not very large, i.e.,

    E[ Σ_i δ(P_i) ] ≤ O(k·√(log n log k))·OPT.

Then, we start fixing the partition by the following local operation: find a Pi violating the constraint δ(Pi) ≤ O(√(log n log k))·OPT, replace it with the (unique) Si which defined it, and adjust all other sets Pj accordingly. Somewhat surprisingly, we prove that this local fixing procedure terminates quickly. Finally, the resulting partition consists of sets Pi, each of which satisfies all the necessary properties.


However, the number of these sets might be very large, i.e., much larger than k. Thus, the last step of the rounding is to merge small sets together. We show that this can be done while maintaining simultaneously the constraints on the sizes and on the costs of the sets.
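The following Python sketch (names and interfaces are ours, for illustration only; the final merging of small sets is omitted, and the paper proves termination of the fixing loop only for the covers it actually constructs) mirrors this rounding: it samples sets from a given covering family until V is covered, forms the parts Pi = Si \ (S1 ∪ · · · ∪ Si−1), and then applies the local fixing step, which amounts to moving the offending Si to the front of the sampling order.

import random

def cut_weight(S, w):
    return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))

def induced_partition(sets):
    # P_i = S_i minus everything that appears in earlier sets.
    parts, seen = [], set()
    for S in sets:
        parts.append(S - seen)
        seen |= S
    return parts

def cover_to_partition(V, cover, w, bound):
    # Step 1: sample sets from the covering family (assumed to cover V) until
    # every vertex is covered.
    chosen, uncovered = [], set(V)
    while uncovered:
        S = set(random.choice(cover))
        chosen.append(S)
        uncovered -= S
    parts = induced_partition(chosen)
    # Step 2: local fixing. If delta(P_i) exceeds the target bound, replace P_i by
    # the set S_i that defined it; removing S_i from all other parts is equivalent
    # to recomputing the partition with S_i moved to the front of the order.
    while True:
        bad = next((i for i, P in enumerate(parts) if cut_weight(P, w) > bound), None)
        if bad is None:
            break
        chosen.insert(0, chosen.pop(bad))
        parts = induced_partition(chosen)
    return [P for P in parts if P]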
Organization. We first show in section 2 how to approximate weighted SSE in
both general and excluded-minor graphs, and then show (section 2.4) that an ap-
proximation algorithm for weighted SSE also yields an approximation algorithm for
weighted ρ-unbalanced cut. In section 3 we present an approximation algorithm for
min-max k-partitioning that uses the aforementioned algorithm for ρ-unbalanced cut.
The common generalization of both min-max k-partitioning and min-max-multiway-
cut, min–max cut, appears in section 4. Finally, Theorem 1.5, the hardness result for
min-max-multiway-cut, is proved in section 5.

2. Approximation algorithms for small-set expansion. In this section we design approximation algorithms for the SSE problem. Our main result is for general
graphs and uses an SDP relaxation. It actually holds for a slight generalization of the
problem, where expansion is measured with respect to vertex weights (see Definition
2.1 and Theorem 2.1). This generality is useful later in using this algorithm as a
subroutine for min-max partitioning. We further obtain improved approximation for
certain graph families such as planar graphs (see section 2.3).
To simplify notation, we shall assume that vertex weights are normalized: we consider measures μ and η on a vertex set V with μ(V) = η(V) = 1. For u ∈ V, we denote μ(u) = μ({u}) and η(u) = η({u}). We let (V, w) denote a complete (undirected) graph on vertex set V with edge weight w(u, v) = w(v, u) ≥ 0 for every pair of vertices u ≠ v in V. In our context, such (V, w) can easily model a specific edge set E, by simply setting w(u, v) = 0 for every nonedge (u, v) ∉ E. Recall that we denote

    δ(S) := Σ_{u∈S, v∈V\S} w(u, v)

to be the total weight of edges crossing the cut (S, V \ S).


Definition 2.1 (weighted SSE). Let G = (V, w) be an edge-weighted graph, and let μ and η be two measures on the vertex set V with μ(V) = η(V) = 1. The weighted SSE of G with respect to ρ ∈ (0, 1/2] is

    Φρ,μ,η(G) := min{ δ(S)/η(S) : η(S) > 0, μ(S) ≤ ρ }.

Intuitively, η in the above definition is used to measure the sparsity of the cut;
thus, the objective is to minimize δ(S)/η(S) and it is required that η(S) > 0 (i.e., S
is not an empty cut with respect to η). On the other hand, μ is used to measure the
size of S, hence, it is required that μ(S) is not too large. Note that when both η and
μ are uniform measures, we obtain the usual SSE problem.
Throughout, we restrict attention to cases where Φρ,μ,η is defined (excluding, e.g.,
the case where μ, η are supported on the same single vertex). Part I of the following
theorem immediately implies Theorem 1.2 since we can choose both η and μ to be
the uniform measures.
Theorem 2.1 (approximating SSE). I. For every fixed ε > 0, there is a polynomial-time algorithm that given as input an edge-weighted graph G = (V, w), two measures μ and η on V (μ(V) = η(V) = 1), and some ρ ∈ (0, 1/2], finds a set S ⊆ V satisfying η(S) > 0, μ(S) ≤ (1 + ε)ρ, and

    δ(S)/η(S) ≤ D · Φρ,μ,η(G),

where D = Oε(√(log n log(1/ρ))) and n = |V|.
II. Furthermore, when the input contains also a parameter H ∈ (0, 1), there is an algorithm to find a set S ⊆ V satisfying η(S) ∈ [Ω(H), 2(1 + ε)H], μ(S) ≤ (1 + ε)ρ, and

    δ(S)/η(S) ≤ D · min{ δ(T)/η(T) : η(T) ∈ [H, 2H], μ(T) ≤ ρ },

where D = Oε(√(log n log(max{1/ρ, 1/H}))).
Remark 2.2. Let us clarify why we need part II of the theorem. In section 3.1, we repeatedly apply this theorem to generate a family of sets S ∈ S that uniformly covers all vertices in V. The measure μ will be uniform and ρ = 1/k, corresponding to the requirement that each part in min-max k-partitioning contains n/k vertices. Each set S ∈ S must satisfy two conditions: μ(S) ≤ ρ and δ(S) ≤ D·OPT. The second measure η is used to ensure that we uniformly cover all vertices: we dynamically update η (starting with the uniform measure) so that yet uncovered vertices have larger weight. We need the lower bound η(S) ≥ Ω(H) to ensure adequate new coverage. We also need an upper bound on η(S) to ensure that δ(S) is small:

    δ(S) ≤ D · η(S) · OPT/H ≤ 2(1 + ε)·D·OPT.
We prove part I of the theorem in section 2.1, and part II in section 2.2. These
algorithms require the following notion of m-orthogonal separators due to Chlamtac,
Makarychev, and Makarychev [10].
Definition 2.3 (orthogonal separators). Let X be an ℓ₂² metric (i.e., a finite collection of vectors satisfying the ℓ₂² triangle inequalities) and m > 0. A distribution over subsets S ⊆ X is called an m-orthogonal separator of X with distortion D > 0, probability scale α > 0, and separation threshold 0 < β < 1, if the following conditions hold:
1. for all u ∈ X we have Pr(u ∈ S) = α‖u‖²;
2. for all u, v ∈ X with ‖u − v‖² ≥ β·min{‖u‖², ‖v‖²},

    Pr(u ∈ S and v ∈ S) ≤ min{Pr(u ∈ S), Pr(v ∈ S)} / m;

3. for all u, v ∈ X we have Pr(IS(u) ≠ IS(v)) ≤ αD·‖u − v‖², where IS is the indicator function for the set S.
An orthogonal separator is a distribution over subsets of vectors which has three properties. The first property states that the probability that a vector u is chosen by this distribution is proportional to its squared norm. The second property intuitively states that for any two vectors u and v which are far from each other, the event that u is in a random subset is almost independent of the event that v is in a random subset. The third and last property states that the probability that two vectors u and v are separated by a random subset is upper bounded by a quantity proportional to the ℓ₂² distance between these two vectors.


Theorem 2.2 (see [10]). There exists a polynomial-time randomized algorithm that, given a set of vectors X which includes the origin and satisfies the ℓ₂² triangle inequality constraints, a positive number m, and 0 < β < 1, generates an m-orthogonal separator with distortion D = Oβ(√(log |X| log m)) and scale α ≥ 1/p(|X|) for some polynomial p.
The second requirement in the definition of orthogonal separators was slightly different originally (in [10]); however, exactly the same algorithm and proof works in our case.5
2.1. Approximation Algorithm I: Small-set expansion in general graphs. In this section we prove part I of Theorem 2.1.
SDP relaxation. In our relaxation we introduce a vector v̄ for every vertex v ∈ V. The objective is to minimize the total weight of cut edges:

    min Σ_{(u,v)∈E} w(u, v)·‖ū − v̄‖².

We could constrain all vectors v̄ to have length at most 1, i.e., ‖v̄‖² ≤ 1, but it turns out our algorithm never uses this constraint. We require that the vectors {v̄ : v ∈ V} ∪ {0}, i.e., all the vectors corresponding to V and the origin, satisfy the ℓ₂² triangle inequalities. For every u, v, w ∈ V:

    ‖ū − w̄‖² ≤ ‖ū − v̄‖² + ‖v̄ − w̄‖²,
    ‖ū‖² ≤ ‖ū − v̄‖² + ‖v̄‖²,
    ‖ū − w̄‖² ≤ ‖ū‖² + ‖w̄‖².

Note that any valid integral solution (where each vector is either zero or a fixed unit vector) satisfies these conditions. Suppose now that we have approximately guessed the measure H of the optimal solution, H ≤ η(S) ≤ 2H (this step is not necessary but it simplifies the exposition). This can be done since the measure of every set S lies in the range from η(u) to n·η(u), where u is the heaviest element in S; hence H can be chosen from the set {2^t·η(u) : u ∈ V, t = 0, . . . , ⌈log₂ n⌉} of size O(n log n). Therefore, we add a constraint:

(1)    Σ_{v∈V} ‖v̄‖²·η(v) ≥ H.

We also ignore all vertices v ∈ V with η(v) > 2H, since they do not participate in the optimal solution, and thus write the constraint (equivalent to removing them from the graph)

(2)    v̄ = 0, whenever η(v) > 2H.

Finally, we introduce new spreading constraints which state that for every u ∈ V,

(3)    Σ_{v∈V} μ(v)·min{‖ū − v̄‖², ‖ū‖²} ≥ (1 − ρ)·‖ū‖².

5 Let ‖u − v‖² ≥ β‖u‖² and assume w.l.o.g. that ‖u‖² ≤ ‖v‖². Using arithmetic one can prove that ⟨u, v⟩ ≤ (1 − β/2)‖v‖². Applying Lemma 4.1 in [10] implies that ⟨ϕ(u), ϕ(v)⟩ ≤ (1 − β/2). Hence, ‖ϕ(u) − ϕ(v)‖² ≥ β > 0 and using Corollary 4.6 in [10] finishes the proof since ‖ψ(u) − ψ(v)‖ ≥ 2γ = √β/4 > 0.


We remark that one could alternatively use a slightly simpler, almost equivalent, constraint

    Σ_{v∈V} ⟨ū, v̄⟩·μ(v) ≤ ρ·‖ū‖².

We chose to use the former formulation because it has an analogous constraint in linear programming; see section 2.3.
The SDP relaxation used in our algorithm is presented below in its entirety. We also note that the second constraint can be equivalently written as ⟨ū, v̄⟩ ≤ ‖ū‖², and the third constraint can be equivalently written as ⟨ū, v̄⟩ ≥ 0.


    min  Σ_{(u,v)∈E} w(u, v)·‖ū − v̄‖²

    s.t.  ‖ū − w̄‖² + ‖w̄ − v̄‖² ≥ ‖ū − v̄‖²    ∀ u, v, w ∈ V,
          ‖ū − v̄‖² ≥ ‖ū‖² − ‖v̄‖²    ∀ u, v ∈ V,
          ‖ū‖² + ‖v̄‖² ≥ ‖ū − v̄‖²    ∀ u, v ∈ V,
          Σ_{v∈V} μ(v)·min{‖ū − v̄‖², ‖ū‖²} ≥ (1 − ρ)·‖ū‖²    ∀ u ∈ V,
          v̄ = 0    whenever η(v) > 2H,
(4)       Σ_{v∈V} η(v)·‖v̄‖² ≥ H.

Lemma 2.3. Assuming a correct guess of H, the optimal value of SDP (4) is at most 2H · Φρ,μ,η(G).
Proof. Let S∗ be an optimal solution to the SSE problem, i.e., Φρ,μ,η(G) = δ(S∗)/η(S∗). Construct the following solution to SDP (4): if v ∈ S∗ set v̄ = e for some fixed unit vector e, and if v ∉ S∗ set v̄ = 0. Assuming that H was guessed correctly (i.e., H ≤ η(S∗) ≤ 2H), each vertex v ∈ S∗ has η(v) ≤ 2H, and constraint (1) is also satisfied. It is clear that all the ℓ₂² triangle inequality constraints with the origin are satisfied. Let us focus now on the spreading constraints. If u ∈ S∗, then ‖ū‖² = 1 and the sum in the spreading constraint equals μ(V \ S∗) ≥ 1 − ρ. Otherwise, if u ∉ S∗, then ū = 0 and both sides of the spreading constraint equal 0. Hence, we can conclude that the solution we defined for (4) is feasible.
Note that for every edge (u, v) ∈ E, if the edge crosses the cut defined by S∗ then it contributes its weight w(u, v) to the sum in the objective of (4). Otherwise, its contribution is 0. Thus, δ(S∗) equals the objective value of this SDP solution and the lemma follows.
We now describe the approximation algorithm.
Approximation Algorithm I. We first informally describe the main idea behind the algorithm. The algorithm solves the SDP relaxation and obtains a set of vectors {ū}_{u∈V}, whose value we denote by SDP. Now it samples an orthogonal separator, a random set S ⊆ V, and returns it. Assume for the moment that the probability scale α equals 1. Since Pr(v ∈ S) = ‖v̄‖², we get E[η(S)] ≥ H.

The expected size of the cut is at most D·SDP by the third property of orthogonal separators; and thus the ratio of expectations

    E[δ(S)] / E[η(S)] ≤ D·SDP/H ≤ 2D·Φρ,μ,η(G).
We will show that the expected ratio is also bounded by a similar quantity. Moreover,
using the second property of orthogonal separators and the spreading constraints, we
can show that μ(S) ≤ (1 + ε)ρ. We now proceed to the formal argument.
We may assume that ε is sufficiently small, i.e., ε ∈ (0, 1/4). The approximation algorithm guesses a value H satisfying H ≤ η(S) ≤ 2H. Then, it solves the SDP while further constraining ū = 0 for every u ∈ V with η(u) > 2H, to obtain a set of vectors X = {v̄}_{v∈V}. After that, it samples an orthogonal separator S with m = ε⁻¹ρ⁻¹ and β = ε (from Theorem 2.2). For convenience, we let S be the set of vertices corresponding to vectors belonging to the orthogonal separator rather than the vectors themselves. The algorithm repeats the previous step ⌈α⁻¹n²⌉ times (recall α is the probability scale of the orthogonal separator). It outputs the set S having minimum δ(S)/η(S) value among those satisfying 0 < μ(S) < (1 + 10ε)ρ. If no S satisfies this constraint (which we shall see happens with exponentially small probability), the algorithm outputs an arbitrary set satisfying the constraints.
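In outline, and with the SDP solver and the orthogonal-separator sampler treated as black boxes (solve_sdp and sample_orthogonal_separator below are assumed subroutines of our own naming, not a specific library API), the algorithm just described can be sketched in Python as follows.

import math

def cut_weight(S, w):
    return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))

def approx_algorithm_one(V, w, mu, eta, rho, H, eps,
                         solve_sdp, sample_orthogonal_separator):
    # mu(S), eta(S): the two vertex measures, given as set functions.
    # solve_sdp returns the SDP (4) vectors (with eta(v) > 2H vertices zeroed).
    vectors, sdp_value = solve_sdp(V, w, mu, eta, rho, H)
    m, beta = 1.0 / (eps * rho), eps
    trials = len(V) ** 2    # stand-in for the paper's ceil(n^2 / alpha) repetitions
    best, best_ratio = None, math.inf
    for _ in range(trials):
        S = sample_orthogonal_separator(vectors, m, beta)   # random subset of V
        if not S or eta(S) == 0 or mu(S) >= (1 + 10 * eps) * rho:
            continue
        ratio = cut_weight(S, w) / eta(S)
        if ratio < best_ratio:
            best, best_ratio = S, ratio
    return best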
Analysis. We first estimate the probability of the event {u ∈ S and μ(S) < (1 + 10ε)ρ} for a fixed vertex u ∈ V. Let Au = {v : ‖ū − v̄‖² ≥ β‖ū‖²} be all the vertices which are far from u and Bu = {v : ‖ū − v̄‖² < β‖ū‖²} all the vertices which are close to u. Assume for now that u ∈ S, and let us show that only a small fraction of Au is likely to belong to S, and that the entire set Bu has small measure μ(Bu). Recall that u satisfies the spreading constraint,

(5)    Σ_{v∈V} min{‖ū − v̄‖², ‖ū‖²}·μ(v) ≥ (1 − ρ)·‖ū‖².

The left-hand side above is at most β‖ū‖²·μ(Bu) + ‖ū‖²·μ(Au). Since μ(Au) = 1 − μ(Bu), this implies that μ(Bu) ≤ ρ/(1 − β) = ρ/(1 − ε) ≤ (1 + 2ε)ρ.
For an arbitrary v ∈ Au (for which v̄ ≠ 0) we have ‖ū − v̄‖² ≥ β‖ū‖² ≥ β·min(‖ū‖², ‖v̄‖²). By the second property of orthogonal separators, Pr(v ∈ S | u ∈ S) ≤ 1/m = ερ, and thus the expected measure is E[μ(Au ∩ S) | u ∈ S] ≤ ερ. Now, still conditioning on u ∈ S, the event {μ(S) ≥ (1 + 10ε)ρ} implies the event {μ(Au ∩ S) ≥ 8ερ}, because μ(Bu ∩ S) ≤ μ(Bu) ≤ (1 + 2ε)ρ. Thus by Markov's inequality, Pr[μ(S) ≥ (1 + 10ε)ρ | u ∈ S] is at most 1/8.
Recall that each vertex u ∈ V belongs to S with probability α‖ū‖². Hence, with probability at least (7/8)·α‖ū‖², it holds that both u ∈ S and μ(S) < (1 + 10ε)ρ.
Finally, we use the third property of orthogonal separators to bound the cost of the cut δ(S),

(6)    E[δ(S)] = Σ_{(u,v)∈E} E[|IS(u) − IS(v)|]·w(u, v) ≤ αD·Σ_{(u,v)∈E} ‖ū − v̄‖²·w(u, v) = αD·SDP,

where D = Oβ(√(log n log m)) ≤ Oε(√(log n log(1/ρ))) is the distortion of m-orthogonal separators.
Consider the function f:

    f(S) = η(S) − δ(S)·H/(4D·SDP)    if S ≠ ∅ and μ(S) < (1 + 10ε)ρ,
    f(S) = 0                         otherwise.


As the event {u ∈ S and μ(S) < (1 + 10ε)ρ} holds for each vertex u with probability at least (3/4)·α‖ū‖², and by (6) we have E[δ(S)] ≤ αD·SDP, we can bound the expectation as

    E[f(S)] ≥ Σ_{u∈V} (3/4)·α‖ū‖²·η(u) − αH/4 ≥ αH/2.

The second inequality uses the last SDP constraint, Σ_{u∈V} ‖ū‖²·η(u) ≥ H. The random variable f(S) is always upper bounded by 2nH, because by the SDP constraints, ū = 0 whenever η(u) > 2H. We note that the zeroing of all u ∈ V such that η(u) > 2H is not essential to the algorithm. However, this zeroing enables us to obtain a better bound on the expected value of f, which affects only the number of samples of orthogonal separators the algorithm needs to perform.
Thus, Pr[f(S) > 0] ≥ α/(4n). Therefore, after ⌈α⁻¹n²⌉ iterations, with probability exponentially close to 1, the algorithm will find S with f(S) > 0. The latter implies that η(S) > 0, μ(S) < (1 + 10ε)ρ, and

    δ(S)/η(S) ≤ 4D·SDP/H.

This finishes the proof of Theorem 2.1 part I, since SDP/(2H) ≤ Φρ,μ,η(G).
2.2. Approximation Algorithm II: Small-set expansion in general graphs.
We now prove part II of Theorem 2.1.
SDP relaxation. Our SDP relaxation is similar to part I; namely, we use SDP (4) with the following additional constraints. Recall that Approximation Algorithm II gets H (an approximate value of η(S) in the optimal solution) as input, and thus does not need to guess it. First, we add a spreading constraint also on the η-measure of the solution

(7)    Σ_{v∈V} min{‖ū − v̄‖², ‖ū‖²}·η(v) ≥ (1 − 2H)·‖ū‖²    ∀ u ∈ V.

Second, we complement the two spreading constraints using different normalizations

(8)    ‖v̄‖² ≤ 1    ∀ v ∈ V,

(9)    Σ_{v∈V} ‖v̄‖²·μ(v) ≤ ρ.

It is not difficult to verify (analogously to Lemma 2.3) that this SDP is indeed a
relaxation of the problem stated in Theorem 2.1 part II.
Approximation Algorithm II. The algorithm solves the SDP relaxation and then
executes several iterations. Each iteration computes a vertex subset S by using a
variant of Approximation Algorithm I from section 2.1 detailed below, and then “re-
moves” this S from the graph G and the SDP solution, by zeroing the weight of edges
incident to S (i.e., discarding these edges) and zeroing the SDP vectors corresponding
to vertices in S (i.e., setting them to the origin 0). The algorithm maintains the ver-
tices removed so far in a set T ⊂ V , by initializing T = ∅, and then at each iteration
updating T = T ∪ S. The iterations repeat until either μ(T ) ≥ ρ/4 or η(T ) ≥ H/4,
at which point the algorithm returns a vertex subset F determined as follows: if
μ(T ) ≤ ρ and η(T ) ≤ H, then F = T ; otherwise, F = S (as computed in the last
iteration).


To perform a single iteration, repeatedly sample an orthogonal separator S, by applying Theorem 2.2 on the current SDP vectors with m = 1/(ερ) and β = ε, until obtaining a set S with a positive f′ value according to the definition in (10) below. (Alternatively, one can return an arbitrary feasible S if the number of samples exceeds a certain threshold.)
Remark 2.4. To handle terminals in the extended version of the problem (see section 4), we guess a single terminal u that belongs to the optimal solution S (if any), and set ‖ū‖² = 1 and v̄ = 0 for all other terminals v. Since an orthogonal separator never contains the zero vector, the returned solution F will always contain at most one terminal.
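The iterative-removal structure of this algorithm is summarized by the Python sketch below; sample_good_set stands in for the inner sampling loop (repeat until f′(S) > 0 for the current, partially zeroed, SDP solution), and all names are ours.

def approx_algorithm_two(V, w, mu, eta, rho, H, sample_good_set):
    # active: edge weights that have not yet been zeroed out.
    active = dict(w)
    T = set()
    while True:
        S = sample_good_set(active)      # assumed subroutine: a set with f'(S) > 0
        # "Remove" S: discard edges incident to S (its SDP vectors are zeroed as well).
        active = {(u, v): wt for (u, v), wt in active.items()
                  if u not in S and v not in S}
        T |= S
        if mu(T) >= rho / 4 or eta(T) >= H / 4:
            break
    # Return the accumulated set T if it is still small enough, else the last S.
    return T if (mu(T) <= rho and eta(T) <= H) else S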
Analysis. Let us examine the effect of the algorithm's changes to the SDP solution (by zeroing some vectors and removing edges). Notice that the objective value of the SDP may only decrease, and that constraints (8), (9) and the triangle inequalities clearly remain satisfied. Constraint (1) may get violated, but not by too much; the removed vertices have total measure η(T) ≤ H/4, and due to (8) we still have the slightly weaker constraint

(1′)    Σ_{u∈V} ‖ū‖²·η(u) ≥ 3H/4.

Let us show that the spreading constraints (3) and (7) are still satisfied. Consider the constraint (3) for a given vertex u ∈ V:

    Σ_{v∈V} μ(v)·min{‖ū − v̄‖², ‖ū‖²} ≥ (1 − ρ)·‖ū‖².

If vertex u ∈ T, the constraint holds trivially since the right-hand side is 0. If u ∉ T, we claim that zeroing some vectors v̄ can only increase the left-hand side: if v ∉ T, then the term min{‖ū − v̄‖², ‖ū‖²} does not change; and if v ∈ T, then it is replaced with min{‖ū − 0‖², ‖ū‖²} = ‖ū‖², which is at least as large. The same argument applies to constraint (7).
Recall that Approximation Algorithm II repeatedly samples S from the orthogonal separator until obtaining a set S with f′(S) > 0, where the function f′ is defined (analogously to f in Approximation Algorithm I) by

(10)    f′(S) = η(S) − δ(S)·H/(4D·SDP) − (μ(S)/(4ρ))·H    if S ≠ ∅, μ(S) < (1 + 10ε)ρ, and η(S) ≤ 2(1 + 10ε)H,
        f′(S) = 0                                          otherwise.

Notice that f′ has an extra term and an extra condition involving η(S) when compared to f.
Consider now a single sampling of S from the orthogonal separator. The same argument as before shows, using constraints (5) and (7), that for every given u ∈ V,

    Pr[ μ(S) ≤ (1 + 10ε)ρ and η(S) ≤ 2(1 + 10ε)H | u ∈ S ] ≥ 3/4.

Using the new constraint (9), we get E[μ(S)] ≤ αρ, and recalling the bound (6) on E[δ(S)], we obtain altogether

    E[f′(S)] ≥ Σ_{u∈V} (3/4)·α‖ū‖²·η(u) − αH/4 − αH/4 ≥ (3α/4)·(3H/4) − αH/2 ≥ αH/16,

where the second inequality used (1′). As before, we can bound Pr[f′(S) > 0] ≥ Ω(α/n), because with probability 1 we have f′(S) ≤ η(S) ≤ 2nH (recall that ū = 0 whenever η(u) > 2H). So after O(n²/α) samples, with probability that is exponentially close to 1, the algorithm finds a set of vertices S satisfying f′(S) > 0.6
At every iteration we have f′(S) > 0, which implies that

(11)    δ(S) ≤ 4D·(SDP/H)·η(S),
(12)    η(S) ≥ H·μ(S)/(4ρ),

and thus also (throughout the iterations)

(13)    η(T) ≥ H·μ(T)/(4ρ).

Let T′ denote the set T at the end of the algorithm, and let T″ denote the penultimate set T. Also let S′ denote the last set S; so T′ = T″ ∪ S′. Note that μ(T″) < ρ/4 and η(T″) < H/4 by the iterations' stopping condition. In addition, either μ(T′) ≥ ρ/4 or η(T′) ≥ H/4, but in both cases η(T′) ≥ H/16 follows from (13). We consider the two possible outputs.
• F = T′. This implies μ(T′) ≤ ρ and η(T′) ≤ H, and also η(T′) ≥ H/16 from above.
• F = S′. This means μ(T′) > ρ or η(T′) > H, which imply, respectively, μ(S′) > (3/4)ρ or η(S′) > (3/4)H; in either case, η(S′) > 3H/16 follows from (12). Also, our algorithm ensures that μ(S′) ≤ (1 + 10ε)ρ and η(S′) ≤ 2(1 + 10ε)H.
We thus see that either output F satisfies μ(F) ≤ (1 + 10ε)ρ, η(F) ≤ 2(1 + 10ε)H, and η(F) ≥ H/16. Finally, inequality (11) holds for every set S added, and thus a similar inequality holds for the output F as well. This completes the proof of Theorem 2.1 part II.
2.3. Small-set expansion in minor-closed graph families. In this subsec-
tion we prove Theorem 1.7. We start by writing the LP relaxation (14). For every
vertex u ∈ V we introduce a variable x(u) taking values in [0, 1]; and for every pair of
vertices u, v ∈ V we introduce a variable z(u, v) = z(v, u) also taking values in [0, 1].
The intended integral solution corresponding to a set S ⊆ V has x(u) = 1 if u ∈ S, and x(u) = 0 otherwise; z(u, v) = |x(u) − x(v)|. Intuitively, x(u) is the distance to a fictional vertex o that never belongs to S. The analogous object in the SDP relaxation (4) is the origin, and indeed, it is instructive to think of x(u) as an analogue of ‖ū‖² and of z(u, v) as an analogue of ‖ū − v̄‖².
It is easy to verify that LP (14) below is a relaxation of the SSE problem. The
first three constraints say that z(u, v) is a metric (strictly speaking, a semimetric), by
requiring the triangle inequality, even with the fictional vertex o. The novelty of the
LP is in the fourth constraint, which is a new spreading constraint for ensuring that
the size of S is small.
6 In fact, now f′(S) ≤ 2(1 + 10ε)H, thus Pr[f′(S) > 0] ≥ Ω(α/H) and we need only O(n/(αH)) iterations.



    min  Σ_{(u,v)∈E} w(u, v)·z(u, v)

    s.t.  z(u, v) + z(v, w) ≥ z(u, w)    ∀ u, v, w ∈ V,
          |x(u) − x(v)| ≤ z(u, v)    ∀ u, v ∈ V,
          x(u) + x(v) ≥ z(u, v)    ∀ u, v ∈ V,
          Σ_{v∈V} μ(v)·min{x(u), z(u, v)} ≥ (1 − ρ)·x(u)    ∀ u ∈ V,
          x(v) = 0    whenever η(v) > 2H,
          Σ_{v∈V} η(v)·x(v) ≥ H,
(14)      x(u), z(u, v) ∈ [0, 1]    ∀ u, v ∈ V.

We introduce an analogue of m-orthogonal separators for linear programming, which we call LP separators.
Definition 2.5 (LP separator). Let G = (V, E) be a graph, and let {x(u),
z(u, v)}u,v∈V be a set of numbers. We say that a distribution over subsets S ⊆ V is
an LP separator of V with distortion D ≥ 1, probability scale α > 0, and separation
threshold β ∈ (0, 1) if the following conditions hold:
1. for all u ∈ V we have Pr(u ∈ S) = α·x(u);
2. for all u, v ∈ V with z(u, v) ≥ β·min{x(u), x(v)} we have Pr(u ∈ S and v ∈ S) = 0;
3. for all (u, v) ∈ E we have Pr(IS(u) ≠ IS(v)) ≤ αD·z(u, v), where IS is the indicator function for the set S.
Below we present an efficient algorithm for generating an LP separator. The
input to this algorithm is a graph G = (V, E) that excludes Kr,r minors, a parameter
β ∈ (0, 1), and numbers {x(u)}u∈V , {z(u, v)}u,v∈V satisfying the first three constraints
in LP (14) (namely, the triangle inequality even with the fictional vertex o). The
algorithm then computes a sample of an LP separator with distortion O(r2 ). For
genus g graphs, the argument is similar and gives distortion O(log g).
Using this algorithm for LP separators, Theorem 1.7 is proved in the following
way. Replace in Approximation Algorithm I and Approximation Algorithm II above
the SDP relaxation (4) with the LP relaxation (14), and the orthogonal separators
with LP separators. This provides an O(r2 ) approximation algorithm for SSE in Kr,r
excluded-minor graphs.
Computing LP separators. We now describe an algorithm that samples an LP
separator (see Definition 2.5) with respect to a feasible solution to LP (14). We recall
a standard notion of low-diameter decomposition of a metric space; see, e.g., [6, 16, 21]
and references therein.
Let (V, d) be a finite metric space. Given a partition P of V and a point v ∈ V ,
we refer to the elements of P as clusters, and let P (v) denote the unique cluster S ∈ P
that contains v. A stochastic decomposition of this metric is a probability distribution
ν over partitions P of V .
Definition 2.6 (separating decomposition). Let D, Δ > 0. A stochastic de-
composition ν of a finite metric space (V, d) is called a D-separating Δ-bounded
decomposition if it satisfies


1. for every partition P ∈ supp(ν) and every cluster S ∈ P,

    diam(S) := max_{u,v∈S} d(u, v) ≤ Δ;

2. for every u, v ∈ V, the probability that a partition P sampled from ν separates them is

    Pr_{P∼ν}[ P(u) ≠ P(v) ] ≤ D·d(u, v)/Δ.
Theorem 2.4 (see [19, 33, 14]). Let G = (V, E) be a graph excluding Kr,r as a minor, equipped with nonnegative edge lengths, and let Δ > 0. Then there exists an O(r²)-separating Δ-bounded decomposition ν of the shortest-path metric dG on G and a polynomial-time algorithm that samples a partition from ν.
Lee and Sidiropoulos [23] similarly show that graphs with genus g ≥ 1 admit an O(log g)-separating decomposition. Alternative algorithms for both cases are shown in [21].
Theorem 2.5. There exists an algorithm that, given a graph G = (V, E) that excludes Kr,r minors, numbers {x(u)}_{u∈V}, {z(u, v)}_{u,v∈V} satisfying the first three constraints of LP (14), and a parameter β ∈ (0, 1), returns an LP separator with distortion D = O(r²/β), probability scale α = Ω(1/|V|), and separation threshold β.
Proof. We slightly modify the graph G by adding a new vertex o and edges {(o, v) : v ∈ V}; let V′ = V ∪ {o} and E′ = E ∪ {(o, v) : v ∈ V}. Note that the modified graph (V′, E′) excludes the minor Kr+1,r+1, so we can still apply Theorem 2.4. We set the length of every edge (u, v) ∈ E′ to be

    ℓ(u, v) := z(u, v) if u, v ∈ V,    and    ℓ(u, v) := x(u) if u ∈ V and v = o.

For ease of notation, let the modified graph (V′, E′) also be called G, and let dG denote the shortest-path metric according to these edge lengths. It follows from the LP's triangle inequalities that z(u, v) ≤ dG(u, v) for all u, v ∈ V′. The following algorithm outputs a subset S ⊂ V which is an LP separator.
1. Choose δ ∈ [0, 1] uniformly at random, and sample a random partition Pδ
from Theorem 2.4 applied to the metric (V  , dG ) and parameter Δ = βδ/2.
2. Let S1 , . . . , St ∈ Pδ denote the clusters that contain at least one vertex u ∈ V
with x(u) ∈ [δ, 2δ]. Clearly t ≤ |V |.
3. Output cluster S = Si with probability 1/|V | for each i ∈ {1, . . . , t}; with the
remaining probability 1 − t/|V | output the empty cluster S = ∅.
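A Python rendering of these three steps might look as follows; sample_separating_decomposition is an assumed implementation of the decomposition of Theorem 2.4 (not a specific library), and the intersection with V simply discards the fictional vertex o.

import random

def sample_lp_separator(V, x, d_G, beta, sample_separating_decomposition):
    # x: dict of LP values x(u); d_G: shortest-path metric of the augmented graph.
    delta = random.uniform(0.0, 1.0)
    partition = sample_separating_decomposition(d_G, beta * delta / 2)
    # Keep the clusters containing some vertex u in V with x(u) in [delta, 2*delta].
    candidates = [C for C in partition
                  if any(u in x and delta <= x[u] <= 2 * delta for u in C)]
    # Output each candidate cluster with probability 1/|V|; otherwise the empty set.
    if candidates and random.random() < len(candidates) / len(V):
        return set(random.choice(candidates)) & set(V)
    return set()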
Let us show that the output cluster S is an LP separator with α = 1/(2|V|) and D = O(r²/β). For the first requirement of an LP separator, we see that for every u ∈ V,

    Pr[u ∈ S] ≥ Pr[ x(u)/2 ≤ δ ≤ x(u) ] · (1/|V|) = x(u)/(2|V|) = α·x(u).
Observe that, conditioned on δ, if the output cluster S ≠ ∅ then there is some w ∈ S with x(w) ∈ [δ, 2δ] by step 2. Since the diameter of cluster S under the metric dG is bounded by Δ = βδ/2, for all v ∈ S we have |x(w) − x(v)| ≤ z(w, v) ≤ dG(w, v) ≤ βδ/2, and therefore

(15)    (1 − β/2)·δ ≤ min_{v∈S} x(v) ≤ max_{v∈S} x(v) ≤ (2 + β/2)·δ.


For the second requirement of an LP separator, consider u, v ∈ V with z(u, v) ≥ β·min{x(u), x(v)}. Assume w.l.o.g. that x(u) ≤ x(v), and then dG(u, v) ≥ z(u, v) ≥ β·min{x(u), x(v)} = β·x(u). For the event {u, v ∈ S} we must have (i) (1 − β/2)δ ≤ x(u) from inequality (15) and (ii) dG(u, v) ≤ Δ = βδ/2 by the diameter bound on S. The former implies δ ≤ x(u)/(1 − β/2) < 2x(u) since β < 1, while the latter implies δ ≥ (2/β)·dG(u, v) ≥ 2x(u). We conclude that Pr[u, v ∈ S] = 0.
For the third requirement of an LP separator, we will bound Pr[u ∈ S, v ∉ S] for an edge (u, v) ∈ E, the other (symmetric) case being identical. Using inequality (15),

    Pr[ u ∈ S | δ ] ≤ 1/|V|    if x(u)/(2 + β/2) ≤ δ ≤ x(u)/(1 − β/2),
    Pr[ u ∈ S | δ ] = 0        otherwise.

Using Bayes' rule, then the fact that Pδ is a separating decomposition, and then that dG(u, v) ≤ ℓ(u, v) = z(u, v) (because (u, v) is an edge, we in fact have equality),

    Pr[ v ∉ S | u ∈ S, δ ] ≤ Pr[ Pδ(u) ≠ Pδ(v) | δ ] ≤ D0·dG(u, v)/(βδ/2) ≤ 2D0·z(u, v)/(βδ),

where D0 = O(r²) is from Theorem 2.4. Combining the last two bounds, and recalling that δ ∈ [0, 1] is uniformly random and β < 1,

    Pr[ u ∈ S, v ∉ S ] ≤ ∫_{x(u)/3}^{2x(u)} (1/|V|)·(2D0·z(u, v))/(βδ) dδ = (2 ln 6 · D0 · z(u, v))/(β|V|).

We thus have

    Pr[IS(u) ≠ IS(v)] ≤ (4 ln 6 · D0 · z(u, v))/(β|V|) = α·(8 ln 6 · D0/β)·z(u, v),

which proves the third requirement with D = 8 ln 6 · D0/β = O(r²/β).

2.4. From small-set expansion to ρ-unbalanced cut. ρ-unbalanced cut and SSE are related to each other, from the perspective of bicriteria approximation guarantees, in the same way that balanced cut and (uniform-demands) sparsest cut are. Our intended application of approximating min-max k-partitioning (in section 3) requires a weighted version of the ρ-unbalanced cut problem.
Definition 2.7 (weighted ρ-unbalanced cut). The input to this problem is a tuple ⟨G, y, w, τ, ρ⟩, where G = (V, E) is a graph with vertex weights y : V → R+, edge weights w : E → R≥0, and parameters τ, ρ ∈ (0, 1]. The goal is to find S ⊆ V of minimum weight (cost) δ(S) satisfying
1. y(S) ≥ τ · y(V) and
2. |S| ≤ ρ · |V|.
The unweighted version of the problem (defined in section 1.2) has τ = ρ and unit vertex weights, i.e., y(v) = 1 for all v ∈ V. We show that weighted ρ-unbalanced cut can be reduced to weighted SSE, which is needed for our intended application. Formally, we have the following corollary of Theorem 2.1, which immediately proves the unweighted version stated in Theorem 1.3. We use OPT⟨G,y,w,τ,ρ⟩ to denote the optimal value (cost) of the corresponding weighted ρ-unbalanced cut instance.


Corollary 2.6 (approximating weighted ρ-unbalanced cut). For every ε > 0, there exists a polynomial-time algorithm that, given an instance ⟨G, y, w, τ, ρ⟩ of weighted ρ-unbalanced cut, finds a set S ⊆ V satisfying
1. |S| ≤ βρn, where β = 1 + ε and n = |V|;
2. y(S) ≥ τ·y(V)/γ, where γ = O(1);
3. δ(S) ≤ α·OPT⟨G,y,w,τ,ρ⟩ for α = Oε(√(log n log(max(1/ρ, 1/τ)))).
Proof. Let S∗ be an optimal solution to ⟨G, y, w, τ, ρ⟩; thus |S∗| ≤ ρn, y(S∗) ≥ τ·y(V), and δ(S∗) = OPT⟨G,y,w,τ,ρ⟩. Define two measures μ and η on V as follows.
For every S ⊆ V , set μ(S) := |S|/n and η(S) := y(S)/y(V ).
The algorithm guesses H ≥ τ such that H ≤ η(S∗) ≤ 2H (see Approximation Algorithm I above for an argument why we can guess H). Then it invokes Theorem 2.1 part II on G with the measures μ and η defined above, and parameters ρ, H. The obtained solution S satisfies

    |S| = μ(S)·n ≤ (1 + ε)ρn

and

    y(S) = η(S)·y(V) ≥ Ω(H)·y(V) ≥ Ω(τ)·y(V).

Furthermore,

    δ(S) ≤ η(S)·[α·δ(S∗)/η(S∗)] ≤ α·2(1 + ε)·δ(S∗),

where α = Oε(√(log n log(max(1/ρ, 1/τ)))).
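The measure construction in this proof is simple enough to state in code. In the sketch below (ours, illustrative only), weighted_sse_part_two stands for the algorithm of Theorem 2.1 part II, and the loop enumerates the candidate values of H described in section 2.1.

def rho_unbalanced_cut(V, y, w, tau, rho, weighted_sse_part_two):
    # Reduction measures: mu(S) = |S|/n and eta(S) = y(S)/y(V).
    n, yV = len(V), sum(y.values())
    mu = lambda S: len(S) / n
    eta = lambda S: sum(y[v] for v in S) / yV
    def cost(S):
        return sum(wt for (u, v), wt in w.items() if (u in S) != (v in S))
    best = None
    for u in V:
        if y[u] <= 0:
            continue
        H = y[u] / yV
        while H <= 1.0:                   # candidate guesses 2^t * eta({u})
            if H >= tau / 2:              # only guesses compatible with eta(S*) >= tau
                S = weighted_sse_part_two(V, w, mu, eta, rho, H)
                if S and (best is None or cost(S) < cost(best)):
                    best = S
            H *= 2
    return best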
3. Min-max balanced partitioning. In this section, we present our algorithm
for min-max k-partitioning, assuming a subroutine that approximates weighted ρ-
unbalanced cut. Our algorithm for min-max k-partitioning follows by a straightfor-
ward composition of Theorem 3.1 and Theorem 3.3 below. Plugging in for (α, β, γ)
the values obtained in section 2 would complete the proof of Theorem 1.1.
3.1. Uniform coverings. We first consider a covering relaxation of min-max
k-partitioning and solve it using multiplicative updates. This covering relaxation can
alternatively be viewed as a fractional solution to a configuration LP of exponential
size, as discussed further below. We include the analysis details for the multiplicative
updates for completeness.
Let C = {S ⊆ V : |S| ≤ n/k, δ(S) ≤ OPT} denote all the vertex sets that
are feasible for a single part. Note that a feasible solution in min-max k-partitioning
corresponds to a partition of V into k parts, where each part belongs to C. Algorithm 1
uniformly covers V , where the exact meaning of uniformly covers is stated in Theorem
3.1, using sets in C (actually a slightly larger family than C since our approximation
for weighted ρ-unbalanced cut is bicriteria). It is important to note that its output S
is a multiset.
Algorithm 1 (below) is a multiplicative-update-type algorithm. Initially each
vertex is uncovered and is assigned a weight of 1. Whenever a vertex is covered
by some set S, its weight is halved, and the algorithm continues as long as the total
weight exceeds 1/n. At each time step, these weights are used as input to the weighted
ρ-unbalanced cut instance, which guides the set S to cover those vertices that have
not been covered enough times thus far.
Algorithm 1. Covering procedure for min-max k-partitioning.
    Set t = 1, S = ∅, and y^1(v) = 1 for all v ∈ V.
    while Σ_{v∈V} y^t(v) > 1/n do
        // Solve the following using the algorithm from Corollary 2.6.
        Let S^t ⊆ V be the solution for the weighted ρ-unbalanced cut instance ⟨G, y^t, w, 1/k, 1/k⟩.
        Add S^t to S.
        // Update the weights of the covered vertices.
        for every v ∈ V do
            Set y^{t+1}(v) = (1/2) · y^t(v) if v ∈ S^t, and y^{t+1}(v) = y^t(v) otherwise.
        Set t = t + 1.
    return S.

Theorem 3.1. Running Algorithm 1 on an instance of min-max k-partitioning outputs a multiset S that satisfies the following conditions:
1. for all S ∈ S,

δ(S) ≤ α · OPT and |S| ≤ β · n/k,

where OPT denotes the optimal value of the min-max k-partitioning instance.
2. for all v ∈ V,

|{S ∈ S : v ∈ S}| / |S| ≥ 1/(5γk).

Above, α = O_ε(√(log n log k)), β = 1 + ε, and γ = O_ε(1), for any fixed ε > 0.
Proof. For an iteration t, let us denote Y^t := Σ_{v∈V} y^t(v). The first assertion of the theorem is immediate from the following claim.
Claim 3.2. Every iteration t of Algorithm 1 satisfies: δ(S^t) ≤ α · OPT and |S^t| ≤ β · n/k.
Proof. We claim that no matter what the values of the y^t's are, the optimal solution to the weighted ρ-unbalanced cut instance ⟨G, y^t, w, 1/k, 1/k⟩ is at most OPT. To see this, consider the optimal solution {S_i^*}_{i=1}^k of the original min-max k-partitioning instance. We have |S_i^*| ≤ n/k and w(δ(S_i^*)) ≤ OPT for all i ∈ [k]. Since {S_i^*}_{i=1}^k partitions V, there is some j ∈ [k] with y^t(S_j^*) ≥ Y^t/k. It now follows that S_j^* is a feasible solution to the weighted ρ-unbalanced cut instance ⟨G, y^t, w, 1/k, 1/k⟩, with objective value at most OPT, which proves the claim.
We proceed to prove the second assertion of Theorem 3.1. Let ℓ denote the number of iterations of the while loop, for the given min-max k-partitioning instance. For any v ∈ V, let N_v denote the number of iterations t in which v ∈ S^t. Then, by the updating rule for the y weights we have that y^{ℓ+1}(v) = 1/2^{N_v}. Moreover, the termination condition implies that y^{ℓ+1}(v) ≤ 1/n (since Y^{ℓ+1} ≤ 1/n). Thus, we obtain that N_v ≥ log₂ n for all v ∈ V. From the approximation guarantee of the weighted ρ-unbalanced cut algorithm (Corollary 2.6), it follows that y^t(S^t) ≥ (1/(γk)) · Y^t in every iteration t. Thus,

Y^{t+1} = Y^t − (1/2) · y^t(S^t) ≤ (1 − 1/(2γk)) · Y^t.

This, in turn, implies that

Y^ℓ ≤ (1 − 1/(2γk))^{ℓ−1} · Y^1 = (1 − 1/(2γk))^{ℓ−1} · n.

However, Y^ℓ > 1/n since the algorithm performs ℓ iterations. Thus, ℓ ≤ 1 + 4γk · ln n ≤ 5γk · log₂ n. This finishes the proof since

|{S ∈ S : S ∋ v}|/|S| = N_v/ℓ ≥ (5γ)^{−1} k^{−1}.
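For concreteness, here is a minimal Python sketch of this covering loop (not from the paper); solve_weighted_unbalanced_cut is a hypothetical stand-in for the bicriteria algorithm of Corollary 2.6 invoked with τ = ρ = 1/k.

    def covering(V, k, solve_weighted_unbalanced_cut):
        # Sketch of Algorithm 1 (multiplicative updates).  The subroutine is assumed to
        # return a set S with y(S) >= y(V)/(gamma*k), |S| <= (1+eps)*n/k and small cut.
        n = len(V)
        y = {v: 1.0 for v in V}            # every vertex starts with weight 1
        cover = []                         # the output multiset S
        while sum(y.values()) > 1.0 / n:   # stop once the total weight is tiny
            S = solve_weighted_unbalanced_cut(y)   # cut guided by the current weights
            cover.append(S)
            for v in S:                    # halve the weight of the vertices just covered
                y[v] *= 0.5
        return cover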

Alternative view: A configuration LP. We now describe an alternate approach to finding a cover S. Given a bound λ on the cost of any single cut, define the set of feasible cuts as follows:

F_λ = { S ⊆ V : |S| ≤ n/k, δ(S) ≤ λ }.

We define a configuration LP for min-max k-partitioning as follows. There is a variable x_S for each S ∈ F_λ indicating whether the cut S is chosen or not. The only constraint we have is that each vertex v belongs to at least one feasible cut.

       P(λ) = min Σ_{S∈F_λ} x_S
(16)   s.t.  Σ_{S∈F_λ : v∈S} x_S ≥ 1     ∀v ∈ V,
             x_S ≥ 0                      ∀S ∈ F_λ.

The goal is to determine the smallest λ > 0 such that P(λ) ≤ k. Since the configuration LP (16) has an exponential number of variables, we formulate its dual and present a separation oracle for it:

       D(λ) = max Σ_{v∈V} y_v
(17)   s.t.  Σ_{v∈S} y_v ≤ 1     ∀S ∈ F_λ,
             y_v ≥ 0              ∀v ∈ V.

One can (approximately) solve the dual formulation, since a separation oracle based on weighted SSE exists. The details for approximating the configuration LP are rather technical (since we only have a bicriteria approximation for weighted SSE, as opposed to a usual multiplicative approximation) and do not offer any advantage over the multiplicative updates method, beyond providing an alternative intuition. Thus, they are omitted.
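To give a flavor of the omitted details, the following sketch shows the shape of such a separation oracle; weighted_sse_max is a hypothetical stand-in (not from the paper) for a routine that uses the weighted SSE algorithm to search, approximately, for a small, cheap set of large y-weight.

    def dual_separation_oracle(y, lam, n, k, weighted_sse_max):
        # Look for a violated dual constraint of D(lambda): a cut S with
        # |S| <= n/k, delta(S) <= lam (approximately) and sum_{v in S} y_v > 1.
        S = weighted_sse_max(y, lam, n // k)      # hypothetical weighted-SSE based search
        if S is not None and sum(y[v] for v in S) > 1:
            return S                              # violated constraint found
        return None                               # y is (approximately) feasible for D(lambda)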
3.2. Aggregation. The aggregation process, which might be of independent
interest, transforms a cover of G into a partition, without violating the min-max
objective by much. In particular, we show the following.
Theorem 3.3. There is a randomized polynomial-time algorithm (Algorithm 2
below) that when given a graph G = (V, E), an ε ∈ (0, 1), and a cover S consisting of
subsets of V such that
1. every vertex in V is covered by at least a fraction c/k of sets S ∈ S for some
c ∈ (0, 1], and
2. each S ∈ S satisfies |S| ≤ 2n/k and δ(S) ≤ B,

outputs a partition P of V into at most k sets such that for all P ∈ P we have
|P | ≤ 2(1 + ε)n/k and E[maxP ∈P δ(P )] ≤ 8B/(cε).
Algorithm 2. Aggregation procedure for min-max k-partitioning.
    1 // Step 1: Sampling
      Sort sets in S in a random order: S_1, S_2, . . . , S_{|S|}. Let P_i = S_i \ (∪_{j<i} S_j).
    2 // Step 2: Replacing Expanding Sets with Sets from S
      while there is a set P_i such that δ(P_i) > 2B do
          Set P_i = S_i, and for all j ≠ i, set P_j = P_j \ S_i.
    3 // Step 3: Aggregating
      Let B′ = max{ (1/k) Σ_i δ(P_i), 2B }.
      while there are P_i ≠ ∅, P_j ≠ ∅ (i ≠ j) such that |P_i| + |P_j| ≤ 2(1 + ε)n/k
      and δ(P_i) + δ(P_j) ≤ 2B′ε^{−1} do
          Set P_i = P_i ∪ P_j and set P_j = ∅.
    4 return all nonempty sets P_i.

We remark that this process is comprised of three different steps. Intuitively, we


first let the sets randomly compete with each other over the vertices so as to form
a partition; then, to make sure no set has large cost, we repeatedly fix the partition
locally, and use a potential function to track progress. Finally, the sets need to be
aggregated while ensuring that their cost and sizes do not increase much, such that
we are left with at most k cuts.
Note that in Step 2, whenever δ(Pi ) exceeds 2B, we replace Pi with Si , and adjust
the other sets Pj accordingly to ensure that we maintain a partition. In Step 3, we
aggregate the pieces Pi to reduce the number of parts to at most k.
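As an illustration of these three steps, here is a minimal Python sketch of Algorithm 2 (not from the paper); the function delta, returning the cut weight of a vertex set in G, is assumed to be available.

    import random

    def aggregate(cover, n, k, B, eps, delta):
        # Step 1: sampling -- random order, each set keeps only its not-yet-covered vertices.
        order = [set(S) for S in cover]
        random.shuffle(order)
        P, seen = [], set()
        for S in order:
            P.append(S - seen)
            seen |= S
        # Step 2: replace any part whose cut exceeds 2B by its original set S_i
        # (termination follows from the potential argument in the analysis below).
        while any(delta(Pi) > 2 * B for Pi in P):
            i = next(j for j, Pj in enumerate(P) if delta(Pj) > 2 * B)
            P = [order[i] if j == i else Pj - order[i] for j, Pj in enumerate(P)]
        # Step 3: greedily merge small, cheap parts.
        Bp = max(sum(delta(Pi) for Pi in P) / k, 2 * B)
        merged = True
        while merged:
            merged = False
            for i in range(len(P)):
                for j in range(len(P)):
                    if i != j and P[i] and P[j] \
                            and len(P[i]) + len(P[j]) <= 2 * (1 + eps) * n / k \
                            and delta(P[i]) + delta(P[j]) <= 2 * Bp / eps:
                        P[i], P[j], merged = P[i] | P[j], set(), True
        return [Pi for Pi in P if Pi]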
Analysis. We now prove that this algorithm satisfies the properties stated in
Theorem 3.3.
Proof of Theorem 3.3. We start by analyzing Step 1. Observe that after Step 1
the collection of sets {Pi } is a partition of V and Pi ⊆ Si for every i. Particularly,
|Pi | ≤ |Si | ≤ 2n/k. Note, however, that the bound δ(Pi ) ≤ B may be violated for
some i since Pi might be a strict subset of Si . 
We finish the analysis of Step 1 of Algorithm 2 by proving that E[Σ_i δ(P_i)] ≤ 2kB/c. Fix an i ≤ |S| and estimate the expected weight of edges E(P_i, ∪_{j>i} P_j) given that S_i = S. If an edge (u, v) belongs to E(P_i, ∪_{j>i} P_j) then (u, v) ∈ E(S_i, V \ S_i) = E(S, V \ S) and both u, v ∉ ∪_{j<i} S_j. For any edge (u, v) ∈ δ(S) (with u ∈ S, v ∉ S), Pr((u, v) ∈ E(P_i, ∪_{j>i} P_j) | S_i = S) ≤ Pr(v ∉ ∪_{j<i} S_j | S_i = S) ≤ (1 − c/k)^{i−1}, since v is covered by at least a c/k fraction of sets in S and is not covered by S_i = S. Hence,

E[w(E(P_i, ∪_{j>i} P_j)) | S_i = S] ≤ (1 − c/k)^{i−1} δ(S) ≤ (1 − c/k)^{i−1} B,

and E[w(E(P_i, ∪_{j>i} P_j))] ≤ (1 − c/k)^{i−1} B. Therefore, the total expected weight of edges crossing the boundary of the P_i's is at most Σ_{i=0}^{∞} (1 − c/k)^i B = kB/c, and E[Σ_i δ(P_i)] ≤ 2kB/c. This finishes the analysis of Step 1.
Let us now analyze Step 2. Our potential function is Σ_i δ(P_i), whose initial value was bounded in Step 1. We prove that this potential function reduces quickly over the iterations of Step 2; thus, Step 2 terminates after a "small" number of steps. After each iteration of Step 2, the following invariant holds: the collection of sets {P_i} is a partition of V and P_i ⊆ S_i for all i. Particularly, |P_i| ≤ |S_i| ≤ 2n/k. The key observation is that at every iteration of the "while" loop, the sum Σ_j δ(P_j) decreases
by at least 2B. This is due to the following uncrossing argument:

δ(S_i) + Σ_{j≠i} δ(P_j \ S_i) ≤ δ(S_i) + Σ_{j≠i} ( δ(P_j) + w(E(P_j \ S_i, S_i)) − w(E(S_i \ P_j, P_j)) )
                             ≤ δ(S_i) + Σ_{j≠i} δ(P_j) + w(E(V \ S_i, S_i)) − w(E(P_i, V \ P_i))
                             = Σ_j δ(P_j) + 2δ(S_i) − 2δ(P_i)
                             ≤ Σ_j δ(P_j) − 2B,

where in the second inequality w(E(V \ S_i, S_i)) = δ(S_i) and w(E(P_i, V \ P_i)) = δ(P_i).

The above inequalities use the facts that P_i ⊆ S_i for all i and that all the P_j's are disjoint. Hence, the number of iterations of the loop in Step 2 is always polynomially bounded and after the last iteration E[Σ_i δ(P_i)] ≤ 2kB/c (the expectation is over random choices at Step 1; Step 2 does not use random bits). Hence, E[B′] ≤ 4B/c; recall that B′ = max{ (1/k) Σ_i δ(P_i), 2B }.
Let us now analyze Step 3. The following analysis holds conditional on any value of B′. After each iteration of Step 3, the following invariant holds: the collection of sets {P_i} is a partition of V, and for all i, |P_i| ≤ 2(1 + ε)n/k and δ(P_i) ≤ 2B′ε^{−1}. Clearly, after Step 2, δ(P_i) ≤ 2B ≤ B′ and |P_i| ≤ 2n/k for all i. Moreover, the sum Σ_i δ(P_i) can only decrease in Step 3.
When the loop terminates, we obtain a partition of V into sets P_i satisfying |P_i| ≤ 2(1 + ε)n/k, Σ_i |P_i| = n, δ(P_i) ≤ 2B′ε^{−1}, and Σ_i δ(P_i) ≤ kB′, such that no two sets can be merged without violating the above constraints. Hence, by Lemma 3.4 below (with a_i = |P_i| and b_i = δ(P_i)), the number of nonempty sets is at most

2 · n/(2(1 + ε)n/k) + kB′/(2B′ε^{−1}) = (1 + ε)^{−1} k + (ε/2) k ≤ k.

This finishes the proof.


Lemma 3.4 (greedy aggregation). Let a_1, . . . , a_t and b_1, . . . , b_t be two sequences of nonnegative numbers satisfying the following constraints: a_i < A, b_i < B, Σ_{i=1}^t a_i ≤ S, and Σ_{i=1}^t b_i ≤ T (for some positive real numbers A, B, S, and T). Moreover, assume that for every i and j (i ≠ j) either a_i + a_j > A or b_i + b_j > B. Then,

t < S/A + T/B + max{ S/A, T/B, 1 }.

Proof. By rescaling we assume that A = 1 and B = 1. Moreover, we may assume that Σ_{i=1}^t a_i < S and Σ_{i=1}^t b_i < T by slightly decreasing the values of all a_i and b_i so that all inequalities still hold.
We write two linear programs:

(LP_I)    min Σ_i x_i
          s.t.  x_i + x_j ≥ 1   ∀(i, j) ∈ {(i, j) : a_i + a_j ≥ 1},
                x_i ≥ 0          ∀i,

(LP_II)   min Σ_i y_i
          s.t.  y_i + y_j ≥ 1   ∀(i, j) ∈ {(i, j) : b_i + b_j ≥ 1},
                y_i ≥ 0          ∀i.

The first LP (LP_I) has variables x_i and constraints x_i + x_j ≥ 1 for all i, j such that a_i + a_j ≥ 1. The second LP (LP_II) has variables y_i and constraints y_i + y_j ≥ 1 for all i, j such that b_i + b_j ≥ 1. The LP objectives are to minimize Σ_i x_i and to minimize Σ_i y_i. Note that {a_i} is a feasible point for LP_I and {b_i} is a feasible point for LP_II. Thus, the optimum values of LP_I and LP_II are strictly less than S and T, respectively.
Observe that both LPs are half-integral. Consider optimal solutions x_i^*, y_j^* where x_i^*, y_j^* ∈ {0, 1/2, 1}. Note that for every i, j either x_i^* + x_j^* ≥ 1 or y_i^* + y_j^* ≥ 1. Consider several cases. If for all i, x_i^* + y_i^* ≥ 1, then t < S + T, since Σ_{i=1}^t (x_i^* + y_i^*) < S + T. If for some j, x_j^* + y_j^* = 0 (and hence x_j^* = y_j^* = 0), then x_i^* + y_i^* ≥ 1 for i ≠ j and, thus, t < S + T + 1. Finally, assume that for some j, x_j^* + y_j^* = 1/2, and w.l.o.g., x_j^* = 1/2 and y_j^* = 0. The number of i's with x_i^* ≠ 0 is (strictly) bounded by 2S. For the remaining i's, x_i^* = 0 and hence y_i^* = 1 (because y_i^* = y_i^* + y_j^* ≥ 1), and thus the number of such i's is (strictly) bounded by T. This finishes the proof of the lemma.
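As a quick sanity check of the lemma (not part of the paper), one can simulate the greedy merging on random data and verify the counting bound; the merge rule below mirrors Step 3 of Algorithm 2, and ties can be ignored since the random values are generic.

    import random

    def greedy_merge(items, A, B):
        # Merge two items as long as the merged sizes stay strictly below (A, B),
        # so that at the end every pair violates at least one of the two budgets.
        items = list(items)
        merged = True
        while merged:
            merged = False
            for i in range(len(items)):
                for j in range(i + 1, len(items)):
                    a, b = items[i][0] + items[j][0], items[i][1] + items[j][1]
                    if a < A and b < B:
                        items[i], merged = (a, b), True
                        del items[j]
                        break
                if merged:
                    break
        return items

    random.seed(0)
    A = B = 1.0
    for _ in range(200):
        raw = [(random.uniform(0, 0.4), random.uniform(0, 0.4)) for _ in range(25)]
        final = greedy_merge(raw, A, B)
        S, T = sum(a for a, _ in raw), sum(b for _, b in raw)
        assert len(final) < S / A + T / B + max(S / A, T / B, 1)
    print("Lemma 3.4 bound held on all trials")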
4. Further extensions. Both Theorems 1.1 and 1.4 follow from a more general
result for a problem that we call min–max cut. This problem models the aforemen-
tioned cloud computing scenario, where in addition, certain processes are preassigned
to machines (each set Ti maps to machine i ∈ [k]). The goal is to assign the processes
V to machines [k] while respecting the preassignment and machine load constraints,
and minimizing both bandwidth per machine and total volume of communication.
It is clear that in fact Theorem 1.6 generalizes both Theorems 1.1 and 1.4. Let us
now describe modifications to the min-max k-partitioning algorithm used to obtain
Theorem 1.6.
Uniform coverings. First, by the introduction of vertex weights, we can shrink
each preassigned set Ti to a single terminal ti (for i ∈ [k]). Then, feasible vertex sets C
in the covering procedure (section 3.1) consist of those S ⊆ V where weight(S) ≤ ρ n
(balance constraint) and |S ∩ {ti }ki=1 | ≤ 1 (preassignment constraint). The subprob-
lem weighted ρ-unbalanced cut also has the additional |S ∩ {ti }ki=1 | ≤ 1 constraint;
this can be handled in the Approximation Algorithm II from section 2 by guess-
ing which terminal belongs to S (see Remark 2.4). Using Corollary 2.6 we assume
an (α, β, γ) approximation algorithm for this (modified) weighted ρ-unbalanced cut problem, where for any ε > 0, α = O_ε(√(log n log(max{1/ρ, 1/τ}))), β = 1 + ε, and γ = O_ε(1).
Algorithm 3 below gives the procedure to obtain a uniform covering S bounding
total edge cost in addition to the conditions in Theorem 3.1.
Theorem 4.1. For any instance of min–max cut, the output S of Algorithm 3 satisfies
1. δ(S) ≤ α · C and |S| ≤ β · n/k for all S ∈ S;
2. |{S ∈ S : v ∈ S}| ≥ log₂ n for all v ∈ V;
3. |S| ≤ 5γk · log₂ n;
4. Σ_{S∈S} δ(S) ≤ 17αγ log₂ n · D.
Above, for any ε > 0, α = O_ε(√(log n log k)), β = 1 + ε, and γ = O_ε(1).
Proof. We start with the following key claim.

Algorithm 3. Covering procedure for min-max cut.
    Set t ← 1, S ← ∅, y^1(v) ← 1 for all v ∈ V, and Y^1 ← Σ_{v∈V} y^1(v).
    while Y^t > 1/n do
        for i = 0, . . . , log₂ k + 1 do
            Solve the weighted ρ-unbalanced cut instance ⟨G, y^t, w, 1/2^i, ρ⟩ using the algorithm from Corollary 2.6, to obtain S^t(i) ⊆ V.
            If δ(S^t(i)) ≤ α · min{C, 4D/2^i} then S^t ← S^t(i) and quit this loop.
        Add S^t to S.
        for v ∈ V do
            Set y^{t+1}(v) ← (1/2) · y^t(v) if v ∈ S^t, and y^{t+1}(v) ← y^t(v) otherwise.
        Set Y^{t+1} ← Σ_{v∈V} y^{t+1}(v).
        Set t ← t + 1.
    return S.

Claim 4.2. In any iteration t of Algorithm 3, there exists an i ∈ {0, 1, . . . , log₂ k + 1} such that δ(S^t(i)) ≤ α · min{C, 4D/2^i}, |S^t(i)| ≤ β · n/k, and y^t(S^t(i)) ≥ Y^t/(γ2^i).
Proof. Consider the optimal solution {S_j^*}_{j=1}^k of the original min–max cut instance. For all j ∈ [k] we have that |S_j^*| ≤ ρn, δ(S_j^*) ≤ C, and S_j^* contains at most one terminal. Moreover, Σ_{j=1}^k δ(S_j^*) ≤ D. Since {S_j^*}_{j=1}^k partitions V, we also have Σ_{j=1}^k y^t(S_j^*) = Y^t. Let L ⊆ [k] denote the indices j having δ(S_j^*) ≤ (2D/Y^t) · y^t(S_j^*). We claim that Σ_{j∈L} y^t(S_j^*) ≥ Y^t/2. This is true since

Σ_{j∉L} y^t(S_j^*) ≤ (Y^t/(2D)) · Σ_{j∉L} δ(S_j^*) ≤ (Y^t/(2D)) · Σ_{j=1}^k δ(S_j^*) ≤ Y^t/2.

Since |L| ≤ k, there is some q ∈ L with y^t(S_q^*) ≥ Y^t/(2k). Let i ∈ {1, . . . , log₂ k + 1} be the value such that y^t(S_q^*)/Y^t ∈ [1/2^i, 1/2^{i−1}]; note that such an i exists because y^t(S_q^*)/Y^t ∈ [1/(2k), 1]. For this i, consider the weighted ρ-unbalanced cut instance ⟨G, y^t, w, 2^{−i}, ρ⟩. Observe that S_q^* is a feasible solution here since y^t(S_q^*) ≥ Y^t/2^i, |S_q^*| ≤ ρn, and S_q^* contains at most one terminal. Hence, the optimal value of this instance is at most

δ(S_q^*) ≤ min{ C, (2D/Y^t) · y^t(S_q^*) } ≤ min{ C, 4D/2^i }.
The first inequality uses the definition of L and that δ(S_q^*) ≤ C, and the second inequality is by the choice of i. It now follows from Corollary 2.6 that the solution S^t(i) satisfies the claimed properties. We note that α = O_ε(√(log n log k)) because each instance of weighted ρ-unbalanced cut has parameters τ = 2^{−i} ≥ 1/k and ρ ≥ 1/k.
For each iteration t, let it ∈ {1, . . . , log2 k + 1} be the index such that S t = S t (it ).
Claim 4.2 implies that such an index always exists, and so the algorithm is indeed well
defined. Condition 1 of Theorem 4.1 also follows directly. Let ℓ denote the number of iterations of the while loop, for the given min–max cut instance. For any v ∈ V, let N_v denote the number of iterations t in which v ∈ S^t. Then, by the update rule of
the weights y we have y^{ℓ+1}(v) = 1/2^{N_v}. Moreover, the termination condition implies that y^{ℓ+1}(v) ≤ 1/n (since Y^{ℓ+1} ≤ 1/n). Thus we obtain N_v ≥ log₂ n for all v ∈ V,
proving condition 2 of Theorem 4.1.


Claim 4.2 implies that for each iteration t, we have y^t(S^t) ≥ Y^t/(γ2^{i_t}) and δ(S^t) ≤ 4αD/2^{i_t}. Since i_t ≤ log₂ k + 1, we also obtain that

y^t(S^t) ≥ max{ Y^t/(2γk), (Y^t/(4αγD)) · δ(S^t) }.

We need to prove that condition 3 and condition 4 also hold. First, using the fact that y^t(S^t) ≥ Y^t/(2γk) in each iteration, we can deduce that

Y^{t+1} = Y^t − (1/2) · y^t(S^t) ≤ (1 − 1/(4γk)) · Y^t.

This, in turn, implies that

Y^ℓ ≤ (1 − 1/(4γk))^{ℓ−1} · Y^1 = (1 − 1/(4γk))^{ℓ−1} · n.

However, Y^ℓ > 1/n since the algorithm performs ℓ iterations. Thus, ℓ ≤ 1 + 8γk · ln n ≤ 9γk · log₂ n. This proves condition 3 in Theorem 4.1.
Second, using y^t(S^t) ≥ (Y^t/(4αγD)) · δ(S^t) in each iteration, Y^{t+1} = Y^t − (1/2) · y^t(S^t) ≤ (1 − δ(S^t)/(8αγD)) · Y^t. So,

1/n < Y^ℓ ≤ Π_{t=1}^{ℓ−1} (1 − δ(S^t)/(8αγD)) · Y^1 ≤ exp( − Σ_{t=1}^{ℓ−1} δ(S^t)/(8αγD) ) · n.

This implies

Σ_{t=1}^{ℓ−1} δ(S^t) ≤ (16αγ ln n) · D.

Adding in δ(S^ℓ) ≤ α · C ≤ αD, we obtain condition 4 of Theorem 4.1. This completes the proof of Theorem 4.1.
Aggregation. This step remains essentially the same as in section 3.2, namely,
Algorithm 2 (with parameter B := α · C). The only difference is that in Step 3 we
do not merge parts containing terminals. We first show that this yields a slightly
weaker version of Theorem 1.6: in condition 2 we obtain a bound of (3 + ε)ρn on
the cardinality of each part. (Later we show how to achieve the cardinality bound of
(2 + ε)ρn as claimed in Theorem 1.6.)
Note that each of the final sets {Pi } is a subset of some set in S, and hence
contains at most one terminal. It also follows that the final sets {Pi } are at most 2k
in number: at most k of them contain no terminals (just as in Theorem 3.3), and at
most k contain a terminal (since there are at most k terminals). Each of these sets
{Pi } has size at most (2 + ε)ρn and cut value at most 8B/(cε), by the analysis in
Theorem 1.6. Moreover, if a set Pi contains a terminal then |Pi | ≤ β · ρn = (1 + ε)ρn
(since it does not participate in any merge). Finally in order to reduce the number of

parts to k, we merge arbitrarily each part containing a terminal with one nonterminal part, and output this as the final solution. It is clear that each part has at most one terminal, has size ≤ (3 + ε)ρn, and cut value at most O_ε(√(log n log k)) · C. The bound
on total cost (condition 4 in Theorem 1.6) is by the following claim. This proves a
weaker version of Theorem 1.6, with size bound (3 + ε)ρn.
Claim 4.3. Algorithm 2 applied on the collection S from Theorem 4.1 outputs a partition {P_i}_{i=1}^{2k} satisfying E[Σ_{i=1}^{2k} w(δ(P_i))] = O_ε(√(log n log k)) · D.
Proof. We will show that the random partition {P_i} at the end of Step 1 in Algorithm 2 satisfies E[Σ_i w(δ(P_i))] ≤ O_ε(√(log n log k)) · D. This would suffice, since Σ_i w(δ(P_i)) does not increase in Steps 2 and 3. For notational convenience, we assume (by adding empty sets) that |S| = 5γk · log₂ n in Theorem 4.1; note that this does not affect any of the other conditions.
To bound the cost of the partition {P_i} in Step 1, consider any index i ≤ |S|. From the proof of Theorem 3.3, we have

E[w(E(P_i, ∪_{j>i} P_j)) | S_i = S] ≤ (1 − c/k)^{i−1} δ(S),

where c = 1/(5γ) is such that each vertex lies in at least a fraction c/k of the sets in S. We remove the conditioning,

E[w(E(P_i, ∪_{j>i} P_j))] ≤ (1 − c/k)^{i−1} · E[δ(S_i)] = (1 − c/k)^{i−1} · ( Σ_{S∈S} δ(S)/|S| ),

where we used that S_i is a uniformly random set from S. Therefore, the total edge cost can be upper bounded as follows:

E[ Σ_i δ(P_i) ] = 2 · Σ_i E[w(E(P_i, ∪_{j>i} P_j))]
               ≤ ( Σ_{i≥0} (1 − c/k)^i ) · ( Σ_{S∈S} δ(S)/|S| )
               = (k/c) · Σ_{S∈S} δ(S)/|S|.

Using Σ_{S∈S} δ(S) ≤ 17αγ log₂ n · D and |S| = 5γk · log₂ n from Theorem 4.1, we can conclude that

E[ Σ_i δ(P_i) ] ≤ (17α/(5c)) · D = O_ε(√(log n log k)) · D,

since α = O_ε(√(log n log k)) and 1/c = O_ε(1).
Obtaining size bound of (2 + ε)ρn. We now describe a modified aggre-
gating step (replacing Step 3 of Algorithm 2) that yields the promised guarantee of
Theorem 1.6. Given the uniform cover S from Algorithm 3, run Steps 1 and 2 of
Algorithm 2 (use B = αC) to obtain parts P1 , . . . , P|S| . Then perform the following.

1. Set B′ := max{ (1/k) Σ_i δ(P_i), 2B }.
2. While there are P_i, P_j ≠ ∅ (i ≠ j) such that |P_i| + |P_j| ≤ (1 + ε)ρn, δ(P_i) + δ(P_j) ≤ 2B′, and P_i ∪ P_j does not contain a terminal: replace P_i ← P_i ∪ P_j and P_j ← ∅.
3. Sort the resulting nonempty sets P_1, . . . , P_t according to a nonincreasing order of size.
4. Form ⌈t/k⌉ groups, where the jth group consists of parts indexed between (j − 1)k + 1 and jk.
5. For each i ∈ [k] define Q_i as the union of one part from each group such that it contains terminal i but no other terminal. Additionally, ensure that each part is assigned to one of {Q_i}_{i=1}^k.
We first show that the number of parts after step 2 above is t ≤ 4k. Note that each
part contains at most one terminal, and the number of parts containing a terminal is
at most k. For the nonterminal parts, using Lemma 3.4 (with ai = |Pi |, bi = δ(Pi ),
A = (1 + ε)ρn, B = 2B′, S = n, and T = kB′) we obtain a bound of 5k/2, which
implies t ≤ 4k.
Next observe that the sets {Q_i}_{i=1}^k in step 5 above are well defined, and can be found using a simple greedy rule: this is because each group contains at most k parts. This gives Theorem 1.6 part 1. Since t ≤ 4k there are at most 4 groups and hence max_{i=1}^k δ(Q_i) ≤ 4 · max_{i=1}^t δ(P_i) ≤ 8B′. This proves Theorem 1.6 part 3. Also, the proof of Claim 4.3 implies Theorem 1.6 part 4; this is due to the fact that in the final partition Σ_{i=1}^k δ(Q_i) ≤ Σ_{ℓ=1}^{|S|} δ(P_ℓ).
We now show Theorem 1.6 part 2, i.e., max_{i=1}^k |Q_i| ≤ (2 + ε)ρn. Consider any i ∈ [k] and let P_i′ denote the part assigned to Q_i from the first group. By the nonincreasing size ordering P_1, . . . , P_t and the round-robin assignment (step 5) into the Q_i's, we obtain that |Q_i| − |P_i′| ≤ |Q_j| for all j ∈ [k]. Taking an average, |Q_i| − |P_i′| ≤ (1/k) Σ_{j=1}^k |Q_j| = n/k. Finally, since each part {P_ℓ}_{ℓ=1}^t has size at most (1 + ε)ρn we have |Q_i| ≤ (2 + ε)ρn. This completes the proof of Theorem 1.6.
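The grouping in steps 3-5 can be pictured with the following Python sketch (not from the paper; terminal_of is a hypothetical helper): each group of at most k parts contributes one part to every Q_i, and a part containing terminal i is forced into Q_i.

    def round_robin_assign(parts, k, terminal_of):
        # parts: list of sets; terminal_of(p) returns i if part p contains terminal i, else None.
        parts = sorted(parts, key=len, reverse=True)                    # step 3: nonincreasing size
        groups = [parts[j:j + k] for j in range(0, len(parts), k)]      # step 4: groups of k
        Q = {i: [] for i in range(1, k + 1)}
        for group in groups:                                            # step 5
            used = set()
            for p in group:                         # parts forced by a terminal go first
                i = terminal_of(p)
                if i is not None:
                    Q[i].append(p)
                    used.add(i)
            free = (i for i in range(1, k + 1) if i not in used)
            for p in group:                         # remaining parts are spread over free machines
                if terminal_of(p) is None:
                    Q[next(free)].append(p)         # enough machines remain since |group| <= k
        return Q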
5. Hardness of min-max-multiway-cut. In this section we prove Theorem
1.5, which shows that obtaining a k^{1−ε} approximation algorithm for min-max-
multiway-cut should be very challenging, and suggests that some dependence on
n might be necessary, unless we are satisfied with an approximation factor that
is linear in k, which is trivial. This situation is in contrast to several other cut
problems with sum-objective (multiway-cut, multicut, requirement-cut, etc.), where
poly(log k) approximation guarantees are known when k denotes the size of the ter-
minal set [28, 24, 12, 8, 27]. Throughout this section, we assume k is independent
of n, which simplifies the statements; strictly speaking, our reductions relate solv-
ing one problem with parameter k(n) to solving another problem with parameter
k(n′).
We will refer to the min-sum version of min-max k-partitioning, called min-sum k-
partitioning, in which the input is an edge-weighted graph G = (V, E) and a parameter
k, and the goal is to partition the vertices into k equal-sized parts while minimizing
the total edge weight of all edges cut. An algorithm for min-sum k-partitioning is an
(α, β) bicriteria approximation if for every instance, it partitions the vertices into k
pieces, each of size at most β|V |/k, and the total edge weight of all edges cut is at
most α times the least possible among all partitions into k equal-sized sets.
The basic idea in proving Theorem 1.5 is as follows. Although there is no vertex-
balance requirement in min-max-multiway-cut, an edge balance is implicit in the
objective. By introducing a complete bipartite graph (having suitable edge weight)
between the terminals and the rest of the graph, this edge balance can be used to
enforce the vertex balance required in min-sum k-partitioning. After this first step,
which obtains a bicriteria approximation for min-sum k-partitioning, we do a second
step which improves the size violation in the algorithm for min-sum k-partitioning.

These two steps are formalized in the following two lemmas (Lemmas 5.1 and 5.3). Putting
them together immediately proves Theorem 1.5.
Lemma 5.1. If there is a ρ-approximation algorithm for min-max-multiway-cut then there is a (5ρk, 10ρ) bicriteria approximation algorithm for min-sum k-partitioning.
Proof. Let A denote the ρ-approximation algorithm for min-max-multiway-cut.
Consider any instance I of min-sum k-partitioning: graph G = (V, E), edge costs
c : E → R+ , and parameter k. For every B ≥ 0, consider an instance J (B) of
min-max-multiway-cut on graph G_B defined as follows:
• the vertex set is V ∪ {t_i}_{i=1}^k, where {t_i}_{i=1}^k are the terminals;
• the edges are E ∪ {(t_i, u) : i ∈ [k], u ∈ V};
• extend the cost function c by setting c(t_i, u) = B/n for all i ∈ [k], u ∈ V.
Note that every solution to J(B) corresponds to a k–partition in G (though possibly unbalanced). We say that a solution to J(B) is β-balanced if each piece in the partition has size at most β · n/k.
The algorithm for min-sum k-partitioning on I runs algorithm A on all the min-max-multiway-cut instances {J(2^i) : 0 ≤ i ≤ log₂(Σ_{e∈E} c_e)}, and returns the cheapest partition that is (10ρ)-balanced. We now show that this results in a (5ρk, 10ρ) bicriteria approximation ratio.
Note that algorithm A must be invoked on J(B) for some value B with B < OPT(I) ≤ 2B. We will show that the partition resulting from this call is the desired bicriteria approximation.
Claim 5.2. OPT(J(B)) ≤ OPT(I) + 2B ≤ 5 OPT(I).
Proof. Let P* denote the optimal k–partition to I. Consider the solution to J(B) obtained by including each terminal into a distinct piece of P*. The boundary of the piece containing t_i (any i ∈ [k]) in G_B costs at most

OPT(I) + n · (B/n) + (n/k) · (k − 1) · (B/n) ≤ OPT(I) + 2B;

the term OPT(I) is due to edges in E, the second term is due to edges at t_i, and the third is due to edges at all other {t_j : j ≠ i}. The claim now follows since B ≤ 2 OPT(I).
Let P denote A’s solution to J (B), and {Pi }ki=1 the partition of V induced
by P . From Claim 5.2, P has objective value (for min-max-multiway-cut) at most
5ρ OPT(I). Note that the boundary (in graph GB ) of ti ’s piece in P costs at least
c(δ_G(P_i)) + (k − 1) · (B/n) · |P_i|. Thus we obtain

c(δ_G(P_i)) + (k − 1) · (B/n) · |P_i| ≤ 5ρ OPT(I), for all i ∈ [k].

It follows that for every i ∈ [k], we have
1. |P_i| ≤ (5ρ OPT(I)/B) · (n/(k − 1)) ≤ 10ρ · (n/k) since B > OPT(I), and
2. c(δ_G(P_i)) ≤ 5ρ · OPT(I).
Thus the solution {Pi }ki=1 to I is (10ρ)-balanced and costs at most 5ρ k · OPT(I).
This completes the proof of Lemma 5.1.
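For concreteness, here is a minimal sketch (not from the paper) of the reduction graph G_B used in this proof; edge costs are stored in a dictionary, and the helper name is illustrative only.

    def build_GB(V, cost, k, B):
        # cost: dict mapping original edges (u, v) to their cost c(u, v).
        # Add terminals t_1..t_k, each joined to every original vertex by an edge of cost B/n.
        n = len(V)
        terminals = [f"t{i}" for i in range(1, k + 1)]
        new_cost = dict(cost)
        for t in terminals:
            for u in V:
                new_cost[(t, u)] = B / n
        return terminals, new_cost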
Lemma 5.3. If there is an (α, k^{1−ε}) bicriteria approximation algorithm for min-sum k-partitioning for some constant ε > 0, then there is also an (α · log log k, γ) bicriteria approximation algorithm with γ ≤ 3^{2/ε}.

Proof. Let A denote the (α, k^{1−ε}) bicriteria approximation algorithm. The idea is to use A recursively to obtain the claimed (α · log log k, γ) approximation; details are below. Let G denote the input graph with n vertices, and OPT the optimal balanced k–partition. The algorithm deals with several subinstances of G, each of which is assigned a level from {0, 1, . . . , t}, where t := Θ(log log k) is fixed below. Every level i instance will contain k_i = k^{(1−ε/2)^i} parts and n_i = (n/k) · k_i vertices. Choose t to be the smallest integer such that k_t ≤ 3^{2/ε}. While generating the subinstances we also add dummy singleton vertices (to keep the instances balanced). For notational simplicity we use the same identifier for a subinstance and the graph corresponding to it. Note that G is the unique level 0 instance. For each i ∈ {0, 1, . . . , t − 1}, every level i instance I generates k_i level i + 1 instances as follows:
• run algorithm A on I to obtain a k_i^{1−ε}-balanced partition {P_j : 1 ≤ j ≤ k_i} of I;
• for each j ∈ [k_i], add n_{i+1} − |P_j| singleton vertices to I[P_j] to obtain a new level i + 1 instance.
Note that this process is indeed well defined since |P_j| ≤ (n_i/k_i) · k_i^{1−ε} ≤ (n_i/k_i) · k_i^{1−ε/2} = (n_i/k_i) · k_{i+1} = n_{i+1}. Moreover the total number of level i + 1 instances is Π_{ℓ=0}^{i} k_ℓ = k^{Σ_{ℓ=0}^{i} (1−ε/2)^ℓ} ≤ k^{2/ε}. Since each instance has size at most n, the total size of all
instances is polynomial.
The algorithm finally returns the partition P corresponding to the set of all level t instances. Note that there are at most n_t = (n/k) · k_t ≤ 3^{2/ε} · (n/k) vertices in each level t instance. The algorithm ignores all dummy vertices in each piece of P and greedily merges pieces until every piece has at least n/k vertices (all from V); let P′ denote the resulting partition of V. Clearly there are at most k pieces in P′, and each has size at most 3^{2/ε} · (n/k). Thus P′ is a 3^{2/ε}-balanced k–partition.
We now upper bound the total cost of all edges removed by the algorithm (over
all instances); this also bounds the cost of P′. This is immediate from Claim 5.4
below: since there are t levels and A achieves an α-approximation to the cost, the
total cost is bounded by α t · OPT.
Claim 5.4. For each i ∈ {1, . . . , t}, the sum of optimal values of level i instances
is at most OPT.
Proof. Consider any fixed i, and an instance I of level i. We will show that
the edges of OPT induced on I form a balanced ki -partition for I. This suffices to
prove the claim since the level i instances partition V (the original vertex set). Let V′ denote the vertices from V in I; note that |V′| ≤ (n_{i−1}/k_{i−1}) · k_{i−1}^{1−ε}. Consider the partition Q of V′ induced by OPT; note that each piece in Q has size at most n/k. Greedily merge pieces in Q as long as the size of each piece is at most n/k, to obtain partition Q′; so the number of pieces is

|Q′| ≤ 1 + 2|V′| · (k/n) ≤ 1 + 2 · (n_{i−1}/k_{i−1}) · k_{i−1}^{1−ε} · (k/n) = 1 + 2 k_{i−1}^{1−ε} ≤ 3 k_{i−1}^{1−ε} ≤ k_{i−1}^{1−ε/2} = k_i.

The second-last inequality uses k_{i−1} ≥ 3^{2/ε}, which is true by the choice of t. Now we can fill pieces of Q′ with dummy singleton vertices of instance I to obtain a balanced k_i-partition of I.
can fill pieces of Q with dummy singleton vertices of instance I to obtain a balanced
ki -partition of I.
This completes the proof of Lemma 5.3.

Appendix A. Integrality gap for SDP relaxation of min-max-multiway-cut. Consider the following semidefinite relaxation for the min-max multiway cut
problem:
       min  λ
(18)   s.t.  λ ≥ Σ_{(u,v)∈E} ||y_{u,i} − y_{v,i}||₂²                         ∀i = 1, 2, . . . , k,
(19)         Σ_{i=1}^k ||y_{u,i}||₂² = 1                                      ∀u ∈ V,
(20)         ||y_{t_i,i}||₂² = 1                                              ∀i = 1, 2, . . . , k,
(21)         y_{u,i} · y_{u,j} = 0                                            ∀u ∈ V, ∀i ≠ j,
(22)         Σ_{j=1}^k y_{u,i} · y_{v,j} = ||y_{u,i}||₂²                      ∀u, v ∈ V, ∀i = 1, 2, . . . , k,
(23)         ||y_{u,i} − y_{v,j}||₂² + ||y_{v,j} − y_{w,r}||₂² ≥ ||y_{u,i} − y_{w,r}||₂²   ∀u, v, w ∈ V, ∀i, j, r = 1, 2, . . . , k,
(24)         y_{u,i} · y_{v,j} ≥ 0                                            ∀u, v ∈ V, ∀i, j = 1, 2, . . . , k,
(25)         ||y_{u,i}||₂² ≥ y_{u,i} · y_{v,j}                                ∀u, v ∈ V, ∀i, j = 1, 2, . . . , k.
In the above relaxation, each vertex u ∈ V is associated with k different vectors: y_{u,1}, y_{u,2}, . . . , y_{u,k}. Each of those corresponds to a different terminal. Notice that the last two constraints of the relaxation are the ℓ₂² triangle inequality constraints including the origin (the same constraints used by [10]).
The integrality gap instance we consider is the star graph, which contains a single vertex u connected by k edges to the k terminals. The value of any integral solution to this instance is exactly k − 1, since the only choice is the terminal to which u is assigned. Any choice made results in a min-max objective value of k − 1.
Let us construct the fractional solution to the above relaxation. Fix e to be an arbitrary unit vector, and x_1, x_2, . . . , x_k to be k unit vectors that are all orthogonal to e and such that the inner product between any two of them is −1/(k − 1). Set the following fractional solution:

y_{t_i,i} = e                                  ∀i = 1, 2, . . . , k,
y_{t_i,j} = 0                                  ∀i ≠ j,
y_{u,i} = (1/k) e + (√(k−1)/k) x_i             ∀i = 1, 2, . . . , k.
First, we show that the above fractional solution is feasible. Constraints (19), (20),
and (21) are obviously feasible for all terminals. Vertex u also upholds constraint (19)
since
Σ_{i=1}^k || (1/k) e + (√(k−1)/k) x_i ||₂² = Σ_{i=1}^k ( 1/k² + (k−1)/k² ) = 1.

Let us verify that vertex u also satisfies constraint (21):


( (1/k) e + (√(k−1)/k) x_i ) · ( (1/k) e + (√(k−1)/k) x_j ) = 1/k² + ((k−1)/k²) · (−1/(k−1)) = 0.

Focus on constraint (22). It is easy to verify that for all terminals, the sum of all k vectors that are associated with the picked vertex is exactly e. Since Σ_{i=1}^k x_i = 0, the sum of all vectors associated with u is also e. Therefore, all constraints of type (22) are satisfied.
Regarding the last three constraints, which are the ℓ₂² triangle inequality constraints including the origin, one can verify the following:

|| e − 0 ||₂² = 1,
|| e − ( (1/k) e + (√(k−1)/k) x_i ) ||₂² = 1 − 1/k,
|| 0 − ( (1/k) e + (√(k−1)/k) x_i ) ||₂² = 1/k,
|| ( (1/k) e + (√(k−1)/k) x_i ) − ( (1/k) e + (√(k−1)/k) x_j ) ||₂² = 2/k.

The last three constraints are all derived from the above calculations. Hence, we
can conclude that the fractional solution defined above is feasible for the semidefinite
relaxation.
The value of this solution is

max_{1≤i≤k} [ (1 − 1/k) + (k − 1) · (1/k) ] = 2(k − 1)/k.

This gives an integrality gap of k/2.
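As a numerical sanity check of the computation above (not part of the paper), one can build the vectors explicitly for a small k and verify the stated values; numpy is used, and k = 5 is an arbitrary choice.

    import numpy as np

    k = 5
    e = np.zeros(k + 1); e[0] = 1.0
    # x_1..x_k: unit vectors orthogonal to e with pairwise inner product -1/(k-1)
    X = []
    for i in range(k):
        v = np.zeros(k + 1)
        v[1 + i] = 1.0
        v[1:] -= 1.0 / k
        X.append(v / np.linalg.norm(v))

    y_u = [e / k + (np.sqrt(k - 1) / k) * X[i] for i in range(k)]                    # vectors of u
    y_t = [[e if j == i else np.zeros(k + 1) for j in range(k)] for i in range(k)]   # y_{t_i, j}

    assert np.isclose(sum(v @ v for v in y_u), 1.0)                                  # constraint (19) at u
    assert all(np.isclose(y_u[i] @ y_u[j], 0.0) for i in range(k) for j in range(k) if i != j)  # (21)

    # objective value for each terminal i: sum over the k star edges (u, t_j)
    vals = [sum(np.linalg.norm(y_u[i] - y_t[j][i]) ** 2 for j in range(k)) for i in range(k)]
    print(max(vals), 2 * (k - 1) / k)   # fractional value 2(k-1)/k, versus integral value k-1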


Appendix B. Bad example: Greedy algorithm for min-max k-partitioning. We show that the naive greedy algorithm that repeatedly uses SSE to remove a part of size n/k performs very poorly. In this example we even assume that there is an exact algorithm for SSE. The graph is a tree on n = k² vertices V := {v} ∪ ⋃_{i=1}^{k−1} {u_{i,0}, . . . , u_{i,k}} and edges E = ⋃_{i=1}^{k−1} {(u_{i,j}, u_{i,j−1}) : 1 ≤ j ≤ k} ∪ {(u_{i,0}, u_{i−1,0}) : 2 ≤ i ≤ k − 1} ∪ {(v, u_{1,0})}.
The simple greedy algorithm will cut out parts having a small boundary for k − 1
iterations, namely, Pi = {ui,1 , . . . , ui,k } for i ∈ [k − 1]; note that δ(Pi ) = 1 for all
i ∈ [k − 1]. However the last part {v, u1,0 , . . . , uk−1,0 } has cut value k − 1; so the
resulting objective value is k − 1.
On the other hand, it can be checked directly that the optimal value is at most
four: Consider the partition obtained by repeatedly taking the first k consecutive
vertices from the ordering v, u1,0 , . . . , u1,k , u2,0 , . . . , u2,k , . . . , uk−1,0 , . . . , uk−1,k .
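A short Python sketch (not from the paper) that builds this tree for a small k and compares the greedy outcome with the consecutive-block partition:

    k = 6
    nodes = ['v'] + [(i, j) for i in range(1, k) for j in range(k + 1)]      # n = k^2 vertices
    edges = [((i, j), (i, j - 1)) for i in range(1, k) for j in range(1, k + 1)]
    edges += [((i, 0), (i - 1, 0)) for i in range(2, k)] + [('v', (1, 0))]

    def cut(S):
        S = set(S)
        return sum((a in S) != (b in S) for a, b in edges)

    # greedy removes P_i = {u_{i,1},...,u_{i,k}} (cut 1 each); its last part is the spine plus v
    greedy_last = ['v'] + [(i, 0) for i in range(1, k)]
    # the near-optimal partition: consecutive blocks of k vertices in the stated ordering
    order = ['v'] + [(i, j) for i in range(1, k) for j in range(k + 1)]
    blocks = [order[t:t + k] for t in range(0, len(order), k)]
    print(cut(greedy_last), max(cut(b) for b in blocks))   # prints k-1 versus a small constant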
Acknowledgment. We thank the anonymous reviewers for their careful reading
and instructive comments.

REFERENCES

[1] N. Alon and B. Klartag, Economical toric spines via Cheeger’s inequality, J. Topol. Anal.,
1 (2009), pp. 101–111.
[2] K. Andreev and H. Racke, Balanced graph partitioning, Theoret. Comput. Systems, 39
(2006), pp. 929–939.
[3] S. Arora, S. Rao, and U. V. Vazirani, Geometry, flows, and graph-partitioning algorithms,
Commun. ACM, 51 (2008), pp. 96–105.

[4] Y. Aumann and Y. Rabani, An O(log k) approximate min-cut max-flow theorem and approx-
imation algorithm, SIAM J. Comput., 27 (1998), pp. 291–301.
[5] N. Bansal, K.-W. Lee, V. Nagarajan, and M. Zafer, Minimum congestion mapping in a cloud, in Proceedings of the ACM Symposium on Principles of Distributed Computing, ACM, New York, 2011, pp. 267–276.
[6] Y. Bartal, Probabilistic approximation of metric spaces and its algorithmic applications, in
37th Annual Symposium on Foundations of Computer Science, IEEE Computer Society,
Los Alamitos, CA, 1996, pp. 184–193.
[7] G. Calinescu, H. J. Karloff, and Y. Rabani, An improved approximation algorithm for
multiway cut, J. Comput. System Sci., 60 (2000), pp. 564–574.
[8] M. Charikar, T. Leighton, S. Li, and A. Moitra, Vertex sparsifiers and abstract rounding
algorithms, in Proceedings of the Symposium on Foundations of Computer Science, IEEE
Computer Society, Los Alamitos, CA, 2010, pp. 265–274.
[9] J. Cheeger, B. Kleiner, and A. Naor, A (log n)^{Ω(1)} integrality gap for the sparsest cut SDP,
in Proceedings of the Symposium on Foundations of Computer Science, IEEE Computer
Society, Los Alamitos, CA, 2009, pp. 555–564.
[10] E. Chlamtac, K. Makarychev, and Y. Makarychev, How to play unique games using
embeddings, in Proceedings of the Symposium on Foundations of Computer Science, IEEE
Computer Society, Los Alamitos, CA, 2006, pp. 687–696.
[11] N. M. M. K. Chowdhury, M. R. Rahman, and R. Boutaba, Virtual network embedding with
coordinated node and link mapping, in INFOCOM, IEEE, Piscataway, NJ, 2009, pp. 783–
791.
[12] M. Englert, A. Gupta, R. Krauthgamer, H. Raecke, I. Talgam-Cohen, and K. Talwar,
Vertex sparsifiers: New results from old techniques, in International Workshop on Ap-
proximation Algorithms for Combinatorial Optimization Problems, Springer, Berlin, 2010,
pp. 152–165.
[13] G. Even, J. S. Naor, S. Rao, and B. Schieber, Fast approximate graph partitioning algo-
rithms, SIAM J. Comput., 28 (1999), pp. 2187–2214.
[14] J. Fakcharoenphol and K. Talwar, Improved decompositions of graphs with forbidden mi-
nors, in Sixth International Workshop on Approximation Algorithms for Combinatorial
Optimization Problems, Springer, Berlin, 2003, pp. 871–882.
[15] U. Feige, R. Krauthgamer, and K. Nissim, On cutting a few vertices from a graph, Discrete
Appl. Math., 127 (2003), pp. 643–649.
[16] A. Gupta, R. Krauthgamer, and J. R. Lee, Bounded geometries, fractals, and low-distortion
embeddings, in 44th Annual IEEE Symposium on Foundations of Computer Science, IEEE
Computer Society, Los Alamitos, CA, 2003, pp. 534–543.
[17] D. R. Karger, P. Klein, C. Stein, M. Thorup, and N. E. Young, Rounding algorithms for a
geometric embedding of minimum multiway cut, Math. Oper. Res., 29 (2004), pp. 436–461.
[18] M. Kiwi, D. A. Spielman, and S.-H. Teng, Min-max-boundary domain decomposition, The-
oret. Comput. Sci, 261 (1998), pp. 253–266.
[19] P. Klein, S. A. Plotkin, and S. Rao, Excluded minors, network decomposition, and multi-
commodity flow, in Proceedings of the 25th Annual ACM Symposium on Theory of Com-
puting, ACM, New York, 1993, pp. 682–690.
[20] R. Krauthgamer, J. Naor, and R. Schwartz, Partitioning graphs into balanced components,
in Proceedings of the ACM–SIAM Symposium on Discrete Algorithms, SIAM, Philadel-
phia, 2009, pp. 942–949.
[21] R. Krauthgamer and T. Roughgarden, Metric clustering via consistent labeling, Theory
Comput., 7 (2011), pp. 49–74.
[22] J. R. Lee and A. Naor, Lp metrics on the Heisenberg group and the Goemans-Linial con-
jecture, in Proceedings of the Symposium on Foundations of Computer Science, IEEE
Computer Society, Los Alamitos, CA, 2006, pp. 99–108.
[23] J. R. Lee and A. Sidiropoulos, Genus and the geometry of the cut graph, in Proceedings
of the 21st Annual ACM-SIAM Symposium on Discrete Algorithms, SIAM, Philadelphia,
2010, pp. 193–201.
[24] T. Leighton and A. Moitra, Extensions and limits to vertex sparsification, in Proceedings
of the Symposium on Theory of Computing, ACM, New York, 2010, pp. 47–56.
[25] T. Leighton and S. Rao, Multicommodity max-flow min-cut theorems and their use in de-
signing approximation algorithms, J. ACM, 46 (1999), pp. 787–832.
[26] N. Linial, E. London, and Y. Rabinovich, The geometry of graphs and some of its algorith-
mic applications, Combinatorica, 15 (1995), pp. 215–245.
[27] K. Makarychev and Y. Makarychev, Metric extension operators, vertex sparsifiers and
Lipschitz extendability, in Proceedings of the Symposium on Foundations of Computer
Science, IEEE Computer Society, Los Alamitos, CA, 2010, pp. 255–264.

[28] A. Moitra, Approximation algorithms for multicommodity-type problems with guarantees inde-
pendent of the graph size, in Proceedings of the Symposium on Foundations of Computer
Science, IEEE Computer Society, Los Alamitos, CA, 2009, pp. 3–12.
[29] H. Räcke, Optimal hierarchical decompositions for congestion minimization in networks, in


Proceedings of the Symposium on Theory of Computing, ACM, New York, 2008, pp. 255–
264.
[30] P. Raghavendra, D. Steurer, and P. Tetali, Approximations for the isoperimetric and
spectral profile of graphs and related parameters, in Proceedings of the Symposium on
Theory of Computing, ACM, New York, 2010, pp. 631–640.
[31] P. Raghavendra, D. Steurer, and M. Tulsiani, Reductions between expansion problems,
in IEEE Conference on Computational Complexity, 2012, IEEE Computer Society, New
York, pp. 64–73.
[32] P. Raghavendra and D. Steurer, Graph expansion and the unique games conjecture, in
Proceedings of the Symposium on Theory of Computing, ACM, New York, 2010, pp. 755–
764.
[33] S. Rao, Small distortion and volume preserving embeddings for planar and Euclidean metrics,
in Proceedings of the 15th Annual Symposium on Computational Geometry, ACM, New
York, 1999, pp. 300–306.
[34] Z. Svitkina and É. Tardos, Min-max multiway cut, in APPROX-RANDOM, Lecture Notes
in Comput. Sci. 3122, Springer, Berlin, 2004, pp. 207–218.
[35] M. Yu, Y. Yi, J. Rexford, and M. Chiang, Rethinking virtual network embedding: Substrate
support for path splitting and migration, ACM SIGCOMM Comp. Commun. Rev., 38
(2008), pp. 17–29.
[36] Y. Zhu and M. H. Ammar, Algorithms for assigning substrate network resources to virtual
network components, in INFOCOM, IEEE, Piscataway, NJ, 2006.
