
Introduction To Graph Partitioning

Graph partitioning involves dividing a graph into clusters or communities with high internal connectivity and low external connectivity. Common approaches include spectral partitioning based on the graph Laplacian, flow-based partitioning using min-cut/max-flow algorithms, and combinations of the two. While polynomial-time algorithms exist, real graphs do not always have clear partitions, and finding balanced partitions remains a challenge, with different methods failing on different graph structures like expanders or long chains. Modern techniques use local search, embeddings, and hybrid spectral-flow methods.


CS369M: Algorithms for Modern Massive Data Set Analysis

Lecture 12 - 11/04/2009

Introduction to Graph Partitioning


Lecturer: Michael Mahoney

Scribes: Noah Youngs and Weidong Shao

*Unedited Notes

Graph Partition

The graph partition problem is to cut a graph into 2 or more good pieces. The methods are based on

1. spectral: either global (e.g., Cheeger's inequality) or local.
2. flow-based: min-cut/max-flow theorem, LP formulation, embeddings, local improvement.
3. combination of spectral and flow.

Note that not all graphs have good partitions.


Question: Can we certify that there are no good clusters in a graph?
Good clusters have the following properties:

1. internally (intra-cluster): well connected.
2. externally (inter-cluster): relatively poorly connected.

How do we quantify this?
Extreme cases:

1. split into 2 disconnected pieces
2. split into $S, \bar{S}$ on 2 maximum complete induced subgraphs.

Min cut problem

Definition. Given $G = (V, E)$, a cut is a partition $(S, \bar{S})$ of $V$, where $S \subset V$.
Given $s, t \in V$, an $(s, t)$ cut is a cut s.t. $s \in S$, $t \in \bar{S}$.
The cut set of a cut is $\{(u, v) : (u, v) \in E,\ u \in S,\ v \in \bar{S}\}$.

The min cut problem: find the cut with the "smallest" total edge weight.

1. good: polynomial-time algorithm (min-cut = max-flow)
2. bad: often get a very imbalanced cut
3. in theory: cut algorithms are used as a subroutine in divide-and-conquer algorithms
4. in practice: often want to "interpret" the clusters or partitions
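The definitions above can be made concrete on a tiny graph. The following is a minimal sketch, assuming an undirected weighted graph stored as a list of `(u, v, weight)` triples (the representation, the function names, and the example graph are illustrative assumptions, not part of the notes). It enumerates every cut by brute force, which is exponential and only meant to show why an unconstrained min cut is often very imbalanced.

```python
from itertools import combinations

def cut_set(edges, S):
    """Edges with exactly one endpoint in S (undirected graph assumed)."""
    S = set(S)
    return [(u, v, w) for (u, v, w) in edges if (u in S) != (v in S)]

def cut_weight(edges, S):
    """Total weight of the cut set of (S, complement of S)."""
    return sum(w for _, _, w in cut_set(edges, S))

def brute_force_min_cut(nodes, edges):
    """Try every nonempty proper subset S; for tiny graphs only."""
    best = None
    for k in range(1, len(nodes)):
        for S in combinations(nodes, k):
            w = cut_weight(edges, S)
            if best is None or w < best[0]:
                best = (w, set(S))
    return best

# Hypothetical example: a heavy triangle a-b-c with a pendant node d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 3), ("a", "c", 3), ("b", "c", 3), ("c", "d", 1)]
weight, S = brute_force_min_cut(nodes, edges)
print(weight, S)  # the minimum cut just slices off the single node d
```

Here the cheapest cut has weight 1 and separates one node from the other three, which is the imbalance noted in item 2.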

Max Flow Problem

Definition. Denote the capacity of an edge $(u, v) \in E$ by $c_{uv}$ or $c_e$, given by a capacity function $c : E \to \mathbb{R}^+$.

Then a flow is a function $f : E \to \mathbb{R}^+$, subject to

1. $f_{uv} \le c_{uv}$ for all $(u, v) \in E$ (capacity constraints)
2. $\sum_{u : (u,v) \in E} f_{uv} = \sum_{u : (v,u) \in E} f_{vu}$ for all $v \ne s, t$ (conservation of flows)

Then the value of the flow is $|f| = \sum_{v} f_{sv}$.

The MAX flow problem: $\max |f|$.

The capacity of an $(s, t)$ cut is $c(S, \bar{S}) = \sum_{(u,v) \in E,\ u \in S,\ v \in \bar{S}} c_{uv}$.

The min cut problem is $\min c(S, \bar{S})$.

Note: this is a "single flow problem", i.e. only one $s$ and one $t$.

Theorem: the max value of an $s$-$t$ flow is equal to the min capacity of an $s$-$t$ cut.
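The theorem can be checked numerically on a small instance. The sketch below uses the Edmonds-Karp augmenting-path method, which the notes do not name; it is chosen here only because it is short. The node numbering `0..n-1`, the dictionary of capacities, and the example network are assumptions made for illustration.

```python
from collections import deque

def max_flow_min_cut(n, capacity, s, t):
    """Edmonds-Karp on nodes 0..n-1; capacity maps (u, v) -> c_uv.
    Returns (flow value, S) with S the source side of a minimum cut."""
    residual = [[0] * n for _ in range(n)]
    for (u, v), c in capacity.items():
        residual[u][v] += c
    value = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = [-1] * n
        parent[s] = s
        queue = deque([s])
        while queue and parent[t] == -1:
            u = queue.popleft()
            for v in range(n):
                if parent[v] == -1 and residual[u][v] > 0:
                    parent[v] = u
                    queue.append(v)
        if parent[t] == -1:
            # No augmenting path left: the nodes reached form the set S.
            return value, {v for v in range(n) if parent[v] != -1}
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = t
        while v != s:
            bottleneck = min(bottleneck, residual[parent[v]][v])
            v = parent[v]
        v = t
        while v != s:
            u = parent[v]
            residual[u][v] -= bottleneck
            residual[v][u] += bottleneck
            v = u
        value += bottleneck

# Hypothetical network with source 0 and sink 3.
capacity = {(0, 1): 4, (0, 2): 3, (1, 2): 1, (1, 3): 2, (2, 3): 3}
value, S = max_flow_min_cut(4, capacity, 0, 3)
cut_capacity = sum(c for (u, v), c in capacity.items() if u in S and v not in S)
print(value, S, cut_capacity)
```

On this instance the flow value and the capacity of the returned cut coincide, as the theorem states.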

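Proof idea:

$\max \text{flow} \le \min \text{cut}$ (weak duality)

Does there exist a cut that achieves equality?
Yes: by the strong duality theorem, we can equivalently solve the dual of the max-flow problem, which is the min-cut problem.

Primal (max flow):

$\max |f|$ subject to $f_{uv} \le c_{uv}$ for all $(u, v) \in E$ (together with flow conservation)

Dual (min cut):

$\min \sum_{(i,j) \in E} c_{ij} d_{ij}$
s.t.
$d_{ij} - p_i + p_j \ge 0,\ \forall (i,j) \in E$
$p_s = 1,\ p_t = 0,\ p_i \ge 0,\ \forall i \in V$
$d_{ij} \ge 0,\ \forall (i,j) \in E$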
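As a sanity check on the dual, any $(s, t)$ cut yields a feasible dual solution whose objective equals the cut capacity; this is the weak-duality direction. The following short derivation uses only the dual constraints stated above, and the particular choice of $p$ and $d$ is one illustrative assignment.

```latex
% Given a cut (S, \bar S) with s \in S and t \in \bar S, set
\[
p_i = \begin{cases} 1 & i \in S \\ 0 & i \in \bar{S} \end{cases},
\qquad
d_{ij} = \max(p_i - p_j,\ 0).
\]
% Then p_s = 1, p_t = 0, p_i \ge 0, d_{ij} \ge 0, and
% d_{ij} - p_i + p_j \ge 0 holds on every edge.
% d_{ij} = 1 exactly on edges leaving S, so the objective is
\[
\sum_{(i,j) \in E} c_{ij} d_{ij}
  = \sum_{(i,j) \in E,\ i \in S,\ j \in \bar{S}} c_{ij}
  = c(S, \bar{S}).
\]
```

So the dual optimum is at most the min cut capacity; strong duality together with the max-flow/min-cut theorem says the two are in fact equal.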
Can we add a "balance" condition?

1. want a good cut value $E(S, \bar{S})$
2. want $S, \bar{S}$ both to be balanced: same size, or approximately the same size

The answer is "Yes".

Explicit balance conditions:

Graph bisection: min cut s.t. $|S| = |\bar{S}| = n/2$
$\beta$-balanced cut: min cut s.t. $|S| = \beta n$, $|\bar{S}| = (1 - \beta) n$
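To illustrate what an explicit balance condition changes, here is a minimal brute-force sketch of graph bisection on a toy graph. It assumes an undirected weighted edge list and an even number of nodes; the function names and the example are hypothetical, and the enumeration is exponential, so this is a demonstration rather than a practical method.

```python
from itertools import combinations

def cut_weight(edges, S):
    """Total weight of edges with exactly one endpoint in S."""
    return sum(w for u, v, w in edges if (u in S) != (v in S))

def min_bisection(nodes, edges):
    """Minimum cut subject to |S| = n/2, by exhaustive search."""
    n = len(nodes)
    assert n % 2 == 0, "bisection needs an even number of nodes"
    first, rest = nodes[0], nodes[1:]
    best = None
    # Fix the first node in S so each bisection is enumerated once.
    for others in combinations(rest, n // 2 - 1):
        S = {first, *others}
        w = cut_weight(edges, S)
        if best is None or w < best[0]:
            best = (w, S)
    return best

# Hypothetical example: a heavy triangle a-b-c with a pendant node d.
nodes = ["a", "b", "c", "d"]
edges = [("a", "b", 3), ("a", "c", 3), ("b", "c", 3), ("c", "d", 1)]
best_weight, best_S = min_bisection(nodes, edges)
print(best_weight, best_S)
```

On this graph the unconstrained minimum cut has weight 1 (isolating `d`), while the best bisection costs 6, which shows the price paid for insisting on balance.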

Implicit balance conditions:

1. input balance constraints
2. expansion: $h(S) = \frac{E(S, \bar{S})}{|S|}$, with $S$ taken to be the smaller side, $|S| \le n/2$
3. sparsity: $sp(S) = \frac{E(S, \bar{S})}{|S|\,|\bar{S}|}$
4. conductance: $\frac{E(S, \bar{S})}{Vol(S)}$, with $Vol(S) = \sum_{i \in S} \deg(v_i)$
5. normalized cut: $\frac{E(S, \bar{S})}{Vol(S)\, Vol(\bar{S})}$ (the latter two are used in ML)
6. quotient cut: $\frac{E(S, \bar{S})}{\min(Vol(S),\, Vol(\bar{S}))}$
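The quotient-style objectives listed above are easy to compare side by side on a small example. This sketch assumes an undirected, unweighted graph given as an adjacency dictionary and a set `S` that is the smaller side; the function name, dictionary keys, and the example graph (two triangles joined by one edge) are illustrative assumptions. Exact fractions are used so the values can be read off directly.

```python
from fractions import Fraction

def partition_scores(adj, S):
    """Score the cut (S, complement of S); S is assumed to be the smaller side."""
    S = set(S)
    T = set(adj) - S
    cut = sum(1 for u in S for v in adj[u] if v in T)   # E(S, S-bar)
    vol_S = sum(len(adj[u]) for u in S)                 # sum of degrees in S
    vol_T = sum(len(adj[u]) for u in T)
    return {
        "expansion": Fraction(cut, len(S)),
        "sparsity": Fraction(cut, len(S) * len(T)),
        "conductance": Fraction(cut, vol_S),
        "normalized_cut": Fraction(cut, vol_S * vol_T),
        "quotient_cut": Fraction(cut, min(vol_S, vol_T)),
    }

# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
scores = partition_scores(adj, {0, 1, 2})
for name, value in scores.items():
    print(name, value)
```

For the natural split into the two triangles, one edge is cut, each side has 3 nodes and volume 7, so every score is small; cutting off a single node instead gives noticeably larger values for each measure.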

Expansion and sparsity are the "same" in the following sense:

$\min_S h(S) \approx \min_S sp(S)$

(up to the normalization by $n$: with $h(S) = E(S, \bar{S})/|S|$ and $sp(S) = E(S, \bar{S})/(|S|\,|\bar{S}|)$, for $|S| \le n/2$ we have $n/2 \le |\bar{S}| \le n$, so $h(S)/n \le sp(S) \le 2h(S)/n$).

Quotient cuts yield a tight bound in Cheeger's inequality.
In practice: bias towards high-degree nodes.

Note:
Quotient cuts get balanced implicitly, with no explicit constraints on inter- or intra-cluster connectivity.
$\mathbb{Z}^2$, random geometric graphs, and nice planar graphs yield good quotient cuts.
More generally, quotient cuts can be very imbalanced, or produce disconnected clusters.
Example: extremely sparse random graphs in the $G(n, p)$ model; compare $p \sim \log n^2 / n$ (expander) with $p \sim \log n / n$.

Graph Partition Algorithms

4.1 Local Improvement

Developed in the 70's
Often it is a greedy improvement
Local minima are a big problem
Usual methods improve them by constant factors
- simulated annealing
- big difference in practice

Kernighan-Lin algorithm: fundamental work, no longer used due to its quadratic ($n^2$) running time
Fiduccia-Mattheyses algorithm: linear time, still commonly used
METIS algorithm from Karypis and Kumar: works very well in practice, especially on low-dimensional graphs
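The flavor of greedy local improvement can be shown with a deliberately simplified swap heuristic. This is not Kernighan-Lin, Fiduccia-Mattheyses, or METIS: it omits their gain bookkeeping, vertex locking, and tolerance for temporarily worse moves, and it recomputes the cut from scratch at each step. The adjacency-dictionary representation, function names, and starting partition are assumptions for illustration.

```python
def cut_size(adj, S):
    """Number of edges leaving S (undirected, unweighted graph)."""
    return sum(1 for u in S for v in adj[u] if v not in S)

def greedy_swap_improvement(adj, S):
    """Repeatedly apply the single swap (u in S, v outside S) that most
    reduces the cut. Keeps |S| fixed and stops at a local minimum."""
    S = set(S)
    while True:
        T = set(adj) - S
        current = cut_size(adj, S)
        best_gain, best_pair = 0, None
        for u in sorted(S):
            for v in sorted(T):
                candidate = (S - {u}) | {v}
                gain = current - cut_size(adj, candidate)
                if gain > best_gain:
                    best_gain, best_pair = gain, (u, v)
        if best_pair is None:
            return S, current  # no swap helps: local minimum reached
        u, v = best_pair
        S = (S - {u}) | {v}

# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
start = {0, 1, 3}  # a poor but balanced starting partition
final_S, final_cut = greedy_swap_improvement(adj, start)
print(cut_size(adj, start), "->", final_cut, final_S)
```

Starting from a cut of size 5, one swap reaches the cut of size 1 between the two triangles. On less friendly graphs the same procedure can stall in a poor local minimum, which is the weakness noted above.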

4.2 Spectral methods


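Developed in the 70's and 80's
Comes with a worst-case quality guarantee (Cheeger's inequality)
At root, this is a relaxation-and-rounding method related to the QIP formulation:

$\min_{x \in \{1, -1\}^n} \frac{x^T L x}{x^T x}$

where $x$ is the $\pm 1$ indicator of $(S, \bar{S})$, so that $x^T L x = 4 E(S, \bar{S})$, together with a balance condition such as $x \perp \mathbf{1}$ to rule out the trivial solution.
- quadratic worst case.

Hyperplane rounding:
- compute an eigenvector
- cut according to some rules
- post-processing with local improvements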
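The "compute an eigenvector, then cut according to some rule" recipe can be sketched without external libraries. The code below makes several assumptions that are not in the notes: it approximates the second eigenvector of the Laplacian $L = D - A$ by power iteration on $cI - L$ (a stand-in for a proper eigensolver), it assumes a connected, undirected, unweighted graph with sortable node labels, and it uses a sweep over the sorted eigenvector entries with conductance as the cutting rule.

```python
def fiedler_vector(adj, iterations=500):
    """Approximate the Laplacian eigenvector for the smallest nonzero
    eigenvalue via power iteration on (c*I - L), orthogonal to all-ones."""
    nodes = sorted(adj)
    n = len(nodes)
    index = {u: i for i, u in enumerate(nodes)}
    c = 2 * max(len(adj[u]) for u in nodes)  # upper bound on eigenvalues of L
    x = [float(i) for i in range(n)]         # deterministic starting vector
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]          # project out the all-ones vector
        y = []
        for u in nodes:
            i = index[u]
            # (L x)_u = deg(u) * x_u - sum of x over neighbours of u
            Lx = len(adj[u]) * x[i] - sum(x[index[v]] for v in adj[u])
            y.append(c * x[i] - Lx)
        norm = sum(yi * yi for yi in y) ** 0.5
        x = [yi / norm for yi in y]
    return dict(zip(nodes, x))

def sweep_cut(adj, score):
    """Sort nodes by score and return the prefix set of best conductance."""
    order = sorted(adj, key=lambda u: score[u])
    total_vol = sum(len(adj[u]) for u in adj)
    best_phi, best_S = float("inf"), None
    S, vol_S, cut = set(), 0, 0
    for u in order[:-1]:
        inside = sum(1 for v in adj[u] if v in S)
        cut += len(adj[u]) - 2 * inside      # edges into S leave the cut
        vol_S += len(adj[u])
        S.add(u)
        phi = cut / min(vol_S, total_vol - vol_S)
        if phi < best_phi:
            best_phi, best_S = phi, set(S)
    return best_S, best_phi

# Two triangles {0,1,2} and {3,4,5} joined by the edge 2-3.
adj = {
    0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
    3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4},
}
S, phi = sweep_cut(adj, fiedler_vector(adj))
print(S, phi)
```

On this graph the sweep recovers one of the two triangles, with one cut edge over volume 7. A local-improvement pass, as in the last step of the list above, could then be applied to the result.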

4.3 Flow-based methods


Developed in the 90's
Consider all pairs: a multi-commodity flow problem.
Want to route the commodities s.t. the constraints are satisfied without bottlenecks.
Idea: a bottleneck in the flow computation corresponds to a good cut.

$k$-commodity problem: does not satisfy strong duality, but does satisfy an approximate min-cut/max-flow theorem with an $O(\log n)$ gap.

- relax flow to an LP
- embed the solution in $\ell_1$
- round the solution to $\{0, 1\}$: $O(\log n)$ worst case.

4.4 Additional Graph Partitioning Notes


These methods "fail", i.e. achieve the worst case, on the following graphs:
- spectral methods: fail on long stringy pieces
- flow-based methods: fail on expander graphs. There are $n$ choose 2 pairs, but most pairs are far apart (on the order of $\log n$ apart).

Improvements/extensions for large data:

- there exist hybrid flow-based and local methods
- (cut around the cut) local spectral methods
  - find a good cut around a start node of a given size
  - running time depends on the size of the output.

4.5 Methods that combine spectral and flow

ARV algorithm (developed a few years ago by Arora, Rao, and Vazirani)

most hybrid algorithms are theoretical, but some implementations embed in SDP.

approximate solution (two-player game).

boosting & ensemble methods

References
1. Schaeffer, S. E. (2007). "Graph clustering". Computer Science Review 1(1): 27-64.
2. Kernighan, B. W.; Lin, S. (1970). "An efficient heuristic procedure for partitioning graphs". Bell System Technical Journal 49: 291-307.
3. Fiduccia, C. M.; Mattheyses, R. M. "A Linear-Time Heuristic for Improving Network Partitions". Design Automation Conference.
4. Karypis, G.; Kumar, V. (1999). "A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs". SIAM Journal on Scientific Computing.
