Tutorial - lecture II
Jan Vondrák
IBM Almaden Research Center, San Jose, CA
Lecture I:
1 Submodular functions: what and why?
2 Convex aspects: Submodular minimization
3 Concave aspects: Submodular maximization
Lecture II:
1 Hardness of constrained submodular minimization
2 Unconstrained submodular maximization
3 Hardness more generally: the symmetry gap
These hardness results assume the value oracle model: the only access to f is through value queries, f(S) = ?
Superconstant hardness for submodular minimization
A = random (hidden) set of size k = √n.
f(S) = min{√n, |S \ A| + min{log n, |S ∩ A|}}.
Analysis: with high probability, a value query does not give any information about A ⇒ an algorithm will return a set of value √n, while the optimum is log n.
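A minimal runnable sketch of this construction (my own illustration; the concrete n below is an arbitrary choice):

```python
import math
import random

# The hidden set A has size sqrt(n); the true minimizer is A itself,
# with value ~log n, but a query uncorrelated with A has |S & A| ~ 1,
# so it sees essentially min{sqrt(n), |S|} and learns nothing about A.

n = 10_000
k = int(math.sqrt(n))
A = set(random.sample(range(n), k))  # the hidden random set

def f(S):
    # f(S) = min{sqrt(n), |S \ A| + min{log n, |S & A|}}
    return min(math.sqrt(n), len(S - A) + min(math.log(n), len(S & A)))

print(f(A))                          # ~ log n ~= 9.2 (the optimum)
S = set(random.sample(range(n), k))  # a "typical" query of size sqrt(n)
print(f(S))                          # ~ sqrt(n) = 100
```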
Unconstrained submodular maximization
The randomized "double greedy" algorithm:
Initialize A = ∅, B = everything.
In each step, grow A or shrink B.
Invariant: A ⊆ B.
While A ≠ B {
  Pick i ∈ B \ A;
  Let α = max{f(A + i) − f(A), 0}, β = max{f(B − i) − f(B), 0};
  With probability α/(α+β), include i in A;
  With probability β/(α+β), remove i from B;
}
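A minimal Python rendering of the procedure (my own sketch; the small cut function used to exercise it is an assumed example, not from the lecture):

```python
import random

def double_greedy(f, ground_set):
    A, B = set(), set(ground_set)
    for i in ground_set:                 # each i is decided once, while i ∈ B \ A
        alpha = max(f(A | {i}) - f(A), 0.0)
        beta = max(f(B - {i}) - f(B), 0.0)
        # if alpha = beta = 0, either choice is fine; we include i in A
        if alpha + beta == 0 or random.random() < alpha / (alpha + beta):
            A.add(i)                     # with probability alpha/(alpha+beta)
        else:
            B.remove(i)                  # with probability beta/(alpha+beta)
    return A                             # at the end A == B

def cut(S):                              # cut functions: non-negative, submodular, non-monotone
    edges = [(0, 1), (1, 2), (2, 3), (0, 3)]
    return sum((u in S) != (v in S) for u, v in edges)

S = double_greedy(cut, range(4))
print(S, cut(S))
```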
Analysis: maintain O = (S∗ ∪ A) ∩ B, so that A ⊆ O ⊆ B throughout.
[Figure: nested sets A ⊆ O ⊆ B.]
Initially: A = ∅, B = N, O = S∗, hence f(A) + f(B) + 2f(O) ≥ 2 · OPT.
At termination A = B = O, so this potential equals 4f(A); the key inequality below shows it never decreases in expectation, hence E[f(A)] ≥ OPT/2.
Key inequality: in each step, the expected increase of f(A) + f(B) at least offsets the expected decrease of 2f(O):
α/(α+β) · α + β/(α+β) · β − 2αβ/(α+β) = (α − β)²/(α+β) ≥ 0.
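A one-line symbolic check of this identity (my own, assuming sympy is available):

```python
import sympy as sp

# Verify: a/(a+b)*a + b/(a+b)*b - 2ab/(a+b) == (a-b)^2/(a+b)
a, b = sp.symbols("alpha beta", positive=True)
lhs = a / (a + b) * a + b / (a + b) * b - 2 * a * b / (a + b)
print(sp.simplify(lhs - (a - b) ** 2 / (a + b)))  # prints 0
```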
Again, the value oracle model: the only access to f is through value queries, f(S) = ?, polynomially many times.
Continuous submodularity:
If ∂²ψ/∂x∂y ≤ 0, then f(S) = ψ(|S ∩ A|/|A|, |S ∩ B|/|B|) is submodular.
(non-increasing partial derivatives ≈ non-increasing marginal values)
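A brute-force sanity check of this statement (my own illustration; ψ(x, y) = x + y − xy is an assumed test function with ∂²ψ/∂x∂y = −1 ≤ 0):

```python
from itertools import chain, combinations

A, B = {0, 1, 2}, {3, 4, 5}
X = A | B

def f(S):
    x, y = len(S & A) / len(A), len(S & B) / len(B)
    return x + y - x * y          # psi(x, y)

def subsets(U):
    return chain.from_iterable(combinations(U, r) for r in range(len(U) + 1))

# Submodularity: f(S) + f(T) >= f(S | T) + f(S & T) for all S, T
# (small float tolerance for rounding).
print(all(f(S) + f(T) >= f(S | T) + f(S & T) - 1e-12
          for S in map(set, subsets(X)) for T in map(set, subsets(X))))
```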
[Figure: sets A, B and a set S straddling both, with f(A) = 1, f(B) = 1, f(S) = 1/2.]
[Figure: ψ(x, y) and its flattened version ψ̃(x, y) plotted against x − y, with ψ̃(1/2, 1/2) and ψ̃(0, 1) marked; ψ̃ is constant for |x − y| < δ.]
The function for |x − y| < δ is flattened so that it depends only on x + y.
If the partition (A, B) is random, then x = |S ∩ A|/|A| and y = |S ∩ B|/|B| are random variables, with high probability satisfying |x − y| < δ.
Hence, an algorithm will never learn any information about (A, B).
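A small simulation of this concentration claim (my own illustration; all parameters are arbitrary choices):

```python
import random

# For a hidden random partition (A, B) and a fixed query S, the
# fractions x and y almost always fall in the flattened strip |x-y| < delta,
# where the function value depends only on x + y.

n, delta, trials = 10_000, 0.05, 1000
ground = list(range(n))
S = set(range(n // 3))                    # an arbitrary fixed query

inside = 0
for _ in range(trials):
    random.shuffle(ground)
    A, B = set(ground[:n // 2]), set(ground[n // 2:])
    x, y = len(S & A) / len(A), len(S & B) / len(B)
    inside += abs(x - y) < delta
print(inside / trials)                    # ~ 1.0
```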
Hardness and symmetry
Example: [Figure: two symmetric elements x1, x2.]
Notes:
"Blow-up" means expanding the ground set, replacing the
objective function by the perturbed one, and extending the
feasibility constraint in a natural way.
Example: max{f (S) : |S| ≤ 1} on a ground set [k ]
−→ max{f (S) : |S| ≤ n/k } on a ground set [n].
[Figure: blow-up of the two-element ground set {x1, x2} into {x1, . . . , x6}.]
[Figure: elements of A on top, elements of B below, with an arc directed from each a_i ∈ A to its partner b_i ∈ B.]
X = A ∪ B, |A| = |B| = k,
F = {S ⊆ X : |S ∩ A| = 1, |S ∩ B| = k − 1}.
f(S) = number of arcs leaving S; symmetric under S_k.
OPT = F(1, 0, . . . , 0; 0, 1, . . . , 1) = 1.
Symmetrized OPT = F(1/k, . . . , 1/k; 1 − 1/k, . . . , 1 − 1/k) = 1/k.
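A quick numerical check of these two values (my own sketch, reading the instance as k disjoint arcs a_i → b_i, which is consistent with f and the values of F above):

```python
# The multilinear extension factorizes over the arcs:
# arc a_i -> b_i leaves S iff a_i in S and b_i not in S,
# so F(x; y) = sum_i x_i * (1 - y_i).

k = 5

def F(x, y):
    return sum(xi * (1 - yi) for xi, yi in zip(x, y))

opt = F([1] + [0] * (k - 1), [0] + [1] * (k - 1))  # pick a_1 and all of B except b_1
opt_sym = F([1 / k] * k, [(k - 1) / k] * k)        # the symmetrized point
print(opt, opt_sym)                                # 1 and 1/k = 0.2
```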
Refined instances: non-monotone submodular maximization over
matroid bases, with base packing number ν = k /(k − 1).
The theorem implies that a better than 1/k-approximation is impossible.
Symmetry gap ↔ Integrality gap
NON-MONOTONE MAXIMIZATION
Two meta-questions:
Is there a maximization problem which is significantly more difficult
for monotone submodular functions than for linear functions?