Subgradient Methods
Yuxin Chen
Princeton University, Fall 2019
Outline
• Steepest descent
• Subgradients
where $f'(x; d) := \lim_{\alpha \downarrow 0} \dfrac{f(x + \alpha d) - f(x)}{\alpha}$
• updates
$$x^{t+1} = x^t + \eta_t d^t$$
$$
f(x_1, x_2) =
\begin{cases}
5\,(9x_1^2 + 16x_2^2)^{1/2}, & \text{if } x_1 > |x_2| \\
9x_1 + 16|x_2|, & \text{if } x_1 \le |x_2|
\end{cases}
$$
Subgradients
$$g \in \partial f(0) \iff \langle g, z \rangle \le \|z\|, \quad \forall z$$
This follows from generalized Cauchy-Schwarz, i.e.
$f(x) = |x|$
$$
\partial f(x) =
\begin{cases}
\{-1\}, & \text{if } x < 0 \\
[-1, 1], & \text{if } x = 0 \\
\{1\}, & \text{if } x > 0
\end{cases}
$$
Example: $\max\{f_1(x), f_2(x)\}$
$f(x) = \max\{f_1(x), f_2(x)\}$ where $f_1$ and $f_2$ are differentiable
$$
\partial f(x) =
\begin{cases}
\{f_1'(x)\}, & \text{if } f_1(x) > f_2(x) \\
[f_1'(x), f_2'(x)], & \text{if } f_1(x) = f_2(x) \\
\{f_2'(x)\}, & \text{if } f_1(x) < f_2(x)
\end{cases}
$$
Basic rules
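$$f(x) = \|x\|_1 = \sum_{i=1}^{n} \underbrace{|x_i|}_{=: f_i(x)}$$
since
$$
\partial f_i(x) =
\begin{cases}
\mathrm{sgn}(x_i)\, e_i, & \text{if } x_i \neq 0 \\
[-1, 1] \cdot e_i, & \text{if } x_i = 0
\end{cases}
$$
we have
$$\sum_{i:\, x_i \neq 0} \mathrm{sgn}(x_i)\, e_i \in \partial f(x)$$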
$$\Longrightarrow \quad A^\top g = \sum_{i:\, a_i^\top x + b_i \neq 0} \mathrm{sgn}(a_i^\top x + b_i)\, a_i \in \partial h(x)$$
$$q_1 g_1 + \cdots + q_n g_n \in \partial f(x)$$
$$f(x) = \max_{1 \le i \le m} \left( a_i^\top x + b_i \right)$$
pick any $a_j$ s.t. $a_j^\top x + b_j = \max_i \left( a_i^\top x + b_i \right)$, then
$$a_j \in \partial f(x)$$
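The two rules above (sign vector for the $\ell_1$ norm, an active row for a max of affine functions) can be checked numerically. The following is a minimal sketch, not part of the original slides: the function names, the test point `x`, and the data `A`, `b` are illustrative assumptions, and the random check of the subgradient inequality $f(z) \ge f(x) + \langle g, z - x \rangle$ is a sanity test rather than a proof.

```python
import numpy as np

def l1_subgradient(x):
    """One subgradient of f(x) = ||x||_1: sgn(x_i) on nonzero entries, 0 elsewhere."""
    return np.sign(x)

def max_affine_subgradient(A, b, x):
    """One subgradient of f(x) = max_i (a_i^T x + b_i): the row a_j of an active index j."""
    j = int(np.argmax(A @ x + b))
    return A[j]

def satisfies_subgradient_inequality(f, g, x, trials=1000, seed=0):
    """Check f(z) >= f(x) + <g, z - x> at random points z (sanity check only)."""
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        z = x + rng.normal(size=x.shape)
        if f(z) < f(x) + g @ (z - x) - 1e-9:
            return False
    return True

# Illustrative data (assumed, not from the slides).
x = np.array([1.5, 0.0, -2.0])
A = np.array([[1.0, 2.0, 0.0], [0.0, -1.0, 3.0], [-2.0, 0.5, 1.0]])
b = np.array([0.5, -1.0, 2.0])

g_l1 = l1_subgradient(x)
g_max = max_affine_subgradient(A, b, x)
ok_l1 = satisfies_subgradient_inequality(lambda z: np.abs(z).sum(), g_l1, x)
ok_max = satisfies_subgradient_inequality(lambda z: np.max(A @ z + b), g_max, x)
```

At a coordinate with $x_i = 0$ any value in $[-1, 1]$ would do; `np.sign` simply picks $0$.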
Rewrite
$$f(x) = \sup_{y:\, \|y\|_2 = 1} y^\top \left( x_1 A_1 + \cdots + x_n A_n \right) y$$
Rewrite
$$U V^\top \in \partial f(X)$$
Lemma 4.1
Projected subgradient update rule (4.1) obeys
$$
\|x^{t+1} - x^*\|_2^2 \le \underbrace{\underbrace{\|x^t - x^*\|_2^2}_{\text{fixed}} - 2\eta_t \left( f(x^t) - f^{\mathsf{opt}} \right) + \eta_t^2 \|g^t\|_2^2}_{\text{majorizing function}} \tag{4.3}
$$
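$$
\begin{aligned}
\|x^{t+1} - x^*\|_2^2 &= \|P_C(x^t - \eta_t g^t) - P_C(x^*)\|_2^2 \\
&\le \|x^t - \eta_t g^t - x^*\|_2^2 \qquad \text{(nonexpansiveness of projection)} \\
&= \|x^t - x^*\|_2^2 - 2\eta_t \langle x^t - x^*, g^t \rangle + \eta_t^2 \|g^t\|_2^2 \\
&\le \|x^t - x^*\|_2^2 - 2\eta_t \left( f(x^t) - f(x^*) \right) + \eta_t^2 \|g^t\|_2^2
\end{aligned}
$$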
$$\eta_t = \frac{f(x^t) - f^{\mathsf{opt}}}{\|g^t\|_2^2} \tag{4.4}$$
which leads to error reduction
$$\|x^{t+1} - x^*\|_2^2 \le \|x^t - x^*\|_2^2 - \frac{\left( f(x^t) - f(x^*) \right)^2}{\|g^t\|_2^2} \tag{4.5}$$
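The stepsize rule (4.4) and the error reduction (4.5) can be illustrated on a small problem. The sketch below is an assumption-laden example rather than anything from the slides: it takes $f(x) = \|Ax - b\|_1$ with $b = A x^*$ (so $f^{\mathsf{opt}} = 0$ is known), uses an $\ell_2$ ball as the constraint set $C$, and all names (`project_l2_ball`, `polyak_subgradient`, `violations`) are hypothetical.

```python
import numpy as np

def project_l2_ball(x, radius):
    """Euclidean projection onto {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def polyak_subgradient(A, b, x0, f_opt, radius, iters):
    """Projected subgradient method on f(x) = ||Ax - b||_1 with Polyak's stepsize (4.4)."""
    x = x0.copy()
    history = [x.copy()]
    for _ in range(iters):
        r = A @ x - b
        f_val = np.abs(r).sum()
        g = A.T @ np.sign(r)              # a subgradient of ||Ax - b||_1
        g_norm_sq = g @ g
        if f_val - f_opt <= 0 or g_norm_sq == 0:
            break                         # already optimal
        eta = (f_val - f_opt) / g_norm_sq
        x = project_l2_ball(x - eta * g, radius)
        history.append(x.copy())
    return history

# Illustrative data (assumed): consistent system, so f_opt = 0 at x_star.
rng = np.random.default_rng(0)
A = rng.normal(size=(30, 5))
x_star = rng.normal(size=5)
b = A @ x_star
radius = 2.0 * np.linalg.norm(x_star)     # x_star lies inside the ball
iterates = polyak_subgradient(A, b, np.zeros(5), 0.0, radius, 200)

# Count iterations that violate the error reduction (4.5), up to rounding.
violations = 0
for x_t, x_next in zip(iterates[:-1], iterates[1:]):
    r = A @ x_t - b
    g = A.T @ np.sign(r)
    decrease = np.abs(r).sum() ** 2 / (g @ g)
    lhs = np.sum((x_next - x_star) ** 2)
    rhs = np.sum((x_t - x_star) ** 2) - decrease
    if lhs > rhs + 1e-9:
        violations += 1
```

Polyak's rule needs $f^{\mathsf{opt}}$ in advance, which is why the example is built around a problem whose optimal value is known by construction.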
Let $C_1, C_2$ be closed convex sets and suppose $C_1 \cap C_2 \neq \emptyset$
$$\text{find } x \in C_1 \cap C_2$$
$$\Updownarrow$$
$$\text{minimize}_x \ \max\{\mathrm{dist}_{C_1}(x), \mathrm{dist}_{C_2}(x)\}$$
where $\mathrm{dist}_C(x) := \min_{z \in C} \|x - z\|_2$
[PICTURE]
For this problem, the subgradient method with Polyak's stepsize rule is equivalent to alternating projection
$$x^{t+1} = P_{C_1}(x^t), \qquad x^{t+2} = P_{C_2}(x^{t+1})$$
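Proof: Use the subgradient rule for pointwise max functions to get
$$g^t \in \partial\, \mathrm{dist}_{C_i}(x^t)$$
where $i$ indexes a set attaining the maximum, i.e. $\mathrm{dist}_{C_i}(x^t) = \max\{\mathrm{dist}_{C_1}(x^t), \mathrm{dist}_{C_2}(x^t)\}$. If $\mathrm{dist}_{C_i}(x^t) \neq 0$, then
$$g^t = \nabla \mathrm{dist}_{C_i}(x^t) = \frac{x^t - P_{C_i}(x^t)}{\mathrm{dist}_{C_i}(x^t)}$$
which follows since $\nabla \tfrac{1}{2} \mathrm{dist}_{C_i}^2(x^t) = x^t - P_{C_i}(x^t)$ (homework) and
$\nabla \tfrac{1}{2} \mathrm{dist}_{C_i}^2(x^t) = \mathrm{dist}_{C_i}(x^t) \cdot \nabla \mathrm{dist}_{C_i}(x^t)$.
Since $\|g^t\|_2 = 1$ and $f^{\mathsf{opt}} = 0$, Polyak's stepsize is $\eta_t = \mathrm{dist}_{C_i}(x^t)$, and hence
$$x^{t+1} = x^t - \eta_t g^t = x^t - \mathrm{dist}_{C_i}(x^t) \cdot \frac{x^t - P_{C_i}(x^t)}{\mathrm{dist}_{C_i}(x^t)} = P_{C_i}(x^t)$$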
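The equivalence in this proof can be observed numerically. The sketch below is illustrative only: the two sets (a unit ball and a halfspace in $\mathbb{R}^2$), the starting point, and all function names are assumptions chosen so that $C_1 \cap C_2 \neq \emptyset$. Each Polyak step on $\max_i \mathrm{dist}_{C_i}$ is compared with the projection onto the farther set.

```python
import numpy as np

def project_ball(x, center, radius):
    """Projection onto the l2 ball {z : ||z - center||_2 <= radius}."""
    d = x - center
    n = np.linalg.norm(d)
    return x if n <= radius else center + d * (radius / n)

def project_halfspace(x, a, beta):
    """Projection onto the halfspace {z : a^T z <= beta}."""
    slack = a @ x - beta
    return x if slack <= 0 else x - (slack / (a @ a)) * a

def polyak_step(x, projections):
    """One subgradient step with Polyak's stepsize on f(x) = max_i dist_{C_i}(x), f_opt = 0."""
    candidates = [P(x) for P in projections]
    dists = [np.linalg.norm(x - p) for p in candidates]
    i = int(np.argmax(dists))             # farther set attains the max
    if dists[i] == 0.0:
        return x, i                       # x already lies in both sets
    g = (x - candidates[i]) / dists[i]    # gradient of dist_{C_i} at x (unit norm)
    eta = dists[i] / (g @ g)              # (f(x) - f_opt) / ||g||^2
    return x - eta * g, i

# Illustrative sets (assumed): unit ball and the halfspace x_1 + x_2 <= -0.5.
center, radius = np.array([0.0, 0.0]), 1.0
a, beta = np.array([1.0, 1.0]), -0.5
projections = [
    lambda x: project_ball(x, center, radius),
    lambda x: project_halfspace(x, a, beta),
]

x = np.array([3.0, -4.0])
max_gap = 0.0
for _ in range(50):
    x_next, i = polyak_step(x, projections)
    # Compare the subgradient step with the plain projection onto C_i.
    max_gap = max(max_gap, np.linalg.norm(x_next - projections[i](x)))
    x = x_next

final_violation = max(np.linalg.norm(x - P(x)) for P in projections)
```

After a projection onto $C_i$ the distance to $C_i$ vanishes, so the other set attains the maximum at the next step; that is what makes the iteration alternate.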
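$$f^{\mathsf{best},t} - f^{\mathsf{opt}} \le \frac{L_f \|x^0 - x^*\|_2}{\sqrt{t+1}}$$
• sublinear convergence rate $O(1/\sqrt{t})$
$$\Longrightarrow \quad (t+1) \left( f^{\mathsf{best},t} - f^{\mathsf{opt}} \right)^2 \le \|x^0 - x^*\|_2^2\, L_f^2$$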
$$\lim_{t \to \infty} \left( f^{\mathsf{best},t} - f^{\mathsf{opt}} \right) \le \frac{L_f^2\, \eta}{2}$$
i.e. may converge to non-optimal points
• Diminishing step size obeying $\sum_t \eta_t^2 < \infty$ and $\sum_t \eta_t \to \infty$:
$$\lim_{t \to \infty} \left( f^{\mathsf{best},t} - f^{\mathsf{opt}} \right) = 0$$
• Optimal choice? $\eta_t = \dfrac{1}{\sqrt{t}}$
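The step-size rules in this list can be compared on a toy problem. This is a minimal sketch under stated assumptions: the objective $f(x) = \|x\|_1$ (so $f^{\mathsf{opt}} = 0$, $x^* = 0$, $L_f = \sqrt{n}$), the starting point, and the helper name `subgradient_method` are all illustrative. The quantity checked is the standard bound obtained by summing (4.3) over iterations with $\|g^t\|_2 \le L_f$, namely $f^{\mathsf{best}} - f^{\mathsf{opt}} \le \big(\|x^0 - x^*\|_2^2 + L_f^2 \sum_\tau \eta_\tau^2\big) / \big(2 \sum_\tau \eta_\tau\big)$; the diminishing rule is indexed as $1/\sqrt{t+1}$ so that the first step is well defined.

```python
import numpy as np

def subgradient_method(x0, step_size, iters):
    """Minimize f(x) = ||x||_1; return (best value seen, sum of steps, sum of squared steps)."""
    x = x0.copy()
    f_best = np.abs(x).sum()
    sum_eta, sum_eta_sq = 0.0, 0.0
    for t in range(iters):
        eta = step_size(t)
        x = x - eta * np.sign(x)          # sign(x) is a subgradient of ||x||_1
        sum_eta += eta
        sum_eta_sq += eta ** 2
        f_best = min(f_best, np.abs(x).sum())
    return f_best, sum_eta, sum_eta_sq

def theoretical_bound(x0, L_f, sum_eta, sum_eta_sq):
    """(||x0 - x*||^2 + L_f^2 * sum eta^2) / (2 * sum eta), with x* = 0."""
    return (x0 @ x0 + L_f ** 2 * sum_eta_sq) / (2.0 * sum_eta)

x0 = np.array([3.3, -2.7, 1.1, 0.45])     # illustrative starting point
L_f = np.sqrt(x0.size)                    # ||sign(x)||_2 <= sqrt(n)
iters = 2000

# Constant step size eta = 0.1: the bound tends to L_f^2 * eta / 2 = 0.2.
f_const, s1, s2 = subgradient_method(x0, lambda t: 0.1, iters)
bound_const = theoretical_bound(x0, L_f, s1, s2)

# Diminishing step size eta_t = 1 / sqrt(t + 1).
f_dimin, s1, s2 = subgradient_method(x0, lambda t: 1.0 / np.sqrt(t + 1), iters)
bound_dimin = theoretical_bound(x0, L_f, s1, s2)
```

With a constant step the bound levels off at $L_f^2 \eta / 2$, while with diminishing steps it keeps shrinking as the number of iterations grows.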
$$\Longrightarrow \quad f(x^t) - f^{\mathsf{opt}} \le \frac{1 - \mu \eta_t}{2\eta_t} \|x^t - x^*\|_2^2 - \frac{1}{2\eta_t} \|x^{t+1} - x^*\|_2^2 + \frac{\eta_t}{2} \|g^t\|_2^2$$
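Since $\eta_t = 2/(\mu(t+1))$, we have
$$f(x^t) - f^{\mathsf{opt}} \le \frac{\mu(t-1)}{4} \|x^t - x^*\|_2^2 - \frac{\mu(t+1)}{4} \|x^{t+1} - x^*\|_2^2 + \frac{1}{\mu(t+1)} \|g^t\|_2^2$$
and hence
$$t \left( f(x^t) - f^{\mathsf{opt}} \right) \le \frac{\mu t(t-1)}{4} \|x^t - x^*\|_2^2 - \frac{\mu t(t+1)}{4} \|x^{t+1} - x^*\|_2^2 + \frac{1}{\mu} \|g^t\|_2^2$$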
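$$\Longrightarrow \quad f^{\mathsf{best},t} - f^{\mathsf{opt}} \le \frac{L_f^2}{\mu} \cdot \frac{t}{\sum_{k=0}^{t} k} \le \frac{2 L_f^2}{\mu} \cdot \frac{1}{t+1}$$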
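The $O(1/t)$ bound for the strongly convex case can be checked on a small example. The sketch below rests on assumptions not in the slides: the objective $f(x) = \|x\|_1 + \tfrac{\mu}{2}\|x\|_2^2$ (which is $\mu$-strongly convex with $f^{\mathsf{opt}} = 0$ at $x^* = 0$), an $\ell_2$ ball of radius $R$ as the feasible set so that $\|g\|_2 \le L_f = \sqrt{n} + \mu R$, and hypothetical names throughout.

```python
import numpy as np

def project_l2_ball(x, radius):
    """Euclidean projection onto {x : ||x||_2 <= radius}."""
    norm = np.linalg.norm(x)
    return x if norm <= radius else x * (radius / norm)

def strongly_convex_subgradient(x0, mu, radius, iters):
    """Projected subgradient method on f(x) = ||x||_1 + (mu/2)||x||_2^2
    with eta_t = 2 / (mu (t + 1)); returns the best objective value over x^0..x^{iters-1}."""
    x = x0.copy()
    f_best = np.inf
    for t in range(iters):
        f_val = np.abs(x).sum() + 0.5 * mu * (x @ x)
        f_best = min(f_best, f_val)
        g = np.sign(x) + mu * x           # a subgradient of f at x
        eta = 2.0 / (mu * (t + 1))
        x = project_l2_ball(x - eta * g, radius)
    return f_best

# Illustrative parameters (assumed).
mu, radius, iters = 0.5, 5.0, 500
x0 = np.array([3.0, -2.0, 1.0, 0.5])      # inside the ball of radius 5
f_best = strongly_convex_subgradient(x0, mu, radius, iters)

L_f = np.sqrt(x0.size) + mu * radius      # bound on ||g||_2 over the ball
bound = 2.0 * L_f ** 2 / (mu * iters)     # 2 L_f^2 / (mu (t + 1)) with t = iters - 1
```

The projection is there only to keep the subgradients bounded; without a bounded domain the quadratic term would make $L_f$ infinite.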
$f^{\mathsf{opt}} = f(x^*, y^*)$ with $x^*$ and $y^*$ being optimal solutions
Optimal point $(x^*, y^*)$ obeys
$$f(x^*, y) \le f(x^*, y^*) \le f(x, y^*), \quad \forall x \in X,\ y \in Y$$
[PICTURE]
By convexity-concavity of $f$,
$$f(x^t, y^t) - f(x, y^t) \le \langle g_x^t, x^t - x \rangle, \quad x \in X$$
$$f(x^t, y) - f(x^t, y^t) \le \langle g_y^t, y - y^t \rangle, \quad y \in Y$$
indicating that
$$f(x^t, y) - f(x, y^t) \le \langle g_x^t, x^t - x \rangle - \langle g_y^t, y^t - y \rangle, \quad x \in X,\ y \in Y$$
Therefore, invoking convexity-concavity of $f$ again gives
Projected subgradient method
One way to measure the quality of the solution is via the following
error metric (think of it as a certain “duality gap”)
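$$\varepsilon(x, y) := \left[ \max_{\tilde{y} \in Y} f(x, \tilde{y}) - f^{\mathsf{opt}} \right] + \left[ f^{\mathsf{opt}} - \min_{\tilde{x} \in X} f(\tilde{x}, y) \right]$$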
$$\le \frac{1}{\sum_{\tau=0}^{t} \eta_\tau} \max_{x \in X,\, y \in Y} \sum_{\tau=0}^{t} \eta_\tau \left( \langle g_x^\tau, x^\tau - x \rangle - \langle g_y^\tau, y^\tau - y \rangle \right) \tag{4.7}$$
$$\Longrightarrow \quad 2\eta_\tau \langle x^\tau - x, g_x^\tau \rangle \le \|x^\tau - x\|_2^2 - \|x^{\tau+1} - x\|_2^2 + \eta_\tau^2 \|g_x^\tau\|_2^2$$
as claimed