Seq Slides
• SCP is a heuristic
– it can fail to find optimal (or even feasible) point
– results can (and often do) depend on starting point
(can run algorithm from many initial points and take best result)
• SCP often works well, i.e., finds a feasible point with good, if not
optimal, objective value
\[
\begin{array}{ll}
\mbox{minimize} & f_0(x) \\
\mbox{subject to} & f_i(x) \le 0, \quad i = 1, \ldots, m \\
& h_i(x) = 0, \quad i = 1, \ldots, p
\end{array}
\]
with variable $x \in \mathbf{R}^n$
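\[
\begin{array}{ll}
\mbox{minimize} & \hat f_0(x) \\
\mbox{subject to} & \hat f_i(x) \le 0, \quad i = 1, \ldots, m \\
& \hat h_i(x) = 0, \quad i = 1, \ldots, p \\
& x \in \mathcal{T}^{(k)}
\end{array}
\]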
\[
\mathcal{T}^{(k)} = \{ x \mid |x_i - x_i^{(k)}| \le \rho_i, \; i = 1, \ldots, n \}
\]
\[
P = \left( \nabla^2 f(x^{(k)}) \right)_+, \quad \mbox{PSD part of Hessian}
\]
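A minimal sketch of how the PSD part of a symmetric matrix can be computed, assuming the usual definition (eigendecompose, then set negative eigenvalues to zero). The function name `psd_part` is hypothetical and not from the slides.

```python
import numpy as np

def psd_part(H):
    """Return (H)_+: H with its negative eigenvalues replaced by zero."""
    H = 0.5 * (H + H.T)             # symmetrize to guard against round-off
    w, V = np.linalg.eigh(H)        # eigenvalues w, orthonormal eigenvectors in columns of V
    w_plus = np.maximum(w, 0.0)     # keep only the nonnegative eigenvalues
    return (V * w_plus) @ V.T       # V diag(w_plus) V^T

# Example: an indefinite Hessian and its PSD part
H = np.array([[2.0, 0.0],
              [0.0, -3.0]])
P = psd_part(H)
print(P)
```

Using $P$ in place of the full Hessian keeps the quadratic model convex even where $f$ has negative curvature.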
• particle method:
– choose points $z_1, \ldots, z_K \in \mathcal{T}^{(k)}$
(e.g., all vertices, some vertices, grid, random, ...)
– evaluate $y_i = f(z_i)$
– fit data $(z_i, y_i)$ with convex (affine) function
(using convex optimization)
• advantages:
– handles nondifferentiable functions, or functions for which evaluating
derivatives is difficult
– gives regional models, which depend on current point and trust
region radii ρi
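The particle method can be sketched as follows for the affine case. This is an illustration under stated assumptions, not the slides' implementation: points are sampled uniformly at random in the trust region (one of the options listed above), and the affine fit uses ordinary least squares, which is one convex fitting problem among several possible choices. The name `fit_affine_particles` and its parameters are hypothetical.

```python
import numpy as np

def fit_affine_particles(f, x_k, rho, num_points=50, seed=0):
    """Fit f(z) ~ a^T (z - x_k) + b from samples z in the box |z_i - x_k_i| <= rho_i."""
    rng = np.random.default_rng(seed)
    n = x_k.size
    # choose points z_1, ..., z_K in the trust region (random sampling)
    Z = x_k + rng.uniform(-1.0, 1.0, size=(num_points, n)) * rho
    # evaluate y_i = f(z_i); only function values are needed, no derivatives
    y = np.array([f(z) for z in Z])
    # least-squares fit of the affine model (a convex problem)
    D = np.hstack([Z - x_k, np.ones((num_points, 1))])
    coef, *_ = np.linalg.lstsq(D, y, rcond=None)
    return coef[:n], coef[n]        # slope a, offset b

# Example: a nondifferentiable function, modeled around x_k with radii rho
f = lambda z: np.abs(z).sum() + z[0] * z[1]
a, b = fit_affine_particles(f, x_k=np.array([0.0, 1.0]), rho=np.array([0.5, 0.5]))
print(a, b)
```

Because the samples span the whole trust region, the fitted model changes with the radii $\rho_i$ as well as with the current point, which is what makes it a regional rather than a local model.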
• example:
\[
h(x) = (1/2) x^T P x + q^T x + r = ((1/2) P x + q)^T x + r
\]
• nonconvex QP
• use approximation
[Figure: $f(x^{(k)})$ versus $k$]
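\[
{} = (1/2) x^T (P + 2 \operatorname{diag}(\lambda)) x + q^T x - \mathbf{1}^T \lambda
\]
• $g(\lambda) = -(1/2) q^T (P + 2 \operatorname{diag}(\lambda))^{-1} q - \mathbf{1}^T \lambda$; need $P + 2 \operatorname{diag}(\lambda) \succ 0$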
where λ > 0
• for λ large enough, minimizer of φ is solution of original problem
• for SCP, use convex approximation
\[
\hat\phi(x) = \hat f_0(x) + \lambda \left( \sum_{i=1}^m \hat f_i(x)_+ + \sum_{i=1}^p |\hat h_i(x)| \right)
\]
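A small sketch of how a penalty function of this form can be evaluated. The same expression applies whether the callables are the original functions (giving $\phi$) or their convex approximations (giving $\hat\phi$). The function name `exact_penalty` and the toy problem are hypothetical, chosen only for illustration.

```python
import numpy as np

def exact_penalty(f0, ineqs, eqs, x, lam):
    """Objective plus lam times total constraint violation.

    ineqs: callables f_i with constraint f_i(x) <= 0, penalized by max(f_i(x), 0)
    eqs:   callables h_i with constraint h_i(x) = 0,  penalized by |h_i(x)|
    """
    violation = sum(max(fi(x), 0.0) for fi in ineqs) + sum(abs(hi(x)) for hi in eqs)
    return f0(x) + lam * violation

# Toy problem (illustrative only): minimize x1^2 + x2^2
# subject to 1 - x1 <= 0 and x1 - x2 = 0
f0 = lambda x: float(x @ x)
ineqs = [lambda x: 1.0 - x[0]]
eqs = [lambda x: x[0] - x[1]]

print(exact_penalty(f0, ineqs, eqs, np.array([0.0, 0.0]), lam=2.0))  # infeasible point
print(exact_penalty(f0, ineqs, eqs, np.array([1.0, 1.0]), lam=2.0))  # feasible point
```

At a feasible point the violation term vanishes, so the penalty function equals the objective there.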
[Figure: two-link diagram, labeled $l_1, m_1, \theta_1, \tau_1$ and $l_2, m_2, \theta_2, \tau_2$]
• $\theta_0 = \theta_1 = \theta_{\rm init}$; $\theta_N = \theta_{N+1} = \theta_{\rm final}$
• $m_1 = 1$, $m_2 = 5$, $l_1 = 1$, $l_2 = 1$
• $N = 40$, $T = 10$
• $\tau_{\max} = 1.1$
• $\lambda = 2$
[Figure: $\phi(x^{(k)})$ versus $k$]
[Figure: two plots versus $k$, one on a linear scale and one on a log scale; vertical-axis labels not recoverable]
[Figure: $\hat\delta$ (dotted) and $\delta$ (solid) versus $k$; $\rho^{(k)}$ ($\circ$) versus $k$]
[Figure: $\theta_1$, $\tau_1$, $\theta_2$, $\tau_2$ versus $t$, for $0 \le t \le 10$]
f (x) = h(c(x))
[Figure: $f(x^{(k)}) - f(x^\star)$ versus $k$, log scale, $k$ up to 100]
[Figure: $f(x^{(k)}) - f(x^\star)$ versus $k$, log scale, $k = 1, \ldots, 7$]
[Figure: $f(x^{(k)}) - f(x^\star)$ versus $k$, log scale, $k = 1, \ldots, 10$]
[Figure: $f(x^{(k)}) - f(x^\star)$ versus $k$, log scale, $k = 1, \ldots, 5$]
• express problem as
\[
f(\Sigma) = \log\det\Sigma + \mathbf{Tr}(\Sigma^{-1} Y), \qquad Y = (1/N) \sum_{i=1}^N y_i y_i^T
\]
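As an illustration, the objective above can be evaluated numerically as in the sketch below. The helper names `sample_second_moment` and `f_cov` are hypothetical, and the data are random placeholders rather than anything from the slides.

```python
import numpy as np

def sample_second_moment(ys):
    """Y = (1/N) sum_i y_i y_i^T, with the samples y_i stored as rows of ys."""
    ys = np.asarray(ys, dtype=float)
    return ys.T @ ys / ys.shape[0]

def f_cov(Sigma, Y):
    """f(Sigma) = log det Sigma + Tr(Sigma^{-1} Y) for positive definite Sigma."""
    sign, logdet = np.linalg.slogdet(Sigma)
    if sign <= 0:
        raise ValueError("Sigma must be positive definite")
    # solve(Sigma, Y) computes Sigma^{-1} Y without forming the inverse explicitly
    return logdet + np.trace(np.linalg.solve(Sigma, Y))

# Placeholder data: N = 200 samples in R^3
rng = np.random.default_rng(0)
Y = sample_second_moment(rng.standard_normal((200, 3)))
print(f_cov(Y, Y), f_cov(2.0 * Y, Y))
```

A quick sanity check on any implementation: at $\Sigma = Y$ the trace term equals the dimension $n$, so $f(Y) = \log\det Y + n$.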
[Figure: plot of $f(\Sigma)$]
• NMF problem:
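\[
\begin{array}{ll}
\mbox{minimize} & \|A - XY\|_F \\
\mbox{subject to} & X_{ij}, \; Y_{ij} \ge 0
\end{array}
\]
variables $X \in \mathbf{R}^{m \times k}$, $Y \in \mathbf{R}^{k \times n}$, data $A \in \mathbf{R}^{m \times n}$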
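The problem is convex in $X$ for fixed $Y$ and convex in $Y$ for fixed $X$, which suggests alternating between the two blocks. The sketch below is one simple heuristic of that kind, assuming a single projected-gradient step per block with step size $1/L$; it is not necessarily the method used to produce the results in these slides. The name `nmf_alternating` and its parameters are hypothetical.

```python
import numpy as np

def nmf_alternating(A, k, iters=200, seed=0):
    """Heuristic for minimize ||A - XY||_F subject to X, Y >= 0 (elementwise)."""
    rng = np.random.default_rng(seed)
    m, n = A.shape
    X = rng.uniform(size=(m, k))
    Y = rng.uniform(size=(k, n))
    history = [np.linalg.norm(A - X @ Y)]           # Frobenius norm of the residual
    for _ in range(iters):
        # Y-step: X fixed; gradient of (1/2)||XY - A||_F^2 in Y is X^T (XY - A)
        L = np.linalg.norm(X.T @ X, 2) + 1e-12      # Lipschitz constant of that gradient
        Y = np.maximum(Y - (X.T @ (X @ Y - A)) / L, 0.0)
        # X-step: Y fixed; gradient in X is (XY - A) Y^T
        L = np.linalg.norm(Y @ Y.T, 2) + 1e-12
        X = np.maximum(X - ((X @ Y - A) @ Y.T) / L, 0.0)
        history.append(np.linalg.norm(A - X @ Y))
    return X, Y, history

# Placeholder data with an exact nonnegative rank-3 factorization
rng = np.random.default_rng(1)
A = rng.uniform(size=(8, 3)) @ rng.uniform(size=(3, 6))
X, Y, history = nmf_alternating(A, k=3)
print(history[0], history[-1])
```

Each step is a projected-gradient step on a convex subproblem with step size $1/L$, so the residual does not increase from one iteration to the next. As with SCP in general, the result depends on the starting point and need not be globally optimal.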
[Figure: $\|A - XY\|_F$ versus $k$]