Sparse Optimization Lecture: Basic Sparse Optimization Models
Basis pursuit
$\min_x \{\|x\|_1 : Ax = b\}$
[Figure: the $\ell_1$ ball]
find the least $\ell_1$-norm point on the affine plane $\{x : Ax = b\}$
tends to return a sparse point (sometimes, the sparsest)
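To see why the least $\ell_1$-norm point tends to be sparse, the sketch below compares it with the least $\ell_2$-norm point. It assumes a single equation $a^T x = b$ with $a \neq 0$, chosen here only because both solutions then have closed forms. The function names are hypothetical.

```python
def least_l1_point(a, b):
    """Least l1-norm solution of a^T x = b for a single equation.

    With one constraint, the minimum is attained at a vertex of a scaled
    l1 ball: all weight goes on a coordinate with the largest |a_j|.
    """
    j = max(range(len(a)), key=lambda i: abs(a[i]))
    x = [0.0] * len(a)
    x[j] = b / a[j]
    return x


def least_l2_point(a, b):
    """Least l2-norm solution of a^T x = b: x = a * b / ||a||_2^2."""
    nrm2 = sum(ai * ai for ai in a)
    return [ai * b / nrm2 for ai in a]


def l1_norm(x):
    return sum(abs(v) for v in x)


a, b = [1.0, 3.0, 2.0], 6.0
x_l1 = least_l1_point(a, b)   # one nonzero entry
x_l2 = least_l2_point(a, b)   # every entry nonzero
print(x_l1, l1_norm(x_l1))
print(x_l2, l1_norm(x_l2))
```

Both points satisfy the constraint. The $\ell_1$ point has a single nonzero entry, while the $\ell_2$ point spreads the weight over all coordinates.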
$\min_x \|x\|_1 + \frac{\mu}{2}\|Ax - b\|_2^2,$
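For intuition about the penalized model, consider the scalar case $A = 1$, an assumption made here only so that a closed form exists. The minimizer of $|x| + \frac{\mu}{2}(x - b)^2$ is then the soft-thresholding (shrinkage) of $b$ with threshold $1/\mu$. Here $\mu$ is the name assumed for the penalty weight, and `shrink` is a hypothetical helper.

```python
def shrink(b, mu):
    """Minimizer of |x| + (mu/2) * (x - b)**2 over scalar x."""
    thresh = 1.0 / mu
    if b > thresh:
        return b - thresh
    if b < -thresh:
        return b + thresh
    return 0.0          # small |b| is set exactly to zero


def objective(x, b, mu):
    return abs(x) + 0.5 * mu * (x - b) ** 2


for b in [3.0, 0.3, -3.0]:
    print(b, shrink(b, mu=2.0))
```

Small inputs are mapped exactly to zero, which is the mechanism that produces sparsity. Larger $\mu$ puts more weight on fitting $b$ and shrinks less.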
min k xk1 +
x
kAx bk22 ,
2
min{k xk1 : Ax = b}
x
(5)
[Plots: sorted coefficient magnitudes, three panels]
Figure: the DCT and wavelet coefficients are scaled for better visibility.
Questions
generality: use a test data set, then scale parameters for real data
cross-validation: reserve a subset of the data to test the solution
Joint/group sparsity
(6)
where
$\|X\|_{2,1} := \sum_{i=1}^{m} \|x_{i,:}\|_2$, with $x_{i,:}$ the $i$-th row of $X$.
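A minimal sketch of the $\ell_{2,1}$ norm, assuming the usual convention that it sums the Euclidean norms of the rows of $X$. The function name and the list-of-rows representation are choices made here for illustration.

```python
import math


def norm_2_1(X):
    """Sum of the Euclidean norms of the rows of X (X is a list of rows)."""
    return sum(math.sqrt(sum(v * v for v in row)) for row in X)


# Two nonzero rows out of four: the columns share the same support.
X = [[3.0, 4.0],
     [0.0, 0.0],
     [0.0, 0.0],
     [5.0, 12.0]]
print(norm_2_1(X))
```

Penalizing this quantity drives entire rows to zero, so the columns of $X$ end up with a common sparsity pattern.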
Joint/group sparsity
Decompose $\{1, \ldots, n\} = G_1 \cup G_2 \cup \cdots \cup G_S$.
non-overlapping groups: $G_i \cap G_j = \emptyset$, $\forall i \neq j$.
otherwise, groups may overlap (modeling many interesting structures).
(7)
where
$\|x\|_{G,2,1} = \sum_{s=1}^{S} w_s \|x_{G_s}\|_2.$
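The weighted group norm can be sketched as follows. Groups are given as lists of 0-based indices and the weights default to 1; both conventions are assumptions made here, and `group_norm` is a hypothetical name. The same code works when groups overlap.

```python
import math


def group_norm(x, groups, weights=None):
    """Weighted sum over groups of the Euclidean norm of x restricted to each group."""
    if weights is None:
        weights = [1.0] * len(groups)
    total = 0.0
    for w, G in zip(weights, groups):
        total += w * math.sqrt(sum(x[i] ** 2 for i in G))
    return total


x = [3.0, 4.0, 0.0, 0.0, 1.0]
groups = [[0, 1], [2, 3], [4]]                 # non-overlapping groups
print(group_norm(x, groups))
print(group_norm(x, groups, weights=[2.0, 1.0, 3.0]))
print(group_norm(x, [[0, 1], [1, 4]]))         # overlapping groups share index 1
```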
Auxiliary constraints
(8)
Therefore,
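$\min_x \{\|x\|_1 : Ax = b\}$ reduces to a linear program (LP)
$\min_x \|x\|_1 + \frac{\mu}{2}\|Ax - b\|_2^2$ reduces to a quadratic program (QP)
$\min_x \{\|Ax - b\|_2 : \|x\|_1 \le \tau\}$ reduces to a bound-constrained QP
$\min_x \{\|x\|_1 : \|Ax - b\|_2 \le \sigma\}$ reduces to a second-order cone program (SOCP)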
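One standard way to see the first reduction is to split $x$ into its positive and negative parts. The variables $u$ and $v$ are introduced here for illustration and are not part of the original slides.

```latex
\min_{x}\ \{\|x\|_1 : Ax = b\}
\;=\;
\min_{u,\,v}\ \{\mathbf{1}^T u + \mathbf{1}^T v \;:\; A(u - v) = b,\ u \ge 0,\ v \ge 0\},
\qquad x = u - v .
```

At an optimal pair, $u_i v_i = 0$ for every $i$ (otherwise both entries could be reduced by the same amount), so $\mathbf{1}^T u + \mathbf{1}^T v = \|x\|_1$ and the right-hand side is an LP in standard form.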
Conic programming
Basic form:
$\min_x \{c^T x : Fx + g \succeq_K 0,\ Ax = b\}$
Example:
$\left\{ (x, y, z) : \begin{bmatrix} x & y \\ y & z \end{bmatrix} \in S^2_+ \right\}$
[Figure: plot of this set]
Linear program
Model
$\min_x \{c^T x : Ax = b,\ x \succeq_K 0\}$
where $K$ is the nonnegative cone (first orthant).
$x \succeq_K 0 \iff x \ge 0.$
Algorithms
the Simplex method (move between vertices)
interior-point methods (IPMs) (move inside the polyhedron)
decomposition approaches (divide and conquer)
log-barrier formulation:
$\min_x \{c^T x - (1/t) \sum_i \log(x_i) : Ax = b\}$
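Second-order cone program
Model
$\min_x \{c^T x : Ax = b,\ x \succeq_K 0\}$
where $K = K_1 \times \cdots \times K_K$; each $K_k$ is the second-order cone
$K_k = \left\{ y \in \mathbb{R}^{n_k} : y_{n_k} \ge \sqrt{y_1^2 + \cdots + y_{n_k - 1}^2} \right\}.$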
IPM is the standard solver (though other options also exist)
Log-barrier of $K_k$:
$\phi(y) = -\log\left( y_{n_k}^2 - (y_1^2 + \cdots + y_{n_k - 1}^2) \right)$
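A small sketch of the second-order cone membership test and its log-barrier, $-\log(y_{n}^2 - (y_1^2 + \cdots + y_{n-1}^2))$, with the last entry of `y` playing the role of $y_{n_k}$. The function names are hypothetical, and returning infinity outside the interior is a convention chosen here.

```python
import math


def in_second_order_cone(y):
    """True if the last entry dominates the Euclidean norm of the others."""
    return y[-1] >= math.sqrt(sum(v * v for v in y[:-1]))


def soc_log_barrier(y):
    """Log-barrier of the cone; +inf on the boundary and outside the cone."""
    gap = y[-1] ** 2 - sum(v * v for v in y[:-1])
    if y[-1] <= 0 or gap <= 0:
        return math.inf
    return -math.log(gap)


print(in_second_order_cone([3.0, 4.0, 5.0]))   # boundary point, still in the cone
print(soc_log_barrier([3.0, 4.0, 6.0]))        # well inside the cone
print(soc_log_barrier([3.0, 4.0, 5.001]))      # near the boundary: much larger
```

The barrier stays finite in the interior and blows up as a point approaches the boundary. This is what keeps interior-point iterates strictly inside the cone.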
Semi-definite program
Model
$\min_X \{C \bullet X : \mathcal{A}(X) = b,\ X \succeq_K 0\}$
where $K = S^n_+$ is the cone of positive semidefinite matrices.
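$\nabla\psi(y) \succeq_{K^*} 0$,
nonnegative orthant $\mathbb{R}^n_+$: $\psi(y) = \sum_{i=1}^{n} \log y_i$, $y^T \nabla\psi(y) = n$
positive semidefinite cone: $\psi(Y) = \log\det Y$, $\operatorname{tr}(Y \nabla\psi(Y)) = n$
second-order cone: $\psi(y) = \log(y_{n+1}^2 - y_1^2 - \cdots - y_n^2)$,
$\nabla\psi(y) = \frac{2}{y_{n+1}^2 - y_1^2 - \cdots - y_n^2} \begin{bmatrix} -y_1 \\ \vdots \\ -y_n \\ y_{n+1} \end{bmatrix}$, $y^T \nabla\psi(y) = 2$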
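The identities $y^T \nabla\psi(y) = n$ for the orthant and $y^T \nabla\psi(y) = 2$ for the second-order cone can be checked numerically. The sketch assumes $\psi(y) = \sum_i \log y_i$ and $\psi(y) = \log(y_{n+1}^2 - y_1^2 - \cdots - y_n^2)$ respectively. The helper names are hypothetical.

```python
def grad_log_orthant(y):
    """Gradient of psi(y) = sum_i log(y_i) on the positive orthant."""
    return [1.0 / v for v in y]


def grad_log_soc(y):
    """Gradient of psi(y) = log(y_{n+1}^2 - y_1^2 - ... - y_n^2)."""
    d = y[-1] ** 2 - sum(v * v for v in y[:-1])
    return [-2.0 * v / d for v in y[:-1]] + [2.0 * y[-1] / d]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


y_pos = [0.5, 2.0, 7.0]          # strictly positive
y_soc = [1.0, -2.0, 3.0]         # 3 > sqrt(1 + 4), so y is interior
print(dot(y_pos, grad_log_orthant(y_pos)))   # should equal len(y_pos)
print(dot(y_soc, grad_log_soc(y_soc)))       # should equal 2
```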
Central path
for $t > 0$, define $x^\star(t)$ as the solution of
minimize $t f_0(x) + \phi(x)$
subject to $Ax = b$
(for now, assume $x^\star(t)$ exists and is unique for each $t > 0$)
central path is $\{x^\star(t) \mid t > 0\}$
example: central path for an LP
minimize $c^T x$
subject to $a_i^T x \le b_i$, $i = 1, \ldots, 6$
[Figure: the central path, with $x^\star$ and $x^\star(10)$ marked]
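As a toy illustration of a central path (a one-dimensional LP chosen here, not the six-constraint example from the figure), take: minimize $x$ subject to $-x \le 0$ and $x \le 1$. With the barrier $\phi(x) = -\log x - \log(1 - x)$, the stationarity condition $t - 1/x + 1/(1 - x) = 0$ reduces to $t x^2 - (t + 2) x + 1 = 0$, whose root in $(0, 1)$ is $x^\star(t)$.

```python
import math


def central_point(t):
    """x*(t) for: minimize x subject to -x <= 0, x <= 1 (optimal value 0 at x = 0)."""
    # Smaller root of t*x^2 - (t + 2)*x + 1 = 0; the discriminant is t^2 + 4.
    return ((t + 2.0) - math.sqrt(t * t + 4.0)) / (2.0 * t)


def stationarity_residual(x, t):
    """Derivative of t*x + phi(x); it vanishes on the central path."""
    return t - 1.0 / x + 1.0 / (1.0 - x)


for t in [1.0, 10.0, 100.0, 1000.0]:
    x = central_point(t)
    print(t, x, stationarity_residual(x, t))
```

As $t$ grows, $x^\star(t)$ moves toward the solution $x^\star = 0$, and the objective gap stays below $m/t$ with $m = 2$ constraints.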
Log-barrier formulation:
$\min_x \{t f_0(x) + \phi(x) : Ax = b\}$
Complexity of log-barrier interior-point method:
$k \ge \left\lceil \frac{\log\left( (\sum_i \theta_i) / (\varepsilon t^{(0)}) \right)}{\log \mu} \right\rceil$
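The iteration bound can be evaluated directly. The sketch reads it as the standard barrier-method count: the ceiling of $\log((\sum_i \theta_i)/(\varepsilon t^{(0)}))$ divided by $\log \mu$. It assumes $\varepsilon$ is the target accuracy, $t^{(0)}$ the initial barrier parameter, and $\mu > 1$ the factor by which $t$ is multiplied at each outer iteration. The function and argument names are hypothetical.

```python
import math


def outer_iterations(theta_sum, eps, t0, mu):
    """Number of outer (centering) iterations needed to reach accuracy eps."""
    return math.ceil(math.log(theta_sum / (eps * t0)) / math.log(mu))


# Total barrier degree 100, target accuracy 1e-6, starting from t0 = 1.
for mu in [2.0, 20.0, 200.0]:
    print(mu, outer_iterations(100.0, 1e-6, 1.0, mu))
```

The count grows only logarithmically with the required accuracy. A larger $\mu$ means fewer outer iterations, but each centering step typically needs more Newton iterations.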
If we can model
$\min_X \{\|X\|_* : \mathcal{A}(X) = b\}$ (9)
$\frac{1}{2}\|\mathcal{A}(X) - b\|_F^2$
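observe
$y, z \ge 0$ and $\sqrt{yz} \ge |x| \iff \begin{bmatrix} y & x \\ x & z \end{bmatrix} \succeq 0.$
So,
$\begin{bmatrix} y & x \\ x & z \end{bmatrix} \succeq 0 \implies \frac{1}{2}(y + z) \ge |x|.$
we attain $\frac{1}{2}(y + z) = |x|$ if $y = z = |x|$.
$|x| = \min_M \left\{ \frac{1}{2}\operatorname{tr}(M) : \begin{bmatrix} 1 & 0 \end{bmatrix} M \begin{bmatrix} 0 \\ 1 \end{bmatrix} = x,\ M = M^T,\ M \succeq 0 \right\}.$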
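The bound used in the scalar case follows from the arithmetic-geometric mean inequality. Written out (a short derivation added for completeness):

```latex
\begin{bmatrix} y & x \\ x & z \end{bmatrix} \succeq 0
\;\Longrightarrow\;
y \ge 0,\ z \ge 0,\ yz \ge x^2
\;\Longrightarrow\;
\tfrac{1}{2}(y + z) \;\ge\; \sqrt{yz} \;\ge\; |x| .
```

Both inequalities hold with equality exactly when $y = z = |x|$, which is why the minimum of $\frac{1}{2}(y + z)$ over the feasible set equals $|x|$.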
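$\begin{bmatrix} Y & X \\ X^T & Z \end{bmatrix} \succeq 0$
With the SVD $X = U \Sigma V^T$,
$\begin{bmatrix} U^T & -V^T \end{bmatrix} \begin{bmatrix} Y & X \\ X^T & Z \end{bmatrix} \begin{bmatrix} U \\ -V \end{bmatrix} = U^T Y U + V^T Z V - U^T X V - V^T X^T U$
$= U^T Y U + V^T Z V - 2\Sigma \succeq 0.$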
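Taking traces turns this matrix inequality into a bound on $\operatorname{tr}(Y) + \operatorname{tr}(Z)$. The sketch below assumes $X = U \Sigma V^T$ is a thin SVD, so $U$ and $V$ have orthonormal columns and $\operatorname{tr}(\Sigma) = \|X\|_*$. The second line, which exhibits a feasible pair attaining the bound, is a step added here for completeness.

```latex
\begin{aligned}
\operatorname{tr}(Y) + \operatorname{tr}(Z)
  &\ge \operatorname{tr}(U^T Y U) + \operatorname{tr}(V^T Z V)
  \ge 2\operatorname{tr}(\Sigma) = 2\|X\|_*, \\
\begin{bmatrix} Y & X \\ X^T & Z \end{bmatrix}
  &= \begin{bmatrix} U \\ V \end{bmatrix} \Sigma
     \begin{bmatrix} U \\ V \end{bmatrix}^T \succeq 0
  \quad \text{for } Y = U \Sigma U^T,\ Z = V \Sigma V^T .
\end{aligned}
```

The first inequality uses $Y, Z \succeq 0$ together with $U U^T \preceq I$ and $V V^T \preceq I$. For the pair in the second line, $\operatorname{tr}(Y) = \operatorname{tr}(Z) = \operatorname{tr}(\Sigma)$, so the bound is attained.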
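Therefore,
$\|X\|_* = \min_{Y,Z} \left\{ \frac{1}{2}(\operatorname{tr}(Y) + \operatorname{tr}(Z)) : \begin{bmatrix} Y & X \\ X^T & Z \end{bmatrix} \succeq 0 \right\}$ (10)
$= \min_M \left\{ \frac{1}{2}\operatorname{tr}(M) : \begin{bmatrix} I & 0 \end{bmatrix} M \begin{bmatrix} 0 \\ I \end{bmatrix} = X,\ M = M^T,\ M \succeq 0 \right\}.$ (11)
$\frac{1}{2}\|\mathcal{A}(X) - b\|_F$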
off-the-shelf) solvers
Yet, the most reliable solvers cannot handle large-scale problems (e.g.,
constraints. Even worse, the constraint coefficients are dense. Result: Out
of memory.
matrix involving A.
Even large and dense matrices can be handled, for sparse optimization,
The simplex method, active-set methods, and IPMs have reliable solvers, which makes them good benchmarks.
They have nice interfaces (including CVX and YALMIP), which save you time.
CVX and YALMIP are not solvers; they translate problems and then call
solvers; see http://goo.gl/zUlMK and http://goo.gl/1u0xP.
They can return highly accurate solutions; some first-order algorithms
Low-rank factorizations:
S. Burer and R. D. C. Monteiro, A nonlinear programming algorithm for solving
semidefinite programs via low-rank factorization, Math. Program., 95:329–357, 2003.
LMaFit, http://lmafit.blogs.rice.edu/
Matrix-free IPMs:
K. Fountoulakis, J. Gondzio, and P. Zhlobich, Matrix-free interior point method for compressed
sensing problems, 2012. http://www.maths.ed.ac.uk/~gondzio/reports/mfCS.html
Subgradient methods
$\frac{\mu}{2}\|Ax - b\|_2^2.$
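A minimal subgradient-method sketch for the penalized objective $\|x\|_1 + \frac{\mu}{2}\|Ax - b\|_2^2$. The symbol $\mu$, the diminishing step size $c/\sqrt{k}$, the iteration count, and all function names are assumptions made here. A subgradient is $\operatorname{sign}(x) + \mu A^T(Ax - b)$, taking $\operatorname{sign}(0) = 0$. Because the objective need not decrease at every step, the best iterate is tracked.

```python
def matvec(A, x):
    return [sum(a * v for a, v in zip(row, x)) for row in A]


def objective(A, b, x, mu):
    r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
    return sum(abs(v) for v in x) + 0.5 * mu * sum(v * v for v in r)


def sign(v):
    return (v > 0) - (v < 0)


def subgradient_method(A, b, mu, steps=2000, c=0.05):
    m, n = len(A), len(A[0])
    x = [0.0] * n
    best_x, best_f = x[:], objective(A, b, x, mu)
    for k in range(1, steps + 1):
        r = [ri - bi for ri, bi in zip(matvec(A, x), b)]
        # Gradient of the smooth term: mu * A^T (A x - b).
        g_smooth = [mu * sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        g = [sign(xj) + gj for xj, gj in zip(x, g_smooth)]
        step = c / k ** 0.5                      # diminishing step size
        x = [xj - step * gj for xj, gj in zip(x, g)]
        f = objective(A, b, x, mu)
        if f < best_f:                           # keep the best point seen so far
            best_x, best_f = x[:], f
    return best_x, best_f


A_demo = [[1.0, 0.0, 1.0],
          [0.0, 1.0, 1.0]]
b_demo = [1.0, 1.0]
x_best, f_best = subgradient_method(A_demo, b_demo, mu=10.0)
print(x_best, f_best)
```

The iterates are generally not exactly sparse: entries that should be zero oscillate around zero with amplitude proportional to the step size.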