
Canonical Problem Forms

Ryan Tibshirani
Convex Optimization 10-725
Last time: optimization basics

• Optimization terminology (e.g., criterion, constraints, feasible points, solutions)
• Properties and first-order optimality
• Equivalent transformations (e.g., partial optimization, change of variables, eliminating equality constraints)
Outline

Today:
• Linear programs
• Quadratic programs
• Semidefinite programs
• Cone programs

Linear program
A linear program or LP is an optimization problem of the form

    min_x   c^T x
    subject to   Dx ≤ d
                 Ax = b

Observe that this is always a convex optimization problem

• First introduced by Kantorovich in the late 1930s and Dantzig in the 1940s
• Dantzig's simplex algorithm gives a direct (noniterative) solver for LPs (later in the course we'll see interior point methods)
• Fundamental problem in convex optimization. Many diverse applications, rich history
Example: diet problem

Find the cheapest combination of foods that satisfies some nutritional requirements (useful for graduate students!)

    min_x   c^T x
    subject to   Dx ≥ d
                 x ≥ 0

Interpretation:
• c_j : per-unit cost of food j
• d_i : minimum required intake of nutrient i
• D_ij : content of nutrient i per unit of food j
• x_j : units of food j in the diet
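As a quick sketch, the diet LP can be handed directly to an off-the-shelf LP solver. The numbers below are made up for illustration; `scipy.optimize.linprog` minimizes c^T x subject to A_ub x ≤ b_ub, so the nutrient constraints Dx ≥ d are passed as −Dx ≤ −d:

```python
# Toy diet problem as an LP (illustrative numbers, not from the slides).
import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.5, 1.0])            # per-unit cost of 3 foods
D = np.array([[70, 120, 40],             # content of nutrient i per unit of food j
              [10, 30, 50]])
d = np.array([300.0, 90.0])              # minimum required intake per nutrient

# Dx >= d becomes -Dx <= -d in linprog's convention; x >= 0 via bounds
res = linprog(c, A_ub=-D, b_ub=-d, bounds=[(0, None)] * 3)
assert res.success
print("diet:", np.round(res.x, 3), "cost:", round(res.fun, 3))
assert np.all(D @ res.x >= d - 1e-8)     # nutritional requirements met
```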
Example: transportation problem
Ship commodities from given sources to destinations at min cost
    min_x   Σ_{i=1}^m Σ_{j=1}^n c_ij x_ij
    subject to   Σ_{j=1}^n x_ij ≤ s_i ,  i = 1, ..., m
                 Σ_{i=1}^m x_ij ≥ d_j ,  j = 1, ..., n
                 x ≥ 0

Interpretation:
• s_i : supply at source i
• d_j : demand at destination j
• c_ij : per-unit shipping cost from i to j
• x_ij : units shipped from i to j
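The same pattern works here once the m×n variables x_ij are flattened into a single vector. A sketch with made-up supplies, demands, and costs:

```python
# Toy transportation LP (illustrative numbers). The m*n variables x_ij are
# flattened row-major, so x_ij lives at index i*n + j.
import numpy as np
from scipy.optimize import linprog

m, n = 2, 3                               # sources, destinations
c = np.array([[4., 6., 9.],
              [5., 3., 7.]])              # per-unit shipping costs c_ij
s = np.array([50., 40.])                  # supplies s_i
d = np.array([20., 30., 25.])             # demands d_j

# Supply rows: sum_j x_ij <= s_i; demand rows: -sum_i x_ij <= -d_j
A_ub = np.zeros((m + n, m * n))
for i in range(m):
    A_ub[i, i * n:(i + 1) * n] = 1.0
for j in range(n):
    A_ub[m + j, j::n] = -1.0
b_ub = np.concatenate([s, -d])

res = linprog(c.ravel(), A_ub=A_ub, b_ub=b_ub, bounds=[(0, None)] * (m * n))
assert res.success
x = res.x.reshape(m, n)
assert np.all(x.sum(axis=1) <= s + 1e-8) and np.all(x.sum(axis=0) >= d - 1e-8)
```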
Example: basis pursuit
Given y ∈ R^n and X ∈ R^{n×p}, where p > n. Suppose that we seek the sparsest solution to the underdetermined linear system Xβ = y

Nonconvex formulation:

    min_β   ||β||_0
    subject to   Xβ = y

where recall ||β||_0 = Σ_{j=1}^p 1{β_j ≠ 0}, the ℓ_0 "norm"

The ℓ_1 approximation, often called basis pursuit:

    min_β   ||β||_1
    subject to   Xβ = y
Basis pursuit is a linear program. Reformulation:

    min_β   ||β||_1          ⟺      min_{β,z}   1^T z
    subject to   Xβ = y              subject to   z ≥ β
                                                  z ≥ −β
                                                  Xβ = y

(Check that this makes sense to you)
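The reformulation above is concrete enough to solve directly. A sketch on random toy data: stack the variables as v = (β, z) ∈ R^{2p}, minimize 1^T z subject to β − z ≤ 0, −β − z ≤ 0, and Xβ = y:

```python
# Basis pursuit via its LP reformulation (random toy data).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
n, p = 5, 12
X = rng.standard_normal((n, p))
beta_true = np.zeros(p); beta_true[[1, 7]] = [2.0, -1.5]   # a sparse signal
y = X @ beta_true

I = np.eye(p)
cost = np.concatenate([np.zeros(p), np.ones(p)])           # objective 1^T z
A_ub = np.block([[I, -I], [-I, -I]])                       # z >= beta, z >= -beta
A_eq = np.hstack([X, np.zeros((n, p))])                    # X beta = y
res = linprog(cost, A_ub=A_ub, b_ub=np.zeros(2 * p),
              A_eq=A_eq, b_eq=y, bounds=[(None, None)] * (2 * p))
assert res.success
beta = res.x[:p]
assert np.allclose(X @ beta, y, atol=1e-6)                 # feasible solution
```

Note the explicit `bounds=(None, None)`: β must be a free variable, while linprog's default would clamp it to be nonnegative.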
Example: Dantzig selector

Modification of the previous problem, where we allow Xβ ≈ y (we don't require exact equality), the Dantzig selector:¹

    min_β   ||β||_1
    subject to   ||X^T(y − Xβ)||_∞ ≤ λ

Here λ ≥ 0 is a tuning parameter

Again, this can be reformulated as a linear program (check this!)

¹ Candès and Tao (2007), "The Dantzig selector: statistical estimation when p is much larger than n"
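One way to carry out that check: with v = (β, z), minimize 1^T z subject to z ≥ ±β and −λ1 ≤ X^T(y − Xβ) ≤ λ1, which is linear in (β, z). A sketch on made-up data:

```python
# Dantzig selector via an LP reformulation (toy data).
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
n, p, lam = 30, 10, 0.5
X = rng.standard_normal((n, p))
beta_true = np.zeros(p); beta_true[:2] = [1.0, -2.0]
y = X @ beta_true + 0.1 * rng.standard_normal(n)

I, G, g = np.eye(p), X.T @ X, X.T @ y
cost = np.concatenate([np.zeros(p), np.ones(p)])    # objective 1^T z
A_ub = np.block([[I, -I], [-I, -I],                 # z >= beta, z >= -beta
                 [G, np.zeros((p, p))],             #  X^T X beta <= X^T y + lam
                 [-G, np.zeros((p, p))]])           # -X^T X beta <= -X^T y + lam
b_ub = np.concatenate([np.zeros(2 * p), g + lam, -g + lam])
res = linprog(cost, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (2 * p))
assert res.success
beta = res.x[:p]
assert np.max(np.abs(X.T @ (y - X @ beta))) <= lam + 1e-6   # constraint holds
```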
Standard form

A linear program is said to be in standard form when it is written as

    min_x   c^T x
    subject to   Ax = b
                 x ≥ 0

Any linear program can be rewritten in standard form (check this!)
Convex quadratic program

A convex quadratic program or QP is an optimization problem of the form

    min_x   c^T x + (1/2) x^T Q x
    subject to   Dx ≤ d
                 Ax = b

where Q ⪰ 0, i.e., positive semidefinite

Note that this problem is not convex when Q ⋡ 0

From now on, when we say quadratic program or QP, we implicitly assume that Q ⪰ 0 (so the problem is convex)
Example: portfolio optimization

Construct a financial portfolio, trading off performance and risk:

    max_x   μ^T x − (γ/2) x^T Q x
    subject to   1^T x = 1
                 x ≥ 0

Interpretation:
• μ : expected assets' returns
• Q : covariance matrix of assets' returns
• γ : risk aversion
• x : portfolio holdings (percentages)
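A small numerical sketch, with made-up returns and covariance (a dedicated QP solver would be the usual choice; here `scipy`'s general-purpose SLSQP suffices, maximizing by minimizing the negated objective):

```python
# Toy portfolio QP (illustrative data) solved with scipy's SLSQP.
import numpy as np
from scipy.optimize import minimize

mu = np.array([0.08, 0.12, 0.10])                 # expected returns
Q = np.array([[0.10, 0.02, 0.01],
              [0.02, 0.20, 0.03],
              [0.01, 0.03, 0.15]])                # return covariance (PSD)
gamma = 4.0                                       # risk aversion

obj = lambda x: -(mu @ x - 0.5 * gamma * x @ Q @ x)   # negate to maximize
res = minimize(obj, x0=np.ones(3) / 3, method="SLSQP",
               bounds=[(0, None)] * 3,                 # x >= 0
               constraints={"type": "eq",              # 1^T x = 1
                            "fun": lambda x: x.sum() - 1})
assert res.success
x = res.x
assert abs(x.sum() - 1) < 1e-6 and np.all(x >= -1e-8)  # feasible portfolio
```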
Example: support vector machines

Given y ∈ {−1, 1}^n and X ∈ R^{n×p} having rows x_1, ..., x_n, recall the support vector machine or SVM problem:

    min_{β, β_0, ξ}   (1/2)||β||_2^2 + C Σ_{i=1}^n ξ_i
    subject to   ξ_i ≥ 0,  i = 1, ..., n
                 y_i (x_i^T β + β_0) ≥ 1 − ξ_i ,  i = 1, ..., n

This is a quadratic program
Example: lasso

Given y ∈ R^n, X ∈ R^{n×p}, recall the lasso problem:

    min_β   ||y − Xβ||_2^2
    subject to   ||β||_1 ≤ s

Here s ≥ 0 is a tuning parameter. Indeed, this can be reformulated as a quadratic program (check this!)

Alternative parametrization (called Lagrange, or penalized form):

    min_β   (1/2)||y − Xβ||_2^2 + λ||β||_1

Now λ ≥ 0 is a tuning parameter. And again, this can be rewritten as a quadratic program (check this!)
Standard form

A quadratic program is in standard form if it is written as

    min_x   c^T x + (1/2) x^T Q x
    subject to   Ax = b
                 x ≥ 0

Any quadratic program can be rewritten in standard form
Motivation for semidefinite programs
Consider linear programming again:

    min_x   c^T x
    subject to   Dx ≤ d
                 Ax = b

Can generalize by changing ≤ to a different (partial) order. Recall:
• S^n is the space of n × n symmetric matrices
• S^n_+ is the space of positive semidefinite matrices, i.e.,

    S^n_+ = {X ∈ S^n : u^T X u ≥ 0 for all u ∈ R^n}

• S^n_++ is the space of positive definite matrices, i.e.,

    S^n_++ = {X ∈ S^n : u^T X u > 0 for all u ∈ R^n \ {0}}
Facts about S^n, S^n_+, S^n_++

• Basic linear algebra facts, here λ(X) = (λ_1(X), ..., λ_n(X)):

    X ∈ S^n     =⇒  λ(X) ∈ R^n
    X ∈ S^n_+   ⟺  λ(X) ∈ R^n_+
    X ∈ S^n_++  ⟺  λ(X) ∈ R^n_++

• We can define an inner product over S^n: given X, Y ∈ S^n,

    X • Y = tr(XY)

• We can define a partial ordering over S^n: given X, Y ∈ S^n,

    X ⪰ Y  ⟺  X − Y ∈ S^n_+

Note: for x, y ∈ R^n, diag(x) ⪰ diag(y) ⟺ x ≥ y (recall, the latter is interpreted elementwise)
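These characterizations are easy to check numerically, e.g., a symmetric matrix is in S^n_+ exactly when all its eigenvalues are nonnegative, and X ⪰ Y exactly when X − Y is PSD:

```python
# Spot-checking the eigenvalue characterizations and the Loewner order.
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((4, 4))
X = A @ A.T                       # PSD by construction (X = A A^T)
Y = X - 0.1 * np.eye(4)          # X - Y = 0.1 I is PSD, so X >= Y

assert np.all(np.linalg.eigvalsh(X) >= -1e-10)        # X in S^4_+
assert np.all(np.linalg.eigvalsh(X - Y) >= -1e-10)    # X >= Y (Loewner order)

# diag(x) >= diag(y) in the Loewner order iff x >= y elementwise
x, y = np.array([3., 1., 2.]), np.array([1., 0., 2.])
assert np.all(np.linalg.eigvalsh(np.diag(x) - np.diag(y)) >= -1e-10)
assert np.all(x >= y)
```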
Semidefinite program

A semidefinite program or SDP is an optimization problem of the form

    min_x   c^T x
    subject to   x_1 F_1 + ··· + x_n F_n ⪯ F_0
                 Ax = b

Here F_j ∈ S^d, for j = 0, 1, ..., n, and A ∈ R^{m×n}, c ∈ R^n, b ∈ R^m.

Observe that this is always a convex optimization problem

Also, any linear program is a semidefinite program (check this!)
Standard form

A semidefinite program is in standard form if it is written as

    min_X   C • X
    subject to   A_i • X = b_i ,  i = 1, ..., m
                 X ⪰ 0

Any semidefinite program can be written in standard form (for a challenge, check this!)
Example: theta function
Let G = (N, E) be an undirected graph, N = {1, ..., n}, and
• ω(G) : clique number of G
• χ(G) : chromatic number of G

The Lovász theta function:²

    ϑ(G) = max_X   11^T • X
    subject to   I • X = 1
                 X_ij = 0,  (i, j) ∉ E
                 X ⪰ 0

The Lovász sandwich theorem: ω(G) ≤ ϑ(Ḡ) ≤ χ(G), where Ḡ is the complement graph of G

² Lovász (1979), "On the Shannon capacity of a graph"
Example: trace norm minimization
Let A : R^{m×n} → R^p be a linear map,

    A(X) = ( A_1 • X , ... , A_p • X )

for A_1, ..., A_p ∈ R^{m×n} (and where A_i • X = tr(A_i^T X)). Finding the lowest-rank solution to an underdetermined system, nonconvex:

    min_X   rank(X)
    subject to   A(X) = b

Trace norm approximation:

    min_X   ||X||_tr
    subject to   A(X) = b

This is indeed an SDP (but harder to show, requires duality ...)
Conic program

A conic program is an optimization problem of the form:

    min_x   c^T x
    subject to   Ax = b
                 D(x) + d ∈ K

Here:
• c, x ∈ R^n, and A ∈ R^{m×n}, b ∈ R^m
• D : R^n → Y is a linear map, d ∈ Y, for a Euclidean space Y
• K ⊆ Y is a closed convex cone

Both LPs and SDPs are special cases of conic programming. For LPs, K = R^n_+; for SDPs, K = S^n_+
Second-order cone program
A second-order cone program or SOCP is an optimization problem of the form:

    min_x   c^T x
    subject to   ||D_i x + d_i||_2 ≤ e_i^T x + f_i ,  i = 1, ..., p
                 Ax = b

This is indeed a cone program. Why? Recall the second-order cone

    Q = {(x, t) : ||x||_2 ≤ t}

So we have

    ||D_i x + d_i||_2 ≤ e_i^T x + f_i  ⟺  (D_i x + d_i , e_i^T x + f_i) ∈ Q_i

for second-order cones Q_i of appropriate dimensions. Now take

    K = Q_1 × ··· × Q_p
Observe that every LP is an SOCP. Further, every SOCP is an SDP

Why? Turns out that

    ||x||_2 ≤ t  ⟺  [ tI   x ]
                     [ x^T  t ]  ⪰ 0

Hence we can write any SOCP constraint as an SDP constraint

The above is a special case of the Schur complement theorem:

    [ A    B ]
    [ B^T  C ]  ⪰ 0  ⟺  A − B C^{-1} B^T ⪰ 0

for A, C symmetric and C ≻ 0
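The first equivalence is easy to check numerically, by building the block matrix [tI, x; x^T, t] and testing its eigenvalues:

```python
# Checking ||x||_2 <= t  <=>  [[t I, x], [x^T, t]] is PSD, on two examples.
import numpy as np

def arrow(x, t):
    """Build the (n+1) x (n+1) block matrix [[t I, x], [x^T, t]]."""
    n = len(x)
    M = t * np.eye(n + 1)
    M[:n, n] = x
    M[n, :n] = x
    return M

x = np.array([3.0, 4.0])                                      # ||x||_2 = 5
assert np.all(np.linalg.eigvalsh(arrow(x, 5.0)) >= -1e-10)    # ||x|| <= t: PSD
assert np.linalg.eigvalsh(arrow(x, 4.0)).min() < 0            # ||x|| >  t: not PSD
```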
Hey, what about QPs?

Finally, our old friend QPs "sneak" into the hierarchy. Turns out QPs are SOCPs, which we can see by rewriting a QP as

    min_{x,t}   c^T x + t
    subject to   Dx ≤ d,  (1/2) x^T Q x ≤ t
                 Ax = b

Now write (1/2) x^T Q x ≤ t  ⟺  ||( Q^{1/2} x / √2 , (1 − t)/2 )||_2 ≤ (1 + t)/2

Take a breath (phew!). Thus we have established the hierarchy

    LPs ⊆ QPs ⊆ SOCPs ⊆ SDPs ⊆ Conic programs

completing the picture we saw at the start
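The norm identity in that rewriting can be verified by squaring both sides: for any square root with Qh^T Qh = Q, ||(Qh x / √2, (1 − t)/2)||² − ((1 + t)/2)² = (1/2) x^T Q x − t, so the two inequalities hold or fail together. A quick numerical spot-check on random data:

```python
# Spot-checking the QP-to-SOCP norm identity on random (x, t).
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((3, 3))
Q = A @ A.T                                  # a PSD matrix
Qh = np.linalg.cholesky(Q).T                 # square root: Qh^T Qh = Q

for _ in range(100):
    x = rng.standard_normal(3)
    t = 5 * rng.standard_normal()
    v = np.concatenate([Qh @ x / np.sqrt(2), [(1 - t) / 2]])
    # ||v||^2 - ((1+t)/2)^2 should equal (1/2) x^T Q x - t exactly
    assert np.isclose(np.linalg.norm(v) ** 2 - ((1 + t) / 2) ** 2,
                      0.5 * x @ Q @ x - t)
```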
References and further reading

• D. Bertsimas and J. Tsitsiklis (1997), "Introduction to linear optimization," Chapters 1, 2
• S. Boyd and L. Vandenberghe (2004), "Convex optimization," Chapter 4
• A. Nemirovski and A. Ben-Tal (2001), "Lectures on modern convex optimization," Chapters 1–4
