
Advances in Convex Optimization:
Interior-Point Methods, Cone Programming, and Applications

Stephen Boyd
Electrical Engineering Department
Stanford University

(joint work with Lieven Vandenberghe, UCLA)

CDC 02 Las Vegas 12/11/02


Easy and Hard Problems
Least squares (LS)

minimize   ‖Ax − b‖_2^2

A ∈ R^{m×n}, b ∈ R^m are parameters; x ∈ R^n is the variable

• have complete theory (existence & uniqueness, sensitivity analysis, . . . )
• several algorithms compute (global) solution reliably
• can solve dense problems with n = 1000 variables, m = 10000 terms
• by exploiting structure (e.g., sparsity) can solve far larger problems

. . . LS is a (widely used) technology
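
As a rough modern illustration (not part of the original slides), an LS instance at exactly the size quoted above can be solved with NumPy; the random A and b are placeholders:

    import numpy as np

    # dense random instance at the size quoted above: n = 1000 variables, m = 10000 terms
    rng = np.random.default_rng(0)
    m, n = 10000, 1000
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    # global minimizer of ||Ax - b||_2^2 via a reliable factorization-based solver
    x, res, rank, sv = np.linalg.lstsq(A, b, rcond=None)
    print(np.linalg.norm(A @ x - b))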



Linear program (LP)

minimize   c^T x
subject to a_i^T x ≤ b_i,  i = 1, . . . , m

c, a_i ∈ R^n are parameters; x ∈ R^n is the variable

• have nearly complete theory (existence & uniqueness, sensitivity analysis, . . . )
• several algorithms compute (global) solution reliably
• can solve dense problems with n = 1000 variables, m = 10000 constraints
• by exploiting structure (e.g., sparsity) can solve far larger problems

. . . LP is a (widely used) technology
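
For illustration, a hedged SciPy sketch of this standard form (SciPy postdates the talk; the instance is synthetic, with c constructed so the minimum is finite):

    import numpy as np
    from scipy.optimize import linprog

    rng = np.random.default_rng(0)
    m, n = 200, 50
    A = rng.standard_normal((m, n))
    b = A @ rng.standard_normal(n) + 1.0   # ensures a strictly feasible point exists
    c = -A.T @ rng.random(m)               # c = -A^T lambda, lambda >= 0: dual feasible, so bounded

    # bounds=(None, None) lifts linprog's default x >= 0 restriction
    res = linprog(c, A_ub=A, b_ub=b, bounds=(None, None))
    print(res.status, res.fun)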



Quadratic program (QP)

minimize   ‖Fx − g‖_2^2
subject to a_i^T x ≤ b_i,  i = 1, . . . , m

• a combination of LS & LP
• same story . . . QP is a technology
• solution methods reliable enough to be embedded in real-time
control applications with little or no human oversight
• basis of model predictive control
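
A minimal sketch of such a QP in CVXPY, a modern modeling layer not available at the time of this talk; the data F, g, A, b are random placeholders:

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    m, n, p = 30, 20, 40
    F = rng.standard_normal((m, n))
    g = rng.standard_normal(m)
    A = rng.standard_normal((p, n))
    b = rng.random(p) + 0.1               # b > 0, so x = 0 is strictly feasible

    x = cp.Variable(n)
    prob = cp.Problem(cp.Minimize(cp.sum_squares(F @ x - g)), [A @ x <= b])
    prob.solve()                          # handled by a reliable QP/interior-point solver
    print(prob.value)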



The bad news

• LS, LP, and QP are exceptions

• most optimization problems, even some very simple-looking ones, are intractable



Polynomial minimization

minimize   p(x)

p is a polynomial of degree d; x ∈ R^n is the variable

• except for special cases (e.g., d = 2) this is a very difficult problem
• even sparse problems with size n = 20, d = 10 are essentially intractable
• all algorithms known to solve this problem require effort exponential in n



What makes a problem easy or hard?

classical view:

• linear is easy

• nonlinear is hard(er)



What makes a problem easy or hard?

emerging (and correct) view:

. . . the great watershed in optimization isn't between linearity and nonlinearity, but convexity and nonconvexity.

— R. Rockafellar, SIAM Review 1993



Convex optimization

minimize   f_0(x)
subject to f_1(x) ≤ 0, . . . , f_m(x) ≤ 0

x ∈ R^n is the optimization variable; f_i : R^n → R are convex:

f_i(λx + (1 − λ)y) ≤ λf_i(x) + (1 − λ)f_i(y)

for all x, y, and 0 ≤ λ ≤ 1

• includes LS, LP, QP, and many others

• like LS, LP, and QP, convex problems are fundamentally tractable
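
A quick numeric spot-check (not a proof) of the defining inequality, using the log-sum-exp function that appears later in these slides:

    import numpy as np
    from scipy.special import logsumexp

    # f(x) = log(e^{x_1} + ... + e^{x_n}) is convex
    rng = np.random.default_rng(0)
    x, y = rng.standard_normal(5), rng.standard_normal(5)
    for lam in np.linspace(0.0, 1.0, 11):
        lhs = logsumexp(lam * x + (1 - lam) * y)
        rhs = lam * logsumexp(x) + (1 - lam) * logsumexp(y)
        assert lhs <= rhs + 1e-12   # f(lam x + (1-lam) y) <= lam f(x) + (1-lam) f(y)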



Example: Robust LP

minimize   c^T x
subject to Prob(a_i^T x ≤ b_i) ≥ η,  i = 1, . . . , m

coefficient vectors a_i IID, a_i ∼ N(ā_i, Σ_i); η is the required reliability

• for fixed x, a_i^T x is N(ā_i^T x, x^T Σ_i x)
• so for η = 50%, robust LP reduces to LP

minimize   c^T x
subject to ā_i^T x ≤ b_i,  i = 1, . . . , m

and so is easily solved

• what about other values of η, e.g., η = 10%? η = 90%?



Hint

{x | Prob(a_i^T x ≤ b_i) ≥ η, i = 1, . . . , m}

[figure: this set, shown for η = 10%, η = 50%, and η = 90%]



That’s right

robust LP with reliability η = 90% is convex, and very easily solved

robust LP with reliability η = 10% is not convex, and extremely difficult

moral: very difficult and very easy problems can look quite similar
(to the untrained eye)



Convex Analysis and Optimization
Convex analysis & optimization

nice properties of convex optimization problems known since 1960s


• local solutions are global
• duality theory, optimality conditions
• simple solution methods like alternating projections

convex analysis well developed by 1970s (Rockafellar)


• separating & supporting hyperplanes
• subgradient calculus



What’s new (since 1990 or so)

• primal-dual interior-point (IP) methods
  extremely efficient, handle nonlinear large-scale problems,
  polynomial-time complexity results, software implementations

• new standard problem classes
  generalizations of LP, with theory, algorithms, software

• extension to generalized inequalities
  semidefinite and cone programming

. . . convex optimization is becoming a technology



Applications and uses

• lots of applications
control, combinatorial optimization, signal processing,
circuit design, communications, . . .

• robust optimization
robust versions of LP, LS, other problems

• relaxations and randomization
provide bounds, heuristics for solving hard problems



Recent history

• 1984–97: interior-point methods for LP
  – 1984: Karmarkar's interior-point LP method
  – theory: Ye, Renegar, Kojima, Todd, Monteiro, Roos, . . .
  – practice: Wright, Mehrotra, Vanderbei, Shanno, Lustig, . . .
• 1988: Nesterov & Nemirovsky's self-concordance analysis
• 1989–: LMIs and semidefinite programming in control
• 1990–: semidefinite programming in combinatorial optimization
  Alizadeh, Goemans, Williamson, Lovasz & Schrijver, Parrilo, . . .
• 1994: interior-point methods for nonlinear convex problems
  Nesterov & Nemirovsky, Overton, Todd, Ye, Sturm, . . .
• 1997–: robust optimization
  Ben Tal, Nemirovsky, El Ghaoui, . . .



New Standard Convex Problem Classes
Some new standard convex problem classes

• second-order cone program (SOCP)
• geometric program (GP) (and entropy problems)
• semidefinite program (SDP)

for these new problem classes we have

• complete duality theory, similar to LP
• good algorithms, and robust, reliable software
• wide variety of new applications



Second-order cone program

second-order cone program (SOCP) has the form

minimize   c_0^T x
subject to ‖A_i x + b_i‖_2 ≤ c_i^T x + d_i,  i = 1, . . . , m

with variable x ∈ R^n

• includes LP and QP as special cases
• nondifferentiable when A_i x + b_i = 0
• new IP methods can solve (almost) as fast as LPs
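
A CVXPY sketch of this standard form (the random data, and the extra norm ball that keeps the toy instance bounded, are assumptions of this sketch, not part of the slide):

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    n, m, k = 10, 5, 4
    c0 = rng.standard_normal(n)
    A = [rng.standard_normal((k, n)) for _ in range(m)]
    b = [rng.standard_normal(k) for _ in range(m)]
    c = [rng.standard_normal(n) for _ in range(m)]
    d = [np.linalg.norm(b[i]) + 1.0 for i in range(m)]   # x = 0 strictly feasible

    x = cp.Variable(n)
    soc = [cp.norm(A[i] @ x + b[i], 2) <= c[i] @ x + d[i] for i in range(m)]
    prob = cp.Problem(cp.Minimize(c0 @ x), soc + [cp.norm(x, 2) <= 10])
    prob.solve()
    print(prob.value)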



Example: robust linear program

minimize   c^T x
subject to Prob(a_i^T x ≤ b_i) ≥ η,  i = 1, . . . , m

where a_i ∼ N(ā_i, Σ_i)

equivalent to the SOCP

minimize   c^T x
subject to ā_i^T x + Φ^{-1}(η) ‖Σ_i^{1/2} x‖_2 ≤ b_i,  i = 1, . . . , m

where Φ is the (unit) normal CDF

robust LP is an SOCP for η ≥ 0.5 (so that Φ^{-1}(η) ≥ 0)
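
A hedged CVXPY rendering of this SOCP, with Φ^{-1} taken from scipy.stats; the data (ā_i, Σ_i, b, c) are synthetic placeholders chosen to keep the instance feasible and bounded:

    import cvxpy as cp
    import numpy as np
    from scipy.stats import norm

    rng = np.random.default_rng(0)
    n, m, eta = 10, 20, 0.9
    abar = rng.standard_normal((m, n))
    b = abar @ rng.standard_normal(n) + 5.0        # nominal constraints comfortably feasible
    c = -abar.T @ rng.random(m)                    # dual-feasible construction keeps the LP bounded
    Sig_half = [np.sqrt(0.1) * np.eye(n) for _ in range(m)]   # Sigma_i^{1/2}, here isotropic

    x = cp.Variable(n)
    t = norm.ppf(eta)                              # Phi^{-1}(eta) >= 0 since eta >= 0.5
    cons = [abar[i] @ x + t * cp.norm(Sig_half[i] @ x, 2) <= b[i] for i in range(m)]
    prob = cp.Problem(cp.Minimize(c @ x), cons)
    prob.solve()
    print(prob.value)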



Geometric program (GP)
log-sum-exp function:

lse(x) = log(e^{x_1} + · · · + e^{x_n})

. . . a smooth convex approximation of the max function

geometric program:

minimize   lse(A_0 x + b_0)
subject to lse(A_i x + b_i) ≤ 0,  i = 1, . . . , m

A_i ∈ R^{m_i×n}, b_i ∈ R^{m_i}; variable x ∈ R^n
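
In CVXPY this form can be written down directly via log_sum_exp; the two extra box-like lse constraints below are my addition, just to keep the random instance bounded:

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    n, m = 8, 6
    A0, b0 = rng.standard_normal((4, n)), rng.standard_normal(4)
    A = [rng.standard_normal((3, n)) for _ in range(m)]
    b = [rng.standard_normal(3) - 4.0 for _ in range(m)]   # lse(b_i) < 0, so x = 0 is feasible

    x = cp.Variable(n)
    cons = [cp.log_sum_exp(A[i] @ x + b[i]) <= 0 for i in range(m)]
    cons += [cp.log_sum_exp(x - 4.0) <= 0,      # together these act as a box on x,
             cp.log_sum_exp(-x - 4.0) <= 0]     # keeping the toy instance bounded
    prob = cp.Problem(cp.Minimize(cp.log_sum_exp(A0 @ x + b0)), cons)
    prob.solve()
    print(prob.value)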



Entropy problems
unnormalized negative entropy is the convex function

−entr(x) = ∑_{i=1}^n x_i log(x_i / 1^T x)

defined for x_i ≥ 0, 1^T x > 0

entropy problem:

minimize   −entr(A_0 x + b_0)
subject to −entr(A_i x + b_i) ≤ 0,  i = 1, . . . , m

A_i ∈ R^{m_i×n}, b_i ∈ R^{m_i}
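
A small NumPy/SciPy spot-check of this function and its (midpoint) convexity; scipy.special.rel_entr(p, q) = p log(p/q) is the primitive assumed here:

    import numpy as np
    from scipy.special import rel_entr

    def neg_entr(x):
        # -entr(x) = sum_i x_i log(x_i / (1^T x))
        return rel_entr(x, x.sum()).sum()

    rng = np.random.default_rng(0)
    x, y = rng.random(6), rng.random(6)            # x_i > 0, 1^T x > 0
    mid = neg_entr(0.5 * (x + y))
    assert mid <= 0.5 * neg_entr(x) + 0.5 * neg_entr(y) + 1e-12
    print(neg_entr(x))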



Solving GPs (and entropy problems)

• GP and entropy problems are duals (if we solve one, we solve the other)

• new IP methods can solve large scale GPs (and entropy problems)
almost as fast as LPs

• applications in many areas:
  – information theory, statistics
  – communications, wireless power control
  – digital and analog circuit design



CMOS analog/mixed-signal circuit design via GP
given
• circuit cell: opamp, PLL, D/A, A/D, SC filter, . . .
• specs: power, area, bandwidth, nonlinearity, settling time, . . .
• IC fabrication process: TSMC 0.18µm mixed-signal, . . .

find
• electronic design: device L & W, bias I & V, component values, . . .
• physical design: placement, layout, routing, GDSII, . . .



The challenges

• complex, multivariable, highly nonlinear problem

• dominating issue: robustness to


– model errors
– parameter variation
– unmodeled dynamics

(sound familiar?)



Two-stage op-amp

[schematic: two-stage op-amp — input pair M1/M2 (Vin+, Vin−), mirror/bias devices M3–M8 with Ibias, compensation Rc and Cc, load CL, supplies Vdd and Vss]

• design variables: device lengths & widths, component values
• constraints/objectives: power, area, bandwidth, gain, noise, slew rate,
output swing, . . .



Op-amp design via GP

• express design problem as GP
  (using change of variables, and a few good approximations . . . )
• tens of variables, hundreds of constraints; solution time ≪ 1 sec

robust version:
• take 10 (or so) different parameter values (‘PVT corners’)
• replicate all constraints for each parameter value
• get ≈ 100 variables, ≈ 1000 constraints; solution time ≈ 2 sec



Minimum noise versus power & BW

[figure: minimum noise (nV/Hz^{1/2}, roughly 100–400) versus power (0–15 mW), for bandwidths ωc = 30, 60, 90 MHz]



Cone Programming
Cone programming

general cone program:

minimize   c^T x
subject to Ax ⪯_K b

• generalized inequality Ax ⪯_K b means b − Ax ∈ K, a proper convex cone
• LP, QP, SOCP, GP can be expressed as cone programs



Semidefinite program

semidefinite program (SDP):

minimize   c^T x
subject to x_1 A_1 + · · · + x_n A_n ⪯ B

B, A_i are symmetric matrices; variable is x ∈ R^n

• constraint is a linear matrix inequality (LMI)
• inequality is a matrix inequality, i.e., K is the positive semidefinite cone
• SDP is a special case of cone program
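
A CVXPY sketch of this standard form; the PSD slack variable Z encodes the LMI, and the data (plus the norm ball that keeps the random instance bounded) are assumptions of this sketch:

    import cvxpy as cp
    import numpy as np

    rng = np.random.default_rng(0)
    n, k = 5, 4                                    # n variables, k x k symmetric matrices
    sym = lambda M: (M + M.T) / 2
    A = [sym(rng.standard_normal((k, k))) for _ in range(n)]
    B = sym(rng.standard_normal((k, k))) + 10 * np.eye(k)   # x = 0 strictly feasible
    c = rng.standard_normal(n)

    x = cp.Variable(n)
    Z = cp.Variable((k, k), PSD=True)              # Z >= 0 in the matrix sense
    lmi = [B - sum(x[i] * A[i] for i in range(n)) == Z]
    prob = cp.Problem(cp.Minimize(c @ x), lmi + [cp.norm(x, 2) <= 10])
    prob.solve()
    print(prob.value)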



Early SDP applications
(around 1990 on)

• control (many)
• combinatorial optimization & graph theory (many)



More recent SDP applications

• structural optimization: Ben-Tal, Nemirovsky, Kocvara, Bendsoe, . . .


• signal processing: Vandenberghe, Stoica, Lorenz, Davidson, Shaked,
Nguyen, Luo, Sturm, Balakrishnan, Saadat, Fu, de Souza, . . .
• circuit design: El Gamal, Vandenberghe, Boyd, Yun, . . .
• algebraic geometry:
Parrilo, Sturmfels, Lasserre, de Klerk, Pressman, Pasechnik, . . .
• communications and information theory:
Rasmussen, Rains, Abdi, Moulines, . . .
• quantum computing:
Kitaev, Watrous, Doherty, Parrilo, Spedalieri, Rains, . . .
• finance: Iyengar, Goldfarb, . . .



Convex optimization hierarchy

[diagram: problem classes nested from more general to more specific — convex problems; cone problems; SDP; SOCP and GP; QP and LP; LS]



Relaxations & Randomization
Relaxations & randomization

convex optimization is increasingly used

• to find good bounds for hard (i.e., nonconvex) problems, via relaxation

• as a heuristic for finding good suboptimal points, often via randomization



Example: Boolean least-squares

Boolean least-squares problem:

minimize   ‖Ax − b‖^2
subject to x_i^2 = 1,  i = 1, . . . , n

• basic problem in digital communications
• could check all 2^n possible values of x . . .
• an NP-hard problem, and very hard in practice
• many heuristics for approximate solution
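
The exhaustive check is easy to state, which shows exactly why it scales as 2^n; a toy instance small enough to enumerate:

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)
    m, n = 12, 8                        # 2^8 = 256 candidate points
    A = rng.standard_normal((m, n))
    b = rng.standard_normal(m)

    best, best_x = np.inf, None
    for s in itertools.product([-1.0, 1.0], repeat=n):
        x = np.array(s)
        val = np.linalg.norm(A @ x - b) ** 2
        if val < best:
            best, best_x = val, x
    print(best, best_x)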



Boolean least-squares as matrix problem

‖Ax − b‖^2 = x^T A^T A x − 2b^T A x + b^T b
           = Tr(A^T A X) − 2b^T A x + b^T b

where X = xx^T

hence can express BLS as

minimize   Tr(A^T A X) − 2b^T A x + b^T b
subject to X_ii = 1,  X ⪰ xx^T,  rank(X) = 1

. . . still a very hard problem



SDP relaxation for BLS
ignore rank-one constraint, and use

X ⪰ xx^T  ⟺  [X x; x^T 1] ⪰ 0

to obtain SDP relaxation (with variables X, x)

minimize   Tr(A^T A X) − 2b^T A x + b^T b
subject to X_ii = 1,  [X x; x^T 1] ⪰ 0

• optimal value of SDP gives lower bound for BLS
• if optimal matrix is rank one, we're done


Interpretation via randomization

• can think of variables X, x in SDP relaxation as defining a normal distribution z ∼ N(x, X − xx^T), with E z_i^2 = 1
• SDP objective is E ‖Az − b‖^2

suggests randomized method for BLS:

• find X*, x* optimal for SDP relaxation
• generate z from N(x*, X* − x* x*^T)
• take x = sgn(z) as approximate solution of BLS
  (can repeat many times and take best one)
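
Continuing the sketch above (same A, b, X, xr, prob), the sampling-and-rounding step might look like this:

    # sample z ~ N(x*, X* - x* x*^T), round with sgn, keep the best of many draws
    xs, Xs = xr.value, X.value
    cov = Xs - np.outer(xs, xs)
    cov = (cov + cov.T) / 2 + 1e-8 * np.eye(n)     # symmetrize, guard against solver jitter
    best = np.inf
    for _ in range(1000):
        z = rng.multivariate_normal(xs, cov)
        xhat = np.where(z >= 0, 1.0, -1.0)         # sgn(z), sending 0 to +1
        best = min(best, np.linalg.norm(A @ xhat - b) ** 2)
    print(best, best / prob.value)                 # ratio to the SDP lower bound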



Example

• (randomly chosen) parameters A ∈ R^{150×100}, b ∈ R^{150}
• x ∈ R^{100}, so feasible set has 2^{100} ≈ 10^{30} points

LS approximate solution: minimize ‖Ax − b‖ s.t. ‖x‖^2 = n, then round:
yields objective 8.7% over SDP relaxation bound

randomized method (using SDP optimal distribution):
• best of 20 samples: 3.1% over SDP bound
• best of 1000 samples: 2.6% over SDP bound



[figure: histogram of ‖Ax − b‖/(SDP bound) over the randomized samples; the SDP bound (ratio 1) and the LS solution are marked]



Interior-Point Methods
Interior-point methods

• handle linear and nonlinear convex problems (Nesterov & Nemirovsky)

• based on Newton's method applied to ‘barrier’ functions that trap x in the interior of the feasible region (hence the name IP)
• worst-case complexity theory: # Newton steps ∼ √(problem size)
• in practice: # Newton steps between 10 & 50 (!)
  — over a wide range of problem dimensions, types, and data
• 1000 variables, 10000 constraints feasible on a PC; far larger if structure is exploited
• readily available (commercial and noncommercial) packages



Typical convergence of IP method

[figure: duality gap (from 10^2 down to 10^{-6}, log scale) versus number of Newton steps (0 to 50), for LP, GP, SOCP, and SDP instances with 100 variables]



Typical effort versus problem dimensions

• LPs with n variables, 2n constraints
• 100 instances for each of 20 problem sizes
• avg & std dev shown

[figure: Newton steps (roughly 15–35) versus n, from 10 to 1000 on a log scale]



Computational effort per Newton step

• Newton step effort dominated by solving linear equations to find the primal-dual search direction

• equations inherit structure from the underlying problem

• equations same as for a least-squares problem of similar size and structure

conclusion: we can solve a convex problem with about the same effort as solving 30 least-squares problems



Problem structure

common types of structure:

• sparsity
• state structure
• Toeplitz, circulant, Hankel; displacement rank
• Kronecker, Lyapunov structure
• symmetry



Exploiting sparsity

• well developed, since late 1970s

• direct (sparse factorizations) and iterative methods (CG, LSQR)

• standard in general-purpose LP, QP, GP, SOCP implementations

• can solve problems with 10^5–10^6 variables and constraints (depending on sparsity pattern)



Exploiting structure in SDPs

in combinatorial optimization, major effort to exploit structure

• structure is mostly (extreme) sparsity
• IP methods and others (bundle methods) used
• problems with 10000 × 10000 LMIs, 10000 variables can be solved
  Ye, Wolkowicz, Burer, Monteiro, . . .



Exploiting structure in SDPs

in control
• structure includes sparsity, Kronecker/Lyapunov
• substantial improvements in order, for particular problem classes
Balakrishnan & Vandenberghe, Hansson, Megretski, Parrilo, Rotea, Smith,
Vandenberghe & Boyd, Van Dooren, . . .

. . . but no general solution yet



Conclusions
Conclusions

convex optimization

• theory fairly mature; practice has advanced tremendously in the last decade

• qualitatively different from general nonlinear programming

• becoming a technology like LS and LP (especially the new problem classes), reliable enough for embedded applications

• costs only about 30× more than least-squares, but is far more expressive

• lots of applications still to be discovered



Some references

• Semidefinite Programming, SIAM Review, 1996

• Applications of Second-Order Cone Programming, LAA, 1999

• Linear Matrix Inequalities in System and Control Theory, SIAM, 1994

• Interior-Point Polynomial Algorithms in Convex Programming, SIAM, 1994, Nesterov & Nemirovsky

• Lectures on Modern Convex Optimization, SIAM, 2001, Ben Tal & Nemirovsky


Shameless promotion

Convex Optimization, Boyd & Vandenberghe

• to be published 2003

• good draft available as a course reader at the Stanford EE364 (UCLA EE236B) class web site

