
Convex Optimization

Stephen Boyd and Lieven Vandenberghe

Revised slides by Stephen Boyd, Lieven Vandenberghe, and Parth Nobel


6. Approximation and fitting
Outline

Norm and penalty approximation

Regularized approximation

Robust approximation

Convex Optimization Boyd and Vandenberghe 6.1


Norm approximation

▶ minimize ∥Ax − b∥ , with A ∈ Rm×n , m ≥ n, ∥ · ∥ is any norm

▶ approximation: Ax★ is the best approximation of b by a linear combination of columns of A


▶ geometric: Ax★ is point in R (A) closest to b (in norm ∥ · ∥ )
▶ estimation: linear measurement model y = Ax + v
– measurement y, v is measurement error, x is to be estimated
– implausibility of v is ∥v∥
– given y = b, most plausible x is x★
▶ optimal design: x are design variables (input), Ax is result (output)
– x★ is design that best approximates desired result b (in norm ∥ · ∥ )

Convex Optimization Boyd and Vandenberghe 6.2


Examples

▶ Euclidean approximation ( ∥ · ∥ 2 )
– solution x★ = A† b

▶ Chebyshev or minimax approximation ( ∥ · ∥ ∞ )


– can be solved via LP
minimize t
subject to −t1 ⪯ Ax − b ⪯ t1

▶ sum of absolute residuals approximation ( ∥ · ∥ 1 )


– can be solved via LP
minimize 1ᵀ y
subject to −y ⪯ Ax − b ⪯ y

Convex Optimization Boyd and Vandenberghe 6.3
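▶ a minimal CVXPY sketch of these three approximations (the data A, b below are random placeholders, not tied to any example in these slides):

import cvxpy as cp
import numpy as np

np.random.seed(0)
m, n = 100, 30
A, b = np.random.randn(m, n), np.random.randn(m)

x = cp.Variable(n)
for p in (2, "inf", 1):                    # Euclidean, Chebyshev, sum of absolute residuals
    prob = cp.Problem(cp.Minimize(cp.norm(A @ x - b, p)))
    prob.solve()
    print(p, prob.value)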


Penalty function approximation

minimize 𝜙(r1 ) + · · · + 𝜙(rm )


subject to r = Ax − b
(A ∈ Rm×n , 𝜙 : R → R is a convex penalty function)
examples
▶ quadratic: 𝜙(u) = u²
▶ deadzone-linear with width a: 𝜙(u) = max{0, |u| − a}
▶ log-barrier with limit a: 𝜙(u) = −a² log(1 − (u/a)²) for |u| < a, ∞ otherwise

[figure: the quadratic, deadzone-linear, and log-barrier penalty functions 𝜙(u) versus u]

Convex Optimization Boyd and Vandenberghe 6.4
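▶ as an illustration, the deadzone-linear penalty approximation can be written directly in CVXPY; the width a and the data A, b below are illustrative assumptions:

import cvxpy as cp
import numpy as np

np.random.seed(0)
m, n, a = 100, 30, 0.5
A, b = np.random.randn(m, n), np.random.randn(m)

x = cp.Variable(n)
r = A @ x - b
deadzone = cp.sum(cp.pos(cp.abs(r) - a))   # sum of max{0, |r_i| - a}
cp.Problem(cp.Minimize(deadzone)).solve()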


Example: histograms of residuals

A ∈ R100×30 ; shape of penalty function affects distribution of residuals


▶ absolute value (p = 1): 𝜙(u) = |u|
▶ square (p = 2): 𝜙(u) = u²
▶ deadzone-linear: 𝜙(u) = max{0, |u| − 0.5}
▶ log-barrier: 𝜙(u) = − log(1 − u²)

[figure: histograms of the residuals r for the four penalty functions]

Convex Optimization Boyd and Vandenberghe 6.5


Huber penalty function
𝜙hub (u) = u²             for |u| ≤ M
𝜙hub (u) = M(2|u| − M)    for |u| > M

[figure: the Huber penalty function 𝜙hub (u) versus u]

▶ linear growth for large u makes approximation less sensitive to outliers


▶ called a robust penalty

Convex Optimization Boyd and Vandenberghe 6.6


Example

[figure: data points and the two affine fits f (t) versus t]

▶ 42 points (circles) ti , yi , with two outliers


▶ affine function f (t) = 𝛼 + 𝛽t fit using quadratic (dashed) and Huber (solid) penalty

Convex Optimization Boyd and Vandenberghe 6.7
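▶ a sketch of such an affine fit using CVXPY's built-in Huber atom cp.huber; the data (ti , yi ), the injected outliers, and the threshold M = 1 below are stand-ins, not the data behind the plot:

import cvxpy as cp
import numpy as np

np.random.seed(0)
t = np.linspace(-10, 10, 42)
y = 0.5 * t + 1.0 + np.random.randn(42)    # noisy affine data (assumed)
y[[5, 30]] += 15.0                         # two injected outliers

alpha, beta = cp.Variable(), cp.Variable()
r = alpha + beta * t - y                   # residuals of the affine fit
cp.Problem(cp.Minimize(cp.sum(cp.huber(r, M=1.0)))).solve()
print(alpha.value, beta.value)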


Least-norm problems

▶ least-norm problem:
minimize ∥x∥
subject to Ax = b,
with A ∈ Rm×n , m ≤ n, ∥ · ∥ is any norm

▶ geometric: x★ is smallest point in solution set {x | Ax = b}


▶ estimation:
– b = Ax are (perfect) measurements of x
– ∥x∥ is implausibility of x
– x★ is most plausible estimate consistent with measurements
▶ design: x are design variables (inputs); b are required results (outputs)
– x★ is smallest (‘most efficient’) design that satisfies requirements

Convex Optimization Boyd and Vandenberghe 6.8


Examples

▶ least Euclidean norm ( ∥ · ∥ 2 )


– solution x = A† b (assuming b ∈ R (A) )

▶ least sum of absolute values ( ∥ · ∥ 1 )


– can be solved via LP
minimize 1ᵀ y
subject to −y ⪯ x ⪯ y, Ax = b
– tends to yield sparse x★

Convex Optimization Boyd and Vandenberghe 6.9
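▶ a minimal CVXPY sketch of the least ℓ1-norm problem; A, b below are random placeholder data:

import cvxpy as cp
import numpy as np

np.random.seed(0)
m, n = 30, 100                             # fewer equations than unknowns (m <= n)
A = np.random.randn(m, n)
b = A @ np.random.randn(n)                 # ensures Ax = b is feasible

x = cp.Variable(n)
cp.Problem(cp.Minimize(cp.norm(x, 1)), [A @ x == b]).solve()
print(np.sum(np.abs(x.value) > 1e-6))      # count of (numerically) nonzero entries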


Outline

Norm and penalty approximation

Regularized approximation

Robust approximation

Convex Optimization Boyd and Vandenberghe 6.10


Regularized approximation

▶ a bi-objective problem:

minimize (w.r.t. R²₊ ) (∥Ax − b∥, ∥x∥)

▶ A ∈ Rm×n , norms on Rm and Rn can be different


▶ interpretation: find good approximation Ax ≈ b with small x

▶ estimation: linear measurement model y = Ax + v, with prior knowledge that ∥x∥ is small
▶ optimal design: small x is cheaper or more efficient, or the linear model y = Ax is only valid
for small x
▶ robust approximation: good approximation Ax ≈ b with small x is less sensitive to errors
in A than good approximation with large x

Convex Optimization Boyd and Vandenberghe 6.11


Scalarized problem

▶ minimize ∥Ax − b∥ + 𝛾∥x∥


▶ solution for 𝛾 > 0 traces out optimal trade-off curve
▶ other common method: minimize ∥Ax − b∥² + 𝛿∥x∥² with 𝛿 > 0
▶ with ∥ · ∥₂ , called Tikhonov regularization or ridge regression

minimize ∥Ax − b∥₂² + 𝛿∥x∥₂²

▶ can be solved as a least-squares problem


minimize ∥ [A; √𝛿 I] x − [b; 0] ∥₂²   (A stacked on √𝛿 I, and b stacked on 0)

with solution x★ = (AᵀA + 𝛿I)⁻¹ Aᵀb

Convex Optimization Boyd and Vandenberghe 6.12
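▶ a small NumPy sketch checking that the stacked least-squares formulation and the closed-form expression agree (A, b, and 𝛿 below are arbitrary placeholders):

import numpy as np

np.random.seed(0)
m, n, delta = 100, 30, 0.1
A, b = np.random.randn(m, n), np.random.randn(m)

# stacked least-squares formulation: minimize || [A; sqrt(delta) I] x - [b; 0] ||_2^2
A_stack = np.vstack([A, np.sqrt(delta) * np.eye(n)])
b_stack = np.concatenate([b, np.zeros(n)])
x_ls = np.linalg.lstsq(A_stack, b_stack, rcond=None)[0]

# closed form: x = (A^T A + delta I)^{-1} A^T b
x_cf = np.linalg.solve(A.T @ A + delta * np.eye(n), A.T @ b)

print(np.allclose(x_ls, x_cf))             # the two solutions coincide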


Optimal input design

▶ linear dynamical system (or convolution system) with impulse response h:


y(t) = Σ_{𝜏=0}^{t} h(𝜏) u(t − 𝜏),   t = 0, 1, . . . , N

▶ input design problem: multicriterion problem with 3 objectives


– tracking error with desired output ydes : Jtrack = Σ_{t=0}^{N} (y(t) − ydes (t))²
– input magnitude: Jmag = Σ_{t=0}^{N} u(t)²
– input variation: Jder = Σ_{t=0}^{N−1} (u(t + 1) − u(t))²
track desired output using a small and slowly varying input signal
▶ regularized least-squares formulation: minimize Jtrack + 𝛿Jder + 𝜂Jmag
– for fixed 𝛿, 𝜂, a least-squares problem in u(0) , . . . , u(N)

Convex Optimization Boyd and Vandenberghe 6.13
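▶ a CVXPY sketch of this regularized least-squares formulation; the impulse response h, desired output ydes , and weights 𝛿, 𝜂 below are illustrative assumptions:

import cvxpy as cp
import numpy as np

N = 200
h = 0.9 ** np.arange(N + 1)                        # assumed impulse response
y_des = np.sign(np.sin(2 * np.pi * np.arange(N + 1) / 100.0))  # assumed desired output

# convolution matrix H with H[t, s] = h(t - s) for s <= t, so y = H @ u
H = np.zeros((N + 1, N + 1))
for t in range(N + 1):
    H[t, :t + 1] = h[:t + 1][::-1]

u = cp.Variable(N + 1)
delta, eta = 0.1, 0.01                             # illustrative weights
J_track = cp.sum_squares(H @ u - y_des)
J_der = cp.sum_squares(cp.diff(u))
J_mag = cp.sum_squares(u)
cp.Problem(cp.Minimize(J_track + delta * J_der + eta * J_mag)).solve()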


Example
▶ minimize Jtrack + 𝛿Jder + 𝜂Jmag
▶ (top) 𝛿 = 0, small 𝜂; (middle) 𝛿 = 0, larger 𝜂; (bottom) large 𝛿
[figure: input u(t) and output y(t) versus t for the three cases]
Convex Optimization Boyd and Vandenberghe 6.14
Signal reconstruction

▶ bi-objective problem:

minimize (w.r.t. R²₊ ) (∥ x̂ − xcor ∥₂ , 𝜙(x̂))

– x ∈ Rn is unknown signal
– xcor = x + v is (known) corrupted version of x, with additive noise v
– variable x̂ (reconstructed signal) is estimate of x
– 𝜙 : Rn → R is regularization function or smoothing objective
▶ examples:
– quadratic smoothing: 𝜙quad (x̂) = Σ_{i=1}^{n−1} (x̂i+1 − x̂i )²
– total variation smoothing: 𝜙tv (x̂) = Σ_{i=1}^{n−1} |x̂i+1 − x̂i |

Convex Optimization Boyd and Vandenberghe 6.15
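▶ a CVXPY sketch of the two smoothing objectives in scalarized form; the signal, noise level, and weight 𝜆 below are synthetic placeholders:

import cvxpy as cp
import numpy as np

np.random.seed(0)
n = 2000
x_true = np.where(np.arange(n) % 500 < 250, 1.0, -1.0)   # signal with sharp transitions
x_cor = x_true + 0.1 * np.random.randn(n)                # corrupted signal

lam = 5.0                                                # smoothing weight (assumed)
x_hat = cp.Variable(n)
fit = cp.sum_squares(x_hat - x_cor)

# quadratic smoothing: phi_quad = sum of squared differences
cp.Problem(cp.Minimize(fit + lam * cp.sum_squares(cp.diff(x_hat)))).solve()

# total variation smoothing: phi_tv = sum of absolute differences
cp.Problem(cp.Minimize(fit + lam * cp.tv(x_hat))).solve()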


Quadratic smoothing example
[figure: original signal x and noisy signal xcor (left); three reconstructions x̂ on the trade-off curve of ∥ x̂ − xcor ∥₂ versus 𝜙quad (x̂) (right)]
Convex Optimization Boyd and Vandenberghe 6.16
Reconstructing a signal with sharp transitions
[figure: original signal x with sharp transitions and noisy signal xcor (left); three quadratically smoothed reconstructions x̂ on the trade-off curve of ∥ x̂ − xcor ∥₂ versus 𝜙quad (x̂) (right)]
▶ quadratic smoothing smooths out noise and sharp transitions in signal
Convex Optimization Boyd and Vandenberghe 6.17
Total variation reconstruction
[figure: original signal x and noisy signal xcor (left); three total variation reconstructions x̂ on the trade-off curve of ∥ x̂ − xcor ∥₂ versus 𝜙tv (x̂) (right)]

▶ total variation smoothing preserves sharp transitions in signal


Convex Optimization Boyd and Vandenberghe 6.18
Outline

Norm and penalty approximation

Regularized approximation

Robust approximation

Convex Optimization Boyd and Vandenberghe 6.19


Robust approximation

▶ minimize ∥Ax − b∥ with uncertain A

▶ two approaches:
– stochastic: assume A is random, minimize E ∥Ax − b∥
– worst-case: set 𝒜 of possible values of A, minimize sup_{A∈𝒜} ∥Ax − b∥

▶ tractable only in special cases (certain norms ∥ · ∥ , distributions, sets 𝒜 )

Convex Optimization Boyd and Vandenberghe 6.20


Example

A(u) = A0 + uA1 , u ∈ [−1, 1]
▶ xnom minimizes ∥A0 x − b∥₂²
▶ xstoch minimizes E ∥A(u)x − b∥₂² with u uniform on [−1, 1]
▶ xwc minimizes sup_{−1≤u≤1} ∥A(u)x − b∥₂²

plot shows r(u) = ∥A(u)x − b∥₂ versus u

[figure: r(u) versus u for xnom , xstoch , and xwc ]

Convex Optimization Boyd and Vandenberghe 6.21


Stochastic robust least-squares

▶ A = Ā + U , U random, E U = 0, E UᵀU = P
▶ stochastic least-squares problem: minimize E ∥ ( Ā + U)x − b∥₂²
▶ explicit expression for objective:

E ∥Ax − b∥₂² = E ∥ Āx − b + Ux∥₂²
             = ∥ Āx − b∥₂² + E xᵀUᵀUx
             = ∥ Āx − b∥₂² + xᵀPx

▶ hence, robust least-squares problem is equivalent to: minimize ∥ Āx − b∥₂² + ∥P^{1/2} x∥₂²

▶ for P = 𝛿I , get Tikhonov regularized problem: minimize ∥ Āx − b∥₂² + 𝛿∥x∥₂²

Convex Optimization Boyd and Vandenberghe 6.22
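▶ a NumPy sketch of the equivalent regularized problem, using the closed-form solution x = (ĀᵀĀ + P)⁻¹ Āᵀb (a standard least-squares identity; Ā, b, P below are placeholders):

import numpy as np

np.random.seed(0)
m, n = 100, 30
A_bar, b = np.random.randn(m, n), np.random.randn(m)
G = np.random.randn(n, n)
P = G.T @ G / n                            # PSD matrix standing in for E U^T U

# minimize ||A_bar x - b||_2^2 + x^T P x  =>  (A_bar^T A_bar + P) x = A_bar^T b
x = np.linalg.solve(A_bar.T @ A_bar + P, A_bar.T @ b)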


Worst-case robust least-squares
▶ 𝒜 = {Ā + u1 A1 + · · · + up Ap | ∥u∥₂ ≤ 1} (an ellipsoid in Rm×n )
▶ worst-case robust least-squares problem is

minimize sup_{A∈𝒜} ∥Ax − b∥₂² = sup_{∥u∥₂≤1} ∥P(x)u + q(x)∥₂²

where P(x) = [A1 x  A2 x  · · ·  Ap x] , q(x) = Āx − b
▶ from book appendix B, strong duality holds between the following problems

maximize ∥Pu + q∥₂²              minimize t + 𝜆
subject to ∥u∥₂² ≤ 1             subject to [ I    P    q ]
                                            [ Pᵀ   𝜆I   0 ]  ⪰ 0
                                            [ qᵀ   0    t ]

▶ hence, the robust least-squares problem is equivalent to the SDP

minimize t + 𝜆
subject to [ I        P(x)   q(x) ]
           [ P(x)ᵀ    𝜆I     0    ]  ⪰ 0
           [ q(x)ᵀ    0      t    ]
Convex Optimization Boyd and Vandenberghe 6.23
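▶ a CVXPY sketch of this SDP built with cp.bmat; the data Ā, A1 , . . . , Ap , b below are random placeholders, and the block matrix is symmetric by construction:

import cvxpy as cp
import numpy as np

np.random.seed(0)
m, n, p = 10, 5, 2
A_bar, b = np.random.randn(m, n), np.random.randn(m)
As = [np.random.randn(m, n) for _ in range(p)]

x = cp.Variable(n)
t, lam = cp.Variable(), cp.Variable()
Px = cp.hstack([cp.reshape(Ai @ x, (m, 1)) for Ai in As])   # P(x), an m x p matrix
qx = cp.reshape(A_bar @ x - b, (m, 1))                      # q(x), an m x 1 column

M = cp.bmat([[np.eye(m), Px,               qx],
             [Px.T,      lam * np.eye(p),  np.zeros((p, 1))],
             [qx.T,      np.zeros((1, p)), cp.reshape(t, (1, 1))]])

cp.Problem(cp.Minimize(t + lam), [M >> 0]).solve()
print(x.value)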
Example
▶ r(u) = ∥ (A0 + u1 A1 + u2 A2 )x − b∥₂ , u uniform on unit disk
▶ three choices of x:
– xls minimizes ∥A0 x − b∥₂
– xtik minimizes ∥A0 x − b∥₂² + 𝛿∥x∥₂² (Tikhonov solution)
– xrls minimizes sup_{A∈𝒜} ∥Ax − b∥₂² + ∥x∥₂²

[figure: histograms (frequency of r(u)) for xls , xtik , and xrls ]

Convex Optimization Boyd and Vandenberghe 6.24
