Review 3
Contents
1 Convex optimization
1.1 Convex functions and convex sets
1.2 Convex optimization problems
1.3 Examples of convex optimizations
1 Convex optimization
1.1 Convex functions and convex sets
For a smooth function f, we have the following equivalent definitions for f to be convex:
(a) f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) for all λ ∈ [0, 1]
(b) ∇f is monotone: (∇f(x) − ∇f(y), x − y) ≥ 0
(c) ∇²f(x) is nonnegative definite (positive semidefinite)
For smooth functions, we usually use (c) to check whether a function is convex or not (see the numerical sketch below). For the convexity of general functions (or of operations on convex functions), we use (a).
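For instance, criterion (c) can be spot-checked numerically. The following Python sketch (with an illustrative function chosen here, not one from the notes) samples random points and tests that the Hessian has no negative eigenvalue:

```python
import numpy as np

# Spot check of criterion (c) for f(x1, x2) = x1^2 + x1*x2 + x2^2:
# its Hessian [[2, 1], [1, 2]] must be nonnegative definite everywhere.
def hessian_f(x):
    # Constant Hessian for this quadratic example.
    return np.array([[2.0, 1.0], [1.0, 2.0]])

rng = np.random.default_rng(0)
for _ in range(100):
    x = rng.uniform(-10, 10, size=2)
    assert np.linalg.eigvalsh(hessian_f(x)).min() >= -1e-12
print("Hessian nonnegative definite at all samples: consistent with convexity")
```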
Similarly, we have the following conditions for strict convexity ((a) and (b) are equivalent, while (c) is sufficient but not necessary, e.g., f(x) = x⁴ at x = 0):
(a) f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y), with equality only when x = y or λ = 0, 1.
(b) ∇f is strictly monotone: (∇f(x) − ∇f(y), x − y) > 0 if y ≠ x.
(c) ∇²f(x) is positive definite.
Operations on functions preserving convexity
(i) If f and g are convex, then so are m(x) = max(f(x), g(x)) and h(x) = f(x) + g(x)
(ii) If f is convex and g is convex and non-decreasing, then h(x) = g(f(x)) is convex
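As a quick numerical illustration of (i) (a sketch with made-up convex f and g, using definition (a) at λ = 1/2; a finite sample is of course not a proof):

```python
import numpy as np

# f and g are convex, so m(x) = max(f(x), g(x)) should satisfy
# m((x + y)/2) <= (m(x) + m(y))/2 for all x, y.
f = lambda x: (x - 1.0) ** 2
g = lambda x: np.abs(x)
m = lambda x: np.maximum(f(x), g(x))

rng = np.random.default_rng(1)
x, y = rng.uniform(-5, 5, 1000), rng.uniform(-5, 5, 1000)
assert np.all(m((x + y) / 2) <= (m(x) + m(y)) / 2 + 1e-12)
print("midpoint convexity holds at all sampled pairs")
```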
Theorem 1.1 (Relation between convex functions and convex sets). (a) If f is convex, then the sublevel set {x | f(x) ≤ c} (which could be empty) is convex for any constant c. (b) If f is convex, then the epigraph of f, the set {(x, t) | f(x) ≤ t}, is convex.
This tells us directly that a large collection of sets are convex, like the balls {x | ‖x − x0‖₂ ≤ r}.
The relation between convex sets and convex functions helps us determine quickly whether an optimization problem like the ones below is convex or not.
Example 1.1 (Projection on convex sets). Let Ω be a convex set. The projection PΩ(x) of x on the convex set Ω is the minimizer of
min_{y∈Ω} ‖x − y‖₂.
Theorem 1.2 (Characterization of projection on convex sets). Let Ω be a closed convex set. Then PΩ(x) is the projection of x on Ω if and only if for any y ∈ Ω,
(y − PΩ(x), x − PΩ(x)) ≤ 0.
Proof. If PΩ(x) is the projection of x on Ω, then for y ∈ Ω, the point (1 − λ)PΩ(x) + λy ∈ Ω for λ ∈ (0, 1) (because Ω is convex). From the definition, PΩ(x) has the smallest distance to x, i.e.,
‖x − PΩ(x)‖₂² ≤ ‖x − (1 − λ)PΩ(x) − λy‖₂² = ‖x − PΩ(x)‖₂² − 2λ(y − PΩ(x), x − PΩ(x)) + λ²‖y − PΩ(x)‖₂².
Dividing by λ > 0 and letting λ → 0⁺ gives (y − PΩ(x), x − PΩ(x)) ≤ 0.
Figure 1: The projection PΩ(x) of a point x on the convex set Ω.
Conversely, suppose (y − PΩ(x), x − PΩ(x)) ≤ 0 for all y ∈ Ω. Then
‖y − x‖₂² = ‖y − PΩ(x)‖₂² − 2(y − PΩ(x), x − PΩ(x)) + ‖PΩ(x) − x‖₂² ≥ ‖y − PΩ(x)‖₂² + ‖PΩ(x) − x‖₂².
If y ≠ PΩ(x), then ‖y − PΩ(x)‖₂² > 0 and therefore ‖y − x‖₂² > ‖PΩ(x) − x‖₂². This also shows the uniqueness of the projection PΩ(x).
Example 1.2 (Projection on the unit ball Ω = {y | ‖y‖₂ ≤ 1}). If x ∈ Ω, i.e., ‖x‖₂ ≤ 1, then PΩ(x) = x. Otherwise, PΩ(x) is in the same direction as x and is located on the boundary of Ω, which gives PΩ(x) = x/‖x‖₂. Therefore
PΩ(x) = x if ‖x‖₂ ≤ 1, and PΩ(x) = x/‖x‖₂ otherwise.
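A small sketch implementing this projection and checking the characterization of Theorem 1.2 at random feasible points (the function name is ours):

```python
import numpy as np

def proj_unit_ball(x):
    # P(x) = x inside the ball, x / ||x||_2 outside.
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

rng = np.random.default_rng(2)
x = np.array([2.0, 1.5])          # a point outside the ball
p = proj_unit_ball(x)
for _ in range(1000):
    y = rng.uniform(-1.0, 1.0, size=2)
    if np.linalg.norm(y) <= 1.0:  # keep only y in the feasible set
        assert np.dot(y - p, x - p) <= 1e-12
print("(y - P(x), x - P(x)) <= 0 for all sampled y in the ball")
```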
Figure 2: The projections PΩ(x1), PΩ(x2), PΩ(x3) of points x1, x2, x3 outside the unit ball.
Example 1.4 (Projection on subspaces). If Ω is an (affine) subspace (like a hyperplane, not necessarily containing the origin), then first we have
(y − PΩ(x), x − PΩ(x)) ≤ 0.
On the other hand, 2PΩ(x) − y ∈ Ω (the special property when Ω is a subspace). Replacing y by 2PΩ(x) − y ∈ Ω in the previous inequality gives (y − PΩ(x), x − PΩ(x)) ≥ 0, and therefore
(y − PΩ(x), x − PΩ(x)) = 0,
i.e., x − PΩ(x) is orthogonal to Ω.
Figure 3: The projection of x on a subspace Ω, which has the special property that if y, PΩ(x) ∈ Ω, then so is 2PΩ(x) − y.
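Numerically, the orthogonality (y − PΩ(x), x − PΩ(x)) = 0 can be observed by projecting onto the column span of a random matrix V (a sketch; V and the dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(3)
V = rng.standard_normal((5, 2))             # Omega = span of the columns of V
x = rng.standard_normal(5)
mu, *_ = np.linalg.lstsq(V, x, rcond=None)  # least-squares coefficients
p = V @ mu                                  # P_Omega(x)
# x - P(x) should be orthogonal to every y in Omega, i.e. to the columns of V.
print(np.abs(V.T @ (x - p)).max())          # ~ 1e-15
```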
Example 1.5.
min f (x) = |x − x1 | + · · · + |x − xm |, x ∈ R,
where x1 < x2 < · · · < xm are m constants.
Here each |x − xi| is convex, and so is their sum. Therefore, this is a convex optimization problem.
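A quick numerical look at this example (data points chosen arbitrarily): the minimum of f is attained at a median of the points, which is the standard solution of this problem.

```python
import numpy as np

# f(x) = sum_i |x - x_i| evaluated on a fine grid.
xi = np.array([0.5, 1.0, 2.0, 4.0, 7.0])
grid = np.linspace(-1.0, 9.0, 100001)
f = np.abs(grid[:, None] - xi[None, :]).sum(axis=1)
print(grid[np.argmin(f)], np.median(xi))   # both ~ 2.0, the median
```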
Example 1.6.
min f(x) = √((x1 − 2)² + (x2 − 2)²)
subject to x1 + x2 = 2.
The objective function can be written as f(x) = ‖x − x0‖₂ where x0 = (2, 2), and is convex. The constraint is a linear equality, which is also convex. This is a convex optimization problem.
Example 1.7.
min f(x) = √((x1 − 2)² + (x2 − 2)²)
subject to x1 ≤ 1,
x2 ≤ x1.
The objective is the same as before and is convex. The constraints are two linear inequalities, and are convex. This is a convex optimization problem.
Example 1.8.
min f(x) = x1 + x2
subject to 2 − x1² − x2² ≥ 0.
The objective function f(x) = x1 + x2 is linear, and hence convex. We have to write the constraint as c1(x) = x1² + x2² − 2 ≤ 0. Since c1 is convex, this is a convex optimization problem.
Example 1.9.
min f(x) = x1 + x2
subject to 2 − x1² − x2² ≥ 0, x2 ≥ 0.
Compared to the previous problem, we have the additional constraint x2 ≥ 0. Since it is convex,
this new optimization problem is convex.
Example 1.10.
min f(x) = x1 + x2
subject to x1² + x2² = 1.
The constraint (a circle) is NOT convex, therefore this is not a convex optimization problem. In general, a convex optimization problem can only have linear (affine) equality constraints, not nonlinear equality constraints.
Example 1.11. The minimization problem
min f(x)
where f(x) = max(x, x²) is convex, because both x and x² are convex, and so is their maximum f.
We can write it in the following equivalent form:
min t
subject to x ≤ t,
x² ≤ t.
This problem has the objective function f(x, t) = t, which is convex. The two constraints are convex too, and this alternative form is also a convex optimization problem. The reason we prefer this form is that everything is differentiable, while the original f has kinks (and thus is non-differentiable) at x = 0 and x = 1.
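For instance, the smooth epigraph form can be handed directly to a generic constrained solver (a sketch using scipy's SLSQP; any solver accepting smooth inequality constraints would do):

```python
import numpy as np
from scipy.optimize import minimize

# Variables z = (x, t): minimize t subject to t - x >= 0 and t - x^2 >= 0.
res = minimize(
    fun=lambda z: z[1],
    x0=np.array([1.0, 2.0]),
    method="SLSQP",
    constraints=[
        {"type": "ineq", "fun": lambda z: z[1] - z[0]},
        {"type": "ineq", "fun": lambda z: z[1] - z[0] ** 2},
    ],
)
print(res.x)  # ~ (0, 0): min of max(x, x^2) is 0, attained at x = 0
```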
Example 1.12.
min (x1 − 1)² + (x2 − 1)²
subject to ‖x‖₁ = |x1| + |x2| ≤ 1.
The objective function is convex, because its Hessian matrix is positive definite. The constraint is convex because ‖x‖₁ is convex. Therefore this is a convex optimization problem.
Example 1.13.
min x1
subject to |x1 − 1| + x2 ≤ 4,
x1 − |x2 − 1| ≥ 0.
The objective function is convex. The first constraint c1 (x) = |x1 − 1| + x2 − 4 ≤ 0 is convex,
because c1 is the sum of two convex functions |x1 − 1| and x2 − 4. The second constraint can be
written as c2 (x) = |x2 − 1| − x1 ≤ 0, and c2 is also the sum of two convex functions |x2 − 1| and
−x1 . Therefore, this is a convex optimization problem.
If the objective function f is convex only on part of the domain, we have to check whether f is convex on the feasible region (even though it is not convex on the whole domain).
Example 1.14.
min f(x) = x1³ + x2²
subject to −1 ≤ x1 ≤ 0.
The Hessian matrix of the objective function is
∇²f(x) = diag(6x1, 2),
which has a negative eigenvalue 6x1 if −1 ≤ x1 < 0. Therefore f is not convex on −1 ≤ x1 < 0 and this is NOT a convex optimization problem.
Example 1.15.
min f(x) = x1³ + x2²
subject to 0 ≤ x1 ≤ 1.
The Hessian matrix of the objective function is
∇²f(x) = diag(6x1, 2),
which has non-negative eigenvalues on the whole feasible region. This is a convex optimization problem.
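The two cases can be compared numerically by evaluating the eigenvalues of ∇²f(x) = diag(6x1, 2) over each feasible interval (a simple sketch):

```python
import numpy as np

def min_eig(x1):
    # Smallest eigenvalue of diag(6*x1, 2).
    return np.linalg.eigvalsh(np.diag([6.0 * x1, 2.0])).min()

# Example 1.14: negative for x1 < 0, so f is not convex there.
print([round(min_eig(t), 2) for t in np.linspace(-1.0, 0.0, 5)])
# Example 1.15: nonnegative on the whole interval, so f is convex there.
print([round(min_eig(t), 2) for t in np.linspace(0.0, 1.0, 5)])
```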
Example 1.16.
min f(x) = x1
subject to (x1 − 1)² + x2² = 1,
(x1 + 1)² + x2² = 1.
In general, problems subject to nonlinear equality constraints cannot be convex, but this one is special in the sense that the feasible region has only one point, the origin. Therefore this is a convex optimization problem (though a trivial one).
Example 1.17.
min f(x) = x1² − 2x1 + x2² − x3² + 4x3
subject to x1 − x2 + 2x3 = 2.
The objective function f(x) is not convex (because of the −x3² term), but on the (convex) feasible region Ω it could be convex. One way to show this is to write x2 = x1 + 2x3 − 2 (this should be the choice with the smallest amount of calculation). Equivalently, we can show the convexity of the function
q(x1, x3) = f(x1, x1 + 2x3 − 2, x3) = 2x1² + 3x3² + 4x1x3 − 6x1 − 4x3 + 4.
Since the Hessian matrix of q,
∇²q(x) = \begin{pmatrix} 4 & 4 \\ 4 & 6 \end{pmatrix},
is positive definite, q (and thus f on the feasible region) is convex. Notice that we have the relation ∇²q(x) = Zᵀ∇²f(x)Z, where
Z = \begin{pmatrix} 1 & 0 \\ 1 & 2 \\ 0 & 1 \end{pmatrix},
whose columns are in the null space of the constraint, i.e., they satisfy x1 − x2 + 2x3 = 0.
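The reduced-Hessian identity ∇²q = Zᵀ∇²fZ can be checked directly (a sketch; ∇²f = diag(2, 2, −2) for this f):

```python
import numpy as np

H = np.diag([2.0, 2.0, -2.0])                   # Hessian of f
Z = np.array([[1.0, 0.0], [1.0, 2.0], [0.0, 1.0]])
A = np.array([[1.0, -1.0, 2.0]])                # constraint x1 - x2 + 2*x3 = 2
print(A @ Z)                                     # [[0, 0]]: columns of Z span null(A)
print(Z.T @ H @ Z)                               # [[4, 4], [4, 6]]
print(np.linalg.eigvalsh(Z.T @ H @ Z))           # both positive: q is convex
```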
Convex programming plays an important role in the field of optimization, because of properties like:
(a) If a local minimum exists, it is a global minimum (but may not be strict).
(b) For a strictly convex function, if the function has a minimum, then it is unique.
In contrast, the simplex method for linear programming has some well-known difficulties:
• The complexity in the worst case is exponential, making it difficult for large scale problems (with a lot of variables and constraints).
• The method can be degenerate and lead to cycling (but this can be fixed).
Figure 4: The comparison between simplex method (left) and interior point method (right).
The interior point method was proposed in the early 1980s to deal with these difficulties of the simplex method. Start from the primal problem in the standard form:
min cᵀx
subject to Ax = b, x ≥ 0.
The perturbed optimality conditions defining the central path can be written as
Aᵀλ + s = c, Ax = b, XSe = µe, x ≥ 0, s ≥ 0,
where
X = diag(x1, · · · , xn), S = diag(s1, · · · , sn), e = (1, · · · , 1)ᵀ.
Each iteration updates
(xᵏ⁺¹, λᵏ⁺¹, sᵏ⁺¹) = (xᵏ, λᵏ, sᵏ) + α(∆x, ∆λ, ∆s)
for some step length α > 0. The equations for (∆x, ∆λ, ∆s) can be derived from those for (xᵏ, λᵏ, sᵏ). From
Axᵏ = b, Axᵏ⁺¹ = b,
we have αA∆x = A(xᵏ⁺¹ − xᵏ) = 0. Since α ≠ 0, A∆x = 0. Similarly Aᵀ∆λ + ∆s = 0. Finally, from xᵢᵏsᵢᵏ = µₖ and xᵢᵏ⁺¹sᵢᵏ⁺¹ = µₖ₊₁, we get
sᵢᵏ∆xᵢ + xᵢᵏ∆sᵢ + α∆xᵢ∆sᵢ = (µₖ₊₁ − µₖ)/α.
Since both ∆xᵢ and ∆sᵢ are small, we can ignore the ∆xᵢ∆sᵢ term and get the last set of equations. Once (∆x, ∆λ, ∆s) is found, the step length α is chosen such that (xᵏ⁺¹, λᵏ⁺¹, sᵏ⁺¹) is still inside the feasible region.
Example 2.1. Consider the problem
min 2x1 + x2
subject to x1 + 2x2 = 4,
x1 ≥ 0, x2 ≥ 0.
Do one iteration of the interior point method starting from x⁰ = (2, 1)ᵀ.
The primal problem is already in the standard form with
A = (1 2), b = 4, c = (2, 1)ᵀ,
and the solution is given by ∆x = (−1.2, 0.6), ∆λ = −0.2, ∆s1 = 0.2, ∆s2 = 0.4. Since ∆x2 > 0, we only have to check x1⁰ + α∆x1 ≥ 0, i.e., α ≤ 5/3. For this simple case, we reach the global minimizer (0, 2)ᵀ with this α.
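The reported step satisfies the two linear relations derived above, A∆x = 0 and Aᵀ∆λ + ∆s = 0, which can be verified directly:

```python
import numpy as np

A = np.array([[1.0, 2.0]])
dx = np.array([-1.2, 0.6])
dlam = np.array([-0.2])
ds = np.array([0.2, 0.4])
print(A @ dx)             # [0.]: the step stays on Ax = b
print(A.T @ dlam + ds)    # [0., 0.]: dual feasibility is preserved
```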
For quadratic programming, we will focus on the projected gradient method and the related active-set method.
For linear equality constraints Ax = b, the search direction pᵏ at xᵏ, either the negative gradient (pᵏ = −∇f(xᵏ)) or the Newton direction pᵏ = −(∇²f(xᵏ))⁻¹∇f(xᵏ), may not lie in the null space of A (so that xᵏ + αpᵏ is not feasible for any α ≠ 0). We can project pᵏ onto the null space of A to get p̃ᵏ = pᵏ − Aᵀλ for some λ. This reduces to the least-squares problem
min_λ ‖pᵏ − Aᵀλ‖₂², i.e., AAᵀλ = Apᵏ.
If A has full row rank, then λ = (AAᵀ)⁻¹Apᵏ and p̃ᵏ = (I − Aᵀ(AAᵀ)⁻¹A)pᵏ. Here p̃ᵏ is called the projected or reduced gradient.
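A sketch of the reduced-gradient computation for a random full-row-rank A (the dimensions are arbitrary choices):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((2, 5))             # full row rank with probability 1
p = rng.standard_normal(5)
lam = np.linalg.solve(A @ A.T, A @ p)       # lambda = (A A^T)^{-1} A p
p_tilde = p - A.T @ lam                     # projected / reduced gradient
print(np.abs(A @ p_tilde).max())            # ~ 0: p_tilde lies in null(A)
```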
Example 2.2 (Active-set method). Solve the following problem using the active-set method:
min f(x) = (1/2)(x1 − 3)² + (x2 − 2)²,
subject to 2x1 − x2 ≥ 0, (c1)
−x1 − x2 ≥ −4, (c2)
x2 ≥ 0. (c3)
Solution: Start with c1 and c3 active, so that the feasible set of this subproblem is the single point x⁰ = (0, 0). The Lagrange multipliers of the subproblem min f(x) subject to c1 and c3 active are governed by
∇f(x⁰) = λ1∇c1(x⁰) + λ3∇c3(x⁰), i.e., (−3, −4)ᵀ = λ1(2, −1)ᵀ + λ3(0, 1)ᵀ,
or λ1 = −3/2, λ3 = −11/2. Since both Lagrange multipliers are negative, we can get rid of either one of them. For simplicity, we get rid of c1.
For the subproblem of min f(x) with only c3 active, we get the minimizer x¹ = (3, 0) and the Lagrange multiplier λ3 = −4 < 0. This implies that c3 should not be active either, and we have an unconstrained problem at x¹.
We can find the search direction using Newton's method:
p¹ = −(∇²f(x¹))⁻¹∇f(x¹) = (0, 2)ᵀ.
The step length α for x² = x¹ + αp¹ is determined from the feasibility condition for x². This gives α = 1/2, and c2 becomes active.
Finally, we solve the subproblem min f(x) with only c2 active and get x³ = (7/3, 5/3). The optimality condition indicates that this is a local (actually global) minimizer.
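The last step can be double-checked by eliminating the active constraint c2 (substituting x2 = 4 − x1) and minimizing the resulting one-dimensional function on a grid (a sketch):

```python
import numpy as np

f = lambda x1, x2: 0.5 * (x1 - 3.0) ** 2 + (x2 - 2.0) ** 2
x1 = np.linspace(0.0, 4.0, 400001)
vals = f(x1, 4.0 - x1)                     # c2 active: x1 + x2 = 4
i = np.argmin(vals)
print(x1[i], 4.0 - x1[i])                  # ~ (7/3, 5/3)
```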
Figure 5: The iterates x⁰, x¹, x², x³ of the active-set method in the feasible region.
3 Barrier and Penalty
For constrained problems, alternative ways of dealing with the constraints are to define the objective on the whole domain but prevent the solution from entering the infeasible region (barrier), or to put some penalty on the objective function for violating the constraints (penalty). We have to introduce a new parameter to control these constraints; in the appropriate limit of this parameter, we get the same minimizer as the original problem.
We have actually encountered a barrier in the duality theory, where we put an infinite value on the objective function if the variable is outside the feasible region and zero otherwise; this is called the indicator function. More precisely, for the minimization of f(x) over the feasible region Ω defined as
Ω = {x | ci(x) = 0, i ∈ E, ci(x) ≥ 0, i ∈ I},
the original constrained problem is equivalent to
min_x max_λ [f(x) − Σᵢ λᵢcᵢ(x)], with λᵢ ≥ 0 for i ∈ I.
However, we have to switch the order of min and max to get something useful.
Other barrier methods use logarithmic or inverse functions, replacing the constrained problem by the unconstrained minimization of a barrier function βµ(x). That is,
βµ(x) = f(x) − µ Σᵢ₌₁ᵐ ln cᵢ(x)
or
βµ(x) = f(x) + µ Σᵢ₌₁ᵐ 1/cᵢ(x).
Using the first equation, the second equation can be reduced to −2 + 2x2 − µ/x2 = 0. The roots (it is actually a quadratic equation in x2) are
x2 = (1 ± √(1 + 2µ))/2.
We have to choose the positive root x2 = (1 + √(1 + 2µ))/2, and correspondingly x1 = 3µ/2 + (√(1 + 2µ) − 1)/2. To find the minimizer, we have to take the limit µ → 0⁺, that is,
x* = lim_{µ→0⁺} (3µ/2 + (√(1 + 2µ) − 1)/2, (1 + √(1 + 2µ))/2)ᵀ = (0, 1)ᵀ.
This implies that µ/cᵢ(x) is an approximation of the Lagrange multiplier λᵢ. From the calculation above,
λ1(µ) = µ/c1(x) = 1, λ2(µ) = µ/x2 = 2µ/(√(1 + 2µ) + 1) = √(1 + 2µ) − 1.
Taking the limit µ → 0⁺, we get λ1* = 1 and λ2* = 0.
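The convergence of both the barrier path x(µ) and the multiplier estimate λ2(µ) can be seen numerically (a direct evaluation of the formulas above):

```python
import numpy as np

for mu in [1.0, 1e-1, 1e-2, 1e-3]:
    r = np.sqrt(1.0 + 2.0 * mu)
    x1 = 1.5 * mu + (r - 1.0) / 2.0
    x2 = (1.0 + r) / 2.0
    lam2 = r - 1.0
    print(f"mu={mu:g}: x=({x1:.4f}, {x2:.4f}), lambda2={lam2:.4f}")
# x(mu) -> (0, 1) and lambda2(mu) -> 0 as mu -> 0+
```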
For the barrier method, the minimizer cannot be outside the feasible region, while for the penalty method the minimizer can be infeasible, but with a penalty.
The most popular penalty method for an equality constrained problem like
min f(x) subject to cᵢ(x) = 0, i ∈ E,
is the quadratic penalty
Q(x; µ) = f(x) + (µ/2) Σᵢ∈E cᵢ(x)²,
and we are interested in the limit when µ goes to infinity (such that x(µ) satisfies cᵢ(x(µ)) → 0).
Example 3.3. Consider the problem
With inequality constraints included, the quadratic penalty becomes
Q(x; µ) := f(x) + (µ/2) Σᵢ∈E cᵢ(x)² + (µ/2) Σᵢ∈I ([cᵢ(x)]⁻)².
To avoid the ill-conditioning of the Hessian matrix as µ grows, nonsmooth (exact) penalty functions can be introduced instead, but then the minimizer is more difficult to find, because of the non-differentiability of the penalty function.
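A sketch of the quadratic penalty on a small made-up equality-constrained problem (not one of the examples in these notes): min x1 + x2 subject to x1² + x2² − 2 = 0, whose solution is (−1, −1). Warm-starting each solve at the previous x(µ) keeps the iterates in the right basin.

```python
import numpy as np
from scipy.optimize import minimize

c = lambda x: x[0] ** 2 + x[1] ** 2 - 2.0             # equality constraint
x0 = np.array([-0.5, -0.5])
for mu in [1.0, 10.0, 100.0, 1000.0]:
    Q = lambda x, mu=mu: x[0] + x[1] + 0.5 * mu * c(x) ** 2
    x0 = minimize(Q, x0, method="BFGS").x             # warm start at previous x(mu)
    print(mu, x0, c(x0))                              # x(mu) -> (-1, -1), c -> 0
```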
or x* = (1/m) Σᵢ₌₁ᵐ xᵢ, the mean or the center of mass of these points.
Example 4.2 (Minimal distance to a subspace (least squares or projection)). Find the minimal distance from a point x0 to the subspace spanned by the vectors v1, v2, · · · , vm.
Any point of this subspace is a linear combination µ1v1 + µ2v2 + · · · + µmvm of the vectors, for constants µ1, · · · , µm. Therefore, this problem can be formulated as
min_µ f(µ) = ‖x0 − Vµ‖₂²,
where
V = (v1 v2 · · · vm), µ = (µ1, µ2, · · · , µm)ᵀ.
It is still an unconstrained problem, and the minimizer is given by
0 = ∇f(µ*) = −2Vᵀ(x0 − Vµ*), i.e., VᵀVµ* = Vᵀx0.
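A least-squares sketch of this example with random data (V and x0 are arbitrary; it assumes V has full column rank so that VᵀV is invertible):

```python
import numpy as np

rng = np.random.default_rng(5)
V = rng.standard_normal((6, 3))                 # columns v1, v2, v3
x0 = rng.standard_normal(6)
mu = np.linalg.solve(V.T @ V, V.T @ x0)         # normal equations V^T V mu = V^T x0
print(np.linalg.norm(x0 - V @ mu))              # the minimal distance
print(np.abs(V.T @ (x0 - V @ mu)).max())        # ~ 0: residual orthogonal to span
```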
Example 4.3 (Law of reflection). Find C on the line ℓ such that |AC| + |BC| is minimal. The minimizer C gives the actual path of the light traveling from A to B (the actual statement is minimal traveling time, but since the speed of light is constant in this case, it is equivalent to minimal distance).