
Convex Analysis and Optimization

Lecture 3
Instructor: Wu Yuqia

1.1 Global and local minimum

Let X ⊆ Rn be a nonempty set, and let

f : Rn → (−∞, +∞]

be an extended-real-valued function. We consider the problem of

min_{x∈X} f(x).

Definition 1.1 (Feasible set and feasible solutions) A point x ∈ X is called a feasible solution (or
feasible point) for the above minimization problem. The set X itself is called the feasible set. If X is
nonempty, we say that the problem is feasible; otherwise, it is infeasible.

Definition 1.2 (Minimum and minimizing points) We say a point x∗ ∈ X is a minimum (or global
minimum) of f over X if
f(x∗) = inf_{x∈X} f(x).

Equivalently, we say f attains a minimum over X at x∗ . In this situation, we write

x∗ ∈ arg min_{x∈X} f(x).

If the minimizer is unique, one often writes

x∗ = arg min_{x∈X} f(x)   (slight abuse of notation).

A fundamental question in optimization is whether an optimal solution (i.e., a global minimizer) exists for
an optimization problem.

Proposition 1.3 (Weierstrass’ Theorem) Consider a closed, proper function

f : Rn → (−∞, +∞].

Assume that one of the following three conditions holds:

1. dom(f ) is bounded;

2. There exists a scalar γ such that the level set { x | f (x) ≤ γ} is nonempty and bounded;

3. f is coercive, i.e., f(x) → +∞ whenever ‖x‖ → +∞.


Then the set of all minima of f over Rn is nonempty and compact.

Proof: Case 1: Condition (1) holds. Since f is proper, dom(f ) is nonempty. Pick a sequence {xk } ⊂
dom(f ) such that
lim_{k→∞} f(x_k) = inf_{x∈Rn} f(x).

Because dom(f ) is bounded, the sequence {xk } has at least one limit point, call it x∗ . Since f is closed, it
is lower semicontinuous at x∗ (see, e.g., Proposition 1.2.2(b)), hence

f(x∗) ≤ lim_{k→∞} f(x_k) = inf_{x∈Rn} f(x),

so x∗ is a global minimizer of f . Therefore the set

X∗ = arg min_{x∈Rn} f(x)

is nonempty. Moreover, X ∗ ⊆ dom(f ) and dom(f ) is bounded, so X ∗ is bounded.


Next, observe that X∗ is also the intersection of all sublevel sets of f at levels γ strictly exceeding inf_{x∈Rn} f(x):

X∗ = ⋂_{γ > inf_{x∈Rn} f(x)} { x ∈ Rn | f(x) ≤ γ }.

Each such sublevel set { x | f (x) ≤ γ} is closed (since f is closed; cf. Proposition 1.2.2(c)), and an intersec-
tion of closed sets is closed. Hence X ∗ is closed. Boundedness plus closedness in Rn implies compactness.

Case 2: Condition (2) holds. In this scenario, there is some γ for which the level set

{ x | f (x) ≤ γ}

is nonempty and bounded. Define a new function f˜ by



f (x), if f (x) ≤ γ,
f˜(x) =
+∞, otherwise.

Then f̃ is closed. Furthermore, dom(f̃) is precisely the bounded set { x : f(x) ≤ γ }. Hence, applying the
argument from Case 1 to f̃, we conclude that the set of minima of f̃ (and thus of f ) is nonempty and
compact.

Case 3: Condition (3) holds. Since f is proper, it has at least one nonempty level set. Being coercive
means f (x) → +∞ whenever kxk→ ∞, so every nonempty level set of f is bounded. Hence condition (2) is
satisfied, reducing to the argument above. 

Example 1.4 Consider the problem min_{x∈Rn} (1/2)‖Ax − b‖² with A ∈ Rm×n and b ∈ Rm. If m < n, then
A has a nontrivial null space, so f(x) = (1/2)‖Ax − b‖² is constant along null-space directions and every
nonempty level set of f is unbounded; in particular, none of the conditions of Proposition 1.3 holds.
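This can be checked numerically. The sketch below (using NumPy; the dimensions and variable names are illustrative) builds a wide matrix A, takes a direction d in its null space from the SVD, and verifies that f does not grow along d, so the level set through any feasible point is unbounded:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 2, 4                       # m < n: A has a nontrivial null space
A = rng.standard_normal((m, n))
b = rng.standard_normal(m)

def f(x):
    return 0.5 * np.linalg.norm(A @ x - b) ** 2

# For m < n, the last right singular vectors span null(A).
_, _, Vt = np.linalg.svd(A)
d = Vt[-1]                        # unit vector with A @ d ≈ 0

x0 = np.zeros(n)
for t in (0.0, 10.0, 1e6):
    # f is (numerically) constant along d, so the sublevel set
    # {x | f(x) <= f(x0)} contains x0 + t*d for every t: it is unbounded.
    assert abs(f(x0 + t * d) - f(x0)) <= 1e-6 * (1.0 + f(x0))
```

Note that a minimizer of this least-squares problem still exists; the example only shows that Proposition 1.3 cannot be invoked, and indeed the set of minimizers is unbounded.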

Definition 1.5 (Local Minimum) Let X ⊆ Rn and let f : Rn → (−∞, +∞] be a function. A point
x∗ ∈ X is said to be a local minimum of f over X if there exists some ε > 0 such that

‖x − x∗‖ ≤ ε and x ∈ X =⇒ f(x∗) ≤ f(x).

When X = Rn or if X is precisely the domain of f , one often omits the phrase “over X” and simply says
x∗ is a local minimum of f . A local minimum x∗ is strict if there is no other local minimum within a
neighborhood of x∗ .

Proposition 1.6 If X is a convex subset of Rn and f : Rn → (−∞, ∞] is a proper convex function, then a
local minimum of f over X is also a global minimum of f over X. If in addition f is strictly convex, then
there exists at most one global minimum of f over X.

Proof: Let f be convex, and assume, to arrive at a contradiction, that x∗ is a local minimum of f over X
that is not global. Then there must exist an x̄ ∈ X such that f(x̄) < f(x∗). By convexity, for all α ∈ (0, 1),

f (αx∗ + (1 − α)x̄) ≤ αf (x∗ ) + (1 − α)f (x̄) < f (x∗ ).

Thus, f has strictly lower value than f (x∗ ) at every point on the line segment connecting x∗ with x̄, except
at x∗ . Since X is convex, the line segment belongs to X, thereby contradicting the local minimality of x∗ .
Let f be strictly convex, and assume, to arrive at a contradiction, that two distinct global minima of f over
X, x and y, exist. Then the midpoint (x + y)/2 must belong to X, since X is convex. Furthermore, the
value of f must be smaller at the midpoint than at x and y by the strict convexity of f . Since x and y are
global minima, we obtain a contradiction.
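The contradiction argument can be made concrete with a one-dimensional sketch (the function and the points below are arbitrary illustrative choices): if x∗ were a local minimum with f(x̄) < f(x∗), convexity would force strictly smaller values at every point of the segment, including points arbitrarily close to x∗:

```python
import numpy as np

# Illustration of Proposition 1.6 with the convex function f(x) = (x - 3)^2.
def f(x):
    return (x - 3.0) ** 2

x_star = 1.0   # candidate "local minimum" that is not global
x_bar = 3.0    # a feasible point with strictly smaller value
assert f(x_bar) < f(x_star)

# Every point strictly between x_star and x_bar beats x_star, so points
# arbitrarily close to x_star do better: x_star is not a local minimum.
for alpha in np.linspace(0.01, 0.99, 99):
    x = alpha * x_star + (1 - alpha) * x_bar
    assert f(x) < f(x_star)
```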

1.2 Projection

In this section we consider the projection problem: given a nonempty closed convex set C ⊆ Rn and a vector
x ∈ Rn, find the vector in C at minimum Euclidean distance from x. This vector is called the projection of
x onto C.

Proposition 1.7 Let C be a nonempty closed convex subset of Rn .

(a) For every x ∈ Rn , there exists a unique vector that minimizes kz − xk over all z ∈ C. This vector,
called the projection of x on C, is denoted by PC (x):

PC(x) = arg min_{z∈C} ‖z − x‖.

(b) For every x ∈ Rn , a vector z ∈ C is equal to PC (x) if and only if:

(y − z)⊤(x − z) ≤ 0, ∀ y ∈ C.

In the case where C is an affine set, the above condition is equivalent to:

(x − z) ∈ S ⊥ ,

where S is the subspace that is parallel to C.

(c) The function PC : Rn → C is continuous and nonexpansive, i.e.,

‖PC(y) − PC(x)‖ ≤ ‖y − x‖, ∀ x, y ∈ Rn.

(d) The distance function d(·, C) : Rn → R, defined by d(x, C) = min_{z∈C} ‖z − x‖, is convex.

Proof: (a) Fix x and let w be some element of C. Minimizing ‖x − z‖ over all z ∈ C is equivalent to
minimizing the continuous function

g(z) = (1/2)‖z − x‖²

over the set { z ∈ C | ‖x − z‖ ≤ ‖x − w‖ }, which is compact. Therefore, by Weierstrass' Theorem,
there exists a minimizing vector.
To prove uniqueness, note that g(·) is a strictly convex function because its Hessian matrix is the identity
matrix, which is positive definite. Thus, its minimum is attained at a unique vector.
(b) For all y and z in C, we have:

‖y − x‖² = ‖y − z‖² + ‖z − x‖² − 2(y − z)⊤(x − z) ≥ ‖z − x‖² − 2(y − z)⊤(x − z).

Therefore, if z is such that (y − z)⊤(x − z) ≤ 0 for all y ∈ C, we have:

‖y − x‖² ≥ ‖z − x‖², ∀ y ∈ C,

implying that z = PC (x).


Conversely, let z = PC (x), consider any y ∈ C, and for α > 0, define:

yα = αy + (1 − α)z.

We have:
‖x − yα‖² = ‖α(x − y) + (1 − α)(x − z)‖².
Expanding this, we get:

‖x − yα‖² = α²‖x − y‖² + (1 − α)²‖x − z‖² + 2α(1 − α)(x − y)⊤(x − z).

Viewing ‖x − yα‖² as a function of α, we compute the derivative at α = 0:

(∂/∂α)‖x − yα‖² |_{α=0} = −2‖x − z‖² + 2(x − y)⊤(x − z) = −2(y − z)⊤(x − z).

Since α = 0 minimizes ‖x − yα‖² over α ∈ [0, 1], we have:

(∂/∂α)‖x − yα‖² |_{α=0} ≥ 0,

or equivalently:
(y − z)⊤(x − z) ≤ 0.

If C is affine and is parallel to the subspace S, we have:

y ∈ C ⇐⇒ y − z ∈ S.

Hence, the condition (y − z)⊤(x − z) ≤ 0 for all y ∈ C is equivalent to w⊤(x − z) ≤ 0 for all w ∈ S. Since
S is a subspace, w ∈ S implies −w ∈ S, so this holds if and only if

(x − z) ⊥ S.
(c) Let x and y be elements of Rn . From part (b), we have:

(w − PC(x))⊤(x − PC(x)) ≤ 0, ∀ w ∈ C.

Since PC (y) ∈ C, we obtain:


(PC(y) − PC(x))⊤(x − PC(x)) ≤ 0.
Similarly:
(PC(x) − PC(y))⊤(y − PC(y)) ≤ 0.

Adding these two inequalities, we get:

(PC(y) − PC(x))⊤(x − PC(x) − y + PC(y)) ≤ 0.

By rearranging and using the Schwarz inequality, we have:

‖PC(y) − PC(x)‖² ≤ (PC(y) − PC(x))⊤(y − x) ≤ ‖PC(y) − PC(x)‖ · ‖y − x‖.

Thus:
‖PC(y) − PC(x)‖ ≤ ‖y − x‖,
showing that PC is nonexpansive and therefore also continuous.
(d) Assume, to arrive at a contradiction, that there exist x1 , x2 ∈ Rn and an α ∈ (0, 1) such that:

d(αx1 + (1 − α)x2 , C) > αd(x1 , C) + (1 − α)d(x2 , C).

Then, there must exist z1 , z2 ∈ C such that:

d(αx1 + (1 − α)x2, C) > α‖z1 − x1‖ + (1 − α)‖z2 − x2‖.

Since αz1 + (1 − α)z2 ∈ C by the convexity of C, the left-hand side is at most
‖αz1 + (1 − α)z2 − αx1 − (1 − α)x2‖, and hence

‖αz1 + (1 − α)z2 − αx1 − (1 − α)x2‖ > α‖z1 − x1‖ + (1 − α)‖z2 − x2‖,
which contradicts the triangle inequality.
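Proposition 1.7 can be sanity-checked numerically for a set whose projection has a closed form. The sketch below (NumPy; proj_ball is a helper defined here for illustration, not part of the lecture) projects onto the unit Euclidean ball and verifies the variational inequality of part (b) and the nonexpansiveness of part (c) on random points:

```python
import numpy as np

def proj_ball(x, radius=1.0):
    """Projection onto the closed Euclidean ball {z : ||z|| <= radius}."""
    nrm = np.linalg.norm(x)
    return x if nrm <= radius else (radius / nrm) * x

rng = np.random.default_rng(1)
for _ in range(100):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    px, py = proj_ball(x), proj_ball(y)
    # Part (c): nonexpansiveness of the projection map.
    assert np.linalg.norm(py - px) <= np.linalg.norm(y - x) + 1e-12
    # Part (b): (w - P_C(x))^T (x - P_C(x)) <= 0 for points w of C.
    for _ in range(20):
        w = proj_ball(rng.standard_normal(3))
        assert (w - px) @ (x - px) <= 1e-10
```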

Example 1.8 Consider the quadratic programming problem:


minimize (1/2)‖x‖² + c⊤x subject to Ax = 0,
where c is a given vector in Rn and A is an m × n matrix of rank m.
By adding the constant term (1/2)‖c‖² to the cost function, we can equivalently write this problem as:

minimize (1/2)‖c + x‖² subject to Ax = 0,
which is the problem of projecting the vector −c onto the subspace X = {x | Ax = 0}. By Proposition 1.7
(b), a vector x∗ such that Ax∗ = 0 is the unique projection if and only if:

(c + x∗)⊤x = 0, ∀ x with Ax = 0.

This implies that c + x∗ ∈ Range(A⊤), and thus there exists λ ∈ Rm such that c + x∗ = A⊤λ. Applying A
to both sides and using Ax∗ = 0 gives Ac = AA⊤λ, which together with the fact that A has full row rank
yields λ = (AA⊤)⁻¹Ac. It follows that the vector

x∗ = −(I − A⊤(AA⊤)⁻¹A)c

satisfies this condition and is thus the unique solution of the quadratic programming problem. (The
matrix AA⊤ is invertible because A has rank m.)
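The closed-form solution of Example 1.8 is easy to verify numerically. The sketch below (NumPy; the random data is illustrative) builds a full-row-rank A, forms x∗ = −(I − A⊤(AA⊤)⁻¹A)c, and checks feasibility together with the optimality condition c + x∗ ⊥ null(A):

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 2, 5
A = rng.standard_normal((m, n))   # full row rank with probability 1
c = rng.standard_normal(n)

# x* = -(I - A^T (A A^T)^{-1} A) c: minus the component of c in null(A).
P = np.eye(n) - A.T @ np.linalg.solve(A @ A.T, A)
x_star = -P @ c

assert np.allclose(A @ x_star, 0.0)   # feasibility: A x* = 0

# c + x* must be orthogonal to null(A); a basis comes from the SVD.
_, _, Vt = np.linalg.svd(A)
N = Vt[m:]                            # rows span null(A)
assert np.allclose(N @ (c + x_star), 0.0)
```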

Homework: 1.4(a), 1.7


Find a minimizer of the following problem:

min_{x∈Rn} (1/2)‖x − c‖² s.t. Ax ≤ b.
