Lecture 3
Instructor: Wu Yuqia
$f : \mathbb{R}^n \to (-\infty, +\infty]$
$$\min_{x \in X} f(x).$$
Definition 1.1 (Feasible set and feasible solutions) A point x ∈ X is called a feasible solution (or
feasible point) for the above minimization problem. The set X itself is called the feasible set. If X is
nonempty, we say that the problem is feasible; otherwise, it is infeasible.
Definition 1.2 (Minimum and minimizing points) We say a point x∗ ∈ X is a minimum (or global
minimum) of f over X if
$$f(x^*) = \inf_{x \in X} f(x).$$
A fundamental question in optimization is whether an optimal solution (i.e., a global minimizer) exists for
an optimization problem.
$f : \mathbb{R}^n \to (-\infty, +\infty]$.
1. dom(f ) is bounded;
2. There exists a scalar γ such that the level set { x | f (x) ≤ γ} is nonempty and bounded;
Proof: Case 1: Condition (1) holds. Since f is proper, dom(f) is nonempty. Pick a sequence $\{x_k\} \subset \mathrm{dom}(f)$ such that
$$\lim_{k \to \infty} f(x_k) = \inf_{x \in \mathbb{R}^n} f(x).$$
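Because dom(f) is bounded, the sequence $\{x_k\}$ has at least one limit point, call it $x^*$. Since f is closed, it is lower semicontinuous at $x^*$ (see, e.g., Proposition 1.2.2(b)), hence
$$f(x^*) \le \lim_{k \to \infty} f(x_k) = \inf_{x \in \mathbb{R}^n} f(x),$$
so $x^*$ is a global minimum and the set of minima $X^*$ is nonempty. Since $X^* \subset \mathrm{dom}(f)$, the set $X^*$ is bounded. Moreover, $X^*$ is the intersection of all level sets $\{x \mid f(x) \le \gamma\}$ with $\gamma > \inf_{x \in \mathbb{R}^n} f(x)$.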
Each such sublevel set $\{x \mid f(x) \le \gamma\}$ is closed (since f is closed; cf. Proposition 1.2.2(c)), and an intersection of closed sets is closed. Hence $X^*$ is closed. Boundedness plus closedness in $\mathbb{R}^n$ implies compactness.
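Case 2: Condition (2) holds. In this scenario, there is some γ for which the level set
$$\{x \mid f(x) \le \gamma\}$$
is nonempty and bounded. Define $\tilde f(x) = f(x)$ if $f(x) \le \gamma$, and $\tilde f(x) = +\infty$ otherwise. Then $\tilde f$ is closed. Furthermore, $\mathrm{dom}(\tilde f)$ is precisely the bounded set $\{x \mid f(x) \le \gamma\}$. Hence, applying the argument from Case 1 to $\tilde f$, we conclude that the set of minima of $\tilde f$ (and thus of f) is nonempty and compact.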
Case 3: Condition (3) holds. Since f is proper, it has at least one nonempty level set. Being coercive means $f(x) \to +\infty$ whenever $\|x\| \to \infty$, so every nonempty level set of f is bounded. Hence condition (2) is satisfied, reducing to the argument above.
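Example 1.4 Consider the problem $\min_{x \in \mathbb{R}^n} \tfrac{1}{2}\|Ax - b\|^2$ with $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. If m < n, there exists γ > 0 such that the level set $\{x \mid \tfrac{1}{2}\|Ax - b\|^2 \le \gamma\}$ of the objective is not bounded: A then has a nontrivial null space, and the objective is constant along any null-space direction.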
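The unboundedness claimed in Example 1.4 can be illustrated numerically. The sketch below is only an illustration under assumed data: the matrix `A`, the vector `b`, the base point `x0`, and the direction `d` are hypothetical values chosen so that m = 1 < n = 2 and `A d = 0`. It moves along the null-space direction and records that the objective stays at the same level while the norm of the point grows.

```python
# Hypothetical data with m = 1 < n = 2, so A has a nontrivial null space.
A = [[1.0, 1.0]]
b = [2.0]


def objective(x):
    """Return 0.5 * ||A x - b||^2 for the small dense A and b above."""
    residual = [
        sum(a_ij * x_j for a_ij, x_j in zip(row, x)) - b_i
        for row, b_i in zip(A, b)
    ]
    return 0.5 * sum(r * r for r in residual)


def euclidean_norm(x):
    """Return the Euclidean norm of the list x."""
    return sum(x_i * x_i for x_i in x) ** 0.5


x0 = [1.0, 1.0]   # A x0 = b, so objective(x0) = 0
d = [1.0, -1.0]   # A d = 0, a null-space direction of A

# Points x0 + t * d for increasing t: all lie in the same level set.
steps = [0.0, 10.0, 1e3, 1e6]
points = [[x0_i + t * d_i for x0_i, d_i in zip(x0, d)] for t in steps]
values = [objective(p) for p in points]
norms = [euclidean_norm(p) for p in points]

for t, value, size in zip(steps, values, norms):
    print(f"t = {t:>9.1f}   objective = {value:.3e}   ||x|| = {size:.3e}")
```

Every point generated this way belongs to the level set for any γ ≥ 0, so that level set cannot be bounded.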
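Definition 1.5 (Local Minimum) Let $X \subseteq \mathbb{R}^n$ and let $f : \mathbb{R}^n \to (-\infty, +\infty]$ be a function. A point x∗ ∈ X is said to be a local minimum of f over X if there exists some ε > 0 such that
$$f(x^*) \le f(x), \quad \forall x \in X \text{ with } \|x - x^*\| < \varepsilon.$$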
When $X = \mathbb{R}^n$ or when X is precisely the domain of f, one often omits the phrase “over X” and simply says
x∗ is a local minimum of f . A local minimum x∗ is strict if there is no other local minimum within a
neighborhood of x∗ .
Proposition 1.6 If X is a convex subset of $\mathbb{R}^n$ and $f : \mathbb{R}^n \to (-\infty, \infty]$ is a proper convex function, then a
local minimum of f over X is also a global minimum of f over X. If in addition f is strictly convex, then
there exists at most one global minimum of f over X.
Proof: Let f be convex, and assume, to arrive at a contradiction, that x∗ is a local minimum of f over X that is not global. Then there must exist an x̄ ∈ X such that f(x̄) < f(x∗). By convexity, for all α ∈ (0, 1),
$$f(\alpha x^* + (1 - \alpha)\bar{x}) \le \alpha f(x^*) + (1 - \alpha) f(\bar{x}) < f(x^*).$$
Thus, f has strictly lower value than f (x∗ ) at every point on the line segment connecting x∗ with x̄, except
at x∗ . Since X is convex, the line segment belongs to X, thereby contradicting the local minimality of x∗ .
Let f be strictly convex, and assume, to arrive at a contradiction, that two distinct global minima x and y of f over X exist. Then the midpoint (x + y)/2 must belong to X, since X is convex. Furthermore, by the strict convexity of f, the value of f at the midpoint must be smaller than at x and y. Since x and y are global minima, we obtain a contradiction.
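The convexity inequality used in the first part of the proof can be checked on a concrete case. The following is a minimal sketch under assumptions: the convex function `f`, the candidate point `x_star`, and the better point `x_bar` are hypothetical choices, not taken from the lecture. It evaluates f along the segment from `x_bar` toward `x_star` and confirms that points arbitrarily close to `x_star` still have a strictly smaller value, so `x_star` cannot be a local minimum.

```python
def f(x):
    """A hypothetical convex function on R^2 with global minimum at (1, -2)."""
    return (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2


def segment_point(x_star, x_bar, alpha):
    """Return alpha * x_star + (1 - alpha) * x_bar."""
    return [alpha * s + (1.0 - alpha) * b for s, b in zip(x_star, x_bar)]


x_star = [0.0, 0.0]   # candidate point that is not a global minimum
x_bar = [1.0, -2.0]   # a point with f(x_bar) < f(x_star)

# alpha close to 1 means the segment point is close to x_star.
alphas = [0.5, 0.9, 0.99, 0.999]
checks = []
for alpha in alphas:
    p = segment_point(x_star, x_bar, alpha)
    bound = alpha * f(x_star) + (1.0 - alpha) * f(x_bar)
    # Convexity gives f(p) <= bound, and bound < f(x_star) since f(x_bar) < f(x_star).
    checks.append(f(p) <= bound + 1e-12 and bound < f(x_star))
    print(f"alpha = {alpha}: f(p) = {f(p):.6f}, bound = {bound:.6f}")
```

All entries of `checks` being true matches the chain of inequalities in the proof for this particular function.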
1.2 Projection
In this section we consider the projection, that is, finding a vector in a given nonempty closed convex set C,
which is at minimum Euclidean distance from another given vector x. It is called the projection of x onto
C.
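(a) For every $x \in \mathbb{R}^n$, there exists a unique vector that minimizes $\|z - x\|$ over all z ∈ C. This vector, called the projection of x on C, is denoted by $P_C(x)$:
$$P_C(x) = \arg\min_{z \in C} \|z - x\|.$$
(b) For every $x \in \mathbb{R}^n$, a vector z ∈ C is equal to $P_C(x)$ if and only if
$$(y - z)'(x - z) \le 0, \quad \forall y \in C.$$
In the case where C is an affine set, the above condition is equivalent to:
$$(x - z) \in S^\perp,$$
where S is the subspace parallel to C.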
Proof: (a) Fix x and let w be some element of C. Minimizing $\|x - z\|$ over all z ∈ C is equivalent to minimizing the continuous function:
$$g(z) = \tfrac{1}{2}\|z - x\|^2.$$
The minimization can be restricted to the set of all z ∈ C such that $\|x - z\| \le \|x - w\|$, which is a compact set. Therefore, by Weierstrass’s Theorem, there exists a minimizing vector.
To prove uniqueness, note that g(·) is a strictly convex function because its Hessian matrix is the identity
matrix, which is positive definite. Thus, its minimum is attained at a unique vector.
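(b) For all y and z in C, we have:
$$\|y - x\|^2 = \|y - z\|^2 + \|z - x\|^2 - 2(y - z)'(x - z) \ge \|z - x\|^2 - 2(y - z)'(x - z).$$
Therefore, if z satisfies $(y - z)'(x - z) \le 0$ for all y ∈ C, then
$$\|y - x\|^2 \ge \|z - x\|^2, \quad \forall y \in C,$$
which implies $z = P_C(x)$. Conversely, let $z = P_C(x)$, take any y ∈ C, and for α ∈ [0, 1] define
$$y_\alpha = \alpha y + (1 - \alpha) z.$$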
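We have:
$$\|x - y_\alpha\|^2 = \|\alpha(x - y) + (1 - \alpha)(x - z)\|^2.$$
Expanding this, we get:
$$\|x - y_\alpha\|^2 = \alpha^2\|x - y\|^2 + (1 - \alpha)^2\|x - z\|^2 + 2\alpha(1 - \alpha)(x - y)'(x - z),$$
so that
$$\frac{\partial}{\partial \alpha}\|x - y_\alpha\|^2 \Big|_{\alpha = 0} = -2\|x - z\|^2 + 2(x - y)'(x - z) = -2(y - z)'(x - z).$$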
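Since $y_\alpha \in C$ for all α ∈ [0, 1] and z minimizes the distance to x over C, the value α = 0 minimizes $\|x - y_\alpha\|^2$ over α ∈ [0, 1]. Hence:
$$\frac{\partial}{\partial \alpha}\|x - y_\alpha\|^2 \Big|_{\alpha = 0} \ge 0,$$
or equivalently:
$$(y - z)'(x - z) \le 0.$$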
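Finally, let C be affine, let z ∈ C, and let S be the subspace parallel to C. Then
$$y \in C \iff y - z \in S.$$
Hence the condition of part (b) is equivalent to $w'(x - z) \le 0$ for all w ∈ S, which, since S is a subspace, holds if and only if $x - z \in S^\perp$.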
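(c) Let x and y be elements of $\mathbb{R}^n$. From part (b), we have:
$$(w - P_C(x))'(x - P_C(x)) \le 0, \quad \forall w \in C.$$
Taking $w = P_C(y)$ gives $(P_C(y) - P_C(x))'(x - P_C(x)) \le 0$, and similarly $(P_C(x) - P_C(y))'(y - P_C(y)) \le 0$. Adding these two inequalities, rearranging, and using the Schwarz inequality, we obtain:
$$\|P_C(y) - P_C(x)\|^2 \le (P_C(y) - P_C(x))'(y - x) \le \|P_C(y) - P_C(x)\| \cdot \|y - x\|.$$
Thus:
$$\|P_C(y) - P_C(x)\| \le \|y - x\|,$$
showing that $P_C$ is nonexpansive and therefore also continuous.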
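(d) Assume, to arrive at a contradiction, that there exist $x_1, x_2 \in \mathbb{R}^n$ and an α ∈ (0, 1) such that:
$$\|\alpha x_1 + (1 - \alpha) x_2 - P_C(\alpha x_1 + (1 - \alpha) x_2)\| > \alpha\|x_1 - P_C(x_1)\| + (1 - \alpha)\|x_2 - P_C(x_2)\|.$$
Let $z_1 = P_C(x_1)$ and $z_2 = P_C(x_2)$. Since C is convex, $\alpha z_1 + (1 - \alpha) z_2 \in C$, so its distance to $\alpha x_1 + (1 - \alpha) x_2$ is at least the left-hand side above. This implies:
$$\|\alpha z_1 + (1 - \alpha) z_2 - \alpha x_1 - (1 - \alpha) x_2\| > \alpha\|z_1 - x_1\| + (1 - \alpha)\|z_2 - x_2\|,$$
which contradicts the triangle inequality.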
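The characterization in part (b) and the nonexpansiveness shown in part (c) can be spot-checked numerically. The sketch below assumes a specific set that does not appear in the lecture: the box C = [-1, 1]^3, for which the Euclidean projection is coordinatewise clamping. The helper names (`project_box`, `random_vector`) and the sampling ranges are hypothetical. Random sampling only illustrates the inequalities; it does not prove them.

```python
import math
import random

LOWER, UPPER = -1.0, 1.0  # hypothetical box C = [-1, 1]^n


def project_box(x):
    """Project x onto C by clamping each coordinate to [LOWER, UPPER]."""
    return [min(max(x_i, LOWER), UPPER) for x_i in x]


def sub(u, v):
    return [u_i - v_i for u_i, v_i in zip(u, v)]


def dot(u, v):
    return sum(u_i * v_i for u_i, v_i in zip(u, v))


def norm(u):
    return math.sqrt(dot(u, u))


rng = random.Random(0)  # fixed seed so the run is reproducible


def random_vector(n, scale):
    """Return a vector with coordinates drawn uniformly from [-scale, scale]."""
    return [rng.uniform(-scale, scale) for _ in range(n)]


max_inner = -math.inf  # largest (y - z)'(x - z) observed; part (b) says <= 0
max_ratio = 0.0        # largest ||P(x2) - P(x1)|| / ||x2 - x1||; part (c) says <= 1

for _ in range(1000):
    x1 = random_vector(3, 5.0)
    x2 = random_vector(3, 5.0)
    y = random_vector(3, 1.0)  # a point of C
    z1, z2 = project_box(x1), project_box(x2)
    max_inner = max(max_inner, dot(sub(y, z1), sub(x1, z1)))
    max_ratio = max(max_ratio, norm(sub(z2, z1)) / norm(sub(x2, x1)))

print("part (b) holds on samples:", max_inner <= 1e-12)
print("part (c) holds on samples:", max_ratio <= 1.0 + 1e-12)
```

A box is used here because its projection has a closed form; for a general closed convex set, computing the projection is itself an optimization problem.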
$$\text{minimize } \tfrac{1}{2}\|c + x\|^2 \quad \text{subject to } Ax = 0,$$
which is the problem of projecting the vector −c onto the subspace X = {x | Ax = 0}. By Proposition 1.7(b), a vector $x^*$ such that $Ax^* = 0$ is the unique projection if and only if:
$$(c + x^*)'x = 0, \quad \forall x \text{ with } Ax = 0.$$
This implies that $c + x^* \in \mathrm{Range}(A^\top)$, and thus there exists $\lambda \in \mathbb{R}^m$ such that $c + x^* = A^\top \lambda$. Multiplying by A and using $Ax^* = 0$ gives $Ac = AA^\top \lambda$, which, together with the fact that A has full row rank, implies that $\lambda = (AA^\top)^{-1} Ac$. It can be seen that the vector:
$$x^* = -\left(I - A^\top (AA^\top)^{-1} A\right) c$$
satisfies this condition and is thus the unique solution of the quadratic programming problem (2.1). (The matrix $AA^\top$ is invertible because A has rank m.)
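The closed-form solution above can be verified on a small instance. This is a sketch under assumed data: the 2×3 full-row-rank matrix `A`, the vector `c`, and the null-space direction `d` are hypothetical. It uses plain Python lists and a 2×2 Cramer's-rule solve instead of a linear algebra library. It computes λ from $AA^\top \lambda = Ac$, forms $x^* = A^\top \lambda - c$ (the same vector as in the formula above), and checks feasibility and optimality against other feasible points.

```python
def mat_vec(M, v):
    """Return the matrix-vector product M v for list-of-rows M."""
    return [sum(m_ij * v_j for m_ij, v_j in zip(row, v)) for row in M]


def transpose(M):
    return [list(col) for col in zip(*M)]


def solve_2x2(M, r):
    """Solve M u = r for a 2x2 matrix M using Cramer's rule."""
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    if det == 0.0:
        raise ValueError("matrix is singular")
    return [
        (r[0] * M[1][1] - M[0][1] * r[1]) / det,
        (M[0][0] * r[1] - r[0] * M[1][0]) / det,
    ]


def objective(x, c):
    """Return 0.5 * ||c + x||^2."""
    return 0.5 * sum((c_i + x_i) ** 2 for c_i, x_i in zip(c, x))


# Hypothetical data: A is 2x3 with rank 2 (full row rank).
A = [[1.0, 0.0, 1.0],
     [0.0, 1.0, 1.0]]
c = [1.0, 0.0, 0.0]

A_t = transpose(A)
AAt = [[sum(p * q for p, q in zip(r1, r2)) for r2 in A] for r1 in A]

lam = solve_2x2(AAt, mat_vec(A, c))                         # lambda = (A A^T)^{-1} A c
x_star = [v - c_i for v, c_i in zip(mat_vec(A_t, lam), c)]  # x* = A^T lambda - c

d = [1.0, 1.0, -1.0]  # A d = 0, so x_star + t * d stays feasible
other_values = [
    objective([x_i + t * d_i for x_i, d_i in zip(x_star, d)], c)
    for t in (-1.0, 0.5, 2.0)
]

print("A x* =", mat_vec(A, x_star))
print("objective at x*:", objective(x_star, c))
print("objective at other feasible points:", other_values)
```

Solving the m×m system for λ, rather than forming the n×n projection matrix explicitly, is a common design choice when m is much smaller than n.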