Lecture 4
[Figure 1: a function f over a set S ⊂ R, with the boundary points of S marked.]
We call x∗ ∈ S a strict local minimum of f over S if the inequality above holds strictly, i.e., f(x∗) < f(x) for x ̸= x∗.
– We say that a set S ⊆ Rn is open if for every x ∈ S there exists an ε > 0 such that Bε (x) :=
{ y ∈ Rn | ∥y − x∥ < ε } ⊂ S.
– We say that a set S ⊆ Rn is closed if Rn \ S is open.
– A limit point of a set S ⊆ Rn is a point x such that there exists a sequence {xk} ⊂ S fulfilling xk → x.
– We can then define a closed set as a set which contains all its limit points.
– We say that a set S ⊆ Rn is bounded if there exists a constant C > 0 such that ∥x∥ ≤ C for all
x ∈ S.
– If a set is both closed and bounded, we call it compact.
Two important definitions needed to formulate Weierstrass’ Theorem are the following.
Definition (weakly coercive function). A function f is said to be weakly coercive with respect to the set S if either S is bounded or

lim_{∥x∥→∞, x∈S} f(x) = ∞.

Definition (lower semi-continuous function). A function f is said to be lower semi-continuous on S if f(x) ≤ lim inf_{k→∞} f(xk) holds whenever {xk} ⊂ S and xk → x ∈ S.
Now we can formulate Weierstrass’ Theorem which guarantees the existence of optimal solutions
to an optimization problem as long as a few assumptions are satisfied.
Theorem (Weierstrass’ Theorem). Consider the problem (1), where S is a nonempty and closed set and
f is lower semi-continuous on S. If f is weakly coercive with respect to S, then there exists a nonempty,
closed and bounded (thus compact) set of optimal solutions to the problem (1).
One way to remember the assumptions in Weierstrass’ Theorem is to imagine what can go wrong,
i.e., when does a problem not have an optimal solution. One example of an optimization problem
where the solution set is empty is when f (x) = 1/x and S = [1, ∞).
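To make this tangible, here is a small numerical sketch (Python/NumPy; the weakly coercive comparison function g is our own choice, not part of the notes) showing that the minimizers of f(x) = 1/x over the truncated sets [1, B] run off to infinity as B grows, while a weakly coercive function attains its minimum:

    import numpy as np

    f = lambda x: 1.0 / x                    # not weakly coercive on S = [1, inf): f -> 0, never attained
    g = lambda x: 1.0 / x + 0.01 * x         # weakly coercive: g(x) -> inf as x -> inf

    for B in [10, 100, 1000, 10000]:
        xs = np.linspace(1.0, B, 200001)
        i = np.argmin(f(xs))
        print(B, xs[i], f(xs[i]))            # the minimizer is always the endpoint B,
        # so the minimizers escape to infinity and no optimal solution exists

    xs = np.linspace(1.0, 10000.0, 2000001)
    i = np.argmin(g(xs))
    print(xs[i], g(xs[i]))                   # ~ (10.0, 0.2): g attains its minimum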
When S = Rn , i.e., the problem is an unconstrained optimization problem, then the following
theorem holds.
Theorem (necessary condition for optimality, C1). If f ∈ C1 on Rn, then

x∗ is a local minimum of f on Rn =⇒ ∇f(x∗) = 0

Note that ∇f(x) = (∂f(x)/∂x1, . . . , ∂f(x)/∂xn)T. The opposite direction of the theorem is, however, not true: take f(x) = x3 and x∗ = 0, as the sketch below verifies.
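As a quick numerical check (a Python sketch; the sampling radii are arbitrary), the derivative of f(x) = x3 vanishes at 0, yet f takes values below f(0) arbitrarily close to 0, so 0 is not a local minimum:

    f = lambda x: x**3
    fprime = lambda x: 3 * x**2          # derivative of x^3

    print(fprime(0.0))                   # 0.0: the necessary condition holds at x = 0
    for eps in [1e-1, 1e-3, 1e-6]:
        print(eps, f(-eps) < f(0.0))     # True: f(-eps) < f(0) for every eps > 0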
We can strengthen the theorem by assuming that f is also in C2.

Theorem (necessary condition for optimality, C2). If f ∈ C2 on Rn, then

x∗ is a local minimum of f on Rn =⇒ ∇f(x∗) = 0 and ∇2f(x∗) ⪰ 0
Remember that for a matrix A ∈ Rn×n, the notation A ⪰ 0 (A positive semidefinite) means that xT Ax ≥ 0 for all x ∈ Rn. Once again, the opposite direction in the theorem is not true.
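Both points can be made concrete with a small sketch (Python/NumPy; the two example functions are our own choices): the Hessian test rules out a saddle point of f(x) = x1^2 − x2^2 that passes the first-order test, while f(x) = −x^4 shows that satisfying both necessary conditions does not guarantee a minimum:

    import numpy as np

    # f(x) = x1^2 - x2^2: grad f(0) = 0, but the Hessian has a negative
    # eigenvalue, so the origin is ruled out as a local minimum
    H = np.array([[2.0, 0.0], [0.0, -2.0]])   # constant Hessian of f
    print(np.linalg.eigvalsh(H))              # [-2.  2.]: not positive semidefinite

    # f(x) = -x^4: f'(0) = 0 and f''(0) = 0 >= 0, so both necessary conditions
    # hold at 0, yet 0 is a strict maximum -- the opposite direction fails
    f = lambda x: -x**4
    print(f(0.1) < f(0.0), f(-0.1) < f(0.0))  # True True: 0 is not a minimum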
However, by assuming positive definiteness of the Hessian of f, we can obtain a sufficient condition.
Theorem (sufficient condition for optimality, C2). If f ∈ C2 on Rn, then

∇f(x∗) = 0 and ∇2f(x∗) ≻ 0 =⇒ x∗ is a strict local minimum of f on Rn
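A sketch of how one might verify the two premises numerically (Python/NumPy; the Rosenbrock function is our choice of example, with its known minimizer (1, 1)):

    import numpy as np

    # Rosenbrock: f(x) = (1 - x1)^2 + 100*(x2 - x1^2)^2
    def grad(x):
        return np.array([-2 * (1 - x[0]) - 400 * x[0] * (x[1] - x[0]**2),
                         200 * (x[1] - x[0]**2)])

    def hess(x):
        return np.array([[2 - 400 * x[1] + 1200 * x[0]**2, -400 * x[0]],
                         [-400 * x[0], 200.0]])

    xstar = np.array([1.0, 1.0])
    print(grad(xstar))                      # [0. 0.]: first premise holds
    print(np.linalg.eigvalsh(hess(xstar)))  # both eigenvalues > 0: Hessian is
    # positive definite, so (1, 1) is a strict local minimum by the theorem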
To get a condition which is both necessary and sufficient for a point to be a global minimum, we need to assume convexity of the function f.
Theorem (necessary and sufficient condition for optimality, C 1 ). If f ∈ C 1 is convex on Rn , then
x∗ is a global minimum of f on Rn ⇐⇒ ∇f (x∗ ) = 0
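For instance (a sketch; the particular quadratic is our own choice), for a convex quadratic f(x) = ½ xT Qx − bT x with Q positive definite, the theorem reduces global minimization to solving the linear system Qx = b:

    import numpy as np

    # convex quadratic: grad f(x) = Q x - b, Q symmetric positive definite
    Q = np.array([[4.0, 1.0], [1.0, 3.0]])   # eigenvalues are positive
    b = np.array([1.0, 2.0])

    xstar = np.linalg.solve(Q, b)            # the unique point with grad f = 0
    print(Q @ xstar - b)                     # ~[0. 0.], so xstar is the global minimum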
When S = Rn, the set of directions in which we can move from a point x and remain feasible is all of Rn. When we consider cases where S ⊂ Rn, this need not hold.
Definition (feasible direction). Let x ∈ S. A vector p ∈ Rn defines a feasible direction at x if
∃δ > 0 : x + αp ∈ S, for all α ∈ [0, δ].
So the feasible directions at a point x ∈ S describe the directions in which we can "move" without becoming infeasible.
Definition (descent direction). Let x ∈ Rn . A vector p defines a descent direction with respect to f
at x if
∃δ > 0 : f (x + αp) < f (x), for all α ∈ (0, δ].
Suppose that f ∈ C1 around a point x ∈ Rn, and that p ∈ Rn. If ∇f(x)T p < 0, then the vector p defines a direction of descent with respect to f at x. The sketch below illustrates both definitions.
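Here is a minimal sketch (Python/NumPy; the box S = [0, 1]^2, the function f(x) = ∥x∥^2 and the test point are our own choices) that checks the two definitions by sampling small step lengths, together with the gradient test for descent:

    import numpy as np

    # S = [0,1]^2, f(x) = ||x||^2; examine two directions at a boundary point x
    f = lambda x: float(x @ x)
    grad = lambda x: 2 * x
    in_S = lambda x: bool(np.all(x >= 0) and np.all(x <= 1))

    x = np.array([1.0, 0.5])
    for p in [np.array([-1.0, 0.0]), np.array([1.0, 0.0])]:
        alphas = np.linspace(1e-6, 0.1, 50)
        feasible = all(in_S(x + a * p) for a in alphas)     # stays in S for small alpha?
        descent = all(f(x + a * p) < f(x) for a in alphas)  # strictly decreases f?
        print(p, feasible, descent, grad(x) @ p)            # negative inner product => descent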
We can now state necessary optimality conditions for cases when S ̸= Rn.

Theorem (necessary optimality conditions). Suppose that S ⊆ Rn and that f ∈ C1 on S. If x∗ ∈ S is a local minimum of f over S, then ∇f(x∗)T p ≥ 0 holds for all feasible directions p at x∗. If, in addition, S is convex, then

∇f(x∗)T (x − x∗) ≥ 0, for all x ∈ S. (2)
We refer to (2) as a variational inequality, and we can now extend the notion of stationary points by denoting them as points fulfilling (2). This is the first of four definitions of a stationary point.
Now the necessary and sufficient conditions for optimality can be stated in the following theorem.

Theorem (necessary and sufficient conditions for optimality). Suppose that S is convex and that f ∈ C1 is convex on S. Then

x∗ is a global minimum of f over S ⇐⇒ ∇f(x∗)T (x − x∗) ≥ 0, for all x ∈ S

Note that when S = Rn, the expression to the right just becomes ∇f(x∗) = 0. Why?
We will now present three additional definitions of a stationary point which are all equivalent to (2). The first one we get by taking the minimum of the left-hand side of (2) and then realizing that the optimal value must be zero, i.e.,

min_{x∈S} ∇f(x∗)T (x − x∗) = 0. (3)

Convince yourself that (2) and (3) are equivalent! Now we claim that (2) and (3) are also equivalent with
x∗ = ProjS [x∗ − ∇f (x∗ )] . (4)
Equation (4) states that if you stand at a stationary point, take a step in the direction of the negative gradient, and then project back onto the feasible set, you end up at the same point. The details for showing this equivalence can be found in the book (pp. 94–95). See also Figure 2.
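A small numerical sketch of this fixed-point property (Python/NumPy; the box S = [0, 1]^2, whose Euclidean projection is componentwise clipping, and the function f(x) = ∥x − c∥^2 are our own choices):

    import numpy as np

    # S = [0,1]^2; projecting onto a box is componentwise clipping
    c = np.array([2.0, 0.5])                  # center chosen outside S
    grad = lambda x: 2 * (x - c)              # gradient of f(x) = ||x - c||^2
    proj = lambda x: np.clip(x, 0.0, 1.0)

    xstar = np.array([1.0, 0.5])              # the minimizer of f over S
    x = np.array([0.5, 0.5])                  # a non-stationary point
    print(np.allclose(proj(xstar - grad(xstar)), xstar))    # True: (4) holds
    print(np.allclose(proj(x - grad(x)), x))                # False: x is not stationary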
For the last equivalent definition of a stationary point, we need to introduce the normal cone.
Definition (normal cone). Suppose the set S is closed and convex. Let x ∈ S. Then the normal cone to S at x is the set

NS(x) := { p ∈ Rn | pT (y − x) ≤ 0, for all y ∈ S }.
Think of the normal cone at a point x as all directions pointing "straight out" from the set; see Figure 2. Now the fourth equivalent definition of a stationary point is that

−∇f(x∗) ∈ NS(x∗). (5)
[Figure 2: a convex set S with the normal cone NS(x) at points on its boundary.]
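Continuing the box example above (still a sketch with our own data), condition (5) can be checked by sampling points of S, using that membership in NS(x∗) only requires pT (y − x∗) ≤ 0 for all y ∈ S:

    import numpy as np

    # same setup: S = [0,1]^2, f(x) = ||x - c||^2 with c = (2, 0.5)
    c = np.array([2.0, 0.5])
    xstar = np.array([1.0, 0.5])
    p = -2 * (xstar - c)                      # p = -grad f(x*) = (2, 0)

    rng = np.random.default_rng(0)
    Y = rng.uniform(0.0, 1.0, size=(100000, 2))   # random points of S
    print(np.max((Y - xstar) @ p))            # <= 0: p^T (y - x*) <= 0 on S,
    # so -grad f(x*) lies in N_S(x*) and x* is stationary by (5)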
Definition (stationary point). Suppose that S is convex and that f ∈ C1. A point x∗ ∈ S fulfilling the four equivalent statements a)–d) is called a stationary point.

a) ∇f(x∗)T (x − x∗) ≥ 0, for all x ∈ S,

b) min_{x∈S} ∇f(x∗)T (x − x∗) = 0,

c) x∗ = ProjS [x∗ − ∇f(x∗)],

d) −∇f(x∗) ∈ NS(x∗).
The two important theorems which will be utilized throughout the whole course are the following.
Theorem (necessary optimality conditions). Suppose that S is convex and that f ∈ C1. Then

x∗ is a local minimum of f over S =⇒ x∗ is stationary
Theorem (necessary and sufficient optimality conditions). Suppose that S is convex and that f ∈ C 1
is convex. Then
x∗ is a global minimum of f over S ⇐⇒ x∗ is stationary
As we will see later in the course, the last definition (the inclusion (5)) is the only one that can be
extended to the case of non-convex sets S.
Now we will present a very useful theorem for convex sets which says: if a point y does not lie in a closed convex set S, then there exists a hyperplane separating y from S. Mathematically, this amounts to the following.
Theorem (the separation theorem). Suppose that S ⊆ Rn is closed and convex, and that the point y does not lie in S. Then there exists a vector π ̸= 0 and a scalar α ∈ R such that πT y > α and πT x ≤ α for all x ∈ S.
[Figure: a hyperplane separating the point y from the closed convex set S.]
The separation theorem can be used to prove Farkas' Lemma efficiently; see Theorem 4.33.
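One standard proof is constructive: take π = y − ProjS(y) and α = πT ProjS(y). A sketch of this construction (Python/NumPy; the box S = [0, 1]^2 and the point y are our own choices), verified by sampling:

    import numpy as np

    # S = [0,1]^2 and a point y outside S
    proj = lambda x: np.clip(x, 0.0, 1.0)      # Euclidean projection onto the box
    y = np.array([2.0, 1.5])

    p = proj(y)                                # closest point of S to y
    pi = y - p                                 # normal of the separating hyperplane
    alpha = pi @ p

    print(pi @ y > alpha)                      # True: pi^T y > alpha
    rng = np.random.default_rng(1)
    X = rng.uniform(0.0, 1.0, size=(100000, 2))    # random points of S
    print(np.max(X @ pi) <= alpha + 1e-12)         # True: pi^T x <= alpha on S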