
Optimization Theory

CASED

February 28, 2020

Chapter 1: Mathematical Background

• Given p sets A1 , A2 , . . . , Ap , the set A1 × A2 × . . . × Ap is

A1 × A2 × . . . × Ap = {(a1 , a2 , . . . , ap ) : ai ∈ Ai , for i = 1, 2, . . . , p}
Very often people write ∏_{i=1}^p Ai instead of A1 × A2 × . . . × Ap .

• Let x, y be two elements (vectors) of Rn . We write

x ≥ y if xi ≥ yi , ∀i
x > y if xi ≥ yi , ∀i, and xi > yi for at least one i
x >> y if xi > yi , ∀i

We use the analogous definitions for ≤, <, <<.

• We write x · y for the inner (scalar) product of two vectors: x · y = Σ_{i=1}^n xi yi . The norm of x is kxk = √(x · x). The distance d(x, y) between two vectors is d(x, y) = kx − yk. Observe that kx − yk = ky − xk, kλxk = |λ| kxk for any λ ∈ R, any x, and also kxk = 0 if, and only if, x = 0.

• The distance between two vectors satisfies the Triangle Inequality:

kx + yk ≤ kxk + kyk

and
kx − zk ≤ kx − yk + ky − zk

(equivalently d(x, z) ≤ d(x, y) + d(y, z)).

• We have also the Cauchy-Schwarz Inequality:

|x · y| ≤ kxkkyk
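A quick numerical sanity check of these inequalities (an illustration added here, not part of the original notes; it assumes Python with numpy):

```python
import numpy as np

rng = np.random.default_rng(0)
x, y = rng.normal(size=5), rng.normal(size=5)

norm = lambda v: np.sqrt(v @ v)        # kvk = sqrt(v . v)

assert abs(x @ y) <= norm(x) * norm(y) + 1e-12   # Cauchy-Schwarz
assert norm(x + y) <= norm(x) + norm(y) + 1e-12  # Triangle Inequality
assert np.isclose(norm(x - y), norm(y - x))      # d(x, y) = d(y, x)
```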

• A sequence x1 , x2 , . . . will be denoted by {xk }k . A subsequence {xkn }


of {xk }k is an infinite sequence xk1 , xk2 , . . . where k1 , k2 , . . . is an infinite
increasing sequence of integers. In particular the initial sequence {xk } is
a subsequence.

• A sequence of real numbers {xk }k converges to zero (notation xk → 0) if:
for any ε > 0 there exists an integer N such that if k ≥ N then |xk | ≤ ε,
i.e. −ε ≤ xk ≤ ε.

• A sequence of real numbers {xk }k converges to x (notation xk → x) if


xk − x → 0.

• A sequence of real numbers {xk }k converges to +∞ (notation xk → +∞)


if: for any A ∈ R, there exists K such that, if k ≥ K then xk ≥ A.

• A sequence of points {xk }k in Rn converges to x (notation xk → x) if


kxk − xk → 0.

Theorem 1 A sequence of points {xk }k in Rn converges to x = (x1 , x2 , . . . , xn ),


where xk = (xk1 , xk2 , . . . , xkn ), if, and only if, xki → xi for any i = 1, . . . , n.

• Theorem 2 A sequence in Rn has at most one limit.

Proof : Assume {xk }k in Rn has two limits a, b. Then, given ε > 0, one
can find Na , Nb such that kxn − ak ≤ ε for any n ≥ Na , and kxn − bk ≤ ε
for any n ≥ Nb . Take N ≥ max{Na , Nb }. Then
we have

ka − bk = ka − xN + xN − bk ≤ ka − xN k + kxN − bk ≤ 2ε

Let ε go to 0. We obtain a = b.

• Assume the sequence of points {xk }k in Rn converges to x. Then any


subsequence {xkn }n of {xk }k is also convergent, and its limit is x. Con-
versely, if all subsequences of {xk }k converge to the same limit, then {xk }k
converges also to this limit.
Proof : For any ε > 0, there exists N s.t. for any k ≥ N , we have
kxk − xk ≤ ε. Recall that the sequence {kn }n is increasing, so there
exists M such that kM ≥ N . Then for any n ≥ M , we have kn ≥ kM ≥ N .
Hence, kxkn − xk ≤ ε.
If any subsequence converges to x, since the initial sequence is also a
subsequence, the conclusion follows.

• Let A be a nonempty set of R. An upper bound of A is a point u such


that a ≤ u for any a ∈ A. If A has an upper bound then the supremum of
A (notation sup(A)) is the smallest upper bound of A. If A has no upper
bound then we set sup(A) = +∞. The maximum of A (notation max(A))
is a point z ∈ A such that a ≤ z, ∀a ∈ A.

Proposition 1 Let A be a nonempty set. Then sup(A) is the unique element (possibly +∞) such that:
(i) If x > sup(A) then x ∉ A
(ii) if x < sup(A) then there exists a ∈ A such that x < a
(iii) There exists a sequence in A which converges to sup(A).

Proof : (i) If x ∈ A then x ≤ sup(A), since sup(A) is an upper bound of
A. The result is then obvious.
(ii) If not, then a ≤ x for any a ∈ A, so x is an upper bound of A which
is smaller than sup(A), the smallest upper bound: a contradiction.
(iii) First suppose that sup(A) is finite. Then for any k ∈ N, there exists
ak ∈ A which satisfies sup(A) − 1/k < ak ≤ sup(A). The sequence {ak }k
converges to sup(A). Now suppose sup(A) = +∞. The set A is then
unbounded from above. Hence, for any k ∈ N, there exists ak ∈ A with
ak ≥ k. Obviously, the sequence {ak }k converges to +∞ = sup(A).

If sup(A) is an element of A then max(A) = sup(A). If max(A) exists,


then max(A) = sup(A).
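The distinction between supremum and maximum can be seen on A = {1 − 1/k : k = 1, 2, . . .}: sup(A) = 1, yet no element of A equals 1, so max(A) does not exist. A tiny illustrative sketch (added here, not from the notes; Python):

```python
# a_k = 1 - 1/k is a sequence in A converging to sup(A) = 1 (Proposition 1 (iii)),
# yet every element stays strictly below 1: A has no maximum.
a = [1 - 1/k for k in range(1, 10_001)]
print(max(a))   # 0.9999: the largest sampled element, still < sup(A) = 1
```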

• Let A be a nonempty set of R. A lower bound of A is a point l such


that a ≥ l for any a ∈ A. If A has a lower bound then the infimum of
A (notation inf(A)) is the largest lower bound of A. If A has no lower
bound then we set inf(A) = −∞. The minimum of A (notation min(A))
is a point z ∈ A such that a ≥ z, ∀a ∈ A.

Proposition 2 Let A be a nonempty set. Then inf(A) is the unique element (possibly −∞) such that:
(i) If x < inf(A) then x ∉ A
(ii) if x > inf(A) then there exists a ∈ A such that x > a
(iii) There exists a sequence in A which converges to inf(A).

If inf(A) is an element of A then min(A) = inf(A). If min(A) exists, then


min(A) = inf(A).

• Any non decreasing sequence of real numbers converges either to a finite


number or to +∞. Any non increasing sequence of real numbers converges
either to a finite number or to −∞.
Proof : Consider a non decreasing sequence A = {a1 , a2 , . . .}. If sup(A) =
+∞, then for any M , there exists ak ∈ A which is larger than M . But
for any n > k, we have also an ≥ ak > M . The sequence A converges

to +∞. Now, suppose sup(A) < +∞. Then for any ε > 0, there exists
ak ∈ A which satisfies sup(A) − ε < ak ≤ sup(A). But we also have
∀n > k, sup(A) − ε < ak ≤ an ≤ sup(A). That means the sequence
A converges to sup(A). The proof is similar when the sequence is non
increasing.

• Matrices If M is an m × n matrix (m rows and n columns) and x is an n−vector, then we write M x for the product of M and x. If x is an m−vector, then we write x0 M for the product of M on the left by x (here x0 denotes the transpose of x).

• Mappings Let f : S → T , where S ⊆ Rn , T ⊆ Rm . Then S is called the


domain of f , and T is the range of f . For R ⊆ S, the image of R under
f is
f (R) = {y ∈ T : y = f (x), for some x ∈ R}
For U ⊆ T , the inverse image of U , f −1 (U ), is

f −1 (U ) = {x ∈ S : f (x) ∈ U }

The set f −1 (U ) may be empty.

• Open and closed sets For x ∈ Rn , the open ball B(x, r) with centre x
and radius r is the set B(x, r) = {y ∈ Rn : ky − xk < r}.

– A set S ⊆ Rn is open if for all x ∈ S there exists r > 0 such that B(x, r) ⊆ S. In particular the ball B(x, r) is an open set.
– A point x is in the interior of a set S ⊆ Rn if there exists r > 0 such
that the open ball B(x, r) is contained in S. The set of all interior
points of S is denoted by intS.
– A set S ⊆ Rn is closed if its complement S c = {x ∈ Rn : x ∉ S} is open.
Theorem 3 A set S ⊆ Rn is closed if, and only if, whenever {xk }
is a sequence of points in S that converges to a limit x, then also
x ∈ S.

Proof : We first prove that, if S is closed, then whenever {xk } is


a sequence of points in S that converges to a limit x, then also
x ∈ S. Suppose x ∈ S c . Since S c is open there exists an open ball
B(x, ε) ⊂ S c . Then there exists N such that for any k > N , we
have xk ∈ B(x, ε). That implies xk ∈ S c for k > N : a contradiction.
Hence x ∈ S.
Now suppose we have that whenever {xk } is a sequence of points in
S that converges to a limit x, then x ∈ S. We will prove that S is

closed. If not, S c is NOT open. Then there exists y ∈ S c s.t.
for any k ∈ N, there exists a point xk ∈ S ∩ B(y, 1/k). The sequence
of elements in S, {xk }k , converges to y when k converges to infinity.
But y is not in S: a contradiction. Hence, S is closed.

– For two sets S1 , S2 of Rn , the sum S1 + S2 is

S1 + S2 = {x ∈ Rn : x = x1 + x2 , x1 ∈ S1 , x2 ∈ S2 }

– The empty set is open. The whole space Rn is then closed. But this
one is also open. Hence the empty set is also closed.
– The union of an arbitrary collection of open sets is again open
– The intersection of an arbitrary collection of open sets is NOT AL-
WAYS open
– The intersection of a FINITE collection of open sets is open
– The sum of two open sets is open.
Proof : Let S1 , S2 be two open sets. If one of them is empty then
S1 + S2 is also empty, hence open. So, we assume that both of
them are non-empty. Let x1 ∈ S1 , x2 ∈ S2 . There exists ε > 0
s.t. B(x1 , ε) ⊂ S1 , B(x2 , ε) ⊂ S2 . We claim that the open ball
B(x1 + x2 , ε/2) is in S1 + S2 . Indeed, let y1 satisfy ky1 − x1 k < ε/2.
This implies y1 ∈ B(x1 , ε) ⊂ S1 . Now, let y ∈ B(x1 + x2 , ε/2) and
y2 = y − y1 . We have

ky2 − x2 k = ky − y1 − x1 + x1 − x2 k ≤ ky − (x1 + x2 )k + ky1 − x1 k < ε/2 + ε/2 = ε.

Thus, y2 ∈ B(x2 , ε) ⊂ S2 . We then have: y = y1 + y2 with y1 ∈
S1 , y2 ∈ S2 . This ends the proof.

– The union of an arbitrary collection of closed sets is NOT ALWAYS


closed
– The union of a FINITE collection of closed sets is closed
– The intersection of an arbitrary collection of closed sets is closed
– The sum of two closed sets is NOT ALWAYS closed
An example: take S1 = {(x, y) ∈ R2 : x > 0, y = 1/x}, S2 = {(x, y) ∈
R2 : x > 0, y = −1/x}. Consider the sequence {z k }k ⊂ S1 + S2 defined
by ∀k, z k = (xk1 + xk2 , y1k + y2k ), where xk1 = xk2 = 1/k and y1k = k, y2k =
−k. We have z k = (2/k, 0). It converges to (0, 0) ∉ S1 + S2 .

• Bounded and compact sets

– A set S ⊆ Rn is bounded if there exists M such that S ⊆ B(0, M ). In
other words, S is bounded if there exists M > 0 such that kxk ≤ M
for all x ∈ S.
– A set S is compact if it is bounded and closed.
Theorem 4 A set S is compact if, and only if, for all sequences
{xk } in S, there exists a subsequence {xkm }m which converges to a
point x ∈ S.

Proof : Assume S has the property that for all sequences {xk } in S,
there exists a subsequence {xkm }m which converges to a point x ∈ S.
Let us prove it is compact. First, S is bounded. If not there exists
a sequence {xk }k ⊂ S with limk→+∞ kxk k = +∞. There exists a
subsequence {xkn }n of {xk }k which converges to some x ∈ S, which
is impossible since limn kxkn k = +∞.
We now prove S is closed. Suppose {xk }k ⊂ S converges to x.
There exists a subsequence which converges in S. This limit must
be x. Hence, x ∈ S.
Assume that S is compact. We will assume S ⊂ R2 ; one can
easily see that the proof carries over when the dimension of the
space is larger than 2. Let {xk } be a sequence of S. Write
xk = (xk1 , xk2 ), ∀k. Since S is bounded, the real sequence {xk1 }k is
bounded, so by the Bolzano-Weierstrass theorem there exists a
subsequence {xkn }n along which the first coordinates converge to
some x1 ∈ R. Extracting again, there exists a further subsequence
{xknl }l along which the second coordinates also converge, to some
x2 ∈ R. Since S is closed, (x1 , x2 ) ∈ S since it is the limit of {xknl }l .
To summarize, we have found a subsequence of {xk } which converges
to a point in S.

– The union of an arbitrary collection of compact sets is NOT AL-


WAYS compact
– The union of a finite collection of compact sets is compact
– The intersection of an arbitrary collection of compact sets is compact
– The sum of two compact sets is compact

• Limits of mappings Let f : S → Rm , where S ⊆ Rn .

– We say that f (x) → l ∈ Rm as x → a ∈ Rn , if for every ε > 0, there
exists δ > 0 such that for all x ∈ S with 0 < kx − ak < δ we have
kf (x) − lk < ε. We use the notation limx→a f (x) = l.
Proposition 3 Let f : S → Rm , where S ⊆ Rn . Then f (x) → l ∈
Rm as x → a ∈ Rn if, and only if, for every sequence {xk } in S such
that xk ≠ a but xk → a we have that f (xk ) → l.

– Suppose f : R → Rn . Then we say that f (x) → l ∈ Rn as x → +∞,
if for every ε > 0 there exists M ∈ R such that kf (x) − lk < ε for
all x > M . We use the notation limx→+∞ f (x) = l. We use similar
definitions for x → −∞.

• Continuous mappings

– A mapping f : S → Rm , where S ⊆ Rn is continuous at x ∈ S if for


all sequences {xk } in S converging to x, we have that f (xk ) → f (x).
Equivalently, f is continuous at x ∈ S, if for every ε > 0 there exists
δ > 0 such that if y ∈ S and ky − xk < δ then kf (y) − f (x)k < ε.
– Let f (x) = (f1 (x), . . . , fm (x)) where fi is a mapping from S to R.
Then f is continuous at x ∈ S if, and only if, each fi is continuous
at x.
– A mapping f is continuous on S if it is continuous at any x ∈ S.
– Let f : Rn → Rm , g : Rn → Rm . Assume f, g are continuous. Then
f + g is continuous. If λ is a real number, then λf is continuous.
– Let f : Rn → Rm , g : Rm → Rp . Assume f, g are continuous. The
mapping g ◦ f defined by g ◦ f (x) = g(f (x)) for any x is continuous.
– Let f : Rn → Rm . Then f is continuous if, and only if, for any open
set A ⊆ Rm , f −1 (A) is open in Rn . Equivalently, f is continuous if,
and only if, for any closed set A ⊆ Rm , f −1 (A) is closed in Rn .
Proof : Assume f is continuous. Let A ⊆ Rm be open. We will show
that f −1 (A) is open. For that, let x ∈ f −1 (A). Since A is open, there
exists an open ball B(f (x), ε) ⊂ A. Since f is continuous, there exists
δ > 0 such that if y ∈ B(x, δ) then f (y) ∈ B(f (x), ε) ⊂ A. This
implies y ∈ f −1 (A). Equivalently, B(x, δ) ⊂ f −1 (A) and f −1 (A) is
open.
Conversely, assume that for any open set A ⊂ Rm , the set f −1 (A)
is open. We will prove that f is continuous. Let A = {z ∈ Rm :
kz − f (x)k < ε}. The set A is open. Observe that x ∈ f −1 (A). Since
f −1 (A) is open, there exists an open ball B(x, δ) ⊂ f −1 (A). That
means, if ky−xk < δ then f (y) ∈ A or equivalently kf (y)−f (x)k < ε.
We have proved that f is continuous.
To prove that f is continuous if, and only if, for any closed set
A ⊆ Rm , f −1 (A) is closed in Rn , one can observe that f −1 (Ac ) = (f −1 (A))c .

– In the definition given above δ depends on x. A function f is uniformly continuous on S if δ can be chosen independently of x. More precisely, f is uniformly continuous on S if for every ε > 0, there exists δ > 0 such that for any x ∈ S, any y ∈ S with ky − xk < δ, we have kf (y) − f (x)k < ε.
Proposition 4 Let f : S → Rm , where S ⊆ Rn . If S is compact
and f is continuous on S, then f is uniformly continuous on S.

Proof : Assume S is compact and f is not uniformly continuous on S.
Then there exists ε > 0 such that for any δ > 0, there exist xδ , y δ with
kxδ − y δ k < δ and kf (xδ ) − f (y δ )k ≥ ε.
Take δk = 1/k. Then, for any k, one can find xk , y k s.t. kxk − y k k < 1/k and
kf (xk ) − f (y k )k ≥ ε. Since S is compact, one can find a subsequence
{xkn } ⊂ S which converges to x ∈ S, and a further subsequence
{y knl } ⊂ S which converges to y ∈ S; hence (xknl , y knl ) → (x, y)
when l → +∞. We have kxknl − y knl k < 1/knl → 0 when l → +∞.
Thus kx − yk ≤ 0, and x = y. But kf (xknl ) − f (y knl )k ≥ ε
for any l. Passing to the limit when l goes to infinity we get a
contradiction: 0 = kf (y) − f (x)k ≥ ε > 0. This ends the proof.

– Proposition 5 Let f : S → Rm , where S ⊆ Rn . Assume S is


compact and f is continuous on S. Then f (S) is compact.
Proof : First, f (S) is bounded. If not there exists a sequence
{y n }n ⊂ f (S) with limn→+∞ ky n k = +∞. We can write y n = f (xn )
with xn ∈ S for every n. Since S is compact, there exists a subse-
quence {xnk }k which converges to some x ∈ S and f (xnk ) converges
to f (x). That is a contradiction since kf (xnk )k converges to infinity
too.
We now show that f (S) is closed. For that, let {y n }n ⊂ f (S) converge to y.
We claim that y ∈ f (S). Write y n = f (xn ) with xn ∈ S for every
n. Since S is compact, there exists a subsequence {xnk }k which con-
verges to some x in S and f (xnk ) converges to f (x). We must have
y = f (x). Hence, y ∈ f (S).

Theorem 5 (Weierstrass Theorem) Let f : S → R, where S ⊆


Rn . Assume S is compact, nonempty and f is continuous on S.
Then f has both a maximum and a minimum.

Proof : Let M = sup(f (S)). There exists a sequence {f (xn )}n


converging to M . Since S is compact, there exists a subsequence
{xnk } which converges to some x ∈ S and f (xnk ) converges to f (x) ∈
f (S). We have M = f (x) and hence, M = max(f (S)). The proof is
similar for min(f (S)).

Corollary 1 Let A ⊆ R be a nonempty compact set. Then A has a
maximum and a minimum.

Proof : Take f (x) = x, for all x ∈ R. Then sup(A) = sup(f (A)) and the conclusion follows from Theorem 5.

• Differentiability of mappings

– A mapping f : S → Rm , where S ⊆ Rn , is differentiable at x0 ∈ S,
where x0 must be in the interior of S, if there exists an m × n matrix
A so that

(f (x) − f (x0 ) − A(x − x0 ))/kx − x0 k → 0, as x → x0

The matrix A is called the derivative of f at x0 and is denoted


by Df (x0 ). Moreover, we say that f is differentiable on S if it is
differentiable at every point of S.
– If f is differentiable on a set S, its derivative Df can be seen as a
mapping Df : S → Rm×n . If this mapping is continuous, we say
that f is continuously differentiable or f is C 1 .

• Differentiability of functions

– Let f : S → R, where S ⊆ Rn . We say that f is differentiable at


x0 ∈ S, where x0 must be in the interior of S, if there exists a vector
a ∈ Rn so that
(f (x) − f (x0 ) − a · (x − x0 ))/kx − x0 k → 0, as x → x0

The vector a is called the derivative of f at x0 and is denoted by


Df (x0 ). Moreover, we say that f is differentiable on S if it is dif-
ferentiable at every point of S. We can regard Df as a mapping
from S to Rn . If Df is continuous, we say that f is continuously
differentiable or f is C 1 .
– Let f : S → R, where S ⊆ Rn . We study the derivative via partial
derivatives. Let (e1 , . . . , en ) be the canonical basis of Rn . If ei is a
vector of this basis, then the coordinates of ei equal zero except
the i−th coordinate, which equals 1. The i−th partial derivative of
f at a point x is the number ∂f (x)/∂xi defined by

∂f (x)/∂xi = lim_{t→0} (f (x + tei ) − f (x))/t

Theorem 6 Let f : S → R, where S ⊆ Rn is an open set. The
function f is C 1 on S if, and only if, all partial derivatives of f exist
and are continuous on S. In that case we also have
 
Df (x) = (∂f (x)/∂x1 , ∂f (x)/∂x2 , . . . , ∂f (x)/∂xn )
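As an illustration of Theorem 6, the sketch below (an added example assuming Python with numpy; the function f is a hypothetical choice, not from the text) compares the difference quotient defining the partial derivatives with the analytic Df :

```python
import numpy as np

def f(x):                      # example function: f(x) = x1^2 + 3 x1 x2
    return x[0]**2 + 3*x[0]*x[1]

def Df(x):                     # its analytic derivative (vector of partials)
    return np.array([2*x[0] + 3*x[1], 3*x[0]])

def numeric_Df(f, x, t=1e-6):
    # i-th partial derivative: (f(x + t e_i) - f(x)) / t for small t
    g = np.zeros(len(x))
    for i in range(len(x)):
        e = np.zeros(len(x)); e[i] = 1.0
        g[i] = (f(x + t*e) - f(x)) / t
    return g

x0 = np.array([1.0, -2.0])
print(Df(x0), numeric_Df(f, x0))   # the two should agree up to O(t)
```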

• Second derivatives
– Let f : S → R, where S ⊆ Rn . Then the derivative Df (x) = (∂f (x)/∂x1 , ∂f (x)/∂x2 , . . . , ∂f (x)/∂xn ) is also a mapping from S to Rn . If Df is differentiable then f is called twice differentiable with second derivative D2 f (x). The partial derivatives of the partial derivatives of f are denoted by ∂ 2 f (x)/∂xi ∂xj if i ≠ j, and by ∂ 2 f (x)/∂x2i if i = j. With this notation, D2 f (x) is the n × n matrix whose (i, j) entry is ∂ 2 f (x)/∂xi ∂xj , with first row (∂ 2 f (x)/∂x21 , . . . , ∂ 2 f (x)/∂x1 ∂xn ) and last row (∂ 2 f (x)/∂xn ∂x1 , . . . , ∂ 2 f (x)/∂x2n ).
This matrix is called the Hessian of f at x.


– When f is twice differentiable on S and each second partial deriva-
tive is a continuous function, then f is called twice continuously
differentiable or C 2 .
Theorem 7 If f is C 2 on S ⊆ Rn , then D2 f (x) is a symmetric matrix, i.e. ∂ 2 f (x)/∂xi ∂xj = ∂ 2 f (x)/∂xj ∂xi for all i, j and all x ∈ S.

• Taylor expansion

Theorem 8 Let f : S → R, where S ⊆ Rn is an open set. Pick x0 ∈ S


(i) If f is C 1 on S, then for any x ∈ S we can write

f (x) = f (x0 ) + Df (x0 ) · (x − x0 ) + R1 (x, x0 )kx − x0 k

where R1 (x0 , x0 ) = 0 and R1 (x, x0 ) → 0 as x → x0 .


(ii) If f is C 2 on S, then for any x ∈ S, one can write
f (x) = f (x0 ) + Df (x0 ) · (x − x0 ) + (1/2)(x − x0 )0 D2 f (x0 )(x − x0 ) + R2 (x, x0 )kx − x0 k2
where R2 (x0 , x0 ) = 0 and R2 (x, x0 ) → 0 as x → x0 .

• Definite and semidefinite matrices Let A be an n × n matrix. Then


A is said to be

– positive definite if x0 Ax > 0 for all x ∈ Rn , x 6= 0
– positive semidefinite if x0 Ax ≥ 0 for all x ∈ Rn
– negative definite if x0 Ax < 0 for all x ∈ Rn , x 6= 0
– negative semidefinite if x0 Ax ≤ 0 for all x ∈ Rn
– We denote by Ak the k × k submatrix of A formed by taking just
the first k rows and columns of A.

Theorem 9 Let A be a symmetric n × n matrix.


(i) Then A is positive definite if, and only if, det Ak > 0 for all
k = 1, . . . , n
(ii) And A is negative definite if, and only if, det Ak > 0 for all even
k ∈ {1, . . . , n} and det Ak < 0 for all odd k ∈ {1, . . . , n}.

There is NO equivalent of the above theorem for positive or negative


SEMIDEFINITE matrices.
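The leading-principal-minor test of Theorem 9 is easy to run numerically; a minimal sketch (an added example assuming numpy, valid only for symmetric A, as in the theorem):

```python
import numpy as np

def leading_minors(A):
    # det(A_k) for the top-left k x k submatrices, k = 1, ..., n
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

def is_positive_definite(A):       # Theorem 9 (i): all minors positive
    return all(d > 0 for d in leading_minors(A))

def is_negative_definite(A):       # Theorem 9 (ii): signs alternate, odd ones negative
    return all((d > 0 if k % 2 == 0 else d < 0)
               for k, d in enumerate(leading_minors(A), start=1))

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])         # symmetric
print(is_positive_definite(A), is_negative_definite(-A))   # True True
```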

• Unconstrained Optimization Let f : S → R, where S ⊆ Rn is a


nonempty set.

– A point x ∈ S is a global maximum of f on S if f (y) ≤ f (x) for all


y ∈ S.
– A point x ∈ S is a local maximum of f on S if there exists r > 0
such that f (y) ≤ f (x) for all y ∈ S ∩ B(x, r).
– A point x ∈ S is a strict local maximum of f on S if there exists
r > 0 such that f (y) < f (x) for all y ∈ S ∩ B(x, r), y 6= x.
– A point x ∈ S is an unconstrained local maximum of f on S if
there exists r > 0 such that B(x, r) ⊆ S and f (y) ≤ f (x) for all
y ∈ B(x, r).

Similarly, one can define a global minimum , local minimum , strict local
minimum , unconstrained local minimum

• First-order conditions for an unconstrained optimum

Theorem 10 Suppose x∗ ∈ S is either an unconstrained local minimum


or an unconstrained local maximum. Then Df (x∗ ) = 0.

Proof : Suppose that x∗ is an unconstrained local maximum. There


exists r > 0 such that B(x∗ , r) ⊆ S and f (y) ≤ f (x∗ ) for all y ∈ B(x∗ , r).
Assume a = Df (x∗ ) 6= 0. From the Taylor expansion, we have

f (x) = f (x∗ ) + Df (x∗ ) · (x − x∗ ) + R1 (x, x∗ )kx − x∗ k

For any real number t, define xt = x∗ + ta. For t close enough to zero, we
have that xt ∈ B(x∗ , r) ∩ S. Then

f (xt ) = f (x∗ ) + a · (xt − x∗ ) + R1 (xt , x∗ )kxt − x∗ k


= f (x∗ ) + tkak2 + R1 (x∗ + ta, x∗ )|t|kak
= f (x∗ ) + tkak (kak + R1 (x∗ + ta, x∗ )) , when t ≥ 0
> f (x∗ )

when t > 0 is small enough. We get a contradiction. Hence Df (x∗ ) = 0. The


proof is similar when x∗ is an unconstrained local minimum.

A point x is an optimum for f if it is either a maximum or a minimum of f .


If it is an unconstrained local optimum, we have proved that Df (x) = 0.
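Numerically, a point with Df (x) = 0 can be approached by gradient descent; the following sketch (an added illustration with a hypothetical quadratic f , assuming numpy) drives the gradient to zero:

```python
import numpy as np

def Df(x):   # gradient of f(x) = (x1 - 1)^2 + 2 (x2 + 3)^2
    return np.array([2*(x[0] - 1), 4*(x[1] + 3)])

x = np.zeros(2)
for _ in range(2000):        # fixed-step gradient descent
    x = x - 0.1 * Df(x)

print(x, Df(x))              # x is close to (1, -3) and Df(x) is close to 0
```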

• Second order conditions for an unconstrained local optimum

Lemma 1 (i) Let M be a positive definite n × n matrix. Let S(0, 1)


denote the unit-sphere of Rn . Then minx∈S(0,1) x0 M x > 0.
(ii) Let M be a negative definite n × n matrix. Let S(0, 1) denote the
unit-sphere of Rn . Then maxx∈S(0,1) x0 M x < 0.

Proof : (i) The function ψ : S(0, 1) → R+ defined by ψ(x) = x0 M x for


x ∈ S(0, 1) is continuous and positive for any x ∈ S(0, 1). Since S(0, 1) is
compact, ψ has a minimum on S(0, 1) which is positive, i.e. there exists
x̄ ∈ S(0, 1) which satisfies 0 < ψ(x̄) = x̄0 M x̄ = minx∈S(0,1) x0 M x.
(ii) The proof is similar.

Theorem 11 Let f : S → R, where S ⊆ Rn . Assume x0 is an uncon-


strained local optimum of f . If D2 f (x0 ) is negative definite then x0 is an
unconstrained local maximum. If D2 f (x0 ) is positive definite, then x0 is
an unconstrained local minimum.

Proof : We must have Df (x0 ) = 0. Consider the Taylor expansion


f (x) = f (x0 ) + Df (x0 ) · (x − x0 ) + (1/2)(x − x0 )0 D2 f (x0 )(x − x0 ) + R2 (x, x0 )kx − x0 k2
= f (x0 ) + (1/2)(x − x0 )0 D2 f (x0 )(x − x0 ) + R2 (x, x0 )kx − x0 k2

Assume D2 f (x0 ) is positive definite. When x ≠ x0 , let u = (x − x0 )/kx − x0 k ∈ S(0, 1).
We know that min_{x∈S(0,1)} x0 D2 f (x0 )x = α > 0. Therefore,

f (x) = f (x0 ) + (1/2)(x − x0 )0 D2 f (x0 )(x − x0 ) + R2 (x, x0 )kx − x0 k2
= f (x0 ) + (1/2)kx − x0 k2 u0 D2 f (x0 )u + R2 (x, x0 )kx − x0 k2 , when x ≠ x0
≥ f (x0 ) + (1/2)kx − x0 k2 [α + 2R2 (x, x0 )] , when x ≠ x0
When x is close to x0 but different from x0 , we have α+2R2 (x, x0 ) > 0 and
thus f (x) ≥ f (x0 ). We have that x0 is an unconstrained local minimum.
Similarly, when D2 f (x0 ) is negative definite then x0 is an unconstrained
local maximum.
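In practice one classifies a critical point by testing the definiteness of D2 f (x0 ); for a symmetric matrix, positive (negative) definiteness is equivalent to all eigenvalues being positive (negative). A short sketch (an added example assuming numpy, with a hypothetical f ):

```python
import numpy as np

# f(x) = x1^2 + x2^2: Df(0) = 0 and D^2 f(0) = 2 I.
H = np.array([[2.0, 0.0],
              [0.0, 2.0]])                 # Hessian at the critical point
eig = np.linalg.eigvalsh(H)                # eigenvalues of a symmetric matrix
if np.all(eig > 0):
    print("positive definite: unconstrained local minimum")   # this case here
elif np.all(eig < 0):
    print("negative definite: unconstrained local maximum")
else:
    print("Theorem 11 does not apply")
```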

1 Exercises
Exercise 1 Show that the ball B(x, r) = {y ∈ Rn : ky − xk ≤ r} is a closed
set, where r > 0.

Exercise 2 Let U ⊆ Rn . Show that U is open if, and only if, for any x ∈ U there
exists r > 0 such that B(x, r) ⊆ U .

Exercise 3 For the sets A given below, show that EVERY function f : A → R
is continuous
(a) A ⊆ Rn with just one element
(b) A ⊆ Rn with a finite number of elements
(c) A = Z, considered as a subset of R.

Exercise 4 Determine which of the sets in (a)-(c) from above are compact.

Exercise 5 Let D = {0} ∪ {1/n : n = 1, 2, . . .}. Determine, justifying your answers, if the following statements are true


(a) if f : D → R is continuous, then f has a maximum on D
(b) if f : D → R satisfies f (x) ∈ [0, 1] for all x ∈ D, then f has a maximum
or a minimum on D.
(c) Show that the function f defined by f (x) = x if x 6= 0, and f (0) = 1 is
not continuous on D.

Exercise 6 Let X and Y be subsets of Rn . If X is compact and Y is closed,


then X + Y is closed.

Exercise 7 Let f : R → R be continuous. Assume that it satisfies limx→−∞ f (x) =


0 and limx→+∞ f (x) = 0. Show that f must have a minimum or a maximum
on R.

Chapter 2: Convex Functions
This chapter is devoted to a class of functions from Rn into R called convex
functions, and to a first important property of such functions: any convex
function on an open set is continuous.

2 Basic definitions and properties


A set A in Rn is convex if for any a ∈ A, any b ∈ A, any λ ∈ [0, 1], we have
λa + (1 − λ)b ∈ A.
Let f : A → R be a function defined from A, a nonempty set of Rn , into R.
We require that A is convex. We say that f is convex on A if:

∀a1 ∈ A, ∀a2 ∈ A, ∀λ ∈ [0, 1] , f (λa1 + (1 − λ)a2 ) ≤ λf (a1 ) + (1 − λ)f (a2 ).

The function f is said to be strictly convex if

∀a1 ∈ A, ∀a2 ∈ A, ∀λ ∈ ]0, 1[ , f (λa1 + (1 − λ)a2 ) < λf (a1 ) + (1 − λ)f (a2 ).

The function f : A → R is concave on A if −f is convex. Explicitly, f is


concave on A if

∀a1 ∈ A, ∀a2 ∈ A, ∀λ ∈ [0, 1] , f (λa1 + (1 − λ)a2 ) ≥ λf (a1 ) + (1 − λ)f (a2 ).

It is strictly concave on A if:

∀a1 ∈ A, ∀a2 ∈ A, ∀λ ∈ ]0, 1[ , f (λa1 + (1 − λ)a2 ) > λf (a1 ) + (1 − λ)f (a2 ).
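A randomized numerical probe of the defining inequality can refute convexity (though passing it proves nothing); a small sketch added here as an illustration, assuming numpy:

```python
import numpy as np

def looks_convex(f, dim, trials=10_000, seed=0):
    # Tests f(la + (1-l)b) <= l f(a) + (1-l) f(b) on random segments.
    # One failure disproves convexity; passing is only evidence.
    rng = np.random.default_rng(seed)
    for _ in range(trials):
        a, b = rng.normal(size=dim), rng.normal(size=dim)
        lam = rng.uniform()
        if f(lam*a + (1-lam)*b) > lam*f(a) + (1-lam)*f(b) + 1e-9:
            return False
    return True

print(looks_convex(lambda x: x @ x, 3))          # True: kxk^2 is convex
print(looks_convex(lambda x: np.sin(x[0]), 1))   # False: sin is not convex
```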

Proposition 6 (Jensen inequality) Let U be a nonempty convex set of Rn


and let f : U → R. Then f is convex if, and only if, for any integer p ≥ 2, any
p elements x1 , . . . , xp of U , and any p positive numbers λ1 , . . . , λp whose sum
equals 1,

f (λ1 x1 + . . . + λp xp ) ≤ λ1 f (x1 ) + . . . + λp f (xp ).

Proof : Let f : U → R where U is a convex, nonempty subset of Rn . Assume


f convex. Let x1 , . . . , xp be p elements of U . If p = 2, then obviously f (λ1 x1 +
. . . + λp xp ) ≤ λ1 f (x1 ) + . . . + λp f (xp ).
Assume that Jensen Inequality holds for p − 1. We will show that it holds also
for p. If λp = 1, then the result is trivially true. So, we assume λp 6= 1. Let
sp−1 = λ1 + . . . + λp−1 . We have:
λ1 x1 + . . . + λp xp = sp−1 ((λ1 /sp−1 )x1 + . . . + (λp−1 /sp−1 )xp−1 ) + (1 − sp−1 )xp .

The function f being convex, we have:
f (λ1 x1 + . . . + λp xp ) ≤ sp−1 f ((λ1 /sp−1 )x1 + . . . + (λp−1 /sp−1 )xp−1 ) + (1 − sp−1 )f (xp ).

We have assumed that Jensen Inequality holds for p − 1. Hence


f ((λ1 /sp−1 )x1 + . . . + (λp−1 /sp−1 )xp−1 ) ≤ (λ1 /sp−1 )f (x1 ) + . . . + (λp−1 /sp−1 )f (xp−1 ).

Thus:
f (λ1 x1 + . . . + λp xp ) ≤ λ1 f (x1 ) + . . . + λp f (xp ).
The converse is true by taking p = 2.

Proposition 7 (i) Let f , g be two convex functions from a convex set U, into
R. Then max(f, g) is convex. More generally, consider a collection of convex
functions {fi }i=1,...,I , from U into R. Then max{fi | i = 1, . . . , I} is convex
from U into R.
(ii) If (fi )i∈N is a sequence of convex functions from Rn into R which converges
pointwise, i.e. ∀x ∈ Rn , the sequence (fi (x))i∈N converges in R, then the func-
tion defined for all x ∈ Rn by f (x) = limi→+∞ fi (x) is convex from Rn into R.
(iii) If f : U → R, g : V → R, with f (U ) ⊂ V , are convex, if g is nondecreasing,
then g ◦ f is convex.
(iv) If f, g are convex from a convex set U into R and if λ is a nonnegative real
number, then f + g and (λf ) are convex.

Proof : Exercise

3 Continuity theorems
In this section, we want to prove that any convex function f from an open
set U ⊂ Rn into R is continuous. Let T = {a1 , a2 , . . . , am } ⊂ Rn . The
convex hull of T , denoted by coT , is the set

coT = {x : x = Σ_{i=1}^m λi ai , λi ≥ 0, ∀i, Σ_{i=1}^m λi = 1}

Proposition 8 Let f be a convex function from an open set U of Rn into R.


Assume 0 ∈ U . Let B̄(0, r) ⊂ U . Then f is bounded above on B̄(0, r).

Proof : Since U is open and 0 ∈ U , one can choose α > 0 sufficiently small such that the
convex hull V = co{αe1 , . . . , αen , −αe1 , . . . , −αen }, where the ei are the vectors
of the canonical basis of Rn , is contained in U . V has a nonempty interior, 0

is in the interior of V . Thus, V contains a closed ball B̄(0, r). Any x in V
may be expressed as x = α Σ_{i=1}^n λi ei − α Σ_{i=1}^n λ0i ei with λi ≥ 0, λ0i ≥ 0, ∀i,
Σ_{i=1}^n (λi + λ0i ) = 1. Since f is convex,

f (x) ≤ Σ_{i=1}^n (λi f (αei ) + λ0i f (−αei )) ≤ max{f (αe1 ), . . . , f (αen ), f (−αe1 ), . . . , f (−αen )}.

Therefore f is bounded above on B̄(0, r).

Remark Actually the set V in the proof above is the ball B̄1 (0, α) of the norm
k.k1 . Indeed, if x ∈ V then

x = α (Σi λi ei − Σi λ0i ei ) = α Σi (λi − λ0i )ei = Σi xi ei

with λi ≥ 0, λ0i ≥ 0, ∀i, Σi (λi + λ0i ) = 1. We have

Σi |xi | = α Σi |λi − λ0i | ≤ α Σi (λi + λ0i ) = α

and x ∈ B̄1 (0, α).


Conversely, let x = (x1 , . . . , xn ) ∈ B̄1 (0, α). Then Σi |xi | ≤ α. If x = 0, the
problem is over by taking λ1 = λ01 = 1/2, λi = λ0i = 0, for i ≥ 2. So, take
x ≠ 0. Let I = {i : xi ≥ 0}, J = {i : xi < 0}. Without loss of generality we can
assume I ≠ ∅ (if it is not the case, change x into −x). Denote by |I| the number
of elements of I. Define

λi = xi /α + (1/(2|I|))(1 − (1/α) Σj |xj |), if i ∈ I
λi = 0, if i ∈ J
λ0i = (1/(2|I|))(1 − (1/α) Σj |xj |), if i ∈ I
λ0i = −xi /α, if i ∈ J

One can check that

λi ≥ 0, λ0i ≥ 0, ∀i; 1 = Σi (λi + λ0i ); x = α (Σi λi ei − Σi λ0i ei )

Theorem 12 If f is convex from an open set U ⊂ Rn into R, then it is con-


tinuous in U .

Proof : Let x0 ∈ U, V = U − {x0 }. Observe that V is convex, open and 0 ∈ V .
Consider the function x ∈ V → h(x) = f (x + x0 ) − f (x0 ). Obviously h is
convex, and f is continuous at x0 if, and only if, h is continuous at 0. Moreover
h(0) = 0. Since V is open and 0 ∈ V , there exists a closed ball B̄(0, r) ⊂ V . Let
{xn }n ⊂ V converge to 0. For n large enough, {xn }n ⊂ B̄(0, r). Let S denote
the sphere of radius r: {x ∈ Rn : kxk = r}. Define y n = rxn /kxn k, z n = −rxn /kxn k.
Then y n ∈ S, z n ∈ S, ∀n. One can check that

xn = (kxn k/r) y n , 0 = (r/(r + kxn k)) xn + (kxn k/(r + kxn k)) z n

Since h is convex (and bounded above on S by Proposition 8), we then get

h(xn ) ≤ (kxn k/r) h(y n ) + (1 − kxn k/r) h(0) = (kxn k/r) h(y n ) ≤ (kxn k/r) sup_{y∈S} h(y)

0 = h(0) ≤ (r/(r + kxn k)) h(xn ) + (kxn k/(r + kxn k)) h(z n )
⇔ 0 ≤ h(xn ) + (kxn k/r) h(z n ) ≤ h(xn ) + (kxn k/r) sup_{y∈S} h(y).
Let n → +∞. Then kxn k → 0 and

lim sup_n h(xn ) ≤ 0 = h(0) and 0 ≤ lim inf_n h(xn ).

Summing up,

0 = h(0) ≤ lim inf_n h(xn ) ≤ lim sup_n h(xn ) ≤ 0 = h(0)

Hence 0 = h(0) = lim_n h(xn ).

The following theorem is very important and useful. We give it without


proof.
Theorem 13 Let f be a convex differentiable function from Rn into R. Then
for any x ∈ Rn , any y ∈ Rn , we have f (y) − f (x) ≥ Df (x) · (y − x).
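The inequality of Theorem 13 says the graph of f lies above each of its tangent hyperplanes; it is easy to check on an example (an added sketch assuming numpy; f (x) = kxk2 is a hypothetical choice):

```python
import numpy as np

f  = lambda x: x @ x        # convex and differentiable on R^n
Df = lambda x: 2*x          # its gradient

rng = np.random.default_rng(1)
for _ in range(1000):
    x, y = rng.normal(size=4), rng.normal(size=4)
    # for this f: f(y) - f(x) - Df(x).(y - x) = ky - xk^2 >= 0
    assert f(y) - f(x) >= Df(x) @ (y - x) - 1e-9
print("gradient inequality verified on all samples")
```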

4 Exercises
Exercise 8 1. Show that the segment [x, y] = {z : z = λx+(1−λ)y, λ ∈ [0, 1]}
is closed.
2. Let A be a non-empty set of Rn . Show that A is convex if, and only if,
for any x ∈ A, any y ∈ A, x/2 + y/2 ∈ A.

Exercise 9 Let f : U → R be a continuous function on a convex, nonempty
set U of Rn . Show that f is convex on U if, and only if, ∀x ∈ U, ∀y ∈ U ,
f ((x + y)/2) ≤ (f (x) + f (y))/2.

Exercise 10 A function f : Rn → R is positively homogeneous if ∀x ∈ Rn , ∀λ >
0, f (λx) = λf (x). An example of a positively homogeneous function which is not
simply linear is f (x) = |x|, ∀x ∈ R.
a) Show that a positively homogenous function f is convex if, and only if, ∀x ∈
Rn , ∀y ∈ Rn , f (x + y) ≤ f (x) + f (y).
b) Show also that if f is positively homogeneous convex function on Rn then
f (λ1 x1 + . . . + λm xm ) ≤ λ1 f (x1 ) + . . . + λm f (xm ),
∀(x1 , . . . , xm ) ∈ (Rn )m , ∀(λ1 , . . . , λm ) with λi > 0, ∀i = 1, . . . , m.

Exercise 11 a) Show that the function ln is concave.


b) Show that, if p, q are two positive real numbers such that 1/p + 1/q = 1 and
if a, b are two strictly positive numbers, then one has:
a^{1/p} b^{1/q} ≤ a/p + b/q.
c) Prove that, for all x = (x1 , . . . , xn ) and for all y = (y1 , . . . , yn ), one has:
Σ_{i=1}^n |xi yi | ≤ (Σ_{i=1}^n |xi |^p )^{1/p} (Σ_{i=1}^n |yi |^q )^{1/q} (1)

where p, q are two positive numbers verifying 1/p + 1/q = 1.


Hint: Define ai = |xi |^p /(Σ_{j=1}^n |xj |^p ), bi = |yi |^q /(Σ_{j=1}^n |yj |^q ) and apply b).

d) Prove that:
(Σ_{i=1}^n |xi + yi |^p )^{1/p} ≤ (Σ_{i=1}^n |xi |^p )^{1/p} + (Σ_{i=1}^n |yi |^p )^{1/p}

Hint:
Observe that:
Σ_{i=1}^n |xi + yi |^p ≤ Σ_{i=1}^n |xi + yi |^{p−1} |xi | + Σ_{i=1}^n |xi + yi |^{p−1} |yi |.

Apply (1) to each sum of the second member of the previous inequality with p
and q = p/(p − 1).
e) Deduce from d) that the function:
x = (x1 , . . . , xn ) → (Σ_{i=1}^n |xi |^p )^{1/p}

is convex.

Exercise 12 Let f be a convex function from Rn into R. Let x, y be two
elements in Rn .
Show that the function g : t ∈ [0, 1] → f (tx + (1 − t)y) is convex and continuous.

Exercise 13 Let A be a convex set of Rn and let f be a function from A into


R. Show that f is convex on A if, and only if, for all a and all a0 in A, the
function ϕ defined for all t ∈ [0, 1] by ϕ(t) = f (ta + (1 − t)a0 ) is convex in [0, 1].

Exercise 14 a) Let f be a convex, differentiable function on the interval I of


R. Let t < t0 be two points in I. Let x ∈ ]t, t0 [. Show that:
(f (x) − f (t))/(x − t) ≤ (f (t0 ) − f (t))/(t0 − t) ≤ (f (x) − f (t0 ))/(x − t0 ). (2)
Let t1 , t2 in I verify t1 < t2 . Let x1 , x2 , two points in ]t1 , t2 [, verify x1 < x2 .
Deduce from (2) that
(f (x1 ) − f (t1 ))/(x1 − t1 ) ≤ (f (x2 ) − f (t1 ))/(x2 − t1 ) ≤ (f (t1 ) − f (t2 ))/(t1 − t2 ) ≤ (f (x2 ) − f (t2 ))/(x2 − t2 ). (3)
Show that f 0 (t1 ) ≤ f 0 (t2 ).
b) We now assume that f 0 is nondecreasing. Show that f is convex.
Hint: Take two points t1 , t2 ∈ I, t1 < t2 and λ ∈ [0, 1]. Use the Mean Value
theorem (see below) to prove that:

f (λt1 + (1 − λ)t2 ) − f (t1 ) = (1 − λ)(t2 − t1 )f 0 (µ1 )


f (t2 ) − f (λt1 + (1 − λ)t2 ) = λ(t2 − t1 )f 0 (µ2 )

with t1 ≤ µ1 ≤ λt1 + (1 − λ)t2 ≤ µ2 ≤ t2 .


Conclude that f is convex.
c) Assume that f is twice differentiable in I. Show that f is convex if, and only
if, f 00 (t) ≥ 0, ∀t ∈ I.
d) Let f be a function which is differentiable on an open convex set A of Rn .
Prove that it is convex on A if, and only if, for all a, a0 in A, the function g
defined for t ∈ [0, 1] by:
g(t) = Σ_{i=1}^n (ai − a0i ) ∂f /∂xi (ta + (1 − t)a0 )

is a nondecreasing function, where (ai ), (a0i ) are the components of a and a0 .


Hint: Consider the function ϕ(t) = f (ta + (1 − t)a0 ) for t ∈ [0, 1].
e) Let f be a twice differentiable function on an open convex set A of Rn . Show
that f is convex on A if, and only if, for any a0 ∈ A, the following quadratic
form:
Ψ(h) = Σ_{i,j} hi hj ∂ 2 f /∂xi ∂xj (a0 )

is nonnegative, i.e., Ψ(h) ≥ 0, ∀h = (h1 , . . . , hn ).
Hint: Take a, a0 ∈ A. Consider the function ϕ(t) = f (ta+(1−t)a0 ) for t ∈ [0, 1].

Theorem 14 (Mean value theorem) Let f be a function from [a, b] into R,


where a < b. We assume that f is continuous on [a, b] and differentiable on
]a, b[. There exists a point c ∈ ]a, b[ such that f (b) − f (a) = (b − a)f 0 (c).

Chapter 3: Convex Optimization
In this chapter we want to solve the problem min{f (x) | x ∈ C}, where f is
a convex function on Rn , and C is a convex, nonempty subset of Rn . A point
x∗ ∈ C is a global solution, or more simply a solution to this problem, or a
minimizer of f on C, if f (x∗ ) ≤ f (x), ∀x ∈ C.
The set C of constraints will be defined by a finite number of constraints
fi (x) ≤ 0, ∀i ∈ I. We give necessary and sufficient criteria to check that a point
is a minimizer of a convex function. These criteria are known as Kuhn-Tucker
Conditions.
Let I be a finite set. Then card(I) denotes the number of elements of I.
H is a hyperplane in Rn if there exists p ∈ Rn , p 6= 0 and α ∈ R such that
H = {x ∈ Rn : p · x = α}.

5 Separation theorems
Proposition 9 [First separation theorem] Let A and B be two nonempty dis-
joint convex subsets of Rn . Then there exist α, β with α ≤ β, and p ∈ Rn , p ≠ 0,
such that p · a ≤ α ≤ β ≤ p · b, for all a ∈ A, all b ∈ B.

Proposition 10 [Second separation theorem] Let A and B be two nonempty


disjoint closed convex subsets of Rn . If one of them is compact, then there
exist α, β, α < β and p ∈ Rn , p 6= 0, such that p · a ≤ α < β ≤ p · b, for all
a ∈ A, all b ∈ B.

P is polyhedral if there exist (a1 , . . . , am ) ∈ (Rn )m such that P = {x : x = Σ_{i=1}^m λi ai , λi ∈ R+ , ∀i = 1, . . . , m}.

Proposition 11 Let P and C be nonempty convex sets such that P is polyhe-


dral and P ∩ C = ∅. Then there exists p ∈ Rn , p 6= 0 such that

p · x ≤ 0 ≤ p · y, ∀x ∈ P, ∀y ∈ C

and there exists z ∈ C such that p · z > 0.

6 Kuhn-Tucker Conditions
6.1 Necessary and sufficient condition for optimality
The aim of this section is to give the necessary and sufficient conditions for a
point to be an optimal solution to Problem (P ):

(P ) Minimize f0 (x) under the constraints: fi (x) ≤ 0, ∀i ∈ I; gi (x) ≤ 0, ∀i ∈ J; gi (x) = 0, ∀i ∈ K,

where f0 : Rn → R is a convex function, I, J and K are finite and possibly
empty sets, for all i ∈ I, fi is a convex, non-affine function from Rn into R, and for
all i ∈ J ∪ K, gi is a non-null affine function.
The function f0 is called the objective function. A feasible point is a point
x ∈ Rn that satisfies all the constraints. An optimal solution to (P ), or simply
a solution to (P ), is a feasible point x̄ such that for all feasible points x, f0 (x) ≥
f0 (x̄).

Lemma 2 (i) Let f be a linear function on Rn . If f (x) ≥ 0 for any x then


f = 0.
(ii) Hence, if g is an affine function which satisfies g(x) ≥ 0 for any x, then
g equals a nonnegative constant.

Proof : (i) Suppose f 6= 0. There exists x such that f (x) > 0. But we have a
contradiction 0 > −f (x) = f (−x) ≥ 0. Hence f = 0.
(ii) We can write ∀x, g(x) = f (x) + b where f is linear and b is a real
constant. Let x ∈ Rn . We have f (x) + b ≥ 0. Let λ > 0. We also have
λf (x) + b = f (λx) + b ≥ 0, ∀λ > 0. This is equivalent to f (x) + b/λ ≥ 0 for all
λ > 0. Let λ → +∞. We get f (x) ≥ 0. But x has been arbitrarily chosen.
That means f (x) ≥ 0, ∀x. Thus, f = 0 and g(x) = b, ∀x. And b ≥ 0.

Lemma 3 Let I, J and K be finite possibly empty sets, and for all i ∈ I, fi
is a convex, non-affine function from Rn into R, and for all i ∈ J ∪ K, gi is
non-null affine function. Assume there exists x0 such that
(
gi (x0 ) ≤ 0, ∀i ∈ J
gi (x0 ) = 0, ∀i ∈ K.

If the system: 
 fi (x) < 0, ∀i ∈ I

gi (x) ≤ 0, ∀i ∈ J

gi (x) = 0, ∀i ∈ K.

has no solution, then there exist nonnegative real scalars (λi )i∈I , (µi )i∈J , and
real numbers (µi )i∈K , at least one of the (λi )i∈I is not zero, which verify:
Σ_{i∈I} λi fi (x) + Σ_{i∈J} µi gi (x) + Σ_{i∈K} µi gi (x) ≥ 0, ∀x.

Proof : Let p = card(I), q = card(J), r = card(K), and


Z = {(zi )i∈I∪J∪K ∈ Rp+q+r : ∃x, ∀i ∈ I, fi (x) < zi and ∀i ∈ J ∪ K, gi (x) = zi }

The set Z is convex, nonempty and Z ∩ (Rp+q − × {0Rr }) = ∅. From Proposition


11 there exist real scalars (λi )i∈I , (µi )i∈J∪K , not all of them equal to zero,
which verify:

Σ_{i∈I} λi zi + Σ_{i∈J∪K} µi zi ≥ Σ_{i∈I} λi ζi + Σ_{i∈J} µi ζi , ∀z ∈ Z, ∀ζ ∈ Rp+q−
and there exists z ∈ Z such that Σ_{i∈I} λi zi + Σ_{i∈J∪K} µi zi > 0.
If for some i, λi < 0, then letting zi tend to +∞, we get a contradiction. Thus
λi ≥ 0, ∀i ∈ I. If for some i ∈ J, one has µi < 0, then letting ζi tend to −∞,
we have another contradiction. Hence, µi ≥ 0, ∀i ∈ J. Let ε > 0 and x ∈ Rn .
Define z ∈ Z by zi = fi (x) + ε, ∀i ∈ I, and zi = gi (x), ∀i ∈ J ∪ K.
We have Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x) + ε Σ_{i∈I} λi ≥ 0. Let ε tend to zero.
We obtain that Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x) ≥ 0, ∀x ∈ Rn .

To end the proof, it remains to show that at least one of the (λi )i∈I is strictly
positive. Assume the contrary. We then have Σ_{i∈J∪K} µi gi (x) ≥ 0, ∀x ∈
Rn , and hence Σ_{i∈J∪K} µi gi (x0 ) = 0. The affine function Σ_{i∈J∪K} µi gi has a
minimum in Rn . From Lemma 2, it must be equal to zero. But since Σ_{i∈I} λi zi +
Σ_{i∈J∪K} µi zi > 0 for some z ∈ Z, and since all the (λi )i∈I are equal to zero, there
exists x such that Σ_{i∈J∪K} µi gi (x) > 0. That contradicts that Σ_{i∈J∪K} µi gi is
equal to zero.

The Problem (P ) satisfies Slater Condition (S) if there exists x0 such that:

 fi (x0 ) < 0, ∀i ∈ I

(S) gi (x0 ) ≤ 0, ∀i ∈ J

gi (x0 ) = 0, ∀i ∈ K.

Proposition 12 Consider Problem (P ). Assume that (P ) satisfies Slater Condition (S). If (P ) has an optimal solution x̄, then there exist scalars (λi )i∈I ,
(µi )i∈J∪K such that:
(i) ∀i ∈ I, λi ≥ 0, λi fi (x̄) = 0,
∀i ∈ J, µi ≥ 0, µi gi (x̄) = 0.
(ii)

f0 (x̄) + Σ_{i∈I} λi fi (x̄) + Σ_{i∈J∪K} µi gi (x̄) ≤ f0 (x) + Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x)

for any x.
(iii) If f0 , (fi )i∈I , (gi )i∈J∪K are differentiable, then

0 = Df0 (x̄) + Σ_{i∈I} λi Dfi (x̄) + Σ_{i∈J∪K} µi Dgi (x̄)

Proof : Let α = f0 (x̄). Then the following system has no solution:

f0 (x) < α; fi (x) < 0, ∀i ∈ I; gi (x) ≤ 0, ∀i ∈ J; gi (x) = 0, ∀i ∈ K.

The Slater condition allows us to apply Lemma 3.


There exist λ0 , (λi )i∈I , (µi )i∈J∪K such that λ0 ≥ 0, λi ≥ 0, ∀i ∈ I, µi ≥ 0,
∀i ∈ J, at least one of the λ0 , (λi )i∈I is strictly positive, and
λ0 (f0 (x) − α) + Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x) ≥ 0, ∀x. (4)
We claim that λ0 > 0. Indeed, if λ0 = 0, then ∀x, Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x) ≥
0. Moreover, since there exists at least one i ∈ I such that λi > 0, we deduce
Σ_{i∈I} λi fi (x0 ) + Σ_{i∈J∪K} µi gi (x0 ) < 0, a contradiction. Thus λ0 > 0 and one
can suppose λ0 = 1.
Define the convex function h by
h(x) = f0 (x) + Σ_{i∈I} (λi fi )(x) + Σ_{i∈J∪K} µi gi (x), ∀x ∈ Rn .

The inequality (4) can be equivalently rewritten as h(x) ≥ α, ∀x. But h(x̄) − α =
Σ_{i∈I} λi fi (x̄) + Σ_{i∈J∪K} µi gi (x̄) ≤ 0. Hence, α = h(x̄). Thus Σ_{i∈I} λi fi (x̄) +
Σ_{i∈J∪K} µi gi (x̄) = 0 and since x̄ is feasible, one has ∀i ∈ I, λi fi (x̄) = 0 and
∀i ∈ J, µi gi (x̄) = 0. We have proved Assertion (i). Now, h(x) ≥ α, ∀x, is
equivalent to

f0 (x̄) + Σ_{i∈I} λi fi (x̄) + Σ_{i∈J∪K} µi gi (x̄) ≤ f0 (x) + Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x)

for any x. We have proved Assertion (ii). Statement (iii) is obvious when
f0 , (fi )i∈I , (gi )i∈J∪K are differentiable. The proof is now complete.

The real numbers (λi )i∈I , (µi )i∈J∪K are called Lagrange parameters, La-
grange multipliers or Kuhn-Tucker coefficients or more simply multipliers of
Problem (P ).
The following conditions (i), (ii) and (iii) are called Kuhn-Tucker Conditions
for Problem (P ).
We say that x̄, (λi )i∈I , (µi )i∈J∪K satisfy Kuhn-Tucker Conditions of Problem
(P ) if they satisfy Conditions (i), (ii) and (iii):
(i) ∀i ∈ I, λi ≥ 0, fi (x̄) ≤ 0, λi fi (x̄) = 0,
∀i ∈ J, µi ≥ 0, gi (x̄) ≤ 0, µi gi (x̄) = 0.
(ii) ∀i ∈ K, gi (x̄) = 0.
(iii) f0 (x̄) + Σ_{i∈I} λi fi (x̄) + Σ_{i∈J∪K} µi gi (x̄) ≤ f0 (x) + Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x)
for any x.

As particular case of Proposition 12, we get the following result when the
problem is without convex constraints.

Corollary 2 Consider Problem (P ) without convex constraints, i.e.


(
gi (x) ≤ 0, ∀i ∈ J
min f0 (x) under the constraints
gi (x) = 0, ∀i ∈ K.

where f0 : Rn → R is convex, and for all i ∈ J ∪ K, gi is affine.


For this problem, Slater Condition is: ∃x0 such that gi (x0 ) ≤ 0, ∀i ∈ J, and
gi (x0 ) = 0, ∀i ∈ K.
If (P ) has an optimal solution x̄, then there exist scalars (µi )i∈J∪K verifying:
(i) ∀i ∈ J, µi ≥ 0, µi gi (x̄) = 0,
(ii) f0 (x̄) + Σ_{i∈J∪K} µi gi (x̄) ≤ f0 (x) + Σ_{i∈J∪K} µi gi (x) for any x

Proof : Obvious

Remark 1 (i) The Slater condition is very important even in the one-dimensional
case. Consider the following example:

min{f (x) = x | x2 ≤ 0}.

The problem has a unique solution x = 0. The Slater condition is not satisfied.
There exists no λ ≥ 0 such that 0 = f (0) + λg(0) ≤ f (x) + λg(x) = x + λx2 for any x ∈ R.
(ii) It is important to notice that Slater Condition is not necessary to obtain
Kuhn-Tucker Conditions. In the previous example, Slater Condition is not
satisfied and there is no Kuhn-Tucker coefficient. Now replace this problem
by an equivalent problem which is min{f (x) = x | g(x) = |x| ≤ 0}. As before,
Slater Condition does not hold. The unique solution is always x = 0. Let λ = 1,

we have successively g(0) = 0, λg(0) = 0 and 0 = f (0) + λg(0) ≤ f (x) + λg(x) =
x + |x|, for any x ∈ R. In other words, Kuhn-Tucker Conditions hold.
(iii) In the previous example, one can check that Kuhn-Tucker Conditions are
sufficient for 0 to be a solution. This result is quite general as it will be proved
in the next proposition.

Proposition 13 Let x̄, (λi )i∈I , (µi )i∈J∪K verify Kuhn-Tucker Conditions for
Problem (P ). Then x̄ is a solution to (P ).

Proof : Let, ∀x ∈ Rn , h(x) = f0 (x) + Σ_{i∈I} (λi fi )(x) + Σ_{i∈J∪K} µi gi (x). Condition (iii) is equivalent to h(x) ≥ h(x̄), ∀x. Combining conditions (i) and (ii), we
get h(x̄) = f0 (x̄). Moreover, if x satisfies the constraints of Problem (P ), then
h(x) ≤ f0 (x). We obtain f0 (x) ≥ h(x) ≥ h(x̄) = f0 (x̄) for any feasible x.

Theorem 15 (Kuhn-Tucker) Assume that Slater Condition is satisfied for


Problem (P ). Then x̄ is a solution to (P ) if, and only if, there exist coefficients
(λi )i∈I , (µi )i∈J∪K which, together with x̄, satisfy Kuhn-Tucker Conditions for
Problem (P ).

Proof : The statement follows from Proposition 12 and Proposition 13.
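The theorem can be checked numerically on a toy convex program (an added illustration, not one of the exercises; it assumes Python with scipy, and the problem data are hypothetical):

```python
import numpy as np
from scipy.optimize import minimize

# minimize f0(x) = (x1-2)^2 + (x2-2)^2 subject to g(x) = x1 + x2 - 2 <= 0
f0 = lambda x: (x[0]-2)**2 + (x[1]-2)**2
g  = lambda x: x[0] + x[1] - 2

res  = minimize(f0, x0=np.zeros(2),
                constraints=[{"type": "ineq", "fun": lambda x: -g(x)}])
xbar = res.x                                # expected solution: (1, 1)

# Kuhn-Tucker: Df0(xbar) + mu Dg(xbar) = 0 with mu >= 0 and mu g(xbar) = 0
Df0 = np.array([2*(xbar[0]-2), 2*(xbar[1]-2)])
Dg  = np.array([1.0, 1.0])
mu  = -(Df0 @ Dg) / (Dg @ Dg)               # multiplier, here close to 2
print(xbar, mu, mu * g(xbar))               # ~(1, 1), ~2, ~0
```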

We now show how the Kuhn-Tucker coefficients and the optimal solutions
can be characterized in terms of the saddle-points of a certain concave-convex
function.

6.2 Lagrangian of Problem (P )


The Lagrangian of Problem (P ) is the function L : Rp+ × Rq+ × Rr−q × Rn → R
defined, for all (λ, µ, x) = ((λi )i∈I , (µi )i∈J , (µi )i∈K , x), by

L(λ, µ, x) = f0 (x) + Σ_{i∈I} λi fi (x) + Σ_{i∈J} µi gi (x) + Σ_{i∈K} µi gi (x).

where p = card(I), q = card(J) and r = card(J ∪ K).


We say that (λ̄, µ̄, x̄) is a saddle-point of L if, for all (λ, µ) ∈ Rp+ × Rq+ × Rr−q
and all x ∈ Rn ,

L(λ, µ, x̄) ≤ L(λ̄, µ̄, x̄) ≤ L(λ̄, µ̄, x).

Theorem 16 (i) If (λ̄, µ̄, x̄) is a saddle-point of L, then x̄ is a solution to
(P ) and (λ̄, µ̄) are the Kuhn-Tucker coefficients associated with x̄. Moreover,
L(λ̄, µ̄, x̄) = f0 (x̄).
(ii) If (λ̄, µ̄, x̄) satisfies Kuhn-Tucker Conditions for Problem (P ), then (λ̄, µ̄, x̄)
is a saddle-point of L.

Proof : (i) Let (λ̄, µ̄, x̄) be a saddle-point of L. By definition of a saddle-point,
one has:

L(λ̄, µ̄, x̄) = sup_{(λ,µ)∈Rp+ ×Rq+ ×Rr−q} L(λ, µ, x̄) = sup_{(λ,µ)∈Rp+ ×Rq+ ×Rr−q} { f0 (x̄) + Σ_{i∈I} λi fi (x̄) + Σ_{i∈J∪K} µi gi (x̄) }.

If for some i we had fi (x̄) > 0, this supremum would equal +∞. Hence
fi (x̄) ≤ 0, ∀i ∈ I. If fi (x̄) < 0, the supremum is reached by taking λi = 0, so
λ̄i fi (x̄) = 0. The same assertion holds for gi (x̄), i ∈ J. For i ∈ K, if gi (x̄) ≠ 0
the supremum would equal +∞. Hence gi (x̄) = 0, ∀i ∈ K.
On the other hand, we have:

L(λ̄, µ̄, x̄) = inf_{x∈Rn} L(λ̄, µ̄, x)

which is equivalent to Condition (iii); thus, we obtain the Kuhn-Tucker Conditions. By Proposition 13, x̄ is a solution to Problem (P ). One can easily check
that L(λ̄, µ̄, x̄) = f0 (x̄).
(ii) Conversely, assume that (λ̄, µ̄, x̄) satisfies Kuhn-Tucker Conditions for
Problem (P ). In particular we have:

∀x ∈ Rn , f0 (x) + Σ_{i∈I} (λ̄i fi )(x) + Σ_{i∈J∪K} µ̄i gi (x) ≥ f0 (x̄) + Σ_{i∈I} (λ̄i fi )(x̄) + Σ_{i∈J∪K} µ̄i gi (x̄).

In other words, L(λ̄, µ̄, x) ≥ L(λ̄, µ̄, x̄), ∀x ∈ Rn .
But fi (x̄) ≤ 0, ∀i ∈ I, gi (x̄) ≤ 0, ∀i ∈ J, and gi (x̄) = 0, ∀i ∈ K. Thus Σ_{i∈I} λi fi (x̄) +
Σ_{i∈J∪K} µi gi (x̄) ≤ 0, ∀(λ, µ) ∈ Rp+ × Rq+ × Rr−q , and hence L(λ, µ, x̄) ≤ f0 (x̄) =
L(λ̄, µ̄, x̄), by Kuhn-Tucker Conditions (i), (ii).

6.3 Duality in convex programming


Consider the function g : Rp+ × Rq+ × Rr−q → R defined by g(λ, µ) = inf_{x∈Rn} (f0 (x) +
Σ_{i∈I} λi fi (x) + Σ_{i∈J∪K} µi gi (x)). One can easily show that g is concave with
respect to (λ, µ). We call dual program of (P ) the program (P ∗ ):

max_{(λ,µ)∈Rp+ ×Rq+ ×Rr−q} g(λ, µ).

Theorem 17 (Duality Theorem) (i) If the Lagrangian of Problem (P ) has a saddle-point (λ̄, µ̄, x̄),
then x̄ is a solution to (P ) and (λ̄, µ̄) is a solution to (P ∗ ). Moreover, one has
g(λ̄, µ̄) = f0 (x̄).
(ii) Conversely, if x̄ is a solution to (P ) and (λ̄, µ̄) ∈ Rp+ × Rq+ × Rr−q are such
that g(λ̄, µ̄) = f0 (x̄), then (λ̄, µ̄, x̄) is a saddle-point of L.

Proof : (i) If (λ̄, µ̄, x̄) is a saddle-point of L, from Theorem 16, x̄ is a solution to
(P ) and one has fi (x̄) ≤ 0, ∀i ∈ I, gi (x̄) ≤ 0, ∀i ∈ J, gi (x̄) = 0, ∀i ∈ K. Hence,
g(λ, µ) ≤ L(λ, µ, x̄) ≤ f0 (x̄), ∀(λ, µ) ∈ Rp+ × Rq+ × Rr−q . But from the
very definition of the saddle-point, one has g(λ̄, µ̄) ≥ L(λ̄, µ̄, x̄) = f0 (x̄). In
other words, (λ̄, µ̄) is a solution to (P ∗ ) and g(λ̄, µ̄) = f0 (x̄).
(ii) If g(λ̄, µ̄) = f0 (x̄) with x̄ a solution to (P ), one has:

f0 (x̄) = g(λ̄, µ̄) ≤ f0 (x) + Σ_{i∈I} (λ̄i fi )(x) + Σ_{i∈J∪K} µ̄i gi (x), ∀x ∈ Rn .

Taking x = x̄, one gets Σ_{i∈I} λ̄i fi (x̄) + Σ_{i∈J∪K} µ̄i gi (x̄) ≥ 0; since x̄ is feasible,
this sum is also ≤ 0, hence it is zero. Thus λ̄i fi (x̄) = 0, ∀i ∈ I and µ̄i gi (x̄) = 0,
∀i ∈ J. Since we have gi (x̄) = 0, ∀i ∈ K, Kuhn-Tucker Conditions (i) and (ii)
are fulfilled. On the other hand, we have

g(λ̄, µ̄) = f0 (x̄) = f0 (x̄) + Σ_{i∈I} λ̄i fi (x̄) + Σ_{i∈J∪K} µ̄i gi (x̄) ≤ f0 (x) + Σ_{i∈I} λ̄i fi (x) + Σ_{i∈J∪K} µ̄i gi (x), ∀x ∈ Rn .

The Kuhn-Tucker conditions (i), (ii) and (iii) are satisfied. From Assertion (ii)
of Theorem 16, (λ̄, µ̄, x̄) is a saddle-point of L.
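For the toy program used after Theorem 15, the dual function can be computed in closed form: L(µ, x) = kx − ck2 + µ(a · x − b) is minimized over x at x = c − µa/2, giving g(µ) = −µ2 kak2 /4 + µ(a · c − b). A short numerical check (an added sketch assuming numpy, with the same hypothetical data):

```python
import numpy as np

c, a, b = np.array([2.0, 2.0]), np.array([1.0, 1.0]), 2.0
g = lambda mu: -mu**2 * (a @ a) / 4 + mu * (a @ c - b)   # the dual function

mus     = np.linspace(0, 5, 501)
mu_star = mus[np.argmax(g(mus))]
print(mu_star, g(mu_star))   # ~2 and ~2 = f0(xbar): no duality gap here
```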

7 Exercises
Exercise 15 Let f be a convex function from Rn into R. Consider the problem

(P ) min f (x)
x∈U

where U is a nonempty convex set of Rn . If x̄ is a solution to (P ), we say that x̄


is a global solution.
(i) Show that the set of solutions to (P ) is convex.
(ii) Recall that x̄ is a local solution to (P ) if there exists an open ball B centered
at x̄ such that ∀x ∈ B ∩ U , f (x) ≥ f (x̄). Show that any local solution is a global
solution.

Exercise 16 Consider the following problem (P ):

(P ) min(x2 + y 2 ) under the constraint 2x + y ≤ −4.

(i) Solve (P ).
(ii) Write its dual program (P ∗ ).
(iii) Solve (P ∗ ).
(iv) Check the duality theorem.

Exercise 17

min(x + y) under the constraint x2 + y 2 ≤ 1.

Exercise 18 Let p ∈ R be given. Solve


min(x2 + (y − p)2 ) under the constraints x2 + y 2 ≤ 1 and x ≤ y.

Give a geometric interpretation of the results.

Exercise 19 (i) Solve:

min(x2 + y 2 + (z − 2)2 ) under the constraints x2 /a2 + y 2 /b2 + z 2 ≤ 1 and y − p = 0,

where a > 0, b > 0 and −b < p < b.


(ii) Check the duality theorem.

Exercise 20 (Farkas Lemma) Let f1 , . . . , fp , g be linear forms on Rn . Notic-


ing that the following claim

fk (x) ≥ 0, ∀k = 1, . . . , p ⇒ g(x) ≥ 0

is equivalent to say that 0 solves

min g(x) under the constraints fk (x) ≥ 0, ∀k = 1, . . . , p

Use Corollary 2 to show that this is equivalent to: ∃λk ≥ 0, ∀k = 1, . . . , p, such that g = Σ_{k=1}^p λk fk .

Exercise 21 (Linear Programming) Let c ∈ Rn , b ∈ Rm and A be a (m×n)


real matrix. Prove that x̄ solves the problem

min c · x under the constraints Ax ≥ b, x ≥ 0

if, and only if, there exist p ∈ Rm+ , q ∈ Rn+ such that tc = tp · A + tq, (Ax̄ − b) ≥
0, x̄ ≥ 0, p · (Ax̄ − b) = 0, p · Ax̄ = c · x̄, where tp, tq, tc are the transposes of
p, q, c.

Exercise 22 Consider the function u : R2+ → R+ defined by u(x1 , x2 ) = x1^α x2^β ,
where (α, β) ∈ R2++ and α + β = 1.
a) Show that u is concave and continuous on R2+ and differentiable on R2++ .
b) For ω ∈ R2+ \ {0} and p ∈ R2++ , we want to solve the following problem:

(P ) Maximize u(x1 , x2 ) under the constraints: p1 x1 + p2 x2 ≤ p · ω, x1 ≥ 0, x2 ≥ 0.

i) Prove that this problem has a solution.
ii) Show that this solution is in R2++ .
iii) Find a solution (x1 , x2 , λ) to the following system:

α x1^{α−1} x2^β = λp1
β x1^α x2^{β−1} = λp2
p1 x1 + p2 x2 = p · ω
x1 > 0, x2 > 0, λ > 0.

iv) Show that this solution (x1 , x2 , λ) verifies Kuhn-Tucker’s Conditions of


Problem (P ), and hence (x1 , x2 ) solves Problem (P ).

The following chapters will not be taught in the PreMaster program

Chapter 4: Linear Programming
We say that f : Rn → R is a linear function (or a linear functional) if it is of
the form
f (x) = a1 x1 + a2 x2 + . . . + an xn + b
where a1 , . . . , an , b are real constants.

Proposition 14 (Minkowski-Farkas’ lemma) Let f 1 , . . . , f n and g be lin-


ear functionals defined on Rm . In order that the following implication holds:

f k (x) ≤ 0 , ∀k = 1, . . . , n ⇒ g(x) ≤ 0

it is necessary and sufficient that there exist λ1 , . . . , λn ∈ R+ such that g = Σ_{k=1}^n λk f k .

Proof : This condition is equivalent to say that 0 solves the problem

min(−g(x)) subject to f k (x) ≤ 0, ∀k = 1, . . . , n

The Slater condition is satisfied: f k (0) = 0, ∀k. Therefore, there exist λ1 , . . . , λn ,
λk ≥ 0, ∀k, which satisfy 0 ≤ −g(x) + Σ_{k=1}^n λk f k (x), ∀x ∈ Rm . Since
g, f 1 , . . . , f n are linear, the previous inequality implies −g + Σ_{k=1}^n λk f k = 0
(apply Lemma 2).
The converse is obvious.

Minkowski-Farkas’ Lemma, fundamental for the duality theory in Linear


Programming, has various corollaries. The first one is simply a translation in
terms of matrices of Proposition 14.

Corollary 3 Let A be a (m × n)-matrix and b ∈ Rm . The equation Ax = b has


a solution x ≥ 0 in Rn if and only if the following implication is true

p0 A ≤ 0, p ∈ Rm ⇒ p · b ≤ 0.

Proof : (i) Assume x ≥ 0 verifies Ax = b and p verifies p0 A ≤ 0. Then 0 ≥


p0 Ax = p0 b = p · b.
(ii) The converse is a consequence of Proposition 14

Corollary 4 Let f 1 , . . . , f q , h1 , . . . , hr and g be linear functionals defined on


Rm . In order that the following implication holds:
n o
f k (x) = 0 ∀k = 1, . . . , q, hk (x) ≤ 0 ∀k = 1, . . . , r ⇒ g(x) ≤ 0

it is necessary and sufficient that there exist µ1 , . . . , µq in R, ν1 , . . . , νr in R+


such that g = Σ_{k=1}^q µk f k + Σ_{k=1}^r νk hk .

Proof : Apply Proposition 14 to the linear functionals f k , k = 1, . . . , q, −f k , k =
1, . . . , q, hk , k = 1, . . . , r and g.

Corollary 5 Let A be a (m×n)-matrix and b ∈ Rm . In order that the following


implication
p0 A ≤ 0, p ∈ Rm
+ ⇒ p·b≤0

holds true, it is necessary and sufficient that there exist x ∈ Rn+ such that
Ax ≥ b.

Proof : The implication can be written as:

p0 [A, −I] ≤ 0 ⇒ p · b ≤ 0

where I is the identity matrix of Rm . Apply Corollary 3 to get that there exist
x ∈ Rn+ , z ∈ Rm + such that Ax − z = b. Hence Ax ≥ b. The converse is
immediate.

As corollary of Corollary 5 we have the following

Corollary 6 Let A be a (m × n) matrix. Then one of the two following alter-


natives holds:
- either there exists x ∈ Rn+ such that Ax ≥ b
- or there exists p ∈ Rm 0
+ such that p A ≤ 0 and p · b > 0.

Proof : Let P = {p ∈ Rm 0
+ : p A ≤ 0}. This set is nonempty since 0 ∈ P . Either
there exists p ∈ P with p · b > 0 or ∀p ∈ P, p · b ≤ 0. The second statement is
equivalent to there exists x ∈ Rn+ such that Ax ≥ b.

Corollary 7 Let A be a (m×n)-matrix and b ∈ Rm . In order that the following


implication
p0 A = 0, p ∈ Rm
+ ⇒ p·b≤0

holds true, it is necessary and sufficient that there exist x ∈ Rn such that
Ax ≥ b.

Proof : The implication can be written as:

p0 [A, −A − I] ≤ 0 ⇒ p · b ≤ 0

where I is the identity matrix of Rm . Apply Corollary 3 to get that there exist
x1 ∈ Rn+ , x2 ∈ Rn+ , z ∈ Rm 1 2
+ such that Ax − Ax − z = b. Posit x = x − x .
1 2

Then Ax ≥ b. The converse is immediate.

The transposed of a matrix K will be denoted by K 0 .

Corollary 8 Let K be a skew-symmetric (n × n)-matrix (K 0 = −K). Then
the system of linear inequalities with a nonnegativity constraint Kx ≥ 0, x ≥ 0
has a solution verifying x + Kx >> 0.

Proof : Let i ∈ {1, . . . , n} and ei be the ith vector of the natural basis of Rn .
In view of Corollary 6 applied to the skew-symmetric matrix K, it follows that
one and only one of the two following systems:

Kx ≥ ei with x ≥ 0

Kx = −x0 K ≥ 0, x · ei > 0 with x ≥ 0

has a solution. Let us denote by xi ∈ Rn the solution of one of these systems.


Thus for every i, one can find xi ≥ 0 such that Kxi ≥ 0 and (xi + Kxi )i > 0.
Setting x = x1 + . . . + xn , we see that x ≥ 0, Kx ≥ 0, and (x + Kx)i ≥
(xi + Kxi )i > 0, ∀i = 1, . . . , n, that is x + Kx >> 0.

A linear programming problem (or LP-problem) is a constrained optimization problem of the form

maximize (or minimize) f(x) subject to

x ∈ S = {x ∈ Rn : gj(x) = 0, j = 1, . . . , k; hi(x) ≥ 0, i = 1, . . . , l}    (5)

where the objective function f and the constraints g1, . . . , gk, h1, . . . , hl are linear functions.
The above gives the general form of an LP-problem. In these notes, we will assume that an LP-problem has the following form:

minimize c · x = c1 x1 + c2 x2 + . . . + cn xn subject to

a11 x1 + a12 x2 + . . . + a1n xn − b1 ≥ 0
a21 x1 + a22 x2 + . . . + a2n xn − b2 ≥ 0    (6)
. . .
am1 x1 + am2 x2 + . . . + amn xn − bm ≥ 0

with x1 ≥ 0, x2 ≥ 0, . . . , xn ≥ 0. Here the ci, the bj and the coefficients aji (j = 1, . . . , m; i = 1, . . . , n) are real constants, aji being the coefficient of xi in the jth constraint.
We will show that the standard form (6) encompasses the general form (5). First, to maximize a linear form c · x is equivalent to minimizing −c · x. Second, consider an equality constraint in (5). Suppose we have

a11 x1 + a12 x2 + . . . + a1n xn = b1

Obviously, at least one of the coefficients a1i must be different from zero (otherwise the constraint is either trivial or unsatisfiable). Assume it is a1n. We can write

xn = (b1 − a11 x1 − a12 x2 − . . . − a1,n−1 xn−1)/a1n

Then, by substituting this value for xn in the objective function and in all the constraints, we get a new LP-problem. This new problem is one dimension lower than the original, because xn has disappeared. The equality constraint

a11 x1 + a12 x2 + . . . + a1n xn = b1

is now automatically satisfied, while the sign constraint xn ≥ 0 becomes

(b1 − a11 x1 − a12 x2 − . . . − a1,n−1 xn−1)/a1n ≥ 0

or equivalently

−(a11/a1n)x1 − . . . − (a1,n−1/a1n)xn−1 + b1/a1n ≥ 0,

which is again a constraint of the form required in (6).

Finally, consider the constraint x ∈ Rn, i.e. the case where the variables are not required to be nonnegative. Observe that any real number a can be written as a = max(0, a) − max(0, −a). Hence xi = xi+ − xi−, with xi+ ≥ 0 and xi− ≥ 0. Replacing each xi by the pair (xi+, xi−), we get a new LP-problem with nonnegative variables only, as illustrated in the sketch below.
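A minimal sketch of this variable-splitting step (Python with numpy; the function name and data are ours, for illustration only):

import numpy as np

def split_free_variables(c, A, b):
    # Rewrite  min c.x  s.t.  Ax >= b, x free  as an equivalent problem
    # min c~.y  s.t.  A~y >= b, y >= 0,  where x = y[:n] - y[n:].
    c_tilde = np.concatenate([c, -c])   # c.(x+ - x-) = (c, -c).(x+, x-)
    A_tilde = np.hstack([A, -A])        # A(x+ - x-) = [A, -A](x+, x-)
    return c_tilde, A_tilde, b

# Illustrative data.
c = np.array([1.0, -2.0])
A = np.array([[1.0, 1.0],
              [2.0, -1.0]])
b = np.array([1.0, 0.0])
c_t, A_t, b_t = split_free_variables(c, A, b)
# Any y >= 0 feasible for (A_t, b_t) gives a feasible x = y[:2] - y[2:]
# for the original problem, with the same objective value, and conversely.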
The following problem will be called Problem P (P for primal):

minimize c · x = c1 x1 + c2 x2 + . . . + cn xn subject to

a11 x1 + a12 x2 + . . . + a1n xn − b1 ≥ 0
a21 x1 + a22 x2 + . . . + a2n xn − b2 ≥ 0    (7)
. . .
am1 x1 + am2 x2 + . . . + amn xn − bm ≥ 0

with x1 ≥ 0, x2 ≥ 0, . . . , xn ≥ 0. The FOC (first-order conditions) of this problem are: there exist multipliers (p1, . . . , pm, µ1, . . . , µn), with pi ≥ 0 and µi ≥ 0 for all i, which satisfy

pi (ai1 x1 + . . . + ain xn − bi) = 0, ∀i = 1, . . . , m    (8)

µi xi = 0, ∀i = 1, . . . , n    (9)

ci − Σ_{j=1}^m pj aji − µi = 0, ∀i = 1, . . . , n    (10)

ai1 x1 + . . . + ain xn − bi ≥ 0, ∀i = 1, . . . , m    (11)

xi ≥ 0, ∀i = 1, . . . , n    (12)

Consider now the problem D (D for dual):

maximize b · p = b1 p1 + b2 p2 + . . . + bm pm subject to

a11 p1 + a21 p2 + . . . + am1 pm − c1 ≤ 0
a12 p1 + a22 p2 + . . . + am2 pm − c2 ≤ 0    (13)
. . .
a1n p1 + a2n p2 + . . . + amn pm − cn ≤ 0

with p1 ≥ 0, . . . , pm ≥ 0.

Observe that to maximize b · p is to minimize −b · p. For i = 1, . . . , m define ζi = ai1 x1 + . . . + ain xn − bi. We then observe that (x1, . . . , xn, ζ1, . . . , ζm) are the multipliers of problem D associated with (p1, . . . , pm). Indeed, we have

xi (a1i p1 + . . . + ami pm − ci) = 0, ∀i = 1, . . . , n    (14)

ζi pi = 0, ∀i = 1, . . . , m    (15)

−bi + Σ_{j=1}^n aij xj − ζi = 0, ∀i = 1, . . . , m    (16)

a1i p1 + . . . + ami pm − ci ≤ 0, ∀i = 1, . . . , n    (17)

pi ≥ 0, ∀i = 1, . . . , m    (18)

Observe that from (8) we have

b · p = (Σ_{i=1}^m pi ai1)x1 + . . . + (Σ_{i=1}^m pi ain)xn

and from (14) we get

c · x = (Σ_{i=1}^m pi ai1)x1 + . . . + (Σ_{i=1}^m pi ain)xn.

Thus

b · p = c · x.    (19)

Proposition 15 Problem P has a solution x̄ if, and only if, problem D has a
solution p̄. And we have c · x̄ = b · p̄.

Proof : Assume P has a solution x̄. The Slater condition is satisfied, since x̄ satisfies the constraints. Therefore the multipliers p̄, µ̄ exist. Define, for i = 1, . . . , m, ζ̄i = ai1 x̄1 + . . . + ain x̄n − bi. Then (x̄, ζ̄) are the multipliers for Problem D associated with p̄. From the Kuhn-Tucker theorem, p̄ solves D. The proof of the converse claim is similar. From (19) we have c · x̄ = b · p̄.
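Proposition 15 can be confirmed numerically on a small instance. The sketch below (Python, assuming numpy and scipy; the data c, A, b are invented for illustration) solves both problems with scipy's linprog, rewriting Ax ≥ b as −Ax ≤ −b and max b · p as min −b · p to fit the solver's conventions:

import numpy as np
from scipy.optimize import linprog

# Illustrative data for problem P: minimize c.x s.t. Ax >= b, x >= 0.
c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0],
              [1.0, 3.0]])
b = np.array([2.0, 3.0])

# Primal P, with Ax >= b rewritten as -Ax <= -b.
P = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2)

# Dual D: maximize b.p s.t. A'p <= c, p >= 0, i.e. minimize -b.p.
D = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2)

print("c . x_bar =", c @ P.x)   # optimal value of P
print("b . p_bar =", b @ D.x)   # optimal value of D: the two coincide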

We will write Problems P and D in a compact form. Let

A = [ a11 . . . a1n
      . . .
      am1 . . . amn ]

Its transpose A′ is

A′ = [ a11 . . . am1
       . . .
       a1n . . . amn ]

Problem P will be:


Minimize c · x subject to

Ax ≥ b
(P) x ≥ 0
x ∈ Rn.

while Problem D will be

Maximize b · p subject to

A′p ≤ c
(D) p ≥ 0
p ∈ Rm.

We say that x is feasible for Problem P if

Ax ≥ b
x ≥ 0
x ∈ Rn.

and p is feasible for Problem D if

A′p ≤ c
p ≥ 0
p ∈ Rm.

Lemma 4 If x and p are respectively feasible to problems (P) and (D), then
p · b ≤ c · x. If, moreover, p · b = c · x, then x and p are respectively optimal
solutions to problems (P) and (D).

Proof : Assume that x and p are feasible. Their transposes will be denoted by x′, p′. Then

p · b = p′b ≤ p′Ax = (p′A) · x ≤ c′x = c · x.

Assume moreover that p · b = c · x. If x is not optimal, consider y such that Ay ≥ b, y ≥ 0, c · y < c · x. Then c · y < p · b, which contradicts the first part of the lemma applied to the feasible pair (y, p). Analogously, if p is not optimal, consider q such that A′q ≤ c, q ≥ 0, b · q > b · p. Then b · q > c · x, which contradicts the first part applied to (x, q).

Lemma 5 The system of linear inequalities

Ax − tb ≥ 0
−A′p + tc ≥ 0
p · b − c · x ≥ 0

has a solution (p, x, t) ∈ Rm+ × Rn+ × R+ satisfying:

Ax − tb + p >> 0
−A′p + tc + x >> 0
p · b − c · x + t > 0.

Proof : It suffices to consider the skew-symmetric matrix

K = [  0    A   −b
      −A′   0    c
       b′  −c′   0 ]

and to apply Corollary 8 to the system

K (p, x, t)′ ≥ 0,   p ≥ 0, x ≥ 0, t ≥ 0,

where (p, x, t)′ denotes the column vector obtained by stacking p, x and t.

Theorem 18 [Duality Theorem] Given the pair of dual linear programming problems (P) and (D), one of the two following alternatives holds:
- either (P) and (D) have a couple (x, p) of optimal solutions satisfying p · b = c · x (obviously, the same relation p · b = c · x then holds for every couple (x, p) of optimal solutions);
- or neither (P) nor (D) has an optimal solution, and at least one of the two feasible sets is empty.

Proof : Let (p, x, t) be as in Lemma 5. We will distinguish two cases.
First case: t > 0. Then x̄ = (1/t)x and p̄ = (1/t)p are respectively feasible solutions to (P) and (D) such that p̄ · b − c · x̄ ≥ 0. We deduce from the first assertion of Lemma 4 that p̄ · b − c · x̄ = 0, and from the second assertion of Lemma 4 that x̄ and p̄ are optimal solutions to problems (P) and (D).
Second case: t = 0. Then Ax ≥ 0, p′A ≤ 0, x ≥ 0, p ≥ 0, and the last relation of Lemma 5 gives p · b > c · x. Suppose both feasible sets are nonempty, and consider (x0, p0), a couple of feasible elements of problems (P) and (D), that is, such that Ax0 ≥ b, x0 ≥ 0, A′p0 ≤ c, p0 ≥ 0. One deduces p · b = p′b ≤ p′Ax0 ≤ 0 and c · x = c′x ≥ (p0)′Ax ≥ 0, which contradicts p · b > c · x. We have thus proved that at least one feasible set is empty. But in this case at least one of the two problems P, D has no solution; from Proposition 15, the other problem has no solution either.

Corollary 9 (Complementarity relations) In order that a couple (x, p) of feasible solutions to (P) and (D) be a couple of optimal solutions, it is necessary and sufficient that (p′A − c′)x = p′(Ax − b) = 0, a condition which can be rewritten:

(p′A − c′)i xi = 0, ∀i = 1, . . . , n

and

pj (Ax − b)j = 0, ∀j = 1, . . . , m.

Proof : Let (x, p) be a couple of feasible elements to (P) and (D). If x and p are optimal solutions, then the first alternative of Theorem 18 holds and c′x = p′b. Using the feasibility relations, one gets:

0 = p′b − c′x ≤ p′Ax − c′x = (p′A − c′)x ≤ 0
0 = p′b − c′x ≤ p′b − p′Ax = −p′(Ax − b) ≤ 0,

which shows that (p′A − c′)x = p′(Ax − b) = 0.
Assume conversely that (p′A − c′)x = p′(Ax − b) = 0. Then c · x = c′x = p′Ax = p′b = p · b, and it follows from Lemma 4 that x and p are optimal solutions to (P) and (D) respectively.
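On the same invented instance as above, the complementarity relations can be checked directly (Python with numpy and scipy; products vanish up to solver tolerance):

import numpy as np
from scipy.optimize import linprog

c = np.array([2.0, 3.0])
A = np.array([[1.0, 1.0], [1.0, 3.0]])
b = np.array([2.0, 3.0])

x = linprog(c, A_ub=-A, b_ub=-b, bounds=[(0, None)] * 2).x    # optimal for P
p = linprog(-b, A_ub=A.T, b_ub=c, bounds=[(0, None)] * 2).x   # optimal for D

# (p'A - c')_i x_i = 0 and p_j (Ax - b)_j = 0:
print((A.T @ p - c) * x)   # componentwise products, all ~ 0
print(p * (A @ x - b))     # componentwise products, all ~ 0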

8 Exercises
Exercise 23 Consider problem P:

Minimize x1 + x2 − x4 subject to

x1 + x2 + 3x3 + 4x4 ≥ 18
3x1 + 2x2 + x3 + x4 = 10
(x1 , x2 , x3 , x4 ) ∈ R4

(a) Transform this problem into the standard form for an LP-problem.


(b) Formulate the dual problem.

Exercise 24 Consider problem

Maximize x + y subject to

x + 2y ≤ 3
−2x + y ≤ 10
5x ≤ 6
x ≥ 0, y ≥ 0

(a) Write this problem in its standard form, represent the feasible set, and use the complementarity relations to get an optimal solution (a numeric cross-check is sketched below).
(b) Give the dual problem and find an optimal solution to the dual problem.
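A numeric cross-check of the hand computation, with scipy's linprog (the maximization is rewritten as a minimization to fit the solver's convention):

from scipy.optimize import linprog

# max x + y  <=>  min -x - y, subject to the three inequalities above.
res = linprog([-1, -1],
              A_ub=[[1, 2], [-2, 1], [5, 0]],
              b_ub=[3, 10, 6],
              bounds=[(0, None), (0, None)])
print(res.x, -res.fun)   # optimal (x, y) and the maximal value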

Exercise 25 Same questions for the problem

Maximize − x + 4y subject to

2x − y ≥ 4
x − 2y ≥ 3
x ≥ 0, y ≥ 0

Exercise 26 Minimize x + 4y subject to the constraints of Exercise 25.

Chapter 5: Non Convex Optimization
Definition 1 Let f be a continuously differentiable mapping from an open, nonempty convex set U of Rn into Rn. Let a ∈ U. Then f(a) = (f1(a), . . . , fn(a)). The Jacobian matrix Jf(a) is

Jf(a) = [ Df1(a)
          Df2(a)
          . . .
          Dfn(a) ]

where the row-vector Dfi(a) is the derivative of fi at the point a.

Theorem 19 [Local Inversion Theorem] Let f be a continuously differentiable mapping from an open, nonempty convex set U of Rn into Rn. Let a ∈ U verify f(a) = 0. Assume that the Jacobian matrix Jf(a) is invertible. Then there exist an open set V ⊂ U which contains a and an open set W containing 0 such that f is one-to-one from V onto W.
Moreover, the inverse f⁻¹ is continuously differentiable on W.

Let f be a function from an open, convex, nonempty set U of Rn into R. We say that f is locally minimal at x̄ under the constraint x ∈ Γ if x̄ ∈ Γ and if there exists a neighborhood V of x̄ such that f(x) ≥ f(x̄) for all x ∈ V ∩ Γ.
Consider the following problem (P̃):

(P̃)   min f0(x) under the constraints: x ∈ U, fi(x) ≤ 0 ∀i = 1, . . . , I, gi(x) = 0 ∀i = 1, . . . , K,

where f0, the fi for i = 1, . . . , I, and the gi for i = 1, . . . , K are continuously differentiable functions from an open, convex, nonempty set U of Rn into R.

Lemma 6 Let f be a differentiable function from an open, convex set U ⊂ Rm × Rp × Rq into R. We suppose 0 ∈ U. Suppose that the function f : (x, y, z) → f(x, y, z) is locally minimal at 0 under the constraints x ≥ 0, y = 0. Then:

f′x(0, 0, 0) ≥ 0,   f′z(0, 0, 0) = 0.

Proof : Let C = {(x, y, z) ∈ Rm × Rp × Rq | x ≥ 0, y = 0}. Let (r, s, t) ∈ C. Then there exists λ1 > 0 such that (λr, λs, λt) ∈ C ∩ U for all λ ∈ [0, λ1[. The function F defined by F(λ) = f(λr, λs, λt) for λ ≥ 0 is locally minimal at 0. Hence F′(0) ≥ 0, i.e.

f′(0, 0, 0) · (r, s, t) ≥ 0, ∀(r, s, t) ∈ C.

In particular, f′x(0, 0, 0) · r ≥ 0 for all r ∈ Rm with r ≥ 0, and f′z(0, 0, 0) · t ≥ 0 for all t ∈ Rq. Hence f′x(0, 0, 0) ≥ 0 and, since both t and −t are admissible, f′z(0, 0, 0) = 0.

Let x̄ be a feasible point, i.e. fi(x̄) ≤ 0, ∀i = 1, . . . , I, and gi(x̄) = 0, ∀i = 1, . . . , K. Let I(x̄) = {i | fi(x̄) = 0}. We say that the constraints of Problem (P̃) are regular at x̄ if the gradients (Dfi(x̄))i∈I(x̄), (Dgi(x̄))i=1,...,K are linearly independent.

Theorem 20 Suppose that f0 is locally minimal at x̄ under the constraints of Problem (P̃). Suppose also that the constraints are regular at x̄. Then there exist nonnegative scalars (λi)i=1,...,I and scalars (µj)j=1,...,K such that:
(i) Df0(x̄) = − Σ_{i=1,...,I} λi Dfi(x̄) + Σ_{j=1,...,K} µj Dgj(x̄),
(ii) λi fi(x̄) = 0, ∀i = 1, . . . , I.

Proof : Observe that f0 is also locally minimal at x̄ under the constraints

fi(x) ≤ 0, ∀i ∈ I(x̄),     gi(x) = 0, ∀i = 1, . . . , K,

where I(x̄) = {i | fi(x̄) = 0}.
Suppose that I(x̄) has cardinal J. Since, by assumption, the constraints are regular at x̄, there exist vectors θ1, . . . , θq of Rn, with q = n − J − K, such that the matrix with rows (Dfi(x̄))i∈I(x̄), (Dgi(x̄))i=1,...,K, θ1, . . . , θq is invertible. Define θ̃i(x) = θi · (x − x̄), ∀i = 1, . . . , q, ∀x ∈ Rn, and the map ϕ from U into Rn by

ϕ(x) = ((−fi(x))i∈I(x̄), (gi(x))i=1,...,K, (θ̃i(x))i=1,...,q).

One has ϕ(x̄) = 0, and Jϕ(x̄) is invertible, where Jϕ(x̄) is the Jacobian matrix of ϕ at x̄. From the Local Inversion Theorem (Theorem 19), ϕ has, in a neighborhood V of x̄, an inverse ϕ⁻¹ which is continuously differentiable.
Define, on ϕ(V), the map F = f0 ◦ ϕ⁻¹. One has F(u, v, w) = f0(x) if ϕ(x) = (u, v, w). In particular, F(0, 0, 0) = f0(x̄), and F(u, v, w) ≥ F(0, 0, 0) for (u, v, w) near 0 with u ≥ 0 and v = 0. From Lemma 6, F′u(0, 0, 0) ≥ 0 and F′w(0, 0, 0) = 0, which means F′(0, 0, 0) = (λ, µ, 0) with λ ∈ RJ+ and µ ∈ RK. But Df0(x̄) = F′(0, 0, 0) Jϕ(x̄). Since

Jϕ(x̄) = ((−Dfi(x̄))i∈I(x̄), (Dgi(x̄))i=1,...,K, (θi)i=1,...,q),

one gets

Df0(x̄) = − Σ_{i∈I(x̄)} λi Dfi(x̄) + Σ_{i=1,...,K} µi Dgi(x̄)

with λi ≥ 0, ∀i ∈ I(x̄). Define λi = 0 for every i ∉ I(x̄); relation (i) is thus proved. Finally, λi fi(x̄) = 0, ∀i = 1, . . . , I, which is Condition (ii).
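To see conditions (i)-(ii) at work, take a concrete problem (ours, not from the text): minimize f0(x, y) = x + y under the single equality constraint g(x, y) = x² + y² − 1 = 0. At the minimizer x̄ = (−1/√2, −1/√2) there is no active inequality, so (i) reduces to Df0(x̄) = µ Dg(x̄), and µ can be recovered numerically (Python with numpy):

import numpy as np

# Minimize f0(x, y) = x + y subject to g(x, y) = x^2 + y^2 - 1 = 0.
x_bar = np.array([-1.0, -1.0]) / np.sqrt(2.0)   # candidate minimizer

Df0 = np.array([1.0, 1.0])   # gradient of f0 (constant)
Dg = 2.0 * x_bar             # gradient of g at x_bar

# Solve Df0 = mu * Dg in the least-squares sense (overdetermined system).
mu, _, _, _ = np.linalg.lstsq(Dg.reshape(-1, 1), Df0, rcond=None)
print("mu =", mu[0])                                  # -1/sqrt(2)
print("residual:", np.linalg.norm(mu[0] * Dg - Df0))  # ~ 0: (i) holds

Note that µ may take either sign, since it multiplies the gradient of an equality constraint; only the λi attached to inequality constraints are sign-restricted.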

Exercises
Exercise 27 Solve

min (3/2)x + 3y − 1

under the constraints

(x, y) ∈ R2,  x² + y²/2 = 1.

Exercise 28 Consider the ellipse (E) defined by the equations

2x² + y² − 4 = 0
x + y + z = 0

(i) Find the point of (E) at maximal Euclidean distance from the axis Oy.
(ii) Find the point of (E) at minimal Euclidean distance from the axis Oy.

Chapter 6: Dynamic Programming in Economics

9 Finite Horizon
We will present some economic models.

9.1 The one-sector Ramsey Model


Consider problem P1 :

max Σ_{t=0}^T β^t u(ct)

under the constraints: for any t, ct + kt+1 − (1 − δ)kt ≤ F(kt), and k0 ≥ 0 is given.
Here, ct , kt denote respectively the consumption and the capital stock at
period t.
In this model, define at = ct, st = kt, ψ(a, s) = F(s) + (1 − δ)s − a, Φ(s) = [0, F(s) + (1 − δ)s] (at is the action at period t, st is the state at period t) and finally r(a, s) = u(a). The model becomes

max Σ_{t=0}^T β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given

9.2 The One-sector model with endogenous supply of labor


Consider problem P2 :

max Σ_{t=0}^T β^t u(ct, lt)

under the constraints: for any t, ct + kt+1 − (1 − δ)kt ≤ F(kt, 1 − lt), and k0 ≥ 0 is given.
Here, ct, kt, lt denote respectively the consumption, the capital stock and leisure at period t.
Let a = (c, l), s = k, ψ(a, s) = F(s, 1 − l) + (1 − δ)s − c, Φ(s) = {(c, l) : 0 ≤ l ≤ 1, 0 ≤ c ≤ F(s, 1 − l) + (1 − δ)s} and r(a, s) = u(c, l). The model becomes

max Σ_{t=0}^T β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given

9.3 The human capital model
Consider problem P3 :

max Σ_{t=0}^T β^t u(ct)

under the constraints: for any t, ct ≤ F(ht θt), ht+1 = ht(1 + ζ(1 − θt)), and h0 > 0 is given. The function ζ is the training technology, with ζ(0) = 0 and ζ(1) = η; (1 − θt) is the time devoted to training.
Here, ct, ht, θt denote respectively the consumption, the human capital and the working time at period t. We normalize θt ∈ [0, 1].
Let a = (c, θ), s = h, ψ(a, s) = s(1 + ζ(1 − θ)), Φ(s) = {(c, θ) : 0 ≤ θ ≤ 1, 0 ≤ c ≤ F(sθ)} and r(a, s) = u(c). As before, the model becomes

max Σ_{t=0}^T β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given

9.4 The Bellman equation


The general form will be

max Σ_{t=0}^T β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given.

For τ = 0, . . . , T let

Vτ(s) = max Σ_{t=τ}^T β^{t−τ} r(at, st)
s.t. ∀t ≥ τ, st+1 = ψ(at, st), at ∈ Φ(st), sτ = s.

(The factor β^{t−τ} discounts back to period τ, so that Vτ is measured in period-τ utility.) We obtain

VT(s) = max{r(a, s) : a ∈ Φ(s)}

and, for t ≤ T − 1, the Bellman equation

Vt(s) = max{r(a, s) + βVt+1(ψ(a, s)) : a ∈ Φ(s)}
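These two relations yield a backward-induction algorithm: compute VT on a grid of states, then recover VT−1, . . . , V0 one date at a time. A minimal sketch for the Ramsey model of Section 9.1 (Python with numpy; the specification u(c) = log c, F(k) = k^0.3, δ = 1, β = 0.95 and the grid are illustrative choices of ours, not from the text):

import numpy as np

beta, delta, T = 0.95, 1.0, 20
F = lambda k: k ** 0.3                       # production function (assumed)
u = lambda c: np.log(np.maximum(c, 1e-12))   # utility, guarded against log(0)

grid = np.linspace(0.05, 2.0, 200)           # grid for the state s = k
resources = F(grid) + (1 - delta) * grid     # Phi(s) = [0, resources(s)]

V = np.zeros((T + 1, grid.size))             # V[t] approximates V_t on the grid
policy = np.zeros((T + 1, grid.size), dtype=int)

# V_T(s) = max{r(a, s) : a in Phi(s)}: consume all resources at the last date.
V[T] = u(resources)

for t in range(T - 1, -1, -1):
    # moving from s (rows) to s' = psi(a, s) (columns) means consuming
    # a = resources(s) - s'; infeasible choices get value -inf.
    c = resources[:, None] - grid[None, :]
    values = np.where(c > 0, u(c) + beta * V[t + 1][None, :], -np.inf)
    V[t] = values.max(axis=1)
    policy[t] = values.argmax(axis=1)        # index of the chosen s'

The approximation restricts next period's state s′ to the same grid; a finer grid (or interpolation) sharpens the result.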

10 Infinite Horizon
The problem is:

max Σ_{t=0}^∞ β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given.

Let

V(s0) = max Σ_{t=0}^∞ β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given.

We have the following Bellman equation:

V(s) = max{r(a, s) + βV(ψ(a, s)) : a ∈ Φ(s)}
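The Bellman equation characterizes V as a fixed point, which suggests value-function iteration: start from V⁰ ≡ 0 and iterate V^{k+1}(s) = max{r(a, s) + βV^k(ψ(a, s)) : a ∈ Φ(s)}. Since 0 < β < 1, this iteration is a contraction in the sup norm (under standard boundedness assumptions) and converges to V. A sketch with the same illustrative Ramsey specification as above (Python with numpy):

import numpy as np

beta = 0.95
grid = np.linspace(0.05, 2.0, 200)
resources = grid ** 0.3                      # F(k) + (1 - delta)k with delta = 1
u = lambda c: np.log(np.maximum(c, 1e-12))

c = resources[:, None] - grid[None, :]       # consumption when moving s -> s'
r = np.where(c > 0, u(c), -np.inf)           # r(a, s) tabulated on the grid

V = np.zeros(grid.size)
for _ in range(1000):
    V_new = (r + beta * V[None, :]).max(axis=1)   # Bellman operator
    if np.max(np.abs(V_new - V)) < 1e-8:          # sup-norm stopping rule
        V = V_new
        break
    V = V_new

The number of iterations needed grows like log(1/ε)/log(1/β), a direct consequence of the contraction property.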

11 Optimal strategy
A strategy is a sequence (finite if the horizon is finite, infinite if the horizon is infinite) (at, st+1)t which satisfies: for any t, st+1 = ψ(at, st) and at ∈ Φ(st). It is optimal if it solves the problem

max Σ_{t=0}^T β^t r(at, st)
s.t. ∀t ≥ 0, st+1 = ψ(at, st), at ∈ Φ(st), s0 is given

where T is finite or infinite. Let (a∗t, s∗t+1)t denote an optimal sequence. Then we have:
a) for finite horizon: for t ≤ T − 1,

Vt(s∗t) = r(a∗t, s∗t) + βVt+1(ψ(a∗t, s∗t))

and VT(s∗T) = r(a∗T, s∗T);
b) for infinite horizon: for any t we have

V(s∗t) = r(a∗t, s∗t) + βV(ψ(a∗t, s∗t))
       = r(a∗t, s∗t) + βV(s∗t+1)
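Once V is known, relation b) produces an optimal strategy forward in time: in state s∗t, pick an action a∗t attaining the max in the Bellman equation and set s∗t+1 = ψ(a∗t, s∗t). A self-contained sketch (Python with numpy, recomputing the illustrative value function of the previous section and then checking relation b) along the generated trajectory):

import numpy as np

beta = 0.95
grid = np.linspace(0.05, 2.0, 200)
resources = grid ** 0.3
u = lambda c: np.log(np.maximum(c, 1e-12))
c = resources[:, None] - grid[None, :]
r = np.where(c > 0, u(c), -np.inf)

V = np.zeros(grid.size)
for _ in range(1000):                         # value-function iteration
    V = (r + beta * V[None, :]).max(axis=1)
policy = (r + beta * V[None, :]).argmax(axis=1)

i = 0                                         # start from s_0 = grid[0]
for t in range(5):
    j = policy[i]                             # s*_{t+1} = psi(a*_t, s*_t)
    a = resources[i] - grid[j]                # a*_t (here, consumption)
    # relation b): V(s*_t) = r(a*_t, s*_t) + beta V(s*_{t+1}), up to tolerance
    assert abs(V[i] - (u(a) + beta * V[j])) < 1e-6
    i = j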
