
CONVEX ANALYSIS, OPTIMIZATION, VARIATIONAL

METHODS AND APPLICATIONS

Ngalla Djitte,
Department of Applied Mathematics
Gaston Berger University
Saint Louis, Senegal.
[email protected]
CONTENTS

1 Existence Results in Infinite Dimensional Spaces
1.1 Introduction
1.2 Existence theorem in finite dimensional spaces
1.3 Existence in infinite dimensional spaces
1.3.1 Counter-example
1.3.2 Toward the existence theorem

2 Generalized derivatives: subdifferential of convex functions
2.1 Differential characterizations of convex functions
2.2 Subdifferential of convex functions
2.2.1 Directional derivative
2.3 Nonvacuity of the subdifferential

3 Convex minimization problems
3.1 Preliminaries
3.1.1 Normal cone to a convex set
3.1.2 Tangent cone
3.2 Optimality conditions
3.3 Ekeland variational principle and applications
3.3.1 Application to optimization
3.3.2 Existence of minimizers
3.4 Selected exercises

4 Optimization problems with constraints
4.1 Optimization problems with equality constraints
4.2 First order necessary conditions
4.2.1 Second order sufficient conditions
4.3 Optimization with inequality constraints

5 Calculus of variations
5.1 Introduction
5.2 The basic problem
5.3 Necessary conditions
5.3.1 First order necessary condition
5.3.2 Second order necessary condition
5.4 Sufficient conditions for solutions
5.5 Exercises

6 Pontryagin's Minimum Principle
6.1 Introduction
6.1.1 Minimum principle of Pontryagin: general case

Bibliography



CHAPTER 1

Existence Results in Infinite Dimensional Spaces

Contents

1.1 Introduction
1.2 Existence theorem in finite dimensional spaces
1.3 Existence in infinite dimensional spaces

1.1 Introduction
Let V be a normed linear space and F : V → R ∪ {+∞} be an extended real valued
function. Consider the following optimization problem:

inf_{v∈V} F(v),    (1.1)

or

sup_{v∈V} F(v).    (1.2)


Remark 1.1 Observing that

sup_{v∈V} F(v) = − inf_{v∈V} (−F)(v),

in what follows we shall restrict our discussion to problems of the form (1.1).

Definition 1.2 We say that a point ū ∈ V is a local minimizer of F in V if there exists a constant r > 0 such that

F(ū) ≤ F(v)  ∀ v ∈ B(ū, r).    (1.3)

It is a global minimizer if (1.3) holds for all points v in V.

To solve an optimization problem like (1.1) is to find global or local minimizers of F in V. If ū is a global minimizer of F in V, we just write

F(ū) = min_{v∈V} F(v).

Definition 1.3 The domain of F is the subset of V, dom(F), defined by

dom(F) := {v ∈ V : F(v) < +∞}.

We say that F is proper if its domain is nonempty.

1.2 Existence theorem in finite dimensional spaces


In this section, we give two main existence theorems in Rn .

Theorem 1.4 (Weierstrass) Let K be a nonempty compact subset of Rn and F : Rn → R be a continuous function. Then F has a global minimizer in K, i.e., there exists a point x̄ ∈ K such that

F(x̄) = min_{x∈K} F(x).

Proof. Let (xn) be a minimizing sequence of F in K. Since K is compact, the sequence (xn) has a subsequence (xnk) that converges to some point x̄ ∈ K. Using the continuity of F at x̄, it follows that

lim_{k→+∞} F(xnk) = F(x̄).    (1.4)


On the other hand, since (F(xnk)) is a subsequence of (F(xn)), we have

lim_{k→+∞} F(xnk) = inf_{x∈K} F(x).    (1.5)

Using (1.4), (1.5) and the uniqueness of the limit, it follows that

F(x̄) = inf_{x∈K} F(x).

So x̄ is a global minimizer of F in K.
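A quick numerical illustration (a minimal Python sketch, not part of the original notes; the compact set K = [0, 2] and the function F(x) = (x − 1)² are hypothetical examples): a minimizing sequence over a compact set locates the minimizer guaranteed by Theorem 1.4.

import numpy as np

# Weierstrass theorem, illustrated: a continuous F on a compact K attains
# its minimum. We approximate the minimizer on a fine grid of K = [0, 2].
F = lambda x: (x - 1.0) ** 2          # continuous objective (assumed example)
K = np.linspace(0.0, 2.0, 10_001)     # discretization of the compact set K

i = np.argmin(F(K))
print(f"approximate minimizer x = {K[i]:.4f}, F(x) = {F(K[i]):.6f}")  # x ≈ 1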

Definition 1.5 Let F : Rn → R be a real valued function. F is said to be coercive if

lim_{kxk→+∞} F(x) = +∞.    (1.6)

Lemma 1.6 Let K be a nonempty closed subset of Rn. If F is coercive and continuous on some open set containing K, then the following hold.

(i) The function F is bounded from below on K.

(ii) Any minimizing sequence of F in K is bounded.

Proof.

(i). Suppose that F is not bounded from below on K. Then, for all n ∈ N, there exists xn ∈ K such that F(xn) < −n. So we get a sequence (xn) of points of K satisfying

F(xn) < −n, ∀ n ∈ N.    (1.7)

Claim: The sequence (xn) is bounded. Indeed, by contradiction, assume that (xn) is not bounded. Then it has a subsequence (xnk) such that

lim_{k→∞} kxnk k = +∞.

From the coercivity of F, it follows that

lim_{k→∞} F(xnk) = +∞.


But from (1.7), we have

lim_{k→∞} F(xnk) = −∞.

This contradicts the uniqueness of the limit. Therefore, (xn) is bounded. By the Bolzano–Weierstrass theorem, there exists a subsequence (xnk) of (xn) that converges to some point x̄ ∈ K. Using the continuity of F at x̄, it follows that

lim_{k→∞} F(xnk) = F(x̄) ∈ R.

From (1.7) we get

lim_{k→∞} F(xnk) = −∞.

This is again a contradiction. So F is bounded from below on K, and this ends the proof of (i).

(ii). Let (xn) be a minimizing sequence of F in K, that is,

lim_{n→∞} F(xn) = inf_{x∈K} F(x).    (1.8)

We have to show that (xn) is bounded. By contradiction, assume not. Then there exists a subsequence (xnk) of (xn) such that

lim_{k→∞} kxnk k = +∞.

Since F is coercive, we have

lim_{k→∞} F(xnk) = +∞.

Using (1.8), we have

lim_{k→∞} F(xnk) = inf_{x∈K} F(x),

and this leads to

inf_{x∈K} F(x) = +∞.

This is a contradiction, since inf_{x∈K} F(x) ≤ F(x0) < +∞ for any point x0 ∈ K.


Theorem 1.7 Let K be a nonempty closed subset of Rn (not necessarily bounded) and F : Rn → R be a function which is coercive and continuous on some open set containing K. Then F has a global minimizer on K, i.e., there exists at least one point x̄ ∈ K such that

F(x̄) = min_{x∈K} F(x).

Proof. Let (xn) be a minimizing sequence of F in K. By Lemma 1.6, (xn) is bounded. So, by the Bolzano–Weierstrass theorem, (xn) has a subsequence (xnk) that converges to some point x̄ ∈ Rn. Since K is closed, x̄ ∈ K. Using the continuity of F at x̄, it follows that

lim_{k→+∞} F(xnk) = F(x̄).    (1.9)

On the other hand, since (F(xnk)) is a subsequence of (F(xn)), we have

lim_{k→+∞} F(xnk) = inf_{x∈K} F(x).    (1.10)

Using (1.9), (1.10) and the uniqueness of the limit, it follows that F(x̄) = inf_{x∈K} F(x). So x̄ is a global minimizer of F in K.
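The following hedged Python sketch illustrates Theorem 1.7 on the closed, unbounded set K = R²; the coercive objective F(x) = kxk² − 3x₁ is an assumed example, and the crude random search merely plays the role of a minimizing sequence.

import numpy as np

# Theorem 1.7, illustrated: F coercive and continuous on a closed, unbounded
# K = R^2 still has a global minimizer. Hypothetical example:
# F(x) = ||x||^2 - 3 x_1, minimized at x = (1.5, 0) with value -2.25.
rng = np.random.default_rng(0)
F = lambda x: x @ x - 3.0 * x[0]

best, best_val = None, np.inf
for _ in range(20_000):                 # crude random minimizing sequence
    x = rng.normal(scale=5.0, size=2)
    if F(x) < best_val:
        best, best_val = x, F(x)
print(best, best_val)                   # close to (1.5, 0) and -2.25

Coercivity is exactly what keeps such a search from drifting off to infinity: samples with large norm have large F and are never retained as the current best.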

1.3 Existence in Infinite dimensional spaces

1.3.1 Counter-example
In this section, we give a counter-example showing that in infinite dimensional spaces, the existence of a minimizer is not guaranteed by the coercivity condition (1.6) used in Theorem 1.7. This is because, in infinite dimensional spaces, closed bounded subsets need not be compact.

Let us take V = H¹(]0, 1[) = {v ∈ L², v′ ∈ L²}, the Sobolev space with the norm

kvk = ( ∫₀¹ (|v′|² + |v|²) dx )^{1/2},


and let F : V → R be the function defined by

F(v) = ∫₀¹ ((|v′(x)| − 1)² + v(x)²) dx, ∀ v ∈ V.

It is easy to see that F is continuous and that condition (1.6) is satisfied. Indeed, since 2t ≤ t²/2 + 2 for all t ≥ 0,

F(v) = kvk² − 2∫₀¹ |v′(x)| dx + 1 ≥ kvk² − (1/2)∫₀¹ v′(x)² dx − 1 ≥ (1/2)kvk² − 1.

But we have

inf_{v∈V} F(v) = 0.    (1.11)

Therefore, there is no minimizer of F in V. In fact, if there were one, say u, it would satisfy F(u) = 0, hence u = 0 and |u′| = 1 a.e. on [0, 1], a contradiction.

To prove (1.11), we define the following minimizing sequence (un): for n ≥ 1 and 0 ≤ k ≤ n − 1,

un(x) = x − k/n        if k/n ≤ x ≤ (2k+1)/(2n),
un(x) = (k+1)/n − x    if (2k+1)/(2n) ≤ x ≤ (k+1)/n.    (1.12)

Then, for 0 ≤ k ≤ n − 1,

u′n(x) = 1     if k/n < x < (2k+1)/(2n),
u′n(x) = −1    if (2k+1)/(2n) < x < (k+1)/n.    (1.13)

So |u′n| = 1 a.e. and 0 ≤ un ≤ 1/(2n), hence F(un) = ∫₀¹ un(x)² dx ≤ 1/(4n²) → 0. This proves (1.11).
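The computation can be checked numerically. The sketch below (assuming a uniform grid discretization of [0, 1]; the helper u_n is a hypothetical implementation of (1.12) as the distance to the nearest grid point k/n) shows F(un) → 0, while the argument above shows F never vanishes.

import numpy as np

# The sawtooth minimizing sequence u_n from (1.12): |u_n'| = 1 a.e. and
# 0 <= u_n <= 1/(2n), so F(u_n) = \int u_n^2 dx -> 0, yet no u attains F = 0.
def u_n(x, n):
    # distance from x to the nearest grid point k/n: sawtooth of height 1/(2n)
    return np.minimum(x * n - np.floor(x * n), np.ceil(x * n) - x * n) / n

for n in (1, 2, 4, 8, 16):
    x = np.linspace(0.0, 1.0, 200_001)
    Fu = np.mean(u_n(x, n) ** 2)      # gradient term vanishes a.e.
    print(n, Fu)                       # decreases like O(1/n^2)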

1.3.2 Toward the existence theorem


Let V be a real normed linear space and F : V → R ∪ {+∞} be a function with nonempty domain satisfying

∀ (un) ⊂ K, lim_{n→+∞} kun k = +∞ ⇒ lim_{n→+∞} F(un) = +∞.    (1.14)

Then any minimizing sequence is bounded. If V is finite dimensional, we conclude by using the Bolzano–Weierstrass theorem. In a more general setting, we can conclude if (V, K, F) satisfies

∀ (un) ⊂ K such that sup_{n∈N} kun k < +∞,
lim_{n→+∞} F(un) = l ∈ [−∞, +∞[ ⇒ ∃ u ∈ K, F(u) ≤ l.    (1.15)

So, under conditions (1.14) and (1.15), problem (1.1) has a solution. But condition (1.15) is not useful in practice, because it is difficult to verify.

1.3.2.1 Lower semi continuous functions (lsc)

Definition 1.8 Let V be a real vector space. The epigraph of F : V −→ R ∪ {+∞} is the subset of V × R defined by:

epi(F) := {(x, a) ∈ V × R : F(x) ≤ a}.

We say that F is lower semi continuous if epi(F) is closed in V × R.

Proposition 1.9 Let F : V −→ R ∪ {+∞} and x ∈ V. Then F is lower semi continuous at x if and only if for every sequence (xn) in V that converges to x, we have

F(x) ≤ lim inf_n F(xn).

Proof. Assume that F is lsc at x. Let (xn) be a sequence in V such that xn → x in V. Let (xnk) be a subsequence of (xn) such that limk F(xnk) = lim infn F(xn). Then (xnk, F(xnk)) is a sequence in epi(F) that converges to (x, lim infn F(xn)). Since epi(F) is closed, we have (x, lim infn F(xn)) ∈ epi(F) and therefore

F(x) ≤ lim inf_n F(xn).

Conversely, assume that (xn → x in V) ⇒ F(x) ≤ lim infn F(xn). Let (xn, an) be a sequence in epi(F) that converges to (x, a) in V × R. Then xn → x and an → a. Therefore, by hypothesis, we have

F(x) ≤ lim inf_n F(xn) ≤ lim inf_n an = lim_n an = a.

So (x, a) ∈ epi(F), hence epi(F) is closed, which means that F is lsc.


Proposition 1.10 A function F : V −→ R ∪ {+∞} is lsc if and only if for all a ∈ R, the set Ca := {x ∈ V : F(x) ≤ a} is closed in V.

Proof. Assume that F is lsc. Let a ∈ R, and let (xn) be a sequence in Ca that converges to x ∈ V. Since F(xn) ≤ a for all n, we have lim infn F(xn) ≤ a. So, by Proposition 1.9, it follows that

F(x) ≤ lim inf_n F(xn) ≤ a.

Hence x ∈ Ca. Therefore Ca is closed.

Conversely, assume that Ca is closed for all a ∈ R. Let (xn) be a sequence in V that converges to x ∈ V. Let (xnk) be a subsequence of (xn) such that

lim_k F(xnk) = lim inf_n F(xn).

Assume that F(x) > lim infn F(xn). Then there exists a ∈ R such that

lim inf_n F(xn) < a < F(x).    (1.16)

So there exists N ∈ N such that F(xnk) < a for all k ≥ N. Therefore xnk ∈ Ca for all k ≥ N. Since xnk → x and Ca is closed, we have x ∈ Ca. Therefore F(x) ≤ a. Using (1.16), we get a contradiction. So F(x) ≤ lim infn F(xn).
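As an illustration of Proposition 1.10, consider the lsc step function f(x) = 0 for x ≤ 0 and f(x) = 1 for x > 0 (a hypothetical example, not from the notes): its sublevel sets are ∅, (−∞, 0] or R, all closed. A minimal Python sketch checking the sequential criterion of Proposition 1.9 at x = 0:

# Proposition 1.9 / 1.10, illustrated with the lsc step function
# f(x) = 0 for x <= 0 and f(x) = 1 for x > 0 (hypothetical example).
# Its sublevel sets {f <= a} are empty, (-inf, 0] or R: all closed.
f = lambda x: 0.0 if x <= 0 else 1.0

xs = [1.0 / n for n in range(1, 10_001)]    # x_n -> 0 from the right
liminf = min(f(x) for x in xs)               # = 1 here
print(f(0.0) <= liminf)                      # True: f(0) <= liminf f(x_n)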

Ex. 1.11 Show that any continuous function is lower semi continuous.

Ex. 1.12 Let K be a nonempty subset of V and let ψK be the indicator function of K defined by:

ψK(x) = 0 if x ∈ K, +∞ otherwise.

Show that ψK is lsc if and only if K is closed in V.

Ex. 1.13 Let Ω be a measurable subset of Rm and u : Ω × Rn −→ R be a positive measurable function. Let φu : Lp(Ω; Rn) → R be the function defined by:

φu(v) = ∫_Ω u(t, v(t)) dt.

φu is called the Nemytskii functional associated to u. Show that if u(t, ·) is lower semicontinuous for almost every t ∈ Ω, then φu is well defined and lower semi continuous on Lp(Ω; Rn).

Proposition 1.14 Let F : V −→ R ∪ {+∞} be a proper, lower semi continuous function. Assume that V is compact. Then there exists x̄ ∈ V such that

F(x̄) = inf_{x∈V} F(x).

Proof. Let (xn) be a minimizing sequence of F in V, that is, (xn) ⊂ V and

lim_{n→+∞} F(xn) = inf_{x∈V} F(x).

Since V is compact, there exists a subsequence (xnk) of (xn) such that xnk → x̄ ∈ V. By Proposition 1.9 we have

F(x̄) ≤ lim inf_k F(xnk) = lim_k F(xnk) = inf_{x∈V} F(x).

So inf_{x∈V} F(x) = F(x̄).

1.3.2.2 Convex sets and convex functions

Definition 1.15 Let V be a vector space. A subset K of V is convex if it is empty or if for every x and y in K, the line segment joining x and y remains inside K, where the line segment [x, y] joining x and y is defined by

[x, y] = {λx + (1 − λ)y : 0 ≤ λ ≤ 1}.

Therefore, a subset K of V is convex if and only if for every x and y in K and every λ with 0 ≤ λ ≤ 1, the vector λx + (1 − λ)y is also in K.

Example 1.16

a. Let V be a vector space and x and v be vectors in V. The line L through x in the direction v,

L = {x + λv : λ ∈ R},

is a convex subset of V.

b. Any linear subspace M of V is a convex set, since linear subspaces are closed under
addition and scalar multiplication.

c. If x̄ ∈ Rn and α ∈ R, then the closed half-spaces

F + = {y ∈ Rn : x̄ · y ≥ α} and F − = {y ∈ Rn : x̄ · y ≤ α}

determined by x̄ and α are all convex subsets of Rn .

d. If x̄ ∈ Rn and r > 0, then the ball centered at x̄ with radius r,

B̄(x̄, r) := {x ∈ Rn : kx − x̄k ≤ r}

is a convex subset of Rn .

Definition 1.17 Let V be a vector space. An extended real-valued function F : V → R ∪ {+∞} is convex if

F(λx + (1 − λ)y) ≤ λF(x) + (1 − λ)F(y) for all x, y ∈ V and λ ∈ (0, 1).    (1.17)

If the inequality (1.17) is strict for all x, y ∈ V with x ≠ y and all λ ∈ (0, 1), then F is said to be strictly convex. If the inequalities in the above definitions are reversed, we obtain the definitions of concave and strictly concave functions.

Remark 1.18 Note that F is convex (respectively strictly convex) on a convex set K if and only if −F is concave (respectively strictly concave) on K. Because of this close connection, we will formulate all results in terms of convex functions only. Corresponding results for concave functions will be clear.

Proposition 1.19 Let F : V → R ∪ {+∞} be an extended real valued function. Then F is convex if and only if epi(F) is convex in V × R.

Proof. Exercise.


Proposition 1.20 Let F1, F2 be convex functions on V, λ > 0, and ϕ be an increasing convex function on R. Then F1 + F2, max(F1, F2), λF1 and ϕ ∘ F1 are all convex functions.

Proof. Exercise.

Theorem 1.21 (Link to Optimization Problems) Let F : V → R ∪ {+∞} be a function.

(i) If F is strictly convex on V , then F has at most one minimizer, that is if the minimizer
exists, it must be unique.

(ii) If F is convex, then any local minimizer of F is a global minimizer.

(iii) If F is convex, then the set of minimizers of F on V is a convex subset of V .

Proof. (i). Let x1 and x2 be two distinct minimizers of F, and let λ be such that 0 < λ < 1. Because of the strict convexity of F and the fact that

F(x1) = F(x2) = min_{x∈V} F(x),

we have

F(x1) ≤ F(λx1 + (1 − λ)x2) < λF(x1) + (1 − λ)F(x2) = F(x1),

which is a contradiction. Therefore x1 = x2.

(ii). Suppose that x̄ is a local minimizer of F in V. Then there is a positive number r such that

F(x̄) ≤ F(x), ∀ x ∈ B(x̄, r).

Given any x ∈ V, we want to show that F(x̄) ≤ F(x). To this end, select λ with 0 < λ < 1 small enough that

x̄ + λ(x − x̄) = λx + (1 − λ)x̄ ∈ B(x̄, r).

Then,

F(x̄) ≤ F(x̄ + λ(x − x̄)) = F(λx + (1 − λ)x̄) ≤ λF(x) + (1 − λ)F(x̄)

because F is convex. Now subtract F(x̄) from both sides of the preceding inequality and divide the result by λ to obtain 0 ≤ F(x) − F(x̄). This establishes the desired result.

(iii). If x1 and x2 are minimizers of F and λ ∈ [0, 1], the convexity of F gives F(λx1 + (1 − λ)x2) ≤ λF(x1) + (1 − λ)F(x2) = min_{x∈V} F(x), so λx1 + (1 − λ)x2 is also a minimizer. Hence the set of minimizers is convex.


Proposition 1.22 Let F : V → R ∪ {+∞} be convex and lower semi continuous. Then F is weakly lower semi continuous, that is, epi(F) is weakly closed in V × R.

Proof. Assume that F : V −→ R ∪ {+∞} is lower semi continuous and convex. Then epi(F) is closed and convex in V × R. Therefore, by the Hahn–Banach theorem, epi(F) is weakly closed, and hence F is weakly lower semi continuous.

Theorem 1.23 (Existence theorem) Let V be a real reflexive Banach space and K be a nonempty closed convex subset of V. Let F : V → R ∪ {+∞} be convex, lower semi continuous and proper. If K is bounded or F is coercive (that is, F satisfies (1.14)), then there exists at least one minimizer of F in K.

Proof. Let (un) be a minimizing sequence of F in K. From (1.14) (or from the boundedness of K), it follows that (un) is bounded. Since V is reflexive, there exists a subsequence (unk) of (un) that converges weakly to some point ū ∈ V. But K is closed and convex, hence weakly closed, so ū ∈ K. On the other hand, F is convex and lower semi continuous, hence weakly lower semi continuous. Therefore,

F(ū) ≤ lim inf_k F(unk) = lim_n F(un) = inf_{v∈K} F(v).

So ū is a minimizer of F in K.



CHAPTER 2

Generalized derivatives: subdifferential of convex functions

Contents

2.1 Differential characterizations of convex functions
2.2 Subdifferential of convex functions
2.3 Nonvacuity of the subdifferential

2.1 Differential characterizations of convex functions


Our aim in this section is to characterize the convexity of a differentiable function
through a monotonicity property of its differential. Let us extend first the concept
of convexity to functions defined on subsets of vector spaces. Let C be a nonempty
convex subset of a vector space E. A function f defined on a subset of E containing C
and with values in R is convex relatively to C (or on C) if for all λ ∈ (0, 1), x, y ∈ C, we
have
f (λx + (1 − λ)y) ≤ λ f (x) + (1 − λ) f (y), (2.1)


provided the right hand side exists.

If inequality (2.1) is strict for all x, y ∈ C with x ≠ y and f(x), f(y) both finite, we say that the function f is strictly convex on C.
Let us start our study with the following important lemma.

Lemma 2.1 (Slope inequality for convex functions) Let I be an interval of R and h : I → R̄ be a proper convex function. Let r1, r2, r3 ∈ I be such that r1 < r2 < r3 and h(r1) and h(r3) are finite. Then

(h(r2) − h(r1))/(r2 − r1) ≤ (h(r3) − h(r1))/(r3 − r1) ≤ (h(r3) − h(r2))/(r3 − r2).    (2.2)

Further, these inequalities for all such r1, r2, r3 ∈ I characterize the convexity of h on I. If the inequalities are required to be strict for all r1, r2, r3 ∈ I such that r1 < r2 < r3 and h(r1), h(r2) and h(r3) are finite, we obtain a characterization of the strict convexity of h on I.
Proof. For λ = (r3 − r2)/(r3 − r1), we have λ ∈ (0, 1) and r2 = λr1 + (1 − λ)r3. By the convexity of h we then have

h(r2) ≤ ((r3 − r2)/(r3 − r1)) h(r1) + ((r2 − r1)/(r3 − r1)) h(r3).    (2.3)

Subtracting h(r1) (resp. h(r3)) from both sides of (2.3), we obtain the first (resp. the second) inequality of (2.2). Further, it is easy to see that these inequalities characterize the convexity of h. The proof for strict convexity is similar.

Through Lemma 2.1 we can characterize the convexity of a differentiable function of one variable as follows:

Proposition 2.2 Let I be an open interval of R and h : I → R be a differentiable real-valued function on I. Then the following assertions are equivalent:

(i) h is convex on I;

(ii) the derivative h′ is nondecreasing on I;

(iii) h′(r) · (s − r) ≤ h(s) − h(r) for all s, r ∈ I.


If the function h is twice differentiable on I, then h is convex if and only if h′′(r) ≥ 0 for all r ∈ I.

Similarly, the following are equivalent:

(a) h is strictly convex on I;

(b) the derivative h′ is increasing on I;

(c) h′(r) · (s − r) < h(s) − h(r) for all s, r ∈ I with s ≠ r.

Further, if the function h is twice differentiable on I and h′′(r) > 0 for all r ∈ I, then h is strictly convex on I. The converse does not hold: the strict convexity of a twice differentiable function on I does not entail the strict positivity of h′′ (for example, h(r) = r⁴ is strictly convex on R, yet h′′(0) = 0).

Proof. (i) ⇒ (ii). Let r, t ∈ I with r < t. From the differentiability of h, (i) and Lemma 2.1, we have

h′(r) = lim_{s↓r} (h(s) − h(r))/(s − r) ≤ (h(t) − h(r))/(t − r) ≤ lim_{s↑t} (h(t) − h(s))/(t − s) = lim_{s↑t} (h(s) − h(t))/(s − t) = h′(t),

which proves that the derivative h′ is nondecreasing on I.

(ii) ⇒ (iii). For fixed r ∈ I, let ϕ(s) = h(s) − h(r) − h′(r) · (s − r) for all s ∈ I. The function ϕ is differentiable on I and ϕ′(s) = h′(s) − h′(r). By assumption (ii), we see that ϕ′(s) ≥ 0 if s ≥ r and ϕ′(s) ≤ 0 if s ≤ r. We then deduce that ϕ(s) ≥ ϕ(r) = 0 for all s ∈ I, which is assertion (iii).

(iii) ⇒ (i). Assuming (iii), we have for all s ∈ I,

h(s) = sup_{r∈I} [h′(r) · (s − r) + h(r)].


Hence h is convex on I, as the pointwise supremum of a family of convex (indeed affine) functions on I.

The case where h is twice differentiable on I, as well as the case concerning strict convexity, are left as an exercise.

Let us consider now the more general case of a differentiable function on an open and
convex subset of a normed linear space.

We shall make use of the following lemma, which can be seen as a bridge between convex functions defined on subsets of R and those defined on subsets of normed linear spaces.

Lemma 2.3 Let U be an open and convex subset of a normed linear space E and let f : U → R be a function. For fixed points x, y ∈ U with x ≠ y, set

Ixy := {s ∈ R : x + s(y − x) ∈ U} and hxy(s) := f(x + s(y − x)) ∀ s ∈ Ixy.

Then Ixy is an open interval of R, and f is convex on U if and only if for all x, y ∈ U with x ≠ y, the function hxy is convex on Ixy.

Theorem 2.4 Let U be a nonempty open and convex subset of a normed linear space and f : U → R be a function which is Fréchet-differentiable on U. Then the following are equivalent:

(a) f is convex on U;

(b) hDf(y) − Df(x), y − xi ≥ 0 for all x, y ∈ U (monotonicity of the operator Df);

(c) hDf(x), y − xi ≤ f(y) − f(x) for all x, y ∈ U (convexity inequality).

If f is twice differentiable on U, then f is convex on U if and only if for each x ∈ U the bilinear form associated with D²f(x) is positive semidefinite, i.e.,

hD²f(x) · v, vi ≥ 0, ∀ v ∈ E.

Similarly, the following are equivalent:

(a′) f is strictly convex on U;

(b′) hDf(y) − Df(x), y − xi > 0 for all x, y ∈ U with x ≠ y;

(c′) hDf(x), y − xi < f(y) − f(x) for all x, y ∈ U with x ≠ y.

Further, assuming the twice differentiability of f on U, a sufficient (but not necessary) condition for the strict convexity of f on U is the positive definiteness of D²f(x) for each x ∈ U, i.e.,

hD²f(x) · v, vi > 0, ∀ v ∈ E with v ≠ 0.

Proof. For x, y ∈ U, let Ixy and hxy be as in Lemma 2.3. Observing that 0 ∈ Ixy and 1 ∈ Ixy with hxy(0) = f(x) and hxy(1) = f(y), the proof follows from Lemma 2.3 and Proposition 2.2.
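Assertion (b) of Theorem 2.4 can be sampled numerically. A minimal Python sketch, assuming the convex quadratic f(x) = ½ xᵀQx with Q positive semidefinite (so Df(x) = Qx; the matrix and sample sizes are arbitrary choices), checks the monotonicity of Df on random pairs:

import numpy as np

# Theorem 2.4 (a)<->(b), illustrated: for f(x) = 0.5 x^T Q x with Q positive
# semidefinite, Df(x) = Qx and <Df(y) - Df(x), y - x> >= 0 for all x, y.
rng = np.random.default_rng(1)
A = rng.normal(size=(3, 3))
Q = A.T @ A                                  # PSD by construction
grad = lambda x: Q @ x

for _ in range(1000):
    x, y = rng.normal(size=3), rng.normal(size=3)
    assert (grad(y) - grad(x)) @ (y - x) >= -1e-12   # monotonicity of Df
print("monotonicity verified on random samples")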

2.2 Subdifferential of convex functions


Let E be a normed linear space. According to the proof of Theorem 2.4, when a convex
function f : E → R̄ is finite on a neighborhood of a point a ∈ E and differentiable at a,
we have
hD f (a), x − ai ≤ f (x) − f (a) ∀ x ∈ E. (2.4)

However, very often a convex function f : E → R̄, finite at a is not differentiable at a


but admits several elements x∗ ∈ E ∗ satisfying

hx∗ , x − ai ≤ f (x) − f (a) ∀ x ∈ E. (2.5)

For example, the convex function f : R → R with f (x) = |x| is not differentiable at a = 0
but all x∗ ∈ [−1, 1] satisfy (2.5) for a = 0.

On the other hand, it is obvious that f attains its minimum at the point a if and only
if the element x∗ = 0 satisfies (2.5). These comments lead to the following notion:


Definition 2.5 Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a. The subdifferential of f at the point a is the subset ∂f(a) of E∗ defined as follows:

∂f(a) := {x∗ ∈ E∗ : hx∗, x − ai ≤ f(x) − f(a) ∀ x ∈ E}.

If f(a) is not finite, we set ∂f(a) = ∅. When ∂f(a) ≠ ∅, we say that f is subdifferentiable at a, and each x∗ ∈ ∂f(a) is called a subgradient of f at a. The inequality in the definition of ∂f(a) is called the subgradient inequality.

Remark 2.6 It is immediate from the definition that if the function f takes the value −∞ at some point, then ∂f(a) = ∅ for all a ∈ E.

Example 2.7

1. For f : R → R with f(x) = |x|, we have ∂f(0) = [−1, 1], ∂f(a) = {1} if a > 0 and ∂f(a) = {−1} if a < 0.

2. For f : R → R ∪ {+∞} with f(x) = −√x if x ≥ 0 and f(x) = +∞ if x < 0, the function f is convex on R and finite at 0, but ∂f(0) = ∅. For a ≠ 0, ∂f(a) = {−1/(2√a)} for a > 0 and ∂f(a) = ∅ for a < 0.

3. Let f : E → R be a real-valued sublinear function and a ∈ E. Then ∂f(a) = {x∗ ∈ ∂f(0) : hx∗, ai = f(a)}.
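The subgradient inequality in Example 2.7(1) is easy to test numerically. A small Python sketch (the grids are arbitrary choices) verifies that every x∗ ∈ [−1, 1] satisfies hx∗, x − 0i ≤ |x| − |0|:

import numpy as np

# Example 2.7(1), checked numerically: every x* in [-1, 1] is a subgradient
# of f(x) = |x| at a = 0, i.e. x* * (x - 0) <= |x| - |0| for all x.
xs = np.linspace(-5.0, 5.0, 2001)
for x_star in np.linspace(-1.0, 1.0, 21):
    assert np.all(x_star * xs <= np.abs(xs))
print("subgradient inequality holds for all sampled x* in [-1, 1]")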

Ex. 2.8 Find the subdifferentials of the following convex functions: f : R → R ∪ {+∞}, f (x) =
0 if |x| ≤ 1 and f (x) = +∞ otherwise; g : R → R∪{+∞}, g(x) = 1/x if 0 < x < 1 and g(x) = +∞
otherwise.

Proposition 2.9 (Subdifferential characterization of a minimum) Let f : E → R ∪ {+∞} be a convex function which is finite at a ∈ E. Then ∂f(a) is a convex subset of E∗ (possibly empty) which is w∗-closed, and the point a is a global minimizer of f on E if and only if 0 ∈ ∂f(a).

Proof. Exercise.


2.2.1 Directional derivative


Analysis. Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a ∈ E. According to the definition of ∂f(a), we observe that for x∗ ∈ E∗, we have

x∗ ∈ ∂f(a) ⇔ ∀ x ∈ E, hx∗, x − ai ≤ f(x) − f(a).    (2.6)

The latter inequality in (2.6) is equivalent to saying that for all v ∈ E and all real t > 0,

hx∗, a + tv − ai ≤ f(a + tv) − f(a),

i.e.,

hx∗, vi ≤ t⁻¹[f(a + tv) − f(a)].

So x∗ ∈ ∂f(a) if and only if for each v ∈ E,

hx∗, vi ≤ inf_{t>0} t⁻¹[f(a + tv) − f(a)].    (2.7)

Consequently, the function of v (taking values in R̄) given by the right-hand side of (2.7) characterizes the subdifferential ∂f(a).

On the other hand, the difference quotient t ↦ t⁻¹[f(a + tv) − f(a)] is a nondecreasing function (according to Lemma 2.1) from R ∖ {0} into R̄. Therefore, the limit as t ↓ 0 of t⁻¹[f(a + tv) − f(a)] exists in R̄ and we have

lim_{t↓0} t⁻¹[f(a + tv) − f(a)] = inf_{t>0} t⁻¹[f(a + tv) − f(a)].    (2.8)

Definition 2.10 Let E be a normed linear space, f : E → R̄ be a function (not necessarily convex) and a, v ∈ E with f(a) finite. The directional derivative of f at a in the direction v, f′(a; v), is defined by

f′(a; v) := lim_{t↓0} t⁻¹[f(a + tv) − f(a)],

provided the limit exists in R̄.

• If f′(a; v) exists in R for all v ∈ E and the map v ↦ f′(a; v) is linear and continuous from E to R, we say that f is Gâteaux differentiable at a, with Gâteaux differential at a denoted by DG f(a) or f′(a); in such a case, f′(a; v) = DG f(a) · v = hDG f(a), vi.

• Finally, f is differentiable (or Fréchet differentiable) at a if there exist a continuous linear functional Df(a) and a function ε defined on some neighborhood of 0_E with lim_{v→0} ε(v) = 0 such that

f(a + v) = f(a) + Df(a) · v + kvk ε(v).

Obviously, if f is finite around a and Fréchet differentiable at a, then f is Gâteaux differentiable at a (the converse does not hold), and in such a case we have Df(a) = DG f(a). Moreover, the domain of f, i.e., the subset of E where it is finite, must then be open. Here is an important case where the converse is true.

Proposition 2.11 Let f be a Gâteaux differentiable function with open domain, such that u ↦ f′(u) is a continuous mapping from the domain of f into E∗. Then f is also continuously Fréchet differentiable, and is called a C¹ function.

Proof. Under our hypothesis, we have to prove that the Gâteaux derivative f′(u) is in fact a Fréchet derivative. Take any point u in the domain of f and some r > 0 such that B(u, r) ⊂ Dom(f). Using the Gâteaux differentiability of f and the mean value theorem, it follows that for every v ∈ E with kvk < r, there exists some θ ∈ [0, 1] such that

f(u + v) − f(u) = hf′(u + θv), vi = hf′(u), vi + hf′(u + θv) − f′(u), vi.    (2.9)

Let ε > 0. From the continuity of the map w ↦ f′(w), there exists η with 0 < η < r such that kw − uk < η implies

kf′(w) − f′(u)k∗ < ε.

For every v ∈ E with kvk < η, we obtain from (2.9) that

|f(u + v) − f(u) − hf′(u), vi| ≤ εkvk,

which proves the Fréchet differentiability.

In the sequel, we shall need the following lemma.


Lemma 2.12 Let E be a linear space and p : E → R be a convex and positively homogeneous function. Then p is sublinear.

Proof. We only need to prove the subadditivity of p. To this end, let x, y ∈ E. From the assumptions, we have

p(x + y) = p(2(½x + ½y)) = 2p(½x + ½y)   (positive homogeneity)
         ≤ 2(½p(x) + ½p(y)) = p(x) + p(y)   (convexity).

Theorem 2.13 Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a ∈ E. Then the following properties hold.

(i) For any v ∈ E, the directional derivative of f at a in the direction v, f′(a; v), exists in R̄ and

f′(a; v) = inf_{t>0} t⁻¹[f(a + tv) − f(a)].

(ii) ∂f(a) = {x∗ ∈ E∗ : hx∗, vi ≤ f′(a; v) ∀ v ∈ E}.

(iii) The directional derivative function v ↦ f′(a; v) is sublinear from E to R̄ and f′(a; 0_E) = 0.

(iv) For any v ∈ E, we have −f′(a; −v) ≤ f′(a; v).

(v) If f is finite on a neighborhood of a and Gâteaux differentiable at a, then

∂f(a) = {DG f(a)}.

Proof. Assertions (i) and (ii) follow from the analysis above.

(iii). Observe first that f′(a; 0_E) = lim_{t↓0} t⁻¹[f(a + t0_E) − f(a)] = 0. For λ > 0 and v ∈ E,

f′(a; λv) = lim_{t↓0} t⁻¹[f(a + t(λv)) − f(a)] = λ lim_{t↓0} (tλ)⁻¹[f(a + tλv) − f(a)] = λ f′(a; v).


Now we prove that the directional derivative function v ↦ f′(a; v) is convex. Since we have already proved that it is positively homogeneous, its sublinearity then follows from Lemma 2.12.

For this, let v1, v2 ∈ E and r1, r2 ∈ R be such that f′(a; vi) < ri. By (i), there exists, for each i = 1, 2, some ti > 0 such that ti⁻¹[f(a + tivi) − f(a)] < ri. Set s := min{t1, t2} > 0 and let λ1, λ2 ∈ (0, 1) with λ1 + λ2 = 1. The nondecreasing property of the function t ↦ t⁻¹[f(a + tv) − f(a)] gives the inequalities

s⁻¹[f(a + svi) − f(a)] < ri, i = 1, 2.

So, by the convexity of the function v ↦ s⁻¹[f(a + sv) − f(a)], we obtain

s⁻¹[f(a + s(λ1v1 + λ2v2)) − f(a)] ≤ λ1 s⁻¹[f(a + sv1) − f(a)] + λ2 s⁻¹[f(a + sv2) − f(a)] < λ1r1 + λ2r2.

Using (i) again, we get

f′(a; λ1v1 + λ2v2) < λ1r1 + λ2r2.

Letting ri ↓ f′(a; vi) shows that the function v ↦ f′(a; v) is convex.

(iv). If f′(a; v) = +∞ or f′(a; −v) = +∞, the inequality in (iv) is trivial. So suppose that −∞ ≤ f′(a; v) < +∞ and −∞ ≤ f′(a; −v) < +∞. Let r, s ∈ R be such that f′(a; v) < r and f′(a; −v) < s. Then by (iii) we have

0 = f′(a; 0_E) = f′(a; v + (−v)) ≤ f′(a; v) + f′(a; −v) < r + s,

hence −s < r. Letting r ↓ f′(a; v) and s ↓ f′(a; −v), we obtain (iv), as desired.

(v). Assertion (v) is a direct consequence of the definition of Gâteaux differentiability.


Remark 2.14 The directional derivative may take the value −∞ at some points, even for a function taking values in (−∞, +∞]. For example, for f : R → R ∪ {+∞} with f(x) = −√x if x ≥ 0 and f(x) = +∞ if x < 0, the function f is convex, finite at 0, and

f′(0; v) = −∞ if v > 0;  f′(0; v) = 0 if v = 0;  f′(0; v) = +∞ if v < 0.

Let us now study the properties of the function v ↦ f′(a; v) when f is finite and continuous at a. We start with the following lemma.

Lemma 2.15 Let E be a normed linear space, f : E → R ∪ {+∞} be a convex function and a ∈ E such that f(a) is finite and f is continuous at a. Then f is Lipschitzian around a, that is, there exist r > 0 and γ > 0 such that f is γ-Lipschitzian on B(a, r), i.e.,

|f(x) − f(y)| ≤ γkx − yk ∀ x, y ∈ B(a, r).

Proof. From the continuity of f at a, there exist positive real numbers r, M such that

|f(x)| ≤ M, ∀ x ∈ B̄(a, 2r).


Let x, y ∈ B(a, r) with x ≠ y. Define α = kx − yk and z = y + (r/α)(y − x). We have

kz − ak ≤ ky − ak + (r/α)ky − xk ≤ 2r.

So z ∈ B̄(a, 2r) and hence |f(z)| ≤ M. We also have y = (α/(α+r)) z + (r/(α+r)) x. Therefore, from the convexity of f, it follows that

f(y) ≤ (α/(α+r)) f(z) + (r/(α+r)) f(x),

which implies that

f(y) − f(x) ≤ (α/(α+r)) (f(z) − f(x)) ≤ (α/r)(M + M) = (2M/r) kx − yk.


Interchanging x and y gives

f(x) − f(y) ≤ (2M/r) kx − yk.

Therefore,

|f(y) − f(x)| ≤ γkx − yk, with γ := 2M/r.

Hence f is γ-Lipschitzian on B(a, r).

Theorem 2.16 Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a ∈ E and continuous at a. Then the directional derivative f′(a; v) is finite for all v ∈ E. Furthermore, for some γ, the function f is γ-Lipschitzian around a and f′(a; ·) is finite and γ-Lipschitzian on E, i.e.,

|f′(a; v1) − f′(a; v2)| ≤ γkv1 − v2k ∀ v1, v2 ∈ E.

In particular, we have

|f′(a; v)| ≤ γkvk ∀ v ∈ E.

Proof. Using the continuity of f at a and Lemma 2.15, there exist r > 0 and γ > 0 such that

|f(y) − f(x)| ≤ γkx − yk ∀ x, y ∈ B(a, r).

For v ∈ E, let t > 0 be small enough that a + tv ∈ B(a, r). Then |f(a + tv) − f(a)| ≤ tγkvk. This implies that f′(a; v) is finite.

Now let v1, v2 ∈ E. Using the Lipschitz property of f around a and the definition of f′(a; v), it follows that

|f′(a; v1) − f′(a; v2)| ≤ γkv1 − v2k.


2.3 Nonvacuity of the subdifferential


The next theorem continues with the analysis of the subdifferential of f at a when f is
finite and continuous at a.

Theorem 2.17 (Nonvacuity of the subdifferential) Let f : E → R̄ be a convex function which is finite and continuous at a. Then the following properties hold.

(a) ∂f(a) ≠ ∅ and ∂f(a) is w∗-compact.

(b) For every v ∈ E, we have

f′(a; v) = max {hx∗, vi : x∗ ∈ ∂f(a)} and {hx∗, vi : x∗ ∈ ∂f(a)} = [−f′(a; −v), f′(a; v)].

(c) If γ ≥ 0 is a Lipschitz constant for f near a, then

∂f(a) ⊂ γB_{E∗},

where B_{E∗} is the closed unit ball of E∗ centered at 0_{E∗} with respect to the dual norm k · k∗.

Proof. Let p : E → R be the functional defined by p(v) := f′(a; v) for all v ∈ E. By (iii) of Theorem 2.13, p is sublinear. Let w ∈ E be a nonzero vector of E (if E = {0_E}, everything is trivial). For F := Rw = {sw : s ∈ R}, consider the function ϕ : F → R defined by

ϕ(sw) = s f′(a; w), ∀ s ∈ R.

The function ϕ is obviously linear over the real vector subspace F of E. Further,

ϕ(sw) = s f′(a; w) = f′(a; sw) for s ≥ 0,

and for s < 0, using (iii) and (iv) of Theorem 2.13, we have

ϕ(sw) = s f′(a; w) = −(−s) f′(a; w) = −f′(a; −sw) ≤ f′(a; sw).

So ϕ(sw) ≤ f′(a; sw) = p(sw) for all s ∈ R. According to the (analytical) Hahn–Banach extension theorem, we can extend ϕ to a linear functional l : E → R satisfying

l(v) ≤ p(v) ∀ v ∈ E.    (2.10)


Since the sublinear functional p = f′(a; ·) is finite and continuous on E (see Theorem 2.16), the linear functional l is continuous, i.e., l ∈ E∗. Then inequality (2.10) and (ii) of Theorem 2.13 tell us that l ∈ ∂f(a), which establishes the nonvacuity of ∂f(a).

Assertion (ii) of Theorem 2.13 and inequality (2.10) again assure that

f′(a; v) = max {hx∗, vi : x∗ ∈ ∂f(a)}.

It is not difficult to deduce from this equality that {hx∗, vi : x∗ ∈ ∂f(a)} = [−f′(a; −v), f′(a; v)], which implies, in particular, that ∂f(a) is bounded in E∗. Further, it is not hard to see from (ii) of Theorem 2.13 that ∂f(a) is w∗-closed. The Banach–Alaoglu–Bourbaki theorem, the w∗-closedness and the boundedness of ∂f(a) ensure its w∗-compactness.

Finally, (ii) of Theorem 2.13 and Theorem 2.16 imply that for all x∗ ∈ ∂f(a),

hx∗, vi ≤ f′(a; v) ≤ γkvk ∀ v ∈ E.

Hence kx∗k∗ ≤ γ, which means ∂f(a) ⊂ γB_{E∗}. The proof of the theorem is then complete.
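Assertion (b) of Theorem 2.17 can be illustrated on f(x) = |x| at a = 0, where ∂f(0) = [−1, 1] by Example 2.7. A minimal Python sketch (the finite-difference step t = 10⁻⁸ is an arbitrary small choice) compares the directional derivative with the support function of the subdifferential:

# Theorem 2.17(b), illustrated for f(x) = |x| at a = 0: the directional
# derivative f'(0; v) = |v| equals max{ x* v : x* in ∂f(0) = [-1, 1] },
# and that maximum over the interval is attained at an endpoint.
f = lambda x: abs(x)
dir_deriv = lambda a, v, t=1e-8: (f(a + t * v) - f(a)) / t   # small t > 0

for v in (-2.0, -0.5, 0.0, 1.0, 3.0):
    support = max(x_star * v for x_star in (-1.0, 1.0))      # max over ∂f(0)
    print(v, dir_deriv(0.0, v), support)                     # both equal |v|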



CHAPTER 3

Convex minimization problems

Contents

3.1 Preliminaries
3.2 Optimality conditions
3.3 Ekeland variational principle and applications
3.4 Selected exercises

3.1 Preliminaries

3.1.1 Normal cone to a convex set


Let C be a convex subset of a normed linear space and a ∈ C. Using the definition of the indicator function ψC, it is readily seen that a continuous linear functional x∗ belongs to ∂ψC(a) if and only if

hx∗, x − ai ≤ 0 for all x ∈ C.    (3.1)

Definition 3.1 Let C be a convex subset of a normed linear space E and a ∈ C. The cone ∂ψC(a), given by the subdifferential of the indicator function ψC at the point a, is called the normal cone to C at a. It is generally denoted by NC(a) or N(C; a), i.e.,

NC(a) := ∂ψC(a) = {x∗ ∈ E∗ : hx∗, x − ai ≤ 0 ∀ x ∈ C}.

It follows from the definition and the previous section that NC(a) is a w∗-closed convex cone with 0 ∈ NC(a). Any element of NC(a) is called a normal functional (or normal vector, if E is a finite dimensional space or a Hilbert space) to C at a. When a ∉ C, we put NC(a) = ∅.

3.1.2 Tangent cone


In the previous section, the concept of normal vector was extended to convex subsets of a normed linear space E (which are not required to be smooth submanifolds). Following Bouligand's approach, the notion of tangent vector can also be extended to any subset of E as follows.

Definition 3.2 Let C be a subset of a normed linear space E and a ∈ C. A vector v ∈ E is said to be a tangent vector (in the sense of Bouligand) to C at a if there exist a sequence {xn} of points of C and a sequence of positive real numbers tn ↓ 0 such that v = lim_{n→∞} tn⁻¹(xn − a). The set of all such vectors v is readily seen to be a cone. It is called the tangent cone to C at a and is denoted by TC(a) or T(C; a).

Remark 3.3 We always have 0 ∈ TC(a), and any sequence {xn} as above converges to a.

The following theorem provides some properties of the tangent cone, with particular attention to the case of convex sets.

Theorem 3.4 Let C be a subset of a normed linear space and a ∈ C. The following hold.

(a) The tangent cone TC(a) is a closed cone of E (not necessarily convex).

(b) If in addition C is convex, then TC(a) = {v ∈ E : d′C(a; v) = 0}, where dC : E → R is the distance function associated to C, defined by

dC(x) := inf{d(x, y) : y ∈ C}.

Further, the tangent cone TC(a) is then convex and TC(a) = Cl_E[R₊(C − a)].

Proof. (a). Let {vk}k∈N be a sequence of TC(a) that converges to v ∈ E. For each k ∈ N, choose a sequence (x_n^k)n∈N in C converging to a and a sequence of positive real numbers t_n^k with t_n^k → 0 as n → ∞ such that vk = lim_{n→∞} (t_n^k)⁻¹(x_n^k − a). Therefore, there exists an increasing sequence of natural numbers nk such that

sk := t_{nk}^k < 1/k and k sk⁻¹(x_{nk}^k − a) − vk k < 1/k.

Observing that sk ↓ 0 and v = lim_{k→∞} sk⁻¹(x_{nk}^k − a), it follows that v ∈ TC(a), according to the definition of TC(a). So TC(a) is closed.

(b). Let v ∈ TC(a). Choose a sequence (xn)n∈N in C converging to a and a sequence of positive real numbers tn ↓ 0 such that v = lim_{n→∞} tn⁻¹(xn − a). Setting vn := tn⁻¹(xn − a), we have a + tn vn ∈ C and vn → v. According to the 1-Lipschitz property of the distance function dC, and since dC(a) = dC(a + tn vn) = 0, we have

(dC(a + tn v) − dC(a))/tn ≤ kvn − vk + (dC(a + tn vn) − dC(a))/tn = kvn − vk.

So,

(dC(a + tn v) − dC(a))/tn → 0, as n → ∞.

Since we know that d′C(a; v) = lim_{t↓0} (dC(a + tv) − dC(a))/t exists in R̄ (dC is convex because C is), we deduce that d′C(a; v) = 0.

Suppose now that d′C(a; v) = 0. Fix a sequence (tn) of positive real numbers such that tn ↓ 0. For each n, choose xn ∈ C such that

ka + tn v − xnk < dC(a + tn v) + tn².

Then,

kv − tn⁻¹(xn − a)k ≤ tn⁻¹ dC(a + tn v) + tn = tn⁻¹[dC(a + tn v) − dC(a)] + tn,

and hence v = lim_{n→∞} tn⁻¹(xn − a). So v ∈ TC(a). We have thus established that

TC(a) = {v ∈ E : d′C(a; v) = 0}.    (3.2)


Observing that d′C(a; v) ≥ 0 for all v ∈ E, we can write (3.2) in the form

TC(a) = {v ∈ E : d′C(a; v) ≤ 0},

which, through the convexity of the function d′C(a; ·) (see Theorem 2.13), ensures the convexity of the set TC(a).
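Theorem 3.4(b) can be probed numerically when the distance to C is computable. A hedged Python sketch, assuming the convex box C = [0, 1]² (whose distance function comes from coordinatewise clipping) and the corner a = (0, 0): a direction v is tangent exactly when the difference quotient of dC vanishes, i.e., here, when v ≥ 0 componentwise.

import numpy as np

# Theorem 3.4(b), illustrated: for the convex box C = [0, 1]^2 and the corner
# a = (0, 0), v is tangent iff d_C'(a; v) = 0, i.e. iff v >= 0 componentwise.
proj = lambda x: np.clip(x, 0.0, 1.0)            # projection onto C
d_C = lambda x: np.linalg.norm(x - proj(x))      # distance to C

a = np.array([0.0, 0.0])
for v in (np.array([1.0, 2.0]), np.array([1.0, -1.0])):
    t = 1e-8
    print(v, (d_C(a + t * v) - d_C(a)) / t)   # ~0 for v >= 0, > 0 otherwise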

3.2 Optimality conditions


The above tangent cone concept allows us to provide a necessary and sufficient optimality condition for convex minimization problems with a set constraint.

Consider the following minimization problem:

(P)  min f(x) subject to x ∈ C,

where f : E → R ∪ {+∞} is the objective (or cost) function and C ⊂ E is the set constraint. We say that a ∈ E is a solution of (P) if a ∈ C and f(a) = inf_{x∈C} f(x). When the function f and the set C are convex, we say that (P) is a convex minimization problem.

Theorem 3.5 (Optimality condition through the tangent cone) Assume that E is a normed linear space and (P) is a convex minimization problem with f finite and continuous at a ∈ C. Then the point a is a solution of (P) if and only if

f′(a; v) ≥ 0 for all v ∈ TC(a).    (3.3)

Proof. Suppose that the point a is a solution of (P). Let v ∈ TC(a). By the definition of TC(a), there exist a sequence of positive real numbers tn ↓ 0 and a sequence {vn} in E with vn → v such that a + tn vn ∈ C for all n. On the other hand, we know that f is γ-Lipschitz around a for some γ > 0, i.e., there exists an open neighborhood U of a such that

|f(x) − f(y)| ≤ γkx − yk for all x, y ∈ U.    (3.4)


Since a + tn vn → a and a + tn v → a, we can choose an integer N ∈ N such that a + tn vn, a + tn v ∈ U for all n ≥ N. Then for each n ≥ N, we have

f(a) ≤ f(a + tn vn) because a + tn vn ∈ C.

Hence,

(f(a + tn v) − f(a))/tn = (f(a + tn v) − f(a + tn vn))/tn + (f(a + tn vn) − f(a))/tn ≥ (f(a + tn v) − f(a + tn vn))/tn.

From (3.4) it follows that

(f(a + tn v) − f(a + tn vn))/tn ≥ −γkvn − vk.

So,

(f(a + tn v) − f(a))/tn ≥ −γkvn − vk.

Letting n → ∞ on both sides gives f′(a; v) ≥ 0, which proves the necessary optimality condition.

Conversely, assume that f′(a; v) ≥ 0 for all v ∈ TC(a). Let x ∈ C. By Theorem 3.4(b), we have x − a ∈ TC(a), and hence f′(a; x − a) ≥ 0 according to our assumption. Since

f′(a; x − a) = inf_{t>0} (f(a + t(x − a)) − f(a))/t ≤ f(a + (x − a)) − f(a) = f(x) − f(a),

we obtain f(x) − f(a) ≥ 0. This being true for all x ∈ C, we conclude that a is a solution of (P). This completes the proof of the theorem.

Corollary 3.6 (Optimality condition: differential form) Assume that E is a normed linear space and (P) is a convex minimization problem with f finite and Gâteaux differentiable at a ∈ C. Then the point a is a solution of (P) if and only if

DG f(a) · v ≥ 0 for all v ∈ TC(a).    (3.5)

If in addition a is an interior point of C, then the variational inequality (3.5) reduces to Euler's equation:

DG f(a) · v = 0 for all v ∈ E, i.e., DG f(a) = 0.    (3.6)
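A small numerical check of Corollary 3.6 (a hypothetical example, not from the notes: E = R², f(x) = kx − ck² minimized over the closed unit ball C, whose solution is a = c/kck when kck > 1):

import numpy as np

# Corollary 3.6, illustrated: minimize f(x) = ||x - c||^2 over the closed
# unit ball C. For ||c|| > 1 the solution is a = c/||c||, and it satisfies
# the variational inequality Df(a).(v - a) >= 0 for all v in C.
rng = np.random.default_rng(2)
c = np.array([2.0, 1.0])
a = c / np.linalg.norm(c)                   # minimizer on the unit ball
grad = 2.0 * (a - c)                        # Df(a)

vs = rng.normal(size=(10_000, 2))
vs = vs / np.maximum(1.0, np.linalg.norm(vs, axis=1, keepdims=True))  # v in C
print(np.min(vs @ grad - grad @ a))         # >= 0 up to rounding error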


3.3 Ekeland variational principle and applications


Since its appearance in 1972, Ekeland's variational principle has found many applications in different fields of Analysis. The best references are by Ekeland himself: his survey article [**] and his book with J.-P. Aubin [*]. Not all material presented here appears in those places.

Ivar Ekeland: "...it is hoped that every mathematician will find something to enjoy from the Variational Principle."

Motivation. We know that an extended real valued lower semicontinuous function attains its infimum over a compact space, and that without some compactness assumption such a property in general does not hold. However, the next theorem shows that for a lower semicontinuous function f : V → R ∪ {+∞} bounded from below over a complete metric space V, there exists a function, close in a certain sense to f, which achieves its minimum on V.

Theorem 3.7 (Ekeland variational principle) [strong statement]

Let (V, d) be a complete metric space, and F : V → R ∪ {+∞} be a proper lower semicontinuous function, bounded from below. Let ε > 0 be given, and a point u ∈ V such that

F(u) ≤ inf_V F + ε.

Then there exists some point v ∈ V such that

F(v) ≤ F(u);    (3.7)
d(u, v) ≤ 1;    (3.8)
F(v) < F(w) + εd(v, w) ∀ w ≠ v.    (3.9)

Inequality (3.9) tells us that v is a strict minimum of the function w ↦ F(w) + εd(w, v).

Proof. Let us define inductively a sequence {un}, starting with u0 = u. Suppose un ∈ V is known. Then either

(a) ∀ w ≠ un, F(un) < F(w) + εd(un, w); in this case set un+1 = un; or

(b) ∃ w ≠ un : F(un) ≥ F(w) + εd(un, w). Let Sn be the set of all such w ∈ V. Then choose un+1 ∈ Sn such that

F(un+1) − inf_{Sn} F ≤ (1/2)[F(un) − inf_{Sn} F].    (3.10)

Claim. The sequence {un} is Cauchy. Indeed, if case (a) ever occurs, the sequence {un} is stationary from that index on. If not, we have the inequalities

εd(un, un+1) ≤ F(un) − F(un+1), ∀ n ∈ N.

Adding them up, we get

εd(un, up) ≤ F(un) − F(up), ∀ n ≤ p.    (3.11)

Observing that the sequence {F(un)} is nonincreasing and bounded from below, it follows that it is convergent. So the right-hand side goes to zero as n, p → ∞. Therefore {un} is Cauchy. Since the space is complete, un converges to some v ∈ V.

Claim. The point v satisfies (3.7), (3.8) and (3.9). Inequality (3.7) follows from the nonincreasing property of the sequence {F(un)} and the lower semicontinuity of F: F(v) ≤ lim inf F(un) ≤ F(u0) = F(u).

In inequality (3.11), taking n = 0, we obtain the estimates

εd(u, up) ≤ F(u) − F(up) ≤ F(u) − inf_V F ≤ ε, by assumption.

Letting p → +∞ gives inequality (3.8).

The proof of (3.9) is by contradiction. Assume that (3.9) is not true. Then there exists some w ∈ V with w ≠ v such that

F(w) ≤ F(v) − εd(v, w).    (3.12)


Letting p → ∞ in (3.11), we obtain εd(un, v) ≤ F(un) − l, where l := lim_n F(un). Since F is lower semicontinuous, F(v) ≤ l, so (3.12) and the triangle inequality give

F(w) ≤ F(v) − εd(v, w) ≤ l − εd(v, w) ≤ F(un) − εd(un, v) − εd(v, w) ≤ F(un) − εd(un, w),

and hence w ∈ Sn for all n. But relation (3.10) can be written as

2F(un+1) − F(un) ≤ inf_{Sn} F ≤ F(w).

Letting n → ∞ gives l ≤ F(w). Finally, F(v) ≤ l ≤ F(w) ≤ F(v) − εd(v, w) < F(v), a contradiction.

Corollary 3.8 Under the same setting as Theorem 3.7, let ε > 0 and u ∈ V be such that

F(u) ≤ inf_V F + ε.

Then for every λ > 0, there exists some point v ∈ V such that

F(v) ≤ F(u);    (3.13)
d(u, v) ≤ λ;    (3.14)
F(v) < F(w) + (ε/λ) d(v, w) ∀ w ≠ v.    (3.15)

Proof. For λ > 0, define on V the distance dλ := d/λ and apply Theorem 3.7 with dλ in place of d.

We immediately deduce from Theorem 3.7 the following.

Theorem 3.9 (Ekeland variational principle) [weak statement]

Let (V, d) be a complete metric space, and F : V → R ∪ {+∞} be a proper lower semicontinuous function, bounded from below. For any ε > 0, there exists some point v ∈ V such that

F(v) ≤ inf_V F + ε;    (3.16)
F(v) ≤ F(w) + εd(v, w) ∀ w ∈ V.    (3.17)


* A point v given by Theorem 3.9 is called an ε-minimizer of F.

This relies on the fact that there always is some point u ∈ V with F(u) ≤ inf_V F + ε. Inequality (3.16) then proceeds from (3.7), and (3.17) from (3.9). Theorem 3.7 is certainly stronger than Theorem 3.9. The main difference lies in inequality (3.8), which gives the whereabouts of the point v ∈ V and which has no counterpart in Theorem 3.9.

3.3.1 Application to optimization


This really was the motivation which led Ekeland to prove Theorem 3.7.

Theorem 3.10 Let E be a real Banach space, and F : E → R ∪ {+∞} be a lower semicontinuous function, Gâteaux differentiable and such that

−∞ < inf_E F < +∞.

Then for every ε > 0, every u ∈ E such that F(u) ≤ inf_E F + ε and every λ > 0, there exists some point v ∈ E such that

F(v) ≤ F(u);    (3.18)
kv − uk ≤ λ;    (3.19)
kF′(v)k∗ ≤ ε/λ.    (3.20)

Proof. It is a straightforward application of Corollary 3.8. Inequality (3.15) gives, for every w ∈ E and every t > 0,

F(v + tw) ≥ F(v) − (ε/λ) t kwk,

i.e.,

(F(v + tw) − F(v))/t ≥ −(ε/λ) kwk.

Letting t ↓ 0, we obtain

hF′(v), wi ≥ −(ε/λ) kwk.    (3.21)


Inequality (3.21), holding for every w ∈ E, means that

kF′(v)k∗ ≤ ε/λ.

Corollary 3.11 Let E be a real Banach space, and F : E → R ∪ {+∞} be a lower semicontinuous function, Gâteaux differentiable and such that

−∞ < inf_E F < +∞.

Then for every ε > 0, there exists vε ∈ E such that

F(vε) − inf_E F ≤ ε²;    (3.22)
kF′(vε)k∗ ≤ ε.    (3.23)

Proof. Just take ε² instead of ε and ε instead of λ in Theorem 3.10.

Remark 3.12 We can view Corollary 3.11 as telling us that the equation F′(v) = 0, although it need have no solution, always has approximate solutions; i.e., there exists a sequence {un} such that kF′(un)k∗ → 0 as n → ∞. The cluster points of such sequences have been intensively studied.

Any point v which minimizes F over E satisfies (3.22) and (3.23) with ε = 0. On the other hand, there might not be any such point: the usual conditions ensuring the existence of a minimizer are quite stringent (F should be convex, have bounded level sets or be coercive, and E should be reflexive). What Corollary 3.11 does, even in the absence of an exact minimum, is to provide us with points which almost minimize F and almost satisfy the first-order necessary condition. In other words, the equations F(v) = inf_E F and F′(v) = 0 can be satisfied to within any prescribed ε > 0.
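The phenomenon described in Remark 3.12 is easy to observe. A minimal Python sketch, assuming the hypothetical function F(x) = e⁻ˣ on R (bounded below with inf F = 0 but no minimizer): plain gradient descent produces points vε that almost minimize F and almost satisfy F′(vε) = 0, exactly as Corollary 3.11 predicts.

import numpy as np

# Corollary 3.11, illustrated: F(x) = exp(-x) on R is bounded below
# (inf F = 0) with no minimizer, yet there are points v_eps with
# F(v_eps) - inf F small and |F'(v_eps)| small.
F = lambda x: np.exp(-x)
dF = lambda x: -np.exp(-x)

x = 0.0
for _ in range(2000):          # plain gradient descent, step size 1.0
    x -= 1.0 * dF(x)
print(F(x), abs(dF(x)))        # both tend to 0 as iterations grow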

3.3.2 Existence of minimizers


Let us start with the following definition of the so-called Palais–Smale ((PS) for short) condition.

Definition 3.13 Let E be a Banach space. We say that a C¹-function F : E → R satisfies the Palais–Smale (PS) condition if every sequence (xn) in E such that (F(xn)) is bounded and lim F′(xn) = 0 in E∗ has a convergent subsequence.

Theorem 3.14 Let E be a Banach space and F : E → R be a C¹-function bounded from below. Assume that F satisfies the (PS) condition. Then there exists some point u ∈ E such that

F(u) = min_{v∈E} F(v) and F′(u) = 0.

Proof. By Corollary 3.11, there exists a sequence (un) in E such that F(un) → inf_E F and F′(un) → 0. From our assumptions, F satisfies the (PS) condition, so (un) has a subsequence (unk) that converges to some point u ∈ E. By the continuity of F and F′, it follows that F(u) = lim F(unk) = inf_E F and F′(u) = lim F′(unk) = 0.

3.4 Selected exercises


Ex. 3.15 Let U be a nonempty convex open subset of a normed linear space E and f : E → R be a function. For x, y ∈ U with x ≠ y, define

Ixy := {s ∈ R : x + s(y − x) ∈ U} and hxy(s) := f(x + s(y − x)) ∀ s ∈ Ixy.

(a) Show that Ixy is an open interval of R.

(b) Show that f is convex on U if and only if for all x, y ∈ U with x ≠ y, hxy is convex on Ixy.

Ex. 3.16 Let p ≥ 1. Show that the function t ∈ R ↦ |t|^p is convex on R. Deduce that the function h(x1, . . . , xn) = ∑_{i=1}^n |xi|^p is convex on Rn.

Ex. 3.17 Let f : R² −→ R be defined by:

f(x, y) = x(x + y²), ∀ (x, y) ∈ R².

1. Show that U := {(x, y) ∈ R2 | x > y2 } is open and convex in R2 .


2. Show that the function f is strictly convex on U.


Ex. 3.18 1. In Rn with the Euclidean norm, show that

kλx + (1 − λ)yk2 = λkxk2 + (1 − λ)kyk2 − λ(1 − λ)kx − yk2 , ∀ x, y ∈ Rn .

2. Let f : Rn → Rn be a nonexpansive mapping, i.e.,

k f (x) − f (y)k ≤ kx − yk ∀ x, y ∈ Rn .

Show that the Fixed Point Set of f given by

Fix( f ) := {x ∈ Rn : f (x) = x}

is a closed convex subset of Rn .

Ex. 3.19 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Show that the function f : H → R defined by f (x) = kxk2 for all x ∈ H, is strictly convex
on H.

Ex. 3.20 Let E be a normed linear space, K be a nonempty subset of E and f : E → R


be a function.
1. Give the definition of a minimizing sequence of f on K.
2. Prove that a minimizing sequence of f on K always exists.
3. Prove that if K is bounded or f is coercive, then every minimizing sequence of f on K is bounded.

Ex. 3.21 Let E be a normed linear space and K be a nonempty subset of E. We define the function ψK : E → R ∪ {+∞} as follows:

ψK(x) = 0 if x ∈ K, +∞ if x ∉ K.

1. What is the epigraph of the function ψK?

2. Show that:

(a). The function ψK is convex if and only if K is convex.

(b). The function ψK is lower semicontinuous if and only if K is closed.


Ex. 3.22 Let E be a normed linear space, K be a nonempty convex subset of E. Let
f : E → R be a differentiable convex function. Show that a ∈ K is a minimizer of f on
K if and only if
D f (a) · (v − a) ≥ 0, ∀ v ∈ K.

* Hint: You may use the relation D f (a) · w = f 0 (a; w) for all w ∈ E.

Ex. 3.23 Let E be a normed linear space, K be a nonempty convex subset of E. Let
f : E → R be a convex function.
1. Show that the set of minimizers of f on K is a convex subset of K.
2. Show that if f is strictly convex, then this set contains at most one point.

Ex. 3.24 Let E be a normed linear space and S be a nonempty subset of E. Let f : S → R be a function which is γ-Lipschitz on S. For each x ∈ E, set

g(x) := sup_{y∈S} [f(y) − γ d(x, y)] and h(x) := inf_{y∈S} [f(y) + γ d(x, y)].

1. Show that g and h are real valued functions, i.e., g(x) and h(x) are finite.
2. Show that g(x) ≤ h(x) for all x ∈ E and that g(x) = h(x) = f(x) for all x ∈ S.
3. Show that the functions g and h are γ-Lipschitz on E.

Ex. 3.25 Let H be a real inner product space with norm k · k and inner product h·, ·i. Let x ∈ H and ϕ : H → R be a convex and lower semicontinuous function. Set

F(v) = (1/2)kv − xk² + ϕ(v), ∀ v ∈ H.

1. Show that there exist a ∈ H and α ∈ R such that ϕ(v) ≥ ha, vi + α for all v ∈ H.
* Hint: Take (ū, λ̄) ∈ H × R such that λ̄ < ϕ(ū) and use the geometric form of the Hahn–Banach theorem.
2. Deduce that F has a unique minimizer ux in H, characterized by the following variational inequality:

hux − x, v − uxi + ϕ(v) − ϕ(ux) ≥ 0, ∀ v ∈ H.    (3.24)

3. Let Proxϕ : H → H be the map defined by Proxϕ(x) = ux, where ux ∈ H is the unique point satisfying (3.24). Show that the map Proxϕ is Lipschitzian.


4. Let K be a nonempty closed convex subset of H. Assume that ϕ = ψK . Show that


Proxϕ is the orthogonal projection map onto K.
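Two classical instances of the map Proxϕ from Exercise 3.25, sketched in H = Rⁿ (these closed forms are standard facts, not derived in the notes): for ϕ = ψK with K the closed unit ball, Proxϕ is the orthogonal projection of item 4; for ϕ(v) = λkvk₁, the unique minimizer of (1/2)kv − xk² + ϕ(v) is the soft-thresholding of x.

import numpy as np

# Two classical prox maps in H = R^n (standard closed forms, sketched):
def prox_ball(x):
    # Prox of the indicator of the closed unit ball = orthogonal projection
    n = np.linalg.norm(x)
    return x if n <= 1.0 else x / n

def prox_l1(x, lam=1.0):
    # Prox of phi(v) = lam * ||v||_1 = componentwise soft-thresholding
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

x = np.array([2.0, -0.4, 1.2])
print(prox_ball(x), prox_l1(x))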

Ex. 3.26 Let E be a finite dimensional normed linear space and f : E → R be a lower semicontinuous function. Show that the following assertions are equivalent:

1. lim inf_{kxk→+∞} f(x)/kxk > −∞;

2. there exists r ∈ R such that r(1 + kxk) ≤ f(x) for all x ∈ E.

Ex. 3.27 Let H be a real Hilbert with norm k · k and inner product (·, ·) and A : H → H
be a bounded linear operator. Assume that there exists a positive real number α > 0
such that:
(Av, v) ≥ αkvk2 , ∀ v ∈ H.

1. Let b ∈ H and define


J(v) = (1/2)(Av, v) − (b, v), ∀ v ∈ H.
a. Show that J is strictly convex, continuous and coercive.

b. Deduce that there exists a unique u ∈ H such that J(u) = min_{v∈H} J(v).

2. Show that J 0 (u) = 0.


3. Assume that A is self-adjoint, i.e., (Av, w) = (v, Aw) for all v, w ∈ H. Show that u is
the unique solution of the equation Au = b.
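The following numerical sketch illustrates the conclusion (with the extra simplifying assumption that A is symmetric, so that J ′ (v) = Av − b; the matrix and vector are made up): gradient descent on J converges to the unique minimizer u, which solves Au = b.

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5.0 * np.eye(5)        # symmetric, with (Av, v) >= alpha ||v||^2
b = rng.standard_normal(5)

v = np.zeros(5)
step = 1.0 / np.linalg.norm(A, 2)    # safe step size: 1 / (largest eigenvalue)
for _ in range(5000):
    v -= step * (A @ v - b)          # gradient step, since J'(v) = Av - b

print(np.linalg.norm(A @ v - b))     # essentially 0: the minimizer solves Au = b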


Ex. 3.28 For n, m ≥ 1, let A be a n × n invertible matrix and B be a n × m matrix. Let f


be given in Rn . For each v ∈ Rm , let z = z(v) ∈ Rn be the unique solution of the linear
system:
Az(v) = f + Bv. (3.25)

Let J : Rm → R be the functional defined by:


J(v) = (1/2)kz(v)k2n + (α/2)kvk2m (3.26)


where α > 0, and k · kn and k · km denote the Euclidean norms of Rn and Rm respectively.
1. Show that the map v → z(v) is affine from Rm to Rn i.e., z(v) = y0 + l(v) where
l ∈ L (Rm , Rn ) and y0 ∈ Rn .
2. Deduce that J is strictly convex.
3. Let K be a nonempty closed convex subset of Rm . Show that J has a unique mini-
mizer on K.

Ex. 3.29 Let E be a normed linear space and C be a nonempty subset of E. Let dC :
E → R be the distance function associated to C defined by

dC (x) := d(x,C) = inf{d(x, y) : y ∈ C}.

1. Show that dC is 1-Lipschitz, i.e.,

|dC (x) − dC (y)| ≤ kx − yk ∀ x, y ∈ E.

2. Show that if C is convex, then dC is convex.


3. Let a ∈ C. Show that

∂dC (a) = NC (a) ∩ BE ∗ and NC (a) = R+ ∂dC (a).


Academic Session 2013/2014

Optimization: Tutorial-Set I
December 05, 2013.

- Instructor: Pr. N. Djitte [Gaston Berger University, Saint Louis, Senegal].


- Tutors: Dr. F.N. Diop, K. Saint-Cyr & El. Thiam [AIMS, Mbour, Senegal].

======================================

Ex. 3.30 Let p ≥ 1. Show that the function t ∈ R −→ |t| p is convex on R. Deduce that
the function h(x1 , . . . , xn ) = ∑_{i=1}^n |xi | p is convex on Rn .

Ex. 3.31 Let f : R2 −→ R be defined by:

f (x, y) = x(x + y2 ), ∀ (x, y) ∈ R2 .

1. Show that U := {(x, y) ∈ R2 | x > y2 } is open and convex in R2 .


2. Show that the function f is strictly convex on U.

Ex. 3.32 1. In Rn with the Euclidean norm, show that

kλx + (1 − λ)yk2 = λkxk2 + (1 − λ)kyk2 − λ(1 − λ)kx − yk2 , ∀ x, y ∈ Rn , ∀ λ ∈ [0, 1].

2. Let f : Rn → Rn be a nonexpansive mapping, i.e.,

k f (x) − f (y)k ≤ kx − yk ∀ x, y ∈ Rn .

Show that the Fixed Point Set of f given by

Fix( f ) := {x ∈ Rn : f (x) = x}

is a closed convex subset of Rn .

Ex. 3.33 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Show that the function f : H → R defined by f (x) = kxk2 for all x ∈ H, is strictly convex
on H.


Ex. 3.34 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Let u ∈ H and ϕ : H → R be a convex and lower semicontinuous function. Set

F(v) = (1/2)kv − uk2 + ϕ(v), ∀ v ∈ H.

1. Show that there exist a ∈ H and α ∈ R such that ϕ(v) ≥ ha, vi + α for all v ∈ H. [Hint:
you may use the geometric form of the Hahn-Banach Theorem].
2. Deduce that F has a unique minimizer in H.


Optimization: Tutorial-Set II
December 11, 2013.

- Instructor: Pr. N. Djitte [Gaston Berger University, Saint Louis, Senegal].


- Tutors: Dr. F.N. Diop, K. Saint-Cyr & El. Thiam [AIMS, Mbour, Senegal].

======================================

Ex. 3.35 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Let x ∈ H and ϕ : H → R be a convex and lower semicontinuous function. Let

F(v) = (1/2)kv − xk2 + ϕ(v), ∀ v ∈ H.

1. Show that there exist a ∈ H and α ∈ R such that ϕ(v) ≥ ha, vi + α for all v ∈ H.
* Hint: Take (ū, λ̄) ∈ H × R such that λ̄ < ϕ(ū) and use the geometric form of the
Hahn-Banach Theorem.
2. Deduce that F has a unique minimizer ux in H, characterized by the following
variational inequality:

hux − x, v − ux i + ϕ(v) − ϕ(ux ) ≥ 0, ∀ v ∈ H. (3.27)

3. Let Proxϕ : H → H be the map defined by Proxϕ (x) = ux where ux ∈ H is the unique
point satisfying (3.27). Show that the map Proxϕ is Lipschitzian.
4. Let K be a nonempty closed convex subset of H. Assume that ϕ = ψK . Show that
Proxϕ is the orthogonal projection map onto K.

Ex. 3.36 Let E be a normed linear space and C be a nonempty subset of E. Let dC :
E → R be the distance function associated to C defined by

dC (x) := d(x,C) = inf{d(x, y) : y ∈ C}.

1. Show that dC is 1-Lipschitz, i.e.,

|dC (x) − dC (y)| ≤ kx − yk ∀ x, y ∈ E.


2. Show that if C is convex, then dC is convex.


3. Let a ∈ C. Show that

∂dC (a) = NC (a) ∩ BE ∗ and NC (a) = R+ ∂dC (a).

Ex. 3.37 Let E be a normed linear space and S be a nonempty subset of E. Let f : S → R
be a function which is γ-Lipschitz on S. For each x ∈ E, set

g(x) := sup{ f (y) − γd(x, y) : y ∈ S} and h(x) := inf{ f (y) + γd(x, y) : y ∈ S}.

1. Show that g and h are real valued functions, i.e., g(x) and h(x) are finite.
2. Show that g(x) ≤ h(x) for all x ∈ E and that g(x) = h(x) = f (x) for all x ∈ S.
3. Show that the functions g and h are γ-Lipschitz on E.

CONVINCE YOURSELF

- THE PEDAGOGICAL TEAM

Academic Session 2013/2014

Optimization: Tutorial-Set III


December 18, 2013.

- Instructor: Pr. N. Djitte [Gaston Berger University, Saint Louis, Senegal].


- Tutors: Dr. F.N. Diop, K. Saint-Cyr & El. Thiam [AIMS, Mbour, Senegal].

======================================

Ex. 3.38 Find the subdifferentials of the following convex functions: f : R → R ∪ {+∞},
f (x) = 0 if |x| ≤ 1 and f (x) = +∞ otherwise; g : R → R ∪ {+∞}, g(x) = 1/x if 0 < x < 1
and g(x) = +∞ otherwise.

Ex. 3.39 Let E be a normed linear space and C be a nonempty subset of E. Let dC :
E → R be the distance function associated to C defined by

dC (x) := d(x,C) = inf{d(x, y) : y ∈ C}.


Assume that C is convex and let a ∈ C.


1. Show that
∂dC (a) = NC (a) ∩ BE ∗ .

2. Let v ∈ E be such that dC′ (a; v) = 0. Show that v ∈ TC (a).

Ex. 3.40 Let C be a nonempty subset of a normed linear space E, a ∈ C, and let {vk }k∈N
be a sequence of TC (a) that converges to v ∈ E. We know from the definition of TC (a)
that for each k ∈ N, there exist a sequence (x^k_n )n∈N in C converging to a and a sequence
of positive real numbers t^k_n with t^k_n → 0 as n → ∞, such that vk = lim_{n→∞} (t^k_n )^{−1} (x^k_n − a).
1. Show that there exists an increasing sequence of natural numbers nk such that

sk := t^k_{nk} < 1/k and k s^{−1}_k (x^k_{nk} − a) − vk k < 1/k.

2. Show that sk ↓ 0 and v = lim_{k→∞} s^{−1}_k (x^k_{nk} − a), and deduce that TC (a) is closed in E.
k→∞

Ex. 3.41 Let C be a nonempty convex subset of a normed linear space and a ∈ C. Show
that x − a ∈ TC (a) for all x ∈ C.

Definition (Legendre-Fenchel conjugate). Let E be a normed linear space and f : E →
R̄ be a function. The function f ∗ : E ∗ → R̄ defined for x∗ ∈ E ∗ by

f ∗ (x∗ ) := sup_{x∈E} [ hx∗ , xi − f (x) ]

is called the Legendre-Fenchel conjugate of the function f .
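As a first illustration of this definition (a worked example added here, not one of the exercises): take E = H a real Hilbert space and f (x) = (1/2)kxk2 . For fixed x∗ ∈ H, the function x 7→ hx∗ , xi − (1/2)kxk2 is concave and differentiable with gradient x∗ − x, which vanishes at x = x∗ . Hence

f ∗ (x∗ ) = hx∗ , x∗ i − (1/2)kx∗ k2 = (1/2)kx∗ k2 ,

so f coincides with its own conjugate (identifying H ∗ with H).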

Ex. 3.42 Let E be a normed linear space and f , g : E → R̄ be functions. Show that


a. The conjugate function f ∗ is convex and w∗ -lower semicontinuous;
b. f ∗ (x∗ ) = sup_{x∈dom( f )} [ hx∗ , xi − f (x) ];
c. the identity inf_E f = − f ∗ (0E ∗ ) holds; deduce that f is bounded from below if
and only if f ∗ (0E ∗ ) is finite;
d. the following inequality, known as the Fenchel inequality holds:

hx∗ , xi − f (x) ≤ f ∗ (x∗ ), ∀ x ∈ E, x∗ ∈ E ∗ ;

e. If f ≤ g then g∗ ≤ f ∗ ;
f. f ∗ takes the value −∞ somewhere if and only if f ≡ +∞ and in this case f ∗ ≡ −∞;
g. If f is convex then a∗ ∈ ∂ f (a) if and only if ha∗ , ai = f (a) + f ∗ (a∗ ).


Ex. 3.43 Let p, q ∈ (1, +∞) with 1/p + 1/q = 1.


a. Let ϕ : R → R with ϕ(t) = (1/p)|t| p for all t ∈ R. Show that ϕ∗ (t) = (1/q)|t|q for all t ∈ R.
b. Let now k · k p be the norm on Rn given by
kxk p = ( ∑_{i=1}^n |xi | p )^{1/p} .

What is the Legendre-Fenchel conjugate of the function (1/p)k · k pp ? Deduce through the Fenchel
inequality that for any x, y ∈ Rn

hx, yi ≤ (1/p)kxk pp + (1/q)kykqq .

CONVINCE YOURSELF

- THE PEDAGOGICAL TEAM



CHAPTER 4

Optimization problems with constraints

Contents
4.1 Optimization problems with equality constraints . . . . . . . . . . . 48

4.2 First order necessary conditions . . . . . . . . . . . . . . . . . . . . . 49

4.3 Optimization with inequality constraints . . . . . . . . . . . . . . . 53

4.1 Optimization problems with equality constraints


We call an optimization problem with equality constraints any problem of the follow-
ing type:

ext F(x), subject to ϕi (x) = 0, i = 1, · · · , m, (4.1)

where m ≤ n and F, ϕi , i = 1, · · · , m, are real valued functions defined on some open


subset Ω of Rn . The function F is called the objective (or cost) function and the set K
given by
K := {x ∈ Ω : ϕi (x) = 0, ∀ i = 1, · · · , m}

is the feasible set or constraint set.


4.2 First order necessary conditions


Definition 4.1 A point x̄ ∈ K is regular (or satisfies the qualification condition (QC)) if the
vectors ∇ϕ1 (x̄), . . . , ∇ϕm (x̄) are linearly independent, or equivalently, if the Jacobian matrix
of ϕ at x̄, Jϕ(x̄), has rank m, where ϕ is the vector function defined on Ω by

ϕ(x) := (ϕ1 (x), · · · , ϕm (x)).

Theorem 4.2 (Lagrange) Assume that F, ϕi (i = 1, · · · , m) : Ω → R are continuously dif-
ferentiable. If x̄ ∈ K is a local extremum of F on K satisfying the qualification condition (QC),
then there exists a unique vector λ̄ = (λ̄1 , . . . , λ̄m ) ∈ Rm such that (x̄, λ̄) satisfies the Lagrange
condition:

∇F(x̄) + ∑_{i=1}^m λ̄i ∇ϕi (x̄) = 0. (4.2)

+ The vector λ̄ is called the Lagrange multiplier associated to the extremum x̄.

Proof. We rectify the problem and use the implicit function theorem. The proof is
divided into two steps.

Step 1: Consequences of (QC). Since x̄ satisfies (QC), the Jacobian matrix of ϕ
at x̄, denoted by Dϕ(x̄), has rank m. Let {e1 , . . . , em } be the canonical basis of Rm . There
exist vectors h1 , . . . , hm in Rn such that

Dϕ(x̄)hi = ei , ∀ i = 1, . . . , m. (4.3)

Define the function H : Rn × Rm → Rm by:


H(x,t) = ϕ( x + ∑_{i=1}^m ti hi ), with t = (t1 , . . . ,tm ). (4.4)

The map H is C1 on Ω × Rm and further

H(x̄, 0) = 0, (4.5)
∂H/∂t (x̄, 0) = I. (4.6)


Therefore, from the implicit function theorem, there exists an open neighborhood U1
of x̄ and a function t : U1 → Rm of class C1 such that

t(x̄) = 0, (4.7)
H(x,t(x)) = 0, ∀ x ∈ U1 . (4.8)

Differentiating (4.8) at x = x̄ and using the definition of the hi , it follows that

Dϕ(x̄) · ( I + ∑_{i=1}^m hi Dti (x̄) ) = 0, (4.9)

or equivalently,
Dϕi (x̄) + Dti (x̄) = 0, i = 1, . . . , m. (4.10)

Step 2: Conclusion. Assume that x̄ is a local minimizer of F on K. Then there exists


an open neighborhood U2 of x̄ such that

F(x) ≥ F(x̄), ∀ x ∈ K ∩U2 . (4.11)

Set U3 = U1 ∩U2 . On one hand, we have

∀ x ∈ U3 ∩ K, F(x) ≥ F(x̄).

On the other hand, using the continuity of the map x → x + ∑_{i=1}^m ti (x)hi at x̄, there exists
an open neighborhood U4 of x̄ such that

x + ∑_{i=1}^m ti (x)hi ∈ U3 ∩ K, ∀ x ∈ U4 . (4.12)

Now, define the function γ : U4 → R by γ(x) = F( x + ∑_{i=1}^m ti (x)hi ). Then we have
γ(x̄) = F(x̄) and γ(x) ≥ γ(x̄) for all x ∈ U4 . Therefore x̄ is a minimizer of γ on U4 . So,

Dγ(x̄) = 0. (4.13)

or equivalently,
DF(x̄) · ( I + ∑_{i=1}^m hi Dti (x̄) ) = DF(x̄) + ∑_{i=1}^m [DF(x̄) · hi ] Dti (x̄) = 0. (4.14)


Using equation (4.10), we obtain


DF(x̄) − ∑_{i=1}^m [DF(x̄) · hi ] Dϕi (x̄) = 0. (4.15)

Finally, taking λ̄i = −DF(x̄) · hi finishes the proof. □

Definition 4.3 The Lagrangian associated to problem (4.1) is the function L : Rn × Rm → R
defined by:

L(x, λ) = F(x) + ∑_{i=1}^m λi ϕi (x). (4.16)

Using the Lagrangian, we have the following version of Theorem 4.2.

Theorem 4.4 (Lagrange, second form) Assume that F, ϕi (i = 1, · · · , m) : Ω → R are contin-
uously differentiable. If x̄ ∈ K is a local extremum of F on K satisfying the qualification
condition (QC), then there exists a unique vector λ̄ = (λ̄1 , . . . , λ̄m ) ∈ Rm such that:

∂L/∂x (x̄, λ̄) = 0. (4.17)
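As a worked illustration of Theorems 4.2 and 4.4 (an example added here for concreteness): minimize F(x, y) = x + y on K = {(x, y) ∈ R2 : ϕ(x, y) = x2 + y2 − 1 = 0}. Every point of K is regular, since ∇ϕ(x, y) = (2x, 2y) ≠ (0, 0) on K. The Lagrange condition ∇F(x̄) + λ̄ ∇ϕ(x̄) = 0 reads 1 + 2λ̄x̄ = 0 and 1 + 2λ̄ȳ = 0, so x̄ = ȳ = −1/(2λ̄), and the constraint then gives x̄ = ȳ = ±1/√2. At (−1/√2, −1/√2), with λ̄ = 1/√2, the function F attains its minimum −√2 on K; at (1/√2, 1/√2), with λ̄ = −1/√2, it attains its maximum √2.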

4.2.1 Second order sufficient conditions


Definition 4.5 Let x ∈ K. The tangent space to K at the point x is the subset of Rn denoted
by T (K, x) and defined by:

T (K, x) = {d ∈ Rn | (∇ϕi (x), d) = 0, i = 1, . . . , m}. (4.18)

Remark 4.6 If x satisfies (QC), then T (K, x) is a subspace of Rn of codimension m.

Theorem 4.7 (Sufficient Conditions for a Minimizer) Let x̄ ∈ K and λ̄ ∈ Rm be such that

Dx L(x̄, λ̄) = 0 and (D2x L(x̄, λ̄)d, d) > 0 for all d ∈ T (K, x̄), d ≠ 0. (4.19)

Then x̄ is a strict local minimizer of F on K.

Proof. The proof is by contradiction. Suppose that x̄ is not a strict local minimizer of
F on K. Then there exists a sequence {xn } in K satisfying:


1. xn → x̄,

2. xn ≠ x̄ ∀ n ≥ 1,

3. F(xn ) ≤ F(x̄) ∀ n ≥ 1.

For each n ≥ 1 define dn as follows:


dn = (xn − x̄) / kxn − x̄k. (4.20)
Then the sequence {dn } lies in the unit sphere, which is compact. Therefore it has a sub-
sequence, denoted again by {dn }, converging to some d ≠ 0.

Claim: d ∈ T (K, x̄).

Set
tn = kxn − x̄k ∀ n ≥ 1. (4.21)

We have xn = x̄ + tn dn . A Taylor expansion around x̄ gives: there exist a function ε,
defined in a neighborhood of x̄, and θ ∈ [0, 1] such that ε(x) → 0 as x → x̄ and

ϕ(xn ) = ϕ(x̄) + tn Dϕ(x̄) · dn + tn kdn k ε(x̄ + θtn dn ). (4.22)

But ϕ(xn ) = ϕ(x̄) = 0 because xn , x̄ ∈ K. Therefore, (4.22) implies that:

tn Dϕ(x̄) · dn + tn kdn k ε(x̄ + θtn dn ) = 0. (4.23)

Dividing by tn and letting n go to +∞, we obtain d ∈ T (K, x̄).


Again, a Taylor expansion of the function L(·, λ̄) around x̄ gives

L(xn , λ̄) = L(x̄, λ̄) + tn (Dx L(x̄, λ̄), dn ) + (tn2 /2)(D2x L(x̄, λ̄)dn , dn ) + tn2 kdn k2 εn , (4.24)

with εn → 0 as n → ∞. Further, we have L(xn , λ̄) = F(xn ), L(x̄, λ̄) = F(x̄) and Dx L(x̄, λ̄) = 0.
So, (4.24) implies:

(D2x L(x̄, λ̄)dn , dn ) + 2kdn k2 εn = (F(xn ) − F(x̄)) / (tn2 /2) ≤ 0, because F(xn ) ≤ F(x̄). (4.25)
Passing to the limit in n, we obtain:

(D2x L(x̄, λ̄)d, d) ≤ 0.

This contradicts the assumption (D2x L(x̄, λ̄)d, d) > 0. □
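Continuing the worked example given after Theorem 4.4 (again added for illustration): for F(x, y) = x + y on the unit circle, L(x, y, λ) = x + y + λ(x2 + y2 − 1) and D2x L(x̄, λ̄) = 2λ̄ I. At x̄ = (−1/√2, −1/√2), where λ̄ = 1/√2, the matrix D2x L(x̄, λ̄) = √2 I is positive definite on all of R2 , in particular on T (K, x̄) \ {0}. By Theorem 4.7, x̄ is a strict local minimizer of F on K, in agreement with the direct computation.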


4.3 Optimization with inequality constraints


Let F : O ⊂ Rn → R be a differentiable function on some open subset of Rn , and let K
be a subset of Rn defined by:

K := {x ∈ Rn | hi (x) = 0, i = 1, . . . , m; g j (x) ≤ 0, j = 1, . . . , p}.

Definition 4.8 A point x̄ ∈ K satisfies (QC) if the vectors ∇h1 (x̄), . . . , ∇hm (x̄), ∇g j (x̄), j ∈
I(x̄), are linearly independent, where I(x̄) := { j | g j (x̄) = 0}.

Theorem 4.9 (Karush-Kuhn-Tucker) We assume that the functions F, hi , g j are continu-
ously differentiable. If x̄ ∈ O ∩ K is a local minimizer of F on K, and if x̄ satisfies (QC), then
there exist λ̄ = (λ̄1 , . . . , λ̄m ) ∈ Rm and µ̄ = (µ̄1 , . . . , µ̄ p ) ∈ R p such that:

∇F(x̄) + ∑_{i=1}^m λ̄i ∇hi (x̄) + ∑_{j=1}^p µ̄ j ∇g j (x̄) = 0,
µ̄ j ≥ 0, j = 1, . . . , p, (4.26)
µ̄ j g j (x̄) = 0, j = 1, . . . , p.
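As a worked illustration of Theorem 4.9 (an example added here, not part of the original statement): minimize F(x, y) = x2 + y2 subject to the single inequality constraint g(x, y) = 1 − x − y ≤ 0 (no equality constraints). Since ∇g = (−1, −1) ≠ (0, 0), every feasible point satisfies (QC). The conditions (4.26) read (2x̄, 2ȳ) = µ̄ (1, 1), µ̄ ≥ 0 and µ̄ (1 − x̄ − ȳ) = 0. If µ̄ = 0 then (x̄, ȳ) = (0, 0), which violates g ≤ 0; hence µ̄ > 0, the constraint is active, x̄ = ȳ = µ̄/2 and x̄ + ȳ = 1, giving x̄ = ȳ = 1/2 and µ̄ = 1. Indeed, (1/2, 1/2) is the projection of the origin onto the half-plane {x + y ≥ 1}, hence the minimizer.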



CHAPTER 5

Calculus of variations

Contents
5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54

5.2 The Basic Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 55

5.3 Necessary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56

5.4 Sufficient conditions for solutions . . . . . . . . . . . . . . . . . . . . 60

5.5 Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61

5.1 Introduction
The calculus of variations gives us precise analytical techniques to answer questions
of the following type:

+ Find the shortest path (i.e., geodesic) between two given points on a surface.

+ Find the curve between two points in the plane that yields a surface of revolution
of minimum area when revolved around a given axis.


+ Find the curve along which a bead will slide (under the effect of gravity) in the
shortest time.

The calculus of variations is concerned with the problem of extremising functionals.
This problem is a generalisation of the problem of finding extrema of functions of sev-
eral variables: it is the problem of finding extrema of functions of an infinite number
of variables. In fact, these variables will themselves be functions, and we will be finding
extrema of functionals of such functions.

5.2 The Basic Problem


In calculus of variations one is given a fixed C2 -function L = L(t, x, y), for t ∈ [t0 ,t1 ] and
x, y ∈ R and the problem is to minimise (or maximise) the functional
J(x) = ∫_{t0}^{t1} L(t, x(t), x′ (t)) dt (5.1)

that acts on functions x : [t0 ,t1 ] → R satisfying

x(t0 ) = x0 , x(t1 ) = x1 . (5.2)

Here, x0 , x1 are given real numbers, and any C1 -function x = x(t) satisfying these two
boundary conditions is said to be admissible.

The problem above can be seen as an optimisation problem in infinitely many vari-
ables (one for each t ∈ [t0 ,t1 ]). But fortunately the possible minimising and maximis-
ing functions x(t) can be found among the solutions to a certain ordinary differential
equation. This is the content of the following section.


5.3 Necessary conditions

5.3.1 First Order Necessary Condition


Theorem 5.1 (Euler-Lagrange equation) For an admissible function x∗ (t) to maximise
or minimise

J(x) = ∫_{t0}^{t1} L(t, x(t), x′ (t)) dt, such that x(t0 ) = x0 , x(t1 ) = x1 , (5.3)

it is necessary that x∗ (t) solves the following ordinary differential equation, called the Euler-
Lagrange equation:

(d/dt) [ Ly (t, x(t), x′ (t)) ] = Lx (t, x(t), x′ (t)), x(t0 ) = x0 , x(t1 ) = x1 . (5.4)

To reveal the nature of the above differential equation, it is instructive to carry out
the t-differentiation using the chain rule (this is allowed at least if the solution x∗ (t) is a
C2 -function). In that case, it follows that

Lyy (t, x(t), x′ (t)) x′′ (t) + Lxy (t, x(t), x′ (t)) x′ (t) + Lty (t, x(t), x′ (t)) − Lx (t, x(t), x′ (t)) = 0.

This is a second order differential equation, which is said to be quasi-
linear because the coefficients depend on the solution and its lower order derivatives.
To illustrate the usefulness of Theorem 5.1, one can take the following simple example.

Example 5.2 Consider

J(x) = ∫_0^1 ( x(t)2 + x′ (t)2 ) dt, x(0) = 0, x(1) = e2 − 1.

Here, L(t, x, y) = x2 + y2 so that the Euler-Lagrange equation is given by


(d/dt)(2x′ ) − 2x = 2(x′′ − x) = 0.
Its general solution is given by the function x(t) = Aet + Be−t . Imposing the boundary
conditions, it is seen that x(t) is admissible precisely when A = e and B = −e. So, the only
candidate for a minimiser or maximiser is

x∗ (t) = e1+t − e1−t .


However, x∗ is not a maximiser (J(x) can be seen to take arbitrarily large values), but it
will follow later from sufficient conditions that x∗ is a minimiser.

Proof (of Theorem 5.1). The point of departure is to show the so-called fundamental
lemma of the calculus of variations:

Lemma 5.3 Let f : [t0 ,t1 ] → R be a continuous function satisfying

∫_{t0}^{t1} f (t)ϕ(t) dt = 0

for every C2 -function ϕ on [t0 ,t1 ] such that ϕ(t0 ) = ϕ(t1 ) = 0. Then f (t) = 0 for every
t ∈ [t0 ,t1 ].

Proof. Suppose there exists s ∈ ]t0 ,t1 [ such that f (s) ≠ 0, say f (s) > 0. By conti-
nuity, there is some interval I := ]a, b[, with t0 < a < b < t1 , on which f (t) > 0. On I,
define ϕ(t) = (t − a)3 (b − t)3 > 0, and let ϕ(t) = 0 outside I. Then ϕ is C2 and
0 = ∫_{t0}^{t1} f (t)ϕ(t) dt > 0, which is a contradiction. Hence f (t) = 0 for every t.

Lemma 5.3 will be used by forming a so-called variation of the given function

xα (t) = x∗ (t) + αϕ(t).

Here, α ∈ R is just a parameter, while ϕ is an arbitrary C2 -function on [t0 ,t1 ] satisfying


ϕ(t0 ) = ϕ(t1 ) = 0. Clearly, xα (t) is admissible for every α.

As a convenient notation, let


I(α) := J(xα ) = ∫_{t0}^{t1} L(t, xα (t), xα′ (t)) dt.

For simplicity, the proof continues with the case of a minimum at x∗ (the case of max-
imum is similar). This means that

I(α) ≥ I(0) for all α.

Thus
I ′ (0) = 0.


This is, of course, under the assumption that α → I(α) may be differentiated under the
integral sign above (this will be proved later). Proceeding from this, one arrives at

I ′ (0) = ∫_{t0}^{t1} [ Lx (t, x∗ , x∗′ ) ϕ(t) + Ly (t, x∗ , x∗′ ) ϕ′ (t) ] dt.

Since ϕ vanishes at the end points, an integration by parts gives

0 = I ′ (0) = ∫_{t0}^{t1} [ Lx (t, x∗ , x∗′ ) − (d/dt) Ly (t, x∗ , x∗′ ) ] ϕ(t) dt. (5.5)

Therefore, it follows from Lemma 5.3 that

Lx (t, x∗ , x∗′ ) − (d/dt) Ly (t, x∗ , x∗′ ) = 0, for all t.

This means that the Euler-Lagrange equation is satisfied by x∗ (t). Now it only remains
to prove the next lemma.
This means that the Euler-Lagrange equation is satisfied by x∗ (t). Now it only remains
to prove the next lemma.

Lemma 5.4 The functional α → I(α) is differentiable at α = 0 and

I ′ (0) = ∫_{t0}^{t1} [ Lx (t, x∗ , x∗′ ) ϕ(t) + Ly (t, x∗ , x∗′ ) ϕ′ (t) ] dt.

5.3.2 Second Order Necessary Condition


Sometimes, the following necessary condition is also useful.

Theorem 5.5 (Legendre) If L is a C2 -function, it is a necessary condition for the functional

J(x) = ∫_{t0}^{t1} L(t, x(t), x′ (t)) dt

to have an extreme value at an admissible function x∗ (t) that

Lyy (t, x∗ (t), x∗′ (t)) ≤ 0 in the case where x∗ is a maximizer; (5.6)
Lyy (t, x∗ (t), x∗′ (t)) ≥ 0 in the case where x∗ is a minimizer. (5.7)

These inequalities are required to hold for all t ∈ [t0 ,t1 ].

The Legendre condition is often useful when one wants to show that a solution can-
didate is not a maximizer (or a minimizer). This is elucidated by the next example.


Example 5.6 In the above example, where

J(x) = ∫_0^1 ( x(t)2 + x′ (t)2 ) dt,

one finds at once that Lyy = 2 > 0, and this rules out that the admissible function
x∗ (t) = e1+t − e1−t be a maximizer. But it is still open whether x∗ is a minimizer or not:
this cannot be concluded from the fact that Lyy > 0 alone, since the Legendre condition
is only necessary.

One of the possible complications in practice is that there are, for good reasons, further
constraints on the admissible functions. This can for example lead us to the problem
of finding a C1 -function x : [t0 ,t1 ] → R such that

max ∫_{t0}^{t1} L(t, x, x′ ) dt; x(t0 ) = x0 , x(t1 ) = x1 ; (5.8)

h(t, x(t), x′ (t)) > 0 ∀ t ∈ [t0 ,t1 ]. (5.9)

Here h is a suitable C1 -function defining the constraint. However, one can show that
the Euler-Lagrange and Legendre conditions are necessary also for such problems.

A more radical change is met if one considers the problem of having

min ∫_{t0}^{t1} L(t, x, x′ ) dt, x(t0 ) = x0 ,

together with one of the following terminal conditions:

(i) x(t1 ) free (t1 given);

(ii) x(t1 ) ≤ x1 (t1 and x1 given);

(iii) x(t1 ) = g(t1 ) (t1 free, but g a given C1 -function).

Correspondingly, the admissible functions are now required to be C1 -functions and to
satisfy the stated initial and terminal conditions. It is clear that a minimizing function x∗
still satisfies the Euler-Lagrange equation, for x∗ also minimizes among the admissible
functions that satisfy x(t1 ) = x∗ (t1 ). But one has to add some transversality conditions.

Theorem 5.7 (Transversality conditions) If x∗ is an admissible function solving the above
minimization problem, then x∗ solves the Euler-Lagrange equation and the corresponding
transversality condition:


(i) Ly (t1 , x∗ (t1 ), x∗′ (t1 )) = 0;

(ii) Ly (t1 , x∗ (t1 ), x∗′ (t1 )) ≥ 0 (and = 0 holds if x∗ (t1 ) < x1 );

(iii) L(t1 , x∗ (t1 ), x∗′ (t1 )) + ( g′ (t1 ) − x∗′ (t1 ) ) Ly (t1 , x∗ (t1 ), x∗′ (t1 )) = 0.

In case (ii), the inequality is reversed if x∗ solves the maximization problem.

Proof. Set y1 = x∗ (t1 ). Since x∗ minimises J, the inequality J(x∗ ) ≤ J(y) holds in partic-
ular for all admissible functions that satisfy y(t1 ) = y1 . Therefore x∗ is also a solution of
the basic problem on the fixed time interval [t0 ,t1 ] and with data x0 , y1 . Consequently
x∗ satisfies the Euler-Lagrange equation.

The transversality condition (i) can be proved as a continuation of the proof of Theo-
rem 5.1: since the terminal value x(t1 ) is not fixed in this context, it is possible that the
variation ϕ(t) is such that ϕ(t1 ) ≠ 0 (but ϕ(t0 ) = 0 is still required); again it is seen that
I ′ (0) = 0. Since it is already known that x∗ solves the Euler-Lagrange equation, it follows
from the integration by parts leading to (5.5) that

Ly (t1 , x∗ (t1 ), x∗′ (t1 )) ϕ(t1 ) = 0. (5.10)

Taking ϕ such that ϕ(t1 ) = 1, the conclusion in (i) follows. With a little more effort
also (ii) can be obtained along these lines. However, (iii) requires the implicit function
theorem and a longer argument, so details are skipped here.
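A quick illustration of case (i) (a worked example added here): minimize ∫_0^1 x′ (t)2 dt with x(0) = 1 and x(1) free. The Euler-Lagrange equation gives (d/dt)(2x′ ) = 0, so x(t) = A + Bt, and the transversality condition (i) forces Ly (1) = 2x∗′ (1) = 2B = 0. Hence x∗ ≡ 1, with value J(x∗ ) = 0, which is clearly the global minimum since J ≥ 0.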

5.4 Sufficient conditions for solutions


Finally, a result on sufficient conditions for a solution is given. However, it holds
only in the case where the basic function L(t, x, y) is convex with respect to (x, y). This
means that the Hessian matrix of L with respect to the variables (x, y) is positive semi-
definite at all points; that is, for all t, x, y, this matrix has eigenvalues λ1 , λ2 in [0, +∞[.

Theorem 5.8 Suppose L(t, x, y) is convex with respect to (x, y), and that x∗ (t) satisfies the
Euler-Lagrange equation and one of the terminal conditions. Then x∗ is a global minimiser.


Example 5.9 Since L(t, x, y) = x2 + y2 is convex in Example 5.2, the solution x∗ (t) = e1+t −
e1−t actually minimises the functional J subject to the conditions x(0) = 0 and x(1) = e2 − 1.
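This can also be checked numerically; the rough finite-difference sketch below (an illustration, not a proof) discretizes J and verifies that admissible perturbations of x∗ never decrease the value.

import numpy as np

t = np.linspace(0.0, 1.0, 2001)
x_star = np.exp(1 + t) - np.exp(1 - t)       # the Euler-Lagrange solution

def J(x):
    dx = np.diff(x) / np.diff(t)             # forward-difference x'(t)
    xm = 0.5 * (x[1:] + x[:-1])              # midpoint values of x on each cell
    return np.sum((xm ** 2 + dx ** 2) * np.diff(t))

phi = np.sin(np.pi * t)                      # admissible variation: phi(0) = phi(1) = 0
for eps in [0.5, 0.1, -0.1, -0.5]:
    assert J(x_star + eps * phi) >= J(x_star)

print(J(x_star))                             # close to e^4 - 1, the minimal value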

5.5 Exercises

Ex. 5.10 (i) Calculate the value of J(x) = ∫_0^1 ( x(t)2 + x′ (t)2 ) dt in the cases

1. x(t) = (e2 − 1)t,

2. x(t) = e2t − 1,

3. x(t) = e1+t − e1−t ,

4. x(t) = at 2 + (e2 − 1 − a)t.

They all pass through (0, 0) and (1, e2 − 1), hence are admissible.
(ii) Show that J(x) has no maximum on the curves joining (0, 0) and (1, e2 − 1).

Ex. 5.11 Investigate what the Euler-Lagrange equation gives in the special cases when

1. L = L(t, x);

2. L = L(t, y);

3. L = L(x, y).
Ex. 5.12 Let J(x) = ∫_1^2 t −2 x′ (t)2 dt and consider the boundary conditions x(1) = 1,
x(2) = 2.

1. Find the admissible solutions to the Euler-Lagrange equation.

2. Show that the maximisation problem for J has no solution. (Try x(t) = at 2 + (1 −
3a)t + 2a.)

3. Does the above imply that the solution in (1) minimises J?


Ex. 5.13 Consider the problem


min J(x) := ∫_0^1 (t + x)4 dt, x(0) = 0, x(1) = a. (5.11)

Find the solution of the Euler-Lagrange equation and determine the value of a for
which this solution is admissible. For this value of a, find the solution of the prob-
lem.

Ex. 5.14 The length of the graph of a C1 -function x(t) which connects (t0 , x0 ) to (t1 , x1 )
is given by

L(x) = ∫_{t0}^{t1} √(1 + x′ (t)2 ) dt.
Prove that L(x) attains its minimum over the admissible functions exactly when x(t)
has a straight line as its graph.



CHAPTER 6

Minimum Principle of Pontryagin

Contents
6.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63

6.1 Introduction
In this chapter, we present Pontryagin's principle. It concerns the study of dynamic
optimization problems. Roughly speaking, it is a way to control a system (mechanical,
chemical, physical, economic, ...) at a minimum cost.

To formulate this problem mathematically, it is necessary first to define the variables
that describe the state of the system, the model that describes the evolution of the system,
and the actions that can be made on the system (i.e., the control or command). For ex-
ample: moving from one place to another by the fastest (or shortest) way, maximizing
the gain of an investment, etc. More precisely, we are interested in a system (a
rocket, an airplane, a robot, a portfolio of stocks, bonds or options, etc.) whose state
x(t) (at time t) is given by the solution of an ordinary differential equation:


dx/dt = f (t, x, u) on ]t0 ,t1 [, x(t0 ) = x0 ,
where x0 ∈ Rn and t0 ,t1 ∈ R are fixed with t0 < t1 ; f is a given function defined from
R × Rn ×U into Rn . As we can see, one of the arguments of the function f is a function
u defined on the interval ]t0 ,t1 [ and taking values in the given set U. We assume that
U is a nonempty and closed subset of Rm . This function u translates mathematically
the actions (or decisions) that one can take on the system. The set U corresponds to
any restrictions or constraints that the control u must respect (for example, limited
resources, or limits on the acceleration or speed when driving). Formulating an opti-
mal control problem amounts to defining the state of the system and the differential
equation governing its evolution, the class of admissible controls u(t) and finally a
criterion of evolution, or cost function, J whose typical form is:
J(u) = ∫_{t0}^{t1} g(t, x(t), u(t)) dt + h(x(t1 )), (6.1)

where g and h are given functions defined respectively from R × Rn ×U and Rn into R.


The problem is now to find the optimal cost or optimal control, that is to solve the
following optimization problem:
inf_{u∈L2 (]t0 ,t1 [;U)} { ∫_{t0}^{t1} g(t, x(t), u(t)) dt + h(x(t1 )) }, with

dx/dt = f (t, x, u) on ]t0 ,t1 [, x(t0 ) = x0 . (6.2)
To fix ideas, let us consider a concrete example. Consider a car moving from one city
to another. Denote by x1 (t) the position of the car at time t, by x2 (t) the speed of the
car at time t and by u(t) the acceleration at time t. We have
dx1 /dt = x2 (t),   dx2 /dt = u(t), (6.3)
or, in compact form, dx/dt = f (t, x(t), u(t)),


where f (t, x, u) = (x2 , u) and x = (x1 , x2 ). We know that the fuel consumption of a car
depends on how the acceleration is used. So we take the fuel consumption as the cost
function J and the acceleration as the control. Therefore, we have the following optimal
control problem:


inf_u J(u) := consumption of fuel,

dx/dt = f (t, x(t), u(t)), x(0) = (x0^1 , x0^2 ). (6.4)
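As a sanity check of the state equation (6.3), the following sketch (illustrative only; the braking control and the numerical values are made up) integrates the system with an explicit Euler scheme for a fixed control u(t).

t0, t1, n = 0.0, 10.0, 10000
dt = (t1 - t0) / n
x1, x2 = 0.0, 20.0                          # initial position and speed

def u(t):
    return -2.0 if t < 5.0 else 0.0         # brake for five seconds, then coast

for k in range(n):
    t = t0 + k * dt
    x1, x2 = x1 + dt * x2, x2 + dt * u(t)   # dx1/dt = x2, dx2/dt = u

print(x1, x2)                               # approaches the exact values (125, 10)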

In the following, a control will be a measurable function u defined on the interval
]t0 ,t1 [ and taking values in U, with u ∈ L2 (]t0 ,t1 [,U). Moreover, for (6.1) and (6.2) to
make sense, we make the following assumptions:
(i) f and g are continuous with respect to x, and the derivatives ∂ f /∂x and ∂g/∂x are
continuous on R × Rn ×U,

(ii) h ∈ C1 (Rn ),

(iii) g and h are respectively bounded below on R × Rn × Rm and Rn ,

(iv) k f (t, x, u)k ≤ C(1 + kxk + kuk), ∀ (t, x, u) ∈ R × Rn × Rm ,


(v) k ∂ f /∂x (t, x, u)k + k ∂g/∂x (t, x, u)k ≤ CR (1 + kuk), for kxk ≤ R, |t| ≤ R, u ∈ U,

(vi) |g(t, x, u)| ≤ CR (1 + kuk2 ), for kxk ≤ R, |t| ≤ R, u ∈ U,

where C is a positive constant independent of x, t and u, and CR is a positive constant
which depends only on the real number R (arbitrary in R∗+ ).

These assumptions on f guarantee the existence and uniqueness of the solution of the
state equation. In fact, let u = u(t) be fixed. Then the function f̄ (t, x) =


f (t, x, u(t)) satisfies:

k ∂ f̄ /∂x (t, x)k ≤ CR (1 + ku(t)k),

for kxk ≤ R and for all t ∈ ]−R, R[. Moreover, k f̄ (t, x)k ≤ C(1 + kxk + ku(t)k) for x ∈ Rn
and for almost every t ∈ ]t0 ,t1 [. According to the existence theorem for ordinary dif-
ferential equations, these bounds are enough, whenever u ∈ L2 (]t0 ,t1 [; Rm ), to guarantee
the existence of a unique solution x ∈ C([t0 ,t1 ], Rn ) of the state equation, satisfying:

x(t) = x0 + ∫_{t0}^t f (s, x(s), u(s)) ds, ∀ t ∈ [t0 ,t1 ].
Moreover, x(t) depends continuously on x0 . In particular, we have kx(t)k ≤ R for all
t ∈ [t0 ,t1 ], for some constant R > 0 (depending on u). From the assumptions, it follows
that the criterion J given by (6.1) is well defined and that the infimum in (6.2) is finite,
i.e., inf J > −∞.

6.1.1 Minimum Principle of Pontryagin: general case


Definition 6.1 The Hamiltonian of the problem (6.2) is the function H : R × Rn × Rn ×U → R
defined by:

H(t, x, p, u) := g(t, x, u) + hp, f (t, x, u)i for all (t, x, p, u) ∈ R × Rn × Rn ×U. (6.5)

Definition 6.2 Let u(t) be a control and x(t) be the corresponding state, that is,

dx/dt = f (t, x, u), x(t0 ) = x0 .

The adjoint state associated to (u(t), x(t)) is the unique solution p(t) of the following system:

dp/dt = − ∂H/∂x ( t, x(t), p(t), u(t) ) on ]t0 ,t1 [, p(t1 ) = h′ (x(t1 )). (6.6)
Theorem 6.3 (Minimum Principle of Pontryagin) Assume that the conditions (i)–(vi)
are satisfied. Let ū : [t0 ,t1 ] −→ U be a solution of (6.2). Then, for almost every t ∈ ]t0 ,t1 [,

ū(t) realizes the minimum over U of the function u 7−→ H( t, x̄(t), p̄(t), u ), (6.7)

where x̄ and p̄ are the state and the adjoint state associated to ū(t).
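As a worked illustration of Theorem 6.3 (a simple example added here, not from the original text): take n = m = 1, U = R, ]t0 ,t1 [ = ]0, 1[, f (t, x, u) = u, g(t, x, u) = (1/2)u2 , h(x) = (1/2)x2 and x(0) = 1. Then H(t, x, p, u) = (1/2)u2 + pu, and the adjoint equation (6.6) reads dp/dt = −∂H/∂x = 0 with p(1) = h′ (x̄(1)) = x̄(1), so p̄ is constant, equal to x̄(1). Minimizing u 7→ (1/2)u2 + p̄u over U = R gives ū ≡ − p̄ = −x̄(1). The state equation then yields x̄(t) = 1 − x̄(1)t, and evaluating at t = 1 gives x̄(1) = 1/2. Hence ū ≡ −1/2, x̄(t) = 1 − t/2 and p̄ ≡ 1/2, and one can check directly that this control minimizes J(u) = ∫_0^1 (1/2)u(t)2 dt + (1/2)x(1)2 .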


