Notes on Convex Analysis
Ngalla Djitte,
Department of Applied Mathematics
Gaston Berger University
Saint Louis, Senegal.
[email protected]
1.1 Introduction
Let V be a normed linear space and F : V → R ∪ {+∞} be an extended real valued
function. Consider the following optimization problem:
inf F(v), (1.1)
v∈V
or
sup F(v). (1.2)
v∈V
1.2 Existence results
Since maximizing F amounts to minimizing −F, in what follows we shall restrict our discussion to problems of the form (1.1).
Using (1.4), (1.5) and the uniqueness of the limit, it follows that F(x̄) = inf_{x∈K} F(x). So x̄ is a global minimum of F in K.
Lemme 1.6 Let K be a nonempty closed subset of Rn. If F is coercive and continuous on some open set containing K, then the following hold.
Preuve.
(i). Suppose that F is not bounded from below on K. Then, for all n ∈ N, there exists xn ∈ K such that F(xn) < −n. So we get a sequence (xn) of points of K satisfying F(xn) → −∞.
Claim: The sequence (xn) is bounded. Indeed, by contradiction, assume that (xn) is not bounded. Then it has a subsequence (xnk) such that kxnk k → +∞, and the coercivity of F gives F(xnk) → +∞, contradicting F(xnk) < −nk. So (xn) is bounded, and hence it has a subsequence, still denoted (xnk), that converges to some x̄ ∈ K (K is closed). By continuity of F on an open set containing K,
F(x̄) = lim F(xnk) = −∞,
which is impossible since F takes real values. Hence F is bounded from below on K.
(ii). We have to show that (xn) is bounded. By contradiction, assume not. Then there exists a subsequence (xnk) of (xn) such that kxnk k → +∞, and the coercivity of F gives F(xnk) → +∞, a contradiction.
Théorème 1.7 Let K be a nonempty closed subset of Rn (not necessarily bounded) and F : Rn → R be a function coercive and continuous on some open set containing K. Then F has a global minimum on K, i.e., there exists at least one point x̄ ∈ K such that
Using (1.9), (1.10) and the uniqueness of the limit, it follows that F(x̄) = inf_{x∈K} F(x). So, x̄ is a global minimum of F in K.
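To make Theorem 1.7 concrete, here is a small numerical sketch in Python (the function F and the set K below are illustrative choices made for this note, not taken from the text): F(x) = x² − 3x + sin(5x) is continuous and coercive on the closed unbounded set K = [1, +∞), so a global minimizer exists, and coercivity lets a simple grid search locate it.

import numpy as np

# Illustrative coercive, continuous function on the closed set K = [1, +infinity).
# F(x) -> +infinity as x -> +infinity, so Theorem 1.7 guarantees a global minimizer on K.
def F(x):
    return x**2 - 3.0*x + np.sin(5.0*x)

# Coercivity allows truncating the search: for x >= 10, F(x) >= 69 > F(1).
xs = np.linspace(1.0, 10.0, 200001)
vals = F(xs)
i = np.argmin(vals)
print("approximate minimizer x* =", xs[i])
print("approximate minimum F(x*) =", vals[i])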
1.3.1 Counter-example
In this section, we are going to give a counter-example showing that in infinite dimensional spaces, the existence of a minimizer is not guaranteed by the conditions (1.6) used in Theorem 1.7. This is because, in infinite dimensional spaces, closed bounded subsets need not be compact.
Let us take V = H¹(]0, 1[) = {v ∈ L²(]0, 1[) : v′ ∈ L²(]0, 1[)}, the Sobolev space with the norm
kvk = ( ∫_0^1 (|v′(x)|² + |v(x)|²) dx )^{1/2}.
It is obvious to see that F is continuous and that the condition (1.6) is satisfied. Indeed,
F(v) = kvk² − 2 ∫_0^1 |v′(x)| dx + 1 ≥ kvk² − (1/2) ∫_0^1 v′(x)² dx − 1 ≥ (1/2) kvk² − 1.
But we have
inf_{v∈V} F(v) = 0. (1.11)
To prove (1.11), we define the following minimizing sequence (un) by: for n ≥ 1,
un(x) = x − k/n if k/n ≤ x ≤ (2k+1)/(2n), and un(x) = (k+1)/n − x if (2k+1)/(2n) ≤ x ≤ (k+1)/n, for 0 ≤ k ≤ n − 1. (1.12)
Then, we have
(un)′(x) = 1 if k/n < x < (2k+1)/(2n), and (un)′(x) = −1 if (2k+1)/(2n) < x < (k+1)/n, for 0 ≤ k ≤ n − 1. (1.13)
So F(un) = 1/(4n) → 0 as n → ∞. This proves (1.11).
So, under conditions (1.14) and (1.15), problem (1.11) has a solution. But condition (1.15) is not very useful because it is difficult to verify.
Définition 1.8 Let V be a real vector space. The epigraph of F : V −→ R ∪ {+∞} is the subset
of V × R defined by:
epi(F) := {(x, a) ∈ V × R : F(x) ≤ a}.
Conversely, assume that (xn → x in V ) ⇒ F(x) ≤ lim infn F(xn ). Let (xn , an ) be a se-
quence in epi(F) that converges to (x, a) in V × R. Then, xn → x and an → a. Therefore,
by hypothesis, we have F(x) ≤ lim inf_n F(xn) ≤ lim_n an = a.
So, (x, a) ∈ epi(F) and then epi(F) is closed, which implies that F is lsc.
Proposition 1.10 A function F : V −→ R ∪ {+∞} is lsc if and only if for all a ∈ R, the set
Ca := {x ∈ V : F(x) ≤ a} is closed in V.
Preuve. Assume that F is lsc. Let a ∈ R, and let (xn ) be a sequence in Ca that converges
to x ∈ V. Since F(xn) ≤ a ∀ n, we have lim inf_n F(xn) ≤ a. So, by Proposition 1.9, it follows
that
F(x) ≤ lim inf F(xn ) ≤ a.
n
Conversely, assume that Ca is closed for all a ∈ R. Let (xn ) be a sequence in V that
converges to x ∈ V. Let (xnk) be a subsequence of (xn) such that F(xnk) → lim inf_n F(xn).
Assume that F(x) > lim inf_n F(xn). Then there exists a ∈ R such that
lim inf_n F(xn) < a < F(x). (1.16)
So, there exists N ∈ N such that F(xnk) < a for all k ≥ N. Therefore xnk ∈ Ca ∀ k ≥ N. Since xnk → x and Ca is closed, we have x ∈ Ca. Therefore, F(x) ≤ a. Using (1.16), we get a contradiction. So F(x) ≤ lim inf_n F(xn).
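As a hedged numerical illustration of Proposition 1.10 (added here, not part of the original notes), take the step function on R equal to 0 for x ≤ 0 and 1 for x > 0: it is lower semi continuous, its sublevel sets Ca are closed for every a, and lim inf F(xn) ≥ F(0) along any sequence xn → 0.

import numpy as np

# Lower semicontinuous step function: F(x) = 0 for x <= 0, 1 for x > 0.
# Every sublevel set {F <= a} is closed, in line with Proposition 1.10.
def F(x):
    return np.where(x <= 0.0, 0.0, 1.0)

right = 1.0 / np.arange(1, 1001)    # sequence converging to 0 from the right
left = -1.0 / np.arange(1, 1001)    # sequence converging to 0 from the left

for name, seq in [("from the right", right), ("from the left", left)]:
    liminf = F(seq).min()           # crude stand-in for lim inf along the sequence
    print(name, ": lim inf F(xn) =", liminf, ">= F(0) =", float(F(0.0)))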
Ex. 1.11 Show that any continuous function is lower semi continuous.
Ex. 1.12 Let K be a nonempty subset of V and let ψK be the indicator function of K
defined by:
ψK(x) = 0 if x ∈ K, and ψK(x) = +∞ otherwise.
φu is called the Nemytskii functional associated to u. Show that if u(t, ·) is lower semicontinuous for almost every t ∈ Ω, then φu is well defined and lower semi continuous on
L p (Ω, Rn ).
Since V is compact, there exists a subsequence (xnk) of (xn) such that xnk → x̄ ∈ V. By Proposition 1.9 we have:
Définition 1.15 Let V be a vector space. A set K of V is convex if it is empty or if for every x and y in K, the line segment joining x and y remains inside K, where the line segment [x, y] joining x and y is defined by [x, y] := {λx + (1 − λ)y : 0 ≤ λ ≤ 1}.
Therefore, a subset K of V is convex if and only if for every x and y in K and every λ with 0 ≤ λ ≤ 1, the vector λx + (1 − λ)y is also in K.
Exemple 1.16
a. Let V be a vector space and x and v be vectors in V . The line L through x in the
direction v:
L = {x + λv, λ ∈ R}
is a convex subset V .
b. Any linear subspace M of V is a convex set, since linear subspaces are closed under
addition and scalar multiplication.
c. For x̄ ∈ Rn, x̄ ≠ 0, and α ∈ R, the half-spaces
F+ = {y ∈ Rn : x̄ · y ≥ α} and F− = {y ∈ Rn : x̄ · y ≤ α}
are convex subsets of Rn.
d. For x̄ ∈ Rn and r > 0, the closed ball
B̄(x̄, r) := {x ∈ Rn : kx − x̄k ≤ r}
is a convex subset of Rn.
If the inequality (1.17) is strict for all x, y ∈ V with x ≠ y and all λ ∈ (0, 1), then F is said to be strictly convex. If the inequalities in the above definitions are reversed, we obtain the definitions of concave and strictly concave functions.
Remarque 1.18 Note that F is convex (respectively strictly convex) on a convex set
K if and only if −F is concave (respectively strictly concave) on K. Because of
this close connection, we will formulate all results in terms of convex functions only.
Corresponding results for concave functions will be clear.
Preuve. Exercice.
Preuve. Exercice.
(i) If F is strictly convex on V , then F has at most one minimizer, that is if the minimizer
exists, it must be unique.
Preuve. (i). Let x1 and x2 be two different minimizers of F, and let λ be such that 0 < λ < 1. Because of the strict convexity of F and the fact that F(x1) = F(x2) = min_V F, we have
F(x1) ≤ F(λx1 + (1 − λ)x2) < λF(x1) + (1 − λ)F(x2) = F(x1),
which is a contradiction.
Given any x ∈ V , we want to show that F(x̄) ≤ F(x). To this end, select λ, with 0 < λ < 1
and small so that
x̄ + λ(x − x̄) = λx + (1 − λ)x̄ ∈ B(x̄, r).
Then,
F(x̄) ≤ F(x̄ + λ(x − x̄)) = F(λx + (1 − λ)x̄) ≤ λF(x) + (1 − λ)F(x̄)
because F is convex. Now subtract F(x̄) from both sides of the preceding inequality and divide the result by λ to obtain 0 ≤ F(x) − F(x̄). This establishes the desired result.
Proposition 1.22 Let F : V → R ∪ {+∞} be convex and lower semi continuous. Then F is
weakly lower semi continuous, that is epi(F) is weakly closed in V × R.
Preuve. Assume that F : V −→ R ∪ {+∞} is lower semi continuous and convex. Then epi(F) is closed and convex in V × R. Therefore, from the Hahn-Banach theorem, epi(F) is weakly closed and hence F is weakly lower semi continuous.
Théorème 1.23 (Existence theorem) Let V be a real reflexive Banach space and K be a nonempty closed convex subset of V . Let F : V → R ∪ {+∞} be convex, lower semi continuous and proper. If K is bounded or F is coercive (that is, F satisfies (1.14)), then there exists at least one minimizer of F in K.
Preuve. Let (un ) be a minimizing sequence of F in K. From (1.14), it follows that (un )
is bounded. Since V is reflexive, there exists a subsequence (unk ) of (un ) that converges
weakly to some point ū ∈ V . But K is closed and convex, then it is weakly closed.
Hence ū ∈ K. On the other hand, F is convex and lower semi continuous, then it is
weakly lower semi continuous. Therefore, F(ū) ≤ lim inf_k F(unk) = inf_{u∈K} F(u). So, ū is a minimizer of F in K.
2.1 Differential characterizations of convex functions
If inequality (2.1) is strict for all x, y ∈ C with x ≠ y and f (x), f (y) both finite, we say that
the function f is strictly convex on C.
Let us start our study with the following important lemma.
Lemme 2.1 (Slope inequality for convex functions) Let I be an interval of R and h : I → R̄ be a proper convex function. Let r1, r2, r3 ∈ I be such that r1 < r2 < r3 and h(r1) and h(r2) are finite. Then
(h(r2) − h(r1))/(r2 − r1) ≤ (h(r3) − h(r1))/(r3 − r1) ≤ (h(r3) − h(r2))/(r3 − r2). (2.2)
Further, these inequalities for all such r1, r2, r3 ∈ I characterize the convexity of h on I. If the inequalities are required to be strict for all r1, r2, r3 ∈ I such that r1 < r2 < r3 and h(r1), h(r2) and h(r3) are finite, we obtain a characterization of the strict convexity of h on I.
Preuve. For λ = (r3 − r2)/(r3 − r1), we have λ ∈ (0, 1) and r2 = λr1 + (1 − λ)r3. By the convexity of h we then have
h(r2) ≤ ((r3 − r2)/(r3 − r1)) h(r1) + ((r2 − r1)/(r3 − r1)) h(r3). (2.3)
Adding −h(r1) (resp. −h(r3)) to both sides of (2.3) we obtain the first (resp. the second) inequality of (2.2). Further, it is easy to see that these inequalities characterize the convexity of h. The proof of the strict convexity is similar. Through Lemma 2.1 we can characterize the convexity of differentiable functions of one variable as follows:
(i) h is convex on I;
(ii) h′ is nondecreasing on I;
(iii) h(s) ≥ h(r) + h′(r)(s − r) for all r, s ∈ I.
If the function h is twice derivable on I, then h is convex if and only if h″(r) ≥ 0 for all r ∈ I. Further, if the function h is twice derivable on I and h″(r) > 0 for all r ∈ I, then h is strictly convex on I. The converse does not hold, that is, the strict convexity of a twice derivable function on I does not entail the positivity of h″.
Preuve. (i) ⇒ (ii). Let r, t ∈ I with r < t. From the derivability of h, (i) and Lemma 2.1 we have
h′(r) = lim_{s↓r} (h(s) − h(r))/(s − r) ≤ (h(t) − h(r))/(t − r) ≤ lim_{s↑t} (h(t) − h(s))/(t − s) = lim_{s↑t} (h(s) − h(t))/(s − t) = h′(t),
so h′ is nondecreasing on I.
(ii) ⇒ (iii). For fixed r ∈ I, let ϕ(s) = h(s) − h(r) − h′(r)(s − r) for all s ∈ I. The function ϕ is derivable on I and ϕ′(s) = h′(s) − h′(r). By assumption (ii), we see that ϕ′(s) ≥ 0 if s ≥ r and ϕ′(s) ≤ 0 if s ≤ r. We then deduce that ϕ(s) ≥ ϕ(r) = 0 for all s ∈ I, that is, assertion (iii).
The case when h is twice derivable on I as well as the case concerning the strict con-
vexity are left as exercise.
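The last remark, that strict convexity does not require h″ > 0, can be checked numerically. The sketch below is an added illustration (not from the notes): it uses h(r) = r⁴, which is strictly convex on R even though h″(0) = 0, and tests the strict convexity inequality on randomly sampled triples.

import numpy as np

# h(r) = r^4 is strictly convex on R, yet h''(0) = 12 * 0^2 = 0:
# positivity of h'' is sufficient but not necessary for strict convexity.
h = lambda r: r**4
h2 = lambda r: 12.0 * r**2

print("h''(0) =", h2(0.0))

rng = np.random.default_rng(0)
ok = True
for _ in range(10000):
    r, t = np.sort(rng.uniform(-5.0, 5.0, size=2))
    if t - r < 1e-3:          # skip nearly equal points to avoid rounding artefacts
        continue
    lam = rng.uniform(0.01, 0.99)
    lhs = h(lam*r + (1.0 - lam)*t)
    rhs = lam*h(r) + (1.0 - lam)*h(t)
    ok = ok and (lhs < rhs)
print("strict convexity inequality held on all sampled triples:", bool(ok))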
Let us consider now the more general case of a differentiable function on an open and
convex subset of a normed linear space.
We shall make use of the following lemma, which can be seen as a bridge between convex functions defined on subsets of R and those defined on subsets of normed linear spaces.
Lemme 2.3 Let U be an open and convex subset of a normed linear space E and let f : U → R be a function. For fixed points x, y ∈ U with x ≠ y, set
Ixy := {t ∈ R : (1 − t)x + ty ∈ U} and hxy(t) := f((1 − t)x + ty) for t ∈ Ixy.
Then Ixy is an open interval of R and f is convex on U if and only if for all x, y ∈ U with x ≠ y, the function hxy is convex on Ixy.
Théorème 2.4 Let U be a nonempty open and convex subset of a normed linear space and f :
U → R be a function which is Fréchet-differentiable on U. Then the following are equivalent:
(a). f is convex on U;
If f is twice differentiable on U, then f is convex on U if and only if for each x ∈ U the bilinear form associated with D² f(x) is positive semidefinite, i.e.,
hD² f(x) · v, vi ≥ 0, ∀ v ∈ E.
Further, assuming the twice differentiability of f on U, a sufficient (but not necessary) condition for the strict convexity of f on U is, for each x ∈ U, the positive definiteness of D² f(x), i.e.,
hD² f(x) · v, vi > 0, ∀ v ∈ E, with v ≠ 0.
Preuve. For x, y ∈ U, let Ixy and hxy be as in Lemma 2.3. Observing that 0 ∈ Ixy and 1 ∈ Ixy with hxy(0) = f(x) and hxy(1) = f(y), the proof follows from Lemma 2.3.
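As a quick numerical companion to the second-order criterion (an added sketch, not from the notes): for a quadratic f(x) = ½ xᵀAx + bᵀx on Rⁿ the Hessian is the constant symmetric matrix ½(A + Aᵀ), so convexity can be read off from its eigenvalues.

import numpy as np

# f(x) = 0.5 * x^T A x + b^T x has constant Hessian D^2 f = (A + A^T)/2.
# f is convex iff this symmetric matrix is positive semidefinite; if it is
# positive definite, f is strictly convex (the sufficient condition above).
A = np.array([[2.0, 1.0],
              [0.0, 3.0]])
H = 0.5 * (A + A.T)
eigs = np.linalg.eigvalsh(H)
print("Hessian eigenvalues:", eigs)
print("convex:", bool(np.all(eigs >= -1e-12)))
print("strictly convex (sufficient condition):", bool(np.all(eigs > 0.0)))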
For example, the convex function f : R → R with f (x) = |x| is not differentiable at a = 0
but all x∗ ∈ [−1, 1] satisfy (2.5) for a = 0.
On the other hand, it is obvious that f attains its minimum at the point a if and only
if the element x∗ = 0 satisfies (2.5). These comments lead to the following notion:
Définition 2.5 Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a. The subdifferential of f at the point a is the subset ∂ f(a) of E∗ defined as follows:
∂ f(a) := {x∗ ∈ E∗ : hx∗, x − ai ≤ f(x) − f(a) for all x ∈ E}. (2.6)
Remarque 2.6 It is trivial to observe from the definition that if the function f takes the value −∞ at some point, then ∂ f(a) = ∅ for all a ∈ E.
Exemple 2.7
1. For f : R → R with f (x) = |x|, we have ∂ f (0) = [−1, 1], ∂ f (a) = {1} if a > 0 and
∂ f (a) = {−1} if a < 0.
2. For f : R → R ∪ {+∞} with f(x) = −√x if x ≥ 0 and f(x) = +∞ if x < 0, the function f is convex on R and finite at 0, but ∂ f(0) = ∅. For a ≠ 0, ∂ f(a) = {−1/(2√a)} for a > 0 and ∂ f(a) = ∅ for a < 0.
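The subdifferential can be probed numerically through the inequality in Définition 2.5. The sketch below (added for illustration) checks that every x∗ in [−1, 1] is a subgradient of f(x) = |x| at 0, while x∗ = 1.5 is not, which recovers ∂ f(0) = [−1, 1] from Exemple 2.7.

import numpy as np

# x* belongs to the subdifferential of f at a iff
# f(x) >= f(a) + x* * (x - a) for every x (Definition 2.5).
def is_subgradient(f, a, x_star, xs, tol=1e-12):
    return bool(np.all(f(xs) >= f(a) + x_star * (xs - a) - tol))

f = np.abs
xs = np.linspace(-10.0, 10.0, 100001)   # finite surrogate for "every x in R"

for x_star in [-1.0, -0.3, 0.0, 0.7, 1.0, 1.5]:
    print("x* =", x_star, "is a subgradient of |.| at 0:",
          is_subgradient(f, 0.0, x_star, xs))
# Expected: True for every x* in [-1, 1] and False for x* = 1.5.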
Ex. 2.8 Find the subdifferentials of the following convex functions: f : R → R ∪ {+∞}, f (x) =
0 if |x| ≤ 1 and f (x) = +∞ otherwise; g : R → R∪{+∞}, g(x) = 1/x if 0 < x < 1 and g(x) = +∞
otherwise.
Preuve. Exercise.
The latter inequality in (2.6) is equivalent to saying that for all v ∈ E and all real t > 0,
hx∗, a + tv − ai ≤ f(a + tv) − f(a),
i.e.,
hx∗, vi ≤ t⁻¹ [ f(a + tv) − f(a) ].
Consequently, the function in v (taking values in R̄) given by the right-hand side of
(2.7) characterizes the subdifferential of f at a, ∂ f (a).
On the other hand, the difference quotient function t → t⁻¹ [ f(a + tv) − f(a) ] is a nondecreasing function (according to Lemma 2.1) from R − {0} into R̄. Therefore, the limit as t ↓ 0 of t⁻¹ [ f(a + tv) − f(a) ] exists in R̄ and we have
lim_{t↓0} t⁻¹ [ f(a + tv) − f(a) ] = inf_{t>0} t⁻¹ [ f(a + tv) − f(a) ]. (2.8)
Définition 2.10 Let E be a normed linear space, f : E → R̄ be a function (which may not be convex) and a, v ∈ E with f(a) finite. The directional derivative of f at a in the direction v, f′(a; v), is defined by:
f′(a; v) := lim_{t↓0} t⁻¹ [ f(a + tv) − f(a) ],
• If f′(a; v) exists in R for all v ∈ E and the map v ↦ f′(a; v) is linear and continuous from E to R, we say that f is Gâteaux differentiable at a with Gâteaux differential at a denoted by DG f(a) or f′(a), and in such case, f′(a; v) = DG f(a) · v = hDG f(a), vi.
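Formula (2.8) can be observed numerically: for a convex f the difference quotients t⁻¹[f(a + tv) − f(a)] decrease as t ↓ 0 and their limit equals their infimum over t > 0. The sketch below is an added illustration using f(x) = x² + |x| at a = 0, for which f′(0; 1) = 1.

# For convex f, t -> (f(a + t v) - f(a)) / t is nondecreasing in t > 0,
# so the directional derivative is both the limit and the infimum over t > 0.
f = lambda x: x*x + abs(x)
a, v = 0.0, 1.0

ts = [2.0**(-k) for k in range(0, 12)]             # t decreasing to 0
quotients = [(f(a + t*v) - f(a)) / t for t in ts]  # here equal to t + 1
print("difference quotients:", quotients)
print("numerical f'(a; v) ~", quotients[-1], " (exact value: 1)")
print("nonincreasing as t decreases:",
      all(q2 <= q1 for q1, q2 in zip(quotients, quotients[1:])))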
Proposition 2.11 Let f be a Gâteaux differentiable function with open domain, such that u → f′(u) is a continuous mapping from the domain of f into E∗. Then f is also continuously Fréchet differentiable, and is called a C1 function.
Preuve. Under our hypothesis, we have to prove that the Gâteaux derivative f′(u) is in fact a Fréchet derivative. Take any point u in the domain of f and some r > 0 such that B(u, r) ⊂ Dom(f). Using the Gâteaux differentiability of f, it follows that for every v ∈ E with kvk < r, there exists some θ ∈ [0, 1] such that
f(u + v) − f(u) = h f′(u + θv), vi = h f′(u), vi + h f′(u + θv) − f′(u), vi. (2.9)
Let ε > 0. From the continuity of the map v → f′(v), there exists η, 0 < η < r, such that kw − uk < η implies
Lemme 2.12 Let E be a linear space and p : E → R be a convex and positively homogeneous function. Then p is sublinear.
Preuve. We only need to prove the subadditivity of p. To this end, let x, y ∈ E. From the assumptions, we have
p(x + y) = p(2(½x + ½y)) = 2 p(½x + ½y) (positive homogeneity) ≤ 2(½p(x) + ½p(y)) = p(x) + p(y) (convexity).
Théorème 2.13 Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a ∈ E. Then the following properties hold.
(i) For any v ∈ E, the directional derivative of f at a in the direction v, f′(a; v), exists in R̄ and further
f′(a; v) = inf_{t>0} t⁻¹ [ f(a + tv) − f(a) ].
(ii) ∂ f(a) = {x∗ ∈ E∗ : hx∗, vi ≤ f′(a; v) for all v ∈ E}.
(iii) The directional derivative function v → f′(a; v) is sublinear from E to R̄ and f′(a; 0E) = 0.
(iv) − f′(a; −v) ≤ f′(a; v) for all v ∈ E.
Preuve. Assertions (i) and (ii) follow from the analysis above.
(iii). Observe first that f′(a; 0E) = lim_{t↓0} t⁻¹ [ f(a + t·0E) − f(a) ] = 0. For λ > 0 and v ∈ E,
f′(a; λv) = lim_{t↓0} t⁻¹ [ f(a + t(λv)) − f(a) ] = λ lim_{t↓0} (tλ)⁻¹ [ f(a + tλv) − f(a) ] = λ f′(a; v).
Now we will prove that the directional derivative function v → f′(a; v) is convex. Since we have already proved that it is positively homogeneous, its sublinearity then follows from Lemma 2.12.
For this, let v1, v2 ∈ E and r1, r2 ∈ R be such that f′(a; vi) < ri. By (i), there exists, for each i = 1, 2, some ti > 0 such that ti⁻¹ [ f(a + ti vi) − f(a) ] < ri. Set s := min{t1, t2} > 0. Now let λ1, λ2 ∈ (0, 1) be such that λ1 + λ2 = 1. The nondecreasing property of the function t → t⁻¹ [ f(a + tv) − f(a) ] gives the inequalities
s⁻¹ [ f(a + s vi) − f(a) ] < ri, i = 1, 2.
So by the convexity of the function v → s⁻¹ [ f(a + sv) − f(a) ], we obtain
s⁻¹ [ f(a + s(λ1v1 + λ2v2)) − f(a) ] ≤ λ1 s⁻¹ [ f(a + sv1) − f(a) ] + λ2 s⁻¹ [ f(a + sv2) − f(a) ] < λ1r1 + λ2r2.
Since, by (i), f′(a; λ1v1 + λ2v2) ≤ s⁻¹ [ f(a + s(λ1v1 + λ2v2)) − f(a) ], it follows that
f′(a; λ1v1 + λ2v2) < λ1r1 + λ2r2.
(iv). If f′(a; v) = +∞ or f′(a; −v) = +∞, the inequality in (iv) is trivial. So suppose that −∞ ≤ f′(a; v) < +∞ and −∞ ≤ f′(a; −v) < +∞. Let r, s ∈ R be such that f′(a; v) < r and f′(a; −v) < s. Then by (iii) we have
0 = f′(a; 0E) ≤ f′(a; v) + f′(a; −v) < r + s,
hence −s < r. Letting r ↓ f′(a; v) and s ↓ f′(a; −v), we obtain (iv), as desired.
Remarque 2.14 The directional derivative may take the value −∞ at some points even for a function taking values in (−∞, +∞]. For example, for f : R → R ∪ {+∞} with f(x) = −√x if x ≥ 0 and f(x) = +∞ if x < 0, the function f is convex, finite at 0 and
f′(0; v) = −∞ if v > 0, 0 if v = 0, +∞ if v < 0.
Let us now study the properties of the function v → f′(a; v) when f is finite at a and continuous at a. Let us start with the following lemma.
Lemme 2.15 Let E be a normed linear space, f : E → R ∪ {+∞} be a convex function and a ∈ E be such that f(a) is finite and f is continuous at a. Then f is Lipschitzian around a, that is, there exist r > 0 and γ > 0 such that f is γ-Lipschitzian on B(a, r), i.e.,
| f(x) − f(y)| ≤ γ kx − yk for all x, y ∈ B(a, r).
Preuve. From the continuity of f at a, there exist positive real numbers r, M such that
f(x) − f(y) ≤ (2M/r) kx − yk.
Therefore,
| f(y) − f(x)| ≤ γ kx − yk, with γ := 2M/r.
Hence, f is Lipschitzian on B̄(a, r).
Théorème 2.16 Let E be a normed linear space and f : E → R̄ be a convex function which is finite at a ∈ E and continuous at a. Then the directional derivative f′(a; v) is finite for all v ∈ E. Furthermore, for some γ > 0, the function f is γ-Lipschitzian around a and f′(a; ·) is finite and γ-Lipschitzian on E, i.e.,
| f′(a; v1) − f′(a; v2)| ≤ γ kv1 − v2k for all v1, v2 ∈ E.
In particular, we have
| f′(a; v)| ≤ γ kvk ∀ v ∈ E.
Preuve. Using the continuity of f at a and Lemma 2.15, there exist r > 0 and γ > 0 such that
| f(y) − f(x)| ≤ γ kx − yk ∀ x, y ∈ B(a, r).
For v ∈ E, let t > 0 be small enough that a + tv ∈ B(a, r). Then | f(a + tv) − f(a)| ≤ tγ kvk. This implies that f′(a; v) is finite.
Now let v1, v2 ∈ E. Using the Lipschitz property of f around a and the definition of f′(a; v), it follows that
| f′(a; v1) − f′(a; v2)| ≤ γ kv1 − v2k.
f′(a; v) = max {hx∗, vi : x∗ ∈ ∂ f(a)} and {hx∗, vi : x∗ ∈ ∂ f(a)} = [− f′(a; −v), f′(a; v)].
∂ f (a) ⊂ γBE ∗ ,
where BE ∗ is the closed unit ball of E ∗ centered at 0E ∗ with respect to the dual norm
k · k∗ .
Preuve. Let p : E → R be the functional defined by p(v) := f′(a; v) for all v ∈ E. From (iii) of Theorem 2.13, p is sublinear. Let w ∈ E be a nonzero vector of E (if E = {0E}, everything is trivial). For F := Rw = {sw : s ∈ R}, consider the function ϕ : F → R defined by
ϕ(sw) = s f′(a; w), ∀ s ∈ R.
The function ϕ is obviously linear over the real vector subspace F of E. Further, for s ≥ 0 we have ϕ(sw) = s f′(a; w) = f′(a; sw) by positive homogeneity, and for s < 0, using (iii) and (iv) of Theorem 2.13, we have ϕ(sw) = s f′(a; w) ≤ (−s) f′(a; −w) = f′(a; sw). So ϕ(sw) ≤ f′(a; sw) = p(sw) ∀ s ∈ R. According to the (analytical) Hahn-Banach extension theorem, we can extend ϕ to a linear functional l : E → R satisfying
l(v) ≤ p(v) = f′(a; v) ∀ v ∈ E. (2.10)
Since the sublinear functional p = f′(a; ·) is finite and continuous on E (see Theorem 2.16), the linear functional l is continuous, i.e., l ∈ E∗. Then inequality (2.10) and (ii) of Theorem 2.13 tell us that l ∈ ∂ f(a), which establishes the nonemptiness of ∂ f(a).
Assertion (ii) of Theorem 2.13 and inequality (2.10) again assure that
f′(a; v) = max {hx∗, vi : x∗ ∈ ∂ f(a)} for all v ∈ E.
It is not difficult to deduce from this equality that {hx∗, vi : x∗ ∈ ∂ f(a)} = [− f′(a; −v), f′(a; v)], which implies, in particular, that ∂ f(a) is bounded in E∗. Further, it is not hard to see from (ii) of Theorem 2.13 that ∂ f(a) is w∗-closed. The Banach-Alaoglu-Bourbaki theorem, the w∗-closedness and the boundedness of ∂ f(a) ensure its w∗-compactness.
Finally, (ii) of Theorem 2.13 and Theorem 2.16 imply that for all x∗ ∈ ∂ f(a),
hx∗, vi ≤ f′(a; v) ≤ γ kvk ∀ v ∈ E.
Hence kx∗k∗ ≤ γ, and this means ∂ f(a) ⊂ γBE∗. The proof of the theorem is then complete.
3.1 Preliminaries
Définition 3.1 Let C be a convex subset of a normed linear space E and a ∈ C. The cone ∂ψC(a), given by the subdifferential of the indicator function ψC at the point a, is called the normal cone to C at a and is denoted by NC(a).
It follows from the definition and the previous section that NC(a) is a w∗-closed convex cone with 0 ∈ NC(a). Any element of NC(a) is called a normal functional (or normal vector if E is a finite dimensional space or a Hilbert space) to C at a. When a ∉ C, we put NC(a) = ∅.
Définition 3.2 Let C be a subset of a normed linear space E and a ∈ C. A vector v ∈ E is said to be a tangent vector (in the sense of Bouligand) to C at a if there exist a sequence {xn} of points of C and a sequence of positive real numbers tn ↓ 0 such that v = lim_{n→∞} tn⁻¹(xn − a). The set of all such vectors v is readily seen to be a cone. It is called the tangent cone to C at a and is denoted by TC(a) or T(C; a).
Remarque 3.3 We always have 0 ∈ TC (a) and any sequence {xn } as above converges
to a.
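For convex subsets of Rⁿ the tangent cone can also be explored numerically through the characterization proved below (Theorem 3.4(b)): v ∈ TC(a) exactly when the directional derivative of the distance function dC at a in the direction v is zero. The sketch below is an added illustration with C the closed unit disk of R² and a = (1, 0) a boundary point.

import numpy as np

# C = closed unit disk in R^2, a = (1, 0) on its boundary.
# For the unit disk, d_C(x) = max(||x|| - 1, 0).
def d_C(x):
    return max(np.linalg.norm(x) - 1.0, 0.0)

a = np.array([1.0, 0.0])

def directional_derivative(v, t=1e-8):
    # forward difference approximation of d_C'(a; v)
    return (d_C(a + t*v) - d_C(a)) / t

for v in [np.array([0.0, 1.0]),    # tangent to the circle at a
          np.array([-1.0, 0.0]),   # points into the disk
          np.array([1.0, 0.0])]:   # points outward
    print("v =", v, " d_C'(a; v) ~", directional_derivative(v))
# Expected: approximately 0, 0 (both in T_C(a)) and 1 (not in T_C(a)).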
The following theorem provides some properties of the tangent cone with a particular
attention to the case of convex sets.
Théorème 3.4 Let C be a subset of a normed linear space and a ∈ C. The following hold.
(a) The tangent cone TC (a) is a closed cone of E (not necessarily convex).
(b) If in addition C is convex, then TC(a) = {v ∈ E : dC′(a; v) = 0}, where dC : E → R is the distance function associated to C, defined by dC(x) := inf{kx − ck : c ∈ C}.
Preuve. (a). Let {vk}k∈N be a sequence of TC(a) that converges to v ∈ E. For each k ∈ N, choose a sequence (x_n^k)n∈N in C converging to a and a sequence of positive real numbers t_n^k with t_n^k → 0 as n → ∞ such that vk = lim_{n→∞} (t_n^k)⁻¹ (x_n^k − a). Therefore, there exists an increasing sequence of natural numbers nk such that
s_k := t_{n_k}^k < 1/k and k s_k⁻¹ (x_{n_k}^k − a) − v_k k < 1/k.
Observing that s_k ↓ 0 and v = lim_{k→∞} s_k⁻¹ (x_{n_k}^k − a), it then follows that v ∈ TC(a), according to the definition of TC(a). So, TC(a) is closed.
(b). Let v ∈ TC (a). Choose a sequence (xn )n∈N in C converging to a and a sequence of
positive real numbers tn ↓ 0 such that v = lim_{n→∞} tn⁻¹(xn − a). Setting vn := tn⁻¹(xn − a), we
have a + tn vn ∈ C and vn → v. According to the 1-Lipschitz property of the distance
function dC , we have
(dC(a + tn v) − dC(a))/tn ≤ kvn − vk + (dC(a + tn vn) − dC(a))/tn = kvn − vk.
So,
(dC(a + tn v) − dC(a))/tn → 0, as n → ∞.
Since we know that dC′(a; v) = lim_{t↓0} (dC(a + tv) − dC(a))/t exists in R̄, we deduce that dC′(a; v) = 0.
Suppose now that dC′(a; v) = 0. Fix a sequence (tn) of positive real numbers such that tn ↓ 0. For each n, choose xn ∈ C such that
ka + tn v − xnk < dC(a + tn v) + tn².
Then,
kv − tn⁻¹(xn − a)k ≤ tn⁻¹ dC(a + tn v) + tn = tn⁻¹ [dC(a + tn v) − dC(a)] + tn,
and hence v = lim_{n→∞} tn⁻¹(xn − a). So, v ∈ TC(a). We have then established that
TC(a) = {v ∈ E : dC′(a; v) = 0}. (3.2)
Observing that dC′(a; v) ≥ 0 for all v ∈ E, we can write (3.2) in the form
TC(a) = {v ∈ E : dC′(a; v) ≤ 0},
which, through the convexity of the function dC′(a; ·) (see Theorem 2.13), ensures the convexity of the set TC(a).
Théorème 3.5 (Optimality condition through the tangent cone) Assume that E is a normed
linear space and (P) a convex minimization problem with f finite and continuous at a ∈ C.
Then, the point a is a solution of (P) if and only if f′(a; v) ≥ 0 for all v ∈ TC(a).
Preuve. Suppose that the point a is a solution of (P). Let v ∈ TC (a). By the definition
of TC (a), there exist a sequence of positive real numbers tn ↓ 0 and a sequence {vn } in
E with vn → v such that a + tn vn ∈ C for all n. On the other hand, we know that f is
γ-Lipschitz around a for some γ > 0, i.e., there exists an open neighborhood U of a
such that
| f (x) − f (y)| ≤ γkx − yk for all x, y ∈ U. (3.4)
Since a +tn vn → a and a +tn v → a, we can choose an integer N ∈ N such that a +tn vn , a +
tn v ∈ U for all n ≥ N. Then for each n ≥ N, we have
f (a) ≤ f (a + tn vn ) because a + tn vn ∈ C.
Hence,
(f(a + tn v) − f(a))/tn = (f(a + tn v) − f(a + tn vn))/tn + (f(a + tn vn) − f(a))/tn ≥ (f(a + tn v) − f(a + tn vn))/tn.
From (3.4) it follows that
(f(a + tn v) − f(a + tn vn))/tn ≥ −γ kvn − vk.
So,
(f(a + tn v) − f(a))/tn ≥ −γ kvn − vk.
Taking the limit on both sides as n → ∞ gives f′(a; v) ≥ 0, which gives the necessary optimality condition.
Conversely, assume that f′(a; v) ≥ 0 for all v ∈ TC(a). Let x ∈ C. By Proposition 2.13, we have x − a ∈ TC(a) and hence f′(a; x − a) ≥ 0 according to our assumption. Since
f′(a; x − a) = inf_{t>0} (f(a + t(x − a)) − f(a))/t ≤ f(a + (x − a)) − f(a) = f(x) − f(a),
we obtain f(x) − f(a) ≥ 0. This being true for all x ∈ C, we conclude that a is a solution of (P). This completes the proof of the theorem.
If in addition a is an interior point of K, then the variational inequality (3.5) reduces to Euler's equation:
DG f(a) · v = 0 ∀ v ∈ E. (3.6)
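The variational inequality (3.5) can be checked on a small convex problem (a sketch added here, not from the notes): minimizing f(x) = ½kx − zk² over the closed unit ball K of R² gives the projection a of z onto K, and at that point D f(a) · (v − a) = ha − z, v − ai ≥ 0 for every v ∈ K.

import numpy as np

# Minimize f(x) = 0.5*||x - z||^2 over the closed unit ball K of R^2.
# The minimizer a is the projection of z onto K; the variational inequality
# (3.5) reads Df(a).(v - a) = <a - z, v - a> >= 0 for every v in K.
z = np.array([2.0, 1.0])
a = z / np.linalg.norm(z)        # projection of z onto K (z lies outside the ball)

rng = np.random.default_rng(1)
worst = np.inf
for _ in range(100000):
    v = rng.normal(size=2)
    v = v / max(1.0, np.linalg.norm(v))   # a point of K
    worst = min(worst, float(np.dot(a - z, v - a)))
print("smallest sampled value of Df(a).(v - a):", worst, "(nonnegative, as predicted)")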
Ivar Ekeland ”...it is hoped that every mathematician will find something to enjoy
from the Variational Principle”
Motivation. We know that an extended real valued lower semicontinuous function attains its infimum over a compact space, and that without some compactness assumption such a property in general does not hold. However, the next theorem shows that for a lower semicontinuous function f : V → R ∪ {+∞} bounded from below over a complete metric space V, there exists a function, close in a certain sense to f, which achieves its minimum on V.
Inequality (3.9) tells us that v is a strict minimum of the function w → F(w) + ε d(w, v).
(b) ∃ w ≠ un : F(un) ≥ F(w) + ε d(un, w). Let Sn be the set of such w ∈ V. Then choose un+1 ∈ Sn such that
F(un+1) − inf_{Sn} F ≤ (1/2) [ F(un) − inf_{Sn} F ]. (3.10)
Claim. The sequence {un } is Cauchy. Indeed, if case (a) ever occurs, the sequence {un }
is stationary. If not, we have the inequalities ε d(un, un+p) ≤ F(un) − F(un+p) for all n, p ∈ N.
Observing that the sequence {F(un )} is decreasing and bounded from below, it then
follows that it is convergent. So the right-hand side goes to zero with (n, p). Therefore,
{un } is Cauchy. Since the space is complete, un converges to some v ∈ V .
Claim. The point v satisfies (3.7), (3.8) and (3.9). Inequality (3.7) follows from the nonincreasing property of the sequence {F(un)} and the lower semicontinuity of F: F(v) ≤ lim inf F(un).
The proof of (3.9) is by contradiction. For this, assume that (3.9) is not true. Then there
exists some w ∈ V with w ≠ v such that
Corollaire 3.8 Under the same setting as Theorem 3.7, let ε > 0 and let u ∈ V be a point such that
F(u) ≤ inf_V F + ε.
Then for every λ > 0, there exists some point v ∈ V such that
Preuve. For λ > 0, we define on V the distance dλ := λd. We apply Theorem 3.7 by
replacing d by dλ .
This relies on the fact that there always is some point u ∈ V with F(u) ≤ inf_V F + ε. Inequality (3.16) then proceeds from (3.7) and (3.17) from (3.8). Theorem 3.7 certainly is stronger than Theorem 3.9. The main difference lies in inequality (3.8), which gives the whereabouts of the point v ∈ V and which has no counterpart in Theorem 3.9.
Théorème 3.10 Let E be a real Banach space, and F : E → R ∪ {+∞} be a lower semicontinuous function, Gâteaux differentiable and bounded from below.
Then for every ε > 0, every u ∈ E such that F(u) ≤ inf_E F + ε and every λ > 0, there exists
some point v ∈ E such that
F(v + tw) ≥ F(v) − (ε/λ) t kwk,
so that
(F(v + tw) − F(v))/t ≥ −(ε/λ) kwk.
Letting t ↓ 0, we obtain
hF′(v), wi ≥ −(ε/λ) kwk. (3.21)
Since this holds for every w ∈ E (in particular for −w), it follows that
kF′(v)k∗ ≤ ε/λ.
Corollaire 3.11 Let E be a real Banach space, and F : E → R ∪ {+∞} be a lower semicontinuous function, Gâteaux differentiable and bounded from below.
Remarque 3.12 We can view Corollary 3.11 as telling us that the equation F′(v) = 0, although it need have no solution, always has approximate solutions, i.e., there exists a sequence {un} such that kF′(un)k∗ → 0 as n → ∞. The cluster points of such sequences have been intensively studied.
Any point v which minimizes F over E satisfies (3.22) and (3.23) with ε = 0. On the other hand, there might not be any such point: the usual conditions ensuring the existence of a minimizer are quite stringent (F should be convex, have bounded level sets or be coercive, and E should be reflexive). What Corollary 3.11 does, even in the absence of an exact minimum, is to provide us with points which almost minimize F and almost satisfy the first-order necessary conditions. In other words, the equations F(v) = inf_E F and F′(v) = 0 can be satisfied up to any prescribed ε > 0.
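A hedged numerical illustration of this discussion (added here, not from the notes): F(x) = e^{−x} on R is C¹ and bounded from below with inf F = 0, yet the infimum is not attained; still, one can exhibit points that almost minimize F and at which F′ is almost zero, exactly as the corollary predicts.

import math

# F(x) = exp(-x): inf F = 0 is not attained, but for every eps > 0 there are
# points v with F(v) <= inf F + eps and |F'(v)| <= eps (approximate critical points).
def F(x):
    return math.exp(-x)

def dF(x):
    return -math.exp(-x)

for eps in [1e-1, 1e-3, 1e-6]:
    v = -math.log(eps)           # chosen so that F(v) = eps
    print("eps =", eps, " F(v) =", F(v), " |F'(v)| =", abs(dF(v)))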
Définition 3.13 Let E be a Banach space. We say that a C1-function F : E → R satisfies the Palais-Smale (PS) condition if every sequence (xn) in E such that (F(xn)) is bounded and lim F′(xn) = 0 in E∗ has a convergent subsequence.
Théorème 3.14 Let E be a Banach space and F : E → R be a C1-function bounded from below. Assume that F satisfies the (PS) condition. Then there exists some point u ∈ E such that F(u) = inf_E F and F′(u) = 0.
Preuve. By Corollary 3.11, there exists a sequence (un) in E such that F(un) → inf_E F and F′(un) → 0. From our assumptions, F satisfies the (PS) condition and so (un) has a subsequence (unk) that converges to some point u ∈ E. By continuity of F and F′, it follows that F(u) = lim F(unk) = inf_E F and F′(u) = lim F′(unk) = 0.
(b) Show that f is convex on U if and only if for all x, y ∈ U with x ≠ y, hxy is convex
on Ixy .
Ex. 3.16 Let p ≥ 1. Show that the function t ∈ R → |t|^p is convex on R. Deduce that the function h(x1, . . . , xn) = ∑_{i=1}^n |xi|^p is convex on Rn.
k f (x) − f (y)k ≤ kx − yk ∀ x, y ∈ Rn .
Fix( f ) := {x ∈ Rn : f (x) = x}
Ex. 3.19 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Show that the function f : H → R defined by f (x) = kxk2 for all x ∈ H, is strictly convex
on H.
Ex. 3.21 Let E be a normed linear space and K be a nonempty subset of E. We define the function ψK : E → R ∪ {+∞} as follows: ψK(x) = 0 if x ∈ K, and ψK(x) = +∞ if x ∉ K.
Ex. 3.22 Let E be a normed linear space, K be a nonempty convex subset of E. Let
f : E → R be a differentiable convex function. Show that a ∈ K is a minimizer of f on
K if and only if
D f (a) · (v − a) ≥ 0, ∀ v ∈ K.
* Hint: You may use the relation D f(a) · w = f′(a; w) for all w ∈ E.
Ex. 3.23 Let E be a normed linear space, K be a nonempty convex subset of E. Let
f : E → R be a convex function.
1. Show that the set of minimizers of f on K is a convex subset of K.
2. Show that if f is strictly convex, then this set contains at most one point.
Ex. 3.24 Let E be a normed linear space and S be a nonempty subset of E. Let f : S → R be a function which is γ-Lipschitz on S. For each x ∈ E, set
g(x) := sup_{y∈S} [ f(y) − γ d(x, y) ] and h(x) := inf_{y∈S} [ f(y) + γ d(x, y) ].
1. Show that g and h are real valued functions, i.e., g(x) and h(x) are finite.
2. Show that g(x) ≤ h(x) for all x ∈ E and that g(x) = h(x) = f (x) for all x ∈ S.
3. Show that the functions g and h are γ-Lipschitz on E.
Ex. 3.25 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Let x ∈ H and ϕ : H → R be a convex and lower semicontinuous function. Set
F(v) = (1/2) kv − xk² + ϕ(v), ∀ v ∈ H.
1. Show that there exist a ∈ H and α ∈ R such that ϕ(v) ≥ ha, vi + α for all v ∈ H.
* Hint: Take (ū, λ̄) ∈ H × R such that λ̄ < ϕ(ū) and use the geometric form of the Hahn-Banach theorem.
2. Deduce that F has a unique minimizer in H, ux characterized by the following
variational inequality:
3. Let Proxϕ : H → H be the map defined by Proxϕ(x) = ux, where ux ∈ H is the unique point satisfying (3.27). Show that the map Proxϕ is Lipschitzian.
Ex. 3.26 Let E be a finite dimensional normed linear space and f : E → R be a lower
semicontinuous function. Show that the following assertions are equivalent.
1. lim inf_{kxk→+∞} f(x)/kxk > −∞;
2. there exists r ∈ R such that for all x ∈ E, r(1 + kxk) ≤ f (x).
Ex. 3.27 Let H be a real Hilbert space with norm k · k and inner product (·, ·), and A : H → H
be a bounded linear operator. Assume that there exists a positive real number α > 0
such that:
(Av, v) ≥ αkvk2 , ∀ v ∈ H.
b. Deduce that there exists a unique u ∈ H such that J(u) = min_{v∈H} J(v).
======================================
where α > 0, k·kn and (·, ·)n are the natural norm and inner product of Rn.
1. Show that the map v → z(v) is affine from Rm to Rn i.e., z(v) = y0 + l(v) where
l ∈ L (Rm , Rn ) and y0 ∈ Rn .
2. Deduce that J is strictly convex.
3. Let K be a nonempty closed convex subset of Rm . Show that J has a unique mini-
mizer on K.
Ex. 3.29 Let E be a normed linear space and C be a nonempty subset of E. Let dC : E → R be the distance function associated to C, defined by dC(x) := inf{kx − ck : c ∈ C}.
Optimization: Tutorial-Set I
December 05, 2013.
======================================
Ex. 3.30 Let p ≥ 1. Show that the function t ∈ R → |t|^p is convex on R. Deduce that the function h(x1, . . . , xn) = ∑_{i=1}^n |xi|^p is convex on Rn.
k f (x) − f (y)k ≤ kx − yk ∀ x, y ∈ Rn .
Fix( f ) := {x ∈ Rn : f (x) = x}
Ex. 3.33 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Show that the function f : H → R defined by f (x) = kxk2 for all x ∈ H, is strictly convex
on H.
Ex. 3.34 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Let u ∈ H and ϕ : H → R be a convex and lower semicontinuous function. Set
F(v) = (1/2) kv − uk² + ϕ(v), ∀ v ∈ H.
1. Show that there exist a ∈ H and α ∈ R such that ϕ(v) ≥ ha, vi + α for all v ∈ H. [Hint: you may use the geometric form of the Hahn-Banach theorem.]
2. Deduce that F has a unique minimizer in H.
Optimization: Tutorial-Set II
December 11, 2013.
======================================
Ex. 3.35 Let H be a real inner product space with norm k · k and inner product h·, ·i.
Let x ∈ H and ϕ : H → R be a convex and lower semicontinuous function. Let
F(v) = (1/2) kv − xk² + ϕ(v), ∀ v ∈ H.
1. Show that there exist a ∈ H and α ∈ R such that ϕ(v) ≥ ha, vi + α for all v ∈ H.
* Hint: Take (ū, λ̄) ∈ H × R such that λ̄ < ϕ(ū) and use the geometric form of the Hahn-Banach theorem.
2. Deduce that F has a unique minimizer in H, ux characterized by the following
variational inequality:
3. Let Proxϕ : H → H be the map defined by Proxϕ(x) = ux, where ux ∈ H is the unique point satisfying (3.27). Show that the map Proxϕ is Lipschitzian.
4. Let K be a nonempty closed convex subset of H. Assume that ϕ = ψK . Show that
Proxϕ is the orthogonal projection map onto K.
Ex. 3.36 Let E be a normed linear space and C be a nonempty subset of E. Let dC :
E → R be the distance function associated to C, defined by dC(x) := inf{kx − ck : c ∈ C}.
Ex. 3.37 Let E be a normed linear space and S be a nonempty subset of E. Let f : S → R be a function which is γ-Lipschitz on S. For each x ∈ E, set
g(x) := sup_{y∈S} [ f(y) − γ d(x, y) ] and h(x) := inf_{y∈S} [ f(y) + γ d(x, y) ].
1. Show that g and h are real valued functions, i.e., g(x) and h(x) are finite.
2. Show that g(x) ≤ h(x) for all x ∈ E and that g(x) = h(x) = f (x) for all x ∈ S.
3. Show that the functions g and h are γ-Lipschitz on E.
- T HE P EDAGOGICAL T EAM
======================================
Ex. 3.38 Find the subdifferentials of the following convex functions: f : R → R ∪ {+∞},
f (x) = 0 if |x| ≤ 1 and f (x) = +∞ otherwise; g : R → R ∪ {+∞}, g(x) = 1/x if 0 < x < 1
and g(x) = +∞ otherwise.
Ex. 3.39 Let E be a normed linear space and C be a nonempty subset of E. Let dC :
E → R be the distance function associated to C, defined by dC(x) := inf{kx − ck : c ∈ C}.
Ex. 3.40 Let C be a nonempty subset of a normed linear space, a ∈ C, and let {vk}k∈N be a sequence of TC(a) that converges to v ∈ E. We know from the definition of TC(a) that for each k ∈ N, there exist a sequence (x_n^k)n∈N in C converging to a and a sequence of positive real numbers t_n^k with t_n^k → 0 as n → ∞ such that vk = lim_{n→∞} (t_n^k)⁻¹ (x_n^k − a).
1. Show that there exists an increasing sequence of natural numbers nk such that:
s_k := t_{n_k}^k < 1/k and k s_k⁻¹ (x_{n_k}^k − a) − v_k k < 1/k.
2. Show that s_k ↓ 0 and v = lim_{k→∞} s_k⁻¹ (x_{n_k}^k − a), and deduce that TC(a) is closed in E.
Ex. 3.41 Let C be a nonempty convex subset of a normed linear space and a ∈ C. Show that x − a ∈ TC(a) for all x ∈ C.
e. If f ≤ g then g∗ ≤ f ∗ ;
f. f ∗ takes the value −∞ somewhere if and only if f ≡ +∞ and in this case f ∗ ≡ −∞;
g. If f is convex then a∗ ∈ ∂ f (a) if and only if ha∗ , ai = f (a) + f ∗ (a∗ ).
What is the Legendre-Fenchel conjugate of the function (1/p) k · k_p^p? Deduce through the Fenchel inequality that for any x, y ∈ Rn,
hx, yi ≤ (1/p) kxk_p^p + (1/q) kyk_q^q.
- T HE P EDAGOGICAL T EAM
4.1 Optimization problems with equality constraints
Preuve. We rectify the problem and use the implicit function theorem. The proof is divided into two steps.
Step 1: Consequences of (QC). Since x̄ satisfies (QC), the Jacobian matrix of ϕ at x̄, denoted by Dϕ(x̄), has rank m. Let {e1, . . . , em} be the canonical basis of Rm. There exist vectors h1, . . . , hm in Rn such that
Dϕ(x̄) hi = ei, ∀ i = 1, . . . , m. (4.3)
H(x̄, 0) = 0, (4.5)
(∂H/∂t)(x̄, 0) = I. (4.6)
Therefore, from the implicit function theorem, there exists an open neighborhood U1
of x̄ and a function t : U1 → Rm of class C1 such that
t(x̄) = 0, (4.7)
H(x,t(x)) = 0, ∀ x ∈ U1 . (4.8)
or equivalently,
Dϕi (x̄) + Dti (x̄) = 0, i = 1, . . . , m. (4.10)
∀x ∈ U3 ∩ K, F(x) ≥ F(x̄).
On the other hand, using the continuity of the map x → x + ∑_{i=1}^m ti(x) hi at x̄, there exists an open neighborhood U4 of x̄ such that
x + ∑_{i=1}^m ti(x) hi ∈ U3 ∩ K, ∀ x ∈ U4. (4.12)
Now, define the function γ : U4 → R by γ(x) = F(x + ∑_{i=1}^m ti(x) hi). Then we have γ(x̄) = F(x̄) and, ∀ x ∈ U4, γ(x) ≥ γ(x̄). Therefore x̄ is a minimizer of γ on U4. So,
Dγ(x̄) = 0. (4.13)
or equivalently,
DF(x̄) · (I + ∑_{i=1}^m hi Dti(x̄)) = DF(x̄) + ∑_{i=1}^m [DF(x̄) · hi] Dti(x̄) = 0. (4.14)
(∂L/∂x)(x̄, λ̄) = 0. (4.17)
Théorème 4.7 (Sufficient conditions for a minimizer) Let x̄ ∈ K and λ̄ ∈ Rm be such that
Dx L(x̄, λ̄) = 0,
(D²x L(x̄, λ̄) d, d) > 0, ∀ d ∈ T(C, x̄). (4.19)
Then x̄ is a strict local minimizer of F on C.
Preuve. The proof is by contradiction. Suppose that x̄ is not a strict local minimizer of
F on C. Then there exists a sequence {xn } in C satisfying:
1. xn → x̄,
2. xn ≠ x̄ ∀ n ≥ 1,
3. F(xn ) ≤ F(x̄) ∀ n ≥ 1.
Set
tn = kxn − x̄k and dn = (xn − x̄)/tn, ∀ n ≥ 1. (4.21)
We have xn = x̄ + tn dn. Doing a Taylor expansion around x̄ gives: there exist a function
ε defined in a neighborhood of x̄ and θ ∈ [0, 1] such that ε(x) → 0 as x → x̄ and (4.24) holds,
with εn → 0 as n → ∞. Further, we have L(xn, λ̄) = F(xn), L(x̄, λ̄) = F(x̄) and Dx L(x̄, λ̄) = 0. So, (4.24) implies:
(D²x L(x̄, λ̄) dn, dn) + 2 kdnk² εn = (F(xn) − F(x̄))/(tn²/2) ≤ 0, because F(xn) ≤ F(x̄). (4.25)
Passing to the limit over n (along a subsequence for which dn → d with kdk = 1 and d ∈ T(C, x̄)), we obtain (D²x L(x̄, λ̄) d, d) ≤ 0, which contradicts (4.19).
Définition 4.8 A point x̄ ∈ K satisfies (QC) if the vectors ∇h1(x̄), . . . , ∇hm(x̄), ∇gj(x̄), j ∈ I(x̄), are linearly independent, where I(x̄) := { j | gj(x̄) = 0}.
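As a hedged worked example of the first-order condition (4.17) (constructed for this note, not taken from the text): minimize F(x) = x1² + x2² subject to ϕ(x) = x1 + x2 − 1 = 0. With the Lagrangian L(x, λ) = F(x) + λϕ(x), the conditions ∇x L = 0 and ϕ = 0 give x̄ = (1/2, 1/2) and λ̄ = −1, which the sketch below verifies and cross-checks against a brute-force search.

import numpy as np

# Equality-constrained problem: minimize F(x) = x1^2 + x2^2 s.t. phi(x) = x1 + x2 - 1 = 0.
# Lagrangian: L(x, lam) = F(x) + lam * phi(x); first-order condition: grad_x L = 0, phi = 0.
def grad_x_L(x, lam):
    return np.array([2.0*x[0] + lam, 2.0*x[1] + lam])

def phi(x):
    return x[0] + x[1] - 1.0

x_bar = np.array([0.5, 0.5])
lam_bar = -1.0
print("grad_x L(x_bar, lam_bar) =", grad_x_L(x_bar, lam_bar))   # expected [0, 0]
print("phi(x_bar) =", phi(x_bar))                               # expected 0

# Brute-force check on the constraint line x2 = 1 - x1.
t = np.linspace(-5.0, 5.0, 200001)
vals = t**2 + (1.0 - t)**2
print("constrained minimizer from grid search: x1 =", t[np.argmin(vals)])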
Calculus of variations
5.1 Introduction
The calculus of variations gives us precise analytical techniques to answer questions
of the following type:
+ Find the shortest path (i.e., geodesic) between two given points on a surface.
+ Find the curve between two points in the plane that yields a surface of revolution
of minimum area when revolved around a given axis.
+ Find the curve along which a bead will slide (under the effect of gravity) in the
shortest time.
Here, x0 , x1 are given real numbers, and any C1 -function x = x(t) satisfying these two
boundary conditions is said to be admissible.
The problem above can be seen as an optimisation problem in infinitely many variables (one for each t ∈ [t0, t1]). But fortunately the possible minimising and maximising functions x(t) can be found among the solutions to a certain ordinary differential equation. This is the content of the following section.
To reveal the nature of the above differential equation, it is instructive to carry out the t-differentiation using the chain rule (this is allowed at least if the solution x∗(t) is a C2-function). In that case, it follows that
Lyy(t, x(t), x′(t)) x″(t) + Lxy(t, x(t), x′(t)) x′(t) + Lty(t, x(t), x′(t)) − Lx(t, x(t), x′(t)) = 0.
However, x∗ is not a maximiser (J(x) can be seen to take arbitrarily large values), but it will follow later from sufficient conditions that x∗ is a minimiser.
Preuve. The point of departure is to show the so-called fundamental lemma of the calculus of variations: if f is continuous on [t0, t1] and ∫_{t0}^{t1} f(t)ϕ(t) dt = 0 for every C2-function ϕ on [t0, t1] such that ϕ(t0) = ϕ(t1) = 0, then f(t) = 0 for every t ∈ [t0, t1].
Preuve. Suppose there exists s ∈ ]t0, t1[ such that f(s) ≠ 0, say f(s) > 0. By continuity, there is some interval I := ]a, b[ on which f(t) > 0, with t0 < a < b < t1. On I, define ϕ(t) = (t − a)³(b − t)³ > 0, and let ϕ(t) = 0 outside I. Then ϕ is C2 and
0 = ∫_{t0}^{t1} f(t)ϕ(t) dt > 0,
which is a contradiction. Hence f(t) = 0 for every t.
The Lemma 5.3 will be used by forming a so-called variation of the given function
For simplicity, the proof continues with the case of a minimum at x∗ (the case of max-
imum is similar). This means that
Thus
I′(0) = 0.
This is, of course, under the assumption that α → I(α) is differentiable under the integral sign above (this will be proved later). Proceeding from this, one arrives at
I′(0) = ∫_{t0}^{t1} [ Lx(t, x∗, x∗′) ϕ(t) + Ly(t, x∗, x∗′) ϕ′(t) ] dt.
The Legendre condition is often useful when one wants to show that a solution candidate is not a maximizer (or a minimizer). This is elucidated by the next example.
one finds at once that Lyy = 2 > 0, and this rules out that the admissible function x∗(t) = e^{1+t} − e^{1−t} be a maximiser. But it is still open whether x∗ is a minimiser or not. This cannot be concluded from the fact that Lyy > 0.
One of the possible complications in practice is that there are, for good reasons, further constraints on the admissible functions. This can for example lead us to the problem of finding a C1-function x : [t0, t1] → R such that
max ∫_{t0}^{t1} L(t, x, x′) dt; x(t0) = x0, x(t1) = x1; (5.8)
h(t, x(t), x′(t)) > 0 ∀ t ∈ [t0, t1]. (5.9)
Here h is a suitable C1 -function defining the constraint. However, one can show that
the Euler-Lagrange and Legendre conditions are necessary also for such problems.
(i) Ly(t1, x∗(t1), x∗′(t1)) = 0;
(ii) Ly(t1, x∗(t1), x∗′(t1)) ≥ 0 (and = 0 holds if x∗(t1) < 0);
(iii) L(t1, x∗(t1), x∗′(t1)) − g′(t1) − x∗′(t1) Ly(t1, x∗(t1), x∗′(t1)) = 0.
Preuve. Set y1 = x∗ (t1 ). Since x∗ minimises J, the inequality J(x∗ ) ≤ J(y) holds in partic-
ular for all admissible functions that satisfy y(t1 ) = y1 . Therefore x∗ is also a solution of
the basic problem on the fixed time interval [t0 ,t1 ] and with data x0 , y1 . Consequently
x∗ satisfies the Euler-Lagrange equation.
The transversality condition (i) can be proved as a continuation of the proof of Theorem 5.1: since the terminal value x(t1) is not fixed in this context, it is possible that the variation ϕ(t) is such that ϕ(t1) ≠ 0 (but ϕ(t0) = 0 is still required); again it is seen that I′(0) = 0. Since it is already known that x∗ solves the Euler-Lagrange equation, it follows from the integration by parts leading to (5.5) that
Ly(t1, x∗(t1), x∗′(t1)) ϕ(t1) = 0. (5.10)
Taking ϕ such that ϕ(t1 ) = 1 the conclusion in (i) follows. With a little more effort
also (ii) can be obtained along these lines. However, (iii) requires the implicit function
theorem and a longer argument, so details are skipped here.
Théorème 5.8 Suppose L(t, x, y) is convex with respect to (x, y), and that x∗ (t) satisfies the
Euler-Lagrange equation and one of the terminal conditions. Then x∗ is a global minimiser.
Exemple 5.9 Since L(t, x, y) = x² + y² is convex in Example 5.2, the solution x∗(t) = e^{1+t} − e^{1−t} actually minimises the functional J subject to the conditions x(0) = 0 and x(1) = e² − 1.
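For the Lagrangian of Examples 5.2 and 5.9, L(t, x, y) = x² + y², the Euler-Lagrange equation reduces to x″(t) = x(t). The sketch below (an added illustration using sympy) imposes the boundary conditions x(0) = 0 and x(1) = e² − 1 on the general solution and recovers x∗(t) = e^{1+t} − e^{1−t}.

import sympy as sp

t, C1, C2 = sp.symbols('t C1 C2')

# Euler-Lagrange equation for L(t, x, y) = x^2 + y^2: d/dt(2x') - 2x = 0, i.e. x'' = x.
# General solution: x(t) = C1*exp(t) + C2*exp(-t).
x = C1*sp.exp(t) + C2*sp.exp(-t)

# Impose the boundary conditions x(0) = 0 and x(1) = e^2 - 1.
constants = sp.solve([x.subs(t, 0), x.subs(t, 1) - (sp.E**2 - 1)], [C1, C2])
x_star = x.subs(constants)
print(sp.simplify(x_star))                                   # equals exp(1 + t) - exp(1 - t)
print(sp.simplify(sp.diff(x_star, t, 2) - x_star))           # 0: it solves x'' = x
print((x_star - (sp.exp(1 + t) - sp.exp(1 - t))).equals(0))  # True: matches Example 5.9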
5.5 Exercices
Ex. 5.10 (i) Calculate the value of J(x) = ∫_0^1 (x(t)² + x′(t)²) dt in the cases
2. x(t) = e^{2t} − 1,
They all go through (0, 0) and (1, e² − 1), hence are admissible.
(ii) Show that J(x) has no maximum on the curves joining (0, 0) and (1, e² − 1).
Ex. 5.11 Investigate what the Euler Lagrange equations give in the special cases when
1. L = L(t, x);
2. L = L(t, y);
3. L = L(x, y).
Ex. 5.12 Let J(x) = ∫_1^2 t⁻² x′(t)² dt and consider the boundary conditions x(1) = 1, x(2) = 2.
2. Show that the maximisation problem for J has no solution. (Try x(t) = at² + (1 − 3a)t + 2a.)
Find the solution of the Euler-Lagrange equation and determine the value of a for which this solution is admissible. For this value of a, find the solution of the problem.
Ex. 5.14 The length of the graph of a C1-function x(t) which connects (t0, x0) to (t1, x1) is given by
L(x) = ∫_{t0}^{t1} √(1 + x′(t)²) dt.
Prove that L(x) attains its minimum over the admissible functions exactly when x(t)
has a straight line as its graph.
6.1 Introduction
In this chapter, we present Pontryagin's principle. It concerns the study of dynamical optimization problems. Roughly speaking, it is a way to control a system (mechanical, chemical, physical, economic, ...) with a minimum cost.
dx/dt = f(t, x, u) on ]t0, t1[, x(t0) = x0,
where x0 ∈ Rn and t0 ,t1 ∈ R are fixed with t0 < t1 ; f is a given function defined from
R × Rn ×U into Rn . As we can see, one of the arguments of the function f is a function
u defined on the interval ]t0 ,t1 [ and taking values in the given set U. We assume that
U is a nonempty and closed subset of Rm. This function u represents mathematically the actions (or decisions) that one can make on the system. The set U corresponds to any restrictions or constraints that the control u must respect (for example, limited resources, or limits on the acceleration or speed when driving). Formulating an optimal control problem amounts to defining the state of the system and the differential equation governing its evolution, the class of admissible controls u(t) and finally a criterion of evolution, or cost function, J whose typical form is:
J(u) = ∫_{t0}^{t1} g(t, x(t), u(t)) dt + h(x(t1)), (6.1)
where f(t, x, u) = (x2, u) and x = (x1, x2). We know that the consumption of a car depends on how the acceleration is used. So, we take as cost function J the consumption of fuel and as control the acceleration. Therefore, we have the following optimal control problem:
inf_u J(u) := consumption of fuel,
dx/dt = f(t, x(t), u(t)), (6.4)
x(0) = (x1⁰, x2⁰).
(ii) h ∈ C1(Rn),
These assumptions on f guarantee the existence and uniqueness of the solution of the differential system (6.1). In fact, let u = u(t) be fixed. Then the function f̄(t, x) = f(t, x, u) satisfies
k (∂ f̄/∂x)(t, x) k ≤ CR (1 + ku(t)k),
for kxk ≤ R and for all t ∈ ]−R, R[. Hence k f̄(t, x)k ≤ C(1 + kxk + ku(t)k) for x ∈ Rn and for almost every t ∈ ]t0, t1[. According to the existence theorem for ordinary differential equations, these bounds are enough, whenever u ∈ L²(]t0, t1[; Rm), to guarantee the existence of a unique solution x ∈ C([t0, t1], Rn) of (6.1) satisfying:
x(t) = x0 + ∫_{t0}^{t} f(s, x(s), u(s)) ds, ∀ t ∈ ]t0, t1[.
Moreover, x(t) depends continuously on x0. In particular, we have kx(t)k ≤ R for all t ∈ [t0, t1] for some constant R > 0 (depending on u). From the assumptions, it follows that the criterion J given by (6.1) is well defined and the infimum in (6.1) is finite, i.e., inf J > −∞.
H(t, x, p, u) := g(t, x, u) + hp, f (t, x, u)i for all (t, x, p, u) ∈ R × Rn × Rn ×U. (6.5)
Définition 6.2 Let u(t) be a control and x(t) be the corresponding state, that is,
dx/dt = f(t, x, u), x(t0) = x0.
The adjoint state associated to (u(t), x(t)) is the unique solution p(t) of the following system:
dp/dt = −(∂H/∂x)(t, x(t), p(t), u(t)) on ]t0, t1[, p(t1) = h′(x(t1)). (6.6)
Théorème 6.3 (Minimum Principle of Pontryagin) Assume that the conditions (i)–(vi) are satisfied. Let ū : [t0, t1] −→ U be a solution of (6.1). Then for all t ∈ ]t0, t1[,
ū(t) realizes the minimum over U of the function u ↦ H(t, x̄(t), p̄(t), u), (6.7)
where x̄ and p̄ are the state and the adjoint state associated to ū(t).
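To make Définition 6.2 and Théorème 6.3 concrete, here is a hedged toy example (constructed for this note, not from the text): minimize J(u) = ∫_0^1 ½u(t)² dt + ½(x(1) − 1)² subject to dx/dt = u, x(0) = 0, with U = R. Then H(t, x, p, u) = ½u² + pu, the adjoint equation gives a constant p with p(1) = x(1) − 1, and minimizing H over u yields u = −p; solving the coupled conditions gives the constant optimal control u ≡ ½, which the sketch below confirms numerically.

import numpy as np

# Toy problem: minimize J(u) = int_0^1 0.5*u^2 dt + 0.5*(x(1) - 1)^2,
# subject to dx/dt = u, x(0) = 0, U = R.
# Hamiltonian: H = 0.5*u^2 + p*u;  adjoint: dp/dt = -dH/dx = 0, p(1) = x(1) - 1.
# Minimizing H over u gives u = -p; together with x(1) = -p this yields u = 1/2.

def cost(u_const, n=200):
    dt = 1.0 / n
    x, running = 0.0, 0.0
    for _ in range(n):
        running += 0.5 * u_const**2 * dt
        x += u_const * dt            # explicit Euler for dx/dt = u (u constant here)
    return running + 0.5 * (x - 1.0)**2

us = np.linspace(-1.0, 2.0, 3001)
costs = np.array([cost(u) for u in us])
print("best constant control found by grid search:", us[np.argmin(costs)])
print("Pontryagin prediction: u = 0.5, cost =", cost(0.5))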