
Chapter 5

Minimization of Functionals

The collection of intertwining definitions, theorems, lemmas, and propositions that
constitute optimization theory in a functional analytic setting can be intimidating
for those who are not full-time mathematicians. However, the task of utilizing the
functional analytic framework in practical problems is made tractable by keeping in
mind that all of these concepts are generalizations of much simpler, more familiar
ideas. In this chapter we will emphasize that an understanding of minimization
of convex functionals over abstract spaces is facilitated by drawing analogies to
some well-known facts from elementary calculus. At the heart of any optimization
problem, whether it arises in structural design or control design, lies the task of
finding the extrema of some function that represents a cost, usually subject to a set
of constraints. In solving this problem, at least three "natural" questions arise:
• Is the optimization problem at least well posed? That is, does there exist an
optimal solution for the problem as stated?
• How can we find the extrema that represent a solution of the optimization
problem?
• Can we guarantee that the extrema we have found are the minima or maxima
we sought?
In the language of mathematics, the first question above is a statement of the
well-posedness and existence of a solution to an optimization problem, the second
question seeks to characterize the optimal solutions, and the third asks for conditions
sufficient to certify that a candidate solution is indeed the minimum or maximum sought.

5.1 The Weierstrass Theorem

If the measure of the utility of any theorem is judged by how concisely it may
be expressed, and how widely it may be applied, then the Weierstrass Theorem
rightly plays a central role in optimization theory. It provides sufficient conditions
for the solution of the optimization problem where we seek to find u ∈ U ⊆ X
such that
\[
f(u) = \inf_{v \in U} f(v)
\]

without requiring the differentiability properties for f that are needed in Theorem 5.3.2.
It is important to note that this theorem combines continuity and compactness
requirements to guarantee the solution of the optimization problem.
Theorem 5.1.1. Let (X, τ) be a compact topological space and let the functional
f : X → R be continuous. Then there exists an x0 ∈ X such that
\[
f(x_0) = \inf_{x \in X} f(x).
\]

Because the proof of this theorem, while well-known, is instructive and serves as
a model for the proofs of more general results in this chapter, we will summarize
it here. We will require the following alternative characterizations of continuity
on topological spaces to carry out this proof in a manner that can be “lifted” to
more general circumstances. Recall that one of the most common definitions of
continuity is cast in terms of inverse images of open sets.
Definition 5.1.1. Let (X, τx) and (Y, τy) be topological spaces. A function f : X →
Y is continuous at x0 ∈ X if the inverse image of every open set O in Y that
contains f(x0) is an open set in X that contains x0. That is,
\[
O \in \tau_y \ \text{and} \ f(x_0) \in O \implies x_0 \in f^{-1}(O) \in \tau_x.
\]

A function that is continuous at each point of a topological space is said to be
continuous on that space. The following definition restates this fact in two
equivalent ways.
Definition 5.1.2. Let (X, τx) and (Y, τy) be topological spaces. The following are
equivalent:
(i) f : X → Y is continuous.
(ii) The inverse image under f of every open set in Y is an open set in X.
That is,
\[
O \in \tau_y \implies f^{-1}(O) \in \tau_x.
\]
(iii) The inverse image of every closed set in Y is a closed set in X.
Now we return to the proof of Theorem 5.1.1.
Proof. Suppose that α = inf_{x∈X} f(x). Consider the sequence of closed sets in the
range of f,
\[
Q_k = \left[ \alpha - \frac{1}{k},\; \alpha + \frac{1}{k} \right], \qquad k \in \mathbb{N}.
\]
From this sequence of closed sets, we can construct a sequence of closed sets in
the domain X,
\[
C_k = \{ x \in X : f(x) \in Q_k \}.
\]
Each Ck is closed as the inverse image of a closed set under the continuous
function f, and each is nonempty because α is the infimum of f over X. By
construction, this sequence of sets is nested,
\[
C_{k+1} \subseteq C_k \qquad \forall\, k \in \mathbb{N},
\]

and each Ck is compact, being a closed subset of a compact set. The sequence of
compact sets {C_k}_{k=1}^∞ clearly satisfies the finite intersection property, so that
\[
\exists\, x_0 \in \bigcap_{k=1}^{\infty} C_k.
\]
It consequently follows that
\[
f(x_0) = \alpha = \inf_{x \in X} f(x). \qquad \square
\]

5.2 Elementary Calculus


To set the foundation for the analysis that follows, let us review some well-known
results from elementary calculus that give relatively straightforward methods for
answering these questions for cost functions defined in terms of a single real
variable. When we express an optimization problem in terms of a function f that maps
the real line into itself, that is,
\[
f : \mathbb{R} \to \mathbb{R}, \tag{5.1}
\]
we seek to find a real number x0 ∈ R such that
\[
f(x_0) = \inf_{x \in C} f(x), \tag{5.2}
\]

where C is the constraint set. Now, there are many ways in which we can construct
simple functions for which there is no minimizer over the constraint set. If the
constraint set is unbounded, such as the entire real line, an increasing function
like f (x) = x obviously does not have a minimizer. Even if the constraint set is
bounded, for example C ≡ (0, 1], there is no minimizer for the simple function
f (x) = x. Intuitively, we would like to say that x0 = 0 is the minimizer, but
this point is not in the constraint set C. While there are many theorems that can
describe when a function will achieve its minimum over some constraint set, one
prototypical example is due to Weierstrass.
Theorem 5.2.1. If f is a continuous, real-valued function defined on a closed and
bounded subset C of the real line, then f achieves its minimum on C.
If f is in fact a differentiable function of the real variable x, and is defined on all
of R, then the problem of characterizing the values of x where extrema may occur
is well known: the extrema may occur only when the derivative of the function f
vanishes. From elementary calculus we know that:
Theorem 5.2.2. If f is a differentiable, real-valued function of the real variable x
and is defined on all of R, then
\[
f(x_0) = \inf_{x \in \mathbb{R}} f(x) \quad \text{implies that} \quad f'(x_0) = 0.
\]

In fact, most students studying calculus for the first time spend a great deal of time
finding the zeros of the derivative of a function, in order to find the extrema of the

function. Soon after learning that the first derivative can be used to characterize
the possible locations of the extrema of a real-valued function, the student of
calculus is taught to examine the second derivative of a function to gain some
insight into the nature of the extrema.
Theorem 5.2.3. If f is a twice differentiable, real-valued function defined on all of
R, f'(x0) = 0, and
\[
f''(x_0) > 0, \tag{5.3}
\]
then x0 is a relative minimum. In other words,
\[
f(x) \ge f(x_0)
\]
for all x in some neighborhood of x0.
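As a quick illustration of Theorems 5.2.2 and 5.2.3 (an elementary example of our
own, not from the text), consider
\[
f(x) = x^2 - 2x, \qquad f'(x) = 2x - 2 = 0 \iff x_0 = 1, \qquad f''(1) = 2 > 0,
\]
so x0 = 1 is a relative (indeed global) minimum, with f(1) = −1.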
Of course, readers will recognize these theorems immediately. These theo-
rems are fundamentals of the foundations of real variable calculus, and require no
abstract, functional analytic framework whatsoever. Because of their simple form
and graphical interpretation, they are easy to remember. They are important to
this chapter in that they provide a touchstone for more abstract results in func-
tional analysis that are required to treat optimization problems in mechanics. For
example, the elastic energy stored in a beam, rod, plate, or membrane cannot be
expressed in terms of a real-valued function of a real variable f (x). One can hy-
pothesize that equilibria of these structures correspond to minima in their stored
energy, but the expressions for the stored energy are not classically differentiable
functions of a real variable. We cannot simply differentiate the energy expressions
in a classical sense to find the equilibria as described in the above theorems.
What is required, then, is a generalization of these theorems that is suffi-
ciently rich to treat the meaningful collection of problems in mechanics and control
theory. For a large class of problems, we will find that each of the simple, intuitive
theorems above can be generalized so that they are meaningful for problems in
control and mechanics. In particular, this chapter will show that:
• The Weierstrass Theorem can be generalized to a functional analytic frame-
work. To pass to the treatment of control and mechanics problems, we will
need to generalize the idea of considering closed and bounded subsets of the
real line, and consider compact subsets of topological spaces. We will need to
generalize the notion of continuity of functions of a real variable to continuity
of functionals on topological spaces.
• The characterization of minima of real-valued functions by derivatives that
vanish will be generalized by considering Gateaux and Fréchet derivatives of
functionals on abstract spaces. It will be shown that Theorem 5.2.2 has an
immediate generalization to a functional analytic setting.
• The method of determining that a given extremum of a real-valued function is
a relative minimum, by checking to see if its second derivative is positive, also
has a simple generalization. In this case, a relative minimum can be deduced if
the second Gateaux derivative is positive.

5.3 Minimization of Differentiable Functionals


Now we can state our first step in "lifting" the results from elementary calculus for
characterizing minima of a real-valued function, described in Equations (5.1)–(5.3).
We consider only functionals having relatively strong differentiability properties
to begin, and weaken these assumptions in subsequent sections. It is important to
note that the results in this section are strictly local in character. That is, if X is
a normed vector space and f : X → R is an extended functional, f is said to have
a local minimum at x0 ∈ X if there exists a neighborhood N(x0) such that
\[
f(x_0) \le f(y) \qquad \forall\, y \in N(x_0).
\]
This is clearly a condition that is directly analogous to the local character of the
characterization of extrema of real-valued functions. In fact, the primary results
of this section are derived by exploiting the identification of f(x0 + th) with a
real-valued function
\[
g(t) \equiv f(x_0 + th),
\]
where t ∈ [0, 1] and h ∈ X. Note that for fixed x0, h ∈ X, g(t) is a real-valued
function. Indeed, if g is sufficiently smooth, uniformly for all x0 and h in some
subset of X, we can expand g in a Taylor series about t = 0:
\[
g(t) = g(0) + \sum_{k=1}^{n} \frac{t^k g^{(k)}(0)}{k!} + R_{n+1}.
\]

Now we obtain the most direct, simple generalization of Theorem 5.2.2 for real-
variable functions.
Theorem 5.3.1. Let X be a normed vector space, and let f : X → R. If f has a
local minimum at x0 ∈ X and the Gateaux derivative Df(x0) exists, we have
\[
\langle Df(x_0), h \rangle_{X^* \times X} = 0 \qquad \forall\, h \in X.
\]
Proof. By assumption, the limit
\[
\langle Df(x_0), h \rangle_{X^* \times X} = \lim_{t \to 0} \frac{f(x_0 + th) - f(x_0)}{t}
\]
exists. Since x0 is a local minimum, we have that
\[
\frac{f(x_0 + th) - f(x_0)}{t} \ge 0 \qquad \forall\, x_0 + th \in N(x_0), \ t > 0.
\]
But as we take the limit as t → 0 for t > 0, we always have x0 + th ∈ N(x0) for t
small enough, for any h ∈ X. This fact implies that
\[
\langle Df(x_0), h \rangle_{X^* \times X} \ge 0 \qquad \forall\, h \in X.
\]



By choosing h = ±ξ, we can write
\[
\pm \langle Df(x_0), \xi \rangle_{X^* \times X} \ge 0,
\]
and consequently we obtain Df(x0) ≡ 0 ∈ X∗. □
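To see Theorem 5.3.1 at work, consider a standard example (our illustration, not
taken from the text). Let X be a real Hilbert space, let a(·, ·) be a symmetric,
continuous bilinear form on X, let ℓ ∈ X∗, and define the quadratic energy functional
\[
f(x) = \tfrac{1}{2}\, a(x, x) - \ell(x).
\]
Expanding f(x0 + th) and letting t → 0 gives the Gateaux derivative
\[
\langle Df(x_0), h \rangle_{X^* \times X} = \lim_{t \to 0} \frac{f(x_0 + th) - f(x_0)}{t} = a(x_0, h) - \ell(h),
\]
so the necessary condition of Theorem 5.3.1 is the variational equation a(x0, h) = ℓ(h)
for all h ∈ X. This is precisely the weak form of the equilibrium equations for the
elastic structures mentioned in Section 5.2, whose stored energy cannot be
differentiated classically.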

For some functionals that have higher order differentiability properties, it is
also possible to express sufficient conditions for the existence of local extrema in
terms of positivity of the differentials. These conditions appear remarkably similar
to results from the calculus of real-variable functions.

Theorem 5.3.2. Let X be a normed vector space and let f : X → R be a functional.
Suppose that n is an even number with n ≥ 2 and
(i) f is n times Fréchet differentiable in a neighborhood of x0,
(ii) D(n)f is continuous at x0,
(iii) the derivatives of order lower than n vanish at x0, that is,
Df(x0) = · · · = D(n−1)f(x0) = 0, and
(iv) the nth derivative is coercive, that is,
\[
\langle D^{(n)} f(x_0), (h, h, \ldots, h) \rangle \ge c \|h\|_X^n
\]
for some constant c > 0.
Then f has a strict local minimum at x0.

Proof. The proof is left as an exercise. □
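A sketch of the argument (our own, using the Taylor expansion of g(t) = f(x0 + th)
given above): since the Taylor coefficients g^{(k)}(0) = ⟨D(k)f(x0), (h, . . . , h)⟩
vanish for k < n by hypothesis (iii),
\[
f(x_0 + h) - f(x_0) = \frac{1}{n!} \langle D^{(n)} f(x_0), (h, \ldots, h) \rangle + o(\|h\|_X^n) \ge \left( \frac{c}{n!} - o(1) \right) \|h\|_X^n > 0
\]
for all sufficiently small h ≠ 0, where the remainder estimate uses the continuity
of D(n)f at x0.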

5.4 Equality Constrained Smooth Functionals


In the last section, we discussed necessary conditions for the existence of a local
minimum for unconstrained optimization problems. We now consider constrained
optimization problems where we seek x0 ∈ X such that
\[
f(x_0) = \inf_{x \in C} f(x),
\]
where
\[
C = \{ x : g(x) = 0 \}.
\]

Provided that the functions are smooth enough and the constraints are regular,
there is a very satisfactory Lagrange multiplier representation for this problem.

Definition 5.4.1. Let X and Y be Banach spaces. Suppose g : X → Y is Fréchet
differentiable on an open set O ⊂ X, and suppose that the Fréchet derivative
Dg(x0) is continuous at a point x0 ∈ O in the uniform topology on the space
L(X, Y). The point x0 ∈ O is a regular point of the function g if Dg(x0) maps X
onto Y.

Ljusternik’s Theorem
As will be seen in many applications, the regularity of the constraints plays an im-
portant role in justifying the applicability of Lagrange multipliers to many equality
constrained problems. In fact, this pivotal role is made clear in the following the-
orem due to Ljusternik.
Theorem 5.4.1 (Ljusternik's Theorem). Let X and Y be Banach spaces. Suppose
that
(i) g : X → Y is Fréchet differentiable on an open set O ⊆ X,
(ii) g is regular at x0 ∈ O, and
(iii) the Fréchet derivative x0 → Dg(x0) is continuous at x0 in the uniform
operator topology on L(X, Y).
Then there is a neighborhood N(y0) of y0 = g(x0) and a constant C such that the
equation
\[
y = g(x)
\]
has a solution x for every y ∈ N(y0), and this solution satisfies
\[
\|x - x_0\|_X \le C \|y - y_0\|_Y.
\]
With these preliminary definitions, we can now state the Lagrange multiplier the-
orem for equality constrained extremization.
Theorem 5.4.2. Let X and Y be Banach spaces, f : X → R, and g : X → Y.
Suppose that:
(i) f and g are Fréchet differentiable on an open set O ⊆ X,
(ii) the Fréchet derivatives
\[
x_0 \mapsto Df(x_0), \qquad x_0 \mapsto Dg(x_0)
\]
are continuous in the uniform operator topology on L(X, R) and L(X, Y),
respectively, and
(iii) x0 ∈ O is a regular point of the constraint function g.
If f has a local extremum under the constraint g(x0) = 0 at the regular point
x0 ∈ O, then there is a Lagrange multiplier y0∗ ∈ Y∗ such that the Lagrangian
\[
f(x) + \langle y_0^*, g(x) \rangle_{Y^* \times Y}
\]
is stationary at x0. That is, we have
\[
Df(x_0) + y_0^* \circ Dg(x_0) = 0.
\]
Proof. We first show that if x0 is a local extremum, then Df(x0) ◦ x = 0 for all x
such that Dg(x0) ◦ x = 0. Define the mapping
\[
F : X \to \mathbb{R} \times Y, \qquad F(x) = \bigl( f(x),\, g(x) \bigr).
\]

Suppose, to the contrary, that there is u ∈ X such that
\[
Dg(x_0) \circ u = 0
\]
but
\[
Df(x_0) \circ u = z \ne 0.
\]
If this were the case, then x0 would be a regular point of the mapping F. To see
why this is the case, we can compute
\[
DF(x_0) \circ x = \bigl( Df(x_0) \circ x,\; Dg(x_0) \circ x \bigr) \in \mathbb{R} \times Y.
\]
By assumption, Dg(x0) : X → Y is onto Y since x0 is a regular point of the
constraint g(x) = 0. Pick some arbitrary (α, y) ∈ R × Y. Since Dg(x0) is onto Y,
there is an x̄ ∈ X such that
\[
Dg(x_0) \circ \bar{x} = y.
\]
The derivatives Df(x0) ◦ u and Dg(x0) ◦ u are linear in the increment u by definition.
This fact, along with the definition of the real number β = Df(x0) ◦ x̄,
implies that
\[
Df(x_0) \circ \left( \frac{\alpha - \beta}{z}\, u \right) = \frac{\alpha - \beta}{z}\, Df(x_0) \circ u = \alpha - \beta = \alpha - Df(x_0) \circ \bar{x},
\]
\[
Dg(x_0) \circ \left( \frac{\alpha - \beta}{z}\, u \right) = \frac{\alpha - \beta}{z}\, Dg(x_0) \circ u = 0.
\]
If we choose x = ((α − β)/z) u + x̄, it is readily seen that
\[
\begin{aligned}
DF(x_0) \circ \left( \tfrac{\alpha - \beta}{z} u + \bar{x} \right)
&= \left( Df(x_0) \circ \left( \tfrac{\alpha - \beta}{z} u + \bar{x} \right),\; Dg(x_0) \circ \left( \tfrac{\alpha - \beta}{z} u + \bar{x} \right) \right) \\
&= \left( \tfrac{\alpha - \beta}{z}\, Df(x_0) \circ u + Df(x_0) \circ \bar{x},\; Dg(x_0) \circ \bar{x} \right) \\
&= \bigl( \alpha - Df(x_0) \circ \bar{x} + Df(x_0) \circ \bar{x},\; y \bigr) \\
&= (\alpha, y).
\end{aligned}
\]

The map DF(x0) is consequently onto R × Y, and x0 is a regular point of the
map F. Define
\[
F(x_0) = \bigl( f(x_0),\, g(x_0) \bigr) = (\alpha_0, 0) \in \mathbb{R} \times Y.
\]
By Ljusternik's theorem, there is a neighborhood of (α0, 0),
\[
N(\alpha_0, 0) \subseteq \mathbb{R} \times Y,
\]
such that the equation
\[
F(x) = (\alpha, y)
\]

has a solution for every (α, y) ∈ N(α0, 0), and the solution satisfies
\[
\|x - x_0\|_X \le C \bigl\{ |\alpha - \alpha_0| + \|y\|_Y \bigr\}.
\]
In particular, the element (α0 − ε, 0) is in the neighborhood N(α0, 0) for all ε small
enough. For every such ε > 0 there is a solution x_ε to the equation
\[
F(x_\epsilon) = (\alpha_0 - \epsilon, 0).
\]
But this means that
\[
f(x_\epsilon) = \alpha_0 - \epsilon = f(x_0) - \epsilon
\]
and
\[
g(x_\epsilon) = 0.
\]
Furthermore, we have that
\[
\|x_\epsilon - x_0\|_X \le C \epsilon.
\]
This contradicts the fact that x0 is a local extremum, and we conclude that
\[
Df(x_0) \circ x = 0
\]
for all x ∈ X such that
\[
Dg(x_0) \circ x = 0.
\]
Recall that
\[
\{ x \in X : Dg(x_0) \circ x = 0 \} = \ker \bigl( Dg(x_0) \bigr).
\]
In fact Df(x0) ∈ X∗ and Df(x0) ∈ (ker(Dg(x0)))⊥. Since the range of Dg(x0)
is closed, we have
\[
\operatorname{range} \bigl( ( Dg(x_0) )^* \bigr) = \bigl( \ker \bigl( Dg(x_0) \bigr) \bigr)^{\perp}.
\]
By definition,
\[
Dg(x_0) : X \to Y \qquad \text{and} \qquad \bigl( Dg(x_0) \bigr)^* : Y^* \to X^*.
\]
We conclude that there is a y0∗ ∈ Y∗ such that
\[
Df(x_0) = - \bigl( Dg(x_0) \bigr)^* \circ y_0^*,
\]
that is,
\[
Df(x_0) + \bigl( Dg(x_0) \bigr)^* \circ y_0^* = 0.
\]
By definition,
\[
\bigl\langle \bigl( Dg(x_0) \bigr)^* \circ y_0^*,\; x \bigr\rangle_{X^* \times X} = \bigl\langle y_0^*,\; Dg(x_0) \circ x \bigr\rangle_{Y^* \times Y},
\]
so that this equality can be written as
\[
Df(x_0) + y_0^* \circ Dg(x_0) = 0. \qquad \square
\]
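It is instructive to check the theorem against the familiar finite-dimensional case
(an illustrative example of our own, not from the text). Take X = R², Y = R, and
\[
f(x_1, x_2) = x_1^2 + x_2^2, \qquad g(x_1, x_2) = x_1 + x_2 - 1.
\]
Here Dg(x) = (1, 1) maps R² onto R, so every point is regular, and Y∗ ≅ R, so the
multiplier y0∗ is an ordinary scalar λ. Stationarity of the Lagrangian f + λg
together with the constraint requires
\[
(2x_1 + \lambda,\; 2x_2 + \lambda) = (0, 0), \qquad x_1 + x_2 = 1,
\]
which yields x0 = (1/2, 1/2) and λ = −1, exactly the answer delivered by the
classical method.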

The above theorem bears a close resemblance to the Lagrange multiplier the-
orem from undergraduate calculus discussed in [12], [18] in the introduction. The
essential ingredients of the above theorem include smoothness of the functionals
f and g and the regularity of the constraints. There is an alternative form of this
theorem that weakens the requirement that the constraints are in fact regular at
x0 . It will be useful in many applications.
Theorem 5.4.3. Let X and Y be Banach spaces, f : X → R, and g : X → Y.
Suppose that:
(i) f and g are Fréchet differentiable on an open set O ⊆ X,
(ii) the Fréchet derivatives
\[
x_0 \mapsto Df(x_0), \qquad x_0 \mapsto Dg(x_0)
\]
are continuous in the uniform operator topology on L(X, R) and L(X, Y),
respectively, and
(iii) the range of Dg(x0) is closed in Y.
If f has a local extremum under the constraint g(x0) = 0 at the point x0 ∈ O, then
there are multipliers λ0 ∈ R and y0∗ ∈ Y∗, not both zero, such that the Lagrangian
\[
\lambda_0 f(x) + \langle y_0^*, g(x) \rangle_{Y^* \times Y}
\]
is stationary at x0. That is,
\[
\lambda_0 Df(x_0) + y_0^* \circ Dg(x_0) = 0.
\]
Proof. The proof of this theorem can be carried out in two steps. First, suppose
that the range of Dg(x0) is all of Y. In this case, the constraint g is regular at
x0. We can apply the preceding theorem and select λ0 ≡ 1. If, on the other hand,
the range of Dg(x0) is strictly contained in Y, we know that there is some ỹ ∈ Y
such that
\[
d = \inf \bigl\{ \| \tilde{y} - y \| : y \in \operatorname{range} \bigl( Dg(x_0) \bigr) \bigr\} > 0.
\]
By Theorem 2.2.2 there is an element y0∗ ∈ (range(Dg(x0)))⊥ such that
\[
\langle y_0^*, \tilde{y} \rangle = d \ne 0,
\]
and in particular y0∗ ≠ 0. But for any linear operator A,
\[
\bigl( \operatorname{range}(A) \bigr)^{\perp} = \ker(A^*),
\]
so that
\[
y_0^* \in \bigl( \operatorname{range} \bigl( Dg(x_0) \bigr) \bigr)^{\perp} \equiv \ker \bigl( \bigl( Dg(x_0) \bigr)^* \bigr).
\]
By definition, since y0∗ ∈ Y∗,
\[
\langle y_0^*,\; Dg(x_0) \circ x \rangle_{Y^* \times Y} = \bigl\langle \bigl( Dg(x_0) \bigr)^* \circ y_0^*,\; x \bigr\rangle_{X^* \times X} = 0
\]
for all x ∈ X. We choose λ0 = 0 and conclude
\[
\lambda_0 Df(x_0) + y_0^* \circ Dg(x_0) = 0. \qquad \square
\]
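The degenerate case λ0 = 0 does occur. A standard one-dimensional example
(ours, not from the text): take X = Y = R, f(x) = x, and g(x) = x², so that
\[
C = \{ x : x^2 = 0 \} = \{ 0 \}
\]
and x0 = 0 is trivially a constrained minimizer. Here Df(0) = 1 while Dg(0) = 0,
and the range of Dg(0) is the closed subspace {0} ⊊ R, so x0 is not a regular point
and Theorem 5.4.2 does not apply. Theorem 5.4.3 holds with λ0 = 0 and, say,
y0∗ = 1, since 0 · Df(0) + 1 · Dg(0) = 0.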

5.5 Fréchet Differentiable Implicit Functionals


In all of our discussions so far in this chapter, optimization problems having quite
general forms have been considered. In this section, we discuss a class of
optimization problems that has a very specific structure. These problems often arise
in the study of optimal control. It has been noted by several authors that the
distinguishing feature of optimal control problems within the field of optimization
is their distinct structure. The standard optimization problem is such that we seek
u0 ∈ U such that
\[
J(u_0) = \inf_{u} \{ J(u) : u \in U \}. \tag{5.4}
\]
Instead of Equation (5.4), optimal control problems frequently arise where we seek
a pair (x0, u0) ∈ X × U such that
\[
\mathcal{J}(x_0, u_0) = \inf_{x,\, u} \{ \mathcal{J}(x, u) : A(x, u) = 0,\ u \in U \}. \tag{5.5}
\]

In these equations, x represents the dependent quantity or physical state of the
system under consideration, while u denotes the input or control. Optimization
problems having the form depicted in Equation (5.5) arise in control problems for
physical reasons. We typically seek to minimize some quantity such as fuel, cost,
departure motion, or vibration, subject to a collection of governing equations
that are inviolate. In Equation (5.5),
\[
A(x, u) = 0 \tag{5.6}
\]
denotes the equations of physics that relate the inputs u to the states x. It is
a fundamental premise of optimal control that a pair (x, u) ∈ X × U that does
not satisfy Equation (5.6) violates some physical law. The equations of evolution
that govern the physical variables of the problem are encoded by ordinary
differential equations, partial differential equations, or integral equations
represented by Equation (5.6). It is usually very difficult, either computationally
or analytically, to solve for the state x as a function of the control u in
Equation (5.6). If we can find x(u) that satisfies Equation (5.6),
\[
A(x(u), u) = 0,
\]
it is clear that we can reduce Equation (5.5) to the form in Equation (5.4):

\[
J(u) = \mathcal{J}(x(u), u), \qquad J(u_0) = \mathcal{J}(x(u_0), u_0) = \inf_{u} \{ J(u) = \mathcal{J}(x(u), u) : u \in U \}.
\]
In some cases, it will be possible to solve for x(u). More frequently, it will not.
We will need methods for calculating the Gateaux derivative of J(u) without
calculating x(u) explicitly. This task is accomplished by using the co-state, adjoint,
or optimality system equations.
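For instance (an illustration of ours, under an invertibility assumption the text
does not make), if the governing equation is linear,
\[
A(x, u) = Lx - Bu = 0
\]
with L : X → Y boundedly invertible, then x(u) = L^{-1}Bu and the reduced
functional is J(u) = 𝒥(L^{-1}Bu, u). The adjoint method developed below recovers
the derivative of J without ever forming L^{-1}.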
The following theorem provides the theoretical foundation of adjoint, co-state
or optimality system methods for Fréchet differentiable, implicit functionals.

Theorem 5.5.1. Let X, Y, Z, and U be normed vector spaces, 𝒥 : X × U → Z, and
suppose A : X × U → Y defines a unique function x(u) via the solution of
\[
A(x(u), u) = 0.
\]
Suppose further that:
• A(·, u) : X → Y is Fréchet differentiable at x = x(u),
• 𝒥(·, u) : X → Z is Fréchet differentiable at x = x(u),
• A(x, ·) : U → Y is Gateaux differentiable,
• 𝒥(x, ·) : U → Z is Gateaux differentiable,
• the Gateaux differential DuA(x, u) is continuous on X × U,
• the Gateaux differential Du𝒥(x, u) is continuous on X × U, and
• x(u) is Lipschitz continuous,
\[
\|x(u) - x(v)\|_X \le C \|u - v\|_U
\]
for some C ∈ R.
If there is a solution λ ∈ L(Y, Z) to the equation
\[
\lambda \circ D_x A(x, u) = D_x \mathcal{J}(x, u)
\]
at x = x(u), then
\[
J(u) = \mathcal{J}(x(u), u)
\]
is Gateaux differentiable at u and
\[
DJ(u) = D_u \mathcal{J}(x, u) - \lambda \circ D_u A(x, u) \in L(U, Z).
\]

Proof. Suppose 0 ≤ ε ≤ 1. For u, ũ ∈ U, define
\[
u_\epsilon = u + \epsilon (\tilde{u} - u) \in U, \qquad x_\epsilon = x(u_\epsilon). \tag{5.7}
\]
Recall that we want to find an expression for DJ(u):
\[
DJ(u) \circ v = \lim_{\epsilon \to 0} \frac{J(u + \epsilon v) - J(u)}{\epsilon}. \tag{5.8}
\]
We have
\[
\begin{aligned}
\frac{J(u_\epsilon) - J(u)}{\epsilon}
&= \frac{\mathcal{J}(x(u_\epsilon), u_\epsilon) - \mathcal{J}(x(u), u)}{\epsilon} \\
&= \frac{\mathcal{J}(x(u_\epsilon), u_\epsilon) - \mathcal{J}(x(u_\epsilon), u)}{\epsilon} + \frac{\mathcal{J}(x(u_\epsilon), u) - \mathcal{J}(x(u), u)}{\epsilon}.
\end{aligned} \tag{5.9}
\]
By the Fréchet differentiability of 𝒥(·, u),
\[
\mathcal{J}(x(u_\epsilon), u) = \mathcal{J}(x(u), u) + D_x \mathcal{J}(x(u), u) \circ (x(u_\epsilon) - x(u)) + R_1 \bigl( \|x(u_\epsilon) - x(u)\|_X \bigr),
\]

where the remainder R1(·) satisfies
\[
\frac{R_1 \bigl( \|x(u_\epsilon) - x(u)\|_X \bigr)}{\|x(u_\epsilon) - x(u)\|_X} \to 0
\]
as
\[
\|x(u_\epsilon) - x(u)\|_X \to 0.
\]
By the Lipschitz continuity of x(u), we note that
\[
\|x(u_\epsilon) - x(u)\|_X \le C \|\tilde{u} - u\|_U \cdot \epsilon,
\]
so that
\[
\frac{R_1 \bigl( \|x(u_\epsilon) - x(u)\|_X \bigr)}{C \|\tilde{u} - u\|_U \cdot \epsilon} \le \frac{R_1 \bigl( \|x(u_\epsilon) - x(u)\|_X \bigr)}{\|x(u_\epsilon) - x(u)\|_X}.
\]
Consequently, we write
\[
\mathcal{J}(x(u_\epsilon), u) = \mathcal{J}(x(u), u) + D_x \mathcal{J}(x(u), u) \circ (x(u_\epsilon) - x(u)) + R(\epsilon), \tag{5.10}
\]
where R(ε) is a remainder term such that
\[
\lim_{\epsilon \to 0} \frac{R(\epsilon)}{\epsilon} = 0.
\]
In the various derivative expressions that follow, we will use R(ε) to denote
generically any remainder terms that have the above asymptotic behavior as a function
of ε. In addition, by the Gateaux differentiability of 𝒥(x, ·), we have
\[
\begin{aligned}
\mathcal{J}(x(u_\epsilon), u_\epsilon)
&= \mathcal{J}(x(u_\epsilon), u) + D_u \mathcal{J}(x(u_\epsilon), u) \circ (u_\epsilon - u) + R(\epsilon) \\
&= \mathcal{J}(x(u_\epsilon), u) + \epsilon\, D_u \mathcal{J}(x(u_\epsilon), u) \circ (\tilde{u} - u) + R(\epsilon).
\end{aligned} \tag{5.11}
\]

Substituting Equations (5.10) and (5.11) into (5.9) yields
\[
\frac{J(u_\epsilon) - J(u)}{\epsilon} = D_u \mathcal{J}(x(u_\epsilon), u) \circ (\tilde{u} - u) + \frac{D_x \mathcal{J}(x(u), u) \circ (x(u_\epsilon) - x(u))}{\epsilon} + \frac{R(\epsilon)}{\epsilon}. \tag{5.12}
\]
Since the pairs (x(u_ε), u_ε) and (x(u), u) are solutions of A(·, ·) = 0, it is always
true that
\[
\lambda \circ \bigl( A(x(u_\epsilon), u_\epsilon) - A(x(u), u) \bigr) = 0 \in Z.
\]
We can write
\[
\lambda \circ \left( \frac{A(x(u_\epsilon), u_\epsilon) - A(x(u_\epsilon), u)}{\epsilon} + \frac{A(x(u_\epsilon), u) - A(x(u), u)}{\epsilon} \right) = 0. \tag{5.13}
\]

Since A is Gateaux differentiable in its second argument,
\[
\begin{aligned}
A(x(u_\epsilon), u_\epsilon)
&= A(x(u_\epsilon), u) + D_u A(x(u_\epsilon), u) \circ (u_\epsilon - u) + R(\epsilon) \\
&= A(x(u_\epsilon), u) + \epsilon\, D_u A(x(u_\epsilon), u) \circ (\tilde{u} - u) + R(\epsilon),
\end{aligned} \tag{5.14}
\]
and Fréchet differentiable in its first argument,
\[
\begin{aligned}
A(x(u_\epsilon), u)
&= A(x(u), u) + D_x A(x(u), u) \circ (x(u_\epsilon) - x(u)) + R_2 \bigl( \|x(u_\epsilon) - x(u)\|_X \bigr) \\
&= A(x(u), u) + D_x A(x(u), u) \circ (x(u_\epsilon) - x(u)) + R(\epsilon).
\end{aligned} \tag{5.15}
\]

When we substitute Equations (5.15) and (5.14) into (5.13), we obtain
\[
\lambda \circ \left( D_u A(x(u_\epsilon), u) \circ (\tilde{u} - u) + \frac{D_x A(x(u), u) \circ (x(u_\epsilon) - x(u))}{\epsilon} + \frac{R(\epsilon)}{\epsilon} \right) = 0. \tag{5.16}
\]
By hypothesis, we have
\[
\lambda \circ D_x A(x(u), u) \circ (x(u_\epsilon) - x(u)) = D_x \mathcal{J}(x(u), u) \circ (x(u_\epsilon) - x(u)),
\]
which, from Equation (5.16), implies that
\[
\frac{D_x \mathcal{J}(x(u), u) \circ (x(u_\epsilon) - x(u))}{\epsilon} = -\lambda \circ D_u A(x(u_\epsilon), u) \circ (\tilde{u} - u) + \frac{R(\epsilon)}{\epsilon}.
\]

When we substitute this expression into Equation (5.12), we obtain
\[
\frac{J(u_\epsilon) - J(u)}{\epsilon} = D_u \mathcal{J}(x(u_\epsilon), u) \circ (\tilde{u} - u) - \lambda \circ D_u A(x(u_\epsilon), u) \circ (\tilde{u} - u) + \frac{R(\epsilon)}{\epsilon}.
\]
Recalling that u_ε = u + ε(ũ − u), we can take the limit as ε → 0 to obtain
\[
DJ(u) \circ (\tilde{u} - u) = D_u \mathcal{J}(x(u), u) \circ (\tilde{u} - u) - \lambda \circ D_u A(x(u), u) \circ (\tilde{u} - u).
\]
In this last limit, we have used the continuity of Du𝒥(x, u) and DuA(x, u) on
X × U. □
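To see the adjoint recipe of Theorem 5.5.1 in action, the following is a minimal
numerical sketch (our own construction, not from the text; all matrix names and
dimensions are illustrative assumptions). We take the finite-dimensional
linear-quadratic case A(x, u) = Kx − Bu with K invertible and
𝒥(x, u) = ½x^T Qx + ½u^T Ru, so that DxA = K, DuA = −B, Dx𝒥 = x^T Q, and
Du𝒥 = u^T R. The co-state equation λ ◦ DxA = Dx𝒥 becomes K^T λ = Qx, and the
theorem then gives the gradient ∇J(u) = Ru + B^T λ, which we verify against a
finite-difference approximation of the reduced cost.

import numpy as np

rng = np.random.default_rng(0)
n, m = 5, 3                                          # state and control dimensions (arbitrary)
K = np.eye(n) + 0.1 * rng.standard_normal((n, n))    # state operator; near identity, so invertible
B = rng.standard_normal((n, m))                      # control-to-state map
Q = np.eye(n)                                        # state cost weight
R = 0.5 * np.eye(m)                                  # control cost weight

def solve_state(u):
    """Solve the constraint A(x, u) = K x - B u = 0 for x(u)."""
    return np.linalg.solve(K, B @ u)

def J(u):
    """Reduced cost J(u) = 1/2 x(u)' Q x(u) + 1/2 u' R u."""
    x = solve_state(u)
    return 0.5 * x @ Q @ x + 0.5 * u @ R @ u

def grad_J_adjoint(u):
    """Gradient via the co-state: solve K' lam = Q x, then DJ(u) = R u + B' lam."""
    x = solve_state(u)
    lam = np.linalg.solve(K.T, Q @ x)   # co-state (adjoint) equation
    return R @ u + B.T @ lam            # DJ = D_u J - lam o D_u A, with D_u A = -B

# Verify against central finite differences of the reduced cost.
u = rng.standard_normal(m)
g = grad_J_adjoint(u)
eps = 1e-6
g_fd = np.array([(J(u + eps * e) - J(u - eps * e)) / (2 * eps) for e in np.eye(m)])
print(np.allclose(g, g_fd, atol=1e-6))   # expected: True

Note that the adjoint computation costs a single linear solve with K^T regardless of
the number of control variables, which is exactly why the co-state formulation is
preferred when u is high dimensional and x(u) cannot be formed explicitly.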
