
Mathematics 195

MATHEMATICAL METHODS FOR OPTIMIZATION
Dynamic Optimization

Spring 2021 version

Lawrence C. Evans
Department of Mathematics
University of California, Berkeley
Contents

PREFACE v

INTRODUCTION 1

Chapter 1. FIRST VARIATION 3


§1.1. The calculus of variations 3
§1.2. Computing the first variation 5
1.2.1. Euler-Lagrange equation 5
1.2.2. Alternative notation 9
1.2.3. Derivation 10
§1.3. Extensions and generalizations 13
1.3.1. Conservation laws 13
1.3.2. Transversality conditions 19
1.3.3. Integral constraints 22
1.3.4. Systems 27
1.3.5. Routh’s method 31
§1.4. Applications 32
1.4.1. Brachistochrone 32
1.4.2. Terrestrial brachistochrone 35
1.4.3. Lagrangian and Hamiltonian dynamics 38
1.4.4. Geodesics 41
1.4.5. Maxwell’s fisheye 47

Chapter 2. SECOND VARIATION 51


§2.1. Computing the second variation 51
2.1.1. Integral and pointwise versions 51


2.1.2. Weierstrass condition 53


§2.2. Positive second variation 55
2.2.1. Riccati equation 55
2.2.2. Conjugate points 56
§2.3. Strong local minimizers 58
2.3.1. Fields of extremals 59
2.3.2. More on conjugate points 62
§2.4. Existence of minimizers 63
2.4.1. Examples 63
2.4.2. Convexity and minimizers 65

Chapter 3. MULTIVARIABLE VARIATIONAL PROBLEMS 67


§3.1. Multivariable calculus of variations 67
§3.2. First variation 69
3.2.1. Euler-Lagrange equation 69
3.2.2. Derivation of Euler-Lagrange PDE 72
§3.3. Second variation 73
§3.4. Extensions and generalizations 75
3.4.1. Other boundary conditions 75
3.4.2. Integrating factors 77
3.4.3. Integral constraints 78
3.4.4. Systems 79
§3.5. Applications 80
3.5.1. Eigenvalues, eigenfunctions 80
3.5.2. Minimal surfaces 83
3.5.3. Harmonic maps 84
3.5.4. Gradient flows 86

Chapter 4. OPTIMAL CONTROL THEORY 89


§4.1. The basic problem 89
§4.2. Time optimal linear control 93
4.2.1. Linear systems of ODE 94
4.2.2. Reachable sets and convexity 94
§4.3. Pontryagin Maximum Principle 99
4.3.1. Fixed time, free endpoint problems 99
4.3.2. Other terminal conditions 101
§4.4. Applications 106
4.4.1. Simple linear-quadratic regulator 107
4.4.2. Production and consumption 108

4.4.3. Ramsey consumption model 110


4.4.4. Zermelo’s navigation problem 111
4.4.5. Chaplygin’s navigation problem 113
4.4.6. Optimal harvesting 115
§4.5. Proof of PMP 119
4.5.1. Simple control variations 119
4.5.2. Fixed time problem 121
4.5.3. Multiple control variations 125
4.5.4. Fixed endpoint problem 126

Chapter 5. DYNAMIC PROGRAMMING 131


§5.1. Hamilton-Jacobi-Bellman equation 131
5.1.1. Derivation 132
5.1.2. Optimality 135
§5.2. Applications 136
5.2.1. General linear-quadratic regulator 137
5.2.2. Rocket railway car 139
5.2.3. Fuller’s problem, chattering controls 141

APPENDIX 147
A. Notation 147
B. Linear algebra 147
C. Multivariable chain rule 148
D. Divergence Theorem 150
E. Implicit Function Theorem 150
F. Solving a nonlinear equation 151

EXERCISES 153

Bibliography 165
PREFACE

Last fall I taught a revised version of Math 170, primarily on finite di-
mensional optimization. This new spring class Math 195 discusses dynamic
optimization, mostly the calculus of variations and optimal control theory.
(However, Math 170 is not a prerequisite for Math 195, since we will be
developing quite different mathematical tools.)
We continue to be grateful to Kurt and Evelyn Riedel for their very gen-
erous contribution to the Berkeley Math Department, in financial support
of the redesign and expansion of our undergraduate classes in optimization
theory.
The texts Dynamic Optimization by Kamien and Schwartz [K-S] and
Introduction to Optimal Control Theory by Macki–Strauss [M-S] are good
overall references for this class, and I also strongly recommend Levi, Classical
Mechanics with Calculus of Variations and Optimal Control [L]. Part of the
content in Chapters 4 and 5 is reworked from my old online lecture notes
[E].
I have used Inkscape and SageMath for the illustrations. Thanks to
David Hoffman for the beautiful pictures of minimal surfaces. I am again
very thankful to have had Haotian Gu as my course assistant this term.

INTRODUCTION

Mathematical optimization theory comprises three major subareas:


A. Discrete optimization
B. Finite dimensional optimization
C. Infinite dimensional optimization.
This class covers several topics from infinite dimensional optimization the-
ory, mainly the rigorous mathematical theories for the calculus of variations
and optimal control theory. For these problems the unknowns are functions,
and our main mathematical tools will be calculus and differential equations
techniques. In most of our examples the unknowns will be functions of time,
whence the name dynamic optimization.

The big math ideas for this class are

(i) First variation, Euler-Lagrange equations


(ii) Hamiltonian dynamics
(iii) Second variation
(iv) Pontryagin maximum principle
(v) Dynamic programming

While reading these notes students should carefully distinguish between


the core mathematical theories and their applications. It is essential to un-
derstand how to write down for various problems the correct Euler-Lagrange
equations or the correct form of the Pontryagin maximum principle. But
these in turn may lead to problem-specific difficulties that can be quite hard.


I have written up in detail a lot of tricky mathematics needed for particular problems; students should read the calculations but should not let these particular issues deflect from their understanding of the larger mathematical framework.
Chapter 1

FIRST VARIATION

1.1. The calculus of variations


We introduce a class of optimization problems for which the unknown is a
function.
DEFINITION. Assume a < b and the points y^0, y^1 ∈ R are given. The corresponding set of admissible functions is

A = {y : [a, b] → R | y(·) is continuous and piecewise continuously differentiable, y(a) = y^0, y(b) = y^1}.

So the graphs of functions y(·) ∈ A connect the given endpoints A = (a, y^0) and B = (b, y^1).

[Figure: Graph of an admissible function]

NOTATION. We will often write “y(·)” when we wish to emphasize that


y : [a, b] → R is a function. 


DEFINITION. The Lagrangian is a given continuous function


L : [a, b] × R × R → R,
written
L = L(x, y, z).
DEFINITION. If y(·) ∈ A and L is a Lagrangian, we define the corresponding integral functional

(1.1)  I[y(·)] = ∫_a^b L(x, y(x), y'(x)) dx,

where y' = dy/dx.

Note that we insert y(x) into the y-variable slot of L(x, y, z), and y'(x) into the z-variable slot of L(x, y, z).
INTERPRETATION. We can informally think of the number I[y(·)] as
being some sort of “energy” associated with the function y(·) (but there can
be many other interesting interpretations). 
REMARK. To avoid various technical issues, we will usually suppress men-
tion of the precise degree of smoothness assumed for various functions that
we discuss. In particular, whenever we write down a derivative (or partial
derivative) of some function at some point, the reader should suppose that
the function is indeed differentiable there. 

The basic problem in the calculus of variations is to study functions y_0(·) ∈ A that satisfy

(COV)  I[y_0(·)] = min_{y(·)∈A} I[y(·)].

Does such a minimizer y_0(·) exist? What are its properties?

Different choices of the Lagrangian L give us different sorts of problems:


EXAMPLE (Shortest path between two points). Consider first the case that

L(x, y, z) = (1 + z^2)^{1/2}.

Then

I[y(·)] = ∫_a^b (1 + (y')^2)^{1/2} dx = length of the graph of y(·).

So a minimizer y_0 ∈ A will give the shortest path connecting the points A = (a, y^0) and B = (b, y^1), at least among curves that can be written as graphs of functions. The graph of the minimizer y_0(·) is obviously a straight line, but it will be interesting to see later what our general theory says even for this simple problem. 
EXAMPLE (Minimal surfaces of revolution). As a second example, take

L(x, y, z) = 2πy(1 + z^2)^{1/2}.

Then

I[y(·)] = 2π ∫_a^b y (1 + (y')^2)^{1/2} dx = area of the surface of revolution of the graph.

[Figure: A surface of revolution]

What curve y_0(·) gives the surface of revolution of least surface area? This is more difficult than the previous example, and we will only later have the tools to handle this. 

1.2. Computing the first variation


1.2.1. Euler-Lagrange equation.

The most important insight of the calculus of variations is the next the-
orem. It says that a minimizer y0 (·) ∈ A automatically solves a certain
ordinary differential equation (ODE). This equation appears when we com-
pute an appropriate first variation for our minimization problem (COV).
THEOREM 1.2.1. Assume y_0(·) ∈ A solves (COV) and y_0(·) is twice continuously differentiable. Then y_0 solves the nonlinear ODE

(1.2)  −d/dx ( ∂L/∂z (x, y_0(x), y_0'(x)) ) + ∂L/∂y (x, y_0(x), y_0'(x)) = 0

for a ≤ x ≤ b.
DEFINITIONS. (i) We call

(E-L)  −d/dx ( ∂L/∂z (x, y, y') ) + ∂L/∂y (x, y, y') = 0

the Euler-Lagrange equation corresponding to the Lagrangian L. This is a second-order, and usually nonlinear, ODE for the function y = y(·).

(ii) Solutions y(·) of the Euler-Lagrange equation are called extremals (or critical points or stationary points) of I[ · ].

(iii) Problems in mathematics or the sciences that lead to equations of the form (E-L) are called variational. 

REMARKS.
(i) Theorem 1.2.1 says that any minimizer y0 solving (COV) satisfies the
Euler-Lagrange differential equation and thus is an extremal. But a given
extremal need not be a minimizer.
(ii) Remember that ' = d/dx. So it is also correct to write (E-L) as

−( ∂L/∂z (x, y, y') )' + ∂L/∂y (x, y, y') = 0.

(iii) We could apply the chain rule to expand out the first term in (E-L), but it is almost always best not to do so. 

HOW TO WRITE DOWN THE EULER-LAGRANGE EQUATION FOR A SPECIFIC PROBLEM:

Step 1. Given L = L(x, y, z), compute

∂L/∂y (x, y, z) and ∂L/∂z (x, y, z).

Step 2. Plug in y(x) for the variable y and y'(x) for z, to obtain

∂L/∂y (x, y(x), y'(x)) and ∂L/∂z (x, y(x), y'(x)).

Step 3. Now write (E-L):

−d/dx ( ∂L/∂z (x, y(x), y'(x)) ) + ∂L/∂y (x, y(x), y'(x)) = 0.
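These three steps are mechanical enough to automate. The following is a minimal sketch using the sympy computer algebra library (an added illustration, not part of the theory), carrying out Steps 1–3 for the surface-of-revolution Lagrangian L(x, y, z) = 2πy(1 + z^2)^{1/2} from page 5:

    import sympy as sp

    x, yv, z = sp.symbols('x y z')     # the three real variables of L(x, y, z)
    y = sp.Function('y')               # the unknown function y(.)

    # Example: the surface-of-revolution Lagrangian L = 2*pi*y*(1 + z^2)^(1/2)
    L = 2 * sp.pi * yv * sp.sqrt(1 + z**2)

    # Step 1: compute dL/dy and dL/dz as functions of (x, y, z)
    L_y = sp.diff(L, yv)
    L_z = sp.diff(L, z)

    # Step 2: plug in y(x) for the variable y and y'(x) for z
    on_curve = {yv: y(x), z: y(x).diff(x)}

    # Step 3: write down (E-L): -(d/dx) dL/dz + dL/dy = 0
    EL = -sp.diff(L_z.subs(on_curve), x) + L_y.subs(on_curve)
    print(sp.simplify(EL))             # the left hand side of (E-L) for this L

After simplification the output agrees (up to the harmless factor 2π) with the equation worked out by hand for this Lagrangian on page 8.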

WARNING ABOUT NOTATION. Most books write (E-L) as

−d/dx ( ∂L/∂y' (x, y, y') ) + ∂L/∂y (x, y, y') = 0.

This is very common, but bad, notation. Note carefully: L = L(x, y, z) is a function of the three real variables x, y, z; it has nothing to do with the derivative y' of some other function y(·). So “∂L/∂y'” has no meaning. 

The Euler-Lagrange equation is extremely important, since it provides us with a procedure for finding candidates for minimizers y_0(·) of (COV). We do so by trying to solve the (E-L) differential equation.

EXAMPLE. In the first example from page 4, the function y_0 minimizes I[y] = ∫_a^b (1 + (y')^2)^{1/2} dx (the length of the graph of y) among functions y ∈ A. The Euler-Lagrange equation is an ODE which provides useful information about y_0. For this example we have

L = (1 + z^2)^{1/2},

and therefore

∂L/∂y = 0,  ∂L/∂z = z / (1 + z^2)^{1/2}.

We insert y' for z and then write down (E-L):

0 = −( y' / (1 + (y')^2)^{1/2} )'
  = −y'' (1 + (y')^2)^{−1/2} − y' (−1/2) (1 + (y')^2)^{−3/2} 2 y' y''
  = −y'' / (1 + (y')^2)^{3/2}.

Consequently a minimizer y_0 solves y_0'' / (1 + (y_0')^2)^{3/2} = 0, and this implies

y_0'' = 0  (a ≤ x ≤ b).

Hence the graph of y_0 is indeed a straight line connecting A and B.

GEOMETRIC INTERPRETATION. This conclusion is of course obvious, but our method suggests something interesting, namely that the expression

( y' / (1 + (y')^2)^{1/2} )' = y'' / (1 + (y')^2)^{3/2}

may have a geometric meaning. It does. For any twice differentiable curve y(·),

(1.3)  κ = y'' / (1 + (y')^2)^{3/2}

is the curvature of the graph of y(·) at the point (x, y(x)). The calculus of variations has automatically produced this important expression for the geometry of planar curves. And what (E-L) really says is that the graph of our minimizer y_0 has constant curvature κ = 0. 

EXAMPLE. Compute the Euler-Lagrange equation satisfied by minimizers of

I[y(·)] = ∫_a^b (y')^2/2 − f y dx,

where f : [a, b] → R is given. In this case

L(x, y, z) = z^2/2 − f(x) y,  ∂L/∂y = −f(x),  ∂L/∂z = z.

So (E-L) is the simple linear ODE

−y'' = f.

EXAMPLE. In the second example on page 5, we have

L(x, y, z) = 2πy(1 + z^2)^{1/2}.

Then

∂L/∂y = 2π(1 + z^2)^{1/2},  ∂L/∂z = 2πyz / (1 + z^2)^{1/2}.

Consequently (E-L) reads

0 = −( y y' / (1 + (y')^2)^{1/2} )' + (1 + (y')^2)^{1/2}.

Which functions y solve this nonlinear ODE? We do not yet have the tools to answer this and so must return to this example later. 
EXAMPLE. Lagrangians of the form

(1.4)  L = a(y) z

are called null Lagrangians, meaning that every function y : [a, b] → R automatically solves the associated Euler-Lagrange equation. Indeed,

−d/dx ( ∂L/∂z (y, y') ) + ∂L/∂y (y, y') = −d/dx ( a(y) ) + a'(y) y' = 0

for all functions y. We will learn later that null Lagrangians, especially for more complicated variational problems, can provide useful information. 

1.2.2. Alternative notation.

In the examples above the variable x denotes a spatial position, but for many other applications the independent variable represents time t. In these situations it is appropriate to use different notation. For such problems we consider Lagrangians

L : [0, T] × R × R → R

that depend upon the variables t denoting time, x denoting position, and v denoting velocity. So we will write

L = L(t, x, v).

The letter T gives a terminal time. We also redefine the admissible class to be

A = {x : [0, T] → R | x(·) is continuous and piecewise continuously differentiable, x(0) = x^0, x(T) = x^1}

for given points x^0, x^1 ∈ R, and put

(1.5)  I[x(·)] = ∫_0^T L(t, x(t), ẋ(t)) dt.

Observe that when the independent variable is t, we usually write ˙ = d/dt.

Employing this new notation, we check that the Euler-Lagrange equation for extremals x(·) now reads

(E-L)  −d/dt ( ∂L/∂v (t, x, ẋ) ) + ∂L/∂x (t, x, ẋ) = 0

for 0 ≤ t ≤ T. There is no new mathematics here; we are simply changing notation by renaming the variables.
10 1. FIRST VARIATION

EXAMPLE. Consider the following simple model for the motion of a particle along the real line, moving under the influence of a potential energy. In this interpretation m denotes the mass, x(t) is the position of the particle at time t, and ẋ(t) is its velocity. In addition,

(m/2)|ẋ(t)|^2 = kinetic energy at time t,
W(x(t)) = potential energy at time t,

where W : R → R is given. The action of a path x : [0, T] → R is the time integral of the difference between the kinetic and potential energies:

I[x(·)] = ∫_0^T (m/2)|ẋ|^2 − W(x(t)) dt.

What is the corresponding Euler-Lagrange equation? We have

L = mv^2/2 − W(x),  ∂L/∂x = −W'(x),  ∂L/∂v = mv,

where ' = d/dx. So (E-L) is

−d/dt (mẋ) − W'(x) = 0,

which is Newton's law of motion:

mẍ = −W'(x).

In other words, ma = f for the acceleration a = ẍ and force f = −W'. The calculus of variations provides a systematic derivation for this fundamental law of physics. 

1.2.3. Derivation.
In this section we prove that minimizers satisfy the Euler-Lagrange equa-
tion.

LEMMA 1.2.1. (i) If f, g : [a, b] → R are continuously differentiable, we have the integration by parts formula

(1.6)  ∫_a^b f' g dx = −∫_a^b f g' dx + f(b)g(b) − f(a)g(a).

(ii) Assume f : [a, b] → R is continuous and

(1.7)  ∫_a^b f w dx = 0

for all continuously differentiable functions w : [a, b] → R such that w(a) = w(b) = 0. Then

f(x) = 0 for all a ≤ x ≤ b.

Proof. 1. Integrate (f g)' = f' g + f g' from a to b.

2. A standard approximation argument shows that if (1.7) holds for all continuously differentiable functions w, it is valid also for all merely continuous functions w. Let φ : [a, b] → R be positive for a < x < b and zero at the endpoints a, b. Put w(x) = φ(x)f(x) above, to find

∫_a^b φ f^2 dx = 0.

Hence φ(x)f^2(x) = 0 for all x ∈ [a, b], since the integrand is continuous and nonnegative. Then since φ(x) > 0 for all x ∈ (a, b), we see that f(x) = 0 if x ∈ (a, b); and f(a) = f(b) = 0 then follows by continuity. 

Derivation of Euler-Lagrange equation:

[Figure: Computing the first variation]

1. Let w : [a, b] → R be continuously differentiable, with w(a) = w(b) = 0. Assume −1 ≤ τ ≤ 1 and define

y_τ(x) = y_0(x) + τ w(x)  (a ≤ x ≤ b).

Note y_τ(·) ∈ A, since y_τ(a) = y^0, y_τ(b) = y^1. Thus

I[y_0(·)] ≤ I[y_τ(·)]

since y_0(·) is the minimizer of I[ · ]. Define

i(τ) = I[y_τ(·)].

Then i(0) ≤ i(τ). So i(·) has a minimum at τ = 0 on the interval −1 ≤ τ ≤ 1, and therefore

di/dτ (0) = 0.

Our task now is to see what information we can extract from this simple formula.
2. We have

i(τ) = I[y_τ(·)] = ∫_a^b L(x, y_τ(x), (y_τ)'(x)) dx = ∫_a^b L(x, y_0(x) + τ w(x), y_0'(x) + τ w'(x)) dx.

Therefore

di/dτ (τ) = ∫_a^b ∂/∂τ L(x, y_0 + τ w, y_0' + τ w') dx
         = ∫_a^b ∂L/∂y (x, y_0 + τ w, y_0' + τ w') w + ∂L/∂z (x, y_0 + τ w, y_0' + τ w') w' dx,

where we used the chain rule. Next, set τ = 0, to learn that

0 = di/dτ (0) = ∫_a^b ∂L/∂y (x, y_0, y_0') w + ∂L/∂z (x, y_0, y_0') w' dx.

We now integrate by parts, to deduce

∫_a^b [ ∂L/∂y (x, y_0, y_0') − d/dx ( ∂L/∂z (x, y_0, y_0') ) ] w dx = 0.

This is valid for all functions w such that w(a) = w(b) = 0. According then to the Lemma above, it follows that

∂L/∂y (x, y_0, y_0') − d/dx ( ∂L/∂z (x, y_0, y_0') ) = 0

for all a ≤ x ≤ b. This is (E-L). 

REMARK. The procedure in this proof is called computing the first vari-
ation. 
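To see the logic of the proof numerically, here is a small sketch (an added illustration, with an assumed model problem) for I[y] = ∫_0^1 (y')^2/2 dx, y(0) = 0, y(1) = 1, whose minimizer is y_0(x) = x; perturbing along w(x) = sin(πx) shows that i(τ) = I[y_0 + τw] is minimized at τ = 0:

    import numpy as np

    # Model problem: I[y] = integral of (y')^2/2 over [0,1], y(0)=0, y(1)=1,
    # minimized by y0(x) = x. Perturb along w(x) = sin(pi*x), which vanishes
    # at both endpoints, and examine i(tau) = I[y0 + tau*w].
    x = np.linspace(0.0, 1.0, 10001)

    def i_of_tau(tau):
        y_prime = 1.0 + tau * np.pi * np.cos(np.pi * x)   # (y0 + tau*w)'
        return np.trapz(y_prime**2 / 2.0, x)

    for tau in (-0.2, -0.1, 0.0, 0.1, 0.2):
        print(f"tau = {tau:+.1f}   i(tau) = {i_of_tau(tau):.6f}")
    # The values are symmetric about tau = 0 and smallest there, so di/dtau(0) = 0.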

1.3. Extensions and generalizations


In this section we discuss various extensions of the basic theory, focussing
in particular upon how to use the Euler-Lagrange equation (E-L) to extract
useful information. There are several general approaches for this:
(a) deriving exact formulas for extremals;
(b) calculating perturbative corrections from known solutions;
(c) applying rigorous ODE theory;
(d) introducing numerical methods.
In these notes we will stress (a) (although there is certainly no magic way
to exactly solve all (E-L) equations) and (c).

1.3.1. Conservation laws.


We introduce first some methods for actually finding solutions of Euler-
Lagrange equations in the various special cases. The idea is to try to reduce
(E-L) to a (much simpler) first-order equation.

• SPECIAL CASE 1: L = L(x, z) does not depend on y.

THEOREM 1.3.1. If L does not depend on y and the function y(·) solves (E-L), then

(1.8)  ∂L/∂z (x, y') is constant for a ≤ x ≤ b.

Proof. Since ∂L/∂y = 0, (E-L) says

−( ∂L/∂z (x, y') )' = 0;

and so ∂L/∂z (x, y') is a constant. 

REMARK. Why is this result useful? The point is that when

∂L/∂z (x, y') = C

for some constant C, we can perhaps rewrite this to solve for y':

y' = f(x, C).

Then

y(x) = ∫_0^x f(t, C) dt + D

for constants C, D is a formula for the general solution of (E-L). We can next try to select C, D so that the boundary conditions y(a) = y^0, y(b) = y^1 hold; in which case y(·) ∈ A. 

EXAMPLE. (a) Write down and then solve (E-L) for

I[y(·)] = ∫_a^b x^3 (y')^2 dx.

We have

L(x, y, z) = x^3 z^2,  ∂L/∂y = 0,  ∂L/∂z = 2x^3 z;

therefore

∂L/∂z (x, y'(x)) = 2x^3 y'(x) = C.

Hence

y'(x) = C / (2x^3),

and so

y(x) = E/x^2 + F

for constants E, F.

(b) Find a minimizer of I[ · ] from the admissible class

A = {y : [1, 2] → R | y(1) = 3, y(2) = 4}.

We need to select the constants E, F above so that

3 = y(1) = E + F,  4 = y(2) = E/4 + F.

Solving, we find that E = −4/3, F = 13/3, and thus

y_0(x) = −4/(3x^2) + 13/3.

Therefore if (COV) has a solution, it must be this. 

• SPECIAL CASE 2: L = L(y, z) does not depend on x.

THEOREM 1.3.2. If L does not depend on x and the function y(·) solves (E-L), then

(1.9)  y' ∂L/∂z (y, y') − L(y, y') is constant for a ≤ x ≤ b.

REMARK. Conversely, we will see from the proof that if y' ∂L/∂z (y, y') − L(y, y') is constant, then y(·) solves the Euler-Lagrange equation on any subinterval where y' ≠ 0. 
Proof.

( L(y, y') − y' ∂L/∂z (y, y') )' = ∂L/∂y y' + ∂L/∂z y'' − y'' ∂L/∂z − y' ( ∂L/∂z )'
  = y' [ −( ∂L/∂z (y, y') )' + ∂L/∂y (y, y') ]
  = 0,

since the expression in the brackets is 0 according to (E-L). 

Why is this useful? If

y' ∂L/∂z (y, y') − L(y, y') = C,

then perhaps we can rewrite this expression into the form

y' = g(y, C).

This is a nonlinear first-order ODE that is solvable, at least in principle, when g ≠ 0:

REVIEW: Solving a nonlinear first-order ODE. Let us recall how to solve nonlinear ODE of the form

y' = g(y).

First, introduce an antiderivative

G(y) = ∫^y dt/g(t),

so that G' = 1/g. Next, try to solve the algebraic expression

G(y) = x + D

for y = y(x, D), where D is a constant. We claim that y(·) solves the ODE y' = g(y). To confirm this, notice that G(y) = x + D implies G'(y) y' = 1. Hence y' = g(y), since G' = 1/g. 

EXAMPLE. We are now able to solve the Euler-Lagrange equation from the surface of revolution example on page 8 above. Recall that we have

L = y(1 + z^2)^{1/2},  ∂L/∂y = (1 + z^2)^{1/2},  ∂L/∂z = yz / (1 + z^2)^{1/2}.

Then (E-L) says

0 = −d/dx ( y y' / (1 + (y')^2)^{1/2} ) + (1 + (y')^2)^{1/2},

and this is a difficult nonlinear second-order ODE. But since L does not depend on x, we can apply Theorem 1.3.2. Now

y' ∂L/∂z − L = y (y')^2 / (1 + (y')^2)^{1/2} − y (1 + (y')^2)^{1/2} = −y / (1 + (y')^2)^{1/2}.

Therefore Theorem 1.3.2 tells us that

y / (1 + (y')^2)^{1/2} = C

for some constant C. We solve this expression for

y' = ± ( (y^2 − C^2) / C^2 )^{1/2}.

We take the positive sign and solve this ODE:

dy/dx = (y^2 − C^2)^{1/2} / C
dy / (y^2 − C^2)^{1/2} = dx / C
∫ dy / (y^2 − C^2)^{1/2} = ∫ dx / C
cosh^{−1}(y/C) = x/C + D.

(I looked up the expression for the y integral from a table of standard integrals.) Therefore the curve giving a surface of revolution of least area is

y_0(x) = C cosh(x/C + D),

where we recall that cosh(x) = (e^x + e^{−x})/2. The graph of the y-curve is called a catenary. The corresponding surface of revolution is a catenoid. 

[Figure: A catenoid]

REMARK. To fully resolve our problem we need to try to adjust the constants C and D so the solution passes through the given endpoints. This however can be subtle and may not be possible: see Gilbert [G]. 
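As a consistency check, the following minimal sympy sketch (an added illustration) verifies that y_0(x) = C cosh(x/C + D) does satisfy the first-order equation y/(1 + (y')^2)^{1/2} = C:

    import sympy as sp

    x, C = sp.symbols('x C', positive=True)
    D = sp.symbols('D')

    # The catenary found above
    y = C * sp.cosh(x / C + D)
    yp = sp.diff(y, x)

    # By Theorem 1.3.2, y/(1 + (y')^2)^(1/2) should simplify to the constant C,
    # since 1 + sinh^2 = cosh^2.
    print(sp.simplify(y / sp.sqrt(1 + yp**2)))   # expect: C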

EXAMPLE. (Geometric optics) Suppose that the velocity of light v in some two-dimensional translucent material depends only upon the vertical coordinate y. Then the time for a light ray, moving along the path of the function y(·), to travel between two given points is

∫_a^b ds/v(y) = ∫_a^b (1 + (y')^2)^{1/2} / v(y) dx,

where s denotes arclength along the curve. The Lagrangian

L = L(y, z) = (1 + z^2)^{1/2} / v(y)

does not depend upon x. Consequently if the graph of the function y(·) describes the path along which light travels, we know that

L(y, y') − y' ∂L/∂z (y, y') = 1 / ( v(y) (1 + (y')^2)^{1/2} ) = sin ξ / v(y)

is constant, where ξ is the angle of the tangent with the vertical, as drawn.

[Figure: Angles and derivatives]

Thus

(1.10)  sin ξ / v(y) = C

for some constant C. This is a continuous version of Snell's Law of refraction (see the Math 170 lecture notes). 
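Relation (1.10) already determines the light path: knowing sin ξ at one point determines it at every height y. The short numerical sketch below (an added illustration, with an assumed linear speed profile) traces a ray step by step using (1.10):

    import numpy as np

    def trace_ray(v, y0, xi0, s_max=2.0, ds=1e-3):
        """Trace a light ray through a medium with height-dependent speed v(y),
        stepping in arclength and holding sin(xi)/v(y) = C fixed, as in (1.10).
        xi is the angle between the tangent and the vertical; the sketch assumes
        the ray does not turn, i.e. cos(xi) keeps its sign."""
        C = np.sin(xi0) / v(y0)                 # the conserved ratio
        x, y = [0.0], [y0]
        for _ in range(int(s_max / ds)):
            s = C * v(y[-1])                    # sin(xi) at the current height
            if abs(s) >= 1.0:                   # the ray would turn here; stop
                break
            x.append(x[-1] + s * ds)            # dx/ds = sin(xi)
            y.append(y[-1] + np.sqrt(1.0 - s**2) * ds)   # dy/ds = cos(xi)
        return np.array(x), np.array(y)

    # Assumed linear profile: the ray bends away from vertical as v(y) grows.
    xs, ys = trace_ray(v=lambda y: 1.0 + 0.5 * y, y0=0.0, xi0=np.pi / 6)
    print(xs[-1], ys[-1])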

EXAMPLE. Recall for our model for the motion of a particle on the line that

I[x(·)] = ∫_a^b (m/2)|ẋ|^2 − W(x(t)) dt

with

L = mv^2/2 − W(x).

We compute

ẋ ∂L/∂v − L = ẋ(mẋ) − ( m(ẋ)^2/2 − W(x) ) = m(ẋ)^2/2 + W(x).

Since L does not depend upon t, Theorem 1.3.2 implies that the above expression is constant. So

total energy = kinetic energy + potential energy = m(ẋ)^2/2 + W(x)

is constant for all times a ≤ t ≤ b. The calculus of variations therefore predicts the physical law of conservation of total energy. 

1.3.2. Transversality conditions.

a. Free endpoint problems. Our definition of the admissible class A on page 3 forces the prescribed boundary conditions that y(a) = y^0, y(b) = y^1. But what if we change the admissible class so as to require only, say, that y(a) = y^0? That is, suppose we redefine the class of admissible functions, now to be

(1.11)  A = {y : [a, b] → R | y(·) is continuous and piecewise continuously differentiable, y(a) = y^0},

and so require nothing about the values at x = b for functions y(·) ∈ A. We as usual define

I[y(·)] = ∫_a^b L(x, y, y') dx

for functions y(·) ∈ A; and seek to understand the behavior of a minimizer y_0(·) ∈ A of this free endpoint problem.

THEOREM 1.3.3. Let the admissible class be given by (1.11). Assume y_0(·) ∈ A solves (COV) and is twice continuously differentiable.

(i) Then y_0 solves the Euler-Lagrange equation

(1.12)  −( ∂L/∂z (x, y_0, y_0') )' + ∂L/∂y (x, y_0, y_0') = 0  (a ≤ x ≤ b).

(ii) Furthermore,

(1.13)  ∂L/∂z (b, y_0(b), y_0'(b)) = 0.

INTERPRETATION. So the Euler-Lagrange equation is as before, whereas


the new formula (1.13) appears at the free endpoint x = b.
This so-called transversality condition (or natural boundary con-
dition) is implicit in the variational formulation and, as we will see, appears
automatically when we compute the first variation. 

Proof. 1. We appropriately modify our earlier derivation of (E-L). So let w : [a, b] → R, with w(a) = 0. Assume −1 ≤ τ ≤ 1 and define

y_τ(x) = y_0(x) + τ w(x)  (a ≤ x ≤ b).

Observe that y_τ(·) ∈ A, since y_τ(a) = y^0. Then I[y_0(·)] ≤ I[y_τ(·)]. Define i(τ) = I[y_τ(·)], and, as before, observe that di/dτ (0) = 0.
As in the earlier proof, we have

0 = di/dτ (0) = ∫_a^b ∂L/∂y (x, y_0, y_0') w + ∂L/∂z (x, y_0, y_0') w' dx.

Integrate by parts in the second term, remembering that w(a) = 0, but that w(b) need not necessarily vanish:

(1.14)  ∫_a^b [ ∂L/∂y (x, y_0, y_0') − ( ∂L/∂z (x, y_0, y_0') )' ] w dx + ∂L/∂z (b, y_0(b), y_0'(b)) w(b) = 0.

2. If we now assume also that w(b) = 0, then (1.14) gives

∫_a^b [ ∂L/∂y (x, y_0, y_0') − ( ∂L/∂z (x, y_0, y_0') )' ] w dx = 0.

That this integral identity holds for all variations w satisfying w(a) = w(b) = 0 implies the (E-L) equation (1.12). Now, drop the assumption that w(b) = 0 and return to (1.14). Since we now know that (1.12) holds, we deduce from (1.14) that

∂L/∂z (b, y_0(b), y_0'(b)) w(b) = 0.

This is valid for all choices of w(b) and consequently the natural boundary condition (1.13) holds. 
EXAMPLE. The minimizer y_0(·) of

I[y(·)] = ∫_0^1 (y')^2/2 − f y dx,

subject to y(0) = 0, satisfies

−y_0'' = f  (0 ≤ x ≤ 1),  y_0(0) = 0,  y_0'(1) = 0. 
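This boundary value problem is easy to check numerically. Here is a minimal sketch using scipy's solve_bvp, with the assumed choice f(x) = sin(πx), for which the exact solution is y(x) = sin(πx)/π^2 + x/π:

    import numpy as np
    from scipy.integrate import solve_bvp

    f = lambda x: np.sin(np.pi * x)          # assumed illustrative choice of f

    def rhs(x, Y):                           # first-order system: Y = [y, y']
        return np.vstack([Y[1], -f(x)])      # y'' = -f

    def bc(Ya, Yb):                          # y(0) = 0 and transversality y'(1) = 0
        return np.array([Ya[0], Yb[1]])

    xs = np.linspace(0.0, 1.0, 11)
    sol = solve_bvp(rhs, bc, xs, np.zeros((2, xs.size)))

    y_exact = np.sin(np.pi * sol.x) / np.pi**2 + sol.x / np.pi
    print(np.max(np.abs(sol.y[0] - y_exact)))    # small: matches the exact solution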

b. Free endtime problems. For many interesting time-dependent problems, we are asked to minimize an integral functional that depends both upon a curve x : [0, T] → R and a variable end time T > 0 at which some specified terminal condition holds. To be more precise, let us switch to the notation introduced in Section 1.2.2 for time-dependent problems. We select points x^0, x^1 ∈ R and introduce the new admissible class

(1.15)  A = {(T, x(·)) | T > 0, x : [0, T] → R, x(0) = x^0, x(T) = x^1},
over which we propose to minimize

I[T, x(·)] = ∫_0^T L(t, x(t), ẋ(t)) dt.

Note carefully: we are now selecting both the terminal time T > 0 and the curve x : [0, T] → R.

THEOREM 1.3.4. Let the admissible class be given by (1.15) and assume (T_0, x_0(·)) ∈ A minimizes I[ · ].

(i) Then x_0 solves the Euler-Lagrange equation

(1.16)  −d/dt ( ∂L/∂v (t, x_0, ẋ_0) ) + ∂L/∂x (t, x_0, ẋ_0) = 0  (0 ≤ t ≤ T_0).

(ii) Furthermore, we have

(1.17)  ∂L/∂v (T_0, x_0(T_0), ẋ_0(T_0)) ẋ_0(T_0) − L(T_0, x_0(T_0), ẋ_0(T_0)) = 0.

INTERPRETATION. The Euler-Lagrange equation (1.16) is the same as before, whereas the new free endtime transversality condition (1.17) appears at the optimal endtime T_0. This is new information that we will discover by computing a variation in the arrival time. 

REMARK. If L = L(x, v) does not depend on t, then according to Theorem 1.3.2, we furthermore see that

(1.18)  ∂L/∂v (x_0, ẋ_0) ẋ_0 − L(x_0, ẋ_0) = 0  (0 ≤ t ≤ T_0). 

Proof. 1. That the usual Euler-Lagrange equation (1.16) holds follows as in the proof of Theorem 1.3.3.

2. To prove (1.17), we introduce a new sort of variation by scaling our minimizer x_0(·) in the time variable. For this, select −1 < σ < 1 and define

T_σ = T_0/(1 + σ),  x_σ(t) = x_0(t(1 + σ))  (0 ≤ t ≤ T_σ).

Then (T_σ, x_σ(·)) ∈ A and thus

j(σ) = ∫_0^{T_σ} L(t, x_σ(t), ẋ_σ(t)) dt

has a minimum at σ = 0. Therefore

dj/dσ (0) = 0.
3. Now

dT_σ/dσ = −T_0/(1 + σ)^2

and

∂x_σ/∂σ = ẋ_0(t(1 + σ)) t.

Therefore

0 = dj/dσ (0) = −T_0 L(T_0, x_0(T_0), ẋ_0(T_0)) + ∫_0^{T_0} ∂L/∂x (t, x_0, ẋ_0) w + ∂L/∂v (t, x_0, ẋ_0) ẇ dt,

for

w(t) = ẋ_0(t) t.

We integrate by parts in the integral and recall (1.16), to deduce that

0 = −T_0 L(T_0, x_0(T_0), ẋ_0(T_0)) + ∂L/∂v (T_0, x_0(T_0), ẋ_0(T_0)) ẋ_0(T_0) T_0.

This gives (1.17), since T_0 > 0. 

The interesting book by Kamien and Schwartz [K-S] has further infor-
mation about transversality conditions in various more complicated situa-
tions.

1.3.3. Integral constraints.

Constraints involving integrals appear often in the calculus of variations. For a model such problem, assume in addition to the Lagrangian L = L(x, y, z) we are given also a function G : [a, b] × R → R, G = G(x, y). Define then the integral functional

J[y(·)] = ∫_a^b G(x, y) dx.

We use J[ · ] to define a new admissible class:

A = {y : [a, b] → R | y(a) = y^0, y(b) = y^1, J[y(·)] = 0}.

So we are now requiring the additional integral constraint that J[y(·)] = 0. We continue as usual to write

I[y(·)] = ∫_a^b L(x, y, y') dx.
THEOREM 1.3.5. Assume that y_0(·) ∈ A is a minimizer of I[ · ] over A. Suppose also that

(1.19)  ∂G/∂y (x, y_0) is not identically zero for a ≤ x ≤ b.

Then there exists λ_0 ∈ R such that

(1.20)  −( ∂L/∂z (x, y_0, y_0') )' + ∂L/∂y (x, y_0, y_0') + λ_0 ∂G/∂y (x, y_0) = 0

for a ≤ x ≤ b.

INTERPRETATION. We understand λ_0 as the Lagrange multiplier corresponding to the integral equality constraint that J[y(·)] = 0. The hypothesis (1.19) is a constraint qualification condition, ensuring the existence of the Lagrange multiplier. See the Math 170 notes for lots more about constraint qualification conditions in finite dimensional optimization. 

Proof. 1. Select w : [a, b] → R with w(a) = w(b) = 0. We want to design a variation involving w, but setting y_0 + τ w for small τ will not work, since this function will probably not belong to A. We must build some sort of correction, to restore the integral constraint. Now the condition (1.19) implies that we can find a smooth function v : [a, b] → R, such that v(a) = v(b) = 0 and

(1.21)  ∫_a^b ∂G/∂y (x, y_0) v dx ≠ 0.

Define

Φ(τ, σ) = ∫_a^b G(x, y_0 + τ w + σ v) dx.

Then

Φ(0, 0) = ∫_a^b G(x, y_0) dx = J[y_0(·)] = 0

and

∂Φ/∂σ (0, 0) = ∫_a^b ∂G/∂y (x, y_0) v dx ≠ 0.

Therefore the Implicit Function Theorem (see the Appendix) tells us that for some small τ_0 > 0 there exists a function

φ : [−τ_0, τ_0] → R

such that φ(0) = 0 and

Φ(τ, φ(τ)) = Φ(0, 0) = 0  (−τ_0 ≤ τ ≤ τ_0).
Let us differentiate this expression in τ, to learn that

∂Φ/∂τ (0, 0) + ∂Φ/∂σ (0, 0) φ'(0) = 0,

where φ' = dφ/dτ. Since

∂Φ/∂τ (0, 0) = ∫_a^b ∂G/∂y (x, y_0) w dx,

it follows that

(1.22)  ∫_a^b ∂G/∂y (x, y_0) w dx + φ'(0) ∫_a^b ∂G/∂y (x, y_0) v dx = 0.

2. Now define

y_τ(x) = y_0(x) + τ w(x) + φ(τ) v(x)  (−τ_0 ≤ τ ≤ τ_0).

Then y_τ(a) = y^0, y_τ(b) = y^1, and

J[y_τ(·)] = ∫_a^b G(x, y_0 + τ w + φ(τ) v) dx = Φ(τ, φ(τ)) = 0;

therefore y_τ(·) ∈ A. Hence i(τ) = I[y_τ(·)] has a minimum at τ = 0, and consequently

di/dτ (0) = 0.

We will extract the Lagrange multiplier from this simple looking equality.

3. We compute

di/dτ (τ) = ∫_a^b ∂L/∂y (x, y_0 + τw + φ(τ)v, y_0' + τw' + φ(τ)v') (w + φ'(τ)v) + ∂L/∂z (x, y_0 + τw + φ(τ)v, y_0' + τw' + φ(τ)v') (w' + φ'(τ)v') dx.

Now put τ = 0 and then integrate by parts:

(1.23)  0 = di/dτ (0) = ∫_a^b ∂L/∂y (w + φ'(0)v) + ∂L/∂z (w' + φ'(0)v') dx = ∫_a^b [ ∂L/∂y − ( ∂L/∂z )' ] (w + φ'(0)v) dx,

where L is evaluated at (x, y_0, y_0').

4. We next define the Lagrange multiplier to be

λ_0 = − ( ∫_a^b [ ∂L/∂y − ( ∂L/∂z )' ] v dx ) / ( ∫_a^b ∂G/∂y v dx ),

in which L is evaluated at (x, y_0, y_0') and G is evaluated at (x, y_0). Then (1.22) implies

φ'(0) ∫_a^b [ ∂L/∂y − ( ∂L/∂z )' ] v dx = −φ'(0) λ_0 ∫_a^b ∂G/∂y v dx = λ_0 ∫_a^b ∂G/∂y (x, y_0) w dx.

We utilize this calculation in (1.23), to find that

∫_a^b [ ∂L/∂y − ( ∂L/∂z )' + λ_0 ∂G/∂y ] w dx = 0.

This identity is valid for all functions w as above, and therefore the ODE (1.20) holds.


EXAMPLE. (Isoperimetric problem) We wish to find a curve y(·) ∈ A to minimize the length

I[y(·)] = ∫_0^1 (1 + (y')^2)^{1/2} dx

among curves connecting the given endpoints A, B and having a given area a under the graph:

J[y(·)] = ∫_0^1 y dx = a.

The Euler-Lagrange equation reads

( y' / (1 + (y')^2)^{1/2} )' = λ.

[Figure: a curve joining A and B over the interval [0, 1]]

Recall from (1.3) that this says the curvature κ is constant. Therefore (as we will prove later, on page 48) the graph of y(·) is an arc of a circle connecting the given endpoints. 

Generalization. We can extend the foregoing to handle more complicated integral constraints having the form

J[y(·)] = ∫_a^b G(x, y(x), y'(x)) dx

where G : [a, b] × R × R → R, G = G(x, y, z).


THEOREM 1.3.6. Assume that y_0 ∈ A is a minimizer of I[ · ] over A. Suppose also that

(1.24)  −( ∂G/∂z (x, y_0, y_0') )' + ∂G/∂y (x, y_0, y_0')

is not identically zero on the interval [a, b]. Then there exists λ_0 ∈ R such that

(1.25)  −( ∂L/∂z (x, y_0, y_0') + λ_0 ∂G/∂z (x, y_0, y_0') )' + ( ∂L/∂y (x, y_0, y_0') + λ_0 ∂G/∂y (x, y_0, y_0') ) = 0

for a ≤ x ≤ b.

We omit the proof, which is similar to that for the previous theorem.

REMARKS. (i) The ODE (1.25) is of course the Euler-Lagrange equation for the new Lagrangian K = L + λ_0 G.

(ii) It is not hard to see that the same Euler-Lagrange equation (1.25) holds if we change the constraint to read J[y(·)] = C for any constant C. 

EXAMPLE. (Hanging chain) A chain of constant mass density and length l hangs between the points A± = (±a, 0). What is its shape? The problem is to minimize the gravitational potential energy

I[y(·)] = ∫_{−a}^a y (1 + (y')^2)^{1/2} dx,

subject to y(±a) = 0 and the length constraint

J[y(·)] = ∫_{−a}^a (1 + (y')^2)^{1/2} dx = l.

Here L = y(1 + z^2)^{1/2}, G = (1 + z^2)^{1/2}. Since K = L + λG does not depend on x, we see from the Euler-Lagrange equation (1.25) that

−y' ( ∂L/∂z + λ ∂G/∂z )(y, y') + (L + λG)(y, y') = (y + λ) / (1 + (y')^2)^{1/2}

is constant. Thus ȳ = y + λ satisfies

ȳ / (1 + (ȳ')^2)^{1/2} = C

for some constant C, and this is an ODE we have solved earlier, on page 16. We thereby obtain the symmetric catenary ȳ(x) = C cosh(x/C); and consequently

y_0(x) = C cosh(x/C) − λ.

We now adjust C so that ∫_{−a}^a (1 + (y_0')^2)^{1/2} dx = l, and then select λ so that y_0(±a) = 0.

[Figure: Catenary]

Now if l < 2a, the admissible class is empty and we will not be able to select C as above. If l = 2a, the admissible class consists of only one configuration, for which the chain is stretched horizontally between its left and right endpoints. The constraint qualification condition (1.24) then fails. 
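Carrying out the adjustment of C and λ is a one-dimensional root-finding problem, since the arclength of ȳ = C cosh(x/C) over [−a, a] equals 2C sinh(a/C). A minimal numerical sketch (an added illustration, with the assumed values a = 1 and l = 3):

    import numpy as np
    from scipy.optimize import brentq

    a, l = 1.0, 3.0                  # assumed half-width and chain length, l > 2a

    # Arclength of ybar = C*cosh(x/C) over [-a, a] is 2*C*sinh(a/C); match it to l.
    g = lambda C: 2.0 * C * np.sinh(a / C) - l
    C = brentq(g, 0.1, 100.0)        # g > 0 for small C and g -> 2a - l < 0 for large C

    lam = C * np.cosh(a / C)         # choose lambda so that y0(±a) = 0
    y0 = lambda x: C * np.cosh(x / C) - lam
    print(C, y0(a), y0(-a))          # endpoint values vanish up to roundoff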

1.3.4. Systems.

We next turn attention to calculus of variations problems for functions y : [a, b] → R^n. The new difficulties are mostly notational, as the basic ideas are the same as above.

NOTATION. (i) We write

y(x) = [y_1(x), ..., y_n(x)]^T,  y'(x) = [y_1'(x), ..., y_n'(x)]^T.

(ii) The admissible class is

A = {y : [a, b] → R^n | y(·) is continuous and piecewise continuously differentiable, y(a) = y^0, y(b) = y^1},

where y^0, y^1 ∈ R^n are given.

(iii) We are given a Lagrangian function L : [a, b] × R^n × R^n → R,

L = L(x, y, z) = L(x, y_1, ..., y_n, z_1, ..., z_n),

with

∇_y L = [∂L/∂y_1, ..., ∂L/∂y_n]^T,  ∇_z L = [∂L/∂z_1, ..., ∂L/∂z_n]^T.

(iv) We write

I[y(·)] = ∫_a^b L(x, y(x), y'(x)) dx.

Our problem now is to study functions y_0(·) ∈ A that satisfy

(COV)  I[y_0(·)] = min_{y(·)∈A} I[y(·)].

THEOREM 1.3.7. Suppose y_0(·) ∈ A solves (COV) and is twice continuously differentiable. Then y_0(·) solves the Euler-Lagrange system of ODE

(E-L)  −d/dx ( ∂L/∂z_k (x, y, y') ) + ∂L/∂y_k (x, y, y') = 0  (k = 1, ..., n).

REMARKS. (i) In vector notation (E-L) reads

−( ∇_z L(x, y, y') )' + ∇_y L(x, y, y') = 0.

These comprise n coupled second-order ODE for the n unknown functions y_1(·), ..., y_n(·) that are the components of y(·).

(ii) If time t is the independent variable and L = L(t, x, v), the Euler-Lagrange system of ODE is written

(E-L)  −d/dt ( ∂L/∂v_k (t, x, ẋ) ) + ∂L/∂x_k (t, x, ẋ) = 0

for k = 1, ..., n. In vector form, this is

(1.26)  −d/dt ( ∇_v L(t, x, ẋ) ) + ∇_x L(t, x, ẋ) = 0.

Proof. 1. We extend our previous derivation of the Euler-Lagrange equation to this vector case. Select w : [a, b] → R^n, written

w = [w_1, ..., w_n]^T,

such that w(a) = w(b) = 0. Then define

y_τ(x) = y_0(x) + τ w(x)  (a ≤ x ≤ b)

for −1 ≤ τ ≤ 1. We have y_τ(·) ∈ A, and consequently

I[y_0(·)] ≤ I[y_τ(·)].

Define i(τ) = I[y_τ(·)], so that i(·) has a minimum at τ = 0. Therefore

di/dτ (0) = 0.

2. Since

i(τ) = ∫_a^b L(x, y_0(x) + τ w(x), y_0'(x) + τ w'(x)) dx,

we can apply the chain rule to compute

di/dτ (τ) = ∫_a^b Σ_{l=1}^n ∂L/∂y_l (x, y_0 + τw, y_0' + τw') w_l + Σ_{l=1}^n ∂L/∂z_l (x, y_0 + τw, y_0' + τw') w_l' dx.

Thus

0 = di/dτ (0) = ∫_a^b Σ_{l=1}^n ∂L/∂y_l (x, y_0, y_0') w_l + Σ_{l=1}^n ∂L/∂z_l (x, y_0, y_0') w_l' dx.

Now fix some index k ∈ {1, ..., n} and put

w = [0 ... 0 w 0 ... 0]^T,

where the real-valued function w : [a, b] → R appears in the k-th slot. Then we have

∫_a^b ∂L/∂y_k (x, y_0, y_0') w + ∂L/∂z_k (x, y_0, y_0') w' dx = 0.

Upon integrating by parts, we deduce as usual that the k-th equation of the stated Euler-Lagrange system (E-L) holds. 
30 1. FIRST VARIATION

EXAMPLE. (Motion of particle in space) For this example the independent variable is t (for time) and if x : [a, b] → R^n, we regard x(t) as the position at time t of a particle with mass m moving in R^n. The action of any such path is

I[x(·)] = ∫_a^b m|ẋ|^2/2 − W(x) dt.

So

L = m|v|^2/2 − W(x),  ∇_v L = mv,  ∇_x L = −∇W(x).

The Euler-Lagrange system of equations gives Newton's law

mẍ = −∇W(x)

for the motion of a particle in space governed by the potential energy W. The path of the particle is thus an extremal of the action. It is sometimes said that the path of the particle satisfies the principle of least action. But this terminology is misleading: the path is an extremal of the action, but is not necessarily a minimizer. 

THEOREM 1.3.8. Suppose y(·) is an extremal.

(i) If L = L(x, z) does not depend on y, then

∇_z L(x, y') is constant for a ≤ x ≤ b.

More generally, if L does not depend upon y_k for some k ∈ {1, ..., n}, then

(1.27)  ∂L/∂z_k (x, y, y') is constant for a ≤ x ≤ b.

(ii) If L = L(y, z) does not depend on x, then

(1.28)  y' · ∇_z L(y, y') − L(y, y') is constant for a ≤ x ≤ b.

The proof of (1.27) is simple, and the proof of (1.28) is similar to that for our earlier Theorem 1.3.2.

EXAMPLE. For the motion of a particle in space, the Lagrangian does not depend upon t, and therefore the total energy

ẋ · ∇_v L(x, ẋ) − L(x, ẋ) = m|ẋ|^2/2 + W(x)

is conserved. 
1.3.5. Routh's method. We explain next a technique that can sometimes be invoked to reduce the number of unknowns in an Euler-Lagrange system of ODE. The simplest case is n = 2, for which the unknown is y = [y_1, y_2]^T. The basic idea is that if the Lagrangian

L = L(x, y, z) = L(x, y_1, z_1, z_2)

does not depend upon y_2, we can then convert the full (E-L) system

(1.29)  −( ∂L/∂z_1 (x, y, y') )' + ∂L/∂y_1 (x, y, y') = 0,  −( ∂L/∂z_2 (x, y, y') )' = 0

into a single ODE for the single unknown y_1.

To do this, first observe that the second Euler-Lagrange equation in (1.29) implies

(1.30)  ∂L/∂z_2 (x, y_1, y_1', y_2') = C

for some constant C. We assume next that we can rewrite the algebraic identity

∂L/∂z_2 (x, y_1, z_1, z_2) = C

to solve for z_2:

z_2 = φ(x, y_1, z_1, C).

Thus

(1.31)  y_2' = φ(x, y_1, y_1', C).

DEFINITION. Routh’s function is


(1.32) R(x, y1 , z1 ) = L(x, y1 , z1 , φ(x, y1 , z1 , C)) − Cφ(x, y1 , z1 , C).

THEOREM 1.3.9. Assume that y = [y_1, y_2]^T solves the (E-L) system (1.29) and that the conservation law (1.30) holds. Then y_1 solves the single (E-L) equation determined by Routh's function:

(1.33)  −( ∂R/∂z_1 (x, y_1, y_1') )' + ∂R/∂y_1 (x, y_1, y_1') = 0.

REMARK. And so if we can solve the ODE (1.33) for the unknown function y_1, we can then recover y_2 by integrating (1.31). 
Proof. We calculate

∂R/∂y_1 = ∂L/∂y_1 + ( ∂L/∂z_2 − C ) ∂φ/∂y_1

and

∂R/∂z_1 = ∂L/∂z_1 + ( ∂L/∂z_2 − C ) ∂φ/∂z_1.

Hence (1.31) and (1.30) imply

∂R/∂y_1 (x, y_1, y_1') = ∂L/∂y_1 (x, y_1, y_1', y_2')

and

∂R/∂z_1 (x, y_1, y_1') = ∂L/∂z_1 (x, y_1, y_1', y_2').

Then the first equation in (1.29) lets us compute that

−( ∂R/∂z_1 (x, y_1, y_1') )' + ∂R/∂y_1 (x, y_1, y_1') = −( ∂L/∂z_1 (x, y, y') )' + ∂L/∂y_1 (x, y, y') = 0. 
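A classical instance of Routh's method is planar motion in a central potential, with y_1 the radius and y_2 the angle, so that the conserved quantity (1.30) is the angular momentum. The following sympy sketch (an added illustration, assuming unit mass) computes Routh's function (1.32) in this case:

    import sympy as sp

    x, y1, z1, z2, C = sp.symbols('x y1 z1 z2 C')
    W = sp.Function('W')             # the (unspecified) radial potential

    # Unit-mass planar motion in polar coordinates (y1 = radius, y2 = angle):
    # L is independent of y2, so dL/dz2 is conserved (the angular momentum).
    L = (z1**2 + y1**2 * z2**2) / 2 - W(y1)

    # Solve dL/dz2 = C for z2, giving z2 = phi(x, y1, z1, C) = C / y1^2
    phi = sp.solve(sp.Eq(sp.diff(L, z2), C), z2)[0]

    # Routh's function (1.32): a Lagrangian for the radial variable alone
    R = sp.simplify(L.subs(z2, phi) - C * phi)
    print(R)                         # z1**2/2 - C**2/(2*y1**2) - W(y1)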
1.4. Applications

Following are some more substantial, and more interesting, applications of our theory.

1.4.1. Brachistochrone.

Given two points A, B as drawn, we can interpret the graph of a function y(·) joining these points as a wire path along which a bead of unit mass slides without friction under the influence of gravity. How do we design the slide so as to minimize the time it takes for the bead to slide from A to B?

[Figure: a wire path from A to B]
For simplicity, we assume that A = (0, 0) and that y(x) ≤ 0 for all 0 ≤ x ≤ b. As the particle slides its total energy (= kinetic energy + potential energy) is constant. Therefore

v^2/2 + g y = 0

on the interval [0, b], where v is the velocity and g is gravitational acceleration. The constant is 0, since v(0) = y(0) = 0. Therefore

v = (−2gy)^{1/2}.

The time for the bead to slide from A to B is thus

∫_0^b ds/v = ∫_0^b (1 + (y')^2)^{1/2} / (−2gy)^{1/2} dx.

We therefore seek a path y_0(·) from A to B that minimizes

I[y(·)] = ∫_0^b ( (1 + (y')^2) / (−y) )^{1/2} dx.

Now

L = ( (1 + z^2) / (−y) )^{1/2},  ∂L/∂z = z / ( (1 + z^2)^{1/2} (−y)^{1/2} ),

and consequently

y' ∂L/∂z (y, y') − L(y, y') = (y')^2 / ( (1 + (y')^2)^{1/2} (−y)^{1/2} ) − ( (1 + (y')^2) / (−y) )^{1/2} = −( −y (1 + (y')^2) )^{−1/2}.

Since L does not depend on x, it follows from Theorem 1.3.2 that

y' ∂L/∂z (y, y') − L(y, y')

is constant. Therefore

(1.34)  y (1 + (y')^2) = C

on the interval [0, b] for some (negative) constant C.
GEOMETRIC INTERPRETATION. It is possible to directly integrate the ODE (1.34) (see Kot [K]), but the following geometric insights are more interesting. We first check that if the graph of y(·) is the blue curve drawn below, the angle ξ satisfies

sin ξ = 1 / (1 + (y')^2)^{1/2}.
[Figure: Angles and derivatives]

Hence the ODE (1.34) says geometrically that

(1.35)  sin ξ / (−y)^{1/2} is constant;

and, according to the Remark on page 14, this in turn implies y(·) solves the full Euler-Lagrange equation. (Compare all this with the geometric optics example on page 17.)

[Figure: A cycloid]

Now (1.35) turns out to imply that the brachistochrone path is along a cycloid, the curve traced by a point on the rim of a circle as it rolls horizontally. Levi [L, pages 190–192] and Melzak [M, page 96] provide the following elegant geometric proof. The key observation is that if a point C = (x, y) on a rolling circle of diameter d > 0 generates a cycloid and if A is the instantaneous point of contact of the circle with the line, then the vector AC is perpendicular to the velocity vector v.
[Figure: Geometry of brachistochrone]

Thus v, which is tangent to the blue cycloid curve, is parallel to CB, B denoting the point directly opposite from A on the circle. Consequently |AB| = d. Elementary geometry shows for the angles ξ as drawn that |AC| = d sin ξ and

−y = |DC| = |AC| sin ξ = d sin^2 ξ.

This gives (1.35). 
REMARK. See also Levi [L, page 55] for an argument showing that the
cycloid is a tautochrone. This means that if we release from rest two
beads from different locations along the cycloid wire curve, they arrive at
the lowest point at the same time. 
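It is straightforward to confirm symbolically that a cycloid satisfies the conservation law (1.34). In the sketch below (an added check, with the cycloid parameterized by the rolling angle θ and opening downward), y(1 + (y')^2) simplifies to the constant −2r:

    import sympy as sp

    theta, r = sp.symbols('theta r', positive=True)

    # Cycloid traced by a circle of radius r rolling along the x-axis
    x = r * (theta - sp.sin(theta))
    y = -r * (1 - sp.cos(theta))

    # Along the curve, y' = dy/dx = (dy/dtheta) / (dx/dtheta)
    yp = sp.diff(y, theta) / sp.diff(x, theta)

    # The conservation law (1.34): y*(1 + (y')^2) should reduce to a constant
    print(sp.simplify(y * (1 + yp**2)))          # expect: -2*r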

1.4.2. Terrestrial brachistochrone.

A modern variant of the classical brachistochrone problem asks us to find a curve r = r(θ) in polar coordinates, describing a tunnel through the earth within which a (frictionless) vehicle falls and thereby travels in the least time between points on the surface. As derived in Smith [S, pages 131–133] and Kot [K, pages 6–7], the transit time along a given curve is

I[r(·)] = (R/g)^{1/2} ∫_{θ_a}^{θ_b} ( ((r')^2 + r^2) / (R^2 − r^2) )^{1/2} dθ,

where R is the radius of the earth and g is the gravitational acceleration at the surface.
Since the Lagrangian

L(r, s) = ( (s^2 + r^2) / (R^2 − r^2) )^{1/2}

does not depend on θ, we know that for any minimizing curve the expression

r' ∂L/∂s (r, r') − L(r, r')

is constant. We compute this and simplify, to find for some constant C that

(1.36)  (r'/r)^2 + 1 = C r^2 / (R^2 − r^2).

GEOMETRIC INTERPRETATION. If ψ denotes, as drawn, the angle between the path traced by r = r(θ) and the radial vector r, we have

r' = r cot ψ.

[Figure: Angles and derivatives in polar coordinates]

Hence the ODE (1.36) says geometrically that

(1.37)  sin ψ ( R^2/r^2 − 1 )^{−1/2} is constant.

We now modify the geometric reasoning from the conventional brachis-


tochrone. A circle of radius d rolls along the inside of a circle with center
O and radius R > d, and a point C on the smaller circle sweeps out a
hypocycloid.
[Figure: Hypocycloid]

If A is the instantaneous point of contact of the smaller circle with the larger, then the vector AC is perpendicular to the velocity vector v.

[Figure: Geometry of terrestrial brachistochrone]

Therefore the point B as drawn is directly opposite from A, with |AB| = d. We apply the Law of Cosines to the triangle ACO, to deduce that

r^2 = d^2 sin^2 ξ + R^2 − 2dR sin ξ cos(π/2 − ξ).

Thus

(1.38)  r^2 = sin^2 ξ (d^2 − 2Rd) + R^2.

The Law of Sines applied to the triangle BCO tells us as well that

sin ψ / (R − d) = sin(π − ξ) / r = sin ξ / r.

Use this identity in (1.38), to learn that

R^2/r^2 − 1 = sin^2 ψ (2dR − d^2) / (R − d)^2.

Therefore the hypocycloid traced out by the point C satisfies the geometric form (1.37) of the terrestrial brachistochrone equation. 

1.4.3. Lagrangian and Hamiltonian dynamics.

In this section we explain how to rewrite certain Euler-Lagrange equations into a new formulation. For this, we assume the independent variable is time t, and the Lagrangian is L = L(x, v). In this notation the Euler-Lagrange equations are

(E-L)  −d/dt ( ∇_v L(x, ẋ) ) + ∇_x L(x, ẋ) = 0,

where

∇_x L = [∂L/∂x_1, ..., ∂L/∂x_n]^T,  ∇_v L = [∂L/∂v_1, ..., ∂L/∂v_n]^T.

Hamilton's equations. We propose to convert (E-L) into a system of 2n first-order ODE, having the elegant form

(H)  ẋ = ∇_p H(x, p),  ṗ = −∇_x H(x, p).

These equations involve a new function H : R^n × R^n → R, H = H(x, p), called the Hamiltonian, that we will define below. We will write

∇_x H = [∂H/∂x_1, ..., ∂H/∂x_n]^T,  ∇_p H = [∂H/∂p_1, ..., ∂H/∂p_n]^T.
The unknowns in (H) are the two functions x, p : [0, ∞) → R^n, where

x = [x_1, ..., x_n]^T,  p = [p_1, ..., p_n]^T.

Assume hereafter that x(·) solves (E-L) for all times t ≥ 0.

DEFINITION. Set

p(t) = ∇_v L(x(t), ẋ(t))  (t ≥ 0).

We call p(·) the (generalized) momentum associated with x(·).

The Hamiltonian. We will need the following hypothesis:

(1.39)  Assume for all x, p ∈ R^n that the equation p = ∇_v L(x, v) can be uniquely solved for v as a function of x and p: v = φ(x, p).

We consequently have the identity

(1.40)  ∇_v L(x, φ(x, p)) = p  (x, p ∈ R^n).

DEFINITION. The Hamiltonian H corresponding to the Lagrangian L is

(1.41)  H(x, p) = p · φ(x, p) − L(x, φ(x, p))  (x, p ∈ R^n).

THEOREM 1.4.1.

(i) The functions x, p : [0, ∞) → R^n solve Hamilton's equations (H).

(ii) Furthermore,

(1.42)  H(x, p) is constant on [0, ∞).

Proof. 1. We compute using (1.41) and (1.40) that

(1.43)  ∇_p H(x, p) = φ(x, p) + (∇_p φ)^T(x, p) ( p − ∇_v L(x, φ(x, p)) ) = φ(x, p)

and

(1.44)  ∇_x H(x, p) = (∇_x φ)^T(x, p) ( p − ∇_v L(x, φ(x, p)) ) − ∇_x L(x, φ(x, p)) = −∇_x L(x, φ(x, p)).

Put x = x(t) and p = p(t) into these formulas, and note that (1.39) implies

ẋ = φ(x, p).

From (1.43), it follows that ∇_p H(x, p) = ẋ. And (1.44) gives

∇_x H(x, p) = −∇_x L(x, ẋ) = −d/dt ( ∇_v L(x, ẋ) )  (according to (E-L))  = −ṗ.

2. To see that H(x, p) is constant in time, compute

d/dt H(x, p) = ∇_x H(x, p) · ẋ + ∇_p H(x, p) · ṗ = ∇_x H(x, p) · ∇_p H(x, p) − ∇_p H(x, p) · ∇_x H(x, p) = 0. 
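Property (1.42) is easy to observe numerically. The following minimal sketch (an added illustration, assuming the pendulum Hamiltonian H(x, p) = p^2/2 − cos x) integrates Hamilton's equations (H) with scipy and checks that H stays constant along the trajectory:

    import numpy as np
    from scipy.integrate import solve_ivp

    # Pendulum: H(x, p) = p^2/2 - cos(x), so W(x) = -cos(x) and W'(x) = sin(x).
    def hamilton(t, Y):
        x, p = Y
        return [p, -np.sin(x)]       # xdot = dH/dp,  pdot = -dH/dx

    sol = solve_ivp(hamilton, (0.0, 20.0), [1.0, 0.0], rtol=1e-10, atol=1e-10)

    H = lambda x, p: p**2 / 2 - np.cos(x)
    x, p = sol.y
    print(np.max(np.abs(H(x, p) - H(1.0, 0.0))))   # stays near 0: H is conserved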
REMARK. (Lagrangians, Hamiltonians and convex duality) If we assume for each x ∈ R^n that

v ↦ L(x, v) is uniformly convex

and that the superlinear growth condition

lim_{|v|→∞} L(x, v)/|v| = ∞

holds, then the Hamiltonian is the dual convex function

(1.45)  H(x, p) = max_{v∈R^n} { p · v − L(x, v) },

the maximum occurring for v = φ(x, p). (See the Math 170 notes for more about convex duality.) 

EXAMPLE. (Motion within a magnetic field) Let B : R^3 → R^3 denote a time-independent magnetic field. A charged particle within this magnetic field moves according to the Lorentz equation

(1.46)  mẍ = q (ẋ × B(x)),

in which m is the mass of the particle and q is its charge. We now show that this equation follows from Hamilton's equations (H) for

(1.47)  H = |p − qA(x)|^2 / (2m),
where the magnetic potential field A satisfies

∇ × A = B.

We compute

∇_p H = (p − qA(x))/m,  ∇_x H = −q (∇A(x))^T (p − qA(x))/m,

since ∇(|A|^2) = 2(∇A)^T A. So Hamilton's equations read

ẋ = (p − qA(x))/m,  ṗ = q (∇A(x))^T (p − qA(x))/m.

We now show that these imply the Lorentz equation (1.46), by computing

mẍ = ṗ − q∇A ẋ
   = q (∇A)^T (p − qA)/m − q∇A(x) ẋ
   = q ( (∇A)^T − ∇A ) ẋ
   = q ( ẋ × (∇ × A) )
   = q ( ẋ × B ).

In this calculation we employed the vector calculus rule

( ∇g − (∇g)^T ) y = (∇ × g) × y

for y ∈ R^3 and g : R^3 → R^3. Taylor [T] is a good text for more on physical applications of variational principles and Hamilton's equations. 

1.4.4. Geodesics.

Let U ⊆ R^n be an open region. Assume that we are given a function y : U → R^l, which we write as

y = [y_1, ..., y_l]^T.

We call y a coordinate patch.

[Figure: a coordinate patch]

DEFINITION. The metric tensor G is the n × n symmetric matrix function whose entries are

g_{ij} = ∂y/∂x_i · ∂y/∂x_j  (i, j = 1, ..., n).

We assume G is everywhere positive definite: G > 0.

NOTATION. The matrix G is therefore invertible. We will write g^{ij} for the (i, j)-th entry of the inverse matrix G^{−1}.

DEFINITION. The corresponding Christoffel symbols are

(1.48)  Γ^m_{ij} = (1/2) Σ_{k=1}^n g^{mk} ( ∂g_{ik}/∂x_j + ∂g_{jk}/∂x_i − ∂g_{ij}/∂x_k ).
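Formula (1.48) is tedious by hand but mechanical to code. Here is a minimal sympy sketch (an added illustration) that computes all Christoffel symbols of a given metric; as a test case it uses polar coordinates on the plane, the coordinate patch y(x_1, x_2) = (x_1 cos x_2, x_1 sin x_2), whose metric tensor is G = diag(1, x_1^2):

    import sympy as sp

    def christoffel(g, xs):
        """Christoffel symbols (1.48) of the metric matrix g in coordinates xs.
        Returns a nested list with Gamma[m][i][j]."""
        n = len(xs)
        ginv = g.inv()
        return [[[sp.simplify(sum(ginv[m, k] * (sp.diff(g[i, k], xs[j])
                                                + sp.diff(g[j, k], xs[i])
                                                - sp.diff(g[i, j], xs[k]))
                                  for k in range(n)) / 2)
                  for j in range(n)]
                 for i in range(n)]
                for m in range(n)]

    # Test case: polar coordinates on the plane, with G = diag(1, x1^2).
    x1, x2 = sp.symbols('x1 x2', positive=True)
    G = sp.Matrix([[1, 0], [0, x1**2]])
    Gamma = christoffel(G, [x1, x2])
    print(Gamma[0][1][1], Gamma[1][0][1])   # expect -x1 and 1/x1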

DEFINITION. The energy of a curve x : [0, T] → U is

E[x(·)] = (1/2) ∫_0^T Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j dt.

THEOREM 1.4.2.

(i) The Euler-Lagrange equations for the energy E[ · ] are

(1.49)  ẍ_m + Σ_{i,j=1}^n Γ^m_{ij}(x) ẋ_i ẋ_j = 0  (m = 1, ..., n).

(ii) If x solves (1.49), then

(1.50)  Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j is constant.

DEFINITION. A curve x(·) solving the system of ODE (1.49) is called a


geodesic. We will see later that (1.50) says geodesics have constant speed.


Proof. 1. The Lagrangian is

L = L(x, v) = (1/2) Σ_{i,j=1}^n g_{ij}(x) v_i v_j,

with

∂L/∂v_k = Σ_{i=1}^n g_{ik} v_i,  ∂L/∂x_k = (1/2) Σ_{i,j=1}^n ∂g_{ij}/∂x_k v_i v_j.

We insert these into the Euler-Lagrange equation −d/dt (∂L/∂v_k) + ∂L/∂x_k = 0, to find

d/dt ( Σ_{i=1}^n g_{ik} ẋ_i ) − (1/2) Σ_{i,j=1}^n ∂g_{ij}/∂x_k ẋ_i ẋ_j = 0.

Therefore

0 = Σ_{i=1}^n g_{ik} ẍ_i + Σ_{i,j=1}^n ( ∂g_{ik}/∂x_j − (1/2) ∂g_{ij}/∂x_k ) ẋ_i ẋ_j
  = Σ_{i=1}^n g_{ik} ẍ_i + (1/2) Σ_{i,j=1}^n ( ∂g_{ik}/∂x_j + ∂g_{jk}/∂x_i − ∂g_{ij}/∂x_k ) ẋ_i ẋ_j.

Multiply by g^{mk}, sum on k, and recall Σ_{k=1}^n g^{mk} g_{ki} = δ_{mi}, to deduce

ẍ_m + (1/2) Σ_{i,j,k=1}^n g^{mk} ( ∂g_{ik}/∂x_j + ∂g_{jk}/∂x_i − ∂g_{ij}/∂x_k ) ẋ_i ẋ_j = 0.

We recall the definition (1.48) of the Christoffel symbols to complete the derivation of (1.49).

2. Since the Lagrangian L does not depend upon the independent variable t, Theorem 1.3.8 tells us that the expression

ẋ · ∇_v L(x, ẋ) − L(x, ẋ) = (1/2) Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j

is constant for times 0 ≤ t ≤ T. 

Length and energy. We discuss next how minimizing the energy is equivalent to minimizing length. We henceforth take T = 1 in the definition of the energy.

DEFINITIONS.

(i) The length of a curve x : [0, 1] → U is

L[x(·)] = ∫_0^1 ( Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j )^{1/2} dt.

(This is the Euclidean length of the image of x(·) under the coordinate chart y(·).)

(ii) The distance between two points A, B ∈ R^n in the metric determined by G is

dist(A, B) = min { L[x] | x : [0, 1] → R^n, x(0) = A, x(1) = B }.


It turns out that minimizing energy is equivalent to minimizing length:

THEOREM 1.4.3. A curve minimizes the energy among paths joining A and B for 0 ≤ t ≤ 1 if and only if it has constant speed and minimizes the length. Furthermore

(1.51)  min { E[x] | x : [0, 1] → R^n, x(0) = A, x(1) = B } = dist(A, B)^2 / 2.

Proof. Let us assume that a curve x = [x_1, ..., x_n]^T joining A and B gives the distance:

dist(A, B) = ∫_0^1 ( Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j )^{1/2} dt.

We can if necessary reparameterize x(·) to have constant speed, so that

(1.52)  ( Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j )^{1/2} = dist(A, B)  (0 ≤ t ≤ 1).

Next, assume that a curve y(·) minimizes the energy among paths connecting A to B:

E[y] = min { E[w] | w : [0, 1] → R^n, w(0) = A, w(1) = B }.

According to Theorem 1.4.2,

(1.53)  (1/2) Σ_{i,j=1}^n g_{ij}(y) ẏ_i ẏ_j = E

is constant in time. Then (1.52) implies

E = E[y] ≤ E[x] = (1/2) ∫_0^1 Σ_{i,j=1}^n g_{ij}(x) ẋ_i ẋ_j dt = dist(A, B)^2 / 2.

But also

dist(A, B) = L[x] ≤ L[y] = ∫_0^1 ( Σ_{i,j=1}^n g_{ij}(y) ẏ_i ẏ_j )^{1/2} dt ≤ ( ∫_0^1 Σ_{i,j=1}^n g_{ij}(y) ẏ_i ẏ_j dt )^{1/2} = (2E)^{1/2},

according to (1.53). Hence E = dist(A, B)^2 / 2; and therefore L[x] = L[y], E[y] = E[x]. 

EXAMPLE (Hyperbolic metric). A famous model in geometry is the Poincaré hyperbolic plane, for which n = 2 and

g_{ij} = δ_{ij} / (x_2)^2  (1 ≤ i, j ≤ 2)

in the region H = {−∞ < x_1 < ∞, 0 < x_2 < ∞}. We calculate using the definition (1.48) that

Γ^1_{12} = Γ^1_{21} = Γ^2_{22} = −1/x_2,  Γ^2_{11} = 1/x_2,

and the remaining Christoffel symbols vanish. Employing these formulas in (1.49) yields this system of geodesic ODE for the hyperbolic plane:

(1.54)  ẍ_1 = 2 ẋ_1 ẋ_2 / x_2,  ẍ_2 = ( (ẋ_2)^2 − (ẋ_1)^2 ) / x_2.

We can extract geometric information from these equations:

THEOREM 1.4.4. The path of any trajectory solving (1.54) is either along a vertical line or else along a circle centered on the x_1-axis.
46 1. FIRST VARIATION

Proof. 1. Consider a solution of the ODE system (1.54) with ẋ_1 ≠ 0. Define

(1.55)  a = x_1 + x_2 ẋ_2 / ẋ_1.

Then

ȧ = ẋ_1 + ( (ẋ_2)^2 + x_2 ẍ_2 ) / ẋ_1 − x_2 ẋ_2 ẍ_1 / (ẋ_1)^2
  = ẋ_1 + (ẋ_2)^2/ẋ_1 + ( (ẋ_2)^2 − (ẋ_1)^2 )/ẋ_1 − 2(ẋ_2)^2/ẋ_1
  = 0;

consequently a is constant.

2. We claim that the motion of the point x = [x_1, x_2]^T lies within a circle with center (a, 0). To confirm this, let us use (1.55) to calculate that

d/ds [ ( (x_1 − a)^2 + (x_2)^2 ) / 2 ] = (x_1 − a) ẋ_1 + x_2 ẋ_2 = −( x_2 ẋ_2 / ẋ_1 ) ẋ_1 + x_2 ẋ_2 = 0.

Therefore

(x_1 − a)^2 + (x_2)^2 = r^2

for some appropriate radius r > 0. 
GEOMETRIC INTERPRETATION. Thus the geodesics in the hyperbolic half plane are either vertical, or else approach the x_1-axis as s → ±∞. The half circles they traverse have infinite length. 

[Figure: Geodesics in the hyperbolic plane]
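Theorem 1.4.4 can also be checked numerically: along any solution of (1.54) the center a from (1.55) and the squared radius (x_1 − a)^2 + (x_2)^2 should both stay constant. A minimal sketch with scipy (an added illustration, starting at the top of the unit semicircle):

    import numpy as np
    from scipy.integrate import solve_ivp

    # The geodesic system (1.54) for the hyperbolic half plane, as first-order ODE.
    def geodesic(s, Y):
        x1, x2, v1, v2 = Y
        return [v1, v2, 2*v1*v2/x2, (v2**2 - v1**2)/x2]

    # Start at the top of the unit semicircle, moving horizontally.
    sol = solve_ivp(geodesic, (0.0, 3.0), [0.0, 1.0, 1.0, 0.0],
                    rtol=1e-10, atol=1e-10)

    x1, x2, v1, v2 = sol.y
    a = x1 + x2 * v2 / v1            # the quantity (1.55); should be constant
    r2 = (x1 - a)**2 + x2**2         # squared distance to (a, 0); also constant
    print(np.ptp(a), np.ptp(r2))     # both spreads are tiny (roundoff level)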



1.4.5. Maxwell’s fisheye.


This example concerns curves x(·) taking values in R3 that are extremals
for the Lagrangian
|v|
(1.56) L(x, v) = (x, v ∈ R3 ).
1 + |x|2
Then
v 2|v|x
∇v L = 2
, ∇x L = − .
|v|(1 + |x| ) (1 + |x|2 )2
The (E-L) system is therefore
 
d ẋ 2|ẋ|x
(1.57) − − = 0.
dt |ẋ|(1 + |x|2 ) (1 + |x|2 )2
where · = d
dt . We reparameterize in terms of the arclength s = s(t), which
satisfies
ds
= |ẋ|.
dt
Then (1.57) becomes
0
x0

2x
(1.58) =− .
1 + |x|2 (1 + |x|2 )2
where 0 = d
ds .

We will now investigate properties of solutions of the system (1.58), with |x'| ≡ 1.

THEOREM 1.4.5. Any trajectory solving (1.58) is either along a line through the origin, or else along a circle in a plane passing through the origin.
GEOMETRIC INTERPRETATION. The proof below shows how we
can sometimes deduce interesting geometric information about solutions of
a system of ODE.
In particular, the Lagrangian (1.56) has the remarkable property that
all solutions of the corresponding Euler-Lagrange system of ODE move in
circles (or along straight lines). And even though L is radial in x, the centers
of these circles need not be the origin.


Proof. 1. We compute using the cross product and the ODE (1.58) that

( x'/(1 + |x|^2) × x )' = ( x'/(1 + |x|^2) )' × x + x'/(1 + |x|^2) × x' = −2x/(1 + |x|^2)^2 × x = 0.

Hence

x'/(1 + |x|^2) × x = b  for some vector b ∈ R^3.

If b ≠ 0, then since x · b = 0, the trajectory lies in the plane through the origin perpendicular to b. If b = 0, then x and x' are everywhere parallel and so the motion is along a line.

2. We henceforth assume b 6= 0. Upon rotating coordinates if necessary,


we may assume that b is parallel to [0 0 1]T . Thus we covert to the two-
dimensional case that x = [x1 x2 ]T solves the (now two-dimensional) system
of ODE (1.58). Carrying out the differentiation in the term on the left hand
side of (1.58), we can now write (E-L) as
2
(1.59) x00 = ((x · x0 )x0 − x).
1 + |x|2

Let t = x0 be the unit tangent vector to the curve in the plane. Then
(1.60) t0 = κn
where n is the unit normal and κ ≥ 0 is the curvature. So {t, n} is an
orthonormal frame moving along the curve. If we differentiate the expression
t · n = 0, we see that
(1.61) n0 = −κt.

Since x = (x · t)t + (x · n)n, x0 = t and x00 = t0 = κn, we deduce from


(1.59) that
x·n
(1.62) κ = −2 .
1 + |x|2

3. We will now show that κ is constant. Let us calculate using (1.61)


that
 0
x · n + x · n0 (x · n)2x · x0

κ0 = −2 −
1 + |x|2 (1 + |x|2 )2
 
t · n − κx · t (x · n)2x · t
= −2 −
1 + |x|2 (1 + |x|2 )2
2(x · t)  2

= κ(1 + |x| ) + (x · n)2
(1 + |x|2 )2
= 0,
the last equality following from (1.62).
1.4. Applications 49

4. Finally we show that the trajectory lies on a circle if κ > 0. Define


1
c = x + n.
κ
Then
c0 = x0 + κ1 n0 = t − κ1 κt = 0,
and hence c ≡ c for some point c ∈ R2 . Furthermore,
2
(|x − c|2 )0 = 2(x − c) · x0 = − n · t = 0.
κ
Consequently the trajectory moves along the circle of radius κ−1 and center
c. 
PHYSICAL INTERPRETATION. In the 19th century, there was inter-
est in the optical properties of the eyes of fishes, and in particular the ques-
tion as to how such eyes, which are often quite flat, focus images. Maxwell
had found the geometric properties of extremals for the Langrangian (1.56),
and there was some thought that these may explain the optics of fish eyes.
In the 20th century R. Luneburg generalized Maxwell’s ideas to design
lenses with various interesting properties, made from transparent materials
with radially varying refractive index. (The refractive index of an optical
medium is n = vc , where c is the speed of light in a vacuum and v is the
speed within the medium.) 
Chapter 2

SECOND
VARIATION

2.1. Computing the second variation


We return to our standard calculus of variations problem of characterizing
minimizers y0 (·) of the functional
Z b
I[y(·)] = L(x, y, y 0 ) dx
a

over the admissible class

A = {y : [a, b] → R | y(a) = y 0 , y(b) = y 1 }.

We have so far examined in great detail the Euler-Lagrange equation, the


derivation of which corresponds to taking the first variation. This chapter
turns attention to the second variation.

2.1.1. Integral and pointwise versions.

THEOREM 2.1.1. Suppose y0 (·) ∈ A is a minimizer.


(i) Then
b
∂2L
Z
(2.1) (x, y0 , y00 )(w0 )2
a ∂z 2
∂2L ∂2L
+2 (x, y0 , y00 )ww0 + (x, y0 , y00 )w2 dx ≥ 0
∂y∂z ∂y 2
for all w : [a, b] → R with w(a) = w(b) = 0.

51
52 2. SECOND VARIATION

(ii) Furthermore,
∂2L
(2.2) (x, y0 (x), y00 (x)) ≥ 0 (a ≤ x ≤ b).
∂z 2

REMARKS.
(i) We call the left hand side of (2.1) the second variation of I[ · ]
about y0 (·), evaluated at w(·).
(ii) If the mapping
(2.3) z 7→ L(x, y0 (x), z) is convex,
then (2.2) holds. This observation strongly suggests that the convexity of
the Lagrangian L in the variable z will be a useful hypothesis if we try to
find minimizers: see Section 2.4.2. 

Proof. 1. We extend our earlier first variation proof of the Euler-Lagrange


equation. Select w : [a, b] → R, with w(a) = w(b) = 0, and define yτ (x) =
y0 (x) + τ w(x) for −1 ≤ τ ≤ 1. Then yτ (·) ∈ A, and so τ 7→ i(τ ) has a
minimum at τ = 0 on the interval −1 ≤ τ ≤ 1. Therefore
d2 i
(0) ≥ 0.
dτ 2

2. Differentiating twice with respect to τ , we find


Z b 2
d2 i ∂ L
2
(τ ) = 2
(x, y0 + τ w, y00 + τ w0 )w2
dτ a ∂y
∂2L ∂2L
+2 (x, y0 + τ w, y00 + τ w0 )ww0 + (x, y0 + τ w, y00 + τ w0 )(w0 )2 dx.
∂y∂z ∂z 2
d2 i
Put τ = 0 and recall dτ 2
(0) ≥ 0, to prove (2.1).

3. We need to design appropriate functions w to extract pointwise in-


formation from (2.1). For this, define φ : R → R by setting
(
x if 0 ≤ x ≤ 1
φ(x) =
2−x if 1 ≤ x ≤ 2
on the interval [0, 2] and then extend φ to be 2-periodic on all of R. Thus φ
is a “sawtooth function” with corners at the integers and φ0 = ±1 elsewhere.
Next let ζ : [a.b] → R be any continuously differentiable function with
ζ(a) = ζ(b) = 0. Then for each ε > 0, define
wε (x) = εφ( xε )ζ(x).
2.1. Computing the second variation 53

Then for all but finitely many points, wε is differentiable:


wε0 (x) = φ0 ( xε )ζ(x) + εφ( xε )ζ 0 (x).
Note that the second term on the right is less than or equal to Aε for some
appropriate constant A.
We now plug in wε in place of w in (2.1). Then upon making some
simple estimates, we learn that
Z b 2
∂ L
2
(x, y0 , y00 )(φ0 ( xε ))2 ζ 2 dx + Dε ≥ 0,
a ∂z
for some constant D. But
2
φ0 ( xε ) =1
except at finitely many points (which have no effect on the integral). There-
fore upon sending ε → 0, we deduce that
Z b 2
∂ L
2
(x, y0 , y00 )ζ 2 dx ≥ 0
a ∂z

for all functions ζ as above. This implies (2.2). 

2.1.2. Weierstrass condition.


In this section we strengthen (2.2):

THEOREM 2.1.2. Suppose y0 (·) ∈ A is a minimizer that is continuously


differentiable.
Then for all z ∈ R and all points a ≤ x ≤ b,
∂L
(2.4) L(x, y0 (x), z) ≥ L(x, y0 (x), y00 (x)) + (x, y0 (x), y00 (x))(z − y00 (x)).
∂z

This is the Weierstrass condition for a minimizer.

GEOMETRIC INTERPRETATION. This inequality says that for fixed


x and y0 (x), the graph of the function
z 7→ L(x, y0 (x), z)
lies above the tangent line at the point z = y00 (x). This of course is clear if
z 7→ L(x, y0 (x), z) is convex.
Notice that (2.4) does not follow from any sort of local second varia-
tion argument, since z need not be close to y00 (x). Rather, the Weierstrass
condition is a consequence of the global minimality of y0 (·). 
54 2. SECOND VARIATION

Proof. To simplify the exposition, we will assume for simplicity that L =


L(x, z) does not depend upon y.
1. Select any z ∈ R and 0 < δ < 1. Choose any point a < x0 < b and
select ε > 0 so small that the interval [x0 − δε, x0 + ε] lies within [a, b]. Now
define 


 y0 (x) (a ≤ x ≤ x0 − δε)

l (x)
1 (x0 − δε ≤ x ≤ x0 )
y(x) =


 l2 (x) (x0 ≤ x ≤ x0 + ε)

y (x)
0 (x0 + ε ≤ x ≤ b),
where l1 , l2 are linear functions selected so that y(·) is continuous and
(
l10 (x) = z (x0 − δε ≤ x ≤ x0 )
y0 (x0 +ε)−y0 (x0 −δε)
l20 (x) = ε − δz (x0 ≤ x ≤ x0 + ε)

Then y(·) ∈ A and thus


I[y0 (·)] ≤ I[y(·)]
Since y(·) = y0 (·) outside the interval [x0 − δε, x0 + ε], it follows that
Z x0 +ε Z x0 +ε
0
L(x, y0 ) dx ≤ L(x, y 0 ) dx
x0 −δε x0 −δε
Z x0 Z x0 +ε  
0 (x0 −δε)
= L(x, z) dx + L x, y0 (x0 +ε)−y
ε − δz dx.
x0 −δε x0
Then
Z x0 Z x0
1 1
(2.5) L(x, y00 ) dx ≤ L(x, z) dx
δε x0 −δε δε x0 −δε
Z x0 +ε 
1 0 (x0 −δε)

+ L x, y0 (x0 +ε)−y
ε − δz − L(x, y00 ) dx.
δε x0

2. We must examine carefully the integral on the right. Now


 
0 (x0 −δε)
(2.6) L x, y0 (x0 +ε)−y
ε − δz − L(x, y00 ) = ∂L 0
∂z (x, y0 )aε + rε ,

for
y0 (x0 + ε) − y0 (x0 − δε)
aε = − δz − y00 (x),
ε
where the remainder term rε satisfies the estimate
(2.7) |rε | ≤ Aa2ε .
According to L’Hospital’s Rule,
(2.8) lim aε = (1 + δ)y00 (x0 ) − δz − y00 (x).
ε→0
2.2. Positive second variation 55

Then (2.7) implies


(2.9) lim sup |rε | ≤ B(δ 2 + (y00 (x0 ) − y00 (x))2 ).
ε→0

for a suitable constant B.

3. Now send ε → 0 in (2.5), using (2.6),(2.8) and (2.9):


L(x0 , y00 ) ≤ L(x0 , z) + ∂L 0 0
∂z (x0 , y0 )(y0 (x0 ) − z) + Bδ.
Finally, let δ → 0, to deduce
L(x0 , y00 ) + ∂L 0
∂z (x0 , y0 )(z − y00 (x0 )) ≤ L(x0 , z);
this is (2.4) for L = L(x, z). 

2.2. Positive second variation


Suppose now that y(·) ∈ A is an extremal, and consequently solves the
Euler-Lagrange equation
 0
∂L 0 ∂L
(E-L) − (x, y, y ) + (x, y, y 0 ) = 0 (a ≤ x ≤ b).
∂z ∂y
We devote the rest of this chapter to the question:
When is y(·) in fact a local minimizer of I[ · ]?

2.2.1. Riccati equation.

NOTATION. Let us write


∂2L 0 ∂2L 0 ∂2L
(2.10) A= (x, y, y ), B = (x, y, y ), C = (x, y, y 0 ).
∂z 2 ∂y∂z ∂y 2

Therefore the second variation of I[ · ] about y(·) is


Z b
(2.11) A(w0 )2 + 2Bww0 + Cw2 dx
a

for w : [a, b] → R with w(a) = w(b) = 0. 

We assume hereafter that


(2.12) A>0 on [a, b].

As a first approach towards understanding local minimizers, we intro-


duce a new nonlinear equation:
56 2. SECOND VARIATION

DEFINITION. The Riccati equation associated with (E-L) and the


extremal y(·) is the first-order ODE

(q − B)2
(R) q0 = − +C (a ≤ x ≤ b),
A
for the functions A, B, C defined by (2.10). 

A remarkable observation is that if (R) has a solution, then the second


variation is positive:
THEOREM 2.2.1. Assume that there exists a solution q(·) of the Riccati
equation (R).
Then the second variation of I[ · ] around y(·) is positive:
Z b
(2.13) A(w0 )2 + 2Bww0 + Cw2 dx > 0
a
for all nonzero w : [a, b] → R with w(a) = w(b) = 0.

Proof. We calculate that


Z b
A(w0 )2 +2Bww0 + Cw2 dx
a
Z b  
(q−B)2
= A(w0 )2 + 2Bww0 + q 0 + A w2 dx
a
Z b
(q−B)2 2
= A(w0 )2 − 2(q − B)ww0 + A w dx
a
Z b  2
= A w0 − q−B
A w dt ≥ 0.
a

Furthermore, if w0 − q−B
A w = 0 on the interval [a, b] and w(a) = 0, then
uniqueness of the solution of this ODE imples w = 0. 

2.2.2. Conjugate points.


We show next that we can construct a solution to (R) if we can find a
positive solution of a related linear ODE.

DEFINITIONS. (i) The linearization of the Euler-Lagrange equa-


tion (E-L) about the extremal y(·) is the linear operator

(2.14) Ju = −(Au0 + Bu)0 + Bu0 + Cu,


defined for twice continuously differentiable functions u : [a, b] → R.
2.2. Positive second variation 57

(ii) We call the ODE


Ju = 0 (a ≤ x ≤ b)
Jacobi’s equation. 

We can construct solutions of the Riccati equation from positive solutions


of Jacobi’s equation:
THEOREM 2.2.2. Suppose that
(
Ju = 0 (a ≤ x ≤ b)
(2.15)
u>0 (a ≤ x ≤ b).
Then
u0
(2.16) q=A +B
u
solves the Riccati equation (R).

Proof. We calculate
0 = −(Au0 + Bu)0 + Bu0 + Cu
= −(qu)0 + Bu0 + Cu
= −q 0 u + (B − q)u0 + Cu
(q − B)2
= −q 0 u − u + Cu.
A
Divide by u > 0 to derive (R). 

In light of the previous theorem, we need to understand when we can


find a positive solution of the Jacobi equation on a given interval [a, b]. This
depends upon whether or not the interval contains conjugate points:
DEFINITION. Select a ∈ R. We say that a point c > a is a conjugate
point with respect to a if there exists a function u : [a, c] → R such that

Ju = 0
 (a ≤ x ≤ c)
(2.17) u(a) = u(c) = 0

u 6= 0.

THEOREM 2.2.3. If there are no conjugate points within the interval


(a, b], then there exists a solution q(·) of the Riccati equation (R).
Consequently, the second variation of I[ · ] about y(·) is positive.
58 2. SECOND VARIATION

Proof. 1. It can be shown that if there are no conjugate points with respect
to a within (a, b], then in fact for some nearby point d < a there are no
conjugate points with respect to d within (d, b]. (The proof of this relies
upon some mathematical ideas a bit beyond the scope of these notes.)

2. Consider then the initial-value problem


(
Ju = 0 (d ≤ x ≤ b)
(2.18)
u(d) = 0, u0 (d) = 1.
Standard ODE theory tells us that this problem has a unique solution. We
observe furthermore that
u(x) > 0 (d < x ≤ d + ε)
for some sufficiently small ε > 0, since u0 (d) = 1.
We claim that in fact
(2.19) u(x) > 0 (d < x ≤ b).
To see this, assume instead that d < c ≤ b is a point where u(c) = 0. Then
c would be a conjugate point with respect to d, and this is not possible.
Since by continuity u is strictly positive on the interval [a, b], it follows
0
from Theorem 2.2.2 that q = A uu + B solves (R). Theorem 2.2.1 now implies
that the second variation is positive. 

2.3. Strong local minimizers


We now build upon the ideas just developed. We next show that the absence
of conjugate points implies not only that the second variation is positive, but
also that y(·) is a local minimizer of I[ · ], in fact a strong local minimizer.
DEFINITION. We say y : [a, b] → R is a strong local minimizer of
I[ · ] if there exists δ > 0 such that
(2.20) max |y − ȳ| ≤ δ
[a,b]

implies
(2.21) I[y(·)] ≤ I[ȳ(·)]
for all ȳ : [a, b] → R satisfying
ȳ(a) = y(a), ȳ(b) = y(b).
REMARK. We call such a local minimizer strong since we require only
that (2.20) be valid, and not necessarily also that
(2.22) max |y 0 − ȳ 0 | ≤ δ.
[a,b]
2.3. Strong local minimizers 59

a b
y and ȳ are close, but y0 and ȳ 0 are not

In particular (2.20) does not imply that L(x, ȳ, ȳ 0 ) is close to L(x, y, y 0 )
pointwise. 

2.3.1. Fields of extremals.


We begin with the assertion that y(·) is a strong local minimizer, pro-
vided we can embed it within a field of extremals. This means that we can
find a family of other solutions of (E-L), depending upon a parameter c,
that surround y(·):
DEFINITION. We say that w : [a, b] × [−1, 1] → R, w = w(x, c), is a field
of extremals about y(·) provided:
(i) For each −1 ≤ c ≤ 1,
 0
∂L 0 ∂L
(2.23) − (x, w, w ) + (x, w, w0 ) = 0 (a ≤ x ≤ b);
∂z ∂y
(ii)
(2.24) y(x) = w(x, 0) (a ≤ x ≤ b);

and
(iii)
∂w
(2.25) (x, 0) > 0 (a ≤ x ≤ b).
∂c
NOTATION. In (2.23) and below we write
∂w
w0 (x, c) = (x, c).
∂x

60 2. SECOND VARIATION

REMARK. The importance of condition (2.25) is that it ensures that the


graphs of the functions
{w(·, c) | |c| ≤ ε}
do not intersect on the interval [a, b], if ε > 0 is sufficiently small. Therefore
each point close enough to the graph of y(·) lies on the graph of precisely
one of the functions {w(·, c) | |c| ≤ ε}. 

THEOREM 2.3.1. Suppose y(·) is an extremal that lies within a field of


extremals w, as above. Assume also that
(2.26) z 7→ L(x, y, z) is convex (a ≤ x ≤ b, y ∈ R).

Then y(·) is a strong local minimizer of I[ · ].

a b
The graph of y is red, the graph of ȳ is blue and the graphs of the other
extremals are black

Proof. 1. Let ȳ : [a, b] → R satisfy ȳ(a) = y(a), ȳ(b) = y(b) and


max |y − ȳ| ≤ δ
[a,b]

for some small number δ > 0.


As noted in the Remark above, the Implicit Function Theorem implies
in view of (2.25) that we can uniquely write
(2.27) ȳ(x) = w(x, c(x)) (a ≤ x ≤ b)
for a function c : [a, b] → (−1, 1) provided δ > 0 is small enough. Further-
more, (2.24) forces
(2.28) c(a) = c(b) = 0.
2.3. Strong local minimizers 61

2. According to (2.27) we have


∂w
ȳ 0 (x) = w0 (x, c(x)) +
(x, c(x))c0 (x).
∂c
Therefore the convexity condition (2.26) implies
∂L ∂w 0
L(x, ȳ, ȳ 0 ) ≥ L(x, ȳ, w0 ) c=c(x)
+ (x, ȳ, w0 ) c c=c(x)
(2.29) ∂z ∂c
0
= L(x, y, y ) + E,
where
∂L ∂w 0
E = L(x, w, w0 ) c=c(x)
− L(x, y, y 0 ) + (x, w, w0 ) c c=c(x)
.
∂z ∂c

3. We now show that E is the derivative of a function F that vanishes


at x = a, b. To see this, we compute using (2.23) that
Z c(x)
∂ ∂L ∂w 0
E= L(x, w(x, c), w0 (x, c)) dc + (x, w, w0 ) c
0 ∂c ∂z ∂c c=c(x)
∂L ∂w ∂L ∂w0
Z c(x)
∂L ∂w 0
= + dc + (x, w, w0 ) c
0 ∂y ∂c ∂z ∂c ∂z ∂c c=c(x)
Z c(x)  0
∂L ∂w ∂L ∂w0 ∂L ∂w 0
= + dc + (x, w, w0 ) c
0 ∂z ∂c ∂z ∂c ∂z ∂c c=c(x)
= F0
for Z c(x)
∂L ∂w
F (x) = (x, w, w0 ) dc.
0 ∂z ∂c
We used here the calculus formula
Z g(x) ! Z
g(x)
d ∂f
f (x, t) dt = (x, t) dt + f (x, g(x))g 0 (x).
dx a a ∂x
Observe also that
(2.30) F (b) = F (a) = 0,
since c(a) = c(b) = 0, according to (2.28).

4. It now follows from (2.29) and (2.30) that


Z b Z b Z b
0 0 0
L(x, ȳ, ȳ ) dx ≥ L(x, y, y ) + F dx = L(x, y, y 0 ) dx.
a a a
Therefore
I[y(·)] ≤ I[ȳ(·)].

62 2. SECOND VARIATION

REMARK. This direct proof, which uses ideas from Ball-Murat [B-M],
is more straightforward than conventional arguments involving Hilbert’s in-
variant integral (as explained, for instance, in Kot [K]).
Clarke and Zeidan present in [C-Z] a very interesting alternative ap-
proach, using the Riccati equation (R) to build an appropriate calibration
function. 

2.3.2. More on conjugate points.


We now need to understand when we can embed a given extremal within
a field of extremals. A clue comes from our earlier discussion of the Riccati
equation and conjugate points.

d a b

The graph of y is red and the graphs of the other extremals are blue

THEOREM 2.3.2. Let y : [a, b] → R be an extremal, and suppose that


there are no conjugate points with respect to a within the interval [a, b].
(i) Then we can embed y(·) within a field w of extremals.
(ii) Consequently if z 7→ L(x, y, z) is convex, then y(·) is a strong local
minimizer of I[ · ].

Proof. 1. First, as noted in the proof of Theorem 2.2.3, for some point
d < a there are no conjugate points with respect to d within (d, b]. By
solving the Euler-Lagrange ODE we can extend y to be a solution defined
on the larger interval [d, b].
Now solve the initial-value problems
0 0 0
 ∂L
 ∂L
− ∂z (x, w, w ) + ∂y (x, w, w ) = 0
 (d ≤ x ≤ b)
(2.31) w(d, c) = y(d)

 0
w (d, c) = y 0 (d) + c
2.4. Existence of minimizers 63

for the functions w = w(x, c). By uniqueness of the solution to an initial-


value problem, we have y(x) = w(x, 0).

2. Define
∂w
(2.32) u(x) = (x, 0).
∂c
We now differentiate the ODE in (2.31) with respect to the variable c. This
gives the identity
0
∂ L ∂w ∂ 2 L ∂w0 ∂ 2 L ∂w0
 2
∂ 2 L ∂w
0=− + + + .
∂z∂y ∂c ∂z 2 ∂c ∂y 2 ∂c ∂y∂z ∂c
We now put c = 0 and recall the definitions (2.10) and (2.32):

0 = −(Bu + Au0 )0 + Cu + Bu0 = Ju.

Upon differentiating as well the initial conditions in (2.31), we see that u


solves the initial-value problem
(
Ju = 0 (d ≤ x ≤ b)
(2.33)
u(d) = 0, u0 (d) = 1.

3. As shown in the proof of Theorem 2.2.3, the hypothesis of no conju-


gate points implies that u = ∂w∂c (·, 0) is strictly positive on the interval [a, b].
Therefore w : [a, b] × [−1, 1] → R is a field of extremals.
Now recall Theorem 2.3.1 to deduce that y(·) is a strong local minimizer,
provided we suppose also that z 7→ L is convex. 

REMARK. We are able to transform (2.31) into (2.33) by differentiating


in the parameter c, since the Jacobi differential operator

Ju = −(Au0 + Bu)0 + Bu0 + Cu

is the linearization of the Euler-Lagrange equation about y(·). 

2.4. Existence of minimizers


2.4.1. Examples.
We next build upon our insights from second variation calculations to
discuss the existence of minimizers.
We have thus far simply assumed for various variational problems that
minimizers exist, but in fact this issue can be quite complicated. Even quite
simple looking problems may have no solution:
64 2. SECOND VARIATION

EXAMPLE. For instance, the functional


Z 1
I[y(·)] = y 2 − (y 0 )2 dx
0
does not have a minimizer over the admissible set
A = {y : [a, b] → R | y(0) = 0, y(1) = 0}.
This is clear from the second variation condition (2.2), which definitely fails
for all functions y0 (·) ∈ A since
∂2L
L = y2 − z2, < 0.
∂z 2
In fact, we have
inf I[y(·)] = −∞.
A
To prove this, define for each integer k
yk (x) = sin(kπx) ∈ A
and observe that I[yk (·)] → −∞ as k → ∞. 

EXAMPLE. As an even easier example of nonexistence of minimizers con-


sider Z 1
I[y(·)] = y dx.
0
It is not hard to see that
(2.34) inf I[y(·)] = −∞
y(·)∈A

for the admissible set A as above. 

EXAMPLE. This example illustrates a different mechanism for the failure


of minimizers to exist. We investigate
Z 1
(2.35) I[y(·)] = x2 (y 0 )2 dx,
0
over the admissible set A = {y : [0, 1] → R | y(0) = 1, y(1) = 0}.
Define (
0 ( k1 ≤ x ≤ 1)
yk (x) =
1 − kx (0 ≤ x ≤ k1 ).
Then
Z 1
k 1
I[yk (·)] = x2 k 2 dx = →0 as k → ∞,
0 3k
2.4. Existence of minimizers 65

and consequently
inf I[y(·)] = 0.
y(·)∈A
R1
But the minimum is not attained, since 0 x2 (y 0 )2 dx > 0 for all y(·) ∈ A. 

EXAMPLE. We next show that the problem of minimizing


Z 1
(2.36) I[y(·)] = − xyy 0 dx
0

over A = {y : [0, 1] → R | y(0) = 0, y(1) = b} has a solution only if b = 0.


To see this, we first integrate by parts:
1 1 b2 1 1 2
Z Z
2 0
I[y(·)] = − x(y ) dx = − + y dx.
2 0 2 2 0
R1
For any ε > 0 we can find a function y(·) ∈ A such that 0 y 2 dx < ε. Hence
b2
inf I[y(·)] = − .
y(·)∈A 2
R1
But the minimum is not attained, because 0 y 2 dx > 0 for all y ∈ A (unless
b = 0). 

2.4.2. Convexity and minimizers.


The examples in the previous section illustrate that in general variational
problems need not have solutions. And even when we can show the existence
of minimizers, this theory depends upon the advanced mathematical topics
of measure theory and functional analysis, which are beyond the scope of
these notes. Instead, in this section we provide a brief discussion about the
key assumption for the existence theory for minimizers and mention some
typical theorems.
The primary requirement, strongly motivated by our earlier second vari-
ation necessary conditions (2.2) and (2.4), is that
(2.37) the mapping z 7→ L(x, y, z) is convex
for each (x, y).
This is the fundamental hypothesis for most existence theorems for mini-
mizers in the calculus of variations. Indeed, a basic existence theorem states
that if L satisfies (2.37) and also certain growth conditions, then there ex-
ists at least one minimizer y0 (·) within some appropriate class of admissible
functions A.
66 2. SECOND VARIATION

REMARK. Modern existence theory in fact separates the questions of (i)


the existence of minimizers and (ii) their regularity (or smoothness) proper-
ties. The basic strategy is expand the admissible class A, to include various
functions that are less smooth than continuous and piecewise continuously
differentiable (as we have supposed throughout these notes). The precise
definition for this expanded admissible class is rather subtle, but the idea is
that the more functions we accept as belonging to A, the easier it will be to
find a minimizer.
And once we have resolved the problem (i) of existence of a minimizer,
we can then ask how smooth it is. A typical theorem concerning (ii) requires
that L be smooth and satisfies the strengthened convexity condition that
∂2L
(2.38) (x, y, z) ≥ γ > 0
∂z 2
for some constant γ. The regularity assertion is then that a minimizer y0 (·)
will be a smooth function of the variable x. 
Chapter 3

MULTIVARIABLE
VARIATIONAL
PROBLEMS

3.1. Multivariable calculus of variations


In this chapter we extend the theory to variational problems for functions
of more than one variable. There will be few new mathematical ideas, and
so the only difficulties will be notational.
NOTATION. In the following U will always denote an open subset of Rn ,
with smooth boundary ∂U . Its closure is Ū = U ∪ ∂U . 
DEFINITION. Assume g : ∂U → R is a given function. The correspond-
ing set of admissible functions is
A = {u : Ū → R | u(·) is continuously differentiable, u = g on ∂U }.
NOTATION. We will often write “u(·)” to emphasize that u : Ū → R is a
function. The gradient of a function u ∈ A is
 ∂u 
∂x1
∇u =  ...  .
 
∂u
∂xn


Assume that in addition we are given a function L : U × R × Rn → R,


L = L(x, y, z),

67
68 3. MULTIVARIABLE VARIATIONAL PROBLEMS

called the Lagrangian. So L is a function of the 2n + 1 real variables


(x, y, z) = (x1 , . . . , xn , y, z1 , . . . , zn ).

DEFINITION. If u(·) ∈ A, we define


Z
I[u(·)] = L(x, u(x), ∇u(x)) dx.
U

Note that we insert the number u(x) into the y-variable slot of L(x, y, z),
and the vector ∇u(x) into the z-variable slot of L(x, y, z).

The basic problem of the multivariable calculus of variations is


to characterize functions u0 (·) ∈ A that satisfy

(COV) I[u0 (·)] = min I[u(·)].


u(·)∈A

EXAMPLE (Dirichlet energy). As a first example, take


n
|z|2 1X 2
L(x, y, z) = = zk .
2 2
k=1

Then
Z
1
I[u(·)] = |∇u(x)|2 dx,
2 U

an expression called the Dirichlet energy of u. What function u0 (·) mini-


mizes this energy among all other candidates satisfying the given boundary
conditions that u = g on ∂U ? 

EXAMPLE (Minimal surfaces). For a geometric example, we consider


now
L(x, y, z) = (1 + |z|2 )1/2 .
Then
Z
1/2
I[u(·)] = 1 + |∇u(x)|2 dx
U
= surface area of the graph of u.

Among all candidates that equal g on the boundary of U , which function


u0 (·) gives the surface having the least area? 
3.2. First variation 69

3.2. First variation


3.2.1. Euler-Lagrange equation.
We show next that a minimizer u0 (·) ∈ A of our multivariable vari-
ational problem automatically solves a certain partial differential equation
(PDE).

THEOREM 3.2.1. Assume u0 (·) ∈ A is twice continuously differentiable


and solves (COV).
Then u0 satisfies the nonlinear PDE
n  
X ∂ ∂L ∂L
(3.1) − (x, u0 (x), ∇u0 (x)) + (x, u0 (x), ∇u0 (x)) = 0
∂xk ∂zk ∂y
k=1
within the region U .

REMARKS.
(i) The Euler-Lagrange equation corresponding to L is the PDE
n  
X ∂ ∂L ∂L
(E-L) − (x, u, ∇u) + (x, u, ∇u) = 0.
∂xk ∂zk ∂y
k=1

Solutions u of the Euler-Lagrange equation are extremals (or critical


points or stationary points) of I[ · ]. Consequently, a minimizer u0 (·)
is an extremal, subject to the boundary conditions that u = g on ∂U .
(ii) Using vector calculus notation, we can also write (E-L) as
∂L
− div (∇z L(x, u, ∇u)) + (x, u, ∇u) = 0.
∂y
Here “div” stands for divergence. 

WARNING ABOUT NOTATION. Most books write the (E-L) PDE


as
n  
X ∂ ∂L ∂L
− (x, u, ∇u) + (x, u, ∇u) = 0
∂xk ∂uxk ∂u
k=1
or, worse, as
n
!
X ∂ ∂L ∂L
− ∂u
(x, u, ∇u) + (x, u, ∇u) = 0.
∂xk ∂( ∂x ) ∂u
k=1 k

This is spectacularly bad notation: the Lagrangian L = L(x, y, z) is a


function of the 2n + 1 real variables (x, y, z) = (x1 , . . . , xn , y, z1 , . . . , zn ). So
the expressions “ ∂L ∂L
∂u ” and “ ∂ux ” have no meaning. 
k
70 3. MULTIVARIABLE VARIATIONAL PROBLEMS

|z|2
EXAMPLE. Let L(x, y, z) = 2 . Then
∂L ∂L
= 0, = zk .
∂y ∂zk
Consequently the Euler-Lagrange equation is
n  
X ∂ ∂u
= 0.
∂xk ∂xk
k=1

If we define the Laplacian


n
X ∂2u
(3.2) ∆u = ,
∂x2k
k=1

then the Euler-Lagrange PDE is Laplace’s equation


∆u = 0 within U .
A function solving Laplace’s equation is called harmonic. Thus a minimizer
of Dirichlet’s energy functional is a harmonic function. 

WARNING ABOUT NOTATION. Many physics and engineering texts


use the symbol “∇2 ” for the Laplacian. In these and the Math 170 notes, we
instead employ ∇2 to denote the matrix of second derivatives: see Section
C in the Appendix. 

EXAMPLE. The Euler-Lagrange equation for the functional


Z
1
|∇u|2 − uf dx
U 2
is Poisson’s equation
−∆u = f in U .
The Euler-Lagrange equation for the functional
Z
1
|∇u|2 − F (u) dx
U 2
is the nonlinear Poisson equation
−∆u = f (u) in U ,
where f = F 0 . 

EXAMPLE. Assume L = (1 + |z|2 )1/2 . Then


∂L ∂L zk
= 0, = .
∂y ∂zk (1 + |z|2 )1/2
3.2. First variation 71

Consequently the Euler-Lagrange equation is


n ∂u
! !
∇u X ∂ ∂xk
div 1 = = 0.
(1 + |∇u|2 ) 2 ∂xk (1 + |∇u|2 ) 12
k=1

This is the minimal surface equation.

GEOMETRIC INTERPRETATION. The expression


!
∇u
H = div 1
(1 + |∇u|2 ) 2

has a geometric meaning; it is the mean curvature of the graph of u at


the point (x, u(x)). Therefore minimal surfaces have zero mean curvature
everywhere. 

EXAMPLE. If b : U → Rn and f : R → R, the expression


L = z · b(x)f (y) + F (y) div b,
where F 0 = f , is a null Lagrangian. Recall from page 8 that this means
every function automatically solves the associated Euler-Lagrange PDE.
Indeed, for any function u : U → R we have
∂L
− div (∇z L) + = − div (bf (u)) + ∇u · bf 0 (u) + f (u) div b = 0.
∂y


EXAMPLE. Let us for this example write points in Rn+1 as (x, t) with
x ∈ Rn denoting position and t ≥ 0 denoting time. We consider functions
u = u(x, t).
The Euler-Lagrange equation for the functional
Z Z  2
1 T ∂u
I[u] = − |∇x u|2 dxdt
2 0 Rn ∂t
is the wave equation
∂2u
− ∆u = 0,
∂t2
where the Laplacian ∆, defined as in (3.2), acts in the x-variables. However,
it is easy to see that the functional I[ · ] is unbounded from below on any
reasonable admissible class of functions. Consequently, solutions of the wave
equation correspond to extremals that are not minimizers. 
72 3. MULTIVARIABLE VARIATIONAL PROBLEMS

3.2.2. Derivation of Euler-Lagrange PDE.


NOTATION. We write  
ν1
ν =  ...  .
 

νn
for the outward pointing unit normal vector field along ∂U . 
LEMMA 3.2.1.
(i) For each k = 1, . . . , n we have the multivariable integration-by-
parts formula
Z Z Z
∂f ∂g
(3.3) g dx = − f dx + f gν k dS
U ∂xk U ∂xk ∂U
for k = 1, . . . , n.
(ii) If f : U → R is continuous and
Z
f w dx = 0 for all continuous w : Ū → R
U
with w = 0 on ∂U , then f ≡ 0 within U .

Proof. 1.The Divergence Theorem says that


Z Z
div h dx = h · ν dS.
U ∂U

Apply this to the vector field h = [0, . . . , 0, f g, 0, . . . , 0]T with the nonzero
term f g in the k-th slot.

2. Let φ : Ū → R be positive within U and equal to zero on ∂U . Let


w(x) = φ(x)f (x) above, to find
Z
φf 2 dx = 0.
U
Hence φ(x)f 2 (x)
= 0 for all x ∈ U as the integrand is positive. Then since
φ(x) > 0, we conclude that f (x) = 0 for all x ∈ U . 

Derivation of Euler-Lagrange equation:


1. Let w : Ū → R satisfy w = 0 on ∂U . Assume −1 ≤ τ ≤ 1 and define
uτ (x) = u0 (x) + τ w(x) (x ∈ U ).
Note uτ (·) ∈ A, since w = 0 on ∂U .
Thus
I[u0 (·)] ≤ I[uτ (·)],
3.3. Second variation 73

since u0 is the minimizer of I. Define i(τ ) = I[uτ (·)]. Then


i(0) ≤ i(τ ).
So τ 7→ i(τ ) has a minimum at τ = 0 on the interval −1 ≤ τ ≤ 1, and
therefore
di
(0) = 0.

2. Now
Z Z
i(τ ) = L(x, uτ , ∇uτ ) dx = L(x, u0 + τ w, ∇u0 + τ ∇w) dx.
U U
Therefore
Z
di ∂
(τ ) = L(x, u0 + τ w, ∇u0 + τ ∇w) dx.
dτ ∂τ
ZU
∂L
= (x, u0 + τ w, ∇u0 + τ ∇w)w
U ∂y
n
X ∂L ∂w
+ (x, u0 + τ w, ∇u0 + τ ∇w) dx.
∂zk ∂xk
k=1
Put τ = 0:
Z n
di ∂L X ∂L ∂w
0= (0) = (x, u0 , ∇u0 )w + (x, u0 , ∇u0 ) dx.
dτ U ∂y ∂zk ∂xk
k=1
We now integrate by parts, to deduce that
Z " n  #
∂L X ∂ ∂L
(x, u0 , ∇u0 ) − (x, u0 , ∇u0 ) w dx = 0.
U ∂y ∂xk ∂zk
k=1
This is valid for all functions w such that w = 0 on ∂U . The lemma before
now implies that the (E-L) PDE holds everywhere in U . 

3.3. Second variation


THEOREM 3.3.1. Suppose u0 (·) ∈ A is a minimizer and u0 is continu-
ously differentiable.
(i) Then
n
∂2L
Z X
∂w ∂w
(3.4) (x, u0 , ∇u0 )
U k,l=1 ∂zk ∂zl ∂xk ∂xl
n
X ∂2L ∂w ∂2L
+2 (x, u0 , ∇u0 ) w+ (x, u0 , ∇u0 )w2 dx ≥ 0
∂zk ∂y ∂xk ∂y 2
k=1

for all w : Ū → R with w = 0 on ∂U .


74 3. MULTIVARIABLE VARIATIONAL PROBLEMS

(ii) Furthermore,

(3.5) ∇2z L(x, u0 (x), ∇u0 (x))  0 (x ∈ U ).

DEFINITION. The expression on the left hand side of (3.4) the second
variation of I[ · ] about u0 (·), evaluated at w(·). 

REMARK. The inequality (3.5) means that


n
X ∂2L
(x, u0 , ∇u0 )yk yl ≥ 0 (y ∈ Rn ).
∂zk ∂zl
k,l=1

If the mapping
(3.6) z 7→ L(x, u0 (x), z) is convex,
then (3.5) holds automatically. 

Proof. 1. Select w : Ū → R, with w = 0 on ∂U , and define uτ (x) =


u0 (x) + τ w(x) for −1 ≤ τ ≤ 1. Then uτ (·) ∈ A, and so τ 7→ i(τ ) has a
minimum at τ = 0 and thus
d2 i
(0) ≥ 0.
dτ 2

2. Differentiating twice with respect to τ , we find


d2 i ∂2L
Z
(τ ) = (x, u0 + τ w, ∇u0 + τ ∇w)w2
dτ 2 U ∂y
2
n
X ∂2L ∂w
+2 (x, u0 + τ w, ∇u0 + τ ∇w)w
∂y∂zk ∂xk
k=1
n
X ∂2L ∂w ∂w
+ (x, u0 + τ w, ∇u0 + τ ∇w) dx.
∂zk ∂zl ∂xk ∂xl
k,l=1

d2 i
Put τ = 0 and recall dτ 2
(0) ≥ 0, to prove (3.4).

3. As in the second variation calculation for the scalar case, we define


(
x if 0 ≤ x ≤ 1
φ(x) =
2−x if 1 ≤ x ≤ 2
and extend φ to be 2-periodic on R. Thus φ is a sawtooth function and
φ0 = ±1 except at the integers.
3.4. Extensions and generalizations 75

Next let ζ : U → R be any continuously differentiable function with


ζ = 0 on ∂U . Finally, choose any y ∈ Rn and write
wε (x) = εφ( x·y
ε )ζ(x).

for each ε > 0. Then wε is differentiable except along finitely many hyper-
planes that intersect U , with
∇wε (x) = φ0 ( x·y x·y
ε )ζ(x)y + εφ( ε )∇ζ(x).

Note that the second term on the right is less than or equal to Aε for some
appropriate constant A.
Insert wε in place of w in (3.4). It follows that
n
∂2L
Z X
2
(x, u0 , ∇u0 )yk yl ζ 2 φ0 ( x·y
ε ) dx + Bε ≥ 0
U ∂zk ∂zl
k,l=1

for some constant B. But


2
φ0 ( x·y
ε ) =1
except along finitely many hyperplanes (which have no effect on the integral).
Therefore upon sending ε → 0, we deduce that
n
∂2L
Z X
(x, u0 , ∇u0 )yk yl ζ 2 dx ≥ 0
U ∂z k ∂zl
k,l=1

for all functions ζ as above. This implies (3.5). 

3.4. Extensions and generalizations


3.4.1. Other boundary conditions.
If we change the admissible class of functions and/or if we change the
energy functional I[ · ], new effects can appear.
For example, suppose we are additionally given a function B : ∂U ×R →
R, B = B(x, y). We then for this section redefine the energy by adding a
boundary integral term:
Z Z
(3.7) I[u(·)] = L(x, u, ∇u) dx + B(x, u) dS.
U ∂U

We also redefine
(3.8) A = {u : Ū → R | u is continuously differentiable},
so as now to require nothing about the boundary behavior of admissible
functions.
76 3. MULTIVARIABLE VARIATIONAL PROBLEMS

THEOREM 3.4.1. Let the energy be given by (3.7) and the admissible
class by (3.8). Assume u0 (·) ∈ A is a twice continuously differentiable
minimizer
(i) Then u0 solves the usual Euler-Lagrange equation
n  
X ∂ ∂L ∂L
(3.9) − (x, u0 , ∇u0 ) + (x, u0 , ∇u0 ) = 0
∂xk ∂zk ∂y
k=1

within U .
(ii) Furthermore, we have the boundary condition
n
X ∂L ∂B
(3.10) (x, u0 , ∇u0 )ν k + (x, u0 ) = 0
∂zk ∂y
k=1

on ∂U .

In vector notation, (3.10) reads


∂B
(3.11) ∇z L(x, u0 , ∇u0 ) · ν + (x, u0 ) = 0 on ∂U .
∂y
INTERPRETATION. The identity (3.10) is the natural boundary
condition (or transversality condition) hidden in our new variational
problem. It appears automatically when we compute the first variation. 

Proof. 1. Select any w : Ū → R, but do not require that w vanishes on the


boundary. Define uτ (x) = u0 (x) + τ w(x) (x ∈ U ), and observe uτ (·) ∈ A.
Thus i(τ ) = I[uτ (·)] has a minimum at τ = 0 and therefore
di
(0) = 0.

2. Now
i(τ ) = I[uτ (·)]
Z Z
= L(x, u0 + τ w, ∇u0 + τ ∇w) dx + B(x, u0 + τ w) dS.
U ∂U
Therefore
Z n
di ∂L X ∂L ∂w
0= (0) = (x, u0 , ∇u0 )w + (x, u0 , ∇u0 ) dx
dτ U ∂y ∂zk ∂xk
k=1
Z
∂B
+ (x, u0 )w dS.
∂U ∂y
3.4. Extensions and generalizations 77

Integrate by parts, to deduce


Z " n  #
∂L X ∂ ∂L
(x, u0 (x), ∇u0 (x)) − (x, u0 (x), ∇u0 (x)) w dx
U ∂y k=1
∂xk ∂zk
Z X n  
∂L k ∂B
+ (x, u0 , ∇u0 )ν + (x, u0 ) w dS = 0.
∂U ∂zk ∂y
k=1

This identity is valid for all functions w. If we restrict attention to


functions satisfying also w = 0 on ∂U , we as usual deduce that the Euler-
Lagrange PDE holds within U .
Knowing this, we can then rewrite the foregoing:
Z X n  
∂L k ∂B
(x, u0 , ∇u0 )ν + (x, u0 ) w dS = 0.
∂U ∂zk ∂y
k=1

As this identity is valid for all functions w, regardless of their behavior


on ∂U , it follows that the boundary condition (3.10) holds everywhere on
∂U . 

EXAMPLE. A minimizer u0 (·) the functional


Z Z
1 2
I[u(·)] = |∇u| dx + B(u) dS
2 U ∂U

over the admissible class (3.8) solves the nonlinear boundary-value problem
(
∆u0 = 0 in U
∂u0
∂ν + b(u0 ) = 0 on ∂U ,

where b = B 0 and
∂u
= ∇u · ν
∂ν
is the outward normal derivative of u. 

3.4.2. Integrating factors.


It is often important to determine whether a given PDE (or ODE) prob-
lem is variational, meaning that it arises as the Euler-Lagrange equation
for an appropriate energy functional I[ · ].

Consider for instance the linear equation


(3.12) −∆u + b · ∇u = f in U ,
where b : U → Rn is a given vector field, b = b(x). In general (3.12) is not
variational.
78 3. MULTIVARIABLE VARIATIONAL PROBLEMS

But for the special case that b = ∇φ is the gradient of a potential


function φ : U → R, let us consider the functional
Z  
1
(3.13) I[u(·)] = e−φ |∇u|2 − f u dx.
U 2
The Lagrangian function is
 
−φ(x) 1 2
L=e |z| − f (x)y .
2
and the corresponding Euler-Lagrange equation for a minimizer u is
 
− div e−φ ∇u − e−φ f = 0.

We simplify and cancel the term e−φ , to rewrite the foregoing as


(3.14) −∆u + ∇φ · ∇u = f ;
this is (3.12) when b = ∇φ.

INTERPRETATION. We can regard e−φ as an integrating factor.


This means that if we multiply the PDE (3.14) by this expression, it then
becomes variational. 

3.4.3. Integral constraints.


In this section we for simplicity take L = 12 |z|2 ; so that
Z
1
I[u(·)] = |∇u|2 dx.
2 U
Define also the functional
Z
J[u(·)] = G(x, u(x)) dx,
U

where G : U × R → R, G = G(x, y). The new admissible class will be


(3.15) A = {u : U → R | u = g on ∂U , J[u(·)] = 0}.

THEOREM 3.4.2. Assume that u0 ∈ A is a minimizer of I[ · ] over A.


Suppose also that
∂G
(3.16) (x, u0 ) is not identically zero on U .
∂y
Then there exists λ0 ∈ R such that
∂G
(3.17) −∆u0 + λ0 (x, u0 ) = 0
∂y
within U .
3.4. Extensions and generalizations 79

We omit the proof, which is similar to that for Theorem 1.3.5. As in that
previous theorem, we interpret λ0 as the Lagrange multiplier for the integral
constraint J[u(·)] = 0. The requirement (3.16) is a constraint qualification
condition.
For an application, see Theorem 3.5.1 below.

3.4.4. Systems.
We can also extend our theory to handle systems. For this, we assume
g : ∂U → Rm is given, and redefine the class of admissible functions, now
to be
A = {u : Ū → Rm | u = g on ∂U }.
NOTATION. We write
  ∂u1 ∂u1
(∇u1 )T
   
u1 ∂x1 ... ∂xn
u =  ...  , ∇u =  ..   .. .. ..  .
=
  
.   . . . 
∂um ∂um
um (∇um )T ∂x1 ... ∂xn


Suppose next we have a Lagrangian function L : U × Rm × Mm×n → R,


L = L(x, y, Z),
where Mm×n denotes the space of real, m × n matrices.
NOTATION. In this section only, we will write a matrix Z ∈ Mm×n as
 1
z1 . . . zn1

Z =  ... . . . ...  .
 

z1m . . . znm

DEFINITION. If u(·) ∈ A, we define
Z
I[u(·)] = L(x, u(x), ∇u(x)) dx.
U
Note that we insert u(x) into the y-variables of L(x, y, Z), and ∇u(x) into
the Z-variables.
THEOREM 3.4.3. Assume u0 (·) ∈ A minimizers I[ · ] and u0 is twice
continuously differentiable.
Then u0 solves within U the system of nonlinear PDE
n  
X ∂ ∂L ∂L
(E-L) − (x, u 0 (x), ∇u 0 (x)) + (x, u0 (x), ∇u0 (x)) = 0
∂xk ∂zkl ∂yl
k=1
for l = 1, . . . , m.
80 3. MULTIVARIABLE VARIATIONAL PROBLEMS

Proof. 1. Select w : Ū → R such that w = 0 on ∂U , and then define and


define
uτ (x) = u0 (x) + τ w(x) (x ∈ U ).
for −1 ≤ τ ≤ 1 Then uτ (·) ∈ A, since w = 0 on ∂U . Hence
I[u0 (·)] ≤ I[uτ (·)].
Define i(τ ) = I[uτ (·)]. Then τ 7→ i(τ ) has a minimum at τ = 0, and
therefore
di
(0) = 0.

2. We have
Z
i(τ ) = I[uτ (·)] = L(x, u0 (x) + τ w(x), ∇u0 (x) + τ ∇w(x)) dx.
U
Therefore
Z Xm
di ∂L
(τ ) = (x, u0 + τ w, ∇u0 + τ ∇w)wl
dτ U ∂yl
l=1
m X n
X ∂L ∂wl
+ l
(x, u0 + τ w, ∇u0 + τ ∇w) dx;
∂zk ∂xk
l=1 k=1

and so
Z Xm m X
n
di ∂L X ∂L ∂wl
0= (0) = (x, u0 , ∇u0 )wl + (x, u0 , ∇u0 ) dx.
dτ U ∂yl ∂zk ∂xk
l=1 l=1 k=1

Now fix some index l ∈ {1, . . . , m} and put


w = [0 . . . 0 w 0 . . . 0]T ,
where the real-valued function w appears in the l-th slot. Then we have
Z n
∂L X ∂L ∂w
(x, u0 , ∇u0 )w + (x, u0 , ∇u0 ) dx = 0.
U ∂yl ∂zk
k=1
∂xk

Upon integrating by parts, we deduce as usual that the l-th equation of the
(E-L) system of PDE holds. 

3.5. Applications
3.5.1. Eigenvalues, eigenfunctions.
Assume for this section that U ⊂ Rn is a connected, open set, with
smooth boundary ∂U .
3.5. Applications 81

THEOREM 3.5.1. (i) There exists a real number λ1 > 0 and a smooth
function w1 such that

−∆w1 = λ1 w1 in U

(3.18) w1 = 0 on ∂U

2
R
w1 > 0 in U , U w1 dx = 1.

(ii) Furthermore,
Z Z 
2 2
(3.19) λ1 = min |∇u| dx | u dx = 1, u = 0 on ∂U .
U U

DEFINITION. We call λ1 the principal eigenvalue for the Laplacian


with zero boundary conditions on ∂U . The function w1 is a principal eigen-
function.
REMARK. We can also write
2
R
UR |∇u| dx
(3.20) λ1 = min 2
.
U u dx
u=0 on ∂U ,u6=0

This is Rayleigh’s principle. 

Proof. If w1 ∈ A is a minimizer for the constrained variational problem


(3.19), then
∂G
(w1 ) = 2w1 6≡ 0.
∂y
Therefore Theorem 3.4.2 tells us that the Euler-Lagrange equation is
−∆w1 + µw1 = 0,
where µ is the eigenvalue for the constraint. But then
Z Z
µ=µ w12 dx = − |∇w1 |2 dx = −λ1 .
U U

The existence of a smooth minimizer w1 , with w1 > 0 within U , requires


graduate level mathematical ideas beyond the scope of these notes. 

THEOREM 3.5.2. (i) There exist real numbers 0 < λ1 < λ2 ≤ λ3 ≤ · · ·


and smooth function w1 , w2 , w3 , . . . such that
λk → ∞
and

−∆wk = λk wk in U

(3.21) wk = 0 on ∂U

R 2
U wk dx = 1.
82 3. MULTIVARIABLE VARIATIONAL PROBLEMS

(ii) Furthermore,
Z Z
2
(3.22) λk = min |∇u| dx | u2 dx = 1, u = 0 on ∂U ,
U U
Z 
wl u dx = 0 (l = 1, . . . , k − 1) .
U

DEFINITION. We call λk the k-th eigenvalue for the Laplacian with


zero boundary conditions on ∂U . The function wk is a corresponding eigen-
function. 

Proof. Assume that (3.21) holds for l = 1, . . . , k − 1. The Euler-Lagrange


PDE for the minimization problem (3.22) is
k−1
X
(3.23) −∆wk = λk wk + µ l wl ,
l=1

where λk is the Lagrange multiplier for theRconstraint U w2 dx = 1 and µl is


R

the Lagrange multiplier for the constraint U wl w dx = 0 for l = 1, . . . , k − 1.


Let 1 ≤ m ≤ k − 1, multiply (3.23) by wm , and integrate:
Z Z k−1
X
− ∆wk wm dx = (λk wk + µl wl )wm dx = µm .
U U l=1

Therefore
Z Z
µm = − ∆wk wm dx = ∇wk · ∇wm dx
U U
Z Z
=− wk ∆wm dx = λm wk wm dx = 0.
U U

Thus (3.23) becomes −∆wk = λk wk . 

REMARK. (Level curves for eigenfunctions) If U is connected, the


first eigenfunction w1 is positive; but the higher eigenfunctions wk for k =
2, . . . change sign within U .
Take a thin metal plate cut into the shape U ⊂ R2 , sprinkle it with sand,
and then connect the plate to an audio speaker. If we we play appropriate
high frequencies over the speaker, the plate resonates according to the higher
eigenfunctions of the Laplacian (at least approximately) and so the sand
collects into the level sets {wk = 0}. At higher and higher frequencies,
complicated and beautiful structures appear, called Chladni patterns: see
www.youtube.com/watch?v=wvJAgrUBF4w. 
3.5. Applications 83

3.5.2. Minimal surfaces.


We saw on page 70 that we can apply variational methods to study
minimal surfaces that are the graphs of functions u : Ū → R. The main
observation was that for such surfaces the mean curvature H equals 0.
For several centuries there has been intense mathematical investigation
of more complicated minimal surfaces that cannot be represented globally
as a graph. The study of these is beyond the scope of these notes, but
following are some pictures (provided to me by David Hoffman) of some
beautiful 2-dimensional minimal surfaces. For all of these H = κ1 + κ2 is
identically zero, where κ1 , κ2 are the principal curvatures.
84 3. MULTIVARIABLE VARIATIONAL PROBLEMS

These surfaces are mathematical models for physical soap films.


More complicated are mathematical models for soap bubbles. These en-
tail minimizing surface area, subject to volume constraints (= volume of the
air within the bubbles). Such surfaces have locally constant mean curvature,
and we can interpret the constant as a Lagrange multiplier for the volume
constraint.

Bubbles with constant mean curvature

3.5.3. Harmonic maps.


Let us next study how to minimize the Dirichlet energy among maps
from U ⊂ Rn into the unit sphere S m−1 ⊂ Rm , satisfying given boundary
conditions. We therefore take as the admissible class of mappings
A := {u : U → Rm | u = g on ∂U, |u| = 1} .
3.5. Applications 85

The corresponding energy is


Z
1
I[u(·)] = |∇u|2 dx
2 U

THEOREM 3.5.3. Let u0 ∈ A satisfy


I[u0 ] = min I[u].
u∈A

Then
(
−∆u0 = |∇u0 |2 u0 in U
(3.24)
u0 = g on ∂U .

INTERPRETATION. The function λ0 = |∇u0 |2 is the Lagrange multi-


plier corresponding to the pointwise constraint |u0 | = 1. 

Proof. 1. Select w : U → Rm , with w = 0 on ∂U . Then since |u0 | = 1, it


follows that |u0 + τ w| =
6 0 for each sufficiently small τ . Consequently
u0 + τ w
(3.25) u(τ ) := ∈ A.
|u0 + τ w|
Thus
i(τ ) := I[w(τ )]
has a minimum at τ = 0, and so, as usual,
di
(0) = 0.

2. We have
Z
0
(3.26) i (0) = ∇u · ∇u0 (0) dx = 0.
U

where 0 = d
dτ . But we compute directly from (3.25) that
w [(u0 + τ w) · w](u0 + τ w)
u0 (τ ) = − .
|u0 + τ w| |u0 + τ w|3
So u0 (0) = w − (u0 · w)u0 . Put this equality into (3.26):
Z
(3.27) 0= ∇u0 · ∇w − ∇u0 · ∇((u0 · w)u0 ) dx.
U
We next differentiate the identity |u0 |2 ≡ 1, to learn that
(∇u0 )T u0 = 0.
Using this fact, we then verify that
∇u0 · ∇((u0 · w)u0 ) = |∇u0 |2 (u0 · w)
86 3. MULTIVARIABLE VARIATIONAL PROBLEMS

in U . Inserting this into (3.27) gives


Z
0= ∇u0 · ∇w − |∇u0 |2 (u0 · w) dx.
U
This identity, valid for all functions w as above, implies the PDE in (3.24).


3.5.4. Gradient flows.


The Euler-Lagrange equation arising in PDE theory are mostly equilib-
rium equations that do not entail changes in time. But it is interesting also
to consider certain time-dependent equations that can be interpreted as gra-
dient flows. Recall that if we are given an “energy” function Φ : Rn → R,
the system of ODE
(3.28) ẋ = −∇Φ(x),
describes a “downhill” gradient flow

We introduce next a PDE version, corresponding to the energy


Z
I[u(·)] = L(x, u(x), ∇u(x)) dx.
U
The corresponding time dependent gradient flow, generalizing (3.28), is the
PDE
n  
∂u X ∂ ∂L ∂L
(3.29) = (x, u, ∇u) − (x, u, ∇u).
∂t ∂xk ∂zk ∂y
k=1

We assume that u = u(x, t) solves this PDE, with the boundary condition
(3.30) u=0 on ∂U .
THEOREM 3.5.4.
(i) The function
φ(t) = I[u(·, t)] (0 ≤ t < ∞)
is nonincreasing on [0, ∞).
(ii) Assume in addition that (y, z) 7→ L(x, y, z) is convex. Then φ is
convex on [0, ∞).

Proof. 1. We differentiate in t, to see that


Z
d d
I[u(·, t)] = L(x, u(x, t), ∇x u(x, t)) dx
dt dt U
Z  
∂L ∂u ∂u
= (x, u, ∇u) + ∇z L(x, u, ∇u) · ∇x dx
U ∂y ∂t ∂t
3.5. Applications 87

Z  
∂L ∂u
= (x, u, ∇u) − div(∇z L(x, u, ∇u)) dx
U ∂y ∂t
Z  2
∂u
=− dx ≤ 0.
U ∂t
Observe that there is no boundary term when we integrate by parts, since
if (3.30) holds for all times t, then ∂u∂t = 0 on ∂U .

2. Differentiate again in t:
2
d2 ∂u ∂ 2 u
Z  Z
d ∂u
2
I[u(·, t)] = − dx = −2 dx.
dt dt U ∂t U ∂t ∂t2
We can also differentiate the PDE (3.29), to find
n n
!
∂2u X ∂ ∂ 2 L ∂u X ∂ 2 L ∂ 2 u
= +
∂t2 ∂xk ∂zk ∂y ∂t ∂zk ∂zl ∂xl ∂t
k=1 l=1
n
∂ 2 L ∂u X ∂ 2 L ∂ 2 u
− − .
∂y 2 ∂t ∂y∂zl ∂xl ∂t
l=1
We insert this into the previous calculation, and integrate by parts:
n
d2 ∂2L ∂2u ∂2u
Z X
I[u(·, t)] = 2
dt2 U k,l=1 ∂zk ∂zl ∂xk ∂t ∂xl ∂t
n
∂ 2 L ∂ 2 u ∂u ∂ 2 L ∂u 2
X  
+2 + dx ≥ 0,
∂zk ∂y ∂xk ∂t ∂t ∂y 2 ∂t
k=1
the last inequality holding provided L is convex in the variables (y, z). 
Chapter 4

OPTIMAL CONTROL
THEORY

4.1. The basic problem


This section provides an informal introduction to optimal control theory,
and discusses three model problems. The basic issue is this: we are given
some system of interest, whose evolution in time is modeled by a differential
equation that depends upon certain parameters. We ask how to optimally
and continually adjust these parameters, so as to maximize a given payoff
functional.

To be more precise, assume that the state of our system at time t ≥ 0


is x(t), where
x : [0, ∞) → Rn
solves a system of differential equations having the form
(
ẋ(t) = f (x(t), α(t)) (t ≥ 0)
(ODE) 0
x(0) = x ,

x0 ∈ Rn denoting the initial state of the system. Here we are given

f : Rn × A → Rn ,

where A ⊆ Rm is the set of control (or parameter) values. The (possibly


discontinuous) mapping
α : [0, ∞) → A
is an admissible control.

89
90 4. OPTIMAL CONTROL THEORY

We write A for the collection of all admissible controls, and will always
assume that for each α(·) ∈ A, the solution x(·) of (ODE) exists and is
unique. We call x(·) the response of the system to the control α(·) ∈ A.

NOTATION. We write
   
f1 (x, a) x1 (t)
f (x, a) =  .. x(t) =  ...  .
,
   
.
fn (x, a) xn (t)


x0

Three responses to three controls

Given T > 0 and functions r : Rn × A → R, g : Rn → R, we define for


each control α(·) ∈ A the payoff functional
Z T
(P) P [α(·)] = r (x(t), α(t)) dt + g(x(T )),
0
where x(·) solves (ODE) for the control α(·). We call T the terminal time,
r is the running payoff, and g is the terminal payoff.

OPTIMAL CONTROL PROBLEM. Our task is to design a control


α0 (·) ∈ A that maximizes the payoff; thus
P [α0 (·)] = max P [α(·)].
α(·)∈A

This is an optimal control problem.

REMARK. More precisely, this is a fixed time, free endpoint optimal con-
trol problem, instances of which appear as the next two examples. Other
problems are instead free time, fixed endpoint: see the third example follow-
ing. 
4.1. The basic problem 91

EXAMPLE. (A production and consumption model) Consider an


economic activity (such as running a company) that generates an output,
some fraction of which we can at each moment reinvest, while consuming
the rest. How should we plan our consumption/reinvestment strategy so as
to maximize our total consumption over a time period of given length T ?
To write down a mathematical model, introduce
x(t) = output of company at time t
α(t) = fraction of output reinvested at time t.
Since the control α(·) represents a fraction of output, we have 0 ≤ α(t) ≤ 1
for times 0 ≤ t ≤ T . In other words,
α : [0, T ] → A
where A denotes the interval [0, 1] ⊂ R.
Next we model the output of the company as a function of the reinvest-
ment strategy:
(
ẋ(t) = γα(t)x(t) (0 ≤ t ≤ T )
(ODE)
x(0) = x0 .
Here γ > 0 is a known growth rate. Since (1 − α(t))x(t) is the amount of
the output consumed at time t, our total consumption will therefore be
Z T
(P) P [α(·)] = (1 − α(t))x(t) dt.
0

We wish to design an optimal reinvestment plan α0 (·) that maximizes P [ · ].


This fits into the fixed time, free endpoint control theory formulation
from above, with n = m = 1 and
f (x, a) = kax, r(x, a) = (1 − a)x, g = 0.


EXAMPLE. (Linear-quadratic regulator) The linear-quadratic regu-


lator is a widely used model, since, as we will later see, it is solvable.
In the simplest case, we take n = m = 1 and introduce the system
dynamics
(
ẋ(t) = x(t) + α(t) (0 ≤ t ≤ T )
(ODE)
x(0) = x0 .
We also assume A = R; so there is no constraint on the magnitude of control.
92 4. OPTIMAL CONTROL THEORY

We want to minimize the quadratic cost functional


Z T
x2 (t) + α2 (t) dt.
0
Since our theory is based upon maximization, we therefore take
Z T
(P) P [α(·)] = − x2 (t) + α2 (t) dt.
0
This falls into the control theory framework, as a fixed time, free endpoint
problem, with
f (x, a) = x + a, r(x, a) = −(x2 + a2 ), g = 0.


EXAMPLE. (Rocket railroad car) We study next a railway car that


can move along the real line, and whose acceleration can be adjusted by
firing rockets at each end of the car. How can we steer the car to the origin
in the least amount of time?

rocket engines

We introduce
y(t) = position at time t
ẏ(t) = velocity
ÿ(t) = acceleration
α(t) = thrust of rocket engines
T = time the car arrives at the origin, with zero velocity.
We assume concerning the trust that in appropriate physical units we have
−1 ≤ α(t) ≤ 1; consequently α : [0, T ] → A for A = [−1, 1]. If the car has
mass 1, then Newton’s Law tells us that
ÿ(t) = α(t).

We rewrite this problem into the general form discussed before, setting
n = 2, m = 1. We put
   
x1 (t) y(t)
x(t) = = .
x2 (t) ẏ(t)
4.2. Time optimal linear control 93

Then our dynamics are


   
0 1 0
(ODE) ẋ(t) = x(t) + α(t) (0 ≤ t ≤ T )
0 0 1
with x(0) = x0 = [y 0 v 0 ]T , where y 0 is the initial position and v 0 is the
initial velocity. Our goal is to steer the railway car to the origin at a time
T > 0 (so it arrives with zero velocity: x(T ) = [0 0]T ), and to do so in the
least time.
We consequently want to maximize
Z T
(P) P [α(·)] = − 1 dt = −T.
0
This is a free time, fixed endpoint problem, since the time T to reach the
origin is not prescribed. 

4.2. Time optimal linear control


When the dynamics are linear in both the state and the control, we can
use tools of linear and convex analysis to design explicit optimal controls
and/or to deduce detailed information. In this section we illustrate such an
approach. (However, our presentation will invoke ideas from measure theory
and functional analysis, and will also omit some details.)
Consider therefore the linear control system:
(
ẋ(t) = M x(t) + N α(t)
(ODE)
x(0) = x0 ,
for given matrices M ∈ Mn×n and N ∈ Mn×m . We will take
A = [−1, 1]m ⊂ Rm ,
and consider the class of admissible controls
A = {α : [0, ∞) → A | α(·) is measurable} .
Define
(P) P [α(·)] = −T
where T = T (α(·)) denotes the first time the solution of our ODE hits
the origin 0: x(T ) = 0. (If the trajectory never reaches the origin, we set
T = ∞.)

OPTIMAL TIME LINEAR PROBLEM. We are given the starting


point x0 ∈ Rn , and want to find an optimal control α0 (·) such that
P [α0 (·)] = max P [α(·)].
α(·)∈A
94 4. OPTIMAL CONTROL THEORY

Then
T0 = −P[α0 (·)] is the minimum time to steer to the origin.

4.2.1. Linear systems of ODE.


Let us first briefly recall some terminology and basic facts about linear
systems of ordinary differential equations.

DEFINITION. Assume M ∈ Mn×n . Let X(·) : [0, ∞) → Mn×n be the


unique solution of the matrix ODE
(
Ẋ(t) = M X(t) (t ≥ 0)
X(0) = I.

We call X(·) the fundamental solution, and sometimes write


∞ k
X t Mk
X(t) = etM = .
k!
k=0

THEOREM 4.2.1. (Solving linear systems)


(i) The unique solution of the homogeneous initial-value problem
(
ẋ = M x (t ≥ 0)
(4.1)
x(0) = x0
is
x(t) = X(t)x0 .

(ii) Suppose that f : [0, ∞) → Rn . Then the unique solution of the


nonhomogeneous initial-value problem
(
ẋ = M x + f (t ≥ 0)
(4.2)
x(0) = x0 .
is given by the variation of parameters formula
Z t
0
x(t) = X(t)x + X(t) X−1 (s)f (s) ds.
0

4.2.2. Reachable sets and convexity.

DEFINITION. We define the reachable set for time t > 0 to be

K(t, x0 ) = {x1 ∈ Rn | there exists α(·) ∈ A such that the


corresponding solution of (ODE) satisfies x(t) = x1 }.
4.2. Time optimal linear control 95

In other words, x1 ∈ K(t, x0 ) provided there exists an admissible control


that steers the solution of (ODE) from x0 to x1 at time t. Using the variation
of parameters formula, we see that x1 ∈ K(t, x0 ) if and only if
Z t
(4.3) 1 0
x = X(t)x + X(t) X−1 (s)N α(s) ds
0

for some control α(·) ∈ A.

The geometry of the reachable set is important:

THEOREM 4.2.2. For each time t > 0, the reachable set K(t, x0 ) is
convex and closed.

Proof. 1. Let x1 , x2 ∈ K(t, x0 ). Then there exist controls α1 (·), α2 (·) ∈ A


such that
Z t
1 0
x = X(t)x + X(t) X−1 (s)N α1 (s) ds
0
Z t
x2 = X(t)x0 + X(t) X−1 (s)N α2 (s) ds.
0

Let 0 ≤ θ ≤ 1. Then
Z t
1 2
θx + (1 − θ)x = X(t)x + X(t) 0
X−1 (s)N (θα1 (s) + (1 − θ)α2 (s)) ds.
0

Since θα1 (·) + (1 − θ)α2 (·) ∈ A, we see that θx1 + (1 − θ)x2 ∈ K(t, x0 ).

2. We omit the proof that K(t, x0 ) is closed, as this requires some


knowledge of functional analysis. 

We next exploit the convexity of reachable sets, to deduce nontrivial


information about the structure of an optimal control.

THEOREM 4.2.3. (Time optimal linear maximum principle) As-


sume that α0 (·) is a piecewise continuous optimal control, which steers the
system from x0 to 0 in the least time T0 .
Then there exists a nonzero vector h ∈ Rn such that

(M) h · X−1 (t)N α0 (t) = max{h · X−1 (t)N a}


a∈A

for each time 0 ≤ t ≤ T0 that is a point of continuity of α0 (·).


96 4. OPTIMAL CONTROL THEORY

INTERPRETATION. Note that the maximum on the right hand side of


(M) is over the finite dimensional set A = [−1, 1]m of control values, and not
over the infinite dimensional set A of all admissible controls. And in fact,
since the expression h · X−1 (t)N a is linear in a, the maximum will occur
among the finitely many corners of the cube A.
The significance is that if we know h, then the maximization principle
(M ) provides us with a formula for computing α0 (·), or at least for extracting
useful information. See the example below for how this works in practice.
We will see later that (M ) is a special case of the general Pontryagin
Maximum Principle. 

Outline of proof 1. Since T0 denotes the minimum time it takes to steer


to 0, we have
0∈/ K(t, x0 ) for all times 0 ≤ t < T0 .
It follows that
(4.4) 0 ∈ ∂K(T0 , x0 ).

Since K(T0 , x0 ) is a nonempty closed convex set, there exists a support-


ing plane to K(T0 , x0 ) at 0: see the Math 170 notes. This means that there
exists a nonzero vector g ∈ Rn , such that
(4.5) g·x≤0 for all x ∈ K(T0 , x0 ).

K(T0,x0)

2. Given any control α(·) ∈ A, define x ∈ K(T0 , x0 ) by


Z T0
0
x = X(T0 )x + X(T0 ) X−1 (s)N α(s) ds.
0
4.2. Time optimal linear control 97

Note also that


Z T0
0
0 = X(T0 )x + X(T0 ) X−1 (s)N α0 (s) ds.
0
Since g · x ≤ 0, we therefore have
 Z T0 
0 −1
g · X(T0 )x + X(T0 ) X (s)N α(s) ds
0
 Z T0 
0 −1
≤ 0 = g · X(T0 )x + X(T0 ) X (s)N α0 (s) ds .
0

Define
h = XT (T0 )g;
so that hT = g T X(T0 ). Then
Z T0 Z T0
T −1
h X (s)N α(s) ds ≤ hT X−1 (s)N α0 (s) ds,
0 0
and therefore
Z T0
(4.6) h · X−1 (s)N (α0 (s) − α(s)) ds ≥ 0
0
for all controls α(·) ∈ A.

3. Now select any time 0 < t < T0 and any value a ∈ A. We pick δ > 0
so small that the interval [t, t + δ] lies in [0, T0 ]. Define
(
a if t ≤ s ≤ t + δ
α(s) =
α0 (s) otherwise;
then (4.6) implies
Z t+δ
1
(4.7) h · X−1 (s)N (α0 (s) − a) ds ≥ 0
δ t
We sent δ → 0, to deduce that if t is a point of continuity of α0 (·), then
(4.8) h · X−1 (t)N α0 (t) ≥ h · X−1 (t)N a
for all a ∈ A. This implies the maximization assertion (M). 
REMARKS. (i) This outline of the proof needs more details, in particular
for the assertion (4.4) that 0 lies on the boundary of the reachable set.
(ii) If an optimal control α0 (·) is measurable, but not necessarily piece-
wise continuous, the same proof shows that (M) holds for almost every point
time t in the interval [0, T0 ]. 
98 4. OPTIMAL CONTROL THEORY

EXAMPLE. (Rocket railway car) For this problem, introduced on page


92, we have
   
0 1 0
ẋ(t) = x(t) + α(t)
0 0 1
for n = 2, m = 1 and
 
x1 (t)
x(t) = , A = [−1, 1].
x2 (t)

According to the maximum principle (M), there exists a nonzero vector


h ∈ R2 such that
h · X−1 (t)N α0 (t) = max h · X−1 (t)N a

(4.9)
|a|≤1

for an optimal control α(·). We will now extract from this the useful in-
formation that an optimal control α0 (·) takes on only the values ±1 and
switches between these values most once.
We must first compute X(t) = etM . To do so, we observe
    
0 0 1 2 0 1 0 1
M = I, M = , M = = O;
0 0 0 0 0 0
and therefore M k = O for all k ≥ 2, where O denotes the zero matrix.
Consequently,
 
tM 1 t
e = I + tM = .
0 1
Then
 
−1 1 −t
X (t) = ;
0 1
 
−1 −t
h · X (t)N = [h1 h2 ] = −th1 + h2 .
1
Thus (4.9) asserts
(4.10) (−th1 + h2 )α0 (t) = max{(−th1 + h2 )a};
|a|≤1

and this implies that


α0 (t) = sgn(−th1 + h2 )
for the sign function

1
 (x > 0)
sgn x = 0 (x = 0)

−1 (x < 0).

4.3. Pontryagin Maximum Principle 99

Therefore the optimal control α0 (·) switches at most once; and if h1 = 0,


then α0 (·) is constant. (With this information, it is not especially difficult
to find optimal controls and trajectories: see page 140.) 

4.3. Pontryagin Maximum Principle


We turn now to general optimal control problems, and learn how to gener-
alize the maximization condition (M) from Theorem 4.2.3.

4.3.1. Fixed time, free endpoint problems.


The dynamics for a fixed time optimal control problem read
(
ẋ(t) = f (x(t), α(t)) (0 ≤ t ≤ T )
(ODE)
x(0) = x0 ,
where T > 0 and
A = {α : [0, T ] → A | α(·) is piecewise continuous}
for a given set A ⊆ Rm . The payoff is
Z T
(P) P [α(·)] = r(x(t), α(t)) dt + g(x(T )).
0

Our goal is to characterize an optimal control α0 (·) ∈ A that maximizes


P [α(·)] among all controls α(·) ∈ A. This is a fixed time, free endpoint
problem.

x0

Fixed time, free endpoint trajectories

DEFINITION. The control theory Hamiltonian is


H(x, p, a) = f (x, a) · p + r(x, a)
100 4. OPTIMAL CONTROL THEORY

for x, p ∈ Rn , a ∈ A. That is,


n
X
H(x, p, a) = fj (x, a)pj + r(x, a).
j=1

NOTATION. (i) We write


     ∂f1 ∂f1 
p1 f1 ∂x1 ... ∂xn
 ..   .. 
p =  . , f =  . , ∇x f =  ... .. .. 

. . 
∂fn ∂fn
pn fn ∂x1 ... ∂xn n×n

∂H
(ii) Since ∂pi = fi for i = 1, . . . , n, we have

(4.11) ∇p H = f .
∂H Pn ∂fj ∂r
Also, ∂xi = j=1 ∂xi (x, a)pj + ∂xi (x, a) for i = 1, . . . , n. Consequently,

(4.12) ∇x H = (∇x f )T p + ∇x r.


Next is our first version of the Pontryagin Maximum Principle.

THEOREM 4.3.1. (Fixed time, free endpoint PMP) Suppose α0 (·)


is an optimal control for the fixed time, free endpoint problem stated above,
and x0 (·) is the corresponding solution to (ODE).
(i) Then there exists a function p0 : [0, T ] → Rn such that for times
0 ≤ t ≤ T we have
(ODE) ẋ0 (t) = ∇p H (x0 (t), p0 (t), α0 (t)) ,

(ADJ) ṗ0 (t) = −∇x H (x0 (t), p0 (t), α0 (t)) ,

(M) H (x0 (t), p0 (t), α0 (t)) = max H(x0 (t), p0 (t), a),
a∈A

and
(T) p0 (T ) = ∇g (x0 (T )) .

(ii) In addition,
(4.13) H (x0 , p0 , α0 ) is constant on [0, T ].
4.3. Pontryagin Maximum Principle 101

TERMINOLOGY. (i) We call

x01 (t)
   0 
p1 (t)
 ..   .. 
x0 (t) =  .  , p0 (t) =  . 
x0n (t) p0n (t)

the optimal state and costate at time t.


(ii) (ADJ) is the adjoint equation;
(iii) (M) is the maximization condition;
(iv) (T) is the terminal (or transversality) condition. 

REMARKS.
(i) To be more precise, (ODE), (ADJ) and (M) hold at times 0 < t < T
that are points of continuity of the optimal control α0 (·).
(ii) The most important assertion is (M). In practice, this usually allows
us to transform the infinite dimensional problem of finding an optimal con-
trol α0 (·) ∈ A into a finite dimensional problem, at each time t, involving
maximization over A ⊆ Rm .
(iii) The costate equation (ADJ) and transversality condition (T) repre-
sent a sort of “back propagation” of information from the terminal time T .
We can also interpret the costate as a Lagrange multiplier corresponding to
the constraint that x0 (·) solves (ODE).
Note that we specify the initial condition x0 (0) = x0 for (ODE) and
the terminal condition p(T ) = ∇g(x0 (T )) for (ADJ). Hence even if α0 (·) is
known, solving this coupled system of equations can be tricky.
(iv) Remember from Section 1.4.3 that if H : Rn ×Rn → R, H = H(x, p),
we call
(
ẋ = ∇p H(x, p)
(H)
ṗ = −∇x H(x, p)

a Hamiltonian system of ODE. Notice that (ODE), (ADJ) are of this Hamil-
tonian form, except that now H = H(x, p, a) depends also on the control.
Observe furthermore that our assertion (4.13) is similar to (1.42) from The-
orem 1.4.1.


4.3.2. Other terminal conditions.


102 4. OPTIMAL CONTROL THEORY

For another important class of optimal control problems, the dynamics


are
(
ẋ(t) = f (x(t), α(t)) (0 ≤ t ≤ T )
(ODE)
x(0) = x0 , (T ) = x1 ,
where given initial and terminal values x0 and x1 are given, but the termi-
nal time T > 0 is not prescribed. This is a free time, fixed endpoint
problem, for which our goal is to maximize the payoff
Z T
(P) P [α(·)] = r (x(t), α(t)) dt,
0
where x(·) solves (ODE). Our goal is to characterize an optimal control
α0 (·) ∈ A.

x0

Free time, fixed endpoint trajectories

DEFINITION. The extended Hamiltonian is


(4.14) H(x, p, a, q) = f (x, a) · p + qr(x, a)
for x, p ∈ Rn , a ∈ A, q ∈ R.

THEOREM 4.3.2. (Free time, fixed endpoint PMP) Suppose α0 (·)


is an optimal control for the free time, fixed endpoint control problem and
x0 (·) is the corresponding solution to (ODE), that arrives at the target point
at time T0 . Let H be the extended Hamiltonian.
(i) Then there exists a function p0 : [0, T0 ] → Rn and a constant
(4.15) q0 = 0 or 1,
such that for 0 ≤ t ≤ T0 we have
(ODE) ẋ0 (t) = ∇p H (x0 (t), p0 (t), α0 (t), q0 ) ,
4.3. Pontryagin Maximum Principle 103

(ADJ) ṗ0 (t) = −∇x H (x0 (t), p0 (t), α0 (t), q0 ) ,


and
(M) H (x0 (t), p0 (t), α0 (t), q0 ) = max H(x0 (t), p0 (t), a, q0 ).
a∈A

(ii) If q0 = 0, then
(4.16) p0 (·) is not identically 0 on [0, T0 ].

(iii) Furthermore,

(T) H (x0 , p0 , α0 , q0 ) = 0 on [0, T0 ].

REMARKS.
(i) So for the free time problem, we have the transversality condition
that H(x0 , p0 , α0 , q0 ) = 0 at T0 and thus H(x0 , p0 , α0 , q0 ) = 0 on the entire
interval [0, T0 ]. This generalization of our earlier Theorem 1.3.4 is stronger
than the corresponding assertion (4.13) for the fixed time problem.
(ii) But we for the free time problem, we must deal with an additional
Lagrange multiplier q0 . We say the free time problem is normal if q0 = 1;
it is abnormal if q0 = 0. (The abnormal case is analogous to the existence
of the abnormal Lagrange multiplier γ0 = 0 in the F. John conditions for
finite dimensional optimization theory. See the Math 170 notes for more on
this.) 

Most free time problems are normal, and a simple assumption ensuring
this follows.

LEMMA 4.3.1. Suppose that the controllability assumption


(4.17) max{f (x, a) · p} > 0 (x, p ∈ Rn , p 6= 0)
a∈A

holds.
Then the associated free time, fixed endpoint control problem is normal.

Proof. If the problem were abnormal, then q0 = 0, p0 6≡ 0, and (T) would


assert
max H(x0 (t), p0 (t), a, q0 ) = max{f (x0 (t), a) · p0 (t)} = 0
a∈A a∈A

on [0, T0 ]. This however contradicts (4.17). 


104 4. OPTIMAL CONTROL THEORY

EXAMPLE. Here is an example of an abnormal problem with n = 2,


m = 1 and A = [−1, 1]. The dynamics are
(
ẋ1 = α2
(ODE)
ẋ2 = 1
with the initial and terminal conditions
   
0 0 1 0
x(0) = x = , x(T ) = x = .
0 1
The goal is to maximize
Z T
P [α] = α dt.
0

Now the only admissible control is α0 (·) ≡ 0, which is therefore optimal,


and T0 = 1. The free time Hamiltonian is
H = p1 a2 + p2 + qa.
Then (M) implies
∂H
0= = 2p01 α0 + q0 = q0
∂a
and hence this problem is abnormal. Furthermore (ADJ) tells us that p0 (·)
is constant. If we take  
−1
p0 = ,
0
then p0 6= 0 and the conditions (M),(T) of Theorem 4.3.2 hold. This example
of course fails to satisfy (4.17). 

EXAMPLE. Let us check that Theorem 4.3.2 accords with our previous
maximum principle for the time optimal linear problem, as developed in
Section 4.2. We have
H(x, p, a, q) = (M x + N a) · p − q (x, p ∈ Rn , a ∈ A, q ∈ R).
Select the vector h as in Theorem 4.2.3, and consider the system
(
ṗ0 (t) = −M T p0 (t)
p0 (0) = h,
the solution of which is
p0 (t) = X−T (t)h.
We know from condition (M) in Theorem 4.2.3 that
h · X−1 (t)N α0 (t) = max{h · X−1 (t)N a}.
a∈A
4.3. Pontryagin Maximum Principle 105

But since p0 (t)T = hT X−1 (t), this says


p0 (t) · N α0 (t) = max{p0 (t) · N a)},
a∈A

Then
p0 (t) · (M x0 (t) + N α0 (t)) − q0 = max{p0 (t) · (M x0 (t) + N a)} − q0 ,
a∈A

and this agrees with (M) from Theorem 4.3.2. 

Another sort of control problem has the dynamics


(
ẋ(t) = f (x(t), α(t)) (0 ≤ t ≤ T )
(ODE)
x(0) = x0 , (T ) = x1

for fixed initial and terminal values x0 and x1 are given and a fixed terminal
time T > 0. This is a fixed time, fixed endpoint problem. The payoff
is again
Z T
(P) P [α(·)] = r (x(t), α(t)) dt
0

THEOREM 4.3.3. (Fixed time, fixed endpoint PMP) Suppose α0 (·)


is an optimal control for the fixed time, fixed endpoint control problem and
x0 (·) is the corresponding solution to (ODE). Let H be given by (4.14).
(i) Then there exists p0 : [0, T ] → Rn and
(4.18) q0 = 0 or 1,
such that for all 0 ≤ t ≤ T :
(ODE) ẋ0 (t) = ∇p H (x0 (t), p0 (t), α0 (t), q0 )
(ADJ) ṗ0 (t) = −∇x H (x0 (t), p0 (t), α0 (t), q0 )
(M) H (x0 (t), p0 (t), α0 (t), q0 ) = max H(x0 (t), p0 (t), a, q0 )
a∈A

(ii) Furthermore,
(4.19) H (x0 , p0 , α0 , q0 ) is constant on [0, T ].

REMARKS.
(i) There is no transversality condition (T), since the fixed time, fixed
endpoint condition is too rigid to allow for any variations. As before, we
call the problem normal if q0 = 1 and abnormal if q0 = 0.
106 4. OPTIMAL CONTROL THEORY

(ii) We can deduce Theorem 4.3.3 from Theorem 4.3.2 by introducing a


new variable and rewriting the dynamics as
(
˙
x̄(t) = f̄ (x̄, α(t)) (0 ≤ t ≤ T )
x̄(0) = x̄0 , x̄(T ) = x̄1 ,
for
     0  1
x f 0 x 1 x
x̄ = , f̄ = , x̄ = , x̄ = .
xn+1 1 0 T
This gives a free time, fixed endpoint problem. Consequently there exist
q0 and p̄0 : [0, T0 ] → Rn+1 satisfying the conclusions of Theorem 4.3.2. We
write
 
p0
p̄0 = 0 .
pn+1
Then (ADJ) implies p0n+1 is constant on [0, T ]. Hence we may deduce from
(T) that H (x0 , p0 , α0 , q0 ) = −p0n+1 is constant on [0, T ]. 

4.4. Applications
HOW TO USE THE PONTRYAGIN MAXIMUM PRINCIPLE
Step 1. Write down the Hamiltonian
(
f (x, a) · p + r(x, a) for fixed time, free endpoint problems
H=
f (x, a) · p + qr(x, a) for fixed endpoint problems,
and calculate
∂H ∂H
, (i = 1, . . . , n).
∂xi ∂pi
Step 2. Write down (ODE), (ADJ), (M) and, if appropriate, (T).
Step 3. Use the maximization condition (M) to compute, if possible,
α0 (t) as a function of x0 (t), p0 (t).
Step 4. Now try to solve the coupled equations (ODE), (ADJ) and (T),
to find x0 (·), p0 (·) (and q0 for free time problems).

DEFINITION. If (M) does not uniquely determine α0 (·) on some time


interval (where Step 3 therefore fails), we call that part of the trajectory of
x0 (·) a singular arc. 

To simplify notation will mostly not write the subscripts “0” in the
following examples:
4.4. Applications 107

4.4.1. Simple linear-quadratic regulator.


The dynamics for the simplest version of the linear-quadratic regulator
read
(
ẋ = x + α
(ODE)
x(0) = x0

for controls α : [0, T ] → R. Here n = m = 1. The values of the controls are


unconstrained; that is, A = R. The payoff is quadratic in the state and the
control:
Z T
(P) P [α(·)] = − x2 + α2 dt
0

We therefore have a fixed time problem, for


f = x + a, r = −x2 − a2 , g = 0.
So the Hamiltonian is
H = f (x, a)p + r(x, a) = (x + a)p − x2 − a2 .
Then
∂H ∂H
= x + a, = p − 2x.
∂p ∂x
Consequently the equations for the PMP read
∂H
(ADJ) ṗ = − = 2x − p
∂x
(M) H (x(t), p(t), α(t)) = max{(x(t) + a)p(t) − x2 (t) − a2 }
a∈R
(T) p(T ) = 0.

We start with (M), and compute the value of α by noting the (uncon-
strained) maximum occurs where ∂H ∂H
∂a = 0. Since ∂a = p − 2a, we see that
p
(4.20) α= .
2
We use this information to rewrite (ODE), (ADJ):
(
ẋ = x + p2
(4.21)
ṗ = 2x − p,

with the initial condition x(0) = x0 and the terminal condition p(T ) = 0.
To solve (4.21), let us suppose that we can write
(4.22) p(t) = d(t)x(t) (0 ≤ t ≤ T ),
108 4. OPTIMAL CONTROL THEORY

for some function d(·) that we must find. To find an equation for d(·), we
assume that (4.21) is valid and compute
˙ + dẋ
ṗ = dx
˙ +d x+ p
 
2x − p = dx
 2 
(2 − d)x = dx˙ + d x + dx .
2
Cancelling the x, we discover that we should select d(·) to solve the Riccati
equation
( 2
d˙ = 2 − 2d − d2
(4.23)
d(T ) = 0.
Conversely, we check that if d(·) solves this Riccati equation and (4.22)
holds, then we have (4.21).
So we solve the terminal value problem for this nonlinear ODE, to get
the function d(·). Recalling then (4.22) and (4.20), we set
1
α0 (t) = d(t)x0 (t) (0 ≤ t ≤ T ),
2
to synthesize an optimal feedback control, where x0 (·) solves (ODE) for
this control: (
ẋ0 = x0 + 12 dx0
x0 (0) = x0 .

REMARK. We can convert the Riccati equation (4.23) into a terminal


value problem for a linear second-order ODE, by writing
2ḃ
d= ,
b
where (
b̈ = b − 2ḃ
b(T ) = 1, ḃ(T ) = 0.


4.4.2. Production and consumption.


Recall that for our production and consumption model on page 91, we
have the dynamics
(
ẋ = αx
(ODE)
x(0) = x0 ,
4.4. Applications 109

where the control α takes values in A = [0, 1]. We assume x0 > 0, and for
simplicity have taken the growth rate γ = 1. The payoff is
Z T
(P) P [α(·)] = (1 − α)x dt.
0
This fits within our fixed time PMP setting, for n = m = 1 and
f = ax, r = (1 − a)x, g = 0.
Thus the Hamiltonian is
H = f (x, a)p + r(x, a) = axp + (1 − a)x = x + ax(p − 1);
and so
∂H ∂H
= ax, = 1 + a(p − 1).
∂p ∂x
Consequently,
∂H
(ADJ) ṗ = − = −1 − α(p − 1).
∂x
Furthermore, we have
(M) H(x(t), p(t), α(t)) = max {x(t) + ax(t)(p(t) − 1)};
0≤a≤1

(T) p(T ) = 0.

We now carry out the maximization in (M), to learn how to compute an


optimal control α(·) in terms of the other functions:
(
1 if p(t) > 1
(4.24) α(t) =
0 if p(t) < 1.
This follows since x(·) is positive on [0, T ].
We next use the above information to find x(·), p(·), and the idea is to
work backwards from the ending time. Since p(T ) = 0, it must be that
p(t) < 1 for some interval [t0 , T ]. Thus (4.24) implies α = 0 for t0 ≤ t ≤ T .
We now analyze the various equations above on this interval (when α = 0):
(ODE) ẋ = 0,
(ADJ) ṗ = −1.
It follows that p(t) = T − t for t0 ≤ t ≤ T and p(t0 ) = 1 for
t0 = T − 1.

Next, we study the equations on the time interval 0 ≤ t ≤ t0 :


(ODE) ẋ = αx,
(ADJ) ṗ = −1 − α(p − 1).
110 4. OPTIMAL CONTROL THEORY

We see that p(t) > 1 if t1 ≤ t ≤ t0 for some time t1 < t0 . But then (4.24)
says α = 1 for t1 ≤ t ≤ t0 , and therefore
ṗ = −1 − (p − 1) = −p.
Since p(t0 ) = 1, it follows that p(t) = et0 −t > 1 for t1 ≤ t < t0 . Consequently
t1 = 0 and p(t) > 1 for 0 ≤ t < t0 .
So the optimal control is
(
1 (0 ≤ t < T − 1)
α0 (t) =
0 (T − 1 < t ≤ T ).
This means that we should reinvest all of the output until the time t0 = T −1,
and thereafter consume all the output.

REMARK. The formulas (ODE) and (ADJ) from the PMP provide us
with explicit differential equations for the optimal states and costates, but we
do not in general have a differential equation for the corresponding optimal
control. Indeed, the production/consumption example above has a “bang-
bang” control, which is piecewise constant with a single jump, and so does
not solve any differential equation.
However the next two applications illustrate that we can sometimes es-
tablish ODE also for the controls. The idea will be to try to eliminate p(·)
from the various equations. 

4.4.3. Ramsey consumption model.


For this example, x(t) ≥ 0 represents the capital at time t in some
economy and the control c(t) ≥ 0 is the consumption at time t. Given an
initial amount of capital x0 , we want to maximize the utility of the total
consumption over a fixed time interval [0, T ], but are required to leave an
amount x1 of capital at time T .
We model the evolution of the economy by the equation
(
ẋ = f (x) − c (0 ≤ t ≤ T )
(ODE)
x(0) = x , x(T ) = x1
0

for some appropriate function f : [0, ∞) → [0, ∞). So this is a fixed time,
fixed endpoint problem, which we assume is normal.
We wish to find an optimal consumption plan to maximize the payoff
Z T
(P) P [c(·)] = ψ(c) dt,
0
where ψ : [0, ∞) → [0, ∞), the consumption utility function, satisfies
ψ 0 > 0, ψ 00 < 0.
4.4. Applications 111

We will not analyze this problem completely, but will show that we can
derive an ODE for an optimal consumption policy. The Hamiltonian is
H = (f (x) − c)p + ψ(c),
and therefore
(ADJ) ṗ = −f 0 (x)p;

(M) H(x(t), p(t), c(t)) = max{(f (x(t)) − c)p(t) + ψ(c)}.


c≥0

Now (M) implies for each time t that either


(4.25) c(t) = 0
or
(4.26) ψ 0 (c(t)) = p(t), c(t) > 0.
We ignore for the moment the first possibility and therefore suppose (4.26)
always holds. This and (ADJ) now imply
ψ 00 (c)ċ = ṗ = −f 0 (x)p = −f 0 (x)ψ 0 (c).
We thus obtain the Keynes-Ramsey consumption rule
ψ 0 (c) 0
(4.27) ċ = − f (x).
ψ 00 (c)
Then (ODE) and (4.27) provide us with a coupled system of equations for
an optimal control c0 (·) and the corresponding state x0 (·).
REMARKS. (i) Our analysis of this problem is however incomplete, since
we have ignored the state constraint that x(·) ≥ 0. If the consumption
plan c(·) computed above forces x(·) to start to go negative at some time t,
we are clearly in the case (4.25), rather than (4.26).
(ii) We have also not specified an initial condition for (4.27). This would
need to be selected so that x0 (T ) = x1 . 

4.4.4. Zermelo’s navigation problem.


Let x = [x1 x2 ]T denote the location of a boat moving at fixed speed V
through water that has a current (depending on position, but not changing
in time) given by the vector field v : R2 → R2 , v = [v1 v2 ]T . We control the
boat by changing the direction in which it is pointing, determined by the
angle ξ from due east. The dynamics are therefore
(
ẋ1 = V cos ξ + v1 (x1 , x2 )
(ODE)
ẋ2 = V sin ξ + v2 (x1 , x2 ).
112 4. OPTIMAL CONTROL THEORY

How to we adjust the angle ξ(·) so as to steer the boat between two given
points x0 , x1 in the least time?

This is a free time, fixed endpoint problem for which the control is ξ(·).
We assume the problem is normal, and so the Hamiltonian is

H(x, p, ξ) = (V cos ξ + v1 )p1 + (V sin ξ + v2 )p2 − 1.

Consequently
(
∂v1 ∂v2
ṗ1 = −p1 ∂x − p2 ∂x
(ADJ) 1
∂v1
1
∂v2
ṗ2 = −p1 ∂x 2
− p2 ∂x 2
.

Furthermore, the maximization condition (M) implies


∂H
0= = V (−p1 sin ξ + p2 cos ξ).
∂ξ
Therefore
p2
(4.28) ξ = arctan ,
p1

p2 p1
(4.29) sin ξ = 1 , cos ξ = 1 .
(p21 + p22 ) 2 (p21 + p22 ) 2

For this problem, it turns out that we can eliminate the costates p1 , p2
and so express the optimal dynamics in terms of x1 , x2 and ξ. To do so, let
us use (4.28) and (ADJ) to compute
1 ṗ2 p1 − p2 ṗ1
ξ˙ =  2
p2 p21
1+ p1
   
p1 ∂v1 ∂v2 p2 ∂v1 ∂v2
= 2 −p1 − p2 − 2 −p1 − p2 .
p1 + p22 ∂x2 ∂x2 p1 + p22 ∂x1 ∂x1
Then (4.29) implies
 
∂v2 ∂v1 ∂v2 ∂v1
(4.30) ξ˙ = sin ξ
2
+ sin ξ cos ξ − − cos2 ξ .
∂x1 ∂x1 ∂x2 ∂x2

REMARK. The equations (ODE) and (4.30) provide us with a coupled


system for the optimal control ξ0 (·) and optimal state x0 (·). However we
do not have the initial condition for (4.30) and so must presumably rely
on numerical simulations to find a trajectory (if there is one) that passes
through the target point x1 at some time T0 > 0. 
4.4. Applications 113

4.4.5. Chaplygin’s navigation problem.


Here is another navigation problem. A boat takes a given time T to
move at constant speed V around a closed loop in the ocean. Assuming
that the sea water is flowing from west to east at constant speed v < V ,
what is the shape of such a path that encloses the maximum area?
If x = [x1 x2 ]T denotes the location of the boat, its motion is determined
by the equations
(
ẋ1 = V cos ξ + v
(ODE)
ẋ2 = V sin ξ,

where, as in the previous example, ξ is the angle from due east, as illustrated
below.
We assume the path of the boat is a simple, closed curve, traversed
counterclockwise. The area enclosed by the curve is
Z Z T
1 1
(P) P [ξ(·)] = x1 dx2 − x2 dx1 = x1 ẋ2 − x2 ẋ1 dt.
2 2 0

So here n = 2, m = 1 and we have a fixed time and fixed endpoint problem,


since the boat must begin and end its journey at some given point.

V ξ

Chaplygin’s problem
114 4. OPTIMAL CONTROL THEORY

Assuming the problem to be normal, we see that the Hamiltonian is


x1 x2
H = (V cos ξ + v)p1 + V sin ξ p2 + V sin ξ − (V cos ξ + v)
 x2   2 x  2
1
= p1 − (V cos ξ + v) + p2 + V sin ξ.
2 2
Therefore the adjoint dynamics are
(
ṗ1 = − V2 sin ξ
(ADJ)
ṗ2 = 12 (V cos ξ + v).
Using (ODE) and (ADJ), we see that
ẋ2 ẋ1
ṗ1 + = 0, ṗ2 − = 0.
2 2
Consequently
x2 x1
p1 + = a, p2 − =b
2 2
for appropriate constants a and b. Upon shifting coordinates if necessary,
we may assume a = b = 0; so that
x2 x1
(4.31) p1 + = 0, p2 − = 0.
2 2
We next compute the optimal control angle from the maximization con-
dition (M), by setting
∂H  x2   x1 
0= = − p1 − V sin ξ + p2 + V cos ξ.
∂ξ 2 2
Then (4.31) implies
(4.32) x1 cos ξ + x2 sin ξ = 0.
We now switch to polar coordinates, by writing
(4.33) x1 = r cos θ, x2 = r sin θ,
where θ is the polar angle, as drawn. Then (4.32) tells us that
0 = cos θ cos ξ + sin θ sin ξ = cos(ξ − θ).
π
Therefore ξ − θ is an odd multiple of 2;and so, from the picture,
π
(4.34) ξ=θ+ .
2
So the optimal control is to steer at right angles to the polar angle θ.

We show next that the optimal path is an ellipse. To do so, we first


differentiate (4.33):
ẋ1 = ṙ cos θ − rθ̇ sin θ;
ẋ2 = ṙ sin θ + rθ̇ cos θ.
4.4. Applications 115

This implies
ṙ = ẋ1 cos θ + ẋ2 sin θ.
Now use (4.34) to compute
v v
ṙ − ẋ2 = ẋ1 cos θ + ẋ2 sin θ − V sin ξ
V V
= (V cos ξ + v) cos θ + V sin ξ sin θ − v sin ξ
= (V cos ξ + v) sin ξ − V sin ξ cos ξ − v sin ξ
= 0.
Hence for some constant γ, we have
r − ex2 = γ,
where e = Vv . Thus the motion lies on the projection into R2 of the inter-
section in R3 of the cone x3 = r with the plane x3 = ex2 + γ. This is an
ellipse, since e < 1.
REMARKS. If v = 0, the motion of the boat is a circle. We have thus
in particular shown that among smooth curves of fixed length, a circle en-
closes the maximum area. (More precisely, we have shown that if there
exists a smooth minimizer, it is a circle.) Compare this assertion with the
isoperimetric problem discussed on page 25. 

4.4.6. Optimal harvesting.


A simple ODE model for the population x of, say, fish in a lake is
 x
ẋ = γx 1 − ,
k
where γ > 0 is the growth rate and k > 0 is the long term carrying capacity.
As t → ∞ all positive solutions converge to the equilibrium level k.
Suppose now that we continually harvest the populations:
 x
ẋ = γx 1 − − qαx,
k
where 0 ≤ α ≤ 1 represents our fishing effort and q > 0 its effectiveness.
Thus h = qαx is the harvest amount. The corresponding economic payoff
over a given time period [0, T ] is therefore
Z T
P [α(·)] = ph − θα dt
0
where the constant p > 0 is the fixed price for fish and the constant θ > 0
represents a cost rate for our fishing efforts. We suppose the initial fish
population is x0 , and we also require that after the fishing season is over,
the population of fish in the lake should be restored to a prescribed level x1 .
116 4. OPTIMAL CONTROL THEORY

We rescale and make various simplifying choices of the parameters, to


reduce to the dynamics
(
ẋ = x(1 − x) − qαx
(ODE)
x(0) = x0 , x(T ) = x1 .
The payoff functional is
Z T
(P) P [α(·)] = (x − θ)α dt.
0
We will assume
1
(4.35) 0 < θ < 1,
q> .
2
We say that a control α(·) is admissible if it satisfies the constraints 0 ≤
α(·) ≤ 1 and the corresponding solution x(·) of the differential equation in
(ODE) with x(0) = x0 satisfies the terminal condition x(T ) = x1 .
We wish to characterize an optimal fishing effort that maximizers the
payoff, subject to the dynamics (ODE). This is a fixed time, fixed endpoint
problem.
We assume our problem is normal, and consequently the Hamiltonian is
H = (x(1 − x) − qax)p + (x − θ)a.
Hence the adjoint dynamics are
(ADJ) ṗ = −(1 − 2x − qα)p − α
and the maximization condition is
(M) H(x(t), p(t), α(t)) = max {a(−qp(t)x(t) + x(t) − θ)}.
0≤a≤1

Equilibrium solutions. Let us first look for equilibrium solutions of


the above, that is, solutions of the form x(·) ≡ x∗ , p(·) ≡ p∗ , α(·) ≡ a∗ for
constants x∗ , p∗ , a∗ with 0 < a∗ < 1. This will be an algebra execise. In this
case, (ODE) and (ADJ) imply
(4.36) 1 − x∗ − qa∗ = 0, (1 − 2x∗ − qa∗ )p∗ + a∗ = 0;
and, since 0 < a∗ < 1 solves (M), we must have
(4.37) −qp∗ x∗ + x∗ − θ = 0.
The two equations (4.36) give a∗ = p∗ x∗ . Plugging this into (4.37) yields
a∗ = x∗q−θ . Using this back in (4.36) and simplifying, we find
1+θ 1−θ 1−θ
(4.38) x∗ = , p∗ = , a∗ = .
2 q(1 + θ) 2q
4.4. Applications 117

Owing to (4.35), we have 0 < a∗ < 1, as required: we can interpret α(·) ≡ a∗


as a sustainable fishing policy. Note also, for future reference, that

x∗ gives the maximum of the quadratic function (x − θ)(1 − x).

Most rapid approach path. We propose now to employ the constants


x∗ , p∗ , a∗ to build a general, nonequilibrium solution of our harvesting prob-
lem. To be specific, let us suppose x0 > x∗ and x1 > x∗ . We now find, if we
can, the first time 0 ≤ t1 ≤ T so that the solution of (ODE) with x(0) = x0
and α ≡ 1 on the time interval [0, t1 ] satisfies

x(t1 ) = x∗ .

We also find, if possible, a time t1 ≤ t2 ≤ T so that the solution of solution


of (ODE) with x(T ) = x1 and α ≡ 0 satisfies

x(t2 ) = x∗ .

(We assume that the values of x(·) are between 0 and 1, in appropriate units.
Thus the fish population will rise if α ≡ 0.)

x0
x1

x*

1 a* 0

0 t1 t2 T
Optimal harvesting

We claim now that the optimal control is



1 (0 ≤ t ≤ t1 )

α0 (t) = a∗ (t1 ≤ t ≤ t2 )

0 (t2 ≤ t ≤ T ).

To see this, consider another admissible control α : [0, T ] → [0, 1] and


let x(·) denote the corresponding solution of (ODE).
118 4. OPTIMAL CONTROL THEORY

x0
x1

x*

0 t1 t2 T
Another harvesting plan

Now
T T  
x(1 − x) − ẋ
Z Z
P [α(·)] = (x − θ)α dt = (x − θ) dt.
0 0 qx
The part of integrand that involves ẋ is a null Lagrangian, and consequently
that part of the payoff depends only upon the fixed boundary values x0 and
x1 . That is,
Z T
x−θ x1 − θ log x1 x0 − θ log x0
ẋ dt = − = C;
0 qx q q
and hence
1 T
Z
P [α(·)] = (x − θ)(1 − x) dt − C.
q 0
Let x0 (·) be the dynamics corresponding to α0 (·). Since α0 ≡ 1 on
(0, t1 ), we have x(t) ≥ x0 (t) ≥ x∗ on [0, t1 ]; see the illustration. Since x∗
gives the maximum of the quadratic (x − θ)(1 − x), it follow that
Z t1 Z t1
(x0 − θ)(1 − x0 ) dt ≥ (x − θ)(1 − x) dt.
0 0
Similarly, x(t) ≥ x0 (t) ≥ x∗ on [t2 , T ], and consequently
Z T Z T
(x0 − θ)(1 − x0 ) dt ≥ (x − θ)(1 − x) dt.
t2 t2
Furthermore,
Z t2 Z t2
(x0 − θ)(1 − x0 ) dt ≥ (x − θ)(1 − x) dt,
t1 t1
since x0 (·) ≡ x∗ on this interval and x∗ gives the maximum of the integrand.
Hence
P [α0 (·)] ≥ P [α(·)].
4.5. Proof of PMP 119

ECONOMIC INTERPRETATION. Observe that on [t1 , t2 ] we have a


singular arc (see page 106), since the maximization condition (M) does not
determine there the value of the optimal control.
This example, adapted from Mesterton-Gibbons [MG], is an instance
of what economists call a most rapid approach path. It is optimal to
move as quickly as possible to where x = x∗ and then to stay on this path,
sometimes called a turnpike, as long as possible. 

REMARK. There are many more applications of the PMP discussed in


the texts listed in the References. See in particular Kamien–Schwartz [K-S],
Lee–Markus [L-M] and my old online lecture notes [E]. 

4.5. Proof of PMP


We present next a reasonably complete discussion of the ideas behind the
derivation of the Pontryagin Maximum Principle.

4.5.1. Simple control variations.


Recall that the response x(·) to a given control α(·) is the unique solution
of the system of differential equations:
(
ẋ(t) = f (x(t), α(t)) (t ≥ 0)
(ODE) 0
x(0) = x .
We investigate in this section how certain simple changes in the control affect
the response.

DEFINITION. Fix a time s > 0 and a control parameter value a ∈ A.


Select 0 < ε < s and define then the modified control
(
a if s − ε < t < s
αε (t) =
α(t) otherwise.
We call αε (·) a simple (or needle) variation of α(·).

Let xε (·) be the corresponding response to our system:


(
ẋε (t) = f (xε (t), αε (t)) (t > 0)
(ODEε )
xε (0) = x0 .
We want to understand how our choices of s and a cause xε (·) to differ from
x(·), for small ε > 0.
120 4. OPTIMAL CONTROL THEORY

NOTATION. Define the matrix-valued function A : [0, ∞) → Mn×n by


A(t) = ∇x f (x(t), α(t)).
So the (i, j)-th entry of A(t) is
∂fi
(x(t), α(t)) (i, j = 1, . . . , n).
∂xj


We first quote a standard perturbation assertion for ordinary differential


equations:
LEMMA 4.5.1. Let yε (·) solve the initial-value problem:
(
ẏε (t) = f (yε (t), α(t)) (t ≥ 0)
yε (0) = x0 + εy 0 + o(ε).
Then
(4.39) yε (t) = x(t) + εy(t) + o(ε) as ε → 0,
uniformly for t in compact subsets of [0, ∞), where
(
ẏ(t) = A(t)y(t) (t ≥ 0)
(4.40) 0
y(0) = y .

NOTATION. We write
o(ε)
to denote any expression gε such that

lim = 0.
ε→0 ε
In words, if ε → 0, then gε = o(ε) goes to zero “faster than ε”. 

Returning now to the dynamics (ODEε ), we establish


LEMMA 4.5.2. Assume that s is a point of continuity for the control α(·).
Then we have
(4.41) xε (t) = x(t) + εy(t) + o(ε) as ε → 0,
uniformly for t in compact subsets of [0, ∞), where
y(t) = 0 (0 ≤ t ≤ s)
and
(
ẏ(t) = A(t)y(t) (t ≥ s)
(4.42)
y(s) = ys,
4.5. Proof of PMP 121

for

(4.43) y s = f (x(s), a) − f (x(s), α(s)).

NOTATION. We will sometimes write

y(t) = Y(t, s)y s (t ≥ s)

when (4.42) holds. 

Proof. Clearly xε (t) = x(t) for 0 ≤ t ≤ s − ε. For times s − ε ≤ t ≤ s, we


have
Z t
xε (t) − x(t) = f (xε (r), a) − f (x(r), α(r)) dr.
s−ε
Thus
xε (s) − x(s) = [f (x(s), a) − f (x(s), α(s))]ε + o(ε).

On the time interval [s, ∞), x(·) and xε (·) both solve the same ODE,
but with differing initial conditions given by

xε (s) = x(s) + εy s + o(ε),

for y s defined by (4.43). According then to Lemma 4.5.1, we have

xε (t) = x(t) + εy(t) + o(ε) (t ≥ s),

the function y(·) solving (4.42). 

4.5.2. Fixed time problem.


Terminal payoff problem. We return to our usual dynamics
(
ẋ(t) = f (x(t), α(t)) (0 ≤ t ≤ T )
(ODE) 0
x(0) = x ,
and introduce the terminal payoff functional

(P) P [α(·)] = g(x(T )),

to be maximized. So for now we are taking the running payoff r ≡ 0.


We assume that α0 (·) is an optimal control for this problem, correspond-
ing to the optimal response x0 (·). The control theory Hamiltonian is

H(x, p, a) = f (x, a) · p,

and our task is find p0 : [0, T ] → Rn such that (ADJ), (T) and (M) hold.
122 4. OPTIMAL CONTROL THEORY

We reintroduce the function A(·) = ∇x f (x0 (·), α0 (·)) and the control
variation αε (·), as in the previous section. We now define p0 : [0, T ] → R to
be the unique solution of the terminal-value problem
(
ṗ0 (t) = −AT (t)p0 (t) (0 ≤ t ≤ T )
(4.44)
p0 (T ) = ∇g(x0 (T )).
This gives (ADJ) and (T), and so our goal is to establish the maximization
principle (M).

The main point is that p0 (·) helps us calculate the variation of the
terminal payoff:
LEMMA 4.5.3. Assume 0 < s < T is a point of continuity for α0 (·). Then
we have
d
(4.45) P [αε (·)]|ε=0 = [f (x0 (s), a) − f (x0 (s), α0 (s))] · p0 (s).

Proof. According to Lemma 4.5.2,
P [αε (·)] = g(xε (T )) = g(x0 (T ) + εy(T ) + o(ε)),
where y(·) satisfies (4.42) for
y s = f (x0 (s), a) − f (x0 (s), α0 (s)).
Thus
d
(4.46) P [αε (·)]|ε=0 = ∇g(x0 (T )) · y(T ).

On the other hand, (4.42) and (4.44) imply
d
(p0 (t) · y(t)) = ṗ0 (t) · y(t) + p0 (t) · ẏ(t)
dt
= −AT (t)p0 (t) · y(t) + p0 (t) · A(t)y(t)
= 0.
Hence
∇g(x0 (T )) · y(T ) = p0 (T ) · y(T ) = p0 (s) · y(s) = p0 (s) · y s .
Since y s = f (x0 (s), a)−f (x0 (s), α0 (s)), this identity and (4.46) imply (4.45).


THEOREM 4.5.1. (PMP with no running costs) There exists a func-


tion p0 : [0, T ] → Rn satisfying the adjoint dynamics (ADJ), the maximiza-
tion principle (M) and the terminal condition (T).
4.5. Proof of PMP 123

Proof. The adjoint dynamics and terminal condition are both in (4.44).
To confirm (M), fix 0 < s < T and a ∈ A, as above. Since the mapping
ε 7→ P [αε (·)] for 0 ≤ ε ≤ 1 has a maximum at ε = 0, we deduce from Lemma
4.5.3 that
d
0≥ P [αε (·)] = [f (x0 (s), a) − f (x0 (s), α0 (s)] · p0 (s).

Hence
H(x0 (s), p0 (s), a) = f (x0 (s), a) · p0 (s)
≤ f (x0 (s), α0 (s)) · p0 (s) = H(x0 (s), p0 (s), α0 (s))
for all a ∈ A and each time 0 < s < T that is a point of continuity for α0 (·).
This proves the maximization condition (M). 

General payoff problem. We next extend our analysis, to cover the


case that the payoff functional includes also a running payoff:
Z T
(P) P [α(·)] = r(x(s), α(s)) ds + g(x(T )).
0

The control theory Hamiltonian is now


H(x, p, a) = f (x, a) · p + r(x, a)
and we must manufacture a costate function p0 (·) satisfying (ADJ), (M)
and (T).

Adding a new variable. The trick is to introduce another variable


and thereby convert to the previous case. We consider the function xn+1 :
[0, T ] → R given by
(
ẋn+1 (t) = r(x(t), α(t)) (0 ≤ t ≤ T )
(4.47)
xn+1 (0) = 0,

where x(·) solves (ODE). It follows that


Z T
xn+1 (T ) = r(x(t), α(t)) dt.
0

Introduce next the new notation


   
x1 p1
   .   0    . 
x  .  x p  . 
x̄ = =  .  , x̄0 = , p̄ = =  . ,
xn+1  xn  0 pn+1  pn 
xn+1 pn+1
124 4. OPTIMAL CONTROL THEORY

   
x1 (t) f1 (x, a)
   .     ..
x(t)  ..  f (x, a)

(4.48) x̄(t) = =  , f̄ (x̄, a) = =
 . 
xn+1 (t) r(x, a)

 xn (t)  fn (x, a)
xn+1 (t) r(x, a)
and
(4.49) ḡ(x̄) = g(x) + xn+1 .
Then (ODE) and (4.47) produce the dynamics
(
˙
x̄(t) = f̄ (x̄(t), α(t)) (0 ≤ t ≤ T )
(ODE) 0
x̄(0) = x̄ .
Consequently our control problem transforms into a new problem with no
running payoff and the terminal payoff functional
P̄ [α(·)] = ḡ(x̄(T )).
THEOREM 4.5.2. (PMP for fixed time, free endpoint problem)
There exists a function p0 : [0, T ] → Rn satisfying the adjoint dynamics
(ADJ), the maximization principle (M) and the terminal condition (T).

Proof. We apply Theorem 4.5.1, to obtain p̄0 : [0, T ] → Rn+1 satisfying


(M) for the Hamiltonian
H̄(x̄, p̄, a) = f̄ (x̄, a) · p̄.
Also the adjoint equations (ADJ) hold, with the terminal transversality
condition
p̄0 (T ) = ∇ḡ(x̄0 (T )).
But f̄ does not depend upon the variable xn+1 , and consequently the last
equation in (ADJ) reads
ṗn+1
0 (t) = − ∂x∂n+1

= 0.
∂ḡ
Since ∂xn+1 = 1, we deduce that
p0n+1 (t) ≡ 1.
As the last component of the vector function f̄ is r, we then conclude from
(8.11) that
H̄(x̄, p̄, a) = f (x, a) · p + r(x, a) = H(x, p, a).
Therefore
p01 (t)
 

p0 (t) =  ... 
 

p0n (t)
4.5. Proof of PMP 125

satisfies (ADJ), (M) for the Hamiltonian H. 

4.5.3. Multiple control variations.

Proving the PMP for the free time, fixed endpoint problem is much
more difficult, since the result of a simple variation as above may produce
a response xε (·) that never hits the target point x1 . We consequently need
to introduce more complicated control variations, discussed in this section.

DEFINITION. Let us select times 0 < s1 < s2 < · · · < sN , numbers


λ1 , . . . , λN > 0, and also control parameters a1 , a2 , . . . , aN ∈ A. Write
(
ak if sk − λk ε ≤ t < sk (k = 1, . . . , N )
(4.50) αε (t) =
α(t) otherwise,
for ε > 0 taken so small that the intervals [sk − λk ε, sk ] do not overlap.
This is called a multiple variation of the control α(·).

We assume for this section that x(·) solves


(
ẋ(t) = f (x(t), α(t)) (0 ≤ t ≤ T )
(4.51) 0
x(0) = x .
for some piecewise continuous control α(·), and that xε (·) is the response to
αε (·):
(
ẋε (t) = f (xε (t), αε (t)) (0 ≤ t ≤ T )
(4.52) 0
xε (0) = x .

NOTATION. (i) Define


y sk = f (x0 (sk ), ak )) − f (x0 (sk ), α0 (sk ))
for k = 1, . . . , N .
(ii) As before, set A(·) = ∇x f (x0 (·), α0 (·)) and write
y(t) = Y(t, s)y s (t ≥ s)
to denote the solution of
(
ẏ(t) = A(t)y(t) (t ≥ s)
y(s) = ys,
where y s ∈ Rn is given.

Suitably modifying the proof of Lemma 4.5.2, we can establish


126 4. OPTIMAL CONTROL THEORY

LEMMA 4.5.4. We have


(4.53) xε (t) = x(t) + εy(t) + o(ε) as ε → 0,
uniformly for t in compact subsets of [0, ∞), where

y(t) = P
 0 (0 ≤ t ≤ s1 )
m s
y(t) = k=1 λk Y(t, sk )y k (sm ≤ t ≤ sm+1 , m = 1, . . . , N − 1)
 PN

y(t) = k=1 λk Y(t, sk )y s k (sN ≤ t).

DEFINITION. The cone of variations at time T is the set


N
nX
K(T ) = λk Y(T, sk )y sk | N = 1, 2, . . . ,
k=1
o
λk > 0, ak ∈ A, 0 < s1 ≤ · · · ≤ sN < T .

Observe that K(T ) is a convex cone in Rn , which according to Lemma


4.5.4 comprises all changes in the state x(T ), up to order ε, that we can
effect by multiple variations of the control α(·).

4.5.4. Fixed endpoint problem.


We turn now to the free time, fixed endpoint problem, characterized by
the constraint
x(T ) = x1 ,
where T = T [α(·)] is the first time that x(·) hits the given target point. The
payoff functional is
Z T
P [α(·)] = r(x(s), α(s)) ds.
0

Adding a new variable. As before, we introduce the function xn+1 :


[0, T ] → R defined by (4.47) and recall the notation (4.48), (4.49), with
ḡ(x̄) = xn+1 .

Our problem is therefore to find controlled dynamics satisfying


(
˙
x̄(t) = f̄ (x̄(t), α(t)) (0 ≤ t ≤ T )
(ODE)
x̄(0) = x̄0 ,
and maximizing
(P) ḡ(x̄(T )) = xn+1 (T ),
T being the first time that x(T ) = x1 . In other words, the first n components
of x̄(T ) are prescribed, and we want to maximize the (n + 1)-th component.
4.5. Proof of PMP 127

We assume that α0 (·) is an optimal control for this problem, corre-


sponding to the optimal trajectory x0 (·) and the time T0 ; our task is to
construct the corresponding costate p0 (·), satisfying (ADJ) and the maxi-
mization principle (M).

Using the cone of variations. Our program for building the costate
depends upon our taking multiple variations, as in the previous section, and
understanding the resulting cone of variations K = K(T0 ).
Let K 0 denote the (perhaps empty) interior of K. Put
en+1 = [0 · · · 0 1]T ∈ Rn+1 .
Here is the key observation:
LEMMA 4.5.5. We have
(4.54) en+1 ∈
/ K 0.

Proof. 1. If (4.54) were false, there would exist n + 1 linearly independent


vectors z 1 , . . . , z n+1 ∈ K such that
n+1
X
n+1
e = λk z k
k=1

with constants λk > 0 and


z k = Y(T0 , sk )ȳ sk
for appropriate times 0 < s1 < s1 < · · · < sn+1 < T0 , where
ȳ sk = f̄ (x̄(sk ), ak )) − f̄ (x̄(sk ), α(sk )) (k = 1, . . . , n + 1).

2. We will next construct a control αε (·), having the multiple variation


form (4.50), with corresponding response x̄ε (·) = [xε (·)T xεn+1 (·)]T satisfying
(4.55) xε (T0 ) = x1
and
(4.56) xεn+1 (T0 ) > x0n+1 (T0 ).
This will be a contradiction to the optimality of the control α0 (·).

3. Introduce for small η > 0 the closed and convex set


n+1
( )
X
C= z= λk z k | 0 ≤ λk ≤ η .
k=1

Since the vectors z 1 , . . . , z n+1 are independent, C has an interior.


128 4. OPTIMAL CONTROL THEORY

Now define for small ε > 0 the mapping


(4.57) Φε : C → Rn+1
by setting
Φε (z) = x̄ε (T0 ) − x̄0 (T0 )
Pn+1
for z = k=1 λk z k , where x̄ε (·) solves (4.52) for the control αε (·) defined by
(4.50).

We assert that if µ, η, ε > 0 are small enough, then we can solve the
nonlinear equation
(4.58) Φε (z) = µen+1 = [0 · · · 0 µ]T
for some z ∈ C. To see this, note that
|Φε (z) − z| = |x̄ε (T0 ) − x̄0 (T0 ) − z|
= o(|z|)
< |z − µen+1 | for all z ∈ ∂C.

We now apply the topological theorem from Appendix F, to find a point


z ∈ C satisfying (4.58). Then
x̄ε (T0 ) = x̄0 (T0 ) + µen+1 ,
and hence (4.55), (4.56) hold. This gives the desired contradiction, provided
en+1 ∈ K 0 . 

Existence of the costate.

THEOREM 4.5.3. (PMP for free time, fixed endpoint problem)


There exists a function p0 : [0, T0 ] → Rn and a number
q0 = 0 or 1
satisfying the statements (ADJ), (M) and (T) of the free time, fixed endpoint
PMP.

Proof. 1. Since en+1 ∈/ K 0 according to Lemma 4.5.5, there exists a nonzero


vector w ∈ R n+1 such that
(4.59) w·z ≤0 for all z ∈ K
and
(4.60) w · en+1 = wn+1 ≥ 0.
4.5. Proof of PMP 129

Separating en+1 from K

Let p̄0 (·) solve (ADJ), with the terminal condition


p̄0 (T0 ) = w.
Then
(4.61) pn+1
0 (t) = wn+1 ≥ 0 (0 ≤ t ≤ T0 ).

Fix any time 0 ≤ s < T0 , any control value a ∈ A, and set


ȳ s = f̄ (x̄0 (s), a) − f̄ (x̄0 (s), α0 (s)).
Now solve (
˙
ȳ(t) = Ā(t)ȳ(t) (s ≤ t ≤ T0 )
ȳ(s) = ȳ s ;
so that as before
0 ≥ w · ȳ(T0 ) = p̄0 (T0 ) · ȳ(T0 ) = p̄0 (s) · ȳ(s) = p̄0 (s) · ys .
Therefore
p̄0 (s) · [f̄ (x̄0 (s), a) − f̄ (x̄0 (s), α0 (s))] ≤ 0;
and then
H̄(x̄0 (s), p̄0 (s), a) = f̄ (x̄0 (s), a) · p̄0 (s)
(4.62) ≤ f̄ (x̄0 (s), α0 (s)) · p̄0 (s)
= H̄(x̄0 (s), p̄0 (s), α0 (s)),
for the Hamiltonian
H̄(x̄, p̄, a) = f̄ (x̄, a) · p̄.

2. We now must address two situations, according to whether


(4.63) wn+1 > 0
130 4. OPTIMAL CONTROL THEORY

or
(4.64) wn+1 = 0.
When (4.63) holds, we can divide p0 (·) by wn+1 , to reduce to the case that
q0 = p0n+1 ≡ 1.
Then (4.62) provides the maximization principle (M). If instead (4.64) holds,
we have an abnormal problem, for which
q0 = p0n+1 ≡ 0.


An abnormal problem

REMARK. We have not proved the condition (T) that


H(x0 (t), p0 (t), α0 (t)) = 0 (0 ≤ t ≤ T0 )
for the free time, fixed endpoint problem. This is in fact rather subtle: see
the book [L-M] of Lee and Markus for details. For more advanced students,
I recommend the book of Bressan and Piccoli [B-P], which provides a fully
rigorous and detailed proof of the PMP. 
Chapter 5

DYNAMIC
PROGRAMMING

5.1. Hamilton-Jacobi-Bellman equation


We next show how to use value functions in optimal control theory, within
the context of dynamic programming. This approach depends upon the
mathematical idea that it is sometimes easier to solve a given problem by
incorporating it within a larger class of problems.
We want to adapt some version of this insight to the vastly complicated
setting of control theory. For this, fix a terminal time T > 0 and then look
at the controlled dynamics
(
ẋ(s) = f (x(s), α(s)) (0 < s < T )
x(0) = x0 ,

with the associated payoff functional


Z T
P [α(·)] = r(x(s), α(s)) ds + g(x(T ))
0

for the free endpoint problem.


The new idea is to embed this into a larger family of similar problems,
by varying the starting times and starting points:
(
ẋ(s) = f (x(s), α(s)) (t ≤ s ≤ T )
(ODE)
x(t) = x,

131
132 5. DYNAMIC PROGRAMMING

Z T
(P) Px,t [α(·)] = r(x(s), α(s)) ds + g(x(T )).
t
We will consider the above problems for all choices of starting times 0 ≤ t ≤
T and all initial points x ∈ Rn .

5.1.1. Derivation.
DEFINITION. For x ∈ Rn , 0 ≤ t ≤ T , we define the value function
v : Rn × [0, T ] → R to be the greatest payoff possible if we start at x ∈ Rn
at time t. In other words,
v(x, t) = sup Px,t [α(·)]
α(·)∈A

for x ∈ Rn , 0 ≤ t ≤ T .
REMARK. Then
v(x, T ) = g(x) (x ∈ Rn ),
since if we start at time T at the point x, we must immediately stop and so
collect the payoff g(x). 

Our task in this section is to show that the value function v so defined
satisfies a certain nonlinear partial differential equation. Our derivation will
be based upon the reasonable principle that it is better to act optimally from
the beginning, rather than to act arbitrarily for a while and then later act
optimally. We will convert this philosophy of life into mathematics.
To simplify, we hereafter suppose that the set A of control parameter
values is closed and bounded.

THEOREM 5.1.1. Assume that the value function v is a continuously


differentiable function of the variables (x, t). Then v solves the Hamilton–
Jacobi–Bellman partial differential equation
∂v
(HJB) (x, t) + max {f (x, a) · ∇x v(x, t) + r(x, a)} = 0
∂t a∈A

for x ∈ Rn , 0 ≤ t < T , with the terminal condition


(5.1) v(x, T ) = g(x) (x ∈ Rn ).

DEFINITION. The PDE Hamiltonian is


H(x, p) = max H(x, p, a) = max {f (x, a) · p + r(x, a)}
a∈A a∈A
5.1. Hamilton-Jacobi-Bellman equation 133

where x, p ∈ Rn . Hence we can write (HJB) as


∂v
+ H(x, ∇x v) = 0 in Rn × [0, T ].
∂t


Proof. 1. Let x ∈ Rn , 0 ≤ t < T , and note that, as always in this course,


A = {α(·) : [0, T ] → A | α(·) is piecewise continuous}.
Pick any parameter a ∈ A and let ε > 0 be so small that t + ε ≤ T . Suppose
we start at x at time t, and use the constant control
α(s) = a
for times t ≤ s ≤ t + ε. The dynamics then arrive at the point x(t + ε).
Suppose now that we switch to an optimal control (assuming it exists) and
employ it for the remaining times t + ε ≤ s ≤ T .
What is the payoff of this procedure? Now for t ≤ s ≤ t + ε, we have
(
ẋ(s) = f (x(s), a)
(5.2)
x(t) = x.
R t+ε
The payoff for this time period is t r(x(s), a) ds. Furthermore, the payoff
incurred from time t + ε to T is v(x(t + ε), t + ε), according to the definition
of the payoff function v(·). Hence the total payoff is
Z t+ε
r(x(s), a) ds + v(x(t + ε), t + ε).
t

But the greatest possible payoff if we start from (x, t) is v(x, t). Therefore
Z t+ε
r(x(s), a) ds + v(x(t + ε), t + ε) ≤ v(x, t).
t

2. Next rearrange (5.5) and divide by ε > 0:


v(x(t + ε), t + ε) − v(x, t) 1 t+ε
Z
+ r(x(s), a) ds ≤ 0.
ε ε t
Hence
∂v
(x, t) + ∇x v(x(t), t) · ẋ(t) + r(x(t), a) ≤ o(1),
∂t
as ε → 0. But recall that x(·) solves (5.2). We employ this above and send
ε → 0, to discover
∂v
(x, t) + f (x, a) · ∇x v(x, t) + r(x, a) ≤ 0.
∂t
134 5. DYNAMIC PROGRAMMING

This inequality holds for all control parameters a ∈ A, and consequently


 
∂v
(5.3) max (x, t) + f (x, a) · ∇x v(x, t) + r(x, a) ≤ 0.
a∈A ∂t

3. We next demonstrate that in fact the maximum above equals zero.


To see this, suppose α0 (·), x0 (·) are optimal for the problem above. Let us
utilize the optimal control α0 (·) for t ≤ s ≤ t + ε. The payoff is
Z t+ε
r(x0 (s), α0 (s)) ds
t

and the remaining payoff is v(x0 (t + ε), t + ε). Consequently, the total payoff
is
Z t+ε
r(x0 (s), α0 (s)) ds + v(x0 (t + ε), t + ε) = v(x, t).
t
Rearrange and divide by ε:
t+ε
v(x0 (t + ε), t + ε) − v(x, t) 1
Z
+ r(x0 (s), α0 (s)) ds = 0.
ε ε t

Let ε → 0 and suppose α0 (t) = a0 ∈ A. Then


∂v
(x, t) + ∇x v(x, t) · f (x, a0 ) + r(x, a0 ) = 0.
∂t
This and (5.3) confirm that v solves the Hamilton-Jacobi-Bellman PDE. 

REMARK. Dynamic programming applies as well to free time, fixed end-


point optimal control problems. To be specific, let us suppose that we are
required to steer from a point x ∈ Rn to the origin under the dynamics
(
ẋ(s) = f (x(s), α(s)) (0 ≤ s ≤ T )
(ODE)
x(0) = x, x(T ) = 0,

so as to maximize the payoff


Z T
(P) Px [α(·)] = r(x(s), α(s)) ds.
0

Here the terminal time T is free.


The corresponding value function is

v(x) = sup Px [α(·)] (x ∈ Rn ).


α(·)∈A
5.1. Hamilton-Jacobi-Bellman equation 135

Arguing as above, we discover that if the value function v is continuously dif-


ferentiable, it solves the (stationary) Hamilton-Jacobi-Bellman equa-
tion
(HJB) max {f (x, a) · ∇v(x) + r(x, a)} = 0
a∈A

for x ∈ Rn \ {0} and


v(0) = 0.


5.1.2. Optimality.
HOW TO USE DYNAMIC PROGRAMMING
For fixed time optimal control problems as in Section 5.1.1, we carry out
these steps to synthesize an optimal control:
Step 1: Try to solve the HJB equation, with the terminal condition
(5.1), and thereby find the value function v.
Step 2: Use the value function v and the Hamilton–Jacobi–Bellman
PDE to design an optimal control α0 (·), as follows. Define for each point
y ∈ Rn and each time 0 ≤ s ≤ T ,
a(y, s) ∈ A
to be a parameter value where the maximum in HJB is attained at the point
(y, s). In other words, select a(y, s) ∈ A so that
∂v
(5.4) (y, s) + f (y, a(y, s)) · ∇x v(y, s) + r(y, a(y, s)) = 0.
∂t
Step 3: Next, solve the following ODE, assuming a(·) is sufficiently
regular to do so:
(
ẋ0 (s) = f (x0 (s), a(x0 (s), s)) (t ≤ s ≤ T )
(5.5)
x0 (t) = x.
Step 4: Finally, define the optimal feedback control
(5.6) α0 (s) = a(x0 (s), s) (t ≤ s ≤ T ),
so that we can rewrite (5.5) as
(
ẋ0 (s) = f (x0 (s), α0 (s)) (t ≤ s ≤ T )
x0 (t) = x.
In particular, if the state of system is y at time s, we use the control which
at time s takes on a parameter value a = a(y, s) ∈ A for which the maximum
in HJB is obtained.
136 5. DYNAMIC PROGRAMMING

THEOREM 5.1.2. For each starting time 0 ≤ t < T and initial point
x ∈ Rn , the control α0 (·) defined by (5.5) and (5.6) is optimal.

Proof. We have
Z T
Px,t [α0 (·)] = r(x0 (s), α0 (s)) ds + g(x0 (T )).
t

Then (5.4) and (5.6) imply


Z T 
∂v
Px,t [α0 (·)] = − (x0 (s), s) − f (x0 (s), α0 (s)) · ∇x v(x0 (s), s) ds + g(x0 (T ))
t ∂t
Z T
∂v
=− (x0 (s), s) + ∇x v(x0 (s), s) · ẋ0 (s) ds + g(x0 (T ))
t ∂t
Z T
d
=− v(x0 (s), s) ds + g(x0 (T ))
t ds
= −v(x0 (T ), T ) + v(x0 (t), t) + g(x0 (T ))
= v(x, t)
= sup Px,t [α(·)].
α(·)∈A

Hence α0 (·) is optimal, as asserted. 

REMARKS.
(i) Notice that v acts here as a calibration function that we use to es-
tablish optimality.
(ii) We can similarly design optimal controls for free time problems by
solving the stationary HJB equation. 

5.2. Applications
Applying dynamic programming is usually quite tricky, as it requires us to
solve a nonlinear PDE and this is often very difficult. The main hope, as
we will see in the following examples, is to try to guess the form of v, to
plug this guess into the HJB equation and then to adjust various constants
and auxiliary functions, to ensure that we have an actual solution. (Alter-
natively, we could compute the solution of the terminal-value problem for
HJB numerically.)
To simplify notation will not write the subscripts “0” in the subsequent
examples.
5.2. Applications 137

5.2.1. General linear-quadratic regulator. For this important prob-


lem, we are given matrices M, B, D ∈ Mn×n , N ∈ Mn×m , C ∈ Mm×m ; and
assume
B, C, D are symmetric,
with
B, D  0, C  0.
In particular, C is invertible.
We take the linear dynamics
(
ẋ(s) = M x(s) + N α(s) (t ≤ s ≤ T )
(ODE)
x(t) = x,
for which we want to minimize the quadratic cost functional
Z T
x(s)T Bx(s) + α(s)T Cα(s) ds + x(T )T Dx(T ).
t
So we must maximize the payoff
Z T
(P) Px,t [α(·)] = − x(s)T Bx(s) + α(s)T Cα(s) ds − x(T )T Dx(T ).
t
The control values are unconstrained, meaning that the control parameter
values can range over all of A = Rm .

We employ dynamic programming to design an optimal control. To


carry out this plan, we first figure out the structure of the HJB equation
(
∂v
∂t + maxa∈R {f · ∇x v + r} = 0
m in Rn × [0, T ]
v=g on Rn × {t = T },
for
f = M x + N a, r = −xT Bx − aT Ca, g = −xT Dx.
We rewrite the PDE as
∂v
(5.7) + max {(∇v)T N a − aT Ca} + (∇v)T M x − xT Bx = 0,
∂t a∈Rm
and note that we have the terminal condition
(5.8) v(x, T ) = −xT Dx.

To compute the maximum above, define


Q(a) = (∇v)T N a − aT Ca,
and solve
n
∂Q X
= vxi nij − 2ai cij = 0 (j = 1, . . . , n).
∂aj
i=1
138 5. DYNAMIC PROGRAMMING

Then (∇x v)T N = 2aT C, and thus 2Ca = N T ∇x v, since C is symmetric.


Therefore
1
(5.9) a = C −1 N T ∇x v.
2
This is the point at which the maximum in HJB occurs, which we now insert
into (5.7):
∂v 1
(5.10) + (∇v)T N C −1 N T ∇v + (∇v)T M x − xT Bx = 0.
∂t 4

Solving the HJB equation. Our task now is to solve this nonlinear PDE,
with the terminal condition (5.8). We guess that our solution has the form
(5.11) v(x, t) = xT K(t)x
for some appropriate symmetric n×n-matrix valued function K(·) for which
(5.12) K(T ) = −D.
Let us compute
∂v
(5.13) = xT K̇(t)x, ∇x v = 2K(t)x.
∂t
We now insert our guess v = xT K(t)x into (5.10), to discover that
xT {K̇(t) + K(t)N C −1 N T K(t) + 2K(t)M − B}x = 0.
Since
2xT KM x = xT KM x + [xT KM x]T
= xT KM x + xT M T Kx,
the foregoing becomes
xT {K̇ + KN C −1 N T K + KM + M T K − B}x = 0.

This identity will hold provided K(·) satisfies the matrix Riccati equa-
tion
(R) K̇(t) + K(t)N C −1 N T K(t) + K(t)M + M T K(t) = B
on the interval [0, T ].

In summary, once we solve the Riccati equation (R) with the terminal
condition (5.12), we can then use (5.9) and (5.13) to construct the optimal
feedback control
α0 (t) = C −1 N T K(t)x0 (t).

5.2. Applications 139

5.2.2. Rocket railway car. In view of our discussion on page 98, the
rocket railway problem is actually quite easy to solve. However it is also
instructive to see how dynamic programming applies.
The equations of motion are
   
0 1 0
ẋ = x+ α, −1 ≤ α ≤ 1
0 0 1
for n = 2, and
Z T
Px [α(·)] = − time to reach the origin = − 1 dt = −T.
0

Then the value function v(x) is minus the least time it takes to get to
the origin from the point x = [x1 x2 ]T ; and the corresponding stationary
HJB equation is
max{f · ∇v + r} = 0
a∈A
for 
x2
A = [−1, 1], f= , r = −1.
a
Therefore  
∂v ∂v
max x2 +a − 1 = 0;
|a|≤1 ∂x1 ∂x2
and consequently the Hamilton-Jacobi-Bellman equation is
(
∂v ∂v
x2 ∂x 1
+ ∂x2
=1 in R2 \ {0}
(HJB)
v(0) = 0.

Solution of HJB equation. We introduce the regions


I := {(x1 , x2 ) | x1 > − 12 x2 |x2 |},
II := {(x1 , x2 ) | x1 < − 21 x2 |x2 |},
and define

−x − 2 x + 1 x2  21 in Region I
2 1 2 2
(5.14) v(x) = 1
x2 − 2 −x1 + 1 x2  2 in Region II.
2 2

We could have derived this formula for v using the ideas in the next example,
but for now let us just show that v really solves HJB.
In Region I we compute
− 12 − 12
x22 x22
 
∂v ∂v
= − x1 + , = −1 − x1 + x2 ;
∂x1 2 ∂x2 2
140 5. DYNAMIC PROGRAMMING

and check that there


∂v
< 0.
∂x2

Hence in Region I we have

− 12 " − 12 #
x22 x22
 
∂v ∂v
x2 + = −x2 x1 + + 1 + x2 x1 + = 1.
∂x1 ∂x2 2 2

This confirms that our HJB equation holds in Region I, and a similar cal-
culation holds in Region II, owing to the symmetry condition

v(−x) = v(x) (x ∈ R2 ).

Now let Γ denote the boundary between Regions I and II. Since
(
∂v <0 in Region I
∂x2 >0 in Region II

and
∂v
=0 on Γ,
∂x2

our function v defined by (5.14) does indeed solve the nonlinear HJB partial
differential equation.

GEOMETRIC INTERPRETATION. It is in fact easy to find optimal


trajectories for this problem. Indeed, when α = ±1, then
 
d 1 2
x1 ∓ (x2 ) = x2 ∓ x2 (±1) = 0.
dt 2

x22
Thus any solution of (ODE) moves along a parabola of the form x1 = 2 +C
x2
when α = 1, and along a parabola of the form x1 = − 22
+ C when α = −1.
Since we know from the PMP that an optimal control changes sign at most
once (see page 98), an optimal trajectory must move along one such family
of parabolas, and change to the other family of parabolas at most once, at
the switching curve Γ given by the formula x1 = − 12 |x2 |x2 . The picture
illustrates a typical optimal trajectory.
5.2. Applications 141

x2

x1

Optimal path for rocket railway car

So we did not need to invoke the full majesty of dynamic programming


to solve this simple problem, but the next example will build upon these
ideas. 

5.2.3. Fuller’s problem, chattering controls. We take the same equa-


tions of motion as in the previous example
   
0 1 0
ẋ = x+ α, −1 ≤ α ≤ 1,
0 0 1

but make a simple change in the payoff functional (which will have a pro-
found effect on optimal controls and trajectories). So let us now take

1 T
Z
Px [α(·)] = − (x1 )2 dt.
2 0
Thus
 
x 1
A = [−1, 1], f= 2 , r = − x21 ;
a 2
and the stationary HJB equation for the value function

v(x) = sup Px [α(·)]


α(·)∈A
142 5. DYNAMIC PROGRAMMING

is now
(
∂v ∂v
x2 ∂x1
+ ∂x2 = 12 x21 in R2 \ {0}
(HJB)
v(0) = 0.

Solving the HJB equation. Using the definition of the value function,
we can check that it satisfies the two symmetry conditions:
(5.15) v(−x) = v(x) (x ∈ R2 )
and
(5.16) v(λ2 x1 , λx2 ) = λ5 v(x1 , x2 ) (x ∈ R2 , λ > 0).

To verify (5.16), suppose that x = [x1 x2 ]T is an optimal trajectory


starting at the point x0 = [x01 x02 ]T , corresponding to an optimal control α.
Then if λ > 0 and we start instead at the point x0λ = [λ2 x01 λx02 ]T , optimal
trajectories and control are
xλ (t) = [λ2 x1 ( λt ) λx2 ( λt )]T , αλ (t) = α( λt ).
Since
Tλ Tλ
λ4
Z Z
1
P [αλ (·)] = − (x1λ )2 dt = − (x1 )2 ( λt )dt = λ5 P [α(·)],
2 0 2 0
the scaling identity (5.16) holds.

Now (5.16) and the previous example suggest that the optimal switching
should occur on the boundary Γ between two regions of the form
I := {(x1 , x2 ) | x1 > −βx2 |x2 |},
II := {(x1 , x2 ) | x1 < −βx2 |x2 |},
for some as yet unknown constant β > 0.

Still motivated by the previous example, we look a function v for which


the scaling symmetry (5.16) holds,
∂v ∂v
(5.17) < 0 in Region I, = 0 on Γ,
∂x2 ∂x2
and v solves the linear PDE
∂v ∂v 1
(5.18) x2 − = (x1 )2 in Region I.
∂x1 ∂x2 2
In view (5.16), let us start by looking for a particular solution of (5.18)
having the polynomial form
v = Ax52 + Bx1 x32 + Cx21 x2 .
5.2. Applications 143

If we plug this guess into (5.18) and match coefficients, we discover that
1 1 1
v = − x52 − x1 x32 − x21 x2
15 3 2
is a solution. Now the general solution of the linear, homogeneous PDE
∂w ∂w
x2 − =0
∂x1 ∂x2
has the form  
1
w = f x1 + x22 .
2
Hence the general solution of (5.18) is
$$v = -\frac{1}{15}x_2^5 - \frac{1}{3}x_1x_2^3 - \frac{1}{2}x_1^2x_2 + f\left(x_1 + \frac{1}{2}x_2^2\right).$$
In order to satisfy the scaling condition (5.16), we take f to be homogeneous (of degree 5/2) and so have
$$\text{(5.19)} \qquad v = -\frac{1}{15}x_2^5 - \frac{1}{3}x_1x_2^3 - \frac{1}{2}x_1^2x_2 - \gamma\left(x_1 + \frac{1}{2}x_2^2\right)^{\frac{5}{2}}$$
for another as yet unknown constant γ > 0.
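One can confirm this symbolically (a sketch, assuming sympy is available): for every value of the constant γ, the function (5.19) solves the linear PDE (5.18).

    # verify that (5.19) satisfies x2*v_x1 - v_x2 = x1**2/2 identically
    import sympy as sp

    x1, x2, gamma = sp.symbols('x1 x2 gamma', positive=True)
    v = (-sp.Rational(1, 15)*x2**5 - sp.Rational(1, 3)*x1*x2**3
         - sp.Rational(1, 2)*x1**2*x2 - gamma*(x1 + x2**2/2)**sp.Rational(5, 2))
    print(sp.simplify(x2*sp.diff(v, x1) - sp.diff(v, x2) - x1**2/2))   # prints 0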
We want next to adjust the constants β, γ so that ∂v/∂x2 = 0 on Γ. Now
$$\text{(5.20)} \qquad \frac{\partial v}{\partial x_2} = -\frac{1}{3}x_2^4 - x_1x_2^2 - \frac{1}{2}x_1^2 - \frac{5\gamma}{2}\left(x_1 + \frac{1}{2}x_2^2\right)^{\frac{3}{2}}x_2.$$
Therefore on $\Gamma^+ = \{x_1 = -\beta x_2^2,\ x_2 > 0\}$, we have
$$\text{(5.21)} \qquad \frac{\partial v}{\partial x_2} = x_2^4\left[-\frac{1}{3} + \beta - \frac{1}{2}\beta^2 - \frac{5\gamma}{2}\left(-\beta + \frac{1}{2}\right)^{\frac{3}{2}}\right];$$
and on $\Gamma^- = \{x_1 = \beta x_2^2,\ x_2 < 0\}$, we have
$$\text{(5.22)} \qquad \frac{\partial v}{\partial x_2} = x_2^4\left[-\frac{1}{3} - \beta - \frac{1}{2}\beta^2 + \frac{5\gamma}{2}\left(\beta + \frac{1}{2}\right)^{\frac{3}{2}}\right].$$
Consequently, we need to select β, γ so that
$$\text{(5.23)} \qquad \begin{cases} -\frac{1}{3} + \beta - \frac{1}{2}\beta^2 - \frac{5\gamma}{2}\left(-\beta + \frac{1}{2}\right)^{\frac{3}{2}} = 0 \\[4pt] -\frac{1}{3} - \beta - \frac{1}{2}\beta^2 + \frac{5\gamma}{2}\left(\beta + \frac{1}{2}\right)^{\frac{3}{2}} = 0. \end{cases}$$

Solving each equation for γ, we see that (5.23) implies
$$\varphi(\beta) = \left(\beta + \frac{1}{2}\right)^{\frac{3}{2}}\left(-\frac{1}{3} + \beta - \frac{1}{2}\beta^2\right) - \left(-\beta + \frac{1}{2}\right)^{\frac{3}{2}}\left(\frac{1}{3} + \beta + \frac{1}{2}\beta^2\right) = 0.$$
Since
$$\varphi(0) < 0, \qquad \varphi(\tfrac{1}{2}) > 0,$$

there exists 0 < β < 1/2 such that φ(β) = 0, by the Intermediate Value Theorem. We can then find γ > 0 so that β, γ solve (5.23). A further calculation confirms that for these choices, ∂v/∂x2 < 0 within Region I.
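This last step is easy to carry out numerically (a minimal sketch, assuming scipy is available; the printed values are approximate):

    # find the root beta of phi in (0, 1/2) by bisection, then recover gamma
    # from the first equation of (5.23)
    from scipy.optimize import brentq

    phi = lambda b: ((b + 0.5)**1.5 * (-1/3 + b - 0.5*b*b)
                     - (0.5 - b)**1.5 * (1/3 + b + 0.5*b*b))
    beta = brentq(phi, 1e-9, 0.5 - 1e-9)
    gamma = (-1/3 + beta - 0.5*beta**2) / (2.5 * (0.5 - beta)**1.5)
    print(beta, gamma)    # beta ≈ 0.4446, gamma ≈ 0.38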
Using (5.15) to extend our definition of v to all of R2 , we have (at last)
found our solution of the stationary HJB equation.

GEOMETRIC INTERPRETATION. Optimal trajectories for Fuller's problem are more interesting than for the rocket railway car.

[Figure: part of an optimal path for Fuller's problem, drawn in the (x1, x2) phase plane together with the switching curve Γ.]

As before, a solution of (ODE) moves along a parabola of the form
$$x_1 = \frac{x_2^2}{2} + C \qquad \text{(drawn in green)}$$
when α = 1, and along a parabola of the form
$$x_1 = -\frac{x_2^2}{2} + C \qquad \text{(drawn in red)}$$
when α = −1. Furthermore, the optimal control switches from 1 to −1 (or vice versa) at the (blue) switching curve Γ given by the formula $x_1 = -\beta|x_2|x_2$.
But since 0 < β < 1/2, such a trajectory will hit Γ infinitely many times. Consequently, the optimal control α0(·) will switch between ±1 infinitely often before driving the state to the origin at a time T < ∞. We call α0(·) a chattering control. □
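The chattering is easy to observe numerically (a minimal simulation sketch, assuming numpy is available; the bang-bang feedback α = −1 in Region I, α = +1 in Region II comes from the analysis above, and β is the approximate root computed earlier):

    # explicit Euler discretization of (ODE) with the switching-curve feedback
    import numpy as np

    beta = 0.4446                     # approximate root of phi
    dt = 1e-4
    x1, x2, t, switches, a_old = 1.0, 0.0, 0.0, 0, -1.0
    while x1*x1 + x2*x2 > 1e-6 and t < 20.0:
        a = -1.0 if x1 + beta*x2*abs(x2) > 0 else 1.0   # Region I or Region II?
        if a != a_old:
            switches += 1
        a_old = a
        x1, x2 = x1 + dt*x2, x2 + dt*a                  # Euler step
        t += dt
    print(t, switches)   # the switch count grows as the state spirals toward 0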

CLOSING REMARKS. Our detailed discussion of Fuller's problem illustrates how difficult it can be to find an explicit solution of an HJB equation, if this is even possible.
In fact, there are not so many optimal control problems for which exact formulas can be had, using either the Pontryagin maximum principle or dynamic programming. (See the old book of Athans and Falb [A-F] for an extensive discussion of various solvable engineering control problems.) Designing optimal controls is an important, but highly nonlinear and infinite dimensional undertaking, and it is not surprising that exactly solvable problems have been named after the researchers who found them.
It is therefore essential to turn to computational methods for most optimal control problems, and indeed for optimization problems in general. I therefore strongly recommend that students take subsequent classes (mostly offered in engineering) on computational techniques for optimization, with the hope that their understanding of the optimization theory from Math 170 and 195 will make the algorithms and software packages from these courses more understandable. □
APPENDIX

A. Notation
$\mathbb{R}^n$ denotes n-dimensional Euclidean space, a typical point of which is the column vector
$$x = \begin{pmatrix} x_1 \\ \vdots \\ x_n \end{pmatrix}.$$
To save space we will often write this instead as the transposed row vector
$$x = [x_1 \cdots x_n]^T.$$
If $x, y \in \mathbb{R}^n$, we define
$$x \cdot y = \sum_{i=1}^n x_iy_i = x^Ty, \qquad |x| = (x \cdot x)^{1/2} = \left(\sum_{i=1}^n x_i^2\right)^{1/2}.$$

B. Linear algebra
Throughout these notes A denotes a real m × n matrix and $A^T$ denotes its transpose:
$$A = \begin{pmatrix} a_{11} & \dots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \dots & a_{mn} \end{pmatrix}, \qquad A^T = \begin{pmatrix} a_{11} & \dots & a_{m1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \dots & a_{mn} \end{pmatrix}.$$
Recall the rule
$$(AB)^T = B^TA^T.$$
We write $M^{m \times n}$ for the space of all real m × n matrices.

An n × n matrix A is symmetric if $A = A^T$, and a symmetric matrix A is nonnegative definite if
$$y^TAy = \sum_{i,j=1}^n a_{ij}y_iy_j \ge 0 \qquad \text{for all } y \in \mathbb{R}^n.$$
We then write $A \succeq 0$. We say A is positive definite if
$$y^TAy = \sum_{i,j=1}^n a_{ij}y_iy_j > 0 \qquad \text{for all nonzero } y \in \mathbb{R}^n;$$
and write $A \succ 0$.
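As a computational aside (a sketch, assuming numpy is available; the eigenvalue criterion is standard linear algebra, not proved in these notes): a symmetric matrix is positive definite exactly when all of its eigenvalues are positive, which is easy to test numerically.

    import numpy as np

    A = np.array([[2.0, 1.0],
                  [1.0, 2.0]])          # a symmetric test matrix
    print(np.linalg.eigvalsh(A))        # [1. 3.]: all positive, so A is positive definite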

C. Multivariable chain rule


Let f : Rn → R, f = f(x) = f(x1, . . . , xn). Then we write
$$\frac{\partial f}{\partial x_k}$$
for the k-th partial derivative, k = 1, . . . , n. We likewise write
$$\frac{\partial^2 f}{\partial x_k\,\partial x_l} = \frac{\partial}{\partial x_l}\left(\frac{\partial f}{\partial x_k}\right) \qquad (k, l = 1, \dots, n),$$
and recall that $\frac{\partial^2 f}{\partial x_k \partial x_l} = \frac{\partial^2 f}{\partial x_l \partial x_k}$ if f is twice continuously differentiable.

The gradient ∇f is the vector
$$\nabla f = \begin{pmatrix} \frac{\partial f}{\partial x_1} \\ \vdots \\ \frac{\partial f}{\partial x_n} \end{pmatrix}$$
and the Hessian matrix of second partial derivatives $\nabla^2 f$ is the symmetric n × n matrix
$$\nabla^2 f = \begin{pmatrix} \frac{\partial^2 f}{\partial x_1^2} & \cdots & \frac{\partial^2 f}{\partial x_1\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial^2 f}{\partial x_n\partial x_1} & \cdots & \frac{\partial^2 f}{\partial x_n^2} \end{pmatrix}.$$

The chain rule tells us how to compute the partial derivatives of com-
posite functions, made from simpler functions. For this, assume that we are
given a function
f : Rn → R,
which we write as f (x) = f (x1 , . . . , xn ). Suppose also we have functions
g1 , . . . , gn : Rm → R
so that gi (y) = gi (y1 , . . . , ym ) for i = 1, . . . , n.

NOTATION. We define g : Rm → Rn by
$$g = \begin{pmatrix} g_1 \\ \vdots \\ g_n \end{pmatrix}.$$
The gradient matrix of g is
$$\nabla g = \begin{pmatrix} (\nabla g_1)^T \\ \vdots \\ (\nabla g_n)^T \end{pmatrix} = \begin{pmatrix} \frac{\partial g_1}{\partial y_1} & \dots & \frac{\partial g_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_n}{\partial y_1} & \dots & \frac{\partial g_n}{\partial y_m} \end{pmatrix}.$$
This is an n × m matrix-valued function. □

We now build the composite function h : Rm → R by setting xi = gi(y) in the definition of f; that is, we define
$$h(y) = f(g(y)) = f(g_1(y), g_2(y), \dots, g_n(y)).$$

THEOREM. (Multivariable chain rule) We have
$$\frac{\partial h}{\partial y_k}(y) = \sum_{i=1}^n \frac{\partial f}{\partial x_i}(g(y))\,\frac{\partial g_i}{\partial y_k}(y) \qquad (k = 1, \dots, m).$$
In matrix notation, this says
$$\nabla h = (\nabla g)^T\nabla f.$$
EXAMPLE. We use the chain rule to prove the useful formula
$$\nabla|g|^2 = 2(\nabla g)^Tg,$$
where g : Rn → Rn. To prove this, we compute that
$$\frac{1}{2}\frac{\partial}{\partial x_k}|g|^2 = \sum_{i=1}^n g_i\frac{\partial g_i}{\partial x_k} = \text{the } k\text{-th entry of } (\nabla g)^Tg. \qquad \square$$
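As a quick numerical sanity check of this identity (a sketch, assuming numpy is available; the test map g below is a hypothetical example, not from the text):

    import numpy as np

    def g(x):                          # a test map g : R^2 -> R^2
        return np.array([x[0]*x[1], np.sin(x[0])])

    def grad_g(x):                     # gradient matrix: rows are (grad g_i)^T
        return np.array([[x[1], x[0]],
                         [np.cos(x[0]), 0.0]])

    x, h = np.array([0.7, -1.3]), 1e-6
    sq = lambda x: np.dot(g(x), g(x))  # |g|^2
    fd = np.array([(sq(x + h*e) - sq(x - h*e)) / (2*h) for e in np.eye(2)])
    print(fd, 2 * grad_g(x).T @ g(x))  # the two vectors agree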


D. Divergence Theorem
If U ⊂ Rn is an open set with smooth boundary ∂U, we let
$$\nu = \begin{pmatrix} \nu_1 \\ \vdots \\ \nu_n \end{pmatrix}$$
denote the outward pointing unit normal vector field along ∂U. Then |ν| = 1 along ∂U.

Assume also that h : U → Rn, written
$$h = \begin{pmatrix} h_1 \\ \vdots \\ h_n \end{pmatrix},$$
is a vector field. Its divergence is
$$\operatorname{div} h = \nabla \cdot h = \sum_{i=1}^n \frac{\partial h_i}{\partial x_i}.$$

THEOREM. (Divergence Theorem) We have
$$\int_U \operatorname{div} h\, dx = \int_{\partial U} h \cdot \nu\, dS.$$

The expression on the right is an integral with respect to (n−1)-dimensional surface area over the boundary of U.
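For instance (an illustration added here, not in the original notes): take U = B(0, 1) ⊂ R² and h(x) = x. Then div h = 2, so the left-hand side is 2·area(U) = 2π; on ∂U the outward normal is ν = x, so h·ν = |x|² = 1 and the right-hand side is length(∂U) = 2π. The two sides agree.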

E. Implicit Function Theorem


Assume for this section that
$$f : \mathbb{R}^2 \to \mathbb{R}$$
is continuously differentiable. We will write f = f(x, y).

THEOREM. (Implicit Function Theorem) Assume that f(x0, y0) = 0 and
$$\frac{\partial f}{\partial y}(x_0, y_0) \ne 0.$$
(i) Then there exist ε > 0 and a continuously differentiable function
$$g : (x_0 - \varepsilon, x_0 + \varepsilon) \to \mathbb{R}$$
such that
$$f(x, g(x)) = 0 \qquad (x_0 - \varepsilon < x < x_0 + \varepsilon).$$

(ii) Furthermore, for every solution of f(x, y) = 0 with (x, y) sufficiently close to (x0, y0), we have
$$y = g(x).$$

[Figure: the Implicit Function Theorem (drawn for the case ∂f/∂y(x0, y0) > 0): near (x0, y0) the zero set {f = 0} is the graph of g, separating the region {f > 0} from {f < 0}.]
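For example (an illustration added here, not in the original notes): take f(x, y) = x² + y² − 1 and (x0, y0) = (0, 1). Then f(x0, y0) = 0 and ∂f/∂y(0, 1) = 2 ≠ 0, and the theorem produces g(x) = (1 − x²)^{1/2} for x near 0: near (0, 1) the zero set {f = 0} (the unit circle) is exactly the graph of g.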

F. Solving a nonlinear equation


THEOREM. Let C denote a closed, bounded, convex subset of Rn and assume p lies in the interior of C. Suppose Φ : C → Rn is a continuous vector field that satisfies the strict inequalities
$$|\Phi(x) - x| < |x - p| \qquad \text{for all } x \in \partial C.$$
Then there exists a point x ∈ C such that
$$\Phi(x) = p.$$

Proof. 1. Suppose first that C is the unit ball B(0, 1) and p = 0. Squaring the inequality |Φ(x) − x| < |x|, we deduce that
$$\Phi(x) \cdot x > 0 \qquad \text{for all } x \in \partial B(0, 1).$$
Then for small t > 0, the continuous mapping
$$\Psi(x) := x - t\,\Phi(x)$$
maps B(0, 1) into itself, and hence has a fixed point x according to Brouwer's Fixed Point Theorem. Then Φ(x) = 0 = p.

2. For the general case, we can always assume after a translation that p = 0, so that 0 belongs to the interior of C. We introduce a nonnegative gauge function ρ : Rn → [0, ∞) such that ρ(λx) = λρ(x) for all λ ≥ 0 and
$$C = \{x \in \mathbb{R}^n \mid \rho(x) \le 1\}.$$
We next map C onto B(0, 1) by the continuous function
$$y = a(x) = \frac{\rho(x)}{|x|}\,x.$$
Define
$$\Psi(y) = \frac{\rho(y)}{|y|}\,\Phi\left(\frac{|y|}{\rho(y)}\,y\right).$$
Then the inequality |Φ(x) − x| < |x| implies
$$|\Psi(y) - y| < 1 \qquad \text{for all } y \in \partial B(0, 1).$$
Consequently the first part of the proof shows that there exists y ∈ B(0, 1) such that Ψ(y) = 0. And then
$$x = \frac{|y|}{\rho(y)}\,y \in C$$
satisfies Φ(x) = 0 = p. □
EXERCISES

Some of these problems are from Kamien–Schwartz [K-S] and Bressan–Piccoli [B-P].
1. This and the next few problems provide practice for solving certain ODE. Find the general solution of
$$y' = a(x)y + b(x).$$
(Hint: Multiply by $e^{A(x)}$ for an appropriate function A.)
2. (a) If y satisfies
$$y' = a(x)y + b(x),$$
what ODE does $z = y^q$ solve?
(b) Use (a) to solve Bernoulli's equation
$$y' = a(x)y + b(x)y^p.$$

3. Find an implicit formula for the general solution of the separable ODE
$$y' = a(x)b(y),$$
where b > 0.
4. Assume y1, y2 both solve the linear, second-order ODE
$$Ly = y'' + b(x)y' + c(x)y = 0.$$
Find the first-order ODE satisfied by the Wronskian $w = y_2'y_1 - y_1'y_2$.


5. (a) Find two linearly independent solutions y1, y2 of the constant coefficient ODE
$$Ly = ay'' + by' + cy = 0,$$
where a > 0. (Hint: Try $y = e^{\lambda x}$. What if λ is a repeated root of the corresponding polynomial?)
(b) Find two linearly independent solutions y1, y2 of Euler's equation
$$Ly = ax^2y'' + bxy' + cy = 0.$$
6. If y1 solves
$$Ly = y'' + b(x)y' + c(x)y = 0,$$
find another, linearly independent solution of the form y2 = zy1. (Hint: Show w = z′ satisfies a first-order linear ODE.)
7. Write down the Euler-Lagrange equations for the following Lagrangians:
(a) $e^{z^2}$
(b) $z^4 + y^2$
(c) $\sin(xyz)$
(d) $y^3x^4$.
8. Give an alternate proof that
$$L = a(y)z$$
is a null Lagrangian, by observing that $I[y(\cdot)] = \int_0^1 L(y, y')\, dx$ depends only upon the values taken on by y(·) at the endpoints 0, 1.
9. Compute (E-L) for
$$\int_0^1 a(x, y) + b(x, y)y'\, dx$$
and show that it is not a differential equation, but rather an implicit algebraic formula for y as a function of x.
10. (Discrete version of Euler-Lagrange equations) Let h > 0 and define xk = kh for k = 0, . . . , N + 1. Consider the problem of finding $\{y_k\}_{k=1}^N$ to minimize
$$\sum_{k=0}^{N} L\left(x_k, y_k, \frac{y_{k+1} - y_k}{h}\right)h,$$
where $y_0 = y^0$, $y_{N+1} = y^1$ are prescribed. What algebraic conditions do minimizers satisfy? What is the connection with the Euler-Lagrange equation?

11. Assume
$$I[y(\cdot)] = \int_a^b L(x, y, y', y'')\, dx$$
for L = L(x, y, z, u). Derive the corresponding Euler-Lagrange equation for extremals.
12. Show that the extremals of
$$\int_a^b (y')^2e^{-y'}\, dx$$
are linear functions.
13. Find the extremals of
$$\int_0^1 (y')^2 + 10xy\, dx,$$
subject to y(0) = 1, y(1) = 2.
14. Find the extremals of
$$\int_0^T e^{-\lambda t}\left(a(\dot{x})^2 + bx^2\right) dt,$$
satisfying x(0) = 0, x(T) = c.
15. Assume c > 0. Find the minimizer of
$$\int_0^T e^{-\lambda t}\left(c(\dot{x})^2 + dx\right) dt,$$
subject to x(0) = 0, x(T) = b.
16. Find the minimizer of
$$\int_0^1 \frac{(y')^2}{2} - x^2y\, dx,$$
where y(0) = 1 and y(1) is free.
17. Consider the free endpoint problem of minimizing
$$I[y(\cdot)] = \int_a^b L(x, y, y')\, dx + g(y(b)),$$
subject to $y(a) = y^0$, where g : R → R is a given function. What is the transversality condition satisfied at x = b by a minimizer y0(·)?
18. Assume y0(·) minimizes the functional
$$I[y(\cdot)] = \frac{1}{2}\int_a^b (y'')^2\, dx$$
over the admissible class
$$\mathcal{A} = \{y : [a, b] \to \mathbb{R} \mid y(a) = 0,\ y(b) = 1\}.$$

What additional boundary conditions does y0 (·) satisfy?


19. Give a different proof of Theorem 1.3.4 by rewriting
$$j(\sigma) = \frac{1}{1+\sigma}\int_0^T L\left(\frac{s}{1+\sigma},\, x_0(s),\, (1+\sigma)\dot{x}_0(s)\right) ds$$
and then computing $\frac{dj}{d\sigma}(0)$.
20. Find the minimizer of
$$\int_0^1 (\dot{x})^2\, dt,$$
subject to x(0) = 0, x(1) = 2 and $\int_0^1 x\, dt = b$.

21. Find a minimizer of $I[y(\cdot)] = \int_0^\pi (y')^2\, dx$ over the admissible class
$$\mathcal{A} = \left\{y : [0, \pi] \to \mathbb{R} \;\middle|\; y(0) = y(\pi) = 0,\ \int_0^\pi y^2\, dx = 1\right\}.$$
22. Suppose $I[y(\cdot)] = \int_a^b L(x, y, y')\, dx$ and the admissible class is
$$\mathcal{A} = \{y : [a, b] \to \mathbb{R} \mid y(a) = 0,\ y(b) = 0,\ \varphi(x) \le y(x)\ (a \le x \le b)\}$$
for some given function φ.
(a) Show that if y0(·) ∈ A is a minimizer, then
$$-\left(\frac{\partial L}{\partial z}(x, y_0, y_0')\right)' + \frac{\partial L}{\partial y}(x, y_0, y_0') \ge 0 \qquad (a \le x \le b).$$
(b) Show that
$$-\left(\frac{\partial L}{\partial z}(x, y_0, y_0')\right)' + \frac{\partial L}{\partial y}(x, y_0, y_0') = 0$$
in the region {a ≤ x ≤ b | φ(x) < y0(x)} where y0(·) does not hit the constraint.
23. Let $I[x(\cdot)] = \int_0^T |\dot{x}|\, dt$ for functions belonging to the admissible class
$$\mathcal{A} = \{x : [0, T] \to \mathbb{R}^n \mid x(0) = A,\ x(T) = B\},$$
where A, B ∈ Rn are given. Show that the graph of a minimizer x0(·) is a line segment from A to B.
24. For each Lagrangian L : Rn × Rn → R write out the Euler-Lagrange equation, the Hamiltonian H and the Hamiltonian differential equations:
(a) $L = \frac{m}{2}|v|^2 - W(x)$
(b) $L = \frac{m}{2}|v|^2 + b(x) \cdot v$.

25. Assume that the matrix function G, whose (i, j)-th entry is gij, is symmetric and positive definite. What is the Hamiltonian corresponding to the Lagrangian
$$L = \frac{1}{2}\sum_{i,j=1}^n g_{ij}(x)v_iv_j\,?$$
26. Show that
$$L = \frac{m}{2}|v|^2 + qv \cdot A(x)$$
is the Lagrangian corresponding to the Hamiltonian
$$H = \frac{1}{2m}|p - qA(x)|^2.$$
27. If x : [0, 1] → U is a curve and z = y(x) is its image under the coordinate patch y : U → Rl, show that the length of the curve z in Rl is
$$\int_0^1 \left(\sum_{i,j=1}^n g_{ij}(x)\dot{x}_i\dot{x}_j\right)^{\frac{1}{2}} dt.$$

28. Compute explicitly the Christoffel symbols $\Gamma^k_{ij}$ for the hyperbolic plane. Check that the ODE given in the text for the geodesics is correct.
29. Suppose
$$\int_a^b \frac{\partial^2 L}{\partial z^2}(x, y_0, y_0')\,\zeta^2\, dx \ge 0$$
for all functions ζ such that ζ(a) = ζ(b) = 0. Explain carefully why this implies
$$\frac{\partial^2 L}{\partial z^2}(x, y_0, y_0') \ge 0 \qquad (a \le x \le b).$$
30. Assume y(·) > 0 solves the linear, second-order ODE
$$y'' + b(x)y' + c(x)y = 0.$$
Find the corresponding Riccati equation, which is the nonlinear, first-order ODE that $w = \frac{y'}{y}$ solves.
31. What does the Weierstrass condition say about the possible values of y0′(·) for minimizers of
(a) $\int_a^b ((y')^2 - 1)^2\, dx$
(b) $\int_a^b \frac{1}{1 + (y')^2}\, dx$
subject to given boundary conditions?



32. Derive this useful calculus formula, which we used in the proof of Theorem 2.3.1:
$$\frac{d}{dx}\left(\int_a^{g(x)} f(x, t)\, dt\right) = \int_a^{g(x)} \frac{\partial f}{\partial x}(x, t)\, dt + f(x, g(x))g'(x).$$
(Hint: Define $F(x, y) = \int_a^y f(x, t)\, dt$, so that $\int_a^{g(x)} f(x, t)\, dt = F(x, g(x))$. Apply the chain rule.)
33. Suppose that y(·) is an extremal of I[·], the second variation of which satisfies for some constant γ > 0 the estimate
$$\int_a^b A(w')^2 + 2Bww' + Cw^2\, dx \ge \gamma\int_a^b (w')^2 + w^2\, dx$$
for all w : [a, b] → R with w(a) = w(b) = 0. Use a Taylor expansion to show directly that y(·) is a weak local minimizer; this means that there exists δ > 0 such that
$$I[y] \le I[\bar{y}]$$
for all admissible ȳ satisfying $\max_{[a,b]}\{|y - \bar{y}| + |y' - \bar{y}'|\} \le \delta$.
34. Show that for each l > 0 the function y(·) = 0 is a strong local minimizer of
$$\int_0^l \frac{(y')^2}{2} + \frac{y^2}{2} - \frac{y^4}{4}\, dx$$
for
$$\mathcal{A} = \{y : [0, l] \to \mathbb{R} \mid y(0) = y(l) = 0\}.$$
35. Assume that (y, z) ↦ L(x, y, z) is convex for each a ≤ x ≤ b. Show that each extremal y(·) ∈ A is in fact a minimizer of I[·], for the admissible set
$$\mathcal{A} = \{y : [a, b] \to \mathbb{R} \mid y(a) = y^0,\ y(b) = y^1\}.$$
(Hint: Recall that if f : Rn → R is convex, then
$$f(\hat{x}) \ge f(x) + \nabla f(x) \cdot (\hat{x} - x)$$
for all x̂.)
36. A function f : Rn → R is strictly convex provided
$$f(\theta x + (1 - \theta)\hat{x}) < \theta f(x) + (1 - \theta)f(\hat{x})$$
if x ≠ x̂ and 0 < θ < 1. Suppose that (y, z) ↦ L(x, y, z) is strictly convex for each a ≤ x ≤ b. Show that there exists at most one minimizer y0(·) ∈ A of I[·].

37. Explain carefully why
$$I[y(\cdot)] = \int_0^1 y\, dx$$
does not have a minimizer over the admissible set
$$\mathcal{A} = \{y : [0, 1] \to \mathbb{R} \mid y(\cdot) \text{ is continuous},\ y \ge 0,\ y(0) = 0,\ y(1) = 1\}.$$
38. Write down the Euler-Lagrange PDE for the following Lagrangians:
(a) $\frac{|z|^2}{2} + b(x) \cdot z$
(b) $\frac{|z|^p}{p}$ (1 ≤ p < ∞)
(c) $(1 + |z|^2)^{\frac{1}{2}} + F(y)$.
39. Write down the Euler-Lagrange PDE for these functionals:
(a) $I[u] = \int_U \frac{|\nabla u|^2}{2} + F(x, u)\, dx$
(b) $I[u] = \frac{1}{2}\int_U \sum_{k,l=1}^n a_{kl}(x)\frac{\partial u}{\partial x_k}\frac{\partial u}{\partial x_l}\, dx$.
40. Use the Divergence Theorem to give another proof that
$$L = z \cdot b(x)f(y) + F(y)\operatorname{div} b$$
is a null Lagrangian, where F′ = f. (Hint: $L(x, u, \nabla u) = \operatorname{div}(F(u)b)$.)
41. Derive the 4-th order Euler-Lagrange PDE satisfied by minimizers of
$$I[u(\cdot)] = \frac{1}{2}\int_U (\Delta u)^2\, dx,$$
subject to the boundary conditions u = g on ∂U. What is the transversality condition on ∂U for a minimizer?
42. Consider the system of PDE
$$\begin{cases} -\Delta u_1 = u_2 \\ -\Delta u_2 = 2u_1, \end{cases}$$
for the unknown $u = \begin{pmatrix} u_1 \\ u_2 \end{pmatrix} : \mathbb{R}^2 \to \mathbb{R}^2$. Show that this problem is variational. This means to find a Lagrangian function $L : \mathbb{R}^2 \times \mathbb{R}^2 \times M^{2 \times 2} \to \mathbb{R}$ so that the two PDE above are the Euler-Lagrange equations for $I[u(\cdot)] = \int_{\mathbb{R}^2} L(x, u, \nabla u)\, dx$.
43. A system of two reaction-diffusion equations has the form
$$\begin{cases} -a_1\Delta u_1 = f_1(u_1, u_2) \\ -a_2\Delta u_2 = f_2(u_1, u_2). \end{cases}$$
Under what conditions on the functions f1, f2 : R2 → R is this system variational?
44. Consider the dynamics
$$\begin{cases} \dot{x}(t) = \alpha(t) & (0 \le t \le 1) \\ x(0) = x^0 \end{cases}$$
and payoff functional
$$P[\alpha(\cdot)] = \int_0^1 |x(t)|\, dt.$$
(a) Suppose A = {−1, 1}. If $|x^0| \ge 1$, describe an optimal control that minimizes P[α(·)]. Explain why an optimal control does not exist if $|x^0| < 1$.
(b) Suppose instead that A = {−1, 0, 1}. Explain why an optimal control exists.
45. Consider the problem of reaching the origin in least time, when the dynamics are
$$\begin{cases} \dot{x}_1 = x_2 \\ \dot{x}_2 = -x_1 + \alpha, \end{cases}$$
where |α| ≤ 1.
(a) Check that $X(t) = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix}$ is the fundamental matrix for the corresponding homogeneous linear system.
(b) Show that an optimal control α0(·) is periodic in time.
46. Use the maximum principle to find an optimal control for the linear time optimal problem with dynamics
$$\begin{cases} \dot{x}_1 = x_2 + \alpha_1, & |\alpha_1| \le 1 \\ \dot{x}_2 = -x_1 + \alpha_2, & |\alpha_2| \le 1. \end{cases}$$
47. Write down (ODE), (ADJ), (M) and (T) for the fixed time, free endpoint problems with n = m = 1 and
(a) $f = (x^2 + a)^2 - x^4$, A = [−1, 1], r = 2ax, g = sin x
(b) $f = x^2a$, A = [0, 2], $r = a^2 + x^2$, g = 0.
48. Write down the equations (ADJ), (M) and (T) for the fixed time, free endpoint problem corresponding to the dynamics
$$\begin{cases} \dot{x}_1 = \sin(x_1 + \alpha x_2) \\ \dot{x}_2 = \cos(\alpha x_1 + x_2), \end{cases}$$
where 0 ≤ α ≤ 1, with the payoff
$$P[\alpha(\cdot)] = \int_0^T \alpha^4 + (x_1x_2)^2\, dt + (x_1(T))^4.$$

49. Consider the variational problem of minimizing
$$I[x(\cdot)] = \int_0^T L(x(t), \dot{x}(t))\, dt,$$
for L = L(x, v), over the admissible class
$$\mathcal{A} = \{x : [0, T] \to \mathbb{R}^n \mid x(\cdot) \text{ is continuous and piecewise continuously differentiable},\ x(0) = x^0,\ x(T) = x^1\}.$$
(a) Show how to interpret this as a fixed time, fixed endpoint optimal control problem.
(b) If x0 is a minimizer, show that (M) implies $p_0 = \nabla_v L(x_0, \dot{x}_0)$.
(c) Show that (ADJ) implies the Euler-Lagrange equations.
50. Use the free time, fixed endpoint PMP to give a new proof of
Theorem 1.3.4 for a Lagrangian L = L(x, v). In particular, explain
what condition (T) says for the variational problem.
51. Solve the linear-quadratic regulator problem of minimizing
$$\int_0^T x^2 + \alpha^2\, dt + \frac{x^2(T)}{2}$$
for the dynamics
$$\begin{cases} \dot{x} = x + \alpha \\ x(0) = x^0 \end{cases}$$
with controls α : [0, T] → R.


52. Assume that a function z : [0, T] → R is given and we wish to minimize
$$\int_0^T (x - z)^2 + \alpha^2\, dt,$$
where A = R and
$$\begin{cases} \dot{x} = x + \alpha \\ x(0) = x^0. \end{cases}$$
Show that
$$\alpha = \frac{1}{2}dx + \frac{e}{2}$$
is an optimal feedback control, where d(·) and e(·) solve
$$\begin{cases} \dot{d} = 2 - 2d - \frac{1}{2}d^2, & d(T) = 0 \\[2pt] \dot{e} = -2z - \left(1 + \frac{d}{2}\right)e, & e(T) = 0. \end{cases}$$
(Hint: Assume PMP applies and write down the equations for x and p. Look for a solution of the form p = dx + e.)
53. Find explicit formulas for the optimal state x0(·) and costate p0(·) for the production and consumption model discussed on page 108. Show that H(x0, p0, α0) is constant on the interval [0, T], as asserted by the PMP.
54. How does our analysis of the Ramsey consumption model break down if we drop the requirement that $x(T) = x^1$?
55. Use the PMP to solve the problem of maximizing
$$P[\alpha(\cdot)] = \int_0^2 2x - 3\alpha + \alpha^2\, dt,$$
where
$$\begin{cases} \dot{x} = x + \alpha, & 0 \le \alpha \le 2 \\ x(0) = 0. \end{cases}$$
56. Use the PMP to find a control to minimize the payoff
$$P[\alpha(\cdot)] = \frac{1}{4}\int_0^1 \alpha^4\, dt,$$
for the dynamics
$$\begin{cases} \dot{x} = x + \alpha \\ x(0) = 1,\ x(1) = 0, \end{cases}$$
where A = R.
57. Assume that the matrices B, C, D are symmetric and that the matrix Riccati equation
$$\begin{cases} \dot{K} + KNC^{-1}N^TK + KM + M^TK = B & (0 \le t \le T) \\ K(T) = -D \end{cases}$$
has a unique solution K(·). Show that K(t) is symmetric for each time 0 ≤ t ≤ T.
58. Apply the PMP to solve the general linear-quadratic regulator
problem, introduced on page 137. In particular, show how to solve
(ADJ) for the costate p0 (·) in terms of the matrix Riccati equation.

59. This exercise discusses the infinite horizon problem:
$$\begin{cases} \dot{x} = -x + \alpha & (t \ge 0) \\ x(0) = x, \end{cases}$$
where A = R and
$$P_x[\alpha(\cdot)] = \int_0^\infty x^2 + \alpha^2\, dt.$$
Define
$$v(x) = \inf_{\alpha(\cdot)} P_x[\alpha(\cdot)].$$
(a) Derive the HJB equation for v.
(b) Solve this equation and design an optimal feedback control.
60. Use dynamic programming to solve the tracking problem of minimizing
$$\int_0^T |x(t) - z(t)|^2 + |\alpha(t)|^2\, dt$$
for the dynamics
$$\begin{cases} \dot{x}(t) = Mx(t) + N\alpha(t) & (0 \le t \le T) \\ x(0) = 0, \end{cases}$$
where z : [0, T] → Rn is given. Here $M \in M^{n \times n}$, $N \in M^{n \times m}$, and A = Rm.
Bibliography

[A-F] M. Athans and P. L. Falb, Optimal Control: An Introduction to the Theory and its Applications, Dover, 2007.

[B-M] J. Ball and F. Murat, Remarks on rank-one convexity and quasiconvexity, in Ordinary and Partial Differential Equations, B. D. Sleeman and R. J. Jarvis, eds., Pitman Research Notes, Pitman, 1991.

[B-P] A. Bressan and B. Piccoli, Introduction to the Mathematical Theory of Control, AIMS Series on Applied Math, Vol 2, American Institute of Mathematical Sciences, 2007.

[C-Z] F. H. Clarke and V. Zeidan, Sufficiency and the Jacobi condition in the calculus of variations, Canad. J. Math (38) 1986, 1199–1209.

[E] L. C. Evans, An Introduction to Mathematical Optimal Control Theory, Version 0.2, lecture notes available at math.berkeley.edu/~evans/control.course.pdf.

[F-R] W. Fleming and R. Rishel, Deterministic and Stochastic Optimal Control, Springer, 1975.

[G] T. Gilbert, Lost mail: the missing envelope in the problem of the minimal surface of revolution, American Math Monthly (119) 2012, 359–372.

[K-S] M. Kamien and N. Schwartz, Dynamic Optimization: The Calculus of Variations and Optimal Control in Economics and Management, 2nd ed, Dover, 2012.

[K] M. Kot, A First Course in the Calculus of Variations, Student Mathematical Library, Vol 72, American Math Society, 2014.

[L-M] E. B. Lee and L. Markus, Foundations of Optimal Control Theory, Wiley, 1967.

[L] M. Levi, Classical Mechanics with Calculus of Variations and Optimal Control, Student Mathematical Library, Vol 69, American Math Society, 2014.

[M-S] J. Macki and A. Strauss, Introduction to Optimal Control Theory, Springer, 1982.

[M] Z. A. Melzak, Mathematical Ideas, Modeling & Applications, Wiley–Interscience, 1976.

[MG] M. Mesterton-Gibbons, A Primer on the Calculus of Variations and Optimal Control Theory, Student Mathematical Library, Vol 50, American Math Society, 2009.

[S] D. R. Smith, Variational Methods in Optimization, Prentice Hall, 1974.

[T] J. R. Taylor, Classical Mechanics, University Science Books, 2005.
