Math 195 Notes
Lawrence C. Evans
Department of Mathematics
University of California, Berkeley
Contents
PREFACE v
INTRODUCTION 1
APPENDIX 147
A. Notation 147
B. Linear algebra 147
C. Multivariable chain rule 148
D. Divergence Theorem 150
E. Implicit Function Theorem 150
F. Solving a nonlinear equation 151
EXERCISES 153
Bibliography 165
PREFACE
Last fall I taught a revised version of Math 170, primarily on finite dimensional optimization. This new spring class Math 195 discusses dynamic optimization, mostly the calculus of variations and optimal control theory. (However, Math 170 is not a prerequisite for Math 195, since we will be developing quite different mathematical tools.)
We continue to be grateful to Kurt and Evelyn Riedel for their very generous contribution to the Berkeley Math Department, in financial support of the redesign and expansion of our undergraduate classes in optimization theory.
The texts Dynamic Optimization by Kamien and Schwartz [K-S] and
Introduction to Optimal Control Theory by Macki–Strauss [M-S] are good
overall references for this class, and I also strongly recommend Levi, Classical
Mechanics with Calculus of Variations and Optimal Control [L]. Part of the
content in Chapters 4 and 5 is reworked from my old online lecture notes
[E].
I have used Inkscape and SageMath for the illustrations. Thanks to
David Hoffman for the beautiful pictures of minimal surfaces. I am again
very thankful to have had Haotian Gu as my course assistant this term.
INTRODUCTION
Chapter 1

FIRST VARIATION
Graph of an admissible function
where
$$y' = \frac{dy}{dx}.$$
Note that we insert y(x) into the y-variable slot of L(x, y, z), and y'(x) into the z-variable slot of L(x, y, z).
INTERPRETATION. We can informally think of the number I[y(·)] as
being some sort of “energy” associated with the function y(·) (but there can
be many other interesting interpretations).
REMARK. To avoid various technical issues, we will usually suppress mention of the precise degree of smoothness assumed for various functions that we discuss. In particular, whenever we write down a derivative (or partial derivative) of some function at some point, the reader should suppose that the function is indeed differentiable there.
A surface of revolution
What curve y0 (·) gives the surface of revolution of least surface area?
This is more difficult than the previous example, and we will only later have
the tools to handle this.
The most important insight of the calculus of variations is the next theorem. It says that a minimizer y0(·) ∈ A automatically solves a certain ordinary differential equation (ODE). This equation appears when we compute an appropriate first variation for our minimization problem (COV).
THEOREM 1.2.1. Assume y0(·) ∈ A solves (COV) and y0(·) is twice continuously differentiable. Then y0(·) solves the Euler-Lagrange equation
$$(\text{E-L}) \qquad -\frac{d}{dx}\left(\frac{\partial L}{\partial z}(x, y_0, y_0')\right) + \frac{\partial L}{\partial y}(x, y_0, y_0') = 0 \qquad (a \le x \le b).$$
REMARKS.
(i) Theorem 1.2.1 says that any minimizer y0 solving (COV) satisfies the
Euler-Lagrange differential equation and thus is an extremal. But a given
extremal need not be a minimizer.
(ii) Remember that ′ = d/dx. So it is also correct to write (E-L) as
$$-\left(\frac{\partial L}{\partial z}(x, y, y')\right)' + \frac{\partial L}{\partial y}(x, y, y') = 0.$$
(iii) We could apply the chain rule to expand out the first term in (E-L),
but it is almost always best not to do so.
This expression may have a geometric meaning. It does. For any twice differentiable curve y(·),
$$(1.3) \qquad \kappa = \frac{y''}{(1 + (y')^2)^{3/2}}$$
is the curvature of the graph of y(·) at the point (x, y(x)). The calculus of variations has automatically produced this important expression for the geometry of planar curves. And what (E-L) really says is that the graph of our minimizer y0 has constant curvature κ = 0.
Indeed,
$$-\frac{d}{dx}\left(\frac{\partial L}{\partial z}(y, y')\right) + \frac{\partial L}{\partial y}(y, y') = -\frac{d}{dx}(a(y)) + a'(y)y' = 0$$
for all functions y.
We will learn later that null Lagrangians, especially for more complicated
variational problems, can provide useful information.
EXAMPLE. Consider the following simple model for the motion of a particle along the real line, moving under the influence of a potential energy. In this interpretation m denotes the mass, x(t) is the position of the particle at time t, and ẋ(t) is its velocity.
In addition,
$$\frac{m}{2}|\dot{x}(t)|^2 = \text{kinetic energy at time } t, \qquad W(x(t)) = \text{potential energy at time } t,$$
where W : R → R is given. The action of a path x : [0, T] → R is the time integral of the difference between the kinetic and potential energies:
$$I[x(\cdot)] = \int_0^T \frac{m}{2}|\dot{x}|^2 - W(x(t))\,dt.$$
What is the corresponding Euler-Lagrange equation?
We have
$$L = \frac{mv^2}{2} - W(x), \qquad \frac{\partial L}{\partial x} = -W'(x), \qquad \frac{\partial L}{\partial v} = mv,$$
where ′ = d/dx. So (E-L) is
$$-\frac{d}{dt}(m\dot{x}) - W'(x) = 0,$$
which is Newton's law of motion:
$$m\ddot{x} = -W'(x).$$
In other words, ma = f for the acceleration a = ẍ and force f = −W′. The
calculus of variations provides a systematic derivation for this fundamental
law of physics.
1.2.3. Derivation.
In this section we prove that minimizers satisfy the Euler-Lagrange equa-
tion.
The minimizer y0(·) and a varied curve yτ(·), over the interval [a, b]
Then
i(0) ≤ i(τ ).
So i(·) has a minimum at τ = 0 on the interval −1 ≤ τ ≤ 1, and therefore
$$\frac{di}{d\tau}(0) = 0.$$
Our task now is to see what information we can extract from this simple
formula.
2. We have
$$i(\tau) = \int_a^b L(x, y_0 + \tau w, y_0' + \tau w')\,dx.$$
Therefore
$$\frac{di}{d\tau}(\tau) = \int_a^b \frac{\partial}{\partial\tau}L(x, y_0 + \tau w, y_0' + \tau w')\,dx = \int_a^b \frac{\partial L}{\partial y}(x, y_0 + \tau w, y_0' + \tau w')\,w + \frac{\partial L}{\partial z}(x, y_0 + \tau w, y_0' + \tau w')\,w'\,dx,$$
where we used the chain rule. Next, set τ = 0, to learn that
$$0 = \frac{di}{d\tau}(0) = \int_a^b \frac{\partial L}{\partial y}(x, y_0, y_0')\,w + \frac{\partial L}{\partial z}(x, y_0, y_0')\,w'\,dx.$$
We now integrate by parts, to deduce
$$\int_a^b \left(\frac{\partial L}{\partial y}(x, y_0, y_0') - \frac{d}{dx}\frac{\partial L}{\partial z}(x, y_0, y_0')\right)w\,dx = 0.$$
This is valid for all functions w such that w(a) = w(b) = 0. According then to the Lemma above, it follows that
$$\frac{\partial L}{\partial y}(x, y_0, y_0') - \frac{d}{dx}\frac{\partial L}{\partial z}(x, y_0, y_0') = 0$$
for all a ≤ x ≤ b. This is (E-L).
REMARK. The procedure in this proof is called computing the first variation.
THEOREM 1.3.1. If L does not depend on y and the function y(·) solves (E-L), then
$$(1.8) \qquad \frac{\partial L}{\partial z}(x, y')\ \text{is constant for } a \le x \le b.$$
Proof. Since ∂L/∂y = 0, (E-L) says
$$-\left(\frac{\partial L}{\partial z}(x, y')\right)' = 0;$$
and so ∂L/∂z(x, y′) is a constant.
We have
$$L(x, y, z) = x^3 z^2, \qquad \frac{\partial L}{\partial y} = 0, \qquad \frac{\partial L}{\partial z} = 2x^3 z;$$
therefore
$$\frac{\partial L}{\partial z}(x, y'(x)) = 2x^3 y'(x) = C.$$
Hence
$$y'(x) = \frac{C}{2x^3},$$
and so
$$y(x) = \frac{E}{x^2} + F$$
for constants E, F.
(b) Find a minimizer of I[·] from the admissible class
$$\mathcal{A} = \{y : [1, 2] \to \mathbb{R} \mid y(1) = 3,\ y(2) = 4\}.$$
We need to select the constants E, F above so that
$$3 = y(1) = E + F, \qquad 4 = y(2) = \frac{E}{4} + F.$$
Solving, we find that E = −4/3, F = 13/3, and thus
$$y_0(x) = -\frac{4}{3x^2} + \frac{13}{3}.$$
Therefore if (COV) has a solution, it must be this.
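As a quick check (a sketch using SymPy, not part of the original notes), we can verify symbolically that this candidate satisfies both the Euler-Lagrange equation and the endpoint conditions:

```python
import sympy as sp

x = sp.symbols('x', positive=True)
y = -sp.Rational(4, 3)/x**2 + sp.Rational(13, 3)   # the candidate minimizer y0
# For L = x^3 z^2: dL/dy = 0 and dL/dz = 2 x^3 y', so (E-L) reads -(2 x^3 y')' = 0
residual = -sp.diff(2*x**3*sp.diff(y, x), x)
print(sp.simplify(residual))           # 0, so y0 is an extremal
print(y.subs(x, 1), y.subs(x, 2))      # 3 and 4, the required endpoint values
```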
Proof.
$$\left(L(y, y') - y'\frac{\partial L}{\partial z}(y, y')\right)' = \frac{\partial L}{\partial y}y' + \frac{\partial L}{\partial z}y'' - y''\frac{\partial L}{\partial z} - y'\left(\frac{\partial L}{\partial z}\right)' = y'\left(-\left(\frac{\partial L}{\partial z}(y, y')\right)' + \frac{\partial L}{\partial y}(y, y')\right) = 0,$$
since the expression in the parentheses is 0 according to (E-L).
But since L does not depend on x, we can apply Theorem 1.3.2. Now
$$y'\frac{\partial L}{\partial z} - L = y'\frac{yy'}{(1 + (y')^2)^{1/2}} - y(1 + (y')^2)^{1/2} = -\frac{y}{(1 + (y')^2)^{1/2}}.$$
This expression is therefore constant:
$$\frac{y}{(1 + (y')^2)^{1/2}} = C$$
for some constant C, and consequently
$$y' = \pm\frac{(y^2 - C^2)^{1/2}}{C}.$$
Taking the plus sign and separating variables, we find
$$\frac{dy}{dx} = \frac{(y^2 - C^2)^{1/2}}{C}, \qquad \frac{dy}{(y^2 - C^2)^{1/2}} = \frac{dx}{C}, \qquad \int\frac{dy}{(y^2 - C^2)^{1/2}} = \int\frac{dx}{C},$$
$$\cosh^{-1}\frac{y}{C} = \frac{x}{C} + D.$$
(I looked up the expression for the y integral from a table of standard integrals.)
Therefore the curve giving a surface of revolution of least area is
$$y_0(x) = C\cosh\left(\frac{x}{C} + D\right),$$
where we recall that cosh(x) = (e^x + e^{-x})/2. The graph of the y-curve is called a catenary. The corresponding surface of revolution is a catenoid.
A catenoid
REMARK. To fully resolve our problem we need to try to adjust the constants C and D so the solution passes through the given endpoints. This however can be subtle and may not be possible: see Gilbert [G].
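For instance, here is a minimal numerical sketch (the endpoint data a, b, A, B are assumed values, not from the notes) that tries to fit C and D, and reports when the solver fails to find a catenary through the given points:

```python
import numpy as np
from scipy.optimize import fsolve

a, b, A, B = 0.0, 1.0, 1.0, 1.0      # assumed endpoints (a, A) and (b, B)

def endpoint_equations(p):
    C, D = p
    return [C*np.cosh(a/C + D) - A,
            C*np.cosh(b/C + D) - B]

sol, info, ier, msg = fsolve(endpoint_equations, [0.5, -1.0], full_output=True)
print(sol if ier == 1 else "no catenary found from this guess: " + msg)
```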
Angles and derivatives
Thus
$$(1.10) \qquad \frac{\sin\xi}{v(y)} = C,$$
for some constant C. This is a continuous version of Snell's Law of refraction (see the Math 170 lecture notes).
EXAMPLE. Recall for our model for the motion of a particle on the line that
$$I[x(\cdot)] = \int_a^b \frac{m}{2}|\dot{x}|^2 - W(x(t))\,dt$$
with
$$L = \frac{mv^2}{2} - W(x).$$
We compute
$$\dot{x}\frac{\partial L}{\partial v} - L = \dot{x}(m\dot{x}) - \left(\frac{m(\dot{x})^2}{2} - W(x)\right) = \frac{m(\dot{x})^2}{2} + W(x).$$
Since L does not depend upon t, Theorem 1.3.2 implies that the above expression is constant. So the total energy (kinetic plus potential) is conserved along the motion.
We as usual define
$$I[y(\cdot)] = \int_a^b L(x, y, y')\,dx$$
for functions y(·) ∈ A; and seek to understand the behavior of a minimizer y0(·) ∈ A of this free endpoint problem.
THEOREM 1.3.4. Let the admissible class be given by (1.15) and assume (T0, x0(·)) ∈ A minimizes I[·].
(i) Then x0 solves the Euler-Lagrange equation
$$(1.16) \qquad -\frac{d}{dt}\frac{\partial L}{\partial v}(t, x_0, \dot{x}_0) + \frac{\partial L}{\partial x}(t, x_0, \dot{x}_0) = 0 \qquad (0 \le t \le T_0).$$
(ii) Furthermore, we have
$$(1.17) \qquad \frac{\partial L}{\partial v}(T_0, x_0(T_0), \dot{x}_0(T_0))\,\dot{x}_0(T_0) - L(T_0, x_0(T_0), \dot{x}_0(T_0)) = 0.$$
3. Now
$$\frac{dT_\sigma}{d\sigma} = -\frac{T_0}{(1+\sigma)^2}$$
and
$$\frac{\partial x_\sigma}{\partial\sigma} = \dot{x}_0(t(1+\sigma))\,t.$$
Therefore
$$0 = \frac{dj}{d\sigma}(0) = -T_0 L(T_0, x_0(T_0), \dot{x}_0(T_0)) + \int_0^{T_0} \frac{\partial L}{\partial x}(t, x_0, \dot{x}_0)\,w + \frac{\partial L}{\partial v}(t, x_0, \dot{x}_0)\,\dot{w}\,dt,$$
for
$$w(t) = \dot{x}_0(t)\,t.$$
We integrate by parts in the integral and recall (1.16), to deduce that
$$0 = -T_0 L(T_0, x_0(T_0), \dot{x}_0(T_0)) + \frac{\partial L}{\partial v}(T_0, x_0(T_0), \dot{x}_0(T_0))\,\dot{x}_0(T_0)\,T_0.$$
This gives (1.17), since T0 > 0.
The interesting book by Kamien and Schwartz [K-S] has further infor-
mation about transversality conditions in various more complicated situa-
tions.
2. Now define
$$y_\tau(x) = y_0(x) + \tau w(x) + \phi(\tau)v(x) \qquad (-\tau_0 \le \tau \le \tau_0).$$
Then yτ(a) = y⁰, yτ(b) = y¹, and
$$J[y_\tau(\cdot)] = \int_a^b G(x, y_0 + \tau w + \phi(\tau)v)\,dx = \Phi(\tau, \phi(\tau)) = 0;$$
therefore yτ(·) ∈ A. Hence i(τ) = I[yτ(·)] has a minimum at τ = 0, and consequently
$$\frac{di}{d\tau}(0) = 0.$$
We will extract the Lagrange multiplier from this simple looking equality.
3. We compute
$$\frac{di}{d\tau}(\tau) = \int_a^b \frac{\partial L}{\partial y}(x, y_0 + \tau w + \phi(\tau)v,\ y_0' + \tau w' + \phi(\tau)v')\,(w + \phi'(\tau)v) + \frac{\partial L}{\partial z}(x, y_0 + \tau w + \phi(\tau)v,\ y_0' + \tau w' + \phi(\tau)v')\,(w' + \phi'(\tau)v')\,dx.$$
Now put τ = 0 and then integrate by parts:
$$0 = \frac{di}{d\tau}(0) = \int_a^b \frac{\partial L}{\partial y}(w + \phi'(0)v) + \frac{\partial L}{\partial z}(w' + \phi'(0)v')\,dx$$
$$(1.23) \qquad = \int_a^b \left(\frac{\partial L}{\partial y} - \left(\frac{\partial L}{\partial z}\right)'\right)(w + \phi'(0)v)\,dx,$$
where L is evaluated at (x, y0, y0′).
Recall from (1.3) that this says the curvature κ is constant. Therefore
(as we will prove later, on page 48) the graph of y(·) is an arc of a circle
connecting the given endpoints.
We omit the proof, which is similar to that for the previous theorem.
Catenary
Now if l < 2a, the admissible class is empty and we will not be able to select C as above. If l = 2a, the admissible class consists only of one configuration, for which the chain is stretched horizontally between its left and right endpoints. The constraint qualification condition (1.24) then fails.
1.3.4. Systems.
We next turn attention to calculus of variations problems for functions
y : [a, b] → Rn . The new difficulties are mostly notational, as the basic ideas
are the same as above.
NOTATION. (i) We write
$$\mathbf{y}(x) = \begin{bmatrix} y_1(x) \\ \vdots \\ y_n(x) \end{bmatrix}, \qquad \mathbf{y}'(x) = \begin{bmatrix} y_1'(x) \\ \vdots \\ y_n'(x) \end{bmatrix}.$$
(iv) We write
$$I[\mathbf{y}(\cdot)] = \int_a^b L(x, \mathbf{y}(x), \mathbf{y}'(x))\,dx.$$
2. Since
$$i(\tau) = \int_a^b L(x, \mathbf{y}_0(x) + \tau\mathbf{w}(x),\ \mathbf{y}_0'(x) + \tau\mathbf{w}'(x))\,dx,$$
we can apply the chain rule to compute
$$\frac{di}{d\tau}(\tau) = \int_a^b \sum_{l=1}^n \frac{\partial L}{\partial y_l}(x, \mathbf{y}_0 + \tau\mathbf{w}, \mathbf{y}_0' + \tau\mathbf{w}')\,w_l + \sum_{l=1}^n \frac{\partial L}{\partial z_l}(x, \mathbf{y}_0 + \tau\mathbf{w}, \mathbf{y}_0' + \tau\mathbf{w}')\,w_l'\,dx.$$
Thus
$$0 = \frac{di}{d\tau}(0) = \int_a^b \sum_{l=1}^n \frac{\partial L}{\partial y_l}(x, \mathbf{y}_0, \mathbf{y}_0')\,w_l + \sum_{l=1}^n \frac{\partial L}{\partial z_l}(x, \mathbf{y}_0, \mathbf{y}_0')\,w_l'\,dx.$$
The proof of (1.27) is simple, and the proof of (1.28) is similar to that
for our earlier Theorem 1.3.2.
REMARK. And so if we can solve the ODE (1.33) for the unknown func-
tion y1 , we can then recover y2 by integrating (1.31).
Proof. We calculate
$$\frac{\partial R}{\partial x} = \frac{\partial L}{\partial x} + \left(\frac{\partial L}{\partial z_2} - C\right)\frac{\partial\phi}{\partial x}$$
and
$$\frac{\partial R}{\partial z_1} = \frac{\partial L}{\partial z_1} + \left(\frac{\partial L}{\partial z_2} - C\right)\frac{\partial\phi}{\partial z_1}.$$
Hence (1.31) and (1.30) imply
$$\frac{\partial R}{\partial x}(x, y_1, y_1') = \frac{\partial L}{\partial x}(x, y_1, y_1', y_2')$$
and
$$\frac{\partial R}{\partial z_1}(x, y_1, y_1') = \frac{\partial L}{\partial z_1}(x, y_1, y_1', y_2').$$
Then the first equation in (1.29) lets us compute that
$$-\left(\frac{\partial R}{\partial z_1}(x, y_1, y_1')\right)' + \frac{\partial R}{\partial y_1}(x, y_1, y_1') = -\left(\frac{\partial L}{\partial z_1}(x, \mathbf{y}, \mathbf{y}')\right)' + \frac{\partial L}{\partial y_1}(x, \mathbf{y}, \mathbf{y}') = 0.$$
1.4. Applications
Following are some more substantial, and more interesting, applications of
our theory.
1.4.1. Brachistochrone.
Given two points A, B as drawn, we can interpret the graph of a function
y(·) joining these points as a wire path along which a bead of unit mass slides
without friction under the influence of gravity. How do we design the slide
so as to minimize the time it takes for the bead to slide from A to B?
For simplicity, we assume that A = (0, 0) and that y(x) ≤ 0 for all
0 ≤ x ≤ b. As the particle slides its total energy (= kinetic energy +
potential energy) is constant. Therefore
$$\frac{v^2}{2} + gy = 0$$
on the interval [0, b], where v is the velocity and g is gravitational acceleration. The constant is 0, since v(0) = y(0) = 0. Therefore
$$v = (-2gy)^{1/2}.$$
The time for the bead to slide from A to B is thus
$$\int_0^b \frac{ds}{v} = \int_0^b \frac{(1 + (y')^2)^{1/2}}{(-2gy)^{1/2}}\,dx.$$
We therefore seek a path y0(·) from A to B that minimizes (discarding the irrelevant constant factor (2g)^{-1/2})
$$I[y(\cdot)] = \int_0^b \frac{(1 + (y')^2)^{1/2}}{(-y)^{1/2}}\,dx.$$
Now
$$L = \frac{(1 + z^2)^{1/2}}{(-y)^{1/2}}, \qquad \frac{\partial L}{\partial z} = \frac{z}{(-y)^{1/2}(1 + z^2)^{1/2}},$$
and consequently
$$y'\frac{\partial L}{\partial z}(y, y') - L(y, y') = \frac{(y')^2}{(-y)^{1/2}(1 + (y')^2)^{1/2}} - \frac{(1 + (y')^2)^{1/2}}{(-y)^{1/2}} = -(-y(1 + (y')^2))^{-1/2}.$$
Since L does not depend on x, it follows from Theorem 1.3.2 that
$$y'\frac{\partial L}{\partial z}(y, y') - L(y, y')$$
is constant. Therefore
$$(1.34) \qquad y(1 + (y')^2) = C$$
on the interval [0, b] for some (negative) constant C.
GEOMETRIC INTERPRETATION. It is possible to directly integrate the ODE (1.34) (see Kot [K]), but the following geometric insights are more interesting. We first check that if the graph of y(·) is the blue curve drawn below, then the angle ξ satisfies
$$\sin\xi = \frac{1}{(1 + (y')^2)^{1/2}}.$$
Angles and derivatives
Thus (1.34) says that
$$(1.35) \qquad \frac{\sin\xi}{(-y)^{1/2}}\ \text{is constant;}$$
and, according to the Remark on page 14, this in turn implies y(·) solves the
full Euler-Lagrange equation. (Compare all this with the geometric optics
example on page 17.)
A cycloid
Now (1.35) turns out to imply that the brachistochrone path is along
a cycloid, the curve traced by a point on the rim of a circle as it rolls
horizontally. Levi [L, pages 190–192] and Melzak [M, page 96] provide the
following elegant geometric proof. The key observation is that if a point
C = (x, y) on a rolling circle of diameter d > 0 generates a cycloid and if
A is the instantaneous point of contact of the circle with the line, then the
vector AC is perpendicular to the velocity vector v.
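As a numerical illustration (a sketch with an assumed rolling-circle parametrization, placing the curve below the line y = 0), we can both trace the cycloid and confirm that the quantity in (1.35) really is constant along it:

```python
import numpy as np

d = 1.0                                      # diameter of the rolling circle (assumed)
t = np.linspace(1e-3, 2*np.pi - 1e-3, 400)   # rolling angle, avoiding the cusps
x = (d/2)*(t - np.sin(t))                    # cycloid traced by a point on the rim
y = -(d/2)*(1 - np.cos(t))

yp = -np.sin(t)/(1 - np.cos(t))              # slope y' = dy/dx along the curve
ratio = 1.0/np.sqrt((1 + yp**2)*(-y))        # sin(xi)/sqrt(-y), as in (1.35)
print(ratio.min(), ratio.max())              # both ~ 1/sqrt(d): the ratio is constant
```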
Geometry of the brachistochrone
Angles and derivatives in polar coordinates
Hypocycloid
Geometry of the terrestrial brachistochrone
THEOREM 1.4.1.
(i) The functions x, p : [0, ∞) → Rn solve Hamilton’s equations (H).
(ii) Furthermore,
(1.42) H(x, p) is constant on [0, ∞).
Put x = x(t) and p = p(t) into these formulas, and note that (1.39) implies ẋ = φ(x, p). From (1.43), it follows that ∇_p H(x, p) = ẋ. And (1.44) gives
$$\nabla_x H(x, p) = -\nabla_x L(x, \dot{x}) = -\frac{d}{dt}\left(\nabla_v L(x, \dot{x})\right) = -\dot{p},$$
the middle equality holding according to (E-L).
the maximum occurring for v = φ(x, p). (See the Math 170 notes for more
about convex duality.)
∇ × A = B.
We compute
( p−qA(x)
ẋ = m
q(∇A(x))T (p−qA(x))
ṗ = m .
We now show that these imply the Lorentz equation (1.46), by comput-
ing
mẍ = ṗ − q∇Aẋ
q(∇A)T (p − qA)
= − q∇A(x)ẋ
m
= q((∇A)T − ∇A)ẋ
= q(ẋ × (∇ × A))
= q(ẋ × B).
(∇g − (∇g)T )y = (∇ × g) × y
for y ∈ R3 and g : R3 → R3 .
Taylor [T] is a good text for more on physical applications of variational
principles and Hamilton’s equations.
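As a concrete numerical check of the conservation assertion (1.42), here is a small sketch (the Hamiltonian H = p²/2m + W(x) with W(x) = x²/2 is an assumed example), integrated with the symplectic Euler scheme, which nearly conserves H:

```python
import numpy as np

m, dt = 1.0, 1e-3
x, p = 1.0, 0.0                      # initial state; H = p^2/(2m) + x^2/2 = 0.5
H = lambda x, p: p**2/(2*m) + x**2/2
for _ in range(20_000):
    p -= dt * x                      # dp/dt = -dH/dx = -W'(x)
    x += dt * p / m                  # dx/dt =  dH/dp = p/m
print(H(x, p))                       # remains within O(dt) of the initial value 0.5
```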
1.4.4. Geodesics.
Let U ⊆ Rn be an open region. Assume that we are given a function
y : U → Rl , which we write as
$$\mathbf{y} = \begin{bmatrix} y_1 \\ \vdots \\ y_l \end{bmatrix}.$$
THEOREM 1.4.2.
with
$$\frac{\partial L}{\partial v_k} = \sum_{i=1}^n g_{ik}v_i, \qquad \frac{\partial L}{\partial x_k} = \frac{1}{2}\sum_{i,j=1}^n \frac{\partial g_{ij}}{\partial x_k}v_iv_j.$$
We insert these into the Euler-Lagrange equation $-\frac{d}{dt}\frac{\partial L}{\partial v_k} + \frac{\partial L}{\partial x_k} = 0$, to find
$$\frac{d}{dt}\left(\sum_{i=1}^n g_{ik}\dot{x}_i\right) - \frac{1}{2}\sum_{i,j=1}^n \frac{\partial g_{ij}}{\partial x_k}\dot{x}_i\dot{x}_j = 0.$$
Therefore
$$0 = \sum_{i=1}^n g_{ik}\ddot{x}_i + \sum_{i,j=1}^n \left(\frac{\partial g_{ik}}{\partial x_j} - \frac{1}{2}\frac{\partial g_{ij}}{\partial x_k}\right)\dot{x}_i\dot{x}_j = \sum_{i=1}^n g_{ik}\ddot{x}_i + \frac{1}{2}\sum_{i,j=1}^n \left(\frac{\partial g_{ik}}{\partial x_j} + \frac{\partial g_{jk}}{\partial x_i} - \frac{\partial g_{ij}}{\partial x_k}\right)\dot{x}_i\dot{x}_j.$$
2. Since the Lagrangian L does not depend upon the independent variable t, Theorem 1.3.8 tells us that the expression
$$\dot{x}\cdot\nabla_v L(x, \dot{x}) - L(x, \dot{x}) = \frac{1}{2}\sum_{i,j=1}^n g_{ij}(x)\,\dot{x}_i\dot{x}_j$$
is constant.
(This is the Euclidean length of the image of x(·) under the coordinate chart
y(·).)
(ii) The distance between two points A, B ∈ Rn in the metric deter-
mined by G is
dist(A, B) = min {L[x] | x : [0, 1] → Rn , x(0) = A, x(1) = B} .
Next, assume that a curve y(·) minimizes the energy among paths connecting
A to B:
E[y] = min {E[w] | w : [0, 1] → Rn , w(0) = A, w(1) = B} .
According to Theorem 1.4.2,
$$(1.53) \qquad \frac{1}{2}\sum_{i,j=1}^n g_{ij}(y)\dot{y}_i\dot{y}_j = E$$
But also
$$\operatorname{dist}(A, B) = \mathcal{L}[x] \le \mathcal{L}[y] = \int_0^1 \left(\sum_{i,j=1}^n g_{ij}(y)\dot{y}_i\dot{y}_j\right)^{1/2} dt \le \left(\int_0^1 \sum_{i,j=1}^n g_{ij}(y)\dot{y}_i\dot{y}_j\,dt\right)^{1/2} = (2E)^{1/2},$$
according to (1.53). Hence E = dist(A, B)²/2; and therefore L[x] = L[y], E[y] = E[x].
Proof. 1. Consider a solution of the ODE system (1.54) with ẋ1 ≠ 0. Define
$$(1.55) \qquad a = x_1 + \frac{x_2\dot{x}_2}{\dot{x}_1}.$$
Then
$$\dot{a} = \dot{x}_1 + \frac{(\dot{x}_2)^2 + x_2\ddot{x}_2}{\dot{x}_1} - \frac{x_2\dot{x}_2\ddot{x}_1}{(\dot{x}_1)^2} = \dot{x}_1 + \frac{(\dot{x}_2)^2}{\dot{x}_1} + \frac{x_2}{\dot{x}_1}\cdot\frac{(\dot{x}_2)^2 - (\dot{x}_1)^2}{x_2} - \frac{x_2\dot{x}_2}{(\dot{x}_1)^2}\cdot\frac{2\dot{x}_1\dot{x}_2}{x_2} = 0;$$
consequently a is constant.
2. We claim that the motion of the point x = [x1 x2]ᵀ lies within a circle with center (a, 0). To confirm this, let us use (1.55) to calculate that
$$\frac{d}{ds}\left(\frac{(x_1 - a)^2 + (x_2)^2}{2}\right) = (x_1 - a)\dot{x}_1 + x_2\dot{x}_2 = -\frac{x_2\dot{x}_2}{\dot{x}_1}\,\dot{x}_1 + x_2\dot{x}_2 = 0.$$
Therefore
$$(x_1 - a)^2 + x_2^2 = r^2$$
for some appropriate radius r > 0.
GEOMETRIC INTERPRETATION. Thus the geodesics in the hyperbolic half plane are either vertical lines, or else approach the x1-axis as s → ±∞. The half circles they traverse have infinite length.
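Here is a numerical sketch of this proof (the geodesic equations (1.54) are written out below as reconstructed: ẍ1 = 2ẋ1ẋ2/x2, ẍ2 = ((ẋ2)² − (ẋ1)²)/x2). We integrate one geodesic and confirm that a from (1.55), and the squared distance to (a, 0), stay constant:

```python
import numpy as np
from scipy.integrate import solve_ivp

def rhs(s, u):
    # Hyperbolic half-plane geodesic equations, first-order form
    x1, x2, v1, v2 = u
    return [v1, v2, 2*v1*v2/x2, (v2**2 - v1**2)/x2]

sol = solve_ivp(rhs, [0, 3], [0.0, 1.0, 1.0, 0.0], rtol=1e-10, atol=1e-12)
x1, x2, v1, v2 = sol.y
a = x1 + x2*v2/v1                    # the conserved quantity (1.55)
r2 = (x1 - a[0])**2 + x2**2          # squared radius about the center (a, 0)
print(np.ptp(a), np.ptp(r2))         # both ~ 0: the motion stays on a circle
```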
Proof. 1. We compute using the cross product and the ODE (1.58) that
$$\left(\frac{x'}{1 + |x|^2}\times x\right)' = \left(\frac{x'}{1 + |x|^2}\right)'\times x + \frac{x'}{1 + |x|^2}\times x' = -\frac{2x}{(1 + |x|^2)^2}\times x = 0.$$
Hence
$$\frac{x'}{1 + |x|^2}\times x = b \qquad \text{for some vector } b \in \mathbb{R}^3.$$
If b 6= 0, then since x · b = 0, the trajectory lies in the plane through the
origin perpendicular to b. If b = 0, then x and x0 are everywhere parallel
and so the motion is along a line.
Let t = x0 be the unit tangent vector to the curve in the plane. Then
(1.60) t0 = κn
where n is the unit normal and κ ≥ 0 is the curvature. So {t, n} is an
orthonormal frame moving along the curve. If we differentiate the expression
t · n = 0, we see that
(1.61) n0 = −κt.
Chapter 2

SECOND VARIATION
(ii) Furthermore,
$$(2.2) \qquad \frac{\partial^2 L}{\partial z^2}(x, y_0(x), y_0'(x)) \ge 0 \qquad (a \le x \le b).$$
REMARKS.
(i) We call the left hand side of (2.1) the second variation of I[ · ]
about y0 (·), evaluated at w(·).
(ii) If the mapping
(2.3) z 7→ L(x, y0 (x), z) is convex,
then (2.2) holds. This observation strongly suggests that the convexity of
the Lagrangian L in the variable z will be a useful hypothesis if we try to
find minimizers: see Section 2.4.2.
for
$$a_\varepsilon = \frac{y_0(x_0 + \varepsilon) - y_0(x_0 - \delta\varepsilon)}{\varepsilon} - \delta z - y_0'(x_0),$$
where the remainder term r_ε satisfies the estimate
$$(2.7) \qquad |r_\varepsilon| \le A\,a_\varepsilon^2.$$
According to L'Hospital's Rule,
$$(2.8) \qquad \lim_{\varepsilon\to 0} a_\varepsilon = (1 + \delta)y_0'(x_0) - \delta z - y_0'(x_0).$$
$$(\text{R}) \qquad q' = -\frac{(q - B)^2}{A} + C \qquad (a \le x \le b),$$
for the functions A, B, C defined by (2.10).
Furthermore, if $w' - \frac{q - B}{A}w = 0$ on the interval [a, b] and w(a) = 0, then uniqueness of the solution of this ODE implies w ≡ 0.
Proof. We calculate
$$0 = -(Au' + Bu)' + Bu' + Cu = -(qu)' + Bu' + Cu = -q'u + (B - q)u' + Cu = -q'u - \frac{(q - B)^2}{A}u + Cu.$$
Divide by u > 0 to derive (R).
Proof. 1. It can be shown that if there are no conjugate points with respect
to a within (a, b], then in fact for some nearby point d < a there are no
conjugate points with respect to d within (d, b]. (The proof of this relies
upon some mathematical ideas a bit beyond the scope of these notes.)
implies
$$(2.21) \qquad I[y(\cdot)] \le I[\bar{y}(\cdot)]$$
for all ȳ : [a, b] → R satisfying ȳ(a) = y(a), ȳ(b) = y(b).
REMARK. We call such a local minimizer strong since we require only that (2.20) be valid, and not necessarily also that
$$(2.22) \qquad \max_{[a,b]}|y' - \bar{y}'| \le \delta.$$
y and ȳ are close, but y′ and ȳ′ are not
In particular (2.20) does not imply that L(x, ȳ, ȳ′) is close to L(x, y, y′) pointwise.
and
(iii)
$$(2.25) \qquad \frac{\partial w}{\partial c}(x, 0) > 0 \qquad (a \le x \le b).$$
NOTATION. In (2.23) and below we write
$$w'(x, c) = \frac{\partial w}{\partial x}(x, c).$$
The graph of y is red, the graph of ȳ is blue and the graphs of the other extremals are black
REMARK. This direct proof, which uses ideas from Ball-Murat [B-M],
is more straightforward than conventional arguments involving Hilbert’s in-
variant integral (as explained, for instance, in Kot [K]).
Clarke and Zeidan present in [C-Z] a very interesting alternative ap-
proach, using the Riccati equation (R) to build an appropriate calibration
function.
The graph of y is red and the graphs of the other extremals are blue
Proof. 1. First, as noted in the proof of Theorem 2.2.3, for some point
d < a there are no conjugate points with respect to d within (d, b]. By
solving the Euler-Lagrange ODE we can extend y to be a solution defined
on the larger interval [d, b].
Now solve the initial-value problems
$$(2.31) \qquad \begin{cases} -\left(\frac{\partial L}{\partial z}(x, w, w')\right)' + \frac{\partial L}{\partial y}(x, w, w') = 0 & (d \le x \le b) \\ w(d, c) = y(d) \\ w'(d, c) = y'(d) + c. \end{cases}$$
2. Define
$$(2.32) \qquad u(x) = \frac{\partial w}{\partial c}(x, 0).$$
We now differentiate the ODE in (2.31) with respect to the variable c. This gives the identity
$$0 = -\left(\frac{\partial^2 L}{\partial z\partial y}\frac{\partial w}{\partial c} + \frac{\partial^2 L}{\partial z^2}\frac{\partial w'}{\partial c}\right)' + \frac{\partial^2 L}{\partial y^2}\frac{\partial w}{\partial c} + \frac{\partial^2 L}{\partial y\partial z}\frac{\partial w'}{\partial c}.$$
We now put c = 0 and recall the definitions (2.10) and (2.32):
and consequently
$$\inf_{y(\cdot)\in\mathcal{A}} I[y(\cdot)] = 0.$$
But the minimum is not attained, since $\int_0^1 x^2(y')^2\,dx > 0$ for all y(·) ∈ A.
Chapter 3

MULTIVARIABLE VARIATIONAL PROBLEMS
Note that we insert the number u(x) into the y-variable slot of L(x, y, z),
and the vector ∇u(x) into the z-variable slot of L(x, y, z).
Then
$$I[u(\cdot)] = \frac{1}{2}\int_U |\nabla u(x)|^2\,dx,$$
REMARKS.
(i) The Euler-Lagrange equation corresponding to L is the PDE
$$(\text{E-L}) \qquad -\sum_{k=1}^n \frac{\partial}{\partial x_k}\left(\frac{\partial L}{\partial z_k}(x, u, \nabla u)\right) + \frac{\partial L}{\partial y}(x, u, \nabla u) = 0.$$
EXAMPLE. Let L(x, y, z) = |z|²/2. Then
$$\frac{\partial L}{\partial y} = 0, \qquad \frac{\partial L}{\partial z_k} = z_k.$$
Consequently the Euler-Lagrange equation is
$$\sum_{k=1}^n \frac{\partial}{\partial x_k}\left(\frac{\partial u}{\partial x_k}\right) = 0.$$
EXAMPLE. Let us for this example write points in Rn+1 as (x, t) with
x ∈ Rn denoting position and t ≥ 0 denoting time. We consider functions
u = u(x, t).
The Euler-Lagrange equation for the functional
$$I[u] = \frac{1}{2}\int_0^T\!\!\int_{\mathbb{R}^n}\left(\frac{\partial u}{\partial t}\right)^2 - |\nabla_x u|^2\,dxdt$$
is the wave equation
$$\frac{\partial^2 u}{\partial t^2} - \Delta u = 0,$$
where the Laplacian ∆, defined as in (3.2), acts in the x-variables. However,
it is easy to see that the functional I[ · ] is unbounded from below on any
reasonable admissible class of functions. Consequently, solutions of the wave
equation correspond to extremals that are not minimizers.
We write
$$\nu = \begin{bmatrix} \nu_1 \\ \vdots \\ \nu_n \end{bmatrix}$$
for the outward pointing unit normal vector field along ∂U.
LEMMA 3.2.1.
(i) For each k = 1, …, n we have the multivariable integration-by-parts formula
$$(3.3) \qquad \int_U \frac{\partial f}{\partial x_k}\,g\,dx = -\int_U f\,\frac{\partial g}{\partial x_k}\,dx + \int_{\partial U} fg\,\nu_k\,dS.$$
(ii) If f : U → R is continuous and
$$\int_U fw\,dx = 0 \qquad \text{for all continuous } w : \bar{U}\to\mathbb{R} \text{ with } w = 0 \text{ on } \partial U,$$
then f ≡ 0 within U.
Apply this to the vector field h = [0, . . . , 0, f g, 0, . . . , 0]T with the nonzero
term f g in the k-th slot.
(ii) Furthermore,
DEFINITION. The expression on the left hand side of (3.4) is the second variation of I[·] about u0(·), evaluated at w(·).
If the mapping
(3.6) z 7→ L(x, u0 (x), z) is convex,
then (3.5) holds automatically.
Put τ = 0 and recall that $\frac{d^2i}{d\tau^2}(0) \ge 0$, to prove (3.4).
for each ε > 0. Then wε is differentiable except along finitely many hyperplanes that intersect U, with
$$\nabla w_\varepsilon(x) = \phi'\!\left(\tfrac{x\cdot y}{\varepsilon}\right)\zeta(x)\,y + \varepsilon\,\phi\!\left(\tfrac{x\cdot y}{\varepsilon}\right)\nabla\zeta(x).$$
Note that the second term on the right is less than or equal to Aε for some appropriate constant A.
Insert wε in place of w in (3.4). It follows that
$$\int_U \sum_{k,l=1}^n \frac{\partial^2 L}{\partial z_k\partial z_l}(x, u_0, \nabla u_0)\,y_ky_l\,\zeta^2\,\phi'\!\left(\tfrac{x\cdot y}{\varepsilon}\right)^2 dx + B\varepsilon \ge 0$$
We also redefine
(3.8) A = {u : Ū → R | u is continuously differentiable},
so as now to require nothing about the boundary behavior of admissible
functions.
THEOREM 3.4.1. Let the energy be given by (3.7) and the admissible class by (3.8). Assume u0(·) ∈ A is a twice continuously differentiable minimizer.
(i) Then u0 solves the usual Euler-Lagrange equation
$$(3.9) \qquad -\sum_{k=1}^n \frac{\partial}{\partial x_k}\left(\frac{\partial L}{\partial z_k}(x, u_0, \nabla u_0)\right) + \frac{\partial L}{\partial y}(x, u_0, \nabla u_0) = 0$$
within U.
(ii) Furthermore, we have the boundary condition
$$(3.10) \qquad \sum_{k=1}^n \frac{\partial L}{\partial z_k}(x, u_0, \nabla u_0)\,\nu_k + \frac{\partial B}{\partial y}(x, u_0) = 0$$
on ∂U.
2. Now
$$i(\tau) = I[u_\tau(\cdot)] = \int_U L(x, u_0 + \tau w, \nabla u_0 + \tau\nabla w)\,dx + \int_{\partial U} B(x, u_0 + \tau w)\,dS.$$
Therefore
$$0 = \frac{di}{d\tau}(0) = \int_U \frac{\partial L}{\partial y}(x, u_0, \nabla u_0)\,w + \sum_{k=1}^n \frac{\partial L}{\partial z_k}(x, u_0, \nabla u_0)\frac{\partial w}{\partial x_k}\,dx + \int_{\partial U} \frac{\partial B}{\partial y}(x, u_0)\,w\,dS.$$
over the admissible class (3.8) solves the nonlinear boundary-value problem
$$\begin{cases} \Delta u_0 = 0 & \text{in } U \\ \frac{\partial u_0}{\partial\nu} + b(u_0) = 0 & \text{on } \partial U, \end{cases}$$
where b = B′ and
$$\frac{\partial u}{\partial\nu} = \nabla u\cdot\nu$$
is the outward normal derivative of u.
We omit the proof, which is similar to that for Theorem 1.3.5. As in that
previous theorem, we interpret λ0 as the Lagrange multiplier for the integral
constraint J[u(·)] = 0. The requirement (3.16) is a constraint qualification
condition.
For an application, see Theorem 3.5.1 below.
3.4.4. Systems.
We can also extend our theory to handle systems. For this, we assume
g : ∂U → Rm is given, and redefine the class of admissible functions, now
to be
A = {u : Ū → Rm | u = g on ∂U }.
NOTATION. We write
$$\mathbf{u} = \begin{bmatrix} u_1 \\ \vdots \\ u_m \end{bmatrix}, \qquad \nabla\mathbf{u} = \begin{bmatrix} (\nabla u_1)^T \\ \vdots \\ (\nabla u_m)^T \end{bmatrix} = \begin{bmatrix} \frac{\partial u_1}{\partial x_1} & \cdots & \frac{\partial u_1}{\partial x_n} \\ \vdots & \ddots & \vdots \\ \frac{\partial u_m}{\partial x_1} & \cdots & \frac{\partial u_m}{\partial x_n} \end{bmatrix}, \qquad Z = \begin{bmatrix} z_1^1 & \cdots & z_n^1 \\ \vdots & \ddots & \vdots \\ z_1^m & \cdots & z_n^m \end{bmatrix}.$$
DEFINITION. If u(·) ∈ A, we define
$$I[\mathbf{u}(\cdot)] = \int_U L(x, \mathbf{u}(x), \nabla\mathbf{u}(x))\,dx.$$
Note that we insert u(x) into the y-variables of L(x, y, Z), and ∇u(x) into the Z-variables.
THEOREM 3.4.3. Assume u0(·) ∈ A minimizes I[·] and u0 is twice continuously differentiable.
Then u0 solves within U the system of nonlinear PDE
$$(\text{E-L}) \qquad -\sum_{k=1}^n \frac{\partial}{\partial x_k}\left(\frac{\partial L}{\partial z_k^l}(x, \mathbf{u}_0(x), \nabla\mathbf{u}_0(x))\right) + \frac{\partial L}{\partial y_l}(x, \mathbf{u}_0(x), \nabla\mathbf{u}_0(x)) = 0$$
for l = 1, …, m.
2. We have
$$i(\tau) = I[\mathbf{u}_\tau(\cdot)] = \int_U L(x, \mathbf{u}_0(x) + \tau\mathbf{w}(x), \nabla\mathbf{u}_0(x) + \tau\nabla\mathbf{w}(x))\,dx.$$
Therefore
$$\frac{di}{d\tau}(\tau) = \int_U \sum_{l=1}^m \frac{\partial L}{\partial y_l}(x, \mathbf{u}_0 + \tau\mathbf{w}, \nabla\mathbf{u}_0 + \tau\nabla\mathbf{w})\,w_l + \sum_{l=1}^m\sum_{k=1}^n \frac{\partial L}{\partial z_k^l}(x, \mathbf{u}_0 + \tau\mathbf{w}, \nabla\mathbf{u}_0 + \tau\nabla\mathbf{w})\frac{\partial w_l}{\partial x_k}\,dx;$$
and so
$$0 = \frac{di}{d\tau}(0) = \int_U \sum_{l=1}^m \frac{\partial L}{\partial y_l}(x, \mathbf{u}_0, \nabla\mathbf{u}_0)\,w_l + \sum_{l=1}^m\sum_{k=1}^n \frac{\partial L}{\partial z_k^l}(x, \mathbf{u}_0, \nabla\mathbf{u}_0)\frac{\partial w_l}{\partial x_k}\,dx.$$
Upon integrating by parts, we deduce as usual that the l-th equation of the (E-L) system of PDE holds.
3.5. Applications
3.5.1. Eigenvalues, eigenfunctions.
Assume for this section that U ⊂ Rn is a connected, open set, with
smooth boundary ∂U .
THEOREM 3.5.1. (i) There exists a real number λ1 > 0 and a smooth function w1 such that
$$(3.18) \qquad \begin{cases} -\Delta w_1 = \lambda_1 w_1 & \text{in } U \\ w_1 = 0 & \text{on } \partial U \\ w_1 > 0 \text{ in } U, \quad \int_U w_1^2\,dx = 1. \end{cases}$$
(ii) Furthermore,
$$(3.19) \qquad \lambda_1 = \min\left\{\int_U |\nabla u|^2\,dx \;\Big|\; \int_U u^2\,dx = 1,\ u = 0 \text{ on } \partial U\right\}.$$
(ii) Furthermore,
$$(3.22) \qquad \lambda_k = \min\left\{\int_U |\nabla u|^2\,dx \;\Big|\; \int_U u^2\,dx = 1,\ u = 0 \text{ on } \partial U,\ \int_U w_lu\,dx = 0\ (l = 1, \dots, k-1)\right\}.$$
Therefore
$$\mu_m = -\int_U \Delta w_k\,w_m\,dx = \int_U \nabla w_k\cdot\nabla w_m\,dx = -\int_U w_k\,\Delta w_m\,dx = \lambda_m\int_U w_kw_m\,dx = 0.$$
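As a numerical illustration of the variational characterization (3.19) in one dimension (U = (0, 1) is an assumed example), the discrete minimization of ∫|u′|² over ∫u² = 1 with zero boundary values reduces to the smallest eigenvalue of the second-difference matrix, which approaches π²:

```python
import numpy as np

n = 400; h = 1.0/n
# Matrix of -d^2/dx^2 on (0,1) with Dirichlet boundary conditions
A = (np.diag(2.0*np.ones(n-1)) + np.diag(-1.0*np.ones(n-2), 1)
     + np.diag(-1.0*np.ones(n-2), -1)) / h**2
lam1 = np.linalg.eigvalsh(A)[0]
print(lam1, np.pi**2)      # ~ 9.8696: the principal eigenvalue of -u'' = lam u
```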
Then
$$(3.24) \qquad \begin{cases} -\Delta\mathbf{u}_0 = |\nabla\mathbf{u}_0|^2\,\mathbf{u}_0 & \text{in } U \\ \mathbf{u}_0 = \mathbf{g} & \text{on } \partial U. \end{cases}$$
2. We have
$$(3.26) \qquad i'(0) = \int_U \nabla\mathbf{u}_0\cdot\nabla\mathbf{u}'(0)\,dx = 0,$$
where ′ = d/dτ. But we compute directly from (3.25) that
$$\mathbf{u}'(\tau) = \frac{\mathbf{w}}{|\mathbf{u}_0 + \tau\mathbf{w}|} - \frac{[(\mathbf{u}_0 + \tau\mathbf{w})\cdot\mathbf{w}]\,(\mathbf{u}_0 + \tau\mathbf{w})}{|\mathbf{u}_0 + \tau\mathbf{w}|^3}.$$
So u′(0) = w − (u0 · w)u0. Put this equality into (3.26):
$$(3.27) \qquad 0 = \int_U \nabla\mathbf{u}_0\cdot\nabla\mathbf{w} - \nabla\mathbf{u}_0\cdot\nabla((\mathbf{u}_0\cdot\mathbf{w})\mathbf{u}_0)\,dx.$$
We next differentiate the identity |u0|² ≡ 1, to learn that
$$(\nabla\mathbf{u}_0)^T\mathbf{u}_0 = 0.$$
Using this fact, we then verify that
$$\nabla\mathbf{u}_0\cdot\nabla((\mathbf{u}_0\cdot\mathbf{w})\mathbf{u}_0) = |\nabla\mathbf{u}_0|^2(\mathbf{u}_0\cdot\mathbf{w})$$
We assume that u = u(x, t) solves this PDE, with the boundary condition
$$(3.30) \qquad u = 0 \qquad \text{on } \partial U.$$
THEOREM 3.5.4.
(i) The function φ(t) = I[u(·, t)] (0 ≤ t < ∞) is nonincreasing on [0, ∞).
(ii) Assume in addition that (y, z) ↦ L(x, y, z) is convex. Then φ is convex on [0, ∞).
We compute
$$\frac{d}{dt}I[u(\cdot, t)] = \int_U \left(\frac{\partial L}{\partial y}(x, u, \nabla u) - \operatorname{div}(\nabla_z L(x, u, \nabla u))\right)\frac{\partial u}{\partial t}\,dx = -\int_U \left(\frac{\partial u}{\partial t}\right)^2 dx \le 0.$$
Observe that there is no boundary term when we integrate by parts, since if (3.30) holds for all times t, then ∂u/∂t = 0 on ∂U.
2. Differentiate again in t:
$$\frac{d^2}{dt^2}I[u(\cdot, t)] = -\frac{d}{dt}\int_U \left(\frac{\partial u}{\partial t}\right)^2 dx = -2\int_U \frac{\partial u}{\partial t}\frac{\partial^2 u}{\partial t^2}\,dx.$$
We can also differentiate the PDE (3.29), to find
$$\frac{\partial^2 u}{\partial t^2} = \sum_{k=1}^n \frac{\partial}{\partial x_k}\left(\frac{\partial^2 L}{\partial z_k\partial y}\frac{\partial u}{\partial t} + \sum_{l=1}^n \frac{\partial^2 L}{\partial z_k\partial z_l}\frac{\partial^2 u}{\partial x_l\partial t}\right) - \frac{\partial^2 L}{\partial y^2}\frac{\partial u}{\partial t} - \sum_{l=1}^n \frac{\partial^2 L}{\partial y\partial z_l}\frac{\partial^2 u}{\partial x_l\partial t}.$$
We insert this into the previous calculation, and integrate by parts:
$$\frac{d^2}{dt^2}I[u(\cdot, t)] = 2\int_U \sum_{k,l=1}^n \frac{\partial^2 L}{\partial z_k\partial z_l}\frac{\partial^2 u}{\partial x_k\partial t}\frac{\partial^2 u}{\partial x_l\partial t} + 2\sum_{k=1}^n \frac{\partial^2 L}{\partial z_k\partial y}\frac{\partial^2 u}{\partial x_k\partial t}\frac{\partial u}{\partial t} + \frac{\partial^2 L}{\partial y^2}\left(\frac{\partial u}{\partial t}\right)^2 dx \ge 0,$$
the last inequality holding provided L is convex in the variables (y, z).
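For the model case L = |z|²/2 (so the PDE (3.29) is the heat equation), a small explicit-in-time sketch displays the monotone decay of the energy; the initial data are assumed for illustration:

```python
import numpy as np

n, dt = 100, 1e-5
x = np.linspace(0, 1, n+1); h = x[1] - x[0]
u = np.sin(np.pi*x) + 0.3*np.sin(3*np.pi*x)       # assumed initial data, u = 0 at ends

def dirichlet_energy(u):
    return 0.5*np.sum(((u[1:] - u[:-1])/h)**2)*h  # I[u] = (1/2) * integral of u_x^2

E_start = dirichlet_energy(u)
for _ in range(2000):                              # explicit scheme for u_t = u_xx
    u[1:-1] += dt*(u[2:] - 2*u[1:-1] + u[:-2])/h**2
print(E_start, dirichlet_energy(u))                # the energy strictly decreases
```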
Chapter 4

OPTIMAL CONTROL THEORY

$$f : \mathbb{R}^n \times A \to \mathbb{R}^n,$$
We write A for the collection of all admissible controls, and will always
assume that for each α(·) ∈ A, the solution x(·) of (ODE) exists and is
unique. We call x(·) the response of the system to the control α(·) ∈ A.
NOTATION. We write
$$f(x, a) = \begin{bmatrix} f_1(x, a) \\ \vdots \\ f_n(x, a) \end{bmatrix}, \qquad x(t) = \begin{bmatrix} x_1(t) \\ \vdots \\ x_n(t) \end{bmatrix}.$$
REMARK. More precisely, this is a fixed time, free endpoint optimal control problem, instances of which appear as the next two examples. Other problems are instead free time, fixed endpoint: see the third example following.
We introduce
y(t) = position at time t
ẏ(t) = velocity
ÿ(t) = acceleration
α(t) = thrust of rocket engines
T = time the car arrives at the origin, with zero velocity.
We assume concerning the thrust that in appropriate physical units we have −1 ≤ α(t) ≤ 1; consequently α : [0, T] → A for A = [−1, 1]. If the car has mass 1, then Newton's Law tells us that
$$\ddot{y}(t) = \alpha(t).$$
We rewrite this problem into the general form discussed before, setting n = 2, m = 1. We put
$$x(t) = \begin{bmatrix} x_1(t) \\ x_2(t) \end{bmatrix} = \begin{bmatrix} y(t) \\ \dot{y}(t) \end{bmatrix}.$$
Then
T0 = −P[α0 (·)] is the minimum time to steer to the origin.
THEOREM 4.2.2. For each time t > 0, the reachable set K(t, x0 ) is
convex and closed.
Let 0 ≤ θ ≤ 1. Then
$$\theta x^1 + (1 - \theta)x^2 = X(t)x^0 + X(t)\int_0^t X^{-1}(s)N(\theta\alpha_1(s) + (1 - \theta)\alpha_2(s))\,ds.$$
Since θα1(·) + (1 − θ)α2(·) ∈ A, we see that θx¹ + (1 − θ)x² ∈ K(t, x⁰).
Define
$$h = X^T(T_0)g;$$
so that hᵀ = gᵀX(T0). Then
$$\int_0^{T_0} h^TX^{-1}(s)N\alpha(s)\,ds \le \int_0^{T_0} h^TX^{-1}(s)N\alpha_0(s)\,ds,$$
and therefore
$$(4.6) \qquad \int_0^{T_0} h\cdot X^{-1}(s)N(\alpha_0(s) - \alpha(s))\,ds \ge 0$$
for all controls α(·) ∈ A.
3. Now select any time 0 < t < T0 and any value a ∈ A. We pick δ > 0 so small that the interval [t, t + δ] lies in [0, T0]. Define
$$\alpha(s) = \begin{cases} a & \text{if } t \le s \le t + \delta \\ \alpha_0(s) & \text{otherwise}; \end{cases}$$
then (4.6) implies
$$(4.7) \qquad \frac{1}{\delta}\int_t^{t+\delta} h\cdot X^{-1}(s)N(\alpha_0(s) - a)\,ds \ge 0.$$
We send δ → 0, to deduce that if t is a point of continuity of α0(·), then
$$(4.8) \qquad h\cdot X^{-1}(t)N\alpha_0(t) \ge h\cdot X^{-1}(t)Na$$
for all a ∈ A. This implies the maximization assertion (M).
REMARKS. (i) This outline of the proof needs more details, in particular
for the assertion (4.4) that 0 lies on the boundary of the reachable set.
(ii) If an optimal control α0(·) is measurable, but not necessarily piecewise continuous, the same proof shows that (M) holds for almost every time t in the interval [0, T0].
for an optimal control α(·). We will now extract from this the useful information that an optimal control α0(·) takes on only the values ±1 and switches between these values at most once.
We must first compute X(t) = e^{tM}. To do so, we observe
$$M^0 = I, \qquad M = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}, \qquad M^2 = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}\begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix} = O;$$
and therefore Mᵏ = O for all k ≥ 2, where O denotes the zero matrix. Consequently,
$$e^{tM} = I + tM = \begin{bmatrix} 1 & t \\ 0 & 1 \end{bmatrix}.$$
Then
$$X^{-1}(t) = \begin{bmatrix} 1 & -t \\ 0 & 1 \end{bmatrix}; \qquad h\cdot X^{-1}(t)N = [h_1\ h_2]\begin{bmatrix} 1 & -t \\ 0 & 1 \end{bmatrix}\begin{bmatrix} 0 \\ 1 \end{bmatrix} = -th_1 + h_2.$$
Thus (4.9) asserts
$$(4.10) \qquad (-th_1 + h_2)\,\alpha_0(t) = \max_{|a|\le 1}\{(-th_1 + h_2)\,a\};$$
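This computation is easy to confirm numerically (a quick sketch using SciPy's matrix exponential):

```python
import numpy as np
from scipy.linalg import expm

M = np.array([[0.0, 1.0],
              [0.0, 0.0]])
t = 1.7                                  # an arbitrary test time
print(expm(t*M))                         # equals [[1, t], [0, 1]]
print(np.linalg.inv(expm(t*M)))          # equals [[1, -t], [0, 1]]
```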
(ii) Since ∂H/∂p_i = f_i for i = 1, …, n, we have
$$(4.11) \qquad \nabla_pH = f.$$
Also, $\frac{\partial H}{\partial x_i} = \sum_{j=1}^n \frac{\partial f_j}{\partial x_i}(x, a)p_j + \frac{\partial r}{\partial x_i}(x, a)$ for i = 1, …, n. Consequently,
$$(4.12) \qquad \nabla_xH = (\nabla_xf)^Tp + \nabla_xr.$$
$$(\text{M}) \qquad H(x_0(t), p_0(t), \alpha_0(t)) = \max_{a\in A} H(x_0(t), p_0(t), a),$$
and
$$(\text{T}) \qquad p_0(T) = \nabla g(x_0(T)).$$
(ii) In addition,
$$(4.13) \qquad H(x_0, p_0, \alpha_0)\ \text{is constant on } [0, T].$$
$$x_0(t) = \begin{bmatrix} x_1^0(t) \\ \vdots \\ x_n^0(t) \end{bmatrix}, \qquad p_0(t) = \begin{bmatrix} p_1^0(t) \\ \vdots \\ p_n^0(t) \end{bmatrix}.$$
REMARKS.
(i) To be more precise, (ODE), (ADJ) and (M) hold at times 0 < t < T that are points of continuity of the optimal control α0(·).
(ii) The most important assertion is (M). In practice, this usually allows us to transform the infinite dimensional problem of finding an optimal control α0(·) ∈ A into a finite dimensional problem, at each time t, involving maximization over A ⊆ Rᵐ.
(iii) The costate equation (ADJ) and transversality condition (T) represent a sort of "back propagation" of information from the terminal time T. We can also interpret the costate as a Lagrange multiplier corresponding to the constraint that x0(·) solves (ODE).
Note that we specify the initial condition x0(0) = x⁰ for (ODE) and the terminal condition p(T) = ∇g(x0(T)) for (ADJ). Hence even if α0(·) is known, solving this coupled system of equations can be tricky.
(iv) Remember from Section 1.4.3 that if H : Rn ×Rn → R, H = H(x, p),
we call
$$(\text{H}) \qquad \begin{cases} \dot{x} = \nabla_pH(x, p) \\ \dot{p} = -\nabla_xH(x, p) \end{cases}$$
a Hamiltonian system of ODE. Notice that (ODE), (ADJ) are of this Hamil-
tonian form, except that now H = H(x, p, a) depends also on the control.
Observe furthermore that our assertion (4.13) is similar to (1.42) from The-
orem 1.4.1.
(ii) If q0 = 0, then
(4.16) p0 (·) is not identically 0 on [0, T0 ].
(iii) Furthermore,
REMARKS.
(i) So for the free time problem, we have the transversality condition
that H(x0 , p0 , α0 , q0 ) = 0 at T0 and thus H(x0 , p0 , α0 , q0 ) = 0 on the entire
interval [0, T0 ]. This generalization of our earlier Theorem 1.3.4 is stronger
than the corresponding assertion (4.13) for the fixed time problem.
(ii) But for the free time problem, we must deal with an additional Lagrange multiplier q0. We say the free time problem is normal if q0 = 1;
it is abnormal if q0 = 0. (The abnormal case is analogous to the existence
of the abnormal Lagrange multiplier γ0 = 0 in the F. John conditions for
finite dimensional optimization theory. See the Math 170 notes for more on
this.)
Most free time problems are normal, and a simple assumption ensuring
this follows.
holds.
Then the associated free time, fixed endpoint control problem is normal.
EXAMPLE. Let us check that Theorem 4.3.2 accords with our previous
maximum principle for the time optimal linear problem, as developed in
Section 4.2. We have
H(x, p, a, q) = (M x + N a) · p − q (x, p ∈ Rn , a ∈ A, q ∈ R).
Select the vector h as in Theorem 4.2.3, and consider the system
$$\begin{cases} \dot{p}_0(t) = -M^Tp_0(t) \\ p_0(0) = h, \end{cases}$$
the solution of which is
$$p_0(t) = X^{-T}(t)h.$$
We know from condition (M) in Theorem 4.2.3 that
$$h\cdot X^{-1}(t)N\alpha_0(t) = \max_{a\in A}\{h\cdot X^{-1}(t)Na\}.$$
Then
$$p_0(t)\cdot(Mx_0(t) + N\alpha_0(t)) - q_0 = \max_{a\in A}\{p_0(t)\cdot(Mx_0(t) + Na)\} - q_0,$$
for which fixed initial and terminal values x⁰ and x¹ are given, along with a fixed terminal time T > 0. This is a fixed time, fixed endpoint problem. The payoff is again
$$(\text{P}) \qquad P[\alpha(\cdot)] = \int_0^T r(x(t), \alpha(t))\,dt$$
(ii) Furthermore,
(4.19) H (x0 , p0 , α0 , q0 ) is constant on [0, T ].
REMARKS.
(i) There is no transversality condition (T), since the fixed time, fixed
endpoint condition is too rigid to allow for any variations. As before, we
call the problem normal if q0 = 1 and abnormal if q0 = 0.
4.4. Applications
HOW TO USE THE PONTRYAGIN MAXIMUM PRINCIPLE
Step 1. Write down the Hamiltonian
$$H = \begin{cases} f(x, a)\cdot p + r(x, a) & \text{for fixed time, free endpoint problems} \\ f(x, a)\cdot p + q\,r(x, a) & \text{for fixed endpoint problems}, \end{cases}$$
and calculate
$$\frac{\partial H}{\partial x_i}, \quad \frac{\partial H}{\partial p_i} \qquad (i = 1, \dots, n).$$
Step 2. Write down (ODE), (ADJ), (M) and, if appropriate, (T).
Step 3. Use the maximization condition (M) to compute, if possible,
α0 (t) as a function of x0 (t), p0 (t).
Step 4. Now try to solve the coupled equations (ODE), (ADJ) and (T),
to find x0 (·), p0 (·) (and q0 for free time problems).
To simplify notation we will mostly not write the subscripts "0" in the following examples:
We start with (M), and compute the value of α by noting that the (unconstrained) maximum occurs where ∂H/∂a = 0. Since ∂H/∂a = p − 2a, we see that
$$(4.20) \qquad \alpha = \frac{p}{2}.$$
We use this information to rewrite (ODE), (ADJ):
$$(4.21) \qquad \begin{cases} \dot{x} = x + \frac{p}{2} \\ \dot{p} = 2x - p, \end{cases}$$
with the initial condition x(0) = x0 and the terminal condition p(T ) = 0.
To solve (4.21), let us suppose that we can write
$$(4.22) \qquad p(t) = d(t)x(t) \qquad (0 \le t \le T),$$
for some function d(·) that we must find. To find an equation for d(·), we assume that (4.21) is valid and compute
$$\dot{p} = \dot{d}x + d\dot{x}, \qquad 2x - p = \dot{d}x + d\left(x + \frac{p}{2}\right), \qquad (2 - d)x = \dot{d}x + dx + \frac{d^2x}{2}.$$
Cancelling the x, we discover that we should select d(·) to solve the Riccati equation
$$(4.23) \qquad \begin{cases} \dot{d} = 2 - 2d - \frac{d^2}{2} \\ d(T) = 0. \end{cases}$$
Conversely, we check that if d(·) solves this Riccati equation and (4.22) holds, then we have (4.21).
So we solve the terminal value problem for this nonlinear ODE, to get the function d(·). Recalling then (4.22) and (4.20), we set
$$\alpha_0(t) = \frac{1}{2}d(t)x_0(t) \qquad (0 \le t \le T),$$
to synthesize an optimal feedback control, where x0(·) solves (ODE) for this control:
$$\begin{cases} \dot{x}_0 = x_0 + \frac{1}{2}dx_0 \\ x_0(0) = x^0. \end{cases}$$
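In practice the terminal-value problem (4.23) is easily integrated backward in time; here is a minimal numerical sketch (the horizon T = 1 is an assumed value):

```python
import numpy as np
from scipy.integrate import solve_ivp

T = 1.0
sol = solve_ivp(lambda t, d: 2 - 2*d - d**2/2, [T, 0.0], [0.0],
                dense_output=True, rtol=1e-9)
d = lambda t: float(sol.sol(t)[0])
print(d(0.0))     # d at time 0; the optimal feedback is alpha_0(t) = d(t) x_0(t) / 2
```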
where the control α takes values in A = [0, 1]. We assume x0 > 0, and for
simplicity have taken the growth rate γ = 1. The payoff is
$$(\text{P}) \qquad P[\alpha(\cdot)] = \int_0^T (1 - \alpha)x\,dt.$$
This fits within our fixed time PMP setting, for n = m = 1 and
$$f = ax, \qquad r = (1 - a)x, \qquad g = 0.$$
Thus the Hamiltonian is
$$H = f(x, a)p + r(x, a) = axp + (1 - a)x = x + ax(p - 1);$$
and so
$$\frac{\partial H}{\partial p} = ax, \qquad \frac{\partial H}{\partial x} = 1 + a(p - 1).$$
Consequently,
$$(\text{ADJ}) \qquad \dot{p} = -\frac{\partial H}{\partial x} = -1 - \alpha(p - 1).$$
Furthermore, we have
$$(\text{M}) \qquad H(x(t), p(t), \alpha(t)) = \max_{0\le a\le 1}\{x(t) + ax(t)(p(t) - 1)\};$$
$$(\text{T}) \qquad p(T) = 0.$$
We see that p(t) > 1 if t1 ≤ t ≤ t0 for some time t1 < t0. But then (4.24) says α = 1 for t1 ≤ t ≤ t0, and therefore
$$\dot{p} = -1 - (p - 1) = -p.$$
Since p(t0) = 1, it follows that p(t) = e^{t_0 - t} > 1 for t1 ≤ t < t0. Consequently t1 = 0 and p(t) > 1 for 0 ≤ t < t0.
So the optimal control is
$$\alpha_0(t) = \begin{cases} 1 & (0 \le t < T - 1) \\ 0 & (T - 1 < t \le T). \end{cases}$$
This means that we should reinvest all of the output until the time t0 = T −1,
and thereafter consume all the output.
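A quick simulation (the horizon T = 3 and initial output x0 = 1 are assumed values) confirms that this policy earns the payoff e^{T−1}x0:

```python
import numpy as np

T, dt = 3.0, 1e-5
x, payoff = 1.0, 0.0                 # assumed initial output x0 = 1
for t in np.arange(0.0, T, dt):
    a = 1.0 if t < T - 1 else 0.0    # reinvest until t0 = T - 1, then consume
    payoff += dt*(1 - a)*x
    x += dt*a*x
print(payoff, np.e**(T - 1))         # ~ e^{T-1} x0, as the PMP analysis predicts
```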
REMARK. The formulas (ODE) and (ADJ) from the PMP provide us with explicit differential equations for the optimal states and costates, but we do not in general have a differential equation for the corresponding optimal control. Indeed, the production/consumption example above has a "bang-bang" control, which is piecewise constant with a single jump, and so does not solve any differential equation.
However the next two applications illustrate that we can sometimes establish ODE also for the controls. The idea will be to try to eliminate p(·) from the various equations.
for some appropriate function f : [0, ∞) → [0, ∞). So this is a fixed time,
fixed endpoint problem, which we assume is normal.
We wish to find an optimal consumption plan to maximize the payoff
$$(\text{P}) \qquad P[c(\cdot)] = \int_0^T \psi(c)\,dt,$$
where ψ : [0, ∞) → [0, ∞), the consumption utility function, satisfies ψ′ > 0, ψ″ < 0.
We will not analyze this problem completely, but will show that we can derive an ODE for an optimal consumption policy. The Hamiltonian is
$$H = (f(x) - c)p + \psi(c),$$
and therefore
$$(\text{ADJ}) \qquad \dot{p} = -f'(x)p;$$
How do we adjust the angle ξ(·) so as to steer the boat between two given points x⁰, x¹ in the least time?
This is a free time, fixed endpoint problem for which the control is ξ(·). We assume the problem is normal, and so the Hamiltonian is
$$H = (v_1(x) + \cos\xi)\,p_1 + (v_2(x) + \sin\xi)\,p_2 - 1.$$
Consequently
$$(\text{ADJ}) \qquad \begin{cases} \dot{p}_1 = -p_1\frac{\partial v_1}{\partial x_1} - p_2\frac{\partial v_2}{\partial x_1} \\ \dot{p}_2 = -p_1\frac{\partial v_1}{\partial x_2} - p_2\frac{\partial v_2}{\partial x_2}, \end{cases}$$
and maximizing H over the steering angle gives
$$(4.29) \qquad \sin\xi = \frac{p_2}{(p_1^2 + p_2^2)^{1/2}}, \qquad \cos\xi = \frac{p_1}{(p_1^2 + p_2^2)^{1/2}}.$$
For this problem, it turns out that we can eliminate the costates p1, p2 and so express the optimal dynamics in terms of x1, x2 and ξ. To do so, let us use (4.28) and (ADJ) to compute
$$\dot{\xi} = \frac{1}{1 + \left(\frac{p_2}{p_1}\right)^2}\cdot\frac{\dot{p}_2p_1 - p_2\dot{p}_1}{p_1^2} = \frac{p_1}{p_1^2 + p_2^2}\left(-p_1\frac{\partial v_1}{\partial x_2} - p_2\frac{\partial v_2}{\partial x_2}\right) - \frac{p_2}{p_1^2 + p_2^2}\left(-p_1\frac{\partial v_1}{\partial x_1} - p_2\frac{\partial v_2}{\partial x_1}\right).$$
Then (4.29) implies
$$(4.30) \qquad \dot{\xi} = \sin^2\xi\,\frac{\partial v_2}{\partial x_1} + \sin\xi\cos\xi\left(\frac{\partial v_1}{\partial x_1} - \frac{\partial v_2}{\partial x_2}\right) - \cos^2\xi\,\frac{\partial v_1}{\partial x_2}.$$
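Equation (4.30) closes the system: together with the dynamics it lets us integrate the optimal steering directly. Here is a sketch for the assumed shear current v = (x2, 0), for which (4.30) reduces to ξ̇ = −cos²ξ:

```python
import numpy as np
from scipy.integrate import solve_ivp

# Boat with unit speed in the current v = (x2, 0) (an assumed example):
#   x1' = cos(xi) + x2,  x2' = sin(xi),  xi' = -cos(xi)^2   by (4.30)
def rhs(t, u):
    x1, x2, xi = u
    return [np.cos(xi) + x2, np.sin(xi), -np.cos(xi)**2]

sol = solve_ivp(rhs, [0, 2], [0.0, 0.0, np.pi/4], rtol=1e-9)
print(sol.y[:, -1])    # terminal position and steering angle along this extremal
```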
where, as in the previous example, ξ is the angle from due east, as illustrated
below.
We assume the path of the boat is a simple, closed curve, traversed
counterclockwise. The area enclosed by the curve is
$$(\text{P}) \qquad P[\xi(\cdot)] = \frac{1}{2}\int x_1\,dx_2 - x_2\,dx_1 = \frac{1}{2}\int_0^T x_1\dot{x}_2 - x_2\dot{x}_1\,dt.$$
Chaplygin's problem
This implies
$$\dot{r} = \dot{x}_1\cos\theta + \dot{x}_2\sin\theta.$$
Now use (4.34) to compute
$$\dot{r} - \frac{v}{V}\dot{x}_2 = \dot{x}_1\cos\theta + \dot{x}_2\sin\theta - \frac{v}{V}V\sin\xi = (V\cos\xi + v)\cos\theta + V\sin\xi\sin\theta - v\sin\xi = (V\cos\xi + v)\sin\xi - V\sin\xi\cos\xi - v\sin\xi = 0.$$
Hence for some constant γ, we have
$$r - ex_2 = \gamma,$$
where e = v/V. Thus the motion lies on the projection into R² of the intersection in R³ of the cone x3 = r with the plane x3 = ex2 + γ. This is an ellipse, since e < 1.
REMARKS. If v = 0, the motion of the boat is a circle. We have thus
in particular shown that among smooth curves of fixed length, a circle en-
closes the maximum area. (More precisely, we have shown that if there
exists a smooth minimizer, it is a circle.) Compare this assertion with the
isoperimetric problem discussed on page 25.
$$x(t_1) = x^*, \qquad x(t_2) = x^*.$$
(We assume that the values of x(·) are between 0 and 1, in appropriate units.
Thus the fish population will rise if α ≡ 0.)
Optimal harvesting
Another harvesting plan
Now
$$P[\alpha(\cdot)] = \int_0^T (x - \theta)\alpha\,dt = \int_0^T (x - \theta)\,\frac{x(1 - x) - \dot{x}}{qx}\,dt.$$
The part of the integrand that involves ẋ is a null Lagrangian, and consequently that part of the payoff depends only upon the fixed boundary values x⁰ and x¹. That is,
$$\int_0^T \frac{x - \theta}{qx}\,\dot{x}\,dt = \frac{x^1 - \theta\log x^1}{q} - \frac{x^0 - \theta\log x^0}{q} = C;$$
and hence
$$P[\alpha(\cdot)] = \frac{1}{q}\int_0^T (x - \theta)(1 - x)\,dt - C.$$
Let x0(·) be the dynamics corresponding to α0(·). Since α0 ≡ 1 on (0, t1), we have x(t) ≥ x0(t) ≥ x* on [0, t1]; see the illustration. Since x* gives the maximum of the quadratic (x − θ)(1 − x), it follows that
$$\int_0^{t_1} (x_0 - \theta)(1 - x_0)\,dt \ge \int_0^{t_1} (x - \theta)(1 - x)\,dt.$$
Similarly, x(t) ≥ x0(t) ≥ x* on [t2, T], and consequently
$$\int_{t_2}^T (x_0 - \theta)(1 - x_0)\,dt \ge \int_{t_2}^T (x - \theta)(1 - x)\,dt.$$
Furthermore,
$$\int_{t_1}^{t_2} (x_0 - \theta)(1 - x_0)\,dt \ge \int_{t_1}^{t_2} (x - \theta)(1 - x)\,dt,$$
since x0 (·) ≡ x∗ on this interval and x∗ gives the maximum of the integrand.
Hence
P [α0 (·)] ≥ P [α(·)].
NOTATION. We write o(ε) to denote any expression g_ε such that
$$\lim_{\varepsilon\to 0}\frac{g_\varepsilon}{\varepsilon} = 0.$$
In words, if ε → 0, then g_ε = o(ε) goes to zero "faster than ε".
for
On the time interval [s, ∞), x(·) and xε (·) both solve the same ODE,
but with differing initial conditions given by
H(x, p, a) = f(x, a) · p, and our task is to find p0 : [0, T] → Rⁿ such that (ADJ), (T) and (M) hold.
We reintroduce the function A(·) = ∇_x f(x0(·), α0(·)) and the control variation αε(·), as in the previous section. We now define p0 : [0, T] → Rⁿ to be the unique solution of the terminal-value problem
$$(4.44) \qquad \begin{cases} \dot{p}_0(t) = -A^T(t)p_0(t) & (0 \le t \le T) \\ p_0(T) = \nabla g(x_0(T)). \end{cases}$$
This gives (ADJ) and (T), and so our goal is to establish the maximization
principle (M).
The main point is that p0 (·) helps us calculate the variation of the
terminal payoff:
LEMMA 4.5.3. Assume 0 < s < T is a point of continuity for α0(·). Then we have
$$(4.45) \qquad \frac{d}{d\varepsilon}P[\alpha_\varepsilon(\cdot)]\Big|_{\varepsilon=0} = [f(x_0(s), a) - f(x_0(s), \alpha_0(s))]\cdot p_0(s).$$
Proof. According to Lemma 4.5.2,
$$P[\alpha_\varepsilon(\cdot)] = g(x_\varepsilon(T)) = g(x_0(T) + \varepsilon y(T) + o(\varepsilon)),$$
where y(·) satisfies (4.42) for
$$y^s = f(x_0(s), a) - f(x_0(s), \alpha_0(s)).$$
Thus
$$(4.46) \qquad \frac{d}{d\varepsilon}P[\alpha_\varepsilon(\cdot)]\Big|_{\varepsilon=0} = \nabla g(x_0(T))\cdot y(T).$$
On the other hand, (4.42) and (4.44) imply
$$\frac{d}{dt}(p_0(t)\cdot y(t)) = \dot{p}_0(t)\cdot y(t) + p_0(t)\cdot\dot{y}(t) = -A^T(t)p_0(t)\cdot y(t) + p_0(t)\cdot A(t)y(t) = 0.$$
Hence
$$\nabla g(x_0(T))\cdot y(T) = p_0(T)\cdot y(T) = p_0(s)\cdot y(s) = p_0(s)\cdot y^s.$$
Since y^s = f(x0(s), a) − f(x0(s), α0(s)), this identity and (4.46) imply (4.45).
Proof. The adjoint dynamics and terminal condition are both in (4.44). To confirm (M), fix 0 < s < T and a ∈ A, as above. Since the mapping ε ↦ P[αε(·)] for 0 ≤ ε ≤ 1 has a maximum at ε = 0, we deduce from Lemma 4.5.3 that
$$0 \ge \frac{d}{d\varepsilon}P[\alpha_\varepsilon(\cdot)]\Big|_{\varepsilon=0} = [f(x_0(s), a) - f(x_0(s), \alpha_0(s))]\cdot p_0(s).$$
Hence
$$H(x_0(s), p_0(s), a) = f(x_0(s), a)\cdot p_0(s) \le f(x_0(s), \alpha_0(s))\cdot p_0(s) = H(x_0(s), p_0(s), \alpha_0(s))$$
for all a ∈ A and each time 0 < s < T that is a point of continuity for α0(·). This proves the maximization condition (M).
$$(4.48) \qquad \bar{x}(t) = \begin{bmatrix} x(t) \\ x_{n+1}(t) \end{bmatrix} = \begin{bmatrix} x_1(t) \\ \vdots \\ x_n(t) \\ x_{n+1}(t) \end{bmatrix}, \qquad \bar{f}(\bar{x}, a) = \begin{bmatrix} f(x, a) \\ r(x, a) \end{bmatrix} = \begin{bmatrix} f_1(x, a) \\ \vdots \\ f_n(x, a) \\ r(x, a) \end{bmatrix}$$
and
$$(4.49) \qquad \bar{g}(\bar{x}) = g(x) + x_{n+1}.$$
Then (ODE) and (4.47) produce the dynamics
$$(\overline{\text{ODE}}) \qquad \begin{cases} \dot{\bar{x}}(t) = \bar{f}(\bar{x}(t), \alpha(t)) & (0 \le t \le T) \\ \bar{x}(0) = \bar{x}^0. \end{cases}$$
Consequently our control problem transforms into a new problem with no running payoff and the terminal payoff functional
$$\bar{P}[\alpha(\cdot)] = \bar{g}(\bar{x}(T)).$$
THEOREM 4.5.2. (PMP for fixed time, free endpoint problem)
There exists a function p0 : [0, T ] → Rn satisfying the adjoint dynamics
(ADJ), the maximization principle (M) and the terminal condition (T).
$$p_0(t) = \begin{bmatrix} p_1^0(t) \\ \vdots \\ p_n^0(t) \end{bmatrix}.$$
Proving the PMP for the free time, fixed endpoint problem is much
more difficult, since the result of a simple variation as above may produce
a response xε (·) that never hits the target point x1 . We consequently need
to introduce more complicated control variations, discussed in this section.
Using the cone of variations. Our program for building the costate
depends upon our taking multiple variations, as in the previous section, and
understanding the resulting cone of variations K = K(T0 ).
Let K⁰ denote the (perhaps empty) interior of K. Put
$$e_{n+1} = [0 \cdots 0\ 1]^T \in \mathbb{R}^{n+1}.$$
Here is the key observation:
LEMMA 4.5.5. We have
$$(4.54) \qquad e_{n+1} \notin K^0.$$
We assert that if μ, η, ε > 0 are small enough, then we can solve the nonlinear equation
$$(4.58) \qquad \Phi_\varepsilon(z) = \mu e_{n+1} = [0 \cdots 0\ \mu]^T$$
for some z ∈ C. To see this, note that
$$|\Phi_\varepsilon(z) - z| = |\bar{x}_\varepsilon(T_0) - \bar{x}_0(T_0) - z| = o(|z|) < |z - \mu e_{n+1}| \qquad \text{for all } z \in \partial C.$$
or
$$(4.64) \qquad w_{n+1} = 0.$$
When (4.63) holds, we can divide p0(·) by w_{n+1}, to reduce to the case that
$$q_0 = p_{n+1}^0 \equiv 1.$$
Then (4.62) provides the maximization principle (M). If instead (4.64) holds, we have an abnormal problem, for which
$$q_0 = p_{n+1}^0 \equiv 0.$$
An abnormal problem
Chapter 5

DYNAMIC PROGRAMMING
$$(\text{P}) \qquad P_{x,t}[\alpha(\cdot)] = \int_t^T r(x(s), \alpha(s))\,ds + g(x(T)).$$
We will consider the above problems for all choices of starting times 0 ≤ t ≤
T and all initial points x ∈ Rn .
5.1.1. Derivation.
DEFINITION. For x ∈ Rn , 0 ≤ t ≤ T , we define the value function
v : Rn × [0, T ] → R to be the greatest payoff possible if we start at x ∈ Rn
at time t. In other words,
v(x, t) = sup Px,t [α(·)]
α(·)∈A
for x ∈ Rn , 0 ≤ t ≤ T .
REMARK. Then
v(x, T ) = g(x) (x ∈ Rn ),
since if we start at time T at the point x, we must immediately stop and so
collect the payoff g(x).
Our task in this section is to show that the value function v so defined
satisfies a certain nonlinear partial differential equation. Our derivation will
be based upon the reasonable principle that it is better to act optimally from
the beginning, rather than to act arbitrarily for a while and then later act
optimally. We will convert this philosophy of life into mathematics.
To simplify, we hereafter suppose that the set A of control parameter
values is closed and bounded.
But the greatest possible payoff if we start from (x, t) is v(x, t). Therefore
$$\int_t^{t+\varepsilon} r(x(s), a)\,ds + v(x(t+\varepsilon), t+\varepsilon) \le v(x, t).$$
and the remaining payoff is v(x0(t+ε), t+ε). Consequently, the total payoff is
$$\int_t^{t+\varepsilon} r(x_0(s), \alpha_0(s))\,ds + v(x_0(t+\varepsilon), t+\varepsilon) = v(x, t).$$
Rearrange and divide by ε:
$$\frac{v(x_0(t+\varepsilon), t+\varepsilon) - v(x, t)}{\varepsilon} + \frac{1}{\varepsilon}\int_t^{t+\varepsilon} r(x_0(s), \alpha_0(s))\,ds = 0.$$
5.1.2. Optimality.
HOW TO USE DYNAMIC PROGRAMMING
For fixed time optimal control problems as in Section 5.1.1, we carry out
these steps to synthesize an optimal control:
Step 1: Try to solve the HJB equation, with the terminal condition
(5.1), and thereby find the value function v.
Step 2: Use the value function v and the Hamilton–Jacobi–Bellman
PDE to design an optimal control α0 (·), as follows. Define for each point
y ∈ Rn and each time 0 ≤ s ≤ T ,
a(y, s) ∈ A
to be a parameter value where the maximum in HJB is attained at the point
(y, s). In other words, select a(y, s) ∈ A so that
$$(5.4) \qquad \frac{\partial v}{\partial t}(y, s) + f(y, a(y, s))\cdot\nabla_xv(y, s) + r(y, a(y, s)) = 0.$$
Step 3: Next, solve the following ODE, assuming a(·) is sufficiently regular to do so:
$$(5.5) \qquad \begin{cases} \dot{x}_0(s) = f(x_0(s), a(x_0(s), s)) & (t \le s \le T) \\ x_0(t) = x. \end{cases}$$
Step 4: Finally, define the optimal feedback control
$$(5.6) \qquad \alpha_0(s) = a(x_0(s), s) \qquad (t \le s \le T),$$
so that we can rewrite (5.5) as
$$\begin{cases} \dot{x}_0(s) = f(x_0(s), \alpha_0(s)) & (t \le s \le T) \\ x_0(t) = x. \end{cases}$$
In particular, if the state of the system is y at time s, we use the control which at time s takes on a parameter value a = a(y, s) ∈ A for which the maximum in HJB is attained.
THEOREM 5.1.2. For each starting time 0 ≤ t < T and initial point
x ∈ Rn , the control α0 (·) defined by (5.5) and (5.6) is optimal.
Proof. We have
$$P_{x,t}[\alpha_0(\cdot)] = \int_t^T r(x_0(s), \alpha_0(s))\,ds + g(x_0(T)).$$
REMARKS.
(i) Notice that v acts here as a calibration function that we use to es-
tablish optimality.
(ii) We can similarly design optimal controls for free time problems by
solving the stationary HJB equation.
5.2. Applications
Applying dynamic programming is usually quite tricky, as it requires us to
solve a nonlinear PDE and this is often very difficult. The main hope, as
we will see in the following examples, is to try to guess the form of v, to
plug this guess into the HJB equation and then to adjust various constants
and auxiliary functions, to ensure that we have an actual solution. (Alter-
natively, we could compute the solution of the terminal-value problem for
HJB numerically.)
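To illustrate that last remark, here is a crude finite-difference sketch for a one-dimensional instance of HJB (the dynamics ẋ = a with |a| ≤ 1, running payoff r = −x²/2, and g = 0 are all assumed data, chosen only for illustration):

```python
import numpy as np

# March v_t + max_{|a|<=1} { a v_x } + r(x) = 0 backward from v(., T) = 0.
L_, nx, T, nt = 2.0, 81, 1.0, 4000
xs = np.linspace(-L_, L_, nx); dx = xs[1] - xs[0]; dt = T/nt
v = np.zeros(nx)                              # terminal condition g = 0
r = -xs**2/2
for _ in range(nt):
    vx_f = (np.roll(v, -1) - v)/dx            # forward difference (for a = +1)
    vx_b = (v - np.roll(v, 1))/dx             # backward difference (for a = -1)
    ham = np.maximum(np.maximum(vx_f, -vx_b), 0.0) + r   # upwinded max over a
    v = v + dt*ham                            # v(., t - dt) = v(., t) + dt * H
    v[0], v[-1] = v[1], v[-2]                 # crude treatment of the boundary
print(v[nx//2])                               # approximate value v(0, 0)
```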
To simplify notation we will not write the subscripts "0" in the subsequent examples.
Solving the HJB equation. Our task now is to solve this nonlinear PDE, with the terminal condition (5.8). We guess that our solution has the form
$$(5.11) \qquad v(x, t) = x^TK(t)x$$
for some appropriate symmetric n×n-matrix valued function K(·) for which
$$(5.12) \qquad K(T) = -D.$$
Let us compute
$$(5.13) \qquad \frac{\partial v}{\partial t} = x^T\dot{K}(t)x, \qquad \nabla_xv = 2K(t)x.$$
We now insert our guess v = xᵀK(t)x into (5.10), to discover that
$$x^T\{\dot{K}(t) + K(t)NC^{-1}N^TK(t) + 2K(t)M - B\}x = 0.$$
Since
$$2x^TKMx = x^TKMx + [x^TKMx]^T = x^TKMx + x^TM^TKx,$$
the foregoing becomes
$$x^T\{\dot{K} + KNC^{-1}N^TK + KM + M^TK - B\}x = 0.$$
This identity will hold provided K(·) satisfies the matrix Riccati equation
$$(\text{R}) \qquad \dot{K}(t) + K(t)NC^{-1}N^TK(t) + K(t)M + M^TK(t) = B$$
on the interval [0, T].
In summary, once we solve the Riccati equation (R) with the terminal condition (5.12), we can then use (5.9) and (5.13) to construct the optimal feedback control
$$\alpha_0(t) = C^{-1}N^TK(t)x_0(t).$$
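Numerically, (R) with terminal condition (5.12) is again a backward-in-time integration. Here is a sketch with small assumed matrices M, N, B, C, D (illustrations only, not data from the notes):

```python
import numpy as np
from scipy.integrate import solve_ivp

n, T = 2, 1.0
M = np.array([[0.0, 1.0], [0.0, 0.0]])
N = np.eye(n); B = np.eye(n); C = np.eye(n); D = np.eye(n)
Cinv = np.linalg.inv(C)

def riccati(t, k):
    K = k.reshape(n, n)
    Kdot = B - K @ N @ Cinv @ N.T @ K - K @ M - M.T @ K   # rearranged from (R)
    return Kdot.ravel()

sol = solve_ivp(riccati, [T, 0.0], (-D).ravel(), rtol=1e-9)
K0 = sol.y[:, -1].reshape(n, n)
print(K0)      # K(0); the optimal feedback is alpha_0(t) = C^{-1} N^T K(t) x_0(t)
```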
5.2.2. Rocket railway car. In view of our discussion on page 98, the rocket railway problem is actually quite easy to solve. However it is also instructive to see how dynamic programming applies.
The equations of motion are
$$\dot{x} = \begin{bmatrix} 0 & 1 \\ 0 & 0 \end{bmatrix}x + \begin{bmatrix} 0 \\ 1 \end{bmatrix}\alpha, \qquad -1 \le \alpha \le 1$$
for n = 2, and
$$P_x[\alpha(\cdot)] = -(\text{time to reach the origin}) = -\int_0^T 1\,dt = -T.$$
Then the value function v(x) is minus the least time it takes to get to the origin from the point x = [x1 x2]ᵀ; and the corresponding stationary HJB equation is
$$\max_{a\in A}\{f\cdot\nabla v + r\} = 0$$
for
$$A = [-1, 1], \qquad f = \begin{bmatrix} x_2 \\ a \end{bmatrix}, \qquad r = -1.$$
Therefore
$$\max_{|a|\le 1}\left\{x_2\frac{\partial v}{\partial x_1} + a\frac{\partial v}{\partial x_2}\right\} - 1 = 0;$$
and consequently the Hamilton-Jacobi-Bellman equation is
$$(\text{HJB}) \qquad \begin{cases} x_2\frac{\partial v}{\partial x_1} + \left|\frac{\partial v}{\partial x_2}\right| = 1 & \text{in } \mathbb{R}^2\setminus\{0\} \\ v(0) = 0. \end{cases}$$
We could have derived this formula for v using the ideas in the next example,
but for now let us just show that v really solves HJB.
In Region I we compute
$$\frac{\partial v}{\partial x_1} = -\left(x_1 + \frac{x_2^2}{2}\right)^{-1/2}, \qquad \frac{\partial v}{\partial x_2} = -1 - \left(x_1 + \frac{x_2^2}{2}\right)^{-1/2}x_2;$$
therefore
$$x_2\frac{\partial v}{\partial x_1} + \left|\frac{\partial v}{\partial x_2}\right| = -x_2\left(x_1 + \frac{x_2^2}{2}\right)^{-1/2} + 1 + x_2\left(x_1 + \frac{x_2^2}{2}\right)^{-1/2} = 1.$$
This confirms that our HJB equation holds in Region I, and a similar calculation holds in Region II, owing to the symmetry condition
$$v(-x) = v(x) \qquad (x \in \mathbb{R}^2).$$
Now let Γ denote the boundary between Regions I and II. Since
$$\frac{\partial v}{\partial x_2} \begin{cases} < 0 & \text{in Region I} \\ > 0 & \text{in Region II} \end{cases}$$
and ∂v/∂x2 = 0 on Γ, our function v defined by (5.14) does indeed solve the nonlinear HJB partial differential equation.
Thus any solution of (ODE) moves along a parabola of the form x1 = x2²/2 + C when α = 1, and along a parabola of the form x1 = −x2²/2 + C when α = −1. Since we know from the PMP that an optimal control changes sign at most once (see page 98), an optimal trajectory must move along one such family of parabolas, and change to the other family of parabolas at most once, at the switching curve Γ given by the formula x1 = −½|x2|x2. The picture illustrates a typical optimal trajectory.
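A small forward simulation of this synthesis (the initial state is assumed) shows the trajectory following one parabola, switching once on Γ, and then riding the second parabola into the origin:

```python
import numpy as np

x1, x2, dt = 3.0, 1.0, 1e-4          # assumed initial state
for step in range(200_000):
    a = -1.0 if x1 > -0.5*x2*abs(x2) else 1.0   # bang-bang feedback from Gamma
    x1 += dt*x2
    x2 += dt*a
    if x1*x1 + x2*x2 < 1e-4:
        break
print(step*dt, x1, x2)               # arrival time and position near the origin
```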
but make a simple change in the payoff functional (which will have a profound effect on optimal controls and trajectories). So let us now take
$$P_x[\alpha(\cdot)] = -\frac{1}{2}\int_0^T (x_1)^2\,dt.$$
Thus
$$A = [-1, 1], \qquad f = \begin{bmatrix} x_2 \\ a \end{bmatrix}, \qquad r = -\frac{1}{2}x_1^2;$$
and the stationary HJB equation for the value function is now
$$(\text{HJB}) \qquad \begin{cases} x_2\frac{\partial v}{\partial x_1} + \left|\frac{\partial v}{\partial x_2}\right| = \frac{1}{2}x_1^2 & \text{in } \mathbb{R}^2\setminus\{0\} \\ v(0) = 0. \end{cases}$$
Solving the HJB equation. Using the definition of the value function, we can check that it satisfies the two symmetry conditions:
$$(5.15) \qquad v(-x) = v(x) \qquad (x \in \mathbb{R}^2)$$
and
$$(5.16) \qquad v(\lambda^2x_1, \lambda x_2) = \lambda^5v(x_1, x_2) \qquad (x \in \mathbb{R}^2,\ \lambda > 0).$$
Now (5.16) and the previous example suggest that the optimal switching should occur on the boundary Γ between two regions of the form
$$\text{I} := \{(x_1, x_2) \mid x_1 > -\beta x_2|x_2|\}, \qquad \text{II} := \{(x_1, x_2) \mid x_1 < -\beta x_2|x_2|\},$$
for some as yet unknown constant β > 0.
If we plug this guess into (5.18) and match coefficients, we discover that
$$v = -\frac{1}{15}x_2^5 - \frac{1}{3}x_1x_2^3 - \frac{1}{2}x_1^2x_2$$
is a solution. Now the general solution of the linear, homogeneous PDE
$$x_2\frac{\partial w}{\partial x_1} - \frac{\partial w}{\partial x_2} = 0$$
has the form
$$w = f\left(x_1 + \frac{1}{2}x_2^2\right).$$
Hence the general solution of (5.18) is
$$v = -\frac{1}{15}x_2^5 - \frac{1}{3}x_1x_2^3 - \frac{1}{2}x_1^2x_2 + f\left(x_1 + \frac{1}{2}x_2^2\right).$$
In order to satisfy the scaling condition (5.16), we take f to be homogeneous and so have
$$(5.19) \qquad v = -\frac{1}{15}x_2^5 - \frac{1}{3}x_1x_2^3 - \frac{1}{2}x_1^2x_2 - \gamma\left(x_1 + \frac{1}{2}x_2^2\right)^{5/2}$$
for another as yet unknown constant γ > 0.
We want next to adjust the constants β, γ so that ∂v/∂x2 = 0 on Γ. Now
$$(5.20) \qquad \frac{\partial v}{\partial x_2} = -\frac{1}{3}x_2^4 - x_1x_2^2 - \frac{1}{2}x_1^2 - \frac{5\gamma}{2}\left(x_1 + \frac{1}{2}x_2^2\right)^{3/2}x_2.$$
Therefore on Γ⁺ = {x1 = −βx2², x2 > 0}, we have
$$(5.21) \qquad \frac{\partial v}{\partial x_2} = x_2^4\left[-\frac{1}{3} + \beta - \frac{1}{2}\beta^2 - \frac{5\gamma}{2}\left(-\beta + \frac{1}{2}\right)^{3/2}\right];$$
and on Γ⁻ = {x1 = βx2², x2 < 0}, we have
$$(5.22) \qquad \frac{\partial v}{\partial x_2} = x_2^4\left[-\frac{1}{3} - \beta - \frac{1}{2}\beta^2 + \frac{5\gamma}{2}\left(\beta + \frac{1}{2}\right)^{3/2}\right].$$
Consequently, we need to select β, γ so that
$$(5.23) \qquad \begin{cases} -\frac{1}{3} + \beta - \frac{1}{2}\beta^2 - \frac{5\gamma}{2}\left(-\beta + \frac{1}{2}\right)^{3/2} = 0 \\ -\frac{1}{3} - \beta - \frac{1}{2}\beta^2 + \frac{5\gamma}{2}\left(\beta + \frac{1}{2}\right)^{3/2} = 0. \end{cases}$$
there exists 0 < β < 1/2 such that φ(β) = 0. We can then find γ > 0 so that β, γ solve (5.23). A further calculation confirms that for these choices, ∂v/∂x2 < 0 within Region I.
Using (5.15) to extend our definition of v to all of R2 , we have (at last)
found our solution of the stationary HJB equation.
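The system (5.23) is also easy to solve numerically (a sketch; the initial guess is assumed):

```python
import numpy as np
from scipy.optimize import fsolve

def switching_system(p):
    beta, gamma = p
    return [-1/3 + beta - beta**2/2 - (5*gamma/2)*(0.5 - beta)**1.5,
            -1/3 - beta - beta**2/2 + (5*gamma/2)*(0.5 + beta)**1.5]

beta, gamma = fsolve(switching_system, [0.4, 0.4])
print(beta, gamma)   # beta ~ 0.4446, the switching constant for Fuller's problem
```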
Part of an optimal path for Fuller's problem
APPENDIX

A. Notation
Rⁿ denotes n-dimensional Euclidean space, a typical point of which is the column vector
$$x = \begin{bmatrix} x_1 \\ \vdots \\ x_n \end{bmatrix}.$$
To save space we will often write this column vector as x = [x1 ⋯ xn]ᵀ.
If x, y ∈ Rⁿ, we define
$$x\cdot y = \sum_{i=1}^n x_iy_i = x^Ty, \qquad |x| = (x\cdot x)^{1/2} = \left(\sum_{i=1}^n x_i^2\right)^{1/2}$$
B. Linear algebra
Throughout these notes A denotes a real m × n matrix and Aᵀ denotes its transpose:
$$A = \begin{bmatrix} a_{11} & \cdots & a_{1n} \\ \vdots & \ddots & \vdots \\ a_{m1} & \cdots & a_{mn} \end{bmatrix}, \qquad A^T = \begin{bmatrix} a_{11} & \cdots & a_{m1} \\ \vdots & \ddots & \vdots \\ a_{1n} & \cdots & a_{mn} \end{bmatrix}.$$
Recall the rule
$$(AB)^T = B^TA^T.$$
We write 𝕄^{m×n} for the space of all real m × n matrices, and write A ≥ 0.
The chain rule tells us how to compute the partial derivatives of com-
posite functions, made from simpler functions. For this, assume that we are
given a function
f : Rn → R,
which we write as f (x) = f (x1 , . . . , xn ). Suppose also we have functions
g1 , . . . , gn : Rm → R
so that gi (y) = gi (y1 , . . . , ym ) for i = 1, . . . , n.
NOTATION. We define g : Rᵐ → Rⁿ by
$$g = \begin{bmatrix} g_1 \\ \vdots \\ g_n \end{bmatrix}.$$
The gradient matrix of g is
$$\nabla g = \begin{bmatrix} (\nabla g_1)^T \\ \vdots \\ (\nabla g_n)^T \end{bmatrix} = \begin{bmatrix} \frac{\partial g_1}{\partial y_1} & \cdots & \frac{\partial g_1}{\partial y_m} \\ \vdots & \ddots & \vdots \\ \frac{\partial g_n}{\partial y_1} & \cdots & \frac{\partial g_n}{\partial y_m} \end{bmatrix}$$
where g : Rⁿ → Rⁿ. To prove this, we compute that
$$\frac{1}{2}\frac{\partial}{\partial x_k}|g|^2 = \sum_{i=1}^n g_i\frac{\partial g_i}{\partial x_k} = k\text{-th entry of } (\nabla g)^Tg.$$
D. Divergence Theorem
If U ⊂ Rⁿ is an open set, with smooth boundary ∂U, we let
$$\nu = \begin{bmatrix} \nu_1 \\ \vdots \\ \nu_n \end{bmatrix}$$
denote the outward pointing unit normal vector field along ∂U. Then |ν| = 1 along ∂U.
Assume also that h : U → Rⁿ, written
$$h = \begin{bmatrix} h_1 \\ \vdots \\ h_n \end{bmatrix},$$
is a vector field. Its divergence is
$$\operatorname{div}h = \nabla\cdot h = \sum_{i=1}^n \frac{\partial h_i}{\partial x_i}.$$
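As a numerical sanity check of the Divergence Theorem (the vector field h = (x³, y³) on the unit disk U is an assumed example), both the boundary and volume integrals give 3π/2:

```python
import numpy as np

# Boundary side: integral of h . nu over the unit circle, with nu = (x, y)
theta = np.linspace(0, 2*np.pi, 200_000, endpoint=False)
h_dot_nu = np.cos(theta)**4 + np.sin(theta)**4
print(h_dot_nu.mean()*2*np.pi)      # ~ 3*pi/2 ~ 4.712

# Volume side: div h = 3(x^2 + y^2) = 3 r^2, integrated in polar coordinates
r = np.linspace(0, 1, 200_000)
print((3*r**3).mean()*2*np.pi)      # ~ 3*pi/2 as well
```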
Implicit Function Theorem (if ∂f/∂y(x0, y0) > 0)
Proof. 1. Suppose first that C is the unit ball B(0, 1) and p = 0. Squaring
the inequality |Φ(x) − x| < |x|, we deduce that
Φ(x) · x > 0 for all x ∈ ∂B(0, 1).
Then for small t > 0, the continuous mapping
Ψ(x) := x − tΦ(x)
maps B(0, 1) into itself, and hence has a fixed point x according to Brouwer’s
Fixed Point Theorem. Then Φ(x) = 0 = p.
2. For the general case, we can always assume after a translation that p = 0, so that 0 belongs to the interior of C. We introduce a nonnegative gauge function ρ : Rⁿ → [0, ∞) such that ρ(λx) = λρ(x) for all λ ≥ 0 and
$$C = \{x \in \mathbb{R}^n \mid \rho(x) \le 1\}.$$
We next map C onto B(0, 1) by the continuous function
$$a(x) = \frac{\rho(x)}{|x|}x = y.$$
Define
$$\Psi(y) = \frac{\rho(y)}{|y|}\,\Phi\left(\frac{|y|}{\rho(y)}y\right).$$
Then the inequality |Φ(x) − x| < |x| implies
$$|\Psi(y) - y| < 1 \qquad \text{for all } y \in \partial B(0, 1).$$
Consequently the first part of the proof shows that there exists y ∈ B(0, 1) such that Ψ(y) = 0. And then
$$x = \frac{|y|}{\rho(y)}y \in C$$
satisfies Φ(x) = 0 = p.
EXERCISES
y′ = a(x)y + b(x).
y′ = a(x)y + b(x),
y′ = a(x)y + b(x)yᵖ.
Ly = y″ + b(x)y′ + c(x)y = 0.
where y₀ = y⁰, y_{N+1} = y¹ are prescribed. What algebraic conditions do minimizers satisfy? What is the connection with the Euler-Lagrange equation?
11. Assume
$$I[y(\cdot)] = \int_a^b L(x, y, y', y'')\,dx$$
for L = L(x, y, z, u). Derive the corresponding Euler-Lagrange equation for extremals.
12. Show that the extremals of
$$\int_a^b (y')^2e^{-y'}\,dx$$
are linear functions.
13. Find the extremals of
$$\int_0^1 (y')^2 + 10xy\,dx,$$
subject to y(0) = 1, y(1) = 2.
14. Find the extremals of
$$\int_0^T e^{-\lambda t}(a(\dot{x})^2 + bx^2)\,dt,$$
satisfying x(0) = 0, x(T) = c.
15. Assume c > 0. Find the minimizer of
$$\int_0^T e^{-\lambda t}(c(\dot{x})^2 + dx)\,dt,$$
subject to x(0) = 0, x(T) = b.
16. Find the minimizer of
$$\int_0^1 \frac{(y')^2}{2} - x^2y\,dx,$$
where y(0) = 1 and y(1) is free.
17. Consider the free endpoint problem of minimizing
$$I[y(\cdot)] = \int_a^b L(x, y, y')\,dx + g(y(b))$$
subject to y(a) = y⁰, where g : R → R is a given function. What is the transversality condition satisfied at x = b by a minimizer y0(·)?
18. Assume y0(·) minimizes the functional
$$I[y(\cdot)] = \frac{1}{2}\int_a^b (y'')^2\,dx$$
over the admissible class
$$\mathcal{A} = \{y : [a, b] \to \mathbb{R} \mid y(a) = 0,\ y(b) = 1\}.$$
25. Assume that the matrix function G, whose (i, j)-th entry is gij, is symmetric and positive definite. What is the Hamiltonian corresponding to the Lagrangian
$$L = \frac{1}{2}\sum_{i,j=1}^n g_{ij}(x)v_iv_j\,?$$
28. Compute explicitly the Christoffel symbols Γkij for the hyperbolic
plane. Check that the ODE given in the text for the geodesics is
correct.
29. Suppose
$$\int_a^b \frac{\partial^2 L}{\partial z^2}(x, y_0, y_0')\,\zeta^2\,dx \ge 0$$
for all functions ζ such that ζ(a) = ζ(b) = 0. Explain carefully why this implies
$$\frac{\partial^2 L}{\partial z^2}(x, y_0, y_0') \ge 0 \qquad (a \le x \le b).$$
30. Assume y(·) > 0 solves the linear, second-order ODE
$$y'' + b(x)y' + c(x)y = 0.$$
32. Derive this useful calculus formula, which we used in the proof of Theorem 2.3.1:
$$\frac{d}{dx}\left(\int_a^{g(x)} f(x, t)\,dt\right) = \int_a^{g(x)} \frac{\partial f}{\partial x}(x, t)\,dt + f(x, g(x))g'(x).$$
(Hint: Define $F(x, y) = \int_a^y f(x, t)\,dt$; so that $\int_a^{g(x)} f(x, t)\,dt = F(x, g(x))$. Apply the chain rule.)
33. Suppose that y(·) is an extremal of I[·], the second variation of which satisfies for some constant γ > 0 the estimate
$$\int_a^b A(w')^2 + 2Bww' + Cw^2\,dx \ge \gamma\int_a^b (w')^2 + w^2\,dx$$
for all w : [a, b] → R with w(a) = w(b) = 0. Use a Taylor expansion to show directly that y(·) is a weak local minimizer; this means that there exists δ > 0 such that
$$I[y] \le I[\bar{y}]$$
for all admissible ȳ satisfying max_{[a,b]}{|y − ȳ| + |y′ − ȳ′|} ≤ δ.
34. Show that for each l > 0 the function y(·) ≡ 0 is a strong local minimizer of
$$\int_0^l \frac{(y')^2}{2} + \frac{y^2}{2} - \frac{y^4}{4}\,dx$$
for
$$\mathcal{A} = \{y : [0, l] \to \mathbb{R} \mid y(0) = y(l) = 0\}.$$
35. Assume that (y, z) ↦ L(x, y, z) is convex for each a ≤ x ≤ b. Show that each extremal y(·) ∈ A is in fact a minimizer of I[·], for the admissible set
$$\mathcal{A} = \{y : [a, b] \to \mathbb{R} \mid y(a) = y^0,\ y(b) = y^1\}.$$
(Hint: Recall that if f : Rⁿ → R is convex, then f(x̂) ≥ f(x) + ∇f(x)·(x̂ − x) for all x̂.)
36. A function f : Rⁿ → R is strictly convex provided
$$f(\theta x + (1 - \theta)\hat{x}) < \theta f(x) + (1 - \theta)f(\hat{x})$$
if x ≠ x̂ and 0 < θ < 1.
Suppose that (y, z) ↦ L(x, y, z) is strictly convex for each a ≤ x ≤ b. Show that there exists at most one minimizer y0(·) ∈ A of I[·].
where |α| ≤ 1.
(a) Check that $X(t) = \begin{bmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{bmatrix}$.
(b) Show that an optimal control α0(·) is periodic in time.
46. Use the maximum principle to find an optimal control for the linear time optimal problem with dynamics
$$\begin{cases} \dot{x}_1 = x_2 + \alpha_1 & |\alpha_1| \le 1 \\ \dot{x}_2 = -x_1 + \alpha_2 & |\alpha_2| \le 1. \end{cases}$$
47. Write down (ODE), (ADJ), (M) and (T) for the fixed time, free endpoint problems with n = m = 1 and
(a) f = (x² + a)² − x⁴, A = [−1, 1], r = 2ax, g = sin x
(b) f = x²a, A = [0, 2], r = a² + x², g = 0.
48. Write down the equations (ADJ), (M) and (T) for the fixed time, free endpoint problem corresponding to the dynamics
$$\begin{cases} \dot{x}_1 = \sin(x_1 + \alpha x_2) \\ \dot{x}_2 = \cos(\alpha x_1 + x_2), \end{cases}$$
(Hint: Assume PMP applies and write down the equations for x
and p. Look for a solution of the form p = dx + e.)
53. Find explicit formulas for the optimal state x0(·) and costate p0(·) for the production and consumption model discussed on page 108. Show that H(x0, p0, α0) is constant on the interval [0, T], as asserted by the PMP.
54. How does our analysis of the Ramsey consumption model break
down if we drop the requirement that x(T ) = x1 ?
55. Use the PMP to solve the problem of maximizing
$$P[\alpha(\cdot)] = \int_0^2 2x - 3\alpha + \alpha^2\,dt,$$
where
$$\begin{cases} \dot{x} = x + \alpha, & 0 \le \alpha \le 2 \\ x(0) = 0. \end{cases}$$
56. Use the PMP to find a control to minimize the payoff
$$P[\alpha(\cdot)] = \frac{1}{4}\int_0^1 \alpha^4\,dt,$$
for the dynamics
$$\begin{cases} \dot{x} = x + \alpha \\ x(0) = 1, \quad x(1) = 0, \end{cases}$$
where A = R.
57. Assume that the matrices B, C, D are symmetric and that the matrix Riccati equation
$$\begin{cases} \dot{K} + KNC^{-1}N^TK + KM + M^TK = B & (0 \le t \le T) \\ K(T) = -D \end{cases}$$
has a unique solution K(·). Show that K(t) is symmetric for each time 0 ≤ t ≤ T.
58. Apply the PMP to solve the general linear-quadratic regulator problem, introduced on page 137. In particular, show how to solve (ADJ) for the costate p0(·) in terms of the matrix Riccati equation.