Hamilton-Jacobi Equations: University of Helsinki
Hamilton-Jacobi Equations: University of Helsinki
Hamilton-Jacobi Equations
Master’s Thesis
Nikita Skourat
supervised by
Pr. Kari Astala
Nikita Skourat
Työn nimi — Arbetets titel — Title
Hamilton-Jacobi Equations
Oppiaine — Läroämne — Subject
Matematiikka
Työn laji — Arbetets art — Level Aika — Datum — Month and year Sivumäärä — Sidoantal — Number of pages
This thesis presents the theory of Hamilton-Jacobi equations. It is first shown how the equation is
derived from the Lagrangian mechanics, then the traditional methods for searching for the solution
are presented, where the Hopf-Lax formula along with the appropriate notion of the weak solution
is defined. Later the flaws of this approach are remarked and the new notion of viscosity solu-
tions is introduced in connection with Hamilton-Jacobi equation. The important properties of the
viscosity solution, such as consistency with the classical solution and the stability are proved. The
introduction into the control theory is presented, in which the Hamilton-Jacobi-Bellman equation
is introduced along with the existence theorem. Finally multiple numerical methods are introduced
and aligned with the theory of viscosity solutions.
The knowledge of the theory of partial differential analysis, calculus and real analysis will be helpful.
Kumpulan tiedekirjasto
Muita tietoja — Övriga uppgifter — Additional information
Contents
Introduction 2
2 Traditional Approach 17
2.1 Separation Of Variables . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.2 Legendre Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.3 Classical Hopf-Lax Formula . . . . . . . . . . . . . . . . . . . . . . . 22
2.4 Weak Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
3 Viscosity Solutions 38
3.1 Method Of Vanishing Viscosity . . . . . . . . . . . . . . . . . . . . . 38
3.2 Definition and Basic Properties . . . . . . . . . . . . . . . . . . . . . 41
3.3 Hopf-Lax Formula as a Viscosity Solution . . . . . . . . . . . . . . . 47
3.4 Uniqueness . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.5 Optimal Control Theory . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.5.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.5.2 Hamilton-Jacobi-Bellman Equation . . . . . . . . . . . . . . . 56
4 Numerical Approach 62
4.1 First Order Monotone Schemes . . . . . . . . . . . . . . . . . . . . . 62
4.2 Time Discretization . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
4.3 Higher Order Schemes and Further Reading . . . . . . . . . . . . . . 67
References 70
1
Introduction
The topic of this thesis is the Hamilton-Jacobi equation. This equation is an example
of a partial differential equations that arises in many different areas of physics and
mathematics, and hence has an utmost importance in science. Originally, it was
derived by William Rowen Hamilton in his essays [15] and [16] as a formulation
of classical mechanics for a system of moving particles. Hamilton extended the
Lagrangian formulation, using the calculus of variations, thus deriving the following
equation: ! "
∂S ∂S ∂S
+H , ..., , q1 , ..., qd = 0.
∂t ∂q1 ∂qd
Here H was a Hamiltonian - a descriptor of the system, defined by Hamilton, and
qi are# the generalized coordinates. The solution to the equation the classical action
t
S = 0 (U + T )ds, also called the principal function, where U is a potential and T
is a kinetic energy of the system. However, mathematically Hamilton’s arguments
contained some flaws. For example, Hamilton assumed that the equation would hold
only in case of conservative systems (satisfying the canonical Hamilton’s equations).
Jacobi showed later in his papers, that such an assumption is not necessary and also
provided mathematically meaningful adjustments to the equation. That’s why the
modern name for the equation is Hamilton-Jacobi equation.
Later, another version of the derivation of the equation was presented. It showed
that the principal function S can be seen as a generating function for a canonical
transformation of coordinates. In this thesis we’ll follow this approach.
In mathematics, the similar equation appeared in control theory in 50s, when Richard
Bellman developed a dynamic programming method in [2] with his colleagues. This
so-called Hamilton-Jacobi-Bellman equation defines a broader class of equations,
than the version from the classical mechanics.
Mathematicians have been working on solving this equations for many years. It
appeared, that this equation in general has no classical solutions, hence first, some
notions of weak solution was developed by Hopf ([17]) and Lax([21]). Their investi-
gations led to the important Hopf-Lax formula, that can be used for calculating the
solution or studying its properties.
Finally, in the 80s, the new notion of viscosity solution was developed for the
Hamilton-Jacobi equation and extended to a broader class of partial differential equa-
tions. The first ideas about this weaker notion of solutions appeared in Evans’ paper
[10] by applying the vanishing viscosity method to the equation, however the exact
definition was given later by Crandall and Lions in [9] along with some very impor-
tant properties.
2
The purpose of this thesis is to present the theory of Hamilton-Jacobi equations
from multiple sides. In the first chapter we will show how this equation is derived by
using the canonical transformation argument. In the second chapter the traditional
approach to the existence and uniqueness theory is presented along with Hopf-Lax
formula. And finally, in the third chapter we’ll introduce the viscosity solutions from
scratch: we’ll first show how the application of the method of vanishing viscosity
to the Hamilton-Jacobi equation leads to the definition, and then define it strictly
and show, that such weak solutions are consistent and stable. Also we give a short
introduction into the control theory and dynamic programming, thus also deriving
the Hamilton-Jacobi-Bellman equation.
This thesis uses Chapters 3.3 and 10 from [11] as a skeleton. The prior knowledge of
theory of partial differentiable equations, real analysis is assumed.
Remark. The notation and definitions presented in this paper may be different from
the ones in mechanics books. It was chosen as it better suits the mathematical PDE
theory, that we’re studying in this thesis. You can read more about Hamilton’s prin-
ciple and Hamilton’s ODEs from the physical perspective in §2 and §40 in [20], or
Chapter 9 in [1].
3
1.1 Hamilton’s Principle
Hamiltonian mechanics is one of the several formulations of classical mechanics, that
uses generalised coordinates x = (x1 , ..., xd ) as one of the parameters. We know from
the Newtonian formulation, that to define the position of a system of N moving
particles1 in three-dimensional Euclidian space, we need to specify N vectors, which
requires a total pf 3N coordinates. This number of coordinates is also called the
number of degrees of freedom. The generalised coordinates are some number of
independent quantities, that are needed to describe the motion of system of particles
(3N in the example above). Note, that these quantities must not necessarily be
Cartesian coordinates and the optimal choice often depends on the given problem.
In this paper, we are looking at the motion law for a system of particles with d
degrees of freedom, which is mathematically described as a path x : [0, ∞] → Rd , i.e.
xj = xj (t), j = 1...d. Our first goal is to derive the Euler-Lagrange equations for this
law. Before we state the Hamilton’s principle and its consequences, let’s give some
definitions first:
defined for functions x(s) = (x1 (s), ...xd (s)), belonging to the admissible class A:
4
The Hamilton’s principle in mechanics states, that the particle moves between
points y and x in a way, that the action integral takes the least possible value. This
principle is also called the principle of least action.
According to this, to find the law of motion of the particle, we have the following
calculus of variation problem: we want to find a C 2 -curve, that minimizes the action
functional among all admissible curves w ∈ A:
% t
I[x(·)] = min L(ẇ(s), w(s))ds, (1.4)
w∈A 0
5
Finally, notice that we can simplify the first part of the integrand with integration
by parts, taking into account the that v vanishes at the integration limits:
% t d %
& t
Dq L(ẋ(s),x(s)) · v̇(s)ds = (Dqi L(ẋ(s), x(s))v̇i (s)ds
0 i=1 0
d % t
& d
= (− Dq L(ẋ(s), x(s))vi (s)ds
i=1 0 ds i
t
d
%
= (− Dq L(ẋ(s), x(s)) · v(s)ds
0 ds
Euler-Lagrange equations with the corresponding initial values constitute the for-
mulation of the classical mechanics, that uses generalized coordinates and generalized
velocities as parameters.
Remark. Theorem (1.1) doesn’t prove the existence of the minimizer. However,
under some assumptions, we can ensure the existence. According to the theory of
calculus of variation, the minimizer exists, if the Lagrangian L is coercive and convex
in q. More about this fact can be found in [11], Chapter 8.2.
6
1.2 Hamilton’s ODE
As the second step of the construction, we convert the system of Euler-Lagrange
equations into Hamilton’s equations. We assume that x is a C 2 function, that is a
critical point of the action functional. We define first the generalized momentum as a
function p : [0, t] → Rd , that will serve as the second parameter for the Hamiltonian
formulation of mechanics:
p = Dq L(q, x) (1.11)
Example 1.1. Remember what we told before, that for a moving particle Lagrangian
is L = T − V , or, after expanding:
1
L(q, x) = m|q|2 − V (x).
2
Now, since we are using the Cartesian coordinates, we know, that in this case the
momentum is p = mq, so we can calculate the Hamiltonian by the above definition:
1 1 1 1
H(p, x) = p · p − L(q, x) = |p|2 − |p|2 + V (x) = |p|2 + V (x)
m m 2m 2m
The first term in the sum is just the kinetic energy, written via the momentum.
So in this case the Hamiltonian equals the total energy of the particle. However,
this is not always the case, and it happens only if the Cartesian coordinates can
be written in terms of the generalized coordinates without time dependence and vice
versa. Otherwise the expression gets more complicated.
Now that we’ve defined the Hamiltonian and generalized momentum in terms of
Lagrangian and generalized velocity, we are ready to derive the Hamilton’s ODEs.
7
Theorem 1.2. The functions x and p (defined as in (1.10)), satisfy Hamilton’s ODE
system: $
ṗ(s) = −Dx H(p(s), x(s))
(1.13)
ẋ(s) = Dp H(p(s), x(s))
for 0 ≤ s ≤ t. Moreover, the mapping s '→ H(p(s), x(s) is constant.
This is the system of 2d first-order differential equations.
Proof. Remember, in the definition (1.3), the generalized velocity q is uniquely
defined by the equation p = Dq L(q(p, x), x). Then from (1.10) it follows that
ẋ(s) = q(p(s), x(s)). We calculate the corresponding gradients of H from its def-
inition (1.12):
Now if we remember the Euler-Lagrange equation (1.5) and the above, we get:
Finally, using this, we can easily prove that the mapping s '→ H(p(s), x(s) is con-
stant:
d
H(p(s), x(s)) = Dp H(p(s), x(s)) · ṗ(s) + Dx H(p(s), x(s)) · ẋ(s)
ds
= Dp H(p(s), x(s)) · (−Dx H(p(s), x(s))) + Dx H(p(s), x(s)) · Dp H(p(s), x(s))
=0
8
Unlike Lagrangian formulation, the Hamiltonian ODEs are coupled and the order
of the equations is one, not two, hence mathematically it can be easier to solve them
in some cases. But there’s still a question, where the Hamiltonian formalism can be
useful and why it was introduced in the first place? Let us list some of the reasons:
1. Hamiltonian is a physical quantity, namely a total energy of the system, as
described in the example (1.1) above, and therefore all parts of Hamilton’s
equations have a physical interpretation.
9
where P = P (p, x, t) and X = X(p, x, t), and the Hamiltonian in new coordinates
is denoted as K(P, X) = H(p(P, X), x(P, X)), then the Hamilton’s equations in new
coordinates have the following form:
$
Ṗ (t) = −DX K(P (t), X(t))
(1.15)
Ẋ(t) = DP K(P (t), X(t))
Now let us remind ourselves about Hamilton’s principle. The path x(t) minimizes
the action functional, thus solving the variational problem (1.4). In terms of calculus
of variation, it means, that x(t) is a stationary point of the action functional, meaning
that % t
δI[x(·)] = δ L(ẋ(s), x(s))ds = 0,
0
where δ denotes the functional derivative. In physics this version is often referred
to as the Hamilton’s principle in configuration space. We want to rewrite it via the
Hamiltonian. Let p(s) = Dq L(ẋ(s), x(s)). Then from the definition 1.3 it follows,
that H(p(s), x(s)) = p(s)· ẋ(s)−L(ẋ(s), x(s)). We can rewrite the variational relation
above now: % t
δI[x(·)] = δ [p(s) · ẋ(s) − H(p(s), x(s))]ds = 0. (1.16)
0
This version is called the Hamilton’s principle in phase space. Next we want to apply
the canonical transformation and expect the Hamilton’s principle to hold in the new
coordinates, i.e. there exists some path X(s) in new coordinates, such that
% t
δI[X(·)] = δ [P (s) · Ẋ(s) − K(P (s), X(s))]ds = 0, (1.17)
0
where K(P, X) is the Hamiltonian in new coordinates. The expressions under the
integral inside (1.16) and (1.17) need not be equal, however, due to the fact, that
we have fixed endpoint values for x(t), x(0) and hence p(t), p(0), so there’s a zero
variation at endpoints, we must have the following relation between the integrands
in old and new coordinates:
du
α(p(s) · ẋ(s) − H(p(s), x(s)) = P (s) · Ẋ(s) − K(P (s), X(s)) + |t=s , (1.18)
dt
where u is some good enough function with zero variation on endpoints (this will
ensure the simultaneous satisfaction of both (1.16) and (1.17)) and α is a scaling
constant. From now on we consider α = 1, as it’s not relevation for our discussion.
This function F is called a generating function, depending on the variables, on which
it depends. Goldstein in [14] introduces four generic types of generating functions:
10
1. u1 := u1 (x, X, t).
2. u2 := u2 (P, x, t) − X · P .
3. u3 := u3 (p, X, t) + x · p.
4. u4 := u4 (p, P, t) + x · p − X · P .
We are interested in transformation laws that appear in case of the generating func-
tion type-2. Let u2 = u(P, x, t) − X · P . We remark, that the term −X · P is added
to get rid of Ẋ in (1.17). Let us see what happens (we’ll omit the parameter s for
the sake of clearer calculations):
'
d(u(P, x, t) − X · P ) ''
p · ẋ − H(p, x) = P · Ẋ − K(P, X) +
dt '
t=s
= P · Ẋ − K(P, X) + ut + DP u · Ṗ + dx u · ẋ − P · Ẋ − Ṗ · X
= −K(P, X) + ut + DP u · Ṗ + Dx u · ẋ − X · Ṗ .
Or, rewriting it:
(p − Dx u) · ẋ + (X − DP u) · Ṗ = H(p, x) − K(P, X) + ut . (1.19)
Separately the new and old coordinates are independent of each other, then the whole
equation can be identically true, if the corresponding coefficients for ẋ and Ṗ vanish,
i.e. if the following is true:
p(s) = Dx u(P (s), x(s), s)
X(s) = DP u(P (s), x(s), s).
And that means, that the rest of the equation should also be equal to zero, thus
giving us the formula for the Hamiltonian in the new coordinates and thus proving
the following theorem.
Theorem 1.3. If (p, x, t) '→ (P (p, x, t), X(p, x, t), t) is a canonical transformation,
generated by a function u, then the Hamiltonian in the new coordinates is:
K(P, X, t) = H(p, x) + ∂t u(P, x, t). (1.20)
Generally the process of applying the transformation is defined by several steps:
1. We start with the function u : Rd × Rd × R → R, u := u(P, x, t).
2. We define the coordinates p = Dx u, X = DP u.
3. Then (p(t), x(t)) '→ (P (t), X(t)) is a canonical transformation.
11
1.4 Hamilton-Jacobi Equation and Its Properties
Now we are searching for a specific canonical transformation, that will make the
resulting Hamiltonian K to be 0. Then the Hamilton’s equations in the new coordi-
nates will be trivial: $
Ṗ (t) = 0
(1.21)
Ẋ(t) = 0
So we are looking for a generating function u(P, x, t), that will produce such a trans-
formation. By theorem (1.3), K(P, X, t) = H(p, x)+∂t u(P, x, t), and by the definition
of a type-2 generating function, p = Dx u. So, in order for the new Hamiltonian K
to be zero, we must have:
∂t u(P, x, t) + H(Dx u(P, x, t), x) = 0.
However, P doesn’t play any mathematical role in this equation, so for further in-
vestigation, we assume that u has no explicit dependence on P and the equation
becomes the Hamilton-Jacobi equation:
∂t u(x, t) + H(Dx u(x, t), x) = 0. (1.22)
The function u, as a solution to the Hamilton-Jacobi equation, is also called the
Hamilton principle function. We’ll show, that it’s actually closely related to the
action integral. Since u is a generating function, then Dx u = p. Also ẋ = q. Using
these facts and the Hamilton-Jacobi equation, we can calculate the time derivative
of u:
du dx
= Dx u · + ∂t u = p · ẋ + ∂t u = p · ẋ − H(p, x) = p · q − H(p, x) = L(p, q) (1.23)
dt dt
So actually, u is a classical action plus some constant.
12
We assume, that F = F (p, z, x) and g are given smooth functions. The idea of the
method of characteristics if the following: we assume, that u solves the equation
(1.24) and we fix x ∈ Ω and x0 ∈ Γ. We know the value of u in x0 because of the
boundary condition and then want to find the value u(x) by finding a curve inside
Ω, that connects x and x0 and along which we can calculate the value of u.
The curve mathematically is described as x(s) = (x0 (s), ..., xd (s)). We define z(s) =
u(x(s)) and p(s) = Du(x(s)). Let’s first differentiate:
n
&
ṗi (s) = uxi xj ẋj (s). (1.25)
j=1
∂F
ẋj (s) = (p(s), z(s), x(s)). (1.28)
∂pj
∂F ∂F
ṗi (s) = − (p(s), z(s), x(s))pi (s) − (p(s), z(s), x(s)) (1.29)
∂z ∂xi
Finally, we calculate the derivative of z(s):
n n
& ∂u & ∂F
ż(s) = (x(s))ẋj (s) = ṗj (s) (p(s), z(s), x(s))
j=1
∂xj j=1
∂p j (1.30)
= Dp F (p(s), z(s), x(s)) · p(s).
13
Equations (1.28), (1.29) and (1.30) are the characteristic lines we were looking for.
Let’s rewrite them in the vector notation:
⎧
⎨ṗ(s) = −Dz F (p(s), z(s), x(s))p(s) − Dx F (p(s), z(s), x(s))
⎪
ż(s) = Dp F (p(s), z(s), x(s)) · p(s) (1.31)
⎪
ẋ(s) = Dp F (p(s), z(s), x(s)).
⎩
It should be noted, that for these equations to be used, appropriate initial conditions
should be found. More about it can be found in Chapter 3.2 in [11]. We, however,
return to the Hamilton-Jacobi PDE.
Here we have F (P, z, X) = q + H(p, x), where P = (p, q) and X = (x, t). Now we
calculate:
DP F = (Dp H, 1).
DP F · P = (Dp H, 1) · (p, q) = Dp H · p + q = Dp H · p − H(p, x).
Dz F = 0.
DX F = (Dx H, 0).
Now we can calculate these derivatives at (y(s), z(s), y(s)) and use (1.31) and we get
the following system:
⎧
⎨Ṗ (s) = −Dx H(p(s), x(s))
⎪
ż(s) = Dp H(p(s), x(s)) − H(p(s), x(s))
⎪
Ẋ(s) = Dp H(p(s), x(s)).
⎩
The equations for x(s) and p(s) are exactly the Hamilton’s ODEs, as we have derived
them from the Lagrangian mechanics. Notice also, that the equation for z(s) is
trivial, once we find x(s) and p(s), because the function H doesn’t directly depend
on z, so we can immediately integrate the equation for z(s).
14
1.5 Applications
In this chapter we’ll show some useful examples, that shows the relations of the
Hamilton-Jacobi equation to such fields of physics as optics and quantum mechanics.
2. The paths contribute equally in magnitude, but the phase of the contributions
is the classical action.
2
An isosurface represents points of constant action within some volume of space.
15
iu(x,t)
Mathematically, this means, that the amplitude for a given path x(s) is Ae ! ,
where A is some real function and u is a classical action, performed by the system
between times t1 and t2 . And the total amplitude of a system is
& & iun
n
K21 = K21 = An21 e ! (1.34)
n n
, where An21 and un correspond to the path xn (t). This function K is called a space-
time propagator and it’s related to the quantum-mechanical wavefunction.
Now, to establish the connection between Hamilton-Jacobi equation and Schrödinger
equation, we’ll consider the following simplified wavefunction:
iu(x,t)
ψ(x, t) := e ! (1.35)
! is a Planck constant. This function has all the properties of wave function of wave
mechanics. Inverting this, we get:
1 !2
H(Dx u, t) = |Dx |2 + V (x) = − Dx2 ψ + V (x). (1.37)
2m 2mψ
16
Finally, the time derivative is:
∂u i! ∂ψ
=−
∂t ψ ∂t
And now we can rewrite the Hamilton-Jacobi equation in terms of ψ:
!2 2 ∂ψ
− Dx ψ + V (x)ψ − i! =0 (1.38)
2m ∂t
This is the time-dependent non-relativistic Schrödiner equation, which is a fundamen-
dal mathematical description of behavior of quantum systems. Although we gave
only the brief physical motivation of our guess, that u can be views as an amplitude
of the wave, the derivation above suggests a deep link between Hamiltonian formu-
lation of classical mechanics and quantum mechanics. More about this connection
can be found in [12].
2 Traditional Approach
Now that we have established all the connections between the Hamilton-Jacobi PDE,
Hamilton’s ODEs and calculus of variation problems, we can move on to finding the
solution to the problem (1.1). We’ll demostrate how to solve the HJE by separation
of variables and then show that in classical sense, the solution may be found only in
some restricted cases, and this solution might not even be unique.
17
Passing it back to (2.1), we get the following equation:
∂u1 ∂u′
∂t u′ + H(f1 (x1 , ), ..., fd (xd , )) = 0 (2.3)
∂x1 ∂xd
Assume now, that we have already found a solution u in the form (2.2). Then (2.3)
becomes an identity for any value of x1 . But when we change x1 , only the function
f1 is affected. Hence it must be constant. Thus, we obtain two equations:
∂u1
f1 (x1 , ) = α1 , (2.4)
∂x1
∂u′ ∂u′
∂t u′ + H(α1, f2 (x2 , )..., fd (xd , )) = 0 (2.5)
∂x2 ∂xd
α1 is an arbitrary constant. Assume also, that f1 is invertible in the second variable
in a sense, that there exists g(x1 , y1 ), such that for each (x1 , p1 ) and each y1 ∈ R,
f (x1 , g(x1, y1 )) = y1 . For our case this means, that ∂u 1
∂x1
= g(x1 , α1 ). Then after
simple integration we find out u1 :
%
u1 (x1 ) = g1 (x1 , α1 )dq1 + C1 , (2.6)
The separability of the equation depends not only on the Hamiltonian itself, but on
the generalized coordinates too.
18
Example 2.1. We’ll demonstrate this method on the very trivial Hamilton-Jacobi
PDE for the motion of a particles on a line (one dimension).
ut + (ux )2 = 0. (2.11)
Then, following
√ the steps of the method, we have that H(x, p) = f (x, p) = p2 , hence
g(x, α) = α. Now, we are looking for a solution in a form u(x, t) = u0 (t) + u1 (x).
We can calculate u1 (x) by (2.6):
√
u1 (x) = αx + C,
To specify α and C, we need to supply the initial conditions. For example, if the
initial condition is u(x, 0) = 0, then obviously α = 0 and hence also C = 0.
Further, more complicated examples of the usages of this method can be found
in §48 in [20].
f ∗∗ = f,
19
We’ll prove this fact later in this chapter for the case of Lagrangian and Hamilto-
nian. Moreover, this equality above is in fact, a charaterization of a convex lower
semicontinuous function. This fact is know as a Fenchel-Moreau theorem.
In case of f being non-convex, the biconjugate (the transformation applied twice)
will not be the same as the original function, but it will be the largest lower semi-
continuous convex function with f ∗∗ = f , which is called the closed convex hull.
f (q0 ) = p · q0 + b,
p = Dq f (q0 ).
For strictly convex function, each component of the gradient is strictly monotone, so
the second equation can be uniquely solved for q0 = q0 (p), which gives us a way to
calculate b as a function of p:
y = p · q − f ∗ (p)
So the Legendre transform maps the graph of the function to the family of the tangent
planes of this graph.
To simplify the discussion, we assume, that the Lagrangian L and its correspond-
ing Hamiltonian H don’t explicitly depend on x, so we write L = L(q) and H = H(p).
From now on in this chapter we also assume that Lagrangian L is convex and satisfies
the following condition:
L(q)
lim = +∞ (2.13)
q→∞ |q|
This condition is called superlinearity. First we’ll show that the Legendre transform
of such a Lagrangian is actually a corresponding Hamiltonian.
20
Proof. Remember the definition 1.3 of the Hamiltonian. In the view of condition
(2.13), for each p there exists a point q(p) ∈ Rd , such that the mapping q '→ p·q−L(q)
has a maximum at q(p) and:
L∗ (p) = p · q(p) − L(q(p)). (2.15)
Now it follows that Dq L(q(p)) = p, which is a solvable for q(p). It immediately
follows, that (2.15) gives us the definition of the Hamiltonian.
Now we turn to some properties of the Legendre transform, that make it so
valuable to the discussion.
Theorem 2.2. Assume, the convex Lagrangian L satisfies the condition (2.13). De-
fine H = L∗ . Then the following facts are true:
1. H is a convex map,
H(p)
2. limp→∞ |p|
= +∞,
3. L = H ∗ .
These properties are called the convex duality of Hamiltonian and Lagrangian.
Proof. 1. Take τ ∈ [0, 1) and p1 , p2 ∈ Rd . We have:
H(τ p1 + (1 − τ )p2 ) = L∗ (τ p1 + (1 − τ )p2 ) = sup{(τ p1 + (1 − τ )p2 ) · q − L(q)}
q
= τ H(p1 ) + (1 − τ )H(p2 )
Hence H is a convex map by definition.
p
2. Take p ̸= 0 and λ > 0. For estimate below we’ll just use q0 = λ |p| . We have:
H(p) = sup{p · q − L(q)}
q
p
≥ λ|p| − L(λ )
|p|
≥ λ|p| − max L(q).
q∈B(0,λ)
Now the latter term is fixed, so it’s not affected by the change of p, so we can
deduce that:
H(p)
lim inf ≥ λ.
p→∞ |p|
Since λ was an arbitrary constant, we proved the second property.
21
3. Since H = L∗ , then from the definition of the Legendre transform it follows,
that for all p, q ∈ Rd :
H(p) + L(q) ≥ p · q.
In particular, the following is true:
L(q) ≥ sup{p · q − H(p)} = H ∗ (q).
p
22
over all w : [0, t] → Rd , such that, w(t) = x. We also need to ensure that the initial
condition is take into account, so we’ll add a term g(w(0)). So, we have the following
variational problem:
,% t -
u(x, t) := inf L(ẇ(s))ds + g(w(0))|w(t) = x (2.19)
0
We emphasize again, that this seemingly random guess comes from the relations of
the Hamilton’s equations to the Lagrange’s variational problem and the fact that
the Hamilton’s equations are in fact the characteristic equations of the HJE. Next
we simplify the expression (2.19).
Take y ∈ Rd and define w(s) := y + st (x − y). Then by definition of u, we have
t ! "
x−y
%
u(x, t) ≤ L(ẇ(s))ds + g(y) = tL + g(y),
0 t
23
fact a minimum.
Later we denote by Lip(g) the Lipschitz constant of g, which is
, -
|g(x) − g(y)
Lip(g) = sup . (2.21)
x,y∈Rd ,x̸=y |x − y|
In the case of Lipschitz functions Lip(g) < ∞. We make an estimate, using the
Lipschitz continuity:
|g(y) − g(x)| − Lip(g)|y − x| ≤ 0.
Then we add this term to the definition of ψ to estimate it from below:
! "
x−y
ψ(y) ≥ tL + g(y) + |g(y) − g(x)| − Lip(g)|y − x|
t
! "
x−y
≥ tL + g(y) + |g(y)| − |g(x)| − Lip(g)|y − x|
t
! "
x−y
≥ tL − |g(x)| − Lip(g)|y − x|
t
0 . 1
L x−y
/
t |g(x)|
= |x − y| |x−y|
− − Lip(g)
|x − y|
t
L x−y
. /
t
|x−y|
→|y|→∞ +∞,
t
|g(x)|
by our assumption (2.13). Obviously, |x−y|
→ 0 as |y| → ∞. That proves, that:
which means by definition, that for any x, t there exists R > 0, such that if y ∈ /
B̄(0, R) (by B̄(0, R) we denote a closed ball of radius R), then ψ(y) > u(x, t) + ϵ for
some fixed ϵ > 0. But by extreme value theorem inside this closed ball ψ attains its
minimum at some point y0 ∈ B̄(0, R).
By this construction, inf |y|≤R ψ(y) ≥ ψ(y0 ). Also by definition of the infimum, there
exists a point y ′ ∈ B̄(0, R), such that ψ(y ′) < u(x, t) + 2ϵ . From which it follows,
that:
ϵ
inf ψ(y) > u(x, t) + ϵ > u(x, t) + > ψ(y ′) ≥ ψ(y0 ).
|y|>R 2
24
All of the above means, that u(x, t) = inf y∈Rd ψ(y) ≥ ψ(y0 ). It immediately follows
from the definition of the infimum, that:
, ! " -
x−y
u(x, t) = min tL + g(y) . (2.23)
y∈Rd t
We have just proven the following theorem, which originally was constructed by Hopf
in [17] and Lax in [21]:
Remark. Actually, the Hopf-Lax formula can be motivated in a different way. No-
tice, that for any y, z ∈ Rd taken as a parameter, the function
solves the Hamilton-Jacobi equation from the initial value problem (2.18). Indeed,
Dx F (x, t) = z, and hence Ft (x, t) = −H(z) = −H(Dx F (x, t)), which proves this
claim.
Then the Hopf-Lax formula can be obtained from this F (x, t) by the two steps envelope
solution:
u(x, t) = inf sup {(x − y) · z − tH(z) + g(y)}
y∈Rd z∈Rd
, ! " -
∗ x−y
= inf tH + g(y) .
y∈Rd t
Which is exactly the Hopf-Lax formula. The fact that inf in this version is min can
be proved in the same way as in the reasoning above.
Moreover, Hopf in [17] constructed a second formula, by switching the inf and sup
in the above envelope formula, after which we can write a Legendre transform of g:
Before showing, how the Hopf-Lax formula can be used to define some reason-
able solution to the Hamilton-Jacobi equation, let us first show some of the useful
properties of the function u, defined by the Hopf-Lax formula.
25
Lemma 2.1. The function u, defined by Hopf-Lax formula (2.23), where g is Lips-
chitz continuous, has the following properties:
1. u is Lipschitz continuous.
2. u = g on Rd × {t = 0}.
3. For each x ∈ Rd and 0 ≤ s < t we have the following functional identity for
u(x, t), defined by (2.23):
, ! " -
x−y
u(x, t) = min (t − s)L + u(y, s) (2.24)
y∈Rd t−s
Then
, ! " - ! "
x̂ − z x−y
u(x̂, t) − u(x, t) = inf tL + g(z) − tL − g(y)
z∈Rd t t
= [let z = x̂ − x + y for the expression inside the inf]
! " ! "
x−y x−y
≤ tL + g(x̂ − x + y) − tL − g(y)
t t
≤ g(x̂ − x + y) − g(y) ≤ Lip(g)|x̂ − x|.
2. Let 0 < s < t. The formula (2.24), that we need to prove, means is that once
we know the value of u(x, s), we can us it as a initial condition for calculating
u(x, t). Fix y ∈ Rd and 0 < s < t and choose z ∈ Rd , such that
! "
y−z
u(y, s) = sL + g(z).
s
26
Notice, that:
x−z 2 s3 x − y s y − z
= 1− + .
t t t−s t s
It follows by convexity of L, that:
! " 2 ! " ! "
x−z s3 x−y s y−z
L ≤ 1− L + L .
t t t−s t s
Therefore
! " ! " ! "
x−z x−y y−z
u(x, t) ≤ tL + g(z) ≤ (t − s)L + sL + g(z)
t t−s s
! "
x−y
= (t − s)L + u(y, s).
t−s
This is true for all y ∈ Rd . We have already shown the Lipschitz continuity,
and hence the continuity of a map y '→ u(y, ·). So it follows, that:
, ! " -
x−y
u(x, t) ≤ min (t − s)L + u(y, s)
y∈Rd t−s
27
3. We now need to show that there is also Lipschitz continuity with respect to the
t variable. Let us again fix t > 0 and x ∈ Rd and choose y = x in the Hopf-Lax
formula (2.23). Then
u(x, t) ≤ tL(0) + g(x). (2.27)
Moreover, using the Lipschitz condition on g,
we can estimate,
, ! " -
x−y
u(x, t) = min tL + g(y)
y∈Rd t
, ! " -
x−y
≥ g(x) + min tL − Lip(g)|x − y|
y∈Rd t
x−y
[now set z = ]
t
= g(x) + min {tL(z) − tLip(g)|z|}
z∈Rd
[next we use min{f (x)} = −max{−f (x)}
= g(x) − t max {Lip(g)|z| − L(z)}
z∈Rd
= g(x) − t max max{w · z − L(z)}
w∈B(0,Lip(g)) z∈Rd
∗
= g(x) − t max L (w) = g(x) − t max H(w)
w∈B(0,Lip(g)) w∈B(0,Lip(g))
where
C := max{|L(0)|, max H(w)}.
w∈B(0,Lip(g))
28
And, similarly to when s=0:
, ! " -
x−y
u(x, t) ≥ u(x, s) + min (t − s)L − Lip(u(·, t))|x − y|
y∈Rd t−s
= u(x, s) − C(t − s)
This proves the Lipschitz continuity of t '→ u(·, t).
29
and fix h > 0. We set s = t − h and y = st x + (1 − st )z. Then x−z
t
= y−z
s
. Therefore
! " ! ! " "
x−z y−z
u(x, t) − u(y, s) ≥ tL + g(z) − tL + g(z)
t s
! "
x−z
= (t − s)L .
t
From where it follows, that:
By letting h → 0+ , we obtain:
! "
x−z x−z
· Dx u(x, t) + ut (x, t) ≥ L
t t
Finally
! "
x−z x−z
0 ≤ ut (x, t) + · Dx u(x, t) − L
t t
≤ ut (x, t) + max{q · Dx u(x, t) − L(q)}
q∈Rd
Taking into account the fact that H = L∗ , this ends the proof.
One may conclude from this, that the Lipschitz continuous function, that solves
the HJE a.e. and agrees with g, when t = 0, might be a good definition for the
weak solution of (2.18), but unfortunately, such a solution is not unique, that can be
illustrated by the following example:
We can notice, that the functions u1 (x) := 1 − |x| and u2 (x) := |x| − 1 solve the
equation almost everywhere (except for x = 0, as the derivative doesn’t exist there).
Moreover, since we only want a.e. solution, we can notice, that any function, that
has the piecewise property |u′ (x)| = 1 and agrees with the boundary conditions, will
30
solve the equation. For example, the function, that has the following graph, also
solves the equation a.e., along with u1 and u2 :
−1 1
also solves the initial value problem almost everywhere and it’s Lipschitz continuous.
Moreover, there’re infinitely many function, that satisfy this problem almost every-
where. So as a conclusion, we need to put more restrictions on the functions g and
H, and rethink our definition of the weak solution.
31
Now, we can estimate u(x+ z, t) and u(x−z, t) by the Hopf-Lax formula from above,
by choosing respectively y +z and y −z as corresponding y-variable from the formula.
It follows that:
! ! " "
x+z−y−z
u(x + z, t) − 2u(x, t) + u(x − z, t) ≤ tL + g(y + z)
t
! ! " " ! ! " "
x−y x−z−y+z
− 2 tL + g(y) + tL + g(y − z)
t t
= g(y + z) − 2g(y) + g(y − z) ≤ C|z|2 .
The second step will be to establish the connection between H and u. We drop
the semiconcavity assumption on g and consider now the uniformly convex H. It
happens, that this property is enough to ensure the semiconcavity of the solution u
for any fixed t > 0. Again, we first give a definition.
Definition 2.3. A C 2 function H : Rd → R is called strongly convex with constant
θ > 0, if the following inequality is true for all p, ξ ∈ Rd :
ξ T ∇2 H(p)ξ ≥ θ|ξ|2. (2.35)
Remark. This definition is actually a special case of a uniformly convex function,
which means that there exists a function φ, which is increasing and vanishes only at
zero, such that:
ξ T · Dx2 H(p) · ξ ≥ φ(|ξ|) (2.36)
Obviously, the case of φ(α) = θα2 gives the above definition of a strongly convex
function.
Lemma 2.3. Let H be a strongly convex Hamiltonian with constant θ, L = H ∗ be the
corresponding Lagrangian, and u is defined by the Hopf-Lax formula (2.23). Then u
is semiconcave for any t > 0, in particular, for all x, z ∈ Rd :
1 2
u(x + z, t) − 2u(x, t) + u(x − z, t) ≤ |z| .
θt
Proof. Let p1 , p2 ∈ Rd . Then, by Taylor’s formula, we have the following identities:
! " ! " ! "
p1 + p2 p1 + p2 p1 − p2
H(p1 ) = H + Dx H ·
2 2 2
! "T ! "T
1 p1 − p2 p1 − p2
+ · Dx2 H(p̂1) ·
2 2 2
32
! " ! " ! "
p1 + p2 p1 + p2 p2 − p1
H(p2 ) = H + Dx H ·
2 2 2
! "T ! "T
1 p2 − p1 p2 − p1
+ · Dx2 H(p̂2) ·
2 2 2
p̂1 and p̂2 are the mid-points, corresponding to the remainder term in Lagrange form.
Now, summing it up and estimating each of the remainder terms by (2.35), we obtain:
! "
p1 + p2 θ
H(p1 ) + H(p2 ) ≥ 2H + |p1 − p2 |2 (2.37)
2 4
33
This will help to get rid of the pi from (2.38). We obtain:
! "
q1 + q2 1
L(q1 ) + L(q2 ) ≤ 2L + |q1 − q2 |2 (2.39)
2 4θ
Finally, we choose y to be a minimizer in the Hopf-Lax formula for u(x, t), and then
using the same value for estimating u(x + z, t) and u(x − z, t), we obtain:
! ! " "
x+z−y
u(x + z, t) − 2u(x, t) + u(x − z, t) ≤ tL + g(y)
t
! ! " " ! ! " "
x−y x−z−y
− 2 tL + g(y) + tL + g(y)
t t
! ! " ! " ! ""
x+z−y x−y x−z−y
≤t L − 2L +L
t t t
' '2
1 ' 2z ' 1
≤ t '' '' = |z|2 .
4θ t θt
x+z−y x−z−y
Here we applied the inequality (2.39) here with q1 = t
and q2 = t
.
Now we can combine the results of both the lemmas to create a suitable definition
of the weak solution of the HJE initial value problem (2.18).
The previous lemmas (2.2) and (2.3) directly lead us to the existence theorem of
the weak solution, under some restrictions on the function H or g.
34
The uniqueness, however, can be ensured without additional semiconcavity and
strong convexity assumptions on g and H, however the semiconcavity is used here,
as the key point to the proof is the condition 3. from the definition 2.4.
Proof. Suppose, that u1 and u2 are two weak solutions of the (2.18). Let w = u1 − u2
and let (y, s) be an arbitrary point, where u1 and u2 are differentiable, and hence
solve the HJE. Then
wt (y, s) = u1t (y, s) − u2t (y, s) = −H(Dx u1 (y, s) + H(Dx u2 (y, s))
% 1
d
=− H(τ Dx u1 (y, s) + (1 − τ )Dx u2 (y, s))dτ
0 dτ
% 1
=− Dp H(τ Dx u1(y, s) + (1 − τ )Dx u2 (y, s))dτ · (Dx u1(y, s) − Dx u2 (y, s))
0
Consequently,
wt + b · Dx w = 0 a.e. (2.41)
Let φ be some smooth function φ : R → [0, ∞), whose further properties we’ll
establish later in the proof. Denote v(t) := φ(w(t)) ≥ 0. Then from (2.41), multiplied
by φ′ (w), it follows
35
ui :
' ϵ
' ui (x + he, t) − uϵi (x, t) '
'
' '
' h '
'% '
' (ui (x + he − y, t − r) − ui (x − y, t − r)) '
=' ' ηϵ (y, r) dydr ''
Rd ×(0,∞) h
' '
' (ui (x + he − y, t − r) − ui (x − y, t − r)) '
%
≤ ηϵ (y, r) '' ' dydr
Rd ×(0,∞) h '
%
≤ Lip(ui )|e| ηϵ (y, r)dydr = Lip(ui ).
Rd
Finally, by using the property 3 of a definition of a weak solution (2.4), we have one
more inequality for all ϵ > 0, y ∈ Rd and s > 2ϵ:
! "
2 ϵ 1
Dx ui (y, s) ≤ C 1 + I (2.45)
s
By I we denote an identity matrix. This inequality can be obtained by the limit argu-
ment for the second derivative, like it was done above with the Lipschitz continuity.
Simply notice, that for z > 0, |z| ≤ 1
! " ! "
ui (y + zej , s) − 2ui (y, s) + ui (y − zej , s) 1 2 1
≤C 1+ |z| ≤ C 1 + .
h2 s s
By letting h → 0, we obtain the claim.
We denote
% 1
bϵ (y, s) := Dp H(rDx uϵ1 (y, s) + (1 − r)Dx uϵ2 (y, s))dr. (2.46)
0
36
Then:
vt + bϵ · Dx v = (bϵ − b) · Dx v a.e
and therefore
vt + div(vbϵ ) = div(bϵ )v + (bϵ − b) · Dx v a.e.
Clearly,
% 1 d
&
div(bϵ ) = Hpk pl (rDx uϵ1 + (1 − r)Dx uϵ2 )(ruϵ1xl xk + (1 − r)uϵ1xl xk )dr
0 k,l=1
! " (2.47)
1
≤C 1+
s
and
C := {(x, t), s.t. 0 ≤ t ≤ t0 , |x − x0 | ≤ R(t0 − t)}.
Denote %
e(t) = v(x, t)dx.
B(x0 ,R(t0 −t))
37
Letting ϵ → 0, we have by the Dominated Convergence Theorem:
! "
1
ė(t) ≤ C 1 + e(t), a.e. 0 < t < t0 . (2.48)
t
Now fix 0 < ϵ < r < t and let the function φ(z) ≥ 0 be zero, if
Therefore,
|u1 − u2 | ≤ ϵ(Lip(u1 ) + Lip(u2)) on B(x0 , R(t0 − r)). (2.50)
Since it’s true for all ϵ > 0, then u1 (x0 , t0 ) = u2 (x0 , t0 ).
As a conclusion, we can say, that the existence and the uniqueness of the solution
of the HJE are being well established only in a restricted case. There’re however a
lot of limitations on the input data, that cannot be omitted. That’s the reason why
recently there was developed another notion of the weak solution, called viscosity
solution, which helps to build an existence and uniqueness theory further without
harsh restrictions on the initial data and the Hamiltonian. The next chapter is
dedicated to this approach.
3 Viscosity Solutions
In this chapter we’ll introduce the notion of viscosity solution, along with its original
motivation, originating in the vanishing viscosity method. Nowadays this concept is
widely used in the theory of degenerate elliptic PDEs (we’ll give the definition for
that later), but originally it was introduced as a generalized solution particularly to
the Hamilton-Jacobi equation.
38
then the method is to have this equation as a limit (ε → 0) of the following equation:
uεt + (F (uε ))x = ε∆uε , (3.2)
∂2u
4d
where ∆u = i=1 ∂x2 .
i
Such a model originally appeared in fluid dynamics, namely in different specific cases
of Navier-Stokes equations, where the term µ∆u represents the viscosity of the fluid,
that’s why the term ε∆u is called the vanishing viscosity. This method is motivated,
first of all, by the fact, that this additional small viscosity term makes the model
more physically real. And more importantly for studying the solvability of (3.1), the
equation (3.2) may have a smooth solution, even in case of discontinuous initial data.
More generally, the method can be seen as:
A(u) = f → Aε (uε ) = f,
where A is a nonlinear differential operator, f is given, and Aε are operators that
are expected to converge to A in some sense.
Now we turn back to the Hamilton-Jacobi equation (1.1). We forget about the
restrictions on the Hamiltonian from the previous chapter and consider the general
case, where H depends on x and may not be convex in p. This application of the
vanishing viscosity method is originally due to Evans in [10].
We start by adding a viscosity term:
$
uεt + H(Dx uε , x) = ε∆uε
(3.3)
uε (x, 0) = g.
This is a quasilinear parabolic PDE. In our case of smooth Hamiltonian, it turn
out to have smooth solutions (the proof of this fact can found in Friedman [13]), so
we can call (3.3) a regularization of (1.1). Our hope is that uε converges to some
solution u of (1.1), as ε → 0. Evans in [10] proved the following theorem, using the
Friedman’s result:
Theorem 3.1. Let H = H(p, x) be a Hamiltonian and there exists a constant M,
s.t.
|H| + |Hx | + |Hp | ≤ M,
let g be a bounded Lipschitz function. Then there exists a sequence εj ↘ 0, such that
there exists a uniform limit on compact sets of Rd × [0, T ]:
lim uεj (x, t) = u(x, t)
εj ↘0
and this limit is Lipschitz and is a solution of (1.1) in a weak sense (it solves the
equation a.e. and satisfies the initial condition).
39
This is a good way to ensure the existence of such a weak solution to the orig-
inal problem, although, as we can see, there are still some assumptions put on the
Hamiltonian. Later we are going to build the existence theory, that follows from the
control theory.
Now we turn to some observations regarding the limit uε , as ε → 0. First, let’s
notice, that the estimates uε strongly depend on the regularizing term ε∆u, hence
we can lose control over these estimates and their derivatives, as ε → 0. However,
in practice, if we assume, that uε converge uniformly on some compact subsets of
Rd × [0, ∞), then we can work with a family uε , that is bounded and equicontinu-
ous on these compact subsets. Then, by the Arzela-Ascoli theorem, there exists a
subsequence {uεj }∞ d
j=1 and some function u ∈ C(R × [0, ∞), such that
This at least gives us some information about the solution u, however, as it’s only
continuous, we still have no idea about whether the derivatives exist and what they
might look like. So what we are going to do is to exploit the maximum principle,
which will allow us to move to differentiating smooth test functions.
Let u be defined as a limit above. Fix v ∈ C ∞ (Rd × (0, ∞)) and assume, that
This means that uεj − v attains a local maximum somewhere inside Br . If we repeat
these steps for a sequence rj → 0, we get that there exists (xεj , tεj ) → (x0 , t0 ), such
that
uεj − v has a local maximum at (xεj , tεj ). (3.6)
And this, consequently, means, that
$
Dx uεj (xεj , tεj ) = Dx v(xεj , tεj )
ε (3.7)
ut j (xεj , tεj ) = vt (xεj , tεj )
40
and
− ∆uεj (xεj , tεj ) ≥ −∆v(xεj , tεj ) (3.8)
We can now calculate, using the results above and (3.3):
Finally, we let εj → 0 ad using the fact that v is a smooth test function and H is a
smooth Hamiltonian (in fact we can require it to be only continuous), and we obtain
If we soften the condition (3.5) by assuming that the maximum is not necessarily
strict, we can still deduce the same inequality with some additional steps.
Let’s denote for δ > 0, v δ (x, t) := v(x, t) + δ(|x − x0 |2 + |t − t0 |2 ). If u − v has a local
maximum at (x0 , t0 ), then u − v δ has a strict local maximum at this point, so by the
above reasoning vtδ (x0 , t0 ) + H(Dx v δ (x0 , t0 ), x0 ) ≤ 0 and (3.10) follows for v as well.
In a similar manner, by reversing some inequalities, we can prove, that if
then
vt (x0 , t0 ) + H(Dx v(x0 , t0 ), x0 ) ≥ 0. (3.12)
The connection between (3.11) and (3.12) (and between (3.10) and its respective
local maximum condition) is valid for all smooth test functions, so we have achieved
what we were hoping for: we put the derivatives on test functions with full control
over them.
41
1. u = g on Rd × {t = 0},
2. for each v ∈ C ∞ (Rd ) × (0, ∞), we have that, if u − v has a local maximum (or,
respectively, a local minumum) at a point (x0 , t0 ) ∈ Rd × (0, ∞), then,
vt (x0 , t0 ) + H(Dx v(x0 , t0 ), x0 ) ≤ 0
or, respectively,
vt (x0 , t0 ) + H(Dx v(x0 , t0 ), x0 ) ≥ 0.
u is called a viscosity solution, if it’s both viscosity subsolution and viscosity super-
solution.
Remark. In this paper we’ll use the definition above, but it’s worth mentioning,
that it can be defined in a much more general case. We’ll briefly introduce it in this
remark.
We are looking at the equation in form F (x, u, Dx u, Dx2 u) = 0, where F = F (x, r, p, X)
is a function F : Rd × R × Rd × S(Rd ), where S(Rd ) is the set of symmetric matrices
d × d equipped with the standard order. We have only one requirement to put on F :
it has to satisfy the following proper degenerate ellipticity condition:
Let Ω ⊂ Rd and let F satisfy the above condition (3.13). A viscosity subsolution to
F (x, u, Dx u, Dx2 u) = 0 on Ω is then an upper semicontinuous function u : Ω → R,
such that for any v ∈ C ∞ (Rd ), such that u − v has a local maximum at x0 ∈ Ω, it
follows that
F (x0 , u(x0 ), Dx v(x0 ), Dx2 v(x0 )) ≤ 0. (3.14)
And similarly, a supersolution is a lower semicontinuous function u : Ω → R, such
that for any v ∈ C ∞ (Rd ), such that u − v has a local minimum at x0 ∈ Ω, it follows
that
F (x0 , u(x0 ), Dx v(x0 ), Dx2 v(x0 )) ≥ 0. (3.15)
u is a viscosity solution, if it’s both a subsolution and a supersolution. More about
this definition and the corresponding properties can be found in [8].
42
it by definition.
If u is a viscosity solution, then for any v ∈ C ∞ (R), such that x0 is a maximum
(minimum) point of u − v, we need to have |v ′(x0 )| ≤ 1 (≥ 1). This means, that if
we consider a smooth function v(x) = c(x−1) for c < −1, then we should have u ≤ v,
for otherwise, there would exist a maximizing point x0 for u − v and |v ′ (x)| = |c| > 1,
which contradicts our assumption, that u is a viscosity solution. Letting c → −1, we
get that u ≤ 1 − x.
Similarly, by putting v(x) = c(x + 1) for c > 1 and letting c → 1, we get that
u ≤ 1 + x, which means that u ≤ 1 − |x|.
Finally, try to get v by smoothing the tip of c(1 − |x|) where 0 < c < 1 and see, that
if there exists x̃ ∈ (−1, 1), such that u(x̃) ≤ v(x̃), then there must be a minimizing
point x0 , in which the following should be true |v ′ (x0 )| = |c| ≥ 1, which contradicts
our assumption on c. Thus we have shown that u ≥ v for any smooth v, arbitrarily
close to 1 − |x|. Along with the previous estimate, this proves that u(x) = 1 − |x| is
a viscosity solution of |u′ (x)| = 1.
Notice, however, that u wouldn’t be a viscosity solution of 1 − |u′(x)| = 0, because
then, if x0 is a maximum (minimum) point of u−v, we would need to have |v ′ (x0 )| ≥ 1
(≤ 1). In this case, the viscosity solution is u(x) = |x| − 1. This example illustrates,
that the viscosity solution allows the orientation of ”corners” or ”bumps” in only one
directions, so in general, if u solves H(Dx u, u) = 0 in a viscosity sense, it will not
necessarily be a viscosity solution to −H(Dx u, u) = 0.
Notice also, that this equation is exactly the eikonal equation (1.33) in 1-dimension
in the medium with uniform refraction index 1 and the viscosity solution is the one
that has the physical meaning. The refraction index determines the speed of light at
each point and and in out case the speed of light is the same everywhere, thus the
time for travelling from the center of the domain (x = 0) to each of the boundaries
will take the same time, which the graph of the function illustrates.
−1 1
Remark. It’s worth mentioning, that the example above suggests another interpre-
tation of viscosity solutions. They may be seen as a sort of uniquification of multiple
43
solutions to the problem.
The discussion in previous chapter showed, that the solution, obtained by the
method of vanishing viscosity, thus, defined by (3.4), is a viscosity solution. Next
we’ll show some important properties of solutions, defined by Definition 3.1.
The first thing that comes to mind is whether a classical solution is a viscosity
solution and vice versa. We’ll prove the following result.
Theorem 3.2 (Consistency of viscosity solutions). The following facts are true:
Proof. The first fact is easy to prove. Let’s take any test function v ∈ C ∞ (Rd ×
(0, ∞)) and assume, that u − v attains a local extremum at (x0 , t0 ), then
$
Dx u(x0 , t0 ) = Dx v(x0 , t0 )
ut (x0 , t0 ) = vt (x0 , t0 ),
44
Under these assumptions, we have that u(x) = |x|ρ1 (x), where ρ1 : Rd → R is a
continuous function, such that ρ1 (0) = 0. We next set for r > 0:
and therefore v ∈ C 1 (Rd ). Finally we have to show the strict local maximum prop-
erty. If x ̸= 0, then
% 2|x|
u(x) − v(x) = |x|ρ1 (x) − ρ2 (r)dr − |x|2
|x|
% 2|x|
≤ |x|ρ1 (|x|) − |ρ1 (|x|)|dr − |x|2
|x|
Now we return to the proof of the second fact of the theorem. Applying the
lemma above to u in Rd+1 , we deduce, that there exists a C 1 function v, such that
u − v has a strict maximum at (x0 , t0 ). Next we take the standard mollifier ηε and
denote v ε := ηε ∗ v. Then by the properties of mollification, we obtain:
⎧
ε
⎨v → v
⎪
D x v ε → Dx v (3.21)
⎪
⎩ ε
vt → vt ,
45
uniformly near (x0 , t0 ). Then it follows, that u − v ε has a maximum at some point
(xε , tε ), such that (xε , tε ) → (x0 , t0 ) as ε → 0. Since a mollification v ε is a smooth
function, we can apply the definition of a viscosity solution and deduce, that:
vtε (x0 , t0 ) + H(Dx v ε (x0 , t0 ), x0 ) ≤ 0. (3.22)
Passing to limit ε → 0, we get
vt (x0 , t0 ) + H(Dx v(x0 , t0 ), x0 ) ≤ 0. (3.23)
But since u is differentiable at (x0 , t0 ), at u − v attains a maximum at (x0 , t0 ), we
see that $
Dx u(x0 , t0 ) = Dx v(x0 , t0 )
(3.24)
ut (x0 , t0 ) = vt (x0 , t0 ).
so, also
ut (x0 , t0 ) + H(Dx u(x0 , t0 ), x0 ) ≤ 0. (3.25)
Finally, we can repeat the steps above for the function −u, then we’ll find a C 1
function ṽ, such that u − v has a strict local minimum at (x0 , t0 ). Then we can
deduce in a similar manner, that
ut (x0 , t0 ) + H(Dx u(x0 , t0 ), x0 ) ≥ 0, (3.26)
thus completing the proof.
Next we prove the stability result. In general, for non-linear first-order PDE, the
set of solutions is not necessarily closed in the topology of uniform convergence. We
may not conclude, that the uniform limit of solutions of some problem will be a solu-
tion itself without ensuring the uniform convergence of the corresponding gradients
too. However, in the case of viscosity solutions, we can skip this requirement on the
convergence of derivatives. The result is originally due to Crandall and Lions in [9].
Theorem 3.3 (Stability of Viscosity Solutions). Let Hn be a sequence of continu-
ous Hamiltonians converging in C(Rd × [0, ∞)) to H. Let un be a subsolution (or,
respectively, supersolution) of ut + Hn (Dx u(x, t), x) = 0, and let un → u uniformly
in Rd . Then u is a viscosity solution of ut + H(Dx u(x, t), x) = 0.
Proof. Before we prove the theorem, let us first prove the following lemma.
Lemma 3.2. Let u : Rd → R be a continuous function. Let v be a smooth function,
such that u − v has a strict local maximum (minumum) at x0 . If un is a sequence,
converging to u uniformly, then there exists a sequence of points xn , such that xn →
x0 with un (xn ) → u(x) and, moreover, um − v has a local maximum (minumum) at
xm .
46
Proof. We prove the result in case of u − v having a local maximum. Then for every
δ > 0 there exists a small enough εδ > 0, such that when |x − x0 | = δ
u(x) − v(x) < u(x0 ) − v(x0 ) − εδ . (3.27)
Now, by the uniform convergence of un , there exists a sufficiently large Nδ , such that
for all m ≥ Nδ , we have |um (x) − u(x)| < ϵ4δ . Then for |x − x0 | = δ
um (x) − v(x) − um (x0 ) + v(x0 )
= (um (x) − u(x)) + (u(x) − v(x)) − um (x0 ) + v(x0 )
ϵδ (3.28)
< + u(x0 ) − v(x0 ) − εδ − um (x0 ) + v(x0 )
4
ϵδ
<− .
2
Or, rewriting it,
um (x) − v(x) < um (x0 ) − v(x0 ) − εδ . (3.29)
Which means that um − v has a local maximum at some point xm with |xm − x| < δ.
We can construct the rest of xm by letting δ, εδ → 0.
Now, let’s prove the theorem in case of subsolutions (the case of supersolutions
is obtained in a similar manner). We take v, a test function, and let u − v have a
local maximum at (x0 , t0 ). Then by the lemma there exists a sequence (xm , tm ), such
that xm → x0 , such that um − v has a local maximum at (xm , tm ) and um (xm , tm ) →
u(x0 , t0 ). Then, since um is a subsolution, then
umt (xm , tm ) + Hm (Dx um (xm , tm ), xm ) ≤ 0. (3.30)
By letting m → ∞, we finish the proof, using the continuity:
ut (x0 , t0 )) + H(Dx u(x0 , t0 ), x0 ) ≤ 0. (3.31)
47
where H is a convex map, that satisfies the superlinearity condition:
H(p)
lim = +∞,
|p|→∞ |p|
Combining the estimates above, we obtain for t < t0 and (x, t) close to (x0 , t0 ).
! "
x0 − x
v(x0 , t0 ) − v(x, t) ≤ (t0 − t)L . (3.34)
t0 − t
If we introduce now h = t0 − t > 0 and let x = x0 − hq, then the inequality becomes
48
By dividing the inequality by h and letting h → 0, we obtain for all q ∈ Rd
Now, from convex duality and the definition of the Legendre transform, we have for
each p, q ∈ Rd
H(p) = sup {p · q − L(q)}.
q∈Rd
By formula (2.24), for small enough h there exists some x1 ∈ Rd , close to x0 , such
that ! "
x0 − x1
u(x0 , t0 ) = hL + u(x1 , t0 − h).
h
Next we compute
1
d
%
v(x0 , t0 ) − v(x1 , t0 − h) = v(sx0 + (1 − s)x1 , t0 + (s − 1)h)ds
ds
%0 1
= Dx v(sx0 + (1 − s)x1 , t0 + (s − 1)h) · (x0 − x1 )
0
+ vt (sx0 + (1 − s)x1 , t0 + (s − 1)h)hds
!% 1 "
=h Dx v · q + vt ds .
0
Here q = x0 −x
h
1
. If h > 0 is small enough, we can use our contradictory assumption
(3.37), and get
! "
x0 − x1
v(x0 , t0 ) − v(x1 , t0 − h) ≤ hL − δh,
h
49
which in its turn, means by
v(x0 , t0 ) − v(x1 , t0 − h) ≤ u(x0 , t0 ) − u(x1 , t0 − h) − δh,
which contradicts the local minimality of u − v at (x0 , t0 ). This contradiction proves
that u is a viscosity supersolution and hence a viscosity solution.
3.4 Uniqueness
In this and the following sections we’ll be following the chapters 10.2 and 10.3 from
[11]. The main focus of this chapter is the uniqueness result. We will show under
which assumptions is the viscosity solution of the problem (1.1) is unique. We’ll
consider a problem for fixed time T > 0:
$
ut + H(Dx u, x) = 0 in Rd × (0, T ]
(3.38)
u(x, 0) = g on Rd × {t = 0}
First we prove a lemma, showing that the inequalities from the definition of the
viscosity solution, can be extended to the terminal time T .
Lemma 3.3. Assume, u is a viscosity solution of (3.38) and u − v has a local
maximum (minimum) at a point (x0 , t0 ) ∈ Rd × (0, T ]. Then
vt (x0 , t0 ) + H(Dx v(x0 , t0 ), x0 ) ≤ 0(≥ 0) (3.39)
Proof. We assume that u − v has a strict local maximum at the point (x0 , T ) (we
can get rid of the strictness by the same technique we used in subsection 3.1. Define
for x ∈ Rd and 0 < t < T
ε
ṽ(x, t) := v(x, t) + .
T −t
Then for a small enough ε > 0, u − ṽ has a local maximum at (xε , tε ), where
0 < tε < T and (xε , tε ) → (x0 , T ). Therefore,
ṽt (x, t) + H(Dx ṽ(xε , tε ), xε ) ≤ 0,
and hence
ε
vt (x, t) + + H(Dx v(xε , tε ), xε ) ≤ 0.
(T − t)2
By letting now ε → 0, we get
vt (x0 , T ) + H(Dx v(x0 , T ), x0) < 0.
Which proves the theorem for the case of u − v having a local maximum. The case
of local minimum is proved by reversing the inequalities.
50
Now we are going to prove one of most important facts in the theory of viscosity
solutions for Hamilton-Jacobi equations.
for all x, y, p, q ∈ Rd and for some constant C ≥ 0. Then there exists at most one
viscosity solution of (1.1).
Proof. Assume, that we have u and ũ to be different viscosity solutions, that both
satisfy the initial condition, but
51
Moreover, ε(|x0 |2 + |y0|2 ) = O(1), and therefore
1 3
ε(|x0 | + |y0 |) = ε 4 ε 4 (|x0 | + |y0 |)
1 3
≤ ε 2 + Cε 2 (|x0 |2 + |y0 |2 ) (3.45)
1
≤ Cε 2
1
Which means, that ε(|x0 |+|y0|) = O(ε 2 ). Next, due to the fact, that Φ(x0 , y0 , t0 , s0 ) ≥
Φ(x0 , x0 , t0 , t0 ), we have
1
u(x0 , t0 ) − ũ(y0 , s0 ) − λ(t0 + s0 ) − (|x0 − y0 |2 + (t0 − s0 )2 )
ε2
− ε(|x0 |2 + |y0|2 ) ≥ u(x0 , t0 ) − ũ(x0 , t0 ) − 2λt0 − 2ε|x0 |2 .
Therefore,
1
(|x0 − y0 |2 + (t0 − s0 )2 ) ≤ ũ(x0 , t0 ) − ũ(y0 , s0 ) + λ(t0 − s0 )
ε2
+ ε(x0 + y0 ) · (x0 − y0 ).
By this and the uniform continuity of ũ, we have |x0 − y0 |, |t0 − s0 | = o(ε).
Next, let ω be a modulus of continuity of u, meaning, that for all x, y ∈ Rd , 0 ≤
t, s ≤ T , we have that
|u(x, t) − u(y, s)| ≤ ω(|x − y| + |t − s|), (3.46)
and ω(r) → 0 as r → 0. Similarly we define ω̃. Then
σ
≤ u(x0 , t0 ) − ũ(y0 , s0 ) = u(x0 , t0 ) − u(x0 , 0) + u(x0 , 0) − ũ(x0 , 0)
2
+ ũ(x0 , 0) − ũ(x0 , t0 ) + ũ(x0 , t0 ) − ũ(y0, s0 )
≤ ω(t0 ) + ω̃(t0 ) + ω̃(o(ε))
Now we can make ε even smaller, so that σ4 ≤ ω(t0 ) + ω̃(t0 ), which in turn implies,
that t0 , s0 ≥ µ > 0 for some constant µ.
Now, in the view of (3.42), the map (x, t) '→ Φ(x, y0 , t, s0 ) has a maximum at (x0 , t0 ),
then by definition of Φ, u − v has a maximum at (x0 , t0 ), where v is defined by
1
v(x, t) := ũ(y0 , s0 ) + λ(t + s0 ) + (|x − y0 |2 + (t − s0 )2 ) + ε(|x| + |y0 |2 ).
ε2
Since u is a viscosity solution, we have by definition
vt (x0 , t0 ) + H(Dx v(x0 , t0 ), x0 ) ≤ 0. (3.47)
52
Hence, ! "
2(t0 − s0 ) 2
λ+ +H (x0 − y0 ) + 2εx0 , x0 ≤ 0. (3.48)
ε2 ε2
Similarly, the map (y, s) ↦ −Φ(x_0, y, t_0, s) has a minimum at (y_0, s_0); then ũ − ṽ has a minimum at (y_0, s_0) for ṽ defined by
\[ \tilde{v}(y, s) := u(x_0, t_0) - \lambda(t_0 + s) - \frac{1}{\varepsilon^2}\big(|x_0 - y|^2 + (t_0 - s)^2\big) - \varepsilon\big(|x_0|^2 + |y|^2\big). \]
Again, since ũ is a viscosity solution, we have
\[ \tilde{v}_s(y_0, s_0) + H(D_y \tilde{v}(y_0, s_0), y_0) \ge 0. \]
Hence
\[ -\lambda + \frac{2(t_0 - s_0)}{\varepsilon^2} + H\!\left(\frac{2}{\varepsilon^2}(x_0 - y_0) - 2\varepsilon y_0,\; y_0\right) \ge 0. \tag{3.49} \]
Now, combining the two results, we get
\[ 2\lambda \le H\!\left(\frac{2}{\varepsilon^2}(x_0 - y_0) - 2\varepsilon y_0,\; y_0\right) - H\!\left(\frac{2}{\varepsilon^2}(x_0 - y_0) + 2\varepsilon x_0,\; x_0\right). \tag{3.50} \]
Letting ε → 0 and using the Lipschitz conditions on H together with the estimates |x_0 − y_0| = o(ε) and ε(|x_0| + |y_0|) = O(ε^{1/2}), the right-hand side of (3.50) tends to 0. We thus get 0 < λ ≤ 0, which contradicts the assumption and hence ends the proof.
3.5.1 Introduction
Optimal control theory is, in fact, an extension of the calculus of variations, dealing not just with an optimization problem, but with the control law for a system, such that the optimality conditions are achieved. Mathematically, the goal is to optimally control the solution of the following system of ordinary differential equations on a fixed time interval [t, T] (0 ≤ t < T) by changing some parameters:
\[ \begin{cases} \dot{x}(s) = f(x(s), \alpha(s)), & t < s < T \\ x(t) = x. \end{cases} \tag{3.52} \]
Here A denotes the set of admissible controls, i.e., functions α(·) of time taking values in a given compact set of control parameters; the running cost h and the terminal cost g below are assumed, like f, to be bounded and Lipschitz continuous. Due to the assumptions on f, the Picard-Lindelöf theorem yields that for each α ∈ A there exists a unique Lipschitz continuous solution x^α(s) on [t, T]. Such a solution is called the response of the system to the control α. We have to find the optimal control α*, but first we need to define "optimality". We do it with the help of the cost functional, defined for a control α, an initial condition x ∈ R^d in (3.52) and 0 ≤ t ≤ T by
\[ C_{x,t}[\alpha] := \int_t^T h(x^{\alpha}(s), \alpha(s))\,ds + g(x^{\alpha}(T)). \tag{3.54} \]
The least cost achievable from the point x at time t defines the value function
\[ u(x, t) := \inf_{\alpha \in A} C_{x,t}[\alpha]. \tag{3.55} \]
By researching this function's behavior, we are embedding the original control problem (3.52) into the larger class of similar problems, as we also let x and t vary. We named this function u for a reason, as it will turn out that u solves a PDE of Hamilton-Jacobi type. Our goal is to build this equation and to show that the solution of this PDE solves the optimal control problem for each x and t.
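The following small sketch makes these definitions concrete; it is an illustration only, and the dynamics f, the costs h and g, the control set and the restriction to constant controls are all assumptions chosen for the example rather than data from the text. The cost functional (3.54) is evaluated with a forward Euler discretization of (3.52), and the value function (3.55) is approximated by minimizing over a finite family of constant controls.

import numpy as np

T = 1.0
f = lambda x, a: a                        # assumed dynamics: x' = a
h = lambda x, a: x**2                     # assumed running cost
g = lambda x: x**2                        # assumed terminal cost
controls = np.linspace(-1.0, 1.0, 21)     # finite family of constant control values

def cost(x, t, a, n_steps=200):
    """Forward Euler approximation of C_{x,t}[alpha] for the constant control alpha = a."""
    ds = (T - t) / n_steps
    total = 0.0
    for _ in range(n_steps):
        total += h(x, a) * ds             # accumulate the running cost
        x = x + f(x, a) * ds              # advance the state by one Euler step
    return total + g(x)                   # add the terminal cost

def value(x, t):
    """Crude approximation of u(x, t): infimum of the cost over the chosen constant controls."""
    return min(cost(x, t, a) for a in controls)

print(value(0.5, 0.0))                    # controls steering the state towards 0 are cheapest here

Restricting the infimum to constant controls of course only produces an upper bound for the true value function; the sketch is merely meant to show how (3.54) and (3.55) are used.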
Theorem 3.6 (Optimality Conditions). For all h > 0, such that t + h < T, we get
\[ u(x, t) = \inf_{\alpha \in A} \left\{ \int_t^{t+h} h(x^{\alpha}(s), \alpha(s))\,ds + u(x^{\alpha}(t + h),\, t + h) \right\}. \tag{3.56} \]
The corresponding solution x_3 will have the following form (by the uniqueness of solutions to (3.52)):
\[ x_3(s) = \begin{cases} x_1(s), & \text{if } t \le s < t + h \\ x_2(s), & \text{if } t + h \le s \le T. \end{cases} \tag{3.59} \]
Since α_1 was arbitrary, we get
\[ u(x, t) \le \inf_{\alpha \in A} \left\{ \int_t^{t+h} h(x^{\alpha}(s), \alpha(s))\,ds + u(x^{\alpha}(t + h),\, t + h) \right\} + \varepsilon. \tag{3.60} \]
To get the reversed inequality and complete the proof, choose a new ε > 0, and then α_4 ∈ A with the corresponding solution x_4 for the initial time t, such that
\[ u(x, t) + \varepsilon \ge \int_t^T h(x_4(s), \alpha_4(s))\,ds + g(x_4(T)), \tag{3.61} \]
Lemma 3.4. The value function u, defined by (3.55), is bounded and Lipschitz continuous.
Take ε > 0. We choose α̂ ∈ A and corresponding solutions x̂(s) and x(s) to (3.52) with initial values x̂ and x, so that
\[ u(\hat{x}, t) + \varepsilon \ge \int_t^T h(\hat{x}(s), \hat{\alpha}(s))\,ds + g(\hat{x}(T)). \tag{3.64} \]
Using this result, the Lipschitz continuity of h and g, and the previous estimate (3.64), we deduce
\[
\begin{aligned}
u(x, t) - u(\hat{x}, t) &\le \int_t^T \big( h(x(s), \hat{\alpha}(s)) - h(\hat{x}(s), \hat{\alpha}(s)) \big)\,ds + g(x(T)) - g(\hat{x}(T)) + \varepsilon \\
&\le C_h \int_t^T |\hat{x}(s) - x(s)|\,ds + C_g\,|\hat{x}(T) - x(T)| + \varepsilon \\
&\le C_0\,|x - \hat{x}| + \varepsilon.
\end{aligned}
\]
If we repeat all the steps reversing the roles of x and x̂, we get the Lipschitz continuity
of u:
|u(x, t) − u(x̂, t)| ≤ C0 |x − x̂|. (3.65)
Next take x ∈ R^d and 0 ≤ t < t̂ ≤ T. Let ε > 0 and choose α ∈ A with a corresponding solution x(s), such that
\[ u(x, t) + \varepsilon \ge \int_t^T h(x(s), \alpha(s))\,ds + g(x(T)). \tag{3.66} \]
Define the shifted control α̂(s) := α(s + t − t̂) for t̂ ≤ s ≤ T, and let x̂ be the corresponding response with x̂(t̂) = x. Then x̂(s) = x(s + t − t̂). Therefore,
\[
\begin{aligned}
u(x, \hat{t}) - u(x, t) &\le \int_{\hat{t}}^T h(\hat{x}(s), \hat{\alpha}(s))\,ds + g(\hat{x}(T)) - \int_t^T h(x(s), \alpha(s))\,ds - g(x(T)) + \varepsilon \\
&= -\int_{T + t - \hat{t}}^T h(x(s), \alpha(s))\,ds + g(x(T + t - \hat{t})) - g(x(T)) + \varepsilon \\
&\le C\,|t - \hat{t}| + \varepsilon.
\end{aligned}
\]
For the reverse estimate, choose now, for the initial time t̂, a control α̂ ∈ A with the corresponding response x̂, such that
\[ u(x, \hat{t}) + \varepsilon \ge \int_{\hat{t}}^T h(\hat{x}(s), \hat{\alpha}(s))\,ds + g(\hat{x}(T)), \]
where
\[ \begin{cases} \dot{\hat{x}}(s) = f(\hat{x}(s), \hat{\alpha}(s)) \\ \hat{x}(\hat{t}) = x. \end{cases} \tag{3.69} \]
Then we define
\[ \alpha(s) = \begin{cases} \hat{\alpha}(s + \hat{t} - t), & \text{if } t \le s \le T + t - \hat{t} \\ \hat{\alpha}(T), & \text{if } T + t - \hat{t} \le s \le T, \end{cases} \]
and let x be the solution corresponding to α. Then x(s) = x̂(s + t̂ − t) for t ≤ s ≤ T + t − t̂. Therefore
\[
\begin{aligned}
u(x, t) - u(x, \hat{t}) &\le \int_t^T h(x(s), \alpha(s))\,ds + g(x(T)) - \int_{\hat{t}}^T h(\hat{x}(s), \hat{\alpha}(s))\,ds - g(\hat{x}(T)) + \varepsilon \\
&= \int_{T + t - \hat{t}}^T h(x(s), \alpha(s))\,ds + g(x(T)) - g(x(T + t - \hat{t})) + \varepsilon \\
&\le C\,|t - \hat{t}| + \varepsilon.
\end{aligned}
\]
where the Hamiltonian H is defined for x, p ∈ R^d as
\[ H(p, x) := \min_{a \in A} \{ f(x, a)\cdot p + h(x, a) \}. \]
Remark. Before we proceed with the discussion, we have to note that in the case of a terminal-value problem for the Hamilton-Jacobi equation, the definition of the viscosity solution is a bit different. We say that a uniformly continuous bounded function u, having the terminal value u = g on R^d × {t = T}, is a viscosity subsolution of (3.70), provided that for each v ∈ C^∞(R^d × (0, T))
\[ \begin{cases} \text{if } u - v \text{ has a local maximum at } (x_0, t_0) \in \mathbb{R}^d \times (0, T), \\ \text{then } v_t(x_0, t_0) + H(D_x v(x_0, t_0), x_0) \ge 0. \end{cases} \tag{3.72} \]
Now we'll prove one of the major results of control theory, which connects the Hamilton-Jacobi equations and the optimal control problem.
Theorem 3.7. The value function u is the unique viscosity solution of the terminal value problem for the Hamilton-Jacobi-Bellman equation:
\[ \begin{cases} u_t + \min_{a \in A} \{ f(x, a)\cdot D_x u + h(x, a) \} = 0 & \text{in } \mathbb{R}^d \times (0, T) \\ u = g & \text{on } \mathbb{R}^d \times \{t = T\}. \end{cases} \tag{3.74} \]
Proof. First, by Lemma 3.4, u is bounded and Lipschitz continuous. Moreover, by the definition of u, for each x ∈ R^d we have u(x, T) = g(x).
We prove first that u is a viscosity subsolution of the given problem. We'll prove it by contradiction. Assume v ∈ C^∞(R^d × (0, T)), and assume u − v has a local maximum at (x_0, t_0) ∈ R^d × (0, T). Suppose the inequality from the definition of the subsolution doesn't hold, so there exist some δ > 0 and a ∈ A such that
\[ v_t(x, t) + f(x, a)\cdot D_x v(x, t) + h(x, a) \le -\delta \]
for all (x, t) sufficiently close to (x_0, t_0):
|x − x0 | + |t − t0 | < ϵ. (3.77)
Since u − v has a local maximum at (x_0, t_0), we can choose ϵ above such that for all (x, t) satisfying the inequality above the following is true:
\[ (u - v)(x, t) \le (u - v)(x_0, t_0). \]
Now let's take a constant control α(s) = a for t_0 ≤ s ≤ T and the corresponding solution x(s):
\[ \begin{cases} \dot{x}(s) = f(x(s), a) \\ x(t_0) = x_0. \end{cases} \tag{3.79} \]
Now choose 0 < h < ϵ so small that |x(s) − x_0| ≤ ϵ for t_0 ≤ s ≤ t_0 + h. Then, according to the estimates above, for each s with t_0 ≤ s ≤ t_0 + h, we have
This contradicts the assumption, and hence proves that u is indeed a viscosity subsolution.
Now we assume that u − v has a local minimum at (x_0, t_0) ∈ R^d × (0, T). Again we try to contradict the definition of a viscosity supersolution and assume that there is δ > 0 such that for all a ∈ A
\[ v_t(x, t) + f(x, a)\cdot D_x v(x, t) + h(x, a) \ge \delta \tag{3.82} \]
for (x, t) sufficiently close to (x_0, t_0):
\[ |x - x_0| + |t - t_0| \le \epsilon, \]
where ϵ is chosen so that, by the definition of a local minimum at (x_0, t_0), we have
\[ (u - v)(x, t) \ge (u - v)(x_0, t_0). \]
Next choose 0 < h < ϵ, so that |x(s) − x_0| < ϵ for t_0 ≤ s ≤ t_0 + h, where x(s) is a solution corresponding to some α ∈ A:
\[ \begin{cases} \dot{x}(s) = f(x(s), \alpha(s)) \\ x(t_0) = x_0. \end{cases} \]
Next, we estimate
\[
\begin{aligned}
u(x(t_0 + h), t_0 + h) - u(x_0, t_0) &\ge v(x(t_0 + h), t_0 + h) - v(x_0, t_0) \\
&= \int_{t_0}^{t_0 + h} \frac{d}{ds}\, v(x(s), s)\, ds \\
&= \int_{t_0}^{t_0 + h} \big( v_t(x(s), s) + f(x(s), \alpha(s))\cdot D_x v(x(s), s) \big)\, ds,
\end{aligned} \tag{3.83}
\]
4 Numerical Approach
In the previous chapter we talked about the method of vanishing viscosity for approximating the solutions to the Hamilton-Jacobi equation, but it is not the only way to find an approximate solution. There are also many numerical methods for solving the general version of the equation, as well as its particular versions, such as the eikonal equation. In this final chapter we'll see how some of the numerical methods agree with the theory presented in the previous chapters. As this chapter has an illustrative mission, most proofs will be omitted or referred to the sources.
1. The monotone schemes are stable and convergent to the viscosity solution of the initial value problem.
2. The error between the numerical solution and the viscosity solution is at most O(h^{1/2}) in the L^∞-norm.
In general, the monotone schemes are the simplest numerical methods for solving Hamilton-Jacobi equations, and although they have the nice properties above, they still fail to achieve the desired order of accuracy for smooth solutions, since their accuracy is at most O(h).
The schemes are written in terms of a numerical Hamiltonian
\[ \hat{H} := \hat{H}(p^-, p^+) = \hat{H}(p_1^-, \ldots, p_d^-,\, p_1^+, \ldots, p_d^+), \]
where p^- and p^+ denote the backward and forward difference approximations of the partial derivatives of u.
Definition 4.1. The Lax-Friedrichs Hamiltonian is defined by
\[ \hat{H}^{LF1}(p^-, p^+) = H\!\left(\frac{p^- + p^+}{2}\right) - \alpha\,\frac{p^+ - p^-}{2}, \tag{4.4} \]
where α is a constant (the artificial viscosity) that has to satisfy the following condition:
\[ \alpha \ge \max_p |H'(p)|. \]
The other version can be given as
\[ \hat{H}^{LF2}(p^-, p^+) = \frac{1}{2}\big(H(p^-) + H(p^+)\big) - \alpha\,\frac{p^+ - p^-}{2}. \tag{4.5} \]
Remark. One can see that this numerical Hamiltonian is indeed consistent with the original H, as
\[ \hat{H}^{LF1}(p, p) = H\!\left(\frac{p + p}{2}\right) - \alpha\,\frac{p - p}{2} = H(p), \]
and it is also monotone (nondecreasing in p^- and nonincreasing in p^+), thanks to the choice of the artificial viscosity α ≥ max |H'(p)|.
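As a quick illustration of the two definitions (4.4) and (4.5), here is a minimal sketch; the concrete Hamiltonian H(p) = |p| − 1 and the function names are assumptions made for the example, not part of the text.

def H(p):
    # assumed example Hamiltonian of eikonal type: H(p) = |p| - 1
    return abs(p) - 1.0

def H_LF1(p_minus, p_plus, alpha=1.0):
    # Lax-Friedrichs Hamiltonian (4.4): H((p- + p+)/2) - alpha*(p+ - p-)/2
    return H(0.5 * (p_minus + p_plus)) - alpha * 0.5 * (p_plus - p_minus)

def H_LF2(p_minus, p_plus, alpha=1.0):
    # second variant (4.5): (H(p-) + H(p+))/2 - alpha*(p+ - p-)/2
    return 0.5 * (H(p_minus) + H(p_plus)) - alpha * 0.5 * (p_plus - p_minus)

# consistency check: with equal arguments both variants reduce to H(p)
print(H_LF1(0.3, 0.3), H_LF2(0.3, 0.3), H(0.3))

For this H we have max |H'(p)| = 1, so the default alpha = 1 satisfies the condition on the artificial viscosity.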
Definition 4.2. The Godunov Hamiltonian (or Godunov monotone flux) is defined by
\[ \hat{H}^{G}(p^-, p^+) = \begin{cases} \min_{p^- \le p \le p^+} H(p), & \text{if } p^- \le p^+ \\ \max_{p^+ \le p \le p^-} H(p), & \text{if } p^- > p^+. \end{cases} \tag{4.6} \]
Remark. This scheme is more difficult to program due to the min and max involved; however, in the case of a convex Hamiltonian it turns out to be quite efficient.
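A rough way to sidestep the min/max difficulty in a quick experiment is simply to sample the interval I(p^-, p^+); the sketch below does this for an assumed convex Hamiltonian H(p) = p²/2 (a production code would instead locate the extrema of H analytically, which for a convex H means checking whether its critical point lies in the interval).

import numpy as np

def H(p):
    # assumed convex example Hamiltonian
    return 0.5 * p**2

def H_godunov(p_minus, p_plus, samples=1001):
    # Godunov Hamiltonian (4.6): min of H on [p-, p+] if p- <= p+, max on [p+, p-] otherwise
    grid = np.linspace(min(p_minus, p_plus), max(p_minus, p_plus), samples)
    values = H(grid)
    return values.min() if p_minus <= p_plus else values.max()

print(H_godunov(-1.0, 2.0))   # 0.0: the critical point p = 0 lies inside [-1, 2]
print(H_godunov(2.0, -1.0))   # 2.0: the maximum of H on [-1, 2]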
Definition 4.3. Denote I(a, b) = [min(a, b), max(a, b)]; then the Roe Hamiltonian is defined by
\[ \hat{H}^{R}(p^-, p^+) = \begin{cases} H(p^-), & \text{if } H'(p) \ge 0 \text{ for all } p \in I(p^-, p^+) \\ H(p^+), & \text{if } H'(p) \le 0 \text{ for all } p \in I(p^-, p^+) \\ H\!\left(\dfrac{p^- + p^+}{2}\right) - \alpha\,\dfrac{p^+ - p^-}{2}, & \text{if } H'(p) \text{ changes sign in } I(p^-, p^+). \end{cases} \tag{4.7} \]
Next we'll work out a simple example for the eikonal equation to illustrate that the scheme converges to the viscosity solution.
Example 4.1. Remember the problem for the eikonal equation
\[ \begin{cases} |u'(x)| = 1 & \text{on } (-1, 1) \\ u(-1) = u(1) = 0. \end{cases} \]
We know that the unique viscosity solution to this problem is u(x) = 1 − |x|. Now we want to apply the first order monotone scheme to see what results it will yield. We'll use the Lax-Friedrichs Hamiltonian. In our case H(p) = |p| − 1. The artificial viscosity α can immediately be taken as α = max |H'(p)| = 1, hence the numerical Hamiltonian in our case looks like
\[ \hat{H}(p^-, p^+) = \left|\frac{p^- + p^+}{2}\right| - 1 - \frac{p^+ - p^-}{2}. \]
The mesh we'll work with is going to have h = 0.5, so we have x_0 = −1, x_1 = −0.5, x_2 = 0, x_3 = 0.5, x_4 = 1. Due to the boundary conditions we have u_0 = u_4 = 0.
Next we calculate the one-sided (forward and backward) difference approximations of u':
\[ u_1'^{+} = \frac{u_2 - u_1}{0.5} = 2(u_2 - u_1), \qquad u_1'^{-} = \frac{u_1 - u_0}{0.5} = 2u_1, \]
\[ u_2'^{+} = \frac{u_3 - u_2}{0.5} = 2(u_3 - u_2), \qquad u_2'^{-} = \frac{u_2 - u_1}{0.5} = 2(u_2 - u_1), \]
\[ u_3'^{+} = \frac{u_4 - u_3}{0.5} = -2u_3, \qquad u_3'^{-} = \frac{u_3 - u_2}{0.5} = 2(u_3 - u_2). \]
Next we evaluate the numerical Hamiltonian at these arguments and build the system of equations. Requiring Ĥ(u_i'^-, u_i'^+) = 0 at the interior nodes, and moving the constant 1 to the right-hand side, we get
\[ |u_2| + 2u_1 - u_2 = 1, \]
\[ |u_3 - u_1| + 2u_2 - u_1 - u_3 = 1, \]
\[ |u_2| + 2u_3 - u_2 = 1. \]
From the first and the third equations we can see that u_1 = u_3. Then we get
\[ |u_2| + 2u_1 - u_2 = 1, \qquad 2u_2 - 2u_1 = 1. \]
It follows, after summing up the equations, that |u_2| + u_2 = 2. So u_2 = 1, and hence u_1 = u_3 = 1/2, which are exactly the values of the viscosity solution at the corresponding points of the mesh! So even on such a coarse mesh the method gives a numerical solution which is very close to the unique viscosity solution. The figure below illustrates this.
[Figure: the solution plotted on the interval [−1, 1].]
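The small computation below reproduces the example numerically; it is assumed illustrative code, not part of the thesis. The discrete Lax-Friedrichs system is driven to its steady state by an explicit pseudo-time iteration u_i ← u_i − Δτ Ĥ(p^-, p^+), and it converges to exactly the values u_1 = u_3 = 1/2, u_2 = 1 computed above.

import numpy as np

h, alpha, dtau = 0.5, 1.0, 0.1
u = np.zeros(5)                           # u[0] = u[4] = 0 are the boundary values, kept fixed

def H_hat(pm, pp):
    # Lax-Friedrichs Hamiltonian for H(p) = |p| - 1
    return abs(0.5 * (pm + pp)) - 1.0 - alpha * 0.5 * (pp - pm)

for _ in range(2000):
    u_new = u.copy()
    for i in range(1, 4):                 # interior nodes x_1, x_2, x_3
        pm = (u[i] - u[i - 1]) / h        # backward difference
        pp = (u[i + 1] - u[i]) / h        # forward difference
        u_new[i] = u[i] - dtau * H_hat(pm, pp)
    u = u_new

print(u)   # approximately [0, 0.5, 1, 0.5, 0], i.e. u(x) = 1 - |x| at the mesh points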
Here H̃ is the full discretization of the spatial operator H. We have the initial conditions, which can be rewritten in this form as
\[ u(0) = (u_1(0), \ldots, u_{n-1}(0)) = (g(x_1), \ldots, g(x_{n-1})) =: (g_1, \ldots, g_{n-1}). \]
We choose the time step Δt and denote u^{(j)} = u(jΔt). The idea of Euler's method is simply to use a one-sided difference approximation of the time derivative, meaning that the scheme (4.8) can be approximated as
\[ \frac{u^{(j+1)} - u^{(j)}}{\Delta t} + \tilde{H}(u^{(j)}) = 0, \]
which is called a forward Euler method, or as
\[ \frac{u^{(j+1)} - u^{(j)}}{\Delta t} + \tilde{H}(u^{(j+1)}) = 0, \]
which is called a backward Euler method.
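As a sketch of how the forward Euler method is applied to a semidiscrete scheme, the following assumes the model problem u_t + |u_x| = 0 with a hat-shaped initial datum and the Lax-Friedrichs spatial discretization introduced earlier; all of these concrete choices, as well as the time step, are assumptions made for the illustration.

import numpy as np

n, alpha = 101, 1.0
x = np.linspace(-2.0, 2.0, n)
h = x[1] - x[0]
u = np.maximum(0.0, 1.0 - np.abs(x))       # initial data u^(0) = g(x) on the grid

def H_tilde(u):
    # Lax-Friedrichs discretization of H(u_x) = |u_x| at the interior nodes
    pm = (u[1:-1] - u[:-2]) / h             # backward differences
    pp = (u[2:] - u[1:-1]) / h              # forward differences
    return np.abs(0.5 * (pm + pp)) - alpha * 0.5 * (pp - pm)

dt = 0.4 * h / alpha                        # small time step (a CFL-type restriction)
for _ in range(50):
    u[1:-1] = u[1:-1] - dt * H_tilde(u)     # forward Euler: u^(j+1) = u^(j) - dt * H_tilde(u^(j))

# A backward Euler step would instead require solving the nonlinear system
# u^(j+1) + dt * H_tilde(u^(j+1)) = u^(j) at every time step.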
The local error of such a method is proportional to Δt², and although the Euler method is very simple and easy to program, it is generally effective only for small time steps. It also serves as a basis for further methods; we'll discuss some of these improvements.
The simplest one is called the midpoint method. First we rewrite the forward Euler scheme as
\[ u^{(j+1)} = u^{(j)} - \Delta t\, \tilde{H}(u^{(j)}). \]
The idea of the midpoint method is to insert an additional evaluation for the argument of H̃: instead of u^{(j)} we'll use the mid-point approximation that comes from a half-step of the forward Euler method,
\[ u^{(j+\frac{1}{2})} = u^{(j)} - \frac{\Delta t}{2}\, \tilde{H}(u^{(j)}). \]
So the mid-point scheme will look like:
\[ u^{(j+1)} = u^{(j)} - \Delta t\, \tilde{H}\!\left(u^{(j)} - \frac{\Delta t}{2}\, \tilde{H}(u^{(j)})\right). \]
We can continue such an argument further and that will lead to the family of explicit Runge-Kutta methods, which in general look like this for the semidiscrete scheme (4.8):
\[
\begin{aligned}
k_1 &= -\tilde{H}(u^{(j)}) \\
k_2 &= -\tilde{H}(u^{(j)} + \Delta t\, a_{21} k_1) \\
&\;\;\vdots \\
k_s &= -\tilde{H}\big(u^{(j)} + \Delta t\,(a_{s1} k_1 + a_{s2} k_2 + \ldots + a_{s,s-1} k_{s-1})\big) \\
u^{(j+1)} &= u^{(j)} + \Delta t\,(b_1 k_1 + b_2 k_2 + \ldots + b_s k_s).
\end{aligned}
\]
The parameters a_{sj} and b_s are chosen so that a given local truncation error requirement is met. In general, one needs s ≥ p stages for the local error to be O(Δt^{p+1}) (and strictly more than p stages once p ≥ 5). Further information about Runge-Kutta methods can be found in [4].
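The general form above translates almost directly into code. The sketch below is an assumed illustration (not taken from [4]): it performs one explicit Runge-Kutta step from the coefficients a_{sj} and b_s, and tests the two-stage midpoint tableau on the toy problem u' = −u.

import numpy as np

def rk_step(u, dt, H_tilde, A, b):
    # one explicit RK step for u' = -H_tilde(u): k_i = -H_tilde(u + dt * sum_j a_{ij} k_j)
    k = []
    for i in range(len(b)):
        stage = u + dt * sum(A[i][j] * k[j] for j in range(i))
        k.append(-H_tilde(stage))
    return u + dt * sum(bi * ki for bi, ki in zip(b, k))

# midpoint (second order) tableau: a_21 = 1/2, b = (0, 1)
A_mid = [[0.0, 0.0],
         [0.5, 0.0]]
b_mid = [0.0, 1.0]

u, dt = np.array([1.0]), 0.1               # toy scalar problem with H_tilde(u) = u, i.e. u' = -u
for _ in range(10):
    u = rk_step(u, dt, lambda v: v, A_mid, b_mid)
print(u, np.exp(-1.0))                     # close to the exact value e^{-1}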
1. Choose two mesh regions S_2^- = {x_{i-1}, x_i}, S_2^+ = {x_i, x_{i+1}} that contain two grid points and will be used to approximate the left and the right derivatives.
2. Next we find two interpolating polynomials p_-(x) and p_+(x), such that p_-(x_{i-1}) = u_{i-1}, p_-(x_i) = u_i and p_+(x_i) = u_i, p_+(x_{i+1}) = u_{i+1}. The polynomials have the following form (we don't calculate the constants b_1 and b_2, as they are not relevant for the discussion):
\[ p_-(x) = \frac{u_i - u_{i-1}}{x_i - x_{i-1}}\,x + b_1 = \frac{u_i - u_{i-1}}{h}\,x + b_1, \]
\[ p_+(x) = \frac{u_{i+1} - u_i}{x_{i+1} - x_i}\,x + b_2 = \frac{u_{i+1} - u_i}{h}\,x + b_2. \]
Now the idea of the higher order schemes is to use higher order approximating poly-
nomials. Let’s work out the idea in terms of using mesh regions with four points.
1. Choose two mesh regions S_4^- = {x_{i-2}, x_{i-1}, x_i, x_{i+1}} and S_4^+ = {x_{i-1}, x_i, x_{i+1}, x_{i+2}} that contain four grid points each.
2. The conditions for calculating the coefficients of the two interpolating polynomials p_-(x) and p_+(x) are that they interpolate the data u_j at the points of S_4^- and S_4^+ respectively. The polynomials will be of degree three. We'll need p_-'(x_i) and p_+'(x_i) for calculating the derivatives. We calculate it for p_- by the Lagrange method (a description of the method can be found in [18]). The Lagrange polynomial
in our case will look like
\[
\begin{aligned}
p_-(x) &= u_{i-2}\,\frac{(x - x_{i-1})(x - x_i)(x - x_{i+1})}{(x_{i-2} - x_{i-1})(x_{i-2} - x_i)(x_{i-2} - x_{i+1})} + u_{i-1}\,\frac{(x - x_{i-2})(x - x_i)(x - x_{i+1})}{(x_{i-1} - x_{i-2})(x_{i-1} - x_i)(x_{i-1} - x_{i+1})} \\
&\quad + u_i\,\frac{(x - x_{i-2})(x - x_{i-1})(x - x_{i+1})}{(x_i - x_{i-2})(x_i - x_{i-1})(x_i - x_{i+1})} + u_{i+1}\,\frac{(x - x_{i-2})(x - x_{i-1})(x - x_i)}{(x_{i+1} - x_{i-2})(x_{i+1} - x_{i-1})(x_{i+1} - x_i)} \\
&= -\frac{(x - x_{i-1})(x - x_i)(x - x_{i+1})}{6h^3}\,u_{i-2} + \frac{(x - x_{i-2})(x - x_i)(x - x_{i+1})}{2h^3}\,u_{i-1} \\
&\quad - \frac{(x - x_{i-2})(x - x_{i-1})(x - x_{i+1})}{2h^3}\,u_i + \frac{(x - x_{i-2})(x - x_{i-1})(x - x_i)}{6h^3}\,u_{i+1}.
\end{aligned}
\]
In a similar manner we can write down p+ .
In this manner one can design finite difference schemes of any order. These schemes are particularly useful in the case of smooth solutions.
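As a small numerical check of the four-point construction (the sample function sin and the use of numpy's polynomial fitting are assumptions made for the illustration), one can fit the cubic interpolant through x_{i-2}, ..., x_{i+1} and compare its derivative at x_i with the simple two-point backward difference:

import numpy as np

h = 0.1
xs = np.array([-2 * h, -h, 0.0, h])        # x_{i-2}, x_{i-1}, x_i, x_{i+1}, with x_i = 0
us = np.sin(xs)                            # sample smooth data u_j = sin(x_j)

coeffs = np.polyfit(xs, us, 3)             # cubic interpolant through the four points
left_deriv = np.polyval(np.polyder(coeffs), 0.0)

print(left_deriv)                          # higher order approximation of u'(x_i) = cos(0) = 1
print((us[2] - us[1]) / h)                 # the two-point backward difference, first order only
print(np.cos(0.0))                         # exact derivative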
Further information about the higher order schemes can be found in [23]. Possible improvements and adjustments to the Lax-Friedrichs method, applied to the Hamilton-Jacobi equation, can be found in [19] and [3]. An extensive theoretical background on monotone schemes for the Hamilton-Jacobi equation, with proofs of their fine properties, can be found in [7].
References
[1] Arnold V., Mathematical Methods Of Classical Mechanics, Springer, 1986.
[2] Bellman, R. E., Dynamic Programming and a new formalism in the calculus of variations, Proc. Natl. Acad. Sci. 40 (4), (1954), 231-235.
[3] Breuss, M., The Correct Use of the Lax-Friedrichs Method, ESAIM: Mathematical
Modelling and Numerical Analysis, 38.3 (2010), 519-540.
[4] Butcher, John C., Numerical Methods for Ordinary Differential Equations, New
York: John Wiley And Sons, 2008.
[6] Crandall M.G., Evans L.C., Lions P.-L., Some Properties of Viscosity Solutions
of Hamilton-Jacobi Equations, Trans. Amer. Math. Soc., 282 (1984) 487-502.
[8] Crandall M.G., Ishii H., Lions P.-L., User’s Guide to Viscosity Solutions of Second
Order Partial Differential Equations, Bull. Amer. Math. Soc., 27 (1992) 1-67.
[11] Evans L.C., Partial Differential Equations, vol. 19 of Graduate Studies in Math-
ematics, American Mathematical Society, Providence, RI, 1998.
[12] Field J.H., Derivation of the Schrödinger equation from the Hamilton-Jacobi
equation in Feynman’s path integral formulation of quantum mechanics, Eur. J.
Phys. 32 (2011) 63-86.
[13] Friedman A., The Cauchy problem for first order partial differential equations,
Indiana Univ. Math. J. 23, 27-40 (1973).
[15] Hamilton, W. R., On a general method in dynamics; by which the study of the
motions of all free systems of attracting or repelling points is reduced to the search
and differentiation of one central relation, or characteristic function, Philosoph-
ical Transactions of the Royal Society, p. 247-308, 1834.
[17] Hopf E., Generalized solutions of nonlinear equations of first order, J. Math.
Mech. 14 (1965) 951-973.
[19] Kao, C., Osher, S., Qian, J., Lax-Friedrichs Sweeping Scheme for Static
Hamilton-Jacobi Equations, UCLA CAM report (2003).
[20] Landau L. D., Lifshitz E. M., Mechanics, Pergamon Press, Oxford, 1969.
[21] Lax P.D., Hyperbolic systems of conservation laws II, Commun. Pure Appl.
Math., 10 (1957) 537-566.
[22] Lax P.D., Weak Solutions of Nonlinear Hyperbolic Equations and Their Numer-
ical Approximation, Comm. Pure Appl. Math., 7 (1954), 159-193.
[23] Osher, S., Shu, C.-W., High-order Essentially Nonoscillatory Schemes for
Hamilton-Jacobi Equations, SIAM Journal on Numerical Analysis, 28 (1991),
907-922.
[24] Torby B., Energy Methods, Advanced Dynamics for Engineers, HRW Series in
Mechanical Engineering, USA: CBS College Publishing, 1984.