Notes 220, 2017
Lenya Ryzhik∗
December 5, 2017
Essentially nothing found here is original except for a few mistakes and misprints here and there. These lecture notes are based on material from the following books: L. Evans, "Partial Differential Equations", and Y. Pinchover and J. Rubinstein, "An Introduction to Partial Differential Equations".
1.1 A triviality
The very simplest partial differential equation is, probably,
∂u/∂t = 0, (1.1)
for a function u(t, x) that depends on two variables (but x does not appear in (1.1), of
course). However, there is almost nothing interesting to say about (1.1) apart from observing
that u(t, x) is an arbitrary function of x and does not depend on t. Still, we can draw a
couple of conclusions from (1.1), that we will soon generalize to somewhat more interesting
equations, before abandoning (1.1) for good. First, (1.1), without any extra conditions, has infinitely many solutions – a function of the form u(t, x) = g(x) solves (1.1) for an arbitrary function g(x). Therefore, to get a unique solution we need to impose some condition apart from (1.1) on the function u(t, x). One such constraint is to prescribe the initial data:
u(0, x) = f (x). (1.2)
It is easy to see that the solution of (1.1)-(1.2) is unique: u(t, x) = f (x). Another possibility is
to add a boundary condition to (1.1) (usually if t and x play the role of “time” and “space”,
respectively, then we talk about “initial conditions” if the function is prescribed at t = 0, and
about “boundary conditions” if the function is prescribed at x = 0, or some other fixed point
in x):
u(t, 0) = g(t). (1.3)
∗ Department of Mathematics, Stanford University, Stanford CA 94305, USA; [email protected]
However, as one can check immediately, (1.1) with the boundary condition (1.3) has a solution
only if g(t) = g0 , that is, g(t) is a constant independent of t. Moreover, in that case the solution is not unique – any function u(t, x) = r(x) with r(0) = g0 would solve (1.1), (1.3) provided
that g(t) is a constant. Another example of a “good” curve to impose some condition is the
line x = t: if we add the constraint
u(t, t) = v(t), (1.4)
with a prescribed function v(t) to (1.1) then the solution is unique: u(t, x) = v(x). A quick lesson
to remember is that adding a boundary, or initial, or mixed condition even to an absurdly
simple PDE like (1.1) may lead to existence of a unique solution, no solutions, or infinitely
many solutions.
Another, slightly more obscure lesson to learn is that whether we add (1.2) or (1.4) as an
extra condition to (1.1), we have |u(t, x)| ≤ max_y |f (y)| and |u(t, x)| ≤ max_s |v(s)|, respectively, for all t and x – the solution everywhere does not exceed its maximum on the surface where it is prescribed. Moreover, this equation preserves positivity: if f (x) ≥ 0 then u(t, x) ≥ 0
it is prescribed. Moreover, this equation preserves positivity: if f (x) ≥ 0 then u(t, x) ≥ 0
for all t and x, as well. These properties are versions of the maximum principle and we will
encounter them soon in much less trivial settings.
1.2 The transport equation
Consider next the transport equation
∂φ/∂t + u · ∇x φ = 0, x ∈ Rⁿ, t ≥ 0. (1.5)
Here φ(t, x) is the unknown function, and u = (u1 , . . . , un ) is a constant vector in Rn , known
as a “drift” – the terminology will become clear later. The notation u · v, with u, v ∈ Rn ,
denotes the standard inner product in Rn :
u · v = u1 v1 + . . . + un vn .
The key observation is that the function z(s) = φ(t + s, x + su) is constant in s, for any x and t fixed. Indeed, using (1.5) we see that if φ(t, x) solves (1.5) then
dz/ds = (∂φ/∂t)(t + s, x + su) + u · ∇φ(t + s, x + su) = 0.
Therefore, if we take any point (t, x) ∈ Rn+1 and draw a line Lt,x = {(t + s, x + us), s ∈ R}
(known as a characteristic) in Rn+1 , then the function φ(t, x) is constant along Lt,x . This gives
a hint of what kind of initial or boundary value problems we can solve for (1.5). Consider the
initial value problem
∂φ/∂t + u · ∇x φ = 0, x ∈ Rⁿ, t ≥ 0, (1.6)
φ(0, x) = g(x), for x ∈ Rⁿ,
with a prescribed function g(x). Then for a given t ≥ 0 and x ∈ Rn we look at the line Lt,x .
It intersects the hyperplane {t = 0}, where the solution is prescribed, at the point (0, x − ut)
(we take s = −t in the definition of Lt,x ). Therefore, φ(t, x) = φ(0, x − ut) = g(x − ut). This
is the unique solution to the initial value problem (1.6).
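As a quick sanity check, here is a minimal numerical sketch (an illustration, not part of the original notes; the drift u and the data g below are arbitrary choices) verifying by finite differences that φ(t, x) = g(x − ut) solves (1.6) in one space dimension:

```python
import numpy as np

u = 1.5                          # constant drift speed (illustrative choice)
g = lambda x: np.exp(-x**2)      # smooth initial data (illustrative choice)

def phi(t, x):
    # the solution is the initial data transported along the characteristics
    return g(x - u * t)

# central differences for phi_t + u * phi_x at an arbitrary test point
t0, x0, h = 0.7, 0.3, 1e-5
phi_t = (phi(t0 + h, x0) - phi(t0 - h, x0)) / (2 * h)
phi_x = (phi(t0, x0 + h) - phi(t0, x0 - h)) / (2 * h)
print(phi_t + u * phi_x)         # ~0, up to discretization error
```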
Our choice of the initial conditions as the additional constraint is not arbitrary – the
variable t plays the role of time, and physically, the initial value problem with the data
prescribed at t = 0 comes up most often. However, for the sake of completeness and to see
what can go wrong, consider the boundary value problem prescribed along the plane {x1 = 0}:
∂φ/∂t + u · ∇x φ = 0, x ∈ Rⁿ, t ≥ 0, (1.7)
φ(t, x1 = 0, x2 , . . . , xn ) = f (t, x2 , . . . , xn ), for all t ∈ R and (x2 , . . . , xn ) ∈ Rⁿ⁻¹,
with a given function f . Again, given a point (t, x), the line Lt,x intersects the plane {x1 = 0} at the point corresponding to s = −x1/u1: the intersection point is
(t − x1/u1 , 0, x2 − u2 x1/u1 , . . . , xn − un x1/u1).
It follows that the solution of (1.7) is
φ(t, x) = f (t − x1/u1 , x2 − u2 x1/u1 , . . . , xn − un x1/u1).
It is unique provided that u1 ≠ 0. If u1 = 0 then the plane {x1 = 0} is parallel to the lines Lt,x ,
thus the value of the solution at (t, x) is not determined by the data prescribed along that
plane. This is a general lesson: the data should be prescribed along surfaces that are not
tangent to characteristics, or one would get non-uniqueness and non-existence (depending on
the data prescribed on such surfaces). We will discuss this in greater detail later.
Another point to make is that the maximum principle still holds: solution of the initial
value problem (1.6) satisfies
|φ(t, x)| ≤ sup_y |g(y)|, (1.8)
for all t ≥ 0 and x ∈ Rⁿ, since the value of φ at any point equals a value of g. It is convenient to recast the characteristic lines as trajectories X(s; t, x) solving the ODE
dX(s; t, x)/ds = u, X(s = t; t, x) = x, (1.9)
that is, we are given a “starting” time t ∈ R and a “starting” position x ∈ Rn , and we look
at the trajectory parametrized by the parameter s ∈ R that at the time s = t passes through
the point x. Note that the trajectories X(s; t, x) lie in Rn , unlike the lines Lt,x that lived
in Rⁿ⁺¹, the space that also involved t. Solutions of (1.9) are explicit:
X(s; t, x) = x + (s − t)u,
simply because the drift u is constant. In particular, X(0; t, x) = x − tu, and the solution of (1.6) is recovered as φ(t, x) = g(X(0; t, x)).
Variable drift
Let us now consider the transport equation with a drift u(x) = (u1 (x), u2 (x), . . . , un (x)) that
varies in space:
∂φ/∂t + u(x) · ∇φ = 0, x ∈ Rⁿ, t ≥ 0, (1.13)
with a prescribed initial data φ(0, x) = f (x). Recall that when u(x) was constant in space, we
used the trajectories X(s; t, x) (that happened to be straight lines) to construct the solution.
Let us look for the analog of these lines in the case that u(x) varies in space: let X(s; t, x)
be a curve in Rn parametrized by s ∈ R (t and x are fixed here) such that X(s = t; t, x) = x
and define the function z(s) = φ(s, X(s; t, x)), which is the restriction of the function φ(t, x)
to our curve. We compute
dz/ds = ∂φ(s, X(s; t, x))/∂s + Σ_{j=1}^{n} (∂φ(s, X(s; t, x))/∂xj) (dXj/ds)
= ∂φ(s, X(s; t, x))/∂s + (dX(s; t, x)/ds) · ∇φ(s, X(s; t, x)).
Therefore, we have
dz/ds = 0,
or, equivalently, the function z(s) is constant along the curve X(s; t, x), if we choose X(s; t, x)
to be the solution of the system of ODE’s
dX/ds = u(X). (1.14)
If we supplement (1.14) by an initial condition at s = t:
X(s = t; t, x) = x, (1.15)
then z(t) = φ(t, x). Moreover, if we do choose the curve X(s; t, x) as in (1.14), so that z(s) is
constant in s, we would have z(t) = z(0), which, equivalently, means
φ(t, x) = φ(t, X(t; t, x)) = φ(0, X(0; t, x)) = f (X(0; t, x)). (1.16)
Therefore, solution of the initial value problem for (1.13) can be found as follows: fix t ∈ R
and x ∈ Rn and solve the ODE system (1.14)-(1.15) to find X(0; t, x). Then use (1.16) to
find the value of φ(t, x).
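This recipe is straightforward to implement. The following sketch (illustrative; the drift u(x) and the data f are arbitrary choices, not from the notes) integrates the characteristic ODE (1.14)-(1.15) backwards in s with a Runge-Kutta scheme and then evaluates the solution via (1.16):

```python
import numpy as np

u = lambda x: np.sin(x)           # variable drift (illustrative choice)
f = lambda x: np.exp(-x**2)       # initial data (illustrative choice)

def X_at_zero(t, x, nsteps=1000):
    """Solve dX/ds = u(X) with X(s=t) = x, backwards from s = t to s = 0 (RK4)."""
    ds = -t / nsteps
    X = x
    for _ in range(nsteps):
        k1 = u(X)
        k2 = u(X + 0.5 * ds * k1)
        k3 = u(X + 0.5 * ds * k2)
        k4 = u(X + ds * k3)
        X += ds * (k1 + 2 * k2 + 2 * k3 + k4) / 6
    return X

def phi(t, x):
    return f(X_at_zero(t, x))     # formula (1.16)

# check the PDE phi_t + u(x) phi_x = 0 by central differences
t0, x0, h = 1.0, 0.5, 1e-4
phi_t = (phi(t0 + h, x0) - phi(t0 - h, x0)) / (2 * h)
phi_x = (phi(t0, x0 + h) - phi(t0, x0 - h)) / (2 * h)
print(phi_t + u(x0) * phi_x)      # ~0: phi satisfies (1.13)
```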
It follows, in particular, from (1.16) that the maximum principle still applies:
|φ(t, x)| ≤ sup_y |f (y)| for all t and x,
and positivity is preserved as well: φ(t, x) ≥ 0 for all x ∈ Rⁿ if f (x) ≥ 0 for all x ∈ Rⁿ.
Example 1. Consider the problem
∂φ/∂t + x ∂φ/∂x = 0, t ∈ R, x ∈ R,
with the initial data φ(0, x) = f (x). The corresponding ODE is
dX/ds = X, X(t) = x,
and its solution is X(s) = xes−t . Therefore, X(0) = xe−t , hence φ(t, x) = f (xe−t ) – solution
at a positive time t > 0 has the same profile as f (x) but is stretched out in the x-direction
by the factor et .
Example 2. Consider the problem with the opposite sign of the drift
∂φ/∂t − x ∂φ/∂x = 0, t ∈ R, x ∈ R,
with the initial data φ(0, x) = f (x). The corresponding ODE is
dX/ds = −X, X(t) = x,
and its solution is X(s) = xe−s+t . Therefore, X(0) = xet , hence φ(t, x) = f (xet ) – solution
at a positive time t > 0 has the same profile as f (x) but is squished in the x-direction by the
factor et .
The inhomogeneous problem
Consider now the initial value problem with a force:
∂φ/∂t + u · ∇x φ = f (t, x), x ∈ Rⁿ, t ≥ 0, (1.17)
φ(0, x) = g(x), for x ∈ Rⁿ,
with prescribed initial data g(x) and force f (t, x). Here we assume again, for simplicity, that the vector u is constant and does not depend on x. Consider what happens now with the function z(s) = φ(t + s, x + us):
dz/ds = (∂φ/∂t)(t + s, x + su) + u · ∇φ(t + s, x + su) = f (t + s, x + su).
Integrating in s from −t to 0, and noting that z(0) = φ(t, x) while z(−t) = φ(0, x − ut) = g(x − ut), gives
z(0) − z(−t) = ∫_{−t}^{0} (dz/dτ) dτ = ∫_{−t}^{0} f (t + τ, x + τu) dτ = ∫_{0}^{t} f (τ, x − (t − τ)u) dτ,
or
φ(t, x) = g(x − ut) + ∫_{0}^{t} f (τ, x − (t − τ)u) dτ. (1.19)
In order to interpret the formula (1.19), let us define ψ(t, x; τ) = f (τ, x − (t − τ)u). This function satisfies the following initial value problem starting at time τ:
∂ψ/∂t + u · ∇x ψ = 0, x ∈ Rⁿ, t ≥ τ, (1.20)
ψ(t = τ, x; τ) = f (τ, x), for x ∈ Rⁿ,
that is, we decomposed solution of an initial value problem with a force as a sum of the
solution of the initial value problem with zero force, and a time integral of solutions of the
initial value problems with zero force, and with the initial data at an intermediate time τ
given by the force f (τ, x). Such decompositions are known as the Duhamel principle, and
appear in all sorts of linear time-dependent problems we will encounter later.
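Formula (1.19) can also be checked numerically. In the sketch below (again an illustration, in one space dimension, with arbitrarily chosen g and f) the Duhamel integral is computed by quadrature and the equation (1.17) is verified by finite differences:

```python
import numpy as np

c = 0.8                            # constant drift (illustrative, n = 1)
g = lambda x: np.cos(x)            # initial data (illustrative choice)
F = lambda t, x: np.exp(-t) * x    # force f(t, x) (illustrative choice)

def phi(t, x, m=20001):
    # formula (1.19): free transport of g plus the Duhamel integral of the force
    tau = np.linspace(0.0, t, m)
    return g(x - c * t) + np.trapz(F(tau, x - (t - tau) * c), tau)

# check the PDE phi_t + c phi_x = F at a test point
t0, x0, h = 1.0, 0.3, 1e-4
lhs = (phi(t0 + h, x0) - phi(t0 - h, x0)) / (2 * h) \
    + c * (phi(t0, x0 + h) - phi(t0, x0 - h)) / (2 * h)
print(lhs, F(t0, x0))              # the two values nearly agree
```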
1.3 The Laplace and Poisson equations
Among the most frequently encountered PDEs are the Laplace equation
∆φ = 0, (1.22)
and its inhomogeneous counterpart
−∆ψ = f (x), (1.23)
known as the Poisson equation. Recall that the Laplacian is
∆φ = ∂²φ/∂x1² + . . . + ∂²φ/∂xn².
Why do we put a minus in the left side of (1.23)? We will see that then (1.23) preserves
positivity under many of the common boundary conditions, that is, ψ(x) ≥ 0 if f (x) ≥ 0
for all x in the domain U where the Poisson equation is posed. Without the minus sign, the
function ψ would be positive if f is negative, which would be inconvenient in some qualitative
considerations.
The Laplace equation is usually derived as follows. Consider a physical quantity Φ such
as mass or heat whose total amount in any given volume U ⊂ Rn does not change in time –
there are no sources or sinks anywhere, and everything is in an equilibrium. The conservation
of Φ can be expressed as
∫_{∂U} (F · ν) dS = 0. (1.24)
Here F is the flux of Φ, and ν is the outward normal to U . The flux is often described by the
Fourier law:
F = −k∇Φ. (1.25)
The constant k > 0 is usually called the diffusivity. The meaning of (1.25) is clear: heat (or
mass) flows from the regions of high Φ to regions of low Φ. Using the Fourier law in (1.24)
gives
∫_{∂U} (k∇Φ · ν) dS = 0. (1.26)
Using Green's formula, this can be equivalently written as the volume integral
∫_U ∇ · (k∇Φ) dx = 0. (1.27)
We use here the notation ∇ · g for the divergence of a vector-valued function g = (g1 , . . . , gn ):
∇ · g = ∂g1/∂x1 + . . . + ∂gn/∂xn .
As U is an arbitrary volume element, we conclude from (1.27) that we have
∇ · (k∇Φ) = 0. (1.28)
When k = const, this equation reduces to the Laplace equation (1.22). Otherwise, if k(x)
varies in space, we get an inhomogeneous equation
∇ · (k(x)∇Φ) = 0, (1.29)
which, as we will see, has many similar properties to the Laplace equation.
A probabilistic connection interlude
Another nice way to understand how the Laplace equation comes about, as well as many of
its properties is in terms of the discrete equations. For the sake of simplicity of notation, we
describe it in two dimensions. Let U be a bounded sub-domain of the two-dimensional square
lattice Z2 , and let u(x) solve the difference equation
u(x + 1, y) + u(x − 1, y) + u(x, y + 1) + u(x, y − 1) − 4u(x, y) = 0, (1.30)
which is a discrete analog of (1.22). We also impose the boundary condition u(x, y) = g(x, y)
on the boundary ∂U . Here g(x, y) is a prescribed non-negative function, which is positive
somewhere.
We claim that the solution of this problem has the following probabilistic interpretation.
Let (X(t), Y (t)) be the standard random walk on the lattice Z² – the probability to go up, down, left, or right is equal to 1/4 – and let it start at the point (x, y): X(0) = x, Y (0) = y.
Let (x̄, ȳ) be the first point where (X(t), Y (t)) reaches the boundary ∂U of the domain. The
point (x̄, ȳ) is, of course, random. The beautiful observation is that the function
v(x, y) = E(g(x̄, ȳ))
gives a solution of (1.30), connecting this discrete equation to the random walk. Why? First,
it is immediate that if the starting point (x, y) is on the boundary of U then, of course, the
exit point from U is simply the starting point: x̄ = x and ȳ = y, so v(x, y) = g(x, y) in that
case. On the other hand, if (x, y) is inside U then the probabilities for the random walk to
go up, down, left or right are all equal to 1/4, meaning that v(x, y) can be written as
v(x, y) = (1/4)(v(x + 1, y) + v(x − 1, y) + v(x, y + 1) + v(x, y − 1)).
This identity simply uses the definition of the random walk, the definition of v(x, y) and very
elementary probability considerations.
Now, if we let the mesh size be not 1 but h > 0, the discrete equation (1.30) becomes
u(x + h, y) + u(x − h, y) + u(x, y + h) + u(x, y − h) − 4u(x, y) = 0. (1.31)
If we now expand the function u in the Taylor series,
u(x + h, y) = u(x, y) + h ∂u(x, y)/∂x + (h²/2) ∂²u(x, y)/∂x² + . . . ,
and similarly for the other three terms, and let h ↓ 0, the discrete equation (1.31) becomes
the Laplace equation:
uxx + uyy = 0, (1.32)
while the random walk becomes the Brownian motion. More precisely, solution of the Laplace
equation (1.22) in n dimensions has the following probabilistic interpretation: let U be a
domain in Rn and let g(x) be a continuous function on the boundary ∂U . Consider a Brownian
motion B(t; x) that starts at a point x ∈ U and let x̄ be a (random) point where B(t; x) hits
the boundary ∂U for the first time. Then solution of the Laplace equation
∆u = 0 in U (1.33)
with the boundary condition u(x) = g(x) for x ∈ ∂U , is u(x) = E(g(x̄)). The reader
unfamiliar with the notion of the Brownian motion should not worry – we will not rely on
this connection in any way other than provide some intuition and motivation.
From the heuristic point of view, now, if g(x) is continuous and non-negative everywhere
on ∂U , and positive at some point x0 ∈ ∂U (and thus in a neighborhood V of x0 as well) then
with a positive probability the exit point x̄ lies in V , so that we have g(x̄) > 0, which means
that u(x) = E(g(x̄)) > 0 as well.
The maximum principle is also a simple consequence of the probabilistic interpretation:
it is easy to see that E(g(x̄)) ≤ supz∈∂U g(z) – expected value of a function can not exceed its
maximum.
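To see the probabilistic interpretation in action, here is a small Monte Carlo sketch (an illustration, not part of the notes). The function g(x, y) = x² − y² satisfies the discrete equation (1.30) exactly, so on a square the solution with boundary data g is u = g, and the empirical average of g over the random exit points should reproduce u(x0, y0):

```python
import random

def exit_value(x, y, N, g):
    """Standard random walk on Z^2 started at (x, y); returns g at the first
    point where the walk leaves the square {0 < x < N, 0 < y < N}."""
    while 0 < x < N and 0 < y < N:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
    return g(x, y)

g = lambda x, y: x * x - y * y      # discretely harmonic: solves (1.30) exactly
N, x0, y0, trials = 10, 3, 4, 20000
est = sum(exit_value(x0, y0, N, g) for _ in range(trials)) / trials
print(est, g(x0, y0))               # Monte Carlo average vs. the exact value
```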
Radial solutions
Let us look for radially symmetric solutions of the Laplace equation, φ(x) = v(r) with r = |x|. Then
∂φ/∂xi = v′(r) ∂r/∂xi = v′(r) xi/r, 1 ≤ i ≤ n,
and
∂²φ/∂xi² = ∂/∂xi (v′(r) xi/r) = v′′(r) xi²/r² + v′(r)/r − v′(r) xi²/r³.
Summing over i, from i = 1 to i = n, keeping in mind that Σ_{i=1}^{n} xi² = r², gives:
∆φ = v′′(r) + n v′(r)/r − v′(r) r²/r³ = v′′(r) + ((n − 1)/r) v′(r).
Therefore, for a radial function φ(x) = v(r) to satisfy the Laplace equation, the function v(r) has to be the solution of the ODE
v′′(r) + ((n − 1)/r) v′(r) = 0.
Dividing by v′(r) (v(r) = const would be a not very exciting solution) gives
(ln v′(r))′ = −(n − 1)/r.
Therefore, if n = 1 then v(r) = cr, that is,
u(x) = c|x|.
When n ≥ 2 we get
ln v′(r) = −(n − 1) ln r + C,
so that when n = 2, we get v′(r) = C/r, and
v(r) = C ln r + B,
with some constants C and B. Finally, for n ≥ 3 we obtain
v′(r) = C/r^{n−1},
whence v(r) = −C/rn−2 + B.
You may notice immediately that in all three cases, n = 1, n = 2 and n ≥ 3 the radial
solutions that we have obtained above are not twice differentiable at r = 0, and, moreover,
for n ≥ 2 they are not even bounded. So, do they satisfy the Laplace equation? They certainly
do away from x = 0 but what happens there? In order to appreciate this point, observe that if a smooth function K(x) satisfies
∆K(x) = 0 for all x ∈ Rⁿ,
and a function f (x) is smooth and vanishes outside of a bounded set, then the function
v(x) = ∫ K(x − y)f (y)dy
also satisfies ∆v = 0, as we may differentiate under the integral sign. Consider now, in one dimension, the function
φ(x) = ∫_{−∞}^{∞} |x − y| f (y)dy,
with a smooth function f (y) which vanishes outside of some interval [−L, L], so that all differentiations under the integral sign in the following computation are justified:
φ′(x) = (d/dx) ∫_{−∞}^{∞} |x − y|f (y)dy = (d/dx) ∫_{−∞}^{x} (x − y)f (y)dy + (d/dx) ∫_{x}^{∞} (y − x)f (y)dy
= ∫_{−∞}^{x} f (y)dy − ∫_{x}^{∞} f (y)dy.
It follows that
φ′′(x) = 2f (x). (1.35)
Therefore, the function φ(x) is not a solution of the Laplace equation but rather of the Poisson equation with the right side given by the function (−2f (x)). In order to get rid of the pesky (−2) factor we introduce
Φ1(x) = −(1/2)|x|, x ∈ R, (1.36)
and observe that for any "nice" function f the function
φ(x) = ∫_{−∞}^{∞} Φ1(x − y)f (y)dy, x ∈ R, (1.37)
satisfies the Poisson equation
−φ′′(x) = f (x). (1.38)
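Identity (1.38) is easy to test numerically. In the sketch below (illustrative; f is an arbitrary smooth choice and the integral is truncated to a finite interval) the convolution (1.37) is computed by quadrature and −φ′′ is compared to f:

```python
import numpy as np

f = lambda x: np.exp(-x**2)                  # a smooth, rapidly decaying f
Phi1 = lambda x: -0.5 * np.abs(x)            # the fundamental solution (1.36)

y = np.linspace(-10, 10, 20001)              # truncated grid for the convolution
def phi(x):
    return np.trapz(Phi1(x - y) * f(y), y)   # formula (1.37)

# verify -phi'' = f at a test point by a central second difference
x0, h = 0.4, 0.05
second = (phi(x0 + h) - 2 * phi(x0) + phi(x0 - h)) / h**2
print(-second, f(x0))                        # the two values nearly agree
```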
1.3.1 The fundamental solution of the Laplace equation
The property that any function of the form (1.37) is the solution of the Poisson equation (1.38)
means that the function Φ1 (x) = −|x|/2 is the fundamental solution for the Laplace equation
in one dimension. In higher dimensions, the fundamental solutions of the Laplace equation
are given by
Φ(x) = −(1/(2π)) log |x|, n = 2, (1.39)
and
Φ(x) = (1/(n(n − 2)α(n))) · 1/|x|^{n−2}, n ≥ 3. (1.40)
Here α(n) is the volume of the unit ball in n dimensions. When we say that Φ(x) is the fundamental solution of the Laplace equation, we mean the following.
Theorem 1.1 Let f ∈ Cc²(Rⁿ) (that is, f is twice continuously differentiable and has compact support), n ≥ 2, and set
φ(x) = ∫_{Rⁿ} Φ(x − y)f (y)dy, (1.41)
with Φ(x) given by (1.39) and (1.40) for n = 2 and n ≥ 3, respectively. Then φ(x) is twice continuously differentiable and satisfies the Poisson equation
−∆φ = f (x), x ∈ Rⁿ. (1.42)
Before we go into the proof, let us recall Lebesgue dominated convergence theorem from real
analysis.
Theorem 1.2 (Lebesgue Dominated Convergence Theorem) Let gk (x) be a sequence
of functions such that gk (x) → g(x) as k → ∞ for all x ∈ Rn and assume that there exists a
function Q(x) such that
∫_{Rⁿ} |Q(x)|dx < +∞,
and |gk(x)| ≤ Q(x) for all k and all x ∈ Rⁿ. Then we have
lim_{k→+∞} ∫_{Rⁿ} gk(x)dx = ∫_{Rⁿ} g(x)dx.
We will not prove this theorem here; its proof can be found in essentially any textbook on measure theory and real analysis.
Proof of Theorem 1.1. Step 1. Differentiability of φ(x). Let us first show that φ(x)
defined by (1.41) is differentiable. Assume that f (x) vanishes outside of a ball of radius R.
Note that
φ(x) = ∫_{Rⁿ} Φ(x − y)f (y)dy = ∫_{Rⁿ} Φ(y)f (x − y)dy = ∫_{B_R(x)} Φ(y)f (x − y)dy,
where B_R(x) is the ball of radius R centered at the point x. Therefore, for |h| < 1 we have
(φ(x + h ei) − φ(x))/h = ∫_{B_{R+1}(x)} Φ(y) (f (x + h ei − y) − f (x − y))/h dy, (1.43)
where ei = (0, . . . , 1, 0, . . . , 0) is the unit vector in the direction of xi . However, we have
gh(y) = (f (x + h ei − y) − f (x − y))/h → ∂f (x − y)/∂xi as h → 0,
uniformly in y ∈ Rⁿ (remember that f is compactly supported) – we consider x here to be fixed. Moreover, the functions gh(y) are uniformly bounded: there exists a point ξ on the interval connecting x − y and x − y + h ei such that
f (x + h ei − y) − f (x − y) = h (ei · ∇f (ξ)),
so that
|gh(y)| ≤ |h (ei · ∇f (ξ))|/|h| ≤ M0 = sup_{z∈Rⁿ} |∇f (z)|.
Note that while the function Φ(y) is not integrable over all Rn , its integral over any ball is
finite – in particular, over the ball BR+1 (x). Hence, we may apply the Lebesgue dominated
convergence theorem and pass to the limit h → 0 in (1.43) to get
∂φ(x)/∂xi = ∫_{Rⁿ} Φ(y) ∂f (x − y)/∂xi dy.
A very similar argument shows that
∂²φ(x)/∂xi ∂xj = ∫_{Rⁿ} Φ(y) ∂²f (x − y)/∂xi ∂xj dy,
hence φ is twice differentiable (you need also to argue why the second derivatives are con-
tinuous but the argument for that is also very similar to what we just did, except without
dividing by any h).
Step 2. Derivation of the Poisson equation. Now, we show that φ satisfies the
Poisson equation. We know from the above that φ(x) is twice continuously differentiable and
∆φ(x) = ∫_{Rⁿ} Φ(y)∆f (x − y)dy.
We need to check that the right side equals −f (x). Since Φ(y) is singular at y = 0 we can not simply integrate by parts in the right side but rather have to be more careful. To this end, we take a small ε > 0 (that we will send to zero at the end of the proof), and split the integral above into the integral over the ball B(0, ε) of radius ε centered at y = 0 and its complement:
∆φ(x) = Iε(x) + Jε(x), (1.44)
with
Iε(x) = ∫_{|y|≤ε} Φ(y)∆f (x − y)dy, Jε(x) = ∫_{|y|≥ε} Φ(y)∆f (x − y)dy.
Decomposition (1.44) holds, of course, for any ε > 0. Therefore, we also have, trivially:
∆φ(x) = lim_{ε↓0} (Iε(x) + Jε(x)). (1.45)
Our strategy will be to compute the limit in the right side of (1.45) in order to verify that the Poisson equation (1.42) holds.
The contribution of Iε (x) as ε ↓ 0 is small:
|Iε(x)| ≤ Cf ∫_{|y|≤ε} |Φ(y)|dy, (1.46)
where Cf = sup_{z∈Rⁿ} |∆f (z)|. The right side of (1.46) vanishes as ε → 0: when n = 2 we have
∫_{|y|≤ε} |Φ(y)|dy ≤ (1/(2π)) ∫_0^ε ∫_0^{2π} | log r| r dr dω ≤ Cε²| log ε|,
while when n ≥ 3 we have
∫_{|y|≤ε} |Φ(y)|dy ≤ C ∫_0^ε r^{−(n−2)} r^{n−1} dr = C ∫_0^ε r dr ≤ Cε².
Let us now look at Jε . First, we recall Green’s formula: given a vector-valued function v(x)
and a scalar valued function f (x) we have, over a nice domain U :
∫_U v(x) · ∇f (x)dx = ∫_{∂U} (v(x) · ν(x))f (x)dSx − ∫_U f (x) div v(x)dx. (1.48)
Here, ν(x) is the outward unit normal to the boundary ∂U at the point x ∈ ∂U . Then,
integrating by parts we get for Jε (keep in mind that ∆x f (x − y) = ∆y f (x − y)), since
∆f = div(∇f ):
Jε(x) = ∫_{|y|≥ε} Φ(y)∆x f (x − y)dy = ∫_{|y|≥ε} Φ(y)∆y f (x − y)dy = ∫_{|y|≥ε} Φ(y) div_y(∇y f (x − y))dy
= −∫_{|y|≥ε} ∇y Φ(y) · ∇y f (x − y)dy + ∫_{|y|=ε} Φ(y)(∇y f (x − y) · ν(y))dSy = Kε + Lε .
The second term above is small in the limit ε → 0: let Cf0 = supz∈Rn |∇f (z)|, then
|Lε| ≤ C′_f ∫_{|y|=ε} |Φ(y)|dSy ,
so that when n = 2 we have
|Lε| ≤ Cε| log ε|,
and when n ≥ 3 we have
|Lε| ≤ C ε^{n−1}/ε^{n−2} = Cε.
In both cases we have
lim_{ε↓0} Lε = 0. (1.49)
Let us now look at the term Kε: integrating by parts using Green's formula once again gives
Kε(x) = −∫_{|y|≥ε} ∇y Φ(y) · ∇y f (x − y)dy = ∫_{|y|≥ε} ∆Φ(y)f (x − y)dy − ∫_{|y|=ε} (∂Φ(y)/∂ν) f (x − y)dSy . (1.51)
The first integral in the right side vanishes since ∆Φ(y) = 0 for y ≠ 0. Consider only the case n ≥ 3 – the case n = 2 is very similar. Then the normal derivative inside the integrand above is
(∂Φ/∂ν)(y)|_{|y|=ε} = (1/(nα(n)r^{n−1}))|_{r=ε} = 1/(nα(n)ε^{n−1}),
and does not depend on the point y. The sign above comes from the fact that the outer normal to {|y| ≥ ε} points toward the origin y = 0. Using this in (1.51) gives
Kε(x) = −(1/(nα(n)ε^{n−1})) ∫_{|y|=ε} f (x − y)dSy . (1.52)
It remains only to observe that nα(n)ε^{n−1} is the surface area of the sphere of radius ε in Rⁿ: in general, the volume α(n) of the unit ball in Rⁿ is related to the area sn of the unit sphere by α(n) = sn/n – this is easy to see from calculus. Hence, the right side of (1.52) is minus the average of f (x − y) over the sphere {|y| = ε} and, as f is continuous, Kε(x) → −f (x) as ε ↓ 0. Together with (1.45), (1.46) and (1.49), this shows that
−∆φ(x) = f (x),
as claimed. The end of the proof in dimension n = 2 is very similar to what we did after (1.51), so we do not present it here. □
A probabilistic interlude
Consider now the following problem: fix two radii r and R, and let the Brownian motion start at a point x inside the annulus D = {r < |x| < R} in Rⁿ. The Brownian motion will spend some time inside D but will eventually exit D at some random point x̄ such that either |x̄| = r or |x̄| = R. We ask the following question: what is the probability that the Brownian motion exits the annulus at the sphere {|x| = r} and not at the sphere {|x| = R}? Let us call this probability p(x). It is clear that if the starting point x is such that |x| = r then
p(x) = 1 while if |x| = R then p(x) = 0. One can show, as we did before, that if we replace
the Brownian motion by a discrete random walk then the discretized p(x) satisfies the discrete
Laplace equation:
p(x) = (1/(2n)) Σ_{i=1}^{n} [p(x + ei) + p(x − ei)] .
Here, ei is the unit vector in the direction of xi and 2n is the total number of the neighbours
of the point x on the lattice (n is the spatial dimension). In the case of the Brownian motion
which is a continuous limit of the random walks, the function p(x) satisfies the Laplace
equation
∆p(x) = 0 for r < |x| < R,
supplemented by the boundary conditions
p(x) = 1 for |x| = r, p(x) = 0 for |x| = R.
In dimension n = 1 the solution is a linear function,
p(x) = Ax + B,
and the boundary conditions give
Ar + B = 1, AR + B = 0,
so that
A = −1/(R − r), B = R/(R − r).
Note that if R → +∞, that is, if the right endpoint moves to infinity, we have
A → 0, B → 1 as R → +∞.
This means that p(x) → 1 as R → +∞ for any fixed point x. This is the reflection of the
fact that the Brownian motion is recurrent in one dimension: no matter where it starts, it is
certain to reach the point x = r. On the other hand, in dimensions n ≥ 3 the function p(x)
is given by
p(x) = A/|x|^{n−2} + B,
with the constants A and B determined by
A/r^{n−2} + B = 1, A/R^{n−2} + B = 0.
This gives
A = r^{n−2}R^{n−2}/(R^{n−2} − r^{n−2}), B = −r^{n−2}/(R^{n−2} − r^{n−2}).
We see that in dimension n ≥ 3, as R → +∞ we have
A → r^{n−2}, B → 0,
so that p(x) has a limit that depends on x and is always less than one:
p(x) → r^{n−2}/|x|^{n−2} as R → +∞.
This reflects the fact that the Brownian motion is transient in dimensions n ≥ 3: no matter how close to the ball {|x| ≤ r} it starts, there is a positive probability that it will never enter this ball.
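This exit probability can be estimated by direct simulation. The sketch below (illustrative, not from the notes; the time step introduces a small bias near the spheres) runs a discretized Brownian motion in dimension n = 3 and compares the empirical exit frequency with the formula above:

```python
import numpy as np

r, R, n = 1.0, 4.0, 3                    # inner/outer radii, dimension
x0 = np.array([2.0, 0.0, 0.0])           # starting point with r < |x0| < R
dt, trials = 1e-2, 2000
rng = np.random.default_rng(0)

hits_inner = 0
for _ in range(trials):
    x = x0.copy()
    s = np.linalg.norm(x)
    while r < s < R:
        x += np.sqrt(dt) * rng.standard_normal(n)   # Brownian increment
        s = np.linalg.norm(x)
    hits_inner += (s <= r)

A = r**(n - 2) * R**(n - 2) / (R**(n - 2) - r**(n - 2))
B = -r**(n - 2) / (R**(n - 2) - r**(n - 2))
print(hits_inner / trials, A / np.linalg.norm(x0)**(n - 2) + B)
```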
Theorem 1.3 Let U ⊂ Rn be an open set and let B(x, r) be a ball centered at x ∈ Rn of
radius r > 0 contained in U . Assume that the function u(x) satisfies
∆u = 0 in U . (1.53)
Then u(x) has the mean value property
u(x) = (1/|B(x, r)|) ∫_{B(x,r)} u(y)dy = (1/|∂B(x, r)|) ∫_{∂B(x,r)} u(z)dS(z). (1.54)
The intuitive reason for the mean value property can be seen from the discrete version of the
Laplace equation we have encountered when we discussed the probabilistic interpretation:
(1/(2n)) Σ_{j=1}^{n} (u(x + h ej) + u(x − h ej)) = u(x).
Here h is the mesh size, and ej is the unit vector in the direction of the coordinate axis for xj .
This discrete equation says exactly that the value u(x) is the average of the values of u at the
neighbors of the point x on the lattice with mesh size h, which is similar to the statement of
Theorem 1.3 – though there is no meaning to “nearest” neighbor in the continuous case, and
the average can be taken over an arbitrary large sphere or ball.
Proof. Let us fix the point x ∈ U and define
φ(r) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} u(z)dS(z). (1.55)
Since u is continuous, we have
φ(r) → u(x) as r ↓ 0. (1.56)
Therefore, we would be done if we knew that φ′(r) = 0 for all r > 0 such that the ball B(x, r) is contained in U . To this end, using the polar coordinates z = x + ry, with y ∈ ∂B(0, 1), we may rewrite (1.55) as
φ(r) = (1/|∂B(0, 1)|) ∫_{∂B(0,1)} u(x + ry)dS(y).
Differentiating in r gives
φ′(r) = (1/|∂B(0, 1)|) ∫_{∂B(0,1)} y · ∇u(x + ry)dS(y) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} (∂u/∂ν)(z)dS(z).
Here, we used the fact that the outward normal to the ball B(x, r) at a point z ∈ ∂B(x, r) is ν = (z − x)/r. Using the Green's formula
∫_U f ∆g dy = ∫_{∂U} f (∂g/∂ν) dS − ∫_U ∇f · ∇g dy,
with f ≡ 1 and g = u, we conclude that
φ′(r) = (1/|∂B(x, r)|) ∫_{B(x,r)} ∆u(y)dy = 0,
since u is harmonic – it satisfies (1.53). It follows that φ(r) is a constant and then (1.56) implies that
u(x) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} u dS, (1.57)
which is the second identity in (1.54).
In order to prove the first equality in (1.54) we use the polar coordinates once again:
(1/|B(x, r)|) ∫_{B(x,r)} u dy = (1/|B(x, r)|) ∫_0^r ( ∫_{∂B(x,s)} u dS ) ds = (1/|B(x, r)|) ∫_0^r u(x) nα(n)s^{n−1} ds
= u(x) α(n)rⁿ/(α(n)rⁿ) = u(x).
In the second equality above we used two facts: first, the already proved identity (1.57) about averages on spherical shells, and, second, that the area of the (n − 1)-dimensional unit sphere is nα(n). Now, the proof of (1.54) is complete. □
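The mean value property (1.54) is easy to verify numerically for an explicit harmonic function. In the sketch below (illustrative choices of the harmonic polynomial, the center and the radius) the average over a circle is computed by quadrature:

```python
import numpy as np

u = lambda x, y: x**2 - y**2 + 3*x*y      # a harmonic polynomial in R^2
x0, y0, r = 0.7, -0.2, 1.3                # center and radius of a test circle

theta = np.linspace(0.0, 2*np.pi, 20001)
avg = np.trapz(u(x0 + r*np.cos(theta), y0 + r*np.sin(theta)), theta) / (2*np.pi)
print(avg, u(x0, y0))                     # the average equals the center value
```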
Theorem 1.4 (The maximum principle) Let U ⊂ Rⁿ be a connected bounded open set and let u(x) be harmonic in U and continuous in Ū . Then u attains its maximum over Ū on the boundary ∂U . Moreover, if u(x) achieves its maximum at a point x0 in the interior of U then u(x) is identically equal to a constant in U .
Proof. Let us suppose that u(x) attains its maximum at an interior point x0 ∈ U , and set M = u(x0). Then for any r > 0 sufficiently small (so that the ball B(x0, r) is contained in U ) we have
M = u(x0) = (1/|B(x0, r)|) ∫_{B(x0,r)} u dy ≤ M,
with the equality above holding only if u(y) = M for all y in the ball B(x0, r). Therefore, the set S of points where u(x) = M is open. Since u(x) is continuous, this set is also closed. Since S is both open and closed in U , and U is connected, it follows that S = U , hence u(x) = M at all points x ∈ U . □
Of course, if we replace u by (−u) (which is equally harmonic), we get the minimum
principle for u.
Corollary 1.5 (Strict positivity) Assume that U is a connected domain, and u solves
∆u = 0 in U , (1.59)
u = g on ∂U .
Assume, in addition, that g ≥ 0, g is continuous on ∂U , and g ≢ 0. Then u(x) > 0 at all x ∈ U .
Proof. This is an immediate consequence of the minimum principle: min_{x∈Ū} u(x) ≥ 0, and u can not attain its minimum inside U unless it is a constant, thus u(x) > 0 for all x ∈ U . □
Corollary 1.6 (Uniqueness) The boundary value problem
∆u = f in U , (1.60)
u = g on ∂U
has at most one solution that is continuous up to the boundary of U .
Proof. Let u1 and u2 be two such solutions to (1.60). Then the difference v = u1 − u2 satisfies the homogeneous problem
∆v = 0 in U , (1.61)
v = 0 on ∂U .
The maximum principle implies that v ≤ 0 in U , and the minimum principle implies that v ≥ 0 in U , whence v ≡ 0, and we are done. □
Theorem 1.7 Let u(x) be a harmonic function in an open set U ⊂ Rⁿ. Then u is infinitely differentiable in U .
Proof. The proof is via a miracle: we first define a "smoothed" version of u, and then
verify that the ”smoothed” version coincides with the original, hence original is also infinitely
smooth. This is as close to a free lunch as it gets.
Consider a radial non-negative function η(x) ≥ 0 that depends only on |x| such that (i) η(x) = 0 for |x| ≥ 1, (ii) η(x) is infinitely differentiable, and (iii) ∫_{Rⁿ} η(x)dx = 1. Also, for each ε ∈ (0, 1) define its rescaled version
ηε(x) = ε^{−n} η(x/ε).
It is straightforward to verify that ηε satisfies the same properties (i)-(iii) above. Moreover, the function
uε(x) = ∫_{Rⁿ} ηε(x − y)u(y)dy (1.62)
is infinitely differentiable in the slightly smaller domain Uε = {x ∈ U : dist(x, ∂U ) > ε}. The
reason is that we can differentiate infinitely many times under the integral sign in (1.62) –
this follows from the standard multivariable calculus theorem on differentiation of integrals
depending on a parameter (the variable x plays the role of a parameter here). Our main claim
is that, because of the mean value property, we have
uε (x) = u(x) for all x ∈ Uε . (1.63)
This will immediately imply that u(x) is infinitely differentiable in the domain Uε . And, as
any point x from U lies in Uε if ε < dist(x, ∂U ), it follows that u(x) is infinitely differentiable
at all points x ∈ U .
Let us now verify (1.63):
uε(x) = ∫_{Rⁿ} ηε(x − y)u(y)dy = ε^{−n} ∫_U η((x − y)/ε) u(y)dy = ε^{−n} ∫_{B(x,ε)} η((x − y)/ε) u(y)dy.
The last equality holds because η(z) = 0 if |z| ≥ 1, whence ηε(z) = 0 if |z| ≥ ε. Changing variables y = x + εz gives
uε(x) = ∫_{B(0,1)} η(z) u(x + εz)dz = ∫_0^1 η(s) ( ∫_{∂B(0,s)} u(x + εz)dS(z) ) ds = u(x) ∫_{B(0,1)} η(z)dz = u(x),
where we integrated over the spheres {|z| = s}, on which the radial function η is constant, used the mean value property (1.54) in the second step, and, in the last step, the fact that η has integral equal to one. This is (1.63). □
This regularity property is quite fundamental and appears in one way or another for the whole class of elliptic equations (and not just the Laplace equation) that we will discuss later. One of their main qualitative properties is that solutions are more regular than the prescribed data, and they behave much better than, say, solutions of wave equations and other hyperbolic problems.
Let us now give a more quantitative estimate on how large the derivatives of the harmonic
functions can be.
Theorem 1.8 Let u(x) be a harmonic function in a domain U and let B(y0, r) be a ball contained in U centered at a point y0 ∈ U . Then there exist universal constants Cn and Dn that depend only on the dimension n so that we have
|u(y0)| ≤ (Cn/rⁿ) ∫_{B(y0,r)} |u(y)|dy, (1.66)
and
|∇u(y0)| ≤ (Dn/r^{n+1}) ∫_{B(y0,r)} |u(y)|dy. (1.67)
The remarkable fact about the estimate (1.67) is that we are able to estimate the size of
the derivatives of a harmonic function in terms of its values – this means that a harmonic
function can not oscillate (oscillation means, essentially, that the function is much smaller
than its derivative). It is a good time to pause and think about why taking a harmonic
function u(x) and setting uε (x) = u(x/ε) with y0 = 0 does not provide a counterexample
for (1.67). We certainly would have |∇uε (0)| = |∇u(0)|/ε. But what has to happen to the
right side of (1.67) when we replace u by uε there?
Proof. First, the estimate (1.66) follows immediately from the first equality in the mean
value formula (1.54). In order to obtain the derivative bound (1.67) note that if u(x) is
harmonic then so are the partial derivatives ∂u/∂xj , whence
|∂u(y0)/∂xj| ≤ (1/|B(y0, r/2)|) |∫_{B(y0,r/2)} ∂u(y)/∂xj dy| = (1/|B(y0, r/2)|) |∫_{∂B(y0,r/2)} u(y)νj(y)dS(y)|, (1.68)
where νj(y) is the j-th component of the outward normal. Continuing this estimate we see that (we use the fact that the area of the unit sphere is nα(n))
|∂u(y0)/∂xj| ≤ (nα(n)(r/2)^{n−1}/(α(n)(r/2)ⁿ)) sup_{z∈B(y0,r/2)} |u(z)| = (2n/r) sup_{z∈B(y0,r/2)} |u(z)|. (1.69)
Now, we can use the estimate (1.66) applied at any point z ∈ B(y0, r/2):
|u(z)| ≤ (Cn/(r/2)ⁿ) ∫_{B(z,r/2)} |u(z′)|dz′. (1.70)
However, since |y0 − z| ≤ r/2 (this is why we took a smaller ball in (1.68)!), any such ball B(z, r/2) is contained inside the ball B(y0, r), thus (1.70) implies that
|u(z)| ≤ (Cn/(r/2)ⁿ) ∫_{B(y0,r)} |u(z′)|dz′.
Combining this bound with (1.69) gives (1.67). □
Theorem 1.8 is another expression of the fact that harmonic functions do not oscillate –
the first estimate says that the value of the function at a point is bounded by its averages
(but we have seen that already in the mean value property), while the second bound says in a quantitative way that the derivative at a point can not be large without the function being large around the point. This rules out oscillatory behavior.
The Liouville theorem
The Liouville theorem says that a function which is harmonic in all of Rn is either unbounded
or is identically equal to a constant.
Theorem 1.9 Let u(x) be a bounded harmonic function in Rⁿ. Then u(x) is identically equal to a constant.
Proof. Let us assume that |u(x)| ≤ M for all x ∈ Rⁿ. We fix x0 ∈ Rⁿ and use Theorem 1.8:
|∇u(x0)| ≤ (C/r^{n+1}) ∫_{B(x0,r)} |u(y)|dy ≤ (C α(n)rⁿ/r^{n+1}) M = C α(n)M/r.
As this is true for any r > 0 we may let r → ∞ and conclude that ∇u(x0) = 0, thus u(x) is identically equal to a constant. □
This theorem is, of course, a direct generalization to higher dimensions of the familiar Liouville theorem in complex analysis that says that a bounded entire (analytic in all of C) function has to be identically equal to a constant.
Harnack’s inequality
Here is another way to express the lack of oscillations of nonnegative harmonic functions – their maximum cannot be much larger than their minimum. To see the idea in the simplest setting, consider the one-dimensional situation. Let u(x) be a non-negative harmonic function on the interval (0, 1), that is, u(x) = ax + b with some constants a, b ∈ R. We claim that if u(x) ≥ 0 for all x ∈ [0, 1] then
1/3 ≤ u(x)/u(y) ≤ 3, (1.72)
for all x, y in the smaller interval (1/4, 3/4). The constants 1/3 and 3 in (1.72) depend on the choice of the "smaller" interval – they would change if we replaced (1/4, 3/4) by another subinterval of [0, 1]. But once we fix the subinterval, they do not depend on the choice of the harmonic function. Let us now show that (1.72) holds for all x, y ∈ (1/4, 3/4). Without loss of generality we may assume that x > y. First, consider the case a > 0. Then, since u(x) is
of generality we may assume that x > y. First, consider the case a > 0. Then, since u(x) is
increasing (because a > 0), we have
1 ≤ u(x)/u(y) ≤ u(3/4)/u(1/4) = (3a + 4b)/(a + 4b). (1.73)
As u(x) > 0 on [0, 1] we know that b > 0 (and a > 0 by assumption); using this in (1.73) gives, with c = a/b:
1 ≤ u(x)/u(y) ≤ (3c + 4)/(c + 4) = 3 − 8/(c + 4) ≤ 3.
On the other hand, if a < 0 then the function u is decreasing, and
1 ≥ u(x)/u(y) ≥ u(3/4)/u(1/4) = (3c + 4)/(c + 4).
As u(1) > 0 we know that a + b > 0, and we still have b > 0 since u(0) > 0. Thus, c > −1, and therefore,
1 ≥ u(x)/u(y) ≥ (3c + 4)/(c + 4) = 1/3 + 8(c + 1)/(3(c + 4)) ≥ 1/3.
We conclude that (1.72), indeed, holds. Geometrically, (1.72) expresses a very simple fact: if u(3/4) ≫ u(1/4) then the slope of the straight line connecting the points (1/4, u(1/4)) and (3/4, u(3/4)) is so large that the line would go below the x-axis at x = 0, contradicting the assumption that the linear function is positive on the interval (0, 1). On the other hand, if u(1/4) ≫ u(3/4) then this line would go below the x-axis at x = 1. Therefore, the condition that u(x) > 0 on the larger interval [0, 1] is very important here.
Now, we turn to the general case of dimension larger than one. We say that a set V is
strictly contained in U if V ⊂ U and there exists ε0 > 0 so that for any x ∈ V we have
dist(x, ∂U ) ≥ ε0 .
Theorem 1.10 (Harnack’s inequality) Let U be an open set and let V be strictly contained
in U . Then there exists a constant C that depends on U and V but nothing else so that for
any nonnegative harmonic function u in U we have
sup_{x∈V} u(x) ≤ C inf_{x∈V} u(x). (1.74)
Proof. Let r = (1/4) dist(V, ∂U ) and choose two points x, y ∈ V such that |x − y| ≤ r. Then the ball B(x, 2r) is contained in U so u is harmonic in this ball, and the mean-value principle implies that
u(x) = (1/|B(x, 2r)|) ∫_{B(x,2r)} u(z)dz. (1.75)
Note also that since |x − y| ≤ r, the ball B(y, r) is contained inside B(x, 2r), and u(z) ≥ 0 everywhere. Hence, (1.75) implies that
u(x) ≥ (1/(α(n)2ⁿrⁿ)) ∫_{B(y,r)} u(z)dz. (1.76)
It follows, on the other hand, from the mean-value principle that
u(y) = (1/(α(n)rⁿ)) ∫_{B(y,r)} u(z)dz. (1.77)
Comparing (1.76) and (1.77), we conclude that u(x) ≥ 2^{−n} u(y), and, by symmetry, u(y) ≥ 2^{−n} u(x), for all x, y ∈ V with |x − y| ≤ r.
In general, if |x − y| ≥ r, there exists a number N so that we may cover the compact
set V̄ by N balls of radius r/2. Then given any two points x, y ∈ V we can connect them by
a piece-wise straight line curve with no more than N segments, each segment at most r long.
It follows that for any x, y ∈ V we have
2^{−Nn} u(x) ≤ u(y) ≤ 2^{Nn} u(x), for all x, y ∈ V . (1.81)
This, of course, implies (1.74) with C = 2^{Nn}. □
The Green's function
Consider the boundary value problem
−∆u = f in U , (1.82)
u = g on ∂U .
When the domain U is sufficiently simple (a ball, halfspace, etc.) then we will construct a
more or less explicit formula for the solution. When U is complicated we can not get an
explicit formula but we will reduce solving (1.82) with arbitrary functions f and g to the
special case f = 0, and one particular function g. Having a solution to this one special
case allows to construct solutions for general f and g immediately. This is useful when one
needs to solve the Poisson equation in the same domain for various f and g. It also helps to
understand various qualitative properties of the solutions of the boundary value problem for
the Poisson equation.
The idea is to use the fundamental solution, corrected so as to produce the correct boundary conditions. That is, we are hoping to get an integral representation of the
solution of the boundary value problem (1.82) as
u(x) = ∫_U G(x, y)f (y)dy + ∫_{∂U} G1(x, z)g(z)dz, (1.86)
with some functions G(x, y) and G1 (x, z) that are to be determined (but they should not
depend on the functions f and g – they should only depend on the domain U where the
problem is posed).
To this end, we take a point x ∈ U , and a small ball B(x, ε) around it. Consider the
domain Vε = U \ B(x, ε) (that is, U without the ball B(x, ε)) and use the Green’s formula:
∫_{Vε} [u(y)∆Φ(y − x) − Φ(y − x)∆u(y)] dy = ∫_{∂Vε} [u(z) ∂Φ(z − x)/∂ν − Φ(z − x) ∂u(z)/∂ν] dS(z). (1.87)
The reason we had to cut out the small ball around the point x is that now when y ∈ Vε the argument (y − x) of the fundamental solution Φ(y − x) can not vanish, and Φ(z) is regular when z ≠ 0. Otherwise, we would not be able to apply Green's formula since Φ(z) is singular at z = 0. As ∆Φ(y − x) = 0 when y ≠ x, the above is
−∫_{Vε} Φ(y − x)∆u(y)dy = ∫_{∂Vε} [u(z) ∂Φ(z − x)/∂ν − Φ(z − x) ∂u(z)/∂ν] dS(z). (1.88)
This identity holds for all ε > 0 and we will now pass to the limit ε ↓ 0 in (1.88), taking the
size of the cut-out region to zero. The boundary ∂Vε of the domain Vε is the union of ∂U
and the sphere Sε = {|z − x| = ε}. The integral over Sε is computed as in the proof of
Theorem 1.1 (we again do only the computations for n ≥ 3, the case n = 2 is similar): first,
we may use the explicit formula
Φ(x) = 1/(n(n − 2)α(n)|x|^{n−2}),
which gives, on the sphere Sε,
∂Φ(z − x)/∂ν = 1/(nα(n)ε^{n−1}),
so that
∫_{Sε} u(z) ∂Φ(z − x)/∂ν dS(z) = (1/(nα(n)ε^{n−1})) ∫_{Sε} u(z)dS(z) = (1/|Sε|) ∫_{Sε} u(z)dS(z). (1.89)
We used here the formula |Sε| = nα(n)ε^{n−1} for the area of a sphere of radius ε in Rⁿ. As u is continuous at the point x, letting ε ↓ 0 we obtain
∫_{Sε} u(z) ∂Φ(z − x)/∂ν dS(z) = (1/|Sε|) ∫_{Sε} u(z)dS(z) → u(x) as ε ↓ 0. (1.90)
The other term in the right side of (1.88) vanishes as ε ↓ 0:
|∫_{Sε} Φ(z − x) ∂u(z)/∂ν dS(z)| ≤ (1/(n(n − 2)α(n)ε^{n−2})) M nα(n)ε^{n−1} = M ε/(n − 2) → 0, (1.91)
as ε ↓ 0, where M = sup_{y∈U} |∇u|. We used again the formula |Sε| = nα(n)ε^{n−1} above.
Therefore, passing to the limit ε ↓ 0 in (1.88) leads to
u(x) = ∫_{∂U} [Φ(z − x) ∂u(z)/∂ν − u(z) ∂Φ(z − x)/∂ν] dS(z) − ∫_U Φ(y − x)∆u(y)dy. (1.92)
Hence, in order to compute u(x) we should know ∆u inside U (which we do for the solution of the Poisson equation (1.82): it is −f ), as well as u(z) on the boundary ∂U (which we do
know for the solution of the boundary value problem (1.82), it is g), but also the normal
derivative ∂u/∂ν at the boundary of U , and that we do not know a priori – this normal
derivative can only be found after we solve (1.82). Therefore, (1.92) is not yet the answer we
seek – it involves an unknown function ∂u/∂ν.
Note that this would not have been an issue if we had Φ(x − y) = 0 on ∂U – then the corresponding term in (1.92) would have vanished. The idea is, then, to amend Φ(x − y) in such a way as to make this term disappear. This is done as follows. Fix a point x ∈ U and
let φ(y; x) be the solution of the boundary value problem
∆y φ(y; x) = 0 for y ∈ U , φ(y; x) = Φ(y − x) for y ∈ ∂U . (1.93)
Observe that, as x lies inside the domain U , the function Φ(x − y) is regular when y lies on the boundary ∂U – the distance between x and y is uniformly positive. Therefore, (1.93) is simply a Laplace equation in y with regular prescribed boundary data Φ(x − y) (x here serves as a parameter). Hence, the function φ(y; x) is regular and has no singularity.
Using the Green’s formula as before (but without the need to throw out a small ball
around the point x since the function φ(y; x) is regular at y = x) gives
−∫_U φ(y; x)∆u(y) dy = ∫_{∂U} [u(z) ∂φ(z; x)/∂ν − φ(z; x) ∂u(z)/∂ν] dS(z). (1.94)
Adding (1.94) to (1.92), and using the boundary condition φ(z; x) = Φ(z − x) for z ∈ ∂U , the terms involving ∂u/∂ν cancel. Introducing the Green's function
G(x; y) = Φ(y − x) − φ(y; x),
we obtain
u(x) = −∫_{∂U} u(z) ∂G(x; z)/∂ν dS(z) − ∫_U G(x; y)∆u(y)dy. (1.97)
The advantage of (1.97) over (1.92) is that the normal derivative ∂u/∂ν on ∂U (which we do
not know) no longer appears in the right side. Hence, solution of the Poisson boundary value
problem (1.82) is given by
u(x) = −∫_{∂U} g(z) ∂G(x; z)/∂ν dS(z) + ∫_U G(x; y)f (y)dy. (1.98)
This expression is particularly useful when G(x; z) is known explicitly, and we will discuss
below some examples when it can be computed analytically.
To get a feeling for what the Green's function is, consider the problem
−∆u = f,
with the Dirichlet boundary condition u = 0 on ∂U . Then, according to (1.98), we have
u(x) = ∫_U G(x, y)f (y)dy. (1.99)
Let us then look for fh(x, y) such that if Gh(x, y) satisfies the boundary value problem
−∆y Gh(x, y) = fh(x, y) in U , Gh(x, y) = 0 for y ∈ ∂U ,
then (1.99) holds for Gh in the limit h → 0. Since both Gh and u vanish on the boundary, Green's formula says that
∫_U Gh(x, y)∆y u(y)dy = ∫_U u(y)∆y Gh(x, y)dy,
that is, since −∆u = f ,
∫_U Gh(x, y)f (y)dy = ∫_U fh(x, y)u(y)dy.
The question now becomes: what is a suitable choice of fh(x, y) such that
∫_U fh(x, y)u(y)dy − u(x) → 0 as h → 0?
A suitable choice is to take a fixed f (x) ≥ 0 such that f (x) = 0 for |x| ≥ 1, and
∫_{|x|≤1} f (y)dy = 1,
and set fh(x, y) = h^{−n} f ((x − y)/h), which concentrates near y = x as h ↓ 0, so that ∫_U fh(x, y)u(y)dy → u(x) for any continuous u.
The Green's function is also symmetric:
G(x; y) = G(y; x) for all x, y ∈ U with x ≠ y. (1.103)
Proof. Once again, we will use Green’s formula. Let x 6= y be two distinct points in U , and
set v(z) = G(x; z) and w(z) = G(y; z). Let us cut out two small balls B(x, ε) and B(y, ε)
with ε > 0 so small that the balls are not overlapping and are contained in U . Let
Vε = U \ (B(x, ε) ∪ B(y, ε))
be the domain U with the two balls deleted. Then ∆z w = ∆z v = 0 in Vε as this set contains neither the point x nor the point y. The Green's formula then becomes
∫_{∂Vε} [w(z) ∂v(z)/∂ν − v(z) ∂w(z)/∂ν] dS(z) = 0.
The boundary of Vε consists of three pieces: the outer boundary ∂U where both w and v
vanish, and the two spheres ∂B(x, ε) and ∂B(y, ε). Therefore, we have
∫_{∂B(x,ε)} [w(z) ∂v(z)/∂ν − v(z) ∂w(z)/∂ν] dS(z) + ∫_{∂B(y,ε)} [w(z) ∂v(z)/∂ν − v(z) ∂w(z)/∂ν] dS(z) = 0. (1.104)
Here ν is the normal pointing inside the spheres (it is the outside normal to Vε which is
outside the two small balls). The functions w and v look as follows near x and y: v is regular
in B(y, ε), w is regular in B(x, ε). On the other hand, v(z) = Φ(x − z) − φ(z; x), and φ(z; x) is regular in z, including in B(x, ε), while w(z) = Φ(y − z) − φ(z; y), and φ(z; y) is regular in z, including in B(y, ε). Hence, as in the discussion in the previous section leading up to (1.97),
the main terms in (1.104) are
∫_{∂B(x,ε)} w(z) ∂Φ(x − z)/∂ν dS(z) − ∫_{∂B(y,ε)} v(z) ∂Φ(y − z)/∂ν dS(z) + l.o.t. = 0, (1.105)
where l.o.t. denotes terms that tend to zero as ε → 0. Passing to the limit ε ↓ 0 exactly as
in (1.90), since w(z) is continuous at x, and v(z) is continuous at y, gives w(x) − v(y) = 0, that is, G(y; x) = G(x; y), which is exactly (1.103). □
The solution of the boundary value problem in the upper half-space Rⁿ₊ = {x = (x1, . . . , xn) : xn > 0},
∆u = 0 in Rⁿ₊ , u(x1, . . . , x_{n−1}, 0) = g(x1, . . . , x_{n−1}), (1.109)
with a prescribed function g(x), is
u(x) = ∫_{R^{n−1}} K(x, y)g(y)dy1 . . . dy_{n−1} = (2xn/(nα(n))) ∫_{R^{n−1}} g(y)/|x − y|ⁿ dy1 . . . dy_{n−1}, (1.110)
for x ∈ Rⁿ₊ . Here K(x, y) = (2xn/(nα(n))) |x − y|^{−n} is the Poisson kernel. Note that the denominator never vanishes since xn > 0 and the distance |x − y| is computed in Rⁿ so that
|x − y| = ((x1 − y1)² + . . . + (x_{n−1} − y_{n−1})² + xn²)^{1/2}.
Of course, one should be more careful here: what (1.98) actually says is that the only
possible smooth solution of (1.109) is given by the convolution with the Poisson kernel, but
we do not know yet that the function u(x) defined by (1.110) is, indeed, a solution of (1.109).
It is quite straightforward to verify that u(x) is harmonic in the upper half-space: all we need to check for that is that
∆x K(x, y) = 0,
and that is true because (i) y is not in the interior of Rⁿ₊ – hence, K(x, y) is regular at all x ∈ Rⁿ₊ , and (ii) K(x, y) = ∂G(x, y)/∂yn , and G(x, y) is harmonic (in both x and y for x ≠ y).
It is much more delicate to verify that the boundary condition for u(x) holds, that is, that u(x) is continuous up to the boundary {xn = 0}, and that
u(x′, xn) → g(x′) as xn ↓ 0, for every x′ ∈ R^{n−1}. (1.111)
The reason why the boundary condition holds is as follows. First, one can verify easily that
∫_{R^{n−1}} K(x, y)dy = 1, (1.112)
for each fixed x ∈ Rⁿ₊ . Second, we can write the Poisson kernel as
K(x1, . . . , x_{n−1}, xn, y) = (2/(nα(n))) xn/((x1 − y1)² + . . . + (x_{n−1} − y_{n−1})² + xn²)^{n/2}
= (2/(nα(n) xn^{n−1})) [((x1 − y1)/xn)² + . . . + ((x_{n−1} − y_{n−1})/xn)² + 1]^{−n/2}.
Changing variables y = x′ − xn y′ in (1.110) and letting xn → 0 gives
lim_{xn↓0} u(x′, xn) = (2/(nα(n))) ∫_{R^{n−1}} (1 + |y′|²)^{−n/2} g(x′)dy′ = g(x′). (1.113)
In the last identity we used (1.112) with x = (0, . . . , 0, 1). The passage to the limit xn → 0
in (1.113) follows from continuity and boundedness of the function g(x). Therefore, u(x)
satisfies the boundary condition in (1.109). Note that this computation is essentially the
same as what we have done in the approximation of the Green’s function in (1.101)-(1.102).
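The convolution (1.110) can also be evaluated numerically. The following sketch (illustrative, for n = 2, with an arbitrary boundary data g and a truncated integral; for n = 2 the prefactor 2/(nα(n)) equals 1/π) checks that u(x′, xn) is close to g(x′) for small xn:

```python
import numpy as np

g = lambda y: 1.0 / (1.0 + y**2)                  # bounded continuous boundary data
y = np.linspace(-50.0, 50.0, 200001)              # truncation of the boundary line

def u(x1, x2):
    K = (x2 / np.pi) / ((x1 - y)**2 + x2**2)      # Poisson kernel (1.110), n = 2
    return np.trapz(K * g(y), y)

print(u(0.0, 0.05), g(0.0))   # u(x', x2) approaches g(x') as x2 -> 0
```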
Energy methods
Uniqueness for the boundary value problem can also be seen via an energy argument. Let u1 and u2 be two solutions of
−∆u = f in U , (1.114)
u = g on ∂U ,
and set v = u1 − u2 , which satisfies
−∆v = 0 in U , (1.115)
v = 0 on ∂U .
Let us multiply the Laplace equation (1.115) by v and integrate over the domain U :
∫_U v∆v dx = 0.
As v = 0 on ∂U , integrating by parts we conclude that
∫_U |∇v|² dx = 0,
so that ∇v ≡ 0 and v is a constant in U . Since v = 0 on ∂U , the constant is zero: v ≡ 0, and u1 = u2 .
The solution of the Poisson equation can also be characterized variationally, as the minimizer of an energy functional. In order to understand the idea, let us recall some linear algebra. Consider the equation
Ax = b, (1.116)
where A is a symmetric positive definite n × n matrix and b ∈ Rⁿ. Consider the function
G(x) = (1/2)(Ax, x) − (b, x). (1.117)
Since A is positive definite, we have, with some c > 0,
G(x) ≥ (c/2)|x|² − |b| |x|, (1.118)
and thus G(x) → +∞ as |x| → +∞. It follows that G(x) is bounded from below and attains its minimum at some point x̄ = (x̄1 , . . . , x̄n ) ∈ Rⁿ. Using basic calculus we find that this point must satisfy the equations
Σ_{j=1}^{n} Aij x̄j = bi , i = 1, . . . , n, (1.119)
which, of course, is nothing but (1.116). Therefore, solving equation (1.116) is exactly equiv-
alent to finding the minimal point of the function G(x). The latter might be a much easier
problem in many circumstances, especially so since the function G(x) is convex, thus it has a
unique minimum.
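This is easy to see in action: in the sketch below (illustrative; a random positive definite matrix A is generated just for the test) plain gradient descent on G(x) drives the residual Ax − b to zero:

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((5, 5))
A = M @ M.T + 5 * np.eye(5)        # a symmetric positive definite matrix
b = rng.standard_normal(5)

x = np.zeros(5)
for _ in range(2000):
    x -= 0.05 * (A @ x - b)        # step along -grad G(x) = -(Ax - b)
print(np.linalg.norm(A @ x - b))   # tiny residual: the minimizer solves Ax = b
```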
This idea can be generalized to many PDEs, in particular, to the Poisson equation. We define the energy functional
I[w] = ∫_U [(1/2)|∇w|² − wf] dx, (1.120)
over the admissible set of functions
A = {w ∈ C²(Ū ) : w = g on ∂U }.
Theorem 1.13 (The Dirichlet principle) A function u ∈ A solves the boundary value problem
−∆u = f in U , (1.121)
u = g on ∂U
if and only if
I[u] = min_{w∈A} I[w]. (1.122)
Proof. Let us assume that u solves (1.121), take w ∈ A, multiply (1.121) by (u − w) and integrate:
∫_U (−∆u − f )(u − w)dx = 0.
Integrating by parts using the Green's formula gives
∫_U (∇u · ∇(u − w) − f (u − w))dx = 0.
We used here the fact that u − w = 0 on ∂U to kill the boundary terms in Green's formula. It follows that
∫_U (|∇u|² − f u)dx = ∫_U (∇u · ∇w − f w)dx. (1.123)
Now comes the crucial trick: note that
|∇u · ∇w| ≤ (1/2)|∇u|² + (1/2)|∇w|².
Using this in (1.123) leads to
∫_U (|∇u|² − f u)dx ≤ ∫_U ((1/2)|∇u|² + (1/2)|∇w|² − f w)dx, (1.124)
hence,
∫_U ((1/2)|∇u|² − f u)dx ≤ ∫_U ((1/2)|∇w|² − f w)dx, (1.125)
which is nothing but I[u] ≤ I[w]. Therefore, if u solves the boundary value problem (1.121) then it minimizes the functional I[w] over w ∈ A.
To show the other direction, let u be a minimizer of I[w] over A. Take a function v that
is smooth in U and vanishes on the boundary ∂U . Consider the increment of I[w] in the
direction v:
r(s) = I[u + sv].
Then, the function u+sv is in A, and, as u minimizes I[w] over A, we should have r(s) ≥ r(0)
for all s ∈ R. The function r(s) is a quadratic function of s:
r(s) = ∫_U [(1/2)|∇u + s∇v|² − (u + sv)f] dx
= ∫_U [(1/2)|∇u|² − uf] dx + s ∫_U (∇u · ∇v − vf ) dx + (s²/2) ∫_U |∇v|² dx.
As r(s) attains its minimum at s = 0, we have r′(0) = 0, that is,
∫_U (∇u · ∇v − vf ) dx = 0. (1.126)
Since this identity holds for all smooth functions v that vanish at the boundary ∂U , integrating by parts in the first term shows that u satisfies
−∆u = f,
and this finishes the proof – the boundary condition u = g on ∂U is satisfied automatically since u ∈ A. □
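The Dirichlet principle also suggests a computational method: minimize a discretized version of I[w] instead of solving the boundary value problem directly. The sketch below (a one-dimensional illustration with g = 0 and f = sin(πx), not part of the notes) does this by gradient descent and compares the result with the exact solution of −u′′ = f:

```python
import numpy as np

# Minimize the discretized I[w] = sum_i h * ((w_{i+1}-w_i)^2/(2h^2) - w_i f_i)
# over grid functions with w_0 = w_N = 0 (so g = 0 here), by gradient descent.
N = 100
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
f = np.sin(np.pi * x)

w = np.zeros(N + 1)
for _ in range(50000):
    lap = (w[2:] - 2.0 * w[1:-1] + w[:-2]) / h**2
    w[1:-1] -= 0.45 * h**2 * (-(lap + f[1:-1]))   # gradient of the discrete energy
exact = np.sin(np.pi * x) / np.pi**2              # solves -u'' = f, u(0) = u(1) = 0
print(np.max(np.abs(w - exact)))                  # small: the minimizer solves the BVP
```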
A classical example where such a minimization problem appears is image denoising: consider the functional
J(w) = λ ∫_Ω |∇w|² dx + ∫_Ω (w − f )² dx.
Here, we think of f as a noisy measured image, Ω is the domain of the recording sensor, λ is a small parameter, and we are looking for a function w that is close to f but is "reasonably smooth" – this is why the gradient term appears in J(w). It is natural to assume that the normal derivative of w vanishes at the image edges, so that our goal is to find a minimizer of J(w) over the set
A = {u ∈ C²(Ω) ∩ C(Ω̄) : ∂u/∂ν = 0 on ∂Ω}.
An alternative is to minimize J(w) over the set
Ag = {u ∈ C²(Ω) ∩ C(Ω̄) : u = g on ∂Ω},
where g is a smoothed version of f on the boundary – the question of how we smooth f on the boundary is separate, and we do not touch upon it here. Let us assume that w is a minimizer of J(w) over A and variate J over A: take a smooth function η(x) such that ∂η/∂ν = 0 on ∂Ω and compute
J(w + sη) = λ ∫_Ω |∇w + s∇η|² dx + ∫_Ω (w + sη − f )² dx
= λ ∫_Ω |∇w|² dx + ∫_Ω (w − f )² dx + 2s [λ ∫_Ω ∇w · ∇η dx + ∫_Ω (w − f )η dx] + s² [λ ∫_Ω |∇η|² dx + ∫_Ω η² dx].
Now, for w to be a minimum of J(w) over the admissible set A, the function r(s) = J(w + sη)
should attain its minimum at s = 0 for all such test functions η, hence
λ ∫_Ω ∇w · ∇η dx + ∫_Ω (w − f )η dx = 0 (1.130)
should hold for all such η(x). Integrating by parts, this means that w should be the solution of the boundary value problem
−λ∆w + w = f in Ω, (1.131)
∂w/∂ν = 0 on ∂Ω.
Exercise 1.14 Show that if we were minimizing J(w) over the set Ag we would have arrived at the problem
−λ∆w + w = f in Ω, (1.132)
w = g on ∂Ω.
The truth is that this denoising method oversmooths and is terrible at preserving edges in an image even for small values of λ, but it is a legitimate first attempt at denoising. A much better functional to use is
J̃(w) = λ ∫_Ω |∇w| dx + ∫_Ω (w − f )² dx. (1.133)
However, minimizing J̃(w) leads to a rather complicated nonlinear PDE, so we will not consider this problem here.
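For a feeling of what the quadratic model (1.131) does, here is a one-dimensional sketch (illustrative; the noisy signal, the value of λ and the first-order Neumann discretization are my own choices, not from the notes):

```python
import numpy as np

# One-dimensional analog of (1.131): solve -lam * w'' + w = f, w'(0) = w'(1) = 0.
N, lam = 200, 1e-3
h = 1.0 / N
x = np.linspace(0.0, 1.0, N + 1)
rng = np.random.default_rng(2)
f = np.sign(np.sin(3 * np.pi * x)) + 0.3 * rng.standard_normal(N + 1)  # noisy "image"

A = np.zeros((N + 1, N + 1))
for i in range(1, N):
    A[i, i - 1] = A[i, i + 1] = -lam / h**2
    A[i, i] = 1.0 + 2.0 * lam / h**2
A[0, 0] = A[N, N] = 1.0 + lam / h**2       # one-sided (first-order) Neumann rows
A[0, 1] = A[N, N - 1] = -lam / h**2
w = np.linalg.solve(A, f)                  # w: a smoothed version of f
print(np.linalg.norm(w - f) / np.linalg.norm(f))
```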
1.4 The heat equation
The heat equation arises as a continuum limit of random walks. Consider a random walk on the lattice hZ that makes steps of size h, to the left or to the right with probability 1/2 each, at time intervals of length τ, and let u(t, x) be the probability that the walker, starting at x, lies in a given set S at time t. Conditioning on the first step gives
u(t, x) = (1/2) u(t − τ, x + h) + (1/2) u(t − τ, x − h). (1.135)
Let us assume that τ and h are small and use Taylor's formula in the right side of (1.135):
u(t, x) = (1/2)[u(t, x) − τ ∂u/∂t + h ∂u/∂x + (h²/2) ∂²u/∂x² + (τ²/2) ∂²u/∂t² − τh ∂²u/∂x∂t]
+ (1/2)[u(t, x) − τ ∂u/∂t − h ∂u/∂x + (h²/2) ∂²u/∂x² + (τ²/2) ∂²u/∂t² + τh ∂²u/∂x∂t] + . . . ,
with all derivatives evaluated at (t, x), which is
τ ∂u/∂t = (h²/2) ∂²u/∂x² + (τ²/2) ∂²u/∂t² + . . .
In order to get a non-trivial balance we set τ = h². Then the term involving utt in the right side is smaller than the rest and in the leading order we obtain
∂u/∂t = (1/2) ∂²u/∂x², (1.136)
which is the diffusion equation (we could get rid of the factor of 1/2 if we took τ = h²/2 but probabilists do not like that). It is supplemented by the initial condition
u(0, x) = 1 if x ∈ S, u(0, x) = 0 if x ∉ S.
More generally, we can take a bounded function f (x) defined on the real line and set
v(t, x) = E{f (X(t)) | X(0) = x}.
Essentially an identical argument shows that if τ = h² then in the limit h → 0 we get the following Cauchy problem for v(t, x):
∂v/∂t = (1/2) ∂²v/∂x², (1.137)
v(0, x) = f (x).
What should we expect for the solutions of the Cauchy problem given this informal probabilistic representation? First, it should preserve positivity: if f (x) ≥ 0 for all x ∈ R, we should have v(t, x) ≥ 0 for all t > 0 and x ∈ R. Second, the maximum principle should hold: if f (x) ≤ M for all x ∈ R, then we should have v(t, x) ≤ M for all t > 0 and x ∈ R
because the expected value of a quantity can not exceed its maximum. We should also expect
that maxx∈R v(t, x) decays in time, at least if f (x) is compactly supported – this is because
the random walk will tend to spread around and at large times the probability to find it on
the set where f (x) does not vanish, is small.
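The diffusive scaling τ = h² is easy to see in a simulation. In the sketch below (illustrative, not from the notes) the position of the random walk at time t = 1 has empirical variance close to t, matching the Gaussian solution of (1.136)-(1.137):

```python
import numpy as np

# Random walk with spatial step h and time step tau = h^2, run up to time t = 1.
h, t = 0.05, 1.0
nsteps = int(t / h**2)
rng = np.random.default_rng(3)
# positions after nsteps steps of +/-h, for 20000 independent walkers
X = h * (2.0 * rng.binomial(nsteps, 0.5, size=20000) - nsteps)

# in the limit the law of X(t) is Gaussian with mean 0 and variance t
print(X.var(), t)
```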
Recall that for the Laplace equation, the convolution
u(x) = ∫_{Rⁿ} Φ(x − y)f (y)dy
of the fundamental solution Φ with a function f gives a solution of the Poisson equation
−∆u = f (x) in Rⁿ.
Let us now try to find the "moral equivalent" of the fundamental solution for the heat equation. More precisely, we will look for a function G(t, x) so that the convolution
u(t, x) = ∫_{Rⁿ} G(t, x − y)f (y)dy (1.138)
solves the Cauchy problem
∂u/∂t = ∆u, u(0, x) = f (x), x ∈ Rⁿ. (1.139)
Let us first look for some symmetries that the function G(t, x) has to satisfy. The key observation is that if u(t, x) is a solution to (1.139), then for all λ > 0 the rescaled function
uλ(t, x) = u(λ²t, λx)
also solves the heat equation, but with the rescaled initial data
fλ(x) = f (λx). (1.140)
ˆ
2
u(λ t, λx) = uλ (t, x) = G(t, x − y)f (λy)dy.
This identity holds for any continuous function f (y) with compact support, hence G(t, x) has
to satisfy
1 y
G(λ2 t, λx − y) = n G(t, x − ), (1.141)
λ λ
for all λ > 0, t > 0 and x, y ∈ Rn . Denoting z = x−y/λ, we get an equivalent form of (1.141):
for any λ > 0, t > 0 and z ∈ Rn we have
1
G(λ2 t, λz) = G(t, z). (1.142)
λn
Let us choose λ = 1/√t; then (1.142) implies
G(1, z/√t) = t^{n/2} G(t, z). (1.143)
37
In other words, the function G(t, z) has to be of the form
G(t, z) = t^{−n/2} v(z/√t). (1.144)
Here we have denoted v(z) = G(1, z). This means that G(t, z) is self-similar: its shape at different times can be obtained by a simple rescaling of the "shape" of the function v(y), y ∈ Rⁿ. Let us make a more general self-similar ansatz:
u(t, x) = t^{−m} v(x/t^β), (1.145)
and see for which m and β we can find a self-similar solution of the heat equation
ut = ∆u. (1.146)
We insert the ansatz (1.145) into (1.146) and compute, with y = x/t^β:
−(m/t^{m+1}) v(y) − (β/t^{m+1}) y · ∇v(y) − (1/t^{m+2β}) ∆v(y) = 0. (1.147)
Now, for (1.147) to be true for all t > 0 and y ∈ Rⁿ, we need this equation to involve only the variable y – this forces us to take β = 1/2. With this choice of β, equation (1.147) becomes
m v(y) + (y/2) · ∇v(y) + ∆v(y) = 0. (1.148)
In order to simplify further we assume that v is radial. Actually, a good exercise is to convince
yourself that the radial symmetry of the heat equation implies that the heat kernel, if it exists,
has to be radial. In other words, this assumption should be automatically satisfied, and we
must have v(y) = w(|y|), with some function w(r), r > 0, turning (1.148) into
w′′(r) + ((n − 1)/r) w′(r) + (r/2) w′(r) + m w = 0. (1.149)
Let us add the boundary conditions w(0) = 1, w′(0) = 0 to (1.149), and look for a solution w(r) that is positive for all r > 0 and decays rapidly as r → +∞. Multiplying (1.149) by r^{n−1} gives
(r^{n−1} w′(r))′ + ((rⁿ w)/2)′ + r^{n−1} (m − n/2) w = 0. (1.150)
As we are looking for w(r) > 0, this forces the value m = n/2. With this choice of m, (1.150) can be solved, giving a solution with the properties we have required: integrating this equation once gives
r^{n−1} w′(r) + (rⁿ/2) w = 0, (1.151)
so
w′(r) = −(r/2) w,
and
w(r) = b e^{−r²/4}, (1.152)
with an arbitrary constant b > 0. Therefore, we obtain the following family of positive solutions to the heat equation:
u(t, x) = (b/t^{n/2}) e^{−|x|²/(4t)}, b > 0. (1.153)
This motivates the following
Definition 1.15 The function
G(t, x) = (4πt)^{−n/2} e^{−|x|²/(4t)}, t > 0, x ∈ Rⁿ, (1.154)
is called the heat kernel.
Note that
∫_{Rⁿ} G(t, x)dx = 1 for all t > 0. (1.155)
This is because
∫_{Rⁿ} G(t, x)dx = (4πt)^{−n/2} ∫_{Rⁿ} e^{−|x|²/(4t)} dx = (4π)^{−n/2} ∫_{Rⁿ} e^{−|z|²/4} dz = [(4π)^{−1/2} ∫_{−∞}^{∞} e^{−s²/4} ds]ⁿ,
while
[(4π)^{−1/2} ∫_{−∞}^{∞} e^{−s²/4} ds]² = (1/(4π)) ∫∫_{R²} e^{−(x²+y²)/4} dxdy = (1/(4π)) ∫_0^∞ ∫_0^{2π} e^{−r²/4} r dr dφ
= (1/2) ∫_0^∞ r e^{−r²/4} dr = ∫_0^∞ e^{−s} ds = 1.
Actually, identity (1.155) is the reason why we chose the prefactor to be (4π)−n/2 in the
definition of the heat kernel.
Notice that if u(t, x) is a solution of the heat equation, then so is the shifted func-
tion u(t, x − y) for any y ∈ Rn fixed. It follows that the function
v(t, x) = ∫_{Rⁿ} G(t, x − y)f (y)dy, t > 0, x ∈ Rⁿ, (1.156)
is also a solution of the heat equation, provided that the function f (y) is such that we can
differentiate under the integral sign above. This is true, in particular, if f (y) is bounded and
continuous. Moreover, under these assumptions we can differentiate v(t, x) as many times as
we wish – that is, v(t, x) is infinitely differentiable even if f (y) is just continuous and bounded.
Let us now see what happens to v(t, x) as t ↓ 0 – note that (1.156) defines it only for t > 0.
To this end, we first note that
(4π)^{−n/2} ∫_{Rⁿ} e^{−|z|²/4} dz = 1, (1.157)
which can be seen by taking t = 1 in (1.155). Now, let t > 0 be small and let us make a change of variables z = (x − y)/√t in (1.156):
ˆ ˆ √
1 −|x−y|2 /(4t) 1 2
v(t, x) = n/2
e f (y)dy = n/2
e−z /4 f (x − z t)dz. (1.158)
(4πt) Rn (4π) Rn
As f is continuous, we have √
f (x − z t) → f (x),
for each x ∈ Rn and z ∈ Rn fixed. Since f is also globally bounded, |f (x)| ≤ M for all x ∈ Rn ,
we can use Lebesgue dominated convergence theorem to conclude from (1.158) that
ˆ √ ˆ
1 −z 2 /4 1 2
v(t, x) = n/2
e f (x − z t)dz → f (x) n/2
e−z /4 dz = f (x). (1.159)
(4π) Rn (4π) Rn
We used identity (1.157) in the last step above. We summarize this discussion as follows.
Theorem 1.16 Let f (x) be a continuous function in Rn and assume that there exists a
constant M > 0 so that |f (x)| ≤ M for all x ∈ Rn . Then the function v(t, x) defined
by (1.156) is infinitely differentiable for all t > 0 and x ∈ Rn and, in addition, satisfies the
heat equation
∂v
− ∆v = 0, t > 0, x ∈ Rn ,
∂t
as well as the initial condition v(0, x) = f (x).
Duhamel’s principle, as discussed at the end of Section 1.2 says that solution of (1.160) can
be found as follows: for every fixed s ∈ [0, t] solve the following Cauchy problem for the
function u(t, x; s), defined for t ≥ s:
∂u(t, x; s)
− ∆u(t, x; s) = 0, t ≥ s, x ∈ Rn , (1.161)
∂t
u(t = s, x; s) = f (s, x), x ∈ Rn .
40
A solution of this problem is
ˆ
u(t, x; s) = G(t − s, x − y)f (s, y)dy.
Rn
Let us verify that this is, indeed, the case. If we could differentiate under the integral sign
that would be simple:
ˆ t
∂v(t, x) ∂u(t, x; s)
− ∆v(t, x) = u(t, x; t) + ( − ∆u(t, x; s))ds = f (t, x). (1.163)
∂t 0 ∂t
The subtlety here is that G(t − s, x − y) is singular at s = t and we need to justify this
formal procedure. This is done very similarly to what we did in the proof of Thereom 1.1 so
we will just briefly describe how this can be done. Let us rewrite v(t, x) given by (1.162) as
ˆ tˆ
v(t, x) = G(s, y)f (t − s, x − y)dyds. (1.164)
0 Rn
Then, if f is sufficiently smooth, we can argue as in the proof of the aforementioned theorem
to get
ˆ tˆ
∂v(t, x) ∂f (t − s, x − y)
− ∆v(t, x) = G(s, y) − ∆x f (t − s, x − y) dyds
∂t ∂t
ˆ 0 Rn
The potential trouble point in the first integral in the right side of (1.165) is s = 0 where G(s, y)
is singular. Hence, we take a small ε > 0 and, using also the fact that
∂f (t − s, x − y) ∂f (t − s, x − y)
∆x f (t − s, x − y) = ∆y f (t − s, x − y), =− ,
∂t ∂s
we split the integral in (1.165) as
ˆ tˆ
∂v(t, x) ∂f (t − s, x − y)
− ∆v(t, x) = G(s, y) − − ∆y f (t − s, x − y) dyds
∂t Rn ∂s
ˆ εˆ ε
ˆ
∂f (t − s, x − y)
+ G(s, y) − − ∆y f (t − s, x − y) dyds + G(t, y)f (0, x − y)dy
0 Rn ∂s Rn
= Iε + Jε + K. (1.166)
The second term in the right side above is small when ε → 0: let
∂f (t, x)
Mf = max n + max |∆f (t, x)| ,
t∈R,x∈R ∂t t∈R,x∈Rn
41
then ˆ εˆ
|Jε | ≤ Mf G(s, y)dsdy = εMf ,
0 Rn
n/2
e−|z| /4 f (t, x)dz = f (t, x).
(4π) Rn
We conclude that Duhamel’s formula, indeed, gives the solution of the inhomogeneous prob-
lem (1.160).
UT = {(t, x) : x ∈ U, 0 ≤ t ≤ T },
and
ΓT = {(t, x) : t = 0 and x ∈ U , or 0 ≤ t ≤ T and x ∈ ∂U }.
Note that the definition of the ”parabolic boundary” ΓT does not include the final time
t = T – it looks in time-space as a cylinder without the top.
42
Theorem 1.17 Let the function u(t, x) satisfy the heat equation
∂u
− ∆u = 0
∂t
in UT . Then u achieves its maximum and minimum over UT on the parabolic boundary ΓT .
In other words, u attains its maximum either at some point x ∈ U at the initial time t = 0,
or, if it attains its maximum at some point (t0 , x0 ) with t0 > 0 then x0 has to belong to the
boundary ∂U .
Proof. Take ε > 0 and consider the function
v(t, x) = u(t, x) − εt.
The function v(t, x) satisfies
∂v
− ∆v = −ε. (1.167)
∂t
Consider the domain [
ŪT = UT {(T, x) : x ∈ U },
which is the union of UT and the ”top” of the parabolic cylinder. The function v(t, x) must
attain its maximum over the set ŪT at some point (t0 , x0 ) ∈ ŪT . We claim that this point
has to lie on the parabolic boundary ΓT . Indeed, if 0 < t0 < T and x0 is not on the boundary
∂U , then the point (t0 , x0 ) is an interior maximum of v(t, x) and as such should satisfy
∂v(t0 , x0 )
= 0, ∆v(t0 , x0 ) ≤ 0,
∂t
which is impossible because of (1.167). On the other hand, if t0 = T and x0 is an interior
point of U , and u attains its maximum over Ū at this point, then we should have
∂v(t0 , x0 )
≥ 0, ∆v(t0 , x0 ) ≤ 0,
∂t
which, once again contradicts (1.167). Hence, the function v attains its maximum over ŪT at
a point (t0 , x0 ) that belongs to ΓT . It means that
max v(t, x) = max v(t, x) ≤ max u(t, x).
(t,x)∈ŪT (t,x)∈ΓT (t,x)∈ΓT
43
Theorem 1.18 Let u(t, x) be a smooth solution of the Cauchy problem
∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.168)
∂t
u(0, x) = f (x).
The proof of this theorem is more technical than in a bounded domain but the idea is similar in
spirit so we not present it here. We just mention that there are solutions of the homogeneous
Cauchy problem
∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.170)
∂t
u(0, x) = 0
that are not identically equal to zero and grow very rapidly at infinity. The role of assumption
(1.169) is to preclude those.
1.4.5 Uniqueness
Let us now go back to the setting of Theorem 1.17.
Theorem 1.19 Let g be a continuous function on the parabolic boundary ΓT and let f be
continuous in UT . Then there exists at most one smooth solution to the initial boundary value
problem
∂u
− ∆u = f, in UT (1.171)
∂t
u = g on ΓT .
Note that the condition u = g on ΓT prescribes both the initial data for u at the time t = 0
and the boundary data along ∂U at times t > 0.
Proof. This follows immediately from the maximum principle for bounded domains (The-
orem 1.17). Indeed, if u1 and u2 solve (1.171) then the difference v = u1 − u2 satisfies
∂v
− ∆v = 0, in UT (1.172)
∂t
v = 0 on ΓT .
44
The same uniqueness result holds for solutions in the whole space: there exists at most
one solution of the initial value problem
∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.173)
∂t
u(0, x) = f (x)
We will show that any solution of the heat equation, regardless of the initial condition, satisfies
dE
≤ 0, (1.176)
dt
hence E(t) ≤ E(0) for all t ≥ 0. However, if v(0, x) = 0 then E(0) = 0, so that E(t) ≤ 0 for
all t ≥ 0. As E(t) is nonnegative by definition, it follows that E(t) = 0 for all t ≥ 0, which
means that v(t, x) ≡ 0.
Let us now show that energy inequality (1.176) holds. We will do this for a general solution
of the Cauchy problem with zero boundary conditions on ∂U :
∂u
− ∆u = 0, for 0 ≤ t ≤ T , x ∈ U , (1.177)
∂t
u(t, x) = 0 for 0 ≤ t ≤ T , x ∈ ∂U ,
u(0, x) = g(x) for x ∈ U .
45
Integration by parts gives
ˆ ˆ
dE ∂u(t, x)
=2 u(t, x) dS(x) − 2 ∇u · ∇udx.
dt ∂U ∂ν U
∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.178)
∂t
u(0, x) = f (x).
The first fact to observe is that if, say, f (x) is continuous and of compact support, we can
differentiate under the integral sign with respect to both t and x arbitrarily many times. This
is because such differentiations will lead to expressions of the sort
ˆ
2
p(t, x, y)e−(x−y) /(4t) f (y)dy,
Rn
where p(t, x, y) is some polynomial in x and y with coefficients that are rational functions of
t that is regular for t > 0. Such integrals converge, which justifies the differentiation under
the integral sign (modulo some very minor details). Therefore, if f (y) is just continuous and
bounded, the function u(t, x) is infinitely differentiable for all t > 0.
We also get directly from (1.179) that
ˆ
1
|u(t, x)| ≤ |f (y)|dy. (1.180)
(4πt)n/2 Rn
46
It is remarkable that only the initial mass
ˆ
m0 = |f (y)|dy
Rn
enters into the upper bound (1.181), and not the maximum of the initial data. This means
that solutions whose initial data consists of a narrow peak drops its maximum very quickly
(a narrow peak has a small mass despite having a large maximum).
Let us now differentiate (1.179). We get for, say, the partial derivative with respect to x1 :
ˆ
∂u(t, x) 1 (x1 − y1 ) −|x−y|2 /(4t)
= n/2
e f (y)dy.
∂x1 (4πt) Rn 2t
2 −z /4
Note that the
√ function p(z) = z1 e is globally bounded:
p it attains its maximum at the
point z̄ = ( 2, 0, 0, . . . , 0) where it takes the value p(z̄) = 2/e. It follows that
ˆ
∂u(t, x) 1 (x1 − y1 ) −|x−y|2 /(4t)
∂x1 (4πt)n/2 2√t n √
≤ e |f (y)|dy
t
ˆ R
ˆ
1 C1
≤ √ |f (y)|dy = (n+1)/2 |f (y)|dy,
(4πt)n/2 2et Rn t Rn
√
with C1 = (4π)n/2 2e. Generally, we have
ˆ
C1
|∇u(t, x)| ≤ |f (y)|dy. (1.182)
t(n+1)/2 Rn
We see from (1.182) that |∇u(t, x)| decays even faster as t → +∞ than |u(t, x)|, and, more-
over, it is controlled by the initial mass of the solution, rather than by its derivatives. This is
a very important property of the heat equation: as t → +∞ both u(t, x) and its derivatives
decay to zero at a rate controlled by the initial mass. Moreover, the higher the order of the
derivative, the faster it decays in time – solution starts to look very smooth.
∂w ∂ 2 w n
− 2
− w = 0, w(0, x) = u(0, x). (1.184)
∂t ∂x 2(t + 1)
47
p
Let us now make
p a change of variables: look for w(t, x) = v(t, x/ 4(t + 1)), then we calculate,
with y = x/ 4(t + 1):
n n
∂w ∂v X xj ∂v ∂v 1 X ∂v ∂v 1
= − 3/2
= − yj = − y · ∇y v,
∂t ∂t j=1
4[(t + 1)] ∂y j ∂t 2(t + 1) j=1
∂y j ∂t 2(t + 1)
and
∂w 1 ∂v
=p .
∂xj 4(t + 1) ∂yj
Differentiating once again gives
∂ 2w 1 ∂ 2v
= .
∂xj 2 4(t + 1) ∂yj 2
Inserting this into (1.184) gives
∂v 1 1 n
− y · ∇y v − ∆y v − v = 0, v(0, y) = u(0, 2y). (1.185)
∂t 2(t + 1) 4(t + 1) 2(t + 1)
The final change of variables is to set s = (1/4) ln(t + 1), that is, v(t, y) = ṽ((ln(t + 1))/4, y),
then
∂v 1 ∂ṽ
= .
∂t 4(t + 1) ∂s
Now, (1.185) becomes
∂ṽ
− 2y · ∇y ṽ − ∆y ṽ − 2nṽ = 0, ṽ(0, y) = u(0, 2y). (1.186)
∂s
2
Equation (1.186) has a special solution v̄(y) = e−y that can be checked directly. The main
point of our analysis is to show that any solution of (1.186) converges as t → +∞ to a multiple
of v̄(y):
ṽ(s, y) → mv̄(y) as s → +∞. (1.187)
In order to find the constant m let us integrate (1.186) in y, keeping in mind the following
identities: first, ˆ ˆ
∂ṽ(s, y) d
dy = ṽ(s, y)dy.
Rn ∂s ds Rn
Next, using Green’s formula we get
ˆ ˆ ˆ
y · ∇ṽdy = − ṽ(s, y)div(y)dy = −n ṽ(s, y)dy,
Rn Rn Rn
48
that is, ˆ
d
ṽ(s, y)dy = 0.
ds Rn
This means that ˆ ˆ
ṽ(s, y)dy = u(0, 2y)dy
Rn Rn
is a conserved quantity. The only possible value for constant m is then determined from the
condition ˆ ˆ
m v̄(y)dy = u(0, 2y)dy, (1.188)
Rn Rn
whence ˆ ˆ
1 1
m= u(0, 2y)dy = u(0, y)dy. (1.189)
π n/2 Rn (4π)n/2 Rn
∂z 2
+ 2y · ∇z − ∆z = 0, z(0, y) = u(0, y)ey . (1.190)
∂s
Let us assume for the sake of simplicity of notation that the dimension n = 1 and set
∂z(s, y)
φ(s, y) = .
∂y
The function φ satisfies
∂φ ∂φ d h 2
i
+ 2y + 2φ − ∆φ = 0, φ(0, y) = u(0, y)ey . (1.191)
∂s ∂y dy
Let us multiply this equation by φ and integrate in y: first we have
ˆ ˆ
∂φ(s, y) 1 d
φ(s, y) dy = φ2 (s, y)dy.
R ∂s 2 ds R
and, finally:
ˆ ˆ
∂ 2 φ(s, y) ∂φ(s, y) 2
φ(s, y) dy = − ∂y dy.
R ∂y 2 R
49
As a consequence, we have
ˆ ˆ
2
φ (s, y)dy ≤ φ (0, y)dy e−2s → 0 as s → +∞.
2
(1.192)
R R
Recalling that φ is actually the derivative of z(s, y) with respect to y, we conclude that z(s, y)
tends to a constant as s → +∞.
Let us now re-interpret the convergence result (1.187) in terms of the original variables t
and x, and the function u(t, x) that actually solves the heat equation (1.183). Recall that
! !
1 1 x 1 ln(t + 1) x
u(t, x) = w(t, x) = v t, p = v ,p .
(1 + t)n/2 (1 + t)n/2 4(t + 1) (1 + t)n/2 4 4(t + 1)
Therefore, after long times solution starts to have the profile of a Gaussian weighted appro-
priately to have the correct mass. But the only information that remains from the initial
condition is its total mass – the rest is completely lost. All solutions ”look the same” – this
is self-similarity.
The Riemann-Lebesgue lemma shows that a continuous signal can not have too much high-
frequency content and fˆ(k) have to decay for large k (this is actually true for a much more
general class of functions). We will denote by Cper [0, 1] the class of functions that are contin-
uous on [0, 1] and f (0) = f (1).
Lemma 2.1 (The Riemann-Lebesgue lemma) If f ∈ Cper [0, 1] then fˆ(k) → 0 as k → +∞.
50
Proof. Note that
ˆ 1 ˆ 1 ˆ 1
1 −2πikx
fˆ(k) = f (x)e −2πikx
dx = − f (x)e −2πik(x+1/(2k))
dx = − f (x − )e dx,
0 0 0 2k
and thus ˆ 1
1 1
fˆ(k) = f (x) − f (x − ) e−2πikx dx.
2 0 2k
As a consequence, since f (x) is continuous on [0, 1], we have
ˆ 1
1 1
|fˆ(k)| ≤
f (x) − f (x − ) dx,
2 0 2k
The definition of the Dini kernel as a sum of exponentials implies immediately that
ˆ 1
DN (t)dt = 1 (2.1)
0
51
for all N , while the expression in terms of sines shows that
1
|DN (t)| ≤ , δ ≤ |t| ≤ 1/2.
sin(πδ)
The ”problem” with the Dini kernel is that its L1 -norm is not uniformly bounded in N .
Indeed, consider
ˆ 1/2
LN = |DN (t)|dt. (2.2)
−1/2
We compute:
ˆ 1/2 ˆ 1/2
| sin((2N + 1)πt)| | sin((2N + 1)πt)|
LN = 2 dt ≥ 2 dt
0 | sin πt| 0 |πt|
ˆ 1/2 ˆ N +1/2
1 1 | sin(πt)|
−2 | sin((2N + 1)πt)|
− dt = 2 dt + O(1)
0 sin πt πt 0 πt
N −1 ˆ
2 X 1 | sin πt|
≥ dt + O(1) ≥ C log N + O(1),
π k=0 0 t + k
which implies (2.3). This means that (2.1) holds because of cancellation of many oscillatory
terms. These oscillations may cause difficulties in the convergence of the Fourier series.
Theorem 2.2 (Dini’s criterion) Let f ∈ C[0, 1] satisfy the following condition at the point x:
there exists δ > 0 so that ˆ
f (x + t) − f (x)
dt < +∞, (2.4)
|t|<δ
t
then limN →∞ SN f (x) = f (x). In particular, this is true if f is differentiable at the point x.
Another criterion for the convergence of the Fourier series was given by Jordan:
52
The du Bois-Raymond example
In 1873, surprisingly, du Bois-Raymond proved that the Fourier series of a continuous function
may diverge at a point.
Theorem 2.4 There exists a continuous function f so that its Fourier series diverges at
x = 0.
Kolmogorov showed in 1926 that an L1 -function may have a Fourier series that diverges at
every point. Then Carelson in 1965 proved that the Fourier series of an L2 -function converges
almost everywhere and then Hunt improved this result to an arbitrary Lp for p > 1.
53
Corollary 2.6 The Parceval identity holds for any f ∈ L2 [0, 1]:
X ˆ 1
ˆ 2
|f (k)| = |f (x)|2 dx. (2.10)
k∈Z 0
54
with fn given by (2.12). Expression (2.13) conveys very effectively the regularizing properties
of the heat equation: the high frequencies are attenuated very quickly due to the exponentially
decaying factor. This is another way to express the fact that that the heat equation kills
oscillations. Another simple consequence of (2.13) is that
ˆ 1
u(t, x) → f0 = f (y)dy, as t → +∞,
0
uniformly in x – that is, in the long time limit solution of the heat equation with periodic
initial data converges to a constant equal to its spatial average.
Then, obviously: ˆ
|fˆ(ξ)| ≤ |f (x)|dx.
Rn
Moreover, the function fˆ(ξ) is continuous, and the Riemann-Lebesegue lemma is easily gen-
eralized to the Fourier transform on Rn , and
lim fˆ(ξ) = 0.
ξ→∞
For a smooth compactly supported function f ∈ Cc∞ (Rn ) we have the following remarkable
algebraic relations between taking derivatives and multiplying by polynomials:
∂f
d
(ξ) = 2πiξj fˆ(ξ), (2.14)
∂xj
and
∂ fˆ
(−2πi)(x
d j f )(ξ) = (ξ). (2.15)
∂ξj
This motivates the following definition.
Definition 2.7 The Schwartz class S(Rn ) consists of functions f such that for any pair of
multi-indices α and β
pαβ (f ) := sup |xα Dβ f (x)| < +∞.
x
As Cc∞ (Rn ) lies inside the Schwartz class, the Schwartz functions are dense in L1 (Rn ).
The main reason to introduce the Schwartz class is the following theorem.
55
Theorem 2.8 (i) The Fourier transform of a Schwartz class function f (x) is a Schwartz
class function fˆ(ξ).
(ii) For any f, g ∈ S(Rn ) we have
ˆ ˆ
f (x)ĝ(x)dx = fˆ(x)g(x)dx. (2.16)
Rn Rn
Proof. First, as
2 2 2
f (x) = e−π|x1 | e−π|x2 | . . . e−π|xn | ,
so that both f and fˆ factor into a product of functions of one variable, it suffices to consider
the case n = 1. The proof is a glimpse of how useful the Fourier transform is for differential
equations and vice versa: the function f (x) satisfies an ordinary differential equation
f 0 + 2πxf = 0, (2.18)
with the boundary condition f (0) = 1. However, relations (2.14) and (2.15) together with (2.18)
imply that fˆ satisfies the same differential equation (2.18), with the same boundary condi-
tion fˆ(0) = f (0) = 1. It follows that f (x) = fˆ(x) for all x ∈ R. 2
We continue with the proof of Theorem 2.8. Relations (2.14) and (2.15) imply that the
Fourier transform of a Schwartz class function is of the Schwartz class.
The Parceval identity can be verified directly as follows:
ˆ ˆ ˆ
f (x)ĝ(x)dx = f (x)g(ξ)e −2πiξ·x
dxdξ = fˆ(ξ)g(ξ)dξ.
Rn R2n Rn
Finally, we prove the inversion formula using a rescaling argument. Let f, g ∈ S(Rn ) then
for any λ > 0 we have
ˆ ˆ ˆ ˆ
1 ξ
f (x)ĝ(λx)dx = f (x)g(ξ)e−2πiλξ·x
dx = fˆ(λξ)g(ξ)dξ = n fˆ(ξ)g dξ.
Rn R2n λ Rn λ
Multiplying by λn and changing variables on the left side we obtain
ˆ x ˆ
ˆ ξ
f ĝ(x)dx = f (ξ)g dξ.
Rn λ Rn λ
Letting now λ → ∞ gives
ˆ ˆ
f (0) ĝ(x)dx = g(0) fˆ(ξ)dξ, (2.19)
Rn Rn
56
2
for all functions f and g in S(Rn ). Taking g(x) = e−π|x| in (2.19) and using Lemma 2.9 leads
to ˆ
f (0) = fˆ(ξ)dξ. (2.20)
Rn
The inversion formula (2.17) now follows if we apply (2.20) to a shifted function fy (x) =
f (x + y), because ˆ
ˆ
fy (ξ) = f (x + y)e−2πiξ·x dx = e2πiξ·y fˆ(ξ),
Rn
so that ˆ ˆ
f (y) = fy (0) = fˆy (ξ)dξ = e2πiξ·y fˆ(ξ)dξ,
Rn Rn
which is (2.17). 2
57
2.5 An application to the heat equation in the whole space
Consider the Cauchy problem for the heat equation
∂u
− ∆u = 0, t > 0, x ∈ Rn , (2.21)
∂t
u(0, x) = f (x), x ∈ Rn .
Taking the Fourier transform of the heat equation in the whole space gives, very similarly to
the periodic case considered in Section 2.3:
∂ û(t, ξ)
+ 4π 2 |ξ|2 û(t, ξ) = 0, (2.22)
∂t
with the initial data
û(0, ξ) = fˆ(x),
where ˆ ˆ
û(t, ξ) = u(t, x)e −2πiξ·x
dx, fˆ(ξ) = f (x)e−2πiξ·x dx.
Rn Rn
The ODE (2.22) can be solved explicitly:
2 2
û(t, ξ) = fˆ(ξ)e−4π |ξ| t . (2.23)
As in the periodic case, we see that high frequency components of the solutions of the heat
equation decay rapidly in time, leading to the regularizing effects we have discussed previously.
The inverse Fourier transform then gives
ˆ ˆ
2 2
u(t, x) = û(t, ξ)e 2πiξ·x
dξ = fˆ(ξ)e−4π |ξ| t e2πiξ·x dξ. (2.24)
Rn Rn
Let us make a couple of “regularizing observations” from (2.24). For example, we have
ˆ ˆ
ˆ −4π 2 |ξ|2 t −4π 2 |ξ|2 t ˆ
|u(t, x)| ≤ |f (ξ)|e dξ ≤ e dξ maxn (|f (ξ)|) .
Rn Rn ξ∈R
Recall that ˆ
|fˆ(ξ)| ≤ |f (x)|dx,
Rn
for all ξ ∈ Rn . It follows that
ˆ ˆ ˆ
−4π 2 |ξ|2 t 1
|u(t, x)| ≤ |f (x)|dx e dξ = |f (x)|dx,
Rn Rn (4πt)n/2 Rn
an estimate we have seen before but now obtained without the use of the Green’s function.
We may also recover self-similarity of the √ solution of the heat equation in the long time
limit: consider t √1 and let us look at x ∼ t: mathematically, this means that we fix y ∈ R
and consider u(t, y t) with the idea of then letting t → +∞ with y fixed. We have
√ ˆ √
2 2
u(t, y t) = fˆ(ξ)e−4π |ξ| t e2π tiξ·y dξ.
Rn
58
√
Let us make a change of variable k = 2 πtξ:
√ ˆ ˆ(0) ˆ
1 k 2 i√πk·y f 2 √
u(t, y t) = √ n fˆ √ e−π|k|
e dk ≈ √ n e−π|k| e2πik·(y/2 π) dk.
(2 πt) Rn 2 πt (2 πt) Rn
Lemma 2.9 implies that
ˆ
fˆ(0) −π|y|2 /(4π)
2
√ e−|y| /4
u(t, y t) ≈ e = f (z)dz,
(4πt)n/2 (4πt)n/2 Rn
which is the self-similarity formula (1.193) we have obtained before by a rather more compli-
cated approach.
Finally, in order to obtain the formula for the solution of the heat equation in terms of
the heat kernel we start with (2.24) and write
ˆ ˆ
ˆ −4π 2 |ξ|2 t −2πiξ·x 2 2
u(t, x) = f (ξ)e e dξ = f (y)e−2πiξ·y e−4π |ξ| t e2πiξ·x dξdy, (2.25)
Rn R2n
59
The functions f and g represent the right and left going waves, respectively.
Let us now consider the Cauchy problem
1 ∂ 2φ ∂ 2φ
− 2 = 0, (3.4)
c2 ∂t2 ∂x
φ(0, x) = p(x),
φt (0, x) = q(x).
Here we use the notation φt = ∂φ/∂t. Note that, since the wave equation is of the second
order in time, we need to prescribe both the initial value of φ and the value of its derivative
at the time t = 0. We would like to express the solution of (3.4) in the form (3.3), that is, as
a sum of left and right going waves. for this decomposition to hold we need
p(x) = f (x) + g(x),
q(x) = −cf 0 (x) + cg 0 (x).
it follows that
1 1
g 0 (x) = (cp0 (x) + q(x)), f 0 (x) = (cp0 (x) − q(x)),
2c 2c
and thus we may take
ˆ x ˆ x
1 1 1 1
g(x) = p(x) + q(y)dy, f (x) = p(x) − q(y)dy. (3.5)
2 2c 0 2 2c 0
Expression (3.7) is known as d’Alembert’s formula. One corollary is that, unlike for the heat
equation, solution at times t > 0 is no more regular than at t = 0: if, say, p(x) has only five
derivatives, the function u(t, x) will not be any more regular than that. If p(x) is oscillatory,
then u(t, x) will also be oscillatory and so on.
The idea that solution may be decomposed into such primitive building blocks goes much
further than this trivial example. Another useful observation that is obvious in the one-
dimensional case is that solutions propagate with a finite speed: if φ(0, x) = 0 in an interval
(a − r, a + r) of length 2r centered at x = a then φ(t, a) = 0 for all t ≤ T0 = r/c.
60
Let ˆ
φ̂(t, ξ) = φ(t, x)e−2πiξ·x dx
R
∂ 2 φ̂
+ 4π 2 c2 |ξ|2 φ̂ = 0, (3.9)
∂t2
with the initial data
φ̂(0, ξ) = p̂(ξ), φ̂t (0, ξ) = q̂(ξ). (3.10)
It follows from (3.9) that
The coefficients A(ξ) and B(ξ) are determined by the initial condition (3.10):
hence
1 q̂(ξ) 1 q̂(ξ)
A(ξ) = p̂(ξ) − , B(ξ) = p̂(ξ) + ,
2 4πicξ 2 4πicξ
leading to
1 q̂(ξ) −2πicξt 1 q̂(ξ)
φ̂(t, ξ) = p̂(ξ) − e + p̂(ξ) + e2πicξt . (3.11)
2 4πicξ 2 4πicξ
Note that the singularities at ξ = 0 in the two terms above cancel each other out, and taking
the inverse Fourier transform gives
ˆ h
1 q̂(ξ) −2πiξct+2πiξx 1 q̂(ξ) 2πiξct+2πiξx
i
φ(t, x) = p̂(ξ) − e + p̂(ξ) + e dξ
2 4πicξ 2 4πicξ
ˆ
R
1 1 x+ct
= (p(x − ct) + p(x + ct)) + q(y)dy, (3.12)
2 2c x−ct
which is nothing but d’Alembert’s formula (3.7). The reader should fill out the details in the
last step above!
It is straightforward to verify that energy of (at least sufficiently smooth) solutions of (3.1)
is preserved: E(t) = E(0). Indeed, we have
ˆ
dE(t) 1
= 2
φt (t, x)φtt (t, x) + ∇φ(t, x) · ∇φt (t, x) dx.
dt Rn c (x)
61
Using the wave equation (3.13) we can re-write this as
ˆ
dE(t)
= [φt (t, x)∆φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx.
dt Rn
Integrating by parts in the first term (the boundary terms at infinity vanish) gives
ˆ
dE(t)
= [−∇φt (t, x) · ∇φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx = 0,
dt Rn
Let us assume that that (3.14) has two solutions φ1 (t, x) and φ2 (t, x) and set v(t, x) = φ1 (t, x)−
φ2 (t, x). The function v(t, x) satisfies
1 ∂ 2v
− ∆v = 0, (3.15)
c2 (x) ∂t2
v(0, x) = 0,
vt (0, x) = 0.
62
Fix a point x0 ∈ Rn and a time t0 > 0. We will show that u(t0 , x0 ) depends only on the values
of the functions p(x) and q(x) in the ball B(x0 , ct0 ) that is centered at x0 and has the radius
r0 = ct0 . More precisely, if u(t, x) solves the Cauchy problem (3.16), and ψ(x) solves (3.16)
with the initial data
ψ(0, x) = p̃(x), ψt (0, x) = q̃(x),
and p̃(x) = p(x), q̃(x) = q(x) for all x such that |x − x0 | ≤ ct0 , then u(t0 , x0 ) = ψ(t0 , x0 ). In
order to show this, consider φ(t, x) = u(t, x) − ψ(t, x). This function satisfies
1 ∂ 2φ
− ∆φ = 0, (3.17)
c2 ∂t2
φ(0, x) = p(x) − p̃(x), x ∈ Rn ,
φt (0, x) = q(x) − q̃(x), x ∈ Rn .
and
p(x) − p̃(x) = q(x) − q̃(x) = 0 for all x such that |x − x0 | ≤ ct0 . (3.18)
Let us define ˆ
1 1 2 2
e(t) = φ (t, x) + |∇φ(t, x)| dx,
2 |x−x0 |≤c(t0 −t) c2 t
that is, the portion of the total energy contained in the ball |x − x0 | ≤ c(t0 − t) at a time
0 ≤ t ≤ t0 . Note that we have e(0) = 0 because of the initial conditions. Let us re-write e(t)
in the polar coordinates centered around the point x0 :
ˆ ˆ
1 c(t0 −t)
1 2
e(t) = 2 t
2
φ (t, x0 + rω) + |∇φ(t, x0 + rω)| rn−1 dS(ω)dr,
2 0 S n−1 c
where S n−1 is the (n−1)-dimensional sphere of all possible directions. Let us now differentiate
e(t):
ˆ0 −t)
c(t ˆ
de(t) 1
= φt (t, x0 + rω)φtt (t, x0 + rω) + ∇φ(t, x0 + rω) · ∇φt (t, x0 + rω)
dt c2
0 S n−1
n−1
×r dS(ω)dr (3.19)
ˆ
c 1 2
− φ (t, x0 + c(t0 − t)ω) + |∇φ(t, x0 + c(t0 − t)ω)|2 cn−1 (t0 − t)n−1 dS(ω).
2 S n−1 c2 t
63
Using the fact that φ solves the wave equation in the first line above we get
ˆ
1
φt (t, x)φtt (t, x) + ∇φ(t, x) · ∇φt (t, x) dx (3.21)
c2
|x−x0 |≤c(t0 −t)
ˆ
= [φt (t, x)∆φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx.
|x−x0 |≤c(t0 −t)
Using Green’s formula in the first term on the second line gives
ˆ
[φt (t, x)∆φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx
|x−x0 |≤c(t0 −t)
ˆ
= [−∇φt (t, x) · ∇φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx
|x−x0 |≤c(t0 −t)
ˆ ˆ
∂φ(t, y) ∂φ(t, y)
+ φt (t, y) dS(y) = φt (t, y) dS(y).
∂ν ∂ν
|y−x0 |=c(t0 −t) |y−x0 |=c(t0 −t)
We conclude that e(t) ≤ e(0) for all 0 ≤ t ≤ t0 . Recall that e(0) = 0 (from the initial data)
and e(t) ≥ 0 by its very definition. Therefore, we should have e(t) = 0 for all 0 ≤ t ≤ t0 ,
which means that
φ(t, x) = 0 for all |x − x0 | ≤ ct0 ,
as we have claimed.
utt − c2 ∆u = 0 (3.24)
64
provided that |k|2 = 1:
n
X
2 00
2
utt − c ∆u = c f − c 2
kj2 f 00 = 0.
j=1
One can look at the plane waves as ”building blocks” and ask if a general solution of the
wave equation may be decomposed into plane waves. The simplest example is given by the
d’Alembert formula
u(t, x) = h(x − ct) + p(x + ct) (3.25)
for a general solution for the Cauchy problem for the one-dimensional wave equation that we
have already discussed in Section 3.1. The d’Alembert formula (3.25) decomposes an arbitrary
solution of the wave equation into a sum of a left and right going waves. In dimensions higher
than one we have to decompose over waves going in all directions. Hence, we seek a general
solution of the wave equation
utt − c2 ∆u = 0 (3.26)
in the form of a ”sum”– actually, an integral – over plane waves
ˆ
u(t, x) = h(k · x − ct, k)dk, (3.27)
Sn−1
where Sn−1 is the (n − 1)-dimensional sphere of directions. As each function h(k · x − ct, k)
is a solution of the wave equation, so is the integral over k, thus we know that any u(t, x)
defined by (3.27) is a solution of the wave equation, no matter what function h(s, k) we choose.
Therefore, in order to ensure that u(t, x) given by (3.27) solves the Cauchy problem for the
wave equation:
1
utt − ∆u = 0, t > 0,
c2
u(0, x) = f (x),
ut (0, x) = g(x),
we only have to choose the function h(s, k) so as to match the initial conditions:
ˆ
u(0, x) = f (x) = h(k · x, k)dk (3.28)
ˆ
Sn−1
with some prescribed function f (x) and g(x). If we decompose h(s, k) into its even and odd
parts, that is,
h(s, k) + h(−s, −k) h(s, k) − h(−s, −k)
h(s, k) = l(s, k) + q(s, k), l(s, k) = , q(s, k) = ,
2 2
(3.30)
we see that (3.28)-(3.29) become
ˆ ˆ
f (x) = l(x · k, k)dk, g(x) = −c qs (x · k, k)dk. (3.31)
Sn−1 Sn−1
We have used here the fact that the functions q(k · x, k) and ls (k · x, k) are odd in k for
any x ∈ Rn fixed, and the integral of an odd function over a sphere vanishes.
65
The Radon transform
Hence, we have reduced the problem of finding a plane wave decomposition for a general
solution to the wave equation to the following problem: given a function f (x), x ∈ Rn find
an even function l(s, k), with s ∈ R, k ∈ Sn−1 , such that
ˆ
f (x) = l(x · k, k)dk, l(s, k) = l(−s, −k). (3.32)
Sn−1
Now, if the space dimension n is odd (so that ρn−1 = (−ρ)n−1 ), we may re-write the above
integral as
ˆ ˆ ∞ ˆ ˆ 0
1 2πiρk·x ˆ 1
f (x) = e n−1
f (ρk)ρ dρdk + e−2πiρk·x fˆ(−ρk)ρn−1 dρdk
2 Sn−1 0 2 Sn−1 −∞
ˆ ˆ ∞
1
= e2πiρk·x fˆ(ρk)ρn−1 dρdk. (3.33)
2 Sn−1 −∞
Thus, we have constructed the Radon transform of the function f explicitly: if we set
ˆ
1 ∞ 2πiρs ˆ
l(s, k) = e f (ρk)ρn−1 dρ, (3.34)
2 −∞
then (3.33) says that
ˆ
f (x) = l(k · x, k)dk.
Sn−1
66
Here, we have defined ˆ
M (s, k) = g(x)dΣ
x·k=s
as the integral of the function g over the hyperplane x · k = s. On the other hand we may
also write, using the Plancherel identity
ˆ ˆ ˆ ˆ
1
f (x)ḡ(x)dx = fˆ(ξ)ĝ(ξ)dξ = fˆ(ρk)ĝ(ρk)ρn−1 dρdk. (3.37)
Rn Rn 2 Sn−1 R
In the last step above we passed to the spherical coordinates: ξ = ρk with ρ ≥ 0 and k ∈ Sn−1 ,
and then used the same trick as in (3.33) that relied on the fact that the spatial dimension n
is odd. Note that (3.34) means that the Fourier transform of l(s, k) in the s-variable is
ˆ ∞
˜l(ρ, k) = 1
e−2πiρs l(s, k)ds = fˆ(ρk)ρn−1 . (3.38)
−∞ 2
Similarly, if we denote m = Rg the Radon transform of g, we have
ˆ ∞
1
m̃(ρ, k) = e−2πiρs m(s, k)ds = ĝ(ρk)ρn−1 . (3.39)
−∞ 2
Using (3.38) and (3.39) in (3.37) gives
ˆ ˆ ˆ ˆ ˆ
1 ˆ n−1 ˜l(ρ, k) m̃(ρ, k) dρdk.
f (x)ḡ(x)dx = f (ρk)ĝ(ρk)ρ dρdk = 2 (3.40)
2 Sn−1 R Sn−1 R ρn−1
Using again the Plancherel identity (now in the ρ-variable) gives
ˆ ˆ ˆ
f (x)ḡ(x)dx = l(s, k)h̄(s, k)dsdk, (3.41)
Rn Sn−1 R
with the function h(s, k) having the following Fourier transform in s:
m̃(ρ, k)
h̃(ρ, k) = 2 .
ρn−1
Comparing (3.36) and (3.40) we see that
m̃(ρ, k)
M̃ (ρ, k) = 2 .
ρn−1
In other words, the Fourier transform in the s-variable of the Radon transform m = Rg of a
function g can be written in terms of g as
ˆ
1 n−1
m̃(ρ, k) = ρ M̃ (ρ, k), M (s, k) = g(x)dΣ. (3.42)
2 x·k=s
Taking the inverse Fourier transform we arrive at an alternative, and much more geometric,
expression for the Radon transform of a function g(x):
ˆ
1 ∂ n−1 M (s, k)
(Rg)(s, k) = , M (s, k) = g(x)dΣ. (3.43)
2(2πi)n−1 ∂sn−1 x·k=s
Note that, as n is odd, in−1 = ±1, depending on whether n = 2k + 1 with an even or odd k.
As a consequence of (3.43) we obtain the following theorem.
Theorem 3.1 Let the dimension n of the space be odd. If the function g(x) vanishes outside
a ball {|x| ≤ r} then its Radon transform m(s, k) = (Rg)(s, k) vanishes for |s| > r.
67
The Huygens principle
In order to formulate the Huygens principle we need the notion of the domain of influence. We
say that a space-time point (t, z) ∈ Rn+1 is in the domain of influence of a spatial point x ∈ Rn
if given any spatial neighborhood Ux of the point x and any space-time neighborhood Vt,z
of the point (t, z) there exist two solutions of the wave equation u(t, x) and w(t, x) such
/ Ux but there exists (s0 , z 0 ) ∈ Vt,z such
that u(0, x) = w(0, x) and ut (0, x) = wt (0, x) for all x ∈
0 0 0 0
that u(s , z ) 6= w(s , z ). The Huygens principle states the following.
Theorem 3.2 If n is odd, then the domain of influence of the origin in Rn is the double
cone {(t, x) ∈ Rn+1 : |x| = ct}.
Proof. We have to show that the points (t, x) with |x| = 6 ct lie outside the domain of influence
of the origin. That is, we have to find a ball Uε = {y ∈ Rn : |y| < ε} and a space-time
neighborhood Vt,x of (t, x) so that if the Cauchy data for a pair of solutions coincide outside Uε
at t = 0, then the solutions coincide also in Vt,x . This is equivalent to saying that if the Cauchy
data vanishes outside Uε then the corresponding solution of the wave equation vanishes in Vt,x .
Note first that if |x| > ct, then we can choose ε = (|x| − ct)/2. Indeed, if initially both u(0, x)
and ut (0, x) vanish outside of the ball {|x| < ε} then they vanish in the ball of radius ct
around the point x, hence the finite speed of propagation that we have already proved implies
that u(t, x) = 0. Thus, we only need to consider the case |x| < ct.
To this end, we need a slight diversion. Recall that the Fourier transform of l(s, k) in s is
68
Thus, the anti-derivative W1 (s) has a zero integral itself if, in addition to (3.46) we have
ˆ L
sw(s)ds = 0. (3.47)
−L
wtt − ∆w = 0
69
which decays rapidly for large wave numbers k – oscillations dissipate.
The geometric optics studies oscillatory solutions and purports to estimate the error in
terms of a small parameter ε 1 which is the inverse of the non-dimensional wave length.
We consider oscillatory solutions of the wave equation with constant coefficients:
with oscillatory initial data uε (0, x) = eiφ(x)/ε aε (x), uεt (0, x) = 0. When the phase φ(x) = k · x
is a linear function we are back to the plane waves. Now, however, we are interested in more
general phases and, rather more importantly, in theregime ε 1. We look for solutions in
the same from as the initial data:
where we have defined the symbol L̃(ω, k) = ω 2 − c2 |k|2 and the linear operator
∂φ ∂aε
V φ aε = − c2 ∇φ · ∇aε .
∂t ∂t
In order for u of the form (3.49) to be an approximate solution of (3.48) the leading order in
ε in (3.50) (that is, O(ε−2 )) should vanish. This leads to the eikonal equation
2
∂φ
L̃(φt , ∇φ) = − c2 |∇φ|2 = 0. (3.51)
∂t
Solutions of the eikonal equation may be constructed in the same way as for the progressing
waves, when initially ∇φ0 6= 0 everywhere. Then we pick up one branch of φt = ±|∇φ| and
construct solutions by means of the Hamilton-Jacobi theory – we will return to this issue a
little later.
Assuming that solutions of the eikonal equation exist, at least locally in time, we may
proceed to find an equation for the amplitudes aj . The term of order ε−1 in (3.50) has the
form
2Vφ a0 + (Lφ)a0 = 0. (3.52)
This is a first order linear (once the phase φ is determined from the eikonal equation) equation
for a0 , also known as the transport equation. We say that the characteristics of the linear
operator Vφ are called rays. Then (3.52) is an ODE along the rays. In order for the transport
equation to have a solution for an initial value problem we require that φt 6= 0 initially, which
means that |∇φ| = 6 0 as well. Then the eikonal equation becomes φt = ±|∇φ|, depending on
the sign of φt at t = 0, and the equation for a0 admits a smooth solution.
Equations for the higher order coefficients are obtained from the higher order terms in ε
in (3.50): this leads to
1
2Vφ an + (Lφ)an + Lan−1 = 0. (3.53)
i
70
This is a system of first order linear (again, once the phase φ is determined from the eikonal
equation) equations for an – each one has the same family of rays as its characteristics. They
can be solved to provide an expansion for aε up to any order.
The accuracy of the approximation of uε (t, x) by a partial sum
k
X
ukε (t, x) iφ(t,x)/ε
=e εl al (t, x)
l=0
depends on the chosen norm. This is related to the fact that as uε is oscillatory, so its
derivatives are large.
Theorem 3.3 Let the initial data φ(0, x) and a(0, x) be in L2 (Rn ), and let T̄ be such that a
smooth solution of the eikonal equation (3.51) exists for all 0 ≤ t < T̄ and let 0 ≤ T < T̄ .
There exists a constant CT so that we have, uniformly in 0 ≤ t ≤ T , the following estimate:
The constant CT depends on the time T and the Cauchy data for uε at t = 0.
4 Method of characteristics
The linear equations
Consider a linear first order equation
n
X ∂u
aj (x) = c(x)u + f (x), x ∈ Rn , (4.1)
j=1
∂xj
where aj (x), c(x) and f (x) are all prescribed functions. We also prescribe the solution u(x)
along some (n−1)-dimensional surface Γ: u(y) = g(y) for all y ∈ Γ, with a prescribed function
g. The question we are interested in, is whether we can extend u(x) outside of Γ. The idea of
the method of characteristics is as follows: we look for a family of curves X(s; y), that start at
a point y on Γ (X(0; y) = y), and s is a parameter along the curve, such that U (s) = u(X(s))
can be computed ”easily” given that we know U (0) = g(y). This will extend the function u
to all points where characteristics can reach in a unique way.
In order to understand what characteristic curves we should choose, let us differentiate
U (s) with respect to s:
n
dU X ∂u(X(s)) dXj (s)
= . (4.2)
ds j=1
∂x j ds
In order to relate (4.2) to the partial differential equation (4.1) it is convenient to choose X(s)
as a solution of a system of ordinary differential equations
dXj
= aj (X(s)), j = 1, . . . , n, (4.3)
ds
71
with a prescribed initial condition X(0) = y, where y ∈ Γ is a given point on Γ. With this
choice of the curve X(s), equation (4.2) becomes
n
dU X ∂u(X(s)
= aj (s) = c(X(s))U (s) + f (s). (4.4)
ds j=1
∂xj
Now, if we can solve the characteristics equations (4.3), we can insert the solution X(s) into
the scalar ODE
dU
= c(X(s))U (s) + f (X(s)), (4.5)
ds
and solve it. This will provide the solution to the PDE with the boundary condition, at the
points that can be reached by the characteristics.
Consider an example
ux1 + ux2 + u = 1, (4.6)
subject to the initial condition
x1 = y1 + s, x2 = y2 + s, y2 = y1 + y12 , y1 > 0.
x2 − (x1 − y1 ) = y1 + y12 ,
hence √
y1 = x2 − x1 .
We see immediately several issues: first, solution exists only if x2 > x1 – otherwise, a charac-
teristic curve that passes through a point (x1 , x2 ) does not cross the curve Γ at all, meaning
that solution at (x1 , x2 ) is not defined! If x2 > x1 , then s is
√
s = x1 − y 1 = x1 − x2 − x 1 , (4.8)
72
and √ √
u(x1 , x2 ) = 1 − 1 − sin( x2 − x1 ) e−x1 + x2 −x1 .
Note that this solution is not differentiable at the points where x1 = x2 – this is because this
characteristic curve is tangent to the curve Γ at (0, 0). In general, for the boundary value
problem to have a unique solution characteristics should not be tangent to Γ at any point.
If we consider the same example as before but now prescribe the initial data along the
whole curve y2 = y1 + y12 , without the restriction y1 > 0 we will see that at each point (x1 , x2 )
with x2 > x1 the characteristic curve will cross Γ at two points, with
√
y1 = ± x2 − x1 .
This is problematic – which s should we choose then in the formula for u(x1 , x2 )? This
problem comes, once again, from the fact that the characteristic is tangent to Γ at the point
(0, 0).
Generally, solutions exist only in a neighborhood of Γ and only if the characteristics coming
out of Γ are never tangent to it.
Let us consider another example:
∂u ∂u
−x2 + x1 = 0, (4.9)
∂x1 ∂x2
with the initial condition u(x1 , 0) = ψ(x1 ). The characteristics are
dX1 dX2
= −X2 (s), = X1 (s), X1 (0) = y1 , X2 (0) = 0.
ds ds
The solution is
X2 (s) = y1 sin s, X1 (s) = y1 cos s.
The equation for U (s) is
dU
= 0, U (0) = ψ(y1 ),
ds
hence
U (s) = ψ(y1 ).
Therefore, characteristics are circles, and given a point (x1 , x2 ), we have
It follows that we can not define a solution in any open region that includes all of the real
line – but if we restrict Γ to be Γ0 = {(x1 , 0) : x1 > 0}, we can solve the problem with the
solution being q
u(x1 , x2 ) = ψ( x21 + x22 ).
Once again, this solution is not differentiable at the origin, where the characteristics have a
singular point.
73
Nonlinear equations
Now, we consider a nonlinear first-order PDE
F (∇u, u, x) = 0, (4.10)
Let us also differentiate (4.10) with respect to xi , with the variable p = ∇u in the argument
of F : n
∂F ∂F ∂u X ∂F ∂ 2 u
+ + = 0. (4.13)
∂xi ∂u ∂xi j=1 ∂pj ∂xi ∂xj
74
Let us consider an example
∂u ∂u
= u, (4.19)
∂x1 ∂x2
in the right half-plane {x1 > 0} with the boundary condition u(0, x2 ) = x22 . Now, F (p, u, x) =
p1 p2 − u, so characteristics become
dX1 dX2
= P2 (s), = P1 (s)
ds ds
dU
= 2P1 (s)P2 (s), (4.20)
ds
dP1 dP2
= P1 (s), = P1 (s).
ds ds
Integrating these equations gives
and
X1 (s) = X1 (0) + P2 (0)(es − 1), X2 (s) = X2 (0) + P1 (0)(es − 1). (4.21)
At the boundary line {x1 = 0} we have X1 (0) = 0, U (0) = (X2 (0))2 and P2 (0) = 2X2 (0). We
may also find P1 (0) from the PDE
∂u ∂u
= u,
∂x1 ∂x2
which at the point (0, X2 (0)) becomes
which gives P1 (0) = X2 (0)/2. Now, given a point (x1 , x2 ) with x1 > 0 let us find the point
(0, X2 (0)) such that the characteristic coming out of (0, X2 (0)) passes through (x1 , x2 ): using
this in (4.20) leads to
1
x1 = 2X2 (0)(es − 1), x2 = X2 (0) + X2 (0)(es − 1),
2
so that
1 x1 + 4x2
X2 (0) = x2 − x1 , es =
4 4x2 − x1
It follows that
2s
2s 2 2s (x1 + 4x2 )2
u(x1 , x2 ) = U (0) + P1 (0)P2 (0) e − 1 = U (0)e = (X2 (0)) e = . (4.22)
16
75
with a prescribed initial data u(0, x) = g(x). The most canonical example is F (u) = u2 /2,
which is known as the Burgers’ equation:
ut + uux = 0. (5.2)
As we will see, solutions of (5.1) do not necessarily stay continuous for all times t > 0
even if the initial data g(x) is infinitely differentiable. Hence, we need first to devise a
way of formulating the initial value problem for the smooth solutions of the PDE in a way
that does not involve derivatives. That formulation would apply equally well then to non-
smooth solutions. The main difficulty will be to choose this ”derivative-free” formulation in
a physically meaningful way. To begin, we let u(t, x) be a (smooth) solution of (5.1) and
multiply this equation by a smooth ”test function” v(t, x) of compact support, and integrate
by parts (the terms at x = ±∞ vanish because v(t, x) has compact support):
ˆ ∞ˆ ∞
0= v(t, x)(ut + (F (u))x )dxdt
ˆ ∞ˆ ∞
0 −∞
ˆ ∞
=− [vt u + F (u)vx ]dxdt − v(0, x)u(0, x)dx. (5.3)
0 −∞ −∞
This identity should hold for any smooth compactly supported function v(t, x). The advantage
of (5.4) is that it makes sense for any bounded function u(t, x) – it does not involve any
derivatives of u(t, x). We say that u(t, x) is an integral solution of (5.1) if the integral identity
(5.3) holds for any smooth function v(t, x) of compact support.
In order to understand the implications of this definition, consider a particularly simple
piece-wise constant solution u(t, x) such that u(t, x) = ul for x < x(t), and u(t, x) = ur for
x > x(t). We would like to understand how the point x(t) should evolve for u(t, x) to be
an integral solution of the conservation law (5.1). Take a smooth test function v(t, x) that
vanishes at t = 0, then (5.4) is
ˆ ∞ˆ ∞
0= [vt u + F (u)vx ]dxdt (5.5)
0 −∞
ˆ ∞ ˆ x(t) ˆ ∞ ˆ ∞
= [ul vt + F (ul )vx ]dxdt + [ur vt + F (ur )vx ]dxdt.
0 −∞ 0 x(t)
76
As v(0, x) = 0 for all x, and v(t, x) = 0 for all t ≥ T for some T > 0, the first term in the
right side vanishes, giving
ˆ ∞ ˆ x(t) ˆ ∞
ul vt dtdx = − ul v(t, x(t))ẋ(t)dt. (5.8)
0 −∞ 0
exist. Then a nearly identical computation as above shows that u(t, x) should satisfy
ut + (F (u))x = 0, (5.13)
both in the region x < x(t) and in x > x(t), while the Rankine-Hugoniot condition should
still hold:
F (ul (t)) − F (ur (t))
ẋ(t) = . (5.14)
ul (t) − ur (t)
The basic example of such solution is a shock of the Burgers’ equation:
ut + uux = 0, (5.15)
77
Then the solution is piece-wise constant:
1, x < x(t)
u(t, x) = (5.17)
0, x > x(t).
The discontinuity point x(t) moves according to the Rankine-Hugoniot condition correspond-
ing to the flux F (u) = u2 /2, ul = 1, ur = 0:
1/2 − 0 1
ẋ = = ,
1−0 2
and the shock position is x(t) = t/2 – the shock moves with the constant speed v = 1/2:
1, x < t/2
u(t, x) = (5.18)
0, x > t/2.
Let us now consider a flipped initial condition:
0, x < 0
u(0, x) = (5.19)
1, x > 0.
Then the Rankine-Hugoniot condition would give
0 − 1/2 1
ẋ(t) = = ,
0−1 2
and the solution is
0, x < t/2
u(t, x) = (5.20)
1, x > t/2.
However, there is another integral solution with the same initial condition:
0, x < 0,
u(t, x) = x/t, 0 < x < t, (5.21)
1, x > t.
This solution is called a rarefaction wave. It is easy to check that u(t, x) is an integral
solution because it is continuous, piece-wise differentiable, and on each interval where it is
differentiable, it solves (5.1). This example shows that an integral solution is not, generally,
unique, and an important question is to distinguish between various integral solutions.
ut + uux = 0, (5.22)
and try to solve it by the method of characteristics. The general method described in (4.18)
certainly applies, but let us work it out explicitly once again. The full characteristics curve
is (we have trivially T (s) = s, so we do not have to introduce this component)
78
with
∂u(s, X(s)) ∂u(s, X(s))
U (s) = u(s, X(s)), Px (s) = , Pt (s, X(s)) = .
∂x ∂t
Then we compute
dPx
= uxt (s, X(s)) + uxx Ẋ(s), (5.23)
ds
and
dPt
= utt (s, X(s)) + utx Ẋ(s). (5.24)
ds
On the other hand, differentiating the Burgers’ equation (5.22) in x and t, respectively, we
get
utx + u2x + uuxx = 0, utt + ut ux + uuxt = 0. (5.25)
Thus, if we choose
dX(s)
= U (s, X(s)), (5.26)
ds
and use (5.25) in (5.23) and (5.24), we would get
dPx dPt
= −Px2 , = −Pt Px . (5.27)
ds ds
Finally, the Burgers’ equation (5.22) itself means that
dU (s)
= ut + ux Ẋ(s) = 0, (5.28)
ds
because of (5.26). Summarizing, the characteristics are
dX dU dPx dPt
= U, = Px U + Pt , = −Px2 , = −Px Pt ,
ds ds ds ds
starting at
X(0) = y, U (0) = g(y), Px (0) = g 0 (y), Pt (0) = −g(y)g 0 (y).
The solution is
Px (0) g 0 (y) g(y)g 0 (y)
Px (s) = = 0 , Pt (s) = −g(y)Px (s) = − ,
1 + sPx (0) g (y) + s 1 + sg 0 (y)
and, finally, U (s) = U (0) = g(y) – the function u(t, x) is constant along the characteristics.
The projection of the characteristics on the physical space are straight lines
We can not prevent the characteristics from crossing each other in the physical space – this
is what leads to singularities and shocks. However, we can impose a condition that would
ensure that if we start a characteristic at a point (t, x) and run it backwards in time, then it
will not hit any other characteristic before it reaches t = 0. To guarantee this, we require that
if x = x(t) is the discontinuity curve for the solution, then no characteristic comes out of this
curve. This is known as the entropy condition. In the case of a shock of the Burgers’ equation
79
this is ensured if ul > ur – this follows from the simple geometric consideration on the the
plane (x, t), the explicit formula (5.29) for the characteristic, and the fact that U (s) = g(y)
along the characteristic.
Going back to the example of the initial data (5.19) for the Burgers’ equation, we see that
the discontinuous solution does not satisfy the entropy condition, while the rarefaction wave
does.
How does this change for a general conservation law? Consider a general equation of the
form
ut + F 0 (u)ux = 0. (5.30)
The characteristics
(X(s), U (s), Px (s), Pt (s))
now satisfy
dX dU dPx dPt
= F 0 (U ), = Px F 0 (U ) + Pt , = −F 00 (U )Px2 , = −F 00 (U )Px Pt ,
ds ds ds ds
starting at
X(0) = y, U (0) = g(y), Px (0) = g 0 (y), Pt (0) = −g(y)g 0 (y).
The solution is
Px (0) g 0 (y)
U (s) = U (0) = g(y), Px (s) = = ,
1 + sF 00 (g(y))Px (0) g 0 (y) + F 00 (g(y))s
g(y)g 0 (y)
Pt (s) = −g(y)Px (s) = − .
1 + sF 00 (g(y)g 0 (y)
The function u(t, x) is again constant along the characteristics. The projection of the char-
acteristics on the physical space are now the straight lines
X(t) = y + F 0 (g(y))t. (5.31)
The entropy condition for a jump discontinuity is now F 0 (ul ) > F 0 (ur ). If the flux is convex,
this is equivalent to the (simpler!) condition
ul > ur .
Let us now consider a step function initial condition
0, x < 0
u(0, x) = 1, 0 < x > 1, (5.32)
0, x > 0,
80
At the time t = 2 the rarefaction wave (whose front moves with the speed v = 1) catches up
with the shock that moves with the speed v = 1/2. After this time solution is
0, x < 0
u(0, x) = x/t, 0 < x < x(t), (5.34)
0, x > x(t).
x = y + tg(y), (5.39)
because g(y) = h0 (y). As g(y) is a bounded function, the function h(y) satisfies |h(y)| ≤ M |y|,
and so s(t, x; y) tends to +∞ as y → ±∞ for all t and x fixed. Moreover, s(t, x; y) is continuous
in y for all t and x fixed. It follows that it attains its global minimum at some points y. We
let y(t, x) be the smallest such point.
81
Next, set
2 " 2 #
t x − y(t, x) t x−y
w(t, x) = + h(y(t, x)) = inf + h(y) . (5.40)
2 t y∈R 2 t
The function w(t, x) is Lipschitz continuous in x because for any x and x0 we have (denote
y = y(t, x)))
" 2 # 2
t x0 − z
0 t x−y
w(t, x ) − w(t, x) = inf + h(z) − − h(y)
z∈R 2 t 2 t
≤ h(x0 − x + y) − h(y) ≤ M |x0 − x|. (5.41)
yt = −g(y)yx . (5.43)
1 g 2 (y)
wt = − g 2 (y) − g(y)yt + g(y)yt = , (5.46)
2 2
and
wx = g(y)(1 − yx ) + g(y)yx = g(y). (5.47)
82
We conclude that w(t, x) is Lipschitz continuous almost everywhere, and wherever it is dif-
ferentiable, it solves the initial value problem
wx2
wt + = 0, w(0, x) = h(x). (5.48)
2
Therefore, we may define (almost everywhere)
" 2 #
∂w(t, x) ∂ t x − y(t, x)
u(t, x) = = − h(y(t, x)) . (5.49)
∂x ∂x 2 t
x − y(t, x)
u(t, x) = g(y(t, x)) = . (5.50)
t
Let us show that u(t, x) that we have just constructed is an integral solution of the initial
value problem for the Burgers’ equation
The fact that wherever u(t, x) is differentiable, it solves the Burgers’ equation follows imme-
diately from (5.48) after taking the x-derivative. Let us see why it is an integral solution. Let
v(t, x) be a test function and multiply (5.48) by vx :
ˆ ∞ˆ ∞
wx2
vx wt + dxdt = 0. (5.52)
0 −∞ 2
Now, since w is sufficiently regular in time (this requires a little bit of extra work that we
omit), we may integrate by parts in time, leading to
ˆ ∞ˆ ∞ ˆ ∞ˆ ∞ ˆ ∞
vx wt dxdt = − vxt wdxdt − vx w(0, x)dx.
0 −∞ 0 −∞ −∞
83
5.4 Entropy solutions and the Lax-Oleinik formula
Let us first show that the function u(t, x) given by the Lax-Oleinik formula satisfies a one-sided
estimate
z
u(t, x + z) − u(t, x) ≤ , for all t > 0, x ∈ R and z > 0. (5.54)
t
This estimate holds simply under the assumption that the initial condition u(0, x) = g(x)
for the Burgers’ equation (5.37) is a bounded function. We will assume to simplify some
computations that g(x) ≥ 0. Note that if g(x) is not positive but is bounded: |g(x)| ≤ M
then we can consider q(t, x) = u(t, x) + M . This function satisfies
Another change of variable, q(t, x) = R(t, x + M t) brings us to the Burgers’ equation for the
function R(t, x)
Rt + RRx = 0,
with the initial condition R(0, x) = g(x) + M ≥ 0. If we show that (5.54) holds for the
solutions of the Burgers’ equation with bounded non-negative initial conditions, we would
conclude that
z
R(t, x + z) − R(t, x) ≤ ,
t
for all z ≥ 0. This would, in turn, imply
z
u(t, x + z) − u(t, x) = q(t, x + z) − q(t, x) = R(t, x + z + M t) − R(t, x + M t) ≤ ,
t
which is (5.54). Thus, we may assume that g(x) is non-negative everywhere without any loss
of generality.
A consequence of (5.54) is that backwards characteristics do not cross: this is our definition
of an entropy solution. Indeed, the characteristic that passes through (t, x) is a line with the
slope u(t, x):
X(s) − x = u(t, x)(s − t).
Thus, for the characteristics passing through (t, x) and (t, x + z), with z > 0, to cross at some
0 < s < t, we should have
which is prohibited by (5.54). Thus, a solution that satisfies (5.54) is an entropy solution in
the sense that ”backwards characteristics do not cross”.
In order to establish (5.54) we first show that y(t, x) is non-decreasing in x. This is
important because u(t, x) = g(y(t, x)). To check the monotonicity of y(t, x), let x1 < x2 ,
t > 0, and y1 = y(t, x1 ), y2 = y(t, x2 ). We claim that s(t, x2 , y) > s(t, x2 , y1 ) for all y < y1 ,
which means that y2 ≥ y1 . To see this, we note that, since x2 > x1 and y1 > y, we have
x2 y1 + x1 y > x1 y1 + x2 y,
which implies
(x2 − y1 )2 + (x1 − y)2 < (x1 − y1 )2 + (x2 − y)2 , (5.55)
84
Observe also that, as y1 is the smallest global minimizer of s(t, x1 , y), and y < y1 , we have
2 2
x1 − y 1 x1 − y
t + h(y1 ) < t + h(y).
t t
As g(x) ≥ 0 for all x, we know from (5.39) that x1 > y1 , hence x1 − y > x1 − y1 > 0.
Combining this with (5.55) gives
(x2 − y1 )2 (x1 − y)2 (x1 − y1 )2 (x2 − y)2
+ + h(y1 ) < + + h(y1 )
t t t t
(x1 − y)2 (x2 − y)2
< + + h(y),
t t
which is nothing but
(x2 − y1 )2 (x2 − y)2
+ h(y1 ) < + h(y),
t t
and thus s(t, x2 , y) > s(t, x2 , y1 ) for all y < y1 , which implies that y2 ≥ y1 .
We now use (5.50), and the fact that y(t, x) is non-decreasing in x:
x − y(t, x) x − y(t, x + z) x + z − y(t, x + z) z z
u(t, x) = ≥ = − = u(t, x + z) − , (5.56)
t t t t t
which is (5.56).
We will say that u(t, x) is an entropy solution of the Burgers’ equation if it satisfies the
entropy condition (5.54). Note that if u(t, x) is discontinuous at some point and has left and
right limits ul and ur , then (5.54) implies that ul > ur , which is our old entropy condition
that we have introduced for piecewise continuous solutions. However, (5.54) is a more general
form of the entropy condition as it does not require u(t, x) to be piecewise continuous. The
key result is the following theorem.
Theorem 5.1 There exists a unique entropy solution of the initial value problem
It follows that ˆ y
|h(y)| = g(s)ds ≤ M,
0
85
as well. Then, we have
(x − y)2 (x − y)2
s(t, x; y) = + h(y) ≥ − M.
2t 2t
On the other hand, we also have
s(t, x; x) = h(x) ≤ M,
(x − y(t, x))2
+ h(y(t, x)) ≤ M,
2t
whence
(x − y(t, x))2
≤ 2M, (5.59)
2t
for all t > 0. Now, we simply use expression (5.50) for u(t, x):
√ √
x − y(t, x) 4M t 4M
u(t, x) = ≤ = √ . (5.60)
t t t
Hence, in the long time regime the entropy solutions of the Burgers’ equation decay as
O(t−1/2 ). Therefore, the definition of the entropy solution includes an implicit form some
”hidden” dissipation.
Finally, we establish ”decay to an N -wave” result for the solutions of the Burgers’ equation.
Let g(x) have compact support and set
ˆ y ˆ ∞
p = −2 min g(s)ds, q = 2 max g(s)ds.
y∈R −∞ y∈R y
86
We now claim that
In order to establish this, we will show that for x as in (5.63), we have y(t, x) = x. Indeed,
for such x we have
s(t, x; x) = h(x) = h− ,
while for y < −R we have
(x − y)2 (x − y)2
s(t, x; y) = + h(y) = + h− > h− ,
2t 2t
unless y = x. On the other hand, for y > −R we have
(x − y)2 (x + R)2 p pt p
s(t, x) = + h(y) ≥ − + h− ≥ − + h− = h− .
2t 2t 2 2t 2
Hence, y(t, x) = x is the smallest global minimizer of s(t, x; y). Then (5.50) implies that
u(t, x) = 0. Similarly, we can show that
Next, we claim that (for t sufficiently large so that this interval is not empty)
√ √
−R ≤ y(t, x) ≤ R, if R − pt + 1 < x < −R + qt − 1. (5.65)
and √
y(t, −R + qt − 1) ≤ R. (5.67)
√
To see that (5.66) holds, take x0 = R − pt + 1. Note that if z < −R, then h(z) = h− , and
(x0 − z)2
s(t, x0 ; z) = + h(z) ≥ h− .
2t
Choose now z0 so that h(z) = miny∈R h(y) = −p/2 + h− , and |z| ≤ R. Then we have
(x0 − z0 )2 pt p
s(t, x0 , z0 ) = + h(z0 ) < − + h− = h− .
2t 2t 2
It follows that the minimizer y(t, x0 ) of s(t, x0 , y) satisfies (5.66). The proof of (5.67) is very
similar. As
x − y(t, x)
u(t, x) = ,
t
we conclude that x C √ √
u(t, x) − ≤ for R − pt < x < −R + qt. (5.68)
t t
87
√ √
So we have shown√that u(t, x) = 0 for
√ x < −R − pt and x > R + qt, and, in addition,
(5.68) holds for R − pt < x < −R + qt. As a consequence, we have
C √ √ √ √
|u(t, x) − N (t, x)| ≤ , for x ∈/ (−R − pt, R − pt) and x ∈ / (R − qt, R + qt).
t
√ √ √ √
However, for x ∈ (−R − pt, √ R − pt)
√ we have u(t, x) ≤ C/ t, and N (t, x) = x/t ≤ C/ t,
and similarly for x ∈ (R − qt, R + qt). Putting all these pieces together implies that
ˆ
C
|u(t, x) − N (t, x)| ≤ √ ,
R t
as we have claimed. 2
88