
Lecture notes for Math 220

Lenya Ryzhik∗
December 5, 2017

Essentially nothing found here is original except for a few mistakes and misprints here
and there. These lecture notes are based on material from the following books: L. Evans,
"Partial Differential Equations"; Y. Pinchover and J. Rubinstein, "An Introduction to Partial
Differential Equations".

1 Some important PDEs


This material is based on Evans Chapter 2, often following it literally.

1.1 A triviality
The very simplest partial differential equation is, probably,
∂u/∂t = 0, (1.1)
for a function u(t, x) that depends on two variables (but x does not appear in (1.1), of
course). However, there is almost nothing interesting to say about (1.1) apart from observing
that u(t, x) is an arbitrary function of x and does not depend on t. Still, we can draw a
couple of conclusions from (1.1), that we will soon generalize to somewhat more interesting
equations, before abandoning (1.1) for good. First, (1.1) without any extra conditions, has
infinitely many solutions – a function of the form u(t, x) = g(x) solves (1.1) for an arbitrary
function g(x). Therefore, to get a unique solution we need to impose some condition apart
from (1.1) on the function u(t, x). One such constraint is to prescribe the initial data:

u(0, x) = f (x). (1.2)

It is easy to see that the solution of (1.1)-(1.2) is unique: u(t, x) = f (x). Another possibility is
to add a boundary condition to (1.1) (usually if t and x play the role of “time” and “space”,
respectively, then we talk about “initial conditions” if the function is prescribed at t = 0, and
about “boundary conditions” if the function is prescribed at x = 0, or some other fixed point
in x):
u(t, 0) = g(t). (1.3)

∗ Department of Mathematics, Stanford University, Stanford CA 94305, USA; [email protected]

However, as one can check immediately, (1.1) with the boundary condition (1.3) has a solution
only if g(t) = g0 , that is, g(t) is a constant independent of t. Moreover, in that case solution
is not unique – any function u(t, x) = r(x) with r(0) = g0 would solve (1.1), (1.3) provided
that g(t) is a constant. Another example of a “good” curve to impose some condition is the
line x = t: if we add the constraint
u(t, t) = v(t), (1.4)
with a prescribed function v(t) to (1.1) then solution is unique: u(t, x) = v(x). A quick lesson
to remember is that adding a boundary, or initial, or mixed condition even to an absurdly
simple PDE like (1.1) may lead to existence of a unique solution, no solutions, or infinitely
many solutions.
Another, slightly more obscure lesson to learn is that whether we add (1.2) or (1.4) as an
extra condition to (1.1), we have |u(t, x)| ≤ maxy |f (y)| and |u(t, x)| ≤ maxs |v(s)|, respec-
tively, for all t and x – solution everywhere does not exceed its maximum on the surface where
it is prescribed. Moreover, this equation preserves positivity: if f (x) ≥ 0 then u(t, x) ≥ 0
for all t and x, as well. These properties are versions of the maximum principle and we will
encounter them soon in much less trivial settings.

1.2 The linear transport equation


The next simplest PDE after the triviality of (1.1) is the linear transport equation

∂φ/∂t + u · ∇x φ = 0, x ∈ Rn , t ≥ 0. (1.5)
Here φ(t, x) is the unknown function, and u = (u1 , . . . , un ) is a constant vector in Rn , known
as a “drift” – the terminology will become clear later. The notation u · v, with u, v ∈ Rn ,
denotes the standard inner product in Rn :

u · v = u1 v1 + . . . + un vn ,

so that (1.5) is a short-hand for


∂φ/∂t + u1 ∂φ/∂x1 + . . . + un ∂φ/∂xn = 0, x ∈ Rn , t ≥ 0.

The initial value problem


Which functions solve (1.5)? Let us first look at this from a bureaucratic point of view: (1.5)
expresses the fact that the directional derivative of φ(t, x), understood as a function of (n+1)-
variables (t, x), vanishes in the direction (1, u). This means that the function

z(s) = φ(t + s, x + su)

is constant in s, for any x and t fixed, and, indeed, using (1.5) we see that if φ(t, x) solves (1.5)
then
dz/ds = ∂φ/∂t (t + s, x + su) + u · ∇φ(t + s, x + su) = 0.

Therefore, if we take any point (t, x) ∈ Rn+1 and draw a line Lt,x = {(t + s, x + us), s ∈ R}
(known as a characteristic) in Rn+1 , then the function φ(t, x) is constant along Lt,x . This gives
a hint of what kind of initial or boundary value problems we can solve for (1.5). Consider the
initial value problem
∂φ/∂t + u · ∇x φ = 0, x ∈ Rn , t ≥ 0, (1.6)
φ(0, x) = g(x), for x ∈ Rn ,

with a prescribed function g(x). Then for a given t ≥ 0 and x ∈ Rn we look at the line Lt,x .
It intersects the hyper-plane {t = 0} where the solution is prescribed at the point (0, x − ut)
(we take s = −t in the definition of Lt,x ). Therefore, φ(t, x) = φ(0, x − ut) = g(x − ut). This
is the unique solution to the initial value problem (1.6).
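As a quick sanity check of this formula, here is a minimal numerical sketch (Python, assuming numpy is available; the particular drift u and data g below are our own illustrative choices, not part of the text). It verifies that φ(t, x) = g(x − ut) satisfies φt + uφx = 0 in one space dimension.

```python
import numpy as np

u = 2.0                          # constant drift (illustrative choice)
g = lambda x: np.exp(-x**2)      # initial data (illustrative choice)

def phi(t, x):
    # the solution is constant along characteristics: phi(t, x) = g(x - u t)
    return g(x - u * t)

# central finite differences approximate phi_t + u phi_x at a sample point
t0, x0, h = 0.7, 0.3, 1e-5
phi_t = (phi(t0 + h, x0) - phi(t0 - h, x0)) / (2 * h)
phi_x = (phi(t0, x0 + h) - phi(t0, x0 - h)) / (2 * h)
print(phi_t + u * phi_x)         # ~0, up to O(h^2) discretization error
```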
Our choice of the initial conditions as the additional constraint is not arbitrary – the
variable t plays the role of time, and physically, the initial value problem with the data
prescribed at t = 0 comes up most often. However, for the sake of completeness and to see
what can go wrong, consider the boundary value problem prescribed along the plane {x1 = 0}:
∂φ/∂t + u · ∇x φ = 0, x ∈ Rn , t ≥ 0, (1.7)
φ(t, x1 = 0, x2 , . . . , xn ) = f (t, x2 , . . . , xn ), for all t ∈ R, and (x2 , . . . , xn ) ∈ Rn−1 ,

with a given function f . Again, given a point (t, x), the line Lt,x intersects the plane {x1 = 0}
at the point corresponding to s = −x1 /u1 : the intersection point is
(t − x1 /u1 , 0, x2 − u2 x1 /u1 , . . . , xn − un x1 /u1 ).
It follows that the solution of (1.7) is
φ(t, x) = f (t − x1 /u1 , x2 − u2 x1 /u1 , . . . , xn − un x1 /u1 ).
It is unique provided that u1 ≠ 0. If u1 = 0 then the plane {x1 = 0} is parallel to the lines Lt,x ,
thus the value of the solution at (t, x) is not determined by the data prescribed along that
plane. This is a general lesson: the data should be prescribed along surfaces that are not
tangent to characteristics, or one would get non-uniqueness and non-existence (depending on
the data prescribed on such surfaces). We will discuss this in greater detail later.
Another point to make is that the maximum principle still holds: solution of the initial
value problem (1.6) satisfies
|φ(t, x)| ≤ sup_y |g(y)|, (1.8)

and, in addition, it preserves positivity: if g(x) ≥ 0, then φ(t, x) ≥ 0.


Now, let us discuss the above properties from a slightly different, more physical point of
view, with a clear separation of what is space and what is time: consider a family of particles
that move along trajectories X(s; t, x) which solve the ODE

dX(s; t, x)/ds = u, X(s = t; t, x) = x, (1.9)

that is, we are given a “starting” time t ∈ R and a “starting” position x ∈ Rn , and we look
at the trajectory parametrized by the parameter s ∈ R that at the time s = t passes through
the point x. Note that the trajectories X(s; t, x) lie in Rn , unlike the lines Lt,x that lived
in Rn+1 , the space that also involved t. Solutions of (1.9) are explicit:

X(s; t, x) = x + u(s − t). (1.10)

Define then a function


ψ(t, x) = g(X(0; t, x)) = g(x − ut).
This is the value of the function g at the point where the trajectory passes at s = 0. For a
small time increment ∆t we have

ψ(t + ∆t, x + (∆t)u) = ψ(t, x), (1.11)

simply because

X(0; t + ∆t, x + (∆t)u) = x + (∆t)u − u(t + ∆t) = x − ut = X(0; t, x).

On the other hand, (1.11) implies immediately that ψ(t, x) solves


∂ψ/∂t + u · ∇x ψ = 0, (1.12)
with the initial data ψ(0, x) = g(X(0; 0, x)) = g(x), which is (1.6). The maximum principle (1.8)
is an immediate consequence of this interpretation, as is preservation of positivity – the range
of ψ(t, x) lies inside the range of g(x). Here, this “dynamical systems” approach is essentially
indistinguishable from what we did before, but it will become very useful when we turn to
the elliptic and parabolic equations. In the case of first order equations it is known as the
method of characteristics, while for the second order equations it relates to random processes,
Brownian motion and diffusions.

Variable drift
Let us now consider the transport equation with a drift u(x) = (u1 (x), u2 (x), . . . , un (x)) that
varies in space:
∂φ/∂t + u(x) · ∇φ = 0, x ∈ Rn , t ≥ 0, (1.13)
with a prescribed initial data φ(0, x) = f (x). Recall that when u(x) was constant in space, we
used the trajectories X(s; t, x) (that happened to be straight lines) to construct the solution.
Let us look for the analog of these lines in the case that u(x) varies in space: let X(s; t, x)
be a curve in Rn parametrized by s ∈ R (t and x are fixed here) such that X(s = t; t, x) = x
and define the function z(s) = φ(s, X(s; t, x)), which is the restriction of the function φ(t, x)
to our curve. We compute
dz/ds = ∂φ(s, X(s; t, x))/∂s + ∑_{j=1}^n (∂φ(s, X(s; t, x))/∂Xj ) (dXj /ds)
= ∂φ(s, X(s; t, x))/∂s + (dX(s; t, x)/ds) · ∇φ(s, X(s; t, x)).

Therefore, we have
dz/ds = 0,
or, equivalently, the function z(s) is constant along the curve X(s; t, x), if we choose X(s; t, x)
to be the solution of the system of ODE’s
dX/ds = u(X). (1.14)
If we supplement (1.14) by an initial condition at s = t:

X(s = t; t, x) = x, (1.15)

then z(t) = φ(t, x). Moreover, if we do choose the curve X(s; t, x) as in (1.14), so that z(s) is
constant in s, we would have z(t) = z(0), which, equivalently, means

φ(t, x) = φ(t, X(t; t, x)) = φ(0, X(0; t, x)) = f (X(0; t, x)). (1.16)

Therefore, solution of the initial value problem for (1.13) can be found as follows: fix t ∈ R
and x ∈ Rn and solve the ODE system (1.14)-(1.15) to find X(0; t, x). Then use (1.16) to
find the value of φ(t, x).
It follows, in particular, from (1.16) that the maximum principle still applies:

sup_{x∈Rn} φ(t, x) ≤ sup_{x∈Rn} f (x),

and positivity is preserved as well: φ(t, x) ≥ 0 for all x ∈ Rn if f (x) ≥ 0 for all x ∈ Rn .
Example 1. Consider the problem
∂φ/∂t + x ∂φ/∂x = 0, t ∈ R, x ∈ R,
with the initial data φ(0, x) = f (x). The corresponding ODE is
dX/ds = X, X(t) = x,
and its solution is X(s) = xe^{s−t} . Therefore, X(0) = xe^{−t} , hence φ(t, x) = f (xe^{−t} ) – the solution
at a positive time t > 0 has the same profile as f (x) but is stretched out in the x-direction
by the factor e^t .
Example 2. Consider the problem with the opposite sign of the drift
∂φ/∂t − x ∂φ/∂x = 0, t ∈ R, x ∈ R,
with the initial data φ(0, x) = f (x). The corresponding ODE is
dX/ds = −X, X(t) = x,
and its solution is X(s) = xe^{t−s} . Therefore, X(0) = xe^t , hence φ(t, x) = f (xe^t ) – the solution
at a positive time t > 0 has the same profile as f (x) but is squished in the x-direction by the
factor e^t .
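The recipe above is easy to implement numerically. A small sketch (Python, assuming numpy and scipy are available; the data f is our own illustrative choice): it solves dX/ds = X backwards from s = t to s = 0 and compares φ(t, x) = f (X(0; t, x)) with the exact answer f (xe^{−t}) from Example 1.

```python
import numpy as np
from scipy.integrate import solve_ivp

f = lambda x: np.exp(-x**2)      # initial data (illustrative choice)

def phi(t, x):
    # integrate the characteristic ODE dX/ds = X from s = t back to s = 0
    sol = solve_ivp(lambda s, X: X, [t, 0.0], [x], rtol=1e-10, atol=1e-12)
    X0 = sol.y[0, -1]            # X(0; t, x)
    return f(X0)                 # formula (1.16)

t, x = 1.0, 0.5
print(phi(t, x), f(x * np.exp(-t)))   # the two values agree
```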

The inhomogeneous problem
Consider now the initial value problem with a force:
∂φ/∂t + u · ∇x φ = f (t, x), x ∈ Rn , t ≥ 0, (1.17)
φ(0, x) = g(x), for x ∈ Rn ,

with prescribed initial data g(x) and force f (t, x). Here we assume again, for simplicity, that
the vector u is constant and does not depend on x. Consider what happens now with the
function z(s) = φ(t + s, x + us):

dz/ds = ∂φ/∂t (t + s, x + su) + u · ∇φ(t + s, x + su) = f (t + s, x + su).
Integrating dz/ds in s from −t to 0 gives
z(0) − z(−t) = ∫_{−t}^0 (dz/dτ ) dτ = ∫_{−t}^0 f (t + τ, x + τ u) dτ = ∫_0^t f (τ, x − (t − τ )u) dτ.

This is nothing but


φ(t, x) = φ(0, x − ut) + ∫_0^t f (τ, x − (t − τ )u) dτ, (1.18)

or
φ(t, x) = g(x − ut) + ∫_0^t f (τ, x − (t − τ )u) dτ. (1.19)

In order to interpret the formula (1.19) let us define ψ(t, x; τ ) = f (τ, x − (t − τ )u). This
function satisfies the following initial value problem starting at time τ :
∂ψ/∂t + u · ∇x ψ = 0, x ∈ Rn , t ≥ τ, (1.20)
ψ(t = τ, x; τ ) = f (τ, x), for x ∈ Rn ,

with τ playing the role of a parameter, 0 ≤ τ ≤ t. We may rephrase (1.19) now as


φ(t, x) = g(x − ut) + ∫_0^t ψ(t, x; τ ) dτ, (1.21)

that is, we have decomposed the solution of an initial value problem with a force as the sum of the
solution of the initial value problem with zero force and a time integral of solutions of initial
value problems with zero force, with the initial data at an intermediate time τ
given by the force f (τ, x). Such decompositions are known as the Duhamel principle, and
appear in all sorts of linear time-dependent problems we will encounter later.
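To make the Duhamel formula concrete, here is a small numerical sketch (Python, assuming numpy; the drift, data, and force are our own illustrative choices). For f (t, x) = x the τ-integral in (1.19) can be computed by hand, which gives an exact value to compare against.

```python
import numpy as np

u = 1.5
g = lambda x: np.sin(x)          # initial data
f = lambda t, x: x               # force; chosen so the integral is explicit

def phi(t, x, m=2000):
    # quadrature for the Duhamel integral in (1.19)
    tau = np.linspace(0.0, t, m)
    return g(x - u * t) + np.trapz(f(tau, x - (t - tau) * u), tau)

# exact: int_0^t (x - (t - tau) u) dtau = x t - u t^2 / 2
t, x = 0.8, 0.3
print(phi(t, x), g(x - u * t) + x * t - u * t**2 / 2)
```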

1.3 The Laplace and Poisson equations
Among the most frequently encountered PDEs are the Laplace equation
∆φ = 0, (1.22)
and its inhomogeneous counterpart
−∆ψ = f (x), (1.23)
known as the Poisson equation. Recall that the Laplacian is
∆φ = ∂²φ/∂x1² + . . . + ∂²φ/∂xn² .
Why do we put a minus sign in the left side of (1.23)? We will see that then (1.23) preserves
positivity under many of the common boundary conditions, that is, ψ(x) ≥ 0 if f (x) ≥ 0
for all x in the domain U where the Poisson equation is posed. Without the minus sign, the
function ψ would be positive if f is negative, which would be inconvenient in some qualitative
considerations.
The Laplace equation is usually derived as follows. Consider a physical quantity Φ such
as mass or heat whose total amount in any given volume U ⊂ Rn does not change in time –
there are no sources or sinks anywhere, and everything is in an equilibrium. The conservation
of Φ can be expressed as
∫_{∂U} (F · ν) dS = 0. (1.24)
Here F is the flux of Φ, and ν is the outward normal to U . The flux is often described by the
Fourier law:
F = −k∇Φ. (1.25)
The constant k > 0 is usually called the diffusivity. The meaning of (1.25) is clear: heat (or
mass) flows from the regions of high Φ to regions of low Φ. Using the Fourier law in (1.24)
gives
∫_{∂U} (k∇Φ · ν) dS = 0. (1.26)
Using Green’s formula, this can be equivalently written as the volume integral
∫_U ∇ · (k∇Φ) dx = 0. (1.27)

We use here the notation ∇ · g for the divergence of a vector-valued function g = (g1 , . . . , gn ):
∇ · g = ∂g1 /∂x1 + . . . + ∂gn /∂xn .
As U is an arbitrary volume element, we conclude from (1.27) that we have
∇ · (k∇Φ) = 0. (1.28)
When k = const, this equation reduces to the Laplace equation (1.22). Otherwise, if k(x)
varies in space, we get an equation with a variable coefficient (describing an inhomogeneous medium),
∇ · (k(x)∇Φ) = 0, (1.29)
which, as we will see, has many similar properties to the Laplace equation.

A probabilistic connection interlude
Another nice way to understand how the Laplace equation comes about, as well as many of
its properties is in terms of the discrete equations. For the sake of simplicity of notation, we
describe it in two dimensions. Let U be a bounded sub-domain of the two-dimensional square
lattice Z2 , and let u(x) solve the difference equation
u(x + 1, y) + u(x − 1, y) + u(x, y + 1) + u(x, y − 1) − 4u(x, y) = 0, (1.30)
which is a discrete analog of (1.22). We also impose the boundary condition u(x, y) = g(x, y)
on the boundary ∂U . Here g(x, y) is a prescribed non-negative function, which is positive
somewhere.
We claim that the solution of this problem has the following probabilistic interpretation.
Let (X(t), Y (t)) be the standard random walk on the lattice Z2 – the probability to go up,
down, left or right is equal to 1/4 – and let it start at the point (x, y): X(0) = x, Y (0) = y.
Let (x̄, ȳ) be the first point where (X(t), Y (t)) reaches the boundary ∂U of the domain. The
point (x̄, ȳ) is, of course, random. The beautiful observation is that the function
v(x, y) = E(g(x̄, ȳ))
gives a solution of (1.30), connecting this discrete equation to the random walk. Why? First,
it is immediate that if the starting point (x, y) is on the boundary of U then, of course, the
exit point from U is simply the starting point: x̄ = x and ȳ = y, so v(x, y) = g(x, y) in that
case. On the other hand, if (x, y) is inside U then the probabilities for the random walk to
go up, down, left or right are all equal to 1/4, meaning that v(x, y) can be written as
v(x, y) = (1/4) (v(x + 1, y) + v(x − 1, y) + v(x, y + 1) + v(x, y − 1)).
This identity simply uses the definition of the random walk, the definition of v(x, y) and very
elementary probability considerations.
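This interpretation is easy to test by simulation. Here is a minimal Monte Carlo sketch (Python; the square domain, the boundary data g, and the starting point are our own illustrative choices, not from the text):

```python
import random

N = 10                                    # U = {1,...,N-1}^2, boundary at 0 and N
def g(x, y):
    return 1.0 if y == 0 else 0.0         # boundary data: 1 on the bottom edge

def one_walk(x, y):
    # run the standard random walk until it first reaches the boundary
    while 0 < x < N and 0 < y < N:
        dx, dy = random.choice([(1, 0), (-1, 0), (0, 1), (0, -1)])
        x, y = x + dx, y + dy
    return g(x, y)                        # value of g at the exit point

random.seed(0)
x0, y0, trials = 5, 5, 20000
v = sum(one_walk(x0, y0) for _ in range(trials)) / trials
print(v)   # Monte Carlo approximation of v(x0, y0) = E(g(exit point))
```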
Now, if we let the mesh size be not 1 but h > 0, the discrete equation (1.30) becomes
u(x + h, y) + u(x − h, y) + u(x, y + h) + u(x, y − h) − 4u(x, y) = 0. (1.31)
If we now expand the function u in a Taylor series:
u(x + h, y) = u(x, y) + h ∂u(x, y)/∂x + (h²/2) ∂²u(x, y)/∂x² + . . .
and similarly for the other three terms, the left side of (1.31) becomes h²(uxx + uyy ) + O(h⁴);
dividing by h² and letting h ↓ 0, the discrete equation (1.31) becomes the Laplace equation:
uxx + uyy = 0, (1.32)
while the random walk becomes the Brownian motion. More precisely, solution of the Laplace
equation (1.22) in n dimensions has the following probabilistic interpretation: let U be a
domain in Rn and let g(x) be a continuous function on the boundary ∂U . Consider a Brownian
motion B(t; x) that starts at a point x ∈ U and let x̄ be a (random) point where B(t; x) hits
the boundary ∂U for the first time. Then solution of the Laplace equation
∆u = 0 in U (1.33)

with the boundary condition u(x) = g(x) for x ∈ ∂U , is u(x) = E(g(x̄)). The reader
unfamiliar with the notion of the Brownian motion should not worry – we will not rely on
this connection in any way other than to provide some intuition and motivation.
From the heuristic point of view, now, if g(x) is continuous and non-negative everywhere
on ∂U , and positive at some point x0 ∈ ∂U (and thus in a neighborhood V of x0 as well) then
with a positive probability the exit point x̄ lies in V , so that we have g(x̄) > 0, which means
that u(x) = E(g(x̄)) > 0 as well.
The maximum principle is also a simple consequence of the probabilistic interpretation:
it is easy to see that E(g(x̄)) ≤ sup_{z∈∂U} g(z) – the expected value of a function can not exceed its
maximum.

Radial solutions of the Laplace equation


Let us first look for explicit radially symmetric solutions to the Laplace equation – they
depend only on r = |x| = √(x1² + . . . + xn²). If φ(x) = v(r) depends only on the variable r, then

∂φ/∂xi = v′(r) ∂r/∂xi = v′(r) xi /r, 1 ≤ i ≤ n,
and
∂²φ/∂xi² = ∂/∂xi (v′(r) xi /r) = v′′(r) xi²/r² + v′(r)/r − v′(r) xi²/r³ .
Summing over i, from i = 1 to i = n, keeping in mind that ∑_{i=1}^n xi² = r², gives:
∆φ = v′′(r) + n v′(r)/r − v′(r) r²/r³ = v′′(r) + ((n − 1)/r) v′(r).
Therefore, for a radial function φ(x) = v(r) to satisfy the Laplace equation, the function v(r)
has to be the solution of the ODE
v′′(r) + ((n − 1)/r) v′(r) = 0.
Dividing by v′(r) (v(r) = const would be a not very exciting solution) gives
(ln v′(r))′ = −(n − 1)/r.
Therefore, if n = 1 then v(r) = cr, that is,

u(x) = c|x|.

When n ≥ 2 we get
ln v′(r) = −(n − 1) ln r + C,
so that when n = 2, we get v′(r) = C/r, and

v(r) = C ln r + B,

with some constants C and B. Finally, for n ≥ 3 we obtain
v′(r) = C/r^{n−1} ,
whence v(r) = −C/r^{n−2} + B (absorbing the factor 1/(n − 2) into the constant C).
You may notice immediately that in all three cases, n = 1, n = 2 and n ≥ 3 the radial
solutions that we have obtained above are not twice differentiable at r = 0, and, moreover,
for n ≥ 2 they are not even bounded. So, do they satisfy the Laplace equation? They certainly
do away from x = 0 but what happens there? In order to appreciate this point, observe that
if a smooth function K(x) satisfies
∆K(x) = 0,
for all x ∈ Rn , and a function f (x) is smooth and vanishes outside of a bounded set, then the
function
v(x) = ∫_{Rn} K(x − y)f (y)dy
also satisfies ∆v(x) = 0.


Let us now see what happens if the smooth kernel K(x) is replaced by one of the singular
solutions we have just constructed. Consider for simplicity the one-dimensional case. Let us
define a function
φ(x) = ∫_{−∞}^{∞} |x − y| f (y) dy, (1.34)

with a smooth function f (y) which vanishes outside of some interval [−L, L] so that all
differentiations under the integral sign in the following computation are justified:
φ′(x) = d/dx ∫_{−∞}^{∞} |x − y| f (y) dy = d/dx ∫_{−∞}^{x} (x − y) f (y) dy + d/dx ∫_{x}^{∞} (y − x) f (y) dy
= ∫_{−∞}^{x} f (y) dy − ∫_{x}^{∞} f (y) dy.

It follows that
φ′′(x) = 2f (x). (1.35)
Therefore, the function φ(x) is not a solution of the Laplace equation but rather of the
Poisson equation with the right side given by the function (−2f (x)). In order to get rid of
the pesky (−2) factor we introduce
Φ1 (x) = −|x|/2, x ∈ R, (1.36)
and observe that for any "nice" function f the function
φ(x) = ∫_{−∞}^{∞} Φ1 (x − y) f (y) dy, x ∈ R, (1.37)

is the solution of the Poisson equation

−φ′′(x) = f (x), x ∈ R. (1.38)
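A quick numerical check of this statement (Python, assuming numpy; the particular f below is our own choice, picked so that the exact solution is known: for f = −(e^{−x²})′′ the decaying solution of −φ′′ = f is e^{−x²}):

```python
import numpy as np

# f = -(exp(-x^2))'' = (2 - 4 x^2) exp(-x^2); then -phi'' = f has the
# solution phi(x) = exp(-x^2), which the convolution with Phi1 reproduces
f = lambda x: (2 - 4 * x**2) * np.exp(-x**2)

y = np.linspace(-12.0, 12.0, 8001)        # f is negligible outside [-12, 12]
def phi(x):
    # phi(x) = int Phi1(x - y) f(y) dy with Phi1(x) = -|x|/2, as in (1.37)
    return np.trapz(-0.5 * np.abs(x - y) * f(y), y)

for x0 in (0.0, 0.5, 1.5):
    print(phi(x0), np.exp(-x0**2))        # the columns nearly agree
```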

1.3.1 The fundamental solution of the Laplace equation
The property that any function of the form (1.37) is the solution of the Poisson equation (1.38)
means that the function Φ1 (x) = −|x|/2 is the fundamental solution for the Laplace equation
in one dimension. In higher dimensions, the fundamental solutions of the Laplace equation
are given by
Φ(x) = −(1/(2π)) log |x|, n = 2, (1.39)
and
Φ(x) = 1/(n(n − 2)α(n)|x|^{n−2} ), n ≥ 3. (1.40)
Here α(n) is the volume of the unit ball in n dimensions. When we say that Φ(x) is the
fundamental solution of the Laplace equation, we mean the following.
Theorem 1.1 Let f ∈ Cc2 (Rn ) (that is, f is twice continuously differentiable and has compact
support), n ≥ 2, and set
φ(x) = ∫_{Rn} Φ(x − y) f (y) dy, (1.41)
with Φ(x) given by (1.39) and (1.40) for n = 2 and n ≥ 3, respectively. Then φ(x) is twice
continuously differentiable and satisfies the Poisson equation

−∆φ = f (x), x ∈ Rn . (1.42)

Before we go into the proof, let us recall the Lebesgue dominated convergence theorem from real
analysis.
Theorem 1.2 (Lebesgue Dominated Convergence Theorem) Let gk (x) be a sequence
of functions such that gk (x) → g(x) as k → ∞ for all x ∈ Rn and assume that there exists a
function Q(x) such that
∫_{Rn} |Q(x)| dx < +∞,
and |gk (x)| ≤ Q(x) for all k and all x ∈ Rn . Then we have
lim_{k→+∞} ∫_{Rn} gk (x) dx = ∫_{Rn} g(x) dx.

We will not prove this theorem here; its proof can be found in essentially any textbook on
measure theory and real analysis.
Proof of Theorem 1.1. Step 1. Differentiability of φ(x). Let us first show that φ(x)
defined by (1.41) is differentiable. Assume that f (x) vanishes outside of a ball of radius R.
Note that
φ(x) = ∫_{Rn} Φ(x − y) f (y) dy = ∫_{Rn} Φ(y) f (x − y) dy = ∫_{BR (x)} Φ(y) f (x − y) dy,

where BR (x) is the ball of radius R centered at the point x. Therefore, for |h| < 1 we have
(φ(x + hei ) − φ(x))/h = ∫_{BR+1 (x)} Φ(y) (f (x + hei − y) − f (x − y))/h dy, (1.43)

where ei = (0, . . . , 1, 0, . . . , 0) is the unit vector in the direction of xi . However, we have

gh (y) = (f (x + hei − y) − f (x − y))/h → ∂f (x − y)/∂xi as h → 0,
uniformly in y ∈ Rn (remember that f is compactly supported) – we consider x here to be
fixed. Moreover, the functions gh (y) are uniformly bounded: there exists a point ξ on the
interval connecting x − y and x − y + hei such that

f (x + hei − y) − f (x − y) = hei · ∇f (ξ),

so that
|gh (y)| ≤ |h(ei · ∇f (ξ))|/|h| ≤ M0 = sup_{z∈Rn} |∇f (z)|.

The integrand in (1.43) can be bounded then as

|Φ(y)gh (y)| ≤ M0 |Φ(y)|.

Note that while the function Φ(y) is not integrable over all Rn , its integral over any ball is
finite – in particular, over the ball BR+1 (x). Hence, we may apply the Lebesgue dominated
convergence theorem and pass to the limit h → 0 in (1.43) to get
∂φ(x)/∂xi = ∫_{Rn} Φ(y) ∂f (x − y)/∂xi dy.
A very similar argument shows that
∂²φ/∂xi ∂xj = ∫_{Rn} Φ(y) ∂²f (x − y)/∂xi ∂xj dy,

hence φ is twice differentiable (you need also to argue why the second derivatives are con-
tinuous but the argument for that is also very similar to what we just did, except without
dividing by any h).
Step 2. Derivation of the Poisson equation. Now, we show that φ satisfies the
Poisson equation. We know from the above that φ(x) is twice continuously differentiable and
∆φ(x) = ∫_{Rn} Φ(y) ∆f (x − y) dy.

We need to check that the right side equals −f (x). Since Φ(y) is singular at y = 0 we can not
simply integrate by parts in the right side but rather have to be more careful. To this end, we
take a small ε > 0 (that we will send to zero at the end of the proof), and split the integral
above into the integral over the ball B(0, ε) of radius ε centered at y = 0 and its complement:

∆φ(x) = Iε (x) + Jε (x), (1.44)

with
Iε (x) = ∫_{|y|≤ε} Φ(y) ∆f (x − y) dy, Jε (x) = ∫_{|y|≥ε} Φ(y) ∆f (x − y) dy.

Decomposition (1.44) holds, of course, for any ε > 0. Therefore, we also have, trivially:

∆φ(x) = lim_{ε↓0} (Iε (x) + Jε (x)). (1.45)

Our strategy will be to compute the limit in the right side of (1.45) in order to verify that
the Poisson equation (1.42) holds.
The contribution of Iε (x) as ε ↓ 0 is small:
|Iε (x)| ≤ Cf ∫_{|y|≤ε} |Φ(y)| dy, (1.46)

where Cf = sup_{z∈Rn} |∆f (z)|. The right side of (1.46) vanishes as ε → 0: when n = 2 we have
∫_{|y|≤ε} |Φ(y)| dy ≤ (1/(2π)) ∫_0^ε ∫_0^{2π} | log r| r dω dr ≤ Cε² | log ε|,

and when n > 2 we have


∫_{|y|≤ε} |Φ(y)| dy ≤ (1/(n(n − 2)α(n))) ∫_0^ε ∫_{S^{n−1}} (r^{n−1} /r^{n−2} ) dω dr ≤ Cε² .

Hence, in both cases we conclude that

lim_{ε↓0} Iε (x) = 0, (1.47)

uniformly in x ∈ Rn . Therefore, the main contribution to ∆φ(x) comes from Jε (x):
∆φ(x) = lim_{ε→0} Jε (x).

Let us now look at Jε . First, we recall Green’s formula: given a vector-valued function v(x)
and a scalar valued function f (x) we have, over a nice domain U :
∫_U v(x) · ∇f (x) dx = ∫_{∂U} (v(x) · ν(x)) f (x) dSx − ∫_U f (x) div v(x) dx. (1.48)

Here, ν(x) is the outward unit normal to the boundary ∂U at the point x ∈ ∂U . Then,
integrating by parts we get for Jε (keep in mind that ∆x f (x − y) = ∆y f (x − y)), since
∆f = div(∇f ):
Jε (x) = ∫_{|y|≥ε} Φ(y) ∆x f (x − y) dy = ∫_{|y|≥ε} Φ(y) ∆y f (x − y) dy = ∫_{|y|≥ε} Φ(y) divy (∇y f (x − y)) dy
= −∫_{|y|≥ε} ∇y Φ(y) · ∇y f (x − y) dy + ∫_{|y|=ε} Φ(y)(∇y f (x − y) · ν(y)) dSy = Kε + Lε .

The second term above is small in the limit ε → 0: let Cf′ = sup_{z∈Rn} |∇f (z)|, then
|Lε | ≤ Cf′ ∫_{|y|=ε} |Φ(y)| dSy ,

so that when n = 2 we have
|Lε | ≤ Cε| log ε|,
and when n ≥ 3 we have
|Lε | ≤ C ε^{n−1} /ε^{n−2} = Cε.
In both cases we have
lim_{ε↓0} Lε = 0. (1.49)

Thus, the main contribution to ∆φ(x) must come from Kε :
∆φ(x) = lim_{ε↓0} Kε (x). (1.50)

Let us look at that term: integrating by parts using Green’s formula once again gives
Kε (x) = −∫_{|y|≥ε} ∇y Φ(y) · ∇y f (x − y) dy = ∫_{|y|≥ε} ∆Φ(y) f (x − y) dy − ∫_{|y|=ε} (∂Φ(y)/∂ν) f (x − y) dSy .

As ∆Φ(y) = 0 for y ≠ 0, the above is
Kε (x) = −∫_{|y|=ε} (∂Φ(y)/∂ν) f (x − y) dSy . (1.51)

Consider only the case n ≥ 3 – the case n = 2 is very similar. Then the normal derivative
inside the integrand above is
∂Φ(y)/∂ν = 1/(nα(n)|y|^{n−1} ) = 1/(nα(n)ε^{n−1} ) for |y| = ε,

and does not depend on the point y. The sign above comes from the fact that the outer
normal to {|y| ≥ ε} points toward the origin y = 0. Using this in (1.51) gives
Kε (x) = −(1/(nα(n)ε^{n−1} )) ∫_{|y|=ε} f (x − y) dSy . (1.52)

It remains only to observe that nα(n)ε^{n−1} is the surface area of the sphere of radius ε in Rn :
in general, the volume α(n) of the unit ball in Rn is related to the area sn of the unit sphere
by α(n) = sn /n – this is easy to see from calculus. It follows, as f (x) is continuous, that

lim_{ε↓0} Kε (x) = −f (x).

Going back to (1.50) we conclude that

−∆φ(x) = f (x),

as claimed. The end of the proof in dimension n = 2 is very similar to what we did after
(1.51), so we do not present it here. □

A probabilistic interlude
Consider now the following problem: fix two radii r and R, and let the Brownian motion
start at a point x inside the annulus D = {r < |x| < R} in Rn . The Brownian motion will
spend some time inside D but will eventually exit D at some random point x̄ such that either
|x̄| = r or |x̄| = R. We ask the following question: what is the probability that the Brownian
motion will exit the annulus at the sphere {|x| = r} and not at the sphere {|x| = R}? Let
us call this probability p(x). It is clear that if the starting point x is such that |x| = r then
p(x) = 1 while if |x| = R then p(x) = 0. One can show, as we did before, that if we replace
the Brownian motion by a discrete random walk then the discretized p(x) satisfies the discrete
Laplace equation:
p(x) = (1/(2n)) ∑_{i=1}^n [p(x + ei ) + p(x − ei )].
Here, ei is the unit vector in the direction of xi and 2n is the total number of the neighbours
of the point x on the lattice (n is the spatial dimension). In the case of the Brownian motion
which is a continuous limit of the random walks, the function p(x) satisfies the Laplace
equation
∆p(x) = 0 for r < |x| < R,
supplemented by the boundary conditions

p(x) = 1 if |x| = r and p(x) = 0 if |x| = R.

In dimension n = 1 the solution is given by

p(x) = Ax + B,

with the constants A and B determined by

Ar + B = 1, AR + B = 0,

so that
A = −1/(R − r), B = R/(R − r).
Note that if R → +∞, that is, if the right point moves to infinity, we have

A → 0, B → 1 as R → +∞.

This means that p(x) → 1 as R → +∞ for any fixed point x. This is the reflection of the
fact that the Brownian motion is recurrent in one dimension: no matter where it starts, it is
certain to reach the point x = r. On the other hand, in dimensions n ≥ 3 the function p(x)
is given by
p(x) = A/|x|^{n−2} + B,
with the constants A and B determined by
A/r^{n−2} + B = 1, A/R^{n−2} + B = 0.

This gives
A = r^{n−2} R^{n−2} /(R^{n−2} − r^{n−2} ), B = −r^{n−2} /(R^{n−2} − r^{n−2} ).
We see that in dimension n ≥ 3, as R → +∞ we have

A → r^{n−2} , B → 0,

so that p(x) has a limit that depends on x and is always less than one:

p(x) → r^{n−2} /|x|^{n−2} as R → +∞.

This reflects the fact that the Brownian motion is transient in dimensions n ≥ 3: no matter
how close to the ball {|x| ≤ r} it starts, there is a positive probability that it will never enter
this ball.
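A one-dimensional simulation makes the dichotomy concrete (Python; the parameters are our own illustrative choices). For a walk on the integers started at x ∈ (r, R), the exit probability solves the discrete problem above and equals p(x) = (R − x)/(R − r), which tends to 1 as R → ∞ – recurrence in one dimension:

```python
import random

def hits_r_first(x, r, R):
    # simple random walk on the integers until it leaves (r, R)
    while r < x < R:
        x += random.choice((-1, 1))
    return x <= r

random.seed(1)
r, x0, trials = 2, 5, 5000
for R in (10, 50, 200):
    p = sum(hits_r_first(x0, r, R) for _ in range(trials)) / trials
    print(R, p, (R - x0) / (R - r))   # Monte Carlo vs exact; both approach 1
```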

1.3.2 Qualitative properties of harmonic functions


Our next task is to show, from several points of view, that harmonic functions are beautifully
well-behaved.

The mean value property


We begin with the mean value property that shows that locally u(x) is close to its average.
Intuitively this implies that u(x) can not behave very irregularly and should have limited
room to oscillate. And, indeed, the mean value property leads to an amazing number of
qualitative conclusions of this kind. A word on notation: for a set S we denote by |S| its
volume (or area), and by ∂S we denote its boundary (as we did before).
Let us first recall that in one dimension the mean value property is trivial: any harmonic
function in one dimension is linear u(x) = ax + b, and then, of course, for any x ∈ R and
any l > 0 we have
u(x) = (1/2)(u(x + l) + u(x − l)) = (1/(2l)) ∫_{x−l}^{x+l} u(y) dy.

Here is the generalization to harmonic functions in higher dimensions.

Theorem 1.3 Let U ⊂ Rn be an open set and let B(x, r) be a ball centered at x ∈ Rn of
radius r > 0 contained in U . Assume that the function u(x) satisfies

∆u = 0 for all x ∈ U , (1.53)

and that u ∈ C 2 (U ). Then we have


u(x) = (1/|B(x, r)|) ∫_{B(x,r)} u dy = (1/|∂B(x, r)|) ∫_{∂B(x,r)} u dS. (1.54)

The intuitive reason for the mean value property can be seen from the discrete version of the
Laplace equation we have encountered when we discussed the probabilistic interpretation:
(1/(2n)) ∑_{j=1}^n (u(x + hej ) + u(x − hej )) = u(x).

Here h is the mesh size, and ej is the unit vector in the direction of the coordinate axis for xj .
This discrete equation says exactly that the value u(x) is the average of the values of u at the
neighbors of the point x on the lattice with mesh size h, which is similar to the statement of
Theorem 1.3 – though there is no meaning to “nearest” neighbor in the continuous case, and
the average can be taken over an arbitrary large sphere or ball.
Proof. Let us fix the point x ∈ U and define
φ(r) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} u(z) dS(z). (1.55)

It is easy to see that, since u(x) is continuous, we have

lim_{r↓0} φ(r) = u(x). (1.56)

Therefore, we would be done if we knew that φ′(r) = 0 for all r > 0 such that the ball B(x, r)
is contained in U . To this end, using the polar coordinates z = x + ry, with y ∈ ∂B(0, 1), we
may rewrite (1.55) as
φ(r) = (1/|∂B(0, 1)|) ∫_{∂B(0,1)} u(x + ry) dS(y).

Then differentiating in r gives


φ′(r) = (1/|∂B(0, 1)|) ∫_{∂B(0,1)} y · ∇u(x + ry) dS(y).

Going back to the z-variables gives


φ′(r) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} (1/r)(z − x) · ∇u(z) dS(z) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} (∂u/∂ν) dS(z).

Here, we used the fact that the outward normal to the ball B(x, r) at a point z ∈ ∂B(x, r)
is ν = (z − x)/r. Using the Green’s formula
∫_U f ∆g dy = ∫_{∂U} f (∂g/∂ν) dS − ∫_U ∇f · ∇g dy,

with f = 1 and g = u gives now


φ′(r) = (1/|∂B(x, r)|) ∫_{B(x,r)} ∆u(y) dy = 0,

since u is harmonic – it satisfies (1.53). It follows that φ(r) is a constant and then (1.56)
implies that
u(x) = (1/|∂B(x, r)|) ∫_{∂B(x,r)} u dS, (1.57)
which is the second identity in (1.54).
In order to prove the first equality in (1.54) we use the polar coordinates once again:
(1/|B(x, r)|) ∫_{B(x,r)} u dy = (1/|B(x, r)|) ∫_0^r (∫_{∂B(x,s)} u dS) ds = (1/|B(x, r)|) ∫_0^r u(x) nα(n) s^{n−1} ds
= u(x) α(n)r^n /(α(n)r^n ) = u(x).
In the second equality above we used two facts: first, the already proved identity (1.57) about
averages on spherical shells, and, second, that the area of an (n − 1)-dimensional unit sphere
is nα(n). Now, the proof of (1.54) is complete. □
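The mean value property is also easy to observe numerically. A minimal check (Python with numpy; the harmonic function u(x, y) = x² − y² and the circle are our own illustrative choices):

```python
import numpy as np

u = lambda x, y: x**2 - y**2            # harmonic: u_xx + u_yy = 2 - 2 = 0

x0, y0, r = 0.3, -0.7, 2.0
theta = np.linspace(0.0, 2 * np.pi, 10001)
circle_avg = np.trapz(u(x0 + r * np.cos(theta), y0 + r * np.sin(theta)),
                      theta) / (2 * np.pi)
print(circle_avg, u(x0, y0))            # the circle average equals u(x0, y0)
```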

The maximum principle


The first consequence of the mean value property is the maximum principle that says that a
harmonic function attains its maximum over any domain on the boundary and not inside the
domain. Once again, in one dimension this is obvious: a linear function does not have any
local extremal points.
Theorem 1.4 (The maximum principle) Let u(x) be a harmonic function in a connected
domain U and assume that u ∈ C 2 (U ) ∩ C(Ū ). Then

max_{x∈Ū} u(x) = max_{y∈∂U} u(y). (1.58)

Moreover, if u(x) achieves its maximum at a point x0 in the interior of U then u(x) is
identically equal to a constant in U .
Proof. Let us suppose that u(x) attains its maximum at an interior point x0 ∈ U , and
set M = u(x0 ). Then for any r > 0 sufficiently small (so that the ball B(x0 , r) is contained
in U ) we have
M = u(x0 ) = (1/|B(x0 , r)|) ∫_{B(x0 ,r)} u dy ≤ M,
with the equality above holding only if u(y) = M for all y in the ball B(x0 , r). Therefore,
the set S of points where u(x) = M is open. Since u(x) is continuous, this set is also
closed. Since S is both open and closed in U , and U is connected, it follows that S = U ,
hence u(x) = M at all points x ∈ U . □
Of course, if we replace u by (−u) (which is equally harmonic), we get the minimum
principle for u.
Corollary 1.5 (Strict positivity) Assume that U is a connected domain, and u solves

∆u = 0 in U (1.59)
u = g on ∂U .

Assume, in addition, that g ≥ 0, g is continuous on ∂U , and g(x) ≢ 0. Then u(x) > 0 at
all x ∈ U .

Proof. This is an immediate consequence of the minimum principle: min_{x∈Ū} u(x) ≥ 0, and u
can not attain its minimum inside U , thus u(x) > 0 for all x ∈ U . □

Corollary 1.6 (Uniqueness) Let g be continuous on ∂U and f be continuous in U . Then


there exists at most one solution u ∈ C 2 (U ) ∩ C(Ū ) to the boundary value problem

∆u = f in U (1.60)
u = g on ∂U .

Proof. Let u1 and u2 be two such solutions to (1.60). Then the difference v = u1 −u2 satisfies
the homogeneous problem

∆v = 0 in U (1.61)
v = 0 on ∂U .

The maximum principle implies that v ≤ 0 in U , and the minimum principle implies that v ≥ 0
in U , whence v ≡ 0, and we are done. □

Regularity of harmonic functions


Now, we prove that if u(x) is a twice continuously differentiable harmonic function then it
is infinitely differentiable – this is quite an amazing result if you think about it, and is a
very special property of elliptic equations. Another context in which such a result appears is the
study of holomorphic functions – a function that has a complex derivative is automatically
infinitely differentiable. The “reason” is, of course, that the real and imaginary parts of a
holomorphic function are harmonic! For the Laplace equation it can be deduced directly from
the mean-value property, and in an arbitrary dimension.

Theorem 1.7 (Regularity) Let u ∈ C 2 (U ) be a harmonic function in a domain U . Then u


is infinitely differentiable in U .

Proof. The proof is via a miracle: we first define a "smoothed" version of u, and then
verify that the "smoothed" version coincides with the original, hence the original is also infinitely
smooth. This is as close to a free lunch as it gets.
Consider a radial non-negative function η(x) ≥ 0 that depends only on |x| such that
(i) η(x) = 0 for |x| ≥ 1, (ii) η(x) is infinitely differentiable, and (iii) ∫_{Rn} η(x)dx = 1. Also, for
each ε ∈ (0, 1) define its rescaled version
ηε (x) = (1/εn ) η(x/ε).
It is straightforward to verify that ηε satisfies the same properties (i)-(iii) above. Moreover,
the function
uε (x) = ∫_{Rn} ηε (x − y)u(y)dy (1.62)

is infinitely differentiable in the slightly smaller domain Uε = {x ∈ U : dist(x, ∂U ) > ε}. The
reason is that we can differentiate infinitely many times under the integral sign in (1.62) –
this follows from the standard multivariable calculus theorem on differentiation of integrals
depending on a parameter (the variable x plays the role of a parameter here). Our main claim
is that, because of the mean value property, we have
uε (x) = u(x) for all x ∈ Uε . (1.63)
This will immediately imply that u(x) is infinitely differentiable in the domain Uε . And, as
any point x from U lies in Uε if ε < dist(x, ∂U ), it follows that u(x) is infinitely differentiable
at all points x ∈ U .
Let us now verify (1.63):
uε (x) = ∫_{Rn} ηε (x − y)u(y)dy = (1/εn ) ∫_U η(|x − y|/ε) u(y)dy = (1/εn ) ∫_{B(x,ε)} η(|x − y|/ε) u(y)dy.
The last equality holds because η(z) = 0 if |z| ≥ 1, whence ηε (z) = 0 if |z| ≥ ε. Changing
variables y = x + εz gives
uε (x) = ∫_{B(0,1)} η(z) u(x + εz)dz.

Going to the polar coordinates leads to


uε (x) = ∫_0^1 η(r) (∫_{∂B(0,1)} u(x + εrω)dS(ω)) r^{n−1} dr. (1.64)

The mean value property implies that


∫_{∂B(0,1)} u(x + εrω)dS(ω) = u(x)|∂B(0, 1)|.

Using this in (1.64), we obtain


uε (x) = u(x) ∫_0^1 η(r) |∂B(0, 1)| r^{n−1} dr = u(x) ∫_{B(0,1)} η(y)dy = u(x), (1.65)

which is (1.63). We used the fact that η has integral equal to one in the last step. □
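The identity uε = u can also be watched in action numerically. A small sketch (Python with numpy; the bump η, the harmonic function, and the sample point are our own choices, and the integral in (1.62) is replaced by a discrete sum):

```python
import numpy as np

u = lambda x, y: x**2 - y**2        # a harmonic function to mollify

def eta(rho):
    # standard radial bump supported in {rho < 1}; normalized discretely below
    out = np.zeros_like(rho)
    inside = rho < 1.0
    out[inside] = np.exp(-1.0 / (1.0 - rho[inside]**2))
    return out

eps = 0.5
s = np.linspace(-1.0, 1.0, 401)
Z1, Z2 = np.meshgrid(s, s)
W = eta(np.sqrt(Z1**2 + Z2**2))
W /= W.sum()                        # discrete version of (iii): total mass one

x0, y0 = 0.2, 0.1
u_eps = (W * u(x0 - eps * Z1, y0 - eps * Z2)).sum()
print(u_eps, u(x0, y0))             # the mollification reproduces u(x0, y0)
```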
This regularity property is quite fundamental and appears in one way or another for the class
of elliptic equations (and not just for the Laplace equation) we will discuss later. One of their
main qualitative properties is that solutions are more regular than the data prescribed, and
they behave much nicer than, say, solutions of wave equations and other hyperbolic problems.
Let us now give a more quantitative estimate on how large the derivatives of the harmonic
functions can be.
Theorem 1.8 Let u(x) be a harmonic function in a domain U and let B(y0 , r) be a ball
contained in U centered at a point y0 ∈ U . Then there exist universal constants Cn and Dn
that depend only on the dimension n so that we have
|u(y0 )| ≤ (Cn /rn ) ∫_{B(y0 ,r)} |u(y)|dy, (1.66)

and
|∇u(y0 )| ≤ (Dn /r^{n+1} ) ∫_{B(y0 ,r)} |u(y)|dy. (1.67)

The remarkable fact about the estimate (1.67) is that we are able to estimate the size of
the derivatives of a harmonic function in terms of its values – this means that a harmonic
function can not oscillate (oscillation means, essentially, that the function is much smaller
than its derivative). It is a good time to pause and think about why taking a harmonic
function u(x) and setting uε (x) = u(x/ε) with y0 = 0 does not provide a counterexample
for (1.67). We certainly would have |∇uε (0)| = |∇u(0)|/ε. But what has to happen to the
right side of (1.67) when we replace u by uε there?
Proof. First, the estimate (1.66) follows immediately from the first equality in the mean
value formula (1.54). In order to obtain the derivative bound (1.67) note that if u(x) is
harmonic then so are the partial derivatives ∂u/∂xj , whence
|∂u(y0 )/∂xj | ≤ (1/|B(y0 , r/2)|) |∫_{B(y0 ,r/2)} ∂u(y)/∂xj dy| = (1/|B(y0 , r/2)|) |∫_{∂B(y0 ,r/2)} u(y)νj (y)dS(y)|, (1.68)

where νj (y) is the j-th component of the outward normal. Continuing this estimate we see
that (we use the fact that the area of the unit sphere is nα(n))
|∂u(y0 )/∂xj | ≤ (2^n nα(n)r^{n−1} /(2^{n−1} α(n)r^n )) sup_{z∈B(y0 ,r/2)} |u(z)| = (2n/r) sup_{z∈B(y0 ,r/2)} |u(z)|. (1.69)

Now, we can use the estimate (1.66) applied at any point z ∈ B(y0 , r/2):
|u(z)| ≤ (Cn /(r/2)n ) ∫_{B(z,r/2)} |u(z′)|dz′. (1.70)

However, since |y0 − z| ≤ r/2 (this is why we took a smaller ball in (1.68)!), any such
ball B(z, r/2) is contained inside the ball B(y0 , r), thus (1.70) implies that
|u(z)| ≤ (Cn /(r/2)n ) ∫_{B(y0 ,r)} |u(z′)|dz′.

Now, it follows from (1.69) that


|∂u(y0 )/∂xj | ≤ (2n/r)(Cn /(r/2)n ) ∫_{B(y0 ,r)} |u(z′)|dz′ = (Dn /r^{n+1} ) ∫_{B(y0 ,r)} |u(z)|dz, (1.71)

which is (1.67). □
Theorem 1.8 is another expression of the fact that harmonic functions do not oscillate –
the first estimate says that the value of the function at a point is bounded by its averages
(but we have seen that already in the mean value property), while the second bound says in a
quantitative way that derivative at a point can not be large without the function being large
around the point. This rules out oscillatory behavior.

The Liouville theorem
The Liouville theorem says that a function which is harmonic in all of Rn is either unbounded
or is identically equal to a constant.

Theorem 1.9 Let u(x) be a harmonic bounded function in Rn . Then u(x) is equal identically
to a constant.

Proof. Let us assume that |u(x)| ≤ M for all x ∈ Rn . We fix x0 ∈ Rn and use Theorem 1.8:
|∇u(x0 )| ≤ (C/r^{n+1} ) ∫_{B(x0 ,r)} |u(y)|dy ≤ (Cα(n)rn /r^{n+1} ) M ≤ Cα(n)M/r.

As this is true for any r > 0 we may let r → ∞ and conclude that ∇u(x0 ) = 0, thus u(x) is
identically equal to a constant. □
This theorem is, of course, a direct generalization to higher dimensions of the familiar
Liouville theorem in complex analysis that says that a bounded entire (analytic in all of C)
function has to be equal identically to a constant.

Harnack’s inequality
Here is another way to express lack of oscillations of nonnegative harmonic functions –
their maximum cannot be much larger than their minimum. To trivialize, consider the one-
dimensional situation. Let u(x) be a non-negative harmonic function on the interval (0, 1),
that is, u(x) = ax + b with some constants a, b ∈ R. We claim that if u(x) ≥ 0 for all x ∈ [0, 1]
then
1/3 ≤ u(x)/u(y) ≤ 3, (1.72)
for all x, y in the smaller interval (1/4, 3/4). The constants 1/3 and 3 in (1.72) depend on the
choice of the "smaller" interval – they would change if we replaced (1/4, 3/4) by another
subinterval of [0, 1]. But once we fix the subinterval, they do not depend on the choice of the
harmonic function. Let us now show that (1.72) holds for all x, y ∈ (1/4, 3/4). Without loss
of generality we may assume that x > y. First, consider the case a > 0. Then, since u(x) is
increasing (because a > 0), we have

1 ≤ u(x)/u(y) ≤ u(3/4)/u(1/4) = (3a + 4b)/(a + 4b). (1.73)

As u(x) > 0 on [0, 1] we know that b > 0 (and a > 0 by assumption); using this in (1.73)
gives, with c = a/b:
1 ≤ u(x)/u(y) ≤ (3c + 4)/(c + 4) = 3 − 8/(c + 4) ≤ 3.
On the other hand, if a < 0 then the function u is decreasing, and

1 ≥ u(x)/u(y) ≥ u(3/4)/u(1/4) = (c + 4)/(3c + 4) = 1/3 + 8/(3(3c + 4)).

As u(1) > 0 we know that a + b > 0, and we still have b > 0 since u(0) > 0. Thus, c > −1,
and therefore,
1 ≥ u(x)/u(y) ≥ 1/3 + 8/(3(3c + 4)) ≥ 1/3.
We conclude that (1.72), indeed, holds. Geometrically, (1.72) expresses a very simple fact:
if u(3/4) ≫ u(1/4) then the slope of the straight line connecting the points (1/4, u(1/4))
and (3/4, u(3/4)) is so large that the line would go below the x-axis at x = 0, contradicting
the assumption that the linear function is positive on the interval (0, 1). On the other hand,
if u(1/4) ≫ u(3/4) then this line would go below the x-axis at x = 1. Therefore, the
condition that u(x) > 0 on the larger interval [0, 1] is very important here.
Now, we turn to the general case of dimension larger than one. We say that a set V is
strictly contained in U if V ⊂ U and there exists ε0 > 0 so that for any x ∈ V we have
dist(x, ∂U ) ≥ ε0 .
Theorem 1.10 (Harnack’s inequality) Let U be an open set and let V be strictly contained
in U . Then there exists a constant C that depends on U and V but nothing else so that for
any nonnegative harmonic function u in U we have

sup_{x∈V} u(x) ≤ C inf_{x∈V} u(x). (1.74)

Proof. Let r = (1/4)dist(V, ∂U ) and choose two points x, y ∈ V such that |x − y| ≤ r. Then
the ball B(x, 2r) is contained in U so u is harmonic in this ball, and the mean-value principle
implies that
u(x) = (1/|B(x, 2r)|) ∫_{B(x,2r)} u(z)dz. (1.75)
Note also that since |x − y| ≤ r, the ball B(y, r) is contained inside B(x, 2r), and u(z) ≥ 0
everywhere. Hence, (1.75) implies that
u(x) ≥ (1/(α(n)2^n r^n )) ∫_{B(y,r)} u(z)dz. (1.76)
It follows, on the other hand, from the mean-value principle that
u(y) = (1/(α(n)r^n )) ∫_{B(y,r)} u(z)dz. (1.77)

Putting (1.76) and (1.77) together gives


u(x) ≥ (1/2^n ) u(y). (1.78)
Reversing the argument we can similarly conclude that
u(y) ≥ (1/2^n ) u(x), (1.79)
hence
(1/2^n ) u(x) ≤ u(y) ≤ 2^n u(x), for all x, y ∈ V such that |x − y| ≤ (1/4)dist(V, ∂U ). (1.80)

In general, if |x − y| ≥ r, there exists a number N so that we may cover the compact
set V̄ by N balls of radius r/2. Then given any two points x, y ∈ V we can connect them by
a piece-wise straight line curve with no more than N segments, each segment at most r long.
It follows that for any x, y ∈ V we have
(1/2^{N n} ) u(x) ≤ u(y) ≤ 2^{N n} u(x), for all x, y ∈ V . (1.81)
This, of course, implies (1.74) with C = 2^{N n} . □

1.3.3 Green’s function for the Poisson equation


We will now show a systematic way to construct solutions of the boundary value problem for
the Poisson equation

−∆u = f in U , (1.82)
u = g on ∂U .

When the domain U is sufficiently simple (a ball, halfspace, etc.) then we will construct a
more or less explicit formula for the solution. When U is complicated we can not get an
explicit formula but we will reduce solving (1.82) with arbitrary functions f and g to the
special case f = 0, and one particular function g. Having a solution to this one special
case allows one to construct solutions for general f and g immediately. This is useful when one
needs to solve the Poisson equation in the same domain for various f and g. It also helps to
understand various qualitative properties of the solutions of the boundary value problem for
the Poisson equation.

Definition of the Green’s function


Let us recall the fundamental solution of the Laplace equation Φ(x) we have constructed in
Section 1.3 (compare to (1.39)-(1.40)):
Φ(x) = −(1/(2π)) log |x|, n = 2, (1.83)
and
Φ(x) = 1/(n(n − 2)α(n)|x|^{n−2} ), n ≥ 3. (1.84)
Theorem 1.1 asserts that the function
u(x) = ∫_{Rn} Φ(x − y)f (y)dy (1.85)

is a solution of the Poisson equation


−∆u = f
posed in all of Rn . What we would like to do is to adapt the representation (1.85) to the
boundary value problem (1.82) posed in a bounded domain, and also taking into account the

correct boundary conditions. That is, we are hoping to get an integral representation of the
solution of the boundary value problem (1.82) as
u(x) = ∫_U G(x, y)f (y)dy + ∫_{∂U} G1 (x, z)g(z)dz, (1.86)

with some functions G(x, y) and G1 (x, z) that are to be determined (but they should not
depend on the functions f and g – they should only depend on the domain U where the
problem is posed).
To this end, we take a point x ∈ U , and a small ball B(x, ε) around it. Consider the
domain Vε = U \ B(x, ε) (that is, U without the ball B(x, ε)) and use the Green’s formula:
∫_{Vε} [u(y)∆Φ(y − x) − Φ(y − x)∆u(y)] dy = ∫_{∂Vε} [u(z) ∂Φ(z − x)/∂ν − Φ(z − x) ∂u(z)/∂ν] dS(z). (1.87)
The reason we had to cut out the small ball around the point x is that now when y ∈ Vε the
argument (y − x) of the fundamental solution Φ(x − y) can not vanish, and Φ(z) is regular
when z 6= 0. Otherwise, we would not be able to apply Green’s formula since Φ(z) is singular
at z = 0. As ∆Φ(y − x) = 0 when y 6= x, the above is
−∫_{Vε} Φ(y − x)∆u(y)dy = ∫_{∂Vε} [u(z) ∂Φ(z − x)/∂ν − Φ(z − x) ∂u(z)/∂ν] dS(z). (1.88)

This identity holds for all ε > 0 and we will now pass to the limit ε ↓ 0 in (1.88), taking the
size of the cut-out region to zero. The boundary ∂Vε of the domain Vε is the union of ∂U
and the sphere Sε = {|z − x| = ε}. The integral over Sε is computed as in the proof of
Theorem 1.1 (we again do only the computations for n ≥ 3, the case n = 2 is similar): first,
we may use the explicit formula
Φ(x) = 1/(n(n − 2)α(n)|x|^{n−2} ).

Note that for z on the sphere Sε we have |z − x| = ε, meaning that

∂Φ(z − x)/∂ν = 1/(nα(n)ε^{n−1} ),

so that
∫_{Sε} u(z) ∂Φ(z − x)/∂ν dS(z) = (1/(nα(n)ε^{n−1} )) ∫_{Sε} u(z)dS(z) = (1/|Sε |) ∫_{Sε} u(z)dS(z). (1.89)

We used here the formula |Sε | = nα(n)ε^{n−1} for the area of a sphere of radius ε in Rn . As u is continuous
at the point x, letting ε ↓ 0 we obtain
∫_{Sε} u(z) ∂Φ(z − x)/∂ν dS(z) = (1/|Sε |) ∫_{Sε} u(z)dS(z) → u(x) as ε ↓ 0. (1.90)

The other term in the right side of (1.88) vanishes as ε ↓ 0:
|∫_{Sε} Φ(z − x) ∂u(z)/∂ν dS(z)| = (1/(n(n − 2)α(n)ε^{n−2} )) |∫_{Sε} ∂u(z)/∂ν dS(z)| ≤ M ε^{n−1} nα(n)/(nα(n)(n − 2)ε^{n−2} ) → 0, (1.91)
as ε ↓ 0, where M = sup_{y∈U} |∇u|. We used again the formula |Sε | = nα(n)ε^{n−1} above.
Therefore, passing to the limit ε ↓ 0 in (1.88) leads to
u(x) = ∫_{∂U} [Φ(z − x) ∂u(z)/∂ν − u(z) ∂Φ(z − x)/∂ν] dS(z) − ∫_U Φ(y − x)∆u(y)dy. (1.92)

Hence, in order to compute u(x) we should know ∆u inside U (which we do for the solution
of the Poisson equation (1.82), it is f ), as well as u(z) on the boundary ∂U (which we do
know for the solution of the boundary value problem (1.82), it is g), but also the normal
derivative ∂u/∂ν at the boundary of U , and that we do not know a priori – this normal
derivative can only be found after we solve (1.82). Therefore, (1.92) is not yet the answer we
seek – it involves an unknown function ∂u/∂ν.
Note that this would not have been an issue if we had Φ(x − y) = 0 for y ∈ ∂U – then
the corresponding term in (1.92) would have vanished. The idea is, then, to amend Φ(x − y)
in such a way as to make this term disappear. This is done as follows. Fix a point x ∈ U and
let φ(y; x) be the solution of the boundary value problem

−∆y φ = 0 for y ∈ U , (1.93)
φ(y; x) = Φ(y − x) for y ∈ ∂U .

Observe that, as x lies inside the domain U , the function Φ(x − y) is regular when y lies on
the boundary ∂U - the distance between x and y is uniformly positive. Therefore, (1.93) is
simply a Laplace equation in y with regular prescribed boundary data Φ(x − y) (x here serves
as a parameter). Hence, the function φ(y; x) is regular and has no singularity.
Using the Green’s formula as before (but without the need to throw out a small ball
around the point x since the function φ(y; x) is regular at y = x) gives
−∫_U φ(y; x)∆u(y) dy = ∫_{∂U} [u(z) ∂φ(z; x)/∂ν − φ(z; x) ∂u(z)/∂ν] dS(z). (1.94)

The boundary condition for φ(y; x) shows that this is


−∫_U φ(y; x)∆u(y) dy = ∫_{∂U} [u(z) ∂φ(z; x)/∂ν − Φ(z − x) ∂u(z)/∂ν] dS(z). (1.95)
Definition 1.11 The Green’s function for a domain U is

G(x; y) = Φ(x − y) − φ(y; x). (1.96)

Now, adding (1.92) and (1.95) up gives


u(x) = −∫_{∂U} u(z) ∂G(x; z)/∂νz dS(z) − ∫_U G(x; y)∆u(y)dy. (1.97)

The advantage of (1.97) over (1.92) is that the normal derivative ∂u/∂ν on ∂U (which we do
not know) no longer appears in the right side. Hence, solution of the Poisson boundary value
problem (1.82) is given by
u(x) = −∫_{∂U} g(z) ∂G(x; z)/∂ν dS(z) + ∫_U G(x; y)f (y)dy. (1.98)

This expression is particularly useful when G(x; z) is known explicitly, and we will discuss
below some examples when it can be computed analytically.

How to approximate the Green’s function


Let us now briefly discuss how one can approximate the Green’s function that we have just
constructed. Let us think of Gh (x, y) as an approximate Green’s function in a domain U
as h → 0. We will keep the boundary condition G(x, y) = 0 for any x ∈ U and y ∈ ∂U , so
that
Gh (x, y) = 0 for y ∈ ∂U .
Let now u(x) be the solution of the boundary value problem

−∆u = f,

with the Dirichlet boundary condition u = 0 on ∂U . Then, according to (1.98), we have
u(x) = ∫_U G(x, y)f (y)dy.

As Gh (x, y) should approximate G(x, y), we should have


u(x) = ∫_U Gh (x, y)f (y)dy + l.o.t. (1.99)

Let us then look for fh (x, y) such that if Gh (x, y) satisfies the boundary value problem

−∆y Gh = fh (x, y) in U , (1.100)
Gh (x, y) = 0 for y ∈ ∂U ,

then (1.99) holds. Since both Gh and u vanish on the boundary, Green’s formula says that
∫_U Gh (x, y)∆y u(y)dy = ∫_U u(y)∆y Gh (x, y)dy.

We can write this as
∫_U Gh (x, y)f (y)dy = ∫_U fh (x, y)u(y)dy.

The question now becomes: what is a suitable choice of fh (x, y) such that
∫_U fh (x, y)u(y)dy − u(x) → 0 as h → 0.

A suitable choice is to take a fixed f (x) ≥ 0 such that f (x) = 0 for |x| ≥ 1, and
∫_{|y|≤1} f (y)dy = 1,

and set (recall that x ∈ Rn ):


fh (x; y) = (1/hn ) f ((x − y)/h). (1.101)
Indeed, for a continuous function u(x) we have for any x fixed, and h < dist(x, ∂U )
∫_U fh (x, y)u(y)dy = (1/hn ) ∫_U f ((x − y)/h) u(y)dy = (1/hn ) ∫_{|x−y|≤h} f ((x − y)/h) u(y)dy
= ∫_{|z|≤1} f (z)u(x − hz)dz → u(x) ∫_{|z|≤1} f (z)dz = u(x). (1.102)

Thus, to find the approximation Gh (x, y) to the Green's function we need to solve (1.100).
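In the simplest setting this is a standard finite-difference computation. A minimal sketch (Python, assuming numpy and scipy are available; the unit square, the 5-point Laplacian, and the point-source normalization are our own choices, with the bump fh degenerated to a single grid node):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

N = 50                    # mesh size h = 1/N; interior nodes form an (N-1)^2 grid
h = 1.0 / N
I = sp.identity(N - 1)
T = sp.diags([-1.0, 2.0, -1.0], [-1, 0, 1], shape=(N - 1, N - 1))
A = (sp.kron(I, T) + sp.kron(T, I)) / h**2     # discrete -Laplacian, zero BC

def green_column(ix, iy):
    # discrete analog of (1.100): -Laplace_h G_h = f_h, with f_h a unit-mass
    # source concentrated at the interior node (ix, iy)
    rhs = np.zeros((N - 1) ** 2)
    rhs[(iy - 1) * (N - 1) + (ix - 1)] = 1.0 / h**2
    return spla.spsolve(A.tocsr(), rhs)

G = green_column(N // 2, N // 2)   # G_h(., y) for y at the center of the square
# superposition then recovers u: u ~ h^2 * sum_y G_h(., y) f(y), as in (1.99)
print(G.max())
```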

Reciprocity of the Green’s function


Physically, the meaning of G(x, y) is as follows: it is the value of the electric potential at the
point x when a localized charge is put at the point y. The physical principle of reciprocity
implies that G(x; y) = G(y; x). Let us show that this is, indeed, the case.

Theorem 1.12 We have for all x, y ∈ U , x ≠ y:

G(x; y) = G(y; x). (1.103)

Proof. Once again, we will use Green’s formula. Let x ≠ y be two distinct points in U , and
set v(z) = G(x; z) and w(z) = G(y; z). Let us cut out two small balls B(x, ε) and B(y, ε)
with ε > 0 so small that the balls are not overlapping and are contained in U . Let

Vε = U \ (B(x, ε) ∪ B(y, ε))

be the domain U with the two balls deleted. Then ∆z w = ∆z v = 0 in Vε as this set contains
neither the point x nor the point y. The Green’s formula then becomes
∫_{∂Vε} [w(z) ∂v(z)/∂ν − v(z) ∂w(z)/∂ν] dS(z) = 0.

The boundary of Vε consists of three pieces: the outer boundary ∂U where both w and v
vanish, and the two spheres ∂B(x, ε) and ∂B(y, ε). Therefore, we have
∫_{∂B(x,ε)} [w(z) ∂v(z)/∂ν − v(z) ∂w(z)/∂ν] dS(z) + ∫_{∂B(y,ε)} [w(z) ∂v(z)/∂ν − v(z) ∂w(z)/∂ν] dS(z) = 0. (1.104)
Here ν is the normal pointing into the spheres (it is the outward normal to Vε , which lies
outside the two small balls). The functions w and v look as follows near x and y: v is regular
in B(y, ε), w is regular in B(x, ε). On the other hand, v(z) = Φ(x − z) − φ(z; x), and φ(z; x) is

regular in z, including in B(x, ε), while w(z) = Φ(y − z) − φ(z; y), and φ(z; y) is regular in z,
including in B(y, ε). Hence, as in the discussion in the previous section leading up to (1.97),
the main terms in (1.104) are
∫_{∂B(x,ε)} w(z) ∂Φ(x − z)/∂ν dS(z) − ∫_{∂B(y,ε)} v(z) ∂Φ(y − z)/∂ν dS(z) + l.o.t. = 0, (1.105)
where l.o.t. denotes terms that tend to zero as ε → 0. Passing to the limit ε ↓ 0 exactly as
in (1.90), since w(z) is continuous at x, and v(z) is continuous at y, gives w(x) − v(y) = 0,
which is exactly (1.103). □

Green’s function for a half space


We will now construct explicitly the Green’s function G(x; y) for a half space
Rn+ = {(x1 , . . . , xn ) : xn > 0}.
This is done by the method of reflections. We need to find a function φ(x; y) that solves the
following boundary value problem:
−∆y φ(x; y) = 0 for y ∈ Rn+ , (1.106)
φ(x; y) = Φ(x − y) for y such that yn = 0,
with a given point x ∈ Rn+ . Consider the reflected point x̄ = (x1 , . . . , xn−1 , −xn ), that lies
in the lower half-space, and the corresponding fundamental solution Φ(x̄ − y). Then (i) as x̄
lies outside of Rn+ , this function is harmonic in Rn+ , and (ii) since for any y with yn = 0 we
have |x − y| = |x̄ − y| and since the function Φ(x) is radial, we have Φ(x − y) = Φ(x̄ − y) for
all points y on the hyper-plane {yn = 0}. Therefore, φ(x; y) = Φ(x̄ − y), and
G(x; y) = Φ(x − y) − Φ(x̄ − y) (1.107)
is the Green’s function for the half space Rn+ . We also need to compute its normal derivative
at the hyper-plane {yn = 0}:
 
∂G(x; y) ∂Φ(y − x) ∂Φ(y − x̄) 1 y n − xn y n + xn
= − =− − .
∂yn ∂yn ∂yn nα(n) |y − x|n |y − x|n
Hence, if yn = 0 then the outward normal to ∂Rn+ at the point y is:
∂G(x; y) ∂G(x, y) 2xn
=− =− .
∂ν ∂yn nα(n)|y − x|n
Let us define the Poisson kernel for the half-space
2xn
K(x, y) = n
, x ∈ Rn+ , y ∈ ∂Rn+ . (1.108)
nα(n)|y − x|
Then, (1.98) tells us that solution of the boundary value problem
−∆u = 0 for y ∈ Rn+ , (1.109)
u(x) = g(x) for x such that xn = 0,

29
with a prescribed function g(x) is
ˆ ˆ
2xn g(y)
u(x) = K(x, y)g(y)dy1 . . . dyn−1 = dy1 . . . dyn−1 , (1.110)
Rn−1 nα(n) Rn−1 |x − y|n

for x ∈ Rn . Note that the denominator never vanishes since xn > 0 and the distance |x − y|
is computed in Rn so that
1/2
|x − y| = (x1 − y1 )2 + . . . + (xn−1 − yn − 1)2 + x2n .

Of course, one should be more careful here: what (1.98) actually says is that the only
possible smooth solution of (1.109) is given by the convolution with the Poisson kernel, but
we do not know yet that the function u(x) defined by (1.110) is, indeed, a solution of (1.109).
It is quite straightforward to verify that u(x) is harmonic in the upper half-space: all we need
to check for that is that
∆x K(x, y) = 0,
and that is true because (i) y is not in the interior of Rn+ – hence, K(x, y) is regular at
all x ∈ Rn+ , and (ii) K(x, y) = ∂G(x, y)/∂yn , and G(x, y) is harmonic (in both x and y
for x 6= y).
It is much more delicate to verify that the boundary condition for u(x) holds, that is,
that u(x) is continuous up to the boundary {xn = 0}, and that

lim u(x1 , . . . , xn ) = g(x1 , . . . , xn−1 ). (1.111)


xn →0

The reason why the boundary condition holds is as follows. First, one can verify easily that
ˆ
K(x, y)dy = 1, (1.112)
Rn−1

for each fixed x ∈ Rn+ . Second, we can write the Poisson kernel as

2 xn
K(x1 , . . . , xn−1 , xn , y) =
nα(n) ((x1 − y1 )2 + . . . + (xn−1 − yn−1 )2 + x2n )n/2
" 2  2 #−n/2
2 x1 − y 1 xn−1 − yn−1
= + ... + +1 .
nα(n)xn−1
n xn xn

Denoting x0 = (x1 , . . . , xn−1 ) we obtain then


ˆ ˆ " #−n/2
x − y 2
0
2
u(x0 , xn ) = K(x, y)g(y)dy = 1 + g(y)dy.
Rn−1 nα(n)xn−1
n Rn−1 xn

Changing variables y 0 = (x0 − y)/xn gives


ˆ
0 2 −n/2
1 + |y 0 |2 g(x0 − xn y 0 )dy 0 .

u(x , xn ) =
nα(n) Rn−1

30
Letting xn → 0 gives
ˆ
0 2 −n/2
1 + |y 0 |2 g(x0 )dy 0 = g(x0 ).

lim u(x , xn ) = (1.113)
xn ↓0 nα(n) Rn−1

In the last identity we used (1.112) with x = (0, . . . , 0, 1). The passage to the limit xn → 0
in (1.113) follows from continuity and boundedness of the function g(x). Therefore, u(x)
satisfies the boundary condition in (1.109). Note that this computation is essentially the
same as what we have done in the approximation of the Green’s function in (1.101)-(1.102).

1.3.4 Energy and variational methods


Uniqueness by the energy method
We now get a glimpse of what is known as energy methods which are a very useful tool in
dealing with many PDE problems. W have previously shown that solution of the boundary
value problem

−∆u = f in U , (1.114)
u = g on ∂U

in a bounded domain U in Rn with a smooth boundary, is unique, using the maximum


principle. Here is a way to show uniqueness without using the maximum principle. Let u1
and u2 be two solutions of (1.114). The function v = u1 − u2 satisfies

−∆v = 0 in U , (1.115)
v = 0 on ∂U .

Let us multiply the Laplace equation (1.115) by v and integrate over the domain U :
ˆ
v∆v = 0.
U

The Green’s formula then implies


ˆ ˆ
∂v
0= v dS − |∇v|2 dx.
∂U ∂ν U

As v = 0 on ∂U , we conclude that
ˆ
|∇v|2 dx = 0,
U

whence v = 0 and u1 = u2 proving the uniqueness claim.

The variational formulation


We now discuss how PDEs can be reformulated in terms of variational problems. This is a
very powerful tool (when it is available, and it is possible not for all PDEs) both to prove
existence and uniqueness of the solution, and, maybe more importantly, to find it numerically.

31
In order to understand the idea, let us recall some linear algebra. Consider the equation

Ax = b, (1.116)

where A is a real-valued n × n matrix, b is a given vector in Rn , and x ∈ Rn is the unknown.


Assume that the matrix A is symmetric positive-definite, meaning that there exists a positive
number a > 0 so that for any y ∈ Rn we have

(Ay · y) ≥ a|y|2 . (1.117)

Consider the function


1
G(x) = (Ax · x) − (b · x), (1.118)
2
defined for all x ∈ Rn . Note that

G(x) ≥ a|x|2 − |b||x|,

and thus G(x) → +∞ as |x| → +∞. It follows that G(x) is bounded from below and attains
its minimum at some point x̄ = (x̄1 , . . . , x̄n ) ∈ Rn . Using basic calculus we find that this
point must satisfy the equations
n
X
Aij x̄j = bi , i = 1, . . . , n, (1.119)
j=1

which, of course, is nothing but (1.116). Therefore, solving equation (1.116) is exactly equiv-
alent to finding the minimal point of the function G(x). The latter might be a much easier
problem in many circumstances, especially so since the function G(x) is convex, thus it has a
unique minimum.
This idea can be generalized to many PDEs, in particular, to the Poisson equation. We
define the energy functional
ˆ  
1 2
I[w] = |∇w| − wf dx, (1.120)
U 2

and the class of admissible functions

A = {w ∈ C 2 (Ū ) : w = g on ∂U }.

Theorem 1.13 A function u ∈ C 2 (Ū ) solves the boundary value problem

−∆u = f in U (1.121)
u = g on ∂U

if and only if u ∈ A and


I[u] = min I[w]. (1.122)
w∈A

32
Proof. Let us assume that u solves (1.121), take w ∈ A, multiply (1.121) by (u − w) and
integrate: ˆ
(−∆u − f )(u − w)dx = 0.
U
Integrating by parts using the Green’s formula gives
ˆ
(∇u · ∇(u − w) − f (u − w))dx = 0.
U

We used here the fact that u − w = 0 on ∂U to kill the boundary terms in Green’s formula.
It follows that ˆ ˆ
2
(|∇u| − f u) = (∇u · ∇w − f w)dx. (1.123)
U
Now comes the crucial trick: note that
1 1
|(∇u · ∇w)| ≤ |∇u|2 + |∇w|2 .
2 2
Using this in (1.123) leads to
ˆ ˆ
1 1
2
(|∇u| − f u) ≤ ( |∇u|2 + |∇w|2 − f w)dx, (1.124)
U 2 2
hence, ˆ ˆ
1 1
( |∇u|2 − f u) ≤ ( |∇w|2 − f w)dx, (1.125)
U 2 2
which is nothing but I[u] ≤ I[w]. Therefore, if u solves the boundary value problem (1.121)
then it minimizes the functional I[w] over w ∈ A.
To show the other direction, let u be a minimizer of I[w] over A. Take a function v that
is smooth in U and vanishes on the boundary ∂U . Consider the increment of I[w] in the
direction v:
r(s) = I[u + sv].
Then, the function u+sv is in A, and, as u minimizes I[w] over A, we should have r(s) ≥ r(0)
for all s ∈ R. The function r(s) is a quadratic function of s:
ˆ  
1 2
r(s) = |∇u + s∇v| − (u + sv)f dx
U 2
ˆ  ˆ ˆ
s2

1 2
= |∇u| − uf dx + s (∇u · ∇v − vf ) dx + |∇v|2 dx.
U 2 U 2 U
As r(s) attains its minimum at s = 0, we have
ˆ
(∇u · ∇v − vf ) dx = 0. (1.126)
U

Integrating by parts, as v = 0 on ∂U , gives


ˆ
(−∆u − f )vdx = 0.
U

33
Since this identity holds for all smooth functions v that vanish at the boundary ∂U , it follows
that u satisfies
−∆u = f,
and this finishes the proof – the boundary condition u = g on ∂U is satisfied automatically
since u ∈ A. 2

1.3.5 Image denoising as a variational problem


The mage denoising problem is to find a function u that is close to a recorded image f but
has ”less noise”. One approach to this problem is to consider the functional
ˆ ˆ
J(w) = λ |∇w| dx + (w − f )2 dx.
2
(1.127)
U U

Here, we think of f as a noisy measured image, U is the domain of the recording sensor, λ is
a small parameter, and we are looking for a function w that is close to f but is ”reasonably
smooth” – this is why the gradient term appears in (1.133). It is natural to assume that the
normal derivative of w vanishes at the image edges, so that our goal is to find a minimizer
of J(w) over the set
∂u
A = {u ∈ C 2 (Ω) ∩ C(Ω̄) : = 0 on ∂Ω.}
∂ν
An alternative is to minimize J(w) over the set

Ag = {u ∈ C 2 (Ω) ∩ C(Ω̄) : u = g on ∂Ω,}

where, g is a smoothed version of f on the boundary – the question how we smooth f on the
boundary is separate, and we do not touch upon it here. Let us assume that w is a minimizer
of J(w) over A and variate J over A: take a smooth function η(x) such that ∂η/∂ν = 0 on ∂Ω
and compute
ˆ ˆ ˆ ˆ
J(w + sη) = λ |∇w + s∇η| dx + (w + sη − f ) dx = λ |∇w| dx + (w − f )2 dx
2 2 2

ˆ Ω ˆ Ω ˆ Ω ˆ Ω

+2sλ (∇η · ∇w)dx + s2 λ |∇η|2 dx + 2s (w − f )ηdx + s2 η 2 dx. (1.128)


Ω Ω Ω Ω

We see that J(w + sη) attains its minimum at s = 0 if


ˆ ˆ
λ (∇η · ∇w)dx + (w − f )ηdx = 0. (1.129)
Ω Ω

As ∂w/∂ν = 0 on the boundary, this can be re-written as


ˆ ˆ
−λ (∆w)ηdx + (w − f )ηdx = 0. (1.130)
Ω Ω

Now, for w to be a minimum of J(w) over the admissible set A, the function

f (s) = J(w + sη),

34
should attain its minimum at s = 0 for all smooth test functions η that vanish on ∂Ω,
and (1.130) should hold for all such η(x), hence w should be the solution of the boundary
value problem

−λ∆w + w = f in Ω, (1.131)
∂w
= 0 on ∂Ω.
∂ν
Exercise 1.14 Show that if we were minimizing J(w) over the set Ag we would have arrived
at the problem

−λ∆w + w = f in Ω, (1.132)
w = g on ∂Ω.

The truth is that this denoising method oversmooths and is terrible at preserving edges in
an image even for small values of λ, but it is a legitimate first attempt at denoising. A much
better functional to use is
ˆ ˆ
˜
J(w) = λ |∇w|dx + (w − f )2 dx. (1.133)
U U

˜
However, minimizing J(w) leads to a rather complicated nonlinear PDE, so we will not
consider this problem here.

1.4 The heat equation


1.4.1 Another probabilistic interlude
We will now consider the heat (or diffusion) equation
∂u
− ∆u = 0. (1.134)
∂t
Usually it is obtained from a balance of heat or concentration that assumes that the flux of
heat is F = −∇u, where u is the temperature – heat flows from hot to cold. Here, we derive
it informally starting with a probabilistic model.
Consider a lattice on the real line of mesh size h: xn = nh. Let X(t) be a random walk
on this lattice that starts at some point x, and after a delay τ jumps to the left or right with
probability 1/2: P (X(τ ) = x + h) = P (X(τ ) = x − h) = 1/2. Then it waits again for time τ ,
and again jumps to the left or right with probability 1/2, and so on. Let S be a subset of
the real line and define u(t, x) = P (X(t) ∈ S|X(0) = x) – this is the probability that at a
time t > 0 the particle is inside the set S given that it started at the point x at time t = 0.
Let us derive an equation for u(t, x). Since the process ”starts anew” after every jump we
have the relation
1 1
P (X(t) ∈ S|X(0) = x) = (X(t − τ ) ∈ S|X(0) = x + h) + (X(t − τ ) ∈ S|X(0) = x − h),
2 2
which is
1 1
u(t, x) = u(t − τ, x + h) + u(t − τ, x − h). (1.135)
2 2

35
Let use assume that τ and h are small and use Taylor’s formula in the right side above.
Then (1.135) becomes:
∂u(t, x) h2 ∂ 2 u τ 2 ∂ 2 u(t, x) ∂ 2 u(t, x)
 
1 ∂u(t, x)
u(t, x) = u(t, x) − τ +h + + − τh
2 ∂t ∂x 2 ∂x2 2 ∂t2 ∂x∂t
2 2 2 2 2
 
1 ∂u(t, x) ∂u(t, x) h ∂ u τ ∂ u(t, x) ∂ u(t, x)
+ u(t, x) − τ −h + 2
+ 2
+ τh + ...,
2 ∂t ∂x 2 ∂x 2 ∂t ∂x∂t
which is
∂u h2 ∂ 2 u τ 2 ∂ 2 u(t, x)
τ = + + ...
∂t 2 ∂x2 2 ∂t2
In order to get a non-trivial balance we set τ = h2 . Then the term involving utt in the right
side is smaller than the rest and in the leading order we obtain
∂u 1 ∂ 2u
= , (1.136)
∂t 2 ∂x2
which is the diffusion equation (we could get rid of the factor of 1/2 if we took τ = h2 /2 but
probabilists do not like that). It is supplemented by the initial condition

1, if x ∈ S
u(0, x) =
0 if x ∈
/ S.
More generally, we can take a bounded function f (x) defined on the real line and set
v(t, x) = E{f (X(t))| X(0) = x}.
Essentially an identical argument shows that if τ = h2 then in the limit h → 0 we get the
following Cauchy problem for v(t, x):
∂v 1 ∂ 2v
= (1.137)
∂t 2 ∂x2
v(0, x) = f (x).
What should we expect for the solutions of the Cauchy problem given this informal prob-
abilistic representation? First, it should preserve positivity: if f (x) ≥ 0 for all x ∈ R, we
should have u(t, x) ≥ 0 for all t > 0 and x ∈ R. Second, the maximum principle should
hold: if f (x) ≤ M for all x ∈ R, then we should have u(t, x) ≤ M for all t > 0 and x ∈ R
because the expected value of a quantity can not exceed its maximum. We should also expect
that maxx∈R v(t, x) decays in time, at least if f (x) is compactly supported – this is because
the random walk will tend to spread around and at large times the probability to find it on
the set where f (x) does not vanish, is small.

1.4.2 The heat kernel and the Cauchy problem


The fundamental solution Φn (x) of the Laplace equation has the property that the convolution
ˆ
u(x) = Φn (x − y)f (y)dy
Rn

36
gives a solution of the Poisson equation

−∆u = f (x) in Rn .

Let us now try to find the “moral equivalent” of the fundamental solution for the heat equa-
tion. More precisely, we will look for a function G(t, x) so that the convolution
ˆ
u(t, x) = G(t, x − y)f (y)dy (1.138)

gives a solution of the Cauchy problem (the initial value problem):

ut = ∆u, t > 0, x ∈ Rn , (1.139)


u(0, x) = f (x).

Let us first look for some symmetries that the function G(t, x) has to satisfy. The key
observation is that if u(t, x) is a solution to (1.139), then for all λ > 0 the rescaled function

uλ (t, x) = u(λ2 t, λx)

also solves the heat equation, but with the rescaled initial data

uλ (0, x) = u(0, λx) = f (λx).

It follows that for any initial data f (x) and any λ > 0 we should have
ˆ
2
u(λ t, λx) = uλ (t, x) = G(t, x − y)f (λy)dy.

Using expression (1.138) for u(λ2 t, λt) leads to


ˆ ˆ ˆ
2 1 y
G(λ t, λx − y)f (y)dy = G(t, x − y)f (λy)dy = n G(t, x − )f (y)dy. (1.140)
λ λ

This identity holds for any continuous function f (y) with compact support, hence G(t, x) has
to satisfy
1 y
G(λ2 t, λx − y) = n G(t, x − ), (1.141)
λ λ
for all λ > 0, t > 0 and x, y ∈ Rn . Denoting z = x−y/λ, we get an equivalent form of (1.141):
for any λ > 0, t > 0 and z ∈ Rn we have
1
G(λ2 t, λz) = G(t, z). (1.142)
λn

Let us choose λ = 1/ t then (1.142) implies
z
G(1, √ ) = tn/2 G(t, z). (1.143)
t

37
In other words, the function G(t, z) has to be of the form
1 z
G(t, z) = v( √ ). (1.144)
tn/2 t
Here we have denoted v(z) = G(1, z). This means that G(t, z) is self-similar: its shape at
different times can be obtained by simple rescaling of the ”shape” of the function v(y), y ∈ Rn .
Let us make a more general self-similar ansatz:
1 x
u(t, x) = m v β , (1.145)
t t
and see for which m and β we can find a self-similar solution of the heat equation

ut = ∆u. (1.146)

We insert the ansatz (1.145) into (1.146) and compute, with y = x/tβ :
m β 1
− v(y) − x · ∇v(y) − ∆v(y) = 0,
tm+1 tm+β+1 tm+2β
or
m β 1
− v(y) − y · ∇v(y) − ∆v(y) = 0, (1.147)
tm+1 tm+1 tm+2β
Now, for (1.147) to be true for all t > 0 and y ∈ Rn , we need this equation to involve only the
variable y – this forces us to take β = 1/2. With this choice of β, equation (1.147) becomes
y
mv(y) + · ∇v(y) + ∆v(y) = 0 (1.148)
2
In order to simplify further we assume that v is radial. Actually, a good exercise is to convince
yourself that the radial symmetry of the heat equation implies that the heat kernel, if it exists,
has to be radial. In other words, this assumption should be automatically satisfied, and we
must have v(y) = w(|y|), with some function w(r), r > 0, turning (1.148) into
n−1 0 r 0
w00 (r) + w + w + mw = 0. (1.149)
r 2
Let us add the boundary condition w(0) = 1, w0 (0) = 0 to (1.149), and look for w(r) > 0 that
decays rapidly as r → +∞ and is positive for all r > 0. Multiplying (1.149) by rn−1 gives
 n 0
n−1 0 0 r n−1
 n
(r w (r)) + w +r m− w = 0. (1.150)
2 2

The boundary conditions at r = 0 and rapid decay of w(r) as r → +∞ imply that


ˆ ∞ ˆ ∞  n+1 0
n 0 0 r
(r w (r)) dr = w = 0.
0 0 2
It follows that ˆ ∞
 n
m− rn w(r)dr = 0.
2 0

38
As we are looking for w(r) > 0, it forces the value m = n/2. With this choice of m, (1.150)
can be solved giving a solution with the properties we have required: integrating this equation
once gives
rn
rn−1 w0 (r) + w = 0, (1.151)
2
so
r
w0 (r) = − w,
2
and
2
w(r) = be−r /4 , (1.152)
with an arbitrary constant b > 0 gives a solution. Therefore, we obtain the following family
of positive solutions to the heat equation:
b 2 /(4t)
u(t, x) = e−|x| , b > 0. (1.153)
tn/2
This motivates the following
Definition 1.15 The function
1 2
G(t, x) = n/2
e−|x| /(4t) , t > 0, x ∈ Rn , (1.154)
(4πt)
is called the heat kernel.
Note that ˆ
G(t, x)dx = 1 for all t > 0. (1.155)
Rn
This is because
ˆ ˆ ˆ ˆ
1 −|x|2 /(4t) 1 −z 2 /4 1 2
G(t, x)dx = n/2
e dx = n/2
e dz = n/2
e−z /4 dz
Rn (4πt) n (4π) Rn (4π) Rn
 ˆ ∞ n R
1 2
= √ e−s /4 ds
4π −∞
while
 ˆ ∞ 2 ˆ ˆ ∞ ˆ 2π
1 −s2 /4 1 −(x2 +y 2 )/4 1 2
√ e ds = e dxdy = e−r /4 rdrdφ
4π 4π 4π 0
−∞
ˆ ∞ ˆ ∞ 0
1 2
= re−r /4 dr = e−s ds = 1.
2 0 0

Actually, identity (1.155) is the reason why we chose the prefactor to be (4π)−n/2 in the
definition of the heat kernel.
Notice that if u(t, x) is a solution of the heat equation, then so is the shifted func-
tion u(t, x − y) for any y ∈ Rn fixed. It follows that the function
ˆ
v(t, x) = G(t, x − y)f (y)dy, t > 0, x ∈ Rn , (1.156)
Rn

39
is also a solution of the heat equation, provided that the function f (y) is such that we can
differentiate under the integral sign above. This is true, in particular, if f (y) is bounded and
continuous. Moreover, under these assumptions we can differentiate v(t, x) as many times as
we wish – that is, v(t, x) is infinitely differentiable even if f (y) is just continuous and bounded.
Let us now see what happens to v(t, x) as t ↓ 0 – note that (1.156) defines it only for t > 0.
To this end, we first note that
ˆ
1 2

n/2
e−z /4 dz = 1, (1.157)
(4π) Rn

which can be seen by taking t = √ 1 in (1.155). Now, let t > 0 be small and let us make a
change of variables z = (x − y)/ t in (1.156):
ˆ ˆ √
1 −|x−y|2 /(4t) 1 2
v(t, x) = n/2
e f (y)dy = n/2
e−z /4 f (x − z t)dz. (1.158)
(4πt) Rn (4π) Rn

As f is continuous, we have √
f (x − z t) → f (x),
for each x ∈ Rn and z ∈ Rn fixed. Since f is also globally bounded, |f (x)| ≤ M for all x ∈ Rn ,
we can use Lebesgue dominated convergence theorem to conclude from (1.158) that
ˆ √ ˆ
1 −z 2 /4 1 2
v(t, x) = n/2
e f (x − z t)dz → f (x) n/2
e−z /4 dz = f (x). (1.159)
(4π) Rn (4π) Rn

We used identity (1.157) in the last step above. We summarize this discussion as follows.
Theorem 1.16 Let f (x) be a continuous function in Rn and assume that there exists a
constant M > 0 so that |f (x)| ≤ M for all x ∈ Rn . Then the function v(t, x) defined
by (1.156) is infinitely differentiable for all t > 0 and x ∈ Rn and, in addition, satisfies the
heat equation
∂v
− ∆v = 0, t > 0, x ∈ Rn ,
∂t
as well as the initial condition v(0, x) = f (x).

1.4.3 Duhamel’s principle for the heat equation


Let us now consider the inhomogeneous heat equation with the zero initial condition
∂v
− ∆v = f (t, x), t > 0, x ∈ Rn , (1.160)
∂t
v(0, x) = 0, x ∈ Rn .

Duhamel’s principle, as discussed at the end of Section 1.2 says that solution of (1.160) can
be found as follows: for every fixed s ∈ [0, t] solve the following Cauchy problem for the
function u(t, x; s), defined for t ≥ s:
∂u(t, x; s)
− ∆u(t, x; s) = 0, t ≥ s, x ∈ Rn , (1.161)
∂t
u(t = s, x; s) = f (s, x), x ∈ Rn .

40
A solution of this problem is
ˆ
u(t, x; s) = G(t − s, x − y)f (s, y)dy.
Rn

The Duhamel’s principle says that solution of (1.160) is given by


ˆ t ˆ tˆ
v(t, x) = u(t, x; s)ds = G(t − s, x − y)f (s, y)dyds. (1.162)
0 0 Rn

Let us verify that this is, indeed, the case. If we could differentiate under the integral sign
that would be simple:
ˆ t
∂v(t, x) ∂u(t, x; s)
− ∆v(t, x) = u(t, x; t) + ( − ∆u(t, x; s))ds = f (t, x). (1.163)
∂t 0 ∂t
The subtlety here is that G(t − s, x − y) is singular at s = t and we need to justify this
formal procedure. This is done very similarly to what we did in the proof of Thereom 1.1 so
we will just briefly describe how this can be done. Let us rewrite v(t, x) given by (1.162) as
ˆ tˆ
v(t, x) = G(s, y)f (t − s, x − y)dyds. (1.164)
0 Rn

Then, if f is sufficiently smooth, we can argue as in the proof of the aforementioned theorem
to get
ˆ tˆ  
∂v(t, x) ∂f (t − s, x − y)
− ∆v(t, x) = G(s, y) − ∆x f (t − s, x − y) dyds
∂t ∂t
ˆ 0 Rn

+ G(t, y)f (0, x − y)dy. (1.165)


Rn

The potential trouble point in the first integral in the right side of (1.165) is s = 0 where G(s, y)
is singular. Hence, we take a small ε > 0 and, using also the fact that
∂f (t − s, x − y) ∂f (t − s, x − y)
∆x f (t − s, x − y) = ∆y f (t − s, x − y), =− ,
∂t ∂s
we split the integral in (1.165) as
ˆ tˆ  
∂v(t, x) ∂f (t − s, x − y)
− ∆v(t, x) = G(s, y) − − ∆y f (t − s, x − y) dyds
∂t Rn ∂s
ˆ εˆ  ε
 ˆ
∂f (t − s, x − y)
+ G(s, y) − − ∆y f (t − s, x − y) dyds + G(t, y)f (0, x − y)dy
0 Rn ∂s Rn
= Iε + Jε + K. (1.166)

The second term in the right side above is small when ε → 0: let

∂f (t, x)
Mf = max n + max |∆f (t, x)| ,
t∈R,x∈R ∂t t∈R,x∈Rn

41
then ˆ εˆ
|Jε | ≤ Mf G(s, y)dsdy = εMf ,
0 Rn

because of (1.155). In order to evaluate Iε we integrate by parts in s and y:


ˆ tˆ  
∂f (t − s, x − y)
Iε = G(s, y) − − ∆y f (t − s, x − y) dyds
ε Rn ∂s
ˆ tˆ  
∂G(s, y)
= − ∆y G(s, y) f (t − s, x − y)dyds
∂s
ˆε R
n

− [G(t, y)f (0, x − y) − G(ε, y)f (t − ε, x − y)] dy.


Rn

The term in the middle line above vanishes since


∂G(s, y)
− ∆y G(s, y) = 0.
∂s
Moreover, the first term in the last line is simply (−K) (with K as in (1.166)), hence
ˆ ˆ
1 2
Iε + K = G(ε, y)f (t − ε, x − y)dy = n/2
e−|y| /(4ε) f (t − ε, x − y)dy.
Rn (4πε) Rn

With the change of variables z = y/ ε this becomes
ˆ
1 2 √
Iε + K = n/2
e−|z| /4 f (t − ε, x − εz)dz.
(4π) Rn

Therefore, as ε ↓ 0 this converges to


ˆ
1 2

n/2
e−|z| /4 f (t, x)dz = f (t, x).
(4π) Rn

We conclude that Duhamel’s formula, indeed, gives the solution of the inhomogeneous prob-
lem (1.160).

1.4.4 The maximum principle for the heat equation


Let us now prove the maximum principle for the heat equation, first in a bounded domain.
Let U ⊂ Rn be a bounded domain and T > 0. We define

UT = {(t, x) : x ∈ U, 0 ≤ t ≤ T },

and
ΓT = {(t, x) : t = 0 and x ∈ U , or 0 ≤ t ≤ T and x ∈ ∂U }.
Note that the definition of the ”parabolic boundary” ΓT does not include the final time
t = T – it looks in time-space as a cylinder without the top.

42
Theorem 1.17 Let the function u(t, x) satisfy the heat equation
∂u
− ∆u = 0
∂t
in UT . Then u achieves its maximum and minimum over UT on the parabolic boundary ΓT .
In other words, u attains its maximum either at some point x ∈ U at the initial time t = 0,
or, if it attains its maximum at some point (t0 , x0 ) with t0 > 0 then x0 has to belong to the
boundary ∂U .
Proof. Take ε > 0 and consider the function
v(t, x) = u(t, x) − εt.
The function v(t, x) satisfies
∂v
− ∆v = −ε. (1.167)
∂t
Consider the domain [
ŪT = UT {(T, x) : x ∈ U },
which is the union of UT and the ”top” of the parabolic cylinder. The function v(t, x) must
attain its maximum over the set ŪT at some point (t0 , x0 ) ∈ ŪT . We claim that this point
has to lie on the parabolic boundary ΓT . Indeed, if 0 < t0 < T and x0 is not on the boundary
∂U , then the point (t0 , x0 ) is an interior maximum of v(t, x) and as such should satisfy
∂v(t0 , x0 )
= 0, ∆v(t0 , x0 ) ≤ 0,
∂t
which is impossible because of (1.167). On the other hand, if t0 = T and x0 is an interior
point of U , and u attains its maximum over Ū at this point, then we should have
∂v(t0 , x0 )
≥ 0, ∆v(t0 , x0 ) ≤ 0,
∂t
which, once again contradicts (1.167). Hence, the function v attains its maximum over ŪT at
a point (t0 , x0 ) that belongs to ΓT . It means that
max v(t, x) = max v(t, x) ≤ max u(t, x).
(t,x)∈ŪT (t,x)∈ΓT (t,x)∈ΓT

However, we also have


max u(t, x) ≤ εT + max v(t, x).
(t,x)∈ŪT (t,x)∈ŪT

Putting the last two inequalities together gives


max u(t, x) ≤ εT + max u(t, x).
(t,x)∈ŪT (t,x)∈ΓT

As ε > 0 is arbitrary, it follows that


max u(t, x) ≤ max u(t, x),
(t,x)∈ŪT (t,x)∈ΓT

and the proof is complete. 2


The situation with the maximum principle in the whole space is slightly more delicate.
Let us just state the result.

43
Theorem 1.18 Let u(t, x) be a smooth solution of the Cauchy problem

∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.168)
∂t
u(0, x) = f (x).

Assume that u satisfies the estimate


2
u(t, x) ≤ Aea|x| , for all x ∈ Rn and 0 ≤ t ≤ T , (1.169)

with some constants a and A. Then we have

sup u(t, x) = sup f (x).


(t,x)∈[0,T ]×Rn x∈Rn

The proof of this theorem is more technical than in a bounded domain but the idea is similar in
spirit so we not present it here. We just mention that there are solutions of the homogeneous
Cauchy problem
∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.170)
∂t
u(0, x) = 0

that are not identically equal to zero and grow very rapidly at infinity. The role of assumption
(1.169) is to preclude those.

1.4.5 Uniqueness
Let us now go back to the setting of Theorem 1.17.

Theorem 1.19 Let g be a continuous function on the parabolic boundary ΓT and let f be
continuous in UT . Then there exists at most one smooth solution to the initial boundary value
problem
∂u
− ∆u = f, in UT (1.171)
∂t
u = g on ΓT .

Note that the condition u = g on ΓT prescribes both the initial data for u at the time t = 0
and the boundary data along ∂U at times t > 0.
Proof. This follows immediately from the maximum principle for bounded domains (The-
orem 1.17). Indeed, if u1 and u2 solve (1.171) then the difference v = u1 − u2 satisfies

∂v
− ∆v = 0, in UT (1.172)
∂t
v = 0 on ΓT .

The maximum principle now implies that v = 0. 2

44
The same uniqueness result holds for solutions in the whole space: there exists at most
one solution of the initial value problem
∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.173)
∂t
u(0, x) = f (x)

that in addition satisfies the estimate


2
u(t, x) ≤ Aea|x| , for all x ∈ Rn and 0 ≤ t ≤ T , (1.174)

with some constants a and A. This follows from Theorem 1.18.

Uniqueness by the energy method


Let us now prove Theorem 1.19 without using the maximum principle, by what is known as
the energy method. We need to show that the only solution of
∂v
− ∆v = 0, in UT (1.175)
∂t
v = 0 on ΓT ,

is v ≡ 0. Let us define the energy


ˆ
E(t) = v 2 (t, x)dx.
U

We will show that any solution of the heat equation, regardless of the initial condition, satisfies
dE
≤ 0, (1.176)
dt
hence E(t) ≤ E(0) for all t ≥ 0. However, if v(0, x) = 0 then E(0) = 0, so that E(t) ≤ 0 for
all t ≥ 0. As E(t) is nonnegative by definition, it follows that E(t) = 0 for all t ≥ 0, which
means that v(t, x) ≡ 0.
Let us now show that energy inequality (1.176) holds. We will do this for a general solution
of the Cauchy problem with zero boundary conditions on ∂U :
∂u
− ∆u = 0, for 0 ≤ t ≤ T , x ∈ U , (1.177)
∂t
u(t, x) = 0 for 0 ≤ t ≤ T , x ∈ ∂U ,
u(0, x) = g(x) for x ∈ U .

The corresponding energy is ˆ


E(t) = u2 (t, x)dx.
U
We compute:
ˆ ˆ
dE ∂u(t, x)
=2 u(t, x) dx = 2 u(t, x)∆u(t, x)dx.
dt U ∂t U

45
Integration by parts gives
ˆ ˆ
dE ∂u(t, x)
=2 u(t, x) dS(x) − 2 ∇u · ∇udx.
dt ∂U ∂ν U

The boundary condition u = 0 on ∂U implies that the above is


ˆ
dE
= −2 |∇u(t, x)|2 dx ≤ 0,
dt U

hence (1.176) holds.

1.4.6 Regularity of solutions of the heat equation


Let us now discuss how regular solutions of the heat equation are. For simplicity, we will
consider only solutions in the whole space (generalization to bounded domains is not very
painful), and we will always assume that they obey the bounds (1.174) in order to ensure
uniqueness. Let u(t, x) satisfy the Cauchy problem

∂u
− ∆u = 0, t > 0, x ∈ Rn , (1.178)
∂t
u(0, x) = f (x).

It is given then explicitly as


ˆ ˆ
1 2 /(4t)
u(t, x) = G(t, x − y)f (y)dy = e−|x−y| f (y)dy. (1.179)
Rn (4πt)n/2 Rn

The first fact to observe is that if, say, f (x) is continuous and of compact support, we can
differentiate under the integral sign with respect to both t and x arbitrarily many times. This
is because such differentiations will lead to expressions of the sort
ˆ
2
p(t, x, y)e−(x−y) /(4t) f (y)dy,
Rn

where p(t, x, y) is some polynomial in x and y with coefficients that are rational functions of
t that is regular for t > 0. Such integrals converge, which justifies the differentiation under
the integral sign (modulo some very minor details). Therefore, if f (y) is just continuous and
bounded, the function u(t, x) is infinitely differentiable for all t > 0.
We also get directly from (1.179) that
ˆ
1
|u(t, x)| ≤ |f (y)|dy. (1.180)
(4πt)n/2 Rn

Therefore, M0 (t) = maxx∈Rn |u(t, x)| decays as t → +∞ as


ˆ
1
M0 (t) ≤ |f (y)|dy. (1.181)
(4πt)n/2 Rn

46
It is remarkable that only the initial mass
ˆ
m0 = |f (y)|dy
Rn

enters into the upper bound (1.181), and not the maximum of the initial data. This means
that solutions whose initial data consists of a narrow peak drops its maximum very quickly
(a narrow peak has a small mass despite having a large maximum).
Let us now differentiate (1.179). We get for, say, the partial derivative with respect to x1 :
ˆ
∂u(t, x) 1 (x1 − y1 ) −|x−y|2 /(4t)
= n/2
e f (y)dy.
∂x1 (4πt) Rn 2t
2 −z /4
Note that the
√ function p(z) = z1 e is globally bounded:
p it attains its maximum at the
point z̄ = ( 2, 0, 0, . . . , 0) where it takes the value p(z̄) = 2/e. It follows that
ˆ
∂u(t, x) 1 (x1 − y1 ) −|x−y|2 /(4t)
∂x1 (4πt)n/2 2√t n √
≤ e |f (y)|dy
t
ˆ R
ˆ
1 C1
≤ √ |f (y)|dy = (n+1)/2 |f (y)|dy,
(4πt)n/2 2et Rn t Rn

with C1 = (4π)n/2 2e. Generally, we have
ˆ
C1
|∇u(t, x)| ≤ |f (y)|dy. (1.182)
t(n+1)/2 Rn

We see from (1.182) that |∇u(t, x)| decays even faster as t → +∞ than |u(t, x)|, and, more-
over, it is controlled by the initial mass of the solution, rather than by its derivatives. This is
a very important property of the heat equation: as t → +∞ both u(t, x) and its derivatives
decay to zero at a rate controlled by the initial mass. Moreover, the higher the order of the
derivative, the faster it decays in time – solution starts to look very smooth.

1.4.7 The self-similar behavior


Let us now analyze what happens to solutions of the heat equation
∂u
− ∆u = 0, (1.183)
∂t
at large times in more detail. As we expect it to decay at the rate t−n/2 (from (1.181), for
instance), we define w(t, x) = (1 + t)n/2 u(t, x) (we take t + 1 rather than t because as t → +∞
they do not differ by much, and now w(0, x) = u(0, x) and we do not have a singularity at
t = 0). This function satisfies

∂w ∂ 2 w n
− 2
− w = 0, w(0, x) = u(0, x). (1.184)
∂t ∂x 2(t + 1)

47
p
Let us now make
p a change of variables: look for w(t, x) = v(t, x/ 4(t + 1)), then we calculate,
with y = x/ 4(t + 1):
n n
∂w ∂v X xj ∂v ∂v 1 X ∂v ∂v 1
= − 3/2
= − yj = − y · ∇y v,
∂t ∂t j=1
4[(t + 1)] ∂y j ∂t 2(t + 1) j=1
∂y j ∂t 2(t + 1)

and
∂w 1 ∂v
=p .
∂xj 4(t + 1) ∂yj
Differentiating once again gives
∂ 2w 1 ∂ 2v
= .
∂xj 2 4(t + 1) ∂yj 2
Inserting this into (1.184) gives
∂v 1 1 n
− y · ∇y v − ∆y v − v = 0, v(0, y) = u(0, 2y). (1.185)
∂t 2(t + 1) 4(t + 1) 2(t + 1)
The final change of variables is to set s = (1/4) ln(t + 1), that is, v(t, y) = ṽ((ln(t + 1))/4, y),
then
∂v 1 ∂ṽ
= .
∂t 4(t + 1) ∂s
Now, (1.185) becomes
∂ṽ
− 2y · ∇y ṽ − ∆y ṽ − 2nṽ = 0, ṽ(0, y) = u(0, 2y). (1.186)
∂s
2
Equation (1.186) has a special solution v̄(y) = e−y that can be checked directly. The main
point of our analysis is to show that any solution of (1.186) converges as t → +∞ to a multiple
of v̄(y):
ṽ(s, y) → mv̄(y) as s → +∞. (1.187)
In order to find the constant m let us integrate (1.186) in y, keeping in mind the following
identities: first, ˆ ˆ
∂ṽ(s, y) d
dy = ṽ(s, y)dy.
Rn ∂s ds Rn
Next, using Green’s formula we get
ˆ ˆ ˆ
y · ∇ṽdy = − ṽ(s, y)div(y)dy = −n ṽ(s, y)dy,
Rn Rn Rn

because divy = n. We also have ˆ


∆ṽ(s, y)dy = 0.
Rn
Putting these expressions together gives
ˆ ˆ ˆ
d
ṽ(s, y)dy + 2n ṽ(s, y)dy − 2n ṽ(s, y)dy = 0,
ds Rn Rn Rn

48
that is, ˆ
d
ṽ(s, y)dy = 0.
ds Rn
This means that ˆ ˆ
ṽ(s, y)dy = u(0, 2y)dy
Rn Rn
is a conserved quantity. The only possible value for constant m is then determined from the
condition ˆ ˆ
m v̄(y)dy = u(0, 2y)dy, (1.188)
Rn Rn
whence ˆ ˆ
1 1
m= u(0, 2y)dy = u(0, y)dy. (1.189)
π n/2 Rn (4π)n/2 Rn

How do we actually establish convergence (1.187) with the constant m determined by


(1.188)? The ratio z(s, y) = ṽ(s, y)/v̄(y) satisfies

∂z 2
+ 2y · ∇z − ∆z = 0, z(0, y) = u(0, y)ey . (1.190)
∂s
Let us assume for the sake of simplicity of notation that the dimension n = 1 and set

∂z(s, y)
φ(s, y) = .
∂y
The function φ satisfies
∂φ ∂φ d h 2
i
+ 2y + 2φ − ∆φ = 0, φ(0, y) = u(0, y)ey . (1.191)
∂s ∂y dy
Let us multiply this equation by φ and integrate in y: first we have
ˆ ˆ
∂φ(s, y) 1 d
φ(s, y) dy = φ2 (s, y)dy.
R ∂s 2 ds R

Next, we get, using integration by parts:


ˆ ˆ
∂φ(s, y)
2 yφ(s, y) dy = − φ2 (s, y)dy,
R ∂y R

and, finally:
ˆ ˆ
∂ 2 φ(s, y) ∂φ(s, y) 2

φ(s, y) dy = − ∂y dy.

R ∂y 2 R

Together, these identities and (1.191) imply that


ˆ ˆ ˆ
∂φ(s, y) 2

1 d 2 2
φ (s, y)dy + φ (s, y)dy + dy = 0.
2 ds R R R ∂y

49
As a consequence, we have
ˆ ˆ 
2
φ (s, y)dy ≤ φ (0, y)dy e−2s → 0 as s → +∞.
2
(1.192)
R R

Recalling that φ is actually the derivative of z(s, y) with respect to y, we conclude that z(s, y)
tends to a constant as s → +∞.
Let us now re-interpret the convergence result (1.187) in terms of the original variables t
and x, and the function u(t, x) that actually solves the heat equation (1.183). Recall that
! !
1 1 x 1 ln(t + 1) x
u(t, x) = w(t, x) = v t, p = v ,p .
(1 + t)n/2 (1 + t)n/2 4(t + 1) (1 + t)n/2 4 4(t + 1)

Now, (1.187) tells us that


ˆ
m −x2 /(4(t+1)) 1 −x2 /(4(t+1))
u(t, x) ≈ e = e u(0, y)dy. (1.193)
(1 + t)n/2 (4π(1 + t)n/2 Rn

Therefore, after long times solution starts to have the profile of a Gaussian weighted appro-
priately to have the correct mass. But the only information that remains from the initial
condition is its total mass – the rest is completely lost. All solutions ”look the same” – this
is self-similarity.

2 The Fourier transform


2.1 The Fourier transform on the circle
2.1.1 Pointwise convergence on the circle
Given a periodic function f defined on the interval [0, 1], we define the Fourier coefficients,
for integers k ∈ Z as ˆ 1
ˆ
f (k) = f (x)e−2πikx dx.
0

Trivially, we have |fˆ(k)| ≤ kf kL1 for all k ∈ Z, where


ˆ 1
kf kL1 = |f (x)|dx.
0

The Riemann-Lebesgue lemma shows that a continuous signal can not have too much high-
frequency content and fˆ(k) have to decay for large k (this is actually true for a much more
general class of functions). We will denote by Cper [0, 1] the class of functions that are contin-
uous on [0, 1] and f (0) = f (1).

Lemma 2.1 (The Riemann-Lebesgue lemma) If f ∈ Cper [0, 1] then fˆ(k) → 0 as k → +∞.

50
Proof. Note that
ˆ 1 ˆ 1 ˆ 1
1 −2πikx
fˆ(k) = f (x)e −2πikx
dx = − f (x)e −2πik(x+1/(2k))
dx = − f (x − )e dx,
0 0 0 2k
and thus ˆ 1  
1 1
fˆ(k) = f (x) − f (x − ) e−2πikx dx.
2 0 2k
As a consequence, since f (x) is continuous on [0, 1], we have
ˆ 1
1 1
|fˆ(k)| ≤

f (x) − f (x − ) dx,
2 0 2k

hence fˆ(k) → 0 as k → +∞. 2


A simple implication of the Riemann-Lebesgue lemma is that
ˆ 1
f (x) sin(mx)dx → 0
0

as m → ∞ for any f ∈ Cper ([0, 1]).


In order to investigate convergence of the Fourier series

X
fˆ(k)e2πikx
k=−∞

let us introduce the partial sums


N
X
SN f (x) = fˆ(k)e2πikx .
k=−N

A convenient way to represent SN f is by writing it as a convolution:


ˆ 1 N
X ˆ 1
2πik(x−t)
SN f (x) = f (t) e dt = f (x − t)DN (t)dt.
0 k=−N 0

Here the Dini kernel is


N
X e2πi(2N +1)t − 1
DN (t) = e2πikt = e−2πiN t (1 + e2πit + e4πit + . . . + e4πiN t ) = e−2πiN t
k=−N
e2πit − 1
e2πi(N +1/2)t − e−2πi(N +1/2)t sin((2N + 1)πt)
= πit −πit
= .
e −e sin(πt)

The definition of the Dini kernel as a sum of exponentials implies immediately that
ˆ 1
DN (t)dt = 1 (2.1)
0

51
for all N , while the expression in terms of sines shows that
1
|DN (t)| ≤ , δ ≤ |t| ≤ 1/2.
sin(πδ)

The ”problem” with the Dini kernel is that its L1 -norm is not uniformly bounded in N .
Indeed, consider
ˆ 1/2
LN = |DN (t)|dt. (2.2)
−1/2

Let us show that


lim LN = +∞. (2.3)
N →+∞

We compute:
ˆ 1/2 ˆ 1/2
| sin((2N + 1)πt)| | sin((2N + 1)πt)|
LN = 2 dt ≥ 2 dt
0 | sin πt| 0 |πt|
ˆ 1/2 ˆ N +1/2
1 1 | sin(πt)|
−2 | sin((2N + 1)πt)|
− dt = 2 dt + O(1)
0 sin πt πt 0 πt
N −1 ˆ
2 X 1 | sin πt|
≥ dt + O(1) ≥ C log N + O(1),
π k=0 0 t + k

which implies (2.3). This means that (2.1) holds because of cancellation of many oscillatory
terms. These oscillations may cause difficulties in the convergence of the Fourier series.

Convergence of the Fourier series for regular functions


Nevertheless, for ”reasonably regular” functions the Fourier series converges and Dini’s crite-
rion for the convergence of the Fourier series is as follows.

Theorem 2.2 (Dini’s criterion) Let f ∈ C[0, 1] satisfy the following condition at the point x:
there exists δ > 0 so that ˆ
f (x + t) − f (x)
dt < +∞, (2.4)
|t|<δ
t

then limN →∞ SN f (x) = f (x). In particular, this is true if f is differentiable at the point x.

Another criterion for the convergence of the Fourier series was given by Jordan:

Theorem 2.3 (Jordan’s criterion) If f is continuous and monotonic on some interval (x −


δ, x + δ) around the point x, except possibly at the point x itself, then
1
lim SN f (x) = [f (x+ ) + f (x− )]. (2.5)
N →+∞ 2

52
The du Bois-Raymond example
In 1873, surprisingly, du Bois-Raymond proved that the Fourier series of a continuous function
may diverge at a point.

Theorem 2.4 There exists a continuous function f so that its Fourier series diverges at
x = 0.

Kolmogorov showed in 1926 that an L1 -function may have a Fourier series that diverges at
every point. Then Carelson in 1965 proved that the Fourier series of an L2 -function converges
almost everywhere and then Hunt improved this result to an arbitrary Lp for p > 1.

2.2 Approximation by trigonometric polynomials


The Cesaro sums
In order to ”improve’ the convergence of the Fourier series consider the corresponding Cesaro
sums
N ˆ 1
1 X
σN f (x) = Sk f (x) = f (t)FN (x − t)dt,
N + 1 k=0 0

where FN is the Fejér kernel


N N
1 X 1 X
FN (t) = Dk (t) = 2 sin(π(2k + 1)t) sin(πt)
N + 1 k=0 (N + 1) sin (πt) k=0
N
1 X
= [cos(2πkt) − cos(2π(k + 1)t]
2(N + 1) sin2 (πt) k=0
1 1 sin2 (π(N + 1)t)
= [1 − cos(2π(N + 1)t] = .
2(N + 1) sin2 (πt) N +1 sin2 (πt)
The definition and explicit form of FN show that, unlike the Dini kernel, FN is non-negative
and has L1 -norm ˆ 1
|FN (t)|dt = 1. (2.6)
0
Moreover, its mass outside of any finite interval around zero vanishes as N → +∞:
ˆ
lim FN (t)dt = 0 for any δ > 0. (2.7)
N →∞ δ<|t|<1/2

This improvement is reflected in the following approxtmation theorem.


Theorem 2.5 Let f ∈ L2 [0, 1], 1 ≤ p < ∞, then
lim kσN f − f kLp = 0. (2.8)
N →∞

Moreover, if f ∈ Cper [0, 1], then


lim kσN f − f kCper [0,1] = 0. (2.9)
N →∞

53
Corollary 2.6 The Parceval identity holds for any f ∈ L2 [0, 1]:
X ˆ 1
ˆ 2
|f (k)| = |f (x)|2 dx. (2.10)
k∈Z 0

2.3 An application to the heat equation: the periodic case


Let us consider the heat equation on the real line
∂u ∂ 2u
= , (2.11)
∂t ∂x2
with the initial data u(0, x) = f (x) that is 1-periodic: f (x + 1) = f (x). We claim that the
solution u(t, x) is 1-periodic in x for any fixed t ≥ 0, that is, u(t, x + 1) = u(t, x) for all t ≥ 0
and x ∈ R. Indeed, the function v(t, x) = u(t, x + 1) − u(t, x) satisfies
∂v(t, x) ∂ 2 v(t, x) ∂u(t, x + 1) ∂ 2 u(t, x + 1) ∂u(t, x) ∂ 2 u(t, x)
− = − − + = 0,
∂t ∂x2 ∂t ∂x2 ∂t ∂x2
and v(0, x) = f (x + 1) − f (x) = 0, by the assumption that f (x) is 1-periodic. It follows that
v(t, x) = 0 for all t > 0, x ∈ R, which means that u(t, x + 1) = u(t, x). Consider the Fourier
transform of both sides of (2.11):
ˆ 1 ˆ
∂u(t, x) −2πinx d 1 dun
e dx = u(t, x)e−2πinx dx = ,
0 ∂t dt 0 dt
where ˆ 1
un = u(t, x)e−2πinx dx
0
are the Fourier coefficients of the function u. On the other hand, since u(t, x) is 1-periodic,
we have ˆ 1 2 ˆ 1
∂ u(t, x) −2πinx ∂u(t, x) −2πinx
2
e dx = 2πin e dx = (2πin)2 un .
0 ∂x 0 ∂x
The boundary terms at x = 0 and x = 1 above disappear because u(t, x) is periodic. There-
fore, we get an ODE for un :
dun
= −4π 2 n2 un , un (0) = fn .
dt
Here ˆ 1
fn = f (x)e−2πinx dx, (2.12)
0
are the Fourier coefficients of the initial data f (x). This gives
2 n2 t
un (t) = e−4π fn ,

and an explicit formula for the solution of the heat equation


2 2
X
u(t, x) = e−4π n t e2πinx fn , (2.13)
n∈Z

54
with fn given by (2.12). Expression (2.13) conveys very effectively the regularizing properties
of the heat equation: the high frequencies are attenuated very quickly due to the exponentially
decaying factor. This is another way to express the fact that that the heat equation kills
oscillations. Another simple consequence of (2.13) is that
ˆ 1
u(t, x) → f0 = f (y)dy, as t → +∞,
0

uniformly in x – that is, in the long time limit solution of the heat equation with periodic
initial data converges to a constant equal to its spatial average.

2.4 The Fourier transform in Rn


Given an L1 (Rn )-function f its Fourier transform is
ˆ
ˆ
f (ξ) = f (x)e−2πix·ξ dx.

Then, obviously: ˆ
|fˆ(ξ)| ≤ |f (x)|dx.
Rn

Moreover, the function fˆ(ξ) is continuous, and the Riemann-Lebesegue lemma is easily gen-
eralized to the Fourier transform on Rn , and

lim fˆ(ξ) = 0.
ξ→∞

For a smooth compactly supported function f ∈ Cc∞ (Rn ) we have the following remarkable
algebraic relations between taking derivatives and multiplying by polynomials:

∂f
d
(ξ) = 2πiξj fˆ(ξ), (2.14)
∂xj

and
∂ fˆ
(−2πi)(x
d j f )(ξ) = (ξ). (2.15)
∂ξj
This motivates the following definition.

Definition 2.7 The Schwartz class S(Rn ) consists of functions f such that for any pair of
multi-indices α and β
pαβ (f ) := sup |xα Dβ f (x)| < +∞.
x

As Cc∞ (Rn ) lies inside the Schwartz class, the Schwartz functions are dense in L1 (Rn ).
The main reason to introduce the Schwartz class is the following theorem.

55
Theorem 2.8 (i) The Fourier transform of a Schwartz class function f (x) is a Schwartz
class function fˆ(ξ).
(ii) For any f, g ∈ S(Rn ) we have
ˆ ˆ
f (x)ĝ(x)dx = fˆ(x)g(x)dx. (2.16)
Rn Rn

(iii) The following inversion formula holds:


ˆ
f (x) = fˆ(ξ)e2πix·ξ dξ (2.17)

for all f ∈ S(Rn ).


Proof. We begin with a lemma that is one of the cornerstones of probability theory.
Lemma 2.9 Let f (x) = e−π|x| , then fˆ(x) = f (x).
2

Proof. First, as
2 2 2
f (x) = e−π|x1 | e−π|x2 | . . . e−π|xn | ,
so that both f and fˆ factor into a product of functions of one variable, it suffices to consider
the case n = 1. The proof is a glimpse of how useful the Fourier transform is for differential
equations and vice versa: the function f (x) satisfies an ordinary differential equation

f 0 + 2πxf = 0, (2.18)

with the boundary condition f (0) = 1. However, relations (2.14) and (2.15) together with (2.18)
imply that fˆ satisfies the same differential equation (2.18), with the same boundary condi-
tion fˆ(0) = f (0) = 1. It follows that f (x) = fˆ(x) for all x ∈ R. 2
We continue with the proof of Theorem 2.8. Relations (2.14) and (2.15) imply that the
Fourier transform of a Schwartz class function is of the Schwartz class.
The Parceval identity can be verified directly as follows:
ˆ ˆ ˆ
f (x)ĝ(x)dx = f (x)g(ξ)e −2πiξ·x
dxdξ = fˆ(ξ)g(ξ)dξ.
Rn R2n Rn

Finally, we prove the inversion formula using a rescaling argument. Let f, g ∈ S(Rn ) then
for any λ > 0 we have
ˆ ˆ ˆ ˆ  
1 ξ
f (x)ĝ(λx)dx = f (x)g(ξ)e−2πiλξ·x
dx = fˆ(λξ)g(ξ)dξ = n fˆ(ξ)g dξ.
Rn R2n λ Rn λ
Multiplying by λn and changing variables on the left side we obtain
ˆ x ˆ  
ˆ ξ
f ĝ(x)dx = f (ξ)g dξ.
Rn λ Rn λ
Letting now λ → ∞ gives
ˆ ˆ
f (0) ĝ(x)dx = g(0) fˆ(ξ)dξ, (2.19)
Rn Rn

56
2
for all functions f and g in S(Rn ). Taking g(x) = e−π|x| in (2.19) and using Lemma 2.9 leads
to ˆ
f (0) = fˆ(ξ)dξ. (2.20)
Rn
The inversion formula (2.17) now follows if we apply (2.20) to a shifted function fy (x) =
f (x + y), because ˆ
ˆ
fy (ξ) = f (x + y)e−2πiξ·x dx = e2πiξ·y fˆ(ξ),
Rn
so that ˆ ˆ
f (y) = fy (0) = fˆy (ξ)dξ = e2πiξ·y fˆ(ξ)dξ,
Rn Rn
which is (2.17). 2

The Schwartz distributions


Definition 2.10 The space S 0 (Rn ) of Schwartz distirbutions is the space of linear functionals
T on S(Rn ) such that T (φk ) → 0 for all sequences φk → 0 in S(Rn ).
Theorem 2.8 allows us to extend the Fourier transform to distributions in S 0 (Rn ) by setting
T̂ (f ) = T (fˆ) for T ∈ S 0 (Rn ) and f ∈ S(Rn ). The fact that T̂ (fk ) → 0 for all sequences fk → 0
in S(Rn ) follows from the continuity of the Fourier transform as a map S(Rn ) → S(Rn ), hence
T̂ is a Schwartz distribution for all T ∈ S 0 (Rn ). For example, if δ0 is the Schwartz distribution
such that δ0 (f ) = f (0), f ∈ S(Rn ), then
ˆ
δ̂0 (f ) = fˆ(0) = f (x)dx,
Rn

so that δ̂0 (ξ) ≡ 1 for all ξ ∈ Rn .


Similarly, since differentiation is a continuous map S(Rn ) → S(Rn ), we may define the
distributional derivative as  
∂T ∂f
(f ) = −T ,
∂xj ∂xj
for all T ∈ S 0 (Rn ) and f ∈ S(Rn ) – the minus sign here comes from the integration by parts
formula, for if T happens to have the form
ˆ
Tg (f ) = f (x)g(x)dx,
Rn

with a given g ∈ S(Rn ), then


  ˆ ˆ
∂f ∂f ∂g
T = (x)g(x)dx = − f (x) (x)dx.
∂xj Rn ∂xj Rn ∂xj
For instance, in one dimension δ0 (x) = 1/2(sgn(x))0 in the distributional sense because for
any function f ∈ S(R) we have
ˆ∞ ˆ0 ˆ∞
h(sgn)0 , f i = −hsgn, f 0 i = − sgn(x)f 0 (x)dx = f 0 (x)dx − f 0 (x)dx = 2f (0) = 2hδ0 , f i.
−∞ −∞ 0

57
2.5 An application to the heat equation in the whole space
Consider the Cauchy problem for the heat equation
∂u
− ∆u = 0, t > 0, x ∈ Rn , (2.21)
∂t
u(0, x) = f (x), x ∈ Rn .

Taking the Fourier transform of the heat equation in the whole space gives, very similarly to
the periodic case considered in Section 2.3:
∂ û(t, ξ)
+ 4π 2 |ξ|2 û(t, ξ) = 0, (2.22)
∂t
with the initial data
û(0, ξ) = fˆ(x),
where ˆ ˆ
û(t, ξ) = u(t, x)e −2πiξ·x
dx, fˆ(ξ) = f (x)e−2πiξ·x dx.
Rn Rn
The ODE (2.22) can be solved explicitly:
2 2
û(t, ξ) = fˆ(ξ)e−4π |ξ| t . (2.23)

As in the periodic case, we see that high frequency components of the solutions of the heat
equation decay rapidly in time, leading to the regularizing effects we have discussed previously.
The inverse Fourier transform then gives
ˆ ˆ
2 2
u(t, x) = û(t, ξ)e 2πiξ·x
dξ = fˆ(ξ)e−4π |ξ| t e2πiξ·x dξ. (2.24)
Rn Rn

Let us make a couple of “regularizing observations” from (2.24). For example, we have
ˆ ˆ  
ˆ −4π 2 |ξ|2 t −4π 2 |ξ|2 t ˆ
|u(t, x)| ≤ |f (ξ)|e dξ ≤ e dξ maxn (|f (ξ)|) .
Rn Rn ξ∈R

Recall that ˆ
|fˆ(ξ)| ≤ |f (x)|dx,
Rn
for all ξ ∈ Rn . It follows that
ˆ ˆ ˆ
−4π 2 |ξ|2 t 1
|u(t, x)| ≤ |f (x)|dx e dξ = |f (x)|dx,
Rn Rn (4πt)n/2 Rn

an estimate we have seen before but now obtained without the use of the Green’s function.
We may also recover self-similarity of the √ solution of the heat equation in the long time
limit: consider t √1 and let us look at x ∼ t: mathematically, this means that we fix y ∈ R
and consider u(t, y t) with the idea of then letting t → +∞ with y fixed. We have
√ ˆ √
2 2
u(t, y t) = fˆ(ξ)e−4π |ξ| t e2π tiξ·y dξ.
Rn

58

Let us make a change of variable k = 2 πtξ:
√ ˆ   ˆ(0) ˆ
1 k 2 i√πk·y f 2 √
u(t, y t) = √ n fˆ √ e−π|k|
e dk ≈ √ n e−π|k| e2πik·(y/2 π) dk.
(2 πt) Rn 2 πt (2 πt) Rn
Lemma 2.9 implies that
ˆ
fˆ(0) −π|y|2 /(4π)
2
√ e−|y| /4
u(t, y t) ≈ e = f (z)dz,
(4πt)n/2 (4πt)n/2 Rn
which is the self-similarity formula (1.193) we have obtained before by a rather more compli-
cated approach.
Finally, in order to obtain the formula for the solution of the heat equation in terms of
the heat kernel we start with (2.24) and write
ˆ ˆ
ˆ −4π 2 |ξ|2 t −2πiξ·x 2 2
u(t, x) = f (ξ)e e dξ = f (y)e−2πiξ·y e−4π |ξ| t e2πiξ·x dξdy, (2.25)
Rn R2n

and use Lemma 2.9 in the last step below:


ˆ ˆ 
−2πiξ·(y−x) −4π 2 |ξ|2 t
u(t, x) = f (y) e e dξ dy (2.26)
Rn Rn
ˆ ˆ √
 ˆ
1 −2πik·(y−x)/ 4πt −π|k|2 1 −(x−y)2 /(4t)
= f (y) e e dk dy = f (y)e dy,
(4πt)n/2 Rn Rn (4πt)n/2 Rn
which is, of course, the expression for u(t, x) in terms of the heat kernel.

3 The wave equation


3.1 The wave equation in one dimension
The wave equation describes propagation of a disturbance φ(t, x) that moves with a local
speed c(x). It has the form
1 ∂ 2φ
− ∆φ = 0. (3.1)
c2 (x) ∂t2
Here n
X ∂2
∆=
j=1
∂xj 2
is the standard Laplacian in n dimensions, t is the time variable, and x is the spatial coor-
dinate. Unless specified otherwise we will always assume t ≥ 0 and x ∈ Rn . When x ∈ R
and the sound speed c is constant then a general solution of (3.1) is given by the famous
d’Alembert formula
φ(t, x) = f (x − ct) + g(x + ct), (3.2)
with arbitrary functions f and g. It is straightforward to verify that any function of the form
(3.2) solves the one-dimensional wave equation
1 ∂ 2φ ∂ 2φ
− 2 = 0. (3.3)
c2 ∂t2 ∂x

59
The functions f and g represent the right and left going waves, respectively.
Let us now consider the Cauchy problem
1 ∂ 2φ ∂ 2φ
− 2 = 0, (3.4)
c2 ∂t2 ∂x
φ(0, x) = p(x),
φt (0, x) = q(x).
Here we use the notation φt = ∂φ/∂t. Note that, since the wave equation is of the second
order in time, we need to prescribe both the initial value of φ and the value of its derivative
at the time t = 0. We would like to express the solution of (3.4) in the form (3.3), that is, as
a sum of left and right going waves. for this decomposition to hold we need
p(x) = f (x) + g(x),
q(x) = −cf 0 (x) + cg 0 (x).
it follows that
1 1
g 0 (x) = (cp0 (x) + q(x)), f 0 (x) = (cp0 (x) − q(x)),
2c 2c
and thus we may take
ˆ x ˆ x
1 1 1 1
g(x) = p(x) + q(y)dy, f (x) = p(x) − q(y)dy. (3.5)
2 2c 0 2 2c 0

Therefore, solution of the Cauchy problem (3.4) is given by


ˆ ˆ
1 1 x−ct 1 1 x+ct
φ(t, x) = p(x − ct) − q(y)dy + p(x + ct) + q(y)dy, (3.6)
2 2c 0 2 2c 0
or, equivalently,
ˆ x+ct
1 1 1
φ(t, x) = p(x − ct) + p(x + ct) + q(y)dy. (3.7)
2 2 2c x−ct

Expression (3.7) is known as d’Alembert’s formula. One corollary is that, unlike for the heat
equation, solution at times t > 0 is no more regular than at t = 0: if, say, p(x) has only five
derivatives, the function u(t, x) will not be any more regular than that. If p(x) is oscillatory,
then u(t, x) will also be oscillatory and so on.
The idea that solution may be decomposed into such primitive building blocks goes much
further than this trivial example. Another useful observation that is obvious in the one-
dimensional case is that solutions propagate with a finite speed: if φ(0, x) = 0 in an interval
(a − r, a + r) of length 2r centered at x = a then φ(t, a) = 0 for all t ≤ T0 = r/c.

3.2 D’Alembert’s formula by the Fourier transform


Let us now use the Fourier transform to solve the Cauchy problem for the wave equation.
Consider the Cauchy problem
1 ∂ 2φ
− ∆φ = 0, t ≥ 0, x ∈ R, (3.8)
c2 ∂t2
φ(0, x) = p(x),
φt (0, x) = q(x).

60
Let ˆ
φ̂(t, ξ) = φ(t, x)e−2πiξ·x dx
R

be the Fourier transform of φ(t, x). It satisfies the ODE

∂ 2 φ̂
+ 4π 2 c2 |ξ|2 φ̂ = 0, (3.9)
∂t2
with the initial data
φ̂(0, ξ) = p̂(ξ), φ̂t (0, ξ) = q̂(ξ). (3.10)
It follows from (3.9) that

φ̂(t, ξ) = A(ξ)e−2πiξct + B(ξ)e2πiξct .

The coefficients A(ξ) and B(ξ) are determined by the initial condition (3.10):

A(ξ) + B(ξ) = p̂(ξ), − 2πicξA(ξ) + 2πicξB(ξ) = q̂(ξ),

hence
1 q̂(ξ) 1 q̂(ξ)
A(ξ) = p̂(ξ) − , B(ξ) = p̂(ξ) + ,
2 4πicξ 2 4πicξ
leading to    
1 q̂(ξ) −2πicξt 1 q̂(ξ)
φ̂(t, ξ) = p̂(ξ) − e + p̂(ξ) + e2πicξt . (3.11)
2 4πicξ 2 4πicξ
Note that the singularities at ξ = 0 in the two terms above cancel each other out, and taking
the inverse Fourier transform gives
ˆ h   
1 q̂(ξ) −2πiξct+2πiξx 1 q̂(ξ) 2πiξct+2πiξx
i
φ(t, x) = p̂(ξ) − e + p̂(ξ) + e dξ
2 4πicξ 2 4πicξ
ˆ
R
1 1 x+ct
= (p(x − ct) + p(x + ct)) + q(y)dy, (3.12)
2 2c x−ct

which is nothing but d’Alembert’s formula (3.7). The reader should fill out the details in the
last step above!

3.3 Conservation of energy


The energy of solutions of (3.1) is defined as
ˆ  
1 1 2 2
E(t) = |φt (t, x)| + |∇φ(t, x)| dx. (3.13)
2 Rn c2 (x)

It is straightforward to verify that energy of (at least sufficiently smooth) solutions of (3.1)
is preserved: E(t) = E(0). Indeed, we have
ˆ  
dE(t) 1
= 2
φt (t, x)φtt (t, x) + ∇φ(t, x) · ∇φt (t, x) dx.
dt Rn c (x)

61
Using the wave equation (3.13) we can re-write this as
ˆ
dE(t)
= [φt (t, x)∆φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx.
dt Rn

Integrating by parts in the first term (the boundary terms at infinity vanish) gives
ˆ
dE(t)
= [−∇φt (t, x) · ∇φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx = 0,
dt Rn

hence E(t) ≡ E(0), as claimed.


Uniqueness for the Cauchy problem. An immediate corollary of energy conservation
is uniqueness of smooth solutions to the Cauchy problem for wave equation. More precisely,
consider the Cauchy problem (3.4) in an arbitrary dimension (x ∈ Rn )
1 ∂ 2φ
− ∆φ = 0, (3.14)
c2 (x) ∂t2
φ(0, x) = p(x),
ut (0, x) = q(x).

Let us assume that that (3.14) has two solutions φ1 (t, x) and φ2 (t, x) and set v(t, x) = φ1 (t, x)−
φ2 (t, x). The function v(t, x) satisfies
1 ∂ 2v
− ∆v = 0, (3.15)
c2 (x) ∂t2
v(0, x) = 0,
vt (0, x) = 0.

As we have discussed, the energy


ˆ  
1 1 2 2
E(t) = |vt (t, x)| + |∇v(t, x)| dx
2 Rn c2 (x)
is preserved by evolution E(t) = E(0). But at t = 0 we have v(0, x) = vt (0, x) = 0, hence
E(0) = 0. it follows that E(t) = 0 for all t > 0, which, in turn, means that v(t, x) ≡ const.
Since v(0, x) = 0, we conclude that v(t, x) ≡ 0, and thus φ1 (t, x) = φ2 (t, x) for all t ≥ 0,
x ∈ Rn .

3.4 Finite speed of propagation


Another beautiful corollary of the energy conservation is the fact that solutions of the wave
equation propagate with a finite speed. Consider the wave equation in Rn with a constant
speed c:
1 ∂ 2u
− ∆u = 0, (3.16)
c2 ∂t2
u(0, x) = p(x), x ∈ Rn ,
ut (0, x) = q(x), x ∈ Rn .

62
Fix a point x0 ∈ Rn and a time t0 > 0. We will show that u(t0 , x0 ) depends only on the values
of the functions p(x) and q(x) in the ball B(x0 , ct0 ) that is centered at x0 and has the radius
r0 = ct0 . More precisely, if u(t, x) solves the Cauchy problem (3.16), and ψ(x) solves (3.16)
with the initial data
ψ(0, x) = p̃(x), ψt (0, x) = q̃(x),
and p̃(x) = p(x), q̃(x) = q(x) for all x such that |x − x0 | ≤ ct0 , then u(t0 , x0 ) = ψ(t0 , x0 ). In
order to show this, consider φ(t, x) = u(t, x) − ψ(t, x). This function satisfies

1 ∂ 2φ
− ∆φ = 0, (3.17)
c2 ∂t2
φ(0, x) = p(x) − p̃(x), x ∈ Rn ,
φt (0, x) = q(x) − q̃(x), x ∈ Rn .

and
p(x) − p̃(x) = q(x) − q̃(x) = 0 for all x such that |x − x0 | ≤ ct0 . (3.18)
Let us define ˆ  
1 1 2 2
e(t) = φ (t, x) + |∇φ(t, x)| dx,
2 |x−x0 |≤c(t0 −t) c2 t
that is, the portion of the total energy contained in the ball |x − x0 | ≤ c(t0 − t) at a time
0 ≤ t ≤ t0 . Note that we have e(0) = 0 because of the initial conditions. Let us re-write e(t)
in the polar coordinates centered around the point x0 :
ˆ ˆ
1 c(t0 −t)
 
1 2
e(t) = 2 t
2
φ (t, x0 + rω) + |∇φ(t, x0 + rω)| rn−1 dS(ω)dr,
2 0 S n−1 c

where S n−1 is the (n−1)-dimensional sphere of all possible directions. Let us now differentiate
e(t):

ˆ0 −t)
c(t ˆ  
de(t) 1
= φt (t, x0 + rω)φtt (t, x0 + rω) + ∇φ(t, x0 + rω) · ∇φt (t, x0 + rω)
dt c2
0 S n−1
n−1
×r dS(ω)dr (3.19)
ˆ  
c 1 2
− φ (t, x0 + c(t0 − t)ω) + |∇φ(t, x0 + c(t0 − t)ω)|2 cn−1 (t0 − t)n−1 dS(ω).
2 S n−1 c2 t

Going back to the original variables this is


ˆ  
de(t) 1
= φt (t, x)φtt (t, x) + ∇φ(t, x) · ∇φt (t, x) dx (3.20)
dt c2
|x−x0 |≤c(t0 −t)
ˆ  
c 1 2 2
− φ (t, y) + |∇φ(t, y)| dS(y).
2 |y−x0 |=c(t0 −t) c2 t

63
Using the fact that φ solves the wave equation in the first line above we get
ˆ  
1
φt (t, x)φtt (t, x) + ∇φ(t, x) · ∇φt (t, x) dx (3.21)
c2
|x−x0 |≤c(t0 −t)
ˆ
= [φt (t, x)∆φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx.
|x−x0 |≤c(t0 −t)

Using Green’s formula in the first term on the second line gives
ˆ
[φt (t, x)∆φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx
|x−x0 |≤c(t0 −t)
ˆ
= [−∇φt (t, x) · ∇φ(t, x) + ∇φ(t, x) · ∇φt (t, x)] dx
|x−x0 |≤c(t0 −t)
ˆ ˆ
∂φ(t, y) ∂φ(t, y)
+ φt (t, y) dS(y) = φt (t, y) dS(y).
∂ν ∂ν
|y−x0 |=c(t0 −t) |y−x0 |=c(t0 −t)

Going back to (3.20), we obtain


ˆ  
de(t) ∂φ(t, y) 1 2 c 2
= φt (t, y) − φt (t, y) − |∇φ(t, y)| dS(y). (3.22)
dt |y−x0 |=c(t0 −t) ∂ν 2c 2

As |∂φ/∂ν| ≤ |∇φ|, we get


ˆ  
de(t) 1 2 c 2
≤ |φt (t, y)||∇φ(t, y)| − φt (t, y) − |∇φ(t, y)| dS(y) (3.23)
dt |y−x0 |=c(t0 −t) 2c 2
ˆ 2


1 1
=− √ |φt (t, y)| − c|∇φ(t, y)| dS(y) ≤ 0.
2 |y−x0 |=c(t0 −t) c

We conclude that e(t) ≤ e(0) for all 0 ≤ t ≤ t0 . Recall that e(0) = 0 (from the initial data)
and e(t) ≥ 0 by its very definition. Therefore, we should have e(t) = 0 for all 0 ≤ t ≤ t0 ,
which means that
φ(t, x) = 0 for all |x − x0 | ≤ ct0 ,
as we have claimed.

3.5 Plane wave solutions and the Huygens principle


The plane waves
A plane wave u(t, x) is a function with level sets that are parallel planes, that is, u has the
form u(t, x) = f (k · x − ct), where k ∈ Rn is a fixed wave vector. Then, for a given a ∈ R,
the level sets {u(t, x) = a} are hyperplanes {x ∈ Rn : k · x − ct = a0 }, with a0 ∈ R such
that f (a0 ) = a. Any function of that form solves the wave equation

utt − c2 ∆u = 0 (3.24)

64
provided that |k|2 = 1:
n
X
2 00
2
utt − c ∆u = c f − c 2
kj2 f 00 = 0.
j=1

One can look at the plane waves as ”building blocks” and ask if a general solution of the
wave equation may be decomposed into plane waves. The simplest example is given by the
d’Alembert formula
u(t, x) = h(x − ct) + p(x + ct) (3.25)
for a general solution for the Cauchy problem for the one-dimensional wave equation that we
have already discussed in Section 3.1. The d’Alembert formula (3.25) decomposes an arbitrary
solution of the wave equation into a sum of a left and right going waves. In dimensions higher
than one we have to decompose over waves going in all directions. Hence, we seek a general
solution of the wave equation
utt − c²Δu = 0   (3.26)
in the form of a ”sum”– actually, an integral – over plane waves
u(t, x) = ∫_{S^{n−1}} h(k · x − ct, k) dk,   (3.27)

where S^{n−1} is the (n − 1)-dimensional sphere of directions. As each function h(k · x − ct, k)
is a solution of the wave equation, so is the integral over k, thus we know that any u(t, x)
defined by (3.27) is a solution of the wave equation, no matter what function h(s, k) we choose.
Therefore, in order to ensure that u(t, x) given by (3.27) solves the Cauchy problem for the
wave equation:
(1/c²) utt − Δu = 0,   t > 0,
u(0, x) = f(x),
ut(0, x) = g(x),
we only have to choose the function h(s, k) so as to match the initial conditions:
u(0, x) = f(x) = ∫_{S^{n−1}} h(k · x, k) dk,   (3.28)
ut(0, x) = g(x) = −c ∫_{S^{n−1}} hs(k · x, k) dk,   (3.29)
with some prescribed functions f(x) and g(x). If we decompose h(s, k) into its even and odd
parts, that is,
h(s, k) = l(s, k) + q(s, k),   l(s, k) = [h(s, k) + h(−s, −k)]/2,   q(s, k) = [h(s, k) − h(−s, −k)]/2,   (3.30)
we see that (3.28)-(3.29) become
f(x) = ∫_{S^{n−1}} l(x · k, k) dk,   g(x) = −c ∫_{S^{n−1}} qs(x · k, k) dk.   (3.31)

We have used here the fact that the functions q(k · x, k) and ls (k · x, k) are odd in k for
any x ∈ Rn fixed, and the integral of an odd function over a sphere vanishes.

The Radon transform
Hence, we have reduced the problem of finding a plane wave decomposition for a general
solution to the wave equation to the following problem: given a function f (x), x ∈ Rn find
an even function l(s, k), with s ∈ R, k ∈ S^{n−1}, such that
f(x) = ∫_{S^{n−1}} l(x · k, k) dk,   l(s, k) = l(−s, −k).   (3.32)

The function l = Rf is called the Radon transform of the function f (x).


Let us now construct the Radon transform of a function f (x). We start with the Fourier
transform of f and re-write it in the polar coordinates:
f(x) = ∫_{Rⁿ} e^{2πiξ·x} f̂(ξ) dξ = ∫_{S^{n−1}} ∫_0^∞ e^{2πiρk·x} f̂(ρk) ρ^{n−1} dρ dk.

Now, if the space dimension n is odd (so that ρ^{n−1} = (−ρ)^{n−1}), we may re-write the above
integral as
f(x) = (1/2) ∫_{S^{n−1}} ∫_0^∞ e^{2πiρk·x} f̂(ρk) ρ^{n−1} dρ dk + (1/2) ∫_{S^{n−1}} ∫_{−∞}^0 e^{−2πiρk·x} f̂(−ρk) ρ^{n−1} dρ dk
   = (1/2) ∫_{S^{n−1}} ∫_{−∞}^∞ e^{2πiρk·x} f̂(ρk) ρ^{n−1} dρ dk.   (3.33)
Thus, we have constructed the Radon transform of the function f explicitly: if we set
l(s, k) = (1/2) ∫_{−∞}^∞ e^{2πiρs} f̂(ρk) ρ^{n−1} dρ,   (3.34)
then (3.33) says that
f(x) = ∫_{S^{n−1}} l(k · x, k) dk.

In addition, the function l(s, k) is even:


l(−s, −k) = (1/2) ∫_{−∞}^∞ e^{−2πiρs} f̂(−ρk) ρ^{n−1} dρ = (1/2) ∫_{−∞}^∞ e^{2πiρs} f̂(ρk) ρ^{n−1} dρ = l(s, k).   (3.35)
We used the change of variables ρ → −ρ and the fact that n is odd in the last equality above.
Thus, the Radon transform of f is given by (3.34).
Expression (3.34) is, however, not as useful as another representation for the Radon
transform that we obtain next. We start with (3.32) and take another rapidly decaying
function g(x) with the Radon transform m = Rg. Multiply (3.32) by ḡ(x) (the complex
conjugate of g(x)) and integrate, interchanging the order of integration and integrating over
planes x · k = s:
∫_{Rⁿ} f(x) ḡ(x) dx = ∫_{S^{n−1}×Rⁿ} l(x · k, k) ḡ(x) dk dx = ∫_{S^{n−1}} ∫_R l(s, k) ( ∫_{x·k=s} ḡ(x) dΣ ) ds dk
   = ∫_{S^{n−1}×R} l(s, k) M̄(s, k) ds dk.   (3.36)

Here, we have defined
M(s, k) = ∫_{x·k=s} g(x) dΣ
as the integral of the function g over the hyperplane x · k = s. On the other hand we may
also write, using the Plancherel identity
∫_{Rⁿ} f(x) ḡ(x) dx = ∫_{Rⁿ} f̂(ξ) \bar{ĝ}(ξ) dξ = (1/2) ∫_{S^{n−1}} ∫_R f̂(ρk) \bar{ĝ}(ρk) ρ^{n−1} dρ dk.   (3.37)

In the last step above we passed to the spherical coordinates: ξ = ρk with ρ ≥ 0 and k ∈ S^{n−1},
and then used the same trick as in (3.33) that relied on the fact that the spatial dimension n
is odd. Note that (3.34) means that the Fourier transform of l(s, k) in the s-variable is
l̃(ρ, k) = ∫_{−∞}^∞ e^{−2πiρs} l(s, k) ds = (1/2) f̂(ρk) ρ^{n−1}.   (3.38)
Similarly, if we denote m = Rg the Radon transform of g, we have
m̃(ρ, k) = ∫_{−∞}^∞ e^{−2πiρs} m(s, k) ds = (1/2) ĝ(ρk) ρ^{n−1}.   (3.39)
Using (3.38) and (3.39) in (3.37) gives
∫_{Rⁿ} f(x) ḡ(x) dx = (1/2) ∫_{S^{n−1}} ∫_R f̂(ρk) \bar{ĝ}(ρk) ρ^{n−1} dρ dk = 2 ∫_{S^{n−1}} ∫_R l̃(ρ, k) \bar{m̃}(ρ, k) ρ^{−(n−1)} dρ dk.   (3.40)
Using again the Plancherel identity (now in the ρ-variable) gives
∫_{Rⁿ} f(x) ḡ(x) dx = ∫_{S^{n−1}} ∫_R l(s, k) h̄(s, k) ds dk,   (3.41)
with the function h(s, k) having the following Fourier transform in s:
h̃(ρ, k) = 2 m̃(ρ, k) / ρ^{n−1}.
Comparing (3.36) and (3.41) we see that
M̃(ρ, k) = 2 m̃(ρ, k) / ρ^{n−1}.
In other words, the Fourier transform in the s-variable of the Radon transform m = Rg of a
function g can be written in terms of g as
m̃(ρ, k) = (1/2) ρ^{n−1} M̃(ρ, k),   M(s, k) = ∫_{x·k=s} g(x) dΣ.   (3.42)
Taking the inverse Fourier transform we arrive at an alternative, and much more geometric,
expression for the Radon transform of a function g(x):
(Rg)(s, k) = [1/(2(2πi)^{n−1})] ∂^{n−1}M(s, k)/∂s^{n−1},   M(s, k) = ∫_{x·k=s} g(x) dΣ.   (3.43)

Note that, as n is odd, in−1 = ±1, depending on whether n = 2k + 1 with an even or odd k.
As a consequence of (3.43) we obtain the following theorem.
Theorem 3.1 Let the dimension n of the space be odd. If the function g(x) vanishes outside
a ball {|x| ≤ r} then its Radon transform m(s, k) = (Rg)(s, k) vanishes for |s| > r.

The Huygens principle
In order to formulate the Huygens principle we need the notion of the domain of influence. We
say that a space-time point (t, z) ∈ Rn+1 is in the domain of influence of a spatial point x ∈ Rn
if given any spatial neighborhood Ux of the point x and any space-time neighborhood Vt,z
of the point (t, z) there exist two solutions of the wave equation u(t, x) and w(t, x) such
that u(0, x) = w(0, x) and ut(0, x) = wt(0, x) for all x ∉ Ux, but there exists (s′, z′) ∈ Vt,z such
that u(s′, z′) ≠ w(s′, z′). The Huygens principle states the following.
Theorem 3.2 If n is odd, then the domain of influence of the origin in Rn is the double
cone {(t, x) ∈ Rn+1 : |x| = ct}.
Proof. We have to show that the points (t, x) with |x| ≠ ct lie outside the domain of influence
of the origin. That is, we have to find a ball Uε = {y ∈ Rn : |y| < ε} and a space-time
neighborhood Vt,x of (t, x) so that if the Cauchy data for a pair of solutions coincide outside Uε
at t = 0, then the solutions coincide also in Vt,x . This is equivalent to saying that if the Cauchy
data vanishes outside Uε then the corresponding solution of the wave equation vanishes in Vt,x .
Note first that if |x| > ct, then we can choose ε = (|x| − ct)/2. Indeed, if initially both u(0, x)
and ut (0, x) vanish outside of the ball {|x| < ε} then they vanish in the ball of radius ct
around the point x, hence the finite speed of propagation that we have already proved implies
that u(t, x) = 0. Thus, we only need to consider the case |x| < ct.
To this end, we need a slight diversion. Recall that the Fourier transform of l(s, k) in s is

l̃(ρ, k) = (1/2) f̂(ρk) ρ^{n−1}.   (3.44)
It follows from (3.44) that
∫_R s^j l(s, k) ds = [1/(−2πi)^j] (∂^j l̃/∂ρ^j)(ρ = 0, k) = 0 for j = 0, 1, . . . , n − 2.   (3.45)
Hence, we may define anti-derivatives of l in the s-variable up to the order n − 1 that decay at
infinity – they are compactly supported in s if l(s, k) is. This is based on the following fact:
if a function w(s) is compactly supported in s ∈ R, so that there exists L such that w(s) = 0
for |s| > L, then the anti-derivative
W1(s) = ∫_{−∞}^s w(s′) ds′ = ∫_{−L}^s w(s′) ds′

is also compactly supported if
∫_{−L}^L w(s) ds = 0.   (3.46)

In addition, if (3.46) holds, we have W1 (s) = 0 for |s| ≥ L, hence


ˆ ∞ ˆ Lˆ s ˆ Lˆ s ˆ L ˆ L 
0 0 0 0 0
W1 (s)ds = w(s )ds ds = w(s )ds ds = w(s ) ds ds0
−∞ −L −∞ −L −L −L s0
ˆ L ˆ L ˆ L
= (L − s0 )w(s0 )ds0 = L w(s0 )ds0 − s0 w(s0 )ds0 .
−L −L −L

Thus, the anti-derivative W1 (s) has a zero integral itself if, in addition to (3.46) we have
∫_{−L}^L s w(s) ds = 0.   (3.47)

If (3.47) holds, then we can define the second anti-derivative of w


W2(s) = ∫_{−∞}^s W1(s′) ds′.

Continuing by induction, we see that the existence of (n − 1) anti-derivatives of l(s, k) that


decay at infinity is, indeed, guaranteed by the moment condition (3.45).
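A quick numerical illustration of this diversion (a sketch; the test function below is an ad-hoc choice): if w is the second derivative of a smooth compactly supported bump, its zeroth and first moments vanish automatically, and both successive anti-derivatives return to zero at the right end of the support.

# A sketch: w with two vanishing moments has compactly supported anti-derivatives.
import numpy as np

s = np.linspace(-3.0, 3.0, 6001)
ds = s[1] - s[0]
bump = np.where(np.abs(s) < 1.0,
                np.exp(-1.0 / np.where(np.abs(s) < 1.0, 1.0 - s**2, 1.0)), 0.0)
w = np.gradient(np.gradient(bump, ds), ds)            # supported in |s| <= 1

print("moments:", w.sum() * ds, (s * w).sum() * ds)   # both ~ 0
W1 = np.cumsum(w) * ds                                # first anti-derivative
W2 = np.cumsum(W1) * ds                               # second anti-derivative
print("right-end values:", W1[-1], W2[-1])            # both ~ 0: compact support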
Going back to the proof of the Huygens principle, let u(0, x) = f (x), ut (0, x) = g(x) be
initial data that vanish outside of Uε . If we now choose h(s, k) to be
h(s, k) = Rf − (1/c) (∂/∂s)^{−1} [Rg]
then u(t, x) is given by the plane wave decomposition (3.27), as can be seen from (3.30)
and (3.31). The anti-derivative ∂s−1 is well defined – this is what our recent diversion has
shown. Moreover, as both f (x) and g(x) vanish outside the ball Uε , the function h(s, k)
vanishes for |s| > ε as follows from Theorem 3.1 and the same observation about the support
of the anti-derivative. Therefore, expression (3.27) implies that u(t, x) = 0 if |x·k −ct| > ε for
all k ∈ S^{n−1}. This condition is satisfied if we choose ε < |ct − |x||. Therefore, the points (t, x)
with |x| < ct also lie outside the domain of influence of the origin.
We ask the reader, as an exercise, to show that the points with |x| = ct are indeed in the
domain of influence of the origin. 2
Theorem 3.2 is false in even dimensions – while it is true that the points in {|x| > ct} lie
outside the domain of influence of the origin (this follows from the finite speed of propagation
results), the points inside {|x| < ct} are influenced by the origin.

3.6 Very basic geometric optics


One of the main properties of the wave equations is propagation of oscillations that does not
occur in the heat equation: compare the form of solutions of the wave equation

wtt − ∆w = 0

and of the heat equation


ht − ∆h = 0
with oscillatory initial data h(0, x) = w(0, x) = eik·x (for the wave equation we in addition
prescribe wt (0, x) = 0). Solution of the wave equation is
w(t, x) = (1/2) e^{i(k·x−|k|t)} + (1/2) e^{i(k·x+|k|t)},
which propagates without any decay. On the other hand, solution of the heat equation is
h(t, x) = e^{ik·x−|k|²t},

which decays rapidly for large wave numbers k – oscillations dissipate.
The geometric optics studies oscillatory solutions and purports to estimate the error in
terms of a small parameter ε ≪ 1, which is the inverse of the non-dimensional wave length.
We consider oscillatory solutions of the wave equation with constant coefficients:

Luε = uεtt − c²Δuε = 0   (3.48)

with oscillatory initial data uε (0, x) = eiφ(x)/ε aε (x), uεt (0, x) = 0. When the phase φ(x) = k · x
is a linear function we are back to the plane waves. Now, however, we are interested in more
general phases and, rather more importantly, in the regime ε ≪ 1. We look for solutions in
the same form as the initial data:

uε (t, x) = eiφ(t,x)/ε aε (t, x), aε (t, x) = a0 (t, x) + εa1 (t, x) + . . . . (3.49)

We insert this ansatz into (3.48) and compute


 
L( e^{iφ(t,x)/ε} aε(t, x) ) = e^{iφ(t,x)/ε} [ −(1/ε²) L̃(φt, ∇φ) aε + (2i/ε) Vφ aε + (i/ε)(Lφ) aε + Laε ],   (3.50)

where we have defined the symbol L̃(ω, k) = ω 2 − c2 |k|2 and the linear operator
Vφ aε = (∂φ/∂t)(∂aε/∂t) − c² ∇φ · ∇aε.
∂t ∂t
In order for u of the form (3.49) to be an approximate solution of (3.48) the leading order in
ε in (3.50) (that is, O(ε−2 )) should vanish. This leads to the eikonal equation
L̃(φt, ∇φ) = (∂φ/∂t)² − c²|∇φ|² = 0.   (3.51)
Solutions of the eikonal equation may be constructed in the same way as for the progressing
waves, when initially ∇φ0 ≠ 0 everywhere. Then we pick one branch of φt = ±c|∇φ| and
construct solutions by means of the Hamilton-Jacobi theory – we will return to this issue a
little later.
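As a quick illustration (a sketch; the spherical phase below is a standard example, not taken from these notes), one can check symbolically that φ(t, x) = |x| − ct solves the eikonal equation away from the origin:

# A sketch: symbolic check that phi = |x| - c*t solves phi_t^2 - c^2 |grad phi|^2 = 0.
import sympy as sp

t, x1, x2, x3 = sp.symbols('t x1 x2 x3', real=True)
c = sp.symbols('c', positive=True)
phi = sp.sqrt(x1**2 + x2**2 + x3**2) - c * t

residual = sp.diff(phi, t)**2 - c**2 * (sp.diff(phi, x1)**2
                                        + sp.diff(phi, x2)**2 + sp.diff(phi, x3)**2)
print(sp.simplify(residual))   # 0: phi_t = -c and |grad phi| = 1 away from x = 0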
Assuming that solutions of the eikonal equation exist, at least locally in time, we may
proceed to find an equation for the amplitudes aj . The term of order ε−1 in (3.50) has the
form
2Vφ a0 + (Lφ)a0 = 0. (3.52)
This is a first order linear (once the phase φ is determined from the eikonal equation) equation
for a0, also known as the transport equation. The characteristics of the linear
operator Vφ are called rays. Then (3.52) is an ODE along the rays. In order for the transport
equation to have a solution for an initial value problem we require that φt ≠ 0 initially, which
means that |∇φ| ≠ 0 as well. Then the eikonal equation becomes φt = ±c|∇φ|, depending on
the sign of φt at t = 0, and the equation for a0 admits a smooth solution.
Equations for the higher order coefficients are obtained from the higher order terms in ε
in (3.50): this leads to
2Vφ an + (Lφ)an + (1/i) Lan−1 = 0.   (3.53)

This is a system of first order linear (again, once the phase φ is determined from the eikonal
equation) equations for an – each one has the same family of rays as its characteristics. They
can be solved to provide an expansion for aε up to any order.
The accuracy of the approximation of uε (t, x) by a partial sum
uεk(t, x) = e^{iφ(t,x)/ε} Σ_{l=0}^{k} ε^l al(t, x)

depends on the chosen norm. This is related to the fact that uε is oscillatory, so that its
derivatives are large.

Theorem 3.3 Let the initial data φ(0, x) and a(0, x) be in L2 (Rn ), and let T̄ be such that a
smooth solution of the eikonal equation (3.51) exists for all 0 ≤ t < T̄ and let 0 ≤ T < T̄ .
There exists a constant CT so that we have, uniformly in 0 ≤ t ≤ T , the following estimate:

‖uε(t, ·) − uεk(t, ·)‖_{L²(Rⁿ)} ≤ CT ε^k.   (3.54)

The constant CT depends on the time T and the Cauchy data for uε at t = 0.

4 Method of characteristics
The linear equations
Consider a linear first order equation
Σ_{j=1}^n aj(x) ∂u/∂xj = c(x)u + f(x),   x ∈ Rⁿ,   (4.1)

where aj (x), c(x) and f (x) are all prescribed functions. We also prescribe the solution u(x)
along some (n−1)-dimensional surface Γ: u(y) = g(y) for all y ∈ Γ, with a prescribed function
g. The question we are interested in, is whether we can extend u(x) outside of Γ. The idea of
the method of characteristics is as follows: we look for a family of curves X(s; y), that start at
a point y on Γ (X(0; y) = y), and s is a parameter along the curve, such that U (s) = u(X(s))
can be computed ”easily” given that we know U (0) = g(y). This will extend the function u
to all points that the characteristics can reach in a unique way.
In order to understand what characteristic curves we should choose, let us differentiate
U (s) with respect to s:
dU/ds = Σ_{j=1}^n [∂u(X(s))/∂xj] [dXj(s)/ds].   (4.2)

In order to relate (4.2) to the partial differential equation (4.1) it is convenient to choose X(s)
as a solution of a system of ordinary differential equations
dXj/ds = aj(X(s)),   j = 1, . . . , n,   (4.3)

with a prescribed initial condition X(0) = y, where y ∈ Γ is a given point on Γ. With this
choice of the curve X(s), equation (4.2) becomes
dU/ds = Σ_{j=1}^n aj(X(s)) ∂u(X(s))/∂xj = c(X(s))U(s) + f(X(s)).   (4.4)

Now, if we can solve the characteristics equations (4.3), we can insert the solution X(s) into
the scalar ODE
dU/ds = c(X(s))U(s) + f(X(s)),   (4.5)
and solve it. This will provide the solution to the PDE with the boundary condition, at the
points that can be reached by the characteristics.
Consider an example
ux1 + ux2 + u = 1, (4.6)
subject to the initial condition

u(y1, y2) = sin y1 on the curve y2 = y1 + y1², y1 > 0.   (4.7)

The characteristic curves are


dX1/ds = 1,   dX2/ds = 1,   X1(0) = y1,   X2(0) = y2,
so that
X1 (s) = y1 + s, X2 (s) = y2 + s.
The ODE for U (s) is
dU/ds = 1 − U(s),   U(0) = sin y1,
whose solution is
U(s) = 1 − (1 − sin y1)e^{−s}.
How do we find a solution at a given point (x1 , x2 )? First, we look for a characteristic that
passes through (x1 , x2 ):

x1 = y1 + s,   x2 = y2 + s,   y2 = y1 + y1²,   y1 > 0.

Let us now solve for y1 : s = x1 − y1 , hence y2 = x2 − (x1 − y1 ), thus

x2 − (x1 − y1) = y1 + y1²,

hence y1² = x2 − x1 and
y1 = √(x2 − x1).
We see immediately several issues: first, solution exists only if x2 > x1 – otherwise, a charac-
teristic curve that passes through a point (x1 , x2 ) does not cross the curve Γ at all, meaning
that solution at (x1 , x2 ) is not defined! If x2 > x1 , then s is

s = x1 − y1 = x1 − √(x2 − x1),   (4.8)

and
u(x1, x2) = 1 − (1 − sin √(x2 − x1)) e^{−x1 + √(x2 − x1)}.


Note that this solution is not differentiable at the points where x1 = x2 – this is because this
characteristic curve is tangent to the curve Γ at (0, 0). In general, for the boundary value
problem to have a unique solution characteristics should not be tangent to Γ at any point.
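A finite-difference sanity check of this closed form (a sketch; the sample points are arbitrary, chosen with x2 > x1):

# A sketch: verify u_{x1} + u_{x2} + u = 1 for the closed-form solution above.
import numpy as np

def u(x1, x2):
    y1 = np.sqrt(x2 - x1)        # foot of the characteristic on Gamma
    s = x1 - y1                  # parameter along the characteristic, see (4.8)
    return 1.0 - (1.0 - np.sin(y1)) * np.exp(-s)

h = 1e-6
for x1, x2 in [(0.3, 1.2), (-1.0, 0.5), (2.0, 7.0)]:
    ux1 = (u(x1 + h, x2) - u(x1 - h, x2)) / (2 * h)
    ux2 = (u(x1, x2 + h) - u(x1, x2 - h)) / (2 * h)
    print(ux1 + ux2 + u(x1, x2))     # ~ 1 at every point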
If we consider the same example as before but now prescribe the initial data along the
whole curve y2 = y1 + y1², without the restriction y1 > 0, we will see that at each point (x1, x2)
with x2 > x1 the characteristic curve will cross Γ at two points, with

y1 = ±√(x2 − x1).

This is problematic – which s should we choose then in the formula for u(x1 , x2 )? This
problem comes, once again, from the fact that the characteristic is tangent to Γ at the point
(0, 0).
Generally, solutions exist only in a neighborhood of Γ and only if the characteristics coming
out of Γ are never tangent to it.
Let us consider another example:
−x2 ∂u/∂x1 + x1 ∂u/∂x2 = 0,   (4.9)
with the initial condition u(x1 , 0) = ψ(x1 ). The characteristics are

dX1/ds = −X2(s),   dX2/ds = X1(s),   X1(0) = y1,   X2(0) = 0.
The solution is
X2 (s) = y1 sin s, X1 (s) = y1 cos s.
The equation for U (s) is
dU/ds = 0,   U(0) = ψ(y1),
hence
U (s) = ψ(y1 ).
Therefore, characteristics are circles, and given a point (x1 , x2 ), we have

y1² = x1² + x2².

It follows that we can not define a solution in any open region that includes all of the real
line – but if we restrict Γ to be Γ0 = {(x1 , 0) : x1 > 0}, we can solve the problem with the
solution being
u(x1, x2) = ψ(√(x1² + x2²)).
Once again, this solution is not differentiable at the origin, where the characteristics have a
singular point.

Nonlinear equations
Now, we consider a nonlinear first-order PDE

F (∇u, u, x) = 0, (4.10)

with a boundary condition


u = g on Γ, (4.11)
where Γ is a given curve. Once again, we consider a curve X(s) and try to choose it so that
the evolution of U(s) = u(X(s)) would be tractable. Let us also introduce a vector-valued
function P(s) = ∇u(X(s)) and consider
dPj/ds = Σ_{i=1}^n [∂²u(X(s))/∂xi∂xj] [dXi(s)/ds].   (4.12)

Let us also differentiate (4.10) with respect to xi , with the variable p = ∇u in the argument
of F:
∂F/∂xi + (∂F/∂u)(∂u/∂xi) + Σ_{j=1}^n (∂F/∂pj) ∂²u/∂xi∂xj = 0.   (4.13)

Let us then set the characteristic equation for X(s) as:


dXi/ds = ∂F(P(s), U(s), X(s))/∂pi.   (4.14)
Then (4.12) becomes:
dPj/ds = Σ_{i=1}^n [∂²u(X(s))/∂xi∂xj] [dXi(s)/ds] = Σ_{i=1}^n [∂²u(X(s))/∂xi∂xj] ∂F(P(s), U(s), X(s))/∂pi.   (4.15)

Using the differentiated equation (4.13) this becomes


dPj/ds = −∂F(P(s), U(s), X(s))/∂xj − [∂F(P(s), U(s), X(s))/∂u] Pj(s).   (4.16)
Finally, the equation for U (s) is
dU/ds = Σ_{j=1}^n [∂u(X(s))/∂xj] [dXj(s)/ds] = Σ_{j=1}^n Pj(s) ∂F(P(s), U(s), X(s))/∂pj.   (4.17)

To summarize: we need to consider the full system of the characteristic equations


dXj/ds = ∂F(P(s), U(s), X(s))/∂pj,
dU/ds = Σ_{j=1}^n Pj(s) ∂F(P(s), U(s), X(s))/∂pj,   (4.18)
dPj/ds = −∂F(P(s), U(s), X(s))/∂xj − [∂F(P(s), U(s), X(s))/∂u] Pj(s).
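In practice the system (4.18) is often integrated numerically. Here is a minimal sketch of such an integrator (scipy; the flux F(p, u, x) = p1 p2 − u and the Cauchy data match the worked example that follows, and the value (x1 + 4x2)²/16 used for comparison is the closed form (4.22) derived there):

# A sketch: integrate the characteristic system (4.18) for F(p,u,x) = p1*p2 - u.
import numpy as np
from scipy.integrate import solve_ivp

dF_dp = lambda p, u, x: np.array([p[1], p[0]])
dF_dx = lambda p, u, x: np.zeros(2)
dF_du = lambda p, u, x: -1.0

def rhs(s, z):
    x, u, p = z[:2], z[2], z[3:]
    return np.concatenate([dF_dp(p, u, x),                          # dX/ds
                           [p @ dF_dp(p, u, x)],                    # dU/ds
                           -dF_dx(p, u, x) - dF_du(p, u, x) * p])   # dP/ds

y2 = 1.0                                   # starting point (0, y2) on {x1 = 0}
z0 = [0.0, y2, y2**2, y2 / 2.0, 2.0 * y2]  # (X1, X2, U, P1, P2) at s = 0
sol = solve_ivp(rhs, [0.0, 1.0], z0, rtol=1e-10, atol=1e-12)
x1, x2, u_num = sol.y[0, -1], sol.y[1, -1], sol.y[2, -1]
print(u_num, (x1 + 4 * x2)**2 / 16)        # the two values agree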

Let us consider an example
(∂u/∂x1)(∂u/∂x2) = u,   (4.19)
in the right half-plane {x1 > 0} with the boundary condition u(0, x2 ) = x22 . Now, F (p, u, x) =
p1 p2 − u, so characteristics become
dX1/ds = P2(s),   dX2/ds = P1(s),
dU/ds = 2P1(s)P2(s),   (4.20)
dP1/ds = P1(s),   dP2/ds = P2(s).
Integrating these equations gives

P1 (s) = P1 (0)es , P2 (s) = P2 (0)es , U (s) = U (0) + P1 (0)P2 (0) e2s − 1 ,




and
X1 (s) = X1 (0) + P2 (0)(es − 1), X2 (s) = X2 (0) + P1 (0)(es − 1). (4.21)
At the boundary line {x1 = 0} we have X1(0) = 0, U(0) = (X2(0))² and P2(0) = 2X2(0). We
may also find P1 (0) from the PDE
(∂u/∂x1)(∂u/∂x2) = u,
which at the point (0, X2 (0)) becomes

P1 (0)P2 (0) = U (0),

which gives P1 (0) = X2 (0)/2. Now, given a point (x1 , x2 ) with x1 > 0 let us find the point
(0, X2 (0)) such that the characteristic coming out of (0, X2 (0)) passes through (x1 , x2 ): using
this in (4.21) leads to
x1 = 2X2(0)(e^s − 1),   x2 = X2(0) + (1/2)X2(0)(e^s − 1),
so that
X2(0) = x2 − (1/4)x1,   e^s = (x1 + 4x2)/(4x2 − x1).
It follows that
u(x1, x2) = U(0) + P1(0)P2(0)(e^{2s} − 1) = U(0)e^{2s} = (X2(0))² e^{2s} = (x1 + 4x2)²/16.   (4.22)
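The answer (4.22) can also be verified symbolically; a short sketch:

# A sketch: check that u = (x1 + 4*x2)^2/16 satisfies u_{x1} u_{x2} = u, u(0,x2) = x2^2.
import sympy as sp

x1, x2 = sp.symbols('x1 x2')
u = (x1 + 4 * x2)**2 / 16

print(sp.simplify(sp.diff(u, x1) * sp.diff(u, x2) - u))   # 0: the PDE holds
print(sp.simplify(u.subs(x1, 0) - x2**2))                 # 0: the boundary data match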

5 Basic introduction to conservation laws


5.1 Rankine-Hugoniot conditions and integral solutions
We will consider here the initial value problem for the scalar conservation laws in one dimen-
sion:
ut + (F (u))x = 0, t ≥ 0, x ∈ R, (5.1)

with a prescribed initial data u(0, x) = g(x). The most canonical example is F (u) = u2 /2,
which is known as the Burgers’ equation:

ut + uux = 0. (5.2)

As we will see, solutions of (5.1) do not necessarily stay continuous for all times t > 0
even if the initial data g(x) is infinitely differentiable. Hence, we need first to devise a
way of formulating the initial value problem for the smooth solutions of the PDE in a way
that does not involve derivatives. That formulation would apply equally well then to non-
smooth solutions. The main difficulty will be to choose this ”derivative-free” formulation in
a physically meaningful way. To begin, we let u(t, x) be a (smooth) solution of (5.1) and
multiply this equation by a smooth ”test function” v(t, x) of compact support, and integrate
by parts (the terms at x = ±∞ vanish because v(t, x) has compact support):
0 = ∫_0^∞ ∫_{−∞}^∞ v(t, x)(ut + (F(u))x) dx dt
   = −∫_0^∞ ∫_{−∞}^∞ [vt u + F(u)vx] dx dt − ∫_{−∞}^∞ v(0, x)u(0, x) dx.   (5.3)

Taking into account the initial condition gives


0 = ∫_0^∞ ∫_{−∞}^∞ [vt u + F(u)vx] dx dt + ∫_{−∞}^∞ v(0, x)g(x) dx.   (5.4)

This identity should hold for any smooth compactly supported function v(t, x). The advantage
of (5.4) is that it makes sense for any bounded function u(t, x) – it does not involve any
derivatives of u(t, x). We say that u(t, x) is an integral solution of (5.1) if the integral identity
(5.4) holds for any smooth function v(t, x) of compact support.
In order to understand the implications of this definition, consider a particularly simple
piece-wise constant solution u(t, x) such that u(t, x) = ul for x < x(t), and u(t, x) = ur for
x > x(t). We would like to understand how the point x(t) should evolve for u(t, x) to be
an integral solution of the conservation law (5.1). Take a smooth test function v(t, x) that
vanishes at t = 0, then (5.4) is
0 = ∫_0^∞ ∫_{−∞}^∞ [vt u + F(u)vx] dx dt   (5.5)
   = ∫_0^∞ ∫_{−∞}^{x(t)} [ul vt + F(ul)vx] dx dt + ∫_0^∞ ∫_{x(t)}^∞ [ur vt + F(ur)vx] dx dt.

First, we note that


∫_0^∞ ∫_{−∞}^{x(t)} F(ul)vx dx dt + ∫_0^∞ ∫_{x(t)}^∞ F(ur)vx dx dt = ∫_0^∞ [F(ul)v(t, x(t)) − F(ur)v(t, x(t))] dt.   (5.6)
In order to deal with the time integral we note that
∫_0^∞ ∫_{−∞}^{x(t)} ul vt dx dt = ∫_0^∞ (d/dt) ( ∫_{−∞}^{x(t)} ul v(t, x) dx ) dt − ∫_0^∞ ul v(t, x(t)) ẋ(t) dt.   (5.7)

As v(0, x) = 0 for all x, and v(t, x) = 0 for all t ≥ T for some T > 0, the first term in the
right side vanishes, giving
∫_0^∞ ∫_{−∞}^{x(t)} ul vt dx dt = −∫_0^∞ ul v(t, x(t)) ẋ(t) dt.   (5.8)

And similarly we have


∫_0^∞ ∫_{x(t)}^∞ ur vt dx dt = ∫_0^∞ (d/dt) ( ∫_{x(t)}^∞ ur v(t, x) dx ) dt + ∫_0^∞ ur v(t, x(t)) ẋ(t) dt
   = ∫_0^∞ ur v(t, x(t)) ẋ(t) dt.   (5.9)

Using expressions (5.6)-(5.9) in (5.5) gives


0 = −∫_0^∞ ul v(t, x(t)) ẋ(t) dt + ∫_0^∞ ur v(t, x(t)) ẋ(t) dt + ∫_0^∞ [F(ul)v(t, x(t)) − F(ur)v(t, x(t))] dt,   (5.10)
or
0 = ∫_0^∞ v(t, x(t)) [−ul ẋ(t) + ur ẋ(t) + F(ul) − F(ur)] dt.   (5.11)
As this equality holds for all functions v(t, x), we conclude that
ẋ(t) = [F(ul) − F(ur)] / (ul − ur).   (5.12)
This is known as the Rankine-Hugoniot condition.
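In code, (5.12) is a one-line formula; a small sketch, applied to the Burgers' flux F(u) = u²/2 with ul = 1, ur = 0 (this reproduces the speed 1/2 computed for the shock example below):

# A sketch: the Rankine-Hugoniot shock speed (5.12) for a given flux.
def shock_speed(F, ul, ur):
    return (F(ul) - F(ur)) / (ul - ur)

print(shock_speed(lambda u: 0.5 * u**2, 1.0, 0.0))   # 0.5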
The Rankine-Hugoniot condition generalizes in a straightforward manner to
solutions that are not piece-wise constant but piecewise smooth: let u(t, x) be an integral
solution of (5.1) such that u(t, x) is smooth at x < x(t), and at x > x(t) but has a jump
discontinuity at x = x(t) such that the limits

ul (t) = lim u(t, x), ul (t) = lim u(t, x),


x→x(t)− x→x(t)=

exist. Then a nearly identical computation as above shows that u(t, x) should satisfy

ut + (F (u))x = 0, (5.13)

both in the region x < x(t) and in x > x(t), while the Rankine-Hugoniot condition should
still hold:
ẋ(t) = [F(ul(t)) − F(ur(t))] / (ul(t) − ur(t)).   (5.14)
The basic example of such solution is a shock of the Burgers’ equation:

ut + uux = 0, (5.15)

with the initial condition
u(0, x) = 1 for x < 0,   u(0, x) = 0 for x > 0.   (5.16)

Then the solution is piece-wise constant:
u(t, x) = 1 for x < x(t),   u(t, x) = 0 for x > x(t).   (5.17)

The discontinuity point x(t) moves according to the Rankine-Hugoniot condition correspond-
ing to the flux F (u) = u2 /2, ul = 1, ur = 0:
ẋ = (1/2 − 0)/(1 − 0) = 1/2,
and the shock position is x(t) = t/2 – the shock moves with the constant speed v = 1/2:
u(t, x) = 1 for x < t/2,   u(t, x) = 0 for x > t/2.   (5.18)
Let us now consider a flipped initial condition:
u(0, x) = 0 for x < 0,   u(0, x) = 1 for x > 0.   (5.19)
Then the Rankine-Hugoniot condition would give
ẋ(t) = (0 − 1/2)/(0 − 1) = 1/2,
and the solution is
u(t, x) = 0 for x < t/2,   u(t, x) = 1 for x > t/2.   (5.20)
However, there is another integral solution with the same initial condition:
u(t, x) = 0 for x < 0,   u(t, x) = x/t for 0 < x < t,   u(t, x) = 1 for x > t.   (5.21)

This solution is called a rarefaction wave. It is easy to check that u(t, x) is an integral
solution because it is continuous, piece-wise differentiable, and on each interval where it is
differentiable, it solves (5.1). This example shows that an integral solution is not, generally,
unique, and an important question is to distinguish between various integral solutions.

5.2 The entropy condition


Let us look at the Burgers’ equation

ut + uux = 0, (5.22)

and try to solve it by the method of characteristics. The general method described in (4.18)
certainly applies, but let us work it out explicitly once again. The full characteristics curve
is (we have trivially T (s) = s, so we do not have to introduce this component)

(X(s), U (s), Px (s), Pt (s)),

with
U(s) = u(s, X(s)),   Px(s) = ∂u(s, X(s))/∂x,   Pt(s) = ∂u(s, X(s))/∂t.
Then we compute
dPx/ds = uxt(s, X(s)) + uxx(s, X(s)) Ẋ(s),   (5.23)
and
dPt/ds = utt(s, X(s)) + utx(s, X(s)) Ẋ(s).   (5.24)
On the other hand, differentiating the Burgers’ equation (5.22) in x and t, respectively, we
get
utx + ux² + u uxx = 0,   utt + ut ux + u uxt = 0.   (5.25)
Thus, if we choose
dX(s)/ds = u(s, X(s)) = U(s),   (5.26)
and use (5.25) in (5.23) and (5.24), we would get
dPx/ds = −Px²,   dPt/ds = −Pt Px.   (5.27)
Finally, the Burgers’ equation (5.22) itself means that

dU(s)/ds = ut + ux Ẋ(s) = 0,   (5.28)
because of (5.26). Summarizing, the characteristics are
dX/ds = U,   dU/ds = Px U + Pt,   dPx/ds = −Px²,   dPt/ds = −Px Pt,
starting at
X(0) = y,   U(0) = g(y),   Px(0) = g′(y),   Pt(0) = −g(y)g′(y).
The solution is
Px(s) = Px(0)/(1 + sPx(0)) = g′(y)/(1 + sg′(y)),   Pt(s) = −g(y)Px(s) = −g(y)g′(y)/(1 + sg′(y)),

and, finally, U (s) = U (0) = g(y) – the function u(t, x) is constant along the characteristics.
The projection of the characteristics on the physical space are straight lines

X(t) = y + g(y)t. (5.29)

We can not prevent the characteristics from crossing each other in the physical space – this
is what leads to singularities and shocks. However, we can impose a condition that would
ensure that if we start a characteristic at a point (t, x) and run it backwards in time, then it
will not hit any other characteristic before it reaches t = 0. To guarantee this, we require that
if x = x(t) is the discontinuity curve for the solution, then no characteristic comes out of this
curve. This is known as the entropy condition. In the case of a shock of the Burgers’ equation

this is ensured if ul > ur – this follows from a simple geometric consideration in the
(x, t)-plane, the explicit formula (5.29) for the characteristic, and the fact that U(s) = g(y)
along the characteristic.
Going back to the example of the initial data (5.19) for the Burgers’ equation, we see that
the discontinuous solution does not satisfy the entropy condition, while the rarefaction wave
does.
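To see concretely when the straight characteristics (5.29) first cross, here is a small numerical sketch (the smooth data g(y) = e^{−y²} is an arbitrary choice): the map y ↦ y + t g(y) stops being one-to-one at the time t* = −1/min_y g′(y).

# A sketch: locate the first crossing time of the characteristics (5.29).
import numpy as np

y = np.linspace(-5.0, 5.0, 200001)
g = np.exp(-y**2)                           # smooth initial data
gp = np.gradient(g, y)
t_star = -1.0 / gp.min()                    # first time 1 + t g'(y) vanishes
print("breaking time:", t_star)             # exact value sqrt(e/2) ~ 1.1658

for t in (0.9 * t_star, 1.1 * t_star):
    X = y + t * g                           # characteristic positions at time t
    print(t, bool(np.all(np.diff(X) > 0)))  # True before t*, False after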
How does this change for a general conservation law? Consider a general equation of the
form
ut + F′(u)ux = 0.   (5.30)
The characteristics
(X(s), U (s), Px (s), Pt (s))
now satisfy
dX/ds = F′(U),   dU/ds = Px F′(U) + Pt,   dPx/ds = −F″(U)Px²,   dPt/ds = −F″(U)Px Pt,
starting at
X(0) = y,   U(0) = g(y),   Px(0) = g′(y),   Pt(0) = −F′(g(y))g′(y).
The solution is
U(s) = U(0) = g(y),   Px(s) = Px(0)/(1 + sF″(g(y))Px(0)) = g′(y)/(1 + sF″(g(y))g′(y)),
Pt(s) = −F′(g(y))Px(s) = −F′(g(y))g′(y)/(1 + sF″(g(y))g′(y)).
The function u(t, x) is again constant along the characteristics. The projection of the char-
acteristics on the physical space are now the straight lines
X(t) = y + F′(g(y))t.   (5.31)
The entropy condition for a jump discontinuity is now F′(ul) > F′(ur). If the flux is convex,
this is equivalent to the (simpler!) condition
ul > ur .
Let us now consider a step function initial condition
u(0, x) = 0 for x < 0,   u(0, x) = 1 for 0 < x < 1,   u(0, x) = 0 for x > 1,   (5.32)

for the Burgers’ equation


ut + uux = 0.
For 0 ≤ t ≤ 2 we have a rarefaction wave that follows a shock:


u(t, x) = 0 for x < 0,   x/t for 0 < x < t,   1 for t < x < 1 + t/2,   0 for x > 1 + t/2.   (5.33)

At the time t = 2 the rarefaction wave (whose front moves with the speed v = 1) catches up
with the shock that moves with the speed v = 1/2. After this time the solution is

u(t, x) = 0 for x < 0,   x/t for 0 < x < x(t),   0 for x > x(t).   (5.34)

The position x(t) is determined by the Rankine-Hugoniot condition: ul = x(t)/t, ur = 0, so


ẋ(t) = ul/2 = x(t)/(2t),   x(2) = 2,
hence x(t) = √(2t). Therefore, the solution is

u(t, x) = 0 for x < 0,   x/t for 0 < x < √(2t),   0 for x > √(2t).   (5.35)

The maximum of the solution behaves as
umax(t) = u(t, √(2t)) = √(2t)/t = √(2/t),   (5.36)
that is, it is decaying in time. This brings a question: where does this dissipation come from?
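One concrete way to see the answer is numerical: monotone finite-volume schemes such as Godunov's method (a standard technique, used here purely as an illustration and not discussed in these notes) carry a small amount of numerical viscosity and converge to the entropy solution. A minimal sketch for the step data (5.32), with ad-hoc grid parameters, reproduces the √(2/t) decay of the maximum:

# A sketch: Godunov scheme for u_t + (u^2/2)_x = 0 with the step data (5.32).
import numpy as np

f = lambda u: 0.5 * u**2
def godunov_flux(ul, ur):                  # exact Riemann flux for a convex flux
    return np.maximum(f(np.maximum(ul, 0.0)), f(np.minimum(ur, 0.0)))

N = 4000
x = np.linspace(-2.0, 18.0, N)
dx = x[1] - x[0]
u = np.where((x > 0.0) & (x < 1.0), 1.0, 0.0)    # initial data (5.32)

t, T = 0.0, 8.0
while t < T:
    dt = 0.4 * dx / max(np.abs(u).max(), 1e-12)  # CFL condition
    F = godunov_flux(u[:-1], u[1:])
    u[1:-1] -= dt / dx * (F[1:] - F[:-1])
    t += dt

print("u_max = %.4f at t = %.2f, sqrt(2/t) = %.4f" % (u.max(), t, np.sqrt(2.0 / t)))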

5.3 The Lax-Oleinik formula for the Burgers’ equation


We will now derive the Lax-Oleinik formula for the solution of the initial value problem
∂u/∂t + u ∂u/∂x = 0,   (5.37)
with the initial condition u(0, x) = g(x) that is uniformly bounded: |g(x)| ≤ M . We will
assume for simplicity that 0 ≤ g(x) ≤ M and that g(x) is piece-wise continuous. The Lax-
Oleinik formula is obtained as follows: consider the function
s(t, x; y) = (t/2) ((x − y)/t)² + h(y),   (5.38)
with
h(y) = ∫_0^y g(z) dz.

Let us see where the function s(t, x; y) is minimized over y: sy (t, x, y) = 0 if

x = y + tg(y), (5.39)

because g(y) = h′(y). As g(y) is a bounded function, the function h(y) satisfies |h(y)| ≤ M |y|,
and so s(t, x; y) tends to +∞ as y → ±∞ for all t and x fixed. Moreover, s(t, x; y) is continuous
in y for all t and x fixed. It follows that it attains its global minimum at some points y. We
let y(t, x) be the smallest such point.

Next, set
w(t, x) = (t/2) ((x − y(t, x))/t)² + h(y(t, x)) = inf_{y∈R} [ (t/2) ((x − y)/t)² + h(y) ].   (5.40)

The function w(t, x) is Lipschitz continuous in x because for any x and x′ we have (denote
y = y(t, x))
w(t, x′) − w(t, x) = inf_{z∈R} [ (t/2) ((x′ − z)/t)² + h(z) ] − (t/2) ((x − y)/t)² − h(y)
   ≤ h(x′ − x + y) − h(y) ≤ M |x′ − x|.   (5.41)

We took z = x′ + y − x above as a ”test point”, so that x′ − z = x − y. Note also that as


t → 0 we have y(t, x) → x – the first term in the definition of s(t, x, y) is blowing up as t → 0
for x ≠ y, hence the minimizer has to converge to x – as can be seen from (5.39) because the
function g(y) is uniformly bounded. As a consequence, we have
w(t, x) = (t/2) ((x − y(t, x))/t)² + h(y(t, x)) = (t/2) g²(y(t, x)) + h(y(t, x)) → h(x) as t → 0.   (5.42)

Since w(t, x) is Lipschitz in x, it is differentiable almost everywhere (this is a non-trivial fact


known as the Rademacher theorem). Similarly, one can show that w(t, x) is Lipschitz in t.
Let us find an equation for the function w(t, x). The function y(t, x), where it is differentiable,
satisfies
yx + tg′(y)yx = 1,
and
yt + g(y) + tg′(y)yt = 0.
Both of these relations follow from differentiating (5.39). It follows that

yt = −g(y)yx . (5.43)

Let us now compute:


wt = −(1/(2t²))(x − y)² + ((y − x)/t) yt + h′(y) yt,   (5.44)
and
wx = (1/t)(x − y)(1 − yx) + h′(y) yx.   (5.45)
Using (5.39) and since h′(y) = g(y), we obtain

wt = −(1/2) g²(y) − g(y) yt + g(y) yt = −g²(y)/2,   (5.46)
and
wx = g(y)(1 − yx ) + g(y)yx = g(y). (5.47)

We conclude that w(t, x) is Lipschitz continuous, hence differentiable almost everywhere, and
wherever it is differentiable, it solves the initial value problem
wt + wx²/2 = 0,   w(0, x) = h(x).   (5.48)
Therefore, we may define (almost everywhere)
u(t, x) = ∂w(t, x)/∂x = (∂/∂x) [ (t/2) ((x − y(t, x))/t)² + h(y(t, x)) ].   (5.49)

Note that (5.47) and (5.39) imply that

u(t, x) = g(y(t, x)) = (x − y(t, x))/t.   (5.50)
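Formula (5.50) also gives a direct way to evaluate the solution numerically: minimize s(t, x; ·) from (5.38) over a grid in y. A minimal sketch (the bounded data g below is an arbitrary choice; a constant offset in the discretized h does not move the minimizer and so does not affect u):

# A sketch: evaluate the Lax-Oleinik formula by direct minimization on a grid.
import numpy as np

y = np.linspace(-20.0, 20.0, 80001)
dy = y[1] - y[0]
g = 1.0 / (1.0 + y**2)                                               # bounded data
h = np.concatenate([[0.0], np.cumsum(0.5 * (g[1:] + g[:-1]) * dy)])  # h ~ integral of g

def u_lax_oleinik(t, x):
    s = (x - y)**2 / (2.0 * t) + h
    j = np.argmin(s)            # np.argmin returns the first, i.e. smallest, minimizer
    return (x - y[j]) / t       # formula (5.50)

for xq in (-1.0, 0.0, 2.0):
    print(xq, u_lax_oleinik(1.5, xq))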
Let us show that u(t, x) that we have just constructed is an integral solution of the initial
value problem for the Burgers’ equation

ut + uux = 0, u(0, x) = g(x). (5.51)

The fact that wherever u(t, x) is differentiable, it solves the Burgers’ equation follows imme-
diately from (5.48) after taking the x-derivative. Let us see why it is an integral solution. Let
v(t, x) be a test function and multiply (5.48) by vx :
∫_0^∞ ∫_{−∞}^∞ vx ( wt + wx²/2 ) dx dt = 0.   (5.52)

Now, since w is sufficiently regular in time (this requires a little bit of extra work that we
omit), we may integrate by parts in time, leading to
∫_0^∞ ∫_{−∞}^∞ vx wt dx dt = −∫_0^∞ ∫_{−∞}^∞ vxt w dx dt − ∫_{−∞}^∞ vx(0, x) w(0, x) dx.

Next, we may integrate by parts in x since w is Lipschitz continuous in x:


∫_0^∞ ∫_{−∞}^∞ vx wt dx dt = −∫_0^∞ ∫_{−∞}^∞ vxt w dx dt − ∫_{−∞}^∞ vx(0, x) w(0, x) dx
   = ∫_0^∞ ∫_{−∞}^∞ vt wx dx dt + ∫_{−∞}^∞ v(0, x) wx(0, x) dx = ∫_0^∞ ∫_{−∞}^∞ vt u dx dt + ∫_{−∞}^∞ v(0, x) g(x) dx.

Inserting this into (5.52) gives


∫_0^∞ ∫_{−∞}^∞ ( vt u + (u²/2) vx ) dx dt + ∫_{−∞}^∞ v(0, x) g(x) dx = 0,   (5.53)

which means exactly that u(t, x) is an integral solution of (5.51).


Expression (5.49) is known as the Lax-Oleinik formula for the solution of the Burgers’
equation.

5.4 Entropy solutions and the Lax-Oleinik formula
Let us first show that the function u(t, x) given by the Lax-Oleinik formula satisfies a one-sided
estimate
u(t, x + z) − u(t, x) ≤ z/t,   for all t > 0, x ∈ R and z > 0.   (5.54)
This estimate holds simply under the assumption that the initial condition u(0, x) = g(x)
for the Burgers’ equation (5.37) is a bounded function. We will assume to simplify some
computations that g(x) ≥ 0. Note that if g(x) is not positive but is bounded: |g(x)| ≤ M
then we can consider q(t, x) = u(t, x) + M . This function satisfies

qt + (q − M )qx = 0, q(0, x) = g(x) + M ≥ 0.

Another change of variable, q(t, x) = R(t, x + M t) brings us to the Burgers’ equation for the
function R(t, x)
Rt + RRx = 0,
with the initial condition R(0, x) = g(x) + M ≥ 0. If we show that (5.54) holds for the
solutions of the Burgers’ equation with bounded non-negative initial conditions, we would
conclude that
R(t, x + z) − R(t, x) ≤ z/t,
for all z ≥ 0. This would, in turn, imply
u(t, x + z) − u(t, x) = q(t, x + z) − q(t, x) = R(t, x + z + Mt) − R(t, x + Mt) ≤ z/t,
which is (5.54). Thus, we may assume that g(x) is non-negative everywhere without any loss
of generality.
A consequence of (5.54) is that backwards characteristics do not cross: this is our definition
of an entropy solution. Indeed, the characteristic that passes through (t, x) is a line with the
slope u(t, x):
X(s) − x = u(t, x)(s − t).
Thus, for the characteristics passing through (t, x) and (t, x + z), with z > 0, to cross at some
0 < s < t, we should have

x − tu(t, x) > x + z − tu(t, x + z),

which is prohibited by (5.54). Thus, a solution that satisfies (5.54) is an entropy solution in
the sense that ”backwards characteristics do not cross”.
In order to establish (5.54) we first show that y(t, x) is non-decreasing in x. This is
important because u(t, x) = g(y(t, x)). To check the monotonicity of y(t, x), let x1 < x2 ,
t > 0, and y1 = y(t, x1 ), y2 = y(t, x2 ). We claim that s(t, x2 , y) > s(t, x2 , y1 ) for all y < y1 ,
which means that y2 ≥ y1 . To see this, we note that, since x2 > x1 and y1 > y, we have

x2 y1 + x1 y > x1 y1 + x2 y,

which implies
(x2 − y1)² + (x1 − y)² < (x1 − y1)² + (x2 − y)²,   (5.55)

Observe also that, as y1 is the smallest global minimizer of s(t, x1 , y), and y < y1 , we have
(x1 − y1)²/(2t) + h(y1) < (x1 − y)²/(2t) + h(y).
As g(x) ≥ 0 for all x, we know from (5.39) that x1 ≥ y1, hence x1 − y > x1 − y1 ≥ 0.
Combining this with (5.55) gives
(x2 − y1)²/(2t) + (x1 − y)²/(2t) + h(y1) < (x1 − y1)²/(2t) + (x2 − y)²/(2t) + h(y1)
   < (x1 − y)²/(2t) + (x2 − y)²/(2t) + h(y),
which is nothing but
(x2 − y1)²/(2t) + h(y1) < (x2 − y)²/(2t) + h(y),
and thus s(t, x2 , y) > s(t, x2 , y1 ) for all y < y1 , which implies that y2 ≥ y1 .
We now use (5.50), and the fact that y(t, x) is non-decreasing in x:
u(t, x) = (x − y(t, x))/t ≥ (x − y(t, x + z))/t = (x + z − y(t, x + z))/t − z/t = u(t, x + z) − z/t,   (5.56)
which is (5.54).
We will say that u(t, x) is an entropy solution of the Burgers’ equation if it satisfies the
entropy condition (5.54). Note that if u(t, x) is discontinuous at some point and has left and
right limits ul and ur , then (5.54) implies that ul > ur , which is our old entropy condition
that we have introduced for piecewise continuous solutions. However, (5.54) is a more general
form of the entropy condition as it does not require u(t, x) to be piecewise continuous. The
key result is the following theorem.
Theorem 5.1 There exists a unique entropy solution of the initial value problem

ut + uux = 0, u(0, x) = g(x), (5.57)

with g ∈ L∞ (R), and it is given by the Lax-Oleinik formula.


We will not prove this result here. Note that existence has been already proved.

5.5 Long time behavior


Let us now address the long-time decay of the solutions of the Burgers’ equation that we have
mentioned previously. Consider the entropy solution of (5.57) and let us assume that
∫_{−∞}^∞ |g(x)| dx ≤ M < +∞.   (5.58)

It follows that
|h(y)| = | ∫_0^y g(s) ds | ≤ M,

as well. Then, we have

s(t, x; y) = (x − y)²/(2t) + h(y) ≥ (x − y)²/(2t) − M.
On the other hand, we also have

s(t, x; x) = h(x) ≤ M,

hence at the minimizing point y(t, x) we have

(x − y(t, x))²/(2t) + h(y(t, x)) ≤ M,
whence
(x − y(t, x))²/(2t) ≤ 2M,   (5.59)
for all t > 0. Now, we simply use expression (5.50) for u(t, x):
u(t, x) = (x − y(t, x))/t ≤ √(4Mt)/t = √(4M)/√t.   (5.60)
Hence, in the long time regime the entropy solutions of the Burgers’ equation decay as
O(t^{−1/2}). Therefore, the definition of the entropy solution includes, in an implicit form, some
”hidden” dissipation.
Finally, we establish ”decay to an N -wave” result for the solutions of the Burgers’ equation.
Let g(x) have compact support and set
p = −2 min_{y∈R} ∫_{−∞}^y g(s) ds,   q = 2 max_{y∈R} ∫_y^∞ g(s) ds.
Define the N-wave as
N(t, x) = x/t if −√(pt) < x < √(qt),   N(t, x) = 0 otherwise.
Theorem 5.2 Assume that p, q > 0. Then there exists C > 0 so that
∫_{−∞}^∞ |u(t, x) − N(t, x)| dx ≤ C/√t.   (5.61)
Proof. First, note that if g(x) = 0 for |x| ≥ R, then h(x) = h− for all x < −R, and h(x) = h+
for all x > R, with
h− = −∫_{−R}^0 g(s) ds,   h+ = ∫_0^R g(s) ds.

One can easily check that


min_y h(y) = −p/2 + h− = −q/2 + h+.   (5.62)

We now claim that
u(t, x) = 0 for all x < −R − (pt)^{1/2}, for all t > 0.   (5.63)

In order to establish this, we will show that for x as in (5.63), we have y(t, x) = x. Indeed,
for such x we have
s(t, x; x) = h(x) = h− ,
while for y < −R we have

s(t, x; y) = (x − y)²/(2t) + h(y) = (x − y)²/(2t) + h− > h−,
unless y = x. On the other hand, for y > −R we have

s(t, x; y) = (x − y)²/(2t) + h(y) ≥ (x + R)²/(2t) − p/2 + h− > pt/(2t) − p/2 + h− = h−.
Hence, y(t, x) = x is the smallest global minimizer of s(t, x; y). Then (5.50) implies that
u(t, x) = 0. Similarly, we can show that

u(t, x) = 0 for all x > R + (qt)^{1/2}, for all t > 0.   (5.64)

Next, we claim that (for t sufficiently large so that this interval is not empty)
−R ≤ y(t, x) ≤ R,   if R − √(pt) + 1 < x < −R + √(qt) − 1.   (5.65)

Since y(t, x) is non-decreasing in x it suffices to show that



y(t, R − √(pt) + 1) ≥ −R,   (5.66)
and
y(t, −R + √(qt) − 1) ≤ R.   (5.67)

To see that (5.66) holds, take x0 = R − √(pt) + 1. Note that if z < −R, then h(z) = h−, and
s(t, x0; z) = (x0 − z)²/(2t) + h(z) ≥ h−.
Choose now z0 so that h(z0) = min_{y∈R} h(y) = −p/2 + h−, and |z0| ≤ R. Then we have

s(t, x0; z0) = (x0 − z0)²/(2t) + h(z0) < pt/(2t) − p/2 + h− = h−.
It follows that the minimizer y(t, x0 ) of s(t, x0 , y) satisfies (5.66). The proof of (5.67) is very
similar. As
u(t, x) = (x − y(t, x))/t,
we conclude that
|u(t, x) − x/t| ≤ C/t   for R − √(pt) < x < −R + √(qt).   (5.68)

So we have shown that u(t, x) = 0 for x < −R − √(pt) and x > R + √(qt), and, in addition,
(5.68) holds for R − √(pt) < x < −R + √(qt). As a consequence, we have
|u(t, x) − N(t, x)| ≤ C/t,   for x ∉ (−R − √(pt), R − √(pt)) and x ∉ (−R + √(qt), R + √(qt)).
However, for x ∈ (−R − √(pt), R − √(pt)) we have u(t, x) ≤ C/√t, and N(t, x) = x/t ≤ C/√t,
and similarly for x ∈ (−R + √(qt), R + √(qt)). Putting all these pieces together implies that
∫_R |u(t, x) − N(t, x)| dx ≤ C/√t,
as we have claimed. 2

