Partial Differential Equations

Per K. Jakobsen
UiT The Arctic University of Norway

All content following this page was uploaded by Per K. Jakobsen on 06 February 2019.
Contents

1 Introduction
2 First notions
3 PDEs as mathematical models of physical systems
  3.1 A traffic model
  3.2 Diffusion
  3.3 Heat conduction
  3.4 A random walk
  3.5 The vibrating string
5 Numerical methods
  5.1 The finite difference method for the heat equation
  5.2 The Crank-Nicolson scheme
    5.2.1 Finite difference method for the heat equation: Neumann boundary conditions
  5.3 Finite difference method for the wave equation
  5.4 Finite difference method for the Laplace equation
    5.4.1 Iterations in R^n
    5.4.2 Gauss-Jacobi
    5.4.3 Gauss-Seidel
7 Classification of equations: Characteristics
  7.1 Canonical forms for equations of hyperbolic type
  7.2 Canonical forms for equations of parabolic type
  7.3 Canonical forms for equations of elliptic type
  7.4 Characteristic curves
  7.5 Characteristic curves again: A different point of view
  7.6 Stability theory, energy conservation and dispersion
  11.6 Group velocity 2
12 Projects
  12.1 Project 1
  12.2 Project 2
  12.3 Project 3
1 Introduction
The field of partial differential equations (PDEs) is vast in size and diversity. The basic reason for this is that essentially all fundamental laws of physics are formulated in terms of PDEs. In addition, approximations to these fundamental laws, which form a patchwork of mathematical models covering the range from the smallest to the largest observable space-time scales, are also formulated in terms of PDEs. The diverse applications of PDEs in science and technology testify to the flexibility and expressiveness of the language of PDEs, but this diversity also makes the subject hard to teach well. Exactly because of the diversity of applications, there are just so many different points of view when it comes to PDEs. These lecture notes view the subject through the lens of applied mathematics. From this point of view, the physical context for basic equations like the heat equation, the wave equation and the Laplace equation is introduced early on, and the focus of the lecture notes is on methods, rather than precise mathematical definitions and proofs. With respect to methods, both analytical and numerical approaches are discussed.
These lecture notes have been successfully used as the text for a master class in partial differential equations for several years. The students attending this class are assumed to have previously attended a standard beginner's class in ordinary differential equations and a standard beginner's class in numerical methods. It is also assumed that they are familiar with programming at the level of a beginner's class in informatics at the university level.

While writing these lecture notes we have been influenced by the writings of some of the many authors that have previously written textbooks on partial differential equations [3, 6, 5, 8, 7, 2, 4]. However, for the students that these lecture notes are aimed at, books like [3], [6], [5], [8] and [7] are much too advanced, either mathematically [3], [6], [8] or technically [5], [7]. The books [2], [4] would be accessible to the students we have in mind, but in [2] there is too much focus on precise mathematical statements and proofs and less on methods, in particular numerical methods. The book [4] is a better match than [2] for the students these lecture notes have been written for, but it is a little superficial and does not reach far enough. With respect to reach and choice of topics, the book Partial Differential Equations of Applied Mathematics, written by Erich Zauderer [1], would be a perfect match for the students we have in mind. However, the students taking the class covered by these notes do not have any previous exposure to PDEs, and the book [1] is for the most part too difficult for them to follow. The book is also very voluminous and contains far too much material for a class covering one semester with five hours of lecture each week. The lecture notes for the most part follow the structure of [1], but simplify the language and make a selection of topics that can be covered in the time available. The lecture notes deviate from [1] at several points, in particular in the sections covering physical modeling examples, integral transforms and the treatment of numerical methods. After mastering these lecture notes, the students should be able to use [1] as a reference for more advanced methods that are not covered by the notes. In order for students to pass the master class these notes have been written for, the students must complete the three projects included in the last section of the lecture notes and pass an oral exam where they must defend their work and also answer questions based on the content of these lecture notes.
2 First notions
A partial differential equation (PDE) is an equation involving one or more functions of two or more variables, and their partial derivatives, up to some finite order. Here are some examples:

ux + uy = 0, (1)
ux + y uy = 0, (2)
(ux)^2 + (uy)^2 = 1, (3)

where

ux ≡ ∂x u ≡ ∂u/∂x.

The highest derivative that occurs in the equation is the order of the equation. All the equations (1), (2) and (3) are of order one.
A PDE is scalar if it involves only one unknown function. In this course we
will mostly work with scalar PDEs of two independent variables. This is hard
enough as we will see!
Most important theoretical and numerical issues can be illustrated using such equations. It is also a fact that many physical systems can be modeled by equations of this type. We will also for the most part restrict our attention to equations of order two or less. This is the type of equations that most frequently occur in modeling situations. This fact is linked to the use of second order Taylor expansions and to the use of Newton's equations in one form or another.
The most general first order scalar PDE, in two independent variables, is of the form

F(x, y, u, ux, uy) = 0. (4)

A solution of the equation in a domain D ⊆ R^2 is a function u = u(x, y) whose first order partial derivatives are continuous and which satisfies the equation for all (x, y) ∈ D. PDEs typically have many different solutions. The different solutions can be defined on different, even disjoint domains.

We will later relax this notion of solutions somewhat. In more advanced textbooks what we call a solution here is called a classical solution.

The most general second order scalar PDE of two independent variables is of the form

F(x, y, u, ux, uy, uxx, uxy, uyy) = 0. (5)

A solution in D ⊆ R^2 is now a function where all second order partial derivatives are continuous and where the function satisfies equation (5) for all (x, y) ∈ D.
Here are some examples
Equations (1),(2),(6) and (7) are linear equations whereas equations (3) and
(8) are not linear. Equations that are not linear are said to be nonlinear. The
importance of the linearity/nonlinearity distinction in the theory of PDEs can
hardly be overstated. We have already seen the importance of this notion in
the theory of ODEs.
The precise meaning of linearity is best stated using the language of linear algebra. Let us consider equation (1). Define an operator L by

L = ∂x + ∂y.

Equation (1) can then be written as

L u = 0.

For any two functions u and v, and any constant a, we have the two properties

1) L(u + v) = ∂x(u + v) + ∂y(u + v)
            = ∂x u + ∂x v + ∂y u + ∂y v
            = ∂x u + ∂y u + ∂x v + ∂y v
            = L u + L v,

2) L(au) = ∂x(au) + ∂y(au) = a(∂x u + ∂y u) = a L u.

Properties 1 and 2 show us that L is a linear differential operator.
Definition 1. A scalar PDE is linear if it can be written in the form

L(u) = g,

for some linear differential operator L and given function g. The equation is homogeneous if g = 0 and inhomogeneous if g ≠ 0.
Let us consider equation (3). Define a differential operator L by

L(u) = (∂x u)^2 + (∂y u)^2.

Equation (3) can then be written as

L(u) = 1.

Note that

L(au) = (∂x(au))^2 + (∂y(au))^2 = a^2 L(u),

so L fails property 2 and is thus not a linear operator (it also fails property 1). Thus the equation

(∂x u)^2 + (∂y u)^2 = 1,

is nonlinear. It is not hard to see that, basically, a PDE is linear if it can be written as a first order polynomial in u and its derivatives. All other equations are nonlinear.
The most general first order linear scalar PDE of two independent variables is of the form

a(x, y)ux + b(x, y)uy + c(x, y)u = d(x, y).
The most important property of homogeneous linear equations is that they
satisfy the superposition principle.
Let u, v be solutions of a linear homogeneous scalar PDE. Thus

L u = 0,
L v = 0,

where L is the linear differential operator defining the equation. Let w = au + bv where a, b ∈ R (or C). Then

L w = L(au + bv) = a L u + b L v = 0.

Thus w is also a solution for all choices of a and b. This is the superposition principle. We have already seen this principle at work in the theory of ODEs.
Generalizing the argument just given, we obviously have that if u_1, ..., u_n are solutions, then

w = Σ_{i=1}^{n} a_i u_i,  a_i ∈ R (or C),

is also a solution for all choices of the constants {a_i}.
In the theory of ODEs we found that any solution of a homogeneous linear ODE of order n is of the form

y = c_1 y_1 + · · · + c_n y_n,

where y_1, ..., y_n are n linearly independent solutions of the ODE and c_1, ..., c_n are some constants. Thus the general solution has n free constants. For linear PDEs, the situation is not that simple.

Let us consider the PDE

uxx = 0. (9)

Integrating once with respect to x, we get

ux = f(y),

where f(y) is an arbitrary function of y. Integrating one more time we get the general solution

u = x f(y) + g(y),

where g(y) is another arbitrary function of y. Thus the general solution of equation (9) depends on two arbitrary functions. The corresponding ODE

y'' = 0,

has a general solution y = ax + b that depends on two arbitrary constants. The equation (9) is very special, but the conclusion we can derive from it is very general. (But not universal!) This fact makes it much harder to work with PDEs than with ODEs.
Let us consider another example that will be important later in the course. The PDE that we want to solve is

uxy = 0. (10)

Integrating first with respect to x, we get

uy = f(y), (11)

where f(y) is arbitrary. Integrating equation (11) with respect to y gives us the general solution to equation (10) in the form

u = F(x) + G(y),

where G(y) = ∫ f(y) dy and F(x) are arbitrary functions of y and x.
3 PDEs as mathematical models of physical systems

Methods for solving PDEs are the focus of this course. Deriving approximate descriptions of physical systems in terms of PDEs is less of a focus and is in fact better done in specialized courses in physics, chemistry, finance etc. It is however important to have some insight into the physical modeling context for some of the main types of equations we discuss in the course. This insight will make the methods we introduce to solve PDEs more natural and easier to understand.
3.1 A traffic model

Consider cars driving along a road. We make the following simplifying assumptions:

i) There is only one lane, so that all cars move in the same direction (no passing is allowed).

ii) There are no intersections or off/on ramps. Thus cars can not enter or leave the road.

Because of i) and ii), we can represent the road by a line that we coordinatize by a variable x. Let

n(x, t) = density of cars at the point x at time t,

so that the number of cars in a short interval [x, x + ∆x] is approximately n(x, t)∆x, and let

f(x, t) = number of cars passing the point x from left to right, per unit time, at time t.

Let ∆N be the change in the number of cars in the interval [x, x + ∆x] during a short time interval [t, t + ∆t]. From the definitions of n, f and ∆N, and assumption ii), we get the identity

∆N = n(x, t + ∆t)∆x − n(x, t)∆x = f(x, t)∆t − f(x + ∆x, t)∆t. (12)
We now make the fundamental continuum assumption: n(x, t) and f(x, t) are smooth functions. This is a reasonable approximation if the number of cars in each interval [x, x + ∆x] is very large and changes slowly when x and t vary. f(x, t) is also assumed to be slowly varying. (Imagine observing a crowded road from a great height, say from a helicopter.) Rewrite equation (12) as

(1/∆t){n(x, t) + ∂t n(x, t)∆t − n(x, t)} ≈ −(1/∆x){f(x, t) + ∂x f(x, t)∆x − f(x, t)},
⇓
∂t n ≈ −∂x f.

In the limit ∆t, ∆x → 0 this becomes the exact conservation law

∂t n + ∂x f = 0. (14)

In order to close the model, we need a relation between f and n. The simplest assumption is that the flux is a function of the density alone, f = α(n). The chain rule then gives

∂x f = ∂x α(n) = α′(n)∂x n.
Thus our model of the traffic flow is given by the PDE

∂t n + c(n)∂x n = 0,  c(n) = α′(n).

For small densities the speed is, to a first approximation, independent of the density,

c(n) ≈ c_0, (18)

and the model reduces to the linear transport equation

∂t n + c_0 ∂x n = 0. (19)

For somewhat higher densities we might have to use the more precise approximation

c(n) ≈ c_0 + c_1 n, (20)

which gives the nonlinear model

∂t n + (c_0 + c_1 n)∂x n = 0.
For the linear transport equation (19), let us look for solutions of the form

n(x, t) = ϕ(x − c_0 t).

For any differentiable function ϕ we have

∂t n = −c_0 ϕ′,
∂x n = ϕ′,
⇓
∂t n + c_0 ∂x n = −c_0 ϕ′ + c_0 ϕ′ = 0.

Thus

n(x, t) = ϕ(x − c_0 t), (21)

is a solution of equation (19) for any function ϕ. Let us assume that the density of cars is known at one time, say at t = 0. Thus

n(x, 0) = f(x),
⇕
ϕ(x − c_0 · 0) = f(x),
⇕
ϕ(x) = f(x).

Thus the function

n(x, t) = f(x − c_0 t), (23)

is a solution of

∂t n + c_0 ∂x n = 0,
n(x, 0) = f(x). (24)
(24) is an example of an initial value problem for a PDE. Initial value problems for PDEs are often, for historical reasons, called Cauchy problems.

Note that at this point we have not shown that (23) is the only solution to the Cauchy problem (24). Proving that there are no other solutions is obviously important, because a deterministic model (and this is one!) should give us exactly one solution, not several.

The solution (23) describes an interesting behavior that is common to very many physical systems. It describes a wave. From (23), it is evident that the density profile f moves to the right with speed c_0 and unchanged shape. We are used to observing waves when looking at the surface of the sea, but might not have expected to find them in the description of traffic on a one-lane road. Traffic waves are in fact common and play no small part in the occurrence of traffic jams.
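The claim that (23) solves the transport equation is easy to check numerically. The following small Python sketch approximates n_t and n_x by center differences and verifies that the residual n_t + c_0 n_x is close to zero; the profile f, the speed c_0 and the test point (x, t) are arbitrary choices made for this test.

```python
import numpy as np

# Numerical check that n(x, t) = f(x - c0*t) solves the transport
# equation n_t + c0 n_x = 0.  The profile f, the speed c0 and the
# test point (x, t) are arbitrary choices made for this test.
c0 = 1.0
f = lambda y: np.exp(-y**2)          # a smooth bump of cars
n = lambda x, t: f(x - c0 * t)       # the claimed solution (23)

x, t, eps = 0.3, 0.7, 1e-5
n_t = (n(x, t + eps) - n(x, t - eps)) / (2 * eps)   # center difference in t
n_x = (n(x + eps, t) - n(x - eps, t)) / (2 * eps)   # center difference in x

residual = n_t + c0 * n_x
print(abs(residual))                 # close to zero
```

Any other smooth profile f gives the same result, which is exactly the content of the statement that (21) solves (19) for any function ϕ.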
3.2 Diffusion

Consider a narrow pipe oriented along the x-axis containing water that is not moving. Assume there is a material substance, a dye for example, of mass density u(x, t) dissolved in the water. Here we assume that the mass density is uniform across the pipe, so that the density only depends on x, which by choice is the coordinate along the length of the pipe.
Figure 3
If the pipe is not leaking, the mass inside the section of the pipe between x_1 and x_2,

M(t) = ∫_{x_1}^{x_2} u(x, t) dx,

can only change by dye entering or leaving through the points x_1 and x_2. This is the law of conservation of mass, which is the first great pillar of classical physics.

Let f(x, t) be the amount of mass flowing through a point x at time t per unit time. This is the mass flux. By convention, f positive means that mass is flowing to the right. Mass conservation then gives the balance equation

dM/dt = f_1 − f_2,  f_i = f(x_i, t), i = 1, 2. (25)
Fick's law states that we have the following relation between mass flux and mass density:

f(x, t) = −k ux(x, t),  k > 0. (26)

Thus mass flows from high density areas towards low density areas.

Fick's law is only a phenomenological, approximate relation based on the empirical observation that the mass flux is, for the most part, a function of (x, t) only through the density gradient, thus f(x, t) = f(ux(x, t)). Assuming that the flux is a smooth function of the density gradient, we can represent the flux as a power series in the density gradient,

f = f_0 + f_1 ux + f_2 (ux)^2 + · · · . (27)

Since there is no mass flux if the density gradient is zero, we must have f_0 = 0. Furthermore, since mass flows from high density domains to low density domains,
we must have f_1 < 0. We can therefore write f_1 = −k, where k > 0. Clearly, if the density gradient is large enough, the third term in the power series (27) will be significant and must be included. As for the traffic model, this will lead to a nonlinear equation instead of the linear equation that results from truncating the power series at the second term. In general we will assume that the parameter k in Fick's law depends on x. This corresponds to the assumption that the relation between flux and density gradient depends on position. This is not uncommon. Using Fick's law we can write the mass balance equation as
dM/dt = [k ux]_{x_1}^{x_2},
⇕
∫_{x_1}^{x_2} ut dx = ∫_{x_1}^{x_2} (k ux)x dx,
⇕
∫_{x_1}^{x_2} (ut − (k ux)x) dx = 0,  ∀ x_1, x_2,
⇕
ut − (k ux)x = 0. (28)
This is the diffusion equation. Similar arguments in 2D and 3D lead to

ut = ∇ · (k∇u),  where ∇ = (∂x, ∂y) in 2D and ∇ = (∂x, ∂y, ∂z) in 3D.

If the mass flow coefficient, k, is constant in space, the diffusion equation takes the form

ut = k∇^2 u. (29)

These equations and their close cousins describe a huge range of physical, chemical, biological and economic phenomena.

If the mass flow is stationary (time independent) and the mass flow coefficient independent of space, we get the simplified equation

∇^2 u = 0,  u = u(x). (30)

This is the Laplace equation, and it occurs in the description of many stationary phenomena.
3.3 Heat conduction

Consider a bar oriented along the x-axis, with temperature T(x, t) and thermal energy density e(x, t), both assumed to be uniform across the cross section of the bar. Let E(t) be the total thermal energy in the section of the bar between x_1 and x_2.

Figure 4

The second great pillar of classical physics is the law of conservation of energy. Assuming no energy is gained or lost along the length of the bar, the conservation of energy gives us the balance equation

dE/dt = f_1 − f_2, (31)

where f_i = f(x_i, t) and f(x, t) is the energy flux at x at time t. Here we use the convention that positive energy flux means that energy flows to the right. In order to get a model for the bar, we must relate e(x, t) and f(x, t) to T(x, t). For many materials we have the following approximate identity

e(x, t) = CρT(x, t), (32)

where ρ is the mass density of the bar and C is the heat capacity at constant volume. For (32) to hold, the temperature must not vary too quickly. Energy is always observed to flow from high to low temperature regions. If the temperature gradient is not too large, Fourier's law holds for the energy flux:

f(x, t) = −qTx,  q > 0. (33)
Using (32) and (33), the energy balance equation becomes

dE/dt = f_1 − f_2,
⇕
∫_{x_1}^{x_2} CρTt dx = [qTx]_{x_1}^{x_2} = ∫_{x_1}^{x_2} (qTx)x dx,
⇕
∫_{x_1}^{x_2} (CρTt − (qTx)x) dx = 0,  ∀ x_1, x_2,
⇕
CρTt − (qTx)x = 0.

If the heat conduction coefficient, q, does not depend on x, we get

Tt = DTxx,  D = q/(Cρ). (34)
Again we get the diffusion equation. In this context the equation is called the equation of heat conduction. Energy flow in 2D and 3D similarly leads to 2D and 3D versions of the heat conduction equation,

Tt = D∇^2 T, (35)

and in the stationary case the Laplace equation,

∇^2 T = 0. (36)

The wave equation, the diffusion equation, the Laplace equation and their close cousins will be the main focus in this course.

In the next example we will show that the diffusion equation also arises from a seemingly different context. This example is the beginning of a huge field of applications that has ramifications in all areas of science, technology, economy, etc. I am talking about stochastic processes.
3.4 A random walk

Consider a particle that moves randomly along the x-axis. At each time step of length τ, the particle jumps a distance δ, to the left or to the right with equal probability 1/2. Let P(x, t) be the probability of finding the particle at the point x at time t. A particle can be at x at time t + τ only if it was at x − δ or at x + δ at time t, and therefore

P(x, t + τ) = (1/2)P(x − δ, t) + (1/2)P(x + δ, t). (37)

This is just saying that probability is conserved. From this point of view, (37) is a balance equation derived from a conservation law, just like equations (14), (25) and (31). Using Taylor expansions,
P(x, t) + ∂t P τ + O(τ^2)
  = (1/2)P(x, t) − (1/2)∂x P δ + (1/4)∂xx P δ^2
  + (1/2)P(x, t) + (1/2)∂x P δ + (1/4)∂xx P δ^2 + O(δ^4).

Thus by truncating the expansions we get

∂t P = D∂xx P,  D = δ^2/(2τ).

This is again the diffusion equation. A more general and careful derivation of the random walk equation can be found in section 1.1 in [1].
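The connection between the random walk and the diffusion equation can also be checked by direct simulation. The Python sketch below (the step length δ, time step τ, number of walkers and number of steps are arbitrary test choices) simulates many independent walkers; the diffusion equation predicts that the variance of the positions grows like 2Dt with D = δ^2/(2τ).

```python
import numpy as np

# Monte Carlo check of the random walk / diffusion connection: each of
# n_walkers independent walkers jumps +delta or -delta, with probability
# 1/2 each, once every tau time units.  The diffusion equation predicts
# that the variance of the positions grows like 2*D*t with D = delta^2/(2*tau).
rng = np.random.default_rng(0)
delta, tau = 0.1, 0.01
n_walkers, n_steps = 50_000, 200

signs = 2 * rng.integers(0, 2, size=(n_walkers, n_steps), dtype=np.int8) - 1
x_final = delta * signs.sum(axis=1)   # positions at time t = n_steps * tau

t = n_steps * tau
D = delta**2 / (2 * tau)
print(x_final.var(), 2 * D * t)       # the two numbers should agree closely
```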
3.5 The vibrating string

Consider a tightly stretched elastic string. We make the following assumptions:

i) The string can only move in a single plane, and in this plane it can only move transversally.

Using i), the state of the string can be described by a single function u(x, t) measuring the height of the deformed string over the x-axis at time t. Negative values of u mean that the string is below the x-axis.

Figure 5
ii) Any real string has a finite thickness, and bending of the string results in a tension (negative pressure) on the outer edge of the bend and a compression (pressure) on the inner edge of the bend.

Figure 6

We model the string as infinitely thin and thus assume no compression force. The tension, T, in a deformed string is always tangential to the string and is assumed to be of constant magnitude and time independent,

T = |T| = const.

iii) The string has a mass density ρ which is independent of position and also time independent.

Note: ii) and iii) are not totally independent, because a typical cause of nonuniform tension, T, is that the string has a variable material composition, and this usually implies varying mass density.
Let us consider a piece of string above the interval [x_1, x_2]. The total vertical momentum (the horizontal momentum is by assumption i) equal to zero), P(t), in this piece of string at time t is

P(t) = ∫_{x_1}^{x_2} ρ ut dx. (38)
Recall that momentum is conserved in classical (non-relativistic) physics. This is the third great pillar of classical physics. Note that with the y-axis oriented as in figure 5, a piece of the string moving vertically up gives a positive contribution to the total momentum P(t). Recall also that force is by definition change in momentum per unit time. Thus, if we assume there are no forces acting on the string inside the interval [x_1, x_2], then the momentum inside the interval can only change through the action of the tension force at the bounding points x_1 and x_2. If we orient the coordinate axes as in figure 5, a positive vertical component, f_1, of the tension force at x_1 will lead to an increase in the total momentum P(t). A similar convention applies for f_2 at x_2.

Conservation of momentum for the string is then expressed by the following balance equation

dP/dt = f_1 + f_2.
f_1 = −T cos θ_1 = −T cos(π/2 − θ) = −T sin θ = −T ux/√(1 + (ux)^2) |_{x=x_1}.

Figure 8

f_2 = −T cos θ_2 = −T cos(π/2 + θ) = T sin θ = T ux/√(1 + (ux)^2) |_{x=x_2}.

Figure 9

iv) The vibrations of the string are small, in the sense that the slope of the deformed string satisfies |ux| ≪ 1.
Applying iv) gives us

f_1 ≈ −T ux |_{x=x_1},
f_2 ≈ T ux |_{x=x_2}. (39)

The momentum balance equation then becomes

dP/dt = −T ux|_{x_1} + T ux|_{x_2} = [T ux]_{x_1}^{x_2},
⇕
∫_{x_1}^{x_2} (ρ ut)t dx = ∫_{x_1}^{x_2} (T ux)x dx,
⇕
∫_{x_1}^{x_2} (ρ utt − T uxx) dx = 0,  ∀ x_1, x_2,
⇕
ρ utt − T uxx = 0,
⇕
utt − c^2 uxx = 0,  c = √(T/ρ) > 0. (40)
This is the wave equation. Similar modeling of small vibrations of membranes (2D) and solids (3D) gives the wave equations

utt − c^2 ∇^2 u = 0,  where ∇^2 = ∂xx + ∂yy in 2D and ∇^2 = ∂xx + ∂yy + ∂zz in 3D.

Wave equations describe small vibrations and wave phenomena in all sorts of physical systems, from stars to molecules.
In order to single out a unique solution, the PDE must be supplemented by additional conditions. The two main types are:

i) initial conditions

An initial condition consists of specifying the state of the system at some particular time t_0. For the diffusion equation, we would specify

u(x, t_0) = ϕ(x),

for some given function ϕ. This would for example describe the distribution of dye at time t_0. For the wave equation, we need two initial conditions, since the equation is of second order in time (same as for ODEs):

u(x, t_0) = ϕ(x),
∂t u(x, t_0) = ψ(x).
ii) boundary conditions

A boundary condition specifies the behavior of the solution on the boundary of the spatial domain. Typical examples are

i) u(0, t) = 0, u(l, t) = 0,
ii) ∂n u = 0 on the boundary,
iii) ∂n u + au = 0 on the boundary.

The first is a Dirichlet condition, the second a Neumann condition and the third a Robin condition. Here n is a normal to the curve (2D) or surface (3D) that defines the boundary of our domain, and a is a function defined on the boundary.
Let us consider the temperature profile of a piece of material covering a domain D. We want to model heat conduction in this material.

i) If the boundary of the material is kept at temperature zero, we must solve the problem

∂t u = D∇^2 u, in D,
u(x, 0) = ϕ(x), x ∈ D,
u(x, t) = 0, ∀ x ∈ ∂D.

Figure 10: n is by convention the outward pointing normal.
ii) If the material is perfectly insulated (no heat loss), we must solve the problem

∂t u = D∇^2 u, in D,
u(x, 0) = ϕ(x), x ∈ D,
∂n u(x, t) = 0, ∀ x ∈ ∂D.

The general moral here is that the choice of initial and boundary conditions is part of the modelling process, just like the equations themselves.
A model is said to be well posed if it has the following three properties:

i) Existence: There exists at least one solution that satisfies all the requirements of the model.

ii) Uniqueness: There exists no more than one solution that satisfies all the requirements of the model.

iii) Stability: The unique solution depends in a stable manner on the data of the problem: small changes in the data (boundary conditions, initial conditions, parameters in equations, etc.) lead to small changes in the solution.

Producing a formal mathematical proof that a given model is well posed can be a hard problem and belongs to the area of pure mathematics. In applied mathematics we derive the models using our best physical intuition and sound methods of approximation, and assume as a working hypothesis that they are well posed. If they are not, the models will eventually break down and in this way let us know. The way they break down is often a hint we can use to modify them such that the modified models are well posed.

We thus use an empirical approach to the question of well-posedness in applied mathematics. We will return to this topic later in the class.
5 Numerical methods
In most realistic modelling, the equations and/or the geometry of the domains are too complicated for solution by hand to be possible. In the old days this would mean that the problems were unsolvable. Today, the ever increasing memory and power of computers offer a way to solve previously unsolvable problems. Both exact methods and approximate numerical methods are available, often combined in easy to use systems like Matlab, Mathematica, Maple etc.

In this section we will discuss numerical methods. The use of numerical methods to solve PDEs does not make more theoretical methods obsolete. On the contrary, heavy use of mathematics is often required to reformulate the PDEs and write them in a form that is well suited for the architecture of the machine we will be running them on, which could be serial, shared memory, cluster etc.
Solving PDEs numerically has two main sources of error.

1) Truncation error: This is the error that arises when the partial differential equation is approximated by some (multidimensional) difference equation. Continuous space and time can not be represented in a computer, and as a consequence there will always be truncation errors. The challenge is to make them as small as possible while staying within the available computational resources.

2) Round-off error: A computer can only represent real numbers with finite precision, so each arithmetic operation introduces a small error. These errors can accumulate, and in unstable schemes they can grow and destroy the solution.
5.1 The finite difference method for the heat equation

Consider the following initial/boundary value problem for the heat equation:

ut = uxx, 0 < x < 1, t > 0,
u(0, t) = u(1, t) = 0, (41)
u(x, 0) = ϕ(x).

We discretize the problem by introducing a grid of points

x_i = hi, i = 0, ..., N,
t_n = kn, n ∈ N,

where x_0 = 0, x_N = 1 and thus h = 1/N. The solution will only be computed at the grid points.

Figure 11

Let u_i^n denote the numerical approximation to u(x_i, t_n). Evaluating the initial condition on the grid gives

u_i^0 = ϕ_i = ϕ(x_i),

and for the time derivative we have the Taylor expansions
u(x_i, t_n + k) = u_i^n + (∂u/∂t)_i^n k + (1/2)(∂^2 u/∂t^2)_i^n k^2 + O(k^3),
u(x_i, t_n − k) = u_i^n − (∂u/∂t)_i^n k + (1/2)(∂^2 u/∂t^2)_i^n k^2 + O(k^3).
Using the Taylor expansions we easily get the following expressions:

(u(x_i, t_n + k) − u_i^n)/k = (∂u/∂t)_i^n + O(k),
(u_i^n − u(x_i, t_n − k))/k = (∂u/∂t)_i^n + O(k),
(u(x_i, t_n + k) − u(x_i, t_n − k))/(2k) = (∂u/∂t)_i^n + O(k^2).
This gives us three possible difference approximations for the time derivative:

(∂u/∂t)_i^n ≈ (u_i^{n+1} − u_i^n)/k  (forward difference; the approximation has an error of order k),
(∂u/∂t)_i^n ≈ (u_i^n − u_i^{n−1})/k  (backward difference; error of order k),
(∂u/∂t)_i^n ≈ (u_i^{n+1} − u_i^{n−1})/(2k)  (center difference; error of order k^2). (42)
For the space derivative we similarly have

u(x_i + h, t_n) = u_i^n + (∂u/∂x)_i^n h + (1/2)(∂^2 u/∂x^2)_i^n h^2 + (1/6)(∂^3 u/∂x^3)_i^n h^3 + O(h^4),
u(x_i − h, t_n) = u_i^n − (∂u/∂x)_i^n h + (1/2)(∂^2 u/∂x^2)_i^n h^2 − (1/6)(∂^3 u/∂x^3)_i^n h^3 + O(h^4),

which gives

(u(x_i + h, t_n) − 2u_i^n + u(x_i − h, t_n))/h^2 = (∂^2 u/∂x^2)_i^n + O(h^2),

and thus the approximation

(∂^2 u/∂x^2)_i^n ≈ (u_{i+1}^n − 2u_i^n + u_{i−1}^n)/h^2  (center difference; error of order h^2). (43)
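The error orders in (42) and (43) are easy to verify empirically: reducing h by a factor 10 should reduce the forward difference error by a factor of about 10, and the center difference errors by a factor of about 100. A small Python check, using u(x) = sin(x) at x = 1 as an arbitrary smooth test case:

```python
import numpy as np

# Empirical check of the truncation orders in (42) and (43), using the
# smooth test function u(x) = sin(x) at x = 1.0 (an arbitrary choice).
u, du, d2u = np.sin, np.cos, lambda x: -np.sin(x)
x = 1.0

errors = {}
for h in (1e-2, 1e-3):
    forward = (u(x + h) - u(x)) / h                      # order h
    center = (u(x + h) - u(x - h)) / (2 * h)             # order h^2
    center2 = (u(x + h) - 2 * u(x) + u(x - h)) / h**2    # order h^2
    errors[h] = (abs(forward - du(x)),
                 abs(center - du(x)),
                 abs(center2 - d2u(x)))
    print(h, errors[h])

# Reducing h by a factor of 10 shrinks the forward difference error by
# about 10, and the two center difference errors by about 100.
```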
Using the forward difference in time and the center difference in space in (41) gives

(u_i^{n+1} − u_i^n)/k = (u_{i+1}^n − 2u_i^n + u_{i−1}^n)/h^2.

Define s = k/h^2. Then the problem we must solve is

u_i^{n+1} = s(u_{i+1}^n + u_{i−1}^n) + (1 − 2s)u_i^n, i = 1, ..., N − 1,
u_0^n = u_N^n = 0, (44)
u_i^0 = ϕ_i.
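The scheme (44) is straightforward to implement. Below is a minimal Python sketch; the tent-shaped initial condition is one concrete choice, and any ϕ with ϕ(0) = ϕ(1) = 0 will do.

```python
import numpy as np

# A minimal Python implementation of the explicit scheme (44) for
# u_t = u_xx on [0, 1] with homogeneous Dirichlet boundary conditions.
# The tent-shaped initial condition is one concrete choice; any phi
# with phi(0) = phi(1) = 0 will do.
N = 50
h = 1.0 / N
s = 0.49                    # stability will turn out to require s <= 1/2
k = s * h**2                # time step, since s = k/h^2

x = np.linspace(0.0, 1.0, N + 1)
u = np.where(x < 0.5, 2 * x, 2 * (1 - x))      # phi(x)

for n in range(100):
    # update the interior points; u[0] and u[N] stay 0
    u[1:N] = s * (u[2:] + u[:N-1]) + (1 - 2 * s) * u[1:N]

print(u.max())              # the profile decays as t grows
```

Note that the right hand side of the vectorized update is evaluated in full before the assignment, so all interior points are updated simultaneously, exactly as (44) requires.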
Figure 12

The exact solution is found by separation of variables, which we will discuss later in this class:

u(x, t) = (8/π^2) Σ_{m=1}^∞ (1/m^2) sin(mπ/2) e^{−m^2 π^2 t} sin(mπx).
In order to compute the numerical solution, we must choose values for the space-time discretization parameters h and k. In figures 13 and 14, we plot the initial condition, the exact solution and the numerical solution for h = 0.02, thus N = 50, and k = h^2 s = 0.0004s, for s = 0.49 and s = 0.51. We use 100 iterations in n.

It is clear from figure 14 that the numerical solution for s = 0.51 is very bad. This is an example of a numerical instability. Numerical instability is a constant danger when we compute numerical solutions to PDEs. For a simple equation like the heat equation we can understand this instability, why it appears and how to protect against it.
Let us try to solve the difference equation in (44) by separating variables. We thus seek a solution of the form

u_j^n = X_j T_n.

Figure 13: In this figure we have s = 0.49.
Inserting this into the difference equation gives

X_j T_{n+1} = s(X_{j+1} T_n + X_{j−1} T_n) + (1 − 2s)X_j T_n,
⇓
T_{n+1}/T_n = (1 − 2s) + s (X_{j+1} + X_{j−1})/X_j.

Both sides must be equal to the same constant ξ. This gives us two difference equations coupled through the separation constant ξ:

1) T_{n+1} = ξ T_n,
2) s (X_{j+1} + X_{j−1})/X_j + (1 − 2s) = ξ.
Equation 1) has the general solution

T_n = ξ^n T_0.

The boundary conditions u_0^n = u_N^n = 0 translate into the conditions

X_0 = X_N = 0, (45)

for equation 2). We look for solutions of the form X_j = z^j. This implies that

X_{j−1} = z^{−1} X_j,
X_{j+1} = z X_j,
⇓
ξ = s(z + z^{−1}) + (1 − 2s),
⇕
z^2 + (1/s − 2 − ξ/s)z + 1 = 0. (46)
This equation has two solutions z_1 and z_2, which we in the following assume to be different. (Show that the case z_1 = z_2 does not lead to a solution of the boundary value problem (45).)

The general solution of (45) is thus

X_j = A z_1^j + B z_2^j.

The general solution must satisfy the boundary conditions. This implies that

X_0 = 0 ⇔ A + B = 0,
X_N = 0 ⇔ A z_1^N + B z_2^N = 0.
Nontrivial solutions exist only if

(z_1/z_2)^N = 1. (47)

This condition can not be satisfied if z_1 ≠ z_2 are real. Since equation (46) has real coefficients, we must have

z_1 = z = ρe^{iθ},
z_2 = z̄ = ρe^{−iθ}.

Condition (47) then becomes

e^{2iθN} = 1 ⇔ θ_r = πr/N, r = 1, 2, ..., N − 1.
The corresponding solutions of the boundary value problem (45) are

X_j^{(r)} = A(e^{iθ_r j} − e^{−iθ_r j}) = 2iA sin(θ_r j).

Choosing A = 1/(2i), we get

X_j^{(r)} = sin(rπj/N), r = 1, 2, ..., N − 1.
It is easy to verify that

c_1 X_j^{(1)} + · · · + c_{N−1} X_j^{(N−1)} = 0, ∀j
⇓
c_1 = c_2 = · · · = c_{N−1} = 0.

The vectors {X_j^{(r)}}_{r=1}^{N−1} are thus linearly independent, and indeed form a basis, because R^{N−1} has dimension N − 1.
Each vector X_j^{(r)} gives a separated solution

(u_r)_j^n = ξ_r^n sin(rπj/N),

to the boundary value problem

u_i^{n+1} = s(u_{i+1}^n + u_{i−1}^n) + (1 − 2s)u_i^n, i = 1, ..., N − 1,
u_0^n = u_N^n = 0,

where

ξ_r = (1 − 2s) + s(e^{iθ_r} + e^{−iθ_r}) = 1 − 2s(1 − cos(rπ/N)).
These solutions give exponential growth if |ξ_r| > 1 for some r. Thus the requirement for stability is

|ξ_r| ≤ 1, ∀r.

If this holds, the numerical scheme is stable. Now

−1 ≤ cos(rπ/N) ≤ 1, ∀r,
⇕
0 ≤ 1 − cos(rπ/N) ≤ 2, ∀r.

Therefore −1 ≤ ξ_r ≤ 1 for all r if

1 − 4s ≥ −1,
⇕
s ≤ 1/2.

This is the stability condition indicated by the numerical test illustrated in figures 13 and 14.
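The stability condition s ≤ 1/2 can be confirmed by computing the amplification factors ξ_r directly. A small Python check with N = 50, for the two values of s used in the numerical test above:

```python
import numpy as np

# The amplification factors xi_r = 1 - 2 s (1 - cos(r pi / N)) of the
# scheme (44), evaluated for the two values of s used in the numerical
# test: s = 0.49 (stable) and s = 0.51 (unstable).
N = 50
r = np.arange(1, N)

largest = {}
for s in (0.49, 0.51):
    xi = 1 - 2 * s * (1 - np.cos(r * np.pi / N))
    largest[s] = np.abs(xi).max()
    print(s, largest[s])
```

For s = 0.49 all factors satisfy |ξ_r| ≤ 1, while for s = 0.51 the most oscillatory modes (r close to N) have |ξ_r| > 1 and grow exponentially, which is exactly the grid-scale oscillation seen in figure 14.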
5.2 The Crank-Nicolson scheme

A more general scheme for the heat equation is the Q-scheme,

(u_j^{n+1} − u_j^n)/k = (1 − Q)(δ^2 u)_j^n + Q(δ^2 u)_j^{n+1}, (48)

where 0 ≤ Q ≤ 1 and (δ^2 u)_j^n = (u_{j+1}^n − 2u_j^n + u_{j−1}^n)/h^2 denotes the center difference in space. When Q = 0, we get the explicit scheme we have discussed. For Q > 0 the scheme is implicit: at each step we must solve a linear system.

We analyze the stability of the scheme by inserting a separated solution of the form
u_j^n = (ξ_r)^n e^{i2rπj/N}. (49)

Note: This separated solution will not satisfy Dirichlet boundary conditions, but periodic boundary conditions, u_{j+N}^n = u_j^n. In this type of stability analysis we disregard the actual boundary conditions of the problem, using instead periodic boundary conditions. A numerical scheme is von Neumann stable if

|ξ_r| ≤ 1, ∀r.

There are other notions of stability where the actual boundary conditions are taken into account, but we will not discuss them in this course. Usually von Neumann stability gives a good indicator of stability, also when the actual boundary conditions are used.
If we insert (49) into the Q-scheme, we find that |ξ_r| ≤ 1 is equivalent to the condition

s(1 − 2Q)(1 − cos(2rπ/N)) ≤ 1.

This is always true for 1 − 2Q ≤ 0. Thus if 1/2 ≤ Q ≤ 1, the numerical scheme is stable for all choices of grid parameters k and h. The price we pay is that the scheme is implicit.
For 1/2 ≤ Q ≤ 1 we say that the Q-scheme is unconditionally stable. For Q = 1/2, the Q-scheme is called the Crank-Nicolson scheme.
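As a sketch of how the implicit Q-scheme can be implemented, the Python code below builds the second difference operator as a dense matrix and solves the linear system at each step; a production code would use a tridiagonal solver instead. The time step is deliberately chosen with s = k/h^2 = 10, far beyond the explicit limit s ≤ 1/2, to illustrate the unconditional stability of Crank-Nicolson. The initial condition sin(πx) is an arbitrary test choice.

```python
import numpy as np

# A dense-matrix sketch of the Q-scheme (48) for u_t = u_xx with
# homogeneous Dirichlet conditions; Q = 1/2 is Crank-Nicolson.  The time
# step gives s = k/h^2 = 10, far beyond the explicit limit s <= 1/2,
# to illustrate unconditional stability.  The initial condition sin(pi x)
# is an arbitrary test choice.
N, Q = 50, 0.5
h = 1.0 / N
k = 10 * h**2

# second difference operator delta^2 on the interior points x_1 .. x_{N-1}
A = (np.diag(-2.0 * np.ones(N - 1)) +
     np.diag(np.ones(N - 2), 1) +
     np.diag(np.ones(N - 2), -1)) / h**2

I = np.eye(N - 1)
x = np.linspace(0.0, 1.0, N + 1)[1:-1]
u = np.sin(np.pi * x)

for n in range(50):
    # (I - k Q A) u^{n+1} = (I + k (1-Q) A) u^n
    u = np.linalg.solve(I - k * Q * A, (I + k * (1 - Q) * A) @ u)

print(np.abs(u).max())      # decays smoothly despite the large time step
```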
Observe that when N is large, the angles

θ_r = 2πr/N, r = 1, 2, ..., N,

form a very dense set of points in the angular interval [0, 2π]. Thus, in the argument above we might for simplicity take θ to be a continuous variable in the interval [0, 2π]. We will therefore in the following investigate von Neumann stability by inserting

u_j^n = ξ^n η^j,

where η = e^{iθ}, into the numerical scheme of interest. Verify that using this simplified approach leads to the same stability conditions for the two schemes (44) and (48).
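Carrying out this substitution for the Q-scheme, a short computation gives ξ(θ) = (1 − β(1 − Q))/(1 + βQ) with β = 2s(1 − cos θ), and Q = 0 reproduces the amplification factor of the explicit scheme (44). The Python sketch below scans θ over [0, 2π] and confirms the stability conditions numerically; the parameter values are the ones used in the tests above.

```python
import numpy as np

# Von Neumann stability check with theta as a continuous variable.
# Inserting u_j^n = xi^n eta^j, eta = exp(i theta), into the Q-scheme (48),
# a short computation gives
#   xi(theta) = (1 - beta (1 - Q)) / (1 + beta Q),  beta = 2 s (1 - cos theta),
# and Q = 0 reproduces the explicit scheme (44).
theta = np.linspace(0.0, 2 * np.pi, 1000)

def max_amplification(s, Q):
    beta = 2 * s * (1 - np.cos(theta))
    xi = (1 - beta * (1 - Q)) / (1 + beta * Q)
    return np.abs(xi).max()

print(max_amplification(0.49, 0.0))   # explicit scheme, s < 1/2: stable
print(max_amplification(0.51, 0.0))   # explicit scheme, s > 1/2: unstable
print(max_amplification(10.0, 0.5))   # Crank-Nicolson, any s: stable
```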
One possible discretization for the boundary conditions is to use forward dier-
ence at x=0 and backward dierence at x = 1.
un1− un0
= f n,
h
unN − unN −1
= gn .
h
These approximations are only of order h, whereas the space derivative in the equation was approximated to second order, h². We get second order at the boundary as well by using center differences. However, we then need to extend our grid by ghost points x_{−1} and x_{N+1}:

(u_x)_0 = (u_1 − u_{−1})/(2h),   (u_x)_N = (u_{N+1} − u_{N−1})/(2h).

Figure 15
Thus our numerical scheme is

u^{n+1}_i = s(u^n_{i+1} + u^n_{i−1}) + (1 − 2s)u^n_i,   i = 0, . . . , N,
u^n_{−1} = u^n_1 − 2h f^n,
u^n_{N+1} = u^n_{N−1} + 2h g^n.
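The ghost-point scheme above can be sketched in a few lines of Python. This is an illustrative implementation, not part of the notes; the insulated case f = g = 0 and the test profile cos(πx) are choices made here for the sake of a simple check.

```python
import numpy as np

def heat_neumann_step(u, s, h, f=0.0, g=0.0):
    """One step of the explicit scheme with Neumann data u_x(0) = f,
    u_x(1) = g, implemented through the ghost points u_{-1}, u_{N+1}."""
    left = u[1] - 2.0 * h * f            # ghost value u_{-1}
    right = u[-2] + 2.0 * h * g          # ghost value u_{N+1}
    v = np.concatenate(([left], u, [right]))
    return s * (v[2:] + v[:-2]) + (1.0 - 2.0 * s) * v[1:-1]

N = 50
h = 1.0 / N
s = 0.4                                  # stable for the explicit scheme
x = np.linspace(0.0, 1.0, N + 1)
u = np.cos(np.pi * x)                    # satisfies u_x = 0 at both ends

for _ in range(4000):                    # insulated ends: f = g = 0
    u = heat_neumann_step(u, s, h)

# The exact solution cos(pi x) e^{-pi^2 t} decays toward zero, while the
# mean (the total heat, zero here) is preserved by the insulated ends.
print(np.max(np.abs(u)) < 0.01)          # True
print(abs(np.mean(u)) < 1e-8)            # True
```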
x_j = hj,   j = 0, . . . , N,
t_n = kn,   n ∈ N,

where x_0 = 0, x_N = L, and thus h = L/N.
The wave equation is required to hold at the grid points only:

(u^{n+1}_j − 2u^n_j + u^{n−1}_j)/k² = (u^n_{j+1} − 2u^n_j + u^n_{j−1})/h²,

⇕
Figure 16
u^{n+1}_j = s(u^n_{j+1} + u^n_{j−1}) + 2(1 − s)u^n_j − u^{n−1}_j,   (50)

where s = k²/h². The boundary conditions evaluated on the grid simply become

u^n_0 = 0,   u^n_N = 0.
Note that the difference equation (50) is of order 2 in n and j. In order to solve the difference equation and compute the solution at time level n + 1, we need to know the values on time levels n and n − 1. Thus to compute u^2_j, we need u^1_j and u^0_j. These values are provided by the initial conditions

u(x, 0) = ϕ(x),
u_t(x, 0) = ψ(x).

Evaluated on the grid these are

u^0_j = ϕ_j,
(u_t)^0_j = ψ_j.

We now need a difference approximation for u_t at time level n = 0. Since the scheme (50) is of second order in t and x, we should use a second order approximation for u_t. If we use a first order approximation for u_t in the initial condition, the whole numerical scheme is of first order. Using a centered difference in time, we get
(u^1_j − u^{−1}_j)/(2k) = ψ_j.   (51)

We have introduced a ghost point t_{−1} in time with associated u value u^{−1}_j. If we insert n = 0 in (50), we get a second identity involving u^{−1}_j,

u^1_j = s(u^0_{j+1} + u^0_{j−1}) + 2(1 − s)u^0_j − u^{−1}_j.   (52)

We can use (51) and (52) to eliminate the ghost point and find, in conjunction with u^0_j = ϕ_j, that

u^0_j = ϕ_j,
u^1_j = (1/2)s(ϕ_{j+1} + ϕ_{j−1}) + (1 − s)ϕ_j + kψ_j.
These are the initial values for the scheme (50). We now test the numerical scheme using the initial conditions

ϕ(x) = e^{−a(x − L/2)²},
ψ(x) = 0.

The exact solution to this problem,

u(x, t) = (1/2)(ϕ(x + t) + ϕ(x − t)),

is found using a method that we will discuss later in this class. For the test we use L = 20 and choose the parameter a so large that ϕ(x) to a good approximation satisfies the boundary conditions for the wave equation. In figures (17) and (18) we plot the initial condition, the exact solution and the numerical solution on the interval (0, L). The space-time grid parameters are h = 0.1, thus N = 200, and k = h√s, for s = 0.9 in figure 17 and s = 1.1 in figure 18. We use 60 iterations in n.
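A minimal Python version of this test is sketched below (an illustration, not the notes' own code). We run more iterations than the 60 used in the figures so that the instability for s = 1.1, which is seeded by rounding noise in the high-frequency modes, becomes unmistakable.

```python
import numpy as np

def wave_solve(L, h, s, steps, a=4.0):
    """Run scheme (50) for the wave equation with u = 0 at both ends,
    starting from the Gaussian phi and psi = 0 via the formulas above."""
    N = int(round(L / h))
    x = np.linspace(0.0, L, N + 1)
    phi = np.exp(-a * (x - L / 2.0) ** 2)
    u_prev = phi.copy()                                  # u^0
    u = np.zeros_like(phi)                               # u^1
    u[1:-1] = 0.5 * s * (phi[2:] + phi[:-2]) + (1.0 - s) * phi[1:-1]
    for _ in range(steps - 1):
        u_new = np.zeros_like(u)                         # keeps u = 0 at ends
        u_new[1:-1] = (s * (u[2:] + u[:-2])
                       + 2.0 * (1.0 - s) * u[1:-1] - u_prev[1:-1])
        u_prev, u = u, u_new
    return u

stable = wave_solve(L=20.0, h=0.1, s=0.9, steps=200)
unstable = wave_solve(L=20.0, h=0.1, s=1.1, steps=200)
print(np.max(np.abs(stable)) < 2.0)      # True: bounded, as in figure 17
print(np.max(np.abs(unstable)) > 100.0)  # True: blow-up, as in figure 18
```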
Figure 17
Figure 18
It appears as if the scheme is stable for s ≤ 1 and unstable for s > 1. Let us prove this using the Von Neumann approach. Inserting

u^n_j = η^j ξ^n,   η = e^{iθ},

into (50) and dividing by ξ^n η^j, we get

ξ + 1/ξ − 2 = s(η + 1/η − 2) = 2s(cos θ − 1).

Define p = s(cos θ − 1). Then the equation can be written as

ξ² − 2(1 + p)ξ + 1 = 0.
The roots are

ξ_± = (1 + p) ± ((1 + p)² − 1)^{1/2},

and both roots satisfy |ξ_±| ≤ 1 exactly when −2 ≤ p ≤ 0. We therefore have stability if p > −2, and it is easy to verify that the scheme is also stable for p = −2. Taking into account the definition of p, we thus have stability if

s(cos θ − 1) ≥ −2,
⇕
s ≤ 2/(1 − cos θ).

This holds for all θ if s ≤ 1, because clearly 2/(1 − cos θ) ≥ 1. Thus we have stability if

s ≤ 1,

which is what our numerical test indicated.
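The quadratic for ξ can also be checked numerically; the sketch below (an illustration, not part of the notes) computes its roots at the worst-case angle θ = π, where p = −2s.

```python
import numpy as np

def wave_amplification(s, theta):
    """Roots of xi^2 - 2(1 + p) xi + 1 = 0, with p = s (cos(theta) - 1)."""
    p = s * (np.cos(theta) - 1.0)
    return np.roots([1.0, -2.0 * (1.0 + p), 1.0])

theta = np.pi                     # worst case: p = -2s
print(np.allclose(np.abs(wave_amplification(0.9, theta)), 1.0))  # True
print(np.max(np.abs(wave_amplification(1.1, theta))) > 1.0)      # True
```

For s = 0.9 both roots lie on the unit circle, while for s = 1.1 one root has modulus larger than one, in agreement with the analysis above.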
Figure 19
Using center differences for u_xx and u_yy, we get the scheme

u_{i,j} = (1/4)(u_{i+1,j} + u_{i−1,j} + u_{i,j+1} + u_{i,j−1}),   1 ≤ i, j ≤ N − 1.   (53)

Figure 20

Observe that the value in the center is simply the mean of the values of all closest neighbors. The stencil of the scheme is shown in figure (20).
There is no iteration process in the scheme (53) (there is no time!). It is a linear system in the (N − 1)(N − 1) unknowns at the interior gridpoints. We are
thus looking at a linear system of equations
Ax = b, (54)
where b comes from the boundary values. Observe however, that the linear
system (53) is not actually written in the form (54). In order to apply linear
algebra to solve (53), we must write (53) in the form (54). The problem that arises at this point is that the interior gridpoints in (53) are not ordered, whereas
the vector components in (54) are ordered. We must thus choose an ordering of
the gridpoints. Each choice will lead to a dierent structure for the matrix A (for
instance which elements are zero and which are nonzero). Since the efficiency of
iterative methods depends on the structure of the matrix, this is a point where
we can be clever (or stupid!).
For the problem at hand it is natural to order the interior gridpoints lexicographically.
The b vector is obtained from the values of u at the boundary. For example
when j = 1, we have

u_{i,1} = (1/4)(u_{i+1,1} + u_{i−1,1} + u_{i,2} + u_{i,0}),   1 ≤ i ≤ N − 1.
The boundary conditions give:

i = 1:
4u_{1,1} = u_{2,1} + u_{0,1} + u_{1,2} + u_{1,0}
⇕
−4u_{1,1} + u_{1,2} + u_{2,1} = −f_{0,1} − f_{1,0},

1 < i < N − 1:
4u_{i,1} = u_{i+1,1} + u_{i−1,1} + u_{i,2} + u_{i,0}
⇕
u_{i−1,1} − 4u_{i,1} + u_{i+1,1} + u_{i,2} = −f_{i,0},

i = N − 1:
4u_{N−1,1} = u_{N,1} + u_{N−2,1} + u_{N−1,2} + u_{N−1,0}
⇕
u_{N−2,1} − 4u_{N−1,1} + u_{N−1,2} = −f_{N,1} − f_{N−1,0}.
Thus the matrix for the system has a structure that starts off as

−4   1   0  ···  0 | 1  0  ···  0 | 0  ···
 1  −4   1  ···  0 | 0  1  ···  0 | 0  ···
 ⋮                                      ⋱

where the column blocks have width N − 1 and the full matrix has (N − 1)(N − 1) rows and columns.
By writing down a larger part of the linear system we eventually realize that the matrix of the system has the form of a (N − 1) × (N − 1) block matrix

     B  I  0  ···  ···  0
     I  B  I   0   ···  0
A =        ⋱    ⋱    ⋱     ,
     0  ···  ···  ···  I  B

where I is the (N − 1) × (N − 1) identity and where B is the following (N − 1) × (N − 1) tri-diagonal matrix
     −4   1   0  ···  ···  ···   0
      1  −4   1   0   ···  ···   0
B =   0   1  −4   1    0   ···  ···  .
               ⋱    ⋱    ⋱
      0  ···  ···  ···  ···  1  −4
We observe that most elements in the matrices A and B are zero; we say the matrices are sparse. The structural placement of the nonzero elements is the sparsity structure of the matrix. Finite difference methods for PDEs usually lead to sparse matrices with a simple sparsity structure.
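With Kronecker products the block structure of A is easy to build explicitly; the NumPy sketch below (an illustration of the lexicographic ordering described above, not part of the notes) constructs A for N = 10 and counts its nonzero entries.

```python
import numpy as np

def laplace_matrix(N):
    """Assemble A for the five point Laplace scheme with lexicographic
    ordering, using Kronecker products: A = I (x) B + T (x) I."""
    n = N - 1
    B = -4.0 * np.eye(n) + np.eye(n, k=1) + np.eye(n, k=-1)
    T = np.eye(n, k=1) + np.eye(n, k=-1)    # carries the identity blocks
    return np.kron(np.eye(n), B) + np.kron(T, np.eye(n))

A = laplace_matrix(10)
print(A.shape)                       # (81, 81)
print(int(np.count_nonzero(A)))      # 369 nonzero entries out of 81*81 = 6561
```

For large N one would of course store A in a sparse format rather than as a dense array.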
When N is large, as it often is, the linear system (54) represents an enormous computational challenge. In 3D with 1000 points in each direction, we get a linear system with 10^9 independent variables, and the matrix consists of 10^18 entries!
There are two broad classes of methods available for solving large linear systems of P equations in P unknowns: direct methods, like Gauss elimination, and iterative methods, which generate a sequence of approximations

x_{n+1} = M x_n + c.
The area of iterative solution methods for linear systems is vast and we can not
possibly give an overview of the available methods and issues in an introductory
class on PDEs. However, it is appropriate to discuss some of the central ideas.
5.4.1 Iterations in Rn
Let u⁰ ∈ Rⁿ be a given vector and consider the following iteration

u^{l+1} = M u^l + c.

If the sequence u^l converges to a limit u*, then

u* = M u* + c,
⇓
(I − M )u* = c.
Thus u* solves a linear system. The idea behind iterative solution of linear systems can be described as follows:
Find a matrix M and a vector c such that:
i) the system Au = f is equivalent to (I − M )u = c;
ii) the iteration u^{l+1} = M u^l + c converges.
The condition i) can be realized in many ways, and one seeks choices of M and c such that the convergence is fast.
In order to get anywhere, it is at this point evident that we need to be more precise about convergence in Rⁿ. This we do by introducing a vector norm on Rⁿ that measures the distance between vectors. We will also need a corresponding norm on n × n matrices.
An n × n matrix is a linear operator on Rⁿ. Let us assume that Rⁿ comes with a norm ‖·‖ defined. There are many such norms in use for Rⁿ. The following three norms are frequently used:
‖x‖_2 = (Σ_{i=1}^n |x_i|²)^{1/2},
‖x‖_1 = Σ_{i=1}^n |x_i|,
‖x‖_∞ = sup_i |x_i|.
For matrix norms corresponding to vector norms we have the following two
important inequalities.
‖Ax‖ ≤ ‖A‖ ‖x‖,
‖AB‖ ≤ ‖A‖ ‖B‖.
For ‖·‖_1 and ‖·‖_∞, one can show that

‖A‖_∞ = max_i Σ_j |a_{i,j}|,
‖A‖_1 = max_j Σ_i |a_{i,j}|.
Define the spectral radius of A by ρ(A) = max_s |λ_s|, where λ_s are the eigenvalues of A. For the ‖·‖_2 norm we have the following identity

‖A‖_2 = (ρ(AA*))^{1/2},

where A* is the adjoint of A. For the special case when A is selfadjoint, A = A*, we have

spectrum(A²) = {λ² | λ ∈ spectrum(A)},
⇓
‖A‖_2 = (ρ(AA*))^{1/2} = (ρ(A²))^{1/2} = (max_s |λ_s|²)^{1/2} = max_s |λ_s| = ρ(A).
Consider now the iteration

u^{l+1} = M u^l + c,

and let us use the ‖·‖_2 norm on Rⁿ with the associated norm ‖·‖_2 on n × n matrices. The iteration will then converge to u* in Rⁿ if and only if

lim_{l→∞} ‖u^l − u*‖_2 = 0.

But

‖u^l − u*‖_2 = ‖M u^{l−1} + c − (M u* + c)‖_2 = ‖M (u^{l−1} − u*)‖_2 ≤ ‖M‖_2 ‖u^{l−1} − u*‖_2.
By induction we get

‖u^l − u*‖_2 ≤ (‖M‖_2)^l ‖u⁰ − u*‖_2,

and for a symmetric M we have ‖M‖_2 = ρ(M ). We have proved:

Theorem 1. Let M be symmetric and assume that for all eigenvalues λ of M we have

|λ| < 1.

Then the iteration

u^{l+1} = M u^l + c,

converges to the unique solution of

(I − M )u = c,

for any choice of initial vector u⁰.
5.4.2 Gauss-Jacobi
Any matrix can be written in the form

A = L + D + U,

where D is the diagonal part of A, and L and U are the strictly lower and strictly upper triangular parts. The system Au = b is then equivalent to

u = M_J u + c,

with M_J = −D^{−1}(L + U ) and c = D^{−1}b, and the Gauss-Jacobi iteration is

u^{l+1} = M_J u^l + c.
In component form the Gauss-Jacobi iteration can be written as

u^{l+1}_i = (1/a_{ii})(b_i − Σ_{j≠i} a_{ij} u^l_j),   i = 1, . . . , n.
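The component form translates directly into code. The sketch below (with a small diagonally dominant test matrix chosen purely for illustration) implements a vectorized variant of the Gauss-Jacobi iteration.

```python
import numpy as np

def gauss_jacobi(A, b, u0, iterations):
    """Gauss-Jacobi in component form:
    u_i^{l+1} = (b_i - sum_{j != i} a_{ij} u_j^l) / a_{ii}."""
    u = np.asarray(u0, dtype=float).copy()
    d = np.diag(A)
    for _ in range(iterations):
        # A @ u contains the j = i term a_{ii} u_i; adding d * u removes it.
        u = (b - A @ u + d * u) / d
    return u

# Small diagonally dominant test system (an illustration, not from the notes).
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
u = gauss_jacobi(A, b, np.zeros(3), 100)
print(np.allclose(A @ u, b))   # True
```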
Observe that

M_J = −D^{−1}(L + U ),
⇓
DM_J = −L − U = −A + D,
⇓
M_J = −D^{−1}A + I,   and   D^{−1} = −(1/4)I,
⇓
M_J = (1/4)A + I.
Let λ be an eigenvalue of A with corresponding eigenvector v. Then we have

M_J v = (1/4)Av + Iv = ((1/4)λ + 1)v.

This means that

spectrum(M_J ) = {(1/4)λ + 1 | λ ∈ spectrum(A)}.   (55)
In order to find the spectrum of A, we will start from a property of the spectrum of block matrices. Let C be a matrix of the form

     C_{11}  ···  ···  C_{1m}
C =    ⋮                  ⋮     ,
     C_{m1}  ···  ···  C_{mm}
where C_{ij} are n × n matrices. Thus C is a (nm) × (nm) matrix. Let us assume that there is a common set of eigenvectors for the matrices C_{ij}. Denote this set by {v^k}_{k=1}^n. Thus we have

C_{ij} v^k = λ^k_{ij} v^k,

where λ^k_{ij} is the eigenvalue of the matrix C_{ij} corresponding to the eigenvector v^k.
Let α_1, . . . , α_m be real numbers. We will seek eigenvectors of C of the form

        α_1 v^k
        α_2 v^k
w^k =     ⋮      .
        α_m v^k
Then w^k is an eigenvector of C with associated eigenvalue µ^k only if

C w^k = µ^k w^k,
⇕
(λ^k_{11} α_1 + · · · + λ^k_{1m} α_m)v^k = µ^k α_1 v^k,
⋮
(λ^k_{m1} α_1 + · · · + λ^k_{mm} α_m)v^k = µ^k α_m v^k.

The eigenvalues of C are thus given by the eigenvalues of the matrix

        λ^k_{11}  . . .  λ^k_{1m}
Λ^k =      ⋮                ⋮       .
        λ^k_{m1}  . . .  λ^k_{mm}
Consider now a tri-diagonal matrix D with a on the diagonal and b on the two off-diagonals, and let

Dv = λv,
⇕
b v_{i−1} + a v_i + b v_{i+1} = λ v_i,   i = 1, . . . , n,

with v_0 = v_{n+1} = 0. This is a boundary-value problem for a difference equation of order 2. Inserting v_i = r^i, we can write the equation in the form

r² + ((a − λ)/b) r + 1 = 0,   (56)
⇓
r_± = (1/2b)(−(a − λ) ± ((a − λ)² − 4b²)^{1/2}).
If r_+ ≠ r_−, the general solution of the difference equation is

v_i = A r_+^i + B r_−^i,

and the boundary conditions give

v_0 = 0 ⇔ A + B = 0,
v_{n+1} = 0 ⇔ A r_+^{n+1} + B r_−^{n+1} = 0.

Nontrivial solutions exist only if

r_+^{n+1} − r_−^{n+1} = 0,
⇕
(r_+/r_−)^{n+1} = 1.   (57)

But r_+ r_− = 1 ⇒ r_− = 1/r_+, and therefore from equation (57)

(r_+²)^{n+1} = 1,
⇕
r_+ = e^{πik/(n+1)},
⇓
r_− = e^{−πik/(n+1)}.

Furthermore, from equation (56) we have

(a − λ)/b = −(r_+ + r_−),
⇕
λ = a + b(r_+ + r_−),
⇕
λ = a + 2b cos(πk/(n+1)).   (58)
The eigenvalues of the tri-diagonal matrix D are thus

λ_k = a + 2b cos(πk/(n+1)),   k = 1, . . . , n.

Returning to our analysis of the matrix A, we can now assemble our matrix Λ_k:

        λ_k   1    0   . . .   0
         1   λ_k   1   . . .   0
Λ_k =             ⋱               ,
         0   . . .      1    λ_k

where λ_k = −4 + 2 cos(πk/N ), k = 1, . . . , N − 1. Λ_k is also tri-diagonal, and the previous theory for tri-diagonal matrices gives us the eigenvalues

λ_{kl} = λ_k + 2 cos(πl/N ),   l = 1, . . . , N − 1,
⇓
λ_{kl} = −4 + 2 cos(πk/N ) + 2 cos(πl/N ),

or, using standard trigonometric identities,

λ_{kl} = −4(sin²(πk/2N ) + sin²(πl/2N )),   k = 1, . . . , N − 1,  l = 1, . . . , N − 1.
By (55), the eigenvalues of M_J are µ_{kl} = (1/4)λ_{kl} + 1, and convergence requires

|µ_{kl}| < 1,
⇕
|1 − sin²((1/2)πx_k) − sin²((1/2)πy_l)| < 1,   x_k = k/N,  y_l = l/N.

Since both x_k, y_l ∈ (0, 1), it is evident that the condition holds and that we have convergence. We get the largest eigenvalue by choosing k = l = 1. The spectral radius can then for large N be approximated as

ρ(M ) = 1 − 2 sin²((1/2)πh) ≈ 1 − (1/2)(πh)²,
where we recall that h = 1/N is the grid parameter for the grid defined in figure (19). The relative error satisfies

‖u^l − u*‖ / ‖u⁰ − u*‖ ≤ (ρ(M ))^l ≈ (1 − (1/2)π²h²)^l.
If we require the relative error to be of size h², we need

(1 − (1/2)π²h²)^l ≈ h²,
⇕
e^{−(1/2)lπ²h²} ≈ h²,
⇕
l ≈ (4/π²) N² ln N.
Gauss elimination requires in general O(N³) operations, and since for large N we have N² ln N ≪ N³, we observe that Gauss-Jacobi iteration is much faster than Gauss elimination.
5.4.3 Gauss-Seidel
We use the same decomposition of A as for the Gauss-Jacobi method, A = L + D + U:

(L + D + U )u = b,
⇓
(L + D)u = −U u + b,
⇓
u = M_S u + c,

where M_S = −(L + D)^{−1}U, c = (L + D)^{−1}b. In component form we have the iteration

u^{l+1}_i = (1/a_{ii})(b_i − Σ_{j<i} a_{ij} u^{l+1}_j − Σ_{j>i} a_{ij} u^l_j),   i = 1, . . . , n.
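A direct implementation of this component form might look as follows (a sketch; the small diagonally dominant test system is chosen here for illustration):

```python
import numpy as np

def gauss_seidel(A, b, u0, iterations):
    """Gauss-Seidel in component form: new values u^{l+1}_j are used
    as soon as they are available (for j < i)."""
    u = np.asarray(u0, dtype=float).copy()
    n = len(b)
    for _ in range(iterations):
        for i in range(n):
            u[i] = (b[i] - A[i, :i] @ u[:i] - A[i, i + 1:] @ u[i + 1:]) / A[i, i]
    return u

# Same kind of small test system as one might use for Gauss-Jacobi.
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
b = np.array([1.0, 2.0, 3.0])
u = gauss_seidel(A, b, np.zeros(3), 50)
print(np.allclose(A @ u, b))   # True
```

Note that the update overwrites u in place, which is why Gauss-Seidel needs only one copy of the solution vector.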
We will now see that Gauss-Seidel converges faster than Gauss-Jacobi. It also uses memory in a more effective way. For the stability of this method we must find the spectrum of the matrix M_S.
Note that the iteration matrix for the Gauss-Seidel method can be rewritten into the form

M_S = −(I + D^{−1}L)^{−1}D^{−1}U.

Using straightforward matrix manipulations it is now easy to verify that λ_s is an eigenvalue of the matrix M_S if and only if

det(λ_s D + λ_s L + U ) = 0.   (59)

Let now α be any number, and define a matrix E by
     I    0    ···             0
     0   αI    0    ···        0
E =            ⋱                  .
     0   ···        α^{N−2} I
Doing the matrix multiplication, it is easy to verify that we have the identity

E(D + L + U )E^{−1} = D + αL + (1/α)U,

from which we immediately can conclude

det(D + αL + (1/α)U ) = det(D + L + U ).   (60)
This derivation only requires D to be a diagonal matrix, L a lower triangular matrix and U an upper triangular matrix. Using (59) and (60), we have

0 = det(λ_s I − M_S )
  = det(λ_s D + λ_s L + U )
  = det(λ_s D + αλ_s L + (1/α)U )
  = det(λ_s D + √λ_s (L + U )),   α = 1/√λ_s
  = λ_s^{(N−1)²/2} det(µD + L + U ),   µ = √λ_s
  = λ_s^{(N−1)²/2} det(D) det(µI + D^{−1}(L + U ))
  = λ_s^{(N−1)²/2} det(D) det(µI − M_J ).
Thus, we have proved that the nonzero eigenvalues λs of the Gauss-Seidel iter-
ation matrix are determined by the eigenvalues µ of the Jacobi iteration matrix
through the formula
λ_s = µ².   (61)
Since all eigenvalues of the Jacobi iteration matrix MJ are inside the unit circle,
this formula implies that the same is true for the eigenvalues of the Gauss-
Seidel iteration matrix MS . Thus Gauss-Seidel also converges for all initial
conditions. As for the rate of convergence, using the results from the analysis of the convergence of the Gauss-Jacobi method, we find that

ρ(M_S ) ≈ (1 − (1/2)π²h²)² ≈ 1 − π²h²,

and therefore

‖u^l − u*‖ / ‖u⁰ − u*‖ = h² ⇔ (1 − π²h²)^l = h² ⇔ l ≈ (2/π²) N² ln N.
For a fixed accuracy, Gauss-Seidel only needs half as many iterations as Gauss-Jacobi.
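Formula (61), which underlies this factor-of-two comparison, can be checked numerically on a small tri-diagonal matrix, for which the derivation above applies (a sketch, with an illustrative test matrix):

```python
import numpy as np

# A small tri-diagonal test matrix (chosen for illustration).
A = np.array([[4.0, -1.0, 0.0],
              [-1.0, 4.0, -1.0],
              [0.0, -1.0, 4.0]])
D = np.diag(np.diag(A))
L = np.tril(A, -1)
U = np.triu(A, 1)
MJ = -np.linalg.inv(D) @ (L + U)       # Gauss-Jacobi iteration matrix
MS = -np.linalg.inv(L + D) @ U         # Gauss-Seidel iteration matrix

def rho(M):
    """Spectral radius: the largest eigenvalue modulus."""
    return np.max(np.abs(np.linalg.eigvals(M)))

print(np.isclose(rho(MS), rho(MJ) ** 2))   # True: rho(MS) = rho(MJ)^2
```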
These two examples have given a taste of what one needs to do in order to define iteration methods for linear systems and study their convergence.
As previously mentioned, there is an extensive set of iteration methods available for solving large linear systems of equations, and many have been implemented in numerical libraries available for most common programming languages. The choice of method depends to a large extent on the structure of your matrix. A common choice for symmetric matrices is the conjugate gradient method, and it has also been extended to more general matrices. A standard reference for iterative methods is Iterative Methods for Sparse Linear Systems by Yousef Saad.
6.1 Higher order equations and first order systems
It turns out that higher order PDEs and first order systems of PDEs are not really separate things. Usually one can be converted into the other. Whether
this is useful or not depends on the context.
Let us recall the analog situation from the theory of ODEs. The ODE for
the Harmonic Oscillator is
x00 + w2 x = 0.
Dene two new functions by
x1 (t) = x(t),
x2 (t) = x0 (t).
Then
x01 = x0 = x2 ,
x02 = x00 = −w2 x = −w2 x1 .
Thus we get a first order system of ODEs:
x01 = x2 ,
x02 = −w2 x1 .
In the theory of ODEs this is a very useful device that can be used to transform any ODE of order n into a first order system of ODEs in Rⁿ.
Note that the transition from one high order ODE to a system of first order ODEs is not by any means unique. If we for the Harmonic Oscillator rather define

x_1 = x + x′,
x_2 = x − x′,

we get a different first order system. Let us now apply the same idea to the wave equation

u_tt − γ²u_xx = 0.
Observe that

(∂_t + γ∂_x)(∂_t − γ∂_x)u = u_tt − γ²u_xx.

Inspired by this, let us define a function v = u_t − γu_x satisfying

(∂_t + γ∂_x)v = 0.

The wave equation is then equivalent to the first order system

u_t − γu_x = v,
v_t + γv_x = 0.
Observe that the second equation is decoupled from the first. We will later in this class see that this can be used to solve the wave equation.
Writing a higher order PDE as a first order system is not always useful.
Observe that
(∂x + i∂y )(∂x − i∂y )u = ∂xx u + ∂yy u.
Thus we might write the Laplace equation as the system
ux − iuy = v,
vx + ivy = 0.
However, this is not a very useful system, since we most of the time are looking for real solutions to the Laplace equation. If we rather put

v = u_x,
w = u_y,

then, using the Laplace equation and assuming equality of mixed partial derivatives, we have
v_y = u_xy = u_yx = w_x,
v_x = u_xx = −u_yy = −w_y,
⇓
v_y − w_x = 0,
v_x + w_y = 0.   (62)
These are the famous Cauchy-Riemann equations. If you have studied complex function theory you know that (62) is the key to the whole subject; the equations determine the complex differentiability of the corresponding complex function.
(62) is a coupled system and does not on its own make it any simpler to find solutions to the Laplace equation. However, by using the link between complex function theory and the Laplace equation, great insight into its set of solutions can be gained.
Going from first order systems to higher order equations is not always possible, but sometimes very useful.
A famous case is the Maxwell equations for electrodynamics, which in a
region containing no sources of currents and charge are
∇ × E + ∂t B = 0,
∇ × B − µ0 ε0 ∂t E = 0,
(63)
∇ · E = 0,
∇ · B = 0,
µ0 ε0 ∂t E = ∇ × B,
⇓
µ0 ε0 ∂tt E = ∇ × ∂t B = ∇ × (−∇ × E),
⇓
µ0 ε0 ∂tt E + ∇ × ∇ × E = 0,
and
∇ × ∇ × E = ∇(∇ · E) − ∇2 E = −∇2 E,
⇓
ε0 µ0 ∂tt E − ∇2 E = 0.
This is the 3D wave equation, and it predicts, as we will see later in this course, that there are solutions describing electromagnetic disturbances traveling at a speed v = 1/√(ε_0 µ_0) in source free regions (for example in a vacuum). The parameters ε_0, µ_0 are constants of nature that were known to Maxwell, and when he calculated v, he found that the speed of these disturbances was very close to the known speed of light.
He thus conjectured that light is an electromagnetic disturbance. This was a monumental discovery whose ramifications we are still exploring today.
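We can repeat Maxwell's calculation with modern constants; the values below are the standard SI values (assumed here, not taken from the notes).

```python
import math

# Modern SI values of the vacuum permittivity and permeability.
eps0 = 8.8541878128e-12        # F/m
mu0 = 4.0e-7 * math.pi         # H/m (the classical defined value)

v = 1.0 / math.sqrt(eps0 * mu0)
print(abs(v - 299792458.0) < 1.0)   # True: v matches the speed of light in m/s
```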
where in general a, b, c and d depend on x and t. The solution method for (64) is based on a geometrical interpretation of the equation.
Observe that the two functions a(x, t) and b(x, t) can be thought of as the two components of a vector field f (x, t) in the plane.
Figure 22
Thus we have

dγ/ds = f (γ(s)),
⇕
dx/ds = a,   (65)
dt/ds = b.
Let v(x, t) be a solution of (64) and define v(s) = v(x(s), t(s)). Then

dv(s)/ds = c(x(s), t(s))v(s) + d(x(s), t(s)).   (66)
From (65) and (66) we get a coupled system of ODEs

dx/ds = a,
dt/ds = b,   (67)
dv/ds = cv + d.
The general theory of ODEs tells us that through each point (x_0, t_0, v_0) there goes exactly one solution curve of (67) if certain regularity conditions on a, b, c and d are satisfied. The family of curves that solves (67) are called the characteristic curves of the equation. Their projections into the (x, t) plane, which solve

dx/ds = a,
dt/ds = b,

are called characteristic base curves. They are also, somewhat imprecisely, called characteristic curves.
Most of the time we are not interested in the general solution of (64), but rather in special solutions that in addition to (64) have given values on some curve C in the (x, t) plane. Any such problem is called an initial value problem. The simplest case is when C is the x-axis, so that the initial condition takes the form

v(x, 0) = ϕ(x).
Let (x(τ ), t(τ )) be a parametrization of C. Then our generalized notion of initial value problem for (64) consists of imposing the condition

v(x(τ ), t(τ )) = ϕ(τ ),

where ϕ(τ ) is a given function. Our previous notion of initial value problem corresponds to the choice

x(τ ) = τ,
t(τ ) = 0,

since this is a parametrization of the x-axis in the (x, t) plane. Thus, in summary, the (generalized) initial value problem for (64) consists of finding functions
x = x(s, τ ),
t = t(s, τ ),   (68)
v = v(s, τ ),

such that

dx/ds = a,   x(0, τ ) = x(τ ),
dt/ds = b,   t(0, τ ) = t(τ ),   (69)
dv/ds = cv + d,   v(0, τ ) = ϕ(τ ),
where (x(τ ), t(τ )) parametrizes the initial curve and ϕ(τ ) are the given values of v on the initial curve. The reader might at this point object and ask how we can possibly consider (68), solving (69), to be a solution of our original problem (64). Where is my v(x, t), you might ask? This is after all what we started out to find.
In fact, the function v(x, t), solving our equation (64), is implicitly contained
in (68).
We can in principle use the first two equations in (68) to express s and τ as functions of x and t:

x = x(s, τ ),
t = t(s, τ ),
⇕   (70)
s = s(x, t),
τ = τ (x, t).
It is known that this can be done locally, close to a point (s_0, τ_0), if the determinant of the Jacobian matrix at (s_0, τ_0) (this is what is known as the Jacobian in calculus) is nonzero.
Thus, if the Jacobian is nonzero for all points on the initial curve, the initial value problem for (64) has a unique solution for any specification of ϕ on the initial curve. This solution is

v(x, t) = v(s(x, t), τ (x, t)).
Actually finding a formula for the solution v(x, t) using this method, which is called the method of characteristics, can be hard. This is because in general it is hard to find explicit solutions to the coupled ODEs (69), and even if you succeed with this, inverting the system (70) analytically might easily be impossible.
However, even if we can not find an explicit solution, the method of characteristics can still give great insight into what the PDE is telling us. Additionally, the method of characteristics is the starting point for a numerical approach to solving PDEs of the type (64), and of related types as well.
In our traffic model, we encountered the equation

∂_t n + c_0 ∂_x n = 0.   (71)

This equation is of the type (64) and can be solved using the method of characteristics. Let the initial data be given on the x-axis:

n(x, 0) = ϕ(x).

Our system of ODEs determining the base characteristic curves for the traffic equation (71) is thus
dt/ds = 1,   t(0, τ ) = 0,
dx/ds = c_0,   x(0, τ ) = τ,
dn/ds = 0,   n(0, τ ) = ϕ(τ ).
Observe that the last ODE tells us that the function n is constant along the base characteristic curves.
dt/ds = 1 ⇒ t = s + t_0,   t(0, τ ) = 0 ⇒ t_0 = 0 ⇒ t(s, τ ) = s,
dx/ds = c_0 ⇒ x = c_0 s + x_0,   x(0, τ ) = τ ⇒ x_0 = τ ⇒ x(s, τ ) = c_0 s + τ,
dn/ds = 0 ⇒ n = n_0,   n(0, τ ) = ϕ(τ ) ⇒ n_0 = ϕ(τ ) ⇒ n(s, τ ) = ϕ(τ ).
Inverting,

t = s ⇒ s = t,
x = c_0 s + τ ⇒ τ = x − c_0 s = x − c_0 t,

and our solution to the initial value problem is

n(x, t) = ϕ(x − c_0 t).
This is the unique solution to our initial value problem. The base characteristic curves are given in parametrized form as

t = s,
x = c_0 s + τ,   τ ∈ R,

or in unparametrized form as

x − c_0 t = τ.
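The solution n(x, t) = ϕ(x − c_0 t) found this way is easy to sanity-check numerically; the sketch below (with an illustrative Gaussian profile and c_0 = 2, choices made here) verifies the PDE with centered finite differences.

```python
import numpy as np

c0 = 2.0                                   # illustrative wave speed
phi = lambda x: np.exp(-x ** 2)            # illustrative initial profile
n = lambda x, t: phi(x - c0 * t)           # solution from the characteristics

# Verify the PDE n_t + c0 n_x = 0 with centered finite differences.
x, t, d = 0.7, 1.3, 1e-6
n_t = (n(x, t + d) - n(x, t - d)) / (2 * d)
n_x = (n(x + d, t) - n(x - d, t)) / (2 * d)
print(abs(n_t + c0 * n_x) < 1e-6)          # True
print(n(x, 0.0) == phi(x))                 # True: the initial condition
```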
Figure 25: Base characteristic curves for the traffic equation (71).
The solution in general depends on both the choice of initial curve and which values we specify for the solution on this curve. In fact, if we are not careful here, there might be many solutions to the initial value problem, or none at all.
Since n is constant along a base characteristic curve, its value is determined by the initial value at the point where the base characteristic curve and the initial curve intersect. If this occurs at more than one point, there is in general no solution to the initial value problem.
Let for example C be given through a parametrization
x(τ ) = τ, (72)
t(τ ) = τ (1 − τ ). (73)
The initial values are given by some function ϕ(τ ) defined on C. The initial curve is thus a parabola, and the base characteristic curves form a family of parallel lines

x − c_0 t = const.
Figure 26: Base characteristic curves and parabolic initial curve for the traffic equation (71).
A base characteristic will then in general intersect the initial curve twice, so a solution exists only for very special initial data. At a certain point x = τ_3, t = τ_3(1 − τ_3), the characteristic base curve is tangent to the initial curve. At this point the equations

x = x(s, τ ),
t = t(s, τ ),

cannot be inverted. Here

x = c_0 s + τ,
t = s + τ (1 − τ ),

and the Jacobian is

∆ = ∂_s x ∂_τ t − ∂_s t ∂_τ x = c_0(1 − 2τ ) − 1.

At the point of tangency, the slope of the initial curve equals the slope of the base characteristics:

1/c_0 = d/dτ (τ (1 − τ )) = 1 − 2τ,
⇓
c_0(1 − 2τ ) = 1,
⇓
∆ = 0.
A point where the base characteristic curves and the initial curve are tangent is called a characteristic point. The initial curve is said to be characteristic at this point. Thus the initial curve C is characteristic at the point (x, t) where

x = (1/2)(1 − 1/c_0),
t = (1/4)(1 − 1/c_0²).
If the initial curve is characteristic at all of its points, we say that the initial curve is characteristic. It is then in fact a base characteristic curve. The initial value problem then in general has no solution, and if it does have a solution, the solution is not unique. A characteristic initial value problem for our traffic equation (71) would be

∂_t n + c_0 ∂_x n = 0,
n(c_0 t, t) = ϕ(t).
The general solution of the equation is

n(x, t) = f (x − c_0 t),   (74)

where f is an arbitrary function. The initial condition gives

n(c_0 t, t) = ϕ(t),
⇕
f (0) = ϕ(t),   ∀t.

In order for a solution to exist, ϕ(t) = A must be a constant. And if ϕ(t) = A, then (74), with any function f (y) where f (0) = A, would solve the initial value problem. Thus we have nonuniqueness.
Let us next consider the initial value problem

t∂_t v + ∂_x v = 0,
v(x, 1) = ϕ(x).   (75)

The initial curve is the horizontal line t = 1, which can be parametrized using

x(τ ) = τ,
t(τ ) = 1.
The corresponding system of ODEs is

dt/ds = t,   t(0, τ ) = 1,
dx/ds = 1,   x(0, τ ) = τ,
dv/ds = 0,   v(0, τ ) = ϕ(τ ),
and

dt/ds = t ⇒ t = t_0 e^s,   t(0, τ ) = 1 ⇒ t_0 = 1 ⇒ t = e^s,
dx/ds = 1 ⇒ x = s + x_0,   x(0, τ ) = τ ⇒ x_0 = τ ⇒ x = s + τ,
dv/ds = 0 ⇒ v = v(τ ),   v(0, τ ) = ϕ(τ ) ⇒ v = ϕ(τ ).

We now solve for s and τ in terms of x and t:

t = e^s,
x = s + τ,
⇕
s = ln t,
τ = x − s = x − ln t,
⇓
v(x, t) = ϕ(τ ) = ϕ(x − ln t).
This solution is constant along the base characteristic curves

τ = x − ln t,
⇕
t = e^{x−τ} = e^{−τ} e^x,   τ ∈ R.

The solution we have found exists only for t > 0. This is its domain of definition.
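As with the traffic equation, we can verify the formula v(x, t) = ϕ(x − ln t) directly (a sketch, with an arbitrary smooth choice of ϕ made here):

```python
import numpy as np

phi = lambda x: np.sin(x)                  # illustrative initial data
v = lambda x, t: phi(x - np.log(t))        # the solution found above

# Verify t v_t + v_x = 0 and the initial condition at t = 1.
x, t, d = 0.5, 2.0, 1e-6
v_t = (v(x, t + d) - v(x, t - d)) / (2 * d)
v_x = (v(x + d, t) - v(x - d, t)) / (2 * d)
print(abs(t * v_t + v_x) < 1e-6)           # True
print(v(x, 1.0) == phi(x))                 # True: v(x, 1) = phi(x)
```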
Figure 27: Base characteristic curves and initial curve for the initial value problem (75).
As a final example, let us consider the initial value problem for the wave equation

v_tt − γ²v_xx = 0,
v(x, 0) = f (x),
v_t(x, 0) = g(x).   (76)

We have previously seen that this can be written as a first order system

v_t − γv_x = u,   (77)
u_t + γu_x = 0.   (78)
The initial conditions become

v(x, 0) = f (x),
u(x, 0) = (v_t − γv_x)(x, 0) = g(x) − γf ′(x).
We start by solving the homogeneous equation (78). We will use the method of characteristics on the problem

u_t + γu_x = 0,
u(x, 0) = g − γf ′.

The corresponding system of ODEs is

dx/ds = γ,   x(0, τ ) = τ,
dt/ds = 1,   t(0, τ ) = 0,
du/ds = 0,   u(0, τ ) = g(τ ) − γf ′(τ ).
Solving, we find

dx/ds = γ ⇒ x = γs + x_0,   x(0, τ ) = τ ⇒ x_0 = τ ⇒ x = γs + τ,
dt/ds = 1 ⇒ t = s + t_0,   t(0, τ ) = 0 ⇒ t_0 = 0 ⇒ t = s,
du/ds = 0 ⇒ u = u_0,   u(0, τ ) = g(τ ) − γf ′(τ ) ⇒ u(s, τ ) = g(τ ) − γf ′(τ ).

Inverting,

x = γs + τ,   t = s   ⇔   τ = x − γt,   s = t,

from which it follows that the function u(x, t) = u(s(x, t), τ (x, t)) is given by

u(x, t) = g(x − γt) − γf ′(x − γt).
Let us next apply the method of characteristics to the initial value problem for equation (77). This problem is

v_t − γv_x = g(x − γt) − γf ′(x − γt),
v(x, 0) = f (x).

The corresponding system of ODEs is

dx/ds = −γ,   x(0, τ ) = τ,
dt/ds = 1,   t(0, τ ) = 0,
dv/ds = g(x(s) − γt(s)) − γf ′(x(s) − γt(s)),   v(0, τ ) = f (τ ).

The first two equations give

x = −γs + τ,   t = s   ⇔   τ = x + γt,   s = t.
Integrating the third equation, with v(0, τ ) = f (τ ), gives

v(s, τ ) = f (τ ) + ∫_0^s dσ (g(x(σ) − γt(σ)) − γf ′(x(σ) − γt(σ))).

Observing that

x − γt = −γs + τ − γs = −2γs + τ,

the formula for v becomes

v(s, τ ) = f (τ ) + ∫_0^s dσ (g(−2γσ + τ ) − γf ′(−2γσ + τ )).

Changing the integration variable to y = −2γσ + τ then gives

v(s, τ ) = f (τ ) − (1/2γ) ∫_τ^{−2γs+τ} dy (g(y) − γf ′(y)).
Therefore, the function v(x, t) = v(s(x, t), τ (x, t)) is given by the formula

v(x, t) = f (x + γt) − (1/2γ) ∫_{x+γt}^{x−γt} dy (g(y) − γf ′(y))
        = f (x + γt) + (1/2) ∫_{x+γt}^{x−γt} dy f ′(y) + (1/2γ) ∫_{x−γt}^{x+γt} dy g(y),
⇓
v(x, t) = (1/2)(f (x + γt) + f (x − γt)) + (1/2γ) ∫_{x−γt}^{x+γt} dy g(y).
This is the famous D'Alembert formula for the solution of the initial value problem for the wave equation on an infinite domain. It was first derived by the French mathematician D'Alembert in 1747.
Let G(ξ) = ∫ g(y)dy. Then the formula can be rewritten as

v(x, t) = ((1/2)f (x + γt) + (1/2γ)G(x + γt)) + ((1/2)f (x − γt) − (1/2γ)G(x − γt)).

The solution is clearly a linear superposition of a left moving and a right moving wave.
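D'Alembert's formula is easy to verify numerically. The sketch below (with illustrative choices of f, g and γ made here) checks the wave equation and both initial conditions by finite differences.

```python
import numpy as np

gamma = 2.0                                 # illustrative wave speed
f = lambda x: np.exp(-x ** 2)               # initial displacement
g = lambda x: np.sin(x)                     # initial velocity
G = lambda x: -np.cos(x)                    # an antiderivative of g

def v(x, t):
    """D'Alembert's formula, with the integral of g expressed through G."""
    return (0.5 * (f(x + gamma * t) + f(x - gamma * t))
            + (G(x + gamma * t) - G(x - gamma * t)) / (2 * gamma))

x, t, d = 0.3, 0.7, 1e-4
v_tt = (v(x, t + d) - 2 * v(x, t) + v(x, t - d)) / d ** 2
v_xx = (v(x + d, t) - 2 * v(x, t) + v(x - d, t)) / d ** 2
print(abs(v_tt - gamma ** 2 * v_xx) < 1e-4)   # True: wave equation holds
print(v(x, 0.0) == f(x))                      # True: v(x, 0) = f(x)
v_t0 = (v(x, d) - v(x, -d)) / (2 * d)
print(abs(v_t0 - g(x)) < 1e-6)                # True: v_t(x, 0) = g(x)
```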
where
a = a(x, t, u),
b = b(x, t, u),
c = c(x, t, u).
Observe that the graph of a solution is the level set F (x, t, u) ≡ u(x, t) − u = 0.
The characteristic system of ODEs is now

dx/ds = a(x, t, u),
dt/ds = b(x, t, u),
du/ds = c(x, t, u).
Observe that for this quasilinear case the first two equations do not decouple, and the three ODEs have to be solved simultaneously. This can obviously be challenging.
For sufficiently smooth a, b and c, there is a solution curve, or characteristic curve, passing through any point (x_0, t_0, u_0) in (x, t, u) space.
The initial value problem is specified by giving the value of u on a curve in (x, t) space. This curve together with the initial data determines a curve, C, in
(x, t, u) space that we call the initial curve. To solve the initial value problem
we pass a characteristic curve through every point of the initial curve. If these
curves generate a smooth surface, the function, whose graph is equal to the
surface, is the solution of the initial value problem.
Figure 28
dx/ds = a,   x(0, τ ) = x(τ ),
dt/ds = b,   t(0, τ ) = t(τ ),
du/ds = c,   u(0, τ ) = ϕ(τ ).

The theory of ODEs gives a unique solution

x = x(s, τ ),
t = t(s, τ ),   (80)
u = ϕ(s, τ ).
Observe that

∂_s x = a,   ∂_s t = b,

so the Jacobian is

∆(s, τ ) = | ∂_s x  ∂_τ x |
           | ∂_s t  ∂_τ t | = (a ∂_τ t − b ∂_τ x)(s, τ ).

Let us assume that the Jacobian is nonzero along a smooth initial curve. We also assume that a, b and c are smooth. Then ∆(s, τ ) is continuous in s and τ, and therefore nonzero in an open set surrounding the initial curve.
By the inverse function theorem, the equations
x = x(s, τ ),
t = t(s, τ ),
can be inverted in an open set surrounding the initial curve. Thus we have
s = s(x, t),
τ = τ (x, t).
Then
u(x, t) = ϕ(s(x, t), τ (x, t))
is the unique solution to the initial value problem
a∂x u + b∂t u = c,
u(x(τ ), t(τ )) = ϕ(τ ).
In general, going through this program and finding the solution u(x, t) can be very hard. However, deploying the method can give great insight into a given PDE.
Let us consider the following quasilinear equation:

u_t + uu_x = 0,
u(x, 0) = ϕ(x).   (81)

It is called the (inviscid) Burgers equation. Let us solve it using the method of characteristics.
The initial curve is parametrized by
x = τ, t = 0, u = ϕ(τ ).
Here
a(x, t, u) = u,
b(x, t, u) = 1,
c(x, t, u) = 0.
Let us compute the Jacobian along the initial curve. The characteristic curves are

x = x(s, τ ),
t = t(s, τ ),
u = u(s, τ ),

where

dx/ds = u,   x(0, τ ) = τ,
dt/ds = 1,   t(0, τ ) = 0,   (82)
du/ds = 0,   u(0, τ ) = ϕ(τ ).

Then

∂_τ t = 0,   ∂_τ x = 1   at (0, τ ),
⇓
∆(0, τ ) = (a ∂_τ t − b ∂_τ x)|_{(0,τ)} = −1 ≠ 0,

so ∆ ≠ 0 along the entire initial curve. Let us next solve (82). From the last equation in (82) we get

u = u_0.

Thus u is a constant independent of s. Since u is constant, the first equation in (82) gives us

x = u_0 s + x_0,

and applying the initial conditions we find

u = ϕ(τ ),
x = ϕ(τ )s + τ,
t = s.
If the conditions for the inverse function theorem hold, the equations
x = ϕ(τ )s + τ,
t = s,
can be inverted to yield

τ = τ (x, t),   s = t,

and the solution to our problem is

u(x, t) = ϕ(τ (x, t)).

Without specifying ϕ, we can not invert the system and find a solution to our initial value problem. We can however write down an implicit solution. We have

u = ϕ(τ ),
x = ϕ(τ )s + τ,
t = s,
⇓
τ = x − us,
⇓
u = ϕ(x − tu).
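For small enough t, the implicit relation u = ϕ(x − tu) can be solved by fixed point iteration, since the map u ↦ ϕ(x − tu) is then a contraction. A sketch, with an illustrative Gaussian profile chosen here:

```python
import numpy as np

phi = lambda x: np.exp(-x ** 2)            # illustrative initial profile

def burgers_u(x, t, iterations=200):
    """Solve u = phi(x - t u) by fixed point iteration; this converges
    for t small enough that the map is a contraction."""
    u = phi(x)
    for _ in range(iterations):
        u = phi(x - t * u)
    return u

x, t = 0.8, 0.2
u = burgers_u(x, t)
print(np.isclose(u, phi(x - t * u)))   # True: satisfies the implicit relation
print(burgers_u(x, 0.0) == phi(x))     # True: the initial condition
```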
Even without precisely specifying ϕ, we can still gain great insight by considering the way the characteristics depend on the initial data. Projecting the characteristics in (x, t, u) space down into (x, t) space gives us base characteristics of the form

x = ϕ(τ )t + τ.

This is a family of straight lines in (x, t) space parametrized by the parameter τ. Observe that the slope of the lines depends on τ; thus the lines in the family are not parallel in general. This is highly significant. The slope of the base characteristics is

dt/dx = 1/ϕ(τ ),
and is evidently varying as a function of τ. This implies the possibility that
the base characteristics will intersect. If they do, we are in trouble because
the solution, u, is constant along the base characteristics. Thus if the base
characteristics corresponding to parameters τ1 and τ2 intersect at a point (x∗ , t∗ )
in (x, t) space then at this point u will have two values
u1 = ϕ(τ1 ),
u2 = ϕ(τ2 ),
unless ϕ(τ1 ) = ϕ(τ2 ).
But in this case the two characteristics have the same slope and would not have
intersected. The conclusion is that base characteristics can intersect and when
that happens u is no longer a valid solution.
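The intersection time of two base characteristics can be computed directly from
x = ϕ(τ )t + τ . A small sketch, using ϕ = sin and two labels on either side of
τ = π, where sin is decreasing; the function name intersection_time is ours.

```python
import numpy as np

def intersection_time(tau1, tau2, phi):
    """Time at which the base characteristics x = phi(tau)*t + tau
    labelled by tau1 and tau2 intersect (requires phi(tau1) != phi(tau2))."""
    return (tau2 - tau1) / (phi(tau1) - phi(tau2))

t_int = intersection_time(np.pi - 0.1, np.pi + 0.1, np.sin)
print(t_int)  # slightly above the breakdown time t* = 1 for phi = sin
```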
There is another way of looking at this phenomenon. Recall that the implicit
solution is
u = ϕ(x − ut). (83)
The local propagation speed of the wave is the value of u itself. If the initial
profile is somewhere decreasing, there are points to the left that move faster
than points further to the right. Thus points to the left can catch up with
points to the right and lead to a steepening of the wave.
Eventually the steepening can create a wave with a vertical edge. At this point
|ux | = ∞ and the model no longer describes what it is supposed to model. We
thus have a breakdown.
It is evidently important to find out if and when such a breakdown occurs.
There are two ways to work this out. The first approach uses the implicit solution.
ux = ϕ0 (x − ut)(1 − ux t),
⇓
ux = ϕ′ (x − ut)/(1 + tϕ′ (x − ut)).
Clearly |ux | becomes infinite if the denominator becomes zero
1 + tϕ′ (x − ut) = 0,
⇕
t = −1/ϕ′ (τ ),
where we recall that τ = x − ut. What is important is to find the smallest time
for which infinite slope occurs. The solution is not valid for times larger than
this. Thus we must determine
t∗ = minτ {−1/ϕ′ (τ )}.
When initial data is given, ϕ(τ ) is known and t∗ can be found.
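This minimization is easy to carry out numerically: approximate ϕ′ by finite
differences on a τ grid and take the minimum of −1/ϕ′ over the points where
ϕ′ < 0. A sketch (function name ours), tested on ϕ = sin, for which the minimum
value −1/ cos τ = 1 is attained where cos τ = −1:

```python
import numpy as np

def breakdown_time(phi, tau_min=-4.0, tau_max=4.0, n=400001):
    """Estimate t* = min over tau with phi'(tau) < 0 of -1/phi'(tau),
    using a finite difference approximation of phi'."""
    tau = np.linspace(tau_min, tau_max, n)
    dphi = np.gradient(phi(tau), tau)
    return float((-1.0 / dphi[dphi < 0.0]).min())

tstar = breakdown_time(np.sin)
print(tstar)  # ≈ 1, attained where cos(tau) = -1
```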
The second method involves looking at the geometry of the characteristics.
It leads to the same breakdown time t∗ .
We can immediately observe that if ϕ is increasing for all x so that
ϕ′ (τ ) > 0,
then t∗ < 0 and no breakdown occurs. For breakdown to occur we must have
ϕ′ (τ ) < 0 for some τ . Let us look at some specific choices of initial conditions.
i) u(x, 0) = ϕ(x) = −x
⇒ t(τ ) = −1/ϕ′ (τ ) = 1,
⇓
t∗ = minτ t(τ ) = 1.
Figure 32: The solution u(x, t) = x/(t − 1) at t = 0 and t = t1 > 0.
Figure 33: Intersecting base characteristics for the initial condition u(x, 0) = −x
ii) u(x, 0) = ϕ(x) = 1 − x2
⇒ t(τ ) = −1/ϕ′ (τ ) = 1/(2τ ),
⇓
t∗ = minτ t(τ ) = 0.
For this initial condition the implicit solution u = 1 − (x − ut)2 leads to a
quadratic equation for u
t2 u2 + (1 − 2xt)u + x2 − 1 = 0,
⇓
u = (1/2t2 ){−(1 − 2xt) ± [(1 − 2xt)2 − 4t2 (x2 − 1)]^(1/2) },
⇓
u = x/t − (1 ∓ [1 + 4t(t − x)]^(1/2) )/2t2 .
The initial condition requires that
limt→0+ u(x, t) = 1 − x2 .
Using the Taylor expansion
(1 + 4t(t − x))^(1/2) ≈ 1 + 2t(t − x) − 2t2 (t − x)2 ,
we find that the minus sign must be chosen, so that
u(x, t) = x/t − (1 − [1 + 4t(t − x)]^(1/2) )/2t2 .
We have
ux = (1/t)(1 − [1 + 4t(t − x)]^(−1/2) ),
and thus infinite slope occurs at all space-time points (x, t) determined by
1 + 4t(t − x) = 0,
⇕
x = t + 1/4t.
This curve is illustrated in figure 34. There clearly are space-time points
with arbitrarily small times on this curve and we therefore conclude that the
breakdown time is t∗ = 0.
The actual wave propagation is illustrated in figure 35. The steepening is evi-
dent.
iii) u(x, 0) = ϕ(x) = sin x
For this initial condition the implicit solution is
u = sin(x − ut),
Figure 34: Set of space-time points where u(x, t) has infinite slope.
Figure 35: The solution u(x, t) with u(x, 0) = 1 − x2 at t = 0 and t = t1 > 0.
and thus
ϕ(τ ) = sin τ,
⇓
ϕ′ (τ ) = cos τ,
⇓
t∗ = minτ {−1/ cos τ } = 1,
and this minimum breakdown time is achieved for cos τ = −1, that is for each
τn = (2n + 1)π .
Recall that the characteristic curves are parametrized by τ through
x = sin(τ )t + τ.
Thus we can conclude that the breakdown occurs at the space-time points
(t∗ , x∗n ) where
x∗n = sin(τn )t∗ + τn = (2n + 1)π.
The breakdown is illustrated in gure 36.
Figure 36: The solution u(x, t) with u(x, 0) = sin x at t = 0 and t = t1 > 0.
We now consider the general linear second order scalar PDE in two independent
variables
Auxx + 2Buxy + Cuyy + Dux + Euy + F u = G,            (85)
where u = u(x, y), A = A(x, y) etc. The wave equation, diffusion equation and
Laplace equation are clearly included in this class.
What separates one equation in this class from another is determined en-
tirely by the functions A, . . . , G. Thus the exact form of these functions deter-
mines all properties of the equation and the exact form of the set of solutions
(or space of solutions as we usually say). In fact if G = 0 the space of solutions
is a vector space. This is because the differential operator
L = A∂xx + 2B∂xy + C∂yy + D∂x + E∂y + F,
is a linear operator. (Prove it!) Thus even if we at this point know nothing
about the functions forming the space of solutions of the given equation, we do
know that it has the abstract property of being a vector space.
Studying abstract properties of the space of solutions of PDEs is a central
theme in the modern theory of PDEs. Very rarely will we be able to get an
analytical description of the solution space for a PDE.
Knowing that a given PDE has a solution space that has a set of abstract
properties is often of paramount importance. This knowledge can both inform
us about expected qualitative properties of solutions and act as a guide for
selecting which numerical or analytical approximation methods are appropriate
for the given equation.
The aim of this section is to describe three abstract properties that the
equation (85) might have or not. These are the properties of being hyperbolic,
parabolic or elliptic.
Let us then begin: We first assume that A(x, y) is nonzero in the region of
interest. This is not a very restrictive assumption and can be achieved for most
reasonable A. We can then divide (85) by A
uxx + (2B/A)uxy + (C/A)uyy + (D/A)ux + (E/A)uy + (F/A)u = G/A.            (86)
We will now focus on the terms in (86) that are of second order in the derivatives.
These terms form the principal part of the equation. The differential operator
defining the principal part is
L = ∂xx + (2B/A)∂xy + (C/A)∂yy .            (87)
We now try to factorize the differential operator by writing it in the form
L = (∂x − a∂y )(∂x − b∂y ) + c∂y ,            (88)
where
a + b = −2B/A,
ab = C/A,
c = bx − aby .
From the first two equations we find that both a and b solve the equation
Az 2 + 2Bz + C = 0.            (89)
Denoting the two roots of (89) by ω+ and ω− , we choose
a = ω+ , b = ω− .
We thus have
ω± = (1/A){−B ± (B 2 − AC)^(1/2) },            (90)
and then
∂xx + (2B/A)∂xy + (C/A)∂yy
= (∂x − ω+ ∂y )(∂x − ω− ∂y ) + ((ω− )x − ω+ (ω− )y )∂y .
We are in general interested in real solutions to our equations and aim to use
this factorization to rewrite and simplify our equation. Thus it is of importance
to know how many real solutions equation (89) has. From formula (90) we see
that this is answered by studying the sign of B 2 (x, y) − A(x, y)C(x, y) in the
domain of interest. We will here assume that this quantity is of one sign in the
region of interest. This can always be achieved by restricting the domain, if
necessary.
The equation (85) is called
i) Hyperbolic if B 2 − AC > 0,
ii) Parabolic if B 2 − AC = 0,
iii) Elliptic if B 2 − AC < 0.
For the wave equation
utt − c2 uxx = 0,
we have A = 1, B = 0, C = −c2 (x → t, y → x in the general form of the
equation (85))
⇒ B 2 − AC = −1 · (−c2 ) = c2 > 0.
Thus the wave equation is hyperbolic.
For the diffusion equation
ut − Duxx = 0,
we have A = 0, B = 0, C = −D
⇒ B 2 − AC = 0.
Thus the diffusion equation is parabolic.
For the Laplace equation
uxx + uyy = 0,
we have A = 1, B = 0, C = 1
⇒ B 2 − AC = −1 < 0.
Thus the Laplace equation is elliptic.
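The pointwise classification by the sign of B² − AC is immediate to implement.
A minimal sketch (the function name classify is ours, not from the notes):

```python
def classify(A, B, C):
    """Classify A uxx + 2B uxy + C uyy + ... at a point via the sign of B^2 - AC."""
    d = B * B - A * C
    if d > 0:
        return "hyperbolic"
    if d == 0:
        return "parabolic"
    return "elliptic"

c, D = 2.0, 0.5
print(classify(1.0, 0.0, -c * c))  # wave equation: hyperbolic
print(classify(0.0, 0.0, -D))      # diffusion equation: parabolic
print(classify(1.0, 0.0, 1.0))     # Laplace equation: elliptic
```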
Assume now that equation (85) is of hyperbolic type and consider the ODE
dy/dx = −ω+ .            (91)
Let the solution curves of this equation be parametrized by the parameter ξ.
Thus the curves can be written as
ξ = ξ(x, y).
In a similar way we can write the solution curves for the ODE
dy/dx = −ω− ,            (92)
in the form
η = η(x, y),
where η is the parameter parameterizing the solution curves for (92). We now
introduce (ξ, η) as new coordinates in the plane. The change of coordinates is
thus
ξ = ξ(x, y),
η = η(x, y).            (93)
The Jacobian of the transformation is
∆ = ξx ηy − ξy ηx .
The solution curves, or characteristic curves, of the equation (85) are by defini-
tion determined by
ξ(x, y) = const,
η(x, y) = const.
Differentiating implicitly with respect to x gives, upon using (91) and (92),
0 = ξx + y ′ ξy = ξx − ω+ ξy ⇒ ξx = ω+ ξy ,
0 = ηx + y ′ ηy = ηx − ω− ηy ⇒ ηx = ω− ηy .            (94)
Thus
∆ = (ω+ ξy )ηy − ξy (ω− ηy ) = (ω+ − ω− )ηy ξy ≠ 0,
because, by assumption, ω+ ≠ ω− in our domain. Thus the proposed change of
coordinates (93) does in fact dene a change of coordinates.
Figure 37
Inserting ω+ = ξx /ξy from equation (94) into (89) we get, after multiplying by ξy2 ,
A(x, y)ξx2 (x, y) + 2B(x, y)ξx (x, y)ξy (x, y) + C(x, y)ξy2 (x, y) = 0.
In a similar way the root z = ω− shows that the function η(x, y) satisfies the
equation
A(x, y)ηx2 (x, y) + 2B(x, y)ηx (x, y)ηy (x, y) + C(x, y)ηy2 (x, y) = 0.
Thus both ξ(x, y) and η(x, y), which determine the characteristic coordinate
system, are solutions to the characteristic PDE corresponding to our original
PDE (85)
A(x, y)ϕ2x (x, y) + 2B(x, y)ϕx (x, y)ϕy (x, y) + C(x, y)ϕ2y (x, y) = 0.
Using the chain rule we have
ux = ξx uξ + ηx uη ,
uy = ξy uξ + ηy uη ,
⇓
ux − ω+ uy = ξx uξ + ηx uη − ω+ ξy uξ − ω+ ηy uη
= (ξx − ω+ ξy )uξ + (ηx − ω+ ηy )uη            (96)
= (ηx − ω+ ηy )uη = ηy (ω− − ω+ )uη ,
because
ξx − ω+ ξy = ξx + y ′ (x)ξy = (d/dx) ξ(x, y(x)) = 0,
since
ξ(x, y) = const,
describes the solution curves to the equation (91). In a similar way we find
ux − ω− uy = ξy (ω+ − ω− )uξ .            (97)
In operator form these identities read
∂x − ω+ ∂y = −ηy (ω+ − ω− )∂η ,
∂x − ω− ∂y = ξy (ω+ − ω− )∂ξ .            (98)
Substituting this into (85) and expressing all first order derivatives with respect
to x and y in terms of ∂ξ and ∂η , we get an equation of the form
uξη + auξ + buη + cu = d,            (100)
where we have divided by the nonzero quantity ξy ηy (ω+ − ω− ) and where now
u = u(ξ, η), a = a(ξ, η) etc. For any particular equation a, b, c and d will be
determined explicitly in terms of A, B, C and D.
Since (100) results from (85) by a change of coordinates, (100) and (85) are
equivalent PDEs. Their spaces of solutions will consist of functions that are
related through the change of coordinates (93).
(100) is called the canonical form for equations of hyperbolic type.
There is a second common form for hyperbolic equations. We find this by
doing the additional transformation
ξ = α + β,
η = α − β,
⇓
uαα − uββ + ã uα + b̃ uβ + c̃ u = d̃.            (101)
An example of an equation of this second hyperbolic form is
utt − uxx = 1.
Assume next that equation (85) is of parabolic type, so that B 2 − AC = 0 and
ω+ = ω− = ω.
In this case there is only one family of characteristic curves
ξ = ξ(x, y),
determined by the ODE
dy/dx = −ω.
Let
η = η(x, y),
be another family of curves chosen such that
ξ = ξ(x, y),
η = η(x, y),            (102)
defines a new coordinate system (ξ, η) in the domain of interest. For example, if
A, B and C are constants we can take
ξ = y + ωx,
and let η be any function for which
∆ = ξx ηy − ξy ηx ≠ 0.
We now express all derivatives ux , uxx , uxy , uy , uyy in terms of uξ , uη , uξη , uξξ , uηη
using the chain rule. For example we have
ux = ξx uξ + ηx uη ,
uy = ξy uξ + ηy uη ,
uxx = ξxx uξ + ηxx uη + ξx2 uξξ + 2ξx ηx uξη + ηx2 uηη ,
etc.
Inserting these expressions into the equation we get a principal part of the
form
[Aξx2 + 2Bξx ξy + Cξy2 ]uξξ + 2[Aξx ηx + Bξx ηy + Bξy ηx + Cξy ηy ]uξη
+ [Aηx2 + 2Bηx ηy + Cηy2 ]uηη .            (103)
Since
ξx = ωξy ,
we have
Aξx2 + 2Bξx ξy + Cξy2 = (Aω 2 + 2Bω + C)ξy2 = 0.
Thus, the term uξξ drops out of the equation. Also recall that for the parabolic
case we have ω = −B/A. Therefore we have
Aξx ηx + Bξx ηy + Bξy ηx + Cξy ηy = ξy [(Aω + B)ηx + (Bω + C)ηy ] = 0.
Therefore the term uξη also drops out of the equation. Also, observe that the
term
term
Aηx2 + 2Bηx ηy + Cηy2 , (104)
must be nonzero. This is because if it were zero, η = η(x, y) would define a
second family of characteristic curves different from ξ = ξ(x, y), and we know
that no such family exists for the parabolic case.
We can therefore divide the equation by the term (104) and we get
uηη + auξ + buη + cu = d.            (105)
This is the canonical form for equations of parabolic type. We observe that the
diffusion equation
ut − Duxx = 0,
is of this form.
Finally, assume that equation (85) is of elliptic type, so that
B 2 − AC < 0.
For this case the canonical form turns out to be
uαα + uββ + auα + buβ + cu = d,            (106)
where a = a(α, β) etc. The same canonical form can be found using only real
variables. The key step in the reduction to canonical form is to simplify the
principal part of the equation using a change of variables. Let us therefore do
a general change of variables
ξ = ξ(x, y),
η = η(x, y).
From equation (103) we have the following expression for the principal part of
the transformed equation
[Aξx2 + 2Bξx ξy + Cξy2 ]uξξ + 2[Aξx ηx + Bξx ηy + Bξy ηx + Cξy ηy ]uξη
+ [Aηx2 + 2Bηx ηy + Cηy2 ]uηη .
The functions ξ(x, y) and η(x, y) are at this point only constrained by the Jacobi
condition
∆(x, y) = (ξx ηy − ξy ηx )(x, y) ≠ 0.
In the elliptic case, both solutions ξ, η to the characteristic equation are complex.
Thus, for the elliptic case the coefficients in front of uξξ and uηη can not be
made to vanish. However, let η(x, y) be arbitrary and let ξ = ξ(x, y) be
the solution family of the ODE
dy/dx = (Bηx + Cηy )/(Aηx + Bηy ).            (107)
Then along the solution curves ξ(x, y) = const, we have
ξx + (dy/dx)ξy = 0,
⇓
ξx = −ξy (Bηx + Cηy )/(Aηx + Bηy ).
Thus
∆ = ξx ηy − ξy ηx = −ξy ηy (Bηx + Cηy )/(Aηx + Bηy ) − ξy ηx
= (−ξy /(Aηx + Bηy )){ηy (Bηx + Cηy ) + ηx (Aηx + Bηy )}
= (−ξy /(Aηx + Bηy )){Bηx ηy + Cηy2 + Aηx2 + Bηx ηy }
= (−ξy /(Aηx + Bηy )){Aηx2 + 2Bηx ηy + Cηy2 } ≠ 0,
because the characteristic equation has no real solutions.
Therefore
ξ = ξ(x, y),
η = η(x, y),
is a change of coordinates.
Furthermore
Aξx ηx + Bξx ηy + Bξy ηx + Cξy ηy = (Aηx + Bηy )ξx + (Bηx + Cηy )ξy = 0.
Thus, the coefficient in front of uξη is zero. Since the quantity
Aηx2 + 2Bηx ηy + Cηy2 ,
has no real zeros,
the coefficients in front of uξξ and uηη are nonzero and of the same sign. Divid-
ing by the coefficient of uξξ , we find that the principal part takes the form
uξξ + h0 uηη ,
with h0 = h0 (ξ, η) > 0.
Introduce a change of variables defined by
α = ξ,            (110)
β = (1/h0^(1/2) ) η.            (111)
Using the chain rule we find that the principal part of an elliptic equation with
constant coefficients takes the form
uαα + uββ .            (112)
One can show that elliptic equations with variable coefficients can also be put
into the canonical form (112) by a suitable change of coordinates. This argument
however either relies on complex analysis, or some rather intricate real variable
manipulations.
To summarize, the principal part of the canonical form is uξη in the hyperbolic
case, a single unmixed second derivative in the parabolic case, and uαα + uββ
in the elliptic case.
Consider now a second order linear PDE (113) of the form (85), defined on a
domain D that is divided into two parts, D1 and D2 , by a curve y = h(x).
We will assume that u, ux , uy are continuous on D but that uxx , uxy , uyy can be
discontinuous across the curve separating D1 and D2 . Thus u is not a solution
in the usual sense to (113) on the whole domain D.
Figure 38
Let ϕ(x, y) be a function whose zero level curve is the curve separating D1 and
D2 , and let ψ(x, y) be chosen such that
ξ = ϕ(x, y),
η = ψ(x, y),
define a change of coordinates. For this to be the case the Jacobian must be
nonzero everywhere.
We now transform our equation into the new coordinate system in the same
way as on page 79, and get
[Aξx2 + 2Bξx ξy + Cξy2 ]uξξ + · · · = 0,            (114)
where we have only displayed the principal part of the equation. What we
precisely mean when we say that uxx , uxy and uyy can be discontinuous across
y = h(x) is that
u, uξ , uη , uξη , uηη ,
are continuous but that uξξ can be discontinuous. Evaluating (114) above and
below the curve ξ = 0 ⇔ y = h(x), taking the difference and going to the limit
as the points approach y = h(x) from above and below, we get
[Aξx2 + 2Bξx ξy + Cξy2 ][uξξ ]|ξ=0 = 0,
where [uξξ ]|ξ=0 is the jump in uξξ across y = h(x). Since [uξξ ]|ξ=0 ≠ 0, we must
have
Aξx2 + 2Bξx ξy + Cξy2 = 0.
Thus ξ(x, y) must satisfy the characteristic PDE, and consequently
ξ = 0 ⇔ y = h(x),
is a characteristic curve. For the wave equation, where A = 1, B = 0 and
C = −γ 2 , the characteristic curves are the straight lines
x + γt = const,
x − γt = const.
Singularities, that is discontinuities in uxx , uxy , uyy , can only occur along these
Figure 39
curves. Singularities given in initial data at t = 0 are propagated along the char-
acteristic curves and persist forever. This behavior is typical for all hyperbolic
equations.
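That data carried along the characteristics x ± γt = const indeed solves the
wave equation can be checked numerically: any u = f (x + γt) + g(x − γt)
should satisfy utt − γ²uxx = 0. A sketch using central finite differences; the
particular f , g, evaluation point and step size are chosen only for illustration.

```python
import numpy as np

def u(x, t, gamma=2.0):
    # any f(x + gamma t) + g(x - gamma t); here f = sin and g = a Gaussian
    return np.sin(x + gamma * t) + np.exp(-(x - gamma * t)**2)

x0, t0, gamma, h = 0.3, 0.7, 2.0, 1e-3
utt = (u(x0, t0 + h) - 2*u(x0, t0) + u(x0, t0 - h)) / h**2
uxx = (u(x0 + h, t0) - 2*u(x0, t0) + u(x0 - h, t0)) / h**2
res = utt - gamma**2 * uxx
print(res)  # ≈ 0
```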
The diffusion equation is
ut = Duxx ,
⇒ A = D, B = C = 0.
So the characteristic PDE is
ϕ2x = 0, ⇒ ϕ = ϕ(t),
and the characteristic curves are the horizontal lines
t = const.
Figure 40
Singularities in the initial data given on the x-axis can not leave the x-axis and
will not be seen at any point (x, t) for t > 0. Thus, all singularities in initial data
are washed out instantly for t > 0. This behaviour is typical for all parabolic
equations.
Finally, the Laplace equation is
uxx + uyy = 0,
for which the characteristic PDE is ϕ2x + ϕ2y = 0. This equation has no nontrivial
real solutions, so the Laplace equation has no characteristic curves, and no
singularities in uxx , uxy , uyy can occur. This behaviour is typical for all elliptic
equations.
Let us start by considering the case of first order linear scalar PDEs in two
independent variables,
aux + buy = cu + d.            (115)
The initial value problem is assumed to be given in terms of an initial curve
that is the graph of a function h(x),
y = h(x),
on which we prescribe the values of u,
u(x, h(x)) = f (x).            (116)
Differentiating (116) with respect to x gives
ux + h′ uy = f ′ ,            (117)
and together with (115), evaluated on the initial curve, this yields the linear
system
( 1 h′ ) ( ux )   (  f ′  )
( a b  ) ( uy ) = ( cf + d ).            (119)
For points close to the initial curve we can express the solution of the initial
value problem (115),(116) as a Taylor series.
Figure 41
Clearly if the solution u(x, y) is to exist and be unique, then at least ux (x0 , y0 )
and uy (x0 , y0 ) must exist as a unique solution of the system (119).
This is ensured if
det ( 1 h′ ; a b )|(x0 ,y0 ) ≠ 0.
Even if this is true it is not certain that the initial value problem is uniquely
solvable, there is the question of the coecients of the rest of the Taylor series
and of course there is the matter of convergence for such series . . . .
However, if on the other hand,
det ( 1 h′ ; a b )|(x0 ,y0 ) = 0,            (120)
then ux (x0 , y0 ) and uy (x0 , y0 ) either do not exist, or if they do exist they are not
unique. This follows from the elementary theory of linear systems of equations.
So if (120) holds we can know for sure that we do not have existence and
uniqueness for the initial value problem (115),(116).
Condition (120) can be written as
b − ah′ = 0,
⇕
h′ = b/a.            (121)
This equation tells us that y = h(x) must be a characteristic curve! This follows
because the family of base characteristic curves (x(s), y(s)) are determined by
being solutions to
dx/ds = a, dy/ds = b.            (122)
If the curves in question are graphs of functions we can parametrise them using
x, so that they are given by a function y(x). For this function, (122) together
with the chain rule gives
dy/dx = (dy/ds)·(ds/dx) = b/a,
and (121) says that h(x) solves this equation, so that y = h(x) is a base charac-
teristic curve.
Let us next consider the case of second order linear scalar PDEs in two
independent variables.
The characteristic curves have previously been found to satisfy
Aϕ2x + 2Bϕx ϕy + Cϕ2y = 0.            (123)
The curves on which the PDE could have singularities, y = h(x), were related
to ϕ(x, y) through
ϕ(x, y) = y − h(x) = 0,
and (123) then gives the equation
C − 2Bh′ + Ah′2 = 0.            (124)
The solutions of this nonlinear ODE give the characteristic curves for PDEs of
the form
Auxx + 2Buxy + Cuyy = F (x, y, u, ux , uy ),            (125)
where
F (x, y, u, ux , uy ) = Dux + Euy + F u + G.
Recall that all parameters are functions of x and y.
Let y = h(x) be an initial curve where we have given u, ux and uy . Thus
u(x, h(x)) = p(x),
ux (x, h(x)) = q(x),            (126)
uy (x, h(x)) = r(x).
Figure 42
For points close to the initial curve the solution can be expanded as a Taylor
series. Differentiating the last two equations in (126) with respect to x gives
uxx + h′ uxy = q ′ ,            (127)
uxy + h′ uyy = r′ ,            (128)
and these equations, together with (125), define a linear system of equations for
uxx , uxy and uyy
( 1  h′  0  ) ( uxx )   (         q′          )
( 0  1   h′ ) ( uxy ) = (         r′          ).            (130)
( A  2B  C  ) ( uyy )   ( F (x, h(x), p, q, r) )
The Taylor expansion shows that the problem (125) with initial data (126)
can have a unique solution only if the determinant of the matrix in the linear
system (130) is nonzero.
If, on the other hand, the determinant is zero, the problem is sure to either
have no solution or to have a nonunique solution. This condition is
det ( 1 h′ 0 ; 0 1 h′ ; A 2B C ) = 0,
⇕            (131)
C − 2Bh′ + Ah′2 = 0,            (132)
and this is exactly the equation for the characteristic curves (124).
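Equation (132) is a quadratic in h′ and is trivial to solve numerically; for the
wave equation it reproduces the two characteristic families found earlier. A
minimal sketch (the function name characteristic_slopes is ours):

```python
import numpy as np

def characteristic_slopes(A, B, C):
    """Slopes h' solving A h'^2 - 2B h' + C = 0, equation (132)."""
    return np.sort(np.roots([A, -2.0 * B, C]).real)

gamma = 2.0
slopes = characteristic_slopes(1.0, 0.0, -gamma**2)
print(slopes)  # [-2.  2.] : the two characteristic families y = ±gamma x + const
```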
Thus characteristic curves are curves for which the (generalized) initial value
problem either can not be solved or for which the solution is non-unique.
There is something awkward about the presentation given and equation
(132). Since we are considering the general initial value problem as giving
u, ux and uy along some curve in the (x, y) plane, there is really no point in
restricting to curves that are the graph of a function y = h(x). Why not consider
curves that are the graph of a function x = k(y), or indeed a curve that is neither
the graph of a function of x nor of a function of y?
Figure 43
Let us consider case iii). Let ϕ(x, y) be a function such that C is a level curve
of ϕ
C : ϕ(x, y) = 0.
Pick any point (x0 , y0 ) on C. Then the piece of the curve close to (x0 , y0 ) will
either be the graph of a function of x or the graph of a function of y
Let us assume, without loss of generality, that the piece of C close to (x0 , y0 ) is
the graph of a function of x, y = h(x). But then we must have
ϕ(x, h(x)) = 0,
⇓
ϕx + h0 ϕy = 0,
⇓
h′ = −ϕx /ϕy .
Figure 44
Substituting this into (124) gives
C + 2B(ϕx /ϕy ) + A(ϕx /ϕy )2 = 0,
⇕
Aϕ2x + 2Bϕx ϕy + Cϕ2y = 0.            (133)
The conclusion of this is that curves, C, for which the generalized initial value
problem (125), (126) either has no solution or has a nonunique solution, must
be level curves
ϕ(x, y) = 0,
where ϕ is a solution of (133). And all solutions of (133) determine such special
initial curves.
We have met equation (133) before on page 9 of these notes where it was
called the characteristic equation for (125).
Thus the special initial curves discussed in this section are in fact character-
istic curves as dened previously.
The characteristic curves are seen to play three different roles for PDEs of
type (125):
i) Level curves for the special coordinate systems used to reduce the PDE to
canonical form.
ii) Curves across which solutions of the PDE can support singularities.
iii) Curves for which the initial value problem for the PDE either has no
solution or has a non-unique solution.
So far the statements i), ii) and iii) only apply to linear second order scalar
PDEs in two independent variables. Can we extend these results to more general
equations?
The answer is yes we can! Many of these ideas can be extended far beyond
the class of equations we have discussed so far. In [1], classification and associated
characteristic curves and surfaces for large classes of equations are discussed on
pages 137-151.
In this course we will not discuss all these various cases but will rather
restrict to linear scalar second order equations in n independent variables where
n can be any positive integer.
Equations of this type take the form
Σni,j=1 aij ∂xi ∂xj u + Σni=1 bi ∂xi u + cu + d = 0,            (134)
Observe that, using the summation convention and relabeling the summation
indices in the second term,
aij ∂xi ∂xj u = (1/2)aij ∂xi ∂xj u + (1/2)aij ∂xi ∂xj u
= (1/2)aij ∂xi ∂xj u + (1/2)aji ∂xj ∂xi u
= (1/2)aij ∂xi ∂xj u + (1/2)aji ∂xi ∂xj u
= ((1/2)aij + (1/2)aji )∂xi ∂xj u
= ãij ∂xi ∂xj u,
where ãij = (1/2)(aij + aji ). The ãij are evidently the components of a
symmetric matrix,
ãij = ãji .
Thus, we can without loss of generality assume that the matrix A = (aij ) in
equation (135) is a symmetric matrix.
Let us first assume that all aij are constants. Let λk be the eigenvalues of
A and R = (rki ) be the orthogonal matrix whose rows are the corresponding
eigenvectors. In terms of matrices we have
RARt = diag(λ1 , . . . , λn ).
Introduce new coordinates
ξp = rpk xk .
In these coordinates the principal part becomes Σnp=1 λp ∂ξp ∂ξp u.
Thus the second order mixed partial derivatives all vanish. This is the canonical
form for (135). If we return to the n = 2 case we observe that the hyperbolic
case corresponds to eigenvalues of opposite signs, the parabolic case to a zero
eigenvalue, and the elliptic case to eigenvalues of the same sign.
Based on this we now make the following classification of equations of type
(135):
i) Elliptic type: λi > 0 for all i, or λi < 0 for all i.
ii) Hyperbolic type: one λi > 0 and all other λi < 0, or one λi < 0 and all
other λi > 0.
iii) Parabolic type: at least one λi = 0.
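This eigenvalue classification is easy to carry out numerically. A sketch; the
name classify_nd is ours, and so is the extra label "ultrahyperbolic" for sig-
natures with several eigenvalues of each sign, which fall outside the three classes
above.

```python
import numpy as np

def classify_nd(A, tol=1e-12):
    """Classify a PDE of type (135) from its coefficient matrix A = (aij)."""
    A = 0.5 * (A + A.T)              # symmetrize, as justified in the text
    lam = np.linalg.eigvalsh(A)
    if np.any(np.abs(lam) < tol):
        return "parabolic"
    pos = np.sum(lam > 0)
    if pos == len(lam) or pos == 0:
        return "elliptic"
    if pos == 1 or pos == len(lam) - 1:
        return "hyperbolic"
    return "ultrahyperbolic"

print(classify_nd(np.diag([1.0, 1.0, 1.0])))          # Laplace: elliptic
print(classify_nd(np.diag([1.0, -1.0, -1.0, -1.0])))  # wave: hyperbolic
print(classify_nd(np.diag([0.0, -1.0, -1.0, -1.0])))  # heat: parabolic
```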
For the elliptic case we can make the further change of coordinates
αi = (1/|λi |^(1/2) ) ξi ,            (136)
leading to a principal part of the form
± Σni=1 ∂αi ∂αi .            (137)
Assuming λ1 > 0 and λ2 , . . . , λn < 0, the hyperbolic case is reduced, using (136),
to the form
∂α1 ∂α1 − Σni=2 ∂αi ∂αi .            (138)
Similar reductions are possible for the parabolic case. Basic n = 3, 4 examples
of elliptic, hyperbolic and parabolic equations are the Laplace equation
uxx + uyy + uzz = 0, the wave equation utt − uxx − uyy − uzz = 0 and the heat
equation ut − uxx − uyy − uzz = 0.
The classification of equations of type (135) into elliptic, hyperbolic and parabolic
types based on the eigenvalues of the matrix A is still possible when the aij depend
on (x1 , . . . , xn ) and are thus not constant. The eigenvalues, and therefore the
classification, will however in general depend on which point we are at. Also, it
is in general not possible to reduce such an equation to a canonical form where
all terms with mixed partial derivatives vanish. This is seen to be true by a
simple counting argument.
Let
ξ1 = ξ1 (x1 , . . . , xn ),
...            (139)
ξn = ξn (x1 , . . . , xn ),
be a change of coordinates in Rn . Using the chain rule the principal part of
equation (135) can be written as
Σij Fij (ξ1 , . . . , ξn )∂ξi ∂ξj u,            (140)
where Fij is some expression containing derivatives of the functions ξj . Let us
try to eliminate all mixed partials. For this to happen, we must have
Fij (ξ1 , . . . , ξn ) = 0, i ≠ j.            (141)
There are exactly (1/2)n(n − 1) terms with mixed partial derivatives. Thus there
are (1/2)n(n − 1) equations in (141). On the other hand we only have n unknown
functions ξ1 , . . . , ξn . If
(1/2)n(n − 1) > n,
the system (141) in general has no solution. This happens for n ≥ 4.
For equations of type (135), surfaces supporting singularities of solutions,
and on which given initial data does not give a unique solution for the initial
value problem, are given as level surfaces ϕ(x1 , . . . , xn ) = 0 of solutions to
Σni,j=1 aij ϕxi ϕxj = 0.            (142)
These surfaces are called characteristic surfaces and equation (142) is called the
characteristic equation for equation (135). For example, for the 3D wave equation
utt − uxx − uyy − uzz = 0,
one family of solutions to the characteristic equation is
ϕ = ωt − k1 x − k2 y − k3 z + c,
where ω 2 = k12 + k22 + k32 . For each value of t the characteristic surface
ϕ = 0,
⇕
k1 x + k2 y + k3 z = ωt + c,
is a plane normal to the vector k = (k1 , k2 , k3 ). Such surfaces play a pivotal
role in the theory of wave equations where they appear through special solutions
called plane waves.
We now restrict attention to linear second order equations with constant coefficients,
Autt + 2Buxt + Cuxx + Dut + Eux + F u = 0,            (143)
where A, B, C, D, E and F are real constants. Certain aspects of our conclusions
for this class can be extended to a wider class of equations [1].
We start our investigations by looking for special solutions to (143) of expo-
nential form
u(x, t) = a(k)e(ikx+λ(k)t) . (144)
This is a complex solution but both the real and imaginary part of u are real
solutions of (143). (Verify this). We will in this section interpret the variable
t as time and study the evolution of the solutions (144) for t > 0. The special
solutions (144) are called normal modes for (143). By the principle of linear
super-position we can add together several normal modes corresponding to dif-
ferent values of k to create more general solutions. Taking an infinite set of
discrete values {kj } we get a solution of the form
u(x, t) = Σ∞j=−∞ a(kj )e(ikj x+λ(kj )t) .            (145)
This is a Fourier series, and the sense in which such series solves (143) will be
discussed extensively later in the class. There we will also discuss a much more
general notion of normal mode that applies to a much wider class of equations.
We will see that the normal modes (144) can be combined in a continuous sum
by which we mean an integral
u(x, t) = ∫_{−∞}^{∞} dk a(k)e(ikx+λ(k)t) ,            (146)
u(x, t) = a(k)e(ikx+λ(k)t) ,
⇓
ut = λ(k)u, ux = iku, utt = λ2 (k)u,
uxt = ikλ(k)u, uxx = −k 2 u.
Inserting these expressions into (143) gives a quadratic equation for λ(k),
Aλ2 + (2ikB + D)λ + (−k 2 C + ikE + F ) = 0.
This equation has two complex solutions. For each of these we can write
|u(x, t)| = |a(k)| eRe[λ(k)]t .
We will assume that the expression |a(k)| is bounded for all k . The growth of
|u(x, t)| as a function of k for t > 0 is then determined by Re[λ(k)].
There are now two distinct possibilities. Either Re[λ(k)] is bounded above
for all k ∈ R, or it is not.
Define
Ω = supk∈R Re[λ(k)].            (151)
If Re[λ(k)] is unbounded we let Ω = +∞. Let us first consider the case Ω = +∞.
Previously in this class we introduced the notion of well-posedness for initial
and/or boundary value problems for PDEs. In order to be well-posed, three
criteria had to be satisfied:
1) A solution must exist.
2) The solution must be unique.
3) The solution must depend in a continuous way on the data of the problem:
Small data must produce a small solution.
Data here means parameters occurring in the PDEs, description of the boundary
curves or surfaces, if any, and functions determining boundary and/or initial
conditions.
For the case Ω = +∞, let a(k) = 1/λ2 (k) and consider the initial value problem
for the normal mode with data at t = 0. The data are, from (144),
u(x, 0) = (1/λ2 (k)) eikx ,
ut (x, 0) = (1/λ(k)) eikx .
Since Re[λ(k)] is unbounded (Ω = +∞) there are values of k such that |u(x, 0)|
and |ut (x, 0)| are as small as we like. Thus the data in this initial value problem
can be made arbitrarily small. However
|u(x, t)| = (1/|λ(k)|2 ) eRe[λ(k)]t ,
grows exponentially as a function of k and can for any fixed t > 0 be made
arbitrarily large. This is because the algebraic decay of 1/|λ(k)|2 is always
overwhelmed by the exponential growth of eRe[λ(k)]t (basic calculus). Thus
arbitrarily small data can be made arbitrarily large for any t > 0. Condition 3)
on page 52 is not satisfied and consequently the problem is not well posed. If
the case Ω = +∞ is realized, it is back to the drawing board; something is
seriously wrong with the model! If Ω < +∞ one can show that the problem is
well posed.
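The stability index can be estimated numerically by maximizing Re[λ(k)] over a
finite k-grid, assuming the normal-mode substitution u = a(k)e(ikx+λt) turns the
equation into the quadratic Aλ² + (2ikB + D)λ + (−Ck² + ikE + F) = 0; this
coefficient layout is our reconstruction and should be checked against the exact
equation used. A sketch (function name ours):

```python
import numpy as np

def stability_index(A, B, C, D, E, F, kmax=50.0, n=2001):
    """Estimate Omega = sup_k Re[lambda(k)] on a finite k-grid, where lambda
    solves the assumed normal-mode quadratic for each real wavenumber k."""
    best = -np.inf
    for k in np.linspace(-kmax, kmax, n):
        coeffs = [A, 2j * B * k + D, -C * k**2 + 1j * E * k + F]
        for lam in np.roots(coeffs):   # np.roots strips a leading zero if A = 0
            best = max(best, lam.real)
    return best

# heat-type equation u_t - c^2 u_xx + b u = 0 (c = 1, b = 0.5) has Omega = -b
omega_est = stability_index(0.0, 0.0, -1.0, 1.0, 0.0, 0.5)
print(omega_est)
```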
Let us look at some specific examples.
The equation
uxx + utt + ρu = 0,
where ρ is constant, is an elliptic equation. Our equation for λ(k) for this case
is
λ2 + (−k 2 + ρ) = 0,
⇓
λ(k) = ±(k 2 − ρ)^(1/2) .
λ(k) is clearly unbounded when k ranges over R. Thus Ω = +∞ and the Cauchy
problem for the equation with data at t = 0 is not well posed.
The equation
ρuxx + ut = 0,
where ρ is a constant, is a parabolic equation. In fact, we can also write it as
ut = −ρuxx .
For ρ < 0 it is the heat equation. For ρ > 0 it is called the reverse heat
equation. We get the reverse heat equation from the heat equation by making the
transformation t′ = −t. Increasing values of t′ then correspond to decreasing
values of t. Solving the reverse heat equation thus corresponds to solving the
heat equation backward in time.
We now get the following equation for λ(k)
− ρk 2 + λ = 0 ⇒ λ = ρk 2 . (152)
Clearly λ(k) is unbounded for ρ > 0 ⇒ Ω = +∞, which implies that the reverse
heat equation is not well posed. This says something profound about the process
of heat diffusion, or any other process described by the diffusion equation. Such
processes are irreversible; they have a built-in direction of time. It is a well
known fact that no fundamental physical theory is irreversible, they have no
direction of time. So how come the diusion equation has one? Where and how
did the direction of time enter into the logical chain connecting microscopic
reversible physics to macroscopic irreversible diusion processes? This question
has been discussed for more than a hundred years and no consensus has been
reached. Nobody really knows.
For ρ < 0, λ(k) is bounded above by zero ⇒ Ω = 0 and the Cauchy problem
for the heat equation is well-posed.
The equation
utt − uxx + ρu = 0,
where ρ is a constant, is a hyperbolic equation. The equation for λ(k) is
λ2 + k 2 + ρ = 0,
⇓
λ = ±i(k 2 + ρ)^(1/2) .
For ρ < 0 the real part of λ vanishes for |k| large enough, and for ρ > 0 the real
part vanishes for all k. Thus in either case Ω < +∞ and the Cauchy problem
for this equation is well-posed.
The constant Ω is called the stability index for the equation. If Ω > 0 there
is some set or range of k values whose corresponding normal modes experience
exponential growth. If this is the case the equation is said to be unstable. If
Ω < +∞ the Cauchy problem is well posed, but the unbounded growth can still
invalidate the PDE as a mathematical model for some t > 0. The reason for
this is that PDE models are usually derived using truncated Taylor expansions
or expansions of some other sort. In order for these truncations to remain valid
the solutions must remain small. This assumption will break down in the case
of unbounded growth. If Ω < 0 all normal modes decay exponentially. If this
is the case the PDE is strictly stable. If Ω = 0 there may be some modes that
do not decay and will continue to play a role for all t > 0. If this is the case
the PDE is neutrally stable.
Consider the parabolic equation
ut − c2 uxx + aux + bu = 0.
The equation for λ(k) is
λ + c2 k 2 + iak + b = 0,
⇓
λ(k) = −c2 k 2 − b − iak,
Re[λ(k)] = −b − c2 k 2 .
Clearly Ω = −b. The equation is strictly stable if b > 0, unstable if b < 0 and
neutrally stable for b = 0.
The equation
uxx + 2uxt + utt = 0,
is parabolic and has an equation for λ given by
−k 2 + 2ikλ + λ2 = 0,
⇕
(λ + ik)2 = 0,
⇕
λ(k) = −ik,
and thus Ω = 0. The equation is therefore neutrally stable. It has a normal
mode solution
u1 (x, t) = aeik(x−t) .
It is straightforward to verify that it also has a solution
u2 (x, t) = bteik(x−t) .
Using u2 with k = 0 we see that the initial value problem with the small data
u(x, 0) = 0,
ut (x, 0) = 1,
has the solution u(x, t) = t, which grows without bound. Thus neutral stability
does not exclude growing solutions. When no such secular solutions are present,
the amplitude of each normal mode is preserved for all time. In many cases of
importance, |u|2 is a measure of the energy of the normal mode and we have
that the energy in the system is conserved in time.
In general, an equation where Re[λ(k)] = 0 is called conservative. For the
conservative case we define the positive quantity
ω = |Im[λ(k)]|,
so that
λ(k) = ±iω(k),
depending on the sign of the imaginary part of the complex root λ(k) of equation
(152).
The normal mode then takes the form
u(x, t) = a(k)ei(kx±ω(k)t) .            (153)
The equation
ω = ω(k),            (154)
is called the dispersion relation for the equation. If
ω ′′ (k) ≠ 0,            (155)
and thus ω(k) is not a linear function of k , the equation is said to be of dispersive
type. The normal mode can, for the conservative case, be written as
u(x, t) = a(k)eiθ ,            (156)
where θ = kx ± ω(k)t is called the phase of the normal mode. It is clear that
for k > 0, (156) represents a wave moving from right to left (+) or from left to
right (−), with speed
vf (k) = ω(k)/k.            (157)
The quantity vf is called the phase speed of the wave. If ω ′′ (k) = 0 then
ω(k) = ck,
where c is some constant. For this case the phase speed is independent of k
vf = ω(k)/k = ck/k = c.
Thus all normal modes move at the same speed.
For the case when ω ′′ (k) ≠ 0, each normal mode moves at its own speed,
and the linear superpositions (145) and (146) represent spreading or dispers-
ing waves. For such solutions the phase speed is not the relevant quantity to
consider. We will see later that for such waves the group velocity
vg = dω/dk,            (158)
is the relevant quantity.
If the stability index Ω ≤ 0 and Re[λ(k)] < 0 for all, except a finite number
of k -values, the equation is said to be of dissipative type.
Let us consider the so-called Telegrapher's equation
utt + 2λ̂ut − γ 2 uxx = 0, λ̂ > 0.
The equation for λ(k) is
λ2 + 2λ̂λ + γ 2 k 2 = 0,
⇓
λ(k) = −λ̂ ± (λ̂2 − γ 2 k 2 )^(1/2) .
We see that Re[λ(k)] ≤ 0 for all k, with equality only at k = 0, so the
Telegrapher's equation is of dissipative type. Next consider the equation
utt − γ 2 uxx + c2 u = 0.
For this equation λ(k) = ±iω(k) with
ω = (γ 2 k 2 + c2 )^(1/2) ,
so the equation is conservative, and since ω ′′ (k) ≠ 0 it is of dispersive type.
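For a dispersion relation of this form the phase and group velocities can be
computed explicitly: vf = ω/k and vg = γ²k/ω, so that vf vg = γ² and the
group velocity always lies below the phase velocity. A sketch (function names
ours):

```python
import numpy as np

def omega(k, gamma=1.0, c=1.0):
    return np.sqrt(gamma**2 * k**2 + c**2)

def v_phase(k, gamma=1.0, c=1.0):
    return omega(k, gamma, c) / k

def v_group(k, gamma=1.0, c=1.0):
    # d(omega)/dk computed analytically
    return gamma**2 * k / omega(k, gamma, c)

k = 2.0
print(v_phase(k), v_group(k))  # phase speed exceeds group speed here
```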
L = d²/dx². (159)
For any pair of functions u, w we have
uLw − wLu = uw″ − wu″ = (uw′ − wu′)′. (160)
Thus uLw − wLu is a total derivative. Identities like (160) are of utmost importance in the theory of PDEs. Here is another example. Let
L = d/dx, (161)
and define an operator L* by
L* = −d/dx. (162)
Then
uLw − wL*u = uw′ + wu′ = (uw)′.
Thus we get a total derivative. Next consider the operator
L = ∇². (163)
For this operator L* = L, and
uLw − wL*u = ∇ · F,
where F = u∇w − w∇u.
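The one-dimensional total-derivative identities above can be checked on concrete functions; a small sketch with the illustrative choices u = x² and w = x³:

```python
# Check the two 1D identities on u = x^2, w = x^3 (illustrative choices).
def u(x):   return x**2
def du(x):  return 2*x
def d2u(x): return 2.0
def w(x):   return x**3
def dw(x):  return 3*x**2
def d2w(x): return 6*x

# L = d^2/dx^2:  u w'' - w u'' should equal (u w' - w u')' = d/dx(x^4) = 4x^3
sym = [(u(x)*d2w(x) - w(x)*d2u(x), 4*x**3) for x in (-1.0, 0.5, 2.0)]

# L = d/dx, L* = -d/dx:  u w' + w u' should equal (uw)' = (x^5)' = 5x^4
skew = [(u(x)*dw(x) + w(x)*du(x), 5*x**4) for x in (-1.0, 0.5, 2.0)]
```

Both pairs agree exactly at the sample points, as the algebra predicts.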
Figure 45
u= 0 on S, (166)
or
∂n u = 0 on S.
Let V be the vector space of smooth functions defined on D that satisfy the boundary conditions (166) (prove that this is a vector space). On the vector space V, we can define an inner product using the formula
(u, w) = ∫_D dV uw.
We know from linear algebra that self-adjoint operators have many good properties; we will discuss many of these in the next section of these lecture notes.
An important class of differential operators in the theory of PDEs consists of operators of the form
To see why this is so, take the single term w a11 ∂x1∂x1 u. We have
bi = ∂xj aij.
If this is the case we have
If the coefficients are constant, (173) implies that a self-adjoint operator cannot have any first derivative terms.
Let us look at some examples.
The operator
Lu = −∇ · (p∇u) + qu, p > 0,
is an elliptic operator (Show this).
Observe that
wLu − uLw = ∇ · (pu∇w − pw∇u),
so L is formally self-adjoint.
The operator
L̃u = ρutt + Lu,
satisfies
wL̃u − uL̃w = ρwutt + wLu − ρuwtt − uLw
= ∂t(ρwut − ρuwt) + ∇ · (pu∇w − pw∇u)
= ∇̃ · (pu∇w − pw∇u, ρwut − ρuwt),
where ∇̃ = (∇·, ∂t) is a 4D divergence operator. Thus L̃ is formally self-adjoint.
Finally, define the operator
L̂u = ρut + Lu,
with
L̂*u = −ρut + Lu.
Then
wL̂u − uL̂*w = ρ(wut + uwt) + wLu − uLw = ∇̃ · (pu∇w − pw∇u, ρuw).
Thus L̂* is the formal adjoint of L̂. Since L̂* ≠ L̂, the operator L̂ is not formally self-adjoint.
The notion of a formal adjoint can be extended to a much larger class of
operators[1] than (170).
Our first restriction is to consider only scalar PDEs. Within the class of scalar PDEs we choose a subclass that is quite general and covers situations that occur in many applications in science and technology. In chapter 8 of [1] a much larger class of PDEs is investigated using methods similar to the ones we discuss here.
Let L be a second order differential operator defined by
Lu = −∇ · (p∇u) + qu.
We will in general assume that p > 0 and q ≥ 0. The types of equations we want to consider are:
Elliptic case:
Lu = ρ(x)F (x), ρ > 0. (176)
Parabolic case:
ρ(x)∂t u + Lu = ρ(x)F (x, t), ρ > 0. (177)
Hyperbolic case:
ρ(x)∂tt u + Lu = ρ(x)F(x, t), ρ > 0. (178)
It is an important point here that the functions p, ρ and q do not depend on time.
These equations, of elliptic, parabolic and hyperbolic type, appear as the first approximation to a large set of mathematical modelling problems in science and technology.
Let us consider the case of mass ow. Let D be some domain with boundary
S.
Figure 46
This requirement explains the assumption ρ > 0 in equations (176), (177) and (178).
We have the balance equation for mass
dM/dt = −∫_S dS j · n + ∫_D dV H, (180)
where j is the mass flux and H is a source of mass. In general, j = j(x, u, ∇u), but for low concentration and small gradients we can Taylor expand and find
j = −p∇u. (181)
Here a constant term and a term depending only on u have been dropped on physical grounds: all mass flux is driven by gradients, and we have introduced a minus sign and assumed p > 0 in order to accommodate the fact that mass always flows from high to low concentration. In general H = H(x, t, u), but for low concentration we can also here Taylor expand and find
H = −qu + F̃, (182)
where we write
F̃ = ρF. (183)
The minus sign in the first term is introduced to model a situation where high concentration leads to high loss. Imagine a sink hole where the amount leaving increases in proportion to the mass present.
Using (181), (182) and (183) in (180) we get
∫_D dV ρ∂tu = ∫_S dS p∇u · n + ∫_D dV {−qu + ρF},
⇓ divergence theorem
∫_D dV {ρ∂tu − ∇ · (p∇u) + qu − ρF} = 0.
This is assumed to hold for all domains D. Therefore we must have
ρ∂tu + Lu = ρF,
Figure 47
Figure 48
It is not uncommon that conditions of the first, second and third kind are specified at different parts of the boundary.
Figure 49
For heat conduction problems, part of the boundary could have a given temperature (first kind) whereas other parts could be insulated (second kind).
For the one dimensional case, ∂G consists of the two end points of the interval G = (0, l), so ∂G = {0, l}. For the hyperbolic and parabolic case the general conditions (184) are
The minus sign at x = 0 occurs because the unit normal at the point x = 0 is given by the number −1.
If F and B in (176), (177), (178) and (184) are zero, the corresponding boundary value problem is homogeneous. The homogeneous condition is
α(x)u + β(x)∂nu = 0 on ∂G,
in 2D and 3D.
In 1D, the homogeneous conditions are
α1u(0, t) − β1ux(0, t) = 0,
α2u(l, t) + β2ux(l, t) = 0, t > 0. (188)
with the boundary conditions (187) or (188) and the single initial condition
where ρ, p, q are functions of x alone. The equation (192) is supplied with the boundary conditions (187) for the 2D case or (188) for the 1D case. The analog of the initial conditions in the time dependent case is the following pair of conditions at y = 0 and y = l̃:
u(x, 0) = f(x), u(x, l̃) = g(x), x ∈ G. (193)
This assumed form for the elliptic equation certainly does not cover all possibilities that occur in applications. We observe for example that the domain is always a generalized cylinder.
Figure 50
However, the assumed form makes a unied presentation for the hyperbolic,
parabolic and elliptic case possible.
After all this preparatory work, let us start separating variables!
The method of separation of variables starts by looking for special solutions of the form
u(x, t) = M(x)N(t), (194)
u(x, y) = M(x)N(y). (195)
Inserting (194) and (195) into (186), (190) and (192) gives, upon division by NM,
N″(t)/N(t) = −LM(x)/(ρ(x)M(x)), hyperbolic case, (196)
N′(t)/N(t) = −LM(x)/(ρ(x)M(x)), parabolic case, (197)
−N″(y)/N(y) = −LM(x)/(ρ(x)M(x)), elliptic case. (198)
Since the two sides of equations (196)–(198) depend on different variables, both sides must be equal to the same constant. This constant we write, by convention, in the form −λ. This gives us in all three cases the following equation for M:
LM(x) = λρ(x)M(x), (199)
and
N″(t) = −λN(t), hyperbolic case, (200)
N′(t) = −λN(t), parabolic case, (201)
N″(y) = λN(y), elliptic case. (202)
The function M(x) in equation (199) is subject to boundary conditions (187) or (188). Since both conditions are homogeneous, as is equation (199), M(x) = 0 is a solution. This solution, also called the trivial solution, gives u = 0 and thus plays no role.
Defining A = (1/ρ)L, we observe that equation (199) has the structure of an eigenvalue problem
AM = λM.
The eigenvalue problem occurring in equation (199), determining the functions M(x) in the separation
u = NM,
for the hyperbolic case (186), the parabolic case (190) and the elliptic case (192), is very special. We will now determine some of the properties of the spectrum of L. The spectrum of a linear operator is the collection of eigenvalues and corresponding eigenspaces of the operator. Recall that the eigenspace corresponding to a given eigenvalue consists of all eigenvectors of the operator that have the same eigenvalue.
Using vector calculus we observe that
∫_G dV {wLu − uLw} = ∫_{∂G} dS p(u∂nw − w∂nu). (205)
Let us assume that both u and w satisfy the boundary conditions (184) with B = 0. Thus
α(x)u + β(x)∂nu |_{∂G} = 0,
α(x)w + β(x)∂nw |_{∂G} = 0. (206)
For a given u, w that satisfy (206) we can think of (206) as a linear system for α and β:
[u ∂nu; w ∂nw]|_{∂G} [α; β] = 0. (207)
But our assumptions on the functions α and β imply that they cannot both be zero at any point. Thus, the system (207) must have a nonzero solution. This implies that the determinant of the matrix defining the linear system (207) must be zero, or
(u∂nw − w∂nu)|_{∂G} = 0, (208)
⇒ ∫_G dV {wLu − uLw} = 0. (209)
‖f‖ = (f, f)^{1/2} = (∫_G dV ρ(x)f²(x))^{1/2}, (211)
‖f‖ = 0 ⇔ f = 0. (212)
Thus, what we have defined is a norm in the usual sense of linear algebra. Using this inner product, (209) can be written as
(w, (1/ρ)Lu) = ((1/ρ)Lw, u).
Let now Mk, Ml be eigenfunctions of L corresponding to different eigenvalues, λk ≠ λl. Since both Mk and Ml satisfy the homogeneous boundary conditions (206), we have
∫_G dV {Mk L Ml − Ml L Mk} = 0,
but
LMk = ρλk Mk,
LMl = ρλl Ml.
Thus
0 = ∫_G dV {Mk L Ml − Ml L Mk} = ∫_G dV ρ(λl − λk)Mk Ml = (λl − λk)(Mk, Ml).
But λk ≠ λl ⇒ λl − λk ≠ 0, and therefore we must have (Mk, Ml) = 0.
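The orthogonality of eigenfunctions belonging to different eigenvalues can be illustrated numerically. A sketch for the simplest case L = −d²/dx² with Dirichlet conditions on (0, l), where Mk(x) = sin(kπx/l); Simpson quadrature stands in for the inner product:

```python
import math

def simpson(f, a, b, n=1000):
    # Composite Simpson rule (n must be even)
    h = (b - a) / n
    s = f(a) + f(b)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(a + i*h)
    return s * h / 3

l = 1.0
def M(k):
    # Eigenfunction of -u'' = lambda u with u(0) = u(l) = 0
    return lambda x: math.sin(k * math.pi * x / l)

# (M_k, M_j) with weight rho = 1; should vanish for k != j
ip = simpson(lambda x: M(2)(x) * M(5)(x), 0.0, l)
norm2 = simpson(lambda x: M(3)(x)**2, 0.0, l)   # should equal l/2
```

The cross inner product is zero to quadrature accuracy, while each eigenfunction has norm squared l/2.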
Applying the same identity to a complex valued function u and its complex conjugate u*, both of which satisfy the homogeneous boundary conditions, we get
{u*∂nu − u∂nu*}|_{∂G} = 0,
⇓
∫_G dV {uLu* − u*Lu} = 0.
Let now λ be a complex eigenvalue and M the corresponding complex valued eigenfunction. The identity above then implies that
0 = ∫_G dV {M(ρλM)* − ρλMM*}
= ∫_G dV {ρλ*MM* − ρλMM*}
= (λ* − λ) ∫_G dV ρMM*.
But
∫_G dV ρ|M|² = 0 ⇒ M = 0,
so we must have
λ* = λ,
which means that the eigenvalues are real. The eigenfunctions can then always
be chosen to be real.
Let now u be a (real valued) function that satisfies the boundary condition (184) with B = 0. Thus u ∈ V, as defined on page 108. Then we have
∫_G dV uLu = −∫_G dV {∇ · (pu∇u) − p(∇u)²} + ∫_G dV qu²
= ∫_G dV {p(∇u)² + qu²} − ∫_{∂G} dS pu∂nu.
Split the boundary ∂G into three pieces S1, S2 and S3, where on Si the boundary conditions are of kind i, i = 1, 2, 3 (see page 7). Then on S1 ∪ S2, pu∂nu = 0, and on S3, pu∂nu = −p(α/β)u²:
⇒ ∫_G dV uLu = ∫_G dV {p(∇u)² + qu²} + ∫_{S3} dS p(α/β)u². (214)
The assumptions we have put on p, α, β and q earlier imply that the right hand side of (214) is nonnegative. Thus, using the inner product defined on page 16, we have found that
((1/ρ)Lu, u) ≥ 0. (215)
This means, by definition, that (1/ρ)L is a positive operator.
Let now Mk be an eigenfunction corresponding to the eigenvalue λk. From (215) we have
(Mk, (1/ρ)LMk) = ∫_G dV Mk LMk = ∫_G dV Mk ρλk Mk = λk ∫_G dV ρMk² ≥ 0,
and
∫_G dV ρMk² > 0,
so that λk ≥ 0.
Figure 51
A limit point for a sequence {an} is a point a* ∈ R such that for every ε > 0 there exist infinitely many n such that
|an − a*| < ε.
Thus the eigenvalues only "accumulate" around λ = ∞.
Let now {λk} be the countably infinite set of eigenvalues. One can also prove that there is at most a finite set of linearly independent eigenvectors corresponding to each λk. We can assume that these have been orthogonalized using the Gram-Schmidt orthogonalization process.
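The Gram-Schmidt process referred to here works for any inner product on functions. A minimal sketch, using the unweighted inner product on (−1, 1) and the monomials 1, x, x² as an illustrative starting set (for this choice the result is proportional to the Legendre polynomials met later in these notes):

```python
def ip(f, g, a=-1.0, b=1.0, n=2000):
    # Inner product (f, g) = \int_a^b f g dx via the composite Simpson rule
    h = (b - a) / n
    s = f(a)*g(a) + f(b)*g(b)
    for i in range(1, n):
        x = a + i*h
        s += (4 if i % 2 else 2) * f(x)*g(x)
    return s * h / 3

def gram_schmidt(fs):
    # Orthonormalize a list of functions with respect to ip
    ons = []
    for f in fs:
        g = f
        for e in ons:
            c = ip(g, e)
            # subtract the projection onto e (factory lambda freezes g, e, c)
            g = (lambda g, e, c: lambda x: g(x) - c*e(x))(g, e, c)
        nrm = ip(g, g) ** 0.5
        ons.append((lambda g, nrm: lambda x: g(x)/nrm)(g, nrm))
    return ons

e0, e1, e2 = gram_schmidt([lambda x: 1.0, lambda x: x, lambda x: x*x])
# e2 is proportional to the Legendre polynomial P2(x) = (3x^2 - 1)/2
```

The resulting functions are orthonormal to quadrature accuracy; e2(1) equals √(5/2), the value of the normalized P2 at x = 1.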
Let us in the following, for simplicity, only consider the case when each eigenspace is of dimension one and where, for all k, λk > 0. Thus λ = 0 is not an eigenvalue. Then the equations (200-202) for the function N give upon solution the following:
Nk(t) = ak cos(√λk t) + bk sin(√λk t), hyperbolic case, (216)
Nk(t) = ak e^{−λk t}, parabolic case, (217)
Nk(y) = ak e^{√λk y} + bk e^{−√λk y}, elliptic case. (218)
For each k we thus have a separated solution
uk = Nk Mk,
that solves equations (186), (190) or (192) and the boundary conditions (187) or (188). These functions will however not in general satisfy the initial conditions. In order to also satisfy the initial conditions (189) for the hyperbolic case, (191) for the parabolic case and (193) for the elliptic case, we consider the formal superposition
u = Σ_{k=1}^∞ uk = Σ_{k=1}^∞ Mk Nk. (219)
u(x, 0) = Σ_{k=1}^∞ Mk(x)Nk(0) = Σ_{k=1}^∞ ak Mk(x) = f(x), (220)
ut(x, 0) = Σ_{k=1}^∞ Mk(x)Nk′(0) = Σ_{k=1}^∞ √λk bk Mk(x) = g(x). (221)
(220) and (221) require us to find eigenfunction expansions for the functions f and g in terms of the eigenfunctions {Mk}_{k=1}^∞. In what sense f and g can be represented by their eigenfunction expansions is an important, and in general very hard, question that will be discussed more fully only for the 1D case in this class.
You might recall from elementary calculus that for infinite series of functions, several different notions of convergence are available. The most important ones are pointwise convergence, uniform convergence and mean-square convergence. We will see that the last type of convergence holds under very mild conditions on f and g. However, mean-square convergence is not enough to ensure that (219) is an actual, honest to God, solution to our initial-boundary value problem (186), (187), (189). For this to be the case we need a stronger convergence than mean-square. This stronger form of convergence will allow us to differentiate the formal superposition series (219) term by term. If this is possible, we have a classical solution. If it is not possible, (219) is a generalized solution that might still carry important information about the problem.
In order to express the coefficients {ak} and {bk} in terms of f and g, we use the fact that the eigenfunctions {Mk} are orthogonal for k ≠ j:
(f, Mj) = Σ_{k=1}^∞ ak(Mk, Mj) = aj(Mj, Mj),
⇒ aj = (f, Mj)/(Mj, Mj). (222)
In the same way we find
bj = (g, Mj)/(√λj (Mj, Mj)). (223)
The formal solution to the hyperbolic initial-boundary value problem (186-188) is then
u(x, t) = Σ_{k=1}^∞ (ak cos(√λk t) + bk sin(√λk t)) Mk(x), (224)
For the elliptic case we get the conditions
u(x, 0) = Σ_{k=1}^∞ (ak + bk)Mk(x) = f(x),
u(x, l̃) = Σ_{k=1}^∞ (ak e^{√λk l̃} + bk e^{−√λk l̃})Mk(x) = g(x),
⇓
ak + bk = (f, Mk),
e^{√λk l̃} ak + e^{−√λk l̃} bk = (g, Mk). (226)
These equations can be solved to give unique values for {ak} and {bk}. Our formal solution for the elliptic case is then
u(x, y) = Σ_{k=1}^∞ (ak e^{√λk y} + bk e^{−√λk y}) Mk(x). (227)
We will now take a closer look at the formal solutions (224), (225) and (227)
for the 1D case.
L[u(x)] = −(d/dx)(p(x) du/dx) + q(x)u(x) = λρ(x)u(x), (228)
α1u(0) − β1u′(0) = 0,
α2u(l) + β2u′(l) = 0. (229)
The problem defined by (228) and (229) is known as the Sturm-Liouville problem. If we require the following additional conditions
iii) αi ≥ 0, βi ≥ 0, αi + βi > 0,
we have a regular Sturm-Liouville problem. If we allow the functions p and/or ρ to vanish at the endpoints x = 0, x = l, we get a singular Sturm-Liouville problem. This generalization does occur in important applications of this theory.
Before we start developing the theory, let us introduce some important con-
structions. These constructions have already been introduced in the more gen-
eral setting discussed earlier, but will be repeated here in the 1D context.
For real valued functions on 0≤x≤l which are bounded and integrable,
we introduce the inner product
(ϕ, ψ) = ∫_0^l ρ(x)ϕ(x)ψ(x) dx. (231)
Observe that (231) is evidently bilinear and symmetric. For continuous functions
we have
∫_0^l ρ(x)ϕ(x)² dx = 0 ⇒ ϕ(x) = 0 for all x.
Thus, (231) really denes an inner product in the sense of linear algebra on the
linear vector space consisting of continuous functions on 0 ≤ x ≤ l.
Introduce the norm of a function by
‖ϕ‖ = (∫_0^l ρ(x)ϕ²(x) dx)^{1/2}. (232)
Any nonzero function ϕ can be normalized by setting
ϕ̂ = ϕ/‖ϕ‖,
⇓
‖ϕ̂‖ = ‖ϕ/‖ϕ‖‖ = ‖ϕ‖/‖ϕ‖ = 1.
Two functions ϕ and ψ are orthogonal on the interval 0 < x < l if
(ϕ, ψ) = 0. (234)
A set of functions {ϕk} is an orthogonal set if
(ϕk, ϕj) = 0, k ≠ j. (235)
If also ‖ϕk‖ = 1 for all k, the set is orthonormal. An orthogonal set can always be turned into an orthonormal set by using the approach described above.
If we allow the functions to assume complex values, we replace the inner product (231) by
(ϕ, ψ) = ∫_0^l ρ(x)ϕ(x)ψ̄(x) dx, (236)
where for any complex number z, z̄ is its complex conjugate. Observe that this product is linear in the first argument and conjugate symmetric. Thus (236) defines a Hermitian inner product in the sense of linear algebra.
Let us return to the real valued case. Let an orthonormal set of functions {ϕk} be given and let ϕ be a square-integrable function on 0 < x < l. Then the infinite set of numbers (ϕ, ϕk) are called the Fourier coefficients of ϕ with respect to {ϕk}. The formal series
Σ_{k=1}^∞ (ϕ, ϕk) ϕk, (240)
is called the Fourier series of ϕ with respect to the set {ϕk}. Note that we use the summation sign in (240) in a formal sense - we have at this point not assumed anything about the convergence of the infinite series of functions (240).
The name, Fourier series, is taken from the special case when {ϕk} is a set of trigonometric functions. Often in the literature the term Fourier series refers only to this trigonometric case, but we will use the term Fourier series for all formal series (240).
So, in what sense does (240) converge? And if it converges, does it converge to the function ϕ that we used to construct the formal series?
Let us start by considering a finite sum,
SN = Σ_{k=1}^N (ϕ, ϕk) ϕk.
For this sum we have
‖ϕ − Σ_{k=1}^N (ϕ, ϕk)ϕk‖² = (ϕ − Σ_{k=1}^N (ϕ, ϕk)ϕk, ϕ − Σ_{j=1}^N (ϕ, ϕj)ϕj)
= (ϕ, ϕ) − Σ_{k=1}^N (ϕ, ϕk)² − Σ_{j=1}^N (ϕ, ϕj)² + Σ_{k,j=1}^N (ϕ, ϕk)(ϕ, ϕj)(ϕk, ϕj)
= ‖ϕ‖² − 2Σ_{k=1}^N (ϕ, ϕk)² + Σ_{k=1}^N (ϕ, ϕk)²
= ‖ϕ‖² − Σ_{k=1}^N (ϕ, ϕk)².
Since the left hand side is nonnegative,
‖ϕ‖² − Σ_{k=1}^N (ϕ, ϕk)² ≥ 0,
⇓
Σ_{k=1}^N (ϕ, ϕk)² ≤ ‖ϕ‖².
This inequality says that all partial sums of the numerical series Σ_{k=1}^∞ (ϕ, ϕk)², consisting entirely of positive terms, are bounded by ‖ϕ‖². It then follows, from the fact that all increasing bounded sequences are convergent, that the series Σ_{k=1}^∞ (ϕ, ϕk)² is convergent and that
Σ_{k=1}^∞ (ϕ, ϕk)² ≤ ‖ϕ‖². (241)
This is Bessel's inequality. The Fourier series (240) converges to ϕ in the mean if
lim_{N→∞} ‖ϕ − SN‖ = 0.
If instead of the inequality (241) we have the equality
Σ_{k=1}^∞ (ϕ, ϕk)² = ‖ϕ‖², (242)
then we have
lim_{N→∞} ‖ϕ − SN‖² = lim_{N→∞} ‖ϕ − Σ_{k=1}^N (ϕ, ϕk)ϕk‖²
= lim_{N→∞} {‖ϕ‖² − Σ_{k=1}^N (ϕ, ϕk)²} = 0.
(242) is called Parseval's equality. The argument we just gave shows that if Parseval's equality holds, for a given ϕ and a given orthonormal set {ϕk}, then the Fourier series (240) converges to ϕ in the mean.
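Bessel's inequality (241) and its saturation into Parseval's equality (242) can be watched numerically. A sketch with the illustrative choices ϕ(x) = x on (0, 1), weight ρ = 1, and the orthonormal sine set ϕk(x) = √2 sin(πkx):

```python
import math

l = 1.0
norm2_phi = 1.0 / 3.0    # ||phi||^2 for phi(x) = x on (0,1), weight rho = 1

def coeff(k, n=4000):
    # Fourier coefficient (phi, phi_k) with phi(x) = x and
    # phi_k(x) = sqrt(2/l) sin(pi k x / l), via the composite Simpson rule
    h = l / n
    f = lambda x: x * math.sqrt(2.0/l) * math.sin(math.pi * k * x / l)
    s = f(0.0) + f(l)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * f(i*h)
    return s * h / 3

partial = []
total = 0.0
for k in range(1, 21):
    total += coeff(k) ** 2
    partial.append(total)   # increasing, bounded by ||phi||^2 = 1/3
```

The partial sums increase monotonically, stay below 1/3, and approach it as more terms are added, as Parseval's equality predicts for this complete set.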
An orthonormal set {ϕk}_{k=1}^∞ is said to be complete if the Parseval equality (242) holds for all square integrable functions. We have proved that if an orthonormal set of functions is complete, then the formal Fourier series (240) for any square integrable function ϕ converges in the mean square sense to the function ϕ.
ii) The eigenvalues are real and non-negative and the eigenfunctions can be
chosen to be real.
iii) Each eigenvalue is simple. This means that each eigenspace is of dimension
one.
with lim_{k→∞} λk = ∞.
u = Σ_{k=1}^∞ (u, uk) uk,
vi) If u(x) is continuous on 0 ≤ x ≤ l, has a piecewise continuous first derivative on 0 ≤ x ≤ l, and satisfies the boundary conditions (229), then the Fourier series (240) converges absolutely and uniformly to u(x) in 0 ≤ x ≤ l.
vii) If u(x) has a jump discontinuity at a point x0 in the interior of the interval 0 ≤ x ≤ l, then (240) converges to ½(u(x0⁻) + u(x0⁺)) at this point. Here
u(x0⁻) = lim_{x→x0⁻} u(x), (approaching from the left),
u(x0⁺) = lim_{x→x0⁺} u(x), (approaching from the right),
So the only thing that remains to do is to actually find the eigenvalues and eigenfunctions for particular cases of interest. We will do so for a few important cases.
All regular Sturm-Liouville problems can be simplified if we use the following uniform approach:
Let v(x; λ), w(x; λ) be solutions of the initial value problem
−(d/dx)(p(x) du/dx) + q(x)u(x) = λρ(x)u(x), (243)
with initial conditions
v(0; λ) = 1, w(0; λ) = 0,
v′(0; λ) = 0, w′(0; λ) = 1. (244)
Then the function
u(x; λ) = β1 v(x; λ) + α1 w(x; λ), (245)
satisfies the boundary condition at x = 0:
α1u(0; λ) − β1u′(0; λ) = 0.
In order to also satisfy the boundary condition at x = l we must have
α2u(l; λ) + β2u′(l; λ) = 0. (246)
The eigenvalues are solutions of equation (246) and the corresponding eigenfunctions are determined by (245).
We will now apply the uniform approach to several cases leading to various
trigonometric eigenfunctions and then turn to a couple of important examples
of singular Sturm-Liouville problems.
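The uniform approach translates directly into a numerical scheme: integrate the initial value problem (243)-(244) and search for the values of λ at which the boundary condition (246) is met. A minimal sketch for the simplest case p = ρ = 1, q = 0 with Dirichlet conditions on (0, 1), where the exact first eigenvalue is π²:

```python
import math

def w_at_l(lam, l=1.0, n=2000):
    # Integrate -u'' = lam*u with u(0) = 0, u'(0) = 1 (the solution w(x;lam))
    # by the classical RK4 method, and return u(l).
    h = l / n
    u, v = 0.0, 1.0                  # u and u'
    f = lambda u, v: (v, -lam * u)   # right hand side of the first order system
    for _ in range(n):
        k1 = f(u, v)
        k2 = f(u + h/2*k1[0], v + h/2*k1[1])
        k3 = f(u + h/2*k2[0], v + h/2*k2[1])
        k4 = f(u + h*k3[0], v + h*k3[1])
        u += h/6*(k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
        v += h/6*(k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    return u

def eigenvalue(a, b, tol=1e-10):
    # Bisection on the eigenvalue condition w(l; lam) = 0 (Dirichlet form of (246))
    fa = w_at_l(a)
    while b - a > tol:
        m = 0.5 * (a + b)
        fm = w_at_l(m)
        if fa * fm <= 0:
            b = m
        else:
            a, fa = m, fm
    return 0.5 * (a + b)

lam1 = eigenvalue(9.0, 10.0)   # exact value: pi^2
```

This "shooting" reading of the uniform approach carries over to nonconstant p, q, ρ: only the right hand side of the first order system changes.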
10.1.1 Trigonometric eigenfunctions
For this case we choose p(x) = ρ(x) = 1 and q(x) = 0. Equation (228) in the Sturm-Liouville problem for this case reduces to
−u″(x) = λu(x). (247)
The functions
v(x; λ) = cos(√λ x), (248)
w(x; λ) = sin(√λ x)/√λ, (249)
satisfy (247) and the initial values (244). Here we assume λ > 0. We leave the
case λ=0 as an exercise for the reader. The eigenfunction for λ > 0 is found
from (245) and is
u(x) = β1 cos(√λ x) + α1 sin(√λ x)/√λ. (250)
The eigenvalues are determined from equation (246), which for this case simplifies into
√λ(α1β2 + β1α2) cos(√λ l) + (α1α2 − λβ1β2) sin(√λ l) = 0. (251)
For Dirichlet boundary conditions, β1 = β2 = 0, and (251) reduces to
sin(√λ l) = 0,
⇓
λk = (πk/l)², k = 1, 2, 3, . . . , (253)
and the corresponding normalized eigenfunctions are, from (250),
uk(x) = √(2/l) sin(πk x/l), k = 1, 2, . . . . (254)
For this case λ = 0 is not an eigenvalue, because for λ = 0 equation (247) reduces to
−u″(x) = 0,
⇓
u(x) = Ax + B,
and u(0) = 0 ⇒ B = 0, u(l) = 0 ⇒ A = 0, so that u(x) = 0.
The expansion of a function in a series
u = Σ_{k=1}^∞ (u, uk) uk,
is called the Fourier Sine series. Using explicit expressions for the eigenfunctions and the inner product we get the following formulas for the Fourier Sine series:
u(x) = √(2/l) Σ_{k=1}^∞ ak sin(πk x/l),
where
ak = √(2/l) ∫_0^l dx u(x) sin(πk x/l).
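The Fourier Sine coefficients and the reconstruction of a function from them can be sketched numerically. The profile f(x) = x(l − x) on (0, 1) is an illustrative choice; it satisfies the Dirichlet conditions, so the series converges rapidly:

```python
import math

l = 1.0
f = lambda x: x * (l - x)      # sample function (an illustrative assumption)

def a_k(k, n=2000):
    # a_k = sqrt(2/l) * \int_0^l f(x) sin(pi k x / l) dx, Simpson rule
    h = l / n
    g = lambda x: f(x) * math.sin(math.pi * k * x / l)
    s = g(0.0) + g(l)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * g(i*h)
    return math.sqrt(2.0/l) * s * h / 3

def series(x, N=25):
    # Truncated Fourier Sine series of f
    return math.sqrt(2.0/l) * sum(a_k(k) * math.sin(math.pi*k*x/l)
                                  for k in range(1, N+1))

err = max(abs(series(x) - f(x)) for x in [0.1*i for i in range(11)])
```

For this f the even coefficients vanish and the odd ones decay like k⁻³, so 25 terms already reproduce f to better than 10⁻³ everywhere on the interval.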
For Neumann boundary conditions, α1 = α2 = 0, and the eigenvalue equation reduces to
λ sin(√λ l) = 0.
λ0 = 0 is clearly a solution. For λ0 = 0 our equation reduces to
−u″(x) = 0,
⇓
u(x) = Ax + B,
and u′(0) = 0 ⇒ A = 0, u′(l) = 0 ⇒ A = 0, so that u(x) = B.
Thus, λ = 0 is an eigenvalue for this case. The corresponding normalized
eigenfunction is
u0(x) = 1/√l. (256)
For λ > 0 we can divide the eigenvalue equation by λ and get
sin(√λ l) = 0, (257)
⇓
λk = (πk/l)², k = 1, 2, . . . , (259)
with corresponding normalized eigenfunctions
uk(x) = √(2/l) cos(πk x/l), k = 1, 2, . . . . (260)
The series expansion
u = (u, u0)u0 + Σ_{k=1}^∞ (u, uk) uk,
is the Fourier Cosine series. Using explicit expressions for the eigenfunctions and the inner product we get
u(x) = a0/√l + √(2/l) Σ_{k=1}^∞ ak cos(πk x/l),
where
a0 = √(1/l) ∫_0^l dx u(x),
ak = √(2/l) ∫_0^l dx u(x) cos(πk x/l).
Note that all the general properties of eigenvalues and eigenfunctions for Sturm-Liouville problems can be verified to hold for the eigenvalues and eigenfunctions we have found for the Fourier Sine and Fourier Cosine cases.
Instead of using the boundary conditions (229) we now use periodic boundary conditions
u(−l) = u(l), (262)
u′(−l) = u′(l). (263)
The general solution for λ > 0 is
u(x) = a cos(√λ x) + b sin(√λ x), (264)
⇓
u′(x) = −a√λ sin(√λ x) + b√λ cos(√λ x), (265)
and
u(−l) = u(l) ⇒ a cos(√λ l) − b sin(√λ l) = a cos(√λ l) + b sin(√λ l),
⇓
2b sin(√λ l) = 0, (266)
u′(−l) = u′(l) ⇒ a√λ sin(√λ l) + b√λ cos(√λ l) = −a√λ sin(√λ l) + b√λ cos(√λ l),
⇓
2a sin(√λ l) = 0. (267)
The eigenvalues are therefore
λk = (πk/l)², k = 1, 2, . . . .
For each such eigenvalue a and b are arbitrary. Thus (264) gives us an eigenspace of dimension 2, spanned by cos(√λk x) and sin(√λk x). Introducing an inner product and corresponding norm
(u, v) = ∫_{−l}^{l} dx u(x)v(x),
‖u‖² = ∫_{−l}^{l} dx u²(x),
we get the normalized eigenfunctions
uk(x) = (1/√l) sin(πk x/l), k = 1, 2, . . . , (268)
ûk(x) = (1/√l) cos(πk x/l).
One can show that there are no negative eigenvalues and that λ0 = 0 is an
eigenvalue with corresponding normalized eigenfunction
û0(x) = 1/√(2l). (269)
Using these functions we get a Fourier series
u = (u, û0)û0 + Σ_{k=1}^∞ [(u, ûk)ûk + (u, uk)uk]. (270)
Using the explicit expressions for the eigenfunctions, this is
u(x) = a0/√(2l) + (1/√l) Σ_{k=1}^∞ (ak cos(πk x/l) + bk sin(πk x/l)),
where
a0 = (1/√(2l)) ∫_{−l}^{l} dx u(x),
ak = (1/√l) ∫_{−l}^{l} dx u(x) cos(πk x/l),
bk = (1/√l) ∫_{−l}^{l} dx u(x) sin(πk x/l).
L(u(x)) = −(d/dx)(x du/dx) + (n²/x)u(x) = λxu(x), 0 < x < l, (271)
u(l) = 0. (272)
at x = 0. We might therefore in general expect there to be solutions that are singular at x = 0, and (272) eliminates such solutions as acceptable eigenfunctions. One might worry that all solutions are singular at x = 0; then we would get no eigenvalues and eigenfunctions at all! By using the method of Frobenius from the theory of ordinary differential equations one can show that (271) has solutions that are bounded at x = 0. These solutions can be expressed in terms of Bessel functions of order n, Jn, in the following simple way:
u(x) = Jn(√λ x). (273)
The boundary condition (272) then gives
Jn(√λ l) = 0. (274)
One can prove that this equation has infinitely many solutions λkn, k = 1, 2, 3, . . . They are all positive and λkn → ∞ as k → ∞. A large number of zeros of the function Jn have been found using numerical methods. Let these zeros be αkn, k = 1, 2, 3, . . . In terms of these zeros the eigenvalues are
λkn = (αkn/l)², k = 1, 2, . . . . (275)
The corresponding eigenfunctions are
p
ûkn (x) = Jn ( λkn x), k = 1, 2, . . . . (276)
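The zeros αkn entering (275) can be computed directly from the defining power series of the Bessel function. A sketch for the first zero of J0, using bisection on the sign change between x = 2 and x = 3:

```python
def J0(x, terms=40):
    # Power series J0(x) = sum_{m>=0} (-1)^m (x/2)^{2m} / (m!)^2;
    # converges fast for the moderate x used here
    s, t = 0.0, 1.0
    for m in range(terms):
        s += t
        t *= -(x/2)**2 / ((m + 1)**2)
    return s

def first_zero(a=2.0, b=3.0, tol=1e-12):
    # Bisection: J0(2) > 0 and J0(3) < 0 bracket the first zero
    while b - a > tol:
        m = 0.5 * (a + b)
        if J0(a) * J0(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)

alpha_10 = first_zero()            # first zero of J0
lam_10 = (alpha_10 / 1.0) ** 2     # eigenvalue (275) for l = 1, n = 0
```

The tabulated value of the first zero of J0 is approximately 2.404826, and the bisection reproduces it to the requested tolerance.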
Using the inner product
(u, v) = ∫_0^l dx x u(x)v(x),
the eigenfunctions can be normalized:
ukn = (√2/l) Jn(√λkn x)/|Jn+1(√λkn l)|, k = 1, 2, . . . , (277)
where we have used the formula
‖Jn(√λkn x)‖² = ∫_0^l x Jn²(√λkn x) dx = (l²/2) Jn+1²(√λkn l). (278)
As you have seen in calculus, there are a large number of formulas linking elementary functions like sin(x), cos(x), ln(x), e^x etc. and their derivatives and anti-derivatives. The Bessel functions Jn are an extension of this machinery. There is a large set of formulas linking the Jn and their derivatives and anti-derivatives to each other and to other functions like sin(x), cos(x), ln(x) etc. The functions from calculus (sin(x), cos(x), . . . ) are called elementary functions. The function Jn is our first example of a much larger machinery of functions called special functions. There are many handbooks that list some of the important
relations and formulas involving special functions. Symbolic systems like Mathematica, Maple, etc. embody most of these formulas and often produce answers to problems formulated in terms of such functions.
Note that the boundary conditions (272) are not of the type (229) discussed in the general theory of Sturm-Liouville problems. Thus, there is no guarantee at this point that eigenfunctions corresponding to different eigenvalues are orthogonal. This is typical for singular Sturm-Liouville problems; the properties of the eigenvalues and eigenfunctions must be investigated in each separate case.
Observe that, in general, with L defined as in (228) we have
∫_0^l dx uLv = ∫_0^l dx u[−(pv′)′ + qv]
= −∫_0^l dx u(pv′)′ + ∫_0^l dx uqv
= −u(pv′)|_0^l + ∫_0^l dx u′pv′ + ∫_0^l dx quv
= −u(pv′)|_0^l + (u′p)v|_0^l − ∫_0^l dx (pu′)′v + ∫_0^l dx uqv
= {pu′v − puv′}|_0^l + ∫_0^l dx (Lu)v
⇒ ∫_0^l dx {uLv − vLu} = {pu′v − pv′u}|_0^l. (279)
Recall that for our current singular Sturm-Liouville problem, the space of functions we need to expand in terms of the eigenfunctions (277) consists of functions such that
u(l) = 0. (280)
Thus on the right hand side of (279) the evaluation at x = 0 must be interpreted as a limit as x approaches 0 from inside the interval (0, l).
Let us apply (279) to a pair of eigenfunctions ui, uj corresponding to different eigenvalues λi ≠ λj. Thus
Lui = ρλi ui,
Luj = ρλj uj.
(λi − λj)(ui, uj) = (λi ui, uj) − (ui, λj uj) (281)
= ((1/ρ)Lui, uj) − (ui, (1/ρ)Luj)
= ∫_0^l dx {(Lui)uj − ui(Luj)} = [p uj′ ui − p ui′ uj]|_0^l.
If the right hand side vanishes, everything is okay and the eigenfunctions form an orthogonal set. Since p(x) = x, we thus have the requirement that x(uj′ui − ui′uj) → 0 at the endpoints. From the power series of the Bessel functions,
J0(√λ x) = 1 + O(x²),
Jn(√λ x) = O(x^n), n > 0,
as x → 0⁺.
Thus the functions are bounded at x = 0. Also
J0′(√λ x) = O(x),
Jn′(√λ x) = O(x^{n−1}), n > 0,
as x → 0⁺, but then
lim_{x→0⁺} x J0′(√λ x) = 0,
lim_{x→0⁺} x Jn′(√λ x) = 0, n > 0.
Thus the eigenfunctions
ukn(x) = αkn Jn(√λkn x), with αkn = √2/(l|Jn+1(√λkn l)|),
form an orthonormal set.
One can show that these eigenfunctions form a complete set. Any smooth function defined on the interval 0 ≤ x ≤ l that satisfies the boundary condition u(l) = 0 can be expanded in a convergent series
u = Σ_{k=1}^∞ (u, ukn) ukn. (283)
L(u(x)) = −(d/dx)((1 − x²) du/dx) = λu(x), −1 < x < 1, (284)
with boundary conditions
Pk(x) = (1/(2^k k!)) (d^k/dx^k)(x² − 1)^k. (287)
It is very easy to verify that
‖Pk(x)‖² = ∫_{−1}^{1} Pk²(x) dx = 2/(2k + 1), k = 0, 1, 2, . . . . (289)
Thus we get an orthonormal set of eigenfunctions
uk(x) = √((2k + 1)/2) Pk(x), k = 0, 1, 2, . . . . (290)
One can show that this set is complete and that any smooth function defined over the interval −1 ≤ x ≤ 1 can be expanded in a convergent series
u = Σ_{k=0}^∞ (u, uk) uk. (291)
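The norm identity (289) behind the orthonormal set (290) can be checked numerically, generating the Legendre polynomials by the Bonnet recurrence instead of the Rodrigues formula (287):

```python
def P(k, x):
    # Legendre polynomial via the Bonnet recurrence
    # (m+1) P_{m+1} = (2m+1) x P_m - m P_{m-1}
    p0, p1 = 1.0, x
    if k == 0:
        return p0
    for m in range(1, k):
        p0, p1 = p1, ((2*m + 1)*x*p1 - m*p0) / (m + 1)
    return p1

def norm2(k, n=2000):
    # \int_{-1}^{1} P_k(x)^2 dx by the composite Simpson rule;
    # should equal 2/(2k+1), equation (289)
    h = 2.0 / n
    s = P(k, -1.0)**2 + P(k, 1.0)**2
    for i in range(1, n):
        x = -1.0 + i*h
        s += (4 if i % 2 else 2) * P(k, x)**2
    return s * h / 3

vals = [norm2(k) for k in range(5)]   # compare with 2/(2k+1)
```

The quadrature reproduces 2, 2/3, 2/5, 2/7, 2/9 for k = 0, . . . , 4 to high accuracy.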
10.2 Series solutions to initial boundary value problems
It is now time to return to the solution of initial boundary value problems. We will consider a few specific examples and do detailed calculations for these cases. However, before we do that, let us recapitulate the separation of variables strategy for the hyperbolic case. As we have seen, the strategy for parabolic and elliptic equations is very similar.
that solves the PDE (186) and the boundary conditions. These could be
of the form (187), but as we have seen in the previous section on singular
Sturm-Liouville problems, other possibilities exist.
Nk(t) = ak cos(√λk t) + bk sin(√λk t), (293)
u(x, t) = Σ_{k=1}^∞ {ak cos(√λk t) + bk sin(√λk t)} Mk(x). (294)
ak = (f, Mk)/(Mk, Mk), bk = (g, Mk)/(√λk (Mk, Mk)), (295)
where f and g are the initial data. Then
u(x, 0) = Σ_{k=1}^∞ ak Mk(x) = f(x), (296)
ut(x, 0) = Σ_{k=1}^∞ √λk bk Mk(x) = g(x),
and thus the formal solution (294) satisfies the PDE, the boundary conditions and the initial conditions.
vi) Investigate the convergence of the series (294). If the initial data f and g are smooth enough, the series will converge so fast that it can be differentiated twice termwise. In this case the formal series solution is a classical (honest to God!) solution to our initial/boundary value problem. For less smooth f and g the convergence is weaker and will lead to generalized (non-classical) solutions.
ut(x, 0) = g(x).
These are the initial values. We now separate variables, u(x, t) = M(x)N(t), and get the following regular Sturm-Liouville problem for M:
−M″(x) = λM(x),
M(0) = M(l) = 0.
λk = (πk/l)²,
Mk(x) = √(2/l) sin(πk x/l), k = 1, 2, . . . (301)
Nk(t) = ak cos(πkc t/l) + bk sin(πkc t/l), k = 1, 2, . . . (302)
and the corresponding separated solutions uk(x, t) are
uk(x, t) = (ak cos(πkc t/l) + bk sin(πkc t/l)) √(2/l) sin(πk x/l), for k = 1, 2, . . . (303)
In order to satisfy the initial conditions we write down the formal solution
u(x, t) = Σ_{k=1}^∞ uk(x, t). (304)
u(x, 0) = Σ_{k=1}^∞ uk(x, 0) = Σ_{k=1}^∞ ak √(2/l) sin(πk x/l) = f(x),
ut(x, 0) = Σ_{k=1}^∞ ∂tuk(x, 0) = Σ_{k=1}^∞ bk (πkc/l) √(2/l) sin(πk x/l) = g(x),
⇓
ak = (f, Mk) = √(2/l) ∫_0^l dx f(x) sin(πk x/l), (305)
bk = (l/(πkc))(g, Mk) = (√(2l)/(πkc)) ∫_0^l dx g(x) sin(πk x/l).
With these values for ak and bk, (304) is the formal solution of the initial value problem for the vibrating string fixed at both ends.
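The coefficient formulas (305) and the formal solution (304) can be sketched numerically. Here l = c = 1, the string is released from rest (g = 0), and the initial shape f(x) = sin(πx) is an illustrative choice that coincides with the first normal mode:

```python
import math

l, c = 1.0, 1.0
f = lambda x: math.sin(math.pi * x / l)   # initial shape (illustrative)
g = lambda x: 0.0                         # released from rest, so b_k = 0

def coeff(k, func, n=2000):
    # (func, M_k) with M_k = sqrt(2/l) sin(pi k x / l), Simpson rule
    h = l / n
    q = lambda x: func(x) * math.sqrt(2.0/l) * math.sin(math.pi*k*x/l)
    s = q(0.0) + q(l)
    for i in range(1, n):
        s += (4 if i % 2 else 2) * q(i*h)
    return s * h / 3

def u(x, t, N=10):
    # Truncated formal solution (304) with b_k = 0 since g = 0
    tot = 0.0
    for k in range(1, N + 1):
        ak = coeff(k, f)
        tot += (ak * math.cos(math.pi*k*c*t/l)
                * math.sqrt(2.0/l) * math.sin(math.pi*k*x/l))
    return tot
```

Because f equals the first mode, only a1 survives and u(x, t) = cos(πct/l) sin(πx/l): the series reproduces the initial shape at t = 0 and vanishes at the quarter period.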
The separated solutions (303) are called normal modes for the vibrating
string. The formal solution (304) is a sum, or linear superposition, of normal
modes. The notion of normal modes is a concept that goes way beyond their
use for the vibrating string. Separated solutions satisfying a hyperbolic PDE
and boundary conditions are in general called normal modes.
Normal modes are a physical concept with very wide applicability and deep roots. Physicists in general are not very familiar with eigenvalue problems for linear differential operators and their associated eigenfunctions, but they all know what normal modes are!
In quantum theory applied to fields, normal modes are a key idea. In this setting, they are in fact the mathematical embodiment of elementary particles. The newly discovered Higgs boson is a normal mode for a certain field equation closely related to the 3D wave equation. In the quantum theory of materials, so-called many-body theory, the interaction between material normal modes (phonons) and electronic normal modes is responsible for the phenomenon of superconductivity.
Now then, when can we expect the formal solution (304) to be an actual, honest to God, classical solution? In general, finding sharp conditions on the initial data f(x) and g(x) that imply the existence of a classical solution is a hard mathematical problem that is best left to specialists in mathematical analysis. In fact, sometimes the problem is so hard that entirely new mathematical ideas have to be developed in order to solve it.
You might know that all of modern mathematics is formulated in terms of, and gets its consistency from, set theory. You might not know that set theory was invented by Georg Cantor in the 1880's while studying the convergence of trigonometric Fourier series!
So what does mathematical analysis tell us about the convergence of the formal solution (304)? The following conditions are sufficient to ensure the existence of a classical solution for the vibrating string:
uk(x, t) = αk cos(πkc t/l + δk) sin(πk x/l), k = 1, 2, . . . , (306)
where
αk = √(2/l) √(ak² + bk²),
δk = −tan⁻¹(bk/ak), k = 1, 2, . . . (307)
The formula (306) shows that for a normal mode each point x0 performs a harmonic vibration with frequency
ωk = πkc/l (308)
and amplitude
Ak = αk sin(πk x0/l). (309)
Solutions to the wave equation are in general called waves. The normal mode (303) then is a standing wave: each point on the string oscillates in place as t varies.
Some points on the string remain fixed during the oscillation. These are the points
xm = ml/k, m = 1, 2, . . . , k − 1,
which are called nodes for the standing wave. The standing wave has maximum
amplitude at the points
135
(2n + 1)l
xn = , n = 0, 1, 2, . . . , k − 1,
2k
which are called antinodes.
Since each normal mode performs a harmonic oscillation, u_k(x,t) is often
called the k-th harmonic. Since a vibrating string makes sound, the normal
mode u_k will produce a pure tone at frequency ω_k. Thus the decomposition of
the string into normal modes corresponds to the decomposition of sound into
pure tones. If the tone spectrum is pleasing we call it music!
Recall that the energy of the vibrating string is

    E(t) = (1/2) ∫₀^l dx {ρ(∂_t u(x,t))² + T(∂_x u(x,t))²}.
Writing

    u_k(x,t) = N_k(t)M_k(x),

we observe that

    ∫₀^l dx (∂_t u(x,t))² = (∂_t u, ∂_t u)
        = (Σ_k N_k′(t)M_k , Σ_{k′} N_{k′}′(t)M_{k′})
        = Σ_{k,k′} N_k′(t)N_{k′}′(t)(M_k, M_{k′})
        = Σ_k (N_k′(t))²(M_k, M_k) = Σ_k (N_k′(t)M_k, N_k′(t)M_k)
        = Σ_k ∫₀^l dx (∂_t u_k(x,t))².

Similarly,

    ∫₀^l dx (∂_x u(x,t))² = Σ_k ∫₀^l dx (∂_x u_k(x,t))².
Therefore

    E(t) = Σ_k E_k(t),

where

    E_k(t) = (1/2) ∫₀^l dx {ρ(∂_t u_k(x,t))² + T(∂_x u_k(x,t))²}
           = ω_k² m (a_k² + b_k²)/(2l),

where m = ρl is the mass of the string. (Do this calculation!)
Thus each E_k is independent of t, and so is the total energy

    E(t) = (m/2l) Σ_{k=1}^∞ ω_k²(a_k² + b_k²).

Recall the trigonometric identities

    2 cos((A − B)/2) sin((A + B)/2) = sin A + sin B,   (310)
    −2 sin((A − B)/2) sin((A + B)/2) = cos A − cos B.
Using these formulas on the expression (303) for the normal modes we find

    u_k(x,t) = (1/√(2l)) (a_k sin((πk/l)(x + ct)) + a_k sin((πk/l)(x − ct)))   (311)
             + (1/√(2l)) (b_k cos((πk/l)(x − ct)) − b_k cos((πk/l)(x + ct))).   (312)
Thus the normal mode can be written as a sum of left and right traveling waves.
This is not surprising since we know that all solutions of the wave equation can
be written as a sum of left and right traveling waves. What is special for the
normal modes is that their traveling wave components are arranged in such a
way that we get a standing wave.
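This arrangement is easy to check numerically. The sketch below, with arbitrary illustrative values for a_k, b_k, l and c, assumes the normal mode has the standing-wave form √(2/l)(a_k cos ω_k t + b_k sin ω_k t) sin(πkx/l), consistent with (306)-(307), and compares it with the sum of traveling waves (311)-(312):

```python
import numpy as np

# Illustrative values; a_k, b_k, l, c, kmode and t are arbitrary choices.
l, c, kmode = 1.0, 2.0, 3
a_k, b_k = 0.7, -0.4
t = 0.37

x = np.linspace(0.0, l, 201)
w = np.pi * kmode * c / l   # the frequency omega_k = pi*k*c/l from (308)

# Normal mode in standing-wave form
standing = np.sqrt(2.0 / l) * (a_k * np.cos(w * t) + b_k * np.sin(w * t)) \
           * np.sin(np.pi * kmode * x / l)

# The same mode as a sum of left and right traveling waves, (311)-(312)
traveling = (1.0 / np.sqrt(2.0 * l)) * (
      a_k * np.sin(np.pi * kmode * (x + c * t) / l)
    + a_k * np.sin(np.pi * kmode * (x - c * t) / l)
    + b_k * np.cos(np.pi * kmode * (x - c * t) / l)
    - b_k * np.cos(np.pi * kmode * (x + c * t) / l))

err = np.max(np.abs(standing - traveling))
print(err)  # agreement to machine precision
```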
    u_t(x,t) − c² u_xx(x,t) = 0,   0 < x < l, t > 0,   (313)

where u(x,t) is the temperature of the rod as a function of time, and where c²
is the coefficient of heat conductivity. In order to get a unique solution we must
specify the initial temperature distribution u(x,0) = f(x), together with
boundary conditions; here we take u(0,t) = u(l,t) = 0.
The separation of variables approach leads to the same eigenvalue problem as for
the string. The corresponding functions N_k(t) now become

    N_k(t) = a_k e^{−(πkc/l)² t},   k = 1, 2, . . .   (316)

    u(x,t) = Σ_{k=1}^∞ N_k(t)M_k(x) = √(2/l) Σ_{k=1}^∞ a_k e^{−(πkc/l)² t} sin((πk/l)x),   (317)

    a_k = √(2/l) ∫₀^l dx f(x) sin((πk/l)x).   (318)
(317) with the constants given by (318) is a formal solution to our initial
boundary value problem.

Because of the exponential factor in the solution for t > 0, the series (317)
converges so strongly that it can be differentiated termwise as many times as we
like. Thus (317) satisfies the PDE and boundary conditions if we merely assume
that f(x) is a bounded function on 0 ≤ x ≤ l. However, as t → 0 the damping
from the exponential factor disappears, so in order to ensure that (317) also
satisfies the initial condition, we need stronger conditions on f. Our friendly
mathematical analyst tells us that the following conditions are sufficient:
For the wave equation we know that discontinuities in the initial data are
preserved by the equation and propagated along the characteristic curves x ±
ct = const. For the heat conduction problem, the exponential factor will ensure
that the solution is infinitely many times differentiable for any t > 0 even if
the initial data is discontinuous!

The heat conduction equation cannot support any kind of discontinuities
for t > 0. We have seen this same result appear previously from the fact that
the only characteristic curves for the heat conduction equation are of the form
t = const.
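This smoothing property is easy to see numerically. The following sketch evaluates the truncated series (317), with coefficients (318) computed by simple quadrature, for a discontinuous step initial temperature; the rod length, conductivity and step are all illustrative choices:

```python
import numpy as np

# Sketch: truncated Fourier sine series solution (317)-(318) of the heat
# equation with a discontinuous step initial temperature.  All numerical
# values (l, c^2, the step, K) are illustrative choices.
l, c2, K = 1.0, 0.1, 200

nq = 4000
xq = (np.arange(nq) + 0.5) * l / nq          # midpoint quadrature grid
f = np.where(xq < 0.5, 1.0, 0.0)             # discontinuous initial data

k = np.arange(1, K + 1)[:, None]
# a_k = sqrt(2/l) * int_0^l f(x) sin(pi k x / l) dx, via the midpoint rule
ak = np.sqrt(2.0 / l) * (f[None, :] * np.sin(np.pi * k * xq[None, :] / l)).sum(axis=1) * l / nq

def u(x, t):
    # truncated series (317); the factor exp(-(pi k c/l)^2 t) damps high modes
    modes = ak[:, None] * np.exp(-c2 * (np.pi * k / l) ** 2 * t) \
            * np.sin(np.pi * k * x[None, :] / l)
    return np.sqrt(2.0 / l) * modes.sum(axis=0)

x = np.linspace(0.0, l, 101)
u0 = u(x, 0.0)    # still has the jump (plus Gibbs oscillations near x = 0.5)
u1 = u(x, 0.05)   # smooth: the exponential factor has killed the high modes
print(np.max(np.abs(np.diff(u0))), np.max(np.abs(np.diff(u1))))
```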
10.2.3 The Laplace equation in a rectangle
Consider the boundary value problem
    u_xx(x,y) + u_yy(x,y) = 0,   0 < x < l, 0 < y < l̂,   (319)
The equation, which is the Laplace equation, together with the boundary
conditions determines the steady state displacement of a stretched membrane on
a rectangular domain. The functions f and g specify the stretching of the
membrane on the boundary curves y = 0 and y = l̂.
The separation of variables method leads to the same eigenvalue problem as
before, but now the corresponding functions N_k are

    N_k(y) = â_k e^{(πk/l)y} + b̂_k e^{−(πk/l)y}.   (320)

    u_k(x,y) = [a_k sinh((πk/l)y) + b_k sinh((πk/l)(y − l̂))] √(2/l) sin((πk/l)x),   (321)

    u(x,y) = Σ_{k=1}^∞ u_k(x,y).   (322)

    u(x,0) = Σ_{k=1}^∞ b_k sinh(−(πk/l)l̂) √(2/l) sin((πk/l)x) = f(x),   (323)

    u(x,l̂) = Σ_{k=1}^∞ a_k sinh((πk/l)l̂) √(2/l) sin((πk/l)x) = g(x).   (324)
(322) with ak and bk given by (325) determines the formal solution to the
problem (319).
Assume that

    ∫₀^l dx |f(x)| ≤ m,   ∫₀^l dx |g(x)| ≤ m,   for some m > 0.   (326)

Then

    |b_k| = (1/|sinh(−(πk/l)l̂)|) |∫₀^l dx √(2/l) f(x) sin((πk/l)x)|   (327)
          ≤ m√(2/l)/|sinh(−(πk/l)l̂)| = m√(2/l)/((1/2)e^{(πk/l)l̂}(1 − e^{−(2πk/l)l̂})),

and therefore

    |b_k sinh((πk/l)(y − l̂))| ≤ m√(2/l) e^{(πk/l)(l̂−y)}/(e^{(πk/l)l̂}(1 − e^{−(2πk/l)l̂}))
                              = m√(2/l) e^{−(πk/l)y}/(1 − e^{−(2πk/l)l̂}).
ii) f(0) = f(l) = g(0) = g(l) = 0.

The fact that solutions to the Laplace equation are smooth in the interior
of the rectangle even if the boundary data has discontinuities is, as we have
seen, also a consequence of the equation having no real characteristics at all.
The equations (329) are supplied with homogeneous boundary conditions and
also homogeneous initial conditions

    v(x,τ;τ) = 0,   v_t(x,τ;τ) = g(x,τ)/ρ(x),   hyperbolic case,   (331)

    v(x,τ;τ) = g(x,τ)/ρ(x),   parabolic case.
We assume this solution has been found using separation of variables or some
other method.
Define a function u(x,t) by

    u(x,t) = ∫₀^t dτ v(x,t;τ).   (332)
Then we have by the rules of calculus
    u_t(x,t) = v(x,t;t) + ∫₀^t dτ v_t(x,t;τ),

    u_tt(x,t) = ∂_t(v(x,t;t)) + v_t(x,t;t) + ∫₀^t dτ v_tt(x,t;τ),   (333)

    L[u(x,t)] = ∫₀^t dτ L[v(x,t;τ)],

    v(x,t;t) = g(x,t)/ρ(x),   parabolic case.
Thus we have for the hyperbolic case

    ρ(x)u_tt(x,t) + L[u(x,t)] = ρ(x)(g(x,t)/ρ(x))
        + ∫₀^t dτ {ρ(x)v_tt(x,t;τ) + L[v(x,t;τ)]} = g(x,t).   (335)
(335) shows that the u(x,t) constructed in (332) solves the inhomogeneous
problem for the hyperbolic case. In a similar way we can show that the function
u(x,t) constructed for the parabolic case also solves the inhomogeneous
equation for that case. (332) is Duhamel's principle.
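A minimal numerical illustration of Duhamel's principle: for a single Fourier mode, the construction (332) reduces to a scalar ODE, and the superposition of homogeneous evolutions started at each time τ reproduces the solution of the forced equation. The values of μ and the source g below are arbitrary choices:

```python
import numpy as np

# Duhamel's principle for a single Fourier mode of the heat equation:
# N'(t) + mu*N(t) = g(t), N(0) = 0.  Here v(t; tau) = g(tau)*exp(-mu*(t - tau))
# solves the *homogeneous* equation with data g(tau) injected at time tau,
# and N(t) = int_0^t v(t; tau) dtau, as in (332).
mu = 3.0
g = lambda t: np.cos(5.0 * t)

def N_duhamel(t, n=20000):
    tau = (np.arange(n) + 0.5) * t / n          # midpoint rule on [0, t]
    v = g(tau) * np.exp(-mu * (t - tau))        # homogeneous evolutions
    return v.sum() * t / n

# Exact solution of N' + mu*N = cos(5t), N(0) = 0, by the integrating factor:
t = 0.8
exact = (mu * np.cos(5 * t) + 5 * np.sin(5 * t) - mu * np.exp(-mu * t)) / (mu**2 + 25)
err = abs(N_duhamel(t) - exact)
print(err)  # close to zero
```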
    u_tt(x,t) − c² u_xx(x,t) = g(x,t),   −∞ < x < ∞, t > 0,   (336)

    v_t(x,τ;τ) = g(x,τ).

We solve (337) using the d'Alembert formula

    v(x,t;τ) = (1/2c) ∫_{x−c(t−τ)}^{x+c(t−τ)} dθ g(θ,τ).   (338)
The solution to (336), which we denote by u_p, is then according to Duhamel's
principle

    u_p(x,t) = (1/2c) ∫₀^t dτ ∫_{x−c(t−τ)}^{x+c(t−τ)} dθ g(θ,τ).   (339)
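A quick sanity check of (339): for a constant source g(θ,τ) = g₀ the inner integral over [x − c(t−τ), x + c(t−τ)] is g₀·2c(t−τ), so the double integral gives u_p = g₀t²/2, which indeed satisfies u_tt − c²u_xx = g₀. The values of c, g₀ and t below are arbitrary:

```python
import numpy as np

# Sanity check of (339) with a constant source g0 (c, g0, t arbitrary).
c, g0, t = 2.0, 1.5, 0.9

def up(x, t, n=2000):
    tau = (np.arange(n) + 0.5) * t / n       # midpoint rule in tau
    inner = g0 * 2.0 * c * (t - tau)         # exact inner theta-integral
    return inner.sum() * (t / n) / (2.0 * c)

err = abs(up(0.3, t) - g0 * t * t / 2.0)
print(err)  # essentially zero: the midpoint rule is exact for linear integrands
```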
Let u_h(x,t) be the solution to the homogeneous wave equation with arbitrary
initial data

    u(x,0) = F(x),   u_t(x,0) = G(x).

According to the d'Alembert formula the solution with initial data (340) is

    u_h(x,t) = (1/2){F(x + ct) + F(x − ct)} + (1/2c) ∫_{x−ct}^{x+ct} dθ G(θ).   (341)
Let now u(x,t) = u_h(x,t) + u_p(x,t). Verify that u(x,t) satisfies the
inhomogeneous wave equation with initial data (340). Inserting the formulas for
u_p and u_h we get

    u(x,t) = (1/2){F(x + ct) + F(x − ct)} + (1/2c) ∫_{x−ct}^{x+ct} dθ G(θ)   (343)
           + (1/2c) ∫₀^t dτ ∫_{x−c(t−τ)}^{x+c(t−τ)} dθ g(θ,τ).
From this formula it is clear that the value of u at a point (x_0, t_0), with
t_0 > 0, depends only on the data at points in a characteristic triangle.
The characteristic triangle is called the domain of dependence of the solution
at the point (x_0, t_0).
Similarly, we define the domain of influence of a point (x_0, t_0) to be the set
of space-time points (x,t) such that the domain of dependence of the solution
at (x,t) includes the point (x_0, t_0).
The point (x0 , t0 ) will have no causal connection to points beyond its domain
of inuence. These notions play a fundamental role in the theory of relativity,
where the domain of inuence of a space-time point (x, t) is called a light cone.
All material particles and force mediating particles must move within the light
cone. This is strictly true only for classical physics. In quantum theory, both
material particles and force mediators can, in the form of virtual particles, move
beyond the light cone.
Figure 52
Figure 53
u(0, t) = u(l, t) = 0.
Separation of variables for this problem gives (page 72)

    u(x,t;τ) = √(2/l) Σ_{k=1}^∞ a_k(τ) e^{−(πkc/l)²(t−τ)} sin((πk/l)x),   (346)

where

    a_k(τ) = √(2/l) ∫₀^l dx g(x,τ) sin((πk/l)x),   k = 1, 2, . . .   (347)

Duhamel's principle now gives the solution to (344) in the form

    u(x,t) = √(2/l) Σ_{k=1}^∞ ∫₀^t dτ a_k(τ) e^{−(πkc/l)²(t−τ)} sin((πk/l)x).   (348)
    a_1 = √(l/2),
    a_k = 0,   k > 1,

    ⇒ u(x,t) = (l/(πc))² {1 − e^{−(πc/l)² t}} sin((π/l)x).   (350)
Observe that

    u(x,t) → (l/(πc))² sin((π/l)x) ≡ u(x),   (351)

when t → +∞. By direct differentiation we find that the limiting function,
u(x), is a solution to

    −c² u_xx(x) = sin((π/l)x),   (352)
    u(0) = u(l) = 0,

which shows that the function u(x) is a time independent, or stationary,
solution to the heat equation.
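The stationarity claim (352) is easy to check numerically by central finite differences; l and c below are arbitrary positive values:

```python
import numpy as np

# Check by central finite differences that the limiting function (351),
# u(x) = (l/(pi*c))^2 * sin(pi*x/l), solves -c^2 u''(x) = sin(pi*x/l)
# with u(0) = u(l) = 0.  l and c are arbitrary positive values.
l, c = 1.3, 0.7
x = np.linspace(0.0, l, 401)
h = x[1] - x[0]

u = (l / (np.pi * c)) ** 2 * np.sin(np.pi * x / l)
upp = (u[2:] - 2 * u[1:-1] + u[:-2]) / h**2           # u'' on interior points
residual = -c**2 * upp - np.sin(np.pi * x[1:-1] / l)  # ~0, up to O(h^2)

print(np.max(np.abs(residual)))
```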
In fact, if g(x,t) = G(x) then from (348) we get

    u(x,t) = √(2/l) Σ_{k=1}^∞ a_k (l/(πkc))² {1 − e^{−(πkc/l)² t}} sin((πk/l)x).   (353)
For this case the limiting function u(x) is
It is easy to verify that u(x) is in fact a stationary solution to the driven heat
equation (344)
    u(0) = u(l) = 0.
Figure 54
This transform strategy is an important approach to solving complex problems
in applied mathematics and theoretical physics.
We will stick to the restricted class discussed previously in these notes. For
that class we have an operator, L, and an operator K given by

    K = ∂_tt,   hyperbolic case,
        ∂_t,    parabolic case,
        ∂_yy,   elliptic case with cylinder symmetry,   (360)
        0,      general elliptic case.
Let {M_k} be the complete set of eigenfunctions for the eigenvalue problem,
where as usual we apply the spatial boundary conditions. We will assume that
the eigenfunctions have been normalized. Expanding u in these eigenfunctions,
the coefficients are

    N_k = (u, M_k).   (365)
The coefficients N_k are functions of t for the hyperbolic and parabolic cases,
and functions of y for the elliptic case with (generalized) cylinder symmetry.
For the general elliptic case the N_k's are constants.

The formula (365) defines the finite Fourier transform and (364) the inverse
finite Fourier transform.
In order to derive an equation for the coecients Nk we multiply (361) by
Mk (x) and integrate over the domain for the variable x. We let this domain be
denoted by G.
This integration gives us the equation

    ∫_G dV ρ(x)M_k(x)Ku(x,t) + ∫_G dV M_k(x)Lu(x,t) = ∫_G dV ρ(x)M_k(x)F.   (366)

Using the definition of the finite Fourier transform and the inner product, each
term in this equation can be reformulated into a more convenient form.
For the first term we have

    ∫_G dV ρ(x)M_k(x)Ku(x,t) = K ∫_G dV ρ(x)u(x,t)M_k(x) = K(u, M_k) = KN_k.   (367)
For the second term, Green's identity gives

    ∫_G dV M_k(x)Lu(x,t) = ∫_G dV u(x,t)LM_k(x) − ∫_{∂G} dS p{M_k ∂_n u − u ∂_n M_k}.   (368)
The second term in (368) can be further simplified using the boundary
conditions.

Recall that the boundary conditions on u and M_k are

    αu + β∂_n u = B,

where u and B are functions of x and t for the hyperbolic and parabolic cases,
functions of x and y for the restricted elliptic case, and functions of x for
the general elliptic case. As usual α = α(x) and β = β(x).

Using the classification of the boundary conditions into type 1, 2, 3
introduced on page 104, we have for the second term in (368)
    ∫_{∂G} dS p{M_k ∂_n u − u ∂_n M_k}
       = ∫_{∂G} dS pM_k ∂_n u − ∫_{∂G} dS p ∂_n M_k u
       = ∫_{S_2∪S_3} dS pM_k ∂_n u − ∫_{S_1} dS (p/α) ∂_n M_k B − ∫_{S_2∪S_3} dS p ∂_n M_k u
       = ∫_{S_2∪S_3} dS (p/β) M_k β∂_n u + ∫_{S_2∪S_3} dS (p/β) M_k αu − ∫_{S_1} dS (p/α) ∂_n M_k B
       = ∫_{S_2∪S_3} dS (p/β) M_k B − ∫_{S_1} dS (p/α) ∂_n M_k B
       ≡ B_k.   (370)
Substituting (367), (368) and (370) into (366) gives us the system

    KN_k + λ_k N_k = F_k + B_k,   k = 1, 2, . . . ,   (371)

where

    F_k = (F, M_k).
For the general elliptic case K = 0, so (371) is a system of algebraic
equations for the N_k. For the restricted elliptic case we have a system of
ordinary differential equations.
This system of ODEs must be solved as a boundary value problem. The bound-
ary conditions are
where we have used the boundary conditions and where u(0) ≡ u(x, t) etc.
For the parabolic case K = ∂t we have a system of ODEs
∂t Nk (t) + λk Nk (t) = Fk (t) + Bk (t), k = 1, 2, . . . , (374)
Observe that (374) is an infinite system of ODEs, but they are uncoupled. This
is a result of the linearity of the underlying PDEs and the fact that the M_k's
are eigenfunctions for the operator L defining the equations. The finite Fourier
transform can also be used on nonlinear PDEs, as we will see, and in that case
the ODEs for the N_k(t) will be coupled. The ODEs (374) are of a particularly
simple type. They are linear first order equations. Such equations can be solved
using an integrating factor. The solution is found to be

    N_k(t) = N_k(0)e^{−λ_k t} + ∫₀^t dτ [F_k(τ) + B_k(τ)]e^{−λ_k(t−τ)}.   (376)

This formula is only useful if we can get some exact or approximate analytical
evaluation of the integral. If a numerical solution is required it might be more
efficient to solve the ODEs (374) numerically, using for example Runge-Kutta.
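The two routes just mentioned can be compared directly. The sketch below evaluates the integrating-factor formula (376) by quadrature and a hand-rolled fourth-order Runge-Kutta (RK4) solution of a single equation from (374); λ_k, F_k and N_k(0) are illustrative choices, and B_k is set to zero:

```python
import numpy as np

# One mode of (374): N'(t) + lam*N(t) = F(t), with B_k = 0.
lam = 2.0
F = lambda t: np.sin(3.0 * t)
N0 = 1.0

def N_formula(t, n=20000):
    # integrating-factor formula (376), midpoint quadrature for the integral
    tau = (np.arange(n) + 0.5) * t / n
    return N0 * np.exp(-lam * t) + (F(tau) * np.exp(-lam * (t - tau))).sum() * t / n

def N_rk4(t, steps=2000):
    # classical RK4 applied to N' = -lam*N + F(t)
    f = lambda s, y: -lam * y + F(s)
    h, y, s = t / steps, N0, 0.0
    for _ in range(steps):
        k1 = f(s, y); k2 = f(s + h/2, y + h*k1/2)
        k3 = f(s + h/2, y + h*k2/2); k4 = f(s + h, y + h*k3)
        y += h * (k1 + 2*k2 + 2*k3 + k4) / 6
        s += h
    return y

t = 1.2
err = abs(N_formula(t) - N_rk4(t))
print(err)  # the two approaches agree
```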
For the hyperbolic case, K = ∂_tt and we get a system of second order ODEs.
This second order equation is also of a simple type, solvable by the method of
variation of parameters. We get the formula

    N_k(t) = N_k(0) cos(√λ_k t) + (1/√λ_k) ∂_t N_k(0) sin(√λ_k t)   (379)
           + (1/√λ_k) ∫₀^t dτ [F_k(τ) + B_k(τ)] sin(√λ_k (t − τ)).

Just as for the parabolic case, the formula (379) is only useful if some exact
or approximate analytical solution can be found. If a numerical solution is
required it is more efficient to solve the ODEs (377) using a numerical ODE
solver.
Recall that when the N_k's have been computed, the solution to the PDE is
constructed using the inverse finite Fourier transform

    u = Σ_k N_k M_k.   (380)
The representation (380) is problematic when we approach the boundary,
since all the M_k's satisfy homogeneous boundary conditions, whereas u satisfies
inhomogeneous boundary conditions. If these conditions are of, say, type 1, and
if (380) is a classical solution (the series converges pointwise), then (380)
implies that u is zero on the boundary. But we know that u in general satisfies
nonzero boundary conditions. Thus the series (380) cannot converge pointwise as
we approach the boundary. Therefore, (380) is in general not a classical
solution, but rather some form of generalized solution.

The series might still converge pointwise away from the boundary, but the
convergence will typically be slow, since the expansion coefficients, N_k,
"know about" the function over the whole domain, and this leads to slow
convergence everywhere.

There is however a way, which we will discuss shortly, to avoid this boundary
problem and get series that converge fast enough to give a classical solution.
Let us now look at a couple of examples.
where M_i(x) is one of the eigenfunctions of L. We assume that the initial and
boundary conditions are homogeneous:

    f(x) = 0,
    g(x) = 0,   (382)
    B(x,t) = 0.
Because the boundary conditions are homogeneous we have B_k(t) = 0, and because
the initial conditions for the PDE (381) are homogeneous,

    N_k(0) = 0,
    ∂_t N_k(0) = 0.
We get N_k = 0 for k ≠ i, and from (379)

    N_i(t) = (1/√λ_i) ∫₀^t dτ sin(ωτ) sin(√λ_i (t − τ))   (386)
           = (ω sin(√λ_i t) − √λ_i sin(ωt))/(√λ_i (ω² − λ_i)).

    u(x,t) = (1/(ω² − λ_i)) ((ω/√λ_i) sin(√λ_i t) − sin(ωt)) M_i(x).   (387)
This is now the exact solution to this problem. From (387) it is clear that as
ω → √λ_i the amplitude of the solution (387) becomes larger. We cannot
evaluate (387) at ω = √λ_i, since the denominator is then zero, but if we take
the limit of (387) as ω → √λ_i, using L'Hôpital's rule on the way, we find

    u(x,t) = (1/(2√λ_i)) ((1/√λ_i) sin(√λ_i t) − t cos(√λ_i t)) M_i(x).   (388)
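The limit (388) can be checked numerically by evaluating the time-dependent coefficient in (387) for ω close to √λ_i; the values of λ_i and t below are arbitrary:

```python
import numpy as np

# Check the L'Hopital limit: the coefficient in (387) tends to the one in (388)
# as omega -> sqrt(lambda_i).  lam and t are arbitrary positive values.
lam, t = 4.0, 1.7
s = np.sqrt(lam)

def amp(w):
    # time-dependent coefficient of M_i(x) in (387)
    return ((w / s) * np.sin(s * t) - np.sin(w * t)) / (w**2 - lam)

limit = (np.sin(s * t) / s - t * np.cos(s * t)) / (2.0 * s)  # formula (388)
err = abs(amp(s + 1e-6) - limit)
print(err)  # shrinks as omega approaches sqrt(lam)
```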
The formulas (387) and (388) tell the following story: The general solution of
the homogeneous equation
x = r cos θ, (391)
y = r sin θ.
Then
Thus in polar coordinates (390) can be written as

    ∂_rr u(r,θ) + (1/r)∂_r u(r,θ) + (1/r²)∂_θθ u(r,θ) = F(r,θ),
    u(R,θ) = f(θ).   (392)

Introducing the operator

    L = ∂_rr + (1/r)∂_r + (1/r²)∂_θθ,   (393)
we can express the solution u as a Fourier series

    u(r,θ) = Σ_k a_k M_k(r,θ).   (394)
This is done in chapter 8 of our textbook. Here we will solve this problem
using the finite Fourier transform. Introduce the operator

    L̂ = −∂_θθ.   (396)

The idea is now to introduce a finite Fourier transform based on the
eigenfunctions of the operator L̂. Observe that since θ is the polar angle,
running from zero to 2π, all functions are periodic in θ with period 2π. Thus
we are led to the eigenvalue problem
This problem is easiest to solve using complex exponentials. The general
solution of the equation (398) is for λ > 0

    M(θ) = Ae^{i√λ θ} + Be^{−i√λ θ}.   (400)

Periodicity requires

    Ae^{i√λ θ} + Be^{−i√λ θ} = Ae^{i√λ(θ+2π)} + Be^{−i√λ(θ+2π)}
    ⇕
    e^{i√λ θ}(1 − e^{i√λ 2π})A + e^{−i√λ θ}(1 − e^{−i√λ 2π})B = 0.   (401)
The functions e^{±i√λ θ} are linearly independent (verify this), so we must
conclude from (401) that

    A(1 − e^{i√λ 2π}) = 0,   (402)
    B(1 − e^{−i√λ 2π}) = 0 ⇔ B(1 − e^{i√λ 2π}) = 0.

If 1 − e^{i√λ 2π} ≠ 0, both A and B must be zero. This gives M(θ) = 0, so such
λ's are not eigenvalues. The other possibility is that

    e^{i√λ 2π} = 1,   (403)
    ⇕
    2π√λ_k = 2πk,   k = 1, 2, . . . ,
    ⇕
    λ_k = k².   (404)
These are the positive eigenvalues. The corresponding eigenspaces are
two-dimensional and spanned by the normalized eigenfunctions

    (1/√π) cos kθ,   (1/√π) sin kθ.   (405)
In order to find out if λ = 0 is an eigenvalue, we put λ = 0 in (398) and get
the equation M''(θ) = 0, with general solution

    M(θ) = Aθ + B.   (407)

    M(θ + 2π) = M(θ)
    ⇕
    A(θ + 2π) + B = Aθ + B
    ⇕
    2πA = 0 ⇒ A = 0.

Thus λ = 0 is an eigenvalue, and the corresponding normalized eigenfunction is
the constant

    1/√(2π).   (408)
Verify that there are no negative eigenvalues.
Using (405) and (408) we define a finite Fourier transform for functions
u(r,θ) by

    a_0(r) = (1/√(2π)) ∫₀^{2π} dθ u(r,θ),   (409)

    a_k(r) = (1/√π) ∫₀^{2π} dθ u(r,θ) cos kθ,
    b_k(r) = (1/√π) ∫₀^{2π} dθ u(r,θ) sin kθ,   k = 1, 2, . . . .   (410)

The corresponding inverse finite Fourier transform is

    u(r,θ) = (1/√(2π)) a_0(r) + Σ_{k=1}^∞ (1/√π){a_k(r) cos kθ + b_k(r) sin kθ}.   (411)
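The transform pair (409)-(411) can be exercised numerically. In the sketch below u is an arbitrary smooth 2π-periodic test function (the r-dependence is suppressed); the coefficients are computed by the midpoint rule and the truncated inverse transform reconstructs u:

```python
import numpy as np

# Finite Fourier transform (409)-(410) on the circle and its inverse (411),
# for an illustrative smooth 2*pi-periodic test function.
n = 4000
theta = (np.arange(n) + 0.5) * 2 * np.pi / n     # midpoint grid on (0, 2*pi)
dtheta = 2 * np.pi / n
u = np.exp(np.cos(theta)) * np.sin(2 * theta)    # arbitrary test function

K = 30
a0 = u.sum() * dtheta / np.sqrt(2 * np.pi)
ak = np.array([(u * np.cos(k * theta)).sum() * dtheta / np.sqrt(np.pi)
               for k in range(1, K + 1)])
bk = np.array([(u * np.sin(k * theta)).sum() * dtheta / np.sqrt(np.pi)
               for k in range(1, K + 1)])

# Inverse transform (411), truncated at K modes
urec = a0 / np.sqrt(2 * np.pi) + sum(
    (ak[k - 1] * np.cos(k * theta) + bk[k - 1] * np.sin(k * theta)) / np.sqrt(np.pi)
    for k in range(1, K + 1))

print(np.max(np.abs(u - urec)))  # tiny: coefficients of a smooth u decay fast
```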
The ODEs for the coefficients a_k and b_k are now found by multiplying (397) by
the eigenfunctions and integrating over the interval (0, 2π). Verify that we get

    a_k''(r) + (1/r)a_k'(r) − (k²/r²)a_k(r) = −A_k(r),   k = 0, 1, 2, . . . ,
    b_k''(r) + (1/r)b_k'(r) − (k²/r²)b_k(r) = −B_k(r),   k = 1, 2, . . . ,   (412)
where A_k and B_k are the Fourier coefficients of F,

    A_0(r) = (1/√(2π)) ∫₀^{2π} dθ F(r,θ),   (413)

    A_k(r) = (1/√π) ∫₀^{2π} dθ F(r,θ) cos kθ,
    B_k(r) = (1/√π) ∫₀^{2π} dθ F(r,θ) sin kθ,   k = 1, 2, . . . .   (414)
Note that only solutions of the singular equations (412) that are bounded as
r → 0 are acceptable. So our boundary condition at r = 0 is
At r = R the boundary condition is

    u(R,θ) = f(θ).

Taking the finite Fourier transform of the boundary data f(θ), we find that the
boundary conditions for the functions a_k(r), b_k(r) at r = R are

    a_0(R) = α_0 ≡ (1/√(2π)) ∫₀^{2π} dθ f(θ),   (416)

    a_k(R) = α_k ≡ (1/√π) ∫₀^{2π} dθ f(θ) cos kθ,
    b_k(R) = β_k ≡ (1/√π) ∫₀^{2π} dθ f(θ) sin kθ,   k = 1, 2, . . . .   (417)
Note that both of the equations in (412) are inhomogeneous versions of the
homogeneous Euler equation

    C_k''(r) + (1/r)C_k'(r) − (k²/r²)C_k(r) = 0.   (418)

A basis for the solution space of such equations can be found using standard
methods. We have

    C_0(r) = { constant, ln r },   C_k(r) = { r^k, r^{−k} },   k = 1, 2, . . .   (419)
The solutions of (412) can now be found using variation of parameters. We
have

    a_0(r) = ∫₀^r dt ln(R/r) A_0(t)t + ∫_r^R dt ln(R/t) A_0(t)t + α_0,

    a_k(r) = (1/2k) ∫₀^r dt [(t/r)^k − (r/R)^k (t/R)^k] A_k(t)t   (420)
           + (1/2k) ∫_r^R dt [(r/t)^k − (r/R)^k (t/R)^k] A_k(t)t + α_k (r/R)^k,

    b_k(r) = (1/2k) ∫₀^r dt [(t/r)^k − (r/R)^k (t/R)^k] B_k(t)t   (421)
           + (1/2k) ∫_r^R dt [(r/t)^k − (r/R)^k (t/R)^k] B_k(t)t + β_k (r/R)^k.
Let us now return to the general scheme. As we have indicated, the solutions
found using the finite Fourier transform will in general converge slowly,
because the eigenfunctions satisfy homogeneous boundary conditions, whereas in
general the solution may satisfy inhomogeneous boundary conditions.
However, the problem of slow convergence can in general be solved using an
approach which we will now explain.
Let us first consider the case when the inhomogeneous terms in the equation
and boundary conditions are independent of time (we are here thinking about
the hyperbolic and the parabolic cases).
Thus we have
We will assume that we can solve (423). Define w(x,t) = u(x,t) − v(x). Then
w(x,t) satisfies homogeneous boundary conditions:

    α(x)w(x,t) + β(x)∂_n w(x,t)
        = α(x)u(x,t) − α(x)v(x) + β(x)∂_n u(x,t) − β(x)∂_n v(x)
        = B(x) − B(x) = 0.   (427)
Let us assume that we can find a function v(x,t) such that

    α(x)w(x,t) + β(x)∂_n w(x,t)
        = α(x)u(x,t) − α(x)v(x,t) + β(x)∂_n u(x,t) − β(x)∂_n v(x,t)
        = B(x,t) − B(x,t) = 0,
and
where
    F̂(x,t) = F(x,t) − Kv(x,t) − Lv(x,t)/ρ(x).
ρ(x)
Thus w(x, t) solves an inhomogeneous equation with homogeneous boundary
conditions. The solution to this problem can be found using Duhamel's principle
or the nite Fourier transform. Convergence of the series should now be much
faster and thus lead to more ecient numerical calculations.
Let us illustrate this method for a 1D inhomogeneous heat equation. The
function

    v(x,t) = (1/l)(x g_2(t) + (l − x) g_1(t))   (434)

satisfies the inhomogeneous boundary conditions. Defining w(x,t) = u(x,t) −
v(x,t), we find immediately that
    w_t(x,t) − c² w_xx(x,t) = g(x,t) − (1/l)(x g_2'(t) + (l − x) g_1'(t)),   (435)

for t > 0 and 0 < x < l. The boundary conditions are by construction
homogeneous,

    w(0,t) = 0,   (436)
    w(l,t) = 0,

and the initial condition is

    w(x,0) = f(x) − (1/l)(x g_2(0) + (l − x) g_1(0)).
This problem can now be solved by, for example, a series expansion, and the
convergence of the series should be good.
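The lifting function (434) works because it is just the linear interpolant of the two boundary temperatures. A minimal check, with arbitrary illustrative boundary data g1, g2:

```python
import numpy as np

# The lifting function (434): v(x,t) = (x*g2(t) + (l - x)*g1(t))/l.
# g1, g2 and l are arbitrary illustrative choices.
l = 2.0
g1 = lambda t: np.sin(t)        # boundary data at x = 0
g2 = lambda t: 1.0 + t**2       # boundary data at x = l

v = lambda x, t: (x * g2(t) + (l - x) * g1(t)) / l

t = 0.73
print(v(0.0, t) - g1(t), v(l, t) - g2(t))   # both zero: v matches the BCs
# Since v is linear in x, v_xx = 0, so w = u - v only picks up the extra
# source term -(x*g2'(t) + (l - x)*g1'(t))/l appearing in (435).
```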
∂t n + a(n)∂x n = 0, (438)
a(n) ≈ a0 + a1 n. (439)
Inserting (439) into (438) and rearranging the terms we get the nonlinear equa-
tion
By assumption (439), the nonlinear term in this equation is smaller than the
linear ones.
The use of truncated expansions like (439) is very common when modeling
physical systems by PDEs. We saw this for example for the vibrating string.

The refractive index, n, is a material property that determines the optical
response of dielectrics and gases. For inhomogeneous materials, like corrugated
glass for example, it will depend on position, n = n(x).
An important equation in the theoretical description of light is

    (1/c²) ∂_tt(n²E) − ∇²E = 0.   (441)
This is an equation which models light under normal conditions well. It is
clearly a linear equation. For light under more extreme conditions, for example
light generated by a laser, one finds that the refractive index depends on the
light intensity I = |E|². We thus in general have n = n(x, I). This function
has been measured for many materials, and one finds that in many cases a good
approximation is

    n = n_0 + n_2 I,   (442)

where both n_0 and n_2 in general depend on position. Inserting (442) into
(441) and keeping only terms that are linear in the intensity I, we get the
equation
    (n_0²/c²) ∂_tt E − ∇²E = −(2n_0 n_2/c²) ∂_tt(|E|² E).   (443)
This is now a nonlinear equation. The factor n_2 is for most materials very
small, so light of very high intensity I = |E|² is required in order for the
nonlinearity to have any effect. From your basic physics class you might recall
that light tends to bend towards areas of higher refractive index; this
property is the basis for Snell's law in optics. Thus the intensity dependent
index (442) will tend to focus light into areas of high light intensity. This
increases the intensity, creating an even larger index, which focuses the light
even more strongly. This is a runaway effect that quickly creates a local light
intensity high enough to destroy the material the light is traveling through.

The study of nonlinear properties of high-intensity light is a large and very
active field of research which has given us amazing technologies, like the
laser, optical fibers, perfect lenses of microscopic size and even invisibility
cloaks! (Well, almost anyway.)
For some physical systems there are no good linear approximations. One
such area is fluid dynamics. The most important equation in this field of
science is the Navier-Stokes equation.
where ρ_0 is the mass density, u(x,t) the fluid velocity field, p(x,t) the
pressure in the fluid and μ the viscosity (friction). The second term on the
left hand side of (444) is clearly nonlinear, and in many important
applications it is not small. The essential nonlinearity of the Navier-Stokes
equation gives rise to the phenomenon of turbulence and other physical
phenomena that are extremely hard to describe and predict. In fact, it is not
even known, after almost 200 years, if the Cauchy problem for (444) is well
posed. There is a possibility that solutions of (444) develop singularities in
finite time. If you can solve this problem, the Clay Mathematics Institute will
pay you one million dollars!
For systems where the nonlinear terms are small, one would expect that
methods introduced for solving linear PDEs should be useful, somehow. They
are, and we will illustrate this using a nonlinear heat equation.

The parameter λ̂ measures the strength of the nonlinearity. The equation can
model situations where we have a nonlinear heat source. The equation (445)
clearly has a solution

    u_0(x,t) = 0,   (446)

where h(x) is of normal size, h(x) = O(1). We similarly introduce an O(1)
function w(x,t) through

The first step is to do a linear stability analysis. For this we recognize
that, if w = O(1), the last term in (449) is much smaller than the other terms,
because ε ≪ 1. We therefore drop the small term and are left with the equation
(450) is an arbitrarily good approximation to (449) if ε is small enough and
w = O(1). Going from (449) to (450) is called linearization. Linearization is
an idea of utmost importance in applied mathematics.

We will now investigate the linearized equation (450) for the infinite and
finite case separately.
    ⇕
    λ(k) = λ̂ − k².   (453)

From this we see that the stability index for the equation (450) is

i) u_0 is stable if λ̂ < 0,

In the unstable case the normal mode (451) experiences unbounded exponential
growth. However, when the growth makes the nonlinear term as large as the
linear ones, the linearized equation is not valid anymore. From (455) this
happens when

    ε²w² ∼ 1,   (456)
    ⇕
    w ∼ 1/ε,

which gives us, using (451), a break-down time

    t* = |log ε|/λ̂.   (457)
So what happens after the break-down time? Observe that the nonlinear equa-
tion (445) has another simple uniform solution
u1 (x, t) = 1. (458)
We can now ask when this solution is stable. We introduce a small parameter ε
and an O(1) function v(x,t) through
λ = −2λ̂ − k 2 , (461)
i) u1 is stable if λ̂ > 0,
Let us look for a solution to (445) that is independent of x, u = u(t). Such a
solution must satisfy the reduced equation with initial condition

    u(0) = ε.

The solution is

    u(t) = ε e^{λ̂t}/(1 + ε²(e^{2λ̂t} − 1))^{1/2}.   (464)
It is easy to verify that
for the case λ̂ > 0. Thus this special solution lends further support to our
conjecture.
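The special solution (464) can be checked against a direct numerical solution of the reduced equation. We assume here, consistently with (464), that the x-independent reduction of (445) reads u'(t) = λ̂(u − u³); the values of λ̂ and ε are arbitrary:

```python
import numpy as np

# Check (464) against an RK4 solution of the assumed reduced ODE
# u'(t) = lam*(u - u^3), u(0) = eps.  lam and eps are illustrative.
lam, eps = 1.5, 0.01

def u_exact(t):
    # formula (464)
    return eps * np.exp(lam * t) / np.sqrt(1.0 + eps**2 * (np.exp(2*lam*t) - 1.0))

def u_rk4(t, steps=4000):
    f = lambda y: lam * (y - y**3)
    h, y = t / steps, eps
    for _ in range(steps):
        k1 = f(y); k2 = f(y + h*k1/2); k3 = f(y + h*k2/2); k4 = f(y + h*k3)
        y += h * (k1 + 2*k2 + 2*k3 + k4) / 6
    return y

t = 5.0
err = abs(u_exact(t) - u_rk4(t))
print(u_exact(t), err)   # the solution has saturated near u = 1
```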
Recall that linearizing (445) around the simple solution u_0(x,t) = 0 gives us
an equation of the form
We can solve this problem using the finite Fourier transform. Introduce the
operator L = −∂_xx. The relevant eigenfunctions satisfy

    M_k(0) = M_k(π) = 0,
    λ_k = −k²,   k = 1, 2, . . . ,
and the normalized eigenfunctions are

    M_k(x) = √(2/π) sin kx.   (470)
The associated finite Fourier transform of a given function f(x) is

    N_k = (f, M_k),   k = 1, 2, . . . ,   (471)

    f(x) = Σ_k N_k M_k(x).   (472)
We now express an arbitrary solution to (467) in terms of an inverse finite
Fourier transform

    w(x,t) = Σ_k N_k(t)M_k(x).   (473)

The evolution equations for the N_k(t) are found from (467) if we multiply by
M_k(x) and integrate:

    ∫₀^π dx w_t(x,t)M_k(x) − ∫₀^π dx w_xx(x,t)M_k(x) = ∫₀^π dx λ̂w(x,t)M_k(x).   (474)
    ∫₀^π dx w_t(x,t)M_k(x) = ∂_t ∫₀^π dx w(x,t)M_k(x) = N_k'(t),   (475)

    ∫₀^π dx w_xx(x,t)M_k(x) = [w_x(x,t)M_k(x)]₀^π − ∫₀^π dx w_x(x,t)M_kx(x)   (476)
        = [w_x(x,t)M_k(x)]₀^π − [w(x,t)M_kx(x)]₀^π + ∫₀^π dx w(x,t)M_kxx(x)
        = ∫₀^π dx w(x,t)(−k²)M_k(x) = −k²N_k(t),   (477)

    ∫₀^π dx λ̂w(x,t)M_k(x) = λ̂N_k(t).   (478)
Inserting (475), (476) and (478) into (474) gives the following uncoupled
system of ODEs for the unknown functions N_k(t):

    N_k'(t) + k²N_k(t) = λ̂N_k(t)
    ⇕
    N_k'(t) = (λ̂ − k²)N_k(t)
    ⇕
    N_k(t) = a_k e^{(λ̂−k²)t}.
The general solution to (467) can thus be written as

    w(x,t) = Σ_{k=1}^∞ w_k(x,t),   (480)
We see that
For the infinite domain case the stability condition was λ̂ < 0. Thus the
stability condition is not the same for an infinite as for a finite domain.
This is true in general.
For the unstable case, λ̂ > 1, one or more normal modes experience exponential
growth, which after a finite time invalidates the linearized equation (467).
What is the system going to do after this time? This is a question of nonlinear
stability.
Nonlinear stability is a much harder problem than linear stability, but
powerful techniques exist in the form of asymptotic expansions, which in many
important cases can provide an answer to the question of nonlinear stability.

Here we will not dive into this more sophisticated approach, but will rather
introduce an approach based on the finite Fourier transform.
Observe that the finite Fourier transform can also be applied to the nonlinear
equation

    w(x,t) = Σ_{k=1}^∞ N_k(t)M_k(x).   (483)
The evolution equations for the N_k(t)'s are found as for the linearized case.
We have

    ∫₀^π dx w_t(x,t)M_k(x) − ∫₀^π dx w_xx(x,t)M_k(x)   (484)
        = ∫₀^π dx λ̂w(x,t)M_k(x) − ∫₀^π dx λ̂ε²w³(x,t)M_k(x).
The only new term as compared with the linearized case is the second term on
the right hand side of (484). For this term we have
    ∫₀^π dx λ̂ε²w³(x,t)M_k(x)   (485)
        = λ̂ε² Σ_{i,j,l=1}^∞ ∫₀^π dx N_i(t)N_j(t)N_l(t)M_i(x)M_j(x)M_l(x)M_k(x)
        = λ̂ε² Σ_{i,j,l=1}^∞ a_k^{ijl} N_i(t)N_j(t)N_l(t),

where

    a_k^{ijl} = ∫₀^π dx M_i(x)M_j(x)M_k(x)M_l(x).   (486)
0
    N_k'(t) + (k² − λ̂)N_k(t) = −λ̂ε² Σ_{i,j,l=1}^∞ a_k^{ijl} N_i(t)N_j(t)N_l(t).   (487)
    −λ̂ε² a_1^{111} N_1³(t),   (489)

is much larger than the other terms. A good approximation to (487) therefore
appears to be

The coefficient a_1^{111} can be calculated using (486):

    a_1^{111} = ∫₀^π dx (M_1(x))⁴ = (4/π²) ∫₀^π dx sin⁴x = 3/(2π).   (491)
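The value a_1^{111} = 3/(2π) in (491) is easy to confirm by quadrature:

```python
import numpy as np

# Quadrature check of (491): with M_1(x) = sqrt(2/pi)*sin(x), the coupling
# coefficient a_1^{111} = int_0^pi M_1(x)^4 dx equals 3/(2*pi).
n = 100000
x = (np.arange(n) + 0.5) * np.pi / n            # midpoint rule on (0, pi)
M1 = np.sqrt(2.0 / np.pi) * np.sin(x)
a111 = (M1**4).sum() * np.pi / n

print(a111, 3.0 / (2.0 * np.pi))   # the two values agree
```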
    ⇒ N_1'(t) + (1 − λ̂)N_1(t) + (3λ̂ε²/2π)N_1³(t) = 0.   (492)
If we multiply this equation by N_1(t) we obtain

    (1/2)y'(t) + (1 − λ̂)y(t) + (3λ̂ε²/2π)y²(t) = 0,   (493)

where y(t) = N_1²(t). (493) is a so-called Riccati equation and can be solved
exactly. There is a general way to solve such equations.

A general Riccati equation is of the form
where

    R(x) = q_1(x) + q_2'(x)/q_2(x).

    v = −u'/(q_2 u).   (498)

Then we get the following linear equation for u:

    u''(x) − R(x)u'(x) + q_0(x)q_2(x)u(x) = 0.   (499)

Using (498) and (495), any solution of (499) gives a solution of the Riccati
equation (494). Since the general solution to (499) can be found explicitly in
some cases, the general solution of (494) can also be found for those cases.
For our equation (493) we have

    q_0(x) = 0,
    q_1(x) = 2(λ̂ − 1),   (500)
    q_2(x) = −3λ̂ε²/π.
We can find the general solution to this linear equation, which leads to the
general solution of (493). With the initial condition given as in (447) we get

    N_1(t) = (h, M_1)e^{(λ̂−1)t} / [1 + (3λ̂ε²/(2π(λ̂ − 1)))(h, M_1)²(e^{2(λ̂−1)t} − 1)]^{1/2}.   (501)
11 Integral transforms
Integral transforms are used to solve PDEs on unbounded domains. They have
very wide applicability and play a fundamental role in applied mathematics and
theoretical physics.
In these notes I will concentrate on equations of the form
where
and where
    K = ∂_tt,   hyperbolic case,
        ∂_t,    parabolic case,      (505)
        ∂_yy,   elliptic case with cylinder symmetry,
        0,      general elliptic case.
Recall that the basic idea in the separation of variables method is to look for
special solutions to (503) of the form
170
    N''(t) + λ²N(t) = 0,   hyperbolic case,

    M(l) = M(−l) = 0.
The general solution to the equation is for λ > 0

    M(x) = Ae^{iλx} + Be^{−iλx},

and the boundary conditions give

    Ae^{iλl} + Be^{−iλl} = 0,
    Ae^{−iλl} + Be^{iλl} = 0,

that is,

    [ e^{iλl}    e^{−iλl} ] [ A ]
    [ e^{−iλl}   e^{iλl}  ] [ B ] = 0.   (512)

In order for (512) to have a nontrivial solution, and thus for λ to be in the
spectrum of L = −∂_xx, the matrix in (512) must have zero determinant:

    det [ e^{iλl}    e^{−iλl} ]
        [ e^{−iλl}   e^{iλl}  ] = 0,   (513)
    ⇕
    e^{4iλl} = 1,
    ⇕
    4lλ_k = 2πk,
    ⇕
    λ_k = πk/(2l),   k = 1, 2, . . . .   (514)
There are no eigenvalues for λ ≤ 0. Thus (514) defines the spectrum of L.
Observe that the distance ∆λ between two spectral points is

    ∆λ = π/(2l).   (515)
Figure 55
When the size of the domain increases without bound, we observe that the
distance between spectral points goes to zero,

    lim_{l→∞} ∆λ = 0.   (516)

This is in general the case for unbounded domains; the spectra of the operators
form a continuum.
For the discrete case, with eigenvalues {λ_k}_{k=1}^∞ and corresponding
eigenfunctions {M_k}_{k=1}^∞, the basic idea was to look for solutions to the
PDE in the form of an infinite sum

    u = Σ_{k=1}^∞ N_k M_k.   (518)
We recall that in the case of finite domains, the fitting of initial conditions to
our solution consisted in finding constants {ak}∞_{k=1} such that

f(x) = Σ_{k=1}^{∞} ak Mk(x).        (520)
The coefficients are given by

ak = (Mk, f) = ∫_G dV ρ(x)f(x)Mk(x).        (521)
Fitting initial conditions for the case of unbounded domains will by analogy
involve finding coefficients aλ, λ ∈ D, such that

f(x) = ∫ dλ aλ Mλ(x).        (522)
This is in fact what Paul Dirac did early in the previous century. Dirac was
an English genius who was one of the founding fathers of quantum theory. He
came to this idea while trying to solve the Schrödinger equation from quantum
mechanics. He found that δλλ′ only depended on λ − λ′ and he wrote it as δ(λ − λ′).
δ(λ) is called the Dirac delta function after him. In order to do its intended job
he found that it would have to have the following two properties:
δ(λ) = 0,  λ ≠ 0,        (526)

∫ dλ δ(λ) = 1.        (527)
It is a trivial exercise to prove that no function exists that has the two properties
(526) and (527). Mathematicians were quick to point this out, but Dirac
didn't care; he did not know much mathematics, and he did not have to know it
either, because he was in fact a genius, and people like him march to a different
drum.
Eventually mathematicians did make sense of the Dirac delta function and
in fact discovered a whole new mathematical universe called the theory of
distributions. We will discuss elements of this theory next semester. In this semester
we will use objects like the Dirac delta sparingly, and when we do, only in a
purely formal way, like Dirac himself.
There is an important moral to the story of the Dirac delta. Pure mathematics
is important because it gives us abstract frameworks that help organize
our mathematical methods. Pure mathematics can also give us more efficient
ways of doing our calculations and make precise under which conditions our
manipulations will keep us within the boundaries of the known mathematical
universe.
However, the framework and bounds imposed by pure mathematics should
not stop us from sometimes disregarding the rules and playing with the formulas,
like Dirac did. In the final analysis, virtually all of mathematics originates from
such rule-defying play.
It is, however, also appropriate to give a warning: disregarding the rules and
playing with the formulas should not be a cover for sloppy thinking. This is
unfortunately quite frequently the case in applied science.
We can't all be a Dirac. Disregarding the rules imposed by pure mathematics
will, by most people, most of the time, produce useless nonsense.
In these notes I will sometimes break the rules à la Dirac, and I will try to
remember to tell you when I do. All my rule breaking can be made sense of
within the universe of distributions.
All finite Fourier transforms will, in the limit of unbounded domains, give rise
to integral transforms. There is thus a large set of such transforms available,
each tailored to a particular class of problems. In these notes I will focus on the
Sine-, Cosine-, Fourier- and Laplace transforms.
Consider first the eigenvalue problem on the whole real line,

−M″(x) = λ²M(x),  −∞ < x < ∞.        (528)

The boundary condition is that M(x) is bounded when |x| → ∞. The general
solution of the equation (528) is

Mλ(x) = a(λ)e^{iλx} + b(λ)e^{−iλx}.        (529)

If

|a(λ)| < Ma,  |b(λ)| < Mb,  ∀λ,        (530)

then the boundedness of Mλ follows from the triangle inequality:
|Mλ(x)| = |a(λ)e^{iλx} + b(λ)e^{−iλx}|
        ≤ |a(λ)||e^{iλx}| + |b(λ)||e^{−iλx}|        (531)
        = |a(λ)| + |b(λ)| ≤ Ma + Mb.
Therefore the spectrum in this case consists of the whole real axis. We can now
use these eigenfunctions to represent functions f(x) in the following way:

f(x) = ∫_{−∞}^{∞} dλ {a(λ)e^{iλx} + b(λ)e^{−iλx}}        (532)
     = ∫_{−∞}^{∞} dλ a(λ)e^{iλx} + ∫_{−∞}^{∞} dλ b(λ)e^{−iλx}
     = ∫_{−∞}^{∞} dλ a(−λ)e^{−iλx} + ∫_{−∞}^{∞} dλ b(λ)e^{−iλx}
     = ∫_{−∞}^{∞} dλ c(λ)e^{−iλx},
where c(λ) = a(−λ) + b(λ). Note that in (532), f(x) is represented in terms of e^{−iλx}
through a standard improper integral. In the theory of integral transforms,
other, less standard, improper integrals appear. We will discuss these when they
arise.
The formula (532) is by definition the inverse Fourier transform. The key
question is how to find the c(λ) corresponding to a given function f(x).
For this purpose, let us introduce a small number ε > 0 and define

ε(x) = { −iε,  x > 0,
         +iε,  x < 0.        (533)
Observe that for all real λ we have

∫_{−∞}^{∞} dx e^{−i(λ+ε(x))x} = lim_{L→∞} ∫_{−L}^{L} dx e^{−i(λ+ε(x))x}        (534)

= lim_{L→∞} { ∫_{−L}^{0} dx e^{−i(λ+iε)x} + ∫_{0}^{L} dx e^{−i(λ−iε)x} }        (535)

= lim_{L→∞} { [ e^{−i(λ+iε)x} / (−i(λ+iε)) ]_{−L}^{0} + [ e^{−i(λ−iε)x} / (−i(λ−iε)) ]_{0}^{L} }        (536)

= lim_{L→∞} { 1/(−i(λ+iε)) + 1/(i(λ−iε)) + e^{−εL}e^{iλL}/(i(λ+iε)) − e^{−εL}e^{−iλL}/(i(λ−iε)) }

= 1/(i(λ−iε)) − 1/(i(λ+iε)) = 2ε/(ε² + λ²).        (537)
Thus

lim_{ε→0} ∫_{−∞}^{∞} dx e^{−i(λ+ε(x))x} = { 0,  λ ≠ 0,
                                             ∞,  λ = 0.        (538)
This result makes the following formula plausible (I am playing with formulas
now):

∫_{−∞}^{∞} dx e^{−iλx} ∝ δ(λ),        (539)

and we fix the normalization by

δ(λ) = (1/2π) ∫_{−∞}^{∞} dx e^{−iλx}.        (540)
Since the Dirac delta function is zero for λ ≠ 0, it appears that for all reasonable
functions f(λ) we have

∫_{−∞}^{∞} dλ f(λ)δ(λ − λ̂) = f(λ̂).        (541)
(I am playing now.)
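The sifting property (541) can be made concrete numerically by replacing δ with a narrow normalized Gaussian; the test function and width below are illustrative choices, not part of the original argument:

```python
import numpy as np

# Approximate the Dirac delta by a narrow normalized Gaussian and check
# the sifting property (541): ∫ f(λ) δ(λ - λ̂) dλ ≈ f(λ̂).
def delta_eps(lam, eps):
    return np.exp(-lam**2 / (2 * eps**2)) / (eps * np.sqrt(2 * np.pi))

lam = np.linspace(-10, 10, 200001)
dlam = lam[1] - lam[0]
f = lambda s: np.cos(s) * np.exp(-s**2 / 10)   # illustrative smooth function
lam_hat = 1.3

approx = np.sum(f(lam) * delta_eps(lam - lam_hat, 1e-2)) * dlam
print(abs(approx - f(lam_hat)))  # small
```

Shrinking the width parameter makes the agreement better, which is exactly the sense in which (541) is to be understood.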
Let us now return to our original question of how to find the c(λ) corresponding
to a given function f(x) in formula (532):

f(x) = ∫_{−∞}^{∞} dλ c(λ)e^{−iλx}.        (542)
Multiplying (542) by e^{iλ̂x} and integrating with respect to x over −∞ < x < ∞
gives

∫_{−∞}^{∞} dx f(x)e^{iλ̂x} = ∫_{−∞}^{∞} dx e^{iλ̂x} ∫_{−∞}^{∞} dλ c(λ)e^{−iλx}        (543)
= ∫_{−∞}^{∞} dλ c(λ) ∫_{−∞}^{∞} dx e^{−i(λ−λ̂)x}
= ∫_{−∞}^{∞} dλ c(λ) 2πδ(λ − λ̂) = 2πc(λ̂).        (544)

Thus

c(λ) = (1/2π) ∫_{−∞}^{∞} dx f(x)e^{iλx}.        (545)
This formula defines the Fourier transform. Note that there are several
conventions about where to place the factor of 2π. A more symmetric choice [1]
is
F(λ) = (1/√(2π)) ∫_{−∞}^{∞} dx f(x)e^{iλx},        (546)

f(x) = (1/√(2π)) ∫_{−∞}^{∞} dλ F(λ)e^{−iλx}.        (547)
Note that not only are there different conventions on where to place the factor
of 2π, but also on which exponent carries the minus sign. In this class (546)
defines the Fourier transform and (547) defines the inverse Fourier transform.
Pure mathematicians have made extensive investigations into for which pairs
of functions f(x) and F(λ) the formulas (546) and (547) are valid. For example,
if f(x) is piecewise continuously differentiable on each finite interval and if
additionally

∫_{−∞}^{∞} dx |f(x)| < ∞,        (548)

then formulas (546) and (547) hold pointwise. This result and related ones are
important, but we will use (546) and (547) in a much wider context than this.
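A small numerical experiment, with the Gaussian e^{−x²/2} as an illustrative test function, shows the convention (546) at work; under this symmetric convention the Gaussian is its own transform:

```python
import numpy as np

# Discretize F(λ) = (1/√(2π)) ∫ f(x) e^{iλx} dx, formula (546),
# by a Riemann sum, and compare with the exact transform of e^{-x^2/2}.
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)

lam = np.linspace(-5, 5, 101)
F = np.array([np.sum(f * np.exp(1j * l * x)) * dx for l in lam]) / np.sqrt(2 * np.pi)

err = np.max(np.abs(F - np.exp(-lam**2 / 2)))
print(err)  # close to zero
```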
Let f(x) = δ(x). Then (546) gives

F(λ) = (1/√(2π)) ∫_{−∞}^{∞} dx δ(x)e^{iλx} = 1/√(2π),        (549)

and (547) then gives

f(x) = (1/√(2π)) ∫_{−∞}^{∞} dλ (1/√(2π)) e^{−iλx} = (1/2π) ∫_{−∞}^{∞} dλ e^{−iλx} = δ(x),        (550)

where we have used the formula (540) which we argued for previously. More
generally, the pair of formulas (546) and (547) holds for any distribution, not
just the delta distribution. More about this next semester. In this semester we
will basically use (546) and (547) to take the Fourier transform and inverse Fourier
transform of anything we want!
The Fourier transform has many powerful properties. You can find them in
handbooks or on the web whenever you need them. For example, if F(λ) and
G(λ) are the Fourier transforms of f(x) and g(x), then aF(λ) + bG(λ) is the Fourier
transform of af(x) + bg(x). A less trivial, and for us extremely useful, property
involves products of Fourier transforms F(λ)G(λ) of functions f(x) and g(x).
Such products often occur when we use Fourier transforms to solve ODEs and
PDEs.
Playing with formulas we get the following:

(1/√(2π)) ∫_{−∞}^{∞} dλ F(λ)G(λ)e^{−iλx} = (1/2π) ∫_{−∞}^{∞} dλ G(λ)e^{−iλx} ∫_{−∞}^{∞} dt f(t)e^{iλt}
= (1/2π) ∫_{−∞}^{∞} dt f(t) ∫_{−∞}^{∞} dλ G(λ)e^{−iλ(x−t)}
= (1/√(2π)) ∫_{−∞}^{∞} dt f(t)g(x − t).        (551)

The integral on the right hand side of (551) is known as a convolution integral.
Canceling the factor 1/√(2π) we get the convolution theorem for Fourier
transforms:

∫_{−∞}^{∞} dt f(t)g(x − t) = ∫_{−∞}^{∞} dλ F(λ)G(λ)e^{−iλx}.        (552)
Setting x = 0 in (552) gives

∫_{−∞}^{∞} dt f(t)g(−t) = ∫_{−∞}^{∞} dλ F(λ)G(λ).        (553)
Let g(t) = f̄(−t), where the bar denotes complex conjugation. Since

F(λ) = (1/√(2π)) ∫_{−∞}^{∞} dt f(t)e^{iλt},

complex conjugation gives

F̄(λ) = (1/√(2π)) ∫_{−∞}^{∞} dt f̄(t)e^{−iλt} = (1/√(2π)) ∫_{−∞}^{∞} dt f̄(−t)e^{iλt},        (554)

so the Fourier transform of g is G(λ) = F̄(λ). Using (553) we then get

∫_{−∞}^{∞} dt |f(t)|² = ∫_{−∞}^{∞} dt f(t)f̄(t) = ∫_{−∞}^{∞} dt f(t)g(−t) = ∫_{−∞}^{∞} dλ |F(λ)|².        (555)
The identity

∫_{−∞}^{∞} dt |f(t)|² = ∫_{−∞}^{∞} dλ |F(λ)|²        (556)

is known as Parseval's identity.
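Parseval's identity (556) is easy to test numerically; here is a sketch using a direct Riemann-sum discretization of (546) for an illustrative modulated Gaussian:

```python
import numpy as np

# Check Parseval's identity (556): ∫|f|^2 dt = ∫|F|^2 dλ, with F computed
# from the symmetric-convention transform (546) by a Riemann sum.
x = np.linspace(-20, 20, 2001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2) * np.cos(3 * x)

lam = np.linspace(-15, 15, 3001)
dlam = lam[1] - lam[0]
F = np.array([np.sum(f * np.exp(1j * l * x)) * dx for l in lam]) / np.sqrt(2 * np.pi)

lhs = np.sum(np.abs(f)**2) * dx
rhs = np.sum(np.abs(F)**2) * dlam
print(abs(lhs - rhs))  # close to zero
```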
The Fourier transform also interacts with derivatives in a very simple way.
If f(x) → 0 when |x| → ∞, integration by parts gives

(1/√(2π)) ∫_{−∞}^{∞} dx f′(x)e^{iλx} = (1/√(2π)) [f(x)e^{iλx}]_{−∞}^{∞} − (1/√(2π)) ∫_{−∞}^{∞} dx f(x)(iλ)e^{iλx}
= −iλ (1/√(2π)) ∫_{−∞}^{∞} dx f(x)e^{iλx} = −iλF(λ).        (557)

Thus, differentiation is converted into multiplication by −iλ under the Fourier
transform. (557) is easily generalized to

(1/√(2π)) ∫_{−∞}^{∞} dx f^{(n)}(x)e^{iλx} = (−iλ)^n F(λ).        (558)
Formulas (557) and (558) are the main sources for the great utility of Fourier
transforms in the theory of PDEs and ODEs.
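The derivative rule (557) can be checked numerically; the grid and test function below are illustrative choices:

```python
import numpy as np

# Check the derivative rule (557): the transform of f'(x) equals -iλ F(λ).
x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
f = np.exp(-x**2 / 2)
fp = -x * np.exp(-x**2 / 2)       # exact derivative of f

lam = np.linspace(-5, 5, 101)
ft = lambda g: np.array([np.sum(g * np.exp(1j * l * x)) * dx for l in lam]) / np.sqrt(2 * np.pi)

err = np.max(np.abs(ft(fp) - (-1j * lam) * ft(f)))
print(err)  # close to zero
```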
Consider now the boundary value problem

y″(x) − k²y(x) = −f(x),  −∞ < x < ∞,        (559)

with boundary conditions

y(x), y′(x) → 0 as |x| → ∞.        (560)

Let F(λ) and Y(λ) be the Fourier transforms of f(x) and y(x). Then the
transform of the equation (559) is, upon using (558), given by

Y(λ) = F(λ)/(λ² + k²).        (562)

Using the inverse Fourier transform and the convolution theorem we get the
following representation of y(x):

y(x) = (1/√(2π)) ∫_{−∞}^{∞} dλ F(λ)e^{−iλx}/(λ² + k²) = (1/√(2π)) ∫_{−∞}^{∞} dt g(x − t)f(t),        (563)
where

g(ξ) = (1/√(2π)) ∫_{−∞}^{∞} dλ e^{−iλξ}/(λ² + k²).        (564)
Using a table of transforms or residue calculus from complex function theory
we have

g(ξ) = (√(2π)/(2k)) e^{−k|ξ|},        (565)

so that the solution y(x) is

y(x) = (1/(2k)) ∫_{−∞}^{∞} dt e^{−k|x−t|} f(t).        (566)
Because of the exponentially decaying factor, y(x) and y′(x) will both go to zero
as |x| → ∞ if f(t) decays fast enough at infinity. Actually, computing y(x) for
a given f(t) using formula (566) is no simple matter; tables of integrals, complex
function theory, numerical methods and asymptotic methods are all of help here.
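To see what such a numerical evaluation of (566) might look like, here is a sketch with an illustrative Gaussian source; the residual of equation (559) is then checked with finite differences:

```python
import numpy as np

# Numerically evaluate the solution formula (566),
#   y(x) = (1/2k) ∫ e^{-k|x-t|} f(t) dt,
# for an illustrative source f(t) = e^{-t^2}, and check that the residual
# y'' - k^2 y + f is small in the interior of the grid.
k = 2.0
t = np.linspace(-15, 15, 1501)
dt = t[1] - t[0]
f = np.exp(-t**2)

# y on the same grid, by a Riemann sum over t
y = (np.exp(-k * np.abs(t[:, None] - t[None, :])) @ f) * dt / (2 * k)

# second derivative by central differences
ypp = (y[2:] - 2 * y[1:-1] + y[:-2]) / dt**2
residual = ypp - k**2 * y[1:-1] + f[1:-1]
print(np.max(np.abs(residual[100:-100])))  # small
```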
For some simple choices of f(t) the integral in (566) can be evaluated easily.
Let us first choose f(t) = 1. Formula (566) implies

y(x) = (1/(2k)) ∫_{−∞}^{∞} dt e^{−k|x−t|}        (567)
     = (1/(2k)) { ∫_{−∞}^{x} dt e^{−k(x−t)} + ∫_{x}^{∞} dt e^{k(x−t)} }
     = (1/(2k)) { e^{−kx} [ (1/k)e^{kt} ]_{−∞}^{x} + e^{kx} [ −(1/k)e^{−kt} ]_{x}^{∞} }
     = (1/(2k)) { 1/k + 1/k } = 1/k².
Observe that

y″(x) − k²y(x) = −k² (1/k²) = −1 = −f(x),        (568)

so y(x) defined in (567) is a solution to the equation (559), but it clearly does
not satisfy the boundary conditions (560). The reason why this occurs is that
in deriving the formula (566) we assumed that f(x) had a Fourier transform
F(λ). However, f(x) = 1 leads to the divergent integral
F(λ) = (1/√(2π)) ∫_{−∞}^{∞} dx e^{iλx}        (569)

for F(λ). Using (550) we conclude that F(λ) is not a function but a distribution,

F(λ) = √(2π) δ(λ).        (570)
Calculations like the one leading up to the "solution" (567) are very common
whenever transforms are used to solve PDEs. They are ideal calculations that
are meant to represent the limit of a sequence of problems, each producing an
honest to God solution to the boundary value problem (559),(560). Such ideal
calculations are often possible to do in closed form and do often lead to some
insight into the general case, but in what sense the result of the calculation is
a limit of a sequence of actual solutions is for the most part left unsaid. This
will be clarified when we discuss distributions next semester. However, let us
preview next semester by saying exactly what we mean by a limit here.
Define a sequence of functions {fN(x)} by

fN(x) = { 1,  |x| ≤ N,
          0,  |x| > N.        (571)
Figure 56
This function has a Fourier transform for each N, and our formula (566)
should give a solution to the boundary value problem (559),(560):

yN(x) = (1/(2k)) ∫_{−∞}^{∞} dt e^{−k|x−t|} fN(t)        (572)
      = (1/(2k)) ∫_{−N}^{N} dt e^{−k|x−t|}.
For |x| ≤ N this gives

yN(x) = (1/(2k)) [ ∫_{−N}^{x} dt e^{−k(x−t)} + ∫_{x}^{N} dt e^{k(x−t)} ]        (573)
      = (1/(2k)) { e^{−kx} [ (1/k)e^{kt} ]_{−N}^{x} + e^{kx} [ −(1/k)e^{−kt} ]_{x}^{N} }
      = (1/(2k²)) { e^{−kx}(e^{kx} − e^{−kN}) − e^{kx}(e^{−kN} − e^{−kx}) }
      = (1/(2k²)) { 1 − e^{−kN}e^{−kx} − e^{−kN}e^{kx} + 1 }
      = (1/k²) ( 1 − e^{−kN} cosh kx ).        (574)
The calculation for |x| > N is completed in a similar way. This gives us

yN(x) = { (1/k²)(1 − e^{−kN} cosh kx),  |x| ≤ N,
          (1/k²) e^{−k|x|} sinh kN,      |x| ≥ N.        (575)
yN(x) is a solution to (559), and it is easy to verify that yN(x), y′N(x) → 0 when
|x| → ∞, so we have an infinite sequence of solutions {yN(x)} to the boundary
value problem defined by (559),(560).

For any x there evidently exists an N so large that N ≥ |x|. Therefore
for such an x we have fN(x) = f(x) for all N ≥ |x|, and consequently the
sequence of functions {fN}∞_{N=1} converges pointwise to the function f(x) = 1.
This convergence is however not uniform.
Observe that for any x there obviously exist infinitely many N such that
N ≥ |x|. For such N we have from (575) that

yN(x) = (1/k²)(1 − e^{−kN} cosh kx) → 1/k² = y(x),        (576)

when N → ∞.
Thus this convergence, like the one for the source f(x), is also pointwise
but not uniform. This non-uniformity of the limit is why the limiting function,
y(x), does not satisfy the boundary condition (560) even though all the yN(x) do.
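The two limits just described can be observed directly by evaluating formula (575); the sample points below are illustrative:

```python
import numpy as np

# Evaluate the solutions y_N from (575) and observe pointwise convergence
# to y(x) = 1/k^2, while each individual y_N still decays at infinity.
k = 1.0

def y_N(x, N):
    inside = (1 / k**2) * (1 - np.exp(-k * N) * np.cosh(k * x))
    outside = (1 / k**2) * np.exp(-k * np.abs(x)) * np.sinh(k * N)
    return np.where(np.abs(x) <= N, inside, outside)

print([float(y_N(0.5, N)) for N in (1, 5, 10, 20)])  # approaches 1/k^2 = 1
print(float(y_N(100.0, 10)))                         # far outside [-N, N]: tiny
```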
As a second example of an ideal calculation, let us choose f(t) = δ(t). Then
formula (566) gives

y(x) = (1/(2k)) ∫_{−∞}^{∞} dt e^{−k|x−t|} δ(t) = (1/(2k)) e^{−k|x|}.        (577)

This function satisfies the boundary condition and the equation for all x ≠ 0.
It is however not differentiable at x = 0.
Introduce a sequence of functions {gN(x)} by

gN(x) = { N/2,  |x| ≤ 1/N,
          0,    |x| > 1/N.        (578)
Using formula (566) with f(x) = gN(x), it is straightforward to show that
the corresponding solution is

yN(x) = { (N/(2k²))(1 − e^{−k/N} cosh kx),  |x| ≤ 1/N,
          (N/(2k²)) e^{−k|x|} sinh(k/N),    |x| > 1/N.        (579)
For any x ≠ 0 there exist infinitely many N such that |x| ≥ 1/N, and for such
N we have from (579) that

yN(x) = (N/(2k²)) e^{−k|x|} sinh(k/N) → (1/(2k)) e^{−k|x|},

when N approaches infinity. Furthermore, for x = 0 we have from (579) that

yN(0) = (N/(2k²))(1 − e^{−k/N}) → 1/(2k),

when N approaches infinity.
We have thus established that the infinite sequence of solutions (579) of
equation (559) corresponding to the sequence {gN} converges pointwise to the
solution y(x) = (1/(2k)) e^{−k|x|} corresponding to the Dirac delta function f(x) = δ(x).
The sequence (579) does not converge uniformly to the function y(x), and this
explains the fact that all functions from the sequence (579) are differentiable at
x = 0 whereas the limiting function y(x) = (1/(2k)) e^{−k|x|} is not.
In the previous example the sequence {fN(x)} converged pointwise to the
limiting function f(x) = 1 that generated the solution y(x) = 1/k². Is it similarly
true that in the current example the sequence {gN(x)} converges to the Dirac
delta function that generated the solution y(x) = (1/(2k)) e^{−k|x|}?
In order to answer this question we start by observing that for all N

∫_{−∞}^{∞} dx gN(x) = ∫_{−1/N}^{1/N} dx (N/2) = (N/2)(1/N + 1/N) = 1,        (580)
so gN(x) satisfies the same integral identity (527) as the Dirac delta function
(here D = (−∞, ∞)). Furthermore, for any x ≠ 0 there exists an Nx such that

|x| > 1/Nx.        (581)

Therefore for all N ≥ Nx we have gN(x) = 0. Therefore
lim_{N→∞} gN(x) = 0,  x ≠ 0,        (582)

so the limit of the sequence gN when N → ∞ also satisfies the identity (526). So
in fact

lim_{N→∞} gN(x) = δ(x).        (583)

I am playing with formulas here; the limit in (583) does not mean the usual thing.
After all, δ(x) is not a function but something different, namely a distribution.
It would perhaps be more fair to say that the limit of the sequence {gN(x)},
when N → ∞, does not exist, since there can be no function satisfying (526)
and (527).
What you see here is an example of a very subtle and profound mathematical
idea called completion. You have actually used this idea constantly since grade
school, but you may not have been told.
When we teach children arithmetic we start with whole numbers and eventually
move on to fractions, or rational numbers as they are called. The universe of
numbers is the rational numbers at grade school level. At some point the children
are confronted with the equation

x² = 2.        (584)
If the teacher is any good he will by trial and error convince the children that
(584) has no solutions. So what is there to do? He will say that (584) actually
has a solution that we denote by √2, and he will show, perhaps using a calculator,
how to find a sequence of rational numbers that more and more closely
approximate √2. In a similar way the teacher introduces √3, √5 etc., and he
develops rules for calculations with such things.
But the children have been tricked. When the teacher argues that calculated
sequences of rational numbers approximate √2 better and better and in the limit
equal √2, he has been pulling wool over their eyes. What he has done is to
complete the set of rational numbers, in the process creating the real numbers.
There is no sequence of rational numbers converging to √2; in reality √2 is
defined to be a certain set of sequences of rational numbers. The existence
of √2 is postulated, pure and simple, through the completion process. After
the completion process has led to the set of real numbers, pure and applied
mathematics can give them additional intuitive meaning by mapping them onto
other physical and mathematical systems. In this way real numbers have come
to be interpreted as geometrical points on a line etc.
The relation between functions and distributions is exactly the same as the
one between rational and real numbers. For the case of distributions Dirac is the
teacher and the rest of us the children. Next semester we will map distributions
onto a system derived from linear algebra and in this way gain more insight into
the nature of distributions.
The particular "solution" (577) found by choosing f(x) = δ(x) is an example
of what is called a Green's function. Writing

g(x − y) = (1/(2k)) e^{−k|x−y|},        (585)
we have seen that g(x − y) is a solution to the equation

g″(x − y) − k²g(x − y) = −δ(x − y),        (586)

where we are taking the derivative with respect to the variable x; the variable
y is a parameter here. Green's functions are of utmost importance in
applied mathematics and theoretical physics and will be discussed extensively
next semester. As you can see from (586), Green's functions are solutions to
differential equations formulated in the universe of distributions. For this reason
Green's functions are not actually functions but distributions.
Let us now proceed to consider a slightly different ODE,

y″(x) + k²y(x) = −f(x),  −∞ < x < ∞.        (587)

For this equation the boundary conditions can not be that y(x) and y′(x) → 0
as |x| → ∞. The reason is that (587) does not have solutions decaying at
infinity. In fact, for f(x) = 0 the solutions consist of sin kx and cos kx, which
are bounded but do not vanish. We will therefore only require that y and y′
are bounded at infinity. This implies, by an argument identical to the one on
page 11, that the spectrum is real. We apply the Fourier transform like for the
previous example and find

Y(λ) = F(λ)/(λ² − k²).        (588)
We observe that, compared to the previous case, a problem has appeared: Y(λ)
is not defined at λ = ±k. The proper way to handle this is to use complex
function theory, and for more complicated situations complex function theory
becomes a necessity. Here we will proceed in a more heuristic way by observing
that (587) becomes (559) if we let k → ik or k → −ik. Making the same
replacement in the solution formula (566) we get two solutions:

y±(x) = ±(i/(2k)) ∫_{−∞}^{∞} dt e^{±ik|x−t|} f(t).        (589)

Both of y±(x) satisfy (587) and are bounded if f(t) is absolutely integrable
(∫_{−∞}^{∞} dt |f(t)| < ∞). Thus the boundary value problem (587) does not have a
unique solution. In order to decide which solution to choose, consider the wave
equation

−(1/c²) ϕtt + ϕxx = −f(x),  −∞ < x < ∞,        (590)

and let the source, f(x), be supported in some bounded domain G. Outside
this domain G, (590) is the homogeneous wave equation, whose general solution
consists of left- and right-going waves.
We assume that the source is the originator of all waves. Thus, no source,
no waves. Then it is evident that there can be no left-moving waves to the right
of G and no right-moving waves to the left of G. This is a physical condition

Figure 57

satisfied by all known wave phenomena; waves move away from the source that
creates them. This is in fact part of the definition of what a physical source is!
Let us separate variables in (590). The physical boundary condition thus
consists of killing terms containing e^{−ikx} to the right of the support of f and
terms containing e^{ikx} to the left of the support of f. This can be expressed by
saying that

y(x) ∼ e^{ikx} to the right of Supp(f),  y(x) ∼ e^{−ikx} to the left of Supp(f),        (594)

where Supp(f) = {x; f(x) ≠ 0}. Note that Supp(f) by definition includes its
boundary points and is thus a closed set.
We have here assumed that the support of f is bounded. However, it is
enough that f(x) → 0 as |x| → ∞. For this more general case we pose the
conditions

y(x) ∼ e^{ikx},  x → +∞,  y(x) ∼ e^{−ikx},  x → −∞.        (595)
(594), or more generally (595), is called the radiation condition at infinity, or
outgoing waves at infinity, and is the correct boundary condition for equations
describing wavelike phenomena. (595) picks out the unique solution y+(x) to
our boundary value problem (587),(595). Choosing f(x) = δ(x) gives us the
following Green's function for this case:

G(x − ξ) = (i/(2k)) e^{ik|x−ξ|}.        (596)
Let us now use the Fourier transform to solve the initial value problem for
the heat equation on the real line,

ut = c²uxx,  −∞ < x < ∞,  t > 0,        (597)
u(x, 0) = f(x).

In order to apply the Fourier transform we must assume that f has a Fourier
transform and that u, ux go to zero at x = ±∞. Taking the Fourier transform
with respect to x of the heat equation (597) we get

Ut(λ, t) = −c²λ²U(λ, t),        (598)
U(λ, 0) = F(λ),

where

U(λ, t) = (1/√(2π)) ∫_{−∞}^{∞} dx e^{iλx} u(x, t),        (599)

F(λ) = (1/√(2π)) ∫_{−∞}^{∞} dx e^{iλx} f(x).        (600)
The equations (598) are uncoupled for different values of λ. It is the linearity
of the heat equation that produces an uncoupled system of ODEs for the set of
functions U(λ, t) indexed by the spectral parameter λ. Note how similar this is
to what we got when we applied the finite Fourier transform to linear PDEs on
a finite domain.
The initial value problems (598) are easy to solve. We get

U(λ, t) = F(λ)e^{−c²λ²t}.        (601)
Taking the inverse transform we get

u(x, t) = (1/√(2π)) ∫_{−∞}^{∞} dλ e^{−iλx} U(λ, t)        (602)
= (1/√(2π)) ∫_{−∞}^{∞} dλ e^{−iλx} F(λ)e^{−c²λ²t}
= (1/√(2π)) ∫_{−∞}^{∞} dλ (1/√(2π)) ∫_{−∞}^{∞} dx′ e^{−iλx} e^{iλx′} f(x′)e^{−c²λ²t}
= (1/(2π)) ∫_{−∞}^{∞} dx′ ∫_{−∞}^{∞} dλ e^{−iλ(x−x′)−c²λ²t} f(x′) = ∫_{−∞}^{∞} dx′ g(x − x′)f(x′),

where

g(α) = (1/(2π)) ∫_{−∞}^{∞} dλ e^{−iλα−c²λ²t}.        (603)
Note that we could have gotten this expression more directly by using the
convolution theorem. Observe that

g(α) = (1/(2π)) ∫_{−∞}^{∞} dλ e^{−iλα−c²λ²t}        (605)
= (1/(2π)) ∫_{−∞}^{0} dλ e^{−iλα−c²λ²t} + (1/(2π)) ∫_{0}^{∞} dλ e^{−iλα−c²λ²t}
= (1/(2π)) ∫_{0}^{∞} dλ e^{iλα}e^{−c²λ²t} + (1/(2π)) ∫_{0}^{∞} dλ e^{−iλα}e^{−c²λ²t}
= (1/π) ∫_{0}^{∞} dλ cos λα e^{−c²λ²t}.
This integral, because of the exponential factor present for t > 0, converges very
fast, so we can differentiate under the integral sign:

g′(α) = −(1/π) ∫_{0}^{∞} dλ λ sin λα e^{−c²λ²t}        (606)
= (1/(2πc²t)) [sin λα e^{−c²λ²t}]_{λ=0}^{λ=∞} − (α/(2πc²t)) ∫_{0}^{∞} dλ cos λα e^{−c²λ²t},

so

g′(α) = −(α/(2c²t)) g(α).        (607)
This ODE is simple to solve. The solution is

g(α) = g(0) e^{−α²/(4c²t)},        (608)

and

g(0) = (1/π) ∫_{0}^{∞} dλ e^{−c²λ²t} = 1/√(4πc²t).        (609)

(This is a Gaussian integral.) Thus

g(α) = (1/√(4πc²t)) e^{−α²/(4c²t)}.        (610)
The formula for u(x, t) thus becomes

u(x, t) = (1/√(4πc²t)) ∫_{−∞}^{∞} dx′ e^{−(x−x′)²/(4c²t)} f(x′).        (611)
Choosing f(x′) = δ(x′ − x0) in (611) we get

u(x, t; x0) ≡ G(x − x0, t) = (1/√(4πc²t)) e^{−(x−x0)²/(4c²t)}.        (613)
4πc2 t
G is yet another example of a Green's function. In this context it is called the
fundamental solution to the heat equation. The fundamental solution is the
temperature prole generated by a point-like heat source located at x = x0 for
t = 0.
Note that it is an actual classical solution (verify!) for all t > 0 even if the
initial condition is extremely singular. This is yet another example illustrating
the extreme smoothing properties of the heat equation.
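One consequence of (613) that is easy to check numerically is the semigroup property of the fundamental solution: feeding G(·, t1) into the solution formula (611) must return G(·, t1 + t2). A sketch, with illustrative grid and time values:

```python
import numpy as np

# Check the semigroup property of the heat kernel (613): formula (611)
# applied to f(x) = G(x, t1) must return u(x, t2) = G(x, t1 + t2).
c = 1.0
G = lambda x, t: np.exp(-x**2 / (4 * c**2 * t)) / np.sqrt(4 * np.pi * c**2 * t)

x = np.linspace(-30, 30, 2001)
dx = x[1] - x[0]
t1, t2 = 0.5, 1.5

u = (G(x[:, None] - x[None, :], t2) @ G(x, t1)) * dx
err = np.max(np.abs(u - G(x, t1 + t2)))
print(err)  # close to zero
```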
From the solution formula (611) it is evident that even if the initial heat
profile is zero outside a finite interval [−L, L], the solution is nonzero at any
point x ∈ R for any t > 0, however small t is. The heat equation clearly does
not respect causality.
Figure 58
As t → 0⁺ the kernel G(x − x′, t) becomes sharply peaked around x′ = x, and

∫_{−∞}^{∞} dx′ G(x − x′, t)g(x′) → g(x),  t → 0⁺.        (614)

Thus, for small t,

∫_{−∞}^{∞} dx′ G(x − x′, t)g(x′) ≈ g(x) ∫_{−∞}^{∞} dx′ G(x − x′, t) = g(x),        (616)

because

∫_{−∞}^{∞} dx′ G(x − x′, t) = 1.        (617)

Verify this!
Let us next consider the initial value problem for the wave equation on the
real line,

utt = c²uxx,  −∞ < x < ∞,
u(x, 0) = f(x),
ut(x, 0) = g(x).

Taking the Fourier transform of this problem, proceeding like for the heat
equation, we get

∂tt u(λ, t) = −c²λ²u(λ, t),        (619)
u(λ, 0) = F(λ),
∂t u(λ, 0) = G(λ).

The initial value problem (619) is entirely standard and the solution is

u(λ, t) = [ (1/2)F(λ) + (1/(2iλc))G(λ) ] e^{iλct} + [ (1/2)F(λ) − (1/(2iλc))G(λ) ] e^{−iλct}.        (620)
Taking the inverse transform we get

u(x, t) = (1/(2√(2π))) ∫_{−∞}^{∞} dλ e^{−iλ(x−ct)} F(λ) + (1/(2√(2π))) ∫_{−∞}^{∞} dλ e^{−iλ(x+ct)} F(λ)        (621)
+ (1/(2c√(2π))) ∫_{−∞}^{∞} dλ e^{−iλ(x−ct)} G(λ)/(iλ) − (1/(2c√(2π))) ∫_{−∞}^{∞} dλ e^{−iλ(x+ct)} G(λ)/(iλ).

Recall that

g(x) = (1/√(2π)) ∫_{−∞}^{∞} dλ e^{−iλx} G(λ).        (622)
But then

∫_{0}^{x} ds g(s) = −(1/√(2π)) ∫_{−∞}^{∞} dλ e^{−iλx} G(λ)/(iλ) + D,        (623)

where D is a constant. Given (623), (621) can be written as

u(x, t) = (1/2)(f(x − ct) + f(x + ct)) + (1/(2c)) ∫_{x−ct}^{x+ct} ds g(s).        (624)
This is the d'Alembert formula, now derived using the Fourier transform.
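The d'Alembert formula (624) can be checked numerically; below, with g = 0 and an illustrative Gaussian initial profile, the residual of the wave equation is estimated with central differences:

```python
import numpy as np

# With f(x) = exp(-x^2) and g(x) = 0, formula (624) gives
# u(x, t) = (f(x - ct) + f(x + ct)) / 2, which must satisfy u_tt = c^2 u_xx.
c = 2.0
f = lambda x: np.exp(-x**2)

def u(x, t):
    return 0.5 * (f(x - c * t) + f(x + c * t))

x = np.linspace(-10, 10, 2001)
dx = x[1] - x[0]
t, ht = 1.0, 1e-3

u_tt = (u(x, t + ht) - 2 * u(x, t) + u(x, t - ht)) / ht**2
u_xx = (u(x + dx, t) - 2 * u(x, t) + u(x - dx, t)) / dx**2
print(np.max(np.abs(u_tt - c**2 * u_xx)))  # small
```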
There is, however, something rather fishy about what we did. Let us say that
we did not realize that the identity (623) holds. Then, for given F(λ), G(λ),
we would have to find u(x, t) by solving the integrals in (621) analytically or
numerically. We should however worry about the last two integrals. They are
of the general form

I(α) = ∫_{−∞}^{∞} dλ e^{−iλα} H(λ)/(iλ),  α ∈ R,        (625)

where in general H(0) ≠ 0. The integral is singular at λ = 0 and the question
how the integral should be understood must be raised. Away from λ = 0 there
are no problems, so the question we need to ask is if we can make sense of
integrals of the type

∫_{−L}^{L} dλ (1/λ),        (626)
where L is arbitrary but fixed. In calculus we have seen that at least some
integrals with a singular integrand in the domain of integration can be made
sense of by considering them to be improper integrals. Let us try to use this
approach on (626):
∫_{−L}^{L} dλ (1/λ) = ∫_{−L}^{0} dλ (1/λ) + ∫_{0}^{L} dλ (1/λ)        (627)
= lim_{ε→0} ∫_{−L}^{−ε} dλ (1/λ) + lim_{δ→0} ∫_{δ}^{L} dλ (1/λ)
= lim_{ε→0} ∫_{L}^{ε} dµ (1/µ) + lim_{δ→0} ∫_{δ}^{L} dλ (1/λ)
= −lim_{ε→0} ∫_{ε}^{L} dµ (1/µ) + lim_{δ→0} ∫_{δ}^{L} dλ (1/λ)
= −lim_{ε→0} (ln L − ln ε) + lim_{δ→0} (ln L − ln δ)
= lim_{ε→0} ln ε − lim_{δ→0} ln δ.
This limit does not exist. Thus (626) can not be given meaning as an improper
integral. Observe however that if we in (627) let ε = δ and take the difference
before we take the limit, we get zero, which is a finite answer. Thus what we are
proposing is to understand (626) in the following way:

∫_{−L}^{L} dλ (1/λ) = lim_{ε→0} { ∫_{−L}^{−ε} dλ (1/λ) + ∫_{ε}^{L} dλ (1/λ) }        (628)
= lim_{ε→0} { ∫_{L}^{ε} dµ (1/µ) + ∫_{ε}^{L} dλ (1/λ) }
= lim_{ε→0} { ln ε − ln L + ln L − ln ε } = 0.
When we use the definition (628) we understand the integral as a Cauchy
principal value integral. This is now a new way for you to calculate integrals with a
singular integrand. In general, for a function f(x) with a singularity at x = x₀,
the definition is

PV ∫_{−L}^{L} dx f(x) = lim_{ε→0} { ∫_{−L}^{x₀−ε} dx f(x) + ∫_{x₀+ε}^{L} dx f(x) }.        (629)

Not every singular integrand can be handled in this way. For example,

PV ∫_{−1}^{1} dx (1/x²) = lim_{ε→0} { ∫_{−1}^{−ε} dx (1/x²) + ∫_{ε}^{1} dx (1/x²) }        (630)
= lim_{ε→0} { [−1/x]_{−1}^{−ε} + [−1/x]_{ε}^{1} }
= lim_{ε→0} { 1/ε − 1 − 1 + 1/ε } = lim_{ε→0} (2/ε − 2) = ∞!
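The symmetric cancellation that defines the principal value can be displayed numerically; the values of ε below are illustrative:

```python
import numpy as np

# The Cauchy principal value (628): for every ε > 0 the symmetric sum
# ∫_{-L}^{-ε} dλ/λ + ∫_{ε}^{L} dλ/λ vanishes, while each piece diverges
# separately as ε → 0.
L = 1.0
for eps in (1e-1, 1e-3, 1e-6):
    left = np.log(eps) - np.log(L)    # ∫_{-L}^{-ε} dλ/λ = ln ε − ln L
    right = np.log(L) - np.log(eps)   # ∫_{ε}^{L} dλ/λ  = ln L − ln ε
    print(eps, left + right)          # 0 for each ε
```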
If we interpret the two integrals containing 1/λ in (621) as Cauchy principal
value integrals, we can use the formula to calculate the solution to the wave
equation. However, there is still a weak fishy smell here; why is interpreting the
singular integrals in (621) as Cauchy principal value integrals the right thing
to do? How do we know that this will give the same solution as, say, the
d'Alembert formula? Could one not imagine some other funny way of getting a
finite number out of integrals like (626)? And would this not perhaps produce
a different solution; which should not be possible because the Cauchy problem
for the wave equation has a unique solution? What is going on?
Answering this question, which is an important one, would take us too far
from the main focus of the course, so I will refrain from giving an answer. Here
I will only assure you that this is the right way to understand the integral (626),
and it does give the same solution as the d'Alembert formula. It is also worth
noting that questions like the one we have been considering frequently turn up
when we use transforms to solve PDEs, and resolving them can often be a deep
and vexing problem.
Let us next consider the Laplace equation in the upper half-plane,

uxx + uyy = 0,  −∞ < x < ∞,  y > 0,        (631)

with the boundary conditions

u(x, 0) = f(x),
u(x, y) is bounded as y → ∞.

Using the Fourier transform in x on (631) gives us the following infinite system
of uncoupled ODEs:

uyy(λ, y) − λ²u(λ, y) = 0,
u(λ, 0) = F(λ),
u(λ, y) is bounded as y → ∞.
The bounded solution is u(λ, y) = F(λ)e^{−|λ|y}. Taking the inverse transform
we get

u(x, y) = (1/√(2π)) ∫_{−∞}^{∞} dλ e^{−iλx} F(λ)e^{−|λ|y}        (635)
= (1/(2π)) ∫_{−∞}^{∞} dλ e^{−iλx−|λ|y} ∫_{−∞}^{∞} dx′ e^{iλx′} f(x′)
= (1/(2π)) ∫_{−∞}^{∞} dx′ f(x′) ∫_{−∞}^{∞} dλ e^{−iλ(x−x′)−|λ|y}.
The inner integral is elementary:

∫_{−∞}^{∞} dλ e^{−iλ(x−x′)−|λ|y} = 2 ∫_{0}^{∞} dλ cos λ(x − x′) e^{−λy} = 2y/((x − x′)² + y²),        (636)
and therefore we have the following expression for the solution of the Laplace
equation in a half-plane:

u(x, y) = (y/π) ∫_{−∞}^{∞} dx′ f(x′)/((x − x′)² + y²).        (637)

By direct substitution it can be verified that (637) is a solution to the Laplace
equation for y > 0.
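Formula (637) is easy to test numerically. For the illustrative boundary function f(x) = 1/(1 + x²) the harmonic extension is known in closed form, u(x, y) = (1 + y)/(x² + (1 + y)²), which makes a convenient comparison:

```python
import numpy as np

# Evaluate the half-plane Poisson formula (637) by a Riemann sum for
# f(x) = 1/(1 + x^2) and compare with the closed-form harmonic extension.
f = lambda s: 1.0 / (1.0 + s**2)

s = np.linspace(-200, 200, 400001)
ds = s[1] - s[0]

def u(x, y):
    return (y / np.pi) * np.sum(f(s) / ((x - s)**2 + y**2)) * ds

x0, y0 = 0.7, 1.5
exact = (1 + y0) / (x0**2 + (1 + y0)**2)
print(abs(u(x0, y0) - exact))  # small
```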
When y → 0 the factor y/π goes to zero, but the integrand approaches

f(x′)/(x − x′)²,        (638)

which is singular at x′ = x. A careful study of the limit shows that the limit of
(637) when y → 0 is actually f(x). We will not do this here.
For simple choices of f(x) we can solve the integral (637). For example, if
f(x) = H(x), where H(x) is the Heaviside step function

H(x) = { 1,  x > 0,
         0,  x < 0,        (639)
we find

u(x, y) = (y/π) ∫_{−∞}^{∞} dx′ H(x′)/((x − x′)² + y²)        (640)
= (y/π) ∫_{0}^{∞} dx′ 1/((x − x′)² + y²)
= (1/(πy)) ∫_{0}^{∞} dx′ 1/(1 + ((x′ − x)/y)²).
Substituting u = (x′ − x)/y, du = (1/y) dx′ in this integral gives us the
following explicit expression for u(x, y):

u(x, y) = (1/π) ∫_{−x/y}^{∞} du 1/(1 + u²) = (1/π) [tg⁻¹(u)]_{−x/y}^{∞}        (641)
= 1/2 + (1/π) tg⁻¹(x/y).
This function satisfies the Laplace equation and

lim_{y→0⁺} u(x, y) = 1/2 + (1/π) lim_{y→0⁺} tg⁻¹(x/y) = { 1/2 + (1/π)(π/2) = 1,  x > 0,
                                                          1/2 − (1/π)(π/2) = 0,  x < 0,        (642)
so u(x, y) satisfies the boundary condition at y = 0. Also

lim_{y→∞} u(x, y) = 1/2,        (643)

so u(x, y) is bounded at y = +∞.
However, u(x, y) does not go to zero as |x| → ∞, and so does not actually
have a Fourier transform, which we assumed. The reason for this is that our
boundary function f(x) = H(x) does not have a Fourier transform in the
conventional sense. Observe that u(x, y) is infinitely smooth for y > 0 even if it
is discontinuous in x at y = 0. We are rediscovering the fact that the Laplace
equation can not support solutions with singularities. As we have seen, this is
a general feature of elliptic equations.
Let us say that we were rather to solve the Laplace equation with the
Neumann condition

uy(x, 0) = g(x).        (644)

We can reduce this to a problem of the previous type using a trick, invented by
an English mathematician, called Stokes' rule. Introduce v(x, y) = uy(x, y).
Differentiating the Laplace equation with respect to y shows that v(x, y) is also
a solution of the Laplace equation, now with the Dirichlet condition

v(x, 0) = g(x),        (645)

so v is given by formula (637) with f replaced by g. Writing

u(x, y) = ∫_{0}^{y} ds v(x, s),        (646)
we get

u(x, y) = (1/π) ∫_{0}^{y} ds s ∫_{−∞}^{∞} dx′ g(x′)/((x − x′)² + s²)
        = (1/π) ∫_{−∞}^{∞} dx′ g(x′) ∫_{0}^{y} ds s/((x − x′)² + s²)
        = (1/(2π)) ∫_{−∞}^{∞} dx′ ln((x − x′)² + y²) g(x′).        (647)
2π −∞
Observe that u(x, y) dened by (647) is not the only solution to (644).
For any solution v(x, y) to (645) we can add an arbitrary function C(x) to
the right hand side of (646) and in this way dene a whole innite familty of
functions u(x, y) Z y
u(x, y) = ds v(x, s) + C(x). (648)
0
All such functions satisfy the boundary condition uy (x, 0) = g(x), but in order
to satisfy the Laplace equation (644) the added function C(x) must be linear.
Furthermore, in order for the solution to be bounded in x this linear function
must be a constant.
For λ = 0 the general solution is

M(x) = Ax + B,        (652)

and for λ ≠ 0 the boundary condition gives

Mλ(x) = sin λx.

This function is clearly bounded for all real λ, and therefore all such λ are in the
spectrum. Observe however that M−λ(x) = −Mλ(x), so λ and −λ give the same
eigenfunction, and we may restrict to the spectrum

0 ≤ λ < ∞,        (655)

and eigenfunctions

Mλ(x) = sin λx.        (656)

For (649) with the boundary conditions (651) we find in a similar way the
spectrum

0 ≤ λ < ∞,        (657)

and eigenfunctions

Mλ(x) = cos λx.        (658)
To make the expansion in these eigenfunctions precise, we first restrict the
problem to a bounded domain,

0 < x < l,        (659)

and then let l approach infinity. On the bounded domain (659), the boundary
condition at x = +∞ is replaced by the condition M(l) = 0. This gives us the
following normalized eigenfunctions:

Mk(x) = √(2/l) sin λk x,  k = 1, 2, . . . ,        (660)

where

λk = kπ/l.        (661)
A function f̃(x) can now be expanded in a Fourier series with respect to (660)
according to

f̃(x) = √(2/l) Σ_{k=1}^{∞} Fk sin λk x,        (662)

Fk = √(2/l) ∫_{0}^{l} dx sin λk x f̃(x).        (663)
Observe that the distance between the points λk on the λ-axis is

∆λ = λk+1 − λk = π/l.        (664)

We use this and rewrite (662), (663) in the following way:

f̃(x) = (√2/π) Σ_{k=1}^{∞} ∆λ sin λk x F̃k,        (665)

F̃k = √2 ∫_{0}^{l} dx sin λk x f̃(x),        (666)
where F̃k = √l Fk. If we formally let l → ∞ in (665) and (666) we get

f̃(x) = (√2/π) ∫_{0}^{∞} dλ sin λx F̃(λ),        (667)

F̃(λ) = √2 ∫_{0}^{∞} dx sin λx f̃(x).        (668)
In order to get a more symmetric and pleasing form of the transform (667),
(668) we introduce the scalings

f̃(x) = βf(x),  F̃(λ) = αF(λ),        (669)

⇒ f(x) = (√2 α/(πβ)) ∫_{0}^{∞} dλ sin λx F(λ),        (670)

F(λ) = (√2 β/α) ∫_{0}^{∞} dx sin λx f(x).        (671)

Choosing α = √π, β = 1, we get the Fourier sine transform

f(x) = √(2/π) ∫_{0}^{∞} dλ sin λx Fs(λ),        (672)

Fs(λ) = √(2/π) ∫_{0}^{∞} dx sin λx f(x).        (673)
Our friends in analysis can tell us for which pairs of nice functions the improper integrals converge and for which (672) and (673) hold. However, we will apply the transform to much more general objects and will proceed formally, hoping that in the end everything will be ok in the universe of distributions. Here it is only worth noting that even though sin λx is zero for x = 0, f(x) in (672) might not approach zero as x approaches zero. This is because in general the convergence of the improper integral will not allow us to interchange integral and limit.
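When the transform pair is available in closed form, (672)-(673) can be checked numerically. A small sketch, assuming f(x) = e^{−x}, whose sine transform is known to be F_s(λ) = √(2/π) λ/(1+λ²); the truncation limits and grid sizes below are ad hoc choices:

```python
import math

def trapz(g, a, b, n):
    # composite trapezoidal rule for g on [a, b]
    h = (b - a) / n
    return h * (0.5 * (g(a) + g(b)) + sum(g(a + i * h) for i in range(1, n)))

C = math.sqrt(2.0 / math.pi)
f = lambda x: math.exp(-x)
Fs_exact = lambda lam: C * lam / (1.0 + lam * lam)   # known sine transform of e^{-x}

# forward transform (673), integral truncated at x = 40
Fs_num = lambda lam: C * trapz(lambda x: math.sin(lam * x) * f(x), 0.0, 40.0, 8000)
# inverse transform (672), integral truncated at lambda = 500
f_rec = lambda x: C * trapz(lambda lam: math.sin(lam * x) * Fs_exact(lam), 0.0, 500.0, 50000)

assert abs(Fs_num(1.5) - Fs_exact(1.5)) < 1e-4   # (673) reproduces the closed form
assert abs(f_rec(2.0) - f(2.0)) < 5e-3           # (672) recovers f away from x = 0
```

Note that f_rec(0) returns exactly 0 while f(0) = 1, which illustrates the remark above about the behavior of (672) at x = 0.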
A similar derivation for the boundary condition (651) gives us the Fourier cosine transform

f(x) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} d\lambda\, \cos \lambda x\, F_c(\lambda), (674)

F_c(\lambda) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, \cos \lambda x\, f(x). (675)
The Fourier sine- and cosine transforms have several interesting and useful properties [1]. Here we will discuss how they behave with respect to derivatives. Observe that

\int_0^{\infty} dx\, \sin \lambda x\, f'(x) = \Big[\sin \lambda x\, f(x)\Big]_0^{\infty} - \lambda \int_0^{\infty} dx\, \cos \lambda x\, f(x). (676)

In the same way we find

\sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, \cos \lambda x\, f'(x) = -\sqrt{\frac{2}{\pi}}\, f(0) + \lambda F_s(\lambda), (679)

\sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, \cos \lambda x\, f''(x) = -\sqrt{\frac{2}{\pi}}\, f'(0) - \lambda^2 F_c(\lambda). (680)
We observe that in order to apply the Fourier sine- and cosine transforms to solve differential equations on a semi-infinite domain the boundary conditions must be of a particular type. For second order equations the Fourier sine transform requires a Dirichlet condition and the cosine transform requires a Neumann condition. Since one typically solves time dependent problems on a semi-infinite domain t > 0, one might think that the Fourier Sine- and Cosine transforms should be useful for this, but they are in fact not.
If the equation contains a first derivative with respect to time we will find that the Fourier Sine transform of ut is expressed in terms of the Fourier Cosine transform of u and vice versa. If the equation contains two time derivatives, utt, there are two data given at t = 0, u and ut. But both the Fourier Sine and the Fourier Cosine transform can only include one condition at t = 0, u for the Sine transform and ut for the Cosine transform.
We will see that for initial value problems, the Laplace transform is the
appropriate one.
Another useful property of the Fourier Sine- and Cosine transforms is

\sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, x \sin \lambda x\, f(x) = -\frac{\partial F_c}{\partial \lambda}, (681)

\sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, x \cos \lambda x\, f(x) = \frac{\partial F_s}{\partial \lambda}. (682)
Consider now the heat equation on the half line, u_t = c² u_xx for x > 0, t > 0, with boundary condition u(0, t) = g(t) and initial condition u(x, 0) = f(x). Taking the Fourier sine transform of the equation we get

\partial_t U_s(\lambda, t) + (\lambda c)^2 U_s(\lambda, t) = \sqrt{\frac{2}{\pi}}\, \lambda c^2 g(t), (686)

with initial condition

U_s(\lambda, 0) = F_s(\lambda), (687)

where

U_s(\lambda, t) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, \sin \lambda x\, u(x, t), (688)

F_s(\lambda) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} dx\, \sin \lambda x\, f(x). (689)
The solution to this first order linear ODE is

U_s(\lambda, t) = F_s(\lambda)\, e^{-\lambda^2 c^2 t} + \sqrt{\frac{2}{\pi}}\, \lambda c^2 \int_0^t d\tau\, e^{-\lambda^2 c^2 (t - \tau)} g(\tau). (690)
Taking the inverse Fourier Sine transform (610) we get the solution in the form

u(x,t) = \sqrt{\frac{2}{\pi}} \int_0^{\infty} d\lambda\, \sin \lambda x\, F_s(\lambda)\, e^{-\lambda^2 c^2 t} (691)
\qquad + \frac{2c^2}{\pi} \int_0^{\infty} d\lambda\, \sin \lambda x\, \lambda \int_0^t d\tau\, e^{-\lambda^2 c^2 (t-\tau)} g(\tau)

= \frac{2}{\pi} \int_0^{\infty} d\lambda\, \sin \lambda x \int_0^{\infty} ds\, \sin \lambda s\, f(s)\, e^{-\lambda^2 c^2 t}
\qquad + \frac{2c^2}{\pi} \int_0^{\infty} d\lambda\, \sin \lambda x\, \lambda \int_0^t d\tau\, e^{-\lambda^2 c^2 (t-\tau)} g(\tau)

= \frac{2}{\pi} \int_0^{\infty} ds \int_0^{\infty} d\lambda\, \sin \lambda x\, \sin \lambda s\, e^{-\lambda^2 c^2 t} f(s)
\qquad + \frac{2c^2}{\pi} \int_0^t d\tau \int_0^{\infty} d\lambda\, \lambda \sin \lambda x\, e^{-\lambda^2 c^2 (t-\tau)} g(\tau).
Using the addition formulas for cosine we have

\sin \lambda x \sin \lambda s = \frac{1}{2}\left\{\cos \lambda(x-s) - \cos \lambda(x+s)\right\}. (692)

But then

\frac{2}{\pi} \int_0^{\infty} d\lambda\, \sin \lambda x\, \sin \lambda s\, e^{-\lambda^2 c^2 t} (693)
= \frac{1}{\pi} \int_0^{\infty} d\lambda\, \{\cos \lambda(x-s) - \cos \lambda(x+s)\}\, e^{-\lambda^2 c^2 t}
= \frac{1}{\pi} \int_0^{\infty} d\lambda\, \cos \lambda(x-s)\, e^{-\lambda^2 c^2 t} - \frac{1}{\pi} \int_0^{\infty} d\lambda\, \cos \lambda(x+s)\, e^{-\lambda^2 c^2 t}
= \frac{1}{\sqrt{4\pi c^2 t}}\, e^{-\frac{(x-s)^2}{4c^2 t}} - \frac{1}{\sqrt{4\pi c^2 t}}\, e^{-\frac{(x+s)^2}{4c^2 t}}
= G(x-s, t) - G(x+s, t),
where we have used quantities defined in (605) and (610). We also have

\frac{1}{\pi} \int_0^{\infty} d\lambda\, \lambda \sin \lambda x\, e^{-\lambda^2 c^2 (t-\tau)} (694)
= -\frac{1}{\pi}\, \partial_x \int_0^{\infty} d\lambda\, \cos \lambda x\, e^{-\lambda^2 c^2 (t-\tau)}
= -\partial_x G(x, t-\tau),
so the solution can compactly be written as

u(x,t) = \int_0^{\infty} ds\, \{G(x-s,t) - G(x+s,t)\}\, f(s) (695)
\qquad - 2c^2 \int_0^t d\tau\, \partial_x G(x, t-\tau)\, g(\tau). (696)
As a first example, take g(t) = 0 and constant initial data f(x) = u_0. Then

u(x,t) = \int_0^{\infty} ds\, \{G(x-s,t) - G(x+s,t)\}\, u_0 (697)

= \frac{u_0}{\sqrt{4\pi c^2 t}} \left\{ \int_0^{\infty} ds\, e^{-\frac{(x-s)^2}{4c^2 t}} - \int_0^{\infty} ds\, e^{-\frac{(x+s)^2}{4c^2 t}} \right\},

and substituting r = \frac{s-x}{2c\sqrt{t}} in the first integral and r = \frac{s+x}{2c\sqrt{t}} in the second,

= \frac{u_0}{\sqrt{\pi}} \left\{ \int_{-\frac{x}{2c\sqrt{t}}}^{\infty} dr\, e^{-r^2} - \int_{\frac{x}{2c\sqrt{t}}}^{\infty} dr\, e^{-r^2} \right\}

= \frac{2u_0}{\sqrt{\pi}} \int_0^{\frac{x}{2c\sqrt{t}}} dr\, e^{-r^2} = u_0\, \mathrm{erf}\!\left(\frac{x}{2c\sqrt{t}}\right),

where erf(ξ) is the error function

\mathrm{erf}(\xi) = \frac{2}{\sqrt{\pi}} \int_0^{\xi} dr\, e^{-r^2}. (698)
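The closed form u(x,t) = u_0 erf(x/(2c√t)) can be cross-checked against a direct numerical evaluation of (697); a minimal sketch, where the truncation and grid parameters are arbitrary choices:

```python
import math

def G(y, t, c):
    # fundamental solution of the heat equation
    return math.exp(-y * y / (4.0 * c * c * t)) / math.sqrt(4.0 * math.pi * c * c * t)

def u_quad(x, t, c, u0, smax=50.0, n=20000):
    # trapezoidal evaluation of (697) with constant data f(s) = u0
    h = smax / n
    g = lambda s: (G(x - s, t, c) - G(x + s, t, c)) * u0
    return h * (0.5 * (g(0.0) + g(smax)) + sum(g(i * h) for i in range(1, n)))

c, u0, x, t = 1.0, 2.0, 0.7, 0.3
exact = u0 * math.erf(x / (2.0 * c * math.sqrt(t)))
assert abs(u_quad(x, t, c, u0) - exact) < 1e-5
```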
We observe that

u(x,t) \to 0 \quad \text{as} \quad t \to \infty. (702)

Thus u(x, t) satisfies all requirements of our problem even if u(x, 0) = u_0 does not have a Fourier sine transform. One can show that the solution (697) is the limit of a sequence of problems that all have initial data for which the transform exists.
For the second case we don't assume that f(x) is constant, but merely that it is bounded, |f(x)| ≤ M. Then

\left| \int_0^{\infty} ds\, \{G(x-s,t) - G(x+s,t)\}\, f(s) \right| \leq M \int_0^{\infty} ds\, \{G(x-s,t) - G(x+s,t)\} = M\, \mathrm{erf}\!\left(\frac{x}{2c\sqrt{t}}\right) \to 0.
Therefore as t → ∞ the solution u(x, t) approaches

u(x,t) \approx -2c^2 \int_0^t d\tau\, \partial_x G(x, t-\tau)\, g(\tau). (704)

This solution is called a steady state. It is not time invariant but represents the solution for times so large that the initial data has been dissipated. For such large times the solution only depends on the boundary condition, or drive, at the boundary. Recall that an analogous behavior is displayed by a driven, damped harmonic oscillator.
Let now α be a constant such that α > k. Then the Fourier transform of e^{−αt} f(t) will exist and is given by

\hat{F}(\hat{\lambda}) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} dt\, e^{i\hat{\lambda}t} e^{-\alpha t} f(t), (706)

where the fact that f(t) = 0 for t ≤ 0 has been taken into account. For example if f(t) = 1 we get

\hat{F}(\hat{\lambda}) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} dt\, e^{(i\hat{\lambda} - \alpha)t} = \frac{-1}{\sqrt{2\pi}\,(i\hat{\lambda} - \alpha)}. (707)
In the Fourier transform we know that λ̂ is a real number related to the spectrum of a certain differential operator. Observe, however, that the right hand side of (707) can be evaluated for complex λ̂ also. For example

\hat{F}(i) = \frac{-1}{\sqrt{2\pi}\,(i^2 - \alpha)} = \frac{1}{\sqrt{2\pi}\,(1 + \alpha)}. (708)

Also observe that (707) can not be evaluated for all complex numbers; in fact for λ̂ = −iα the denominator in (707) is zero. However, this is the only problematic point and F̂(λ̂) can be extended to the whole complex plane otherwise:

\hat{F}(\hat{\lambda}) = \frac{-1}{\sqrt{2\pi}\,(i\hat{\lambda} - \alpha)}, \quad \hat{\lambda} \in \mathbb{C} - \{-i\alpha\}. (709)
In a similar way we will now prove that F̂(λ̂) can be extended to complex values for any f(t) of exponential order. Evaluating (706) at λ̂ = λ̂_r + iλ̂_i ∈ ℂ gives us

\hat{F}(\hat{\lambda}) = \frac{1}{\sqrt{2\pi}} \int_0^{\infty} dt\, e^{(i\hat{\lambda} - \alpha)t} f(t). (710)

For this to make sense the integral has to converge. But

|\hat{F}(\hat{\lambda})| \leq \frac{1}{\sqrt{2\pi}} \int_0^{\infty} dt\, e^{-(\hat{\lambda}_i + \alpha)t} |f(t)| (711)
\leq \frac{1}{\sqrt{2\pi}} \int_0^c dt\, e^{-(\hat{\lambda}_i + \alpha)t} |f(t)| + \frac{1}{\sqrt{2\pi}} \int_c^{\infty} dt\, M e^{(k - \hat{\lambda}_i - \alpha)t},
and the last integral is finite only if

\hat{\lambda}_i > k - \alpha. (712)

So F̂(λ̂) can not be defined in the whole complex plane but only in the domain defined by (712).
Below the horizontal line through λ̂ = −i(α − k) in the complex plane, we expect the function F̂(λ̂) to have singularities. For the special choice

f(t) = \begin{cases} 1, & t \geq 0, \\ 0, & t < 0, \end{cases} (713)

this was exactly what occurred.
Change variables in the complex plane to λ through

\hat{\lambda} = i(\lambda - \alpha), (714)

Figure 59

and define

F(\lambda) = \sqrt{2\pi}\, \hat{F}(i(\lambda - \alpha)). (715)

Then

F(\lambda) = \int_0^{\infty} dt\, e^{-\lambda t} f(t), (716)

defined in the domain

\lambda_r > k. (717)

F(λ) is the Laplace transform of f(t). Using the Fourier inversion formula we have

e^{-\alpha t} f(t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\hat{\lambda}\, e^{-i\hat{\lambda}t}\, \hat{F}(\hat{\lambda}). (718)
Figure 60
\int_0^{\infty} dt\, e^{-\lambda t} f^{(n)}(t) = \lambda^n \int_0^{\infty} dt\, e^{-\lambda t} f(t) - \lambda^{n-1} f(0) - \ldots - f^{(n-1)}(0). (721)
This formula is easy to prove using integration by parts. From (721) it is clear that the n initial conditions for an nth order differential equation can be included in a natural way. The Laplace transform is therefore very useful for solving initial value problems for both ODEs and PDEs. Here we will apply it to initial boundary value problems for PDEs.
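The derivative rule (721) is easy to spot-check by quadrature; a sketch for n = 2 with f(t) = cos(at), where both sides of (721) can be computed directly (the truncation parameters are ad hoc choices):

```python
import math

def laplace(g, lam, tmax=60.0, n=60000):
    # trapezoidal approximation of the Laplace integral, truncated at tmax
    h = tmax / n
    w = lambda t: math.exp(-lam * t) * g(t)
    return h * (0.5 * (w(0.0) + w(tmax)) + sum(w(i * h) for i in range(1, n)))

a, lam = 2.0, 1.5
f = lambda t: math.cos(a * t)
fpp = lambda t: -a * a * math.cos(a * t)        # f''(t)

F = laplace(f, lam)
lhs = laplace(fpp, lam)
rhs = lam * lam * F - lam * f(0.0) - 0.0        # (721) with n = 2; f'(0) = 0
assert abs(F - lam / (lam * lam + a * a)) < 1e-6   # known transform of cos(at)
assert abs(lhs - rhs) < 1e-4
```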
We will now apply the Laplace transform to two initial value problems for
PDEs.
The first is a wave equation on the half line,

u_{tt} - c^2 u_{xx} = 0, \quad x > 0, \; t > 0, (722)

u(x, 0) = u_t(x, 0) = 0,
u(0, t) = f(t),
u(x, t) \to 0, \quad x \to \infty.

Taking the Laplace transform in t we get

\lambda^2 U(x, \lambda) - \lambda u(x, 0) - u_t(x, 0) - c^2 U_{xx}(x, \lambda) = 0, (723)

U(0, \lambda) = F(\lambda).
The general solution of the resulting ODE is

U(x, \lambda) = a(\lambda)\, e^{-\frac{\lambda}{c}x} + b(\lambda)\, e^{\frac{\lambda}{c}x}. (725)

Recall that we evaluate (725) in the right half plane only, so λ_r > 0. In order to satisfy the boundary condition at infinity we must have

b(\lambda) = 0, (727)

and thus

U(x, \lambda) = F(\lambda)\, e^{-\frac{\lambda}{c}x}. (728)
Let H(ξ) be the Heaviside function,

H(\xi) = \begin{cases} 1, & \xi \geq 0, \\ 0, & \xi < 0. \end{cases} (729)

Then

\int_0^{\infty} dt\, H(t - b)\, f(t - b)\, e^{-\lambda t} (730)
= \int_b^{\infty} dt\, f(t - b)\, e^{-\lambda t} \qquad (\xi = t - b,\; d\xi = dt)
= \int_0^{\infty} d\xi\, f(\xi)\, e^{-\lambda(\xi + b)} = e^{-\lambda b} F(\lambda).

This is equal to (728) with b = x/c. Thus the inverse Laplace transform of (728) must be

u(x, t) = H\!\left(t - \frac{x}{c}\right) f\!\left(t - \frac{x}{c}\right), (731)

and we have solved the initial value problem.
11.3.2 The heat equation on a finite interval

Consider the problem

u_t = c^2 u_{xx}, \quad 0 < x < l, \; t > 0, (732)

u(0, t) = u(l, t) = 0,
u(x, 0) = f(x).

Taking the Laplace transform in t we get

c^2 U_{xx}(x, \lambda) - \lambda U(x, \lambda) = -f(x), (734)

U(0, \lambda) = U(l, \lambda) = 0.

This is now a boundary value problem. Equation (734) can be solved using variation of parameters from the theory of ODEs. Fitting the boundary conditions to the variation of parameters solution gives

U(x, \lambda) = \frac{1}{c\sqrt{\lambda}\, \sinh\!\left(\frac{l\sqrt{\lambda}}{c}\right)} \int_0^x ds\, \sinh\!\left(\frac{\sqrt{\lambda}}{c}\,s\right) \sinh\!\left(\frac{\sqrt{\lambda}}{c}\,(l - x)\right) f(s) (735)
\qquad + \frac{1}{c\sqrt{\lambda}\, \sinh\!\left(\frac{l\sqrt{\lambda}}{c}\right)} \int_x^l ds\, \sinh\!\left(\frac{\sqrt{\lambda}}{c}\,x\right) \sinh\!\left(\frac{\sqrt{\lambda}}{c}\,(l - s)\right) f(s).
Finding the inverse transform of (735) is not easy, but we have a secret weapon. In the inversion formula for the Laplace transform (719) we can place the integration curve L anywhere as long as λ_r > k. Thus we may assume in (735) that |λ| is as large as we want.
For such λ we have

\frac{1}{\sinh\!\left(\frac{l\sqrt{\lambda}}{c}\right)} = \frac{2}{e^{\frac{l\sqrt{\lambda}}{c}} - e^{-\frac{l\sqrt{\lambda}}{c}}} = \frac{2\, e^{-\frac{l\sqrt{\lambda}}{c}}}{1 - e^{-\frac{2l\sqrt{\lambda}}{c}}} = 2\, e^{-\frac{l\sqrt{\lambda}}{c}} \sum_{j=0}^{\infty} e^{-\frac{2jl\sqrt{\lambda}}{c}}. (736)

The last equality sign in (736) is valid because for |λ| large enough |e^{-2l\sqrt{\lambda}/c}| < 1 and therefore

\frac{1}{1 - x} = \sum_{j=0}^{\infty} x^j. (737)
The hyperbolic sines under the integral signs can also be expressed in terms of exponentials. Thus we end up with an infinite sum where each term is an exponential. From a table of Laplace transforms we have
\int_0^{\infty} dt\, e^{-\lambda t}\, \frac{1}{\sqrt{\pi t}}\, e^{-\frac{\alpha^2}{4t}} = \frac{1}{\sqrt{\lambda}}\, e^{-\alpha\sqrt{\lambda}}, \quad \alpha > 0. (738)
Inserting (738) into the infinite sum and formally interchanging summation and integration finally gives us

u(x, t) = \int_0^l ds \sum_{j=-\infty}^{\infty} \{G(x - s - 2jl, t) - G(x + s - 2jl, t)\}\, f(s), (739)
where G(y, t) is the fundamental solution to the heat equation. We have previously found the solution to (732) using a Fourier sine series,

u(x, t) = \sqrt{\frac{2}{l}} \sum_{j=1}^{\infty} a_j\, e^{-\left(\frac{\pi j c}{l}\right)^2 t} \sin\!\left(\frac{\pi j}{l}\,x\right). (740)
Problem (732) has a unique solution, so even if (739) and (740) appear to be very different they must in fact be identical. The formulas can be proven to be identical using the Poisson summation formula [1].
The formulas are mathematically equivalent, but are useful for evaluating u(x, t) in different ranges of t. Because of the exponential term in (740), evaluating u(x, t) for large t using this formula requires only a few terms. However, for small t, j must be large before the exponential term kicks in. Thus we require many terms in (740) in order to evaluate u(x, t) for small t. On the other hand, for small t the fundamental solution G(y, t) is sharply peaked around y = 0. Thus for small t only a few terms in (739) contribute to u(x, t), and (739) is fast to evaluate.
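The equivalence of (739) and (740) can also be checked numerically. A sketch for the single-mode initial data f(x) = sin(πx/l), for which the series (740) collapses to a single term; the truncation of the image sum and the quadrature grid are ad hoc choices:

```python
import math

def G(y, t, c):
    # fundamental solution of the heat equation
    return math.exp(-y * y / (4 * c * c * t)) / math.sqrt(4 * math.pi * c * c * t)

def u_images(x, t, c, l, f, J=10, n=4000):
    # image sum (739), truncated at |j| <= J, integral by the trapezoidal rule
    h = l / n
    def kern(s):
        img = sum(G(x - s - 2 * j * l, t, c) - G(x + s - 2 * j * l, t, c)
                  for j in range(-J, J + 1))
        return img * f(s)
    return h * (0.5 * (kern(0.0) + kern(l)) + sum(kern(i * h) for i in range(1, n)))

c, l = 1.0, 1.0
f = lambda s: math.sin(math.pi * s / l)
x, t = 0.3, 0.02   # small t: only a few images contribute
u_series = math.exp(-(math.pi * c / l) ** 2 * t) * math.sin(math.pi * x / l)  # (740)
assert abs(u_images(x, t, c, l, f) - u_series) < 1e-6
```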
In general it is not easy to invert integral transforms. There are however powerful analytic approximation methods that can be brought to bear on this problem.
Observe that the formula for the Laplace transform

F(\lambda) = \int_0^{\infty} dt\, e^{-\lambda t} f(t) (741)

clearly shows that F(λ) for large |λ| depends mainly on f(t) for small t. This can be made more precise. From the body of results called Abelian asymptotic theory we have the following result: If

f(t) \approx \sum_{n=0}^{\infty} \alpha_n t^n, \quad t \to 0, (742)

then

F(\lambda) \approx \sum_{n=0}^{\infty} \alpha_n \frac{n!}{\lambda^{n+1}}, \quad |\lambda| \to \infty. (743)
The converse is also true: (743) implies (742). Thus if we are only interested in f(t) for small t we can find the expression for f(t) by expanding F(λ) in inverse powers of λ. This expansion gives us the coefficients α_n, which we then insert into (742).
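A minimal check of (742)-(743): for f(t) = e^{−t} we have α_n = (−1)^n/n!, so the series in (743) becomes Σ (−1)^n/λ^{n+1}, which should approach the exact transform F(λ) = 1/(λ+1) as |λ| grows (the truncation order is an arbitrary choice):

```python
# compare the truncated asymptotic series (743) with the exact transform of e^{-t}
def F_asym(lam, N=8):
    # alpha_n * n! = (-1)^n for f(t) = exp(-t)
    return sum((-1) ** n / lam ** (n + 1) for n in range(N + 1))

F_exact = lambda lam: 1.0 / (lam + 1.0)

err10 = abs(F_asym(10.0) - F_exact(10.0))
err100 = abs(F_asym(100.0) - F_exact(100.0))
assert err10 < 1e-8        # already excellent at lambda = 10
assert err100 < err10      # and rapidly better for larger lambda
```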
There are also results relating the behavior of f(t) for large t to the behavior of F(λ) for small λ. This body of results is called Tauberian asymptotic theory. Describing even the simplest results from this theory requires complex analysis, so we will not do it here.
We next consider the Klein-Gordon equation,

u_{tt} - \gamma^2 u_{xx} + c^2 u = 0, \quad -\infty < x < \infty, (744)

u(x, 0) = f(x), \quad u_t(x, 0) = g(x). (745)

Recall that the Klein-Gordon equation is of dispersive type. We have found this previously using normal mode analysis. Taking the Fourier transform of the Klein-Gordon equation we get

U_{tt}(\lambda, t) + (\gamma^2 \lambda^2 + c^2)\, U(\lambda, t) = 0, (746)

with general solution

U(\lambda, t) = F_+(\lambda)\, e^{i\omega(\lambda)t} + F_-(\lambda)\, e^{-i\omega(\lambda)t}, (747)

where

\omega = (\gamma^2 \lambda^2 + c^2)^{\frac{1}{2}}. (748)
Using the inverse Fourier transform we get the following general solution to (744):

u(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, F_+(\lambda)\, e^{i(\omega(\lambda)t - \lambda x)} + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, F_-(\lambda)\, e^{-i(\omega(\lambda)t + \lambda x)}. (749)

The functions F_+(λ) and F_−(λ) are determined from the initial data.
From (745) the Fourier transforms of the initial data are

U(\lambda, 0) = F(\lambda) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, f(x)\, e^{i\lambda x}, (750)

U_t(\lambda, 0) = G(\lambda) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} dx\, g(x)\, e^{i\lambda x}. (751)

Using (747) we get

F_{\pm}(\lambda) = \frac{1}{2}\left( F(\lambda) \mp \frac{i}{\omega(\lambda)}\, G(\lambda) \right). (752)

The solution is thus a superposition of normal modes

F_{\pm}(\lambda)\, e^{i(\pm\omega(\lambda)t - \lambda x)}. (753)
The phase speed of these modes is

v_f(\lambda) = \frac{dx}{dt} = \pm\frac{\omega(\lambda)}{\lambda} = \pm\sqrt{\gamma^2 + \left(\frac{c}{\lambda}\right)^2}. (754)

Recall that the phase speed of a normal mode is the speed we have to move at in order for the phase θ_± = ±ω(λ)t − λx to stay constant:

\theta_{\pm}(x(t), t) = \theta_{\pm}^0, (756)

\Downarrow

\frac{dx}{dt}\, \partial_x \theta_{\pm} + \partial_t \theta_{\pm} = 0,

\Downarrow (757)

-\lambda \frac{dx}{dt} \pm \omega(\lambda) = 0,

\Downarrow

\frac{dx}{dt} = \pm\frac{\omega(\lambda)}{\lambda}.
The Klein-Gordon equation has the same principal part as the wave equation. It thus has the same characteristic speeds ±γ. As we have seen, discontinuities move at the characteristic speed, and ±γ represents the upper speed limit for the Klein-Gordon equation. But from (754) we see that the phase speed is bigger than γ for all modes. The physical relevance of the phase speed is therefore in question. Furthermore, noting that the solution to the Klein-Gordon equation is a superposition of normal modes which all move at different phase velocities, it is not at all clear that the phase speed is a relevant quantity for this equation.
In order to investigate this problem we will consider special initial conditions of the form

f(x) = p(\varepsilon x)\, e^{-ik_0 x}, \quad g(x) = q(\varepsilon x)\, e^{-ik_0 x}, (758)

where

L_0 = \frac{2\pi}{k_0}. (759)

In (758) we assume that 0 < ε ≪ 1 is a number so small that 1/ε ≫ L_0. Then the amplitudes p(εx) and q(εx) vary very little over a period L_0. The initial conditions (758) are what we call wave packets. These are functions that locally look like plane waves but have a (complex) amplitude that varies over scales much larger than L_0.
Figure 61
Using simple properties of the Fourier transform we find easily from (749) that

F(\lambda) = \frac{1}{\varepsilon}\, P\!\left(\frac{\lambda - k_0}{\varepsilon}\right), (761)

G(\lambda) = \frac{1}{\varepsilon}\, Q\!\left(\frac{\lambda - k_0}{\varepsilon}\right),
where P and Q are the Fourier transforms of p and q. From Figure 62 it is clear

Figure 62

that in Fourier space, F(λ) and G(λ) are narrow "spikes" of width ε, height 1/ε, and centered at λ = k_0. The functions F_+(λ), F_−(λ) will also be of this form,

F_{\pm}(\lambda) = \frac{1}{\varepsilon}\, A_{\pm}\!\left(\frac{\lambda - k_0}{\varepsilon}\right), (762)

for some functions A_±.
The solution of the Klein-Gordon equation consists of two terms of a very similar structure. The first term is

u_+(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, F_+(\lambda)\, e^{i(\omega(\lambda)t - \lambda x)} (763)
= \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, \frac{1}{\varepsilon}\, A_+\!\left(\frac{\lambda - k_0}{\varepsilon}\right) e^{i\omega(\lambda)t}\, e^{-i\lambda x}.

Since the integrand is sharply concentrated around λ = k_0 we may expand ω(λ) to first order around k_0, which gives

u_+(x, t) \approx \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, \frac{1}{\varepsilon}\, A_+\!\left(\frac{\lambda - k_0}{\varepsilon}\right) e^{i(\omega(k_0) + \omega'(k_0)(\lambda - k_0))t}\, e^{-i\lambda x} (764)
= e^{i(\omega(k_0)t - k_0 x)}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, \frac{1}{\varepsilon}\, A_+\!\left(\frac{\lambda - k_0}{\varepsilon}\right) e^{-i((\lambda - k_0)x - \omega'(k_0)(\lambda - k_0)t)}.
Figure 63

Changing integration variable to

\eta = \frac{\lambda - k_0}{\varepsilon} \quad \Rightarrow \quad d\eta = \frac{1}{\varepsilon}\, d\lambda, (765)

gives us

u_+(x, t) \approx e^{i(\omega(k_0)t - k_0 x)}\, \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\eta\, A_+(\eta)\, e^{-i\varepsilon\eta(x - \omega'(k_0)t)}. (766)
Defining

a_+(\xi) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\eta\, A_+(\eta)\, e^{-i\eta\xi}, (767)

we can write

u_+(x, t) \approx e^{i(\omega(k_0)t - k_0 x)}\, a_+(\varepsilon(x - \omega'(k_0)t)). (768)

We observe that the envelope of the wave packet, a_+, translates without changing shape at a speed

v_g \equiv \frac{d\omega}{d\lambda}(k_0). (769)
v_g is the group velocity of the wave packet. The group velocity, and not the phase velocity, is the relevant speed for a wave packet. This conclusion holds true for all equations of dispersive type. For the Klein-Gordon equation

\omega = (\gamma^2 \lambda^2 + c^2)^{\frac{1}{2}}, (770)

\Downarrow

v_g = \frac{\gamma^2 \lambda}{(\gamma^2 \lambda^2 + c^2)^{\frac{1}{2}}} = \frac{\gamma}{\sqrt{1 + \left(\frac{c}{\gamma\lambda}\right)^2}} < \gamma.

The group velocity is always smaller than the characteristic velocity γ for the Klein-Gordon equation.
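The chain of inequalities v_g < γ < v_f for the Klein-Gordon dispersion relation is easy to verify numerically; a small sketch (the parameter values are arbitrary):

```python
import math

gamma, c = 2.0, 3.0                                       # arbitrary parameters

omega = lambda lam: math.sqrt(gamma**2 * lam**2 + c**2)   # (748)
v_phase = lambda lam: omega(lam) / lam                    # (754), lam > 0
v_group = lambda lam: gamma**2 * lam / omega(lam)         # (770)

for lam in (0.1, 1.0, 10.0, 1000.0):
    assert v_group(lam) < gamma < v_phase(lam)
# both speeds approach gamma for short waves (large lam)
assert abs(v_group(1e6) - gamma) < 1e-6
assert abs(v_phase(1e6) - gamma) < 1e-6
```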
Figure 64
Note that the statement that the wave packet moves at the group velocity is only approximately true. A closer investigation shows that eventually the wave packet spreads out and disperses. We will shortly see that the group velocity plays a pivotal role in the description of general solutions to dispersive equations. However, before we do this we must describe a tool of utmost importance: the method of stationary phase.
Consider integrals of the form

I(k) = \int_{-\infty}^{\infty} dt\, f(t)\, e^{ik\varphi(t)}, (771)

where φ(t) is a real function and f(t) can be a complex function. Both functions are assumed to be smooth.
We want to investigate the integral for large k. The formulas we derive will hold in the limit k → ∞. In any actual application of the formulas, k is a fixed number that is large, but not infinite. The errors we make in applying the asymptotic formulas using a given finite k are not so easy to estimate. That is why we call it the method of stationary phase rather than the theory of stationary phase: proofs are few and far between in this domain of applied mathematics. The method of stationary phase, and asymptotic methods in general, tend however to be surprisingly accurate even for values of k that are nowhere "close" to infinity.
Observe that when k is very large the phase term

e^{ik\varphi(t)} (772)

performs very fast oscillations. Since f(t) is assumed to be smooth, the phase (772) will oscillate on a scale much faster than the scale at which the amplitude, f(t), varies.
Figure 65
In this situation, as the figure illustrates, the positive and negative contributions to the integral tend to cancel each other, so that we get

I(k) \to 0, \quad k \to \infty. (773)

This argument can be made into precise mathematics using the Riemann-Lebesgue lemma from Fourier analysis, but we will not pursue this any further here.
Let t_0 be a point in the domain of integration. Close to t_0 we have

\varphi(t) = \varphi(t_0) + \varphi'(t_0)(t - t_0) + \frac{1}{2}\varphi''(t_0)(t - t_0)^2 + \ldots (774)
The largest contribution to I(k) comes from close to points where

\varphi'(t_0) = 0, (775)

because around such points the phase varies more slowly than around points where φ'(t_0) ≠ 0. A point t_0 such that (775) holds is called a stationary point. If t = t_0 is a stationary point, the contribution to I(k), which we call I(k; t_0), is approximately given by

I(k; t_0) \approx \int_{t_0 - \varepsilon}^{t_0 + \varepsilon} dt\, f(t)\, e^{i(\varphi(t_0) + \frac{1}{2}\varphi''(t_0)(t - t_0)^2)k} (776)
\approx f(t_0)\, e^{i\varphi(t_0)k} \int_{t_0 - \varepsilon}^{t_0 + \varepsilon} dt\, e^{i\frac{k}{2}\varphi''(t_0)(t - t_0)^2},

where we have used the fact that for a small enough interval (t_0 − ε, t_0 + ε), f(t) varies very little around the value f(t_0). This is where we use the smoothness of f(t). If f(t) has a singularity at t = t_0 the approximation (776) is not valid.
If k is large enough the integrand will oscillate very fast beyond (t_0 − ε, t_0 + ε), so by the same argument as the one leading up to (773), we can conclude that the contribution from the integral

\int_{|t - t_0| > \varepsilon} dt\, e^{i\frac{k}{2}\varphi''(t_0)(t - t_0)^2} (777)

is negligible compared to the contribution from |t − t_0| < ε. We can therefore extend the integration range to the whole real line:

I(k; t_0) \approx f(t_0)\, e^{i\varphi(t_0)k} \int_{-\infty}^{\infty} dt\, e^{i\frac{k}{2}\varphi''(t_0)(t - t_0)^2}. (779)

Evaluating the remaining Gaussian integral we get

I(k; t_0) \approx f(t_0)\, e^{i\varphi(t_0)k}\, \sqrt{\frac{2\pi}{k\,|\varphi''(t_0)|}}\; e^{i\frac{\pi}{4}\frac{\varphi''(t_0)}{|\varphi''(t_0)|}}. (780)
If there are several stationary points, each point gives a contribution of type (780), so that finally

I(k) \approx \sum_{j=1}^{n} I(k; t_j), (781)

where φ'(t_j) = 0, j = 1, 2, . . . , n.
The method of stationary phase can be extended to integrals over finite domains and to multidimensional integrals.
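The accuracy of (780) at finite k can be probed directly. For f(t) = e^{−t²} and ϕ(t) = t²/2 there is a single stationary point t_0 = 0 with ϕ''(t_0) = 1, so (780) predicts I(k) ≈ √(2π/k) e^{iπ/4}; a sketch comparing this with brute-force quadrature (the truncation values are ad hoc choices):

```python
import cmath, math

def I_num(k, T=8.0, n=200000):
    # trapezoidal evaluation of (771) with f(t) = exp(-t^2), phi(t) = t^2/2
    h = 2 * T / n
    g = lambda t: cmath.exp(-t * t + 1j * k * t * t / 2)
    return h * (0.5 * (g(-T) + g(T)) + sum(g(-T + i * h) for i in range(1, n)))

def I_sp(k):
    # stationary phase prediction (780): t0 = 0, phi(t0) = 0, phi''(t0) = 1
    return math.sqrt(2 * math.pi / k) * cmath.exp(1j * math.pi / 4)

err50 = abs(I_num(50.0) - I_sp(50.0))
err200 = abs(I_num(200.0) - I_sp(200.0))
assert err50 / abs(I_sp(50.0)) < 0.05    # already a few percent at k = 50
assert err200 < err50                    # and improving as k grows
```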
The importance of the method of stationary phase can hardly be overstated. For example, it is by using this method that we understand how our macroscopic everyday world, ruled by classical physics, arises from a substratum ruled by the laws of quantum physics.
Recall the general solution

u(x, t) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, F_+(\lambda)\, e^{i(\omega(\lambda)t - \lambda x)} + \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, F_-(\lambda)\, e^{-i(\omega(\lambda)t + \lambda x)}, (782)
where F_+ and F_− are defined in terms of the initial data in (752). Both terms in (782) have a similar structure, so let us focus our attention on the first of them,

I(x, t) \equiv \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} d\lambda\, F_+(\lambda)\, e^{i(\omega(\lambda)t - \lambda x)}. (783)
We want to investigate this expression for both x and t large. For such large values of x and t the phase varies very quickly as a function of λ and we will, according to the method of stationary phase, get the largest contributions to I(x, t) close to values of λ where the phase is stationary. Thus we seek λ such that

\partial_\lambda(\omega(\lambda)t - \lambda x) = 0, (784)

\Updownarrow

\omega'(\lambda)t - x = 0.

Thus for given large x and t we find the stationary points by solving the equation

\omega'(\lambda) = \frac{x}{t}. (785)
For the Klein-Gordon equation we have from (748)

\omega(\lambda) = (\gamma^2 \lambda^2 + c^2)^{\frac{1}{2}}, (786)

so that (785) becomes

\frac{\gamma^2 \lambda}{(\gamma^2 \lambda^2 + c^2)^{\frac{1}{2}}} = \frac{x}{t}, (787)

\Downarrow

\gamma^4 \lambda^2 = \left(\frac{x}{t}\right)^2 (\gamma^2 \lambda^2 + c^2),

\Updownarrow

\left(\gamma^4 - \gamma^2 \left(\frac{x}{t}\right)^2\right) \lambda^2 = c^2 \left(\frac{x}{t}\right)^2, (788)

\Updownarrow

\gamma^2 \left(\gamma^2 - \left(\frac{x}{t}\right)^2\right) \lambda^2 = c^2 \left(\frac{x}{t}\right)^2. (789)

This equation has real solutions for λ only if

\frac{|x|}{t} < \gamma. (790)
But the light cone for the Klein-Gordon equation is |x| = γ|t|,

Figure 66

so stationary points only exist for (x, t) inside the light cone. Outside the light cone there are no stationary points, and according to the Riemann-Lebesgue lemma I(x, t) ≈ 0.
Assuming |x|/t < γ we solve (789) and find

\lambda = \frac{cx}{\gamma t} \left(\gamma^2 - \left(\frac{x}{t}\right)^2\right)^{-\frac{1}{2}}. (791)

We do not get a ± in this formula because according to (787) λ and x/t have the same sign.
Observe that

\omega''(\lambda) = \gamma^2 c^2 \left(\gamma^2 \lambda^2 + c^2\right)^{-\frac{3}{2}} > 0, (792)

so that

\varphi''(\lambda) = \omega''(\lambda) > 0, (793)

and therefore

\frac{\varphi''(\lambda)}{|\varphi''(\lambda)|} = 1. (794)
Thus the stationary phase formula (780) gives us

I(x, t) \approx \frac{1}{\sqrt{t}}\, H(\lambda)\, e^{i(\omega(\lambda)t - \lambda x + \frac{\pi}{4})}, (795)

where H(λ) = F_+(λ)(ω''(λ))^{−1/2} and where the phase in (783) has been written as

\omega(\lambda)t - \lambda x = t\left(\omega(\lambda) - \frac{x}{t}\lambda\right) = t\varphi, (796)

so k in the stationary phase formula is identified with t ≫ 1.
So what have we found? Observe that

\frac{x}{t} = \omega'(\lambda) = \frac{\gamma}{\sqrt{1 + \left(\frac{c}{\gamma\lambda}\right)^2}} < \gamma. (797)

Figure 67

According to (795) and (779), for large x and t along the line x = ω'(λ)t the wave field takes the form of a plane wave with frequency ω(λ), wavelength 2π/λ and amplitude

|I(x, t)| \sim \frac{1}{\sqrt{t}}\, |H(\lambda)|. (798)

The amplitude of the plane wave will decay as 1/√t. Or to put it another way: if we shift to a frame moving at speed ω'(λ) with respect to the lab frame, we would see a plane wave whose amplitude decays as 1/√t and whose frequency ω(λ) and wavelength 2π/λ will depend on our speed. Observe that ω'(λ) is just the group velocity of the mode with wave number λ.
12 Projects
12.1 Project 1
In this project the aim is to develop and implement on a computer a finite difference algorithm for the solution of the 1D wave equation. Deriving the finite difference algorithm and implementing it in a computer code is not really the challenging and time consuming step here. The real challenge is to make a convincing case for the correctness of the computer implementation. As I have stressed in my lectures, we should never, ever, trust anything that comes out of a computer without extensive testing of the implemented code.
So how do we proceed to build the needed trust in our implementation? The best way to do this is to run the code in a situation where an analytic solution is known. Most of the time an analytic solution also needs to be implemented numerically; it typically includes infinite sums of functions and integrals that can not be expressed in terms of elementary functions. So what we are really doing is to compare the output of two different computer implementations of the same solution. If this is to make sense and build confidence, it is of paramount importance that the two codes, implementing the finite difference algorithm and the analytical solution, should be as independent as humanly possible. Under such circumstances closely similar output from the two codes will build confidence that our finite difference solution is correct.
So the challenge then is to find analytical solutions to the problem at hand; in our particular case this is the 1D wave equation subject to some boundary conditions that we will specify shortly. It is typically the case that we will not be able to find analytic solutions to the actual problem we are trying to solve, but rather to a related problem or a special case of the problem, where the domain, boundary conditions and/or equation is changed somewhat or fixed by some specific choices. After all, if we could find an analytic solution to the actual problem there would not be much point in spending a lot of effort trying to solve it using finite differences. It is however important that changing from the actual problem to the related problem does not involve major modifications to the core of the finite difference code. The reason for this is that such major modifications are their own source of error, and as a consequence our test will not be very convincing as a test of the actual problem. So what is the actual problem then? Here it is:
a) Implement a finite difference method for this problem. The code should be flexible enough to handle arbitrary choices for the functions h, k, ϕ and ψ.
b) Show that

u(x, t) = \frac{1}{2}\left(\varphi(x + t) + \varphi(x - t)\right)

is a solution to the problem. In your finite difference code choose l and ϕ in such a way that the interval is much larger than the support of ϕ. (By the support of ϕ I mean the set of points where the function is nonzero.) A Gaussian function of the form

\varphi(x) = a\, e^{-b(x - x_0)^2}

is a good choice if the parameters a, b and x_0 are chosen in the right way. Also choose h = k = 0. As long as the support of the solution u is well away from the boundary of the interval [0, l], our exact solution and the finite difference solution should match. This is our first test.
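A minimal sketch of what such a first test can look like for u_tt = u_xx with clamped ends, using the standard leapfrog scheme; the grid sizes and Gaussian parameters below are my own ad hoc choices:

```python
import math

l, nx, nt = 40.0, 1600, 600
dx = l / nx
dt = 0.8 * dx                    # CFL number 0.8 < 1 for unit wave speed
a, b, x0 = 1.0, 2.0, 20.0
phi = lambda x: a * math.exp(-b * (x - x0) ** 2)

xs = [i * dx for i in range(nx + 1)]
u_prev = [phi(xi) for xi in xs]
r2 = (dt / dx) ** 2
# second-order accurate first step from u_t(x, 0) = 0
u = [0.0] + [u_prev[i] + 0.5 * r2 * (u_prev[i+1] - 2*u_prev[i] + u_prev[i-1])
             for i in range(1, nx)] + [0.0]
for _ in range(nt - 1):          # leapfrog time stepping, pinned ends
    u, u_prev = ([0.0] + [2*u[i] - u_prev[i] + r2 * (u[i+1] - 2*u[i] + u[i-1])
                          for i in range(1, nx)] + [0.0]), u

T = nt * dt                      # waves are still far from the boundaries
exact = [0.5 * (phi(xi + T) + phi(xi - T)) for xi in xs]
err = max(abs(ui - ei) for ui, ei in zip(u, exact))
assert err < 0.02                # FD solution tracks d'Alembert's solution
```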
c) Our second test involves a rather general idea called an "artificial source": we consider the modified problem where a source term is added to the equation. Such an equation would appear as the evolution equation for a string clamped at both ends where there is a prescribed force acting along the string. This kind of prescribed influence is called a "source". In the artificial source test we turn the problem around and rather assume a solution of a particular type and then compute what the corresponding source must be. We thus pick a function u(x, t) and calculate the source ρ(x, t), boundary functions h(t), k(t) and initial data ϕ(x) and ψ(x) so that u(x, t) is in fact a solution to our modified problem.
We now have an analytic solution to our modified problem and can compare this solution to the solution produced by our finite difference code. You are free to choose the function u(x, t) for yourself, but here is a possible family of choices:

u(x, t) = a\, e^{-b(x - x_0 \cos(\omega t))^2}

This is a family of Gaussian functions whose center moves back and forth in a periodic manner determined by the parameters x_0 and ω. Compare the numerical solution to the exact solution for several choices of parameters. Make sure that at least one of the tests involves the boundary conditions in a significant way. By this I mean that h(t) and k(t) should be nonzero for some values of t.
d) Now use your code to solve the problem of a periodically driven string. This models what happens when you periodically move the left end of the string, whose right end is clamped, up and down. What do you expect to happen? Does your code verify your intuition?
12.2 Project 2
In this project the aim is to develop a stable numerical scheme for the initial
value problem
u(−L, t) = 0
u(L, t) = 0
u(x, 0) = ϕ(x)
where
a = a(x, t)
b = b(x, t, u)
ρ = ρ(x, t)
Remark: Since the equation is first order in time, one can actually not pose boundary conditions at both endpoints. Thus we should only use the condition at one end point, for example u(−L, t) = 0. However, since we are using center differences we really also need a boundary condition at x = L. In this problem I ask you to use both conditions. Using these two conditions you can construct a finite difference scheme using center differences in space. However, and this is of utmost importance, you must make sure that the solution does not become nonzero in the region close to the boundaries; if it does, the scheme will go unstable no matter what we do. But this restriction is really ok, since we really are trying to simulate the dynamics on an infinite domain, and thus the computational boundaries at x = −L, L should be "invisible" at all times.
a) Derive a numerical scheme for equation (799) using center differences for the time and space derivatives. Find the von Neumann condition for numerical stability of your scheme for the case when a = a_0, b = b_0 are constants and ρ = 0.
b) Verify the numerical stability condition by running your numerical scheme for some initial condition of your choosing. Run with grid parameters both in the von Neumann stable and unstable regimes. Since our numerical scheme is meant to be a good approximation to solutions of equation (799) on an infinite interval, you must make sure that the initial condition is very small or zero close to the end points of the interval [−L, L]. A well localized Gaussian function will do the job.
u_t + u u_x = 0, \quad -\infty < x < \infty,
u(x, 0) = \varphi(x).

Find exact formulas for the shock time for solutions to the initial value problem from e) corresponding to the initial data

i)

\varphi(x) = \begin{cases} 0, & x > 1, \\ 1 - x, & 0 < x < 1, \\ x + 1, & -1 < x < 0, \\ 0, & x < -1. \end{cases}
ii)

\varphi(x) = \frac{1}{1 + x^2}

iii)

\varphi(x) = \mathrm{sech}(x)
Test the validity of your finite difference code, and exact formulas for the shock time, by running your code with the given initial conditions and verifying that the graph of the numerical solution becomes approximately vertical for some x as the shock time is approached.
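Recall from the method of characteristics that the shock time is t_s = −1/min_x ϕ'(x) for data with somewhere-negative slope. For ii), ϕ(x) = 1/(1+x²), the most negative slope sits at x = 1/√3 and gives t_s = 8/(3√3); a quick numerical cross-check of this value (the search grid is an ad hoc choice):

```python
import math

phi = lambda x: 1.0 / (1.0 + x * x)

def min_slope(f, a=-5.0, b=5.0, n=200001):
    # most negative slope of f, estimated by central differences on a fine grid
    h = (b - a) / (n - 1)
    return min((f(a + (i + 1) * h) - f(a + (i - 1) * h)) / (2 * h)
               for i in range(1, n - 1))

t_shock = -1.0 / min_slope(phi)
assert abs(t_shock - 8.0 / (3.0 * math.sqrt(3.0))) < 1e-4
```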
12.3 Project 3
In this project we are going to solve a heat equation numerically using the finite difference method and the finite Fourier transform.

u(0, t) = f(t),
u(l, t) = g(t),
u(x, 0) = \varphi(x).
a) Implement a finite difference method for this problem. The code should be flexible enough to handle arbitrary choices for the functions f, g, ρ and ϕ.
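A minimal sketch of such a code, specialized to c = 1, homogeneous boundary data f = g = 0 and zero source, where the single-mode exact solution u = e^{−(π/l)²t} sin(πx/l) provides a convenient first test (all numerical parameters are my own choices):

```python
import math

l, nx = 1.0, 100
dx = l / nx
dt = 0.4 * dx * dx               # explicit scheme: stable for dt <= dx^2 / 2
nt = 2500                        # integrate up to T = nt * dt = 0.1

u = [math.sin(math.pi * i * dx / l) for i in range(nx + 1)]   # phi(x)
for _ in range(nt):              # forward Euler in time, centered in space
    u = [0.0] + [u[i] + dt / dx**2 * (u[i+1] - 2*u[i] + u[i-1])
                 for i in range(1, nx)] + [0.0]

T = nt * dt
exact = [math.exp(-(math.pi / l) ** 2 * T) * math.sin(math.pi * i * dx / l)
         for i in range(nx + 1)]
err = max(abs(ui - ei) for ui, ei in zip(u, exact))
assert err < 1e-3
```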
b) The function

u_e(x, t) = \frac{1}{l}\left( x\, h(x, t)\, \frac{g(t)}{h(l, t)} + (l - x)\, h(x, t)\, \frac{f(t)}{h(0, t)} \right)

satisfies the boundary conditions. Calculate the corresponding artificial source ρ and initial condition ϕ. Now run your finite difference code with the calculated ρ and ϕ and compare the numerical solution to the exact solution u_e(x, t). Do this with a couple of choices for the functions f(t), g(t) and h(x, t) and for several values of the parameters in the problem. For each choice plot the numerical and exact solution in the same plot and thereby verify that they coincide.
c) Solve (800) using the finite Fourier transform. Before you apply the finite Fourier transform you should convert (800) into a problem with homogeneous boundary conditions like in equation 4.6.45 in our textbook. Use the same choices for f(t) and g(t) that you used in problem b) and choose your own source. Compare the finite Fourier transform solution with the finite difference solution for some choices of initial conditions. Make sure that the initial conditions are consistent with the boundary conditions

\varphi(0) = f(0),
\varphi(l) = g(0).
d) Let the source in problem (800) be the nonlinear source from equation 4.7.1 in our textbook,

\rho(x, t) = \hat{\lambda}\, u(x, t)\left(1 - u^2(x, t)\right).

Use the finite difference code to verify the linear and nonlinear stability results found in section 4.7 of our textbook.
References

[1] Erich Zauderer, Partial Differential Equations of Applied Mathematics, Wiley, 2006.

[8] Ivar Stakgold and Michael J. Holst, Green's Functions and Boundary Value Problems, Wiley, 2011.