Math412 Notes
© S. A. Fulling
Fall, 2005
The Wave Equation
1. I will show how a particular, simple partial differential equation (PDE) arises
in a physical problem.
2. We'll look at its solutions, which happen to be unusually easy to find in this
case.
3. We'll solve the equation again by separation of variables, the central theme of
this course, and see how Fourier series arise.
The equation in question is

    ∂²u/∂t² = c² ∂²u/∂x²,
where c is a constant, which turns out to be the speed of the waves described by
the equation.
Most textbooks derive the wave equation for a vibrating string (e.g., Haberman,
Chap. 4). It arises in many other contexts, for example, light waves (the
electromagnetic field). For variety, I shall look at the case of sound waves (motion
in a gas).
Sound waves
We assume that the gas moves back and forth in one dimension only (the x
direction). If there is no sound, then each bit of gas is at rest at some place (x, y, z).
There is a uniform equilibrium density ρ₀ (mass per unit volume) and pressure P₀
(force per unit area). Now suppose the gas moves; all gas in the layer at x moves
the same distance, X(x), but gas in other layers moves by different distances. More
precisely, at each time t the layer originally at x is displaced to x + X(x, t). There
it experiences a new density and pressure, called

    ρ = ρ₀ + ρ₁(x, t)    and    P = P₀ + P₁(x, t).
[Figure: a slab of gas between x and x + Δx (OLD) is displaced to lie between
x + X(x, t) and x + Δx + X(x + Δx, t) (NEW); the equilibrium pressure P₀ on its
faces becomes P(x, t) on the left and P(x + Δx, t) on the right.]
Given this scenario, Newton's laws imply a PDE governing the motion of the
gas. The input to the argument is three physical principles, which will be translated
into three equations that will imply the wave equation.
I. The motion of the gas changes the density. Take a slab of thickness Δx
in the gas at rest. The total amount of gas in the slab (measured by mass) is

    ρ₀ × volume = ρ₀ Δx × area.

We can consider a patch with area equal to 1. In the moving gas at time t,
this same gas finds itself in a new volume (area times thickness)

    X(x + Δx, t) − X(x, t) + Δx ≈ (∂X/∂x) Δx + Δx.

The amount of gas is unchanged, so

    ρ₀ Δx = (ρ₀ + ρ₁) (Δx + (∂X/∂x) Δx).

(Cancel Δx.) So

    ρ₀ = (ρ₀ + ρ₁) + (ρ₀ + ρ₁) ∂X/∂x.

Since ρ₁ ≪ ρ₀, we can replace ρ₀ + ρ₁ by ρ₀ in the occurrence multiplying
∂X/∂x, but not in the other one, where the ρ₀ is cancelled, leaving ρ₁ as the
most important term. Therefore, we have arrived (essentially by geometry) at

    ρ₁ = −ρ₀ ∂X/∂x.  (I)
II. The change in density corresponds to a change in pressure. (If you
push on a gas, it pushes back, as we know from feeling balloons.) Therefore,
P = f(ρ), where f is some increasing function. Then

    P₀ + P₁ = f(ρ₀ + ρ₁) ≈ f(ρ₀) + ρ₁ f′(ρ₀).

Since P₀ = f(ρ₀), writing c² for f′(ρ₀) gives

    P₁ = c² ρ₁.  (II)
III. Pressure inequalities generate gas motion. The force on our slab (mea-
sured positive to the right) equals the pressure acting on the left side of the
slab minus the pressure acting on the right side (times the area, which we set
to 1). But this force is equal to mass times acceleration, or

    (ρ₀ Δx) ∂²X/∂t².

Thus

    ρ₀ Δx ∂²X/∂t² = P(x, t) − P(x + Δx, t) ≈ −(∂P/∂x) Δx.

(Cancel Δx.) But ∂P₀/∂x = 0. So

    ρ₀ ∂²X/∂t² = −∂P₁/∂x.  (III)
Now put the three equations together. Substituting (I) into (II) yields

    P₁ = −c² ρ₀ ∂X/∂x.

Put that into (III):

    ρ₀ ∂²X/∂t² = −∂P₁/∂x = +c² ρ₀ ∂²X/∂x².

Finally, cancel ρ₀:

    ∂²X/∂t² = c² ∂²X/∂x².
Remark: The thrust of this calculation has been to eliminate all variables but
one. We chose to keep X, but could have chosen P₁ instead, getting

    ∂²P₁/∂t² = c² ∂²P₁/∂x².

(Note that P₁ is proportional to ∂X/∂x by (II) and (I).) Also, the same equation
is satisfied by the gas velocity, v(x, t) ≡ ∂X/∂t.
D'Alembert's solution

Introduce new variables,

    w ≡ x + ct,    z ≡ x − ct.

By the chain rule,

    ∂/∂t = (∂w/∂t) ∂/∂w + (∂z/∂t) ∂/∂z = c ∂/∂w − c ∂/∂z,

    ∂/∂x = (∂w/∂x) ∂/∂w + (∂z/∂x) ∂/∂z = ∂/∂w + ∂/∂z.

Therefore,

    ∂²u/∂t² = c² (∂/∂w − ∂/∂z)² u = c² (∂²u/∂w² − 2 ∂²u/∂w∂z + ∂²u/∂z²).

Similarly,

    ∂²u/∂x² = ∂²u/∂w² + 2 ∂²u/∂w∂z + ∂²u/∂z².

Thus the wave equation is

    0 = (1/4) [∂²u/∂x² − (1/c²) ∂²u/∂t²] = ∂²u/∂w∂z.
Then it just says that ∂u/∂z is a constant, as far as w is concerned. That is,

    ∂u/∂z = γ(z)    (a function of z only).

Consequently,

    u(w, z) = ∫_{z₀}^{z} γ(z̃) dz̃ + C(w),

where z₀ is some arbitrary starting point for the indefinite integral. Note that the
constant of integration will in general depend on w. Now since γ was arbitrary, its
indefinite integral is an essentially arbitrary function too, and we can forget γ and
just call the first term B(z):

    u = B(z) + C(w) = B(x − ct) + C(x + ct).
(The form of the result is symmetrical in z and w, as it must be, since we could
equally well have worked with the equation in the form ∂²u/∂z∂w = 0.)
Interpretation

What sort of function is B(x − ct)? It is easiest to visualize if B(z) has a peak
around some point z = z₀. Contemplate B(x − ct) as a function of x for a fixed t:
It will have a peak in the neighborhood of a point x₀ satisfying x₀ − ct = z₀, or

    x₀ = z₀ + ct.

That is, the bump moves to the right with velocity c, keeping its shape exactly.
[Figure: the bump B plotted against x at successive times t; it moves to the right
without changing shape.]
(Note that in the second drawing we have to plot u on the same axis as t. Such
pictures should be thought of as something like a strip of movie film which we are
forced to look at without the help of a projector.)
Similarly, the term C(x + ct) represents a wave pattern which moves rigidly
to the left at the wave velocity c. If both terms are present, and the functions
are sharply peaked, we will see the two bumps collide and pass through each other.
If the functions are not sharply peaked, the decomposition into left-moving and
right-moving parts will not be so obvious to the eye.
Initial conditions
In a concrete problem we are interested not in the most general solution of the
PDE but in the particular solution that solves the problem! How much additional
information must we specify to fix a unique solution? The two arbitrary functions
in the general solution recall the two arbitrary constants in the general solution of
a second-order ordinary differential equation (ODE), such as

    d²u/dt² + 4u = 0;    u(t) = B sin(2t) + A cos(2t).

In that case we know that the two constants can be related to two initial conditions
(IC):

    u(0) = A,    du/dt (0) = 2B.
dt
Similarly, for the wave equation the two functions B(z) and C(w) can be related
to initial data measured at, say, t = 0. (However, things will not be so simple for
other second-order PDEs.)
Let's assume for the moment that our wave equation applies for all values of x
and t:

    −∞ < x < ∞,    −∞ < t < ∞.
We consider initial data at t = 0:
    u(x, 0) = f(x),    ∂u/∂t (x, 0) = g(x).
The first condition says that B(x) + C(x) = f(x). The second condition implies

    −cB′(x) + cC′(x) = g(x),

hence

    −B(x) + C(x) = ∫ g(x)/c dx = G(x) + A,

where G is any antiderivative of g/c, and A is an unknown constant of integration.
Solve these equations for B and C:

    B(x) = ½[f(x) − G(x) − A],    C(x) = ½[f(x) + G(x) + A].
We note that A cancels out of the total solution, B(x − ct) + C(x + ct). (Being
constant, it qualifies as both left-moving and right-moving; so to this extent, the
decomposition of the solution into left and right parts is ambiguous.) So we can set
A = 0 without losing any solutions. Now our expression for the solution in terms
of the initial data is

    u(x, t) = ½[f(x + ct) + f(x − ct)] + ½[G(x + ct) − G(x − ct)].

This is the first form of d'Alembert's fundamental formula. To get the second
form, use the fundamental theorem of calculus to rewrite the G term as an integral
over g:

    u(x, t) = ½[f(x + ct) + f(x − ct)] + (1/2c) ∫_{x−ct}^{x+ct} g(w) dw.
This formula demonstrates that the value of u at a point (x, t) depends only on
the part of the initial data representing stuff that has had time to reach x while
traveling at speed c; that is, the data f(w) and g(w) on the interval of
dependence

    x − ct < w < x + ct    (for t > 0).
Conversely, any interval on the initial data surface (the line t = 0, in the two-
dimensional case) has an expanding region of influence in space-time, beyond which
its initial data are irrelevant. In other words, signals or information are carried
by the waves with a finite maximum speed. These properties continue to hold for
other wave equations (for example, in higher-dimensional space), even though in
those cases the simple dAlembert formula for the solution is lost and the waves no
longer keep exactly the same shape as they travel.
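As a quick numerical check, d'Alembert's second formula can be coded directly. The sketch below is illustrative only (the Gaussian initial displacement, the zero initial velocity, and the trapezoid rule for the integral are all assumptions, not from the notes):

```python
import math

c = 1.0  # wave speed

def f(x):
    # initial displacement: a Gaussian bump (illustrative choice)
    return math.exp(-x * x)

def g(x):
    # initial velocity: zero, so the bump splits into two half-height pulses
    return 0.0

def dalembert(x, t, n=200):
    """u(x,t) = [f(x+ct) + f(x-ct)]/2 + (1/2c) * integral of g over [x-ct, x+ct]."""
    a, b = x - c * t, x + c * t
    h = (b - a) / n
    integral = (0.5 * (g(a) + g(b)) + sum(g(a + k * h) for k in range(1, n))) * h
    return 0.5 * (f(x + c * t) + f(x - c * t)) + integral / (2 * c)

# Half of the original bump arrives at x = 2 at time t = 2 (speed c = 1):
print(dalembert(2.0, 2.0), 0.5 * f(0.0))
```

With g ≡ 0 the solution is just two half-height copies of the initial bump moving apart at speed c, exactly the traveling behavior described above.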
[Figure: space-time diagrams. Left: the interval of dependence, x − ct < w < x + ct,
of a point (x, t). Right: the expanding region of influence of an interval on the
initial line t = 0.]
Boundary conditions
In realistic problems one is usually concerned with only part of space (e.g., sound
waves in a room). What happens to the waves at the edge of the region affects what
happens inside. We need to specify this boundary behavior, in addition to initial
data, to get a unique solution. To return to our physical example, if the sound waves
are occurring in a closed pipe (of length L), then the gas should be motionless at
the ends:
X(0, t) = 0 = X(L, t).
Mathematically, these are called Dirichlet boundary conditions (BC). In contrast,
if the pipe is open at one end, then to a good approximation the pressure at that
point will be equal to the outside pressure, P0 . By our previous remark, this implies
that the derivative of X vanishes at that end; for instance,
    ∂X/∂x (0, t) = 0
instead of one of the previous equations. This is called a Neumann boundary
condition.
When a wave hits a boundary, it reflects, or bounces off. Let's see this
mathematically. Consider the interval 0 < x < ∞ and the Dirichlet condition
u(0, t) = 0.
We know that
u(x, t) = B(x ct) + C(x + ct) (1)
and
    B(w) = ½[f(w) − G(w)],    C(w) = ½[f(w) + G(w)],  (2)
where f and cG′ ≡ g are the initial data. However, if we try to calculate u from
(1) for t > x/c, we find that (1) directs us to evaluate B(w) for negative w; this is
not defined in our present problem! To see what is happening, start at (x, t) and
trace a right-moving ray backwards in time: It will run into the wall (the positive
t-axis), not the initial-data surface (the positive x-axis).
Salvation is at hand through the boundary condition, which gives us the additional
information

    B(−ct) = −C(ct).  (3)
For t > 0 this condition determines B(negative argument) in terms of C(positive
argument). For t < 0 it determines C(negative argument) in terms of B(positive
argument). Thus B and C are uniquely determined for all arguments by (2) and
(3) together.
In fact, there is a convenient way to represent the solution u(x, t) in terms of
the initial data, f and g. Let us define f(x) and g(x) for negative x by requiring
(2) to hold for negative values of w as well as positive. If we let y ≡ ct, (2) and (3)
give (for all y)

    f(−y) − G(−y) = −f(y) − G(y).  (4)
We would like to solve this for f(−y) and G(−y), assuming y positive. But for that
we need an independent equation (to get two equations in two unknowns). This is
provided by (4) with negative y; write y = −x and interchange the roles of right
and left sides:

    f(−x) + G(−x) = −f(x) + G(x).  (5)

Rewrite (4) with y = +x and solve (4) and (5): For x > 0,

    f(−x) = −f(x),    G(−x) = G(x).  (6)
What we have done here is to define extensions of f and g from their original
domain, x > 0, to the whole real line. The conditions (6) define the odd extension
of f and the even extension of G. (It's easy to see that g = cG′ is then odd, like f.)
We can now solve the wave equation in all of R² (−∞ < x < ∞, −∞ < t < ∞) with
these odd functions f and g as initial data. The solution is given by dAlemberts
formula,
    u(x, t) = ½[f(x + ct) + f(x − ct)] + ½[G(x + ct) − G(x − ct)],
and it is easy to see that the boundary condition, u(0, t) = 0, is satisfied, because
of the parity (evenness and oddness) of the data functions. Only the part of the
solution in the region x > 0 is physical; the other region is fictitious. In the latter
region we have a ghost wave which is an inverted mirror image of the physical
solution.
[Figure: the physical pulse at x > 0 together with its inverted mirror image (the
ghost) at x < 0.]
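The reflection can be checked numerically from d'Alembert's formula and the odd extension. A minimal sketch (the bump centered at x = 3 and the zero initial velocity are illustrative assumptions that keep the formula short):

```python
import math

c = 1.0

def f_phys(x):
    # physical initial displacement on x > 0: a bump centered at x = 3 (illustrative)
    return math.exp(-(x - 3.0) ** 2)

def f_ext(x):
    # odd extension f(-x) = -f(x), forced by the Dirichlet condition u(0,t) = 0
    return f_phys(x) if x >= 0 else -f_phys(-x)

def u(x, t):
    # d'Alembert solution with zero initial velocity and odd-extended data
    return 0.5 * (f_ext(x + c * t) + f_ext(x - c * t))

# The boundary condition holds automatically, by oddness:
print(u(0.0, 1.7))
# For t > x/c the right-moving ray reaches back into the "ghost" region x < 0,
# and the reflected pulse comes back inverted (negative):
print(u(1.0, 5.0))
```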
The calculation for Neumann conditions goes in very much the same way, leading
to even extensions of f and g. The result is that the pulse reflects without turning
upside down. Approximations to the ideal Dirichlet and Neumann boundary
conditions are provided by a standard high-school physics experiment with SlinkyTM
springs. A small, light spring and a large, heavy one are attached end to end. When
a wave traveling along the light spring hits the junction, the heavy spring remains
almost motionless and the pulse reflects inverted. When the wave is in the heavy
spring, the light spring serves merely to stabilize the apparatus; it carries off very
little energy and barely constrains the motion of the end of the heavy spring. The
pulse, therefore, reflects without inverting.
Two boundary conditions
Suppose that the spatial domain is 0 < x < L with a Dirichlet condition at
each end. The condition u(0, t) = 0 can be treated by constructing odd and even
extensions as before. The condition u(L, t) = 0 implies, for all t,
    0 = B(L − ct) + C(L + ct)
      = ½[f(L − ct) − G(L − ct)] + ½[f(L + ct) + G(L + ct)].  (7)
Treating this equation as we did (4), we find an extension of f and G beyond the
right end of the interval:
    f(L + ct) = −f(L − ct) = +f(−L + ct),
    G(L + ct) = G(L − ct) = G(−L + ct).
(In more detail: Treat f (L+ct) and G(L+ct) with t > 0 as the unknowns. Replacing
t by −t in (7) gives two independent equations to be solved for them.) Finally, set
ct = s + L:
f (s + 2L) = f (s), G(s + 2L) = G(s) (8)
for all s. That is, the properly extended f and G (or g) are periodic with period 2L.
Here is another way to derive (8): Let's go back to the old problem with just one
boundary, and suppose that it sits at x = L instead of x = 0. The basic geometrical
conclusion can't depend on where we put the zero of the coordinate system: It must
still be true that the extended data function is the odd (i.e., inverted) reflection of
the original data through the boundary. That is, the value of the function at the
point at a distance s to the left of L is minus its value at the point at distance s to
the right of L. If the coordinate of the first point is x, then (in the case L > 0) s
equals L x, and therefore the coordinate of the second point is L + s = 2L x.
(This conclusion is worth remembering for future use: The reflection of the point
x through a boundary at L is located at 2L x.) Therefore, the extended data
function satisfies
    f(x) = −f(2L − x).

In the problem with two boundaries, it also satisfies f(−x) = −f(x), and thus
f(2L − x) = f(−x), which is equivalent to the first half of (8) (and the second half
can be proved in the same way).
The dAlembert formula with these periodic initial data functions now gives a
solution to the wave equation that satisfies the desired boundary and initial condi-
tions. If the original initial data describe a single bump, then the extended initial
data describe an infinite sequence of image bumps, of alternating sign, as if space
were filled with infinitely many parallel mirrors reflecting each others images. Part
of each bump travels off in each direction at speed c. What this really means is
that the two wave pulses from the original, physical bump will suffer many reflections
from the two boundaries. When a ghost bump penetrates into the physical
region, it represents the result of one of these reflection events.
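The doubly-reflected extension (odd about x = 0 and 2L-periodic, by (8)) is also easy to test numerically. A sketch, with L = 1 and the particular bump as illustrative choices:

```python
import math

L = 1.0

def f_phys(x):
    # initial data on 0 < x < L, vanishing at both ends (illustrative choice)
    return math.sin(math.pi * x) ** 4

def f_ext(x):
    # reduce x to the fundamental period (-L, L], then reflect oddly about 0;
    # the resulting extension is odd about x = 0 and has period 2L
    x = x - 2 * L * math.floor((x + L) / (2 * L))
    return f_phys(x) if x >= 0 else -f_phys(-x)

def u(x, t, c=1.0):
    # d'Alembert solution with zero initial velocity
    return 0.5 * (f_ext(x + c * t) + f_ext(x - c * t))

# Both Dirichlet conditions hold for all t:
print(u(0.0, 0.37), u(L, 0.37))
# and the motion repeats with period 2L/c:
print(u(0.5, 0.1), u(0.5, 0.1 + 2 * L))
```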
[Figure: space-time diagram showing the wave pulses bouncing repeatedly between
the two boundaries, along with their ghost images outside the interval.]
1. For most linear PDEs, the waves (if indeed the solutions are wavelike at all)
don't move without changing shape. They spread out. This includes higher-
dimensional wave equations, and also the two-dimensional Klein–Gordon equation,

    ∂²u/∂t² = ∂²u/∂x² − m²u,

which arises in relativistic quantum theory. (In homework, however, you are
likely to encounter a partial extension of d'Alembert's solution to three dimensions.)
2. For most linear PDEs, it isn't possible to write down a simple general solution
constructed from a few arbitrary functions.
3. For many linear PDEs, giving initial data on an open curve or surface like t = 0
is not the most appropriate way to determine a solution uniquely. For example,
Laplace's equation

    ∂²u/∂x² + ∂²u/∂y² = 0

is the simplest of a class of PDEs called elliptic (whereas the wave equation is
hyperbolic). For Laplace's equation the natural type of boundary is a closed
curve, such as a circle, and only one data function can be required there.
Let's again consider the wave equation on a finite interval with Dirichlet conditions
(the vibrating string scenario):

    ∂²u/∂t² = c² ∂²u/∂x²,  (PDE)
where 0 < x < L (but t is arbitrary),

    u(0, t) = 0 = u(L, t),  (BC)

    u(x, 0) = f(x),    ∂u/∂t (x, 0) = g(x).  (IC)
During this first exposure to the method of variable separation, you should
watch it as a magic demonstration. The reasons for each step and the overall
strategy will be philosophized upon at length on future occasions.
Try a separated solution, u(x, t) = X(x)T(t). Then

    ∂²u/∂t² = X T″,    ∂²u/∂x² = X″ T,

and hence X T″ = c² X″ T from the PDE. Let's divide this equation by c² X T:

    T″/(c² T) = X″/X.
This must hold for all t, and for all x in the interval. But the left side is a function of
t only, and the right side is a function of x only. Therefore, the only way the equation
can be true everywhere is that both sides are constant! We call the constant −K:

    T″/(c² T) = −K = X″/X.

The boundary conditions (BC) say that X(0)T(t) = 0 = X(L)T(t) for all t, so
either T is identically zero or

    X(0) = 0 = X(L).  (∗)

The former possibility would make the whole solution zero, an uninteresting,
trivial case, so we ignore it. Therefore, we turn our attention to the ordinary
differential equation satisfied by X,

    X″ + KX = 0.  (∗∗)
Case 1: K = 0. Then X(x) = Ax + B for some constants. (∗) implies
B = 0 = AL + B, hence A = 0 = B. This solution is also trivial.
Matching initial data
So far we have looked only at (PDE) and (BC). What initial conditions does
u_n satisfy?

    f(x) = u(x, 0) = X(x)T(0) = C sin(nπx/L),

    g(x) = ∂u/∂t (x, 0) = X(x)T′(0) = (nπc/L) D sin(nπx/L).
t
Using trig identities, it is easy to check the consistency with d'Alembert's solution:

    u(x, t) = ½[f(x + ct) + f(x − ct)] + ½[G(x + ct) − G(x − ct)],

where

    G(z) = (1/c) ∫ g(z) dz = −D cos(nπz/L) + constant.
The traveling nature of the x − ct and x + ct parts of the solution is barely noticeable,
because they are spread out and superposed. The result is a standing vibration. It
is called a normal mode of the system described by (PDE) and (BC).
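A normal mode is easy to verify by finite differences. In this sketch the mode number, the amplitudes C and D, and the step size h are arbitrary illustrative values:

```python
import math

c, L, n = 1.0, math.pi, 3          # wave speed, interval length, mode number
lam = n * math.pi / L              # lam = n*pi/L, here just n since L = pi
C, D = 0.7, -0.2                   # arbitrary amplitudes of the mode

def u(x, t):
    # normal mode: u_n(x,t) = sin(lam x) [C cos(lam c t) + D sin(lam c t)]
    return math.sin(lam * x) * (C * math.cos(lam * c * t) + D * math.sin(lam * c * t))

# central-difference check of u_tt = c^2 u_xx at an arbitrary interior point
x0, t0, h = 1.1, 0.4, 1e-4
u_tt = (u(x0, t0 + h) - 2 * u(x0, t0) + u(x0, t0 - h)) / h**2
u_xx = (u(x0 + h, t0) - 2 * u(x0, t0) + u(x0 - h, t0)) / h**2
print(u_tt, c**2 * u_xx)           # equal up to discretization error

# the nodes x = k*pi/n never move: this is a standing wave, not a traveling one
print(u(math.pi / n, 0.123))
```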
But what if the initial wave profiles f(x) and g(x) aren't proportional to one of
the eigenfunctions, sin(nπx/L)? The crucial observation is that both (PDE) and (BC)
are homogeneous linear equations. That is, if u₁ and u₂ are solutions, then so is
a₁u₁ + a₂u₂ for any constants a₁ and a₂.
Therefore, any linear combination of the normal modes is a solution. Thus we know
how to construct a solution with initial data
    f(x) = Σ_{n=1}^{N} C_n sin(nπx/L),    g(x) = Σ_{n=1}^{N} (nπc/L) D_n sin(nπx/L).
This is still only a limited class of functions (all looking rather wiggly). But what
about infinite sums?

    f(x) = Σ_{n=1}^{∞} C_n sin(nπx/L),    etc.
Fact: Almost any function can be written as such a series of sines! That is
what the next few weeks of the course are about. It will allow us to get a solution
for any well-behaved f and g as initial data.
Remark: For discussion of these matters of principle, without loss of generality
we can take L = π, so that

    X_n(x) = sin(nx),    λ_n = n.
Before we leave the wave equation, let's take stock of how we solved it. I cannot
emphasize too strongly that separation of variables always proceeds in two steps:
1. Hunt for separated solutions (normal modes). The assumption that the solution
is separated (u_sep = X(x)T(t)) is only for this intermediate calculation; most
solutions of the PDE are not of that form. During this step we use only the
homogeneous conditions of the problem, those that state that something is
always equal to zero (in this case, (PDE) and (BC)).
Fourier Series
Periodic functions
Examples and remarks: (1) sin(2x) is periodic with period π and also with
period 2π or 4π. (If p is a period for f, then an integer multiple of p is also a period.
In this example the fundamental period, the smallest positive period, is π.)
(2) The smallest common period of {sin(2x), sin(3x), sin(4x), . . . } is 2π. (Note that
the fundamental periods of the first two functions in the list are π and 2π/3, which
are smaller than this common period.) (3) A constant function has every number
as period.
* Where did the cosines come from? In the previous example we had only sines, because
we were dealing with Dirichlet boundary conditions. Neumann conditions would lead to
cosines, and periodic boundary conditions (for instance, heat conduction in a ring) would
lead to both sines and cosines, as we'll see.
It is convenient to answer the last question first. That is, let's assume (∗) and
then find formulas for a_n and b_n in terms of f. Here we make use of the
orthogonality relations:
    ∫_{−π}^{π} cos(nx) dx = { 0 if n ≠ 0,  2π if n = 0 };

    ∫_{−π}^{π} sin(nx) cos(mx) dx = 0;

    ∫_{−π}^{π} sin(nx) sin(mx) dx = { 0 if n ≠ m,  π if n = m ≠ 0 };

    ∫_{−π}^{π} cos(nx) cos(mx) dx = { 0 if n ≠ m,  π if n = m ≠ 0 }.
We do similar calculations for m = 0 and for sin(mx). The conclusion is: If f has
a Fourier series representation at all, then the coefficients must be
    a₀ = (1/2π) ∫_{−π}^{π} f(x) dx,

    a_n = (1/π) ∫_{−π}^{π} cos(nx) f(x) dx,

    b_n = (1/π) ∫_{−π}^{π} sin(nx) f(x) dx.
Note that the first two equations can't be combined, because of an annoying factor
of 2. (Some authors get rid of the factor of 2 by defining the coefficient a₀ differently:

    f(x) = a₀/2 + Σ_{n=1}^{∞} [a_n cos(nx) + b_n sin(nx)].  (∗ NO ∗)

That convention is not used in these notes.)
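The coefficient formulas translate directly into numerical quadrature. A sketch (the midpoint rule and the particular test function are illustrative choices; for a trigonometric polynomial the equally spaced rule happens to be essentially exact):

```python
import math

def fourier_coeffs(f, nmax, m=2000):
    """Approximate a_0, a_n, b_n on (-pi, pi] by an m-point midpoint rule."""
    dx = 2 * math.pi / m
    xs = [-math.pi + (k + 0.5) * dx for k in range(m)]
    a0 = sum(f(x) for x in xs) * dx / (2 * math.pi)
    a = [sum(f(x) * math.cos(n * x) for x in xs) * dx / math.pi
         for n in range(1, nmax + 1)]
    b = [sum(f(x) * math.sin(n * x) for x in xs) * dx / math.pi
         for n in range(1, nmax + 1)]
    return a0, a, b

# sanity check on a function whose coefficients can be read off directly:
# f(x) = 1 + sin x + 2 cos 3x  has  a_0 = 1, b_1 = 1, a_3 = 2, all others 0
a0, a, b = fourier_coeffs(lambda x: 1 + math.sin(x) + 2 * math.cos(3 * x), 4)
print(a0, a, b)
```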
Example: Find the Fourier coefficients of the function (triangle wave) which
is periodic with period 2π and is given for −π < x < π by f(x) = |x|.
[Figure: the triangle wave: the graph of |x| on (−π, π), repeated periodically from
−2π to 2π.]
    a_n = (1/π) ∫_{−π}^{π} |x| cos(nx) dx
        = (1/π) ∫_{−π}^{0} (−x) cos(nx) dx + (1/π) ∫_{0}^{π} x cos(nx) dx.

Substituting y = −x shows that the first term equals the second, so
a_n = (2/π) ∫₀^π x cos(nx) dx. In the corresponding formula for b_n, the same
substitution turns the first term into −(1/π) ∫₀^π y sin(ny) dy, which is just the
negative of the second term. So b_n = 0. (This will always happen when an odd
integrand is integrated over an interval centered at 0.) Integrating by parts gives
a₀ = π/2 and, for n > 0, a_n = 2[(−1)ⁿ − 1]/(πn²), so that

    f(x) ∼ π/2 − (4/π)[cos x + cos(3x)/9 + cos(5x)/25 + · · ·].
(The symbol ∼ is a reminder that we have calculated the coefficients, but haven't
proved convergence yet. The important idea is that this formal Fourier series
must have something to do with f even if it doesn't converge, or converges to
something other than f.)
It's fun and informative to graph the first few partial sums of this series with
suitable software, such as Maple. By taking enough terms of the series we really do
get a good fit to the original function. Of course, with a finite number of terms we
can never completely get rid of the wiggles in the graph, nor reproduce the sharp
points of the true graph at x = nπ.
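For the triangle wave the integrals work out (by parts) to a₀ = π/2 and a_n = −4/(πn²) for odd n, with all even-n coefficients zero. A sketch of the partial sums (the evaluation points are illustrative):

```python
import math

def triangle_partial_sum(x, N):
    # partial sum of f(x) ~ pi/2 - (4/pi) * sum over odd n of cos(nx)/n^2,
    # the Fourier series of the 2*pi-periodic triangle wave f(x) = |x| on (-pi, pi)
    s = math.pi / 2
    for n in range(1, N + 1, 2):   # only odd n contribute
        s -= (4 / math.pi) * math.cos(n * x) / n**2
    return s

# the fit improves quickly; compare the partial sums with |x| at x = 1:
for N in (1, 5, 21):
    print(N, triangle_partial_sum(1.0, N), 1.0)
```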
If f(x) is defined for −π < x ≤ π, then it has a periodic extension to all x: just
reproduce the graph in blocks of length 2π all along the axis. That is,
(Here the operative equality (the target of if and only if) is the middle one.
The left one is a definition, and the right one is a consequence of our continuity
assumption. The notation lim_{x↑π} means the same as lim_{x→π⁻}, etc.) This
issue of continuity is important, because it influences how well the infinite Fourier
series converges to f, as we'll soon see.
The Fourier coefficients are completely determined by the values of f(x) in the
original interval (−π, π] (or, for that matter, any other interval of length 2π, all of
which will give the same values for the integrals). Thus we think of a Fourier series
as being associated with a 2π-periodic function on the whole real line, as well as
with a function on a circle, with x as the angle that serves as coordinate on the
circle. The angles x and x + 2πn represent the same point on the circle.
[Figure: a circle with a marked point at angle x; the angles x and x + 2πn label
the same point.]
In particular, π and −π are the same point, no different in principle from any
other point on the circle. Again, f (given for x ∈ (−π, π]) qualifies as a continuous
function on the circle only if lim_{x↓−π} f(x) = f(π). The behavior
lim_{x↓−π} f(x) ≠ f(π) counts as a jump discontinuity in the theory of Fourier
series.
(axes not to scale!) and has nothing to do with the full parabola. The coefficients
of this scalloped periodic function are given by integrals such as
∫_{−π}^{π} cos(mx) x² dx. If we were to calculate the integrals over some other
interval of length 2π, say ∫_{0}^{2π} cos(mx) x² dx, then we would get the Fourier
series of a very different function:
[Figure: the 2π-periodic extension of x² from the interval (0, 2π), a quite different
scalloped curve.]
This does not contradict the earlier statement that the integration interval is irrel-
evant when you start with a function that is already periodic.
A function f is even if

    f(−x) = f(x),

and odd if

    f(−x) = −f(x).

In either case, the values f(x) for x < 0 are determined by those for x > 0 (or
vice versa).
(1) even + even = even; odd + odd = odd; even + odd = neither.
In the language of linear algebra, the even functions and the odd functions each
form subspaces, and the vector space of all functions is their direct sum.
(2) even × even = even; odd × odd = even; even × odd = odd.
(3) (even)′ = odd; (odd)′ = even.
(4) ∫ odd = even; ∫ even = odd + C.
Theorem: If f is even, its Fourier series contains only cosines. If f is odd, its
Fourier series contains only sines.

Proof: We saw this previously for an even example function. Let's work it out
in general for the odd case:

    a_n ∝ ∫_{−π}^{π} f(x) cos(nx) dx
        = ∫_{−π}^{0} f(x) cos(nx) dx + ∫_{0}^{π} f(x) cos(nx) dx
        = −∫_{0}^{π} f(y) cos(ny) dy + ∫_{0}^{π} f(x) cos(nx) dx
        = 0.

    b_n ∝ ∫_{−π}^{π} f(x) sin(nx) dx
        = ∫_{−π}^{0} f(x) sin(nx) dx + ∫_{0}^{π} f(x) sin(nx) dx
        = ∫_{0}^{π} f(y) sin(ny) dy + ∫_{0}^{π} f(x) sin(nx) dx
        = 2 ∫_{0}^{π} f(x) sin(nx) dx.
Similarly, the even extension gives a series of cosines for any f on 0 < x < π.
This series includes the constant term, n = 0, for which the coefficient formula has
an extra factor ½. The formulas are

    f(x) ∼ Σ_{n=0}^{∞} a_n cos(nx),

    where a_n = (2/π) ∫₀^π f(x) cos(nx) dx for n > 0,

          a₀ = (1/π) ∫₀^π f(x) dx,

for even f on −π < x < π or any f on 0 < x < π.
    f(y) ∼ Σ_{n=1}^{∞} b_n sin(nπy/L),

    where b_n = (2/L) ∫₀^L f(y) sin(nπy/L) dy,

for odd f on −L < y < L or any f on 0 < y < L.
To keep the formulas simple, theoretical discussions of Fourier series are conducted
for the case L = ; the results for the general case then follow trivially.
or
(2) sines and cosines of period K (taking K = 2L, interval = (L, L)).
In each case, the arguments of the trig functions in the series and the coefficient
formulas are

    mπx/L,    m = integer.
Which series to choose (equivalently, which extension of the original function) de-
pends on the context of the problem; usually this means the type of boundary
conditions.
Complex Fourier series
Every complex number has the form z = x + iy with x and y real. To manipulate
these, assume that i² = −1 and all rules of ordinary algebra hold. Thus

    z* ≡ x − iy = complex conjugate of z.

Note that

    (z₁ + z₂)* = z₁* + z₂*,    (z₁z₂)* = z₁*z₂*.

Define

    e^{iθ} ≡ cos θ + i sin θ    (θ real);

then

    e^z = e^{x+iy} = e^x e^{iy} = e^x (cos y + i sin y);

    |e^{iθ}| = 1 if θ is real;    e^{z+2πi} = e^z;

    e^{iπ} = −1,  e^{iπ/2} = i,  e^{−iπ/2} = e^{3iπ/2} = −i = 1/i,  e^{2πi} = e⁰ = 1;

    (e^{iθ})* = e^{−iθ} = 1/e^{iθ};    e^{−iθ} = cos θ − i sin θ;

    cos θ = ½ (e^{iθ} + e^{−iθ}),    sin θ = (1/2i) (e^{iθ} − e^{−iθ}).
In the Fourier formulas (∗) for periodic functions on the interval (−π, π), set

    c₀ = a₀,    c_n = ½(a_n − ib_n),    c_{−n} = ½(a_n + ib_n)    (n > 0).

The result is

    f(x) ∼ Σ_{n=−∞}^{∞} c_n e^{inx},

    where c_n = (1/2π) ∫_{−π}^{π} f(x) e^{−inx} dx.
(Note that we are now letting n range through negative integers as well as nonneg-
ative ones.) Notice that now there is only one coefficient formula. This is a major
simplification!
Alternatively, the complex form of the Fourier series can be derived from one
orthogonality relation,

    (1/2π) ∫_{−π}^{π} e^{inx} e^{−imx} dx = { 0 if n ≠ m,  1 if n = m }.
As usual, we can scale these formulas to the interval (−L, L) by the variable
change x = πy/L.
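The single coefficient formula is pleasant to implement with complex arithmetic. A sketch (midpoint quadrature and the particular test function are illustrative choices):

```python
import cmath, math

def c_n(f, n, m=2000):
    # c_n = (1/2*pi) * integral over (-pi, pi] of f(x) e^{-inx} dx, midpoint rule
    dx = 2 * math.pi / m
    xs = [-math.pi + (k + 0.5) * dx for k in range(m)]
    return sum(f(x) * cmath.exp(-1j * n * x) for x in xs) * dx / (2 * math.pi)

f = lambda x: math.cos(2 * x) + 3 * math.sin(x)
# cos 2x = (e^{2ix} + e^{-2ix})/2 and sin x = (e^{ix} - e^{-ix})/(2i), so
# c_2 = c_{-2} = 1/2, c_1 = -1.5i, c_{-1} = +1.5i, and all other c_n vanish
for n in (-2, -1, 0, 1, 2):
    print(n, c_n(f, n))
```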
Convergence theorems
So far we've seen that we can solve the heat equation with homogenized Dirichlet
boundary conditions and arbitrary initial data (on the interval [0, π]), provided
that we can express an arbitrary function g (on that interval) as an infinite linear
combination of the eigenfunctions sin(nx):

    g(x) = Σ_{n=1}^{∞} b_n sin(nx).

Furthermore, we saw that if such a series exists, its coefficients must be given by
the formula

    b_n = (2/π) ∫₀^π g(x) sin(nx) dx.
So the burning question of the hour is: Does this Fourier sine series really converge
to g(x)?
No mathematician can answer this question without first asking, What kind
of convergence are you talking about? And what technical conditions does g sat-
isfy? There are three standard convergence theorems, each of which states that
certain technical conditions are sufficient to guarantee a certain kind of convergence.
Generally speaking,
more smoothness in g means better convergence of its Fourier series.

[Figure: the graph of a piecewise smooth function on (0, π): smooth arcs joined at
finitely many corners and jumps.]
This class of functions is singled out, not only because one can rather eas-
ily prove convergence of their Fourier series (see next theorem), but also because
they are a natural type of function to consider in engineering problems. (Think of
electrical voltages under the control of a switch, or applied forces in a mechanical
problem.)
    ½[g(x⁻) + g(x⁺)]

(which is just g(x) if g is continuous at x). [Note that at the endpoints the series
obviously converges to 0, regardless of the values of g(0) and g(π). This zero is
simply ½[g(0⁺) + g(0⁻)] or ½[g(π⁺) + g(π⁻)] for the odd extension!]
Remarks:
1. Uniform convergence means: For every ε we can find an N so big that the
partial sum

    g_N(x) ≡ Σ_{n=1}^{N} b_n sin(nx)

approximates g(x) to within an error ε everywhere in [0, π]. The crucial point
is that the same N works for all x; in other words, you can draw a horizontal
line, y = ε, that lies completely above the graph of |g(x) − g_N(x)|.
[Figure: the graphs of g and of the partial sum g_N, which stays within ε of g
everywhere in [0, π].]
3. For the same reason, the sine series can't converge uniformly near an endpoint
where g doesn't vanish. An initial-value function which violated the condition
g(0) = g(π) = 0 would be rather strange from the point of view of the Dirichlet
boundary value problem that gave rise to the sine series, since there we want
u(0, t) = u(π, t) = 0 and also u(x, 0) = g(x)!
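This failure of uniform convergence is visible numerically. For g(x) = 1 on (0, π) the sine coefficients are b_n = 4/(nπ) for odd n, and the partial sums overshoot near x = 0 by roughly 9% of the jump no matter how many terms are kept (the Gibbs phenomenon). A sketch; the search grid for the peak is an illustrative choice:

```python
import math

def S(x, N):
    # partial sum of the sine series of g(x) = 1 on (0, pi): b_n = 4/(n*pi), n odd
    # (g violates g(0) = g(pi) = 0, so uniform convergence must fail)
    return sum(4 / (k * math.pi) * math.sin(k * x) for k in range(1, N + 1, 2))

# the maximum near x = 0 stays about 1.18 (not 1) as N grows:
for N in (20, 200, 2000):
    peak = max(S(k * math.pi / (20 * N), N) for k in range(1, 40))
    print(N, peak)
```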
5. There are continuous (but not piecewise smooth) functions whose Fourier series
do not converge, but it is hard to construct an example! (See Appendix B.)
Parseval's Equation:

    ∫₀^π |g(x)|² dx = (π/2) ∑_{n=1}^∞ |b_n|².
(In particular, the integral converges if and only if the sum does.)
Proof: Taking convergence for granted, let's calculate the integral. (I'll assume that g(x) and b_n are real, although I've written the theorem so that it applies also when things are complex.)
    ∫₀^π |g(x)|² dx = ∫₀^π ( ∑_{n=1}^∞ b_n sin(nx) ) ( ∑_{m=1}^∞ b_m sin(mx) ) dx
                    = ∑_{n=1}^∞ b_n² ∫₀^π sin²(nx) dx
                    = (π/2) ∑_{n=1}^∞ b_n².
(The integrals have been evaluated by the orthogonality relations stated earlier. Only terms with m = n contribute, because of the orthogonality of the sine functions. The integral with m = n can be evaluated by a well-known rule of thumb: the integral of sin² over any integral number of quarter-cycles of the trig function is half of the integral of sin² + cos², namely the length of the interval, which is π in this case.)
There are similar Parseval equations for Fourier cosine series and for the full Fourier series on the interval (−π, π). In addition to its theoretical importance, which we can only hint at here, Parseval's equation can be used to evaluate certain numerical infinite sums, such as

    ∑_{n=1}^∞ 1/n² = π²/6.
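As a sanity check, Parseval's equation can be tested numerically. The sine coefficients of g(x) = x on [0, π] are b_n = 2(−1)^{n+1}/n (a standard computation, quoted here rather than derived), and rearranging the resulting identity yields the sum above:

```python
import math

# Parseval for the sine series of g(x) = x on [0, pi], whose coefficients
# are b_n = 2(-1)^(n+1)/n (quoted, not derived here):
#   int_0^pi x^2 dx = pi^3/3  should equal  (pi/2) * sum b_n^2,
# which rearranges to sum 1/n^2 = pi^2/6.
N = 200000
lhs = math.pi**3 / 3
rhs = (math.pi / 2) * sum((2 * (-1)**(n + 1) / n)**2 for n in range(1, N + 1))
basel = sum(1.0 / n**2 for n in range(1, N + 1))
print(lhs, rhs)               # agree to about 4 decimal places
print(basel, math.pi**2 / 6)
```

The partial sums converge slowly (the tail is of order 1/N), which is why so many terms are taken.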
L² (or Mean) Convergence Theorem: If g is square-integrable, then the series converges in the mean:

    ∫₀^π |g(x) − g_N(x)|² dx → 0 as N → ∞.
Remarks:
1. Recalling the formulas for the length and distance of vectors in 3-dimensional space,

    |~x|² ≡ ∑_{n=1}^3 x_n²,   |~x − ~y|² ≡ ∑_{n=1}^3 (x_n − y_n)²,

must not be taken too literally in such a case, such as by writing a computer program to add up the terms for a fixed value of x. (The series will converge (pointwise) for almost all x, but there may be special values where it doesn't.)
Prior to Fall 2000 this course spent about three weeks proving the convergence
theorems and covering other aspects of the theory of Fourier series. (That material
has been removed to make room for more information about PDEs, notably Green
functions and the classification of PDEs as elliptic, hyperbolic, or parabolic.) Notes
for those three weeks are attached as Appendix B.
Fundamental Concepts: Linearity and Homogeneity
This is probably the most abstract section of the course, and also the most important, since the procedures followed in solving PDEs will be simply a bewildering welter of magic tricks to you unless you learn the general principles behind them.
We have already seen the tricks in use in a few examples; it is time to extract and
formulate the principles. (These ideas will already be familiar if you have had a
good linear algebra course.)
I think that you already know how to recognize linear and nonlinear equations, so let's look at some examples before I give the official definition of "linear" and discuss its usefulness.
Algebraic equations:

    Linear:  x + 2y = 0,  x − 3y = 1.        Nonlinear:  x⁵ = 2x.
What distinguishes the linear equations from the nonlinear ones? The most
visible feature of the linear equations is that they involve the unknown quantity
(the dependent variable, in the differential cases) only to the first power. The
unknown does not appear inside transcendental functions (such as sin and ln), or
in a denominator, or squared, cubed, etc. This is how a linear equation is usually
recognized by eye. Notice that there may be terms (like cos 3t in one example) which don't involve the unknown at all. Also, as the same example term shows, there's no rule against nonlinear functions of the independent variable.
The formal definition of "linear" stresses not what a linear equation looks like, but the properties that make it easy to describe all its solutions. For concreteness let's assume that the unknown in our problem is a (real-valued) function of one or more (real) variables, u(x) or u(x, y). The fundamental concept is not "linear equation" but "linear operator":
In each example it's easy to check that (∗) is satisfied, and we also see the characteristic first-power structure of the formulas (without u-independent terms this time). In each case L is a function on functions, a mapping which takes a function as input and gives as output either another function (as in the first two examples) or a number (as in the last two). Such a "superfunction", considered as a mathematical object in its own right, is called an operator.
L(u) = g,
So the possible u-independent terms enter the picture in the role of g. This
leads to an absolutely crucial distinction:
Homogeneous vs. nonhomogeneous equations
Among our original examples, the linear ODE example was nonhomogeneous
(because of the cos 3t) and the PDE example was homogeneous. The algebraic
example is nonhomogeneous because of the 1. Here we are thinking of the system
of simultaneous equations as a single linear equation in which the unknown quantity
is a two-component vector,
x
~u .
y
The linear operator L maps ~u onto another vector,
0
~g = .
1
As you probably know, the system of equations can be rewritten in matrix notation
as
1 2 x 0
= .
1 3 y 1
The linear operator is described by the square matrix
1 2
M= .
1 3
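In miniature, the matrix form makes the solution mechanical; here is a quick check of this particular system by Cramer's rule (one method among several):

```python
# Solve the 2x2 linear system  x + 2y = 0,  x - 3y = 1  by Cramer's rule.
a, b, c, d = 1, 2, 1, -3      # entries of M, row by row
g1, g2 = 0, 1                 # right-hand side ~g
det = a * d - b * c           # det M = -5 != 0, so the solution is unique
x = (g1 * d - b * g2) / det   # x =  2/5
y = (a * g2 - g1 * c) / det   # y = -1/5
print(x, y)
```

The nonzero determinant is exactly what guarantees a unique solution here.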
u(0, x) = f (x).
Often the differential equation will be homogeneous but at least one of the boundary conditions will be nonhomogeneous. (The reverse situation also occurs.) Therefore, I think it's helpful to introduce one more bit of jargon:
Definitions: A linear problem consists of one or more linear conditions (equa-
tions) to be satisfied by the unknown, u. A linear problem is homogeneous if all of
its conditions are homogeneous, nonhomogeneous if one or more of the conditions
are nonhomogeneous.
    u″ + 4u = 0,   u(0) = 1,   u′(0) = 0

    ∂u/∂t = ∂²u/∂x² + j(x),   u(x, 0) = 0,   u(0, t) = 0,   u(1, t) = 0
is a nonhomogeneous linear problem. The boundary conditions and the initial condition are homogeneous, but the heat equation itself is nonhomogeneous in this case; the function j represents generation of heat inside the bar (perhaps by combustion or radioactivity), a possibility not considered in the discussion of the heat-conduction problem in Appendix A.
The importance of linear problems is that solving them is made easy by the superposition principles (which don't apply to nonlinear problems):
Principles of Superposition:
Then u = u1 + 3u2, for example, is also a solution. (In fact, we know that the most general solution is c1 u1 + c2 u2 where the c's are arbitrary constants. But for this we need a deeper existence-and-uniqueness theorem for second-order ODEs; it doesn't just follow from linearity.)
    u″ + 4u = e^x,   u(0) = 1/5,   u′(0) = 1/5.

    u″ + 4u = e^x,   u(0) = 0,   u′(0) = 0.
Recalling Principles 2 and 3 as applied to the differential equation alone (not the initial conditions), we see that u = u_p + y, where y is some solution of y″ + 4y = 0. A moment's further thought shows that the correct y is the solution of Problem 5:

    y″ + 4y = 0,   y(0) = −1/5,   y′(0) = −1/5.
In summary, these principles provide the basic strategies for solving linear prob-
lems. If the problem is nonhomogeneous and complicated, you split it into simpler
nonhomogeneous problems and add the solutions. If the solution is not unique,
the nonuniqueness resides precisely in the possibility of adding a solution of the
corresponding homogeneous problem. (In particular, if the original problem is ho-
mogeneous, then you seek the general solution as a linear combination of some list
of basic solutions.) If the problem statement contains enough initial and boundary conditions, the solution will be unique; in that case, the only solution of the
homogeneous problem is the zero function.
PDE:   ∂u/∂t = ∂²u/∂x²,
(1) v is to be a solution of the problem consisting of the PDE and the nonhomo-
geneous BC, with no particular IC assumed. It is possible to find a solution of
this problem which is independent of t: v(x, t) = V (x).
w(0, t) = 0, w(1, t) = 0,
and the initial condition needed to make u satisfy the original IC. Namely,
The details of steps (1) and (2) are carried out in Appendix A.
A related principle is
L1 (u) = f1 , L2 (u) = f2 .
1. Zero out the other condition. Solve
L1 (u1 ) = f1 , L2 (u1 ) = 0,
L1 (u2 ) = 0, L2 (u2 ) = f2 .
Then u = u1 + u2 .
(b) Laplace's equation in a rectangle with boundary values given on two perpendicular sides.

2. Temporarily ignore the other condition. Solve L1(u1) = f1 and let L2(u1) be whatever it turns out to be, say L2(u1) ≡ h. Next solve

    L1(u2) = 0,   L2(u2) = f2 − h.

Then u = u1 + u2.

(b) finding a steady-state solution for the wave or heat equation with nonzero, but time-independent, boundary conditions.
Moving into Higher Dimensions: The Rectangle
We will now work out a big example problem. It will break up into many small examples, which will demonstrate many of the principles we've talked about, often in a slightly new context.
Problem statement
Without loss of generality, we can assume that the variables have been scaled so that a = π.
PDE:   ∂u/∂t = ∂²u/∂x² + ∂²u/∂y².

BC1:   ∂u/∂x (t, 0, y) = 0 = ∂u/∂x (t, π, y),
That is, the plate is insulated on the sides, and the temperature on the top and
bottom edges is known and given by the functions p and q. Finally, there will be
some initial temperature distribution
Steady-state solution
From our experience with the one-dimensional problem, we know that we must
eliminate the nonhomogeneous boundary condition (BC2 ) before we can solve the
initial-value problem by separation of variables! Fortunately, p and q are indepen-
dent of t, so we can do this by the same technique used in one dimension: hunt for a time-independent solution of (PDE) and (BC), v(t, x, y) = V(x, y), then consider the initial-value problem with homogeneous boundary conditions satisfied by u − v.
So, we first want to solve
PDE:   ∂²V/∂x² + ∂²V/∂y² = 0,

BC1:   ∂V/∂x (0, y) = 0 = ∂V/∂x (π, y),
V = V1 + V2 ,
Remark: This splitting is slightly different from the one involving the steady-
state solution. In each subproblem here we have replaced every nonhomogeneous
condition except one by its corresponding homogeneous condition. In contrast, for
the steady-state solution we simply discarded the inconvenient nonhomogeneous
condition, and later will modify the corresponding nonhomogeneous condition in
the other subproblem to account for the failure of the steady-state solution to vanish
on that boundary. Which of these techniques is best varies with the problem, but
the basic principle is the same: Work with only one nonhomogeneous condition at
a time, so that you can exploit the superposition principle correctly.
    0 = X″Y + XY″,  i.e.  −X″/X = Y″/Y.

The boundary condition (BC1) implies that

    X′(0) = 0 = X′(π).

Therefore, up to a constant,

    X(x) = cos(nx),   n = 0, 1, 2, … .
Now Y must be a solution of Y″ = n²Y that vanishes at y = 0; that is, up to a constant,

    Y(y) = sinh(ny)   if n ≠ 0.
The case n = 0 must be treated separately: Y(y) = y. We have now taken care of three of the four boundaries. The remaining boundary condition is nonhomogeneous, and thus we cannot apply it to the individual separated solutions XY; first we must add up the separated solutions with arbitrary coefficients:

    V2(x, y) = a_0 y + ∑_{n=1}^∞ a_n cos(nx) sinh(ny).
This is a Fourier cosine series, so we solve for the coefficients by the usual formula:

    a_n sinh(nb) = (2/π) ∫₀^π cos(nx) q(x) dx   (n > 0).

Divide by sinh(nb) to get a formula for a_n. For n = 0 the Fourier formula lacks the factor 2, and we end up with

    a_0 = (1/(πb)) ∫₀^π q(x) dx.
Solving for V1 is exactly the same except that we need Y(b) = 0 instead of Y(0) = 0. The appropriate solution of Y″ = n²Y can be written as a linear combination of sinh(ny) and cosh(ny), or of e^{ny} and e^{−ny}, but it is neater to write it as

    Y(y) = sinh(n(y − b)),

which manifestly satisfies the boundary condition at b as well as the ODE. (Recall that hyperbolic functions satisfy trig-like identities, in this case

    sinh(n(y − b)) = cosh(nb) sinh(ny) − sinh(nb) cosh(ny)
                   = (1/2) e^{−nb} e^{ny} − (1/2) e^{nb} e^{−ny},

so the three forms are consistent.) Again the case n = 0 is special: Y(y) = y − b.
We now have

    V1(x, y) = A_0 (y − b) + ∑_{n=1}^∞ A_n cos(nx) sinh(n(y − b)).
At y = 0 this becomes

    p(x) = −A_0 b − ∑_{n=1}^∞ A_n cos(nx) sinh(nb).

Thus

    A_n = −(2/(π sinh(nb))) ∫₀^π cos(nx) p(x) dx   (n > 0),

    A_0 = −(1/(πb)) ∫₀^π p(x) dx.
This completes the solution for V1 and hence for v(t, x, y).
Homogeneous problem
BC1:   ∂w/∂x (t, 0, y) = 0 = ∂w/∂x (t, π, y),
Since there is only one nonhomogeneous condition, we can separate variables
immediately:
    w_sep(t, x, y) = T(t) X(x) Y(y).

    T′XY = TX″Y + TXY″.

    T′/T = X″/X + Y″/Y ≡ −λ.

(We know that λ is a constant, because the left side of the equation depends only on t and the right side does not depend on t at all. By analogy with the one-dimensional case we can predict that λ will be positive.) Since X″/X depends only on x and Y″/Y depends only on y, we can introduce another separation constant:

    X″/X = −μ,   Y″/Y = −λ + μ.
The boundary conditions translate to X′(0) = 0 = X′(π) and Y(0) = 0 = Y(b), so X(x) = cos(mx) with μ = m², and

    Y(y) = sin(nπy/b),   λ − μ = n²π²/b²,

whence

    λ = m² + n²π²/b² ≡ λ_mn.
Then

    T(t) = e^{−λt}.
(As usual in separation of variables, we have left out all the arbitrary constants
multiplying these solutions. They will all be absorbed into the coefficients in the
final Fourier series.)
We are now ready to superpose solutions and match the initial data. The most general solution of the homogeneous problem is a double infinite series,

    w(t, x, y) = ∑_{m=0}^∞ ∑_{n=1}^∞ c_mn cos(mx) sin(nπy/b) e^{−λ_mn t}.
To solve for c_mn we have to apply Fourier formulas twice:

    ∑_{m=0}^∞ c_mn cos(mx) = (2/b) ∫₀^b sin(nπy/b) g(x, y) dy;

    c_mn = (4/(πb)) ∫₀^π dx ∫₀^b dy cos(mx) sin(nπy/b) g(x, y)   (m > 0),

    c_0n = (2/(πb)) ∫₀^π dx ∫₀^b dy sin(nπy/b) g(x, y).
This completes the solution for w. Now we have the full solution to the original
problem:
u(t, x, y) = w(t, x, y) + V (x, y).
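For a concrete initial function the double integrals are straightforward to evaluate numerically; everything below (the choice g(x, y) = x·y·(b − y) and b = 2) is assumed for illustration only:

```python
import math

# Double Fourier coefficients c_mn for an assumed initial function
# g(x, y) = x * y * (b - y) with b = 2 (both chosen only for illustration):
#   c_mn = (4/(pi b)) int_0^pi dx int_0^b dy cos(mx) sin(n pi y/b) g(x, y)  (m > 0),
# with 2/(pi b) in place of 4/(pi b) when m = 0.
b = 2.0
g = lambda x, y: x * y * (b - y)

def c(m, n, steps=200):
    hx, hy = math.pi / steps, b / steps
    total = 0.0
    for i in range(steps):
        x = (i + 0.5) * hx                  # midpoint rule in x
        for j in range(steps):
            y = (j + 0.5) * hy              # midpoint rule in y
            total += math.cos(m * x) * math.sin(n * math.pi * y / b) * g(x, y)
    return total * hx * hy * (4 if m > 0 else 2) / (math.pi * b)

print(c(0, 1), c(1, 1), c(1, 2))    # c(1, 2) vanishes: g is even about y = b/2
```

For this separable g the coefficients factor into one-dimensional integrals, so the numbers can be checked by hand.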
(Figure: sign patterns of the eigenfunctions sin(πy/b), cos(x) sin(πy/b), and cos(x) sin(2πy/b) on the rectangle, with + and − regions separated by nodal lines.)
(Recall that cos (0x) = 1.) The function is positive or negative in each region
according to the sign shown. The function is zero on the solid lines and its nor-
mal derivative is zero along the dashed boundaries. The functions have these key
properties for our purpose:
These functions form an orthogonal basis for the vector space of functions whose
domain is the rectangle (more precisely, for the space L2 of square-integrable func-
tions on the rectangle), precisely analogous to the orthogonal basis of eigenvectors
for a symmetric matrix that students learn to construct in linear-algebra or ODE
courses.
Go back now to the steady-state problem and suppose that the boundary con-
ditions on all four sides of the rectangle are of the normal-derivative type:
PDE:   ∂²V/∂x² + ∂²V/∂y² = 0,

BC1:   ∂V/∂x (0, y) = f(y),   ∂V/∂x (π, y) = g(y),

BC2:   ∂V/∂y (x, 0) = p(x),   ∂V/∂y (x, b) = q(x).
    0 = ∮_C (∂V/∂n) ds
      = −∫₀^b f(y) dy + ∫₀^b g(y) dy − ∫₀^π p(x) dx + ∫₀^π q(x) dx.
Without even attempting to solve the problem, we can see that there is no solution
unless the net integral of the (outward) normal derivative data around the entire
perimeter of the region is exactly equal to zero.
This fact is easy to understand physically if we recall that this problem arose
from a time-dependent problem of heat conduction, and that a Neumann boundary
condition is a statement about heat flow out of the region concerned. If there is
a net heat flow out of the region (and no heat source in the interior), then the
rectangular object ought to be cooling off! It is not surprising that no steady-state
solution can exist.
The common lesson of these two examples is: just because you can expand an unknown solution in a Fourier series doesn't mean that you should. Sometimes a simple polynomial will do a better job.
    ∂²u/∂x² + ∂²u/∂y² = 0,

    −∂u/∂y (x, 0) = f1(x),   ∂u/∂y (x, L) = f2(x),

    −∂u/∂x (0, y) = g1(y),   ∂u/∂x (K, y) = g2(y),

with

    ∫₀^K [f1(x) + f2(x)] dx + ∫₀^L [g1(y) + g2(y)] dy = 0.
(Diagram: the problem with boundary data f1, f2, g1, g2 splits into one subproblem with data 0, 0, g1, g2 and another with data f1, f2, 0, 0.)
Following the usual strategy, let's break up the problem into two, so that we have nonhomogeneous data in only one variable at a time. (The diagram indicates the resulting boundary equations.) But we have outfoxed ourselves.* There is no reason why ∫₀^K [f1(x) + f2(x)] dx and ∫₀^L [g1(y) + g2(y)] dy should equal 0 individually, so in general the two subproblems will not have solutions. What to do?
Let

    C ≡ −(1/(2KL)) ∫₀^K [f1(x) + f2(x)] dx = +(1/(2KL)) ∫₀^L [g1(y) + g2(y)] dy.
We would like to have a solution, u(x, y), of the original problem with data f1, f2, g1, g2. Suppose for a moment that such a solution exists, and consider w ≡ u − CV, where V(x, y) = x² − y². We see that ∇²w = 0 and that w satisfies the Neumann boundary conditions shown in the next diagram, along with the obvious decomposition:

(Diagram: w has data f1, f2 + 2CL, g1, g2 − 2CK; it splits into one subproblem with data 0, 0, g1, g2 − 2CK and another with data f1, f2 + 2CL, 0, 0.)
We calculate

    ∫₀^L [g1(y) + g2(y) − 2CK] dy = 2CKL − 2CKL = 0,

    ∫₀^K [f1(x) + f2(x) + 2CL] dx = −2CKL + 2CKL = 0.
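Here is a numerical sketch of this bookkeeping, with boundary functions invented purely for illustration (chosen so that they satisfy the overall compatibility condition):

```python
import math

# Numerical version of the flux bookkeeping, with boundary data invented
# for illustration but satisfying the overall compatibility condition:
K, L = math.pi, 2.0
S = K + K**2 / 2                         # = int_0^K (f1 + f2) dx below
f1, f2 = lambda x: 1.0, lambda x: x
g1 = g2 = lambda y: -S / (2 * L)         # so int_0^L (g1 + g2) dy = -S

def integrate(fn, lo, hi, steps=4000):   # midpoint rule
    h = (hi - lo) / steps
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(steps))

C = -integrate(lambda x: f1(x) + f2(x), 0, K) / (2 * K * L)
adj_g = integrate(lambda y: g1(y) + g2(y) - 2 * C * K, 0, L)
adj_f = integrate(lambda x: f1(x) + f2(x) + 2 * C * L, 0, K)
print(adj_g, adj_f)    # both vanish: each subproblem is now solvable
```

The point is that the single constant C simultaneously zeroes both adjusted integrals, precisely because the original data had zero net flux.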
Therefore, each of these subproblems does have a solution, which can be constructed
as a Fourier cosine series in the usual way. (As usual in pure Neumann problems,
the solutions are nonunique because an arbitrary constant could be added. Apart
from that, the n = 0 term in each cosine series is a function that is independent of
the Fourier variable and linear in the other variable. (Try it and see!))
We can now define u = w+CV and observe that it solves the original Laplacian
problem. (Hence it could serve as the steady-state solution for a related heat or
wave problem.)
(Diagram: the Dirichlet problem with data f and g on two adjacent edges and 0 on the others is split into two subproblems, each with nonzero data on one edge only.)
The two subproblems are solved by Fourier sine series in the usual way. Unless
f (0) = 0 = f (L) and g(0) = 0 = g(K), the solutions will demonstrate nonuniform
convergence (and the Gibbs phenomenon). Suppose, however, that f and g are
continuous (and piecewise smooth) and
Then the boundary data function is continuous all around the boundary, and one suspects that the optimal Fourier solution should be better behaved. The standard decomposition has introduced an artificial discontinuity at the marked corner, and thus a spurious difficulty of poor convergence.
V (x, y) = A + Bx + Cy + Dxy
Fourier Transforms and Problems on Infinite Domains
Intuitive derivation of the Fourier transform
It is easy to see how a Fourier series becomes an integral when the length
of the interval goes to infinity. For this it is most convenient to use the complex-
exponential form of the Fourier series. Recall that for a function on a finite interval
of length 2L, we have
    f(x) = ∑_{n=−∞}^∞ c_n e^{inπx/L},

    c_n = (1/(2L)) ∫_{−L}^{L} f(x) e^{−inπx/L} dx.

Let's write

    k_n ≡ nπ/L.
Then

    f(x) = ∑ c_n e^{i k_n x}.

The spacing between successive values of k is

    Δk_n = (n + 1)π/L − nπ/L = π/L.

This suggests that for f defined on the whole real line, −∞ < x < ∞, all values of k should appear.
(Figure: the allowed wave numbers k_n marked on the k axis, for L = L0 and for L = 4L0; enlarging L makes the spacing π/L finer.)
    f̂(k_n) ≡ (1/√(2π)) ∫_{−L}^{L} f(x) e^{−i k_n x} dx.

As L → ∞ the first formula looks like a Riemann sum. In the limit we therefore expect

    f(x) = (1/√(2π)) ∫_{−∞}^∞ f̂(k) e^{ikx} dk,

    f̂(k) = (1/√(2π)) ∫_{−∞}^∞ f(x) e^{−ikx} dx.
Note the surprising symmetry between these two formulas! f̂ is called the Fourier transform of f, and f is the inverse Fourier transform of f̂.
Of course, this does not solve our example problem. There the allowed functions were cos(kx), not e^{ikx}, and we were poised to expand an initial temperature distribution, defined for positive x only, in terms of them: If
    f(x) ≡ u(x, 0) = ∫₀^∞ A(k) cos(kx) dk,

then

    u(x, t) = ∫₀^∞ A(k) cos(kx) e^{−k²Kt} dk
is the solution.
The way to get from exponentials to sines and cosines is basically the same as in finite Fourier series. First, note that the Fourier transformation we have derived (for −∞ < x < ∞) can be rewritten in terms of sin(kx) and cos(kx) (0 ≤ k < ∞) in place of e^{ikx} (−∞ < k < ∞). You can easily work out that the formulas are
    f(x) = ∫₀^∞ [A(k) cos(kx) + B(k) sin(kx)] dk,

    A(k) = (1/π) ∫_{−∞}^∞ cos(kx) f(x) dx,

    B(k) = (1/π) ∫_{−∞}^∞ sin(kx) f(x) dx.
This is seldom done in practical calculations with functions defined on (−∞, ∞), except by people with a strong hatred for complex numbers.
However, the trigonometric functions become very useful in calculations on a half-line (semi-infinite interval) with a boundary condition at the end. An arbitrary function on 0 ≤ x < ∞ can be identified with its even extension to the whole real line. An even function has a Fourier transform consisting entirely of cosines (rather than sines), and the formula for the coefficient function can be written as an integral over just the positive half of the line:
    f(x) = ∫₀^∞ A(k) cos(kx) dk,

    A(k) = (2/π) ∫₀^∞ cos(kx) f(x) dx.

Some authors prefer to split the constant symmetrically between the two formulas:

    A(k) = √(2/π) ∫₀^∞ f(x) cos(kx) dx.
Still other people put the entire factor 2/π into the A equation.* In any case, A is called the Fourier cosine transform of f, and it's often given a notation such as f̂_c(k) or F_C(k).
It should now be clear how to finish the solution of the heat problem in the
infinite bar with insulated left end.
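One way to finish it numerically: compute A(k) by quadrature and then assemble u(x, t). The initial temperature f(x) = e^{−x²} and the truncation of both integrals at 10 are assumptions made for this sketch:

```python
import math

# Sketch of the cosine-transform solution of the heat problem on x >= 0
# with insulated end (u_x(0, t) = 0).  The initial temperature
# f(x) = exp(-x^2) and the integral cutoffs at 10 are illustrative choices.
f = lambda x: math.exp(-x**2)

def integrate(fn, lo, hi, steps=300):     # midpoint rule
    h = (hi - lo) / steps
    return h * sum(fn(lo + (i + 0.5) * h) for i in range(steps))

def A(k):   # A(k) = (2/pi) int_0^oo cos(kx) f(x) dx
    return (2 / math.pi) * integrate(lambda x: math.cos(k * x) * f(x), 0, 10)

def u(x, t):   # u(x, t) = int_0^oo A(k) cos(kx) exp(-k^2 t) dk
    return integrate(lambda k: A(k) * math.cos(k * x) * math.exp(-k**2 * t), 0, 10)

print(u(0.5, 0.0), f(0.5))    # at t = 0 the formula reproduces f
```

At later times the temperature profile spreads and decays, as one expects for heat flow with no flux through the end.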
Convergence theorems
Our derivation of the Fourier transformation formulas is not a proof that applying the two formulas in succession really will take you back to the function f
from which you started; all the convergence theorems for Fourier series need to
be reformulated and reproved for this new situation. In fact, since the integrals
* Similar notational variations are found for the full (complex-exponential) Fourier
transform.
are improper, the function f needs to satisfy some technical conditions before the integral defining f̂ will converge at all.
First, let's state the generalization to Fourier transforms of the pointwise convergence theorem for Fourier series. To get a true theorem, we have to make a seemingly fussy, but actually quite natural, technical condition on the function: let's define a function with domain (−∞, ∞) to be piecewise smooth if its restriction to every finite interval is piecewise smooth. (Thus f is allowed to have infinitely many jumps or corners, but they must not pile up in one region of the line.) The Fourier transform is defined by

    f̂(k) ≡ (1/√(2π)) ∫_{−∞}^∞ f(x) e^{−ikx} dx.
a) f̂(k) is continuous.

a) f̂(k) is also square-integrable. (The integral defining f̂(k) may not converge at every point k, but it will converge in the mean, just like the inversion integral discussed below.)
b) A Parseval equation holds:

    ∫_{−∞}^∞ |f(x)|² dx = ∫_{−∞}^∞ |f̂(k)|² dk.

(If you define f̂ so that the 2π is kept all in one place, then this formula will not be so symmetrical.)

where

    f(x) ≡ (1/√(2π)) ∫_{−∞}^∞ f̂(k) e^{ikx} dk.
This differentiation property of Fourier transforms can be used to solve lin-
ear differential equations with constant coefficients. Consider the inhomogeneous
ordinary differential equation
    d²f/dx² − ω²f(x) = g(x),

where ω² > 0 and g is square-integrable and piecewise continuous. Take the Fourier transform of both sides:

    −k²f̂(k) − ω²f̂(k) = ĝ(k).

Therefore,

    f̂(k) = − ĝ(k)/(k² + ω²),

and hence

    f(x) = −(1/√(2π)) ∫_{−∞}^∞ ĝ(k) e^{ikx}/(k² + ω²) dk.
(We won't know how to evaluate this integral, or even whether it can be done analytically, until we know what g is. Nevertheless, this is a definite formula for the solution. We'll return to this formula and press it a little farther later.)
Once again, another way of looking at the calculation we have just done is as an analogue of diagonalizing a matrix. Suppose we want to solve the equation M~x = ~y, where ~x and ~y are in R² and M is a 2 × 2 matrix. If we can find a matrix U for which

    M = U⁻¹ D U,   D = ( m1   0  )
                       ( 0    m2 ),

then

    M⁻¹ = U⁻¹ D⁻¹ U,   D⁻¹ = ( 1/m1    0   )
                             ( 0     1/m2 ).
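In miniature, with a specific symmetric matrix (chosen for illustration; its eigenvectors are known exactly), the analogy reads:

```python
import math

# The diagonalization analogy in miniature: solve M x = y for the symmetric
# matrix M = [[2, 1], [1, 2]] (chosen for illustration) by passing to the
# eigenvector basis, dividing by the eigenvalues, and transforming back.
s = 1 / math.sqrt(2)
U = [[s, s], [s, -s]]    # rows are unit eigenvectors; U is orthogonal and symmetric
D = [3.0, 1.0]           # the corresponding eigenvalues

def apply(A, v):
    return [A[0][0] * v[0] + A[0][1] * v[1],
            A[1][0] * v[0] + A[1][1] * v[1]]

y = [1.0, 0.0]
c = apply(U, y)                   # "transform" y (here U^-1 = U^T = U)
c = [c[0] / D[0], c[1] / D[1]]    # diagonal solve: divide by eigenvalues
x = apply(U, c)                   # "transform back"
print(x)                          # x = [2/3, -1/3] satisfies M x = y
```

The Fourier transform plays the role of U, and division by −(k² + ω²) is the "diagonal solve".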
You may (should!) object that the general solution of this ODE should contain
two arbitrary constants. Indeed, the solution we have found is not the most general
one, but it is the only square-integrable one. (You can easily check that none of the solutions of the associated homogeneous equation,

    d²f/dx² − ω²f(x) = 0

(with ω² > 0), are square-integrable, so adding one of them to our solution will give a solution of the inhomogeneous equation that is not in L².) The Fourier calculation in effect takes place entirely within the vector space L²(−∞, ∞) (although the eigenfunctions are not themselves members of that space).
The foregoing may have reminded you of the Laplace-transform technique for
solving ODEs. In fact, the two transforms are closely related.
Allow s to be complex:

    s = σ + ik.

Then

    F(s) = ∫₀^∞ f(x) e^{−σx} e^{−ikx} dx
         = √(2π) × (Fourier transform of f(x)e^{−σx})   (σ fixed).
to time can be applied to nonhomogeneous boundary data that depend on time, so
that the steady-state solution technique does not apply.)
or

    f(x) = (1/(2π)) ∫_{−∞}^∞ F(σ + ik) e^{(σ+ik)x} dk.

That is,

    f(x) = (1/(2πi)) ∫_{σ−i∞}^{σ+i∞} F(s) e^{sx} ds.
In courses on complex analysis (such as Math. 407 and 601), it is shown that this
integral makes sense as a line integral in the complex plane. It provides an inversion
formula for Laplace transforms. In elementary differential-equations courses (such
as Math. 308) no such formula was available; the only way to invert a Laplace
transform was to find it in the right-hand column of the table that is, to know
beforehand that that function can be obtained as the direct Laplace transform of
something else. The complex analysis courses also provide techniques for evaluating
such integrals, so the number of problems that can be solved exactly by Laplace
transforms is significantly extended.
(Figure: the complex s plane, with σ = Re s on the horizontal axis and k = Im s on the vertical; the inversion contour is the vertical line σ = const.)
Convolutions, autocorrelation function, and power spectrum
In this course we emphasize the use of the Fourier transform in solving partial
differential equations. The Fourier transform also has important applications in
signal processing and the analysis of data given as a function of a time variable.
Here we take a quick look at some of the tools of that trade.
f1 ∗ f2 = f2 ∗ f1,
By manipulating the formulas defining the Fourier transform and its inverse,
it is easy to show the following:
Theorem:
As an application of the convolution theorem, return to the differential equation f″ − ω²f = g and the solution

    f(x) = −(1/√(2π)) ∫_{−∞}^∞ ĝ(k) e^{ikx}/(k² + ω²) dk.
Suppose we knew that

    1/(k² + ω²) = ĥ(k)

for some particular h(x). Then we could write

    f(x) = −(1/√(2π)) ∫_{−∞}^∞ h(x − t) g(t) dt,

thereby expressing the solution as a single integral instead of two (one to find ĝ and then one to find f).
Can we find h? Well, the most obvious way would be to evaluate the inverse Fourier transform,

    h(x) = (1/√(2π)) ∫_{−∞}^∞ e^{ikx}/(k² + ω²) dk.

Unfortunately, one needs some theorems of complex analysis to evaluate this. Fortunately, I know the answer:

    h(x) = (1/ω) √(π/2) e^{−ω|x|}   (if ω > 0).
It can be verified by elementary means (see the next section) that this h satisfies

    ĥ(k) ≡ (1/√(2π)) ∫_{−∞}^∞ h(x) e^{−ikx} dx = 1/(k² + ω²).

So we end up with

    f(x) = −(1/(2ω)) ∫_{−∞}^∞ e^{−ω|x−t|} g(t) dt.   (∗)
2
The function h, by the way, is called a Green function for this problem. It
plays the same role as the matrix M 1 in the two-dimensional algebraic analogue.
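The claimed transform pair can also be checked by brute-force quadrature; the value ω = 1.5 and the truncation at |x| ≤ 15 below are arbitrary choices for the check:

```python
import math

# Brute-force check that h(x) = (1/w) sqrt(pi/2) e^{-w|x|} has Fourier
# transform 1/(k^2 + w^2).  The value w = 1.5 and the cutoff |x| <= 15
# are arbitrary; h is even, so only the cos(kx) part survives.
w = 1.5
h = lambda x: (1 / w) * math.sqrt(math.pi / 2) * math.exp(-w * abs(x))

def h_hat(k, steps=8000, T=15.0):
    step = 2 * T / steps
    return (1 / math.sqrt(2 * math.pi)) * step * sum(
        h(x) * math.cos(k * x)
        for x in (-T + (i + 0.5) * step for i in range(steps)))

for k in (0.0, 1.0, 2.0):
    print(h_hat(k), 1 / (k**2 + w**2))
```

Agreement at several sample values of k is of course not a proof, but it catches any error in the constants.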
The uncertainty principle
If f(x) is sharply peaked, then f̂(k) must be spread out; if f̂(k) is sharply peaked, then f(x) must be spread out. A precise statement of this principle is:

    ∫_{−∞}^∞ (x − x0)² |f(x)|² dx · ∫_{−∞}^∞ (k − k0)² |f̂(k)|² dk ≥ (1/4) ‖f‖⁴.
Green Functions
Here we will look at another example of how Fourier transforms are used in
solving boundary-value problems. This time we'll carry the solution a step further, reducing the solution formula to a single integral instead of a double one.
PDE:   ∂²u/∂x² + ∂²u/∂y² = 0,
This equation might arise as the steady-state problem for heat conduction in a
large plate, where we know the temperature along one edge and want to simplify
the problem by ignoring the effects of the other, distant edges. It could also arise
in electrical or fluid-dynamical problems.
It turns out that to get a unique solution we must place one more condition on u:
it must remain bounded as x or y or both go to infinity. (In fact, it will turn out that
usually the solutions go to 0 at .) Excluding solutions that grow at infinity seems
to yield the solutions that are most relevant to real physical situations, where the
region is actually finite. But it is the mathematics of the partial differential equation
that tells us that to make the problem well-posed we do not need to prescribe some
arbitrary function as the limit of u at infinity, as we needed to do in the case of
finite boundaries.
Separating variables for this problem at first gives one a feeling of déjà vu:

    X″/X = −Y″/Y ≡ −λ,

which we write as −k². The remaining steps, however, are significantly different from the case of the finite rectangle, which we treated earlier.
If λ ≠ 0, the solution of the x equation can be

    X(x) = e^{ikx},

where any k and its negative give the same λ. The condition of boundedness requires that k be real but does not further restrict it! Taking k = 0 yields the only bounded solution with λ = 0. Therefore, we take the X in each separated solution to be e^{ikx} for some real k. The corresponding λ will be positive or zero.
We are now finished with the homogeneous conditions, so we're ready to superpose the separated solutions. Since k is a continuous variable, "superpose" in this case means "integrate", not "sum":

    u(x, y) = ∫_{−∞}^∞ dk c(k) e^{ikx} e^{−|k|y}.

Here c(k) is an arbitrary function, which plays the same role as the arbitrary coefficients in previous variable separations. The initial condition is

    f(x) = ∫_{−∞}^∞ dk c(k) e^{ikx}.
Comparing with the formula for the inverse Fourier transform, we see that c(k) = (1/√(2π)) f̂(k). That is,

    c(k) = (1/(2π)) ∫_{−∞}^∞ f(x) e^{−ikx} dx.

In other words, the solution can be written

    u(x, y) = (1/√(2π)) ∫_{−∞}^∞ dk f̂(k) e^{ikx} e^{−|k|y}.
things in the same equation, we must first rewrite the definition of the Fourier transform using a different variable:

    f̂(k) = (1/√(2π)) ∫_{−∞}^∞ dz e^{−ikz} f(z).

Then

    u(x, y) = (1/(2π)) ∫_{−∞}^∞ dk ∫_{−∞}^∞ dz e^{ik(x−z)} e^{−|k|y} f(z).
We'll evaluate this multiple integral with the k integral on the inside. (This step requires some technical justification, but that is not part of our syllabus.) The inner integral is

    ∫_{−∞}^∞ dk e^{ik(x−z)} e^{−|k|y}
      = ∫_{−∞}^0 dk e^{ik(x−z−iy)} + ∫₀^∞ dk e^{ik(x−z+iy)}
      = [ e^{ik(x−z−iy)} / (i(x−z−iy)) ]_{−∞}^{0} + [ e^{ik(x−z+iy)} / (i(x−z+iy)) ]_{0}^{∞}
      = 1/(i(x−z−iy)) − 1/(i(x−z+iy))
      = 2y / ((x−z)² + y²).
Thus

    u(x, y) = (1/π) ∫_{−∞}^∞ [ y / ((x−z)² + y²) ] f(z) dz.   (∗)

The function

    G(x − z, y) ≡ (1/π) y / ((x−z)² + y²)

is called a Green function for the boundary-value problem we started from. It is also called the kernel of the integral operator

    u = G(f)

defined by (∗). The point of (∗) is that it gives the solution, u, as a function of the boundary data, f.
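Formula (∗) can be tried out directly. The boundary data f(z) = 1/(1 + z²) below is an assumption chosen because its harmonic extension to the half-plane happens to be known in closed form, u(x, y) = (1 + y)/(x² + (1 + y)²):

```python
import math

# Check of the Poisson formula (*), with assumed boundary data
# f(z) = 1/(1 + z^2); its harmonic extension is known in closed form:
#   u(x, y) = (1 + y) / (x^2 + (1 + y)^2).
f = lambda z: 1 / (1 + z**2)

def u(x, y, steps=20000, T=200.0):
    h = 2 * T / steps
    return (1 / math.pi) * h * sum(
        y / ((x - z)**2 + y**2) * f(z)
        for z in (-T + (i + 0.5) * h for i in range(steps)))

print(u(0.5, 1.5), (1 + 1.5) / (0.5**2 + (1 + 1.5)**2))
```

The slow 1/z² decay of the kernel is why the integration range must be taken fairly large.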
Gaussian integrals
The Green function for the heat equation on an infinite interval is derived from the Fourier-transform solution in much the same way. To do that we need a basic integral formula, which I'll now derive. Define

    H(x) ≡ ∫_{−∞}^∞ e^{ikx} e^{−k²t} dk,

where t is positive.
    C ≡ H(0)
      = ∫_{−∞}^∞ e^{−k²t} dk
      = (1/√t) ∫_{−∞}^∞ e^{−q²} dq,

by the substitution q = k√t. But it is well known that

    ∫_{−∞}^∞ e^{−q²} dq = √π,

so

    C = √(π/t).

    ∫_{−∞}^∞ e^{ikx} e^{−k²t} dk = √(π/t) e^{−x²/4t}.
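A quick numerical spot-check of the boxed formula (the sample point x = 1.2, t = 0.5 is an arbitrary choice):

```python
import math

# Spot-check of the boxed formula
#     int e^{ikx} e^{-k^2 t} dk = sqrt(pi/t) e^{-x^2/(4t)}
# at the arbitrary sample point x = 1.2, t = 0.5.  The imaginary part of
# the integrand is odd in k, so only the cosine part contributes.
x, t = 1.2, 0.5

def lhs(steps=20000, T=20.0):
    h = 2 * T / steps
    return h * sum(math.cos(k * x) * math.exp(-k**2 * t)
                   for k in (-T + (i + 0.5) * h for i in range(steps)))

rhs = math.sqrt(math.pi / t) * math.exp(-x**2 / (4 * t))
print(lhs(), rhs)
```

Because the integrand decays like a Gaussian, the truncated midpoint sum converges extremely fast.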
Now I leave it as an exercise* to solve the initial-value problem for the heat equation for x ∈ (−∞, ∞):

    ∂u/∂t = ∂²u/∂x²,   (PDE)

    u(0, x) = f(x),   (IC)

in analogy to our two previous Fourier-transform solutions. You should then find that the problem is solved by the Green function

    G(t, x − z) ≡ (1/(2π)) H(x − z) = (1/√(4πt)) e^{−(x−z)²/4t}.
Note that the formula in the box is also useful for evaluating similar integrals with the roles of x and k interchanged. (Taking the complex conjugate of the formula, we note that the sign of the i in the exponent doesn't matter at all.)
Delta functions
The PDE problem defining any Green function is most simply expressed in terms of the Dirac delta function. This, written δ(x − z) (also sometimes written δ(x, z), δ_z(x), or δ₀(x − z)), is a "make-believe" function with these properties:

Also,

    ∫_a^b δ(x − z) f(x) dx = f(z) if z ∈ (a, b),   0 if z ∉ [a, b].
3. δ(x) is the limit of a family of increasingly peaked functions, each with integral 1:

    δ(x) = lim_{ε↓0} (1/π) ε/(x² + ε²),   or   lim_{ε↓0} (1/(ε√π)) e^{−x²/ε²},   or   lim_{ε↓0} d_ε(x),

where d_ε(x) = 1/(2ε) for |x| < ε and 0 elsewhere.

(Figure: members of such a family for large and small ε.)
4. δ(x − z) = (d/dx) h(x − z), where h(w) is the unit step function, or Heaviside function (equal to 1 for w > 0 and to 0 for w < 0). Note that h(x − z) is the limit as ε → 0 of a family of functions of this type:
(Figure: a smooth function rising from 0 to 1 near x = z, approximating the unit step.)

(Figure: a function g with a jump of height A at x = z.)
Example:
$$g(x) = \begin{cases} 0 & \text{for } x < 2, \\ x & \text{for } x \ge 2 \end{cases} \;=\; x\, h(x - 2).$$
Then
$$g'(x) = h(x - 2) + x\, h'(x - 2) = h(x - 2) + 2\,\delta(x - 2).$$
[Figure: graphs of g (a ramp starting at x = 2) and g′ (a unit step at x = 2 plus a delta spike of strength 2 there).]
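The distributional formula for g′ can be verified against its defining property, ∫ g φ′ dx = −∫ g′ φ dx for smooth φ vanishing at infinity. An illustrative Python check (the test function φ(x) = e^{−x²} is an arbitrary choice; φ is negligible beyond x = 10):

```python
import math

phi = lambda x: math.exp(-x * x)          # smooth test function
dphi = lambda x: -2 * x * math.exp(-x * x)

def midpoint(fn, a, b, n=100000):
    h = (b - a) / n
    return sum(fn(a + (i + 0.5) * h) for i in range(n)) * h

# ∫ g(x) phi'(x) dx with g(x) = x h(x-2): the integrand vanishes below x = 2.
left = midpoint(lambda x: x * dphi(x), 2.0, 10.0)
# -∫ g'(x) phi(x) dx with g' = h(x-2) + 2 delta(x-2):
right = -(midpoint(phi, 2.0, 10.0) + 2 * phi(2.0))
print(left, right)
```

The delta term contributes the isolated value 2φ(2); the two sides agree to quadrature accuracy.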
Interpretation of differential equations involving
Consider
$$y'' + p(x)y' + q(x)y = A\,\delta(x - z).$$
We expect the solution of this equation to be the limit of the solution of an equation
whose source term is a finite but very narrow and hard "kick" at x = z. The δ
equation is easier to solve than one with a finite peak.
[Notational remarks: lim_{x↓z} means the same as lim_{x→z+}; lim_{x↑z} means lim_{x→z−}.
Also, lim_{x→z+} y(x) is sometimes written y(z+), and so on.]
Conditions (3) and (4) tell us how to match solutions of (1) and (2) across the
joint. Here is the reasoning behind them:
That is,
$$y'(z^+) - y'(z^-) + \text{small term } (\to 0 \text{ as } \epsilon \to 0) = A.$$
[Figure: a solution y(x) that is continuous but has a corner (a jump in y′) at x = z.]
We can solve such an equation by finding the general solution on the interval
to the left of z and the general solution to the right of z, and then matching the
function and its derivative at z by rules (3) and (4) to determine the undetermined
coefficients.
y = 0 for x < 1. For x > 1 write y = Cx + D. The matching conditions at x = 1 are
$$0 = y(1^-) = y(1^+) = C + D,$$
$$0 + 1 = y'(1^-) + 1 = y'(1^+) = C.$$
That is,
$$C + D = 0, \qquad C = 1,$$
so y(x) = x − 1 for x > 1.
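The "narrow hard kick" picture itself can be tested numerically: replace Aδ(x − z) by a tall box of half-width ε and integrate the resulting ordinary ODE. The sketch below makes the concrete (hypothetical) choice p = 0, q = 1, so the delta-kick limit is y = A sin(x − z) for x > z:

```python
import math

A, z, eps = 1.0, 1.0, 1e-3         # kick strength, location, half-width

def kick(x):
    return A / (2 * eps) if abs(x - z) < eps else 0.0

def rhs(x, y, v):                  # y'' + y = kick(x) as a first-order system
    return v, kick(x) - y

y, v, x, h = 0.0, 0.0, 0.0, 2e-5   # y = y' = 0 to the left of the kick
while x < 2.0 - h / 2:             # classical RK4 out to x = 2
    k1 = rhs(x, y, v)
    k2 = rhs(x + h/2, y + h/2 * k1[0], v + h/2 * k1[1])
    k3 = rhs(x + h/2, y + h/2 * k2[0], v + h/2 * k2[1])
    k4 = rhs(x + h, y + h * k3[0], v + h * k3[1])
    y += h/6 * (k1[0] + 2*k2[0] + 2*k3[0] + k4[0])
    v += h/6 * (k1[1] + 2*k2[1] + 2*k3[1] + k4[1])
    x += h
print(y, A * math.sin(2.0 - z))    # delta-kick limit: y = A sin(x - z)
```

The finite-width correction is of order ε², so the computed y(2) agrees with sin(1) to well under a percent.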
Delta functions and Green functions
For lack of time, in this course we won't devote much attention to nonhomogeneous
partial differential equations. (Haberman, however, discusses them extensively.)
So far our nonhomogeneities have been initial or boundary data, not terms
in the PDE itself. But problems like
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \rho(t, x)$$
and
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = j(x, y),$$
where ρ and j are given functions, certainly do arise in practice. Often transform
techniques or separation of variables can be used to reduce such PDEs to nonho-
mogeneous ordinary differential equations (a single ODE in situations of extreme
symmetry, but more often an infinite family of ODEs).
Here I will show how the delta function and the concept of a Green function
can be used to solve nonhomogeneous ODEs.
(for x in the interval (0, π)) and since the operator on the left-hand side of (∗) is
linear, we expect that
$$y(x) \equiv \int_0^\pi G_\omega(x, z)\, f(z)\, dz$$
will be the solution to our problem! That is, since the operator is linear, it can be
moved inside the integral (which is a limit of a sum) to act directly on the Green
function:
$$\frac{d^2 y}{dx^2} + \omega^2 y = \int_0^\pi \left( \frac{d^2}{dx^2} + \omega^2 \right) G_\omega(x, z)\, f(z)\, dz = \int_0^\pi \delta(x - z)\, f(z)\, dz = f(x).$$
Therefore, the only task remaining is to solve
$$\frac{d^2 G_\omega(x, z)}{dx^2} + \omega^2 G_\omega(x, z) = \delta(x - z). \tag{∗∗}$$
We go about this with the usual understanding that
$$\delta(x - z) = 0 \quad \text{whenever } x \ne z.$$
Thus (∗∗) implies
$$\frac{d^2 G_\omega(x, z)}{dx^2} + \omega^2 G_\omega(x, z) = 0 \quad \text{if } x \ne z.$$
Therefore, for some constants A and B,
We need four equations to determine these four unknowns. Two of them are
the boundary conditions:
The third is that G_ω is continuous at z:
The final condition is the one we get by integrating (∗∗) over a small interval
around z:
$$\frac{\partial G_\omega}{\partial x}(z^+, z) - \frac{\partial G_\omega}{\partial x}(z^-, z) = 1.$$
(Notice that although there is no variable x left in this equation, the partial
derivative with respect to x is still meaningful: it means to differentiate with respect
to the first argument of G_ω (before letting that argument become equal to the second
one).) This last condition is
One of the equations just says that A = 0. The others can be rewritten
$$C \cos \omega\pi + D \sin \omega\pi = 0,$$
$$B \sin \omega z - C \cos \omega z - D \sin \omega z = 0,$$
$$-\omega B \cos \omega z - \omega C \sin \omega z + \omega D \cos \omega z = 1.$$
This system can be solved by Cramer's rule. After a grubby calculation, too long
to type, I find that the determinant is
$$\begin{vmatrix} 0 & \cos \omega\pi & \sin \omega\pi \\ \sin \omega z & -\cos \omega z & -\sin \omega z \\ -\omega \cos \omega z & -\omega \sin \omega z & \omega \cos \omega z \end{vmatrix} = -\omega \sin \omega\pi,$$
whence
$$B(z) = \frac{\sin \omega(z - \pi)}{\omega \sin \omega\pi}, \qquad C(z) = -\frac{\sin \omega z}{\omega}, \qquad D(z) = \frac{\cos \omega\pi\, \sin \omega z}{\omega \sin \omega\pi}.$$
Thus
$$G_\omega(x, z) = \frac{\sin \omega x\, \sin \omega(z - \pi)}{\omega \sin \omega\pi} \quad \text{for } x < z,$$
$$G_\omega(x, z) = \frac{\sin \omega z\, \sin \omega(x - \pi)}{\omega \sin \omega\pi} \quad \text{for } x > z.$$
(Reaching the last of these requires a bit more algebra and a trig identity.)
(Reaching the last of these requires a bit more algebra and a trig identity.)
So we have found the Green function! Notice that it can be expressed in the
unified form
$$G_\omega(x, z) = \frac{\sin \omega x_<\, \sin \omega(x_> - \pi)}{\omega \sin \omega\pi},$$
where
$$x_< \equiv \min(x, z), \qquad x_> \equiv \max(x, z).$$
This symmetrical structure is very common in such problems.
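The unified formula can be exercised numerically: integrate G_ω against a concrete source and compare with the closed-form solution of the boundary-value problem. An illustrative Python sketch (the choices ω = 1/2 and f(z) = z are arbitrary; for that f the exact solution of y″ + ω²y = f, y(0) = y(π) = 0 is y = x/ω² − (π/(ω² sin ωπ)) sin ωx):

```python
import math

omega = 0.5                        # not an integer, so sin(omega*pi) != 0

def G(x, z):                       # the Green function in its unified form
    lo, hi = min(x, z), max(x, z)
    return (math.sin(omega * lo) * math.sin(omega * (hi - math.pi))
            / (omega * math.sin(omega * math.pi)))

def solve(f, x, n=20000):          # y(x) = ∫_0^pi G(x, z) f(z) dz (midpoint rule)
    h = math.pi / n
    return sum(G(x, h * (i + 0.5)) * f(h * (i + 0.5)) for i in range(n)) * h

x0 = math.pi / 2
approx = solve(lambda z: z, x0)
exact = (x0 / omega**2
         - (math.pi / (omega**2 * math.sin(omega * math.pi))) * math.sin(omega * x0))
print(approx, exact)
```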
$$\frac{d^2 y}{dx^2} + \omega^2 y = 0,$$
has solutions satisfying the boundary conditions. If the homogeneous problem has
solutions (other than the zero function), then the solution of the nonhomogeneous
problem (if it exists) must be nonunique, and we have no right to expect to find a
formula for it! In fact, the existence of solutions to the nonhomogeneous problem
also depends upon whether is an integer (and also upon f ), but we dont have
time to discuss the details here.
Remark: The algebra in this example could have been reduced by writing the
solution for x > z as
$$G_\omega(x, z) = E \sin \omega(x - \pi).$$
(That is, we build the boundary condition at π into the formula by a clever choice
of basis solutions.) Then we would have to solve merely two equations in two
unknowns (B and E) instead of a 3 × 3 system.
$$y(0) = 0, \qquad y'(0) = 0.$$
$$\frac{\partial^2}{\partial x^2} G(x, z) + p(x)\, \frac{\partial}{\partial x} G(x, z) + q(x)\, G(x, z) = \delta(x - z)$$
with those initial conditions, and then expect to find y in the form
$$y(x) = \int G(x, z)\, f(z)\, dz.$$
It is not immediately obvious what the limits of integration should be, since there
is no obvious interval in this problem.
$$y'' + p(x)y' + q(x)y = 0$$
are known; call them y_1(x) and y_2(x). Of course, until we are told what p and q
are, we can't write down exact formulas for y_1 and y_2; nevertheless, we can solve
the problem in the general case, getting an expression for G in terms of y_1 and
y_2, whatever they may be.
As before we will get four equations in the four unknowns, two from initial data
and two from the continuity of G and the prescribed jump in its derivative at z.
Let us consider only the case z > 0. Then the initial conditions
$$G(0, z) = 0, \qquad \frac{\partial G}{\partial x}(0, z) = 0$$
force A = 0 = B. The continuity condition, therefore, says that G(z, z) = 0, or
$$C\, y_1(z) + D\, y_2(z) = 0.$$
Then
$$C = -\frac{y_2(z)}{W(z)}, \qquad D = \frac{y_1(z)}{W(z)},$$
where W ≡ y_1 y_2′ − y_2 y_1′ is the Wronskian. Thus our conclusion is that (for z > 0)
$$G(x, z) = \begin{cases} 0 & \text{for } x < z, \\[4pt] \dfrac{1}{W(z)} \left[ y_1(z)\, y_2(x) - y_2(z)\, y_1(x) \right] & \text{for } x > z. \end{cases}$$
Now recall that the solution of the original ODE,
$$y'' + p(x)y' + q(x)y = f(x),$$
was supposed to be
$$y(x) = \int G(x, z)\, f(z)\, dz.$$
Assume that f(z) ≠ 0 only for z > 0, where our result for G applies. Then the
integrand is 0 for z < 0 (because f = 0 there) and also for z > x (because G = 0
there). Thus
$$y(x) = \int_0^x G(x, z)\, f(z)\, dz = y_2(x) \int_0^x \frac{y_1(z) f(z)}{W(z)}\, dz \;-\; y_1(x) \int_0^x \frac{y_2(z) f(z)}{W(z)}\, dz.$$
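This initial-value Green function is also easy to exercise numerically. A minimal Python sketch (the concrete choices p = 0, q = 1, f = 1 are illustrative; then y₁ = cos, y₂ = sin, W = 1, so G(x, z) = sin(x − z) and the exact solution is 1 − cos x):

```python
import math

# For y'' + y = f:  y1 = cos, y2 = sin, W = y1*y2' - y2*y1' = 1, so
# G(x, z) = y1(z)*y2(x) - y2(z)*y1(x) = sin(x - z) for x > z (0 for x < z).
def solve(f, x, n=20000):          # y(x) = ∫_0^x sin(x - z) f(z) dz (midpoint)
    h = x / n
    return sum(math.sin(x - h * (i + 0.5)) * f(h * (i + 0.5)) for i in range(n)) * h

# With f = 1 and y(0) = y'(0) = 0 the exact solution is y(x) = 1 - cos x.
approx = solve(lambda z: 1.0, 2.0)
exact = 1 - math.cos(2.0)
print(approx, exact)
```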
This is exactly the same solution that is found in differential equations textbooks
by making the ansatz y = u_1(x) y_1(x) + u_2(x) y_2(x) (variation of parameters).
PDE:
$$\frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0,$$
Its solution is
$$G(x - z, y) \equiv \frac{1}{\pi} \frac{y}{(x - z)^2 + y^2},$$
the Green function that we constructed for Laplace's equation in the upper half
plane. Therefore, the general solution of Laplace's equation in the upper half plane,
with arbitrary initial data
$$u(x, 0) = f(x),$$
is
$$u(x, y) = \int_{-\infty}^{\infty} dz\; G(x - z, y)\, f(z).$$
Similarly, the Green function
$$G(t, x - z) = \frac{1}{\sqrt{4\pi t}}\, e^{-(x-z)^2/4t}$$
that we found for the heat equation solves the heat equation with initial data
$$u(0, x) = \delta(x - z).$$
This is a very useful formula! Here is another way of seeing what it means and why
it is true:
Recall that
$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} \hat f(k)\, e^{ikx}\, dk, \qquad \hat f(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(z)\, e^{-ikz}\, dz.$$
Let us substitute the second formula into the first:
$$f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} e^{ik(x - z)}\, f(z)\, dz\, dk.$$
Of course, this equation is useless for computing f(x), since it just goes in a circle;
its significance lies elsewhere. If we're willing to play fast and loose with the order
of the integrations, we can write it
$$f(x) = \int_{-\infty}^{\infty} \left[ \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ik(x - z)}\, dk \right] f(z)\, dz,$$
which says precisely that
$$\frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ik(x - z)}\, dk$$
satisfies the defining property of δ(x − z). Our punishment for playing fast and loose
is that this integral does not converge (in the usual sense), and there is no function
with the desired property. Nevertheless, both the integral and the object itself
can be given a rigorous meaning in the modern theory of distributions; crudely
speaking, they both make perfect sense as long as you keep them inside other
integrals (multiplied by continuous functions) and do not try to evaluate them at a
point to get a number.
What would happen if we tried this same trick with the Fourier series formulas?
Let's consider the sine series,
$$f(x) = \sum_{n=1}^{\infty} b_n \sin nx, \qquad b_n = \frac{2}{\pi} \int_0^\pi f(z) \sin nz\, dz.$$
This gives
$$f(x) = \int_0^\pi \left[ \frac{2}{\pi} \sum_{n=1}^{\infty} \sin nx\, \sin nz \right] f(z)\, dz. \tag{∗∗∗}$$
Is the quantity in brackets, then, equal to δ(x − z)? Yes and no. In (∗∗∗) the variables x and z are confined to the interval [0, π]. (∗∗∗) is a
valid representation of the delta function when applied to functions whose domain
is [0, π]. If we applied it to a function on a larger domain, it would act like the odd,
periodic extension of δ(x − z), as is always the case with Fourier sine series:
$$\frac{2}{\pi} \sum_{n=1}^{\infty} \sin nx\, \sin nz = \sum_{M=-\infty}^{\infty} \left[ \delta(x - z + 2\pi M) - \delta(x + z + 2\pi M) \right].$$
[Figure: the resulting train of positive delta spikes at x = z + 2πM and negative spikes at x = −z + 2πM.]
The Poisson summation formula
Note: This is not what Haberman calls "Poisson's formula" in Exercise 2.5.4
and p. 433.
Let's repeat the foregoing discussion for the case of the full Fourier series on
the interval (−L, L):
$$f(x) = \sum_{n=-\infty}^{\infty} c_n e^{in\pi x/L}, \qquad c_n = \frac{1}{2L} \int_{-L}^{L} e^{-in\pi y/L} f(y)\, dy$$
leads to
$$f(x) = \sum_{n=-\infty}^{\infty} \left[ \frac{1}{2L} \int_{-L}^{L} e^{-in\pi y/L} f(y)\, dy \right] e^{in\pi x/L} = \int_{-L}^{L} \left[ \frac{1}{2L} \sum_{n=-\infty}^{\infty} e^{in\pi(x - y)/L} \right] f(y)\, dy.$$
Therefore,
$$\frac{1}{2L} \sum_{n=-\infty}^{\infty} e^{in\pi(x - y)/L} = \delta(x - y) \quad \text{for } x \text{ and } y \text{ in } (-L, L).$$
Now consider y = 0 (for simplicity). For x outside (−L, L), the sum must equal
the 2L-periodic extension of δ(x):
$$\frac{1}{2L} \sum_{n=-\infty}^{\infty} e^{in\pi x/L} = \sum_{M=-\infty}^{\infty} \delta(x - 2LM).$$
This "Poisson summation formula" says that the sum of a function over an
equally spaced grid of points equals the sum of its Fourier transform over a certain
other equally spaced grid of points. The most symmetrical version comes from
choosing L = √(π/2):
$$\sum_{n=-\infty}^{\infty} f(\sqrt{2\pi}\, n) = \sum_{M=-\infty}^{\infty} \hat f(\sqrt{2\pi}\, M).$$
However, the most frequently used version, and probably the easiest to remember,
comes from taking L = 1/2: starting with a numerical sum
$$\sum_{M=-\infty}^{\infty} f(M),$$
which is
$$\sum_{n=-\infty}^{\infty} \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i n x}\, dx.$$
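This last version is easy to confirm numerically for a rapidly decaying function. A Python sketch using f(x) = e^{−ax²} (the choice a = 1 is arbitrary; the transform in this convention is √(π/a)·e^{−π²n²/a}):

```python
import math

# f(x) = exp(-a x^2);  ∫ f(x) e^{-2*pi*i*n*x} dx = sqrt(pi/a) * exp(-pi^2 n^2 / a)
a = 1.0
lhs_sum = sum(math.exp(-a * M * M) for M in range(-50, 51))
rhs_sum = math.sqrt(math.pi / a) * sum(math.exp(-math.pi**2 * n * n / a)
                                       for n in range(-50, 51))
print(lhs_sum, rhs_sum)
```

Both truncations are accurate far beyond machine precision here, since the omitted terms are smaller than e^{−2500}.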
Additional Topics on Green Functions
Recall d'Alembert's solution of the wave equation u_tt = u_xx (with c = 1), with initial data u(0, x) = f(x) and ∂u/∂t(0, x) = g(x):
$$u(t, x) = \frac{1}{2} \left[ f(x + t) + f(x - t) \right] + \frac{1}{2} \int_{x - t}^{x + t} g(z)\, dz. \tag{1}$$
For simplicity consider only the case t > 0. Then (1) can be written
$$u(t, x) = \frac{1}{2} \int_{-\infty}^{\infty} dz\, f(z) \left[ \delta(z - x - t) + \delta(z - x + t) \right] + \frac{1}{2} \int_{-\infty}^{\infty} dz\, g(z) \left[ h(z - x + t) - h(z - x - t) \right], \tag{2}$$
where
$$\delta(w) = \frac{dh(w)}{dw}.$$
Now define
$$G(t, x, z) \equiv \frac{1}{2} \left[ h(z - x + t) - h(z - x - t) \right],$$
so that
$$\frac{\partial G}{\partial t}(t, x, z) = \frac{1}{2} \left[ \delta(z - x + t) + \delta(z - x - t) \right].$$
Then (2) can be rewritten as
$$u(t, x) = \int_{-\infty}^{\infty} \frac{\partial G}{\partial t}(t, x, z)\, u(0, z)\, dz + \int_{-\infty}^{\infty} G(t, x, z)\, \frac{\partial u}{\partial t}(0, z)\, dz.$$
(Although we assumed t > 0, this formula also holds for t < 0.)
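That (1) really solves the wave equation can be spot-checked by finite differences. An illustrative Python sketch (the initial shape f(x) = e^{−x²} with g = 0, and the sample point (t, x) = (0.7, 0.3), are arbitrary choices):

```python
import math

f = lambda x: math.exp(-x * x)     # initial displacement; take g = 0

def u(t, x):                       # d'Alembert solution with g = 0
    return 0.5 * (f(x + t) + f(x - t))

t, x, h = 0.7, 0.3, 1e-3
utt = (u(t + h, x) - 2 * u(t, x) + u(t - h, x)) / h**2
uxx = (u(t, x + h) - 2 * u(t, x) + u(t, x - h)) / h**2
print(utt, uxx)
```

For an exact solution u_tttt = u_xxxx, so the leading truncation errors of the two second differences cancel and the agreement is essentially to rounding level.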
Green functions for nonhomogeneous problems
Recall that the solution of the initial-value problem for the homogeneous heat
equation is
$$u(t, x) = \int_{-\infty}^{\infty} H(t, x, y)\, f(y)\, dy \qquad (f(x) \equiv u(0, x)),$$
where
$$H(t, x, y) = \frac{1}{\sqrt{4\pi t}}\, e^{-(x-y)^2/4t}.$$
H could be defined as the solution of the initial-value problem
$$\frac{\partial H}{\partial t} = \frac{\partial^2 H}{\partial x^2}, \qquad H(0, x, y) = \delta(x - y). \tag{4}$$
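Both parts of (4) are easy to probe numerically: the heat equation by finite differences at a fixed point, and the delta initial condition indirectly through the total mass ∫ H dy = 1 (which holds for every t). A Python sketch (the sample point is an arbitrary choice):

```python
import math

def H(t, x, y):
    return math.exp(-(x - y)**2 / (4 * t)) / math.sqrt(4 * math.pi * t)

t, x, y, h = 0.3, 0.5, -0.2, 1e-3
Ht  = (H(t + h, x, y) - H(t - h, x, y)) / (2 * h)                 # dH/dt
Hxx = (H(t, x + h, y) - 2 * H(t, x, y) + H(t, x - h, y)) / h**2   # d2H/dx2
mass = sum(H(t, x, -10 + 1e-3 * (i + 0.5)) * 1e-3 for i in range(20000))
print(Ht, Hxx, mass)               # mass = ∫ H dy ≈ 1
```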
Now consider the nonhomogeneous problem
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + j(t, x) \quad (\text{for } t > 0), \qquad u(0, x) = 0 \tag{5}$$
(where we've imposed the homogeneous initial condition to make the solution
unique). In view of our experience with ODEs we might expect the solution to
be of the form
$$u(t, x) = \int_{-\infty}^{\infty} dy \int_0^{\infty} ds\; G(t, x; s, y)\, j(s, y), \tag{6}$$
where G satisfies
$$\frac{\partial G}{\partial t} - \frac{\partial^2 G}{\partial x^2} = \delta(t - s)\,\delta(x - y), \qquad G(0, x; s, y) = 0 \tag{7}$$
(i.e., the temperature response to a point source of heat at position y and time s).
The surprising fact is that G turns out to be essentially the same thing as H.
Consider
$$u(t, x) \equiv \int_{-\infty}^{\infty} dy \int_0^t ds\; H(t - s, x, y)\, j(s, y).$$
It can be proved that differentiation under the integral sign is legitimate here, so
let's just calculate
$$\frac{\partial u}{\partial t} = \int_{-\infty}^{\infty} dy\, H(0, x, y)\, j(t, y) + \int_{-\infty}^{\infty} dy \int_0^t ds\, \frac{\partial H}{\partial t}(t - s, x, y)\, j(s, y),$$
$$\frac{\partial^2 u}{\partial x^2} = \int_{-\infty}^{\infty} dy \int_0^t ds\, \frac{\partial^2 H}{\partial x^2}(t - s, x, y)\, j(s, y).$$
Now use (4) to evaluate the first term in ∂u/∂t and to observe that the other term
cancels ∂²u/∂x² when we construct
$$\frac{\partial u}{\partial t} - \frac{\partial^2 u}{\partial x^2} = \int_{-\infty}^{\infty} dy\, \delta(x - y)\, j(t, y) = j(t, x).$$
Also, we have u(0, x) = 0. So our u solves the problem (5). In other words, the
solution of (5) is (6) with
$$G(t, x; s, y) = \begin{cases} H(t - s, x, y) & \text{if } s \le t, \\ 0 & \text{if } s > t. \end{cases}$$
Put the other way around: The Green function that solves the initial-value
problem for the homogeneous heat equation is
$$H(t, x, y) = G(t, x; 0, y),$$
where G is the Green function that solves the nonhomogeneous heat equation with
homogeneous initial data (and is defined by (7)). This connection between nonhomogeneous
and homogeneous Green functions is called Duhamel's principle (specifically,
for the heat equation, and more loosely, for analogous more general situations).
The previous result for the wave equation is another instance of this principle:
It can be shown that
$$G_{\mathrm{ret}}(t, x; s, y) = \begin{cases} G(t - s, x, y) & \text{if } s < t, \\ 0 & \text{if } s > t \end{cases}$$
is a Green function for the nonhomogeneous wave equation, in the sense that
$$u(t, x) = \int_{-\infty}^{\infty} dy \int_{-\infty}^{t} ds\; G_{\mathrm{ret}}(t, x; s, y)\, f(s, y)$$
satisfies
$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} = f(t, x).$$
(Here the G(t − s, x, z) is the one previously constructed for the wave equation.) The
subscript "ret" stands for retarded. It means that the effects of the source f show up
only later in time. (Pictorially, a point source at (s, y) emits a wave into the forward-pointing
space-time cone of points (t, x) with its vertex at the source. Elsewhere
Gret = 0.) Because the wave equation is second-order and time-symmetric, there are
infinitely many other Green functions, corresponding to different initial conditions.
In particular, there is an advanced Green function that absorbs everything and
leaves the space empty of waves at later times. For thermodynamic reasons the
retarded solution is the relevant one in most applications. (You do not often turn
on a flashlight with an incoming wave already focused on it.)
We shall soon see the Duhamel principle at work for Laplace's equation, too.
Coulomb fields
$$\nabla^2 u = -j(\mathbf{x}),$$
$$G(\mathbf{x}, \mathbf{y}) = G(\mathbf{x} - \mathbf{y}) = \frac{1}{4\pi r}$$
(times constants that depend on the system of electrical units being used). In
general dimension n (greater than 2, a special case) this becomes
$$G(\mathbf{x}, \mathbf{y}) = \frac{C}{r^{n-2}},$$
where [(n − 2)C]^{−1} is the surface area of the unit (n − 1)-sphere. For n = 2 the
formula is
$$G(\mathbf{x}, \mathbf{y}) = -\frac{\ln r}{2\pi} = -\frac{\ln r^2}{4\pi}.$$
$$\nabla^2 = \frac{\partial^2}{\partial r^2} + \frac{n - 1}{r} \frac{\partial}{\partial r} + \frac{1}{r^2} \times (\text{angular derivatives}),$$
so ∇²r^{2−n} = 0 away from r = 0, as required. Now the hard part is showing that the function has the
delta behavior at the origin. Let B_ε be the ball of radius ε centered at y, and let S_ε
be its boundary (a sphere of radius ε). If we trust that Gauss's theorem continues
to hold if delta functions in the derivatives are taken into account, then
$$\int_{B_\epsilon} \nabla^2 G\, d^n z = \int_{S_\epsilon} \nabla G \cdot d\mathbf{S} = \int_{S_\epsilon} \frac{\partial G}{\partial r}\, \epsilon^{n-1}\, d\Omega = C(2 - n)\, \epsilon^{1-n} \epsilon^{n-1} \int_{S_\epsilon} d\Omega;$$
that is,
$$\int_{B_\epsilon} (-\nabla^2 G)\, d^n z = (n - 2)C \times (\text{area of sphere of unit radius}) = 1.$$
Thus the singularity at the origin has the correct normalization (∇²G = −δ). To make a
real proof one should do two things: (1) We really need to show, not just that
∫(−∇²G) d^n z = 1, but that ∫(−∇²G(z)) f(z) d^n z = f(0) for all smooth functions f. This
is not much harder than the calculation just shown: Either use Green's symmetric
identity (reviewed in a later subsection), or expand f in a power series. All the
unwanted terms will go to zero as ε → 0. (2) Strictly speaking, the action of ∇² on
G is defined by integration by parts (in the whole space):
$$\int_{\mathbf{R}^n} \nabla^2 G(z)\, f(z)\, d^n z \equiv \int_{\mathbf{R}^n} G(z)\, \nabla^2 f(z)\, d^n z,$$
where f is assumed to vanish at infinity. Now apply Gauss's theorem to the outside
of S_ε, where we know it is valid, to show that this integral equals f(0).
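Both facts, that G = 1/(4πr) is harmonic away from the origin and that its gradient carries unit flux through every sphere around the origin, can be spot-checked numerically. An illustrative Python sketch (conventions as above, so the outward flux is −1; the evaluation point (1, 1, 1) is arbitrary):

```python
import math

C = 1.0 / (4 * math.pi)            # G = C/r in three dimensions

def G(x, y, z):
    return C / math.sqrt(x * x + y * y + z * z)

# (a) Laplacian vanishes away from the origin (central differences at (1,1,1)):
h, p = 1e-3, (1.0, 1.0, 1.0)
lap = ((G(p[0]+h, p[1], p[2]) - 2*G(*p) + G(p[0]-h, p[1], p[2]))
     + (G(p[0], p[1]+h, p[2]) - 2*G(*p) + G(p[0], p[1]-h, p[2]))
     + (G(p[0], p[1], p[2]+h) - 2*G(*p) + G(p[0], p[1], p[2]-h))) / h**2

# (b) Flux of grad G out of a sphere about the origin: dG/dr = -C/r^2, and the
# factor r^2 in the area element cancels it, so the radius drops out entirely:
n = 400
flux = sum(-C * math.sin((i + 0.5) * math.pi / n) * (math.pi / n) * 2 * math.pi
           for i in range(n))
print(lap, flux)                   # flux = -4*pi*C = -1
```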
[Figure: method of images for a half plane. A positive charge at z in the physical region and a negative "fictitious" image charge at its mirror point in the fictitious region; G = 0 on the boundary between them.]
because (x − z)² + y² is the square of the distance from x to the positive charge and
(x + z)² + y² is the square of the distance to the fictitious negative charge. Notice
that G(x, z) ≠ G(x − z) in this problem, unlike the Coulomb potential and all
the other simple Green functions we have seen for translation-invariant problems.
(This problem is not invariant under translations, because the boundary is fixed at
x = 0.)
Similarly, the Green function for the heat equation on a half-line with u(t, 0) = 0 is
$$H(t, x, y) - H(t, x, -y) = \frac{1}{\sqrt{4\pi t}} \left[ e^{-(x-y)^2/4t} - e^{-(x+y)^2/4t} \right]. \tag{9}$$
This can be shown equal to the Fourier solution
$$\frac{2}{\pi} \int_0^{\infty} \sin(kx)\, \sin(ky)\, e^{-k^2 t}\, dk. \tag{10}$$
Function (9) is the Green function for the nonhomogeneous heat equation with
the source at time s = 0 (from which the general case can be obtained by the
substitution t → t − s), but by Duhamel's principle it is also the Green function for
the homogeneous heat equation with initial data given at t = 0, and it is in that
role that we have previously encountered (10).
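The equality of (9) and (10) is easy to confirm numerically. A Python sketch (the sample point (t, x, y) = (0.3, 1, 2) and the quadrature parameters are arbitrary choices; the integrand of (10) is negligible beyond k = 15 here):

```python
import math

def images(t, x, y):               # formula (9)
    return (math.exp(-(x - y)**2 / (4 * t))
            - math.exp(-(x + y)**2 / (4 * t))) / math.sqrt(4 * math.pi * t)

def fourier(t, x, y, kmax=15.0, dk=1e-3):   # formula (10), midpoint rule
    total, k = 0.0, dk / 2
    while k < kmax:
        total += math.sin(k * x) * math.sin(k * y) * math.exp(-k * k * t) * dk
        k += dk
    return (2 / math.pi) * total

a, b = images(0.3, 1.0, 2.0), fourier(0.3, 1.0, 2.0)
print(a, b)
```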
The periodic case
Suppose we are interested in the initial-value problem for the heat equation on
a ring with coordinate x, −π < x ≤ π. We know that the relevant Green function
is
$$K(t, x, y) = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} e^{in(x - y)}\, e^{-n^2 t} \tag{11}$$
from substituting
$$c_n = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{-iny} f(y)\, dy \quad \text{into} \quad u(t, x) = \sum_{n=-\infty}^{\infty} c_n e^{inx} e^{-n^2 t}.$$
But another way to get such a Green function is to start from the one for the whole
line, H, and add copies of it spaced out periodically:
$$K(t, x, y) = \sum_{M=-\infty}^{\infty} \frac{1}{\sqrt{4\pi t}}\, e^{-(x - y - 2\pi M)^2/4t}. \tag{12}$$
Each term of (12) (and hence the whole sum) satisfies the heat equation for t > 0.
As t → 0 the term with M = 0 approaches δ(x − y) as needed, and all the other
terms approach 0 if x and y are in the basic interval (−π, π). Finally, the function
is periodic, K(t, x + 2π, y) = K(t, x, y), as desired.
The functions (11) and (12) are equal, although this is not obvious by inspection.
Neither sum can be evaluated in closed form in terms of elementary functions.
From a numerical point of view they are useful in complementary domains, because
the sum in (12) converges very fast when t is small, whereas the one in (11)
converges best when t is large.
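Both the equality and the complementary convergence are visible in a short numerical experiment. A Python sketch (the sample point and truncation lengths are arbitrary choices):

```python
import math

def K_fourier(t, x, y, N=100):     # formula (11), truncated at |n| = N
    return (1 + 2 * sum(math.cos(n * (x - y)) * math.exp(-n * n * t)
                        for n in range(1, N + 1))) / (2 * math.pi)

def K_images(t, x, y, M=10):       # formula (12), truncated at |M| = M
    return sum(math.exp(-(x - y - 2 * math.pi * m)**2 / (4 * t))
               for m in range(-M, M + 1)) / math.sqrt(4 * math.pi * t)

small_t = abs(K_fourier(0.05, 1.0, 0.3) - K_images(0.05, 1.0, 0.3))
large_t = abs(K_fourier(1.0, 1.0, 0.3) - K_images(1.0, 1.0, 0.3))
# At t = 0.05 the single M = 0 term of (12) already gives the full answer:
one_term = abs(K_images(0.05, 1.0, 0.3, M=0) - K_fourier(0.05, 1.0, 0.3))
print(small_t, large_t, one_term)
```

At small t one image term suffices while (11) needs dozens of terms; at large t the roles reverse.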
The equality of (11) and (12) is an instance of the Poisson summation formula.
This is most easily seen when x = y, so that the equality is
$$\sum_{M=-\infty}^{\infty} \frac{1}{\sqrt{4\pi t}}\, e^{-(2\pi M)^2/4t} = \frac{1}{2\pi} \sum_{n=-\infty}^{\infty} e^{-n^2 t}. \tag{13}$$
Since
$$H(t, z) = \frac{1}{\sqrt{4\pi t}}\, e^{-z^2/4t} = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} e^{ikz}\, \hat H(t, k)\, dk, \quad \text{where } \hat H(t, k) = \frac{1}{\sqrt{2\pi}}\, e^{-k^2 t},$$
(13) is the Poisson relation
$$\sum_{M=-\infty}^{\infty} H(t, 2\pi M) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} \hat H(t, n).$$
Finite intervals
In the Neumann case, reflect the source (at y) through both ends to get image
charges at −y and 2L − y. Continue this process indefinitely in both directions to
get an infinite sequence of images that build up the needed even periodic extension of
the delta functions and hence of the Green function and, ultimately, of the solution
of the PDE problem.
[Figure: the x axis with tick marks at −2L, −L, L, 2L and sources at the image points ±y + 2LM: …, y − 2L, −y, y, 2L − y, y + 2L, ….]
In the Dirichlet case the first two images are negative, and thereafter they
alternate in sign so as to build up the odd periodic extensions. (Compare the
end of the previous section, where the corresponding linear combination of delta
functions was sketched.)
Green's identity has many applications to PDEs, of which we can demonstrate
only one of the simplest. Suppose that G(x, y) is the Green function that solves
the homogeneous Dirichlet problem for the Poisson equation in V:
This formula expresses u in terms of its Dirichlet data on S. It therefore solves the
nonhomogeneous Dirichlet problem for Laplace's equation in V. This is the version
of Duhamel's principle that applies to this situation.
For example, let V be the upper half plane. By the method of images ((8)
above with the coordinates turned around), the Green function is
$$G(\mathbf{x}, \mathbf{y}) = -\frac{1}{4\pi} \ln[(x_1 - y_1)^2 + (x_2 - y_2)^2] + \frac{1}{4\pi} \ln[(x_1 - y_1)^2 + (x_2 + y_2)^2].$$
(Here y = (y_1, y_2), etc., and the image charge is at (y_1, −y_2).) To get g(y, x) ≡
−(∂/∂n(x)) G(x, y) we need to differentiate with respect to x_2 (since the outward
direction is down) and evaluate it at x_2 = 0 (the boundary S). This gives
$$-\frac{1}{2\pi} [(x_1 - y_1)^2 + y_2^2]^{-1} (-y_2) + \frac{1}{2\pi} [(x_1 - y_1)^2 + y_2^2]^{-1} (+y_2) = \frac{y_2}{\pi} [(x_1 - y_1)^2 + y_2^2]^{-1}.$$
In the notation used earlier,
$$g(x, z; y) = \frac{1}{\pi} \frac{y}{(x - z)^2 + y^2},$$
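This Poisson kernel for the half plane can be exercised numerically against a boundary function whose harmonic extension happens to be known. A Python sketch (the boundary data f(z) = 1/(1 + z²) is an illustrative choice; its bounded harmonic extension to y > 0 is (1 + y)/(x² + (1 + y)²)):

```python
import math

def g(x, z, y):                    # the kernel (1/pi) y / ((x - z)^2 + y^2)
    return y / (math.pi * ((x - z)**2 + y**2))

def u(x, y, f, Z=100.0, dz=2e-3):  # u(x, y) = ∫ g(x, z, y) f(z) dz (midpoint)
    total, z = 0.0, -Z + dz / 2
    while z < Z:
        total += g(x, z, y) * f(z) * dz
        z += dz
    return total

f = lambda z: 1.0 / (1 + z * z)
approx = u(0.5, 1.0, f)
exact = (1 + 1.0) / (0.5**2 + (1 + 1.0)**2)
print(approx, exact)
```

The truncation to |z| ≤ 100 costs only about 10⁻⁷ because the integrand decays like z⁻⁴.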
SturmLiouville Problems
So far all of our example PDEs have led to separated equations of the form
X″ + λ²X = 0, with standard Dirichlet or Neumann boundary conditions. Not
every problem is so simple: separation of variables in an inhomogeneous* medium
leads to ODEs with nonconstant coefficients, such as
$$X'' - V(x)X = -\lambda^2 X.$$
Also, if the boundary in a problem is a circle, cylinder, or sphere, the solution of the
problem is simplified by converting to polar, cylindrical, or spherical coordinates,
so that the boundary is a surface of constant radial coordinate. This simplification
of the boundary conditions is bought at the cost of complicating the differential
equation itself: we again have to deal with ODEs with nonconstant coefficients,
such as
$$\frac{d^2 R}{dr^2} + \frac{1}{r} \frac{dR}{dr} - \frac{n^2}{r^2} R = -\lambda^2 R.$$
The good news is that many of the properties of Fourier series carry over to
these more general situations. As before, we can consider the eigenvalue problem
defined by such an equation together with appropriate boundary conditions: Find
all functions that satisfy the ODE (for any value of ) and also satisfy the bound-
ary conditions. And it is still true (under certain conditions) that the set of all
eigenfunctions is complete: Any reasonably well-behaved function can be expanded
as an infinite series where each term is proportional to one of the eigenfunctions.
This is what allows arbitrary data functions in the original PDE to be matched to
a sum of separated solutions! Also, the eigenfunctions are orthogonal to each other;
this leads to a simple formula for the coefficients in the eigenfunction expansion,
and also to a Parseval formula relating the norm of the function to the sum of the
squares of the coefficients.
* That is, the density, etc., vary from point to point. This is not the same as nonho-
mogeneous in the sense of the general theory of linear differential equations.
Orthonormal bases
However, in certain cases this may make the formula for φ_n more complicated, so
that the redefinition is hardly worth the effort. A prime example is the eigenfunctions
in the Fourier sine series:
$$\phi_n(x) \equiv \sin nx, \qquad \int_0^\pi |\phi_n(x)|^2\, dx = \frac{\pi}{2};$$
therefore,
$$\sqrt{\frac{2}{\pi}}\, \sin nx$$
are the elements of the orthonormal basis. (This is the kind of normalization often
used for the Fourier sine transform, as we have seen.) A good case can be made,
however, that normalizing the eigenfunctions is more of a nuisance than a help in
this case; most people prefer to put the entire 2/π in one place rather than put half
of it in the Fourier series and half in the coefficient formula.
Now let f(x) be an arbitrary (nice) function on [a, b]. If f has an expansion
as a linear combination of the φ's,
$$f(x) = \sum_{n=1}^{\infty} c_n \phi_n(x),$$
then
$$\int_a^b \phi_m(x)^*\, f(x)\, dx = \sum_{n=1}^{\infty} c_n \int_a^b \phi_m(x)^*\, \phi_n(x)\, dx = c_m \int_a^b |\phi_m(x)|^2\, dx,$$
so
$$c_m = \int_a^b \phi_m(x)^*\, f(x)\, dx. \tag{∗}$$
(In the rest of this discussion, I shall assume that the orthogonal set is orthonor-
mal. This greatly simplifies the formulas of the general theory, even while possibly
complicating the expressions for the eigenfunctions in any particular case.)
Then (1)
$$\sum_{n=1}^{\infty} |c_n|^2 \le \int_a^b |f(x)|^2\, dx$$
(called Bessel's inequality), and (2) the best approximation to f(x) of the form
$\sum_n c_n \phi_n(x)$ is the one where the coefficients are computed by formula (∗). These
last two statements remain true when {φ_n} is a finite set, in which case, obviously,
the probability that a given f will not be exactly a linear combination of the φ's
is greatly increased. (The precise meaning of (2) is that the choice (∗) of the c_n
minimizes the integral
$$\int_a^b \Big| f(x) - \sum_{n=1}^{\infty} c_n \phi_n(x) \Big|^2\, dx.$$
That is, we are talking about least squares approximation. It is understood in this
discussion that f itself is square-integrable on [a, b].)
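Bessel's inequality is easy to watch numerically: compute a finite batch of coefficients by (∗) and compare the partial sum of |c_n|² with ∫|f|². A Python sketch on the orthonormal sine basis with the illustrative choice f(x) = x on [0, π]:

```python
import math

def phi(n, x):                     # orthonormal sine basis on [0, pi]
    return math.sqrt(2 / math.pi) * math.sin(n * x)

def coeff(n, f, N=5000):           # c_n = ∫_0^pi phi_n(x) f(x) dx (midpoint)
    h = math.pi / N
    return sum(phi(n, h * (i + 0.5)) * f(h * (i + 0.5)) for i in range(N)) * h

f = lambda x: x
S = sum(coeff(n, f)**2 for n in range(1, 51))   # 50 terms of sum |c_n|^2
norm2 = math.pi**3 / 3                          # ∫_0^pi x^2 dx
print(S, norm2)                    # S < norm2, with equality in the limit
```

Here |c_n|² = 2π/n², so the deficit after 50 terms is 2π Σ_{n>50} n⁻² ≈ 0.124; it shrinks to zero as more terms are kept (Parseval).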
Now suppose that every square-integrable f is the limit of a series $\sum_{n=1}^{\infty} c_n \phi_n$.
(This series is supposed to converge "in the mean"; that is, the least-squares
integral
$$\int_a^b \Big| f(x) - \sum_{n=1}^{M} c_n \phi_n(x) \Big|^2\, dx$$
for a partial sum approaches 0 as M → ∞.) Then {φ_n} is called a complete set
or an orthonormal basis. This is the analogue of the mean convergence theorem
for Fourier series. Under certain conditions there may also be pointwise or uni-
form convergence theorems, but these depend more on the special properties of the
particular functions being considered.
So far this is just a definition, not a theorem. To guarantee that our orthonor-
mal functions form a basis, we have to know where they came from. The miracle of
the subject is that the eigenfunctions that arise from variable-separation problems
do form orthonormal bases:
SturmLiouville theory
Theorem: Suppose that the ODE that arises from some separation of variables
is
$$L[X] = -\lambda^2 r(x) X \quad \text{on } (0, L), \tag{∗}$$
where L is an abbreviation for a second-order linear differential operator,
$$L[X] \equiv a(x)X'' + b(x)X' + c(x)X;$$
a, b, c, and r are continuous on [0, L], and a(x) > 0 and r(x) > 0 on [0, L]. Suppose
further that
$$\int_0^L L[u](x)^*\, v(x)\, dx = \int_0^L u(x)^*\, L[v](x)\, dx \tag{∗∗}$$
for all functions u and v satisfying the boundary conditions of the problem. (In
terms of the inner product in L², this condition is just ⟨Lu, v⟩ = ⟨u, Lv⟩.) Then:
(Here it is convenient to introduce the weight r(x) into the inner product,
and also a new operator, A[X] ≡ L[X]/r, so that the differential equation (∗)
is the eigenvalue equation A[X] = −λ²X. The two factors of r cancel in (∗∗).)
(3) The eigenfunctions are complete. (This implies that the corresponding PDE
can be solved for arbitrary boundary data, in precise analogy to Fourier series
problems!)
The proof that a given L satisfies (∗∗) (or doesn't satisfy it, as the case may be)
involves just integrating by parts twice. (Setting v equal to u in the intermediate
step of this calculation gives, as a bonus, a proof of part (7) in the continuation of
this theorem below. You are invited to fill in the details.) It turns out that (∗∗) will
be satisfied if L has the form
$$\frac{d}{dx} \left( p(x) \frac{d}{dx} \right) - q(x)$$
(with p and q real-valued and well-behaved) and the boundary conditions are of the
type
$$\alpha X'(0) - \beta X(0) = 0, \qquad \gamma X'(L) + \delta X(L) = 0$$
with α, β, etc., real.* Such an eigenvalue problem is called a regular Sturm–Liouville
problem.
The proof of the conclusions (1) and (2) of the theorem is quite simple and
is a generalization of the proof of the corresponding theorem for eigenvalues and
eigenvectors of a symmetric matrix (which is proved in many physics courses and
linear algebra courses). Part (3) is harder to prove, like the convergence theorems
for Fourier series (which are a special case of it).
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} \quad (0 < x < L,\ 0 < t < \infty),$$
$$u(0, t) = 0, \qquad \frac{\partial u}{\partial x}(L, t) + \beta u(L, t) = 0$$
* The reason for the minus sign in the first equation is to make true property (7)
stated below.
In a realistic problem, the zeros in the BC would be replaced by constants; as
usual, we would take care of that complication by subtracting off a steady-state
solution. Physically, the constant value of ∂u/∂x(L, t) + βu(L, t) is proportional to the
temperature of the air (or other fluid medium) to which the right-hand endpoint
of the bar is exposed; heat is lost through that end by convection, according to
"Newton's law of cooling". Mathematically, such a BC is called a Robin boundary
condition, as opposed to Dirichlet or Neumann.
The separation of variables proceeds just as in the more standard heat problems,
up to the point
$$T(t) = e^{-\lambda^2 t}, \qquad X(x) = \sin \lambda x.$$
To get the sine I used the boundary condition X(0) = 0. The other BC is
$$X'(L) + \beta X(L) = 0,$$
or
$$\lambda \cos \lambda L + \beta \sin \lambda L = 0, \tag{∗∗}$$
or
$$\tan \lambda L = -\frac{\lambda}{\beta}. \tag{∗∗∗}$$
It is easy to find the approximate locations of the eigenvalues, λ², by graphing the
two sides of (∗∗∗) (as functions of λ) and picking out the points of intersection. (In
the drawing we assume β > 0.)
[Figure: the branches of tan λL plotted against λ, together with the line −λ/β; the intersections λ₁, λ₂, λ₃, λ₄, … occur just past the asymptotes at λ = π/2L, 3π/2L, 5π/2L, 7π/2L, ….]
The nth root, λ_n, is somewhere between (n − ½)π/L and nπ/L; as n → ∞, λ_n
becomes arbitrarily close to (n − ½)π/L, the vertical asymptote of the tangent function.
For smaller n one could guess λ_n by eye and then improve the guess by,
for example, Newton's method. (Because of the violent behavior of tan near the
asymptotes, Newton's method does not work well when applied to (∗∗∗); it is more
fruitful to work with (∗∗) instead.)
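Locating the roots from (∗∗) is straightforward with bracketing; the orthogonality claimed below can be verified at the same time. An illustrative Python sketch (the choices L = 1, β = 1 are arbitrary; for them the first two roots of λ cos λ + sin λ = 0 are about 2.0288 and 4.9132):

```python
import math

L, beta = 1.0, 1.0

def F(lam):                        # (**):  lam*cos(lam*L) + beta*sin(lam*L)
    return lam * math.cos(lam * L) + beta * math.sin(lam * L)

def bisect(a, b, tol=1e-12):       # F changes sign on (a, b)
    while b - a > tol:
        m = 0.5 * (a + b)
        if F(a) * F(m) <= 0:
            b = m
        else:
            a = m
    return 0.5 * (a + b)

lam1 = bisect(math.pi / 2 + 1e-9, math.pi)          # ~2.0288
lam2 = bisect(3 * math.pi / 2 + 1e-9, 2 * math.pi)  # ~4.9132
h = L / 20000                      # orthogonality of the two eigenfunctions
ortho = sum(math.sin(lam1 * (i + 0.5) * h) * math.sin(lam2 * (i + 0.5) * h)
            for i in range(20000)) * h
print(lam1, lam2, ortho)
```

The inner product of sin λ₁x and sin λ₂x over [0, L] comes out zero to quadrature accuracy, as the Sturm–Liouville theorem guarantees.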
This problem satisfies the conditions of the Sturm–Liouville theorem, so the eigenfunctions
$$X_n \equiv \sin \lambda_n x$$
are guaranteed to be orthogonal. This can be verified by direct computation (making
use of the fact that λ_n satisfies (∗∗)). Thus
$$\int_0^L f(x) \sin \lambda_m x\, dx = b_m \int_0^L \sin^2 \lambda_m x\, dx,$$
so to get b_m we must evaluate the integral on the right
and divide by it. (This number is not just ½L, as in the Fourier case.) Alternatively,
we could construct orthonormal basis functions by dividing by the square root of
this quantity:
$$\psi_n \equiv \frac{X_n}{\|X_n\|}.$$
Then the coefficient formula is simply
$$B_m = \int_0^L f(x)\, \psi_m(x)\, dx$$
(where $f(x) = \sum_m B_m \psi_m$, so $B_m = \|X_m\|\, b_m$).
The theorem also guarantees that the eigenfunctions are complete, so this solution
is valid for any reasonable f. (Nevertheless, if β < 0 it is easy to overlook one
of the normal modes and end up with an incomplete set by mistake. See Haberman,
Figs. 5.8.2 and 5.8.3.)
More properties of SturmLiouville eigenvalues and eigenfunctions
(4) For each eigenvalue λ² there is at most one linearly independent eigenfunction.
(Note: This is true only for the regular type of boundary conditions.)
(5) λ_n approaches +∞ as n → ∞.
(6) φ_n(x) has exactly n − 1 zeros ("nodes") in the interval (0, L) (endpoints not
counted). (The basic reason for this is that as λ increases, X becomes increasingly
concave and oscillatory.)
(7) If α, β, γ, δ, p(x), and q(x) are all nonnegative, then the λ_n² are all nonnegative.
(Corollary: For the heat equation, the solution u(x, t) approaches 0 as
t → +∞ if all the eigenvalues are positive; it approaches a constant if λ = 0
occurs.)
Note that parts (1) and (7) of the theorem make it possible to exclude the
possibilities of complex and negative eigenvalues without a detailed study of the
solutions of the ODE for those values of λ². In first learning about separation
of variables and Fourier series we did make such a detailed study, for the ODE
X″ = −λ²X, but I remarked that the conclusion could usually be taken for granted.
(Indeed, Appendix A gives the proof of (1) and (7), specialized to X″ = −λ²X.)
1. The set of eigenfunctions needed to expand an arbitrary function may depend
on λ² as a continuous variable, as in the case of the Fourier transform.
Let's return to the general case and assume that the eigenfunctions have been
chosen orthonormal. We have an expansion formula
$$f(x) = \sum_{n=1}^{\infty} c_n \phi_n(x). \tag{∗}$$
Substituting the coefficient formula for c_n and (formally) interchanging sum and
integral leads to
$$\sum_{n=1}^{\infty} \phi_n(x)\, \phi_n(z)^* = \delta(x - z).$$
This is called the completeness relation for the eigenfunctions {φ_n}, since it expresses
the fact that the whole function f can be built up from the pieces c_n φ_n. In
the special case of the Fourier sine series, we looked at this formula earlier.
This equation is equivalent to
$$\int_a^b \phi_m(x)^*\, \phi_n(x)\, dx = \delta_{mn},$$
where
$$\delta_{mn} \equiv \begin{cases} 1 & \text{if } m = n, \\ 0 & \text{if } m \ne n. \end{cases}$$
(This is called the Kronecker delta symbol; it is the discrete analogue of the Dirac
delta function; or, rather, Dirac's delta function is a continuum generalization of
it!) This orthogonality relation summarizes the fact that the φ's form an orthonormal
basis.
Note that the completeness and orthogonality relations are very similar in
structure. Basically, they differ only in that the variables x and n interchange roles
(along with their alter egos, z and m). The different natures of these variables
cause a sum to appear in one case, an integral in the other.
This becomes
$$u(t, x) = \int_a^b dz\, f(z) \sum_{n=1}^{\infty} \phi_n(x)\, \phi_n(z)^*\, e^{-\lambda_n^2 t}.$$
$$G(x, z; t) = \sum_{n=1}^{\infty} \phi_n(x)\, \phi_n(z)^*\, e^{-\lambda_n^2 t}.$$
Similarly,
$$G(x, z; \omega) = \sum_{n=1}^{\infty} \frac{\phi_n(x)\, \phi_n(z)^*}{\omega^2 - \lambda_n^2}$$
is the resolvent kernel, the Green function such that $u(x) = \int_0^L G(x, z; \omega)\, g(z)\, dz$
solves the nonhomogeneous ODE L[u] + ω²u = g (if r = 1) with the given boundary
conditions. (Our first Green function example constructed with the aid of the delta
function, several sections back, was a resolvent kernel.)
It may be easier to solve for the Green functions directly than to sum the series
in these formulas. In fact, such formulas are often used in the reverse direction,
to obtain information about the eigenfunctions and eigenvalues from independently
obtained information about the Green function.
Polar Coordinates and Bessel Functions
Polar coordinates
$$x = r\cos\theta, \qquad y = r\sin\theta.$$
The usual reason for rewriting a PDE problem in polar coordinates (or another
curvilinear coordinate system) is to make the boundary conditions simpler, so that
the method of separation of variables can be applied. For example, the vanishing
of u(x, y) on a circle of radius 4 is easier to apply when expressed as
$$u(4,\theta) = 0.$$
The disc is the most obvious of the types of regions that look like rectangles when
expressed in polar coordinates. Others are the annulus and the sector,
and three others that have no convenient names (although "partially eaten piece of
pie" might do for one of them).
In any such case one will want to rewrite the whole problem in polar coordinates
to exploit the geometry. This is likely to make the PDE itself more complicated,
however. At least once in your life, you should go through the calculation using
the product rule and multivariable chain rule repeatedly, starting from formulas
such as
$$\frac{\partial}{\partial x} = \frac{\partial r}{\partial x}\frac{\partial}{\partial r} + \frac{\partial\theta}{\partial x}\frac{\partial}{\partial\theta},$$
to show that the two-dimensional Laplacian operator
$$\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}$$
is equal to
$$\frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2}.$$
It is worth noting that the r-derivative terms
$$\frac{\partial^2 u}{\partial r^2} + \frac{1}{r}\frac{\partial u}{\partial r}$$
can also be written as a single term,
$$\frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial u}{\partial r}\right).$$
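This identity is easy to confirm symbolically; a quick illustrative check using sympy:

```python
# Symbolic verification (with sympy) that the two forms of the radial terms
# agree:  u'' + u'/r  ==  (1/r) (r u')'.
import sympy as sp

r = sp.symbols('r', positive=True)
u = sp.Function('u')(r)

form1 = sp.diff(u, r, 2) + sp.diff(u, r) / r
form2 = sp.diff(r * sp.diff(u, r), r) / r

assert sp.simplify(form1 - form2) == 0
print("the two forms are identical")
```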
I. The disc. Try a separated solution,
$$u(r,\theta) = R(r)\,\Theta(\theta).$$
We get
$$\frac{1}{r}(rR')'\,\Theta + \frac{1}{r^2}\,R\,\Theta'' = 0$$
(where the primes are unambiguous, because each function depends on only one
variable). Observe that we can separate the $r$ and $\theta$ dependence into different
terms by dividing by $R\Theta/r^2$:
$$\frac{r(rR')'}{R} + \frac{\Theta''}{\Theta} = 0.$$
We can therefore introduce an unknown constant (eigenvalue) and split the equation
into two ordinary DEs:
$$\frac{\Theta''}{\Theta} = K, \qquad \frac{r(rR')'}{R} = -K.$$
The first of these is our old friend whose solutions are the trig functions; we put it
aside to deal with later.
that is, for the half-eaten piece of pie and the annulus (ring). For the more
common situations of the disc, disc exterior, and sector, the SL problem is singular.
The radial equation can be converted to one with constant coefficients by the substitution
$$z \equiv \ln r \quad(\text{hence } r = e^z),$$
so that
$$\frac{d}{dr} = \frac{dz}{dr}\frac{d}{dz} = \frac{1}{r}\frac{d}{dz}.$$
Then
$$rR' = \frac{dR}{dz}, \qquad (rR')' = \frac{1}{r}\frac{d^2R}{dz^2},$$
so the equation becomes
$$\frac{d^2R}{dz^2} + KR = 0.$$
It is our old friend after all!
1. $K = -\nu^2 < 0$: $R = e^{\pm\nu z} = r^{\pm\nu}$.
2. $K = 0$: $R = 1$ and $R = z = \ln r$.
3. $K = \nu^2 > 0$: $R = e^{\pm i\nu z} = r^{\pm i\nu}$.
Boundary conditions in polar coordinates
First, since the coordinate $\theta$ goes all the way around, $u(r,\theta)$ must be periodic
in $\theta$ with period $2\pi$. Therefore, the solutions of the angular equation, $\Theta'' = K\Theta$,
will be the terms of a full Fourier series at the standard scale:
$$u(r,\theta) = \sum_{n=-\infty}^{\infty} c_n\, e^{in\theta}\, R_n(r).$$
Second, at the rim of the disc a well-posed potential problem requires a standard
nonhomogeneous Dirichlet, Neumann, or Robin condition, such as
$$u(r_0,\theta) = f(\theta).$$
This will be applied to the whole series, not each term Rn , and will eventually
determine the coefficients cn .
that is,
$$c_n = \frac{1}{r_0^{|n|}}\,\frac{1}{2\pi}\int_{-\pi}^{\pi} e^{-in\theta}\, f(\theta)\,d\theta.$$
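A sketch (my own illustration, with made-up sample data $f$) of how this coefficient formula reconstructs the boundary values:

```python
# Illustrative check of the disc potential solution (sample data my own):
# compute c_n by quadrature for f(theta) and verify that the resulting
# series reproduces f on the boundary r = r0.
import numpy as np

r0 = 3.0
N = 20                                        # truncation order (arbitrary)
theta = np.linspace(0.0, 2 * np.pi, 4096, endpoint=False)
dtheta = theta[1] - theta[0]
f = np.cos(2 * theta) + 0.5 * np.sin(theta)   # sample boundary data

def u(r, th):
    total = np.zeros_like(th, dtype=complex)
    for n in range(-N, N + 1):
        # c_n = r0^{-|n|} (1/2pi) int e^{-in theta} f(theta) dtheta
        cn = np.sum(np.exp(-1j * n * theta) * f) * dtheta / (2 * np.pi * r0**abs(n))
        total += cn * r**abs(n) * np.exp(1j * n * th)
    return total.real

assert np.max(np.abs(u(r0, theta) - f)) < 1e-8   # boundary data recovered
assert abs(u(0.0, np.array([0.0]))[0]) < 1e-8    # center value = mean of f = 0 here
print("boundary data reproduced by the series")
```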
II. A partially eaten piece of pie
Consider the truncated sector, or polar rectangle, bounded by the four curves
$$r = r_1, \qquad r = r_2, \qquad \theta = \theta_1, \qquad \theta = \theta_2,$$
where $0 < r_1$ and $r_2 < \infty$. In this case, all four boundaries are of the regular type.
Let's suppose that nonhomogeneous data are given on all four sides, something
like
$$u(r_1,\theta) = f_1(\theta), \qquad u(r_2,\theta) = f_2(\theta),$$
$$\frac{\partial u}{\partial\theta}(r,\theta_1) = f_3(r), \qquad \frac{\partial u}{\partial\theta}(r,\theta_2) = f_4(r).$$
As in the Cartesian rectangle case, before separating variables we must split this
into two problems, one with homogeneous r boundary conditions and one with
homogeneous $\theta$ boundary conditions. Let us say $u = v + w$, where v and w individually
solve the potential equation, v satisfies
$$v(r_1,\theta) = 0, \quad v(r_2,\theta) = 0, \quad \frac{\partial v}{\partial\theta}(r,\theta_1) = f_3(r), \quad \frac{\partial v}{\partial\theta}(r,\theta_2) = f_4(r),$$
and w satisfies
$$w(r_1,\theta) = f_1(\theta), \quad w(r_2,\theta) = f_2(\theta), \quad \frac{\partial w}{\partial\theta}(r,\theta_1) = 0, \quad \frac{\partial w}{\partial\theta}(r,\theta_2) = 0.$$
In the separated solutions making up v, the radial factor satisfies the homogeneous conditions
$$R(r_1) = 0 = R(r_2),$$
which (1) determine a discrete list of allowable values of $\nu$, and (2) determine the ratio of
A to B (or of $C_{\nu,+}$ to $C_{\nu,-}$). This leaves an overall constant factor in R undetermined,
as is always the case in finding normal modes. I postpone the details of this
calculation for a moment; the principle is the same as in the very first separation-of-variables
problem we did, where the eigenvalues turned out to be $(n\pi/L)^2$ and
the eigenfunctions $\sin(n\pi x/L)$ times an arbitrary constant.
For v the radial eigenvalue problem forces $K = +\nu^2 > 0$.
Thus the angular dependence of this solution is exponential, not trigonometric. We
can write
$$v(r,\theta) = \sum_{\nu} R_\nu(r)\left[C_\nu e^{\nu\theta} + D_\nu e^{-\nu\theta}\right].$$
That's v; now we need to find w. That problem is like this one, except that
the roles of r and $\theta$ are interchanged. The result will be a Fourier cosine series in
$\theta$ with radial factors that depend exponentially on ln r; that is, linear combinations
of $r^n$ and $r^{-n}$. I hope that by now I can leave the details to your imagination.
Thus the Sturm–Liouville expansion involved in this problem is simply an ordinary
Fourier sine series, though expressed in very awkward notation because of the
context in which it arose. (In our usual notation, we have $L \equiv \ln(r_2/r_1)$, $\nu = n\pi/L$,
$x = z - \ln r_1$, $C_{\nu,+} = b_n/2i$.) We would have encountered the same complications in
Cartesian coordinates if we had considered examples where none of the boundaries
lay on the coordinate axes (but the boundaries were parallel to the axes).
Next consider a sector,
$$0 \le r < r_2, \qquad \theta_1 < \theta < \theta_2,$$
with nonhomogeneous data on the straight sides. (This is the limiting case of
the v problem above as $r_1 \to 0$.) The endpoint r = 0 is singular, so we are not
guaranteed that a standard Sturm–Liouville expansion will apply. Indeed, in terms
of the variable z = ln r, where the radial equation becomes trivial, the endpoint
is at $z = -\infty$. This problem is therefore a precise polar analogue of the infinite
rectangular slot problem, and the solution will be a Fourier sine or cosine transform
in a variable z + C that vanishes when $r = r_2$. (That is, $C = -\ln r_2$.)
Bessel functions
The drum problem: Consider the wave equation (with c = 1) in a disc with
homogeneous Dirichlet boundary conditions:
$$\nabla^2 u = \frac{\partial^2 u}{\partial t^2}, \qquad u(r_0,\theta,t) = 0,$$
$$u(r,\theta,0) = f(r,\theta), \qquad \frac{\partial u}{\partial t}(r,\theta,0) = g(r,\theta).$$
(Note that to solve the nonhomogeneous Dirichlet problem for the wave equation,
we would add this solution to that of the disc potential problem, I, solved in the
previous section; the latter is the steady-state solution for the wave problem.)
Separating variables, $u = T(t)\,\phi(r,\theta)$, we get
$$\frac{T''}{T} = \frac{\nabla^2\phi}{\phi} = -\omega^2.$$
Therefore
$$T = C e^{i\omega t} + D e^{-i\omega t} = A\cos(\omega t) + B\sin(\omega t).$$
As for $\phi$, it will be periodic in $\theta$ and satisfy $\phi(r_0,\theta) = 0$ along with the Helmholtz
equation
$$-\omega^2\phi = \nabla^2\phi = \frac{1}{r}\frac{\partial}{\partial r}\left(r\,\frac{\partial\phi}{\partial r}\right) + \frac{1}{r^2}\frac{\partial^2\phi}{\partial\theta^2}.$$
Separating $\phi = R(r)\Theta(\theta)$ leads to
$$\frac{r(rR')'}{R} + \omega^2 r^2 = -\frac{\Theta''}{\Theta} = n^2.$$
R
(In the last section we did this step for $\omega = 0$, and $-n^2$ was called K.) The
boundary condition becomes $R(r_0) = 0$, and as in the previous disc problem we
need to assume that R is bounded as $r \to 0$, so that $\phi$ will be differentiable at the
origin and be a solution there. The angular equation is the familiar $\Theta'' = -n^2\Theta$,
with solutions
$$\Theta(\theta) = e^{i\nu\theta} \quad\text{with } n = |\nu| \text{ an integer}.$$
Remark: Unlike the last disc problem, here we have homogeneous BC on both
and R. The nonhomogeneity in this problem is the initial data on u.
The radial equation may be written
$$(rR')' - \frac{n^2}{r}R + \omega^2 r R = 0,$$
or in the form
$$R'' + \frac{1}{r}R' + \left(\omega^2 - \frac{n^2}{r^2}\right)R = 0.$$
This is called Bessel's equation if $\omega^2 \neq 0$. (We already studied the case $\omega = 0$ at
length. Recall that the solutions were powers of r, except that ln r also appeared
if n = 0.) We can put the Bessel equation into a standard form by letting
$$z \equiv \omega r; \qquad r = \frac{z}{\omega}, \quad \frac{d}{dr} = \omega\,\frac{d}{dz}.$$
After dividing by $\omega^2$ we get
$$\frac{d^2R}{dz^2} + \frac{1}{z}\frac{dR}{dz} + \left(1 - \frac{n^2}{z^2}\right)R = 0.$$
(The point of this variable change is to get an equation involving only one arbitrary
parameter instead of two.)
If we have a solution of this equation, say $R = Z_n(z)$, then $R(r) \equiv Z_n(\omega r)$ is
a solution of the original equation (with $\nu = n$). All solutions $Z_n(z)$ are called
Bessel functions of order n. Although they are not expressible in terms of elementary
functions (except when n is half an odd integer), they have been studied so much
that many properties of them are known and tabulated in handbooks, symbolic
algebra programs, etc.
Remark: For the disk problem, n must be an integer (which we can take
nonnegative), but for sector problems, other values of n can appear.
To find the solutions, substitute a generalized power series (Frobenius ansatz)
$$R = z^\alpha \sum_{m=0}^{\infty} c_m z^m$$
into Bessel's equation, equate the coefficient of each power of z to 0, and try to solve
for $\alpha$ and the $c_m$ (method of Frobenius). It turns out that $\alpha = \pm n$ (so $\alpha$ can be
identified with the $\pm\nu$ of the original equation), and that for n a nonnegative integer
there is a solution of the assumed form only for the positive root. It is called J:
$$J_n(z) \equiv \left(\frac{z}{2}\right)^{\!n} \sum_{m=0}^{\infty} \frac{(-1)^m}{m!\,(n+m)!}\left(\frac{z}{2}\right)^{\!2m}.$$
This series converges for all z.
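The series can be summed directly; here is an illustrative sketch comparing a truncated sum against scipy's `jv` as an independent reference (40 terms is an arbitrary but ample truncation for these arguments):

```python
# The power series for J_n, summed directly and compared against scipy's jv.
import math
from scipy.special import jv

def Jn_series(n, z, terms=40):
    s = 0.0
    for m in range(terms):
        s += (-1)**m / (math.factorial(m) * math.factorial(n + m)) * (z / 2)**(2 * m)
    return (z / 2)**n * s

for n in (0, 1, 5):
    for z in (0.5, 2.0, 7.0):
        assert abs(Jn_series(n, z) - jv(n, z)) < 1e-10
print("series matches scipy.special.jv")
```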
Behavior at small argument ($z \to 0$):
$$J_n(z) \approx \frac{1}{n!}\left(\frac{z}{2}\right)^{\!n}, \qquad Y_0(z) \approx \frac{2}{\pi}\ln z, \qquad Y_n(z) \approx -\frac{(n-1)!}{\pi}\left(\frac{2}{z}\right)^{\!n} \;\text{ if } n > 0.$$
(Here $Y_n$ denotes the standard second, linearly independent solution of Bessel's equation.)
Therefore, for a problem inside a disc only J functions will appear, by the
boundedness criterion previously mentioned.
The crossover point between the $z^n$ behavior and the trigonometric behavior
is somewhere close to $z = n$. The Bessel functions (for real n and z) are oscillatory
at infinity. (Note that their envelope decreases as $1/\sqrt{z}$, but this is not enough
to make them square-integrable.)
Recursion relations:
$$zJ_n' + nJ_n = zJ_{n-1},$$
$$zJ_n' - nJ_n = -zJ_{n+1}.$$
(So the derivative of a Bessel function is not really a new function. Note that
the second (and hence any higher) derivative can be calculated using the Bessel
equation itself.)
The recursion relations are useful in many ways. For instance, computer pro-
grams need to calculate Jn by brute force only for a few values of n and then use
the recursion relations to interpolate.
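A quick numerical spot-check of the two recursion relations (illustrative, using scipy's `jv` and its derivative routine `jvp`):

```python
# Spot-check of the recursion relations with scipy (jvp is the derivative):
#   z J_n' + n J_n = z J_{n-1},    z J_n' - n J_n = -z J_{n+1}.
import numpy as np
from scipy.special import jv, jvp

z = np.linspace(0.1, 20.0, 200)
for n in (1, 2, 6):
    assert np.allclose(z * jvp(n, z) + n * jv(n, z), z * jv(n - 1, z))
    assert np.allclose(z * jvp(n, z) - n * jv(n, z), -z * jv(n + 1, z))
print("recursion relations verified")
```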
In the application just discussed, we had $\omega^2 > 0$ and $\nu^2 > 0$. But Bessel's
equation,
$$R'' + \frac{1}{r}R' + \left(\omega^2 - \frac{\nu^2}{r^2}\right)R = 0,$$
also makes sense, and has applications, when one or both of these parameters is
negative or complex, so that $\omega$ or $\nu$ is complex. Complex $\omega$ corresponds to complex
z, since $z = \omega r$. In particular, imaginary $\omega$ (negative real $\omega^2$) corresponds to evaluation
of the Bessel functions on the imaginary axis: $Z_\nu(i|\omega|r)$. This is analogous
to the passage from $e^{nx}$ to $e^{inx}$, which yields the trigonometric functions (except
that here we are moving in the reverse direction, as we shall now see).
These Bessel functions of imaginary argument (but real ) are called modified
Bessel functions. A standard basis consists of two functions called I (z) and K (z),
chosen to behave somewhat like sinh z and ez , respectively.
Definitions:
$$I_\nu(z) \equiv i^{-\nu} J_\nu(iz), \qquad K_\nu(z) \equiv \frac{\pi}{2}\, i^{\nu+1}\, H^{(1)}_\nu(iz).$$
(Here $H^{(1)}_\nu \equiv J_\nu + iY_\nu$ is a Hankel function.)
Behavior at small argument ($z \to 0$):
$$I_\nu(z) \approx \frac{1}{\nu!}\left(\frac{z}{2}\right)^{\!\nu}, \qquad K_\nu(z) \approx \frac{1}{2}\,(\nu-1)!\left(\frac{2}{z}\right)^{\!\nu} \;(\nu > 0), \qquad K_0(z) \approx -\ln z.$$
Behavior at large argument ($z \to \infty$):
$$I_\nu(z) \approx \frac{e^z}{\sqrt{2\pi z}}, \qquad K_\nu(z) \approx \sqrt{\frac{\pi}{2z}}\; e^{-z}.$$
An application: introduce hyperbolic coordinates in the tx plane by
$$t = r\sinh\theta, \quad x = r\cosh\theta, \qquad\text{or}\qquad x = r\sinh\theta, \quad t = r\cosh\theta.$$
(The first of these transformations of variables can be related to the twin paradox
in special relativity. The two apply to different regions of the tx plane.) If you
apply such a transformation to the Klein–Gordon equation,
$$\frac{\partial^2 u}{\partial t^2} - \frac{\partial^2 u}{\partial x^2} + m^2 u = 0,$$
you will get for the r dependence a Bessel equation with imaginary $\nu$ and real
or imaginary $\omega$ (depending on which of the two hyperbolic transformations you're
using). Therefore, the solutions will be either $J_{i\nu}$ or $K_{i\nu}$ functions.
Many ordinary differential equations are Bessels equation in disguise. That is,
they become Bessels equation after a change of dependent or independent variable,
or both. One example is the deceptively simple-looking equation
$$\frac{d^2u}{dx^2} + xu = 0,$$
whose solutions are called Airy functions. If you let
$$y \equiv \tfrac{2}{3}\,x^{3/2}, \qquad u \equiv x^{1/2} Z,$$
then you get
$$\frac{d^2Z}{dy^2} + \frac{1}{y}\frac{dZ}{dy} + \left(1 - \frac{1}{9y^2}\right)Z = 0,$$
the Bessel equation of order $\nu = \frac{1}{3}$. Therefore, the Airy functions are essentially
Bessel functions:
$$u = \sqrt{x}\; Z_{1/3}\!\left(\tfrac{2}{3}\,x^{3/2}\right).$$
Therefore, the eigenvalues of our problem (or, rather, their square roots) are
$$\omega_{nk} \equiv \frac{z_{nk}}{r_0}, \qquad (1)$$
where $z_{nk}$ is the kth positive zero of $J_n$.
The presence of $\omega_{nk}$ scaling the radial coordinate compresses the nth Bessel
function so that k of the lobes of its graph fit inside the disk of radius $r_0$. Putting
the radial and angular parts together, we have the eigenfunctions
$$\phi_{\nu k}(r,\theta) = R(r)\,\Theta(\theta) = J_n(\omega_{nk} r)\, e^{i\nu\theta} \qquad (n = |\nu|). \qquad (2)$$
We could equally well use the real eigenfunctions in which $e^{i\nu\theta}$ is replaced by $\sin n\theta$
or $\cos n\theta$; those functions are easier to visualize. In the drawing the lines and curves
indicate places where such a $\phi$ equals 0, and the signs indicate how the solution
$\mathrm{Re}\,\phi$ or $\mathrm{Im}\,\phi$ bulges above or below the plane u = 0. Such patterns may be seen in
the surface of a cupful of coffee or other liquid when the container is tapped lightly.
(Compare the rectangle eigenfunctions in an earlier section.)
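In practice the numbers $z_{nk}$, and hence the eigenfrequencies $\omega_{nk} = z_{nk}/r_0$, are obtained from tables or software; for instance (with the illustrative choice $r_0 = 1$):

```python
# The drum eigenfrequencies omega_{nk} = z_{nk} / r0 from scipy's Bessel
# zeros (r0 = 1 is an arbitrary illustrative choice).
import numpy as np
from scipy.special import jv, jn_zeros

r0 = 1.0
for n in range(3):
    zeros = jn_zeros(n, 3)                  # z_{n1}, z_{n2}, z_{n3}
    assert np.allclose(jv(n, zeros), 0.0, atol=1e-10)
    print(n, np.round(zeros / r0, 4))       # omega_{n1}, omega_{n2}, omega_{n3}
```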
[Figure: nodal patterns of the real drum eigenfunctions, labeled by the pair nk: 01, 02, 03 (top row); 11, 12, 13 (middle row); 21, 22, 23 (bottom row). The curves are nodal lines where the eigenfunction vanishes, and the plus and minus signs show where it bulges above or below the plane.]
Setting t = 0, the initial data must satisfy
$$g(r,\theta) = u(0,r,\theta) = \sum_{\nu=-\infty}^{\infty}\sum_{k=1}^{\infty} c_{\nu k}\,\phi_{\nu k}(r,\theta) = \sum_{\nu=-\infty}^{\infty}\sum_{k=1}^{\infty} c_{\nu k}\, J_n(\omega_{nk} r)\, e^{i\nu\theta}. \qquad (3)$$
it turns out that the conclusions of the theorem are still valid in this case: The eigen-
functions are complete (for each fixed n), and they are orthogonal with respect to
the weight function r:
$$\int_0^{r_0} J_n(\omega_{ni}\, r)\, J_n(\omega_{nj}\, r)\; r\, dr = 0 \quad\text{if } i \neq j.$$
Furthermore, the experts on Bessel functions assure us that the integral in the
denominator can be evaluated:
$$\int_0^1 J_n(z_{nk}\,\xi)^2\; \xi\, d\xi = \frac{1}{2}\, J_{n+1}(z_{nk})^2.$$
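A numerical check of this normalization integral (illustrative; the quadrature grid is an arbitrary choice):

```python
# Numerical check of the normalization integral
#   int_0^1 J_n(z_{nk} s)^2 s ds = (1/2) J_{n+1}(z_{nk})^2
# by a simple Riemann sum.
import numpy as np
from scipy.special import jv, jn_zeros

s = np.linspace(0.0, 1.0, 200001)
ds = s[1] - s[0]
for n in (0, 1, 4):
    for k in (1, 2, 3):
        znk = jn_zeros(n, k)[-1]
        lhs = np.sum(jv(n, znk * s)**2 * s) * ds
        rhs = 0.5 * jv(n + 1, znk)**2
        assert abs(lhs - rhs) < 1e-5
print("normalization integral verified")
```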
That is,
$$c_{\nu k} = \frac{1}{\pi r_0^2\, J_{n+1}(z_{nk})^2} \int_{r=0}^{r_0}\!\int_{\theta=0}^{2\pi} \phi_{\nu k}(r,\theta)^*\, g(r,\theta)\; r\,dr\,d\theta
= \frac{1}{\|\phi_{\nu k}\|^2} \int_{r=0}^{r_0}\!\int_{\theta=0}^{2\pi} \phi_{\nu k}(r,\theta)^*\, g(r,\theta)\; r\,dr\,d\theta. \qquad (4)$$
(In the last version I have identified the constant factor as the normalization
constant for the two-dimensional eigenfunction.)
stant for the two-dimensional eigenfunction.) We now see that the mysterious
weight factor r has a natural geometrical interpretation: It makes the r and
integrations go together to make up the standard integration over the disc in polar
coordinates!
But I thought we were solving the wave equation, to model the vibrations of a
drum? Yes, your absent-minded professor shifted to the heat equation in midstream,
then decided to stay there to keep the formulas simpler. What changes are needed
in the foregoing to finish the drum problem? The eigenvalues (1) and eigenfunctions
(2) are the same. However, for each eigenfunction there are now two possible terms
in the solution; the eigenfunction expansion (3) needs to be replaced by
$$u(t,r,\theta) = \sum_{\nu=-\infty}^{\infty}\sum_{k=1}^{\infty}\left[c_{\nu k}\,\phi_{\nu k}(r,\theta)\cos(\omega_{nk}t) + d_{\nu k}\,\phi_{\nu k}(r,\theta)\sin(\omega_{nk}t)\right].$$
[There is an important pitfall to avoid here, which is not confined to polar coordinates.
(It also arises, for instance, in the wave equation for vibrations in a ring,
using Fourier series.) Suppose that you chose to use the real eigenfunctions. Then
it would be a mistake to write in the summand something like
$$J_n(\omega_{nk}r)\,\bigl(A_{nk}\cos n\theta + B_{nk}\sin n\theta\bigr)\bigl(C_{nk}\cos\omega_{nk}t + D_{nk}\sin\omega_{nk}t\bigr).$$
This would result in equations for the unknown coefficients that are nonlinear, hence
hard to solve; also, the solution will not be unique, and may not even exist for some
initial data. Remember to write the general solution as a linear combination of all
possible (independent) elementary separated solutions:
$$J_n(\omega_{nk}r)\,\bigl(A_{nk}\cos n\theta\cos\omega_{nk}t + B_{nk}\sin n\theta\cos\omega_{nk}t + C_{nk}\cos n\theta\sin\omega_{nk}t + D_{nk}\sin n\theta\sin\omega_{nk}t\bigr).]$$
To finish the problem, we need to set u and its time derivative equal to the
given initial data and solve for the c and d coefficients. The same orthogonality
properties used in the treatment of the heat equation apply here, so (after twice as
much work) you will end up with formulas analogous to (4).
A higher-dimensional example
The Laplacian operator is
$$\nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}
= \frac{\partial^2}{\partial r^2} + \frac{1}{r}\frac{\partial}{\partial r} + \frac{1}{r^2}\frac{\partial^2}{\partial\theta^2} + \frac{\partial^2}{\partial z^2}$$
in cylindrical coordinates $(r,\theta,z)$.
The problem to solve is: $\nabla^2 u = 0$ inside the cylinder $0 \le r < r_0$, $0 < z < L$, with Dirichlet data given
on all three parts of the cylinder's surface:
$$u(r_0,\theta,z) = f(\theta,z), \qquad u(r,\theta,0) = g_1(r,\theta), \qquad u(r,\theta,L) = g_2(r,\theta).$$
[Figure: the cylinder, with data $f$ on the curved side, $g_1$ on the bottom, and $g_2$ on the top.]
Split the problem into two: $u = u_1 + u_2$, where $u_1$ has homogeneous data on the
curved side and $u_2$ has homogeneous data on the top and bottom. In either case,
substituting a separated solution
$$u = R(r)\,\Theta(\theta)\,Z(z)$$
into the PDE yields
$$\frac{d^2Z}{dz^2} - \omega^2 Z = 0, \qquad \frac{d^2\Theta}{d\theta^2} + \nu^2\Theta = 0, \qquad \frac{d^2R}{dr^2} + \frac{1}{r}\frac{dR}{dr} + \left(\omega^2 - \frac{\nu^2}{r^2}\right)R = 0,$$
except that it is not yet clear whether the quantities here named $\omega^2$ and $\nu^2$ are
really positive. (If we find out they aren't, we'll change notation.) Note that the
radial equation is a Bessel equation.
For $u_1$, the homogeneous condition on the curved side forces $\omega = \omega_{nk} \equiv z_{nk}/r_0$, where $z_{nk}$
is the kth zero of $J_n$. The new element is the Z equation, whose solutions are
exponentials. As in some previous problems, the most convenient basis for these
solutions consists of certain hyperbolic functions. Cutting a long story short, we
arrive at the general solution
$$u_1(r,\theta,z) = \sum_{n=0}^{\infty}\sum_{k=1}^{\infty} J_n(\omega_{nk}r)\,\bigl[A_{nk}\cos n\theta\,\sinh\omega_{nk}z + B_{nk}\sin n\theta\,\sinh\omega_{nk}z$$
$$\qquad\qquad + C_{nk}\cos n\theta\,\sinh\omega_{nk}(L-z) + D_{nk}\sin n\theta\,\sinh\omega_{nk}(L-z)\bigr].$$
(I chose real eigenfunctions for variety, and to reinforce an earlier warning about
how to write correct linear combinations of normal modes.) Then, for example, we
have
$$g_1(r,\theta) = u_1(r,\theta,0) = \sum_{n=0}^{\infty}\sum_{k=1}^{\infty} J_n(\omega_{nk}r)\,\bigl[C_{nk}\cos(n\theta)\sinh(\omega_{nk}L) + D_{nk}\sin(n\theta)\sinh(\omega_{nk}L)\bigr],$$
and therefore (for $n \ge 1$)
$$C_{nk} = \frac{\displaystyle\int_0^{r_0} r\,dr\int_0^{2\pi}d\theta\; J_n(\omega_{nk}r)\cos(n\theta)\,g_1(r,\theta)}{\displaystyle\pi\,\sinh(\omega_{nk}L)\int_0^{r_0} J_n(\omega_{nk}r)^2\,r\,dr}.$$
The solutions for C0k and Dnk , and the solutions for Ank and Bnk in terms of g2 ,
are similar (and by now routine).
$$C_n(\omega) = \int_0^{\infty} r\,dr \int_0^{2\pi} d\theta\; J_n(\omega r)\cos(n\theta)\, g_1(r,\theta).$$
(This is not supposed to be obvious; proving it is beyond the scope of this course.)
To clarify the crux of these Bessel expansions, lets strip away the angular
complications and summarize them as one-dimensional eigenfunction expansions.
Consider an arbitrary function f (r).
Then
$$f(r) = \sum_{k=1}^{\infty} A_k\, J_n(\omega_{nk}\, r), \qquad\text{where}\quad A_k = \frac{\displaystyle\int_0^{r_0} J_n(\omega_{nk}r)\, f(r)\, r\,dr}{\displaystyle\int_0^{r_0} J_n(\omega_{nk}r)^2\, r\,dr}.$$
This is a generalization of the Fourier sine series, where the ordinary differential
equation involved is a variable-coefficient equation (Bessel's) instead of $X'' = -\omega^2 X$.
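A sketch of such a Fourier–Bessel expansion for a sample function of my own choosing, $f(r) = 1 - r^2$ on $0 \le r \le 1$ (note $f(1) = 0$, compatible with the Dirichlet basis):

```python
# Fourier-Bessel expansion sketch: expand f(r) = 1 - r^2 on [0, 1] in the
# basis J_0(z_{0k} r) and check convergence of the partial sum.
import numpy as np
from scipy.special import jv, jn_zeros

r = np.linspace(0.0, 1.0, 20001)
dr = r[1] - r[0]
f = 1.0 - r**2

approx = np.zeros_like(r)
for zk in jn_zeros(0, 30):               # first 30 basis functions
    basis = jv(0, zk * r)
    Ak = np.sum(basis * f * r) / np.sum(basis**2 * r)   # quadrature ratio
    approx += Ak * basis

assert np.max(np.abs(approx - f)) < 1e-2
print("partial sum of 30 terms approximates f")
```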
Similarly, on the interval $0 < r < \infty$ there is a continuum (Hankel transform) expansion,
$$f(r) = \int_0^{\infty} A(\omega)\, J_n(\omega r)\, \omega\, d\omega, \qquad\text{where}\quad A(\omega) = \int_0^{\infty} f(r)\, J_n(\omega r)\, r\, dr.$$
Finally, return to the part of the cylinder problem with data on the curved side, where it turns out that
$$\omega^2 \equiv -\kappa^2 < 0.$$
The solutions of this equation are modified Bessel functions, which are regular
Bessel functions evaluated at imaginary argument. Letting $\rho \equiv \kappa r$ puts the
equation into standard form:
$$\frac{d^2R}{d\rho^2} + \frac{1}{\rho}\frac{dR}{d\rho} - \left(1 + \frac{n^2}{\rho^2}\right)R = 0.$$
Of the two standard modified Bessel functions, $I$ is the one that is nice at 0 and $K$ is the one that is nice at infinity. In our problem,
zero is the relevant boundary, so $R(r) = I_n(\kappa r)$ and
$$u_2(r,\theta,z) = \sum_{m=1}^{\infty}\sum_{n=0}^{\infty} \sin\frac{m\pi z}{L}\,\bigl[A_{mn}\cos(n\theta) + B_{mn}\sin(n\theta)\bigr]\, I_n\!\left(\frac{m\pi r}{L}\right).$$
Setting $r = r_0$,
$$f(\theta,z) = u_2(r_0,\theta,z) = \sum_{m=1}^{\infty}\sum_{n=0}^{\infty} \sin\frac{m\pi z}{L}\,\bigl[A_{mn}\cos(n\theta) + B_{mn}\sin(n\theta)\bigr]\, I_n\!\left(\frac{m\pi r_0}{L}\right),$$
and the coefficients are found by two steps of ordinary Fourier series inversion. In
this case the Bessel functions are not used as elements of a basis of eigenfunctions
to expand data; rather, they play the same auxiliary role as the $\sinh(\omega L)$ in some
of our Cartesian potential problems and the $e^{-\omega^2 t}\big|_{t=0} = 1$ in heat problems.
Spherical Coordinates and Legendre Functions
Spherical coordinates
Let's adopt the notation for spherical coordinates that is standard in physics:
$$\phi = \text{longitude or azimuth},$$
$$\theta = \text{colatitude} = \tfrac{\pi}{2} - \text{latitude}, \text{ or polar angle}.$$
$$x = r\sin\theta\cos\phi, \qquad y = r\sin\theta\sin\phi, \qquad z = r\cos\theta.$$
[Figure: the point with spherical coordinates $(r,\theta,\phi)$ relative to the Cartesian axes x, y, z.]
The ranges of the variables are: $0 < r < \infty$, $0 < \theta < \pi$, and $\phi$ is a periodic
coordinate with period $2\pi$.
The radial part of the Laplacian can be written either as
$$\frac{1}{r}\frac{\partial^2}{\partial r^2}(ru) \qquad\text{or}\qquad \frac{\partial^2 u}{\partial r^2} + \frac{2}{r}\frac{\partial u}{\partial r}.$$
Substituting the separated solution
$$u_{\rm sep} = R(r)\,\Theta(\theta)\,\Phi(\phi)$$
into Laplace's equation and multiplying by $r^2/u$, we get
$$\frac{r^2}{u}\nabla^2 u = \frac{(r^2R')'}{R} + \frac{1}{\sin\theta}\,\frac{(\sin\theta\,\Theta')'}{\Theta} + \frac{1}{\sin^2\theta}\,\frac{\Phi''}{\Phi}.$$
(Here the primes in the first term indicate derivatives with respect to r, those in the
second term derivatives with respect to $\theta$, etc. There is no ambiguity, since each
function depends on only one variable.) We have arranged things so that the first
term depends only on r, and the others depend on r not at all. Therefore, we can
introduce a separation constant (eigenvalue) into Laplaces equation:
$$\frac{(r^2R')'}{R} = K = -\left[\frac{1}{\sin\theta}\,\frac{(\sin\theta\,\Theta')'}{\Theta} + \frac{1}{\sin^2\theta}\,\frac{\Phi''}{\Phi}\right].$$
Put the r equation aside for later study. The other equation is
$$\frac{\sin\theta\,(\sin\theta\,\Theta')'}{\Theta} + K\sin^2\theta + \frac{\Phi''}{\Phi} = 0.$$
By the usual argument, $\Phi''/\Phi$ must be a constant:
$$-\frac{\Phi''}{\Phi} = m^2 = \frac{\sin\theta\,(\sin\theta\,\Theta')'}{\Theta} + K\sin^2\theta.$$
Just as in two dimensions, problems involving the whole sphere will be different
from those involving just a sector. If the region involves a complete
sphere, then $\Phi(\phi)$ must be $2\pi$-periodic. Therefore, m is an integer, and $\Phi$ is
$A\cos(m\phi) + B\sin(m\phi)$ (or $C_+e^{im\phi} + C_-e^{-im\phi}$). Then we can write the $\Theta$ equation as
$$\frac{1}{\sin\theta}\,(\sin\theta\,\Theta')' + \left[K - \frac{m^2}{\sin^2\theta}\right]\Theta = 0.$$
This is an eigenvalue problem for K. Recall that the proper interval (for the whole
sphere) is $0 < \theta < \pi$. We have a Sturm–Liouville problem, singular at both
endpoints, 0 and $\pi$, with weight function $\sin\theta$.
The substitution $x = \cos\theta$, $Z(x) = \Theta(\theta)$, converts this to an equation on the interval $-1 < x < 1$. The first two terms can be combined into
$$\frac{d}{dx}\left[(1-x^2)\,\frac{dZ}{dx}\right].$$
If m = 0, this equation is called Legendres equation and the solutions are
Legendre functions. Solutions of the equation with $m \neq 0$ are associated Legendre
functions.
$$P_0(x) = 1,$$
$$P_1(x) = x; \qquad \Theta_1(\theta) = \cos\theta,$$
and, in general (Rodrigues's formula),
$$P_l(x) = \frac{1}{2^l\, l!}\,\frac{d^l}{dx^l}\,(x^2-1)^l.$$
It's clear that any linear combination of $P_l$ and the second solution $Q_l$ with a nonzero $Q_l$ component is
singular at the endpoints.
The orthogonality and normalization properties of the Legendre polynomials
are
$$\int_{-1}^{1} P_l(x)\,P_k(x)\,dx = 0 \quad\text{if } l \neq k, \qquad \int_{-1}^{1} P_l(x)^2\,dx = \frac{2}{2l+1}.$$
Note that $\int_{-1}^{1}[\dots x \dots]\,dx$ is the same as $\int_0^{\pi}[\dots\cos\theta\dots]\sin\theta\,d\theta$. The factor $\sin\theta$ is
to be expected on geometrical grounds; it appears naturally in the volume element
in spherical coordinates,
$$dV = dx\,dy\,dz = r^2\sin\theta\,dr\,d\theta\,d\phi,$$
and the surface area element on a sphere,
$$dS = r_0^2\sin\theta\,d\theta\,d\phi.$$
Now we can put all the pieces together to solve a boundary value problem
with no dependence. (If the problem has this axial symmetry and the solution is
unique, then the solution must also have that symmetry. Clearly, this will require
axially symmetric boundary data.) If the region in question is a ball (the interior
of a sphere), then the form of the general axially symmetric solution is
$$u(r,\theta) = \sum_{l=0}^{\infty} b_l\, r^l\, P_l(\cos\theta).$$
If Dirichlet boundary data are given on the sphere, then
$$f(\theta) \equiv u(r_0,\theta) = \sum_{l=0}^{\infty} b_l\, r_0^{\,l}\, P_l(\cos\theta)$$
for all $\theta$ between 0 and $\pi$. Therefore, by the orthogonality and normalization
formulas previously stated,
$$b_l = \frac{2l+1}{2\, r_0^{\,l}} \int_0^{\pi} f(\theta)\, P_l(\cos\theta)\,\sin\theta\,d\theta.$$
If the region is the exterior of a sphere, we would use $r^{-(l+1)}$ instead of $r^l$. For
the shell between two spheres, we would use both, and would need data on both
surfaces to determine the coefficients. As always, Neumann or Robin data instead of
Dirichlet might be appropriate, depending on the physics of the individual problem.
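A worked check with sample data of my own choosing: for $f(\theta) = \cos^2\theta$ the exact expansion is $\cos^2\theta = \frac{1}{3}P_0(\cos\theta) + \frac{2}{3}P_2(\cos\theta)$, so $b_0 = 1/3$ and $b_2 = (2/3)/r_0^2$, and the quadrature formula for $b_l$ reproduces this:

```python
# Check the Legendre coefficient formula b_l against the exact expansion
# of f(theta) = cos^2(theta): b_0 = 1/3, b_2 = (2/3)/r0^2, odd terms zero.
import numpy as np
from scipy.special import eval_legendre

r0 = 2.0
theta = np.linspace(0.0, np.pi, 200001)
dth = theta[1] - theta[0]
f = np.cos(theta)**2

def b(l):
    integrand = f * eval_legendre(l, np.cos(theta)) * np.sin(theta)
    return (2 * l + 1) / (2 * r0**l) * np.sum(integrand) * dth

assert abs(b(0) - 1.0 / 3.0) < 1e-6
assert abs(b(2) - (2.0 / 3.0) / r0**2) < 1e-6
assert abs(b(1)) < 1e-6 and abs(b(3)) < 1e-6    # odd terms vanish
print("quadrature reproduces the exact Legendre coefficients")
```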
Spherical harmonics
When the data depend on $\phi$ as well as $\theta$, the solution is expanded in spherical
harmonics, $Y_l^m(\theta,\phi) \propto P_l^m(\cos\theta)\,e^{im\phi}$, where the functions $P_l^m$, called associated Legendre
functions, are solutions of
$$\bigl[(1-x^2)\,P'\bigr]' + \left[l(l+1) - \frac{m^2}{1-x^2}\right]P = 0.$$
The condition of regularity at the poles forces $|m| \le l$, and this constraint has been
taken into account by writing the sum over m from $-l$ to $l$. There is a generalized
Rodrigues formula,
$$P_l^m(x) = \frac{(-1)^m}{2^l\, l!}\,(1-x^2)^{m/2}\,\frac{d^{\,l+m}}{dx^{\,l+m}}\,(x^2-1)^l.$$
The completeness (basis) property is: An arbitrary* function on the sphere (i.e., a
function of $\theta$ and $\phi$ as they range through their standard intervals) can be expanded
as
$$g(\theta,\phi) = \sum_{l=0}^{\infty}\sum_{m=-l}^{l} A_{lm}\, Y_l^m(\theta,\phi),$$
A table of the first few spherical harmonics:
$$Y_0^0 = \sqrt{\frac{1}{4\pi}}$$
$$Y_1^1 = -\sqrt{\frac{3}{8\pi}}\,\sin\theta\, e^{i\phi}, \qquad Y_1^0 = \sqrt{\frac{3}{4\pi}}\,\cos\theta, \qquad Y_1^{-1} = \sqrt{\frac{3}{8\pi}}\,\sin\theta\, e^{-i\phi}$$
$$Y_2^2 = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\,\sin^2\theta\, e^{2i\phi}, \qquad Y_2^1 = -\sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\, e^{i\phi}, \qquad Y_2^0 = \sqrt{\frac{5}{4\pi}}\left(\frac{3}{2}\cos^2\theta - \frac{1}{2}\right),$$
$$Y_2^{-1} = \sqrt{\frac{15}{8\pi}}\,\sin\theta\cos\theta\, e^{-i\phi}, \qquad Y_2^{-2} = \frac{1}{4}\sqrt{\frac{15}{2\pi}}\,\sin^2\theta\, e^{-2i\phi}.$$
where
$$A_{lm} = \int Y_l^m(\theta,\phi)^*\, g(\theta,\phi)\, d\Omega, \qquad d\Omega \equiv \sin\theta\,d\theta\,d\phi.$$
This, of course, is precisely what we need to solve the potential equation with
arbitrary boundary data on a spherical boundary. But such a way of decomposing
functions on a sphere may be useful even when no PDE is involved, just as the
Fourier series and Fourier transform have many applications outside differential
equations. For example, the shape of the earth (as measured by the gravitational
attraction on satellites) is represented by a sum of spherical harmonics, where the
first (constant) term is by far the largest (since the earth is nearly round). The three
terms with l = 1 can be removed by moving the origin of coordinates to the right
spot; this defines the center of a nonspherical earth. Thus the first interesting
terms are the five with l = 2; their nonzero presence is called the quadrupole
moment of the earth. Similar remarks apply to the analysis of any approximately
spherical object, force field, etc.*
* See, for example, M. T. Zuber et al., "The Shape of 433 Eros from the NEAR-Shoemaker
Laser Rangefinder," Science 289, 2097–2101 (2000), and adjacent articles, for
A sensible person does not try to memorize all the formulas about spherical
harmonics (or any other class of special functions). The point is to understand
that they exist and why they are useful. The details when needed are looked up
in handbooks or obtained from computer software. Complicated formulas should
not obscure the beauty and power of our march from a basis of eigenvectors in R2 ,
through Fourier series in one dimension, to this basis of eigenfunctions on a sphere!
Similarly, the other types of Bessel functions have their spherical counterparts, $y_l$,
$h_l^{(1)}$, etc.
an analysis of a potato-shaped asteroid. There the harmonics with factors eim are
combined into real functions with factors cos m and sin m, so the five coefficients for
l = 2 are named C20 , C21 , S21 , C22 , S22 .
The surprising good news is that these fractional-order Bessel functions (the spherical
Bessel functions $j_l(z) \equiv \sqrt{\pi/2z}\; J_{l+1/2}(z)$, and similarly $y_l$) are not
an entirely new family of functions. They can all be expressed in terms of sines and
cosines. One has
$$j_0(z) = \frac{\sin z}{z}, \qquad y_0(z) = -\frac{\cos z}{z}$$
(note that $j_0$ is regular at 0 and $y_0$ is not, as expected from their definitions),
$$j_1(z) = \frac{\sin z}{z^2} - \frac{\cos z}{z},$$
and, in general,
$$j_l(z) = z^l\left(-\frac{1}{z}\frac{d}{dz}\right)^{\!l}\frac{\sin z}{z}, \qquad y_l(z) = -z^l\left(-\frac{1}{z}\frac{d}{dz}\right)^{\!l}\frac{\cos z}{z}.$$
Notice that for large l they contain many terms, if all the derivatives are worked
out.
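A quick check of the first two closed forms against scipy's spherical Bessel routine:

```python
# The closed forms for j_0 and j_1, checked against scipy's spherical_jn.
import numpy as np
from scipy.special import spherical_jn

z = np.linspace(0.1, 15.0, 500)
assert np.allclose(spherical_jn(0, z), np.sin(z) / z)
assert np.allclose(spherical_jn(1, z), np.sin(z) / z**2 - np.cos(z) / z)
print("closed forms match scipy.special.spherical_jn")
```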
Classification of Second-Order Linear Equations
$$\text{Laplace:}\quad \frac{\partial^2u}{\partial x^2} + \frac{\partial^2u}{\partial y^2} = 0$$
$$\text{wave:}\quad \frac{\partial^2u}{\partial x^2} - \frac{\partial^2u}{\partial t^2} = 0$$
$$\text{heat:}\quad \frac{\partial^2u}{\partial x^2} - \frac{\partial u}{\partial t} = 0.$$
Each of these turned out to have its own characteristic properties, which we want
to review here and put in a more general context. Of particular interest for each
equation are
(1) what sort of data (initial or boundary conditions) are needed to constitute a
well-posed problem (one with exactly one solution);
(2) how smooth the solutions are;
(3) how the influence of the data spreads (causality or finite propagation speed).
The most general second-order linear differential equation in two variables, say
x and y, looks like
$$L[u] \equiv A(x,y)\frac{\partial^2u}{\partial x^2} + B(x,y)\frac{\partial^2u}{\partial x\,\partial y} + C(x,y)\frac{\partial^2u}{\partial y^2} + D(x,y)\frac{\partial u}{\partial x} + E(x,y)\frac{\partial u}{\partial y} + F(x,y)\,u = 0,$$
where A, . . . , F are functions of x and y. Suppose just for a moment that these
coefficients are constants. Then the long expression is reminiscent of the formula for
the most general conic section. Indeed, if we replace each /x by a new variable,
X, and replace each /y by Y , and replace L by 0, then we get exactly the conic
section equation:
0 = AX 2 + BXY + CY 2 + DX + EY + F.
Now recall from analytic geometry that it is always possible to make a rotation
of axes in the XY space after which the cross-term coefficient B is zero. Suppose
that this has been done:
0 = AX 2 + CY 2 + DX + EY + F.
Then recall that (if certain degenerate cases are ignored) the curve described by
this equation is an ellipse if A and C are nonzero and have the same sign, a
hyperbola if they are nonzero with opposite signs, and a parabola if exactly one of
them is zero.
We assign the same terminology to the partial differential equations that result
when X is replaced by /x, etc. Thus Laplaces equation is elliptic, the wave
equation is hyperbolic, and the heat equation is parabolic. (In the latter two cases
y is called t for physical reasons.)
Now suppose that A, etc., do depend on x and y. Then at each point (x, y) it
is possible to find a rotation
$$\frac{\partial}{\partial x'} = \cos\theta\,\frac{\partial}{\partial x} - \sin\theta\,\frac{\partial}{\partial y}, \qquad \frac{\partial}{\partial y'} = \sin\theta\,\frac{\partial}{\partial x} + \cos\theta\,\frac{\partial}{\partial y},$$
which eliminates the B(x, y) term. (The angle may depend on x and y, so B is not
necessarily zero at other points.) The character of the PDE at that point is defined
to be elliptic, hyperbolic, or parabolic depending on the signs of the new coefficients
A and C there. Equivalently, defining the discriminant $\Delta \equiv B^2 - 4AC$, the equation is
elliptic if $\Delta < 0$,
hyperbolic if $\Delta > 0$,
parabolic if $\Delta = 0$.
For most equations of practical interest, the operator will be of the same type at
all points.
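The pointwise test is easy to mechanize; a minimal sketch (my own helper, not part of the notes):

```python
# A tiny classifier for  A u_xx + B u_xy + C u_yy + ... = 0  at a point,
# based on the discriminant B^2 - 4AC.
def classify(A, B, C):
    disc = B * B - 4 * A * C
    if disc < 0:
        return "elliptic"
    if disc > 0:
        return "hyperbolic"
    return "parabolic"

print(classify(1, 0, 1))    # Laplace equation
print(classify(1, 0, -1))   # wave equation
print(classify(1, 0, 0))    # heat equation (no second t-derivative)
```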
Remark: From linear algebra you may recall that what we are doing here is
diagonalizing the matrix (quadratic form)
$$\begin{pmatrix} A & \frac{1}{2}B \\ \frac{1}{2}B & C \end{pmatrix},$$
that the new A and C are the eigenvalues of that matrix, and that the discriminant is $-4$ times
its determinant. This is the secret to extending the classification to equations in
more than two variables, such as
$$\frac{\partial^2u}{\partial t^2} - \frac{\partial^2u}{\partial x^2} - \frac{\partial^2u}{\partial y^2} = 0.$$
This example counts as hyperbolic, since it has one coefficient with sign opposite to
the others. More generally, there is a coefficient matrix which has to be diagonalized,
and the signs of its eigenvalues are what counts: The operator is elliptic if all the
signs are the same, hyperbolic if one is different, and parabolic if one eigenvalue is
zero and the others have the same sign. (There are other possibilities, such as two
positive and two negative eigenvalues, but they seldom arise in applications.)
Now lets discuss the three matters listed at the beginning. The facts Im about
to state are generalizations of things we already know about the heat, wave, and
Laplace equation.
Since real science and engineering deal with only approximately measured data, this
makes the solution in the backward direction almost useless in practice.
For an elliptic equation, one might expect to have a well-posed problem given
the value of u and its normal derivative on an initial surface, since the equation is
second-order in every variable. However, it turns out that a solution may not exist
for all data; it will exist in a neighborhood of the surface, but it will blow up
somewhere else. When solutions exist, they may be unstable. Instead, the proper
and natural boundary condition for an elliptic equation (as we know from physical
applications of Laplaces equation) is to prescribe the function or its derivative
(but not both) at every point on a closed curve or surface surrounding a region.
(Conversely, this sort of boundary condition will not give a well-posed problem for
a hyperbolic or parabolic equation.)
I have been using the term well-posed without formally defining it. It means,
above all, that the problem (consisting, typically, of a differential equation plus
boundary conditions) has been stated so that it has exactly one solution. Stating
too few conditions will make the solution nonunique; too many conditions, and
it will not exist; try to use the wrong kind of conditions (e.g., initial data for an
elliptic equation), and there will be no happy medium! In addition, it is customary
to require stability; that is, that the solution depends continuously on the data.
(2) Elliptic and parabolic equations (with smooth coefficients) have solutions
that are smooth (that is, differentiable arbitrarily many times), regardless of how
rough their data (boundary values) are. But solutions of hyperbolic equations may
be nondifferentiable, discontinuous, or even distributions such as δ(x - ct) for
the wave equation. In other words, singularities in the initial data are propagated
by a hyperbolic equation into the solution region.
(3) Hyperbolic equations spread the initial data out into space at a finite wave
speed. (In applications, this is the speed of sound, the speed of light, etc.) In con-
trast, the initial data of the heat equation can instantly affect the solution arbitrarily
far away.
Like the heat equation, the Schrödinger equation is first-order in time. Therefore,
u(x, 0) (by itself) is
appropriate initial data.
Unlike the heat equation, but like the wave equation, its solutions are not
necessarily smooth. Unlike the wave equation, the singularities in the solutions
can disappear and then reappear at later times; this happens most notoriously
for the Green function of the harmonic oscillator equation
    i ∂u/∂t = -∂²u/∂x² + x²u,
which contains the periodic factor csc(2t).
Unlike the wave equation, but like the heat equation, its solutions are not
limited by a finite propagation speed.
Its nicest property is unitarity: The L2 norm of the solution at any fixed t is
the same as the L2 norm of the initial data. That is,
    ∫_D |u(x, t)|² dx = ∫_D |u(x, 0)|² dx.
(Here it is assumed that the differential operator in the spatial variables (the
Hamiltonian) is self-adjoint.)
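This conservation law is easy to check numerically. Below is a small sketch (my illustration, not part of the notes) for the free equation i ∂u/∂t = -∂²u/∂x² on a periodic interval; each Fourier mode evolves by a unit-modulus phase, so the L² norm cannot change:

```python
import numpy as np

# Spectral evolution of i u_t = -u_xx on a periodic interval (free equation,
# chosen for simplicity): each Fourier mode of u picks up the phase e^{-i k^2 t},
# so by Parseval's theorem the L2 norm is preserved.
N = 256
L = 2 * np.pi
x = np.linspace(0, L, N, endpoint=False)
dx = L / N
k = 2 * np.pi * np.fft.fftfreq(N, d=dx)          # angular wavenumbers

u0 = np.exp(-(x - np.pi) ** 2) * np.exp(2j * x)  # a Gaussian wave packet
t = 0.5
u1 = np.fft.ifft(np.exp(-1j * k**2 * t) * np.fft.fft(u0))

norm0 = np.sum(np.abs(u0) ** 2) * dx
norm_t = np.sum(np.abs(u1) ** 2) * dx
print(norm0, norm_t)   # the two norms agree to machine precision
```

The same check fails dramatically for the heat equation, whose Fourier factors e^{-k²t} shrink the norm.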
so at least one of the second-order partials is positive. Therefore, u(0, 0) cannot be
a local maximum. Similarly, if F (0, 0) and u(0, 0) are both negative, then at least
one of the second-order partials is negative, so u(0, 0) cannot be a local minimum.
Putting these facts together, we can conclude:
[Figure: sketch of a surface u(x, y) over a region; the interior extrema are
labeled "NO" (they cannot occur), and the boundary region is labeled "Maybe,
but how?"]
which in turn is easy to see from the expansion of u in a Fourier series (in the polar
angle) inside the circle (see p. 103). The same thing is true for Laplaces equation
in 3 dimensions, with the circle replaced by a sphere and the Fourier series by the
expansion in spherical harmonics.
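The maximum principle can also be observed numerically. Here is a sketch (my illustration) that solves Laplace's equation on the unit square by Jacobi iteration and confirms that the computed harmonic function attains its maximum on the boundary:

```python
import numpy as np

# Solve u_xx + u_yy = 0 on a square grid by Jacobi iteration with prescribed
# boundary values, then check that no interior value exceeds the boundary maximum.
n = 41
u = np.zeros((n, n))
xs = np.linspace(0, 1, n)
u[0, :] = np.sin(np.pi * xs)      # boundary data on one edge; other edges are 0
for _ in range(5000):             # Jacobi sweeps toward the harmonic function
    u[1:-1, 1:-1] = 0.25 * (u[:-2, 1:-1] + u[2:, 1:-1]
                            + u[1:-1, :-2] + u[1:-1, 2:])

interior_max = u[1:-1, 1:-1].max()
boundary_max = max(u[0, :].max(), u[-1, :].max(), u[:, 0].max(), u[:, -1].max())
print(interior_max <= boundary_max)   # True: no interior maximum
```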
As we've seen, the maximum principle holds only for a rather restricted class of
differential equations: not only must the equation be elliptic or parabolic, but also
there is a sign condition on the terms without derivatives. Nevertheless, various
forms of the maximum principle are important tools in proving theorems about the
properties of solutions. Here are two examples:
Appendix A
    ∂²u/∂x² = ∂u/∂t.    (∗)
It's a partial differential equation (PDE) because partial derivatives of the unknown
function with respect to two (or more) variables appear in it.
(Higher-dimensional versions, such as

    ∇²u = ∂u/∂t,

will be qualitatively similar.) u(t, x) is the temperature in the bar (possibly with
something subtracted off, as we'll see). The equation follows quickly from algebraic
formulations of the physical principles that
(1) the amount of heat energy in any small region of the bar is proportional to the
temperature there,
(2) the rate of heat flow is proportional to the derivative of the temperature, since
it's driven by temperature differences between regions.
can change K to 1. So there is no loss of generality in ignoring K henceforth. This
uses up only one of the three degrees of freedom in the units. The other two can
be used in other ways.
Typically, our bar will have a finite length, say L. We can rescale x to make
L have any convenient value; the most popular choices are 1 (not surprisingly) and
π (for reasons that will become obvious later). After that, we can rescale t so as to
keep K equal to 1. We can also add a constant to x so that the left endpoint of the
bar is at x = 0.
Scaling u will not change the form of the equation, since it is linear (see below).
However, this scaling freedom can be used to simplify a boundary condition or initial
condition.
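The rescaling bookkeeping can be verified on a concrete solution. In the sketch below (my illustration, with arbitrary values K = 3 and L = 2), the substitution U(s, y) = u(L²s/K, Ly) turns a solution of u_t = K u_xx on 0 < x < L into a solution of U_s = U_yy on 0 < y < 1:

```python
import numpy as np

# If u solves u_t = K u_xx on 0 < x < L, the chain rule shows that
# U(s, y) = u(L^2 s / K, L y) solves U_s = U_yy on 0 < y < 1.
# Check this on an explicit separated solution.
K, L = 3.0, 2.0

def u(t, x):
    # solves u_t = K u_xx with u(t, 0) = u(t, L) = 0
    return np.sin(np.pi * x / L) * np.exp(-K * (np.pi / L) ** 2 * t)

def U(s, y):
    return u(L**2 * s / K, L * y)

s, y = 0.37, 0.25
print(U(s, y), np.sin(np.pi * y) * np.exp(-np.pi**2 * s))  # the two values agree
```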
(1) If we know the temperature distribution at one time (say t = 0), we can hope
to predict the temperature at later times, but not necessarily at earlier times.
(If we observe a room full of mosquitos, it is hard to tell by looking which
corner they flew out of.) Thus we will be solving (∗) in the region 0 < x < L, t > 0.
(2) We need to know what happens to the heat when it reaches the end of the
bar. Obviously it will make a big difference to the temperature distribution
whether the end is insulated or in contact with some other material which can
conduct heat away. There are four standard types of boundary conditions that
can be considered. Each type is worthy of consideration for its own sake as
a mathematical possibility, but it happens that each one has a real physical
interpretation in the heat problem:
(A) Dirichlet condition: u(t, 0) = α(t) for some given function α. This says
that the temperature at the end of the bar is controlled (say by contact
with a heat bath).
[Figure: the space-time region of the problem in the (x, t) plane: 0 < x < L,
t > 0. Boundary data (functions of t) are prescribed along the vertical edges
x = 0 and x = L, and initial data f(x) along the bottom edge t = 0.]
(B) Neumann condition: ∂u/∂x (t, 0) = β(t). This says that the heat flow through
the end is controlled. This is hard to do in practice, except in the special
case β = 0, which says that the end is insulated.
In all these cases of conditions at x = 0, one would need another condition (not
necessarily the same kind) at x = L to complete the specification of the problem.
(D) Periodic boundary conditions: These deal with both endpoints at once.
    u(t, 0) = u(t, L),    ∂u/∂x (t, 0) = ∂u/∂x (t, L).
The usual physical interpretation of this is that our bar is actually a
ring, and x is an angle. (Thus L = 2π when x is measured in radians.)
One tends to think of the boundary conditions as part of the definition of the
physical system under study, while the initial conditions label the various possible
solutions of the equations of motion of that given system. In other words, in our
discussions the boundary conditions are usually more "constant," the initial
conditions more "variable." Imposing the initial conditions is usually the last step
in finding a solution, as it usually is for ODEs, too.
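To make this division of labor concrete, here is a minimal explicit finite-difference sketch (my illustration, not a method developed in the notes). The Dirichlet boundary values are wired into the system once; the initial profile f selects the particular solution:

```python
import numpy as np

# Explicit (forward Euler) finite differences for u_t = u_xx on 0 < x < L,
# with Dirichlet data u(t, 0) = T1, u(t, L) = T2 and initial data u(0, x) = f(x).
def heat_dirichlet(f, T1, T2, L=1.0, t_final=0.2, nx=50):
    x = np.linspace(0, L, nx + 1)
    dx = x[1] - x[0]
    dt = 0.4 * dx**2                  # within the stability bound dt <= dx^2 / 2
    u = f(x).astype(float)
    u[0], u[-1] = T1, T2              # boundary conditions: part of the system
    for _ in range(round(t_final / dt)):
        u[1:-1] += dt / dx**2 * (u[:-2] - 2 * u[1:-1] + u[2:])
    return x, u

# With f(x) = sin(pi x) and zero end temperatures, the exact solution is
# sin(pi x) e^{-pi^2 t}, so the numerical profile should match that decay.
x, u = heat_dirichlet(lambda x: np.sin(np.pi * x), T1=0.0, T2=0.0, t_final=0.2)
print(np.max(np.abs(u - np.sin(np.pi * x) * np.exp(-np.pi**2 * 0.2))))  # small
```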
We shall now complete the solution of the one-dimensional heat problem with
fixed end temperatures (in mathematical terms, nonhomogeneous Dirichlet data
that are independent of time). The overall solution strategy is outlined at the end
of the section "Fundamental concepts" in the main text of the notes; we continue
from there.
Return to step (1) and assume that v(t, x) = V(x). Then the equation becomes
0 = V'', and the boundary conditions become V(0) = T1, V(1) = T2. We see that
V = C1 x + C2 and thus
T1 = C2 , T2 = C1 + C2 .
Therefore,
V (x) = (T2 T1 )x + T1 .
V (0) = F1 , V (1) = F2
Separation of variables
Now return to the second half of the problem, the initial-value problem for the
heat equation with homogenized boundary conditions:
    PDE: ∂w/∂t = ∂²w/∂x²,
which satisfy all the homogeneous equations of the problem (namely, the PDE and
BC) but not (usually) the nonhomogeneous equations (the IC, in this case).
Then we will try to satisfy the nonhomogeneous conditions by a superposition
(or infinite linear combination) of these separated solutions: It will look something
like
    w(t, x) = Σ_{n=1}^∞ c_n T_n(t) X_n(x).
(1) since the separated (product) solutions satisfy the homogeneous conditions, the
sum will also;
    T'(t)/T(t) = X''(x)/X(x).
In this equation the left side depends only on t and the right side depends only on x.
The only way the equation can then hold for all t and all x is that both quantities
are constant:
    T'(t)/T(t) = X''(x)/X(x) = -λ.
(I have advance information that the most interesting values of this constant will be
negative, so I call it -λ. However, we are not yet ready to make any commitment
as to whether λ is positive, negative, zero, or even complex. All possibilities must
be considered.)
    X'' + λX = 0,    (1)
    T' + λT = 0.    (2)
Now look at the boundary conditions, which are

    T(t)X(0) = 0,    T(t)X(1) = 0.
These impose restrictions on X, not T . (If we were to satisfy either of them for all
t by setting T(t) = 0, we would make the entire solution w_sep identically equal to 0,
a trivial and uninteresting solution.) Our next task is to find the values of λ that
allow X to vanish at both 0 and 1.
Consider first the case of positive λ, and write λ = ω² with ω > 0:

    X'' + ω²X = 0,    X(0) = 0,    X(1) = 0.
The general solution of the ODE is

    X = c₁ cos ωx + c₂ sin ωx,    (†)

and the first boundary condition forces c₁ = 0. We can choose c₂ = 1 without loss of
generality (since what we are looking for is a linearly independent set of separated
solutions w_sep). So X = sin ωx. Then the second boundary condition is
    sin ω = 0.

Hence ω must be one of the numbers ω_n = nπ, n = 1, 2, . . . .
[Notice that the root ω = 0 is irrelevant, since solutions of the ODE with ω = 0 do
not have the form (†). Negative ω's give nothing new, which is why we restricted
ω to be positive when we introduced it.] Note, incidentally, that if we were working
on the interval 0 < x < π instead of 0 < x < 1, we would get just ω_n = n, without
the π.
Similar arguments show that negative and complex λ's give only trivial solutions.
In the negative case, write λ = -μ²; then

    X = c₁ cosh μx + c₂ sinh μx,
and the result follows (since cosh 0 ≠ 0 and sinh z ≠ 0 unless z = 0). If λ is complex,
it has two complex square roots, which are negatives (not complex conjugates!) of
each other. Thus

    X = c₁ e^{(μ+iν)x} + c₂ e^{-(μ+iν)x},

where (μ + iν)² = -λ and μ ≠ 0 (else we would be back in the case of positive λ).
X(0) = 0 implies that c₂ = -c₁, and then X(1) = 0 implies that

    e^{μ+iν} = e^{-(μ+iν)}.

Since μ ≠ 0, these two complex numbers have different moduli (absolute values), so
this conclusion is a contradiction.
There is a more modern, less grubby way to see that λ has to be positive. Using
the ODE (X'' = -λX) and the BC (which allow us to discard all endpoint terms
which arise in integration by parts), we see that
    λ ∫₀¹ |X(x)|² dx = - ∫₀¹ X* X'' dx
                     = + ∫₀¹ |X'|² dx
                     = - ∫₀¹ (X'')* X dx
                     = + λ* ∫₀¹ |X|² dx.
Comparing the first and last members of this chain of equalities, we see that λ = λ*;
that is, λ must be real. Comparing either of the extreme members with the one
in the middle, we see that λ is positive, since the two integrals are positive.
This argument suggests a general method for handling such questions when the
second-derivative operator is replaced by a more general linear differential operator
L[X]. If the L can be moved by integration by parts from one side of the integral
to the other,
    ∫ₐᵇ X* L[X] dx = ∫ₐᵇ (L[X])* X dx,
then all the allowed eigenvalues must be real. (Here it is understood that X(x)
satisfies the boundary conditions of the problem, though not necessarily the dif-
ferential equation. An operator with this integration-by-parts symmetry is called
self-adjoint.) If, in addition, an intermediate step in the integration by parts is
a manifestly positive (or nonnegative) integral, then the λ's must be positive (or
nonnegative, respectively).
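Here is a numerical companion to this argument (my sketch, not part of the notes): discretizing -d²/dx² with Dirichlet conditions gives a symmetric (hence self-adjoint) matrix, and its eigenvalues come out real and positive, approximating λ_n = (nπ)²:

```python
import numpy as np

# Discretize -X'' on 0 < x < 1 with X(0) = X(1) = 0 as the standard symmetric
# tridiagonal matrix (2, -1, -1)/h^2 on n interior grid points.
n = 200
h = 1.0 / (n + 1)
A = (2.0 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)) / h**2

eigs = np.linalg.eigvalsh(A)     # symmetric matrix => real eigenvalues, ascending
print(eigs[:3])                  # close to pi^2, (2 pi)^2, (3 pi)^2
print(bool((eigs > 0).all()))    # True: all eigenvalues positive
```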
In our problem, then, the eigenvalues are λ_n = ω_n² = (nπ)², the solution of the
heat problem is

    w(t, x) = Σ_{n=1}^∞ b_n sin(ω_n x) e^{-ω_n² t},

and the initial condition w(0, x) = g(x) requires

    g(x) = Σ_{n=1}^∞ b_n sin(ω_n x).    (‡)
More generally, if the spatial interval is 0 < x < L, then we would like (‡) to
be true for the appropriate choice of the ω_n's, namely

    ω_n = nπ/L.
(In the general case, of course, the integral would be from 0 to L.) Now

    sin nx sin mx = ½ cos((n - m)x) - ½ cos((n + m)x),

so, for n ≠ m,

    ∫₀^π sin nx sin mx dx = [ sin((n - m)x) / (2(n - m)) - sin((n + m)x) / (2(n + m)) ]₀^π = 0.
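A quick numerical check of this orthogonality (my illustration), using the trapezoid rule on a fine grid over 0 < x < π:

```python
import numpy as np

# Orthogonality of the sine modes on (0, pi): the integral of sin(nx) sin(mx)
# is 0 for n != m and pi/2 for n = m.
x = np.linspace(0, np.pi, 20001)
dx = x[1] - x[0]

def inner(n, m):
    y = np.sin(n * x) * np.sin(m * x)
    return dx * (y.sum() - 0.5 * (y[0] + y[-1]))   # trapezoid rule by hand

print(inner(3, 5))   # approximately 0      (n != m)
print(inner(4, 4))   # approximately pi/2   (n = m)
```

The n = m value π/2 is exactly the normalization that produces the factor 2/π in the coefficient formula below.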
Conclusion: If (‡) is true,

    g(x) = Σ_{n=1}^∞ b_n sin nx,

then

    b_n = (2/π) ∫₀^π g(x) sin nx dx.
(‡) is called the Fourier sine series of the function g, and the b_n are its Fourier
coefficients.
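Putting the pieces together, the sketch below (my illustration) computes the coefficients b_n = (2/π) ∫₀^π g(x) sin nx dx for a sample g and uses them in the series solution w(t, x) = Σ b_n sin(nx) e^{-n²t}:

```python
import numpy as np

# Fourier sine coefficients and the resulting heat-equation series on (0, pi).
x = np.linspace(0, np.pi, 4001)
dx = x[1] - x[0]
g = x * (np.pi - x)                    # a sample initial temperature profile

def b(n):
    y = g * np.sin(n * x)
    return (2 / np.pi) * dx * (y.sum() - 0.5 * (y[0] + y[-1]))  # trapezoid rule

# For this g the exact coefficients are 8/(pi n^3) for odd n and 0 for even n.
print(b(1), 8 / np.pi)                 # these agree closely
print(b(2))                            # essentially zero

def w(t, N=50):
    # partial sum of the series solution with homogeneous Dirichlet BC
    return sum(b(n) * np.sin(n * x) * np.exp(-n**2 * t) for n in range(1, N + 1))

print(np.max(np.abs(w(0.0) - g)))      # the partial sum reproduces g well
```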