
MATHEMATICAL PHYSICS I

ALBERTO MARTÍNEZ TORRES


Pure mathematics is, in its way, the poetry of logical ideas.
− Albert Einstein.

The miracle of the appropriateness of the language of mathematics for the formulation of the laws of physics is a wonderful gift which we neither understand nor deserve.
− Eugene Wigner.

Mathematical physics is in the first place physics and it could not exist without experimental investigations.
− Peter Debye.
CONTENTS

1 FOURIER ANALYSIS
  1.1 The inner product
  1.2 The Fourier series
  1.3 Convergence of the expansion
  1.4 Fourier series and non-periodic functions
  1.5 Integration and differentiation of the series
  1.6 The Fourier transform
  1.7 Properties of Fourier transforms
  1.8 Fourier transform in more than one dimension
  1.9 Other integral transforms: the Laplace transform
  1.10 Properties of the Laplace transform
  1.11 Appendix: Dirac delta function
  1.12 Appendix: the Lebesgue integral
  1.13 Further reading

2 INTRODUCTION TO PARTIAL DIFFERENTIAL EQUATIONS
  2.1 Initial conditions and boundary conditions
  2.2 Linear and nonlinear equations
  2.3 Solution for linear ordinary differential equations
    2.3.1 Constant coefficients
    2.3.2 Variable coefficients: series solution
      1) Series solutions about an ordinary point
      2) Series solutions about a regular singular point
        i) Distinct roots not differing by an integer
        ii) Distinct roots differing by an integer
        iii) Repeated root of the indicial equation
  2.4 Separation of variables
    2.4.1 Cartesian coordinates
    2.4.2 Superposition of separated solutions
    2.4.3 Polar coordinates
  2.5 Laplace's equation
    2.5.1 Cylindrical coordinates
    2.5.2 Spherical coordinates
  2.6 The diffusion equation
    2.6.1 Heat-conducting rod
    2.6.2 Heat conduction in a rectangular plate
    2.6.3 Heat conduction in a circular plate
  2.7 The Schrödinger equation
    2.7.1 Quantum particle in a box
  2.8 Further reading

3 ORTHOGONAL POLYNOMIALS AND SPECIAL FUNCTIONS
  3.1 Generalized Rodrigues formula
  3.2 Classification
  3.3 Recurrence relations
  3.4 The classical polynomials
    3.4.1 Hermite polynomials
    3.4.2 Laguerre polynomials
    3.4.3 Legendre polynomials
    3.4.4 Other classical orthogonal polynomials
      1) Jacobi polynomials
      2) Gegenbauer polynomials
      3) Chebyshev polynomials of the first kind
      4) Chebyshev polynomials of the second kind
  3.5 Expansion in terms of orthogonal polynomials
  3.6 Generating functions
  3.7 Special functions: Bessel functions
  3.8 Further reading
CHAPTER 1

FOURIER ANALYSIS

The profound study of nature is the most fruitful source of mathematical discoveries.
− Joseph Fourier

Until now, you have probably used a Taylor expansion whenever you had to deal with the problem of representing a certain function f(x) in the neighborhood of some point, say x = x_0. As you already know, the Taylor series expansion of a function f(x) represents the function as an infinite power series in the polynomials (x − x_0), and is given by

$$f(x) = \sum_{k=0}^{\infty} \frac{1}{k!}\left.\frac{d^k f}{dx^k}\right|_{x=x_0}(x-x_0)^k \equiv \sum_{k=0}^{\infty} f_k x^k. \tag{1.1}$$

Thus, to find the Taylor series expansion of f(x) we simply need to take derivatives of f(x) evaluated only at the point of reference, x = x_0. But, of course, we can do this only if the function f(x) is infinitely differentiable at x = x_0. For example, let us consider what we might call a complicated function,

$$f(x) = \ln\!\big(\cos(x^2) + 2\big) + \left(\frac{x}{3}\right)^3, \tag{1.2}$$

which we plot, together with its Taylor expansion around x_0 = 2, in Fig. 1.1.

Figure 1.1: Plot of f(x) in Eq. (1.2) (solid line) and its Taylor expansion around x_0 = 2 including 1, 2, 3 and 4 terms only of the expansion.

As we can see in Fig. 1.1, with a few terms of the Taylor series we can get a good estimate of the function f(x) in Eq. (1.2) in the neighborhood of x_0 = 2. However, we cannot capture the behavior of f(x) in a larger interval outside the region x ∼ 2, say −3 ≤ x ≤ 4. To do this, we need functions with more wiggles. So, what happens if what we want is an expansion of the function f(x) over a finite interval instead of in the neighborhood of some point x_0? And what happens if f(x) is not differentiable at some point x = x_0 belonging to that interval? Can we still get a series expansion of f(x)?
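Before answering, it may help to see Eq. (1.1) in action. Below is a minimal sketch, assuming Python with SymPy is available (the variable names are ours, for illustration only), which builds the partial Taylor sums of Eq. (1.2) around x_0 = 2 and evaluates the error near and far from x_0, reproducing the behavior of Fig. 1.1:

```python
# Partial Taylor sums of Eq. (1.2) around x0 = 2: good near x0, poor far away.
import sympy as sp

x = sp.symbols('x')
f = sp.log(sp.cos(x**2) + 2) + (x / 3)**3   # Eq. (1.2)
x0 = 2

for n_terms in (1, 2, 3, 4):
    # Taylor polynomial: sum_{k < n_terms} f^(k)(x0)/k! (x - x0)^k, Eq. (1.1)
    taylor = sum(sp.diff(f, x, k).subs(x, x0) / sp.factorial(k) * (x - x0)**k
                 for k in range(n_terms))
    near, far = 2.1, -1.0
    print(n_terms,
          float(abs((f - taylor).subs(x, near))),   # small error near x0
          float(abs((f - taylor).subs(x, far))))    # large error away from x0
```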
Here starts the interesting part: note that by using Eq. (1.1), we are actually representing the function f(x) in a given interval as a linear combination of the monomials {x^k}_{k=0}^∞. This can be done because {x^k}_{k=0}^∞ form a basis (yes, they do, as you will see in the next section!) of the space of what are called square-integrable functions, which are those for which

$$\int_a^b dx\, \omega(x)\,|f(x)|^2 \tag{1.3}$$

is defined, and where ω(x) is a weight function (a strictly positive real-valued function). We see then that all this bears a strong formal resemblance to the problem of expressing a vector in n-dimensional space as a linear combination of n linearly independent vectors. If we consider a vector |f⟩, we represent |f⟩ in the form

$$|f\rangle = \sum_{k=0}^{n} f_k |e_k\rangle, \tag{1.4}$$

where {|e_k⟩}_{k=0}^n is an orthonormal basis and the f_k are coefficients. When considering Eq. (1.1), we are doing basically the same: if f is a scalar function of x,

$$f(x) = \sum_{k=1}^{\infty} f_k e_k(x), \tag{1.5}$$

where {e_k(x)} would be the basis. If f(x) is infinitely differentiable in the considered interval, one solution for Eq. (1.5) is e_k(x) = x^{k−1}, i.e., {e_1(x), e_2(x), e_3(x), ...} = {1, x, x², ...}. If the function f(x) has a finite discontinuity at a finite number of points x_{d_i}, i = 1, 2, ..., N, in the interval considered, where at each x_{d_i} the left-hand and right-hand limits of f, i.e.,

$$\lim_{\epsilon\to 0} f(x_{d_i} - \epsilon) \quad\text{and}\quad \lim_{\epsilon\to 0} f(x_{d_i} + \epsilon), \tag{1.6}$$

exist¹, it is often possible to divide the interval into subintervals such that in each subinterval the function f(x) is continuous and monotonic. Such an f(x) is called piecewise continuous. Notice that every continuous function is a piecewise continuous function. In other words, f is piecewise continuous if its graph is a smooth curve except for finitely many jumps (where f is discontinuous) and corners (where df/dx is discontinuous), but we do not allow infinite discontinuities (such as f(x) = 1/x has at x = 0). If in addition f(x) has a piecewise continuous first derivative, then f(x) is called piecewise smooth. Some examples of piecewise continuous/piecewise smooth functions are shown in Fig. 1.2. We can then obtain a representation of a piecewise continuous function of the form (1.5).
The general idea of Fourier analysis is to study how general functions can be decomposed into trigonometric or exponential functions with definite frequencies. There are two types of Fourier expansions:

1. Fourier series, which represent periodic functions as a discrete sum of sine and cosine terms (or, equivalently, complex exponential functions) instead of the x^k. Fourier series can describe functions which are piecewise continuous and periodic. There are, of course, other advantages in using trigonometric terms to expand a periodic function: they are easy to differentiate and integrate, and they are periodic.

¹If, in an interval a ≤ x ≤ b, the endpoint a (or b) is one of the exceptional points x_{d_i}, we require only the right-hand (or left-hand) limit to exist.
Figure 1.2: Top panel: (Left) The function f(x) = |x|, with −π ≤ x ≤ π. Note that f is continuous throughout the interval and its derivative is discontinuous only at x = 0; then f(x) is piecewise smooth. (Right) The function

$$f(x) = \begin{cases} x^2, & -\pi < x < 0, \\ x^2 + 1, & 0 \le x < \pi. \end{cases}$$

Both the function and its derivative are continuous except at x = 0; thus the function is piecewise smooth. Bottom panel: the function f(x) = x^{1/3} on any interval that includes x = 0 is continuous, but its derivative is not piecewise continuous, since df/dx = 1/(3x^{2/3}) is ∞ at x = 0. Thus the function is not piecewise smooth; in other words, any region including x = 0 cannot be broken up into pieces such that df/dx is continuous. A function like, for example,

$$f(x) = \begin{cases} x^{1/2}, & x < 0, \\ x^2 + 1, & x \ge 0, \end{cases}$$

is piecewise continuous, but its derivative is not, since lim_{x→0⁻} df/dx → ∞, where 0⁻ means that we approach 0 from the left side. Thus, the function f(x) is not piecewise smooth.

2. Fourier transform. A general function which is not necessarily periodic can be written as a continuous integral of trigonometric or exponential functions with a continuum of possible frequencies.

The reason why Fourier analysis is so important in physics is that it goes beyond the mechanical process of splitting a function into simpler pieces and recombining them: many (although certainly not all) of the differential equations that govern physical systems are linear, which implies that the sum of two solutions is again a solution. Therefore, since Fourier analysis tells us that any function can be written in terms of sinusoidal functions, we can limit our attention to these functions when solving the differential equations, and then build up any other function from these special ones. This is a very helpful strategy, because it is invariably easier to deal with sinusoidal functions than with general ones. Fourier analysis shows up, for example, in classical mechanics and the analysis of normal modes, in electromagnetism and the frequency analysis of waves, in noise considerations and thermal physics, in quantum theory and the transformation between momentum and coordinate representations, and in quantum field theory and the creation and annihilation operator formalism.

1.1 THE INNER PRODUCT


The most basic example of an inner product is the familiar dot product of two vectors |v⟩ and |w⟩ lying in the Euclidean space R^n,

$$\langle v|w\rangle = v_1 w_1 + v_2 w_2 + \cdots + v_n w_n = \sum_{k=1}^{n} v_k w_k. \tag{1.7}$$

Observe that we are using a new symbol | ⟩ to denote a generic vector. This object is called a ket, and this nomenclature is due to Dirac. Since |v⟩ and |w⟩ are uniquely specified by their components in a given basis, we may, in this basis, write them as column vectors,

$$|v\rangle = \begin{pmatrix} v_1 \\ v_2 \\ \vdots \\ v_n \end{pmatrix}, \qquad |w\rangle = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix}. \tag{1.8}$$

The inner product ⟨v|w⟩ is given by the matrix product of the transpose conjugate of the column vector representing |v⟩ (thus, a row vector), which we denote as ⟨v|, and the column vector representing |w⟩, i.e.,

$$\langle v|w\rangle = \begin{pmatrix} v_1^* & v_2^* & \ldots & v_n^* \end{pmatrix}\begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} = \sum_{k=1}^{n} v_k^* w_k. \tag{1.9}$$

The symbol ⟨ | is called a bra, and ⟨v| denotes the transpose conjugate of |v⟩. In the case of vectors in R^n, {v_k^*}_{k=1}^n = {v_k}_{k=1}^n, and Eq. (1.9) reduces to Eq. (1.7).
Equation (1.7) is the cornerstone of Euclidean geometry. The key remark is that the dot product of a vector v with itself,

$$\langle v|v\rangle = |v_1|^2 + |v_2|^2 + \cdots + |v_n|^2, \tag{1.10}$$

is the sum of the squared moduli of its elements, and hence equal to the square of its length. Therefore, the Euclidean norm or length of a vector is found by taking the square root, i.e.,

$$||v|| = \sqrt{\langle v|v\rangle} = \sqrt{|v_1|^2 + |v_2|^2 + \cdots + |v_n|^2}. \tag{1.11}$$

This formula generalizes the classical Pythagorean theorem to n-dimensional Euclidean space. Since each term in the sum of Eq. (1.10) is non-negative, the length of a vector is also non-negative, ||v|| ≥ 0. Furthermore, the only vector of length 0 is the zero vector.

The dot product and norm satisfy certain evident properties, and these serve as the basis for the abstract definition of more general inner products on vector spaces. The formal definition of an inner product on the complex vector space V is a pairing that takes two vectors |v⟩, |w⟩ belonging to V and produces a number ⟨v|w⟩ belonging to the complex plane C. The inner product is required to satisfy the following three axioms for all u, v, w ∈ V and c, d ∈ C:

1. Bilinearity,
$$\langle cu + dv|w\rangle = c^*\langle u|w\rangle + d^*\langle v|w\rangle, \tag{1.12}$$
$$\langle u|cv + dw\rangle = c\langle u|v\rangle + d\langle u|w\rangle. \tag{1.13}$$

2. Symmetry,
$$\langle v|w\rangle = \langle w|v\rangle^*. \tag{1.14}$$

3. Positivity,
$$\langle v|v\rangle > 0 \tag{1.15}$$
whenever |v⟩ ≠ |0⟩, while ⟨0|0⟩ = 0.

Given an inner product, the associated norm of a vector |v⟩ ∈ V is defined as the positive square root of the inner product of the vector with itself, i.e.,

$$||v|| = \sqrt{\langle v|v\rangle}. \tag{1.16}$$

The positivity axiom implies that ||v|| ≥ 0 is real and non-negative, and equals 0 if and only if |v⟩ = |0⟩ is the zero vector.

A vector space equipped with an inner product is called an inner product space, and a given vector space can admit many different inner products (if you wish, you can easily verify the inner product axioms for the Euclidean dot product).

EXAMPLE 1.1 (THE WEIGHTED PRODUCT). While the dot product is certainly the most important inner product on R^n, it is by no means the only possibility. We can also consider the weighted inner product

$$\langle v|w\rangle = 2v_1 w_1 + 5 v_2 w_2 \tag{1.17}$$

between vectors |v⟩ = v_1|i⟩ + v_2|j⟩ and |w⟩ = w_1|i⟩ + w_2|j⟩ in R². The symmetry axiom is immediate. Moreover (in this case c, d ∈ R),

$$\langle cu + dv|w\rangle = 2(cu_1 + dv_1)w_1 + 5(cu_2 + dv_2)w_2 = (2cu_1w_1 + 5cu_2w_2) + (2dv_1w_1 + 5dv_2w_2) = c\langle u|w\rangle + d\langle v|w\rangle, \tag{1.18}$$

which verifies the first bilinearity condition (the second follows by a very similar computation). Moreover,

$$\langle v|v\rangle = 2v_1^2 + 5v_2^2 \ge 0 \tag{1.19}$$

is clearly strictly positive for any |v⟩ ≠ |0⟩ and equal to zero when |v⟩ = |0⟩, which proves positivity and hence establishes Eq. (1.17) as a legitimate inner product on R². The associated weighted norm is

$$||v|| = \sqrt{2v_1^2 + 5v_2^2}. \tag{1.20}$$

In general, for R^n, if c_1, c_2, ..., c_n is a set of positive numbers, the corresponding weighted inner product and weighted norm are defined by

$$\langle v|w\rangle = \sum_{i=1}^{n} c_i v_i w_i, \qquad ||v|| = \sqrt{\sum_{i=1}^{n} c_i v_i^2}. \tag{1.21}$$

The numbers c_i > 0 are the weights: the larger the weight c_i, the more the i-th coordinate of v contributes to the norm. Weighted norms are particularly important in statistics and data fitting, where one wants to emphasize certain quantities and de-emphasize others; this is done by assigning suitable weights to the different components of the data vector |v⟩.
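As a quick illustration, here is a minimal sketch, assuming NumPy is available (the helper names `weighted_inner` and `weighted_norm` are ours, not from the text), of the weighted inner product and norm of Eq. (1.21), checked on the weights of Eq. (1.17):

```python
import numpy as np

def weighted_inner(v, w, c):
    """<v|w> = sum_i c_i v_i* w_i, with positive weights c_i, Eq. (1.21)."""
    return np.sum(c * np.conj(v) * w)

def weighted_norm(v, c):
    return np.sqrt(weighted_inner(v, v, c).real)

c = np.array([2.0, 5.0])                 # weights of Eq. (1.17)
v, w = np.array([1.0, 2.0]), np.array([3.0, -1.0])

print(weighted_inner(v, w, c))           # 2*1*3 + 5*2*(-1) = -4
print(weighted_norm(v, c))               # sqrt(2*1 + 5*4) = sqrt(22)
```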
Interestingly, we can also define inner products and norms on function spaces! If we consider a bounded closed interval which is a subset of R, i.e., [a, b] ⊂ R, and the vector space consisting of all piecewise continuous functions f : [a, b] → C, the integral

$$\langle f|g\rangle = \int_a^b dx\, f^*(x) g(x) \tag{1.22}$$

defines an inner product on this vector space (which can easily be proven). The associated norm is, according to the basic definition in Eq. (1.16),

$$||f|| \equiv \sqrt{\langle f|f\rangle} = \sqrt{\int_a^b dx\,|f(x)|^2}, \tag{1.23}$$

which is known as the norm of the function f over the interval [a, b] and plays the same role in infinite-dimensional function space that the Euclidean norm or length of a vector plays in the finite-dimensional Euclidean vector space R^n. One can also define weighted inner products. The weights along the interval are specified by a continuous positive scalar function ω(x) > 0. The corresponding weighted inner product and norm are

$$\langle f|g\rangle = \int_a^b dx\, \omega(x) f^*(x) g(x), \qquad ||f|| = \sqrt{\int_a^b dx\, \omega(x)|f(x)|^2}. \tag{1.24}$$

Functions for which such an integral exists and is finite are said to be square-integrable over the interval [a, b], and the space of square-integrable functions over the interval [a, b] is denoted by L²_ω(a, b). In this notation, L represents the name Lebesgue, who generalized the notion of the ordinary Riemann integral to cases in which the integrand can be highly discontinuous; the superscript 2 indicates the integrability of the square of the modulus of each function; the values a and b denote the limits of integration; and ω refers to the weight function. When ω(x) = 1, we use the notation L²(a, b). Every piecewise continuous function defined on a bounded interval belongs to L²_ω(a, b), but some functions with singularities are also members of the square-integrable functions. This includes such non-piecewise continuous functions as sin(1/x) and x^{−1/3}, as well as the strange function

$$r(x) = \begin{cases} 1, & \text{if } x \text{ is a rational number}, \\ 0, & \text{if } x \text{ is irrational}. \end{cases} \tag{1.25}$$
Thus, while well behaved in some respects, square-integrable functions can be quite wild in others.

EXAMPLE 1.2. If we take [a, b] = [0, π/2], then the L² inner product between f(x) = sin(x) and g(x) = cos(x) is equal to

$$\langle f|g\rangle = \int_0^{\pi/2} dx\,\sin(x)\cos(x) = \left.-\frac{1}{2}\cos^2(x)\right|_{x=0}^{x=\pi/2} = \frac{1}{2}. \tag{1.26}$$

Similarly, the norm of the function f(x) is

$$||\sin(x)|| = \sqrt{\int_0^{\pi/2} dx\,\sin^2(x)} = \sqrt{\frac{\pi}{4}}. \tag{1.27}$$

Note that had we considered f(x) = 1, the norm would be

$$||f|| = \sqrt{\int_0^{\pi/2} dx\, 1^2} = \sqrt{\frac{\pi}{2}}, \tag{1.28}$$

and not 1, as you might have expected. It is also important to note that the value of the norm depends upon which interval the integral is taken over. For instance, on the longer interval [0, π],

$$||f|| = \sqrt{\int_0^{\pi} dx\, 1^2} = \sqrt{\pi}. \tag{1.29}$$

Thus, when dealing with the L² inner product or norm, we must always be careful to specify the function space, or, equivalently, the interval on which it is being evaluated.
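These integrals are easy to check numerically. A minimal sketch, assuming NumPy and SciPy are available:

```python
import numpy as np
from scipy.integrate import quad

# L^2 inner product and norm on [a, b], Eqs. (1.22) and (1.23)
inner = lambda f, g, a, b: quad(lambda x: np.conj(f(x)) * g(x), a, b)[0]
norm  = lambda f, a, b: np.sqrt(inner(f, f, a, b).real)

print(inner(np.sin, np.cos, 0, np.pi/2))              # 1/2, Eq. (1.26)
print(norm(np.sin, 0, np.pi/2), np.sqrt(np.pi/4))     # Eq. (1.27)
print(norm(lambda x: 1.0, 0, np.pi), np.sqrt(np.pi))  # Eq. (1.29)
```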

Every inner product satisfies:

1. The Cauchy-Schwarz inequality,
$$|\langle v|w\rangle| \le ||v||\,||w||, \qquad |v\rangle, |w\rangle \in V. \tag{1.30}$$
Here, ||v|| is the associated norm, while | | denotes the absolute value. In the case of the L²_ω(a, b) inner product on function space,
$$\left|\int_a^b dx\,\omega(x) f^*(x) g(x)\right| \le \sqrt{\int_a^b dx\,\omega(x)|f(x)|^2}\,\sqrt{\int_a^b dx\,\omega(x)|g(x)|^2}. \tag{1.31}$$

2. The triangle inequality, which follows from the Cauchy-Schwarz inequality,
$$||v + w|| \le ||v|| + ||w||, \tag{1.32}$$
for every |v⟩, |w⟩ ∈ V. Equality holds if and only if |v⟩ and |w⟩ are parallel vectors. In the case of the L²_ω(a, b) inner product on function space,
$$\sqrt{\int_a^b dx\,\omega(x)|f(x)+g(x)|^2} \le \sqrt{\int_a^b dx\,\omega(x)|f(x)|^2} + \sqrt{\int_a^b dx\,\omega(x)|g(x)|^2}. \tag{1.33}$$

Given any inner product on a vector space, we can use the quotient

$$\cos\theta = \frac{\langle v|w\rangle}{||v||\,||w||} \tag{1.34}$$

to define the "angle" between the elements |v⟩, |w⟩ ∈ V. The Cauchy-Schwarz inequality tells us that this ratio lies between −1 and +1, and hence the angle θ is well-defined, and, in fact, unique if we restrict it to lie in the range 0 ≤ θ ≤ π. For example, using the standard dot product on R³, the angle between the vectors |v⟩ = 1|i⟩ + 1|k⟩ and |w⟩ = 1|j⟩ + 1|k⟩ is given by

$$\cos\theta = \frac{1}{\sqrt{2}\sqrt{2}} = \frac{1}{2} \implies \theta = \frac{\pi}{3}, \tag{1.35}$$

i.e., 60°. Similarly, the "angle" between the functions f(x) = x and g(x) = x² defined on the interval [0, 1] is given by

$$\cos\theta = \frac{\langle x|x^2\rangle}{||x||\,||x^2||} = \frac{\int_0^1 dx\, x^3}{\sqrt{\int_0^1 dx\, x^2}\sqrt{\int_0^1 dx\, x^4}} = \frac{\frac{1}{4}}{\sqrt{\frac{1}{3}}\sqrt{\frac{1}{5}}} = \sqrt{\frac{15}{16}}, \tag{1.36}$$

so that θ ≈ 0.25268 radians. Of course, one should not try to give this notion of angle between functions more significance than the formal definition warrants. It does not correspond to any "angular" properties of their graphs. Also, the value depends on the choice of inner product and the interval upon which it is being computed. But even in Euclidean space R^n, the measurement of angle and length depends upon the choice of an underlying inner product. Different inner products lead to different angle measurements; only for the standard Euclidean dot product does angle correspond to our everyday experience. But the important point is that, using the Schwarz inequality, which holds for any inner product, one can show that the integral in Eq. (1.24) is defined.
As you already know, two elements |v⟩, |w⟩ ∈ V of an inner product space V are called orthogonal if their inner product ⟨v|w⟩ = 0. For example, the vectors |v⟩ = 1|i⟩ + 2|j⟩ and |w⟩ = 6|i⟩ − 3|j⟩ are orthogonal with respect to the Euclidean dot product in R², since ⟨v|w⟩ = 1·6 + 2·(−3) = 0. But, interestingly, the functions f(x) = x and g(x) = x² − 1/2 are orthogonal with respect to the inner product ⟨f|g⟩ = ∫₀¹ dx f*(x)g(x) on the interval [0, 1], since

$$\left\langle x\,\middle|\,x^2 - \frac{1}{2}\right\rangle = \int_0^1 dx\, x\left(x^2 - \frac{1}{2}\right) = \int_0^1 dx\left(x^3 - \frac{x}{2}\right) = 0. \tag{1.37}$$

However, on another interval, such as [0, 2], they fail to be orthogonal:

$$\left\langle x\,\middle|\,x^2 - \frac{1}{2}\right\rangle = \int_0^2 dx\, x\left(x^2 - \frac{1}{2}\right) = \int_0^2 dx\left(x^3 - \frac{x}{2}\right) = 3. \tag{1.38}$$

Note that the monomials {x^k}_{k=0}^∞ are not orthogonal with respect to the above inner product on either of the two intervals considered. For example,

$$\langle 1|x\rangle = \int_0^1 dx\, x = \left.\frac{x^2}{2}\right|_{x=0}^{1} = \frac{1}{2}. \tag{1.39}$$

There are two important theorems for the space L²_ω(a, b) (which we are not going to prove). One of them is the Stone-Weierstrass theorem: the sequence of monomials {x^k}_{k=0}^∞ is linearly independent and forms a basis of L²_ω(a, b). Then, a piecewise continuous function f(x) ∈ L²_ω(a, b) can be written as

$$f(x) = \sum_{k=0}^{\infty} f_k x^k. \tag{1.40}$$

This is very interesting because we can associate vectors with the function f(x) and the monomials {x^k}_{k=0}^∞ in the space L²_ω(a, b). Indeed, if we take different values of x (let's call them x_1, x_2, etc.) in the interval [a, b], we can consider the entire set of values of the function f(x) to be represented by a vector |f⟩, where

$$|f\rangle = \begin{pmatrix} f(x_1) \\ f(x_2) \\ \vdots \end{pmatrix}, \qquad \langle f| = \begin{pmatrix} f^*(x_1) & f^*(x_2) & \cdots \end{pmatrix}, \tag{1.41}$$

with

$$\langle f|f\rangle = \int_a^b dx\,\omega(x) f^*(x) f(x) = \int_a^b dx\,\omega(x)|f(x)|^2. \tag{1.42}$$

By definition, if we have two functions f(x) and g(x) belonging to L²_ω(a, b) and a constant c ∈ C,

$$|f + g\rangle = |f\rangle + |g\rangle, \qquad c|f\rangle \tag{1.43}$$

represent f(x) + g(x) and c f(x), respectively.

Similarly, we can construct vectors related to the entire set of values of the monomials x^k, which we represent as |k⟩, with k = 0, 1, 2, ...:

$$|0\rangle = \begin{pmatrix} 1 \\ 1 \\ \vdots \end{pmatrix}, \qquad |1\rangle = \begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix}, \qquad |2\rangle = \begin{pmatrix} x_1^2 \\ x_2^2 \\ \vdots \end{pmatrix}. \tag{1.44}$$

Then, from Eq. (1.40), we have

$$f(x_1) = f_0 + f_1 x_1 + f_2 x_1^2 + \ldots, \qquad f(x_2) = f_0 + f_1 x_2 + f_2 x_2^2 + \ldots, \qquad \ldots \tag{1.45}$$

which can be written as

$$\begin{pmatrix} f(x_1) \\ f(x_2) \\ \vdots \end{pmatrix} = f_0\begin{pmatrix} 1 \\ 1 \\ \vdots \end{pmatrix} + f_1\begin{pmatrix} x_1 \\ x_2 \\ \vdots \end{pmatrix} + f_2\begin{pmatrix} x_1^2 \\ x_2^2 \\ \vdots \end{pmatrix} + \ldots, \tag{1.46}$$

or, in terms of Eqs. (1.41) and (1.44),

$$|f\rangle = f_0|0\rangle + f_1|1\rangle + f_2|2\rangle + \cdots = \sum_{k=0}^{\infty} f_k|k\rangle. \tag{1.47}$$
Interestingly, given the linearly independent vectors related to the monomials {x^k}_{k=0}^∞, one can apply the so-called Gram-Schmidt procedure to produce orthonormal functions e_k(x), k = 1, 2, ..., from them, and the corresponding orthonormal vectors, which we represent by {|e_k⟩}_{k=1}^∞, constructed in the same way as the vectors |k⟩, i.e.,

$$|e_k\rangle = \begin{pmatrix} e_k(x_1) \\ e_k(x_2) \\ \vdots \end{pmatrix}, \qquad \langle e_k| = \begin{pmatrix} e_k^*(x_1) & e_k^*(x_2) & \cdots \end{pmatrix}. \tag{1.48}$$

These vectors then form an orthonormal basis in L²_ω(a, b), i.e.,

$$\langle e_k|e_l\rangle = \int_a^b dx\,\omega(x)\, e_k^*(x) e_l(x) = \delta_{kl}, \tag{1.49}$$

which can be used to expand functions of L²_ω(a, b).
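To make the Gram-Schmidt step concrete, here is a minimal sketch, assuming SymPy is available (the interval [−1, 1] and weight ω(x) = 1 are our choices, for illustration), which orthonormalizes the first few monomials; the resulting e_k(x) are, up to normalization, the Legendre polynomials that we will meet again in Chapter 3:

```python
import sympy as sp

x = sp.symbols('x')
inner = lambda f, g: sp.integrate(f * g, (x, -1, 1))  # Eq. (1.24), real case

basis = []
for k in range(4):
    v = x**k
    for e in basis:                       # subtract projections on earlier e_l
        v -= inner(e, v) * e
    basis.append(sp.simplify(v / sp.sqrt(inner(v, v))))  # normalize

for e in basis:
    print(e)                    # sqrt(2)/2, sqrt(6)*x/2, ... (normalized Legendre)
print(inner(basis[1], basis[2]))          # 0: orthonormality, Eq. (1.49)
```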


In the case of a vector |v⟩ ∈ R^n, if

$$|v\rangle = \sum_{k=1}^{n} v_k|e_k\rangle, \tag{1.50}$$

where {|e_k⟩}_{k=1}^n is an orthonormal basis, i.e.,

$$\langle e_k|e_l\rangle = \delta_{kl}, \tag{1.51}$$

we can determine the coefficients v_k in Eq. (1.50) by taking the inner product (in this case, the scalar product) of |v⟩ with each ⟨e_k| in succession, i.e.,

$$v_k = \langle e_k|v\rangle. \tag{1.52}$$

Similarly, if we choose an orthonormal basis in L²_ω(a, b), which we denote as {|e_k⟩}_{k=1}^∞, we can then write

$$|f\rangle = \sum_{k=1}^{\infty} f_k|e_k\rangle, \tag{1.53}$$

and the coefficients f_k can be calculated by taking the inner product [see Eq. (1.24)] of |f⟩ with each |e_k⟩ in succession as

$$f_k = \langle e_k|f\rangle = \int_a^b dx\,\omega(x)\, e_k^*(x) f(x). \tag{1.54}$$
These numbers are called the Fourier coefficients of |f⟩ with respect to the basis {|e_k⟩}_{k=1}^∞, and they can be thought of as values of a function f : N → C, where N is the (infinite) set of natural numbers. The vectors |e_k⟩ are complete, i.e., the only vector in L²_ω(a, b) that is orthogonal to all the |e_k⟩ is the zero vector, for which all the Fourier coefficients f_1 = f_2 = ··· = 0. For instance, using Eq. (1.53),

$$\langle f|f\rangle = \sum_{k=1}^{\infty}\sum_{l=1}^{\infty} f_k^* f_l\langle e_k|e_l\rangle = \sum_{k=1}^{\infty}\sum_{l=1}^{\infty} f_k^* f_l\,\delta_{kl} = \sum_{k=1}^{\infty}|f_k|^2, \tag{1.55}$$

and ⟨f|f⟩ = 0 if all the f_k = 0. Note that from Eq. (1.42) we have

$$\langle f|f\rangle = \sum_{k=1}^{\infty}|f_k|^2 = \int_a^b dx\,\omega(x)|f(x)|^2, \tag{1.56}$$

and since the integral is finite for f(x) ∈ L²_ω(a, b), we get that Σ_{k=1}^∞ |f_k|² converges. Even more, think now of a vector |v⟩ ∈ R³: if we consider a coordinate frame and draw there the point related to |v⟩, the norm ||v|| represents the distance from the drawn point to the origin of the coordinate frame, measured along an imaginary line connecting the considered point with the origin of the frame. So if we could measure such a distance, let's say with a ruler, we would have the value of ||v||. If we now consider an orthonormal basis in R³ and write |v⟩ in that basis, i.e., |v⟩ = v_1|i⟩ + v_2|j⟩ + v_3|k⟩, we can calculate ||v|| as ||v||² = |v_1|² + |v_2|² + |v_3|², and, of course, ||v|| will coincide with the above-mentioned distance. But this happens only because we have considered all the vectors which form a basis in R³ (in this case, three), i.e., our basis is complete. If we had forgotten to incorporate in the basis one, or more, of the vectors |i⟩, |j⟩ or |k⟩, we would, of course, not have a basis (our basis is not complete), which means that the value obtained for ||v||² from measuring the distance from the corresponding point to the origin of our coordinate frame and the result obtained from $\sum_{k=1}^{n}|v_k|^2$ with n < 3 would differ, with $||v||^2 > \sum_{k=1}^{n}|v_k|^2$. Only if our basis is complete, i.e., if we have all the vectors needed to form a basis of R³, will the value obtained for ||v||² from measuring the distance between the point and the origin coincide with that from $\sum_{k=1}^{3}|v_k|^2$. The notion of completeness is normally not emphasized when discussing an n-dimensional vector space such as R^n, since if you take away some of the vectors of the basis, as we stated, you do not have a basis, because you have fewer than n vectors. The situation is different in infinite dimensions, as is the case of L²_ω(a, b): if you start with a basis and take away some of the vectors, you still have an infinite number of orthonormal vectors. Equation (1.56), analogously to our discussion of |v⟩ ∈ R³, is telling us that if we calculate ⟨f|f⟩ using the integral ∫_a^b dx ω(x)|f(x)|² and the summation Σ_{k=1}^∞ |f_k|², and both results do not match, we have forgotten some vector in our basis! The notion of completeness ensures that no orthonormal vector is taken out of the basis, and the vectors |e_k⟩ satisfy the completeness relation

$$\sum_{k=1}^{\infty} |e_k\rangle\langle e_k| = 1, \tag{1.57}$$

with 1 the identity matrix. Indeed, note that, from Eqs. (1.53) and (1.54),

$$|f\rangle = \sum_{k=1}^{\infty} f_k|e_k\rangle = \sum_{k=1}^{\infty}\langle e_k|f\rangle|e_k\rangle = \sum_{k=1}^{\infty}|e_k\rangle\langle e_k|f\rangle \implies \sum_{k=1}^{\infty}|e_k\rangle\langle e_k| = 1, \tag{1.58}$$

where we have used the fact that ⟨e_k|f⟩ is a number and thus commutes.


Completeness also tells us that a function is uniquely determined by its Fourier coefficients. More precisely, two vectors |f⟩, |g⟩ ∈ L²_ω(a, b) have the same Fourier coefficients if and only if they are the same, i.e., |f⟩ = |g⟩. Indeed, by analogy to Eq. (1.55),

$$\langle f-g|f-g\rangle = \langle f|f\rangle - \langle f|g\rangle - \langle g|f\rangle + \langle g|g\rangle = \sum_{k=1}^{\infty}|f_k|^2 - \sum_{k=1}^{\infty} f_k^* g_k - \sum_{k=1}^{\infty} f_k g_k^* + \sum_{k=1}^{\infty}|g_k|^2 = \sum_{k=1}^{\infty}(f_k - g_k)^*(f_k - g_k) = \sum_{k=1}^{\infty}|f_k - g_k|^2. \tag{1.59}$$

If |f⟩ = |g⟩, Eq. (1.59) must be zero; thus f_k = g_k.

The other important theorem of L²_ω(a, b) is the Riesz-Fischer theorem, which establishes that if {|e_k⟩} is an orthonormal basis of L²_ω(a, b), the map f → f_k, with f_k given by Eq. (1.54), is an isomorphism.
We might reformulate the results of this section in the following way: given a set {e_k(x)}_{k=1}^∞ of orthonormal functions representing a set {|e_k⟩} of basis vectors of L²_ω(a, b), any piecewise continuous function f(x) for which the integral

$$\int_a^b dx\,\omega(x)|f(x)|^2 \tag{1.60}$$

exists and is finite can be expanded in the infinite sum

$$f(x) = \sum_{k=1}^{\infty} f_k e_k(x), \tag{1.61}$$

with

$$f_k = \int_a^b dx\,\omega(x)\, e_k^*(x) f(x). \tag{1.62}$$
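A minimal numerical sketch of Eqs. (1.61) and (1.62), assuming NumPy and SciPy are available (the choice of f(x) = eˣ and of the normalized Legendre basis on [−1, 1] is ours, for illustration):

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import eval_legendre

def e(k, x):
    # Normalized Legendre polynomial: <e_k|e_l> = delta_kl on [-1, 1], w = 1.
    return np.sqrt((2*k + 1) / 2) * eval_legendre(k, x)

f = np.exp
fk = [quad(lambda x: e(k, x) * f(x), -1, 1)[0] for k in range(6)]  # Eq. (1.62)

x0 = 0.5
fn = sum(fk[k] * e(k, x0) for k in range(6))   # Eq. (1.61), truncated
print(fn, f(x0))                               # both ~1.64872
```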

1.2 THE FOURIER SERIES

Let us consider a piecewise continuous function f(x) defined in the interval [−L, L] with period 2L, i.e.,

$$f(x) = f(x + 2L). \tag{1.63}$$

The continuous functions

$$e_m(x) = \frac{1}{\sqrt{2L}}\, e^{im(\pi/L)x}, \qquad m = 0, \pm 1, \pm 2, \ldots \tag{1.64}$$

represent vectors |e_m⟩ which form an orthonormal basis in the space L²(−L, L); thus ω(x) = 1. Indeed, it is easy to see that

$$\langle e_m|e_n\rangle = \frac{1}{2L}\int_{-L}^{L} dx\, e^{-im(\pi/L)x}\, e^{in(\pi/L)x} = \delta_{mn}. \tag{1.65}$$

Then, the function f(x) ∈ L²(−L, L) can be expanded as

$$f(x) = \sum_{m=-\infty}^{\infty} f_m e_m(x) = \frac{1}{\sqrt{2L}}\sum_{m=-\infty}^{\infty} f_m e^{im(\pi/L)x}, \tag{1.66}$$

where

$$f_m = \langle e_m|f\rangle = \frac{1}{\sqrt{2L}}\int_{-L}^{L} dx\, e^{-im(\pi/L)x} f(x). \tag{1.67}$$

Instead of the vectors |e_m⟩, one can use as a basis in L²(−L, L) certain of their linear combinations,

$$|e_0^+\rangle = |e_0\rangle, \qquad |e_m^+\rangle = \frac{1}{\sqrt{2}}\big[|e_m\rangle + |e_{-m}\rangle\big], \tag{1.68}$$

$$|e_m^-\rangle = -\frac{i}{\sqrt{2}}\big[|e_m\rangle - |e_{-m}\rangle\big], \tag{1.69}$$
with m = 1, 2, ..., which also have the orthonormality property and which are represented by the trigonometric functions

$$e_0^+(x) = e_0(x) = \frac{1}{\sqrt{2L}}, \qquad e_m^+(x) = \frac{1}{\sqrt{2}}\big[e_m(x) + e_{-m}(x)\big] = \frac{1}{\sqrt{L}}\cos\left(m\frac{\pi}{L}x\right), \qquad e_m^-(x) = -\frac{i}{\sqrt{2}}\big[e_m(x) - e_{-m}(x)\big] = \frac{1}{\sqrt{L}}\sin\left(m\frac{\pi}{L}x\right). \tag{1.70}$$

In this way,

$$f(x) = \sum_{m=-\infty}^{\infty} f_m e_m(x) = f_0 e_0(x) + \sum_{m=1}^{\infty}\big[f_{-m}e_{-m}(x) + f_m e_m(x)\big]. \tag{1.71}$$

Using Eqs. (1.68) and (1.69), we can write

$$|e_0\rangle = |e_0^+\rangle, \qquad |e_{-m}\rangle = \frac{1}{\sqrt{2}}\big[|e_m^+\rangle - i|e_m^-\rangle\big], \qquad |e_m\rangle = \frac{1}{\sqrt{2}}\big[|e_m^+\rangle + i|e_m^-\rangle\big]; \tag{1.72}$$

then Eq. (1.71) can be written as

$$f(x) = f_0 e_0(x) + \sum_{m=1}^{\infty}\left[\frac{1}{\sqrt{2}}(f_m + f_{-m})\,e_m^+(x) + \frac{i}{\sqrt{2}}(f_m - f_{-m})\,e_m^-(x)\right] \equiv f_0^+ e_0^+(x) + \sum_{m=1}^{\infty}\left[f_m^+ e_m^+(x) + f_m^- e_m^-(x)\right], \tag{1.73}$$

where we have defined

$$f_0^+ \equiv f_0, \qquad f_m^+ \equiv \frac{1}{\sqrt{2}}(f_m + f_{-m}), \qquad f_m^- \equiv \frac{i}{\sqrt{2}}(f_m - f_{-m}). \tag{1.74}$$

In this way, by using Eqs. (1.73) and (1.70), we get

$$f(x) = f_0^+\frac{1}{\sqrt{2L}} + \sum_{m=1}^{\infty}\left[f_m^+\frac{1}{\sqrt{L}}\cos\left(m\frac{\pi}{L}x\right) + f_m^-\frac{1}{\sqrt{L}}\sin\left(m\frac{\pi}{L}x\right)\right]. \tag{1.75}$$

The coefficients f_0^+, f_m^+ and f_m^- can be obtained as

$$f_0^+ = \langle e_0^+|f\rangle = \frac{1}{\sqrt{2L}}\int_{-L}^{L} dx\, f(x), \qquad f_m^+ = \langle e_m^+|f\rangle = \frac{1}{\sqrt{L}}\int_{-L}^{L} dx\,\cos\left(m\frac{\pi}{L}x\right) f(x), \tag{1.76}$$

$$f_m^- = \langle e_m^-|f\rangle = \frac{1}{\sqrt{L}}\int_{-L}^{L} dx\,\sin\left(m\frac{\pi}{L}x\right) f(x). \tag{1.77}$$
Then, we can rewrite Eq. (1.75) as

$$f(x) = \frac{a_0}{2} + \sum_{m=1}^{\infty}\left[a_m\cos\left(m\frac{\pi}{L}x\right) + b_m\sin\left(m\frac{\pi}{L}x\right)\right], \tag{1.78}$$

where

$$a_m = \frac{1}{L}\int_{-L}^{L} dx\,\cos\left(m\frac{\pi}{L}x\right) f(x), \qquad m \ge 0, \tag{1.79}$$

and

$$b_m = \frac{1}{L}\int_{-L}^{L} dx\,\sin\left(m\frac{\pi}{L}x\right) f(x), \qquad m > 0. \tag{1.80}$$

If f(x) is an even or an odd function of x, i.e.,

$$f(x) = \pm f(-x), \qquad x \in [-L, L], \tag{1.81}$$

the Fourier series can be simplified. It is easy to verify that if f(x) is even all b_m = 0, while if f(x) is odd all a_m = 0.

It is important to note that, since f(x) is a periodic function with period 2L, the Fourier series expansion extends the domain of definition of f(x) to all the intervals 2kL − L ≤ x ≤ 2kL + L, k = 0, 1, 2, ..., since f(x) with −L ≤ x ≤ L is equivalent to f(x − 2kL) with 2kL − L ≤ x ≤ 2kL + L; both give the same Fourier series expansion. For this reason we just needed the periodic function f(x) on [−L, L].

Periodic functions are not always defined on [−L, L]. Let us consider a periodic function f(u) that is defined on [a, b] with period T = b − a. The transformation

$$x \equiv \frac{2L}{T}\left(u - a - \frac{T}{2}\right) \tag{1.82}$$

maps the interval [a, b] onto [−L, L]; therefore, using Eq. (1.66), we have the Fourier expansion

$$f(u) = \frac{1}{\sqrt{2L}}\sum_{m=-\infty}^{\infty} f_m e^{im(\pi/L)(2L/T)(u-a-T/2)} = \frac{1}{\sqrt{2L}}\sum_{m=-\infty}^{\infty} f_m e^{im(2\pi/T)u}\, e^{-im(2\pi/T)(a+T/2)}, \tag{1.83}$$
with

$$f_m = \frac{1}{\sqrt{2L}}\frac{2L}{T}\int_a^b du\, e^{-im(\pi/L)(2L/T)(u-a-T/2)} f(u) = \frac{\sqrt{2L}}{T}\int_a^b du\, e^{-im(2\pi/T)u}\, e^{im(2\pi/T)(a+T/2)} f(u). \tag{1.84}$$

Substituting Eq. (1.84) in (1.83), we can write Eq. (1.83) as

$$f(u) = \frac{1}{\sqrt{T}}\sum_{m=-\infty}^{\infty} F_m e^{im(2\pi/T)u}, \tag{1.85}$$

where

$$F_m = \frac{1}{\sqrt{T}}\int_a^b du\, e^{-im(2\pi/T)u} f(u). \tag{1.86}$$

The functions (1/√T) e^{im(2π/T)u} are an orthonormal basis of L²(a, b); thus, Eq. (1.85) is the corresponding Fourier expansion of the function. We can relabel the variable u as x, and write the Fourier expansion for a function f(x) in the interval [a, b] with period T = b − a as

$$f(x) = \frac{1}{\sqrt{T}}\sum_{m=-\infty}^{\infty} f_m e^{im(2\pi/T)x}, \tag{1.87}$$

where now

$$f_m = \frac{1}{\sqrt{T}}\int_a^b dx\, e^{-im(2\pi/T)x} f(x). \tag{1.88}$$

The Fourier series might then be thought of as an expansion of a periodic function into simple harmonics of the same period.

Note that the integral in Eq. (1.88) requires simply that f ∈ L(a, b), i.e., that

$$\int_a^b dx\,|f(x)| \tag{1.89}$$

exists and is finite, since

$$\left|\int_a^b dx\, f(x) e^{-ikx}\right| \le \int_a^b dx\,|f(x)e^{-ikx}| = \int_a^b dx\,|f(x)|. \tag{1.90}$$
Thus, Eq. (1.88) also makes sense for a piecewise continuous function f ∈ L(a, b), and one can perform a Fourier expansion in this case too.

Equations (1.87) and (1.88) can also be written in terms of sine and cosine functions. By analogy to Eq. (1.78),

$$f(x) = \frac{a_0}{2} + \sum_{m=1}^{\infty}\left[a_m\cos\left(m\frac{2\pi}{T}x\right) + b_m\sin\left(m\frac{2\pi}{T}x\right)\right], \tag{1.91}$$

where now

$$a_m = \frac{2}{T}\int_a^b dx\,\cos\left(m\frac{2\pi}{T}x\right) f(x), \qquad m \ge 0, \tag{1.92}$$

and

$$b_m = \frac{2}{T}\int_a^b dx\,\sin\left(m\frac{2\pi}{T}x\right) f(x), \qquad m > 0. \tag{1.93}$$

It is interesting to note the analogy between the coefficients a_m, b_m and the parallel and orthogonal projections of a vector |v⟩ ∈ R²: a vector |v⟩ = v_x|i⟩ + v_y|j⟩ ∈ R² which forms an angle θ with respect to the x-axis can be written as |v⟩ = ||v||cos(θ)|i⟩ + ||v||sin(θ)|j⟩, where ||v||cos(θ) = ⟨i|v⟩ is the projection of |v⟩ along the x-axis, and ||v||sin(θ) = ⟨j|v⟩ is the projection of |v⟩ along the y-axis, i.e., perpendicular to the direction of the x-axis. Similarly, in the vector space L²(a, b), the sines and cosines are a perpendicular set of axes and the Fourier analysis projects f onto each of these axes. The coefficients a_m and b_m are precisely the projections of f in these orthogonal directions. The relation of the coefficients a_m and b_m to f_m comes directly from the link between the cosine and sine and the complex exponential, i.e.,

$$e^{-im(2\pi/T)x} = \cos\left(m\frac{2\pi}{T}x\right) - i\sin\left(m\frac{2\pi}{T}x\right), \qquad e^{im(2\pi/T)x} = \cos\left(m\frac{2\pi}{T}x\right) + i\sin\left(m\frac{2\pi}{T}x\right). \tag{1.94}$$

Using Eqs. (1.94), (1.88), (1.92) and (1.93), we get

$$f_m = \frac{\sqrt{T}}{2}(a_m - ib_m), \qquad f_{-m} = f_m^* = \frac{\sqrt{T}}{2}(a_m + ib_m). \tag{1.95}$$

Before showing that the right-hand side of Eq. (1.87) converges, indeed, to the considered function f(x), let us obtain the Fourier series of some functions.
Figure 1.3: Periodic square wave potential with V_0 = 1.

EXAMPLE 1.3. In the study of electrical circuits, periodic voltage signals V(t) of different shapes are encountered. An example is a square wave voltage of height V_0, "duration" τ and "rest duration" τ (see Fig. 1.3). The equation for V(t) with 0 ≤ t ≤ 2τ is given by

$$V(t) = \begin{cases} V_0 & \text{if } 0 \le t \le \tau, \\ 0 & \text{if } \tau < t \le 2\tau. \end{cases} \tag{1.96}$$

The potential as a function of time is a piecewise continuous function of period 2τ (the whole cycle of the potential variation), with V(t) ∈ L²(0, 2τ). Thus, V(t) can be expanded as a Fourier series. Let us consider, for example, the complex version of the Fourier expansion, i.e., Eqs. (1.87) and (1.88). We can then write

$$V(t) = \frac{1}{\sqrt{2\tau}}\sum_{m=-\infty}^{\infty} V_m e^{i2\pi mt/(2\tau)}, \tag{1.97}$$

where

$$V_m = \frac{1}{\sqrt{2\tau}}\int_0^{2\tau} dt\, V(t)\, e^{-i2\pi mt/(2\tau)}. \tag{1.98}$$
Using Eqs. (1.96) and (1.98), we get

$$V_m = \frac{V_0}{\sqrt{2\tau}}\int_0^{\tau} dt\, e^{-im\pi t/\tau} = \frac{V_0}{\sqrt{2\tau}}\left(-\frac{\tau}{im\pi}\big[(-1)^m - 1\big]\right) = \begin{cases} 0 & \text{if } m \text{ is even and } m \neq 0, \\[4pt] \dfrac{\sqrt{2\tau}\,V_0}{im\pi} & \text{if } m \text{ is odd}, \end{cases} \tag{1.99}$$

which is valid for m ≠ 0. In the case m = 0, from Eq. (1.98),

$$V_{m=0} = \frac{1}{\sqrt{2\tau}}\int_0^{2\tau} dt\, V(t) = \frac{V_0}{\sqrt{2\tau}}\int_0^{\tau} dt = V_0\sqrt{\frac{\tau}{2}}. \tag{1.100}$$

Using Eqs. (1.99), (1.100) and (1.97), we can write

$$\begin{aligned} V(t) &= \frac{1}{\sqrt{2\tau}}\left[V_0\sqrt{\frac{\tau}{2}} + \frac{\sqrt{2\tau}\,V_0}{i\pi}\left(\sum_{\substack{m=-\infty\\ m\ \text{odd}}}^{-1}\frac{1}{m}e^{im\pi t/\tau} + \sum_{\substack{m=1\\ m\ \text{odd}}}^{\infty}\frac{1}{m}e^{im\pi t/\tau}\right)\right] \\ &= V_0\left[\frac{1}{2} + \frac{1}{i\pi}\sum_{\substack{m=1\\ m\ \text{odd}}}^{\infty}\frac{1}{m}\left(e^{im\pi t/\tau} - e^{-im\pi t/\tau}\right)\right] = V_0\left[\frac{1}{2} + \frac{2}{\pi}\sum_{\substack{m=1\\ m\ \text{odd}}}^{\infty}\frac{1}{m}\sin\left(\frac{m\pi t}{\tau}\right)\right] \\ &= V_0\left[\frac{1}{2} + \frac{2}{\pi}\sum_{k=0}^{\infty}\frac{1}{2k+1}\sin\left(\frac{(2k+1)\pi t}{\tau}\right)\right]. \end{aligned} \tag{1.101}$$

Alternatively, instead of using the complex version of the Fourier expansion, we could have started from Eqs. (1.91), (1.92) and (1.93). In this case, using Eq. (1.92),

$$a_m = \frac{2}{2\tau}\int_0^{2\tau} dt\, V(t)\cos\left(m\frac{2\pi}{2\tau}t\right) = \frac{V_0}{\tau}\int_0^{\tau} dt\,\cos\left(m\frac{\pi}{\tau}t\right) = \frac{V_0}{\tau}\frac{\tau}{\pi m}\sin\left(m\frac{\pi}{\tau}t\right)\Bigg|_0^{\tau} = 0, \qquad m > 0. \tag{1.102}$$

For m = 0, we have from Eq. (1.92)

$$a_0 = \frac{2}{2\tau}\int_0^{2\tau} dt\, V(t) = \frac{V_0}{\tau}\int_0^{\tau} dt = V_0. \tag{1.103}$$
Using Eq. (1.93),

$$b_m = \frac{2}{2\tau}\int_0^{2\tau} dt\, V(t)\sin\left(m\frac{2\pi}{2\tau}t\right) = \frac{V_0}{\tau}\int_0^{\tau} dt\,\sin\left(m\frac{\pi}{\tau}t\right) = -\frac{V_0}{\tau}\frac{\tau}{\pi m}\cos\left(m\frac{\pi}{\tau}t\right)\Bigg|_0^{\tau} = -\frac{V_0}{\pi m}\big[(-1)^m - 1\big] = \begin{cases} 0, & \text{if } m \text{ is even and } m \neq 0, \\[4pt] \dfrac{2V_0}{\pi m}, & \text{if } m \text{ is odd}. \end{cases} \tag{1.104}$$

Using now Eq. (1.91),

$$V(t) = \frac{V_0}{2} + \sum_{\substack{m=1\\ m\ \text{odd}}}^{\infty}\frac{2V_0}{\pi m}\sin\left(m\frac{\pi}{\tau}t\right) = V_0\left[\frac{1}{2} + \frac{2}{\pi}\sum_{k=0}^{\infty}\frac{1}{2k+1}\sin\left(\frac{(2k+1)\pi t}{\tau}\right)\right], \tag{1.105}$$

which coincides, as expected, with the result in Eq. (1.101).

Figure 1.4 shows the graphical representation of Eq. (1.101) when only a finite number of terms is kept. As you can see, at points of discontinuity (for example t = τ), the value of the function V(t) is not defined, but the Fourier series expansion assigns it a value, which corresponds to the average of the two values on the right and left of the discontinuity. For instance, when we substitute t = τ in Eq. (1.101), all the sine terms vanish and we obtain V_0/2, which is the average of V_0 (on the left of the discontinuity at t = τ) and 0 (on the right of the discontinuity at t = τ). This is a general property of the Fourier series!
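A minimal sketch, assuming NumPy is available, of the partial sums of Eq. (1.101), which also exhibits this averaging property at the jump:

```python
import numpy as np

def V_partial(t, n_terms, V0=1.0, tau=1.0):
    # Partial sum of Eq. (1.101) with n_terms sine terms.
    k = np.arange(n_terms)
    terms = np.sin((2*k + 1) * np.pi * t / tau) / (2*k + 1)
    return V0 * (0.5 + (2/np.pi) * np.sum(terms))

for n in (1, 3, 15, 200):
    # Interior point: approaches V0. Jump t = tau: exactly V0/2 for every n.
    print(n, V_partial(0.5, n), V_partial(1.0, n))
```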

EXAMPLE 1.4. Another frequently used voltage is the sawtooth voltage (see Fig. 1.5). The equation for V(t), with 0 ≤ t < τ, is

$$V(t) = V_0\, t/\tau. \tag{1.106}$$

In this case, using Eqs. (1.87) and (1.88), the corresponding Fourier expansion is given by

$$V(t) = \frac{1}{\sqrt{\tau}}\sum_{m=-\infty}^{\infty} V_m e^{i2\pi mt/\tau}, \tag{1.107}$$

with

$$V_m = \frac{1}{\sqrt{\tau}}\int_0^{\tau} dt\, V(t)\, e^{-i2\pi mt/\tau}. \tag{1.108}$$
Figure 1.4: Various approximations to the Fourier series of the square-wave potential. The dashed, thick grey and solid lines correspond to the result obtained keeping the first term, the first 3 and 15 terms of the series, respectively.

Figure 1.5: The periodic sawtooth potential with height taken to be 1.

Substituting Eq. (1.106) in the integral above and integrating by parts, we get

$$V_m = \frac{1}{\sqrt{\tau}}\int_0^{\tau} dt\, V_0\frac{t}{\tau}\, e^{-i2\pi mt/\tau} = V_0\,\tau^{-3/2}\int_0^{\tau} dt\, t\, e^{-i2\pi mt/\tau} = V_0\,\tau^{-3/2}\left(\frac{\tau t}{-i2m\pi}\, e^{-i2\pi mt/\tau}\Bigg|_0^{\tau} + \underbrace{\frac{\tau}{i2m\pi}\int_0^{\tau} dt\, e^{-i2\pi mt/\tau}}_{=0}\right). \tag{1.109}$$
In this way,

$$V_m = V_0\,\tau^{-3/2}\left(\frac{\tau^2}{-i2m\pi}\right) = -\frac{V_0\sqrt{\tau}}{i2\pi m}, \qquad m \neq 0. \tag{1.110}$$

In the case m = 0, from Eq. (1.108),

$$V_{m=0} = \frac{1}{\sqrt{\tau}}\int_0^{\tau} dt\, V(t) = \frac{1}{\sqrt{\tau}}\int_0^{\tau} dt\, V_0\frac{t}{\tau} = \frac{1}{2}V_0\sqrt{\tau}. \tag{1.111}$$

Thus, using Eqs. (1.110) and (1.111), we can write Eq. (1.107) as

$$\begin{aligned} V(t) &= \frac{1}{\sqrt{\tau}}\left[\frac{1}{2}V_0\sqrt{\tau} - \frac{V_0\sqrt{\tau}}{i2\pi}\left(\sum_{m=-\infty}^{-1}\frac{1}{m}e^{i2\pi mt/\tau} + \sum_{m=1}^{\infty}\frac{1}{m}e^{i2\pi mt/\tau}\right)\right] \\ &= \frac{1}{\sqrt{\tau}}\left[\frac{1}{2}V_0\sqrt{\tau} - \frac{V_0\sqrt{\tau}}{i2\pi}\sum_{m=1}^{\infty}\frac{1}{m}\left(e^{i2\pi mt/\tau} - e^{-i2\pi mt/\tau}\right)\right] = V_0\left[\frac{1}{2} - \frac{1}{\pi}\sum_{m=1}^{\infty}\frac{1}{m}\sin\left(\frac{2\pi mt}{\tau}\right)\right]. \end{aligned} \tag{1.112}$$

Alternatively, we could have used Eqs. (1.91), (1.92) and (1.93). In this case, you can check that all the coefficients a_m with m > 0 are zero. Using Eq. (1.92) for the case m = 0, we have

$$a_0 = \frac{2}{\tau}\int_0^{\tau} dt\, V(t) = \frac{2V_0}{\tau^2}\int_0^{\tau} dt\, t = V_0. \tag{1.113}$$

Using now Eq. (1.93) and integrating by parts,

$$b_m = \frac{2}{\tau}\int_0^{\tau} dt\, V(t)\sin\left(\frac{2\pi m}{\tau}t\right) = \frac{2V_0}{\tau^2}\int_0^{\tau} dt\, t\sin\left(\frac{2\pi m}{\tau}t\right) = \frac{2V_0}{\tau^2}\left[-\frac{\tau}{2\pi m}\, t\cos\left(\frac{2\pi m}{\tau}t\right)\Bigg|_0^{\tau} + \left(\frac{\tau}{2\pi m}\right)^2\sin\left(\frac{2\pi m}{\tau}t\right)\Bigg|_0^{\tau}\right] = -\frac{V_0}{\pi m}, \qquad m > 0. \tag{1.114}$$

From Eq. (1.91) we get the following Fourier series,

$$V(t) = V_0\left[\frac{1}{2} - \frac{1}{\pi}\sum_{m=1}^{\infty}\frac{1}{m}\sin\left(\frac{2\pi m}{\tau}t\right)\right], \tag{1.115}$$

which coincides with the result in Eq. (1.112), as expected.

Figure 1.6 shows the graphical representation of the above series keeping the first few terms.
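As a cross-check, here is a minimal sketch, assuming NumPy and SciPy are available, comparing the numerically computed coefficients of Eq. (1.93) with the closed form of Eq. (1.114):

```python
import numpy as np
from scipy.integrate import quad

V0, tau = 1.0, 1.0
V = lambda t: V0 * t / tau                      # Eq. (1.106), one period

for m in (1, 2, 5):
    bm = (2/tau) * quad(lambda t: V(t) * np.sin(2*np.pi*m*t/tau), 0, tau)[0]
    print(m, bm, -V0/(np.pi*m))                 # matches Eq. (1.114)
```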
Figure 1.6: Various approximations to the Fourier series in Eq. (1.112). The dashed line corresponds to the first term of the series, the thick grey line is the result keeping the first 3 terms, and the solid line represents the first 15 terms of the expansion.

1.3 CONVERGENCE OF THE EXPANSION

Let us consider a piecewise continuous function f(x) ∈ L²(a, b) with period T = b − a. We start by defining a vector |f_n⟩ as

$$|f_n\rangle = \sum_{k=1}^{n} f_k|e_k\rangle, \tag{1.116}$$

such that Eq. (1.53) becomes

$$|f\rangle = \sum_{k=1}^{\infty} f_k|e_k\rangle = \lim_{n\to\infty}|f_n\rangle. \tag{1.117}$$

We can then consider |f_n⟩ as an approximation to |f⟩ which gets better and better as n increases indefinitely, and what we need to know is how big the deviation of |f_n⟩ from |f⟩ is when n → ∞. A useful measure of the deviation of |f_n⟩ from |f⟩, which applies simultaneously to the whole interval a ≤ x ≤ b, is given by the following inner product:

$$E_n \equiv \langle f - f_n|f - f_n\rangle = \int_a^b dx\,|f(x) - f_n(x)|^2 = \langle f|f\rangle - \langle f|f_n\rangle - \langle f_n|f\rangle + \langle f_n|f_n\rangle. \tag{1.118}$$

Notice that for an arbitrary function f(x) ∈ L²(a, b),

$$\langle f|f\rangle = \int_a^b dx\,|f|^2, \tag{1.119}$$

and the average value²

$$\overline{\langle f|f\rangle} = \frac{1}{b-a}\langle f|f\rangle = \overline{|f|^2}. \tag{1.120}$$

²The average value of a function g(x) on an interval a ≤ x ≤ b is defined as $\bar{g} = \frac{1}{b-a}\int_a^b dx\, g(x)$.

Hence, we see that E_n/(b − a) represents the mean of the square of the error f(x) − f_n(x) over the interval [a, b]. It is customary to call E_n simply the mean-square error of the approximation of |f⟩ by |f_n⟩. Then, if

$$\lim_{n\to\infty} E_n = 0, \tag{1.121}$$

we say that |f_n⟩ converges in the mean to |f⟩. This type of convergence is called mean-square convergence, and one writes

$$|f\rangle = \lim_{n\to\infty}|f_n\rangle, \tag{1.122}$$

which is read: |f⟩ equals the limit in the mean of the sequence |f_n⟩ as n approaches infinity. What we are going to show is that Eq. (1.121) is satisfied as long as f_k = ⟨e_k|f⟩.

Using Eqs. (1.53) and (1.116) in Eq. (1.118), we get

$$E_n = \langle f|f\rangle - \left(\sum_{k=1}^{n} f_k^*\langle e_k|f\rangle + \sum_{k=1}^{n} f_k\langle f|e_k\rangle\right) + \sum_{k=1}^{n}\sum_{l=1}^{n} f_k^* f_l\langle e_k|e_l\rangle = \langle f|f\rangle - \sum_{k=1}^{n}\left(f_k^*\langle e_k|f\rangle + f_k\langle f|e_k\rangle\right) + \sum_{k=1}^{n} f_k^* f_k, \tag{1.123}$$
where we have made use of Eq. (1.65), i.e.,

$$\langle e_k|e_l\rangle = \delta_{kl}. \tag{1.124}$$

Since ⟨e_k|f⟩ = ⟨f|e_k⟩*, and noticing that

$$(f_k - \langle e_k|f\rangle)^*(f_k - \langle e_k|f\rangle) = f_k^* f_k - \left(f_k^*\langle e_k|f\rangle + f_k\langle e_k|f\rangle^*\right) + \langle e_k|f\rangle^*\langle e_k|f\rangle, \tag{1.125}$$

we can write Eq. (1.123) as

$$E_n = \langle f|f\rangle - \sum_{k=1}^{n}\langle e_k|f\rangle^*\langle e_k|f\rangle + \sum_{k=1}^{n}(f_k - \langle e_k|f\rangle)^*(f_k - \langle e_k|f\rangle). \tag{1.126}$$

Then, the mean-square error in the approximation of |f⟩ by |f_n⟩ as given by Eq. (1.116) is minimized when the last term in Eq. (1.126) vanishes,

$$\sum_{k=1}^{n}(f_k - \langle e_k|f\rangle)^*(f_k - \langle e_k|f\rangle) = 0, \tag{1.127}$$

i.e., when

$$f_k = \langle e_k|f\rangle. \tag{1.128}$$

Thus, choosing the expansion coefficients f_k in Eq. (1.116) to be the finite integral transform of f(x) exactly minimizes the mean-square error in the approximation of f(x) by f_n(x). We then have

$$\min(E_n) = \langle f|f\rangle - \sum_{k=1}^{n} f_k^* f_k. \tag{1.129}$$

Returning to Eq. (1.118), we can write it as

$$E_n = \int_a^b dx\,|f(x) - f_n(x)|^2, \tag{1.130}$$

so that, since the integrand is a positive function, E_n ≥ 0 for b ≥ a, and, therefore,

$$\min(E_n) \ge 0, \qquad b \ge a, \tag{1.131}$$

as well. Thus, from Eq. (1.129),

$$\sum_{k=1}^{n} f_k^* f_k \le \langle f|f\rangle, \tag{1.132}$$
or

$$\sum_{k=1}^{n}\langle f|e_k\rangle\langle e_k|f\rangle \le \langle f|f\rangle. \tag{1.133}$$

Since |f_k|² ≥ 0, the sequence

$$\sum_{k=1}^{n}|f_k|^2 \tag{1.134}$$

is non-decreasing. Then, provided that there is some finite constant M such that

$$0 < \langle f|f\rangle = \int_a^b dx\,|f(x)|^2 < M, \tag{1.135}$$

equation (1.132) gives

$$\sum_{k=1}^{n}|f_k|^2 < M, \tag{1.136}$$

which holds independently of the value of n. Therefore

$$\lim_{n\to\infty}\sum_{k=1}^{n}|f_k|^2 \tag{1.137}$$

must exist, and we have

$$\sum_{k=1}^{\infty}|f_k|^2 \le \langle f|f\rangle, \tag{1.138}$$

where

$$f_k = \langle e_k|f\rangle = \int_a^b dx\, e_k^*(x) f(x). \tag{1.139}$$

Equation (1.138) is known as Bessel's inequality, and since we are considering f(x) ∈ L²(a, b), Eq. (1.136), and thus Eq. (1.138), holds. We have then obtained that the series Σ_{k=1}^∞ |f_k|² converges, from which it follows that

$$\lim_{k\to\infty} f_k = \lim_{k\to\infty}\langle e_k|f\rangle = 0. \tag{1.140}$$
Next, we shall derive a necessary and sufficient condition for the mean-square error (1.118) to approach zero as n → ∞. Since both the error E_n and the minimum error min(E_n) are non-negative, we have

$$0 \le \min(E_n) \le E_n. \tag{1.141}$$

Therefore,

$$\lim_{n\to\infty} E_n = 0 \tag{1.142}$$

only if

$$\lim_{n\to\infty}\min(E_n) = 0. \tag{1.143}$$

By using now Eq. (1.129), we get

$$\sum_{k=1}^{\infty}|f_k|^2 = \langle f|f\rangle = \int_a^b dx\,|f(x)|^2. \tag{1.144}$$

Equation (1.144) is called Parseval's equality [or frequently the equation of completeness (see Sec. 1.2)] and provides a necessary and sufficient condition for the series

$$\sum_{k=1}^{\infty} f_k|e_k\rangle, \tag{1.145}$$

with f_k = ⟨e_k|f⟩, to converge to |f⟩ in the mean-square sense. It is this idea which makes precise the meaning of the equality

$$|f\rangle = \sum_{k=1}^{\infty}\langle e_k|f\rangle|e_k\rangle. \tag{1.146}$$
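Parseval's equality can be verified directly for the square wave of Example 1.3. A minimal sketch, assuming NumPy is available (the coefficient formulas are those of Eqs. (1.99) and (1.100)):

```python
import numpy as np

V0, tau = 1.0, 1.0
lhs = V0**2 * tau                               # int_0^{2 tau} |V(t)|^2 dt

m = np.arange(1, 200001, 2)                     # odd m > 0
rhs = (V0**2 * tau / 2                          # |V_{m=0}|^2, Eq. (1.100)
       + 2 * np.sum(2 * tau * V0**2 / (m * np.pi)**2))  # +-m, Eq. (1.99)
print(lhs, rhs)                                 # 1.0 vs ~1.0, Eq. (1.144)
```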

A sequence of vectors like {|f_n⟩}_{n=1}^∞ defined in Eq. (1.116), which satisfies Eq. (1.121),

$$\lim_{n\to\infty}\langle f_n - f|f_n - f\rangle = 0, \tag{1.147}$$

also satisfies

$$\lim_{n,m\to\infty}\langle f_n - f_m|f_n - f_m\rangle = 0. \tag{1.148}$$

Such a sequence of vectors {|f_n⟩}_{n=1}^∞ is called a Cauchy sequence. Equation (1.147) states that every Cauchy sequence of vectors in L²(a, b) has a limit vector in L²(a, b), i.e., there exists a vector |f⟩ ∈ L²(a, b) such that |f⟩ can be interpreted as the limit when n → ∞ of the sequence {|f_n⟩}. Vector spaces for which Eq. (1.147) is true are called complete. Thus, L²(a, b) is an inner product space which is complete, and inner product spaces which are complete are also known as Hilbert spaces. Hilbert space is the natural way to let the number of dimensions become infinite, and at the same time to keep the geometry of ordinary Euclidean space. Physicists in the 1920s realized that Hilbert space was the correct setting in which to establish Quantum Mechanics!
Instead of considering the mean-square error as in Eq. (1.118), we can examine the actual error in approximating f(x) by

$$f_n(x) = \sum_{k=1}^{n} f_k e_k(x). \tag{1.149}$$

To do this, we define E_n(x) as

$$E_n(x) = f(x) - f_n(x). \tag{1.150}$$

Notice that E_n(x) depends upon the particular value of x in the interval a ≤ x ≤ b. If we could show that

$$\lim_{n\to\infty} E_n(x) = 0 \tag{1.151}$$

for each a ≤ x ≤ b, then we could say that f_n(x) converges pointwise to f(x) in the interval [a, b]. The exact mathematical conditions under which an expansion in an arbitrary complete set of orthogonal functions is pointwise convergent lie outside the scope of our discussion.

We have then established that the orthogonal expansion of a square-integrable piecewise continuous function converges in the mean-square sense. Nevertheless, a piecewise continuous function f(x) can have a finite number of finite discontinuities in the interval a ≤ x ≤ b. Near such finite discontinuities the approximating function

$$f_n(x) = \sum_{k=1}^{n} f_k e_k(x) \tag{1.152}$$

fails to match the jump in f(x). However, the series itself does not produce a discontinuous function, as we saw in Example 1.3. In fact, it is possible to prove that at a point of finite discontinuity x_d, the Fourier series converges to

$$\frac{1}{2}\lim_{\epsilon\to 0}\big[f(x_d + \epsilon) + f(x_d - \epsilon)\big]. \tag{1.153}$$
This behavior (see Fig. 1.7), which is called the Gibbs phenomenon, can be qualitatively understood on the following basis: at the point x_d, the slope of f(x), i.e., df/dx, becomes infinite. However, f_n(x) consists of the sum of a finite number of smoothly varying functions forming the first n terms of a convergent series. Therefore, df_n/dx must be a smoothly varying bounded function for a ≤ x ≤ b. Hence, df_n/dx cannot match the infinite slope df/dx of f(x) at a point of discontinuity x = x_d in [a, b]. The finite series f_n(x) tries to achieve the infinite slope of f(x) at x = x_d and thereby overshoots the discontinuity by a certain amount. As more terms of the series are included, the overshoot δ moves arbitrarily close to the discontinuity, producing spikes of zero thickness, but it never disappears, even in the limit of an infinite number of terms. Since these additional spikes have zero thickness, they do not affect the mean-square convergence of the infinite series for f(x), but they do indicate the limitations of the process of representing f(x) by an orthogonal expansion. The amount by which lim_{n→∞} f_n(x) overshoots f(x) at the discontinuity x = x_d depends on the precise forms of both f(x) and the functions e_k(x). In general it is of the order of 9 percent of the jump in f(x) at x = x_d.
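A minimal sketch, assuming NumPy is available, that measures this overshoot for the square-wave series of Eq. (1.101); the value stalls near 9% of the jump no matter how many terms are kept:

```python
import numpy as np

def V_partial(t, n_terms, V0=1.0, tau=1.0):
    # Partial sum of Eq. (1.101), evaluated on an array of times t.
    s = np.zeros_like(t)
    for k in range(n_terms):
        s += np.sin((2*k + 1) * np.pi * t / tau) / (2*k + 1)
    return V0 * (0.5 + (2/np.pi) * s)

t = np.linspace(0.0, 0.1, 100001)   # fine grid just to the right of the jump at t = 0
for n in (10, 100, 1000):
    overshoot = V_partial(t, n).max() - 1.0
    print(n, overshoot)             # stays near 0.09; it does not shrink with n
```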
A final comment is here in order: if we consider a periodic function f ∈ L(a, b), i.e., a function for which

$$\langle f|f\rangle = \int_a^b dx\,\big[f^*(x)f(x)\big]^{1/2} = \int_a^b dx\,|f(x)| \tag{1.154}$$

exists and is finite, and calculate the Fourier coefficients for such a function, we do not necessarily have that

$$\sum_{k=-\infty}^{\infty}|f_k|^2 < \infty, \tag{1.155}$$

since that happens only if f ∈ L²(a, b). However, there is an important lemma for f ∈ L(a, b), called the Riemann-Lebesgue lemma, which says that the coefficients

$$f_m = \frac{1}{\sqrt{T}}\int_a^b dx\, e^{-im(2\pi/T)x} f(x) \tag{1.156}$$

of any function f ∈ L(a, b) tend to zero as |m| → ∞. Thus the partial sums

$$f_n(x) = \frac{1}{\sqrt{T}}\sum_{m=-n}^{n} f_m e^{im(2\pi/T)x} \tag{1.157}$$
converge to f in the sense of the distance in L(a, b), Eq. (1.154), i.e.,

$$\lim_{n\to\infty}\langle f_n - f|f_n - f\rangle = \lim_{n\to\infty}\int_a^b dx\,\Big(\big[f_n(x) - f(x)\big]^*\big[f_n(x) - f(x)\big]\Big)^{1/2} = 0. \tag{1.158}$$

Thus, if f ∈ L(a, b), we can set up a series for it of the form

$$f(x) = \frac{1}{\sqrt{T}}\sum_{m=-\infty}^{\infty} f_m e^{im(2\pi/T)x}, \tag{1.159}$$

with

$$f_m = \frac{1}{\sqrt{T}}\int_a^b dx\, e^{-im(2\pi/T)x} f(x), \tag{1.160}$$

and we continue calling such an expansion a Fourier expansion.

Figure 1.7: The convergence of a Fourier series expansion of a square-wave function, including (a) one term, (b) two terms, (c) three terms and (d) 20 terms. The overshoot δ is shown in (d).

1.4 FOURIER SERIES AND NON-PERIODIC FUNCTIONS

What happens if the function f(x) is non-periodic in the given fixed range? In such a case, we may propose another function, let us call it g(x), which continues the original one outside the range so as to make it periodic. The Fourier series of this periodic function g(x) would then correctly represent the non-periodic function f(x) in the desired range. Since we are often at liberty to extend the function f(x) in a number of ways, we can sometimes make g(x) odd or even on a symmetric interval about the origin and thus simplify the calculation of the Fourier coefficients. In view of the Gibbs phenomenon explained earlier, the choices for g(x) may be reduced, since g(x) must not be discontinuous at the end-points of the interval of interest; otherwise the Fourier series will not converge to the required value there.

EXAMPLE 1.5. Let us consider, for example, the function f(x) = x², with 0 ≤ x ≤ 2, which is clearly a non-periodic function. To determine a Fourier expansion which can be related to f(x) we must first make the function periodic. We do this by extending the range of interest to −2 ≤ x ≤ 2 in such a way that the new function g(x) is an even function, i.e., g(x) = g(−x), and letting g(x + T) = g(x), where T is the period (in this case, T = 4); see Fig. 1.8 (top). In this way, all the coefficients b_m will be zero. We could also extend the range so as to make the function g(x) odd, i.e., g(x) = −g(−x), and then make g(x) periodic in such a way that g(x + T) = g(x); see Fig. 1.8 (bottom). In this case all the a_m, m ≥ 0, will be zero. Note, however, that with the latter choice, due to the Gibbs phenomenon, the Fourier expansion of g(x) will converge to zero at x = ±2, while the original function f(x = ±2) = 4. The even extension is then better, because the Fourier expansion will converge to the original values of f(x) at x = ±2. Let us consider g(x) to be the even extension of f(x). Since the interval considered is symmetric about the origin and g(x) is even, all b_m = 0.
Figure 1.8: Periodic extensions of the function f(x) = x². (Top) Even extension of f(x) plotted in the range −6 ≤ x ≤ 6. (Bottom) Odd extension of f(x) plotted in the range −6 ≤ x ≤ 6.
Using Eq. (1.92),

$$a_m = \frac{2}{4}\int_{-2}^{2} dx\, x^2\cos\left(\frac{2\pi mx}{4}\right) = \frac{4}{4}\int_0^2 dx\, x^2\cos\left(\frac{2\pi mx}{4}\right), \tag{1.161}$$

where in the last step we have made use of the fact that g(x) is even in x. Thus, integrating by parts twice, we get

$$a_m = \frac{2}{\pi m}\, x^2\sin\left(\frac{\pi mx}{2}\right)\Bigg|_0^2 - \frac{4}{\pi m}\int_0^2 dx\, x\sin\left(\frac{\pi mx}{2}\right) = \frac{8}{\pi^2 m^2}\, x\cos\left(\frac{\pi mx}{2}\right)\Bigg|_0^2 - \frac{8}{\pi^2 m^2}\int_0^2 dx\,\cos\left(\frac{\pi mx}{2}\right) = \frac{16}{\pi^2 m^2}\cos(\pi m) = \frac{16}{\pi^2 m^2}(-1)^m, \qquad m > 0. \tag{1.162}$$

In the case m = 0, from Eq. (1.92),

$$a_0 = \frac{2}{4}\int_{-2}^{2} dx\, g(x) = \frac{4}{4}\int_0^2 dx\, g(x) = \int_0^2 dx\, x^2 = \frac{8}{3}. \tag{1.163}$$

In this way, using Eq. (1.91), we can write the Fourier expansion of g(x) in the range −2 ≤ x ≤ 2 as

$$g(x) = \frac{4}{3} + \frac{16}{\pi^2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^2}\cos\left(\frac{\pi mx}{2}\right). \tag{1.164}$$

In particular, in the range 0 ≤ x ≤ 2,

$$f(x) = \frac{4}{3} + \frac{16}{\pi^2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^2}\cos\left(\frac{\pi mx}{2}\right). \tag{1.165}$$

For instance, from Eq. (1.165),

$$f(0) = \frac{4}{3} + \frac{16}{\pi^2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^2} = \frac{4}{3} + \frac{16}{\pi^2}\left(-\frac{\pi^2}{12}\right) = \frac{4}{3} - \frac{4}{3} = 0,$$

$$f(2) = \frac{4}{3} + \frac{16}{\pi^2}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^2}\cos(\pi m) = \frac{4}{3} + \frac{16}{\pi^2}\sum_{m=1}^{\infty}\frac{1}{m^2} = \frac{4}{3} + \frac{16}{\pi^2}\frac{\pi^2}{6} = \frac{4}{3} + \frac{8}{3} = 4. \tag{1.166}$$
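A minimal sketch, assuming NumPy is available, evaluating the truncated series of Eq. (1.165) and confirming the endpoint values computed in Eq. (1.166):

```python
import numpy as np

def f_series(x, n_terms=2000):
    # Truncated Fourier series of Eq. (1.165) for the even extension of x^2.
    m = np.arange(1, n_terms + 1)
    return 4/3 + (16/np.pi**2) * np.sum(
        (-1.0)**m / m**2 * np.cos(np.pi * m * x / 2))

for x in (0.0, 1.0, 2.0):
    print(x, f_series(x), x**2)   # converges to x^2, including the endpoints
```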
1.5 INTEGRATION AND DIFFERENTIATION OF THE SERIES

It is sometimes possible to find the Fourier series of a function by integration


or differentiation of another Fourier series. In general, the orthogonal expan-
sion of a piecewise continuous function f (x), with period T = b − a, in a com-
plete set of functions {e k (x)} can always be integrated term by term. Clearly,
when integrating in such a way there is a constant of integration that must be
found. The integral series will converge in the mean-square sense to the inte-
gral of f (x). There is, however, a complication: the integral of a periodic func-
tion is not necessarily periodic. The simplest example is the constant function
1, which is certainly periodic, but its integral, namely x, is not. On the other
hand, integrals of all the other periodic sine and cosine functions appearing
in the Fourier series are periodic. Thus, only the integration of the constant
term a 0 /2 appearing in Eq. (1.91) might cause us difficulty when we try to in-
tegrate a Fourier series as in Eq. (1.91). If g (x) is the Fourier series obtained by
integrating term by term the Fourier series of f (x), we get, from Eq. (1.91)

g(x) = \int_0^x dy\, f(y) = C + \frac{a_0}{2}x + \sum_{m=1}^{\infty}\left[\frac{a_m}{2\pi m/T}\sin\left(\frac{2\pi m}{T}y\right)\Big|_0^x - \frac{b_m}{2\pi m/T}\cos\left(\frac{2\pi m}{T}y\right)\Big|_0^x\right], \qquad (1.167)

where C is a constant of integration. In this way,

g(x) = \int_0^x dy\, f(y) = C + \sum_{m=1}^{\infty}\frac{b_m}{2\pi m/T} + \frac{a_0}{2}x + \sum_{m=1}^{\infty}\left[-\frac{b_m}{2\pi m/T}\cos\left(\frac{2\pi m}{T}x\right) + \frac{a_m}{2\pi m/T}\sin\left(\frac{2\pi m}{T}x\right)\right]. \qquad (1.168)

Since \sum_{m=1}^{\infty} b_m/(2\pi m/T) is a finite number, we can reabsorb this term into
the constant of integration C, which we need to determine, i.e.,

C + \sum_{m=1}^{\infty}\frac{b_m}{2\pi m/T} \to C, \qquad (1.169)

and write

g(x) = \int_0^x dy\, f(y) = C + \frac{a_0}{2}x + \sum_{m=1}^{\infty}\left[-\frac{b_m}{2\pi m/T}\cos\left(\frac{2\pi m}{T}x\right) + \frac{a_m}{2\pi m/T}\sin\left(\frac{2\pi m}{T}x\right)\right]. \qquad (1.170)

The right hand side of Eq. (1.170) is not, strictly speaking, a Fourier series,
since we do not have a periodic function due to the term a 0 x/2. There are two
ways to interpret this formula within the Fourier framework. Either we can
write

g(x) - \frac{a_0}{2}x = C + \sum_{m=1}^{\infty}\left[-\frac{b_m}{2\pi m/T}\cos\left(\frac{2\pi m}{T}x\right) + \frac{a_m}{2\pi m/T}\sin\left(\frac{2\pi m}{T}x\right)\right] \qquad (1.171)

and interpret the right hand side as the Fourier series of the function on the
left hand side, or, alternatively, we could replace the function x by its Fourier
series. Then, we can consider the right hand side of Eq. (1.170) as the Fourier
series of the integral \int_0^x dy\, f(y).
Considering the properties of the sine and cosine functions, it is easy to
show that

\frac{1}{T}\int_a^b dx\, \cos\left(\frac{2\pi m}{T}x\right) = \frac{1}{T}\int_a^b dx\, \sin\left(\frac{2\pi m}{T}x\right) = 0. \qquad (1.172)

Then, the constant C in Eq. (1.171) can be obtained as

C = \frac{1}{T}\int_a^b dx\left(g(x) - \frac{a_0}{2}x\right). \qquad (1.173)

Note that, from Eq. (1.92),

\frac{a_0}{2} = \frac{1}{T}\int_a^b dx\, f(x) = \frac{1}{b-a}\int_a^b dx\, f(x) = \bar{f} \qquad (1.174)

is the mean or average of the function f(x) on the interval [a, b]. If the function
f(x) has zero mean, i.e., a_0 = 0, then g(x) can be considered as the Fourier
series of \int_0^x dy\, f(y). A function therefore has no a_0 term in its Fourier series
if and only if it has zero mean, and it can easily be shown that the zero-mean
functions are precisely the ones which remain periodic upon integration.
In any case, we can see from Eq. (1.171) that the effect of integration is to
place an additional power of m in the denominator of each coefficient. Thus,
the integrated series converges faster than the original one.
Term-by-term differentiation, however, is a much more precarious mat-
ter. Differentiating the series produces factors of m in the numerators, which
can destroy the convergence of the series. Therefore, to justify taking the
derivative of a Fourier series, we need to know that the differentiated function
remains reasonably nice. If f(x) is a continuous function of x for all x (thus,
f(x) has no finite jumps) and f(x) is also periodic, then the Fourier series
that results from differentiating term by term converges to df/dx, provided
that df/dx itself is piecewise continuous.
These properties of the Fourier series may be useful in calculating com-
plicated Fourier series, since simple Fourier series may easily be evaluated (or
found in standard tables) and more complicated series can then often be built
up by integration and/or differentiation.

EXAMPLE 1.6. Let's determine the Fourier series of x³ for 0 < x ≤ 2 by using
the result from Example 1.5, in which we determined the Fourier series for x²
in the range 0 < x ≤ 2 by extending the function x² to make it periodic in the
range −2 ≤ x ≤ 2. If
g(x) = \int_0^x dy\, f(y) = \int_0^x dy\, y^2 = \frac{x^3}{3}, \qquad (1.176)

we can get the Fourier series for g(x) in the range 0 < x ≤ 2 by using Eqs. (1.171),
(1.162) and (1.163). In this way, we can write

g(x) - \frac{4}{3}x = C + \sum_{m=1}^{\infty}\frac{1}{2\pi m/4}\,\frac{16}{\pi^2}\frac{(-1)^m}{m^2}\sin\left(\frac{2\pi m x}{4}\right)
 = C + \frac{32}{\pi^3}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^3}\sin\left(\frac{\pi m x}{2}\right). \qquad (1.177)

Using Eq. (1.173), we can determine the constant C as

C = \frac{1}{4}\int_{-2}^{2} dx\left(\frac{x^3}{3} - \frac{4}{3}x\right) = 0, \qquad (1.178)

since the integrand is odd. Then,
g(x) = \frac{4}{3}x + \frac{32}{\pi^3}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^3}\sin\left(\frac{\pi m x}{2}\right). \qquad (1.179)
In this way, we can write, for 0 < x ≤ 2,

\frac{x^3}{3} = \frac{4}{3}x + \frac{32}{\pi^3}\sum_{m=1}^{\infty}\frac{(-1)^m}{m^3}\sin\left(\frac{\pi m x}{2}\right). \qquad (1.180)
Note, however, that this is not yet the Fourier expansion of x³, since the term 4x/3
appears in the series and we need to express Eq. (1.180) entirely in terms of sine
and cosine functions. To solve this issue, we can differentiate the expression in
Eq. (1.165), which corresponds to the Fourier expansion of x² in the range
0 < x ≤ 2, and which was obtained from Eq. (1.164). Then, we have

2x = -\frac{8}{\pi}\sum_{m=1}^{\infty}\frac{(-1)^m}{m}\sin\left(\frac{\pi m x}{2}\right). \qquad (1.181)
Substituting Eq. (1.181) in Eq. (1.180), we can write the full Fourier expansion
of x³ as

x^3 = \frac{16}{\pi}\sum_{m=1}^{\infty}\frac{(-1)^m}{m}\left(\frac{6}{\pi^2 m^2} - 1\right)\sin\left(\frac{\pi m x}{2}\right). \qquad (1.182)

Be careful with differentiating term by term! We might think of differentiating
Eq. (1.181) term by term to get df/dx with f(x) = x. We find

1 = -2\sum_{m=1}^{\infty}(-1)^m\cos\left(\frac{\pi m x}{2}\right) = 2\sum_{m=1}^{\infty}(-1)^{m+1}\cos\left(\frac{\pi m x}{2}\right), \qquad (1.183)
which makes no sense, since the left-hand side of the equation is a constant,
while the right-hand side depends on x! Why is this happening? Because
Eq. (1.181) is obtained from the periodic extension of x², the series in
Eq. (1.181) does not converge to x, but rather to its periodic extension. In fact,
since there are only sine terms in Eq. (1.181), the series corresponds to the one
obtained for an odd periodic extension of x (see Fig. 1.9).
Such a periodic extension of x has a jump discontinuity at odd multiples of
2; thus, the function is not continuous for all values of x, and term-by-term dif-
ferentiation of the corresponding Fourier series does not converge to d(x)/dx = 1.

Figure 1.9: Odd extension of the function f (x) = x.
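As a sanity check of Eq. (1.182), we can sum the series numerically. This is a sketch assuming NumPy; note the slow 1/m decay of the coefficients, which reflects the jump of the odd periodic extension of x³ at x = ±2, so many terms are needed.

```python
# Partial sums of the sine series for x^3 on (0, 2], Eq. (1.182).
import numpy as np

def x3_series(x, n_terms=20000):
    m = np.arange(1, n_terms + 1)
    coef = ((-1.0)**m / m) * (6/(np.pi**2 * m**2) - 1)
    return (16/np.pi) * np.sum(coef * np.sin(np.pi * m * x / 2))

for x in (0.5, 1.0, 1.5):
    print(x**3, x3_series(x))   # the two columns should agree
```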

1.6 THE FOURIER TRANSFORM


The Fourier series representation of a function f(x) is valid for the entire real
line as long as f(x) is periodic. But in many situations we might have func-
tions defined on some interval [a, b], where a and b can be −∞ and +∞,
respectively, which are not periodic. It would be useful if we could also ex-
pand such functions in some form of Fourier "series" without having to repeat
the function outside the interval. How can we do this? Let us start by consid-
ering a non-periodic function f(x) defined on the interval [a, b]. As a specific
case, suppose we are interested in representing a function f(x) that is defined
only on the interval [a, b] and is assigned the value zero everywhere else (see
Fig. 1.10).

Figure 1.10: An example of a function f (x) we want to represent.

Next, we can extend the range of f(x) by introducing the function f_Λ(x),
defined in the interval [a − Λ/2, b + Λ/2], where Λ is an arbitrary positive
number, with period L + Λ, where L = b − a:

f_\Lambda(x) = \begin{cases} 0 & \text{if } a - \Lambda/2 < x < a, \\ f(x) & \text{if } a \le x \le b, \\ 0 & \text{if } b < x < b + \Lambda/2. \end{cases} \qquad (1.184)

Figure 1.11: By introducing the function f_Λ, we create a periodic version of
f(x) where the different "copies" of the function are separated.

In this way, we have managed to separate the various copies of the original func-
tion by Λ (see Fig. 1.11). It should be clear that if Λ → ∞, we can completely
isolate the function and stop the repetition. In other words,

\lim_{\Lambda \to \infty} f_\Lambda(x) = f(x). \qquad (1.185)

We can now obtain the Fourier expansion of f_Λ(x). Using Eqs. (1.87) and
(1.88), we have

f_\Lambda(x) = \frac{1}{\sqrt{L+\Lambda}}\sum_{m=-\infty}^{\infty} f_m^\Lambda\, e^{i2\pi m x/(L+\Lambda)}, \qquad (1.186)

where

f_m^\Lambda = \frac{1}{\sqrt{L+\Lambda}}\int_{a-\Lambda/2}^{b+\Lambda/2} dx\, f_\Lambda(x)\, e^{-i2\pi m x/(L+\Lambda)}. \qquad (1.187)

Let us now introduce the variables

k = m\,\frac{2\pi}{L+\Lambda} = m\,\Delta k, \qquad \Delta k = \frac{2\pi}{L+\Lambda}. \qquad (1.188)

In this way, the summation over m in Eq. (1.186) can be written as a summa-
tion over k, and f_m^Λ can be considered as a function of k, i.e., f̃^Λ(k). We can then

rewrite Eqs. (1.186) and (1.187) as

f_\Lambda(x) = \frac{1}{\sqrt{2\pi}}\sum_{k=-\infty}^{\infty} \tilde{f}^\Lambda(k)\, e^{ikx}\, \Delta k, \qquad (1.189)

with

\tilde{f}^\Lambda(k) = \frac{1}{\sqrt{2\pi}}\int_{a-\Lambda/2}^{b+\Lambda/2} dx\, f_\Lambda(x)\, e^{-ikx}. \qquad (1.190)

Effectively, note that when substituting Eq. (1.190) in (1.189),

\frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{2\pi}}\Delta k = \frac{1}{\sqrt{2\pi}}\frac{1}{\sqrt{2\pi}}\frac{2\pi}{L+\Lambda} = \frac{1}{\sqrt{L+\Lambda}}\frac{1}{\sqrt{L+\Lambda}}, \qquad (1.191)

which is the same factor that we obtain when substituting Eq. (1.187) into
(1.186).
If we now consider the limit Λ → ∞, ∆k becomes vanishingly small, i.e.,
∆k → dk, and k becomes a continuous variable. In other words, as m changes
by one unit, k changes only slightly. Thus, the infinite sum of terms in the
Fourier series of Eq. (1.189) becomes an integral in the limit Λ → ∞. We then
have

f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk\, \tilde{f}(k)\, e^{ikx}, \qquad (1.192)

with

\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx}. \qquad (1.193)

Equations (1.192) and (1.193) are called the Fourier integral transforms of
f̃(k) and f(x), respectively. The function f̃(k) is called the Fourier transform
of f(x), while f(x) is the inverse Fourier transform of f̃(k). Note that the
factor 1/√(2π) appearing in Eqs. (1.192) and (1.193) is clearly arbitrary, the
only requirement being that the product of the two factors should equal 1/(2π).
It would have been possible, for example, to define

f(x) = \int_{-\infty}^{\infty} dk\, \tilde{f}(k)\, e^{ikx}, \qquad (1.194)

with

\tilde{f}(k) = \frac{1}{2\pi}\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx}, \qquad (1.195)

and you might find different conventions in different books. We stay with the
convention of Eqs. (1.192) and (1.193), which is more symmetric.
In general, the Fourier transform f̃(k) of a function f(x) is a complex-
valued function of k. Thus, we can write f̃(k) in polar form as

\tilde{f}(k) = r(k)\, e^{-i\varphi(k)}, \qquad (1.196)

where r(k) is the modulus of f̃(k) and

\varphi(k) = -\arg[\tilde{f}(k)]. \qquad (1.197)

In this way, from Eq. (1.192) we have

f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk\, r(k)\, e^{i[kx - \varphi(k)]}, \qquad (1.198)

which shows that f(x) is a superposition of simple harmonic oscillations of
continuously varying amplitude r(k), phase φ(k), and wave number k.
Sufficient conditions for the existence of the Fourier transform of f(x) can
be stated as follows: if f(x) is a piecewise continuous function on (−∞, ∞),
and if in addition f ∈ L(−∞, ∞), i.e.,

0 \le \int_{-\infty}^{\infty} dx\, |f(x)| \le M < \infty, \qquad (1.199)

then

\tilde{f}(k) = \int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx} \qquad (1.200)

exists for every real k. Furthermore, f̃(k) is bounded, since

|\tilde{f}(k)| \le \left|\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx}\right| \le \int_{-\infty}^{\infty} dx\, |f(x)| \le M < \infty. \qquad (1.201)

The Riemann–Lebesgue lemma, in this case, establishes that

\lim_{|k|\to\infty} \tilde{f}(k) = 0. \qquad (1.202)

At a point of discontinuity, as in the case of the Fourier series,

\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk\, \tilde{f}(k)\, e^{ikx} = \frac{1}{2}\lim_{\epsilon\to 0}[f(x+\epsilon) + f(x-\epsilon)],
\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx} = \frac{1}{2}\lim_{\epsilon\to 0}[\tilde{f}(k+\epsilon) + \tilde{f}(k-\epsilon)]. \qquad (1.203)

What happens if the function f(x) is an even or odd function? Can we get
a simplification of the Fourier transform, as in the case of the Fourier series?
Note that Eq. (1.193) can be written as

\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)[\cos(kx) - i\sin(kx)] = \tilde{f}_c(k) - i\tilde{f}_s(k), \qquad (1.204)

where

\tilde{f}_c(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\cos(kx), \qquad \tilde{f}_s(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\sin(kx), \qquad (1.205)

are called the Fourier cosine and sine transforms, respectively, of the function
f(x).
If the function f(x) is an even function of x, i.e., f(x) = f(−x),

\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\sin(kx) = 0, \qquad (1.206)

since the integral of an odd function over a symmetric interval about the ori-
gin is zero. Then

\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\cos(kx) = \sqrt{\frac{2}{\pi}}\int_{0}^{\infty} dx\, f(x)\cos(kx), \qquad (1.207)

because the integral of an even function over a symmetric interval about the
origin is twice the integral taken over one half of the interval. This shows that if
f(x) is an even function of x, its Fourier transform is purely real. Similarly, if
f(x) is an odd function of x, i.e., f(x) = −f(−x),

\tilde{f}(k) = -\frac{i}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\sin(kx) = -i\sqrt{\frac{2}{\pi}}\int_{0}^{\infty} dx\, f(x)\sin(kx), \qquad (1.208)

because the integrand in the above equation is again even, i.e., it is the prod-
uct of two odd functions of x.
Given an arbitrary function f (x) we can either take the complex Fourier
transform of f (x) directly, or the sine and cosine Fourier transforms.

EXAMPLE 1.7. Let us evaluate the Fourier transform of the Gaussian function
f(x) = a e^{−bx²}, a, b > 0. Using Eq. (1.193),

\tilde{f}(k) = \frac{a}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, e^{-b(x^2 + ikx/b)} = \frac{a\, e^{-k^2/(4b)}}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, e^{-b(x + ik/2b)^2}. \qquad (1.209)

To evaluate the last integral we need techniques of complex analysis with which
you are already familiar. Thus, we simply give the result:

\int_{-\infty}^{\infty} dx\, e^{-b(x + ik/2b)^2} = \sqrt{\frac{\pi}{b}}. \qquad (1.210)

Then, we have

\tilde{f}(k) = \frac{a}{\sqrt{2b}}\, e^{-k^2/(4b)}, \qquad (1.211)

which is also a Gaussian function.
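Equation (1.211) can be verified by direct numerical quadrature. The following sketch assumes SciPy is available; the values of a, b, and k are arbitrary sample choices. Only the cosine part of e^{−ikx} contributes, since the sine part integrates to zero by parity.

```python
# Numerical check of the Gaussian Fourier transform, Eq. (1.211).
import numpy as np
from scipy.integrate import quad

a, b, k = 2.0, 3.0, 1.5

# Real (cosine) part of f(x) e^{-ikx}; the imaginary part vanishes by parity.
ft_numeric = quad(lambda x: a*np.exp(-b*x**2)*np.cos(k*x),
                  -np.inf, np.inf)[0] / np.sqrt(2*np.pi)
ft_exact = a/np.sqrt(2*b) * np.exp(-k**2/(4*b))
print(ft_numeric, ft_exact)   # the two numbers should agree
```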

You might now be wondering whether we could also associate a vector |f⟩ to the
function f(x), now that we have passed from a summation over the index m to
an integral over the continuous variable k. And the answer is: yes! Let {|e_x⟩}_{x∈ℝ}
be an orthonormal basis of L²(−∞, ∞); we can interpret the number f(x) as
the component with "index x" of |f⟩, i.e.,

f(x) = \langle e_x|f\rangle, \qquad (1.212)

and the functions e_k(x) as the component with "index x" of |e_k⟩, i.e.,

e_k(x) = \langle e_x|e_k\rangle. \qquad (1.213)

We physicists prefer the notation |x⟩ and |k⟩ for the vectors |e_x⟩ and |e_k⟩.
Then, we write

f(x) = \langle x|f\rangle \qquad (1.214)

and

e_k(x) = \langle x|k\rangle. \qquad (1.215)

Note that the inner product of two functions f(x) and g(x) belonging to
L²_ω(a, b) can now be written as

\langle g|f\rangle = \int_a^b dx\, \omega(x)\, g^*(x) f(x) = \int_a^b dx\, \omega(x)\, \langle g|x\rangle\langle x|f\rangle = \langle g|\left(\int_a^b dx\, \omega(x)\, |x\rangle\langle x|\right)|f\rangle; \qquad (1.216)

thus, the vectors {|x⟩}_{x∈ℝ} are such that

\int_a^b dx\, \omega(x)\, |x\rangle\langle x| = 1. \qquad (1.217)

Equation (1.217) is the generalization of Eq. (1.57) to a continuum of values of
the index in the summation, in which case

\sum_i \to \int_a^b dx\, \omega(x). \qquad (1.218)

The meaning of this replacement is evident: given a function f(x_i) defined
over an enumerable set of points x_i in some interval [a, b], distributed with
a density ω(x), the sum \sum_{i=1}^{n} f(x_i)(b-a)/n goes over into the integral \int_a^b dx\, \omega(x) f(x)
when the number of points x_i considered increases indefinitely.
Using Eq. (1.217), we also get

|f\rangle = \left(\int_a^b dx\, \omega(x)\, |x\rangle\langle x|\right)|f\rangle = \int_a^b dx\, \omega(x)\, f(x)\, |x\rangle, \qquad (1.219)

which shows how to expand a vector |f⟩ in terms of the |x⟩'s. If we now take
the inner product of Eq. (1.219) with ⟨x′|, we obtain

\langle x'|f\rangle = f(x') = \int_a^b dx\, \omega(x)\, f(x)\, \langle x'|x\rangle, \qquad (1.220)

where x′ is assumed to lie in the interval [a, b]; otherwise f(x′) = 0 by defi-
nition. This equation, which holds for arbitrary f, tells us immediately that
ω(x)⟨x′|x⟩ is no ordinary function of x and x′. For instance, suppose f(x′) = 0.
Then the result of the integration is always zero, regardless of the behavior of f
at other points. Clearly, there is an infinitude of functions that vanish at x′,
yet all of them give the same integral! Pursuing this line of argument more
quantitatively, one can show, for example, that ω(x)⟨x′|x⟩ = 0 if x ≠ x′. In fact,

\omega(x)\langle x'|x\rangle = \delta(x - x'), \qquad (1.221)

which is the Dirac delta function; remember that, for a function f defined
on the interval [a, b], it has the following property (if this is the first time you
have encountered the Dirac delta function, you should take a look at
Appendix 1.11):

\int_a^b dx\, f(x)\,\delta(x - x') = \begin{cases} f(x') & \text{if } x' \in (a, b), \\ 0 & \text{otherwise.} \end{cases} \qquad (1.222)

Let us now particularize all this to Eqs. (1.192) and (1.193). In our case,
ω(x) = 1 and the range of x is (−∞, ∞). Using Eqs. (1.214) and (1.215), and
interpreting f̃(k) as the component with "index" k of |f̃⟩, i.e.,

\tilde{f}(k) = \langle k|\tilde{f}\rangle, \qquad (1.223)

Eq. (1.192) can be written as

\langle x|f\rangle = \int_{-\infty}^{\infty} dk\, \langle k|\tilde{f}\rangle\langle x|k\rangle = \langle x|\left(\int_{-\infty}^{\infty} dk\, |k\rangle\langle k|\right)|\tilde{f}\rangle, \qquad (1.224)

where

\langle x|k\rangle = \frac{1}{\sqrt{2\pi}}\, e^{ikx}. \qquad (1.225)

Due to the map f → f_k established by the Riesz–Fischer theorem for the Fourier
series, Eq. (1.224) suggests the identification |f̃⟩ ≡ |f⟩, as well as the
identity

\int_{-\infty}^{\infty} dk\, |k\rangle\langle k| = 1, \qquad (1.226)

which is the same as Eq. (1.217) for our particular case, i.e., ω(x) = 1 and the
interval (−∞, ∞). Then, Eq. (1.221) yields

\langle k|k'\rangle = \delta(k - k'), \qquad (1.227)

which, upon insertion of a unit operator, gives an integral representation of
the delta function:

\delta(k - k') = \langle k|1|k'\rangle = \langle k|\left(\int_{-\infty}^{\infty} dx\, |x\rangle\langle x|\right)|k'\rangle = \int_{-\infty}^{\infty} dx\, \langle k|x\rangle\langle x|k'\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dx\, e^{i(k'-k)x}. \qquad (1.228)

Indeed, if we substitute Eq. (1.192) in (1.193), we get

\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, e^{-ikx}\left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk'\, \tilde{f}(k')\, e^{ik'x}\right]
= \frac{1}{2\pi}\int_{-\infty}^{\infty} dx \int_{-\infty}^{\infty} dk'\, \tilde{f}(k')\, e^{i(k'-k)x}
= \int_{-\infty}^{\infty} dk'\, \tilde{f}(k')\left[\frac{1}{2\pi}\int_{-\infty}^{\infty} dx\, e^{i(k'-k)x}\right]
= \int_{-\infty}^{\infty} dk'\, \tilde{f}(k')\,\delta(k' - k) = \tilde{f}(k). \qquad (1.229)

Obviously, proceeding in a similar way, we can also write

\delta(x - x') = \langle x'|1|x\rangle = \langle x'|\left(\int_{-\infty}^{\infty} dk\, |k\rangle\langle k|\right)|x\rangle = \frac{1}{2\pi}\int_{-\infty}^{\infty} dk\, e^{i(x-x')k}, \qquad (1.230)

and use this equation when substituting Eq. (1.193) into (1.192), obtaining the
identity f(x) = f(x).

In this way, {|x⟩}_{x∈ℝ} and {|k⟩}_{k∈ℝ} form two bases of L²(−∞, ∞), and we
can express a vector |f⟩ ∈ L²(−∞, ∞) in terms of these two bases by using
the inner product of L²(−∞, ∞): ⟨x|f⟩ corresponds to the components of |f⟩
in the basis {|x⟩}_{x∈ℝ}, i.e., f(x), while ⟨k|f⟩ represents the components of |f⟩
in the basis {|k⟩}_{k∈ℝ}. How are these two bases connected? They are related
through Eq. (1.225). The Fourier transform, and its inverse, establishes the
way of obtaining f̃(k) = ⟨k|f⟩ given f(x) = ⟨x|f⟩, and vice versa.

Figure 1.12: The square “bump” function of Example 1.8.

EXAMPLE 1.8 (THE HEISENBERG PRINCIPLE). Let us determine the Fourier
transform of the function f(x) defined by (see Fig. 1.12)

f(x) = \begin{cases} b & \text{if } |x| \le a, \\ 0 & \text{if } |x| > a. \end{cases} \qquad (1.231)

Using Eq. (1.193), we have

\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx} = \frac{b}{\sqrt{2\pi}}\int_{-a}^{a} dx\, e^{-ikx} = \frac{2ab}{\sqrt{2\pi}}\,\frac{\sin(ka)}{ka}. \qquad (1.232)
−∞

Simple, right? Let us discuss this result in detail. First, note that if a → ∞, the
function f(x) becomes a constant function over the entire real line, and we
get from Eq. (1.232)

\tilde{f}(k) = \frac{2b}{\sqrt{2\pi}}\lim_{a\to\infty}\frac{\sin(ka)}{k} = \frac{2b}{\sqrt{2\pi}}\,\pi\delta(k). \qquad (1.233)

To get the last equality one has to note that Eq. (1.228) can be written as

\delta(k - k') = \lim_{a\to\infty}\frac{1}{2\pi}\int_{-a}^{a} dx\, e^{i(k'-k)x} = \lim_{a\to\infty}\frac{1}{\pi}\,\frac{\sin(a(k - k'))}{k - k'}. \qquad (1.234)

Next, let b → ∞ and a → 0 in such a way that 2ab, which is the area under
f(x), is 1. Then f(x) will approach the δ-function, and f̃(k) becomes

\tilde{f}(k) = \lim_{b\to\infty,\, a\to 0}\frac{2ab}{\sqrt{2\pi}}\,\frac{\sin(ka)}{ka} = \frac{1}{\sqrt{2\pi}}\lim_{a\to 0}\frac{\sin(ka)}{ka} = \frac{1}{\sqrt{2\pi}}, \qquad (1.235)

where the last limit follows from L'Hôpital's rule. So, since f(x) approaches
the δ-function in the limit considered, we have obtained that the Fourier
transform of the δ-function is the constant 1/√(2π).
Finally, we note that the width of f(x) is ∆x = 2a, and the width of f̃(k) is
roughly the distance, on the k-axis, between its first two roots, k₊ and k₋, on
either side of k = 0: ∆k = k₊ − k₋ = 2π/a. Thus, increasing the width of f(x)
results in a decrease in the width of f̃(k). In other words, when the function
is wide, its Fourier transform is narrow. In the limit of infinite width (a con-
stant function), we get infinite sharpness (the δ-function). The last two state-
ments are very general. For instance, in Example 1.7 the width of f(x), which
is proportional to 1/√b, is in inverse relation to the width of f̃(k), which is
proportional to √b. In fact, it can be shown that ∆x∆k ≥ 1 for any function
f(x). When both sides of this inequality are multiplied by the reduced Planck
constant ħ = h/(2π), the result is the celebrated Heisenberg uncertainty rela-
tion of quantum mechanics, ∆x∆p ≥ ħ, where p = ħk is the momentum of the
particle. In this context, the width of the function, which corresponds to the
so-called wave packet, measures the uncertainty in the position x of a quan-
tum mechanical particle. Similarly, the width of the Fourier transform mea-
sures the uncertainty in k, which is related to the momentum p of the particle
via p = ħk.
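The width trade-off can be made quantitative with a few lines of code. The sketch below, assuming NumPy, locates the first positive zero k₊ of Eq. (1.232) for several values of a and shows that the product ∆x∆k stays constant (equal to 4π with these rough definitions of width); note that np.sinc is the normalized sinc, so sin(ka)/(ka) = np.sinc(k a/π).

```python
# Width of the box function versus width of its transform, Example 1.8.
import numpy as np

b = 1.0
for a in (0.5, 2.0, 8.0):
    k = np.linspace(1e-6, 4*np.pi/a, 100001)
    ft = 2*a*b/np.sqrt(2*np.pi) * np.sinc(k*a/np.pi)   # Eq. (1.232)
    k_plus = k[np.argmax(ft < 0)]        # first sign change, k_+ ~ pi/a
    print(a, k_plus, np.pi/a, 2*a * 2*k_plus)   # Δx·Δk stays ~ 4π
```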

1.7 PROPERTIES OF FOURIER TRANSFORMS


Here we simply list some useful properties of Fourier transforms (they can
easily be verified by using the definition of the transform). To write these
properties in a compact form, it is convenient to introduce the notation
F[f(x)], instead of f̃(k), to denote the Fourier transform of f(x).

1. Differentiation:

F\left[\frac{d^n f}{dx^n}\right] = (ik)^n \tilde{f}(k). \qquad (1.236)

2. Integration:

F\left[\int^x f(y)\, dy\right] = \frac{1}{ik}\tilde{f}(k) + 2\pi C\,\delta(k), \qquad (1.237)

where the last term in the above equation corresponds to the Fourier
transform of the constant of integration C associated with the indefinite
integral on the left side of Eq. (1.237).

3. Scaling:

F[f(ax)] = \frac{1}{a}\tilde{f}\left(\frac{k}{a}\right). \qquad (1.238)

4. Translation:

F[f(x + a)] = e^{iak}\,\tilde{f}(k). \qquad (1.239)

5. Exponential multiplication:

F\left[e^{\alpha x} f(x)\right] = \tilde{f}(k + i\alpha), \qquad (1.240)

where α is, in general, a complex number.

6. Convolution theorem: The convolution of a pair of functions f(x) and
g(x) is given by the following integral, called the convolution integral:

h(x) = \int_{-\infty}^{\infty} dy\, f(x - y)\, g(y). \qquad (1.241)

If, for a given x, we define z = x − y, so that dz = −dy, then

h(x) = \int_{-\infty}^{\infty} dz\, f(z)\, g(x - z) = \int_{-\infty}^{\infty} dz\, g(x - z)\, f(z) \qquad (1.242)

as well. Let
as well. Let
\tilde{h}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, h(x)\, e^{-ikx}, \quad \tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx}, \quad \tilde{g}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, g(x)\, e^{-ikx} \qquad (1.243)

be the Fourier transforms of h(x), f(x), and g(x), respectively. The con-
volution theorem says that if

h(x) = \int_{-\infty}^{\infty} dy\, f(x - y)\, g(y), \qquad (1.244)

then

\tilde{h}(k) = \sqrt{2\pi}\, \tilde{f}(k)\,\tilde{g}(k). \qquad (1.245)

In other words, the Fourier transform of a convolution is equal to the
product of the separate Fourier transforms multiplied by √(2π). Effec-
tively, using Eq. (1.244) and the definition of the Fourier transform,

\tilde{h}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, h(x)\, e^{-ikx} = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, e^{-ikx}\int_{-\infty}^{\infty} dy\, f(x - y)\, g(y). \qquad (1.246)

Now, let z = x − y, such that for a given value of y, dz = dx. Then,

\tilde{h}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dz \int_{-\infty}^{\infty} dy\, e^{-ik(z+y)} f(z)\, g(y) = \frac{1}{\sqrt{2\pi}}\left[\int_{-\infty}^{\infty} dz\, f(z)\, e^{-ikz}\right]\left[\int_{-\infty}^{\infty} dy\, g(y)\, e^{-iky}\right]
= \frac{1}{\sqrt{2\pi}}\left[\sqrt{2\pi}\,\tilde{f}(k)\right]\left[\sqrt{2\pi}\,\tilde{g}(k)\right] = \sqrt{2\pi}\, \tilde{f}(k)\,\tilde{g}(k). \qquad (1.247)

The convolution of two functions f and g is often written f ∗ g, and
Eqs. (1.241) and (1.242) show that it is an operation which is clearly
commutative, i.e., f ∗ g = g ∗ f. The convolution is also associative and
distributive. Similarly to Eq. (1.245), it is also possible to prove that the
Fourier transform of the product of two functions f(x) and g(x) is given
by

F[f(x)g(x)] = \frac{1}{\sqrt{2\pi}}\,\tilde{f}(k) * \tilde{g}(k). \qquad (1.248)
7. Parseval's theorem: If

\tilde{f}(k) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dx\, f(x)\, e^{-ikx} \qquad (1.249)

and

f(x) = \frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk\, \tilde{f}(k)\, e^{ikx}, \qquad (1.250)

then

\int_{-\infty}^{\infty} dx\, |f(x)|^2 = \int_{-\infty}^{\infty} dk\, |\tilde{f}(k)|^2. \qquad (1.251)

Indeed, using the definition of the Fourier transform,

\int_{-\infty}^{\infty} dx\, |f(x)|^2 = \int_{-\infty}^{\infty} dx\, [f(x)]^* f(x)
= \int_{-\infty}^{\infty} dx \left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dl\, \tilde{f}(l)\, e^{ilx}\right]^* \left[\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{\infty} dk\, \tilde{f}(k)\, e^{ikx}\right]
= \frac{1}{2\pi}\int_{-\infty}^{\infty} dl \int_{-\infty}^{\infty} dk\, [\tilde{f}(l)]^*\, \tilde{f}(k) \int_{-\infty}^{\infty} dx\, e^{i(k-l)x}
= \int_{-\infty}^{\infty} dl \int_{-\infty}^{\infty} dk\, [\tilde{f}(l)]^*\, \tilde{f}(k)\, \delta(k - l)
= \int_{-\infty}^{\infty} dk\, |\tilde{f}(k)|^2, \qquad (1.252)

where in the last step we have used Eq. (1.228); a numerical illustration
is sketched below.
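As a concrete illustration of Eq. (1.251), we can compare the two integrals for the Gaussian pair of Example 1.7. This is a numerical sketch assuming SciPy; a and b are sample values of our choosing.

```python
# Parseval's theorem for the Gaussian pair f(x) = a e^{-bx^2}, Eq. (1.211).
import numpy as np
from scipy.integrate import quad

a, b = 1.3, 0.7
lhs = quad(lambda x: (a*np.exp(-b*x**2))**2, -np.inf, np.inf)[0]
rhs = quad(lambda k: (a/np.sqrt(2*b)*np.exp(-k**2/(4*b)))**2,
           -np.inf, np.inf)[0]
print(lhs, rhs)   # both equal a**2 * sqrt(pi/(2b))
```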

1.8 FOURIER TRANSFORM IN MORE THAN ONE DIMENSION

We can easily generalize Eqs. (1.192) and (1.193) if more than one di-
mension is involved. By noticing that in three dimensions kx corresponds to
the projection of r = x î + y ĵ + z k̂ in the direction of k = k î, we can write, for
x = (x₁, x₂, …, x_n) and k = (k₁, k₂, …, k_n),

f(\mathbf{x}) = \frac{1}{(2\pi)^{n/2}}\int d^n k\, e^{i\mathbf{k}\cdot\mathbf{x}}\, \tilde{f}(\mathbf{k}), \qquad \tilde{f}(\mathbf{k}) = \frac{1}{(2\pi)^{n/2}}\int d^n x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\, f(\mathbf{x}), \qquad (1.253)

and

\delta(\mathbf{k} - \mathbf{k}') = \frac{1}{(2\pi)^n}\int d^n x\, e^{i(\mathbf{k} - \mathbf{k}')\cdot\mathbf{x}}, \qquad \delta(\mathbf{x} - \mathbf{x}') = \frac{1}{(2\pi)^n}\int d^n k\, e^{i(\mathbf{x} - \mathbf{x}')\cdot\mathbf{k}}, \qquad (1.254)

with the inner product relations

\langle\mathbf{x}|\mathbf{k}\rangle = \frac{1}{(2\pi)^{n/2}}\, e^{i\mathbf{k}\cdot\mathbf{x}}, \qquad \langle\mathbf{k}|\mathbf{x}\rangle = \frac{1}{(2\pi)^{n/2}}\, e^{-i\mathbf{k}\cdot\mathbf{x}}. \qquad (1.255)

Equations (1.254) and (1.255), together with the identification |f̃⟩ ≡ |f⟩, exhibit a strik-
ing resemblance between |x⟩ and |k⟩. In fact, any given abstract vector |f⟩
can be expressed either in terms of its x-representation, ⟨x|f⟩ = f(x), or in
terms of its k-representation, ⟨k|f⟩ ≡ f̃(k). These two representations are
completely equivalent, and there is a one-to-one correspondence between the
two, given by Eq. (1.253). The representation used in practice is dictated
by the physical application. In quantum mechanics, for instance, most of the
time the x-representation, corresponding to the position, is used, because then
the operator equations turn into differential equations that are (in many cases)
linear and easier to solve than the corresponding equations in the
k-representation, which is related to the momentum.

EXAMPLE 1.9. In this example we are going to evaluate the Fourier transform
of the Yukawa potential

V_\alpha(r) = \frac{q\, e^{-\alpha r}}{r}, \qquad \alpha > 0, \qquad (1.256)

where r is the modulus of the position vector x of the particle in ℝ³. In this
case, the Fourier transform is given by

\tilde{V}_\alpha(\mathbf{k}) = \frac{1}{(2\pi)^{3/2}}\int d^3x\, e^{-i\mathbf{k}\cdot\mathbf{x}}\,\frac{q\, e^{-\alpha r}}{r}. \qquad (1.257)

To evaluate the above integral it is convenient to use spherical coordinates.
We are free to pick any direction as the z-axis; a simplifying choice in this
case is k = k k̂, where k = |k|. In this way, k · x = kr cos θ, where
θ is the polar angle in spherical coordinates. Then, Eq. (1.257) becomes

\tilde{V}_\alpha(\mathbf{k}) = \frac{q}{(2\pi)^{3/2}}\int_0^{\infty} dr\, r^2 \int_{-1}^{1} d\cos\theta \int_0^{2\pi} d\varphi\, e^{-ikr\cos\theta}\,\frac{e^{-\alpha r}}{r}. \qquad (1.258)

The φ integration simply gives a factor 2π, while for the θ integration
we get

\int_{-1}^{1} d\cos\theta\, e^{-ikr\cos\theta} = \frac{1}{ikr}\left(e^{ikr} - e^{-ikr}\right). \qquad (1.259)

Then,

\tilde{V}_\alpha(\mathbf{k}) = \frac{2\pi q}{(2\pi)^{3/2}}\int_0^{\infty} dr\, r^2\,\frac{e^{-\alpha r}}{r}\,\frac{1}{ikr}\left(e^{ikr} - e^{-ikr}\right)
= \frac{q}{(2\pi)^{1/2}}\,\frac{1}{ik}\int_0^{\infty} dr\left[e^{(-\alpha+ik)r} - e^{-(\alpha+ik)r}\right]
= \frac{q}{(2\pi)^{1/2}}\,\frac{1}{ik}\left[\frac{e^{(-\alpha+ik)r}}{-\alpha+ik}\bigg|_0^{\infty} + \frac{e^{-(\alpha+ik)r}}{\alpha+ik}\bigg|_0^{\infty}\right]. \qquad (1.260)

Note that when r → ∞ the factor e^{−αr} → 0, and we get

\tilde{V}_\alpha(\mathbf{k}) = \frac{2q}{\sqrt{2\pi}}\,\frac{1}{k^2 + \alpha^2}. \qquad (1.261)

The parameter α is a measure of the range of the potential: the larger α is,
the smaller the range. In fact, it was in response to the short range of nucleon
forces that Yukawa introduced α, which turns out to be related to the mass of
a pion.
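Equation (1.261) can be cross-checked by performing the remaining radial integral of Eq. (1.260) numerically. The sketch below assumes SciPy; q, α, and k are sample values of our choosing.

```python
# Radial check of the Yukawa transform, Eqs. (1.260)-(1.261).
import numpy as np
from scipy.integrate import quad

q, alpha, k = 1.0, 0.5, 2.0
# After the angular integrations, the transform reduces to
# sqrt(2/pi) * q/k * Int_0^inf e^{-alpha r} sin(k r) dr.
radial = quad(lambda r: np.exp(-alpha*r)*np.sin(k*r), 0, np.inf)[0]
ft_numeric = np.sqrt(2/np.pi) * q * radial / k
ft_exact = 2*q/np.sqrt(2*np.pi) / (k**2 + alpha**2)
print(ft_numeric, ft_exact)
```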

1.9 OTHER INTEGRAL TRANSFORMS: THE LAPLACE TRANSFORM

The Fourier transform exists only for a function f(x) which satisfies the con-
dition

0 < \int_{-\infty}^{\infty} dx\, |f(x)| \le M < \infty, \qquad (1.262)

with M being some positive real number. However, even for simple functions
like f(x) = e^{ikx}, the Fourier transform fails to converge. What to do then with
such functions? Furthermore, we might be interested in a given function only
for x > 0 (for instance, consider x to represent the time variable t). This leads
us to consider the Laplace transform, f̃(s) or L[f(x)], of f(x), which is defined
by

\tilde{f}(s) \equiv \int_0^{\infty} dx\, f(x)\, e^{-sx}, \qquad (1.263)

provided that the integral exists. We assume here that s is real and positive, but
complex values with Re(s) > 0 would have to be considered in a more detailed
study. Through Eq. (1.263) we define a linear transformation L that converts
functions of the variable x to functions of a new variable s:

L[a f 1 (x) + b f 2 (x)] = a L[ f 1 (x)] + b L[ f 2 (x)] = a f˜1 (s) + b f˜2 (s). (1.264)

A few comments on the existence of the integral are in order. The infinite
integral of f(x), i.e.,

\int_0^{\infty} dx\, f(x), \qquad (1.265)

need not exist. For instance, f(x) may diverge exponentially for large x. How-
ever, if there are constants s₀, M, and x₀ ≥ 0 such that for all x > x₀

|e^{-s_0 x} f(x)| \le M, \qquad (1.266)

the Laplace transform will exist for s > s₀; f(x) is then said to have exponential
growth of order s₀. As a counterexample, f(x) = e^{x²} does not satisfy the con-
dition given by Eq. (1.266) and is not of exponential order. Thus, L[e^{x²}] does
not exist.
The Laplace transform may also fail to exist because of a sufficiently strong
singularity in the function f(x) as x → 0. For example,

\int_0^{x} dx'\, x'^n\, e^{-sx'} \qquad (1.267)

diverges at the origin for n ≤ −1, and hence the Laplace transform L[xⁿ] does
not exist for n ≤ −1.
Before continuing with further discussions, I guess you might wonder about
the origin of the definition of the Laplace transformation of a function, and
you might be surprised to know that such a definition can be considered sim-
ply as the continuous analog of the Taylor series of a function! Indeed, if we
have a function F(x) which admits a Taylor expansion, we can write

F(x) = \sum_{n=0}^{\infty} f_n\, x^n. \qquad (1.268)

Instead of writing the coefficients of the expansion as f_n, we use a more suit-
able notation for our continuous generalization and write f(n). Then

F(x) = \sum_{n=0}^{\infty} f(n)\, x^n. \qquad (1.269)

Imagining n as a discretization of a continuous variable t, we can replace the
sum over n by an integral over a variable t ranging from 0 to ∞, and write

F(x) = \int_0^{\infty} dt\, f(t)\, x^t. \qquad (1.270)

For the series in Eq. (1.269) to converge, we consider 0 < x < 1, since negative
values of x or x > 1 can mess up the convergence. This means that ln x < 0 and
we can introduce the change of variable

-s = \ln(x), \qquad (1.271)

with s > 0. In this way, we can write

x^t = [e^{\ln(x)}]^t = e^{t\ln x} = e^{-st}. \qquad (1.272)

Then, we can write Eq. (1.270) in terms of the variable s as

F(e^{-s}) \equiv \int_0^{\infty} dt\, f(t)\, e^{-st}, \qquad (1.273)

or simply

F(s) = \int_0^{\infty} dt\, f(t)\, e^{-st}. \qquad (1.274)

Given the function f(t), we obtain F(s), which is its Laplace transform!

EXAMPLE 1.10. Let us calculate the Laplace transform of some elementary
functions. In all cases we assume that f(x) = 0 for x < 0. For example, let us
consider

f(x) = 1, \quad x > 0; \qquad (1.275)

then

\tilde{f}(s) = \int_0^{\infty} dx\, e^{-sx} = \frac{1}{s}, \quad \text{for } s > 0. \qquad (1.276)

Next, let

f(x) = e^{\alpha x}, \quad x > 0. \qquad (1.277)

The Laplace transform becomes

\tilde{f}(s) = \int_0^{\infty} dx\, e^{\alpha x} e^{-sx} = \frac{e^{(\alpha - s)x}}{\alpha - s}\Big|_0^{\infty} = \frac{1}{s - \alpha}, \quad \text{for } s > \alpha. \qquad (1.278)

Using this relation, we can determine the Laplace transform of certain other
functions. For example, since

\cosh(\alpha x) = \frac{1}{2}\left(e^{\alpha x} + e^{-\alpha x}\right), \qquad \sinh(\alpha x) = \frac{1}{2}\left(e^{\alpha x} - e^{-\alpha x}\right), \qquad (1.279)

we have

L[\cosh(\alpha x)] = \frac{1}{2}\left(\frac{1}{s - \alpha} + \frac{1}{s + \alpha}\right) = \frac{s}{s^2 - \alpha^2},
L[\sinh(\alpha x)] = \frac{1}{2}\left(\frac{1}{s - \alpha} - \frac{1}{s + \alpha}\right) = \frac{\alpha}{s^2 - \alpha^2}, \qquad (1.280)

both valid for s > |α|.


From the relations

\cos(\alpha x) = \cosh(i\alpha x), \qquad \sin(\alpha x) = -i\sinh(i\alpha x), \qquad (1.281)

it is evident that we can obtain the transforms of the sine and cosine by re-
placing α with iα in Eq. (1.280):

L[\cos(\alpha x)] = \frac{s}{s^2 + \alpha^2}, \qquad L[\sin(\alpha x)] = \frac{\alpha}{s^2 + \alpha^2}, \qquad (1.282)

both valid for s > 0. It is a curious fact that \lim_{s\to 0} L[\sin(\alpha x)] = 1/\alpha despite the
fact that \int_0^{\infty} dx\, \sin(\alpha x) does not exist.
Last case: f(x) = xⁿ. We then have

L[x^n] = \int_0^{\infty} dx\, x^n\, e^{-sx} = \frac{\Gamma(n+1)}{s^{n+1}}, \quad s > 0,\ n > -1, \qquad (1.283)

where we have introduced Euler's Gamma function.
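If a computer algebra system is at hand, the transforms of this example can also be reproduced symbolically. The following is a sketch assuming SymPy is available; the keyword noconds=True simply drops the convergence conditions s > s₀ from the output.

```python
# Symbolic check of a few transforms from Example 1.10.
import sympy as sp

x, s, a = sp.symbols('x s alpha', positive=True)

print(sp.laplace_transform(sp.S(1), x, s, noconds=True))       # 1/s
print(sp.laplace_transform(sp.exp(a*x), x, s, noconds=True))   # 1/(s - alpha)
print(sp.laplace_transform(sp.sinh(a*x), x, s, noconds=True))  # alpha/(s**2 - alpha**2)
print(sp.laplace_transform(sp.sin(a*x), x, s, noconds=True))   # alpha/(s**2 + alpha**2)
```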



Table 1.1: Some common Laplace transforms. The transforms are valid for s > s₀.

f(x)                  f̃(s)                         s₀
c                     c/s                           0
c xⁿ                  c n!/sⁿ⁺¹                     0
sin(αx)               α/(s² + α²)                   0
cos(αx)               s/(s² + α²)                   0
e^{αx}                1/(s − α)                     α
xⁿ e^{αx}             n!/(s − α)ⁿ⁺¹                 α
sinh(αx)              α/(s² − α²)                   |α|
cosh(αx)              s/(s² − α²)                   |α|
e^{αx} sin(βx)        β/[(s − α)² + β²]             α
e^{αx} cos(βx)        (s − α)/[(s − α)² + β²]       α
x^{1/2}               (1/2)(π/s³)^{1/2}             0
x^{−1/2}              (π/s)^{1/2}                   0
δ(x − x₀)             e^{−s x₀}                     0
θ(x − x₀) = 1 for x ≥ x₀, 0 for x < x₀        e^{−s x₀}/s      0

In general, if f (x) is piecewise continuous and has exponential growth of


order s 0 , then its Laplace transform is defined for all s > s 0 . Also, if f and g are
piecewise continuous functions that are of exponential growth, and L[ f (x)] =
L[g (x)] for all s sufficiently large, then f (x) = g (x) at all points of continuity
x > 0.
Unlike the Fourier transformation, the inversion of the Laplace transform
is not an easy operation to perform, since an explicit formula for f (x), given
f˜(s), is not straightforwardly obtained from Eq. (1.263). The general method
for obtaining an inverse Laplace transform makes use of complex variable
theory, with contour integration, and we are not going to enter into those de-
tails here. However, progress can be made without having to find an explicit
inverse, since we can prepare from Eq. (1.263) a list of the Laplace transforms
of common functions and, when faced with an inversion to carry out, hope
to find the given transform (together with its parent function) in the listing.
Such a list is given in Table 1.1. A difficulty in using Table 1.1 is the fact
that the inverse Laplace transform is not entirely unique. Two functions f₁(x)
and f₂(x) can have the same transform if their difference is a null function,
1.9 OTHER INTEGRAL TRANSFORMS : T HE L APLACE TRANSFORM 61

meaning that for all s₀ > 0

\int_0^{s_0} dx\, [f_1(x) - f_2(x)] = 0. \qquad (1.284)

This result is known as Lerch's theorem; it is not quite equivalent to f₁(x) =
f₂(x), because it permits f₁(x) and f₂(x) to differ at isolated points. How-
ever, in most problems studied in physics this ambiguity is not important, and
for all practical purposes, when finding inverse Laplace transforms using Ta-
ble 1.1, the inverse Laplace transform can be considered unique. The in-
verse Laplace transformation is linear; thus

L^{-1}[a\tilde{f}_1(s) + b\tilde{f}_2(s)] = a f_1(x) + b f_2(x). \qquad (1.285)

EXAMPLE 1.11. Using Table 1.1, let us calculate the function f(x) whose Laplace
transform is given by

\tilde{f}(s) = \frac{\alpha^2}{s(s^2 + \alpha^2)}. \qquad (1.286)

This function is not listed in Table 1.1. However, we can rewrite Eq. (1.286)
as

\tilde{f}(s) = \frac{1}{s} - \frac{s}{s^2 + \alpha^2}. \qquad (1.287)

In this way,

f(x) = L^{-1}[\tilde{f}(s)] = L^{-1}\left[\frac{1}{s}\right] - L^{-1}\left[\frac{s}{s^2 + \alpha^2}\right]. \qquad (1.288)

Using now Table 1.1, we have

f(x) = 1 - \cos(\alpha x), \quad x > 0. \qquad (1.289)
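The same inversion can be cross-checked symbolically; a sketch assuming SymPy follows. The Heaviside factor in the output simply encodes our convention that f(x) = 0 for x < 0.

```python
# Inverting Eq. (1.286) symbolically instead of using Table 1.1.
import sympy as sp

s, x, a = sp.symbols('s x alpha', positive=True)
F = a**2 / (s*(s**2 + a**2))
print(sp.inverse_laplace_transform(F, s, x))
# -> (1 - cos(alpha*x)) * Heaviside(x), in agreement with Eq. (1.289)
```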

EXAMPLE 1.12. In this example we are going to use the Laplace transform and
Table 1.1 to evaluate a definite integral. In particular, we are going to calculate

f(x) = \int_0^{\infty} dy\, \frac{\sin(yx)}{y}. \qquad (1.290)

To do this integral, we need to consider three different cases: x > 0, x = 0, and
x < 0. Note, however, that if x = 0 the integral is 0, and if x < 0,

f(x = -|x|) = -\int_0^{\infty} dy\, \frac{\sin(y|x|)}{y} = -f(x > 0). \qquad (1.291)

Then, it is enough to calculate Eq. (1.290) for x > 0.


Suppose we determine the Laplace transform of f(x) (x > 0), which we
denote as f̃(s). Then, we have

\tilde{f}(s) = L[f(x)] = L\left[\int_0^{\infty} dy\, \frac{\sin(yx)}{y}\right] = \int_0^{\infty} dx\left[\int_0^{\infty} dy\, \frac{\sin(yx)}{y}\right] e^{-sx}
= \int_0^{\infty} dy\, \frac{1}{y}\left[\int_0^{\infty} dx\, \sin(xy)\, e^{-sx}\right]. \qquad (1.292)

Using Table 1.1, the factor in square brackets is just the Laplace transform of
sin(xy). Then, we have

\tilde{f}(s) = \int_0^{\infty} dy\, \frac{1}{s^2 + y^2} = \frac{1}{s}\tan^{-1}\left(\frac{y}{s}\right)\bigg|_{y=0}^{\infty} = \frac{\pi}{2s}. \qquad (1.293)

Now, we can determine f(x) as

f(x) = L^{-1}[\tilde{f}(s)] = L^{-1}\left[\frac{\pi}{2s}\right]. \qquad (1.294)

Using Table 1.1, we get

f(x) = \frac{\pi}{2}, \quad x > 0. \qquad (1.295)

Then, using Eq. (1.291),

f(x) = -\frac{\pi}{2}, \quad x < 0. \qquad (1.296)
In this way,

f(x) = \int_0^{\infty} dy\, \frac{\sin(xy)}{y} = \begin{cases} \pi/2, & x > 0, \\ 0, & x = 0, \\ -\pi/2, & x < 0. \end{cases} \qquad (1.297)

We see then that f(x) describes a step function with a jump of height π at x = 0.



1.10 PROPERTIES OF THE LAPLACE TRANSFORM

Here we list some properties of the Laplace transform. As in the case of the Fourier
transform, these properties can be proven from the direct definition of the
transform, and we do not provide demonstrations here.

1. Differentiation:

L\left[\frac{df(x)}{dx}\right] = s\,L[f(x)] - f(+0), \quad s > 0. \qquad (1.298)

This property can be proven starting from the definition of the Laplace
transform of df/dx and integrating by parts. Naturally, both f(x) and
its derivative must be such that the integrals do not diverge. Since f(x)
and/or df/dx can be piecewise continuous, strictly speaking, the zero
of x needs to be approached from the positive side of x. For this reason
we write f(+0) instead of f(0). An extension of Eq. (1.298) to higher
derivatives is also possible:

L\left[\frac{d^n f(x)}{dx^n}\right] = s^n L[f(x)] - s^{n-1} f(+0) - \dots - \frac{d^{n-1} f}{dx^{n-1}}\bigg|_{x=+0}, \quad s > 0. \qquad (1.299)

2. Change of scale:

L[f(ax)] = \frac{1}{a}\tilde{f}\left(\frac{s}{a}\right). \qquad (1.300)

3. Substitution:

\tilde{f}(s - a) = L\left[e^{ax} f(x)\right]. \qquad (1.301)
4. Translation:

e −bs f˜(s) = L[ f (x − b)], (1.302)

where b > 0 and remember that we are considering that f (x) = 0 if x < 0,
thus, f (x − b) = 0 for 0 ≤ x < b.

5. Derivative of a transform:

\frac{d^n \tilde{f}(s)}{ds^n} = L\left[(-x)^n f(x)\right]. \qquad (1.303)

In this case, e^{−sx} f(x) needs to converge exponentially for large s, so
that all the integrals obtained on the right-hand side of the equation
when calculating the transform are uniformly convergent due to the
decreasing exponential behavior of e^{−sx} f(x).

6. Integration of a transform: for x large enough that e^{−xy} f(y) decreases
exponentially as y → ∞, the integral

\tilde{f}(x) = \int_0^{\infty} dy\, e^{-xy} f(y) \qquad (1.304)

is uniformly convergent with respect to x. Then

\int_s^{\infty} dx\, \tilde{f}(x) = L\left[\frac{f(x)}{x}\right], \qquad (1.305)

provided \lim_{x\to 0} f(x)/x exists. The lower limit s must be chosen large
enough so that f̃(s) is within the region of uniform convergence.

7. Convolution theorem: If the functions f(x) and g(x) have Laplace trans-
forms f̃(s) and g̃(s), then

L[f * g] = L\left[\int_0^x dy\, f(y)\, g(x - y)\right] = \tilde{f}(s)\,\tilde{g}(s),
L^{-1}[\tilde{f}(s)\,\tilde{g}(s)] = \int_0^x dy\, f(y)\, g(x - y) = f * g. \qquad (1.306)

The convolution defined above, i.e.,

f * g = \int_0^x dy\, f(y)\, g(x - y), \qquad (1.307)

is commutative, i.e., f ∗ g = g ∗ f, as well as associative and distributive;
a quick symbolic check of Eq. (1.306) is sketched below.
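The following sketch, assuming SymPy is available, verifies Eq. (1.306) for the particular pair f(x) = e^{−x} and g(x) = sin(x); the choice of pair is ours.

```python
# Laplace convolution theorem for a sample pair of functions.
import sympy as sp

x, y, s = sp.symbols('x y s', positive=True)
f = sp.exp(-x)
g = sp.sin(x)

conv = sp.integrate(f.subs(x, y) * g.subs(x, x - y), (y, 0, x))  # (f*g)(x)
lhs = sp.laplace_transform(conv, x, s, noconds=True)
rhs = (sp.laplace_transform(f, x, s, noconds=True)
       * sp.laplace_transform(g, x, s, noconds=True))
print(sp.simplify(lhs - rhs))   # 0
```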

EXAMPLE 1.13. Here is an example showing how the derivative formula in
Eq. (1.299) can be used. Knowing that

-\alpha^2 \sin(\alpha x) = \frac{d^2}{dx^2}[\sin(\alpha x)], \qquad (1.308)

we can determine the Laplace transform of sin(αx). Effectively, if we apply
the Laplace transform to Eq. (1.308), we get

-\alpha^2 L[\sin(\alpha x)] = L\left[\frac{d^2}{dx^2}\sin(\alpha x)\right]. \qquad (1.309)

Using now Eq. (1.299), we get

L\left[\frac{d^2}{dx^2}\sin(\alpha x)\right] = s^2 L[\sin(\alpha x)] - s\sin(0) - \frac{d[\sin(\alpha x)]}{dx}\bigg|_{x=0} = s^2 L[\sin(\alpha x)] - \alpha. \qquad (1.310)

Then,

L[\sin(\alpha x)] = \frac{\alpha}{s^2 + \alpha^2}, \qquad (1.311)

which confirms the result obtained in Eq. (1.282).

1.11 APPENDIX: DIRAC DELTA FUNCTION


Paul Adrien Maurice Dirac, one of the most inventive mathematical physi-
cists of all time, co-founder of quantum theory, inventor of relativistic quan-
tum mechanics in the form of an equation which bears his name, predictor of
the existence of anti-matter, clarifier of the concept of spin, and contributor
to the unraveling of the mathematical difficulties associated with the quan-
tization of the general theory of relativity, came across the subject matter of
this Appendix in his study of quantum mechanical scattering. Actually, it was
Oliver Heaviside, the British engineer, who started using delta functions in practi-
cal applications, with remarkable success. Heaviside, who was not deterred by
the lack of rigorous justification, was ridiculed by the pure mathematicians
of his day, and eventually succumbed to mental illness. But, some thirty years
later, Paul Dirac resurrected the delta function for quantum mechanical ap-
plications, and this finally made theoreticians sit up and take notice (indeed,
the term Dirac delta function is quite common). In 1944, the French mathe-
matician Laurent Schwartz finally established a rigorous theory, called the theory
of distributions, that incorporated such unusual generalized functions as the
delta function. It is beyond the scope of this course to introduce the theory of
distributions formally; rather, in the spirit of Heaviside, we simply provide an
idea of what a delta function is, and we start with an intuitive approach drawn
from electrostatics.
Let us start by imagining that we have a straight linear charge distribu-
tion of length L with uniform charge density as shown in Fig. 1.13a. If the
total charge of the line segment is q, then the linear density will be λ = q/L.
We are interested in the graph of the function describing the linear density
in the interval [−∞, +∞]. Assuming that the midpoint of the segment is x 0
and its length L, we can easily draw the graph of the function. This is shown
in Fig. 1.13b.

Figure 1.13: (a) The charged line segment and (b) its linear density function.

The graph is that of a function that is zero for values less than x₀ − L/2, equal
to q/L for values between x₀ − L/2 and x₀ + L/2, and zero again for values
greater than x₀ + L/2. Let us call this function λ(x). Then, we can write

\lambda(x) = \begin{cases} 0, & \text{if } x < x_0 - L/2, \\ q/L, & \text{if } x_0 - L/2 < x < x_0 + L/2, \\ 0, & \text{if } x > x_0 + L/2. \end{cases} \qquad (1.312)

Now suppose that we squeeze the segment on both sides so that the length
shrinks to L/2 without changing the position of the midpoint or the amount
of charge. The new function describing the linear charge density will now be

\lambda(x, x_0) = \begin{cases} 0, & \text{if } x < x_0 - L/4, \\ 2q/L, & \text{if } x_0 - L/4 < x < x_0 + L/4, \\ 0, & \text{if } x > x_0 + L/4. \end{cases} \qquad (1.313)

The charge q has been factorized for later convenience. We have also intro-
duced a second argument to emphasize the dependence of the function λ on
the midpoint. Instead of one-half, we can shrink the segment to any fraction,
while still keeping both the amount of charge and the midpoint unchanged.
Shrinking the size to L/n and renaming the function λ_n(x, x₀) to reflect its de-
pendence on n, we have

\lambda_n(x, x_0) = \begin{cases} 0, & \text{if } x < x_0 - L/(2n), \\ qn/L, & \text{if } x_0 - L/(2n) < x < x_0 + L/(2n), \\ 0, & \text{if } x > x_0 + L/(2n). \end{cases} \qquad (1.314)

This function is depicted in Fig. 1.14 for n = 10 as well as for some smaller
values of n. As you can see, the height of λn (x, x 0 ) increases at the same time
that its width decreases.
Instead of a charge distribution that abruptly changes from zero to some
finite value and just abruptly drops to zero, let us consider a charge distribu-
tion that smoothly rises to a maximum value and just as smoothly falls to zero.

Figure 1.14: The linear density λn (x, x 0 ) in Eq. (1.314) versus x as n increases.

There are, of course, many functions which could be used to describe such a
charge distribution, but a convenient one is a Gaussian distribution. For
example,

\lambda_n(x, x_0) = q\sqrt{\frac{n}{\pi}}\; e^{-n(x - x_0)^2} \qquad (1.315)

has a peak of height q√(n/π) at x = x₀ (why the factor √(n/π)? You will
understand in the next paragraph) and drops to smaller and smaller values
as we get farther and farther away from x₀ in either direction, as shown in
Fig. 1.15. It is clear from the figure that the width of the graph of λ_n(x, x₀) gets
smaller as n → ∞.

Figure 1.15: λn (x, x 0 ) in Eq. (1.315) versus x for several values of n. The Gaus-
sian bell-shaped curve approaches the Dirac delta function as the width of the
curve approaches zero. The value of n is 1 for the dashed curve, 4 for the thick
solid line, and 20 for the thin solid line.

In both cases λn (x, x 0 ) is a true linear charge density in the sense that its
integral gives the total charge. This is evident in the first case, Eq. (1.314),

because of the way the function was defined. In the second case, Eq. (1.315),
once we integrate from −∞ to +∞, we also obtain the total charge q. The
region of integration extends over all real numbers in the second case because
at every point of the real line we have some nonzero charge. Furthermore,
we can extend the interval of integration over all real numbers even for the
first case, because the function vanishes outside the interval [x 0 − L/(2n), x 0 +
L/(2n)] and no other contribution to the integral arises. We thus can write
Z∞
d xλn (x, x 0 ) = q (1.316)
−∞

for all such λ_n(x, x₀) functions. It is convenient to divide λ_n(x, x₀) by q and
define new functions δ_n(x, x₀) by

\delta_n(x, x_0) \equiv \frac{\lambda_n(x, x_0)}{q}, \qquad (1.317)

so that, from Eq. (1.314),

\delta_n(x, x_0) = \begin{cases} 0, & \text{if } x < x_0 - L/(2n), \\ n/L, & \text{if } x_0 - L/(2n) < x < x_0 + L/(2n), \\ 0, & \text{if } x > x_0 + L/(2n), \end{cases} \qquad (1.318)

while, from Eq. (1.315),

\delta_n(x, x_0) = \sqrt{\frac{n}{\pi}}\; e^{-n(x - x_0)^2}. \qquad (1.319)

Both these functions have the property that

\int_{-\infty}^{\infty} dx\, \delta_n(x, x_0) = 1. \qquad (1.320)

In other words, the integral of δ_n(x, x₀) over all the real numbers is one and, in
particular, independent of n. Using δ_n(x, x₀), we define the Dirac delta func-
tion δ(x, x₀) as

\delta(x, x_0) = \lim_{n\to\infty}\delta_n(x, x_0), \qquad (1.321)

which, since the integral in Eq. (1.320) is independent of n, has the following
property:

\int_{-\infty}^{\infty} dx\, \delta(x, x_0) = 1. \qquad (1.322)

The Dirac delta function has infinite height and zero width at x 0 , but these two
undefined quantities compensate for one another to give a finite area under
the graph of the function. The Dirac delta function is actually not a function
in the usual sense, because at the only point that it is nonzero, it is infinite!
Although we have separated the arguments of the Dirac delta function by
a comma, the function depends only on the difference between the two argu-
ments. This becomes clear if we think of the Dirac delta function as the limit
when n → ∞ of the exponential in Eq. (1.315), because the latter is a function
of x − x 0 . We therefore have the important relation

δ(x, x 0 ) = δ(x − x 0 ). (1.323)

In particular, since the delta function becomes infinite at x = x₀, we have

\delta(x, x_0)\Big|_{x=x_0} = \delta(x - x_0)\Big|_{x=x_0} = \delta(0) = \infty. \qquad (1.324)

We can think of the last equality as an identity satisfied by the Dirac delta func-
tion: the Dirac delta function is zero everywhere except at the point which
makes its argument zero, in which case the Dirac delta function is infinite.
Since the Dirac delta function is zero almost everywhere, we can shrink the
region of integration in Eq. (1.322) to a smaller interval and write

\int_a^b dx\, \delta(x - x_0) = 1 \qquad (1.325)

as long as x₀ lies in the interval [a, b]. If x₀ is outside the interval, then the
integral will be zero because the delta function would always be zero in the
region of integration. We can then summarize these results in the following
way:

\int_a^b dx\, \delta(x - x_0) = \begin{cases} 1, & \text{if } a < x_0 < b, \\ 0, & \text{otherwise.} \end{cases} \qquad (1.326)

Equation (1.322) is then a special case of this, for which −∞ < x₀ < ∞ for any
value of x₀. Any function, like the Dirac delta function, whose integral over all
real numbers is one is called a linear density function. What kind of distri-
bution does the Dirac delta function describe? For example, consider mδ(x − x₀),
where m designates mass. This function is zero everywhere except at x₀ and
its integral is the total mass m. Thus, if it is to be a mass distribution, it has
to be a point mass located at x₀. The linear density of a point mass is infinite
because its length is zero, and this is precisely what mδ(x − x₀) describes.

From a mathematical point of view, the most important property, which
is sometimes used to define the Dirac delta function, occurs when it mul-
tiplies a smooth function in an integrand (by a smooth function, in this con-
text, we mean a function which does not change abruptly when its argument
changes by a small amount). First, look at an integral with a δ_n(x − x₀) inside.
If the function f(x) multiplying δ_n(x − x₀) is smooth and n is large enough,
the product f(x)δ_n(x − x₀) practically vanishes outside a narrow interval in
which δ_n(x − x₀) is appreciably different from zero. For example, if n = 10⁷
and x = x₀ + 0.001, then, from Eq. (1.315), δ_n(x − x₀) ≈ 0.08, already a factor
e^{−10} ≈ 4.5 × 10⁻⁵ below its peak value at x₀. For larger values of
n this drop is even sharper. In fact, no matter what function we choose, there
is always a large enough n such that the product f(x)δ_n(x − x₀) will drop to as
small a value as we please in as short an interval as we please. Then, we can
approximate the integral over all real numbers by an integral over that small
interval. If we call this interval [x₀ − ε, x₀ + ε], we have

\int_{-\infty}^{\infty} dx\, f(x)\,\delta_n(x - x_0) \simeq \int_{x_0-\epsilon}^{x_0+\epsilon} dx\, f(x)\,\delta_n(x - x_0) \simeq f(x_0)\int_{x_0-\epsilon}^{x_0+\epsilon} dx\, \delta_n(x - x_0) \simeq f(x_0)\int_{-\infty}^{+\infty} dx\, \delta_n(x - x_0) = f(x_0). \qquad (1.327)

The approximation in the second step follows from the fact that f(x) is almost
constant in the small interval [x₀ − ε, x₀ + ε]. The third approximation is a result
of the smallness of δ_n outside the interval, and the final equality follows because δ_n
is a linear density function. In the limit n → ∞, δ_n becomes the Dirac delta
function and the approximations in Eq. (1.327) become equalities, i.e.,

\int_{-\infty}^{\infty} dx\, f(x)\,\delta(x - x_0) = f(x_0). \qquad (1.328)

This is equivalent to

\int_a^b dx\, f(x)\,\delta(x - x_0) = \begin{cases} f(x_0), & \text{if } a < x_0 < b, \\ 0, & \text{otherwise.} \end{cases} \qquad (1.329)
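The sifting property (1.328) is easy to see numerically with the Gaussian family of Eq. (1.319). The sketch below assumes NumPy and SciPy are available; the test function cos(x) and the point x₀ are our own choices.

```python
# Numerical illustration of the sifting property, Eq. (1.328).
import numpy as np
from scipy.integrate import quad

def delta_n(x, x0, n):
    return np.sqrt(n/np.pi) * np.exp(-n*(x - x0)**2)

x0 = 0.7
for n in (10, 100, 10000):
    # Integrate over a window around x0; points=[x0] helps the quadrature
    # resolve the increasingly narrow peak.
    val = quad(lambda x: np.cos(x) * delta_n(x, x0, n),
               x0 - 1, x0 + 1, points=[x0])[0]
    print(n, val)   # -> cos(0.7) ~ 0.7648 as n grows
```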

In words, the result of the integration is the value of f at the root (i.e., the zero)
of the argument of the delta function, provided this root is inside the range of
integration. The result of integration is then always well defined, because it
is simply the value of a good function f at a point, say x₀. In fact, the result
is so nice that one can even define the derivative of the Dirac delta function
by differentiating Eq. (1.329) with respect to x₀, obtaining

\int_{-\infty}^{\infty} dx\, f(x)\,\frac{d}{dx}[\delta(x - x_0)] = -\frac{df}{dx}\bigg|_{x=x_0}. \qquad (1.330)

In fact, the nth derivative of the Dirac delta function satisfies

\int_a^b dx\, f(x)\,\delta^{(n)}(x - x_0) = \begin{cases} (-1)^n f^{(n)}(x_0), & \text{if } a < x_0 < b, \\ 0, & \text{otherwise,} \end{cases} \qquad (1.331)

where the superscript (n) indicates the nth derivative.


Interestingly, the Dirac delta function is connected with another common
function called the step function, or Heaviside θ function, which is defined as

\theta(x) = \begin{cases} 1, & \text{if } x > 0, \\ 0, & \text{if } x < 0. \end{cases} \qquad (1.332)

The Heaviside θ function, or simply θ-function, is useful in writing functions
that have discontinuities or cusps. For instance, absolute values can be writ-
ten in terms of the step function:

|x| = x\theta(x) - x\theta(-x) \quad \text{or} \quad |x - y| = (x - y)[\theta(x - y) - \theta(y - x)]. \qquad (1.333)

A piecewise continuous function such as

f(x) = \begin{cases} f_1(x), & \text{if } 0 < x < 1, \\ f_2(x), & \text{if } x > 1, \end{cases} \qquad (1.334)

can be written as

f(x) = f_1(x)\,\theta(x)\,\theta(1 - x) + f_2(x)\,\theta(x - 1). \qquad (1.335)

Because the θ-function is constant everywhere except at 0, its derivative
is zero everywhere except at 0. The discontinuity at 0 makes the derivative
infinite there:

\frac{d\theta}{dx}\bigg|_{x=0} = \lim_{\epsilon\to 0}\frac{\theta(\epsilon) - \theta(-\epsilon)}{2\epsilon} = \lim_{\epsilon\to 0}\frac{1 - 0}{2\epsilon} \to \infty. \qquad (1.336)

This fact strongly suggests the identification of the derivative of the θ-function
with the Dirac delta function. Noting that

\theta(x - x_0) = \begin{cases} 1, & \text{if } x > x_0, \\ 0, & \text{if } x < x_0, \end{cases} \qquad (1.337)

and that d[θ(x − x₀)]/dx is zero everywhere except at x₀, for any well-behaved
function f(x) we obtain

\int_{-\infty}^{\infty} dx\, f(x)\,\frac{d[\theta(x - x_0)]}{dx} = \int_{x_0-\epsilon}^{x_0+\epsilon} dx\, f(x)\,\frac{d[\theta(x - x_0)]}{dx} \simeq f(x_0)\int_{x_0-\epsilon}^{x_0+\epsilon} dx\, \frac{d[\theta(x - x_0)]}{dx}
= f(x_0)\,\theta(x - x_0)\Big|_{x_0-\epsilon}^{x_0+\epsilon} = f(x_0)[\theta(\epsilon) - \theta(-\epsilon)] = f(x_0)[1 - 0] = f(x_0). \qquad (1.338)

We thus have another important representation of the Dirac delta function:

\delta(x - x_0) = \frac{d[\theta(x - x_0)]}{dx}. \qquad (1.339)
All these discussions can be generalized to many variables. For example,
in two dimensions and Cartesian coordinates,

\delta_n(\mathbf{r} - \mathbf{r}_0) \equiv \delta_n(x - x_0, y - y_0) = \frac{n}{\pi}\, e^{-n[(x - x_0)^2 + (y - y_0)^2]} = \delta_n(x - x_0)\,\delta_n(y - y_0), \qquad (1.340)
and the integral of δ_n over the entire xy-plane equals one. In this way,

\delta(\mathbf{r} - \mathbf{r}_0) = \lim_{n\to\infty}\delta_n(\mathbf{r} - \mathbf{r}_0), \qquad (1.341)

which is zero everywhere except at the point which makes both of its argu-
ments zero, in which case it is infinite (see Fig. 1.16), i.e.,

\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - x_0, y - y_0) = \delta(x - x_0)\,\delta(y - y_0). \qquad (1.342)
If S is a region of the two-dimensional space, then for a smooth function f(r),

\int_S d^2r\, f(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}_0) = \begin{cases} f(\mathbf{r}_0), & \text{if } \mathbf{r}_0 \in S, \\ 0, & \text{otherwise,} \end{cases} \qquad (1.343)

where \int d^2r = \iint dx\, dy.

Figure 1.16: As n gets larger and larger, the two-dimensional Gaussian expo-
nential in Eq. (1.340) approaches the two-dimensional Dirac delta function.
For the left bump, n = 400; for the middle bump, n = 1000; and for the right
spike, n = 4000.

Similarly, in three dimensions and Cartesian coordinates, we can define
the function δ_n as

\delta_n(\mathbf{r} - \mathbf{r}_0) \equiv \delta_n(x - x_0, y - y_0, z - z_0) = \left(\frac{n}{\pi}\right)^{3/2} e^{-n[(x - x_0)^2 + (y - y_0)^2 + (z - z_0)^2]} = \delta_n(x - x_0)\,\delta_n(y - y_0)\,\delta_n(z - z_0), \qquad (1.344)

and the integral of δ_n over all of space equals one. In this way,

\delta(\mathbf{r} - \mathbf{r}_0) = \lim_{n\to\infty}\delta_n(\mathbf{r} - \mathbf{r}_0), \qquad (1.345)

which is zero everywhere except at the point which makes all three of its ar-
guments zero, in which case it is infinite, i.e.,

\delta(\mathbf{r} - \mathbf{r}_0) = \delta(x - x_0, y - y_0, z - z_0) = \delta(x - x_0)\,\delta(y - y_0)\,\delta(z - z_0). \qquad (1.346)

If V is a region of the three-dimensional space, then for a smooth function
f(r),

\int_V d^3r\, f(\mathbf{r})\,\delta(\mathbf{r} - \mathbf{r}_0) = \begin{cases} f(\mathbf{r}_0), & \text{if } \mathbf{r}_0 \in V, \\ 0, & \text{otherwise,} \end{cases} \qquad (1.347)

where \int d^3r = \iiint dx\, dy\, dz.

1.12 APPENDIX: THE LEBESGUE INTEGRAL


The collection of all continuous functions defined on an interval [a, b] forms
a linear vector space; however, that space is not complete. For example, let us
consider the infinite sequence of continuous functions {f_k}_{k=1}^{\infty} defined on the
interval [−1, 1] by

f_k(x) = \begin{cases} 1, & 1/k < x \le 1, \\ \dfrac{kx + 1}{2}, & -1/k < x < 1/k, \\ 0, & -1 \le x < -1/k. \end{cases} \qquad (1.348)

These functions belong to the inner product space of continuous functions
with its usual inner product

\langle f|g\rangle = \int_{-1}^{1} dx\, f^*(x)\, g(x). \qquad (1.349)

It is straightforward to verify that

\|f_k - f_l\|^2 = \int_{-1}^{1} dx\, |f_k(x) - f_l(x)|^2 \to 0 \qquad (1.350)

as k, l → ∞. Thus, {f_k}_{k=1}^{\infty} forms a Cauchy sequence. However, the se-
quence of functions f_k(x) converges to the function
f(x) = \begin{cases} 1, & 0 < x < 1, \\ 0, & -1 < x < 0, \end{cases} \qquad (1.351)

which is discontinuous at x = 0 and therefore does not belong to the space
of continuous functions. It can be proved that no continuous function f(x)
satisfies

\lim_{k\to\infty}\|f - f_k\| = 0. \qquad (1.352)

In this way, we arrive at the conclusion that the continuous functions form a
linear space, but one that is not complete with respect to the inner product
defined in Eq. (1.349), since the limit of the Cauchy sequence {f_k}_{k=1}^{\infty} is not
an element of the vector space considered. There is something missing in the
vector space: discontinuous functions!
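The behavior of the sequence (1.348) can be observed numerically. The following sketch assumes NumPy; the clipped expression below reproduces Eq. (1.348) on [−1, 1], and the grid resolution is our own choice.

```python
# L2 distances within the Cauchy sequence (1.348) shrink, even though the
# pointwise limit (1.351) is discontinuous.
import numpy as np

def f(x, k):
    return np.clip((k*x + 1)/2, 0.0, 1.0)   # equals (1.348) on [-1, 1]

x = np.linspace(-1, 1, 200001)
dx = x[1] - x[0]
for k, l in ((10, 20), (100, 200), (1000, 2000)):
    dist2 = np.sum((f(x, k) - f(x, l))**2) * dx   # crude Riemann sum
    print(k, l, dist2)   # -> 0 as k, l -> infinity
```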
Then, to have L²(a, b) or L¹(a, b) as complete spaces with respect to their
inner products, we need to include discontinuous functions like the one in
Eq. (1.351), and even non-piecewise-continuous functions such as sin(1/x)
and x^{−1/3}, as well as Dirichlet's function

f(x) = \begin{cases} 1, & x \text{ rational}, \\ 0, & \text{otherwise.} \end{cases} \qquad (1.353)

Figure 1.17: Riemann's method for integration.

But then comes the problem: how do we calculate the inner product of such
functions? In other words, how do we determine

\int_a^b dx\, |f(x)|^2 \qquad (1.354)

in case f(x) ∈ L²(a, b), or

\int_a^b dx\, |f(x)| \qquad (1.355)

in case f(x) ∈ L(a, b), with f(x) being one of these discontinuous functions?
Here is precisely where the name of Lebesgue becomes relevant! To determine
such integrals, the familiar notion of the Riemann integral needs to be generalized
so as to be able to integrate such rebellious functions. This generalization is
achieved by introducing the Lebesgue integral, which is equal to the Riemann
integral for functions that are integrable in the conventional sense. In the fol-
lowing we simply provide the basics of Lebesgue's method of integration, to
give the reader a very general idea of what a Lebesgue integral is.
The principal difference between the integrals of Riemann and Lebesgue
may be illustrated by pictures. Figure 1.17 shows a positive continuous func-
tion f defined on an interval −∞ < a ≤ x ≤ b < ∞, subdivided as for Riemann
integration: you pick a series of points of subdivision

a = x_0 < x_1 < x_2 < \dots < x_n = b, \qquad (1.356)

form the Riemann sum

\sum_{k=1}^{n} f(x_k')(x_k - x_{k-1}), \qquad (1.357)

in which x_k' is any point between x_{k-1} and x_k, and verify that this sum ap-
proaches a limit, namely the Riemann integral

\int_a^b dx\, f(x), \qquad (1.358)

as n → ∞ and the biggest of the lengths x_k − x_{k−1} (k ≤ n) tends to 0. In particu-
lar, it is important to remember that, for a partition P of [a, b], we can define
the upper sum of f over P (see Fig. 1.18) as

U(P, f) = \sum_{k=1}^{n} M_k\, \Delta x_k, \qquad \Delta x_k = x_k - x_{k-1}, \qquad (1.359)

and the lower sum of f over this subdivision as

L(P, f) = \sum_{k=1}^{n} m_k\, \Delta x_k, \qquad (1.360)

where

M_k = \sup[f(x) : x_{k-1} \le x \le x_k], \qquad m_k = \inf[f(x) : x_{k-1} \le x \le x_k] \qquad (1.361)

are the supremum and infimum of f in the specified subdivision. Remem-


ber that given a subset of a partially order set, the supremum (infimum) is the
least (greatest) element in the set that is greater (less) than or equal to all ele-
ments of the subset, if such an element exists. In other words, the supremum
(infimum) of f corresponds to the least upper (greatest lower) bound of f in
the subdivision of [a, b] considered. Then, one defines the upper and lower
Riemann integral of f on [a, b] as the infimum of U (P, f ) and as the supre-
mum of L(P, f ), respectively, i.e.

Zb Zb
d x f (x) = inf[U (P, f )], d x f (x) = sup[L(P, f )]. (1.362)
a a
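For readers who like to experiment, here is a minimal numerical sketch of Eqs. (1.359)-(1.360); the test function f(x) = x², the interval [0, 1] and the uniform partitions are illustrative choices, not taken from the text:

```python
import numpy as np

def upper_lower_sums(f, a, b, n):
    xs = np.linspace(a, b, n + 1)              # partition points x_0 < ... < x_n
    U = L = 0.0
    for k in range(n):
        # Sample f densely on each subinterval to approximate M_k and m_k.
        t = np.linspace(xs[k], xs[k + 1], 50)
        U += f(t).max() * (xs[k + 1] - xs[k])  # M_k * Delta x_k
        L += f(t).min() * (xs[k + 1] - xs[k])  # m_k * Delta x_k
    return U, L

for n in [4, 16, 64, 256]:
    U, L = upper_lower_sums(lambda x: x**2, 0.0, 1.0, n)
    print(f"n={n:4d}:  L={L:.6f}  U={U:.6f}")  # both converge to 1/3
```

As the partition is refined, both sums converge to the common value 1/3, the Riemann integral of x² on [0, 1].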

Figure 1.18: Left: Representation of the areas considered when determining
U(P, f) for a partition P on [a, b]. Right: Representation of the areas considered
when calculating L(P, f) for a partition P on [a, b].


Figure 1.19: Graph of Dirichlet's function given in Eq. (1.353).

Note that the upper Riemann integral of f is always greater than or equal to
the lower Riemann integral. When the two are equal to each other, we say
that f is Riemann integrable on [a, b], and we call this common value the Riemann
integral of f. Let us then determine the integral of the function f(x)
in Eq. (1.353) (see Fig. 1.19). If we partition the domain of this function, then
each subinterval will contain both rational and irrational numbers. Thus, the
supremum on each subinterval is 1 and the infimum on each subinterval is 0.
Then
$$\overline{\int_a^b} dx\, f(x) = b - a \qquad \text{and} \qquad \underline{\int_a^b} dx\, f(x) = 0, \tag{1.363}$$

so the upper and lower Riemann integrals are different: this function is not
Riemann integrable.
What to do then with this kind of function? Lebesgue simply turned
Riemann's recipe for integration on its side and subdivided the range of the
function instead of the domain. The idea, as indicated by the different shadings
in Fig. 1.20, is to lump together the points at which the function takes
on (approximately) the same values. This would appear to be a perfectly trivial
modification, but it has far-reaching consequences (however, those details
are beyond the scope of this Appendix)! Lebesgue's recipe tells you first to
subdivide the vertical axis by a series of points
$$y_0 \le \min(f), \qquad y_0 < y_1 < \cdots < y_n, \qquad y_n \ge \max(f), \tag{1.364}$$
next, to form the sum
$$\sum_{k=1}^{n} y_{k-1} \times \mathrm{measure}(\{x : y_{k-1} \le f(x) < y_k\}), \tag{1.365}$$
in which measure(...) is the sum of the lengths of the subintervals of $a \le x \le b$
on which the stated inequality takes place, and finally to verify that this
sum approaches the same number
$$\int_a^b dx\, f(x) \tag{1.366}$$

as $n \to \infty$ and the biggest of the lengths $y_k - y_{k-1}$ ($k \le n$) tends to 0. The
point is that, by extending the idea of measure from unions of disjoint
subintervals to the wider class of measurable subsets of the interval $a \le x \le b$,
you can integrate a much wider class of functions by means of Lebesgue's
recipe than you can by Riemann's.
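The recipe (1.365) is easy to mimic numerically for a well-behaved function. In the rough sketch below, the sample function, the grid used to approximate the measure, and the bin counts are all assumptions made purely for illustration:

```python
import numpy as np

f = lambda x: np.sin(np.pi * x) ** 2        # sample function on [0, 1]
x = np.linspace(0.0, 1.0, 200001)
dx = x[1] - x[0]
fx = f(x)

for n in [4, 16, 64, 256]:
    # y_0 < y_1 < ... < y_n slicing the *range* of f; tiny offset so max(f) is included.
    ys = np.linspace(fx.min(), fx.max() + 1e-12, n + 1)
    total = 0.0
    for k in range(1, n + 1):
        # measure{x : y_{k-1} <= f(x) < y_k}, approximated by counting grid points.
        measure = np.count_nonzero((fx >= ys[k - 1]) & (fx < ys[k])) * dx
        total += ys[k - 1] * measure
    print(f"n={n:4d}:  Lebesgue-style sum = {total:.6f}")
```

The sum approaches the integral of f, here 1/2, from below as the slicing of the range is refined.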
A more formal explanation of the Lebesgue integral, including how to construct
measurable subsets, etc., is beyond the scope of these notes, and we
also do not need all these details, since for the kind of functions considered
in this course the Riemann integral exists. When the Riemann integral exists,
the Lebesgue integral also exists, and both integrals are equal. However, you
should know that the integrals appearing in the definition of the inner product,
such as
$$\int_a^b dx\, |f(x)|^2, \tag{1.367}$$
are in the Lebesgue sense and not in the Riemann sense!



Figure 1.20: Lebesgue’s method for integration.

1.13 FURTHER READING


This chapter has been prepared using the following books:

• Fundamentals of Mathematical Physics, by Edgar A. Kraut (Dover Publi-


cations, Inc., 1995).

• Mathematics for Physicists, by Philippe Dennery and André Krzywicki (Dover Publications, Inc., 1995).

• Fourier Series and Integrals, by H. Dym and H. P. McKean (Academic


Press, 1972).

• Fourier Analysis and its Applications, by Gerald B. Folland (Wadsworth & Brooks, 1992).

• Fourier Analysis and Boundary Value Problems, by Enrique A. González-Velasco (Elsevier, 1996).

• Fourier Analysis of Economic Phenomena, by Toru Maruyama (Springer,


2018). Don’t be confused by the title!

• Applied Linear Algebra, by Peter J. Olver and Chehrzad Shakiban (Un-


dergraduate Texts in Mathematics, Second Edition).

• Introduction to Partial Differential Equations, by Peter J. Olver (Springer,


2014).

• Mathematical Methods For Physics and Engineering, by K. F. Riley, M. P.


Hobson, S. J. Bence (Cambridge University Press, Third Edition).

• Mathematical Methods for Physicists, A Comprehensive Guide, by George


B. Arfken, Hans J. Weber and Frank E. Harris (Elsevier, Seventh Edition).

• Mathematical Physics, A Modern Introduction to its Foundations, by Sadri


Hassani (Springer, Second Edition).

• Mathematics for Physicists, Introductory Concepts and Methods, by Alexandre Altland and Jan von Delft (Cambridge University Press, 2019).

• Variational Methods in Mathematics, Science and Engineering, by Karel Rektorys (D. Reidel Publishing Company, Second Edition).
CHAPTER 2

INTRODUCTION TO PARTIAL DIFFERENTIAL EQUATIONS

Newton has shown us that a law is only a necessary relation between the
present state of the world and its immediately subsequent state. All the
other laws since discovered are nothing else; they are, in sum, differential
equations.

− Henri Poincaré.

What are partial differential equations? A differential equation is an equation
that relates the derivatives of a (scalar) function depending on one or
more independent variables. For example,
$$\frac{d^4 u}{dx^4} + \frac{d^2 u}{dx^2} + u^2 = \cos(x) \tag{2.1}$$
is a differential equation for the function u(x) depending on a single (independent)
variable x, while
$$\frac{\partial u}{\partial t} = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} - u \tag{2.2}$$

is a differential equation involving a function u(t , x, y) of three independent


variables.
A differential equation is called ordinary if the function u depends on only
a single variable, and partial if it depends on more than one variable. Usually
(but not quite always) the dependence of u can be inferred from the deriva-
tives that appear in the differential equation. The order of the differential


equation is that of the highest-order derivative that appears in the equation.


Thus, Eq. (2.1) is a fourth-order ordinary differential equation, while Eq. (2.2)
corresponds to a second-order partial differential equation. If the differential
equation contains no derivatives of the function u, we say that the equation
has order 0. But the latter are more properly treated as algebraic equations, in
the sense that they are not true differential equations. To be a bona fide differ-
ential equation, it must contain at least one derivative of u, and hence have
order ≥ 1.
There are two common notations for partial derivatives, and we shall employ
them interchangeably. The first, used in Eqs. (2.1) and (2.2), is the familiar
Leibniz notation that employs a d to denote ordinary derivatives of functions
of a single variable, and the ∂ symbol for partial derivatives of functions
of more than one variable. An alternative, more compact notation employs
subscripts to indicate partial derivatives. For example, $u_t$ represents $\partial u/\partial t$,
while $u_{xx}$ is used for $\partial^2 u/\partial x^2$, and $u_{xxy}$ for $\partial^3 u/\partial x^2 \partial y$. Thus, in subscript notation,
the partial differential equation (2.2) is written as
$$u_t = u_{xx} + u_{yy} - u. \tag{2.3}$$
We will similarly abbreviate partial differential operators, sometimes writing
$\partial/\partial x$ as $\partial_x$, while $\partial^2/\partial x^2$ can be written as either $\partial_x^2$ or $\partial_{xx}$, and $\partial^3/\partial x^2 \partial y$ becomes
$\partial_{xxy} = \partial_x^2 \partial_y$.
It is worth mentioning that the preponderance of differential equations
arising in applications in physics, engineering, and within mathematics itself
are of either first or second order, with the latter being by far the most preva-
lent. Third-order differential equations arise when modeling waves in dis-
persive media, e.g., water waves or plasma waves. Fourth-order differential
equations show up in elasticity problems and in image processing. Equations
of order ≥ 5 are very rare.
Incidentally, most partial differential equations arising in physics applica-
tions are real, and, although complex solutions often facilitate their analysis,
at the end of the day we require real, physically meaningful solutions.
In this course, we are going to restrict our discussion of partial differential
equations to second-order ones, since they are the type most often encountered
in physics. In particular, we are going to focus our attention on some
well-known partial differential equations in physics, which are:

1. The wave equation, which describes, as a function of position $\mathbf{r}$ and
time t, the transverse displacement from equilibrium $u(\mathbf{r}, t)$ of a vibrating
string or membrane, or a vibrating solid, gas or liquid, and is given
by
$$\nabla^2 u = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}, \qquad \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \tag{2.4}$$
This equation also appears in electromagnetism, where u may be a component
of the electric or magnetic field in an electromagnetic wave, or the
current or voltage along a transmission line. The quantity c is the speed
of propagation of the waves.

2. The diffusion equation, given by
$$\kappa \nabla^2 u = \frac{\partial u}{\partial t}. \tag{2.5}$$
This equation describes the temperature $u(\mathbf{r}, t)$ in a region containing
no heat sources or sinks. It also applies to the diffusion of a chemical
that has a concentration $u(\mathbf{r}, t)$. The constant κ is called the diffusivity.
The equation is clearly second order in the three spatial variables, but
first order in time.

3. Laplace's equation, which can be obtained by setting $\partial u/\partial t = 0$ in the diffusion
equation, i.e.,
$$\nabla^2 u = 0, \tag{2.6}$$
and describes, for example, the steady-state temperature distribution in
a solid in which there are no heat sources, i.e., the temperature distribution
after a long time has elapsed. Laplace's equation also describes the
gravitational potential in a region containing no matter, or the electrostatic
potential in a charge-free region. Further, it applies to the flow of
an incompressible fluid with no sources, sinks or vortices; in this case u
is the velocity potential, from which the velocity is given by $\mathbf{v} = \nabla u$.

4. Schrödinger's equation, which is given by
$$-\frac{\hbar^2}{2m}\nabla^2 u + V(\mathbf{r})\,u = i\hbar\,\frac{\partial u}{\partial t}. \tag{2.7}$$
This equation describes the quantum mechanical wavefunction $u(\mathbf{r}, t)$
of a non-relativistic particle of mass m moving in the force field prescribed
by the (real) potential function $V(\mathbf{r})$. While the solution u is
complex-valued, the independent variables t and $\mathbf{r}$, representing time and
space, remain real. As in the case of the diffusion equation, it is second order
in the three spatial variables and first order in time.

2.1 INITIAL CONDITIONS AND BOUNDARY CONDITIONS

How many solutions does a partial differential equation have? In general, lots.
Even ordinary differential equations have infinitely many solutions. Indeed,
the general solution to a single nth order ordinary differential equation de-
pends on n arbitrary constants. The solutions to partial differential equa-
tions are yet more numerous, in that they depend on arbitrary functions. Very
roughly, we can expect the solution to an nth order partial differential equa-
tion involving m independent variables to depend on n arbitrary functions of
m − 1 variables. But this must be taken with a large grain of salt: only in a few
special instances will we actually be able to express the solution in terms of
arbitrary functions.
An ordinary or a partial differential equation will provide a unique solution
to a physical problem only if the initial or starting value of the solution
is known. We refer to such specifications as boundary conditions. For ordinary
differential equations, when time is involved, boundary conditions amount
to the specification of one or more properties of the solution at an initial time;
that is why, for ordinary differential equations involving time, one speaks of
initial conditions. For example,

$$\frac{dy}{dt} = a\,y(t), \qquad a \in \mathbb{R}, \tag{2.8}$$
is a differential equation, i.e., an equation involving both y(t) and dy/dt, and it
is an ordinary differential equation. Its solution is given by
$$y(t) = b\,e^{at}, \qquad b \in \mathbb{R}, \tag{2.9}$$
where each value of b defines a different solution. How do we know that Eq. (2.9)
is indeed a solution? We just need to substitute Eq. (2.9) into Eq. (2.8) and
check that Eq. (2.8) is satisfied:
$$\frac{d}{dt}\left[ b\,e^{at} \right] = a\,b\,e^{at} = a\left[ b\,e^{at} \right], \tag{2.10}$$
which is precisely Eq. (2.8). So, as we have mentioned, the first thing that
we have observed is that the solution of a differential equation need not be
unique. The set of all solutions of a differential equation is called the general solution.
To fix a unique specific solution we need to impose some conditions.
For example, we might require that the solution of Eq. (2.8) obey an initial
condition such as y(0) = 1. If we impose such a condition, using Eq. (2.9), we fix
a particular solution with b = 1.
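This fixing of the constant by an initial condition is easy to reproduce symbolically. A minimal sympy sketch (illustrative, not part of the original text) solves Eq. (2.8) with and without the condition y(0) = 1:

```python
import sympy as sp

t, a = sp.symbols("t a", real=True)
y = sp.Function("y")

# General solution: one arbitrary constant, as in Eq. (2.9) with b = C1.
general = sp.dsolve(sp.Eq(y(t).diff(t), a * y(t)), y(t))
print(general)        # y(t) = C1*exp(a*t)

# Imposing the initial condition y(0) = 1 fixes b = 1.
particular = sp.dsolve(sp.Eq(y(t).diff(t), a * y(t)), y(t), ics={y(0): 1})
print(particular)     # y(t) = exp(a*t)
```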

A similar specification of auxiliary conditions applies to partial differential
equations. Equations modeling equilibrium phenomena are supplemented by
boundary conditions imposed on the boundary of the domain of interest. In favorable
circumstances, the boundary conditions serve to single out a unique solution.
For example, the equilibrium temperature of a body is uniquely specified
by its boundary behavior. If the domain is unbounded, one must also restrict
the nature of the solution at large distances. The combination of a partial differential
equation along with suitable boundary conditions is referred to as a
boundary value problem.
There are three principal types of boundary value problems that arise in
most applications. Specifying the value of the solution along the boundary of
the domain is called a Dirichlet boundary condition, to honor the nineteenth-century
analyst Johann Peter Gustav Lejeune Dirichlet. Specifying the normal
derivative of the solution along the boundary results in a Neumann boundary
condition, named after his contemporary Carl Gottfried Neumann (if $\mathbf{n}$ is a
unit outward normal vector to the boundary of the domain Ω, $\partial u/\partial n = \nabla u \cdot \mathbf{n}$ is
the normal derivative of u). Prescribing the function along part of the boundary
and the normal derivative along the remainder results in a mixed boundary
value problem. If Ω is unbounded, u should decay to zero reasonably
rapidly at large distances. So, for example, in thermal equilibrium, the Dirichlet
boundary value problem specifies the temperature of a body along its boundary,
and our task is to find the interior temperature distribution by solving an
appropriate partial differential equation. Similarly, the Neumann boundary
value problem prescribes the heat flux through the boundary. In particular,
an insulated boundary has no heat flux, and hence the normal derivative of
the temperature is zero on the boundary. The mixed boundary value problem
prescribes the temperature along part of the boundary and the heat flux along
the remainder. Again, our task is to determine the interior temperature of the
body by solving an appropriate differential equation.
For partial differential equations modeling physical processes, in which
time is one of the independent variables, the solution is to be specified by one
or more initial conditions. The number of initial conditions required depends
on the highest-order time derivative that appears in the equation. For exam-
ple, in thermodynamics, which involves only the first-order time derivative
of the temperature, the initial condition requires specifying the temperature
of the body at the initial time. Newtonian mechanics describes the accelera-
tion or second-order time derivative of the motion, and so requires two initial
conditions, for example, the initial position and initial velocity of the system.
On bounded domains, one must also impose suitable boundary conditions
in order to uniquely characterize the solution and hence the subsequent dy-
namical behavior of the physical system. The combination of the partial dif-

ferential equation, the initial conditions, and the boundary conditions leads
to an initial-boundary value problem.
An additional consideration is that, besides any smoothness required by
the partial differential equation within the domain, the solution and any of
its derivatives specified in any initial or boundary condition should also be
continuous at the initial or boundary point where the condition is imposed.
For example, if the initial condition specifies the function value u(0, x) for
a < x < b, while the boundary conditions specify the derivatives $\frac{\partial u}{\partial x}(t, a)$ and
$\frac{\partial u}{\partial x}(t, b)$ for t > 0, then, in addition to any smoothness required inside the domain
{a < x < b, t > 0}, we also require that u be continuous at all initial points
(0, x), and that its derivative $\frac{\partial u}{\partial x}$ be continuous at all boundary points (t, a) and
(t, b), in order that u(t, x) qualify as a solution to the initial-boundary value
problem.

2.2 LINEAR AND NONLINEAR EQUATIONS


As with algebraic equations, in partial differential equations, as well as in or-
dinary differential equations, there is a crucial distinction between linear and
nonlinear partial differential equations. While linear algebraic equations are
eminently solvable by a variety of techniques, linear ordinary differential equa-
tions, of order ≥ 2, already present a challenge. Linear partial differential
equations are of a yet higher level of difficulty, and only a small handful of
specific equations can be completely solved. Moreover, explicit solutions tend
to be expressible only in the form of infinite series, requiring subtle analytic
tools to understand their convergence and properties. For the vast majority
of partial differential equations, the only feasible way of producing general
solutions is through numerical approximation.
The distinguishing feature of linearity is that it enables one to straight-
forwardly combine solutions to form new solutions, through a general su-
perposition principle. Linear superposition is universally applicable to all
linear equations and systems, including linear algebraic systems, linear ordi-
nary differential equations, linear partial differential equations, linear initial
and boundary value problems, and so on. Let us introduce the basic idea in
the context of a single differential equation.
A differential equation is called homogeneous linear if it is a sum of terms,
each of which involves the dependent variable u or one of its derivatives to the
first power; on the other hand, there is no restriction on how the terms involve
the independent variables. Thus,
$$\frac{d^2 u}{dx^2} + \frac{u}{1 + x^2} = 0 \tag{2.11}$$
is a homogeneous linear second-order ordinary differential equation. The
partial differential equation
$$\frac{\partial u}{\partial t} = e^x \frac{\partial^2 u}{\partial x^2} + \cos(x - t)\,u \tag{2.12}$$
is a homogeneous linear partial differential equation. However, an equation
like
$$\frac{\partial u}{\partial t} + u\,\frac{\partial u}{\partial x} = \frac{\partial^2 u}{\partial x^2} \tag{2.13}$$
is not linear, since the second term involves the product of u and its derivative
$u_x$.
A more precise definition of a homogeneous linear differential equation
begins with the concept of a linear differential operator L. Such operators are
assembled by summing the basic partial derivative operators, with either constant
coefficients or, more generally, coefficients depending on the independent
variables. The operator acts on sufficiently smooth functions depending
on the relevant independent variables. The linearity of L imposes two key requirements:
$$L[u + v] = L[u] + L[v], \qquad L[cu] = cL[u], \tag{2.14}$$
for any two (sufficiently smooth) functions u, v, and any constant c.
In this way, a homogeneous linear differential equation is defined as one
of the form
$$L[u] = 0, \tag{2.15}$$
where L is a linear differential operator.

EXAMPLE 2.1. Let's consider the second-order differential operator
$$L = \frac{\partial^2}{\partial x^2}, \quad \text{whereby} \quad L[u] = \frac{\partial^2 u}{\partial x^2}, \tag{2.16}$$
for any function u(x, y) that is twice continuously differentiable. As we can see,
$$L[u + v] = \frac{\partial^2}{\partial x^2}(u + v) = \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 v}{\partial x^2} = L[u] + L[v],$$
$$L[cu] = \frac{\partial^2}{\partial x^2}(cu) = c\,\frac{\partial^2 u}{\partial x^2} = cL[u], \tag{2.17}$$

which are valid for any twice continuously differentiable functions u, v and
any constant c. The corresponding homogeneous linear differential equation
L[u] = 0 is
$$\frac{\partial^2 u}{\partial x^2} = 0. \tag{2.18}$$
Let's consider now
$$L = \partial_t - \partial_x^2, \quad \text{with} \quad L[u] = \partial_t u - \partial_x^2 u = u_t - u_{xx} = 0. \tag{2.19}$$
As we can see,
$$L[u + v] = \partial_t(u + v) - \partial_x^2(u + v) = (\partial_t u - \partial_x^2 u) + (\partial_t v - \partial_x^2 v) = L[u] + L[v],$$
$$L[cu] = \partial_t(cu) - \partial_x^2(cu) = c(\partial_t u - \partial_x^2 u) = cL[u]. \tag{2.20}$$
Similarly, the linear differential operator
$$L = \partial_t^2 - \partial_x[\kappa(x)\partial_x] = \partial_t^2 - \kappa(x)\partial_x^2 - \kappa'(x)\partial_x, \tag{2.21}$$
where κ(x) is a prescribed continuously differentiable function of x alone
(with κ'(x) = dκ/dx), defines the homogeneous linear partial differential equation
$$L[u] = \partial_t^2 u - \partial_x[\kappa(x)\partial_x u] = u_{tt} - \partial_x[\kappa(x)u_x] = u_{tt} - \kappa(x)u_{xx} - \kappa'(x)u_x = 0. \tag{2.22}$$

The defining attributes of linear operators, Eq. (2.14), imply the key properties
shared by all homogeneous linear differential equations: If $u_1, \dots, u_k$ are solutions
to a common homogeneous linear equation L[u] = 0, then the linear
combination, or superposition, $u = c_1 u_1 + \cdots + c_k u_k$ is a solution for any choice
of constants $c_1, \dots, c_k$. This is called the superposition principle. Indeed,
$$L[u] = L[c_1 u_1 + \cdots + c_k u_k] = L[c_1 u_1 + \cdots + c_{k-1} u_{k-1}] + L[c_k u_k]
= \cdots = L[c_1 u_1] + \cdots + L[c_k u_k] = c_1 L[u_1] + \cdots + c_k L[u_k]. \tag{2.23}$$

In particular, if the functions $u_1, \dots, u_k$ are solutions of the differential equation,
$L[u_1] = 0, \dots, L[u_k] = 0$, then the right side of the preceding equation
vanishes, proving that u also solves the differential equation L[u] = 0. In the case
of ordinary differential equations, as for linear algebraic equations, once one
finds a sufficient number of independent solutions, whose number coincides with the
order of the equation, the general solution is obtained as a linear combination
thereof. In the language of linear algebra, the solutions form a finite-dimensional
vector space. In contrast, most linear systems of partial differential
equations admit an infinite number of independent solutions, meaning

that the solution space is infinite-dimensional, and, as a consequence, one
cannot hope to build the general solution by taking finite linear combinations.
Instead, one requires the far more delicate operation of forming infinite series
involving the basic solutions, which is intimately linked with Fourier analysis.
If L[u] = 0 is called a homogeneous linear differential equation, what happens if the
right-hand side of the equation is not zero? For example, what if we have
$$L[v] = f, \tag{2.24}$$
where L is a linear differential operator, v is the unknown function, and f is
a given nonzero function of the independent variables alone? Such equations
are called inhomogeneous linear differential equations. For example,
$$L[v] = \partial_t v - \partial_x^2 v = v_t - v_{xx} = f(t, x), \tag{2.25}$$
where f(t, x) is a specified function, models the thermodynamics of a one-dimensional
medium subject to an external heat source.
The basic technique for solving inhomogeneous linear equations is: (1)
determine the general solution to the homogeneous equation; (2) find a particular
solution to the inhomogeneous version. The general solution to the
inhomogeneous equation is then obtained by adding the two together. In this
way, if we call $v^*$ the particular solution to the inhomogeneous linear equation
$L[v^*] = f$, the general solution to L[v] = f is given by $v = v^* + u$, where u
is the general solution to the corresponding homogeneous equation L[u] = 0.
Indeed,
$$L[v] = L[v^* + u] = L[v^*] + L[u] = f + 0 = f, \tag{2.26}$$
as required. And if we set $u = v - v^*$,
$$L[u] = L[v] - L[v^*] = f - f = 0, \tag{2.27}$$
and hence u is a solution to the homogeneous differential equation. Thus,
$v = v^* + u$ has the required form.
In physical applications, one can interpret the particular solution v ∗ as a
response of the system to the external forcing function. The solution u to the
homogeneous equation represents the system’s internal, unforced behavior.
The general solution to the inhomogeneous linear equation is thus a combi-
nation, v = v ∗ + u of the external and internal responses.
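The decomposition $v = v^* + u$ is easy to see on a toy problem. The following sympy sketch uses a simple ODE stand-in, L[v] = v'' + v = x (an assumed example, not the forced heat equation of Eq. (2.25)):

```python
import sympy as sp

x, c1, c2 = sp.symbols("x c1 c2")
L = lambda w: sp.diff(w, x, 2) + w       # the stand-in operator L[w] = w'' + w

v_star = x                               # a particular solution: L[x] = 0 + x = x
u_hom = c1 * sp.sin(x) + c2 * sp.cos(x)  # general homogeneous solution, L[u] = 0

v = v_star + u_hom
print(sp.simplify(L(v)))                 # x: every v = v* + u solves L[v] = x
```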
Finally, the superposition principle for inhomogeneous linear equations
allows one to combine the responses of the system to different external forcing
functions: let $v_1, \dots, v_k$ be solutions to the inhomogeneous linear systems
$L[v_1] = f_1, \dots, L[v_k] = f_k$, involving the same linear operator L. Then,
given any constants $c_1, \dots, c_k$, the linear combination $v = c_1 v_1 + \cdots + c_k v_k$ solves
the inhomogeneous system L[v] = f for the combined forcing function
$f = c_1 f_1 + \cdots + c_k f_k$.
The two general superposition principles furnish us with powerful tools
for solving linear partial differential equations which we shall repeatedly ex-
ploit throughout this chapter. In contrast, nonlinear partial differential equa-
tions are much tougher, and, typically, knowledge of several solutions is of
scant help in constructing others. Indeed, finding even one solution to a
nonlinear partial differential equation can be quite challenge. In this course,
we will simply concentrate on analyzing the solutions and their properties to
some of the most basic and most important linear partial differential equa-
tions.
Our aim is to solve some of the common partial differential equations appearing
in physics inside a bounded domain $\Omega \subset \mathbb{R}^3$ (keep in mind that the
boundary of the domain Ω consists of one or more surfaces). To do this, we
are going to use Dirichlet, Neumann and mixed boundary conditions.
The boundary conditions needed to specify a unique solution will depend
on the equation considered. For example, the solution of Poisson's equation
is unique if either u or its normal derivative ∂u/∂n is specified on the boundary surface S, i.e.,
with either Dirichlet or Neumann boundary conditions. However, considering
mixed boundary conditions overdetermines the problem and leads to no
solution. We will not enter into the details of showing that the boundary conditions
chosen do, indeed, produce a unique solution, since such proofs are beyond
the scope of this introduction to partial differential equations.
Even more, we concentrate on the method called separation of variables
to solve second-order partial differential equations. When using this technique,
we are going to transform the partial differential equation into a set of
ordinary differential equations which need to be solved. For this reason, before
embarking on the separation of variables method for partial differential
equations, we need to present some basic resolution methods for the kinds of
ordinary differential equations we will face.

2.3 SOLUTION FOR LINEAR ORDINARY DIFFERENTIAL EQUATIONS

The kind of ordinary differential equation which we are going to encounter


when applying the separation of variables method to partial differential equa-

tions like the wave equation, the Laplace equation, etc., is of the form
$$L[u] = a_n(x)\frac{d^n u}{dx^n} + a_{n-1}(x)\frac{d^{n-1} u}{dx^{n-1}} + \cdots + a_1(x)\frac{du}{dx} + a_0(x)u = 0, \tag{2.28}$$
which is a homogeneous, linear, ordinary differential equation with variable
coefficients, i.e., the $a_i$, $i = 0, 1, \dots, n$, appearing in Eq. (2.28) depend on x. The
general solution of Eq. (2.28) will contain n arbitrary constants, which can be
determined if n boundary conditions are also provided.
To determine u(x), since we have n arbitrary constants $c_i$, $i = 1, 2, \dots, n$, that
may be determined if n boundary conditions are provided, we need to find n
solutions $u_1(x), u_2(x), \dots, u_n(x)$ and, due to the linearity of the differential
equation, construct the linear superposition
$$u(x) = c_1 u_1(x) + c_2 u_2(x) + \cdots + c_n u_n(x). \tag{2.29}$$
In general, for n functions $u_i(x)$, $i = 1, 2, \dots, n$, to be linearly independent
on a domain we need that
$$c_1 u_1(x) + c_2 u_2(x) + \cdots + c_n u_n(x) \ne 0 \tag{2.30}$$
over the domain in question, for any set of constants $c_1, c_2, \dots, c_n$, except for
the trivial case $c_1 = c_2 = \cdots = c_n = 0$. A statement equivalent to the above equation,
which is perhaps more useful for the practical determination of linear
independence, can be obtained by repeatedly differentiating Eq. (2.30), n − 1
times in all, to obtain n simultaneous equations for $c_1, c_2, \dots, c_n$:
$$\begin{aligned}
c_1 u_1(x) + c_2 u_2(x) + \cdots + c_n u_n(x) &= 0,\\
c_1 \frac{du_1(x)}{dx} + c_2 \frac{du_2(x)}{dx} + \cdots + c_n \frac{du_n(x)}{dx} &= 0,\\
&\ \,\vdots\\
c_1 \frac{d^{n-1}u_1(x)}{dx^{n-1}} + c_2 \frac{d^{n-1}u_2(x)}{dx^{n-1}} + \cdots + c_n \frac{d^{n-1}u_n(x)}{dx^{n-1}} &= 0.
\end{aligned} \tag{2.31}$$
We can write Eq. (2.31) in matrix form as
$$\begin{pmatrix}
u_1(x) & u_2(x) & \dots & u_n(x)\\
\frac{du_1(x)}{dx} & \frac{du_2(x)}{dx} & \dots & \frac{du_n(x)}{dx}\\
\vdots & \vdots & & \vdots\\
\frac{d^{n-1}u_1(x)}{dx^{n-1}} & \frac{d^{n-1}u_2(x)}{dx^{n-1}} & \dots & \frac{d^{n-1}u_n(x)}{dx^{n-1}}
\end{pmatrix}
\begin{pmatrix} c_1\\ c_2\\ \vdots\\ c_n \end{pmatrix} =
\begin{pmatrix} 0\\ 0\\ \vdots\\ 0 \end{pmatrix}. \tag{2.32}$$

In this way, if the determinant $W(u_1, u_2, \dots, u_n)$ of the matrix
$$\begin{pmatrix}
u_1(x) & u_2(x) & \dots & u_n(x)\\
\frac{du_1(x)}{dx} & \frac{du_2(x)}{dx} & \dots & \frac{du_n(x)}{dx}\\
\vdots & \vdots & & \vdots\\
\frac{d^{n-1}u_1(x)}{dx^{n-1}} & \frac{d^{n-1}u_2(x)}{dx^{n-1}} & \dots & \frac{d^{n-1}u_n(x)}{dx^{n-1}}
\end{pmatrix} \tag{2.33}$$
on the domain considered is non-zero, then the only solution to Eq. (2.31)
is the trivial one, $c_1 = c_2 = \cdots = c_n = 0$; thus, the n functions $u_1(x), u_2(x), \dots, u_n(x)$
are linearly independent on the domain. This determinant $W(u_1, u_2, \dots, u_n)$
is called the Wronskian of the set of functions. Note, however, that, in general,
the vanishing of the Wronskian does not guarantee that the functions are
linearly dependent. For example, if we consider the functions $u_1(x) = x$ and
$u_2(x) = |x|$, we have that $\frac{du_1}{dx} = 1$ and $\frac{du_2}{dx} = x/|x|$. Then
$$W(u_1, u_2) = \begin{vmatrix} x & |x|\\ 1 & \frac{x}{|x|} \end{vmatrix} = \frac{x^2}{|x|} - |x| = |x| - |x| = 0. \tag{2.34}$$
Then we might conclude that $u_1(x) = x$ and $u_2(x) = |x|$ are not linearly independent,
and that such a conclusion is valid no matter the interval considered
for x. Note, however, that $\frac{du_2}{dx} = x/|x|$ does not exist at x = 0, and if we consider
a domain of which x = 0 is part, the vanishing of $W(u_1, u_2)$ does not imply
that $u_1$ and $u_2$ are linearly dependent. However, it is possible to demonstrate
that if the $u_i(x)$ are solutions to an nth order ordinary linear differential equation,
which is our case, and the Wronskian $W(u_1, u_2, \dots, u_n)$ vanishes, then $\{u_i\}_{i=1}^{n}$
is a linearly dependent set of functions. Moreover, if the Wronskian does not
vanish for some value of x, then it does not vanish for any value of x, in which
case an arbitrary linear combination of the $u_i(x)$ constitutes, as stated before,
the most general solution to the nth order ordinary linear differential equation.
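A short symbolic check of the Wronskian criterion may be helpful here. In this sketch the equation u'' − 3u' + 2u = 0 and its solutions are an assumed example, not taken from the text; its Wronskian is nonvanishing everywhere, as expected for independent solutions of a linear ODE:

```python
import sympy as sp

x = sp.Symbol("x")
u1, u2 = sp.exp(x), sp.exp(2 * x)    # both solve u'' - 3u' + 2u = 0

W = sp.Matrix([[u1, u2],
               [sp.diff(u1, x), sp.diff(u2, x)]]).det()
print(sp.simplify(W))                # exp(3*x), nonzero for every x
```

By contrast, the pair u1 = x, u2 = |x| of Eq. (2.34) has W = 0 away from x = 0 even though the two functions are not proportional on any interval containing the origin, illustrating why the converse statement requires the functions to be solutions of a common linear ODE.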

2.3.1 CONSTANT COEFFICIENTS


Let us first study the case in which the coefficients $a_i$, $i = 0, 1, \dots, n$, in Eq. (2.28)
are constants rather than functions of x, i.e., we have
$$a_n \frac{d^n u}{dx^n} + a_{n-1} \frac{d^{n-1} u}{dx^{n-1}} + \cdots + a_1 \frac{du}{dx} + a_0 u = 0. \tag{2.35}$$
The standard method to find the solution u(x) of Eq. (2.35) is to try a solution
of the form $Ae^{\lambda x}$. Since
$$\frac{d^n}{dx^n}\left[ Ae^{\lambda x} \right] = \lambda^n \left[ Ae^{\lambda x} \right], \tag{2.36}$$

we get from Eq. (2.35)
$$\left[ a_n \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1 \lambda + a_0 \right] Ae^{\lambda x} = 0, \tag{2.37}$$
which, since A ≠ 0, produces the following polynomial equation in λ of order n,
$$a_n \lambda^n + a_{n-1}\lambda^{n-1} + \cdots + a_1 \lambda + a_0 = 0. \tag{2.38}$$
In general, Eq. (2.38) has n roots, which we denote as $\lambda_1, \lambda_2, \dots, \lambda_n$,
and we may distinguish three main cases:

1. All roots are real and distinct. In this case, the n solutions to Eq. (2.35)
are $u_i(x) = e^{\lambda_i x}$, $i = 1, 2, \dots, n$, and the related Wronskian is non-zero
since all the $\lambda_i$ are different from each other. Thus, the solution u(x) is
given by the linear superposition of all the $u_i(x)$, i.e.,
$$u(x) = c_1 e^{\lambda_1 x} + c_2 e^{\lambda_2 x} + \cdots + c_n e^{\lambda_n x}. \tag{2.39}$$

2. Some roots are complex. If all the $a_i$ coefficients are real and one of the
roots of Eq. (2.38) is complex, say $\lambda_R + i\lambda_I$, with $\lambda_R, \lambda_I \in \mathbb{R}$, then its
complex conjugate $\lambda_R - i\lambda_I$ is also a root. In this case, when combining
them into a linear superposition, we will have that
$$\begin{aligned}
c_1 e^{(\lambda_R + i\lambda_I)x} + c_2 e^{(\lambda_R - i\lambda_I)x} &= e^{\lambda_R x}\left[ c_1 e^{i\lambda_I x} + c_2 e^{-i\lambda_I x} \right]\\
&= e^{\lambda_R x}\left[ c_1\{\cos(\lambda_I x) + i\sin(\lambda_I x)\} + c_2\{\cos(\lambda_I x) - i\sin(\lambda_I x)\} \right]\\
&= e^{\lambda_R x}\left[ (c_1 + c_2)\cos(\lambda_I x) + i(c_1 - c_2)\sin(\lambda_I x) \right]\\
&\equiv e^{\lambda_R x}\left[ \alpha\cos(\lambda_I x) + \beta\sin(\lambda_I x) \right]\\
&= e^{\lambda_R x} A\cos(\lambda_I x + \varphi) \quad \text{or} \quad e^{\lambda_R x} B\sin(\lambda_I x + \eta),
\end{aligned} \tag{2.40}$$
where α, β, A, B, φ and η are arbitrary constants to be determined by
the boundary conditions.

3. Some roots are repeated. If, for example, the root $\lambda_1$ occurs k times
(k > 1), then we have not yet found n linearly independent solutions of
Eq. (2.35). We must find k − 1 further solutions that are linearly independent
of those already found and also of each other. By direct substitution
into Eq. (2.35) one finds that, if $e^{\lambda_1 x}$ is a solution,
$$xe^{\lambda_1 x}, \quad x^2 e^{\lambda_1 x}, \quad \dots, \quad x^{k-1}e^{\lambda_1 x} \tag{2.41}$$
are also solutions, and it is easily shown that they, together with the solutions
already found, form a linearly independent set of n functions. In
this way,
$$u(x) = (c_1 + c_2 x + \cdots + c_k x^{k-1})e^{\lambda_1 x} + c_{k+1}e^{\lambda_{k+1}x} + c_{k+2}e^{\lambda_{k+2}x} + \cdots + c_n e^{\lambda_n x}. \tag{2.42}$$
The above argument can easily be extended if more than one root is
repeated. For example, suppose as before that $\lambda_1$ is a k-fold root of
Eq. (2.38) and, further, that $\lambda_2$ is an l-fold root (both k, l > 1). Then, the
solution u(x) reads
$$u(x) = (c_1 + c_2 x + \cdots + c_k x^{k-1})e^{\lambda_1 x} + (c_{k+1} + c_{k+2}x + \cdots + c_{k+l}x^{l-1})e^{\lambda_2 x}
+ c_{k+l+1}e^{\lambda_{k+l+1}x} + c_{k+l+2}e^{\lambda_{k+l+2}x} + \cdots + c_n e^{\lambda_n x}. \tag{2.43}$$

EXAMPLE 2.2. Let's consider the equation
$$\frac{d^2 u}{dx^2} - 2\frac{du}{dx} + u = 0. \tag{2.44}$$
Using as solution for this equation the function $Ae^{\lambda x}$, we get the polynomial
$$\lambda^2 - 2\lambda + 1 = 0. \tag{2.45}$$
The above equation has the root $\lambda_1 = 1$, which occurs twice. Thus, $e^{\lambda_1 x}$ and
$xe^{\lambda_1 x}$ are two linearly independent solutions and
$$u(x) = (c_1 + c_2 x)e^{x}. \tag{2.46}$$
Let's check our solution:
$$\begin{aligned}
\frac{du}{dx} &= \left[ c_1 + c_2(x + 1) \right]e^x,\\
\frac{d^2u}{dx^2} &= \left[ c_1 + c_2(x + 2) \right]e^x\\
\Longrightarrow \quad \frac{d^2u}{dx^2} - 2\frac{du}{dx} &= -\left[ c_1 + c_2 x \right]e^x = -u,
\end{aligned} \tag{2.47}$$
as expected.

EXAMPLE 2.3. Solve
$$\frac{d^2 u}{dx^2} + 4u = 0. \tag{2.48}$$
Using $Ae^{\lambda x}$ as solution for Eq. (2.48), we get the polynomial
$$\lambda^2 + 4 = 0 \quad \Longrightarrow \quad \lambda = \pm\sqrt{-4} = \pm 2i; \tag{2.49}$$
thus, we have two roots, $\lambda_1 = 2i$ and $\lambda_2 = -2i$. Then, using Eq. (2.40), the
solution u(x) is given by
$$u(x) = \alpha\cos(2x) + \beta\sin(2x), \tag{2.50}$$
where α, β are arbitrary constants which can be determined from the boundary
conditions. We can check our solution:
$$\begin{aligned}
\frac{du}{dx} &= -2\alpha\sin(2x) + 2\beta\cos(2x),\\
\frac{d^2u}{dx^2} &= -4\alpha\cos(2x) - 4\beta\sin(2x) = -4u,
\end{aligned} \tag{2.51}$$
as expected.
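Both examples can be cross-checked with a computer algebra system; a minimal sympy sketch (illustrative only):

```python
import sympy as sp

x = sp.Symbol("x")
u = sp.Function("u")

# Example 2.2: u'' - 2u' + u = 0, repeated root lambda = 1.
print(sp.dsolve(u(x).diff(x, 2) - 2 * u(x).diff(x) + u(x), u(x)))
# -> u(x) = (C1 + C2*x)*exp(x), matching Eq. (2.46)

# Example 2.3: u'' + 4u = 0, roots lambda = +-2i.
print(sp.dsolve(u(x).diff(x, 2) + 4 * u(x), u(x)))
# -> u(x) = C1*sin(2*x) + C2*cos(2*x), matching Eq. (2.50)
```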

2.3.2 VARIABLE COEFFICIENTS: SERIES SOLUTION


Another kind of ordinary differential equation which we will need to solve when
dealing with the previously mentioned partial differential equations is the linear,
homogeneous, ordinary differential equation with variable coefficients.
Here we are going to discuss a method for obtaining solutions for such equations
in the form of convergent series. Such series can be evaluated numerically,
and those occurring most commonly are named and tabulated, such as
sin(x), cos(x) or $e^x$. In particular, we are going to be concerned with second-order
linear, homogeneous, ordinary differential equations. Such equations
can be written in the form
$$\frac{d^2 u}{dx^2} + P(x)\frac{du}{dx} + Q(x)u = 0, \tag{2.52}$$
and their solution can be written as
$$u(x) = c_1 u_1(x) + c_2 u_2(x). \tag{2.53}$$
The solutions $u_1(x)$ and $u_2(x)$ are linearly independent; thus, as we saw, the
Wronskian
$$W(x) = \begin{vmatrix} u_1 & u_2\\ \frac{du_1}{dx} & \frac{du_2}{dx} \end{vmatrix} = u_1\frac{du_2}{dx} - u_2\frac{du_1}{dx} \ne 0. \tag{2.54}$$

So far we have always assumed that u(x) is a real function of a real variable
x. However, this is not always the case, and we are going to broaden our discussion
in this section by generalizing u(x) to a complex function u(z) of a
complex variable z. We thus consider the second-order linear homogeneous
equation
$$\frac{d^2 u}{dz^2} + P(z)\frac{du}{dz} + Q(z)u(z) = 0, \tag{2.55}$$
where differentiation with respect to z is treated in a way analogous to ordinary
differentiation with respect to a real variable x. We limit our considerations
to the cases where the functions P(z) and Q(z) are analytic in a certain
domain R, except at an enumerable number of points of R where these functions
may have isolated singularities.
If at some point $z = z_0$ the functions P(z) and Q(z) are finite and can be
expressed as complex power series about $z_0$,
$$P(z) = \sum_{n=0}^{\infty} P_n (z - z_0)^n, \qquad Q(z) = \sum_{n=0}^{\infty} Q_n (z - z_0)^n, \tag{2.56}$$
then P(z) and Q(z) are said to be analytic at $z = z_0$, and this point is called
an ordinary point of the ordinary differential equation. If, however, P(z) or
Q(z), or both, diverge at $z = z_0$, then it is called a singular point of the ordinary
differential equation. Even if an ordinary differential equation is singular at a
given point $z = z_0$, it may still possess a non-singular solution at that point.
In fact, the necessary and sufficient condition for such a solution to exist is
that $(z - z_0)P(z)$ and $(z - z_0)^2 Q(z)$ are both analytic at $z = z_0$. Singular points
that have this property are called regular singular points, whereas any singular
point not satisfying both these criteria is called an irregular or essential
singularity.
Sometimes $z_0$ might not be finite and we might need to determine the
nature of the point $|z| \to \infty$. This can be done by simply substituting w = 1/z
into the differential equation and investigating the behavior at w = 0.

EXAMPLE 2.4. Legendre's equation has the form
$$(1 - z^2)\frac{d^2 u}{dz^2} - 2z\frac{du}{dz} + l(l + 1)u = 0, \tag{2.57}$$
where l is a constant. Let us show that z = 0 is an ordinary point, while z = ±1 and
$|z| \to \infty$ are regular singular points of the equation. First, Eq. (2.57) can be
written as
$$\frac{d^2 u}{dz^2} - \frac{2z}{1 - z^2}\frac{du}{dz} + \frac{l(l + 1)}{1 - z^2}u = 0; \tag{2.58}$$
thus, comparing with Eq. (2.55),
$$P(z) = -\frac{2z}{1 - z^2} = -\frac{2z}{(1 + z)(1 - z)}, \qquad Q(z) = \frac{l(l + 1)}{1 - z^2} = \frac{l(l + 1)}{(1 + z)(1 - z)}. \tag{2.59}$$
It is then clear that P(z) and Q(z) are both analytic at z = 0 and both diverge at
z = ±1. Then, z = 0 is an ordinary point of the differential equation and z = ±1
are singular points. At z = 1,
$$(z - 1)P(z) = \frac{2z}{1 + z}, \qquad (z - 1)^2 Q(z) = l(l + 1)\frac{1 - z}{1 + z}, \tag{2.60}$$
and they are both analytic at z = 1. Hence, z = 1 is a regular singular point.
Similarly, at z = −1, both $(z + 1)P(z)$ and $(z + 1)^2 Q(z)$ are analytic; thus, z = −1
is another regular singular point of the equation.
Next, letting w = 1/z,
$$\frac{du}{dz} = \frac{du}{dw}\frac{dw}{dz} = -\frac{1}{z^2}\frac{du}{dw} = -w^2\frac{du}{dw},$$
$$\frac{d^2u}{dz^2} = \frac{d}{dz}\left( -w^2\frac{du}{dw} \right) = -w^2\frac{d}{dw}\left( -w^2\frac{du}{dw} \right)
= -w^2\left( -2w\frac{du}{dw} - w^2\frac{d^2u}{dw^2} \right) = 2w^3\frac{du}{dw} + w^4\frac{d^2u}{dw^2}. \tag{2.61}$$
In this way, if we substitute these derivatives into Legendre's equation (2.58)
we get
$$\left( 1 - \frac{1}{w^2} \right)\left( 2w^3\frac{du}{dw} + w^4\frac{d^2u}{dw^2} \right) + \frac{2}{w}\,w^2\frac{du}{dw} + l(l + 1)u = 0, \tag{2.62}$$
which simplifies to give
$$w^2(w^2 - 1)\frac{d^2u}{dw^2} + 2w^3\frac{du}{dw} + l(l + 1)u = 0. \tag{2.63}$$
Dividing through by $w^2(w^2 - 1)$ to put the equation into standard form, and
comparing with (2.55), we identify
$$P(w) = \frac{2w}{w^2 - 1}, \qquad Q(w) = \frac{l(l + 1)}{w^2(w^2 - 1)}. \tag{2.64}$$
At w = 0, P(w) is analytic but Q(w) diverges, and so the point $|z| \to \infty$ is a singular
point of Legendre's equation. However, since wP(w) and $w^2 Q(w)$
are both analytic at w = 0, $|z| \to \infty$ is a regular singular point.
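The same classification can be automated. The following sympy sketch checks, as a quick proxy for the analyticity criterion used above, that $(z - z_0)P(z)$ and $(z - z_0)^2 Q(z)$ have finite limits at $z_0 = \pm 1$ for Legendre's equation:

```python
import sympy as sp

z, l = sp.symbols("z l")
P = -2 * z / (1 - z**2)            # Eq. (2.59)
Q = l * (l + 1) / (1 - z**2)

for z0 in [1, -1]:
    s = sp.limit((z - z0) * P, z, z0)        # limit of (z - z0) P(z)
    t = sp.limit((z - z0)**2 * Q, z, z0)     # limit of (z - z0)^2 Q(z)
    print(f"z0 = {z0}:  (z - z0)P -> {s},  (z - z0)^2 Q -> {t}")
# Both limits are finite at z0 = +1 and z0 = -1, consistent with both points
# being regular singular points.
```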

1) SERIES SOLUTIONS ABOUT AN ORDINARY POINT.

If $z = z_0$ is an ordinary point of Eq. (2.55), then it can be shown that every
solution u(z) of the equation is also analytic at $z = z_0$. From now onwards we
will take $z_0$ as the origin, i.e., $z_0 = 0$; if this is not already the case, then
the substitution $Z = z - z_0$ will make it so. Since every solution is analytic, u(z)
can be represented by a power series of the form
$$u(z) = \sum_{n=0}^{\infty} a_n z^n. \tag{2.65}$$
Moreover, it can be shown that such a power series converges for |z| < R,
where R is the radius of convergence and is equal to the distance from z = 0 to
the nearest singular point of the ordinary differential equation. At the radius
of convergence, however, the series may or may not converge.
Since every solution of Eq. (2.55) is analytic at an ordinary point, it is always
possible to obtain two independent solutions of the form (2.65), from
which the general solution
$$u(z) = c_1 u_1(z) + c_2 u_2(z) \tag{2.66}$$
can be constructed.
Using Eq. (2.65),
$$\frac{du}{dz} = \sum_{n=0}^{\infty} n a_n z^{n-1} = \sum_{n=0}^{\infty} (n + 1)a_{n+1} z^n,
\qquad \frac{d^2u}{dz^2} = \sum_{n=0}^{\infty} n(n - 1)a_n z^{n-2} = \sum_{n=0}^{\infty} (n + 2)(n + 1)a_{n+2} z^n. \tag{2.67}$$
Substituting the above expressions into Eq. (2.55) and requiring that the coefficients
of each power of z sum to zero, we obtain a recurrence relation expressing
each $a_n$ in terms of the previous $a_r$, $0 \le r \le n - 1$. In some cases we
may find that the recurrence relation leads to $a_n = 0$ for all n greater than some
value N, for one or both of the two solutions $u_1(z)$ and $u_2(z)$. In such a case,
the series solution becomes a polynomial, and thus converges for all finite z.

EXAMPLE 2.5. Let's determine the series solution about z = 0 of the differential
equation
$$\frac{d^2 u}{dz^2} + u = 0. \tag{2.68}$$
First, by inspection z = 0 is an ordinary point of the equation, and so we may
obtain two independent solutions. Using Eqs. (2.65) and (2.67), we can write
Eq. (2.68) as
$$\sum_{n=0}^{\infty} \left[ (n + 2)(n + 1)a_{n+2} + a_n \right] z^n = 0. \tag{2.69}$$
For this equation to be satisfied we need the coefficient of each power of
z to vanish separately; thus,
$$(n + 2)(n + 1)a_{n+2} + a_n = 0 \quad \Longrightarrow \quad a_{n+2} = -\frac{a_n}{(n + 2)(n + 1)}, \quad n \ge 0. \tag{2.70}$$
Using this equation, for a given $a_0$ we can calculate the even coefficients, i.e.,
$a_2$, $a_4$, $a_6$, etc., while for a given $a_1$ we can determine the odd coefficients, i.e.,
$a_3$, $a_5$, $a_7$, etc. Two independent solutions can be obtained by setting either
$a_0$ or $a_1$ to zero and choosing the other coefficient equal to 1. For example, if
we set $a_0 = 0$ and choose $a_1 = 1$, all the even coefficients $a_{2n}$, $n = 0, 1, 2, \dots$, are
equal to zero. For the odd coefficients, using Eq. (2.70),
$$a_3 = -\frac{a_1}{3 \cdot 2} = -\frac{1}{3!}, \quad a_5 = -\frac{a_3}{5 \cdot 4} = \frac{a_1}{5 \cdot 4 \cdot 3 \cdot 2} = \frac{1}{5!}, \quad \dots, \quad a_{2n+1} = \frac{(-1)^n}{(2n + 1)!}, \quad n = 0, 1, 2, \dots \tag{2.71}$$
Then, we obtain the solution
$$u_1(z) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n + 1)!}\, z^{2n+1}. \tag{2.72}$$
Similarly, if we choose $a_0 = 1$ and $a_1 = 0$, all odd coefficients $a_{2n+1}$, $n = 0, 1, 2, \dots$,
are equal to zero, while for the even coefficients, using Eq. (2.70), we get
$$a_2 = -\frac{a_0}{2} = -\frac{1}{2!}, \quad a_4 = -\frac{a_2}{4 \cdot 3} = \frac{1}{4 \cdot 3 \cdot 2} = \frac{1}{4!}, \quad \dots, \quad a_{2n} = \frac{(-1)^n}{(2n)!}, \quad n = 0, 1, 2, \dots \tag{2.73}$$
In this way we obtain a second, independent, solution
$$u_2(z) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, z^{2n}. \tag{2.74}$$
Note that both series converge for all z, as might be expected since Eq. (2.68)
possesses no singular point, except $|z| \to \infty$. Interestingly, the series in Eqs. (2.72)
and (2.74) correspond to the series expansions of sin(z) and cos(z), respectively,
around z = 0. Then, we can write the solution of Eq. (2.68) as
$$u(z) = c_1 u_1(z) + c_2 u_2(z) = c_1 \sin(z) + c_2 \cos(z), \tag{2.75}$$
where $c_1$ and $c_2$ are arbitrary constants to be fixed by boundary conditions if
supplied. The linear independence of $u_1(z)$ and $u_2(z)$ is obvious, but you can
check it by calculating the Wronskian. Using Eq. (2.54),
$$W(z) = \sin(z)\left[ -\sin(z) \right] - \cos(z)\left[ \cos(z) \right] = -1. \tag{2.76}$$
Since W(z) ≠ 0, the two solutions are linearly independent.
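The recurrence (2.70) can also be iterated numerically; the short sketch below (with $a_0 = 0$, $a_1 = 1$, matching the first solution above) reproduces the Taylor coefficients of sin(z):

```python
from math import factorial

a = [0.0, 1.0]                    # a_0 = 0, a_1 = 1
for n in range(18):
    a.append(-a[n] / ((n + 2) * (n + 1)))   # Eq. (2.70)

for n in range(1, 10, 2):         # odd coefficients only; the even ones vanish
    exact = (-1) ** ((n - 1) // 2) / factorial(n)   # coefficient of z**n in sin(z)
    print(f"a_{n} = {a[n]: .3e}   sin-series: {exact: .3e}")
```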

Solving the above example was quite straightforward and the resulting series
were easily recognized and written in closed form, i.e., in terms of elementary
functions. But this is not usually the case. Another simplifying feature of
the previous example was that we obtained a two-term recurrence relation
relating $a_{n+2}$ and $a_n$, so that the odd- and even-numbered coefficients were
independent of one another. This is also not usually the case and, in general,
the recurrence relation expresses $a_n$ in terms of any number of the previous
$a_r$, $0 \le r \le n - 1$.

EXAMPLE 2.6. Find the series solutions about z = 0 of
$$\frac{d^2 u}{dz^2} - \frac{2}{(1 - z)^2}\,u = 0. \tag{2.77}$$
First, z = 0 is an ordinary point of the differential equation, and, therefore, we
can find two independent solutions. Using Eqs. (2.65) and (2.67) and multiplying
through by $(1 - z)^2$, we can write the above equation as
$$(1 - 2z + z^2)\sum_{n=0}^{\infty} n(n - 1)a_n z^{n-2} - 2\sum_{n=0}^{\infty} a_n z^n = 0. \tag{2.78}$$
In this way, we have
$$\sum_{n=0}^{\infty} n(n - 1)a_n z^{n-2} - 2\sum_{n=0}^{\infty} n(n - 1)a_n z^{n-1} + \sum_{n=0}^{\infty} n(n - 1)a_n z^n - 2\sum_{n=0}^{\infty} a_n z^n = 0. \tag{2.79}$$
The above equation can also be written as
$$\sum_{n=0}^{\infty} (n + 2)(n + 1)a_{n+2} z^n - 2\sum_{n=0}^{\infty} (n + 1)n\, a_{n+1} z^n + \sum_{n=0}^{\infty} n(n - 1)a_n z^n - 2\sum_{n=0}^{\infty} a_n z^n = 0. \tag{2.80}$$

Then,
· ¸
X

(n + 2)(n + 1)a n+2 − 2(n + 1)na n+1 + (n(n − 1) − 2)a n z n = 0. (2.81)
n=0

Since n(n − 1) − 2 = n 2 − n − 2 = (n + 1)(n − 2), we have


· ¸
X∞
(n + 1) (n + 2)a n+2 − 2na n+1 + (n − 2)a n z n = 0. (2.82)
n=0

Then, we need that

(n + 2)a n+2 − 2na n+1 + (n − 2)a n = 0, n ≥ 0. (2.83)

In this way, given a 0 and a 1 , we can get any other coefficient a n . From Eq. (2.118),
one obvious solution is, for example, a n = a 0 for all n , since we will get

[(n + 2) − 2n + (n − 2)]a 0 = 0, (2.84)

which is satisfied for any value of n. Choosing then a 0 = 1, we find the solution
X
∞ X

u 1 (z) = an z n = zn = 1 + z + z2 + z3 + . . . (2.85)
n=0 n=0

The above series is a geometric series and can be summed, obtaining


1
u 1 (z) = . (2.86)
1−z
This solution is singular at z = 1, which is the singular point of Eq. (2.77) near-
est to z = 0.
Note that from Eq. (2.118)

n=0: 2a 2 − 2a 0 = 0,
n=1: 3a 3 − 2a 2 − a 1 = 0,
n=2: 4a 4 − 4a 3 = 0,
n=3: 5a 5 − 6a 4 + a 3 = 0,
..
. (2.87)

so, another obvious solution to the recurrence relation in Eq. (2.118) is to


choose a 2 = a 0 , a 1 = −2a 2 = −2a 0 , and a n = 0 for n > 2. Choosing a 0 = 1,
we get the solution

u 2 (z) = 1 − 2z + z 2 = (1 − z)2 . (2.88)



Note that this solution is valid for all finite values of z. In this way, the solution
of Eq. (2.77) is given by
c1
u(z) = c 1 u 1 (z) + c 2 u 2 (z) = + c 2 (1 − z)2 . (2.89)
1−z
The linear independence of u 1 and u 2 is obvious but can be checked by com-
puting the Wronskian. Using Eq. (2.54) (and changing x to z),
1 1
W (z) = [−2(1 − z)] − (1 − z)2 = −3 6= 0, (2.90)
1−z (1 − z)2

thus, u 1 (z) and u 2 (z) are linearly independent.


Alternatively, we could have obtained two linear independent solutions
by choosing a 0 = 1, a 1 = 0 and a 0 = 0, a 1 = 1. In this case, we would have
obtained
(1 − z)2 4 1 (1 − z)2 1
u 1 (z) = − , u 2 (z) = − . (2.91)
3 3 1−z 3 3(1 − z)
You can easily check that u 1 (z) and u 2 (z) satisfy Eq. (2.77). The linear inde-
pendence of the two solutions can also be checked by calculating the corre-
sponding Wronskian, obtaining the result W (z) = 1. Then, we have as solution
to Eq. (2.77)

u(z) = d 1 u 1 (z) + d 2 u 2 (z)


· ¸ · ¸
(1 − z)2 4 1 (1 − z)2 1
= d1 − + d2 − . (2.92)
3 3 1−z 3 3(1 − z)
Now, let's consider some boundary conditions. For example,
$$u(0) = 0 \quad \text{and} \quad u(1/2) = 1/2. \tag{2.93}$$
Using Eqs. (2.89) and (2.93), it is easy to see that
$$c_1 = \frac{2}{7}, \qquad c_2 = -\frac{2}{7}. \tag{2.94}$$
Similarly, from Eqs. (2.92) and (2.93), we get
$$d_1 = 0, \qquad d_2 = \frac{6}{7}. \tag{2.95}$$
If we now replace the values obtained for $c_1$, $c_2$, $d_1$ and $d_2$, equations (2.89)
and (2.92) produce the same answer, which is
$$u(z) = \frac{2z(3 - 3z + z^2)}{7(1 - z)}, \tag{2.96}$$
as expected.
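As a consistency check (illustrative, using sympy), one can verify directly that Eq. (2.96) satisfies both the differential equation (2.77) and the boundary conditions (2.93):

```python
import sympy as sp

z = sp.Symbol("z")
u = 2 * z * (3 - 3 * z + z**2) / (7 * (1 - z))        # Eq. (2.96)

residual = sp.diff(u, z, 2) - 2 * u / (1 - z)**2      # left-hand side of Eq. (2.77)
print(sp.simplify(residual))                          # 0: the ODE is satisfied
print(u.subs(z, 0), u.subs(z, sp.Rational(1, 2)))     # 0 and 1/2: the BCs hold
```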

2) SERIES SOLUTIONS ABOUT A REGULAR SINGULAR POINT.

There are several second-order linear ordinary differential equations in physics
and engineering having regular singular points in the finite complex plane.
For this reason, we must extend our previous discussion to obtaining series
solutions to ordinary differential equations about such points. In what follows,
as done in the previous section, we continue assuming that the regular
singular point about which the solution is required is at z = 0. If not, the substitution
$Z = z - z_0$ will make it so.
If z = 0 is a regular singular point of Eq. (2.52), then at least one of P(z)
and Q(z) is not analytic at z = 0, and in general we should not expect to find a
power series solution of the form (2.65). We must therefore extend the method
to include a more general form for the solution. In fact, there is a theorem,
called Fuchs's theorem, which guarantees the existence of at least one solution
to Eq. (2.52) of the form
$$u(z) = z^{\sigma}\sum_{n=0}^{\infty} a_n z^n, \tag{2.97}$$
where the exponent σ is a number that may be real or complex and where
$a_0 \ne 0$ (since, if it were otherwise, σ could be redefined as σ + 1 or σ + 2, etc.,
so as to make $a_0 \ne 0$). Such a series is called a generalized power series or
Frobenius series. As in the case of a simple power series solution, the radius
of convergence of the Frobenius series is, in general, equal to the distance to
the nearest singularity of the ordinary differential equation.
Let's define
$$S(z) \equiv zP(z), \qquad T(z) \equiv z^2 Q(z), \tag{2.98}$$
such that in terms of S(z) and T(z), Eq. (2.52) can be written as
$$\frac{d^2 u}{dz^2} + \frac{S(z)}{z}\frac{du}{dz} + \frac{T(z)}{z^2}\,u = 0. \tag{2.99}$$
Substituting now Eq. (2.97) in (2.99), since
$$\frac{du}{dz} = \sum_{n=0}^{\infty} (n + \sigma)a_n z^{n+\sigma-1}, \qquad \frac{d^2u}{dz^2} = \sum_{n=0}^{\infty} (n + \sigma)(n + \sigma - 1)a_n z^{n+\sigma-2}, \tag{2.100}$$
we get
$$\sum_{n=0}^{\infty} (n + \sigma)(n + \sigma - 1)a_n z^{n+\sigma-2} + S(z)\sum_{n=0}^{\infty} (n + \sigma)a_n z^{n+\sigma-2} + T(z)\sum_{n=0}^{\infty} a_n z^{n+\sigma-2} = 0. \tag{2.101}$$

Dividing this equation through by $z^{\sigma-2}$, we obtain
$$\sum_{n=0}^{\infty} \left[ (n + \sigma)(n + \sigma - 1) + S(z)(n + \sigma) + T(z) \right] a_n z^n = 0, \tag{2.102}$$
which is valid for any z. Setting z = 0, all terms in the sum with n > 0 vanish
and we get
$$\left[ \sigma(\sigma - 1) + S(0)\sigma + T(0) \right]a_0 = 0. \tag{2.103}$$
Since we require $a_0 \ne 0$, we obtain
$$\sigma(\sigma - 1) + S(0)\sigma + T(0) = 0. \tag{2.104}$$

Equation (2.104) is called the indicial equation. This equation is a quadratic in σ


and, in general, has two roots, σ1 and σ2 , which are called the indices of the
regular singular point. By substituting each of these roots into Eq. (2.102) in
turn and requiring that the coefficients of each power of z vanish separately,
we obtain a recurrence relation for each root expressing each a n as a function
of the previous a r , 0 ≤ r ≤ n − 1. We shall see that the larger root of the in-
dicial equation always yields a solution to the ordinary differential equation
in the form of a Frobenius series. The form of the second solution depends,
however, on the relationship between the two indices σ1 and σ2 . There are
three possible general cases: (a) distinct roots not differing by an integer; (b)
distinct roots differing by an integer (not equal to zero); (c) repeated roots.

i) DISTINCT ROOTS NOT DIFFERING BY AN INTEGER. If the roots of the indicial
equation, $\sigma_1$ and $\sigma_2$, differ by an amount that is not an integer, then the
recurrence relations corresponding to each root lead to two linearly independent
solutions of the ordinary differential equation, with both solutions taking
the form of a Frobenius series:
$$u_1(z) = z^{\sigma_1}\sum_{n=0}^{\infty} a_n z^n, \qquad u_2(z) = z^{\sigma_2}\sum_{n=0}^{\infty} b_n z^n. \tag{2.105}$$
The linear independence of these two solutions follows from the fact that
$u_2/u_1$ is not a constant, since $\sigma_2 - \sigma_1$ is not an integer. Then, the general solution
is given by
$$u(z) = c_1 u_1(z) + c_2 u_2(z). \tag{2.106}$$
Note that $\sigma_1$ and $\sigma_2$ can be complex numbers with $\sigma_2 = \sigma_1^*$. In such a case,
$\sigma_1 - \sigma_2 = \sigma_1 - \sigma_1^* = 2i\,\mathrm{Im}[\sigma_1]$, which is purely imaginary; thus, $\sigma_1 - \sigma_2$ cannot
be equal to a (non-zero) integer, as required.

EXAMPLE 2.7. Find the power series solutions around z = 0 of
$$4z\frac{d^2 u}{dz^2} + 2\frac{du}{dz} + u = 0. \tag{2.107}$$
The above equation can be written as
$$\frac{d^2 u}{dz^2} + \frac{1}{2z}\frac{du}{dz} + \frac{1}{4z}\,u = 0. \tag{2.108}$$
If we compare the latter equation with (2.55), we can identify P(z) and Q(z),
$$P(z) = \frac{1}{2z}, \qquad Q(z) = \frac{1}{4z}. \tag{2.109}$$
Clearly, z = 0 is a singular point of the differential equation, but since
$$S(z) = zP(z) = \frac{1}{2}, \qquad T(z) = z^2 Q(z) = \frac{z}{4} \tag{2.110}$$
are finite at z = 0, z = 0 is a regular singular point. We then use a Frobenius
series as in Eq. (2.97) to determine u(z). From Eqs. (2.104) and (2.110),
$$\sigma(\sigma - 1) + \frac{1}{2}\sigma = 0 \quad \Longrightarrow \quad \sigma\left( \sigma - \frac{1}{2} \right) = 0, \tag{2.111}$$
which has roots $\sigma_1 = 1/2$ and $\sigma_2 = 0$. Since these roots do not differ by an
integer, we expect to find two independent solutions to Eq. (2.108). From
Eq. (2.102),
$$\sum_{n=0}^{\infty} \left[ (n + \sigma)(n + \sigma - 1) + \frac{1}{2}(n + \sigma) + \frac{z}{4} \right] a_n z^n = 0, \tag{2.112}$$
which can be rewritten as
$$\sum_{n=0}^{\infty} \left[ \left\{ (n + \sigma)(n + \sigma - 1) + \frac{1}{2}(n + \sigma) \right\} a_n + \frac{1}{4}a_{n-1} \right] z^n = 0, \tag{2.113}$$
with the convention $a_{-1} = 0$. Thus, we have that
$$\left\{ (n + \sigma)(n + \sigma - 1) + \frac{1}{2}(n + \sigma) \right\} a_n + \frac{1}{4}a_{n-1} = 0. \tag{2.114}$$
If we choose the root $\sigma_1 = 1/2$, Eq. (2.114) becomes
$$(4n^2 + 2n)a_n + a_{n-1} = 0 \quad \Longrightarrow \quad a_n = -\frac{a_{n-1}}{2n(2n + 1)}. \tag{2.115}$$

Setting $a_0 = 1$ and using (2.115), we have
$$a_1 = -\frac{a_0}{3 \cdot 2} = -\frac{1}{3!}, \quad a_2 = -\frac{a_1}{5 \cdot 4} = \frac{1}{5!}, \quad \dots, \quad a_n = \frac{(-1)^n}{(2n + 1)!}, \quad n = 0, 1, 2, \dots \tag{2.116}$$
In this way, using Eq. (2.97), we have
$$u_1(z) = \sqrt{z}\sum_{n=0}^{\infty} \frac{(-1)^n}{(2n + 1)!}\,z^n = \sqrt{z} - \frac{(\sqrt{z})^3}{3!} + \frac{(\sqrt{z})^5}{5!} - \dots = \sin(\sqrt{z}). \tag{2.117}$$
If we choose the root $\sigma_2$, Eq. (2.114) gives
$$(4n^2 - 2n)a_n + a_{n-1} = 0 \quad \Longrightarrow \quad a_n = -\frac{a_{n-1}}{2n(2n - 1)}. \tag{2.118}$$
Setting $a_0 = 1$ and using (2.118), we get
$$a_1 = -\frac{a_0}{2 \cdot 1} = -\frac{1}{2!}, \quad a_2 = -\frac{a_1}{4 \cdot 3} = \frac{1}{4!}, \quad \dots, \quad a_n = \frac{(-1)^n}{(2n)!}, \quad n = 0, 1, 2, \dots \tag{2.119}$$
Then, using Eq. (2.97), we obtain
$$u_2(z) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\,z^n = 1 - \frac{z}{2!} + \frac{z^2}{4!} - \dots
= 1 - \frac{(\sqrt{z})^2}{2!} + \frac{(\sqrt{z})^4}{4!} - \dots = \cos(\sqrt{z}). \tag{2.120}$$
The linear independence of Eqs. (2.117) and (2.120) can be checked by calculating
the Wronskian for these solutions. Using Eq. (2.54),
$$W(z) = \sin(\sqrt{z})\left[ -\frac{1}{2\sqrt{z}}\sin(\sqrt{z}) \right] - \cos(\sqrt{z})\left[ \frac{1}{2\sqrt{z}}\cos(\sqrt{z}) \right] = -\frac{1}{2\sqrt{z}} \ne 0. \tag{2.121}$$
Hence, the general solution to Eq. (2.108) is given by
$$u(z) = c_1 u_1(z) + c_2 u_2(z) = c_1 \sin(\sqrt{z}) + c_2 \cos(\sqrt{z}). \tag{2.122}$$
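The two Frobenius series can be summed numerically and compared with their closed forms; in the sketch below the truncation order and the test point z = 2 are arbitrary choices:

```python
import math

def frobenius(z, sigma, n_terms=30):
    # Recurrences (2.115) and (2.118): a_n = -a_{n-1}/(2n(2n+1)) for sigma = 1/2,
    # and a_n = -a_{n-1}/(2n(2n-1)) for sigma = 0, starting from a_0 = 1.
    total, a = 0.0, 1.0
    for n in range(n_terms):
        total += a * z**n
        m = n + 1
        a = -a / (2 * m * (2 * m + (1 if sigma == 0.5 else -1)))
    return z**sigma * total

z = 2.0
print(frobenius(z, 0.5), math.sin(math.sqrt(z)))   # both approx. 0.98777
print(frobenius(z, 0.0), math.cos(math.sqrt(z)))   # both approx. 0.15594
```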

ii) DISTINCT ROOTS DIFFERING BY AN INTEGER. Whatever the roots of the
indicial equation, the recurrence relation corresponding to the larger of the
two always leads to a solution of the ordinary differential equation. However,
if the roots of the indicial equation differ by an integer, then the recurrence relation
corresponding to the smaller root may or may not lead to a second linearly
independent solution, depending on the ordinary differential equation
under consideration. Note that for complex roots of the indicial equation, the
larger root is taken to be the one with the larger real part.
In order to construct the general solution to the ordinary differential equation,
however, we require two linearly independent solutions $u_1$ and $u_2$. If we
take $u_1$ to be the solution obtained from the larger root $\sigma_1$, we need a
method to find a solution $u_2$ which is linearly independent of $u_1$. A way of
doing this is by using the Wronskian of $u_1$ and $u_2$: If $u_1$ and $u_2$ are two linearly
independent solutions of the equation (2.55), then the Wronskian of these two
solutions is given by [see Eq. (2.54)]
$$W(z) = u_1(z)\frac{du_2}{dz} - u_2(z)\frac{du_1}{dz}. \tag{2.123}$$
Dividing this expression by $u_1^2(z)$, we obtain
$$\frac{W}{u_1^2} = \frac{1}{u_1}\frac{du_2}{dz} - \frac{u_2}{u_1^2}\frac{du_1}{dz} = \frac{d}{dz}\left( \frac{u_2}{u_1} \right). \tag{2.124}$$
Integrating this equation, we can then write
$$u_2(z) = u_1(z)\int^{z} du\, \frac{W(u)}{u_1^2(u)}. \tag{2.125}$$
Interestingly, from Eq. (2.123),
$$\frac{dW}{dz} = \frac{du_1}{dz}\frac{du_2}{dz} + u_1\frac{d^2u_2}{dz^2} - \frac{du_2}{dz}\frac{du_1}{dz} - u_2\frac{d^2u_1}{dz^2} = u_1\frac{d^2u_2}{dz^2} - u_2\frac{d^2u_1}{dz^2}. \tag{2.126}$$
Since both $u_1$ and $u_2$ satisfy Eq. (2.55), we can substitute for $d^2u_1/dz^2$ and $d^2u_2/dz^2$
in terms of $du_1/dz$ and $du_2/dz$, respectively. In this way, we get
$$\frac{dW}{dz} = -u_1\left[ P(z)\frac{du_2}{dz} + Q(z)u_2 \right] + u_2\left[ P(z)\frac{du_1}{dz} + Q(z)u_1 \right]
= -P(z)\left[ u_1\frac{du_2}{dz} - u_2\frac{du_1}{dz} \right] = -P(z)W(z). \tag{2.127}$$
In this way,

\[ \frac{dW}{W} = -P(z)\,dz \;\Longrightarrow\; \ln\left( \frac{W}{W_0} \right) = -\int dz\, P(z) \;\Longrightarrow\; W(z) = W_0\, e^{-\int dz\, P(z)}, \]   (2.128)

where W0 is an arbitrary integration constant. Using Eq. (2.128), we can write Eq. (2.125) as

\[ u_2(z) = u_1(z) \int^{z} \frac{1}{u_1^2(u)}\, e^{-\int^{u} dv\, P(v)}\, du, \]   (2.129)

where we have chosen W0 = 1, since it can be reabsorbed in the arbitrary constant multiplying u2(z) when constructing the general solution u(z). Note that we do not specify any lower limits in the integrals appearing in Eq. (2.129), since, if specified, they simply contribute a constant times the known first solution u1(z), and hence add nothing new. In this way, given u1(z), we can in principle compute u2(z) through Eq. (2.129).
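Equation (2.129) also works as a numerical recipe. The sketch below (our own construction, using scipy quadrature) builds u2 from a known u1 and P(z), taking u1(z) = sin(√z) and P(z) = 1/(2z) from the previous example. With both lower limits fixed at z0 = 1, the exact result is u2(z) = −2cos(√z) + 2cot(1) sin(√z), i.e., a second solution shifted by a multiple of u1, exactly as discussed above.

```python
import numpy as np
from scipy.integrate import quad

def second_solution(u1, P, z, z0=1.0):
    # Eq. (2.129) with both lower limits set to z0:
    # u2(z) = u1(z) * int_{z0}^{z} exp(-int_{z0}^{u} P(v) dv) / u1(u)^2 du
    def integrand(u):
        I, _ = quad(P, z0, u)
        return np.exp(-I) / u1(u)**2
    J, _ = quad(integrand, z0, z)
    return u1(z) * J

u1 = lambda z: np.sin(np.sqrt(z))   # first solution from the example above
P  = lambda z: 1.0 / (2.0 * z)      # P(z) of the same equation

for z in (0.5, 2.0, 4.0):
    exact = -2*np.cos(np.sqrt(z)) + 2*(np.cos(1)/np.sin(1))*np.sin(np.sqrt(z))
    print(z, second_solution(u1, P, z) - exact)   # differences ~1e-10 or smaller
```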
One particular case is worth mentioning: if the point about which the so-
lution is required, i.e., z = 0, is in fact an ordinary point of the ordinary dif-
ferential equation rather than a regular singular point, then substitution of
the Frobenius series (2.97) leads to an indicial equation with roots σ = 0 and
σ = 1. Although these roots differ by an integer (unity), the recurrence re-
lations corresponding to the two roots yield two linearly independent power
series solutions (one for each root).

EXAMPLE 2.8. Find the power series solution about z = 0 of

\[ z(z-1)\frac{d^2 u}{dz^2} + 3z\frac{du}{dz} + u = 0. \]   (2.130)

By writing the above equation as

\[ \frac{d^2 u}{dz^2} + \frac{3}{z-1}\frac{du}{dz} + \frac{1}{z(z-1)}\, u = 0 \]   (2.131)

and comparing with Eq. (2.55), we can identify P(z) and Q(z),

\[ P(z) = \frac{3}{z-1}, \qquad Q(z) = \frac{1}{z(z-1)}. \]   (2.132)

As we can see, z = 0 is a singular point, but since

\[ S(z) = zP(z) = \frac{3z}{z-1}, \qquad T(z) = z^2 Q(z) = \frac{z}{z-1}, \]   (2.133)
are finite there, it is a regular singular point and we expect to find at least one solution in the form of a Frobenius series. Using Eq. (2.104), since S(0) = T(0) = 0, we get

\[ \sigma(\sigma - 1) = 0, \]   (2.134)

thus, we have the roots σ1 = 1 and σ2 = 0. Since the roots differ by an integer (unity), it may not be possible to find two linearly independent solutions of Eq. (2.131) in the form of Frobenius series. We are guaranteed, however, to find one such solution corresponding to the larger root, σ1. Using Eq. (2.102) for σ = σ1 and (2.133), we get

\[ \sum_{n=0}^{\infty} \left[ (n+1)n + \frac{3z}{z-1}(n+1) + \frac{z}{z-1} \right] a_n z^n = 0. \]   (2.135)

Multiplying the above equation through by z − 1, we have

\[ \sum_{n=0}^{\infty} \left[ (n+1)n(z-1) + 3z(n+1) + z \right] a_n z^n = 0 \;\Longrightarrow\; \sum_{n=0}^{\infty} \left[ \{(n+1)(n+3)+1\}\, z - n(n+1) \right] a_n z^n = 0, \]   (2.136)

which can be rewritten as

\[ \sum_{n=0}^{\infty} \left[ \{n(n+2)+1\}\, a_{n-1} - n(n+1)\, a_n \right] z^n = 0 \;\Longrightarrow\; \sum_{n=0}^{\infty} (n+1)\left[ (n+1)\, a_{n-1} - n\, a_n \right] z^n = 0. \]   (2.137)
n=0

Since the above equation is valid for an arbitrary z, we need that

\[ (n+1)\, a_{n-1} - n\, a_n = 0 \;\Longrightarrow\; a_n = \frac{n+1}{n}\, a_{n-1}. \]   (2.138)

Setting a0 = 1, we get

\[ a_1 = 2a_0 = 2, \quad a_2 = \frac{3}{2}a_1 = 3, \quad a_3 = \frac{4}{3}a_2 = 4, \quad \dots, \quad a_n = n+1, \quad n = 0, 1, 2, \dots \]   (2.139)

Using Eq. (2.97), one of the solutions for Eq. (2.131) is then

\[ u_1(z) = z \sum_{n=0}^{\infty} (n+1)\, z^n = z(1 + 2z + 3z^2 + \dots). \]   (2.140)

The term between the brackets is an arithmetic-geometric series which can be summed, and we obtain

\[ u_1(z) = \frac{z}{(1-z)^2}. \]   (2.141)
Let us now try to find a second solution corresponding to the smaller root of the indicial equation, i.e., σ2, by setting σ = σ2 in Eq. (2.102). We then have

\[ \sum_{n=0}^{\infty} \left[ n(n-1) + \frac{3z}{z-1}\, n + \frac{z}{z-1} \right] a_n z^n = 0. \]   (2.142)

Multiplying the above equation through by z − 1, we get

\[ \sum_{n=0}^{\infty} \left[ n(n-1)(z-1) + 3nz + z \right] a_n z^n = 0 \;\Longrightarrow\; \sum_{n=0}^{\infty} \left[ \{n(n+2)+1\}\, z - n(n-1) \right] a_n z^n = 0, \]   (2.143)

which can be rewritten as

\[ \sum_{n=0}^{\infty} \left[ \{(n-1)(n+1)+1\}\, a_{n-1} - n(n-1)\, a_n \right] z^n = 0 \;\Longrightarrow\; \sum_{n=0}^{\infty} n\left[ n\, a_{n-1} - (n-1)\, a_n \right] z^n = 0. \]   (2.144)

In this way, since the above equation is valid for an arbitrary value of z, we get that

\[ a_n = \frac{n}{n-1}\, a_{n-1}. \]   (2.145)
Since we require a0 ≠ 0, we see that the above expression produces an a1 which is infinite; thus, within this method, we cannot obtain a second solution u2 related to the root σ2. We therefore consider the Wronskian method to determine u2(z). Using Eq. (2.129) and substituting u1(z) and P(z),

\[ u_2(z) = \frac{z}{(1-z)^2} \int^{z} du\, \frac{(1-u)^4}{u^2}\, e^{-\int^{u} dv\, \frac{3}{v-1}} = \frac{z}{(1-z)^2} \int^{z} du\, \frac{(1-u)^4}{u^2}\, e^{-3\ln(u-1)} = \frac{z}{(1-z)^2} \int^{z} du\, \frac{u-1}{u^2} = \frac{z}{(1-z)^2}\left[ \ln(z) + \frac{1}{z} \right]. \]   (2.146)

We can now calculate the Wronskian of u1(z) and u2(z) to show, as expected, that the two solutions are linearly independent. In fact, the Wronskian has already been evaluated as W(u) = e^{-3ln(u-1)} = (u-1)^{-3}, i.e., W(z) = 1/(z-1)^3 ≠ 0. In this way, the general solution to Eq. (2.131) is given by

\[ u(z) = c_1 u_1(z) + c_2 u_2(z) = c_1\, \frac{z}{(1-z)^2} + c_2\, \frac{z}{(1-z)^2}\left[ \ln(z) + \frac{1}{z} \right]. \]   (2.147)

iii) REPEATED ROOT OF THE INDICIAL EQUATION. If the indicial equation has a repeated root, i.e., σ1 = σ2 ≡ σ, then obviously only one solution in the form of a Frobenius series (2.97) may be found as described above,

\[ u_1(z) = z^{\sigma} \sum_{n=0}^{\infty} a_n z^n. \]   (2.148)

We need then to find another linearly independent solution u 2 (z) to con-


struct the general solution to Eq. (2.55). To do this, we can use the Wronskian
method explained above.

2.4 SEPARATION OF VARIABLES

2.4.1 CARTESIAN COORDINATES
Suppose we seek a solution u(x, y, z, t ) to some partial differential equation
expressed in Cartesian coordinates. Let us attempt to obtain one that has the
product form

u(x, y, z, t ) = X (x)Y (y)Z (z)T (t ). (2.149)

A solution which has this form is said to be separable in x, y, z and t , and


seeking solutions of this form is called the method of separation of variables.
For a general partial differential equation it is likely that a separable solution
is impossible, but certainly some common and important equations, such as
the ones we listed earlier, do have useful solutions of this form, and we are going
to illustrate the method of solution by studying the three-dimensional wave
equation

\[ \nabla^2 u = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}. \]   (2.150)
For the present, we are going to work in Cartesian coordinates and assume a solution of the form (2.149); later on we will work in other coordinate systems, like spherical or cylindrical. In Cartesian coordinates, Eq. (2.150) takes the form

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} + \frac{\partial^2 u}{\partial z^2} = \frac{1}{c^2}\frac{\partial^2 u}{\partial t^2}. \]   (2.151)

Substituting Eq. (2.149), we get

\[ YZT\frac{d^2 X}{dx^2} + XZT\frac{d^2 Y}{dy^2} + XYT\frac{d^2 Z}{dz^2} = \frac{1}{c^2}XYZ\frac{d^2 T}{dt^2}. \]   (2.152)

If we now divide the above equation throughout by u = XYZT, we obtain

\[ \frac{1}{X}\frac{d^2 X}{dx^2} + \frac{1}{Y}\frac{d^2 Y}{dy^2} + \frac{1}{Z}\frac{d^2 Z}{dz^2} = \frac{1}{c^2}\frac{1}{T}\frac{d^2 T}{dt^2}. \]   (2.153)
In this way, of the four terms in the equation, the first one is a function of x
only, the second of y only, the third of z only and the right-hand side a func-
tion of t only, and yet there is an equation connecting them. This can only
be so for all x, y, z and t if each of the terms does not in fact depend upon
the corresponding independent variable but is equal to a constant such that
Eq. (2.153) is satisfied. Let us make the choice,
\[ \frac{1}{X}\frac{d^2 X}{dx^2} = -l^2, \quad \frac{1}{Y}\frac{d^2 Y}{dy^2} = -m^2, \quad \frac{1}{Z}\frac{d^2 Z}{dz^2} = -n^2, \quad \frac{1}{c^2}\frac{1}{T}\frac{d^2 T}{dt^2} = -\mu^2. \]   (2.154)

Then, from Eq. (2.153), the relation between the four constants l, m, n and µ is given by

\[ \mu^2 = l^2 + m^2 + n^2. \]   (2.155)
These constants are called separation constants. The important point to no-
tice is that by assuming a separable solution, the partial differential equation
(2.151), which contains derivatives with respect to the four independent vari-
ables all in one equation, has been reduced to four separate ordinary differen-
tial equations (2.154), which are connected through four constant parameters
that satisfy the algebraic equation (2.155).
The general methods for solving ordinary differential equations (see Sec. 2.3.1)
show that the solutions of equations (2.154) are given by
\[ X(x) = A e^{ilx} + B e^{-ilx} = A'\cos(lx) + B'\sin(lx), \]
\[ Y(y) = C e^{imy} + D e^{-imy} = C'\cos(my) + D'\sin(my), \]
\[ Z(z) = E e^{inz} + F e^{-inz} = E'\cos(nz) + F'\sin(nz), \]
\[ T(t) = G e^{ic\mu t} + H e^{-ic\mu t} = G'\cos(c\mu t) + H'\sin(c\mu t), \]   (2.156)
where A, B, ..., H (A', B', ..., H') are constants, which may be determined if boundary conditions are imposed on the solution. Depending on the geometry of the problem and the boundary conditions, it can be more appropriate to use the exponential form or the cosine and sine forms for the solutions.
As an example, suppose that we take as particular solutions the four functions

\[ X(x) = e^{ilx}, \quad Y(y) = e^{imy}, \quad Z(z) = e^{inz}, \quad T(t) = e^{-ic\mu t}. \]   (2.157)

This gives the solution of Eq. (2.151)

\[ u(x, y, z, t) = e^{ilx} e^{imy} e^{inz} e^{-ic\mu t} = e^{i(lx + my + nz - c\mu t)} \equiv A e^{i(\mathbf{K}\cdot\mathbf{r} - \omega t)}, \]   (2.158)

which represents a plane wave of amplitude A = 1, angular frequency ω = cµ, propagating in the direction given by the wave-number vector K = l i + m j + n k.

EXAMPLE 2.9. Use the method of separation of variables to obtain, for the one-dimensional diffusion equation

\[ \kappa\frac{\partial^2 u}{\partial x^2} = \frac{\partial u}{\partial t}, \]   (2.159)

a solution that tends to zero as t → ∞ for all x.
First, in this case, we have only two independent variables x and t , thus,
we assume a solution of the form

u(x, t ) = X (x)T (t ). (2.160)

Substituting this expression into Eq. (2.159) and dividing through by u = X T


(and also by κ), we obtain

\[ \frac{1}{X}\frac{d^2 X}{dx^2} = \frac{1}{\kappa}\frac{1}{T}\frac{dT}{dt}. \]   (2.161)

Since the left-hand side is a function of x only, and the right-hand side is a function of t only, Eq. (2.161) implies that each side must equal a constant. For convenience, we choose

\[ \frac{1}{X}\frac{d^2 X}{dx^2} = -\lambda^2, \qquad \frac{1}{\kappa}\frac{1}{T}\frac{dT}{dt} = -\mu^2, \]   (2.162)

and, from Eq. (2.161), we have the relation

\[ \mu^2 = \lambda^2 \]   (2.163)
between the constants µ and λ. The solution of Eq. (2.162) is straightforward,

\[ X(x) = A\cos(\lambda x) + B\sin(\lambda x), \qquad T(t) = C e^{-\lambda^2\kappa t}, \]   (2.164)

where A, B, and C are arbitrary constants. In this way,

\[ u(x, t) = X(x)T(t) = \left[ A\cos(\lambda x) + B\sin(\lambda x) \right] e^{-\lambda^2\kappa t}, \]   (2.165)

where we have reabsorbed the constant C in A and B. To satisfy the boundary condition u → 0 as t → ∞, we need λ²κ > 0; since κ > 0, this requires λ² > 0, i.e., a real, nonzero λ.
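The separable solution (2.165) is easy to verify symbolically; a one-line sympy check (our addition):

```python
import sympy as sp

x, t, lam, kap, A, B = sp.symbols('x t lam kap A B', positive=True)
u = (A*sp.cos(lam*x) + B*sp.sin(lam*x)) * sp.exp(-lam**2*kap*t)
print(sp.simplify(kap*sp.diff(u, x, 2) - sp.diff(u, t)))   # 0, so (2.159) holds
```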

2.4.2 SUPERPOSITION OF SEPARATED SOLUTIONS


As you have probably noticed, there is considerable freedom in the values
of the separation constants, the only essential requirement being the equa-
tion which relates them. This is a general feature for solutions in separated
form, which, if the original partial differential equation has n independent
variables, will contain n − 1 independent separation constants. If the original
partial differential equation is linear (as are the Laplace, Schrödinger, diffu-
sion and wave equations) then mathematically acceptable solutions can be
formed by superposing solutions corresponding to different allowed values of
the separation constants. Let’s take a two-variable example, where we thus
have one separation constant, which we denote as λ. If
\[ u_{\lambda_1}(x, y) = X_{\lambda_1}(x) Y_{\lambda_1}(y) \]   (2.166)

is a solution of a linear partial differential equation obtained by giving the separation constant the value λ1, then the superposition

\[ u(x, y) = a_1 u_{\lambda_1}(x, y) + a_2 u_{\lambda_2}(x, y) + \dots = \sum_{i} a_i X_{\lambda_i}(x) Y_{\lambda_i}(y) \]   (2.167)

is also a solution for any constants a i , provided that the λi are the allowed
values of the separation constant λ given the imposed boundary conditions.
Note that if the boundary conditions allow any of the separation constants to
be zero, then the form of the general solution is normally different and must
be deduced by returning to the separated ordinary differential equations.
The value of the superposition approach is that a boundary condition, say
that u(x, y) takes a particular form f (x) when y = 0, might be met by choosing
the constants a i such that
\[ f(x) = \sum_{i} a_i X_{\lambda_i}(x) Y_{\lambda_i}(0). \]   (2.168)

In general, this will be possible provided that the functions X λi (x) form a com-
plete set (as do the sinusoidal functions of Fourier series).
EXAMPLE 2.10. A semi-infinite rectangular metal plate occupies the region 0 ≤ x < ∞ and 0 ≤ y ≤ b in the xy-plane (see Fig. 2.1). The temperature at the far end of the plate and along its two long sides is fixed at 0°C. If the temperature of the plate at x = 0 is also fixed and is given by f(y), find the steady-state temperature distribution u(x, y) of the plate. Hence, find the temperature distribution if f(y) = u0, where u0 is a constant.

Figure 2.1: A semi-infinite metal plate whose edges are kept at fixed temperatures.

The two-dimensional heat diffusion equation satisfied by the temperature u(x, y, t) is, using Eq. (2.292),

\[ \kappa\left( \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} \right) = \frac{\partial u}{\partial t}. \]   (2.169)

In this case, we are asked to find the steady-state temperature, which corre-
sponds to ∂u/∂t = 0, and so, we are left with the two-dimensional equation

\[ \frac{\partial^2 u}{\partial x^2} + \frac{\partial^2 u}{\partial y^2} = 0, \]   (2.170)

which is the Laplace equation. As we have seen, by assuming a separable solution of the form u(x, y) = X(x)Y(y) we are going to obtain, from Eq. (2.153),

\[ \frac{1}{X}\frac{d^2 X}{dx^2} + \frac{1}{Y}\frac{d^2 Y}{dy^2} = 0, \]   (2.171)

which implies

\[ \frac{1}{X}\frac{d^2 X}{dx^2} = -l^2, \qquad \frac{1}{Y}\frac{d^2 Y}{dy^2} = -m^2, \]   (2.172)
with the separation constants related by l² = −m² [from Eq. (2.171)]. In the current problem, we have to satisfy the boundary conditions u(x, 0) = u(x, b) = 0. A sinusoidal expression for Y(y) seems then more appropriate than an exponential form. Furthermore, we also require u(∞, y) = 0; thus, an exponential form for X(x) seems more convenient. Then, we write

\[ \frac{1}{X}\frac{d^2 X}{dx^2} = m^2 \;\Longrightarrow\; X(x) = A e^{mx} + B e^{-mx}, \]
\[ \frac{1}{Y}\frac{d^2 Y}{dy^2} = -m^2 \;\Longrightarrow\; Y(y) = C\cos(my) + D\sin(my), \]   (2.173)

and, thus,

\[ u(x, y) = \left[ A e^{mx} + B e^{-mx} \right]\left[ C\cos(my) + D\sin(my) \right]. \]   (2.174)

Applying the boundary conditions, u(∞, y) = 0 implies A = 0 if we take m > 0, since e^{mx} → ∞ when x → ∞. Considering u(x, 0) = 0, we get C = 0, and absorbing the constant D into B leaves us with

\[ u(x, y) = B e^{-mx}\sin(my). \]   (2.175)

Using now that u(x, b) = 0, we require sin(mb) = 0, so m = ±kπ/b, where k


is any positive integer. Notice that negative values of m would lead to expo-
nential terms that diverge as x → ∞, then m = kπ/b. In this way, we have
found
\[ u_k(x, y) = B_k\, e^{-k\pi x/b}\sin\!\left( \frac{k\pi y}{b} \right). \]   (2.176)

Considering now the principle of superposition (2.167), the general solution satisfying the boundary conditions can be written as

\[ u(x, y) = \sum_{k=1}^{\infty} B_k\, e^{-k\pi x/b}\sin\!\left( \frac{k\pi y}{b} \right), \]   (2.177)

for some constants B_k [note that the term k = 0 is identically zero, thus, we have omitted it from the sum in Eq. (2.177)]. Using the remaining boundary condition u(0, y) = f(y), we see that the constants B_k must satisfy

\[ f(y) = \sum_{k=1}^{\infty} B_k \sin\!\left( \frac{k\pi y}{b} \right). \]   (2.178)

This is clearly a Fourier series expansion of f (y)! For Eq. (2.178) to hold, how-
ever, the continuation of f (y) outside the region 0 ≤ y ≤ b must be an odd

periodic function with period 2b (see Fig. 2.2). We also see from the figure
that if the original function f (y) does not equal zero at either of y = 0 and
y = b then its continuation has a discontinuity at the corresponding point(s).
Nevertheless, as discussed in Chapter 1, the Fourier series will converge to the
mid-points of these jumps and hence tend to zero in this case. If, however,
the top and bottom edges of the plate were held not at 0°C but at some other
non-zero temperature, then, in general, the final solution would possess
discontinuities at the corners x = 0, y = 0 and x = 0, y = b.

Figure 2.2: The continuation of f (y) for a Fourier sine series.

Bearing in mind these technicalities, the coefficients B_k in Eq. (2.178), as we showed in Chapter 1, are given by

\[ B_k = \frac{2}{b} \int_0^b dy\, f(y)\sin\!\left( \frac{k\pi}{b} y \right). \]   (2.179)
Therefore, if f(y) = u0, i.e., the temperature of the side at x = 0 is constant along its length, we get

\[ B_k = \frac{2u_0}{b} \int_0^b dy\, \sin\!\left( \frac{k\pi}{b} y \right) = \frac{2u_0}{b}\left[ -\frac{b}{k\pi}\cos\!\left( \frac{k\pi}{b} y \right) \right]_0^b = -\frac{2u_0}{k\pi}\left[ (-1)^k - 1 \right] = \begin{cases} 4u_0/(k\pi), & \text{for } k \text{ odd}, \\ 0, & \text{for } k \text{ even}. \end{cases} \]   (2.180)

Therefore, writing the odd integers as 2k + 1 with k = 0, 1, 2, ..., the final solution is

\[ u(x, y) = \sum_{k=0}^{\infty} \frac{4u_0}{(2k+1)\pi}\, e^{-(2k+1)\pi x/b}\sin\!\left( \frac{(2k+1)\pi}{b} y \right). \]   (2.181)
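For concreteness, the series (2.181) can be summed numerically; the sketch below uses illustrative values b = 1 and u0 = 1 (our own choices) and shows the temperature near the heated edge and far down the plate.

```python
import numpy as np

def plate_temperature(x, y, u0=1.0, b=1.0, kmax=2000):
    k = np.arange(kmax)                       # k = 0, 1, 2, ... as in (2.181)
    terms = (4*u0/((2*k + 1)*np.pi)
             * np.exp(-(2*k + 1)*np.pi*x/b)
             * np.sin((2*k + 1)*np.pi*y/b))
    return terms.sum()

print(plate_temperature(0.01, 0.5))   # close to u0 near the heated edge
print(plate_temperature(2.0, 0.5))    # essentially zero far down the plate
```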
Often, the principle of superposition can also be used (1) to write the solution to problems with more complicated boundary conditions as the sum of solutions to problems that each satisfy only part of the boundary conditions but, when added together, satisfy all of them, and (2) to deal with inhomogeneous boundary conditions. In general, inhomogeneous boundary conditions can cause difficulties, and it is usual to attempt a suitable change of variables that transforms the problem into an equivalent homogeneous one. But we are not going to enter into these details.

2.4.3 POLAR COORDINATES


Many systems in two and three dimensions are more naturally expressed in
some form of polar coordinates, in which full advantage can be taken of any
inherent symmetries, rather than in Cartesian coordinates. For example, the
potential associated with an isolated point charge has a very simple expres-
sion when polar coordinates are used: q/(4πϵ0 r ). However, in Cartesian co-
ordinates it involves all three coordinates and square roots. For these reasons
we now turn to the separation of variables method in polar coordinates (par-
ticularly, cylindrical and spherical coordinates).

Figure 2.3: Cylindrical (left) and spherical (right) coordinates.

In cylindrical coordinates, see Fig. 2.3, the position of a point in space hav-
ing Cartesian coordinates x, y, z can be expressed in terms of ρ, ϕ, z,

x = ρcosϕ, y = ρsinϕ, z = z, (2.182)


and ρ ≥ 0, 0 ≤ ϕ < 2π and −∞ < z < ∞. In this way, the position vector r of the point can be written as

\[ \mathbf{r} = \rho\cos\phi\, \mathbf{i} + \rho\sin\phi\, \mathbf{j} + z\,\mathbf{k}. \]   (2.183)

If we take the partial derivatives of r with respect to ρ, ϕ and z, respectively, and divide the resulting vectors by their corresponding norms, we obtain the three unit vectors

\[ \mathbf{e}_\rho = \cos\phi\, \mathbf{i} + \sin\phi\, \mathbf{j}, \qquad \mathbf{e}_\phi = -\sin\phi\, \mathbf{i} + \cos\phi\, \mathbf{j}, \qquad \mathbf{e}_z = \mathbf{k}. \]   (2.184)

These three unit vectors, like the Cartesian unit vectors i, j and k, form a basis. Plane polar coordinates correspond to a situation where the point P is on the xy plane; thus, they can be obtained as a particular case of the cylindrical coordinates for which z = 0.
In case of spherical coordinates, the position of a point in space with Carte-
sian coordinates x, y, z can be expressed in terms of r , θ, ϕ (see Fig. 2.3), where

x = r sinθ cosϕ, y = r sinθ sinϕ, z = r cosθ, (2.185)

and the unit basis vectors are

\[ \mathbf{e}_r = \sin\theta\cos\phi\, \mathbf{i} + \sin\theta\sin\phi\, \mathbf{j} + \cos\theta\,\mathbf{k}, \]
\[ \mathbf{e}_\theta = \cos\theta\cos\phi\, \mathbf{i} + \cos\theta\sin\phi\, \mathbf{j} - \sin\theta\,\mathbf{k}, \]
\[ \mathbf{e}_\phi = -\sin\phi\, \mathbf{i} + \cos\phi\, \mathbf{j}. \]   (2.186)

The equations we have considered (the wave equation, the diffusion equation, the Schrödinger equation, Laplace's equation, etc.) contain the operator ∇², which, in Cartesian coordinates, is given by

\[ \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. \]   (2.187)

In case of cylindrical coordinates, we have

\[ \nabla^2 = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left( \rho\frac{\partial}{\partial\rho} \right) + \frac{1}{\rho^2}\frac{\partial^2}{\partial\phi^2} + \frac{\partial^2}{\partial z^2}, \]   (2.188)

and in spherical coordinates,

\[ \nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left( r^2\frac{\partial}{\partial r} \right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left( \sin\theta\frac{\partial}{\partial\theta} \right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \]   (2.189)

2.5 LAPLACE'S EQUATION

To illustrate the method of separation of variables in polar coordinates, we are going to consider Laplace's equation. Common physical problems involving Laplace's equation are those from electrostatics in empty space and steady-state heat transfer. In each case, a surface is held at some (not necessarily uniform) potential or temperature, and the potential or temperature is sought in regions away from the surface.

2.5.1 CYLINDRICAL COORDINATES

Let's illustrate the method of separation of variables in polar coordinates by considering Laplace's equation in cylindrical coordinates. In this case, using Eq. (2.188), the Laplace equation (2.6) is given by

\[ \frac{1}{\rho}\frac{\partial}{\partial\rho}\left( \rho\frac{\partial u}{\partial\rho} \right) + \frac{1}{\rho^2}\frac{\partial^2 u}{\partial\phi^2} + \frac{\partial^2 u}{\partial z^2} = 0. \]   (2.190)

We then proceed by trying a solution of the form

\[ u(\rho, \phi, z) = P(\rho)\Phi(\phi)Z(z), \]   (2.191)

which, on substitution into Eq. (2.190) and division through by u = PΦZ, gives

\[ \frac{1}{P\rho}\frac{d}{d\rho}\left( \rho\frac{dP}{d\rho} \right) + \frac{1}{\Phi\rho^2}\frac{d^2\Phi}{d\phi^2} + \frac{1}{Z}\frac{d^2 Z}{dz^2} = 0. \]   (2.192)

The last term depends only on z, and the first and second, taken together, depend only on ρ and ϕ. Taking the first separation constant to be k², we find

\[ \frac{1}{Z}\frac{d^2 Z}{dz^2} = k^2, \]
\[ \frac{1}{P\rho}\frac{d}{d\rho}\left( \rho\frac{dP}{d\rho} \right) + \frac{1}{\Phi\rho^2}\frac{d^2\Phi}{d\phi^2} + k^2 = 0. \]   (2.193)

The first of these equations has the straightforward solution

\[ Z(z) = E e^{-kz} + F e^{kz}. \]   (2.194)

Note that when writing the above equation in terms of exponentials, we are implicitly assuming that k² > 0. If not, we can always define k² ≡ −λ², with λ > 0, and the exponentials in Eq. (2.194) would produce a solution in terms of sin(λz) and cos(λz).
Multiplying the second equation through by ρ², we obtain

\[ \frac{\rho}{P}\frac{d}{d\rho}\left( \rho\frac{dP}{d\rho} \right) + \frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} + k^2\rho^2 = 0, \]   (2.195)
in which the second term depends only on ϕ and the other terms depend only on ρ. Taking the second separation constant to be −m², we find

\[ \frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} = -m^2, \]   (2.196)
\[ \rho\frac{d}{d\rho}\left( \rho\frac{dP}{d\rho} \right) + (k^2\rho^2 - m^2)P = 0. \]   (2.197)
The equation in the azimuthal angle ϕ has the familiar solution

\[ \Phi(\phi) = C\cos(m\phi) + D\sin(m\phi). \]   (2.198)

For physical situations defined over the entire range of ϕ, (ρ, ϕ, z) and (ρ, ϕ + 2π, z) represent the same physical point. Then u, and thus Φ, must be single-valued, i.e., it must not change when ϕ increases by 2π. This implies that m must be an integer, so that cos(m(ϕ + 2π)) = cos(mϕ) and sin(m(ϕ + 2π)) = sin(mϕ). Negative values of m do not give rise to any new solutions, so they are not included in the range of m. In the particular case m = 0, Eq. (2.196) produces the solution

Φ(ϕ) = C 0 ϕ + D 0 . (2.199)

However, in this case, Φ(ϕ) = Φ(ϕ + 2π) only if C₀ = 0. Thus, if we require single-valued solutions, we need C₀ = 0, i.e., Φ(ϕ) would be a constant, D₀, for the case m = 0. But this is precisely the result that we get by using the form in Eq. (2.198) for m = 0. Then, we do not need to treat the solutions for m ≠ 0 and m = 0 differently, and we can write

Φ(ϕ) = C cos(mϕ) + Dsin(mϕ), m = 0, 1, 2, . . . (2.200)

where C and D are constants to be determined by the boundary conditions. Note that a physical problem where axial symmetry is present, i.e., where there is no dependence on the value of ϕ, fixes the value of m to zero, since in such a case Φ(ϕ) = C, which is a constant, and the solution u will be independent of ϕ.
In case of Eq. (2.197), it is convenient to define the variable v = kρ to write Eq. (2.197) as

\[ \frac{d^2 P}{dv^2} + \frac{1}{v}\frac{dP}{dv} + \left( 1 - \frac{m^2}{v^2} \right)P = 0. \]   (2.201)
Equation (2.201) is one of the most famous differential equations of mathematical physics, called the Bessel differential equation. As we can see, v = 0 is a regular singular point: the functions multiplying dP/dv and P are both singular at v = 0, but if we multiply them by v and v², respectively, the new functions obtained are both analytic at v = 0. Thus, a way of finding solutions for the Bessel differential equation is by means of the Frobenius method, i.e.,

\[ P(v) = v^{\sigma} \sum_{n=0}^{\infty} a_n v^n. \]   (2.202)

In this case, the indicial equation [see Eq. (2.104)] is

\[ \sigma(\sigma - 1) + \sigma - m^2 = 0 \;\Longrightarrow\; \sigma^2 = m^2, \]   (2.203)
and we have the roots σ1 = m and σ2 = −m. Since m is an integer, these two roots differ by an integer, 2m. We can then find a solution for the larger root, i.e., σ1, and use Eq. (2.129) to find another linearly independent solution. From Eq. (2.102) for σ = σ1, we have

\[ \sum_{n=0}^{\infty} \left[ (n+m)(n+m-1) + (n+m) + (v^2 - m^2) \right] a_n v^n = 0, \]   (2.204)

which can be rewritten as

\[ \sum_{n=0}^{\infty} \left[ n(n+2m)\, a_n + a_{n-2} \right] v^n = 0 \;\Longrightarrow\; a_n = -\frac{a_{n-2}}{n(n+2m)}. \]   (2.205)
If we choose a0 = 1 and a1 = 0, all a_n for odd values of n are zero, while for even values of n we get

\[ a_2 = -\frac{a_0}{2(2+2m)} = -\frac{1}{2^2(1+m)}, \]
\[ a_4 = -\frac{a_2}{4(4+2m)} = \frac{1}{2^4\, 2(2+m)(1+m)} = \frac{m!}{2^4\, 2!(2+m)!}, \]
\[ a_6 = -\frac{a_4}{6(6+2m)} = -\frac{1}{2^6\, 3\cdot 2(3+m)(2+m)(1+m)} = -\frac{m!}{2^6\, 3!(3+m)!}, \]
\[ \vdots \]
\[ a_{2i} = m!\, \frac{(-1)^i}{2^{2i}\, i!(i+m)!}, \qquad i = 0, 1, 2, \dots \]   (2.206)
In this way, one solution P1(v) of Eq. (2.201) is given by

\[ P_1(v) = m!\, v^m \sum_{n=0}^{\infty} \frac{(-1)^n}{2^{2n}\, n!(n+m)!}\, v^{2n} = m!\, 2^m \left( \frac{v}{2} \right)^m \sum_{n=0}^{\infty} \frac{(-1)^n}{n!(n+m)!} \left( \frac{v}{2} \right)^{2n}, \]   (2.207)
which is a convergent series: the ratio test yields

\[ \lim_{n\to\infty} \left| \frac{a_{2(n+1)}\, v^{2(n+1)}}{a_{2n}\, v^{2n}} \right| \sim \lim_{n\to\infty} \frac{v^2}{n^2} = 0, \]   (2.208)

which shows convergence for all values of v.


The Bessel function of order m, which is denoted by J_m(v), is defined as

\[ J_m(v) = \left( \frac{v}{2} \right)^m \sum_{n=0}^{\infty} \frac{(-1)^n}{n!(n+m)!} \left( \frac{v}{2} \right)^{2n}. \]   (2.209)

In this way, we write

\[ P_1(v) = m!\, 2^m J_m(v). \]   (2.210)

The factor m!2m is a constant factor which can be reabsorbed in the arbitrary
constant multiplying the solution P 1 (v) when constructing the general solu-
tion P (v). In this way, we can consider as solution related to the root σ1

P 1 (v) = J m (v). (2.211)
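As a numerical aside (our addition): the truncated series (2.209) can be checked against scipy's implementation of J_m, and a 30-term truncation already reproduces it to machine precision for moderate v.

```python
import math
from scipy.special import jv

def J_series(m, v, N=30):
    # direct summation of the series definition (2.209)
    return sum((-1)**n / (math.factorial(n) * math.factorial(n + m))
               * (v / 2)**(2*n + m) for n in range(N))

for m in (0, 1, 2):
    print(m, J_series(m, 3.7) - jv(m, 3.7))   # differences ~1e-16
```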

To get now a second solution P2(v), which is linearly independent of P1(v), we can use Eq. (2.129):

\[ P_2(v) = J_m(v) \int^{v} du\, \frac{1}{J_m^2(u)}\, e^{-\int^{u} \frac{d\alpha}{\alpha}} = J_m(v) \int^{v} \frac{du}{u\, J_m^2(u)}. \]   (2.212)

Note that, contrary to P1(v), P2(v) is not well behaved at v = 0 due to the presence of u in the denominator of the integrand. Although the above procedure produces a second solution for the Bessel equation, it is not the customary one. It is more common to use as second solution the combination

\[ Y_m(v) = \frac{J_m(v)\cos(m\pi) - J_{-m}(v)}{\sin(m\pi)} \]   (2.213)

(understood as a limit when m is an integer), which is called the Bessel function of the second kind, or Neumann function (more details related to the Bessel functions will be studied in Chapter 3). In this way, we have obtained that

\[ P(v) = A J_m(v) + B Y_m(v), \]   (2.214)

or, in terms of ρ,

\[ P(\rho) = A J_m(k\rho) + B Y_m(k\rho), \]   (2.215)


Figure 2.4: A conducting cylindrical can whose top is held at a potential V(ρ, ϕ), with the rest of the surface grounded.

where A and B are arbitrary constants to be determined from the boundary


conditions. The Neumann function Ym (kρ) is singular at ρ = 0, as expected
from (2.212), and so, when seeking solutions to Laplace’s equation in cylin-
drical coordinates within some region containing the ρ = 0 axis, we require
B = 0.
The complete separated-variable solution of Laplace's equation in cylindrical polar coordinates is thus given by

\[ u(\rho, \phi, z) = \left[ A J_m(k\rho) + B Y_m(k\rho) \right]\left[ C\cos(m\phi) + D\sin(m\phi) \right]\left[ E e^{-kz} + F e^{kz} \right]. \]   (2.216)

Of course, we may use the principle of superposition to build up more gen-


eral solutions by adding together solutions of the form (2.216) for all allowed
values of the separation constants k and m.

EXAMPLE 2.11. Consider a cylindrical conducting can of radius a, height h, and whose base is in the z = 0 plane (see Fig. 2.4). Suppose that the potential at the top face varies as V(ρ, ϕ) while the lateral surface and the bottom face are grounded. Let us find the electrostatic potential u at all points inside the can.
The electrostatic potential u satisfies Laplace's equation subject to the imposed boundary conditions. Due to the symmetry of the problem, it is convenient to use cylindrical coordinates. In this way, we have that u(ρ, ϕ, z) is given by Eq. (2.216). Since the bottom face of the can is grounded, u(ρ, ϕ, 0) = 0 for arbitrary ρ and ϕ. This means that

\[ u(\rho, \phi, 0) = \left[ A J_m(k\rho) + B Y_m(k\rho) \right]\left[ C\cos(m\phi) + D\sin(m\phi) \right]\left[ E + F \right] = 0 \;\Longrightarrow\; E = -F, \]   (2.217)

yielding

\[ u(\rho, \phi, z) = \left[ A J_m(k\rho) + B Y_m(k\rho) \right]\left[ C\cos(m\phi) + D\sin(m\phi) \right]\sinh(kz), \]   (2.218)

where we have reabsorbed the constant 2F in the other arbitrary constants of the solution. Since u(0, ϕ, z) is finite, no Neumann function is allowed in the expansion, thus, B = 0, and we have

\[ u(\rho, \phi, z) = J_m(k\rho)\left[ C\cos(m\phi) + D\sin(m\phi) \right]\sinh(kz), \]   (2.219)

where we reabsorbed the arbitrary constant A in the coefficients C and D. Since the lateral surface is grounded, u(a, ϕ, z) = 0 for arbitrary ϕ and z. From Eq. (2.219), this means

\[ J_m(ka) = 0 \;\Longrightarrow\; ka = x_{mn} \;\Longrightarrow\; k_{mn} = \frac{x_{mn}}{a}, \qquad n = 1, 2, \dots \]   (2.220)
where x_{mn} is the nth root, i.e., the nth zero, of J_m (in general, there are infinitely many roots). Thus, there exists a set of values of k, the k_{mn}, for which Eq. (2.220) is satisfied. In this way, using the principle of superposition, the most general solution for u(ρ, ϕ, z) is given by

\[ u(\rho, \phi, z) = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} J_m\!\left( \frac{x_{mn}}{a}\rho \right) \sinh\!\left( \frac{x_{mn}}{a} z \right)\left[ C_{mn}\cos(m\phi) + D_{mn}\sin(m\phi) \right]. \]   (2.221)

This result is the so-called Fourier-Bessel series. The arbitrary constants C_{mn} and D_{mn} are to be determined by the remaining boundary condition, which states that u(ρ, ϕ, h) = V(ρ, ϕ). In this way,

\[ V(\rho, \phi) = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} J_m\!\left( \frac{x_{mn}}{a}\rho \right) \sinh\!\left( \frac{x_{mn}}{a} h \right)\left[ C_{mn}\cos(m\phi) + D_{mn}\sin(m\phi) \right]. \]   (2.222)

To get C_{mn} and D_{mn} from the above equation, we need to use a property of the Bessel functions, which establishes that

\[ \int_0^a d\rho\, \rho\, J_m\!\left( \frac{x_{mn}}{a}\rho \right) J_m\!\left( \frac{x_{mk}}{a}\rho \right) = \frac{1}{2} a^2 J_{m+1}^2(x_{mn})\, \delta_{kn}, \]   (2.223)

and the fact that, for integers j and m not both zero,

\[ \frac{1}{\pi}\int_0^{2\pi} d\phi\, \cos(m\phi)\cos(j\phi) = \frac{1}{\pi}\int_0^{2\pi} d\phi\, \sin(m\phi)\sin(j\phi) = \delta_{jm}, \qquad \int_0^{2\pi} d\phi\, \cos(m\phi)\sin(j\phi) = 0, \]   (2.224)

while for m = j = 0 the cosine integral equals 2 instead of δ₀₀ = 1.

In this way, multiplying both sides of Eq. (2.222) by ρ J_m(x_{mk}ρ/a)cos(jϕ) and integrating from zero to 2π in ϕ and from zero to a in ρ, we can get C_{mn}. Similarly, replacing the cosine by a sine and following the same steps, we can determine D_{mn}. Then, for m ≥ 1,

\[ C_{mn} = \frac{2 \int_0^{2\pi} d\phi \int_0^a d\rho\, \rho\, V(\rho, \phi)\, J_m\!\left( \frac{x_{mn}}{a}\rho \right)\cos(m\phi)}{\pi a^2 J_{m+1}^2(x_{mn}) \sinh\!\left( \frac{x_{mn}}{a} h \right)}, \]
\[ D_{mn} = \frac{2 \int_0^{2\pi} d\phi \int_0^a d\rho\, \rho\, V(\rho, \phi)\, J_m\!\left( \frac{x_{mn}}{a}\rho \right)\sin(m\phi)}{\pi a^2 J_{m+1}^2(x_{mn}) \sinh\!\left( \frac{x_{mn}}{a} h \right)}, \]   (2.225)

while for m = 0 the right-hand side of the expression for C_{0n} must be halved, because of the different normalization of the ϕ integral noted after Eq. (2.224).

The important case of axial symmetry requires special consideration. In such a case, the potential of the top surface, V(ρ, ϕ), must be independent of ϕ. Furthermore, the part Φ(ϕ) of u(ρ, ϕ, z) must also be a constant; otherwise u would depend on ϕ, which would not be compatible with the azimuthal symmetry. Hence, from Eq. (2.200), we need m = 0. The zero value for m reduces the double summation of Eq. (2.221) to a single sum, and we get

\[ u(\rho, z) = \sum_{n=1}^{\infty} C_{0n}\, J_0\!\left( \frac{x_{0n}}{a}\rho \right)\sinh\!\left( \frac{x_{0n}}{a} z \right) \equiv \sum_{n=1}^{\infty} C_n\, J_0\!\left( \frac{x_{0n}}{a}\rho \right)\sinh\!\left( \frac{x_{0n}}{a} z \right). \]   (2.226)

The coefficients C_n ≡ C_{0n} correspond to setting m = 0 in the (halved) expression for C_{mn}, i.e.,

\[ C_n = \frac{2}{a^2 J_1^2(x_{0n}) \sinh\!\left( \frac{x_{0n}}{a} h \right)} \int_0^a d\rho\, \rho\, V(\rho)\, J_0\!\left( \frac{x_{0n}}{a}\rho \right), \]   (2.227)
where V(ρ) is the ϕ-independent potential of the top surface. If we consider for example the simple case where V(ρ) is a constant potential V0, we then get

\[ C_n = \frac{2V_0}{a^2 J_1^2(x_{0n}) \sinh\!\left( \frac{x_{0n}}{a} h \right)} \int_0^a d\rho\, \rho\, J_0\!\left( \frac{x_{0n}}{a}\rho \right). \]   (2.228)

To calculate this integral, we need to use the following property of the Bessel functions,

\[ \int dx\, x^m J_{m-1}(x) = x^m J_m(x), \]   (2.229)

where we omit writing the arbitrary integration constant. In this way, if we consider the substitution x = x_{0n}ρ/a, dx = x_{0n} dρ/a in the integral in Eq. (2.228),

\[ \int_0^a d\rho\, \rho\, J_0\!\left( \frac{x_{0n}}{a}\rho \right) = \frac{a^2}{x_{0n}^2} \int_0^{x_{0n}} dx\, x J_0(x) = \frac{a^2}{x_{0n}^2} \Big[ x J_1(x) \Big]_0^{x_{0n}} = \frac{a^2}{x_{0n}}\, J_1(x_{0n}), \]   (2.230)

and we get

\[ C_n = \frac{2V_0}{x_{0n}\, J_1(x_{0n}) \sinh\!\left( \frac{x_{0n}}{a} h \right)}. \]   (2.231)

Therefore,

\[ u(\rho, z) = 2V_0 \sum_{n=1}^{\infty} \frac{J_0\!\left( \frac{x_{0n}}{a}\rho \right)\sinh\!\left( \frac{x_{0n}}{a} z \right)}{x_{0n}\, J_1(x_{0n}) \sinh\!\left( \frac{x_{0n}}{a} h \right)}. \]   (2.232)
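The sum (2.232) is straightforward to evaluate with scipy, which provides both the Bessel functions and their zeros. The sketch below assumes illustrative values a = 1, h = 2 and V0 = 1 (our choices, not from the text); near the top face z = h the truncated series converges slowly, as usual at a boundary where the expanded function is discontinuous (here at the rim).

```python
import numpy as np
from scipy.special import j0, j1, jn_zeros

def can_potential(rho, z, V0=1.0, a=1.0, h=2.0, nmax=40):
    x0n = jn_zeros(0, nmax)            # first nmax zeros of J_0
    terms = (2*V0 * j0(x0n*rho/a) * np.sinh(x0n*z/a)
             / (x0n * j1(x0n) * np.sinh(x0n*h/a)))
    return terms.sum()

print(can_potential(0.5, 1.0))   # potential at an interior point
print(can_potential(0.0, 1.5))   # on the axis, closer to the top face
```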

2.5.2 SPHERICAL COORDINATES


In spherical coordinates, using Eq. (2.189), the Laplace’s equation (2.6) adopts
the form
µ ¶ µ ¶
1 ∂ 2 ∂u 1 ∂ ∂u 1 ∂2 u
r + sinθ + = 0. (2.233)
r 2 ∂r ∂r r 2 sinθ ∂θ ∂θ r 2 sin2 θ ∂ϕ2
Our method of procedure will be as before; we try a solution of the form

u(r, θ, ϕ) = R(r )Θ(θ)Φ(ϕ). (2.234)


Substituting this in Eq. (2.233), dividing through by u = RΘΦ and multiplying by r², we obtain

\[ \frac{1}{R}\frac{d}{dr}\left( r^2\frac{dR}{dr} \right) + \frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left( \sin\theta\frac{d\Theta}{d\theta} \right) + \frac{1}{\Phi\sin^2\theta}\frac{d^2\Phi}{d\phi^2} = 0. \]   (2.235)
The first term depends only on r, and the second and third terms taken together depend only on θ and ϕ. Thus, Eq. (2.235) is equivalent to the two equations

\[ \frac{1}{R}\frac{d}{dr}\left( r^2\frac{dR}{dr} \right) = \lambda, \]   (2.236)
\[ \frac{1}{\Theta\sin\theta}\frac{d}{d\theta}\left( \sin\theta\frac{d\Theta}{d\theta} \right) + \frac{1}{\Phi\sin^2\theta}\frac{d^2\Phi}{d\phi^2} = -\lambda. \]   (2.237)
Equation (2.236) is a homogeneous (equidimensional, or Euler) equation,

\[ r^2\frac{d^2 R}{dr^2} + 2r\frac{dR}{dr} - \lambda R = 0, \]   (2.238)

which can be reduced, by the substitution r = e^t, and writing R(r) = S(t), to

\[ \frac{d^2 S}{dt^2} + \frac{dS}{dt} - \lambda S = 0. \]   (2.239)

This has as solution

\[ S(t) = A e^{\lambda_1 t} + B e^{\lambda_2 t}. \]   (2.240)

If we substitute Eq. (2.240) into Eq. (2.239), we get

\[ (\lambda_1^2 + \lambda_1 - \lambda)\, A e^{\lambda_1 t} + (\lambda_2^2 + \lambda_2 - \lambda)\, B e^{\lambda_2 t} = 0, \]   (2.241)

which should be valid for an arbitrary value of t. Then, we need that

\[ \lambda_1^2 + \lambda_1 - \lambda = 0, \qquad \lambda_2^2 + \lambda_2 - \lambda = 0. \]   (2.242)

If we subtract these equations, we have (assuming λ1 ≠ λ2)

\[ \lambda_1^2 + \lambda_1 = \lambda_2^2 + \lambda_2 \;\Longrightarrow\; \lambda_1^2 - \lambda_2^2 = \lambda_2 - \lambda_1 \;\Longrightarrow\; (\lambda_1 - \lambda_2)(\lambda_1 + \lambda_2) = -(\lambda_1 - \lambda_2) \;\Longrightarrow\; \lambda_1 + \lambda_2 = -1. \]   (2.243)
Summing the two expressions in Eq. (2.242), we get

\[ \lambda_1^2 + \lambda_2^2 + \lambda_1 + \lambda_2 - 2\lambda = 0 \;\Longrightarrow\; (\lambda_1 + \lambda_2)^2 - 2\lambda_1\lambda_2 + \lambda_1 + \lambda_2 - 2\lambda = 0. \]   (2.244)

Considering Eq. (2.243), the above equation can be written as

\[ -2\lambda_1\lambda_2 - 2\lambda = 0 \;\Longrightarrow\; \lambda_1\lambda_2 = -\lambda. \]   (2.245)

In this way, the solution to the radial equation is

\[ R(r) = A r^{\lambda_1} + B r^{\lambda_2}, \]   (2.246)

where λ1 + λ2 = −1 and λ1λ2 = −λ. If we define λ1 ≡ l, we have λ2 = −(l + 1) and

\[ \lambda = l(l+1). \]   (2.247)

Hence, we can write

\[ R(r) = A r^l + B r^{-(l+1)}. \]   (2.248)

Now, considering Eq. (2.237), multiplying through by sin²θ and using Eq. (2.247), it too takes a separated form:

\[ \left[ \frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left( \sin\theta\frac{d\Theta}{d\theta} \right) + l(l+1)\sin^2\theta \right] + \frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} = 0. \]   (2.249)

Taking the separation constant to be −m², the equation in the azimuthal angle ϕ is

\[ \frac{1}{\Phi}\frac{d^2\Phi}{d\phi^2} = -m^2, \]   (2.250)

which has the same solution as in cylindrical coordinates, namely

\[ \Phi(\phi) = C\cos(m\phi) + D\sin(m\phi). \]   (2.251)

As before, single-valuedness of u requires that m is an integer. Note that, for


m = 0, Eq. (2.251) produces a constant, which is a form compatible with az-
imuthal symmetry, i.e., problems where u is independent of the azimuthal
angle.
Using Eqs. (2.250) and (2.249), we are left only with the equation

\[ \frac{\sin\theta}{\Theta}\frac{d}{d\theta}\left( \sin\theta\frac{d\Theta}{d\theta} \right) + l(l+1)\sin^2\theta = m^2. \]   (2.252)
Considering the change of variable from θ to µ = cosθ, we have

\[ \frac{d\mu}{d\theta} = -\sin\theta = -(1-\mu^2)^{1/2} \;\Longrightarrow\; \frac{d}{d\theta} = \frac{d\mu}{d\theta}\frac{d}{d\mu} = -(1-\mu^2)^{1/2}\frac{d}{d\mu}, \]   (2.253)

and setting Θ(θ) = M(µ), we get

\[ \frac{d}{d\mu}\left[ (1-\mu^2)\frac{dM}{d\mu} \right] + \left[ l(l+1) - \frac{m^2}{1-\mu^2} \right] M = 0, \]   (2.254)

which can also be written as

\[ (1-\mu^2)\frac{d^2 M}{d\mu^2} - 2\mu\frac{dM}{d\mu} + \left[ l(l+1) - \frac{m^2}{1-\mu^2} \right] M = 0, \]   (2.255)

or

\[ \frac{d^2 M}{d\mu^2} - \frac{2\mu}{1-\mu^2}\frac{dM}{d\mu} + \frac{1}{1-\mu^2}\left[ l(l+1) - \frac{m^2}{1-\mu^2} \right] M = 0. \]   (2.256)

This equation is called the associated Legendre equation, while the case m = 0, i.e.,

\[ \frac{d^2 M}{d\mu^2} - \frac{2\mu}{1-\mu^2}\frac{dM}{d\mu} + \frac{l(l+1)}{1-\mu^2} M = 0, \]   (2.257)

is called the Legendre equation. For simplicity, we are going to restrict ourselves to physical problems involving axial symmetry, thus m = 0. This means that we need to find the general solution of the Legendre equation. Since µ = 0 is an ordinary point of Eq. (2.257), we can consider a series solution for M of the form

\[ M(\mu) = \sum_{n=0}^{\infty} a_n \mu^n. \]   (2.258)

In this way, multiplying Eq. (2.257) by the factor (1 − µ²) and substituting the series, we get

\[ \sum_{n=0}^{\infty} \left[ (n+2)(n+1)\, a_{n+2}(1-\mu^2) - 2\mu(n+1)\, a_{n+1} + l(l+1)\, a_n \right]\mu^n = 0, \]   (2.259)

which can be rewritten as

\[ \sum_{n=0}^{\infty} \left[ (n+2)(n+1)\, a_{n+2} + \{l(l+1) - n(n+1)\}\, a_n \right]\mu^n = 0 \;\Longrightarrow\; a_{n+2} = \frac{n(n+1) - l(l+1)}{(n+2)(n+1)}\, a_n. \]   (2.260)
In this way,

\[ a_n = \frac{(n-2)(n-1) - l(l+1)}{n(n-1)}\, a_{n-2} = -\frac{(l-n+2)(l+n-1)}{n(n-1)}\, a_{n-2}, \qquad n = 2, 3, \dots \]   (2.261)

Considering a0 = 1 and a1 = 0, all a_n coefficients with odd values of n are zero, while for the even values of n we get

\[ a_2 = -\frac{l(l+1)}{2}, \]
\[ a_4 = -\frac{(l-2)(l+3)}{4\cdot 3}\, a_2 = \frac{l(l-2)(l+3)(l+1)}{4!}, \]
\[ a_6 = -\frac{(l-4)(l+5)}{6\cdot 5}\, a_4 = -\frac{l(l-2)(l-4)(l+5)(l+3)(l+1)}{6!}, \]
\[ \vdots \]
\[ a_{2i} = \frac{(-1)^i}{(2i)!}\, \frac{l!!}{(l-2i)!!}\, \frac{(l+2i-1)!!}{(l-1)!!}, \qquad i = 0, 1, 2, \dots \]   (2.262)
where we have introduced the double factorial, which is given by j!! = j(j−2)(j−4)(j−6)⋯. In this way, we have found one solution of Eq. (2.257), which is

\[ M_1(\mu) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n)!}\, \frac{l!!}{(l-2n)!!}\, \frac{(l+2n-1)!!}{(l-1)!!}\, \mu^{2n}. \]   (2.263)
Applying the ratio test to this series,

\[ \lim_{n\to\infty} \left| \frac{a_{2(n+1)}\, \mu^{2(n+1)}}{a_{2n}\, \mu^{2n}} \right| = \mu^2, \]   (2.264)

so the series converges for |µ| < 1; the radius of convergence is thus unity, which, as expected, is the distance to the nearest singular point of Eq. (2.257).
Similarly, if we choose a0 = 0 and a1 = 1, all coefficients a_n where n is an even number are zero, while for n odd we obtain

\[ a_3 = -\frac{(l-1)(l+2)}{3\cdot 2}\, a_1 = -\frac{(l-1)(l+2)}{3!}, \]
\[ a_5 = -\frac{(l-3)(l+4)}{5\cdot 4}\, a_3 = \frac{(l-1)(l-3)(l+4)(l+2)}{5!}, \]
\[ a_7 = -\frac{(l-5)(l+6)}{7\cdot 6}\, a_5 = -\frac{(l-1)(l-3)(l-5)(l+6)(l+4)(l+2)}{7!}, \]
\[ \vdots \]
\[ a_{2i+1} = \frac{(-1)^i}{(2i+1)!}\, \frac{(l-1)!!}{(l-2i-1)!!}\, \frac{(l+2i)!!}{l!!}, \qquad i = 0, 1, 2, \dots \]   (2.265)
In this way, we have a second solution to Eq. (2.257), which is linearly independent of M1(µ),

\[ M_2(\mu) = \sum_{n=0}^{\infty} \frac{(-1)^n}{(2n+1)!}\, \frac{(l-1)!!}{(l-2n-1)!!}\, \frac{(l+2n)!!}{l!!}\, \mu^{2n+1}. \]   (2.266)

As in the case of M1(µ), the above series is convergent for |µ| < 1. Hence, the general solution to Eq. (2.257) is given by

\[ M(\mu) = E M_1(\mu) + F M_2(\mu). \]   (2.267)

Next, note that µ = cosθ = ±1 means θ = 0, π, which is part of the range of the polar angle θ in spherical coordinates. Since µ = ±1 are singular points of the Legendre equation, where M(µ) may diverge, we need either M1(µ) or M2(µ) to be finite at µ = ±1 in order to have a physically acceptable solution. Since the separation constant λ, and thus l, is arbitrary, we can, if it is helpful, put restrictions on it. Note that, from Eq. (2.260), if l is an integer, the recurrence relation when the value of n equals l gives

\[ a_{l+2} = \frac{l(l+1) - l(l+1)}{(l+1)(l+2)}\, a_l = 0, \]   (2.268)
i.e., the series terminates and we obtain a polynomial solution of order l ,
which is finite for any value of µ. In particular, if l is even, then M 1 (µ) reduces
to a polynomial, whereas if l is odd the same is true of M 2 (µ). In each case,
the other series does not terminate and therefore converges only for |µ| < 1.
These solutions (suitably normalized) are called the Legendre polynomials
of order l ; they are written as P l (µ) and it is conventional to normalize them
such that P l (1) = 1.
According to whether l is even or odd, we define the Legendre functions of the second kind as Q_l(µ) = α_l M_2(µ) or Q_l(µ) = β_l M_1(µ), where

\[ \alpha_l = \frac{(-1)^{l/2}\, 2^l\, [(l/2)!]^2}{l!}, \quad \text{for } l \text{ even}, \]   (2.269)
\[ \beta_l = \frac{(-1)^{(l+1)/2}\, 2^{l-1}\, \{[(l-1)/2]!\}^2}{l!}, \quad \text{for } l \text{ odd}. \]   (2.270)
The general solution of Eq. (2.257) is then written as

\[ M(\mu) = E P_l(\mu) + F Q_l(\mu), \]   (2.271)

or, equivalently,

\[ \Theta(\theta) = E P_l(\cos\theta) + F Q_l(\cos\theta), \]   (2.272)

where P l (cosθ) is a polynomial of order l , and so converges for all θ, and


Q l (cosθ) is an infinite series that converges only for |cosθ| < 1. Thus, if we
require solutions of Laplace’s equation which are finite when cosθ = ±1, i.e.,
on the polar axis where θ = 0, π, we must have F = 0 in Eq. (2.272).
The general solution u(r, θ, ϕ) of Laplace's equation in spherical coordinates for the case of axial symmetry can then be written as

\[ u(r, \theta, \phi) = \left[ A r^l + B r^{-(l+1)} \right]\left[ C\cos(m\phi)\big|_{m=0} + D\sin(m\phi)\big|_{m=0} \right]\left[ E P_l(\cos\theta) + F Q_l(\cos\theta) \right] = \left[ A r^l + B r^{-(l+1)} \right]\left[ E P_l(\cos\theta) + F Q_l(\cos\theta) \right], \]   (2.273)

where we have absorbed the constant C in the other constants. As before, a general solution may be obtained by superposing solutions of this form for the allowed values of the separation constants (l in the case of axial symmetry). As mentioned above, if the solution is required to be finite on the polar axis, then F = 0 for all values of the separation constants. Although we have not solved Eq. (2.257) for general values of m, when m ≠ 0 one simply replaces P_l(cosθ) and Q_l(cosθ) by the associated Legendre functions P_l^m(cosθ) and Q_l^m(cosθ), which are given by

\[ P_l^m(x) = (1-x^2)^{|m|/2}\frac{d^{|m|}}{dx^{|m|}} P_l(x), \qquad Q_l^m(x) = (1-x^2)^{|m|/2}\frac{d^{|m|}}{dx^{|m|}} Q_l(x), \]   (2.274)
where 0 ≤ |m| ≤ l and P_l^0(x) = P_l(x), Q_l^0(x) = Q_l(x). In this way, the general solution of Laplace's equation in spherical coordinates is given by

\[ u(r, \theta, \phi) = \sum_{l,m} \left[ A_{lm} r^l + B_{lm} r^{-(l+1)} \right]\left[ C_{lm}\cos(m\phi) + D_{lm}\sin(m\phi) \right]\left[ E_{lm} P_l^m(\cos\theta) + F_{lm} Q_l^m(\cos\theta) \right]. \]   (2.275)

In the case of axial symmetry, the preceding equation reduces to

\[ u(r, \theta, \phi) = \sum_{l} \left[ A_l r^l + B_l r^{-(l+1)} \right]\left[ E_l P_l(\cos\theta) + F_l Q_l(\cos\theta) \right]. \]   (2.276)

Further details of the Legendre polynomials will be studied in Chapter 3.

EXAMPLE 2.12. Two solid heat-conducting hemispheres of radius a, separated by a very small insulating gap, form a sphere. The two halves of the sphere are in contact, on the outside, with two (infinite) heat baths at temperatures T0 and −T0 (see Fig. 2.5). We want to find the temperature distribution T(r, θ, ϕ) inside the sphere.
We choose a spherical coordinate system in which the origin coincides
with the center of the sphere and the polar axis is perpendicular to the equa-
torial plane. The hemisphere with temperature T0 is assumed to constitute
Figure 2.5: Two heat-conducting hemispheres held at two different temperatures. The upper hemisphere has the polar range 0 ≤ θ < π/2 or 0 < cosθ ≤ 1, and the lower hemisphere has the range π/2 < θ ≤ π or −1 ≤ cosθ < 0.

the northern hemisphere. Since the problem is clearly axially symmetric, T is independent of ϕ, and we have m = 0. We require the solution to be finite on the polar axis, so we must have F = 0 in Eq. (2.273). Since the origin is in the region of interest, we need to exclude all negative powers of r, which is accomplished by setting the coefficient B equal to zero. Therefore, the solution T(r, θ, ϕ) ≡ T(r, θ) must be of the form

\[ T(r, \theta) = \sum_{n=0}^{\infty} A_n r^n P_n(\cos\theta). \]   (2.277)

It remains to calculate the constants A_n. This is done by noting that

\[ T(a, \theta) = \begin{cases} T_0, & \text{if } 0 \le \theta < \pi/2, \\ -T_0, & \text{if } \pi/2 < \theta \le \pi, \end{cases} \]   (2.278)

or, equivalently,

\[ T(a, \mu) = \begin{cases} -T_0, & \text{if } -1 \le \mu < 0, \\ T_0, & \text{if } 0 < \mu \le 1. \end{cases} \]   (2.279)

Using Eq. (2.277),

\[ T(a, \mu) = \sum_{n=0}^{\infty} A_n a^n P_n(\mu). \]   (2.280)
n=0

To determine the coefficients A_n we can use a property of the Legendre polynomials, which establishes that

\[ \int_{-1}^{1} d\mu\, P_n(\mu) P_m(\mu) = \frac{2}{2n+1}\, \delta_{nm}, \qquad n, m = 0, 1, 2, \dots \]   (2.281)
−1
In this way, we can write

\[ A_n = \frac{1}{a^n}\frac{2n+1}{2} \int_{-1}^{1} d\mu\, P_n(\mu)\, T(a, \mu) = \frac{1}{a^n}\frac{2n+1}{2}\left[ \int_{-1}^{0} d\mu\, (-T_0)\, P_n(\mu) + \int_{0}^{1} d\mu\, T_0\, P_n(\mu) \right] = \frac{2n+1}{2a^n}\, T_0 \left[ -\int_{-1}^{0} d\mu\, P_n(\mu) + \int_{0}^{1} d\mu\, P_n(\mu) \right]. \]   (2.282)

Next, the first integral can be related to the second one by using the following property of the Legendre polynomials,

\[ P_n(-\mu) = (-1)^n P_n(\mu). \]   (2.283)

In this form, substituting µ → −µ in the integral,

\[ \int_{-1}^{0} d\mu\, P_n(\mu) = \int_{1}^{0} (-d\mu)\, P_n(-\mu) = \int_{0}^{1} d\mu\, P_n(-\mu) = (-1)^n \int_{0}^{1} d\mu\, P_n(\mu). \]   (2.284)

Using the above result, we can write Eq. (2.282) as

\[ A_n = \frac{2n+1}{2a^n}\, T_0\left[ 1 - (-1)^n \right] \int_{0}^{1} d\mu\, P_n(\mu) = \frac{2n+1}{2a^n}\, T_0 \times \begin{cases} 0, & \text{if } n \text{ is even}, \\ 2\int_0^1 d\mu\, P_{2k+1}(\mu), & \text{if } n = 2k+1, \end{cases} \]   (2.285)

where we have written the odd values of n as 2k + 1, with k = 0, 1, 2, ... It remains then to evaluate the integral of a Legendre polynomial of odd order over the interval [0, 1]. Such an integral is given by

\[ \int_{0}^{1} d\mu\, P_{2k+1}(\mu) = \frac{(-1)^k (2k)!}{2^{2k+1}\, k!(k+1)!}. \]   (2.286)
Then, we can write

\[ A_{2n} = 0, \]
\[ A_{2n+1} = \frac{2(2n+1)+1}{a^{2n+1}}\, T_0 \int_{0}^{1} d\mu\, P_{2n+1}(\mu) = \frac{(-1)^n (4n+3)(2n)!}{2^{2n+1}\, n!(n+1)!\, a^{2n+1}}\, T_0, \]   (2.287)

with n = 0, 1, 2, ... Substituting these expressions in Eq. (2.277), we arrive at the final answer,

\[ T(r, \theta) = T_0 \sum_{n=0}^{\infty} \frac{(-1)^n (4n+3)(2n)!}{2^{2n+1}\, n!(n+1)!} \left( \frac{r}{a} \right)^{2n+1} P_{2n+1}(\cos\theta). \]   (2.288)

Note that, in general, if the temperature on the surface of the sphere had been given as a function of θ and ϕ, then we would have P_l^m(cosθ) instead of P_l(cosθ), and we would have had to consider a double series summed over l and m, since the solution would not have been axially symmetric.
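A numerical check of (2.288) at the surface r = a (a sketch using scipy; the slow convergence near the equatorial discontinuity is expected): on the upper hemisphere the truncated series should approach +T0, and on the lower one −T0.

```python
import numpy as np
from scipy.special import eval_legendre, factorial

def T_sphere(r_over_a, theta, T0=1.0, nmax=60):
    n = np.arange(nmax)
    coeff = ((-1.0)**n * (4*n + 3) * factorial(2*n)
             / (2.0**(2*n + 1) * factorial(n) * factorial(n + 1)))
    return T0 * np.sum(coeff * r_over_a**(2*n + 1)
                       * eval_legendre(2*n + 1, np.cos(theta)))

print(T_sphere(1.0, np.pi/3))     # upper hemisphere: ~ +T0
print(T_sphere(1.0, 2*np.pi/3))   # lower hemisphere: ~ -T0
```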

EXAMPLE 2.13. Consider two electrically conducting hemispheres of radius a separated by a small insulating gap at the equator. The upper hemisphere is held at potential V0 and the lower one at −V0, as shown in Fig. 2.6. We want to find the potential at points outside the resulting sphere.

Figure 2.6: Two electrically conducting hemispheres held at two different


potentials. The upper hemisphere has the polar angle range 0 ≤ θ < π/2
or 0 < cosθ ≤ 1, and the lower hemisphere has the range π/2 < θ ≤ π or
−1 ≤ cosθ < 0.

This problem is completely analogous to the one shown in Example 2.12, except for the fact that outside the sphere (for r > a) we require the solution to be bounded as r → ∞, i.e., the potential must vanish at infinity. In view of Eq. (2.276), the terms r^l do not vanish as r → ∞, so to satisfy V(r → ∞, θ) = 0 we need A_l = 0 for all l, while the terms 1/r^{l+1} do vanish at infinity. So we can write the potential as

\[ V(r, \theta) = \sum_{n=0}^{\infty} \frac{B_n}{r^{n+1}}\, P_n(\cos\theta). \]   (2.289)

Note that, as in Example 2.12,

\[ V(a, \mu) = \begin{cases} -V_0, & \text{if } -1 \le \mu < 0, \\ V_0, & \text{if } 0 < \mu \le 1. \end{cases} \]   (2.290)

Then, the calculation of the coefficients B_n is completely identical to that of A_n in Example 2.12. We do not repeat the calculation here, and we simply give the final result,

\[ V(r, \theta) = V_0 \sum_{n=0}^{\infty} \frac{(-1)^n (4n+3)(2n)!}{2^{2n+1}\, n!(n+1)!} \left( \frac{a}{r} \right)^{2n+2} P_{2n+1}(\cos\theta). \]   (2.291)

This expression is precisely the multipole expansion of the two hemispheres.

2.6 THE DIFFUSION EQUATION


One important class of second order partial differential equations, which we
have not yet considered in detail, is that in which the second derivative with
respect to one variable appears, but only the first derivative with respect to
another, usually time. This is exemplified by the diffusion equation, some-
times also called the heat equation, whose most simplified version (the one
we consider here) is

∂u
∇2 u =
κ∇ . (2.292)
∂t
This equation describes the temperature u(rr , t ) in a region containing no heat
sources or sinks; κ > 0 is a real constant characterizing the medium in which
heat is flowing. The separation of variables u(rr , t ) = T (t )R(rr ) yields

\[ \kappa \nabla^2\left[ T(t) R(\mathbf{r}) \right] = \frac{\partial}{\partial t}\left[ T(t) R(\mathbf{r}) \right] \;\Longrightarrow\; \kappa T(t)\nabla^2 R = R(\mathbf{r})\frac{dT}{dt}. \]   (2.293)

Dividing both sides by T(t)R(r), we obtain

\[ \frac{1}{T}\frac{dT}{dt} = \kappa\frac{1}{R}\nabla^2 R \equiv -\kappa\lambda. \]   (2.294)
The left-hand side is a function of t, and the right-hand side a function of r. The independence of these variables forces each side to be a constant. Calling this constant −κλ for later convenience, we obtain an ordinary differential equation in time and a partial differential equation in the remaining variables:

\[ \frac{dT}{dt} + \kappa\lambda T = 0 \qquad \text{and} \qquad \nabla^2 R + \lambda R = 0. \]   (2.295)

The general solution of the first equation is (see Sec. 2.3.1)

\[ T(t) = A e^{-\kappa\lambda t}, \]   (2.296)

and that of the second equation can be obtained precisely by the methods of the previous section. We illustrate this by some examples, but first we need to keep in mind that λ is to be assumed positive; otherwise, the exponential in Eq. (2.296) would cause T(t), and therefore the temperature, to grow beyond bounds.

2.6.1 HEAT-CONDUCTING ROD


Let us consider a one-dimensional conducting rod with one end at the origin x = 0 and the other at x = b. The two ends are held at u = 0. Initially, at t = 0, we assume a temperature distribution on the rod given by some function f(x). We want to calculate the temperature at time t at any point x on the rod.

Due to the one-dimensionality of the rod, the y- and z-dependence can be ignored, and the Laplacian ∇² reduces to a second derivative in x. Thus, the second equation in (2.295) becomes

\[ \frac{d^2 X}{dx^2} + \lambda X = 0, \]   (2.297)

where X is a function of x alone. As we saw in Sec. 2.3.1, the general solution of this equation is

\[ X(x) = B\cos(\sqrt{\lambda}\, x) + C\sin(\sqrt{\lambda}\, x). \]   (2.298)
Since the two ends of the rod are held at u = 0, we have the boundary conditions u(t, 0) = 0 = u(t, b), which imply that X(0) = 0 = X(b). These give B = 0 and

\[ \sin(\sqrt{\lambda}\, b) = 0 \;\Longrightarrow\; \sqrt{\lambda}\, b = m\pi, \qquad \text{for } m = 1, 2, \dots \]   (2.299)

With a label m attached to λ, the solution, and the constant multiplying it, we can now write

\[ \lambda_m = \left( \frac{m\pi}{b} \right)^2 \quad \text{and} \quad X_m(x) = C_m\sin\!\left( \frac{m\pi}{b} x \right), \qquad \text{for } m = 1, 2, \dots \]   (2.300)
The (subscripted) solution of the time equation is also simply obtained:

\[ T_m(t) = A_m\, e^{-\kappa (m\pi/b)^2 t}. \]   (2.301)

This leads to a general solution of the form

\[ u(t, x) = \sum_{m=1}^{\infty} B_m\, e^{-\kappa (m\pi/b)^2 t}\sin\!\left( \frac{m\pi}{b} x \right), \]   (2.302)

where B_m ≡ A_m C_m. The initial condition f(x) = u(0, x) yields

\[ f(x) = \sum_{m=1}^{\infty} B_m\sin\!\left( \frac{m\pi x}{b} \right), \]   (2.303)

which is a Fourier series from which we can calculate the coefficients B_m:

\[ B_m = \frac{2}{b} \int_0^b dx\, f(x)\sin\!\left( \frac{m\pi}{b} x \right). \]   (2.304)
0

Thus, if we know the initial temperature distribution on the rod, i.e., the func-
tion f (x), we can determine the temperature distribution of the rod for all
time. For instance, if the initial temperature distribution of the rod is uniform,
say u0, then

\[ B_m = \frac{2u_0}{b} \int_0^b dx\, \sin\!\left( \frac{m\pi}{b} x \right) = \frac{2u_0}{b}\left[ -\frac{b}{m\pi}\cos\!\left( \frac{m\pi}{b} x \right) \right]_0^b = \frac{2u_0}{m\pi}\left[ 1 - (-1)^m \right]. \]   (2.305)

It follows that only the odd m's survive, and if we set m = 2n + 1, we obtain

\[ B_{2n+1} = \frac{4u_0}{\pi(2n+1)} \]   (2.306)

and

\[ u(t, x) = \frac{4u_0}{\pi} \sum_{n=0}^{\infty} \frac{e^{-\kappa[(2n+1)\pi/b]^2 t}}{2n+1}\sin\!\left[ \frac{(2n+1)\pi}{b} x \right]. \]   (2.307)

This distribution of temperature for all time can be evaluated numerically for any heat conductor whose κ is known. Note that the exponentials in the sum cause the temperature to drop eventually to zero (the fixed temperature of the two end points). This conclusion is independent of the initial temperature distribution of the rod, as Eq. (2.302) indicates.
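A short numerical evaluation of (2.307); the values b = 1 and κ = 0.1 below are illustrative assumptions of ours, not from the text.

```python
import numpy as np

def rod_temperature(x, t, u0=1.0, b=1.0, kappa=0.1, nmax=200):
    n = np.arange(nmax)
    m = 2*n + 1                       # only odd modes survive, Eq. (2.306)
    return (4*u0/np.pi) * np.sum(
        np.exp(-kappa*(m*np.pi/b)**2 * t) / m * np.sin(m*np.pi*x/b))

print(rod_temperature(0.5, 0.0))   # ~ u0 initially at the midpoint
print(rod_temperature(0.5, 5.0))   # decays toward 0 as t grows
```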
2.6.2 HEAT CONDUCTION IN A RECTANGULAR PLATE


As a more complicated example involving a second spatial variable, consider
a rectangular heat-conducting plate with sides of length a and b all held at
u = 0. Assume that at time t = 0 the temperature has a distribution function
f (x, y). Let us find the variation of temperature for all points (x, y) at all times
t > 0.
The spatial part of the diffusion equation for this problem is

\[ \frac{\partial^2 R}{\partial x^2} + \frac{\partial^2 R}{\partial y^2} + \lambda R = 0. \]   (2.308)

A separation of variables, R(x, y) = X(x)Y(y), and the usual procedure lead to the following equation:

\[ \frac{1}{X}\frac{d^2 X}{dx^2} + \frac{1}{Y}\frac{d^2 Y}{dy^2} + \lambda = 0. \]   (2.309)

In this way, defining

\[ \frac{1}{X}\frac{d^2 X}{dx^2} \equiv -\mu, \qquad \frac{1}{Y}\frac{d^2 Y}{dy^2} \equiv -\nu, \]   (2.310)

we get the following ordinary differential equations:

\[ \frac{d^2 X}{dx^2} + \mu X = 0, \qquad \frac{d^2 Y}{dy^2} + \nu Y = 0, \qquad \lambda = \mu + \nu. \]   (2.311)

Given the homogeneous boundary conditions,

\[ u(0, y, t) = u(a, y, t) = u(x, 0, t) = u(x, b, t) = 0, \]   (2.312)

the general solution of (2.311) is conveniently expressed in terms of trigonometric functions, rather than exponentials (see Sec. 2.3.1), and, using Eq. (2.312), we get the following indexed constants and solutions:

\[ \mu_n = \left( \frac{n\pi}{a} \right)^2 \quad \text{and} \quad X_n(x) = A_n\sin\!\left( \frac{n\pi}{a} x \right), \qquad n = 1, 2, \dots, \]
\[ \nu_m = \left( \frac{m\pi}{b} \right)^2 \quad \text{and} \quad Y_m(y) = B_m\sin\!\left( \frac{m\pi}{b} y \right), \qquad m = 1, 2, \dots \]   (2.313)

So, λ becomes a double-indexed quantity:

\[ \lambda \equiv \lambda_{mn} = \mu_n + \nu_m = \left( \frac{n\pi}{a} \right)^2 + \left( \frac{m\pi}{b} \right)^2. \]   (2.314)
The solution to the T equation can be expressed as T_{mn}(t) = C_{mn} e^{-κλ_{mn}t}. Putting everything together, we obtain

\[ u(x, y, t) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} A_{mn}\, e^{-\kappa\lambda_{mn} t}\sin\!\left( \frac{n\pi}{a} x \right)\sin\!\left( \frac{m\pi}{b} y \right), \]   (2.315)

where A_{mn} = A_n B_m C_{mn} is an arbitrary constant. To determine it, we impose the initial condition u(x, y, 0) = f(x, y). This yields

\[ f(x, y) = \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} A_{mn}\sin\!\left( \frac{n\pi}{a} x \right)\sin\!\left( \frac{m\pi}{b} y \right). \]   (2.316)

Using now the orthogonality properties of the trigonometric functions,

\[ \int_{x_0}^{x_0+L} dx\, \sin\!\left( \frac{\pi r x}{L} \right)\cos\!\left( \frac{\pi p x}{L} \right) = 0, \quad \text{for all } r \text{ and } p, \]   (2.317)

\[ \int_{x_0}^{x_0+L} dx\, \cos\!\left( \frac{\pi r x}{L} \right)\cos\!\left( \frac{\pi p x}{L} \right) = \begin{cases} L, & \text{for } r = p = 0, \\ L/2, & \text{for } r = p > 0, \\ 0, & \text{for } r \ne p, \end{cases} \]   (2.318)

\[ \int_{x_0}^{x_0+L} dx\, \sin\!\left( \frac{\pi r x}{L} \right)\sin\!\left( \frac{\pi p x}{L} \right) = \begin{cases} 0, & \text{for } r = p = 0, \\ L/2, & \text{for } r = p > 0, \\ 0, & \text{for } r \ne p, \end{cases} \]   (2.319)

where r and p are integers greater than or equal to zero and x_0 is arbitrary (but finite), we can find the coefficients A_{mn}:

\[ A_{mn} = \frac{4}{ab} \int_0^a dx \int_0^b dy\, f(x, y)\sin\!\left( \frac{n\pi}{a} x \right)\sin\!\left( \frac{m\pi}{b} y \right). \]   (2.320)
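In practice the double integral (2.320) is often done numerically. A sketch with scipy's dblquad, for a hypothetical initial profile f(x, y) = xy(a − x)(b − y) (the profile and the unit plate dimensions are our assumptions):

```python
import numpy as np
from scipy.integrate import dblquad

a, b = 1.0, 1.0
f = lambda x, y: x*y*(a - x)*(b - y)     # hypothetical initial temperature

def A(m, n):
    # dblquad integrates func(y, x) over x in [0, a], y in [0, b]
    integrand = lambda y, x: (f(x, y) * np.sin(n*np.pi*x/a)
                              * np.sin(m*np.pi*y/b))
    val, _ = dblquad(integrand, 0, a, lambda x: 0.0, lambda x: b)
    return 4/(a*b) * val

print(A(1, 1))   # dominant mode, 64/pi^6 ~ 0.0666 for this profile
print(A(2, 1))   # vanishes by symmetry of f about x = a/2
```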

2.6.3 HEAT CONDUCTION IN A CIRCULAR PLATE


In this example, we consider a circular plate of radius a whose rim is held at u = 0 and whose initial surface temperature is characterized by a function f(ρ, ϕ). We seek the temperature distribution on the plate for all time. Considering the circular plate in the z = 0 plane, with the z axis perpendicular to the plate, we can use plane polar coordinates, i.e., cylindrical coordinates with z = 0. The spatial part of the heat equation is then given by

\[ \frac{1}{\rho}\frac{\partial}{\partial\rho}\left( \rho\frac{\partial R}{\partial\rho} \right) + \frac{1}{\rho^2}\frac{\partial^2 R}{\partial\phi^2} + \lambda R = 0, \]   (2.321)
which, after the separation of variables R(ρ, ϕ) = P(ρ)Φ(ϕ), reduces to (see Sec. 2.4)

\[ \Phi(\phi) = A\cos(m\phi) + B\sin(m\phi), \qquad m = 0, 1, 2, \dots, \]   (2.322)

\[ \frac{d^2 P}{d\rho^2} + \frac{1}{\rho}\frac{dP}{d\rho} + \left( \lambda - \frac{m^2}{\rho^2} \right)P = 0. \]   (2.323)

The solution of the last (Bessel) equation which is well defined at ρ = 0 and vanishes at ρ = a is, as we already saw in Sec. 2.5.1,

\[ P(\rho) = C J_m\!\left( \frac{x_{mn}}{a}\rho \right), \quad \text{with} \quad \sqrt{\lambda} = \frac{x_{mn}}{a} \quad \text{and} \quad n = 1, 2, \dots, \]   (2.324)

where, as usual, x_{mn} is the nth root of J_m. We see that λ is a double-indexed quantity, λ_{mn} = (x_{mn}/a)². The time equation (2.296) then gives

\[ T_{mn}(t) = D_{mn}\, e^{-\kappa\lambda_{mn} t} = D_{mn}\, e^{-\kappa(x_{mn}/a)^2 t}. \]   (2.325)

Multiplying the three solutions and summing over the two indices yields the most general solution

\[ u(\rho, \phi, t) = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} J_m\!\left( \frac{x_{mn}}{a}\rho \right) e^{-\kappa(x_{mn}/a)^2 t}\left[ A_{mn}\cos(m\phi) + B_{mn}\sin(m\phi) \right]. \]   (2.326)

The coefficients are determined from the initial condition

\[ f(\rho, \phi) = u(\rho, \phi, 0) = \sum_{m=0}^{\infty} \sum_{n=1}^{\infty} J_m\!\left( \frac{x_{mn}}{a}\rho \right)\left[ A_{mn}\cos(m\phi) + B_{mn}\sin(m\phi) \right], \]   (2.327)

which is basically identical to Eq. (2.222). Therefore, the coefficients are given by expressions similar to Eq. (2.225). In the case at hand, we get, for m ≥ 1,

\[ A_{mn} = \frac{2}{\pi a^2 J_{m+1}^2(x_{mn})} \int_0^{2\pi} d\phi \int_0^a d\rho\, \rho\, f(\rho, \phi)\, J_m\!\left( \frac{x_{mn}}{a}\rho \right)\cos(m\phi), \]
\[ B_{mn} = \frac{2}{\pi a^2 J_{m+1}^2(x_{mn})} \int_0^{2\pi} d\phi \int_0^a d\rho\, \rho\, f(\rho, \phi)\, J_m\!\left( \frac{x_{mn}}{a}\rho \right)\sin(m\phi), \]   (2.328)

with the right-hand side of the expression for A_{0n} again halved, as for Eq. (2.225).
In particular, if the initial temperature distribution is independent of ϕ, the only term which contributes, as we saw when we solved the Laplace equation, is m = 0, and we get

\[ u(\rho, t) = \sum_{n=1}^{\infty} A_n\, J_0\!\left( \frac{x_{0n}}{a}\rho \right) e^{-\kappa(x_{0n}/a)^2 t}. \]   (2.329)

With f(ρ) = u(ρ, 0) representing the ϕ-independent initial temperature distribution, the coefficient A_n is found to be

\[ A_n \equiv A_{0n} = \frac{2}{a^2 J_1^2(x_{0n})} \int_0^a d\rho\, \rho\, f(\rho)\, J_0\!\left( \frac{x_{0n}}{a}\rho \right), \]   (2.330)

where we have used Eq. (2.223).

2.7 THE SCHRÖDINGER EQUATION


The Schrödinger equation, describing the non-relativistic quantum phenom-
ena, is

ħ2 2 ∂u
− ∇ u + V (rr )u = i ħ , (2.331)
2µ ∂t

where µ is the mass of a subatomic particle, ħ is Planck’s constant (dividided


by 2π), V is the potential energy of the particle, and |u(rr , t )|2 is the probability
density of finding the particle at r at time t . Let’s start by separating the r and
t dependence:

u(rr , t ) = R(rr )T (t ). (2.332)

Substituting Eq. (2.332) in (2.331), we get

\[ -\frac{\hbar^2}{2\mu} T\nabla^2 R + V(\mathbf{r})(RT) = i\hbar R\frac{dT}{dt}. \]   (2.333)

Dividing both sides by RT yields

\[ -\frac{1}{R}\frac{\hbar^2}{2\mu}\nabla^2 R + V(\mathbf{r}) = i\hbar\frac{1}{T}\frac{dT}{dt}. \]   (2.334)

The left-hand side is a function of position alone, and the right-hand side is a function of time alone, and since r and t are independent variables, as we have already discussed in previous sections, the only way that Eq. (2.334) can hold is for both sides to be constant, say E:

\[ -\frac{1}{R}\frac{\hbar^2}{2\mu}\nabla^2 R + V(\mathbf{r}) = E \;\Longrightarrow\; -\frac{\hbar^2}{2\mu}\nabla^2 R + V(\mathbf{r})R = E R \]   (2.335)

and

\[ i\hbar\frac{1}{T}\frac{dT}{dt} = E \;\Longrightarrow\; \frac{dT}{T} = -\frac{iE}{\hbar}\, dt. \]   (2.336)

The solution of the time part is easily obtained, since it can be integrated directly:

\[ \ln(T) = \beta - i(E/\hbar)t \;\Longrightarrow\; T = e^{\beta}\, e^{-iEt/\hbar} \equiv A e^{-iEt/\hbar}, \]   (2.337)

where β and A ≡ e^β are arbitrary constants of integration. The constant of separation, E, actually corresponds to the energy of the quantum particle. It is the solution of the first equation in (2.335), which we rewrite as

\[ \nabla^2 R + \frac{2\mu}{\hbar^2}\left[ E - V(\mathbf{r}) \right] R = 0, \]   (2.338)

that will take up most of our time in this section.

2.7.1 QUANTUM PARTICLE IN A BOX


Let's consider an atomic particle of mass µ confined in a rectangular box with sides a, b, and c, i.e., the potential V is infinite at the boundaries of the box (in this way the particle cannot escape from the box), while V = 0 inside the box. The behavior of the particle inside the box is thus governed by the Schrödinger equation for a free particle, i.e., V = 0. With this assumption, Eq. (2.338) becomes

\[ \nabla^2 R + \frac{2\mu E}{\hbar^2} R = 0. \]   (2.339)

A separation of variables, R(x, y, z) = X(x)Y(y)Z(z), yields the ordinary differential equations

\[ \frac{d^2 X}{dx^2} + \lambda X = 0, \qquad \frac{d^2 Y}{dy^2} + \sigma Y = 0, \qquad \frac{d^2 Z}{dz^2} + \nu Z = 0, \]   (2.340)

with

λ + σ + ν = 2µE /ħ2 . (2.341)


2.7.1 Q UANTUM PARTICLE IN A BOX 145

Since we cannot find the particle outside the box, we impose the boundary
conditions
R(0, y, z) = R(a, y, z) = 0 =⇒ X (0) = 0 = X (a),
R(x, 0, z) = R(x, b, z) = 0 =⇒ Y (0) = 0 = Y (b),
R(x, y, 0) = R(x, y, c) = 0 =⇒ Z (0) = 0 = Z (c), (2.342)
in this way, |u(x, y, z, t )|2 d xd yd z, which is related to the probability density
of finding the particle between (x, y, z) and (x + d x, y + d y, z + d z) at some in-
stant t , would be zero outside the box (this is not true for a particle inside a
finite potential well, in which case the particle has a nonzero probability of
tunneling out of the well). From Sec. 2.3.1, the general solution of each of the
equations in (2.340), together with the about boundary conditions, lead to the
following solutions:
µ ¶ µ ¶
nπ nπ 2
X n (x) = A n sin x , λn = , for n = 1, 2, . . . , (2.343)
a a
µ ¶ µ ¶
mπ mπ 2
Ym (y) = B m sin y , σm = , for m = 1, 2, . . . , (2.344)
b b
µ ¶ µ ¶2
lπ lπ
Zl (z) = C l sin z , νl = , for l = 1, 2, . . . , (2.345)
c c
where the multiplicative constants have been suppressed. Using the values
found for λn , σm and νl together with Eqs. (2.337) and (2.341), the time solu-
tion has the form
·µ ¶ µ ¶ µ ¶2 ¸
−i E nml t /ħ ħ2 nπ 2 mπ 2 lπ
Tnl m (t ) = D nl m e , where E nml = + + .
2µ a b c
(2.346)
The solution of the Schrödinger equation that is consistent with the boundary
conditions is, then,
µ ¶ µ ¶ µ ¶
X

−i E nml t /ħ nπ mπ lπ
u(x, y, z, t ) = A nml e sin x sin y sin z . (2.347)
n,m,l =1 a b c
The constants A l mn ≡ A n B m C l D nml are determined by the initial shape u(x, y, z, 0).
In fact, setting t = 0, multiplying by the product of the three sine functions in
the three variables, using the orthogonality properties of the trigonometric
functions, and integrating over appropriate intervals for each coordinate, we
obtain
Za Zb Zc µ ¶ µ ¶ µ ¶
8 nπ mπ lπ
A nml = d x d y d zu(x, y, z, 0)sin x sin y sin z .
abc a b c
0 0 0
(2.348)
2.7.1 Q UANTUM PARTICLE IN A BOX 146

As we mentioned earlier,
µ ¶
ħ2 π2 n 2 m 2 l 2
E nl m = + + (2.349)
2µ a 2 b 2 c 2

represents the energy of the particle which depends on three positive inte-
gers (n, l , m). Each set of three positive integers (n, l , m) represents a quantum
state of the particle. For a cube, a = b = c, the energy of the particle is

ħ2 π2 2 2 2 ħ2 π2
E nl m = (n + m + l ) = (n 2 + m 2 + l 2 ), (2.350)
2µa 2 2µV 2/3

where V = a 3 is the volume of the box. The ground state, i.e., the one with low-
est energy, is (1, 1, 1), has energy 3ħ2 π2 /(2µV 2/3 ), and is nondegenerate (only
one state corresponds to this energy). However, the higher-level states are de-
generate. For instance, the three distinct states (1, 1, 2), (1, 2, 1), and (2, 1, 1)
all correspond to the same energy, 6ħ2 π2 /(2µV 2/3 ). The degeneracy increases
rapidly with larger values of n, m and l .
Note that Eq. (2.350) can be written as

2µE nml V 2/3


n 2 + m 2 + l 2 = r 2 where r 2 = . (2.351)
ħ2 π2
This looks like the equation of a sphere in the nml -space. If r is large, the
number of states contained within the sphere of radius R (the number of
states with energy less than or equal to E ) is simply the volume of the first
octant of the sphere in the nml -plane, since n, l and m are all positive. If N is
the number of such states, we have
µ ¶ µ ¶
1 4π 3 π 2µE nml 3/2
N= r = V. (2.352)
8 3 6 ħ2 π2

Thus the density of states, i.e., the number of states per unit volume, is then
µ ¶
N π 2µ 3/2 3/2
N = = E nml . (2.353)
V 6 ħ2 π2

This is an important formula in solid-state physics, because the energy E nml is


(without minor modifications required by spin, an intrinsic angular momen-
tum of the particle) the Fermi energy. If the Fermi energy is denoted by E F ,
equation (2.353) gives E F = αn 2/3 , where α is some constant.
2.8 F URTHER READING 147

2.8 F URTHER READING


This chapter has been prepared using the following references:

• Mathematics for Physicists, Introductory Concepts and Methods, by Alexan-


dre Altland and Jan Von delft (Cambridge University Press, 2019).

• Mathematics For Physicist, by Philippe Dennery and André Krzywicki


(Dover. Publications, Inc., 1995).

• Mathematical Methods for Physicists, A Comprehensive Guide, by George


B. Arfken, Hans J. Weber and Frank E. Harris (Elsevier, Seventh Edition).

• Mathematical Methods For Physics and Engineering, by K. F. Riley, M. P.


Hobson, S. J. Bence (Cambridge University Press, Third Edition).

• Mathematical Methods for Students of Physics and Related Fields, by Sadri


Hassani (Springer, second edition, 2009).
C HAPTER

3
O RTHOGONAL POLYNOMIALS
AND SPECIAL FUNCTIONS

In Chapter 1 we studied the Fourier expansion of a periodic function f (x) ∈


L ω2 (a, b), which basically consisted in writing f (x) as an expansion in terms
of orthonormal functions e k (x), where these e k (x) were sine and cosine func-
tions. As we saw, we could associate a vector | f 〉 to the function f (x) and a
vector |e k 〉 to the functions e k (x) and write
X

|f 〉 = f k |e k 〉, (3.1)
k=1

and the coefficients f k were obtained from the corresponding inner product
of L ω2 (a, b), i.e.,
Zb
f k = 〈e k | f 〉 = d xω(x)e k∗ (x) f (x). (3.2)
a

In turns out that sine and cosine functions are not the only family of func-
tions that are orthogonal and complete. In this chapter we introduce a class of
orthogonal polynomials, the so-called classical ones, which are of particular
importance in physical applications and permits also the expansion of f (x)
in terms of e k (x). This includes, for example, the Legendre polynomials that
we introduced in Chapter 2. A way of obtaining these orthogonal polynomials
is to start with the monomials 1, x, x 2 , . . . , x n , which are not orthogonal, but
form a base in L ω2 (a, b) according with the Stone-Weierstrass theorem, and
apply the Gram-Schmidt orthogonalization process to these monomials, i.e.,
X 〈x k |e
k−1
j〉
e k (x) = x k − e j (x), k = 1, 2, . . . , (3.3)
j =0 〈e j |e j 〉

149
3.1 G ENERALIZED R ODRIGUES FORMULA 150

In this way, we can then recursively determine e k (x), k = 1, 2, . . . which sat-


isfy the orthogonality condition 〈e i |e j 〉 = 0 if i 6= j . And since the monomials
used to obtain e k (x) form a base of L ω2 (a, b), the set {e k (x)}∞
k=1
form a base of
L ω2 (a, b) too. We, however, are not going to follow this approach, but rather
a much more elegant, although less general, which simultaneously produces
most classical polynomials of interest to physicists.

3.1 G ENERALIZED R ODRIGUES FORMULA


Let us start by introducing the functions
1 dn
F n (x) = [ω(x)s n (x)], for n = 0, 1, 2, . . . , (3.4)
ω(x) d x n

where

1. The function F 1 (x) is a first-degree polynomial in x.

2. The function s(x) is a polynomial in x of degree less than or equal to 2


with only real roots.

3. The function ω(x) is a strictly positive function, integrable in the inter-


val [a, b], that satisfies the boundary conditions

ω(a)s(a) = ω(b)s(b) = 0. (3.5)

Then F n (x) is a polynomial of degree n in x and is orthogonal, on the in-


terval [a, b], with weight ω(x), to any polynomial p k (x) of degree k < n, i.e.,
F n (x) ∈ L ω2 (a, b) and p k (x) ∈ L ω2 (a, b) with

Zb
〈p k |F n 〉 = d xω(x)p k (x)F n (x) = 0, for k < n. (3.6)
a

Since F n (x) and p k (x) are real, from now onwards, we omit the complex con-
jugation involved in their inner product.

L ET ’ S PROVE IT ! Before starting with the formal proof of the above statements,
first, we need two realize that

(i) If we denote by the symbol p (≤k) (x) an arbitrary polynomial in x of de-


gree ≤ k, the following identity holds
dm
[w(x)s n (x)p (≤k) ] = ω(x)s n−m p (≤k+m) . (3.7)
d xm
3.1 G ENERALIZED R ODRIGUES FORMULA 151

Effectively, from Eq. (3.4) with n = 1, we get


µ ¶
dω ds
s = ω(x) F 1 − . (3.8)
dx dx

Thus,
· ¸
d n dω n n −1 d s d p (≤k)
[ω(x)s (x)p (≤k) ] = s (x)p (≤k) + nω(x)s (x) s p (≤k) +
dx dx dx dx
·½ ¾ ¸
n−1 d s d p (≤k)
= ω(x)s F 1 (x) + (n − 1) p (≤k) + s .
dx dx
(3.9)

Since F 1 (x) is a first degree polynomial in x and s(x) is a polynomial in


x of degree ≤ 2, the term between brackets on the right hand side of
Eq. (3.9) is a polynomial of degree ≤ k + 1. Thus,

d
[ω(x)s n (x)p (≤k) ] = ωs n−1 p (≤k+1) , (3.10)
dx
where
½ ¾
ds d p (≤k)
p (≤k+1) ≡ F 1 (x) + (n − 1) p (≤k) + s . (3.11)
dx dx

m n
(ii) All the derivatives d [ω(x)s
d xm
(x)]
with m < n vanish at x = a and x = b.
Indeed, from Eq. (3.7), putting k = 0 and p (≤0) ≡ p 0 = 1, we get

d m [ω(x)s n (x)]
= ω(x)s n−m (x)p (≤m)
d xm
= [ω(x)s(x)]s n−m−1 (x)p (≤m) . (3.12)

Since ω(a)s(a) = ω(b)s(b) = 0, the right hand side in Eq. (3.12) vanishes
at x = a and x = b when n > m. In the case of an infinite interval, it can
be shown that ω(x)s(x) vanishes at infinity faster than any polynomial.

Let us now first proof the orthogonality condition in Eq. (3.6). The proof
3.1 G ENERALIZED R ODRIGUES FORMULA 152

involves multiple use of integration by parts:


Zb Zb
d n [ω(x)s n (x)]
d xω(x)p k (x)F n (x)d x = d xp k (x)
d xn
a a
Zb · ¸
d d n−1 [ω(x)s n (x)]
= d xp k (x)
dx d x n−1
a
¯ Zb
d n−1 [ω(x)s n (x)] ¯¯b d p k d n−1 [ω(x)s n (x)]
= p k (x) ¯ − d x .
d x n−1 x=a dx d x n−1
| {z } a
=0 from property (ii)
(3.13)

This shows that each integration by parts transfers one differentiation from
ω(x)s n (x) to pk(x) and introduces a minus sign. Thus, after k integrations by
parts, we get
Z Z
k d k p k d n−k [ω(x)s n (x)]
d xω(x)p k (x)F n (x) = (−1) dx . (3.14)
d xk d x n−k
Since the kth derivative of a polynomial of degree k is a constant, we can take
d k p k /d x k outside the integration, i.e.,
Z k Z
k d pk d n−k [ω(x)s n (x)]
d xω(x)p k (x)F n (x) = (−1) d x
d xk d x n−k
k Z · n−k−1 ¸
k d pk d d [ω(x)s n (x)]
= (−1) dx
d xk dx d x n−k−1
· ¸¯
k
k d pk d
n−k−1
[ω(x)s n (x)] ¯¯x=b
= (−1) ¯ = 0. (3.15)
d xk d x n−k−1 x=a
| {z }
considering property (ii)

Note that n − k − 1 ≥ 0 because k < n, so that the last line of the equation is
well-defined.
Next, we need to proof that F n (x) is a polynomial of degree precisely equal
to n. Let’s see this. First, from Eq. (3.7), putting k = 0, m = n and p (≤0) = p 0 = 1,
we have
d n [ω(x)s n (x)] 1 d n [ω(x)s n (x)]
= ω(x)p (≤n) , or F n (x) = = p (≤n) . (3.16)
d xn ω(x) d xn
Then, we can write

F n (x) = p (≤n−1) (x) + k n(n) x n . (3.17)


3.1 G ENERALIZED R ODRIGUES FORMULA 153

In this way,

Zb
〈F n | f n 〉 = d xω(x)[F n (x)]2
a
Zb Zb
= d xω(x)p (≤n−1) F n (x) + k n(n) d xω(x)x n F n (x). (3.18)
a a

The left-hand side of Eq. (3.18) is a positive quantity because both ω(x) and
[F n (x)]2 are positive, and the first integral on the right-hand side vanishes
from Eq. (3.15), since k ≤ n − 1. Therefore, the second term on the right-hand
side in Eq. (3.18) cannot be zero. In particular, k n(n) 6= 0, and F n (x) is of degree
n.

For historical reason, different polynomial functions are normalized dif-


ferently and it is common to introduce a normalization constant K n in the
definition of F n (x), and write
1 d n [ω(x)s n (x)]
F n (x) = . (3.19)
K n ω(x) d xn
This equation is called the generalized Rodrigues formula.
The sequence {F n (x)}∞n=0 forms an orthogonal set of polynomials on the
interval [a, b] with weight ω(x), and it can be shown that F n (x) satisfy the fol-
lowing differential equation
· ¸
d d Fn
ω(x)s(x) = ω(x)λn F n (x), (3.20)
dx dx

where λn = K 1 k 1(1) n + σ2 n(n − 1), with k 1(1) being the coefficient of x in F 1 (x)
and σ2 the coefficient of x 2 in s(x).

R EALLY ? In case you are curious to know who Eq. (3.20) arises, here we pro-
vide a proof. Since F n (x) is a polynomial which in the interval [a, b] is orthog-
onal with weight ω(x) to any polynomial of degree n, i.e.,

Zb
d xω(x)F n (x)p (<n) = 0, (3.21)
a

we have that d F n /d x is a polynomial of degree ≤ (n−1). According to Eq. (3.7),


· ¸
1 d d Fn
s(x)ω(x) (3.22)
ω(x) d x dx
3.1 G ENERALIZED R ODRIGUES FORMULA 154

is a polynomial of degree ≤ n. Thus, we can write


· ¸
1 d d Fn X n
s(x)ω(x) = −ω(x) λ(i )
n F n (x), (3.23)
ω(x) d x dx i =1

where λin are some numbers. Multiplying both sides of Eq. (3.23) by F m and
integrating, we get, using the orthogonality property of F m

Zb · ¸
d d Fn
d xF m s(x)ω(x) = −λ(m)
n hm , (3.24)
dx dx
a

with

Zb
2
hm = d xω(x)F m (x). (3.25)
a

Writting,
· ¸ · ¸
d d Fn d d Fn d Fm d Fn
F m (x) s(x)ω(x) = F m (x)s(x)ω(x) − s(x)ω(x) ,
dx dx dx dx dx dx
(3.26)

and using that s(x)ω(x) vanishes at the ends of the integration interval, the
left-hand side of Eq. (3.24) yields for m < n,

Zb · ¸ Zb
d d Fn d Fn d Fm
d xF m s(x)ω(x) = − d xω(x)s(x)
dx dx dx dx
a a
Zb µ · ¸¶
1 d d Fm
= d xω(x)F n (x) s(x)ω(x) = 0,
ω(x) d x dx
a
(3.27)

where we have used that F n (x) is orthogonal to any polynomial of degree < n.
Comparing with Eq. (3.24), we arrive to λ(m) (n)
n = 0 for m < n. Defining λn ≡ λn
for simplicity, we can write Eq. (3.23) in the form
· ¸
d d Fn
s(x)ω(x) = −ω(x)λn F n (x). (3.28)
dx dx
3.2 C LASSIFICATION 155

Now, we need to determine the constant λn . Putting m = n in Eq. (3.24), we


obtain on the left-hand side,

Zb · ¸ Zb · ¸
d Fn d {s(x)ω(x)} d F n d 2 Fn
d xF n (x) s(x)ω(x) = d xF n (x) + s(x)ω(x)
dx dx dx d x2
a a
Zb · ¸
d Fn d 2 Fn
= d xω(x)F n (x) K 1 F 1 (x) + s(x) ,
dx d x2
a
(3.29)

where we have used Eq. (3.19) for n = 1. Because of the orthogonality property
of the polynomials F n , only the nth power of x in the nth degree polynomial
in the square brackets contributes to the integral. If k n(n) is the coefficient of
x n in F n (x), the coefficient of x n in F 1 (x) ddFxn is given by nk 1(1) k n(n) . Similarly,
if σ2 is the coefficient of x 2 in s(x), then the coefficient of x n in sd 2 F n /d x 2 is
given by σ2 n(n − 1)k n(n) . In this way, we get

Zb · ¸ · ¸ Zb
d Fn (1)
d xF n (x) s(x)ω(x) = K 1 nk 1 + σ2 n(n − 1) d xω(x)F n (x)k n(n) x n
dx
a a
· ¸
= K 1 nk 1(1) + σ2 n(n − 1) h n . (3.30)

Comparing this latter equation with Eq. (3.24), and since the only non-zero
value for λ(m)
n is when m = n, we get

(1)
λn ≡ λ(n)
n = K 1 k 1 n + σ2 n(n − 1). (3.31)

The polynomials F n (x) are collectively called classical orthogonal poly-


nomials.

3.2 C LASSIFICATION
Let us now investigate the consequence of various choices of s(x). We start
with F 1 (x), which, according to Eq. (3.19)

1 d [ω(x)s(x)] 1 d [ω(x)s(x)] K 1 F 1 (x)


F 1 (x) = or = , (3.32)
K 1 ω(x) dx ω(x)s(x) dx s(x)
3.2 C LASSIFICATION 156

which, integrating, can be interpreted to yield


· ¸ Z ·Z ¸
ω(x)s(x) K 1 F 1 (x) K 1 F 1 (x)
ln = dx =⇒ ω(x)s(x) = Aexp dx , (3.33)
A s(x) s(x)

where A is an arbitrary integration constant. Since F 1 (x) is a polynomial of


degree 1, it can be written as F 1 (x) = k 1(0) + k 1(1) x. It follows then
·Z ¸
K 1 (k 1(0) + k 1(1) x)
ω(x)s(x) = Aexp dx . (3.34)
s(x)

Next, we look at three choices for s(x): a constant, a polynomial of degree


1, and a polynomial of degree 2 and we examine the possibility of finding ω(x)
which satisfies Eq. (3.34) as well as the boundary condition (3.5). For a con-
stant s(x) ≡ s, Eq. (3.34) can be easily integrated to get
·Z ¸
2 2
ω(x)s ≡ Aexp d x(2αx + β) = Ae αx +βx+C ≡ B e αx +βx , (3.35)

where 2α ≡ K 1 k 1(1) /s, β ≡ K 1 k 1(0) /s, B ≡ Ae C . Using Eq. (3.5),


2 2
B e αa +βa
= 0 = B e αb +βb
. (3.36)

For nonzero B , the only way that this equality can hold is for α to be negative
and for a 2 and b 2 to be infinite. Since a < b, we must take a = −∞ and b = +∞.
In this way, Eq. (3.35) can be written as
2 2
ω(x)s = B e −|α|x +βx
= B e −(|α|x −βx)
. (3.37)

Now note that


· µ ¶¸2
2
p β β2
|α|x − βx = |α| x − − , (3.38)
2|α| 4|α|

so we have
½ · µ ¶¸2 ¾
β2 p β
ω(x)s = B e 4|α| exp − |α| x − (3.39)
2|α|

Since a simultaneous linear transformation of the argument in ω(x) and


s(x) does not modify either the degree of the corresponding polynomial F n (x)
or their orthogonality property (such transformations can change the limits a
and b of the interval, but the conditions listed for F n (x) remain satisfied), we
p 2
perform the substitution x → |α|[x −β/(2|α|)] and choose B = e −β /(4|α|) and
3.2 C LASSIFICATION 157

s = 1. Note that the normalization constants K n remain arbitrary; they will


be fixed by convention later on, thus, we are not concerned for the present
about multiplicative numerical factors that can be absorbed into K n . For this
reason, we can redefine B and choose s = 1 without any lost of generality. In
this way, after redefining the x variable, from Eq. (3.35), we obtain s(x) = 1,
2
ω(x) = e −x , and the limits a and b remain unchanged, i.e., −∞ and ∞. Using
now Eq. (3.19), we can determine F n (x). The resulting polynomials are called
Hermite polynomials and are denoted by Hn (x).
If the degree of s(x) is 1, then we can write s(x) = σ0 +σ1 x and from Eq. (3.34),
we get
·Z ¸
K 1 (k 1(0) + k 1(1) x)
ω(x)(σ0 + σ1 x) = Aexp dx
σ0 + σ1 x
·Z µ ¶¸
K 1 k 1(1) K 1 k 1(0) − K 1 k 1(1) σ0 /σ1
= Aexp dx +
σ1 σ0 + σ1 x
≡ B (σ0 + σ1 x)ρ e γx , (3.40)

where γ ≡ K 1 k 1(1) /σ1 , ρ ≡ K 1 k 1(0) /σ1 − K 1 k 1(1) σ0 /σ21 , and B is A modified by the
constant of integration. Using Eq. (3.5),

B (σ0 + σ1 a)ρ e γa = 0 = B (σ0 + σ1 b)ρ e γb , (3.41)

which give a = −σ0 /σ1 , ρ > 0, γ < 0, i.e., γ = −|γ| < 0, and b = +∞. We now
redefine the variable x → |γ|(σ0 +σ1 x)/σ1 , such that, in terms of the new vari-
able Eq. (3.40) can be written as
µ ¶
x xσ1 ρ −x+|γ|σ0 /σ1
ω(x) σ1 = B e
|γ| |γ|
µ ¶
xσ1 ρ |γ|σ0 /σ1 −x
=B e e . (3.42)
|γ|

In this way
µ ¶ρ−1
|γ|σ0 /σ1 σ1
ω(x)x = B e x ρ e −x ≡ B̃ x ρ e −x , (3.43)
|γ|
³ ´ρ−1
with B̃ = B e |γ|σ0 /σ1 σ 1
|γ|
an arbitrary constant which can be reabsorbed in
the normalization constant K n . Alternatively, since we will use a convention
to define K n , we can simply redefine B̃ = 1 and write

ω(x)x = x ρ e −x , (3.44)
3.2 C LASSIFICATION 158

such that

s(x) = x, ω(x) = x ρ−1 e −x . (3.45)

Defining ν = ρ − 1, we have then

ω(x) = x ν e −x , (3.46)

where ν > −1 since ρ > 0. Note that after redefining the variable x, the limit
a = −σ0 /σ1 becomes 0, while b = ∞ remains unaltered. In this way, we have
found

ω(x) = x ν e −x , ν > −1, s(x) = x, a = 0, b = ∞, (3.47)

Using now Eqs. (3.19) and (3.47), we can get F n (x). The resulting polynomials
are called Laguerre polynomials and are denoted by L νn (x).
Similarly, we can obtain the weight and the interval of integration for the
case when s(x) is of degree 2. The result, with the corresponding redefini-
tion of variables and parameters, is ω(x) = (1 + x)µ (1 − x)ν , with µ, ν > −1,
s(x) = 1−x 2 , a = −1 and b = +1. The polynomials found are called Jacobi poly-
µ,ν
nomials are are denoted by P n (x). The Jacobi polynomials are themselves
divided into other subcategories depending on the values of µ and ν and for
historical reason, and because they play important roles in applications, they
are named differently. For example, in case of µ = ν = 0, we have ω(x) = 1, and
the obtained polynomials, represented as P n (x), are called Legendre polyno-
mials; For µ = ν = ∓1/2, we have ω(x) = (1 − x 2 )∓1 and we have the Chebyshev
polynomials of the first and second kind, respectively. Strictly speaking, the
definition of each of the preceding polynomials contains also the specifica-
tion of the normalization constant K n in the generalized Rodrigues formula,
something which is called standardization; this standardization will be spec-
ified later on, but you should be aware that different books can use different
standardizations for the same polynomials.

E XAMPLE 3.1 (S TANDARDIZATION OF L EGENDRE ’ S POLYNOMIALS). Let us find


the orthogonal polynomials forming a basis of L 2 (−1, 1), which we denote by
P n (x), where n is the degree of the polynomial. Let P 0 (x) = 1. To find P 1 (x),
write P 1 (x) = ax +b, and determine a and b in such a way that P 1 (x) is orthog-
onal to P 0 (x):

Z1 Z1 ¯
1 2 ¯¯1
0= d xP 1 (x)P 0 (x) = d x(ax + b) = ax ¯ + 2b = 2b. (3.48)
2 −1
−1 −1
3.3 R ECURRENCE RELATIONS 159

So one of the coefficients, b, is zero. To find the other one, we need some
standardization procedure. We standardize P n (x) by requiring that P n (1) = 1
for any value n. For n = 1 this yields a · 1 = 1, or a = 1, so that P 1 (x) = x.
We can calculate P 2 (x) similarly: write P 2 (x) = ax 2 + bx + c, impose the
condition that it would be orthogonal to both P 1 (x) and P 0 (x), and enforce
the standardization procedure. All this will yield
Z1 Z1
2 2
0= d xP 2 (x)P 0 (x) = a + 2c, 0= d xP 2 (x)P 1 (x) = b, (3.49)
3 3
−1 −1

and P 2 (1) = a + b + c = 1. These three equations have the unique solution


a = 3/2, b = 0, c = −1/2. Thus, P 2 (x) = (3x 2 − 1)/2. These are the first three
Legendre polynomials.

3.3 R ECURRENCE RELATIONS


Orthogonal polynomials satisfy functional relations, which are also called re-
currence relations. First, any three consecutive orthogonal polynomials sat-
isfy that
F n+1 (x) = (αn x + βn )F n (x) + γn F n−1 (x), (3.50)
where αn , βn and γn are constants which depend on n only and which are
determined by the class of polynomials considered.

L ET ’ S PROVE IT ! Let us denote by k n(n) and k n(n−1) the coefficients of x n and


x n−1 in F n (x). The polynomial
· (n+1) ¸
k
F n+1 (x) − n+1 xF n (x), (3.51)
k n(n)
is clearly of degree ≤ n and therefore can be written as
· (n+1) ¸
k X n
F n+1 (x) − n+1 xF n (x), = a j F j (x). (3.52)
k n(n) j =0

Taking the inner product of both sides of this equation with F m (x), we get
Zb (n+1) Zb
k n+1
d xω(x)F n+1 (x)F m (x) − d xω(x)xF n (x)F m (x)
a
k n(n) a

X
n Zb
= aj d xω(x)F j (x)F m (x). (3.53)
j =0
a
3.3 R ECURRENCE RELATIONS 160

Using the orthogonality relation (3.21), the first integral on the left-hand side
vanishes as long as m ≤ n; the second integral vanishes if m ≤ n − 2, since in
this case xF m (x) is a polynomial of degree n − 1. In this way, we have

X
n Zb
aj d xω(x)F j (x)F m (x) = 0 for m ≤ n − 2, (3.54)
j =0
a

but, by Eq. (3.21), the integral in the sum is zero unless j = m. Therefore, the
sum reduces to

Zb
am d xω(x)[F m (x)]2 = 0 for m ≤ n − 2. (3.55)
a

Since the above integral is nonzero, we conclude that

a m = 0, for m = 0, 1, . . . , n − 2, (3.56)

and Eq. (3.52) reduces to


· (n+1) ¸
k n+1
F n+1 (x) − xF n (x) = a n−1 F n−1 (x) + a n F n (x). (3.57)
k n(n)

This is the recurrence formula we are looking for; there remains to find the
constants a n and a n−1 . First, if we multiply Eq. (3.57) by ω(x)F n−1 (x) and in-
tegrate, using the orthogonality property (3.21), the integral of the first term
on the left-hand side as well as the integral of the first term on the right-hand
side are both zero, thus, we have

· (n+1) ¸ Zb
k
− n+1 d xω(x)xF n (x) = h n−1 a n−1 . (3.58)
k n(n) a

where h n−1 is given by Eq. (3.24). Now, notice that due to the orthogonality
relation (3.21)

Zb Zb Zb
2
hn = d xω(x)[F n (x)] = d xω(x)F n (x)F n (x) = d xω(x)F n (x)[k n(n) x n ].
a a a
(3.59)
3.3 R ECURRENCE RELATIONS 161

Then, Eq. (3.58) can be written as

(n+1) Zb
k n+1
h n−1 a n−1 = − d xω(x)xF n (x)F n−1 (x)
k n(n) a
(n+1) (n) Zb
k n+1 k n−1
=− d xω(x)F n (x)k n(n) x n
k n(n) k n(n) a
k (n+1) (n)
k n−1
= − n+1 hn . (3.60)
k n(n) k n(n)

In this way,
(n+1) (n)
h n k n+1 k n−1
a (n−1) = − . (3.61)
h n−1 k n(n) k n(n)

To get the coefficient a n we simply need to compare the coefficients of x n


in both sides of Eq. (3.57):
(n+1) (n) (n+1) (n−1)
(n) k n+1 k n+1 k n+1 kn
k n+1 − k n(n−1) = a n k n(n) =⇒ a n = − . (3.62)
k n(n) k n(n) [k n(n) ]2

Comparing Eqs. (3.50) and (3.57),


(n+1) · (n) ¸
k n+1 k n+1 k n(n−1) h n αn
αn = , βn = αn − , γn = − . (3.63)
k n(n) (n+1)
k n+1 k n(n) h n−1 αn−1

Other recurrence relations can be obtained from Eq. (3.50). For example,
if we differentiate both sides of Eq. (3.50) and use Eq. (3.20), we get
· ¸
d Fn d [ω(x)s(x)]
2ω(x)s(x)αn + αn + ω(x)λn (αn x + βn ) F n
dx dx
− ω(x)λn+1 F n+1 (x) + ω(x)γn λn−1 F n−1 (x) = 0. (3.64)

We can get another recurrence relation involving derivatives by substitut-


ing Eq. (3.50) in (3.64), and simplifying we get
· ¸
d Fn d [ω(x)s(x)]
2ω(x)s(x)αn + αn + ω(x)(λn − λn+1 )(αn x + βn ) F n (x)
dx dx
+ ω(x)γn (λn−1 − λn+1 )F n−1 (x) = 0. (3.65)
3.3 R ECURRENCE RELATIONS 162

Two other recurrence relations can be obtained by differentiating equa-


tions (3.65) and (3.64), respectively, and using the differential equation for
F n (x). Then solve the first equation so obtained for γn d [ω(x)F dx
n−1 (x)]
and sub-
stitute the result in the second equation. After simplification, the result will
be
½· ¸ ¾
d d [ω(x)s(x)]
2ω(x)αn λn F n (x) + αn + ω(x)(λn − λn−1 )(αn x + βn ) F n (x)
dx dx
d [ω(x)F n+1 ]
+ (λn−1 − λn+1 ) = 0. (3.66)
dx
Finally, we simply state one more useful recurrence relation:

d ω(x) dω
A n (x)F n (x) − λn+1 (αn x + βn ) F n+1 (x) + γn λn−1 (αn x + βn ) F n−1 (x)
dx dx
d F n+1 d F n−1
+ B n (x) + γn D n (x) = 0, (3.67)
dx dx
where
· ¸
d 2 [ω(x)s(x)] dω
A n (x) = (αn x + βn ) 2ω(x)αn λn + αn + λn (αn x + βn )
d x2 dx
d [ω(x)s(x)]
− α2n ,
dx
d [ω(x)s(x)]
B n (x) = αn − ω(x)(αn x + βn )(λn+1 − λn ),
dx
d [ω(x)s(x)]
D n (x) = ω(x)(αn x + βn )(λn−1 − λn ) − αn . (3.68)
dx
All these recurrence relations seem to be very complicated. However, com-
plexity is the price we pay for generality! When we work with specific orthog-
onal polynomials, the equation simplify considerably. For instance, for Her-
mite and Legendre polynomials, Eq. (3.65),

d Hn d Pn
= 2nHn−1 (x), and (1 − x 2 ) + nxP n (x) − nP n−1 (x) = 0. (3.69)
dx dx
Also, applying Eq. (3.66) to Legendre polynomials gives

d P n+1 d Pn
−x − (n + 1)P n (x) = 0, (3.70)
dx dx
and Eq. (3.67) yields

d P n+1 d P n−1
− − (2n + 1)P n (x) = 0. (3.71)
dx dx
3.4 T HE CLASSICAL POLYNOMIALS 163

It is possible to find many more recurrence relations by manipulating the


existing recurrence relations.
Before studying specific orthogonal polynomials, let us pause for a mo-
ment to appreciate the generality and elegance of the preceding discussion:
with a few assumptions and a single defining equation we have severely re-
stricted the choice of the weight function and with it the choice of the interval
[a, b]. We have nevertheless exhausted the list of the so-called classical or-
thogonal polynomials.

3.4 T HE CLASSICAL POLYNOMIALS


We now construct the specific polynomials used frequently in physics. We
have seen that the four parameters K n , k n(n) , k n(n−1) , and h n determine all the
properties of the polynomials. Once K n is fixed by some standardization, we
can determine all the other parameters: k n(n) and k n(n−1) will be given by the
generalized Rodrigues formula, and h n can be calculated as follows:

Zb Zb
2
hn = d xω(x)[F n (x)] = d x ω(x)[k n(n) x n + . . . ]F n (x)
a a
Zb Zb · ¸
1 d n [ω(x)s n (x)] k n(n) d d n−1 [ω(x)s n (x)]
= k n(n) d xω(x)x n
= d xxn
K n (x)ω(x) d xn Kn dx d x n−1
a a
¯ Zb
k n(n) n d n−1 [ω(x)s n (x)] ¯¯b k n(n) d (x n ) d n−1 [ω(x)s n (x)]
= x ¯ − K dx (3.72)
Kn d x n−1 a n dx d x n−1
a

The first term of the last line is zero by the property (ii) of Sec. 3.1. It is clear
that each integration by parts introduces a minus sign and shifts one differ-
entiation from ω(x)s n (x) to x n . Thus, after n integrations by parts and noting
0 n
d n [x n ]
that d [ω(x)s
dx 0
(x)]
= ω(x)s n
(x) and d x n = n!, we obtain

Zb
(−1)n k n(n) n!
hn = d xω(x)s n (x). (3.73)
Kn
a

3.4.1 H ERMITE POLYNOMIALS


The Hermite polynomials are standardized such that K n = (−1)n . In this case,
2
as we saw in Sec. 3.2, s(x) = 1 and ω(x) = e −x , thus, the generalized Rodrigues
3.4.2 L AGUERRE POLYNOMIALS 164

formula (3.19) give

n x2 d n −x 2
Hn (x) = (−1) e [e ]. (3.74)
d xn
2
It is clear that each time e −x is differentiated, a factor −2x is introduced. The
2
highest power of x is obtained when we differentiate e −x n times. This yields
2 2
(−1)n e x (−2x)n e −x = 2n x n , thus, k n(n) = 2n .
To obtain k n(n−1) , it is helpful to see whether the polynomial is even or odd.
Substituting −x for x in Eq. (3.74), we get Hn (−x) = (−1)n Hn (x), which shows
that if n is even (odd), Hn is an even (odd) polynomial, i.e., it can have only
even (odd) powers of x. In either case, the next-higher power of x in Hn (x) is
not n − 1 but n − 2. Thus, the coefficient of x n−1 is zero for Hn (x), and we have
p
k n(n−1) = 0. For h n , we use Eq. (3.73) to obtain h n = π2n n!.
Next we calculate the recurrence relation of Eq. (3.50). We can readily cal-
culate the constants needed: αn = 2, βn = 0, γn = −2n. Then substitute these
in Eq. (3.50) to obtain

Hn+1 (x) = 2x Hn (x) − 2nHn−1 (x). (3.75)

Other recurrence relations can be obtained similarly.


Finally, the differential equation of Hn (x) is obtained by first noting that
K 1 = −1, σ2 = 0, F 1 (x) = 2x, then k 1(1) = 2. All of this gives λn = −2n, which can
be used in Eq. (3.20) to get

d 2 Hn d Hn
2
− 2x + 2nHn (x) = 0. (3.76)
dx dx

3.4.2 L AGUERRE POLYNOMIALS


For Laguerre polynomials, the standardization is K n = n!. As we saw in Sec. 3.2,
in case of Laguerre polynomials, s(x) = x, ω(x) = x ν e −x with ν > −1. In this
way, from the generalized Rodrigues formula

1 d n [x ν e −x x n ] 1 −ν x d n [x n+ν e −x ]
L νn (x) = = x e . (3.77)
n!x ν e −x d xn n! d xn

To find k n(n) we note that differentiating e −x does not introduce any new pow-
ers of x, but only a factor of −1. Thus, the highest power of x is obtained by
leaving x n+ν alone and differentiating e −x n times. This gives

1 −ν x n+ν (−1)n n (−1)n


x e x (−1)n e −x = x =⇒ k n(n) = . (3.78)
n! n! n!
3.4.2 L AGUERRE POLYNOMIALS 165

In this case, changing x → −x in Eq. (3.77) distorts the right-hand side


of Eq. (3.77), and the evenness or oddness of L νn (x) is not helpful to deter-
mine k n(n−1) . The coefficient k n(n−1) can be calculated by noticing that the next-
highest power of x is obtained by adding the first derivative of x n+ν n times
and multiplying the result by (−1)n−1 , which comes from differentiating e −x .
We obtain
· ¸
1 −ν x n−1 n+ν−1 −x (−1)n−1 (n + ν) n−1
x e (−1) n(n + ν)x e = x , (3.79)
n! (n − 1)!
and therefore
(n + ν)
k n(n−1) = (−1)n−1 . (3.80)
(n − 1)!
Finally, for h n we get
Z∞ Z∞
(−1)n [(−1)n /n!]n! ν −x n 1
hn = d xx e x = d xx n+ν e −x . (3.81)
n! n!
0 0

If ν is not an integer (and it need not be), the integral on the right-hand side
cannot be evaluated by elementary methods. In fact, this integral occurs so
frequently in mathematical applications that it is given a special name, the
gamma function. We are not going to discuss this function in this course and
at this point we simply note that
Z∞
Γ(z + 1) ≡ d x x z e −x , Γ(n + 1) = n!, n ∈ N, (3.82)
0

and write h n as
Γ(n + ν + 1) Γ(n + ν + 1)
hn = = . (3.83)
n! Γ(n + 1)
The relevant parameters for the recurrence relation can be easily calculated
using Eq. (3.63):
1 2n + ν + 1 n +ν
αn = − , βn = , γn = − . (3.84)
n +1 n +1 n +1
Substituting these in Eq. (3.50) and simplifying yields
(n + 1)L νn+1 (x) = (2n + ν + 1 − x)L νn (x) − (n + ν)L νn−1 . (3.85)

With k 1(1) = −1 and σ2 = 0, we get λn = −n, and the differential equation


(3.20) becomes
d 2 L νn d L νn
x + (ν + 1 − x) + nL νn (x) = 0. (3.86)
d x2 dx
3.4.3 L EGENDRE POLYNOMIALS 166

3.4.3 L EGENDRE POLYNOMIALS


Instead of discussing the Jacobi polynomials as a whole, we discuss the spe-
cial case of the Legendre polynomials P n (x), which are more widely used in
physics. In case of the Legendre polynomials, as we saw in Sec. 3.2, µ = ν = 0 in
the Jacobi polynomials and ω(x) = 1. The standardization is K n = (−1)n 2n n!.
Thus, the generalized Rodrigues formula reads

(−1)n d n [(1 − x 2 )n ]
P n (x) = . (3.87)
2n n! d xn

To determine k n(n) , we expand the expression in square brackets in the above


expression using the binomial theorem, i.e.,
µ ¶ µ ¶
n
X
n
n n−k k n n!
(a + b) = a b , = , (3.88)
k=0
k k k!(n − k)!

and take the nth derivative of the highest power of x. This yields

(−1)n d n [(−x 2 )n ] 1 d n [x 2n ]
k n(n) x n = =
2n n! d xn 2n n! d x n
1
= n 2n(2n − 1)(2n − 2) . . . (n + 1)x n . (3.89)
2 n!
If we multiply and divide the above expression by n!, and take a factor of 2 out
of all terms in the numerator, the even terms yield a factor of n! and the odd
terms give a Γ function, such that we can write
µ ¶
1
Γ n+2
(n) n
kn = 2 µ ¶ . (3.90)
1
n!Γ 2

To find k n(n−1) , as in case of the Hermite polynomials, it is useful to look


at the evenness or oddness of the polynomials P n (x). By using the gener-
alized Rodrigues formula, as we did for the Hermite polynomials, P n (−x) =
(−1)n P n (x), which tells us that P n (x) is either even or odd. In either case, x
will not have an (n − 1)st power. Therefore k n(n−1) = 0.
We now calculate h n as given by Eq. (3.73)
µ ¶ µ ¶
n (n) Z1 2 Γ n + 2 /Γ 21 Z1
n 1
(−1) k n n!
hn = d x(1 − x 2 )n = n
d x(1 − x 2 )n . (3.91)
Kn 2 n!
−1 −1
3.4.4 OTHER CLASSICAL ORTHOGONAL POLYNOMIALS 167

The integral appearing on the right-hand side of the preceding equation can
be evaluated by repeated integration by parts to get

Z1 m Z1
2 n(n − 1) . . . (n − m + 1)
d x(1 − x 2 )n = d xx 2m (1 − x 2 )n−m
3 · 5 · 7 . . . (2m − 1)
−1 −1
µ ¶
2Γ 12 n!
= µ ¶. (3.92)
1
(2n + 1)Γ n + 2

In this way,
2
hn = . (3.93)
2n + 1
Next, we need αn , βn and γn for the recurrence relation. Using Eq. (3.63),
µ ¶ µ ¶
1
(n+1)
n+1
2 Γ n +1+ 2 n!Γ 12
k n+1 2n + 1
αn = (n) = µ ¶ = µ ¶= , (3.94)
kn 1 1 n +1
(n + 1)!Γ 2 2 Γ n+2
n

where we have used the property Γ(z + 1) = zΓ(z). We also have βn = 0, since
k n(n−1) = 0 = k n+1
(n)
and γn = −n/(n + 1). In this way, the recurrence relation is
given by

(n + 1)P n+1 (x) = (2n + 1)xP n (x) − nP n−1 (x). (3.95)

Now we use K 1 = −2, P 1 (x) = x, thus, k 1(1) = 1, and σ2 = −1, since s(x) =
1 − x 2 as we saw in Sec. 3.2, to obtain λn = −n(n + 1). Then, we obtain the
following differential equation from Eq. (3.20)
· ¸
d 2 d Pn
(1 − x ) = −n(n + 1)P n (x), (3.96)
dx dx

which can be written as


d 2Pn d Pn
(1 − x 2 ) 2
− 2x + n(n + 1)P n (x) = 0. (3.97)
dx dx

3.4.4 OTHER CLASSICAL ORTHOGONAL POLYNOMIALS


The rest of the classical orthogonal polynomials can be constructed similarly.
Here, for completeness, we simply state the results.
1) J ACOBI POLYNOMIALS 168

1) J ACOBI POLYNOMIALS
µ,ν
• Nomenclature: P n (x).

• Standardization: K n = (−2)n n!.

• Constants:
Γ(2n + µ + ν + 1) n(ν − µ)
k n(n) = 2−n , k n(n−1) = kn ,
n!Γ(n + µ + ν + 1) 2n + µ + ν
2µ+ν+1 Γ(n + µ + 1)Γ(n + ν + 1)
hn = . (3.98)
n!(2n + µ + ν + 1)Γ(n + µ + ν + 1)

• Rodrigues formula:
n · ¸
µ,ν (−1)n −µ −ν d µ+n ν+n
P n (x) = (1 + x) (1 − x) (1 + x) (1 − x) . (3.99)
2n n! d xn

• Differential equation:
µ,ν µ,ν
d 2Pn d Pn
(1 − x 2 ) 2
+ [µ − ν − (µ + ν + 2)x]
dx dx
µ,ν
+ n(n + µ + ν + 1)P n (x) = 0. (3.100)

• A recurrence relation:
µ,ν
2(n + 1)(n + µ + ν + 1)(2n + µ + ν)P n+1 (x)
· ¸
2 2 µ,ν
= (2n + µ + ν + 1) (2n + µ + ν)(2n + µ + ν + 2)x + ν − µ P n (x)
µ,ν
− 2(n + µ)(n + ν)(2n + µ + ν + 2)P n−1 (x). (3.101)

2) G EGENBAUER POLYNOMIALS

• Nomenclature: C nλ (x).
µ ¶
Γ n+λ+ 12 Γ(2λ)
• Standardization: K n = (−2) n! n µ ¶.
Γ(n+2λ)Γ λ+ 21

• Constants:
µ ¶
p 1
πΓ(n + 2λ)Γ λ + 2
2n Γ(n + λ)
k n(n) = , k n(n−1) = 0, hn = . (3.102)
n! Γ(λ) n!(n + λ)Γ(2λ)Γ(λ)
3) C HEBYSHEV POLYNOMIALS OF THE FIRST KIND 169

• Rodrigues formula:
µ ¶
n 1
(−1) Γ(n + 2λ)Γ λ + 2 n · ¸
λ 2 −λ+1/2 d 2 n+λ−1/2
C n (x) = µ ¶ (1 − x ) (1 − x ) .
1 d xn
2 n!Γ n + λ + 2 Γ(2λ)
n

(3.103)

• Differential equation:

d 2C nλ dC nλ
(1 − x 2 ) − (2λ + 1)x + n(n + 2λ)C nλ (x) = 0. (3.104)
d x2 dx

• A recurrence relation:
λ
(n + 1)C n+1 (x) = 2(n + λ)xC nλ (x) − (n + 2λ − 1)C n−1
λ
(x). (3.105)

3) C HEBYSHEV POLYNOMIALS OF THE FIRST KIND

• Nomenclature: Tn (x).

• Standardization: K n = (−1)n (2n)!


2n n!
.

• Constants: k n(n) = 2n−1 , k n(n−1) = 0, h n = π2 .

• Rodrigues formula:
n · ¸
(−1)n 2n n! 2 1/2 d 2 n−1/2
Tn (x) = (1 − x ) (1 − x ) . (3.106)
(2n)! d xn

• Differential equation:

d 2 Tn d Tn
(1 − x 2 ) 2
−x + n 2 Tn (x) = 0. (3.107)
dx dx

• A recurrence relation:

Tn+1 (x) = 2xTn (x) − Tn−1 (x). (3.108)


4) C HEBYSHEV POLYNOMIALS OF THE SECOND KIND 170

4) C HEBYSHEV POLYNOMIALS OF THE SECOND KIND

• Nomenclature: Un (x).

• Standardization: K n = (−1)n 2(2n+1)!


n (n+1)! .

• Constants: k n(n) = 2n , k n(n−1) = 0, h n = π2 .

• Rodrigues formula:
n · ¸
(−1)n 2n (n + 1)! 2 −1/2 d 2 n+1/2
Un (x) = (1 − x ) (1 − x ) . (3.109)
(2n + 1)! d xn

• Differential equation:

d 2Un dUn
(1 − x 2 ) 2
− 3x + n(n + 2)Un (x) = 0. (3.110)
dx dx

• A recurrence relation:

Un+1 (x) = 2xUn (x) −Un−1 (x). (3.111)

3.5 E XPANSION IN TERMS OF ORTHOGONAL


POLYNOMIALS

As in case of the Fourier expansion, we can use the classical orthogonal poly-
nomials to write an arbitrary function f (x) ∈ L ω2 (a, b) as a series of these poly-
nomials. If we denote a complete set of orthogonal polynomials by {C k (x)}∞ k=0
,
with the corresponding vectors being represented by |C k 〉, the vectors
1
|e k 〉 = p |C k 〉, k = 0, 1, 2, . . . (3.112)
hk

where h k is given by Eq. (3.24), form an orthonormal basis in L ω2 (a, b). In this
way, if | f k 〉 are the vectors related to the function f (x) ∈ L ω2 (a, b), we can write
X

|f 〉 = f k |e k 〉. (3.113)
k=0

where
Zb Zb
1
f k = 〈e k | f 〉 = d xω(x)e k∗ (x) f (x) = p d xω(x)C k∗ (x) f (x). (3.114)
hk
a a
3.5 E XPANSION IN TERMS OF ORTHOGONAL POLYNOMIALS 171

This means that the partial sum


X
n 1
f k e k (x), e k (x) = p C k (x) (3.115)
k=0 hk

converge in the mean to f (x), i.e.,

Zb ¯ ¯2
¯ X
∞ ¯
lim ¯
d xω(x)¯ f (x) − f k e k (x)¯¯ = 0. (3.116)
n→∞
a k=0

Then, we have
X

f (x) = f k e k (x) (3.117)
k=0

E XAMPLE 3.2. We have already seen some examples of expansion of functions


in terms of classical orthogonal polynomials when solving partial differen-
tial equations. For instance, in example 2.12 we obtained an expansion of
the temperature T (r, θ) inside a sphere formed by two solid heat-conducting
hemispheres of radius a, separated by a very small insulating gap, in terms of
Legendre polynomials. In the present example, we are going to consider the
Dirac delta function and expand it in terms of Legendre polynomials in the
interval −1 < x < 1. We start by using Eq. (3.117) and write
X
∞ P n (x)
δ(x) = fn p , (3.118)
n=0 hn

where, using Eq. (3.114),

Z1
1
fn = p d xP n (x)δ(x). (3.119)
hn
−1

Using Eq. (3.93) and the properties of the Dirac delta function
r
2n + 1
fn = P n (0). (3.120)
2
and, thus,
X∞ 2n + 1
δ(x) = P n (0)P n (x). (3.121)
n=0 2

For odd values of n, the above expression give zero, because P n (x) is an odd
polynomial. This is to be expected because δ(x) is an even function of x, i.e.,
3.6 G ENERATING FUNCTIONS 172

δ(x) = δ(−x) = 0 for x 6= 0. To evaluate P n (0) for even values of n, we use the
recurrence relation (3.95) for x = 0:
n −1
(n + 1)P n+1 (0) = −nP n−1 (0) =⇒
|{z} P n (0) = − P n−2 (0). (3.122)
n
n→n−1

Iterating this m times, we obtain

(n − 1)(n − 3) . . . (n − 2m + 1)
P n (0) = (−1)m P n−2m (0). (3.123)
n(n − 2)(n − 4) . . . (n − 2m + 2)

For n = 2m, since P 0 (x) = 1, we get

(2m − 1)(2m − 3) . . . 3 · 1
P 2m (0) = (−1)m P 0 (0)
2m(2m − 2) . . . 4 · 2
2m(2m − 1)(2m − 2) . . . 3 · 2 · 1
= (−1)m
[2m(2m − 2) . . . 4 · 2]2
(2m)! (2m)!
= (−1)m m 2
= (−1)m 2m . (3.124)
[2 m!] 2 (m!)2

Thus, we can write


X∞ 4m + 1 X∞ 4m + 1 (2m)!
δ(x) = P 2m (0)P 2m (x) = (−1)m 2m P 2m (x).
m=0 2 m=0 2 2 (m!)2
(3.125)

3.6 G ENERATING FUNCTIONS


It is possible to generate all orthogonal polynomials of a certain kind from
a single function of two variables g (x, t ) by repeated differentiation of that
function. Such a function is called a generating function. This generating
function is assumed to be expandable in the form

X

g (x, t ) = a n t n F n (x), (3.126)
n=0

so that the nth derivative of g (x, t ) with respect to t evaluated at t = 0 gives


F n (x) to within a multiplicative constant. The constant a n is introduced for
convenience. Clearly, for g (x, t ) to be useful, it must be in closed form. The
derivation of such a function for general F n (x) is nontrivial, and we shall not
attempt to derive such a general generating function. Instead, we consider the
3.6 G ENERATING FUNCTIONS 173

case of the Legendre polynomials and show that the functions P n (x) defined
by the equation

X

g (x, t ) = (1 − 2xt + t 2 )−1/2 = t n P n (x), (3.127)
n=0

satisfy the Legendre equation and, thus, are the Legendre polynomials (note
that, in this case, the constants a n in Eq. (3.126) are all equal to 1). In this way,
from Eq. (3.127), the nth coefficient of the Taylor series expansion of g (x, t ) =
(1 − 2xt + t 2 )−1/2 about t = 0 is the Legendre polynomial P n (x). Specifically,
· µ ¶¸¯
1 ∂n 1 ¯
P n (x) = p ¯ , (3.128)
n! ∂t n ¯
1 − 2t x + t 2 t =0

and g (x, t ) = (1−2xt + t 2 )−1/2 is, thus, the generating function of the Legendre
polynomials.
Let’s start then. First, we differentiate the defining equation (3.127) with
respect to x and get

X∞ dP
t (1 − 2xt + t 2 )−3/2 =
n n
t . (3.129)
n=0 d x

Also, we differentiate Eq. (3.127) with respect to t to get

X

(x − t )(1 − 2xt + t 2 )−3/2 = nP n (x)t n−1 . (3.130)
n=0

Using now Eq. (3.127), we can write Eq. (3.129) as

X
∞ X∞ dP
n n
t P n (x)t n = (1 − 2xt + t 2 ) t , (3.131)
n=0 n=0 d x

or, equivalently

X
∞ X∞ dP
n n X∞ dP
n n+1 X
∞ dP
n n+2
P n (x)t n+1 = t − 2x t + t
n=0 n=0 d x n=0 d x n=0 d x
X∞ ·dP ¸
d P n d P n−1 n+1
n+1
= − 2x + t . (3.132)
n=0 dx dx dx

Equating the coefficients of t n+1 we obtain the recurrence relation

d P n+1 d P n d P n−1
P n (x) = − 2x + . (3.133)
dx dx dx
3.6 G ENERATING FUNCTIONS 174

Equations (3.129) and (3.130) can be combined as

X∞ dP
n n X∞ X∞
(x − t ) t =t nP n (x)t n−1 = nP n (x)t n , (3.134)
n=0 d x n=0 n=0

or, equivalently
· ¸
X
∞ d P n d P n−1 n X ∞
x − t = nP n (x)t n . (3.135)
n=0 dx dx n=0

Then, the coefficients of t n satisfy the relation

d P n d P n−1
x − = nP n (x). (3.136)
dx dx
Eliminating d P n /d x between Eqs. (3.133) and (3.136) gives the further result

d P n+1 dP n
(n + 1)P n (x) = −x . (3.137)
dx dx
If we now take Eq. (3.137) with n replaced by n − 1 and add x times (3.136)
to it, we obtain

d Pn
(1 − x 2 ) = n[P n−1 (x) − xP n (x)]. (3.138)
dx
Differentiating now both sides with respect to x and using Eq. (3.136), we find
·µ ¶ ¸
2 d 2Pn d Pn d P n−1 d Pn
(1 − x ) − 2x =n −x − P n (x)
d x2 dx dx dx
= n[−nP n (x) − P n (x)] = −n(n + 1)P n (x), (3.139)

so the P n (x) defined by Eq. (3.127) satisfy the above differential equation,
which is precisely the Legendre equation [see Eq. (2.257)].
In the following we list the generating functions for the others classical
orthogonal polynomials and the corresponding value a n of Eq. (3.126):
2
• Hermite Hn (x): g (x, t ) = e −t +2xt
, a n = 1/n!.

• Laguerre L νn (x): g (x, t ) = e −xt /(1−t ) /(1 − t )ν+1 ; a n = 1.

• Chebyshev first kind Tn (x): g (x, t ) = (1 − t 2 )(t 2 − 2xt + 1)−1 ; a n = 2 if


n 6= 0, a 0 = 1.

• Chebyshev second kind Un (x): g (x, t ) = (t 2 − 2xt + 1)−1 , a n = 1.


3.7 S PECIAL FUNCTIONS : B ESSEL FUNCTIONS 175

3.7 S PECIAL FUNCTIONS : B ESSEL FUNCTIONS


The polynomials C k (x) used in Eq. (3.113) do not necessarily need to be the
classical ones. For example, in Sec. 2.4 we introduced the Bessel functions of
order m as
µ ¶m ∞ µ ¶2n
x X (−1)n x
J m (x) = , (3.140)
2 n=0 n!(m + n)! 2

which were solutions of the differential equation


µ ¶
d2y d y m2
x 2+ + x− y(x) = 0. (3.141)
dx dx x
The Bessel functions are always given in terms of their expansion in power se-
ries, or as an integral involving parameters. The point to emphasize is that it
is generally impossible to reduce the Bessel functions to any functional com-
bination of more elementary functions such as polynomials, or trigonometric
and exponential functions. Although Eq. (3.140) was obtained assuming that
m was an integer, lifting this restriction still yields a series which is convergent
everywhere, and one can define Bessel functions whose orders are real or even
complex numbers. In the following we shall confine our discussions to Bessel
functions of integer orders. The only difficulty is to correctly interpret (m +n)!
for non-integer values of m. But this can be done with the Γ function given in
Eq. (3.82): since Γ(n+1) = n! for positive integer values of n, the Γ function can
be considered as the generalization of the factorials to non-integer values. We
sometimes write Γ(x + 1) = x! for any real x and call Γ the factorial function.
The Γ function is defined for all values of its argument except 0 and negative
integers, for which the Γ function becomes infinite. The most complete ana-
lytic discussion of the Γ function allows complex arguments too and uses the
full machinery of complex calculus. However, in our case, we can confine our
discussions to the case of integers values of m.
The Bessel functions satisfy similar relations to those of the classical poly-
nomials. For example, from Eq. (3.140), using that Γ( j + 1) = j ! for a positive
integer j , we have
µ ¶−m ∞ µ ¶2n
x X (−1)n x
J −m (x) =
2 n=0 n!Γ(−m + n + 1) 2
µ ¶−m ∞ µ ¶2n
x X (−1)n x
= , (3.142)
2 n=m n!Γ(−m + n + 1) 2

because the first m terms of the first series have Γ functions in the denomi-
nator with negative integer (or zero) arguments. Now in the second series, we
3.7 S PECIAL FUNCTIONS : B ESSEL FUNCTIONS 176

can replace n by k = n − m. This yields


µ ¶−m ∞ µ ¶2m+2k
x X (−1)m+k x
J −m (x) =
2 k=0 (m + k)!Γ(k + 1) 2
µ ¶m ∞ µ ¶2k
m x
X (−1)k x
= (−1) = (−1)m J m (x), (3.143)
2 k=0 k!Γ(m + k + 1) 2

Using Eq. (3.140) it is possible to obtain a number of recurrence relations


involving Bessel functions of integer orders, which we simply list

2m
J m−1 (x) + J m+1 (x) = J m (x),
x
d Jm
J m−1 (x) − J m+1 (x) = 2 . (3.144)
dx
Combining these two equations, we obtain

m d J m (x)
J m−1 (x) = J m (x) + ,
x dx
m d Jm
J m+1 (x) = J m (x) − . (3.145)
x dx
We can use these equations to obtain new, and more useful, relations. For
example, by differentiating x m J m (x), we get
· ¸
d d Jm
x J m (x) = mx m−1 J m (x) + x m
m
dx dx
· ¸
m m d Jm
=x J m (x) + = x m J m−1 (x). (3.146)
x dx

If we integrate now the preceding equation,


Z
x J m (x) = d xx m J m−1 (x).
m
(3.147)

But, more importantly, Bessel functions satisfy an orthogonality relation


similar to that of the Legendre polynomials. However, unlike Legendre poly-
nomials, the quantity that determines the orthogonality of the different Bessel
functions is not the order, but a parameter in their argument. Let’s see this.
Consider two solutions of the Bessel equation in cylindrical coordinates, i.e.,
Eq. (2.197), corresponding to the same azimuthal parameter, but with differ-
ent radial parameter. More specifically, let f (ρ) = J m (kρ) and g (ρ) = J m (l ρ).
3.7 S PECIAL FUNCTIONS : B ESSEL FUNCTIONS 177

Then, from Eq. (2.197),


µ ¶
d2 f 1 d f 2 m2
+ + k − 2 f (ρ) = 0,
d ρ2 ρ d ρ ρ
2 µ ¶
d g 1 dg 2 m2
+ + l − 2 g (ρ) = 0. (3.148)
d ρ2 ρ d ρ ρ
If we multiply the first equation by ρg and the second equation by ρ f and
subtract, we get
· µ ¶¸
d dg df
ρ f −g = (k 2 − l 2 )ρ f g . (3.149)
dρ dρ dρ
Now we can integrate this equation with respect to ρ from some initial value
(say a) to some final value (say b) to obtain

· µ ¶¸¯ Zb
dg d f ¯¯b 2 2
ρ f −g = (k − l ) d ρρ f (ρ)g (ρ). (3.150)
dρ d ρ ¯a
a

In all physical applications, a and b can be chosen to make the left-hand side
vanish. Then, substituting for f and g in terms of the corresponding Bessel
functions, we get

Zb
2 2
(k − l ) d ρρ J m (kρ)J m (l ρ) = 0. (3.151)
a

It follows that if k 6= l , then the integral vanishes, i.e.,

Zb
d ρρ J m (kρ)J m (l ρ) = 0, if k 6= l . (3.152)
a

To complete the orthogonality relation, we must R also address the case when
2
k = l . This involves the evaluation of the integral d ρρ J m (kρ), which, upon
the change of variable x = kρ, reduces to
Z
1 2
2
d xx J m (x). (3.153)
k
Integrating by parts and using the recurrence relations for the Bessel func-
tions, it is possible to show that
Z · ¸
1 2 1 2 2 1 2 2 1 2 d Jm 2
d xx J m (x) = x J m (x) − m J m (x) + x , (3.154)
k2 2 2 2 dx
3.7 S PECIAL FUNCTIONS : B ESSEL FUNCTIONS 178

or, in terms of the variable ρ,


Z µ ¶ · ¸
2 1 2 m2 2 1 2 d Jm 2
d ρρ J m (kρ) = ρ − 2 J m (kρ) + ρ . (3.155)
2 k 2 dρ
In most applications, the lower limit of integration is zero and the upper limit
is a positive number a. The right-hand side of Eq. (3.155) vanishes at the lower
limit because of the following reason: the first term vanishes at ρ = 0 since
J m (0) = 0 for all m > 0 as is evident from the series expansion (3.140). For
m = 0 and ρ = 0, the parentheses in the first term of Eq. (3.155) vanishes. So,
the first term is zero for all m ≥ 0 at the lower limit of integration. The second
term vanishes due to the presence of ρ 2 . Thus, we obtain
Za µ ¶ · ¯ ¸2
2 1 2 m2 2 1 2 d J m ¯¯
d ρρ J m (kρ) = a − 2 J m (ka) + a (3.156)
2 k 2 d ρ ¯ρ=a
0

for all m ≥ 0 and, by Eq. (3.143), for all negative integers too. It is customary
to simplify the right-hand side of Eq. (3.156) by choosing k in such a way that
J m (ka) = 0, i.e., that ka is a root of the Bessel function of order m. If we denote
x mn the nth root of J m (x) (in general, there are infinitely many roots)
x mn
ka = x mn =⇒ k= , n = 1, 2, . . . , (3.157)
a
and if we use Eq. (3.145), we obtain
Za µ ¶ · ¸2
2 x mn ρ 1 2
d ρρ J m = a J m+1 (x mn ) . (3.158)
a 2
0

Equations (3.152) and (3.158) can be combined into a single equation by


using a Kronecker delta: the Bessel function of integer order satisfy the or-
thogonality relations
Za µ ¶ µ ¶
x mn ρ x mk ρ 1
d ρρ J m Jm = a 2 J m+1
2
(x mn )δkn , (3.159)
a a 2
0

where a > 0 and x mn is the nth root of J m (x).


The orthogonality relation obtained can be used to expand other func-
tions in terms of Bessel functions of a specific order, as in case of the Fourier
series or the classical polynomials. If a function f (ρ) is defined in the interval
[0, a], then we may write
µ ¶
X∞ x mn ρ
f (ρ) = cn J m . (3.160)
n=1 a
3.7 S PECIAL FUNCTIONS : B ESSEL FUNCTIONS 179

The coefficients c n can be found by multiplying both sides by ρ J m (x mk ρ/a),


integrating from zero to a and using (3.159). This yields
Za µ ¶
2 x mn ρ
cn = 2
d ρρ f (ρ)J m , (3.161)
a 2 J m+1 (x mn ) a
0

which is the analogous to Eq. (3.114).


Just as in the case of the classical orthogonal polynomials, the Bessel func-
tions of integer order have a generating function, i.e., there exists a function
g (x, t ), which is,
· µ ¶¸
x 1
g (x, t ) = exp t− , (3.162)
2 t
such that
X

g (x, t ) = t n J n (x). (3.163)
n=−∞

E XAMPLE 3.3. Let us find the expansion of ρ k in terms of Bessel functions.


Equations (3.161) and (3.147) suggest expanding it in terms of J k (x) because
the integrals can be performed. Therefore, we write
µ ¶
k
X∞ x kn ρ
ρ = cn J k , (3.164)
n=1 a
where, from Eq. (3.161)
Za µ ¶
2 k+1 x kn ρ
cn = 2
d ρρ Jk . (3.165)
a 2 J k+1 (x kn ) a
0

Introducing y = x kn ρ/a in the integral gives


Zxkn
2a k
cn = k+2 2
d y y k+1 J k (y). (3.166)
x kn J k+1 (x kn )
0

Using now Eq. (3.147) with m being replaced by k + 1,


2a k
cn = . (3.167)
x kn J k+1 (x kn )
Thus, we get
µ ¶
x ρ
J k kn
X
∞ a
ρ k = 2a k . (3.168)
n=1 x kn J k+1 (x kn )
3.8 F URTHER READING 180

3.8 F URTHER READING


This chapter has been prepared using the following references:

• Mathematics For Physicist, by Philippe Dennery and André Krzywicki


(Dover. Publications, Inc., 1995).

• Mathematical Physics, A Modern Introduction to its Foundations, by Sadri


Hassani (Springer, Second Edition).

• Mathematical Methods for Students of Physics and Related Fields, by Sadri


Hassani (Springer, second edition, 2009).

• Mathematical Methods For Physics and Engineering, by K. F. Riley, M. P.


Hobson, S. J. Bence (Cambridge University Press, Third Edition).

You might also like