Natural Sciences Tripos: IB Mathematical Methods I
Contents
0 Introduction i
0.1 Schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
0.2 Books . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
0.3 Lectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
0.4 Printed Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
0.5 Example Sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.6 Acknowledgements. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.7 Revision. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
1 Vector Calculus 1
1.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Vectors and Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Cartesian Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Vector Calculus in Cartesian Coordinates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.1 The Gradient of a Scalar Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.3 The Geometrical Significance of Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 The Divergence and Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.1 Vector fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.3.2 The Divergence and Curl of a Vector Field . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.3.4 F · ∇ . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.4 Vector Differential Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.5 Second Order Vector Differential Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5.1 div curl and curl grad . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 8
1.5.2 The Laplacian Operator ∇² . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.5.3 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.6 The Divergence Theorem and Stokes' Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.6.1 The Divergence Theorem (Gauss' Theorem) . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.6.2 Stokes' Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6.3 Examples and Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.6.4 Interpretation of divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.6.5 Interpretation of curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.7 Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.7.1 What Are Orthogonal Curvilinear Coordinates? . . . . . . . . . . . . . . . . . . . . . . . . 15
1.7.2 Relationships Between Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.7.3 Incremental Change in Position or Length. . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.7.4 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.7.5 Spherical Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.6 Cylindrical Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.7.7 Volume and Surface Elements in Orthogonal Curvilinear Coordinates . . . . . . . . . . . 19
1.7.8 Gradient in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . 20
1.7.9 Examples of Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.10 Divergence and Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.11 Laplacian in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . 23
1.7.12 Further Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.7.13 Aide Memoire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
2 Partial Differential Equations 25
2.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.1 Nomenclature . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
2.1.1 Linear Second-Order Partial Differential Equations . . . . . . . . . . . . . . . . . . . . . . 25
2.2 Physical Examples and Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.1 Waves on a Violin String . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.2 Electromagnetic Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
2.2.3 Electrostatic Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.4 Gravitational Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.5 Diffusion of a Passive Tracer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
2.2.6 Heat Flow . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
2.2.7 Other Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
2.3 Separation of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4 The One Dimensional Wave Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.1 Separable Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 30
2.4.2 Boundary and Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
2.4.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2.4.4 Unlectured: Oscillation Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5 Poisson's Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.5.1 A Particular Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.5.2 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.5.3 Separable Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
2.6 The Diffusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.6.1 Separable Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
2.6.2 Boundary and Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
2.6.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
3 Fourier Transforms 39
3.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1 The Dirac Delta Function (a.k.a. Alchemy) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1.1 The Delta Function as the Limit of a Sequence . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1.2 Some Properties of the Delta Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
3.1.3 An Alternative (And Better) View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
3.1.4 The Delta Function as the Limit of Other Sequences . . . . . . . . . . . . . . . . . . . . . 40
3.1.5 Further Properties of the Delta Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.6 The Heaviside Step Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
3.1.7 The Derivative of the Delta Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2 The Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
3.2.2 Examples of Fourier Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
3.2.3 The Fourier Inversion Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3.2.4 Properties of Fourier Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.2.5 Parseval's Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
3.2.6 The Convolution Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.2.7 The Relationship to Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3 Application: Solution of Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3.1 An Ordinary Differential Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.3.2 The Diffusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
4 Matrices 53
4.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.1 Some Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.2 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
4.1.3 Linear Independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.1.4 Basis Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
4.2 Change of Basis: the Role of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.1 Transformation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.2 Properties of Transformation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 57
4.2.3 Transformation Law for Vector Components . . . . . . . . . . . . . . . . . . . . . . . . . . 58
4.3 Scalar Product (Inner Product) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.1 Denition of a Scalar Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
4.3.2 Worked Exercise . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
4.3.3 Some Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.4 The Scalar Product in Terms of Components . . . . . . . . . . . . . . . . . . . . . . . . . 61
4.3.5 Properties of the Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
4.4 Change of Basis: Diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.1 Transformation Law for Metrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
4.4.2 Diagonalization of the Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.4.3 Orthonormal Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 64
4.5 Unitary and Orthogonal Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 65
4.6 Diagonalization of Matrices: Eigenvectors and Eigenvalues . . . . . . . . . . . . . . . . . . . . . . 66
4.7 Eigenvalues and Eigenvectors of Hermitian Matrices . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.7.1 The Eigenvalues of an Hermitian Matrix are Real . . . . . . . . . . . . . . . . . . . . . . . 67
4.7.2 An n-Dimensional Hermitian Matrix has n Orthogonal Eigenvectors . . . . . . . . . . . . 68
4.7.3 Diagonalization of Hermitian Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.7.4 Diagonalization of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.8 Hermitian and Quadratic Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.8.1 Eigenvectors and Principal Axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
4.8.2 The Stationary Properties of the Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.9 Mechanical Oscillations (Unlectured: See Easter Term Course) . . . . . . . . . . . . . . . . . . . 74
4.9.0 Why Have We Studied Hermitian Matrices, etc.? . . . . . . . . . . . . . . . . . . . . . . . 74
4.9.1 Governing Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.9.2 Normal Modes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.9.3 Normal Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.9.4 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5 Elementary Analysis 77
5.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.1 Sequences and Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.1.1 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.1.2 Sequences Tending to a Limit, or Not. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
5.2 Convergence of Infinite Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.1 Convergent Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.2 A Necessary Condition for Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.3 Divergent Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.4 Absolutely Convergent Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3 Tests of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.1 The Comparison Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3.2 D'Alembert's Ratio Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.3.3 Cauchy's Test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 80
5.4 Power Series of a Complex Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4.1 Convergence of Power Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4.2 Radius of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4.3 Determination of the Radius of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4.4 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4.5 Analytic Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4.6 The O Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.5 Integration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.5.1 Why Do We Have To Do This Again? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.5.2 The Riemann Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
5.5.3 Properties of the Riemann Integral . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
5.5.4 The Fundamental Theorems of Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
6 Ordinary Differential Equations 89
6.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.1 Second-Order Linear Ordinary Differential Equations . . . . . . . . . . . . . . . . . . . . . . . 89
6.2 Homogeneous Second-Order Linear ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2.1 The Wronskian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
6.2.2 The Calculation of W . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.2.3 A Second Solution via the Wronskian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
6.3 Taylor Series Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
6.3.1 The Solution at Ordinary Points in Terms of a Power Series . . . . . . . . . . . . . . . . . 91
6.3.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
6.3.3 Example: Legendre's Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
6.4 Regular Singular Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.4.1 The Indicial Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.4.2 Series Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
6.4.3 Example: Bessel's Equation of Order ν . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6.4.4 The Second Solution when ν₁ - ν₂ ∈ Z . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
6.5 Inhomogeneous Second-Order Linear ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
6.5.1 The Method of Variation of Parameters . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.5.2 Two Point Boundary Value Problems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.5.3 Green's Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.5.4 Two Properties of Green's Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5.5 Construction of the Green's Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.5.6 Unlectured Alternative Derivation of a Green's Function . . . . . . . . . . . . . . . . . . 103
6.5.7 Example of a Green's Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.6 Sturm-Liouville Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.6.1 Inner Products and Self-Adjoint Operators . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.6.2 The Sturm-Liouville Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6.3 The Role of the Weight Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.6.4 Eigenvalues and Eigenfunctions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.6.5 The Eigenvalues of a Self-Adjoint Operator are Real . . . . . . . . . . . . . . . . . . . . . 109
6.6.6 Eigenfunctions with Distinct Eigenvalues are Orthogonal . . . . . . . . . . . . . . . . . . 110
6.6.7 Eigenfunction Expansions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 111
6.6.8 Eigenfunction Expansions of Green's Functions for Self-Adjoint Operators . . . . . . . . 113
6.6.9 Approximation via Eigenfunction Expansions . . . . . . . . . . . . . . . . . . . . . . . . . 113
0 Introduction
0.1 Schedule
This is a copy from the booklet of schedules.¹ Schedules are minimal for lecturing and maximal for
examining; that is to say, all the material in the schedules will be lectured and only material in the
schedules will be examined. The numbers in square brackets at the end of paragraphs of the schedules
indicate roughly the number of lectures that will be devoted to the material in the paragraph.
Mathematical Methods I    24 lectures, Michaelmas term
This course comprises Mathematical Methods I, Mathematical Methods II and Mathematical
Methods III and six Computer Practicals. The material in Course A from Part IA will be
assumed in the lectures for this course.² Topics marked with asterisks should be lectured, but
questions will not be set on them in examinations.
The material in the course will be as well illustrated as time allows with examples and
applications of Mathematical Methods to the Physical Sciences.³
Vector calculus
Reminder of grad, div, curl, ∇². Divergence theorem and Stokes' theorem. Vector differential
operators in orthogonal curvilinear coordinates, e.g. cylindrical and spherical polar coordinates. [4]
Partial differential equations
Linear second-order partial differential equations; physical examples of occurrence, the method
of separation of variables (Cartesian coordinates only). [3]
Fourier transform
Fourier transforms; relation to Fourier series, simple properties and examples, delta function,
convolution theorem and Parseval's theorem treated heuristically, application to diffusion
equation. [3]
Matrices
N-dimensional vector spaces, matrices, scalar product, transformation of basis vectors. Quadratic
and Hermitian forms, quadric surfaces. Eigenvalues and eigenvectors of a matrix; degenerate
case, stationary property of eigenvalues. Orthogonal and unitary transformations. [5]
Elementary Analysis
Idea of convergence and limits. Convergence of series; comparison and ratio tests. Power series
of a complex variable; circle of convergence. O notation. The integral as a sum. Differentiation
of an integral with respect to its limits. Schwarz's inequality. [2]
Ordinary differential equations
Homogeneous equations; solution by series (without full discussion of logarithmic singularities),
exemplified by Legendre's equation. Inhomogeneous equations; solution by variation of
parameters, introduction to Green's function.
Sturm-Liouville theory; self-adjoint operators, eigenfunctions and eigenvalues, reality of eigenvalues
and orthogonality of eigenfunctions. Eigenfunction expansions and determination of
coefficients. Legendre polynomials; orthogonality. [7]
¹ See http://www.maths.cam.ac.uk/undergrad/NST/sched/.
² However, if you took course A rather than B, then you might like to recall the following extract from the schedules:
Students are . . . advised that if they have taken course A in Part IA, they should consult their Director of Studies about
suitable reading during the Long Vacation before embarking upon Part IB.
³ Time is always short.
0.2 Books
An extract from the schedules.
There are very many books which cover the sort of mathematics required by Natural Scientists.
The following should be helpful as general reference; further advice will be given by
Lecturers. Books which can reasonably be used as principal texts for the course are marked
with a dagger.
G Arfken & H Weber Mathematical Methods for Physicists, 5th edition. Elsevier/Academic
Press, 2001 (£46.95 hardback).
J W Dettman Mathematical Methods in Physics and Engineering. Dover, 1988 (£13.60 paperback).
H F Jones Groups, Representation and Physics, 2nd edition. Institute of Physics Publishing,
1998 (£24.99 paperback)
E Kreyszig Advanced Engineering Mathematics, 8th edition. Wiley, 1999 (£30.95 paperback,
£92.50 hardback)
J W Leech & D J Newman How to Use Groups. Chapman & Hall, 1993 (out of print)
K F Riley, M P Hobson & S J Bence Mathematical Methods for Physics and Engineering.
2nd ed., Cambridge University Press, 2002 (£29.95 paperback, £75.00 hardback).
R N Snieder A guided tour of mathematical methods for the physical sciences. Cambridge
University Press, 2001 (£21.95 paperback)
There is likely to be an uncanny resemblance between my notes and Riley, Hobson & Bence. This is
because we both used the same source, i.e. previous Cambridge lecture notes, and not because I have
just copied out their textbook (although it is true that I have tried to align my notation with theirs)!⁴
Having said that, it really is a good book. A must buy.
Of the other books I like Mathews & Walker, but it might be a little mathematical for some. Also, the
first time I gave a service mathematics course (over 15 years ago to aeronautics students at Imperial),
my notes bore an uncanny resemblance to Kreyszig . . . and that was not because we were using a common
source!
0.3 Lectures
Lectures will start at 11:05 promptly with a summary of the last lecture. Please be on time since
it is distracting to have people walking in late.
I will endeavour to have a 2 minute break in the middle of the lecture for a rest and/or jokes
and/or politics and/or paper aeroplanes⁵; students seem to find that the break makes it easier to
concentrate throughout the lecture.⁶
I will aim to finish by 11:55, but am not going to stop dead in the middle of a long proof/explanation.
I will stay around for a few minutes at the front after lectures in order to answer questions.
By all means chat to each other quietly if I am unclear, but please do not discuss, say, last night's
football results, or who did (or did not) get drunk and/or laid. Such chatting is a distraction.
⁴ In a previous year a student hoped that Riley et al. were getting royalties from my lecture notes; my view is that I
hope that my lecturers from 30 years ago are getting royalties from Riley et al.!
⁵ If you throw paper aeroplanes please pick them up. I will pick up the first one to stay in the air for 5 seconds.
⁶ Having said that, research suggests that within the first 20 minutes I will, at some point, have lost the attention of all
of you.
I want you to learn. I will do my best to be clear but you must read through and understand your
notes before the next lecture . . . otherwise you will get hopelessly lost. An understanding of your
notes will not diffuse into you just because you have carried your notes around for a week . . . or
put them under your pillow.
I welcome constructive heckling. If I am inaudible, illegible, unclear or just plain wrong then please
shout out.
I aim to avoid the words trivial, easy, obvious and yes⁷. Let me know if I fail. I will occasionally
use straightforward or similarly to last time; if it is not, email me ([email protected])
or catch me at the end of the next lecture.
Sometimes I may confuse both you and myself, and may not be able to extract myself in the middle
of a lecture. Under such circumstances I will have to plough on as a result of time constraints;
however I will clear up any problems at the beginning of the next lecture.
This is a service course. Hence you will not get pure mathematical levels of rigour; having said that
all the outline/sketch proofs could in principle be tightened up given sufficient time.
In Part IA all NST students were required to study mathematics. Consequently the lecturers
adapted their presentation to account for the presence of students who might like to have been
somewhere else, e.g. there were lots of 'how to do it' recipes and not much theory. This is an optional
course, and as a result there will be more theory than last year (although less than in the comparable
mathematics courses). If you are really to use a method, or extend it as might be necessary in
research, you need to understand why a method works, as well as how to apply it. FWIW the NST
mathematics schedules are decided by scientists in conjunction with mathematicians.
If anyone is colour blind please come and tell me which colour pens you cannot read.
0.4 Printed Notes
Printed notes will be handed out for the course . . . so that you can listen to me rather than having
to scribble things down. If it is not in the printed notes or on the example sheets it should not be
in the exam.
Any notes will only be available in lectures and only once for each set of notes.
I do not keep back-copies (otherwise my office would be an even worse mess) . . . from which you
may conclude that I will not have copies of last time's notes (so do not ask).
There will only be approximately as many copies of the notes as there were students at the previous
lecture. We are going to fell a forest as it is, and I have no desire to be even more environmentally
unsound.
Please do not take copies for your absent friends unless they are ill.
The notes are deliberately not available on the WWW; they are an adjunct to lectures and are not
meant to be used independently.
If you do not want to attend lectures then use one of the excellent textbooks, e.g. Riley, Hobson &
Bence.
With one or two exceptions figures/diagrams are deliberately omitted from the notes. I was taught
to do this at my teaching course on How To Lecture . . . the aim being that it might help you to
stay awake if you have to write something down from time to time.
There are a number of unlectured worked examples in the notes. In the past I have been tempted to
not include these because I was worried that students would be unhappy with material in the notes
that was not lectured. However, a vote in an earlier year was overwhelmingly in favour of including
unlectured worked examples.
Please email me corrections to the notes and example sheets ([email protected]).
⁷ But I will fail miserably in the case of yes.
0.5 Example Sheets
There will be five example sheets. They will be available on the WWW at about the same time as
I hand them out (see http://damtp.cam.ac.uk/user/examples/).
You should be able to do the revision example sheet, i.e. example sheet 0, immediately (although
you might like to wait until the end of lecture 2 for a couple of the questions).
You should be able to do example sheets 1/2/3/4 after lectures 6/12/18/24 respectively. Please
bear this in mind when arranging supervisions.
0.6 Acknowledgements.
The following notes were adapted from those of Paul Townsend, Stuart Dalziel, Mike Proctor and Paul
Metcalfe.
0.7 Revision.
You should check that you recall the following.
The Greek alphabet.
A α alpha
B β beta
Γ γ gamma
Δ δ delta
E ε epsilon
Z ζ zeta
H η eta
Θ θ theta
I ι iota
K κ kappa
Λ λ lambda
M μ mu
N ν nu
Ξ ξ xi
O o omicron
Π π pi
P ρ rho
Σ σ sigma
T τ tau
Υ υ upsilon
Φ φ phi
X χ chi
Ψ ψ psi
Ω ω omega
There are also typographic variations on epsilon (i.e. ε vs. ϵ), phi (i.e. φ vs. ϕ), and rho (i.e. ρ vs. ϱ).
The first fundamental theorem of calculus. The first fundamental theorem of calculus states that
the derivative of the integral of f is f, i.e. if f is suitably nice (e.g. f is continuous) then (key result)
d/dx ( ∫_{x₁}^{x} f(t) dt ) = f(x) .   (0.1)
The second fundamental theorem of calculus. The second fundamental theorem of calculus states
that the integral of the derivative of f is f, e.g. if f is differentiable then (key result)
∫_{x₁}^{x₂} (df/dx) dx = f(x₂) - f(x₁) .   (0.2)
The Gaussian. The function
f(x) = (1/√(2πσ²)) exp( -x²/(2σ²) )   (0.3)
is called a Gaussian of width σ; in the context of probability theory σ is the standard deviation. The
area under this curve is unity, i.e. (key result)
(1/√(2πσ²)) ∫_{-∞}^{∞} exp( -x²/(2σ²) ) dx = 1 .   (0.4)
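As a quick cross-check of (0.4) (an addition to these notes, not part of the original), the normalisation can be confirmed symbolically in Python, assuming sympy is available:

    import sympy as sp

    x, sigma = sp.symbols('x sigma', positive=True)

    # Gaussian of width sigma, equation (0.3)
    f = 1/sp.sqrt(2*sp.pi*sigma**2) * sp.exp(-x**2/(2*sigma**2))

    # Area under the curve, equation (0.4): should be exactly 1
    print(sp.integrate(f, (x, -sp.oo, sp.oo)))   # -> 1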
Cylindrical polar co-ordinates (ρ, φ, z).
In cylindrical polar co-ordinates the position vector r is given in terms of a radial distance ρ from an
axis e_z, a polar angle φ, and the distance z along the axis:
r = ρ cos φ e_x + ρ sin φ e_y + z e_z   (0.5a)
  = ρ e_ρ + z e_z ,   (0.5b)
where 0 ≤ ρ < ∞, 0 ≤ φ ≤ 2π and -∞ < z < ∞.
Remark. Often r and/or θ are used in place of ρ and/or φ respectively (but then there is potential
confusion with the different definitions of r and θ in spherical polar co-ordinates).
Spherical polar co-ordinates (r, θ, φ).
In spherical polar co-ordinates the position vector r is given in terms of a radial distance r from the
origin, a latitude angle θ, and a longitude angle φ:
r = r sin θ cos φ e_x + r sin θ sin φ e_y + r cos θ e_z   (0.6a)
  = r e_r ,   (0.6b)
where 0 ≤ r < ∞, 0 ≤ θ ≤ π and 0 ≤ φ ≤ 2π.
Taylor's theorem for functions of more than one variable. Let f(x, y) be a function of two variables;
then
f(x + δx, y + δy) = f(x, y) + δx ∂f/∂x + δy ∂f/∂y
                    + (1/2!) [ (δx)² ∂²f/∂x² + 2 δx δy ∂²f/∂x∂y + (δy)² ∂²f/∂y² ] + · · · .   (0.7)
Exercise. Let g(x, y, z) be a function of three variables. Expand g(x + δx, y + δy, z + δz) correct to
O(δx, δy, δz).
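To illustrate (0.7) (an added aside, not in the original notes), one can check with sympy that the second-order truncation agrees with the exact expansion of a sample function; the particular function below is an arbitrary choice.

    import sympy as sp

    x, y, a, b, t = sp.symbols('x y a b t')

    # A sample smooth function of two variables (arbitrary choice)
    F = sp.exp(x) * sp.cos(y)

    dx, dy = t*a, t*b   # a displacement in a fixed direction, scaled by t

    # Right-hand side of Taylor's theorem (0.7), truncated at second order
    taylor = (F + dx*sp.diff(F, x) + dy*sp.diff(F, y)
                + sp.Rational(1, 2)*(dx**2*sp.diff(F, x, 2)
                                     + 2*dx*dy*sp.diff(F, x, y)
                                     + dy**2*sp.diff(F, y, 2)))

    # Exact value F(x+dx, y+dy); the difference should vanish up to and including O(t^2)
    exact = F.subs(x, x + dx).subs(y, y + dy)
    residual = sp.series(exact - taylor, t, 0, 3)
    print(sp.simplify(residual.removeO()))   # -> 0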
Partial differentiation. For variables q₁, q₂, q₃,
( ∂q₁/∂q₁ )_{q₂,q₃} = 1 ,   ( ∂q₁/∂q₂ )_{q₁,q₃} = 0 ,   etc.,   (0.8a)
and hence (key result)
∂q_i/∂q_j = δ_ij ,   (0.8b)
where δ_ij is the Kronecker delta:
δ_ij = { 1 if i = j ,  0 if i ≠ j } .   (0.9)
The chain rule. Let h(x, y) be a function of two variables, and suppose that x and y are themselves
functions of a variable s; then
dh/ds = (∂h/∂x)(dx/ds) + (∂h/∂y)(dy/ds) .   (0.10a)
Suppose instead that h depends on n variables x_i (i = 1, . . . , n), so that h = h(x₁, x₂, . . . , x_n). If
the x_i depend on m variables s_j (j = 1, . . . , m), then for j = 1, . . . , m (key result)
∂h/∂s_j = Σ_{i=1}^{n} (∂h/∂x_i)(∂x_i/∂s_j) .   (0.10b)
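A short sympy illustration of the chain rule (0.10a) (an addition, not in the original notes); the particular choices of x(s), y(s) and h are arbitrary.

    import sympy as sp

    s, X, Y = sp.symbols('s X Y')
    x, y = sp.cos(s), sp.sin(s)        # sample functions x(s), y(s)
    h = X**2*Y + sp.exp(Y)             # sample h(x, y)

    # Left-hand side of (0.10a): d/ds of h(x(s), y(s))
    lhs = sp.diff(h.subs({X: x, Y: y}), s)

    # Right-hand side: (dh/dx)(dx/ds) + (dh/dy)(dy/ds)
    rhs = (sp.diff(h, X).subs({X: x, Y: y})*sp.diff(x, s)
           + sp.diff(h, Y).subs({X: x, Y: y})*sp.diff(y, s))

    print(sp.simplify(lhs - rhs))      # -> 0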
An identity. If δ_ij is the Kronecker delta, then for a_j (j = 1, 2, 3),
Σ_{j=1}^{3} a_j δ_1j = a₁ δ₁₁ + a₂ δ₁₂ + a₃ δ₁₃ = a₁ ,   (0.11a)
and more generally (key result)
Σ_{j=1}^{3} a_j δ_ij = a₁ δ_i1 + a₂ δ_i2 + a₃ δ_i3 = a_i .   (0.11b)
Line integrals. Let C be a smooth curve; then (key result)
∫_C ⋯ dr = ∫_C ⋯ dr .   (0.12)
The transpose of a matrix. Let A be a 3 × 3 matrix:
A = [ A₁₁  A₁₂  A₁₃
      A₂₁  A₂₂  A₂₃
      A₃₁  A₃₂  A₃₃ ] .   (0.13a)
Then the transpose, Aᵀ, of this matrix is given by
Aᵀ = [ A₁₁  A₂₁  A₃₁
       A₁₂  A₂₂  A₃₂
       A₁₃  A₂₃  A₃₃ ] .   (0.13b)
Fourier series. Let f(x) be a function with period L, i.e. a function such that f(x + L) = f(x). Then
the Fourier series expansion of f(x) is given by (key result)
f(x) = ½ a₀ + Σ_{n=1}^{∞} a_n cos( 2πnx/L ) + Σ_{n=1}^{∞} b_n sin( 2πnx/L ) ,   (0.14a)
where
a_n = (2/L) ∫_{x₀}^{x₀+L} f(x) cos( 2πnx/L ) dx ,   (0.14b)
b_n = (2/L) ∫_{x₀}^{x₀+L} f(x) sin( 2πnx/L ) dx ,   (0.14c)
and x₀ is an arbitrary constant. Also recall the orthogonality conditions
∫_0^L sin( 2πnx/L ) sin( 2πmx/L ) dx = (L/2) δ_nm ,   (0.15a)
∫_0^L cos( 2πnx/L ) cos( 2πmx/L ) dx = (L/2) δ_nm ,   (0.15b)
∫_0^L sin( 2πnx/L ) cos( 2πmx/L ) dx = 0 .   (0.15c)
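The orthogonality conditions (0.15a)-(0.15c) are easy to confirm for sample values of n and m with sympy (an added check, not part of the original notes):

    import sympy as sp

    x, L = sp.symbols('x L', positive=True)

    def I(f):
        # integrate f over one full period [0, L]
        return sp.simplify(sp.integrate(f, (x, 0, L)))

    s = lambda n: sp.sin(2*sp.pi*n*x/L)
    c = lambda n: sp.cos(2*sp.pi*n*x/L)

    # (0.15a), (0.15b): L/2 when n = m, zero when n != m (here n, m >= 1)
    print(I(s(3)*s(3)), I(s(3)*s(5)))   # -> L/2  0
    print(I(c(3)*c(3)), I(c(3)*c(5)))   # -> L/2  0

    # (0.15c): the mixed sin-cos integral vanishes over a full period
    print(I(s(3)*c(3)), I(s(3)*c(5)))   # -> 0  0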
Let g_e(x) be an even function, i.e. a function such that g_e(-x) = g_e(x), with period 2L. Then
the Fourier series expansion of g_e(x) can be expressed as
g_e(x) = ½ a₀ + Σ_{n=1}^{∞} a_n cos( nπx/L ) ,   (0.16a)
where
a_n = (2/L) ∫_0^L g_e(x) cos( nπx/L ) dx .   (0.16b)
Let g_o(x) be an odd function, i.e. a function such that g_o(-x) = -g_o(x), with period 2L. Then
the Fourier series expansion of g_o(x) can be expressed as
g_o(x) = Σ_{n=1}^{∞} b_n sin( nπx/L ) ,   (0.17a)
where
b_n = (2/L) ∫_0^L g_o(x) sin( nπx/L ) dx .   (0.17b)
Recall that if integrated over a half period, the orthogonality conditions require care since
∫_0^L sin( nπx/L ) sin( mπx/L ) dx = (L/2) δ_nm ,   (0.18a)
∫_0^L cos( nπx/L ) cos( mπx/L ) dx = (L/2) δ_nm ,   (0.18b)
but
∫_0^L sin( nπx/L ) cos( mπx/L ) dx = { 0                   if n + m is even,
                                        2nL / (π(n² - m²))  if n + m is odd.    (0.18c)
Suggestions.
Examples.
1. Include Ampère's law, Faraday's law, etc., somewhere (see 1997 Vector Calculus notes).
Additions/Subtractions?
1. Remove all the enlargethispage commands.
2. 2D divergence theorem, Green's theorem (e.g. as a special case of Stokes' theorem).
3. Add Fourier transforms of cos x, sin x and periodic functions.
4. Check that the addendum at the end of §3 has been incorporated into the main section.
5. Swap 3.3.2 and 3.3.1.
6. Swap 4.2 and 4.3.
7. Explain that observables in quantum mechanics are Hermitian operators.
8. Come up with a better explanation of why, for a transformation matrix, say A, det A ≠ 0.
1 Vector Calculus
1.0 Why Study This?
Many scientific quantities just have a magnitude, e.g. time, temperature, density, concentration. Such
quantities can be completely specified by a single number. We refer to such numbers as scalars. You have
learnt how to manipulate such scalars (e.g. by addition, subtraction, multiplication, differentiation) since
your first day in school (or possibly before that).
However other quantities have both a magnitude and a direction, e.g. the position of a particle, the
velocity of a particle, the direction of propagation of a wave, a force, an electric field, a magnetic field.
You need to know how to manipulate these quantities (e.g. by addition, subtraction, multiplication,
differentiation) if you are to be able to describe them mathematically.
1.1 Vectors and Bases
Definition. A quantity that is specified by a [positive] magnitude and a direction in space is called a
vector.
Example. A point P in 3D (or 2D) space can be specified by giving its position vector, r, from some
chosen origin 0.
Vectors exist independently of any coordinate system. However, it is often very useful to describe them
in terms of a basis. Three non-zero vectors e₁, e₂ and e₃ can form a basis in 3D space if they do not all
lie in a plane, i.e. they are linearly independent. Any vector can be expressed in terms of scalar multiples
of the basis vectors:
a = a₁ e₁ + a₂ e₂ + a₃ e₃ .   (1.1)
The a_i (i = 1, 2, 3) are said to be the components of the vector a with respect to this basis.
Note that the e_i (i = 1, 2, 3) need not have unit magnitude and/or be orthogonal. However calculations,
etc. are much simpler if the e_i (i = 1, 2, 3) define an orthonormal basis, i.e. if the basis vectors
 - have unit magnitude, |e_i| = 1 (i = 1, 2, 3);
 - are mutually orthogonal, e_i · e_j = 0 if i ≠ j.
These conditions can be expressed more concisely as
e_i · e_j = δ_ij   (i, j = 1, 2, 3) ,   (1.2)
where δ_ij is the Kronecker delta:
δ_ij = { 1 if i = j ,  0 if i ≠ j } .   (1.3)
The orthonormal basis is right-handed if
e₁ × e₂ = e₃ ,   (1.4)
so that the ordered triple scalar product of the basis vectors is positive:
[e₁, e₂, e₃] = e₁ · e₂ × e₃ = 1 .   (1.5)
Exercise. Show using (1.2) and (1.4) that
e₂ × e₃ = e₁   and that   e₃ × e₁ = e₂ .   (1.6)
1.1.1 Cartesian Coordinate Systems
We can set up a Cartesian coordinate system by identifying e₁, e₂ and e₃ with unit vectors pointing
in the x, y and z directions respectively. The position vector r is then given by
r = x e₁ + y e₂ + z e₃   (1.7a)
  = (x, y, z) .   (1.7b)
Remarks.
1. We shall sometimes write x₁ for x, x₂ for y and x₃ for z.
2. Alternative notations for a Cartesian basis in R³ (i.e. 3D) include
e₁ = e_x = i ,   e₂ = e_y = j   and   e₃ = e_z = k ,   (1.8)
for the unit vectors in the x, y and z directions respectively. Hence from (1.2) and (1.6)
i·i = j·j = k·k = 1 ,   i·j = j·k = k·i = 0 ,   (1.9a)
i × j = k ,   j × k = i ,   k × i = j .   (1.9b)
1.2 Vector Calculus in Cartesian Coordinates.
1.2.1 The Gradient of a Scalar Field
Let φ(r) be a scalar field, i.e. a scalar function of position r = (x, y, z). Examples of scalar fields
include temperature and density.
Consider a small change to the position r, say to r + δr. This small change in position will generally
produce a small change in φ. We estimate this change in φ using the Taylor series for a function of
many variables, as follows:
δφ = φ(r + δr) - φ(r) = φ(x + δx, y + δy, z + δz) - φ(x, y, z)
   = (∂φ/∂x) δx + (∂φ/∂y) δy + (∂φ/∂z) δz + · · ·
   = ( (∂φ/∂x) e_x + (∂φ/∂y) e_y + (∂φ/∂z) e_z ) · ( δx e_x + δy e_y + δz e_z ) + · · ·
   = ∇φ · δr + · · · .
In the limit when δr becomes infinitesimal we write dφ for δφ.⁸ Thus we have that
dφ = ∇φ · dr ,   (1.10)
⁸ This is a bit of a fudge because, strictly, a differential dφ need not be small . . . but there is no quick way out.
where the gradient of φ is defined by
grad φ ≡ ∇φ = (∂φ/∂x) e_x + (∂φ/∂y) e_y + (∂φ/∂z) e_z = Σ_{j=1}^{3} (∂φ/∂x_j) e_j .   (1.11)
We can define the vector differential operator ∇ (pronounced 'grad') independently of φ by writing
(key result)
∇ ≡ e_x ∂/∂x + e_y ∂/∂y + e_z ∂/∂z = Σ_j e_j ∂/∂x_j ,   (1.12)
where Σ_j will henceforth be used as a shorthand for Σ_{j=1}^{3}.
1.2.2 Example
Find ∇f(r), where f(r) is a function of r = |r|. We will use this result later.
Answer. First recall that r² = x² + y² + z². Hence
2r ∂r/∂x = 2x ,   i.e.   ∂r/∂x = x/r .   (1.13a)
Similarly, by use of the permutations x → y, y → z and z → x,
∂r/∂y = y/r ,   ∂r/∂z = z/r .   (1.13b)
Hence, from the definition of gradient (1.11), (key result)
∇r = ( ∂r/∂x , ∂r/∂y , ∂r/∂z ) = ( x/r , y/r , z/r ) = r/r .   (1.14)
Similarly, from the definition of gradient (1.11) (and from standard results for the derivative of a function
of a function),
∇f(r) = ( ∂f(r)/∂x , ∂f(r)/∂y , ∂f(r)/∂z ) = ( (df/dr)(∂r/∂x) , (df/dr)(∂r/∂y) , (df/dr)(∂r/∂z) )
      = f′(r) ∇r   (1.15a)
      = f′(r) r/r .   (1.15b)
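A quick symbolic confirmation of (1.14) and (1.15b) (an added aside, not in the lectured notes), done componentwise in Cartesian coordinates for the sample choice f(r) = sin r; Python with sympy is assumed.

    import sympy as sp

    x, y, z = sp.symbols('x y z', real=True)
    r = sp.sqrt(x**2 + y**2 + z**2)

    # Gradient of r: each component should equal the matching component of r/r, cf. (1.14)
    print([sp.simplify(sp.diff(r, v) - v/r) for v in (x, y, z)])             # -> [0, 0, 0]

    # Gradient of f(r) with f = sin: should equal f'(r) r/r = cos(r) r/r, cf. (1.15b)
    f = sp.sin(r)
    print([sp.simplify(sp.diff(f, v) - sp.cos(r)*v/r) for v in (x, y, z)])   # -> [0, 0, 0]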
1.2.3 The Geometrical Significance of Gradient
The normal to a surface. Suppose that a surface in 3D space is defined by the condition that
φ(r) = constant. Also suppose that for an infinitesimal, but non-zero, displacement vector dr
dφ ≡ φ(r + dr) - φ(r) = 0 .   (1.16)
Then dr is a tangent to the surface.
Further suppose that ∇φ ≠ 0; then from (1.10) it follows that ∇φ is orthogonal to dr. Moreover, since
we can choose dr to be in the direction of any tangent, we conclude that ∇φ is orthogonal to all tangents.
Hence ∇φ must be orthogonal/normal to surfaces of constant φ, and so if n̂ is the unit normal to a
surface of constant φ, then (up to a sign)
n̂ = ∇φ / |∇φ| .   (1.17)
We also note from (1.10) that dφ is a maximum when dr is parallel to ∇φ, i.e. the direction of ∇φ is
the direction in which φ is changing most rapidly.
The directional derivative. More generally consider the rate of change of φ in the direction given by
the unit vector l̂. To this end consider φ(r + s l̂). If we regard this as a function of the single variable s
then we may use a Taylor series to deduce that
δφ = φ(r + δs l̂) - φ(r) = δs [ d/ds φ(r + s l̂) ]_{s=0} + · · · ,
or in the limit of δs becoming infinitesimal,
dφ = ds [ d/ds φ(r + s l̂) ]_{s=0} .   (1.18)
Next we note that if we substitute
dr = ds l̂   (1.19)
into (1.10), then we obtain
dφ = ds ( l̂ · ∇φ ) .   (1.20)
Equating (1.18) and (1.20) yields
l̂ · ∇φ = [ d/ds φ(r + s l̂) ]_{s=0} .   (1.21)
Hence l̂ · ∇φ is the rate of change of φ in the direction l̂. It is referred to as a directional derivative.
1.2.4 Applications
1. Find the unit normal at the point r(x, y, z) to the surface
φ(r) ≡ xy + yz + zx = c ,   (1.22)
where c is a positive constant. Hence find the points where the tangents to the surface are parallel
to the (x, y) plane.
Answer. First calculate
∇φ = (y + z, x + z, y + x) .   (1.23)
Then from (1.17) the unit normal is given by
n̂ = ∇φ / |∇φ| = (y + z, x + z, y + x) / √( 2(x² + y² + z² + xy + xz + yz) ) .   (1.24)
The tangents to the surface φ(r) = c are parallel to the (x, y) plane when the normal is parallel to the
z-axis, i.e. when n̂ = (0, 0, 1) or n̂ = (0, 0, -1), i.e. when
y = -z   and   x = -z .   (1.25)
Hence from the equation for the surface, i.e. (1.22), the points where the tangents to the surface
are parallel to the (x, y) plane satisfy
z² = -c .   (1.26)
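The algebra in (1.23) and (1.24) can be checked symbolically (an added aside, not part of the original notes), again with sympy:

    import sympy as sp

    x, y, z = sp.symbols('x y z', real=True)
    phi = x*y + y*z + z*x

    grad_phi = sp.Matrix([sp.diff(phi, v) for v in (x, y, z)])
    print(grad_phi.T)        # -> [y + z, x + z, x + y], cf. (1.23)

    # |grad phi|^2 should equal 2(x^2 + y^2 + z^2 + xy + xz + yz), cf. (1.24)
    norm_sq = grad_phi.dot(grad_phi)
    print(sp.simplify(norm_sq - 2*(x**2 + y**2 + z**2 + x*y + x*z + y*z)))   # -> 0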
2. Unlectured. A mountain's height z = h(x, y) depends on Cartesian coordinates x, y according to
h(x, y) = 1 - x⁴ - y⁴ ≥ 0. Find the point at which the slope in the plane y = 0 is greatest.
Answer. The slope of a path is the rate of change in the vertical direction divided by the rate of
change in the horizontal direction. So consider a path on the mountain parameterised by s:
r(s) = ( x(s), y(s), h(x(s), y(s)) ) .   (1.27)
As s varies, the rate of change with s in the vertical direction is dh/ds, while the rate of change with s
in the horizontal direction is √( (dx/ds)² + (dy/ds)² ). Hence the slope of the path is given by
slope = (dh/ds) / √( (dx/ds)² + (dy/ds)² )
      = ( (∂h/∂x)(dx/ds) + (∂h/∂y)(dy/ds) ) / √( (dx/ds)² + (dy/ds)² )   from (0.10a)
      = l̂ · ∇h ,   (1.28)
where
l̂ = ( dx/ds , dy/ds , 0 ) / √( (dx/ds)² + (dy/ds)² ) .   (1.29)
Thus the slope is a directional derivative. On y = 0
slope = -4x³ (dx/ds) / |dx/ds| = -4x³ sign( dx/ds ) .   (1.30)
Therefore the magnitude of the slope is largest where |x| is largest, i.e. at the edge of the mountain
|x| = 1. It follows that max |slope| = 4.
1.3 The Divergence and Curl
1.3.1 Vector fields
∇φ is an example of a vector field, i.e. a vector specified at each point r in space. More generally, we
have for a vector field F(r),
F(r) = F_x(r) e_x + F_y(r) e_y + F_z(r) e_z = Σ_j F_j(r) e_j ,   (1.31)
where F_x, F_y, F_z, or alternatively F_j (j = 1, 2, 3), are the components of F in this Cartesian coordinate
system. Examples of vector fields include current, electric and magnetic fields, and fluid velocities.
We can apply the vector operator ∇ to vector fields by means of dot and cross products.
1.3.2 The Divergence and Curl of a Vector Field
Divergence. The divergence of F is the scalar field
div F ≡ ∇ · F = ( e_x ∂/∂x + e_y ∂/∂y + e_z ∂/∂z ) · ( F_x e_x + F_y e_y + F_z e_z )
              = ∂F_x/∂x + ∂F_y/∂y + ∂F_z/∂z   (1.32)
              = Σ_j ∂F_j/∂x_j ,   (1.33)
from using (1.2) and remembering that in a Cartesian coordinate system the basis vectors do not
depend on position. (key result)
Curl. The curl of F is the vector field
curl F ≡ ∇ × F = ( e_x ∂/∂x + e_y ∂/∂y + e_z ∂/∂z ) × ( F_x e_x + F_y e_y + F_z e_z )
               = ( ∂F_z/∂y - ∂F_y/∂z ) e_x + ( ∂F_x/∂z - ∂F_z/∂x ) e_y + ( ∂F_y/∂x - ∂F_x/∂y ) e_z   (1.34a)
               = | e_x  e_y  e_z |
                 | ∂_x  ∂_y  ∂_z |
                 | F_x  F_y  F_z |   (1.34b)
               = | e₁    e₂    e₃   |
                 | ∂_x₁  ∂_x₂  ∂_x₃ |
                 | F₁    F₂    F₃   | ,   (1.34c)
from using (1.8) and (1.9b), and remembering that in a Cartesian coordinate system the basis vectors
do not depend on position. Here ∂_x ≡ ∂/∂x, etc. (key result)
1.3.3 Examples
1. Unlectured. Find the divergence and curl of the vector field F = (x²y, y²z, z²x).
Answer.
∇ · F = ∂(x²y)/∂x + ∂(y²z)/∂y + ∂(z²x)/∂z = 2xy + 2yz + 2zx .   (1.35)
∇ × F = | e_x  e_y  e_z |
        | ∂_x  ∂_y  ∂_z |
        | x²y  y²z  z²x |
      = -y² e_x - z² e_y - x² e_z = -(y², z², x²) .   (1.36)
2. Find ∇ · r and ∇ × r.
Answer. From the definition of divergence (1.32), and recalling that r = (x, y, z), it follows that
∇ · r = ∂x/∂x + ∂y/∂y + ∂z/∂z = 3 .   (1.37)
Next, from the definition of curl (1.34a) it follows that
∇ × r = ( ∂z/∂y - ∂y/∂z , ∂x/∂z - ∂z/∂x , ∂y/∂x - ∂x/∂y ) = 0 .   (1.38)
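Both worked examples can be reproduced with the vector-calculus helpers in sympy (an added check, assuming the sympy.vector module is available):

    import sympy as sp
    from sympy.vector import CoordSys3D, divergence, curl

    N = CoordSys3D('N')
    x, y, z = N.x, N.y, N.z

    # Example 1: F = (x^2 y, y^2 z, z^2 x)
    F = x**2*y*N.i + y**2*z*N.j + z**2*x*N.k
    print(divergence(F))             # -> 2xy + 2yz + 2zx, cf. (1.35)
    print(curl(F))                   # -> -y^2 i - z^2 j - x^2 k, cf. (1.36)

    # Example 2: the position vector r = (x, y, z)
    r = x*N.i + y*N.j + z*N.k
    print(divergence(r), curl(r))    # -> 3 and the zero vector, cf. (1.37), (1.38)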
1.3.4 F · ∇
In (1.32) we defined the divergence of a vector field F, i.e. the scalar ∇ · F. The order of the operator
∇ and the vector field F is important here. If we invert the order then we obtain the scalar operator
(F · ∇) ≡ F_x ∂/∂x + F_y ∂/∂y + F_z ∂/∂z = Σ_j F_j ∂/∂x_j .   (1.39)
Remark. As far as notation is concerned, for a scalar φ
F · (∇φ) = Σ_j F_j ( ∂φ/∂x_j ) = Σ_j ( F_j ∂/∂x_j ) φ = (F · ∇)φ .
However, the right-hand form is preferable. This is because for a vector G, the ith component of
(F · ∇)G is unambiguous, namely
( (F · ∇)G )_i = Σ_j F_j ∂G_i/∂x_j ,   (1.40)
while the ith component of F · (∇G) is not, i.e. it is not clear whether the ith component of
F · (∇G) is
Σ_j F_j ∂G_i/∂x_j   or   Σ_j F_j ∂G_j/∂x_i .
1.4 Vector Differential Identities
Calculations involving ∇ can be much speeded up when certain vector identities are known. There are
a large number of these! A short list is given below of the most common. Here φ is a scalar field and F,
G are vector fields.
∇ · (φF) = φ ∇ · F + (F · ∇)φ ,   (1.41a)
∇ × (φF) = φ (∇ × F) + (∇φ) × F ,   (1.41b)
∇ · (F × G) = G · (∇ × F) - F · (∇ × G) ,   (1.41c)
∇ × (F × G) = F (∇ · G) - G (∇ · F) + (G · ∇)F - (F · ∇)G ,   (1.41d)
∇ (F · G) = (F · ∇)G + (G · ∇)F + F × (∇ × G) + G × (∇ × F) .   (1.41e)
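Identities of this kind can also be checked mechanically before the analytic verifications that follow; the sketch below does this for (1.41c) with sympy, using generic component functions (an addition to the notes, assuming sympy.vector is available).

    import sympy as sp
    from sympy.vector import CoordSys3D, divergence, curl

    N = CoordSys3D('N')
    x, y, z = N.x, N.y, N.z

    def field(name):
        # a generic vector field with arbitrary component functions of (x, y, z)
        fx, fy, fz = [sp.Function(name + c)(x, y, z) for c in 'xyz']
        return fx*N.i + fy*N.j + fz*N.k

    F, G = field('F'), field('G')

    # (1.41c): div(F x G) = G . curl F - F . curl G
    lhs = divergence(F.cross(G))
    rhs = G.dot(curl(F)) - F.dot(curl(G))
    print(sp.simplify(lhs - rhs))    # -> 0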
Example Verifications.
(1.41a):
∇ · (φF) = ∂(φF_x)/∂x + ∂(φF_y)/∂y + ∂(φF_z)/∂z
         = φ ∂F_x/∂x + F_x ∂φ/∂x + φ ∂F_y/∂y + F_y ∂φ/∂y + φ ∂F_z/∂z + F_z ∂φ/∂z
         = φ ( ∂F_x/∂x + ∂F_y/∂y + ∂F_z/∂z ) + F_x ∂φ/∂x + F_y ∂φ/∂y + F_z ∂φ/∂z
         = φ ∇ · F + (F · ∇)φ .
Unlectured. (1.41c):
∇ · (F × G) = ∂(F_y G_z - F_z G_y)/∂x + ∂(F_z G_x - F_x G_z)/∂y + ∂(F_x G_y - F_y G_x)/∂z
            = G_z ∂F_y/∂x + F_y ∂G_z/∂x - F_z ∂G_y/∂x - G_y ∂F_z/∂x
              + G_x ∂F_z/∂y + F_z ∂G_x/∂y - F_x ∂G_z/∂y - G_z ∂F_x/∂y
              + G_y ∂F_x/∂z + F_x ∂G_y/∂z - F_y ∂G_x/∂z - G_x ∂F_y/∂z
            = G_x ( ∂F_z/∂y - ∂F_y/∂z ) + G_y ( ∂F_x/∂z - ∂F_z/∂x ) + G_z ( ∂F_y/∂x - ∂F_x/∂y )
              + F_x ( ∂G_y/∂z - ∂G_z/∂y ) + F_y ( ∂G_z/∂x - ∂G_x/∂z ) + F_z ( ∂G_x/∂y - ∂G_y/∂x )
            = G · (∇ × F) - F · (∇ × G) .
Warnings.
1. Always remember what terms the differential operator is acting on, e.g. is it all terms to the right
or just some?
2. Be very very careful when using standard vector identities where you have just replaced a vector
with ∇. Sometimes it works, sometimes it does not! For instance for constant vectors D, F and G
F · (D × G) = D · (G × F) = -D · (F × G) .
However for ∇ and vector functions F and G
F · (∇ × G) ≠ ∇ · (G × F) = -∇ · (F × G) ,
since
F · (∇ × G) = F_x ( ∂G_z/∂y - ∂G_y/∂z ) + F_y ( ∂G_x/∂z - ∂G_z/∂x ) + F_z ( ∂G_y/∂x - ∂G_x/∂y ) ,
while
∇ · (G × F) = F_x ( ∂G_z/∂y - ∂G_y/∂z ) + F_y ( ∂G_x/∂z - ∂G_z/∂x ) + F_z ( ∂G_y/∂x - ∂G_x/∂y )
              + G_x ( ∂F_y/∂z - ∂F_z/∂y ) + G_y ( ∂F_z/∂x - ∂F_x/∂z ) + G_z ( ∂F_x/∂y - ∂F_y/∂x ) .
1.5 Second Order Vector Differential Operators
1.5.1 div curl and curl grad
Using the definitions of grad, div and curl, i.e. (1.11), (1.32) and (1.34a), and assuming the equality of
mixed derivatives, we have that
curl (grad φ) = ∇ × (∇φ) = ( ∂²φ/∂y∂z - ∂²φ/∂z∂y , ∂²φ/∂z∂x - ∂²φ/∂x∂z , ∂²φ/∂x∂y - ∂²φ/∂y∂x ) = 0   (1.42)
and
div (curl F) = ∇ · (∇ × F) = ∂/∂x ( ∂F_z/∂y - ∂F_y/∂z ) + ∂/∂y ( ∂F_x/∂z - ∂F_z/∂x ) + ∂/∂z ( ∂F_y/∂x - ∂F_x/∂y ) = 0 ,   (1.43)
Remarks.
1. Since by the standard rules for scalar triple products ∇ · (∇ × F) ≡ (∇ × ∇) · F, we can summarise
both of these identities by (key result)
∇ × ∇ ≡ 0 .   (1.44)
2. There are important converses to (1.42) and (1.43). The following two assertions can be proved
(but not here).
(a) Suppose that ∇ × F = 0; the vector field F(r) is said to be irrotational. Then there exists a
scalar potential, φ(r), such that
F = ∇φ .   (1.45)
Application. A force field F such that ∇ × F = 0 is said to be conservative. Gravity is a
conservative force field. The above result shows that we can define a gravitational potential φ
such that F = ∇φ.
(b) Suppose that ∇ · B = 0; the vector field B(r) is said to be solenoidal. Then there exists a
non-unique vector potential, A(r), such that
B = ∇ × A .   (1.46)
Application. One of Maxwell's equations for a magnetic field, B, states that ∇ · B = 0. The
above result shows that we can define a magnetic vector potential, A, such that B = ∇ × A.
Example. Evaluate ∇ · (∇p × ∇q), where p and q are scalar fields. We will use this result later.
Answer. Identify ∇p and ∇q with F and G respectively in the vector identity (1.41c). Then it
follows from using (1.44) that
∇ · (∇p × ∇q) = ∇q · (∇ × ∇p) - ∇p · (∇ × ∇q) = 0 .   (1.47)
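These two results, and the example (1.47), can be confirmed for arbitrary smooth fields with a few lines of sympy (an added aside, not part of the original notes; sympy.vector assumed available):

    import sympy as sp
    from sympy.vector import CoordSys3D, divergence, curl, gradient

    N = CoordSys3D('N')
    x, y, z = N.x, N.y, N.z

    phi = sp.Function('phi')(x, y, z)                        # arbitrary scalar field
    Fx, Fy, Fz = [sp.Function(n)(x, y, z) for n in ('Fx', 'Fy', 'Fz')]
    F = Fx*N.i + Fy*N.j + Fz*N.k                             # arbitrary vector field

    print(curl(gradient(phi)))                               # -> zero vector, cf. (1.42)
    print(sp.simplify(divergence(curl(F))))                  # -> 0, cf. (1.43)

    # Example (1.47): div(grad p x grad q) = 0
    p, q = sp.Function('p')(x, y, z), sp.Function('q')(x, y, z)
    print(sp.simplify(divergence(gradient(p).cross(gradient(q)))))   # -> 0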
1.5.2 The Laplacian Operator ∇²
From the definitions of div and grad
div (grad φ) = ∇ · (∇φ) = ∂/∂x ( ∂φ/∂x ) + ∂/∂y ( ∂φ/∂y ) + ∂/∂z ( ∂φ/∂z )
             = ( ∂²/∂x² + ∂²/∂y² + ∂²/∂z² ) φ .   (1.48)
We conclude that the Laplacian operator ∇² ≡ ∇ · ∇ is given in Cartesian coordinates by
∇² ≡ ∇ · ∇ = ∂²/∂x² + ∂²/∂y² + ∂²/∂z² .   (1.49)
Remarks.
1. The Laplacian operator ∇² is very important in the natural sciences. For instance it occurs in
(a) Poisson's equation for a potential φ(r):
∇²φ = ρ ,   (1.50a)
where (with a suitable normalisation)
i. ρ(r) is charge density in electromagnetism (when (1.50a) relates charge and electric potential);
ii. ρ(r) is mass density in gravitation (when (1.50a) relates mass and gravitational potential).
ii. (r) is mass density in gravitation (when (1.50a) relates mass and gravitational potential).
Natural Sciences Tripos: IB Mathematical Methods I 9 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
(b) Schrodingers equation for a non-relativistic quantum mechanical particle of mass m in a
potential V (r):
2
2m
2
+V (r) = i
t
, (1.50b)
where is the quantum mechanical wave function and is Plancks constant divided by 2.
(c) Helmholtzs equation
2
f +
2
f = 0 , (1.50c)
which governs the propagation of xed frequency waves (e.g. xed frequency sound waves).
Helmholtzs equation is a 3D generalisation of the simple harmonic resonator
d
2
f
dx
2
+
2
f = 0 .
2/03
2. Although the Laplacian has been introduced by reference to its eect on a scalar eld (in our
case ), it also has meaning when applied to vectors. However some care is needed. On the rst
example sheet you will prove the vector identity
(F) = ( F)
2
F. (1.51a)
The Laplacian acting on a vector is conventionally dened by rearranging this identity to obtain
2
F = ( F) (F) . (1.51b)
2/04
1.5.3 Examples
1. Find ∇²rⁿ = div(∇rⁿ) and curl(∇rⁿ). We will use this result later.
Answer. Put f(r) = rⁿ in (1.15b) to obtain
∇rⁿ = n r^{n-1} r/r = n r^{n-2} (x, y, z) .   (1.52)
So from the definition of divergence (1.32):
∇²rⁿ = ∇ · (∇rⁿ) = ∂(n r^{n-2} x)/∂x + ∂(n r^{n-2} y)/∂y + ∂(n r^{n-2} z)/∂z
     = n r^{n-2} + n(n-2) r^{n-3} (x/r) x + · · ·   using (1.13a)
     = 3n r^{n-2} + n(n-2) r^{n-4} (x² + y² + z²)
     = n(n+1) r^{n-2} .   (1.53)
From the definition of curl (1.34a):
∇ × (∇rⁿ) = ( ∂(n r^{n-2} z)/∂y - ∂(n r^{n-2} y)/∂z , . . . , . . . )
          = ( n(n-2) r^{n-3} (y/r) z - n(n-2) r^{n-3} (z/r) y , . . . , . . . )   using (1.13a)
          = 0 .   (1.54)
Check. Note that from setting n = 2 in (1.52) we have that ∇r² = 2r. It follows that (1.53) and
(1.54) with n = 2 reproduce (1.37) and (1.38) respectively. (1.54) also follows from (1.42).
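Result (1.53) can be spot-checked for particular values of n with sympy (an added aside, not in the original notes); n = 1 also confirms (1.55c) below, and n = -1 gives the familiar result that 1/r is harmonic away from the origin.

    import sympy as sp

    x, y, z = sp.symbols('x y z', positive=True)
    r = sp.sqrt(x**2 + y**2 + z**2)

    # Check (1.53): the Laplacian of r^n equals n(n+1) r^(n-2)
    for n in (-1, 1, 2, 3):
        lap = sum(sp.diff(r**n, v, 2) for v in (x, y, z))
        print(n, sp.simplify(lap - n*(n + 1)*r**(n - 2)))   # -> 0 in each case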
2. Unlectured. Find the Laplacian of (sin r)/r.
Answer. Since the Laplacian consists of first taking a gradient, we first note from using result
(1.15a), i.e. ∇f(r) = f′(r)∇r, that
∇( sin r / r ) = ( cos r / r - sin r / r² ) ∇r .   (1.55a)
Further, we recall from (1.14) that
∇r = r/r ,   (1.55b)
and also from (1.53) with n = 1 that
∇ · (∇r) = 2/r .   (1.55c)
Hence
∇²( sin r / r ) = ∇ · ∇( sin r / r )
   = ∇ · [ ( cos r / r - sin r / r² ) ∇r ]   from (1.55a)
   = ( cos r / r - sin r / r² ) ∇ · ∇r + ∇r · ∇( cos r / r - sin r / r² )   from identity (1.41a)
   = 2 ( cos r / r² - sin r / r³ ) + (r/r) · ∇( cos r / r - sin r / r² )   from (1.55b) & (1.55c)
   = 2 ( cos r / r² - sin r / r³ ) + (r/r) · ( - sin r / r - 2 cos r / r² + 2 sin r / r³ ) ∇r   using (1.15a) again
   = - sin r / r .   (1.56)
Remark. It follows that f = (sin r)/r satisfies Helmholtz's equation (1.50c) for ω = 1.
1.6 The Divergence Theorem and Stokes' Theorem

These are two very important integral theorems for vector fields that have many scientific applications.

1.6.1 The Divergence Theorem (Gauss' Theorem)

Divergence Theorem. Let $S$ be a nice surface⁹ enclosing a volume $V$ in $\mathbb{R}^3$, with a normal $\mathbf{n}$ that points outwards from $V$. Let $\mathbf{u}$ be a nice vector field.¹⁰ Then

Key Result.
$\iiint_V \nabla\cdot\mathbf{u}\, dV = \iint_{S(V)} \mathbf{u}\cdot d\mathbf{S}\,,$   (1.57)

where $dV$ is the volume element, $d\mathbf{S} = \mathbf{n}\,dS$ is the vector surface element, $\mathbf{n}$ is the unit normal to the surface $S$ and $dS$ is a small element of surface area. In Cartesian coordinates

$dV = dx\, dy\, dz\,,$   (1.58a)

and

$d\mathbf{S} = \epsilon_x\, dy\, dz\, \mathbf{e}_x + \epsilon_y\, dz\, dx\, \mathbf{e}_y + \epsilon_z\, dx\, dy\, \mathbf{e}_z\,,$   (1.58b)

where $\epsilon_x = \mathrm{sign}(\mathbf{n}\cdot\mathbf{e}_x)$, $\epsilon_y = \mathrm{sign}(\mathbf{n}\cdot\mathbf{e}_y)$ and $\epsilon_z = \mathrm{sign}(\mathbf{n}\cdot\mathbf{e}_z)$.

At a point on the surface, $\mathbf{u}\cdot\mathbf{n}$ is the flux of $\mathbf{u}$ across the surface at that point. Hence the divergence theorem states that $\nabla\cdot\mathbf{u}$ integrated over a volume $V$ is equal to the total flux of $\mathbf{u}$ across the closed surface $S$ surrounding the volume.

⁹ For instance, a bounded, piecewise smooth, orientated, non-intersecting surface.
¹⁰ For instance a vector field with continuous first-order partial derivatives throughout $V$.
Remark. The divergence theorem relates a triple integral to a double integral. This is analogous to the second fundamental theorem of calculus, i.e.

$\int_{h_1}^{h_2} \frac{df}{dz}\, dz = f(h_2) - f(h_1)\,,$   (1.59)

which relates a single integral to a function.

Outline Proof. Suppose that $S$ is a surface enclosing a volume $V$ such that Cartesian axes can be chosen so that any line parallel to any one of the axes meets $S$ in just one or two points (e.g. a convex surface). We observe that

$\iiint_V \nabla\cdot\mathbf{u}\, dV = \iiint_V \left(\frac{\partial u_x}{\partial x} + \frac{\partial u_y}{\partial y} + \frac{\partial u_z}{\partial z}\right) dV$

comprises three terms; we initially concentrate on the $\iiint_V \frac{\partial u_z}{\partial z}\, dV$ term.

Let region $A$ be the projection of $S$ onto the $xy$-plane. Let the lower/upper surfaces, $S_1$/$S_2$ respectively, be parameterised by

$S_1:\ \mathbf{r} = (x, y, h_1(x, y))\,,\qquad S_2:\ \mathbf{r} = (x, y, h_2(x, y))\,.$

Then using the second fundamental theorem of calculus (1.59)

$\iiint_V \frac{\partial u_z}{\partial z}\, dx\, dy\, dz = \iint_A \left[\int_{z=h_1}^{h_2} \frac{\partial u_z}{\partial z}\, dz\right] dx\, dy = \iint_A \big(u_z(x, y, h_2(x, y)) - u_z(x, y, h_1(x, y))\big)\, dx\, dy\,.$   (1.60)

Now consider the projection of a surface element $dS$ on the upper surface onto the $xy$ plane. It follows geometrically that $dx\, dy = |\cos\theta|\, dS$, where $\theta$ is the angle between $\mathbf{e}_z$ and the unit normal $\mathbf{n}$; hence on $S_2$

$dx\, dy = \mathbf{e}_z\cdot\mathbf{n}\, dS = \mathbf{e}_z\cdot d\mathbf{S}\,.$   (1.61a)

On the lower surface $S_1$ we need to dot $\mathbf{n}$ with $-\mathbf{e}_z$ in order to get a positive area; hence

$dx\, dy = -\mathbf{e}_z\cdot d\mathbf{S}\,.$   (1.61b)

We note that (1.61a) and (1.61b) are consistent with (1.58b) once the tricky issue of signs is sorted out. Using (1.58a), (1.61a) and (1.61b), equation (1.60) can be rewritten as

$\iiint_V \frac{\partial u_z}{\partial z}\, dV = \iint_{S_2} u_z\, \mathbf{e}_z\cdot d\mathbf{S} + \iint_{S_1} u_z\, \mathbf{e}_z\cdot d\mathbf{S} = \iint_S u_z\, \mathbf{e}_z\cdot d\mathbf{S}\,,$   (1.62a)

since $S_1 + S_2 = S$. Similarly by permutation (i.e. $x \to y$, $y \to z$ and $z \to x$),

$\iiint_V \frac{\partial u_y}{\partial y}\, dV = \iint_S u_y\, \mathbf{e}_y\cdot d\mathbf{S}\,,\qquad \iiint_V \frac{\partial u_x}{\partial x}\, dV = \iint_S u_x\, \mathbf{e}_x\cdot d\mathbf{S}\,.$   (1.62b)

Adding the above results we obtain the divergence theorem (1.57):

$\iiint_V \nabla\cdot\mathbf{u}\, dV = \iint_S \mathbf{u}\cdot d\mathbf{S}\,.$
1.6.2 Stokes' Theorem

Let $S$ be a nice open surface bounded by a nice closed curve $C$.¹¹ Let $\mathbf{u}(\mathbf{r})$ be a nice vector field.¹² Then

Key Result.
$\iint_S (\nabla\times\mathbf{u})\cdot d\mathbf{S} = \oint_C \mathbf{u}\cdot d\mathbf{r}\,,$   (1.63)

where the line integral is taken in the direction of $C$ as specified by the right-hand rule.

Stokes' theorem thus states that the flux of $\nabla\times\mathbf{u}$ across an open surface $S$ is equal to the circulation of $\mathbf{u}$ round the bounding curve $C$.

Outline Proof. Given an extra half lecture I might just be able to get somewhere without losing too many of you. If you are really interested then my IA Mathematical Tripos notes on Vector Calculus are available on the WWW at http://damtp.cam.ac.uk/user/cowley/teaching/.

1.6.3 Examples and Applications

1. A body is acted on by a hydrostatic pressure force $p = -\rho g z$, where $\rho$ is the density of the surrounding fluid, $g$ is gravity and $z$ is the vertical coordinate. Find a simplified expression for the pressure force on the body starting from

$\mathbf{F} = -\iint_S p\, d\mathbf{S}\,.$   (1.64)

Answer. Consider the individual components of $\mathbf{F}$ and use the divergence theorem. Then

$\mathbf{e}_z\cdot\mathbf{F} = -\iint_S p\, \mathbf{e}_z\cdot d\mathbf{S} = -\iiint_V \nabla\cdot(p\,\mathbf{e}_z)\, dV = -\iiint_V \frac{\partial(-\rho g z)}{\partial z}\, dV = \rho g \iiint_V dV = Mg\,,$   (1.65a)

where $M$ is the mass of the fluid displaced by the body. Similarly

$\mathbf{e}_x\cdot\mathbf{F} = -\iiint_V \nabla\cdot(p\,\mathbf{e}_x)\, dV = -\iiint_V \frac{\partial(-\rho g z)}{\partial x}\, dV = 0\,,$   (1.65b)

and $\mathbf{e}_y\cdot\mathbf{F} = 0$. Hence we have Archimedes' Theorem that an immersed body experiences a loss of weight equal to the weight of the fluid displaced:

$\mathbf{F} = Mg\,\mathbf{e}_z\,.$   (1.65c)

2. Show that provided there are no singularities, the integral

$\int_C \nabla\varphi\cdot d\mathbf{r}\,,$   (1.66)

¹¹ Or to be slightly more precise: let $S$ be a piecewise smooth, open, orientated, non-intersecting surface bounded by a simple, piecewise smooth, closed curve $C$.
¹² For instance a vector field with continuous first-order partial derivatives on $S$.
where $\varphi$ is a scalar field and $C$ is an open path joining two fixed points $A$ and $B$, is independent of the path chosen between the points.

Answer. Consider two such paths: $C_1$ and $C_2$. Form a closed curve $\widehat{C}$ from these two curves. Then using Stokes' Theorem and the result (1.42) that a curl of a gradient is zero, we have that

$\int_{C_1} \nabla\varphi\cdot d\mathbf{r} - \int_{C_2} \nabla\varphi\cdot d\mathbf{r} = \oint_{\widehat{C}} \nabla\varphi\cdot d\mathbf{r} = \iint_{\widehat{S}} \nabla\times(\nabla\varphi)\cdot d\mathbf{S} = 0\,,$

where $\widehat{S}$ is a nice open surface bounding $\widehat{C}$. Hence

$\int_{C_1} \nabla\varphi\cdot d\mathbf{r} = \int_{C_2} \nabla\varphi\cdot d\mathbf{r}\,.$   (1.67)

Application. If $\varphi$ is the gravitational potential, then $\mathbf{g} = -\nabla\varphi$ is the gravitational force, and $\int_C \nabla\varphi\cdot d\mathbf{r}$ is the work done against gravity in moving from $A$ to $B$. The above result demonstrates that the work done is independent of path.
1.6.4 Interpretation of divergence

Let a volume $V$ be enclosed by a surface $S$, and consider a limit process in which the greatest diameter of $V$ tends to zero while keeping the point $\mathbf{r}_0$ inside $V$. Then from Taylor's theorem with $\mathbf{r} = \mathbf{r}_0 + \delta\mathbf{r}$,

$\iiint_V \nabla\cdot\mathbf{u}(\mathbf{r})\, dV = \iiint_V \big(\nabla\cdot\mathbf{u}(\mathbf{r}_0) + \dots\big)\, dV = \nabla\cdot\mathbf{u}(\mathbf{r}_0)\,|V| + \dots\,,$

where $|V|$ is the volume of $V$. Thus using the divergence theorem (1.57)

$\nabla\cdot\mathbf{u} = \lim_{|V|\to 0} \frac{1}{|V|} \iint_S \mathbf{u}\cdot d\mathbf{S}\,,$   (1.68)

where $S$ is any nice small closed surface enclosing a volume $V$. It follows that $\nabla\cdot\mathbf{u}$ can be interpreted as the net rate of flux outflow at $\mathbf{r}_0$ per unit volume.

Application. Suppose that $\mathbf{v}$ is a velocity field. Then

$\nabla\cdot\mathbf{v} > 0\ \Rightarrow\ \iint_S \mathbf{v}\cdot d\mathbf{S} > 0\ \Rightarrow$ net positive flux $\Rightarrow$ there exists a source at $\mathbf{r}_0$;
$\nabla\cdot\mathbf{v} < 0\ \Rightarrow\ \iint_S \mathbf{v}\cdot d\mathbf{S} < 0\ \Rightarrow$ net negative flux $\Rightarrow$ there exists a sink at $\mathbf{r}_0$.
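As an illustration of the divergence theorem itself, the following is a minimal sympy sketch that compares the two sides of (1.57) for one simple choice of field and volume; the field $\mathbf{u} = (xy,\, yz,\, zx)$ and the unit cube are my own illustrative choices.

```python
# Sketch: compare both sides of the divergence theorem (1.57) for
# u = (x*y, y*z, z*x) over the unit cube [0,1]^3 (choices illustrative).
import sympy as sp

x, y, z = sp.symbols('x y z')
u = (x*y, y*z, z*x)

# Volume integral of div u over the cube
div_u = sp.diff(u[0], x) + sp.diff(u[1], y) + sp.diff(u[2], z)
vol = sp.integrate(div_u, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# Outward flux through the six faces (normals +/- e_x, e_y, e_z)
flux  = sp.integrate(u[0].subs(x, 1) - u[0].subs(x, 0), (y, 0, 1), (z, 0, 1))
flux += sp.integrate(u[1].subs(y, 1) - u[1].subs(y, 0), (x, 0, 1), (z, 0, 1))
flux += sp.integrate(u[2].subs(z, 1) - u[2].subs(z, 0), (x, 0, 1), (y, 0, 1))

print(vol, flux)   # both should be 3/2
```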
1.6.5 Interpretation of curl

Let an open smooth surface $S$ be bounded by a curve $C$. Consider a limit process in which the point $\mathbf{r}_0$ remains on $S$, the greatest diameter of $S$ tends to zero, and the normals at all points on the surface tend to a specific direction (i.e. the value of $\mathbf{n}$ at $\mathbf{r}_0$). Then from Taylor's theorem with $\mathbf{r} = \mathbf{r}_0 + \delta\mathbf{r}$,
$\iint_S \big(\nabla\times\mathbf{u}(\mathbf{r})\big)\cdot d\mathbf{S} = \iint_S \big(\nabla\times\mathbf{u}(\mathbf{r}_0) + \dots\big)\cdot d\mathbf{S} = \big(\nabla\times\mathbf{u}(\mathbf{r}_0)\big)\cdot\mathbf{n}\,|S| + \dots\,,$

where $|S|$ is the area of $S$. Thus using Stokes' theorem (1.63)

$\mathbf{n}\cdot(\nabla\times\mathbf{u}) = \lim_{|S|\to 0} \frac{1}{|S|} \oint_C \mathbf{u}\cdot d\mathbf{r}\,,$   (1.69)

where $S$ is any nice small open surface with a bounding curve $C$. It follows that $\mathbf{n}\cdot(\nabla\times\mathbf{u})$ can be interpreted as the circulation about $\mathbf{n}$ at $\mathbf{r}_0$ per unit area.

Application.

Consider a rigid body rotating with angular velocity $\boldsymbol{\omega}$ about an axis through $O$. Then the velocity at a point $\mathbf{r}$ in the body is given by

$\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}\,.$   (1.70)

Suppose that $C$ is a circle of radius $a$ in a plane normal to $\boldsymbol{\omega}$. Then the circulation of $\mathbf{v}$ around $C$ is

$\oint_C \mathbf{v}\cdot d\mathbf{r} = \int_0^{2\pi} (\omega a)\, a\, d\theta = 2\pi\omega a^2\,.$   (1.71)

Hence from (1.69)

$\hat{\boldsymbol{\omega}}\cdot(\nabla\times\mathbf{v}) = \lim_{a\to 0} \frac{1}{\pi a^2} \oint_C \mathbf{v}\cdot d\mathbf{r} = 2\omega\,.$   (1.72)

We conclude that the curl is a measure of the local rotation of a vector field.

Exercise. Show by direct evaluation that if $\mathbf{v} = \boldsymbol{\omega}\times\mathbf{r}$ then $\nabla\times\mathbf{v} = 2\boldsymbol{\omega}$.
1.7 Orthogonal Curvilinear Coordinates

1.7.1 What Are Orthogonal Curvilinear Coordinates?

There are many ways to describe the position of points in space. One way is to define three independent sets of surfaces, each parameterised by a single variable (for Cartesian coordinates these are orthogonal planes parameterised, say, by the point on the axis that they intercept). Then any point has coordinates given by the labels for the three surfaces that intersect at that point.

The unit vectors analogous to $\mathbf{e}_1$, etc. are the unit normals to these surfaces. Such coordinates are called curvilinear. They are not generally much use unless the orthonormality condition (1.2), i.e. $\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}$, holds; in which case they are called orthogonal curvilinear coordinates. The most common examples are spherical and cylindrical polar coordinates. For instance in the case of spherical polar coordinates the independent sets of surfaces are spherical shells and planes of constant latitude and longitude.

Key Point. It is very important to realise that there is a key difference between Cartesian coordinates and other orthogonal curvilinear coordinates. In Cartesian coordinates the directions of the basis vectors $\mathbf{e}_x$, $\mathbf{e}_y$, $\mathbf{e}_z$ are independent of position. This is not the case in other coordinate systems; for instance, $\mathbf{e}_r$, the normal
to a spherical shell changes direction with position on the shell. It is sometimes helpful to display this
dependence on position explicitly:
$\mathbf{e}_i \equiv \mathbf{e}_i(\mathbf{r})\,.$   (1.73)

1.7.2 Relationships Between Coordinate Systems

Suppose that we have non-Cartesian coordinates, $q_i$ ($i = 1, 2, 3$). Since we can express one coordinate system in terms of another, there will be a functional dependence of the $q_i$ on, say, Cartesian coordinates $x$, $y$, $z$, i.e.

$q_i \equiv q_i(x, y, z) \quad (i = 1, 2, 3)\,.$   (1.74)

For cylindrical polar coordinates and spherical polar coordinates we know that:

Cylindrical Polar Coordinates: $q_1 = \rho = (x^2+y^2)^{1/2}$, $q_2 = \varphi = \tan^{-1}(y/x)$, $q_3 = z$.
Spherical Polar Coordinates: $q_1 = r = (x^2+y^2+z^2)^{1/2}$, $q_2 = \theta = \tan^{-1}\!\big((x^2+y^2)^{1/2}/z\big)$, $q_3 = \varphi = \tan^{-1}(y/x)$.

Remarks

1. Note that $q_i = c_i$ ($i = 1, 2, 3$), where the $c_i$ are constants, define three independent sets of surfaces, each labelled by a parameter (i.e. the $c_i$). As discussed above, any point has coordinates given by the labels for the three surfaces that intersect at that point.

2. The equation (1.74) can be viewed as three simultaneous equations for the three unknowns $x$, $y$, $z$. In general these equations can be solved to yield the position vector $\mathbf{r}$ as a function of $\mathbf{q} = (q_1, q_2, q_3)$, i.e. $\mathbf{r} \equiv \mathbf{r}(\mathbf{q})$ or

$x = x(q_1, q_2, q_3)\,,\quad y = y(q_1, q_2, q_3)\,,\quad z = z(q_1, q_2, q_3)\,.$   (1.75)

For instance:

Cylindrical Polar Coordinates: $x = \rho\cos\varphi$, $y = \rho\sin\varphi$, $z = z$.
Spherical Polar Coordinates: $x = r\cos\varphi\sin\theta$, $y = r\sin\varphi\sin\theta$, $z = r\cos\theta$.

1.7.3 Incremental Change in Position or Length.

Consider an infinitesimal change in position. Then, by the chain rule, the change $dx$ in $x(q_1, q_2, q_3)$ due to changes $dq_j$ in $q_j$ ($j = 1, 2, 3$) is

$dx = \frac{\partial x}{\partial q_1}\, dq_1 + \frac{\partial x}{\partial q_2}\, dq_2 + \frac{\partial x}{\partial q_3}\, dq_3 = \sum_j \frac{\partial x}{\partial q_j}\, dq_j\,.$   (1.76)
Using similar results for $dy$ and $dz$, the vector displacement $d\mathbf{r}$ is

$d\mathbf{r} = dx\,\mathbf{e}_x + dy\,\mathbf{e}_y + dz\,\mathbf{e}_z$
$= \Big(\sum_j \frac{\partial x}{\partial q_j}\, dq_j\Big)\mathbf{e}_x + \Big(\sum_j \frac{\partial y}{\partial q_j}\, dq_j\Big)\mathbf{e}_y + \Big(\sum_j \frac{\partial z}{\partial q_j}\, dq_j\Big)\mathbf{e}_z$
$= \sum_j \Big(\frac{\partial x}{\partial q_j}\,\mathbf{e}_x + \frac{\partial y}{\partial q_j}\,\mathbf{e}_y + \frac{\partial z}{\partial q_j}\,\mathbf{e}_z\Big)\, dq_j = \sum_j \mathbf{h}_j\, dq_j\,,$   (1.77a)

where

$\mathbf{h}_j = \frac{\partial x}{\partial q_j}\,\mathbf{e}_x + \frac{\partial y}{\partial q_j}\,\mathbf{e}_y + \frac{\partial z}{\partial q_j}\,\mathbf{e}_z = \frac{\partial\mathbf{r}(\mathbf{q})}{\partial q_j} \quad (j = 1, 2, 3)\,.$   (1.77b)

Thus the infinitesimal change in position $d\mathbf{r}$ is a vector sum of displacements $\mathbf{h}_j(\mathbf{r})\, dq_j$ along each of the three $q$-axes through $\mathbf{r}$.

Remark. Note that the $\mathbf{h}_j$ are in general functions of $\mathbf{r}$, and hence the directions of the $q$-axes vary in space. Consequently the $q$-axes are curves rather than straight lines; the coordinate system is said to be a curvilinear one.

The vectors $\mathbf{h}_j$ are not necessarily unit vectors, so it is convenient to write

$\mathbf{h}_j = h_j\,\mathbf{e}_j \quad (j = 1, 2, 3)\,,$   (1.78)

where the $h_j$ are the lengths of the $\mathbf{h}_j$, and the $\mathbf{e}_j$ are unit vectors, i.e.

$h_j = \left|\frac{\partial\mathbf{r}}{\partial q_j}\right| \quad\text{and}\quad \mathbf{e}_j = \frac{1}{h_j}\frac{\partial\mathbf{r}}{\partial q_j} \quad (j = 1, 2, 3)\,.$   (1.79)

Again we emphasise that the directions of the $\mathbf{e}_j(\mathbf{r})$ will, in general, depend on position.

1.7.4 Orthogonality

For a general $q_j$ coordinate system the $\mathbf{e}_j$ are not necessarily mutually orthogonal, i.e. in general

$\mathbf{e}_i\cdot\mathbf{e}_j \neq 0 \quad\text{for } i \neq j\,.$

However, for orthogonal curvilinear coordinates the $\mathbf{e}_i$ are required to be mutually orthogonal at all points in space, i.e.

$\mathbf{e}_i\cdot\mathbf{e}_j = 0 \quad\text{if } i \neq j\,.$

Since by definition the $\mathbf{e}_j$ are unit vectors, we thus have that

$\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}\,.$   (1.80)

An identity. Recall from (0.11b) that

$\sum_{j=1}^{3} a_j\,\delta_{ij} = a_i\,.$   (1.81)
Incremental Distance. In an orthogonal curvilinear coordinate system the expression for the incremental distance, $|d\mathbf{r}|^2$, simplifies. We find that

$|d\mathbf{r}|^2 = d\mathbf{r}\cdot d\mathbf{r} = \Big(\sum_i h_i\, dq_i\,\mathbf{e}_i\Big)\cdot\Big(\sum_j h_j\, dq_j\,\mathbf{e}_j\Big)$   from (1.77a) and (1.78)
$= \sum_{i,j} (h_i\, dq_i)(h_j\, dq_j)\,\mathbf{e}_i\cdot\mathbf{e}_j = \sum_{i,j} (h_i\, dq_i)(h_j\, dq_j)\,\delta_{ij}$   from (1.80)
$= \sum_i h_i^2\, (dq_i)^2\,.$   from (1.81) with $a_j = h_j\, dq_j$   (1.82)

Key Result.

Remarks.

1. All information about orthogonal curvilinear coordinate systems is encoded in the three functions $h_j$ ($j = 1, 2, 3$).

2. It is conventional to order the $q_i$ so that the coordinate system is right-handed.
1.7.5 Spherical Polar Coordinates

In this case $q_1 = r$, $q_2 = \theta$, $q_3 = \varphi$ and

$\mathbf{r} = r\sin\theta\cos\varphi\,\mathbf{e}_x + r\sin\theta\sin\varphi\,\mathbf{e}_y + r\cos\theta\,\mathbf{e}_z = (r\sin\theta\cos\varphi,\ r\sin\theta\sin\varphi,\ r\cos\theta)\,.$   (1.83)

Hence

$\frac{\partial\mathbf{r}}{\partial q_1} = \frac{\partial\mathbf{r}}{\partial r} = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta)\,,$
$\frac{\partial\mathbf{r}}{\partial q_2} = \frac{\partial\mathbf{r}}{\partial\theta} = (r\cos\theta\cos\varphi,\ r\cos\theta\sin\varphi,\ -r\sin\theta)\,,$
$\frac{\partial\mathbf{r}}{\partial q_3} = \frac{\partial\mathbf{r}}{\partial\varphi} = (-r\sin\theta\sin\varphi,\ r\sin\theta\cos\varphi,\ 0)\,,$

and so

$h_1 = \left|\frac{\partial\mathbf{r}}{\partial q_1}\right| = 1\,,\quad \mathbf{e}_1 = \mathbf{e}_r = (\sin\theta\cos\varphi,\ \sin\theta\sin\varphi,\ \cos\theta)\,,$   (1.84a)
$h_2 = \left|\frac{\partial\mathbf{r}}{\partial q_2}\right| = r\,,\quad \mathbf{e}_2 = \mathbf{e}_\theta = (\cos\theta\cos\varphi,\ \cos\theta\sin\varphi,\ -\sin\theta)\,,$   (1.84b)
$h_3 = \left|\frac{\partial\mathbf{r}}{\partial q_3}\right| = r\sin\theta\,,\quad \mathbf{e}_3 = \mathbf{e}_\varphi = (-\sin\varphi,\ \cos\varphi,\ 0)\,,$   (1.84c)

and

$d\mathbf{r} = \sum_j h_j\, dq_j\,\mathbf{e}_j = dr\,\mathbf{e}_r + r\, d\theta\,\mathbf{e}_\theta + r\sin\theta\, d\varphi\,\mathbf{e}_\varphi\,.$   (1.85)
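If you want to confirm the scale factors above by machine, the following is a minimal sympy sketch that differentiates (1.83) as in (1.77b) and (1.79); the symbol names are illustrative only.

```python
# Sketch: recover the spherical polar scale factors of (1.84a)-(1.84c)
# directly from r(q) in (1.83) using sympy (names illustrative).
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
pos = sp.Matrix([r*sp.sin(th)*sp.cos(ph), r*sp.sin(th)*sp.sin(ph), r*sp.cos(th)])

for q in (r, th, ph):
    h_vec = pos.diff(q)                      # h_j = dr/dq_j, cf. (1.77b)
    h_sq = sp.simplify(h_vec.dot(h_vec))     # h_j squared, cf. (1.79)
    print(q, h_sq)   # expect 1, r**2, r**2*sin(theta)**2, i.e. h = 1, r, r sin(theta)
```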
1.7.6 Cylindrical Polar Coordinates

In this case $q_1 = \rho$, $q_2 = \varphi$, $q_3 = z$ and

$\mathbf{r} = \rho\cos\varphi\,\mathbf{e}_x + \rho\sin\varphi\,\mathbf{e}_y + z\,\mathbf{e}_z = (\rho\cos\varphi,\ \rho\sin\varphi,\ z)\,.$

Exercise. Show that

$\frac{\partial\mathbf{r}}{\partial q_1} = \frac{\partial\mathbf{r}}{\partial\rho} = (\cos\varphi,\ \sin\varphi,\ 0)\,,\quad \frac{\partial\mathbf{r}}{\partial q_2} = \frac{\partial\mathbf{r}}{\partial\varphi} = (-\rho\sin\varphi,\ \rho\cos\varphi,\ 0)\,,\quad \frac{\partial\mathbf{r}}{\partial q_3} = \frac{\partial\mathbf{r}}{\partial z} = (0,\ 0,\ 1)\,,$

and hence that

$h_1 = \left|\frac{\partial\mathbf{r}}{\partial q_1}\right| = 1\,,\quad \mathbf{e}_1 = \mathbf{e}_\rho = (\cos\varphi,\ \sin\varphi,\ 0)\,,$   (1.86a)
$h_2 = \left|\frac{\partial\mathbf{r}}{\partial q_2}\right| = \rho\,,\quad \mathbf{e}_2 = \mathbf{e}_\varphi = (-\sin\varphi,\ \cos\varphi,\ 0)\,,$   (1.86b)
$h_3 = \left|\frac{\partial\mathbf{r}}{\partial q_3}\right| = 1\,,\quad \mathbf{e}_3 = \mathbf{e}_z = (0,\ 0,\ 1)\,.$   (1.86c)

Remarks.

$\mathbf{e}_i\cdot\mathbf{e}_j = \delta_{ij}$ and $\mathbf{e}_1\times\mathbf{e}_2 = \mathbf{e}_3$, i.e. cylindrical polar coordinates are a right-handed orthogonal curvilinear coordinate system.
$\mathbf{e}_\rho$ and $\mathbf{e}_\varphi$ are functions of position.

1.7.7 Volume and Surface Elements in Orthogonal Curvilinear Coordinates

From (1.77a) the displacements along the three $q$-axes through a point are $h_1\, dq_1\,\mathbf{e}_1$, $h_2\, dq_2\,\mathbf{e}_2$ and $h_3\, dq_3\,\mathbf{e}_3$, so the volume element is

$dV = h_1 h_2 h_3\, dq_1\, dq_2\, dq_3\,.$
The surface element can also be deduced for arbitrary orthogonal curvilinear coordinates. First consider the special case when $d\mathbf{S} \parallel \mathbf{e}_3$; then

$d\mathbf{S} = (h_1\, dq_1\,\mathbf{e}_1)\times(h_2\, dq_2\,\mathbf{e}_2) = h_1 h_2\, dq_1\, dq_2\,\mathbf{e}_3\,.$   (1.90)

In general

$d\mathbf{S} = \mathrm{sign}(\mathbf{n}\cdot\mathbf{e}_1)\, h_2 h_3\, dq_2\, dq_3\,\mathbf{e}_1 + \mathrm{sign}(\mathbf{n}\cdot\mathbf{e}_2)\, h_3 h_1\, dq_3\, dq_1\,\mathbf{e}_2 + \mathrm{sign}(\mathbf{n}\cdot\mathbf{e}_3)\, h_1 h_2\, dq_1\, dq_2\,\mathbf{e}_3\,.$   (1.91)
1.7.8 Gradient in Orthogonal Curvilinear Coordinates

First we recall from (1.10) that for Cartesian coordinates and infinitesimal displacements, $d\psi = d\mathbf{r}\cdot\nabla\psi$.

Definition. For curvilinear orthogonal coordinates (for which the basis vectors are in general functions of position), we define $\nabla\psi$ to be the vector such that for all $d\mathbf{r}$

$d\psi = d\mathbf{r}\cdot\nabla\psi\,.$   (1.92)

In order to determine the components of $\nabla\psi$ when $\psi$ is viewed as a function of $\mathbf{q}$ rather than $\mathbf{r}$, write

$\nabla\psi = \sum_i \mathbf{e}_i\,(\nabla\psi)_i\,,$   (1.93)

then from (1.77a), (1.78), (1.80), (1.81) and (1.92)

$d\psi = \Big(\sum_i \mathbf{e}_i\,(\nabla\psi)_i\Big)\cdot\Big(\sum_j h_j\,\mathbf{e}_j\, dq_j\Big) = \sum_{i,j} (\nabla\psi)_i\,(h_j\, dq_j)\,\mathbf{e}_i\cdot\mathbf{e}_j = \sum_i (\nabla\psi)_i\,(h_i\, dq_i)\,.$   (1.94)

But according to the chain rule, an infinitesimal change $d\mathbf{q}$ to $\mathbf{q}$ will lead to the following infinitesimal change in $\psi(q_1, q_2, q_3)$:

$d\psi = \sum_i \frac{\partial\psi}{\partial q_i}\, dq_i = \sum_i \Big(\frac{1}{h_i}\frac{\partial\psi}{\partial q_i}\Big)(h_i\, dq_i)\,.$   (1.95)

Hence, since (1.94) and (1.95) must hold for all $dq_i$,

$(\nabla\psi)_i = \frac{1}{h_i}\frac{\partial\psi}{\partial q_i}\,,$   (1.96)

and from (1.93)

$\nabla\psi = \sum_i \frac{\mathbf{e}_i}{h_i}\frac{\partial\psi}{\partial q_i} = \left(\frac{1}{h_1}\frac{\partial\psi}{\partial q_1},\ \frac{1}{h_2}\frac{\partial\psi}{\partial q_2},\ \frac{1}{h_3}\frac{\partial\psi}{\partial q_3}\right).$   (1.97)

Remark. Each term has dimensions of $\psi$/length.

As before, we can consider $\nabla\psi$ to be the result of acting on $\psi$ with the vector differential operator

Key Result.
$\nabla = \sum_i \mathbf{e}_i\,\frac{1}{h_i}\frac{\partial}{\partial q_i}\,.$   (1.98)
1.7.9 Examples of Gradients

Cylindrical Polar Coordinates. In cylindrical polar coordinates, the gradient is given from (1.86a), (1.86b) and (1.86c) to be

$\nabla\psi = \mathbf{e}_\rho\,\frac{\partial\psi}{\partial\rho} + \mathbf{e}_\varphi\,\frac{1}{\rho}\frac{\partial\psi}{\partial\varphi} + \mathbf{e}_z\,\frac{\partial\psi}{\partial z}\,.$   (1.99a)

Spherical Polar Coordinates. In spherical polar coordinates the gradient is given from (1.84a), (1.84b) and (1.84c) to be

$\nabla\psi = \mathbf{e}_r\,\frac{\partial\psi}{\partial r} + \mathbf{e}_\theta\,\frac{1}{r}\frac{\partial\psi}{\partial\theta} + \mathbf{e}_\varphi\,\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\varphi}\,.$   (1.99b)

1.7.10 Divergence and Curl

We can now use (1.98) to compute $\nabla\cdot\mathbf{F}$ and $\nabla\times\mathbf{F}$ in orthogonal curvilinear coordinates. However, first we need a preliminary result which is complementary to (1.79). Using (1.98), and the fact that (see (0.8b))

$\frac{\partial q_i}{\partial q_j} = \delta_{ij}\,,\ \text{i.e. that}\ \left(\frac{\partial q_1}{\partial q_1}\right)_{q_2, q_3} = 1\,,\ \left(\frac{\partial q_1}{\partial q_2}\right)_{q_1, q_3} = 0\,,\ \text{etc.},$   (1.100a)

we can show that

$\nabla q_i = \sum_j \frac{\mathbf{e}_j}{h_j}\frac{\partial q_i}{\partial q_j} = \sum_j \frac{\mathbf{e}_j}{h_j}\,\delta_{ij} = \frac{\mathbf{e}_i}{h_i}\,,\ \text{i.e. that}\ \mathbf{e}_i = h_i\,\nabla q_i\,.$   (1.100b)

We also recall that the $\mathbf{e}_i$ form an orthonormal right-handed basis; thus $\mathbf{e}_1 = \mathbf{e}_2\times\mathbf{e}_3$ (and cyclic permutations). Hence from (1.100b)

$\mathbf{e}_1 = h_2 h_3\,\nabla q_2\times\nabla q_3\,,\ \text{and cyclic permutations.}$   (1.100c)

Divergence. We have with a little bit of inspired rearrangement, and remembering to differentiate the $\mathbf{e}_i$ because they are position dependent:

$\nabla\cdot\mathbf{F} = \nabla\cdot\Big(\sum_i F_i\,\mathbf{e}_i\Big) = \nabla\cdot\Big[(h_2 h_3 F_1)\Big(\frac{\mathbf{e}_1}{h_2 h_3}\Big)\Big] + \text{cyclic permutations}$
$= \frac{\mathbf{e}_1}{h_2 h_3}\cdot\nabla(h_2 h_3 F_1) + h_2 h_3 F_1\,\nabla\cdot\Big(\frac{\mathbf{e}_1}{h_2 h_3}\Big) + \text{cyclic permutations}$   using (1.41a)
$= \frac{\mathbf{e}_1}{h_2 h_3}\cdot\sum_j \mathbf{e}_j\,\frac{1}{h_j}\frac{\partial}{\partial q_j}(h_2 h_3 F_1) + h_2 h_3 F_1\,\nabla\cdot(\nabla q_2\times\nabla q_3) + \text{cyclic permutations}\,.$   using (1.98) & (1.100c)

Recall from (1.80) that $\mathbf{e}_1\cdot\mathbf{e}_j = \delta_{1j}$, and from example (1.47), with $p = q_2$ and $q = q_3$, that

$\nabla\cdot(\nabla q_2\times\nabla q_3) = 0\,.$

It follows that

Key Result.
$\nabla\cdot\mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(h_2 h_3 F_1) + \frac{\partial}{\partial q_2}(h_3 h_1 F_2) + \frac{\partial}{\partial q_3}(h_1 h_2 F_3)\right].$   (1.101)

Cylindrical Polar Coordinates. From (1.86a), (1.86b), (1.86c) and (1.101)

$\mathrm{div}\,\mathbf{F} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho F_\rho) + \frac{1}{\rho}\frac{\partial F_\varphi}{\partial\varphi} + \frac{\partial F_z}{\partial z}\,.$   (1.102a)
Spherical Polar Coordinates. From (1.84a), (1.84b), (1.84c) and (1.101)

$\mathrm{div}\,\mathbf{F} = \frac{1}{r^2}\frac{\partial}{\partial r}\big(r^2 F_r\big) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\, F_\theta) + \frac{1}{r\sin\theta}\frac{\partial F_\varphi}{\partial\varphi}\,.$   (1.102b)
Curl. With a little bit of inspired rearrangement we have that

$\nabla\times\mathbf{F} = \nabla\times\Big(\sum_i F_i\,\mathbf{e}_i\Big) = \sum_i \nabla\times\Big[(h_i F_i)\Big(\frac{\mathbf{e}_i}{h_i}\Big)\Big]$
$= \sum_i \nabla(h_i F_i)\times\frac{\mathbf{e}_i}{h_i} + \sum_i h_i F_i\,\nabla\times(\nabla q_i)$   using (1.41b) & (1.100b)
$= \sum_{i,j} \Big[\frac{1}{h_i h_j}\frac{\partial(h_i F_i)}{\partial q_j}\Big]\,\mathbf{e}_j\times\mathbf{e}_i\,.$   using (1.42) & (1.98)

But $\mathbf{e}_1\times\mathbf{e}_2 = \mathbf{e}_3$ and cyclic permutations, and $\mathbf{e}_k\times\mathbf{e}_k = \mathbf{0}$, hence

$\nabla\times\mathbf{F} = \frac{\mathbf{e}_1}{h_2 h_3}\left[\frac{\partial(h_3 F_3)}{\partial q_2} - \frac{\partial(h_2 F_2)}{\partial q_3}\right] + \frac{\mathbf{e}_2}{h_3 h_1}\left[\frac{\partial(h_1 F_1)}{\partial q_3} - \frac{\partial(h_3 F_3)}{\partial q_1}\right] + \frac{\mathbf{e}_3}{h_1 h_2}\left[\frac{\partial(h_2 F_2)}{\partial q_1} - \frac{\partial(h_1 F_1)}{\partial q_2}\right].$   (1.103a)

All three components of the curl can be written in the concise form

Key Result.
$\nabla\times\mathbf{F} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix}\,.$   (1.103b)
Cylindrical Polar Coordinates. From (1.86a), (1.86b), (1.86c) and (1.103b)

$\mathrm{curl}\,\mathbf{F} = \frac{1}{\rho}\begin{vmatrix} \mathbf{e}_\rho & \rho\,\mathbf{e}_\varphi & \mathbf{e}_z \\ \partial/\partial\rho & \partial/\partial\varphi & \partial/\partial z \\ F_\rho & \rho F_\varphi & F_z \end{vmatrix}$   (1.104a)
$= \left(\frac{1}{\rho}\frac{\partial F_z}{\partial\varphi} - \frac{\partial F_\varphi}{\partial z}\,,\ \frac{\partial F_\rho}{\partial z} - \frac{\partial F_z}{\partial\rho}\,,\ \frac{1}{\rho}\frac{\partial(\rho F_\varphi)}{\partial\rho} - \frac{1}{\rho}\frac{\partial F_\rho}{\partial\varphi}\right).$   (1.104b)

Spherical Polar Coordinates. From (1.84a), (1.84b), (1.84c) and (1.103b)

$\mathrm{curl}\,\mathbf{F} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{e}_r & r\,\mathbf{e}_\theta & r\sin\theta\,\mathbf{e}_\varphi \\ \partial/\partial r & \partial/\partial\theta & \partial/\partial\varphi \\ F_r & r F_\theta & r\sin\theta\, F_\varphi \end{vmatrix}$   (1.105a)
$= \left(\frac{1}{r\sin\theta}\Big[\frac{\partial(\sin\theta\, F_\varphi)}{\partial\theta} - \frac{\partial F_\theta}{\partial\varphi}\Big]\,,\ \frac{1}{r\sin\theta}\frac{\partial F_r}{\partial\varphi} - \frac{1}{r}\frac{\partial(r F_\varphi)}{\partial r}\,,\ \frac{1}{r}\frac{\partial(r F_\theta)}{\partial r} - \frac{1}{r}\frac{\partial F_r}{\partial\theta}\right).$   (1.105b)
Remarks.
1. Each term in a divergence and curl has dimensions F/length.
2. The above formulae can also be derived in a more physical manner using the divergence theorem
and Stokes theorem respectively.
1.7.11 Laplacian in Orthogonal Curvilinear Coordinates

Suppose we substitute $\mathbf{F} = \nabla\psi$ into formula (1.101) for the divergence. Then since from (1.97)

$F_i = \frac{1}{h_i}\frac{\partial\psi}{\partial q_i}\,,$

we have that

$\nabla\cdot\nabla\psi = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_2 h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_3 h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_1 h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right].$   (1.106)

We thereby deduce that in a general orthogonal curvilinear coordinate system

$\nabla^2 = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_2 h_3}{h_1}\frac{\partial}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_3 h_1}{h_2}\frac{\partial}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_1 h_2}{h_3}\frac{\partial}{\partial q_3}\right)\right].$   (1.107)

Cylindrical Polar Coordinates. From (1.86a), (1.86b), (1.86c) and (1.107)

$\nabla^2\psi = \frac{1}{\rho}\frac{\partial}{\partial\rho}\!\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2}\,.$   (1.108a)

Spherical Polar Coordinates. From (1.84a), (1.84b), (1.84c) and (1.107)

$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}$   (1.108b)
$= \frac{1}{r}\frac{\partial^2}{\partial r^2}(r\psi) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\,.$   (1.108c)
Remark. We have found here only the form of $\nabla^2$ as a differential operator on scalar fields. As noted earlier, the action of the Laplacian on a vector field $\mathbf{F}$ is most easily defined using the vector identity

$\nabla^2\mathbf{F} = \nabla(\nabla\cdot\mathbf{F}) - \nabla\times(\nabla\times\mathbf{F})\,.$   (1.109)

Alternatively $\nabla^2\mathbf{F}$ can be evaluated by recalling that

$\nabla^2\mathbf{F} = \nabla^2(F_1\mathbf{e}_1 + F_2\mathbf{e}_2 + F_3\mathbf{e}_3)\,,$

and remembering (a) that the derivatives implied by the Laplacian act on the unit vectors too, and (b) that because the unit vectors are generally functions of position $(\nabla^2\mathbf{F})_i \neq \nabla^2 F_i$ (the exception being Cartesian coordinates).

1.7.12 Further Examples

Evaluate $\nabla\cdot\mathbf{r}$, $\nabla\times\mathbf{r}$, and $\nabla^2\!\left(\frac{1}{r}\right)$ in spherical polar coordinates, where $\mathbf{r} = r\,\mathbf{e}_r$. From (1.102b)

$\nabla\cdot\mathbf{r} = \frac{1}{r^2}\frac{\partial}{\partial r}\big(r^2\cdot r\big) = 3\,,$   as in (1.37).

From (1.105b)

$\nabla\times\mathbf{r} = \left(0\,,\ \frac{1}{r\sin\theta}\frac{\partial r}{\partial\varphi}\,,\ -\frac{1}{r}\frac{\partial r}{\partial\theta}\right) = (0,\ 0,\ 0)\,,$   as in (1.38).

From (1.108c) for $r \neq 0$

$\nabla^2\!\left(\frac{1}{r}\right) = \frac{1}{r}\frac{\partial^2}{\partial r^2}\!\left(r\cdot\frac{1}{r}\right) = 0\,,$   as in (1.53) with $n = -1$.   (1.110)
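The general formula (1.107) is easy to check by machine for these examples. Below is a minimal sympy sketch that applies (1.107) with the spherical polar scale factors to $1/r$ and to $\sin r/r$, recovering (1.110) and (1.56); the function and symbol names are my own illustrative choices.

```python
# Sketch: apply the curvilinear Laplacian (1.107) with spherical polar scale
# factors h = (1, r, r sin(theta)) to psi = 1/r and psi = sin(r)/r.
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
q = (r, th, ph)
h = (1, r, r*sp.sin(th))

def laplacian(psi):
    """Equation (1.107) for orthogonal curvilinear coordinates."""
    h1, h2, h3 = h
    return (sp.diff(h2*h3/h1*sp.diff(psi, q[0]), q[0]) +
            sp.diff(h3*h1/h2*sp.diff(psi, q[1]), q[1]) +
            sp.diff(h1*h2/h3*sp.diff(psi, q[2]), q[2])) / (h1*h2*h3)

print(sp.simplify(laplacian(1/r)))                        # expect 0 (r != 0), cf. (1.110)
print(sp.simplify(laplacian(sp.sin(r)/r) + sp.sin(r)/r))  # expect 0, cf. (1.56)
```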
1.7.13 Aide Memoire

Orthogonal Curvilinear Coordinates.

$\nabla\psi = \sum_i \mathbf{e}_i\,\frac{1}{h_i}\frac{\partial\psi}{\partial q_i}\,.$

$\mathrm{div}\,\mathbf{F} = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}(h_2 h_3 F_1) + \frac{\partial}{\partial q_2}(h_3 h_1 F_2) + \frac{\partial}{\partial q_3}(h_1 h_2 F_3)\right].$

$\mathrm{curl}\,\mathbf{F} = \frac{1}{h_1 h_2 h_3}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\ \partial/\partial q_1 & \partial/\partial q_2 & \partial/\partial q_3 \\ h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix}\,.$

$\nabla^2\psi = \frac{1}{h_1 h_2 h_3}\left[\frac{\partial}{\partial q_1}\!\left(\frac{h_2 h_3}{h_1}\frac{\partial\psi}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\!\left(\frac{h_3 h_1}{h_2}\frac{\partial\psi}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\!\left(\frac{h_1 h_2}{h_3}\frac{\partial\psi}{\partial q_3}\right)\right].$

Cylindrical Polar Coordinates: $q_1 = \rho$, $h_1 = 1$; $q_2 = \varphi$, $h_2 = \rho$; $q_3 = z$, $h_3 = 1$.

$\nabla\psi = \mathbf{e}_\rho\,\frac{\partial\psi}{\partial\rho} + \mathbf{e}_\varphi\,\frac{1}{\rho}\frac{\partial\psi}{\partial\varphi} + \mathbf{e}_z\,\frac{\partial\psi}{\partial z}\,.$

$\mathrm{div}\,\mathbf{F} = \frac{1}{\rho}\frac{\partial}{\partial\rho}(\rho F_\rho) + \frac{1}{\rho}\frac{\partial F_\varphi}{\partial\varphi} + \frac{\partial F_z}{\partial z}\,.$

$\mathrm{curl}\,\mathbf{F} = \frac{1}{\rho}\begin{vmatrix} \mathbf{e}_\rho & \rho\,\mathbf{e}_\varphi & \mathbf{e}_z \\ \partial/\partial\rho & \partial/\partial\varphi & \partial/\partial z \\ F_\rho & \rho F_\varphi & F_z \end{vmatrix} = \left(\frac{1}{\rho}\frac{\partial F_z}{\partial\varphi} - \frac{\partial F_\varphi}{\partial z}\,,\ \frac{\partial F_\rho}{\partial z} - \frac{\partial F_z}{\partial\rho}\,,\ \frac{1}{\rho}\frac{\partial(\rho F_\varphi)}{\partial\rho} - \frac{1}{\rho}\frac{\partial F_\rho}{\partial\varphi}\right).$

$\nabla^2\psi = \frac{1}{\rho}\frac{\partial}{\partial\rho}\!\left(\rho\frac{\partial\psi}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2\psi}{\partial\varphi^2} + \frac{\partial^2\psi}{\partial z^2}\,.$

Spherical Polar Coordinates: $q_1 = r$, $h_1 = 1$; $q_2 = \theta$, $h_2 = r$; $q_3 = \varphi$, $h_3 = r\sin\theta$.

$\nabla\psi = \mathbf{e}_r\,\frac{\partial\psi}{\partial r} + \mathbf{e}_\theta\,\frac{1}{r}\frac{\partial\psi}{\partial\theta} + \mathbf{e}_\varphi\,\frac{1}{r\sin\theta}\frac{\partial\psi}{\partial\varphi}\,.$

$\mathrm{div}\,\mathbf{F} = \frac{1}{r^2}\frac{\partial}{\partial r}\big(r^2 F_r\big) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}(\sin\theta\, F_\theta) + \frac{1}{r\sin\theta}\frac{\partial F_\varphi}{\partial\varphi}\,.$

$\mathrm{curl}\,\mathbf{F} = \frac{1}{r^2\sin\theta}\begin{vmatrix} \mathbf{e}_r & r\,\mathbf{e}_\theta & r\sin\theta\,\mathbf{e}_\varphi \\ \partial/\partial r & \partial/\partial\theta & \partial/\partial\varphi \\ F_r & r F_\theta & r\sin\theta\, F_\varphi \end{vmatrix} = \left(\frac{1}{r\sin\theta}\Big[\frac{\partial(\sin\theta\, F_\varphi)}{\partial\theta} - \frac{\partial F_\theta}{\partial\varphi}\Big]\,,\ \frac{1}{r\sin\theta}\frac{\partial F_r}{\partial\varphi} - \frac{1}{r}\frac{\partial(r F_\varphi)}{\partial r}\,,\ \frac{1}{r}\frac{\partial(r F_\theta)}{\partial r} - \frac{1}{r}\frac{\partial F_r}{\partial\theta}\right).$

$\nabla^2\psi = \frac{1}{r^2}\frac{\partial}{\partial r}\!\left(r^2\frac{\partial\psi}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\!\left(\sin\theta\frac{\partial\psi}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2\psi}{\partial\varphi^2}\,.$
2 Partial Differential Equations

2.0 Why Study This?

Many (most?, all?) scientific phenomena can be described by mathematical equations. An important sub-class of these equations is that of partial differential equations. If we can solve and/or understand such equations (and note that this is not always possible for a given equation), then this should help us understand the science.

2.1 Nomenclature

Ordinary differential equations (ODEs) are equations relating one or more unknown functions of a variable with one or more of the functions' derivatives, e.g. the second-order equation for the motion of a particle of mass $m$ acted on by a force $\mathbf{F}$,

$m\,\frac{d^2\mathbf{r}(t)}{dt^2} = \mathbf{F}\,,$   (2.1a)

or equivalently the two first-order equations

$\dot{\mathbf{r}}(t) = \mathbf{v}(t) \quad\text{and}\quad m\,\dot{\mathbf{v}}(t) = \mathbf{F}(t)\,.$   (2.1b)

The unknown functions are referred to as the dependent variables, and the quantity that they depend on is known as the independent variable.

Partial differential equations (PDEs) are equations relating one or more unknown functions of two or more variables with one or more of the functions' partial derivatives with respect to those variables, e.g. Schrödinger's equation (1.50b) for the quantum mechanical wave function $\psi(x, y, z, t)$ of a non-relativistic particle:

$-\frac{\hbar^2}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2} + \frac{\partial^2\psi}{\partial z^2}\right) + V(\mathbf{r})\psi = i\hbar\frac{\partial\psi}{\partial t}\,.$   (2.2)

Linearity. If the system of differential equations is of the first degree in the dependent variables, then the system is said to be linear. Hence the above examples are linear equations. However Euler's equation for an inviscid fluid,

$\rho\left(\frac{\partial\mathbf{u}}{\partial t} + (\mathbf{u}\cdot\nabla)\mathbf{u}\right) = -\nabla p\,,$   (2.3)

where $\mathbf{u}$ is the velocity, $\rho$ is the density and $p$ is the pressure, is nonlinear in $\mathbf{u}$.

Order. The order of the highest derivative determines the order of the differential equation. Hence (2.1a) and (2.2) are second-order equations, while each equation in (2.1b) is a first-order equation.

2.1.1 Linear Second-Order Partial Differential Equations

The most general linear second-order partial differential equation in two independent variables is

$a(x,y)\frac{\partial^2\psi}{\partial x^2} + b(x,y)\frac{\partial^2\psi}{\partial x\,\partial y} + c(x,y)\frac{\partial^2\psi}{\partial y^2} + d(x,y)\frac{\partial\psi}{\partial x} + e(x,y)\frac{\partial\psi}{\partial y} + f(x,y)\psi = g(x,y)\,.$   (2.4)

We will concentrate on examples where the coefficients, $a(x,y)$, etc. are constants, and where there are other simplifying assumptions. However, we will not necessarily restrict ourselves to two independent variables (e.g. Schrödinger's equation (2.2) has four independent variables).
2.2 Physical Examples and Applications

2.2.1 Waves on a Violin String

Consider small displacements on a stretched elastic string of density $\rho$ per unit length (when not displaced). Assume that all displacements $y(x, t)$ are vertical (this is a bit of a cheat), and resolve horizontally and vertically to obtain respectively

$T_2\cos\theta_2 = T_1\cos\theta_1\,,$   (2.5a)
$(\rho\, dx)\,\frac{\partial^2 y}{\partial t^2} = T_2\sin\theta_2 - T_1\sin\theta_1 = T_2\cos\theta_2\,(\tan\theta_2 - \tan\theta_1)\,.$   (2.5b)

In the light of (2.5a) let

$T = T_j\cos\theta_j \quad (j = 1, 2)\,,$   (2.6a)

and observe that

$\tan\theta = \frac{\partial y}{\partial x}\,.$   (2.6b)

Then from (2.5b) it follows after use of Taylor's theorem that

$\rho\, dx\,\frac{\partial^2 y}{\partial t^2} = T(\tan\theta_2 - \tan\theta_1) = T\left[\frac{\partial y(x+dx, t)}{\partial x} - \frac{\partial y(x, t)}{\partial x}\right] = T\,\frac{\partial^2 y}{\partial x^2}\, dx + \dots\,,$   (2.7)

and hence, in the infinitesimal limit, that

$\frac{\partial^2 y}{\partial t^2} = \frac{T}{\rho}\frac{\partial^2 y}{\partial x^2}\,.$   (2.8)

This is the wave equation with wavespeed $c = \sqrt{T/\rho}$:

$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2}\,.$   (2.9)

2.2.2 Electromagnetic Waves

The theory of electromagnetism is based on Maxwell's equations. These relate the electric field $\mathbf{E}$, the magnetic field $\mathbf{B}$, the charge density $\rho$ and the current density $\mathbf{J}$:

$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}\,,$   (2.10a)
$\nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}\,,$   (2.10b)
$\nabla\times\mathbf{B} = \mu_0\mathbf{J} + \frac{1}{c^2}\frac{\partial\mathbf{E}}{\partial t}\,,$   (2.10c)
$\nabla\cdot\mathbf{B} = 0\,,$   (2.10d)

where $\epsilon_0$ is the dielectric constant, $\mu_0$ is the magnetic permeability, and $c = (\epsilon_0\mu_0)^{-1/2}$ is the speed of light. If there is no charge or current (i.e. $\rho = 0$ and $\mathbf{J} = \mathbf{0}$), then from (2.10a), (2.10b), (2.10c) and the vector identity (1.51a):

$\frac{1}{c^2}\frac{\partial^2\mathbf{E}}{\partial t^2} = \frac{\partial}{\partial t}(\nabla\times\mathbf{B})$   using (2.10c) with $\mathbf{J} = \mathbf{0}$
$= -\nabla\times(\nabla\times\mathbf{E})$   using (2.10b)
$= \nabla^2\mathbf{E} - \nabla(\nabla\cdot\mathbf{E})$   using identity (1.51a)
$= \nabla^2\mathbf{E}\,.$   using (2.10a) with $\rho = 0$   (2.11)
We have therefore recovered the three-dimensional wave equation (cf. (2.9))

$\frac{\partial^2\mathbf{E}}{\partial t^2} = c^2\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\mathbf{E}\,.$   (2.12)

Remark. The pressure perturbation of a sound wave satisfies the scalar equivalent of this equation (but with $c$ equal to the speed of sound rather than that of light).

2.2.3 Electrostatic Fields

Suppose instead a steady electric field is generated by a known charge density $\rho$. Then from (2.10b)

$\nabla\times\mathbf{E} = \mathbf{0}\,,$

which implies from (1.45) that there exists an electric potential $\varphi$ such that

$\mathbf{E} = -\nabla\varphi\,.$   (2.13)

It then follows from the first of Maxwell's equations, (2.10a), that $\varphi$ satisfies Poisson's equation

$\nabla^2\varphi = -\frac{\rho}{\epsilon_0}\,,$   (2.14a)

i.e.

$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}\right)\varphi = -\frac{\rho}{\epsilon_0}\,.$   (2.14b)

2.2.4 Gravitational Fields

A Newtonian gravitational field $\mathbf{g}$ satisfies

$\nabla\cdot\mathbf{g} = -4\pi G\rho\,,$   (2.15a)

and

$\nabla\times\mathbf{g} = \mathbf{0}\,,$   (2.15b)

where $G$ is the gravitational constant and $\rho$ is mass density. From the latter equation and (1.45) it follows that there exists a gravitational potential $\varphi$ such that

$\mathbf{g} = -\nabla\varphi\,.$   (2.16)

Thence from (2.15a) we deduce that the gravitational potential satisfies Poisson's equation

$\nabla^2\varphi = 4\pi G\rho\,.$   (2.17)

Remark. Electrostatic and gravitational fields are similar!

2.2.5 Diffusion of a Passive Tracer

Suppose we want to describe how an inert chemical diffuses through a solid or stationary fluid.¹³

Denote the mass concentration of the dissolved chemical per unit volume by $C(\mathbf{r}, t)$, and the material flux vector of the chemical by $\mathbf{q}(\mathbf{r}, t)$. Then the amount of chemical crossing a small surface $d\mathbf{S}$ in time $\delta t$ is

local flux $= (\mathbf{q}\cdot d\mathbf{S})\,\delta t\,.$

¹³ Reacting chemicals and moving fluids are slightly more tricky.
Hence the flux of chemical out of a closed surface $S$ enclosing a volume $V$ in time $\delta t$ is

surface flux $= \left(\iint_S \mathbf{q}\cdot d\mathbf{S}\right)\delta t\,.$   (2.18)

Let $Q(\mathbf{r}, t)$ denote any chemical mass source per unit time per unit volume of the media. Then if the change of chemical within the volume is to be equal to the flux of the chemical out of the surface in time $\delta t$

$-\left(\iint_S \mathbf{q}\cdot d\mathbf{S}\right)\delta t = \left(\frac{d}{dt}\iiint_V C\, dV\right)\delta t - \left(\iiint_V Q\, dV\right)\delta t\,.$   (2.19a)

Hence using the divergence theorem (1.57), and exchanging the order of differentiation and integration,

$\iiint_V \left(\nabla\cdot\mathbf{q} + \frac{\partial C}{\partial t} - Q\right) dV = 0\,.$   (2.19b)

But this is true for any volume, and so

$\frac{\partial C}{\partial t} = -\nabla\cdot\mathbf{q} + Q\,.$   (2.20)

The simplest empirical law relating concentration flux to concentration gradient is Fick's law

$\mathbf{q} = -D\,\nabla C\,,$   (2.21)

where $D$ is the diffusion coefficient; the negative sign is necessary if chemical is to flow from high to low concentrations. If $D$ is constant then the partial differential equation governing the concentration is

$\frac{\partial C}{\partial t} = D\,\nabla^2 C + Q\,.$   (2.22)

Special Cases.

Diffusion Equation. If there is no chemical source then $Q = 0$, and the governing equation becomes the diffusion equation

$\frac{\partial C}{\partial t} = D\,\nabla^2 C\,.$   (2.23)

Poisson's Equation. If the system has reached a steady state (i.e. $\partial/\partial t \equiv 0$), then with $f(\mathbf{r}) = -Q(\mathbf{r})/D$ the governing equation is Poisson's equation

$\nabla^2 C = f\,.$   (2.24)

Laplace's Equation. If the system has reached a steady state and there are no chemical sources then the concentration is governed by Laplace's equation

$\nabla^2 C = 0\,.$   (2.25)

2.2.6 Heat Flow

What governs the flow of heat in a saucepan, an engine block, the earth's core, etc.? Can we write down an equation?

Let $\mathbf{q}(\mathbf{r}, t)$ denote the flux vector for heat flow. Then the energy in the form of heat (molecular vibrations) flowing out of a closed surface $S$ enclosing a volume $V$ in time $\delta t$ is again (2.18). Also, let

$E(\mathbf{r}, t)$ denote the internal energy per unit mass of the solid,
$Q(\mathbf{r}, t)$ denote any heat source per unit time per unit volume of the solid,
$\rho(\mathbf{r}, t)$ denote the mass density of the solid (assumed constant here).
The flow of heat in/out of $S$ must balance the change in internal energy and the heat source over, say, a time $\delta t$ (cf. (2.19a))

$-\left(\iint_S \mathbf{q}\cdot d\mathbf{S}\right)\delta t = \left(\frac{d}{dt}\iiint_V \rho E\, dV\right)\delta t - \left(\iiint_V Q\, dV\right)\delta t\,.$

For slow changes at constant pressure (1st and 2nd law of thermodynamics)

$E(\mathbf{r}, t) = c_p\,\theta(\mathbf{r}, t)\,,$   (2.26)

where $\theta$ is the temperature and $c_p$ is the specific heat (assumed constant here). Hence using the divergence theorem (1.57), and exchanging the order of differentiation and integration (cf. (2.19b)),

$\iiint_V \left(\nabla\cdot\mathbf{q} + \rho c_p\frac{\partial\theta}{\partial t} - Q\right) dV = 0\,.$

But this is true for any volume, hence

$\rho c_p\,\frac{\partial\theta}{\partial t} = -\nabla\cdot\mathbf{q} + Q\,.$   (2.27)

Experience tells us heat flows from hot to cold. The simplest empirical law relating heat flow to temperature gradient is Fourier's law (cf. Fick's law (2.21))

$\mathbf{q} = -k\,\nabla\theta\,,$   (2.28)

where $k$ is the heat conductivity. If $k$ is constant then the partial differential equation governing the temperature is (cf. (2.22))

$\frac{\partial\theta}{\partial t} = \nu\,\nabla^2\theta + \frac{Q}{\rho c_p}\,,$   (2.29)

where $\nu = k/(\rho c_p)$ is the diffusivity (or coefficient of diffusion).

2.2.7 Other Equations

There are numerous other partial differential equations describing scientific, and non-scientific, phenomena. One equation that you might have heard a lot about is the Black-Scholes equation for call option pricing

$\frac{\partial w}{\partial t} = rw - rx\,\frac{\partial w}{\partial x} - \frac{1}{2}v^2 x^2\,\frac{\partial^2 w}{\partial x^2}\,,$   (2.30)

where $w(x, t)$ is the price of the call option of the stock, $x$ is the variable market price of the stock, $t$ is time, $r$ is a fixed interest rate and $v^2$ is the variance rate of the stock price!¹⁴

Also, despite the impression given above where all the equations except (2.3) are linear, many of the most interesting scientific (and non-scientific) equations are nonlinear. For instance the nonlinear Schrödinger equation

$i\,\frac{\partial A}{\partial t} + \frac{\partial^2 A}{\partial x^2} = A|A|^2\,,$

where $i$ is the square root of $-1$, admits soliton solutions (which is one of the reasons that optical fibres work).

¹⁴ It's by no means clear to me what these terms mean, but http://www.physics.uci.edu/~silverma/bseqn/bs/bs.html is one place to start!
2.3 Separation of Variables

You may have already met the general idea of separability when solving ordinary differential equations, e.g. when you studied separable equations, i.e. the special differential equations that can be written in the form

$\underbrace{X(x)\, dx}_{\text{function of }x} = \underbrace{Y(y)\, dy}_{\text{function of }y}\,.$

Sometimes functions can be written in separable form. For instance,

$f(x, y) = \cos x\,\exp y = X(x)\, Y(y)\,,\quad\text{where } X = \cos x \text{ and } Y = \exp y\,,$

is separable in Cartesian coordinates, while

$g(x, y, z) = \frac{1}{(x^2 + y^2 + z^2)^{\frac{1}{2}}}$

is not separable in Cartesian coordinates, but is separable in spherical polar coordinates since

$g = R(r)\,\Theta(\theta)\,\Phi(\varphi)\quad\text{where } R = \frac{1}{r}\,,\ \Theta = 1 \text{ and } \Phi = 1\,.$

Solutions to partial differential equations can sometimes be found by seeking solutions that can be written in separable form, e.g.

Time & 1D Cartesians: $y(x, t) = X(x)\,T(t)\,,$   (2.31a)
2D Cartesians: $\psi(x, y) = X(x)\,Y(y)\,,$   (2.31b)
3D Cartesians: $\psi(x, y, z) = X(x)\,Y(y)\,Z(z)\,,$   (2.31c)
Cylindrical Polars: $\psi(\rho, \varphi, z) = R(\rho)\,\Phi(\varphi)\,Z(z)\,,$   (2.31d)
Spherical Polars: $\psi(r, \theta, \varphi) = R(r)\,\Theta(\theta)\,\Phi(\varphi)\,.$   (2.31e)

However, we emphasise that not all solutions of partial differential equations can be written in this form.

2.4 The One Dimensional Wave Equation

2.4.1 Separable Solutions

Seek solutions $y(x, t)$ to the one dimensional wave equation (2.9), i.e.

$\frac{\partial^2 y}{\partial t^2} = c^2\,\frac{\partial^2 y}{\partial x^2}\,,$   (2.32a)

of the form

$y(x, t) = X(x)\,T(t)\,.$   (2.32b)

On substituting (2.32b) into (2.32a) we obtain

$X\,\ddot T = c^2\, T\, X''\,,$

where a $\dot{}$ and a $'$ denote differentiation by $t$ and $x$ respectively. After rearrangement we have that

$\underbrace{\frac{1}{c^2}\frac{\ddot T(t)}{T(t)}}_{\text{function of }t} = \underbrace{\frac{X''(x)}{X(x)}}_{\text{function of }x} = \lambda\,,$   (2.33a)

where $\lambda$ is a constant (the only function of $t$ that equals a function of $x$). We have therefore split the PDE into two ODEs:

$\ddot T - \lambda c^2 T = 0 \quad\text{and}\quad X'' - \lambda X = 0\,.$   (2.33b)

There are three cases to consider.
$\lambda = 0$. In this case

$\ddot T(t) = X''(x) = 0 \quad\Rightarrow\quad T = A_0 + B_0 t \quad\text{and}\quad X = C_0 + D_0 x\,,$

where $A_0$, $B_0$, $C_0$ and $D_0$ are constants, i.e.

$y = (A_0 + B_0 t)(C_0 + D_0 x)\,.$   (2.34a)

$\lambda = \sigma^2 > 0$. In this case

$\ddot T - \sigma^2 c^2 T = 0 \quad\text{and}\quad X'' - \sigma^2 X = 0\,.$

Hence

$T = A_\sigma e^{\sigma c t} + B_\sigma e^{-\sigma c t} \quad\text{and}\quad X = C_\sigma\cosh\sigma x + D_\sigma\sinh\sigma x\,,$

where $A_\sigma$, $B_\sigma$, $C_\sigma$ and $D_\sigma$ are constants, i.e.

$y = \big(A_\sigma e^{\sigma c t} + B_\sigma e^{-\sigma c t}\big)\big(C_\sigma\cosh\sigma x + D_\sigma\sinh\sigma x\big)\,.$   (2.34b)

Alternatively we could express this as

$y = \big(\widetilde A_\sigma\cosh\sigma c t + \widetilde B_\sigma\sinh\sigma c t\big)\big(\widetilde C_\sigma e^{\sigma x} + \widetilde D_\sigma e^{-\sigma x}\big)\,,\ \text{or as}\ \dots$

where $\widetilde A_\sigma$, $\widetilde B_\sigma$, $\widetilde C_\sigma$ and $\widetilde D_\sigma$ are constants.

$\lambda = -k^2 < 0$. In this case

$\ddot T + k^2 c^2 T = 0 \quad\text{and}\quad X'' + k^2 X = 0\,.$

Hence

$T = A_k\cos kct + B_k\sin kct \quad\text{and}\quad X = C_k\cos kx + D_k\sin kx\,,$

where $A_k$, $B_k$, $C_k$ and $D_k$ are constants, i.e.

$y = (A_k\cos kct + B_k\sin kct)(C_k\cos kx + D_k\sin kx)\,.$   (2.34c)

Remark. Without loss of generality we could also impose a normalisation condition, say, $C_j^2 + D_j^2 = 1$.

2.4.2 Boundary and Initial Conditions

Solutions (2.34a), (2.34b) and (2.34c) represent three families of solutions.¹⁵ Although they are based on a special assumption, we shall see that because the wave equation is linear they can represent a wide range of solutions by means of superposition. However, before going further it is helpful to remember that when solving a physical problem boundary and initial conditions are also needed.

Boundary Conditions. Suppose that the string considered in §2.2.1 has ends at $x = 0$ and $x = L$ that are fixed; appropriate boundary conditions are then

$y(0, t) = 0 \quad\text{and}\quad y(L, t) = 0\,.$   (2.35)

It is no coincidence that there are boundary conditions at two values of $x$ and the highest derivative in $x$ is second order.

Initial Conditions. Suppose also that the initial displacement and initial velocity of the string are known; appropriate initial conditions are then

$y(x, 0) = d(x) \quad\text{and}\quad \frac{\partial y}{\partial t}(x, 0) = v(x)\,.$   (2.36)

Again it is no coincidence that we need two initial conditions and the highest derivative in $t$ is second order.

We shall see that the boundary conditions restrict the choice of $\lambda$.

¹⁵ Or arguably one family if you wish to nit pick in the complex plane.
2.4.3 Solution

Consider the cases $\lambda = 0$, $\lambda < 0$ and $\lambda > 0$ in turn. These constitute an uncountably infinite number of solutions; our aim is to end up with a countably infinite number of solutions by elimination.

$\lambda = 0$. If the homogeneous, i.e. zero, boundary conditions (2.35) are to be satisfied for all time, then in (2.34a) we must have that $C_0 = D_0 = 0$.

$\lambda > 0$. Again if the boundary conditions (2.35) are to be satisfied for all time, then in (2.34b) we must have that $C_\sigma = D_\sigma = 0$.

$\lambda < 0$. Applying the boundary conditions (2.35) to (2.34c) yields

$C_k = 0 \quad\text{and}\quad D_k\sin kL = 0\,.$   (2.37)

If $D_k = 0$ then the entire solution is trivial (i.e. zero), so the only useful solution has

$\sin kL = 0 \quad\Rightarrow\quad k = \frac{n\pi}{L}\,,$   (2.38)

where $n$ is a non-zero integer. These special values of $k$ are eigenvalues and the corresponding eigenfunctions, or normal modes, are

$X_n = D_{n\pi/L}\,\sin\frac{n\pi x}{L}\,.$   (2.39)

Hence, from (2.34c), solutions to (2.9) that satisfy the boundary condition (2.35) are

$y_n(x, t) = \left(\mathcal{A}_n\cos\frac{n\pi c t}{L} + \mathcal{B}_n\sin\frac{n\pi c t}{L}\right)\sin\frac{n\pi x}{L}\,,$   (2.40)

where we have written $\mathcal{A}_n$ for $A_{n\pi/L} D_{n\pi/L}$ and $\mathcal{B}_n$ for $B_{n\pi/L} D_{n\pi/L}$. Since (2.9) is linear we can superimpose (i.e. add) solutions to get the general solution

$y(x, t) = \sum_{n=1}^{\infty}\left(\mathcal{A}_n\cos\frac{n\pi c t}{L} + \mathcal{B}_n\sin\frac{n\pi c t}{L}\right)\sin\frac{n\pi x}{L}\,,$   (2.41)

where there is no need to run the sum from $-\infty$ to $\infty$ because of the symmetry properties of $\sin$ and $\cos$. We note that when the solution is viewed as a function of $x$ at fixed $t$, or as a function of $t$ at fixed $x$, then it has the form of a Fourier series.

The solution (2.41) satisfies the boundary conditions (2.35) by construction. The only thing left to do is to satisfy the initial conditions (2.36), i.e. we require that

$y(x, 0) = d(x) = \sum_{n=1}^{\infty}\mathcal{A}_n\sin\frac{n\pi x}{L}\,,$   (2.42a)
$\frac{\partial y}{\partial t}(x, 0) = v(x) = \sum_{n=1}^{\infty}\mathcal{B}_n\,\frac{n\pi c}{L}\sin\frac{n\pi x}{L}\,.$   (2.42b)

$\mathcal{A}_n$ and $\mathcal{B}_n$ can now be found using the orthogonality relations for $\sin$ (see (0.18a)), i.e.

$\int_0^L \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\, dx = \frac{L}{2}\,\delta_{nm}\,.$   (2.43)

Hence for an integer $m > 0$

$\frac{2}{L}\int_0^L d(x)\sin\frac{m\pi x}{L}\, dx = \frac{2}{L}\int_0^L \left(\sum_{n=1}^{\infty}\mathcal{A}_n\sin\frac{n\pi x}{L}\right)\sin\frac{m\pi x}{L}\, dx$
$= \sum_{n=1}^{\infty}\frac{2\mathcal{A}_n}{L}\int_0^L \sin\frac{n\pi x}{L}\sin\frac{m\pi x}{L}\, dx$
$= \sum_{n=1}^{\infty}\frac{2\mathcal{A}_n}{L}\,\frac{L}{2}\,\delta_{nm}$   using (2.43)
$= \mathcal{A}_m\,,$   using (0.11b)   (2.44a)
or alternatively invoke standard results for the coefficients of Fourier series. Similarly

$\mathcal{B}_m = \frac{2}{m\pi c}\int_0^L v(x)\sin\frac{m\pi x}{L}\, dx\,.$   (2.44b)
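The coefficients (2.44a) and (2.44b) are easy to evaluate numerically for a given initial shape. The following is a minimal numpy sketch for a plucked string with a triangular initial displacement and zero initial velocity; the initial shape and all names are illustrative choices, not part of the notes.

```python
# Sketch: build the series solution (2.41) numerically for a plucked string,
# with coefficients from (2.44a) and (2.44b) evaluated by simple quadrature.
import numpy as np

L, c, N = 1.0, 1.0, 50
x = np.linspace(0.0, L, 1001)
dx = x[1] - x[0]
d = np.where(x < 0.5*L, x, L - x)      # illustrative initial displacement d(x)
v = np.zeros_like(x)                   # initial velocity v(x) = 0

n = np.arange(1, N + 1)
A = np.array([2.0/L*np.sum(d*np.sin(k*np.pi*x/L))*dx for k in n])            # (2.44a)
B = np.array([2.0/(k*np.pi*c)*np.sum(v*np.sin(k*np.pi*x/L))*dx for k in n])  # (2.44b)

def y(x, t):
    """Partial sum of (2.41) with N terms."""
    modes = A*np.cos(n*np.pi*c*t/L) + B*np.sin(n*np.pi*c*t/L)
    return np.sum(modes[:, None]*np.sin(np.outer(n, x)*np.pi/L), axis=0)

print(np.max(np.abs(y(x, 0.0) - d)))   # small: the truncated series reproduces d(x)
```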
2.4.4 Unlectured: Oscillation Energy

A vibrating string has both potential energy (because of the stretching of the string) and kinetic energy (because of the motion of the string). For small displacements the potential energy is approximately

$\mathrm{PE} = \int_0^L \tfrac{1}{2}\,T\,y'^2\, dx = \int_0^L \tfrac{1}{2}\,\rho\,(c\,y')^2\, dx\,,$   (2.45a)

since $c^2 = T\rho^{-1}$, and the kinetic energy is approximately

$\mathrm{KE} = \int_0^L \tfrac{1}{2}\,\rho\,\dot y^2\, dx\,.$   (2.45b)

Hence from (2.41) and (2.43)

$\mathrm{PE} = \tfrac{1}{2}\rho c^2\int_0^L\left[\sum_{n=1}^{\infty}\left(\mathcal{A}_n\cos\frac{n\pi c t}{L} + \mathcal{B}_n\sin\frac{n\pi c t}{L}\right)\frac{n\pi}{L}\cos\frac{n\pi x}{L}\right]\left[\sum_{m=1}^{\infty}\left(\mathcal{A}_m\cos\frac{m\pi c t}{L} + \mathcal{B}_m\sin\frac{m\pi c t}{L}\right)\frac{m\pi}{L}\cos\frac{m\pi x}{L}\right]dx$
$= \tfrac{1}{2}\rho c^2\sum_{m,n}\left(\mathcal{A}_n\cos\frac{n\pi c t}{L} + \mathcal{B}_n\sin\frac{n\pi c t}{L}\right)\left(\mathcal{A}_m\cos\frac{m\pi c t}{L} + \mathcal{B}_m\sin\frac{m\pi c t}{L}\right)\frac{mn\pi^2}{L^2}\,\frac{L}{2}\,\delta_{mn}$
$= \frac{\rho\pi^2 c^2}{4L}\sum_{n=1}^{\infty} n^2\left(\mathcal{A}_n\cos\frac{n\pi c t}{L} + \mathcal{B}_n\sin\frac{n\pi c t}{L}\right)^2\,,$   (2.46a)

$\mathrm{KE} = \tfrac{1}{2}\rho\int_0^L\left[\sum_{n=1}^{\infty}\frac{n\pi c}{L}\left(-\mathcal{A}_n\sin\frac{n\pi c t}{L} + \mathcal{B}_n\cos\frac{n\pi c t}{L}\right)\sin\frac{n\pi x}{L}\right]\left[\sum_{m=1}^{\infty}\frac{m\pi c}{L}\left(-\mathcal{A}_m\sin\frac{m\pi c t}{L} + \mathcal{B}_m\cos\frac{m\pi c t}{L}\right)\sin\frac{m\pi x}{L}\right]dx$
$= \frac{\rho\pi^2 c^2}{4L}\sum_{n=1}^{\infty} n^2\left(-\mathcal{A}_n\sin\frac{n\pi c t}{L} + \mathcal{B}_n\cos\frac{n\pi c t}{L}\right)^2\,.$   (2.46b)

It follows that the total energy is given by

$E = \mathrm{PE} + \mathrm{KE} = \sum_{n=1}^{\infty}\frac{\rho\pi^2 c^2 n^2}{4L}\left(\mathcal{A}_n^2 + \mathcal{B}_n^2\right) = \sum_{\text{normal modes}}(\text{energy in mode})\,.$   (2.46c)

Remark. The energy is conserved in time (since there is no dissipation). Moreover there is no transfer of energy between modes.

Exercise. Show, by averaging the PE and KE over an oscillation period, that there is equi-partition of energy over an oscillation cycle.

2.5 Poisson's Equation

Suppose we are interested in obtaining solutions to Poisson's equation

$\nabla^2\psi = f\,,$   (2.47a)
where, say, $\psi$ is a steady temperature distribution and $f = -Q/(\rho c_p\nu)$ is a scaled heat source (see (2.29)). For simplicity let the world be two-dimensional, then (2.47a) becomes

$\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)\psi = f\,.$   (2.47b)

Suppose we seek a separable solution as before, i.e. $\psi(x, y) = X(x)\,Y(y)$. Then on substituting into (2.47b) we obtain

$\frac{X''}{X} = -\frac{Y''}{Y} + \frac{f}{XY}\,.$   (2.48)

It follows that unless we are very fortunate, and $f(x, y)$ has a particular form (e.g. $f = 0$), it does not look like we will be able to find separable solutions.

In order to make progress the trick is to first find a[ny] particular solution, $\psi_s$, to (2.47b) (cf. finding a particular solution when solving constant coefficient ODEs last year). The function $\Psi = \psi - \psi_s$ then satisfies Laplace's equation

$\nabla^2\Psi = 0\,.$   (2.49)

This is just Poisson's equation with $f = 0$, for which we have just noted that separable solutions exist. To obtain the full solution we need to add these [countably infinite] separable solutions to our particular solution (cf. adding complementary functions to a particular solution when solving constant coefficient ODEs last year).

2.5.1 A Particular Solution

We will illustrate the method by considering the particular example where the heating $f$ is uniform, $f = 1$ wlog (since the equation is linear), in a semi-infinite rod, $0 \leqslant x$, of unit width, $0 \leqslant y \leqslant 1$.

In order to find a particular solution suppose for the moment that the rod is infinite (or alternatively consider the solution for $x \gg 1$ for a semi-infinite rod, when the rod might look infinite from a local viewpoint).

Then we might expect the particular solution for the temperature $\psi_s$ to be independent of $x$, i.e. $\psi_s \equiv \psi_s(y)$. Poisson's equation (2.47b) then reduces to

$\frac{d^2\psi_s}{dy^2} = 1\,,$   (2.50a)

which has solution

$\psi_s = a_0 + b_0 y + \tfrac{1}{2}y^2\,,$   (2.50b)

where $a_0$ and $b_0$ are constants.

2.5.2 Boundary Conditions

For the rod problem, experience suggests that we need to specify one of the following at all points on the boundary of the rod:

the temperature (a Dirichlet condition), i.e.

$\psi = g(\mathbf{r})\,,$   (2.51a)

where $g(\mathbf{r})$ is a known function;

the scaled heat flux (a Neumann condition), i.e.

$\frac{\partial\psi}{\partial n} \equiv \mathbf{n}\cdot\nabla\psi = h(\mathbf{r})\,,$   (2.51b)

where $h(\mathbf{r})$ is a known function;
a mixed condition, i.e.

$\alpha(\mathbf{r})\,\frac{\partial\psi}{\partial n} + \beta(\mathbf{r})\,\psi = d(\mathbf{r})\,,$   (2.51c)

where $\alpha(\mathbf{r})$, $\beta(\mathbf{r})$ and $d(\mathbf{r})$ are known functions, and $\alpha(\mathbf{r})$ and $\beta(\mathbf{r})$ are not simultaneously zero.

For our rod let us consider the boundary conditions

$\psi = 0$ on $x = 0$ ($0 \leqslant y \leqslant 1$), $y = 0$ ($0 \leqslant x < \infty$) and $y = 1$ ($0 \leqslant x < \infty$), and $\frac{\partial\psi}{\partial x} \to 0$ as $x \to \infty$.   (2.52)

For these conditions it is appropriate to take $a_0 = 0$ and $b_0 = -\tfrac{1}{2}$ in (2.50b) so that

$\psi_s = -\tfrac{1}{2}y(1 - y) \leqslant 0\,.$   (2.53)

Let $\Psi = \psi - \psi_s$, then $\Psi$ satisfies Laplace's equation (2.49) and, from (2.52) and (2.53), the boundary conditions

$\Psi = \tfrac{1}{2}y(1 - y)$ on $x = 0$, $\Psi = 0$ on $y = 0$ and $y = 1$, and $\frac{\partial\Psi}{\partial x} \to 0$ as $x \to \infty$.   (2.54)

2.5.3 Separable Solutions

On writing $\Psi(x, y) = X(x)\,Y(y)$ and substituting into Laplace's equation (2.49) it follows that (cf. (2.48))

$\underbrace{\frac{X''(x)}{X(x)}}_{\text{function of }x} = \underbrace{-\frac{Y''(y)}{Y(y)}}_{\text{function of }y} = \lambda\,,$   (2.55a)

so that

$X'' - \lambda X = 0 \quad\text{and}\quad Y'' + \lambda Y = 0\,.$   (2.55b)

We can now consider each of the possibilities $\lambda = 0$, $\lambda > 0$ and $\lambda < 0$ in turn to obtain, cf. (2.34a), (2.34b) and (2.34c),

$\lambda = 0$.
$\Psi = (A_0 + B_0 x)(C_0 + D_0 y)\,.$   (2.56a)

$\lambda = \sigma^2 > 0$.
$\Psi = \big(A_\sigma e^{\sigma x} + B_\sigma e^{-\sigma x}\big)\big(C_\sigma\cos\sigma y + D_\sigma\sin\sigma y\big)\,.$   (2.56b)

$\lambda = -k^2 < 0$.
$\Psi = (A_k\cos kx + B_k\sin kx)\big(C_k e^{ky} + D_k e^{-ky}\big)\,.$   (2.56c)

The boundary conditions at $y = 0$ and $y = 1$ in (2.54) state that $\Psi(x, 0) = 0$ and $\Psi(x, 1) = 0$. This implies (cf. the stretched string problem) that solutions proportional to $\sin(n\pi y)$ are appropriate; hence we try $\lambda = n^2\pi^2$ where $n$ is an integer. The eigenfunctions are thus

$\Psi_n = \big(\mathcal{A}_n e^{n\pi x} + \mathcal{B}_n e^{-n\pi x}\big)\sin(n\pi y)\,,$   (2.57)

where $\mathcal{A}_n$ and $\mathcal{B}_n$ are constants. However, if the boundary condition in (2.54) as $x \to \infty$ is to be satisfied then $\mathcal{A}_n = 0$. Hence the solution has the form

$\Psi = \sum_{n=1}^{\infty}\mathcal{B}_n\, e^{-n\pi x}\sin(n\pi y)\,.$   (2.58)

The $\mathcal{B}_n$ are fixed by the first boundary condition in (2.54), i.e. we require that

$\tfrac{1}{2}y(1 - y) = \sum_{n=1}^{\infty}\mathcal{B}_n\sin(n\pi y)\,.$   (2.59a)
Using the orthogonality relations (2.43) it follows that

$\mathcal{B}_m = -2\,\frac{(-1)^m - 1}{m^3\pi^3}\,,$   (2.59b)

and hence that

$\psi = -\tfrac{1}{2}y(1 - y) + \sum_{\ell=0}^{\infty}\frac{4}{\pi^3(2\ell+1)^3}\,\sin\big((2\ell+1)\pi y\big)\, e^{-(2\ell+1)\pi x}\,,$   (2.60a)

or equivalently

$\psi = -\sum_{\ell=0}^{\infty}\frac{4}{\pi^3(2\ell+1)^3}\,\sin\big((2\ell+1)\pi y\big)\,\Big(1 - e^{-(2\ell+1)\pi x}\Big)\,.$   (2.60b)
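A quick numerical sanity check of (2.60b) is possible with a few lines of Python; the sketch below (names and parameter values are illustrative) confirms that the series vanishes on $x = 0$ and approaches the particular solution $\psi_s = -\tfrac{1}{2}y(1-y)$ far from the end.

```python
# Sketch: sanity check of the Poisson solution (2.60b).
import numpy as np

def psi(x, y, terms=200):
    """Partial sum of (2.60b) with the given number of terms."""
    n = 2*np.arange(terms) + 1
    coeff = 4.0/(np.pi**3*n**3)
    return -np.sum(coeff*np.sin(n*np.pi*y)*(1.0 - np.exp(-n*np.pi*x)))

y0 = 0.3
print(psi(0.0, y0))                          # expect 0 (boundary condition at x = 0)
print(psi(5.0, y0), -0.5*y0*(1.0 - y0))      # expect both approximately -0.105
```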
2.6 The Diffusion Equation

2.6.1 Separable Solutions

Seek solutions $C(x, t)$ to the one dimensional diffusion equation of the form

$C(x, t) = X(x)\,T(t)\,.$   (2.61)

On substituting into the one dimensional version of (2.22),

$\frac{\partial C}{\partial t} = D\,\frac{\partial^2 C}{\partial x^2}\,,$

we obtain

$X\,\dot T = D\, T\, X''\,.$

After rearrangement we have that

$\underbrace{\frac{1}{D}\frac{\dot T(t)}{T(t)}}_{\text{function of }t} = \underbrace{\frac{X''(x)}{X(x)}}_{\text{function of }x} = \lambda\,,$   (2.62a)

where $\lambda$ is again a constant. We have therefore split the PDE into two ODEs:

$\dot T - \lambda D T = 0 \quad\text{and}\quad X'' - \lambda X = 0\,.$   (2.62b)

There are again three cases to consider.

$\lambda = 0$. In this case

$\dot T(t) = X''(x) = 0 \quad\Rightarrow\quad T = \alpha_0 \quad\text{and}\quad X = \beta_0 + \gamma_0 x\,,$

where $\alpha_0$, $\beta_0$ and $\gamma_0$ are constants. Combining these results we obtain

$C = \alpha_0(\beta_0 + \gamma_0 x)\,,$

or

$C = \beta_0 + \gamma_0 x\,,$   (2.63a)

since, without loss of generality (wlog), we can take $\alpha_0 = 1$.

$\lambda = \sigma^2 > 0$. In this case

$\dot T - D\sigma^2 T = 0 \quad\text{and}\quad X'' - \sigma^2 X = 0\,.$

Hence

$T = \alpha_\sigma\exp(D\sigma^2 t) \quad\text{and}\quad X = \beta_\sigma\cosh\sigma x + \gamma_\sigma\sinh\sigma x\,,$

where $\alpha_\sigma$, $\beta_\sigma$ and $\gamma_\sigma$ are constants. On taking $\alpha_\sigma = 1$ wlog,

$C = \exp(D\sigma^2 t)\,(\beta_\sigma\cosh\sigma x + \gamma_\sigma\sinh\sigma x)\,.$   (2.63b)
$\lambda = -k^2 < 0$. In this case

$\dot T + D k^2 T = 0 \quad\text{and}\quad X'' + k^2 X = 0\,.$

Hence

$T = \alpha_k\exp(-D k^2 t) \quad\text{and}\quad X = \beta_k\cos kx + \gamma_k\sin kx\,,$

where $\alpha_k$, $\beta_k$ and $\gamma_k$ are constants. On taking $\alpha_k = 1$ wlog,

$C = \exp(-D k^2 t)\,(\beta_k\cos kx + \gamma_k\sin kx)\,.$   (2.63c)

2.6.2 Boundary and Initial Conditions

Consider the problem of a solvent occupying the region between $x = 0$ and $x = L$. Suppose that at $t = 0$ there is no chemical in the solvent, i.e. the initial condition is

$C(x, 0) = 0\,.$   (2.64a)

Note that here we specify one initial condition based on the observation that the highest derivative in $t$ in (2.22) is first order.

Suppose also that for $t > 0$ the concentration of the chemical is maintained at $C_0$ at $x = 0$, and is $0$ at $x = L$, i.e.

$C(0, t) = C_0 \quad\text{and}\quad C(L, t) = 0 \quad\text{for } t > 0\,.$   (2.64b)

Again it is no coincidence that there are two boundary conditions and the highest derivative in $x$ is second order.

Remark. Equation (2.22) and conditions (2.64a) and (2.64b) are mathematically equivalent to a description of the temperature of a rod of length $L$ which is initially at zero temperature before one of the ends is raised instantaneously to a constant non-dimensional temperature of $C_0$.
If the homogeneous boundary conditions (2.67b) are to be satised then, as for the wave equation,
separable solutions with > 0 are unacceptable, while = k
2
< 0 is only acceptable if
k
= 0 and
k
sinkL = 0 . (2.68a)
It follows that if the solution is to be non trivial then
k =
n
L
. (2.68b)
The eigenfunctions corresponding to (2.68b) are
X
n
=
n
sin
nx
L
, (2.68c)
where
n
=
n
L
. Again, because (2.22) is a linear equation, we can add individual solutions to get the
general solution
C(x, t) =
n=1
n
exp
_
n
2
2
Dt
L
2
_
sin
nx
L
. (2.69)
The
n
are xed by the initial condition (2.67b):
C
0
_
1
x
L
_
=
n=1
n
sin
nx
L
. (2.70a)
Hence
m
=
2C
0
L
_
L
0
_
1
x
L
_
sin
mx
L
dx =
2C
0
m
. (2.70b)
The solution is thus given by
C = C
0
_
1
x
L
_
n=1
2C
0
n
exp
_
n
2
2
Dt
L
2
_
sin
nx
L
. (2.71a)
or from using (2.70a)
C =
n=1
2C
0
n
_
1 exp
_
n
2
2
Dt
L
2
__
sin
nx
L
. (2.71b)
[Figure: the solution (2.71b) with $C_0 = 1$ and $L = 1$, plotted at times $t = 0.0001$, $t = 0.001$, $t = 0.01$, $t = 0.1$ and $t = 1$ (curves from left to right respectively).]
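The figure is straightforward to reproduce from the truncated series; the sketch below is one illustrative way to do so (parameter values and names are my own choices, with $D = 1$ assumed since the figure is labelled only by $t$).

```python
# Sketch: reproduce the figure above from the series (2.71b) with C0 = 1, L = 1.
import numpy as np
import matplotlib.pyplot as plt

C0, L, D, N = 1.0, 1.0, 1.0, 200
x = np.linspace(0.0, L, 400)
n = np.arange(1, N + 1)

def C(x, t):
    """Partial sum of (2.71b) with N terms."""
    decay = 1.0 - np.exp(-n**2*np.pi**2*D*t/L**2)
    return np.sum((2.0*C0/(n*np.pi))[:, None]*decay[:, None]
                  * np.sin(np.outer(n, x)*np.pi/L), axis=0)

for t in (1e-4, 1e-3, 1e-2, 1e-1, 1.0):
    plt.plot(x, C(x, t), label=f"t = {t}")
plt.xlabel("x"); plt.ylabel("C"); plt.legend(); plt.show()
```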
Paradox. $\sin\frac{n\pi x}{L}$ is not a separable solution of the diffusion equation.

Remark. As $t \to \infty$ in (2.71a)

$C \to C_0\left(1 - \frac{x}{L}\right) = C_\infty(x)\,.$   (2.72)

Remark. Solution (2.71b) is odd and has period $2L$. We are in effect solving the $2L$-periodic diffusion problem where $C$ is initially zero. Then, at $t = 0+$, $C$ is raised to $+1$ at $2nL+$ and lowered to $-1$ at $2nL-$ (for integer $n$), and kept zero everywhere else.
3 Fourier Transforms

3.0 Why Study This?

Fourier transforms, like Fourier series, tell you about the spectral (or harmonic) properties of functions. As such they are useful diagnostic tools for experiments. Here we will primarily use Fourier transforms to solve differential equations that model important aspects of science.

3.1 The Dirac Delta Function (a.k.a. Alchemy)

3.1.1 The Delta Function as the Limit of a Sequence

Consider the discontinuous function

$\delta_\varepsilon(x) = \begin{cases} 0 & x < -\varepsilon \\ \dfrac{1}{2\varepsilon} & -\varepsilon \leqslant x \leqslant \varepsilon \\ 0 & \varepsilon < x \end{cases}\,.$   (3.1a)

Then for all values of $\varepsilon$, including the limit $\varepsilon \to 0+$,

$\int_{-\infty}^{\infty}\delta_\varepsilon(x)\, dx = 1\,.$   (3.1b)

Further we note that for any differentiable function $g(x)$ and constant $\kappa$

$\int_{-\infty}^{\infty}\delta_\varepsilon(x - \kappa)\,g'(x)\, dx = \int_{\kappa-\varepsilon}^{\kappa+\varepsilon}\frac{1}{2\varepsilon}\,g'(x)\, dx = \frac{1}{2\varepsilon}\Big[g(x)\Big]_{\kappa-\varepsilon}^{\kappa+\varepsilon} = \frac{1}{2\varepsilon}\big(g(\kappa+\varepsilon) - g(\kappa-\varepsilon)\big)\,.$

In the limit $\varepsilon \to 0+$ we recover, from using Taylor's theorem and writing $g'(x) = f(x)$,

$\lim_{\varepsilon\to 0+}\int_{-\infty}^{\infty}\delta_\varepsilon(x - \kappa)\,f(x)\, dx = \lim_{\varepsilon\to 0+}\frac{1}{2\varepsilon}\Big(g(\kappa) + \varepsilon g'(\kappa) + \tfrac{1}{2}\varepsilon^2 g''(\kappa) + \dots - g(\kappa) + \varepsilon g'(\kappa) - \tfrac{1}{2}\varepsilon^2 g''(\kappa) - \dots\Big) = f(\kappa)\,.$   (3.1c)

We will view the delta function, $\delta(x)$, as the limit as $\varepsilon \to 0+$ of $\delta_\varepsilon(x)$, i.e.

$\delta(x) = \lim_{\varepsilon\to 0+}\delta_\varepsilon(x)\,.$   (3.2)

Applications. Delta functions are the mathematical way of modelling point objects/properties, e.g. point charges, point forces, point sinks/sources.

3.1.2 Some Properties of the Delta Function

Taking (3.2) as our definition of a delta function, we infer the following.

1. From (3.1a) we see that the delta function has an infinitely sharp peak of zero width, i.e.

$\delta(x) = \begin{cases} \infty & x = 0 \\ 0 & x \neq 0 \end{cases}\,.$   (3.3a)
2. From (3.1b) it follows that the delta function has unit area, i.e.

$\int_{-\infty}^{\infty}\delta(x)\, dx = 1\,.$   (3.3b)

3.1.3 The Delta Function as a Linear Operator

Strictly, $\delta(x)$ only has meaning when it is employed in an integrand as a linear operator acting on good¹⁶ functions;¹⁷ from (3.1c) its defining property is that, for good functions $f(x)$,

$\int_{-\infty}^{\infty}\delta(x - \kappa)\,f(x)\, dx = f(\kappa)\,.$   (3.4)

3.1.4 The Delta Function as the Limit of Other Sequences

The delta function can also be obtained as the limit of sequences other than (3.1a). For instance, we may define $\delta_\varepsilon(x)$ by

$\delta_\varepsilon(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx - \varepsilon|k|}\, dk$   (3.5a)
$= \frac{1}{2\pi}\left[\int_{-\infty}^{0} e^{ikx + \varepsilon k}\, dk + \int_{0}^{\infty} e^{ikx - \varepsilon k}\, dk\right] = \frac{1}{2\pi}\left[\frac{1}{\varepsilon + ix} + \frac{1}{\varepsilon - ix}\right] = \frac{\varepsilon}{\pi(x^2 + \varepsilon^2)}\,.$   (3.5b)

We note by substituting $x = \varepsilon y$ that (cf. (3.1b))

$\int_{-\infty}^{\infty}\frac{\varepsilon}{\pi(x^2 + \varepsilon^2)}\, dx = \int_{-\infty}^{\infty}\frac{1}{\pi(y^2 + 1)}\, dy = \frac{1}{\pi}\Big[\arctan y\Big]_{-\infty}^{\infty} = 1\,.$

Also, by means of the substitution $x = \kappa + \varepsilon z$ followed by an application of Taylor's theorem, the analogous result to (3.1c) follows, namely

$\lim_{\varepsilon\to 0+}\int_{-\infty}^{\infty}\delta_\varepsilon(x - \kappa)\,f(x)\, dx = \lim_{\varepsilon\to 0+}\int_{-\infty}^{\infty}\frac{1}{\pi(z^2 + 1)}\big(f(\kappa) + \varepsilon z f'(\kappa) + \dots\big)\, dz = f(\kappa)\,.$

¹⁶ By good we mean, for instance, that $f(x)$ is everywhere differentiable any number of times, and that $\int_{-\infty}^{\infty}\big|\frac{d^n f}{dx^n}\big|^2\, dx < \infty$ for all integers $n \geqslant 0$.
¹⁷ However we will not always be holier than thou: see (3.6).
Hence, if we are willing to break the injunction that $\delta(x)$ should always be employed in an integrand as a linear operator, we can infer from (3.5a) that

$\delta(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{ikx}\, dk\,.$   (3.6)

Another popular choice for $\delta_\varepsilon(x)$ is the Gaussian

$\delta_\varepsilon(x) = \frac{1}{\sqrt{2\pi\varepsilon^2}}\exp\!\left(-\frac{x^2}{2\varepsilon^2}\right)\,.$   (3.7a)

The analogous result to (3.1b) follows by means of the substitution $x = \sqrt{2}\,\varepsilon y$:

$\int_{-\infty}^{\infty}\delta_\varepsilon(x)\, dx = \frac{1}{\sqrt{2\pi\varepsilon^2}}\int_{-\infty}^{\infty}\exp\!\left(-\frac{x^2}{2\varepsilon^2}\right) dx = \frac{1}{\sqrt{\pi}}\int_{-\infty}^{\infty}\exp\!\left(-y^2\right) dy = 1\,.$   (3.7b)

The equivalent result to (3.1c) can also be recovered, as above, by the substitution $x = \kappa + \sqrt{2}\,\varepsilon z$ followed by an application of Taylor's theorem.

3.1.5 Further Properties of the Delta Function

The following properties hold for all the definitions of $\delta_\varepsilon(x)$ above.

1. $\delta(x)$ is even. From (3.6), with the substitution $k = -\ell$,

$\delta(-x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\, dk = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{i\ell x}\, d\ell = \delta(x)\,.$   (3.8a)

2. $\delta(x)$ is real. From (3.6) and (3.8a), with $^*$ denoting a complex conjugate, it follows that

$\delta^*(x) = \frac{1}{2\pi}\int_{-\infty}^{\infty} e^{-ikx}\, dk = \delta(-x) = \delta(x)\,.$   (3.8b)

3.1.6 The Heaviside Step Function

The Heaviside step function, $H(x)$, is defined for $x \neq 0$ by

$H(x) = \begin{cases} 0 & x < 0 \\ 1 & x > 0 \end{cases}\,.$   (3.9)

This function, which is sometimes written $\Theta(x)$, is discontinuous at $x = 0$:

$\lim_{x\to 0-} H(x) = 0 \neq 1 = \lim_{x\to 0+} H(x)\,.$

There are various conventions for the value of the Heaviside step function at $x = 0$, but it is not uncommon to take $H(0) = \tfrac{1}{2}$.

The Heaviside function is closely related to the Dirac delta function, since from (3.3a) and (3.3b)

$H(x) = \int_{-\infty}^{x}\delta(\xi)\, d\xi\,.$   (3.10a)

By analogy with the first fundamental theorem of calculus (0.1), this suggests that

$H'(x) = \delta(x)\,.$   (3.10b)
Unlectured Remark. As a check on (3.10b) we see from integrating by parts that

$\int_{-\infty}^{\infty} H'(x - \kappa)\,f(x)\, dx = \Big[H(x - \kappa)\,f(x)\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty} H(x - \kappa)\,f'(x)\, dx = f(\infty) - \int_{\kappa}^{\infty} f'(x)\, dx = f(\infty) - \Big[f(x)\Big]_{\kappa}^{\infty} = f(\kappa)\,.$

Hence from the definition of the delta function (3.4) we may identify $H'(x)$ with $\delta(x)$. Similarly, integrating by parts,

$\int_{-\infty}^{\infty}\delta'(x - \kappa)\,f(x)\, dx = \Big[\delta(x - \kappa)\,f(x)\Big]_{-\infty}^{\infty} - \int_{-\infty}^{\infty}\delta(x - \kappa)\,f'(x)\, dx = -f'(\kappa)\,.$   (3.11)
3.2.1 Denition
Given a function f(x) such that
_
[f(x)[ dx < ,
we dene its Fourier transform,
f(k), by
f(k) =
1
2
_
e
kx
f(x) dx. (3.12)
Notation. Sometimes it will be clearer to denote the Fourier transform of a function f by T[f] rather
than
f, i.e.
T[] . (3.13)
Remark. There are diering normalisations of the Fourier transform. Hence you will encounter denitions
where the (2)
1
2
is either not present or replaced by (2)
1
, and other denitions where the kx
is replaced by +kx.
Property. If the function f(x) is real the Fourier transform
f(k) is not necessarily real. However if f is
both real and even, i.e. f
(x) = f(x) and f(x) = f(x) respectively, then by using these properties
and the substitution x = y it follows that
f is real:
(k) =
1
2
_
e
kx
f
2
_
e
kx
f(x) dx since f
(x) = f(x)
=
1
2
_
e
ky
f(y) dy let x = y
=
f(k) . from (3.12) (3.14)
Similarly we can show that if f is both real and odd, then
f is purely imaginary, i.e.
f
(k) =
f(k).
Conversely it is possible to show using the Fourier inversion theorem (see below) that
if both f and
f are real, then f is even;
if f is real and
f is purely imaginary, then f is odd.
Natural Sciences Tripos: IB Mathematical Methods I 42 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
3.2.2 Examples of Fourier Transforms
The Fourier Transform (FT) of e
b|x|
(b > 0). First, from (3.5a) and (3.5b) we already have that
1
2
_
e
kx|k|
dk =
(x
2
+
2
)
.
For what follows it is helpful to rewrite this result by making the transformations x , k x
and b to obtain
_
e
xb|x|
dx =
2b
2
+b
2
. (3.15)
We deduce from the denition of a Fourier transform, (3.12), and (3.15) with = k, that
T[e
b|x|
] =
1
2
_
e
kxb|x|
dx
=
1
2
2b
k
2
+b
2
. (3.16)
The FTs of cos(ax) e
b|x|
and sin(ax) e
b|x|
(b > 0). Unlectured. From (3.12), the denition of cosine,
and (3.15) rst with = a k and then with = a +k, it follows that
T[cos(ax) e
b|x|
] =
1
2
2
_
_
e
ax
+e
ax
_
e
kxb|x|
dx
=
b
2
_
1
(a k)
2
+b
2
+
1
(a +k)
2
+b
2
_
. (3.17a)
This is real, as it has to be since cos(ax) e
b|x|
is even.
Similarly, from (3.12), the denition of sine, and (3.15) rst with = ak and then with = a+k,
it follows that
T[sin(ax) e
b|x|
] =
1
2
2
_
_
e
ax
e
ax
_
e
kxb|x|
dx
=
b
2
_
1
(a k)
2
+b
2
1
(a +k)
2
+b
2
_
. (3.17b)
This is purely imaginary, as it has to be since sin(ax) e
b|x|
is odd.
The FT of a Gaussian. From the denition (3.12), the completion of a square, and the substitution
x = (y
2
k),
18
it follows that
T
_
1
2
2
exp
_
x
2
2
2
__
=
1
2
_
exp
_
x
2
2
2
kx
_
dx
=
1
2
_
exp
_
1
2
_
x
+k
_
2
1
2
2
k
2
_
dx
=
1
2
exp
_
1
2
2
k
2
_
_
exp
_
1
2
y
2
_
dy
=
1
2
exp
_
1
2
2
k
2
_
. (3.18)
Hence the FT of a Gaussian is a Gaussian.
The FT of the delta function. From denitions (3.4) and (3.12) it follows that
T[(x a)] =
1
2
_
(x a)e
kx
dx
=
1
2
e
ka
. (3.19a)
18
This is a little naughty since it takes us into the complex x-plane. However, it can be xed up once you have done
Cauchys theorem.
Natural Sciences Tripos: IB Mathematical Methods I 43 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Hence the Fourier transform of (x) is 1/
2
_
H(x a)e
kx
dx
=
1
2
_
a
e
kx
dx
=
1
2
_
e
kx
k
_
a
We now have a problem, since what is lim
x
e
kx
? For the time being the simplest resolution is
(in the spirit of 3.1.4) to nd T[H(x a)e
(xa)
] for > 0, and then let 0+. So
T[H(x a)e
(xa)
] =
1
2
_
H(x a)e
(xa)kx
dx
=
1
2
_
e
(xa)kx
k
_
a
=
1
2
e
ka
+k
. (3.19b)
On taking the limit 0 we have that
T[H(x a)] =
e
ka
2 k
. (3.19c)
Remark. For future reference we observe from a comparison of (3.19a) and (3.19c) that
kT[H(x a)] = T[(x a)] . (3.19d)
9/02
9/04
3.2.3 The Fourier Inversion Theorem
Given a function f we can compute its Fourier transform
f from (3.12). For many functions the converse
is also true, i.e. given the Fourier transform
f of a function we can reconstruct the original function f. To
see this consider the following calculation (note the use of a dummy variable to avoid an overabundance
of x)
1
2
_
e
kx
f(k) dk =
1
2
_
e
kx
_
1
2
_
e
k
f() d
_
dk from denition (3.12)
=
_
d f()
_
1
2
_
dk e
k(x)
_
swap integration order
=
_
f(k) =
1
2
_
e
kx
f(x) dx T[f] , (3.20a)
then the inverse transform (note the change of sign in the exponent) acting on
f(k) recovers f(x), i.e.
f(x) =
1
2
_
e
kx
f(k) dk 1[
f] . (3.20b)
Natural Sciences Tripos: IB Mathematical Methods I 44 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Note that
1 [T [f]] = f , and T
_
1
_
f
__
=
f . (3.20c)
9/03
Example. Find the Fourier transform of (x
2
+b
2
)
1
.
Answer. From (3.16)
T
_
e
b|x|
_
(k) =
1
2
2b
k
2
+b
2
.
Hence from (3.20c)
_
2b
2
e
b|x|
= 1
_
1
k
2
+b
2
_
(x) ,
or, after applying the transformation x k,
1
_
1
x
2
+b
2
_
(k) =
_
2b
2
e
b|k|
. (3.21a)
But, from the transformation x k in (3.20b) and comparison with (3.20a), we see that
T[f(x)](k) = 1[f(x)](k) . (3.21b)
Hence, making the transformation k k in (3.21a), we nd that
T
_
1
x
2
+b
2
_
(k) =
_
2b
2
e
b|k|
. (3.21c)
3.2.4 Properties of Fourier Transforms
A couple of useful properties of Fourier transforms follow from (3.20a) and (3.20b). In particular we
shall see that an important property of the Fourier transform is that it allows a simple representation
of derivatives of f(x). This has important consequences when we come to solve dierential equations.
However, before we derive these properties we need to get Course A students upto speed.
Lemma. Suppose g(x, k) is a function of two variables, then for constants a and b
d
dx
_
b
a
g(x, k) dk =
_
b
a
g(x, k)
x
dk . (3.22)
Proof. Work from rst principles, then
d
dx
_
b
a
g(x, k) dk = lim
0
1
_
_
b
a
g(x +, k) dk
_
b
a
g(x, k) dk
_
=
_
b
a
lim
0
_
g(x +, k) g(x, k)
_
dk
=
_
b
a
g(x, k)
x
dk .
Dierentiation. If we dierentiate the inverse Fourier transform (3.20b) with respect to x we obtain
df
dx
(x) =
1
2
_
e
kx
_
k
f(k)
_
dk = 1
_
k
f
_
. (3.23)
Now Fourier transform this equation to conclude from using (3.20c) that
T
_
df
dx
_
= T
_
1
_
k
f
__
= k
f . (3.24a)
In other words, each time we dierentiate a function we multiply its Fourier transform by k. Hence
T
_
d
2
f
dx
2
_
= k
2
f and T
_
d
n
f
dx
n
_
= (k)
n
f . (3.24b)
Natural Sciences Tripos: IB Mathematical Methods I 45 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Multiplication by x. This time we dierentiate (3.20a) with respect to k to obtain
d
f
dk
(k) =
1
2
_
e
kx
(xf(x)) dx.
Hence, after multiplying by , we deduce from (3.12) that (cf. (3.24a))
f
dk
= T [xf(x)] . (3.25)
Translation. The Fourier transform of f(x a) is given by
T[f(x a)] =
1
2
_
e
kx
f(x a) dx from (3.12)
=
1
2
_
e
k(y+a)
f(y) dy x = y +a
= e
ka
1
2
_
e
ky
f(y) dy rearrange
= e
ka
T[f(x)] from (3.12). (3.26)
See (3.19a) and (3.19c) for a couple of examples that we have already done.
3.2.5 Parsevals Theorem
Parsevals theorem states that if f(x) is a complex function of x with Fourier transform
f(k), then
_
f(x)
2
dx =
_
f(k)
2
dk . (3.27)
Proof.
_
f(x)
2
dx =
_
dxf(x)f
(x)
=
1
2
_
dx
__
dk e
kx
f(k)
_ __
d e
x
()
_
from (3.20b) & (3.20b)
=
_
dk
f(k)
_
d
f
()
_
1
2
_
dxe
(k)x
_
swap integration order
=
_
dk
f(k)
_
d
f
() (k ) from (3.6)
=
_
dk
f(k)
f(k)
2
dk .
Unlectured Example. Find the Fourier transform of xe
|x|
and use Parsevals theorem to evaluate the
integral
_
k
2
(1 +k
2
)
4
dk .
Answer. From (3.16) with b = 1
T
_
e
|x|
_
=
1
2
2
1 +k
2
. (3.28a)
Next employ (3.25) to obtain
T
_
xe
|x|
_
=
k
T
_
e
|x|
_
=
_
2
2k
(1 +k
2
)
2
. (3.28b)
Natural Sciences Tripos: IB Mathematical Methods I 46 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Then from Parsevals theorem (3.27) and a couple of integrations by parts
_
k
2
(1 +k
2
)
4
dk =
8
_
x
2
e
2|x|
dx =
4
_
0
x
2
e
2x
dx =
16
. (3.28c)
9/01
An Application: Heisenbergs Uncertainty Principle. Suppose that
(x) =
1
(2
2
x
)
1
4
exp
_
x
2
4
2
x
_
(3.29)
is the [real] wave-function of a particle in quantum mechanics. Then, according to quantum me-
chanics,
[
2
(x)[ =
1
_
2
2
x
exp
_
x
2
2
2
x
_
, (3.30)
is the probability of nding the particle at position x, and
x
is the root mean square deviation
in position. Note that since [
2
[ is the Gaussian of width
x
,
_
[
2
(x)[ dx =
1
_
2
2
x
_
exp
_
x
2
2
2
x
_
dx = 1 . (3.31)
Hence there is unit probability of nding the particle somewhere! The Fourier transform of (x)
follows from (3.18) after the substitution =
2
x
and a multiplicative normalisation:
(k) =
_
2
2
x
_
1
4
exp
_
2
x
k
2
_
=
1
(2
2
k
)
1
4
exp
_
k
2
4
2
k
_
where
k
=
1
2
x
. (3.32)
Hence
2
is another Gaussian, this time with a root mean square deviation in wavenumber of
k
.
In agreement with Parsevals theorem
_
(k)
2
[ dk = 1 . (3.33)
We note that for the Gaussian
k
x
=
1
2
. More generally, one can show that for any (possibly
complex) wave-function (x),
x
1
2
(3.34)
where
x
and
k
are, as for the Gaussian, the root mean square deviations of the probability
distributions [(x)[
2
and [
(k)[
2
, respectively. An important and well-known result follows from
(3.34), since in quantum mechanics the momentum is given by p = k, where = h/2 and h is
Plancks constant. Hence if we interpret x =
x
and p =
k
to be the uncertainty in the
particles position and momentum respectively, then Heisenbergs Uncertainty Principle follows
from (3.34), namely
p x
1
2
. (3.35)
A general property of Fourier transforms
that follows from (3.34) is that the smaller
the variation in the original function (i.e.
the smaller
x
), the larger the variation in
the transform (i.e. the larger
k
), and vice
versa. In more prosaic language
a sharp peak in x a broad bulge in k,
and vice versa.
This property has many applications, for instance
a short pulse of electromagnetic radiation must contain many frequencies;
a long pulse of electromagnetic radiation (i.e. many wavelengths) is necessary in order to
obtain an approximately monochromatic signal.
Natural Sciences Tripos: IB Mathematical Methods I 47 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
3.2.6 The Convolution Theorem
The convolution, f g, of a function f(x) with a function g(x) is dened by
(f g)(x) =
_
dy f(x y) g(y) z y
= (g f)(x) from (3.36).
The Fourier transform T[f g]. If the functions f and g have Fourier transforms T[f] and T[g] respec-
tively, then
T[f g] =
2T[f]T[g] , (3.37)
since
T[f g] =
1
2
_
dxe
kx
__
dy f(y) g(x y)
_
from (3.12) & (3.36)
=
1
2
_
dy f(y)
_
dxe
kx
g(x y) swap integration order
=
1
2
_
dy f(y)
_
dz e
k(z+y)
g(z) x = z +y
=
1
2
_
dy f(y) e
ky
_
dz e
kz
g(z) rearrange
=
2 T[f]T[g] from (3.12).
The Fourier transform T[fg]. Conversely the Fourier transform of the product fg is given by the con-
volution of the Fourier transforms of f and g divided by
2, i.e.
T[fg] =
1
2
T[f] T[g] , (3.38)
since
T[fg](k) =
1
2
_
dxe
kx
f(x)g(x) from (3.12)
=
1
2
_
dxe
kx
g(x)
_
1
2
_
d e
x
f()
_
from (3.20b) with k
=
1
2
_
d
f()
_
1
2
_
dxe
i(k)x
g(x)
_
swap integration order
=
1
2
_
d
f() g(k ) from (3.12)
=
1
2
(
f g)(k)
1
2
(T[f] T[g]) (k) from (3.36).
Application. Suppose a linear black box (e.g. a circuit) has output G() exp(t) for a periodic input
exp(t). What is the output r(t) corresponding to input f(t)?
Natural Sciences Tripos: IB Mathematical Methods I 48 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Answer. Since the black box is linear, changing the
input produces a directly proportional change in out-
put. Thus since an input exp(t) produces an output
G() exp(t), an input F() exp(t) will produce an
output R() exp(t) = G()F() exp(t).
Moreover, since the black box is linear we can superpose input/output, and hence an input
f(t) =
1
2
_
F() e
t
d , (3.39)
will produce an output
r(t) =
1
2
_
G()F() e
t
d
=
1
2
_
_
1
2
T[f g]
_
e
t
d from (3.37)
=
1
2
(f g)(t) , from (3.20b) (3.40)
where g(t) is the inverse transform of G(), and we have used t and , instead of x and k respec-
tively, as the variables in the Fourier transforms and their inverses.
Remark. If the know the output of a linear black box for all possible harmonic inputs, then we
know everything about the black box. 10/02
10/0
Unlectured Example. A linear electronic device is such that an input f(t) = H(t)e
t
yields an output
r(t) =
H(t)(e
t
e
t
)
.
What is the output for an input h(t)?
Answer. Let F() and R() be the Fourier transforms of f(t) and r(t) respectively. Then, since
the device is linear, the principle of superposition means that an input F() exp(t) produces
an output R() exp(t), and that an input exp(t) produces an output G() exp(t) where
G = R/F.
Next we note from (3.19b) with = , a = 0 and k = , that the Fourier transform of the input is
F() T[H(t)e
t
] =
1
2
1
+
. (3.41a)
Similarly, the Fourier transform of the output is
R() T
_
H(t)(e
t
e
t
)
_
=
1
2 ( )
_
1
+
1
+
_
. (3.41b)
Hence
G() =
R()
F()
=
1
_
1
+
+
_
=
1
+
. (3.42a)
It then follows from (3.41a) that the inverse transform of G() is given by
g(t) =
2H(t)e
t
. (3.42b)
We can now use the result from the previous example to deduce from (3.40) (with the change of
notation f h) that the output H(t) to an input h(t) is given by
H(t) =
1
2
(h g)(t) =
1
2
_
dy h(y)H(t y) e
(ty)
from (3.42b)
=
_
t
dy h(y) e
(yt)
H(t y) = 0 for y > t
=
_
0
d h(t ) e
. y = t (3.43)
Natural Sciences Tripos: IB Mathematical Methods I 49 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
3.2.7 The Relationship to Fourier Series
Suppose that f(x) is a periodic function with period L (so that f(x +L) = f(x)). Then f can be
represented by a Fourier series
f(x) =
n=
a
n
exp
_
2nx
L
_
, (3.44a)
where
a
n
=
1
L
_ 1
2
L
1
2
L
f(x) exp
_
2nx
L
_
dx. (3.44b)
Expression (3.44a) can be viewed as a superposition of an innite number of waves with wavenumbers 10/03
k
n
= 2n/L (n = , . . . , ). We are interested in the limit as the period L tends to innity. In this
limit the increment between successive wavenumbers, i.e. k = 2/L, becomes vanishingly small, and
the spectrum of allowed wavenumbers k
n
becomes a continuum. Moreover, we recall that an integral can
be evaluated as the limit of a sum, e.g.
_
g(k) dk = lim
k0
n=
g(k
n
)k where k
n
= nk . (3.45)
Rewrite (3.44a) and (3.44b) as
f(x) =
1
n=
f(k
n
) exp(xk
n
) k ,
and
f(k
n
) =
1
2
_ 1
2
L
1
2
L
f(x) exp(xk
n
) dx,
where
f(k
n
) =
La
n
2
.
We then see that in the limit k 0, i.e. L ,
f(x) =
1
2
_
f(k) =
1
2
_
dx
2
a
2
= f(x) , (3.47)
Natural Sciences Tripos: IB Mathematical Methods I 50 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
where a is a constant and f is a known function. Suppose also that satises the [two] boundary
conditions [[ 0 as [x[ .
Suppose that we multiply the left-hand side (3.47) by
1
2
exp(kx) and integrate over x. Then
1
2
_
e
kx
_
d
2
dx
2
a
2
_
dx = T
_
d
2
dx
2
_
a
2
T () from (3.12)
= k
2
T () a
2
T () from (3.24b). (3.48a)
The same action on the right-hand side yields T(f). Hence from taking the Fourier transform of the
whole equation we have that
k
2
T () a
2
T () = T (f) . (3.48b)
Rearranging this equation we have that
T () =
T (f)
k
2
+a
2
, (3.48c)
and so from the inverse transform (3.20b) we have the solution
=
1
2
_
e
kx
T (f)
k
2
+a
2
dk . (3.48d)
Remark. The boundary conditions that [[ 0 as [x[ were implicitly used when we assumed that
the Fourier transform of existed. Why?
3.3.2 The Diusion Equation
Consider the diusion equation (see (2.23) or (2.29)) governing the evolution of, say, temperature, (x, t):
t
=
x
2
. (3.49)
In 2.6 we have seen how separable solutions and Fourier series can be used to solve (3.49) over nite
x-intervals. Fourier transforms can be used to solve (3.49) when the range of x is innite.
19
We will assume boundary conditions such as
constant and
x
0 as [x[ , (3.50)
so that the Fourier transform of exists (at least in a generalised sense):
(k, t) =
1
2
_
e
kx
dx. (3.51)
If we then multiply the left hand side of (3.49) by
1
2
exp(kx) and integrate over x we obtain the
time derivative of
:
1
2
_
e
kx
t
dx =
t
_
1
2
_
e
kx
dx
_
swap dierentiation and integration
=
t
from (3.51) .
A similar manipulation of the right hand side of (3.49) yields
1
2
_
e
kx
_
x
2
_
dx = k
2
from (3.24b).
19
Semi-innite ranges can also be tackled by means of suitable tricks: see the example sheet.
Natural Sciences Tripos: IB Mathematical Methods I 51 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Putting the left hand side and the right hand side together it follows that
(k, t) satises
t
+k
2
= 0 . (3.52a)
This equation has solution 10/01
(k, 0) =
1
2
_
e
ky
(y, 0) dy , (3.54a)
and so
(k, t) =
1
2
_
exp(ky k
2
t) (y, 0) dy . (3.54b)
We can now use the Fourier inversion formula to nd (x, t):
(x, t) =
1
2
_
dk e
kx
2
_
dk e
kx
_
1
2
_
exp(ky k
2
t) (y, 0) dy
_
from (3.54b)
=
1
2
_
dy (y, 0)
_
dk exp(k(x y) k
2
t) swap integration order.
From completing the square, or alternatively from our earlier calculation of the Fourier transform of a
Gaussian (see (3.18) and apply the transformations (2t)
1
2
, k (y x) and x k), we have that
_
dk exp
_
k(x y) k
2
t
_
=
_
t
exp
_
(x y)
2
4t
_
. (3.55)
Substituting into the above expression for (x, t) we obtain a
solution to the diusion equation in terms of the initial condi-
tion at t = 0:
(x, t) =
1
4t
_
dy (y, 0) exp
_
(x y)
2
4t
_
. (3.56a)
Example. If (x, 0) =
0
(x) then we obtain what is sometimes
referred to as the fundamental solution of the diusion equa-
tion, namely
(x, t) =
0
4t
exp
_
x
2
4t
_
. (3.56b)
Physically this means that if the temperature at one point of
an innite rod is instantaneously raised to innity, then the
resulting temperature distribution is that of a Gaussian with a
maximum temperature decaying like t
1
2
and a width increas-
ing like t
1
2
.
Natural Sciences Tripos: IB Mathematical Methods I 52 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4 Matrices
4.0 Why Study This?
A very good question (since this material is as dry as the Sahara). A general answer is that matrices
are essential mathematical tools: you have to know how to manipulate them. A more specic answer is
that we will study Hermitian matrices, and observables in quantum mechanics are Hermitian operators.
We will also study eigenvalues, and you should have come across these suciently often in your science
courses to know that they are an important mathematical concept.
4.1 Vector Spaces
The concept of a vector in three-dimensional Euclidean space can be generalised to n dimensions.
4.1.1 Some Notation
First some notation.
Notation Meaning
in
there exists
for all
4.1.2 Denition
A set of elements, or vectors, are said to form a complex linear vector space V if
1. there exists a binary operation, say addition, under which the set V is closed so that
if u, v V , then u +v V ; (4.1a)
2. addition is commutative and associative, i.e. for all u, v, w V
u +v = v +u, (4.1b)
(u +v) +w = u +(v +w) ; (4.1c)
3. there exists closure under multiplication by a complex scalar, i.e.
if a C and v V then av V ; (4.1d)
4. multiplication by a scalar is distributive and associative, i.e. for all a, b C and u, v V
a(u +v) = au +av , (4.1e)
(a +b)u = au +bu, (4.1f)
a(bu) = (ab)u; (4.1g)
5. there exists a null, or zero, vector 0 V such that for all v V
v +0 = v ; (4.1h)
6. for all v V there exists a negative, or inverse, vector (v) V such that
v +(v) = 0. (4.1i)
Natural Sciences Tripos: IB Mathematical Methods I 53 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Remarks.
The existence of a negative/inverse vector (see (4.1i)) allows us to subtract as well as add vectors,
by dening
u v u +(v) . (4.2)
If we restrict all scalars to be real, we have a real linear vector space, or a linear vector space over
reals.
We will often refer to V as a vector space, rather than the more correct linear vector space.
4.1.3 Linear Independence
A set of m non-zero vectors u
1
, u
2
, . . . u
m
is linearly
independent if
m
i=1
a
i
u
i
= 0 a
i
= 0 for i = 1, 2, . . . , m. (4.3)
Otherwise, the vectors are linearly dependent,
i.e. there exist scalars a
i
, at least one of which
is non-zero, such that
m
i=1
a
i
u
i
= 0.
Denition: Dimension of a Vector Space. If a vector space V contains a set of n linearly independent
vectors but all sets of n +1 vectors are linearly dependent, then V is said to be of dimension n. 11/04
Examples.
1. (1, 0, 0), (0, 1, 0) and (0, 0, 1) are linearly independent since
a(1, 0, 0) +b(0, 1, 0) +c(0, 0, 1) = (a, b, c) = 0 a = 0, b = 0, c = 0 .
2. (1, 0, 0), (0, 1, 0) and (1, 1, 0) = (1, 0, 0) +(0, 1, 0) are linearly dependent.
3. Since
(a, b, c) = a(1, 0, 0) +b(0, 1, 0) +c(0, 0, 1) , (4.4)
the vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1) span a linear vector space of dimension 3.
4.1.4 Basis Vectors
If V is an n-dimensional vector space then any set of n linearly independent vectors u
1
, . . . , u
n
is a
basis for V . The are a couple of key properties of a basis. 11/02
11/03
1. We claim that for all vectors v V , there exist scalars v
i
such that
v =
n
i=1
v
i
u
i
. (4.5)
The v
i
are said to be the components of v with respect to the basis u
1
, . . . , u
n
.
Natural Sciences Tripos: IB Mathematical Methods I 54 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Proof. To see this we note that since V has dimension n, the set u
1
, . . . , u
n
, v is linearly depen-
dent, i.e. there exist scalars (a
1
, . . . , a
n
, b), not all zero, such that
n
i=1
a
i
u
i
+bv = 0. (4.6)
If b = 0 then the a
i
= 0 for all i because the u
i
are linear independent, and we have a contradiction;
hence b ,= 0. Multiplying by b
1
we have that
v =
n
i=1
_
b
1
a
i
_
u
i
=
n
i=1
v
i
u
i
, (4.7)
where v
i
= b
1
a
i
(i = 1, . . . , n). 2
2. The scalars v
1
, . . . , v
n
are unique.
Proof. Suppose that
v =
n
i=1
v
i
u
i
and that v =
n
i=1
w
i
u
i
. (4.8)
Then, because v v = 0,
0 =
n
i=1
(v
i
w
i
)u
i
. (4.9)
But the u
i
(i = 1, . . . , n) are linearly independent, so the only solution of this equation is v
i
w
i
= 0
(i = 1, . . . n). Hence v
i
= w
i
(i = 1, . . . n), and we conclude that the two linear combinations (4.8)
are identical. 2
Remark. Let u
1
, . . . , u
m
be a set of vectors in an n-dimensional vector space.
If m > n then there exists some vector that, when expressed as a linear combination of the
u
i
, has non-unique scalar coecients. This is true whether or not the u
i
span V .
If m < n then there exists a vector that cannot be expressed as a linear combination of the u
i
.
Examples.
1. Three-Dimensional Euclidean Space E
3
. In this case the scalars are real and V is three-dimensional
because every vector v can be written uniquely as (cf. (4.4))
v = v
x
e
x
+v
y
e
y
+v
z
e
z
(4.10a)
= v
1
u
1
+v
2
u
2
+v
3
u
3
, (4.10b)
where e
x
= u
1
= (1, 0, 0), e
y
= u
2
= (0, 1, 0), e
3
= u
1
= (0, 0, 1) is a basis.
2. The Complex Numbers. Here we need to be careful what we mean.
Suppose we are considering a complex linear vector space,
i.e. a linear vector space over C. Then because the scalars
are complex, every complex number z can be written
uniquely as
z = 1 where C, (4.11a)
and moreover
1 = 0 = 0 for C. (4.11b)
We conclude that the single vector 1 constitutes a ba-
sis for C when viewed as a linear vector space over C.
Natural Sciences Tripos: IB Mathematical Methods I 55 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
However, we might alternatively consider the complex numbers as a linear vector space over R, so
that the scalars are real. In this case the pair of vectors 1, i constitute a basis because every
complex number z can be written uniquely as
z = a 1 +b where a, b R, (4.12a)
and
a 1 +b = 0 a = b = 0 if a, b, R. (4.12b)
Thus we have that
dim
C
C = 1 but dim
R
C = 2 , (4.13)
where the subscript indicates whether the vector space C is considered over C or R.
Worked Exercise. Show that 2 2 real symmetric matrices form a real linear vector space under addition.
Show that this space has dimension 3 and nd a basis.
Answer. Let V be the set of all real symmetric matrices, and let
A =
_
a
a
a
a
_
, B =
_
b
b
b
b
_
, C =
_
c
c
c
c
_
,
be any three real symmetric matrices.
1. We note that addition is closed since A +B is a real symmetric matrix.
2. Addition is commutative and associative since for all [real symmetric] matrices, A +B = B +A
and (A +B) +C = A +(B +C).
3. Multiplication by a scalar is closed since if p R, then pA is a real symmetric matrix.
4. Multiplication by a scalar is distributive and associative since for all p, q R and for all [real
symmetric] matrices, p(A +B) = pA +pB, (p +q)A = pA +qA and p(qA) = (pq)A.
5. The zero matrix,
0 =
_
0 0
0 0
_
,
is real and symmetric (and hence in V ), and such that for all [real symmetric] matrices
A +0 = A.
6. For any [real symmetric] matrix there exists a negative matrix, i.e. that matrix with the
components reversed in sign. In the case of a real symmetric matrix, the negative matrix is
again real and symmetric.
Therefore V is a real linear vector space; the vectors are the 2 2 real symmetric matrices.
Moreover, the three 2 2 real symmetric matrices
U
1
=
_
1 0
0 0
_
, U
2
=
_
0 1
1 0
_
, and U
3
=
_
0 0
0 1
_
, (4.15)
are independent, since for p, q, r R
pU
1
+qU
2
+rU
3
=
_
p q
q r
_
= 0 p = q = r = 0 .
Further, any 2 2 real symmetric matrix can be expressed as a linear combination of the U
i
since
_
p q
q r
_
= pU
1
+qU
2
+rU
3
.
We conclude that the 2 2 real symmetric matrices form a three-dimensional real linear vector
space under addition, and that the vectors U
i
dened in (4.15) form a basis. 11/01
Exercise. Show that 3 3 symmetric real matrices form a vector space under addition. Show that this
space has dimension 6 and nd a basis.
Natural Sciences Tripos: IB Mathematical Methods I 56 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.2 Change of Basis: the Role of Matrices
4.2.1 Transformation Matrices
Let u
i
: i = 1, . . . , n and u
i
: i = 1, . . . , n be two sets of basis vectors for an n-dimensional vector
space V . Since the u
i
: i = 1, . . . , n is a basis, the individual basis vectors of the basis u
i
: i = 1, . . . , n
can be written as
u
j
=
n
i=1
u
i
A
ij
(j = 1, . . . , n) , (4.16)
for some numbers A
ij
. From (4.5) we see that A
ij
is the ith component of the vector u
j
in the basis
u
i
: i = 1, . . . , n. The numbers A
ij
can be represented by a square n n transformation matrix A
A =
_
_
_
_
_
A
11
A
12
A
1n
A
21
A
22
A
2n
.
.
.
.
.
.
.
.
.
.
.
.
A
n1
A
n2
A
nn
_
_
_
_
_
. (4.17)
Similarly, since the u
i
: i = 1, . . . , n is a basis, the individual basis vectors of the basis u
i
: i = 1, . . . , n
can be written as
u
i
=
n
k=1
u
k
B
ki
(i = 1, 2, . . . , n) , (4.18)
for some numbers B
ki
. Here B
ki
is the kth component of the vector u
i
in the basis u
k
: k = 1, . . . , n.
Again the B
ki
can be viewed as the entries of a matrix B
B =
_
_
_
_
_
B
11
B
12
B
1n
B
21
B
22
B
2n
.
.
.
.
.
.
.
.
.
.
.
.
B
n1
B
n2
B
nn
_
_
_
_
_
. (4.19)
4.2.2 Properties of Transformation Matrices
From substituting (4.18) into (4.16) we have that
u
j
=
n
i=1
_
n
k=1
u
k
B
ki
_
A
ij
=
n
k=1
u
k
_
n
i=1
B
ki
A
ij
_
. (4.20)
However, because of the uniqueness of a basis representation and the fact that
u
j
= u
j
1 =
n
k=1
u
k
kj
,
it follows that
n
i=1
B
ki
A
ij
=
kj
. (4.21)
Hence in matrix notation, BA = I, where I is the identity matrix. Conversely, substituting (4.16) into
(4.18) leads to the conclusion that AB = I (alternatively argue by a relabeling symmetry). Thus
B = A
1
, (4.22a)
and
det A ,= 0 and det B ,= 0 . (4.22b)
Natural Sciences Tripos: IB Mathematical Methods I 57 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.2.3 Transformation Law for Vector Components
Consider a vector v, then in the u
i
: i = 1, . . . , n basis we have from (4.5)
v =
n
i=1
v
i
u
i
.
Similarly, in the u
i
: i = 1, . . . , n basis we can write
v =
n
j=1
v
j
u
j
(4.23)
=
n
j=1
v
j
n
i=1
u
i
A
ij
from (4.16)
=
n
i=1
u
i
n
j=1
(A
ij
v
j
) swap summation order.
Since a basis representation is unique it follows from (4.5) that
v
i
=
n
j=1
A
ij
v
j
, (4.24)
which relates the components of v in the basis u
i
: i = 1, . . . , n to those in the basis u
i
: i = 1, . . . , n.
Some Notation. Let v and v
=
_
_
_
_
_
v
1
v
2
.
.
.
v
n
_
_
_
_
_
respectively. (4.25)
Note that we now have bold v denoting a vector, italic v
i
denoting a component of a vector, and
sans serif v denoting a column matrix of components. Then (4.24) can be expressed as
v = Av
. (4.26a)
By applying A
1
to either side of (4.26a) it follows that
v
= A
1
v . (4.26b)
Unlectured Remark. Observe by comparison between (4.16) and (4.26b) that the components of v trans-
form inversely to the way that the basis vectors transform. This is so that the vector v is unchanged:
v =
n
j=1
v
j
u
j
from (4.23)
=
n
j=1
_
n
k=1
(A
1
)
jk
v
k
__
n
i=1
u
i
A
ij
_
from (4.26b) and (4.16)
=
n
i=1
u
i
_
_
n
k=1
v
k
_
_
n
j=1
A
ij
(A
1
)
jk
_
_
_
_
swap summation order
=
n
i=1
u
i
_
n
k=1
v
k
ik
_
AA
1
= I
=
n
i=1
v
i
u
i
. contract using (0.11b)
Natural Sciences Tripos: IB Mathematical Methods I 58 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Worked Example. Let u
1
= (1, 0), u
2
= (0, 1) and u
1
= (1, 1), u
2
= (1, 1) be two sets of basis vec-
tors in R
2
. Find the transformation matrix A
ij
that connects them. Verify the transformation law
for the components of an arbitrary vector v in the two coordinate systems.
Answer. We have that
u
1
= ( 1, 1) = (1, 0) +(0, 1) = u
1
+u
2
,
u
2
= (1, 1) = 1 (1, 0) +(0, 1) = u
1
+u
2
.
Hence from comparison with (4.16)
A
11
= 1 , A
21
= 1 , A
12
= 1 and A
22
= 1 ,
i.e.
A =
_
1 1
1 1
_
with inverse A
1
=
1
2
_
1 1
1 1
_
.
First Check. Note that A
1
is consistent with the observation that
u
1
= (1, 0) =
1
2
( (1, 1) (1, 1) ) =
1
2
(u
1
u
2
) ,
u
2
= (0, 1) =
1
2
( (1, 1) +(1, 1) ) =
1
2
(u
1
+u
2
) .
Second Check. Consider an arbitrary vector v, then
v = v
1
u
1
+v
2
u
2
=
1
2
v
1
(u
1
u
2
) +
1
2
v
2
(u
1
+u
2
)
=
1
2
(v
1
+v
2
)u
1
1
2
(v
1
v
2
)u
2
.
Thus
v
1
=
1
2
(v
1
+v
2
) and v
2
=
1
2
(v
1
v
2
) .
From (4.26b), i.e. v
= A
1
v, we obtain as above that
A
1
=
_
1
2
1
2
1
2
1
2
_
.
12/02
12/03
12/04
4.3 Scalar Product (Inner Product)
4.3.1 Denition of a Scalar Product
The prototype linear vector space V = E
3
has the additional property that any two vectors u and v can
be combined to form a scalar u v. This can be generalised to an n-dimensional vector space V over C
by assigning, for every pair of vectors u, v V , a scalar product u v C with the following properties.
1. If we [again] denote a complex conjugate with
then we require that
u v = (v u)
. (4.27a)
Note that implicit in this equation is the conclusion that for a complex vector space the ordering
of the vectors in the scalar product is important (whereas for E
3
this is not important). Further, if
we let u = v, then this implies that
v v = (v v)
, (4.27b)
i.e. v v is real.
2. The scalar product should be linear in its second argument, i.e. for a, b C
u (av
1
+bv
2
) = a u v
1
+b u v
2
. (4.27c)
3. The scalar product of a vector with itself should be positive, i.e.
v v 0 . (4.27d)
This allows us to write v v = [v[
2
, where the real positive number [v[ is the norm (cf. length) of
the vector v.
Natural Sciences Tripos: IB Mathematical Methods I 59 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4. We shall also require that the only vector of zero norm should be the zero vector, i.e.
[v[ = 0 v = 0. (4.27e)
Remark. Properties (4.27a) and (4.27c) imply that for a, b C
(au
1
+bu
2
) v = (v (au
1
+bu
2
))
= (av u
1
+bv u
2
)
= a
(v u
1
)
+b
(v u
2
)
= a
(u
1
v) +b
(u
2
v) , (4.28)
i.e. the scalar product is anti-linear in the rst argument.
Failure to remember this is a common cause of error.
However, if a, b R then (4.28) reduces to linearity in both arguments.
Alternative notation. An alternative notation for the scalar product and associated norm is
u [ v ) u v , (4.29a)
|v| [v[ = (v v)
1
2
. (4.29b)
4.3.2 Worked Exercise
Question. Find a denition of inner product for the vector space of real symmetric 2 2 matrices under
addition.
Answer. We have already seen that the real symmetric 2 2 matrices form a vector space. In dening
an inner product a key point to remember is that we need property (4.43a), i.e. that the scalar
product of a vector with itself is zero only if the vector is zero. Hence for real symmetric 2 2
matrices A and B consider the inner product dened by
A [ B ) =
n
i=1
n
j=1
A
ij
B
ij
(4.30a)
= A
11
B
11
+A
12
B
12
+A
21
B
21
+A
22
B
22
, (4.30b)
where we are using the alternative notation (4.29a) for the inner product. For this denition of
inner product we have for real symmetric 2 2 matrices A, B and C, and a, b C:
as in (4.27a)
B [ A ) =
n
i=1
n
j=1
B
ij
A
ij
= A [ B )
;
as in (4.27c)
A [ (B +C) ) =
n
i=1
n
j=1
A
ij
(B
ij
+C
ij
)
=
n
i=1
n
j=1
A
ij
B
ij
+
n
i=1
n
j=1
A
ij
C
ij
= A [ B ) + A [ C ) ;
as in (4.27d)
A [ A ) =
n
i=1
n
j=1
A
ij
A
ij
=
n
i=1
n
j=1
[A
ij
[
2
0 ;
as in (4.27e)
A [ A ) = 0 A = 0 .
Hence we have a well dened inner product.
Natural Sciences Tripos: IB Mathematical Methods I 60 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.3.3 Some Inequalities
Schwarzs Inequality. This states that
[ u [ v )[ |u| |v| , (4.31)
with equality only when u is a scalar multiple of v.
Proof. Write u [ v ) = [ u [ v )[e
v [ u ) +[[
2
v [ v ) from (4.27c) and (4.28)
= u [ u ) +(e
)[ u [ v )[ +[[
2
v [ v ) from (4.27a).
First, suppose that v = 0. The right-hand-side then simplies from a quadratic in to an expression
that is linear in . If u [ v ) ,= 0 we then have a contradiction since for certain choices of this
simplied expression can be negative. Hence we conclude that
u [ v ) = 0 if v = 0,
in which case (4.31) is satised as an equality. Next suppose
that v ,= 0 and choose = re
i=1
v
i
u
i
and w =
n
j=1
w
j
u
j
, (4.34)
we have that
v w =
_
n
i=1
v
i
u
i
_
_
n
j=1
w
j
u
j
_
=
n
i=1
n
j=1
v
i
w
j
u
i
u
j
from (4.27c) and (4.28)
=
n
i=1
n
j=1
v
i
G
ij
w
j
. (4.35)
Natural Sciences Tripos: IB Mathematical Methods I 61 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
We can simplify this expression (which determines the scalar product in terms of the G
ij
), but rst it
helps to have a denition.
Denition. The Hermitian conjugate of a matrix A is dened to be
A
= (A
T
)
= (A
)
T
, (4.36)
where
T
denotes a transpose.
Example.
If A =
_
A
11
A
12
A
21
A
22
_
then A
=
_
A
11
A
21
A
12
A
22
_
.
Properties. For matrices A and B recall that (AB)
T
= B
T
A
T
. Hence (AB)
T
= B
T
A
T
, and so
(AB)
= B
. (4.37a)
Also, from (4.36),
A
=
_
A
T
_
T
= A. (4.37b)
Let w be the column matrix
w =
_
_
_
_
_
w
1
w
2
.
.
.
w
n
_
_
_
_
_
, (4.38a)
and let v
(v
)
T
=
_
v
1
v
2
. . . v
n
_
. (4.38b)
Then the scalar product (4.35) can be written as
v w = v
Gw, (4.39)
where G is the matrix, or metric, with entries G
ij
(metrics are a key ingredient of General Relativity).
4.3.5 Properties of the Metric
First two denitions for a n n matrix A.
Denition. The matrix A is said to be a Hermitian if it is equal to its own Hermitian conjugate, i.e. if
A
= A. (4.40)
Denition. The matrix A is said to be positive denite if for all column matrices v of length n,
v
)
ij
G
ij
= (G
ji
)
from (4.33)
= u
i
u
j
from (4.27a)
= G
ij
. from (4.33) (4.42b)
Hence G is Hermitian, i.e.
G
= G. (4.42c)
Natural Sciences Tripos: IB Mathematical Methods I 62 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Remark. That G is Hermitian is consistent with the requirement (4.27b) that [v[
2
= v v is real, since
(v v)
= ((v v)
)
T
since a scalar is its own transpose
= (v v)
Gv)
from (4.39)
= v
Gv from (4.42c)
= v v . from (4.39)
Property: a metric is positive denite. From (4.27d) and (4.27e) we have from the properties of a scalar
product that for any v
[v[
2
0 with equality i v = 0, (4.43a)
where i means if and only if. Hence, from (4.39), for any v
v
i
: i = 1, . . . , n, while in 4.3 we introduced inner products and dened the metric associated with
a given basis. We next consider how a metric transforms under a change of basis.
First we recall from (4.26a) that for an arbitrary vector v, its components in the two bases transform
according to v = Av
, where v and v
= v
. (4.44)
Hence for arbitrary vectors v and w
v w = v
Gw from (4.39)
= v
GAw
, (4.45)
where G
i
: i = 1, . . . , n basis. Since v and w are arbitrary we conclude that
the metric in the new basis is given in terms of the metric in the old basis by
G
= A
GA. (4.46)
Unlectured Alternative Derivation. (4.46) can also be derived from the denition of the metric since
(G
)
ij
G
ij
= u
i
u
j
from (4.33)
=
_
n
k=1
u
k
A
ki
_
_
n
=1
u
A
j
_
from (4.16)
=
n
k=1
n
=1
A
ki
(u
k
u
) A
j
from (4.27c) and (4.28)
=
n
k=1
n
=1
A
ik
G
k
A
j
from (4.33) and (4.42a)
= (A
GA)
ij
. (4.47)
Natural Sciences Tripos: IB Mathematical Methods I 63 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Remark. As a check we observe that
(G
= (A
GA)
= A
(A
= A
GA = G
. (4.48)
Thus G
1
0 0
0
2
0
.
.
.
.
.
.
.
.
.
.
.
.
0 0
n
_
_
_
_
_
.
Let us suppose that there exists an invertible matrix
A such that
G
= A
GA = , (4.49a)
where is a diagonal matrix, i.e. a matrix such that
ij
=
i
ij
. (4.49b)
The matrix A is said to diagonalize G. Subsequently
we shall (almost) show that for any Hermitian ma-
trix G a matrix A can be found to diagonalize G.
Properties of the
i
.
1. Because G
= is Hermitian,
i
=
ii
=
ii
=
i
(i = 1, . . . , n), (4.50)
and hence the diagonal entries
i
are real.
2. From (4.45), (4.49a) and (4.49b) we have that
0 [v[
2
= v
=
n
i=1
n
j=1
v
i
i
ij
v
j
=
n
i=1
i
[v
i
[
2
, (4.51)
with equality only if v = 0 (see (4.27e)). This can only be true for all vectors v if
i
> 0 for i = 1, . . . , n, (4.52)
i.e. if the diagonal entries
i
are strictly positive.
4.4.3 Orthonormal Bases
From (4.33), (4.49a) and (4.49b) we see that
u
i
u
j
= G
ij
=
ij
=
i
ij
. (4.53)
Hence u
i
u
j
= 0 when i ,= j, i.e. the new basis vectors are orthogonal. Further, because the
i
are
strictly positive we can normalise the basis, viz.
e
i
=
1
i
u
i
, (4.54a)
so that
e
i
e
j
=
ij
. (4.54b)
The e
i
: i = 1, . . . , n are thus an orthonormal basis. We conclude, subject to showing that G can be
diagonalized (because it is an Hermitian matrix), that:
Natural Sciences Tripos: IB Mathematical Methods I 64 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Any vector space with a scalar product has an orthonormal basis.
It also follows, since from (4.33) the elements of the metric are just e
i
e
j
, that the metric for an
orthonormal basis is the identity matrix I.
The scalar product in orthonormal bases. Let the column vectors v and w contain the components of two
vectors v and w, respectively, in an orthonormal basis e
i
: i = 1, . . . , n. Then from (4.39)
v w = v
I w = v
w. (4.55)
This is consistent with the denition of the scalar product from last year.
Orthogonality in orthonormal bases. If the vectors v and w are orthogonal, i.e. v w = 0, then the
components in an orthonormal basis are such that
v
w = 0 . (4.56)
4.5 Unitary and Orthogonal Matrices
Given an orthonormal basis, a question that arises is what changes of basis maintain orthonormality.
Suppose that e
i
: i = 1, . . . , n is a new orthonormal basis, and suppose that in terms of the original
orthonormal basis
e
i
=
n
k=1
e
k
U
ki
, (4.57)
where U is the transformation matrix (cf. (4.16)). Then from (4.46) the metric for the new basis is given
by
G
= U
I U = U
U. (4.58a)
For the new basis to be orthonormal we require that the new metric to be the identity matrix, i.e. we
require that
U
U = I . (4.58b)
Since det U ,= 0, the inverse U
1
exists and hence
U
= U
1
. (4.59)
Denition. A matrix for which the Hermitian conjugate is equal to the inverse is said to be unitary.
Vector spaces over R. An analogous result applies to vector spaces over R. Then, because the transfor-
mation matrix, say U = R, is real,
U
= R
T
,
and so
R
T
= R
1
. (4.60)
Denition. A real matrix with this property is said to be orthogonal.
Example. An example of an orthogonal matrix is the 3 3 rotation matrix R that determines
the new components, v
= R
T
v, of a three-dimensional vector v after a rotation of the axes
(note that under a rotation orthogonal axes remain orthogonal and unit vectors remain unit
vectors).
13/01
Natural Sciences Tripos: IB Mathematical Methods I 65 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.6 Diagonalization of Matrices: Eigenvectors and Eigenvalues
Suppose that M is a square n n matrix. Then a non-zero column vector x such that
Mx = x , (4.61a)
where C, is said to be an eigenvector of the matrix M with eigenvalue . If we rewrite this equation
as
(MI)x = 0 , (4.61b)
then since x is non-zero it must be that
det(MI) = 0 . (4.62)
This is called the characteristic equation of the matrix M. The left-hand-side of (4.62) is an nth order
polynomial in called the characteristic polynomial of M. The roots of the characteristic polynomial are
the eigenvalues of M.
Example. Find the eigenvalues of
M =
_
0 1
1 0
_
(4.63a)
Answer. From (4.62)
0 = det(MI) =
_
_
_
_
1
1
_
_
_
_
=
2
+1 = ( )( +) , (4.63b)
and so the eigenvalues of M are .
Since an nth order polynomial has exactly n, possibly complex, roots (counting multiplicities in the case
of repeated roots), there are always n eigenvalues
i
, i = 1, . . . , n. Let x
i
be the respective eigenvectors,
i.e.
Mx
i
=
i
x
i
(4.64a)
or in component notation
n
k=1
M
jk
x
i
k
=
i
x
i
j
. (4.64b)
Let X be the n n matrix dened by
(X)
ij
X
ij
= x
j
i
, (4.65a)
i.e.
X =
_
_
_
_
_
_
x
1
1
x
2
1
x
n
1
x
1
2
x
2
2
x
n
2
.
.
.
.
.
.
.
.
.
.
.
.
x
1
n
x
2
n
x
n
n
_
_
_
_
_
_
. (4.65b)
Then (4.64b) can be rewritten as
n
k=1
M
jk
X
ki
=
i
X
ji
=
n
k=1
X
jk
ki
i
(4.66a)
or, in matrix notation,
MX = X, (4.66b)
Natural Sciences Tripos: IB Mathematical Methods I 66 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
where is the diagonal matrix
=
_
_
_
_
_
1
0 0
0
2
0
.
.
.
.
.
.
.
.
.
.
.
.
0 0
n
_
_
_
_
_
. (4.66c)
If X has an inverse, X
1
, then
X
1
MX = , (4.67)
i.e. X diagonalizes M. But for X
1
to exist we require that det X ,= 0; this is equivalent to the requirement
that the columns of X are linearly independent. These columns are just the eigenvectors of M, so
an n n matrix is diagonalizable if and only if it has n linearly-independent eigenvectors.
4.7 Eigenvalues and Eigenvectors of Hermitian Matrices
In order to determine whether a metric is diagonalizable, we conclude from the above considerations that
we need to determine whether the metric has n linearly-independent eigenvectors. To this end we shall
determine two important properties of Hermitian matrices.
4.7.1 The Eigenvalues of an Hermitian Matrix are Real
Let H be an Hermitian matrix, and suppose that e is a non-zero eigenvector with eigenvalue . Then
He = e , (4.68a)
and hence
e
He = e
e . (4.68b)
Take the Hermitian conjugate of both sides; rst the left hand side
_
e
He
_
= e
e since (AB)
= B
and (A
= A
= e
e)
e . (4.69b)
On equating the above two results we have that
e
He =
e . (4.70)
It then follows from (4.68b) and (4.70) that
(
) e
e = 0 . (4.71)
However we have assumed that e is a non-zero eigenvector, so
e
e =
n
i=1
e
i
e
i
=
n
i=1
[e
i
[
2
> 0 , (4.72)
and hence it follows from (4.71) that =
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.7.2 An n-Dimensional Hermitian Matrix has n Orthogonal Eigenvectors
i
,=
j
. Let e
i
and e
j
be two eigenvectors of an Hermitian matrix H. First of all suppose that their
respective eigenvalues
i
and
j
are dierent, i.e.
i
,=
j
. From pre-multiplying (4.68a), with
e e
i
, by (e
j
)
we have that
e
j
He
i
=
i
e
j
e
i
. (4.73a)
Similarly
e
i
He
j
=
j
e
i
e
j
. (4.73b)
On taking the Hermitian conjugate of (4.73b) it follows that
e
j
H
e
i
=
j
e
j
e
i
.
However, H is Hermitian, i.e. H
i
=
j
. The case when there is a repeated eigenvalue is more dicult. However with sucient mathe-
matical eort it can still be proved that orthogonal eigenvectors exist for the repeated eigenvalue.
Instead of adopting this approach we appeal to arm-waving arguments.
An experimental approach. First adopt an experimental approach. In real life it is highly unlikely
that two eigenvalues will be exactly equal (because of experimental error, etc.). Hence this
case never arises and we can assume that we have n orthogonal eigenvectors.
A perturbation approach. Alternatively suppose that in
the real problem two eigenvalues are exactly equal.
Introduce a specic, but small, perturbation of size
(cf. the introduced in (3.19b) when calculating
the Fourier transform of the Heaviside step function)
such that the perturbed problem has unequal eigen-
values (this is highly likely to be possible because the
problem with equal eigenvalues is likely to be struc-
turally unstable). Now let 0. For all non-zero
values of (both positive and negative) there will be
n orthogonal eigenvectors. On appealing to a conti-
nuity argument there will be n orthogonal eigenvec-
tors for the specic case = 0.
14/02
14/04
Lemma. Orthogonal eigenvectors e
i
and e
j
are linearly independent.
Proof. Suppose there exist a
i
and a
j
such that
a
i
e
i
+a
j
e
j
= 0 . (4.77)
Natural Sciences Tripos: IB Mathematical Methods I 68 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Then from pre-multiplying (4.77) by e
j
and using (4.76) it follows that
0 = a
j
e
j
e
j
= a
j
n
k=1
(e
j
)
k
(e
j
)
k
= a
j
n
k=1
(e
j
)
k
2
. (4.78)
Since e
j
is non-zero it follows that a
j
= 0. By the relabeling symmetry, or by using the Hermitian
conjugate of (4.76), it similarly follows that a
i
= 0. 2
We conclude that, whether or not two or more eigenvalues are equal,
an n-dimensional Hermitian matrix has n orthogonal eigenvectors that are linearly independent.
Remark. We can tighten this is result a little further by noting that, for any C,
if He
i
=
i
e
i
, then H(e
i
) =
i
(e
i
) . (4.79a)
This allows us to normalise the eigenvectors so that
e
i
e
i
= 1 . (4.79b)
Hence for Hermitian matrices it is always possible to nd n orthonormal eigenvectors that are
linearly independent.
14/03
4.7.3 Diagonalization of Hermitian Matrices
It follows from the above result, 4.6, and specically (4.65b), that an Hermitian matrix H can be
diagonalized to the matrix by means of the transformation X
1
HX if
X =
_
_
_
_
_
_
e
1
1
e
2
1
e
n
1
e
1
2
e
2
2
e
n
2
.
.
.
.
.
.
.
.
.
.
.
.
e
1
n
e
2
n
e
n
n
_
_
_
_
_
_
. (4.80)
Remark. If the e
i
are orthonormal eigenvectors of H then X is a unitary matrix. To see this note that
(X
X)
ij
=
n
k=1
(X
)
ik
(X)
kj
=
n
k=1
(e
i
k
)
e
j
k
= e
i
e
j
=
ij
by orthonormality, (4.81a)
or, in matrix notation,
X
X = I. (4.81b)
Hence X
= X
1
, and we conclude X is a unitary matrix.
We deduce that every Hermitian matrix, H, is diagonalizable by a transformation X
HX, where X is a
unitary matrix. Hence, if in (4.49a) we identify H and X with G and A respectively, we see that
a metric can always be diagonalized by a suitable choice of basis,
namely the basis made up of the eigenvectors of G. Similarly, if we restrict ourselves to real matrices, then
every real symmetric matrix, S, is diagonalizable by a transformation R
T
SR, where R is an orthogonal
matrix.
Natural Sciences Tripos: IB Mathematical Methods I 69 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Example. Find the orthogonal matrix that diagonalizes the real symmetric matrix
S =
_
1
1
_
where 0 is real. (4.82)
Answer. The characteristic equation is
0 =
_
_
_
_
1
1
_
_
_
_
= (1 )
2
2
. (4.83)
The solutions to (4.83) are
=
_
+
= 1 +
= 1
. (4.84)
The corresponding eigenvectors e
__
e
1
e
2
_
= 0 , (4.85a)
or
_
1 1
1 1
__
e
1
e
2
_
= 0 . (4.85b)
,= 0. If ,= 0 (in which case
+
,=
) we have that
e
2
= e
1
. (4.86a)
On normalising e
so that e
= 1, it follows that
e
+
=
1
2
_
1
1
_
, e
=
1
2
_
1
1
_
. (4.86b)
Note that e
+
e
= 0, as proved earlier.
= 0. If = 0, then S = I, and so any non-zero vector is an eigenvector with eigenvalue 1. In
agreement with the result stated earlier, two linearly-independent eigenvectors can still be
found, and we can choose them to be orthonormal, e.g. e
+
and e
1
e
+
2
e
2
_
=
_
1
2
1
2
1
2
1
2
_
=
1
2
_
1 1
1 1
_
. (4.87)
As a check we note that
R
T
R =
1
2
_
1 1
1 1
__
1 1
1 1
_
=
_
1 0
0 1
_
, (4.88)
and
R
T
SR =
1
2
_
1 1
1 1
__
1
1
__
1 1
1 1
_
=
1
2
_
1 1
1 1
__
1 + 1
1 + 1 +
_
=
_
1 + 0
0 1
_
= . (4.89)
14/01
Natural Sciences Tripos: IB Mathematical Methods I 70 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.7.4 Diagonalization of Matrices
For a general n n matrix M with n distinct eigenvalues
i
, (i = 1, . . . , n), it is possible to show (but
not here) that there are n linearly independent eigenvectors e
i
. It then follows from our earlier results in
4.6 that M is diagonalized by the matrix
X =
_
_
_
_
_
_
e
1
1
e
2
1
e
n
1
e
1
2
e
2
2
e
n
2
.
.
.
.
.
.
.
.
.
.
.
.
e
1
n
e
2
n
e
n
n
_
_
_
_
_
_
. (4.90)
Remark. If M has two or more equal eigenvalues it may or may not have n linearly independent eigen-
vectors. If it does not have n linearly independent eigenvectors then it is not diagonalizable. As an
example consider the matrix
M =
_
0 1
0 0
_
, (4.91)
with the characteristic equation
2
= 0; hence
1
=
2
= 0. Moreover, M has only one linearly
independent eigenvector, namely
e =
_
1
0
_
, (4.92)
and so M is not diagonalizable.
Normal Matrices. However, normal matrices, i.e. matrices such that M
M = MM
Hx =
n
i=1
n
j=1
x
i
H
ij
x
j
, (4.93a)
is called an Hermitian form; it is a function of the complex numbers (x
1
, x
2
, . . . , x
n
). Moreover, we
note that
(x
Hx)
= (x
Hx)
x since (AB)
= B
= x
Hx . since H is Hermitian
Hence an Hermitian form is real.
The Real Case. An important special case is obtained by restriction to real vector spaces; then x and H
are real. It follows that H
T
= H, i.e. H is a real symmetric matrix; let us denote such a matrix by S.
In this case
x
T
Sx =
n
i=1
n
j=1
x
i
S
ij
x
j
. (4.93b)
When considered as a function of the real variables x
1
, x
2
, . . . , x
n
, this expression called a quadratic
form.
Remark. In the same way that an Hermitian matrix can be viewed as a generalisation to complex matrices
of a real symmetric matrix, an Hermitian form can be viewed a generalisation to vector spaces over
C of a quadratic form for a vector space over R.
Natural Sciences Tripos: IB Mathematical Methods I 71 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
4.8.1 Eigenvectors and Principal Axes
Let us consider the equation
x
T
Sx = constant , (4.94)
where S is a real symmetric matrix.
Conic Sections. First suppose that n = 2, then with
x =
_
x
y
_
and S =
_
_
, (4.95)
(4.94) becomes
x
2
+2xy +y
2
= constant . (4.96)
This is the equation of a conic section. Now suppose that x = Rx
, where x
= (x
, y
)
T
and R is a
real orthogonal matrix. The equation of the conic section then becomes
x
T
S
= constant, where S
= R
T
SR. (4.97)
Now choose R to diagonalize S so that
S
=
_
1
0
0
2
_
, (4.98)
where
1
and
2
are the eigenvalues of S; from our earlier results in 4.7.3 we know that such a
matrix can always be found (and wlog det R = 1). The conic section then takes the simple form
1
x
2
+
2
y
2
= constant . (4.99)
The prime axes are identical to the eigenvectors of S
are
e
1
=
_
1
0
_
and e
2
=
_
0
1
_
, (4.100)
with eigenvalues
1
and
2
respectively. Axes that coincide with the eigenvectors are known as
principal axes. In the terms of the original axes, the principal axes are given by e
i
= Re
i
(i = 1, 2).
Interpretation. If
1
2
> 0 then (4.99) is the equation for an ellipse with principal axes coinciding
with the x
and y
axes.
Scale. The scale of the ellipse is determined by the constant on the right-hand-side of (4.94)
(or (4.99)).
Orientation. The orientation of the ellipse is determined by the eigenvectors of S.
Shape. The shape of the ellipse is determined by the eigenvalues of S.
In the degenerate case,
1
=
2
, the ellipse becomes a circle with no preferred principal axes.
Any two linearly independent vectors may be chosen as the principal axes, which are no longer
necessarily orthogonal but can be be chosen to be so.
If
1
2
< 0 then (4.99) is the equation for a hyperbola with principal axes coinciding with
the x
and y
axes. Similar results to above hold for the scale, orientation and shape.
Quadric Surfaces. For a real 3 3 symmetric matrix S, the equation
x
T
Sx = k , (4.101)
where k is a constant, is called a quadric surface. After a rotation of axes such that S S
= , a
diagonal matrix, its equation takes the form
1
x
2
+
2
y
2
+
3
z
2
= k . (4.102)
Natural Sciences Tripos: IB Mathematical Methods I 72 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
The axes in the new prime coordinate system are again known as principal axes, and the eigen-
vectors of S
(or S) are again aligned with the principal axes. We note that when
i
k > 0,
the distance to surface along the ith principal axes =
_
k
i
. (4.103)
Special Cases.
In the case of metric matrices we know that S is positive denite, and hence that
1
,
2
and
3
are all positive. The quadric surface is then an ellipsoid
If
1
=
2
then we have a surface of revolution about the z
axis.
If
1
=
2
=
3
we have a sphere.
If
3
0 then we have a cylinder.
If
2
,
3
0, then we recover the planes x
=
_
k
1
.
15/02
4.8.2 The Stationary Properties of the Eigenvalues
Suppose that we have an orthonormal basis, and let
x be a point on x
T
Sx = k where k is a constant. Then
from (4.55) the distance squared from the origin to
the quadric surface is x
T
x. This distance naturally
depends on the value of k, i.e. the scale of the surface.
This dependence on k can be removed by considering
the square of the relative distance to the surface, i.e.
(relative distance to surface)
2
=
x
T
x
x
T
Sx
. (4.104)
Let us consider the directions for which this relative distance, or equivalently its inverse
(x) =
x
T
Sx
x
T
x
, (4.105)
is stationary. We can nd the so-called rst variation in (x) by letting
x x +x and x
T
x
T
+x
T
, (4.106)
by performing a Taylor expansion, and by ignoring terms quadratic or higher in [x[. First note that
(x
T
+x
T
)(x +x) = x
T
x +x
T
x +x
T
x +. . .
= x
T
x +2x
T
x +. . . . since the transpose of a scalar is itself
Hence
1
(x
T
+x
T
)(x +x)
=
1
x
T
x +2x
T
x +. . .
=
1
x
T
x
_
1 +
2x
T
x
x
T
x
+. . .
_
1
=
1
x
T
x
_
1
2x
T
x
x
T
x
+. . .
_
.
Similarly
(x
T
+x
T
)S(x +x) = x
T
Sx +x
T
Sx +x
T
Sx +. . .
= x
T
Sx +2x
T
Sx +. . . . since S
T
= S
Natural Sciences Tripos: IB Mathematical Methods I 73 c [email protected], Michaelmas 2004
T
h
i
s
i
s
a
s
u
p
e
r
v
i
s
o
r
s
c
o
p
y
o
f
t
h
e
n
o
t
e
s
.
I
t
i
s
n
o
t
t
o
b
e
d
i
s
t
r
i
b
u
t
e
d
t
o
s
t
u
d
e
n
t
s
.
Putting the above results together we have that

    \delta\lambda(x) = \frac{(x^T + \delta x^T)\,S\,(x + \delta x)}{(x^T + \delta x^T)(x + \delta x)} - \frac{x^T S x}{x^T x}
        = \frac{x^T S x + 2\,\delta x^T S x + \dots}{x^T x}\left(1 - \frac{2\,\delta x^T x}{x^T x} + \dots\right) - \frac{x^T S x}{x^T x}
        = \frac{2\,\delta x^T S x}{x^T x} - \frac{x^T S x}{x^T x}\,\frac{2\,\delta x^T x}{x^T x} + \dots
        = \frac{2}{x^T x}\left(\delta x^T S x - \lambda(x)\,\delta x^T x\right)
        = \frac{2}{x^T x}\,\delta x^T \left(S x - \lambda(x)\,x\right) .    (4.107)

Hence the first variation is zero for all possible $\delta x$ when

    S x = \lambda(x)\,x ,    (4.108)

i.e. when $x$ is an eigenvector of $S$ and $\lambda$ is the associated eigenvalue. So the eigenvectors of $S$ are the directions which make the relative distance (4.104) stationary, and the eigenvalues are the values of (4.105) at the stationary points.

By a similar argument one can show that the eigenvalues of an Hermitian matrix, $H$, are the values of the function

    \lambda(x) = \frac{x^\dagger H x}{x^\dagger x}    (4.109)

at its stationary points.
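As a quick numerical illustration of (4.105) and (4.108) (a sketch, not part of the notes; the matrix S is an arbitrary example), the Rayleigh quotient evaluated at the eigenvectors of a real symmetric matrix reproduces its eigenvalues:

```python
import numpy as np

# Arbitrary real symmetric matrix (example choice for illustration only).
S = np.array([[2.0, 1.0, 0.0],
              [1.0, 3.0, 1.0],
              [0.0, 1.0, 2.0]])

def rayleigh(S, x):
    """Return lambda(x) = x^T S x / x^T x, the quantity (4.105)."""
    return (x @ S @ x) / (x @ x)

eigvals, eigvecs = np.linalg.eigh(S)
for lam, v in zip(eigvals, eigvecs.T):
    # At an eigenvector the Rayleigh quotient equals the eigenvalue, cf. (4.108).
    assert np.isclose(rayleigh(S, v), lam)
print(eigvals)
```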
4.9 Mechanical Oscillations (Unlectured: See Easter Term Course)
4.9.0 Why Have We Studied Hermitian Matrices, etc.?
The above discussion of quadratic forms, etc. may have appeared rather dry. There is a nice application
concerning normal modes and normal coordinates for mechanical oscillators (e.g. molecules). If you are
interested read on, if not wait until the first few lectures of the Easter term course where the following
material appears in the schedules.
4.9.1 Governing Equations

Suppose we have a mechanical system described by coordinates $q_1, \dots, q_n$, where the $q_i$ may be distances, angles, etc. Suppose that the system is in equilibrium when $q = 0$, and consider small oscillations about the equilibrium. The velocities in the system will depend on the $\dot q_i$, and for small oscillations the velocities will be linear in the $\dot q_i$, so the total kinetic energy $T$ will be quadratic in the $\dot q_i$. The most general quadratic expression for $T$ is

    T = \sum_{i,j} a_{ij}\,\dot q_i \dot q_j = \dot q^T A\,\dot q .    (4.110)

Since kinetic energies are positive, $A$ should be positive definite. We will also assume that, by a suitable choice of coordinates, $A$ can be chosen to be symmetric.

Consider next the potential energy, $V$. This will depend on the coordinates, but not on the velocities, i.e. $V \equiv V(q)$. For small oscillations we can expand $V$ about the equilibrium position:

    V(q) = V(0) + \sum_i q_i\,\frac{\partial V}{\partial q_i}(0) + \frac{1}{2}\sum_{i,j} q_i q_j\,\frac{\partial^2 V}{\partial q_i \partial q_j}(0) + \dots .    (4.111)
Normalise the potential so that $V(0) = 0$. Also, since the system is in equilibrium when $q = 0$, we require that there is no force when $q = 0$. Hence

    \frac{\partial V}{\partial q_i}(0) = 0 .    (4.112)

Hence for small oscillations we may approximate (4.111) by

    V = \sum_{i,j} b_{ij}\,q_i q_j = q^T B q .    (4.113)

We note that if the mixed derivatives of $V$ are equal, then $B$ is symmetric. Assume next that there are no dissipative forces, so that there is conservation of energy. Then we have that

    \frac{d}{dt}(T + V) = 0 .    (4.114)

In matrix form this equation becomes, after using the symmetry of $A$ and $B$,

    0 = \frac{d}{dt}(T + V) = \ddot q^T A\,\dot q + \dot q^T A\,\ddot q + \dot q^T B q + q^T B\,\dot q
      = 2\,\dot q^T (A\,\ddot q + B q) .    (4.115)

We will assume that the solution of this matrix equation that we need is the one in which the coefficient of each $\dot q_i$ is zero, i.e. we will require

    A\,\ddot q + B q = 0 .    (4.116)
4.9.2 Normal Modes

We will seek solutions to (4.116) that all oscillate with the same frequency, i.e. we seek solutions

    q = x \cos(\omega t + \phi) ,    (4.117)

where $\omega$ is a constant. Substituting into (4.116) we find that

    (B - \omega^2 A)\,x = 0 ,    (4.118)

or, on the assumption that $A$ is invertible,

    (A^{-1}B - \omega^2 I)\,x = 0 .    (4.119)

Thus the solutions $\omega^2$ are the eigenvalues of $A^{-1}B$; the corresponding $\omega$ are referred to as eigenfrequencies or normal frequencies. The eigenvectors are referred to as normal modes, and in general there will be $n$ of them.
4.9.3 Normal Coordinates

We have seen how, for a quadratic form, we can change coordinates so that the matrix for a particular form is diagonalized. We assert, but do not prove, that a transformation can be found that simultaneously diagonalizes $A$ and $B$, say to $M$ and $N$. The new coordinates, say $u$, are referred to as normal coordinates. In terms of them the kinetic and potential energies become

    T = \sum_i \mu_i\,\dot u_i^2 = \dot u^T M\,\dot u   \text{and}   V = \sum_i \nu_i\,u_i^2 = u^T N u    (4.120)

respectively, where

    M = \begin{pmatrix} \mu_1 & 0 & \cdots & 0 \\ 0 & \mu_2 & & \vdots \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & \cdots & \mu_n \end{pmatrix}
    \quad\text{and}\quad
    N = \begin{pmatrix} \nu_1 & 0 & \cdots & 0 \\ 0 & \nu_2 & & \vdots \\ \vdots & & \ddots & \vdots \\ 0 & \cdots & \cdots & \nu_n \end{pmatrix} .    (4.121)

The equations of motion are then the uncoupled equations

    \mu_i\,\ddot u_i + \nu_i\,u_i = 0   (i = 1, \dots, n).    (4.122)
4.9.4 Example

Consider a system in which three particles of mass $m$, $\mu m$ and $m$ are connected in a straight line by light springs with a force constant $k$ (cf. an idealised model of CO$_2$).^20

The kinetic energy of the system is then

    T = \tfrac{1}{2}\left( m\dot x_1^2 + \mu m\,\dot x_2^2 + m\dot x_3^2 \right) ,    (4.123a)

while the potential energy stored in the springs is

    V = \tfrac{1}{2} k \left( (x_2 - x_1)^2 + (x_3 - x_2)^2 \right) .    (4.123b)

The kinetic and potential energy matrices are thus

    A = \tfrac{1}{2} m \begin{pmatrix} 1 & 0 & 0 \\ 0 & \mu & 0 \\ 0 & 0 & 1 \end{pmatrix}
    \quad\text{and}\quad
    B = \tfrac{1}{2} k \begin{pmatrix} 1 & -1 & 0 \\ -1 & 2 & -1 \\ 0 & -1 & 1 \end{pmatrix} ,    (4.124)

respectively. In order to find the normal frequencies we therefore need to find the roots of $|B - \omega^2 A| = 0$ (see (4.118)), i.e. we need the roots of

    \begin{vmatrix} 1-\lambda & -1 & 0 \\ -1 & 2-\mu\lambda & -1 \\ 0 & -1 & 1-\lambda \end{vmatrix}
    = (1-\lambda)\,\lambda\,\bigl(\mu\lambda - (\mu+2)\bigr) = 0 ,    (4.125)

where $\lambda = m\omega^2/k$. The eigenfrequencies are thus

    \omega_1 = 0 , \quad \omega_2 = \left(\frac{k}{m}\right)^{1/2} , \quad \omega_3 = \left(\frac{k}{m}\right)^{1/2}\left(1 + \frac{2}{\mu}\right)^{1/2} ,    (4.126)

with corresponding (non-normalised) eigenvectors

    x_1 = \begin{pmatrix} 1 \\ 1 \\ 1 \end{pmatrix} , \quad
    x_2 = \begin{pmatrix} 1 \\ 0 \\ -1 \end{pmatrix} , \quad
    x_3 = \begin{pmatrix} 1 \\ -2/\mu \\ 1 \end{pmatrix} .    (4.127)

Remark. Note that the centre of mass of the system is at rest in the case of $x_2$ and $x_3$.

^20 See Riley, Hobson & Bence (1997).
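The eigenfrequencies (4.126) and normal modes (4.127) are easy to check numerically. The sketch below is an illustration, not part of the notes; the values of m, k and the mass ratio μ are arbitrary example choices.

```python
import numpy as np
from scipy.linalg import eigh

# Example parameter choices (for illustration only).
m, k, mu = 1.0, 1.0, 0.75

# Kinetic and potential energy matrices (4.124).
A = 0.5 * m * np.diag([1.0, mu, 1.0])
B = 0.5 * k * np.array([[ 1.0, -1.0,  0.0],
                        [-1.0,  2.0, -1.0],
                        [ 0.0, -1.0,  1.0]])

# Generalized eigenproblem (4.118): B x = omega^2 A x.
omega2, modes = eigh(B, A)
omega = np.sqrt(np.clip(omega2, 0.0, None))

print("numerical eigenfrequencies:", omega)
print("predicted by (4.126):      ",
      np.sort([0.0, np.sqrt(k/m), np.sqrt(k/m) * np.sqrt(1.0 + 2.0/mu)]))
print("normal modes (columns):")
print(modes)
```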
5 Elementary Analysis
5.0 Why Study This?
Analysis is one of the foundations upon which mathematics is built. At some point you ought to at least
inspect the foundations! Also, you need to have an idea of when, and when not, you can sum a series,
e.g. a Fourier series.
5.1 Sequences and Limits

5.1.1 Sequences

A sequence is a set of numbers occurring in order. If the sequence is unending we have an infinite sequence.

Example. If the $n$th term of a sequence is $s_n = \frac{1}{n}$, the sequence is

    1, \tfrac{1}{2}, \tfrac{1}{3}, \tfrac{1}{4}, \dots .    (5.1)
5.1.2 Sequences Tending to a Limit, or Not

Sequences tending to a limit. A sequence, $s_n$, is said to tend to the limit $s$ if, given any positive $\epsilon$, there exists $N \equiv N(\epsilon)$ such that

    |s_n - s| < \epsilon   \text{for all}   n > N .    (5.2a)

We then write

    \lim_{n\to\infty} s_n = s .    (5.2b)

Example. Suppose $s_n = x^n$ with $|x| < 1$. Given $0 < \epsilon < 1$ let $N(\epsilon)$ be the smallest integer such that, for a given $x$,

    N > \frac{\log 1/\epsilon}{\log 1/|x|} .    (5.3)

Then, if $n > N$,

    |s_n - 0| = |x|^n < |x|^N < \epsilon .    (5.4a)

Hence

    \lim_{n\to\infty} x^n = 0 .    (5.4b)

Property. An increasing sequence tends either to a limit or to $+\infty$. Hence a bounded increasing sequence tends to a limit, i.e. if

    s_{n+1} > s_n ,   \text{and}   s_n < \mathcal{A} \in \mathbb{R}   \text{for all } n,   \text{then}   s = \lim_{n\to\infty} s_n   \text{exists.}    (5.5)

Remark. You really ought to have a proof of this property, but I do not have time.^21

Sequences tending to infinity. A sequence, $s_n$, is said to tend to infinity if, given any $A$ (however large), there exists $N \equiv N(A)$ such that

    s_n > A   \text{for all}   n > N .    (5.6a)

We then write

    s_n \to \infty   \text{as}   n \to \infty .    (5.6b)

Similarly we say that $s_n \to -\infty$ as $n \to \infty$ if, given any $A$ (however large), there exists $N \equiv N(A)$ such that

    s_n < -A   \text{for all}   n > N .    (5.6c)

Oscillating sequences. If a sequence does not tend to a limit or to $\pm\infty$, then $s_n$ is said to oscillate. If $s_n$ oscillates and is bounded, it oscillates finitely, otherwise it oscillates infinitely.

^21 Alternatively you can view this property as an axiom that specifies the real numbers $\mathbb{R}$ essentially uniquely.
5.2 Convergence of Infinite Series

5.2.1 Convergent Series

Given an infinite sequence of numbers $u_1, u_2, \dots$, define the partial sum $s_n$ by

    s_n = \sum_{r=1}^{n} u_r .    (5.7)

If, as $n \to \infty$, $s_n$ tends to a finite limit, $s$, then we say that the infinite series

    \sum_{r=1}^{\infty} u_r    (5.8)

converges (or is convergent), and that $s$ is its sum.

Example: the convergence of a geometric series. The series

    \sum_{r=0}^{\infty} x^r = 1 + x + x^2 + x^3 + \dots    (5.9)

converges to $(1 - x)^{-1}$ provided that $|x| < 1$.

Answer. Consider the partial sum

    s_n = 1 + x + \dots + x^{n-1} = \frac{1 - x^n}{1 - x} .    (5.10)

If $|x| < 1$, then from (5.4b) we have that $x^n \to 0$ as $n \to \infty$, and hence

    s = \lim_{n\to\infty} s_n = \frac{1}{1 - x}   \text{for}   |x| < 1.    (5.11)

However if $|x| \geq 1$ the series diverges.
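The partial sums (5.10) can be checked directly; the short sketch below (an illustration, not from the notes) compares $s_n$ with the claimed limit (5.11) for a sample value of $x$.

```python
# Partial sums of the geometric series (5.10) approaching 1/(1-x), cf. (5.11).
x = 0.5                         # example value with |x| < 1
s, limit = 0.0, 1.0 / (1.0 - x)
for n in range(1, 51):
    s += x ** (n - 1)           # s_n = 1 + x + ... + x^(n-1)
    if n in (5, 10, 50):
        print(n, s, abs(s - limit))
```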
5.2.2 A Necessary Condition for Convergence

A necessary condition for $s$ to converge is that $u_r \to 0$ as $r \to \infty$.

Proof. Using the fact that $u_r = s_r - s_{r-1}$ we have that

    \lim_{r\to\infty} u_r = \lim_{r\to\infty} (s_r - s_{r-1}) = \lim_{r\to\infty} s_r - \lim_{r\to\infty} s_{r-1} = s - s = 0 .    (5.12)

However, as we are about to see with the example $u_r = \frac{1}{r}$ (see (5.13) and (5.14)), $u_r \to 0$ as $r \to \infty$ is not a sufficient condition for convergence.

5.2.3 Divergent Series

An infinite series which is not convergent is called divergent.

Example. Suppose that

    u_r = \frac{1}{r}   \text{so that}   s_n = \sum_{r=1}^{n} \frac{1}{r} = 1 + \frac{1}{2} + \frac{1}{3} + \dots + \frac{1}{n} .    (5.13)
Consider $s_{2^m}$ where $m$ is an integer. First we note that

    m = 1 :   s_2 = 1 + \tfrac{1}{2} ,
    m = 2 :   s_4 = s_2 + \tfrac{1}{3} + \tfrac{1}{4} > s_2 + \tfrac{1}{4} + \tfrac{1}{4} = 1 + \tfrac{1}{2} + \tfrac{1}{2} ,
    m = 3 :   s_8 = s_4 + \tfrac{1}{5} + \tfrac{1}{6} + \tfrac{1}{7} + \tfrac{1}{8} > s_4 + \tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8} + \tfrac{1}{8} > 1 + \tfrac{1}{2} + \tfrac{1}{2} + \tfrac{1}{2} .

Similarly we can show (e.g. by induction) that

    s_{2^m} > 1 + \frac{m}{2} ,    (5.14)

and hence the series is divergent.
5.2.4 Absolutely Convergent Series

A series $\sum u_r$ is said to converge absolutely if

    \sum_{r=1}^{\infty} |u_r|    (5.15)

converges, otherwise any convergence of the series is said to be conditional.

Example. Suppose that

    u_r = (-1)^{r-1}\,\frac{1}{r}   \text{so that}   s_n = \sum_{r=1}^{n} (-1)^{r-1}\,\frac{1}{r} = 1 - \frac{1}{2} + \frac{1}{3} - \dots + (-1)^{n-1}\frac{1}{n} .    (5.16)

Then, from the Taylor expansion

    \log(1 + x) = -\sum_{r=1}^{\infty} \frac{(-x)^r}{r} ,    (5.17)

we spot that $s = \lim_{n\to\infty} s_n = \log 2$; hence $\sum_{r=1}^{\infty} u_r$ converges. However, from (5.13) and (5.14) we already know that $\sum_{r=1}^{\infty} |u_r|$ diverges. Hence $\sum_{r=1}^{\infty} u_r$ is conditionally convergent.

Property. If $\sum |u_r|$ converges then so does $\sum u_r$ (see the Example Sheet for a proof).
5.3 Tests of Convergence

5.3.1 The Comparison Test

If we are given that $v_r > 0$ and

    S = \sum_{r=1}^{\infty} v_r    (5.18)

is convergent, then the infinite series $\sum_{r=1}^{\infty} u_r$ is also convergent if $0 < u_r < \mathcal{K} v_r$ for some $\mathcal{K}$ independent of $r$.

Proof. Since $u_r > 0$, $s_n = \sum_{r=1}^{n} u_r$ is an increasing sequence. Further

    s_n = \sum_{r=1}^{n} u_r < \mathcal{K} \sum_{r=1}^{n} v_r ,    (5.19)

and thus

    \lim_{n\to\infty} s_n < \mathcal{K} \sum_{r=1}^{\infty} v_r = \mathcal{K} S ,    (5.20)

i.e. $s_n$ is an increasing bounded sequence. Thence from (5.5) $\sum_{r=1}^{\infty} u_r$ is convergent.

Remark. Similarly, if $\sum_{r=1}^{\infty} v_r$ diverges, $v_r > 0$ and $u_r > \mathcal{K} v_r$ for some $\mathcal{K}$ independent of $r$, then $\sum_{r=1}^{\infty} u_r$ diverges.
5.3.2 D'Alembert's Ratio Test

Suppose that $u_r > 0$ and that

    \lim_{r\to\infty} \left( \frac{u_{r+1}}{u_r} \right) = \varrho .    (5.21)

Then $\sum u_r$ converges if $\varrho < 1$, while $\sum u_r$ diverges if $\varrho > 1$.

Proof. First suppose that $\varrho < 1$. Choose $\sigma$ with $\varrho < \sigma < 1$. Then there exists $N \equiv N(\sigma)$ such that

    \frac{u_{r+1}}{u_r} < \sigma   \text{for all}   r > N .    (5.22)

So

    \sum_{r=1}^{\infty} u_r = \sum_{r=1}^{N} u_r + u_{N+1}\left( 1 + \frac{u_{N+2}}{u_{N+1}} + \frac{u_{N+2}}{u_{N+1}}\frac{u_{N+3}}{u_{N+2}} + \dots \right)
        < \sum_{r=1}^{N} u_r + u_{N+1}\left( 1 + \sigma + \sigma^2 + \dots \right)   \text{by hypothesis}
        < \sum_{r=1}^{N} u_r + \frac{u_{N+1}}{1 - \sigma}   \text{by (5.11), since } \sigma < 1.    (5.23)

We conclude that $\sum_{r=1}^{\infty} u_r$ is bounded. Thence, since $s_n = \sum_{r=1}^{n} u_r$ is an increasing sequence, it follows from (5.5) that $\sum u_r$ converges.

Next suppose that $\varrho > 1$. Choose $\sigma$ with $1 < \sigma < \varrho$. Then there exists $M \equiv M(\sigma)$ such that

    \frac{u_{r+1}}{u_r} > \sigma > 1   \text{for all}   r > M ,    (5.24a)

and hence

    \frac{u_r}{u_M} > \sigma^{r-M} > 1   \text{for all}   r > M .    (5.24b)

Thus, since $u_r \not\to 0$ as $r \to \infty$, we conclude that $\sum u_r$ diverges.
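As a rough numerical illustration of (5.21) (not from the notes), the ratio $u_{r+1}/u_r$ can be computed for increasing $r$ for a series with a known limit; here $u_r = r^2/2^r$, for which the limit is $1/2$ and the series converges.

```python
# Ratio test (5.21) illustrated on u_r = r^2 / 2^r, for which the limit is 1/2.
def u(r):
    return r**2 / 2.0**r

for r in (10, 100, 1000):
    print(r, u(r + 1) / u(r))   # tends to 0.5 < 1, so the series converges
```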
5.3.3 Cauchy's Test

Suppose that $u_r > 0$ and that

    \lim_{r\to\infty} u_r^{1/r} = \varrho .    (5.25)

Then $\sum u_r$ converges if $\varrho < 1$, while $\sum u_r$ diverges if $\varrho > 1$.

Proof. First suppose that $\varrho < 1$. Choose $\sigma$ with $\varrho < \sigma < 1$. Then there exists $N \equiv N(\sigma)$ such that

    u_r^{1/r} < \sigma ,   \text{i.e.}   u_r < \sigma^r ,   \text{for all}   r > N .    (5.26)

It follows that

    \sum_{r=1}^{\infty} u_r < \sum_{r=1}^{N} u_r + \sum_{r=N+1}^{\infty} \sigma^r .    (5.27)

We conclude that $\sum_{r=1}^{\infty} u_r$ is bounded (since $\sigma < 1$). Moreover $s_n = \sum_{r=1}^{n} u_r$ is an increasing sequence, and hence from (5.5) we also conclude that $\sum u_r$ converges.

Next suppose that $\varrho > 1$. Choose $\sigma$ with $1 < \sigma < \varrho$. Then there exists $M \equiv M(\sigma)$ such that

    u_r^{1/r} > \sigma > 1 ,   \text{i.e.}   u_r > \sigma^r > 1 ,   \text{for all}   r > M .    (5.28)

Thus, since $u_r \not\to 0$ as $r \to \infty$, $\sum u_r$ must diverge.
5.4 Power Series of a Complex Variable

A power series of a complex variable, $z$, has the form

    f(z) = \sum_{r=0}^{\infty} a_r z^r   \text{where}   a_r \in \mathbb{C}.    (5.29)

Remark. Many of the above results for real series can be generalised for complex series. For instance, if the sum of the absolute values of a complex series converges (i.e. if $\sum |u_r|$ converges), then so does the series (i.e. $\sum u_r$). Hence if $\sum |a_r z^r|$ converges, so does $\sum a_r z^r$.
5.4.1 Convergence of Power Series

If the sum $\sum_{r=0}^{\infty} a_r z^r$ converges for $z = z_1$, then it converges absolutely for all $z$ such that $|z| < |z_1|$.

Proof. First we note that

    |a_r z^r| = |a_r z_1^r| \left| \frac{z}{z_1} \right|^r .    (5.30)

Also, since $\sum a_r z_1^r$ converges, then from §5.2.2, $a_r z_1^r \to 0$ as $r \to \infty$. Hence for a given $\epsilon$ there exists $N \equiv N(\epsilon)$ such that if $r > N$ then $|a_r z_1^r| < \epsilon$ and

    |a_r z^r| < \epsilon \left| \frac{z}{z_1} \right|^r   \text{if}   |z| < |z_1|.    (5.31)

Thus $\sum a_r z^r$ converges for $|z| < |z_1|$ by comparison with a geometric series.

Corollary. If the sum diverges for $z = z_1$ then it diverges for all $z$ such that $|z| > |z_1|$. For suppose that it were to converge for some such $z = z_2$ with $|z_2| > |z_1|$; then it would converge for $z = z_1$ by the above result, in contradiction to the hypothesis.
5.4.2 Radius of Convergence

The results of §5.4.1 imply that there exists some circle in the complex $z$-plane of radius $\rho$ (possibly $0$ or $\infty$) such that

    \sum a_r z^r   \text{converges for}   |z| < \rho ,   \text{and diverges for}   |z| > \rho ;   |z| = \rho   \text{is the circle of convergence.}    (5.32)

The real number $\rho$ is called the radius of convergence. On $|z| = \rho$ the sum may or may not converge.
5.4.3 Determination of the Radius of Convergence

Let

    f(z) = \sum_{r=0}^{\infty} u_r   \text{where}   u_r = a_r z^r .    (5.33)

Use D'Alembert's ratio test. If the limit exists, then

    \lim_{r\to\infty} \left| \frac{a_{r+1}}{a_r} \right| = \frac{1}{\rho} .    (5.34)

Proof. We have that

    \lim_{r\to\infty} \left| \frac{u_{r+1}}{u_r} \right| = \lim_{r\to\infty} \left| \frac{a_{r+1}}{a_r} \right| |z| = \frac{|z|}{\rho}   \text{by hypothesis.}
Hence the series converges absolutely by D'Alembert's ratio test if $|z| < \rho$. On the other hand if $|z| > \rho$, then

    \lim_{r\to\infty} \left| \frac{u_{r+1}}{u_r} \right| = \frac{|z|}{\rho} > 1 .    (5.35)

Hence $u_r \not\to 0$ as $r \to \infty$, and so the series does not converge. It follows that $\rho$ is the radius of convergence.

Remark. The limit (5.34) may not exist, e.g. if $a_r = 0$ for $r$ odd then $\left|\frac{a_{r+1}}{a_r}\right|$ is alternately $0$ or $\infty$.

Use Cauchy's test. If the limit exists, then

    \lim_{r\to\infty} |a_r|^{1/r} = \frac{1}{\rho} .    (5.36)

Proof. We have that

    \lim_{r\to\infty} |u_r|^{1/r} = \lim_{r\to\infty} |a_r|^{1/r} |z| = \frac{|z|}{\rho}   \text{by hypothesis.}    (5.37)

Hence the series converges absolutely by Cauchy's test if $|z| < \rho$.

On the other hand if $|z| > \rho$, choose $\sigma$ with $1 < \sigma < |z|/\rho$. Then there exists $M \equiv M(\sigma)$ such that

    |u_r|^{1/r} > \sigma > 1 ,   \text{i.e.}   |u_r| > \sigma^r > 1 ,   \text{for all}   r > M .

Thus, since $u_r \not\to 0$ as $r \to \infty$, $\sum u_r$ must diverge. It follows that $\rho$ is the radius of convergence.
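Both (5.34) and (5.36) are easy to evaluate numerically when the coefficients are known. The sketch below (an illustration, not from the notes) estimates the radius of convergence from a coefficient ratio for the geometric series ($a_r = 1$, for which $\rho = 1$) and for $a_r = 1/r!$ (for which $\rho = \infty$, so the estimate grows without bound).

```python
from math import factorial

def radius_by_ratio(a, r=50):
    """Estimate rho from (5.34): rho ~ |a_r / a_{r+1}| for large r."""
    return abs(a(r) / a(r + 1))

print(radius_by_ratio(lambda r: 1.0))                 # geometric series: ~1
print(radius_by_ratio(lambda r: 1.0 / factorial(r)))  # exp series: estimate = r + 1, i.e. unbounded
```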
5.4.4 Examples

1. Suppose that $f(z)$ is the geometric series

    f(z) = \sum_{r=0}^{\infty} z^r .

Then $a_r = 1$ for all $r$, and hence

    \left| \frac{a_{r+1}}{a_r} \right| = 1   \text{and}   |a_r|^{1/r} = 1   \text{for all } r.    (5.38)

Hence $\rho = 1$ by either D'Alembert's ratio test or Cauchy's test, and the series converges for $|z| < 1$. In fact

    f(z) = \frac{1}{1 - z} .

Note the singularity at $z = 1$, which determines the radius of convergence.

2. Suppose that $f(z) = \sum_{r=1}^{\infty} \frac{(-1)^{r-1} z^r}{r}$. Then $a_r = \frac{(-1)^{r-1}}{r}$, and hence

    \left| \frac{a_{r+1}}{a_r} \right| = \frac{r}{r+1} \to 1   \text{as}   r \to \infty.    (5.39)

Hence $\rho = 1$ by D'Alembert's ratio test. As a check we observe that

    |a_r|^{1/r} = \left( \frac{1}{r} \right)^{1/r} ,   \text{and}   \log |a_r|^{1/r} = \frac{1}{r} \log \frac{1}{r} \to 0   \text{as}   r \to \infty.

Thus

    |a_r|^{1/r} \to 1   \text{as}   r \to \infty,    (5.40)

and we confirm by Cauchy's test that $\rho = 1$. In fact the series converges to $\log(1 + z)$ for $|z| < 1$; the singularity at $z = -1$ fixes the radius of convergence.
3. Next suppose $f(z) = \sum_{r=0}^{\infty} \frac{z^r}{r!}$. Then $a_r = \frac{1}{r!}$ and

    \left| \frac{a_{r+1}}{a_r} \right| = \frac{1}{r+1} \to 0   \text{as}   r \to \infty.    (5.41)

Hence $\rho = \infty$ by D'Alembert's ratio test. As a check we observe that

    |a_r|^{1/r} = \left( \frac{1}{r!} \right)^{1/r} ,   \text{and}   \log |a_r|^{1/r} = -\frac{1}{r} \log r! .

But Stirling's formula gives that

    \log r! \sim r \log r + \tfrac{1}{2} \log r - r + \dots   \text{as}   r \to \infty,

and so

    \log |a_r|^{1/r} \sim -\log r   \text{as}   r \to \infty.

Thus

    |a_r|^{1/r} \to 0   \text{as}   r \to \infty,    (5.42)

and we confirm by Cauchy's test that $\rho = \infty$. In fact the series converges to $e^z$ for all finite $z$.

4. Finally suppose $f(z) = \sum_{r=0}^{\infty} r!\,z^r$. Then $a_r = r!$ and

    \left| \frac{a_{r+1}}{a_r} \right| = r + 1 \to \infty   \text{as}   r \to \infty.    (5.43)

Hence $\rho = 0$ by D'Alembert's ratio test. As a check we observe, using Stirling's formula, that

    |a_r|^{1/r} = (r!)^{1/r} ,   \text{and}   \log |a_r|^{1/r} = \frac{1}{r} \log r! \sim \log r   \text{as}   r \to \infty,

and so

    |a_r|^{1/r} \to \infty   \text{as}   r \to \infty.    (5.44)

Thus we confirm by Cauchy's test that this series has zero radius of convergence; it fails to define the function $f(z)$ for any non-zero $z$.
5.4.5 Analytic Functions

A function $f(z)$ is said to be analytic at $z = z_0$ if it has a Taylor series expansion about $z = z_0$ with a non-zero radius of convergence, i.e. $f(z)$ is analytic at $z = z_0$ if for some $\rho > 0$

    f(z) = \sum_{r=0}^{\infty} a_r (z - z_0)^r   \text{for}   |z - z_0| < \rho.    (5.45a)

The coefficients of the Taylor series can be evaluated by differentiating (5.45a) $n$ times and then evaluating the result at $z = z_0$, whence

    a_n = \frac{1}{n!} \frac{d^n f}{dz^n}(z_0) .    (5.45b)

5.4.6 The O Notation

Suppose that $f(z)$ and $g(z)$ are functions of $z$. Then

    if $f(z)/g(z)$ is bounded as $z \to 0$ we say that $f(z) = O(g(z))$;
    if $f(z)/g(z) \to 0$ as $z \to 0$ we say that $f(z) = o(g(z))$.
Example. As $x \to 0$ we have that

    \sin x = O(1)   since   \sin x / 1   is bounded as   x \to 0;
    \sin x = o(1)   since   \sin x / 1 \to 0   as   x \to 0;
    \sin x = O(x)   since   \sin x / x   is bounded as   x \to 0.

Remark. The O notation is often used in conjunction with truncated Taylor series, e.g. for small $(z - z_0)$

    f(z) = f(z_0) + (z - z_0) f'(z_0) + \tfrac{1}{2}(z - z_0)^2 f''(z_0) + O\bigl((z - z_0)^3\bigr) .    (5.46)
5.5 Integration

You have already encountered integration as both

    the inverse of differentiation, and
    some form of summation.

The aim of this part of the course is to emphasize that these two definitions are equivalent for continuous functions.
5.5.1 Why Do We Have To Do This Again?

You already know the definition

    \int_a^b f(t)\,dt = \lim_{N\to\infty} \sum_{j=1}^{N} f(a + jh)\,h   \text{where}   h = (b - a)/N ,    (5.47)

so why are mathematicians not really content with it?

One answer is that while (5.47) is OK for OK functions, consider Dirichlet's function

    f = \begin{cases} 0 & \text{on irrationals,} \\ 1 & \text{on rationals.} \end{cases}    (5.48)

If

    $a = 0$ and $b = \pi$, then (5.47) evaluates to 0;
    $a = 0$ and $b = p/q$, where $p/q$ is a rational approximation to $\pi$ (e.g. 22/7 or better), then (5.47) evaluates to $p/q$.

Since we can choose $p/q$ to be arbitrarily close to $\pi$ we would appear to have a problem.^22 We conclude that we need a better definition of an integral. In particular

    we need a better way of dividing up $[a, b]$;
    we need to be more precise about the limit as the subdivisions tend to zero.

^22 In fact Dirichlet's function is not Riemann integrable, so this example is a bit of a cheat.
5.5.2 The Riemann Integral

Dissection. A dissection, partition or subdivision $\mathcal{D}$ of the interval $[a, b]$ is a finite set of points $t_0, \dots, t_N$ such that

    a = t_0 < t_1 < \dots < t_N = b .

Modulus. Define the modulus, gauge or norm of a dissection $\mathcal{D}$, written $|\mathcal{D}|$, to be the length of the longest subinterval $(t_j - t_{j-1})$ of $\mathcal{D}$, i.e.

    |\mathcal{D}| = \max_{1 \leq j \leq N} |t_j - t_{j-1}| .    (5.49a)

Riemann Sum. A Riemann sum, $\Sigma(\mathcal{D}, \xi)$, for a bounded function $f(t)$ is any sum

    \Sigma(\mathcal{D}, \xi) = \sum_{j=1}^{N} f(\xi_j)\,(t_j - t_{j-1})   \text{where}   \xi_j \in [t_{j-1}, t_j] .    (5.49b)

Note that if the subintervals all have the same length, so that $t_j = a + jh$ and $t_j - t_{j-1} = h$ where $h = (t_N - t_0)/N$, and if we take $\xi_j = t_j$, then (cf. (5.47))

    \Sigma(\mathcal{D}, \xi) = \sum_{j=1}^{N} f(a + jh)\,h .    (5.49c)

Integrability. A bounded function $f(t)$ is integrable if there exists $I \in \mathbb{R}$ such that

    \lim_{|\mathcal{D}| \to 0} \Sigma(\mathcal{D}, \xi) = I ,    (5.49d)

where the limit of the Riemann sum must exist independent of the dissection (subject to the condition that $|\mathcal{D}| \to 0$) and independent of the choice of $\xi$ for a given dissection $\mathcal{D}$.^23

Definite Integral. For an integrable function $f$ the Riemann definite integral of $f$ over the interval $[a, b]$ is defined to be the limiting value of the Riemann sum, i.e.

    \int_a^b f(t)\,dt = I .    (5.49e)

Remark. An integral should be thought of as the limiting value of a sum, not as the area under a curve. Of course this is not to say that integrals are not a handy way of calculating the areas under curves.

Example. Suppose that $f(t) = c$, where $c$ is a real constant. Then from (5.49b)

    \Sigma(\mathcal{D}, \xi) = \sum_{j=1}^{N} c\,(t_j - t_{j-1}) = c\,(b - a) ,    (5.50a)

whatever the choice of $\mathcal{D}$ and $\xi$. Hence the required limit in (5.49d) exists. We conclude that $f(t) = c$ is integrable, and that

    \int_a^b c\,dt = c\,(b - a) .    (5.50b)

^23 More precisely, $f$ is integrable if, given $\epsilon > 0$, there exist $I \in \mathbb{R}$ and $\delta > 0$ such that, whatever the choice of $\xi$ for a given dissection $\mathcal{D}$,

    |\Sigma(\mathcal{D}, \xi) - I| < \epsilon   \text{when}   |\mathcal{D}| < \delta .

Note that this is far more restrictive than saying that the sum (5.49c) converges as $h \to 0$.
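The definition (5.49b)-(5.49d) is straightforward to experiment with numerically. The short sketch below (an illustration, not from the notes) evaluates Riemann sums of f(t) = t² on [0, 1] for random dissections and random sample points ξ_j, and watches them approach 1/3 as |D| → 0.

```python
import numpy as np

rng = np.random.default_rng(0)
f = lambda t: t**2            # example integrand; its exact integral on [0,1] is 1/3

def riemann_sum(n):
    """A Riemann sum (5.49b) for a random dissection of [0,1] with n subintervals."""
    t = np.sort(np.concatenate(([0.0, 1.0], rng.uniform(0, 1, n - 1))))
    xi = rng.uniform(t[:-1], t[1:])          # xi_j in [t_{j-1}, t_j]
    return np.sum(f(xi) * np.diff(t)), np.max(np.diff(t))

for n in (10, 100, 1000):
    s, mod = riemann_sum(n)
    print(n, mod, s)          # as the modulus |D| shrinks, the sums approach 1/3
```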
Remark. Proving integrability using (5.49d) is in general non-trivial (since there are a rather large number of dissections and sample points to consider).^24 However, if we know that a function is integrable then the limit (5.49d) needs only to be evaluated once, i.e. for a limiting dissection and sample points of our choice (usually those that make the calculation easiest).

5.5.3 Properties of the Riemann Integral

Using (5.49d) and (5.49e) it is possible to show, for integrable functions $f$ and $g$, $a < c < b$, and $k \in \mathbb{R}$, that

    \int_a^b f(t)\,dt = -\int_b^a f(t)\,dt ,    (5.51a)

    \int_a^b f(t)\,dt = \int_a^c f(t)\,dt + \int_c^b f(t)\,dt ,    (5.51b)

    \int_a^b k f(t)\,dt = k \int_a^b f(t)\,dt ,    (5.51c)

    \int_a^b \bigl(f(t) + g(t)\bigr)\,dt = \int_a^b f(t)\,dt + \int_a^b g(t)\,dt ,    (5.51d)

    \left| \int_a^b f(t)\,dt \right| \leq \int_a^b |f(t)|\,dt .    (5.51e)

It is also possible to deduce that if $f$ and $g$ are integrable then so is $fg$.
Schwarz's Inequality. For integrable functions $f$ and $g$ (cf. (4.31))

    \left( \int_a^b f g\,dt \right)^2 \leq \left( \int_a^b f^2\,dt \right) \left( \int_a^b g^2\,dt \right) .    (5.52)

Proof. Using the above properties it follows that, for real $\lambda$,

    0 \leq \int_a^b (\lambda f + g)^2\,dt = \lambda^2 \int_a^b f^2\,dt + 2\lambda \int_a^b f g\,dt + \int_a^b g^2\,dt .    (5.53)

If $\int_a^b f^2\,dt = 0$ then

    2\lambda \int_a^b f g\,dt + \int_a^b g^2\,dt \geq 0 .

This can only be true for all $\lambda$ if $\int_a^b f g\,dt = 0$; the [in]equality follows. If $\int_a^b f^2\,dt \neq 0$ then choose (cf. the proof of (4.31))

    \lambda = -\frac{\int_a^b f g\,dt}{\int_a^b f^2\,dt} ,    (5.54)

and the inequality again follows.

Remark. This will not be the last time that we will find an analogy between scalar/inner products and integrals.

^24 As you might guess, there is a better way to do it.
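A quick numerical sanity check of (5.52) for one particular pair of functions on [0, 1] (an illustration, not from the notes; the functions are arbitrary example choices):

```python
from scipy.integrate import quad

# Example functions on [a, b] = [0, 1] (arbitrary choices for illustration).
f = lambda t: t
g = lambda t: 1.0 / (1.0 + t)

lhs = quad(lambda t: f(t) * g(t), 0, 1)[0] ** 2
rhs = quad(lambda t: f(t)**2, 0, 1)[0] * quad(lambda t: g(t)**2, 0, 1)[0]
print(lhs, rhs, lhs <= rhs)    # Schwarz's inequality (5.52) holds
```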
5.5.4 The Fundamental Theorems of Calculus

Suppose $f$ is integrable. Define

    F(x) = \int_a^x f(t)\,dt .    (5.55)

F is continuous. $F$ is a continuous function of $x$ since

    |F(x+h) - F(x)| = \left| \int_x^{x+h} f(t)\,dt \right| \leq \int_x^{x+h} |f(t)|\,dt \leq \left( \max_{x \leq t \leq x+h} |f(t)| \right) h ,

and hence

    \lim_{h\to 0} |F(x+h) - F(x)| = 0 .

The First Fundamental Theorem of Calculus. This states that

    \frac{dF}{dx} = \frac{d}{dx}\left( \int_a^x f(t)\,dt \right) = f(x) ,    (5.56)

i.e. the derivative of the integral of a function is the function.

Proof. Suppose that

    m = \min_{x \leq t \leq x+h} f(t)   \text{and}   M = \max_{x \leq t \leq x+h} f(t) .

We can show from the definition of a Riemann integral that for $h > 0$

    m h \leq \int_x^{x+h} f(t)\,dt \leq M h ,

so

    m \leq \frac{F(x+h) - F(x)}{h} \leq M .

But if $f$ is continuous, then as $h \to 0$ both $m$ and $M$ tend to $f(x)$. We can similarly sandwich $(F(x+h) - F(x))/h$ if $h < 0$. (5.56) then follows from the definition of a derivative.

The Second Fundamental Theorem of Calculus. This essentially states that the integral of the derivative of a function is the function, i.e. if $g$ is differentiable then

    \int_a^x \frac{dg}{dt}\,dt = g(x) - g(a) .    (5.57)

Proof. Define $f(x)$ by

    f(x) = \frac{dg}{dx}(x) ,

and then define $F$ as in (5.55). Then using (5.56) we have that

    \frac{d}{dx}(F - g) = 0 .
Hence, integrating and using the fact that $F(a) = 0$ from (5.55),

    F(x) - g(x) = -g(a) .

Thus, using the definition (5.55),

    \int_a^x \frac{dg}{dt}\,dt = g(x) - g(a) .    (5.58)
The Indefinite Integral. Let $f$ be integrable, and suppose $f = F'$ . . .
6 Ordinary Differential Equations

6.0 Why Study This?

Numerous scientific phenomena are described by differential equations. This section is about extending your armoury for solving ordinary differential equations, such as those that arise in quantum mechanics and electrodynamics. In particular we will study a sub-class of ordinary differential equations of Sturm-Liouville type. In addition we will consider eigenvalue problems for Sturm-Liouville operators. In subsequent courses you will learn that such eigenvalues fix, say, the angular momentum of electrons in an atom.
6.1 Second-Order Linear Ordinary Differential Equations

The general second-order linear ordinary differential equation (ODE) for $y(x)$ can, wlog, be written as

    y'' + p(x)\,y' + q(x)\,y = f(x) .    (6.1)

Two solutions of the homogeneous equation

    y'' + p(x)\,y' + q(x)\,y = 0    (6.2)

can be superposed to give a third, i.e. if $y_1$ and $y_2$ are two solutions then for $\alpha, \beta \in \mathbb{R}$ another solution is

    y = \alpha y_1 + \beta y_2 .    (6.3)

Further, suppose that $y_1$ and $y_2$ are two linearly independent solutions, where by linearly independent we mean, as in (4.3), that

    \alpha y_1(x) + \beta y_2(x) \equiv 0  \implies  \alpha = \beta = 0 .    (6.4)

Then, since (6.2) is second order, the general solution of (6.2) will be of the form (6.3); the parameters $\alpha$ and $\beta$ can be viewed as the two integration constants. This means that in order to find the general solution of a second-order linear homogeneous ODE we need to find two linearly independent solutions.

Remark. If $y_1$ and $y_2$ are linearly dependent, then $y_2 = \gamma y_1$ for some $\gamma \in \mathbb{R}$, in which case (6.3) becomes

    y = (\alpha + \beta\gamma)\,y_1 ,    (6.5)

and we have, in effect, a solution with only one integration constant $\delta = (\alpha + \beta\gamma)$.
6.2.1 The Wronskian

If $y_1$ and $y_2$ are linearly dependent, then so are $y_1'$ and $y_2'$ (since if $y_2 = \gamma y_1$ then, on differentiating, $y_2' = \gamma y_1'$). Hence $y_1$ and $y_2$ are linearly dependent only if the equation

    \begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0    (6.6)

has a non-zero solution. Conversely, if this equation has a non-zero solution then $y_1$ and $y_2$ are linearly dependent. It follows that non-zero functions $y_1$ and $y_2$ are linearly independent if and only if

    \begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix} \begin{pmatrix} \alpha \\ \beta \end{pmatrix} = 0  \implies  \alpha = \beta = 0 .    (6.7)
Since $A x = 0$ has only the zero solution if and only if $\det A \neq 0$, we conclude that $y_1$ and $y_2$ are linearly independent if and only if

    \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} = y_1 y_2' - y_2 y_1' \neq 0 .    (6.8)

The function

    W(x) = y_1 y_2' - y_2 y_1'    (6.9)

is called the Wronskian of the two solutions. To recap: if $W$ is non-zero then $y_1$ and $y_2$ are linearly independent.
6.2.2 The Calculation of W

We can derive a differential equation for the Wronskian, since

    W' = y_1 y_2'' - y_1'' y_2      from (6.9), since the $y_1' y_2'$ terms cancel
       = -y_1 (p y_2' + q y_2) + (p y_1' + q y_1) y_2      using equation (6.2)
       = -p (y_1 y_2' - y_1' y_2)      since the $q y_1 y_2$ terms cancel
       = -p W      from definition (6.9).    (6.10)

This is a first-order equation for $W$, viz.

    W' + p(x)\,W = 0 ,    (6.11)

with solution

    W(x) = \kappa \exp\left( -\int^x p(\eta)\,d\eta \right) ,    (6.12)

where $\kappa$ is a constant (a change in the lower limit of integration can be absorbed by a rescaling of $\kappa$).

Remark. If for one value of $x$ we have $W \neq 0$, then $W$ is non-zero for all values of $x$ (since $\exp x > 0$ for all $x$). Hence if $y_1$ and $y_2$ are linearly independent for one value of $x$, they are linearly independent for all values of $x$. In the case that $y_1$ and $y_2$ are known implicitly, e.g. in terms of series or integrals, this is a welcome result, since it means that we just have to find one value of $x$ where it is relatively easy to evaluate $W$ in order to confirm (or otherwise) linear independence.
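The formula (6.12) is easy to verify symbolically for a concrete equation. The sketch below (an illustration, not from the notes) checks it for $y'' + y = 0$, where $p = 0$, $y_1 = \cos x$, $y_2 = \sin x$, so $W$ should be a constant.

```python
import sympy as sp

x = sp.symbols('x')
# Example: y'' + y = 0, i.e. p(x) = 0, with solutions y1 = cos x, y2 = sin x.
y1, y2 = sp.cos(x), sp.sin(x)

W = sp.simplify(y1 * sp.diff(y2, x) - y2 * sp.diff(y1, x))   # Wronskian (6.9)
print(W)                        # 1, i.e. a constant, consistent with (6.12) for p = 0
print(sp.simplify(sp.diff(W, x)))   # W' = 0, i.e. (6.11) with p = 0
```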
6.2.3 A Second Solution via the Wronskian

Suppose that we already have one solution, say $y_1$, of the homogeneous equation. Then we can calculate a second linearly independent solution using the Wronskian as follows.

First, from the definition of the Wronskian (6.9),

    y_1 y_2' - y_1' y_2 = W(x) .    (6.13)

Hence, dividing by $y_1^2$,

    \left( \frac{y_2}{y_1} \right)' = \frac{y_1 y_2' - y_1' y_2}{y_1^2} = \frac{W}{y_1^2} .

Now integrate both sides and use (6.12) to obtain

    y_2(x) = y_1(x) \int^x \frac{W(\eta)}{y_1^2(\eta)}\,d\eta = \kappa\,y_1(x) \int^x \frac{1}{y_1^2(\eta)} \exp\left( -\int^\eta p(\zeta)\,d\zeta \right) d\eta .    (6.14)

In principle this allows us to compute $y_2$ given $y_1$.
Example. Given that $y_1(x)$ is a solution of Bessel's equation of zeroth order,

    y'' + \frac{1}{x}\,y' + y = 0 ,    (6.15)

find another independent solution in terms of $y_1$ for $x > 0$.

Answer. In this case $p(x) = 1/x$, and hence

    y_2(x) = y_1(x) \int^x \frac{1}{y_1^2(\eta)} \exp\left( -\int^\eta \frac{1}{\zeta}\,d\zeta \right) d\eta = y_1(x) \int^x \frac{1}{\eta\,y_1^2(\eta)}\,d\eta .    (6.16)
6.3 Taylor Series Solutions

It is useful now to generalize to complex functions $y(z)$ of a complex variable $z$. The homogeneous ODE (6.2) then becomes

    y'' + p(z)\,y' + q(z)\,y = 0 .    (6.17)

If $p$ and $q$ are analytic at $z = z_0$ (i.e. they have power series expansions (5.45a) about $z = z_0$), then $z = z_0$ is called an ordinary point of the ODE. A point at which $p$ and/or $q$ is singular, i.e. a point at which $p$ and/or $q$ or one of its derivatives is infinite, is called a singular point of the ODE.

6.3.1 The Solution at Ordinary Points in Terms of a Power Series

If $z = z_0$ is an ordinary point of the ODE for $y(z)$, then we claim that $y(z)$ is analytic at $z = z_0$, i.e. there exists $\rho > 0$ for which (see (5.45a))

    y = \sum_{n=0}^{\infty} a_n (z - z_0)^n   \text{when}   |z - z_0| < \rho.    (6.18a)

For simplicity we will assume henceforth, wlog, that $z_0 = 0$ (which corresponds to a shift in the origin of the $z$-plane). Then we seek a solution of the form

    y = \sum_{n=0}^{\infty} a_n z^n .    (6.18b)
Next we substitute (6.18b) into the governing equation (6.17) to obtain

    \sum_{n=2}^{\infty} n(n-1)\,a_n z^{n-2} + \sum_{n=1}^{\infty} n\,a_n\,p(z)\,z^{n-1} + \sum_{n=0}^{\infty} a_n\,q(z)\,z^n = 0 ,

or, after the substitutions $k = n - 2$ and $\ell = n - 1$ in the first and second terms respectively,

    \sum_{k=0}^{\infty} (k+2)(k+1)\,a_{k+2} z^k + \sum_{\ell=0}^{\infty} (\ell+1)\,a_{\ell+1}\,p(z)\,z^\ell + \sum_{n=0}^{\infty} a_n\,q(z)\,z^n = 0 .    (6.19)

At an ordinary point $p(z)$ and $q(z)$ are analytic, so we can write

    p(z) = \sum_{m=0}^{\infty} p_m z^m   \text{and}   q(z) = \sum_{m=0}^{\infty} q_m z^m .    (6.20)

Then, after the substitutions $k \to r$ and $\ell \to n$, (6.19) can be written as

    \sum_{r=0}^{\infty} (r+2)(r+1)\,a_{r+2} z^r + \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} \bigl( (n+1)\,a_{n+1}\,p_m + a_n q_m \bigr) z^{n+m} = 0 .    (6.21)
We now want to rewrite the double sum to collect powers of $z^r$. Hence let $r = n + m$ and then note that (cf. a change of variables in a double integral)

    \sum_{n=0}^{\infty} \sum_{m=0}^{\infty} F(n, m) = \sum_{r=0}^{\infty} \sum_{n=0}^{r} F(n, r-n) ,    (6.22)

since $m = r - n \geq 0$. Hence (6.21) can be rewritten as

    \sum_{r=0}^{\infty} \left[ (r+2)(r+1)\,a_{r+2} + \sum_{n=0}^{r} \bigl( (n+1)\,a_{n+1}\,p_{r-n} + a_n q_{r-n} \bigr) \right] z^r = 0 .    (6.23)

Since this expression is true for all $|z| < \rho$, each coefficient of $z^r$ ($r = 0, 1, \dots$) must be zero. Thus we deduce the recurrence relation

    a_{r+2} = -\frac{1}{(r+2)(r+1)} \sum_{n=0}^{r} \bigl( (n+1)\,a_{n+1}\,p_{r-n} + a_n q_{r-n} \bigr)   \text{for}   r \geq 0 .    (6.24)

Therefore $a_{r+2}$ is determined in terms of $a_0, a_1, \dots, a_{r+1}$. This means that if $a_0$ and $a_1$ are known then so are all the $a_r$; $a_0$ and $a_1$ play the role of the two integration constants in the general solution.

Remark. Proof that the radius of convergence of (6.18b) is non-zero is more difficult, and we will not attempt such a task in general. However, we shall discuss the issue for examples.
6.3.2 Example

Consider

    y'' - \frac{2}{(1-z)^2}\,y = 0 .    (6.25)

$z = 0$ is an ordinary point, so try

    y = \sum_{n=0}^{\infty} a_n z^n .    (6.26)

We note that

    p = 0 ,   q = -\frac{2}{(1-z)^2} = -2 \sum_{m=0}^{\infty} (m+1)\,z^m ,    (6.27)

and hence, in the terminology of the previous subsection, $p_m = 0$ and $q_m = -2(m+1)$. Substituting into (6.24) we obtain the recurrence relation

    a_{r+2} = \frac{2}{(r+2)(r+1)} \sum_{n=0}^{r} a_n (r - n + 1)   \text{for}   r \geq 0 .    (6.28)

However, with a small amount of forethought we can obtain a simpler, if equivalent, recurrence relation. First multiply (6.25) by $(1-z)^2$ to obtain

    (1-z)^2\,y'' - 2y = 0 ,

and then substitute (6.26) into this equation. We find, on expanding $(1-z)^2 = 1 - 2z + z^2$, that

    \sum_{n=2}^{\infty} n(n-1)\,a_n z^{n-2} - 2 \sum_{n=1}^{\infty} n(n-1)\,a_n z^{n-1} + \sum_{n=0}^{\infty} (n^2 - n - 2)\,a_n z^n = 0 ,

and, after the substitutions $k = n-2$ and $\ell = n-1$ in the first and second terms respectively,

    \sum_{k=0}^{\infty} (k+2)(k+1)\,a_{k+2} z^k - 2 \sum_{\ell=0}^{\infty} \ell(\ell+1)\,a_{\ell+1} z^\ell + \sum_{n=0}^{\infty} (n+1)(n-2)\,a_n z^n = 0 .
Then, after the substitutions $k \to n$ and $\ell \to n$, we group powers of $z$ to obtain

    \sum_{n=0}^{\infty} (n+1)\bigl( (n+2)\,a_{n+2} - 2n\,a_{n+1} + (n-2)\,a_n \bigr) z^n = 0 ,

which leads to the recurrence relation

    a_{n+2} = \frac{1}{n+2}\bigl( 2n\,a_{n+1} - (n-2)\,a_n \bigr)   \text{for}   n \geq 0 .    (6.29)

This recurrence relation, which involves only the two preceding coefficients, again determines $a_n$ for $n \geq 2$ in terms of $a_0$ and $a_1$, but is simpler than (6.28).

Exercise for those with time! Show that the recurrence relations (6.28) and (6.29) are equivalent.

Two solutions. For $n = 0$ the recurrence relation (6.29) yields $a_2 = a_0$, while for $n = 1$ and $n = 2$ we obtain

    a_3 = \tfrac{1}{3}(2a_2 + a_1)   \text{and}   a_4 = a_3 .    (6.30)

First we note that if $2a_2 + a_1 = 0$, then $a_3 = a_4 = 0$, and hence $a_n = 0$ for $n \geq 3$. We thus have as our first solution (with $a_0 = \alpha \neq 0$)

    y_1 = \alpha (1 - z)^2 .    (6.31a)

Next we note that $a_n = a_0$ for all $n$ is a solution of (6.29). In this case we can sum the series to obtain (with $a_0 = \beta \neq 0$)

    y_2 = \beta \sum_{n=0}^{\infty} z^n = \frac{\beta}{1 - z} .    (6.31b)

Linear independence. The linear independence of (6.31a) and (6.31b) is clear. However, to be extra sure we calculate the Wronskian:

    W = y_1 y_2' - y_1' y_2 = \alpha\beta\left( \frac{(1-z)^2}{(1-z)^2} + \frac{2(1-z)}{1-z} \right) = 3\alpha\beta \neq 0 .    (6.32)

Hence the general solution is

    y(z) = \alpha (1 - z)^2 + \frac{\beta}{1 - z} ,    (6.33)

for constants $\alpha$ and $\beta$. Observe that the general solution is singular at $z = 1$, which is also a singular point of the equation, since $q(z) = -2(1 - z)^{-2}$ is singular there.
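A direct symbolic check of (6.33) (an illustration, not from the notes):

```python
import sympy as sp

z, alpha, beta = sp.symbols('z alpha beta')
y = alpha * (1 - z)**2 + beta / (1 - z)          # general solution (6.33)

# Substitute into (6.25): y'' - 2 y / (1 - z)^2 should vanish identically.
residual = sp.diff(y, z, 2) - 2 * y / (1 - z)**2
print(sp.simplify(residual))                      # 0
```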
6.3.3 Example: Legendre's Equation

Legendre's equation is

    y'' - \frac{2z}{1 - z^2}\,y' + \frac{\ell(\ell+1)}{1 - z^2}\,y = 0 ,    (6.34)

where $\ell \in \mathbb{R}$. The points $z = \pm 1$ are singular points, but $z = 0$ is an ordinary point, so for smallish $z$ try

    y = \sum_{n=0}^{\infty} a_n z^n .    (6.35)

On substituting this into $(1 - z^2) \times$ (6.34) we obtain

    \sum_{n=2}^{\infty} n(n-1)\,a_n z^{n-2} - \sum_{n=0}^{\infty} n(n-1)\,a_n z^n - 2 \sum_{n=0}^{\infty} n\,a_n z^n + \sum_{n=0}^{\infty} \ell(\ell+1)\,a_n z^n = 0 .

Hence, with $k = n - 2$ in the first sum,

    \sum_{k=0}^{\infty} (k+2)(k+1)\,a_{k+2} z^k - \sum_{n=0}^{\infty} \bigl( n(n+1) - \ell(\ell+1) \bigr) a_n z^n = 0 ,
and thence, after the transformation $k \to n$,

    \sum_{n=0}^{\infty} \Bigl( (n+2)(n+1)\,a_{n+2} - \bigl( n(n+1) - \ell(\ell+1) \bigr) a_n \Bigr) z^n = 0 .

This implies that

    a_{n+2} = \frac{n(n+1) - \ell(\ell+1)}{(n+1)(n+2)}\,a_n   \text{for}   n = 0, 1, 2, \dots .    (6.36)

Two solutions can be constructed by choosing

    $a_0 = 1$ and $a_1 = 0$, so that

        y_1 = 1 - \frac{\ell(\ell+1)}{2}\,z^2 + O(z^4) ;    (6.37a)

    $a_0 = 0$ and $a_1 = 1$, so that

        y_2 = z + \frac{2 - \ell(\ell+1)}{6}\,z^3 + O(z^5) .    (6.37b)

The Wronskian at the ordinary point $z = 0$ is thus given by

    W = y_1 y_2' - y_1' y_2 = 1 \cdot 1 - 0 \cdot 0 = 1 .    (6.38)

Since $W \neq 0$, $y_1$ and $y_2$ are linearly independent.

Radius of convergence. The series (6.37a) and (6.37b) are effectively power series in $z^2$ rather than $z$. Hence to find the radius of convergence we either need to re-express our series (e.g. $\zeta \equiv z^2$ and $a_{2n} \equiv b_n$), or use a slightly modified D'Alembert's ratio test. We adopt the latter approach and observe from (6.36) that

    \lim_{n\to\infty} \left| \frac{a_{n+2} z^{n+2}}{a_n z^n} \right| = \lim_{n\to\infty} |z|^2 = |z|^2 .    (6.39)

It then follows from a straightforward extension of D'Alembert's ratio test (5.21) that the series converges for $|z| < 1$. Moreover, the series diverges for $|z| > 1$ (since $a_n z^n \not\to 0$), and so the radius of convergence $\rho = 1$. On the radius of convergence, determination of whether the series converges is more difficult.

Remark. The radius of convergence is the distance to the nearest singularity of the ODE. This is a general feature.

Legendre polynomials. In generic situations both series (6.37a) and (6.37b) have an infinite number of terms. However, for $\ell = 0, 1, 2, \dots$ it follows from (6.36) that

    a_{\ell+2} = \frac{\ell(\ell+1) - \ell(\ell+1)}{(\ell+1)(\ell+2)}\,a_\ell = 0 ,    (6.40)

and so the series terminates. For instance,

    \ell = 0 :   y = a_0 ,
    \ell = 1 :   y = a_1 z ,
    \ell = 2 :   y = a_0 (1 - 3z^2) .

These functions are proportional to the Legendre polynomials, $P_\ell$, normalised so that $P_\ell(1) = 1$. Thus

    P_0 = 1 , \quad P_1 = z , \quad P_2 = \tfrac{1}{2}(3z^2 - 1) ,   \text{etc.}    (6.41)
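The recurrence (6.36) can be used directly to generate the terminating solutions. The following sketch (an illustration, not part of the notes) builds the polynomial solution for a given integer ℓ and rescales it so that it equals 1 at z = 1, recovering the Legendre polynomials (6.41).

```python
import numpy as np

def legendre_from_recurrence(ell):
    """Polynomial solution of (6.34) via (6.36), normalised so that P_ell(1) = 1."""
    a = np.zeros(ell + 1)
    a[ell % 2] = 1.0                      # start the even or odd series
    for n in range(ell - 1):
        a[n + 2] = (n * (n + 1) - ell * (ell + 1)) / ((n + 1) * (n + 2)) * a[n]
    a /= np.polyval(a[::-1], 1.0)         # normalise: value 1 at z = 1
    return a                              # coefficients a_0, a_1, ..., a_ell

for ell in range(4):
    print(ell, legendre_from_recurrence(ell))
# e.g. ell = 2 gives [-0.5, 0., 1.5], i.e. P_2 = (3 z^2 - 1)/2 as in (6.41)
```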
6.4 Regular Singular Points

Let $z = z_0$ be a singular point of the ODE. As before we can take $z_0 = 0$ wlog (otherwise define $z' = z - z_0$ so that $z' = 0$ is the singular point). The point $z = 0$ is a regular singular point if the ODE can be written in the form

    z^2 y'' + z\,s(z)\,y' + t(z)\,y = 0 ,    (6.44)

where $s(z)$ and $t(z)$ are analytic at $z = 0$.

6.4.1 The Indicial Equation

We claim that there is always at least one solution of (6.44) of the form

    y = z^\sigma \sum_{n=0}^{\infty} a_n z^n   \text{with}   a_0 \neq 0   \text{and}   \sigma \in \mathbb{C}.    (6.45)

To see this, substitute (6.45) into (6.44) to obtain

    \sum_{n=0}^{\infty} \bigl( (\sigma+n)(\sigma+n-1) + (\sigma+n)\,s(z) + t(z) \bigr) a_n z^{\sigma+n} = 0 ,

or, after division by $z^\sigma$,

    \sum_{n=0}^{\infty} \bigl( (\sigma+n)(\sigma+n-1) + (\sigma+n)\,s(z) + t(z) \bigr) a_n z^n = 0 .    (6.46)

We now evaluate this sum at $z = 0$ (when $z^n = 0$ except for $n = 0$) to obtain

    \bigl( \sigma(\sigma-1) + \sigma s_0 + t_0 \bigr) a_0 = 0 ,    (6.47)

where $s_0 = s(0)$ and $t_0 = t(0)$; note that since $s$ and $t$ are analytic at $z = 0$, $s_0$ and $t_0$ are finite. Since by definition $a_0 \neq 0$ (see (6.45)), we obtain the indicial equation for $\sigma$:

    \sigma^2 + (s_0 - 1)\,\sigma + t_0 = 0 .    (6.48)

The roots $\sigma_1$, $\sigma_2$ of this equation are called the indices of the regular singular point.
6.4.2 Series Solutions

For each choice of $\sigma$ from $\sigma_1$ and $\sigma_2$ we can find a recurrence relation for $a_n$ by comparing powers of $z$ in (6.46), i.e. after expanding $s$ and $t$ in power series.

$\sigma_1 - \sigma_2 \notin \mathbb{Z}$. If $\sigma_1 - \sigma_2 \notin \mathbb{Z}$ we can find both linearly independent solutions this way.

$\sigma_1 - \sigma_2 \in \mathbb{Z}$. If $\sigma_1 = \sigma_2$ we note that we can find only one solution by the ansatz (6.45). However, as we shall see, it's worse than this. The ansatz (6.45) also fails (in general) to give both solutions when $\sigma_1$ and $\sigma_2$ differ by an integer (although there are exceptions).
6.4.3 Example: Bessel's Equation of Order ν

Bessel's equation of order $\nu$ is

    y'' + \frac{1}{z}\,y' + \left( 1 - \frac{\nu^2}{z^2} \right) y = 0 ,    (6.49)

where $\nu \geq 0$ wlog. The origin $z = 0$ is a regular singular point with

    s(z) = 1   \text{and}   t(z) = z^2 - \nu^2 .    (6.50)

A power series solution of the form (6.45) solves (6.49) if, from (6.46),

    \sum_{n=0}^{\infty} \bigl( (\sigma+n)(\sigma+n-1) + (\sigma+n) - \nu^2 \bigr) a_n z^n + \sum_{n=0}^{\infty} a_n z^{n+2} = 0 ,    (6.51)

i.e., after the transformation $n \to n-2$ in the second sum, if

    \sum_{n=0}^{\infty} \bigl( (\sigma+n)^2 - \nu^2 \bigr) a_n z^n + \sum_{n=2}^{\infty} a_{n-2} z^n = 0 .    (6.52)

Now compare powers of $z$ to obtain

    n = 0 :   \sigma^2 - \nu^2 = 0 ,    (6.53a)
    n = 1 :   \bigl( (\sigma+1)^2 - \nu^2 \bigr) a_1 = 0 ,    (6.53b)
    n \geq 2 :   \bigl( (\sigma+n)^2 - \nu^2 \bigr) a_n + a_{n-2} = 0 .    (6.53c)

(6.53a) is the indicial equation and implies that

    \sigma = \pm\nu .    (6.54)

Substituting this result into (6.53b) and (6.53c) yields

    (1 \pm 2\nu)\,a_1 = 0 ,    (6.55a)
    n(n \pm 2\nu)\,a_n = -a_{n-2}   \text{for}   n \geq 2 .    (6.55b)

Remark. We note that there is no difficulty in solving for $a_n$ from $a_{n-2}$ using (6.55b) if $\sigma = +\nu$. However, if $\sigma = -\nu$ the recursion will fail, with $a_n$ predicted to be infinite, if at any point $n = 2\nu$. There are hence potential problems if $\sigma_1 - \sigma_2 = 2\nu \in \mathbb{Z}$, i.e. if the indices $\sigma_1$ and $\sigma_2$ differ by an integer.

$2\nu \notin \mathbb{Z}$. First suppose that $2\nu \notin \mathbb{Z}$, so that $\sigma_1$ and $\sigma_2$ do not differ by an integer. In this case (6.55a) and (6.55b) imply

    a_n = \begin{cases} 0 & n = 1, 3, 5, \dots , \\ -\dfrac{a_{n-2}}{n(n \pm 2\nu)} & n = 2, 4, 6, \dots , \end{cases}    (6.56)

and so we get two solutions

    y_\pm = a_0\,z^{\pm\nu} \left( 1 - \frac{1}{4(1 \pm \nu)}\,z^2 + \dots \right) .    (6.57)

$2\nu = 2m+1$, $m \in \mathbb{N}$. It so happens in this case that, even though $\sigma_1$ and $\sigma_2$ differ by an odd integer, there is no problem; the solutions are still given by (6.56) and (6.57). This is because for Bessel's equation the power series proceed in even powers of $z$, and hence the problem recursion when $n = 2\nu = 2m + 1$ is never encountered. We conclude that the condition for the recursion relation (6.55b) to fail is that $\nu$ is an integer.

$\nu = 0$. If $\nu = 0$ then $\sigma_1 = \sigma_2$ and we can only find one power series solution of the form (6.45), viz.

    y = a_0 \left( 1 - \tfrac{1}{4}\,z^2 + \dots \right) .    (6.58)
$\nu = m \in \mathbb{N}$. If $\nu$ is a positive integer then we can find one solution by choosing $\sigma = \nu$. However if we take $\sigma = -\nu$ then $a_{2m}$ is predicted to be infinite, i.e. a second series solution of the form (6.45) fails.

Remarks. The existence of two power series solutions for $2\nu = 2m + 1$, $m \in \mathbb{N}$, is a lucky accident. In general there exists only one solution of the form (6.45) whenever the indices $\sigma_1$ and $\sigma_2$ differ by an integer. We also note that the radius of convergence of the power series solution is infinite, since from (6.56)

    \lim_{n\to\infty} \left| \frac{a_n}{a_{n-2}} \right| = \lim_{n\to\infty} \frac{1}{n(n \pm 2\nu)} = 0 .
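The recurrence (6.56) is easy to iterate numerically; the sketch below (an illustration, not from the notes; the order ν, sample point and truncation are arbitrary choices) builds a truncated series solution with σ = +ν and checks that it satisfies Bessel's equation (6.49) to within finite-difference error at a sample point.

```python
import numpy as np

def bessel_series(nu, z, nmax=40, a0=1.0):
    """Truncated Frobenius solution y = z^nu * sum a_n z^n, with a_n from (6.56)."""
    a = np.zeros(nmax + 1)
    a[0] = a0
    for n in range(2, nmax + 1, 2):
        a[n] = -a[n - 2] / (n * (n + 2 * nu))
    return z**nu * np.polyval(a[::-1], z)

nu, z, h = 0.5, 1.3, 1e-5        # example order, sample point and step
y = lambda t: bessel_series(nu, t)
d2y = (y(z + h) - 2 * y(z) + y(z - h)) / h**2
dy = (y(z + h) - y(z - h)) / (2 * h)
# Residual of (6.49); small, limited only by the finite-difference error.
print(d2y + dy / z + (1 - nu**2 / z**2) * y(z))
```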
6.4.4 The Second Solution when σ₁ − σ₂ ∈ Z

A Preliminary: Bessel's equation with ν = 0. In order to obtain an idea of how to proceed when $\sigma_1 - \sigma_2 \in \mathbb{Z}$, first consider the example of Bessel's equation of zeroth order, i.e. $\nu = 0$. Let $y_1$ denote the solution (6.58). Then, from (6.16) (after the transformation $x \to z$),

    y_2(z) = y_1(z) \int^z \frac{1}{\eta\,y_1^2(\eta)}\,d\eta .    (6.59)

For small (positive) $z$ we can deduce using (6.58) that

    y_2(z) = a_0 \bigl( 1 + O(z^2) \bigr) \int^z \frac{1}{\eta\,a_0^2 \bigl( 1 + O(\eta^2) \bigr)}\,d\eta = \frac{\log z}{a_0} + \dots .    (6.60)

We conclude that the second solution contains a logarithm.

The claim. Let $\sigma_1$, $\sigma_2$ be the two (possibly complex) solutions of the indicial equation for a regular singular point at $z = 0$. Order them so that

    \mathrm{Re}(\sigma_1) \geq \mathrm{Re}(\sigma_2) .    (6.61)

Then we can always find one solution of the form

    y_1(z) = z^{\sigma_1} \sum_{n=0}^{\infty} a_n z^n   \text{with, say, the normalisation}   a_0 = 1 .    (6.62)

If $\sigma_1 - \sigma_2 \in \mathbb{Z}$ we claim that the second solution takes the form

    y_2(z) = z^{\sigma_2} \sum_{n=0}^{\infty} b_n z^n + k\,y_1(z) \log z ,    (6.63)

for some number $k$. The coefficients $b_n$ can be found by substitution into the ODE. In some very special cases $k$ may vanish, but $k \neq 0$ in general.

Example: Bessel's equation of integer order. Suppose that $y_1$ is the series solution with $\sigma = +m$ of

    z^2 y'' + z y' + \bigl( z^2 - m^2 \bigr) y = 0 ,    (6.64)

where, compared with (6.49), we have written $m$ for $\nu$. Hence from (6.45) and (6.56)

    y_1 = z^m \sum_{\ell=0}^{\infty} a_{2\ell}\,z^{2\ell} ,    (6.65)

since $a_{2\ell+1} = 0$ for integer $\ell$. Let

    y = k\,y_1 \log z + w ,    (6.66)
then

    y' = k\,y_1' \log z + \frac{k\,y_1}{z} + w'   \text{and}   y'' = k\,y_1'' \log z + \frac{2k\,y_1'}{z} - \frac{k\,y_1}{z^2} + w'' .

On substituting into (6.64), and using the fact that $y_1$ is a solution of (6.64), we find that

    z^2 w'' + z w' + (z^2 - m^2)\,w = -2k\,z\,y_1' .    (6.67)

Based on (6.54), (6.61) and (6.63) we now seek a series solution of the form

    w = k\,z^{-m} \sum_{n=0}^{\infty} b_n z^n .    (6.68)

On substitution into (6.67) we have that

    k \sum_{n=1}^{\infty} n(n - 2m)\,b_n z^{n-m} + k \sum_{n=0}^{\infty} b_n z^{n-m+2} = -2k \sum_{\ell=0}^{\infty} (2\ell + m)\,a_{2\ell}\,z^{2\ell+m} .

After multiplying by $z^m$ (and cancelling the common factor $k$), and making the transformations $n \to n-2$ and $2\ell \to n - 2m$ in the second and third sums respectively, it follows that

    \sum_{n=1}^{\infty} n(n-2m)\,b_n z^n + \sum_{n=2}^{\infty} b_{n-2} z^n = -2 \sum_{\substack{n = 2m \\ n\ \mathrm{even}}}^{\infty} (n - m)\,a_{n-2m}\,z^n .

We now demand that the combined coefficient of $z^n$ is zero. Consider the even and odd powers of $z^n$ in turn.

$n = 1, 3, 5, \dots$. From equating powers of $z^1$ it follows that $b_1 = 0$, and thence, from the recurrence relation for powers of $z^{2\ell+1}$, i.e.

    (2\ell+1)(2\ell+1-2m)\,b_{2\ell+1} = -b_{2\ell-1} ,

that $b_{2\ell+1} = 0$ ($\ell = 1, 2, \dots$).

$n = 2, 4, \dots, 2m, \dots$. From equating even powers of $z^n$:

    2 \leq n \leq 2m-2 :   b_{n-2} = -n(n-2m)\,b_n ,    (6.69a)
    n = 2m :   b_{2m-2} = -2m\,a_0 ,    (6.69b)
    n \geq 2m+2 :   b_n = -\frac{1}{n(n-2m)}\,b_{n-2} - \frac{2(n-m)}{n(n-2m)}\,a_{n-2m} .    (6.69c)

Hence:

    if $m \geq 1$ solve for $b_{2m-2}$ in terms of $a_0$ from (6.69b);
    if $m \geq 2$ solve for $b_{2m-4}, b_{2m-6}, \dots, b_2, b_0$ in terms of $b_{2m-2}$, etc. from (6.69a);
    finally, on the assumption that $b_{2m}$ is known (see below), solve for $b_{2m+2}, b_{2m+4}, \dots$ in terms of $b_{2m}$ and the $a_{2\ell}$ ($\ell = 1, 2, \dots$) from (6.69c).

$b_{2m}$ is undetermined, since it effectively generates a solution proportional to $y_1$; wlog $b_{2m} = 0$.
6.5 Inhomogeneous Second-Order Linear ODEs

We now [re]turn to the real inhomogeneous equation

    y'' + p(x)\,y' + q(x)\,y = f(x) .    (6.70)
6.5.1 The Method of Variation of Parameters

The question that remains is how to find the particular solution. To that end, first suppose that we have solved the homogeneous equation and found two linearly independent solutions $y_1$ and $y_2$. Then in order to find a particular solution consider

    y_0(x) = u(x)\,y_1(x) + v(x)\,y_2(x) .    (6.72)

If $u$ and $v$ were constants (parameters), $y_0$ would solve the homogeneous equation. However, we allow the parameters to vary, i.e. to be functions of $x$, in such a way that $y_0$ solves the inhomogeneous problem.

Remark. We have gone from one unknown function, i.e. $y_0$, and one equation (i.e. (6.70)), to two unknown functions, i.e. $u$ and $v$, and one equation. We will need to find, or in fact choose, another equation.

We now differentiate (6.72) to find that

    y_0' = (u y_1' + v y_2') + (u' y_1 + v' y_2) ,    (6.73a)
    y_0'' = (u y_1'' + v y_2'' + u' y_1' + v' y_2') + (u'' y_1 + v'' y_2 + u' y_1' + v' y_2') .    (6.73b)

If we substitute the above into the inhomogeneous equation (6.70) we will have not apparently made much progress, because we will still have a second-order equation involving terms like $u''$ and $v''$. However, suppose that we eliminate the $u''$ and $v''$ terms by demanding that

    u' y_1 + v' y_2 = 0 .    (6.74)

Then (6.73a) and (6.73b) become

    y_0' = u y_1' + v y_2' ,    (6.75a)
    y_0'' = u y_1'' + v y_2'' + u' y_1' + v' y_2' .    (6.75b)

It follows from (6.72), (6.75a) and (6.75b) that

    y_0'' + p y_0' + q y_0 = u (y_1'' + p y_1' + q y_1) + v (y_2'' + p y_2' + q y_2) + u' y_1' + v' y_2'
                           = u' y_1' + v' y_2' ,

since $y_1$ and $y_2$ solve the homogeneous equation (6.17). Hence $y_0$ solves the inhomogeneous equation (6.70) if

    u' y_1' + v' y_2' = f .    (6.76)

We now have two simultaneous equations, (6.74) and (6.76), for $u'$ and $v'$, with solution

    u' = -\frac{f y_2}{W}   \text{and}   v' = \frac{f y_1}{W} ,    (6.77)

where $W$ is the Wronskian

    W = y_1 y_2' - y_2 y_1' .    (6.78)

$W$ is non-zero because $y_1$ and $y_2$ were chosen to be linearly independent. Integrating, we obtain

    u = -\int_a^x \frac{y_2(\xi)\,f(\xi)}{W(\xi)}\,d\xi   \text{and}   v = \int_a^x \frac{y_1(\xi)\,f(\xi)}{W(\xi)}\,d\xi ,    (6.79)

where $a$ is arbitrary. We could have chosen different lower limits for the two integrals, but we do not need to find the general solution, only a particular one. Substituting this result back into (6.72) we obtain as our particular solution

    y_0(x) = \int_a^x \frac{f(\xi)}{W(\xi)} \bigl( y_1(\xi)\,y_2(x) - y_1(x)\,y_2(\xi) \bigr)\,d\xi .    (6.80)
Remark. We observe that, since the integrand is zero when $\xi = x$,

    y_0'(x) = \int_a^x \frac{f(\xi)}{W(\xi)} \bigl( y_1(\xi)\,y_2'(x) - y_1'(x)\,y_2(\xi) \bigr)\,d\xi .    (6.81)

Hence the particular solution (6.80) satisfies the initial-value homogeneous boundary conditions

    y(a) = y'(a) = 0 .    (6.82)

More general initial-value boundary conditions would be inhomogeneous, e.g.

    y(a) = k_1 ,   y'(a) = k_2 ,    (6.83)

where $k_1$ and $k_2$ are constants which are not simultaneously zero. Such inhomogeneous boundary conditions can be satisfied by adding suitable multiples of the linearly independent solutions of the homogeneous equation, i.e. $y_1$ and $y_2$.

Example. Find the general solution to the equation

    y'' + y = \sec x .    (6.84)

Answer. Two linearly independent solutions of the homogeneous equation are

    y_1 = \cos x   \text{and}   y_2 = \sin x ,    (6.85a)

with Wronskian

    W = y_1 y_2' - y_2 y_1' = \cos x (\cos x) - \sin x (-\sin x) = 1 .    (6.85b)

Hence from (6.80) a particular solution is given by

    y_0(x) = \int^x \sec\xi\,\bigl( \cos\xi \sin x - \cos x \sin\xi \bigr)\,d\xi
           = \sin x \int^x d\xi - \cos x \int^x \tan\xi\,d\xi
           = x \sin x + \cos x \log|\cos x| .    (6.86)

The general solution is thus

    y(x) = (\alpha + \log|\cos x|) \cos x + (\beta + x) \sin x .    (6.87)
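A quick symbolic check of (6.86) (an illustration, not from the notes):

```python
import sympy as sp

x = sp.symbols('x')
y0 = x * sp.sin(x) + sp.cos(x) * sp.log(sp.cos(x))   # particular solution (6.86)

residual = sp.diff(y0, x, 2) + y0 - sp.sec(x)        # should satisfy y'' + y = sec x
print(sp.simplify(residual))                          # 0
```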
6.5.2 Two-Point Boundary Value Problems

For many important problems ODEs have to be solved subject to boundary conditions at more than one point. For linear second-order ODEs such boundary conditions have the general form

    A\,y(a) + B\,y'(a) = k_a ,    (6.88a)
    C\,y(b) + D\,y'(b) = k_b ,    (6.88b)

for two points $x = a$ and $x = b$ (wlog $b > a$), where $A$, $B$, $C$, $D$, $k_a$ and $k_b$ are constants. If $k_a = k_b = 0$ these boundary conditions are homogeneous, otherwise they are inhomogeneous.

For simplicity we shall consider the special case of homogeneous boundary conditions given by

    y(a) = 0   \text{and}   y(b) = 0 .    (6.89)

The general solution of the inhomogeneous equation (6.70) satisfying the first boundary condition $y(a) = 0$ is, from (6.71), (6.80) and (6.82),

    y(x) = \mathcal{K}\bigl( y_1(a)\,y_2(x) - y_1(x)\,y_2(a) \bigr) + \int_a^x \frac{f(\xi)}{W(\xi)} \bigl( y_1(\xi)\,y_2(x) - y_1(x)\,y_2(\xi) \bigr)\,d\xi ,    (6.90a)

where the first term on the right-hand side is the general solution of the homogeneous equation vanishing at $x = a$, and $\mathcal{K}$ is a constant. If we now impose the condition $y(b) = 0$, then we require that

    \mathcal{K}\bigl( y_1(a)\,y_2(b) - y_1(b)\,y_2(a) \bigr) + \int_a^b \frac{f(\xi)}{W(\xi)} \bigl( y_1(\xi)\,y_2(b) - y_1(b)\,y_2(\xi) \bigr)\,d\xi = 0 .    (6.90b)

This determines $\mathcal{K}$ provided that $y_1(a)\,y_2(b) - y_1(b)\,y_2(a) \neq 0$.
Remark. If per chance $y_1(a)\,y_2(b) - y_1(b)\,y_2(a) = 0$ then a solution exists to the homogeneous equation satisfying both boundary conditions. As a general rule a solution to the inhomogeneous problem exists if and only if there is no solution of the homogeneous equation satisfying both boundary conditions.

A particular choice of $y_1$ and $y_2$. For linearly independent $y_1$ and $y_2$ let

    y_a(x) = y_1(a)\,y_2(x) - y_1(x)\,y_2(a)   \text{and}   y_b(x) = y_1(b)\,y_2(x) - y_1(x)\,y_2(b) ,    (6.91)

so that $y_a(a) = 0$ and $y_b(b) = 0$. The Wronskian of $y_a$ and $y_b$ is given by

    y_a y_b' - y_a' y_b = \bigl( y_1(a)\,y_2(b) - y_1(b)\,y_2(a) \bigr)\,(y_1 y_2' - y_1' y_2)
                        = \bigl( y_1(a)\,y_2(b) - y_1(b)\,y_2(a) \bigr)\,W ,

where $W$ is the Wronskian of $y_1$ and $y_2$. Hence $y_a$ and $y_b$ are linearly independent if

    y_1(a)\,y_2(b) - y_1(b)\,y_2(a) \neq 0 ,    (6.92)

i.e. the same condition that allows us to solve for $\mathcal{K}$. Under such circumstances we can redefine $y_1$ and $y_2$ to be $y_a$ and $y_b$ respectively.

In general the amount of algebra can be reduced by a sensible choice of $y_1$ and $y_2$, so that they satisfy the homogeneous boundary conditions at $x = a$ and $x = b$ respectively, i.e. in the case of (6.89)

    y_1(a) = y_2(b) = 0 .    (6.93)
6.5.3 Green's Functions

Suppose that we wish to solve the equation

    \mathcal{L}^{(x)} y(x) = f(x) ,    (6.94)

where $\mathcal{L}^{(x)}$ is the general second-order linear differential operator in $x$, i.e.

    \mathcal{L}^{(x)} = \frac{d^2}{dx^2} + p(x)\,\frac{d}{dx} + q(x) ,    (6.95)

where $p$ and $q$ are continuous functions. To fix ideas we will assume that the solution should satisfy homogeneous boundary conditions at $x = a$ and $x = b$, i.e. $k_a = k_b = 0$ in (6.88a) and (6.88b).

Next, suppose that we can find a solution $G(x, \xi)$ that is the response of the system to forcing at a point $\xi$, i.e. $G(x, \xi)$ is the solution to

    \mathcal{L}^{(x)} G(x, \xi) = \delta(x - \xi) ,    (6.96)

subject to

    A\,G(a, \xi) + B\,G_x(a, \xi) = 0   \text{and}   C\,G(b, \xi) + D\,G_x(b, \xi) = 0 ,    (6.97)

where $G_x(x, \xi) = \frac{\partial G}{\partial x}(x, \xi)$, and we have used $\frac{\partial}{\partial x}$ rather than $\frac{d}{dx}$ since $G$ is a function of both $x$ and $\xi$. Then we claim that the solution of the original problem (6.94) is

    y(x) = \int_a^b G(x, \xi)\,f(\xi)\,d\xi .    (6.98)

To see this we first note that (6.98) satisfies the boundary conditions, since $\int 0\,d\xi = 0$. Further, it also satisfies the inhomogeneous equation (6.94) (or (6.70)) because

    \mathcal{L}^{(x)} y(x) = \int_a^b \mathcal{L}^{(x)} G(x, \xi)\,f(\xi)\,d\xi      (the differential operator acts wrt $x$, the integral is wrt $\xi$)
                           = \int_a^b \delta(x - \xi)\,f(\xi)\,d\xi      from (6.96)
                           = f(x)      from (3.4).    (6.99)

The function $G(x, \xi)$ is called the Green's function of $\mathcal{L}^{(x)}$ for the given homogeneous boundary conditions.
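As a concrete numerical illustration (not from the notes; the operator, interval and forcing are example choices), take L = d²/dx² on [0, 1] with y(0) = y(1) = 0. With y₁ = x and y₂ = x − 1 (so that y₁(0) = 0, y₂(1) = 0 and W = 1), the construction of §6.5.5 gives G(x, ξ) = x(ξ − 1) for x ≤ ξ and ξ(x − 1) for x ≥ ξ, and (6.98) can then be checked against the exact solution.

```python
import numpy as np
from scipy.integrate import quad

def G(x, xi):
    """Green's function of d^2/dx^2 on [0,1] with G(0,xi) = G(1,xi) = 0."""
    return x * (xi - 1.0) if x <= xi else xi * (x - 1.0)

f = lambda xi: 1.0                      # example forcing: solve y'' = 1
y = lambda x: quad(lambda xi: G(x, xi) * f(xi), 0.0, 1.0)[0]   # formula (6.98)

for x in (0.25, 0.5, 0.75):
    print(x, y(x), 0.5 * x * (x - 1.0))  # exact solution of y'' = 1, y(0) = y(1) = 0
```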
6.5.4 Two Properties of Green's Functions

In the next section we will construct a Green's function. However, first we need to derive two properties of $G(x, \xi)$. Suppose that we integrate equation (6.96) from $\xi - \epsilon$ to $\xi + \epsilon$ for $\epsilon > 0$, and consider the limit $\epsilon \to 0$. From (3.4) the right-hand side is equal to 1, and hence

    1 = \lim_{\epsilon\to 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \mathcal{L}^{(x)} G\,dx
      = \lim_{\epsilon\to 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \left( \frac{\partial^2 G}{\partial x^2} + p\,\frac{\partial G}{\partial x} + q\,G \right) dx      from (6.95)
      = \lim_{\epsilon\to 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \frac{\partial}{\partial x}\left( \frac{\partial G}{\partial x} + p\,G \right) dx - \lim_{\epsilon\to 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \left( \frac{dp}{dx} - q \right) G\,dx      on rearranging
      = \lim_{\epsilon\to 0} \left[ \frac{\partial G}{\partial x} + p\,G \right]_{x=\xi-\epsilon}^{x=\xi+\epsilon} - \lim_{\epsilon\to 0} \int_{\xi-\epsilon}^{\xi+\epsilon} \left( \frac{dp}{dx} - q \right) G\,dx .    (6.100)

How can this equation be satisfied? Let us suppose that $G(x, \xi)$ is continuous at $x = \xi$, i.e. that

    \lim_{\epsilon\to 0} \Bigl[ G(x, \xi) \Bigr]_{x=\xi-\epsilon}^{x=\xi+\epsilon} = 0 .    (6.101a)

Then, since $p$ and $q$ are continuous, (6.100) reduces to

    \lim_{\epsilon\to 0} \left[ \frac{\partial G}{\partial x} \right]_{x=\xi-\epsilon}^{x=\xi+\epsilon} = 1 ,    (6.101b)

i.e. there is a unit jump in the derivative of $G$ at $x = \xi$ (cf. the unit jump in the Heaviside step function (3.9) at $x = 0$). Note that a function can be continuous and its derivative discontinuous, but not vice versa.
6.5.5 Construction of the Green's Function

$G(x, \xi)$ can be constructed by the following procedure. First we note that when $x \neq \xi$, $G$ satisfies the homogeneous equation, and hence $G$ should be the sum of two linearly independent solutions, say $y_1$ and $y_2$, of the homogeneous equation. So let

    G(x, \xi) = \begin{cases} \alpha_-(\xi)\,y_1(x) + \beta_-(\xi)\,y_2(x) & \text{for } a \leq x < \xi , \\ \alpha_+(\xi)\,y_1(x) + \beta_+(\xi)\,y_2(x) & \text{for } \xi \leq x \leq b . \end{cases}    (6.102)

By construction this satisfies (6.96) for $x \neq \xi$. Next we obtain equations relating $\alpha_\pm(\xi)$ and $\beta_\pm(\xi)$ by requiring that at $x = \xi$, $G$ is continuous and $\frac{\partial G}{\partial x}$ has a unit discontinuity. It follows from (6.101a) and (6.101b) that

    \bigl( \alpha_+(\xi)\,y_1(\xi) + \beta_+(\xi)\,y_2(\xi) \bigr) - \bigl( \alpha_-(\xi)\,y_1(\xi) + \beta_-(\xi)\,y_2(\xi) \bigr) = 0 ,
    \bigl( \alpha_+(\xi)\,y_1'(\xi) + \beta_+(\xi)\,y_2'(\xi) \bigr) - \bigl( \alpha_-(\xi)\,y_1'(\xi) + \beta_-(\xi)\,y_2'(\xi) \bigr) = 1 ,

i.e.

    y_1(\xi)\bigl( \alpha_+(\xi) - \alpha_-(\xi) \bigr) + y_2(\xi)\bigl( \beta_+(\xi) - \beta_-(\xi) \bigr) = 0 ,
    y_1'(\xi)\bigl( \alpha_+(\xi) - \alpha_-(\xi) \bigr) + y_2'(\xi)\bigl( \beta_+(\xi) - \beta_-(\xi) \bigr) = 1 ,

i.e.

    \begin{pmatrix} y_1 & y_2 \\ y_1' & y_2' \end{pmatrix} \begin{pmatrix} \alpha_+ - \alpha_- \\ \beta_+ - \beta_- \end{pmatrix} = \begin{pmatrix} 0 \\ 1 \end{pmatrix} .    (6.103)

A solution exists to this equation if

    W \equiv \begin{vmatrix} y_1 & y_2 \\ y_1' & y_2' \end{vmatrix} \neq 0 ,
i.e. if $y_1$ and $y_2$ are linearly independent; if so, then

    \alpha_+ - \alpha_- = -\frac{y_2(\xi)}{W(\xi)}   \text{and}   \beta_+ - \beta_- = \frac{y_1(\xi)}{W(\xi)} .    (6.104)

Finally we impose the boundary conditions. For instance, suppose that the solution $y$ is required to satisfy (6.89) (i.e. $y(a) = 0$ and $y(b) = 0$); then the appropriate boundary conditions for $G$ are

    G(a, \xi) = G(b, \xi) = 0 ,    (6.105)

i.e. $A = C = 1$ and $B = D = 0$ in (6.97). It follows from (6.102) that we require

    \alpha_-(\xi)\,y_1(a) + \beta_-(\xi)\,y_2(a) = 0 ,    (6.106a)
    \alpha_+(\xi)\,y_1(b) + \beta_+(\xi)\,y_2(b) = 0 .    (6.106b)

$\alpha_\pm$ and $\beta_\pm$ can now be determined from (6.104), (6.106a) and (6.106b). For simplicity choose $y_1$ and $y_2$ as in (6.93), so that $y_1(a) = y_2(b) = 0$; then

    \alpha_+ = \beta_- = 0 ,    (6.107a)

and thence from (6.104)

    \alpha_- = \frac{y_2(\xi)}{W(\xi)}   \text{and}   \beta_+ = \frac{y_1(\xi)}{W(\xi)} .    (6.107b)

It follows from (6.102) that

    G(x, \xi) = \begin{cases} \dfrac{y_1(x)\,y_2(\xi)}{W(\xi)} & \text{for } a \leq x < \xi , \\[1ex] \dfrac{y_1(\xi)\,y_2(x)}{W(\xi)} & \text{for } \xi \leq x \leq b . \end{cases}    (6.108)

Initial-value homogeneous boundary conditions. Suppose that instead of the two-point boundary conditions (6.89) we require that $y(a) = y'(a) = 0$; then in place of (6.107a) and (6.107b) we have

    G(x, \xi) = \begin{cases} 0 & \text{for } a \leq x < \xi , \\[0.5ex] \dfrac{y_1(\xi)\,y_2(x) - y_1(x)\,y_2(\xi)}{W(\xi)} & \text{for } \xi \leq x \leq b . \end{cases}    (6.109)
6.5.6 Unlectured: Alternative Derivation of a Green's Function

By means of a little bit of manipulation we can also recover (6.108) from our earlier general solution (6.90a) and (6.90b). That solution was derived for the homogeneous boundary conditions (6.89), i.e.

    y(a) = 0   \text{and}   y(b) = 0 .

As above, choose $y_1$ and $y_2$ so that they satisfy the boundary conditions at $x = a$ and $x = b$ respectively, i.e. let

    y_1(a) = y_2(b) = 0 .    (6.110)

In this case we have from (6.90b) that

    \mathcal{K} = -\frac{1}{y_2(a)} \int_a^b \frac{f(\xi)}{W(\xi)}\,y_2(\xi)\,d\xi .    (6.111)
It follows from (6.90a) that
\[
\begin{aligned}
y(x) &= \int_a^b \frac{f(\xi)}{W(\xi)}\, y_1(x)\,y_2(\xi)\, d\xi
      + \int_a^x \frac{f(\xi)}{W(\xi)}\, \bigl( y_1(\xi)\,y_2(x) - y_1(x)\,y_2(\xi) \bigr)\, d\xi \\
&= \int_a^x \frac{y_1(\xi)\,y_2(x)}{W(\xi)}\, f(\xi)\, d\xi
  + \int_x^b \frac{y_1(x)\,y_2(\xi)}{W(\xi)}\, f(\xi)\, d\xi \\
&= \int_a^b G(x,\xi)\, f(\xi)\, d\xi && \text{(6.112)}
\end{aligned}
\]
where, as in (6.108), $G(x,\xi)$ is defined by
\[
G(x,\xi) =
\begin{cases}
\dfrac{y_1(\xi)\,y_2(x)}{W(\xi)} & \text{for } \xi\leqslant x, \text{ i.e. } x\geqslant\xi ,\\[6pt]
\dfrac{y_1(x)\,y_2(\xi)}{W(\xi)} & \text{for } \xi>x, \text{ i.e. } x<\xi .
\end{cases}
\tag{6.113}
\]
Remark. Note from (6.113) that $G(x,\xi)$ is continuous at $x=\xi$, and that
\[
\frac{\partial G}{\partial x}(x,\xi) =
\begin{cases}
\dfrac{y_1(\xi)\,y_2'(x)}{W(\xi)} & \text{for } x>\xi ,\\[6pt]
\dfrac{y_1'(x)\,y_2(\xi)}{W(\xi)} & \text{for } x<\xi .
\end{cases}
\tag{6.114}
\]
Hence, from using the definition of the Wronskian (6.9), $\dfrac{\partial G}{\partial x}$ is discontinuous at $x=\xi$ with discontinuity
\[
\lim_{\varepsilon\to 0}\left[\frac{\partial G}{\partial x}(x,\xi)\right]_{x=\xi-\varepsilon}^{x=\xi+\varepsilon}
= \frac{y_1(\xi)\,y_2'(\xi)}{W(\xi)} - \frac{y_1'(\xi)\,y_2(\xi)}{W(\xi)} = 1 . \tag{6.115}
\]
6.5.7 Example of a Green's Function

Find the Green's function on $a\leqslant x\leqslant b$, where $0<a<b$, for
\[
\mathcal{L}_{(x)} = \frac{d^2}{dx^2} + \frac{1}{x}\frac{d}{dx} - \frac{n^2}{x^2} , \tag{6.116a}
\]
with homogeneous boundary conditions
\[
G(a,\xi) = 0 \quad\text{and}\quad \frac{\partial G}{\partial x}(b,\xi) = 0 , \tag{6.116b}
\]
i.e. with $A=D=1$ and $B=C=0$ in (6.97).

Answer. Seek solutions to the homogeneous equation $\mathcal{L}_{(x)} y = 0$ of the form $y=x^r$. Then we require that
\[
r(r-1) + r - n^2 = 0 , \quad\text{i.e.}\quad r = \pm n .
\]
Let
\[
y_1 = \left(\frac{x}{a}\right)^{\!n} - \left(\frac{a}{x}\right)^{\!n}
\quad\text{and}\quad
y_2 = \left(\frac{x}{b}\right)^{\!n} + \left(\frac{b}{x}\right)^{\!n} , \tag{6.117}
\]
where we have constructed $y_1$ and $y_2$ so that $y_1(a)=0$ and $y_2'(b)=0$, as is appropriate for boundary conditions (6.116b) (cf. the choice of $y_1(a)=0$ and $y_2(b)=0$ in §6.5.5, since in that case we required the Green's function to satisfy boundary conditions (6.105)). As in (6.102) let
\[
G(x,\xi) =
\begin{cases}
\alpha_-(\xi)\,y_1(x) + \beta_-(\xi)\,y_2(x) & \text{for } a\leqslant x<\xi ,\\[2pt]
\alpha_+(\xi)\,y_1(x) + \beta_+(\xi)\,y_2(x) & \text{for } \xi<x\leqslant b .
\end{cases}
\]
Since we require that $G(a,\xi)=0$ from (6.116b), and by construction $y_1(a)=0$, it follows that $\beta_-=0$. Similarly, since we require that $\dfrac{\partial G}{\partial x}(b,\xi)=0$ from (6.116b), and by construction $y_2'(b)=0$, it follows that $\alpha_+=0$. Hence
\[
G(x,\xi) =
\begin{cases}
\alpha_-(\xi)\,y_1(x) & \text{for } a\leqslant x<\xi ,\\[2pt]
\beta_+(\xi)\,y_2(x) & \text{for } \xi<x\leqslant b .
\end{cases}
\]
We also require that $G$ is continuous and $\dfrac{\partial G}{\partial x}$ has a unit discontinuity at $x=\xi$, hence
\[
\beta_+(\xi)\,y_2(\xi) - \alpha_-(\xi)\,y_1(\xi) = 0
\quad\text{and}\quad
\beta_+(\xi)\,y_2'(\xi) - \alpha_-(\xi)\,y_1'(\xi) = 1 .
\]
Thus
\[
\alpha_- = \frac{y_2(\xi)}{W(\xi)} , \qquad
\beta_+ = \frac{y_1(\xi)}{W(\xi)}
\qquad\text{and}\qquad
G(x,\xi) =
\begin{cases}
\dfrac{y_1(x)\,y_2(\xi)}{W(\xi)} & \text{for } a\leqslant x<\xi ,\\[6pt]
\dfrac{y_1(\xi)\,y_2(x)}{W(\xi)} & \text{for } \xi<x\leqslant b .
\end{cases}
\tag{6.118}
\]
This has the same form as (6.108) because we [carefully] chose $y_1$ and $y_2$ in (6.117) to satisfy the boundary conditions at $x=a$ and $x=b$ respectively. Note, however, that the boundary condition that the solution is required to satisfy at $x=b$ is different in the two cases, i.e. $y_2(b)=0$ in (6.108) while $y_2'(b)=0$ in (6.118).
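A quick numerical sanity check of (6.118) is easy to set up (an added sketch; the parameter values $a$, $b$, $n$, $\xi$ below are arbitrary illustrative choices, not from the notes). It confirms the boundary conditions (6.116b), the continuity of $G$ at $x=\xi$, and the unit jump in $\partial G/\partial x$ there.

```python
import numpy as np

a, b, n, xi = 1.0, 2.0, 3, 1.4              # illustrative values (not from the notes)

y1 = lambda x: (x / a)**n - (a / x)**n       # y1(a) = 0
y2 = lambda x: (x / b)**n + (b / x)**n       # y2'(b) = 0
dy1 = lambda x: (n / x) * ((x / a)**n + (a / x)**n)
dy2 = lambda x: (n / x) * ((x / b)**n - (b / x)**n)

W = y1(xi) * dy2(xi) - dy1(xi) * y2(xi)      # Wronskian evaluated at x = xi

def G(x):
    return y1(x) * y2(xi) / W if x < xi else y1(xi) * y2(x) / W

def dGdx(x):
    return dy1(x) * y2(xi) / W if x < xi else y1(xi) * dy2(x) / W

eps = 1e-8
print(abs(G(a)))                             # G(a, xi) = 0
print(abs(dGdx(b)))                          # dG/dx(b, xi) = 0
print(abs(G(xi + eps) - G(xi - eps)))        # continuity of G at x = xi
print(dGdx(xi + eps) - dGdx(xi - eps))       # unit jump: should print approximately 1
```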
6.6 Sturm-Liouville Theory

Definition. A second-order linear differential operator $\mathcal{L}$ is said to be of Sturm-Liouville type if
\[
\mathcal{L} = -\frac{d}{dx}\!\left( p(x)\,\frac{d}{dx} \right) + q(x) , \tag{6.119a}
\]
where $p(x)$ and $q(x)$ are real functions defined for $a\leqslant x\leqslant b$, with
\[
p(x) > 0 \quad\text{for } a<x<b . \tag{6.119b}
\]
Notation alert. The use of $p$ and $q$ in (6.119a) is different from their use up to now in this section, e.g. in (6.2), (6.70) and (6.95). Unfortunately both uses are conventional.

6.6.1 Inner Products and Self-Adjoint Operators

Given two, possibly complex, piecewise continuous functions $u(x)$ and $v(x)$, define an inner product $\langle u \,|\, v\rangle$ by
\[
\langle u \,|\, v\rangle = \int_a^b w(x)\, u^*(x)\, v(x)\, dx , \tag{6.120a}
\]
where $w(x)$ is a real weight function with $w(x)>0$ for $a<x<b$. This inner product has the properties
\[
\langle u \,|\, v\rangle = \langle v \,|\, u\rangle^* ; \tag{6.121a}
\]
\[
\langle u \,|\, \lambda v + \mu t\rangle = \lambda\,\langle u \,|\, v\rangle + \mu\,\langle u \,|\, t\rangle ; \tag{6.121b}
\]
\[
\langle v \,|\, v\rangle \geqslant 0 ; \tag{6.121c}
\]
\[
\langle v \,|\, v\rangle = 0 \;\Leftrightarrow\; v = 0 . \tag{6.121d}
\]
Definition. A general differential operator $\widetilde{\mathcal{L}}$ is said to be self-adjoint if
\[
\bigl\langle u \,\big|\, \widetilde{\mathcal{L}} v \bigr\rangle = \bigl\langle \widetilde{\mathcal{L}} u \,\big|\, v \bigr\rangle . \tag{6.122}
\]
Remark. Whether or not an operator $\widetilde{\mathcal{L}}$ is self-adjoint with respect to an inner product depends on the choice of weight function in the inner product.
6.6.2 The Sturm-Liouville Operator

Consider the Sturm-Liouville operator (6.119a) together with the identity weight function $w=1$; then
\[
\begin{aligned}
\langle u \,|\, \mathcal{L} v\rangle
&= \int_a^b dx\; u^*\,\mathcal{L} v && \text{from (6.120a)} \\
&= \int_a^b dx\; u^*\left( -\frac{d}{dx}\!\left( p\,\frac{dv}{dx} \right) + q\,v \right) && \text{from (6.119a)} \\
&= -\left[ u^* p\,\frac{dv}{dx} \right]_a^b + \int_a^b dx\; p\,\frac{du^*}{dx}\frac{dv}{dx} + \int_a^b dx\; q\,u^* v && \text{integrate by parts} \\
&= -\left[ u^* p\,\frac{dv}{dx} - p\,\frac{du^*}{dx}\,v \right]_a^b - \int_a^b dx\; v\,\frac{d}{dx}\!\left( p\,\frac{du^*}{dx} \right) + \int_a^b dx\; v\,q\,u^* && \text{integrate by parts} \\
&= \left[ p\left( v\,\frac{du^*}{dx} - u^*\frac{dv}{dx} \right) \right]_a^b + \int_a^b dx\; v\,\mathcal{L} u^* && \text{from (6.119a)} \\
&= \left[ p\left( v\,\frac{du^*}{dx} - u^*\frac{dv}{dx} \right) \right]_a^b + \int_a^b dx\; (\mathcal{L} u)^*\, v && \text{since } \mathcal{L} \text{ real} \\
&= \left[ p\left( v\,\frac{du^*}{dx} - u^*\frac{dv}{dx} \right) \right]_a^b + \langle \mathcal{L} u \,|\, v\rangle && \text{from (6.120a).} && \text{(6.123a)}
\end{aligned}
\]
Suppose we now insist that $u$ and $v$ be such that
\[
\left[ p\left( v\,\frac{du^*}{dx} - u^*\frac{dv}{dx} \right) \right]_a^b = 0 , \tag{6.123b}
\]
then (6.122) is satisfied. We conclude that the differential operator
\[
\mathcal{L} = -\frac{d}{dx}\!\left( p(x)\,\frac{d}{dx} \right) + q(x) ,
\]
acting on functions, say $u$ or $v$, which satisfy homogeneous boundary conditions at $x=a$ and $x=b$ (e.g. $u(a)=0$, $v(a)=0$ and $u(b)=0$, $v(b)=0$), is self-adjoint with respect to the inner product with $w=1$.

Remarks.

• The boundary conditions are part of the conditions for an operator to be self-adjoint.

• Suppose that an inner product for column vectors $u$ and $v$ is defined by (cf. (4.55))
\[
\langle u \,|\, v\rangle = u^\dagger v . \tag{6.124}
\]
Then for a Hermitian matrix $H$ we have that
\[
\begin{aligned}
\langle u \,|\, Hv\rangle = u^\dagger H v
&= u^\dagger H^\dagger v && \text{since } H \text{ is Hermitian} \\
&= (Hu)^\dagger v = \langle Hu \,|\, v\rangle && \text{since } (AB)^\dagger = B^\dagger A^\dagger . && \text{(6.125)}
\end{aligned}
\]
A comparison of (6.122) and (6.125) suggests that self-adjoint operators are to general operators what Hermitian matrices are to general matrices.
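The self-adjointness property is straightforward to verify numerically for a particular Sturm-Liouville operator. The sketch below is an added illustration (the choices $p=1+x^2$, $q=x$, $u=\sin\pi x$, $v=x(1-x)$ on $[0,1]$ are arbitrary): it evaluates $\langle u\,|\,\mathcal{L}v\rangle$ and $\langle\mathcal{L}u\,|\,v\rangle$ with $w=1$ and confirms they agree, the boundary term (6.123b) vanishing because $u$ and $v$ vanish at both end-points.

```python
import numpy as np

x = np.linspace(0.0, 1.0, 4001)
p = 1.0 + x**2
q = x

u = np.sin(np.pi * x)                    # u(0) = u(1) = 0
v = x * (1.0 - x)                        # v(0) = v(1) = 0

# L = -d/dx( p d/dx ) + q, applied analytically:
#   Lv = -(p v')' + q v with v' = 1 - 2x, v'' = -2
Lv = -(2.0 * x * (1.0 - 2.0 * x) + (1.0 + x**2) * (-2.0)) + q * v
#   Lu = -(p u')' + q u with u' = pi cos(pi x), u'' = -pi^2 sin(pi x)
Lu = -(2.0 * x * np.pi * np.cos(np.pi * x)
       - (1.0 + x**2) * np.pi**2 * np.sin(np.pi * x)) + q * u

lhs = np.trapz(u * Lv, x)                # <u | Lv>  (u is real here, so no conjugate needed)
rhs = np.trapz(Lu * v, x)                # <Lu | v>
print(lhs, rhs, abs(lhs - rhs))          # the two inner products agree to quadrature accuracy
```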
6.6.3 The Role of the Weight Function

Not all second-order linear differential operators have the Sturm-Liouville form (6.119a). However, suppose that $\widetilde{\mathcal{L}}$ is a second-order linear differential operator not of Sturm-Liouville form; then there exists a function $w(x)$ so that
\[
\mathcal{L} = w\,\widetilde{\mathcal{L}} \tag{6.126}
\]
is of Sturm-Liouville form.

Proof. The general second-order linear differential operator acting on functions defined for $a\leqslant x\leqslant b$ can be written in the form
\[
\widetilde{\mathcal{L}} = -P(x)\,\frac{d^2}{dx^2} - R(x)\,\frac{d}{dx} + Q(x) , \tag{6.127a}
\]
where $P$, $Q$ and $R$ are real functions; we shall assume that
\[
P(x) > 0 \quad\text{for } a<x<b . \tag{6.127b}
\]
Hence for $\mathcal{L}$ defined by (6.126),
\[
\mathcal{L} = -wP\,\frac{d^2}{dx^2} - wR\,\frac{d}{dx} + wQ
= -\frac{d}{dx}\!\left( wP\,\frac{d}{dx} \right) + \left( \frac{d}{dx}\bigl( wP \bigr) - wR \right)\frac{d}{dx} + wQ . \tag{6.128}
\]
The operator $\mathcal{L}$ in (6.128) is of Sturm-Liouville form (6.119a) if we choose our integrating factor $w$ so that
\[
P\,\frac{dw}{dx} + \left( \frac{dP}{dx} - R \right) w = 0 , \tag{6.129a}
\]
and let
\[
p = wP \quad\text{and}\quad q = wQ . \tag{6.129b}
\]
On solving (6.129a), and on choosing the constant of integration so that $w(a)=1$, we obtain
\[
w = \exp\int_a^x \frac{1}{P(\xi)}\left( R(\xi) - \frac{dP}{dx}(\xi) \right) d\xi . \tag{6.130}
\]
Remark. It follows from (6.130) that $w>0$, and hence from (6.127b) and (6.129b) that $p>0$ for $a<x<b$ (cf. (6.119b)).
Examples. Put the operators
\[
\widetilde{\mathcal{L}} = -\frac{d^2}{dx^2} - \frac{d}{dx}
\quad\text{and}\quad
\widetilde{\mathcal{L}} = -\frac{d^2}{dx^2} - \frac{1}{x}\frac{d}{dx}
\]
in Sturm-Liouville form.

Answers. For the first operator $P=R=1$. Hence from (6.130), $w=\exp x$ (wlog $a=0$), and thus
\[
\mathcal{L} = e^{x}\,\widetilde{\mathcal{L}} = -\frac{d}{dx}\!\left( e^{x}\,\frac{d}{dx} \right) .
\]
For the second operator $P=1$ and $R=x^{-1}$. Hence from (6.130), $w=x$ (wlog $a=1$), and thus
\[
\mathcal{L} = x\,\widetilde{\mathcal{L}} = -\frac{d}{dx}\!\left( x\,\frac{d}{dx} \right) .
\]
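A small check of the second example (an added illustration, with an arbitrary test function): with $w=x$, the combination $w\widetilde{\mathcal{L}}u = x(-u''-u'/x)$ should coincide with $-(xu')'$ for any smooth $u$, which is exactly the Sturm-Liouville form just quoted.

```python
import numpy as np

x = np.linspace(1.0, 2.0, 1001)           # wlog a = 1, so w(x) = x

u = np.sin(x)                              # arbitrary smooth test function
du = np.cos(x)
d2u = -np.sin(x)

lhs = x * (-d2u - du / x)                  # w * (tilde-L u) = x(-u'' - u'/x)
rhs = -(du + x * d2u)                      # -(x u')' = -(u' + x u'')
print(np.max(np.abs(lhs - rhs)))           # identically zero, up to rounding
```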
Is $\widetilde{\mathcal{L}}$ self-adjoint? We have seen that the general second-order linear differential operator $\widetilde{\mathcal{L}}$ can be transformed into Sturm-Liouville form by multiplication by a weight function $w$. It follows from §6.6.2 that, subject to the boundary conditions (6.123b) being satisfied, $w\widetilde{\mathcal{L}} = \mathcal{L}$ is self-adjoint with respect to an inner product with the identity weight function, i.e.
\[
\int_a^b u^*\,(\mathcal{L} v)\, dx = \int_a^b (\mathcal{L} u)^*\, v\, dx . \tag{6.131a}
\]
However suppose that we slightly rearrange this equation to
\[
\int_a^b u^*\,(\widetilde{\mathcal{L}} v)\, w\, dx = \int_a^b (\widetilde{\mathcal{L}} u)^*\, v\, w\, dx . \tag{6.131b}
\]
Then from reference to the definition of an inner product with weight function $w$, i.e. (6.120a), we see that, subject to appropriate boundary conditions being satisfied, i.e.
\[
\left[ wP\left( v\,\frac{du^*}{dx} - u^*\frac{dv}{dx} \right) \right]_a^b = 0 , \tag{6.132}
\]
$\widetilde{\mathcal{L}}$ is self-adjoint with respect to the inner product with weight function $w$.

6.6.4 Eigenvalues and Eigenfunctions

Of particular interest are eigenvalue problems of the form
\[
\widetilde{\mathcal{L}} y = \lambda y , \tag{6.133}
\]
where $\lambda$ is the, possibly complex, eigenvalue associated with the eigenfunction $y\neq 0$.
Example. The Schrödinger equation for a one-dimensional quantum harmonic oscillator is
\[
\left( -\frac{\hbar^2}{2m}\frac{d^2}{dx^2} + \tfrac{1}{2} k^2 x^2 \right)\psi = E\,\psi .
\]
This is an eigenvalue equation where the eigenvalue $E$ is the energy level of the oscillator.

Remark. If $\widetilde{\mathcal{L}}$ is not in Sturm-Liouville form we can multiply by $w$ to get the equivalent eigenvalue equation,
\[
\mathcal{L} y = \lambda\, w\, y , \tag{6.134}
\]
where $\mathcal{L}$ is in Sturm-Liouville form.

The Claim. We claim, but do not prove, that if the functions on which $\widetilde{\mathcal{L}}$ [or equivalently $\mathcal{L}$] acts are such that the boundary conditions (6.132) [or equivalently (6.123b)] are satisfied, then it is generally the case that (6.133) [or equivalently (6.134)] has solutions only for a discrete, but infinite, set of values of $\lambda$:
\[
\lambda_n , \quad n = 1, 2, 3, \ldots \tag{6.135}
\]
These are the eigenvalues of $\widetilde{\mathcal{L}}$ [or equivalently $\mathcal{L}$]. The corresponding solutions $y_n(x)$, $n = 1, 2, 3, \ldots$ are the eigenfunctions.
Example. Find the eigenvalues and eigenfunctions for the operator
\[
\mathcal{L} = -\frac{d^2}{dx^2} , \tag{6.136}
\]
on the assumption that $\mathcal{L}$ acts on functions defined on $0\leqslant x\leqslant\pi$ that vanish at the end-points $x=0$ and $x=\pi$.
Answer. $\mathcal{L}$ is in Sturm-Liouville form with $p=1$ and $q=0$. Further, the boundary conditions ensure that (6.123b) is satisfied. Hence $\mathcal{L}$ is self-adjoint. The eigenvalue equation is
\[
y'' + \lambda y = 0 , \tag{6.137a}
\]
with general solution
\[
y = \alpha\cos\bigl(\lambda^{1/2} x\bigr) + \beta\sin\bigl(\lambda^{1/2} x\bigr) . \tag{6.137b}
\]
Non-zero solutions exist with $y(0)=y(\pi)=0$ only if
\[
\alpha = 0 \quad\text{and}\quad \sin\bigl(\lambda^{1/2}\pi\bigr) = 0 . \tag{6.138}
\]
Hence $\lambda = n^2$ for integer $n$, and the corresponding eigenfunctions are
\[
y_n(x) = \sin nx . \tag{6.139}
\]
Remark. The eigenvalues $\lambda_n = n^2$ are real (cf. the eigenvalues of an Hermitian matrix).

The norm. We define the norm, $\|y\|$, of a (possibly complex) function $y(x)$ by
\[
\|y\|^2 \equiv \langle y \,|\, y\rangle_w = \int_a^b |y|^2\, w\, dx , \tag{6.140}
\]
where we have introduced the subscript $w$ (a non-standard notation) to indicate the weight function $w$ in the inner product. It is conventional to normalize eigenfunctions to have unit norm. For our example (6.139) this results in
\[
y_n = \left(\frac{2}{\pi}\right)^{\!1/2} \sin nx . \tag{6.141}
\]
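The eigenvalues $\lambda_n=n^2$ and eigenfunctions $\sin nx$ can also be recovered numerically by discretizing $\mathcal{L}=-d^2/dx^2$ on $0\leqslant x\leqslant\pi$ with a second-order finite-difference matrix and Dirichlet end conditions. The sketch below is an added illustration (grid size and tolerances are arbitrary), comparing the lowest computed eigenvalues with $n^2$.

```python
import numpy as np

N = 400                                    # number of interior grid points
h = np.pi / (N + 1)
x = np.linspace(h, np.pi - h, N)

# Finite-difference matrix for L = -d^2/dx^2 with y(0) = y(pi) = 0
main = np.full(N, 2.0 / h**2)
off = np.full(N - 1, -1.0 / h**2)
L = np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

evals, evecs = np.linalg.eigh(L)           # the matrix is symmetric, so eigh is appropriate
print(evals[:5])                           # approximately 1, 4, 9, 16, 25 = n^2
print(np.allclose(evals[:5], np.arange(1, 6)**2, rtol=1e-3))

# The first eigenvector is proportional to sin(x) on the grid
v1 = evecs[:, 0] / np.max(np.abs(evecs[:, 0]))
print(np.max(np.abs(np.abs(v1) - np.sin(x))))   # small discretization error
```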
6.6.5 The Eigenvalues of a Self-Adjoint Operator are Real

Let $\widetilde{\mathcal{L}}$ be a self-adjoint operator with respect to an inner product with weight $w$, and suppose that $y$ is a non-zero eigenvector with eigenvalue $\lambda$ satisfying
\[
\widetilde{\mathcal{L}} y = \lambda y . \tag{6.142a}
\]
Take the complex conjugate of this equation, remembering that $\widetilde{\mathcal{L}}$ and $w$ are real, to obtain
\[
\widetilde{\mathcal{L}} y^* = \lambda^* y^* . \tag{6.142b}
\]
Hence
\[
\int_a^b \bigl( y^*\,\widetilde{\mathcal{L}} y - y\,\widetilde{\mathcal{L}} y^* \bigr)\, w\, dx
= \int_a^b \bigl( \lambda\, y^* y - \lambda^*\, y\, y^* \bigr)\, w\, dx
= (\lambda - \lambda^*)\int_a^b |y|^2\, w\, dx
= (\lambda - \lambda^*)\,\langle y \,|\, y\rangle_w . \tag{6.143}
\]
But $\widetilde{\mathcal{L}}$ is self-adjoint with respect to an inner product with weight $w$, and hence the left-hand side of (6.143) is zero (e.g. (6.131b) with $u=v=y$). It follows that
\[
(\lambda - \lambda^*)\,\langle y \,|\, y\rangle_w = 0 . \tag{6.144}
\]
But $\langle y \,|\, y\rangle_w \neq 0$ from (6.121d), since $y$ has been assumed to be a non-zero eigenvector. Hence
\[
\lambda = \lambda^* , \quad\text{i.e. the eigenvalue } \lambda \text{ is real.} \tag{6.145}
\]
Remark. The same result can be obtained using inner product notation, since
\[
\lambda\,\langle y \,|\, y\rangle_w
= \langle y \,|\, \lambda y\rangle_w
= \bigl\langle y \,\big|\, \widetilde{\mathcal{L}} y \bigr\rangle_w
= \bigl\langle \widetilde{\mathcal{L}} y \,\big|\, y \bigr\rangle_w
= \langle \lambda y \,|\, y\rangle_w
= \lambda^*\,\langle y \,|\, y\rangle_w
\quad\text{from (6.121a) and (6.121b).} \tag{6.146}
\]
This is essentially (6.144), and hence (6.145) follows as above.
6.6.6 Eigenfunctions with Distinct Eigenvalues are Orthogonal

Definition. Two functions $u$ and $v$ are said to be orthogonal with respect to a given inner product if
\[
\langle u \,|\, v\rangle_w = 0 . \tag{6.147}
\]
As before let $\widetilde{\mathcal{L}}$ be a general second-order linear differential operator that is self-adjoint with respect to an inner product with weight $w$. Suppose that $y_1$ and $y_2$ are eigenvectors of $\widetilde{\mathcal{L}}$, with distinct eigenvalues $\lambda_1$ and $\lambda_2$ respectively. Then
\[
\widetilde{\mathcal{L}} y_1 = \lambda_1 y_1 , \tag{6.148a}
\]
\[
\widetilde{\mathcal{L}} y_2 = \lambda_2 y_2 . \tag{6.148b}
\]
From taking the complex conjugate of (6.148a) we also have that
\[
\widetilde{\mathcal{L}} y_1^* = \lambda_1 y_1^* , \tag{6.148c}
\]
since $\widetilde{\mathcal{L}}$ and $\lambda_1$ are real. Hence
\[
\begin{aligned}
\int_a^b \bigl( y_1^*\,\widetilde{\mathcal{L}} y_2 - y_2\,\widetilde{\mathcal{L}} y_1^* \bigr)\, w\, dx
&= \int_a^b \bigl( \lambda_2\, y_1^* y_2 - \lambda_1\, y_2\, y_1^* \bigr)\, w\, dx && \text{from (6.148b) and (6.148c)} \\
&= (\lambda_2 - \lambda_1)\int_a^b y_1^*\, y_2\, w\, dx \\
&= (\lambda_2 - \lambda_1)\,\langle y_1 \,|\, y_2\rangle_w . && \text{(6.149)}
\end{aligned}
\]
But $\widetilde{\mathcal{L}}$ is self-adjoint, and hence the left-hand side of (6.149) is zero (e.g. (6.131b) with $u=y_1$ and $v=y_2$). It follows that
\[
(\lambda_2 - \lambda_1)\,\langle y_1 \,|\, y_2\rangle_w = 0 . \tag{6.150}
\]
Hence if $\lambda_1\neq\lambda_2$ then the eigenfunctions are orthogonal, since
\[
\langle y_1 \,|\, y_2\rangle_w = 0 . \tag{6.151}
\]
Remark. As before the same result can be obtained using inner product notation, since (6.150) follows from
\[
\begin{aligned}
\lambda_2\,\langle y_1 \,|\, y_2\rangle_w
&= \langle y_1 \,|\, \lambda_2 y_2\rangle_w && \text{from (6.121b)} \\
&= \bigl\langle y_1 \,\big|\, \widetilde{\mathcal{L}} y_2 \bigr\rangle_w && \text{from (6.148b)} \\
&= \bigl\langle \widetilde{\mathcal{L}} y_1 \,\big|\, y_2 \bigr\rangle_w && \text{since } \widetilde{\mathcal{L}} \text{ is self-adjoint} \\
&= \langle \lambda_1 y_1 \,|\, y_2\rangle_w && \text{from (6.148a)} \\
&= \lambda_1^*\,\langle y_1 \,|\, y_2\rangle_w && \text{from (6.121a) and (6.121b)} \\
&= \lambda_1\,\langle y_1 \,|\, y_2\rangle_w && \text{from (6.145).} && \text{(6.152)}
\end{aligned}
\]
Example. Returning to our earlier example, we recall that we showed that the eigenfunctions of the Sturm-Liouville operator
\[
\mathcal{L} = -\frac{d^2}{dx^2} , \tag{6.153}
\]
acting on functions that vanish at the end-points $x=0$ and $x=\pi$, are
\[
y_n = \left(\frac{2}{\pi}\right)^{\!1/2} \sin nx . \tag{6.154}
\]
Since
\[
\int_0^\pi y_n\, y_m\, dx
= \frac{2}{\pi}\int_0^\pi \sin nx\,\sin mx\, dx
= \frac{1}{\pi}\int_0^\pi \bigl( \cos(n-m)x - \cos(n+m)x \bigr)\, dx
= 0 \quad\text{if } n\neq m , \tag{6.155}
\]
we confirm that the eigenfunctions are indeed orthogonal.

Orthonormal set. We have seen that eigenfunctions with different eigenvalues are mutually orthogonal. We claim, but do not prove, that mutually orthogonal eigenfunctions can be constructed even for repeated eigenvalues (cf. the experimental error argument of §4.7.2). Further, if we normalize all eigenfunctions to have unit norm then we have an orthonormal set of eigenfunctions, i.e.
\[
\int_a^b w\, y_n^*\, y_m\, dx = \langle y_n \,|\, y_m\rangle_w = \delta_{mn} . \tag{6.156}
\]
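The orthonormality (6.156) of the functions $(2/\pi)^{1/2}\sin nx$ is easily confirmed by quadrature (a small added check, not part of the original text):

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
y = lambda n: np.sqrt(2.0 / np.pi) * np.sin(n * x)

gram = np.array([[np.trapz(y(n) * y(m), x) for m in range(1, 6)]
                 for n in range(1, 6)])
print(np.round(gram, 6))                   # approximately the 5x5 identity matrix
```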
6.6.7 Eigenfunction Expansions

Let $y_n$, $n = 1, 2, \ldots$ be an orthonormal set of eigenfunctions of a self-adjoint operator. Then we claim that any function $f(x)$ with the same boundary conditions as the eigenfunctions can be expressed as an eigenfunction expansion
\[
f(x) = \sum_{n=1}^{\infty} a_n\, y_n(x) , \tag{6.157a}
\]
where the coefficients $a_n$ are given by
\[
a_n = \langle y_n \,|\, f\rangle_w , \tag{6.157b}
\]
i.e. we claim that the eigenfunctions form a basis. A set of eigenfunctions that has this property is said to be complete.

We will not prove the existence of the expansion (6.157a). However, if we assume such an expansion does exist, then we can confirm that the coefficients must be given by (6.157b), since
\[
\begin{aligned}
\langle y_n \,|\, f\rangle_w
&= \Bigl\langle y_n \,\Big|\, \sum_{m=1}^{\infty} a_m y_m \Bigr\rangle_w && \text{from (6.157a)} \\
&= \sum_{m=1}^{\infty} a_m\,\langle y_n \,|\, y_m\rangle_w && \text{from inner product property (4.27c)} \\
&= \sum_{m=1}^{\infty} a_m\,\delta_{nm} && \text{from (6.156)} \\
&= a_n && \text{from (0.11b), and as required.}
\end{aligned}
\]
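For instance (an added illustration; the test function is arbitrary), expanding $f(x)=x(\pi-x)$ in the orthonormal eigenfunctions $y_n=(2/\pi)^{1/2}\sin nx$ of the previous example, with $a_n=\langle y_n\,|\,f\rangle$ and $w=1$, the truncated sums converge rapidly to $f$:

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
f = x * (np.pi - x)                         # satisfies f(0) = f(pi) = 0

def y(n):
    return np.sqrt(2.0 / np.pi) * np.sin(n * x)

def partial_sum(N):
    a = [np.trapz(y(n) * f, x) for n in range(1, N + 1)]    # a_n = <y_n | f>, w = 1
    return sum(a_n * y(n) for n, a_n in zip(range(1, N + 1), a))

for N in (1, 3, 5, 15):
    print(N, np.max(np.abs(partial_sum(N) - f)))            # error decreases with N
```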
The completeness relation. It follows from (6.157a) and (6.157b) that
\[
\begin{aligned}
f(x) &= \sum_{n=1}^{\infty} \langle y_n \,|\, f\rangle_w\, y_n(x) \\
&= \sum_{n=1}^{\infty} y_n(x) \int_a^b w(\xi)\, y_n^*(\xi)\, f(\xi)\, d\xi && \text{from (6.120a)} \\
&= \int_a^b f(\xi) \left( w(\xi) \sum_{n=1}^{\infty} y_n(x)\, y_n^*(\xi) \right) d\xi && \text{interchange sum and integral.} && \text{(6.158)}
\end{aligned}
\]
This expression holds for all functions $f$ satisfying the appropriate homogeneous boundary conditions. Hence from (3.4)
\[
w(\xi) \sum_{n=1}^{\infty} y_n(x)\, y_n^*(\xi) = \delta(x-\xi) . \tag{6.159a}
\]
This is the completeness relation.
Remark. Suppose we exchange $x$ and $\xi$ in the complex conjugate of (6.159a); then, using the facts that the weight function is real and the delta function is real and symmetric (see (3.8a) and (3.8b)), we also have that
\[
w(x) \sum_{n=1}^{\infty} y_n(x)\, y_n^*(\xi) = \delta(x-\xi) . \tag{6.159b}
\]
Example: Fourier series. Again consider the Sturm-Liouville operator
\[
\mathcal{L} = -\frac{d^2}{dx^2} . \tag{6.160}
\]
In this case assume that it acts on functions that are $2\pi$-periodic. $\mathcal{L}$ is still self-adjoint with weight function $w=1$, since the periodicity ensures that the boundary conditions (6.123b) are satisfied if, say, $a=0$ and $b=2\pi$. This time we choose to write the general solution of the eigenvalue equation (6.134) as
\[
y = \alpha\exp\bigl( i\lambda^{1/2} x \bigr) + \beta\exp\bigl( -i\lambda^{1/2} x \bigr) . \tag{6.161}
\]
This solution is $2\pi$-periodic if $\lambda = n^2$ for integer $n$ (as before). Note that now we have two eigenfunctions for each eigenvalue (except for $n=0$). Label the eigenfunctions by $y_n$ for $n = \ldots, -1, 0, 1, \ldots$, with corresponding eigenvalues $\lambda_n = n^2$. Although there are repeated eigenvalues, an orthonormal set of eigenfunctions exists (as claimed), e.g.
\[
y_n = \frac{1}{\sqrt{2\pi}}\exp(inx) \quad\text{for } n\in\mathbb{Z} . \tag{6.162}
\]
Hence from (6.157a) a $2\pi$-periodic function $f$ has an eigenfunction expansion
\[
f(x) = \frac{1}{\sqrt{2\pi}} \sum_{n=-\infty}^{\infty} a_n\exp(inx) , \tag{6.163}
\]
where $a_n$ is given by (6.157b). This is just the Fourier series representation of $f$. In this case the completeness relation (6.159a) reads (cf. (3.6))
\[
\frac{1}{2\pi} \sum_{n=-\infty}^{\infty} \exp\bigl( in(x-\xi) \bigr) = \delta(x-\xi) . \tag{6.164}
\]
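In the same spirit, the sketch below (added here; the test function $\exp(\cos x)$ is an arbitrary smooth $2\pi$-periodic choice) computes the coefficients $a_n=\langle y_n\,|\,f\rangle$ for the complex exponentials (6.162) and reconstructs the function from the truncated series (6.163).

```python
import numpy as np

x = np.linspace(0.0, 2.0 * np.pi, 4001)
f = np.exp(np.cos(x))                       # a smooth 2*pi-periodic test function

def y(n):
    return np.exp(1j * n * x) / np.sqrt(2.0 * np.pi)

# a_n = <y_n | f> = int_0^{2 pi} y_n^*(x) f(x) dx   (w = 1)
ns = np.arange(-10, 11)
a = np.array([np.trapz(np.conj(y(n)) * f, x) for n in ns])

f_rec = sum(a_n * y(n) for n, a_n in zip(ns, a))
print(np.max(np.abs(f_rec - f)))            # tiny: the truncated series reconstructs f
```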
Example: Legendre polynomials. Legendre's equation (6.34),
\[
(1-x^2)\,y'' - 2x\,y' + \ell(\ell+1)\,y = 0 , \tag{6.165a}
\]
can be written as a Sturm-Liouville eigenvalue equation $\mathcal{L} y = \lambda y$ where
\[
\mathcal{L} = -\frac{d}{dx}\!\left( (1-x^2)\,\frac{d}{dx} \right)
\quad\text{and}\quad
\lambda = \ell(\ell+1) . \tag{6.165b}
\]
In terms of our standard notation
\[
p = 1-x^2 \quad\text{and}\quad q = 0 . \tag{6.165c}
\]
Suppose now we require that $\mathcal{L}$ operates on functions $y$ that remain finite at $x=-1$ and $x=1$. Then $p\,y' = 0$ at $x=-1$ and $x=1$, and hence the boundary conditions (6.123b) are satisfied if $a=-1$ and $b=1$. It follows that $\mathcal{L}$ is self-adjoint.

Further, we saw earlier that Legendre's equation has solutions that are finite at $x=\pm 1$ when $\ell = 0, 1, 2, \ldots$ (see (6.40) and following), and that the solutions are Legendre polynomials $P_\ell(x)$. Identify the eigenvalues and eigenfunctions as
\[
\lambda_\ell = \ell(\ell+1) \quad\text{and}\quad y_\ell(x) = P_\ell(x) .
\]
Remarks. With the conventional normalization $P_\ell(1)=1$ the $P_\ell$ are orthogonal but not orthonormal; in particular, for $\ell\neq m$,
\[
\int_{-1}^{1} P_\ell(x)\, P_m(x)\, dx = 0 . \tag{6.168}
\]
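This orthogonality can be checked with NumPy's Legendre utilities (an added verification; with the conventional normalization the diagonal entries of the Gram matrix are $2/(2\ell+1)$ rather than 1):

```python
import numpy as np
from numpy.polynomial import legendre

xg, wg = legendre.leggauss(64)              # Gauss-Legendre nodes and weights on [-1, 1]

def P(l, x):
    c = np.zeros(l + 1)
    c[l] = 1.0
    return legendre.legval(x, c)            # the Legendre polynomial P_l(x)

gram = np.array([[np.sum(wg * P(l, xg) * P(m, xg)) for m in range(5)]
                 for l in range(5)])
print(np.round(gram, 10))                   # diagonal: 2/(2l+1); off-diagonal: 0
```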
6.6.8 Eigenfunction Expansions of Green's Functions for Self-Adjoint Operators

Let $\lambda_n$ and $y_n$ be the eigenvalues and the complete orthonormal set of eigenfunctions of a self-adjoint operator $\widetilde{\mathcal{L}}$ acting on functions satisfying (6.123b). Provided none of the $\lambda_n$ vanish, we claim that the Green's function for $\widetilde{\mathcal{L}}$ can be written as
\[
G(x,\xi) = \sum_{n=1}^{\infty} \frac{w(\xi)\, y_n^*(\xi)\, y_n(x)}{\lambda_n} , \tag{6.169}
\]
where $G(x,\xi)$ satisfies the same boundary conditions as the $y_n(x)$. This result follows from the observation that
\[
\begin{aligned}
\widetilde{\mathcal{L}}\, G(x,\xi)
&= \sum_{n=1}^{\infty} \frac{w(\xi)\, y_n^*(\xi)}{\lambda_n}\, \widetilde{\mathcal{L}}\, y_n(x) && \text{from (6.169)} \\
&= \sum_{n=1}^{\infty} w(\xi)\, y_n^*(\xi)\, y_n(x) && \text{from (6.133)} \\
&= \delta(x-\xi) && \text{from (6.159a).} && \text{(6.170)}
\end{aligned}
\]
Remark. The form of the Green's function (6.169) shows that
\[
w(x)\, G(x,\xi) = w(\xi)\, G^*(\xi,x) . \tag{6.171}
\]
Resonance. If $\lambda_n = 0$ for some $n$ then $G(x,\xi)$ does not exist. This is consistent with our previous observation that $\widetilde{\mathcal{L}} y = f$ has no solution for general $f$ if $\widetilde{\mathcal{L}} y = 0$ has a solution satisfying appropriate boundary conditions; $y_n(x)$ is precisely such a solution if $\lambda_n = 0$. The vanishing of one of the eigenvalues is related to the phenomenon of resonance. If a solution to the problem (including the boundary conditions) exists in the absence of the forcing function $f$ (i.e. there is a zero eigenvalue of $\widetilde{\mathcal{L}}$) then any non-zero force elicits an infinite response.
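As a concrete case of (6.169) (an added illustration): for $\widetilde{\mathcal{L}}=-d^2/dx^2$ on $0\leqslant x\leqslant\pi$ with $y(0)=y(\pi)=0$ one has $w=1$, $\lambda_n=n^2$ and $y_n=(2/\pi)^{1/2}\sin nx$, so (6.169) gives $G(x,\xi)=\sum_n (2/\pi)\sin n\xi\,\sin nx / n^2$. This should agree with the closed-form Green's function of $-d^2/dx^2$ with these boundary conditions, namely $x(\pi-\xi)/\pi$ for $x\leqslant\xi$ and $\xi(\pi-x)/\pi$ for $x\geqslant\xi$, which can be checked directly against the continuity and jump conditions of §6.5.4.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 201)
xi = 1.0                                    # an arbitrary source point in (0, pi)

# Closed-form Green's function of -d^2/dx^2 with G(0, xi) = G(pi, xi) = 0
G_exact = np.where(x <= xi, x * (np.pi - xi) / np.pi, xi * (np.pi - x) / np.pi)

# Truncated eigenfunction expansion (6.169): w = 1, lambda_n = n^2, y_n = sqrt(2/pi) sin nx
n = np.arange(1, 2001)
G_series = (2.0 / np.pi) * np.sin(np.outer(x, n)) @ (np.sin(n * xi) / n**2)

print(np.max(np.abs(G_series - G_exact)))   # small (~1e-4): the series converges to the closed form
```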
6.6.9 Approximation via Eigenfunction Expansions

It is often useful, e.g. in a numerical method, to approximate a function with Sturm-Liouville boundary conditions by a finite linear combination of Sturm-Liouville eigenfunctions, i.e.
\[
f(x) \approx \sum_{n=1}^{N} a_n\, y_n(x) . \tag{6.172}
\]
Define the error of the approximation to be
\[
\varepsilon_N(a_1, a_2, \ldots, a_N) = \Bigl\| f(x) - \sum_{n=1}^{N} a_n\, y_n(x) \Bigr\|^2 . \tag{6.173}
\]
One definition of the best approximation is that the error (6.173) should be minimized with respect to the coefficients $a_1, a_2, \ldots, a_N$. By expanding (6.173) we have that, assuming that the $y_n$ are an orthonormal set,
\[
\begin{aligned}
\varepsilon_N
&= \Bigl\langle f(x) - \sum_{n=1}^{N} a_n y_n(x) \,\Big|\, f(x) - \sum_{m=1}^{N} a_m y_m(x) \Bigr\rangle \\
&= \langle f \,|\, f\rangle
 - \sum_{n=1}^{N} a_n^*\,\langle y_n \,|\, f\rangle
 - \sum_{m=1}^{N} a_m\,\langle f \,|\, y_m\rangle
 + \sum_{n=1}^{N}\sum_{m=1}^{N} a_n^* a_m\,\langle y_n \,|\, y_m\rangle \\
&= \|f\|^2 - \sum_{n=1}^{N} \bigl( a_n\,\langle y_n \,|\, f\rangle^* + a_n^*\,\langle y_n \,|\, f\rangle \bigr) + \sum_{n=1}^{N} a_n a_n^* . && \text{(6.174)}
\end{aligned}
\]
Hence if we perturb the $a_n$ to $a_n + \delta a_n$ we have that
\[
\delta\varepsilon_N = -\sum_{n=1}^{N} \Bigl( \delta a_n \bigl( \langle y_n \,|\, f\rangle - a_n \bigr)^* + \delta a_n^* \bigl( \langle y_n \,|\, f\rangle - a_n \bigr) \Bigr) . \tag{6.175}
\]
By setting $\delta\varepsilon_N = 0$ we see that $\varepsilon_N$ is minimized when
\[
a_n = \langle y_n \,|\, f\rangle , \quad\text{or equivalently}\quad a_n^* = \langle y_n \,|\, f\rangle^* . \tag{6.176}
\]
We note that this is an identical value for $a_n$ to (6.157b). The value of $\varepsilon_N$ is then, from (6.174),
\[
\varepsilon_N = \|f\|^2 - \sum_{n=1}^{N} |a_n|^2 . \tag{6.177}
\]
Since $\varepsilon_N \geqslant 0$ from (6.173), we arrive at Bessel's inequality
\[
\|f\|^2 \geqslant \sum_{n=1}^{N} |a_n|^2 . \tag{6.178}
\]
It is possible to show, but not here, that this inequality becomes an equality when $N\to\infty$, and hence
\[
\|f\|^2 = \sum_{n=1}^{\infty} |a_n|^2 , \tag{6.179}
\]
which is a generalization of Parseval's theorem.

Remark. While it is not strictly true that any function satisfying the Sturm-Liouville boundary conditions can be expressed as an eigenfunction expansion (6.157a) (since there are restrictions such as continuity), it is true that
\[
\lim_{N\to\infty} \Bigl\| f(x) - \sum_{n=1}^{N} \langle y_n \,|\, f\rangle\, y_n(x) \Bigr\|^2 = 0 . \tag{6.180}
\]
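Bessel's inequality (6.178) and the limit (6.180) can be seen numerically (a final added illustration, again with the arbitrary test function $f(x)=x(\pi-x)$ and the orthonormal set $(2/\pi)^{1/2}\sin nx$): the quantity $\varepsilon_N=\|f\|^2-\sum_{n=1}^{N}|a_n|^2$ is non-negative and decreases towards zero as $N$ grows.

```python
import numpy as np

x = np.linspace(0.0, np.pi, 4001)
f = x * (np.pi - x)

norm_f_sq = np.trapz(f * f, x)              # ||f||^2 with w = 1

def a(n):                                   # a_n = <y_n | f>
    return np.trapz(np.sqrt(2.0 / np.pi) * np.sin(n * x) * f, x)

coeffs_sq = np.array([a(n)**2 for n in range(1, 21)])
for N in (1, 2, 5, 10, 20):
    eps_N = norm_f_sq - np.sum(coeffs_sq[:N])
    print(N, eps_N)                         # >= 0 (Bessel's inequality) and decreasing with N
```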