Mathematical Methods I

Contents
0 Introduction i
0.1 Schedule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . i
0.2 Books . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
0.3 Course Website . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . ii
0.4 Lectures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
This is a specific individual’s copy of the notes. It is not to be copied and/or redistributed.
0.5 Lecture Notes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iii
0.6 Example Sheets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.7 Examples Classes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.8 Computational Exercises . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.9 Election of Student Representatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.10 Feedback . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . iv
0.11 Acknowledgements . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
0.12 Assumed Knowledge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . v
1 Vector Calculus 1
1.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1 Vectors and Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.1 Three-dimensional Euclidean space, points and vectors . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Bases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.3 Cartesian Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Suffix Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.1 Dyadic and suffix equivalents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.2.2 Summation convention . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.2.3 Matrix expressions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.4 Kronecker delta . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5
1.2.5 More on basis vectors (Unlectured) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.6 The Levi-Civita symbol or alternating tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
1.2.7 The vector product in suffix notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.8 The product of two Levi-Civita symbols . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
1.2.9 A proof of Schwarz’s inequality (Unlectured) . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3 Vector Calculus in Cartesian Coordinates. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.1 The Gradient of a Scalar Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
1.3.2 Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.3 The Geometrical Significance of Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.3.4 Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.7 The Big Integral Theorems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.1 The Divergence Theorem (Gauss’ Theorem) . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.2 Stokes’ Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.7.3 Examples and Applications . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.7.4 Interpretation of divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.7.5 Interpretation of curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.8 Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.8.1 What Are Orthogonal Curvilinear Coordinates? . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.8.2 Relationships Between Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.8.3 Incremental Change in Position or Length. . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.8.4 The Jacobian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
1.8.5 Properties of Jacobians . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.8.6 Orthogonality . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.8.7 Spherical Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.8.8 Cylindrical Polar Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26
1.8.9 Volume and Surface Elements in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . 27
1.8.10 Gradient in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.8.11 Examples of Gradients . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.8.12 Divergence and Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.8.13 Laplacian in Orthogonal Curvilinear Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . 30
1.8.14 Further Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 31
1.8.15 Aide Memoire . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
2 Green’s Functions 33
2.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.0.1 Physical motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.1 The Dirac Delta Function (a.k.a. Alchemy) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.1.1 The Delta Function as the Limit of a Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . 33
2.1.2 Some Properties of the Delta Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.1.3 An Alternative (And Better) View . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
2.1.4 The Delta Function as the Limit of Other Sequences . . . . . . . . . . . . . . . . . . . . . . . 35
2.4.1 The Green’s Function for two-point homogeneous boundary-value problems . . . . . . . . . . 40
2.4.2 Two Properties of Green’s Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4.3 Construction of the Green’s Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
2.4.4 Examples of Green’s Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
2.4.5 The Green’s Function for homogeneous initial-value problems . . . . . . . . . . . . . . . . . . 43
2.4.6 Inhomogeneous boundary conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
3 Fourier Transforms 45
3.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1 The Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1.1 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1.2 Examples of Fourier Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
3.1.3 The Fourier Inversion Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
3.1.4 Properties of Fourier Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
3.1.5 The Relationship to Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 50
3.2 The Convolution Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2.1 Definition of convolution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
3.2.2 Interpretation and examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2.3 The convolution theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
3.2.4 Correlation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 53
3.3 Parseval’s theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
3.4 Power spectra . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
3.5 Solution of Ordinary Differential Equations using Fourier Transforms . . . . . . . . . . . . . . . . . . 57
4.5 Poisson’s Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5.1 A Particular Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5.2 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
4.5.3 Separable Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.6 The Diffusion Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6.1 Separable Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 69
4.6.2 Boundary and Initial Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.6.3 Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.6.4 A Rough and Ready Outline Recipe . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
4.7 Solution Using Fourier Transforms (Non-examinable & Unlectured) . . . . . . . . . . . . . . . . . . . 72
4.7.1 The diffusion equation as an exemplar . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72
5 Matrices 74
5.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1 Vector Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1.1 Some Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1.2 Definition . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
5.1.3 Span and linear independence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
5.1.4 Basis Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 76
5.2 Change of Basis: the Rôle of Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.1 Linear Operators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
5.2.2 Transformation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2.3 Properties of Transformation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.2.4 Transformation Law for Vector Components . . . . . . . . . . . . . . . . . . . . . . . . . . . . 79
5.3 Some Definitions of Special Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 81
5.4 Scalar Product (Inner Product) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4.1 Definition of a Scalar Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
5.4.2 Worked Example . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4.3 Some Inequalities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
5.4.4 The Scalar Product in Terms of Components . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.4.5 Properties of the Metric . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
5.7.5 Worked example (unlectured) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
5.7.6 Uses of diagonalization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.8 Forms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
5.8.1 Eigenvectors and Principal Axes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.8.2 Quadrics and conics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 94
5.8.3 The Stationary Properties of the Eigenvalues . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
6 Elementary Analysis 99
6.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1 Sequences and Limits . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1.1 Sequences . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.1.2 Sequences tending to a limit, or not. . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
6.2 Convergence of Infinite Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.2.1 Convergent and divergent series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 100
6.2.2 A necessary condition for convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
6.2.3 Absolute and conditional convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3 Tests of Convergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3.1 The comparison test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
6.3.2 D’Alembert’s ratio test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
6.3.3 Cauchy’s test . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4 Functions of a Continuous Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4.1 Limits and continuity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
6.4.2 The O notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.5 Taylor’s Theorem for Functions of a Real Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
6.6 Analytic Functions of a Complex Variable . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6.1 Complex differentiability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6.2 The Cauchy–Riemann equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
6.6.3 Analytic functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 107
6.6.4 Consequences of the Cauchy–Riemann equations . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.6.5 Taylor series for analytic functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
6.7 Zeros, Poles and Essential Singularities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 109
7.0 Why Study This? . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.1 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.1.1 First-order linear ordinary differential equations . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.1.2 Second-order ordinary differential equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
7.1.3 Second-order linear ordinary differential equations . . . . . . . . . . . . . . . . . . . . . . . . 114
7.2 Homogeneous Second-Order Linear ODEs . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2.1 Linearly independent solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2.2 The Wronskian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2.3 The Calculation of W . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 115
7.2.4 A Second Solution via the Wronskian . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
7.3 Taylor Series Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.3.1 Ordinary and singular points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 117
7.3.2 The solution at ordinary points in terms of a power series . . . . . . . . . . . . . . . . . . . . 117
7.3.3 Example (possibly unlectured) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
7.3.4 Example: Legendre’s Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 120
7.4 Regular Singular Points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.4.1 The Indicial Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 121
7.4.2 Series Solutions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.4.3 Example: Bessel’s Equation of Order ν . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
7.4.4 The Second Solution when σ1 − σ2 ∈ Z . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 123
7.4.5 Irregular singular points . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 125
7.5 The Method of Variation of Parameters (Unlectured and Not in Schedule) . . . . . . . . . . . . . . . 125
0.1 Schedule
The schedules, or syllabuses, are determined by a committee which has input from all the Physical Science
subjects in the Natural Sciences and from Computer Science and is agreed by the Faculty of Mathematics.
The schedules are minimal for lecturing and maximal for examining; that is to say, all the material in the
schedules will be lectured and only this material will be examined.
Below is a copy from the booklet of schedules.¹ The numbers in square brackets at the end of paragraphs indicate roughly (I emphasise roughly) the number of lectures that will be devoted to the material in the paragraph.
Please note that the committee responsible for the schedules has recently asked me to lecture the section
on Partial differential equations after the section on the Fourier transform (instead of before the section on
Green’s functions).
Vector calculus
Suffix notation. Einstein summation convention. Contractions using δij and εijk. Reminder of vector products, grad, div, curl, ∇², and their representations using suffix notation. Divergence theorem and Stokes’ theorem. Vector differential operators in orthogonal curvilinear coordinates, e.g. cylindrical and spherical polar coordinates. Jacobians. [6]
Green’s functions
Response to impulses, delta function (treated heuristically), Green’s functions for initial and
boundary value problems. [3]
Fourier transform
Fourier transforms; relation to Fourier series, simple properties and examples, convolution theorem, correlation functions, Parseval’s theorem and power spectra. [2]
Matrices
N-dimensional vector spaces, matrices, scalar product, transformation of basis vectors. Eigenvalues and eigenvectors of a matrix; degenerate case, stationary property of eigenvalues. Orthogonal and unitary transformations. Quadratic and Hermitian forms, quadric surfaces. [5]
¹ See https://www.maths.cam.ac.uk/undergradnst/files/misc/NSTschedules.pdf and also https://www.maths.cam.ac.uk/undergradnst/currentstudents.
² However, if you took course A rather than B, then you might like to recall the following extract from the schedules: The material from course A is assumed. Students are nevertheless advised that if they have taken course A in Part IA, they should consult their Director of Studies about suitable reading during the Long Vacation before embarking upon Part IB Mathematics.
³ Time is always short.
0.2 Books
There are very many books which cover the sort of mathematics required by Natural Scientists.
The following should be helpful as general reference; further advice will be given by Lecturers.
Books which can reasonably be used as principal texts for the course are marked with a dagger.
The prices given are intended as a guide only, and are subject to change.
† G Arfken & H Weber Mathematical Methods for Physicists, 6th edition. Elsevier, 2005 (£44.09).
† J W Dettman Mathematical Methods in Physics and Engineering. Dover, 1988 (£23.99 paperback).
H F Jones Groups, Representations and Physics, 2nd edition. Institute of Physics Publishing, 1998 (£45.99 paperback).
E Kreyszig Advanced Engineering Mathematics, 8th edition. Wiley, 1999 (10th edition available, £46.59 hardback).
† J Mathews & R L Walker Mathematical Methods of Physics, 2nd edition. Pearson/Benjamin Cummings, 1970 (from £42.00 used).
† K F Riley, M P Hobson & S J Bence Mathematical Methods for Physics and Engineering, 3rd edition. Cambridge University Press, 2002 (£39.99 paperback).
R N Snieder A guided tour of mathematical methods for the physical sciences, 2nd edition. Cambridge University Press, 2004 (£34.19 paperback).
There is likely to be a resemblance between my notes and Riley, Hobson & Bence. This is because we both used the same source, i.e. previous Cambridge lecture notes.⁴
Of the other books, I like Mathews & Walker, but it might be a little mathematical for some. Also, the first
time I gave a ‘service’ mathematics course (over 35 years ago to aeronautics students at Imperial), my notes
bore a resemblance to Kreyszig . . . and that was not because we were using a common source!
⁴ When I lectured this course two decades ago, a student hoped that Riley et al. were getting royalties from my lecture notes; my hope is that my lecturers from 45 years ago are getting royalties from Riley et al.!
• I will aim to finish by 11:55, but am not going to stop dead in the middle of a long proof/explanation.
• I welcome constructive heckling. Hence, if I am inaudible, illegible, unclear (e.g. you spot a typo or
I use jargon you do not understand), or just plain wrong, then please speak up. I will endeavour to
stay around for a few minutes at the front after lectures in order to answer questions. Questions and
comments, particularly longer ones, can also be emailed to me at [email protected].
• I want you to learn. I will do my best to be clear but you must read through and understand your
notes before the next lecture . . . otherwise you will get hopelessly lost. An understanding of your notes
will not diffuse into you just because you have carried your notes around for a week, or put them under
your pillow, or watched them (possibly more than once) online (especially if done at double speed).
• I aim to avoid the words trivial, easy, obvious and yes. Let me know if I fail. I will occasionally use straightforward or similarly to last time; if it is not, email me at [email protected], or catch me at the end of the next lecture.
• Sometimes I may confuse both you and myself, and may not be able to extract myself in the middle of
a lecture. Under such circumstances I will have to plough on as a result of time constraints; however
I will clear up any problems at the beginning of the next lecture.
• This is a ‘service’ course, so you will not get pure mathematical levels of rigour. However, I will give
some justification for a method, rather than just a recipe, because if you are to use a method efficiently
and effectively, or extend it as might be necessary in research, you need to understand why a method
works, as well as how to apply it.
• If anyone is colour blind please tell me which colours you cannot read.
rearrangement caused by moving the lecture material on Partial differential equations to later. Supervisors can have access to the answers almost immediately, as indicated on the Moodle site.
0.7 Examples Classes
There will be Examples Classes on Wednesday 2 November and Wednesday 23 November from 14:00 to 16:00 in the Cockcroft Lecture Theatre.
0.8 Computational Exercises
I have been asked to remind you that there is a Computational Projects element to the course, for which you need to register on the course Moodle by 23 October 2022.
0.9 Election of Student Representatives
The Faculty Board of Mathematics asked DAMTP to set up a Staff-Student Committee for Mathematics
in the Natural Sciences to provide an opportunity for discussion of matters relating to the courses. The
Committee has four staff and three student members, the latter being drawn from the A and B courses in
Part IA and from the Part IB course.
Hence, this Consultative Committee for NST Mathematics will need an elected undergraduate member
drawn from this course. I have been asked to conduct an election. It has been suggested that a week’s notice
be given and that nominations are asked for in writing, countersigned by the nominee as a guarantee of
willingness to serve. It has been proposed that the election can take place by a show of hands at the start
of a designated lecture.
Please could you hand me nominations in writing, countersigned by the nominee, by the end of the lecture
on Monday 17 October?
If you would prefer that the election take place by other than a show of hands, please could you email an
alternative suggestion to [email protected].
0.10 Feedback
Comments and administrative/organisational queries on the course, lectures and the examples sheets can
be made via the email address [email protected].
Comments received will be edited and passed on anonymously to the relevant lecturer and others concerned.
They will also be considered at the next meeting of the Staff Student Consultative Committee. Queries will
either be answered directly or passed on to the relevant lecturer.
0.11 Acknowledgements
The Lecture Notes and Example Sheets were adapted from those of Paul Townsend, Stuart Dalziel, Mike
Proctor, Paul Metcalfe and Henrik Latter.
0.12 Assumed Knowledge
Familiarity with the following topics at the level of Course A of Part IA Mathematics for Natural Sciences
will be assumed.
• Eigenvalues and eigenvectors of matrices
• Taylor series and the geometric series
• Calculus of functions of several variables
• Line, surface and volume integrals
• The Gaussian integral
• First-order ordinary differential equations
• Second-order linear ODEs with constant coefficients
• Fourier series
• Permutations
More specifically, you should check that you recall the following.
The second fundamental theorem of calculus. The second fundamental theorem of calculus states that the integral of the derivative of f is f, e.g. if f is differentiable then
\[
\int_{x_1}^{x_2} \frac{\mathrm{d}f}{\mathrm{d}x}\,\mathrm{d}x = f(x_2) - f(x_1)\,. \qquad (0.2)\ \text{[Key Result]}
\]
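This Key Result is easy to sanity-check numerically. The sketch below (a supplementary illustration, not part of the notes; plain Python with a hand-rolled composite trapezoidal rule) verifies (0.2) for the illustrative choice f(x) = sin x:

```python
import math

# Verify (0.2) for f(x) = sin(x): integrate df/dx = cos(x) from x1 to x2
# with the composite trapezoidal rule and compare with f(x2) - f(x1).
def trapz(g, a, b, n=10_000):
    h = (b - a) / n
    return h * (0.5 * g(a) + 0.5 * g(b) + sum(g(a + k * h) for k in range(1, n)))

x1, x2 = 0.3, 2.1
lhs = trapz(math.cos, x1, x2)      # integral of the derivative
rhs = math.sin(x2) - math.sin(x1)  # f(x2) - f(x1)
print(abs(lhs - rhs))              # only trapezoidal discretisation error remains
```

The residual discrepancy is pure quadrature error, of order h².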
is called a Gaussian of width σ; in the context of probability theory, σ is the standard deviation. The area under this curve is unity, i.e.
\[
\int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi\sigma^2}} \exp\left(-\frac{x^2}{2\sigma^2}\right) \mathrm{d}x = 1\,. \qquad (0.4)\ \text{[Key Result]}
\]
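A quick computational check of (0.4), in the spirit of the course's computational exercises (the value of σ and the truncation at ±10σ are my illustrative choices; the neglected tails are far below the quadrature error):

```python
import math

# Verify (0.4): the area under a Gaussian of width sigma is 1. The infinite
# range is truncated at +/- 10 sigma, where the tails are negligible.
def gaussian(x, sigma):
    return math.exp(-x ** 2 / (2 * sigma ** 2)) / math.sqrt(2 * math.pi * sigma ** 2)

def trapz(g, a, b, n=20_000):
    h = (b - a) / n
    return h * (0.5 * g(a) + 0.5 * g(b) + sum(g(a + k * h) for k in range(1, n)))

sigma = 1.7
area = trapz(lambda x: gaussian(x, sigma), -10 * sigma, 10 * sigma)
print(area)  # very close to 1, independently of sigma
```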
Taylor’s theorem for functions of more than one variable. Let f(x, y) be a function of two variables; then
\[
f(x + \delta x, y + \delta y) = f(x, y) + \delta x\,\frac{\partial f}{\partial x} + \delta y\,\frac{\partial f}{\partial y}
+ \frac{1}{2!}\left( (\delta x)^2 \frac{\partial^2 f}{\partial x^2} + 2\,\delta x\,\delta y\,\frac{\partial^2 f}{\partial x\,\partial y} + (\delta y)^2 \frac{\partial^2 f}{\partial y^2} \right) + \cdots\,. \qquad (0.7)
\]
Exercise. Let g(x, y, z) be a function of three variables. Expand g(x + δx, y + δy, z + δz) correct to
O(δx, δy, δz).
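As a concrete illustration of (0.7) (the function f(x, y) = eˣ sin y and the numerical values are my choices, not from the notes), one can compare the second-order expansion with the exact value; the remainder should be third order in the increments:

```python
import math

# Compare f(x + dx, y + dy) for f(x, y) = exp(x) * sin(y) with the
# second-order Taylor expansion (0.7); the remainder is third order.
x, y = 0.4, 1.1
dx, dy = 1e-3, -2e-3

f = lambda u, v: math.exp(u) * math.sin(v)
f0 = f(x, y)
fx = f0                            # d/dx of exp(x) sin(y) = exp(x) sin(y)
fy = math.exp(x) * math.cos(y)
fxx = f0
fxy = math.exp(x) * math.cos(y)
fyy = -f0

taylor2 = (f0 + dx * fx + dy * fy
           + 0.5 * (dx ** 2 * fxx + 2 * dx * dy * fxy + dy ** 2 * fyy))
err = abs(f(x + dx, y + dy) - taylor2)
print(err)  # third-order small for these increments
```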
and hence
\[
\frac{\partial q_i}{\partial q_j} = \delta_{ij}\,, \qquad (0.8b)\ \text{[Key Result]}
\]
where δij is the Kronecker delta:
\[
\delta_{ij} =
\begin{cases}
1 & \text{if } i = j\,,\\
0 & \text{if } i \neq j\,.
\end{cases} \qquad (0.9)
\]
The chain rule. Let h(x, y) be a function of two variables, and suppose that x and y are themselves functions of a variable s; then
\[
\frac{\mathrm{d}h}{\mathrm{d}s} = \frac{\partial h}{\partial x}\frac{\mathrm{d}x}{\mathrm{d}s} + \frac{\partial h}{\partial y}\frac{\mathrm{d}y}{\mathrm{d}s}\,. \qquad (0.10a)
\]
Suppose instead that h depends on n variables xi (i = 1, . . . , n), so that h = h(x1, x2, . . . , xn). If the xi depend on m variables sj (j = 1, . . . , m), then for j = 1, . . . , m
\[
\frac{\partial h}{\partial s_j} = \sum_{i=1}^{n} \frac{\partial h}{\partial x_i}\,\frac{\partial x_i}{\partial s_j}\,. \qquad (0.10b)\ \text{[Key Result]}
\]
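A numerical sanity check of (0.10a) (an illustration of mine, not from the schedules), using h(x, y) = x²y with x = cos s and y = sin s, and comparing the formula against a centred finite difference:

```python
import math

# Check the chain rule (0.10a) for h(x, y) = x**2 * y with x = cos(s),
# y = sin(s): compare the formula with a centred finite difference.
def h_of_s(s):
    return math.cos(s) ** 2 * math.sin(s)

s = 0.8
x, y = math.cos(s), math.sin(s)
# dh/ds = (dh/dx)(dx/ds) + (dh/dy)(dy/ds)
formula = (2 * x * y) * (-math.sin(s)) + (x ** 2) * math.cos(s)

eps = 1e-6
numeric = (h_of_s(s + eps) - h_of_s(s - eps)) / (2 * eps)
print(abs(formula - numeric))  # agreement to finite-difference accuracy
```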
Vector identities. Suppose that vectors a, b and c have components (a1, a2, a3), (b1, b2, b3) and (c1, c2, c3) respectively.

Scalar or dot product. The scalar product of a and b is given by
\[
\mathbf{a} \cdot \mathbf{b} = \sum_{i=1}^{3} a_i b_i = a_1 b_1 + a_2 b_2 + a_3 b_3\,. \qquad (0.11a)
\]
\[
\int_{C} \mathrm{d}\mathbf{r} = -\int_{-C} \mathrm{d}\mathbf{r}\,. \qquad (0.6)\ \text{[Key Result]}
\]
where
\[
a_n = \frac{2}{L} \int_{x_0}^{x_0+L} f(x) \cos\frac{2\pi n x}{L}\,\mathrm{d}x\,, \qquad (0.8b)
\]
\[
b_n = \frac{2}{L} \int_{x_0}^{x_0+L} f(x) \sin\frac{2\pi n x}{L}\,\mathrm{d}x\,, \qquad (0.8c)
\]
\[
\int_{0}^{L} \sin\frac{2\pi n x}{L}\,\sin\frac{2\pi m x}{L}\,\mathrm{d}x = \frac{L}{2}\,\delta_{nm}\,, \qquad (0.9a)
\]
\[
\int_{0}^{L} \cos\frac{2\pi n x}{L}\,\cos\frac{2\pi m x}{L}\,\mathrm{d}x = \frac{L}{2}\,\delta_{nm}\,, \qquad (0.9b)
\]
\[
\int_{0}^{L} \sin\frac{2\pi n x}{L}\,\cos\frac{2\pi m x}{L}\,\mathrm{d}x = 0\,. \qquad (0.9c)
\]
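These orthogonality relations can be confirmed by direct quadrature; the sketch below (the values L = 2 and the mode numbers 2, 3 are my illustrative choices) checks one instance of each of (0.9a)-(0.9c):

```python
import math

# Check one instance of each orthogonality relation (0.9a)-(0.9c) over a
# full period [0, L] by composite trapezoidal quadrature.
def trapz(g, a, b, n=20_000):
    h = (b - a) / n
    return h * (0.5 * g(a) + 0.5 * g(b) + sum(g(a + k * h) for k in range(1, n)))

L = 2.0
s = lambda n: (lambda x: math.sin(2 * math.pi * n * x / L))
c = lambda n: (lambda x: math.cos(2 * math.pi * n * x / L))

sin2_sin3 = trapz(lambda x: s(2)(x) * s(3)(x), 0, L)  # expect 0 (n != m)
sin2_sin2 = trapz(lambda x: s(2)(x) * s(2)(x), 0, L)  # expect L/2
sin2_cos3 = trapz(lambda x: s(2)(x) * c(3)(x), 0, L)  # expect 0
print(sin2_sin3, sin2_sin2, sin2_cos3)
```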
Let g_e(x) be an even function, i.e. a function such that g_e(−x) = g_e(x), with period 2L. Then the Fourier series expansion of g_e(x) can be expressed as
\[
g_e(x) = \tfrac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos\frac{n\pi x}{L}\,, \qquad (0.10a)
\]
where
\[
a_n = \frac{2}{L} \int_{0}^{L} g_e(x) \cos\frac{n\pi x}{L}\,\mathrm{d}x\,. \qquad (0.10b)
\]
Let g_o(x) be an odd function, i.e. a function such that g_o(−x) = −g_o(x), with period 2L. Then the Fourier series expansion of g_o(x) can be expressed as
\[
g_o(x) = \sum_{n=1}^{\infty} b_n \sin\frac{n\pi x}{L}\,, \qquad (0.11a)
\]
where
\[
b_n = \frac{2}{L} \int_{0}^{L} g_o(x) \sin\frac{n\pi x}{L}\,\mathrm{d}x\,. \qquad (0.11b)
\]
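As an illustration (my example, not in the notes), take the odd square wave g_o(x) = 1 on (0, L): evaluating (0.11b) by hand gives b_n = 2(1 − (−1)ⁿ)/(nπ), and a short quadrature agrees:

```python
import math

# Fourier sine coefficients (0.11b) for the odd square wave g_o(x) = 1 on
# (0, L): integrating by hand gives b_n = 2 * (1 - (-1)**n) / (n * pi).
def trapz(g, a, b, n=20_000):
    h = (b - a) / n
    return h * (0.5 * g(a) + 0.5 * g(b) + sum(g(a + k * h) for k in range(1, n)))

L = 1.0

def b(n):
    return (2 / L) * trapz(lambda x: math.sin(n * math.pi * x / L), 0, L)

def exact(n):
    return 2 * (1 - (-1) ** n) / (n * math.pi)

err1 = abs(b(1) - exact(1))  # b_1 = 4/pi
err2 = abs(b(2) - exact(2))  # b_2 = 0 (even n vanishes)
print(err1, err2)
```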
Recall that if integrated over a half period, the ‘orthogonality’ conditions require care since
\[
\int_{0}^{L} \sin\frac{n\pi x}{L}\,\sin\frac{m\pi x}{L}\,\mathrm{d}x = \frac{L}{2}\,\delta_{nm}\,, \qquad (0.12a)
\]
\[
\int_{0}^{L} \cos\frac{n\pi x}{L}\,\cos\frac{m\pi x}{L}\,\mathrm{d}x = \frac{L}{2}\,\delta_{nm}\,, \qquad (0.12b)
\]
but
\[
\int_{0}^{L} \sin\frac{n\pi x}{L}\,\cos\frac{m\pi x}{L}\,\mathrm{d}x =
\begin{cases}
0 & \text{if } n+m \text{ is even,}\\[4pt]
\dfrac{2nL}{\pi(n^2 - m^2)} & \text{if } n+m \text{ is odd.}
\end{cases} \qquad (0.12c)
\]
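The half-period integral (0.12c) is easy to get wrong, so a direct numerical check is reassuring (the mode numbers n = 3, m = 2 and m = 1 are my illustrative choices):

```python
import math

# Check (0.12c): over a half period the sin-cos integral is 0 when n + m is
# even and 2 n L / (pi (n^2 - m^2)) when n + m is odd.
def trapz(g, a, b, n=40_000):
    h = (b - a) / n
    return h * (0.5 * g(a) + 0.5 * g(b) + sum(g(a + k * h) for k in range(1, n)))

L = 1.0

def sin_cos(n, m):
    return trapz(lambda x: math.sin(n * math.pi * x / L)
                 * math.cos(m * math.pi * x / L), 0, L)

odd_case = sin_cos(3, 2)                               # n + m = 5, odd
predicted = 2 * 3 * L / (math.pi * (3 ** 2 - 2 ** 2))  # 2nL / (pi (n^2 - m^2))
even_case = sin_cos(3, 1)                              # n + m = 4, even
print(abs(odd_case - predicted), abs(even_case))
```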
Permutations A permutation of degree n is a function that rearranges n distinct objects, such as the first
n strictly positive integers {1, 2, . . . , n}, amongst themselves.
An even (odd) permutation is one consisting of an even (odd) number of transpositions (interchanges
of two neighbouring objects).
(0.13a) and (0.13b) are, respectively, even and odd permutations of {1, 2, 3}.
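Equivalently, the parity of a permutation is the parity of its number of inversions (pairs of entries appearing out of their natural order), which gives a convenient computational test. A small sketch (not part of the notes):

```python
# Decide the parity of a permutation of {1, ..., n} by counting inversions:
# the permutation is even exactly when its inversion count is even, which
# matches the transposition-count definition of parity.
def parity(perm):
    inversions = sum(1
                     for i in range(len(perm))
                     for j in range(i + 1, len(perm))
                     if perm[i] > perm[j])
    return "even" if inversions % 2 == 0 else "odd"

print(parity([1, 2, 3]))  # identity: even
print(parity([2, 1, 3]))  # a single transposition: odd
print(parity([2, 3, 1]))  # a 3-cycle (two transpositions): even
```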
Additions/Subtractions?
1. Remove all the \enlargethispage commands.
2. 2D divergence theorem, Green’s theorem (e.g. as a special case of Stokes’ theorem).
3. Add Fourier transforms of cos x, sin x and periodic functions.
4. Check that the addendum at the end of § 3 has been incorporated into the main section.
5. Swap § 4.7.1 and § 3.5.
6. Swap § 5.2 and § 5.4.
7. Explain that observables in quantum mechanics are Hermitian operators.
8. Come up with a better explanation of why for a transformation matrix, say A, det A ̸= 0.
A field is a quantity that depends continuously on position (and possibly on time). Examples include:
• air pressure in this room (scalar field)
• electric field in this room (vector field)
Vector calculus is concerned with scalar and vector fields. The spatial variation of fields is described by
vector differential operators, which appear in the partial differential equations governing the fields.
Vector calculus is most easily done in Cartesian coordinates, but other systems (curvilinear coordinates) are
better suited for some problems because of symmetries or boundary conditions.
1.1.2 Bases
Points and vectors have a geometrical existence without reference to any coordinate system. However, it is
often very useful to describe them in term of a basis for the space. Three non-zero vectors e1 , e2 and e3 can
form a basis in 3D space if they do not all lie in a plane, i.e. they are linearly independent. Any vector can
be expressed uniquely in terms of scalar multiples of the basis vectors:
v = v1 e1 + v2 e2 + v3 e3 . (1.1)
The v_i (i = 1, 2, 3) are said to be the components of the vector v with respect to this basis.
Remark. The choice of basis is not unique. The components of a vector are different with respect to two
different bases.
e1 × e2 = e3 , (1.3)
[e1 , e2 , e3 ] = e1 × e2 · e3 = 1 . (1.4)
r = x e1 + y e2 + z e3 (1.6a)
= (x, y, z) , (1.6b)
Remarks.
e_1 = e_x = i = ı̂ = x̂ = x̂_1 , e_2 = e_y = j = ȷ̂ = ŷ = x̂_2 and e_3 = e_z = k = k̂ = ẑ = x̂_3 , (1.7)
for the unit vectors in the x, y and z directions respectively. Hence from (1.2c) and (1.5)
3. Two different bases, if both orthonormal and right-handed, are simply related by a rotation.
4. The Cartesian components of a vector are different with respect to two different Cartesian bases.
So far we have used dyadic notation for vectors. Suffix notation is an alternative means of expressing vectors
(and tensors). Once familiar with suffix notation, it is generally easier to manipulate vectors using suffix
notation.6
An alternative to the notation used for the vector (1.1), is to write
v = v1 e1 + v2 e2 + v3 e3 = (v1 , v2 , v3 ) (1.9a)
= {vi } for i = 1, 2, 3 . (1.9b)
Suffix notation. We will refer to v as {vi }, with the i = 1, 2, 3 understood; i is then termed a free suffix.
Remark. Sometimes we will denote the ith component of the vector v by (v)i , i.e. (v)i = vi .
Example: the position vector. The position vector r can be written as
r = (x, y, z) = (x1 , x2 , x3 ) = {xi } . (1.10)
Remark. The use of x, rather than r, for the position vector in dyadic notation possibly seems more under-
standable given the above expression for the position vector in suffix notation. Henceforth we will use
x and r interchangeably.
a1 = b1 , (1.11b)
a2 = b2 , (1.11c)
a3 = b3 . (1.11d)
In suffix notation we express this equality as
ai = bi for i = 1, 2, 3 . (1.11e)
This is a vector equation; when we omit the ’for i = 1, 2, 3’, it is understood that the one free suffix i ranges
through 1, 2, 3 (or 1, 2 in 2D) so as to give three component equations. Similarly
c = λa + µb ⇔ c_i = λa_i + µb_i ⇔ c_j = λa_j + µb_j ⇔ c_α = λa_α + µb_α ⇔ c_¥ = λa_¥ + µb_¥ ,
where it is assumed that i, j, α and ¥, respectively, range through (1, 2, 3).7
Remark. It does not matter what letter, or symbol, is chosen for the free suffix, but it must be the same in
each term.
Dummy suffices. In suffix notation the scalar product becomes
a · b = a_1 b_1 + a_2 b_2 + a_3 b_3 = Σ_{i=1}^3 a_i b_i = Σ_{k=1}^3 a_k b_k = Σ_{α=1}^3 a_α b_α , etc.,
where we note that the equivalent equations on the right-hand side have no free suffices since the dummy suffix (in each case i, k or α) has been summed out.
Further examples.
(i) As another example consider the equation (a · b)c = d. In suffix notation this becomes
(Σ_{k=1}^3 a_k b_k) c_i = Σ_{k=1}^3 a_k b_k c_i = d_i , (1.12)
where k is the dummy suffix, and i is the free suffix that is assumed to range through (1, 2, 3). It is essential that we use different symbols for the dummy and free suffices!
(ii) In suffix notation the expression (a · b)(c · d) becomes
(a · b)(c · d) = (Σ_{i=1}^3 a_i b_i)(Σ_{j=1}^3 c_j d_j) = Σ_{i=1}^3 Σ_{j=1}^3 a_i b_i c_j d_j ,
where, especially after the rearrangement, it is essential that the dummy suffices are different.
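Suffix-notation expressions map directly onto numpy's einsum, where a repeated label is a dummy suffix (summed) and an unrepeated label is a free suffix. A small sketch, not part of the original notes, assuming numpy is available:

```python
import numpy as np

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])
c = np.array([7., 8., 9.])
d = np.array([1., 0., -1.])

# a·b = a_i b_i : the repeated suffix i is summed out, leaving no free suffix
assert np.isclose(np.einsum('i,i->', a, b), a @ b)

# (a·b) c_i : k is the dummy suffix, i remains free, cf. (1.12)
lhs = np.einsum('k,k,i->i', a, b, c)
assert np.allclose(lhs, (a @ b) * c)

# (a·b)(c·d) : two independent dummy suffices i and j
assert np.isclose(np.einsum('i,i,j,j->', a, b, c, d), (a @ b) * (c @ d))
```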
In the case of free suffices we are assuming that they range through (1, 2, 3) without the need to explicitly say so. Under Einstein’s summation convention the explicit sum, Σ, can be omitted for dummy suffices.8
In particular
• if a suffix appears more than twice in one term of an equation, something has gone wrong (unless
there is an explicit sum).
Remark. This notation is powerful because it is highly abbreviated (and so aids calculation, especially in
examinations), but the above rules must be followed, and remember to check your answers (e.g. the
free suffices should be identical on each side of an equation).
Examples. Under suffix notation and the summation convention
8 Learning to omit the explicit sum is a bit like learning to change gear when starting to drive. At first you have to remind yourself that the sum is there, in the same way that you have to think consciously where to move the gear lever. With practice you will learn to note the existence of the sum unconsciously, in the same way that an experienced driver changes gear unconsciously; however you will crash a few gears on the way!
Under suffix notation the following equation is problematical (and probably best avoided unless you will always remember to double count the i on the right-hand side):
n_i n_i = n_i² , because i occurs twice on the left-hand side and only once on the right-hand side.
Remark. If the summation convention is not being used, this should be noted explicitly.
Transpose of a matrix (A): (Aᵀ)_ij = A_ji .
Determinant of a (3 × 3) matrix (A), where (if you have not met it before) εijk is defined in (1.18) below:
det A = εijk A1i A2j A3k .
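The suffix-notation formula det A = ε_ijk A_1i A_2j A_3k can be spelled out as a sum over permutations weighted by their parity. A sketch (not from the notes; assumes numpy) comparing it against a library determinant:

```python
import numpy as np
from itertools import permutations

def parity(p):
    # Sign of a permutation given as a tuple of 0-based indices:
    # count inversions; even count -> +1, odd -> -1.
    inv = sum(1 for a in range(len(p)) for b in range(a + 1, len(p)) if p[a] > p[b])
    return -1 if inv % 2 else 1

def det_via_epsilon(A):
    # det A = ε_ijk A_0i A_1j A_2k (0-based suffices); only the six
    # permutations of (0,1,2) give a non-zero ε.
    return sum(parity(p) * A[0, p[0]] * A[1, p[1]] * A[2, p[2]]
               for p in permutations(range(3)))

A = np.array([[2., 1., 0.], [1., 3., 1.], [0., 1., 4.]])
assert np.isclose(det_via_epsilon(A), np.linalg.det(A))
```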
Properties.
(iv)
δ_ii = Σ_{i=1}^3 δ_ii = δ_11 + δ_22 + δ_33 = 3 . (1.14e)
(v)
ap δpq bq = ap bp = aq bq = a · b . (1.14f)
(vi) From (1.2c)
ei · ej = δij . (1.14g)
Contraction. Contraction is an operation by which we set one free index equal to another, so that it is
summed over. For example, the contraction of aij is aii . Contraction is equivalent to multiplication by
a Kronecker delta:
aij δij = a11 + a22 + a33 = aii . (1.15)
(e^{(i)})_j = δ_ij , (1.16d)
and equivalently
Revision. A permutation of degree n is a function that rearranges n distinct objects (taken in our case to
be the first n strictly positive integers {1, 2, . . . , n}) amongst themselves.
If n = 3 there are 6 permutations (including the identity permutation) that re-arrange {1, 2, 3} to
{1, 2, 3}, {2, 3, 1}, {3, 1, 2}, (1.17a)
{1, 3, 2}, {2, 1, 3}, {3, 2, 1}. (1.17b)
An even (odd) permutation is one consisting of an even (odd) number of transpositions (interchanges
of two neighbouring objects). Hence, (1.17a) and (1.17b) are, respectively, even and odd permutations
of {1, 2, 3}.
Definition 1.1. We define the Levi-Civita permutation symbol, ε_ijk (i, j, k = 1, 2, 3), to be the set of 27 quantities such that

ε_ijk = { +1 if ijk is an even permutation of 123;
     −1 if ijk is an odd permutation of 123;
      0 otherwise. } (1.18)
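The definition (1.18) can be coded directly from the parity of the permutation; counting inversions gives the same even/odd classification as counting transpositions. A sketch (not from the notes) using 1-based indices:

```python
def eps(i, j, k):
    """Levi-Civita symbol ε_ijk via explicit parity counting (indices 1..3)."""
    if len({i, j, k}) < 3:
        return 0                       # repeated index -> 0
    seq = [i, j, k]
    # number of inversions: even -> even permutation, odd -> odd permutation
    inversions = sum(1 for a in range(3) for b in range(a + 1, 3) if seq[a] > seq[b])
    return 1 if inversions % 2 == 0 else -1

# the even permutations (1.17a) and odd permutations (1.17b)
assert eps(1, 2, 3) == eps(2, 3, 1) == eps(3, 1, 2) == 1
assert eps(1, 3, 2) == eps(2, 1, 3) == eps(3, 2, 1) == -1
assert eps(1, 1, 2) == 0
```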
Worked exercise.
For a symmetric tensor sij , i, j = 1, 2, 3, such that sij = sji evaluate εijk sij .
Solution. By relabelling the dummy suffices we have from (1.19c) and the symmetry of s_ij that

Σ_{i=1}^3 Σ_{j=1}^3 ε_ijk s_ij = Σ_{a=1}^3 Σ_{b=1}^3 ε_abk s_ab = Σ_{j=1}^3 Σ_{i=1}^3 ε_jik s_ji = −Σ_{i=1}^3 Σ_{j=1}^3 ε_ijk s_ij , (1.20a)
or equivalently by using the summation convention
εijk sij = εabk sab = εjik sji = −εijk sij , (1.20b)
where we have successively relabelled i → a → j and j → b → i. Hence we conclude that
εijk sij = 0 . (1.20c)
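The conclusion (1.20c) can be spot-checked numerically for a random symmetric tensor; a sketch (not from the notes; assumes numpy):

```python
import numpy as np

# Build the 3x3x3 Levi-Civita array from (1.18)
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

rng = np.random.default_rng(0)
m = rng.normal(size=(3, 3))
s = m + m.T                      # symmetric: s_ij = s_ji

# ε_ijk s_ij, a vector in the free suffix k — vanishes identically, cf. (1.20c)
assert np.allclose(np.einsum('ijk,ij->k', eps, s), 0.0)
```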
We claim that
(a × b)_i = Σ_{j=1}^3 Σ_{k=1}^3 ε_ijk a_j b_k = ε_ijk a_j b_k , (1.21)
where we note that there is one free suffix and two dummy suffices.
Check.
(a × b)_1 = Σ_{j=1}^3 Σ_{k=1}^3 ε_1jk a_j b_k = ε_123 a_2 b_3 + ε_132 a_3 b_2 = a_2 b_3 − a_3 b_2 ,
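The claim (1.21) can be verified against a library cross product for all three components at once; a sketch (not from the notes; assumes numpy):

```python
import numpy as np

# 3x3x3 Levi-Civita array from (1.18)
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1

a = np.array([1., 2., 3.])
b = np.array([4., 5., 6.])

# (a × b)_i = ε_ijk a_j b_k : j, k are dummy suffices, i is free
assert np.allclose(np.einsum('ijk,j,k->i', eps, a, b), np.cross(a, b))
```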
We claim that

       | δ_il δ_im δ_in |
ε_ijk ε_lmn = | δ_jl δ_jm δ_jn | . (1.24)
       | δ_kl δ_km δ_kn |
As proof we observe that the value of both the LHS and the RHS:
(i) is 0 when any of (i, j, k) are equal (two rows equal in a determinant), or when any of (l, m, n) are
equal (two columns equal in a determinant);
(ii) is 1 when (i, j, k) = (l, m, n) = (1, 2, 3);
(iii) changes sign when any of (i, j, k) are interchanged (row interchange in a determinant), or when any
of (l, m, n) are interchanged (column interchange in a determinant).
Remarks.
(i) There are four free suffices/indices on each side, with i as a dummy suffix on the left-hand side. Hence (1.25b) represents 3⁴ = 81 equations.
(ii) Given any product of two epsilons with one common index, the indices can be permuted cyclically
into this form, for instance:
ε_αβγ ε_µνβ = ε_βγα ε_βµν = δ_γµ δ_αν − δ_γν δ_αµ . (1.25c)
Contracted 2 and contracted 3 identities. A further contraction of the identity (1.24) yields from (1.25b)
ε_ijk ε_ijn = δ_jj δ_kn − δ_jn δ_kj = 3δ_kn − δ_kn = 2δ_kn , (1.26a)
while a further contraction yields
ε_ijk ε_ijk = 6 . (1.26b)
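The once-, twice- and thrice-contracted identities can all be confirmed mechanically; a sketch (not from the notes; assumes numpy):

```python
import numpy as np

# 3x3x3 Levi-Civita array from (1.18) and the Kronecker delta
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1
eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1
delta = np.eye(3)

# ε_ijk ε_ilm = δ_jl δ_km − δ_jm δ_kl  (one common suffix, cf. (1.25b))
lhs = np.einsum('ijk,ilm->jklm', eps, eps)
rhs = (np.einsum('jl,km->jklm', delta, delta)
       - np.einsum('jm,kl->jklm', delta, delta))
assert np.allclose(lhs, rhs)

# ε_ijk ε_ijn = 2 δ_kn  (1.26a)   and   ε_ijk ε_ijk = 6  (1.26b)
assert np.allclose(np.einsum('ijk,ijn->kn', eps, eps), 2 * delta)
assert np.isclose(np.einsum('ijk,ijk->', eps, eps), 6.0)
```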
Vector triple product. Using suffix notation for the vector triple product, we recover in agreement with (0.11e):
(a × (b × c))_i = ε_ijk a_j (b × c)_k
       = ε_ijk ε_klm a_j b_l c_m      only two identical suffices
       = ε_kij ε_klm a_j b_l c_m      from (1.19c), permute the suffices
       = (δ_il δ_jm − δ_im δ_jl) a_j b_l c_m  from (1.25b)
       = a_j b_i c_j − a_j b_j c_i     from (1.14b) and (1.14c)
       = ((a · c)b − (a · b)c)_i . (1.27b)
There is an elegant proof of Schwarz’s inequality (which works in n dimensions) using the summation convention:
∥x∥² ∥y∥² − |x · y|² = x_i x_i y_j y_j − x_i y_i x_j y_j
         = ½ x_i x_i y_j y_j + ½ x_j x_j y_i y_i − x_i y_i x_j y_j   relabel indices in half the first term
         = ½ (x_i y_j − x_j y_i)(x_i y_j − x_j y_i)        factorize
         ⩾ 0 .
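The factorized form makes the non-negativity manifest, and it holds in any dimension. A numerical sketch (not from the notes; assumes numpy) in n = 5:

```python
import numpy as np

rng = np.random.default_rng(1)
x = rng.normal(size=5)           # the proof works in any dimension n
y = rng.normal(size=5)

gap = (x @ x) * (y @ y) - (x @ y)**2

# ½ (x_i y_j − x_j y_i)(x_i y_j − x_j y_i): the antisymmetrized outer product,
# squared and summed over both suffices
outer = np.outer(x, y) - np.outer(y, x)
assert np.isclose(gap, 0.5 * np.einsum('ij,ij->', outer, outer))
assert gap >= 0
```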
Let ψ(r) be a scalar field, i.e. a scalar function of position r = (x, y, z).
grad ψ ≡ ∇ψ = (∂ψ/∂x) e_x + (∂ψ/∂y) e_y + (∂ψ/∂z) e_z = Σ_j e_j ∂ψ/∂x_j . (1.28b)
In the limit when δ• becomes infinitesimal we write d• for δ•.9 Thus we have that
dψ = ∇ψ · dr . (1.29)
We can define the vector differential operator ∇ (pronounced ‘grad’) independently of ψ by writing
∇ ≡ e_x ∂/∂x + e_y ∂/∂y + e_z ∂/∂z = Σ_j e_j ∂/∂x_j (1.30a)
 = e_j ∂/∂x_j , using the summation convention. (1.30b) (Key Result)
9 This is a bit of a ‘fudge’ because, strictly, a differential d• need not be small . . . but there is no quick way out.
Find ∇f , where f (r) is a function of r = |r|. We will use this result later.
Answer. First recall that r² = x² + y² + z². Hence
2r ∂r/∂x = 2x , i.e. ∂r/∂x = x/r . (1.31a)
Similarly, by use of the permutations x → y, y → z and z → x,
∂r/∂y = y/r , ∂r/∂z = z/r . (1.31b)
∇r = (∂r/∂x, ∂r/∂y, ∂r/∂z) = (x/r, y/r, z/r) = r/r . (1.32) (Key Result)
Similarly, from the definition of gradient (1.28b) (and from standard results for the derivative of a function of a function),
∇f(r) = (∂f(r)/∂x, ∂f(r)/∂y, ∂f(r)/∂z)
   = ((df/dr)(∂r/∂x), (df/dr)(∂r/∂y), (df/dr)(∂r/∂z))
   = f′(r) ∇r (1.33a)
   = f′(r) r/r . (1.33b)
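The result ∇f(r) = f′(r) r/r can be checked by direct differentiation for a concrete choice of f. A sketch (not from the notes; assumes sympy), taking f(r) = sin r as the hypothetical test function:

```python
import sympy as sp

x, y, z = sp.symbols('x y z', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

# ∇ sin(r) computed component by component via the chain rule
grad = sp.Matrix([sp.diff(sp.sin(r), v) for v in (x, y, z)])

# f'(r) r / r with f'(r) = cos(r), cf. (1.33b)
expected = sp.cos(r) / r * sp.Matrix([x, y, z])

assert sp.simplify(grad - expected) == sp.zeros(3, 1)
```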
δψ = ψ(r + δs l) − ψ(r) = δs [d/ds ψ(r + s l)]_{s=0} + . . . ,
so that
dψ = ds [d/ds ψ(r + s l)]_{s=0} . (1.34a)
But from (1.29) with dr = ds l,
dψ = ds (l · ∇ψ) . (1.34b)
Since (1.34a) and (1.34b) hold for all ds, it follows that
l · ∇ψ = [d/ds ψ(r + s l)]_{s=0} . (1.35)
(i) More generally, the rate of change of ψ with arclength s along a curve is dψ/ds = l·grad ψ, where
l = dr/ds is the unit tangent vector to the curve.
(ii) When the directional derivative is zero, i.e. l · ∇ψ = 0, it follows that if ∇ψ ̸= 0, then ψ does not
change in the direction of l; hence l is a tangent to the surface ψ = constant.
n̂ = ∇ψ / |∇ψ| . (1.36)
1. Find the unit normal at the point r(x, y, z) to the surface
ψ(r) ≡ xy + yz + zx = −c , (1.37)
where c is a positive constant. Hence find the points where the tangents to the surface are parallel to
the (x, y) plane.
Answer. First calculate
∇ψ = (y + z, x + z, y + x) . (1.38a)
Then
n̂ = ∇ψ/|∇ψ| = (y + z, x + z, y + x) / √(2(x² + y² + z² + xy + xz + yz)) . (1.38b)
The tangents to the surface are parallel to the (x, y) plane where the normal is parallel to e_z, i.e. where the first two components of ∇ψ vanish:
y = −z and x = −z . (1.38c)
Hence from the equation for the surface, i.e. (1.37), the points where the tangents to the surface are parallel to the (x, y) plane satisfy
z² = c , (1.38d)
so from (1.38c)
r = ±√c (−1, −1, 1) . (1.38e)
2. Unlectured. A mountain’s height z = h(x, y) depends on Cartesian coordinates x, y according to
h(x, y) = 1 − x4 − y 4 ⩾ 0. Find the point at which the slope in the plane y = 0 is greatest.
Answer. In the plane y = 0 the height is z = h(x, 0) = 1 − x⁴, so along a path in that plane, with s the horizontal distance (so that dx/ds = ±1),
slope = dz/ds = −4x³ dx/ds = −4x³ sign(dx/ds) . (1.40c)
Therefore the magnitude of the slope is largest where |x| is largest, i.e. at the edge of the mountain |x| = 1. It follows that max |slope| = 4.
∇ψ is an example of a vector field, i.e. a vector specified at each point r in space. More generally, we have for a vector field F(r),
F(r) = F_x(r) e_x + F_y(r) e_y + F_z(r) e_z = Σ_j F_j(r) e_j , (1.41)
1.4.3 Examples
1. Unlectured. Find the divergence and curl of the vector field F = (x2 y, y 2 z, z 2 x).
Answer.
∇ · F = ∂(x²y)/∂x + ∂(y²z)/∂y + ∂(z²x)/∂z = 2xy + 2yz + 2zx . (1.44)

     | e_x e_y e_z |
∇ × F = | ∂_x ∂_y ∂_z |
     | x²y y²z z²x |
   = −y² e_x − z² e_y − x² e_z = −(y², z², x²) . (1.45)
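Results (1.44) and (1.45) can be reproduced symbolically; a sketch (not from the notes) assuming sympy's vector module is available:

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence, curl

C = CoordSys3D('C')
# the field F = (x²y, y²z, z²x) of the example
F = C.x**2*C.y*C.i + C.y**2*C.z*C.j + C.z**2*C.x*C.k

# divergence, cf. (1.44)
assert sp.simplify(divergence(F) - (2*C.x*C.y + 2*C.y*C.z + 2*C.z*C.x)) == 0

# curl, cf. (1.45): should be −(y², z², x²)
cF = curl(F)
assert sp.simplify(cF.dot(C.i) + C.y**2) == 0
assert sp.simplify(cF.dot(C.j) + C.z**2) == 0
assert sp.simplify(cF.dot(C.k) + C.x**2) == 0
```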
2. Find ∇ · r and ∇ × r.
Answer. From the definition of divergence (1.42a), and recalling that r = (x, y, z) = (x_1, x_2, x_3), it follows that
∇ · r = ∂x/∂x + ∂y/∂y + ∂z/∂z = 3 , (1.46a)
or equivalently from (1.42b)
∇ · r = ∂x_i/∂x_i = δ_ii = 3 , since, from Example Sheet 0, ∂x_i/∂x_j = δ_ij . (1.46b)
Next, from the definition of curl (1.43a) it follows that
∇ × r = (∂z/∂y − ∂y/∂z, ∂x/∂z − ∂z/∂x, ∂y/∂x − ∂x/∂y) = 0 , (1.46c)
or equivalently from (1.43c)
∇ × r = ε_ijk e_i ∂x_k/∂x_j = ε_ijk e_i δ_jk = ε_ijj e_i = 0 . (1.46d)
1.4.4 F · ∇
In (1.42a) we defined the divergence of a vector field F, i.e. the scalar ∇ · F. The order of the operator ∇ and the vector field F is important here. If we invert the order then we obtain the scalar operator
(F · ∇) ≡ F_x ∂/∂x + F_y ∂/∂y + F_z ∂/∂z = F_j ∂/∂x_j . (s.c.) (1.47a)
Acting on a vector field G this yields (F · ∇)G, whose ith component is unambiguously F_j ∂G_i/∂x_j, while the ith component of F · (∇G) is not, i.e. it is not clear whether the ith component of F · (∇G) is
Σ_j F_j ∂G_i/∂x_j or Σ_j F_j ∂G_j/∂x_i .
Calculations involving ∇ can be greatly speeded up when certain vector identities are known. There are a large number of these! A short list of the most common is given below. Here ψ is a scalar field and F, G are vector fields.
∇ · (ψF) = ψ ∇ · F + (F · ∇)ψ , (1.48a)
∇ · (F × G) = G · (∇ × F) − F · (∇ × G) , (1.48c)
Example Verifications.
(1.48a):
∇ · (ψF) = ∂(ψF_i)/∂x_i           from (1.42b)
     = ψ ∂F_i/∂x_i + F_i ∂ψ/∂x_i
     = ψ ∇ · F + (F · ∇)ψ        from (1.42b) and (1.47a).
Unlectured. (1.48c):
∇ · (F × G) = ∂/∂x_i (ε_ijk F_j G_k)            from (1.42b) and (1.21)
      = G_k ε_ijk ∂F_j/∂x_i + F_j ε_ijk ∂G_k/∂x_i
      = G_k ε_kij ∂F_j/∂x_i − F_j ε_jik ∂G_k/∂x_i   from (1.19c)
      = G · (∇ × F) − F · (∇ × G)          from (1.43c).
(1.48d):
(∇ × (F × G))_i = ε_ijk ∂/∂x_j (ε_klm F_l G_m)                 from (1.21) and (1.43c)
       = (δ_il δ_jm − δ_im δ_jl)(F_l ∂G_m/∂x_j + G_m ∂F_l/∂x_j)      from (1.25b)
       = F_i ∂G_j/∂x_j + G_j ∂F_i/∂x_j − F_j ∂G_i/∂x_j − G_i ∂F_j/∂x_j  from (1.14b) and (1.14c)
       = (F (∇ · G) − G (∇ · F) + (G · ∇)F − (F · ∇)G)_i         from (1.42b) and (1.47a).
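The identity (1.48d) holds for any smooth fields, so it can be verified on sample fields; a sketch (not from the notes; assumes sympy's vector module), with the particular F and G chosen purely for illustration:

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence, curl, gradient

C = CoordSys3D('C')
x, y, z = C.x, C.y, C.z

F = x*y*C.i + y*z*C.j + z*x*C.k            # arbitrary smooth sample fields
G = sp.sin(x)*C.i + sp.cos(y)*C.j + z**2*C.k

def directional(A, B):
    # (A·∇)B in Cartesians: component i is A_j ∂B_i/∂x_j, cf. (1.47a)
    comps = [A.dot(gradient(B.dot(e))) for e in (C.i, C.j, C.k)]
    return comps[0]*C.i + comps[1]*C.j + comps[2]*C.k

lhs = curl(F.cross(G))
rhs = (F*divergence(G) - G*divergence(F)
       + directional(G, F) - directional(F, G))

residual = lhs - rhs
assert all(sp.simplify(residual.dot(e)) == 0 for e in (C.i, C.j, C.k))
```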
Warnings.
1. Always remember what terms the differential operator is acting on, e.g. is it all terms to the right or
just some?
2. Be very careful when using standard vector identities where you have just replaced a vector with ∇. Sometimes it works, sometimes it does not! For instance for constant vectors D, F and G
F · (D × G) = D · (G × F) = −D · (F × G) ,
but
F · (∇ × G) ̸= ∇ · (G × F) = −∇ · (F × G) ,
Using the definitions of grad, div and curl, i.e. (1.28b), (1.42a) and (1.43a), and assuming the equality of mixed derivatives, we have that (cf. (1.20c), where we showed that ε_ijk s_ij = 0 if s_ij is symmetric)
(curl(grad ψ))_i = (∇ × (∇ψ))_i = ε_ijk ∂_{x_j}(∂_{x_k} ψ)
        = ε_ikj ∂_{x_k} ∂_{x_j} ψ    relabel dummy suffices j and k
        = −ε_ijk ∂_{x_j} ∂_{x_k} ψ    permute ikj in ε_ikj & swap partials
        = 0             quantity equals its negative. (1.49a)
Similarly,
div(curl F) = ∇ · (∇ × F) = ∂_{x_i} ε_ijk ∂_{x_j} F_k
      = ∂_{x_j} ε_jik ∂_{x_i} F_k     relabel dummy suffices i and j
      = −∂_{x_i} ε_ijk ∂_{x_j} F_k     permute jik in ε_jik & swap partials
      = 0 .             quantity equals its negative (1.49b)
Remarks.
1. Since by the standard rules for scalar triple products ∇ · (∇ × F) ≡ (∇ × ∇) · F, we can summarise
both of these identities by
∇ × ∇ ≡ 0 . (1.50) (Key Result)
2. There are important converses to (1.49a) and (1.49b). The following two assertions can be proved (but
not here).
(a) Suppose that ∇ × F = 0; the vector field F(r) is said to be irrotational. Then there exists a scalar
potential, φ(r), such that
F = ∇φ . (1.51)
Application. A force field F such that ∇ × F = 0 is said to be conservative. Gravity is a conservative force field. The above result shows that we can define a gravitational potential φ such that F = ∇φ.
(b) Suppose that ∇ · B = 0; the vector field B(r) is said to be solenoidal. Then there exists a
non-unique vector potential, A(r), such that
B = ∇ × A. (1.52)
Application. One of Maxwell’s equations for a magnetic field, B, states that ∇ · B = 0. The above
result shows that we can define a magnetic vector potential, A, such that B = ∇ × A.
Example. Evaluate ∇ · (∇p × ∇q), where p and q are scalar fields. We will use this result later.
Answer. Identify ∇p and ∇q with F and G respectively in the vector identity (1.48c). Then it follows
from using (1.50) that
∇ · (∇p × ∇q) = ∇q · (∇ × ∇p) − ∇p · (∇ × ∇q) = 0 . (1.53)
∇² = ∇ · ∇ = ∂²/∂x_i² = ∂²/∂x_1² + ∂²/∂x_2² + ∂²/∂x_3² . (1.54c)
Remarks.
1. The Laplacian operator ∇2 is very important in the natural sciences. For instance it occurs in
(a) Poisson’s equation for a potential φ(r):
∇²φ = ρ , (1.55a)

d²f/dx² + ω²f = 0 .
2. Although the Laplacian has been introduced by reference to its effect on a scalar field (in our case ψ),
it also has meaning when applied to vectors. However some care is needed. On the first example sheet
you will prove the vector identity
∇ × (∇ × F) = ∇(∇ · F) − ∇2 F . (1.56a)
The Laplacian acting on a vector is conventionally defined by rearranging this identity to obtain
∇2 F = ∇(∇ · F) − ∇ × (∇ × F) . (1.56b)
∇²rⁿ = ∇ · (∇rⁿ) = ∂(n r^{n−2} x_i)/∂x_i
   = n r^{n−2} ∂x_i/∂x_i + n x_i ∂r^{n−2}/∂x_i
   = 3n r^{n−2} + n x_i (n − 2) r^{n−3} (x_i/r)    using (1.31a)
   = n(n + 1) r^{n−2} . (1.58)
Check. Note that from setting n = 2 in (1.57) we have that ∇r2 = 2r. It follows that, with n = 2,
(1.58) reproduces (1.46a).
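The result (1.58) can be checked symbolically for general n by summing the three Cartesian second derivatives; a sketch (not from the notes; assumes sympy):

```python
import sympy as sp

x, y, z, n = sp.symbols('x y z n', positive=True)
r = sp.sqrt(x**2 + y**2 + z**2)

# ∇² rⁿ as the sum of the three second partial derivatives, cf. (1.54c)
lap = sum(sp.diff(r**n, v, 2) for v in (x, y, z))
assert sp.simplify(lap - n*(n + 1)*r**(n - 2)) == 0

# concrete spot check with n = 4: ∇² r⁴ = 20 r²
lap4 = sum(sp.diff(r**4, v, 2) for v in (x, y, z))
assert sp.simplify(lap4 - 20*r**2) == 0
```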
2. Unlectured. Find the Laplacian of (sin r)/r.
Answer. Since the Laplacian consists of first taking a gradient, we first note from using result (1.33a), i.e. ∇f(r) = f′(r)∇r, that
∇((sin r)/r) = ((cos r)/r − (sin r)/r²) ∇r . (1.59a)
These are two very important integral theorems for vector fields that have many scientific applications.
dV = dx dy dz , (1.62a)
and
dS = σ_x dy dz e_x + σ_y dz dx e_y + σ_z dx dy e_z , (1.62b)
where σ_x = sign(n̂ · e_x), σ_y = sign(n̂ · e_y) and σ_z = sign(n̂ · e_z).
Remark. The divergence theorem relates a triple integral to a double integral. This is analogous to the second fundamental theorem of calculus, i.e.
∫_{h_1}^{h_2} (df/dz) dz = f(h_2) − f(h_1) . (1.63)
Now consider the projection of a surface element dS on the upper surface onto the xy plane. It follows
geometrically that dx dy = |cos α| dS, where α is the angle between e_z and the unit normal n̂; hence on S₂
dx dy = e_z · n̂ dS = e_z · dS . (1.65a)
On the lower surface S₁ we need to dot n̂ with −e_z in order to get a positive area; hence
dx dy = −e_z · dS . (1.65b)
We note that (1.65a) and (1.65b) are consistent with (1.62b) once the tricky issue of signs is sorted
out. Using (1.62a), (1.65a) and (1.65b), equation (1.64) can be rewritten as
∫∫∫_V (∂u_z/∂z) dV = ∫∫_{S₂} u_z e_z · dS + ∫∫_{S₁} u_z e_z · dS = ∫∫_S u_z e_z · dS . (1.66a)
The generalisation for a scalar field. For a scalar field ψ(x) with continuous first-order partial derivatives in V,
∫∫∫_V ∇ψ dV = ∫∫_S ψ dS . (1.68a)
Proof. Set u = ψa in (1.67), where a is an arbitrary constant vector. Then from (1.48a)
a · ∫∫∫_V ∇ψ dV = a · ∫∫_S ψ dS . (1.68b)
Since a is arbitrary, (1.68a) follows.12 Alternatively, choose a = e_i to obtain the component form
∫∫∫_V (∂ψ/∂x_i) dV = ∫∫_S ψ n_i dS . (1.68c) (Key Result)
The generalisation for a vector potential. For a vector potential A with continuous first-order partial derivatives in V,
∫∫∫_V ∇ × A dV = ∫∫_S n̂ × A dS . (1.69)
Proof (unlectured). Either set u = a × A in (1.67), where a is an arbitrary constant vector, and then proceed as above, or let ψ = ε_ijk A_j in (1.68c), to recover (1.69) in component form.
Let S be any ‘nice’ open surface bounding the ‘nice’ closed curve C.13 Let u(r) be a ‘nice’ vector field.14 Then
∫∫_S (∇ × u) · dS = ∮_C u · dr , (1.70a) (Key Result)
where the line integral is taken in the direction of C as specified by the ‘right-hand rule’.
Remark. Stokes’ theorem thus states that the flux of ∇ × u
across an open surface S is equal to the circulation of u
round the bounding curve C.
∫∫_A (∂u_y/∂x − ∂u_x/∂y) dx dy = ∮_C (u_x dx + u_y dy) , (1.70b)
and ey · F = 0. Hence we have Archimedes’ Principle that an immersed body experiences a loss of
weight equal to the weight of the fluid displaced:
F = M g ez . (1.72c)
13 Or to be slightly more precise: let S be a piecewise smooth, open, orientated, non-intersecting surface bounded by a simple, piecewise smooth, closed curve C.
Application. Suppose that φ is the gravitational potential; then g = −∇φ is the gravitational force, and ∫_C (−∇φ) · dr is the work done against gravity in moving from A to B. The above result demonstrates that the work done is independent of path. Indeed, from (1.29), i.e. ∇φ · dr = dφ,
∫_C ∇φ · dr = ∫_C dφ = φ(B) − φ(A) . (1.75)
where S is any ‘nice’ small closed surface enclosing a volume V. It follows that ∇ · u can be interpreted as
the net rate of flux outflow at r0 per unit volume.
where S is any ‘nice’ small open surface with a bounding curve C. It follows that n̂ · (∇ × u) can be interpreted as the circulation about n̂ at r₀ per unit area.
Application.
Consider a rigid body rotating with angular velocity ω about an axis through O. Then the velocity at a point r in the body is given by
v = ω × r . (1.78a)
There are many ways to describe the position of points in space. One way is to define three independent sets
of surfaces, each parameterised by a single variable (for Cartesian coordinates these are orthogonal planes
parameterised, say, by the point on the axis that they intercept). Then any point has ‘coordinates’ given by
the labels for the three surfaces that intersect at that point.
It is very important to realise that there is a key difference between Cartesian coordinates and other Key
orthogonal curvilinear coordinates. In Cartesian coordinates the directions of the basis vectors ex , ey , ez are Point
independent of position. This is not the case in other coordinate systems; for instance, er the normal to a
spherical shell changes direction with position on the shell. It is sometimes helpful to display this dependence
on position explicitly:
ei ≡ ei (r) . (1.79)
Suppose that we have non-Cartesian coordinates, q_i (i = 1, 2, 3). Since we can express one coordinate system in terms of another, there will be a functional dependence of the q_i on, say, Cartesian coordinates x, y, z, i.e.
qi ≡ qi (x, y, z) (i = 1, 2, 3) . (1.80)
For cylindrical polar coordinates and spherical polar coordinates we know that:

     cylindrical polars      spherical polars
q_1   ρ = (x² + y²)^{1/2}     r = (x² + y² + z²)^{1/2}
q_2   ϕ = tan⁻¹(y/x)       θ = tan⁻¹((x² + y²)^{1/2}/z)
q_3   z              ϕ = tan⁻¹(y/x)
Remarks
1. Note that qi = ci (i = 1, 2, 3), where the ci are constants, define three independent sets of surfaces,
each ‘labelled’ by a parameter (i.e. the ci ). As discussed above, any point has ‘coordinates’ given by
the labels for the three surfaces that intersect at that point.
2. The equation (1.80) can be viewed as three simultaneous equations for three unknowns x, y, z. In
general these equations can be solved to yield the position vector r as a function of q = (q1 , q2 , q3 ),
i.e. r ≡ r(q) or
x = x(q1 , q2 , q3 ) , y = y(q1 , q2 , q3 ) , z = z(q1 , q2 , q3 ) . (1.81)
For instance:
Consider an infinitesimal change in position. Then, by the chain rule, the change dxi in xi (q1 , q2 , q3 ) due to
changes dqj in qj (i = 1, 2, 3) is
dx_i = (∂x_i/∂q_1) dq_1 + (∂x_i/∂q_2) dq_2 + (∂x_i/∂q_3) dq_3 = (∂x_i/∂q_j) dq_j (i = 1, 2, 3). (s.c.) (1.82)
where the h_j = |h_j| are the lengths of the h_j, and the e_j are unit vectors, i.e.
h_j = |∂r/∂q_j| and e_j = (1/h_j) ∂r/∂q_j (j = 1, 2, 3) . (no s.c.) (1.84b)
Remarks.
(i) The hj will, in general, depend on position r. Consequently, the ej (r) will vary in space (cf. the x bi ),
and the q-axes will be curves rather than straight lines. The coordinate system is said to be curvilinear.
(ii) The scale factors or metric coefficients, h_j, convert coordinate increments into lengths. Any point at which h_j = 0 is a coordinate singularity at which the coordinate system breaks down.
The Jacobian of (x, y, z) with respect to (q_1, q_2, q_3) is defined as the determinant of this matrix:

                    | ∂x/∂q_1 ∂x/∂q_2 ∂x/∂q_3 |
J ≡ ∂(x, y, z)/∂(q_1, q_2, q_3) = |J| = | ∂y/∂q_1 ∂y/∂q_2 ∂y/∂q_3 | . (1.85b)
                    | ∂z/∂q_1 ∂z/∂q_2 ∂z/∂q_3 |
The columns of the above matrix are the vectors hi defined in (1.83b). Therefore the Jacobian is equal to
the scalar triple product
J = [h1 , h2 , h3 ] = h1 ·h2 × h3 . (1.85c)
Consider now three sets of variables α_i, β_i and γ_i, with 1 ⩽ i ⩽ n, none of which need be Cartesian coordinates. According to the chain rule of partial differentiation,
∂α_i/∂γ_j = (∂α_i/∂β_k)(∂β_k/∂γ_j) . (s.c.) (1.89)
The left-hand side is the ij-component of the Jacobian matrix of the transformation from αi to γi . The
equation states that this matrix is the product of the Jacobian matrices of the transformations from αi to
βi and from βi to γi , i.e. the Jacobian matrix of a composite transformation is the product of the Jacobian
matrices of the transformations of which it is composed.
The chain rule for Jacobians. Taking the determinant of (1.89), we recover the chain rule for Jacobians:
∂(α_1, · · · , α_n)/∂(γ_1, · · · , γ_n) = [∂(α_1, · · · , α_n)/∂(β_1, · · · , β_n)] [∂(β_1, · · · , β_n)/∂(γ_1, · · · , γ_n)] . (1.90)
The inverse transformation for Jacobians. In the special case in which γ_i = α_i for all i, the left-hand side is 1 (the determinant of the unit matrix), and so we obtain
∂(α_1, · · · , α_n)/∂(β_1, · · · , β_n) = [∂(β_1, · · · , β_n)/∂(α_1, · · · , α_n)]⁻¹ . (1.91)
Hence, the Jacobian of an inverse transformation is the reciprocal of that of the forward transformation.
This is a multidimensional generalization of the result dx/dy = (dy/dx)−1 .
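The reciprocal relation (1.91) can be verified for a concrete transformation; a sketch (not from the notes; assumes sympy), using plane polar coordinates as the example:

```python
import sympy as sp

rho, phi = sp.symbols('rho phi', positive=True)
x, y = sp.symbols('x y', positive=True)

# forward transformation: plane polars (rho, phi) -> Cartesians (x, y)
fwd = sp.Matrix([rho*sp.cos(phi), rho*sp.sin(phi)])
J_fwd = fwd.jacobian([rho, phi])
assert sp.simplify(J_fwd.det()) == rho        # ∂(x,y)/∂(rho,phi) = rho

# inverse transformation: (x, y) -> (rho, phi)
inv = sp.Matrix([sp.sqrt(x**2 + y**2), sp.atan2(y, x)])
J_inv = inv.jacobian([x, y])
det_inv = sp.simplify(J_inv.det())

# cf. (1.91): the inverse Jacobian is 1/rho, i.e. 1/sqrt(x² + y²)
assert sp.simplify(det_inv - 1/sp.sqrt(x**2 + y**2)) == 0
```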
1.8.6 Orthogonality
For a general qj coordinate system the ej are not necessarily mutually orthogonal, i.e. in general
ei · ej ̸= 0 for i ̸= j .
However, for orthogonal curvilinear coordinates the ei are required to be mutually orthogonal at all points
in space, i.e.
ei · ej = 0 if i ̸= j .
Since by definition the ej are unit vectors, we thus have that
ei · ej = δij . (1.92)
It follows from (1.84b) that
h_1 = h_r = |∂r/∂q_1| = 1 ,      e_1 = e_r = (sin θ cos ϕ, sin θ sin ϕ, cos θ) , (1.94b)
h_2 = h_θ = |∂r/∂q_2| = r ,      e_2 = e_θ = (cos θ cos ϕ, cos θ sin ϕ, −sin θ) , (1.94c)
h_3 = h_ϕ = |∂r/∂q_3| = r sin θ ,   e_3 = e_ϕ = (−sin ϕ, cos ϕ, 0) . (1.94d)
Remarks.
(i) ei · ej = δij and e1 × e2 = e3 , i.e. spherical polar coordinates are a right-handed orthogonal curvilinear
coordinate system. If we had chosen, say, q1 = r, q2 = ϕ, q3 = θ, then we would have ended up with a
left-handed system.
(ii) er , eθ and eϕ are functions of position.
(iii) Spherical polars are singular at r = 0, θ = 0 and θ = π, i.e. on the ‘north-south’ axis.
(iv) Recalling from (1.83a) and (1.84a) that the h_j dq_j give the components of the displacement vector dr along the r, θ and ϕ axes, we have that
dr = Σ_j h_j dq_j e_j = dr e_r + r dθ e_θ + r sin θ dϕ e_ϕ . (1.95)
(iii) Cylindrical polars are singular on the axis ρ = 0.
(iv) As noted in the section on Assumed Knowledge, §0.12, in the case of cylindrical polar coordinates,
sometimes r and/or θ are used in place of ρ and/or ϕ respectively (but then there is potential confusion
with the different definitions of r and θ in spherical polar co-ordinates). Further, in order to maximise
confusion, instead of ρ (which, admittedly, can be useful for other things, such as density), some
authors use R, s or ϖ.
Volume element. For orthogonal curvilinear coordinate systems it follows from (1.86a) that
dV = h_1 h_2 h_3 dq_1 dq_2 dq_3 . (1.97a)
Example: Spherical Polar Coordinates. In the case of spherical polar coordinates we have from (1.94b),
(1.94c), (1.94d) and (1.97a) that
dV = r2 sin θ drdθdϕ . (1.97b)
The volume of the sphere of radius a is therefore
∫∫∫_V dV = ∫_0^a dr ∫_0^π dθ ∫_0^{2π} dϕ r² sin θ = (4/3)πa³ . (1.97c)
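The triple integral (1.97c) can be reproduced symbolically; a sketch (not from the notes; assumes sympy):

```python
import sympy as sp

r, theta, phi, a = sp.symbols('r theta phi a', positive=True)

# dV = r² sin(theta) dr dtheta dphi, cf. (1.97b), over the ball of radius a
V = sp.integrate(r**2 * sp.sin(theta),
                 (r, 0, a), (theta, 0, sp.pi), (phi, 0, 2*sp.pi))
assert sp.simplify(V - sp.Rational(4, 3)*sp.pi*a**3) == 0
```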
First we recall from (1.29) that for Cartesian coordinates and infinitesimal displacements dψ = ∇ψ · dr.
Definition. For curvilinear orthogonal coordinates (for which the basis vectors are in general functions of
position), we define ∇ψ to be the vector such that for all dr
dψ = ∇ψ · dr . (1.99)
But according to the chain rule, an infinitesimal change dq to q will lead to the following infinitesimal change in ψ ≡ ψ(q_1, q_2, q_3):
dψ = Σ_i (∂ψ/∂q_i) dq_i = Σ_i (1/h_i)(∂ψ/∂q_i) (h_i dq_i) . (1.100c)
Hence, since (1.100b) and (1.100c) must hold for all dq_i,
α_i = (1/h_i) ∂ψ/∂q_i , (1.100d)
and from (1.100a)
∇ψ = Σ_i (e_i/h_i) ∂ψ/∂q_i = ((1/h_1) ∂ψ/∂q_1, (1/h_2) ∂ψ/∂q_2, (1/h_3) ∂ψ/∂q_3) . (1.100e)
Cylindrical Polar Coordinates. In cylindrical polar coordinates, the gradient is given from (1.96b), (1.96c) and
(1.96d) to be
∂ 1 ∂ ∂
∇ = eρ + eϕ + ez . (1.102a)
∂ρ ρ ∂ϕ ∂z
Spherical Polar Coordinates. In spherical polar coordinates the gradient is given from (1.94b), (1.94c) and (1.94d) to be
∇ = er ∂/∂r + eθ (1/r) ∂/∂θ + eϕ (1/(r sin θ)) ∂/∂ϕ .   (1.102b)
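The spherical-polar gradient (1.102b) can be checked against the Cartesian gradient for a sample scalar field (a sketch, assuming sympy; the field ψ = xy + z² is an arbitrary illustrative choice):

```python
# Verify (1.102b): apply the spherical-polar gradient to a sample field and
# compare, in Cartesian components, with the ordinary Cartesian gradient.
import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)
x, y, z = sp.symbols('x y z')

# spherical -> Cartesian map and the orthonormal basis vectors
X = r*sp.sin(th)*sp.cos(ph)
Y = r*sp.sin(th)*sp.sin(ph)
Z = r*sp.cos(th)
e_r  = sp.Matrix([sp.sin(th)*sp.cos(ph), sp.sin(th)*sp.sin(ph), sp.cos(th)])
e_th = sp.Matrix([sp.cos(th)*sp.cos(ph), sp.cos(th)*sp.sin(ph), -sp.sin(th)])
e_ph = sp.Matrix([-sp.sin(ph), sp.cos(ph), 0])

psi_c = x*y + z**2                      # sample field in Cartesians
psi_s = psi_c.subs({x: X, y: Y, z: Z})  # same field in spherical polars

# (1.102b): grad = e_r d/dr + e_th (1/r) d/dth + e_ph (1/(r sin th)) d/dph
grad_s = (e_r*sp.diff(psi_s, r)
          + e_th*sp.diff(psi_s, th)/r
          + e_ph*sp.diff(psi_s, ph)/(r*sp.sin(th)))

grad_c = sp.Matrix([sp.diff(psi_c, v) for v in (x, y, z)]).subs({x: X, y: Y, z: Z})

diff = sp.simplify(grad_s - grad_c)
assert diff == sp.zeros(3, 1)
```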
We can now use (1.101) to compute ∇ · F and ∇ × F in orthogonal curvilinear coordinates. However, first we need a preliminary result which is complementary to (1.84b). Since
∂qi/∂qj = δij ,   (1.103a)
it follows from (1.101) that
∇qi = ei/hi .   (1.103b)
We also recall that the ei form an orthonormal right-handed basis; thus e1 = e2 × e3 (and cyclic permutations). Hence from (1.103b)
e1/(h2 h3) = ∇q2 × ∇q3   (and cyclic permutations).
Recall from (1.92) that e1 · ej = δ1j , and from example (1.53), with p = q2 and q = q3 , that
∇ · (∇q2 × ∇q3 ) = 0 .
It follows that
∇ · F = (1/(h1 h2 h3)) [ ∂(h2 h3 F1)/∂q1 + ∂(h3 h1 F2)/∂q2 + ∂(h1 h2 F3)/∂q3 ] .   (1.104)   Key Result
In cylindrical polar coordinates,
div F = (1/ρ) ∂(ρFρ)/∂ρ + (1/ρ) ∂Fϕ/∂ϕ + ∂Fz/∂z .   (1.105a)
In spherical polar coordinates,
div F = (1/r²) ∂(r² Fr)/∂r + (1/(r sin θ)) ∂(sin θ Fθ)/∂θ + (1/(r sin θ)) ∂Fϕ/∂ϕ .   (1.105b)
Curl. Again with a little bit of inspired rearrangement we have that
∇ × F = ∇ × ( Σi Fi ei )
      = Σi ∇ × ( hi Fi ei/hi )
      = Σi ∇(hi Fi) × ei/hi + Σi hi Fi (∇ × ∇qi)   using (1.48b) & (1.103b)
      = Σi Σj (1/(hi hj)) (∂(hi Fi)/∂qj) ej × ei .   using (1.49a) & (1.101)
All three components of the curl can be written in the concise form
∇ × F = (1/(h1 h2 h3)) | h1 e1    h2 e2    h3 e3  |
                       | ∂/∂q1   ∂/∂q2   ∂/∂q3  |
                       | h1 F1    h2 F2    h3 F3  | .   (1.106b)   Key Result
In cylindrical polar coordinates,
∇ × F = (1/ρ) | eρ      ρ eϕ    ez    |
              | ∂/∂ρ   ∂/∂ϕ   ∂/∂z  |
              | Fρ      ρ Fϕ    Fz    |   (1.107a)
      = ( (1/ρ) ∂Fz/∂ϕ − ∂Fϕ/∂z , ∂Fρ/∂z − ∂Fz/∂ρ , (1/ρ) ∂(ρFϕ)/∂ρ − (1/ρ) ∂Fρ/∂ϕ ) .   (1.107b)
1 ∂(sin θ Fϕ ) ∂Fθ 1 ∂Fr 1 ∂(rFϕ ) 1 ∂(rFθ ) 1 ∂Fr
= − , − , − . (1.108b)
r sin θ ∂θ ∂ϕ r sin θ ∂ϕ r ∂r r ∂r r ∂θ
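A quick sanity check of the cylindrical-polar curl components (1.107b) (a sketch, assuming sympy; the rigid-rotation field F = ρ eϕ, i.e. (−y, x, 0) in Cartesians, is an assumed test case whose curl is 2 ez):

```python
# Apply the components (1.107b) to F = rho e_phi; expect curl F = 2 e_z.
import sympy as sp

rho, phi, z = sp.symbols('rho phi z', positive=True)
F_rho, F_phi, F_z = sp.Integer(0), rho, sp.Integer(0)

curl = (sp.diff(F_z, phi)/rho - sp.diff(F_phi, z),
        sp.diff(F_rho, z) - sp.diff(F_z, rho),
        (sp.diff(rho*F_phi, rho) - sp.diff(F_rho, phi))/rho)

curl = tuple(sp.simplify(c) for c in curl)
assert curl == (0, 0, 2)
```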
Remarks.
Suppose we substitute F = ∇ψ into formula (1.104) for the divergence. Then since from (1.100e)
Fi = (1/hi) ∂ψ/∂qi ,
we have that
∇²ψ ≡ ∇ · ∇ψ = (1/(h1 h2 h3)) [ ∂/∂q1 ( (h2 h3/h1) ∂ψ/∂q1 ) + ∂/∂q2 ( (h3 h1/h2) ∂ψ/∂q2 ) + ∂/∂q3 ( (h1 h2/h3) ∂ψ/∂q3 ) ] .   (1.109)
In cylindrical polar coordinates,
∇²ψ = (1/ρ) ∂/∂ρ ( ρ ∂ψ/∂ρ ) + (1/ρ²) ∂²ψ/∂ϕ² + ∂²ψ/∂z² .   (1.111a)
In spherical polar coordinates,
∇²ψ = (1/r²) ∂/∂r ( r² ∂ψ/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂ψ/∂θ ) + (1/(r² sin²θ)) ∂²ψ/∂ϕ²   (1.111b)
     = (1/r) ∂²(rψ)/∂r² + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂ψ/∂θ ) + (1/(r² sin²θ)) ∂²ψ/∂ϕ² .   (1.111c)
Remark. We have found here only the form of ∇² as a differential operator on scalar fields. As noted earlier, the action of the Laplacian on a vector field F is most easily defined using the vector identity
∇²F = ∇(∇ · F) − ∇ × (∇ × F) .   (1.112)
Alternatively, one can write
∇²F = ∇²(F1 e1 + F2 e2 + F3 e3) ,
remembering (a) that the derivatives implied by the Laplacian act on the unit vectors too, and (b) that because the unit vectors are generally functions of position, (∇²F)i ≠ ∇²Fi (the exception being Cartesian coordinates).
Example. Evaluate ∇ · r, ∇ × r, and ∇²(1/r) in spherical polar coordinates, where r = r er. From (1.105b),
∇ · r = (1/r²) ∂(r² · r)/∂r = 3 ,   as in (1.46a).   (1.113a)
From (1.108b),
∇ × r = ( 0 , (1/(r sin θ)) ∂r/∂ϕ , −(1/r) ∂r/∂θ ) = (0, 0, 0) ,   as in (1.46c).   (1.113b)
From (1.111c), for r ≠ 0,
∇²(1/r) = (1/r) ∂²/∂r² ( r · (1/r) ) = 0 ,   as in (1.58) with n = −1.   (1.113c)
Summary of key results.

General orthogonal curvilinear coordinates (q1, q2, q3):

div F = (1/(h1 h2 h3)) [ ∂(h2 h3 F1)/∂q1 + ∂(h3 h1 F2)/∂q2 + ∂(h1 h2 F3)/∂q3 ] .

curl F = (1/(h1 h2 h3)) | h1 e1    h2 e2    h3 e3  |
                        | ∂/∂q1   ∂/∂q2   ∂/∂q3  |
                        | h1 F1    h2 F2    h3 F3  | .

∇²ψ = (1/(h1 h2 h3)) [ ∂/∂q1 ( (h2 h3/h1) ∂ψ/∂q1 ) + ∂/∂q2 ( (h3 h1/h2) ∂ψ/∂q2 ) + ∂/∂q3 ( (h1 h2/h3) ∂ψ/∂q3 ) ] .

Cylindrical polar coordinates (ρ, ϕ, z):

∇ = eρ ∂/∂ρ + eϕ (1/ρ) ∂/∂ϕ + ez ∂/∂z .

div F = (1/ρ) ∂(ρFρ)/∂ρ + (1/ρ) ∂Fϕ/∂ϕ + ∂Fz/∂z .

curl F = (1/ρ) | eρ      ρ eϕ    ez    |
               | ∂/∂ρ   ∂/∂ϕ   ∂/∂z  |
               | Fρ      ρ Fϕ    Fz    |
       = ( (1/ρ) ∂Fz/∂ϕ − ∂Fϕ/∂z , ∂Fρ/∂z − ∂Fz/∂ρ , (1/ρ) ∂(ρFϕ)/∂ρ − (1/ρ) ∂Fρ/∂ϕ ) .

∇²ψ = (1/ρ) ∂/∂ρ ( ρ ∂ψ/∂ρ ) + (1/ρ²) ∂²ψ/∂ϕ² + ∂²ψ/∂z² .

Spherical polar coordinates (r, θ, ϕ):

∇ = er ∂/∂r + eθ (1/r) ∂/∂θ + eϕ (1/(r sin θ)) ∂/∂ϕ .

div F = (1/r²) ∂(r² Fr)/∂r + (1/(r sin θ)) ∂(sin θ Fθ)/∂θ + (1/(r sin θ)) ∂Fϕ/∂ϕ .

∇²ψ = (1/r²) ∂/∂r ( r² ∂ψ/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂ψ/∂θ ) + (1/(r² sin²θ)) ∂²ψ/∂ϕ² .
Numerous scientific phenomena are described by differential equations. This section is about extending your
armoury for solving ordinary differential equations, such as those that arise in quantum mechanics and
electrodynamics. In particular, we will be interested in how to model an idealized point charge or point
mass, or a localized source of heat, waves, etc.
Newton’s second law for a particle of mass m moving in one dimension subject to a force F (t) is
dp
=F, (2.1a)
dt
where
p = m dx/dt   (2.1b)
is the momentum. Suppose that the force is applied only in the time interval 0 < t < δt. The total change in momentum, termed the impulse, is
δp = ∫_0^{δt} F (t) dt = I .   (2.1c)
We may wish to represent mathematically a situation in which the momentum is changed instantaneously,
e.g. if the particle experiences a collision. To achieve this, F must tend to infinity while δt tends to zero,
in such a way that its integral I is finite and non-zero. The delta function is introduced to meet these and
similar requirements.
Consider the 'top-hat' sequence of functions
δε(x) = { 1/(2ε) ,  |x| < ε ;  0 ,  |x| ⩾ ε } ,   (2.2a)
for ε > 0; each member of the sequence has unit area,
∫_{−∞}^{∞} δε(x) dx = 1 .   (2.2b)
Further we note that for any differentiable function g(x) and constant ξ,
∫_{−∞}^{∞} δε(x − ξ) g′(x) dx = ∫_{ξ−ε}^{ξ+ε} (1/(2ε)) g′(x) dx
                             = (1/(2ε)) [ g(x) ]_{ξ−ε}^{ξ+ε}
                             = (1/(2ε)) ( g(ξ + ε) − g(ξ − ε) ) .
In the limit ε → 0+ we recover, from using Taylor's theorem and writing g′(x) = f (x),
lim_{ε→0+} ∫_{−∞}^{∞} δε(x − ξ) f (x) dx = lim_{ε→0+} (1/(2ε)) [ ( g(ξ) + εg′(ξ) + ½ε²g″(ξ) + ... ) − ( g(ξ) − εg′(ξ) + ½ε²g″(ξ) − ... ) ]
 = g′(ξ) = f (ξ) .   (2.2c)
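The sifting limit (2.2c) can be observed numerically with the top-hat sequence (a sketch, assuming numpy; the test function cos x and the point ξ = 0.3 are arbitrary illustrative choices):

```python
# Numerical check of (2.2c): the integral of delta_eps(x - xi) f(x)
# tends to f(xi) as eps -> 0, with delta_eps the top-hat (2.2a).
import numpy as np

def trapezoid(y, x):
    dx = np.diff(x)
    return float(np.sum((y[:-1] + y[1:]) * dx / 2))

def tophat_sift(f, xi, eps, n=20001):
    x = np.linspace(xi - eps, xi + eps, n)   # support of the top-hat
    return trapezoid(f(x) / (2 * eps), x)

vals = [tophat_sift(np.cos, 0.3, eps) for eps in (0.1, 0.01, 0.001)]
errs = [abs(v - np.cos(0.3)) for v in vals]
assert errs[0] > errs[1] > errs[2]   # error decreases as eps -> 0
assert errs[2] < 1e-6
```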
Applications. Delta functions (as mathematical objects of infinite density and zero spatial extension but
having a non-zero integral effect) are a mathematical way of modelling point objects/properties, e.g.
point charges, point masses, point forces, point sinks/sources.
(i) From (2.2a) we see that the delta function has an infinitely sharp peak of zero width, i.e.
δ(x) = { ∞ ,  x = 0 ;  0 ,  x ≠ 0 } .   (2.4a)
(ii) From (2.2b) it follows that the delta function has unit area, i.e.
∫_{−α}^{β} δ(x) dx = 1   for any α > 0 , β > 0 .   (2.4b)
(iii) From (2.2c), and a sneaky interchange of the limit and the integration, we conclude that the delta function can perform 'surgical strikes' on integrands, picking out the value of the integrand at one particular point, i.e.
∫_{−∞}^{∞} δ(x − ξ) f (x) dx = f (ξ) .   (2.4c)
Remark. This result is equivalent to the substitution property of the Kronecker delta:
Σ_{j=1}^{3} δij aj = ai .
The Dirac delta function can be understood as the equivalent of the Kronecker delta symbol for
functions of a continuous variable.
• The delta function δ(x) is not a function, but a distribution or generalised function.
• (2.4c) is not really a property of the delta function, but its definition. In other words δ(x) is the generalised function such that for all 'nice' functions f (x)¹⁵
∫_{−∞}^{∞} δ(x − ξ) f (x) dx = f (ξ) .   (2.5)
• Given that δ(x) is defined within an integrand as a linear operator, it should always be employed in an integrand as a linear operator.¹⁶
¹⁵ By 'nice' we mean, for instance, that f (x) is everywhere differentiable any number of times, and that
∫_{−∞}^{∞} | dⁿf /dxⁿ |² dx < ∞   for all integers n ⩾ 0.
The top-hat sequence, (2.2a), is not unique in tending to the delta function in an appropriate limit; there
are many such sequences of well-defined functions.
Graphs of the Witch of Agnesi, (2.6a), and the Gaussian, (2.7a), for increasingly smaller values of ε.
which again recovers f (ξ) in the limit ε → 0+. It follows that if we are willing to break the injunction that δ(x) should always be employed in an integrand as a linear operator, we infer from (2.6c) that
δ(x) = (1/(2π)) ∫_{−∞}^{∞} e^{ıkx} dk .   (2.6d)
Another such sequence is the Gaussian
δε(x) = (1/√(2πε²)) exp( −x²/(2ε²) ) .   (2.7a)
The analogous result to (2.2b) follows by means of the substitution x = √2 εy:
∫_{−∞}^{∞} δε(x) dx = (1/√(2πε²)) ∫_{−∞}^{∞} exp( −x²/(2ε²) ) dx = (1/√π) ∫_{−∞}^{∞} exp( −y² ) dy = 1 .   (2.7b)
The equivalent result to (2.2c) can also be recovered by the substitution x = ξ + √2 εz followed by an application of Taylor's theorem.
2.1.5 Further Properties of the Delta Function
The following properties hold for all the definitions of δε (x) above (i.e. (2.2a), (2.6a), (2.6c) and (2.7a)), and
thence for δ by the limiting process. Alternatively they can be deduced from (2.6d).
(i) δ(x) is symmetric. From (2.6d) it follows using the substitution k = −ℓ that
δ(−x) = (1/(2π)) ∫_{−∞}^{∞} e^{−ıkx} dk = −(1/(2π)) ∫_{∞}^{−∞} e^{ıℓx} dℓ = (1/(2π)) ∫_{−∞}^{∞} e^{ıℓx} dℓ = δ(x) .   (2.8a)
(ii) δ(x) is real. From (2.6d) and (2.8a), with * denoting a complex conjugate, it follows that
δ*(x) = (1/(2π)) ∫_{−∞}^{∞} e^{−ıkx} dk = δ(−x) = δ(x) .   (2.8b)
The Heaviside step function is defined by
H(x) = { 1 ,  x > 0 ;  0 ,  x < 0 } .   (2.9)
There are various conventions for the value of the Heaviside step function at x = 0, but it is not uncommon to take H(0) = ½.
The Heaviside function is closely related to the Dirac delta function, since from (2.4a) and (2.4b)
H(x) = ∫_{−∞}^{x} δ(ξ) dξ .   (2.10a)
By analogy with the first fundamental theorem of calculus (0.1), this suggests that
H′(x) = δ(x) .   (2.10b)
Indeed, for 'nice' f with f (x) → 0 as x → ∞, a formal integration by parts gives
∫_{−∞}^{∞} H′(x − ξ) f (x) dx = [ H(x − ξ) f (x) ]_{−∞}^{∞} − ∫_{ξ}^{∞} f ′(x) dx = f (ξ) .
Hence from the definition of the delta function (2.5) we may identify H′(x) with δ(x).
Returning to the impulsive force of (2.1c), an instantaneous impulse at t = 0 can now be written as
F (t) = I δ(t) ,
i.e. a spike of strength I localized at t = 0. If the particle is at rest before the impulse, the solution for its momentum is
p = I H(t) .
We can define the derivative of δ(x) by using (2.4a), (2.4c) and a formal integration by parts:
∫_{−∞}^{∞} δ′(x − ξ) f (x) dx = [ δ(x − ξ) f (x) ]_{−∞}^{∞} − ∫_{−∞}^{∞} δ(x − ξ) f ′(x) dx = −f ′(ξ) ,   (2.11)
where f (x) is any differentiable function.
Alternatively, the derivative[s] of the delta function can be defined as the limits of sequences of functions. The sequences generating δ′(x) are the derivatives of the (smooth) functions (e.g. Gaussians) that generate δ(x), and have both positive and negative 'spikes' localized at x = 0.
Remark (unlectured). Not all operations are permitted on generalized functions. In particular, two gener-
alized functions of the same variable cannot be multiplied together, e.g. H(x)δ(x) is meaningless.
However δ(x)δ(y) is permissible and represents a point source in a two-dimensional space.
The general second-order linear ordinary differential equation (ODE) for y(x) can, wlog, be written as
y″ + p(x) y′ + q(x) y = f (x) .
If f = 0 the equation is homogeneous, and its general solution is a linear combination
y = α y1 + β y2   (2.13b)
of two linearly independent solutions y1 and y2.
Remark. If y1 and y2 are linearly dependent then y2 = γy1 for some γ ∈ ℝ, in which case (2.13b) becomes
y = (α + βγ) y1 ,   (2.14)
and we have, in effect, a solution with only one integration constant σ = (α + βγ).
2.2.2 Inhomogeneous Second-Order Linear ODEs
The general solution of Ly = f may be written as the sum of a particular solution y0 (with Ly0 = f) and the general solution of the homogeneous equation, i.e. y = y0 + αy1 + βy2, since
Ly = Ly0 + αLy1 + βLy2   (2.15c)
   = f + 0 + 0 .   (2.15d)
Here y1 (x) and y2 (x) are complementary functions, while y0 (x) is referred to as a particular solution, or a particular integral.
If y1 and y2 are linearly dependent (i.e. y2 = γy1 for some γ), then so are y1′ and y2′ (since, from differentiating, y2′ = γy1′). Hence y1 and y2 are linearly dependent only if the equation
( y1  y2 ; y1′  y2′ ) ( α ; β ) = 0   (2.16a)
has a non-zero solution for α and β. Conversely, if this equation has a non-zero solution then y1 and y2 are linearly dependent. It follows that non-zero functions y1 and y2 are linearly independent if and only if
( y1  y2 ; y1′  y2′ ) ( α ; β ) = 0   ⇒   α = β = 0 .   (2.16b)
Since Ax = 0 has only the zero solution if and only if det A ≠ 0, we conclude that y1 and y2 are linearly independent if and only if
W ≡ det( y1  y2 ; y1′  y2′ ) = y1 y2′ − y2 y1′ ≠ 0 ,   (2.17b)
i.e. the Wronskian is non-zero.
Two boundary conditions (BCs) must be specified to determine fully the solution of a second-order ODE.
A boundary condition is usually an equation relating the values of y and y ′ at one point.
Remark. Without loss of generality we can assume that the BCs do not involve y ′′ and higher derivatives,
since the ODE allows y ′′ and higher derivatives to be expressed in terms of y and y ′ .
The most common boundary conditions take the form
A y + B y′ = E   (at a given point),
where A, B and E are constants, and A and B are not both zero. If E = 0 the BC is said to be homogeneous.
Initial-value problem. If both BCs are specified at the same point we have an initial-value problem, e.g. solve
m d²x/dt² = F (t)   for t ⩾ 0, subject to x = dx/dt = 0 at t = 0.   (2.19a)
Boundary-value problem. If the BCs are specified at different points we have a two-point boundary-value problem, e.g. solve
d²y/dx² = f (x)   for a ⩽ x ⩽ b, subject to y(a) = y(b) = 0.   (2.19b)
If a differential equation involves a step function or delta function, this generally implies a lack of smoothness
in the solution. The equation can be solved separately on either side of the discontinuity and the two parts
of the solution connected by applying the appropriate matching conditions. Consider, as an example, the
linear second-order ODE
d²y/dx² + y = δ(x) .   (2.20)
If x represents time, this equation could represent the behaviour of a simple harmonic oscillator in response to an impulsive force. In each of the regions x < 0 and x > 0 separately, the right-hand side vanishes and the general solution is a linear combination of cos x and sin x. We may write
y = { α− cos x + β− sin x ,  x < 0 ;  α+ cos x + β+ sin x ,  x > 0 } .
Since the general solution of a second-order ODE should contain only two arbitrary constants, it should be
possible to relate α+ and β+ to α− and β− .
What is the nature of the non-smoothness in y? Integrate (2.20) from x = −ε to x = ε to obtain
∫_{−ε}^{ε} (d²y/dx²) dx + ∫_{−ε}^{ε} y(x) dx = ∫_{−ε}^{ε} δ(x) dx ,   (2.21a)
i.e.
y′(ε) − y′(−ε) + ∫_{−ε}^{ε} y(x) dx = 1 .   (2.21b)
Now let ε → 0. If we assume that y is bounded, then the integral term makes no contribution and we get
[ dy/dx ]_{x=−ε}^{x=ε} ≡ lim_{ε→0} ( y′(ε) − y′(−ε) ) = 1 .   (2.21c)
Since there is only a finite jump in the derivative of y, we may further conclude that y is continuous, in which case the jump conditions are
[ y ] = 0 ,   [ dy/dx ] = 1   at x = 0 .   (2.21d)
Suppose that we wish to solve the inhomogeneous ODE
L y(x) = f (x) ,   (2.24a)
where L is the general second-order linear differential operator in x, i.e.
L = d²/dx² + p(x) d/dx + q(x) ,   (2.24b)
with p and q being continuous functions. To fix ideas we will assume that the solution should satisfy
homogeneous boundary conditions at x = a and x = b, i.e.
A y(a) + B y′(a) = 0 ,   (2.25a)
C y(b) + D y′(b) = 0 ,   (2.25b)
where A, B, C and D are constants such that A and B are not both zero, and C and D are not both zero.
Next, suppose that we can find a solution G(x; ζ) that is the response of the system to forcing at a point ζ,
i.e. G(x; ζ) is the solution to
L G(x; ζ) = δ(x − ζ) , (2.26a)
subject to the boundary conditions (cf. (2.25a) and (2.25b))
A G(a; ζ) + B Gx (a; ζ) = 0 and C G(b; ζ) + D Gx (b; ζ) = 0 , (2.26b)
where
L = ∂²/∂x² + p(x) ∂/∂x + q(x) ,   (2.26c)
Gx(x; ζ) = ∂G/∂x (x; ζ) ,   (2.26d)
and we have used ∂/∂x rather than d/dx since G is a function of both x and ζ. Then we claim that the solution of the original problem (2.24a) is
y(x) = ∫_a^b G(x; ζ) f (ζ) dζ .   (2.27)
To see this we first note that (2.27) satisfies the boundary conditions (2.25a) and (2.25b), since from (2.26b)
A y(a) + B y′(a) = ∫_a^b ( A G(a; ζ) + B Gx(a; ζ) ) f (ζ) dζ = 0 ,   (2.28a)
C y(b) + D y′(b) = ∫_a^b ( C G(b; ζ) + D Gx(b; ζ) ) f (ζ) dζ = 0 .   (2.28b)
Further, substituting (2.27) into (2.24a) requires that G satisfy (2.26a); integrating that equation across the forcing point, from x = ζ − ε to x = ζ + ε, gives
lim_{ε→0} ∫_{ζ−ε}^{ζ+ε} ( ∂²G/∂x² + p ∂G/∂x + q G ) dx = ∫_{ζ−ε}^{ζ+ε} δ(x − ζ) dx = 1 .   (2.29)
How can this equation be satisfied? Taking the lead from (2.21d), suppose that G(x; ζ) is bounded near x = ζ; then, since p and q are continuous, (2.29) reduces to
lim_{ε→0} [ ∂G/∂x + p G ]_{x=ζ−ε}^{x=ζ+ε} = 1 .
This implies that the jump in the derivative of G is bounded (cf. the unit jump in the Heaviside step function
(2.9) at x = 0). In turn, this means that G must be continuous. We conclude that
lim_{ε→0} [ G(x; ζ) ]_{ζ−ε}^{ζ+ε} = 0   and   lim_{ε→0} [ ∂G/∂x ]_{x=ζ−ε}^{x=ζ+ε} = 1 .   (2.30)
For x ≠ ζ the Green's function satisfies the homogeneous equation LG = 0, so we may write
G(x; ζ) = { α−(ζ) y1 (x) + β−(ζ) y2 (x) ,  a ⩽ x < ζ ;  α+(ζ) y1 (x) + β+(ζ) y2 (x) ,  ζ < x ⩽ b } .   (2.31)
By construction this satisfies (2.26a) for x ≠ ζ. Next we obtain equations relating α±(ζ) and β±(ζ) by requiring at x = ζ that G is continuous and ∂G/∂x has a unit discontinuity. It follows from (2.30) that
( α+(ζ) y1 (ζ) + β+(ζ) y2 (ζ) ) − ( α−(ζ) y1 (ζ) + β−(ζ) y2 (ζ) ) = 0 ,
( α+(ζ) y1′(ζ) + β+(ζ) y2′(ζ) ) − ( α−(ζ) y1′(ζ) + β−(ζ) y2′(ζ) ) = 1 ,
i.e.
( y1  y2 ; y1′  y2′ ) ( α+ − α− ; β+ − β− ) = ( 0 ; 1 ) .   (2.32)
A solution exists to this equation if, see (2.17b),
W ≡ y1 y2′ − y2 y1′ ≠ 0 ,
i.e. if y1 and y2 are linearly independent; if so then
α+ − α− = − y2 (ζ)/W (ζ)   and   β+ − β− = y1 (ζ)/W (ζ) .   (2.33)
Natural Sciences Tripos: IB Mathematical Methods I 41 © [email protected], Michaelmas 2022
Finally we impose the boundary conditions. For instance, suppose that the solution y is required to satisfy
(cf. (2.19b))
y(a) = y(b) = 0 . (2.34a)
Then the appropriate boundary conditions for G would be
G(a; ζ) = G(b; ζ) = 0 , (2.34b)
i.e. A = C = 1 and B = D = 0 in (2.26b). It follows from (2.31) that we would require
α− (ζ)y1 (a) + β− (ζ)y2 (a) = 0 , (2.35a)
α+ (ζ)y1 (b) + β+ (ζ)y2 (b) = 0 . (2.35b)
α± , β± could then be determined from the four equations in (2.33), (2.35a) and (2.35b).
More generally, for the homogeneous boundary conditions (2.25a) and (2.25b), i.e.
Ay(a) + By ′ (a) = 0 and Cy(b) + Dy ′ (b) = 0 , (2.36a)
the appropriate boundary conditions for G are
A G(a; ζ) + B ∂G/∂x (a; ζ) = 0 ,   (2.36b)
C G(b; ζ) + D ∂G/∂x (b; ζ) = 0 .   (2.36c)
For simplicity construct complementary functions y1 and y2 so that they satisfy the boundary condition at
a and b respectively, i.e. choose y1 and y2 so that
Ay1 (a) + By1′ (a) = 0 and Cy2 (b) + Dy2′ (b) = 0 . (2.37a)
Then
α+ = β− = 0 , (2.37b)
Remark. This method fails if the Wronskian W [y1 , y2 ] vanishes. This happens if y1 is proportional to y2 , i.e.
if there is a complementary function that happens to satisfy the homogeneous boundary conditions
both at x = a and x = b. In this case the equation Ly = f may not have a solution satisfying the
boundary conditions; if it does, the solution will not be unique (cf. resonance).
We also require that G is continuous and ∂G/∂x has a unit discontinuity at x = ζ, hence
β+(ζ) y2 (ζ) = α−(ζ) y1 (ζ)   and   β+(ζ) y2′(ζ) − α−(ζ) y1′(ζ) = 1 .   (2.40)
(ii) Find the Green's function for the two-point boundary-value problem
y″ + y = f (x)   for 0 ⩽ x ⩽ 1 ,   with y(0) = y(1) = 0 .
Answer. The complementary functions satisfying the left and right boundary conditions respectively are y1 = sin x and y2 = sin(x − 1), with Wronskian W = y1 y2′ − y2 y1′ = sin x cos(x − 1) − sin(x − 1) cos x = sin 1. Thus
G(x; ζ) = { sin x sin(ζ − 1)/sin 1 ,  0 ⩽ x ⩽ ζ ;  sin ζ sin(x − 1)/sin 1 ,  ζ ⩽ x ⩽ 1 } .   (2.45)
So, being careful to choose the correct expression for G depending on whether x ⩽ ζ or x ⩾ ζ,
y(x) = ∫_0^1 G(x; ζ) f (ζ) dζ
     = (sin(x − 1)/sin 1) ∫_0^x sin ζ f (ζ) dζ + (sin x/sin 1) ∫_x^1 sin(ζ − 1) f (ζ) dζ .   (2.46)
Suppose that instead of the two-point boundary conditions (2.25a) and (2.25b), we require the 'initial' conditions
G(a; ζ) = 0   and   Gx(a; ζ) = 0 ,
i.e. α− = β− = 0. The conditions that G be continuous and ∂G/∂x have a unit discontinuity at x = ζ then give that
α+(ζ) y1 (ζ) + β+(ζ) y2 (ζ) = 0   and   α+(ζ) y1′(ζ) + β+(ζ) y2′(ζ) = 1 ,
or in matrix form
( y1 (ζ)  y2 (ζ) ; y1′(ζ)  y2′(ζ) ) ( α+(ζ) ; β+(ζ) ) = ( 0 ; 1 ) ,   (2.50a)
with solution
( α+(ζ) ; β+(ζ) ) = (1/W (ζ)) ( y2′(ζ)  −y2 (ζ) ; −y1′(ζ)  y1 (ζ) ) ( 0 ; 1 ) = ( −y2 (ζ)/W (ζ) ; y1 (ζ)/W (ζ) ) .   (2.50b)
The Green's function is therefore
G(x; ζ) = { 0 ,  a ⩽ x < ζ ;  ( y1 (ζ) y2 (x) − y1 (x) y2 (ζ) )/W (ζ) ,  ζ ⩽ x ⩽ b } .   (2.51)
Example. Find the Green's function for the initial-value problem y″ + y = f (x) for x ⩾ 0, with y(0) = y′(0) = 0.
Answer. The complementary functions that satisfy the boundary conditions are y1 = sin x and y2 = cos x, with Wronskian
W = y1 y2′ − y2 y1′ = −sin²x − cos²x = −1 .
Further
y1 (ζ)y2 (x) − y1 (x)y2 (ζ) = sin ζ cos x − sin x cos ζ = sin(ζ − x) . (2.53b)
Thus
G(x; ζ) = { 0 ,  0 ⩽ x ⩽ ζ ;  sin(x − ζ) ,  x > ζ ,   (2.53c)
and thus
y(x) = ∫_0^x sin(x − ζ) f (ζ) dζ .   (2.53d)
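The initial-value formula (2.53d) can likewise be checked symbolically (a sketch, assuming sympy; the forcing f = cos x is an assumed illustrative choice, for which the integral evaluates to x sin x /2):

```python
# Check that (2.53d) with f(x) = cos x solves y'' + y = cos x
# with y(0) = y'(0) = 0.
import sympy as sp

x, zeta = sp.symbols('x zeta')
y = sp.integrate(sp.sin(x - zeta) * sp.cos(zeta), (zeta, 0, x))

assert sp.simplify(sp.diff(y, x, 2) + y - sp.cos(x)) == 0
assert sp.simplify(y.subs(x, 0)) == 0
assert sp.simplify(sp.diff(y, x).subs(x, 0)) == 0
```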
So far we have only considered problems with homogeneous boundary conditions. One can also use Green's
functions to solve problems with inhomogeneous boundary conditions. The trick is to solve the homogeneous
equation Lyibc = 0 for a function yibc which satisfies the inhomogeneous boundary conditions. Then solve the
inhomogeneous equation Lyhbc = f , perhaps using the Green’s function method discussed in this chapter,
imposing homogeneous boundary conditions on yhbc . Then linearity means that yibc + yhbc satisfies the
inhomogeneous equation with inhomogeneous boundary conditions.
Given a function f (x) such that
∫_{−∞}^{∞} |f (x)| dx < ∞ ,
we define its Fourier transform (FT) f̃(k) by
f̃(k) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ıkx} f (x) dx .   (3.1)
Notation. Sometimes it is clearer to denote the Fourier transform of a function f by F[f ] rather than f̃, i.e.
F[•] ≡ •̃ .   (3.2)
Remark. There are differing normalisations of the Fourier transform. Hence you will encounter definitions where the (2π)^{−1/2} is either not present or replaced by (2π)^{−1}, and other definitions where the −ıkx is replaced by +ıkx.
Property. If the function f (x) is real the Fourier transform f̃(k) is not necessarily real. However if f is both real and even, i.e. f *(x) = f (x) and f (x) = f (−x) respectively, then by using these properties and the substitution x = −y it follows that f̃ is real:
f̃ *(k) = (1/√(2π)) ∫_{−∞}^{∞} e^{ıkx} f *(x) dx   from c.c. of (3.1)
       = (1/√(2π)) ∫_{−∞}^{∞} e^{ıkx} f (−x) dx   since f *(x) = f (−x)
       = (1/√(2π)) ∫_{−∞}^{∞} e^{−ıky} f (y) dy   let x = −y
       = f̃(k) .   from (3.1)   (3.3)
Similarly we can show that if f is both real and odd, then fe is purely imaginary, i.e. fe∗ (k) = −fe(k).
Conversely it is possible to show using the Fourier inversion theorem (see below) that
For what follows it is helpful to rewrite this result by making the transformations x → −ℓ, k → x and ε → b to obtain
∫_{−∞}^{∞} e^{−ıℓx − b|x|} dx = 2b/(ℓ² + b²) .   (3.4)
We deduce from the definition of a Fourier transform, (3.1), and (3.4) with ℓ = k, that
F[e^{−b|x|}] = (1/√(2π)) ∫_{−∞}^{∞} e^{−ıkx − b|x|} dx = (1/√(2π)) · 2b/(k² + b²) .   (3.5)
The FTs of cos(ax) e^{−b|x|} and sin(ax) e^{−b|x|} (b > 0). Unlectured. From (3.1), the definition of cosine, and (3.4) first with ℓ = a − k and then with ℓ = a + k, it follows that
F[cos(ax) e^{−b|x|}] = (1/(2√(2π))) ∫_{−∞}^{∞} ( e^{ıax} + e^{−ıax} ) e^{−ıkx − b|x|} dx
                    = (b/√(2π)) [ 1/((a − k)² + b²) + 1/((a + k)² + b²) ] .   (3.6a)
This is real, as it has to be since cos(ax) e^{−b|x|} is even.
Similarly, from (3.1), the definition of sine, and (3.4) first with ℓ = a − k and then with ℓ = a + k, it follows that
F[sin(ax) e^{−b|x|}] = (1/(2ı√(2π))) ∫_{−∞}^{∞} ( e^{ıax} − e^{−ıax} ) e^{−ıkx − b|x|} dx
                    = (−ıb/√(2π)) [ 1/((a − k)² + b²) − 1/((a + k)² + b²) ] .   (3.6b)
This is purely imaginary, as it has to be since sin(ax) e^{−b|x|} is odd.
The FT of a Gaussian. From the definition (3.1), the completion of a square, and the substitution x = εy − ıε²k,¹⁷ it follows that
F[ (1/√(2πε²)) exp( −x²/(2ε²) ) ] = (1/(2πε)) ∫_{−∞}^{∞} exp( −x²/(2ε²) − ıkx ) dx
 = (1/(2πε)) ∫_{−∞}^{∞} exp( −½ (x/ε + ıεk)² − ½ ε²k² ) dx
 = (1/(2π)) exp( −½ ε²k² ) ∫_{−∞}^{∞} exp( −½ y² ) dy
 = (1/√(2π)) exp( −½ ε²k² ) .   (3.7)
Hence the FT of a Gaussian of width (standard deviation) ε is a Gaussian of width ε⁻¹. This illustrates a property of the Fourier transform: the narrower the function of x, the wider the function of k.
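Result (3.7) can be reproduced symbolically (a sketch, assuming sympy can evaluate the Gaussian integral with a complex linear term, which it does for real k and positive ε):

```python
# Check of (3.7): the FT (normalisation (3.1)) of a Gaussian of width eps
# is a Gaussian of width 1/eps.
import sympy as sp

x, k = sp.symbols('x k', real=True)
eps = sp.symbols('epsilon', positive=True)

g = sp.exp(-x**2 / (2*eps**2)) / sp.sqrt(2*sp.pi*eps**2)
gt = sp.integrate(g * sp.exp(-sp.I*k*x), (x, -sp.oo, sp.oo)) / sp.sqrt(2*sp.pi)

assert sp.simplify(gt - sp.exp(-eps**2 * k**2 / 2) / sp.sqrt(2*sp.pi)) == 0
```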
The FT of the delta function. From definitions (2.5) and (3.1) it follows that
F[δ(x − a)] = (1/√(2π)) ∫_{−∞}^{∞} δ(x − a) e^{−ıkx} dx = (1/√(2π)) e^{−ıka} .   (3.8a)
Hence the Fourier transform of δ(x) is 1/√(2π). Recalling the description of a delta function as a limit of a Gaussian, see (2.7a), we note that this result with a = 0 is consistent with (3.7) in the limit ε → 0+.
The FT of the step function. From (2.9) and (3.1) it follows that
F[H(x − a)] = (1/√(2π)) ∫_{−∞}^{∞} H(x − a) e^{−ıkx} dx
            = (1/√(2π)) ∫_a^{∞} e^{−ıkx} dx
            = (1/√(2π)) [ e^{−ıkx}/(−ık) ]_a^{∞} .
Interpreting the oscillatory contribution from the upper limit as zero (e.g. by inserting a convergence factor e^{−εx} and letting ε → 0+), we obtain
F[H(x − a)] = (1/√(2π)) e^{−ıka}/(ık) .   (3.8c)
¹⁷ This is a little naughty since it takes us into the complex x-plane. However, it can be fixed up once you have done Cauchy's theorem.
Remark. For future reference we observe from a comparison of (3.8a) and (3.8c) that
ıkF[H(x − a)] = F[δ(x − a)] . (3.8d)
The FT of the top-hat function. Consider the discontinuous 'top-hat' function g(x) defined by
g(x) = { c ,  a < x < b ;  0 ,  otherwise } .   (3.9a)
Then
√(2π) g̃(k) = c ∫_a^b e^{−ıkx} dx = (ıc/k) ( e^{−ıkb} − e^{−ıka} ) .   (3.9b)
For instance, if a = −1, b = 1 and c = 1,
√(2π) g̃(k) = (ı/k) ( e^{−ık} − e^{ık} ) = 2 sin k / k .   (3.9c)
Given a function f we can compute its Fourier transform f̃ from (3.1). For many functions the converse is also true, i.e. given the Fourier transform f̃ of a function we can reconstruct the original function f . To see this consider the following calculation (note the use of a dummy variable • to avoid an overabundance of x's):
(1/√(2π)) ∫_{−∞}^{∞} e^{ıkx} f̃(k) dk = (1/√(2π)) ∫_{−∞}^{∞} dk e^{ıkx} (1/√(2π)) ∫_{−∞}^{∞} d• e^{−ık•} f (•)   from definition (3.1)
 = ∫_{−∞}^{∞} d• f (•) (1/(2π)) ∫_{−∞}^{∞} dk e^{ık(x−•)}   swap integration order
 = ∫_{−∞}^{∞} d• f (•) δ(x − •)   from definition (2.6d)
 = f (x) .   from definition (2.5)
Hence if f̃ is given by the Fourier transform
f̃(k) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ıkx} f (x) dx ≡ F[f ] ,   (3.10a)
then the inverse transform (note the change of sign in the exponent) acting on f̃(k) recovers f (x), i.e.
f (x) = (1/√(2π)) ∫_{−∞}^{∞} e^{ıkx} f̃(k) dk ≡ I[f̃ ] .   (3.10b)
Note that
I[F[f ]] = f ,   and   F[I[f̃ ]] = f̃ .   (3.10c)
F[e^{−b|x|}](k) = (1/√(2π)) · 2b/(k² + b²) .
However, we have seen that Fourier transforms can be assigned in a wider sense to some functions
that do not satisfy all of these conditions, e.g. f (x) = 1.
Translation. The Fourier transform of f (x − α) for constant α is given by
F[f (x − α)](k) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ıkx} f (x − α) dx
 = (1/√(2π)) ∫_{−∞}^{∞} e^{−ık(y+α)} f (y) dy   x = y + α
 = e^{−ıkα} (1/√(2π)) ∫_{−∞}^{∞} e^{−ıky} f (y) dy   rearrange
 = e^{−ıkα} F[f (x)] .   from (3.1)   (3.14)
Exponential. Similarly, the Fourier transform of e^{ıαx} f (x) for constant α is given by
F[e^{ıαx} f (x)](k) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ı(k−α)x} f (x) dx = F[f (x)](k − α) .   (3.15)
Applying the result F[df /dx] = ık f̃ of (3.21a) repeatedly gives
F[d²f /dx²] = −k² f̃   and   F[dⁿf /dxⁿ] = (ık)ⁿ f̃ .   (3.21b)
Remark. That Fourier transforms allow a simple representation of derivatives of f (x) in Fourier space
has important consequences for solving differential equations.
Alternative proof (unlectured). This does not rely on the use of the inverse Fourier transform:
F[df /dx] = (1/√(2π)) ∫_{−∞}^{∞} f ′(x) e^{−ıkx} dx
 = (1/√(2π)) [ f (x) e^{−ıkx} ]_{−∞}^{∞} − (1/√(2π)) ∫_{−∞}^{∞} f (x)(−ık) e^{−ıkx} dx
 = ık f̃(k) .   (3.21c)
The integrated part vanishes because f (x) must tend to zero as x → ±∞ in order to possess a Fourier transform.
Multiplication by x. This time we differentiate (3.10a) with respect to k to obtain
df̃/dk (k) = (1/√(2π)) ∫_{−∞}^{∞} e^{−ıkx} ( −ıx f (x) ) dx .
Hence, after multiplying by ı, we deduce from (3.1) that (cf. (3.21a))
ı df̃/dk = F[x f (x)] .   (3.22)
Suppose that f (x) is a periodic function with period L (so that f (x + L) = f (x)). Then f can be represented by a Fourier series
f (x) = Σ_{n=−∞}^{∞} an exp( 2πınx/L ) ,   (3.23a)
where
an = (1/L) ∫_{−L/2}^{L/2} f (x) exp( −2πınx/L ) dx .   (3.23b)
Define kn = 2πn/L and ∆k = 2π/L, so that the series (3.23a) may be rewritten as
f (x) = (1/√(2π)) Σ_{n=−∞}^{∞} f̃(kn) exp( ıxkn ) ∆k ,
and
f̃(kn) = (1/√(2π)) ∫_{−L/2}^{L/2} f (x) exp( −ıxkn ) dx ,
where
f̃(kn) = L an/√(2π) ≡ √(2π) an/∆k .
We then see that in the limit ∆k → 0, i.e. L → ∞,
f (x) = (1/√(2π)) ∫_{−∞}^{∞} f̃(k) exp( ıxk ) dk ,   (3.25a)
and
f̃(k) = (1/√(2π)) ∫_{−∞}^{∞} f (x) exp( −ıxk ) dx .   (3.25b)
These are just our earlier definitions of the inverse Fourier transform (3.10b) and Fourier transform (3.1)
respectively.
The convolution of two functions f and g is defined by
(f ∗ g)(x) = ∫_{−∞}^{∞} f (y) g(x − y) dy .   (3.26)
The convolution expresses the amount of overlap of one function g as it is shifted over another function f .
In statistics, a continuous random variable x (for instance, the height of a person drawn at random from
the population) has a probability distribution (or density) function f (x). The probability of x lying in the
range x0 < x < x0 + δx in the limit of small δx is f (x0 )δx.
If x and y are independent random variables with distribution functions f (x) and g(y), then let the distri-
bution function of their sum, z = x + y, be h(z). For the above example, suppose y is the height of a soap
box drawn at random; then z would be the height of a random person while standing on the soap box.
For any given value of x, the probability that z lies in the range
z0 < z < z0 + δz , (3.27a)
is just the probability that y lies in the range
z0 − x < y < z0 − x + δz , (3.27b)
which is g(z0 − x)δz. That is for a given x; integrating over all possible x, weighted by f (x), the probability that z lies in this range is
h(z0) δz = ∫_{−∞}^{∞} f (x) g(z0 − x) δz dx ,   (3.27c)
which implies
h = f ∗ g .   (3.27d)
Applications. The effect of measuring, observing or processing scientific data can often be described as a
convolution of the data with a certain function. For instance:
(i) When a point source is observed by a telescope, a broadened image is seen, known as the point
spread function of the telescope. When an extended source is observed, the image that is seen is the
convolution of the source with the point spread function.
In this sense convolution corresponds to a broadening or distortion of the original data.
(ii) A point mass M at position R gives rise to a gravitational potential Φp (r) = −GM/|r − R|. A continuous mass density ρ(r) can be thought of as a sum of infinitely many point masses ρ(R) d³R at positions R. The resulting gravitational potential is
Φ(r) = −G ∫ ρ(R)/|r − R| d³R ,   (3.28)
which is the (3D) convolution of the mass density ρ(r) with the potential of a unit point mass at the origin, −G/|r|.
If the functions f and g have Fourier transforms F[f ] and F[g] respectively, then
F[f ∗ g] = √(2π) F[f ] F[g] .   (3.29)
Proof.
F[f ∗ g] = (1/√(2π)) ∫_{−∞}^{∞} dx e^{−ıkx} ∫_{−∞}^{∞} dy f (y) g(x − y)   from (3.1) & (3.26)
 = (1/√(2π)) ∫_{−∞}^{∞} dy f (y) ∫_{−∞}^{∞} dx e^{−ıkx} g(x − y)   swap integration order
 = (1/√(2π)) ∫_{−∞}^{∞} dy f (y) ∫_{−∞}^{∞} dz e^{−ık(z+y)} g(z)   x = z + y
 = (1/√(2π)) ∫_{−∞}^{∞} dy f (y) e^{−ıky} ∫_{−∞}^{∞} dz e^{−ıkz} g(z)   rearrange
 = √(2π) F[f ] F[g] .   from (3.1)
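The convolution theorem (3.29) can be observed numerically on a grid (a sketch, assuming numpy; the two Gaussians and the sampled wavenumbers are arbitrary illustrative choices, and the FT is approximated by a Riemann sum in the normalisation (3.1)):

```python
# Numerical check of F[f * g] = sqrt(2 pi) F[f] F[g] at a few wavenumbers.
import numpy as np

x = np.linspace(-20, 20, 4001)
dx = x[1] - x[0]
f = np.exp(-x**2)               # sample functions (assumed for illustration)
g = np.exp(-(x - 1)**2 / 2)

conv = np.convolve(f, g, mode='same') * dx   # (f * g)(x) on the grid

def ft(h, k):
    # FT with the notes' normalisation (3.1), as a Riemann sum
    return (h * np.exp(-1j * k * x)).sum() * dx / np.sqrt(2*np.pi)

ks = [0.0, 0.5, 1.0, 2.0]
lhs = np.array([ft(conv, k) for k in ks])
rhs = np.sqrt(2*np.pi) * np.array([ft(f, k) * ft(g, k) for k in ks])
assert np.allclose(lhs, rhs, atol=1e-5)
```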
Similarly, the Fourier transform of the product f g is given by
F[f g](k) = (1/√(2π)) (f̃ ∗ g̃)(k) ≡ (1/√(2π)) (F[f ] ∗ F[g])(k) .   from (3.26)
Remarks.
(i) Convolution is an operation best carried out as a multiplication in the Fourier domain.
(ii) The Fourier transform of a product is non-trivial.
(iii) Convolution can be undone (deconvolution) by a division in the Fourier domain. If g is known and
f ∗ g is measured, then f can be obtained, in principle.
Application (unlectured). Suppose a linear 'black box' (e.g. a circuit) has output G(ω) exp (ıωt) for a periodic input exp (ıωt). What is the output r(t) corresponding to input f (t)?
Remark. If we know the output of a linear black box for all possible harmonic inputs, then we know everything about the black box.
3.2.4 Correlation
The correlation of two functions, h = f ⊗ g, is defined by
h(x) = ∫_{−∞}^{∞} [f (y)]* g(x + y) dy .   (3.32)
Correlation is a way of quantifying the relationship between two (typically oscillatory) functions. If two
signals (oscillating about an average value of zero) oscillate in phase with each other, their correlation will
be positive. If they are out of phase, the correlation will be negative. If they are completely unrelated, their
correlation will be zero.
The Fourier transform of a correlation is
h̃(k) = (1/√(2π)) ∫_{−∞}^{∞} ( ∫_{−∞}^{∞} [f (y)]* g(x + y) dy ) e^{−ıkx} dx
 = (1/√(2π)) ∫_{−∞}^{∞} ∫_{−∞}^{∞} [f (y)]* g(z) e^{ıky} e^{−ıkz} dz dy   (z = x + y)
 = (1/√(2π)) ( ∫_{−∞}^{∞} f (y) e^{−ıky} dy )* ( ∫_{−∞}^{∞} g(z) e^{−ıkz} dz )
 = √(2π) [f̃(k)]* g̃(k) .   (3.33)
Remarks.
(i) This result (or the special case g = f ) is the Wiener–Khinchin theorem.
(ii) The autoconvolution and autocorrelation of f are f ∗ f and f ⊗ f . Their Fourier transforms are √(2π) f̃² and √(2π) |f̃|², respectively.
Remark. Parseval’s theorem means that the Fourier transform is a ‘unitary transformation’ that preserves
the ‘inner product’ between two functions (see later), in the same way that a rotation preserves lengths
and angles.
        ∫_{-∞}^{∞} k²/(1 + k²)⁴ dk = (π/8) ∫_{-∞}^{∞} x² e^{-2|x|} dx = (π/4) ∫_0^{∞} x² e^{-2x} dx = π/16 .    (3.36c)
An Application: Heisenberg’s Uncertainty Principle (unlectured). Suppose that
        ψ(x) = (2π∆x²)^{-1/4} exp( -x²/(4∆x²) )                                    (3.37)
is the [real] wave-function of a particle in quantum mechanics. Then, according to quantum mechanics,
        |ψ(x)|² = (2π∆x²)^{-1/2} exp( -x²/(2∆x²) ) ,                               (3.38)
is the probability density for finding the particle at position x, and ∆x is the root mean square deviation
in position.
Remark. There is unit probability of finding the particle somewhere, since |ψ(x)|² is a Gaussian of
width ∆x and
        ∫_{-∞}^{∞} |ψ(x)|² dx = (2π∆x²)^{-1/2} ∫_{-∞}^{∞} exp( -x²/(2∆x²) ) dx = 1 .    (3.39)
The Fourier transform of ψ(x) follows from (3.7) after the substitution ε = √2 ∆x and a multiplicative
normalisation:
        ψ̃(k) = (2∆x²/π)^{1/4} exp( -∆x² k² )
              = (2π∆k²)^{-1/4} exp( -k²/(4∆k²) )   where   ∆k = 1/(2∆x) .          (3.40)
Hence |ψ̃|² is another Gaussian, this time with a root mean square deviation in wavenumber of ∆k . In
agreement with Parseval's theorem,
        ∫_{-∞}^{∞} |ψ̃(k)|² dk = 1 .                                               (3.41)
In the case of the Gaussian, ∆k ∆x = 1/2. More generally, one can show that for any (possibly complex)
wave-function ψ(x),
        ∆k ∆x ⩾ 1/2 ,                                                              (3.42)
where ∆x and ∆k are, as for the Gaussian, the root mean square deviations of the probability
distributions |ψ(x)|² and |ψ̃(k)|², respectively. An important and well-known result follows from (3.42),
since in quantum mechanics the momentum is given by p = ℏk, where ℏ = h/2π and h is Planck's constant.
Hence if we interpret ∆x = ∆x and ∆p = ℏ∆k to be the uncertainty in the particle's position and
momentum respectively, then Heisenberg's Uncertainty Principle follows from (3.42), namely
        ∆p ∆x ⩾ ½ ℏ .                                                              (3.43)
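The equality ∆k ∆x = 1/2 for the Gaussian can be verified numerically. A sketch with an arbitrary choice of ∆x, using the FFT to approximate the continuous transform:

```python
import numpy as np

# rms widths of |ψ|² and |ψ~|² for the Gaussian wave-function (3.37);
# their product should be 1/2.  dx_param plays the role of ∆x.
dx_param = 1.3
x = np.linspace(-40.0, 40.0, 4096, endpoint=False)
psi = (2 * np.pi * dx_param**2) ** (-0.25) * np.exp(-x**2 / (4 * dx_param**2))

px = np.abs(psi) ** 2                        # position probability density
delta_x = np.sqrt(np.sum(x**2 * px) / np.sum(px))

h = x[1] - x[0]
k = 2 * np.pi * np.fft.fftfreq(x.size, d=h)  # wavenumber grid
pk = np.abs(np.fft.fft(psi)) ** 2            # proportional to |ψ~(k)|²
delta_k = np.sqrt(np.sum(k**2 * pk) / np.sum(pk))

product = delta_x * delta_k                  # ≈ 1/2 for the Gaussian
```

Repeating the experiment with a non-Gaussian ψ gives a product strictly greater than 1/2, in line with (3.42).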
3.4 Power spectra
The quantity
Φ(k) = |f˜(k)|2 (3.44)
appearing in the Wiener–Khinchin theorem and Parseval’s theorem is the (power) spectrum or (power)
spectral density of the function f (x). The Wiener–Khinchin theorem states that the Fourier Transform of
the autocorrelation function is the power spectrum.
This concept is often used to quantify the spectral content (as a function of angular frequency ω) of a
signal f (t).
The spectrum of a perfectly periodic signal consists of a series of delta functions at the principal frequency
and its harmonics, if present. Its autocorrelation function does not decay as t → ∞.
White noise is an ideal random signal with autocorrelation function proportional to δ(t): the signal is
perfectly decorrelated. It therefore has a flat spectrum (Φ = constant).
Less idealized signals may have spectra that are peaked at certain frequencies but also contain a general
noise component.
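These statements are easy to see in a numerical experiment. The sketch below estimates the spectrum of white noise by averaging periodograms over many realisations (all parameters are illustrative choices):

```python
import numpy as np

# Averaged periodogram of white noise: a flat spectrum, Φ ≈ constant.
rng = np.random.default_rng(2)
n, trials = 256, 2000
spec = np.zeros(n)
for _ in range(trials):
    f = rng.standard_normal(n)               # one white-noise realisation
    spec += np.abs(np.fft.fft(f)) ** 2 / n   # its periodogram
spec /= trials

flatness = spec.std() / spec.mean()          # small when the spectrum is flat
```

A single periodogram is very noisy (its standard deviation is comparable to its mean), which is why the average over realisations is needed before the flatness becomes visible.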
Consider, as an application, the ordinary differential equation
        d²ψ/dx² - a²ψ = -f(x) ,   with |ψ| → 0 as |x| → ∞ .
Taking the Fourier transform of the left-hand side,
        (1/√(2π)) ∫_{-∞}^{∞} e^{-ıkx} ( d²ψ/dx² - a²ψ ) dx = F(d²ψ/dx²) - a² F(ψ)       from (3.1)
                                                            = -k² F(ψ) - a² F(ψ)        from (3.21b).   (3.46a)
The same action on the right-hand side yields −F(f ). Hence from taking the Fourier transform of the whole
equation we have that
        F(ψ) = F(f) / (k² + a²) ,                                                  (3.46c)
and so from the inverse transform (3.10b) we have the solution
        ψ = (1/√(2π)) ∫_{-∞}^{∞} e^{ıkx} F(f)/(k² + a²) dk .                       (3.46d)
Remark. The boundary conditions that |ψ| → 0 as |x| → ∞ were implicitly used when we assumed that the
Fourier transform of ψ existed. Why?
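As a numerical sketch of this transform method, the equation can be solved on a large periodic grid standing in for the real line (legitimate here because f decays rapidly, so |ψ| → 0 well before the ends of the grid); the source f and the constants are illustrative:

```python
import numpy as np

# Solve ψ'' − a²ψ = −f by dividing f~ by (k² + a²), as in (3.46c)–(3.46d).
n, L, a = 512, 40.0, 1.5
x = np.linspace(-L / 2, L / 2, n, endpoint=False)
f = np.exp(-x**2)                             # rapidly decaying source

k = 2 * np.pi * np.fft.fftfreq(n, d=L / n)    # wavenumbers
psi = np.fft.ifft(np.fft.fft(f) / (k**2 + a**2)).real

# residual of ψ'' − a²ψ + f, with ψ'' computed spectrally
psi_xx = np.fft.ifft(-(k**2) * np.fft.fft(psi)).real
residual = np.max(np.abs(psi_xx - a**2 * psi + f))
```

Because k² + a² never vanishes for real k, the division is always well defined; this is the transform-space analogue of the operator ( d²/dx² − a² ) being invertible on functions that decay at infinity.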
4.1 Nomenclature
Partial differential equations (PDEs) are equations relating one or more unknown functions, say ψ, (the
dependent variable[s]) of two or more independent variables, say x, y, z and t, with one or more of the
functions' partial derivatives with respect to those variables. Hence a partial differential equation is an
equation of the form
        F( ψ, ∂ψ/∂x, ∂ψ/∂y, ∂²ψ/∂x², ∂²ψ/∂x∂y, ∂²ψ/∂y², . . . , x, y ) = 0        (4.1a)
involving ψ and any of its derivatives evaluated at the same point, e.g. Schrödinger’s equation (1.55b)
for the quantum mechanical wave function ψ(x, y, z, t) of a non-relativistic particle:
        -(ℏ²/2m) ( ∂²/∂x² + ∂²/∂y² + ∂²/∂z² ) ψ + V(r) ψ = iℏ ∂ψ/∂t .              (4.1b)
Order. The order of the highest derivative determines the order of the differential equation. Hence (4.1b)
is a second-order equation.
Linearity. If the system of differential equations is of the first degree in the dependent variables, then the
system is said to be linear, i.e. (4.1a) is linear if F depends linearly on ψ and its derivatives. Hence
Schrödinger’s equation is linear; however Euler’s equation for an inviscid fluid,
        ρ ( ∂u/∂t + (u · ∇)u ) = -∇p ,                                             (4.2)
where u is the velocity, ρ is the density and p is the pressure, is nonlinear in u.
Remarks.
(i) If g = 0 the equation is said to be homogeneous.
(ii) We will concentrate on examples where the coefficients, a, b, c, d, e and f are independent of x
and y, in which case the equation is said to have constant coefficients.
(iii) These ideas can be generalized to more than two independent variables (e.g. Schrödinger’s equa-
tion (4.1b) has four independent variables), or to systems of PDEs with more than one dependent
variable.
L is a linear operator. L is a linear operator since
        L(αψ + βφ) = α Lψ + β Lφ ,
where ψ and φ are any functions of x and y, and α and β are any constants.
all displacements y(x, t) are vertical (this is a bit of a cheat),
and resolve horizontally and vertically to obtain respectively
        ρ dx ∂²y/∂t² = T ( tan θ2 - tan θ1 )
                     = T ( ∂y/∂x (x + dx, t) - ∂y/∂x (x, t) )
                     = T ∂²y/∂x² dx + . . . ,                                      (4.8a)
and hence, in the infinitesimal limit, that
        ∂²y/∂t² = (T/ρ) ∂²y/∂x² .                                                  (4.8b)
This is the wave equation with wavespeed c = √(T/ρ). In general the one-dimensional wave equation is
        ∂²y/∂t² = c² ∂²y/∂x² .                                                     (4.8c)
Typical physical constants. For a violin (D-)string: T ≈ 40 N, and ρ ≈ 1 g m−1 so c ≈ 200 m s−1 .
Remarks.
(i) B obeys the same equation.
(ii) The pressure perturbation of a sound wave satisfies the scalar equivalent of this equation, where
c ≈ 300 m s−1 is the speed of sound.
E = −∇φ . (4.12)
It then follows from the first of Maxwell’s equations, (4.9a), that φ satisfies Poisson’s equation
        ∇²φ = -ρ/ϵ0 ,                                                              (4.13a)
i.e.
        ( ∂²/∂x² + ∂²/∂y² + ∂²/∂z² ) φ = -ρ/ϵ0 .                                   (4.13b)
Remark. The vector field E (and the vector field g below) is said to be generated by the potential φ. A
scalar potential is easier to work with because it does not have multiple components and its value
is independent of the coordinate system. The potential is also directly related to the energy of the
system.
∇ · g = −4πGρ , (4.14a)
and
∇ × g = 0, (4.14b)
where G is the gravitational constant and ρ is mass density. From the latter equation and (1.51) it follows
that there exists a gravitational potential φ such that
g = −∇φ . (4.15)
Thence from (4.14a) we deduce that the gravitational potential satisfies Poisson’s equation
∇2 φ = 4πGρ . (4.16)
Let Q(r, t) denote any chemical mass source per unit time per unit volume of the media. Then if the change
of chemical within the volume is to be equal to the flux of the chemical out of the surface in time δt
        ∬_S q · dS δt = - (d/dt) ∭_V C dV δt + ∭_V Q dV δt .                       (4.18a)
Hence using the divergence theorem (1.61), and exchanging the order of differentiation and integration,
        ∭_V ( ∇ · q + ∂C/∂t - Q ) dV = 0 .                                         (4.18b)
q = −D∇C , (4.20)
where D is the diffusion coefficient; the negative sign is necessary if chemical is to flow from high to low
concentrations. If D is constant then the partial differential equation governing the concentration is
        ∂C/∂t = D ∇²C + Q .                                                        (4.21)
Diffusion Equation. If there is no chemical source then Q = 0, and the governing equation becomes the
diffusion equation
        ∂C/∂t = D ∇²C .                                                            (4.22)
Poisson's Equation. If the system has reached a steady state (i.e. ∂/∂t ≡ 0), then with f(r) = Q(r)/D the
governing equation is Poisson's equation
        ∇²C = -f .                                                                 (4.23)
Laplace’s Equation. If the system has reached a steady state and there are no chemical sources then the
concentration is governed by Laplace’s equation
∇2 C = 0 . (4.24)
        ∬_S q · dS δt = - (d/dt) ∭_V ρE dV δt + ∭_V Q dV δt .
For ‘slow’ changes at constant pressure (1st and 2nd law of thermodynamics)
E(r, t) = cp θ(r, t) , (4.25)
where θ is the temperature and cp is the specific heat (assumed constant here). Hence using the divergence
theorem (1.61), and exchanging the order of differentiation and integration (cf. (4.18b)),
        ∭_V ( ∇ · q + ρ cp ∂θ/∂t - Q ) dV = 0 .                                    (4.26)
You may have already met the general idea of 'separability' when solving ordinary differential equations,
e.g. when you studied the special separable differential equations that can be written in the form
        X(x) dx = Y(y) dy .                                                        (4.32)
Sometimes functions can be written in separable form. For instance,
f (x, y) = cos x exp y = X(x)Y (y) , where X = cos x and Y = exp y , (4.33)
In contrast, the function
        g(x, y, z) = (x² + y² + z²)^{1/2}
is not separable in Cartesian coordinates, but is separable in spherical polar coordinates since
        g = R(r)Θ(θ)Φ(ϕ)   where   R = r , Θ = 1 and Φ = 1 .                       (4.35)
Solutions to partial differential equations can sometimes be found by seeking solutions that can be written
in separable form. However, we emphasise that not all solutions of partial differential equations can be
written in this form.
Seek solutions y(x, t) to the one dimensional wave equation (4.8c), i.e.
        ∂²y/∂t² = c² ∂²y/∂x² ,                                                     (4.37a)
of the form
y(x, t) = X(x)T (t) . (4.37b)
On substituting (4.37b) into (4.37a) we obtain
X T̈ = c2 T X ′′ , (4.38)
where a dot and a prime denote differentiation with respect to t and x respectively. After rearrangement we have that
        (1/c²) T̈(t)/T(t) = X′′(x)/X(x) = λ ,                                       (4.39a)
where λ is a constant: a function of t alone can equal a function of x alone only if both equal the same
constant. We have therefore split the PDE into two ODEs:
T̈ − c2 λT = 0 and X ′′ − λX = 0 . (4.39b)
Alternatively we could express this as
        y = ( Ãσ cosh σct + B̃σ sinh σct )( C̃σ e^{σx} + D̃σ e^{-σx} ) ,   or as
        y = ( Ak cos kct + Bk sin kct )( Ck cos kx + Dk sin kx ) .                 (4.40h)
Remark. Without loss of generality we could also impose a normalisation condition, say, Cj² + Dj² = 1.
Solutions (4.40b), (4.40e) and (4.40h) represent three families of solutions.20 Although they are based on a
special assumption, we shall see that because the wave equation is linear they can represent a wide range
of solutions by means of superposition. However, before going further it is helpful to remember that when
solving a physical problem boundary and initial conditions are also needed.
Boundary Conditions. Suppose that the string considered in § 4.2.1 has ends at x = 0 and x = L that are
fixed; appropriate boundary conditions are then
        y(0, t) = 0   and   y(L, t) = 0 .                                          (4.41)
It is no coincidence that there are boundary conditions at two values of x and the highest derivative
in x is second order.
Initial Conditions. Suppose also that the initial displacement and initial velocity of the string are known;
appropriate initial conditions are then
        y(x, 0) = d(x)   and   ∂y/∂t (x, 0) = v(x) .                               (4.42)
Again it is no coincidence that we need two initial conditions
and the highest derivative in t is second order.
20 Or arguably one family if you wish to nit pick in the complex plane.
        k = nπ/L ,
where n is a non-zero integer. These special values of k are eigenvalues and the corresponding eigen-
functions, or normal modes, are
        Xn = D_{nπ/L} sin( nπx/L ) .                                               (4.45)
Hence, from (4.40h), solutions to (4.8c) that satisfy the boundary condition (4.41) are
        yn(x, t) = ( An cos(nπct/L) + Bn sin(nπct/L) ) sin( nπx/L ) ,              (4.46)
where we have written An for A_{nπ/L} D_{nπ/L} and Bn for B_{nπ/L} D_{nπ/L} . Since (4.8c) is linear we
can superimpose (i.e. add) solutions to get the general solution
        y(x, t) = Σ_{n=1}^{∞} ( An cos(nπct/L) + Bn sin(nπct/L) ) sin( nπx/L ) ,   (4.47)
where there is no need to run the sum from −∞ to ∞ because of the symmetry properties of sin and cos.
We note that when the solution is viewed as a function of x at fixed t, or as a function of t at fixed x, then
it has the form of a Fourier series.
The solution (4.47) satisfies the boundary conditions (4.41) by con-
struction. The only thing left to do is to satisfy the initial conditions
(4.42), i.e. we require that
        y(x, 0) = d(x) = Σ_{n=1}^{∞} An sin( nπx/L ) ,                             (4.48a)
        ∂y/∂t (x, 0) = v(x) = Σ_{n=1}^{∞} (nπc/L) Bn sin( nπx/L ) .                (4.48b)
An and Bn can now be found using the orthogonality relations for sin (see (0.12a)), i.e.
        ∫_0^L sin( nπx/L ) sin( mπx/L ) dx = (L/2) δnm .                           (4.49)
Hence for an integer m > 0
        (2/L) ∫_0^L d(x) sin( mπx/L ) dx = (2/L) ∫_0^L ( Σ_{n=1}^{∞} An sin( nπx/L ) ) sin( mπx/L ) dx
                                         = Σ_{n=1}^{∞} (2An/L) ∫_0^L sin( nπx/L ) sin( mπx/L ) dx
                                         = Σ_{n=1}^{∞} (2An/L) (L/2) δnm           using (4.49)
                                         = Am ,                                    using (1.14c)   (4.50a)
or alternatively invoke standard results for the coefficients of Fourier series. Similarly
        Bm = (2/(mπc)) ∫_0^L v(x) sin( mπx/L ) dx .                                (4.50b)
4.5 Poisson’s Equation
Suppose we are interested in obtaining solutions to Poisson’s equation
∇2 θ = −f , (4.51a)
where, say, θ is a steady temperature distribution and f = Q/(ρcp ν 2 ) is a scaled heat source (see (4.29)).
For simplicity let the world be two-dimensional, then (4.51a) becomes
        ( ∂²/∂x² + ∂²/∂y² ) θ = -f .                                               (4.51b)
Suppose we seek a separable solution as before, i.e. θ(x, y) = X(x)Y (y). Then on substituting into (4.51b)
we obtain
        X′′/X = -Y′′/Y - f/(XY) .                                                  (4.52)
It follows that unless we are very fortunate, and f (x, y) has a particular form (e.g. f = 0), it does not look
like we will be able to find separable solutions.
In order to make progress the trick is to first find a[ny] particular solution, θs , to (4.51b) (cf. finding a
particular solution when solving constant coefficient ODEs last year). The function φ = θ − θs then satisfies
Laplace’s equation
∇2 φ = 0 . (4.53)
This is just Poisson’s equation with f = 0, for which we have just noted that separable solutions exist. To
obtain the full solution we need to add these [countably infinite] separable solutions to our particular solution
(cf. adding complementary functions to a particular solution when solving constant coefficient ODEs last
year).
4.5.1 A Particular Solution
We will illustrate the method by considering the particular
example where the heating f is uniform, f = 1 wlog (since
the equation is linear), in a semi-infinite rod, 0 ⩽ x, of unit
width, 0 ⩽ y ⩽ 1.
Then we might expect the particular solution for the temperature θs to be independent of x, i.e. θs ≡ θs (y).
Poisson’s equation (4.51b) then reduces to
        d²θs/dy² = -1 ,                                                            (4.54a)
which has solution
        θs = a0 + b0 y - ½ y² ,                                                    (4.54b)
where a0 and b0 are constants.
4.5.2 Boundary Conditions
For the rod problem, experience suggests that we need to specify one of the following at all points on the
boundary of the rod:
• the temperature (a Dirichlet condition), i.e.
θ = g(r) , (4.55a)
where g(r) is a known function;
• the scaled heat flux (a Neumann condition), i.e.
        ∂θ/∂n ≡ n̂ · ∇θ = h(r) ,                                                   (4.55b)
where h(r) is a known function;
        θs = ½ y(1 - y) ⩾ 0 .                                                      (4.57)
Let φ = θ - θs , then φ satisfies Laplace's equation (4.53) and, from (4.56a), (4.56b) and (4.57), the boundary
conditions
        φ(x, 0) = 0   and   φ(x, 1) = 0   for x ⩾ 0 ,                              (4.58a)
        φ(0, y) = -½ y(1 - y)   and   φ → 0 as x → ∞ .                             (4.58b)
On writing φ(x, y) = X(x)Y (y) and substituting into Laplace’s equation (4.53) it follows that (cf. (4.52))
        X′′(x)/X(x) = -Y′′(y)/Y(y) = λ ,                                           (4.59a)
so that
        X′′ - λX = 0   and   Y′′ + λY = 0 .                                        (4.59b)
We can now consider each of the possibilities λ = 0, λ > 0 and λ < 0 in turn to obtain, cf. (4.40b), (4.40e)
and (4.40h),
λ = 0.
φ = (A0 + B0 x)(C0 + D0 y) . (4.60a)
λ = σ² > 0.
        φ = ( Aσ e^{σx} + Bσ e^{-σx} )( Cσ cos σy + Dσ sin σy ) .                  (4.60b)
λ = −k 2 < 0.
φ = (Ak cos kx + Bk sin kx) (Ck cosh ky + Dk sinh ky) . (4.60c)
The boundary conditions at y = 0 and y = 1 in (4.58a) state that φ(x, 0) = 0 and φ(x, 1) = 0. This implies
(cf. the stretched string problem) that solutions proportional to sin(nπy) are appropriate; hence we try
λ = n²π², where n is an integer. The eigenfunctions are thus
        φn = ( An e^{nπx} + Bn e^{-nπx} ) sin(nπy) ,
where An and Bn are constants and, without loss of generality, n > 0. However, if the boundary condition
in (4.58b) as x → ∞ is to be satisfied then An = 0. Hence the solution has the form
        φ = Σ_{n=1}^{∞} Bn e^{-nπx} sin(nπy) .                                     (4.62)
The Bn are fixed by the first boundary condition in (4.58b), i.e. we require that
        -½ y(1 - y) = Σ_{n=1}^{∞} Bn sin(nπy) .                                    (4.63a)
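The coefficients in this expansion can be computed by quadrature; for this boundary profile the standard sine-series result is Bn = −4/(n³π³) for odd n and Bn = 0 for even n, which the sketch below confirms numerically:

```python
import numpy as np

# B_n = 2 ∫₀¹ ( -½ y(1-y) ) sin(nπy) dy, evaluated by the trapezium rule.
y = np.linspace(0.0, 1.0, 4001)
dy = y[1] - y[0]
f = -0.5 * y * (1.0 - y)

def Bn(n):
    integrand = f * np.sin(n * np.pi * y)
    return 2.0 * dy * integrand.sum()        # endpoint values are zero

b1, b2, b3 = Bn(1), Bn(2), Bn(3)
# expect b1 ≈ -4/π³, b2 ≈ 0, b3 ≈ -4/(27π³)
```

The even coefficients vanish because −½ y(1−y) is symmetric about y = ½ while sin(2mπy) is antisymmetric about it.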
4.6.1 Separable Solutions
Seek solutions C(x, t) to the one dimensional version of the diffusion equation, (4.22), i.e.
        ∂C/∂t = D ∂²C/∂x² ,                                                        (4.65)
of the form
C(x, t) = X(x)T (t) . (4.66)
On substitution we obtain
X Ṫ = D T X ′′ . (4.67)
After rearrangement we have that
        (1/D) Ṫ(t)/T(t) = X′′(x)/X(x) = λ ,                                        (4.68a)
where λ is again a constant. We have therefore split the PDE into two ODEs:
Ṫ − DλT = 0 and X ′′ − λX = 0 . (4.68b)
λ = 0. In this case
Ṫ (t) = X ′′ (x) = 0 ⇒ T = α0 and X = β0 + γ0 x , (4.69a)
where α0 , β0 and γ0 are constants. Combining these results we obtain
C = α0 (β0 + γ0 x) ,
or
C = β0 + γ 0 x , (4.69b)
since, without loss of generality (wlog), we can take α0 = 1.
λ = σ 2 > 0. In this case
Ṫ − Dσ 2 T = 0 and X ′′ − σ 2 X = 0 . (4.69c)
Hence
T = ασ exp(Dσ 2 t) and X = βσ cosh σx + γσ sinh σx , (4.69d)
where ασ , βσ and γσ are constants. On taking ασ = 1 wlog,
C = exp(Dσ 2 t) (βσ cosh σx + γσ sinh σx) . (4.69e)
Suppose that initially there is no chemical present, i.e.
        C(x, 0) = 0 .                                                              (4.70a)
Suppose also that for t > 0 the concentration of the chemical is maintained at C0 at x = 0, and is 0 at
x = L, i.e.
C(0, t) = C0 and C(L, t) = 0 for t > 0 . (4.70b)
Again it is no coincidence that there are two boundary conditions and the highest derivative in x is second
order.
Remark. Equation (4.21) and conditions (4.70a) and (4.70b) are mathematically equivalent to a description
of the temperature of a rod of length L which is initially at zero temperature before one of the ends
is raised instantaneously to a constant non-dimensional temperature of C0 .
4.6.3 Solution
The trick here is to note that
• the inhomogeneous (i.e. non-zero) boundary condition at x = 0, i.e C(0, t) = C0 , is steady, and
• the separable solutions (4.69e) and (4.69h) depend on time, while (4.69b) does not.
It therefore seems sensible to try to satisfy the boundary conditions (4.70b) using the solution (4.69b).
If we call this part of the total solution C∞(x) then, with β0 = C0 and γ0 = -C0/L in (4.69b),
        C∞(x) = C0 ( 1 - x/L ) ,                                                   (4.71)
which is just a linear variation in C from C0 at x = 0 to 0 at x = L. Write
        C(x, t) = C∞(x) + C̃(x, t) ,
where C̃ is a sum of the separable time-dependent solutions (4.69e) and (4.69h). Then from the initial
condition (4.70a), the boundary conditions (4.70b), and the steady solution (4.71), it follows that
        C̃(x, 0) = -C0 ( 1 - x/L ) ,                                               (4.73a)
and
        C̃(0, t) = 0   and   C̃(L, t) = 0   for t > 0 .                             (4.73b)
If the homogeneous boundary conditions (4.73b) are to be satisfied then, as for the wave equation, separable
solutions with λ > 0 are unacceptable, while λ = -k² < 0 is only acceptable if k = nπ/L for a non-zero
integer n.
Hence
        Γm = -(2C0/L) ∫_0^L ( 1 - x/L ) sin( mπx/L ) dx = -2C0/(mπ) .              (4.76b)
The solution is thus given by
        C = C0 ( 1 - x/L ) - Σ_{n=1}^{∞} (2C0/(nπ)) exp( -n²π²Dt/L² ) sin( nπx/L ) .    (4.77a)
        C = C0 ( 1 - x/L ) - Σ_{n=1}^{∞} (2C0/(nπ)) exp( -n²π²Dt/L² ) sin( nπx/L ) .    (4.77b)
[Figure: the solution (4.77b) with C0 = 1 and L = 1, plotted at times t = 0.0001, t = 0.001, t = 0.01, t = 0.1 and t = 1 (curves from left to right respectively).]
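A truncated form of the series (4.77a) is simple to evaluate, and one can check the boundary conditions (4.70b) and the approach to the steady state (4.71); a sketch with illustrative constants:

```python
import numpy as np

# Truncated series solution (4.77a).
C0, L, D = 1.0, 1.0, 1.0

def C(x, t, nmax=200):
    total = C0 * (1.0 - x / L)
    for n in range(1, nmax + 1):
        total = total - (2 * C0 / (n * np.pi)
                         * np.exp(-n**2 * np.pi**2 * D * t / L**2)
                         * np.sin(n * np.pi * x / L))
    return total

x = np.linspace(0.0, L, 101)
steady_err = np.max(np.abs(C(x, 1.0) - C0 * (1 - x / L)))  # t large: near steady state
left = C(0.0, 0.01)                     # boundary value at x = 0 should be C0
right = C(L, 0.01)                      # boundary value at x = L should be 0
```

The exponential factors decay like exp(−n²π²Dt/L²), so at even modest times only the first few modes contribute and the solution relaxes rapidly towards the linear steady profile.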
Consider the diffusion equation (see (4.22) or (4.29)) governing the evolution of, say, temperature, θ(x, t):
        ∂θ/∂t = ν ∂²θ/∂x² .                                                        (4.79)
In § 4.6 we have seen how separable solutions and Fourier series can be used to solve (4.79) over finite
x-intervals. Fourier transforms can be used to solve (4.79) when the range of x is infinite.21
We will assume boundary conditions such as
        θ → constant   and   ∂θ/∂x → 0   as |x| → ∞ ,                              (4.80)
so that the Fourier transform of θ exists (at least in a generalised sense):
        θ̃(k, t) = (1/√(2π)) ∫_{-∞}^{∞} e^{-ıkx} θ dx .                             (4.81)
If we then multiply the left hand side of (4.79) by (1/√(2π)) exp(-ıkx) and integrate over x we obtain the
time derivative of θ̃:
        (1/√(2π)) ∫_{-∞}^{∞} e^{-ıkx} ∂θ/∂t dx = (∂/∂t) (1/√(2π)) ∫_{-∞}^{∞} e^{-ıkx} θ dx     swap differentiation and integration
                                               = ∂θ̃/∂t                             from (4.81) .
A similar manipulation of the right hand side of (4.79) yields
        (1/√(2π)) ∫_{-∞}^{∞} e^{-ıkx} ν ∂²θ/∂x² dx = -νk² θ̃           from (3.21b) .
Putting the left hand side and the right hand side together it follows that θ̃(k, t) satisfies
        ∂θ̃/∂t + νk² θ̃ = 0 .                                                       (4.82a)
This equation has solution
        θ̃(k, t) = γ(k) exp( -νk²t ) ,                                             (4.82b)
where γ(k) is an unknown function of k (cf. the Γn in (4.75)).
Suppose that the temperature distribution is known at a specific time, wlog t = 0. Then from evaluating
(4.82b) at t = 0 we have that
        γ(k) = θ̃(k, 0)   and so   θ̃(k, t) = θ̃(k, 0) exp( -νk²t ) .               (4.83)
But from definition (3.1)
        θ̃(k, 0) = (1/√(2π)) ∫_{-∞}^{∞} e^{-ıky} θ(y, 0) dy ,                      (4.84a)
and so
        θ̃(k, t) = (1/√(2π)) ∫_{-∞}^{∞} exp( -ıky - νk²t ) θ(y, 0) dy .            (4.84b)
We can now use the Fourier inversion formula to find θ(x, t):
        θ(x, t) = (1/√(2π)) ∫_{-∞}^{∞} dk e^{ıkx} θ̃(k, t)                                              from (3.10b)
                = (1/√(2π)) ∫_{-∞}^{∞} dk e^{ıkx} (1/√(2π)) ∫_{-∞}^{∞} exp( -ıky - νk²t ) θ(y, 0) dy    from (4.84b)
                = (1/(2π)) ∫_{-∞}^{∞} dy θ(y, 0) ∫_{-∞}^{∞} dk exp( ık(x - y) - νk²t )                  swap integration order.
21 Semi-infinite ranges can also be tackled by means of suitable ‘tricks’: see the example sheet.
namely
        θ(x, t) = ( θ0/√(4πνt) ) exp( -x²/(4νt) ) .                                (4.86b)
Physically this means that if the temperature at one point of an infinite rod is instantaneously raised to
'infinity', then the resulting temperature distribution is that of a Gaussian with a maximum temperature
decaying like t^{-1/2} and a width increasing like t^{1/2}.
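A quick finite-difference check that (4.86b) satisfies the diffusion equation, and that the rms width of the profile is √(2νt); the parameter values are arbitrary:

```python
import numpy as np

# The similarity solution (4.86b) and its residual in ∂θ/∂t = ν ∂²θ/∂x².
nu, theta0 = 0.7, 1.0

def theta(x, t):
    return theta0 / np.sqrt(4 * np.pi * nu * t) * np.exp(-x**2 / (4 * nu * t))

x = np.linspace(-10.0, 10.0, 2001)
t, dt, dx = 1.0, 1e-5, x[1] - x[0]

# centred finite differences in t and x
dtheta_dt = (theta(x, t + dt) - theta(x, t - dt)) / (2 * dt)
d2theta_dx2 = (theta(x + dx, t) - 2 * theta(x, t) + theta(x - dx, t)) / dx**2
residual = np.max(np.abs(dtheta_dt - nu * d2theta_dx2))

# rms width of the temperature profile at time t: expect sqrt(2νt)
w = np.sqrt(np.sum(x**2 * theta(x, t)) / np.sum(theta(x, t)))
```

The residual is limited only by the finite-difference truncation error, and the measured width grows like t^{1/2} exactly as described above.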
Inter alia, we will study eigenvalues and eigenvectors; these are characteristic numbers and directions
associated with matrices, which allow them to be expressed in the simplest form. Moreover, the matrices that
occur in scientific applications usually have special symmetries that impose conditions on their eigenvalues
and eigenvectors, e.g. Hermitian matrices (observables in quantum mechanics are Hermitian operators).
5.1.2 Definition
A set of elements, or 'vectors', is said to form a complex linear vector space V if
(i) there exists a binary operation, say addition, under which the set V is closed so that
if u, v ∈ V , then u + v ∈ V ; (5.1a)
(iv) multiplication by a scalar is distributive and associative, i.e. for all a, b ∈ C and u, v ∈ V
a(u + v) = au + av , (5.1e)
(a + b)u = au + bu , (5.1f)
a(bu) = (ab)u ; (5.1g)
(v) there exists a null, or zero, vector 0 ∈ V such that for all v ∈ V
v +0 = v; (5.1h)
(vi) for all v ∈ V there exists a negative, or inverse, vector (−v) ∈ V such that
v + (−v) = 0 . (5.1i)
(iv) We will often refer to V as a vector space, rather than the more correct linear vector space.
The basic example of a vector space is F n . An element of F n is a list of n scalars, (x1 , . . . , xn ), where xi ∈ F .
This is called an n-tuple. Vector addition and scalar multiplication are defined component-wise:
(x1 , . . . , xn ) + (y1 , . . . , yn ) = (x1 + y1 , . . . , xn + yn ) (5.3a)
α(x1 , . . . , xn ) = (αx1 , . . . , αxn ) (5.3b)
A linear combination of vectors u1 , u2 , . . . , um is any vector of the form a1 u1 + a2 u2 + · · · + am um = ai ui ,
where a1 , a2 , . . . , am are scalars and, henceforth, we will use the summation convention.
Definition: Span. The span of S is the set of all vectors that are linear combinations of S. If the span of S
is the entire vector space V , then S is said to span V .
Definition: Linear Independence. The vectors u1 , u2 , . . . , um are said to be linearly independent if the only
solution of
        ai ui = 0   (s.c.)
is ai = 0 for all i; otherwise the vectors are linearly dependent.
Definition: Dimension of a Vector Space. If a vector space V contains a set of n linearly independent vectors
but all sets of n + 1 vectors are linearly dependent, then V is said to be of dimension n.
Examples.
(i) Since
(a, b, c) = a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) , (5.6)
the vectors (1, 0, 0), (0, 1, 0) and (0, 0, 1) span a linear vector space of dimension 3.
(ii) (1, 0, 0), (0, 1, 0) and (0, 0, 1) are linearly independent since
        a(1, 0, 0) + b(0, 1, 0) + c(0, 0, 1) = (a, b, c) = 0   ⇒   a = b = c = 0 .
(iii) (1, 0, 0), (0, 1, 0) and (1, 1, 0) are linearly dependent since (1, 1, 0) = (1, 0, 0) + (0, 1, 0).
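These examples can be checked mechanically: a set of vectors is linearly independent exactly when the matrix having those vectors as rows has rank equal to the number of vectors. A sketch:

```python
import numpy as np

# rank = number of vectors  <=>  linear independence
indep = np.array([[1, 0, 0],
                  [0, 1, 0],
                  [0, 0, 1]])
dep = np.array([[1, 0, 0],
                [0, 1, 0],
                [1, 1, 0]])    # third row = first row + second row

rank_indep = np.linalg.matrix_rank(indep)   # 3: linearly independent
rank_dep = np.linalg.matrix_rank(dep)       # 2: linearly dependent
```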
Remarks.
(i) If an additional vector is included in a spanning set, it remains a spanning set.
(ii) If a vector is removed from a linearly independent set, the set remains linearly independent.
(i) We claim that for all vectors v ∈ V , there exist scalars vi such that
v = vi ui . (5.7a)
The vi are said to be the components of v with respect to the basis {u1 , . . . , un }.
Proof (unlectured). To see this we note that since V has dimension n, the set {u1 , . . . , un , v} is linearly
dependent, i.e. there exist scalars (a1 , . . . , an , b), not all zero, such that
ai ui + bv = 0 . (5.7b)
If b = 0 then ai = 0 for all i because the ui are linearly independent, and we have a contradiction;
hence b ≠ 0. Multiplying by b⁻¹ we have that
        v = -b⁻¹ ai ui = vi ui ,                                                   (5.7c)
Then, because v − v = 0,
0 = (vi − wi )ui . (5.8b)
But the ui (i = 1, . . . , n) are linearly independent, so the only solution of this equation is vi − wi = 0
(i = 1, . . . n). Hence vi = wi (i = 1, . . . n), and we conclude that the two linear combinations (5.8a) are
identical. 2
Remarks. In (ii) and (i) below, {u1 , . . . , um } are a set of vectors in an n-dimensional vector space.
(i) If m < n then there exists a vector that cannot be expressed as a linear combination of the ui .
(ii) If m > n then there exists some vector that, when expressed as a linear combination of the ui ,
has non-unique scalar coefficients. This is true whether or not the ui span V .
(iii) Vector spaces can have infinite dimension, e.g. the set of functions defined on the interval
0 ⩽ x < 2π and having Fourier series
        f(x) = Σ_{n=-∞}^{∞} fn e^{inx} .                                           (5.9)
Here f(x) is the 'vector' and fn are its 'components' with respect to the 'basis' of functions e^{inx}.
Functional analysis deals with such infinite-dimensional vector spaces.
Examples.
(i) Three-Dimensional Euclidean Space E3 . In this case the scalars are real and V is three-dimensional
because every vector v can be written uniquely as (cf. (5.6))
v = vx ex + vy ey + vz ez (5.10a)
= v1 u1 + v2 u2 + v3 u3 , (5.10b)
and moreover
However, we might alternatively consider the complex numbers as a linear vector space over R, so that
the scalars are real. In this case the pair of ‘vectors’ {1, i} constitute a basis because every complex
number z can be written uniquely as
z = a · 1 + b · ı where a, b ∈ R , (5.12a)
and
Remarks.
(i) R³ is not quite the same as physical space, because physical space has a rule for the distance
between two points (i.e. Pythagoras's theorem, if physical space is approximated as Euclidean).
(ii) R² is not quite the same as C, because C has a rule for multiplication.
Worked exercise. Show that 2 × 2 real symmetric matrices form a real linear vector space under addition.
Show that this space has dimension 3 and find a basis.
Answer. Let V be the set of all real symmetric matrices, and let
        A = ( αa  βa        B = ( αb  βb        C = ( αc  βc
              βa  γa ) ,          βb  γb ) ,          βc  γc ) ,
under addition, and that the 'vectors' Ui defined in (5.15) form a basis.
Exercise. Show that 3 × 3 symmetric real matrices form a vector space under addition. Show that this space
has dimension 6 and find a basis.
Similarly, since the {u′i : i = 1, . . . , n} is a basis, the individual basis vectors of the basis {ui : i = 1, . . . , n}
can be written as
ui = u′k Bki (i = 1, 2, . . . , n) , (5.21a)
for some numbers Bki . Here Bki is the kth component of the vector ui in the basis {u′k : k = 1, . . . , n}.
Again the Bki can be viewed as the entries of a matrix B:
        B = ( B11  B12  · · ·  B1n
              B21  B22  · · ·  B2n
               ⋮     ⋮    ⋱     ⋮
              Bn1  Bn2  · · ·  Bnn ) .                                             (5.21b)
u′ = uA . (5.28c)
From a comparison between (5.28b) and (5.28c) we see that the components of v transform inversely
to the way that the basis vectors transform. This is so that the vector v is unchanged:
        v = vj′ u′j                                from (5.25)
          = (A⁻¹)jk vk ( ui Aij )                  from (5.28b) and (5.20a)
          = ui vk Aij (A⁻¹)jk                      swap summation order
          = ui ( vk δik )                          AA⁻¹ = I
          = vi ui .                                contract using (1.14c)
Worked example. Let {u1 = (1, 0), u2 = (0, 1)} and {u′1 = (1, 1), u′2 = (−1, 1)} be two sets of basis vectors
in R2 . Find the transformation matrix Aij that connects them. Verify the transformation law for the
components of an arbitrary vector v in the two coordinate systems.
Answer. We have from direct substitution and using (5.20a) that
        u′1 = ( 1, 1) =   (1, 0) + (0, 1) =  u1 + u2 = uj Aj1 ,
        u′2 = (-1, 1) = -(1, 0) + (0, 1) = -u1 + u2 = uj Aj2 .
Hence
        A11 = 1 , A21 = 1 , A12 = -1 and A22 = 1 ,
i.e.
        A = ( 1  -1        with inverse   A⁻¹ = ½ (  1  1
              1   1 ) ,                             -1  1 ) .
First Check. Note that A⁻¹ is consistent with (5.21a), (5.23a) and the observation that
        u1 = (1, 0) = ½( (1, 1) - (-1, 1) ) = ½( u′1 - u′2 ) = u′j (A⁻¹)j1 ,
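The worked example can also be checked in matrix form, verifying both the inverse and the transformation of components (the components (3, 2) below are an arbitrary illustrative choice):

```python
import numpy as np

# Columns of A hold the components of the new basis vectors in the old
# basis; components of a vector transform with A⁻¹ (v'_j = (A⁻¹)_{jk} v_k).
A = np.array([[1.0, -1.0],
              [1.0,  1.0]])
A_inv = np.linalg.inv(A)

v_old = np.array([3.0, 2.0])                 # components in {u1, u2}
v_new = A_inv @ v_old                        # components in {u'1, u'2}

u_new = np.array([[1.0, 1.0],
                  [-1.0, 1.0]])              # rows: u'1, u'2
v_back = v_new @ u_new                       # the same vector, reassembled
```

Reassembling the vector from the new components and the new basis returns the original components, illustrating that basis vectors and components transform inversely to one another.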
Orthogonal matrix. A square n × n matrix is orthogonal if its transpose is equal to its inverse:
        Aᵀ = A⁻¹   or   AAᵀ = AᵀA = 1 .
These ideas can be generalized to a complex vector space; however, first we need a definition.
Hermitian conjugate. The Hermitian conjugate of a matrix A is defined to be the complex conjugate (de-
noted by ∗ ) of its transpose, i.e.
        A† = (Aᵀ)∗ = (A∗)ᵀ .
For example,
        if   A = ( A11  A12  A13        then   A† = ( A11∗  A21∗
                   A21  A22  A23 ) ,                  A12∗  A22∗
                                                      A13∗  A23∗ ) .               (5.31a)
Similarly, the Hermitian conjugate of a column matrix x, with entries x1 , x2 , . . . , xn , is the row matrix
        x† = ( x1∗  x2∗  · · ·  xn∗ ) .                                            (5.31b)
The Hermitian conjugate of a product of matrices. For matrices A and B recall that (AB)T = BT AT . Hence
(AB)T∗ = BT∗ AT∗ , and so
(AB)† = B† A† . (5.32b)
(ABCx)† = x† C† B† A† , (5.32c)
        (x† A y)† = y† A† x .                                                      (5.32d)
In the latter example, if x and y are column matrices, each side of the equation is a scalar (a complex
number). The Hermitian conjugate of a scalar is just the complex conjugate.
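The product rule (5.32b) is easy to confirm numerically for random complex matrices (the shapes below are arbitrary, chosen rectangular to exercise the transpose):

```python
import numpy as np

# Check (AB)† = B†A† for rectangular complex matrices.
rng = np.random.default_rng(3)
A = rng.standard_normal((3, 4)) + 1j * rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2)) + 1j * rng.standard_normal((4, 2))

lhs = (A @ B).conj().T                       # (AB)†
rhs = B.conj().T @ A.conj().T                # B†A†
err = np.max(np.abs(lhs - rhs))
```

Note that the order must reverse: A @ B is 3×2, so (AB)† is 2×3, and only B†A† (2×4 times 4×3) has matching shape.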
Positive definiteness. A square n × n matrix A is said to be positive definite if for all column matrices v of
length n
v† Av ⩾ 0 , with equality iff v = 0 , (5.33a)
where ‘iff’ means if and only if.
Remark. If equality to zero were possible in (5.33a) for non-zero v, then A would be said to be positive,
rather than positive definite.
Hermitian matrix. A square n × n matrix is Hermitian if it is equal to its Hermitian conjugate:
        A† = A .                                                                   (5.33b)
Unitary matrix. A square n × n matrix is unitary if its Hermitian conjugate is equal to its inverse:
A† = A−1 or AA† = A† A = 1 . (5.33d)
Normal matrix. A square n × n matrix is normal if it commutes with its Hermitian conjugate:
AA† = A† A . (5.33e)
Exercise. Verify that Hermitian, anti-Hermitian and unitary matrices are all normal.
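A quick numerical illustration of the exercise (a sketch only; the particular matrices below are my own ad hoc choices): a Hermitian matrix and a unitary matrix both commute with their Hermitian conjugates, i.e. they are normal.

```python
import numpy as np

H = np.array([[2.0, 1.0 - 1.0j],
              [1.0 + 1.0j, 3.0]])          # Hermitian: H† = H
U = np.array([[1.0, 1.0j],
              [1.0j, 1.0]]) / np.sqrt(2)   # unitary: U† U = I

for M in (H, U):
    Md = M.conj().T                        # Hermitian conjugate M†
    print(np.allclose(M @ Md, Md @ M))     # True for both: M is normal
```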
5.4.1 Definition of a Scalar Product
The prototype linear vector space V = E3 has the additional property that any two vectors u and v can
be combined to form a scalar u · v. This can be generalised to an n-dimensional vector space V over C by
assigning, for every pair of vectors u, v ∈ V , a scalar product u · v ∈ C with the following properties.
(i) The scalar product should be linear in its second argument, i.e. for a, b ∈ C
u · (av1 + bv2 ) = a u · v1 + b u · v2 . (5.34a)
where we are using the alternative notation (5.36a) for the inner product. For this definition of inner product we have, for real symmetric 2 × 2 matrices A, B and C, and β, γ ∈ C:
(i) as in (5.34a)
⟨ A | (βB + γC) ⟩ = A∗ij (βBij + γCij )
= βA∗ij Bij + γA∗ij Cij
= β ⟨A |B ⟩ + γ ⟨A |C ⟩;
(ii) as in (5.34b)
⟨ B | A ⟩ = B^∗_{ij} A_{ij} = ⟨ A | B ⟩^∗ ;
(iii) as in (5.34c), ⟨ A | A ⟩ = A^∗_{ij} A_{ij} = \sum_{i,j} |A_{ij}|^2 ⩾ 0 , with ⟨ A | A ⟩ = 0 ⇒ A = 0 .
First, suppose that v = 0. The right-hand side then simplifies from a quadratic in λ to an expression that is linear in λ. If ⟨ u | v ⟩ ≠ 0 we then have a contradiction, since for certain choices of λ this simplified expression can be negative. Hence we conclude that
⟨ u | v ⟩ = 0 if v = 0 ,
in which case (5.38) is satisfied as an equality. Next suppose that v ≠ 0 and choose λ = r e^{−ıα}, so that from (5.34d)
Proof. This follows from taking square roots of the following inequality:
∥u + v∥^2 = ⟨ u | u ⟩ + ⟨ u | v ⟩ + ⟨ u | v ⟩^∗ + ⟨ v | v ⟩   from above with λ = 1
= ∥u∥^2 + 2 Re ⟨ u | v ⟩ + ∥v∥^2   from (5.34b)
⩽ ∥u∥^2 + 2 |⟨ u | v ⟩| + ∥v∥^2
⩽ ∥u∥^2 + 2 ∥u∥ ∥v∥ + ∥v∥^2   from (5.38)
= (∥u∥ + ∥v∥)^2 .
Suppose that we have a scalar product defined on a vector space with a given basis {ui : i = 1, . . . , n}. We
will next show that the scalar product is in some sense determined for all pairs of vectors by its values for
all pairs of basis vectors. To start, define the complex numbers G_{ij} by
G_{ij} = u_i · u_j . (5.40)
Then, for any vectors v = v_i u_i and w = w_j u_j,
v · w = (v_i u_i) · (w_j u_j)
= v_i^∗ w_j \, u_i · u_j   from (5.34a) and (5.35)
= v_i^∗ G_{ij} w_j . (5.42)
In matrix notation,
v · w = v^† G w , (5.43)
where G is the matrix, or metric, with entries G_{ij} (metrics are a key ingredient of General Relativity).
Remark (unlectured). That G is Hermitian is consistent with the requirement (5.34c) that |v|2 = v · v is
real, since
(M − λI) x = 0 , (5.46b)
then, since x is non-zero, we conclude that a non-trivial linear combination of the columns of the matrix
(M − λI) is equal to zero, i.e. that the columns of the matrix are linearly dependent. This statement is also
equivalent to the requirement
det(M − λI) = 0 , (5.47)
which is called the characteristic equation of the matrix M. The left-hand-side of (5.47) is an nth order
polynomial in λ called the characteristic polynomial of M.
The roots of the characteristic polynomial are the eigenvalues of M, and since an nth order polynomial has
exactly n, possibly complex, roots (counting multiplicities in the case of repeated roots), there are always n
eigenvalues.
Examples.
(i) Find the eigenvalues and eigenvectors of
M = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} . (5.48a)
Answer. From (5.47)
0 = det(M − λI) = \begin{vmatrix} -λ & 1 \\ -1 & -λ \end{vmatrix} = λ^2 + 1 = (λ − ı)(λ + ı) , (5.48b)
and so the eigenvalues of M are ±ı. The eigenvectors are the non-zero solutions to
\begin{pmatrix} ∓ı & 1 \\ -1 & ∓ı \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} . (5.48c)
Hence there are two linearly independent eigenvectors
α \begin{pmatrix} 1 \\ ı \end{pmatrix} and β \begin{pmatrix} 1 \\ -ı \end{pmatrix} , (5.48d)
where α and β are any non-zero constants.
(ii) Find the eigenvalues and eigenvectors of
M = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} . (5.49a)
Answer. From (5.47)
0 = det(M − λI) = \begin{vmatrix} -λ & 1 \\ 0 & -λ \end{vmatrix} = λ^2 , (5.49b)
and so the eigenvalues of M are 0 and 0. The eigenvectors are the non-zero solutions to
\begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} . (5.49c)
Hence there is only one linearly independent eigenvector, namely any non-zero multiple of
\begin{pmatrix} 1 \\ 0 \end{pmatrix} . (5.49d)
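The contrast between the two examples can be seen numerically; the sketch below (numpy; the tolerances are my own choices) finds the eigenvalues ±i for (5.48a), and for (5.49a) shows that the two computed eigenvectors are numerically parallel, signalling that the matrix is defective.

```python
import numpy as np

# Example (i): distinct eigenvalues +/- i, two independent eigenvectors.
M1 = np.array([[0.0, 1.0], [-1.0, 0.0]])
vals1, vecs1 = np.linalg.eig(M1)
print(np.allclose(sorted(vals1.imag), [-1.0, 1.0]))   # True: eigenvalues -i, +i
print(abs(np.linalg.det(vecs1)) > 0.5)                # True: columns independent

# Example (ii): repeated eigenvalue 0; both returned eigenvectors point
# along (1, 0), so the matrix of eigenvectors is (numerically) singular.
M2 = np.array([[0.0, 1.0], [0.0, 0.0]])
vals2, vecs2 = np.linalg.eig(M2)
print(np.allclose(vals2, [0.0, 0.0]))                 # True
print(abs(np.linalg.det(vecs2)) < 1e-8)               # True: defective
```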
Let X be the n × n matrix whose columns are the eigenvectors of M, then
(X)_{ij} ≡ X_{ij} = x^j_i , (5.51a)
i.e.
X = \begin{pmatrix} x^1_1 & x^2_1 & \cdots & x^n_1 \\ x^1_2 & x^2_2 & \cdots & x^n_2 \\ \vdots & \vdots & \ddots & \vdots \\ x^1_n & x^2_n & \cdots & x^n_n \end{pmatrix} , (5.51b)
in which case (5.50b) can be rewritten as
\sum_{k=1}^{n} M_{jk} X_{ki} = λ_i X_{ji} = \sum_{k=1}^{n} X_{jk} δ_{ki} λ_i (5.52a)
Degenerate eigenvalues. The case when there is a repeated eigenvalue is more difficult. However with suf-
ficient mathematical effort it can still be proved that orthogonal eigenvectors exist for the repeated
eigenvalue. Instead of adopting this approach we appeal to arm-waving arguments.
An ‘experimental’ approach. First adopt an ‘experimental’ approach. In real life it is highly unlikely that two eigenvalues will be exactly equal (because of experimental error, etc.). Hence this case never arises and we can assume that we have n orthogonal eigenvectors.
5.6.2 Diagonalization of Hermitian Matrices
It follows from the above result, (5.51b) and (5.53), that an Hermitian matrix H can be ‘diagonalized’ to the matrix Λ by means of the transformation X^{-1} H X, where the columns of X are the eigenvectors of H:
X = \begin{pmatrix} x^1_1 & x^2_1 & \cdots & x^n_1 \\ x^1_2 & x^2_2 & \cdots & x^n_2 \\ \vdots & \vdots & \ddots & \vdots \\ x^1_n & x^2_n & \cdots & x^n_n \end{pmatrix} . (5.60)
Orthonormal eigenvectors. If the xi are orthonormal eigenvectors of H then X is a unitary matrix since
(X† X)ij = (X† )ik (X)kj = (xik )∗ xjk = xi† xj = δij by orthonormality, (5.61a)
or, in matrix notation,
X† X = I . (5.61b)
Hence X is a unitary matrix, and we deduce that every Hermitian matrix, H, is diagonalizable by a
transformation
X† HX = Λ , (5.62)
where X is a unitary matrix.
In the case when we restrict ourselves to real matrices, we conclude that every real symmetric matrix, S,
is diagonalizable by a transformation RT S R, where R is an orthogonal matrix.
Example. Find the orthogonal matrix that diagonalizes the real symmetric matrix
S = \begin{pmatrix} 1 & β \\ β & 1 \end{pmatrix} , where β is real. (5.63)
R ≡ X = \begin{pmatrix} x^{(+)}_1 & x^{(-)}_1 \\ x^{(+)}_2 & x^{(-)}_2 \end{pmatrix} = \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} \\ 1/\sqrt{2} & -1/\sqrt{2} \end{pmatrix} = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} . (5.68)
v · w = v^† G w   from (5.43)
= v′^† A^† G A w′   from (5.28a) and (5.73).
But from (5.43) we must also have that in terms of the new basis
v · w = v′† G′ w′ , (5.74)
where G′ is the metric in the new {u′i : i = 1, . . . , n} basis. Since v and w are arbitrary we conclude that
the metric in the new basis is given in terms of the metric in the old basis by
G′ = A† GA . (5.75)
Alternative derivation (unlectured). (5.75) can also be derived from the definition of the metric, since
(G′)_{ij} ≡ G′_{ij} = u′_i · u′_j   from (5.40)
= (u_k A_{ki}) · (u_ℓ A_{ℓj})   from (5.20a)
= A^∗_{ki} (u_k · u_ℓ) A_{ℓj}   from (5.34a) and (5.35)
= A^†_{ik} G_{kℓ} A_{ℓj}   from (5.40) and (5.44a)
= (A^† G A)_{ij} . (5.76)
with equality only if v = 0. This can only be true for all vectors v′ if G′ is also positive definite.
The {ei : i = 1, . . . , n} are thus an orthonormal basis. We conclude that:
any vector space with a scalar product has an orthonormal basis.
Since from (5.40) the elements of the metric are just ei · ej , the metric for an orthonormal basis is the
identity matrix I.
The scalar product in orthonormal bases. Let the column vectors v and w contain the components of two
vectors v and w, respectively, in an orthonormal basis {ei : i = 1, . . . , n}. Then from (5.43)
v · w = v† I w = v† w . (5.82)
This is consistent with the definition of the scalar product from last year.
Orthogonality in orthonormal bases. If the vectors v and w are orthogonal, i.e. v · w = 0, then the compo-
nents in an orthonormal basis are such that
v^† w = 0 . (5.83)
Vector spaces over R. An analogous result applies to vector spaces over R. Then, because the transformation matrix, say U = R, is real,
U^† = R^T ,
and so R must be orthogonal:
R^T = R^{-1} . (5.87)
Example. An example of an orthogonal matrix is the 3 × 3 rotation matrix R that determines the new
components, v′ = RT v, of a three-dimensional vector v after a rotation of the axes (note that
under a rotation orthogonal axes remain orthogonal and unit vectors remain unit vectors).
0 = \begin{vmatrix} -λ & i & 0 \\ -i & -λ & 0 \\ 0 & 0 & 1-λ \end{vmatrix} = (λ^2 − 1)(1 − λ) , (5.88a)
with solutions
λ = 1, −1, 1 . (5.88b)
Remark. All three eigenvectors given in (5.89c) and (5.90c) can be confirmed to be orthonormal.
Diagonalisation of H. Using (5.62), (5.89c) and (5.90c), with the choices α = β = γ = 0, it follows that
\begin{pmatrix} 1/\sqrt{2} & -i/\sqrt{2} & 0 \\ 1/\sqrt{2} & i/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 0 & i & 0 \\ -i & 0 & 0 \\ 0 & 0 & 1 \end{pmatrix} \begin{pmatrix} 1/\sqrt{2} & 1/\sqrt{2} & 0 \\ i/\sqrt{2} & -i/\sqrt{2} & 0 \\ 0 & 0 & 1 \end{pmatrix} = \begin{pmatrix} -1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{pmatrix} .
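This diagonalisation is easy to verify numerically; the sketch below (numpy) rebuilds X from the eigenvector columns above and checks that X is unitary and that X†HX is the stated diagonal matrix.

```python
import numpy as np

s = 1 / np.sqrt(2)
H = np.array([[0, 1j, 0],
              [-1j, 0, 0],
              [0, 0, 1]])
# Columns of X are the orthonormal eigenvectors used above.
X = np.array([[s, s, 0],
              [1j * s, -1j * s, 0],
              [0, 0, 1]])

print(np.allclose(X.conj().T @ X, np.eye(3)))                 # True: X unitary
print(np.allclose(X.conj().T @ H @ X, np.diag([-1, 1, 1])))   # True: diag(-1, 1, 1)
```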
Examples.
tr(M) = tr(XΛX^{-1})
= tr(ΛX^{-1}X)   using tr(AB) = A_{ij}B_{ji} = B_{ji}A_{ij} = tr(BA)
= tr(Λ) = \sum_{i=1}^{n} λ_i , (5.92c)
tr(M^n) = tr(XΛ^n X^{-1}) = tr(Λ^n X^{-1} X) = tr(Λ^n) . (5.92d)
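These trace identities can be checked numerically; the sketch below (numpy; the random matrix and seed are my own choices) confirms that tr(M) equals the sum of the eigenvalues and tr(M^3) the sum of their cubes.

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((4, 4))
lam = np.linalg.eigvals(M)      # eigenvalues (possibly complex)

# (5.92c): trace = sum of eigenvalues; (5.92d) with n = 3.
print(np.isclose(np.trace(M), lam.sum().real))   # True
print(np.isclose(np.trace(np.linalg.matrix_power(M, 3)),
                 (lam ** 3).sum().real))         # True
```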
Remark. (5.92b) and (5.92c) are in fact true for all matrices (whether or not they are diagonalizable), as
follows from the product and sum of roots in the characteristic equation
5.8 Forms
Definition: form. A map F(x) given by
F(x) = x^† A x = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i^∗ A_{ij} x_j , (5.94a)
Definition: quadratic form. An important special case is obtained by restriction to real vector spaces; then x and H are real. It follows that H^T = H, i.e. H is a real symmetric matrix; let us denote such a matrix by S. In this case
F(x) = x^T S x = \sum_{i=1}^{n} \sum_{j=1}^{n} x_i S_{ij} x_j . (5.94c)
H = UΛU† , (5.95a)
x′ = U † x , (5.95b)
so that
F(x) = x^† H x = x′^† Λ x′ = \sum_{i=1}^{n} λ_i |x′_i|^2 .
Transforming to a basis of orthonormal eigenvectors transforms the Hermitian form to a standard form with
no ‘off-diagonal’ terms. The orthonormal basis vectors that coincide with the eigenvectors of the coefficient
matrix, and which lead to the simplified version of the form, are known as principal axes.
Example. Let F(x) be the quadratic form
S = \tfrac12 (A + A^T) , (5.97c)
Conic Sections. First suppose that n = 2 and that Λ (or equivalently S) does not have a zero eigenvalue; then with
x′ = \begin{pmatrix} x′ \\ y′ \end{pmatrix} and Λ = \begin{pmatrix} λ_1 & 0 \\ 0 & λ_2 \end{pmatrix} , (5.98a)
(5.97g) becomes
λ_1 x′^2 + λ_2 y′^2 = k , (5.98b)
which is the normalised equation of a conic section.
λ_1 λ_2 > 0. If λ_1 λ_2 > 0, then k must have the same sign as the λ_j (j = 1, 2), and (5.98b) is the equation of an ellipse with principal axes coinciding with the x′ and y′ axes.
Three Dimensions. If n = 3 and Λ does not have a zero eigenvalue, then with
x′ = \begin{pmatrix} x′ \\ y′ \\ z′ \end{pmatrix} and Λ = \begin{pmatrix} λ_1 & 0 & 0 \\ 0 & λ_2 & 0 \\ 0 & 0 & λ_3 \end{pmatrix} , (5.101a)
(5.97g) becomes
λ_1 x′^2 + λ_2 y′^2 + λ_3 z′^2 = k . (5.101b)
When λ_i k > 0,
the distance to the surface along the ith principal axis = \sqrt{k/λ_i} . (5.101c)
Analogously to the case of two dimensions, this equation describes a number of characteristic surfaces.
(relative distance to surface)^2 = \frac{x^T x}{x^T S x} . (5.102)
Figure 5.3: Hyperboloid of one sheet (λ1 > 0, λ2 > 0, λ3 < 0, k > 0) and hyperboloid of two sheets
(λ1 > 0, λ2 < 0, λ3 < 0, k > 0); Wikipedia.
Figure 5.4: Paraboloid of revolution (λ1 x′2 + λ2 y ′2 + z ′ = 0, λ1 > 0, λ2 > 0) and hyperbolic paraboloid
(λ1 x′2 + λ2 y ′2 + z ′ = 0, λ1 < 0, λ2 > 0); Wikipedia.
by performing a Taylor expansion, and by ignoring terms quadratic or higher in |δx|. First note that
\frac{1}{(x^T + δx^T)(x + δx)} = \frac{1}{x^T x + 2 δx^T x + \ldots}
= \frac{1}{x^T x} \left(1 + \frac{2 δx^T x}{x^T x} + \ldots\right)^{-1}
= \frac{1}{x^T x} \left(1 − \frac{2 δx^T x}{x^T x} + \ldots\right) .
Similarly
Sx = λ(x)x , (5.106)
Analysis is one of the foundations upon which mathematics is built. It is the careful study of infinite processes
such as limits, convergence, continuity, differential and integral calculus. This section covers some of the
basic concepts including the important problem of the convergence of infinite series, since you need to have
an idea of when, and when not, you can sum a series, e.g. a Fourier series.
We also discuss the remarkable properties of analytic functions of a complex variable.
A sequence is a set of real or complex numbers, s_n, defined for all integers n ⩾ n_0 and occurring in order.
If the sequence is unending we have an infinite sequence.
Example. If the nth term of a sequence is s_n = 1/n, the sequence is
1, 1/2, 1/3, 1/4, . . . . (6.1)
Sequences tending to a finite limit. A sequence, s_n, is said to tend to the limit s if, given any positive ε, there exists N ≡ N(ε) such that
|s_n − s| < ε for all n > N . (6.2a)
In other words the members of the sequence are eventually contained within an arbitrarily small disk centred on s. We write this as
\lim_{n→∞} s_n = s , or as s_n → s as n → ∞ . (6.2b)
Examples.
(i) Suppose s_n = n^{−α} for any α > 0. Given 0 < ε < 1 it follows that
|n^{−α} − 0| < ε for all n > N(ε) = (1/ε)^{1/α} . (6.3a)
Hence
\lim_{n→∞} n^{−α} = 0 for any α > 0 . (6.3b)
(ii) Suppose s_n = x^n with |x| < 1. Given 0 < ε < 1 let N(ε) be the smallest integer such that, for a given x,
N > \frac{\log(1/ε)}{\log(1/|x|)} . (6.4a)
Then, if n > N ,
Property. An increasing sequence tends either to a limit or to +∞. Hence a bounded increasing sequence tends to a limit, i.e. if
s_{n+1} > s_n , and s_n < K ∈ R for all n, then s = \lim_{n→∞} s_n exists. (6.6)
Remark. You really ought to have a proof of this property, but I do not have time.22
Sequences tending to infinity. A sequence, sn , is said to tend to infinity if given any A (however large), there
exists N ≡ N (A) such that
sn > A for all n > N . (6.7a)
We then write
sn → ∞ as n → ∞. (6.7b)
Similarly we say that sn → −∞ as n → ∞ if given any A (however large), there exists N ≡ N (A)
such that
sn < −A for all n > N . (6.7c)
Oscillating sequences. If a sequence does not tend to a limit or ±∞, then sn is said to oscillate. If sn
oscillates and is bounded, it oscillates finitely, otherwise it oscillates infinitely.
Definition: Partial sum. Given an infinite sequence of numbers u_1 , u_2 , . . . , define the partial sum s_n by
s_n = \sum_{r=r_0}^{n} u_r . (6.9)
Definition: Convergent series. If as n → ∞, s_n tends to a finite limit, s, then we say that the infinite series
\sum_{r=r_0}^{∞} u_r (6.10)
converges.
22 Alternatively you can view this property as an axiom that specifies the real numbers R essentially uniquely.
Remark. However, as illustrated by the harmonic series (see (6.13a), (6.13b) and (6.13c)), ur → 0 as r → ∞
is not a sufficient condition for convergence.
This is a specific individual’s copy of the notes. It is not to be copied and/or redistributed.
< ε , (6.16b)
and so \sum u_r also converges.
Definition: Conditional convergence. If \sum |u_r| diverges, then \sum u_r may or may not converge. If \sum u_r does converge, it is said to converge conditionally.
Example. Suppose that
u_r = \frac{(−1)^{r−1}}{r} so that s_n = \sum_{r=1}^{n} \frac{(−1)^{r−1}}{r} = 1 − \tfrac12 + \tfrac13 − \cdots + (−1)^{n−1} \tfrac1n . (6.17a)
Then, from the Taylor expansion
\log(1 + x) = −\sum_{r=1}^{∞} \frac{(−x)^r}{r} , (6.17b)
we spot that s = \lim_{n→∞} s_n = \log 2; hence \sum_{r=1}^{∞} u_r converges. However, from (6.13a), (6.13b) and (6.13c) we already know that \sum_{r=1}^{∞} |u_r| diverges. Hence \sum_{r=1}^{∞} u_r is conditionally convergent.
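This example can be watched happening numerically; the short sketch below (plain Python; the cut-offs are my own choices) shows the alternating partial sums settling near log 2 while the partial sums of the absolute values keep growing.

```python
import math

# Alternating harmonic series vs the harmonic series of absolute values.
s, s_abs = 0.0, 0.0
for r in range(1, 100001):
    s += (-1) ** (r - 1) / r     # partial sums of sum (-1)^{r-1}/r
    s_abs += 1.0 / r             # partial sums of sum 1/r

print(abs(s - math.log(2)) < 1e-4)   # True: s_n -> log 2
print(s_abs > 12)                    # True: harmonic partial sums keep growing
```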
is also convergent.
Proof. Since u_r > 0, s_n = \sum_{r=1}^{n} u_r is an increasing sequence. Further
s_n = \sum_{r=1}^{n} u_r < K \sum_{r=1}^{n} v_r , (6.19a)
and thus
s_n < K \sum_{r=1}^{∞} v_r = K S , (6.19b)
i.e. s_n is an increasing bounded sequence. Thence from (6.6), \sum_{r=1}^{∞} u_r is convergent.
Remark. Similarly, if \sum_{r=1}^{∞} v_r diverges, v_r > 0 and u_r > K v_r for some K > 0 independent of r, then \sum_{r=1}^{∞} u_r diverges.
D’Alembert’s ratio test. We start by supposing that the u_r are real and positive, i.e. u_r > 0. Define the ratio of successive terms to be
ϱ_r = \frac{u_{r+1}}{u_r} , (6.20a)
and suppose that ϱ_r tends to a limit ϱ as r → ∞, i.e.
\lim_{r→∞} \frac{u_{r+1}}{u_r} = ϱ . (6.20b)
Then \sum u_r converges if ϱ < 1 and diverges if ϱ > 1.
Proof.
ϱ < 1. For the case ϱ < 1, choose σ with ϱ < σ < 1. Then there exists N ≡ N(σ) such that
\frac{u_{r+1}}{u_r} < σ for all r > N . (6.21a)
It follows that
\sum_{r=1}^{∞} u_r = \sum_{r=1}^{N} u_r + u_{N+1}\left(1 + \frac{u_{N+2}}{u_{N+1}} + \frac{u_{N+2}}{u_{N+1}}\frac{u_{N+3}}{u_{N+2}} + \ldots\right)
< \sum_{r=1}^{N} u_r + u_{N+1}(1 + σ + σ^2 + \ldots)   by hypothesis
< \sum_{r=1}^{N} u_r + \frac{u_{N+1}}{1−σ}   by (6.12c) since σ < 1. (6.21b)
We conclude that \sum_{r=1}^{∞} u_r is bounded. Thence, since s_n = \sum_{r=1}^{n} u_r is an increasing sequence, it follows from (6.6) that \sum u_r converges.
ϱ > 1. For the case ϱ > 1, choose τ with 1 < τ < ϱ. Then there exists M ≡ M(τ) such that
\frac{u_{r+1}}{u_r} > τ > 1 for all r > M , (6.22a)
and hence
\frac{u_r}{u_M} > τ^{r−M} > 1 for all r > M . (6.22b)
Thus, since u_r ̸→ 0 as r → ∞, we conclude that \sum u_r diverges.
Corollary. A series \sum u_r of complex terms converges if the limit of the absolute ratio of successive terms is less than one, i.e. if
\lim_{r→∞} \left|\frac{u_{r+1}}{u_r}\right| = ϱ < 1 . (6.23)
Proof. D’Alembert’s ratio test shows that \sum u_r converges absolutely; thence from § 6.2.3 we conclude that \sum u_r converges.
Example. For the harmonic series \sum r^{−1},
ϱ_r = \frac{r}{r+1} → 1 as r → ∞ , (6.24)
from which nothing can be concluded. A different test is required, such as the integral comparison test.
Remark. The ratio test can not be used for series in which some of the terms are zero. However, it can sometimes be adapted by relabelling the series to remove the vanishing terms.
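The behaviour of the ratio test can be sketched numerically; the snippet below (plain Python; the sample index r = 100 and the tolerances are my own choices) evaluates the ratio of successive terms for a convergent series, \sum 1/r!, where it heads to 0, and for the harmonic series, where it heads to 1 and the test is inconclusive.

```python
import math

def ratio_at(term, r):
    """Ratio u_{r+1}/u_r of successive terms at a (large) index r."""
    return term(r + 1) / term(r)

# sum 1/r!: ratio 1/(r+1) -> 0 < 1, so the series converges.
print(ratio_at(lambda r: 1 / math.factorial(r), 100) < 0.05)   # True

# Harmonic series: ratio r/(r+1) -> 1, so the test gives no verdict.
print(abs(ratio_at(lambda r: 1 / r, 100) - 1) < 0.02)          # True
```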
Next suppose that ϱ > 1. Choose τ with 1 < τ < ϱ. Then there exists M ≡ M(τ) such that
u_r^{1/r} > τ > 1 , i.e. u_r > τ^r > 1 , for all r > M . (6.26c)
Thus, since u_r ̸→ 0 as r → ∞, \sum u_r must diverge.
Definition: Bounded as z → ∞. The function f (z) is bounded as z → ∞ if there exist positive numbers K
and R such that |f (z)| < K for all z with |z| > R.
Warning: approaches to a point. There are different ways in which z can approach z0 or ∞, especially in
the complex plane. Sometimes the limit or bound applies only if the point is approached in a particular
way. For example, consider tanh(z) as |z| → ∞ for z real:
\lim_{z→+∞} \tanh z = 1 , \lim_{z→−∞} \tanh z = −1 . (6.29a)
This notation implies that z is approaching positive or negative real infinity along the real axis. But
if z approaches infinity along the imaginary axis, i.e. z → ±i∞, the limit of tanh is not defined.
Similarly, the one-sided limits
\lim_{x→0^+} \frac{x(1 + x)}{|x|} = 1 , \lim_{x→0^−} \frac{x(1 + x)}{|x|} = −1 (6.29b)
differ, so x(1 + x)/|x| has no limit as x → 0.
(i) if \frac{f(z)}{g(z)} is bounded as z → z_0, we say that f(z) = O(g(z)) as z → z_0 ;
(ii) if \frac{f(z)}{g(z)} → 0 as z → z_0, we say that f(z) = o(g(z)) as z → z_0 ;
(iii) if \frac{f(z)}{g(z)} → 1 as z → z_0, we say that f(z) ∼ g(z) as z → z_0 .
Remarks.
f(x_0 + h) = f(x_0) + h f′(x_0) + \frac{h^2}{2!} f″(x_0) + \ldots + \frac{h^{n−1}}{(n−1)!} f^{(n−1)}(x_0) + R_n , (6.31a)
where the remainder after n terms, R_n, can be shewn (by integrating by parts) to be
R_n = \int_{x_0}^{x_0+h} \frac{(x_0 + h − x)^{n−1}}{(n−1)!} f^{(n)}(x) \, dx . (6.31b)
6.6 Analytic Functions of a Complex Variable
6.6.1 Complex differentiability
Definition: Complex differentiability. The complex derivative of the function f (z) at the point z = z0 is
defined as
f (z) − f (z0 )
f ′ (z0 ) = lim , (6.34a)
z→z0 z − z0
where the same limit must be obtained for any sequence of complex values for z that tends to z0 . If
this same limit exists, the function f (z) is said to be complex differentiable at z = z0 .
Alternative expression. Another way to write this is
df f (z + δz) − f (z)
≡ f ′ (z) = lim , (6.34b)
dz δz→0 δz
where the limit must be the same when δz → 0 by any
route/direction in the complex plane.
Remark. Requiring a function of a complex variable to be differ-
entiable is a surprisingly strong constraint.
Definition: Analytic function. If a function f (z) has a complex derivative at every point z in a region R of
the complex plane, it is said to be analytic in R.23
Remark. To be analytic at a point z = z0 , f (z) must be differentiable throughout some neighbourhood
|z − z0 | < ε of that point.
Definition: Entire function. An entire function is one that is analytic in the whole complex plane.
Examples of entire functions. Each of the following can be confirmed to satisfy the Cauchy-Riemann equations (6.37) in the whole complex plane.
(i) f (z) = c: a complex constant.
(ii) f (z) = z: for which u = x and v = y.
(iii) f (z) = exp(z): for which
f (z) = ez = ex eiy = ex cos y + i ex sin y = u + iv . (6.38a)
The Cauchy–Riemann equations are satisfied for all x and y since
\frac{∂u}{∂x} = e^x \cos y = \frac{∂v}{∂y} and \frac{∂v}{∂x} = e^x \sin y = −\frac{∂u}{∂y} . (6.38b)
As expected, the derivative of the exponential function is
f′(z) = \frac{∂u}{∂x} + i \frac{∂v}{∂x} = e^x \cos y + i e^x \sin y = e^z . (6.38c)
(iv) f (z) = z n , where n is a positive integer: for which
(ii) The usual product, quotient and chain rules apply to complex derivatives of analytic functions,
e.g.
\frac{d}{dz} z^n = n z^{n−1} , \frac{d}{dz} \sin z = \cos z , \frac{d}{dz} \cosh z = \sinh z . (6.40c)
Definition: Singular points. Many complex functions are analytic everywhere in the complex plane except
at isolated points, which are called the singular points or singularities of the function.
Examples.
(i) f (z) = P (z)/Q(z), where P (z) and Q(z) are polynomials. This is called a rational function
and is analytic except at points where Q(z) = 0.
(ii) f (z) = z c is analytic except at z = 0 if c is a complex constant which is not a positive integer
(see (6.39) for the case when c is a positive integer).
(iii) f (z) = ln z is analytic except at z = 0.
The last two examples are in fact multiple-valued functions, which require special treatment (see next
term).
23 Some use this definition for holomorphic functions and use analytic for functions with a convergent power series. However,
Example. Suppose that you are given u(x, y) = x^2 − y^2; then what is the analytic function? From (6.37)
\frac{∂v}{∂y} = \frac{∂u}{∂x} = 2x ⇒ v = 2xy + g(x) , (6.41a)
\frac{∂v}{∂x} = −\frac{∂u}{∂y} = 2y ⇒ 2y + g′(x) = 2y ⇒ g′(x) = 0 . (6.41b)
Therefore v(x, y) = 2xy + c, where c is a real constant, and we find that
f (z) = x2 − y 2 + i(2xy + c) = (x + iy)2 + ic = z 2 + ic . (6.41c)
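The reconstruction can be spot-checked numerically; the sketch below (numpy; the constant c and sample point are my own choices) confirms that u + iv agrees with z^2 + ic.

```python
import numpy as np

rng = np.random.default_rng(1)
c = 0.7                          # any real constant
x, y = rng.standard_normal(2)    # a random sample point
z = x + 1j * y

u = x**2 - y**2                  # given real part
v = 2 * x * y + c                # reconstructed imaginary part
print(np.isclose(u + 1j * v, z**2 + 1j * c))   # True: f(z) = z^2 + ic
```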
Property: u and v are harmonic functions. The real and imaginary parts of an analytic function satisfy Laplace’s equation (they are harmonic functions):
\frac{∂^2 u}{∂x^2} + \frac{∂^2 u}{∂y^2} = \frac{∂}{∂x}\left(\frac{∂u}{∂x}\right) + \frac{∂}{∂y}\left(\frac{∂u}{∂y}\right)
= \frac{∂}{∂x}\left(\frac{∂v}{∂y}\right) + \frac{∂}{∂y}\left(−\frac{∂v}{∂x}\right)
= 0 . (6.42)
The proof that ∇2 v = 0 is similar.
Remark. This property provides a useful method for solving Laplace’s equation in two dimensions:
one “just” needs to find an analytic function that satisfies the boundary conditions.
Property: u and v are conjugate harmonic functions. Using the Cauchy–Riemann equations (6.37), we see that
∇u · ∇v = \frac{∂u}{∂x}\frac{∂v}{∂x} + \frac{∂u}{∂y}\frac{∂v}{∂y}
= \frac{∂v}{∂y}\frac{∂v}{∂x} − \frac{∂v}{∂x}\frac{∂v}{∂y}
= 0 . (6.43)
Hence the curves of constant u and those of constant v are orthogonal: u and v are said to be conjugate harmonic functions.
If a function of a complex variable is analytic in a region R of the complex plane, not only is it differentiable
everywhere in R, it is also differentiable any number of times. It follows that if f (z) is analytic at z = z0 ,
it has an infinite Taylor series
f(z) = \sum_{n=0}^{∞} a_n (z − z_0)^n , where a_n = \frac{1}{n!} f^{(n)}(z_0) ≡ \frac{1}{n!} \frac{d^n f}{dz^n}(z_0) . (6.44)
As discussed in § 6.8, this series converges within some neighbourhood of z0 .
Alternative definition of analyticity. An alternative definition of the analyticity of a function f (z) at z = z0
is that f (z) has a Taylor series expansion about z = z0 with a non-zero radius of convergence.
The first non-zero term in the Taylor series of f (z) about z = z0 is then proportional to (z − z0 )N .
Indeed
f(z) ∼ a_N (z − z_0)^N as z → z_0 . (6.45b)
A simple zero is a zero of order 1. A double zero is one of order 2, etc.
Examples.
(i) f (z) = z has a simple zero at z = 0.
(ii) f (z) = (z − i)2 has a double zero at z = i.
(iii) f (z) = z 2 − 1 = (z − 1)(z + 1) has simple zeros at z = ±1.
Worked exercise. Find and classify the zeros of f (z) = sinh z.
Answer. We have
\sinh z = \tfrac12 (e^z − e^{−z}) = 0 if e^z = e^{−z} , i.e. e^{2z} = 1 , i.e. z = nπi , n ∈ Z .
Since f′(z) = \cosh z and \cosh(nπi) = \cos(nπ) = (−1)^n ≠ 0 at these points, all the zeros are simple zeros.
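This worked exercise checks out numerically; the sketch below (cmath; the range of n is my own choice) verifies both that sinh vanishes at z = nπi and that the derivative cosh takes the non-zero values (−1)^n there, so each zero is simple.

```python
import cmath

for n in range(-2, 3):
    z = n * cmath.pi * 1j                           # candidate zero z = n*pi*i
    assert abs(cmath.sinh(z)) < 1e-12               # sinh vanishes...
    assert abs(cmath.cosh(z) - (-1) ** n) < 1e-12   # ...and cosh = (-1)^n != 0

print("all zeros of sinh at n*pi*i are simple")
```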
f (z) is not analytic at z = z0 , and we say that f (z) has a pole of order N . We refer to a pole of order 1
as a simple pole, a pole of order 2 as a double pole, etc.
Expansion of f(z) near a pole. Because g(z) is analytic, from (6.44) it has a Taylor series expansion at z_0:
g(z) = \sum_{n=0}^{∞} b_n (z − z_0)^n with b_0 ≠ 0 . (6.47a)
Hence
f(z) = (z − z_0)^{−N} g(z) = \sum_{n=−N}^{∞} a_n (z − z_0)^n , (6.47b)
with an = bn+N , and a−N ̸= 0. This is not a Taylor series because it includes negative powers of
z − z0 , and f (z) is not analytic at z = z0 .
Remarks.
(i) If f (z) has a zero of order N at z = z0 , then 1/f (z) has a pole of order N there, and vice versa.
(ii) If f (z) is analytic and non-zero at z = z0 and g(z) has a zero of order N there, then f (z)/g(z)
has a pole of order N there.
Worked exercise. Find and characterise the poles of
f(z) = \frac{2z}{(z + 1)(z − i)^2} . (6.48a)
Answer. There is a simple pole at z = −1 and a double pole at z = i. Near z = i, set z = i + w; then
f(z) = \frac{2(i + w)}{(i + w + 1) w^2}
= \frac{2i(1 − iw)}{(i + 1)\left(1 + \tfrac12(1 − i)w\right) w^2}
= \frac{2i}{(i + 1) w^2} (1 − iw)\left(1 − \tfrac12(1 − i)w + O(w^2)\right)
= (1 + i) w^{−2} \left(1 − \tfrac12(1 + i)w + O(w^2)\right)
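The leading coefficient of this expansion can be confirmed numerically; the sketch below (plain Python; the step sizes w are my own choices) multiplies f by (z − i)^2 and lets z → i, recovering the coefficient 1 + i of the double pole.

```python
def f(z):
    return 2 * z / ((z + 1) * (z - 1j) ** 2)

# As w -> 0, w^2 f(i + w) -> 1 + i, the coefficient of (z - i)^{-2}.
for w in (1e-3, 1e-5):
    z = 1j + w
    print(abs((z - 1j) ** 2 * f(z) - (1 + 1j)) < 10 * w)   # True
```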
Definition: Laurent series. It can be shown that any function that is analytic (and single-valued) throughout an annulus α < |z − z_0| < β centred on a point z = z_0 has a unique Laurent series,
f(z) = \sum_{n=−∞}^{∞} a_n (z − z_0)^n , (6.49)
(i) If the first non-zero term in the Laurent series has n ⩾ 0, then f (z) is analytic at z = z0 and the series
is just a Taylor series.
(ii) If the first non-zero term in the Laurent series has n = −N < 0, then f (z) has a pole of order N at
z = z0 .
(iii) Otherwise, if the Laurent series involves an infinite number of terms with n < 0, then f (z) has an
essential singularity at z = z0 .
Remark. The behaviour of a function near an essential singularity is remarkably complicated. Picard’s
theorem states that, in any neighbourhood of an essential singularity, the function takes all possible
complex values (possibly with one exception) at infinitely many points. In the case of f (z) = e1/z , the
exceptional value 0 is never attained.
Hence the Taylor series for an analytic function, (6.44), is a power series.
Many of the tests of convergence for real series discussed in § 6.3 can be generalised for complex series. Indeed, we have already noted that if the sum of the absolute values of a complex series converges, i.e. if \sum |u_r| converges, then so does the series \sum u_r. Hence if \sum |a_r (z − z_0)^r| converges, so does \sum a_r (z − z_0)^r.
Corollary. If the sum diverges for z = z1 then it diverges for all z such that |z − z0 | > |z1 − z0 |. For suppose
that it were to converge for some such z = z2 with |z2 − z0 | > |z1 − z0 |, then it would converge for
z = z1 by the above result; this is in contradiction to the hypothesis.
Definition: Radius and circle of convergence. These results imply there must exist a real, non-negative number R such that
\sum a_r (z − z_0)^r converges for |z − z_0| < R ,
\sum a_r (z − z_0)^r diverges for |z − z_0| > R . (6.54)
R is called the radius of convergence, and |z − z0 | = R is called the circle of convergence, within which
the series converges and outside of which it diverges.
Remarks.
(i) The radius of convergence may be zero (exceptionally), positive or infinite.
(ii) On the circle of convergence, the series may either converge or diverge.
(iii) The radius of convergence of the Taylor series of a function f (z) about the point z = z0 is equal
to the distance of the nearest singular point of the function f (z) from z0 . Since a convergent
power series defines an analytic function, no singularity can lie inside the circle of convergence.
Natural Sciences Tripos: IB Mathematical Methods I 111 © [email protected], Michaelmas 2022
6.8.3 Determination of the radius of convergence
Without loss of generality take z_0 = 0, so that (6.52) becomes
f(z) = \sum_{r=0}^{∞} u_r where u_r = a_r z^r . (6.55)
Use D’Alembert’s ratio test. If the limit exists, then
\lim_{r→∞} \left|\frac{a_{r+1}}{a_r}\right| = \frac{1}{R} . (6.56a)
\lim_{r→∞} \left|\frac{u_{r+1}}{u_r}\right| = \lim_{r→∞} \left|\frac{a_{r+1}}{a_r}\right| |z| = \frac{|z|}{R} by hypothesis (6.56a).
Hence the series converges absolutely by D’Alembert’s ratio test if |z| < R. On the other hand if |z| > R, then
\lim_{r→∞} \left|\frac{u_{r+1}}{u_r}\right| = \frac{|z|}{R} > 1 . (6.56b)
Hence u_r ̸→ 0 as r → ∞, and so the series does not converge. It follows that R is the radius of convergence.
Remark. The limit (6.56a) may not exist, e.g. if a_r = 0 for r odd then \left|\frac{a_{r+1}}{a_r}\right| is alternately 0 or ∞.
Use Cauchy’s test (unlectured). If the limit exists, then
\lim_{r→∞} |a_r|^{1/r} = \frac{1}{R} . (6.57a)
Proof. We have that
\lim_{r→∞} |u_r|^{1/r} = \lim_{r→∞} |a_r|^{1/r} |z| = \frac{|z|}{R} by hypothesis. (6.57b)
Hence the series converges absolutely by Cauchy’s test if |z| < R. On the other hand if |z| > R, choose τ with 1 < τ < |z|/R. Then there exists M ≡ M(τ) such that
|u_r|^{1/r} > τ > 1 , i.e. |u_r| > τ^r > 1 , for all r > M .
Thus, since u_r ̸→ 0 as r → ∞, \sum u_r must diverge. It follows that R is the radius of convergence.
6.8.4 Examples
(i) Suppose that a_r = 1 for all r, then f(z) is the geometric series
f(z) = \sum_{r=0}^{∞} z^r . (6.58a)
Both D’Alembert’s ratio test, (6.56a), and Cauchy’s test, (6.57a), give R = 1:
\left|\frac{a_{r+1}}{a_r}\right| = 1 and |a_r|^{1/r} = 1 for all r . (6.58b)
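The circle of convergence |z| = 1 for the geometric series is easy to see numerically; the sketch below (plain Python; the sample points are my own choices) compares partial sums just inside and just outside it.

```python
def partial_sum(z, n):
    """Partial sum of the geometric series sum_{r=0}^{n-1} z^r."""
    return sum(z ** r for r in range(n))

z_in = 0.5 + 0.3j                # |z| < 1: sums approach 1/(1 - z)
print(abs(partial_sum(z_in, 200) - 1 / (1 - z_in)) < 1e-12)   # True

z_out = 1.2                      # |z| > 1: partial sums blow up
print(abs(partial_sum(z_out, 200)) > 1e6)                     # True
```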
\frac{1}{R} = \lim_{r→∞} |a_r|^{1/r} = 1 . (6.59d)
Remark. The series converges to log(1 + z) for |z| < 1, where the singularity at z = −1 limits the
radius of convergence. In fact it can be shewn that the series converges on the circle |z| = 1
except at the point z = −1.
(iii) If a_r = 1/r! for all r, then f(z) is the series
f(z) = \sum_{r=0}^{∞} \frac{z^r}{r!} . (6.60a)
Thought of as a power series in (−z 2 ), this has |ar+1 /ar | = (2r + 1)/(2r + 3) → 1 as r → ∞. Therefore
R = 1 in terms of (−z 2 ). But since | − z 2 | = 1 is equivalent to |z| = 1, the series converges for |z| < 1
and diverges for |z| > 1.
7.1 Classification
7.1.1 First-order linear ordinary differential equations
The general linear first-order inhomogeneous ODE,
F (y ′′ , y ′ , y, x) = 0 , (7.2)
for an unknown function y(x), where y ′ = dy/dx, y ′′ = d2 y/dx2 and F is a known function.
We recover the standard form (2.12a) by dividing through by the coefficient of y ′′ to obtain
Recall from § 2.2 that if f (x) = 0 the equation is said to be homogeneous, otherwise it is said to be
inhomogeneous.
Remarks.
(i) The principle of superposition applies to linear ODEs as to all linear equations.
(ii) Although the solution may be of interest only for real x, it is often informative to analyse the
solution in the complex domain.
where α and β are arbitrary constants, and there are two arbitrary constants because the equation is of
second order.
where κ is a constant (a change in lower limit of integration can be absorbed by a rescaling of κ).
Remarks.
(i) Up to the multiplicative constant κ, the Wronskian W can be shown to be the same for any two
linearly independent solutions y1 and y2 , and hence it is an intrinsic property of the ODE.
(ii) If W ̸= 0 for one value of x (and p is integrable) then W ̸= 0 for all x (since exp y > 0 for all y).
Hence if y1 and y2 are linearly independent for one value of x, they are linearly independent for
all values of x; it follows that linear independence need be checked at only one value of x. In the
case that y1 and y2 are known implicitly, e.g. in terms of series or integrals, this means that we
just have to find one value of x where it is relatively easy to evaluate W in order to confirm (or otherwise) linear independence.
    y2(x) = y1(x) ∫^x ( κ / y1²(η) ) exp( −∫^η p(ζ) dζ ) dη .    (7.8b)
In principle this allows us to compute y2 given y1 .
Remarks.
(i) The indefinite integral involves an arbitrary additive constant, since any amount of y1 can be
added to y2 .
(ii) W involves an arbitrary multiplicative constant, since y2 can be multiplied by any constant.
(iii) This expression for y2 therefore provides the general solution of the homogeneous ODE.
Example. Given that y = xn is a solution of x2 y ′′ − (2n − 1)xy ′ + n2 y = 0, find the general solution.
Answer. First write the ODE in the standard form (7.3c):
    y'' − ( (2n − 1)/x ) y' + ( n²/x² ) y = 0 .    (7.9a)
Next calculate the Wronskian (7.6c):
    W = κ exp( −∫^x p(ζ) dζ ) = κ exp( ∫^x ( (2n − 1)/ζ ) dζ )
      = κ exp( (2n − 1) ln x + constant )
      = Λ x^{2n−1} ,    (7.9b)
for any non-zero constant Λ. Finally, calculate the second solution from (7.8a):
    y2 = y1 ∫^x ( W(η) / y1²(η) ) dη = x^n ∫^x ( Λ/η ) dη
       = Λ x^n ln x + B x^n .    (7.9c)
Remark. The same result can be obtained by writing y2 (x) = y1 (x)u(x) and obtaining a first-order
linear ODE for u′ . This method applies to higher-order linear ODEs and is reminiscent of the
factorization of polynomial equations.
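The claim that x^n ln x solves the ODE can be checked numerically. The sketch below (illustrative only; the function name is mine) evaluates the residual of x²y'' − (2n − 1)xy' + n²y with central finite differences:

```python
import math

def residual(n, x, h=1e-4):
    # residual of x^2 y'' - (2n - 1) x y' + n^2 y at the point x,
    # with y = x^n ln x and derivatives approximated by central differences
    y = lambda t: t**n * math.log(t)
    d1 = (y(x + h) - y(x - h)) / (2 * h)
    d2 = (y(x + h) - 2 * y(x) + y(x - h)) / h**2
    return x**2 * d2 - (2 * n - 1) * x * d1 + n**2 * y(x)

for n in (1, 2, 3):
    assert abs(residual(n, 1.7)) < 1e-4   # zero up to discretization error
```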
Example. Given that y1 (x) is a solution of Bessel’s equation of zeroth order,
    y'' + (1/x) y' + y = 0 ,    (7.10a)
find another independent solution in terms of y1 for x > 0.
Answer. In this case p(x) = 1/x and hence
    y2(x) = y1(x) ∫^x ( κ / y1²(η) ) exp( −∫^η (1/ζ) dζ ) dη
          = κ y1(x) ∫^x ( 1 / ( η y1²(η) ) ) dη .    (7.10b)
It is useful now to generalize to complex functions y(z) of a complex variable z. The homogeneous linear
second-order ODE (7.4a) in standard form then becomes
y ′′ (z) + p(z)y ′ (z) + q(z)y(z) = 0 . (7.11)
Definition: regular singular points. A singular point z = z0 is regular if:
(z − z0 )p(z) and (z − z0 )2 q(z) are both analytic at z = z0 .
Example. Consider Legendre’s equation
(1 − z 2 )y ′′ − 2zy ′ + ℓ(ℓ + 1)y = 0 , (7.12a)
where ℓ is a constant. To identify the singular points and their nature, we divide through by (1 − z 2 )
to obtain the standard form with
    p(z) = −2z/(1 − z²) ,   q(z) = ℓ(ℓ + 1)/(1 − z²) .    (7.12b)
Both p(z) and q(z) are analytic for all z except z = ±1, which are the singular points. However, they
are both regular since
    (z − 1)p(z) = 2z/(1 + z) ,   and   (z − 1)²q(z) = ℓ(ℓ + 1)(1 − z)/(1 + z)    (7.12c)
are both analytic at z = 1, and similarly for z = −1.
7.3.2 The solution at ordinary points in terms of a power series
If z = z0 is an ordinary point of (7.11), then we claim that y(z) is analytic
at z = z0 , and consequently the equation has two linearly independent
solutions of the form (see (6.44))
    y = Σ_{n=0}^∞ a_n (z − z0)^n   when |z − z0| < R ,    (7.13)
    Σ_{n=0}^∞ Σ_{m=0}^∞ •(n, m) = Σ_{r=0}^∞ Σ_{m=0}^r •(r − m, m) ,    (7.16b)
Now substitute series (7.14c), (7.17a) and (7.17b) into equation (7.11), and group powers of z r , to obtain
    Σ_{r=0}^∞ ( (r + 2)(r + 1) a_{r+2} + Σ_{m=0}^r ( (m + 1) a_{m+1} p_{r−m} + a_m q_{r−m} ) ) z^r = 0 .    (7.18)
Since this expression is true for all |z| < R, each coefficient of z r (r = 0, 1, . . . ) must be zero. Thus we
deduce the recurrence relation
    a_{r+2} = − ( 1 / ( (r + 2)(r + 1) ) ) Σ_{m=0}^r ( (m + 1) a_{m+1} p_{r−m} + a_m q_{r−m} )   for r ⩾ 0 .    (7.19)
This determines a_{r+2} (for r ⩾ 0) in terms of the preceding coefficients a0, a1, . . . , a_{r+1}. This means that if a0 and a1 are known then so are all the a_r. The first two coefficients a0
and a1 play the rôle of the two integration constants in the general solution.
Remarks.
(i) The above procedure is rarely followed to the letter. For instance, if p and q are rational functions
(i.e. ratios of polynomials) it is a much better idea to multiply the equation through by a suitable
factor to clear denominators before substituting in the power series for y, y ′ and y ′′ .
(ii) Proof that the radius of convergence of the series (7.14a) is non-zero is more difficult, and we will
not attempt such a task in general. However we shall discuss the issue for examples.
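As a sketch of how (7.19) is used in practice (the helper name and interface are mine, not the notes'), the recurrence can be implemented directly; here it is checked against y'' − y = 0, for which p = 0, q = −1 and the even solution is cosh z:

```python
def series_coefficients(p, q, a0, a1, nmax):
    # p, q: lists of Taylor coefficients p_m, q_m (padded with zeros);
    # returns [a_0, a_1, ..., a_nmax] generated by the recurrence (7.19)
    a = [a0, a1]
    for r in range(nmax - 1):
        s = sum((m + 1) * a[m + 1] * p[r - m] + a[m] * q[r - m]
                for m in range(r + 1))
        a.append(-s / ((r + 2) * (r + 1)))
    return a

# For y'' - y = 0: a_{r+2} = a_r / ((r+2)(r+1)), so a_2 = 1/2, a_4 = 1/24, ...
N = 8
coeffs = series_coefficients([0] * (N + 1), [-1] + [0] * N, 1.0, 0.0, N)
assert abs(coeffs[2] - 0.5) < 1e-12 and abs(coeffs[4] - 1 / 24) < 1e-12
```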
and hence in the terminology of the previous subsection pm = 0 and qm = −2(m + 1). Substituting into
(7.19) we obtain the recurrence relation
    a_{r+2} = ( 2 / ( (r + 2)(r + 1) ) ) Σ_{n=0}^r a_n (r − n + 1)   for r ⩾ 0 .    (7.22b)
However, with a small amount of forethought we can obtain a simpler, if equivalent, recurrence relation.
First multiply (7.20) by (1 − z)2 to obtain
(1 − z)2 y ′′ − 2y = 0 ,
and then substitute (7.21) into this equation. We find, on expanding (1 − z)2 = 1 − 2z + z 2 , that
    Σ_{n=2}^∞ n(n − 1) a_n z^{n−2} − 2 Σ_{n=1}^∞ n(n − 1) a_n z^{n−1} + Σ_{n=0}^∞ (n² − n − 2) a_n z^n = 0 .
After the substitutions r = n − 2, r = n − 1 and r = n in the first, second and third terms respectively, we
obtain
    Σ_{r=0}^∞ (r + 1) ( (r + 2) a_{r+2} − 2r a_{r+1} + (r − 2) a_r ) z^r = 0 ,
First we note that if 2a2 + a1 = 0, then a3 = a4 = 0, and hence ar = 0 for r ⩾ 3. We thus have as our
first solution (with a0 = α ̸= 0)
y1 = α(1 − z)2 . (7.24b)
Next we note that ar = a0 for all r is a solution of (7.23). In this case we can sum the series to obtain
(with a0 = β ̸= 0)
    y2 = β Σ_{n=0}^∞ z^n = β/(1 − z) .    (7.24c)
Linear independence. The linear independence of (7.24b) and (7.24c) is clear, as confirmed by the calculation
of the Wronskian:
    W = y1 y2′ − y1′ y2 = α(1 − z)² · β/(1 − z)² + 2α(1 − z) · β/(1 − z) = 3αβ ≠ 0 .    (7.25)
Hence the general solution is
    y(z) = α(1 − z)² + β/(1 − z) ,    (7.26)
for constants α and β.
Radius of convergence. From (6.58b) the radius of convergence of (7.24c) is R = 1, which is consistent with
the general solution being singular at z = 1, and the equation having a singular point at z = 1 since
q(z) = −2(1 − z)−2 .
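As a quick sanity check (illustrative only; not part of the notes), both terms of the general solution (7.26) can be substituted back into (1 − z)²y'' − 2y = 0 using their exact derivatives:

```python
def residual_y1(z):
    # y1 = (1 - z)^2, so y1'' = 2
    return (1 - z)**2 * 2 - 2 * (1 - z)**2

def residual_y2(z):
    # y2 = 1/(1 - z), so y2'' = 2/(1 - z)^3
    return (1 - z)**2 * (2 / (1 - z)**3) - 2 / (1 - z)

for z in (0.3, -0.7, 2.5):
    assert abs(residual_y1(z)) < 1e-12
    assert abs(residual_y2(z)) < 1e-12
```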
From substituting r = n − 2 in the first sum and r = n in the next three sums, and from grouping powers
of z r , we obtain
    Σ_{r=0}^∞ ( (r + 2)(r + 1) a_{r+2} − ( r(r + 1) − ℓ(ℓ + 1) ) a_r ) z^r = 0 .
The recurrence relation is therefore
    a_{r+2} = ( ( r(r + 1) − ℓ(ℓ + 1) ) / ( (r + 1)(r + 2) ) ) a_r = ( ( (r − ℓ)(r + ℓ + 1) ) / ( (r + 1)(r + 2) ) ) a_r   for r = 0, 1, 2, . . . .    (7.29)
a0 and a1 are arbitrary constants, with the other coefficients following from the recurrence relation. For
instance:
(i) if a0 = 1 and a1 = 0, then y1 = 1 − ( ℓ(ℓ + 1)/2 ) z² + O(z⁴) is an even solution;    (7.30a)
(ii) if a0 = 0 and a1 = 1, then y2 = z + ( (2 − ℓ(ℓ + 1))/6 ) z³ + O(z⁵) is an odd solution.    (7.30b)
The Wronskian at the ordinary point z = 0 is thus given by
W = y1 y2′ − y1′ y2 = 1 · 1 − 0 · 0 = 1 . (7.31)
Since W ≠ 0, y1 and y2 are linearly independent (although it should have been obvious already ☺).
Radius of convergence. The series (7.30a) and (7.30b) are effectively power series in z 2 rather than z. Hence
to find the radius of convergence we either need to re-express our series (e.g. z 2 → y and a2n → bn ), or
use a slightly modified D’Alembert’s ratio test. We adopt the latter approach and observe from (7.29)
that
    lim_{n→∞} | a_{n+2} z^{n+2} / ( a_n z^n ) | = lim_{n→∞} | ( n(n + 1) − ℓ(ℓ + 1) ) / ( (n + 1)(n + 2) ) | |z|² = |z|² .    (7.32)
It then follows from a straightforward extension of D’Alembert’s ratio test (6.20b) that the series
converges for |z| < 1. Moreover, the series diverges for |z| > 1 (since an z n ̸→ 0), and so the radius
of convergence R = 1. On the radius of convergence, determination of whether the series converges is
more difficult.
Remark. The radius of convergence is the distance to the nearest singularity of the ODE. This is a general
feature.
Legendre polynomials. In the generic situation both series (7.30a) and (7.30b) have an infinite number of
terms. However, for ℓ = 0, 1, 2, . . . it follows from (7.29) that
    a_{ℓ+2} = ( ( ℓ(ℓ + 1) − ℓ(ℓ + 1) ) / ( (ℓ + 1)(ℓ + 2) ) ) a_ℓ = 0 ,    (7.33)
and so the series terminates. For instance,
ℓ = 0 : y = a0 ,
ℓ = 1 : y = a1 z ,
ℓ = 2 : y = a0 (1 − 3z 2 ) .
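The terminating series can be generated directly from (7.29). The sketch below (function and parameter names are mine) reproduces the ℓ = 2 case y = a0(1 − 3z²):

```python
def legendre_series(l, a0=1.0, a1=1.0, nmax=10):
    # coefficients a_r from (7.29), seeded with the even (a1 = 0) or
    # odd (a0 = 0) family according to the parity of l
    a = [0.0] * (nmax + 1)
    a[0], a[1] = (a0, 0.0) if l % 2 == 0 else (0.0, a1)
    for r in range(nmax - 1):
        a[r + 2] = (r - l) * (r + l + 1) / ((r + 1) * (r + 2)) * a[r]
    return a

c = legendre_series(2)                                 # l = 2, even family
assert c[2] == -3.0 and all(x == 0.0 for x in c[3:])   # y = a0 (1 - 3 z^2)
```

For non-integer ℓ neither family terminates, consistent with the radius of convergence R = 1 found above.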
7.4 Regular Singular Points
Let z = z0 be a regular singular point of equation (7.11) where, as before, wlog we can take z0 = 0. If we
write
    p(z) = s(z)/z   and   q(z) = t(z)/z² ,    (7.35a)
then the homogeneous equation (7.11) becomes, after multiplying by z²,
    z² y'' + z s(z) y' + t(z) y = 0 ,    (7.35b)
where, from the definition of a regular singular point, s(z) and t(z) are both analytic at z = 0. It follows
that s0 ≡ s(0) and t0 ≡ t(0) are finite.
If z = 0 is a regular singular point, Fuchs’s theorem guarantees that there is always at least one solution to
(7.35b) of the form
    y = z^σ Σ_{n=0}^∞ a_n z^n   with a0 ≠ 0 and σ ∈ C ,    (7.36)
i.e. a Taylor series multiplied by a power z σ , where the index σ is to be determined.
Remarks.
(i) This is a Taylor series only if σ is a non-negative integer.
(ii) There may be one or two solutions of this form (see below).
(iii) The condition a0 ̸= 0 is required to define σ uniquely.
To understand why the solutions behave in this way, substitute (7.36) into (7.35b) to obtain, after division
by z σ ,
    Σ_{n=0}^∞ ( (σ + n)(σ + n − 1) + (σ + n) s(z) + t(z) ) a_n z^n = 0 .    (7.37a)
We now evaluate this sum at z = 0, noting that z^n vanishes there except when n = 0, to obtain
    σ² + σ(s0 − 1) + t0 = 0 .    (7.38)
The roots σ1, σ2 of this equation are called the indices of the regular singular point.
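As a small illustration (not from the notes; the helper name is mine), the indices are just the roots of the quadratic (7.38); for instance an equation with s0 = 1 and t0 = −ν² has indices σ = ±ν:

```python
import cmath

def indices(s0, t0):
    # roots of sigma^2 + (s0 - 1) sigma + t0 = 0; complex roots allowed
    disc = cmath.sqrt((s0 - 1)**2 - 4 * t0)
    return (1 - s0 + disc) / 2, (1 - s0 - disc) / 2

nu = 1.5
s1, s2 = indices(1.0, -nu**2)       # s0 = 1, t0 = -nu^2
assert abs(s1 - nu) < 1e-12 and abs(s2 + nu) < 1e-12
```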
7.4.2 Series Solutions
For each choice of σ from σ1 and σ2 we can find a recurrence relation for an by comparing powers of z in
(7.37a), i.e. after expanding s and t in power series.
σ1 − σ2 ∉ Z. If σ1 − σ2 ∉ Z we can find both linearly independent solutions this way.
σ1 − σ2 ∈ Z. If σ1 = σ2 we note that we can find only one solution by the ansatz (7.36). However, as we
shall see, it’s worse than this. The ansatz (7.36) also fails (in general) to give both solutions when σ1
and σ2 differ by an integer (although there are exceptions).
Frobenius’s method is used to find the series solutions about a regular singular point. This is best demon-
strated by example.
A power series solution of the form (7.36) solves Bessel's equation of order ν,
    z² y'' + z y' + (z² − ν²) y = 0 ,    (7.39)
if, from (7.37a),
∞
X ∞
X
(σ + n)(σ + n − 1) + (σ + n) − ν 2 an z n + an z n+2 = 0 ,
(7.41a)
n=0 n=0
Equating powers of z^n we obtain:
    n = 0 :   σ² − ν² = 0   since a0 ≠ 0 ;    (7.42a)
    n = 1 :   ( (σ + 1)² − ν² ) a1 = 0 ;    (7.42b)
    n ⩾ 2 :   ( (σ + n)² − ν² ) a_n + a_{n−2} = 0 .    (7.42c)
From (7.42a) the indices are
    σ = ±ν .    (7.43)
With σ = ±ν, (7.42b) and (7.42c) become
    (1 ± 2ν) a1 = 0 ,    (7.44a)
    n(n ± 2ν) a_n = −a_{n−2}   for n ⩾ 2 .    (7.44b)
Radius of convergence. The radius of convergence of the solution is infinite since from (7.44b)
    lim_{n→∞} | a_n / a_{n−2} | = lim_{n→∞} 1/( n(n ± 2ν) ) = 0 .
This is consistent with p and q having no singularities other than at z = 0.
Remark. We note that there is no difficulty in solving for a_n from a_{n−2} using (7.44b) if σ = +ν. However, if
σ = −ν the recursion fails, with a_n predicted to be infinite, if at any point n = 2ν. There are hence
potential problems if σ1 − σ2 = 2ν ∈ Z, i.e. if the indices σ1 and σ2 differ by an integer.
    y1 = a0 z^{+ν} ( 1 − z²/(4(1 + ν)) + z⁴/(32(1 + ν)(2 + ν)) + · · · ) ,    (7.45b)
    y2 = a0 z^{−ν} ( 1 − z²/(4(1 − ν)) + z⁴/(32(1 − ν)(2 − ν)) + · · · ) .    (7.45c)
2ν = 2m + 1, m ∈ N. It so happens in this case that even though σ1 and σ2 differ by an odd integer
there is no problem; the solutions are still given by (7.45a), (7.45b) and (7.45c). This is because for
Bessel’s equation the power series proceed in even powers of z, and hence the problem recursion when
n = 2ν = 2m + 1 is never encountered. We conclude that the condition for the recursion relation
(7.44b) to fail is that ν is an integer.
Remark. If ν = ½, then (7.44a) does not force the choice a1 = 0. However, if a1 ≠ 0 the effect is to
add a multiple of y1 to y2 ; hence, wlog, one can choose a1 = 0.
2ν = 0. If ν = 0 then σ1 = σ2 and we can only find one power series solution of the form (7.36), viz.
    y = a0 ( 1 − z²/4 + . . . ) .    (7.46)
2ν = 2m, m ∈ N. If ν is a positive integer, m, then we can find one solution by choosing σ = ν. However
if we take σ = −ν then a2m is predicted to be infinite, i.e. a second series solution of the form (7.36)
fails.
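For ν = 0 the surviving recurrence can be summed numerically. The sketch below (function name mine, not the notes') generates the coefficients of (7.46) from n² a_n = −a_{n−2} with a0 = 1; the resulting series is the Bessel function J0:

```python
def j0_series(z, nterms=20):
    # coefficients from (7.44b) with nu = 0: n(n + 0) a_n = -a_{n-2}, a_0 = 1;
    # only even powers of z appear
    a, total = 1.0, 1.0
    for k in range(1, nterms):
        n = 2 * k
        a = -a / (n * n)
        total += a * z**n
    return total

assert abs(j0_series(1.0) - 0.7651976866) < 1e-8   # known value of J0(1)
```

The rapid decay of the coefficients is the infinite radius of convergence in action.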
Remark. The existence of two power series solutions for 2ν = 2m + 1, m ∈ N is a ‘lucky’ accident. In general
there exists only one solution of the form (7.36) whenever the indices σ1 and σ2 differ by an integer.
for some number k. The coefficients bn can be found by substitution into the ODE. In some very
special cases k may vanish but k ̸= 0 in general.
Example: Bessel’s equation of integer order.25 Suppose that y1 is the series solution with σ = +m to
    z² y'' + z y' + (z² − m²) y = 0 ,    (7.49)
where, compared with (7.39), we have written m for ν. Hence from (7.36) and (7.45a)
    y1 = z^m Σ_{ℓ=0}^∞ a_{2ℓ} z^{2ℓ} ,    (7.50)
Based on (7.43), (7.48a) and (7.48c) we now seek a second series solution of the form
    w = k y1 ln z + z^{−m} Σ_{n=0}^∞ b_n z^n .    (7.53)
We now demand that the combined coefficient of each power z^n is zero. Consider the even and odd powers
of z in turn.
n = 1, 3, 5, . . . . From equating powers of z 1 it follows that b1 = 0. Next, from writing n = 2ℓ + 1
(ℓ = 1, 2, . . . ) and equating powers of z 2ℓ+1 , we obtain the following recurrence relation for the
b2ℓ+1 :
    (2ℓ + 1)(2ℓ + 1 − 2m) b_{2ℓ+1} = −b_{2ℓ−1} .
Since b1 = 0, we conclude that b2ℓ+1 = 0 (ℓ = 1, 2, . . . ).
n = 2, 4, . . . , 2m, . . . . Let n = 2ℓ (ℓ = 1, 2, . . . ), then from equating powers of z 2ℓ we obtain
25 The schedules specifically state “without full discussion of logarithmic singularities”, hence you may assume that the
For ν = 0,
    y1 = 1 − z²/4 + z⁴/64 + · · · ,    (7.55a)
    y2 = y1 ln z + z²/4 − 3z⁴/128 + · · · ,    (7.55b)
and for ν = 1
    y1 = z − z³/8 + z⁵/192 + · · · ,    (7.55c)
    y2 = y1 ln z − 2/z + 3z³/32 + · · · .    (7.55d)
Remark. These examples illustrate a feature that is commonly encountered in scientific applications: one
solution is regular (i.e. analytic) and the other is singular. Often only the regular solution is an
acceptable solution of the scientific problem.
If either (z − z0 )p(z) or (z − z0 )2 q(z) is not analytic at the point z = z0 , it is an irregular singular point of
the equation (7.11). The solutions can have worse kinds of singular behaviour there.
Example: the equation z⁴y'' + 2z³y' − y = 0 has an irregular singular point at z = 0. Its solutions are
exp(±z^{−1}), both of which have an essential singularity at z = 0.
In fact this example is just the familiar equation d²y/dx² = y with the substitution z = 1/x. Even this
simple ODE has an irregular singular point at x = ∞.
7.5 The Method of Variation of Parameters (Unlectured and Not in Schedule)
The question that remains is how to find the particular solution. To that end first suppose that we have
solved the homogeneous equation and found two linearly-independent solutions y1 and y2. Then in order to
find a particular solution consider
    y0(x) = u(x) y1(x) + v(x) y2(x) .    (7.56)
If u and v were constants (‘parameters’) y0 would solve the homogeneous equation. However, we allow the
‘parameters’ to vary, i.e. to be functions of x, in such a way that y0 solves the inhomogeneous problem.
Remark. We have gone from one unknown function, y0, and one equation, (2.15a), to two unknown
functions, u and v, and one equation. We will need to find, or in fact choose, another equation.
If we substitute the above into the inhomogeneous equation (2.15a) we will have not apparently made much
progress because we will still have a second-order equation involving terms like u′′ and v ′′ . However, suppose
that we eliminate the u′ and v ′ terms from (7.57a) by demanding that u and v satisfy the extra equation
u′ y1 + v ′ y2 = 0 . (7.58)
where W is the Wronskian,
W = y1 y2′ − y2 y1′ . (7.62)
W is non-zero because y1 and y2 were chosen to be linearly independent. Integrating we obtain
    u = − ∫_a^x ( y2(ζ) f(ζ) / W(ζ) ) dζ   and   v = ∫_a^x ( y1(ζ) f(ζ) / W(ζ) ) dζ ,    (7.63)
where a is arbitrary. We could have chosen different lower limits for the two integrals, but we do not need
to find the general solution, only a particular one. Substituting this result back into (7.56) we obtain as our
particular solution
    y0(x) = ∫_a^x ( y1(ζ) y2(x) − y1(x) y2(ζ) ) ( f(ζ) / W(ζ) ) dζ .    (7.64)
Hence the particular solution (7.64) satisfies the initial value homogeneous boundary conditions
y(a) = y ′ (a) = 0 . (7.66)
More general initial value boundary conditions would be inhomogeneous, e.g.
y(a) = k1 , y ′ (a) = k2 , (7.67)
where k1 and k2 are constants which are not simultaneously zero. Such inhomogeneous boundary
conditions can be satisfied by adding suitable multiples of the linearly-independent solutions of the
homogeneous equation, i.e. y1 and y2 .
Example. Find the general solution to the equation
y ′′ + y = sec x . (7.68)
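A numerical sketch of how (7.64) applies here (helper names are mine; it assumes the homogeneous solutions y1 = cos x and y2 = sin x, for which W = 1, and takes a = 0):

```python
import math

def particular(x, steps=4000):
    # y0(x) = integral from 0 to x of (y1(t) y2(x) - y1(x) y2(t)) f(t) / W(t) dt,
    # with y1 = cos, y2 = sin, W = 1, f = sec, by the midpoint rule
    h = x / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += (math.cos(t) * math.sin(x)
                  - math.cos(x) * math.sin(t)) / math.cos(t)
    return total * h

# Doing the integral exactly gives y0 = x sin x + cos x ln(cos x):
x = 0.5
exact = x * math.sin(x) + math.cos(x) * math.log(math.cos(x))
assert abs(particular(x) - exact) < 1e-6
```

The general solution then adds the complementary function, y = y0 + α cos x + β sin x.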