
NOTES ON

PHYSICS

Peter Heuer
University of California Los Angeles
A Reminder

Sometimes I hate physics for being so complicated.

I get frustrated and feel like I’ll never understand.

I fail tests and break equipment,

disappointing myself and those who have taught me.

But sometimes, when the pieces snap together,

the beauty of physics appears like a mirage above the pages.

My shaky mastery of the equations feels like a great power.

And I am profoundly grateful.


Contents
1 Mathematical Methods 10
1.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 10
1.1.1 Properties of Matrices and Matrix Operations . . . . . . . . . . . . . . . . 10
1.1.2 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
1.1.3 The Eigenvalue Problem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1.4 Change of Coordinate Matrices . . . . . . . . . . . . . . . . . . . . . . . . 12
1.1.5 Rotation Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.1.6 Matrix Exponentiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2 Geometry . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.2.1 The Solid Angle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13
1.3 Coordinate Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.3.1 Spherical Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
1.4 Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.4.1 Trig Substitution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 15
1.5 Fourier Series and Transforms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.5.1 Fourier Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 16
1.5.2 Fourier Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
1.6 The Legendre Transform . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7 Vector Calculus . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.1 Line and Volume Elements . . . . . . . . . . . . . . . . . . . . . . . . . . 18
1.7.2 Surface Normals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.7.3 Vector Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19
1.7.4 The Divergence Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . 20
1.7.5 The Curl Theorem (Stokes Theorem) . . . . . . . . . . . . . . . . . . . . 21
1.7.6 The Helmholtz Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21
1.8 The Calculus of Variations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.8.1 Functional Derivatives . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.9 Complex Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.9.1 Some Useful Identities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
1.9.2 Extension to the Complex Plane . . . . . . . . . . . . . . . . . . . . . . . 22
1.9.3 Poles and Residue . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 23
1.9.4 The Cauchy Residue Theorem . . . . . . . . . . . . . . . . . . . . . . . . 23
1.10 Solving Differential Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.10.1 Separation of Variables . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
1.10.2 Roots of the Characteristic Polynomial . . . . . . . . . . . . . . . . . . . . 25
1.11 Einstein Summation Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.1 The Summation Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.2 The Dot Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.3 The Levi-Civita Symbol . . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
1.11.4 Cross Product . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 28
1.11.5 Example: Proving Vector Identities with Einstein Notation . . . . . . . . 28
1.12 Special Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.12.1 Legendre Polynomials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 29
1.12.2 Associated Legendre Polynomials . . . . . . . . . . . . . . . . . . . . . . . 29
1.12.3 Spherical Harmonics (The Ylm ’s) . . . . . . . . . . . . . . . . . . . . . . . 30
1.12.4 Bessel Functions and Hankel Functions . . . . . . . . . . . . . . . . . . . 31
1.12.5 Airy Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.13 Probability and Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
1.13.1 Combinations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 34

1.14 Statistics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.14.1 The Autocorrelation function . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.15 Miscellaneous . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.15.1 The Feynman Integration Trick . . . . . . . . . . . . . . . . . . . . . . . . 35
1.15.2 Rayleigh’s Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
1.15.3 The Law of Cosines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.15.4 Taylor Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.15.5 The Binomial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
1.15.6 The Schwarz Inequality . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.16 Useful Facts . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.16.1 Integrals . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
1.16.2 Series . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.16.3 Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.16.4 Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
1.16.5 Theorems/Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39

2 Classical Mechanics 39
2.1 Poisson Brackets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
2.1.1 Identities and Useful Relations . . . . . . . . . . . . . . . . . . . . . . . . 39
2.1.2 Poisson Bracket form of Hamilton’s Equations of Motion . . . . . . . . . . 40
2.2 Derivation of Virial Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
2.3 Derivation of the Euler-Lagrange Equation (1D) . . . . . . . . . . . . . . . . . . 41
2.4 Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.1 Types of Constraints . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.2 Lagrange Multipliers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 43
2.4.3 Generalized Coordinates . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.5 Conserved Quantities and Noether’s Theorem . . . . . . . . . . . . . . . . . . . . 44
2.5.1 Equivalent Lagrangians . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.5.2 Noether’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
2.5.3 Conserved Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 45
2.6 Derivation of Hamilton’s Equations of Motion . . . . . . . . . . . . . . . . . . . . 45
2.7 Canonical Coordinates and the Hamilton-Jacobi Equation . . . . . . . . . . . . . 46
2.7.1 Canonical Coordinates and Transformations . . . . . . . . . . . . . . . . . 46
2.7.2 The Method of Generating Functions . . . . . . . . . . . . . . . . . . . . 47
2.7.3 The Hamilton-Jacobi Equation . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8 The Harmonic Oscillator . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
2.8.1 Solutions of the 1D Harmonic Oscillator . . . . . . . . . . . . . . . . 48
2.8.2 Solutions of the Damped 1D Harmonic Oscillator . . . . . . . . . . . . . 49
2.8.3 Finding the Frequency of Small Oscillations: Approximations of a SHO 51
2.9 Orbital Motion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9.1 The Reduced Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 51
2.9.2 Orbital Conserved Quantities . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.9.3 Kepler’s Second Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 52
2.9.4 Deriving Orbits with Newtonian Mechanics: The Integral Equation . . . 52
2.9.5 Deriving the Orbit Equation . . . . . . . . . . . . . . . . . . . . . . . . . 53
2.9.6 The Kepler Problem (Explicit Equations for Orbits) . . . . . . . . . . 55
2.10 Rigid Body Motion (Rotation in Euler Angles) . . . . . . . . . . . . . . . . . . . 55
2.10.1 Moment of Inertia . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 56
2.10.2 Calculating Inertia Tensors . . . . . . . . . . . . . . . . . . . . . . . . . . 57
2.10.3 Principal Axes of Rotation: Diagonalizing the Inertia Tensor . . . . . . . 57
2.10.4 Euler Angles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 58

2.10.5 Euler’s Equations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.11 Classical Scattering Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 59
2.11.1 Example: Hard Sphere Scattering . . . . . . . . . . . . . . . . . . . . . . 60

3 Relativity 61
3.1 The Lorentz Transformation and its Consequences . . . . . . . . . . . . . . . . . 61
3.1.1 The Lorentz Transformation . . . . . . . . . . . . . . . . . . . . . . . . . 61
3.1.2 Time Dilation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.1.3 Lorentz Contraction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 62
3.1.4 The Einstein Velocity Addition Rule . . . . . . . . . . . . . . . . . . . . . 63
3.2 Four-Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.2.1 Four-Vectors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 63
3.2.2 Lorentz Transformations in Four-Vector Notation . . . . . . . . . . . . . . 65
3.3 Relativistic Mass . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.4 Relativistic Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 66
3.5 Relativistic Three Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67
3.6 Four Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 67

4 Electromagnetism 68
4.1 A Note on Griffiths’s “script r” . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.2 The Units of Electromagnetism . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
4.3 Static Electric Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.3.1 Electric Field Discontinuities . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.3.2 Green’s Reciprocity Relation . . . . . . . . . . . . . . . . . . . . . . . . . 70
4.3.3 Example: Force Between Spherically Symmetric Charge Distributions
Using Green’s Reciprocity Relation . . . . . . . . . . . . . . . . . . . . . . 70
4.3.4 Energy of Electrostatic Charge Distributions . . . . . . . . . . . . . . . . 71
4.3.5 Example: Electric field of a Uniformly Charged Sphere . . . . . . . . . . . 72
4.4 The Scalar Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 73
4.4.1 Definition of the Scalar Potential . . . . . . . . . . . . . . . . . . . . . . . 73
4.4.2 Example: Potential of a Uniformly Charged Sphere . . . . . . . . . . . . . 73
4.4.3 Example: Potential of a Uniformly Charged Spherical Shell (by direct
integration) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 74
4.5 Capacitance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75
4.5.1 Example: Capacitance of a Parallel Plate Capacitor . . . . . . . . . . . . 75
4.5.2 Energy Stored in a Charged Capacitor . . . . . . . . . . . . . . . . . . . . 76
4.6 Potential Theory (Poisson and Laplace Equations) . . . . . . . . . . . . . . . . . 76
4.6.1 Cartesian Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.6.2 Cylindrical Solution . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
4.6.3 A Note on Scale Independence . . . . . . . . . . . . . . . . . . . . . . . . 78
4.7 The Multipole Expansion (Monopoles and Dipoles) . . . . . . . . . . . . . . . . . 78
4.7.1 The Multipole Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
4.7.2 The Monopole and Dipole Potentials . . . . . . . . . . . . . . . . . . . . . 79
4.7.3 The Spherical Multipole Expansion . . . . . . . . . . . . . . . . . . . . . . 80
4.7.4 The Electric Field of a Dipole . . . . . . . . . . . . . . . . . . . . . . . . . 81
4.7.5 Torques and Forces on Physical Dipoles . . . . . . . . . . . . . . . . . . . 81
4.8 Polarizability and Dielectrics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
4.8.1 Deriving the Bound Surface and Volume Charges . . . . . . . . . . . . . . 82
4.8.2 Polarizability . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 83
4.8.3 The “Fake” Field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.9 Fields in Matter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84
4.9.1 D and H . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 84

4.10 Boundary Conditions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.11 The Method of Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 85
4.11.1 Some Useful Image Charges . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.12 Green Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 86
4.12.1 Physical Interpretation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.12.2 Specific Green Functions . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
4.13 Magnetic Multipole Expansion . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.13.1 Magnetic Dipoles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.13.2 Forces on a Magnetic Dipole . . . . . . . . . . . . . . . . . . . . . . . . . 88
4.13.3 Magnetic Field of a Magnetic Dipole . . . . . . . . . . . . . . . . . . . . . 89
4.14 Magnetostatics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.14.1 The Lorentz Force . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 90
4.14.2 Ohm’s Law and Potential Theory with Steady Currents . . . . . . . . . 90
4.14.3 Resistors . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.14.4 The Biot-Savart Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.14.5 Ampere’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 91
4.14.6 The Vector Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.14.7 Inductance . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 92
4.14.8 Magnetic Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.14.9 Magnetic Scalar Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
4.14.10 Simple Magnetic Materials and the Auxiliary Field, H . . . . . . . . . . . 94
4.14.11 Bound and Free Currents . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
4.14.12 The Auxiliary Magnetic Scalar Potential and the “Fake” Magnetic Charge 95
4.15 Electrodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 96
4.15.1 Time Varying Scalar and Vector Potentials . . . . . . . . . . . . . . . . . 96
4.15.2 Conservation of Charge and its Consequences . . . . . . . . . . . . . . . . 96
4.15.3 The Displacement Current . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.15.4 Polarization Current . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 97
4.15.5 Faraday’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 98
4.16 Slowly Varying Fields and Currents: Quasi-statics . . . . . . . . . . . . . . . . . 98
4.16.1 Quasi-Electrostatics and Quasi-Magnetostatics . . . . . . . . . . . . . . . 99
4.17 Gauge Freedom . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.17.1 The Coulomb Gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.17.2 The Lorentz Gauge . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 99
4.18 Energy and Momentum in E&M Fields . . . . . . . . . . . . . . . . . . . . . . . . 100
4.18.1 Energy Flow and the Poynting Vector . . . . . . . . . . . . . . . . . . . . 100
4.18.2 Electromagnetic Momentum . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.19 Electromagnetic Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.19.1 The Wave Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 101
4.19.2 Plane Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 102
4.19.3 Polarization . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 103
4.20 Wave Packets . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 104
4.20.1 Group Velocity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.21 Waves in Matter . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 105
4.21.1 Simple Media . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.21.2 Simple Plane Waves . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 106
4.21.3 Specular Reflection and Snell’s Law . . . . . . . . . . . . . . . . . . . . . 107
4.21.4 Total Internal Reflection . . . . . . . . . . . . . . . . . . . . . . . . . . . . 108
4.21.5 Polarized Light at Boundaries (The Fresnel Equations) . . . . . . . . . . . 109
4.21.6 Simple Conducting Matter . . . . . . . . . . . . . . . . . . . . . . . . . . 110
4.21.7 Waves in Dispersive Media . . . . . . . . . . . . . . . . . . . . . . . . . . 111

4.21.8 The Drude Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 113
4.21.9 The Lorentz Model . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.21.10 The Appleton Model of a Magnetized Plasma . . . . . . . . . . . . . . . . 114
4.22 Waveguides and Transmission Lines . . . . . . . . . . . . . . . . . . . . . . . . . 114
4.22.1 TEM Waves (Transmission Lines) . . . . . . . . . . . . . . . . . . . . . . 115
4.22.2 Conducting Plane(s) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 116
4.22.3 Conducting Tubes . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 118
4.22.4 Example: Rectangular Tube Waveguides . . . . . . . . . . . . . . . . . . . 120
4.22.5 Example: Cylindrical Tube Waveguides . . . . . . . . . . . . . . . . . . . 121
4.22.6 Flow of Energy Through a Waveguide . . . . . . . . . . . . . . . . . . . . 121
4.23 Fields of Moving Particles (Radiation) . . . . . . . . . . . . . . . . . . . . . . . . 122
4.23.1 Retardation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 122
4.23.2 Fields of a Moving Point Charge . . . . . . . . . . . . . . . . . . . . . . . 123
4.23.3 Time-Dependent Electric Dipoles . . . . . . . . . . . . . . . . . . . . . . . 124
4.23.4 Fields in the Radiation Zone (Time-domain) . . . . . . . . . . . . . . . . 126
4.23.5 Fields in the Radiation Zone (Frequency-domain) . . . . . . . . . . . . 127
4.23.6 The Larmor Formula . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 127
4.23.7 Generalized Cartesian Multipole Radiation . . . . . . . . . . . . . . . . . 127
4.23.8 Electric and Magnetic Dipole Radiation . . . . . . . . . . . . . . . . . . . 128
4.24 Electromagnetic Scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 129
4.24.1 Thomson Scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.24.2 Rayleigh Scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 130
4.24.3 The Born Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . 131
4.24.4 The Optical Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 132

5 Thermodynamics 133
5.1 The Laws of Thermodynamics . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.1.1 The Zeroth Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.1.2 The First Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.1.3 The Second Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.1.4 The Third Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 133
5.2 Ideal Gases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 134
5.2.1 Velocity of Particles in an Ideal Gas . . . . . . . . . . . . . . . . . . . . . 134
5.2.2 Energy of an Ideal Gas . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.3 The PV Plane . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 135
5.4 Isotherms and Adiabats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.4.1 Isotherms . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.4.2 Adiabats . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 136
5.5 Heat Engines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.1 Efficiency and Efficiency Limits . . . . . . . . . . . . . . . . . . . . . . . . 138
5.5.2 The Carnot Cycle . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 139
5.6 Intensive vs. Extensive Quantities . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.7 The Chemical Potential . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 140
5.8 Thermodynamic Potentials . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.8.1 The Internal Energy (or Fundamental Thermodynamic Relation) . . . . 141
5.8.2 The Enthalpy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 141
5.8.3 The Helmholtz Potential (or Helmholtz Free Energy) . . . . . . . . . . . . 142
5.8.4 The Gibbs Free Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 142
5.9 Maxwell Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.10 The Thermodynamic Square . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 143
5.11 Heat Capacity . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 144

5.11.1 Specific Heat . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.11.2 Constant Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 145
5.11.3 Constant Pressure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 146
5.11.4 Relationship between Heat Capacities . . . . . . . . . . . . . . . . . . . . 146

6 Statistical Mechanics 146


6.1 Multiplicity, Temperature, and Entropy . . . . . . . . . . . . . . . . . . . . . . . 146
6.2 Density of States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 147
6.2.1 Density of Quantum States in Phase Space . . . . . . . . . . . . . . . . . 147
6.2.2 Example: Density of States for a 3D Square Well . . . . . . . . . . . . . . 148
6.3 Energy Distribution at Equilibrium . . . . . . . . . . . . . . . . . . . . . . . . . . 149
6.4 The Microcanonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.5 The Canonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.5.1 Probability of States . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 150
6.6 The Grand Canonical Ensemble . . . . . . . . . . . . . . . . . . . . . . . . . . . . 151
6.7 The Boltzmann Factor and Boltzmann Equation . . . . . . . . . . . . . . . . . . 152
6.8 The Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 152
6.8.1 The Canonical Partition Function . . . . . . . . . . . . . . . . . . . . . . 152
6.8.2 The Grand Canonical Partition Function . . . . . . . . . . . . . . . . . . 153
6.8.3 Combining Partition Functions . . . . . . . . . . . . . . . . . . . . . . . . 153
6.8.4 Ideal Gas Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . 153
6.8.5 The Partition Function and Thermodynamics . . . . . . . . . . . . . . . . 154
6.9 Using the Partition Function . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 155
6.9.1 Average Energy and Fluctuations in Energy . . . . . . . . . . . . . . . . . 155
6.9.2 Average State Occupancy and Fluctuations . . . . . . . . . . . . . . . . . 156
6.10 The Equipartition Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 156
6.11 Quantum Statistics: Bose, Fermi, and Boltzmann Distributions . . . . . . . . . . 157
6.11.1 The Average Number of Particles . . . . . . . . . . . . . . . . . . . . . . . 157
6.11.2 The Fermi-Dirac Distribution . . . . . . . . . . . . . . . . . . . . . . . . . 159
6.11.3 The Bose-Einstein Distribution . . . . . . . . . . . . . . . . . . . . . . . . 160
6.11.4 The Planck Distribution (Photons) . . . . . . . . . . . . . . . . . . . . . . 160
6.11.5 The Maxwell-Boltzmann Distribution . . . . . . . . . . . . . . . . . . . . 161
6.12 Black Body Radiation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.12.1 Planck’s Law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 161
6.12.2 The Stefan-Boltzmann Law . . . . . . . . . . . . . . . . . . . . . . . . . . 162
6.12.3 Energy Density of a Black Body Photon Gas . . . . . . . . . . . . . . . . 163
6.13 Fermi-Gases . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.13.1 The Fermi Energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164
6.14 Velocity Distributions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 166
6.14.1 The Maxwell Velocity Distribution: Rigorous Derivation . . . . . . . . . . 166
6.14.2 The Maxwell Velocity Distribution: Fast Derivation . . . . . . . . . . . . 167
6.14.3 Maxwell Distribution of a Single Velocity Component . . . . . . . . . . . 167
6.14.4 Maxwell Speed Distribution . . . . . . . . . . . . . . . . . . . . . . . . . . 167
6.15 Kinetic Theory . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 168
6.15.1 Number of Particles Hitting a Surface . . . . . . . . . . . . . . . . . . . . 168
6.15.2 Effusion Through an Aperture . . . . . . . . . . . . . . . . . . . . . . . . 169
6.16 The Saha Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 170

7 Quantum Mechanics 171
7.1 The Bra-ket Notation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
7.1.1 Tips for Working with Bra-kets . . . . . . . . . . . . . . . . . . . . . . . . 171
7.1.2 Inner and Outer Products . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
7.1.3 Basis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 171
7.2 Complete Sets of Compatible Observables (CSCO) . . . . . . . . . . . . . . . . . 172
7.3 Time Evolution, Translation, and Rotation Operators . . . . . . . . . . . . . . . 173
7.3.1 The Translation Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . 173
7.3.2 The Time Evolution Operator . . . . . . . . . . . . . . . . . . . . . . . . . 173
7.3.3 The Rotation Operator . . . . . . . . . . . . . . . . . . . . . . . . . . . . 174
7.4 Approaches to Quantum Mechanics (Schrodinger vs. Heisenberg ) . . . . . . . . 174
7.5 The Schrodinger Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 175
7.5.1 Spherical Solutions to the Schrodinger Equation . . . . . . . . . . . . . . 175
7.6 The Heisenberg Equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 176
7.7 Commutator Relations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.7.1 The Canonical Commutation Relations . . . . . . . . . . . . . . . . . . . 177
7.7.2 Classical Correspondence of Commutators . . . . . . . . . . . . . . . . . . 177
7.7.3 Ehrenfest’s Theorem . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.8 Dispersion and Uncertainty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.8.1 Dispersion . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 177
7.8.2 The Uncertainty Principle . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.9 Spin 1/2 Systems . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 178
7.9.1 Constructing the state vectors . . . . . . . . . . . . . . . . . . . . . . . . 178
7.9.2 Example: Deriving the |S⃗ · n̂, +⟩ State Ket . . . . . . . . . . . . . . . . 179
7.9.3 Deriving the Spin Operators . . . . . . . . . . . . . . . . . . . . . . . . . 180
7.9.4 Deriving the Commutator Relationships . . . . . . . . . . . . . . . . . . . 180
7.9.5 Pauli Spin Matrices for the Spin 1/2 System . . . . . . . . . . . . . . . . 181
7.10 Total Angular Momentum: J⃗ . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.10.1 Getting s, S, l, L, j, and J straight . . . . . . . . . . . . . . . . . . . . . . 182
7.10.2 Quantum Numbers . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182
7.10.3 A Note on Excessive j’s in Notation . . . . . . . . . . . . . . . . . . . . . 183
7.10.4 Addition of Angular Momenta . . . . . . . . . . . . . . . . . . . . . . . . 183
7.10.5 Clebsch-Gordan Coefficients . . . . . . . . . . . . . . . . . . . . . . . . . . 185
7.11 Time Independent Perturbation Theory . . . . . . . . . . . . . . . . . . . . . . . 187
7.11.1 Non-Degenerate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
7.11.2 Degenerate . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 188
7.12 Time Dependent Non-Degenerate Perturbation Theory . . . . . . . . . . . . . . . 189
7.12.1 Fermi’s Golden Rule . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 189
7.13 Scattering . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 190
7.13.1 Partial Wave Analysis: Theory . . . . . . . . . . . . . . . . . . . . . . . . 190
7.13.2 Phase Shift Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192
7.13.3 Partial Wave Analysis: Practical Application . . . . . . . . . . . . . . . . 192
7.13.4 The Born Approximation . . . . . . . . . . . . . . . . . . . . . . . . . . . 193

Image credits:
Title page: Wikipedia.
Divider Graphic: http://kevinflemingphd.com/wp-content/uploads/2014/02/divider-line.png
Compiled on September 23, 2015.
1 Mathematical Methods
1.1 Matrices
1.1.1 Properties of Matrices and Matrix Operations
• Trace: The trace of an n × n matrix A is defined as tr(A) = Σ_{i=1}^{n} A_ii, the sum of the diagonal elements. Alternately, this can be written in bra-ket notation as tr(A) = Σ_k ⟨k|A|k⟩.
The following are some useful properties of the trace:

– Trace is linear: tr(A + B) = tr(A) + tr(B).


– tr(cA) = c tr(A).
– tr(AT ) = tr(A).
– tr(AB) = tr(BA).
– The trace is invariant under cyclic permutations of its elements: tr(ABC) =
tr(BCA) = tr(CAB).

• Conjugate Transpose (or Adjoint Matrix, Hermitian Conjugate, or Hermitian Transpose): The complex-conjugate transpose of a matrix is given by:

A†ij = A∗ji (1.1.1)

The following are some properties of the conjugate transpose, for matrices A, B of the
same dimensions, and a complex constant r.

– (A + B)† = A† + B † (Distributive)
– (rA)† = r∗ A†
– (AB)† = B † A†
– (A† )† = A
– The eigenvalues of A† are the complex conjugates of the eigenvalues of A.

• Normal Matrix: A matrix A is normal if

A† A = AA† (1.1.2)

Where A† is the conjugate-transpose of A.

• Singular Matrix: A matrix A is said to be singular if it does not have a matrix inverse.
This is the case iff det(A) = 0.

• Hermitian or Self-Adjoint: A matrix A is Hermitian if it is equal to its own complex-conjugate transpose, denoted A†. Note that, in order to be Hermitian, a matrix must be square.
The following properties hold for hermitian matrices:

– All eigenvalues of a Hermitian matrix are real.

– All Hermitian matrices are normal
– The determinant of a Hermitian matrix is always real.

The following properties describe the Hermiticity of combinations of matrices.

– If C is a square matrix, then C + C † is Hermitian.


– If C is a square matrix, then C − C † is Skew Hermitian (see below).

• Skew Hermitian or Antihermitian: A matrix is skew hermitian if

A† = −A (1.1.3)

• Diagonalizable: A matrix A is diagonalizable if it can be transformed into a basis in which it is a diagonal matrix. In other words, there must exist some matrix Q such that Q⁻¹AQ is diagonal.
A matrix will be diagonal when written in a basis consisting of its eigenvectors.

• Unitary: A matrix U is unitary if:

U †U = U U † = I (1.1.4)

Where I is the identity matrix.


The following properties hold for any unitary matrix U :

– Unitary transformations preserve inner products! That is, for some vectors
|A⟩ and |B⟩, ⟨UA|UB⟩ = ⟨A|U†U|B⟩ = ⟨A|B⟩.
– U is normal
– U is diagonalizable.
– The eigenvectors of U are orthogonal.

1.1.2 Determinants

The following are some important facts about determinants, where A and B are n × n matrices
and c is a constant:

• det(AB) = det(A)det(B)

• det(A−1 ) = det(A)−1

• det(AT ) = det(A)

• det(cA) = cⁿ det(A)

1.1.3 The Eigenvalue Problem

Let A be an n × n matrix, and x ∈ Rⁿ. Then x is an eigenvector if the following equation holds:

Ax = λx (1.1.5)

Where the λ’s are called the eigenvalues of A. Rearranging this equation, we obtain the condition
that Ax − λx = (A − λI)x = 0.
We are uninterested in the trivial solution to this equation, where x = 0. However, if
(A − λI)−1 exists, then we can easily deduce that x = 0 by applying the inverse to both sides.
Therefore, for non-trivial solutions, (A − λI) must be singular (non-invertible). This is only the
case if:
det(A − λI) = 0 (1.1.6)

Evaluating this determinant yields an equation in λ whose roots are the eigenvalues of the matrix
A.
Once we have found the set of λi ’s, the next task is to identify the eigenvectors xi that go
with each λi . These can be found by the following system of n equations in x1 , x2 , ...xn :

(A − λI)xi = 0 (1.1.7)

Solving this system of equations fully determines xi .
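For a quick numerical check of this whole procedure, here is a minimal sketch (assuming NumPy is available; numpy.linalg.eig solves the eigenvalue problem directly):

import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 2.0]])
eigvals, eigvecs = np.linalg.eig(A)      # columns of eigvecs are the eigenvectors
for lam, x in zip(eigvals, eigvecs.T):
    assert np.allclose(A @ x, lam * x)   # verify Ax = λx for each eigenpair
print(eigvals)                           # 3 and 1 for this symmetric matrix (order may vary)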

1.1.4 Change of Coordinate Matrices

Suppose we have a vector x ∈ Rn expressed in a basis of n vectors, αi . We would like to change


coordinates to another basis, βi . To do this, we must assemble a change of coordinates matrix.
We first express our current basis vectors, αi in terms of our new basis vectors, βi :
αi = Σ_j cij βj   (1.1.8)

The cij ’s now make up the change of coordinates matrix, where each “i” represents a column.
We will write this matrix as Q = (cij ).
The vector x can now be transformed:

xβ = Qxα (1.1.9)

The same change-of-coordinates matrix can be used to convert a matrix T (which could represent
an operator) from the α to the β basis:

Tβ = Q−1 Tα Q (1.1.10)

The change-of-coordinate matrix Q must be a unitary matrix: that is, we require that QQ† =
Q† Q = I. Unitary transformations preserve the norm of a vector or matrix. This is of course
necessary for a well defined coordinate transformation, which should not change the length of
the vectors you are transforming!
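As a sketch of this in practice (assuming NumPy), a unitary (here real orthogonal) change of basis leaves the eigenvalues of an operator untouched:

import numpy as np

rng = np.random.default_rng(0)
T_alpha = rng.standard_normal((3, 3))
Q, _ = np.linalg.qr(rng.standard_normal((3, 3)))   # QR factorization yields an orthogonal Q
T_beta = np.linalg.inv(Q) @ T_alpha @ Q            # eq. (1.1.10)
assert np.allclose(np.sort_complex(np.linalg.eigvals(T_alpha)),
                   np.sort_complex(np.linalg.eigvals(T_beta)))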

1.1.5 Rotation Matrices

A particularly common type of coordinate change (and unitary transformation) is a rotation


about the origin. The following matrices can be applied as change-of-coordinate matrices (to
either a vector or, on either side, to a matrix) to effect a rotation.
In 2D:

          | cos θ   −sin θ |
Rorigin = | sin θ    cos θ |          (1.1.11)

This matrix effects a clockwise rotation of the coordinate system by an angle θ.
In 3D, three rotation matrices are necessary:

     | 1     0       0    |
Rx = | 0   cos θ  −sin θ  |          (1.1.12)
     | 0   sin θ   cos θ  |

     |  cos θ   0   sin θ |
Ry = |   0      1     0   |          (1.1.13)
     | −sin θ   0   cos θ |

     | cos θ  −sin θ   0  |
Rz = | sin θ   cos θ   0  |          (1.1.14)
     |   0      0      1  |

Notice that each of the 3D rotation matrices ‘contains’ the 2D rotation matrix. The rows of
the matrix that contain these elements are the ones that are rotating. The row with the 1 is
the axis about which the rotation is being performed. Noting these similarities will make the
matrices easier to memorize.
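A short NumPy sketch confirming these properties for Rz (orthogonality, unit determinant, and the embedded 2D rotation):

import numpy as np

def Rz(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0],
                     [s,  c, 0.0],
                     [0.0, 0.0, 1.0]])

R = Rz(0.3)
assert np.allclose(R.T @ R, np.eye(3))     # rotations are orthogonal: Rᵀ R = I
assert np.isclose(np.linalg.det(R), 1.0)   # proper rotation, det = +1
# the upper-left 2x2 block is exactly the 2D rotation matrix (1.1.11)
assert np.allclose(R[:2, :2], [[np.cos(0.3), -np.sin(0.3)],
                               [np.sin(0.3),  np.cos(0.3)]])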

1.1.6 Matrix Exponentiation

Occasionally, we will find ourselves with a matrix A in an equation such as e^A. We will interpret
this “matrix exponential” by using the Taylor series of e^x:

e^A = Σ_{n=0}^{∞} Aⁿ/n!   (1.1.15)
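A sketch checking this definition numerically (assuming SciPy is available): the truncated Taylor sum converges to scipy.linalg.expm, and exponentiating the generator of 2D rotations produces a rotation matrix:

import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, -1.0],
              [1.0,  0.0]])        # generator of rotations in the plane
series = np.zeros((2, 2))
term = np.eye(2)                   # A^0/0!
for n in range(1, 30):
    series += term
    term = term @ A / n            # builds A^n/n! iteratively
assert np.allclose(series, expm(A))
assert np.allclose(expm(A), [[np.cos(1), -np.sin(1)],
                             [np.sin(1),  np.cos(1)]])   # e^A rotates by 1 radian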

1.2 Geometry
1.2.1 The Solid Angle

The solid angle describes the amount of surface area of a sphere spanned by some θ and φ,
independent of the radius of the sphere. This is often useful when describing a situation from the
point of view of an observer at the center of the sphere. For example, if you hold up your thumb
to block the Sun in the sky, your thumb and the Sun occupy the same solid angle.
The solid angle is defined as a differential element to be:

dΩ = sin(θ)dθdφ (1.2.1)

Figure 1.2.1: Solid Angle Diagram. (Source: http://www.thermopedia.com/content/4761/2.gif)

The solid angle is often used as a nice way to simplify the spherical Jacobian term when
integrating in spherical coordinates:

dA = r2 sin(θ)dθdφ = r2 dΩ (1.2.2)

This solid angle is responsible for the ubiquitous factors of 4π in spherical integrals, since:

∫_surface dΩ = 4π   (1.2.3)
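A one-line numerical confirmation of eq. (1.2.3), sketched with scipy.integrate.dblquad (assuming SciPy):

import numpy as np
from scipy.integrate import dblquad

# integrate sin(theta) over theta in [0, pi] and phi in [0, 2*pi]
omega, _ = dblquad(lambda theta, phi: np.sin(theta), 0, 2*np.pi, 0, np.pi)
assert np.isclose(omega, 4*np.pi)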

1.3 Coordinate Systems


1.3.1 Spherical Coordinates

Figure 1.3.1: Spherical Coordinates. Source: Wikipedia.

Spherical coordinates are at the center of one of the most annoying convention disputes in
science. In physics, the angle φ is chosen as the azimuthal angle and θ as the polar angle (see

fig. 1.3.1), while in mathematics the two are often reversed! There is a strong common-sense
argument to be made for the mathematicians’ system, since polar coordinates are a projection of
spherical coordinates onto a plane, and are usually denoted in r and θ. However, for the sake of
consistency, I have tried to stick to the physicists’ convention in these notes.
Some clever geometry reveals that vectors can be transformed from Cartesian to spherical
coordinates as:
r̂ = sin θ cos φx̂ + sin θ sin φŷ + cos θẑ (1.3.1)

θ̂ = cos θ cos φ x̂ + cos θ sin φ ŷ − sin θ ẑ (1.3.2)

φ̂ = − sin θx̂ + cos θŷ (1.3.3)

Really, only the r̂ expression is necessary to memorize. If necessary, the expression for
φ̂ is easy to derive from a diagram.
It is also often useful to know the time derivatives of some of these unit vectors (for planar motion, where φ̇ = 0):

(d/dt) r̂ = θ̇ θ̂   (1.3.4)

(d/dt) θ̂ = −θ̇ r̂   (1.3.5)
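A SymPy sketch verifying eq. (1.3.4) from the unit-vector expressions above (with φ held constant, as assumed):

import sympy as sp

t, phi = sp.symbols('t phi')
theta = sp.Function('theta')(t)
rhat = sp.Matrix([sp.sin(theta)*sp.cos(phi), sp.sin(theta)*sp.sin(phi), sp.cos(theta)])
thetahat = sp.Matrix([sp.cos(theta)*sp.cos(phi), sp.cos(theta)*sp.sin(phi), -sp.sin(theta)])

# d(rhat)/dt should equal thetadot * thetahat
assert sp.simplify(rhat.diff(t) - sp.diff(theta, t)*thetahat) == sp.zeros(3, 1)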

1.4 Calculus
1.4.1 Trig Substitution

Trig substitution makes use of trigonometric identities to simplify integrals. Starting with the
identity:
sin2 (x) + cos2 (x) = 1 (1.4.1)

If we divide both sides by cos2 (x), we get:

tan2 (x) + 1 = sec2 (x) (1.4.2)

While dividing by sin2 (x) leaves:

1 + cot2 (x) = csc2 (x) (1.4.3)

Now, consider an integral of the form:

I = ∫ x dx / √(a² ± x²)   (1.4.4)

Where a is a real constant. We could simplify the integral by dividing out an a:

I = (1/a) ∫ x dx / √(1 ± x²/a²)   (1.4.5)

For the moment, let’s take ± → +. We can make the argument of the square root
resemble one of the trig identities above if we make the substitution:

x/a = tan(u)   or   x/a = cot(u)   (1.4.6)

In general, the first of these is going to be nicer, since the derivative of tan(u) makes it easy to
write that:
dx = a sec2 (u)du (1.4.7)

Making the substitution and then applying the trig identity:

I = (1/a) ∫ [a tan(u)][a sec²(u) du] / √(1 + tan²(u)) = a ∫ tan(u) sec²(u) du / √(sec²(u)) = a ∫ tan(u) sec(u) du = a ∫ [sin(u)/cos²(u)] du   (1.4.8)
The resulting integral is now easy to solve with another substitution: z = cos(u).
If instead in the original integral we had ± → −, we could simply subtract 1 from both sides
of the trig identities to get a new set of three easily applicable identities.

Notice that, in order for this trick to work, we need both a term (usually a² + x²) to
morph into the trig identity, as well as a free x on top or bottom to help make the resulting
integral tractable.
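A SymPy sketch confirming the worked example (with ± → +, the answer is √(a² + x²) up to a constant):

import sympy as sp

x, a = sp.symbols('x a', positive=True)
I = sp.integrate(x / sp.sqrt(a**2 + x**2), x)
assert sp.simplify(I - sp.sqrt(a**2 + x**2)) == 0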

1.5 Fourier Series and Transforms


1.5.1 Fourier Series

Fourier series are a way of writing a periodic function f as a sum of (possibly infinitely many) sine and
cosine functions of different frequencies, weighted by appropriate constants. f is usually written:

f(x) = Σ_{n=0}^{∞} A_n cos(nπx/L) + Σ_{n=1}^{∞} B_n sin(nπx/L)   (1.5.1)

Generally a function we wish to represent with a Fourier series exists over a finite interval, but
is not necessarily periodic. In order to find the Fourier series in this case, we create a periodic
function by copying the function end-to-end, off to infinity.
The problem of finding the Fourier series is now reduced to finding the appropriate constants
A_n and B_n. This is accomplished by way of “Fourier’s Trick”.
Fourier’s Trick relies on the orthogonality of sine and cosine functions over some domain
[a, b] at different frequencies. For example:

∫_a^b sin(nx) sin(mx) dx ∝ δ_mn   (1.5.2)

Similarly, sines and cosines of all frequencies are orthogonal.1.1
We can use this fact to find expressions for the constants in the Fourier series. For example,
we will derive the expression for Bn .
Starting with the general Fourier series written above, multiply both sides of the equation
by sin(mπx/L), and then integrate from −L to L (where 2L is the period of f(x)):

∫_{−L}^{L} f(x) sin(mπx/L) dx = ∫_{−L}^{L} [ Σ_{n=0}^{∞} A_n cos(nπx/L) sin(mπx/L) + Σ_{n=1}^{∞} B_n sin(nπx/L) sin(mπx/L) ] dx   (1.5.3)
By the orthogonality of sine and cosine, all but one of the terms on the right-hand side is now
zero!

∫_{−L}^{L} f(x) sin(mπx/L) dx = δ_mn B_n ∫_{−L}^{L} sin(nπx/L) sin(mπx/L) dx   (1.5.4)
Collapsing the delta to replace m’s with n’s, and performing the integral, we get that

∫_{−L}^{L} f(x) sin(nπx/L) dx = B_n L   (1.5.5)
And therefore our expression for B_n is:

B_n = (1/L) ∫_{−L}^{L} f(x) sin(nπx/L) dx   (1.5.6)
The same trick can be used to derive an expression for A_n:

A_n = (1/L) ∫_{−L}^{L} f(x) cos(nπx/L) dx   (1.5.7)
With the special case of A_0 being easily written as

A_0 = (1/2L) ∫_{−L}^{L} f(x) dx   (1.5.8)
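As a sketch of these formulas in action (assuming NumPy and SciPy), the coefficients of f(x) = x on [−L, L] can be computed from eq. (1.5.6) and summed back to reproduce f:

import numpy as np
from scipy.integrate import quad

L = 1.0
f = lambda x: x                                # odd function: only the Bn survive

def B(n):
    val, _ = quad(lambda x: f(x)*np.sin(n*np.pi*x/L), -L, L)
    return val / L                             # eq. (1.5.6)

xs = np.linspace(-0.5, 0.5, 5)
partial = sum(B(n)*np.sin(n*np.pi*xs/L) for n in range(1, 500))
assert np.allclose(partial, f(xs), atol=1e-2)  # converges (slowly) away from the endpoints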

1.5.2 Fourier Transform

A Fourier transform takes a function and returns a continuous function of its frequency
components. The inverse Fourier transform returns from this frequency domain to completely
reconstruct the original function in its original domain.
The Fourier transform is given by:
F(k) = ∫_{−∞}^{∞} f(x) e^{−2πixk} dx   (1.5.9)

while the inverse Fourier transform is given by:

f(x) = ∫_{−∞}^{∞} F(k) e^{2πixk} dk   (1.5.10)
1.1 This can be easily seen: sine is odd, cosine is even, and the domain is required to be symmetric.

These formulas are essentially a continuous version of the Fourier Series coefficient formulas
derived earlier. The separate sine and cosine components are now being handled simultaneously
by the complex exponential.
The following are some important properties of the Fourier Transform:

• The transform is linear: if h(x) = af (x) + bg(x), then H(k) = aF (k) + bG(k)

There are a number of common Fourier transform relationships that are useful to have available:

• Translation in x means a phase factor in k: f(x − x0) ↔ e^{−2πix0k} F(k)

• Translation in k means a phase factor in x: F(k − k0) ↔ e^{2πik0x} f(x)

• Constant multiples of x: f(ax) ↔ (1/|a|) F(k/a)

• Derivatives in x bring down multiples of k in k-space: dⁿf(x)/dxⁿ ↔ (2πik)ⁿ F(k)

• Likewise, multiples of x in x-space correspond to derivatives in k-space: xⁿ f(x) ↔ (i/2π)ⁿ dⁿF(k)/dkⁿ
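SymPy’s fourier_transform happens to use the same e^{−2πixk} convention as eq. (1.5.9), so these relationships are easy to spot-check; a minimal sketch with a Gaussian, which is its own transform:

import sympy as sp

x, k = sp.symbols('x k', real=True)
F = sp.fourier_transform(sp.exp(-sp.pi*x**2), x, k)   # matches convention (1.5.9)
assert sp.simplify(F - sp.exp(-sp.pi*k**2)) == 0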

1.6 The Legendre Transform


The Legendre transform is a common variable transformation that is useful when it is more
convenient to describe a function F(x) in terms of the derivative s = dF(x)/dx.
The Legendre transformation can only be applied when the function F(x) is convex, that is
d²F/dx² > 0, and smooth. This is required so that there exists a 1–1 relation between s and x.
When this is the case, we can invert the usual relation s(x) to obtain x(s).
When these requirements are met, the Legendre transform is written:1.2

G(s) = sx − F(x(s))   (1.6.1)

1.2 A derivation of this expression, as well as a geometric motivation, can be found in the paper Making Sense of the Legendre Transform by R.K.P. Zia et al., Am. J. Phys. 77 (7), July 2009, pg. 614.

The following are some important properties of the Legendre transform:

• The Legendre transform is an involution, which means that it is its own inverse: applying
the transform to G(s) recovers the original F(x).
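A worked SymPy sketch for F(x) = x², including the involution property:

import sympy as sp

x, s = sp.symbols('x s', real=True)
F = x**2
x_of_s = sp.solve(sp.Eq(s, sp.diff(F, x)), x)[0]    # invert s = dF/dx: x = s/2
G = sp.expand(s*x_of_s - F.subs(x, x_of_s))         # eq. (1.6.1): G(s) = s^2/4
assert G == s**2/4

s_of_x = sp.solve(sp.Eq(x, sp.diff(G, s)), s)[0]    # transform G(s) the same way...
assert sp.expand(x*s_of_x - G.subs(s, s_of_x)) == F # ...and recover F(x)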

1.7 Vector Calculus


1.7.1 Line and Volume Elements

The line element is the differential distance in a coordinate system. Each line element in 3D
space takes the form:
dx = h1 dq1 ê1 + h2 dq2 ê2 + h3 dq3 ê3   (1.7.1)

Where the h’s are elements of the Jacobian called “scale factors” that are always the same for
each coordinate system. In the three most commonly used coordinate systems:

• Cartesian: dx = dx x̂ + dy ŷ + dz ẑ

• Cylindrical: dx = dρ ρ̂ + ρ dθ θ̂ + dz ẑ

• Spherical: dx = dr r̂ + r dθ θ̂ + r sin θ dφ φ̂

The volume element is the differential volume in a coordinate system (i.e. what you would put
at the end of an integral in order to integrate over a volume). In general, the differential volume
(in a space with coordinates q1, q2, and q3) is:

d³V = (h1 h2 h3) dq1 dq2 dq3   (1.7.2)

So that:

• Cartesian: d³V = dx dy dz

• Cylindrical: d³V = ρ dρ dθ dz

• Spherical: d³V = r² sin θ dr dθ dφ

1.7.2 Surface Normals

The surface normal is defined to be orthogonal to the surface, and has a magnitude equal to the
differential area element. It is sometimes helpful to imagine dS = n̂ dS, where dS is the usual area
element, while n̂ is the unit normal vector. Suppose you have some surface S in 3D, parameterized
by two variables to be S(t, u). Then a surface normal can be constructed as (∂t S × ∂u S) du dt!
Of course, whether this normal vector points up or down on the surface depends on the order of
this cross product.

1.7.3 Vector Derivatives

(The best intuitive explanation of vector derivatives is given in the first chapter of Griffiths E&M
book. Rather than attempt to top that, this section is focused on easy ways to remember the
form of the operators in different coordinate systems).
Vector derivatives are constructed using the ’derivative vector’, ∇ (formally called ’nabla’,
but often referred to as ’del’). In a coordinate system with coordinates q1 , q2 , and q3 :

∇ = (1/h1) ∂/∂q1 ê1 + (1/h2) ∂/∂q2 ê2 + (1/h3) ∂/∂q3 ê3   (1.7.3)

Applying ∇ to a scalar function f produces the gradient, which is a vector:

∇f = (1/h1) ∂f/∂q1 ê1 + (1/h2) ∂f/∂q2 ê2 + (1/h3) ∂f/∂q3 ê3   (1.7.4)

Dotting ∇ with a vector function A produces the divergence, which is a scalar (and is therefore
often called the scalar product):

∇ · A = (1/(h1 h2 h3)) [ ∂/∂q1 (A1 h2 h3) + ∂/∂q2 (A2 h1 h3) + ∂/∂q3 (A3 h1 h2) ]   (1.7.5)

Notice that the h’s inside the derivative operator are all but the one corresponding to the
coordinate being differentiated against in that term.
Crossing ∇ with a vector function A produces the curl, which is a vector (and is therefore
often called the vector product):

∇ × A = (1/(h1 h2 h3)) det | h1 ê1   h2 ê2   h3 ê3 |
                           |  ∂1      ∂2      ∂3   |   (1.7.6)
                           | h1 A1   h2 A2   h3 A3 |

The only commonly used second-derivative operator is the Laplacian, ∇2 , which acts on a
scalar function f and returns a scalar. The Laplacian is defined as ∇2 f = ∇ · ∇f , and can be
calculated this way from the results above. The result is:

∇²f = (1/(h1 h2 h3)) [ ∂/∂q1 ((h2 h3/h1) ∂f/∂q1) + ∂/∂q2 ((h1 h3/h2) ∂f/∂q2) + ∂/∂q3 ((h1 h2/h3) ∂f/∂q3) ]   (1.7.7)

Notice that all four of these vector derivatives reduce to a simple form in Cartesian coordinates.
It is worth memorizing these Cartesian results, but then also memorizing these general equations
(along with the appropriate line element) so that you can conjure up the appropriate derivatives
in whatever coordinate system you happen to find yourself. There’s no need to memorize the
Laplacian: you rarely need it, and if you do, you can just take ∇ · ∇f to find it.
Here are some useful vector identities concerning vector derivatives:

• Curl of a gradient is zero: ∇ × (∇φ) = 0

• Divergence of a curl is zero: ∇ · (∇ × A) = 0

• Divergence of a vector with a scalar coefficient: ∇ · (cA) = c(∇ · A) + A · (∇c)

• Curl of a vector with a scalar coefficient: ∇ × (cA) = c(∇ × A) + (∇c) × A

• Curl of a curl: ∇ × (∇ × A) = ∇(∇ · A) − ∇2 A

• Divergence of a cross product: ∇ · (A × B) = B · (∇ × A) − A · (∇ × B)

• Curl of a cross product: ∇ × (A × B) = A(∇ · B) − B(∇ · A) + (B · ∇)A − (A · ∇)B



∇(1/r) = −r̂/r²,   ∇′(1/r) = +r̂/r²   (1.7.8)
1.7.4 The Divergence Theorem

Consider a vector field F and some surface, S, enclosing a volume V . The divergence
theorem states that:
∮_S F · dS = ∫_V ∇ · F dV   (1.7.9)

Where dS is the (outward facing) surface normal along the surface.

This theorem makes sense in light of a couple examples. Consider first a vector field that is
diverging from a source inside the surface. In this case, the divergence on the RHS would be
positive, while the dot product on the LHS would also be positive. The same idea, but with
opposite sign, holds if a sink is inside the surface.
Now, imagine a constant vector field. This field is divergence free, so the RHS is zero. On
the LHS, the dot product is positive on half of the surface, but negative on the other half of the
surface. These terms cancel, making the LHS also zero.
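A SymPy sketch of the theorem for F = (x, y, z) on the unit sphere, where ∇ · F = 3 and F · n̂ = 1 on the surface:

import sympy as sp

r, theta, phi = sp.symbols('r theta phi', nonnegative=True)
# LHS: surface integral of F.nhat = 1 over dA = sin(theta) dtheta dphi on the unit sphere
lhs = sp.integrate(sp.sin(theta), (theta, 0, sp.pi), (phi, 0, 2*sp.pi))
# RHS: volume integral of div F = 3 over the unit ball, dV = r^2 sin(theta) dr dtheta dphi
rhs = sp.integrate(3*r**2*sp.sin(theta), (r, 0, 1), (theta, 0, sp.pi), (phi, 0, 2*sp.pi))
assert lhs == rhs == 4*sp.pi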

1.7.5 The Curl Theorem (Stokes Theorem)

Consider a vector field F and a curve C that encloses some surface S. The curl theorem states
that
∮_C F · dC = ∫_S (∇ × F) · dS   (1.7.10)

Where dC is the line element along the curve C, and dS is the area element of the surface, with
direction defined normal to the surface S.
Of course, the direction around which the integral along C is carried out will affect the
sign of the result, as will the choice of an inward or outward oriented dS. The positive surface
normal corresponds to integrating along C counter-clockwise.
A convenient visual explanation of the theorem can be seen by considering the dot product
of various vector fields with the line elements of various curves. For a constant vector field, these
dot products will cancel around the (closed) curve. However, for a vector field with curl, the
sum of the dot products may be non-zero (as the field points different directions at different
points around the loop).
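A SymPy sketch of the theorem for F = (−y, x, 0) on the unit disk, where ∇ × F = (0, 0, 2):

import sympy as sp

t, rho, phi = sp.symbols('t rho phi', nonnegative=True)
x, y = sp.cos(t), sp.sin(t)                  # the unit circle, traversed counter-clockwise
line = sp.integrate(-y*sp.diff(x, t) + x*sp.diff(y, t), (t, 0, 2*sp.pi))
flux = sp.integrate(2*rho, (rho, 0, 1), (phi, 0, 2*sp.pi))   # (curl F).zhat = 2 over the disk
assert line == flux == 2*sp.pi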

1.7.6 The Helmholtz Theorem

The Helmholtz Theorem (or Helmholtz Decomposition)1.3 states that any vector field F can be
decomposed into a divergence-free and a curl-free component:

F = −∇φ + ∇ × A (1.7.11)

Where φ is a scalar function called the potential, and A is a vector function called the vector
potential. Notice that, since ∇ × ∇φ = 0 by an identity, the first term is curl free, while since
∇ · (∇ × A) = 0 by another identity, the second term is divergence free.
If we further assume that F → 0 faster than 1/r as r → ∞, then the theorem states that:

φ(r) = (1/4π) ∫_allspace [ ∇′ · F(r′) / |r − r′| ] dV′   (1.7.12)

A(r) = (1/4π) ∫_allspace [ ∇′ × F(r′) / |r − r′| ] dV′   (1.7.13)
1.3 http://en.wikipedia.org/wiki/Helmholtz_decomposition

1.8 The Calculus of Variations
1.8.1 Functional Derivatives

A functional derivative is the derivative (still ’rate of change’) of a functional as you vary one of
its parameters. If:

J[f] = ∫ L[x, f(x), f′(x)] dx   (1.8.1)

Then, if we perturb the functional L by some variation, δf:

L[x, f(x), f′(x)] → L[x, f(x) + δf, f′(x) + δf′]   (1.8.2)

Then, to first order in the variation, the functional derivative is:

δJ = ∫ (δJ/δf) δf dx   (1.8.3)

Where the ‘differentiation’ is carried out just as you would expect if it were written ∂J/∂f.

1.9 Complex Analysis


Broadly speaking, complex analysis is the calculus of complex numbers. There are many useful
results from this field that translate into physics, most of which relate to special integration
tricks.

1.9.1 Some Useful Identities

A complex number C is defined as C = a + ib, where Re(C) = a and Im(C) = b, while
C* = a − ib. Directly from these definitions, we can derive:

C + C* = 2a   (1.9.1)

Which, along with C − C* = 2ib, yields the expressions:

a = (C + C*)/2   (1.9.2)

b = (C − C*)/(2i)   (1.9.3)
These expressions are very useful whenever you need to get rid of (or conjure out of nowhere) a
Re(C) or Im(C).

1.9.2 Extension to the Complex Plane

In many cases, the functions that we want to apply the techniques of complex analysis to will
actually be defined on the real numbers. In order to apply our techniques, we will need to extend
these functions to the complex plane. This is done by simply making the substitution:

f (x) ←→ f (z), x ∈ R, z ∈ C (1.9.4)

The previously 1D function now exists on the complex plane. Notice that when Im(f (z)) = 0,
the function once again lies only on the real axis (the x-axis).
All of this is rigorously mathematically justifiable, but if you actually want to see a proof of that,
you should probably have been a mathematician.

1.9.3 Poles and Residue

A pole is a point where a complex function diverges to ∞ or −∞. Loosely speaking, a pole
behaves “like” the function f(z) = 1/zⁿ. The integer n is called the order of the pole, and
quantifies how fast the function diverges at that location.
Poles can be separated into two categories. A simple pole is of order n = 1. All other poles
are simply referred to as poles or higher-order poles.
Contour integrals around a pole are proportional to a constant, known as the residue of the
pole. This property is as fantastically useful as it is surprising! The residue of poles is written
as Res(f, c), where f (z) is the function and c is the location of the pole on the complex plane.
The value of the residue can be calculated using one of several formulas:

• Simple poles (n = 1):

Res(f, c) = lim_{z→c} (z − c) f(z)   (1.9.5)
However, for simple poles it often turns out that we can express f(z) = g(z)/h(z). When this is
the case, we can use the much simpler formula:

Res(f, c) = g(c)/h′(c)   (1.9.6)
Where the prime represents differentiation with respect to z.
• Higher Order Poles (n > 1):
For higher order poles, you’re stuck with this equation:

Res(f, c) = (1/(n−1)!) lim_{z→c} d^{n−1}/dz^{n−1} [ (z − c)ⁿ f(z) ]   (1.9.7)
However, it’s not nearly as bad as it looks! The factor of (z − c)n will generally cancel
part of f (z), making the differentiation easier rather than harder.
This equation is technically just a more general version of the one for simple poles: notice
that it reduces to the first formula when n = 1.
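These formulas are easy to spot-check with sympy.residue (assuming SymPy); a sketch covering a simple pole (via g/h′) and a second-order pole (via eq. 1.9.7 with n = 2):

import sympy as sp

z = sp.symbols('z')
# simple pole of 1/(z^2 + 1) at z = i: g(i)/h'(i) = 1/(2i)
assert sp.residue(1/(z**2 + 1), z, sp.I) == 1/(2*sp.I)
# second-order pole of e^z/(z - 1)^2 at z = 1: (1/1!) d/dz[e^z] at z = 1 gives e
assert sp.residue(sp.exp(z)/(z - 1)**2, z, 1) == sp.E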

1.9.4 The Cauchy Residue Theorem

The Cauchy Residue Theorem states that the value of a contour integral around some number
N of poles at locations aN is simply:
∮_C f(z) dz = 2πi Σ_N Res(f, a_N)   (1.9.8)

We normally apply this theorem in order to calculate the integral from −∞ to ∞ of some

Figure 1.9.1: Contour for evaluating a real integral. Source: Wikipedia

real-valued function that we have analytically continued to the complex plane.1.4 In this case, we
want to pick a contour that includes all of the x-axis, while not including any other substantial
contributions. In the limit where f(z) falls off faster than 1/z as z → ∞, it is legitimate to
choose a contour like the one shown in figure 1.9.1. Since the entire arc exists very far from the
origin, its contribution to the integral is negligible, and:

∮_C f(z) dz = ∫_{−∞}^{∞} f(x) dx   (1.9.9)

One potential sticky point arises when a pole exists on the real axis. In this case, we must

Figure 1.9.2: Two options for closing a contour around a pole on the real axis. Source: http://www.nhn.ou.edu/~milton/p5013/chap7.pdf.

define the principal value of an integral.1.5 For a simple pole at a we define:

P ∫_{−∞}^{∞} f(x) dx = lim_{ε→0⁺} [ ∫_{−∞}^{a−ε} f(x) dx + ∫_{a+ε}^{∞} f(x) dx ]   (1.9.10)

With this definition in place, we will now add a “bump” to our contour that circumnavigates
the troublesome pole (as shown in figure 1.9.2). Using this contour, it turns out that:

P ∫_{−∞}^{∞} f(x) dx = ∮_C f(z) dz = 2πi [ Σ_N Res(f, a_N) + ½ Res(f, a) ]   (1.9.11)

Where the second pole a is the troublesome pole on the real axis. In other words, the net result is
that a simple pole on the real axis only contributes half of its residue to the integral.
1.4 The Wikipedia page has an excellent worked-out example: http://en.wikipedia.org/wiki/Residue_theorem.
1.5 Here’s a good resource on this: http://www.nhn.ou.edu/~milton/p5013/chap7.pdf.

A hand-waving explanation for this fact is that, as the residue “emerges” from the point a in all
directions, only half of it is “caught” by the upper half of the plane.
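As a quick numerical sanity check (my addition, not part of the original notes), the sketch below
verifies the theorem for f(z) = 1/(1 + z²), whose only pole in the upper half-plane is a simple
pole at z = i with residue 1/2i, so the real-axis integral should come out to 2πi · 1/2i = π:

    # Sketch: verify the residue theorem for f(z) = 1/(1 + z^2).
    # By Eq. (1.9.6) with g = 1 and h = 1 + z^2, Res(f, i) = 1/(2i).
    import numpy as np
    from scipy.integrate import quad

    numeric, _ = quad(lambda x: 1.0 / (1.0 + x**2), -np.inf, np.inf)
    residue = 1.0 / 2.0j
    predicted = (2.0j * np.pi * residue).real
    print(numeric, predicted)   # both ~3.14159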

1.10 Solving Differential Equations


1.10.1 Separation of Variables

Separation of variables allows us to solve a differential equation (usually ∇2 F (x) = 0) by


assuming that the solution can be written as a product of functions that each only depend on
one of the variables:

∇2 F (x, y, z) = ∇2 X(x)Y (y)Z(z) (1.10.1)

This assumption is NOT generally true. However, it turns out that the solution space is spanned
by these solutions, so we can write any other solution as a linear combination of these “separable"
solutions.
When applied in 3D with Cartesian coordinates, separation of variables goes as follows:

∇²F(x, y, z) = 0  →  Y(y)Z(z) d²X/dx² + X(x)Z(z) d²Y/dy² + X(x)Y(y) d²Z/dz² = 0    (1.10.2)

Dividing both sides by X(x)Y(y)Z(z):

(1/X) d²X/dx² + (1/Y) d²Y/dy² + (1/Z) d²Z/dz² = 0    (1.10.3)
Now for the crucial argument. Since each of these terms depends only on one variable, but
they sum to zero, each term itself must be equal to a constant. Otherwise, changing x while
leaving the other terms fixed would violate the equation. Typically, once a problem is solved,
it is realized that some strange definition of these constants (such as l(l + 1)) is convenient.
However, in the simplest case:

(1/X) d²X/dx² = α,  (1/Y) d²Y/dy² = β,  (1/Z) d²Z/dz² = γ    (1.10.4)
Notice that there is now a relationship between these constants, namely:

α+β+γ =0 (1.10.5)

1.10.2 Roots of the Characteristic Polynomial

This method of solving differential equations works whenever the coefficients of the differential
equation are constants. It relies on two facts about second order homogeneous differential
equations:

1. The linear combination of two solutions to a differential equation is itself a solution.

2. The general solution of a second order homogeneous differential equation is a linear
combination of any two linearly independent solutions.

The characteristic polynomial can be found by substituting r^n for each d^n y/dx^n:

a d²y/dx² + b dy/dx + cy = 0  ⟹  ar² + br + c = 0    (1.10.6)

This procedure is motivated by considering a trial solution of the form y = e^{rx}, for which y″ = r²y,
y′ = ry, etc. The resulting equation is known as the characteristic equation or the auxiliary
equation: ar² + br + c = 0.
The importance of this equation is that its roots can be used to find solutions to the
differential equation and, therefore, produce a general solution when put together as a linear
combination.
The roots of the equation as written above are clearly:

r₁ = (−b + √(b² − 4ac))/2a,  r₂ = (−b − √(b² − 4ac))/2a    (1.10.7)

The particular solutions to this equation always take the form e^{rx}. However, the form of the
general solution depends on the sign of the discriminant, b² − 4ac, which determines whether the
roots are real and whether they are equal.

• If b2 − 4ac > 0, then the solutions will be real and unequal, each producing an exponential
as a solution, yielding the general solution:

y = c1 er1 x + c2 er2 x (1.10.8)

• If b2 − 4ac = 0, then the two roots will be real and equal: r1 = r2 = r. One solution
remains the exponential erx , but the other must be found separately. It happens that xerx
is also a solution. Thus, the general solution is:

y = c1 erx + c2 xerx (1.10.9)

• If b2 − 4ac < 0, then the roots will be unequal and complex. If we denote the solutions as
r1 = α + iβ and r2 = α − iβ, the solutions are again written:

y = c1 e(α+iβ)x + c2 e(α−iβ)x (1.10.10)

Which can be rearranged to the form:

   
y = e^{αx} ( c₁ e^{iβx} + c₂ e^{−iβx} ) = e^{αx} ( c₁ cos(βx) + c₂ sin(βx) )    (1.10.11)

Useful Source: Stewart Calculus http://www.stewartcalculus.com/data/CALCULUS%20Concepts%20and%20Contexts/upfiles/3c3-2ndOrderLinearEqns_Stu.pdf
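As a small illustration (my addition), the sketch below uses numpy.roots to find the roots of the
characteristic polynomial and classify the general solution; the example coefficients are arbitrary:

    # Sketch: classify the general solution of a y'' + b y' + c y = 0
    # from the roots of the characteristic polynomial a r^2 + b r + c.
    import numpy as np

    def classify(a, b, c):
        r1, r2 = np.roots([a, b, c])
        disc = b**2 - 4 * a * c
        if disc > 0:
            return f"y = c1 e^({r1:.3g} x) + c2 e^({r2:.3g} x)"
        elif disc == 0:
            return f"y = (c1 + c2 x) e^({r1:.3g} x)"
        else:
            alpha, beta = r1.real, abs(r1.imag)
            return f"y = e^({alpha:.3g} x) (c1 cos({beta:.3g} x) + c2 sin({beta:.3g} x))"

    print(classify(1, 3, 2))   # distinct real roots
    print(classify(1, 2, 1))   # repeated root
    print(classify(1, 0, 4))   # complex roots: pure oscillation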

1.11 Einstein Summation Notation
1.11.1 The Summation Notation

The Einstein summation convention is that indices that appear multiple times in
an expression should be interpreted as summations. These doubled indices are also known as
"dummy" indices, because they disappear if the implied sum is actually written out. For example:

a_i b_i = Σ_i a_i b_i    (1.11.1)

Indices that appear only once are "free", i.e. they are just variables that indicate that the
equation written is just one of a set of similar equations. For example:

V_j = a_i b_i Z_j  ⟹  (V)_j = (Z)_j Σ_i a_i b_i    (1.11.2)

This expression describes a set of equations, one for each value of the free index j, each of which
includes the sum a_i b_i. Free and dummy indices are analogous to free and bound variables:
the free indices are placeholders for some value, while the dummy indices are summed over and
cannot be assigned a value.
The position of the indices is used to denote column and row vectors (which are often
multiplied together to form a sum). They should not be misread as exponents. For example, aⁱ
denotes the column vector and bᵢ the row vector in the contraction:

aⁱ bᵢ = (a¹ a² a³ ⋯)ᵀ (b₁ b₂ b₃ ⋯)    (1.11.3)
Here are some tips on using the notation:

• While doing index manipulations, return as much as possible to normal vector notation in
between steps. For example, if you complete a sub-part of a problem, rewrite the indices
in vector notation as much as possible. Removing dummy indices in this way makes it
easier to see connections between different equations.

• Wikipedia suggests a helpful mnemonic for remembering which index position corresponds
to which vector: "Upper indices go up to down; lower indices go left to right."

1.11.2 The Dot Product

The dot product between two vectors can be easily expressed in Einstein notation:

a · b = ai bi (1.11.4)

1.11.3 The Levi-Civita Symbol

The Levi-Civita symbol is a constant whose value depends on the permutation of its indices. The
symbol (in three dimensions) is written ε_ijk, and its value is defined by the following conditions:

• ε_ijk = 1 (when the indices are in cyclic order)

• ε_ikj = −1 (when two indices have been swapped)

• ε_iij = ε_ijj = 0 (whenever two indices are equal)

Since it is only the cyclic permutation and not the starting index that matters,
ε_ijk = ε_jki = ε_kij.

1.11.4 Cross Product

This definition of the Levi-Civita symbol allows the cross product to be easily expressed:

(a × b)_i = ε_ijk a_j b_k    (1.11.5)

Where i is a free index, and the sum is conducted over the repeated dummy indices j and k.
The resulting vector is therefore:

a × b = ( ε_1jk a_j b_k,  ε_2jk a_j b_k,  ε_3jk a_j b_k )ᵀ    (1.11.6)

A useful simplification of the product of two Levi-Civita symbols can be made when they share
an index:

ε_ijk ε_imn = δ_jm δ_kn − δ_jn δ_km    (1.11.7)

Where the Kronecker delta of two indices is defined by:

• δ_ij = 1 if i = j

• δ_ij = 0 if i ≠ j

So that:

δ_ij a_i = a_j    (1.11.8)

When condensing an expression using this identity, it is useful to think about changing all the
indices to a "preferred set", i.e. exchanging m for j and n for k whenever possible.

1.11.5 Example: Proving Vector Identities with Einstein Notation

Einstein notation and the Levi-Civita symbol make proving some vector identities easier. Consider
the expression (a × b) × c:

((a × b) × c)_i = ε_imn (a × b)_m c_n
= ε_imn (ε_mjk a_j b_k) c_n
= ε_mni ε_mjk a_j b_k c_n
= (δ_nj δ_ik − δ_nk δ_ij) a_j b_k c_n
= a_n c_n b_i − b_n c_n a_i

But this is just the i-th component of:

(a · c)b − (b · c)a    (1.11.9)

So we see that (a × b) × c = (a · c)b − (b · c)a. QED.
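The same identity is easy to spot-check numerically. The sketch below (my addition) implements
eq. (1.11.5) as an einsum contraction against an explicit Levi-Civita tensor; the random test
vectors are arbitrary:

    # Sketch: numerically verify (a x b) x c = (a.c) b - (b.c) a.
    import numpy as np

    eps = np.zeros((3, 3, 3))
    eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = 1.0    # even permutations
    eps[0, 2, 1] = eps[2, 1, 0] = eps[1, 0, 2] = -1.0   # odd permutations

    rng = np.random.default_rng(0)
    a, b, c = rng.standard_normal((3, 3))

    cross = lambda u, v: np.einsum('ijk,j,k->i', eps, u, v)  # (u x v)_i = eps_ijk u_j v_k
    lhs = cross(cross(a, b), c)
    rhs = np.dot(a, c) * b - np.dot(b, c) * a
    print(np.allclose(lhs, rhs))   # True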

1.12 Special Functions


1.12.1 Legendre Polynomials

The Legendre Polynomials usually arise in problems in spherical coordinates with azimuthal
(φ) symmetry. Formally, they are solutions to a differential equation known as “Legendre’s
Equation". They are usually found by means of the Rodrigues formula:

P_n(x) = 1/(2^n n!) d^n/dx^n (x² − 1)^n    (1.12.1)

However, they also emerge as the coefficients in the Taylor expansion of a generating function:

1/√(1 − 2xt + t²) = Σ_{n=0}^{∞} P_n(x) t^n    (1.12.2)

This context is most useful in the multipole expansion, which can be rewritten in this form:

1/|r − r′| = 1/√(r² − 2 r·r′ + r′²) = (1/r) 1/√(1 − 2(r̂·r̂′)(r′/r) + (r′/r)²)
= (1/r) Σ_{n=0}^{∞} P_n(r̂·r̂′) (r′/r)^n    (1.12.3)

The first few Legendre polynomials (and the only ones you really need to memorize) are:

• P0 (x) = 1

• P1 (x) = x

• P₂(x) = (1/2)(3x² − 1)

As a set of special functions, the Legendre Polynomials have a variety of useful properties
including an orthogonality relationship:

∫_{−1}^{1} P_n(x) P_m(x) dx = 2/(2n + 1) δ_nm    (1.12.4)
The Legendre Polynomials are alternatingly symmetric and antisymmetric:

Pn (−x) = (−1)n Pn (x) (1.12.5)
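A quick numerical check of the orthogonality relation (my addition; assumes scipy is available):

    # Sketch: check the Legendre orthogonality relation numerically.
    import numpy as np
    from scipy.integrate import quad
    from scipy.special import eval_legendre

    for n in range(4):
        for m in range(4):
            val, _ = quad(lambda x: eval_legendre(n, x) * eval_legendre(m, x), -1, 1)
            expected = 2.0 / (2 * n + 1) if n == m else 0.0
            assert abs(val - expected) < 1e-10
    print("orthogonality holds")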

1.12.2 Associated Legendre Polynomials

Although the normal Legendre Polynomials only emerge as solutions when a problem has
azimuthal symmetry, in the general case (and for the construction of the Spherical Harmonics),
we want a form of the Legendre Polynomials that allows both l and m to vary. These functions
are the “associated Legendre Polynomials".

The associated Legendre Polynomials are given by a modified version of the Rodrigues
formula:

P_l^m(x) = (−1)^m (1 − x²)^{m/2} d^m/dx^m P_l(x) = (−1)^m/(2^l l!) (1 − x²)^{m/2} d^{l+m}/dx^{l+m} (x² − 1)^l    (1.12.6)

These functions very rarely make an appearance by themselves, but are important as part of the
definition of the Spherical Harmonics.

1.12.3 Spherical Harmonics (The Ylm ’s)

The Spherical Harmonics (or “YLMs") are the angular part of the general solution to Laplace’s
equation (∇²f = 0) in spherical coordinates. As such, they often appear in E&M (as potentials)
or in QM as solutions to the Schrödinger Equation.
The Spherical Harmonics are normally written in terms of the associated Legendre Polynomials^{1.6}:

Y_l^m(θ, φ) = √[ (2l + 1)(l − m)! / (4π (l + m)!) ] P_l^m(cos θ) e^{imφ}    (1.12.7)

Notice that, in the azimuthally symmetric case where m = 0:

Y_l^0(θ, φ) = √[(2l + 1)/4π] P_l(cos θ)    (1.12.8)

The first few spherical harmonics are:

• l = 0: Y₀⁰ = 1/√(4π)

• l = 1:

– Y₁⁻¹ = √(3/8π) sin(θ) e^{−iφ}

– Y₁⁰ = √(3/4π) cos(θ)

– Y₁¹ = −√(3/8π) sin(θ) e^{iφ}

• l = 2:

– Y₂² = (1/4)√(15/2π) sin²(θ) e^{2iφ}

– Y₂¹ = −√(15/8π) sin(θ) cos(θ) e^{iφ}

– Y₂⁰ = √(5/4π) ( (3/2) cos²(θ) − 1/2 )

Different Spherical Harmonics are orthogonal:

∫₀^{2π} dφ ∫₀^{π} dθ sin(θ) Y_{l′}^{m′*}(θ, φ) Y_l^m(θ, φ) = δ_{l,l′} δ_{m,m′}    (1.12.9)

1.6 See Jackson pg. 107.

And also complete:

Σ_{l=0}^{∞} Σ_{m=−l}^{l} Y_l^{m*}(θ′, φ′) Y_l^m(θ, φ) = δ(φ − φ′) δ(cos(θ) − cos(θ′))    (1.12.10)

This means that any function may be expanded in terms of the Spherical Harmonics:

f(θ, φ) = Σ_{l=0}^{∞} Σ_{m=−l}^{l} A_l^m Y_l^m(θ, φ)    (1.12.11)

Where the constant coefficients can be found to be:

A_l^m = ∫ Y_l^{m*}(θ, φ) f(θ, φ) sin(θ) dθ dφ    (1.12.12)

One extremely important result with Spherical Harmonics is the Addition theorem^{1.7}, which
allows a Legendre Polynomial in cos(γ)^{1.8}, where γ is the angle between the observer
vector r and the source vector r′, to be expanded in terms of Spherical Harmonics in the angles of those two
vectors:

P_l(cos(γ)) = 4π/(2l + 1) Σ_{m=−l}^{l} Y_l^{m*}(θ′, φ′) Y_l^m(θ, φ)    (1.12.13)
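As a sanity check (my addition), the sketch below compares scipy’s spherical harmonics against
the explicit Y₁⁰ listed above and verifies the normalization. Note that scipy.special.sph_harm
takes its arguments in the order (m, l, azimuth, polar), which is easy to get backwards:

    # Sketch: check Y_1^0 against its closed form and its normalization.
    import numpy as np
    from scipy.special import sph_harm   # deprecated alias in newest scipy, still available

    theta = np.linspace(0.0, np.pi, 201)           # polar angle
    y10 = sph_harm(0, 1, 0.0, theta)               # scipy order: (m, l, azimuth, polar)
    closed_form = np.sqrt(3.0 / (4.0 * np.pi)) * np.cos(theta)
    print(np.allclose(y10.real, closed_form))      # True

    # Normalization: integrate |Y_1^0|^2 sin(theta) over the sphere.
    th = np.linspace(0.0, np.pi, 4000)
    dth = th[1] - th[0]
    norm = 2.0 * np.pi * np.sum(np.abs(sph_harm(0, 1, 0.0, th))**2 * np.sin(th)) * dth
    print(norm)                                    # ~1.0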

1.12.4 Bessel Functions and Hankel Functions

Bessel functions are solutions of the following differential equation (which often arises when
solving Laplace’s equation through separation of variables in spherical or cylindrical coordinates):

x² d²y/dx² + x dy/dx + (x² − α²) y = 0    (1.12.14)
When α is an integer, the solutions y(x) to this equation are Bessel Functions, or Cylindrical
Harmonics. When α is a half integer, the solutions are the Spherical Bessel Functions. Other
values of α are possible, but do not generally appear in solutions to physical problems.
Since the equation above is a second-order differential equation, the most general solutions
are linear combinations of two linearly independent solutions. However, just as exponentials,
sin, cos, sinh, and cosh can all be used to form solutions to Laplace’s equation in Cartesian
coordinates, there are many different pairs of Bessel functions that can be used to most
conveniently write the solutions.
There are a variety of ways of calculating the form of these functions, none of which are
used very often. Therefore, we will simply take the shape of the regular Bessel Functions
to be given (just like sine) and then describe the relationship of the other functions to these.
The first (and most important) solution is:

y(x) = aJα (x) + bNα (x) (1.12.15)

Where a and b are constants. These “ordinary” Bessel functions are the standard solutions to
1.7 See Jackson pg. 110 (although even HE doesn’t seem to prove this).
1.8 cos(γ) can also be written as cos(γ) = cos(θ)cos(θ′) + sin(θ)sin(θ′)cos(φ − φ′).

(a) Bessel Functions of the First Kind (b) Bessel Functions of the Second Kind

Figure 1.12.1: Source: Wikipedia

Laplace’s equation in cylindrical coordinates. The Jα ’s are Bessel Functions of the First
Kind, pictured in Fig. 1.12.1a. It is worth noting that J0 (0) = 1. The functions oscillate back
and forth with an irregular period. The zeros of the Bessel Functions (where they cross the
x-axis) can be calculated, and are sometimes useful.
The Nα ’s are called Neumann Functions or Bessel Functions of the Second Kind,
and are another set of solutions to the equation that are linearly independent of the Jα ’s.
With these two functions, we can define the Hankel Functions:

Hα(1) (x) = Jα (x) + iNα (x) (1.12.16)

Hα(2) (x) = Jα (x) − iNα (x) (1.12.17)

These rarely come up, but are apparently of some theoretical value.
We can also define the Modified Bessel Functions Iα (x) and Kα (x) here, which are
another pair of solutions (that can be written in terms of the ordinary Bessel Functions) that are
valid for imaginary arguments. Again, we can assemble a solution using this pair of functions:

y(x) = aIα (x) + bKα (x) (1.12.18)

Another form of Bessel functions arises when considering the Helmholtz equation (∇2 Y +k 2 Y = 0)
in spherical coordinates. These are known as Spherical Bessel Functions, and are pictured
in Figure 1.12.2. The Spherical Bessel Functions can be written in terms of the ordinary Bessel
Functions1.9 .
In general, the spherical n_l’s will be eliminated from physical solutions because they are
not finite at the origin. It is, however, useful to remember the first few Spherical Bessel Functions
of the first kind:

• j₀(x) = sin(x)/x

1.9 See Wikipedia for details.

(a) Spherical Bessel Functions of the First Kind (b) Spherical Bessel Functions of the Second Kind

Figure 1.12.2: Source: Wikipedia

• j₁(x) = sin(x)/x² − cos(x)/x

• j₂(x) = (3/x² − 1) sin(x)/x − 3 cos(x)/x²

The spherical Bessel functions reduce asymptotically to the following forms for x ≪ 1:

j_l → 2^l l!/(2l + 1)! x^l,  x ≪ 1    (1.12.19)

n_l → −(2l)!/(2^l l!) 1/x^{l+1},  x ≪ 1    (1.12.20)
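A quick check of the small-x limit (my addition), comparing scipy’s spherical Bessel functions
against the asymptotic form above; the test point x = 10⁻³ is arbitrary:

    # Sketch: compare scipy's spherical Bessel j_l to the small-x form above.
    import numpy as np
    from math import factorial
    from scipy.special import spherical_jn

    x = 1e-3
    for l in range(4):
        exact = spherical_jn(l, x)
        approx = 2**l * factorial(l) / factorial(2 * l + 1) * x**l
        print(l, exact, approx)   # agree to ~x^2 relative accuracy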
You can also define Spherical Hankel Functions using the Spherical Bessel Functions in the
exact same way that the ordinary Hankel Functions are defined in terms of the ordinary Bessel
Functions:

h_l^{(1)}(x) = j_l(x) + i n_l(x)    (1.12.21)

h_l^{(2)}(x) = j_l(x) − i n_l(x)    (1.12.22)

These functions are solutions to the radial spherical equation^{1.10}:

d²u/dr² − l(l + 1)/r² u = −k² u    (1.12.23)

Asymptotically (for large x), these functions go as h_l^{(1)} ≈ (−i)^{l+1} e^{ix}/x and h_l^{(2)} ≈ i^{l+1} e^{−ix}/x.

1.10 This equation often arises when the radial coordinate has been transformed by u(r) = r R(r).

1.12.5 Airy Functions

Airy functions^{1.11} are solutions to the following differential equation (which often appears when
considering linear potentials):

d²f(x)/dx² = x f(x)    (1.12.24)
There are two linearly independent solutions to this equation, which are called Ai(x) and Bi(x).
The most general solution can then be written:

f (x) = aAi(x) + bBi(x) (1.12.25)

for some constants a and b. These solutions can be represented as integrals:


Ai(x) = (1/π) ∫₀^{∞} cos( s³/3 + sx ) ds    (1.12.26)

Bi(x) = (1/π) ∫₀^{∞} [ e^{−s³/3 + sx} + sin( s³/3 + sx ) ] ds    (1.12.27)
More commonly, however, you will encounter approximations of the Airy functions far away
from the origin:

• x ≫ 0:

Ai(x) ≈ 1/(2√π x^{1/4}) e^{−(2/3) x^{3/2}},  x ≫ 0    (1.12.28)

Bi(x) ≈ 1/(√π x^{1/4}) e^{(2/3) x^{3/2}},  x ≫ 0    (1.12.29)

• x ≪ 0:

Ai(x) ≈ 1/(√π (−x)^{1/4}) sin( (2/3)(−x)^{3/2} + π/4 ),  x ≪ 0    (1.12.30)

Bi(x) ≈ 1/(√π (−x)^{1/4}) cos( (2/3)(−x)^{3/2} + π/4 ),  x ≪ 0    (1.12.31)
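The sketch below (my addition) checks the large-x asymptotic forms against scipy’s Airy
functions; the test point x = 8 is arbitrary:

    # Sketch: compare scipy's Ai(x), Bi(x) with the large-x asymptotic forms.
    import numpy as np
    from scipy.special import airy

    x = 8.0
    ai, aip, bi, bip = airy(x)   # returns (Ai, Ai', Bi, Bi')
    ai_asym = np.exp(-2.0 / 3.0 * x**1.5) / (2.0 * np.sqrt(np.pi) * x**0.25)
    bi_asym = np.exp(2.0 / 3.0 * x**1.5) / (np.sqrt(np.pi) * x**0.25)
    print(ai / ai_asym, bi / bi_asym)   # both ~1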

1.13 Probability and Combinations


1.13.1 Combinations
The combination C(n, k) (“n choose k”) is the number of ways to choose k objects out of n objects, disregarding
order. This can be calculated as:

C(n, k) = n!/(k!(n − k)!)    (1.13.1)
1.11 See Griffiths QM pg. 327.

1.14 Statistics
1.14.1 The Autocorrelation function

Suppose you have a data set that consists of N pieces of data, each of which is a time series
x(t) of some process taking place: {x₁(t), x₂(t), ... x_N(t)}. You suspect that the same process is
taking place in each record, but at a different point (i.e. different time) in each one. The
autocorrelation (or "serial correlation") of the data set allows this to be determined.
The autocorrelation function is defined to be:

R_xx(t₁, t₁ + τ) = lim_{N→∞} (1/N) Σ_{k=1}^{N} x_k(t₁) x_k(t₁ + τ)    (1.14.1)
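In practice, with a finite ensemble, the limit is replaced by a sample mean. A minimal sketch
(my addition; the Gaussian wave-packet test signal and all parameters are made up for
illustration):

    # Sketch: estimate R_xx over a finite ensemble of N records, following
    # the definition above (ensemble average at fixed t1 and lag tau).
    import numpy as np

    rng = np.random.default_rng(1)
    t = np.linspace(0, 10, 500)
    # N records of the "same process" (a wave packet) at random start times:
    records = [np.exp(-(t - rng.uniform(3, 7))**2) for _ in range(200)]

    def R_xx(records, i1, lag):
        # average of x_k(t1) * x_k(t1 + tau) over the ensemble
        return np.mean([x[i1] * x[i1 + lag] for x in records])

    print(R_xx(records, 100, 0), R_xx(records, 100, 50))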

1.15 Miscellaneous
1.15.1 The Feynman Integration Trick

The Feynman integration trick is a way of taking one integral you know and using it to generate a
whole family of other integrals merely by differentiating. The general idea is to differentiate
both sides of the known integral (possibly with respect to a constant!) to generate the desired
expression. For example, consider the integral:

∫_{−∞}^{∞} e^{−ax²} dx = √(π/a)    (1.15.1)

Since the integrand vanishes at ±∞, integrating its x-derivative across the whole real line gives
zero:

−2a ∫_{−∞}^{∞} x e^{−ax²} dx = 0    (1.15.2)

An even more useful trick allows us to bring down even powers of x. Pretend for a moment that
f(x) = e^{−ax²} is really f(x, a). Then we can take the derivative of both sides with respect to a:

∫_{−∞}^{∞} x² e^{−ax²} dx = √π/(2a^{3/2})    (1.15.3)

This new expression must of course still hold for any constant a. Repeating the
process can generate any integral of the form ∫_{−∞}^{∞} x^{2N} e^{−ax²} dx!
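sympy can reproduce this bookkeeping directly; a minimal sketch (my addition):

    # Sketch: differentiating under the integral sign with sympy.
    import sympy as sp

    x = sp.symbols('x', real=True)
    a = sp.symbols('a', positive=True)
    base = sp.integrate(sp.exp(-a * x**2), (x, -sp.oo, sp.oo))   # sqrt(pi/a)
    # Differentiating both sides with respect to a brings down -x^2:
    lhs = sp.integrate(x**2 * sp.exp(-a * x**2), (x, -sp.oo, sp.oo))
    rhs = -sp.diff(base, a)
    print(sp.simplify(lhs - rhs))   # 0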

1.15.2 Rayleigh’s Formula

Rayleigh’s Formula: the expansion of a plane wave in spherical coordinates:

e^{ikz} = Σ_{l=0}^{∞} i^l (2l + 1) j_l(kr) P_l(cos θ)    (1.15.4)

Where j_l(kr) is a spherical Bessel function of the first kind, and P_l is a Legendre polynomial.

Figure 1.15.1: Law of Cosines (Source: Wikipedia)

1.15.3 The Law of Cosines

The Law of Cosines is a more general version of the Pythagorean theorem, and states (for the
triangle illustrated above) that:

c² = a² + b² − 2ab cos γ    (1.15.5)

The same relationship holds permuting a, b, c, and the associated angles.

1.15.4 Taylor Series

The Taylor series allows us to approximate a function near a point by a series of terms, each
of which depends on a derivative of the function. In principle, well behaved functions can be
perfectly represented in this way by a Taylor series with infinite terms. However, we usually
truncate the series after a few terms, making an approximation that renders the problem tractable.
The Taylor series for a function f (x) about a point a is:

f(x) = f(a) + f′(a)(x − a) + f″(a)/2! (x − a)² + ... + f⁽ⁿ⁾(a)/n! (x − a)ⁿ    (1.15.6)
A Taylor series can similarly be written to approximate a vector function A(r):

A(r + ε) = A(r) + (ε · ∇)A(r) + (1/2)(ε · ∇)²A(r) + ...    (1.15.7)
2
Almost always, this series is truncated to the first two terms.

1.15.5 The Binomial Theorem

The binomial theorem is used for expanding repeated products of a binomial, i.e. for calculating
(x + y)n . The theorem states that:

(x + y)ⁿ = Σ_{k=0}^{n} C(n, k) x^{n−k} y^k = Σ_{k=0}^{n} C(n, k) x^k y^{n−k}    (1.15.8)

For example, this theorem generates the basic result that:

(x + y)² = x² + 2xy + y²    (1.15.9)

Since C(2, 0) = C(2, 2) = 1 and C(2, 1) = 2.

If n is a positive integer, then the sum above terminates, yielding the correct expanded
polynomial. However, if n is negative, a fraction, or both, the sum becomes an infinite series!
In this latter case we can compute the coefficients of each term manually:

C(n, 1) = n!/(1!(n − 1)!) = n    (1.15.10)

C(n, 2) = n!/(2!(n − 2)!) = n(n − 1)/2!    (1.15.11)

and so on.^{1.12}
In particular it is useful to remember that, for small x:

(1 + x)ⁿ = 1 + nx + n(n − 1)/2! x² + n(n − 1)(n − 2)/3! x³ + ...    (1.15.12)

And therefore in the special case where n = 1/2:

(1 + x)^{1/2} = 1 + (1/2)x − (1/8)x² + (1/16)x³ − ...    (1.15.13)

1.15.6 The Schwarz Inequality

Begin by noting that for some complex number λ:

(⟨α| + λ*⟨β|)(|α⟩ + λ|β⟩) ≥ 0    (1.15.14)

This must be true because the LHS is just the inner product of a vector with itself, which is
positive definite.
Now, make the divinely inspired choice of λ = −⟨β|α⟩/⟨β|β⟩. We can now write the product above
as:

(⟨α| − ⟨α|β⟩⟨β|/⟨β|β⟩)(|α⟩ − |β⟩⟨β|α⟩/⟨β|β⟩) ≥ 0    (1.15.15)

Now, expanding this expression and rearranging the resulting terms yields the Schwarz inequality:

⟨α|α⟩⟨β|β⟩ ≥ |⟨α|β⟩|²    (1.15.16)

1.16 Useful Facts


You’ll be saving yourself a lot of pain if you memorize these facts early and use them often!

1.16.1 Integrals
∫₀^{∞} xⁿ e^{−x/a} dx = n! a^{n+1}    (1.16.1)

∫₀^{∞} x^{2n} e^{−x²/a²} dx = √π ((2n)!/n!) (a/2)^{2n+1}    (1.16.2)

∫₀^{∞} x^{2n+1} e^{−x²/a²} dx = (n!/2) a^{2(n+1)}    (1.16.3)

∫ x/√(x² ± a²) dx = √(x² ± a²)    (1.16.4)

1.12 Useful link: https://www.physicsforums.com/threads/binomial-theorem-for-fractional-exponents.240990/

1.16.2 Series

Note: All of these identities are series, but on some I have placed the sum on the left of the
equals sign, while for others it is on the right. This merely reflects which way you are more
likely to use the identity.

e^x = Σ_{n=0}^{∞} xⁿ/n! ≈ 1 + x + x²/2! + ...    (1.16.5)

sin(x) = Σ_{n=0}^{∞} (−1)ⁿ x^{2n+1}/(2n + 1)! ≈ x − x³/3! + x⁵/5! − ...    (1.16.6)

cos(x) = Σ_{n=0}^{∞} (−1)ⁿ x^{2n}/(2n)! ≈ 1 − x²/2! + x⁴/4! − ...    (1.16.7)

Notice from here that e^{ix} = cos(x) + i sin(x).

√(1 + x) ≈ 1 + x/2 − x²/8 + ... for x ≪ 1    (1.16.8)

Σ_{n=0}^{∞} xⁿ = 1/(1 − x), for |x| < 1 (The Geometric Series)    (1.16.9)

Σ_{n=0}^{∞} e^{−an} = 1/(1 − e^{−a})    (1.16.10)

1.16.3 Vectors

|a × b|² = |a|²|b|² − (a · b)²    (1.16.11)

A · (B × C) = B · (C × A) = C · (A × B) (any cyclic permutation)    (1.16.12)

1.16.4 Equations
A + iB = √(A² + B²) e^{iφ} with φ = tan⁻¹(B/A)    (1.16.13)

The following procedure (completing the square) is useful when you want to rearrange a
polynomial to look like a perfect square, shifted by some constant:

ax² + bx + c = ( √a x + b/(2√a) )² + c − b²/(4a) (completing the square)    (1.16.14)

1.16.5 Theorems/Results

The Baker-Campbell-Hausdorff formula (valid when [A, B] commutes with both A and B):

e^{A+B} = e^A e^B e^{−[A,B]/2}    (1.16.15)

∇(1/r) = −r/r³ = −r̂/r²    (1.16.16)

∇² (1/|r − r′|) = −4πδ(r − r′)    (1.16.17)

Proof:
Taking the gradient of 1/r gives −r/r³. Taking the divergence of this (spherical coordinates make
this easy) shows that it is zero everywhere except the origin. However, applying the divergence
theorem over a sphere centered on the origin:

∫_V ∇ · ∇(1/r) dV = ∮_S ∇(1/r) · dS = −∮_S (1/r²) r̂ · dS = −∮_S (1/r²) r² sin(θ) dθ dφ = −4π

Since ∇²(1/r) = 0 away from the origin, but its integral over any volume containing the origin
is −4π, it must be that ∇²(1/r) = −4πδ(r).
When you have a delta function in the z direction, you can rewrite it in spherical coordinates as:

δ(z) = δ(cos(θ))/r    (1.16.18)

2 Classical Mechanics
2.1 Poisson Brackets
The Poisson bracket of two quantities, A and B, with respect to two variables q and p, is defined
to be:

{A, B} = Σᵢ ( ∂A/∂qᵢ ∂B/∂pᵢ − ∂A/∂pᵢ ∂B/∂qᵢ )    (2.1.1)

2.1.1 Identities and Useful Relations

• Clearly

{A, B} = − {B, A} (2.1.2)

• Poisson brackets distribute over addition. This follows directly from the fact that derivatives
distribute over addition.

{A, B + C} = {A, B} + {A, C} (2.1.3)

• Poisson brackets obey the product and quotient rules, which also follow directly from the

corresponding properties of derivatives.

{A, BC} = {A, B} C + {A, C} B (2.1.4)

{A, B/C} = ( C {A, B} − B {A, C} ) / C²    (2.1.5)
Notice that this relation is identical to the corresponding relationship for commutators,
except that, since all terms in a Poisson bracket commute, it does not matter which side
of the brackets B and C are pulled out to.
• The Poisson bracket {A, qᵢ} = Σⱼ ( ∂A/∂qⱼ ∂qᵢ/∂pⱼ − ∂A/∂pⱼ ∂qᵢ/∂qⱼ ) = −∂A/∂pᵢ, since
∂qᵢ/∂pⱼ = 0 and ∂qᵢ/∂qⱼ = δᵢⱼ. Therefore we have:

{A, qᵢ} = −∂A/∂pᵢ    (2.1.6)

And similarly:

{B, pᵢ} = ∂B/∂qᵢ    (2.1.7)

2.1.2 Poisson Bracket form of Hamilton’s Equations of Motion

The total time derivative of a function A(q, p, t) is often of interest in the Hamiltonian formalism
of classical mechanics. This time derivative can be written:

dA/dt = ∂A/∂t + Σᵢ ( ∂A/∂qᵢ q̇ᵢ + ∂A/∂pᵢ ṗᵢ )    (2.1.8)

Hamilton’s equations of motion tell us that q̇ᵢ = ∂H/∂pᵢ and ṗᵢ = −∂H/∂qᵢ. Therefore we can rewrite
the above expression as:

dA/dt = ∂A/∂t + Σᵢ ( ∂A/∂qᵢ ∂H/∂pᵢ − ∂A/∂pᵢ ∂H/∂qᵢ )    (2.1.9)

Or, more compactly:

dA/dt = ∂A/∂t + {A, H}    (2.1.10)

2.2 Derivation of Virial Theorem


The goal of the virial theorem is to find a convenient expression for the time average value of
the kinetic energy, ⟨T⟩. Consider the following expression for particles moving (for simplicity) in
the x direction:

Σᵢ pᵢ · rᵢ = Σᵢ mᵢ ẋᵢ · xᵢ    (2.2.1)

Now consider the time derivative of this expression:

d/dt ( Σᵢ mᵢ ẋᵢ · xᵢ ) = Σᵢ mᵢ ẍᵢ · xᵢ + Σᵢ mᵢ ẋᵢ²

Notice that Σᵢ mᵢ ẋᵢ² = 2T and that Σᵢ mᵢ ẍᵢ · xᵢ = Σᵢ Fᵢ · xᵢ = −Σᵢ ∇ᵢV · xᵢ. Now, making
these substitutions, consider the time average of the equation:

⟨ d/dt ( Σᵢ mᵢ ẋᵢ · xᵢ ) ⟩ = 2⟨T⟩ − ⟨ Σᵢ ∇ᵢV · xᵢ ⟩    (2.2.2)

If we suppose that the path of the particles being considered is either cyclic or bounded, then
the time average of the time derivative (the LHS) will be zero. This is true for any bounded
function. The time average of an arbitrary function is defined as:

⟨f(t)⟩ = (1/T) ∫₀^T f(t) dt    (2.2.3)

If f(t) is the total time derivative of some function F, so that f(t) = dF/dt, and we consider the limit
as T → ∞, then we have:

⟨dF/dt⟩ = lim_{T→∞} (1/T) ∫₀^T (dF/dt) dt = lim_{T→∞} (1/T)( F(T) − F(0) )    (2.2.4)

If F(t) is bounded, then F(T) − F(0) remains finite as T goes to ∞, and therefore ⟨dF/dt⟩ = 0. Applying
this reasoning to our derivation, the LHS vanishes, and the remaining terms can be rearranged
to produce the statement of the virial theorem:

2⟨T⟩ = ⟨ Σᵢ ∇ᵢV · xᵢ ⟩    (2.2.5)

2.3 Derivation of the Euler-Lagrange Equation (1D)


The Euler-Lagrange equation is one of those derivations in which we assume a function exists
with some properties, and then use those properties to create an equation that, in the end, solves
for the function we supposed at the beginning.
Consider the following integral, called the action:

S = ∫_{t₁}^{t₂} L(q, q̇, t) dt    (2.3.1)

The fundamental physics behind the Lagrangian formulation of classical mechanics is that motion
occurs so as to minimize the action. Therefore, let us suppose that there exists a function f(t)
such that

S = ∫_{t₁}^{t₂} L(f(t), ḟ(t), t) dt    (2.3.2)

is minimized. Since f(t) is a minimum, any perturbation to f(t) will necessarily increase the
action integral above. Let us then introduce such a perturbation:

g(t) = f(t) + ε η(t)    (2.3.3)

where ε is a scaling constant. We will require that the perturbation goes to zero at both endpoints,
that is to say η(t₁) = η(t₂) = 0.
We can now insert this new function into the action integral to obtain

S(ε) = ∫_{t₁}^{t₂} L(g(t), ġ(t), t) dt    (2.3.4)

Now we will extremize S with respect to ε by setting

dS/dε = 0    (2.3.5)

We can easily calculate this derivative by bringing the ε derivative inside the action integral to
act directly upon the integrand:

dL/dε = ∂L/∂g dg/dε + ∂L/∂ġ dġ/dε + ∂L/∂t dt/dε    (2.3.6)
The last term of this expression vanishes, because t is not dependent on ε, and therefore dt/dε is
zero. The other two derivatives can be explicitly calculated from the definition of g:

dg/dε = η(t)    (2.3.7)

dġ/dε = η̇(t)    (2.3.8)
Leaving us with the final expression:

dL/dε = ∂L/∂g η(t) + ∂L/∂ġ η̇(t)    (2.3.9)
We know from calculus that the integral of this expression (dS/dε) is equal to zero when ε = 0,
since ε parametrizes a perturbation from the minimizing function f(t). Evaluating at ε = 0 changes our
variables back, so that g → f. Plugging the expression back into the integral:

∫_{t₁}^{t₂} ( ∂L/∂f η(t) + ∂L/∂ḟ η̇(t) ) dt = 0    (2.3.10)
This integral can be further simplified by an integration by parts on the second term, where
u = ∂L/∂ḟ and dv = η̇(t) dt. This gives us:

∫_{t₁}^{t₂} ( ∂L/∂f − d/dt ∂L/∂ḟ ) η(t) dt + [ ∂L/∂ḟ η(t) ]_{t₁}^{t₂}    (2.3.11)

The last term goes to zero based on the boundary conditions we required for η at the beginning:
η(t₁) = η(t₂) = 0.
The remaining integral is:
Z t2 
dL d dL

− dtη(t)dt = 0 (2.3.12)
t1 df dt df˙
The last step is made by the Fundamental Lemma of the Calculus of Variations, which states
that, for this case, the integrand must be equal to zero. This leaves us with the completed

Euler-Lagrange equation:

∂L/∂f − d/dt ∂L/∂ḟ = 0    (2.3.13)

2.4 Constraints
The beauty of the Lagrangian formulation of classical mechanics lies in how easily it handles
constraining forces. There are two types of constraints, and two methods of representing them.

2.4.1 Types of Constraints

• Holonomic constraints, depend only on the qi ’s and t, but not on the derivatives, q̇i .
Holonomic constraints also only involve equalities, not inequalities. Therefore, a general
holonomic constraint can be written:

φ(qi , t) = 0 (2.4.1)

• Non-holonomic constraints depend on the derivatives of coordinates, and may involve
inequalities. Our general methods will not work with non-holonomic constraints.
One exception to this statement must be made for so-called integrable constraints. These
are non-holonomic constraints that can be written as a total time derivative,

φ(qᵢ, q̇ᵢ, t) = d/dt Φ(qᵢ, t) = 0    (2.4.2)

such that Φ(qᵢ, t) is constant. In this case, the non-holonomic constraint can be easily
reduced to the holonomic constraint Φ = const, and the usual methods applied.

In any case, the "standard form" of a constraint is taken to be φ = 0. Constraints should be


rearranged into this form before being used in any of the equations described in this section.

2.4.2 Lagrange Multipliers

Holonomic constraints can be included in the Lagrangian by introducing an extra degree of freedom,
λ, for each constraint. The action then takes the form

S = ∫_t ( L(qᵢ, q̇ᵢ, t) + Σ_α λ_α φ_α(qᵢ, t) ) dt    (2.4.3)

Where the index α runs over the constraints. The corresponding Euler-Lagrange equations are:

d/dt ∂L/∂q̇ᵢ − ∂L/∂qᵢ = Σ_α λ_α ∂φ_α/∂qᵢ    (2.4.4)
The resulting system of equations will fully determine the values of the λ_α constants. Once
these have been found, the force exerted by the constraint(s) in the i direction can be found to
be:

Fᵢ = Σ_α λ_α ∂φ_α/∂qᵢ    (2.4.5)
The great advantage of the Lagrange multiplier method is that it allows these forces to be
determined, while the generalized coordinate method does not.

2.4.3 Generalized Coordinates

The goal of the generalized coordinates method is to choose a coordinate system which naturally
includes the constraints. For example, if a ball is constrained to roll along the inside of a
spherical shell, good generalized coordinates would be θ and φ. In this example, the fact that r
is implicitly fixed embodies the constraint.
Generalized coordinates can also be thought of as simply reducing the number of degrees of
freedom in the Lagrangian using the constraint relationships. Each of the constraints can be
solved for a variable in the Lagrangian, then substituted in. The result is a simpler Lagrangian
(fewer degrees of freedom) that now contains the constraint information implicitly.

2.5 Conserved Quantities and Noether’s Theorem


Conserved quantities (or "integrals of motion" or "constants of motion") are quantities that do
not change during motion, i.e. quantities with a zero time derivative. In other words, a quantity
Q is conserved if and only if dQ/dt = 0.

2.5.1 Equivalent Lagrangians

During the derivation of the Lagrangian, it becomes clear^{2.1} that two Lagrangians may differ by
a total time derivative but still yield the same equations of motion (the total time derivative
disappearing during the variation of the action). Two Lagrangians L and L′ are thus equivalent
if:

L′(q, q̇, t) = L(q, q̇, t) + d/dt Λ(q, t)    (2.5.1)

2.5.2 Noether’s Theorem

Noether’s theorem^{2.2} states that a conserved quantity exists for every symmetry of the system.
A "symmetry of the system" exists whenever a variable in the Lagrangian can be perturbed
by a constant to yield a different but equivalent Lagrangian. In this case, the two Lagrangians
must differ by a total time derivative dΛ/dt (see above section).
Suppose the coordinate qᵢ has been perturbed, so that qᵢ′(t, ε) = qᵢ(t) + ε δqᵢ + O(ε²). Λ can
then be calculated via the equation:

∂L/∂ε |_{ε=0} = dΛ/dt    (2.5.2)
2.1 See Landau and Lifshitz pg. 4.
2.2 Proven by Emmy Noether, who is for some reason often passed over in classical mechanics textbooks.

We can write the perturbation δqᵢ as:

δqᵢ = ∂qᵢ′(t, ε)/∂ε |_{ε=0}    (2.5.3)

There are two commonly occurring special cases that are worth mentioning.

• If the perturbation in question is simply a first order translation such as qᵢ′ = qᵢ + ε, then
we see that δqᵢ = 1.

• If the perturbation is in time itself, i.e. qᵢ′(t, ε) = qᵢ(t + ε), then we see that δqᵢ = q̇ᵢ. Note
that ∂/∂ε qᵢ′(t, ε) = ∂/∂ε qᵢ(t + ε) = ∂/∂t qᵢ(t) = q̇ᵢ, which allows us to easily integrate the equation
above for Λ, finding that Λ = L.

Given some perturbation, Noether’s theorem then states that the conserved quantity Q is
given by

Q = Σᵢ ∂L/∂q̇ᵢ δqᵢ − Λ    (2.5.4)

Q is sometimes referred to as the Noether charge.

2.5.3 Conserved Quantities

Translations in space and time and rotations can be shown (via Noether’s theorem) to generate
the common constants of motion.

• Translations in time yield conservation of energy. In the case where L is not explicitly
time dependent, ∂L/∂t = 0, we see that Λ = L and therefore^{2.3}:

H = Σᵢ ∂L/∂q̇ᵢ q̇ᵢ − L    (2.5.5)

Where H is now the Hamiltonian (or total energy).

• Linear momentum is generated by translations in space.

• Angular momentum is generated by rotations.

2.6 Derivation of Hamilton’s Equations of Motion


We start by writing the differential of the Lagrangian L:

dL = Σᵢ ( ∂L/∂qᵢ dqᵢ + ∂L/∂q̇ᵢ dq̇ᵢ ) + ∂L/∂t dt    (2.6.1)
Our goal is to eliminate dq̇ in favor of dp. By definition, momentum is pᵢ = ∂L/∂q̇ᵢ, so we can
rewrite the above equation as

dL = Σᵢ ( ∂L/∂qᵢ dqᵢ + pᵢ dq̇ᵢ ) + ∂L/∂t dt    (2.6.2)
2.3
See remarks above about the perturbation for this case.

Now since d(pᵢq̇ᵢ) = q̇ᵢ dpᵢ + pᵢ dq̇ᵢ, we can eliminate the pᵢ dq̇ᵢ term in the above equation. We
also note that, from the Euler-Lagrange equation:

d/dt ( ∂L/∂q̇ᵢ ) = ∂L/∂qᵢ  ⟹  ṗᵢ = ∂L/∂qᵢ    (2.6.3)

Making these two substitutions and rearranging, our expression for dL is now:

dL − Σᵢ d(pᵢq̇ᵢ) = Σᵢ ( ṗᵢ dqᵢ − q̇ᵢ dpᵢ ) + ∂L/∂t dt    (2.6.4)

But, since the Hamiltonian is defined as:

H = Σᵢ pᵢq̇ᵢ − L    (2.6.5)

We see that the left hand side of our equation is just −dH!

dH = Σᵢ ( q̇ᵢ dpᵢ − ṗᵢ dqᵢ ) − ∂L/∂t dt    (2.6.6)

More directly, we also see that

dH = Σᵢ ( ∂H/∂pᵢ dpᵢ + ∂H/∂qᵢ dqᵢ ) + ∂H/∂t dt    (2.6.7)

These two equations look very similar: in fact, they have matching terms, each multiplied by a
matching differential element! Equating the coefficients of these differentials gives us Hamilton’s
equations of motion:

q̇ᵢ = ∂H/∂pᵢ    (2.6.8)

ṗᵢ = −∂H/∂qᵢ    (2.6.9)

And, less helpfully,

∂H/∂t = −∂L/∂t    (2.6.10)

2.7 Canonical Coordinates and the Hamilton-Jacobi Equation


2.7.1 Canonical Coordinates and Transformations

The Euler-Lagrange equations are invariant with respect to coordinate changes in position of
the form Qi = Qi (q, t). One of the major advantages of the Hamiltonian formulation is that
Hamilton’s equations are also invariant with respect to coordinate changes in momentum.
Let us consider two very general transformations, which we will allow to be time dependent:
Qi = Qi (p, q, t) and Pi = Pi (p, q, t). While all transformations of this form are allowed, only

certain transformations will maintain the form of Hamilton’s equations, that is:

Q̇ᵢ = ∂H′/∂Pᵢ,  Ṗᵢ = −∂H′/∂Qᵢ    (2.7.1)

Where H′ is the Hamiltonian expressed in the new coordinates: H′(Qᵢ, Pᵢ). Coordinates that obey
these relations are called canonical coordinates, and the transformations that produce them
canonical transformations.
Poisson brackets are also invariant under canonical transformations, i.e:

{A(Q, P ), B(Q, P )}Q,P = {A(q, p), B(q, p)}q,p (2.7.2)

It can be proved 2.4 that transformations Qi and Pi are canonical iff:

• {Qi , Qj }p,q = 0

• {Pi , Pj }p,q = 0

• {Qj , Pi }p,q = δij

2.7.2 The Method of Generating Functions

The method of generating functions allows us to easily find canonical coordinate transformations.
Suppose we wish to transform from one set of given coordinates, q and p, to a new (as-yet
unknown) set of coordinates Q and P. For this transformation to be canonical, the form of
Hamilton’s equations must be preserved. Fundamentally, this is the same as requiring that
S(q, p) = S(Q, P):

∫ (pq̇ − H(p, q)) dt = ∫ (PQ̇ − K(P, Q)) dt    (2.7.3)

Where we have rewritten L in the action using the definition of the Hamiltonian. The "new"
Hamiltonian, K(P, Q), is (half jokingly) referred to as the Kamiltonian.
In order for two actions to be equivalent, they must be related by a total time derivative^{2.5}.
Therefore, if we introduce a new function F:

pq̇ − H(p, q) = PQ̇ − K(P, Q) + dF/dt    (2.7.4)
Now we can choose among several possible F’s in terms of combinations of our new and
old coordinates, namely F₁(q, Q), F₂(q, P), F₃(p, Q), F₄(p, P). For now, we will proceed with
F = F₁(q, Q). The above equation then reduces to:

pq̇ − H(p, q) = PQ̇ − K(P, Q) + ∂F₁/∂t + (∂F₁/∂q) q̇ + (∂F₁/∂Q) Q̇    (2.7.5)
Grouping terms with like coefficients, we can then derive that:

p = ∂F₁(q, Q)/∂q    (2.7.6)

P = −∂F₁(q, Q)/∂Q    (2.7.7)

And, somewhat less usefully,

K = H + ∂F₁(q, Q)/∂t    (2.7.8)

2.4 See Landau pg. 144.
2.5 See Landau pg. 4.
Similar relations can be derived for the other "generating functions" by the same method. In
total, therefore,

• F₁(q, Q): p = ∂F₁/∂q, P = −∂F₁/∂Q

• F₂(q, P): p = ∂F₂/∂q, Q = ∂F₂/∂P. Among these, F₂ seems to be the one most regularly used.

• F₃(p, Q): q = −∂F₃/∂p, P = −∂F₃/∂Q

• F₄(p, P): q = −∂F₄/∂p, Q = ∂F₄/∂P

Once these relationships have been established, a single generating function can then be used to
generate an associated canonical transformation!

2.7.3 The Hamilton-Jacobi Equation

In the Hamiltonian formulation, the principle of least action becomes the Hamilton-Jacobi
equation:

∂S/∂t + H(q, p, t) = 0    (2.7.9)

Where S is the action.
Now, the Euler-Lagrange equation tells us that ṗᵢ = ∂L/∂qᵢ. Integrating both sides of this
equation with respect to t, and pulling the integral through the qᵢ derivative, we can write
pᵢ = ∂S/∂qᵢ. Therefore, the Hamilton-Jacobi equation can be written:

∂S/∂t + H(q, ∂S/∂q, t) = 0    (2.7.10)
Now, writing out the full Hamiltonian with pᵢ → ∂S/∂qᵢ:

(1/2m)|∇S|² + V + ∂S/∂t = 0    (2.7.11)

Where V is the potential.

2.8 The Harmonic Oscillator


2.8.1 Solutions of the 1D Harmonic Oscillator

Useful Sources: http://scipp.ucsc.edu/~haber/ph5B/sho09.pdf

The simple harmonic oscillator is characterized by the equation:

mẍ + kx = 0    (2.8.1)

Since we expect an oscillatory solution, we will use the trial solution:

x(t) = A sin(ωt) + B cos(ωt) (2.8.2)

We will apply the boundary conditions x(0) = x0 and ẋ(0) = v0 to eliminate the required two
degrees of freedom to obtain a unique solution. Taking the first derivative and solving for A and
B in terms of these conditions, we can rewrite the trial solution as:

x(t) = (v₀/ω) sin(ωt) + x₀ cos(ωt)    (2.8.3)

Taking the second derivative:

ẍ(t) = ω² ( −(v₀/ω) sin(ωt) − x₀ cos(ωt) ) = −ω² x(t)    (2.8.4)

Plugging this function into the original differential equation then gives us the relationship:

(−mω² + k) x(t) = 0    (2.8.5)

Or, since x(t) is not everywhere zero:

ω₀ = √(k/m)    (2.8.6)

Which is the frequency of the harmonic oscillator.
A linear combination of sines and cosines can be combined into a single phase-shifted
cosine using the following identity:

A cos(α + β) = A cos(α) cos(β) − A sin(α) sin(β)  ⟹
A cos(ωt + φ) = A cos(ωt) cos(φ) − A sin(ωt) sin(φ)

So, identifying the arbitrary constants A cos(φ) = x₀ and A sin(φ) = −v₀/ω, we have

x(t) = A cos(ωt + φ)    (2.8.7)

A here is interpreted as the amplitude of the oscillations, while φ is the phase shift of the
oscillator from a cosine dependence on position.

2.8.2 Solutions of the Damped 1D Harmonic Oscillator

The damping force on a harmonic oscillator is modeled by introducing a −bẋ (b > 0) term into
the force equation:

mẍ = −bẋ − kx (2.8.8)

Rearranging this equation, and letting β = b/2m and ω₀ = √(k/m):

ẍ + 2βẋ + ω₀²x = 0    (2.8.9)

By analyzing the characteristic equation of this differential equation (as described in the
mathematical methods section), we see that it has two roots:

r₁ = −β + √(β² − ω₀²),  r₂ = −β − √(β² − ω₀²)    (2.8.10)

The exact form of the general solution is determined by the sign of the discriminant of these
roots, β² − ω₀². The general solution will be:

x(t) = e^{−βt} ( A₁ e^{√(β² − ω₀²) t} + A₂ e^{−√(β² − ω₀²) t} )    (2.8.11)

• If ω0 2 > β 2 , so that β 2 − ω0 2 < 0, the roots will be complex. This creates oscillatory
behavior that is known as under-damped motion.
In this case, the general solution can be simplified by introducing a frequency for the
oscillatory motion, ω1 2 = ω0 2 − β 2 . With this substitution, the equation for under-damped
motion can be rewritten:
 
x(t) = e^{−βt} ( A₁ e^{iω₁t} + A₂ e^{−iω₁t} ) = A e^{−βt} cos(ω₁t − δ)    (2.8.12)

This solution clearly describes an initial oscillation that dies away with time due to the
e−βt term. Note that the frequency ω1 is not a true frequency, because the motion is not
truly periodic. However, it does accurately reflect the spacing between zero crossings, and
is therefore meaningful in this sense.

• If β² = ω₀², so that β² − ω₀² = 0, then there will only be one, real root to the characteristic equation.
This is known as the critically damped case. A second solution can be produced by
multiplying the first by t, and so a general solution can be written:

x(t) = (A + Bt) e^{−βt}    (2.8.13)

This solution describes the fastest possible approach of the oscillator to its equilibrium value.

• If ω₀² < β², so that β² − ω₀² > 0, the roots will be real. This is known as the over-
damped case. Again assigning a new frequency variable, ω₂ = √(β² − ω₀²), the general
solution can be written:

x(t) = e^{−βt} ( A₁ e^{ω₂t} + A₂ e^{−ω₂t} )    (2.8.14)

Noting that, since the exponents are now real, no oscillatory behavior emerges. Instead, the
over-damped solution simply relaxes toward equilibrium without oscillating, more slowly
than in the critically damped case.

Useful sources: Marion & Thornton pg. 108

2.8.3 Finding the Frequency of Small Oscillations: Approximating a SHO

Many systems behave like harmonic oscillators in the region immediately around an equilibrium
point. In many cases, these oscillations can be modeled as SHO, allowing us to calculate their
approximate frequency and amplitude. This is done by Taylor expanding the potential of the
system in question about its equilibrium point until the result looks like a simple harmonic
oscillator potential. At this point, the frequency of oscillations can simply be read off.
The potential of a true harmonic oscillator is:

U_SHO(x) = (1/2) k x²    (2.8.15)

Consider some arbitrary potential U (x) acting on a particle with mass m. Suppose U has an
equilibrium point at x = x0 . We are free to shift the potential such that U (x0 ) = 0. At x0 , the
potential is at a minimum. This means by definition that:

dU/dx (x₀) = 0    (2.8.16)

Now, consider the first several terms of the Taylor expansion of this potential about x0 :

U(x) ≈ U(x₀) + dU/dx|_{x₀} (x − x₀) + (1/2) d²U/dx²|_{x₀} (x − x₀)²  + ...    (2.8.17)

As stated above, the first two terms of this expansion go to zero, leaving

U(x) ≈ (1/2) d²U/dx²|_{x₀} (x − x₀)²    (2.8.18)

But, since x − x₀ is the distance of the particle from the equilibrium point, this is exactly the
potential of a SHO! Once derivatives have been taken and evaluated at x₀, some constant k_U
will remain, allowing us to write the equation in the form:

U(x) ≈ (1/2) k_U (x − x₀)²    (2.8.19)

Therefore, the approximate frequency of small oscillations of the particle about the equilibrium
point can be written:

ω₀ = √(k_U/m)    (2.8.20)
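A sketch of this recipe in sympy (my addition), using the pendulum potential U = −mgL cos(x)
as a stand-in example; note that for an angular coordinate the "mass" appearing in ω₀ is the
moment of inertia mL²:

    # Sketch: frequency of small oscillations about a potential minimum.
    import sympy as sp

    x, m, g, L = sp.symbols('x m g L', positive=True)
    U = -m * g * L * sp.cos(x)

    x0 = 0                                 # equilibrium: dU/dx = 0 there
    kU = sp.diff(U, x, 2).subs(x, x0)      # effective spring constant
    # For a pendulum the inertia of the generalized coordinate is m L^2:
    omega = sp.sqrt(kU / (m * L**2))
    print(sp.simplify(omega))              # sqrt(g/L)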

2.9 Orbital Motion


2.9.1 The Reduced Mass

The reduced mass allows us to simplify the motion of two bodies orbiting one another by
transforming the problem so that it resembles a single body orbiting around the center of mass
of the system. Consider the Lagrangian for two orbiting bodies in a potential U(r):

L = (1/2) m₁ ṙ₁² + (1/2) m₂ ṙ₂² − U(r)    (2.9.1)

51
Now suppose that we care only about motion that occurs relative to the center of mass of the
system, neglecting motion of the center of mass, such as translation of the entire system. We
can then set the center of mass as our origin, which implies:

m2 r2 + m1 r1 = 0 (2.9.2)

Let us also introduce a new vector r = r₁ − r₂. Together these two equations produce the
following relations:

r₁ = m₂/(m₁ + m₂) r,  r₂ = −m₁/(m₁ + m₂) r    (2.9.3)

Now substituting these expressions back into the Lagrangian, we obtain:

L = (1/2) m₁m₂/(m₁ + m₂) ṙ² − U(r) = (1/2) μṙ² − U(r)    (2.9.4)

Where we have now defined the reduced mass:

μ = m₁m₂/(m₁ + m₂)    (2.9.5)

The system is now represented as one body, with mass μ, orbiting another body fixed at the center of
mass with mass m_cm = m₁ + m₂, connected by the vector r.

2.9.2 Orbital Conserved Quantities

The Lagrangian of an orbiting body can be written:

1
L = µ(ṙ2 + r2 θ̇2 ) − U (r) (2.9.6)
2
∂L
It is clear that the Lagrangian is cyclic in θ, i.e. ∂θ = 0. Therefore, there angular momentum in
this direction, pθ , is conserved. By Nother’s theorem, this angular momentum is:

∂L
pθ = = µr2 θ̇ = l (2.9.7)
∂ θ̇

2.9.3 Kepler’s Second Law

If we ask how much area the radius vector “sweeps out" for a given dθ, the (1/2) base × height formula for
the area of a triangle gives us:

dA = (1/2) r (r dθ)    (2.9.8)

Dividing by the time, dt:

dA/dt = (1/2) r² dθ/dt = l/2μ    (2.9.9)

Since dA/dt is proportional to the angular momentum, which is conserved, we know that the radius
vector must “sweep out" the same amount of area per unit time throughout the orbit.

2.9.4 Deriving Orbits with Newtonian Mechanics: The Integral Equation

Steps:

1. Write the energy of the orbiting body

2. Rearrange for ṙ

3. Rewrite as an integral for θ using dθ = (dθ/dt)(dt/dr) dr and the angular momentum

We will start by writing the energy of an orbiting body with reduced mass μ:

E = (1/2) μṙ² + (1/2) l²/μr² + U(r)    (2.9.10)

This can be rearranged for ṙ:

ṙ = ±√( (2/μ)(E − U) − l²/μ²r² )    (2.9.11)

We would prefer, however, an equation for θ(r). We can combine the fact that
dθ = (dθ/dt)(dt/dr) dr = (θ̇/ṙ) dr with the equation for the angular momentum, θ̇ = l/μr², to get:

dθ = l/(μr²ṙ) dr    (2.9.12)

We can combine this with our equation for ṙ to write the function we desire as an integral:

θ(r) = ∫ ±(l/r²) / √( 2μ( E − U − l²/2μr² ) ) dr    (2.9.13)

This equation can now technically be used to calculate orbits, although it is awfully cumbersome.
It is interesting to examine the third term under the radical. The quantity in the parentheses
has units of energy, and U + l²/2μr² can be treated as an effective potential. In addition to the
regular potential, there is a second term that is due to the fictitious "centrifugal" force:

U_eff = U(r) + l²/2μr²    (2.9.14)

From this potential we can calculate the value of the centrifugal force:

F_c = −d/dr ( l²/2μr² ) = l²/μr³ = μrθ̇²    (2.9.15)

This effective potential due to the centrifugal force is the reason that there are bound states of
this system! While neither term in the effective potential has a minimum by itself, their sum
does (for an attractive potential like gravity).
Source: Marion & Thornton: pg. 291.

2.9.5 Deriving the Orbit Equation

Lim 1055 features a nicer way of computing the derivatives of r here that flows better conceptually.
I plan on replacing this with that eventually. In the meantime, check pg. 85 in Lim Classical
Mechanics.

Steps:

1. Write the Lagrangian for an orbiting body, then apply the Euler-Lagrange equation
2. Make the substitution u = 1/r

3. From the definition of u and the angular momentum equation, derive (using lots of chain
rule) expressions for the r terms in the equation

4. Substitute for the r terms and rearrange into the final form.

The Lagrangian of an orbiting body is:

1
L = µ(ṙ2 + r2 θ̇2 ) − U (r) (2.9.16)
2
Applying the Lagrange-Euler equation gives us:

∂U
µ(r̈ − rθ̇2 ) = − = F (r) (2.9.17)
∂r
This equation can be solved using the substitution u = 1/r. First we will derive relations to
eliminate r̈ and rθ̇². First:

du/dθ = −(1/r²) dr/dθ = −(1/r²)(dr/dt)(dt/dθ) = −ṙ/(r²θ̇) = −(μ/l) ṙ    (2.9.18)

Where the last step used the equation for angular momentum, θ̇ = l/μr².
Now:

d²u/dθ² = d/dθ ( −(μ/l) ṙ ) = (dt/dθ) d/dt ( −(μ/l) ṙ ) = −(μ/l) r̈/θ̇ = −(μ²/l²) r² r̈    (2.9.19)
Where again the last step uses the angular momentum equation. From these two expressions,
we can write:

r̈ = −(l²/μ²) u² d²u/dθ²    (2.9.20)

rθ̇² = (l²/μ²) u³    (2.9.21)
We can now substitute these results back into our original result from the Euler-Lagrange
equation to obtain:

d²u/dθ² + u = −(μ/l²)(1/u²) F(1/u)    (2.9.22)

This is the orbit equation. Substituting back in for F and r:

d²/dθ² (1/r) + 1/r = (μ/l²) r² ∂U/∂r    (2.9.23)

Source: Marion & Thornton: pg. 292.

2.9.6 The Kepler Problem (Explicit Equations for Orbits)

We will start with the integral orbit equation, substituting the central potential U(r) = −k/r:

θ(r) = ∫ ±(l/r²) / √( 2μ( E + k/r − l²/2μr² ) ) dr    (2.9.24)

If we assign the minimum value of r to be at θ = 0, we get (elaborate? How?):

cos(θ) = ( (l²/μk)(1/r) − 1 ) / √( 1 + 2El²/μk² )    (2.9.25)

If we assign the following constants:

α = l²/μk    (2.9.26)

ε = √( 1 + 2El²/μk² )    (2.9.27)

We can rewrite the orbit equation as:

α/r = 1 + ε cos(θ)    (2.9.28)
This equation describes conic sections, which represent all possible orbits. The shape of these
orbits can be grouped by the value of the eccentricity, ε:

• ε > 1, E > 0: Hyperbola. These orbits are not bounded.

• ε = 1, E = 0: Parabola. These orbits are not bounded, but are just on the edge of
becoming so.

• 0 < ε < 1, V_min < E < 0: Ellipse. These orbits are bound.

• ε = 0, E = V_min: Circle. This is the lowest energy bound orbit.
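A minimal sketch (my addition) tabulating r(θ) for each class of eccentricity; α and the sample
angles are arbitrary:

    # Sketch: the conic-section orbits alpha/r = 1 + ecc*cos(theta).
    import numpy as np

    alpha = 1.0
    theta = np.linspace(-2.0, 2.0, 5)   # stay away from r -> infinity
    for ecc, label in [(0.0, "circle"), (0.5, "ellipse"),
                       (1.0, "parabola"), (1.5, "hyperbola")]:
        r = alpha / (1.0 + ecc * np.cos(theta))
        print(label, np.round(r, 3))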

Source: Marion & Thornton: pg. 300.

2.10 Rigid Body Motion (Rotation in Euler Angles)


The Lagrangian and Hamiltonian formulations of classical mechanics discussed so far give us
many ways of describing the center of mass motion of a system. However, physical objects are
also free to rotate around their center of mass. In order to analyze this complicated motion,
we will conceptually separate motion into motion of the center of mass (described by regular
classical mechanics in the lab frame) and motion around the center of mass (described by rigid
body motion in a frame that rotates along with the body, called the body frame).
This section directly follows Thornton and Marion, Chapter 11.

2.10.1 Moment of Inertia

Just as the mass of an object determines how hard it is to move, the mass distribution of an object
determines how difficult it is to rotate. This mass distribution is characterized by the inertia
tensor.
The inertia tensor appears naturally when considering the energy of a rotating body.
Consider a rigid body made up of particles (labeled by α) that are both rotating and translating. The
particles’ velocities in the lab frame can be written

v_α = V_trans + ω × r_α    (2.10.1)

Where Vtrans is the translational velocity of the CM in the lab frame, and ω and rα are the
angular velocity and position vector of the particles in the body frame.
The kinetic energy of this system is

T = Σ_α (1/2) m_α (V_trans + ω × r_α)²    (2.10.2)

Noting that Σ_α m_α ( V_trans · (ω × r_α) ) = V_trans · ( ω × Σ_α m_α r_α ) = 0 because, in the center of
mass frame in which the r_α vectors are defined, the center of mass is at the origin, and thus
Σ_α m_α r_α = 0. The kinetic energy can thus be separated into distinct translational and rotational
terms:

T = Σ_α (1/2) m_α V_trans² + Σ_α (1/2) m_α (ω × r_α)²    (2.10.3)

Consider now only the rotational term. Using the identity that (A × B)² = A²B² − (A · B)²,
we can simplify this expression to:

T_rot = (1/2) Σ_α m_α ( ω² r_α² − (ω · r_α)² )    (2.10.4)
Plugging in components and making the substitution ωᵢ = Σⱼ ωⱼ δᵢⱼ, we can pull out the ω’s:

T_rot = (1/2) Σ_{i,j} ωᵢ ωⱼ Σ_α m_α ( δᵢⱼ Σ_k x²_{α,k} − x_{α,i} x_{α,j} )    (2.10.5)

This is now the final expression for the rotational kinetic energy. However, it makes sense to
collect the position and mass information in the body frame into a single quantity, which we
will define as the inertia tensor:

Iᵢⱼ = Σ_α m_α ( δᵢⱼ Σ_k x²_{α,k} − x_{α,i} x_{α,j} )    (2.10.6)

Now the rotational kinetic energy takes the form taught in high school physics classes,

T_rot = (1/2) Σ_{i,j} Iᵢⱼ ωᵢ ωⱼ    (2.10.7)

2.10.2 Calculating Inertia Tensors

When calculating the inertia tensor, it is usually easiest to use the above definition of the tensor
elements. Notice that diagonal elements will end up having the form

Iᵢᵢ = Σ_α m_α ( (x²_{1,α} + x²_{2,α} + x²_{3,α}) − x²_{i,α} )    (2.10.8)

While off-diagonal elements will have the form:

Iᵢⱼ = Σ_α m_α ( −x_{i,α} x_{j,α} )    (2.10.9)

It is clear from the definition of the tensor that it is symmetric: i.e. Iᵢⱼ = Iⱼᵢ.
It is also clear that, generalizing the summation over α to an integral over all mass elements,
the tensor elements for a continuous mass distribution are given by:

Iᵢⱼ = ∫_V ρ(r) ( δᵢⱼ Σ_k x_k² − xᵢ xⱼ ) dV    (2.10.10)

It is, however, less clear that once one moment of inertia has been calculated, it is possible to
easily find the moment of inertia around any parallel axis. This result^{2.6} is known as
Steiner’s parallel-axis theorem. It states that, if Jᵢⱼ is the inertia tensor about an origin displaced
by the vector a from the center of mass, then the inertia tensor about the center of mass can be
written:

Iᵢⱼ = Jᵢⱼ − M ( a² δᵢⱼ − aᵢ aⱼ )    (2.10.11)

Where M is the total mass of the object. Keep in mind that this theorem only applies to parallel
axes, one of which must pass through the center of mass!
A second general theorem, called the perpendicular axis theorem, concerns only objects
that exist in a 2D plane (or can be approximated as such), for example a piece of paper or thin
rectangular slab. If the object exists in the x − y plane, then the perpendicular axis theorem
states that the moments of inertia along the axes are related by:

Iz = Ix + Iy (2.10.12)

2.10.3 Principal Axes of Rotation: Diagonalizing the Inertia Tensor

In general, an inertia tensor will have some non-zero off diagonal elements. However, once
diagonalized (through the normal methods of linear algebra), we can give a physical interpretation
to the diagonal elements.
Diagonalizing a matrix involves multiplying it by a change of coordinates matrix of eigenvec-
tors, which amounts to a rotation of the inertia tensor about the origin. The resulting orientation
of the axes are known as the principal axes of the inertia tensor, and are the preferred basis
for describing the rotations of the body.
2.6 For a proof, see Thornton and Marion, pg. 429.
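The sketch below (my addition) assembles the inertia tensor for a few arbitrary point masses
directly from eq. (2.10.6) and then diagonalizes it; the eigenvalues are the principal moments
and the eigenvectors are the principal axes:

    # Sketch: build the inertia tensor for point masses and diagonalize it.
    import numpy as np

    masses = np.array([1.0, 1.0, 2.0])
    xyz = np.array([[1.0, 0.0, 0.0],     # arbitrary body-frame positions
                    [0.0, 1.0, 0.0],
                    [0.0, 0.0, 1.0]])

    I = np.zeros((3, 3))
    for m, r in zip(masses, xyz):
        I += m * (np.dot(r, r) * np.eye(3) - np.outer(r, r))

    # Principal moments and axes: eigen-decomposition of the symmetric tensor.
    moments, axes = np.linalg.eigh(I)
    print(I)
    print(moments)   # the diagonal elements in the principal-axis frame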

2.10.4 Euler Angles

Figure 2.10.1: Euler Angles graphic (Source: Wikipedia)

Rigid body motion is essentially the study of tops: rigid objects that are fixed to rotate
about one point. The Euler angles are the natural coordinate system for describing this motion,
with angles corresponding to nutation (θ), precession (φ) and the rotation of the top around its
own axis (ψ). Conveniently, many other rotating bodies can also be easily described using these
angles.
In a normal Cartesian coordinate system, one is free to "give directions" in whatever order
one pleases: it does not matter if you move over by x and up by y, or up by y then over by
x. With Euler angles, however, this gets very confusing! The best way to think about the
Euler angles is as a series of directions, specifying how to get from the "rest" position of the top
(upright in the lab frame) to any position during its motion.
To get to some position with Euler angles, you follow the following steps. Let the body axes
be labeled 1, 2, 3, and the fixed lab-frame axes x, y, z. Imagine the "body" to be a top, spinning
about its 3 axis.

1. Rotate around the lab’s x or y axis (which is a convention that is different in classical and
quantum mechanics) through the angle θ. The body’s 3 axis is now θ away from the lab z
axis. This covers all possible tilts of the top.

2. Rotate the body around the lab z axis, through an angle φ. This covers all possible
orientations of the top around the z axis as it precesses (remember that its point is fixed
at the origin).

3. Finally, rotate the top about its 3 axis. This sets the orientation of the top.

With the Euler angles so defined, we can represent each rotation as a rotation matrix, allowing
(very messy) calculations of the body’s rotational angular velocities in terms of the Euler angles,
which are measured from the lab frame!^{2.7}

These glorious yet disgusting equations are as follows:

ω1 = φ̇ sin(θ) sin(ψ) + θ̇ cos(ψ) (2.10.13)


2.7 For the gory details, see Thornton and Marion pg. 442.

ω2 = φ̇ sin(θ) cos(ψ) − θ̇ sin(ψ) (2.10.14)

ω3 = φ̇ cos(θ) + ψ̇ (2.10.15)

It is handy to note that

ω₁² + ω₂² = φ̇² sin²(θ) + θ̇² (2.10.16)
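
Skeptical readers can verify eq. (2.10.16) symbolically; a quick sympy check (mine, not from the notes):

import sympy as sp

theta, psi, phidot, thetadot = sp.symbols('theta psi phidot thetadot')
w1 = phidot * sp.sin(theta) * sp.sin(psi) + thetadot * sp.cos(psi)   # eq. (2.10.13)
w2 = phidot * sp.sin(theta) * sp.cos(psi) - thetadot * sp.sin(psi)   # eq. (2.10.14)

assert sp.simplify(w1**2 + w2**2 - (phidot**2 * sp.sin(theta)**2 + thetadot**2)) == 0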

2.10.5 Euler’s Equations

Solving for dynamics in Euler angles generally involves solving the associated Euler-Lagrange
equations. Luckily, in Euler angles these easily^{2.8} reduce to a system of simple equations, known
as Euler's Equations. For a body with principal moments of inertia I₁, I₂, and I₃, an angular
velocity ω, and a torque N, split into components along the principal axes, we have:

I1 ω̇1 − (I2 − I3 )ω2 ω3 = N1 (2.10.17)

By permutation of the axes, we must also have the corresponding relations:

I2 ω̇2 − (I3 − I1 )ω3 ω1 = N2 (2.10.18)

I3 ω̇3 − (I1 − I2 )ω1 ω2 = N3 (2.10.19)
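
As a sketch of these equations in action, here is a minimal scipy integration of the torque-free case (N = 0), with made-up principal moments; the rotational kinetic energy should be conserved along the trajectory.

import numpy as np
from scipy.integrate import solve_ivp

I1, I2, I3 = 1.0, 2.0, 3.0                      # hypothetical principal moments

def euler_rhs(t, w):
    # Euler's equations (2.10.17)-(2.10.19) with N = 0
    return [(I2 - I3) * w[1] * w[2] / I1,
            (I3 - I1) * w[2] * w[0] / I2,
            (I1 - I2) * w[0] * w[1] / I3]

sol = solve_ivp(euler_rhs, (0.0, 20.0), [1.0, 0.1, 0.1], rtol=1e-9)
T = 0.5 * (I1 * sol.y[0]**2 + I2 * sol.y[1]**2 + I3 * sol.y[2]**2)
print(T.max() - T.min())                        # ~0: kinetic energy is conserved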

2.11 Classical Scattering Theory

Figure 2.11.1: Classical scattering diagram (Source: Griffiths QM, pg. 394)

Classical scattering occurs when a small particle interacts with a relatively massive (and
therefore effectively immovable) object. This interaction will alter the trajectory of the smaller
object, causing it to “bounce” or “scatter” off of the larger object. This classical model is
important in its own right (for example, it explains scattering of particles off of nuclei). However,
2.8 See Thornton and Marion pg. 444 for a derivation.

it is particularly important to understand as a basis for electromagnetic scattering (diffraction,
etc.) and quantum mechanical scattering (scattering between quantum particles).

Figure 2.11.2: Classical scattering diagram (Source: Griffiths QM, pg. 396)

A particle approaches an object (or scattering center) with an impact parameter b (see
fig. 2.11.1). Our goal is to determine the angular distribution of scattered particles, as well as
the total cross section of the object; i.e. what area of the stream of particles it interacts with.
The way the particle scatters can be characterized by the ratio between the area dσ that the
particle passes through, and the angle dΩ that it scatters into (Fig. 2.11.2). We define the ratio
between these two quantities as a function of θ as the differential cross section:


D(θ) = dσ/dΩ (2.11.1)

D(θ) is solely a property of the object (or potential) that the particle scatters off of, and
is independent of the particle itself. “Solving” the scattering problem therefore amounts to
determining D(θ) for the given potential or object.
In terms of the variables of Fig. 2.11.2, dσ = b db dφ, while it is always true that dΩ = sin θ dθ dφ.
Therefore, we can also express D(θ) as^{2.9}:

D(θ) = (b/sin θ) |db/dθ| (2.11.2)

Once D(θ) has been found, the total cross section can be found by integrating over the entire
solid angle:

σ = ∫ D(θ) dΩ (2.11.3)

2.11.1 Example: Hard Sphere Scattering

The simplest classical example of scattering is the problem of scattering off of a hard sphere of
radius R. In this case, if the particle strikes the surface at an angle α, it will reflect off of the
surface an angle α from the surface normal. Based on Fig. 2.11.3 we can then express θ = π − 2α.
Then:
b = R sin α = R sin(π/2 − θ/2) = R cos(θ/2) (2.11.4)
2.9 db/dθ is usually negative, so we take the absolute value by convention.

Figure 2.11.3: Hard sphere scattering (Source: Griffiths QM, pg. 395)

So:

θ = 2 cos⁻¹(b/R) for b ≤ R;   θ = 0 for b > R (2.11.5)

Now we can explicitly calculate D(θ):

db/dθ = −(R/2) sin(θ/2)   →   D(θ) = R²/4 (2.11.6)

Integrating this constant differential cross section over the full solid angle gives σ = πR²: exactly
the geometric cross section of the sphere, as we should expect.
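
A numerical sanity check of this result (mine, not Griffiths'): compute θ(b) from eq. (2.11.5) and form D = (b/sin θ)|db/dθ| by finite differences; away from the endpoints it should come out flat at R²/4.

import numpy as np

R = 2.0
b = np.linspace(0.01, 0.99 * R, 500)
theta = 2 * np.arccos(b / R)                    # eq. (2.11.5)

dtheta_db = np.gradient(theta, b)               # numerical d(theta)/db
D = b / np.sin(theta) / np.abs(dtheta_db)
assert np.allclose(D[5:-5], R**2 / 4, rtol=1e-2)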

3 Relativity
3.1 The Lorentz Transformation and its Consequences
3.1.1 The Lorentz Transformation

The basic principle of special relativity insists that the speed of light be the same in all reference
frames. From this assumption, we can derive the form of the Lorentz Transformation between
inertial reference frames. Defining γ to be:

γ = 1/√(1 − v²/c²) (3.1.1)

It is worth noting here that γ ≥ 1, since v ≤ c.


The transform into a frame moving in the x direction with velocity v can be written:

Lorentz Transformation

• t′ = γ(t − vx/c²)

• x′ = γ(x − vt)

• y′ = y

• z′ = z

Of course, to obtain the inverse transformation we simply substitute v → −v.


Inverse Lorentz Transformation

• t = γ(t′ + vx′/c²)

• x = γ(x′ + vt′)

• y = y′

• z = z′
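
A small sketch of these transformations (my own convention: units with c = 1), checking that the spacetime interval t² − x² is left unchanged by a boost:

import numpy as np

def boost(t, x, v):
    # Lorentz transform (t, x) into a frame moving at speed v along x (c = 1)
    g = 1.0 / np.sqrt(1.0 - v**2)
    return g * (t - v * x), g * (x - v * t)

t, x, v = 3.0, 1.0, 0.8
tp, xp = boost(t, x, v)
assert np.isclose(t**2 - x**2, tp**2 - xp**2)   # the interval is invariant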

3.1.2 Time Dilation

One immediate consequence of accepting the Lorentz transformation is that the classical concept
of simultaneity is no longer applicable.
Suppose that a clock starts at x = 0 at t = 0. The clock then moves away from the origin
along the x-axis at velocity v. The clock will stop when t′ = T′. We wish to find T: the time it
takes for the clock to stop in the rest frame.

When the clock stops, t′ = T′, t = T, and x = vT. Therefore, from the time component of
the Lorentz transform, we can write:

T′ = γ(T − (v²/c²)T) = T/γ (3.1.2)
Or, finally,

T = γT′ (3.1.3)

This is the fundamental statement of time dilation. Since γ ≥ 1, T ≥ T′: MORE time will pass
in the rest frame than in the moving frame. This is often rephrased as "moving clocks
run slow".

3.1.3 Lorentz Contraction

Another fundamental consequence of the Lorentz transformation is that observers in different
frames of reference will not agree about the lengths of objects.
Consider a thought experiment where a rod of length L′ (in its own moving frame) is moving
at a velocity v in the x direction. Set the origin of each coordinate system so that at t = t′ = 0,
one end of the rod is at x = x′ = 0. In the moving (primed) frame, then, the other end of the
rod will be at x′ = L′. In the rest frame, however, the x term of the Lorentz Transformation
(evaluated at t = 0) tells us that:

x′ = γx (3.1.4)

Therefore, calculating L ≡ Δx = x − 0 and L′ ≡ Δx′ = x′ − 0:

L′ = γL (3.1.5)

Or

L = L′/γ (3.1.6)

Since γ ≥ 1, this means that L ≤ L′: the rod appears to shrink (lengthwise) when viewed from
the rest frame. It is important to note that this phenomenon only occurs along the direction of
motion: the dimensions of the rod in the y and z directions are not distorted.

3.1.4 The Einstein Velocity Addition Rule

Suppose a particle is moving in the moving (primed) frame with a velocity u′, while the primed
frame itself is moving with a velocity v. We wish to find the velocity of the particle in the rest
frame, which we will call u.

If we set the origins of both frames so that x = x′ = 0 at t = t′ = 0, then the inverse Lorentz
transformation gives us that:

Δx/Δt = γ(Δx′ + vΔt′) / γ(Δt′ + (v/c²)Δx′) (3.1.7)

Rearranging this expression, and noting that Δx′/Δt′ = u′ and Δx/Δt = u, we have

u = (u′ + v) / (1 + u′v/c²) (3.1.8)

This is the Einstein velocity addition rule. Notice that, in the case where u′, v ≪ c, it reduces
to the classical velocity addition rule:

u = u′ + v (3.1.9)

Also notice that, even if u′ = c, u = c. This rule therefore reflects the fundamental postulate of
relativity: that light moves at the same velocity in every reference frame.
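
As a one-function sketch of eq. (3.1.8) (with c kept explicit; the test values are made up):

def add_velocities(u_prime, v, c=1.0):
    # Einstein velocity addition, eq. (3.1.8)
    return (u_prime + v) / (1.0 + u_prime * v / c**2)

print(add_velocities(0.5, 0.5))   # 0.8, not the classical 1.0
print(add_velocities(1.0, 0.9))   # exactly 1.0: light speed in every frame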

3.2 Four-Vectors
3.2.1 Four-Vectors

A four-vector is a generalized form of a normal vector in 3D space (or "three-vector"). The extra
component has been added to include time as another pseudo-dimension. However, this analogy
between time and the spatial dimensions should not be stretched too far, as there are some
important asymmetries between the two cases.
The position four-vector is^{3.1}:

x^µ = (ct, x, y, z) (3.2.1)

There are an unfortunate number of conflicting conventions for the time component of this
vector. Some authors include it as the first (0th) component, while others place it last (4th).
Sometimes that term is made imaginary (ict) for reasons that will be apparent later. In these
notes, all four vectors will be written according to the convention above.
The position vector is naturally contravariant. In Einstein notation, all contravariant
3.1
Notice that the c in the first term is necessary for that component to have units of length!

vectors are symbolized by upper indices:

x^µ (3.2.2)

The corresponding covariant vectors (dual vectors) are symbolized by lower indices:

x_µ (3.2.3)

Another convention avoids a common piece of notational clutter. Sometimes while summing
over four-vectors you wish to index all four components of a vector, while for others you wish to
only index the three spatial components, treating the time component separately. Instead of
explicitly stating which is meant, a convention is defined:

• Greek indices (µ, ν, etc.) are assumed to run over all four components of the
four-vector so that, for example, µ = 0, 1, 2, 3.

• Latin indices (i, j, etc.) are assumed to run over only the spatial components of the
tensor, i.e. i = 1, 2, 3.

With normal three-dimensional Euclidean vectors, the dot product is defined as the product
of a contravariant ("normal") vector with its corresponding covariant ("dual") vector. The
generalized dot product of two vectors x^µ and y^ν is:

g(x^µ, y^ν) = x^µ g_{µ,ν} y^ν (3.2.4)

Where g_{µ,ν} is called the metric, and depends on the geometry of the spacetime in which you
are working. In general relativity, the metric is calculated as a way to represent curved spacetime.
In special relativity, however, we work in Minkowski space ("flat space"), which is a
pseudo-Euclidean^{3.2} space in which:

g_{µ,ν} = diag(−1, 1, 1, 1) (3.2.5)

Another convention arises here: we could just as easily define the Minkowski metric above with
−1 ↔ 1:

g_{µ,ν} = diag(1, −1, −1, −1) (3.2.6)
This choice defines a property called the signature of the metric. The signature of a metric
can formally be written (p, q, r), where p is the number of positive eigenvalues, q the number of
negative eigenvalues, and r the number of zero eigenvalues. However, as a shorthand, the two
3.2 Minkowski space is not really a Euclidean space, because of the minus sign in the metric defined below.

separate signature conventions for the Minkowski metric are usually notated^{3.3}:

(3, 1) → (−, +, +, +),   (1, 3) → (+, −, −, −) (3.2.7)

There is no reason to prefer one over the other. However, for consistency, in these notes we
will use the (1, 3) = (+, −, −, −) metric written above (eq. 3.2.6).
In Euclidean space, performing a dot product between two contravariant (column) vectors
involves transposing one of them into a covariant "dual" (row) vector:

x · y → xᵀ y (3.2.8)

The metric still performs a similar operation with four-vectors. The Minkowski metric defined
above is a covariant metric tensor (hence why it was written with lower indices). For every
such tensor there exists a contravariant metric tensor such that:

g^{µ,ν} g_{ν,σ} = g_{σ,ν} g^{ν,µ} = δ^µ_σ (3.2.9)

Notice that indices must match up in pairs in order for multiplication of the matrices to be valid.
In the special case where the indices on both metrics are the same:

g^{µ,ν} g_{ν,µ} = I (3.2.10)

Where I is the identity matrix. For the Minkowski metric in particular, g_{µ,ν} = g^{µ,ν}: the
Minkowski metric is its own inverse.
The covariant and contravariant metrics have the property of transforming covariant four-vectors
into contravariant four-vectors (and vice versa):

g^{µ,ν} x_ν = x^µ,   g_{µ,ν} x^ν = x_µ (3.2.11)

As a mnemonic (although not mathematically cogent), this can be thought of as a process of
canceling indices:

g^{µ,ν} x_ν = x^µ (3.2.12)
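
A small numpy sketch of the (+, −, −, −) metric used in these notes, raising and lowering the index of a made-up four-vector:

import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])     # eq. (3.2.6)

x_up = np.array([2.0, 1.0, 0.5, 0.0])    # contravariant components x^mu
x_down = g @ x_up                        # covariant components x_mu

print(x_up @ x_down)                     # invariant: 2^2 - 1^2 - 0.5^2 = 2.75
assert np.allclose(g @ g, np.eye(4))     # the metric is its own inverse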


3.2.2 Lorentz Transformations in Four-Vector Notation

The four-vector notation makes the Lorentz transformations look much nicer. If we define β = v/c,
the Lorentz transformation (for motion in the x direction) is:

• x′⁰ = γ(x⁰ − βx¹)

• x′¹ = γ(x¹ − βx⁰)

• x′² = x²

• x′³ = x³

3.3 r = 0 is understood by omitting r.

This system of equations can be expressed using a tensor:

x′^µ = Λ^µ_ν x^ν (3.2.13)

Where Λ is:

Λ = [  γ    −γβ   0   0 ]
    [ −γβ    γ    0   0 ]
    [  0     0    1   0 ]
    [  0     0    0   1 ]      (3.2.14)
Of course, this holds for any arbitrary four-vector a:

a′^µ = Λ^µ_ν a^ν (3.2.15)
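
Sketch (mine): build Λ for a made-up β and verify that it preserves the metric, Λᵀ g Λ = g, which is exactly what makes it a Lorentz transformation.

import numpy as np

beta = 0.6
gamma = 1.0 / np.sqrt(1.0 - beta**2)

Lam = np.array([[ gamma, -gamma * beta, 0.0, 0.0],
                [-gamma * beta,  gamma, 0.0, 0.0],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]])   # eq. (3.2.14)

g = np.diag([1.0, -1.0, -1.0, -1.0])
assert np.allclose(Lam.T @ g @ Lam, g)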

3.3 Relativistic Mass


Mass (as defined by F = ma) is not constant as speed varies in relativity. However, the
rest mass, m₀, is. The two are related by:

m = γm₀ (3.3.1)

Note that the terms 'invariant mass' and 'rest mass' mean the same thing: the rest mass is
frame-invariant. The velocity-dependent quantity m = γm₀ is what is usually called the
'relativistic mass'.

3.4 Relativistic Energy


The total relativistic energy includes both the rest energy of a particle and any kinetic energy it
may have. The relativistic energy of a particle with rest mass m moving at velocity v is:

E = γmc2 (3.4.1)

Sometimes you may want to split the energy into kinetic and rest parts. The rest component is
always:
Erest = mc2 (3.4.2)

Which means that the kinetic energy must then be:

KE = E − Erest = (γ − 1)mc2 (3.4.3)

Another extremely useful equation relates the total relativistic energy to the rest mass m and
the magnitude p of the three-momentum:

E = √(p²c² + m²c⁴) (3.4.4)

From this follows an important result: for zero-mass particles,

E = |p|c (3.4.5)

3.5 Relativistic Three Momentum


The relativistic three-momentum is simply the same kinematic momentum we are used to dealing
with, but with an extra factor of γ:

p = γmv (3.5.1)

Where m is the invariant (rest) mass.

3.6 Four Momentum


The four-momentum is a generalized version of the momentum vector that now includes the
energy of the particle:

p^µ = (E/c, p₁, p₂, p₃) (3.6.1)

Notice that E is the total energy of the particle, not just its rest energy! In other words, the
same E as in:

E = √(|p|²c² + m²c⁴) (3.6.2)

Where p is the three-momentum.


Just like three-momenta, four-momenta can be added:

p_{1,µ} + p_{2,µ} = p_{12,µ} (3.6.3)

Four-momentum is also conserved in collisions, elastic or not:

p_{µ,i} = p_{µ,f} (3.6.4)

The magnitude of p^µ can be found via a dot product using the Minkowski metric:

p^µ p_µ = E²/c² − p₁² − p₂² − p₃² (3.6.5)

The most important feature of the four momentum is that this product, pµ pµ is
frame invariant: the same in all reference frames. If we can write the four momenta in
two separate reference frames, we can be sure that their magnitudes will be the same.
One consequence of this applies to the four-momentum of a single particle. If we shift into the
rest frame of the particle, then all of its three-momentum components are zero, so that:
p^µ p_µ = E²/c² = m²c² (3.6.6)

Notice that this will not work for multiple particles, since you cannot shift into a reference frame
where they are all at rest (unless they all have the same velocity).

67
Before you get too excited, this is a good point to sit back and remember that (a + b)² ≠ a² + b².
In general, before transforming frames, you will need to combine all the four-momenta in a
system to get a total four-momentum.
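
A short sketch of this bookkeeping (units with c = 1; the particle masses and momenta are made up): combine two four-momenta and compute the invariant mass of the pair.

import numpy as np

g = np.diag([1.0, -1.0, -1.0, -1.0])

def four_momentum(m, p3):
    p3 = np.asarray(p3, dtype=float)
    E = np.sqrt(p3 @ p3 + m**2)          # eq. (3.4.4) with c = 1
    return np.array([E, *p3])

def invariant_mass(p):
    return np.sqrt(p @ g @ p)            # sqrt(p^mu p_mu)

p1 = four_momentum(1.0, [0.0, 0.0,  3.0])
p2 = four_momentum(1.0, [0.0, 0.0, -3.0])

print(invariant_mass(p1))                # 1.0: a single particle's rest mass
print(invariant_mass(p1 + p2))           # 6.32..., far more than 1 + 1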

4 Electromagnetism
4.1 A Note on Griffith’s “script r"

Figure 4.1.1: Diagram of Script r (Source: Griffiths “Introduction to Electrodynamics", pg. 9)

In the first chapter of Introduction to Electrodynamics^{4.1}, Griffiths defines a convenient
"separation" vector, 𝓇, as:

𝓇 = r − r′ (4.1.1)

where r is the vector from the origin to the field point, and r′ the vector from the origin to
the source point (see figure 4.1.1). The important thing to remember is that 𝓇 points from
the source point to the field point.

The vector is denoted by a boldface script 𝓇, while its magnitude is denoted by a plain script 𝓇.

In these notes, I generally use arrows rather than boldface to denote vectors, since some
LaTeX characters don't have bold versions. However, to be consistent with Griffiths, I WILL try
to use the script 𝓇 notation.
Finally, the reader should be aware that I spent a fair amount of time getting the Griffiths
script 𝓇 symbol set up in LaTeX. I hope you're grateful.^{4.2}

Here are some generally useful facts about 𝓇:

• ∇(1/𝓇) = −𝓇̂/𝓇²,   ∇′(1/𝓇) = +𝓇̂/𝓇²   (proved in Griffiths, prob. 1.13, by direct differentiation)

4.2 The Units of Electromagnetism


Good habits of dimensional analysis have a tendency to disappear in E&M, since the units are
no longer ones we are familiar with in everyday life.
Just like chemistry and thermodynamics, E&M deals with very large numbers of electrons.
In chemistry and thermodynamics, we make these numbers manageable by defining Avogadro’s
number and the Boltzmann constant respectively. For the same reason, in E&M, we will define
4.1
Page 9
4.2
If you ever need to do this, you can download PDFs of the symbols and a sample file demonstrating their use
from Griffith’s website: http://academic.reed.edu/physics/faculty/griffiths.html

the Coulomb to be 6.241 × 10^18 (elementary charges). It is important to remember that the coulomb by itself is not
really a unit: it's just a number!

Charge is measured in "electrons"^{4.3}; the intrinsic charge of the electron is taken to be a
universal constant. In practice, we will talk about coulombs as a unit of charge. However, what
we are really referring to is the charge of one coulomb's worth of electrons!
The most basic true unit in E&M is the Ampere, which is an SI base unit, defined to be
the flow of one coulomb of electrons past a point in one second:

1 A = 1 C/s (4.2.1)

The first derived unit of E&M that we should define is the Volt, which is defined to be the
potential difference between two points that will produce 1 Joule of energy per coulomb that
flows between them:

1 V = 1 J/C (4.2.2)
Directly from the volt, we can now define a unit for electric fields. Since the electric field is
defined to be the (negative) gradient of the potential, it will have units of Volts per Meter. To my
knowledge, there isn't a special name for this unit.
We can also now define a unit for capacitance. One Farad is defined to be the capacitance
of a capacitor that holds one coulomb of charge with a potential difference of one volt, leading
us to write:

1 F = 1 C/V (4.2.3)
The unit for electrical resistance comes straight from Ohm's law, and is the Ohm:

1 Ω = 1 V/A (4.2.4)

The units of magnetic flux density are unfortunately rather cluttered. The current (derived)
SI unit for magnetic flux density is the Tesla, which is defined such that a 1 coulomb test
charge passing through a 1 Tesla B-field at a speed of 1 meter per second experiences a force of
1 Newton. In other words, by the Lorentz force law:

1 T = 1 N·s/(m·C) (4.2.5)

There are a number of nicer ways to write this unit, but this one makes the most sense from the
definition.
4.3 Actually, the fundamental unit of charge is one-third of the electron charge, to account for the "fractional"
charges of quarks.

4.3 Static Electric Fields
4.3.1 Electric Field Discontinuities

Electric field lines are discontinuous wherever they cross a charged surface. The exact amount
of this discontinuity is highly useful, and can often be used as a boundary condition in potential
theory.
The discontinuity in an electric field E crossing a surface carrying charge density σ, with n̂₂ the
surface normal of side 2 (pointing from side 2 into side 1), is:

n̂₂ · (E₁ − E₂) = E₁,⊥ − E₂,⊥ = σ/ε₀ (4.3.1)

n̂₂ × (E₁ − E₂) = E₁,∥ − E₂,∥ = 0 (4.3.2)

The quintessential example of these discontinuities (and the easiest way to re-derive them in a
pinch) is a surface charge σ on a flat surface with no external electric field. Then:

E₁,⊥ − E₂,⊥ = σ/2ε₀ − (−σ/2ε₀) = σ/ε₀ (4.3.3)

4.3.2 Green’s Reciprocity Relation

Green's reciprocity relation concerns two separate charge distributions. It states that
the potential energy of charge distribution A due to charge distribution B is equal to the energy
of B in the field due to A. In other words:

U₂,₁ = (1/4πε₀) ∫ d³r ∫ d³r′ ρ₂(r)ρ₁(r′)/|r − r′| = (1/4πε₀) ∫ d³r′ ∫ d³r ρ₁(r′)ρ₂(r)/|r′ − r| = U₁,₂ (4.3.4)

Notice that this works because the integrals can be interchanged, and the vector difference in
the denominator is a magnitude. The definition of the electrical potential allows this equation
to be written in a more compact form:
∫ d³r ρ₂(r) φ₁(r) = ∫ d³r′ ρ₁(r′) φ₂(r′) (4.3.5)

This equation can sometimes be used as a trick to calculate the force between two different
charge distributions.

4.3.3 Example: Force Between Spherically Symmetric Charge Distributions Using Green's Reciprocity Relation

Suppose that we have two spherically symmetric charge distributions, ρ1 and ρ2 (total charges
Q1 and Q2 ) with 1 centered at the origin and 2 at some displacement R. We want to calculate
the force between these two distributions, which we can do if we can calculate the potential
energy of their interaction. We know that the potential energy of ρ1 in the field of ρ2 is:
U₁,₂ = ∫ d³r ρ₁(r) φ₂(r) (4.3.6)

Since ρ₂ is spherically symmetric, outside of its boundary its electric field must be identical to
that of a point charge centered at R. We can then rewrite φ₂ as φ_{p,2}: the potential from a point
charge.

U₁,₂ = ∫ d³r ρ₁(r) φ_{p,2}(r) (4.3.7)

Green's Reciprocity Relation allows us to rewrite this equation as:

U₁,₂ = ∫ d³r ρ_{p,2}(r) φ₁(r) (4.3.8)

The charge density of the point charge at 2 can be written as ρ_{p,2} = δ(r − R)Q₂. However, since
we are now in the perspective of 2, 1 now looks like a point charge too! So φ₁ → φ_{p,1}.

U₁,₂ = ∫ d³r δ(r − R) Q₂ (1/4πε₀)(Q₁/r) (4.3.9)

Which now reduces easily to:

U₁,₂ = (1/4πε₀) Q₁Q₂/R (4.3.10)
And, finally:

F₁,₂ = −∂U₁,₂/∂R = (1/4πε₀) Q₁Q₂/R² (4.3.11)

4.3.4 Energy of Electrostatic Charge Distributions

Distributions of stationary charges have inherent potential energy due to their proximity to
one another. Since the electrical potential is taken to be zero at infinity, charged particles have
zero potential energy at infinity. Therefore, the potential energy of a charge distribution is the
energy it takes to assemble the distribution by bringing each particle in, one at a time, from
infinity. The first particle can be added for free, while the following particles must fight against
the potential of all of the particles that have come before. This can be written neatly as a sum:

U_E = W = (1/4πε₀) Σ_{j=1}^N Σ_{i<j} qᵢqⱼ/|rᵢ − rⱼ| = (1/4πε₀) (1/2) Σ_{j=1}^N Σ_{i≠j} qᵢqⱼ/|rᵢ − rⱼ| (4.3.12)

Where, in the last expression, the summations have been simplified by double counting each
interaction, which is corrected by a factor of 1/2.
Recognizing part of this equation as the potential of the FINISHED charge distribution, φ,
we can write the sum as:

U_E = (1/2) Σ_{i=1}^N qᵢ φ(rᵢ) (4.3.13)

For continuous distributions, this becomes:

U_E = (1/2) ∫ d³r ρ(r) φ(r) (4.3.14)

It is also worth considering the total energy of two charge distributions that have been brought
together such that ρ(r) = ρ₁(r) + ρ₂(r):

U_E(ρ) = U_E(ρ₁) + U_E(ρ₂) + (1/4πε₀) ∫ d³r ∫ d³r′ ρ₁(r)ρ₂(r′)/|r − r′| (4.3.15)

The first two terms, UE (ρ1 ) and UE (ρ2 ) are the energies of the two charge distributions by
themselves, while the third term is known as the interaction energy.
The energy in an electric configuration can also be computed by integrating over the square
of the electric field:
U_E = (ε₀/2) ∫ d³r |E(r)|² (4.3.16)
It is important to remember that this method of calculating the energy of a charge distribution
necessarily contains the self-energy of the particles in question, as well as their interaction energy.
This expression must therefore only be used with non-singular charge distributions, since point
charges will lead to pathological infinite energies^{4.4}.

This equation is the basis of the definition of the electric field energy density:

u_E(r) = (ε₀/2) |E(r)|² (4.3.17)
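
A sketch of the discrete sum, eq. (4.3.12), for a handful of made-up point charges (SI units):

import numpy as np

eps0 = 8.854e-12

def assembly_energy(q, r):
    # Eq. (4.3.12): sum over all pairs, double counted and halved
    U = 0.0
    for i in range(len(q)):
        for j in range(len(q)):
            if i != j:
                U += q[i] * q[j] / np.linalg.norm(r[i] - r[j])
    return U / (2.0 * 4.0 * np.pi * eps0)

q = np.array([1e-9, -1e-9, 1e-9])                                  # coulombs
r = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])  # meters
print(assembly_energy(q, r))                                       # joules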

4.3.5 Example: Electric field of a Uniformly Charged Sphere

Consider a sphere of radius R and a total charge Q uniformly distributed throughout its volume.
Since the problem has spherical symmetry, we can use Gauss’s law to calculate the electric field
inside and outside the sphere.
First, let us choose an arbitrary surface of integration on which to apply Gauss's law. We will
choose a sphere of radius s, concentric with the first sphere.
Consider first the case where s < R, in order to find the field within the sphere. In order to
apply Gauss’s law, we need to calculate how much charge is contained within this sphere. The
charge density of the body is:
ρ = Q/((4/3)πR³) = 3Q/(4πR³) (4.3.18)

Therefore, the charge inside the surface of integration is Qs = (s³/R³)Q. Gauss's law then says that:

∮_sphere E · da = Qs/ε₀ (4.3.19)
Since E is everywhere parallel to da (perpendicular to the surface) and constant in magnitude
over the surface, we can easily carry out the dot product on the LHS and pull |E| out of the
integral. The remaining integral is just the surface area of the sphere!

|E|(4πs²) = Qs/ε₀ (4.3.20)
Plugging in for Qs , we have:
|E| = (1/4πε₀) Qs/R³ (4.3.21)
4.4
For more discussion, see Zangwill pg. 78.

Now, if s > R, Qs = Q, so our task is much simpler. Following the same procedure, we see that:

|E| = (1/4πε₀) Q/s² (4.3.22)

So, our final result is:

E_{s<R} = (1/4πε₀) (Qs/R³) ŝ (4.3.23)

E_{s>R} = (1/4πε₀) (Q/s²) ŝ (4.3.24)
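
A piecewise implementation of eqs. (4.3.23)-(4.3.24) (my own helper, convenient for plotting the field profile):

import numpy as np

eps0 = 8.854e-12

def E_uniform_sphere(s, Q, R):
    # Field magnitude: linear in s inside the sphere, 1/s^2 outside
    s = np.asarray(s, dtype=float)
    return np.where(s < R, Q * s / R**3, Q / s**2) / (4.0 * np.pi * eps0)

s = np.linspace(0.01, 3.0, 7)
print(E_uniform_sphere(s, Q=1e-9, R=1.0))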

4.4 The Scalar Potential


4.4.1 Definition of the Scalar Potential

The Helmholtz decomposition theorem tells us that any vector field can be broken into a curl-free
and a divergence-free portion: F = −∇φ + ∇ × A. Since Maxwell's equations tell us that
∇ × E = 0, we can make an even stronger statement: the electric field can be represented
entirely by the curl-free portion of the decomposition above:

E(r) = −∇φ(r) (4.4.1)

The Helmholtz theorem states that, as long as E falls off faster than 1/r as r → ∞, φ can be
written:

φ(r) = (1/4π) ∫_allspace ∇′ · E(r′)/|r − r′| dV′ (4.4.2)
However, from another of Maxwell's equations we know that ∇ · E = ρ(r)/ε₀, so we can rewrite
this equation as:

φ(r) = (1/4πε₀) ∫_allspace ρ(r′)/|r − r′| dV′ (4.4.3)

It is important to note that the physically important field E (which produces forces, etc.)
depends only on the gradient of φ. Therefore, since ∇(φ + c) = ∇φ, the scalar potential is
arbitrary up to an additive constant. This is known as gauge freedom. Although we will often
choose c = 0 for simple calculations, there are several other gauge choices that will end up being
convenient for more complicated problems.

4.4.2 Example: Potential of a Uniformly Charged Sphere

In the previous example, we found the electric field inside and outside a uniformly charged
sphere. We will now integrate these fields to find the appropriate potentials.
Outside the sphere, the electric field is simply that of a point particle, so the potential must
also be that of a point particle. However, if this does not satisfy you, you can easily carry out
the integration:
V(s) = −(1/4πε₀) ∫_∞^s (Q/s′²) ŝ · ds′ = (1/4πε₀) Q/s (4.4.4)

Inside the sphere, we will need to split the integral into two parts:

V(s) = −( ∫_∞^R E_{s>R} · ds′ + ∫_R^s E_{s<R} · ds′ ) (4.4.5)

Plugging in our electric field from the previous example, we have:

V(s) = −(Q/4πε₀) [ ∫_∞^R (1/s′²) ds′ + (1/R³) ∫_R^s s′ ds′ ] (4.4.6)

Evaluating these integrals leaves:

V(s) = −(Q/4πε₀) [ −1/R + (1/2R³)(s² − R²) ] (4.4.7)

This is then nicely rewritten by pulling out a factor of 1/2R:

V(s) = (1/4πε₀) (Q/2R) (3 − s²/R²) (4.4.8)

4.4.3 Example: Potential of a Uniformly Charged Spherical Shell (by direct integration)

Figure 4.4.1: Uniformly charged sphere variable choice (Source: Griffiths E&M pg. 85)

We wish to calculate the potential of the pictured uniformly charged spherical shell (radius R,
charge per unit area σ) by direct integration^{4.5}. Note that, by symmetry, we may choose the
point at which we measure the field to be along the z-axis.
Now we must find an expression for 𝓇 in terms of r, r′, and θ. Luckily the law of cosines
provides just such an expression:

𝓇² = r² + r′² − 2rr′ cos θ (4.4.9)

For a spherical shell, r′ is constant, so let's set r′ = R and r = s. The integral we must solve is
then:

4.5 This problem and solution are directly from Griffiths, pg. 85.

V(s) = (1/4πε₀) ∮_surface (σ/𝓇) dA′ = (σ/4πε₀) 2π ∫_0^π R² sin θ dθ / √(s² + R² − 2sR cos θ) (4.4.10)

Making the u-substitution u = s² + R² − 2sR cos θ turns the integral into:

V(s) = (1/4πε₀) (πRσ/s) ∫ u^{−1/2} du (4.4.11)

Which evaluates to:

V(s) = (1/4πε₀) (2πRσ/s) [ √((R + s)²) − √((R − s)²) ] (4.4.12)

Or, alternately, utilizing the fact that |x| = √(x²):

V(s) = (1/4πε₀) (2πRσ/s) ( |R + s| − |R − s| ) (4.4.13)

Now inside the shell s < R, while outside the shell s > R. This expression then simplifies to
the final answers:

V_{s>R, outside}(s) = R²σ/(ε₀ s) (4.4.14)

V_{s<R, inside}(s) = Rσ/ε₀ (4.4.15)

4.5 Capacitance
4.5.1 Example: Capacitance of a Parallel Plate Capacitor

Imagine a set of parallel plates, each with area A, set a distance d apart. Suppose some charge is
moved from one plate to the other by way of a connecting wire, so that the plates have charges
of Q and −Q respectively. What is the capacitance of this system?
Capacitance in general is defined as:

C = Q/V (4.5.1)

Since we have imagined that we know the charge on the plates, what we need to do is calculate
the voltage between the plates. We can do this by integrating the electric field, which we know
must be E = σ/ε₀ = Q/(Aε₀):

V = ∫_0^d E dx = Qd/(Aε₀) (4.5.2)
Then, plugging in to the first equation:

C = Aε₀/d (4.5.3)

4.5.2 Energy Stored in a Charged Capacitor

In order to charge a capacitor, we can imagine moving charge from one plate to the other, one
electron at a time. Each subsequent electron that we move must fight against the potential of
the electrons that came before it in order to be placed on the plate. The total energy is then
equal to the total amount of work done during this process:
W = ∫_0^Q V(q) dq (4.5.4)

Where V(q) is the potential on the plate as a function of the charge currently on the plate.
However, since we also know that for a capacitor C = Q/V → V = q/C, we have:

W = ∫_0^Q (q/C) dq (4.5.5)

or, evaluating the integral:

W = (1/2) Q²/C (4.5.6)
This can be rewritten in terms of V, again using C = Q/V:

W = (1/2) QV = (1/2) CV² (4.5.7)

4.6 Potential Theory (Poisson and Laplace Equations)


Poisson's equation states that:

∇²φ(r) = −ρ_free(r)/ε₀ (4.6.1)

In the case of a region containing zero free charge (for example a cavity or capacitor), this
reduces to Laplace's equation:

∇²φ(r) = 0 (4.6.2)

General solutions to this equation can be found by separation of variables in each coordinate
system. The particular solution is then identified by applying two boundary conditions.
First, we require that the potential be continuous across boundaries. If this were not the
case, the electric field would be infinite at the discontinuity, which is unphysical. For example, if
φ1 (r) and φ2 (r) are potentials in two regions that share a boundary rs , then:

φ1 (rs ) = φ2 (rs ) (4.6.3)

The second condition comes from the requirement that:

ε₁E₁ · n̂ − ε₂E₂ · n̂ = σf (4.6.4)

Which provides the following condition on the potential^{4.6}:

ε₁ ∂φ₁/∂n|_{rs} − ε₂ ∂φ₂/∂n|_{rs} = σf (4.6.5)

4.6.1 Cartesian Solution

The solution to Laplace's equation in Cartesian symmetry is:

Xα(x) = A₀ + B₀x (α = 0);   Aα e^{αx} + Bα e^{−αx} (α ≠ 0)    (4.6.6)

Yβ(y) = C₀ + D₀y (β = 0);   Cβ e^{βy} + Dβ e^{−βy} (β ≠ 0)    (4.6.7)

Zγ(z) = E₀ + F₀z (γ = 0);   Eγ e^{γz} + Fγ e^{−γz} (γ ≠ 0)    (4.6.8)

Where α², β², and γ² are the separation constants, and we require that:

α² + β² + γ² = 0 (4.6.9)

Which means that at least ONE of the three constants must be imaginary (unless all three are
zero).

4.6.2 Cylindrical Solution

Under cylindrical symmetry, Laplace's equation separates into:

d²G/dφ² + α²G = 0 (4.6.10)

d²Z/dz² + k²Z = 0 (4.6.11)

ρ (d/dρ)(ρ dR/dρ) + (k²ρ² − α²)R = 0 (4.6.12)
The first two of these equations yield exponential/linear solutions:

Gα(φ) = C₀ + D₀φ (α = 0);   Cα e^{iαφ} + Dα e^{−iαφ} (α ≠ 0)    (4.6.13)

4.6
Note that both derivatives here are written with respect to the surface normal of ONE side of the surface.
Some books use both surface normals and drop the minus sign.


Zk(z) = E₀ + F₀z (k = 0);   Ek e^{kz} + Fk e^{−kz} (k ≠ 0)    (4.6.14)

Notice that, since α and k can be either real or imaginary as required by boundary conditions,
these are essentially the same solutions as to the linear equation. The i’s in the first equation
are included because, in general, one of these two directions will be oscillatory.
The remaining equation, eq. 4.6.12, is the Bessel equation. However, it only
produces Bessel functions as solutions when k ≠ 0:

R_α^k(ρ) = A_α^k Jα(kρ) + B_α^k Nα(kρ) (k² > 0);   A_α^k Iα(kρ) + B_α^k Kα(kρ) (k² < 0)    (4.6.15)

Where Iα and Kα are the modified Bessel functions (which will rarely appear). In words, the
important thing to remember is that if k ≠ 0, the solution is a linear combination of
Bessel functions of kρ. In the case where k = 0, however, the solutions are:

R_α^0(ρ) = A₀ + B₀ log ρ (k = 0, α = 0);   A_α^0 ρ^α + B_α^0 ρ^{−α} (k = 0, α ≠ 0)    (4.6.16)

In theory, we would perform separation of variables to find the values and signs of α and k for
each problem. However, the best way to approach this type of problem is to apply the boundary
conditions on φ and z first. Seeing which functions will fit those conditions will tell you the
values of α and k, allowing you to pick the appropriate radial solution.
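
Not from the notes, but a quick scipy look at the k ≠ 0 radial solutions makes the usual selection rule obvious: Jα is regular at ρ = 0, while Nα (scipy's yv) diverges there, which is why Nα is discarded in regions containing the axis.

import numpy as np
from scipy.special import jv, yv

rho = np.linspace(0.01, 10.0, 200)
k, alpha = 1.0, 1.0
J = jv(alpha, k * rho)    # regular at the origin
N = yv(alpha, k * rho)    # singular at the origin
print(J[0], N[0])         # ~0.005 versus ~-63.7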

4.6.3 A Note on Scale Independence

A clever trick greatly simplifies potential problems that are scale independent in one or more
directions. If a problem looks identical if you were to scale the z axis by a factor, then the
solution cannot depend on z, meaning that Z(z) = constant.

4.7 The Multipole Expansion (Monopoles and Dipoles)


4.7.1 The Multipole Expansion

Calculating potentials of arbitrary charge distributions can be extremely messy. However, the
farther away from the charge distribution we go, the less the specific details of the charge
distribution matter! The multipole expansion will allow us to approximate the field far away
from the source.
For an arbitrary charge distribution ρ(r 0 ), the potential is:

V(r) = (1/4πε₀) ∫ ρ(r′)/𝓇 dV′ (4.7.1)

From the law of cosines (and figure 4.1.1) we see that

𝓇² = r² + r′² − 2rr′ cos θ′ (4.7.2)

Or, letting ε = (r′/r)(r′/r − 2 cos θ′)^{4.7}:

1/𝓇 = (1/r)(1 + ε)^{−1/2} (4.7.3)
The binomial theorem tells us that we can expand the (1 + ε)^{−1/2} term into an infinite series, and
that since ε ≪ 1, the series will be well approximated by its first several terms. Therefore we
will write:

1/𝓇 = (1/r)(1 − ε/2 + 3ε²/8 − 5ε³/16 + ...) (4.7.4)
In effect, we are now done: these three terms provide an approximation of the potential, and are
known as the monopole, dipole, and quadrupole terms respectively. Of course the point at
which we truncate the series is arbitrary, depending on how small ε is, and how precisely we
want to specify V(r). We could use only the first two terms, or add a fourth (which would be the
octopole term).

It is worth remembering that, for the nth term by itself, V ∼ 1/rⁿ. Therefore the monopole
term falls off as 1/r, the dipole as 1/r², etc.
This expression can be somewhat simplified by plugging back in for ε and collecting like
powers of r′/r. It then turns out that the coefficients of these like terms are just the Legendre
polynomials! The potential can then be (exactly) written as:

V(r) = (1/4πε₀) Σ_{n=0}^∞ (1/r^{n+1}) ∫ (r′)ⁿ P_n(cos θ′) ρ(r′) dV′ (4.7.5)

Or, explicitly writing the first several terms:

V(r) = (1/4πε₀) [ (1/r) ∫ ρ(r′) dV′ + (1/r²) ∫ r′ cos θ′ ρ(r′) dV′ + (1/r³) ∫ (r′)² ((3/2) cos²θ′ − 1/2) ρ(r′) dV′ + ... ] (4.7.6)

4.7.2 The Monopole and Dipole Potentials

We will now consider the first several terms individually. The monopole term is simply the
potential as if all of the charge of the distribution were located at a point at its center:

V_mono(r) = (1/4πε₀) Q/r (4.7.7)
The dipole term is slightly more complicated. We have:

V_dip(r) = (1/4πε₀)(1/r²) ∫ r′ cos θ′ ρ(r′) dV′ (4.7.8)
4.7 There is a prime on the θ because varying this angle corresponds to selecting different parts of the volume
of the charge distribution, which is the primed r-variable.

By noting that r̂ · r′ = r′ cos θ′, we can rewrite this as:

V_dip(r) = (1/4πε₀) (r̂/r²) · ∫ r′ ρ(r′) dV′ (4.7.9)

The integral in this expression is independent of r, and is therefore often packaged inside a new
vector, known as the dipole moment:

p = ∫ r′ ρ(r′) dV′ (4.7.10)

The dipole potential can now be rewritten simply as:

V_dip(r) = (1/4πε₀) (p · r̂)/r² (4.7.11)

From this definition of p we also see that for a discrete distribution of n charges qᵢ:

p = Σ_{i=1}^n qᵢ r′ᵢ (4.7.12)

It is clear from this last expression that, in general, the dipole moment is dependent on the choice
of origin. It turns out^{4.8} that the dipole moment is independent of the choice of origin IF the
total charge Q = Σ_{i=1}^n qᵢ = 0!
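
A sketch of this origin-independence (my own toy distribution): a physical dipole has Q = 0, so shifting the origin leaves p alone.

import numpy as np

q = np.array([1.0, -1.0])
r = np.array([[0.0, 0.0, 0.5], [0.0, 0.0, -0.5]])   # a physical dipole

Q = q.sum()                                         # monopole moment: 0
p = (q[:, None] * r).sum(axis=0)                    # eq. (4.7.12): (0, 0, 1)

shift = np.array([3.0, -2.0, 7.0])                  # move the origin anywhere
p_shifted = (q[:, None] * (r + shift)).sum(axis=0)
assert np.allclose(p, p_shifted)                    # unchanged, since Q = 0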

4.7.3 The Spherical Multipole Expansion


In some situations (in fact, most situations), it is more convenient to expand 1/𝓇 directly in
spherical coordinates. In terms of the Legendre polynomials, we can expand^{4.9}:

1/𝓇 = (1/r) Σ_{l=0}^∞ (r′/r)^l P_l(r̂ · r̂′), for r′ < r (4.7.13)

Notice that r̂ · r̂′ = cos γ = cos θ cos θ′ + sin θ sin θ′ cos(φ − φ′). The Addition Theorem
for Spherical Harmonics says that:

P_l(cos γ) = (4π/(2l + 1)) Σ_{m=−l}^{l} Y_l^{m*}(θ′, φ′) Y_l^m(θ, φ) (4.7.14)

So, making this substitution, we can now write the multipole expansion directly as:

1/𝓇 = (1/r) Σ_{l=0}^∞ Σ_{m=−l}^{l} (4π/(2l + 1)) (r′/r)^l Y_l^{m*}(θ′, φ′) Y_l^m(θ, φ), for r′ < r (4.7.15)

Using this result, the spherical multipole potential can be written as:

Φ(r, θ, φ) = (1/4πε₀) Σ_{l=0}^∞ Σ_{m=−l}^{l} [ A_l^m Y_l^m(θ, φ)/r^{l+1} + B_l^m r^l Y_l^m(θ, φ) ] (4.7.16)

4.8
Griffiths pg. 151-152
4.9
See Zangwill pg. 107. If you’re optimistic, try looking in Jackson.

Where the first term is the exterior multipole expansion, and the second term is the interior
multipole expansion. The constants are the multipole moments:

A_l^m = (4π/(2l + 1)) ∫ d³r′ ρ(r′) r′^l Y_l^{m*}(θ′, φ′),   B_l^m = (4π/(2l + 1)) ∫ d³r′ (ρ(r′)/r′^{l+1}) Y_l^{m*}(θ′, φ′) (4.7.17)

If the problem has azimuthal symmetry (like all good problems), these equations reduce to:

Φ(r, θ) = (1/4πε₀) Σ_{l=0}^∞ [ A_l P_l(cos θ)/r^{l+1} + B_l r^l P_l(cos θ) ] (4.7.18)

Where again the first and second terms are the exterior and interior expansions respectively.
The constants are now:

A_l = ∫ d³r′ ρ(r′) r′^l P_l(cos θ′),   B_l = ∫ d³r′ (ρ(r′)/r′^{l+1}) P_l(cos θ′) (4.7.19)

4.7.4 The Electric Field of a Dipole

Now that we have the potential of a dipole, finding the electric field is simply a matter of taking
the gradient (albeit in spherical coordinates):

E_dip(r, θ) = (1/4πε₀) (p/r³) (2 cos θ r̂ + sin θ θ̂) (4.7.20)

A few qualitative observations of this field are worth noting (and very common pGRE material):

• The field falls off as 1/r³

• When θ = 0, the field points in the r̂ direction (along p).

4.7.5 Torques and Forces on Physical Dipoles

Figure 4.7.1: Torque on a physical dipole (Source: Griffiths "Introduction to Electrodynamics", pg. 164)

A physical dipole consists of two opposite charges on either end of a (mass-less, charge-less,
insulating) stick. When placed in an electric field, the dipole will rotate and/or translate,
depending on the specifics of the field.

81
Consider first the torque on a dipole (pictured in fig. 4.7.1). The torque can be directly
written as:

N = (d/2) × qE + (−d/2) × (−qE) = qd × E (4.7.21)
or

N =p×E (4.7.22)

If E is uniform, the net force on the dipole is clearly zero. However, if E is NOT uniform, then

F = F₊ + F₋ = q(E₊ − E₋) = qΔE (4.7.23)

Where ΔE = E₊ − E₋. Assuming that the dipole is very small, we can approximate ΔE = (d · ∇)E,
which then gives us the usual expression for the force on a dipole:

F = (p · ∇)E (4.7.24)

Given these equations, we can now calculate the potential energy of a pure dipole p in an electric
field E. Imagine bringing the dipole in from infinity to the origin. If we take a path perpendicular
to ∇E, we will do no work during this stage. Now, we rotate the dipole from its current position at
the origin (which makes a right angle with E, since we moved in along the gradient) to
an angle θ. The work required to do this is

∫_{π/2}^θ pE sin θ′ dθ′ = −pE cos θ = −p · E (4.7.25)

So, therefore:

U_dip = −p · E (4.7.26)
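
A sketch of eqs. (4.7.22) and (4.7.26) for a made-up dipole in a made-up uniform field:

import numpy as np

p = np.array([0.0, 0.0, 1e-12])     # dipole moment, C*m
E = np.array([1e3, 0.0, 1e3])       # uniform field, V/m

N = np.cross(p, E)                  # torque, eq. (4.7.22)
U = -np.dot(p, E)                   # energy, eq. (4.7.26)
print(N, U)                         # U is lowest when p is aligned with E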

4.8 Polarizability and Dielectrics


4.8.1 Deriving the Bound Surface and Volume Charges

For a dipole moment per unit volume P (r 0 ), we have the following potential:

V_dip(r) = (1/4πε₀) ∫_V P(r′) · 𝓇̂ / 𝓇² dV′ (4.8.1)


 
Noting that ∇′(1/𝓇) = 𝓇̂/𝓇² (derived by explicit differentiation in Griffiths), we can rewrite this
expression as:

V_dip(r) = (1/4πε₀) ∫_V P(r′) · ∇′(1/𝓇) dV′ (4.8.2)
Now, integrating by parts (and using the product rule on the first term):

V_dip(r) = (1/4πε₀) [ ∫_V ∇′ · (P(r′)/𝓇) dV′ − ∫_V (1/𝓇)(∇′ · P(r′)) dV′ ] (4.8.3)

Applying the divergence theorem to the first term yields:

V_dip(r) = (1/4πε₀) [ ∮_S (P(r′)/𝓇) · da′ − ∫_V (1/𝓇)(∇′ · P(r′)) dV′ ] (4.8.4)
We will now define the surface and bound charges in order to simplify this expression. Note that
the first term looks like the potential of a surface charge distribution. We will therefore define:

σb = P (r 0 ) · n̂ (4.8.5)

Where n̂ is the surface normal, parallel to da0 . The second term, on the other hand, looks like
the potential of a strange volume charge distribution, so we will define:

ρb = −∇ · P (r 0 ) (4.8.6)

We can use these “bound charges" the same way as any charges, calculating electric fields etc.
However, they do allow us to easily rewrite the potential of our object:

1 σb ρb
I Z 
Vdip (r) = da0 − dV 0 (4.8.7)
4π0 S r V r
4.8.2 Polarizability

When a system of charges that are free to move (such as particles in an atom or charges on
the surface of a conductor) is subjected to an electric field, the positions of the charges will
rearrange to try to counteract the external field. For example, in an atom, the electron cloud will
be pushed towards one side. The strength of this response (and therefore the electric field it
creates) is called the polarizability of the system.
When an object is polarized in 1D, it acquires an electric dipole moment:

p = αE (4.8.8)

Where E is the incident electric field, and α is the polarizability of the system (a constant).
For general 3D systems, the polarizability α is replaced by a corresponding constant tensor:

pᵢ = αᵢⱼ Eⱼ (4.8.9)

This tensor can be diagonalized to find the principal axes of the system's polarizability, much
like the similar process for an inertia tensor.
We will generally be more interested in the polarization density in a material, defined as:

P = dp/dV (4.8.10)

I have no idea why the usual convention about lower-case letters and density was reversed here,
but it unfortunately was, so watch out!

4.8.3 The “Fake" Field

There is a useful calculational trick, derived in Zangwill, pg. 162, for finding the electric field
produced by the polarization of an object. The "fake field" E(r) is the field the object would
produce if, instead of its actual polarization, it had a uniform charge density ρ = 1. This field
can be easily calculated using the normal tools of electrostatics. Once E(r) has been found,
the electric field due to the polarization of the object is:

E_P(r) = −(P · ∇)E(r) (4.8.11)

4.9 Fields in Matter


4.9.1 D and H

When working in regions where materials are electrically or magnetically polarized (dielectrics,
magnetizable materials, etc.) it is possible to define modified versions of the E and B fields
that are created only by free charges and currents:

D = ε₀E + P (4.9.1)

H = B/µ₀ − M (4.9.2)

If the matter's polarization response is linear and isotropic^{4.10}, these equations can be further
simplified. In the most general 3D anisotropic case^{4.11}:

Pᵢ = ε₀ Σⱼ χᵢⱼ Eⱼ (4.9.3)

Where χᵢⱼ is the electric susceptibility tensor. If the material is isotropic, then this tensor
becomes a scalar:

P = ε₀ χ_E E (4.9.4)

If the system is linear, then we assume that P ∝ E, so that χ_E is constant. Then:

D = ε₀E + ε₀χ_E E (4.9.5)

So, defining ε such that:

χ_E = ε/ε₀ − 1 (4.9.6)

We arrive at:

D = εE (4.9.7)
4.10 Linear.
4.11 Notice that the ε₀ that appears here will not be replaced by an analogous µ₀ in the version for M. This
asymmetry is produced by our desire for the form of χ_E and χ_M to be the same.

Similarly, if we define:

χ_M = µ/µ₀ − 1 (4.9.8)

Then:

B = µH (4.9.9)

Note that µ is seemingly on the wrong side of the second equation. This is the result of a
historical misunderstanding in which H was considered the physical field (rather than B) for a
time. Unfortunately, the convention stuck before the mistake was uncovered.

4.10 Boundary Conditions


We now have all the tools available to state the most general boundary conditions on E and B
fields. For memorization sake, it is easiest to remember the boundary conditions on D and H:

(D1 − D2 ) · n̂ = σf
(D1 − D2 ) × n̂ = (P1 − P2 ) × n̂
(H1 − H2 ) · n̂ = −(M1 − M2 ) · n̂
(H1 − H2 ) × n̂ = Kf

Note which side the n̂'s are on in the cross products, as switching that is an easy way to make
a sign error. To recover the E and B conditions, σf → σ, Kf → K, M = P = 0, and the
relevant ε₀'s and µ₀'s appear^{4.12}:

(E₁ − E₂) · n̂ = σ/ε₀
(E₁ − E₂) × n̂ = 0
(B₁ − B₂) · n̂ = 0
(B₁ − B₂) × n̂ = µ₀K

Integrating these conditions then produces boundary conditions on the scalar potentials
φ = −∫ E · dl and ψ = −∫ H · dl.

4.11 The Method of Images


The method of images relies on the fact that, for a given charge distribution, there is only one
possible potential (that is, the solution to Laplace’s equation is unique). Given this, if a solution
can be found that matches the given boundary conditions, it must be correct, regardless of how
it was found.
We can exploit this by replacing a difficult problem with a simpler one that shares the
same boundary conditions, usually by adding additional charges that lie outside the region of
interest. For example, the potential of a point charge over a conducting plane can be found by
replacing the plane with a second "mirror" point charge below the plane. The resulting solution
4.12 ε₀, not ε, since if we are using E we're not in a dielectric!

is obviously only valid above the plane.
Deciding where to place these image charges is more of an art than a science. However, the
general guideline is to start by trying to get the correct potential on each surface.

4.11.1 Some Useful Image Charges

There are several image charge distributions that occur commonly enough to be worth memorizing:

• Point charge above a conducting plane. The appropriate image charge is an equal
and opposite point charge placed an equal distance below the plane.

• Point charge a distance s from a conducting sphere.

Figure 4.11.1: Image charge for a conducting sphere (Source: Zangwill E&M: pg. 246)

The appropriate image charge is placed a distance b = R²/s from the center of the sphere,
with a charge q′ = −(R/s)q.

If the conducting sphere is held at a potential other than zero, add an additional image
charge at the center of the sphere to mimic this potential.

In particular, if the sphere holds a charge Q, the image charge at its center should be
q″ = Q + qR/s (taking into account the other image charge). A quick numerical check of the
grounded-plane case appears in the sketch below.
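
A quick numerical sketch of the first (grounded plane) case: a charge q at height d above the plane z = 0 plus its image −q at −d should give zero potential everywhere on the plane.

import numpy as np

eps0 = 8.854e-12
q, d = 1e-9, 0.5

def V(x, y, z):
    r_plus = np.sqrt(x**2 + y**2 + (z - d)**2)    # distance to real charge
    r_minus = np.sqrt(x**2 + y**2 + (z + d)**2)   # distance to image charge
    return (q / r_plus - q / r_minus) / (4.0 * np.pi * eps0)

x, y = np.meshgrid(np.linspace(-2, 2, 50), np.linspace(-2, 2, 50))
assert np.allclose(V(x, y, 0.0), 0.0)             # boundary condition satisfied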

4.12 Green Functions


The Green function for a problem is a mathematical object that allows us to easily find the
potential for a wide range of possible situations. The goal is to find a function G(r, r′) for the
problem such that

∇²G(r, r′) = −(1/ε₀) δ(r − r′) (4.12.1)

If this is true, then Green's Theorem:

∫_V (f∇²g − g∇²f) d³r = ∫_S n̂ · (f∇g − g∇f) dS (4.12.2)

Reduces, with the choices f = φ and g = G (remembering that ∇²φ = −ρ(r)/ε₀), to:

φ(r) = ∫_V ρ(r′)G(r, r′) d³r′ − ε₀ ∫_S φ(r′) ∂G(r, r′)/∂n′ dS′ + ε₀ ∫_S G(r, r′) ∂φ(r′)/∂n′ dS′ (4.12.3)

In principle, if we can find G(r, r 0 ) for a given geometry, this equation now gives us the potential
of the charge distribution ρ(r).
In practice, we usually deal with two main types of boundary conditions on G(r, r′):
Dirichlet, where G(r_S, r′) = 0 (along the surface), or Neumann, where ∂G(r, r′)/∂n′ = −1/(ε₀A).

In the first (and most common) of these cases, the Dirichlet boundary condition causes the
last integral to be zero, simplifying the expression above to:

φ(r) = ∫_V ρ(r′)G(r, r′) d³r′ − ε₀ ∫_S φ(r′) ∂G(r, r′)/∂n′ dS′ (4.12.4)

While for the Neumann condition, the second integral becomes ⟨φ⟩_S:

φ(r) = ⟨φ⟩_S + ∫_V ρ(r′)G(r, r′) d³r′ + ε₀ ∫_S G(r, r′) ∂φ(r′)/∂n′ dS′ (4.12.5)

4.12.1 Physical Interpretation

The Green function for a problem is the potential created by a single point charge (the
delta-function source described in equation 4.12.1). This potential can be conceptually split into two
parts: G₀ (the field of the point charge itself) and F, the field of all the other charges in the system
that arise in response to the point charge. As such:

G(r, r′) = G₀ + F (4.12.6)

There is a very important connection to the method of images here. For example, take
the classic image charge problem of a point charge near a conducting plane. The potential of
the point charge is G0 , while the potential of the image charge placed below the plane is F ! The
result is that the Green function for this problem can be determined directly using the method
of images, without any complicated mathematics.

4.12.2 Specific Green Functions

There are many mathematical methods for coaxing Green functions out of problems, many of
which are well covered in Zangwill. However, there are a few well-known Green functions that
are worth knowing:

• Point charge near a grounded conducting sphere of radius R:

G(r, r′) = 1/|r − r′| − 1/|(r′/R)r − (R/r′)r′| (4.12.7)

We can derive this equation by knowing the image charge for a conducting sphere discussed
earlier. If a point charge q is brought a distance s (on axis) from a conducting sphere of
radius R, the overall potential in space (from BOTH the real charge and the image charge)
is:

φ(r) = q/(r − s) − (qR/s)/(r − R²/s) (4.12.8)
If we let q = 1 and rearrange the second term, we get:

φ(r) = 1/(r − s) − R/(sr − R²) (4.12.9)

Which is Jackson eq. 2.16. This can then be simply rearranged into the on-axis version of
the Green function again (where r′ = s):

φ(r) = 1/(r − s) − 1/((s/R)r − (R/s)s) (4.12.10)

4.13 Magnetic Multipole Expansion


4.13.1 Magnetic Dipoles

Physically, magnetic dipoles are loops of current. The magnetic dipole moment (usually
symbolized by m or µ) is defined from the vector potential multipole expansion as:

m = (1/2) ∫_V d³r r × j(r) (4.13.1)

Where m is oriented normal to the plane of the current loop. For a loop of area A carrying a
current I, this simplifies enormously to:

m = IA (4.13.2)

An effective dipole can also be generated by a charged particle moving in circles. In this case, a
quick classical calculation^{4.13} lets us write the effective dipole moment as

m_L = (q/2m) L (4.13.3)

Quantum mechanics introduces a couple of correction factors to this equation, which produce the
following equation for a spinning quantum mechanical particle:

m_S = (gµ_B/ℏ) S (4.13.4)

Where µ_B is the Bohr magneton, and g is the appropriate Landé g-factor. Sometimes these
factors are grouped together into one proportionality constant γ, called the gyromagnetic
ratio:

m = γJ (4.13.5)

4.13.2 Forces on a Magnetic Dipole

In general, the force on a magnetic dipole is

F = ∇(m · B) (4.13.6)

However, in the (common) case that m is spatially invariant, this simplifies to F = (m · ∇)B.
4.13
Zangwill pg. 340.

It is important to note that the force on a magnetic dipole is zero unless either the dipole
moment or (more likely) the field itself has some gradient. Generally, it is the gradient of a
magnetic field that exerts force on a dipole.
When placed in an external magnetic field, magnetic dipoles will experience a torque that
tends to align the magnetic dipole moment with the external field lines:

N =m×B (4.13.7)

The potential energy of a magnetic dipole in a field is

U = −m · B (4.13.8)

However, there is an important, subtle point here. With electric dipoles, since the charge
(and therefore internal energy) of the dipoles didn’t change, we could find the force on a dipole
by taking the gradient of U . However, for magnetic dipoles, the current in the dipole CAN be
changed by moving the dipole in the external magnetic field. Normally, we add an additional
requirement that the current remain fixed, but this modification means that energy may leave
or enter the system in this way!
It is possible4.14 to do a rigorous calculation to find the total energy of the dipole system,
including that of the currents. It comes out that the magnetic potential energy is in fact the
negative of the magnetic total energy.

4.13.3 Magnetic Field of a Magnetic Dipole

The magnetic vector potential of a magnetic dipole is^{4.15}:

A(r) = (µ₀/4π) (m × r)/r³,   r ≫ R (4.13.9)

Therefore,

B(r) = ∇ × A(r) = (µ₀/4π) [ m(∇ · r/r³) − (m · ∇)(r/r³) ] (4.13.10)
The first term reduces to a delta function, 4πδ(r), since it is zero for all non-zero r, but nonzero
if we integrate over a spherical volume and then use Gauss's law. The remaining term works out
nicely in index notation:

mᵢ ∂/∂rᵢ (rⱼ/r³) = mᵢ (δᵢⱼ/r³ − 3rᵢrⱼ/r⁵) = m/r³ − 3r(r · m)/r⁵ = [m − 3r̂(r̂ · m)]/r³ (4.13.11)

Therefore, the final field becomes

B(r) = (µ₀/4π) [3r̂(r̂ · m) − m]/r³ (4.13.12)
4.14
Zangwill pg. 387.
4.15
Getting here from the beginning of the multipole expansion takes a few identities and some work, so I’ve left
it out. You can find it worked out in Zangwill, pg. 337.

Notice that this field is essentially exactly the same as that of the electric dipole in form! This
means that, qualitatively, the two fields have the same shape. It also makes it a lot easier to
memorize.

4.14 Magnetostatics
Magnetostatics is defined by the requirement that ∇ · j = 0: no currents are created or destroyed.

4.14.1 The Lorentz Force

The force on a charged particle moving through a magnetic field is given by the Lorentz force:

F = qv × B (4.14.1)

For a current (which is, after all, just a continuous stream of charged particles), this generalizes
to the force of one current distribution on another:

F_{1→2} = ∫_{V₂} j₂(r) × B₁(r) d³r (4.14.2)

4.14.2 Ohms Law and Potential Theory with Steady Currents

Ohm's law is normally written (by engineers or anyone working in a lab) as:

V = IR (4.14.3)

However, when working with electric fields, it is usually easier to work with local quantities and
write r = 1/σ, where σ is the conductivity and r is the resistivity. The relationship between
resistivity and resistance for a wire is:

R = rL/a (4.14.4)

Where L is the length of the wire and a is its cross-sectional area.


A perfect insulator has a conductivity of zero. Rewriting Ohm's law using the conductivity
yields:

j = σE (4.14.5)

σ is a property of a given material. Therefore, Ohm's law allows us to find the current generated
in a given conductor by some external electric field.

Combined with the central assumptions of magnetostatics, Ohm's law allows us to apply
potential theory to situations with steady currents. For spatially-constant σ, ∇ · j = 0 and
j = σE imply that σ∇ · E = 0. In turn, ρ = 0 (by Gauss's law) and ∇²φ = 0 (since
E = −∇φ). This last condition means that we have the same Laplace equation for the potential
as in electrostatics, and can therefore apply all of our potential theory techniques.
This results in two boundary conditions that must be met by an electric potential in a region

with steady currents:

n̂₂ · (j₁ − j₂) = j₁,⊥ − j₂,⊥ = σ₁ ∂φ₁/∂n|_S − σ₂ ∂φ₂/∂n|_S = 0 (4.14.6)

n̂₂ × (E₁ − E₂) = E₁,∥ − E₂,∥ = φ₁|_S − φ₂|_S = 0 (4.14.7)

4.14.3 Resistors

Resistance is somewhat analogous to capacitance, in the sense that it is a ratio between the
potential across a system and some other quantity. The primary difference is that, in addition to
geometry, resistance also depends on the material properties of the system.

Much like capacitance, the easiest way to find the resistance of some system is to place an
arbitrary potential across it and calculate the current that arises. Once this is found, Ohm's law
can be used to calculate the resistance.

4.14.4 The Biot-Savart Law

In magnetostatics, it is true by assumption that ∇ · B = 0 and ∇ × B = µ₀j. Specifying both
the curl and divergence of B in this way makes it possible, through Helmholtz's theorem, to
specify a unique B for a given j. The formula that emerges is the Biot-Savart law:

B(r) = (µ₀/4π) ∫_V j(r′) × (r − r′)/|r − r′|³ d³r′ (4.14.8)

This general formula holds for any current distribution. If the current in question is restricted to
a 2D surface or a 1D wire, the equation can be simplified accordingly:

B(r) = (µ₀/4π) ∫_S K(r_S) × (r − r_S)/|r − r_S|³ dS (4.14.9)

B(r) = (µ₀I/4π) ∫_l dl × (r − l)/|r − l|³ (4.14.10)
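
As a sketch of eq. (4.14.10) in practice (my own test case), numerically integrate around a circular loop and compare with the standard on-axis result B_z = µ₀IR²/2(R² + z²)^{3/2}:

import numpy as np

mu0 = 4.0 * np.pi * 1e-7
I, R, z = 2.0, 0.3, 0.2                  # made-up current, radius, height

phi = np.linspace(0.0, 2.0 * np.pi, 2000, endpoint=False)
dphi = phi[1] - phi[0]
l = np.stack([R * np.cos(phi), R * np.sin(phi), np.zeros_like(phi)], axis=1)
dl = np.stack([-R * np.sin(phi), R * np.cos(phi), np.zeros_like(phi)], axis=1) * dphi

r = np.array([0.0, 0.0, z]) - l          # from each source element to the field point
B = mu0 * I / (4.0 * np.pi) * np.sum(
        np.cross(dl, r) / np.linalg.norm(r, axis=1)[:, None]**3, axis=0)

exact = mu0 * I * R**2 / (2.0 * (R**2 + z**2)**1.5)
assert np.isclose(B[2], exact)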

4.14.5 Ampere’s Law

Applying Stokes' theorem to ∇ × B = µ₀j yields:

∮_C dl · B = µ₀ ∫_S dS · j (4.14.11)

This law is the magnetostatic equivalent of Gauss's law. It says that the line integral of the
magnetic field along a closed curve C is equal to µ₀ times the total current that passes through
the surface bounded by C.
Applying Ampere’s law is similar to using Gauss’s law. We identify a suitable curve C based
on the symmetry of the problem: we want the curve to be everywhere parallel or perpendicular
to B. It is often useful to also exploit areas of zero magnetic field when choosing C.

91
4.14.6 The Vector Potential

Since ∇ · B = 0, we can write B in terms of a vector function A, defined by the relation:

B =∇×A (4.14.12)

Much like the electric potential, the magnetic vector potential is arbitrary up to the gradient of
a scalar function, since ∇ × ∇f = 0:

A0 = A + ∇f (4.14.13)

We can derive a general expression for A in terms of currents by starting with the other Maxwell
equation for B, ∇ × B = µ0 j:

∇ × (∇ × A) = µ0 j =⇒ ∇(∇ · A) − ∇2 A = µ0 j (4.14.14)

We can further simplify if we chose f to put us in the Coulomb gauge, where by definition
∇ · A = 0. Then the equation reduces to ∇2 A = −µ0 j. By analogy with Poisson’s equation for
0 µ0 j
the electrical potential of a point charge (where ρ → 4π ) gives us a general formula for the A
created by some system of currents:

µ0 j(r 0 )
Z
A(r) = d3 r0 (4.14.15)
4π V |r − r 0 |

4.14.7 Inductance

Picture two wire loops positioned next to one another. If a current begins in one loop, it will
create magnetic flux through the other, inducing a current there as well. How much current is
induced per a given current depends on the geometry of the situation, much like the voltage
across a capacitor for a given charge depends only on the geometry of the capacitor. This
motivates us to define a similar object to capacitance current-carrying conductors. This quantity
is the inductance.
Consider first the self inductance of a loop of wire with a current (which determines the
amount of back-current produced in the loop as you try to establish a magnetic field). This
inductance is given by

φ
L= (4.14.16)
I
Where φ is the flux through the loop and I is the current. More generally, self inductance for
non-loops (say, the self-inductance of a wire) is defined by:

1
Z
L= 2 d3 rj(r) · A(r) (4.14.17)
I V

When written this way, it becomes clear (by comparison with the magnetic energy equation)
that

1
UB = LI 2 (4.14.18)
2

92
Multiple conductors can share a mutual inductance, where each generates currents in the
others. The inductance of such a system is a matrix, Mik , where the self inductances described
above make up the diagonal: Mii . For a set of N conductors, we now have:

N
X φi
Mik = (4.14.19)
I
k=1 k

and

µ0 1 ji (r) · jk (r0 )
Z Z
Mik = d3 r d3 r0 (4.14.20)
4π Ii Ik |r − r 0 |
It is important to note that, by a kind of reciprocity relationship, this matrix must be symmetric:
the inductance loop 1 has with loop 2 must be the same as loop 2 with loop 1, since inductance
only depends on the geometry. In other words:

Mik = Mki (4.14.21)

4.14.8 Magnetic Energy

The total energy contained in a magnetic field is

1
Z
UB = |B|2 d3 r ≥ 0 (4.14.22)
2µ0 V

We can UB is positive definite by contradiction: if it were not, it would be energetically favorable


for currents to form spontaneously (since zero magnetic field would not be the ground state).
Sometimes we are only concerned with the interaction energy between two current-carrying
objects (i.e. the energy that would be released if they were taken to infinity). This energy can
be calculated as
Z Z
3
VB = j1 (r) · A2 (r)d r = j2 (r) · A1 (r)d3 r (4.14.23)

Notice the reciprocity relationship displayed here: the interaction energy can just as easily be
calculated using the potential of 1 and the current of 2 as the potential of 2 and the current of 1

4.14.9 Magnetic Scalar Potential

Since ∇ × B = 0 when j = 0 and ∇ · B = 0 always, we can draw an analogy between this system
and the set of equations that describe the electrical field in a region free of charge: ∇ × E = 0
and ∇ · E = 0. Therefore, within the current-free region we can define a magnetic scalar
potential that has the same form as the electrical scalar potential:

B = −∇ψ (4.14.24)

Therefore, since ∇ · B = 0

∇2 ψ = 0 (4.14.25)

93
This is the same Laplace equation we had for electrostatics, so we can apply all of the tools of
potential theory that we developed there! Bessel functions and all.
The boundary conditions on the scalar potential are very similar to those for electrostatics:
The Maxwell equations for B produce a different set of boundary conditions than we had
with electrostatics:

n̂ · (B1 − B2 ) surf ace


= (B1,⊥ − B2,⊥ ) surf ace
=0 (4.14.26)

n̂ × (B1 − B2 ) surf ace


= (B1,k − B2,k ) surf ace
= µ0 K(r)|surf ace (4.14.27)

4.14.10 Simple Magnetic Materials and the Auxiliary Field, H

When a magnetizable material is exposed to a magnetic field, its magnetic domains align to
produce a magnetic dipole moment per unit volume, M . This in turn combines to produce an
overall dipole moment for the object4.16 :
Z
m= M dV (4.14.28)

The degree to which magnetizable materials magnetize is characterized by their magnetic


permeability, µ. This plays an analogous role to  for electric permittivity. Sometimes µ is given
as a dimensionless ratio with the vacuum permittivity:

µ
κm = (4.14.29)
µ0
When dealing with magnetizable materials, it is often most convenient to work with the auxiliary
field, H, which is defined to be:

B
H= −M (4.14.30)
µ0
In general the response of a given material to an incident magnetic field can be very complicated.
However, we generally model materials as behaving linearly to an incident field. Such linear
materials are said to be paramagnetic if χm > 0, which re-enforce the incident B field, or
diamagnetic if χm < 0, which decreases the incident B field.
In such a linear material. for some incident auxiliary field H, the magnetization per unit
volume is given by:

M = χm H (4.14.31)

Where χm is yet another expression for µ:

µ
χm = κm − 1 = −1 (4.14.32)
µ0
4.16
Don’t ask me why big M is the dipole moment DENSITY while little m is the TOTAL dipole moment.

94
This allows us to write eq. 4.14.30 for such linear materials in an even simpler form:

B
H= (4.14.33)
µ

4.14.11 Bound and Free Currents

When taken together, the aligned magnetic dipoles of a magnetized material produce a magnetic
field identical to that of a system of volume and surface currents. By analogy with electric
polarizability, we will refer to these currents as bound currents, to differentiate them from the
free currents, which are the REAL currents in the material. The magnitude and direction of
these currents are determined by the following equations:

jM = ∇ × M (4.14.34)

σM = M · n̂ s
(4.14.35)

4.14.12 The Auxiliary Magnetic Scalar Potential and the “Fake" Magnetic Charge

Inside or around magnetizable materials, it is often more convenient to defined the magnetic
scalar potential as:

H = −∇ψM (4.14.36)

Where the “M" subscript differentiates this potential from the other magnetic scalar potential
defined earlier. This potential has its own set of general boundary conditions, which can be
found in Zangwill, pg. 417. However, when the materials involved are simple, linear magnetic
materials, the boundary conditions simplify to the following:

ψ1 s
= ψ2 s
(4.14.37)

dψ1 dψ2
µ1 s
= µ2 s
(4.14.38)
dn dn
As a purely mathematical trick, it is often convenient to calculate ψM in terms of a fictitious
“magnetic surface charge" and “magnetic volume charge":

ρ∗ = −∇ · M (4.14.39)

σ ∗ = M · n̂ s
(4.14.40)

Once these have been determined, the potential can be found using potential equations from
electrostatics. What used to be E is now H, so we can (for example) find H using Gauss’s law:

∇ · HM = ρ∗ (4.14.41)

95
It’s worth emphasizing again that this is just a math trick that we can use because the
equations turn out to be the same, and we are much happier dealing with electric fields. There’s
no such thing as a magnetic charge!!

4.15 Electrodynamics
4.15.1 Time Varying Scalar and Vector Potentials

When fields and sources vary in time, it is possible for charges to create magnetic fields. This in
turn intertwines the roles of the scalar and vector potentials.
Plugging B = ∇ × A into Maxwell’s ∇ × B equation yields:

∂A
 
0=∇× E+ (4.15.1)
∂t
Since, in general, ∇ × ∇f = 0 for some scalar f , the quantity in the parentheses can be written
as the gradient of a scalar function, which is the scalar potential we know and love:

∂A
E+ = −∇φ (4.15.2)
∂t
Rearranging, we now have an expression for the electric field represented by these time-varying
potentials:

∂A
E = −∇φ − (4.15.3)
∂t
Plugging this result, (and B = ∇ × A) into the remaining Maxwell equations yields a system of
two (difficult) coupled equations that theoretically contain all the physics of the dynamic system.

∂ −ρ
∇2 φ + (∇ · A) = (4.15.4)
∂t 0

1 ∂2A 1 ∂φ
 
∇2 A − 2 2
−∇ ∇·A+ 2 = −µ0 j (4.15.5)
c ∂t c ∂t

These equations are simplified substantially by choosing the correct gauge, as discussed in the
section on gauge freedom.

4.15.2 Conservation of Charge and its Consequences

The conservation of charge is normally stated mathematically as:


∇·j = (4.15.6)
dt
Where j is the current density, and ρ is the charge distribution. This relation can be easily
derived by starting with the more intuitive expression that the total change in charge in a region
is due to the sum of the currents leaving the region:

dQ
Z
= j · dA (4.15.7)
dt

96
ρd3 r and applying the divergence theorem in reverse to the right hand
R
Transforming Q → V
side:

d
Z Z
∇ · jdV = ρdV (4.15.8)
dt
Since these integrals can now be combined, we have


Z  
∇·j− dV = 0 (4.15.9)
dt
We then argue that the integrand itself must in general be zero:


∇·j = (4.15.10)
dt

4.15.3 The Displacement Current

Charge conservation states that:


∇·j = (4.15.11)
dt
However, differentiating Gauss’s law simply gives:

∂ρ ∂E
= 0 ∇ · (4.15.12)
∂t ∂t
Combining these two equations:

∂E
∇ · j = 0 ∇ · (4.15.13)
∂t
This equality suggests that some type of current is created by a changing electric field. This
current is called the displacement current for historical reasons. The displacement current is
thus defined as:

∂E
jD = 0 (4.15.14)
∂t

4.15.4 Polarization Current

Just as real moving charges constitute a current (and can create a magnetic field), moving bound
charges also create an effective current, called the Polarization Current. This can be found
by applying the conservation of bound charge4.17 :

∂ρb
+ ∇ · jb = 0 (4.15.15)
∂t
Since

ρb = −∇ · P (4.15.16)
4.17
I am using the subscript b to denote “bound". Zangwill uses p in the same expressions for “polarization".

97
We can rewrite this expression as

∂P
∇ · jb = ∇ · (4.15.17)
∂t
The most general solution to this equation is:

∂P
jb = +∇×Λ (4.15.18)
∂t
Where Λ is some vector function. It turns out4.18 that this function is actually an effective
magnetization caused by the polarization current. Therefore, the equation is normally written
as:

∂P
jb = + ∇ × Mb (4.15.19)
∂t

4.15.5 Faraday’s Law

Farady’s law for dynamic fields is:

∂B
∇×E =− (4.15.20)
∂t
Faraday’s law states that, when the magnetic flux (field × area) changes through a conducting
loop, a voltage will be induced in the loop:


V =− (4.15.21)
dt
Usually, the V in this equation is written as a script E for the Electromotive Force. However,
since this force is measured in volts and always takes the form of a voltage, I see no need for its
separate existence.

4.16 Slowly Varying Fields and Currents: Quasi-statics


We characterize a field or current as “slowly varying" if its characteristic length scale and
oscillation frequency obey the inequality4.19 .

ω 2 l2 << c2 (4.16.1)

This can be viewed as a requirement that changes in the system happen slowly compared to the
time for light signals to propagate through the system.
When making these sorts of estimates, it is useful to make the approximations

1 ∂
∇≈ , ≈ω (4.16.2)
l ∂t
4.18
See Zangwill pg. 459.
4.19
Zangwill handwaves his way to this expression on pg. 468.

98
4.16.1 Quasi-Electrostatics and Quasi-Magnetostatics
∂E
In the Quasi-Electrostatic regime, we assume that ∂t = 0. This knocks out a term in Maxwell’s
equations.
∂B
In Quasi-Magnetistatics we similarly make the assumption that ∂t = 0.

4.17 Gauge Freedom


As previously noted, there is an inherent ambiguity in our definitions of the scalar and vector
potentials:

∂Λ
φ0 = φ − , A0 = A + ∇Λ (4.17.1)
∂t
We are free to chose whatever arbitrary Λ that we like above. This choice is equivalent to specify
the value of: ∇ · A.
It is important to keep straight which gauge different equations are written in!4.20

4.17.1 The Coulomb Gauge

In the Coulomb gauge, we set:

∇·A=0 (4.17.2)

This is a natural choice, as terms of ∇ · A appear in both of the master equations for the time
varying potentials. With this choice, those equations simplify to:

−ρ
∇2 φ = (4.17.3)
0

1 ∂2A 1 ∂φ
∇2 A − 2 2
= −µ0 j + 2 ∇ (4.17.4)
c ∂t c ∂t
Once you have the potentials, the corresponding fields are given by:

∂A
B = ∇ × A, E = −∇φ − (4.17.5)
∂t

4.17.2 The Lorentz Gauge

In the Lorentz gauge we set:

1 ∂φ
∇·A=− (4.17.6)
c2 ∂t
4.20
Zangwill puts subscripts on EVERY A and φ to remind you of this. I think that’s cumbersome, so I’m leaving
them off. Goodness knows, when it comes to actually calculating anything on paper, you’ll be dropping them too.

99
This serves to decouple the master equations, yielding two somewhat complicated but symmetric
equations:

1 ∂2φ −ρ
∇2 φ − 2 2
= (4.17.7)
c ∂t 0

1 ∂2A
∇2 A − = −µ0 j (4.17.8)
c2 ∂t2
This form of the equations lends itself to both wave problems (i.e. radiation) and relativistic
calculations.
Once you have the potentials, the corresponding fields are given by:

∂A
B = ∇ × A, E=− (4.17.9)
∂t

4.18 Energy and Momentum in E&M Fields


4.18.1 Energy Flow and the Poynting Vector

The mechanical work done by a current against an electric field is F · v, or:

dWmech
Z
= (ρE + j × B) · vd3 r (4.18.1)
dt V

Since j k v, (j × B) · v = 0. Also, ρv = j. Thus, we can rewrite

dWmech
Z
= (j · E)d3 r (4.18.2)
dt V

1 ∂E
But, since one of Maxwell’s equations tells us that µ0 j = ∇ × B − c2 ∂t
, this can be rewritten
(by just substituting for j in j · E):

dWmech 1 ∂E
Z  
= ∇ × B − 0 · Ed3 r (4.18.3)
dt V µ0 ∂t
A vector identity, combined with Maxwell’s equation for ∇ × E shows that: ∇ · (E × B) =
−B ∂B
∂t − E(∇ × B). This allows the equation above to be rearranged as:

dWmech 1 1 ∂B ∂E 3
Z  
= ∇ · (E × B) + B · − 0 E · d r (4.18.4)
dt V µ0 µ0 ∂t ∂t
If you’re clever, you notice that:

1 ∂B ∂E 3 ∂ 1 ∂
Z Z
B· + 0 E · d r= 0 (E · E + c2 B · B)d3 r = UEM (4.18.5)
V µ0 ∂t ∂t ∂t V 2 ∂t

We can also rewrite one of the terms as a surface integral of a vector function called the Poynting
Vector:

1
S= E×B (4.18.6)
µ0

100
Applying the Stokes theorem to the E × B term, we can now rewrite the entire equation as:

d dUtot
Z
(Umech + UEM ) = =− S · dA (4.18.7)
dt dt S
This equation has a profound interpretation. Energy can obviously be transferred between
electromagnetic and mechanical energy, all on the left hand side of the equation. However, it is
also possible for the total energy to change, if the Poynting flux term is non-zero! This term
must represent energy being radiated from the system.
S can be interpreted as a current density of electromagnetic energy.

4.18.2 Electromagnetic Momentum

Electromagnetic fields can store and carry momentum. The momentum density contained in a
field is:

S
g= = 0 (E × B) (4.18.8)
c2
For a single particle in a field4.21 , the total momentum of the particle due to the field is p = qA.
This expression often pops up when writing the Hamiltonian of such a particle.
Fields can also store angular momentum. The angular momentum density is sensibly defined
by:

l=r×g (4.18.9)

4.19 Electromagnetic Waves


4.19.1 The Wave Equation

Maxwell’s equations in free space are:

∇ · E = 0, ∇·B =0 (4.19.1)

∂B 1 ∂E
∇×E =− , ∇×B = (4.19.2)
∂t c2 ∂t
By taking the curl of ∇ × E, we can write:

∂B ∂
∇ × (∇ × E) = ∇ × − = − (∇ × B) (4.19.3)
∂t ∂t
Substituting in the other Maxwell equation for ∇ × B:

1 ∂2E
∇ × (∇ × E) = − (4.19.4)
c2 ∂t2
4.21
Zangwill pg. 515 for details.

101
Since ∇ × (∇ × f ) = ∇(∇ · f ) − ∇2 f , we can rewrite this as:

1 ∂2E
∇2 E − =0 (4.19.5)
c2 ∂t2
This is the wave equation for the electric field. Following the exact same process but starting
with the ∇ × B Maxwell equation yields the same equation for B:

1 ∂2B
∇2 B − =0 (4.19.6)
c2 ∂t2
Notice how the existence of these wave equations (and therefore their solutions) is due to the
mixing of E and B terms in the time-dependent part of Maxwell’s equations. In a deep sense,
that’s the only reason we have light.

4.19.2 Plane Waves

The most important solution to the wave equation is the plane wave: a spatial structure that
translates with time in only one direction. It can be shown4.22 that any function w(z, t) =
g(z − ct) + f (z + ct) (for arbitrary g and f ) will be a solution.
This waveform can be generalized to any direction (rather than just z) by making the
substitution z = k · r, where k is the wave vector. k points in the direction of propagation of
the wave, and has a magnitude related to the wave’s speed. We will also recast c|k| = ω.
It is pretty easy to see that, when evolved in time, this waveform will move with a speed c in
the k direction. This is known as the phase velocity (since it is defined by watching a point of
constant phase move).
The fields corresponding to these solutions look like:

1
E = E0 (k · r − ωt), B = k̂ × E (4.19.7)
c
Notice the potential for confusion between E and E0 . While both are vectors, only the latter is
constant. We see from these equations that

|E| = c|B| (4.19.8)

Therefore, it will suffice for us to work almost exclusively with the electric field, knowing that
we can easily reproduce the corresponding magnetic field.
Without loss of generality, we will chose to work in a basis of sinusoidal functions, such that

E = E0 cos(k · r − ωt) (4.19.9)

To make the calculus easier, we will chose to work with complex waves, with the understanding
that the actual wave we are talking about is only the real part of our equation. We can therefore
express E = Re[Ẽ], where:

Ẽ = Ẽ0 ei(k·r−ωt) (4.19.10)


4.22
Zangwill pg. 539.

102
If a wave is completely represented by one such sinusoidal function, it is said to be a monochro-
matic plane wave. If not, any solution to the wave equation can be represented as a Fourier
series over such plane waves.
Waves carry energy, and the energy carried in a monochromatic plane wave is given simply
by the usual E&M energy equation:

1 1
uEM = 0 (|Re(Ẽ)|2 + c2 |Re(B̃)|2 ) = 0 |Re(Ẽ)|2 (4.19.11)
2 2
Over a time average:

1
huEM i = 0 |Re(Ẽ0 )|2 (4.19.12)
2
Similar expressions can then be worked out for the momentum and Poynting flux of the wave:

huEM i
hgi = k̂ (4.19.13)
c

hSi = c2 hgi = chuEM ik̂ (4.19.14)

Since the Poynting flux is an energy density current, we can define the energy velocity as:

hSi
vE = = ck̂ (4.19.15)
huEM i
Which confirms that we have, in fact, described a light wave!

4.19.3 Polarization

We have shown that, for any given E-field, a corresponding B-field of B = k̂ × E can be a freely
propagating electromagnetic wave. This means that the direction of the electric field is arbitrary,
and can in fact change in time as long as the B-field also changes.
We define the polarization of a light wave by the motion of the tip of the E-field vector. In
the most general case, we will allow this vector to trace out an ellipse in the transverse plane.

Figure 4.19.1: Coordinate system and a sign convention for circular polarization. Source:
Zangwill pg. 547.

In the basis ê1 and ê2 as illustrated in figure 4.19.1, the most general electric field can be

103
expressed as:

Re(Ẽ) = A cos(φ + δ1 )ê1 + B cos(φ + δ2 )ê2 = E1 ê1 + E2 ê2 (4.19.16)

Some rearrangement of this equation shows that the components E1 , E2 of E obey the equation
of an ellipse paramaterized by the phase difference between the ê2 and ê1 components δ = δ2 − δ1 :
2 2
E1 E2 E1 E2
    
+ −2 cos(δ) = sin2 (δ) (4.19.17)
A B A B
This state is known as elliptical polarization.
This equation simplifies substantially if we assume that the ê1 and ê2 components are
completely out of phase, i.e. δ = δ2 − δ1 = mπ, (m = 0, 1, 2, ...). We can parameterize the
wave in terms of the angle between E and ê1 : θ.

Ẽ = E0 (cos(θ)ê1 + sin(θ)ê2 )ei(k·r−ωt) (4.19.18)

This state is called linear polarization, as the tip of E traces back and forth across the line
defined by θ (note that θ is fixed for a given wave, so the coefficients of the unit vectors above
are constant).
A second simplification occurs when δ = δ2 − δ1 = π2 m, (m = ±1, ±2, ...). In this case, the
ellipse becomes a circle, which can be described by:

ê1 ± iê2 i(k·r−ωt)


 
Ẽ± = E0 √ e (4.19.19)
2
This is known as circular polarization. Based on this equation, it is natural to transform into
a basis of circular coordinates:

1 1
ê+ = √ (ê1 + iê2 ), ê− = √ (ê1 − iê2 ) (4.19.20)
2 2
Notice that the choice of + and − for these vectors is arbitrary. This leads to a troublesome
sign convention that differs even between sub-fields of physics. In these notes (consistent with
Zangwil) + is taken to be counter-clockwise as see looking INTO the light ray.
In this basis, the wave can be conveniently written as:

E˜± = (ê± )ei(k·r−ωt) (4.19.21)

This circular basis is an equally good basis for expressing an arbitrary wave.

4.20 Wave Packets


Plane waves provide a simple basis of solutions to the wave equation, but are obviously unphysical
(because they extend to infinity). This problem can be solved by assembling a wave packet by
combining a continuum of plane waves, weighted by a vector envelope function of frequency

104
(or, equivalently, k), ε(k):

1
Z
Ẽ = ε(k)ei(k·r−ωt) d~k (4.20.1)
(2π)3
This general vector wave packet is very difficult to use, in general we will work with a similar
scalar wave packet, assembled from solutions to the scalar wave equation:

1
Z
u(r, t) = ε(k)ei(k·r−ωt) d~k (4.20.2)
(2π)3
Again, ε(k) is an envelope function that defines the shape of the wave packet at t = 0: however
in this case it is a scalar. We have switched notation from Ẽ to u(r, t) to indicate that this
function cannot actually be an electric field (because it is a scalar).
Incidentally, it can be shown (Zangwill pg. 554) that u(r) and ε(k) are a Fourier transform
pair, meaning that:

1
∆xi ∆ki ≥ (4.20.3)
2
This “uncertainty relationship" is characteristic of all Fourier transform pairs. In this case, it
states that a a wave packet can be made arbitrarily small, at the cost of including more and
more frequencies. Just as for the position-momentum pair in QM, the maximally certain state
possible is a Gaussian envelope function, making this the standard choice.

4.20.1 Group Velocity

In general, ω(k). This means that different plane waves in a packet may travel at different
velocities (depending on the properties of the material). If we assume that a wavepacket is
mostly localized in frequency space (includes a narrow band of frequencies), we can justify
approximating ω as a Taylor series:

∂ω
ω(k) = ω(k0 ) + (k − k0 )i + ... (4.20.4)
∂ki k=k0

We will define the group velocity of the wave packet as the second coefficient in this expansion:

∂ω
vg = = ∇k ω(k) (4.20.5)
∂ki k=k0 k=k0

This is the effective speed at which the wave packet propagates. Notice that, in vacuum:

ω = ck → vg = c (4.20.6)

In a medium, this velocity will in general be a function of k.

4.21 Waves in Matter


Waves in matter can be conceptually subdivided into simple waves, in which all frequencies
travel at the same velocity, i.e. the group velocity is constant, and waves in dispersive media,
where the group velocity is a function of frequency.

105
4.21.1 Simple Media

Simple media are characterized by the equations:

D = E, B = µH (4.21.1)

For convenience, we will introduce two collections of these constants. The index of refraction:

√ µ
r
n = c µ = (4.21.2)
µ0 0

and the intrinsic impedance:


r
µ
Z= (4.21.3)

For a given medium, then, Z and n contain the same information. This means you will commonly
need to translate from one to the other, which is easily done4.23 :

µ
Z= (4.21.4)
nc
In many cases, it may be assumed that µ = µ0 . This is often justified, as magnetic effects for
most materials are much weaker than their electric counterparts. Such materials are called
non-magnetic.

4.21.2 Simple Plane Waves

In simple media, it is typical to chose to describe waves in terms of E and H. If no sources are
present, Maxwell’s equations are then:

∇ · E = 0, ∇·H =0 (4.21.5)

∂H ∂E
∇ × E = −µ , ∇×B = (4.21.6)
∂t ∂t
If we apply these equations to a typical monochromatic plane wave:

Ẽ = Ẽ0 ei(k·r−ωt) , H̃ = H̃0 ei(k·r−ωt) (4.21.7)

We produce a set of four constraints. The first two state that both E or H are perpendicular to
the direction of propagation of the wave:

k · E = 0, k·H =0 (4.21.8)

The second two set restrictions on the relationship between E and H:

k × E = ωµH, k × H = ωE (4.21.9)


4.23 n 1
Note that this expression is equivalent to Z = c . This version is preferable, since |H| = Z
|E|, so we would
like to have n in the numerator of that expression when n is complex later.

106
Combining these latter two equations (by taking the curl of one and then plugging in the other)
gives:

k × (k × E) = −ω 2 µE (4.21.10)

The triple cross product identity then produces:

c
k · k = µω 2 → ω(k) = k (4.21.11)
n
∂ω
Therefore, the group velocity ∂k is just vg = nc . Media with higher indices of refraction have the
effect of slowing down the wave.
The second set of conditions also gives us a relation between E and H:

1
H= k̂ × E (4.21.12)
Z
Therefore, just like in vacuum, we can often think about waves only in terms of the electric field,
simply recovering H at the end of the calculation.
As usual, Maxwell’s equations require the following conditions at boundaries (with no
sources)(directions relative to the surface normal):

D1,⊥ = D2,⊥ , B1,⊥ = B2,⊥ (4.21.13)

E1,k = E2,k , H1,k = H2,k (4.21.14)

4.21.3 Specular Reflection and Snell’s Law

Figure 4.21.1: Reflection and refraction at a boundary. Source: Griffiths E&M, pg. 389.

Imagine a light wave incident on a boundary between two regions with different indices of
refraction. An incident ray will produce both a transmitted and a refracted ray. Together, these
three rays lie in a plane, which is defined as the plane of incidence. Matching the incident,

107
transmitted, and reflected waves at the boundary produces an equation like:

()ei(kI ·r−ωI t) + ()ei(kR ·r−ωR t) = ()ei(kT ·r−ωT t) , (at z = 0) (4.21.15)

Where the empty parentheses stand for hither-to-be-determined constants. In order to have any
hope of satisfying the boundary conditions given by Maxwell’s equations everywhere on this
plane, two conditions must be satisfied:

• ωI = ωR = ωT . If this was not the case, the solutions could only ever match for one point
in time.

• kI · r|z=0 = kR · r|z=0 = kT · r|z=0 . Letting r = h1, 0, 0i or something similar makes it


clear that this statement also implies that kI,x = kR,x = kT,x , and kI,y = kR,y = kT,y .

The condition that kI,x = kR,x = kT,x immediately implies that sin(θI ) = sin(θR ), since
|kI | = |kR |. This is equivalent to the Law of Specular Reflection:

θI = θR (4.21.16)

Applying the same reasoning to the transmitted wave produces:

|kI | sin(θI ) = |kR | sin(θR ) (4.21.17)


ωn
Since k = c , this produces Snell’s Law:

n1 sin(θI ) = n2 sin(θR ) (4.21.18)

4.21.4 Total Internal Reflection

Figure 4.21.2: Total internal reflection. Source: Wikipedia.

Total internal reflection occurs when a wave travels from a region of higher n to one of lower
n at a steep angle. Snell’s law predicts that, after a critical angle denoted θc where the wave
is refracted horizontal to the surface, the wave can actually be refracted back into the original
π
material. The critical angle can be calculated by setting θT = 2 in Snell’s law, which yields:

n2
 
−1
θc = sin (4.21.19)
n1

108
4.21.5 Polarized Light at Boundaries (The Fresnel Equations)

Figure 4.21.3: Reflection and refraction of a polarized wave at a boundary. Source: Zangwill pg.
591.

When considering the interaction between a polarized light wave and a boundary, we must
match boundary conditions for both E and H. We will treat two possible situations separately:

• p-polarized, TM (transverse magnetic), or k: E is parallel to the plane of incidence


(left panel in figure 4.21.3).

• s-polarized, TE (transverse electric), or ⊥: E is perpendicular to the plane of


incidence (right panel in figure 4.21.3).

We will treat the p case (the s-case follows by the same logic). The requirement that E1,k = E2,k
directly yields (based on figure 4.21.3):

EI cos(θI ) + ER cos(θR ) = ET cos(θT ) (4.21.20)

While the condition that H1,k = H2,k means that:

HI + HR = HT (4.21.21)
1
But since H = Z k̂ × E, this second equation can be rewritten as4.24

1 1
(EI + ER ) = ET (4.21.22)
Z1 Z2
Combining these two equations (and changing from Z to n) generates Fresnel’s Equations
for the reflection and transmission coefficients:

2
ER n2 cos θI − n1 cos θT
  
Rp = = (4.21.23)
EI p n1 cos θT + n2 cos θI

2
ET 2n1 cos θI
  
Tp = = (4.21.24)
EI p n1 cos θT + n2 cos θI

4.24
Both sides also get a k̂×, but we can cancel those and consider just the parts written below.

109
Following the same process for s-polarized waves yields:

2
ER n1 cos θI − n2 cos θT
  
Rs = = (4.21.25)
EI s n1 cos θI + n2 cos θT

2
ET 2n1 cos θI
  
Ts = = (4.21.26)
EI s n1 cos θI + n2 cos θT

4.21.6 Simple Conducting Matter

Simple conducting matter is simple matter that also allows for currents to be driven by the
electric field, so that in addition to D = E and B = µH, we also produce a (free) current
jf = σE. In simple conducting matter , µ, and σ are all constant: in particular they have no
dependence on the frequency of the incident wave. E and H must still be perpendicular to k
(no dispersion).
The fourth of Maxwell’s equations now reads:

∂E ∂E
   
∇×B =µ j+ = µ σE +  (4.21.27)
∂t ∂t
If we take E to be a plane wave, we can re-write this expression as:

iσ ∂E ∂E
 
∇×B =µ + = µ0 ˜ (4.21.28)
ω ∂t ∂t

Where we have defined (for our convenience) a complex dielectric constant:


˜ =  + (4.21.29)
ω

Or:
˜ iσ
=1+ (4.21.30)
 ω
Correspondingly, we must also now define a complex index of refraction and impedance:
q
ñ(ω) = c µ˜
(ω) (4.21.31)

s
µ
Z̃(ω) = (4.21.32)
˜(ω)

The dispersion relation can now be written in the same form as that in free space (that name is
a bit misleading here: this fact means that there IS no dispersion!):

ω
k = ñ k̂ (4.21.33)
c

It is important to remember that these quantities are not physical. The physical variables (,
µ, and σ) are still constant! These complex quantities are simply a convenient way of writing
Maxwell’s equations for this situation.

110
It maybe concerning that, in both of these equations, we appear to be taking the square

root of an imaginary number, which is an odd procedure. In order to avoid having i in our
equations, we can compute ñ by writing:

ñ = α + iβ (4.21.34)

Such that
ñ2 = α2 + 2iαβ − β 2 = (α2 − β 2 ) + i(2αβ) (4.21.35)

Then, we can match real and imaginary parts to find a solution for ñ:


 
2 2 2 2
(α − β ) + i(2αβ) = ñ = c µ  + (4.21.36)
ω

Solutions to this system are:


v
r u  2  12
µ u
t1 + σ
Re(ñ) = α = c +1 (4.21.37)
2 ω

v
r u  2  12
µ u
t1 + σ
Im(ñ) = β = c −1 (4.21.38)
2 ω

Since k = ñ ωc k̂, the equation for a plane wave can now be written:

ω ω
E = E0 e− c Im(ñ)k̂·r ei( c Re(ñ)k̂·r−ωt) (4.21.39)

The first exponential is completely real, meaning that it will decay rather than oscillate within
the material. This disipation of energy is known as Joule heating. The value Im(ñ) = β is
therefore a measure of how strongly the wave is absorbed.
The amount of energy lost due to Joule heating can be calculated by considering the energy
lost as work done to create free current in the presence of the electric field:

dW
Z Z
= d3 rjf · E = σ d3 r|E|2 (4.21.40)
dt V V

4.21.7 Waves in Dispersive Media

A dispersive medium is one in which the variables (ω), µ(ω) and σ(ω) are dependent on
frequency. In general, these quantities will actually become tensors, which can cause a plane
wave to physically disperse (spread out) as it travels through the material. In simpler calculations,
we will assume that these tensors are diagonal and have the same value along each axis:

(ω)↔ → ˜(ω)I (4.21.41)

Where I is the identity matrix. In this case, the wave will still disperse in 1D, since the speed of
light in the material will be different for different frequencies of the light wave.
The current induced by the (already complex) Electric field is now written in terms of a

111
complex conductance:

j̃ = σ̃(ω)Ẽ (4.21.42)

Here, it turns out4.25 that σ̃(ω) is a Fourier transform pair with the “actual” conductance, which
is a function of time:
Z ∞ Z ∞
1
σ̃(ω) = σ(t)eiωt
dt, σ(t) = σ̃(ω)e−iωt dω (4.21.43)
−∞ 2π −∞

The same is generally true for the host of other complex variables we must introduce: each
corresponds via a Fourier transform to a “real” time dependent function4.26 :

P̃ = 0 χ̃(ω)Ẽ, M̃ = χ̃M (ω)H̃ (4.21.44)

D̃ = ˜(ω)Ẽ, B̃ = µ̃(ω)H̃ (4.21.45)

Suppose that we decide to calculate the polarization current j̃P based on this oscillating electric
field:

∂ P̃
j̃P = (4.21.46)
∂t
Since P̃ is the Fourier transform of P (t), the time derivative can be replaced by an −iω:

j̃P = −iω P̃ = −iω0 χ̃(ω)Ẽ (4.21.47)

But since P̃ = 0 χ̃(ω)Ẽ, we know that:

iσ̃
P̃ = (4.21.48)
ω

Then, since:
D̃ = 0 Ẽ + P̃ = ˜(ω)Ẽ (4.21.49)

We can solve for a new relation for ˜(ω)):

iσ̃(ω)
˜(ω) = 0 + (4.21.50)
ω

Notice that this relation is slightly different from the one for simple conducting
matter! Namely, the  from before has become an 0 . This does not mean that (ω) = 0 :
merely that the information about (ω) has been bundled up inside σ̃(ω).
∂P
We could also use Maxwell’s equations to show that j = ∂t + ∇ × M implies:

χ + χM + χχM ∂P χ + χM + χχM
j= = ∇×M (4.21.51)
χ ∂t χM (1 + χ)
4.25
Zangwill pg. 625.
4.26
For this reason, Zangwill replaces the F̃ with a F̂ for these functions. I’m sticking with F̃ , because I think
the change causes more confusion than it’s worth.

112
These equations lead to an important conceptual result: when describing time-dependent
currents in a material, we can choose to model the currents as entirely polarization
currents (using ˜) or magnetization currents (using χ̃M )).
For low frequency systems, we will often use both ˜ and µ̃ = µ0 (1 + χ̃M ), while at higher
frequencies we will set M = 0, treating ALL response of the material as a dielectric (and thus
grouping it into ˜).
Having established that we can describe dispersive media through a combination of complex,
frequency dependent functions, we now want to find physical models that will provide us with
these functions.

4.21.8 The Drude Model

The Drude model pictures a material as being made up of free charges (neutralized by some
other oppositely charged particles) that experience a simple drag force proportional to their
velocity. Their equation of motion is thus:

m
mv̇ = qE − v (4.21.52)
¯ τ

Where τ is the damping constant. Imagine a single particle being hit by a plane wave. Ignoring
the transient initial solution, the particle will oscillate at the same frequency as the wave driving
it. Making this assumption allows us to solve this equation:

qE0 1
v= 1 eiωt (4.21.53)
m τ − iω

Then, since j̃(ω) = nqv = σ̃(ω)E, we can solve for σ̃(ω):

nq 2 τ σ0
σ̃(ω) = = (4.21.54)
m(1 − iωτ ) 1 − iωτ

iσ̃(ω)
For any dispersive medium, ˜(ω) = 0 + ω . Therefore, plugging in σ̃(ω):

ωp2 τ 2
 2
˜(ω) ωp τ 1
  
= 1− +i (4.21.55)
0 1 + ω2τ 2 ω 1 + ω2τ 2

Where ωp2 is the natural oscillation frequency of the plasma, known as the plasma frequency:

nq 2
ωp2 = (4.21.56)
0 m

In the low frequency/high damping limit, ωτ << 1:

˜(ω) ωp2 τ
≈1+i (4.21.57)
0 ω

113
While for the high frequency/low damping limit, ωτ >> 1:

˜(ω) ωp2
≈1− 2 (4.21.58)
0 ω

Notice that, in the high frequency limit, ˜(ω) is entirely real, meaning that a wave will not decay:
it will be undamped. In fact, the wave is undamped for all ω > ωp .
When a plasma is cold and/or dilute, it will suffer very little damping, meaning that τ >> 1,
and therefore that the second limit can be used. This scenario tends to come up much more
often in exams.

4.21.9 The Lorentz Model

The Lorentz Model pictures a dielectric material as being made up of individual atoms, each of
which has a immovable nucleus and an electron which may vibrate as if on a damped spring
with frequency ω0 and damping constant Γ. The equation of motion for a particle is thus:

mr̈ = qE − mΓṙ + mω02 r (4.21.59)

In terms of these variables (and the plasma frequency defined in the previous section):

˜(ω) ωp2
=1+ 2 (4.21.60)
0 ω0 − ω 2 − iωΓ

4.21.10 The Appleton Model of a Magnetized Plasma

The Appleton model is similar to the Drude model, but without collisions and allowing for
an external magnetic field. Therefore, it is ideal for describing a cold (relatively collision-less)
plasma.
In this model, the presence of the magnetic field allows for coupling between the x and y
(transverse) directions. For this reason, it is no longer sufficient to use a vector permitivity, and
we must revert to the most general definition of  as a tensor:

D = ˜ · E (4.21.61)

The Appleton model then says that, in terms of the plasma frequency and the cyclotron frequency
qB0
ωc = m :

ωp2 ω 2 ωc
 
1− ω 2 −ω 2 i ω(ω2p−ω2 ) 0
 c c 
2ω ωp2
˜(ω) = 0 −i ω(ωω2p−ωc
1− 0 (4.21.62)
 
2) ω −ωc2
2 
 c 
2
ωp
0 0 1− ω2

4.22 Waveguides and Transmission Lines


Waveguide’s and transmission lines are both structures that control the propagation of electro-
magnetic waves. Several examples of tremendous practical importance are power transmission
lines, coax data cables, and fiber optics.

114
Fields propagating through such a system are classified as TE (Transverse Electric, TM
(Transverse Magnetic, or TE (Transverse Electric and Magnetic, where “transverse”
here means that the given field vector is always perpendicular to the direction of the wave’s
propagation through the waveguide. The primary distinction between waveguides and transmis-
sion lines is that waveguides can only support TE and TM waves, while transmission
lines are usually TEM4.27 .
In any waveguide or transmission line the fields must satisfy the usual conductor boundary
conditions at the surface:

Ek s
=0 (4.22.1)
H⊥ s
=0 (4.22.2)

4.22.1 TEM Waves (Transmission Lines)

Suppose that a TEM wave propagates in the hatz direction. The simplest fields will look like a
plane wave along ẑ, but will have more complex behavior in the transverse plane determined by
the geometry of the conductor. We will therefore chose to separate:

E = E⊥ ei(kz−ωt)
(4.22.3)
H = H⊥ ei(kz−ωt)

For some geometries, it will be simple enough (or sometimes necessary) to use these fields along
with the free space Maxwell equations and conductor boundary conditions to find the fields.
However, if the walls of the conductor are nicely perpendicular to the direction of propagation,
it is convenient to create a separate boundary value problem in the transverse plane.
To do so, we will substitute E = E⊥ and B = B⊥ into the source free Maxwell equations.

By splitting up the vector derivative into perpendicular and parallel components, ∇ = ∇⊥ + ẑ ∂z ,
we can write Maxwell’s curl equations as:

∂E⊥ ∂H⊥
∇⊥ × E⊥ + ẑ × = −µ
∂z ∂t (4.22.4)
∂H⊥ ∂E⊥
∇⊥ × H⊥ + ẑ × =
∂z ∂t
Notice that the ∇⊥ components above are in the ẑ direction, while all other components are in
the transverse plane. We can therefore take only the ẑ components to write:

∇⊥ × E⊥ = 0
(4.22.5)
∇⊥ × H⊥ = 0
Applying a similar process to the divergence Maxwell equations produces:

∇⊥ · E⊥ = 0
(4.22.6)
∇⊥ · H⊥ = 0
4.27
It seems to be possible for transmission lines to carry a TE or TM wave under some circumstances, but this
is an unusual case.

115
At this point, these four equations (in addition to the normal boundary conditions for conducting
matter) completely specify a 2D boundary value problem for the traverse fields.
Returning to the ẑ and transverse separated Maxwell curl equations above (eq. 4.22.4), we
examine the equality of the remaining transverse terms:

∂E⊥ ∂H⊥
ẑ × = −µ
∂z ∂t (4.22.7)
∂H⊥ ∂E⊥
ẑ × =
∂z ∂t
If we further assume that even the transverse fields are plane waves (E⊥ ∝ ei(kz−ωt) , B⊥ ∝
ei(kz−ωt) ) 4.28 , these equations become:

ẑ × kE⊥ = −µωH⊥
(4.22.8)
ẑ × kH⊥ = ωE⊥
This system of equations implies that:

1
ω=√ h (4.22.9)
µ
Plugging this relation back into the transverse terms above (eq. 4.22.8), we can rewrite the
ẑ × E⊥ equation to be:
r
µ
ẑ × E⊥ = H⊥ (4.22.10)

This equation is analogous to B = 1c k̂ × E did in free space.

4.22.2 Conducting Plane(s)

Figure 4.22.1: TE wave incident on a conducting plane. Source: Zangwill pg. 673

Before considering waveguides, it is constructive to examine the interaction between an E&M


wave and a conducting plane (which can be thought of as one face of a rectangular waveguide).
Imagine a TE wave incident at an angle θ from the surface normal on a conducting plane
(Fig. 4.22.1). At the surface, boundary conditions demand that E⊥ s
= 0. However, this
condition applies to the total field at this point, which includes both the incident and reflected
4.28
Zangwill writes k as h for some reason in this section.

116
fields. Two valid fields that satisfy this condition would therefore be:

EI = ŷE0 exp(ik(z sin θ − x cos θ − ct))


(4.22.11)
ER = −ŷE0 exp(ik(z sin θ + x cos θ − ct))

The B field is then fixed by ∇ × E = − ∂B


∂t . Notice that, at x = 0, these fields exactly cancel
everywhere along the plane, satisfying the boundary condition. Notice, also, by the nature of
the cosine function that represents the real part of this field, that there are infinitely many such
zeros, spaced out by the relation:

kx cos θ = mπ, m ∈ Z+ (4.22.12)

We are therefore free to place a second conducting plate at any of these values of x. Doing so will
cause the wave to bounce back and forth indefinitely between the plates, acting as a sort of 2D
waveguide. Notice that it is not possible to have a TEM wave exhibit this behavior: that wave
would be normally incident on the plate, and therefore never propagate down the waveguide.

Figure 4.22.2: TE wave between two conducting plates. Source: Zangwill pg. 673

Conversely, then, if we place a plate at x = a as shown in Fig. 4.22.2, we are imposing a


requirement that:

1 π
k=m , m ∈ Z+ (4.22.13)
cos θm a
π ω 4.29
Since 0 < θm < 2 as drawn in the figure, 0 < cos θm < 1 and k = c , we see that the following
relationship governs the possible values of ωm for a given plate placement x = a:

π
ωm > m (4.22.14)
a
The boundary value here defines the cutoff frequency of this waveguide:

π
ωc = m (4.22.15)
a
The physical meaning of this frequency is illuminated by rewriting the total field inside the
waveguide in terms of the

mπx i(kz sin θ−ωt)


 
EI + ER = −2E0 sin e (4.22.16)
a
4.29
Here I am assuming that the inside of the waveguide is vacuum so that the wave speed is c. If the waveguide
is filled with matter, substitute the appropriate wavespeed in terms of the index of refraction.

117
Notice that we can rewrite
 2
ω2 π
k sin θ = k (1 − cos θ) = 2 − m2
2 2 2 2
= hm 2 (4.22.17)
c a
For convenience, we will refer to this entire expression as hm 2 . Then k sin θm = hm , so:

mπx i(zhm −ωt)


 
E = −2E0 sin e (4.22.18)
a
Notice now that if hm is real, this wave will propagate (since the total z dependence will
remain imaginary). However, if hm is imaginary, the wave will die out as an evanescent wave
(as the z dependence becomes real).
Therefore, hm real corresponds to ωm > m πa = ωp . Therefore only waves with frequencies
greater than the cutoff frequency ωp can propagate through the waveguide.

4.22.3 Conducting Tubes

Figure 4.22.3: A generalized conducting tube. Source: Zangwill pg. 675.

The most common type of waveguide is the conducting tube. A generalized conducting tube
is shown in figure 4.22.3 to be an infinitely long tube with conducting, parallel walls and a
uniform cross section. Such a tube can support either TE or TM fields, but not TEM fields.
Translation symmetry along the axis of the tube guarantees that Ez = ẑ · E is constant along
the z axis (but not necessarily zero), and the same is true for Hz = ẑ · H. If we then assume
that the field propogates as a plane wave down the z axis,it is natural to write:

E = (E⊥ (r⊥ ) + ẑEz (r⊥ ))ei(kz−ωt)


(4.22.19)
H = (H⊥ (r⊥ ) + ẑHz (r⊥ ))ei(kz−ωt)
Notice that for TE waves, Ez = 0, while for TM waves Hz = 0. Maxwell’s curl equations are:

∇ × E = iωµH
(4.22.20)
∇ × H = −iωE

Again writing ∇ = ∇⊥ + ẑ ∂z and using the fields defined above,

∇ × E = (∇⊥ × E⊥ + ikẑ × E⊥ − ẑ × ∇⊥ Ez )ei(kz−ωt) (4.22.21)

118
A similar expression can be found for H. Notice in the above equation that the first ∇⊥ × E⊥
must point along ẑ (perpendicular to E⊥ ), while the remaining two terms must be transverse
(perpendicular to ẑ). Therefore, substituting these curl expressions into Maxwell’s equations
allows us to split them up by components:

∇⊥ × E⊥ = iωµHz ẑ
(4.22.22)
iωµH⊥ − ikẑ × E⊥ = −ẑ × ∇⊥ Ez

∇⊥ × H⊥ = −iωEz ẑ
(4.22.23)
iωE⊥ + ikẑ × H⊥ = −ẑ × ∇⊥ Hz
If we take Ez and Hz to be givens, this system of equations completely defines E⊥
and H⊥ (each of which has two components). Rearrangement (by taking the cross product
ẑ with the second and fourth equations) yields:

ik iωµ
E⊥ = 2
∇⊥ Ez − 2 ẑ × ∇⊥ Hz
γ γ
(4.22.24)
ik iω
H⊥ = 2 ∇⊥ Hz + 2 ẑ × ∇⊥ Ez
γ γ

Where for convenience

γ 2 = µω 2 − k 2 (4.22.25)

Yet more rearrangement (now taking the curl of both of these equations) transforms these
equations into a pair of Helmholtz equations:

(∇⊥ 2 + γ 2 )Ez = 0
(4.22.26)
(∇⊥ 2 + γ 2 )Hz = 0

So far, these equations are totally general for any mode. However, assuming that Ez = 0 or
Hz = 0 greatly simplifies both of the boxed sets of equations (eq. 4.22.24 and eq. 4.22.26) to a
single Helmholtz equation equation for the ẑ direction and a pair of equations to determine the
traverse fields. Some final rearrangement produces this system:

• TE:
Ez = 0
(∇⊥ 2 + γ 2 )Hz = 0
ik (4.22.27)
H⊥ = ∇⊥ Hz
γ2
ωµ
E⊥ = − ẑ × H⊥
k

119
• TM:
Hz = 0
(∇⊥ 2 + γ 2 )Ez = 0
ik (4.22.28)
E⊥ = ∇⊥ Ez
γ2
ω
H⊥ = ẑ × E⊥
k
This result has reduced the entire problem of finding the fields to finding their ẑ components.
The Helmholtz equations for these components are a boundary value problem determined by
the condition that n̂ × E s
= 0 everywhere on the surface of the tube. Zangwill shows4.30 that
for our generalized tube geometry, this condition reduces to:

∂Hz
Ez s
= 0 (TM Waves) , = 0 (TE Waves) (4.22.29)
∂n s

This tidy system of equations given out initial fields justifies our choice of TE and TM as a
“basis”: any field can be written as E = ET M + ET E .
Explicitly calculating the TE and TM components of our fields in this manner yield a useful
result:

ZHT E = ET M
(4.22.30)
ET E = −ZHT M
q
µ
Where Z is the impedance Z = .
Another very important result4.31 is the cutoff frequency for a generalized tube waveguide:

γi c
ωc,i = √ = γi (4.22.31)
µ n

Where i indexes the modes of the waveguide (each mode has its own cutoff frequency).
These equations alone do not prove that TEM modes cannot propagate through
a conducting tube. Setting Hz = 0 in our original transverse Maxwell equations tells us
that ∇⊥ × E⊥ = 0, while Maxwell’s divergence equation requires that ∇⊥ cot E⊥ = 0. Together
these describe a boundary value problem ∇2⊥ φ = 0 where E⊥ = −∇φ. Since the boundary is a
conductor, φ must be constant and therefore E⊥ = 0. A similar argument with the other set of
Maxwell equations setting Ez = 0 shows that H⊥ = 0. Therefore, if Ez = Hz = 0, no wave
can propagate.

4.22.4 Example: Rectangular Tube Waveguides

For the case of a rectangular tube with sides of length a and b, the Helmholtz equation boundary
value problem reduces to:

∂2 ∂2
 
TM: + + γ 2 Ez = 0, and Ez |s = 0 (4.22.32)
∂x2 ∂y 2
4.30
pg. 677.
4.31
Derived in Zangwill pg. 679

120
∂2 ∂2 ∂Hz
 
TE: + + γ 2 Hz = 0, and =0 (4.22.33)
∂x2 ∂y 2 ∂n s

Which produces fields that look like:

mπx nπy
   
TE: Ezmn (x, y) = E sin sin , m, n ∈ Z+ (4.22.34)
a a

mπx nπy
   
TM: Hzmn (x, y) = H cos cos , m, n ∈ Z+ (4.22.35)
a a
q
m2 n2
With eigenvalues γmn = π a2
+ b2
.

4.22.5 Example: Cylindrical Tube Waveguides

For the case of a cylindrical tube of radius R, the Helmholtz equation boundary value problem
reduces to:

∂2 1 ∂ 1 ∂2
 
TM: + + + γ 2 Ez = 0, and Ez |s = 0 (4.22.36)
∂ρ2 ρ ∂ρ ρ2 ∂φ2

∂2 1 ∂ 1 ∂2 ∂Hz
 
TE: + + + γ 2 Hz = 0, and =0 (4.22.37)
∂ρ2 ρ ∂ρ ρ2 ∂φ2 ∂n s

The solutions to these Helmholtz equations in cylindrical coordinates are cylindrical Bessel
functions (of the first kind, in order to enforce finiteness of the field at the origin) in ρ and
exponential in φ:

TM: Ezmn (ρ, φ) = EJm (γmn


TM
ρ)e±imφ , m = 0, 1..., n = 1, 2... (4.22.38)

TE: Hzmn (ρ, φ) = HJm (γmn


TE
ρ)e±imφ , m = 0, 1..., n = 1, 2... (4.22.39)

Where the absence of of n = 0 is due to the non-existence of J0,0 . The eigenvalues are defined as:

TM umn TE wmn
γmn = , γmn = (4.22.40)
R R
Where umn are the zeros of the Bessel function Jmn , and wmn are the zeros of the derivative
0
of the Bessel function: Jmn (the derivative appears because the TE boundary condition is a
derivative).

4.22.6 Flow of Energy Through a Waveguide

The time averaged Poynting flux through a waveguide is given by:

1 ∗
hSi = Re(E⊥ × H⊥ ) · ẑ (4.22.41)
2
For both TE and TM waves, this energy propagates at the group velocity.

121
4.23 Fields of Moving Particles (Radiation)
As a charged particle moves, its electric field is deformed and it generates a magnetic field (as
a sort of one-particle current). These fields are further deformed if the particle accelerates.
During this acceleration, many of the field lines may remain attached to the particle. However,
some field lines cross themselves, forming closed field loops in a phenomenon known as field line
re-connection. These loops are now self-sufficient radiation fields, and will propagate away from
the particle to infinity. This is the birth of an electromagnetic wave.
This process is somewhat analogous to a wet dog shaking off water droplets. Most of the
water remains attached to the dog by surface tension. However, occasionally some water forms a
droplet, which is flung free of the fur into empty space, now held together by its own surface
tension.
While it is true that all radiation is created by accelerating charges, the reverse is not true.
Under special circumstances, multiple particles may move in a way that conspires to cancel out
the radiating components of their fields as they accelerate, therefore producing no radiation.

4.23.1 Retardation

When describing radiation, we are concerned with the field of an accelerating charge very far
away (at infinity, in fact). Since these fields (and therefore the information they carry) propagate
at finite speed (the speed of light), the field at some point r is determined, not by the acceleration
r̈ of the source charge now, but it’s acceleration some time ago: t0 , called the retarded time4.32 .

|r − r 0 |
t0 = t − (4.23.1)
c
The quantity |r − r 0 | = r is can be expressed exactly using the law of cosines (see fig. 4.1.1 for
a reminder of the definition of each vector):
q q
|r − r 0 | = r0 2 + r2 − 2rr0 cos θ = (r − r0 cos θ)2 + r0 2 sin2 θ (4.23.2)

In the limit where r0 << r, we can approximate r0 ≈ 0 in the above expression. Notice
that, if we are calculating a radiation field, this is approximation is actually exact.
We can chose to examine the radiation field at infinity, where this condition is satisfied exactly.

|r − r 0 | ≈ r (1st order) (4.23.3)

In which case, the retarded time becomes

r
t0 = t − (1st order) (4.23.4)
c
Even at infinity, this first order approximation is only valid if the source can be
treated as a point source without loss of important physics. This is sometimes NOT
the case, as interference between different parts of the source can create interference that change
|r−r 0 |
4.32
There is, theoretically, also an advanced time, t0 = t + c
. However, this time is not causal, and is therefore
not used.

122
the radiation fields. In this, case we will use a 2nd order approximation. Letting only r0 2 = 0 in
eq. 4.23.2:

|r − r 0 | ≈ r − r 0 cos θ = r − r̂ · r 0 (2nd order) (4.23.5)

Such that, to 2nd order:

r r̂ · r 0
t0 = t − + (2nd order) (4.23.6)
c c
It can be shown4.33 that the retarded potentials are exactly what we might hope they would be:

1 ρ(r 0 , tret )
Z
φret (r, t) = φ(r, tret ) = d3 r0 (4.23.7)
4π0 r
µ0 j(r, tret )
Z
Aret (r, t) = A(r, tret ) = d3 r0 (4.23.8)
4π r
However, THE SAME IS NOT TRUE FOR THE FIELDS (in short, because the deriva-
tives to produce them are complicated by the implicit dependence of tret on r). The fields must
∂A
be obtained by E = −∇φ − ∂t and B = ∇ × A. This calculation can be done for a general
current and charge density to produce Schott’s Formulae (or Jefimenko’s Equations)4.34 .

1 ρret r r ∂ρret 1 ∂jret


Z  
3 0
E(r, t) = d r + − 2 (4.23.9)
4π0 r 3 cr 2 ∂t c r ∂t

µ0 jret 1 ∂jret
Z  
B(r, t) = d r 3 0
+ × r (4.23.10)
4π r 3 c r 2 ∂t
Where r = |r − r 0 | and tret = t − r . In actual calculations, it is usually easier to just calculate
c
the retarded potentials, and find the fields directly from them.

4.23.2 Fields of a Moving Point Charge

Suppose that a point charge q moves along some path. The Schott/Jefimenko equations
(eq. 4.23.9 and eq. 4.23.10) are not easily applied to this situation, since the particle does not
really constitute a current or a charge density. Instead, we must return to the retarded potential:

1 ρ(r 0 , tret )
Z
φret (r, t) = d3 r0 (4.23.11)
4π0 r
At a given observation point in time and space, the field produced by the charge is created by
only one time and position on the particles trajectory. Therefore, r is fixed as we evaluate the
integral (although its functional dependence changes from |r − r0 | to |r − r 0 (r, tret )|).

1 1
Z
φret (r, t) = ρ(r 0 , tret )d3 r0 (4.23.12)
4π0 r
4.33
Griffiths E&M pg. 424.
4.34
Derived in Zangwill, pg.726. or Griffiths E&M pg. 427.

123
For a stationary charge, the remaining integral would be simply equal to q. However, Lorentz
contraction of the particle actually distorts the charge of the particle4.35 . Lorentz contraction
1
along an axis introduces a factor of 1− vc . However, since we are viewing the particle from an
1
arbitrary angle, this is modified to be .
1− r̂ · vc
so that:

1 1 1 1 q
Z   
φret (r, t) = ρ(r 0 , tret )d3 r0 = (4.23.13)
4π0 r 4π0 r 1 − r̂ · v
c

Similarly,

µ0 ρ(r 0 , tret )v(tret ) µ0 v(tret ) µ0 qcv


Z Z
A(r, t) = 3
d r= ρ(r 0 , tret )d3 r = (4.23.14)
4π r 4π r 4π ( r c − r · v)

The final results are called the Liénard-Wiechert potentials:

1 1 q
φret (r, t) =
4π0 r 1 − r̂ · vc
(4.23.15)
µ0 qcv
A(r, t) =
4π ( r c − r · v)
Notice that:

v
A(r, t) = φ(r, t) (4.23.16)
c2
These results then lead, through a tremendously intimidating several pages of vector algebra4.36
to the fields generated by the motion of the point charge. Letting u ≡ c r̂ − v:

q r
E(r, t) = [(c2 − v 2 )u + r × (u × a)]
4π0 ( r · u)3
(4.23.17)
1
B(r, t) = r̂ × E(r, t)
c
An important example is that of a particle moving with constant velocity . In this case:

q r̂ 1− v2
c2
E(r, t) = (4.23.18)
4φ0 r2 1−
3
v sin2 θ 2
2
2
c

In this case, the E and B fields are as shown in figure 4.23.1.

4.23.3 Time-Dependent Electric Dipoles

Consider a time dependent dipole moment p(t) (the following discussion will be valid for any
dipole moment). The associated charge distribution is (if we place the dipole at the origin):

ρ(r, t) = −p(t) · ∇δ(r) = −∇ · (p(t)δ(r)) (4.23.19)


4.35
It turns out this is true even for a point particle, since point particles are really just a limit of a finite charge
distribution.
4.36
Griffiths E&M pg.436-437, if you really must.

124
Figure 4.23.1: Radiation pattern point charge with a constant velocity (comparable to c) point
charge. (Source: Griffith’s E&M, pg. 440)

and therefore, by the charge continuity equation:

j(r, t) = ṗ(t)δ(r) (4.23.20)

Evaluating the retarded potentials with these functions yields:

1 ṗ(tret ) · r p(tret ) · r
 
φ(r, t) = + (4.23.21)
4π0 cr2 r3

µ0 ṗ(tret )
A(r, t) = (4.23.22)
4π r
The resulting fields are then:

1 3r̂(r̂ · pret ) − pret 3r̂(r̂ · pret


˙ ) − pret
˙ 3r̂(r̂ · p¨ret ) − p¨ret
 
E(r, t) = 3
+ 2
+ (4.23.23)
4π0 r cr c2 r

µ0 ˙
pret p¨ret
 
B(r, t) = − r̂ × 2
+ (4.23.24)
4π r cr

Notice that the terms in both of these fields have different power’s of r in their denominators.
These terms are each dominant in a particular region of space. For example, since S ∝ E × B,
1
only the r terms can contribute to radiation, as all others will die off as r → ∞.
If the source varies with a characteristic frequency ω:

1
• The Near Field: ∝ r3
, rω << c.
The near field is the closest to the source, including the source itself. It’s field contributions
is the pret term of E (B is zero in this region).
1
• The Intermediate Field: ∝ r2
, rω ≈ c.
˙ terms.
The intermediate field includes the pret

125
• The Far Field / Radiation Zone: ∝ 1r , rω >> c.
The intermediate field extends to infinity, and includes the p¨ret terms. It is only these
components of the fields that contribute to radiation.
Here, it is useful to define some notation:

1 3r̂(r̂ · p¨ret ) − p¨ret µ0 p¨ret


 
Erad (r, t) = , Brad (r, t) = − r̂ × (4.23.25)
4π0 c2 r 4π cr

The power radiated by the dipole per unit solid angle can be written in terms of the Poynting
flux of only the radiation zone terms:

dP 1
= r2 r̂ · S = r2 r̂ · (Erad × Brad ) (4.23.26)
dΩ µ0
Substituting in the fields found earlier gives the radial power distribution for a generalized dipole
moment:

dP µ0
= |r̂ × p¨ret |2 (4.23.27)
dΩ 16π 2 c
Note that, if the direction of p¨ret is fixed, the cross product produces a sin2 θ factor. Integrating
over the solid angle produces the total radiated power:

1 2|p¨ret |2
P (t) = (4.23.28)
4π0 3c3
This equation is very similar to (but not the same as the Larmor formula for the power radiated
from a point charge. While conceptually separate, the Larmor formula can be reproduced here
by setting p = qr.

4.23.4 Fields in the Radiation Zone (Time-domain)

In general, the vector potential in the radiation zone (letting r ≈ r is:

µ0
Z
Arad (r, t) = d3 r0 jret (4.23.29)
4πr
The general fields produced by this current are then:

∂Arad
cBrad (r, t) = −r̂ × (4.23.30)
∂t

Erad (r, t) = −r̂ × cBrad (r, t) (4.23.31)

So, since clearly |Erad | = c|Brad |:, we have:

dP r2
= r2 r̂ · S = |Erad |2 (4.23.32)
dΩ cµ0

126
4.23.5 Fields in the Radiation Zone (Time-domain)

While the fields given in the previous section are valid, it is often easier (for inherently periodic
problems) to work with their Fourier transform into the frequency domain. Both the current
and vector potential can be written as a Fourier transform:
Z ∞
1
j(r, t) = j(r|ω)e−iωt dω
2π −∞
Z ∞ (4.23.33)
1
Arad (r, t) = Arad (r|ω)e−iωt dω
2π −∞

This allows us to write4.37 :


µ0 eikr
Arad (r|ω) = j(k|ω) (4.23.34)
4π r
Where j(k|ω) is the Fourier transform in both space and time of the source current density.
The resulting fields are:

Erad (r|ω) = −iωr̂ × [r̂ × Arad (r|ω)]


(4.23.35)
cBrad (r|ω) = iωr̂ × Arad (r|ω)

4.23.6 The Larmor Formula

Consider an accelerating point charge with velocity v(t) = r˙0 (t), which we will interpret as a
current:

j(r, t) = qv(t)δ(r − r0 (t)) (4.23.36)

Taking the first order retardation approximation, jret = j(r, t − rc ). Evaluating the field and
then plugging into the general power distribution formula gives:

dP µ0 q 2 |aret |2
= sin2 θ (4.23.37)
dΩ 16π 2 c
Which, integrated over the solid angle, is the Larmor formula:

1 2q 2 |aret |2
P = (4.23.38)
4π0 3c3

4.23.7 Generalized Cartesian Multipole Radiation

The general E and B fields of a certain current distribution j are determined by time derivatives
of the vector potential:

µ0
Z
Arad (r, t) = d3 r0 jret (4.23.39)
4πr
If we introduce a new quantity, the radiation vector:

∂ r 1
Z
α(r, t) = d3 r0 j(r 0 , t − + r̂ · r 0 ) (4.23.40)
∂t c c
4.37
Zangwill pg. 736

127
Now, suppose we are only interested in an approximation of fields at a point far from the source
(as in the electric potential multipole expansion). In this case, we can approximate:

r r̂ · r 0
jret = j(r 0 , t − + )
c c
r r̂ · r 0 ∂ ~ 0 r 1 r̂ · r 0 2 ∂ 2 r
 
0
~
≈ j(r , t − ) + j(r , t − ) + 2
j(r 0 , t − ) + ... (4.23.41)
c c ∂t c 2 c ∂t c

Just like the terms of the electric scalar potential multipole expansion, each of these terms
corresponds to a particular “moment”. Zangwill shows4.38 that α can now be written (up to the
three terms in j written above) in terms of the electric dipole moment p, the magnetic dipole
moment m, and the electric quadrupole moment Q:

d2 1 d2 1 d3
α(r, t) ≈ pret + m ret × r̂ + Qret · r̂ (4.23.42)
dt2 c dt2 c dt3
By dividing the field in this way, we can easily compute the E and B fields for each moment.
By convention, the following labels are used to refer to the moments:

• E1 → Electric Dipole

• M1 → Magnetic Dipole

• E2 → Electric Quadrupole

4.23.8 Electric and Magnetic Dipole Radiation

The radiation zone fields for the electric dipole are4.39 :

µ0 r̂(r̂ · p̈ret ) − p̈ret µ0 [r̂ × (r̂ × p)]


EE1 = −r̂ × cBE1 = =
4π r 4π r (4.23.43)
µ0 r̂ × p̈ret
BE1 = −
4πc r
The angular power distribution is:

dP µ0
 
= |r̂ × p̈ret |2 (4.23.44)
dΩ E1 16π 2 c

The corresponding radiated power looks very similar to the Larmor formula, but is in fact more
general:
1 2|p̈ret |2
PE1 (t) = (4.23.45)
4π0 3c3
The Larmor formula can be retrieved by setting p = qr.
In the (extremely common) case of a time harmonic dipole moment p(t) = pe−iωt , the dipole
4.38
In sections throughout section 20.7.
4.39
Zangwill pg. 745.

128
fields simplify to:
µ0 2 ei(kr−ωt)
EE1 = − ω [r̂ × (r̂ × p)]
4π r (4.23.46)
µ0 ω 2 e i(kr−ωt)
BE1 = (r̂ × p)
4π c r
m
Everything for the magnetic dipole is the same, but with p → c , and switching E and M such
that:
1
EM 1 → −cBE1 , BM 1 → EE1 (4.23.47)
c
This is because nature feels bad about how many radiation formulas you already have to
memorize.

4.24 Electromagnetic Scattering


In the electromagnetic scattering problem, we are interested in the interaction between an
incoming light wave and some object that absorbs the light and re-radiates it. The nature of
this interaction is described by the angular redistribution of power:

dσ Power scattered into dΩ


= (4.24.1)
dΩ Total incident power

Which can be expressed in a number of useful ways:


   
dP dP
dσ r2 r̂ · hSrad i dΩ dΩ
= = = 1 (4.24.2)
dΩ |hSinc i| |hSinc i| 2 0 c|Einc |
2

The total field can be split into an incident wave and a scattered wave:

eikr
 
ik0 ·r
E = Einc + Erad = E0 ê0 e + f (k) e−iωt (4.24.3)
r

Where we have absorbed all of the angular information about the scattering pattern into the
scattering amplitude f (k).
Since the radiated field is by definition in the radial direction, r̂·hSrad i = hSrad i. Therefore4.40 :

dσ r2 hSrad i r2 |Erad |2
= = = |f (k)|2 (4.24.4)
dΩ |hSinc i| |E0 |2
In general, this differential cross section will depend on the polarization of the incoming wave. If
we are interested in scattered waves with polarization , then r̂ = ˆ, so:

dσ r2 |∗ · Erad |2
= = |∗ · f (k)|2 (4.24.5)
dΩ |E0 |2

We have previously derived expressions for the electric field radiated by an arbitrary time
4.40
Remember, these fields are in vacuum, so E = cB.

129
harmonic current density. Inserting this expression yields4.41 :
2 2
dσscatt k
 Z
0
= k̂ × d3 r0 j(r 0 |ω)eik·r (4.24.6)
dΩ 4π0 E0 c

4.24.1 Thomson Scattering

Thomson scattering is a light wave scattering off of a single free electron. In the weak field limit,
the electric force on the electron dominates, so its equation of motion is:

mr¨0 = −eE0 ei(k0 ·r0 −ωt) (4.24.7)

This motion creates a current density, which in turn radiates. Using the radiation vector potential
to find Erad , we can calculate that:
2
dσ e2

= |k̂0 × ê0 |2 = re2 (1 − |k̂ · ê0 |2 ) (4.24.8)
dΩ 4π0 mc2

e2
Not coincidentally, re is the classical electron radius: re = 4π0 mc2
. This result was derived
assuming a polarized incident wave. However, we are generally interested in the unpolarized
cross section. This can be calculated by averaging over two polarization basis vectors:

2
dσ 1 X
= r2 (1 − |k̂ · êm |2 ) (4.24.9)
dΩ unpol 2 m=1 e

In this case:
dσ⊥
= re2 (1 − cos2 (0)) = re2 (4.24.10)
dΩ
While
dσk
= re2 (1 − sin2 θ) = re2 cos2 θ (4.24.11)
dΩ
Remember that θ is measured from the z-axis. Therefore:

dσ 1
= re2 (1 + cos2 θ) (4.24.12)
dΩ unpol 2

4.24.2 Rayleigh Scattering

Rayleigh scattering is scattering of a plane wave off of a small (compared to the wavelength)
dielectric or conducting object. In this scenario, the radiation of the object is primarily time
harmonic electric and magnetic dipole radiation. Thus we can write:
2
µ0
  
4
Erad = (EE1 + EM 1 )|r̂=k̂ = ω k̂ × (k̂ × p) + k̂ × m (4.24.13)
4πr

So that, then:
2  2
dσ k02

= k̂ × (k̂ × p) + k̂ × m (4.24.14)
dΩ 4π0 E0 c
4.41
Zangwill pg. 777. Zangwill uses the frequency space representation: you could also use the time basis.

130
For a non-magnetic (m = 0) simply polarizable object where p = α0 E0 ê0 :
 2 2
dσ k0 α
= (1 − |k̂ · ê0 |2 )2 (4.24.15)
dΩ 4π

Even without integrating, we can see immediately that σ ∝ k04 ∝ λ−4 . This strong dependence
of the cross section on wavelength is the reason that the sky is blue during the day and red
at sunset. During the day, the light we see coming from the sky is reflected sunlight, which
means it is dominated by low wavelength (blue) light. At sunset, when we look straight through
the atmosphere towards the sun, blue light is preferentially scattered away from us and we see
predominantly long wavelength (red) light that is not scattered.

4.24.3 The Born Approximation

Calculating the polarization and currents within a polarization material in an external field
is rigorously an iterative process. The incident field creates some polarization, which in turn
cancels out part of the field, changing the polarization etc. ad infinitum.
In the Born approximation we will truncate this process after the first iteration. Subse-
quent steps are negligible when the object is only weakly polarization, so the Born Approxi-
mation is valid in this limit.
Note that this is more general than Rayleigh scattering, which assumed that the object was
both weakly polarizable AND small compared to the wavelength. The Born approximation
makes no assumption about the size of the object.
If the scattering object is simply polarizable, then P = 0 χEinc . Assuming the incident field
is a plane wave, the polarization current is:

j = −iω0 χ(ω)Einc = −iω(ω)0 χê0 E0 ei(k0 ·r−ωt) (4.24.16)

Plugging this into the formula derived earlier for the differential cross section of a time harmonic
current (eq. 4.24.6)4.42 :
 2 2 2
dσscatt k
Z
0
= 0
|k̂ × ê0 |2 d3 r0 χ(ω)ei(k0 −k)·r (4.24.17)
dΩ 4π

|k̂ × ê0 |2 sets the same angular dependence as found in the previous sections. Often, it is easiest
to evaluate this integral by introducing a new variable:

q = k − k0 (4.24.18)

Where k0 is the wave vector of the incident wave, and k is the wave vector of the scattered wave.
You can then solve the problem in terms of q and substitute back in at the end.
That means that this cross section again represents only the cross section a polarized incident
wave. In order to find the unpolarized cross section, we again carry out the averaging procedure:
4.42
Zangwill writes χ(r, ω), but I’m not terrible concerned with situations where the permitivity varies within the
object

131
2
1 X
|k̂ × ê0 |2 → (1 − |k̂ · êm |2 ) (4.24.19)
2 m=1

If the electric susceptibility is constant with respect to ω (the low frequency limit, as inertia of
charges in the material do not come into play), this simplifies to:
 2 2
dσscatt k 0
= |k̂ × ê0 |2 (V χ)2 (4.24.20)
dΩ 4π

4.24.4 The Optical Theorem

Consider4.43 a plane wave ψinc incident on an object. Far away, the total wave in space ψ looks
like:
eikr
ψ(r) ≈ eikr + f (θ) (4.24.21)
r
Higher order terms in r−1 may exist, but are negligible far away. Also far away, we can
approximate r by a Taylor series:
q
x2 + y 2
r= x2 + y 2 + z 2 ≈ z + (4.24.22)
2z

Substituting these into ψ and taking the mod square to find the intensity:

f (θ) ik (x2 +y2 ) f ∗ (θ) −ik (x2 +y2 ) |f (θ)|2


|ψ|2 ≈ 1 + e 2z + e 2z + (4.24.23)
z z z2

In the far field approximation we are working in, the final z −2 term is negligible. Since
c + c∗ = 2Re(c) for any complex number c, we can then write:

f (θ) ik (x2 +y2 )


 
2
|ψ| ≈ 1 + 2Re e 2z (4.24.24)
z

Now, we want to find the total intensity incident on some surface area A in the x-y plane:
|ψ|2 dxdy. This is easy enough if we assume that the sheet occupies most of the solid angle
R
R R∞
above the source, so that we can approximate the integrals as dx → −∞ dx. If we additionally
approximate θ ≈ 04.44 , the integrals become:

f (0) ∞ ik x2
Z  Z Z ∞ 
ik 2
|ψ|2 dxdy ≈ A + 2Re e 2z dx e 2z y dy
z −∞ −∞
f (0) 2πiz
 
= A + 2Re (4.24.25)
z k

=A− Im(f (0))
k
4.43
Derivation due to Wikipedia: http://en.wikipedia.org/wiki/Optical_theorem
4.44
These two assumptions are somewhat contradictory. They would make the most sense in a limit where the
screen actually occupies only a small solid angle, but for some reason most of the wave is concentrated in this area.

132
|ψ|2 dA = A. Therefore, the total4.45 cross
R
If there was no object in the way, we would expect
section of the object must be:

σtot = Im(f (0)) (4.24.26)
k

5 Thermodynamics
5.1 The Laws of Thermodynamics
5.1.1 The Zeroth Law

The so-called zeroth law of thermodynamics states that if two bodies are in thermal equilibrium
with a third body, then all three are together in equilibrium. This is so eminently reasonable
that they didn’t want to make it a “4th Law of Thermodynamics", so it became the zeroth.

5.1.2 The First Law

The first law of thermodynamics is a simple statement of conservation of energy. Since objects
that contain heat can do work, it follows that heat is a form of energy. This is normally
summed up with an energy conservation law:

∆U = Q − W (5.1.1)

Where W is the work done by the system, and as such represents energy lost, while Q is heat
added to the system.

5.1.3 The Second Law

The second law states that equilibrium states correspond to states of maximum entropy.
This means that, in any naturally occurring process, entropy must be increasing. This modern
statement is equivalent to two historical statements that are still sometimes useful5.1 :

• Lord Kelvin:
A transformation whose only final result is to transform work into heat extracted from a
source which is at the same temperature throughout is impossible.

• Clausius:
If heat flows by conduction from body A to another body B, then a transformation whose
only final result is to transfer heat from B to A is impossible.

5.1.4 The Third Law

The entropy of a perfect crystal at 0K has zero entropy.


4.45
Including both scattering and any absorption.
5.1
Copied here from pg. 30-31 of Fermi’s Thermodynamics.

133
5.2 Ideal Gases
An ideal gas is a theoretical gas that experiences no inter-particle interactions. Its behavior is
characterized by the ideal gas equation:

P V = N kT (5.2.1)

Where N stands for the number of molecules in the gas, and k is the Boltzmann constant
(1.38 × 10−23 K
J
).
Chemists (obsessed as they are with large quantities of stuff) tend to rewrite N k as nR,
J
where n is the number of moles of gas, and R is the universal gas constant 8.314 K·mol . The two
definitions are perfectly equivalent.
Another fundamental property of an ideal gas (or any simple thermodynamics system) is
that it may be described completely using only three parameters (or two, once the ideal gas law
has been applied). For example, the state of an ideal gas can be specified by V , P , and T . The
ideal gas law then allows us to express these three in terms of only two.
For an ideal gas in particular, internal energy is only a function of T, so we can write U (T ).

5.2.1 Velocity of Particles in an Ideal Gas

Figure 5.2.1: Source: Schroder’s Thermal Physics, pg. 10

The RMS velocity of a molecule in an ideal gas can be calculated using a simple thought
experiment (Fig. 5.2.1).
Imagine a molecule trapped in a box with a piston as one side. The molecule exerts an
average pressure on the side of the box:

F̄x,onpiston −F̄x,onmolecule −m ∆v
∆t
x
P̄ = = = (5.2.2)
A A A
2L
∆t is the time between collisions, which can be reasonably written as ∆t = vx . At the wall,
∆vx = −2vx (since the molecule turns completely around). Plugging in, we obtain

−m(−2vx ) mvx2
P̄ = = (5.2.3)
A( 2L
vx )
V

This can be re-written as P̄ V = mvx2 . If we now consider an ideal gas of N such particles (such

134
that vx now becomes v̄x : the average over all the particles velocities), the ideal gas law gives us:

N kT = N mv̄x2 (5.2.4)

or

kT = mv̄x2 (5.2.5)

Now, the same argument applies in the y and z directions, so 3kT = mv̄ 2 . Therefore:

q s
3kT
vrms = v¯2 = (5.2.6)
m
This is the average speed of a molecule in an ideal gas.

5.2.2 Energy of an Ideal Gas

Since an ideal gas experiences no inter-particle interactions, its energy is simply a sum of the
kinetic energies (also called “thermal energies") of each of its constituent molecules.
The thermal energy of a single molecule depends on its degrees of freedom, f . For a single
degree of freedom, the equipartition theorem of statistical mechanics says that:

1
KE = kT (5.2.7)
2
Therefore, the energy for f degrees of freedom is:

f
KE = kT (5.2.8)
2
A monoatomic molecule moving back and forth in 1D has f = 1; the same molecule in 3D has
f = 3. Allow a diatomic molecule to rotate adds two degrees of freedom (one for each axis
around which rotation is permitted). Allowing vibration in a diatomic molecule adds another 2
degrees of freedom5.2 . Therefore, diatomic gasses have f = 7.
The thermal energy of an ideal gas of N particles is then given by:

f
Uthermal = N kT (5.2.9)
2

5.3 The PV Plane


Work done by the expansion or contraction of a gas can be readily written as W = P dV .
Therefore, it is often convenient to graph thermodynamic processes as functions P (V ) on a P vs.
V plane, such that the area under a given line on the graph represents the work done by that
segment of the process.
5.2
According to Schroeder, one is for the kinetic energy of the vibration, and the other for the potential energy.

135
Figure 5.3.1: Two thermal process graphed on the PV plane. Source: Wikipedia.

5.4 Isotherms and Adiabats


Gases that expand or contract with certain conditions held constant follow particular curves on
the PV plane and are described easily by certain mathematical expressions.

5.4.1 Isotherms

Isothermic expansion or contraction occurs when the volume of a gas (and thus its pressure)
changes while the temperature is held constant. The ideal gas law tells us that, since for
such a process N kT is constant:

P V = constant (5.4.1)
1
Therefore, the graph of an isotherm on the PV plane goes as P (V ) ∝ V (see Fig. 5.3.1).
The ideal gas law can also be used to calculate the work done across an isotherm:
Z Vb Z Vb
nkT V2
 
W = P dV = dV = nkT Log (5.4.2)
Va Va V V2
Notice how the above process depends on the fact that T is constant with respect to V.
Since the energy of an ideal gas is ONLY a function of its temperature ( U (T )) and NOT its
volume, ∆U = 0 for an isotherm. This implies in turn that:

Q = ∆W (5.4.3)

5.4.2 Adiabats

Adiabatic expansion or compression is defined by the absence of heat transfer: Q = 0. Under


this condition, we have from the 1st law that dU + dW = dQ = 0. Since U (T ) is only a function
of T, this is equivalent to:

∂U
dT + P dV = 0 (5.4.4)
∂T

136
Eliminating P through the ideal gas law and rearranging,

∂U 1 nk
 
dT + dV = 0 (5.4.5)
∂T T V
∂U
∂T happens to be the heat capacity at constant volume CV . However, since for an ideal gas
U (T ) = f2 nkT :

∂U f
= nk (5.4.6)
∂T 2
So this expression can be rewritten as

f 1 1
 
dT + dV = 0 (5.4.7)
2 T V
f
Integrating (and moving 2 to the V term) yields:

2
Log(T ) + Log(V ) = constant (5.4.8)
f
Then, exponentiation finally leaves:

2
TV f = constant (5.4.9)

This is one form of the adiabat law. In order to be graphed on the PV plane, we can rewrite this
expression in terms of P and V using the ideal gas law (absorbing nk into the constant side):

2
+1
PV f = constant (5.4.10)

This exponent is known as the adiabatic index, γ, and is normally rewritten as:

2 f +2
γ= +1= (5.4.11)
f f
It also happens that:

Cp
γ= (5.4.12)
CV
In terms of this exponent, then, the adiabat law can be rewritten in a number of useful ways:

P V γ = constant (5.4.13)

T V γ−1 = constant (5.4.14)

1−γ
TP γ = constant (5.4.15)
1
The first of these equations tells us that P (V ) ∝ Vγ. Since in general γ > 1, adiabats are
steeper than isotherms. This difference allows the construction of adiabatic/isothermal cyclic
engines like the Carnot cycle.

137
5.5 Heat Engines
A heat engine is a device that takes in heat from a high-temperature reservoir, turns some into
work, and dumps the rest into a low-temperature reservoir. We will also discuss refrigeration
cycles, which function on a similar principle: taking outside work to move heat from a low-
temperature reservoir to a high temperature reservoir.

5.5.1 Efficiency and Efficiency Limits

The concept of efficiency seeks to quantify how “good" a particular engine is. This is most
naturally represented by the ratio between the benefit of a cycle and its cost:

W
ηE = (5.5.1)
Qh
Where Qh is the heat transferred from the hot reservoir (and the subscript E distinguishes this
from the efficiency for a refrigeration cycle, which is discussed next). Since W = Qh − Qc by
conservation of energy, this can be readily written as:

Qh − Qc Qc
ηE = =1− (5.5.2)
Qh Qh

For a refrigeration cycle, we have a different balance of priorities:

Qc
ηR = (5.5.3)
W
Again using W = Qh − Qc , we can rewrite this as:

1
ηR = Qh
(5.5.4)
Qc −1

Now, the second law states that the net entropy of a closed system can only increase or stay
constant (never decrease). Therefore, in an engine, the entropy discharged into the cold reservoir
must be greater or equal to the entropy taken from the hot reservoir:

Sc ≥ Sh (5.5.5)
Q
Since S = T, this means that

Qc Qh
≥ (5.5.6)
Tc Th
or, rearranged,

Qc Tc
≥ (5.5.7)
Qh Th

138
Qc
The greatest possible engine efficiency occurs at the smallest possible Qh , so the highest efficiency
possible for an engine is:

Tc
ηC = 1 − (5.5.8)
Th
C here stands for Carnot, since this efficiency is theoretically achieved by the Carnot Cycle.
A similar entropy argument can be made for refrigeration cycles. In this case, the condition
is reversed (since the flow of heat is reversed):

Sc ≤ Sh (5.5.9)

Which, by the same steps, leads to the maximum possible refrigeration efficiency:

1
ηR,max = Th
(5.5.10)
Tc −1

5.5.2 The Carnot Cycle

Figure 5.5.1: The Carnot Cycle. Source: Wikipedia

The Carnot Cycle5.3 is probably the simplest heat engine cycle. It consists of two isotherms,
connected by adiabats. The four legs of the cycle are as follows:

1. 1 to 2 (Isotherm): The system absorbs heat from the hot reservoir, and the working
gas expands. This is the first half of the out-stroke of the engine.
5.3
Here’s a great article about the Carnot cycle from Oberlin College: http://www.oberlin.edu/physics/
dstyer/P111/Carnot.pdf

139
2. 2 to 3 (Adiabat): The system is removed from the hot reservoir and allowed to cool.
The working gas continues to expand, now adiabatically.

3. 3 to 4 (Isotherm): The system is placed in contact with the cold reservoir, causing the
working gas to contract. This does negative work, but is necessary to return the system to
the beginning of the cycle.

4. 4 to 1 (Adiabat): The system is removed from the cold reservoir and warms up slightly
as it decreases even more in volume until uniting again with the top isotherm.

During each isotherm, dU = 0 (since U (T )), and therefore Q = W . Work is also done on the
adiabatic stretches of the cycle, however it turns out that W2→3 = W4→1 , so the net work done
by the cycle is just:

∆W = W1→2 + W3→4 (5.5.11)

Where W3→4 is negative.

5.6 Intensive vs. Extensive Quantities


An intensive quantity is one that does not change if you change the size of the system. Examples
are temperature, density, etc.
An extensive quantity DOES change if you change the size of the system. Examples are
mass, and volume.
An intensive quantity can be constructed by taking the ratio of two extensive quantities, the
result of which is scale-invariant.
Thermodynamic systems usually (always?) are described by two intensive and two extensive
quantities.

5.7 The Chemical Potential


The chemical potential is the quantity that is equal when two systems (that are allowed to
exchange particles) are in equilibrium. In other words, in a state of equilibrium:

µA = µB (5.7.1)

The chemical potential can be defined with respect to any of the thermodynamic potential
energies:
∂F ∂G ∂H ∂U
µ= = = = (5.7.2)
∂N T,V ∂N T,P ∂N S,P ∂N S,V

Out of these, perhaps the most often used is:

∂F
µ= (5.7.3)
∂N T,V

Roughly speaking, the chemical potential represents the amount of energy required to introduce
a new particle into the system.

140
• If µ > 0, it takes energy to add particles to the system, and new particles will have to be
’forced’ in.

• If µ < 0, the system is ’accepting’ of new particles, and they will join spontaneously.

• If µ = 0, it costs no energy to either leave or enter the system. A practical example is


photons in a black body, which can freely enter or exit the gas.

If we imagine that the bath of particles ’outside’ the system is at µ = 0, then this leads to
the sensible conclusion that particles move from regions of higher chemical potential to regions
of lower chemical potential.

5.8 Thermodynamic Potentials


A thermodynamic potential is a scalar function that completely describes a system. Just like
potential energies, only differences in thermodynamic potentials are physically meaningful.
Equilibrium states are often found by minimizing the internal energy, U. However, under dif-
ferent conditions, it can be more convenient to instead minimize one of the other thermodynamic
potentials (which will also yield an equilibrium state).
Most generally, each of these potentials includes a dN term to account for systems where
the particle number is not fixed. If the number is fixed, this term is of course zero, which is
often how the relations are expressed in textbooks.

5.8.1 The Internal Energy (or Fundamental Thermodynamic Relation


X
dU = T dS − pdV + µi dNi (5.8.1)
i

The internal energy is the energy it would take to create a system out of nothing. The −pdV
term represents work that would have to be done against the external atmosphere to create
room for the system.
The primary importance of the thermodynamic potentials is that they easily generate many
useful partial derivative expressions. For example, from the fundamental thermodynamic relation:

∂U
• Constant volume → T = ∂S V,N

5.8.2 The Enthalpy

H = U + pV (5.8.2)

Applying the fundamental thermodynamic relation for dU :

X
dH = T dS + V dP + µi dNi (5.8.3)
i

The enthalpy represents the energy that goes into creating a system excluding work done against
the external constant pressure on the system. The first law of thermodynamics states that:

U = Q − P ∆V + Wother (5.8.4)

141
Where the P V term is work done against the atmosphere (pushing a piston, etc.), and Wother is
other work. For the enthalpy, then:

H = Q + Wother (5.8.5)

Therefore the enthalpy only changes due to added heat or other forms of work, excluding the
expansion of the system. This makes enthalpy a natural quantity to use for recording heat
capacities etc. that are independent of the value of the external pressure.
For example, cooking directions often implicitly depend on the amount of energy required
to boil water. This energy includes both the energy to boil the water, and some extra work to
account for the increased volume occupied by the water vapor. This means that the amount
of energy required varies based on atmospheric pressure: hence why cooking times must be
adjusted at higher altitudes.
If instead cooking times could somehow be expressed in enthalpy, the same directions would
work at any altitude (pressure).

5.8.3 The Helmholtz Potential (or Helmholtz Free Energy)

F = U − TS (5.8.6)

Applying the fundamental thermodynamic relation for dU :

X
dF = −SdT − pdV + µi dNi (5.8.7)
i

The Helmholtz free energy represents the amount of energy necessary to create a system from
nothing excluding energy that can be leeched for free from the environment. For example, somehow
conjuring an object at some non-zero temperature would require some energy. However, if the
environment is also at temperature T, this energy can be had “for free" by conduction.

5.8.4 The Gibbs Free Energy

G = U − TS + PV (5.8.8)

Applying the fundamental thermodynamic relation for dU :

X
dG = −SdT + V dP + µi dNi (5.8.9)
i

The Gibbs free energy is truly the energy required to create a system out of nothing, since it
includes BOTH the energy necessary to push the atmosphere out of the way and the heat that
can be leeched from the environment by conduction.

142
5.9 Maxwell Relations
Maxwell relations are useful thermodynamic identities produced by twice differentiating a
thermodynamic potential. In general, for a potential Φ(x1 , x2 , ...), a Maxwell relation is:

∂ ∂Φ ∂ ∂Φ
= (5.9.1)
∂xj ∂xi ∂xi ∂xj
For example, the fundamental thermodynamic relation tells us that

∂U ∂U
T = , P =− (5.9.2)
∂S V ∂V S

We will now take the V partial of left relation and the S partial of the right relation. Since it is
true that:

∂ 2 F (x, y) ∂ 2 F (x, y)
= (5.9.3)
∂x∂y ∂y∂x
We can then write:

∂T ∂P
=− (5.9.4)
∂V S ∂S V

Applying the same process to each of the other thermodynamic potentials produces the following
other potentials.

∂T ∂V
= (5.9.5)
∂P S ∂S P

∂S ∂P
= (5.9.6)
∂V T ∂T V

∂S ∂V
− = (5.9.7)
∂P T ∂T P

5.10 The Thermodynamic Square

Figure 5.10.1: The thermodynamic square. Source: Wikipedia.

The thermodynamic square is a visual mnemonic for remembering both the thermodynamic

143
potentials and the Maxwell relations. The order of the letters on the square can be easily
remembered as: “Good Physicists Have Sudied Under Very Fine Teachers ”
To construct each of the potentials, follow the following process:

1. Start on the letter of the potential you wish to write the differential identity for. For
example, to find dU , we start on U .

2. The two variables on the near side of the square will be differentials, while the ones on
the far side will not. Variables on the opposite diagonals go with one another. If one of
the NON-differential variables has a minus sign, the term picks up a minus sign (ignore
minus signs for differential elements). For example, for dU , the terms will be (−p)(dV )
and (T )(dS).

3. The final function is the sum of these terms: dU = T dS − pdV .

The square can also be used to generate the four Maxwell relations:

1. We will form a “U” shape, where the ends of the U are the numerators of the differentials
and the bends are the denominators. For example,

∂S ∂V
= (5.10.1)
∂p ∂T

2. If BOTH sides have a minus variable, that side gets a minus sign. If EACH side has one,
we ignore the minus sign. In this case:

∂S ∂V
− = (5.10.2)
∂p ∂T

3. Finally, each derivative is held constant with respect to the denominator of the OTHER
derivative:

∂S ∂V
− = (5.10.3)
∂p T ∂T p

4. The other three Maxwell relations can be generated by rotating the U by 90, 180, and 270
degrees.

There is a second thermodynamic square used for remembering the non-differential formulas of
the potentials:
In this case the interpretation is very simple: H = U + P V , G = U + P V − T S, etc.

5.11 Heat Capacity


The heat capacity of an object is the amount of heat required to raise its temperature by some
amount:

Q
C= (5.11.1)
∆T

144
Figure 5.10.2: The second thermodynamic square. Source: Schroeder pg. 151.

We will derive two expressions for the heat capacity: one for heating at constant volume and the
other for heating at constant pressure.

5.11.1 Specific Heat

The specific heat of an object is a measure of its heat capacity per unit mass: c has units of
J
K·kg . Just like the heat capacity, there are separately capacities for constant volume or pressure:
cV and cP respectively.
The definition of heat capacity is often rearranged and written in the following way with the
specific heat:

∆Q = mc∆T (5.11.2)
J cal
A potentially useful number to know is the specific heat of water: 4.186 C·g , or 1 C·g .

5.11.2 Constant Volume

The first law of thermodynamics states that U = Q − W or, taking dW = pdV for work done
against the environmental pressure as the system expands: dQ = dU + dW .
Now, assume that U (T, V ) (remember, we are free to do this since the state of the system is
entirely determined by the two intensive parameters T and P, which is related to V). Then:

∂U ∂U
dU = dT + dV (5.11.3)
∂T V ∂V T

Substituting this into the previous expression for dQ:

∂U ∂U
 
dQ = dT + p + dV (5.11.4)
∂T V ∂V T

Or, at constant volume:

dQ ∂U
CV = = (5.11.5)
dT ∂T V

145
5.11.3 Constant Pressure

Similarly to the derivation above, if we take U (T, P ), then:

∂U ∂U
dU = dT + dP (5.11.6)
∂T P ∂P T

Likewise, if V (T, P ) (as is clearly the case with the ideal gas, and is true in general for simple
systems):

∂V ∂V
dV = dT + dP (5.11.7)
∂T P ∂P T

Combining these equations with the expression dQ = dU + pdV yields:

∂U ∂V ∂U ∂V
   
dQ = +P dP + +P dT (5.11.8)
∂P T ∂P T ∂T P ∂T P

Or, at constant pressure:

dQ ∂U ∂V
 
CP = = +P (5.11.9)
dT ∂T P ∂T P

The second term here represents extra heat that is necessary to do work against the constant
pressure environment as the system expands.

5.11.4 Relationship between Heat Capacities

There is an important relationship between the heat capacities at constant pressure and volume5.4 :

∂P ∂V
  
CP − CV = T (5.11.10)
∂T V,n ∂T P,n

In particular, for an ideal gas,

Nk Nk
  
CP − CV = T = Nk (5.11.11)
V P

6 Statistical Mechanics
6.1 Multiplicity, Temperature, and Entropy
The multiplicity of a state is defined to be the number of distinguishable states that share the
same energy, and is symbolized Ω(E).
The entropy is a measure of the amount of information “stored” in a state, or equivalently
the amount of “disorder”. The entropy increases as the multiplicity increases because a greater
number of accessible states affords more combinations (disorder), and therefore requires more
information to fully describe the state of the system. Some (more mathematical) books define a
5.4
The derivation seems to be too long and annoying to include here.

146
dimensionless entropy:

σ = log Ω (6.1.1)

In general, though, the entropy is defined in terms of the Boltzmann constant k:

S = k log Ω (6.1.2)

Although a hand-waving link is made between this thermodynamic entropy and the Shannon
entropy of information theory, the mathematical link between these two concepts is more
tenuous6.1 .
The temperature of a system is defined by the derivative of the entropy to ensures that the
thermal equilibrium of two systems occurs when their temperatures are equal and the combined
entropy of the systems is maximized:

1 ∂S
= (6.1.3)
T ∂E
To save notation, we also introduce a convenient expression:

∂ log Ω 1
β= = (6.1.4)
∂E kT

6.2 Density of States


We will almost always discuss systems in terms of their energy states. However, many systems
have multiple states with the same energy (degeneracy). This fact becomes important when
applying statistics that rely on the total number of states. We therefore define the density of
states to be the number of degenerate energy states of an energy E. In terms of the number of
states, then:

dΩ
g(E) = (6.2.1)
dE

6.2.1 Density of Quantum States in Phase Space

In classical mechanics, a particle in a volume with position r and momentum p can occupy
an infinite number of degenerate states. However, quantum mechanics insists that only states
of quantized momentum and position can actually be occupied. The uncertainty principle6.2
(approximately6.3 ) states that:

dpdx ≈ h (6.2.2)
6.1
For an interesting book-length analysis of this topic, see “A Farewell to Entropy: Statistical Thermodynamics
Based on Information” by Arieh Ben-Naim.
6.2
I’ve heard that this isn’t the most rigorous derivation of the density of states, but I’m not sure why.
6.3
I’m not sure why a factor of π is traditionally neglected here, but it seems to always be left off.

147
So therefore the smallest possible volume of a cell in six-dimensional6.4 x-p phase space is:

dp3 dx3 ≈ h3 (6.2.3)

The density of states can then be written:

dp3 dx3
g(E)dE = (6.2.4)
h3

For systems with spin-based degeneracy (such as fermions in a Fermi gas, like electrons in a
metal), an additional factor of 2 should also be included.
It might be easy to confuse g(E) and ns (the occupancy of a state s). ns tells you how many
particles there are IN a state s (with energy Es ), while g(E) tells you how many states there
ARE with energy less than E.

6.2.2 Example: Density of States for a 3D Square Well

This example illustrates a different way of calculating the density of states.


The density of states per unit volume can be expressed:

dN dN dk
g(E) = = (6.2.5)
dk dk dE

N (k) can be found by imagining a 3D k phase space. For a square well, the volume taken up by
 3
π
each cell in such a phase space is L . The number of possible states for a given |k| is then:
¯
 3
L 1 4 3
N (k) = 2 × × × πk (6.2.6)
π 8 3

1
Where the factor of 2 accounts for the spin of the states, and 8 × 43 πk 3 is the volume of one
quadrant of the sphere. Thus:
 3
dN L
= πk 2 (6.2.7)
dk π
~2 k2
Now, in order to find a relationship between E and k, we can use the fact that E = 2m :

dk m
= 2 (6.2.8)
dE ~ k

2mE
Finally, since the above relationship also gives us that k = ~ , we can substitute all of these
together to get: √
28π 3 √
g(E) = m2 E (6.2.9)
h3
Notice that the only point in this derivation where the square well came into play was the size
6.4
This is obviously for three dimensional space, but the same process is easy to follow in any dimensionality.

148
of the phase space cell, so more generally we could write:
√ 3√
1 2m 2 E
g(E) = (6.2.10)
δV ~3

Where δV is the volume of a phase space cell for the system in question.
Good source: http://ecee.colorado.edu/~bart/book/book/chapter2/ch2_4.htm

6.3 Energy Distribution at Equilibrium


Consider two macroscopic systems A and A0 (all primed functions will refer to this second
system) where A0 >> A. Suppose that the total energy of the system is fixed (E + E 0 = Etot ),
and that the two systems only weakly interact (so that they may exchange energy, but have
negligible interaction energy).
If we additionally assume that the two states are in equilibrium, then system A is equally
likely to be in any state that is accessible to it, given its energy. The probability of A having an
energy E is therefore:

Ω(E)
P (E) = CΩ(E) = P (6.3.1)
E Ω(E)

Likewise, the probability of A and A0 having energies E and E 0 respectively is:

P (E, E 0 ) = CΩ(E)Ω0 (E 0 ) (6.3.2)

However, since the total energy of the system is fixed, this can be expressed in terms of the
(constant) total energy Etot :

P (E) = CΩ(E)Ω0 (Etot − E) (6.3.3)

Notice that, as E increases, Ω(E) rapidly increases, while Ω0 (Etot − E) just as rapidly decreases
(since the number of states per energy level usually increases exponentially or even as a factorial).
It therefore follows that the product, P (E) is very sharply peaked6.5 . The implication of this is
that, for a given E, the system is practically certain to reside in a well defined equilibrium
state.
We now want to find the energy value E that corresponds to this equilibrium point. Of
∂P
course, this can be found by considering ∂E = 0. However, for mathematical convenience, we
will instead (equivalently) maximize the logarithm of P (E):

∂ log P 1 ∂P
= =0 (6.3.4)
∂E P ∂E
Since log P (E) = log C + log Ω(E) + log Ω0 (Etot − E), this becomes

∂ log Ω(E) ∂ log Ω0 (Etot − E)


+ (−1) =0 (6.3.5)
∂E ∂E
6.5
See proof in Kittel and Kroemer, pg. 18-19.

149
Then in terms of β at equilibrium:

β(E) = β(E 0 ) (6.3.6)

Since β = ∂S
∂E , this implies that, at equilibrium, S + S 0 is maximized. Equivalently, then, when
two systems are at equilibrium,


(S + S 0 ) = 0 (6.3.7)
∂E
And therefore T = T 0 .

6.4 The Microcanonical Ensemble


The microcannonical ensemble describes a system with a fixed total energy, number of particles,
and volume. It is assumed that all of the states accessible to the system at this fixed
energy are equally probable. Therefore, if there are N such accessible states,

1
P = (6.4.1)
N

6.5 The Canonical Ensemble


The Canonical Ensemble represents a system in thermal equilibrium with a much larger system
that acts as a heat reservoir. The system can exchange energy, but not particles with the heat
reservoir.

6.5.1 Probability of States

Suppose that the small system is in a definite state, Es . We want to find the probability of this
occurring, which is given (eq. 6.3.3) as:

P (Es ) = CΩ(Es )Ω0 (Etot − Es ) = CΩ0 (Etot − Es ) (6.5.1)

Ω(Es ) = 1 since the state of the small system is defined. In the limit where Es << Etot , which
is satisfied in the canonical ensemble;

∂ log Ω0
 
0 0
log Ω (Etot − Es ) ≈ log Ω (Etot ) − Es (6.5.2)
∂E 0 Es =0
 
∂ log Ω0
Since ∂E 0 = β 0 |Es =0 = β 0 , and at equilibrium β 0 = β since T 0 = T :
Es =0

log Ω0 (Etot − Es ) ≈ log Ω0 (Etot ) − βEs (6.5.3)

So therefore, by taking the exponential:

Ω0 (Etot − Es ) = Ω0 (Etot )e−βEs (6.5.4)

150
Since Ω0 (Etot ) is constant, this allows us to write the probability P (Es ) as:

P (Es ) = Ce−βEs (6.5.5)


R
Where C is an constant determined by the fact that Cds = 1.
This probability distribution is the origin of the Boltzmann Factor, which is:

e−βEs (6.5.6)

6.6 The Grand Canonical Ensemble


The Grand Canonical Ensemble6.6 represents a system that is able to exchange both heat and
particles with a heat and particle reservoir.
If A is a small system, and A0 a heat reservoir:

E + E 0 = Etot
(6.6.1)
N + N 0 = Ntot
By analogous arguments to those for the canonical ensemble, the probability of observing A in a
particular state characterized by Er and Nr is:

P (Er , Nr ) = CΩ0 (Etot − Er , Ntot − Nr ) (6.6.2)

Which leads, under the approximation that E 0 ≈ Etot and N 0 ≈ Ntot , to:

∂ log Ω ∂ log Ω
log Ω0 (Etot − Er , Ntot − Nr ) = log Ω0 (Etot , Ntot ) − − (6.6.3)
∂E 0 Er =0 ∂N 0 Nr =0

As before, we have

∂ log Ω
β= (6.6.4)
∂E 0 Er =0

But now, analogously:

∂ log Ω
α= (6.6.5)
∂N 0 Nr =0

We will then define the chemical potential to be

α
µ = −kT α = − (6.6.6)
β
So

log Ω0 (Etot − Er , Ntot − Nr ) = log Ω0 (Etot , Ntot )e−β(Er −µNr ) (6.6.7)


6.6
This is possibly the most pretentious name for anything in all of physics. The only difference between this
and the Canonical Ensemble is that it accounts for changes in particle number. Does that really deserve the title
“grand”?

151
Which, by taking the exponential, allows us to write the probability neatly as:

P (Er , Nr ) = Ce−β(Er −µNr ) (6.6.8)

Where, as usual, the constant C is set by C −1 =


R
P (Er , Nr )dEr dNr = 1.

6.7 The Boltzmann Factor and Boltzmann Equation


The Boltzmann factor is the name given to the exponential part of the probability distribution
in the canonical ensemble:

e−βEs (6.7.1)

This factor is only meaningful as a ratio: it gives information about the relative but not absolute
probabilities of certain states6.7 . However, this means the factor by itself can be used to find the
relative probabilities of two states in thermodynamic equilibrium.
Suppose a system of many particles in which two possible energies for an individual particle
are EA and EB . If the density of states for a particle is D(E), then the ratio of Boltzmann
factors tells us that the ratio between the number of particles in each state (Ni = N (Ei )) is:

NA D(EA ) −β(EA −EB )


= e (6.7.2)
NB D(EB )
This result is known as the Boltzmann Equation.

6.8 The Partition Function


The partition function describes the probability distribution of states of different energies. The
partition function takes different forms in different ensembles.

6.8.1 The Canonical Partition Function

The canonical partition function applies when describing a system within the canonical ensemble.
For classical discrete systems, we define:

e−βEs
X
Z= (6.8.1)
s
1
Where β = kT and Es is the energy of the sth state. If the spectrum of states is continuous, the
sum approaches an integral as long as we include the density of states to weight it:

1
Z Z Z
−βE
Z= g(E)dEe = 3 e−βH(p,q) d3 qd3 p (6.8.2)
h

Where in the last step we have substituted in an expression for the density of states. The factors
of h3 depends on the dimensionality (h3 in 3D).
6.7
This is why the absolute magnitude of the partition function is meaningless: physical quantities expressed in
terms of Z must be somehow normalized, usually by a factor of Z1 .

152
For the quantum mechanical discrete case, the energy is replaced by the Hamiltonian:

Z = Tr e−β Ĥ

(6.8.3)

For continuous states (again substituting in the density of states explicitly)

1
Z
Z= 3 hq, p|e−β Ĥ |q, pid3 qd3 p (6.8.4)
h

6.8.2 The Grand Canonical Partition Function

When working in the grand canonical ensemble (allowing for particle exchange), the discrete
classical partition function is:

e−β(Es −µNs )
X
Z= (6.8.5)
s

6.8.3 Combining Partition Functions

Suppose two systems 1 and 2 are weakly interacting and distinguishable. Then the total system
energy is Etot = E1 + E2 . Exploiting the properties of exponentials, this means we can rewrite
the partition function of the system as follows:

e−β(Er +Es ) = e−βEr e−βEs


X X X
(6.8.6)
r,s r,s r,s

Or, therefore
Ztot = Z1 Z2 (6.8.7)

Equivalently:
log Ztot = log Z1 + log Z2 (6.8.8)

If N systems are identical but distinguishable, and Z1 is the partition function for one of the
systems, then:
Ztot = Z1N (6.8.9)

If the N identical systems are indistinguishable, we need to account for double counting, which
leaves:
1 N
Ztot = Z (6.8.10)
N! 1
Or, utilizing the Sterling approximation:

log Ztot = N log Z1 − N log N (6.8.11)

6.8.4 Ideal Gas Partition Function

The partition function for an ideal gas in three dimensions in the canonical ensemble can be
found by first writing the partition function for a single particle:

1 βp2 V βp2 V
Z Z
3
Z1 = e− 2m d3 pd3 x = 4πp2 e− 2m dp = (2mπkT ) 2 (6.8.12)
h3 h3 h3

153
Since an ideal gas is made up of identical, indistinguishable particles, the total partition function
is then:
log Z = N log Z1 − N log N (6.8.13)

6.8.5 The Partition Function and Thermodynamics

The partition function contains all relevant information about a system, and therefore can
be used to express the variables of thermodynamics. Suppose that a canonical ensemble is
characterized by a temperature (β) and a single external parameter x, such that Z(β, x). Then6.8

∂ log Z ∂ log Z
d log Z = dx + dβ (6.8.14)
∂x ∂β
∂ log Z
As previously derived, ∂β = −E. Now, all we need to do is rewrite the first term.
The macroscopic work done on the system in a state r can be expressed as:

∂Er
∆x Er = dx (6.8.15)
∂x
Pn ∂Er
The average work done on a system of n external parameters can be written dW = i=1 − ∂xi dxi ,
so if we set n=1 and explicitly evaluate the time average
 
e−βEr − ∂Er
P
r ∂x dx
dW = −∂E
∂x
r
= (6.8.16)
e−βEr
P
r
 
P −βEr − ∂Er
= − β1 ∂Z
Since re ∂x dx ∂x , we can pull the Boltzmann factor and sum past the

derivative to obtain E allows us to use the above relation between E and Z. The denominator is
just Z, so we can write:

∂ log Z
βdW = dx (6.8.17)
∂x
This is the expression for the first term in the original equation that we sought. Plugging it in:

d log Z = βdW − Edβ (6.8.18)

This can equally well be rewritten as:

d log Z = βdW − d(Eβ) + βdE (6.8.19)

Collecting derivatives and using the first law, dQ = dW + dE

d(log Z + βE) = β(dW + dE) = βdQ (6.8.20)

Since dQ = T dS, we can then integrate both sides to produce

S = kT log Z + βE = k(log Z + βE) (6.8.21)


6.8
The following derivation is from Reif pg.215.

154
So we can also rewrite F :

F = E − T S = −kT log Z (6.8.22)

6.9 Using the Partition Function


6.9.1 Average Energy and Fluctuations in Energy

The partition function has the form:

e−β
X
Z= (6.9.1)

Which differs only slightly from the equation for the average energy:

e−β
P
hEi = P −β (6.9.2)
e

We can therefore cleverly write:


1 ∂Z
hEi = − (6.9.3)
Z ∂β
Or, even more cleverly:

hEi = − log Z (6.9.4)
∂β

Taking a second derivative yields:

∂2
hE 2 i = log Z (6.9.5)
∂β 2

This is useful, because the ’fluctuations’ in E (i.e. the standard deviation of the E distribution)
are given by:
h(∆E)2 i = hE 2 − 2hEiE + hEi2 i = hE 2 i − hEi2 (6.9.6)

However, we can rearrange the equation for hE 2 i:


2
∂2 ∂ 1 ∂Z 1 ∂Z ∂hEi
  
hE 2 i = 2
log Z = + =− + hEi2 (6.9.7)
∂β ∂β Z ∂β Z 2 ∂β ∂β

Which means that:


∂2
h(∆E)2 i = log Z (6.9.8)
∂β 2

Or (if we already have an expression for hEi):


h(∆E)2 i = − hEi (6.9.9)
∂β

(See Reif pg. 213 for more details)

155
6.9.2 Average State Occupancy and Fluctuations

We can play a similar trick to that of the previous section in order to determine the number of
particles in a given energy state. Since the partition function contains a term of e−βs for each
state with energy s , and since:

∂ −βi −βe−βi i=s
e = (6.9.10)
∂s 0 i 6= s

we see that we can cleverly count the number of such states:

−1 1 ∂Z −1 ∂
hns i = = log Z (6.9.11)
β Z ∂s β ∂s

It is important to note that hns i = ns . We will often just drop the average brackets, since
obviously we don’t have access to the actual ns function!
Similarly to the previous section, we can use this fact to also find the fluctuations in the
particle number:
1 ∂2
h(∆ns )2 i = log Z (6.9.12)
β 2 ∂2s

Or:
1 ∂
h(∆ns )2 i = − hns i (6.9.13)
β ∂s

See Reif pg. 336-337 for more details.

6.10 The Equipartition Theorem


The Equipartition theorem states that every quadratic degree of freedom in a system’s
Hamiltonian, has an average energy of 21 kT in thermal equilibrium. This can be easily
proven classically using the Boltzmann distribution.
Consider a single quadratic degree of freedom, H = Ax2i . The partition function for this
degree of freedom is:
Z ∞
π
r
−βAx2i
Zi = e dxi = (6.10.1)
−∞ βA
The total partition function of a system of f such independent degrees of freedom is

 f2
π

Z = Πi Zi = (6.10.2)
βA
The average energy of this partition function is therefore

∂ f ∂ π  f 1
hEi = − log Z = log(β) − log = (6.10.3)
∂β 2 ∂β A 2β

156
And thus

1
hEi = kT (6.10.4)
2

6.11 Quantum Statistics: Bose, Fermi, and Boltzmann Distributions


In classical mechanics, particles are considered to be distinguishable from one another, while in
quantum mechanics this is not the case. This has profound consequences on statistical mechanics
when counting the number of states. Consider the following two states, where each quantum
number represents one particle:

Ψ1 = h...Q, ...P, ...i, Ψ2 = h...P, ...Q, ...i (6.11.1)

In quantum mechanics, these two states are identical (since the particles are indistinguishable),
while in classical mechanics they are distinct.

(Ψ1 = Ψ2 )quantum , (Ψ1 6= Ψ2 )classical (6.11.2)

At the same time, quantum mechanics also differentiates between two types of particles; Bosons
(integer spin) and Fermions (half-integer spin). For bosons, the wavefunction is symmetric under
the interchange of two particles. This reiterates the requirment that:

Ψ1 = Ψ2 (6.11.3)

For fermions, however, the wavefunction is antisymmetric under interchange of two particles.
Therefore, we additionally require that:

Ψ1 = −Ψ2 (6.11.4)

Therefore, we see that for a Ψ where both fermions occupy the same state Q:

Ψ3 = (..., Q, ...Q, ...) = 0 (6.11.5)

This is the Pauli Exclusion Principle: no two fermions may occupy exactly the same
quantum state.

6.11.1 The Average Number of Particles

The average number of particles in a given state s can be calculated using the partition function
P
for the system. Consider the energy of the system in some configuration R, where N = r nr
particles can be in one of r energy states r . The energy of a given configuration is:

X
ER = n1 1 + n2 2 + ... = nr r (6.11.6)
r

157
In the canonical ensemble, the probability of finding the system in this state is:

1 −β(n1 1 +n2 2 +...)


PR = e (6.11.7)
Z
The partition function for the entire system is then:

e−βER = e−β(n1 1 +n2 2 +...)


X X
Z= (6.11.8)
R R

Therefore, the average number of particles in any given energy state s can be given by averaging
over the occupancy of that state for every possible configuration of the system:

ns e−β(n1 1 +n2 2 +...)


P
R
ns = P −β(n1 1 +n2 2 +...)
(6.11.9)
Re

Some clever rearrangement yields:

1 ∂Z
ns = − (6.11.10)
βZ ∂s
or
1 ∂ log(Z)
ns = − (6.11.11)
β ∂s

This equation allows us to find the distribution of particles given the partition function.
Another quantity often defined is the dispersion of the number of particles in s:

(∆ns )2 = (ns − ns)2 = n2s − ns2 (6.11.12)

This can be nicely recast using the partition function6.9 :

1 ∂ 2 log(Z) 1 ∂ns
(∆ns )2 = 2 2
=− (6.11.13)
β ∂s β ∂s
Further statistical arguments allow us to write ns more simply for systems that obey particular
quantum statistics. For systems with indistinguishable particles, the state R is completely
determined by the numbers (n1 , n2 , ...), so the sums over R in the general equation can be
replaced by sums over all possible combinations of nr , which we will symbolize as the set
P
n1,n2,... :

P −β(n1 1 +n2 2 +...)


n1,n2,... ns e
ns = P −β(n1 1 +n2 2 +...)
(6.11.14)
n1,n2,... e

We can rearrange this equation (using properties of exponentials) to sum over one particular ns
first. We will symbolize the set of all possible combinations of nr excluding ns as n1 , n2 , ...¬ns .
 
e−βns s e−β(n1 1 +n2 2 +...)
P P
ns ns n1 ,n2 ,...¬ns
ns =   (6.11.15)
e−βns s −β(n1 1 +n2 2 +...)
P P
ns n1 ,n2 ,...¬ns e

6.9
Rief pg. 336.

158
Notice that the second sums in the numerator and denominator of eq. 6.11.15 do not in general
cancel. Fixing the total number of particles,

X
= nr = N (6.11.16)
r
P
Given this requirement, the acceptable values of nr for the sum n1 ,n2 ,...¬ns are dependent on
the current value of ns . Therefore they cannot be pulled out of the sum to cancel. The exception
to this rule is photon statistics, where the particle number is allowed to fluctuate.
Equation 6.11.15 can now be further reduced to an expression without sums by making
further assumptions about the statistics of the system.

6.11.2 The Fermi-Dirac Distribution

For Fermions, we require that

nr = 0, 1 ∀r (6.11.17)

This obviously holds for r = s as well. Therefore, we can explicitly carry out the ns sum in
eq. 6.11.15 over ns = 0, 1:

0 + e−βs Zs (N − 1) 1
ns = = (6.11.18)
Zs (N ) + e−βs Zs (N − 1) Zs (N ) βs
Zs (N −1) e +1

Where, for compactness, we have introduced the following notation for the partition function of
N − ns remaining particles, excluding ns :

X X
Zs (N − ns ) ≡ eβ(n1 1 +n2 2 +...) , s.t. nr = N − ns (6.11.19)
n1 ,n2 ,...¬ns r6=s

If 1 << N (or, more generally, ∆N << N ), a Taylor expansion of log Zs (N − ∆N ) becomes:

log Zs (N − δN ) = log Zs (N ) − αs δN (6.11.20)


∂ log Zs
Where αs = ∂N . For large N, we approximate Zs to be slowly varying with s, such that
αs ≈ α. Combining these approximations allows us to write:

Zs (N )
= eα (6.11.21)
Zs (N − 1)
So that

1
ns = (6.11.22)
eα+βs +1
Customarily, this is rewritten by using the fact that F = −kT log Z to rewrite α as α =
1 ∂F µ
− kT ∂N = − kT = −βµ. The result is the Fermi-Dirac Distribution:

1
ns = (6.11.23)
eβ(s −µ) +1

159
Notice that 0 ≥ ns ≥ 1, which is required by our assumption that nr = 0, 1.

6.11.3 The Bose-Einstein Distribution

For bosons, the total number of particles is conserved and the particles are still indistinguishable.
However, there is now no restriction on the maximum value of nr :

nr = 0, 1, 2, ... (6.11.24)

Coorespondingly, the general ns equation (eq. 6.11.15) is now (utilizing the definition of Zs from
the Fermi-Dirac section):

0 + e−βs Zs (N − 1) + 2e−2βs Zs (N −2) + ...


ns = (6.11.25)
Zs (N ) + e−βs Zs (N − 1) + e−2βs Zs (N − 2)
Applying the N >> ∆N approximations from the Fermi-Dirac section, this becomes

Zs (N ) 0 + e−βs e−α + 2e−2βs e−2α + ... ns (α+βs )


  P
ns ns e
ns =  = P (6.11.26)
Zs (N ) 1 + e−βs e−α + 2e−2βs e−2α + ... ns (α+βs )

ns e

For simplicity, let zs = α + βs . Then

∂ ns zs
ns ens zs
P
ns e
P
ns ∂zs ∂ X
ens zs

ns = P ns zs
= = log (6.11.27)
ens zs
P
ns e ns ∂zs ns

But the sum inside the log is a simple series (allowing us to evaluate the sum):

X 1
ens zs = (6.11.28)
ns 1 − e−βs

So

1
ns = (6.11.29)
eβ(s −µ) −1

6.11.4 The Planck Distribution (Photons)

For the special case of photons, the number of particles is allowed to fluctuate. A similar
derivation to those for the Fermi and Bose distributions can be made using this fact (Reif pg.
339). However, we can also simply observe that, in this case, the partition function for the
remaining states when ns is specified, Zs , is independent of the total N , so:

∂ log Zs
αs = =0 (6.11.30)
∂N
and therefore

1
ns = (6.11.31)
eβs −1

160
6.11.5 The Maxwell-Boltzmann Distribution

In the classical case of distinguishable particles, any set of $(n_1, n_2, \ldots)$ can be rearranged in $\frac{N!}{n_1! n_2! \cdots}$ combinations (instead of just one in the quantum case). This means that the partition function is:

$$Z = \sum_{n_1,n_2,\ldots} \frac{N!}{n_1! n_2! \cdots} e^{-\beta(n_1\epsilon_1 + n_2\epsilon_2 + \ldots)} \qquad (6.11.32)$$

Utilizing

$$\bar{n}_s = -\frac{1}{\beta}\frac{\partial \log Z}{\partial \epsilon_s} \qquad (6.11.33)$$

The Maxwell-Boltzmann Distribution is:

$$\bar{n}_s = N\frac{e^{-\beta\epsilon_s}}{\sum_r e^{-\beta\epsilon_r}} \qquad (6.11.34)$$
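As a quick numerical sanity check of these distributions, the classical limit is easy to verify. The following is a minimal sketch of my own (not from any reference above), working in the dimensionless variable $x = \beta(\epsilon_s - \mu)$; it shows the Fermi-Dirac and Bose-Einstein occupancies both collapsing onto the Maxwell-Boltzmann per-state form $e^{-x}$ when $x \gg 1$:

import numpy as np

x = np.linspace(1.0, 10.0, 50)        # x = beta*(eps_s - mu), dimensionless
n_FD = 1.0 / (np.exp(x) + 1.0)        # Fermi-Dirac (eq. 6.11.23)
n_BE = 1.0 / (np.exp(x) - 1.0)        # Bose-Einstein (eq. 6.11.29)
n_MB = np.exp(-x)                     # Maxwell-Boltzmann per-state occupancy

# In the classical limit x >> 1, all three distributions agree:
print(np.abs(n_FD - n_MB).max(), np.abs(n_BE - n_MB).max())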

6.12 Black Body Radiation


All bodies spontaneously emit radiation. Near thermodynamic equilibrium, the spectrum of this radiation is described by the Bose-Einstein distribution, since photons are bosons.
A black body is a perfect absorber of radiation. For example, an enclosure with a small
pinhole will absorb virtually all light that enters (since light would have to reflect many, many
times to find its way back out through the pinhole). In this idealized situation, the properties of
the photon gas (radiation) emitted by the black body are described exactly by the Bose-Einstein
distribution with the energy $E = \hbar\omega$. From this fact, we can derive the characteristic energy
spectrum for black body radiation (an important discovery in the early history of quantum
mechanics).6.10
Throughout this section, it is useful to keep in mind the picture of a black body as a box
with a pinhole, and to draw an analogy between photons escaping the blackbody box through
the pinhole, and a heated molecular gas escaping a box through a pinhole.

6.12.1 Planck’s Law

We will begin by writing the average energy of a photon gas in three dimensions6.11 :

$$U(T) = \frac{1}{h^3}\int \frac{\epsilon \, d^3p \, d^3x}{e^{\beta\epsilon} - 1} \qquad (6.12.1)$$

The x integral yields a volume, V. Since the momenta of the photons are randomly oriented, we can simplify the integral by substituting $d^3p = (2)4\pi p^2 \, dp$. The additional factor of 2 must be included, since one photon of each of the two polarizations can inhabit a single k-state.

$$U(T) = \frac{V}{h^3}\int \frac{\epsilon \, (8\pi p^2) \, dp}{e^{\beta\epsilon} - 1} \qquad (6.12.2)$$
6.10
This section mostly follows Wikipedia, but another useful resource is: http://disciplinas.stoa.usp.br/pluginfile.php/48089/course/section/16461/qsp_chapter10-plank.pdf
6.11
Three dimensions is important, since it sets the dimensionality of the integrals and the factor of h out front.

Since $\epsilon$ is a function of p, we need to change to a common variable. Since the final result is traditionally expressed in terms of frequency, we will choose that. Since $p = \hbar k$, $k = \frac{\omega}{c}$, and $\omega = 2\pi f$:

$$p = \frac{2\pi\hbar f}{c} = \frac{hf}{c} \qquad (6.12.3)$$

Also,

$$\epsilon = \hbar\omega = 2\pi\hbar f = hf \qquad (6.12.4)$$

So, changing variables and canceling some h's:

$$U(T) = \frac{8\pi V h}{c^3}\int \frac{f^3 \, df}{e^{\beta h f} - 1} \qquad (6.12.5)$$

We will use the integrand later, so we'll give it its own symbol:

$$u(f, T) = \frac{8\pi V h}{c^3} \frac{f^3}{e^{\beta h f} - 1} \qquad (6.12.6)$$

The spectral radiance6.12 (power per unit surface area per solid angle per hertz, or joules per square meter) of a pinhole on the black body is6.13:

$$I(f, T) = \frac{u(f, T)\,c}{4\pi V} \qquad (6.12.7)$$

Which leaves us with Planck's Law:

$$I(f, T) = \frac{2hf^3}{c^2}\frac{1}{e^{\beta h f} - 1} \qquad (6.12.8)$$

Remember that $I(f, T)$ has units of watts per m$^2$ per steradian per hertz, or joules per m$^2$, making it a spectral radiance.

6.12.2 The Stefan-Boltzmann Law

Suppose we now want to know the total power radiated by a section of black body surface area,
A. We will include all frequencies, but only integrate over half of a solid angle, since we are
interested in radiation coming out of the black body (picture the black body as a flat sheet).
$$P = A\int_0^\infty \int_{\text{half sphere}} I(f, T) \, df \, d\Omega \qquad (6.12.9)$$

However, there is a subtle point here. Black bodies are Lambertian 6.14 , which means that the
intensity they radiate in a given direction is proportional to the cosine of the angle between the
given direction and the surface normal. If we imagine our black body sheet in the XY plane,
6.12
Warning: be wary of terms like intensity, radiance, etc. All are used loosely, but each has a separate meaning
(representing different dimensions). A list (I don’t know about a GOOD one) of the distinctions can be found at
https://en.wikipedia.org/wiki/Radiance#SI_radiometry_units
6.13
This equation is the same as for a gas escaping a heated container, except all photons move with speed c.
6.14
https://en.wikipedia.org/wiki/Lambert%27s_cosine_law

then:

$$I(f, T, \theta) = I(f, T)\cos(\theta) = \frac{2hf^3}{c^2}\frac{\cos(\theta)}{e^{\beta h f} - 1} \qquad (6.12.10)$$
Accounting for this fact:

$$P = \frac{2hA}{c^2}\int_0^\infty \frac{f^3}{e^{\beta h f} - 1} \, df \int_{\text{half sphere}} \cos(\theta) \, d\Omega \qquad (6.12.11)$$

In order to carry out the integration, we'll first change variables to make the frequency integral dimensionless. Making the substitution $x = \beta h f$:

$$P = \frac{2A(kT)^4}{c^2 h^3}\int_0^\infty \frac{x^3}{e^x - 1} \, dx \int_0^{\pi/2} \cos(\theta)\sin(\theta) \, d\theta \int_0^{2\pi} d\phi \qquad (6.12.12)$$

The x integral term is complicated6.15, but evaluates to:

$$\int_0^\infty \frac{x^3 \, dx}{e^x - 1} = \frac{\pi^4}{15} \qquad (6.12.13)$$

Carrying out the integrals produces the Stefan-Boltzmann Law:

$$P = A\left(\frac{2\pi^5 k^4}{15 c^2 h^3}\right) T^4 \qquad (6.12.14)$$

The law is greatly simplified by labeling the constant in parentheses as the Stefan-Boltzmann Constant:

$$\sigma = \frac{2\pi^5 k^4}{15 c^2 h^3} = \frac{\pi^2 k^4}{60 \hbar^3 c^2} \qquad (6.12.15)$$

So that the law becomes simply:

$$P = \sigma A T^4 \qquad (6.12.16)$$

This equation can be extended to more general circumstances beyond black body radiation. For an imperfect radiator with emissivity $e$, $0 < e < 1$ ($e = 1$ for a black body), radiating into surroundings with a temperature $T_c < T$, the net radiated power is:

$$P = e\sigma A(T^4 - T_c^4) \qquad (6.12.17)$$
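A short numerical sketch (mine, with assumed example numbers) of eqs. 6.12.15-6.12.17 in SI units:

import numpy as np

k, h, c = 1.380649e-23, 6.62607015e-34, 2.99792458e8   # SI constants
sigma = 2 * np.pi**5 * k**4 / (15 * c**2 * h**3)       # eq. 6.12.15
print(sigma)                                           # ~5.67e-8 W m^-2 K^-4

# Net power from an imperfect (e = 0.9, assumed) 1 m^2 radiator at 500 K
# into 300 K surroundings, via eq. 6.12.17:
e, A, T, Tc = 0.9, 1.0, 500.0, 300.0
print(e * sigma * A * (T**4 - Tc**4))                  # watts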

6.12.3 Energy Density of a Black Body Photon Gas

The average energy density in the photon gas is given by integrating the energy distribution function, u(f, T), over all frequencies to find the total energy:

$$U(T) = \int_0^\infty u(f, T) \, df = \frac{8\pi V h}{c^3}\int_0^\infty \frac{f^3 \, df}{e^{\beta h f} - 1} \qquad (6.12.18)$$
6.15
This is a Bose-Einstein integral, but I’d recommend not worrying about integrals that are hard enough to
have names.

Solving the frequency integral the same way as in the previous section, we see that the average energy density in the photon gas is:

$$\frac{U(T)}{V} = \frac{8\pi^5 (kT)^4}{15(hc)^3} \qquad (6.12.19)$$

This expression can be simplified by using the Stefan-Boltzmann constant. However, note that this is NOT equivalent to the Stefan-Boltzmann Law, which applies to radiated power, rather than internal energy:

$$\frac{U(T)}{V} = \frac{4\sigma T^4}{c} \qquad (6.12.20)$$

6.13 Fermi-Gases
6.13.1 The Fermi Energy

When a boson gas is in its ground state (T=0), all particles are in the same energy level, and the
total energy of the gas is zero. The Pauli exclusion principle prevents fermions from occupying
this same configuration. Instead, at T=0 a fermion gas will have a non-zero energy, called the
Fermi Energy.
The following discussion holds in general for any spin-ful fermions. However, the most
important application of this material is the description of electrons within a conducting
material: the so called “electron gas”.
The Fermi distribution gives the mean occupancy of a state as a function of energy:

$$\bar{n}_s = \frac{1}{e^{\alpha + \beta\epsilon_s} + 1} \qquad (6.13.1)$$

For convenience, let's define a constant: $\mu = \frac{-\alpha}{\beta}$.6.16

$$\bar{n}_s = \frac{1}{e^{\beta(\epsilon_s - \mu)} + 1} \qquad (6.13.2)$$

At some non-zero temperature T, energy levels far below µ are essentially always occupied, while those far above µ are essentially never occupied. When $\epsilon \approx \mu$, the states have some intermediate probability of being occupied (see Figure 6.13.1).
However, when T = 0, this middle region disappears. The Fermi distribution becomes:

$$\bar{n}_s = \frac{1}{e^{\infty \cdot (\epsilon_s - \mu)} + 1} \qquad (6.13.3)$$

This function switches abruptly from 1 to 0 at $\epsilon = \mu$. This is exactly what we expect to happen
due to the Pauli exclusion principle: in the ground state, fermions will fill up each quantum
level, starting at the bottom, until reaching some maximum. This maximum, µ is called the
Fermi energy.
The Fermi energy corresponds to a particular momentum, and therefore a particular wave
6.16
Some books use $\mu = \epsilon_F$ instead.

Figure 6.13.1: Graph of the Fermi distribution at temperature $T \neq 0$. $F(\epsilon) = \bar{n}_s$. Source: Reif pg. 389.

Figure 6.13.2: Graph of the Fermi distribution at temperature $T = 0$. $F(\epsilon) = \bar{n}_s$. Source: Reif pg. 389.

number $k_F$:

$$\mu = \frac{\hbar^2 k_F^2}{2m} \qquad (6.13.4)$$

The total number of states is then the volume of a sphere in k-space: $\frac{4}{3}\pi k_F^3$. Since the volume of a single cell in k-space is $\frac{(2\pi)^3}{V}$, and each state can hold 2 fermions (of opposite spin):

$$N = 2\frac{V}{(2\pi)^3}\left(\frac{4}{3}\pi k_F^3\right) \qquad (6.13.5)$$

Using these two equations, we can rearrange for an expression for the Fermi energy of a given system at T=0:

$$\mu(T{=}0) = \frac{\hbar^2}{2m}\left(3\pi^2 \frac{N}{V}\right)^{2/3} \qquad (6.13.6)$$
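As a worked example (my own numbers, not from the notes): plugging the conduction-electron density of copper, $n \approx 8.5 \times 10^{28}$ m$^{-3}$, into eq. 6.13.6 recovers the familiar ~7 eV Fermi energy:

import numpy as np

hbar = 1.054571817e-34
m_e = 9.1093837015e-31
eV = 1.602176634e-19
n = 8.5e28                  # assumed conduction-electron density of copper, m^-3

mu0 = hbar**2 / (2 * m_e) * (3 * np.pi**2 * n) ** (2.0 / 3.0)  # eq. 6.13.6
print(mu0 / eV)             # ~7 eV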

6.14 Velocity Distributions
6.14.1 The Maxwell Velocity Distribution: Rigorous Derivation

Consider a single molecule in a gas with a mass m, position r, and momentum p. The energy of the particle can be subdivided into kinetic energy, $\frac{p^2}{2m}$, and internal energy, $E_{int}$. In general $E_{int} = E_{int}(\mathbf{r})$, because of interactions with other molecules. However, if we assume that the gas is sufficiently dilute that these are negligible, we can write:

$$E_{tot} = \frac{p^2}{2m} + E_{int} \qquad (6.14.1)$$
Where Eint is constant. Since the molecule is in thermal contact with the rest of the gas, we can
describe its states as a canonical ensemble. The probability of occupying a particular state, with
a particular r and p is then:

$$P(\mathbf{r}, \mathbf{p}) \propto e^{-\beta\left(\frac{p^2}{2m} + E_{int}\right)} \propto e^{-\beta\frac{p^2}{2m}} \qquad (6.14.2)$$

Where we have absorbed $e^{-\beta E_{int}}$ into the proportionality constant. In this system, r and p are continuous variables, so we can ask about the probability of finding the molecule within r ± dr, and p ± dp. In three dimensions, this is represented by the differential:

$$P(\mathbf{r}, \mathbf{p}) \, d^3r \, d^3p = C e^{-\beta\frac{p^2}{2m}} \, d^3r \, d^3p \qquad (6.14.3)$$

Conventionally, we will choose to change variables from p to v:

$$P(\mathbf{r}, \mathbf{v}) \, d^3r \, d^3v = C e^{-\beta\frac{mv^2}{2}} \, d^3r \, d^3v \qquad (6.14.4)$$

So far, this probability distribution describes just a single particle. However, multiplying both sides of this equation by the total number of particles, N, turns the probability P(r, v) into the average number of particles that occupy this state (on the RHS, N is absorbed into the constant of proportionality C). We will symbolize this average number function as f(r, v):

$$f(\mathbf{r}, \mathbf{v}) \, d^3r \, d^3v = C e^{-\beta\frac{mv^2}{2}} \, d^3r \, d^3v \qquad (6.14.5)$$

Now, the integral of this expression must equal the total number of particles in the gas:

$$\int\int f(\mathbf{r}, \mathbf{v}) \, d^3r \, d^3v = N \;\rightarrow\; C\int\int e^{-\beta\frac{mv^2}{2}} \, d^3r \, d^3v = N \qquad (6.14.6)$$

The integrand is independent of r, so $\int d^3r = V$, where V is the volume. The remaining integral yields6.17 $C = \frac{N}{V}\left(\frac{\beta m}{2\pi}\right)^{3/2}$.
If we define the number density $n = \frac{N}{V}$, we can then write the Maxwell velocity
6.17
The integral can be evaluated by calculating $\int_{-\infty}^\infty e^{-\beta\frac{m v_x^2}{2}} dv_x$ and then cubing the result.

distribution as:

$$f(\mathbf{v}) \, d^3r \, d^3v = n\left(\frac{\beta m}{2\pi}\right)^{3/2} e^{-\beta\frac{mv^2}{2}} \, d^3r \, d^3v \qquad (6.14.7)$$

Note that $f = f(v)$: the distribution is independent of r, and depends only on the magnitude of v.

6.14.2 The Maxwell Velocity Distribution: Fast Derivation

We want to calculate the number of molecules in an ideal, dilute gas (N particles in a volume V) within some velocity and position range, $f(\mathbf{v}) \, d^3r \, d^3v$. The probability of a single molecule having that speed is given by the Boltzmann factor:

$$f(\mathbf{v}) \, d^3r \, d^3v \propto e^{-\beta\frac{mv^2}{2}} \qquad (6.14.8)$$

We are integrating over vectors (v), but we want our final function to be of a scalar, f(v). Therefore, we need to think about the surface area of a sphere in velocity space, which means that:

$$f(v) \, d^3r \, d^3v \propto 4\pi v^2 \qquad (6.14.9)$$

Lastly, we know that, since the function we are looking for is a fraction of the total number of particles:

$$\int\int f(v) \, d^3r \, d^3v = CN\int\int 4\pi v^2 e^{-\beta\frac{mv^2}{2}} \, d^3r \, dv = N \qquad (6.14.10)$$

Where C, the normalization coefficient, can now be determined by evaluating the integral. If we define the number density $n = \frac{N}{V}$, we recover:

$$f(\mathbf{v}) \, d^3r \, d^3v = n\left(\frac{\beta m}{2\pi}\right)^{3/2} e^{-\beta\frac{mv^2}{2}} \, d^3r \, d^3v \qquad (6.14.11)$$

6.14.3 Maxwell Distribution of a Single Velocity Component

Given the Maxwell distribution, we may wish to find the number of molecules in a gas with a particular velocity component, say $v_x$:

$$g(v_x) \, dv_x = \left[\int_{v_y}\int_{v_z} f(\mathbf{v}) \, dv_y \, dv_z\right] dv_x \qquad (6.14.12)$$

Or, evaluating the integral:

$$g(v_x) \, dv_x = n\left(\frac{m}{2\pi kT}\right)^{1/2} e^{-\frac{m v_x^2}{2kT}} \, dv_x \qquad (6.14.13)$$

6.14.4 Maxwell Speed Distribution

Given the Maxwell distribution, we might want to find the number of molecules with a given speed, regardless of direction. If we visualize the Maxwell distribution as existing in a 3D spherical coordinate velocity space, this function is just the integral over the surface area of a sphere:

$$F(v) \, dv = 4\pi v^2 f(v) \, dv \qquad (6.14.14)$$

Or, plugging in the Maxwell distribution:

$$F(v) \, dv = 4\pi n v^2 \left(\frac{\beta m}{2\pi}\right)^{3/2} e^{-\beta\frac{mv^2}{2}} \, dv \qquad (6.14.15)$$
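A minimal Monte Carlo sketch (mine, assuming an argon-like gas at 300 K): eq. 6.14.13 says each velocity component is Gaussian with variance kT/m, so sampling components and taking magnitudes should reproduce the mean speed $\bar{v} = \sqrt{8kT/\pi m}$ quoted in the next section:

import numpy as np

rng = np.random.default_rng(0)
k, T, m = 1.380649e-23, 300.0, 6.63e-26        # assumed: argon-like mass, 300 K

v = rng.normal(0.0, np.sqrt(k * T / m), size=(100_000, 3))  # components per eq. 6.14.13
speeds = np.linalg.norm(v, axis=1)             # |v|, distributed per eq. 6.14.15

print(speeds.mean(), np.sqrt(8 * k * T / (np.pi * m)))      # should agree closely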

6.15 Kinetic Theory


Kinetic theory applies statistical mechanics (particularly the Maxwell velocity distribution)
to analyze the motion of individual molecules in an ideal gas to make predictions about the
behavior of the whole.

6.15.1 Number of Particles Hitting a Surface

Figure 6.15.1: Cylinder of molecules that will strike an area dA. Source: Reif pg. 271.

Suppose we have an ideal gas in an enclosure. Consider a particular velocity class of molecules with velocity v at an angle θ to the surface normal of the wall. The molecules of that class that will hit a surface area dA in a time dt are those contained in a cylinder (shown in Fig. 6.15.1) with sides |v|dt long and slanted at an angle θ.
The volume of a slanted cylinder is V = (base)(height), just like a parallelogram in 2D. The base here is dA, while the height is |v|dt cos(θ). A similar (but obviously different) cylinder could be drawn for every other |v| and θ class. The Maxwell velocity distribution f(v) tells us the number of molecules per unit volume in each |v| state. Therefore, the number of molecules impacting dA in dt is:

$$n(\mathbf{v}) \, d^3v = v \, dt \, f(\mathbf{v})\cos(\theta) \, dA \, d^3v \qquad (6.15.1)$$

We will then define Φ(v) (which we could call the 'number intensity' in analogy to light intensity) to be the number of molecules hitting the surface, per dA, per dt:

$$\Phi(\mathbf{v}) \, d^3v = v f(\mathbf{v})\cos(\theta) \, d^3v \qquad (6.15.2)$$

In order to calculate the total number of particles hitting dA per dt, we need to integrate Φ(v). However, we should not include particles whose velocities are going away from the wall! We avoid this by requiring that $v_z > 0$ on our integration bounds. In spherical velocity coordinates6.18, then, $0 < \theta < \frac{\pi}{2}$.
Notice that we do not need to change the bounds of the v integral: the factor of v we added determines which speed classes should be counted. All we have to do is not count particles that won't hit the wall at all.
The resulting integrals then become:

$$\Phi_{tot} = \int_0^\infty v f(v) \, v^2 \, dv \int_0^{\pi/2} \cos(\theta)\sin(\theta) \, d\theta \int_0^{2\pi} d\phi \qquad (6.15.3)$$

This equation is easy to remember if you see that it is the same as the average velocity, except with the added cos(θ) and some changes to the integration bounds. In fact, explicitly applying the Maxwell velocity distribution and integrating now yields a simple expression in terms of the average velocity of the gas molecules:

$$\Phi_{tot} = \frac{1}{4} n\bar{v} \qquad (6.15.4)$$

Where n is the number density. Since we know $\bar{v} = \sqrt{\frac{8kT}{\pi m}}$ for an ideal gas, for that case we can write:

$$\Phi_{tot}(n, m, T) = \frac{1}{4} n\sqrt{\frac{8kT}{\pi m}} \qquad (6.15.5)$$

6.15.2 Effusion Through an Aperture

The discussion in the previous section can be easily applied to determine the number of particles passing through an aperture of area A in a time interval dt:

$$\frac{\text{Number of Particles}}{dt} \, dv = A\left(f(v) \, v\cos(\theta)\right) v^2 \, d\Omega \, dv \qquad (6.15.6)$$

The rate of effusion (total number of particles per dt), R, can then be found by integrating as above over bounds restricted such that the velocities are all pointing in the right direction:

$$R = A\int_0^\infty f(v) v^3 \, dv \int_0^{\pi/2} \cos(\theta)\sin(\theta) \, d\theta \int_0^{2\pi} d\phi \qquad (6.15.7)$$
6.18
Don’t just blindly add a 4πv 2 ! The extra cos(θ) means we need to actually do the θ integral, rather than just
jump to 4π.

Clearly, then:

$$R(n, m, T) = A\Phi_{tot} = \frac{nA}{4}\bar{v} = \frac{nA}{4}\sqrt{\frac{8kT}{\pi m}} \qquad (6.15.8)$$

Consider two boxes (A and B), filled with different gases, with a small pinhole of area A in the wall that separates them. The aperture is clearly the same size for each box. The ratio between the effusion rates of the two gases is known as Graham's Law:

$$\frac{R_A}{R_B} = \frac{n_A\sqrt{T_A/m_A}}{n_B\sqrt{T_B/m_B}} \qquad (6.15.9)$$
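For a feel of the magnitudes, here is a quick sketch (all numbers my own assumptions) of eq. 6.15.8 for a nitrogen-like gas at standard conditions escaping through a 1 mm² pinhole:

import numpy as np

k, T, m = 1.380649e-23, 300.0, 4.65e-26   # assumed N2-like molecular mass
P = 101325.0                              # pressure, Pa
n = P / (k * T)                           # ideal-gas number density
A = 1e-6                                  # 1 mm^2 aperture

vbar = np.sqrt(8 * k * T / (np.pi * m))
R = n * A * vbar / 4                      # eq. 6.15.8
print(R)                                  # ~3e21 molecules per second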

6.16 The Saha Equation


The Saha equation6.19 is an expression for the percentage of ionization due to thermal collisions
for a gas at equilibrium (so that the canonical ensemble applies). The gas is assumed to be
weakly ionized (short Debye length), so that there is negligible ion shielding.
Consider two ionization states: an effective “ground state”6.20 (n) and an ionized state
(n + 1) + e− that includes a free electron.
For two subsequent energy levels, the Boltzmann equation (eq. 6.7.2) tells us that:

$$\frac{N_{n+1}}{N_n} = \frac{D(E_{n+1} + e^-)}{D(E_n)} e^{-\beta(E_{n+1} - E_n)} \qquad (6.16.1)$$
We can explicitly include the density of states for the free electron (eq. ??):

$$D(E_{n+1} + e^-) = D(E_{n+1})\frac{4\pi p^2 \, dp}{\rho_{e^-} h^3} \qquad (6.16.2)$$
Where $\rho_{e^-}$ is the density of free electrons within the gas. If the transition $n \rightarrow n+1$ takes an energy χ, then the energy difference between the states is:

$$E_{n+1} - E_n = \frac{p^2}{2m} + \chi \qquad (6.16.3)$$

Where $\frac{p^2}{2m}$ is the kinetic energy of the free electron.
Now, in terms of experimentally important variables, we don't really care about the state of the free electron. We can calculate the ratio between the states independent of $p_{e^-}$ by integrating over all p:

$$\frac{N_{n+1}}{N_n} = \frac{D(E_{n+1})}{D(E_n)}\frac{2}{\rho_{e^-} h^3}\int_0^\infty 4\pi p^2 e^{-\beta\left(\frac{p^2}{2m} + \chi\right)} \, dp \qquad (6.16.4)$$

Evaluating the integral leaves us with the Saha equation:

$$\frac{N_{i+1}}{N_i} = \frac{(2\pi m k T)^{3/2}}{\rho_{e^-} h^3}\frac{2D(E_{i+1})}{D(E_i)} e^{-\beta\chi} \qquad (6.16.5)$$
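A rough numerical sketch (my own assumptions throughout): evaluating the Saha equation for hydrogen (χ = 13.6 eV), taking the degeneracy factor $2D(E_{i+1})/D(E_i) \approx 1$ (a good approximation for ground-state hydrogen) and a fixed free-electron density, shows the sharp onset of thermal ionization around 10⁴ K:

import numpy as np

k = 1.380649e-23
h = 6.62607015e-34
m_e = 9.1093837015e-31
eV = 1.602176634e-19

chi = 13.6 * eV        # hydrogen ionization energy
rho_e = 1e20           # assumed free-electron density, m^-3

def saha_ratio(T):
    """N_(i+1)/N_i, eq. 6.16.5, with the degeneracy factor taken as 1."""
    return (2 * np.pi * m_e * k * T) ** 1.5 / (rho_e * h**3) * np.exp(-chi / (k * T))

for T in (5000.0, 10000.0, 20000.0):
    print(T, saha_ratio(T))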
6.19
Good PDF: http://www.astro.princeton.edu/~gk/A403/ioniz.pdf
6.20
Effective because this need not be the actual ground state.

7 Quantum Mechanics
7.1 The Bra-ket Notation
Bra-ket notation is a shorthand used in the matrix representation of quantum mechanics. Vectors, or "kets", are column vectors that live in the Hilbert space H, and are written $|\alpha\rangle$.
Bras, or row vectors, are written $\langle\alpha|$. Bras do not live in Hilbert space, but rather in the "dual" space to H. This dual space is defined by the inner (or "dot") product. The dual space contains all elements whose inner product with an element of H is in C. If H is the space of column vectors, the dual space is the corresponding space of row vectors.

7.1.1 Tips for Working with Bra-kets


 
• Remember that $\langle A|B\rangle$ is just a number! Therefore, $|C\rangle\langle A|B\rangle = \big(\langle A|B\rangle\big)|C\rangle$.

7.1.2 Inner and Outer Products

The inner product (or "dot product") of α and β is written $\langle\alpha|\beta\rangle$. By definition, this operation produces a scalar in C.
The outer product (or "tensor product") of α and β is written $|\alpha\rangle\langle\beta|$. The outer product takes two vectors and returns a matrix.

7.1.3 Basis

Suppose that $\{|a'\rangle\}$ is a basis for H. Then the following relation, known as the closure relationship, holds:

$$\sum_{a'} |a'\rangle\langle a'| = I \qquad (7.1.1)$$

Where I is the identity matrix.

This closure relationship can be used to express an arbitrary state vector $|\alpha\rangle$ in terms of the $a'$ basis:

$$|\alpha\rangle = \sum_{a'} |a'\rangle\langle a'|\alpha\rangle \qquad (7.1.2)$$

So that $|\alpha\rangle$ can be expressed as a linear combination of the basis kets with coefficients $\langle a'|\alpha\rangle$. We can also apply the same relationship to the self-inner product of $|\alpha\rangle$:

$$1 = \langle\alpha|\alpha\rangle = \langle\alpha|\left(\sum_{a'} |a'\rangle\langle a'|\right)|\alpha\rangle = \sum_{a'} |\langle a'|\alpha\rangle|^2 \qquad (7.1.3)$$

Therefore the expansion coefficients of $|\alpha\rangle$ in the $a'$ basis must satisfy this normalization condition.
The outer product of a basis vector with itself is known as the projection operator: $\Lambda_{a'} = |a'\rangle\langle a'|$. Notice that, when applied to an arbitrary ket $|\alpha\rangle$, it returns the component of $|\alpha\rangle$ along $a'$: $|a'\rangle\langle a'|\alpha\rangle$.

An operator X can be expressed as a matrix by applying the closure relationship twice:

$$X = \sum_{a''}\sum_{a'} |a''\rangle\langle a''|X|a'\rangle\langle a'| \qquad (7.1.4)$$

The center term, $\langle a''|X|a'\rangle$, is called the matrix element of X. Since it is just a number, it can be pulled forward, leaving:

$$X = \sum_{a''}\sum_{a'} \langle a''|X|a'\rangle \, |a''\rangle\langle a'| \qquad (7.1.5)$$

This can be visualized as a matrix, indexed by $a'$ and $a''$, with the associated matrix elements as the values at each position in the matrix.
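Since kets, bras, and operators here are just column vectors, row vectors, and matrices, the relations above are easy to check numerically. A small NumPy sketch of my own (the vectors are arbitrary examples):

import numpy as np

alpha = np.array([1.0, 1j]) / np.sqrt(2)   # an example ket in a 2-d Hilbert space
beta = np.array([1.0, 0.0])

inner = np.vdot(alpha, beta)               # <alpha|beta> (vdot conjugates alpha)
outer = np.outer(alpha, beta.conj())       # |alpha><beta|, a 2x2 matrix

# Closure relation (eq. 7.1.1): summing |e><e| over an orthonormal basis gives I
basis = np.eye(2)
closure = sum(np.outer(e, e.conj()) for e in basis)
print(inner, outer.shape, np.allclose(closure, np.eye(2)))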

7.2 Complete Sets of Compatible Observables (CSCO)


Most quantum mechanical systems have enough degrees of freedom that any single observable
(i.e. operator) will be degenerate. In order to specify a unique state, a series of operators must
be applied, together removing the degeneracy. This process can be thought of as choosing basis
vectors for a vector space. If too few operators are included, then the system will be degenerate.
If too many are chosen, the system will not be self-consistent (the state will be over-determined).
In order to be simultaneously measurable, two operators must share the same eigenvectors.
We say that two operators with this property are compatible observables. A set of such operators
that is sufficient to eliminate any degeneracy in a system is referred to as a complete set of
compatible observables (CSCO).
If two operators A and B have the same eigenvectors $|a', b'\rangle$, then we see that

$$AB|a', b'\rangle = a'b'|a', b'\rangle \qquad (7.2.1)$$

But also that

$$BA|a', b'\rangle = b'a'|a', b'\rangle = a'b'|a', b'\rangle \qquad (7.2.2)$$

Since $a'$ and $b'$ are just numbers. Therefore $AB = BA$ or, equivalently:

$$[A, B] = 0 \qquad (7.2.3)$$

If this is the case, then A and B must be compatible. If on the other hand A and B are incompatible, then $[A, B] \neq 0$.
Another exercise illustrates this principle well. Consider the matrix element of the commutator [A, B] in the basis of A's eigenvectors:

$$\langle a''|[A, B]|a'\rangle = (a'' - a')\langle a''|B|a'\rangle = 0 \qquad (7.2.4)$$

Where the $|a\rangle$ vectors are the eigenvectors of the A operator. This equation shows that the matrix elements of B expressed in the $|a\rangle$ basis must be zero unless they are along the diagonal (at least when A is non-degenerate). Thus the $|a\rangle$ vectors must also be eigenvectors of B.

With this knowledge, the task of finding a CSCO is reduced to finding a set of operators
{A, B, C, ...} that all commute with one another, i.e. [A, B] = [B, C] = [A, C] = 0 etc.
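A small numerical illustration of this (my own toy example): a degenerate observable A and a second Hermitian operator B that commutes with it. Diagonalizing B resolves A's degeneracy, and B's eigenbasis diagonalizes A as well, so {A, B} acts as a CSCO here:

import numpy as np

A = np.diag([1.0, 1.0, 2.0])              # degenerate observable
B = np.array([[0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0],
              [0.0, 0.0, 3.0]])           # Hermitian, and [A, B] = 0

print(np.allclose(A @ B, B @ A))          # True: compatible observables
w, V = np.linalg.eigh(B)                  # B's eigenvectors...
print(np.allclose(V.conj().T @ A @ V, np.diag([1.0, 1.0, 2.0])))  # ...diagonalize A too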

7.3 Time Evolution, Translation, and Rotation Operators


A number of operators exist that allow us to turn one ket into another. These operators have
several important properties in common. For a given operator A:

• $A^\dagger A = 1$ (A is unitary). This property is necessary for probability conservation to be preserved under the operator7.1.

• $A(x_2, x_1)A(x_1, x_0) = A(x_2, x_0)$. (The operator A is transitive.)

When using these operators, it is often useful to remember the Taylor series of an exponential $e^x$ when $x \ll 1$:

$$e^x \approx 1 + x + \frac{x^2}{2} + \ldots + \frac{x^n}{n!} \qquad (7.3.1)$$

7.3.1 The Translation Operator

The translation operator takes a ket and translates it through space. The infinitesimal form of the operator is

$$J_x(dx') = 1 - \frac{i p_x \, dx'}{\hbar} \qquad (7.3.2)$$

For finite translations (in the x direction), this operator becomes

$$J_x(x', x'') = \exp\left(\frac{-i p_x (x' - x'')}{\hbar}\right) \qquad (7.3.3)$$

7.3.2 The Time Evolution Operator

The job of the time evolution operator, written as $U(t, t_0)$, is to take a ket and move it forwards or backwards in time.
The infinitesimal form of the time evolution operator is given by

$$U(t_0 + dt, t_0) = 1 - \frac{iH \, dt}{\hbar} \qquad (7.3.4)$$
For finite time translations, this operator can be more easily expressed as an exponential.
However, its exact form depends on the time dependence of the Hamiltonian.

• Case 1: The Hamiltonian is time-independent. The time evolution operator can then be written:

$$U(t, t_0) = \exp\left(\frac{-iH(t - t_0)}{\hbar}\right) \qquad (7.3.5)$$
This is by far the most commonly used form of the operator.
7.1
Sakurai 68.

• Case 2: The Hamiltonian is time dependent, but the H's at each time commute with one another. The time evolution operator can then be written:

$$U(t, t_0) = \exp\left(\frac{-i}{\hbar}\int_{t_0}^t H(t') \, dt'\right) \qquad (7.3.6)$$

Notice that all we have done is replace $H(t - t_0)$ with the integral $\int_{t_0}^t H(t') \, dt'$!

• Case 3: The Hamiltonian is time dependent, and the H's at different times do NOT commute. The time evolution operator can then be written:

$$U(t, t_0) = 1 + \sum_{n=1}^\infty \left(\frac{-i}{\hbar}\right)^n \int_{t_0}^t \int_{t_0}^{t_1} \cdots \int_{t_0}^{t_{n-1}} H(t_1)H(t_2)\cdots H(t_n) \, dt_1 \, dt_2 \ldots dt_n \qquad (7.3.7)$$

This horrendous operator (the Dyson series) can apparently be made more tractable using the time ordering operator. Either way, it's a horrible mess.

7.3.3 The Rotation Operator

The rotation operator takes a ket and rotates it about the origin.
The infinitesimal form of the operator is

$$D(\hat{n}, d\phi) = 1 - i\left(\frac{\mathbf{J} \cdot \hat{n}}{\hbar}\right) d\phi \qquad (7.3.8)$$

For finite rotations, the operator can be written as an exponential:

$$D(\hat{n}, \phi) = \exp\left(\frac{-i\mathbf{J} \cdot \hat{n} \, \phi}{\hbar}\right) \qquad (7.3.9)$$
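For spin 1/2, where $\mathbf{J} = \mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma}$ (see §7.9.5 below), eq. 7.3.9 can be evaluated directly by matrix exponentiation. A sketch of my own, using SciPy's expm:

import numpy as np
from scipy.linalg import expm

sigma_z = np.array([[1.0, 0.0], [0.0, -1.0]])

# D(z-hat, phi) = exp(-i Sz phi / hbar) = exp(-i sigma_z phi / 2) for spin 1/2:
phi = np.pi / 2
D = expm(-1j * sigma_z * phi / 2)

# A full 2*pi rotation gives -I: the famous spin-1/2 sign flip.
print(np.allclose(expm(-1j * sigma_z * np.pi), -np.eye(2)))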

7.4 Approaches to Quantum Mechanics (Schrodinger vs. Heisenberg )


There are two primary (equivalent) approaches to solving problems in Quantum Mechanics.

• Schrodinger Picture: In the Schrodinger picture, kets are considered to evolve in time, while operators remain constant. Over time, $|\alpha\rangle \rightarrow U(t)|\alpha\rangle$.

• Heisenberg Picture: In the Heisenberg picture, kets are constant, while operators evolve in time. Over time, $X \rightarrow U^\dagger X U$.

Of course, these two pictures are mathematically identical when written in bra-ket notation, because as $|\alpha\rangle \rightarrow U|\alpha\rangle = |\alpha, t\rangle$ and $|\beta\rangle \rightarrow U|\beta\rangle = |\beta, t\rangle$, the matrix element of some operator X becomes

$$\langle\alpha, t|X|\beta, t\rangle = \langle\alpha|U^\dagger X U|\beta\rangle \qquad (7.4.1)$$

Where, of course, $U^\dagger X U$ is just the time-evolved operator X(t) in the Heisenberg picture.

7.5 The Schrodinger Equation
The famous Schrodinger equation is a direct consequence of our definition of the time-evolution
operator. Consider the difference

$$U(t + dt, t_0) - U(t, t_0) \qquad (7.5.1)$$

Since U is transitive, we know that

$$U(t + dt, t_0) = U(t + dt, t)\,U(t, t_0) = \left(1 - \frac{iH \, dt}{\hbar}\right)U(t, t_0) \qquad (7.5.2)$$

So

$$U(t + dt, t_0) - U(t, t_0) = -\frac{iH}{\hbar}\,U(t, t_0) \, dt \qquad (7.5.3)$$

However, from the definition of the derivative, we know that

$$\frac{\partial}{\partial t} U(t, t_0) = \frac{U(t + dt, t_0) - U(t, t_0)}{dt} \qquad (7.5.4)$$

Therefore, we can rewrite this difference to obtain the Schrodinger equation:

$$i\hbar\frac{\partial}{\partial t} U(t, t_0) = H\,U(t, t_0) \qquad (7.5.5)$$
∂t
In the Schrodinger picture, the time-evolved state corresponding to $|\alpha\rangle$ is $U|\alpha\rangle$, with wavefunction Ψ. Therefore, applying each side of the Schrodinger equation to some ket, we obtain:

$$i\hbar\frac{\partial}{\partial t}\Psi = H\Psi \qquad (7.5.6)$$

Or, writing out the Hamiltonian and momentum operators explicitly:

$$i\hbar\frac{\partial}{\partial t}\Psi = \left(\frac{-\hbar^2}{2m}\nabla^2 + V(x)\right)\Psi \qquad (7.5.7)$$

7.5.1 Spherical Solutions to the Schrodinger Equation

Spherical solutions to the Schrodinger equation are found via separation of variables:

$$\psi(r, \theta, \phi) = R(r)\Theta(\theta)\Phi(\phi) \qquad (7.5.8)$$

After separation of variables, the radial component yields the Radial Schrodinger Equation:

$$\left(\frac{-\hbar^2}{2m}\frac{\partial^2}{\partial r^2} + \frac{\hbar^2 l(l+1)}{2mr^2} + V(r)\right) u(r) = E u(r) \qquad (7.5.9)$$

Where u(r) = rR(r). Notice that this change of variables to u(r) means that our boundary
conditions change too: R(∞) = 0 is automatically enforced for oscillatory u(r), and is somewhat
complicated to enforce for other solutions.

We can now however require that u(0) = 0. Here is a sketchy proof7.2 : the wave function
must be finite and single valued at the origin, which normally translates into the boundary
condition ψ(0, θ, φ) = constant ∀θ, φ. Therefore, u(0) = 0 × constant = 0.
Also note that some authors prefer to preserve the form of the cartesian Schrodinger equation
by creating an effective potential:

$$V_{eff}(r) = V(r) + \frac{\hbar^2 l(l+1)}{2mr^2} \qquad (7.5.10)$$

So that the radial equation can then be written:

$$\left(\frac{-\hbar^2}{2m}\frac{\partial^2}{\partial r^2} + V_{eff}(r)\right) u(r) = E u(r) \qquad (7.5.11)$$

7.6 The Heisenberg Equation


The Heisenberg equation of motion is the equation of motion in the Heisenberg picture. To derive it, we start with the relation between an operator $A^{(S)}$ in the Schrodinger picture and the corresponding operator in the Heisenberg picture, $A^{(H)}$:

$$A^{(H)}(t) = U^\dagger(t) A^{(S)} U(t) \qquad (7.6.1)$$

Where we have assumed that $A^{(S)}$ is independent of time, as is usually the case. Now, we calculate the time derivative:

$$\frac{\partial}{\partial t} A^{(H)}(t) = \frac{\partial U^\dagger(t)}{\partial t} A^{(S)} U(t) + U^\dagger(t) A^{(S)}\frac{\partial U(t)}{\partial t} \qquad (7.6.2)$$

However, assuming that $U(t) = \exp\left(\frac{-iHt}{\hbar}\right)$, this expression becomes

$$\frac{\partial}{\partial t} A^{(H)}(t) = \frac{1}{i\hbar}\left(U^\dagger A^{(S)} U U^\dagger H U - U^\dagger H U U^\dagger A^{(S)} U\right) \qquad (7.6.3)$$

Now, note that by our assumption of the form of U, $[U, H] = 0$, and that per definition U is unitary, so that $U^\dagger U = 1$.
The equation then reduces to a simple commutator relationship, which is the final form of the Heisenberg equation:

$$\frac{\partial}{\partial t} A^{(H)}(t) = \frac{1}{i\hbar}[A^{(H)}, H] \qquad (7.6.4)$$
7.2
The origin and existence of this boundary condition is non-trivial, and purportedly caused Feynman some
anxiety (why should the particle have ZERO probability of existing at the origin?!). Details and discussion can
be found here: http://arxiv.org/ftp/arxiv/papers/1001/1001.3285.pdf and here http://arxiv.org/ftp/arxiv/papers/1302/1302.0839.pdf

7.7 Commutator Relations
7.7.1 The Canonical Commutation Relations

The Canonical Commutation Relations are often taken as axioms of quantum mechanics (or, if
you will, as definitions of the variables they include). They are:

$$[x_i, x_j] = 0 \qquad (7.7.1)$$

$$[p_i, p_j] = 0 \qquad (7.7.2)$$

$$[x_i, p_j] = i\hbar\delta_{ij} \qquad (7.7.3)$$

7.7.2 Classical Correspondence of Commutators

In general, the following relationship holds between quantum mechanical commutator brackets
and the corresponding classical Poisson bracket:

$$\{A, B\}_{classical} \leftrightarrow \frac{1}{i\hbar}[A, B]_{quantum} \qquad (7.7.4)$$

7.7.3 Ehrenfest’s Theorem

Ehrenfest's theorem is a relatively simple result that follows directly from the rule that $[A, BC] = [A, B]C + B[A, C]$:

$$[x, F(p)] = i\hbar\frac{\partial F}{\partial p} \qquad (7.7.5)$$

$$[p, G(x)] = -i\hbar\frac{\partial G}{\partial x} \qquad (7.7.6)$$

7.8 Dispersion and Uncertainty


7.8.1 Dispersion

The dispersion of an operator (sometimes also called the mean square deviation or variance) is defined to be:

$$\Delta A = A - \langle A\rangle \qquad (7.8.1)$$

The quantity we are most often interested in is the expectation value of the square of the dispersion:

$$\langle(\Delta A)^2\rangle = \langle A^2 - 2A\langle A\rangle + \langle A\rangle^2\rangle \qquad (7.8.2)$$

Since $\langle A\rangle$ is just a number, taking the expectation value again doesn't change anything: $\langle\langle A\rangle^2\rangle = \langle A\rangle^2$. The middle term, however, is not constant: $\langle -2A\langle A\rangle\rangle = -2\langle A\rangle^2$, again since $\langle A\rangle$ is constant. Therefore the expression becomes:

$$\langle(\Delta A)^2\rangle = \langle A^2\rangle - \langle A\rangle^2 \qquad (7.8.3)$$

This equation is by far the most useful for calculating uncertainties.

7.8.2 The Uncertainty Principle

For two operators A and B, the uncertainty principle states7.3:

$$\langle(\Delta A)^2\rangle\langle(\Delta B)^2\rangle \geq \frac{1}{4}|\langle[A, B]\rangle|^2 \qquad (7.8.4)$$

For example, for the operators x and p, where $[x, p] = i\hbar$, this produces the famous uncertainty relation:

$$\langle(\Delta x)^2\rangle\langle(\Delta p)^2\rangle \geq \frac{\hbar^2}{4} \qquad (7.8.5)$$

7.9 Spin 1/2 Systems


7.9.1 Constructing the state vectors

We will begin by assuming (without loss of generality) that the spin vector points in the Z
direction, and will introduce two kets to describe its two possible orientations: |+i and |−i, each
of which we will take to be normalized.
Measuring $S_x$ for either of these kets will yield a 50/50 split between $|S_x, +\rangle$ and $|S_x, -\rangle$, so we require that:

$$|\langle\pm|S_x, \pm\rangle| = \frac{1}{\sqrt{2}} \qquad (7.9.1)$$

We also require that $\langle S_x, \pm|S_x, \pm\rangle = 1$. Together, these two conditions allow us to write an expression for $|S_x, \pm\rangle$, specifying its coefficients up to an arbitrary phase factor, which we will choose to collect onto the second term. We can also easily construct $|S_x, -\rangle$ by noting that it must be orthogonal to $|S_x, +\rangle$.

$$|S_x, \pm\rangle = \frac{1}{\sqrt{2}}|+\rangle \pm \frac{1}{\sqrt{2}} e^{i\phi_1}|-\rangle \qquad (7.9.2)$$

$|S_y, \pm\rangle$ can be constructed in a similar manner. The coefficients are determined up to an arbitrary phase factor by requiring that:

$$|\langle S_y, \pm|S_z, \pm\rangle| = \frac{1}{\sqrt{2}} \qquad (7.9.3)$$

So our expression for $S_y$ is then:

$$|S_y, \pm\rangle = \frac{1}{\sqrt{2}}|+\rangle \pm \frac{i}{\sqrt{2}} e^{i\phi_2}|-\rangle \qquad (7.9.4)$$
7.3
For a derivation, see Sakurai 2nd edition, pg. 34-35

The arbitrary phase factors $\phi_1$ and $\phi_2$ can be eliminated by requiring that:

$$|\langle S_y, \pm|S_x, -\rangle| = |\langle S_y, \pm|S_x, +\rangle| = \frac{1}{\sqrt{2}} \qquad (7.9.5)$$

From here, inserting the kets constructed above, we see that $\phi_2 - \phi_1 = \pm\frac{\pi}{2}$. Therefore, setting one ket's phase to 0, we can rewrite both kets to eliminate the phase. The final kets for each direction (in the z basis) are then:

$$|S_z, \pm\rangle = |\pm\rangle \qquad (7.9.6)$$

$$|S_x, \pm\rangle = \frac{1}{\sqrt{2}}|+\rangle \pm \frac{1}{\sqrt{2}}|-\rangle \qquad (7.9.7)$$

$$|S_y, \pm\rangle = \frac{1}{\sqrt{2}}|+\rangle \pm \frac{i}{\sqrt{2}}|-\rangle \qquad (7.9.8)$$

7.9.2 Example: Deriving the $|\mathbf{S} \cdot \hat{n}, +\rangle$ State Ket

We start by defining the unit vector $\hat{n}$ in the usual way for spherical coordinates (with $\phi = \alpha$ and $\theta = \beta$):

$$\hat{n} = \sin(\beta)\cos(\alpha)\hat{n}_x + \sin(\beta)\sin(\alpha)\hat{n}_y + \cos(\beta)\hat{n}_z \qquad (7.9.9)$$

Of course,

$$\mathbf{S} \cdot \hat{n} = n_x S_x + n_y S_y + n_z S_z \qquad (7.9.10)$$

We know that the $\{|+\rangle, |-\rangle\}$ basis is complete, so:

$$|\mathbf{S} \cdot \hat{n}, +\rangle = a|+\rangle + b|-\rangle \qquad (7.9.11)$$

Where

$$|a|^2 + |b|^2 = 1 \qquad (7.9.12)$$

Now, we will take as given that

$$\mathbf{S} \cdot \hat{n}\,|\mathbf{S} \cdot \hat{n}, +\rangle = \frac{\hbar}{2}|\mathbf{S} \cdot \hat{n}, +\rangle \qquad (7.9.13)$$
We will evaluate the LHS explicitly, then set RHS $= |\mathbf{S} \cdot \hat{n}, +\rangle = a|+\rangle + b|-\rangle$ to solve for the constants a and b.
Using the expressions derived for $S_x$, $S_y$, and $S_z$ previously, we can expand the LHS into the $\{|+\rangle, |-\rangle\}$ basis. Matching terms then produces the following equations:

$$\left((n_x - in_y)b + n_z a\right)|+\rangle = a|+\rangle \qquad (7.9.14)$$

$$\left((n_x + in_y)a + n_z b\right)|-\rangle = b|-\rangle \qquad (7.9.15)$$

Combined with the normalization condition, the solution to this set of equations is:

$$|\mathbf{S} \cdot \hat{n}, +\rangle = \cos\left(\frac{\beta}{2}\right)|+\rangle + \sin\left(\frac{\beta}{2}\right) e^{i\alpha}|-\rangle \qquad (7.9.16)$$

Similarly, you can also find the minus eigenstate:

$$|\mathbf{S} \cdot \hat{n}, -\rangle = -\sin\left(\frac{\beta}{2}\right) e^{-i\alpha}|+\rangle + \cos\left(\frac{\beta}{2}\right)|-\rangle \qquad (7.9.17)$$

This is a VERY useful result to have memorized!
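It is also easy to verify numerically (a sketch of my own, using the Pauli matrix representation from §7.9.5 below, where $\mathbf{S} = \frac{\hbar}{2}\boldsymbol{\sigma}$) that eq. 7.9.16 really is the $+\hbar/2$ eigenket of $\mathbf{S} \cdot \hat{n}$ for an arbitrary direction:

import numpy as np

alpha, beta = 0.7, 1.1                        # arbitrary test angles (assumption)
n = np.array([np.sin(beta) * np.cos(alpha),
              np.sin(beta) * np.sin(alpha),
              np.cos(beta)])

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

Sn = (n[0] * sx + n[1] * sy + n[2] * sz) / 2  # S.n in units of hbar
ket = np.array([np.cos(beta / 2), np.sin(beta / 2) * np.exp(1j * alpha)])  # eq. 7.9.16
print(np.allclose(Sn @ ket, 0.5 * ket))       # True: eigenvalue +1/2 (i.e. +hbar/2)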

7.9.3 Deriving the Spin Operators

Operators may be assembled through a sum over their eigenvectors (Sakurai 1.3.34):

$$A = \sum_{a'} a'|a'\rangle\langle a'| \qquad (7.9.18)$$

For each of the spin states constructed in the previous section, the eigenvalues are $a' = \pm\frac{\hbar}{2}$. We can therefore easily construct the operators for the spin in each direction:

$$S_x = \left(\frac{\hbar}{2}\right)|S_x, +\rangle\langle S_x, +| + \left(\frac{-\hbar}{2}\right)|S_x, -\rangle\langle S_x, -| \qquad (7.9.19)$$

After the expressions for $|S_x, \pm\rangle$ have been substituted, expanded, and terms are allowed to cancel, the $S_x$ operator is given by:

$$S_x = \frac{\hbar}{2}\Big(|+\rangle\langle -| + |-\rangle\langle +|\Big) \qquad (7.9.20)$$

The other two operators are similarly derived:

$$S_y = \frac{\hbar}{2}\Big(-i|+\rangle\langle -| + i|-\rangle\langle +|\Big) \qquad (7.9.21)$$

$$S_z = \frac{\hbar}{2}\Big(|+\rangle\langle +| - |-\rangle\langle -|\Big) \qquad (7.9.22)$$

7.9.4 Deriving the Commutator Relationships

$$[S_i, S_j] = i\epsilon_{ijk}\hbar S_k \qquad (7.9.23)$$

$$\{S_i, S_j\} = \frac{1}{2}\hbar^2\delta_{ij} \qquad (7.9.24)$$

From the anti-commutator:

$$S^2 = S_x^2 + S_y^2 + S_z^2 = \frac{3}{4}\hbar^2 \qquad (7.9.25)$$

7.9.5 Pauli Spin Matrices for the Spin 1/2 System

Rotations in 3 dimensions make up the 'rotation group' SO(3), or Special Orthogonal Group 3. The elements of this group correspond to rotations about the origin, and contain only real valued entries.
Similarly, in quantum mechanics, rotations of spin-1/2 (two-component) systems belong to the SU(2) group. The Pauli spin matrices generate this group:

$$\sigma_1 = \sigma_x = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \qquad (7.9.26)$$

$$\sigma_2 = \sigma_y = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \qquad (7.9.27)$$

$$\sigma_3 = \sigma_z = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \qquad (7.9.28)$$

Together, these are often written as:

$$\sigma_j = \begin{pmatrix} \delta_{j3} & \delta_{j1} - i\delta_{j2} \\ \delta_{j1} + i\delta_{j2} & -\delta_{j3} \end{pmatrix} \qquad (7.9.29)$$

Conveniently, these matrices correspond to the spin operators we have previously been using:

$$S_j = \frac{\hbar}{2}\sigma_j \qquad (7.9.30)$$
Each matrix has two eigenvalues, ±1, which are often just denoted by ±. The eigenvectors of each matrix can be identified with the spin states derived earlier in this section:

• $\sigma_1 = \sigma_x$

$$\sigma_{x,+} = |S_x, +\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ 1 \end{pmatrix} \qquad (7.9.31)$$

$$\sigma_{x,-} = |S_x, -\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -1 \end{pmatrix} \qquad (7.9.32)$$

• $\sigma_2 = \sigma_y$

$$\sigma_{y,+} = |S_y, +\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ i \end{pmatrix} \qquad (7.9.33)$$

$$\sigma_{y,-} = |S_y, -\rangle = \frac{1}{\sqrt{2}}\begin{pmatrix} 1 \\ -i \end{pmatrix} \qquad (7.9.34)$$

• $\sigma_3 = \sigma_z$

$$\sigma_{z,+} = |S_z, +\rangle = \begin{pmatrix} 1 \\ 0 \end{pmatrix} \qquad (7.9.35)$$

$$\sigma_{z,-} = |S_z, -\rangle = \begin{pmatrix} 0 \\ 1 \end{pmatrix} \qquad (7.9.36)$$

The following are facts about the Pauli matrices:

• All three matrices are Hermitian.

• The square of each matrix is the identity: $\sigma_1^2 = \sigma_2^2 = \sigma_3^2 = I$.

• $\det(\sigma_i) = -1$

• $\text{Tr}(\sigma_i) = 0$

• The Pauli vector is defined to be $\boldsymbol{\sigma} = \sigma_x\hat{x} + \sigma_y\hat{y} + \sigma_z\hat{z}$

• $[\sigma_a, \sigma_b] = 2i\epsilon_{abc}\sigma_c$

• $\{\sigma_a, \sigma_b\} = 2\delta_{ab} I$.
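A quick numerical check of the facts above (my own sketch):

import numpy as np

sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

for s in (sx, sy, sz):
    assert np.allclose(s, s.conj().T)         # Hermitian
    assert np.allclose(s @ s, np.eye(2))      # squares to the identity
    assert np.isclose(np.linalg.det(s), -1)   # determinant -1
    assert np.isclose(np.trace(s), 0)         # traceless

# [sx, sy] = 2i sz, and {sx, sy} = 0 (anticommutator vanishes for a != b):
print(np.allclose(sx @ sy - sy @ sx, 2j * sz),
      np.allclose(sx @ sy + sy @ sx, np.zeros((2, 2))))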

7.10 Total Angular Momentum: $\vec{J}$


7.10.1 Getting s, S, l, L, j, and J straight

On one level, these variables are all used for the same thing: angular momenta! However, there
are some general conventions for when each variable is used, and these can occasionally be
important when you get into a problem that involves one or more.
First, when dealing with atoms, lowercase variables refer to an individual electron,
while uppercase variables refer to the entire atom or system! This convention is not
always followed with other systems (say, a bunch of particles in a box), but it is generally used
when working with atoms.
Then:

• s/S: Spin

• l/L: Angular momentum

• j/J: TOTAL Angular momentum

7.10.2 Quantum Numbers

A state of a spin-ful particle is specified by three quantum numbers, n, l, and m. n is called the
principal quantum number and (in the absence of any external fields, etc.) solely determines
the energy of the particle. It can take on any positive integer value:

$$n = 1, 2, 3, \ldots \qquad (7.10.1)$$

182
The second, s/l/j, has many names in different situations. When a particle exists on its own, it
is called s, and represents the total spin of the particle. If the particle has angular momentum
because it is undergoing circular motion, the quantum number is l, the orbital angular
momentum quantum number and represents the total angular momentum of the orbit.
Finally, if the system is a composite system, (say, two spin-ful particles, or one spin-ful particle
also undergoing circular motion) the quantum number is j, the total angular momentum,
and represents the total (vector sum) angular momentum of the system. For the orbital case:

$$l = 0, 1, 2, \ldots, n - 1 \qquad (7.10.2)$$

Finally, for any of the types of systems above, the third quantum number is the projection of the
angular momentum on the z axis. This is denoted m, and is called the magnetic quantum
number because it plays an important role in the Zeeman Effect. It ranges:

$$m = -j, -(j-1), \ldots, 0, \ldots, (j-1), j \qquad (7.10.3)$$

7.10.3 A Note on Excessive j’s in Notation

Discussions of total angular momentum necessarily involve multiple particles, each of which has its own angular momentum and magnetic quantum numbers, $j_i$ and $m_i$ respectively. To make matters messier, these must all often be written in a single ket, in order to write down the current state of a system of particles! Unfortunately, many sources use conflicting notation schemes, which gets incredibly confusing.
In this book, the normal convention will be followed that states in the total angular momentum basis will be written as:

$$|j_{total}, m_{total}\rangle \qquad (7.10.4)$$

While kets written in the individual angular momentum basis will be written:

$$|j_1, m_1\rangle_1 |j_2, m_2\rangle_2 \ldots |j_i, m_i\rangle_i \qquad (7.10.5)$$

My personal opinion is that this notation is nicer than the alternative $|j_1, j_2, \ldots j_i; m_1, m_2 \ldots m_i\rangle$, because it emphasizes the separation between the particles in this basis. The indices after each closing ket are helpful when variables are replaced with numbers, which makes the kets easy to accidentally interchange. I will include them whenever helpful for clarity.
In many problems the $j_i$'s of each particle will be given and equal, so they can be dropped from the kets for simplicity's sake.

7.10.4 Addition of Angular Momenta


1 7.4
Imagine a system of two quantum mechanical particles, each with spin 2 . If these particles
were classical, their angular momentum vectors could be easily added to produce a definite
combined angular momentum vector. However, in quantum mechanics it isn’t that easy.
7.4
The best reference for this problem is Griffiths QM, pg. 184

The general problem we have is that a given system of particles can be written in one of two CSCOs:

• The separate particles CSCO, which consists of the $J_1^2$, $J_2^2$, $J_{1z}$, and $J_{2z}$ operators. Notice how the $J_z$ operators are separate. In this CSCO, kets are specified by: $|j_1, m_1\rangle|j_2, m_2\rangle$.

• The combined particles CSCO, which consists of the $J_1^2$, $J_2^2$, $J^2$, and $J_z$ operators, where J and $J_z$ are of the TOTAL system. In this CSCO, kets are specified by: $|j_1, j_2, j, m\rangle$.

Our goal is to find a way to write a combined state, say $|1, 1\rangle$, as a linear combination of separate particle states for the two spin $\frac{1}{2}$ particles. This is easy for $|1, 1\rangle$, because there is only ONE combination of two spin $\frac{1}{2}$ particles that can produce this state!:

$$|1, 1\rangle = |\tfrac{1}{2}, \tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, \tfrac{1}{2}\rangle_2 \qquad (7.10.6)$$
Now, recall that the combined states of this system (the triplet, in descending order of m, followed by the singlet) must be:

• $|1, 1\rangle$

• $|1, 0\rangle$

• $|1, -1\rangle$

• $|0, 0\rangle$

This means that we can move from one of these states to the other by applying the raising and lowering operators! Since the two particles are independent, we can write:

$$J_\pm = J_{1,\pm} + J_{2,\pm} \qquad (7.10.7)$$

Now we can use the lowering operator, along with the ket we already managed to "translate" into the separate particle basis, to find the others!

$$J_-|1, 1\rangle = (J_{1,-} + J_{2,-})|\tfrac{1}{2}, \tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, \tfrac{1}{2}\rangle_2 \qquad (7.10.8)$$

Now, remembering that (dropping the overall factor of $\hbar$, which cancels from both sides):

$$J_\pm|j, m\rangle = \sqrt{(j \mp m)(j \pm m + 1)}\,|j, m \pm 1\rangle \qquad (7.10.9)$$

We can determine that:

$$\sqrt{2}\,|1, 0\rangle = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, \tfrac{1}{2}\rangle_2 + |\tfrac{1}{2}, \tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_2 \qquad (7.10.10)$$

Or, more properly:

$$|1, 0\rangle = \frac{1}{\sqrt{2}}\left(|\tfrac{1}{2}, -\tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, \tfrac{1}{2}\rangle_2 + |\tfrac{1}{2}, \tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_2\right) \qquad (7.10.11)$$

This IS the representation of the system state $|1, 0\rangle$ in the single particle basis! Carrying out the same process again (remembering that $J_-|\tfrac{1}{2}, -\tfrac{1}{2}\rangle = 0$) yields:

$$|1, -1\rangle = |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_2 \qquad (7.10.12)$$

Which is exactly what we should have been expecting. Now, the lowering operator won't help us break $|0, 0\rangle$ into the separate particle basis. However, we can easily guess this one too. We know that $|0, 0\rangle$ should be a linear combination of $|\tfrac{1}{2}, -\tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, \tfrac{1}{2}\rangle_2$ and $|\tfrac{1}{2}, \tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_2$, just like $|1, 0\rangle$. However, it must also be orthogonal to $|1, 0\rangle$, and still normalize to 1! The only combination (up to an overall phase) that fits all these parameters is:

$$|0, 0\rangle = \frac{1}{\sqrt{2}}\left(|\tfrac{1}{2}, -\tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, \tfrac{1}{2}\rangle_2 - |\tfrac{1}{2}, \tfrac{1}{2}\rangle_1 |\tfrac{1}{2}, -\tfrac{1}{2}\rangle_2\right) \qquad (7.10.13)$$

7.10.5 Clebsch-Gordan Coefficients

In the section above, the states of a system in the combined basis were represented as a
linear combination of states in the separate particle basis. The coefficients in this linear
combination are the Clebsch-Gordan Coefficients. The coefficients are most accurately
symbolized as:
$$C^{j_1, j_2, j}_{m_1, m_2, m} \qquad (7.10.14)$$

Where j1 , m1 and j2 , m2 are the quantum numbers of each particle in the separate basis, and
j, m are the quantum numbers of the combined state. Notice that there are no Clebsch-Gordan
Coefficients for more than 2 particles! That would be an utter nightmare. Instead, when
combining or breaking apart states with many particles, you will first break the state into halves,
then continue to break those apart until you have the individual basis.
Perhaps the easiest way to think about the Clebsch-Gordan Coefficients is as a function, like in a programming language. For example, the Mathematica command that generates Clebsch-Gordan Coefficients is:

$C^{j_1, j_2, j}_{m_1, m_2, m}$ = ClebschGordan[{j1,m1},{j2,m2},{j,m}]

You can use this function to break down any state in the combined basis you want. Say you have a system of two particles, 1 and 2, and total spins j and m. Then we know immediately that:

$$|j, m\rangle = \sum_{j_1}\sum_{j_2}\sum_{m_1=-j_1}^{j_1}\sum_{m_2=-j_2}^{j_2} C^{j_1, j_2, j}_{m_1, m_2, m}\,|j_1, m_1\rangle|j_2, m_2\rangle \qquad (7.10.15)$$

The upside of this complicated mess is that you need never perform the computation to find these coefficients (which is horrible for systems higher than $\frac{1}{2} \otimes \frac{1}{2}$): you can use the computer functions instead!
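SymPy provides an equivalent function in Python, which can play the same role as the Mathematica command above. A sketch of my own, reproducing the coefficients of eq. 7.10.11:

from sympy import S
from sympy.physics.quantum.cg import CG

half = S(1) / 2
# CG(j1, m1, j2, m2, j, m) represents <j1 m1; j2 m2 | j m>:
print(CG(half, half, half, -half, 1, 0).doit())    # sqrt(2)/2
print(CG(half, -half, half, half, 1, 0).doit())    # sqrt(2)/2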
Occasionally (on an exam, perhaps, or while time traveling in the 20th century) you may be
required to read these coefficients off of a strange and very dense table (fig. 7.10.2). Figure 7.10.1
helps explain how to decipher these tables:

Figure 7.10.1: Detail of a CGC table (for $\frac{1}{2} \otimes \frac{1}{2}$), with labels. Source:
https://www.physicsforums.com/attachments/cg-table-jpg.32635/

• Important: A square root sign is understood over every entry in the table. If the entry has a minus sign, you're meant to put that OUTSIDE of the square root. So $-\frac{1}{2} \rightarrow -\sqrt{\frac{1}{2}}$.

• The bold letters in the upper left corner (outside of the boxes) are $j_1$ and $j_2$: the j's for the individual particles. For this table $j_1 = j_2 = \frac{1}{2}$.

• The remainder of the chart is grouped into L-shaped boxes, each of which groups states
that contain the same m (total m) values.

• Each column in the top segment of the ’L’ contains j and m: the quantum numbers of a
state in the combined representation.

• Each row in the side segment of the ’L’ contains m1 and m2 for a state in the single particle
representation.

• In the middle of the ’L’ are the actual Clebsch-Gordan Coefficients.

We can read the table in two ways. Suppose we want to break down the combined state $|j = 1, m = 0\rangle$. We make two 'L's through the box with the coefficients. The first L (shaded red on the drawing) tells us that ($j = 1$, $m = 0$, $m_1 = \frac{1}{2}$, $m_2 = -\frac{1}{2}$):

$$C^{\frac{1}{2}, \frac{1}{2}, 1}_{\frac{1}{2}, -\frac{1}{2}, 0} = \sqrt{\frac{1}{2}} \qquad (7.10.16)$$

Similarly, the second L (shaded blue on the drawing) tells us that ($j = 1$, $m = 0$, $m_1 = -\frac{1}{2}$, $m_2 = \frac{1}{2}$):

$$C^{\frac{1}{2}, \frac{1}{2}, 1}_{-\frac{1}{2}, \frac{1}{2}, 0} = \sqrt{\frac{1}{2}} \qquad (7.10.17)$$

Therefore:

$$|1, 0\rangle = \sqrt{\frac{1}{2}}\,|\tfrac{1}{2}, -\tfrac{1}{2}\rangle + \sqrt{\frac{1}{2}}\,|-\tfrac{1}{2}, \tfrac{1}{2}\rangle \qquad (7.10.18)$$

The table can also be read by making L's in the other direction to express individual basis states in terms of the combined representation. For example, for $m_1 = -\frac{1}{2}$, $m_2 = \frac{1}{2}$:

$$|-\tfrac{1}{2}, \tfrac{1}{2}\rangle = \sqrt{\frac{1}{2}}\,|1, 0\rangle - \sqrt{\frac{1}{2}}\,|0, 0\rangle \qquad (7.10.19)$$

For more information: http://www.eng.fsu.edu/~dommelen/quantum/style_a/clgrdn.html

Figure 7.10.2: CGC table. Source:
https://www.physicsforums.com/attachments/cg-table-jpg.32635/

7.11 Time Independent Perturbation Theory


Some quantum mechanical systems are fully and analytically solvable (hydrogen atom, SHO,
square well, etc.), while others are much more complicated. Perturbation theory provides
approximations of solutions to difficult problems in terms of the solutions to already-solved
easier problems. If the Hamiltonian for a system under study is H, the goal is to write:

$$H = H_0 + H' \qquad (7.11.1)$$

Where H0 is the Hamiltonian of a solved system (i.e. we know the wave functions and their
associated energies).

Simple formulas exist for systems that are non-degenerate, but these equations quickly fail
when applied to degenerate systems (for example, some have denominators that can be zero for
degenerate systems). Therefore, a more complicated set of equations exists for these situations.
In general, the energies of a system are just the eigenvalues of its Hamiltonian in matrix
form. For non-degenerate systems, the matrix that approximates a system to some order
will automatically be diagonalized, making it trivial to read off the eigenvalues. However, for
degenerate systems, the matrix will NOT be diagonalized. The energies are then found (in
theory) by choosing a different, special basis that diagonalizes the matrix or (usually in practice)
by simply finding the eigenvalues of the matrix.

7.11.1 Non-Degenerate

To first order:

$$E_n^{(1)} = \langle n|H'|n\rangle \qquad (7.11.2)$$

For the wavefunction:

$$\psi_n^{(1)} = \sum_{m \neq n} \frac{\langle m|H'|n\rangle}{E_n^{(0)} - E_m^{(0)}}\,\psi_m^{(0)} \qquad (7.11.3)$$

To second order:

$$E_n^{(2)} = \sum_{m \neq n} \frac{|\langle m|H'|n\rangle|^2}{E_n^{(0)} - E_m^{(0)}} \qquad (7.11.4)$$

In general, you won't need to calculate $\psi_n^{(2)}$, and the equation is a bit nasty, so I won't reproduce it here.
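A numerical sanity check of eqs. 7.11.2 and 7.11.4 (my own toy example): a two-level system with a weak off-diagonal coupling, compared against exact diagonalization:

import numpy as np

eps = 0.05
H0 = np.diag([0.0, 1.0])                           # solved system
Hp = eps * np.array([[0.0, 1.0], [1.0, 0.0]])      # perturbation H'

E1 = Hp[0, 0]                                      # eq. 7.11.2: <0|H'|0> = 0
E2 = abs(Hp[1, 0]) ** 2 / (H0[0, 0] - H0[1, 1])    # eq. 7.11.4: -eps^2
exact = np.linalg.eigvalsh(H0 + Hp)[0]             # exact ground state energy

print(E1 + E2, exact)                              # -0.0025 vs -0.002494...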

7.11.2 Degenerate

To first order, the energies of a degenerate system are the eigenvalues of the matrix:

$$W_{ij} = \langle i|H'|j\rangle \qquad (7.11.5)$$

To second order, the energies are the eigenvalues of a slightly more complicated matrix:

$$a_{ni} = \sum_{m \neq n} \frac{W_{nm} W_{mi}}{E_n^{(0)} - E_m^{(0)}} \qquad (7.11.6)$$

Where $W_{ij}$ is the same matrix used above for first order; n, i range only over the degenerate subspace, while m ranges over the whole space. For example, if your original Hamiltonian matrix ($W_{ij}$) is 4 × 4, but only the energies corresponding to 1 and 2 are degenerate, then n, i = 1, 2 and m = 1, 2, 3, 4. The resulting $a_{ni}$ matrix would in this case then just be a 2 × 2.

7.12 Time Dependent Non-Degenerate Perturbation Theory
Time dependent perturbation theory allows us to analyze time dependent perturbations to a
solved system. Suppose that we have the Hamiltonian:

$$H = H_0 + V(t) \qquad (7.12.1)$$

Where $H_0$ is the Hamiltonian of our solved system, for which we will label the states $|n\rangle$ and the energy levels $E_n$. Assuming that the states $|n\rangle$ are complete (which is generally true), we can write any state of our new Hamiltonian (at any time) as a linear combination of the time evolved basis kets:

$$|\psi(t)\rangle = \sum_n c_n(t)\,e^{-\frac{iE_n t}{\hbar}}|n\rangle \qquad (7.12.2)$$

The coefficients $c_n(t)$ are generally complex, and are called the amplitudes. The probability of finding the system in a state n is given by $|c_n(t)|^2$. The coefficients can be found via an iterative formula:

$$c_n(t) = c_n(0) + \frac{-i}{\hbar}\sum_k \int_0^t e^{i\omega_{nk} t'}\langle n|V(t')|k\rangle c_k(t') \, dt' \qquad (7.12.3)$$

Where:

$$\omega_{nk} = \frac{E_n - E_k}{\hbar} \qquad (7.12.4)$$

Each time this formula is used recursively, it increases the order to which $c_n(t)$ is accurate by 1. In principle, then:

$$c_n(t) = c_n^{(0)} + c_n^{(1)} + c_n^{(2)} + \ldots \qquad (7.12.5)$$

The first term is constant: $c_n^{(0)} = c_n(0)$, and represents the fact that, to 0th order, a perturbed system will just stay in the state it already occupies! To first order, then7.5:

$$c_n^{(1)}(t) = \frac{-i}{\hbar}\sum_k \int_0^t e^{i\omega_{nk} t'}\langle n|V(t')|k\rangle c_k(0) \, dt' \qquad (7.12.6)$$

This equation is extremely useful, since we usually know the exact values of $c_k(0)$! For example, if $|n\rangle$ are the kets for the SHO, and at t = 0 we have:

$$|\psi(t=0)\rangle = \sqrt{\frac{3}{4}}\,|0\rangle + \sqrt{\frac{1}{4}}\,|1\rangle \qquad (7.12.7)$$

Then $c_0(0) = \sqrt{\frac{3}{4}}$ and $c_1(0) = \sqrt{\frac{1}{4}}$.

7.12.1 Fermi’s Golden Rule

Fermi's golden rule gives the transition rate R between two states f and i effected by a perturbation H':

$$R_{i \rightarrow f} = \frac{2\pi}{\hbar}|\langle f|H'|i\rangle|^2 \rho_f \qquad (7.12.8)$$
7.5
The cn (0) from the formula above isn’t included, of course, since that term represents all lower orders.

Where ρf is the density of final states, which makes sense, since we’d expect the transition
probability to scale linearly with the number of possible states the system could jump into at
the final energy.

7.13 Scattering
Here's a good source on scattering in general: http://juser.fz-juelich.de/record/20885/files/A2_Bluegel.pdf

7.13.1 Partial Wave Analysis: Theory

(Following Griffiths QM, pg. 399 to 408)


The general (separated) solution to the Schrodinger equation for a spherically symmetric
potential (of which spherically symmetric scattering potentials are a subset) can be written:

$$\psi(r, \theta, \phi) = R(r) Y_l^m(\theta, \phi) \qquad (7.13.1)$$

The bulk of the physics here takes place in the radial direction. If we make the (common) change of variables that u(r) = rR(r), the radial Schrodinger equation is

$$-\frac{\hbar^2}{2m}\frac{d^2u}{dr^2} + \left(V(r) + \frac{\hbar^2}{2m}\frac{l(l+1)}{r^2}\right) u = Eu \qquad (7.13.2)$$
We will search for solutions to this equation by considering three regions. In the scattering region, $V \neq 0$. In the intermediate region, V = 0 but we may not neglect $\frac{1}{r^2}$ (forcing us to retain the centrifugal term of the Schrodinger equation). Finally, in the radiation zone, we take $r \gg 1$ and, of course, still V = 0.
Let's start with the radiation zone. Rewriting E in terms of the wave number k, $E = \frac{k^2\hbar^2}{2m}$, the equation then simplifies to

$$\frac{d^2u}{dr^2} \approx -k^2 u \qquad (7.13.3)$$

For which we know the solution:

$$R(r) \approx \frac{e^{ikr}}{r} \qquad (7.13.4)$$
Now we can progress onto the intermediate zone. Here, since V = 0, the Schrodinger equation simplifies to:

$$\frac{d^2u}{dr^2} - \frac{l(l+1)}{r^2} u = -k^2 u \qquad (7.13.5)$$

The solutions to this equation are the spherical Hankel functions, $h_l^{(1)}$ and $h_l^{(2)}$. Asymptotically, $h_l^{(1)} \approx \frac{(-i)^{l+1}}{x} e^{ix}$ and $h_l^{(2)} \approx \frac{(i)^{l+1}}{x} e^{-ix}$. Therefore, since we want our solution to represent an outgoing (scattered) wave, we choose only $h_l^{(1)}$, so

$$R(r) \propto h_l^{(1)}(kr) \qquad (7.13.6)$$

Therefore, the complete wavefunction outside of the scattering region is

$$\psi(r, \theta, \phi) = A\left(e^{ikz} + \sum_{l,m} C_{l,m} h_l^{(1)}(kr) Y_l^m(\theta, \phi)\right) \qquad (7.13.7)$$

Where the first term represents the incoming plane wave in the z direction, and the second term is the scattered part of the wavefunction. Since we usually deal with azimuthally symmetric potentials, it is convenient to reduce the $Y_l^m$'s down in terms of Legendre polynomials. We will also make the (later intelligible) substitution:

$$C_{l,0} = i^{l+1} k\sqrt{4\pi(2l+1)}\,a_l \qquad (7.13.8)$$

Where $a_l$ is called the partial wave amplitude. The solution can now be written:

$$\psi(r, \theta, \phi) = A\left(e^{ikz} + k\sum_{l=0}^\infty i^{l+1}(2l+1) a_l h_l^{(1)}(kr) P_l(\cos\theta)\right) \qquad (7.13.9)$$

In the large r limit, this equation can be written in terms of the scattering amplitude f(θ):

$$\psi(r, \theta, \phi) = A\left(e^{ikz} + f(\theta)\frac{e^{ikr}}{r}\right) \qquad (7.13.10)$$

Where

$$f(\theta) = \sum_{l=0}^\infty (2l+1) a_l P_l(\cos\theta) \qquad (7.13.11)$$

This is useful, since it can be shown that

$$D(\theta) = \frac{d\sigma}{d\Omega} = |f(\theta)|^2 \qquad (7.13.12)$$
In order to actually find the partial wave amplitudes, we will first need to write the wavefunction entirely in spherical coordinates (instead of the current mix of spherical and linear coordinates). The problem coordinate is z, which can be rewritten in terms of spherical Bessel functions via the Rayleigh formula:

$$e^{ikz} = \sum_{l=0}^\infty i^l (2l+1) j_l(kr) P_l(\cos\theta) \qquad (7.13.13)$$

to yield:

$$\psi(r, \theta) = A\sum_{l=0}^\infty i^l (2l+1)\left(j_l(kr) + ik a_l h_l^{(1)}(kr)\right) P_l(\cos\theta) \qquad (7.13.14)$$

From here, the general strategy for applying partial wave analysis is as follows:

1. Identify the relevant boundary conditions between the scattering region and the intermediate region.

2. Fit the general form of the wavefunction to the boundary conditions, and solve for $a_l$.

3. Using $a_l$, compute f(θ) and then $\frac{d\sigma}{d\Omega}$, etc.

7.13.2 Phase Shift Analysis

When a particle scatters, the amplitude of its wavefunction must remain the same (in order to conserve probability flow into and out of the scattering region), but the outgoing wave may pick up a phase shift. The Rayleigh formula allows us to express the l-th component of an incoming plane wave as:

$$\psi_0^l = A i^l (2l+1) j_l(kr) P_l(\cos\theta) \qquad (7.13.15)$$

Using the asymptotic form of the spherical Hankel functions to re-write the spherical Bessel function when $kr \gg 1$ (via $j_l = \frac{1}{2}(h_l^{(1)} + h_l^{(2)})$), this becomes

$$\psi_0^l = A i^l (2l+1)\frac{1}{2}\left(h_l^{(1)} + h_l^{(2)}\right) P_l(\cos\theta) = A\frac{(2l+1)}{2ikr}\left(e^{ikr} - (-1)^l e^{-ikr}\right) P_l(\cos\theta) \qquad (7.13.16)$$

This equation represents the wavefunction with no potential present. However, the wavefunction above is already clearly divided into an incoming wave (the second term) and an outgoing wave (the first term). Therefore, if there is a NON-zero potential, the effect will be to add a phase shift to the outgoing term:

$$\psi^l = A\frac{(2l+1)}{2ikr}\left(e^{i(kr + 2\delta_l)} - (-1)^l e^{-ikr}\right) P_l(\cos\theta) \qquad (7.13.17)$$

By comparing this equation to the wavefunction we calculated by partial wave analysis, we conclude that:

$$a_l = \frac{1}{k} e^{i\delta_l}\sin(\delta_l) \qquad (7.13.18)$$

7.13.3 Partial Wave Analysis: Practical Application

In practice, for finding scattering cross sections, the most useful result of partial wave analysis is:

$$\sigma = \sum_{l=0}^\infty \sigma_l = \sum_{l=0}^\infty \frac{4\pi}{k^2}(2l+1)\sin^2(\delta_l) \qquad (7.13.19)$$

Where $\delta_l$ is called the phase shift. For scattering at low energies off of a central potential, we can approximate that:

$$\sigma \approx \sigma_0 = \frac{4\pi}{k^2}\sin^2(\delta_0) \qquad (7.13.20)$$

Therefore, the problem of finding the cross section is reduced to that of finding the phase shift. To accomplish this, you must solve the radial Schrodinger equation7.6:

$$\left(\frac{-\hbar^2}{2m}\frac{\partial^2}{\partial r^2} + \frac{\hbar^2 l(l+1)}{2mr^2} + V(r)\right) u(r) = E u(r) \qquad (7.13.21)$$

Where u(r) = rR(r), where R(r) is the actual radial component of the wave function. Remember that this change of variables also changes our boundary conditions: R(∞) = 0 automatically now (at least for oscillatory solutions to u(r)), and u(0) = 0 (which is NOT generally true for
7.6
Radial, since this only applies to central potentials.

R(r)). Notice that, for l = 0, this equation reduces to a very familiar form:

$$\left(\frac{-\hbar^2}{2m}\frac{\partial^2}{\partial r^2} + V(r)\right) u(r) = E u(r) \qquad (7.13.22)$$

Normally, we'd seek solutions of the form $u(r) = Ae^{ikr} + Be^{-ikr}$. However, in this case, we want to instead choose the form:

$$u(r) = A\sin(kr + \delta_0) \qquad (7.13.23)$$

The constant $\delta_0$ is the phase shift between the incoming wave and the outgoing wave, and is exactly the zeroth order phase shift we are looking for! The scattering cross section can now be determined simply by plugging this result into the formula above for σ.

7.13.4 The Born Approximation

The Born Approximation is a rearrangement of the Schrodinger equation into a nicer integral equation. The laborious details are well covered in Griffiths QM, pg. 408-412. The resulting equation is:

$$\psi(\mathbf{r}) = \phi_0(\mathbf{r}) - \frac{m}{2\pi\hbar^2}\int_V \frac{e^{ik|\mathbf{r} - \mathbf{r}_0|}}{|\mathbf{r} - \mathbf{r}_0|} V(\mathbf{r}_0)\psi(\mathbf{r}_0) \, d^3r_0 \qquad (7.13.24)$$

This looks like a direct solution for ψ, but it is not, since the integral on the right hand side ALSO depends on ψ.
Suppose that the potential V(r) is localized near r = 0. If we additionally now suppose that the incoming wave is not substantially altered by the potential, then we arrive at the 1st Born approximation:

$$f(\theta, \phi) \approx -\frac{m}{2\pi\hbar^2}\int e^{i\mathbf{q}\cdot\mathbf{r}'} V(\mathbf{r}') \, d^3r' \qquad (7.13.25)$$

Where $\mathbf{q} = \mathbf{k}_0 - \mathbf{k}$. This equation is extremely useful: most problems about scattering in the 1st Born approximation essentially amount to simply solving it. Notice that $\mathbf{q} = \mathbf{q}(r, \theta, \phi)$, while the integration is over $r'$, $\theta'$, and $\phi'$. In other words, q is a constant with respect to the variables of integration!
With the further assumption of a spherically symmetric potential, the 1st Born Approximation then becomes:

$$f(\theta) \approx -\frac{2m}{\hbar^2 q}\int_0^\infty r V(r)\sin(qr) \, dr, \quad \text{(spherical symmetry)} \qquad (7.13.26)$$

Where now we can also simplify $q = |\mathbf{q}| = 2k\sin\left(\frac{\theta}{2}\right)$.7.7 Notice that f is now only a function of θ: a spherically symmetric potential cannot depend on φ!
In the (separate) limit where the energy of the scattered particle is low, the 1st Born
7.7
This follows from the Law of Cosines.

approximation reduces to:

$$f(\theta, \phi) \approx -\frac{m}{2\pi\hbar^2}\int_V V(\mathbf{r}) \, d^3r, \quad \text{(low energy)} \qquad (7.13.27)$$
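As a closing sketch (mine, in natural units with an assumed spherical square well V(r) = -V₀ for r < a): the low-energy Born amplitude of eq. 7.13.27 can be evaluated by direct quadrature and checked against the analytic answer $f = \frac{2mV_0 a^3}{3\hbar^2}$:

import numpy as np
from scipy.integrate import quad

hbar, m = 1.0, 1.0           # natural units (assumption)
V0, a = 0.5, 1.0             # assumed well depth and radius

# f ~ -(m / 2 pi hbar^2) * Int V(r) d^3r = -(2m / hbar^2) * Int_0^a V(r) r^2 dr
integral, _ = quad(lambda r: -V0 * r**2, 0.0, a)
f = -(2 * m / hbar**2) * integral

print(f, 2 * m * V0 * a**3 / (3 * hbar**2))   # both 1/3: they agree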
