Electromagnetism & Relativity: Brian Pendleton


Electromagnetism & Relativity

[PHYS10093] Semester 1, 2021/22

Brian Pendleton

• Email: [email protected]

• Office: JCMB 4413

• Phone: 0131-650-5241

• Web: http://www.ph.ed.ac.uk/~bjp/emr/

[ Last updated on Monday 29th November, 2021 at 18:55 ]


Books

The course should be self-contained, but it’s always good to read textbooks to
broaden your education.

• David J Griffiths,
Introduction to Electrodynamics, (Prentice Hall)

• John R Reitz, Frederick J Milford, Robert W Christy,


Foundations of Electromagnetic Theory, (Addison-Wesley)

• JD Jackson,
Classical Electrodynamics (Wiley) – advanced, good for next year.

• KF Riley, MP Hobson and SJ Bence,


Mathematical Methods for Physics and Engineering, (CUP 1998).

• PC Matthews,
Vector Calculus, (Springer 1998).

• ML Boas,
Mathematical Methods in the Physical Sciences, (Wiley 2006).

• GB Arfken and HJ Weber,


Mathematical Methods for Physicists, (Academic Press 2001).

• DE Bourne and PC Kendall,


Vector Analysis and Cartesian Tensors, (Chapman and Hall 1993).

Griffiths is the main text for electromagnetism; Reitz, Milford & Christy is a stan-
dard electromagnetism text; Jackson is pretty advanced, but it will also be good for
Classical Electrodynamics next year.
The other books are useful for the first part of the course, which will introduce
tensors and their applications in physics.

Acknowledgements: Many thanks to Richard Ball, Martin Evans, Roger Horsley


and Donal O’Connell for providing copies of their lecture notes and tutorial sheets,
from which I have sampled with (apparent) impunity.

Contents

1 Revision of vector algebra 1


1.1 Revision of vectors & index/suffix notation . . . . . . . . . . . . . . . 1
1.1.1 Position vector . . . . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.2 Cartesian coordinates . . . . . . . . . . . . . . . . . . . . . . . 1
1.1.3 The Kronecker delta symbol δij . . . . . . . . . . . . . . . . . 2
1.1.4 The Levi-Civita or epsilon symbol ijk . . . . . . . . . . . . . 2
1.1.5 Scalar product . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.1.6 Vector product . . . . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.7 Scalar triple product . . . . . . . . . . . . . . . . . . . . . . . 3
1.1.8 Vector triple product . . . . . . . . . . . . . . . . . . . . . . . 3
1.2 Linear transformation of basis . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Transformation properties of vectors and scalars . . . . . . . . . . . . 4
1.3.1 Transformation of vector components . . . . . . . . . . . . . . 4
1.3.2 Scalar and vector products . . . . . . . . . . . . . . . . . . . . 4

2 Cartesian tensors 5
2.1 Definition and transformation properties . . . . . . . . . . . . . . . . 5
2.1.1 Definition of a tensor . . . . . . . . . . . . . . . . . . . . . . . 5
2.1.2 Fields . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.3 Dyadic notation . . . . . . . . . . . . . . . . . . . . . . . . . . 6
2.1.4 Internal consistency in the definition of a tensor . . . . . . . . 7
2.1.5 Properties of Cartesian tensors – tensor algebra . . . . . . . . 7
2.1.6 The quotient theorem . . . . . . . . . . . . . . . . . . . . . . . 9
2.2 Revision of matrices and determinants . . . . . . . . . . . . . . . . . 9
2.2.1 Matrices . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 9
2.2.2 Determinants . . . . . . . . . . . . . . . . . . . . . . . . . . . 11
2.2.3 Linear equations . . . . . . . . . . . . . . . . . . . . . . . . . 13

2.2.4 Orthogonal matrices . . . . . . . . . . . . . . . . . . . . . . . 15
2.3 Pseudotensors, pseudovectors & pseudoscalars . . . . . . . . . . . . . 16
2.3.1 Vectors (revision) . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.2 Pseudovectors . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
2.3.3 Pseudotensors . . . . . . . . . . . . . . . . . . . . . . . . . . . 18
2.4 Some physical examples . . . . . . . . . . . . . . . . . . . . . . . . . 19
2.5 Invariant/Isotropic Tensors . . . . . . . . . . . . . . . . . . . . . . . . 20

3 Rotation, reflection & inversion tensors 22


3.1 The rotation tensor . . . . . . . . . . . . . . . . . . . . . . . . . . . . 22
3.1.1 Rotation about an arbitrary axis . . . . . . . . . . . . . . . . 22
3.1.2 Some important properties of the rotation tensor . . . . . . . 23
3.2 Reflections and inversions . . . . . . . . . . . . . . . . . . . . . . . . 24
3.3 Projection operators . . . . . . . . . . . . . . . . . . . . . . . . . . . 24
3.4 Active and passive transformations . . . . . . . . . . . . . . . . . . . 25

4 The inertia tensor 26


4.1 Angular momentum and kinetic energy . . . . . . . . . . . . . . . . . 26
4.1.1 Angular momentum . . . . . . . . . . . . . . . . . . . . . . . 27
4.1.2 Kinetic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . 27
4.1.3 The parallel axes theorem . . . . . . . . . . . . . . . . . . . . 28
4.1.4 Diagonalisation of rank-two tensors . . . . . . . . . . . . . . . 30

5 Taylor expansions 34
5.1 The one-dimensional case . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.1.1 Examples . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.1.2 A precursor to the three-dimensional case . . . . . . . . . . . 37
5.2 The three-dimensional case . . . . . . . . . . . . . . . . . . . . . . . . 37

6 Curvilinear coordinates 39
6.1 Orthogonal curvilinear coordinates . . . . . . . . . . . . . . . . . . . 40
6.1.1 Scale factors and basis vectors . . . . . . . . . . . . . . . . . . 40
6.1.2 Examples of orthogonal curvilinear coordinates (OCCs) . . . . 41
6.2 Elements of length, area and volume . . . . . . . . . . . . . . . . . . 42
6.2.1 Vector length . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
6.2.2 Arc length and metric tensor . . . . . . . . . . . . . . . . . . . 43

6.2.3 Vector area . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.2.4 Volume . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
6.3 Components of a vector field in curvilinear coordinates . . . . . . . . 45
6.4 Div, grad, curl and the Laplacian in OCCs . . . . . . . . . . . . . . . 46
6.4.1 Gradient . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 46
6.4.2 Divergence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 47
6.4.3 Curl . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48
6.4.4 Laplacian of a scalar field . . . . . . . . . . . . . . . . . . . . 50
6.4.5 Laplacian of a vector field . . . . . . . . . . . . . . . . . . . . 50

7 Electrostatics 51
7.1 The Dirac delta function in three dimensions . . . . . . . . . . . . . . 51
7.2 Coulomb’s law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
7.3 The electric field . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
7.3.1 Field lines . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 54
7.3.2 The principle of superposition . . . . . . . . . . . . . . . . . . 55
7.4 The electrostatic potential for a point charge . . . . . . . . . . . . . . 55
7.5 The static Maxwell equations . . . . . . . . . . . . . . . . . . . . . . 56
7.5.1 The curl equation . . . . . . . . . . . . . . . . . . . . . . . . . 56
7.5.2 Conservative fields and potential theory . . . . . . . . . . . . 56
7.5.3 The divergence equation . . . . . . . . . . . . . . . . . . . . . 58
7.6 Electric dipole . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60
7.6.1 Potential and electric field due to a dipole . . . . . . . . . . . 60
7.6.2 Force, torque and energy . . . . . . . . . . . . . . . . . . . . . 61
7.7 The multipole expansion . . . . . . . . . . . . . . . . . . . . . . . . . 63
7.7.1 Worked example . . . . . . . . . . . . . . . . . . . . . . . . . 65
7.7.2 Interaction energy of a charge distribution . . . . . . . . . . . 66
7.7.3 A brute-force calculation - the circular disc . . . . . . . . . . . 66
7.8 Gauss’ law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 68
7.9 Boundaries . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 70
7.9.1 Normal component . . . . . . . . . . . . . . . . . . . . . . . . 70
7.9.2 Tangential component . . . . . . . . . . . . . . . . . . . . . . 71
7.9.3 Conductors . . . . . . . . . . . . . . . . . . . . . . . . . . . . 71
7.10 Poisson’s equation . . . . . . . . . . . . . . . . . . . . . . . . . . . . 72

7.10.1 Uniqueness of solution . . . . . . . . . . . . . . . . . . . . . . 72
7.10.2 Methods of solution . . . . . . . . . . . . . . . . . . . . . . . . 73
7.10.3 The method of images . . . . . . . . . . . . . . . . . . . . . . 74
7.11 Electrostatic energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 77
7.11.1 Electrostatic energy of a general charge distribution . . . . . . 77
7.11.2 Field energy . . . . . . . . . . . . . . . . . . . . . . . . . . . . 78
7.12 Capacitors (condensers) and capacitance . . . . . . . . . . . . . . . . 79
7.12.1 Parallel-plate capacitor . . . . . . . . . . . . . . . . . . . . . . 79
7.12.2 Concentric conducting spheres . . . . . . . . . . . . . . . . . . 80
7.12.3 Energy stored in a capacitor . . . . . . . . . . . . . . . . . . . 81

8 Magnetostatics 82
8.1 Currents . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 82
8.1.1 Charge and current conservation . . . . . . . . . . . . . . . . . 83
8.1.2 Conduction current . . . . . . . . . . . . . . . . . . . . . . . . 84
8.2 Forces between currents (Ampère, 1821) . . . . . . . . . . . . . . . . 85
8.2.1 The Lorentz force . . . . . . . . . . . . . . . . . . . . . . . . . 86
8.3 Biot-Savart law . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 87
8.3.1 Long straight wire . . . . . . . . . . . . . . . . . . . . . . . . 88
8.3.2 Two long parallel wires . . . . . . . . . . . . . . . . . . . . . . 88
8.3.3 Current loop . . . . . . . . . . . . . . . . . . . . . . . . . . . . 89
8.4 Divergence and curl of the magnetic field: Gauss and Ampère laws . . 89
8.4.1 Ampère’s law (1826) . . . . . . . . . . . . . . . . . . . . . . . 91
8.4.2 Conducting surfaces . . . . . . . . . . . . . . . . . . . . . . . 92
8.5 The vector potential . . . . . . . . . . . . . . . . . . . . . . . . . . . 93
8.6 Magnetic dipoles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 95
8.6.1 Magnetic moment and angular momentum . . . . . . . . . . . 97
8.6.2 Force and torque on a dipole in an external field . . . . . . . . 97
8.7 Summary: electrostatics and magnetostatics . . . . . . . . . . . . . . 100

Chapter 1

Revision of vector algebra

There should be nothing new in this (brief) chapter. It’s intended to revise concepts
and notation, and to refresh your memory. The style and content should be similar
to Kristel Torokoff’s Fields notes, to which you should refer for detailed expositions.

1.1 Revision of vectors & index/suffix notation


In this chapter we will work entirely in three-dimensional real space, R3 .

1.1.1 Position vector


The position vector is bound to some origin O and gives the position of a point relative to that origin; it will generally be denoted by x or r.

1.1.2 Cartesian coordinates

Let {ei}, i = 1, 2, 3, be a set of orthonormal Cartesian basis vectors in R³. By definition, these satisfy

e1 · e1 = e2 · e2 = e3 · e3 = 1
e1 · e2 = e2 · e3 = e3 · e1 = 0

The position vector x may be expressed in terms of Cartesian coordinates (x1 , x2 , x3 ):


x = x1 e 1 + x2 e 2 + x3 e 3

Similarly, for an arbitrary vector a in R³, we write

a = a1 e1 + a2 e2 + a3 e3 = Σ_{i=1}^{3} ai ei ≡ ai ei
In the last expression, we’ve used the Einstein summation convention to denote an
implicit sum over the repeated or dummy index i = 1, 2, 3

1.1.3 The Kronecker delta symbol δij
Define δij, where i and j can take on the values 1, 2, 3, such that

δij ≡ 1 when i = j ,   0 when i ≠ j

so that δ11 = δ22 = δ33 = 1 and δ12 = δ13 = δ23 = · · · = 0.


The equations satisfied by the orthonormal basis vectors {e i } may then be written
succinctly as
e i · e j = δij

The components ai of a in an orthonormal basis may be obtained using orthonor-


mality
 
a · e i = aj e j · e i = aj δij = ai or occasionally (a)i
In this equation i is a ‘free’ index and it takes on all three values i = 1, 2, 3.

1.1.4 The Levi-Civita or epsilon symbol εijk

Define εijk, where i, j and k can each take on the values 1, 2, 3, such that

εijk = +1 if (i, j, k) is an even permutation of (123)
     = −1 if (i, j, k) is an odd permutation of (123)
     =  0 otherwise (i.e. when 2 or more indices are the same)

In other words:
ε123 = ε231 = ε312 = +1;   ε213 = ε321 = ε132 = −1;   all others = 0.

Note the cyclic symmetry

εijk = εkij = εjki = −εjik = −εikj = −εkji
The Kronecker delta and epsilon symbols generalise straightforwardly to any number
of dimensions.
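As an illustrative aside (not part of the original notes; a minimal sketch assuming Python with numpy), the δ and ε symbols are easy to tabulate and check numerically:

import numpy as np
from itertools import permutations

# Kronecker delta: the 3x3 identity matrix.
delta = np.eye(3)

# Levi-Civita symbol (0-indexed): +1 for even permutations of (0,1,2),
# -1 for odd permutations, 0 whenever an index repeats.
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1.0
eps[2, 1, 0] = eps[1, 0, 2] = eps[0, 2, 1] = -1.0

# Cyclic symmetry: eps_ijk = eps_kij = eps_jki = -eps_jik
for i, j, k in permutations(range(3)):
    assert eps[i, j, k] == eps[k, i, j] == eps[j, k, i] == -eps[j, i, k]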

1.1.5 Scalar product


The scalar product, also known as the inner product, or just the “dot” product, of
two vectors a and b is defined as
a · b = a b cos θab
where a ≡ |a| and b ≡ |b| are the lengths of the two vectors, and θab is the angle
between them. In Cartesian coordinates
a · b = a1 b 1 + a2 b 2 + a3 b 3 ≡ ai b i
where we used the summation convention in the last expression.

1.1.6 Vector product
The vector product (or cross product) is defined as
a × b ≡ a b sin θab n
where n is a unit vector orthogonal to both a and b, and the vectors {a, b, n} form
a right-handed set.
Using this definition, the orthonormality relations satisfied by the vector products of the (right-handed) orthonormal basis vectors {ei} can be written as:

ei × ej = εijk ek   ∀ i, j = 1, 2, 3

where we have again used the summation convention: there is an implicit sum Σ_{k=1}^{3} over the repeated index k.
Using this result, in Cartesian coordinates, the i-th component of the vector product of a and b may be written as

(a × b)i = εijk aj bk

where there is an implicit sum over both of the repeated indices j and k: Σ_{j=1}^{3} Σ_{k=1}^{3}.
Equivalently, we can write the vector product as

a × b = ei (a × b)i = εijk ei aj bk

1.1.7 Scalar triple product


From the results above, we can easily deduce an algebraic definition of the scalar
triple product in Cartesian coordinates
(a, b, c) ≡ a · (b × c) = ai (b × c)i = εijk ai bj ck

1.1.8 Vector triple product



The vector triple product a × (b × c) satisfies

a × (b × c) = (a · c) b − (a · b) c

which may be proved by considering explicit components of each side of the equation.
It may also be obtained from the following expression for the product of two epsilon symbols with one common (summed) index:

εijk εklm = δil δjm − δim δjl

You must know this result!

When the epsilon symbols have two common indices, we get another useful result (exercise)

εijk εpjk = 2 δip
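A quick numerical check of these two ε identities and of the triple-product formula (again an aside, a sketch assuming numpy; the ε array is the one built in the earlier snippet):

import numpy as np

delta = np.eye(3)
eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1.0
eps[2, 1, 0] = eps[1, 0, 2] = eps[0, 2, 1] = -1.0

# eps_ijk eps_klm = delta_il delta_jm - delta_im delta_jl
lhs = np.einsum('ijk,klm->ijlm', eps, eps)
rhs = np.einsum('il,jm->ijlm', delta, delta) - np.einsum('im,jl->ijlm', delta, delta)
assert np.allclose(lhs, rhs)

# eps_ijk eps_pjk = 2 delta_ip
assert np.allclose(np.einsum('ijk,pjk->ip', eps, eps), 2.0 * delta)

# a x (b x c) = (a.c) b - (a.b) c for random vectors
rng = np.random.default_rng(0)
a, b, c = rng.standard_normal((3, 3))
assert np.allclose(np.cross(a, np.cross(b, c)), (a @ c) * b - (a @ b) * c)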

1.2 Linear transformation of basis
Let {e i } and {e 0i } be two orthonormal bases with a common origin related by the
linear transformation
e 0i = `ij e j
where we have again used the summation convention, so the repeated index j is
summed over.
We sometimes refer to the bases {e i } and {e 0i } as frames S and S 0 , respectively.
Using orthonormality of both bases, it is straightforward to show that the nine
constants `ij satisfy (exercise)
`ik `jk = `ki `kj = δij ⇔ L LT = LT L = I
where we have used matrix notation in the 2nd version.
The 3 × 3 transformation matrix L is an orthogonal matrix, whose elements are
given by (L)ij = `ij = e 0i · e j (exercise), and I is the unit matrix.
Every orthogonal matrix L has determinant det L = ±1.
If L represents a rotation of the basis {e i }, then det L = 1. This is called a proper
transformation.
If L represents an inversion of the basis, i.e. e 0i = −e i , or a reflection of the basis
vectors in some plane, then det L = −1. These are called improper transformations.

1.3 Transformation properties of vectors and scalars

1.3.1 Transformation of vector components


Let a be any vector, with components ai in the {e i } basis and components a0i in the
e 0i basis, so that


a = ai e i = a0i e 0i
The vector a is the same in both bases, but its components in the two bases are
related by
a0i = `ij aj

1.3.2 Scalar and vector products


The scalar product of two vectors a and b is the same in both bases
a · b = ai bi = a0i b0i
Hence the name scalar product.
The vector product a × b is a pseudovector.
The scalar triple product is a pseudoscalar, and the vector triple product is a vector.
We shall revisit all these properties in detail after we introduce tensors in the next
chapter. . .

Chapter 2

Cartesian tensors

2.1 Definition and transformation properties


Consider a rotation of the {e i } basis (frame S) to the {e0i } basis (frame S 0 ). This
is called a passive rotation.
The rotation matrix L, with components `ij , satisfies LLT = I = LT L, and it has
unit determinant det L = +1.
The components of two arbitrary vectors a and b in the two frames are related by
a0i = `ij aj
b0i = `ij bj
Let us now define a vector a as an entity whose 3 components ai in S are related to
its 3 components a0i in S 0 by a0i = `ij aj .
Note that the vector a is not rotated – it remains fixed in space – only the basis
vectors are rotated. Hence the term passive.
Now consider the 9 quantities ai bj . Under the change of basis, these transform to
a0i b0j = `ir ar `js bs
= (`ir `js ) (ar bs )
The 9 quantities ar bs obey a particular transformation law under the change of basis
{e i } → {e0i } (change of frame S → S 0 ). This motivates our definition of a tensor.

2.1.1 Definition of a tensor


Following on from our new definition of a vector, we define a tensor of rank 2, T, as an entity whose 3² = 9 cartesian components Tpq in S are related to its 9 components T′ij in S′ by

T′ij = ℓip ℓjq Tpq

where L is the rotation matrix with components ℓij which takes S → S′. Since there are 2 free indices (i & j) in the above expression, it represents 9 equations.
Similarly, a tensor of rank n, T, is defined to be an entity whose 3ⁿ components T′ijk···op (n indices) in S′ are related to its 3ⁿ components Trst···vw (n indices) in S by

T′ijk···op = ℓir ℓjs ℓkt · · · ℓov ℓpw Trst···vw

where there are n factors of ℓ on the RHS.


In this new language

• A scalar is a tensor of rank 0 (i.e. a0 = a).


• A vector is a tensor of rank 1 (as a0i = `ij aj ).

We shall often be sloppy and say Tijk···rs is a tensor, when what we really mean is
that T is a tensor with components Tijk···rs in a particular frame S.
The expressions tensor of rank 2 and second-rank tensor are used interchangeably.
Similarly for tensor of rank n and nth -rank tensor.
Important: Note that a rank-n tensor is more general than the ‘direct product’ or
‘tensor product’ of components of n vectors, i.e., not every tensor has components
that can be written as a direct product of vectors ai bj ck . . . pr qs . For example, ai bj +
aj bi is a rank-2 tensor. Another counterexample for n = 2 is given in section 2.1.5.
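As an aside (a minimal sketch assuming numpy, not part of the notes), the rank-2 transformation law T′ij = ℓip ℓjq Tpq is a double contraction that is conveniently written with einsum; for the outer product Tij = ai bj it agrees with transforming a and b separately:

import numpy as np

theta = 0.3
# Passive rotation of the basis about e3: l_ij = e'_i . e_j
L = np.array([[ np.cos(theta), np.sin(theta), 0.0],
              [-np.sin(theta), np.cos(theta), 0.0],
              [ 0.0,           0.0,           1.0]])

rng = np.random.default_rng(1)
a, b = rng.standard_normal((2, 3))

# Components of the outer product a_i b_j in frame S.
T = np.outer(a, b)

# Transformation law T'_ij = l_ip l_jq T_pq ...
T_prime = np.einsum('ip,jq,pq->ij', L, L, T)

# ... agrees with transforming a and b first, then taking the outer product.
assert np.allclose(T_prime, np.outer(L @ a, L @ b))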

2.1.2 Fields
A scalar or vector or tensor quantity is called a field when it is a function of position:

• Temperature T (r) is a scalar field


• The electric field Ei (r) is a vector field
• The stress-tensor field Pij (r) is a (rank 2) tensor field

In the latter case the transformation law is


Pij0 (r) = `ip `jq Ppq (r) or Pij0 (x0k ) = `ip `jq Ppq (xk ) with x0k = `kl xl
These two expressions mean the same thing, but the latter form is perhaps better.

2.1.3 Dyadic notation


In some (mostly older) books you will see dyadic notation. This is rather clumsy for
tensors – although it works well for vectors of course!
dyadic notation index notation
a ai
a·b ai b i
A Aij or aij
a A b or a · A · b ai Aij bj
··· ···
We will not use dyadic notation for tensors in this course.

2.1.4 Internal consistency in the definition of a tensor
Let Tij, T′ij, T″ij be the components of a tensor in frames S, S′, S″ respectively, and let
L be the rotation matrix which takes S → S′, L = {ℓij},
M be the rotation matrix which takes S′ → S″, M = {mij}.
Then
T″ij = mip mjq T′pq
     = mip mjq (ℓpk ℓql Tkl)
     = (ML)ik (ML)jl Tkl
     = nik njl Tkl
where N = ML is the rotation matrix which takes S → S 00 , so the definition of a
tensor is self-consistent. A similar result can be derived for vectors (exercise).

2.1.5 Properties of Cartesian tensors – tensor algebra


• Addition: if Tij···p and Uij···p are two tensors with the same rank n, i.e both
have n indices, then
Vij···p = Tij···p + Uij···p
is also a tensor of rank n. The proof is straightforward.
• Multiplication or tensor product: if Tij···s and Ulm···r are the components of
tensors T and U of rank n and m respectively, then
Vij···slm···r = Tij···s Ulm···r
are the components of a tensor T U of rank n + m, which has 3ⁿ × 3ᵐ = 3^(n+m) components:
V′i···r = T′i···s U′l···r
       = ℓiα · · · ℓsδ Tα···δ ℓlλ · · · ℓrρ Uλ···ρ
       = ℓiα · · · ℓrρ Vα···ρ
where there are n + m factors of ℓ in the last expression.
Note that we sometimes use Greek letters α, β, etc for dummy indices when
we have a large number of them.
Example: If U = λ is a scalar, and T is a tensor of rank n, then λ T is a
tensor of rank 0 + n = n.
• Contraction: if Tijk···s is a tensor of rank n, then Tiik···s (i.e. n−2 free indices) is
a tensor of rank n − 2. The process of setting indices to be equal and summing
is called contraction.
Example: If Tij = ai bj is a tensor of rank 2, then Tii = ai bi is a tensor of
rank 0 (scalar), it’s the usual scalar product.
The process of multiplying two tensors and contracting over a pair (or pairs)
of indices on different tensors is (often) called taking the scalar product.

• If Tij is a tensor then so is Tji (where Tji are the components of the transpose
T T of T ).
• If Tij = Tji in S, then Tij0 = Tji0 in S 0 :
p↔q
Tij0 = `ip `jq Tpq = `iq `jp Tqp = `jp `iq Tpq = Tji0

Tij is a symmetric tensor; the symmetry is preserved under a change of basis.


[The notation p ↔ q refers to relabelling indices.]
Similarly if Tij = −Tji , then Tij0 = −Tji0 . Tij is an anti-symmetric tensor.
Given any second rank tensor T, we can always decompose it into symmetric and anti-symmetric parts

Tij = ½ (Tij + Tji) + ½ (Tij − Tji)

• We can re-write the tensor transformation law for rank 2 tensors (only) using matrix notation:

  T′ij = ℓip ℓjq Tpq = (L T Lᵀ)ij

  so
  T′ = L T Lᵀ ≡ L T L⁻¹   (for rank 2 only)

• Kronecker delta, δij , is a second rank tensor


To show this, first note that the definition

δij = 1 when i = j ,   0 when i ≠ j
holds in all frames, so that, for example,

a · b = δij ai bj = δij a0i b0j

so δij0 = δij .
Starting with
δ′ij = δij
and using the general result ℓip ℓjp = δij, we obtain

δ′ij = δij = ℓip ℓjq δpq

which we recognise as the definition of a second rank tensor.


A tensor which has the same components in all frames is called an invariant
or isotropic tensor.
δij is an example of a tensor that can not be written in the form Tij = ai bj .
[To show this, assume we can write δij = ai bj . The diagonal terms in this
expression require non-zero ai and bi for all i, while the off-diagonal terms
require at least 3 of ai and bj to be zero, which is a contradiction.]
There are no invariant first rank tensors, apart from the zero vector 0 .

2.1.6 The quotient theorem
Let T be an entity with 9 components in any frame, say Tij in S, and Tij0 in S 0 .
Let a be an arbitrary vector and let bi = Tij aj . If b always transforms as a vector,
then T is a second rank tensor.
To prove this, we determine the transformation properties of T . In S 0 we have
b′i = T′ij a′j = T′ij ℓjk ak
   ≡ ℓij bj = ℓij Tjk ak
Equate the last expression on each line and rearrange

(T′ij ℓjk − ℓij Tjk) ak = 0

This expression holds for all vectors a, for example a = (1, 0, 0) etc, therefore

T′ij ℓjk = ℓij Tjk
⇒ T′ij ℓjk ℓmk = ℓij ℓmk Tjk   [ multiplied both sides by ℓmk ]
⇒ T′im = ℓij ℓmk Tjk
where we used `jk `mk = δjm in the last step. Thus T transforms as a tensor.
Important example: If there is a linear relationship between two vectors a and
b, so that ai = Tij bj , it follows from the quotient theorem that T is a tensor.
This is an alternative definition of a second-rank tensor, and it’s often the way that
tensors arise in nature.
Quotient theorem: Generalising, if Rij···r is an arbitrary tensor of rank-m, and
Tij···s is a set of 3n numbers (with n > m) such that Tij···s Rij···r is a tensor of rank
n − m , then Tij···s is a tensor of rank-n.
Proof is similar to the rank-2 case, but it isn’t very illuminating.

2.2 Revision of matrices and determinants


Before extending the discussion to pseudotensors, we revise and extend some familiar
properties of matrices and determinants.

2.2.1 Matrices
An M × N matrix is a rectangular array of numbers with M rows and N columns,

A = [ a11      a12      · · ·  a1,N−1     a1N
      a21      a22      · · ·  a2,N−1     a2N
       ·                                   ·
       ·                                   ·
      aM−1,1   aM−1,2   · · ·  aM−1,N−1   aM−1,N
      aM,1     aM,2     · · ·  aM,N−1     aM,N  ]  ≡ {aij}

The quantities {aij }, with aij ≡ (A)ij for all 1 ≤ i ≤ M , 1 ≤ j ≤ N , are the
elements of the matrix.

A square matrix has N = M . We’ll work mostly with 3 × 3 matrices, but the
majority of what we’ll do generalises to N × N matrices rather easily.

• We can add & subtract same-dimensional matrices and multiply a matrix by


a scalar. Component forms are obvious, e.g. A = B + λC becomes aij =
bij + λcij in index notation. Since both i and j are free indices, this represents
9 equations.

• The unit matrix, I, defined by


 
1 0 0
I= 0 1 0 ,
0 0 1

has components δij , i.e. Iij = δij

• The trace of a square matrix is the sum of its diagonal elements

Tr A = aii

Note the implicit sum over i due to our use of the summation convention.

• The transpose of a square matrix A with components aij is defined by swapping its rows with its columns, so

  (Aᵀ)ij ≡ aᵀij = aji

• If A = Aᵀ, then aji = aij, and A is symmetric.
  If A = −Aᵀ, then aji = −aij, and A is antisymmetric.

Product of matrices

We can very easily implement the usual ‘row into column’ matrix multiplication rule
in index notation.
If A (with elements aij ) is an M × N matrix and B (with elements bij ) is an N × P
matrix then C = AB is an M × P matrix with elements cij = aik bkj . Since
we’re using the summation convention, there is an implicit sum k = 1, . . . N in this
expression.
Matrix multiplication is associative and distributive

A(BC) = (AB)C ≡ ABC


A(B + C) = AB + AC

respectively, but it’s not commutative: AB ≠ BA in general.
An important result is
(AB)T = B T AT
which follows because

(AB)Tij = (AB)ji = ajk bki = (B T )ik (AT )kj = (B T AT )ij

2.2.2 Determinants
The determinant det A (or |A| or ||A||) of a 3 × 3 matrix A may be defined by

det A = εlmn a1l a2m a3n

This is equivalent to the ‘usual’ recursive definition,

         | a11 a12 a13 |
det A =  | a21 a22 a23 | ,
         | a31 a32 a33 |

where the first index labels rows and the second index columns. Expanding the determinant gives

det A = a11 | a22 a23 |  −  a12 | a21 a23 |  +  a13 | a21 a22 |
            | a32 a33 |         | a31 a33 |         | a31 a32 |
      = a11 (a22 a33 − a23 a32) − a12 (a21 a33 − a23 a31) + a13 (a21 a32 − a22 a31)
      = (ε123 a11 a22 a33 + ε132 a11 a23 a32) + . . .
      = ε1mn a11 a2m a3n + . . .
      = εlmn a1l a2m a3n

Thus the two forms are equivalent. The ε form is convenient for the derivation of various properties of determinants.
Note that only one term from each row and column appears in the determinant sum, which is why the determinant can be expressed in terms of the ε symbol.
The determinant is only defined for a square matrix, but the definition can be generalised to N × N matrices,

det A = εi1...iN a1i1 . . . aN iN ,

where the epsilon symbol with N indices is defined by

εi1...iN = +1 if (i1, . . . , iN) is an even permutation of (1, . . . , N)
         = −1 if (i1, . . . , iN) is an odd permutation of (1, . . . , N)
         =  0 otherwise

We shall usually consider N = 3, but most results generalise to arbitrary N .

We may use these results to derive several alternative (and equivalent) expressions
for the determinant.
First define the quantity
Xijk = εlmn ail ajm akn
It follows that

Xjik = εlmn ajl aim akn
     = εmln ajm ail akn   (where we relabelled l ↔ m)
     = −εlmn ail ajm akn = −Xijk

Thus the symmetry of Xijk is dictated by the symmetry of εlmn, and we must have

Xijk = c εijk

where c is some constant. To determine c, set i = 1, j = 2, k = 3, which gives
ε123 c = X123, so c = εlmn a1l a2m a3n = det A and hence

εijk det A = εlmn ail ajm akn

Multiplying by εijk and using εijk εijk = 6 (exercise) gives the symmetrical form for det A

det A = (1/3!) εijk εlmn ail ajm akn

This elegant expression isn’t of practical use because the number of terms in the sum increases from 3³ to 3⁶ (overcounting).
We can obtain a result similar to the boxed expression above by defining

Ylmn = εijk ail ajm akn

Using the same argument as before [tutorial] gives

Ylmn = εlmn [εijk ai1 aj2 ak3]

Since det A = (1/3!) εlmn Ylmn, this means that

det A = εijk ai1 aj2 ak3

and
εlmn det A = εijk ail ajm akn

[tutorial]
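These ε expressions are easy to verify numerically; the following is an illustrative aside (a sketch assuming numpy), comparing the three forms above with numpy's built-in determinant:

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1.0
eps[2, 1, 0] = eps[1, 0, 2] = eps[0, 2, 1] = -1.0

A = np.random.default_rng(2).standard_normal((3, 3))
det = np.linalg.det(A)

# det A = eps_lmn a_1l a_2m a_3n   (row form)
assert np.isclose(np.einsum('lmn,l,m,n->', eps, A[0], A[1], A[2]), det)

# det A = eps_ijk a_i1 a_j2 a_k3   (column form)
assert np.isclose(np.einsum('ijk,i,j,k->', eps, A[:, 0], A[:, 1], A[:, 2]), det)

# det A = (1/3!) eps_ijk eps_lmn a_il a_jm a_kn   (symmetrical form)
sym = np.einsum('ijk,lmn,il,jm,kn->', eps, eps, A, A, A) / 6.0
assert np.isclose(sym, det)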

Properties of Determinants

We can easily derive familiar properties of determinants from the definitions above:

• Adding a multiple of one row to another does not change the value of the determinant.
  Example: Adding a multiple of the second row to the first row

  εlmn a1l a2m a3n → εlmn (a1l + λ a2l) a2m a3n = εlmn a1l a2m a3n + 0

  and det A is unaltered. The last term is zero because εlmn a2l a2m = 0.

• Adding a multiple of one column to another does not change the value of the determinant (use the other form for the determinant, det A = εijk ai1 aj2 ak3).

• Interchanging any two rows of a matrix changes the sign of the determinant.
  Example: Interchanging the first and second rows gives

  εlmn a2l a1m a3n = εmln a2m a1l a3n = −εlmn a1l a2m a3n = − det A

  In the first step we simply relabelled l ↔ m.
  When A has two identical rows, det A = 0.

• Interchanging any two columns of a matrix also changes the sign of the determinant (use the other form for the determinant, det A = εijk ai1 aj2 ak3).

Finally,

                   | ail aim ain |
εijk εlmn det A =  | ajl ajm ajn |        (2.1)
                   | akl akm akn |

To derive this, start with the original definition of det A as | · · · | and permute rows and columns. This produces ± signs equivalent to ε permutations.

2.2.3 Linear equations


A standard use of matrices & determinants is to solve (for x) the matrix-vector
equation
Ax = y
where A is a square 3 × 3 matrix. Representing x and y by column matrices, and
writing out the components, this becomes

a11 x1 + a12 x2 + a13 x3 = y1


a21 x1 + a22 x2 + a23 x3 = y2
a31 x1 + a32 x2 + a33 x3 = y3

In index notation
aij xj = yi

With a suitable definition of the inverse A⁻¹ of the matrix A, we can write the solution as

x = A⁻¹ y   or   xi = (A⁻¹)ij yj

where (A⁻¹)ij is the ij-th element of the inverse of A,

(A⁻¹)ij = (1 / (2! det A)) εimn εjpq apm aqn

By explicit multiplication [tutorial] we can show A A⁻¹ = I = A⁻¹ A as required.
Alternatively [tutorial],

A⁻¹ = Cᵀ / det A

where C = {cij} is the co-factor matrix of A, and cij = (−1)^(i+j) × the determinant formed by omitting the row and column containing aij.
Note that a solution exists if and only if det A ≠ 0.
These results generalise to N × N matrices.
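The explicit ε formula for the inverse can be checked in the same spirit (an aside, a sketch assuming numpy):

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1.0
eps[2, 1, 0] = eps[1, 0, 2] = eps[0, 2, 1] = -1.0

A = np.random.default_rng(3).standard_normal((3, 3))

# (A^-1)_ij = (1 / (2 det A)) eps_imn eps_jpq a_pm a_qn
A_inv = np.einsum('imn,jpq,pm,qn->ij', eps, eps, A, A) / (2.0 * np.linalg.det(A))

assert np.allclose(A_inv, np.linalg.inv(A))
assert np.allclose(A @ A_inv, np.eye(3))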

Determinant of the transpose

det Aᵀ = εlmn (Aᵀ)l1 (Aᵀ)m2 (Aᵀ)n3
       = εlmn a1l a2m a3n

and hence
det Aᵀ = det A

Product of determinants

If C = AB so that cij = aik bkj then


det C = εijk c1i c2j c3k
      = εijk a1l bli a2m bmj a3n bnk
      = [εijk bli bmj bnk] a1l a2m a3n
      = εlmn det B a1l a2m a3n
and hence
det AB = det A det B

Product of two epsilons

The product of two epsilon symbols with no identical indices may be written as

              | δil δim δin |
εijk εlmn =   | δjl δjm δjn |
              | δkl δkm δkn |

This equation has 6 free indices, so it represents 3⁶ = 729 identities: 18 say ‘1 = 1’, 18 say ‘−1 = −1’, so 693 say ‘0 = 0’.
The proof follows almost trivially by setting A = I in equation (2.1)

                   | ail aim ain |
εijk εlmn det A =  | ajl ajm ajn | ,
                   | akl akm akn |

whence det I = 1 and aij = δij. Unfortunately, this result isn’t as useful as one might expect. . .
If we set l = k and sum over k

              | δik δil δim |
εijk εklm =   | δjk δjl δjm |
              | δkk δkl δkm |

            = δik (δjl δkm − δjm δkl) − δil (δjk δkm − 3 δjm) + δim (δjk δkl − 3 δjl)
            = δim δjl − δil δjm + 2 δil δjm − 2 δim δjl

we obtain an old friend, which we now note can be written as the determinant of a 2 × 2 matrix,

εijk εklm = δil δjm − δim δjl = | δil δim |
                                | δjl δjm |

2.2.4 Orthogonal matrices


Let A be a 3 × 3 matrix with elements aij , and define the row vectors
a(1) = (a11 , a12 , a13 ) , a(2) = (a21 , a22 , a23 ) , a(3) = (a31 , a32 , a33 ) ,

so that (a(i))j = aij. If we choose the vectors a(i) to be orthonormal:

a(1) · a(2) = a(2) · a(3) = a(3) · a(1) = 0 ,

and
|a(1)|² = |a(2)|² = |a(3)|² = 1 ,

i.e.
a(i) · a(j) = δij ,
then A is an orthogonal matrix. The rows of A form an orthonormal triad.

Properties of orthogonal matrices

• Consider

  A Aᵀ = [ a11 a12 a13 ] [ a11 a21 a31 ]   [ 1 0 0 ]
         [ a21 a22 a23 ] [ a12 a22 a32 ] = [ 0 1 0 ] ,
         [ a31 a32 a33 ] [ a13 a23 a33 ]   [ 0 0 1 ]
Therefore
A AT = I

• Taking the determinant of both sides of AAT = I gives det A det AT = 1.
Since det AT = det A, then (det A)2 = 1, and therefore

det A = ±1

• Since det A 6= 0, the inverse matrix A−1 always exists, and therefore

A−1 = AT

Multiplying this equation on the right by A gives

AT A = I

so the columns of A also form an orthonormal triad.

2.3 Pseudotensors, pseudovectors & pseudoscalars


Armed with properties of determinants, let’s now return to basis transformations.
Suppose that we now allow reflection and inversion (as well as rotation) of the basis
vectors, and represent them all by a transformation matrix L with
det L = +1 for rotations
det L = −1 for reflections and inversions
In the second case, the handedness of the basis is changed: if S is right-handed
(RH), then S 0 is left-handed (LH), and vice versa.
Before introducing pseudotensors, we note that the basis vectors always transform
as
e0i = `ij ej

A second-rank tensor or true tensor T obeys the transformation law


Tij0 = `ip `jq Tpq
for all transformations, i.e. rotations, reflections and inversions.
A second-rank pseudotensor T obeys the transformation law

Tij0 = (det L) `ip `jq Tpq

There is a change of sign for components of a pseudotensor, relative to a true tensor,


when we make a reflection or inversion of the reference axes.
We can similarly define pseudovectors, pseudoscalars and rank-n pseudotensors.
Note: det L = −1 for a basis transformation that consists of any combination of
an odd number of reflections or inversions, plus any number of rotations.

2.3.1 Vectors (revision)
Inversion of the basis vectors is defined by
(e1 , e2 , e3 ) → (e01 , e02 , e03 ) = (−e1 , −e2 , −e3 )
so that
 
−1 0 0
L =  0 −1 0 
0 0 −1
which has components `ij = −δij .
If {e i } is a standard right-handed basis, then {e0i } is left-handed.
Let a be a vector (also called a tensor of rank 1 or a polar vector or a true vector ).
We showed previously that the components ai transform as a′i = ℓij aj, so we have
a01 = −a1 a02 = −a2 a03 = −a3
The vector itself therefore transforms as
a0 ≡ a0i e0i = (−ai ) (−ei ) = ai ei = a
as expected. Thus a (true/polar) vector remains unchanged by inversion of the
basis.

[Figure: the vector a drawn with the original right-handed basis S = {e1, e2, e3} and with the inverted, left-handed basis S′ = {e′1, e′2, e′3}; the arrow between the two panels is labelled ‘reflection’.]

Inversion is sometimes called reflection in the origin - hence the label in the figure
above.

2.3.2 Pseudovectors
If a and b are (true) vectors then c = a × b is a pseudovector (also known as a
pseudotensor of rank 1 or an axial vector ).
Let’s illustrate this by considering an inversion.
c01 = a02 b03 − a03 b02 = (−a2 ) (−b3 ) − (−a3 ) (−b2 ) = +c1 etc
Therefore
c0 ≡ c0i e0i = (+ci ) (−ei ) = −ci ei = −c
Thus the direction of c = a × b is reversed in the new LH basis.

[Figure: the vectors a, b and c = a × b drawn in the basis S and again in the inverted basis S′; the arrow between the panels is labelled ‘reflection’.]
The vectors a, b and a × b form a LH triad in the S 0 basis, i.e. they always have the
same ‘handedness’ as the underlying basis.
Now consider the general case. We define the epsilon symbol in all bases, regardless of their handedness, in the same way:

εijk = +1 if (i, j, k) is an even permutation of (123)
     = −1 if (i, j, k) is an odd permutation of (123)
     =  0 otherwise

The components of c = a × b are ci = εijk aj bk in S and c′i = εijk a′j b′k in S′.
We can now determine the components of c = a × b in S′

c′i = εijk a′j b′k
    = εijk ℓjp ap ℓkq bq
    = εmjk ℓir ℓmr ℓjp ℓkq ap bq
    = det L ℓir εrpq ap bq
    = det L ℓir (a × b)r

where we used εijk = εmjk δim = εmjk ℓir ℓmr to get to the third line, and the determinant result εrpq det L = εmjk ℓmr ℓjp ℓkq to get the fourth line. So, finally

c′i = det L ℓir cr

This is our definition of a pseudovector. Equivalently, since e′i = ℓij ej, then

c′ ≡ c′i e′i = (det L ℓir cr)(ℓij ej) = det L cj ej = det L c
Therefore a pseudovector changes sign under any improper transformation such as
inversion of the basis or reflection of the basis in a plane.
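A small numerical illustration of this law (an aside, a sketch assuming numpy): under an inversion L = −I the components of the true vectors flip sign while those of their cross product do not, in agreement with c′i = det L ℓir cr.

import numpy as np

rng = np.random.default_rng(4)
a, b = rng.standard_normal((2, 3))
c = np.cross(a, b)

L = -np.eye(3)                 # inversion of the basis, det L = -1
detL = np.linalg.det(L)

a_p, b_p = L @ a, L @ b        # true-vector components: a'_i = l_ij a_j
c_from_primed = np.cross(a_p, b_p)

# c'_i = det L  l_ir c_r  reproduces the cross product of the primed components
assert np.allclose(c_from_primed, detL * (L @ c))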

2.3.3 Pseudotensors
A pseudotensor of rank n, T, is defined to be an entity whose 3ⁿ components T′ijk···op (n indices) in S′ are related to its 3ⁿ components Trst···vw (n indices) in S by

T′ijk···op = (det L) ℓir ℓjs ℓkt · · · ℓov ℓpw Trst···vw

where there are n factors of ` on the RHS, and the basis vectors of S 0 and S are
related by e 0i = `ij e j so that (L)ij = `ij .

Invariant pseudotensors: The ε-symbol is a pseudotensor of rank 3. The proof is simple and uses the fact that ε is the same in all bases, together with (det L)² = 1

ε′ijk ≡ εijk = det L det L εijk = det L ℓip ℓjq ℓkr εpqr

where we used the definition of the determinant in the last step. Furthermore, since ε is the same in all bases, it’s an isotropic or invariant rank-3 pseudotensor.
We can build pseudotensors of higher rank using a combination of vectors, tensors, δ and ε. For example:

• If ai and bi are rank 1 tensors (i.e. vectors), then the 3⁵ quantities εijk al bm are the components of a pseudotensor of rank 5.

• εijk δpq is an invariant pseudotensor of rank 5.

In general:

• The product of two tensors is a tensor

• The product of two pseudotensors is a tensor

• The product of a tensor and a pseudotensor is a pseudotensor

2.4 Some physical examples


• Velocity   v = dr(t)/dt   vector
• Acceleration a = v̇ vector

• Force F = ma vector

• Electric field E = F /q vector


(F is the force on a test charge q)

• Torque G = r×F pseudovector

• Angular velocity (ω) v = ω×r pseudovector

• Angular momentum L = r×p pseudovector

• Magnetic field (B) F = qv × B pseudovector

B(r) = (µ0 I / 4π) ∮_C dr′ × (r − r′) / |r − r′|³

(Biot–Savart Law – more later)

[Figure: a current loop C, with source point r′ and line element dr′ on the loop, and field point r, both measured from the origin O.]

• E·B pseudoscalar
A pseudoscalar changes sign under an improper transformation: a0 = (det L) a
In this case, one can easily show that

(E · B)′ = (det L) E · B

• Helicity h = p·s pseudoscalar

The pseudovector s is the angular momentum, or spin, of a particle/ball. As the ball spins, a point on it traces out a RH helix. In the figure, p is parallel to s.

• Inertia tensor (I) Li = Iij ωj rank 2 tensor


• Conductivity tensor (σ) Ji = σij Ej rank 2 tensor
• Stress tensor (P ) dFi = Pij dSj rank 2 tensor

2.5 Invariant/Isotropic Tensors


A tensor T is invariant or isotropic 1 if it has the same components, Tijk··· in any
Cartesian basis (or frame of reference), so that
Tijk··· = `iα `jβ `kγ · · · Tαβγ···
for every (orthogonal) transformation matrix L = {`ij }.
Similarly, T is an invariant pseudotensor if
Tijk··· = det L `iα `jβ `kγ · · · Tαβγ···

Theorem: If aij is a second-rank invariant tensor, then aij = λ δij

Proof: For a rotation of π/2 about the z-axis


 
0 1 0
L =  −1 0 0 
0 0 1
Since the only non-zero elements are `12 = 1, `21 = −1, `33 = 1, using aij =
`iα `jβ aαβ , we find
a11 = `1α `1β aαβ = `12 `12 a22 = a22
a13 = `1α `3β aαβ = `12 `33 a23 = a23
a23 = `2α `3β aαβ = `21 `33 a13 = −a13
1
Isotropic means “the same in all directions”.

Therefore
a11 = a22 , a13 = 0 = a23
Similarly, for a rotation of π/2 about the y-axis, we find (exercise)
a11 = a33 , a12 = 0 = a32
Finally, for a rotation of π/2 about the x-axis
a22 = a33 , a21 = 0 = a31
The only solution to these equations is aij = λ δij . We’ve already shown that δij is
an invariant second-rank tensor, therefore λ δij is the most general invariant second-
rank tensor.
One can use a similar argument to show that the only invariant vector is the zero
vector. It’s obvious that no non-zero vector has the same components in all bases!
Theorem: There is no invariant tensor of rank 3. The most general invariant
pseudotensor of rank 3 has components
aijk = λ εijk

Proof: This is similar to the rank 2 case [tutorial].


Theorem: The most general 4th rank invariant tensor has components
aijkl = λ δij δkl + µ δik δjl + ν δil δjk

The proof is long and not very illuminating. See, for example, Matthews, Vector
Calculus, (Springer) or Jeffreys, Cartesian Tensors, (CUP). However, it’s easy to
show that the expression above is indeed an invariant tensor.
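It is also easy to check numerically; the sketch below (an aside assuming numpy) builds the rank-4 expression and verifies that it is unchanged by the rank-4 transformation law for an arbitrary rotation.

import numpy as np

delta = np.eye(3)
lam, mu, nu = 1.3, -0.7, 2.1
a = (lam * np.einsum('ij,kl->ijkl', delta, delta)
     + mu * np.einsum('ik,jl->ijkl', delta, delta)
     + nu * np.einsum('il,jk->ijkl', delta, delta))

# An arbitrary rotation: orthogonalise a random matrix via QR and fix det = +1.
Q, _ = np.linalg.qr(np.random.default_rng(5).standard_normal((3, 3)))
L = Q if np.linalg.det(Q) > 0 else -Q

# a'_ijkl = l_ia l_jb l_kc l_ld a_abcd  should equal a_ijkl
a_prime = np.einsum('ia,jb,kc,ld,abcd->ijkl', L, L, L, L, a)
assert np.allclose(a_prime, a)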

Invariant tensors of higher rank

• The most general rank-5 invariant pseudotensor has components


aijklm = λ δij εklm + µ δik εjlm + . . .
where the dots indicate all terms consisting of a constant × further permuta-
tions of the 5 indices.
• The most general rank-6 invariant tensor has components
aijklmn = λ δij δkl δmn + . . .
Note that invariant tensors involving products of two epsilons can always be
rewritten as sums of products of δs using the expression for the product of two
epsilons.
• The most general invariant tensor of rank 2n is a sum of products of constants
times n Kronecker deltas. There is no invariant pseudotensor of rank 2n.
• Similarly, the most general invariant pseudotensor of rank 2n + 1 is a sum of
products of constants times one ε and n − 1 Kronecker deltas. There is no
invariant tensor of rank 2n + 1.

Chapter 3

Rotation, reflection & inversion


tensors

The transformations of the previous chapters in which we rotate (or reflect or invert)
the basis vectors keeping the vector fixed are called passive transformations.
Alternatively we can keep the basis fixed and rotate (or reflect or invert) the vector.
These are called active transformations and are represented by second-rank tensors.

3.1 The rotation tensor

3.1.1 Rotation about an arbitrary axis

Consider a rotation of a rigid body, through angle θ, about an axis which points in the direction of the unit vector n. The axis passes through a fixed origin O.

The point P is rotated to Q. The position vector x is rotated to y.

In the first diagram, OS→ is the projection of x onto the n direction, i.e. (x · n) n.

In the second diagram, n × x is parallel to TQ→.

[Figure: two views of the rotation: P is carried to Q through angle θ about the axis n through O; S is the foot of the projection of x onto n, and T is the projection of Q onto SP.]

y = OS→ + ST→ + TQ→
  = (x · n) n + SQ cos θ (x − (x · n) n)/|SP| + SQ sin θ (n × x)/|n × x|
  = (x · n) n + cos θ (x − (x · n) n) + sin θ (n × x)

(as SP = SQ and |n × x| = SP = SQ).

This gives the important result

y = x cos θ + (1 − cos θ)(n · x) n + (n × x) sin θ

In index notation, this is

yi = xi cos θ + (1 − cos θ) nj xj ni + εikj nk xj sin θ

which we write as
yi = Rij(θ, n) xj

This is a linear relation between the vectors x and y, so R(θ, n) is a rank 2 tensor, the rotation tensor, with components

Rij(θ, n) = δij cos θ + (1 − cos θ) ni nj − εijk nk sin θ

3.1.2 Some important properties of the rotation tensor


(i) In any basis R is represented by an orthogonal matrix, with components Rij ,
and det R = 1. Proofs [tutorial]
(ii) It is straightforward to show that [tutorial]

     Tr R = Rii = 1 + 2 cos θ   [independent of n]
     −(1/2) εkij Rij = nk sin θ

     If R is known, then the angle θ and the axis of rotation n can be determined.
     Note: R has 1 + (3 − 1) = 3 independent parameters.

(iii) The product of two rotations x → y (by R) → z (by S) is given by z = S R x.

(iv) Consider a small (infinitesimal) rotation δθ, for which
     cos δθ = 1 + O(δθ²) and sin δθ = δθ + O(δθ³), then

     Rij = δij − εijk nk δθ

     A quicker (and sufficient) graphical proof follows directly from the diagram on the right, which gives

     y − x = (n × x) δθ

     from which the result above follows directly.

(v) For θ ≠ 0, π, R has only one real eigenvalue +1, with one real eigenvector n.
    [tutorial]
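These properties are easily confirmed numerically; the sketch below (an aside, assuming numpy) builds Rij(θ, n) from the boxed formula and checks orthogonality, the trace, and the axis-extraction formula in (ii).

import numpy as np

eps = np.zeros((3, 3, 3))
eps[0, 1, 2] = eps[1, 2, 0] = eps[2, 0, 1] = +1.0
eps[2, 1, 0] = eps[1, 0, 2] = eps[0, 2, 1] = -1.0
delta = np.eye(3)

def rotation_tensor(theta, n):
    """R_ij = delta_ij cos(theta) + (1 - cos(theta)) n_i n_j - eps_ijk n_k sin(theta)."""
    n = np.asarray(n, dtype=float)
    n = n / np.linalg.norm(n)
    return (np.cos(theta) * delta
            + (1.0 - np.cos(theta)) * np.outer(n, n)
            - np.sin(theta) * np.einsum('ijk,k->ij', eps, n))

theta, n = 0.7, np.array([1.0, 2.0, 2.0]) / 3.0
R = rotation_tensor(theta, n)

assert np.allclose(R @ R.T, delta)                          # orthogonal
assert np.isclose(np.linalg.det(R), 1.0)                    # proper rotation
assert np.isclose(np.trace(R), 1.0 + 2.0 * np.cos(theta))   # Tr R = 1 + 2 cos(theta)

# -(1/2) eps_kij R_ij = n_k sin(theta) recovers the rotation axis
axis = -0.5 * np.einsum('kij,ij->k', eps, R)
assert np.allclose(axis, n * np.sin(theta))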

3.2 Reflections and inversions
Consider reflection of a vector x → y in a plane with unit normal n.

[Figure: x, its mirror image y, and the unit normal n to the reflecting plane.]

From the figure, y = x − 2 (x · n) n.
In index notation, this becomes

yi = σij xj   where   σij = δij − 2 ni nj

The quantities σij are the components of the reflection tensor σ.

Inversion of a vector in the origin is given by y = −x . This defines the parity


operator P :
yi = Pij xj where Pij = −δij

The parity operator is an invariant tensor with components −δij .


For reflections and inversions, respectively, det σ = det P = −1.
Note that for reflections and inversions, performing the operation twice yields the
original vector, i.e. σ 2 = I , P 2 = I.

3.3 Projection operators


P is a parallel projection operator onto a vector u if
Pu = u and Pv = 0
where v is any vector orthogonal to u , i.e. v · u = 0 . Similarly Q is an orthogonal
projection to u if
Qu = 0 and Qv = v
so that Q = I − P. Suitable operators are (exercise: check this!)

Pij = ui uj / u²   and   Qij = δij − ui uj / u²

These have the properties

P² = P ,   Q² = Q ,   P Q = Q P = 0

They’re also unique. For example, if there exists another operator T with the same properties as P, i.e. T u = u and T v = 0, then for any vector w ≡ µ u + ν v + λ u × v we have

(P − T) w = (µ u + 0 + 0) − (µ u + 0 + 0) = 0

because

(P (u × v))i = Pij (u × v)j = (ui uj /u²) εjkl uk vl = 0

This holds for all vectors w, so T = P.
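A minimal numerical check of these projector properties (an aside, assuming numpy):

import numpy as np

rng = np.random.default_rng(6)
u = rng.standard_normal(3)

P = np.outer(u, u) / (u @ u)      # parallel projector  P_ij = u_i u_j / u^2
Q = np.eye(3) - P                 # orthogonal projector Q = I - P

assert np.allclose(P @ u, u) and np.allclose(Q @ u, 0.0)
assert np.allclose(P @ P, P) and np.allclose(Q @ Q, Q)
assert np.allclose(P @ Q, np.zeros((3, 3)))

v = np.cross(u, rng.standard_normal(3))   # any vector orthogonal to u
assert np.allclose(P @ v, 0.0) and np.allclose(Q @ v, v)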

3.4 Active and passive transformations
• Rotation of a vector in a fixed basis is called an active transformation
x → y with yi = Rij xj in the {ei } basis
• Rotation of the basis whilst keeping the vector fixed is called a passive trans-
formation
{ei } → {e0i } and xi → x0i = `ij xj

If we set Rij = `ij , then numerically yi = x0i .


Consider a simple example of both types of rotation:

Rotation of a vector about the z-axis

Rij(θ, e3) = δij cos θ + (1 − cos θ) δi3 δj3 − εijk δk3 sin θ

           = [ cos θ  −sin θ  0
               sin θ   cos θ  0
                0       0     1 ]

where we used ni = δi3.
This is an active rotation through angle θ.

Rotation of the basis about the z-axis

e′i = ℓij ej ≡ Rij ej

In components

e′1 = cos θ e1 − sin θ e2
e′2 = sin θ e1 + cos θ e2
e′3 = e3

This is a passive rotation through angle −θ.

We conclude that

An active rotation of the vector x through angle θ is equivalent numer-


ically to a passive rotation of the basis vectors by an equal and opposite
amount.
Colloquially, rotating a vector in one direction is numerically equivalent
to rotating the basis in the opposite direction.

The general case can be built from three rotations (Euler angles).

Chapter 4

The inertia tensor

4.1 Angular momentum and kinetic energy

Suppose a rigid body of arbitrary shape rotates with angular velocity ω = ω n about a fixed axis, parallel to the unit vector n, which passes through the origin.

Consider a small element of mass dm in volume dV at the point P, with position vector r relative to O.

If the rigid body has density (mass per unit volume) ρ(r), then dm = ρ dV.

[Figure: the rigid body rotating about the axis ω through O; the element dm at P, position vector r, sweeps out a circle about the axis.]

The velocity of the element is

v = ω × r

We can see this geometrically from the figure. The distance |δr| moved in time δt is

|δr| = r sin φ δθ = δθ |n × r|

So its velocity is

v = δr/δt = ω × r   where   ω = (δθ/δt) n

Alternatively, we can use the rotation tensor R(θ, n)

δxi = Rij(δθ, n) xj − xi = (−εijk nk δθ + O((δθ)²)) xj

⇒ δxi/δt = (n × r)i δθ/δt

which again gives v = ω × r.

4.1.1 Angular momentum
The angular momentum L of a point particle of mass m at position r moving with
velocity v = ṙ is L = r × p, where the momentum p = mv.
The angular momentum dL of an element of mass dm = ρ dV at r is
dL = ρ(r) dV r × v
The angular momentum of the entire rotating body is then

L = ∫_body ρ r × (ω × r) dV

In components

Li = ∫_body ρ εijk xj (εklm ωl xm) dV
   = ∫_body ρ (δil δjm − δim δjl) xj ωl xm dV
   = ∫_body ρ (r² ωi − xi xj ωj) dV

Thus

Li = Iij ωj   with   Iij = ∫_body ρ (r² δij − xi xj) dV

The geometric quantity I(O) (where O refers to the origin) is called the inertia
tensor.1 It is a tensor because L is pseudovector, ω is a pseudovector, and hence
from the quotient theorem I is a tensor.
Iij is symmetric and independent of the rotation axis n , it’s a property of the body.

4.1.2 Kinetic energy


The kinetic energy, dT, of an element of mass dm is dT = ½ (ρ dV) |ω × r|². The kinetic energy of the body is then

T = ½ ∫_body ρ (εijk ωj xk)(εilm ωl xm) dV
  = ½ ∫_body ρ (δjl δkm − δjm δkl) ωj xk ωl xm dV
  = ½ ∫_body ρ (ω² r² − (r · ω)²) dV
  = ½ ∫_body ρ (r² δij − xi xj) dV ωi ωj

which gives

T = ½ Iij ωi ωj = ½ L · ω
1
Some of my colleagues insist that I(O) should be called the moment of inertia tensor. You
will see both names in books.

Alternative (more familiar) forms

Recalling that the angular velocity may be written as ω = ω n, consider

n · L = ni Iij ωj = Iij ni nj ω ≡ I⁽ⁿ⁾ ω

L · n is the component of angular momentum parallel to the axis n, and

I⁽ⁿ⁾ = Iij ni nj = ∫_body ρ (r² − (r · n)²) dV ≡ ∫_body ρ r⊥² dV

is the moment of inertia about n, with r⊥ the perpendicular distance from the n-axis. Similarly for the kinetic energy, so that

L⁽ⁿ⁾ = I⁽ⁿ⁾ ω ,   T = ½ I⁽ⁿ⁾ ω²   with   L⁽ⁿ⁾ = L · n ,   I⁽ⁿ⁾ = Iij ni nj

Example: Consider a cube of side a, of constant density ρ and mass M = ρa³, with the origin O at one corner and axes along the edges (as in the figure).

Iij(O) = ∫ ρ (r² δij − xi xj) dV

In this case

I11 = ρ ∫₀^a dx dy dz (x² + y² + z² − x²)
    = ρ [ (1/3) y³ x z + (1/3) z³ x y ]₀^a = (2/3) ρ a⁵ = (2/3) M a²

I12 = ρ ∫₀^a dx dy dz (−x y)
    = −ρ [ (1/2) x² (1/2) y² z ]₀^a = −(1/4) ρ a⁵ = −(1/4) M a²

Similarly for the other components. By symmetry,

I(O) = M a² [  2/3  −1/4  −1/4
              −1/4   2/3  −1/4
              −1/4  −1/4   2/3 ]

4.1.3 The parallel axes theorem


It’s often more useful, and also simpler, to find the inertia tensor about the centre of mass G, rather than about an arbitrary point O. There is, however, a simple relationship between them.

[Figure: the body with mass element dm at P; r is measured from O, r′ from the centre of mass G, and R = OG→.]

Taking O to be the origin, and OG→ = R, we have r = R + r′, giving

Iij(O) = ∫ ρ(r) (r² δij − xi xj) dV
       = ∫ ρ′(r′) { |R + r′|² δij − (Xi + x′i)(Xj + x′j) } dV′

In the above, ρ(r) = ρ(R + r′) ≡ ρ′(r′), and we changed integration variables to r′.
Expanding the integrand and using the definition of G, namely ∫ ρ′(r′) r′ dV′ = 0, we get

Iij(O) = ∫ ρ′(r′) { (R² + r′²) δij − Xi Xj − x′i x′j } dV′

Hence

Iij(O) = Iij(G) + M (R² δij − Xi Xj)

where M = ∫ ρ′(r′) dV′ is the total mass of the body. This is a general result; given I(G) we can easily find the inertia tensor about any other point.
The general result above is often called the parallel axes theorem. However, the parallel axes theorem technically refers to the inertia tensor about an axis n through G, which is parallel to the original axis n through O, as in the figure above, so that

I⁽ⁿ⁾(O) = I⁽ⁿ⁾(G) + M R⊥²

where R⊥ ≡ √(R² − (R · n)²) is the perpendicular distance of G from the n axis through O.

Example (revisited): In our previous example, the centre of mass G is at the centre of the cube, with position vector R = (a/2, a/2, a/2). Using Cartesian coordinates with their origin at G,

I11(G) = ρ ∫_{−a/2}^{a/2} dx dy dz (x² + y² + z² − x²)
       = ρ { (1/3)[y³] [x] [z] + (1/3)[z³] [x] [y] }  (each bracket evaluated from −a/2 to a/2)
       = ρ { (1/3) · 2(a/2)³ · 2(a/2) · 2(a/2) } · 2 = (1/6) ρ a⁵ = (1/6) M a²

and

I12(G) = ρ ∫_{−a/2}^{a/2} dx dy dz (−xy) = 0   because   [x²/2]_{−a/2}^{a/2} = 0

Similarly for all the other components.
The inertia tensor about the centre of mass is then

Iij(G) = M a² [ 1/6   0    0
                 0   1/6   0     =  (M a²/6) δij
                 0    0   1/6 ]

and

M (R² δij − Xi Xj) = M a² [  1/2  −1/4  −1/4
                            −1/4   1/2  −1/4
                            −1/4  −1/4   1/2 ]

Since 1/2 + 1/6 = 2/3, this reproduces our previous result for Iij(O).
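The algebra in the last two examples can be cross-checked numerically. The sketch below (an aside assuming numpy, with M = a = 1) evaluates Iij(O) for the cube by direct summation over a fine grid and then verifies Iij(O) = Iij(G) + M (R² δij − Xi Xj):

import numpy as np

M, a, N = 1.0, 1.0, 60                       # unit mass and side length, N^3 grid cells

# Cell-centre coordinates of a uniform cube with one corner at the origin O.
s = (np.arange(N) + 0.5) * a / N
x, y, z = np.meshgrid(s, s, s, indexing='ij')
pts = np.stack([x, y, z], axis=-1).reshape(-1, 3)
dm = M / len(pts)                            # equal mass per cell

# I_ij(O) = sum over cells of dm (r^2 delta_ij - x_i x_j)
r2 = np.einsum('pi,pi->p', pts, pts)
I_O = dm * (np.einsum('p,ij->ij', r2, np.eye(3)) - np.einsum('pi,pj->ij', pts, pts))

I_exact = M * a**2 * np.array([[ 2/3, -1/4, -1/4],
                               [-1/4,  2/3, -1/4],
                               [-1/4, -1/4,  2/3]])
assert np.allclose(I_O, I_exact, atol=1e-3)

# Parallel axes: I(O) = I(G) + M (R^2 delta_ij - X_i X_j), with I(G) = (M a^2 / 6) * identity
R = np.array([a/2, a/2, a/2])
I_G = (M * a**2 / 6.0) * np.eye(3)
assert np.allclose(I_G + M * ((R @ R) * np.eye(3) - np.outer(R, R)), I_exact)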

4.1.4 Diagonalisation of rank-two tensors
Question: are there any directions for ω such that L is parallel to ω?
If so, then L = λ ω, and hence
(Iij − λδij ) ωj = 0
For a non-trivial solution of these three simultaneous linear equations (i.e. ω 6= 0),
we must have det (Iij − λδij ) = 0. Expanding the determinant, or writing it as
det (Iij − λδij) = (1/6) εijk εlmn (Iil − λδil)(Ijm − λδjm)(Ikn − λδkn) = 0

and then expanding gives

P − Qλ + Rλ² − λ³ = 0

where

P = (1/6) εijk εlmn Iil Ijm Ikn = det I

Q = (1/6) εijk εlmn (δil Ijm Ikn + Iil δjm Ikn + Iil Ijm δkn)
  = (1/6) (δjm δkn − δjn δkm) Ijm Ikn × 3
  = (1/2) [(Tr I)² − Tr I²]

R = (1/6) εijk εlmn (δil δjm Ikn + δil Ijm δkn + Iil δjm δkn)
  = (1/6) 2 δkn Ikn × 3
  = Tr I
For any tensor A, we know that det A and TrA are invariant (the same in any basis),
hence the quantities P , Q, R are invariants of the tensor I (i.e. their values are also
the same in any basis).
The three values of λ (i.e. the solutions of the cubic equation) are the eigenvalues
of the rank-two tensor, and the vectors ω are its eigenvectors.2 We will generally
call the eigenvectors e.

Eigenvectors and eigenvalues: If we take I ω = λ ω (or Iij ωj = λ ωi ), and


multiply on the left by a transformation matrix L, we obtain (in matrix notation)

(L I Lᵀ)ij (L ω)j = λ (L ω)i   ⇒   I′ij (L ω)j = λ (L ω)i

where we inserted Lᵀ L = 1 between I and ω.

In the primed basis, we have by definition Iij0 ωj0 = λ0 ωi0 . Comparing with the second
equation above, we see that eigenvectors ω are indeed vectors, i.e. they transform
as vectors: ωi0 = `ij ωj .
Similarly, eigenvalues are scalars, i.e. they transform as scalars: λ0 = λ .
Note that only the direction (up to a ± sign) of the eigenvectors is determined by
the eigenvalue equation, the magnitude is arbitrary.
The answer to our question is that we must find the eigenvalues λ(i) , i = 1, 2, 3 and
the corresponding eigenvectors ω (i) , whence L(i) = λ(i) ω (i) (no sum on i).
2
Yes, the language is indeed the same as for matrices.

Eigenvalues and eigenvectors of a real symmetric tensor

Theorems

(i) All of the eigenvalues of a real symmetric matrix are real.


(ii) The eigenvectors corresponding to distinct eigenvalues are orthogonal.
If a subset of the eigenvalues is degenerate (eigenvalues are equal), the corre-
sponding eigenvectors can be chosen to be orthogonal because:
(a) the eigenvector subspace corresponding to the degenerate eigenvalues is
orthogonal to the other eigenvectors;
(b) within this subspace, the eigenvectors can be chosen to be orthogonal by
the Gram-Schmidt procedure.

Proofs will not be given here – see books or lecture notes from mathematics courses.

Diagonalisation of a real symmetric tensor

Let T be a real second-rank symmetric tensor with real eigenvalues λ(1), λ(2), λ(3) and orthonormal eigenvectors ℓ(1), ℓ(2), ℓ(3), so that T ℓ(i) = λ(i) ℓ(i) (no summation) and ℓ(i) · ℓ(j) = δij. Let the matrix L have elements

ℓij = ℓ(i)j ≡ [ ℓ(1)1  ℓ(1)2  ℓ(1)3
                ℓ(2)1  ℓ(2)2  ℓ(2)3    = ℓ(i) · ej
                ℓ(3)1  ℓ(3)2  ℓ(3)3 ]ij

i.e. the i-th row of L is the i-th eigenvector of T. L is an orthogonal matrix

(LLᵀ)ij = ℓim ℓjm = ℓ(i)m ℓ(j)m = δij

We can always choose the normalised eigenvectors ℓ(i) to form a right-handed basis:

det L = εijk ℓ1i ℓ2j ℓ3k = ℓ(3) · (ℓ(1) × ℓ(2)) = +1

With this choice, L is a rotation matrix which transforms S to S′.
The tensor T transforms as (summing over the indices p, q only)

T′ij = ℓip ℓjq Tpq = ℓ(i)p Tpq ℓ(j)q = ℓ(i)p λ(j) ℓ(j)p

or

T′ij = λ(j) δij = [ λ(1)   0     0
                     0    λ(2)   0
                     0     0    λ(3) ]ij
Thus we have found a basis or frame, S 0 , in which the tensor T takes a diagonal
form; the diagonal elements are the eigenvalues of T .3
3
Thus tensors may be diagonalised in the same way as matrices.

The inertia tensor

When studying rigid body dynamics, it’s (usually) best to work in a basis in which
the inertia tensor is diagonal. The eigenvectors of I define the principal axes of the
tensor. In this (primed) basis
 
I′ = [ A  0  0
       0  B  0
       0  0  C ]

where the (positive) quantities A, B, C are called the principal moments of inertia.
In this basis, the angular momentum and kinetic energy take the form

L = A ω′1 e′1 + B ω′2 e′2 + C ω′3 e′3

T = ½ (A ω′1² + B ω′2² + C ω′3²)


For a free body (i.e. no external forces), L and T are conserved (time-independent),
but ω will in general be time dependent.
Note that the angular momentum L is parallel to the angular velocity ω when
both are parallel to one of the principal axes of the inertia tensor. For example, if
ω = ω′1 e′1 then
L = A ω′1 e′1
In this case, the body is rotating about a principal axis which passes through its
centre of mass.
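As a concrete illustration (an aside assuming numpy), diagonalising the corner inertia tensor of the cube from section 4.1 gives its principal moments and principal axes:

import numpy as np

# Cube of mass M, side a, inertia tensor about a corner (units M = a = 1).
I_O = np.array([[ 2/3, -1/4, -1/4],
                [-1/4,  2/3, -1/4],
                [-1/4, -1/4,  2/3]])

# eigh is appropriate for a real symmetric tensor: real eigenvalues,
# orthonormal eigenvectors (returned as columns), in ascending order.
moments, axes = np.linalg.eigh(I_O)

print(moments)        # principal moments: 1/6 (axis along the body diagonal), 11/12, 11/12
print(axes.T)         # each row is a principal axis

# In the principal-axes basis the tensor is diagonal: L I L^T = diag(moments)
L = axes.T
assert np.allclose(L @ I_O @ L.T, np.diag(moments))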
Notes:

• The principal axes basis is used in the Lagrangian Dynamics course to study
the rotational motion of a free rigid body in the Newtonian approach to dy-
namics, and the motion of a spinning top with principal moments (A, A, C) in
the Lagrangian approach.

• The principal axes basis/frame is ‘fixed to the body’, i.e. it moves with the
rotating body, and is therefore a non-inertial frame.

The inertia ellipsoid

The remainder of this chapter is non-examinable, but it may be of interest to those taking the course Lagrangian Dynamics.

A geometrical picture is provided by the inertia ellipsoid, which is defined by

I_ij ω_i ω_j = 1

A factor of √(2T) is absorbed into ω by convention: ω_i → ω_i / √(2T).

In the principal axes basis, where ω′_i = ℓ_ij ω_j, we have

A ω′_1² + B ω′_2² + C ω′_3² = 1

This is called the normal form. It describes an ellipsoid because A, B, C are all positive. (This follows from the definition, for example A = ∫ ρ (y² + z²) dV.)

[Figure: the inertia ellipsoid, with principal axes ω′_1, ω′_2, ω′_3, the angular velocity ω at a point P on the surface, and the normal n. The angular momentum L is labelled h in the figure. To be fixed . . . ]

In any basis, a small displacement ω → ω + dω on the ellipsoidal surface at the point P, with normal n, obeys

I_ij ω_i dω_j = L_j dω_j = 0

for all dω. Therefore L is orthogonal to dω and parallel to n, i.e. L is always orthogonal to the surface of the ellipsoid at P.

In the principal axes basis

L = A ω′_1 e′_1 + B ω′_2 e′_2 + C ω′_3 e′_3

The directions for which L is parallel to ω are obviously the directions of the principal axes of the ellipsoid. For example, if ω = ω′_1 e′_1 then

L = A ω′_1 e′_1

In this case, the body is rotating about a principal axis which passes through its centre of mass. This gives a ‘geometrical’ answer to our original question.

Notes:

• If two principal moments are identical (A, A, C), the ellipsoid becomes a
spheroid.

• If all three principal moments are identical the ellipsoid becomes a sphere, and
L is always parallel to ω.

Chapter 5

Taylor expansions

Taylor expansion is one of the most important and fundamental techniques in the
physicist’s toolkit. It allows a differentiable function to be expressed as a power
series in its argument(s). This is useful when approximating a function, it often
allows the problem to be ‘solved’ in some range of interest, and it’s used in deriving
fundamental differential equations. We shall use the expression ‘Taylor’s Theorem’
interchangeably with ‘Taylor expansion’.
We assume familiarity with Taylor expansions of functions of one variable, so we
won’t go through this material on the blackboard in the lectures. However, for
completeness, we include some comprehensive notes that you should read and/or
work through in your own time.
You will most likely have seen the multivariate Taylor expansion beyond leading
order in previous courses, but probably not in the way it’s done here. . .

5.1 The one-dimensional case


Let f (x) have a continuous mth -order derivative f (m) (x) in a ≤ x ≤ b, so that
Z x1
f (m) (x0 ) dx0 = f (m−1) (x1 ) − f (m−1) (a)
a

Integrating a total of m times gives


∫_a^{x_m} · · · ∫_a^{x_1} f^(m)(x_0) dx_0 · · · dx_{m−1}

  = ∫_a^{x_m} · · · ∫_a^{x_2} [ f^(m−1)(x_1) − f^(m−1)(a) ] dx_1 · · · dx_{m−1}

  = ∫_a^{x_m} · · · ∫_a^{x_3} [ f^(m−2)(x_2) − f^(m−2)(a) − (x_2 − a) f^(m−1)(a) ] dx_2 · · · dx_{m−1}

  = f(x_m) − f(a) − (x_m − a) f′(a) − (1/2!)(x_m − a)² f″(a) − · · · − (1/(m−1)!)(x_m − a)^{m−1} f^(m−1)(a)

where we used the basic integral
Z x
1
(y − a)n−1 dy = (x − a)n
a n
Now let x_m → x, which gives

f(x) = f(a) + (x − a) f′(a) + (1/2!)(x − a)² f″(a) + · · · + (1/(m−1)!)(x − a)^{m−1} f^(m−1)(a) + R_m ,

where n! = n(n − 1) · · · 1 is the usual factorial function, with 0! = 1, and the remainder R_m is

R_m = ∫_a^x · · · ∫_a^{x_1} f^(m)(x_0) dx_0 · · · dx_{m−1}
But from the mean value theorem applied to f (m) , we have
∫_a^x f^(m)(x_0) dx_0 = (x − a) f^(m)(ξ) ,   a ≤ ξ ≤ x

which gives the “Lagrange form” for the remainder

R_m(x) = (1/m!) (x − a)^m f^(m)(ξ) ,   a ≤ ξ ≤ x
Notes
Ra
• We can repeat the proof above for x
· · · where x ∈ [c, a] with c ≤ a ≤ b.
Since nothing changes, we can talk about expansion in a region about x = a.
• If R_m → 0 as m → ∞ (as usually assumed here), we have an infinite series

  f(x) = Σ_{n=0}^{∞} (1/n!) f^(n)(a) (x − a)^n

This is the Taylor expansion of f (x) about x = a.


The set of x values for which the series converges is called the region of con-
vergence of the Taylor expansion.
• If a = 0 , then

X 1 (n)
f (x) = f (0) xn
n=0
n!
The Taylor expansion about x = 0 is called the Maclaurin expansion.
Physicist’s “proof”: We can bypass the formal proof above by assuming that a power series expansion of f(x) exists (i.e. the polynomials x^n form a complete basis), so that

f(x) = Σ_{n=0}^{∞} a_n x^n .

Now differentiate p times, set x = 0, and equate coefficients for each p, to obtain

f^(p)(0) = 0 + · · · + p! a_p + · · · + 0   which gives   a_p = (1/p!) f^(p)(0)   (as before)

5.1.1 Examples

Example 1: Expand the function

f (x) = sin x

about x = 0. We need

f^(2n)(0) = (−1)^n sin 0 = 0
f^(2n+1)(0) = (−1)^n cos 0 = (−1)^n

Now, since |f^(m)(ξ)| ≤ 1, then, for fixed x,

|R_m| = | (1/m!) x^m f^(m)(ξ) | ≤ (1/m!) |x|^m → 0

Therefore

sin x = Σ_{n=0}^{∞} (−1)^n x^{2n+1}/(2n+1)! = x − x³/3! + x⁵/5! + . . .

[Figure: sin x compared with the truncated expansions x and x − x³/3! at small x.]
This ‘small x’ expansion is shown in the figure.
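[A quick numerical illustration of the truncated series and the remainder bound |R_m| ≤ |x|^m/m!; this sketch is not from the notes and assumes only standard-library Python.]

import math

def sin_taylor(x, terms):
    """Partial sum  sum_{n<terms} (-1)^n x^(2n+1)/(2n+1)!"""
    return sum((-1)**n * x**(2*n + 1) / math.factorial(2*n + 1)
               for n in range(terms))

x = 0.5
for terms in (1, 2, 3, 4):
    approx = sin_taylor(x, terms)
    m = 2 * terms + 1                       # order of the first neglected term
    bound = abs(x)**m / math.factorial(m)   # Lagrange bound |R_m| <= |x|^m / m!
    print(terms, approx, abs(math.sin(x) - approx) <= bound)   # bound holds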

Example 2: Expand the function

f (x) = (1 + x)α

about x = 0. In this case


f^(n)(0) = α(α − 1) · · · (α − n + 1) ≡ α!/(α − n)!

giving

f(x) = Σ_{n=0}^{∞} [ α!/(n!(α − n)!) ] x^n ≡ Σ_{n=0}^{∞} \binom{α}{n} x^n

The Taylor expansion includes the binomial expansion; α need not be a positive integer.

Example 3: a ‘problem’ case. Consider, for example, the well-behaved function

f(x) = exp( −1/x² )
Now f(0) = 0, and f^(n)(0) = 0 ∀n, so the Taylor series about x = 0 gives

f(x) = 0 + 0 + 0 + . . . = 0   ∀x

which is clearly wrong for x ≠ 0. Beware of essential singularities – not all functions with an infinite number of derivatives can be expressed as a Taylor series. See Laurent Series in Honours Complex Variables next semester.

5.1.2 A precursor to the three-dimensional case
If we regard f (x + a) ≡ g(a) temporarily as a function of a only, we can write g(a)
as a Maclaurin series in powers of a

X 1 (n)
f (x + a) ≡ g(a) = g (0) an , g (n) (0) = f (n) (x)
n=0
n!

which we can rewrite as

f(x + a) = Σ_{n=0}^{∞} (1/n!) f^(n)(x) a^n ≡ exp( a d/dx ) f(x)

The differential operator exp( a d/dx ) is defined by its power-series expansion. This is the form that we shall generalise to three dimensions in an elegant way.
It can also be obtained by first defining

F (t) ≡ f (x + at)

which we regard as a function of t. We need the expansion in powers of t about t = 0, namely

F(t) = Σ_{n=0}^{∞} (t^n/n!) F^(n)(0)        (5.1)

Noting that F^(n)(0) = a^n f^(n)(x), and setting t = 1, we find

f(x + a) = F(1) = Σ_{n=0}^{∞} (1/n!) a^n f^(n)(x)

as before.

5.2 The three-dimensional case


With this trick, we can use the one-dimensional result to find the Taylor expansion
of φ(r + a) in powers of a about the point r . Let

F (t) ≡ φ(r + ta) ≡ φ(u) (where we defined u = r + ta)



X tn (n)
= F (0)
n=0
n!

where we used equation (5.1) above. We want φ(r + a) which is F (1).


Using the chain rule, the first derivative of F (t) with respect to t is

∂φ(u) ∂ui ∂φ
F (1) (t) =

= ai = a · ∇ u φ(u)
∂ui ∂t ∂ui

where we used
∂ui ∂ ∂φ
= (xi + tai ) = ai and defined a · ∇ u ≡ ai
∂t ∂t ∂ui

The n-th derivative of F(t) is

F^(n)(t) = (a · ∇_u)^n φ(u)   and hence   F^(n)(0) = (a · ∇_r)^n φ(r)        (5.2)

For F(1) we have

φ(r + a) = Σ_{n=0}^{∞} (1/n!) (a · ∇_r)^n φ(r) ≡ exp( a · ∇_r ) φ(r)

This is the Taylor expansion of a scalar field φ(r) in three dimensions, in a rather
elegant form.
Generalisation to an arbitrary tensor field is easy. Simply replace φ(r) by Tij··· (r)
in the above expression.

Example: Find the Taylor expansion of φ(r + a) = 1/|r + a| for r ≫ a.
Since φ(r) = 1/|r| = 1/r, we have

1/|r + a| = Σ_{n=0}^{∞} (1/n!) (a · ∇_r)^n (1/r)

          = 1/r + (1/1!)(a_i ∂_i)(1/r) + (1/2!)(a_i ∂_i)(a_j ∂_j)(1/r) + · · ·

          = 1/r − a · r/r³ + [ 3(a · r)² − a² r² ]/(2r⁵) + O(1/r⁴)

Exercise: check this explicitly.


This result is used in the multipole expansion in electrostatics and magnetostatics.
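[The exercise can be done symbolically; this sketch (not from the notes, assuming sympy is available) scales a by a small parameter ε, expands to second order, and compares with the three terms quoted above.]

import sympy as sp

x, y, z, a1, a2, a3, eps = sp.symbols('x y z a1 a2 a3 epsilon', positive=True)
r = sp.Matrix([x, y, z])
a = eps * sp.Matrix([a1, a2, a3])          # a scaled by a small parameter

rmag = sp.sqrt(r.dot(r))
exact = 1 / sp.sqrt((r + a).dot(r + a))

series = sp.series(exact, eps, 0, 3).removeO()     # expand to O(epsilon^2)

claimed = (1/rmag - a.dot(r)/rmag**3
           + (3*a.dot(r)**2 - a.dot(a)*rmag**2) / (2*rmag**5))

print(sp.simplify(series - claimed))   # -> 0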

Chapter 6

Curvilinear coordinates

As you have seen in previous courses, it is often convenient to work with coordinate
systems other than Cartesian coordinates {xi }, i.e. (x1 , x2 , x3 ) or (x, y, z). Since
most of the material in this chapter should be familiar, albeit in a slightly different
formalism, we won’t go through it on the blackboard – as suggested by the 2018/19
class in their course feedback! Instead, you should read through it on your own.

For example, spherical polar coordinates (r, θ, φ) are defined by:

x = r sin θ cos φ
y = r sin θ sin φ
z = r cos θ

[Figure: the point P with spherical polar coordinates (r, θ, φ) relative to Cartesian axes (x, y, z).]



We shall set up a formalism to deal with rather general coordinate systems, of which
spherical polars are a very important example.
Suppose we make a transformation from the Cartesian coordinates (x1 , x2 , x3 ) to
the variables (u1 , u2 , u3 ), which are functions of the {xi }

u1 = u1 (x1 , x2 , x3 )
u2 = u2 (x1 , x2 , x3 )
u3 = u3 (x1 , x2 , x3 )

If the variables {ui } are single-valued functions of the variables {xi }, then we can
make the inverse transformations,

xi = xi (u1 , u2 , u3 ) for i = 1, 2, 3,

except possibly at certain points.


A point may be specified by its Cartesian coordinates {xi }, or its curvilinear coor-
dinates {ui }. We may define the curvilinear coordinates by equations giving {xi }
as functions of {ui }, or vice-versa.1
1
Sometimes the curvilinear coordinates are called (u, v, w), just as Cartesians are called (x, y, z).

• For Cartesian coordinates, the surfaces ‘xi = constant’ (i = 1, 2, 3) are planes,
with (constant) normal vectors e i (the Cartesian basis vectors) intersecting at
right angles.
• For general curvilinear coordinates, the surfaces ‘u_i = constant’ do not have constant normal vectors, nor do they intersect at right angles. For example, in 2-D, we might have the situation illustrated in the diagram on the right.

[Figure: in 2-D, the lines x_1 = c and x_2 = c (Cartesian) compared with the curves u_1 = constant and u_2 = constant (curvilinear).]

From the definition of spherical polar coordinates (r, θ, φ), we have


( )
p z y
r = x2 + y 2 + z 2 θ = cos−1 p φ = tan−1 .
x2 + y 2 + z 2 x
The surfaces of constant r, θ, and φ are
r = constant ⇒ spheres centred at the origin
θ = constant ⇒ cones of semi-angle θ and axis along the z-axis
φ = constant ⇒ planes passing through the z-axis
Not all of these surfaces are planes, but they do intersect at right angles. The angle
θ is undefined at the origin; φ is undefined on the z-axis.

6.1 Orthogonal curvilinear coordinates


If the coordinate surfaces (surfaces of constant ui ), intersect at right angles, as in the
example of spherical polars, the curvilinear coordinates are said to be orthogonal.

6.1.1 Scale factors and basis vectors


Let point P have position vector r = r(u1 , u2 , u3 ). If we change u1 by du1 (with u2
and u3 fixed), then r → r + dr , with
∂r
dr = du1 ≡ h1 e 1 du1
∂u1
where we have defined the scale factor h1 and the unit vector e 1 by

∂r
h1 = and e 1 = 1 ∂r
∂u1 h1 ∂u1

• The scale factor h1 gives the length h1 du1 of dr when we change u1 → u1 +du1 .
• e 1 is a unit vector in the direction of increasing u1 (with fixed u2 and u3 .)

Similarly, we can define hi and e i for i = 2 and 3.

In general, if we change a single ui , keeping the other two fixed, we have
∂r
= hi e i i = 1, 2, 3 [no sum on i]
∂ui

• The unit vectors {e i } are in general not constant vectors – their directions
depend on the position vector r, and hence on the curvilinear coordinates
{ui }. [They should perhaps be called {e ui } or {e u , e v , e w } to avoid confusion
with Cartesian basis vectors.]

• If the curvilinear unit vectors satisfy e i · e j = δij , the {ui } are said to be
orthogonal curvilinear coordinates, and the three unit vectors {e i } form an
orthonormal basis. We will always choose the ui to give a right-handed basis.

[Figure: the curvilinear basis vectors e_1, e_2, e_3 pointing along the directions of increasing u_1, u_2, u_3.]

6.1.2 Examples of orthogonal curvilinear coordinates (OCCs)


Cartesian coordinates:
∂r
r = x ex + y ey + z ez ⇒ hx e x = = e x , etc.
∂x
The scale factors are all unity, and each individual unit vector points in the same
direction everywhere.

Spherical polar coordinates: u1 = r , u2 = θ , u3 = φ (in that order)

r = r sin θ cos φ e x + r sin θ sin φ e y + r cos θ e z



∂r/∂r = sin θ cos φ e_x + sin θ sin φ e_y + cos θ e_z   ⇒   h_r = |∂r/∂r| = 1

∂r/∂θ = r cos θ cos φ e_x + r cos θ sin φ e_y − r sin θ e_z   ⇒   h_θ = |∂r/∂θ| = r

∂r/∂φ = −r sin θ sin φ e_x + r sin θ cos φ e_y   ⇒   h_φ = |∂r/∂φ| = r sin θ
Hence the unit vectors for spherical polars are

e r = sin θ cos φ e x + sin θ sin φ e y + cos θ e z = r / r


e θ = cos θ cos φ e x + cos θ sin φ e y − sin θ e z
e φ = − sin φ e x + cos φ e y

These unit vectors are normal to the surfaces described above (spheres, cones and
planes).
They are orthogonal:

e_r · e_θ = e_r · e_φ = e_θ · e_φ = 0

And they form a right-handed orthonormal basis:

e_r × e_θ = e_φ ,   e_θ × e_φ = e_r ,   e_φ × e_r = e_θ .

[Figure: the spherical-polar basis vectors e_r, e_θ, e_φ at a point P.]

See also tutorial sheet.



Cylindrical coordinates: u1 = ρ , u2 = φ , u3 = z (in that order)

r = ρ cos φ e x + ρ sin φ e y + z e z

∂r ∂r ∂r
⇒ = cos φ e x + sin φ e y = −ρ sin φ e x + ρ cos φ e y = ez
∂ρ ∂φ ∂z
The scale factors are then (tutorial) hρ = 1 , hφ = ρ , hz = 1 , and the basis vectors
are

e ρ = cos φ e x + sin φ e y e φ = − sin φ e x + cos φ e y ez = ez

These unit vectors are normal to surfaces which are (respectively): cylinders centred
on the z-axis (ρ = constant), planes through the z-axis (φ = constant), planes
perpendicular to the z axis (z = constant), and they are clearly orthonormal.
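[The scale factors above are easy to reproduce symbolically. This sketch is not part of the notes and assumes sympy is available; it prints the squared scale factors h_u² = (∂r/∂u)·(∂r/∂u) for spherical polars.]

import sympy as sp

r_, theta, phi = sp.symbols('r theta phi', positive=True)

# Position vector in Cartesians, parametrised by spherical polars
R = sp.Matrix([r_ * sp.sin(theta) * sp.cos(phi),
               r_ * sp.sin(theta) * sp.sin(phi),
               r_ * sp.cos(theta)])

for u in (r_, theta, phi):
    dRdu = R.diff(u)
    h_squared = sp.trigsimp(dRdu.dot(dRdu))   # h_u^2
    print(u, h_squared)
# prints 1, r**2, r**2*sin(theta)**2, i.e. h_r = 1, h_theta = r, h_phi = r sin(theta)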

6.2 Elements of length, area and volume

6.2.1 Vector length


If we change u1 → u1 + du1 , keeping u2 and u3 fixed, then r → r + dr 1 with
dr 1 = h1 e 1 du1 . The infinitesimal element of length along e 1 is h1 du1 .
Similarly, the infinitesimal lengths along the curvilinear basis vectors e 2 , e 3 , are
h2 du2 , h3 du3 respectively.
If we change ui → ui + dui , for all i = 1, 2, 3, then
3
X
dr = h1 du1 e 1 + h2 du2 e 2 + h3 du3 e 3 = hi dui e i
i=1

Note: We are not using the summation convention here – because the expression
for dr above contains 3 identical indices! Great care is needed when using the summation convention with OCCs.

6.2.2 Arc length and metric tensor
Defining ds as the length of the infinitesimal vector dr , we have (ds)2 = dr · dr
In Cartesian coordinates: (ds)2 = (dx)2 + (dy)2 + (dz)2

(i) For an arbitrary set of curvilinear coordinates (not necessarily orthogonal),


letting uk → uk + duk , ∀ k = 1, 2, 3, gives
∂xk ∂xk ∂xk ∂xk
(dr)k = dxk = du1 + du2 + du3 = dui
∂u1 ∂u2 ∂u3 ∂ui

The square (ds)2 of the length ds is then


∂xk ∂xk
(ds)2 = dxk dxk = dui duj ≡ gij dui duj (6.1)
∂ui ∂uj
where we defined the 32 = 9 components gij of the metric tensor by
∂xk ∂xk ∂r ∂r
gij = gji = = ·
∂ui ∂uj ∂ui ∂uj
The 9 quantities gij are the components of a symmetric second-rank tensor.
For an arbitrary set of curvilinear coordinates, gij is not diagonal in general.
(ii) In terms of scale factors2
! !
X X X
(ds)2 =

hi e i dui · hj e j duj = hi hj e i · e j dui duj (6.2)
i j ij

For orthogonal curvilinear coordinates, we have e i · e j = δij , and hence


X
(ds)2 = h2i (dui )2 = h21 du21 + h22 du22 + h23 du23 (6.3)
i

Comparing equations (6.1) and (6.2), the metric tensor can be written as

gij = hi hj e i · e j

For orthogonal curvilinear coordinates, the metric tensor is diagonal

gij = h2i δij (no sum on i)

and it’s simplest to use equation (6.3).


Example: For spherical polars, we showed that

hr = 1 hθ = r hφ = r sin θ

In this case
(ds)2 = (dr)2 + r2 (dθ)2 + r2 sin2 θ (dφ)2

X 3
X
2
The shorthand means etc.
i i=1

6.2.3 Vector area dS
If we let u_1 → u_1 + du_1, with u_2, u_3 fixed, then r → r + dr_1 with dr_1 = h_1 e_1 du_1.

Similarly, if we let u_2 → u_2 + du_2, with u_1, u_3 fixed, r → r + dr_2 with dr_2 = h_2 e_2 du_2.

[Figure: the infinitesimal parallelogram at r(u_1, u_2, u_3) with sides h_1 du_1 e_1 and h_2 du_2 e_2 and vector area dS.]

The vector area of the infinitesimal parallelogram whose sides are the vectors dr_1 and dr_2 is

dS = (dr_1) × (dr_2) = (h_1 du_1 e_1) × (h_2 du_2 e_2)
For OCCs, this simplifies to
dS 3 = h1 h2 du1 du2 e 3 ,
since e 1 × e 2 = e 3 for orthogonal systems. dS3 points in the direction of e 3 , which
is normal to the surfaces u3 =constant, and the infinitesimal area is a rectangle.
The vector areas dS 1 and dS 2 are defined similarly.

Example: For spherical polars, if we vary θ and φ, keeping r fixed, we easily obtain
a familiar result
dS r = (hθ dθ e θ ) × hφ dφ e φ = hθ hφ dθ dφ e r = r2 sin θ dθ dφ e r


Similarly, if we vary φ and r, keeping θ fixed, we obtain the vector element of area
on a cone of semi-angle θ, with its axis along the z axis

dS θ = hφ dφ e φ × (hr dr e r ) = hφ hr dφ dr e θ = r sin θ dr dφ e θ
Similarly for dS φ [tutorial].

6.2.4 Volume
The volume of the infinitesimal parallelepiped with edges dr1 , dr2 and dr3 is

dV = dr1 × dr2 · dr3 = |((h1 du1 e 1 ) × (h2 du2 e 2 )) · (h3 du3 e 3 )|

For OCCs, we have (e 1 × e 2 ) · e 3 = 1 (in a RH basis), hence


dV = h1 h2 h3 du1 du2 du3
In this case, the infinitesimal volume is a cuboid.
For spherical polars, we have dV = hr hθ hφ dr dθ dφ = r2 sin θ dr dθ dφ
For a general set of curvilinear coordinates

dV = √g du_1 du_2 du_3

where g is the determinant of the metric tensor.

6.3 Components of a vector field in curvilinear
coordinates
A vector field a(r) can be expressed in terms of curvilinear components ai , defined
by
X 3
a(r) = ai (u1 , u2 , u3 ) e i
i=1

where e i is the ith curvilinear basis vector (which again should really be called eui
to avoid confusion with the Cartesian basis vectors.)
For orthogonal curvilinear coordinates, the component ai can be obtained by taking
the scalar product of a with the ith curvilinear basis vector e i

ai = a(r) · e i

NB ai must be expressed in terms of u1 , u2 , u3 (not x, y, z) when working in the


{ui } basis.

Example: If a = a e_x in Cartesians, then in spherical polars

a_r = a · e_r = (a e_x) · ( sin θ cos φ e_x + sin θ sin φ e_y + cos θ e_z ) = a sin θ cos φ

Similarly, a_θ = a · e_θ and a_φ = a · e_φ, and we obtain a in the spherical-polar basis (exercise)

a(r, θ, φ) = a ( sin θ cos φ e_r + cos θ cos φ e_θ − sin φ e_φ )

• You can often spot the curvilinear components “by inspection”.

• In general, one chooses the set of coordinates which matches most closely the
symmetry of the problem.

6.4 Div, grad, curl and the Laplacian in OCCs

6.4.1 Gradient
The gradient of a scalar field3 f (r) is defined in terms of the change in the field
df (r) when r → r + dr
df (r) = ∇ f (r) · dr (6.4)
Now write f (r) in terms of orthogonal curvilinear coordinates: f (r) = f (u1 , u2 , u3 ).
As usual, we denote the curvilinear basis vectors by {e 1 , e 2 , e 3 } .
Let u1 → u1 + du1 , u2 → u2 + du2 , and u3 → u3 + du3 .
Using the chain rule, we have
∂f ∂f ∂f
df = du1 + du2 + du3 (6.5)
∂u1 ∂u2 ∂u3
We need to rewrite the RHS of this equation in the form of equation (6.4). Start
with
dr = h1 du1 e 1 + h2 du2 e 2 + h3 du3 e 3
and use orthogonality of the curvilinear basis vectors, e i · e j = δij , to rewrite
equation (6.5) as

df = (∂f/∂u_1) du_1 + (∂f/∂u_2) du_2 + (∂f/∂u_3) du_3

   = ( (∂f/∂u_1) e_1 + (∂f/∂u_2) e_2 + (∂f/∂u_3) e_3 ) · ( e_1 du_1 + e_2 du_2 + e_3 du_3 )

   = ( (1/h_1)(∂f/∂u_1) e_1 + (1/h_2)(∂f/∂u_2) e_2 + (1/h_3)(∂f/∂u_3) e_3 ) · ( h_1 e_1 du_1 + h_2 e_2 du_2 + h_3 e_3 du_3 )

   = ( (1/h_1)(∂f/∂u_1) e_1 + (1/h_2)(∂f/∂u_2) e_2 + (1/h_3)(∂f/∂u_3) e_3 ) · dr

Comparing this result with equation (6.4), which holds for all dr, gives us ∇ f in
orthogonal curvilinear coordinates

∇f = (1/h_1)(∂f/∂u_1) e_1 + (1/h_2)(∂f/∂u_2) e_2 + (1/h_3)(∂f/∂u_3) e_3 = Σ_{i=1}^{3} (1/h_i)(∂f/∂u_i) e_i

For spherical polars, h_r = 1, h_θ = r, h_φ = r sin θ, and we have

∇f(r, θ, φ) = (∂f/∂r) e_r + (1/r)(∂f/∂θ) e_θ + (1/(r sin θ))(∂f/∂φ) e_φ
3
We use f (r) rather than φ(r) in order to avoid confusion with the angle φ in spherical polars.
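[A quick consistency check of the spherical-polar gradient formula; this sketch is not from the notes and assumes sympy. The test functions 1/r and r cos θ (= z) give the expected results −e_r/r² and e_z = cos θ e_r − sin θ e_θ.]

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

def grad_spherical(f):
    """Components (f_r, f_theta, f_phi) of grad f in the basis {e_r, e_theta, e_phi}."""
    return (sp.diff(f, r),
            sp.diff(f, th) / r,
            sp.diff(f, ph) / (r * sp.sin(th)))

print(grad_spherical(1 / r))            # (-1/r**2, 0, 0)
print(grad_spherical(r * sp.cos(th)))   # (cos(theta), -sin(theta), 0)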

6.4.2 Divergence
Let a(r) be a vector field, which we write in orthogonal curvilinear coordinates as
3
X
a(r) = ai (u1 , u2 , u3 ) e i
i=1

where ai are the components of a in the curvilinear basis, e i is the ith curvilinear
basis vector, and a is continuously differentiable.
We obtain ∇ · a in orthogonal curvilinears using the integral definition of divergence
1
Z
∇ · a = lim a · dS ,
δV →0 δV δS

where δS is the closed surface bounding δV .

Let the point P have curvilinear coordinates (u1 , u2 , u3 ).

Choose δV to be a small “cuboid” with its three edges {δr_i} along the basis vectors {e_i} at P:

δr_1 = h_1 δu_1 e_1
δr_2 = h_2 δu_2 e_2
δr_3 = h_3 δu_3 e_3

[Figure: the small cuboid PQRSABCD with edges along e_1, e_2, e_3.]

The outward element of area on the face ABCD is dS = +h_2 h_3 du_2 du_3 e_1
The outward element of area on the face PQRS is dS = −h_2 h_3 du_2 du_3 e_1
The contributions to the surface integral from the faces ABCD and P QRS are then
∫_{u_3}^{u_3+δu_3} ∫_{u_2}^{u_2+δu_2} { [a_1 h_2 h_3]_ABCD − [a_1 h_2 h_3]_PQRS } du′_2 du′_3

= ∫_{u_3}^{u_3+δu_3} ∫_{u_2}^{u_2+δu_2} { [a_1 h_2 h_3]_(u_1+δu_1, u′_2, u′_3) − [a_1 h_2 h_3]_(u_1, u′_2, u′_3) } du′_2 du′_3

= ∫_{u_3}^{u_3+δu_3} ∫_{u_2}^{u_2+δu_2} { δu_1 [ ∂(a_1 h_2 h_3)/∂u_1 ]_(u_1, u′_2, u′_3) } du′_2 du′_3   (by Taylor’s theorem)

= δu_1 δu_2 δu_3 [ ∂(a_1 h_2 h_3)/∂u_1 ]_(u_1, u_2, u_3)        (6.6)

In the last step, we assumed that δV is small enough that the integrand is ap-
proximately constant over the range of integration. We can then approximate the
integrals over u02 and u03 by the integrand evaluated at the point P ,
 

δu1 (a1 h2 h3 )
∂u1 (u1 ,u2 ,u3 )

multiplied by the range of integration δu2 δu3 . [This is a rather crude use of the
mean value theorem!]
The contributions of the other four faces to the integral over δS can be obtained
similarly, or by cyclic permutations of the indices {1, 2, 3} in equation (6.6).
Finally, divide by the volume of the cuboid δV = h1 h2 h3 δu1 δu2 δu3 , whereupon all
the factors of δui cancel, and we obtain our final expression for ∇ · a in orthogonal
curvilinear coordinates

 
∇ · a = (1/(h_1 h_2 h_3)) [ ∂(a_1 h_2 h_3)/∂u_1 + ∂(a_2 h_3 h_1)/∂u_2 + ∂(a_3 h_1 h_2)/∂u_3 ]

For Cartesian coordinates, the scale factors are all unity, and we recover the usual
expression for ∇ · a in Cartesians.
For spherical polars we have

∇ · a(r, θ, φ) = (1/(r² sin θ)) [ ∂(r² sin θ a_r)/∂r + ∂(r sin θ a_θ)/∂θ + ∂(r a_φ)/∂φ ]

              = (1/r²) ∂(r² a_r)/∂r + (1/(r sin θ)) [ ∂(sin θ a_θ)/∂θ + ∂a_φ/∂φ ]

where ar , aθ , and aφ are the components of the vector field a in the basis {e r , e θ , e φ }.
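[A short check of the spherical-polar divergence formula; this sketch is not from the notes and assumes sympy. It confirms, for example, that ∇ · r = 3 and ∇ · (e_r/r²) = 0 away from the origin.]

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

def div_spherical(a_r, a_th, a_ph):
    """Divergence of a = a_r e_r + a_theta e_theta + a_phi e_phi."""
    return (sp.diff(r**2 * a_r, r) / r**2
            + sp.diff(sp.sin(th) * a_th, th) / (r * sp.sin(th))
            + sp.diff(a_ph, ph) / (r * sp.sin(th)))

print(sp.simplify(div_spherical(r, 0, 0)))        # div(r e_r) = div(r) = 3
print(sp.simplify(div_spherical(1, 0, 0)))        # div(e_r) = 2/r
print(sp.simplify(div_spherical(1/r**2, 0, 0)))   # 0 for r != 0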

6.4.3 Curl
We obtain ∇ × a in orthogonal curvilinear coordinates using the line integral defi-
nition of curl.
The component of ∇ × a in the direction of the unit vector n is

n · (∇ × a) = lim_{δS→0} (1/δS) ∮_{δC} a · dr

where δS is a small planar surface, with unit normal n, bounded by the closed curve δC.

Let δS be a small rectangular surface parallel to the e_2–e_3 plane with one corner at r(u_1, u_2, u_3), and with edges

δr_2 = h_2 δu_2 e_2   and   δr_3 = h_3 δu_3 e_3

which lie along the basis vectors, so that n = e_1.

[Figure: the small rectangle δS with corner at r(u_1, u_2, u_3), boundary δC traversed along sides 1 → 4, and normal e_1.]

The line integral around the curve δC is the sum of the line integrals along the lines

1 → 4 respectively,
I Z u2 +δu2 Z u3 +δu3
a · dr = [a2 h2 ](u1 ,u0 ,u3 ) du02 + [a3 h3 ](u1 , u2 +δu2 ,u0 ) du03
2 3
δC u2 u3
Z u2 +δu2 Z u3 +δu3
− [a2 h2 ](u1 ,u0 ,u3 +δu3 ) du02 − [a3 h3 ](u1 , u2 ,u0 ) du03
2 3
u2 u3

Using Taylor’s theorem, we can write this as


∮_{δC} a · dr = ∫_{u_3}^{u_3+δu_3} { δu_2 [ ∂(a_3 h_3)/∂u_2 ]_(u_1, u_2, u′_3) } du′_3 − ∫_{u_2}^{u_2+δu_2} { δu_3 [ ∂(a_2 h_2)/∂u_3 ]_(u_1, u′_2, u_3) } du′_2

In each case, we approximate the integrals over u03 and u02 by the product of the
integrand and the integration ranges δu3 and δu2 , respectively. Hence
∂ ∂
I
a · dr = (a3 h3 ) δu2 δu3 − (a2 h2 ) δu3 δu2
δC ∂u2 ∂u3
where all the {ai } and {hi } are evaluated at r(u1 , u2 , u3 ).
Finally, we divide by the area of the rectangle δS = h2 h3 δu2 δu3 , whereupon all the
factors of δu_i cancel, and we obtain

e_1 · (∇ × a) = (∇ × a)_1 = (1/(h_2 h_3)) [ ∂(a_3 h_3)/∂u_2 − ∂(a_2 h_2)/∂u_3 ]
The components of ∇ × a in the directions of the curvilinear basis vectors e 2 and
e 3 may be obtained similarly, or by cyclic permutations of the indices.
It is convenient to write the final result in the form

              | h_1 e_1    h_2 e_2    h_3 e_3 |
∇ × a = (1/(h_1 h_2 h_3)) | ∂/∂u_1     ∂/∂u_2     ∂/∂u_3  |
              | h_1 a_1    h_2 a_2    h_3 a_3 |

For spherical polars we have

              | e_r        r e_θ       r sin θ e_φ |
∇ × a = (1/(r² sin θ)) | ∂/∂r       ∂/∂θ        ∂/∂φ        |
              | a_r        r a_θ       r sin θ a_φ |

6.4.4 Laplacian of a scalar field
The action of the Laplacian operator on a scalar field f (r) is defined by ∇2 f =
∇ · (∇ f ) .
Using the expression for ∇ · a, with a = ∇f, derived above, we find

∇²f = (1/(h_1 h_2 h_3)) [ ∂/∂u_1 ( (h_2 h_3/h_1) ∂f/∂u_1 ) + ∂/∂u_2 ( (h_3 h_1/h_2) ∂f/∂u_2 ) + ∂/∂u_3 ( (h_1 h_2/h_3) ∂f/∂u_3 ) ]

In spherical polars, we have

∇²f(r, θ, φ) = (1/(r² sin θ)) [ ∂/∂r ( r² sin θ ∂f/∂r ) + ∂/∂θ ( sin θ ∂f/∂θ ) + ∂/∂φ ( (1/sin θ) ∂f/∂φ ) ]

             = (1/r²) ∂/∂r ( r² ∂f/∂r ) + (1/(r² sin θ)) ∂/∂θ ( sin θ ∂f/∂θ ) + (1/(r² sin²θ)) ∂²f/∂φ²

The expression for the Laplacian of a scalar field in spherical polars is one of the
most useful results in the course, with applications in electromagnetism, quantum
mechanics, optics, elasticity, fluid mechanics, meteorology, general relativity, cos-
mology, . . .
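[Given how often this result is used, a short symbolic check is worthwhile; this sketch is not from the notes and assumes sympy. It confirms, for example, that ∇²(1/r) = 0 away from the origin and ∇²(r²) = 6.]

import sympy as sp

r, th, ph = sp.symbols('r theta phi', positive=True)

def laplacian_spherical(f):
    """Laplacian of a scalar field f(r, theta, phi) in spherical polars."""
    return (sp.diff(r**2 * sp.diff(f, r), r) / r**2
            + sp.diff(sp.sin(th) * sp.diff(f, th), th) / (r**2 * sp.sin(th))
            + sp.diff(f, ph, 2) / (r**2 * sp.sin(th)**2))

print(sp.simplify(laplacian_spherical(1 / r)))   # 0 (away from the origin)
print(sp.simplify(laplacian_spherical(r**2)))    # 6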

6.4.5 Laplacian of a vector field


The Laplacian of a vector field a(r) in curvilinear coordinates is defined by means
of the identity
∇ × (∇ × a) = ∇(∇ · a) − ∇²a

in the form4

∇²a = ∇(∇ · a) − ∇ × (∇ × a)

The quantities on the right hand side are evaluated using the expressions for grad,
div and curl derived above.

4
Remember the mnemonic ‘grad-div-minus-curl-curl’ or GDMCC, pronounced ‘guddumk’.
Thanks to Peter Boyle for teaching me this!

Chapter 7

Electrostatics

7.1 The Dirac delta function in three dimensions

Consider the mass of a body with density ρ(r). The mass of the body is

M = ∫_V ρ(r) dV

How can we use this general expression for the case of a single particle? What is the ‘density’ of a single ‘point’ particle with mass M at r_0?

We need a ‘function’ ρ(r) with the properties

ρ(r) = 0   ∀ r ≠ r_0 ,   and   M = ∫_V ρ(r) dV   for r_0 ∈ V

which we write as ρ(r) = M δ(r − r_0)   (notation).

Generalising slightly, we define the delta function to pick out the value of the function f(r_0)1 at one point r_0 in the range of integration, so that

∫_V dV f(r) δ(r − r_0) = f(r_0)  if r_0 ∈ V ,   and  0  otherwise.

Similarly, the total charge on a body with charge density (charge per unit volume)
ρ(r) is Z
Q= ρ(r) dV
V

The one dimensional delta function

The delta function may be defined by a sequence of functions δ_ε(x − a), each of ‘area’ unity, which have the desired limit as ε → 0 when integrated over. We give a number of examples of how this may be done.

1 f(r_0) = M in the example above.

• Top hat

  δ_ε(x − a) = 1/(2ε)   for a − ε < x < a + ε ,   0 otherwise

• Witch’s hat

  δ_ε(x − a) = (1/ε²) [ ε − |x − a| ]   for a − ε < x < a + ε ,   0 otherwise

• Gaussian

  δ_ε(x − a) = (1/(ε√π)) exp( −(x − a)²/ε² )

[Figure: sketches of the three representations, each of unit area and width ∼ ε, peaked at x = a.]

In each case

∫_{−∞}^{+∞} dx f(x) δ_ε(x − a) = ∫_{−∞}^{+∞} dx f(x + a) δ_ε(x)

                              = ∫_{−∞}^{+∞} dx [ f(a) + x f′(a) + (x²/2) f″(a) + . . . ] δ_ε(x)

where we shifted the integration variable in the first line, and Taylor-expanded the integrand in the second. The function f(x) is a ‘good’ test function, i.e. one for which the integral is convergent for all ε.

For the top hat, we need to evaluate

∫_{−∞}^{+∞} dx x^n δ_ε(x) = ∫_{−ε}^{+ε} dx x^n (1/(2ε))

  = ε^n/(n + 1)  for n = 0, 2, 4, . . . ,  and  0  for n = 1, 3, 5, . . .
  → 1  for n = 0 ,  0 otherwise   (as ε → 0)

Hence

∫_{−∞}^{+∞} dx f(x) δ_ε(x − a) → f(a)  as ε → 0 ,   i.e.   δ_ε(x − a) → δ(x − a)

Similarly for the other representations. The Gaussian representation is the cleanest,
because it’s a smooth function.
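[A numerical illustration of the limit, using the Gaussian representation; this sketch is not from the notes and assumes numpy and scipy are available.]

import numpy as np
from scipy.integrate import quad

def delta_eps(x, a, eps):
    """Gaussian representation of the delta function."""
    return np.exp(-((x - a) / eps)**2) / (eps * np.sqrt(np.pi))

f = np.cos
a = 0.3
for eps in (1.0, 0.3, 0.1, 0.01):
    # Integrate over a range wide enough to contain the whole peak
    val, _ = quad(lambda x: f(x) * delta_eps(x, a, eps), a - 20*eps, a + 20*eps)
    print(eps, val)
print("f(a) =", f(a))   # the integrals approach cos(0.3) as eps -> 0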

Notes:

(i) The Dirac delta ‘function’ isn’t a function, it’s a distribution or generalised
function.

(ii) Colloquially, it’s an infinitely-tall infinitely-thin spike of unit area.

(iii) The delta function is the continuous-variable analogue of the Kronecker delta
symbol. If we let i → x
Z
ui δij = uj → dx x δ(x − x0 ) = x0

(iv) An important identity is


Z +∞ X f (xi )
dx f (x) δ (g(x)) =
−∞ i
|g 0 (xi )|

where g(xi ) = 0, i.e. xi are the simple zeroes of g(x) [tutorial].

The three dimensional delta function

In Cartesian coordinates (x, y, z),

δ (3) (r − r0 ) ≡ δ(r − r0 ) = δ(x − x0 ) δ(y − y0 ) δ(z − z0 )

In orthogonal curvilinear co-ordinates (u1 , u2 , u3 ),


δ(r − a) = (1/(h_1 h_2 h_3)) δ(u_1 − a_1) δ(u_2 − a_2) δ(u_3 − a_3)
where h1 , h2 , h3 are the usual scale factors [tutorial].
(In the last equation, we set r0 = a to avoid double subscripts on the RHS.)

7.2 Coulomb’s law

Experimentally, the force between two point charges q and q_1 at positions r and r_1, respectively, is given by Coulomb’s law

F_1 = (1/(4πε_0)) q q_1 (r − r_1)/|r − r_1|³

F_1 is the force on the charge q at r, produced by the charge q_1 at r_1.

Charges can be positive or negative. For qq1 > 0 we have repulsion, and for qq1 < 0
we have attraction: like charges repel and opposite charges attract.
In SI units, charge is measured in Coulombs (C). The proton charge is defined to be
exactly 1.602176634 × 10−19 C, so that 1C is the charge of 0.62415 . . . × 1019 protons.
The permittivity of free space is measured as 0 = 8.85418781 . . .×10−12 C 2 N −1 m−2 .
Aside: Similarly for Newton’s law of gravitation,

F_1 = −G m m_1 (r − r_1)/|r − r_1|³

which is always attractive (hence the negative sign, so that G, m, m_1 are all positive). In SI units: G = 6.672 × 10⁻¹¹ N m² kg⁻².

7.3 The electric field


The electric field E is ‘produced’ by a charge configuration, and is defined in terms
of the force on a small positive test charge,
1
E(r) = lim F
q→0 q

Clearly, E is a vector field.

7.3.1 Field lines


Field lines are the ‘lines of force’ on the test charge.
Newton’s equations imply that the motion of a (test) particle
is unique, which implies that the field lines do not cross, and
thus that they are well-defined and can be measured.
Thus for our two charges q and q_1 we have

F_1 = q E(r)

i.e. particle 1 ‘produces’ an electrostatic field E(r). The diagram shows the field lines produced by a negative charge.

The particle at P ‘feels’ the electrostatic field as a force q E(r) with

E(r) = (1/(4πε_0)) q_1 (r − r_1)/|r − r_1|³        (7.1)

7.3.2 The principle of superposition

Consider a set of charges q_i situated at r_i. The principle of superposition is motivated by experiment; it states that the total electric field at r is the vector sum of the fields due to the individual charges at r_i

E(r) = (1/(4πε_0)) Σ_i q_i (r − r_i)/|r − r_i|³

In the limit of (infinitely) many charges, we introduce a continuous charge density (charge/volume) ρ(r′), so that the charge in dV′ at position r′ is ρ(r′) dV′. The electric field is then

E(r) = (1/(4πε_0)) ∫_V dV′ ρ(r′) (r − r′)/|r − r′|³

To return to our original example of a single charge q_1 at position r_1, we simply set the charge density ρ(r′) = q_1 δ(r′ − r_1), which recovers the result in equation (7.1).
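[A minimal numerical sketch of the superposition formula for point charges; not part of the notes, and it assumes numpy. The charges and positions in the example call are arbitrary illustrative values.]

import numpy as np

EPS0 = 8.854187817e-12   # permittivity of free space, C^2 N^-1 m^-2

def e_field(r, charges, positions):
    """E(r) = (1/4 pi eps0) sum_i q_i (r - r_i)/|r - r_i|^3."""
    r = np.asarray(r, dtype=float)
    E = np.zeros(3)
    for q, r_i in zip(charges, np.asarray(positions, dtype=float)):
        d = r - r_i
        E += q * d / np.linalg.norm(d)**3
    return E / (4 * np.pi * EPS0)

# Example: two opposite charges straddling the origin (a physical dipole)
print(e_field([0.0, 0.0, 1.0], [1e-9, -1e-9], [[0, 0, 0.01], [0, 0, -0.01]]))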

7.4 The electrostatic potential for a point charge


Since
1 (r − r1 )
∇ = − (7.2)
r − r1 r − r 3

1
where ∇ operates on r (not r1 ), then for a point charge q1 at r1
 !
q1 r − r1 q1 1
E(r) = = − ∇
4π0 r − r 3 4π0 r − r1

1

i.e. we may write


1 q
E(r) = −∇ φ(r) with φ(r) = 1 (7.3)
4π0 r − r1

φ(r) is the electrostatic potential for the electric field E(r).

7.5 The static Maxwell equations

7.5.1 The curl equation


For a continuous charge distribution, we again use equation (7.2) to write the electric
field as a gradient
!
0
1 1 1 ρ(r )
Z Z
E(r) = − dV 0 ρ(r0 ) ∇ = −∇ dV 0 (7.4)
4π0 V r − r0 4π0 V r − r0
Note that ∇ operates on r (not r0 ) so we can take it out of the volume integral
over r0 . Therefore
!
0
1 ρ(r )
Z
∇ × E = −∇ × ∇ dV 0
4π0 V r − r0
But the curl of the gradient of a scalar field is always zero, which implies

∇×E =0

for all static electric fields. This is (the static version of) Maxwell’s third equation.

7.5.2 Conservative fields and potential theory


A vector field that satisfies ∇ × E = 0 is said to be conservative (or irrotational ).

Consider the integral of ∇ × E over an open surface S bounded by the closed curve C_1 − C_2. Using Stokes’ theorem

0 = ∫_S (∇ × E) · dS = ∮_{C_1 − C_2} E · dr

Therefore

∫_{C_1} E · dr = ∫_{C_2} E · dr

Since the line integral is independent of the path from a to b, it can only depend on the end points. So, for some scalar field φ, we must have

− ∫_a^b E · dr = φ(b) − φ(a)
Now let a = r and b = r + δr, where δr is small, so we can approximate the integral
−E(r) · δr + . . . = φ(r + δr) − φ(r) = ∇φ · δr + . . .
where we used the definition of the gradient in the last step. Since this holds ∀ δr ,
we can always write
E(r) = −∇ φ(r)

The scalar field φ(r) is called the potential for the vector field E(r).

An explicit expression for φ(r) can be obtained from (7.4). We have E = −∇φ with

1 ρ(r0 )
Z
φ(r) = dV 0
4π0 V r − r0

This is linear superposition for potentials.


As in the case of the electric field, if we set ρ(r′) = q_1 δ(r′ − r_1), we recover the potential for a single charge (equation (7.3))

φ(r) = (1/(4πε_0)) q_1/|r − r_1|

Notes:

• For a surface charge distribution, with charge/unit-area σ(r), the electric field produced is

  E(r) = (1/(4πε_0)) ∫_S dS′ σ(r′) (r − r′)/|r − r′|³   and   φ(r) = (1/(4πε_0)) ∫_S dS′ σ(r′)/|r − r′|

  where dS′ is the infinitesimal (scalar) element of area at r′ on the surface S.


• For a line distribution of charge, with charge/unit-length λ(r)

  E(r) = (1/(4πε_0)) ∫_C dl′ λ(r′) (r − r′)/|r − r′|³   and   φ(r) = (1/(4πε_0)) ∫_C dl′ λ(r′)/|r − r′|

  where dl′ is the infinitesimal element of length along the line (or curve) C.
• In SI units, the potential is measured in Volts V . In terms of other units
V = C/(C 2 N −1 m−1 ) = N mC −1 = JC −1 .
• Field lines are perpendicular to surfaces of constant potential φ, called equipotentials or equipotential surfaces.
  Let dr be a small displacement of the position vector r of a point in the equipotential surface φ = constant. Therefore

  0 = dφ = ∇φ · dr

  so E = −∇φ is perpendicular to dr. Thus electric field lines E are everywhere perpendicular to the surfaces φ = constant.

• The potential φ is only defined up to an overall constant. If we let φ → φ + c,
the electric field E = −∇ φ (and hence the force) is unchanged. So only
potential differences have physical significance.
In most physical situations, φ → constant as r → ∞, and we usually choose
the constant to be zero.

• So far we’ve defined the potential in purely mathematical terms.


Physically, the potential difference, VAB , between two points A and B is defined
as the energy per unit charge required to move a small test charge q from A
to B:
1
VAB ≡ lim WAB
q→0 q

1
Z Z
= − F · dr = − E · dr
q C C
Z Z B
= ∇φ · dr = dφ
C A
= φB − φA

The −ve sign is because this is the work done against the force F . Since the
field is conservative, the integral is independent of the path – it depends only
on the end points.

• The potential energy of a point charge q at position r in an external electro-


static field E ext (r) = −∇ φext (r) is therefore given by q φext (r).
We may generalise this to a charge distribution in an external electric field
E_ext(r) = −∇φ_ext. In this case, the (interaction) energy is

W = ∫_V dV ρ(r) φ_ext(r)

Note that this does not include the self-energy of the charge distribution. To emphasize this we write φ_ext. [More on this later]

7.5.3 The divergence equation


Let’s return to the potential for an arbitrary charge distribution

1 ρ(r0 )
Z
φ(r) = dV 0
4π0 V r − r0

Since E = −∇ φ, we have ∇ · E = −∇2 φ, and hence

1 1
Z
∇ · E(r) = − dV 0 ρ(r0 ) ∇2 (7.5)
4π0 V r − r0

Note that ∇2 acts only on r (not on r0 ), so we can take it inside the integral over r0 .

 
Theorem:   ∇²(1/r) = −4π δ(r)   ∀ r

Proof: We first prove it for r ≠ 0

∇²(1/r) = −∂_i ( x_i/r³ ) = − [ 3/r³ − x_i (3/2) r⁻⁵ 2x_i ] = 0      (r ≠ 0)

To prove the result for r = 0, we integrate ∇²(1/r) over an arbitrary volume V containing the origin r = 0.

∫_V ∇²(1/r) dV = ∫_{V_ε} ∇²(1/r) dV = ∫_{V_ε} ∇ · ( −r/r³ ) dV

              = − ∫_{S_ε} (r/r³) · dS = − (ε/ε³) 4πε² = −4π
In the first line, we used our previous result that ∇2 (1/r) = 0 everywhere away
from the origin to write the original integral as an integral over a sphere of radius ε
centred on the origin, with volume Vε and area Sε respectively.
We then used the divergence theorem to obtain the first result on the second line.
On the surface Sε , we have r = ε er and dS = er dS, where er is a unit vector in
the direction of r, so the integral over the surface of the sphere is straightforward –
check it!
[Alternatively,
R Rwe may write the surface integral as an integral over solid angle
3
S
r · dS/r = S dΩ = 4π.]
We can now take the limit ε → 0, which simply shrinks the sphere down to the
origin, leaving the integral unchanged.
Since our result for the integral holds for an arbitrary volume V centred on the origin, and ∫_V δ(r) dV = 1, we deduce that ∇²(1/r) = −4π δ(r). Similarly

∇²(1/|r − r′|) = −4π δ(r − r′)
Substituting this result into equation (7.5) gives

∇ · E(r) = −(1/(4πε_0)) ∫_V dV′ ρ(r′) [ −4π δ(r − r′) ]

Using the delta function to perform the integral on the right hand side, we get Maxwell’s first equation

∇ · E(r) = ρ(r)/ε_0

We now have the two electrostatic Maxwell equations

∇ · E = ρ/ε_0      ∇ × E = 0

In terms of the potential

E = −∇φ      ∇²φ = −ρ/ε_0

The second equation is called Poisson’s equation.

7.6 Electric dipole

Physically, an electric dipole consists of two nearby equal and opposite (point) charges, with charge −q situated at r_0 and charge +q at r_0 + d.

Define the dipole moment p = q d.

It will turn out to be useful to consider the dipole limit, in which

p = lim_{q→∞, d→0} q d

with p finite (and constant). This is sometimes called a point dipole or an ideal dipole.

7.6.1 Potential and electric field due to a dipole


Dipole potential  The electrostatic potential φ(r) produced by the dipole is

φ(r) = (q/(4πε_0)) [ 1/|r − r_0 − d| − 1/|r − r_0| ]

     = (q/(4πε_0)) [ 1/|r − r_0| + d · (r − r_0)/|r − r_0|³ + O(d²) − 1/|r − r_0| ]

where we Taylor (or binomial) expanded the first term about r − r_0 [tutorial]. In the dipole limit, the terms of O(qd²) vanish, and the potential is simply

φ(r) = (1/(4πε_0)) p · (r − r_0)/|r − r_0|³

For a dipole at the origin we have

φ(r) = (1/(4πε_0)) p · r/r³

Note that φ(r) falls off as 1/r².

Electric field  The i-th component of the electric field produced by (or due to) a dipole of moment p situated at the origin is

E_i(r) = −∂_i φ = −(1/(4πε_0)) ∂_i ( p_j x_j/r³ )

       = −(1/(4πε_0)) p_j [ δ_ij/r³ + x_j (−3/2) r⁻⁵ 2x_i ]

Therefore

E(r) = (1/(4πε_0)) [ 3(p · r) r/r⁵ − p/r³ ]

which falls off as 1/r³.
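[The ideal-dipole formula can be checked numerically against two opposite point charges; this sketch is not from the notes, assumes numpy, and works in units where 4πε₀ = 1 for simplicity.]

import numpy as np

def dipole_field(p, r):
    """Ideal dipole at the origin, units with 4*pi*eps0 = 1."""
    r = np.asarray(r, dtype=float)
    rmag = np.linalg.norm(r)
    return 3 * np.dot(p, r) * r / rmag**5 - p / rmag**3

def two_charge_field(q, d, r):
    """Field of +q at d and -q at the origin (same units)."""
    r, d = np.asarray(r, dtype=float), np.asarray(d, dtype=float)
    return (q * (r - d) / np.linalg.norm(r - d)**3
            - q * r / np.linalg.norm(r)**3)

q, d = 1.0e4, np.array([0.0, 0.0, 1.0e-4])   # p = q d = (0, 0, 1)
r = np.array([1.0, 2.0, 3.0])
print(dipole_field(q * d, r))
print(two_charge_field(q, d, r))   # agrees up to corrections of order |d|/r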

Spherical polar coordinates
Consider spherical polar coordinates (r, θ, χ), with the z-axis chosen parallel to the dipole moment, i.e. p = p e_z, so that p · r = p r cos θ.

[We use χ instead of φ for the azimuthal angle in order to avoid confusion with the potential φ.]

Then   φ(r) = (p/(4πε_0)) (1/r²) cos θ

       E(r) = (p/(4πε_0)) (1/r³) ( 3 cos θ e_r − e_z )

The last expression uses a “mixed coordinate basis” (r, θ, z), which is useful at times. We can also obtain this result using the expression for ∇φ in polar co-ordinates

E = −∇φ = − [ (∂φ/∂r) e_r + (1/r)(∂φ/∂θ) e_θ + (1/(r sin θ))(∂φ/∂χ) e_χ ]

        = −(p/(4πε_0)) [ −(2/r³) cos θ e_r − (sin θ/r³) e_θ ]

The second form can be obtained from the first by substituting e_z = e_r cos θ − e_θ sin θ into the latter. [Exercise: verify this.]

The sketch shows the electric field (full lines) and equipotentials (dashed lines) for the dipole. This picture holds in the dipole limit, but it’s also valid when r ≫ d, the ‘far zone’. The latter is of course the important case for physics!

7.6.2 Force, torque and energy


Force on a dipole

The force on a dipole at position r due to an external electric field E_ext is

F(r) = −q E_ext(r) + q E_ext(r + d)

     = −q E_ext(r) + q [ E_ext(r) + (d · ∇) E_ext(r) + · · · ]

In the point dipole limit

F(r) = (p · ∇) E_ext(r)
Torque on a dipole

The torque (or moment of the force, or couple) on a dipole, about the point r where
the dipole is located, due to the external electric field is

G(r) = −q 0 × E_ext(r) + q (0 + d) × E_ext(r + d)

     = q d × [ E_ext(r) + (d · ∇) E_ext(r) + · · · ]

Taking the dipole limit (i.e. ignoring terms of order O(qd²)), we find

G(r) = p × E_ext(r)

Energy of a dipole

The energy of a dipole in an external electric field E ext is


W = −q φext (r) + q φext (r + d)
  
= −q φext (r) + q φext (r) + d · ∇ φext (r) + · · ·
In the dipole limit, using Eext = −∇ φext , we find

W = −p · E ext

How is this expression for the energy of the dipole related to the force on the dipole, namely F = (p · ∇) E_ext?

Recall the following identity for vector fields a and b

∇(a · b) = (a · ∇) b + (b · ∇) a + a × (∇ × b) + b × (∇ × a)

If we set a = p = constant, and b = E_ext, then since (E_ext · ∇) p = 0 and ∇ × p = 0, we find

∇(p · E_ext) = (p · ∇) E_ext + p × (∇ × E_ext)

But ∇ × E_ext = 0, so p × (∇ × E_ext) = 0, and hence

F(r) = (p · ∇) E_ext = ∇(p · E_ext) = −∇W
The force on the dipole is the gradient of the potential energy, as one would expect.

Examples

• For the case of a homogeneous (i.e. constant, independent of r) external field,


E_ext(r) = E_0, we have

F = 0   and   G(r) = p × E_0

Since W = −p · E_0, a stable or equilibrium position (i.e. position of minimum energy) occurs when p is parallel to E_0. Colloquially, dipoles “like to align with the field”.

• If the dipole at r has dipole moment p_1, and the electric field E_ext(r) is due to a second dipole of moment p_2 at the origin, then

  W = −p_1 · E_ext(r)   with   E_ext(r) = (1/(4πε_0)) [ 3(p_2 · r) r/r⁵ − p_2/r³ ]

  Therefore

  W = (1/(4πε_0)) [ p_1 · p_2/r³ − 3(r · p_1)(r · p_2)/r⁵ ]

  The interaction energy is not only dependent on the distance between the dipoles, but also on their relative orientations.

7.7 The multipole expansion


Consider the case of a charge distribution, ρ(r), localised in a volume V . For con-
venience we will take the origin inside V .

The potential at the point P is


q′
1 ρ(r0 )
Z
~r′
φ(r) = dV 0 P
4π0 V |r − r0 | O

For |r| much larger than the extent of V, i.e. |r| ≫ |r′| for all |r′| such that ρ(r′) ≠ 0, we can expand the denominator using the binomial theorem (1 + x)^n = 1 + nx + (n(n−1)/2) x² + O(x³)

|r − r′|⁻¹ = { r² − 2 r · r′ + r′² }^{−1/2}

           = r⁻¹ { 1 − 2 r · r′/r² + r′²/r² }^{−1/2}

           = (1/r) { 1 + r · r′/r² − (1/2) r′²/r² + (3/8) ( −2 r · r′/r² )² + O( (r′/r)³ ) }
This can also be obtained by Taylor expansion [exercise]. Then

φ(r) = (1/(4πε_0)) ∫_V dV′ ρ(r′) [ 1/r + r · r′/r³ + ( 3(r · r′)² − r² r′² )/(2r⁵) + . . . ]

This gives the multipole expansion for the potential

φ(r) = (1/(4πε_0)) Q/r + (1/(4πε_0)) p · r/r³ + (1/(4πε_0)) Q_ij x_i x_j/(2r⁵) + . . .

where

Q    = ∫_V dV′ ρ(r′)                              is the total charge within V

p    = ∫_V dV′ r′ ρ(r′)                           is the dipole moment about the origin

Q_ij = ∫_V dV′ ( 3 x′_i x′_j − r′² δ_ij ) ρ(r′)   is the quadrupole tensor

The multipole expansion is valid in the far zone, i.e. when r ≫ r′, with r′ the size of the charge distribution.

• If Q 6= 0, the monopole term dominates


1 Q
φ(r) =
4π0 r
and in the far zone, r  r0 , the E field is that of a point charge at the origin.

• When the total charge Q = 0 the dipole term dominates

  φ(r) = (1/(4πε_0)) p · r/r³

  If the charge density is given by two equal but oppositely-charged particles close together, i.e. ρ(r′) = q [ δ(r′ − d) − δ(r′) ], then

  p = ∫ dV′ r′ q [ δ(r′ − d) − δ(r′) ] = q d

  which is the dipole moment as defined previously, and hence justifies the name.

• If Q = 0 and p = 0, the quadrupole term dominates

1 Qij xi xj
φ(r) =
4π0 2r5
The quadrupole tensor Qij is symmetric, Qij = Qji , and traceless, Qii = 0.

Why quadrupole?  A simple linear quadrupole is defined by placing two dipoles (so four charges) ‘back to back’ with equal and opposite dipole moments, as shown: charges q at r′ + d and r′ − d, and charge −2q at r′. From the figure

φ(r) = (1/(4πε_0)) [ q/|r − r′ − d| + q/|r − r′ + d| − 2q/|r − r′| ]
Expanding the denominators in the usual way, and defining ρ = r − r′, the leading (1/ρ) and dipole (1/ρ²) terms cancel (exercise), so for large ρ

φ(r) = (q/(4πε_0)) [ 3(ρ · d)² − d² ρ² ]/ρ⁵ = (1/(4πε_0)) Q_ij ρ_i ρ_j/ρ⁵
where Qij = q (3di dj − δij d2 ) is the (traceless, symmetric) quadrupole tensor – as
above.
The quadrupole moment is sometimes defined to be Q = 2qd2 , where the ‘2’ is
conventional.

7.7.1 Worked example


The region inside the sphere r < a contains a charge density

ρ(x, y, z) = f z (a² − r²)

where f is a constant. Show that at large distances from the origin the potential due to the charge distribution is given approximately by

φ(r) = 2 f a⁷ z/(105 ε_0 r³)
The multipole expansion gives

φ(r) = (1/(4πε_0)) [ Q/r + P · r/r³ ] + O(1/r³)
In spherical polars (r, θ, χ),
x = r sin θ cos χ , y = r sin θ sin χ , z = r cos θ

The total charge Q is (we drop the primes in this calculation for brevity):

Q = ∫_V ρ(r) dV = ∫_0^{2π} ∫_0^{π} ∫_0^{a} f r cos θ (a² − r²) r² sin θ dr dθ dχ = 0.

This integral vanishes because ∫_0^π cos θ sin θ dθ = (1/2) ∫_0^π sin(2θ) dθ = 0.
The total dipole moment P about the origin is

P = ∫_V r ρ(r) dV = ∫_V r e_r ρ(r) dV

  = ∫_0^{2π} ∫_0^{π} ∫_0^{a} r ( sin θ cos χ e_1 + sin θ sin χ e_2 + cos θ e_3 ) [ f r cos θ (a² − r²) ] r² sin θ dr dθ dχ

The x and y components of the χ integral vanish. The z component factorises:

P_z = f ∫_0^{2π} dχ ∫_0^{π} sin θ cos²θ dθ ∫_0^{a} r⁴ (a² − r²) dr = f · 2π · (2/3) · (2a⁷/35)
Putting it all together, we obtain

φ(r) = (1/(4πε_0)) (8πa⁷f/105) (e_3 · r)/r³ = 2 f a⁷ z/(105 ε_0 r³)
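[The two integrals in this worked example are easy to verify symbolically; this sketch is not from the notes and assumes sympy.]

import sympy as sp

r, th, ch, a, f = sp.symbols('r theta chi a f', positive=True)

rho = f * r * sp.cos(th) * (a**2 - r**2)   # charge density f z (a^2 - r^2)
dV = r**2 * sp.sin(th)                     # volume element (dr dtheta dchi implied)

Q = sp.integrate(rho * dV, (r, 0, a), (th, 0, sp.pi), (ch, 0, 2*sp.pi))
Pz = sp.integrate(r * sp.cos(th) * rho * dV,
                  (r, 0, a), (th, 0, sp.pi), (ch, 0, 2*sp.pi))

print(Q)                  # 0
print(sp.simplify(Pz))    # 8*pi*a**7*f/105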

7.7.2 Interaction energy of a charge distribution
Let’s consider the interaction energy W of an arbitrary (but bounded) charge dis-
tribution in an external electric field E ext = −∇ φext ,
Z
W = dV ρ(r) φext (r)

For a charge distribution localised around the origin, we Taylor-expand φ_ext(r) about r = 0

φ_ext(r) = φ_ext(0) + (r · ∇) φ_ext(0) + (1/2!) (r · ∇)² φ_ext(0) + . . .

         = φ_ext(0) − r · E_ext(0) − (1/2) x_i x_j ∂_j E_ext,i(0) + . . .

The last term may be re-written as −(1/6) ( 3 x_i x_j − r² δ_ij ) ∂_j E_ext,i(0). The additional term with the δ_ij vanishes because an external field satisfies ∇ · E_ext = 0 in the region around the origin. Therefore

W = Q φ_ext(0) − p · E_ext(0) − (1/6) Q_ij ∂_j E_ext,i(0) + . . .

The physical picture is that the total charge couples to the external potential φext ,
the dipole moment with the external field E ext , and the quadrupole moment with
the spatial derivative of the external field.

7.7.3 A brute-force calculation - the circular disc


The electric field E and potential φ can be evaluated exactly for a number of in-
teresting symmetric charge distibutions. We give one example using cylindrical
coordinates before moving on to more powerful techniques.
A circular disc of radius a carries uniform surface charge density σ. Find the electric field and the potential due to the disc on the axis of symmetry.

Electric field: Start with the general expression

E(r) = (1/(4πε_0)) ∫_S dS′ σ(r′) (r − r′)/|r − r′|³

Choose the z axis parallel to the axis of symmetry with the origin at the centre of the disc, so that r lies on the z axis and r′ lies in the x–y plane. In cylindrical coordinates (ρ, χ, z), we have

r = z e_z   and   r′ = ρ e_ρ

Therefore

r − r′ = z e_z − ρ e_ρ   and   |r − r′| = (ρ² + z²)^{1/2}

Using the expression for e_ρ in terms of e_x and e_y, we have

E(z) = (σ/(4πε_0)) ∫_0^a ρ dρ ∫_0^{2π} dχ [ z e_z − ρ( cos χ e_x + sin χ e_y ) ]/(ρ² + z²)^{3/2}

     = (σ/(4πε_0)) 2π z ∫_0^a ρ dρ/(ρ² + z²)^{3/2} e_z + 0 = (σz/(2ε_0)) [ −1/(ρ² + z²)^{1/2} ]_{ρ=0}^{ρ=a} e_z

     = (σz/(2ε_0)) [ 1/|z| − 1/(a² + z²)^{1/2} ] e_z = (σ/(2ε_0)) [ z/|z| − z/(a² + z²)^{1/2} ] e_z

The electric field on the z axis is parallel to the z axis because of symmetry. The
sum of all the contributions to Ex and Ey cancel, i.e. they integrate to zero. So we
really only needed to calculate Ez !
Consider two limits:

(i) z ≫ a:  Expand the second term in square brackets

  1/(a² + z²)^{1/2} = (1/|z|) (1 + a²/z²)^{−1/2} = (1/|z|) [ 1 − (1/2) a²/z² + O(a⁴/z⁴) ]

  Keeping only the leading term, we have

  E(z) = sgn(z) (σa²/(4ε_0 z²)) e_z = sgn(z) (Q/(4πε_0 z²)) e_z
where the signum (or sign) function sgn(z) ≡ z/|z| is +1 for z > 0, and −1
for z < 0; and Q = σπa2 is the total charge on the disc. In the far zone, we
recover the field for a point charge, as expected.

(ii) a  z In this case, the leading behaviour is obtained by dropping the second
term in square brackets:
σ
E(z) = sgn(z) e
20 z
This is the electric field due to an infinite charged surface – see later.

Potential: Start with the general expression

φ(r) = (1/(4πε_0)) ∫_S dS′ σ(r′)/|r − r′|

For the disc, we have

φ(z) = (σ/(4πε_0)) ∫_0^a ∫_0^{2π} ρ dρ dχ/(ρ² + z²)^{1/2} = (σ/(2ε_0)) [ (ρ² + z²)^{1/2} ]_{ρ=0}^{ρ=a}

     = (σ/(2ε_0)) [ (a² + z²)^{1/2} − |z| ]
Note: it’s often much easier to find φ than E !

(i) z  a Expanding as before, we find (exercise – important!)
σ a2 Q
φ(z) = =
40 |z| 4π0 |z|
as expected.
(ii) a  z In this case we find a linear potential
σ   σ
φ(z) = a − |z| = − |z| + constant
20 20
∂φ
Exercise: Check that E(z) = − e in each case.
∂z z

7.8 Gauss’ law


We showed previously that the electric field, E(r), due to a charge distribution, ρ(r), satisfies ∇ · E = ρ/ε_0. This is the differential form of Maxwell’s first equation.

Integrating ∇ · E = ρ/ε_0 over a volume V, bounded by a closed surface S, and using the divergence theorem

∫_V ∇ · E dV = ∫_S E · dS

gives Gauss’ law

∫_S E · dS = (1/ε_0) ∫_V ρ dV = Q_enc/ε_0

where Q_enc is the total charge enclosed by the volume V. [For brevity, we will often drop the subscript ‘enc’.] The quantity ∫_S E · dS is called the flux of the electric field through the surface S.
Gauss’ Law (also known as Gauss’ Theorem) is the integral form of Maxwell’s first
equation. It’s extremely useful, particularly for problems with symmetry2 , for prob-
lems in potential theory, and for determining the behaviour of fields at boundaries.

Examples

• Consider a sphere of radius a, carrying charge Q, centred on the origin, with uniform charge density ρ_0 = Q/((4/3)πa³).
  By symmetry, the electric field will point in the radial direction (outwards or inwards), so that E(r) = E_r(r) e_r. Integrating over a sphere of radius r, we find

  ∫_S E · dS = E_r(r) 4πr² = (4/3)πr³ ρ_0/ε_0   (r ≤ a) ,    (4/3)πa³ ρ_0/ε_0   (r ≥ a)

2
This goes against the general rule that it is easier to compute the potential (a scalar) first
rather than the electric field (a vector)!

Therefore

E(r) = (Q/(4πε_0)) r/a³   for r ≤ a ,      E(r) = (Q/(4πε_0)) r/r³   for r ≥ a

Outside the sphere, the electric field appears to come from a point source, and inside it increases linearly with r.

[Figure: |E| against r, rising linearly for r < a and falling off as 1/r² for r > a.]

We can obtain the electrostatic potential from


∂φ(r)
E(r) = Er (r) er = −∇φ(r) = − e
∂r r
Integrating with respect to r gives
  2 
Q r Q 1
3a2 − r2

 − 4π + C1 = r≤a


2a3 4π0 2a 3

0
φ(r) =  
 Q 1 Q 1
+ C2 = r≥a


4π0 r 4π0 r

For r > a we chose the constant of integration C2 to be zero so that φ → 0 as


r → ∞. The potential outside the sphere is again that of a point charge.
For r < a, we chose the constant of integration C1 = −3/(2a) so that φ is
continuous across the boundary at r = a. [Exercise: check this calculation.]
Note: E has a cusp, so the derivative ∂Er /∂r is discontinuous at the boundary.
• Consider a long (infinite) straight wire with constant charge/unit length λ.

Using cylindrical coordinates with the z axis parallel to the wire, we integrate over a cylinder of length L and radius ρ with its axis along the wire. By symmetry we must have E = E_ρ(ρ) e_ρ. Using Gauss’ Law, we get

∫_S E · dS = E_ρ(ρ) 2πρL + 0 (ends) = (1/ε_0) λL

This gives

E(r) = (λ/(2πε_0)) (1/ρ) e_ρ

The potential can be found by integrating E_ρ = −∂φ/∂ρ, which gives

φ(r) = −(λ/(2πε_0)) ln ρ + constant = −(λ/(2πε_0)) ln(ρ/ρ_0)
where we chose the constant of integration to give φ(r) = 0 when ρ = ρ0 ,
where ρ0 is a constant.

• Infinite flat sheet of charge with constant charge density σ per unit area.
Integrate over a cylindrical ‘Gaussian pill box’ with axis perpendicular to the
sheet. See tutorial, and below.

7.9 Boundaries
Useful results for the changes in the normal and tangential components of the electric
field across a boundary may be obtained using Gauss’ Law and Stokes’ theorem.
Consider a surface carrying surface charge density σ. The electric field on one side
of the boundary is E 1 , and on the other E 2 . The unit normal to the surface is n.

σ
2

~2 1
E

~n
~1
E

7.9.1 Normal component


Consider a small√ cylindrical ‘Gaussian pillbox’ of area A and negligible height δ`
(so that δ`  A), which straddles the surface. If A is sufficiently small, E(r) is
approximately constant over A, but due to the charge density σ on the surface, E(r)
will be different on the top and bottom of the pillbox. Apply Gauss’ Law
1
Z Z
E · dS = ρ dV
S 0
and recall dS = n dS, then for small A σ
2
 σA
E 2 − E 1 · n A = (E2⊥ − E1⊥ ) A =
0 A δl
1

where E1⊥ and E2⊥ are the components ~n


of the electric field perpendicular to the
surface. The factors of A cancel, hence
S
 σ
n · E2 − E1 =
0

NB Since δ`  A, the integral over the curved surface may be neglected.
Thus the normal component of the electric field E, is discontinuous across the
boundary when σ 6= 0. In this case the discontinuity is proportional to the surface
charge density.

7.9.2 Tangential component
Consider a small rectangle of length ℓ, and negligible width δℓ (so that δℓ ≪ ℓ), which straddles the surface. Applying Stokes’ theorem

∫_S (∇ × E) · dS = ∮_C E · dr ,

we can ignore contributions from the ends with length δℓ. Since ∇ × E = 0, we get

0 = ∮_C E · dr = ( E_2∥ − E_1∥ ) ℓ

where E_1∥ and E_2∥ are the components of the electric field parallel to the boundary. (For small enough ℓ, we assume E_1∥ and E_2∥ are constant along each side of the rectangle.)

[Figure: a small rectangle of length ℓ and width δℓ straddling the charged surface, with normal n.]

This can be written as

n × E_1 = n × E_2

where we used the fact that the cross product of the electric field E with n picks
out the tangential component Ek of the electric field, since

E = E⊥ n + E k

Thus the tangential component of E is continuous across a charged boundary.

7.9.3 Conductors
Physically, a conductor is a material in which ‘free’, or ‘nearly free’ or ‘surplus’
electrons can move (or flow) freely when an electric field is applied.

In Electrostatics

• For a conductor in equilibrium, where all charges are at rest, all the charge
resides on the surface of the conductor, i.e. ρ = 0 inside a conductor.
This holds because if ρ 6= 0, then due to Maxwell’s first equation ∇ · E = ρ/0
(or Gauss’ law), so we must have E 6= 0, and hence the charge would move
and we wouldn’t have equilibrium – a contradiction. So E = 0 and hence
φ = constant everywhere inside a conductor.
Physically, the charges repel and move to the surface.
• The electric field on the surface of a conductor is normal to the surface,
i.e. E k n, otherwise charge would move along the surface
Thus if dr is a displacement on the surface of a conductor, E · dr = −dφ = 0,
so φ = constant on the surface of a conductor, i.e. an equipotential.

Therefore, on the surface of a conductor,
vacuum

σ ~
Et = 0 , En = E
0
conductor

The external electric field induces a charge on


the surface of the conductor, which in turn de-
forms the external field so that it is perpendic- ~ =0
E
ular to the conductor surface. In the case of a
conductor, the surface charge is calculated from
the electric field (and not vice-versa as is the φ = const
usual case).
For insulators we have the opposite situation – the charges are fixed and we must
calculate the electric field and the potential from the charge density.

7.10 Poisson’s equation


In section (7.5.3), we showed that ∇ × E = 0, and hence there exists a potential φ
such that E = −∇φ. Substituting this into Maxwell’s first equation ∇ · E = ρ/0
gives Poisson’s equation
ρ
∇2 φ = −
0
If we know ρ, this may be solved for φ (or vice versa), given appropriate boundary
conditions (bcs). In a charge-free region (i.e. ρ = 0) Poisson’s equation becomes
Laplace’s equation
∇2 φ = 0

7.10.1 Uniqueness of solution


Partial differential equations have many linearly-independent solutions. How do we
know which is the correct one for a given physical system?
Theorem: For a volume V bounded by a surface (or
set of surfaces) S, if we are given a set of boundary φ = const
conditions:
conductor surface
Either on the potential φ, V ∂φ σ
∂n = − ǫ0
φ(r) for r ∈ S Dirichlet bcs φ ρ 6= 0
Or the normal component of E, i.e. En = −n · ∇ φ ,
dipole sheet
∂φ σ
≡ n · ∇ φ(r) for r ∈ S Neumann bcs
∂n conductor surface
where n is the unit normal to S, then the solution of
Poisson’s equation in V is unique.

Proof: Let φ1(r) and φ2(r) be two solutions of Poisson's equation, both of which satisfy the boundary conditions, and let ψ = φ1 − φ2, so that ∇²ψ = 0 in V for either set of boundary conditions.
Applying the divergence theorem to the LHS of the vector calculus identity

∇ · (ψ ∇ψ) = ψ ∇²ψ + ∇ψ · ∇ψ

gives

∫_S ψ ∇ψ · n dS = ∫_V ( ψ ∇²ψ + |∇ψ|² ) dV

Since ∇²ψ = 0 in V, and either ψ = 0 (Dirichlet) or ∂ψ/∂n = 0 (Neumann) on S, then

∫_V |∇ψ|² dV = 0

This implies that ∇ψ = 0 everywhere in V, which integrates to ψ = constant.

Consider the Dirichlet and Neumann boundary condition cases separately:


Dirichlet: In this case ψ = 0 on S, so the constant is zero and the two solutions
are equal
φ1 = φ2

Neumann: In this case


φ1 = φ2 + constant
Since the potential is only defined up to a constant, the constant may be disregarded.
Thus both types of boundary conditions give a unique solution of Poisson’s equation.
Notes:

• We can specify either Dirichlet or Neumann boundary conditions at each point


on the boundary, but not both. To specify both is generally inconsistent, since
the solution is then overdetermined.

• However, we can specify either Dirichlet or Neumann boundary conditions on


different parts of the surface.

• The uniqueness theorem means we can use any method we wish to obtain the
solution - if it satisfies the correct boundary conditions, and is a solution of
the equation, then it is the correct solution.

7.10.2 Methods of solution


The theorem is useful: if you find a solution somehow, it is the solution. For example:

(i) Guesswork

(ii) Numerical methods

(iii) Direct integration of Poisson’s equation
(iv) Gauss’ law plus symmetry
(v) Method of images
(vi) Separation of variables
(vii) Green-function method

Method (iii) uses 'direct' integration to find the (unique) solution of Poisson's equation, i.e.

φ(r) = (1/4πε0) ∫_V ρ(r′)/|r − r′| dV′

which is okay for simple situations with a bounded potential.
We have studied method (iv) already.
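As an aside, method (ii) is easy to try in practice. The sketch below (not part of the notes; all numbers are made up for illustration) solves Laplace's equation on a 2-D grid by Jacobi relaxation, with Dirichlet values fixed on the boundary of a square box: one side held at 1 V, the other three earthed.

    import numpy as np

    # Minimal Jacobi-relaxation sketch for Laplace's equation (illustrative only).
    # Dirichlet boundary conditions: top edge at 1 V, the other edges at 0 V.
    N = 51
    phi = np.zeros((N, N))
    phi[0, :] = 1.0                    # fixed boundary values (top edge)

    for _ in range(5000):              # iterate until the interior relaxes
        phi[1:-1, 1:-1] = 0.25 * (phi[2:, 1:-1] + phi[:-2, 1:-1] +
                                  phi[1:-1, 2:] + phi[1:-1, :-2])

    print(phi[N // 2, N // 2])         # ~0.25 V at the centre, as expected by symmetry

By the uniqueness theorem, any grid function satisfying the discretised Laplace equation together with the given boundary values approximates the solution, however we arrive at it.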

7.10.3 The method of images


In the method of images, we add fictitious charges outside the volume under con-
sideration in such a way that the system including the fictitious charges satisfies
Poisson’s equation in the region of interest, plus the correct boundary conditions.
ρ + bcs ⇔ ρ + mirror charges in unphysical region to mimic bcs

Example: Point charge and a planar conducting surface

In the figure, the point charge q is at position (0, 0, z0 ) above an infinite conducting
surface in the x − y plane (on the left in the figure). The region of interest V is
the half-space z > 0. The image charge −q is at position (0, 0, −z0 ) below the x − y
plane, so it’s not in the physical region.

[Figure: the charge q at (0, 0, z0) above the conducting plane, with the image charge −q at (0, 0, −z0); the region of interest is the half-space V : z > 0.]

The potential due to the pair of charges is

φ(r) = (q/4πε0) [ 1/(x² + y² + (z − z0)²)^{1/2} − 1/(x² + y² + (z + z0)²)^{1/2} ]
Since φ satisfies Poisson’s equation everywhere in the half-space (z > 0), and

φ(r)|r=(x,y,0) = 0

then the surface of the conductor (at z = 0) is an equipotential, i.e. φ satisfies the
boundary conditions, so φ is the unique solution.

Electric field: We may calculate the electric field from the potential

E(r) = −∇φ = (q/4πε0) [ (x, y, z − z0)/(x² + y² + (z − z0)²)^{3/2} − (x, y, z + z0)/(x² + y² + (z + z0)²)^{3/2} ]

On the surface of the conductor (z = 0) this becomes

E(r)|_{r=(x,y,0)} = −(q/2πε0) z0/(x² + y² + z0²)^{3/2} e_z

i.e. the field at the surface of the conductor is perpendicular to the surface, as it must be.

The surface charge density on the conductor is

σ(r) = ε0 E_z(r)|_{r=(x,y,0)} = −(q/2π) z0/(x² + y² + z0²)^{3/2}

The total charge induced on the conducting surface is, using polar coordinates (ρ, χ),

Q = ∫_{z=0} σ dS = −(q z0/2π) ∫_0^{2π} dχ ∫_0^∞ ρ dρ/(ρ² + z0²)^{3/2} = −q

which is just the mirror charge, as one would expect.

Force: The force on the positive charge due to the conductor is that between two point charges separated by distance 2z0,

F = −(1/4πε0) q²/(2z0)² e_z
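As a quick numerical sanity check (not in the notes; the values of q and z0 below are arbitrary), we can integrate the surface charge density found above over the plane and confirm that it equals the mirror charge −q.

    import numpy as np

    q, z0 = 1.0e-9, 0.05      # hypothetical values: 1 nC charge, 5 cm above the plane

    drho = z0 / 500.0
    rho = np.arange(0.0, 1000 * z0, drho) + 0.5 * drho      # radial coordinate on z = 0
    sigma = -q * z0 / (2 * np.pi * (rho**2 + z0**2)**1.5)   # sigma(rho) found above
    Q = np.sum(sigma * 2 * np.pi * rho) * drho              # integrate with dS = 2*pi*rho*drho

    print(Q / q)              # ~ -1.0: the total induced charge is -q (up to the cut-off tail)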
Example: point charge and a metal sphere

Consider an earthed sphere of radius a centred on the origin, together with a point charge q at the point P, which lies a distance b from the centre O of the sphere. The volume V is the space r ≥ a, and we choose the mirror charge q′ = −q a/b at the point P′ where OP′ = a²/b, in order to satisfy the boundary conditions at r = a.

[Figure: the sphere of radius a; the charge q at P with OP = b; the image charge q′ = −qa/b at P′ with OP′ = a²/b; a field point at distance r from O, at angle θ to OP, with r1 and r2 its distances from q and q′.]

[The potential on an earthed conductor is defined to be φ = 0 everywhere on and inside the conductor – the potential of the 'earth'.]

The potential of the two-charge system is

φ(r) = (1/4πε0) ( q/r1 + q′/r2 )
     = (q/4πε0) [ 1/(r² + b² − 2br cos θ)^{1/2} − (a/b)/(r² + a⁴/b² − 2(a²/b) r cos θ)^{1/2} ]

On the boundary (r = a)

φ(r)|_{r=a} = (q/4πε0) [ 1/(a² + b² − 2ab cos θ)^{1/2} − (a/b)/(a² + a⁴/b² − 2(a³/b) cos θ)^{1/2} ] = 0

and hence by the uniqueness theorem, this is the correct solution.
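A few lines of Python (not part of the notes; the values of a and b below are arbitrary) confirm numerically that this combination really does vanish on r = a for every angle θ:

    import numpy as np

    a, b = 0.3, 1.0                       # hypothetical sphere radius and charge distance
    theta = np.linspace(0.0, np.pi, 7)

    r1 = np.sqrt(a**2 + b**2 - 2 * a * b * np.cos(theta))              # distance to q
    r2 = np.sqrt(a**2 + a**4 / b**2 - 2 * (a**3 / b) * np.cos(theta))  # distance to q'
    phi = 1.0 / r1 - (a / b) / r2         # phi(a, theta) in units of q/(4 pi eps0)

    print(np.abs(phi).max())              # ~1e-16, i.e. zero to machine precision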

Example: Earthed conducting sphere in a uniform electric field

Consider an earthed conducting sphere of radius a centred on the origin, and subject
to a uniform external electric field E.
[Figure: an earthed sphere (φ = 0) of radius a in a uniform field along e_z; P is a field point at distance r from the centre.]

For a constant electric field E = (0, 0, E) we have φ = −Ez = −Er cos θ. We anticipate that the field will induce a dipole moment parallel to e_z on the sphere, so we use a linear superposition as an ansatz (guess) for the potential:

φ(r) = −E z + (B e_z) · r / r³ = −Er cos θ + (B/r²) cos θ

This is a solution of Laplace's equation outside the sphere, where ρ = 0. As r → ∞, we have φ → −Er cos θ. The boundary condition on the surface of the sphere is

φ|_{r=a} = −Ea cos θ + (B/a²) cos θ = 0    which gives B = a³E

Therefore the (unique) solution for this problem is

φ(r, θ) = −Er cos θ + (a³E/r²) cos θ
Alternatively, we can use the method of images, as illustrated below. The solution to
the constant-external-field problem is obtained in the limit that the charges outside
the sphere move to infinity, and the ones inside move inwards to form a dipole at
the centre.
[Figure: the image construction: external charges +e and −e at distance R on either side of the sphere along e_z, with image charges −e′ and +e′ just inside the sphere; φ = 0 on the sphere.]
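For the uniform-field example above, the two defining properties of the solution (Laplace's equation for r > a and φ = 0 at r = a) can also be checked symbolically. The short sketch below (not from the notes) does this with sympy, using the Laplacian in spherical polars for an axially symmetric function.

    import sympy as sp

    r, theta, a, E = sp.symbols('r theta a E', positive=True)
    phi = -E * r * sp.cos(theta) + a**3 * E * sp.cos(theta) / r**2   # the ansatz above

    # Laplacian in spherical polars for a function with no azimuthal dependence
    lap = (sp.diff(r**2 * sp.diff(phi, r), r) / r**2
           + sp.diff(sp.sin(theta) * sp.diff(phi, theta), theta) / (r**2 * sp.sin(theta)))

    print(sp.simplify(lap))              # 0  -> Laplace's equation holds for r > 0
    print(sp.simplify(phi.subs(r, a)))   # 0  -> phi vanishes on the sphere r = a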

7.11 Electrostatic energy

7.11.1 Electrostatic energy of a general charge distribution


Recall that the work done in moving a point charge q in a field E from r_a to r_b is

W_e = −q ∫_{r_a}^{r_b} E · dr = q ∫_{r_a}^{r_b} ∇φ · dr = q ∫_{r_a}^{r_b} dφ = q ( φ(r_b) − φ(r_a) )

Thus the work done to bring in a point charge q from infinity (where φ = 0) to
position r is W = q φ(r).
Let’s assemble a system of n charges qi from ∞ to positions ri :
W1 = 0    [nothing else is there]

W2 = q2 (1/4πε0) q1/|r1 − r2|    [the work done in the field of q1]

W3 = q3 (1/4πε0) [ q1/|r1 − r3| + q2/|r2 − r3| ]    [the work done in the field of q1 and q2]

and so on. Therefore the work done to bring in the i-th charge qi to position ri is

Wi = (qi/4πε0) Σ_{j=1}^{i−1} qj/|rj − ri|
The total work done is

W_e = Σ_i Wi = Σ_i Σ_{j=1}^{i−1} (1/4πε0) qi qj/|rj − ri| = (1/2) Σ_{i,j (i≠j)} (1/4πε0) qi qj/|rj − ri| ≡ (1/2) Σ_i qi φi

which we may write as

W_e = (1/2) Σ_i qi φi    with    φi = φ(ri) = (1/4πε0) Σ_{j (j≠i)} qj/|rj − ri|

where φi is the potential felt by qi due to all the other charges.³


In the limit of a continuous charge distribution ρ(r), we have

W_e = (1/2) ∫_V dV ρ(r) φ(r)

where the integral is over all space.

7.11.2 Field energy


We can write W_e in terms of the (total) electric field E using Maxwell's first equation in the form ρ = ε0 ∇ · E. Then W_e becomes

W_e = (ε0/2) ∫ dV φ (∇ · E)

Now use the product rule to rewrite

φ (∇ · E) = ∇ · (φ E) − (∇φ) · E = ∇ · (φ E) + |E|²

so that

W_e = (ε0/2) ∫ dV [ ∇ · (φ E) + |E|² ]

Applying the divergence theorem to the integral of the first term, taking V to be all space, and S to be a large sphere of radius R → ∞, this integral becomes

∫_V ∇ · (φ E) dV = ∫_S φ E · dS ≈ O( (1/R)(1/R²) 4πR² ) → 0 as R → ∞

The total energy stored in the electric field is then

W_e = (ε0/2) ∫ dV |E(r)|²

³ The factor of 1/2 ensures that we don't double-count: the energy required to bring in charge j from ∞ in the presence of charge i at ri is the same as that required to bring in charge i from ∞ in the presence of charge j at rj, and should be counted only once. Note there is no factor of 1/2 in our previous expression for the energy of a charge in an external electrostatic field.
and the energy density (energy/unit volume) at r is
w_e(r) = (ε0/2) |E(r)|²
Notes: The general result for We was derived for a continuous charge distribution.
When there are point charges we have to be careful with self-energy contributions
which should be excluded from the integral because they lead to divergences.
The two boxed expressions for We are complementary. We can think of the electro-
static energy as lying in the charge distribution or as being stored in the E field.
Finally, note that since the energy density we is quadratic in the electric field, we
don’t have superposition of energy density.

7.12 Capacitors (condensers) and capacitance


A capacitor is formed from a pair of conductors 1 and 2 carrying equal and opposite
charges, Q and −Q. The potentials on the conductors are φ1 and φ2 , so the potential
difference is V = φ1 − φ2 . Clearly, φ1 and φ2 are proportional to Q (up to a common
constant), so V ∝ Q, and we define capacitance

C = Q/V

which depends only on the geometry of the capacitor. The SI unit of capacitance is
the farad or Coulomb per Volt, 1F = 1CV−1 .

7.12.1 Parallel-plate capacitor


The simplest example is the parallel-plate capacitor.

[Figure: two parallel plates of area A and separation a, carrying charges +Q (potential φ1) and −Q (potential φ2), with a uniform field E between them and E = 0 outside.]
Two parallel plates of area A have a separation a (with a much smaller than the plate dimensions), and carry surface
charge densities +σ and −σ on their inner surfaces (because of the attractive force
between the charges on the two plates). The total charge on the upper plate is
Q = σA, and on the lower plate, −Q = −σA.
We can obtain the electric field using Gauss’ law.

Choosing e_z to point vertically upwards, first take a pillbox that straddles the inner surface of the upper plate, as we did in section (7.9.1) of these notes. The electric field E = 0 in the interior of the conducting plate. Due to symmetry, between the two plates, the E field is normal to the inner surface of the upper plate, so E = E_z e_z. On the lower flat surface of the pillbox, area S, we have dS = dS(−e_z), so

∫_S E · dS = −∫_S E_z dS = −E_z S = σS/ε0

by Gauss' law. Hence the electric field inside the capacitor (i.e. between the plates) is

E_inside = −(σ/ε0) e_z

Now take a pillbox that straddles the outer surface of the upper plate. Since there is no charge on the upper surface, and E = 0 inside the plate, we have E = 0 at the outer surface. Similarly for the lower plate (exercise). Therefore

E_outside = 0 ,    E_inside = −(σ/ε0) e_z

We can obtain the potential between the plates in the usual way

E_z = −σ/ε0 = −∂φ/∂z  ⇒  φ(z) = σz/ε0 + constant

so the potential difference between the plates is

V = φ1 − φ2 = σa/ε0 = Qa/(ε0 A)

and the capacitance is

C = Q/V = ε0 A/a
which is a purely geometrical property of the plates, as expected.
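Plugging in some illustrative numbers (not from the notes): for 10 cm × 10 cm plates separated by 1 mm of vacuum,

    eps0 = 8.854e-12      # F/m
    A = 0.1 * 0.1         # plate area in m^2 (10 cm x 10 cm, chosen for illustration)
    a = 1.0e-3            # separation in m

    C = eps0 * A / a
    print(C)              # ~8.9e-11 F, i.e. roughly 90 pF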

7.12.2 Concentric conducting spheres

Consider a capacitor consisting of two concentric spheres, radii a and b, with b > a, centred on the origin. The outer sphere carries charge +Q and the inner one −Q.

[Figure: concentric spheres of radii a (charge −Q) and b (charge +Q), with the field E between them.]

By Gauss' law, the electric field is zero inside the inner sphere, and outside the outer one [exercise].
By Gauss' law, between the spheres the field is that of the enclosed charge −Q, so

φ(r) = −(Q/4πε0)(1/r) + constant    for a < r < b

The potential difference is

V = φ_b − φ_a = −(Q/4πε0)(1/b − 1/a) = (Q/4πε0)(b − a)/(ab)

so the capacitance is

C = 4πε0 ab/(b − a)
7.12.3 Energy stored in a capacitor
Using our first expression for the energy of a charge distribution, we have

W = (Q/2)(φ1 − φ2) = QV/2 = CV²/2 = Q²/(2C)

which holds for a capacitor of any shape.
For parallel plates

W_e = (ε0/2) ∫ |E|² dV = (ε0/2)(σ/ε0)² (A a) = Q² a/(2ε0 A) = Q²/(2C)

as before.

Chapter 8

Magnetostatics

In the previous chapter, we studied static charge distributions, which lead to an


electric field. In this chapter we study the case of steady currents (also known as
time-independent currents), which lead to a magnetic field.

8.1 Currents
A current is created by moving charges: a charge q moving at velocity v gives an
‘elementary’ current j = q v , which is a vector quantity.
Consider a volume charge density ρ(r) moving with velocity v(r). This defines a
(bulk) current density,
J(r) = ρ(r) v(r)
We can also have a surface current density,

K(r) = σ(r) v(r)

where σ(r) is the surface charge density (charge/area), and a line current

I(r) = λ(r) v(r)

where λ(r) is the line charge density (charge/length).


We define a current element dI for each of these three cases by

dI(r) = J(r) dV    (current element in the bulk)
dI(r) = K(r) dS    (current element in a surface)
dI(r) = I(r) dr    (current element along a line, or wire)

Note: Care is needed with current elements. For example, the surface current element K dS (the vector K times the scalar area element dS) is not equal to K dS (the scalar K times the vector area element dS): the former points in the direction of the current flow within the surface, while the latter points normal to the surface. On the other hand, I dr = I dr, since a line current element always points along the wire in the direction dr.
Units: From the definition of line current, the dimension of current I is

[I] = C m⁻¹ × m s⁻¹ = C s⁻¹ ≡ A (Ampères)
The unit of current I is called the Ampère (A), 1A = 1C s−1 (Coulombs/second).
Similarly
[K] = A m−1 [J] = A m−2 [dI] = A m
Note that current I and current element dI have different units,1 and that none of
these current ‘densities’ has units of current/volume! This is a little confusing, but
it’s standard . . .
The total (scalar) current I passing through a surface S is the flux of J through S
I = ∫_S J · dS

where dS is normal to the surface. Similarly, the flux of K across the curve C is

I = ∫_C K · n′ dr

where the unit vector n′ is normal to C, in the plane of K, and dr is an infinitesimal (scalar) line element on the curve C.

8.1.1 Charge and current conservation


Experiment: total (net) charge is conserved; charge can’t be created or destroyed.
Consider a volume V bounded by a closed surface S. The charge Q in the volume
V changes due to current flowing across the surface S, with outward vector element
of area dS:
∂Q(t)/∂t = ∂/∂t ∫_V ρ(r, t) dV = ∫_V ∂ρ(r, t)/∂t dV = −∫_S J(r, t) · dS = −∫_V ∇ · J(r, t) dV

where ∫_S J(r, t) · dS is the total current flow across S. The minus sign is due to current flowing out of the surface reducing the charge in V, and we used the divergence theorem in the last step. This equation holds for all volumes V, so

∂ρ(r, t)/∂t + ∇ · J(r, t) = 0
which describes charge conservation locally at the point r .
In static situations, ∂ρ/∂t = 0 (by definition), so ρ(r, t) = ρ(r) is independent of t,
and ∇ · J = 0, which is called current conservation.

Example: Consider a disc, carrying uniform charge density σ, rotating about its axis at angular velocity ω. The current density on the disc is

K = σ v = σ ω × r

Hence

∇ · K = σ ∇ · (ω × r) = 0

[Figure: a disc of radius R rotating about the z axis.]

¹ Some authors use different fonts, I and I, for these quantities, but this is hard to do on the blackboard!
8.1.2 Conduction current
A common situation is where we have no net electric charge, but a current
exists because positive and negative charges move with different velocities
J = ρ+ v+ + ρ− v−    with    ρ = ρ+ + ρ− = 0 , but |v+| ≠ |v−|

Examples
Metal: nuclei are fixed, but electrons move: v+ = 0, so J = ρ− v−
Electrolyte: positive and negative ions move with different velocities: J = ρ+ (v+ − v−)


Current is often created by an electric field, which forces charge carriers to move. A
simple linear model, which agrees with experiment in common situations is Ohm’s
law
J = σE
where σ is called the conductivity2 .
In some materials J is not in general parallel to E, and the linear model must be
generalised to
Ji = σij Ej
where σij are the components of a second-rank tensor, the conductivity tensor.

Notes

(i) When ρ = 0, we have ∇ · E = 0 (using Maxwell’s first equation) and ∇ · J = 0


(current conservation).
(ii) We cannot have a static closed current loop. Using Stokes' theorem

∮_C E · dr = ∫_S (∇ × E) · dS = 0

because ∇ × E = 0 for static electric fields. Ohm's law, J = σ E, then implies

∮_C J · dr = 0

so I = 0 for a closed loop.

So we must have a battery to keep current flowing.
[But see next semester for non-static situations.]
In static situations, we can write E = −∇φ, so

∫_{r1}^{r2} E · dr = φ(r1) − φ(r2) = V12

i.e. the potential supplied by the battery. This is called the electromotive force or emf E.
This is not a sensible name, because emf is not a force, but we're stuck with it.

[Figure: a circuit with a battery of emf E driving a current I.]

² Not to be confused with the surface charge density!
(iii) In a perfect conductor (for example a superconductor), σ → ∞, so to keep J
finite, we must have E = 0 (as in electrostatics).
(iv) For an insulator, σ = 0.

8.2 Forces between currents (Ampère, 1821)


Consider two parallel wires, of length L, a distance d apart, carrying currents I1
and I2 .

[Figure: two pairs of parallel wires carrying currents I1 and I2, separated by r1 − r2. Parallel currents attract; anti-parallel currents repel.]

From experiment, we find:

• For perpendicular currents, there is no force.


• Otherwise, the force between current elements dI1 and dI2 is a central force with an inverse square law. The force on infinitesimal current element 1 due to infinitesimal current element 2 in SI units is

dF12 = −(μ0/4π) (dI1 · dI2) (r1 − r2)/|r1 − r2|³ = −(μ0/4π) (dI1 · dI2) r̂12/r12²

where r12 = r1 − r2 and r12 = |r12|.
The constant μ0 is measured to be μ0 = 1.2566370614... × 10⁻⁶ N A⁻² = 1.00000000082(20) × 4π × 10⁻⁷ N A⁻² (or N C⁻² s²), and is called the permeability of the vacuum (or of free space), or just the magnetic constant.

For two current loops C1 and C2 carrying currents I1 and I2:

[Figure: loops C1 and C2 carrying currents I1 and I2, with line elements dr1 at r1 and dr2 at r2, separated by r1 − r2.]
Linear superposition (which comes from experiment) gives

F12 = −(μ0/4π) ∮_{C1} ∮_{C2} (I1 dr1 · I2 dr2) r12/r12³

where we used the expression for current elements dI1 · dI2 = I1 dr1 · I2 dr2.
This is the analog for currents of Coulomb's law. Note that F21 = −F12, as expected.
We now separate this into two parts. The loop carrying current I2 produces a magnetic field B(r), which in turn produces a force on the loop carrying I1. Start by writing

dr1 × (dr2 × r12) = dr2 (dr1 · r12) − (dr1 · dr2) r12

Using Stokes' theorem,

∮_{C1} dr1 · r12/r12³ = −∮_{C1} dr1 · ∇1(1/r12) = −∫_{S1} dS1 · ( ∇1 × ∇1(1/r12) ) = 0

where ∇1 is the gradient with respect to r1, and curl-grad is always zero. Therefore

F12 = (μ0/4π) I1 I2 { ∮_{C1} ∮_{C2} dr1 × (dr2 × r12)/r12³ − ∮_{C2} dr2 ∮_{C1} dr1 · r12/r12³ }
    = ∮_{C1} I1 dr1 × ( (μ0/4π) ∮_{C2} I2 dr2 × r12/r12³ )

since the second term in the first line is zero (as we've just shown). Therefore we can write

F12 = ∮_{C1} I1 dr1 × B(r1)

which is the force on C1 due to B(r1), where

B(r1) = (μ0/4π) ∮_{C2} I2 dr2 × r12/r12³ = (μ0/4π) ∮_{C2} I2 dr2 × (r1 − r2)/|r1 − r2|³

is the magnetic field³ at r1 due to the current in C2. This is the Biot-Savart law (approx. 1820).

8.2.1 The Lorentz force


Generalising, the force on a current element dI due to a magnetic field B is (locally)
dF = dI × B
so

F = ∮_C I dr × B    for a line current

F = ∫_S K × B dS    for a surface current

F = ∫_V J × B dV    for a bulk current

³ In some textbooks, B is called the magnetic induction.
The force on a charge distribution ρ due to an electric field E is ∫_V ρ E dV, so the
force density (force/unit volume) at r is
f (r) = ρ(r) E(r)
Similarly, for a bulk current density in a magnetic field, we often write f = J × B
Therefore, the combined force density is

f = ρE + J × B

which is called the Lorentz force (strictly, the force density).


For a moving point charge q at r0 : ρ(r) = q δ(r − r0 ) and J(r) = q v δ(r − r0 ).
Integrating the Lorentz force density over a volume V′ containing the charge gives

F = q ( E + v × B )

We may regard these as the definitions of the fields E and B.


Note that magnetic fields do no work. If a charge q in a magnetic field B moves a
distance dr = v dt, the work done by the field is

dW = F · dr = q (v × B) · dr = q (v × B) · v dt = q (v × v) · B dt = 0
So magnetic fields change the direction of charged particles, but do not accelerate
them to higher (or lower) speeds.
Units: [B] = NC−1 m−1 s = NA−1 m−1 = T (Tesla). [E and B have different units.]
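The statement that magnetic fields do no work is easy to see numerically. The sketch below (not part of the notes; the field, charge-to-mass ratio and initial velocity are arbitrary) integrates dv/dt = (q/m) v × B with a Runge-Kutta step and shows that the speed stays constant while the direction rotates.

    import numpy as np

    q_over_m = 1.0                          # hypothetical charge-to-mass ratio
    B = np.array([0.0, 0.0, 1.0])           # uniform field along e_z

    def accel(v):
        return q_over_m * np.cross(v, B)    # magnetic part of the Lorentz force / mass

    v = np.array([1.0, 0.0, 0.5])           # arbitrary initial velocity
    dt, speeds = 1.0e-3, []
    for _ in range(20000):                  # standard RK4 steps over many gyrations
        k1 = accel(v)
        k2 = accel(v + 0.5 * dt * k1)
        k3 = accel(v + 0.5 * dt * k2)
        k4 = accel(v + dt * k3)
        v = v + (dt / 6.0) * (k1 + 2 * k2 + 2 * k3 + k4)
        speeds.append(np.linalg.norm(v))

    print(max(speeds) - min(speeds))        # ~1e-13: |v| is constant up to integration error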

8.3 Biot-Savart law


The magnetic field dB(r) at r due to a current element I dr′ at r′ is

dB(r) = (μ0/4π) I dr′ × (r − r′)/|r − r′|³

dB(r) is orthogonal to dr′ and to r − r′, and tangential to a circle centred on an axis through dr′. The field lines of dB 'wrap' around this axis. This is the right-hand grip rule.

[Figure: a current loop C; the element I dr′ at r′ produces dB(r) at the point P with position vector r, where r − r′ runs from the element to P.]

The magnetic field at r due to a current loop carrying current I is

B(r) = (μ0/4π) ∮_C I dr′ × (r − r′)/|r − r′|³

For a bulk current density J, with dI = J dV, and a surface current density K, with dI = K dS, we have

B(r) = (μ0/4π) ∫_V dV′ J(r′) × (r − r′)/|r − r′|³    and    B(r) = (μ0/4π) ∫_S dS′ K(r′) × (r − r′)/|r − r′|³

We can use the Biot-Savart law to compute the magnetic field directly for some simple current distributions.
8.3.1 Long straight wire
Consider a long (infinite) straight wire along the z axis. Without loss of generality, take the position vector r of the point P to be perpendicular to the z axis at the origin. From the diagram, using cylindrical coordinates, we have r = ρ e_ρ and r′ = z′ e_z. Hence

r − r′ = ρ e_ρ − z′ e_z    and    dr′ = dz′ e_z

Therefore

dr′ × (r − r′) = ρ dz′ (e_z × e_ρ) = ρ dz′ e_φ

[Figure: the wire along e_z, the element dr′ at r′ = z′ e_z, the field point P at r = ρ e_ρ, and the angle θ between the wire and r − r′.]

The magnetic field due to current I in the wire is then

B(r) = (μ0/4π) ∫_{−∞}^{∞} I dr′ × (r − r′)/|r − r′|³ = (μ0 I ρ/4π) ∫_{−∞}^{∞} dz′/(ρ² + z′²)^{3/2} e_φ

To evaluate this integral, we use the substitution z′ = ρ tan θ, so dz′ = ρ sec²θ dθ and ρ² + z′² = ρ²(1 + tan²θ) = ρ² sec²θ, which gives

B(r) = (μ0 I/4πρ) ∫_{−π/2}^{π/2} cos θ dθ e_φ = (μ0 I/2πρ) e_φ
So B is inversely proportional to the distance ρ of the point P from the wire, and it
points in the direction of increasing φ, i.e. its field lines ‘wrap around’ the wire in a
circle. This is the right-hand grip rule again.

8.3.2 Two long parallel wires

Now consider the force on a current element dI2 of a second parallel wire (at distance d from the first), coming from the magnetic field B1(r) due to the first wire. Again we choose coordinates so that this current element lies at r = d e_ρ in cylindrical coordinates. The force dF on the current element I2 dr2 at r2 due to the magnetic field B1 produced by wire 1 is

dF = I2 dr2 × B1(r2) = I2 dz e_z × (μ0 I1/2πd) e_φ = −(μ0 I1 I2/2πd) dz e_ρ

There is an attractive (for I1 I2 > 0) force per unit length between two parallel infinitely long straight wires:⁴

f = μ0 I1 I2/(2πd)

⁴ Until 20 May 2019, this was the basis of the definition of the Ampère, which was defined so that if the wires were 1 m apart and the current in each wire was 1 A, the force per unit length between the two wires was exactly 2 × 10⁻⁷ N m⁻¹. Since 1 A = 1 C s⁻¹, it also defined the Coulomb.
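The number quoted in the footnote is a one-line calculation (illustrative, not part of the notes):

    from math import pi

    mu0 = 4 * pi * 1e-7        # N A^-2 (the exact pre-2019 value)
    I1 = I2 = 1.0              # currents of 1 A in each wire
    d = 1.0                    # wires 1 m apart

    print(mu0 * I1 * I2 / (2 * pi * d))    # 2e-07 N per metre, as in the old definition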

8.3.3 Current loop
Consider a current loop of radius a, carrying current I. Find the magnetic field on the axis through the centre of the loop. From the figure:

r = z e_z ,    r′ = a e_ρ ,    |r − r′| = (a² + z²)^{1/2}

Hence

dr′ × (r − r′) = a e_φ dφ × (z e_z − a e_ρ) = (az e_ρ + a² e_z) dφ

[Figure: the loop of radius a; the element dr′ at r′ contributes dr′ × (r − r′) at the axial point r = z e_z.]

So

B(r) = (μ0 I/4π) ∫_0^{2π} (az e_ρ + a² e_z) dφ/(a² + z²)^{3/2} = (μ0 I/2) a²/(a² + z²)^{3/2} e_z

where we used ∫_0^{2π} e_ρ dφ = ∫_0^{2π} (cos φ e_x + sin φ e_y) dφ = 0: the components of the magnetic field perpendicular to e_z cancel due to symmetry as we integrate around the loop.
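The on-axis result can be checked by doing the Biot-Savart integral numerically. The sketch below (not from the notes; the current, radius and height are arbitrary) sums dr′ × (r − r′)/|r − r′|³ around the loop and compares with μ0 I a²/(2(a² + z²)^{3/2}).

    import numpy as np

    mu0 = 4 * np.pi * 1e-7
    I, a, z = 2.0, 0.05, 0.03                    # hypothetical current, radius, height

    n = 2000
    phi = 2 * np.pi * np.arange(n) / n           # parametrise the loop
    dphi = 2 * np.pi / n
    rp = np.stack([a * np.cos(phi), a * np.sin(phi), np.zeros(n)], axis=1)           # r'
    drp = np.stack([-a * np.sin(phi), a * np.cos(phi), np.zeros(n)], axis=1) * dphi  # dr'

    sep = np.array([0.0, 0.0, z]) - rp           # r - r'
    integrand = np.cross(drp, sep) / np.linalg.norm(sep, axis=1)[:, None]**3
    B = mu0 * I / (4 * np.pi) * integrand.sum(axis=0)

    print(B[2], mu0 * I * a**2 / (2 * (a**2 + z**2)**1.5))   # the two values agree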

8.4 Divergence and curl of the magnetic field: Gauss and Ampère laws
The magnetic field produced by a bulk current density J is

B(r) = (μ0/4π) ∫_V dV′ J(r′) × (r − r′)/|r − r′|³ = −(μ0/4π) ∫_V dV′ J(r′) × ∇(1/|r − r′|)

where ∇ acts on r (not on r′). Now

∇ · [ J(r′) × ∇(1/|r − r′|) ] = −J(r′) · [ ∇ × ∇(1/|r − r′|) ] = 0

because 'curl grad' is always zero. Note that J = J(r′), so it's treated as a constant vector when calculating the gradient with respect to r. Therefore

∇ · B(r) = (μ0/4π) ∫_V dV′ J(r′) · [ ∇ × ∇(1/|r − r′|) ] = 0

and we obtain Maxwell's second equation, also known as the differential form of the Gauss law for magnetostatics

∇ · B = 0

Since the second equation always holds, it tells us that there are no magnetic charges
or ‘monopoles’. There are no sources or sinks for magnetic fields; the field lines form
closed loops.
Compare this with Maxwell’s first equation, ∇ · E = ρ/0 , which states that elec-
tric charge density ρ is the source of electric field [strictly, electric flux – see next
semester].
Similarly

∇ × [ J(r′) × ∇(1/|r − r′|) ] = ( ∇²(1/|r − r′|) ) J − (J · ∇) ∇(1/|r − r′|)
                              = −4π δ(r − r′) J − (J · ∇) ∇(1/|r − r′|)

so

∇ × B(r) = −(μ0/4π) ∫_V dV′ [ −4π δ(r − r′) J(r′) − (J(r′) · ∇) ∇(1/|r − r′|) ]

The integral over the volume in the first term can be performed (trivially) with the delta function, whilst the second term vanishes (see below), and we obtain

∇ × B = μ0 J

This is the differential form of Ampère’s law for steady currents, also known as the
static form of Maxwell’s fourth equation.5
To show that the second term vanishes, we take the second ∇ out of the integral to give (up to constants)

∇ ∫_V dV′ J(r′) · ∇(1/|r − r′|)

The integral may be written in terms of the gradient ∇′, with respect to r′:

∫_V dV′ J(r′) · ∇(1/|r − r′|) = −∫_V dV′ J(r′) · ∇′(1/|r − r′|)
 = −∫_V dV′ [ ∇′ · ( J(r′)/|r − r′| ) − ( ∇′ · J(r′) )/|r − r′| ]
 = −∫_S J(r′)/|r − r′| · dS′ + 0

where we used the divergence theorem on the first term, which is zero provided that the current density J(r′) → 0 at infinity, and the second term is zero because ∇ · J = 0 in magnetostatics, where we have steady currents.

In fact, ∇ × B = µ0 J implies ∇ · J = 0, since ∇ · ∇ × B ≡ 0.
⁵ In electrodynamics, there is an additional term on the RHS, due to time-dependent charge distributions and therefore non-steady currents. See next semester.

8.4.1 Ampère’s law (1826)
The fundamental laws of magnetostatics are

∇·B = 0 and ∇ × B = µ0 J

Applying the divergence theorem to a closed surface S, which bounds the volume
V , in the first equation gives
∫_S B · dS = ∫_V ∇ · B dV = 0

i.e. the magnetic flux (flux of B) through any closed surface is zero – Gauss’ law of
magnetostatics.
Applying Stokes’ theorem for a closed curve C bounding an open surface S to the
second equation gives
∮_C B · dr = ∫_S (∇ × B) · dS = μ0 ∫_S J · dS = μ0 I

In words: the circulation of the magnetic field B around a closed loop C is equal to μ0 × the total current I flowing through the loop.
This is the integral form of Ampère's law, which is extremely useful in finding B in symmetric situations, just like Gauss' law in electrostatics.

[Figure: a closed loop C bounding a surface S, with current density J passing through it and B circulating around it.]

Example: long straight wire. Consider a long straight wire lying along the z axis and use cylindrical coordinates. By symmetry, B(r) will be independent of z, its magnitude will be independent of φ, and it must point in the e_φ direction (this is the right-hand grip rule again). Hence

B(r) = B_φ(ρ) e_φ

Consider a circle of radius ρ with its centre on the z axis, so that dr = ρ dφ e_φ, and apply Ampère's law:

∮_C B · dr = μ0 I  ⇒  B_φ(ρ) 2πρ = μ0 I

which reproduces our previous result with very little effort,

B(r) = (μ0 I/2πρ) e_φ

Example: coil/solenoid. Consider a long coil⁶ of radius a, with N tightly-wound turns/unit length, centred on the z axis. Again, we use cylindrical coordinates.

By symmetry, B(r) must be in the z direction, and independent of φ and z. (See also question 6 on Tutorial Sheet 10.) Therefore

B(r) = B_z(ρ) e_z

Away from the wire, we have ∇ × B = 0, so

∂B_z/∂ρ = 0  ⇒  B_z = constant

Since the constant is zero as ρ → ∞, then B must be zero everywhere outside the coil, which is a remarkable result!
So we have

B_z = B for ρ < a ,    B_z = 0 for ρ > a

[Figure: the solenoid of radius a along the z axis, with a rectangular Ampère loop of length L having one long side inside the coil and one outside.]

Now apply Ampère's law to the rectangular 'loop' of length L shown in the figure, which has one long side inside the coil, so dr = ±dz e_z, and one outside. Therefore

B · L − 0 · L = μ0 N L I

The magnetic field is constant inside the coil, with

B = μ0 N I e_z
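For a feel for the size (an illustrative calculation, not from the notes): a solenoid with N = 1000 turns per metre carrying 1 A gives

    from math import pi

    mu0 = 4 * pi * 1e-7    # N A^-2
    N, I = 1000, 1.0       # turns per metre and current (illustrative values)

    print(mu0 * N * I)     # ~1.26e-3 T, i.e. about 1.3 mT inside (and zero outside)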

8.4.2 Conducting surfaces


Consider a conducting surface carrying current density K. What are the boundary
conditions on the magnetic field at the surface?
Normal component: Use a small Gaussian pillbox of negligible height with a circular surface of area A, which straddles the surface as shown (B2 just above the surface, B1 just below, n the unit normal). For small enough A, we can take B to be constant over A. Applying the divergence theorem gives

0 = ∫_V ∇ · B dV = ∫_S B · dS = (B2 − B1) · n A
So
B2 · n = B1 · n

The normal component of the magnetic field is continuous across the boundary.
⁶ A 'long' coil means its length is much greater than its radius, so we can treat it as being effectively infinite.
Tangential component: Use a small rectangular Ampère loop of length L and negligible width straddling the surface, which has normal n, as shown. For small enough L, we can take B to be constant along each long side of the rectangle.
Let the surface spanning the loop have normal n′, so the long side of the loop has direction n′ × n. Applying Ampère's law gives

∮_C B · dr = (B2 − B1) · (n′ × n) L = μ0 I = μ0 ∫_{C′} K · n′ dr = μ0 K · n′ L

where the line C′ is along that part of the surface contained within the loop, so that I = ∫_{C′} K · n′ dr is the total current flowing through the loop. Therefore

( n × (B2 − B1) ) · n′ = μ0 K · n′

This holds for all n′ tangential to the surface, therefore

n × (B2 − B1) = μ0 K

There is a discontinuity in the tangential components of the magnetic field due to the surface current density.

8.5 The vector potential


A vector field B(r) that satisfies ∇ · B = 0 can be written as the curl of a vector
potential 7
∇ · B = 0 ⇔ ∃ A such that B = ∇ × A

The vector potential is not unique: it's defined up to the gradient of an arbitrary scalar field χ(r). If we make the gauge transformation

A → A′ = A + ∇χ

then

∇ × A → ∇ × A′ = ∇ × A + ∇ × ∇χ = ∇ × A + 0

The magnetic field B (and hence the physics) is unchanged by the transformation.
To fix the gauge uniquely we may choose to add an additional constraint on A, so that

∇ · A = 0

which is called Coulomb gauge. If ∇ · A = ψ ≠ 0, we can always find a gauge transformation χ such that ∇ · A′ = 0, i.e.

∇ · A′ = ∇ · A + ∇²χ = 0  ⇒  ∇²χ = −ψ
This is just Poisson’s equation, which can be solved for χ in the standard way.
⁷ Compare this with the case of electrostatics, where ∇ × E = 0 ⇒ the electric field can be expressed as the gradient of a scalar potential φ, and vice versa, i.e. ∇ × E = 0 ⇔ ∃ φ such that E = −∇φ.
Now if B = ∇ × A, with ∇ · A = 0, then from the differential form of Ampère's law,

μ0 J = ∇ × B = ∇ × (∇ × A) = ∇(∇ · A) − ∇²A = −∇²A

So

∇²A = −μ0 J

This has the form of Poisson's equation for A (or rather three Poisson equations, one for each Cartesian component Ai), with solution⁸

A(r) = (μ0/4π) ∫_V dV′ J(r′)/|r − r′|

which satisfies the boundary condition A(r) → 0 as r → ∞.
If we take the curl of this equation, we recover the Biot-Savart law [exercise].
We can write similar expressions for line and surface currents [exercise].

Example: For a magnetic field B which is constant everywhere,

A = (1/2) B × r

because

∇ × ( (1/2) B × r ) = (1/2) B (∇ · r) − (1/2) (B · ∇) r = (3/2) B − (1/2) B = B

If we know B, we can find A using Stokes' theorem in problems with a lot of symmetry:

∮_C A · dr = ∫_S (∇ × A) · dS = ∫_S B · dS ≡ Φ

Φ is the magnetic flux crossing the surface S bounded by the closed curve C.
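The constant-field example above is also easy to verify symbolically; the sketch below (not part of the notes) checks with sympy that A = (1/2) B × r has curl B and zero divergence (so it is already in Coulomb gauge).

    import sympy as sp

    x, y, z = sp.symbols('x y z')
    Bx, By, Bz = sp.symbols('B_x B_y B_z')        # components of the constant field

    B = sp.Matrix([Bx, By, Bz])
    A = B.cross(sp.Matrix([x, y, z])) / 2         # A = (1/2) B x r

    curl_A = sp.Matrix([sp.diff(A[2], y) - sp.diff(A[1], z),
                        sp.diff(A[0], z) - sp.diff(A[2], x),
                        sp.diff(A[1], x) - sp.diff(A[0], y)])
    div_A = sp.diff(A[0], x) + sp.diff(A[1], y) + sp.diff(A[2], z)

    print(sp.simplify(curl_A - B))    # zero vector: curl A = B
    print(sp.simplify(div_A))         # 0: Coulomb gauge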

Example: For a solenoid centred on the z axis, we showed previously that

B = B e_z for ρ < a ,    B = 0 for ρ > a

By symmetry, A(r) = A_φ(ρ) e_φ.
Choosing S to be a horizontal disc of radius ρ, as shown,

ρ > a :  2πρ A_φ = B πa²  ⇒  A_φ(ρ) = (1/2) B a²/ρ
ρ < a :  2πρ A_φ = B πρ²  ⇒  A_φ(ρ) = (1/2) B ρ

[The case ρ < a is as in the first example above.]

Note that A ≠ 0 everywhere outside the solenoid, even though B = 0 everywhere outside it! This is important in quantum mechanics – see the Quantum Theory course.
⁸ Compare this with ∇²φ = −ρ/ε0 in electrostatics, with solution φ(r) = (1/4πε0) ∫_V ρ(r′)/|r − r′| dV′.
8.6 Magnetic dipoles
Consider a circular wire of radius a, in the x-y plane, carrying current I. What is the form of the vector potential at points with r ≫ a?

A(r) = (μ0 I/4π) ∮_C dr′/|r − r′|

[Figure: the loop of radius a in the x-y plane, with the element dr′ at angle φ and the field point r in the x-z plane.]

Wlog take r in the x-z plane, so r = x e_x + z e_z and

r′ = a ( cos φ e_x + sin φ e_y )
dr′ = a ( −sin φ e_x + cos φ e_y ) dφ

For r ≫ a,

1/|r − r′| = (1/r)( 1 + r · r′/r² + ... )    with    r · r′ = a x cos φ

Then

A(r) = (μ0 I/4π) ∫_0^{2π} dφ a ( −sin φ e_x + cos φ e_y ) (1/r)( 1 + a x cos φ/r² + ... )
     ≈ (μ0 I/4π) (π a² x/r³) e_y = (μ0/4π) m × r/r³

where we used ∫_0^{2π} cos²φ dφ = π, with all the other integrals being zero, and defined

m = π a² I e_z = I S

where S is the vector area of the loop. Now

∇ × ( m × r/r³ ) = (1/r³) ∇ × (m × r) + ∇(1/r³) × (m × r) = [ 3 (m · r) r − r² m ]/r⁵

so that

B(r) = ∇ × A(r) = (μ0/4π) [ 3 (m · r) r − r² m ]/r⁵

which is the field of a magnetic dipole with dipole moment m.
Compare this with the expression for the electric field of an electric dipole of moment p at the origin:

E(r) = (1/4πε0) [ 3 (p · r) r/r⁵ − p/r³ ]

As we shall now show, we get the same result far away from any current loop,
and the field lines for electric dipoles, magnetic dipoles, and bar magnets (see next
chapter) are therefore the same in the far zone (far-field limit).

[Figure: field lines E of an electric dipole (charges + and −), B of a magnetic dipole (current loop I), and B of a bar magnet (poles N and S).]

Multipole expansion: For any current loop near the origin, we expand the magnetic vector potential A as above,

A(r) = (μ0 I/4π) ∮_C dr′ (1/r)( 1 + r · r′/r² + ... )

Applying Stokes' theorem to an arbitrary constant vector c gives

c · ∮_C dr′ = ∮_C c · dr′ = ∫_S (∇′ × c) · dS′ = 0

Since this holds for all vectors c, then ∮_C dr′ = 0 for any closed curve C, so the first term in the expansion of A always vanishes. Hence there are no magnetic monopoles, as expected.
To evaluate the second term, first apply Stokes' theorem to c (r · r′):

c · ∮_C dr′ (r · r′) = ∫_S ∇′ × ( c (r · r′) ) · dS′ = ∫_S ( ∇′(r · r′) ) × c · dS′
                    = ∫_S (r × c) · dS′ = c · ( ∫_S dS′ × r ) = c · (S × r)

where S = ∫_S dS′ is the vector area of the current loop. Since this holds for all constant vectors c, we find

∮_C dr′ (r · r′) = S × r

Therefore the first non-zero term in the multipole expansion of A(r) is

A(r) = (μ0 I/4π) (S × r)/r³ = (μ0/4π) (m × r)/r³
as in our explicit example of the circular wire above. This holds for any current loop
whose size is much less than r.
If we go further in the expansion, we get a magnetic quadrupole term, etc.

8.6.1 Magnetic moment and angular momentum
A system of charged particles flowing in a closed current loop has angular momentum.
For a constant vector c, consider

c · ∮_C (r × dr) = ∮_C (c × r) · dr = ∫_S ∇ × (c × r) · dS = ∫_S [ c (∇ · r) − (c · ∇) r ] · dS = 2 c · ∫_S dS

Since this holds for all constant vectors c, we have

∫_S dS = S = (1/2) ∮_C r × dr

We can then rewrite the dipole moment as

m = I ∫_S dS = (1/2) I ∮_C r × dr = (1/2) ∮_C r × dI

where dI is the current element.
For a bulk current dI = J dV, this becomes m = (1/2) ∫_V (r × J) dV, and since J = ρ v, we have

m = (1/2) ∫_V (r × v) ρ dV

Now suppose the charge density ρ is made up of particles each with charge q and mass m; then ρ = (q/m) ρ_m, where ρ is the charge density (charge/unit volume) and ρ_m is the mass density (mass/unit volume). Then

m = (q/2m) ∫_V (r × v) ρ_m dV = (q/2m) ∫_V r × p dV = (q/2m) L = g L

where p is the momentum density (momentum/unit volume), L is the angular momentum, and g is called the gyromagnetic ratio of the particles.
It turns out that this derivation is wrong by a factor of two for an isolated spin-1/2 particle such as an electron (with charge −e), for which g = −e/m. The extra factor of two requires relativity.

8.6.2 Force and torque on a dipole in an external field


Force on the dipole

Consider a current loop in an external magnetic field B(r). The Lorentz force on the loop is

F = I ∮_C dr′ × B(r′) = I ∮_C dr′ × [ B(r) + ( (r′ − r) · ∇ ) B(r) + ... ]

where r′ is a point on the loop, and we have Taylor-expanded B(r′) about a point r inside (or close to) the loop. We've already shown that ∮_C dr′ = 0, so the terms in the integrand that don't depend on r′ integrate to zero, and hence

F = I ∮_C dr′ × ( r′ · ∇ ) B(r) + ...

We showed above that

∮_C dr′ (r′ · r) = S × r

The derivation above works for any vector field a(r) [check it!], so

∮_C dr′ ( r′ · a(r) ) = S × a(r)            (8.1)

Then

F_i = I ε_ijk ∮_C dx′_j (r′ · ∇) B_k = I ε_ijk (S × ∇)_j B_k = I [ (S × ∇) × B ]_i

so that

F = (m × ∇) × B = −m (∇ · B) + ∇(m · B)

But ∇ · B = 0, therefore

F = ∇(m · B) ≡ −∇ W_dip

So the energy of the dipole in the external field is

W_dip = −m · B

Note: this is not the total energy of the dipole. It also takes energy to make the
current flow in the loop. [This is covered next semester.]

Note that these results mirror those for the electric dipole, where F = ∇(p · E) and W = −p · E.

Torque on the dipole

G = ∮_C r′ × ( I dr′ × B(r′) ) = I ∮_C [ dr′ ( r′ · B(r) ) − B(r) ( r′ · dr′ ) ]

where we approximated B(r′) ≈ B(r) on the RHS. The second term vanishes on applying Stokes' theorem (because ∇′ × r′ = 0), and we again apply the result of equation (8.1) to the first term, which gives

G = I ∮_C dr′ (r′ · B) = I S × B = m × B

which again mirrors the result G = p × E for an electric dipole.


The results are the same as for the electric dipole because one can think of a magnetic
dipole as made up of two magnetic monopoles.

[Figure: a current loop with dipole moment m pictured as a pair of magnetic 'poles' + and −, like a bar magnet.]

This is a useful way to remember the formulae, but magnetic monopoles (probably!) do not exist.

Example: compass needle. We can measure the magnetic field B via oscillations of a compass needle (which can be regarded as a magnetic dipole).
Since torque is the rate of change of angular momentum, G = dL/dt, we have

dL/dt = m × B

For a needle with moment of inertia I, this becomes

d/dt ( I φ̇ ) = −m B sin φ

where φ is the angle between m and B (see diagram).
For small oscillations, where φ ≪ 1, we have

φ̈ + (mB/I) φ = 0

which gives an oscillation frequency of

ω = √(mB/I)
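Plugging in some illustrative numbers (not from the notes, purely to get a feel for the timescale): a needle with m = 0.1 A m² and moment of inertia I = 5 × 10⁻⁷ kg m², in a horizontal field of 2 × 10⁻⁵ T,

    from math import pi, sqrt

    m = 0.1        # magnetic moment in A m^2 (illustrative)
    B = 2.0e-5     # horizontal magnetic field in T (illustrative)
    I = 5.0e-7     # moment of inertia in kg m^2 (illustrative)

    omega = sqrt(m * B / I)
    print(omega, 2 * pi / omega)   # ~2 rad/s, so a period of about 3 s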

8.7 Summary: electrostatics and magnetostatics
Electrostatics: Stationary charges (∂ρ/∂t = 0) are the source of electric fields

∇ · E = ρ/ε0    [ME1]
Coulomb’s law (field due to point charge) leads to

∇ × E = 0 [ME3 (static)] ⇒ E = −∇ φ

In turn, the above lead to Poisson’s equation for the scalar potential φ
∇²φ = −ρ/ε0

Magnetostatics: Steady current loops (∇ · J = 0) produce magnetic fields [no magnetic monopoles].

∇·B = 0 [ME2] ⇒ B = ∇×A

The Biot-Savart law (field due to current element) leads to

∇ × B = µ0 J [ME4 (static)]

In turn, in Coulomb gauge (∇ · A = 0), the above lead to a vector Poisson equation for A

∇²A = −μ0 J

Lorentz force: On a point particle



F = q ( E + v × B )

or the Lorentz force density


f = ρE + J × B

Next semester, you’ll see how the Maxwell equations involving curls, (ME3 and
ME4), need to be modified when time-varying currents and fields are present.
