Electromagnetism
Introduction
Credit
These Mathematica Notebooks are based on original TeX notes by Tom Marsh, leader of the Astronomy & Astrophysics group in the Department of Physics at Warwick University.
1.1 Aims
Electromagnetism is one of the four fundamental forces. Along with gravity, it is also the one we encounter most obviously in everyday life. It is of immense practical importance and underlies optics, electricity generation, and modern communications, as well as the motors and transformers which crop up in almost every household appliance. Electromagnetism is a field theory, and was the first physical theory that unified seemingly separate branches of physics, in this case optics and electricity. In field theories the physical quantities (e.g., the electric and magnetic field) are defined over all space. Compare this with classical mechanics, where it makes no sense to talk of the velocity of a particle defined over all space. In order to understand such continuously changing quantities we will make frequent use of vector derivatives. These are often difficult to get used to when first encountered. The first part of the course develops these alongside Maxwell's equations (Chapters 2, 3), and a major aim of this course is to make you familiar with these quantities. Wave solutions of Maxwell's equations are presented because their consequences can be seen almost daily, even if only in a rainbow or the reflection from a puddle on a road. A further aim of the course is to develop intuition for the behaviour of electromagnetic waves by looking at them in different situations. By the end of the course you should have become familiar with vector calculus, the physics of electric and magnetic fields, and the physics of waves, both in a general sense and in the specific case of electromagnetic waves.
1.3 Assessment
The two assignments will contribute 40% of the marks for the course and the 100-minute exam the remaining 60%.
1.4 Notes
The course notes are at http://physics.uwa.edu.au/pub/Electromagnetism as Mathematica Notebooks and in PDF format. The Notebooks try to be more-or-less self-contained and cover everything you should know without covering too much. Please be on the lookout for errors and let me know of any that you find.
1.5 References
No one book is entirely suited to this course, and in any case books are very much a matter of personal preference. The one I like best is Introduction to Electrodynamics by Griffiths. Classical Electrodynamics by Jackson is the most famous and comprehensive text, but is only recommended to the very mathematically inclined. Finally, volume 2 of The Feynman Lectures on Physics is worth looking at for its physical insight, particularly with regard to vector calculus. I would urge you to look at more than one treatment of any topic that you have difficulty with, as each version may contain elements that help.
1. D J Griffiths, Introduction to Electrodynamics, 3rd edition, Prentice-Hall, 1999.
2. J D Jackson, Classical Electrodynamics, 2nd edition, Wiley, 1975.
3. R P Feynman, R B Leighton, and M Sands, The Feynman Lectures on Physics, Volume 2: Electromagnetism and Matter, Addison-Wesley, 1963-65.
1.6 Conventions
The notes are arranged in chapters, each of which may cover one or more lectures. The order of the topics follows the order of the lectures. Each chapter starts with an introduction that briefly lays out what is to come. Worked examples are included, most, but not all, of which will be covered in the lectures. Some sections are marked with a warning sign, which indicates that you should watch out. At the end of the chapters, a short section summarises the principal results and equations which you should aim to master. Appendices are used to collect together material on specific topics such as vector calculus, coordinate systems, and delta functions. The notes follow various conventions for the symbols:
1. Vector quantities are always in bold-face, e.g., $\mathbf{A}$.
2. The magnitudes of vectors are scalars and are indicated by, e.g., $A$.
3. Cross-products are indicated by $\times$.
4. Unit vectors are indicated with a hat, as in $\hat{\mathbf{x}}$ for a unit vector along the x direction.
Another convention that needs to be understood is that of a right-handed set of axes. For many students the vector nature of electromagnetism is one of its most difficult aspects, as it is often necessary to picture problems in three dimensions. The relative orientation of various vectors is often an issue. Starting with x and y axes at right-angles to each other, a right-handed set of axes is defined by $\hat{\mathbf{z}} = \hat{\mathbf{x}} \times \hat{\mathbf{y}}$.
    x̂ = UnitVector[3, 1]
    {1, 0, 0}
    ŷ = UnitVector[3, 2]
    {0, 1, 0}
    ẑ = UnitVector[3, 3]
    {0, 0, 1}
    ẑ == x̂ × ŷ
    True
A helpful rule for cross-products is to orient your right hand so that your fingers point from the first to the second vector (e.g., from $\hat{\mathbf{x}}$ to $\hat{\mathbf{y}}$ in this case). Your thumb then points in the direction of the cross-product.
Since these notes are Mathematica Notebooks, I use Mathematica conventions throughout. I find some of these conventions very useful. The exponential e (ⅇ), imaginary i (ⅈ), and differential d (ⅆ) are all displayed using "double-struck" characters (which distinguishes them from ordinary letters e, i, and d). Integrals, e.g., ∫₀^π cos(x) ⅆx, and changes of variables, e.g., substituting x → tan(θ) into ⅆx/(x² + 1), where ⅆx becomes sec²(θ) ⅆθ, use ⅆ. Total derivatives also use ⅆ: ⅆ sin(x y)/ⅆx gives (x ⅆy/ⅆx + y) cos(x y). Partial derivatives use ∂: ∂ sin(x y)/∂x gives y cos(x y).
In the figures, fields and currents are indicated by crosses, ⊗, if they point down into the page and dots, ⊙, if they point up out of the page.
Chapter 2
Gradients and Potentials
2.1 Introduction
There are many circumstances in which the rate at which a physical quantity changes with distance needs to be known. In building a road, the rate of change of height with horizontal distance (the gradient) is all-important. Gradients of pressure in fluids drive accelerating flows and gradients of temperature drive heat flow. The physical quantities are usually distributed over three dimensions and so the first task in this chapter is to extend the definition of gradient from one dimension, where it is given by the derivative with respect to position, to three dimensions. We will find that the three-dimensional gradient is a vector and can be calculated by application of a new operator called the gradient or vector derivative operator. We then look at how the nature of the electrostatic field allows us to define a quantity called the potential whose gradient is equal to the electric field. The chapter finishes with example calculations of fields from potentials.
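$$df = \left(\hat{\mathbf{x}}\frac{\partial f}{\partial x} + \hat{\mathbf{y}}\frac{\partial f}{\partial y} + \hat{\mathbf{z}}\frac{\partial f}{\partial z}\right)\cdot\left(\hat{\mathbf{x}}\,dx + \hat{\mathbf{y}}\,dy + \hat{\mathbf{z}}\,dz\right) = \nabla f\cdot d\mathbf{l}, \qquad (2.1)$$
where $d\mathbf{l}$ is the line element vector $(dx, dy, dz)$, i.e., $d\mathbf{l} = \hat{\mathbf{x}}\,dx + \hat{\mathbf{y}}\,dy + \hat{\mathbf{z}}\,dz$, and $\nabla$ is the vector derivative operator (called del or, more rarely, nabla),
$$\nabla = \left(\frac{\partial}{\partial x}, \frac{\partial}{\partial y}, \frac{\partial}{\partial z}\right) = \hat{\mathbf{x}}\frac{\partial}{\partial x} + \hat{\mathbf{y}}\frac{\partial}{\partial y} + \hat{\mathbf{z}}\frac{\partial}{\partial z}.$$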
Gradient
2.2.1 ∇f or grad f
The quantity $\nabla f$ is the gradient of f, and is also sometimes written as grad f. Since $df = \nabla f\cdot d\mathbf{l}$, for a given length of line element $d\mathbf{l}$, df is maximum when $d\mathbf{l}$ is parallel to $\nabla f$. Thus $\nabla f$ points in the direction of maximum increase of f and its magnitude equals the rate of change of f in that direction. The gradient is the key to straightforward extension of some well-known equations that apply in one dimension. Thus the well-known equations
$$Q = -\kappa\frac{dT}{dx}, \qquad F = -\frac{dV}{dx},$$
where Q is the heat flux in W m⁻² and κ is the conductivity, become
$$\mathbf{Q} = -\kappa\nabla T, \qquad \mathbf{F} = -\nabla V \qquad (2.2)$$
in 3D with the heat flux now a vector pointing in the direction of maximum decrease in temperature.
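As a rough illustration of Eq. 2.2, the sketch below evaluates the heat flux for a made-up temperature field. The field `temp`, the conductivity value `kappa = 2`, and the name `flux` are all assumptions chosen for this example and are not part of the notes; `Grad` requires Mathematica 9 or later.

```mathematica
(* Hypothetical temperature field with a single peak at the origin *)
temp[x_, y_] := Exp[-(x^2 + y^2)];

(* Assumed conductivity, in arbitrary units *)
kappa = 2;

(* Heat flux Q = -kappa Grad[T], evaluated once symbolically *)
flux[x_, y_] = -kappa Grad[temp[x, y], {x, y}];

(* To the right of the peak the flux points along +x, away from the hot spot *)
flux[1, 0]
(* {4/E, 0} *)
```

The flux points down the temperature slope, as the minus sign in Eq. 2.2 requires.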
Figure 2.1 Contours of equal temperature, T, with arrows representing $-\nabla T$. See http://demonstrations.wolfram.com/VisualizingTheGradientVector/.
Figure 2.1 illustrates the idea of the gradient in a two-dimensional example. The contours represent lines of equal temperature (isotherms), in a case where there are two peaks of temperature with one higher than the other. The gradient is always perpendicular to the lines of equal temperature and it is large where the lines are close together.
Example 2.1 Why is the gradient always perpendicular to contour lines (or, in 3D, contour surfaces)?
If a line element $d\mathbf{l}$ lies in a line or surface along which f is constant (i.e., an isoline or isosurface) then we can write $\nabla f\cdot d\mathbf{l} = 0$. Therefore $d\mathbf{l}$ must be perpendicular to the gradient $\nabla f$, which is why the arrows representing the gradient in Figure 2.1 were drawn at right-angles to the contour lines.
Exercise 2.1 In Figure 2.2 what does P represent?
In Mathematica you can compute Taylor series by adding an order term to the function:
    f[h + x] + O[h]^5
$$f(x) + f'(x)\,h + \frac{1}{2} f''(x)\,h^2 + \frac{1}{6} f^{(3)}(x)\,h^3 + \frac{1}{24} f^{(4)}(x)\,h^4 + O(h^5)$$
For example, the Maclaurin series (i.e., the Taylor series about 0) for tan(x) is
    Tan[x] + O[x]^10
$$x + \frac{x^3}{3} + \frac{2x^5}{15} + \frac{17x^7}{315} + \frac{62x^9}{2835} + O(x^{10})$$
See http://demonstrations.wolfram.com/TaylorSeries/.
Since the Taylor series involves the derivatives of a function at a point, x, if $f'(x) = 0$ and $f''(x) < 0$, then $f(x + h) = f(x) - a h^2 + \cdots$ (where $a > 0$) and x is a local maximum of f because, in the neighbourhood of x (i.e., around h = 0), f decreases as we move away from x. But of course, you already knew this from high-school calculus. However, in more than one variable, the situation is more complicated.
One very interesting (formal) way of writing the Taylor series is
$$f(x+h) = e^{h\,\partial_x} f(x) = \left(1 + h\,\partial_x + \frac{h^2}{2!}\partial_x^2 + \frac{h^3}{3!}\partial_x^3 + \cdots\right) f(x) = f(x) + h f'(x) + \frac{h^2}{2!} f''(x) + \frac{h^3}{3!} f^{(3)}(x) + \cdots$$
This idea turns out to be useful in group theory. The action of the operator $e^{h\,\partial_x}$ takes the function $f(x)$ to $f(x+h)$.
Another advantage of this notation is that it is straightforward to extend it to any number of variables by replacing $h\,\partial_x$ with $\mathbf{h}\cdot\nabla$:
$$f(\mathbf{x}+\mathbf{h}) = e^{\mathbf{h}\cdot\nabla} f(\mathbf{x}) = f(\mathbf{x}) + (\mathbf{h}\cdot\nabla) f(\mathbf{x}) + \frac{1}{2!}(\mathbf{h}\cdot\nabla)^2 f(\mathbf{x}) + \cdots$$
Some care needs to be taken when interpreting this expression. For two variables, the second term is
$$(\mathbf{h}\cdot\nabla) f(\mathbf{x}) = \mathbf{h}\cdot\nabla f(\mathbf{x}) = (h, k)\cdot(\partial_x, \partial_y) f(x, y) = h\,f^{(1,0)}(x, y) + k\,f^{(0,1)}(x, y),$$
and the third term is
$$\frac{1}{2!}(\mathbf{h}\cdot\nabla)^2 f(\mathbf{x}) = \frac{1}{2}(h, k)\cdot(\partial_x, \partial_y)\big((h, k)\cdot(\partial_x, \partial_y) f(x, y)\big) = \frac{1}{2}\left(h^2 f^{(2,0)}(x, y) + 2hk\,f^{(1,1)}(x, y) + k^2 f^{(0,2)}(x, y)\right).$$
In matrix form,
$$(\mathbf{h}\cdot\nabla)^2 f(\mathbf{x}) = \mathbf{h}^T H\,\mathbf{h} = \begin{pmatrix} h & k \end{pmatrix} H \begin{pmatrix} h \\ k \end{pmatrix},$$
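where H is the Hessian matrix. We require an important result from linear algebra: a symmetric n × n matrix M is positive definite ⇔ $\mathbf{x}^T M \mathbf{x} > 0$ for all $\mathbf{x} \neq 0$ in $\mathbb{R}^n$ ⇔ all n eigenvalues, $\lambda_i$, of M satisfy $\lambda_i > 0$. Similarly, a negative definite matrix has each $\lambda_i < 0$. Since the Taylor series
$$f(x+h, y+k) = f(x, y) + \begin{pmatrix} h & k \end{pmatrix}\begin{pmatrix} \partial f/\partial x \\ \partial f/\partial y \end{pmatrix} + \frac{1}{2}\begin{pmatrix} h & k \end{pmatrix}\begin{pmatrix} \dfrac{\partial^2 f}{\partial x\,\partial x} & \dfrac{\partial^2 f}{\partial x\,\partial y} \\ \dfrac{\partial^2 f}{\partial y\,\partial x} & \dfrac{\partial^2 f}{\partial y\,\partial y} \end{pmatrix}\begin{pmatrix} h \\ k \end{pmatrix} + \cdots$$
involves the partial derivatives of f at the point $\mathbf{x} = (x, y)$, if $\nabla f(\mathbf{x}) = 0$, i.e., $f^{(1,0)}(x, y) = 0 = f^{(0,1)}(x, y)$, and H is negative definite, then (x, y) is a local maximum of f. See http://demonstrations.wolfram.com/EigenvaluesCurvatureAndQuadraticForms/.
The relationship between the sign of the eigenvalues and the sign of $\mathbf{x}^T M \mathbf{x}$ results directly from the definition of the eigenvalues, $\lambda_i$, and corresponding (orthonormal) eigenvectors, $\mathbf{u}_i$, of a symmetric matrix:
$$M\mathbf{u}_i = \lambda_i\mathbf{u}_i \;\Rightarrow\; \mathbf{u}_j^T M \mathbf{u}_i = \lambda_i\,\mathbf{u}_j^T\mathbf{u}_i = \lambda_i\,\delta_{i,j}.$$
If the eigenvectors span $\mathbb{R}^n$, we can express any vector in $\mathbb{R}^n$ as $\mathbf{x} = a_1\mathbf{u}_1 + \cdots + a_n\mathbf{u}_n$ where each $a_i \in \mathbb{R}$. Then
$$\mathbf{x}^T M \mathbf{x} = \sum_{j=1}^{n} a_j\mathbf{u}_j^T\, M \sum_{i=1}^{n} a_i\mathbf{u}_i = \sum_{j=1}^{n}\sum_{i=1}^{n} a_i a_j\,\mathbf{u}_j^T M \mathbf{u}_i = \sum_{j=1}^{n}\sum_{i=1}^{n} a_i a_j\lambda_i\,\delta_{i,j} = \sum_{i=1}^{n}\lambda_i a_i^2.$$
Since $a_i^2 \geq 0$ and the $a_i$ are arbitrary, $\mathbf{x}^T M \mathbf{x} > 0 \Leftrightarrow \lambda_i > 0$ for all $i = 1, 2, \ldots, n$.
The eigenvectors diagonalize the symmetric matrix, which can be written in the form $M = P^T D P$, where P is the matrix of eigenvectors, $P = (\mathbf{u}_1\ \mathbf{u}_2\ \cdots\ \mathbf{u}_n)$, and D is the diagonal matrix with the eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_n$ along the diagonal. Then
$$\mathbf{x}^T M \mathbf{x} = \mathbf{x}^T P^T D P\,\mathbf{x} = (P\mathbf{x})^T D\,(P\mathbf{x}) \equiv \mathbf{y}^T D\,\mathbf{y} = \lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2,$$
where $\mathbf{y} = P\mathbf{x}$. Clearly, $\lambda_1 y_1^2 + \lambda_2 y_2^2 + \cdots + \lambda_n y_n^2 > 0$ for arbitrary (real) $y_i$ only if all $\lambda_i > 0$.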
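The second-derivative test described above can be sketched in a few lines. The function `f` below is a hypothetical example chosen only because its critical point sits at the origin; the names `grad` and `hess` are likewise illustrative and do not come from the notes.

```mathematica
(* Hypothetical function with a critical point at the origin *)
f[x_, y_] := -x^2 - x y - y^2;

(* Gradient and Hessian from D with a list of variables *)
grad = D[f[x, y], {{x, y}}];
hess = D[f[x, y], {{x, y}, 2}];

(* The gradient vanishes at the origin, so it is a critical point *)
grad /. {x -> 0, y -> 0}
(* {0, 0} *)

(* Both eigenvalues are negative: H is negative definite, so a local maximum *)
Eigenvalues[hess]
(* {-3, -1} *)
```

Here the Hessian is constant, so no evaluation point is needed; for a general function it would be evaluated at the critical point before taking eigenvalues.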
Example 2.2 Let $M = \begin{pmatrix} 5 & -2 \\ -2 & 8 \end{pmatrix}$. Is M positive definite?
Computing
    {x, y}.M.{x, y} // Factor
    5 x^2 - 4 x y + 8 y^2
it is not immediately obvious that this expression is positive for arbitrary x and y. However, if we write the result in the form
    9/5 (x - 2 y)^2 + 4/5 (2 x + y)^2 == % // Simplify
    True
it is now obvious, since $(x - 2y)^2$ and $(2x + y)^2$ are non-negative and cannot both vanish for $\mathbf{x} = (x, y) \neq 0$ in $\mathbb{R}^2$. Alternatively, we see that both eigenvalues are positive:
    Λ = Eigenvalues[M]
    {9, 4}
Hence M is positive definite. Alternatively, with
    𝔻 = DiagonalMatrix[Λ]
    {{9, 0}, {0, 4}}
then clearly $\mathbf{y}^T D\,\mathbf{y} = \lambda_1 u^2 + \lambda_2 v^2 > 0$ for all $\mathbf{y} = (u, v) \neq 0$ in $\mathbb{R}^2$.
    {u, v}.𝔻.{u, v}
    9 u^2 + 4 v^2
The orthogonal eigenvectors are
    Transpose[Eigenvectors[M]]
    {{-1, 2}, {2, 1}}
We need to make these orthonormal, which means dividing by their norm, $\sqrt{5}$:
$$P = \begin{pmatrix} -\dfrac{1}{\sqrt{5}} & \dfrac{2}{\sqrt{5}} \\ \dfrac{2}{\sqrt{5}} & \dfrac{1}{\sqrt{5}} \end{pmatrix}$$
    Transpose[P].M.P // Simplify
    {{9, 0}, {0, 4}}
and we confirm that $P^T D P = M$:
    Transpose[P].𝔻.P == M
    True
Computing $(P\mathbf{x})^T D\,(P\mathbf{x})$ we also obtain a result that is positive for all $\mathbf{x} = (x, y) \neq 0$ in $\mathbb{R}^2$:
    (P.{x, y}).𝔻.(P.{x, y})
    4 (2 x/Sqrt[5] + y/Sqrt[5])^2 + 9 (2 y/Sqrt[5] - x/Sqrt[5])^2
    Simplify[%]
    9/5 (x - 2 y)^2 + 4/5 (2 x + y)^2
Example 2.3 Describe the conic $5x^2 - 4xy + 8y^2 = 36$.
Visualizing the conic shows that it is an ellipse:
    ContourPlot[5 x^2 - 4 y x + 8 y^2, {x, -4, 4}, {y, -4, 4}, Contours -> {36}, ContourShading -> False, ContourStyle -> Black]
[Plot: the ellipse $5x^2 - 4xy + 8y^2 = 36$ for x and y between -4 and 4.]
    {x, y}.M.{x, y} == 36 // ExpandAll
    5 x^2 - 4 x y + 8 y^2 == 36
Diagonalizing the matrix, $\mathbf{x}^T M \mathbf{x} = \mathbf{x}^T P^T D P\,\mathbf{x} = (P\mathbf{x})^T D\,(P\mathbf{x}) = \mathbf{y}^T D\,\mathbf{y}$, where $\mathbf{y} = (u, v) = P\mathbf{x}$. Hence the equation becomes
    {u, v}.𝔻.{u, v} == 36 // Simplify
    9 u^2 + 4 v^2 == 36
that is,
$$\frac{u^2}{4} + \frac{v^2}{9} = 1.$$
[Plot: this ellipse drawn in the (u, v) plane for u and v between -4 and 4.]
The effect of the orthogonal matrix P on $\mathbf{x}$ is to rotate the axes (here combined with a reflection, since det P = -1):
    P.{x, y}
    {2 y/Sqrt[5] - x/Sqrt[5], 2 x/Sqrt[5] + y/Sqrt[5]}
$$f(B) - f(A) = \int_A^B df = \int_A^B \nabla f\cdot d\mathbf{l}.$$
Figure 2.3 Path of integration from A to B and back again.
If we then move back from B to A over a different path, the total change in f will be zero and thus
$$\oint \nabla f\cdot d\mathbf{l} = 0,$$
where the symbol $\oint$ indicates an integral over a closed loop. The reverse of this process can be shown to be true. That is, if integrals over closed loops in a vector field $\mathbf{A}$ are always zero, i.e., $\oint \mathbf{A}\cdot d\mathbf{l} = 0$ for any loop, then $\mathbf{A}$ can be derived from a scalar field, called ψ say, by taking its gradient, $\mathbf{A} = \nabla\psi$. This is an important theorem since it is generally much easier to work with scalars than vectors.
Since the force on a charge q in an electric field $\mathbf{E}$ is $q\mathbf{E}$, and so a force $-q\mathbf{E}$ needs to be applied to hold the charge still, the integral $-q\oint \mathbf{E}\cdot d\mathbf{l}$ represents the work needed to move the charge around a loop. In electrostatics this must be zero or else we could obtain energy indefinitely by allowing the charge to move around the loop in the direction that makes the work needed negative. In electrostatics $\oint \mathbf{E}\cdot d\mathbf{l} = 0$ for all loops and therefore, from above, we must be able to derive $\mathbf{E}$ from a scalar, i.e., $\mathbf{E} = \nabla\psi$. In fact by convention we write $\mathbf{E} = -\nabla\phi$, where φ is called the electric potential. The minus sign means that the potential increases as one nears positive charges and makes φ the work done in bringing a unit charge from infinity to a given point.
The reasoning above breaks down in time-varying cases, when it is possible for $\oint \mathbf{E}\cdot d\mathbf{l} \neq 0$ (e.g., think of the coils of a transformer). Thus the above equation applies in electrostatics only. We have used the conservation of energy to argue that $\oint \mathbf{E}\cdot d\mathbf{l} = 0$, and any vector field that satisfies this condition is known as a conservative field. Not all fields satisfy this condition. For example, any field that can be drawn in closed loops cannot have a zero line integral around these loops. The magnetic field around a wire is one example, and in general it is not possible to derive magnetic fields from scalar potentials.
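The distinction between conservative and non-conservative fields can be checked on a simple loop. In the sketch below the potential x² y, the circulating field, the unit-circle path, and the helper name `loop` are all assumptions made for illustration; they are not taken from the notes.

```mathematica
(* Unit circle as the closed loop, parametrised by t from 0 to 2 Pi *)
path = {Cos[t], Sin[t]};
tangent = D[path, t];

(* Conservative field: minus the gradient of a hypothetical potential x^2 y *)
eField = -Grad[x^2 y, {x, y}];

(* A field that circulates in closed loops, like the magnetic field around a wire *)
bField = {-y, x}/(x^2 + y^2);

(* Line integral of a field around the loop *)
loop[field_] := Integrate[
   Simplify[(field /. {x -> Cos[t], y -> Sin[t]}) . tangent], {t, 0, 2 Pi}];

loop[eField]   (* 0 *)
loop[bField]   (* 2 Pi *)
```

The gradient field integrates to zero around the loop, while the circulating field does not, so it cannot be written as the gradient of a single-valued scalar.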
$$\mathbf{E} = -\nabla(-kx) = \begin{pmatrix} k \\ 0 \\ 0 \end{pmatrix}.$$
Thus $\phi = -kx$ is the potential of a uniform field pointing in the $\hat{\mathbf{x}}$ direction (i.e., $\hat{\mathbf{i}}$) with magnitude k. This example can be generalised:
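Example 2.5 What is the electric field equivalent to the potential $\phi = -\mathbf{A}\cdot\mathbf{r}$, where $\mathbf{A}$ is a constant vector and $\mathbf{r}$ is the position vector?
The dot product can be expanded out: $\phi = -A_1 x - A_2 y - A_3 z$. We then have
$$\mathbf{E} = -\nabla\phi = \begin{pmatrix} A_1 \\ A_2 \\ A_3 \end{pmatrix} = \mathbf{A}.$$
Therefore a uniform field, $\mathbf{E}$, has a potential of the form $\phi = -\mathbf{E}\cdot\mathbf{r}$. Now for a trickier case:
Example 2.6 What is the electric field equivalent to the potential $\phi = 1/r$?
This can be answered in two ways:
1. Direct approach
Apply the vector derivative operator to 1/r, remembering that $r^2 = x^2 + y^2 + z^2$. Thus since
$$\frac{\partial}{\partial x}\frac{1}{r} = -\frac{1}{r^2}\frac{\partial r}{\partial x}$$
and
$$\frac{\partial r}{\partial x} = \frac{\partial}{\partial x}\left(x^2 + y^2 + z^2\right)^{1/2} = \frac{1}{2}\left(x^2 + y^2 + z^2\right)^{-1/2}\,2x = \frac{x}{r},$$
we obtain
$$\frac{\partial}{\partial x}\frac{1}{r} = -\frac{x}{r^3},$$
and so, with similar expressions for the other components, we obtain
$$\mathbf{E} = -\nabla\phi = \frac{\mathbf{r}}{r^3} = \frac{\hat{\mathbf{r}}}{r^2} = \frac{\mathbf{e}_r}{r^2}, \qquad (2.3)$$
where $\hat{\mathbf{r}}$ or $\mathbf{e}_r$ are unit vectors pointing in the radial direction. Therefore, as expected, a 1/r potential gives a 1/r² electric field.
2. Intuitive approach
Use Eq. 2.1, $df = \nabla f\cdot d\mathbf{l}$. If $d\mathbf{l}$ is parallel to $\nabla f$, i.e., we step along the direction of the gradient, then this becomes $df = |\nabla f|\,dl$, or $|\nabla f| = df/dl$ along a path parallel to the gradient. For the electric field we can similarly write $E = -d\phi/dl$ for a path parallel to the field. For $\phi = 1/r$ the field must point in the radial direction by symmetry, so we take the derivative moving out in radius, i.e., $\mathbf{E} = -(\partial\phi/\partial r)\,\hat{\mathbf{r}}$ for any spherically symmetric potential φ:
$$\phi = \frac{1}{r} \;\Rightarrow\; \frac{\partial\phi}{\partial r} = -\frac{1}{r^2} \;\Rightarrow\; \mathbf{E} = \frac{\hat{\mathbf{r}}}{r^2}.$$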
This trivially gives the result (Eq. 2.3) obtained more painfully above, and can be applied to any potential that varies with r only.
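As a further illustration of the radial shortcut, consider a hypothetical power-law potential φ = A/rⁿ, where A and n are constants introduced here only for the example. A sketch of the calculation:

```latex
\phi(r) = \frac{A}{r^{n}}
\quad\Longrightarrow\quad
\frac{\partial \phi}{\partial r} = -\frac{nA}{r^{n+1}}
\quad\Longrightarrow\quad
\mathbf{E} = -\frac{\partial \phi}{\partial r}\,\hat{\mathbf{r}}
           = \frac{nA}{r^{n+1}}\,\hat{\mathbf{r}} .
```

Setting A = 1 and n = 1 recovers Eq. 2.3.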
Computation of fields using Mathematica
The Norm function, ‖x - y‖, gives the distance between points x and y. The potential of a (point) charge q positioned at r0, measured at r, is
    φ[r_, r0_, q_: 1] := q/Norm[r - r0]
For example, the potential due to a unit charge at the origin, {0, 0, 0}, measured at the point P = {x, y, z}, is
    P = {x, y, z};
    φ[P, {0, 0, 0}]
    1/Sqrt[Abs[x]^2 + Abs[y]^2 + Abs[z]^2]
We simplify this since {x, y, z} are real:
    φ1[x_, y_, z_] = Simplify[φ[P, {0, 0, 0}], {x, y, z} ∈ Reals]
    1/Sqrt[x^2 + y^2 + z^2]
In Mathematica, after defining ∇,
    ∇f_ := {D[f, x], D[f, y], D[f, z]}
we can compute -∇φ in Cartesian coordinates directly:
    ℰ = Simplify[-∇φ1[x, y, z] /. x^2 + y^2 + z^2 -> r^2, r > 0]
    {x/r^3, y/r^3, z/r^3}
Introducing the unit vector, e_r, in the radial direction,
    er = P/r
    {x/r, y/r, z/r}
the electric field can be written as
    ℰ == P/r^3 == er/r^2
    True
Alternatively, the electric field for any spherically symmetric potential φ can be computed using $\mathbf{E} = -(\partial\phi/\partial r)\,\mathbf{e}_r$:
    ℰ == -D[1/r, r] er
    True
[Plot: field lines and equipotentials of the point charge, for both axes between -4 and 4.]
Note that the field lines and equipotentials are orthogonal (i.e., they intersect at right-angles). Restricting attention to the x-z plane (i.e., y = 0), the (Cartesian) components of the electric field are
    {ℰx, ℰy, ℰz} = -∇φ1[x, 0, z]
    {x/(x^2 + z^2)^(3/2), 0, z/(x^2 + z^2)^(3/2)}
Note that restricting attention to the x-z plane simplifies the computations slightly and is convenient when plotting graphs of the potential and field. However, you should remember that, in general, the potential and field are functions of all 3 (Cartesian) coordinates. There is an important subtlety here: from first year you should already be aware that the density of lines in plots of the electric field is proportional to the strength of the field. However, the density needs to be computed in 3 dimensions (i.e., lines crossing unit area perpendicular to the field) rather than in 2 dimensions (i.e., lines crossing unit length in the plane of the plot). If you do this for a point charge you will find that the density of lines does indeed go like the inverse square of the distance from the charge, i.e., 1/r².
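The inverse-square behaviour can be checked with a short counting argument. Suppose, purely for illustration, that N lines leave a point charge uniformly in all directions; N is a hypothetical drawing choice, not a physical quantity. Comparing the number of lines per unit area of a sphere of radius r with the number per unit length of a circle of radius r in a planar plot gives:

```latex
n_{3\mathrm{D}}(r) = \frac{N}{4\pi r^{2}} \;\propto\; \frac{1}{r^{2}},
\qquad
n_{2\mathrm{D}}(r) = \frac{N}{2\pi r} \;\propto\; \frac{1}{r}.
```

Only the three-dimensional count falls off like the field strength; the spacing of lines in a planar plot falls off more slowly and so overstates the field at large r.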
A powerful alternative visualization is a surface plot with the equipotential lines superimposed onto the surface:
Imagine placing a ball bearing on this surface under the influence of gravity acting in the vertical direction. Qualitatively, the magnitude and direction of the force on the ball-bearing is obvious. By analogy, one can immediately obtain the forces acting on and the resulting motion of a positive test charge in such a potential. Here is a plot of the equipotential surfaces in 3D.
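$$\phi = \frac{\mathbf{p}\cdot\hat{\mathbf{r}}}{4\pi\epsilon_0 r^2} = \frac{\mathbf{p}\cdot\mathbf{r}}{4\pi\epsilon_0 r^3} = \frac{p_1 x + p_2 y + p_3 z}{4\pi\epsilon_0\left(x^2 + y^2 + z^2\right)^{3/2}} = \frac{p\cos\theta}{4\pi\epsilon_0 r^2}, \qquad (2.4)$$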
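where $\mathbf{p}$ is a constant vector and θ is the angle between $\mathbf{p}$ and the radial direction, $\hat{\mathbf{r}}$. What is the electric field of such a potential?
Since the potential is expressed in spherical coordinates r and θ, it is easiest to work out the field in the radial (increasing r) and tangential (increasing θ) directions. We start again from Eq. 2.1, $df = \nabla f\cdot d\mathbf{l}$, or its equivalent here, $d\phi = -\mathbf{E}\cdot d\mathbf{l}$. If $d\mathbf{l}$ is parallel to the radial direction, only the radial component of $\mathbf{E}$, $E_r$, contributes to the dot product and thus
$$E_r = -\left(\frac{\partial\phi}{\partial l}\right)_{\theta\ \mathrm{const}} = -\frac{\partial r}{\partial l}\frac{\partial\phi}{\partial r} = -\frac{\partial\phi}{\partial r},$$
with the partial derivative showing that only r changes. Similarly, if we move tangentially, only the tangential component $E_\theta$ contributes to the dot product and
$$E_\theta = -\left(\frac{\partial\phi}{\partial l}\right)_{r\ \mathrm{const}} = -\frac{\partial\theta}{\partial l}\frac{\partial\phi}{\partial\theta} = -\frac{1}{r}\frac{\partial\phi}{\partial\theta}.$$
Note here that, for r constant, if we move from $(r, \theta)$ to $(r, \theta + d\theta)$ we have moved by $dl = r\,d\theta$, so $\partial\theta/\partial l = 1/r$, which is why a 1/r term appears (as it must to give the correct dimensions). For θ constant, if we move from $(r, \theta)$ to $(r + dr, \theta)$ we have moved by $dl = dr$, so $\partial r/\partial l = 1$.
Applying $E_r = -\partial\phi/\partial r$ and $E_\theta = -(1/r)\,\partial\phi/\partial\theta$ to Eq. 2.4 gives
$$E_r = \frac{2p\cos\theta}{4\pi\epsilon_0 r^3}, \qquad E_\theta = \frac{p\sin\theta}{4\pi\epsilon_0 r^3}.$$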
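The radial and tangential components can be cross-checked with the built-in spherical gradient. The sketch below assumes units in which 1/(4πε₀) = 1 and uses the illustrative names `dipolePot`, `theta` and `az` (for the azimuth), none of which come from the notes; `Grad` with a coordinate-system argument requires Mathematica 9 or later.

```mathematica
(* Pure dipole potential, assuming units with 1/(4 Pi eps0) = 1 *)
dipolePot = p Cos[theta]/r^2;

(* Grad in the "Spherical" chart returns the {r, theta, azimuth} components *)
field = Simplify[-Grad[dipolePot, {r, theta, az}, "Spherical"]]
(* {2 p Cos[theta]/r^3, p Sin[theta]/r^3, 0} *)
```

The third component vanishes because the potential does not depend on the azimuthal angle.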
Figure 2.4 The field pattern (purple) and equipotential lines (black) for a potential of the form $p\cos\theta/r^2$.
Figure 2.5 The equipotential surfaces for a potential of the form $p\cos\theta/r^2$.
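Total derivative versus partial derivative
Recall the definition of the total derivative. For f a function of r and θ we find that
$$\frac{d f(r, \theta)}{dl} = \frac{d\theta}{dl}\,f^{(0,1)}(r, \theta) + \frac{dr}{dl}\,f^{(1,0)}(r, \theta),$$
where $f^{(0,1)}(r, \theta)$ denotes the partial derivative of f with respect to its second argument, i.e., θ,
$$\frac{\partial f(r, \theta)}{\partial\theta} = f^{(0,1)}(r, \theta),$$
and similarly for $f^{(1,0)}(r, \theta)$. Computing the total derivative for constant θ we obtain:
    SetAttributes[θ, Constant];
$$\frac{d f(r, \theta)}{dl} = \frac{dr}{dl}\,f^{(1,0)}(r, \theta),$$
and similarly for constant r. A useful observation: with f = g(x, y, t),
$$\frac{df}{dt} = g^{(0,0,1)}(x, y, t) + \frac{dy}{dt}\,g^{(0,1,0)}(x, y, t) + \frac{dx}{dt}\,g^{(1,0,0)}(x, y, t),$$
and Mathematica confirms (returning True) that this is the same as
$$\frac{df}{dt} = \frac{\partial f}{\partial t} + \left(\frac{dx}{dt}, \frac{dy}{dt}\right)\cdot\left(\frac{\partial f}{\partial x}, \frac{\partial f}{\partial y}\right).$$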
2.2.7 Dipole
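The total potential, φ2(x, y, z), of a pair of equal and opposite charges, +1 positioned at {0, 0, 1} and -1 positioned at {0, 0, -1}, is
    φ2[x_, y_, z_] = Simplify[φ[P, {0, 0, -1}, -1] + φ[P, {0, 0, 1}, 1], {x, y, z} ∈ Reals]
$$\frac{1}{\sqrt{x^2 + y^2 + (z-1)^2}} - \frac{1}{\sqrt{x^2 + y^2 + (z+1)^2}}$$
with corresponding electric field
    {ℰx, ℰy, ℰz} = -∇φ2[x, y, z]
$$E_x = \frac{x}{\left(x^2 + y^2 + (z-1)^2\right)^{3/2}} - \frac{x}{\left(x^2 + y^2 + (z+1)^2\right)^{3/2}},$$
$$E_y = \frac{y}{\left(x^2 + y^2 + (z-1)^2\right)^{3/2}} - \frac{y}{\left(x^2 + y^2 + (z+1)^2\right)^{3/2}},$$
$$E_z = \frac{z-1}{\left(x^2 + y^2 + (z-1)^2\right)^{3/2}} - \frac{z+1}{\left(x^2 + y^2 + (z+1)^2\right)^{3/2}}.$$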
Below we plot the charges, equipotential lines, and field lines together:
Compare this figure with the corresponding one for a (pure) dipole. Here is a surface plot of f2 Hx, y, zL for y = 0 with the equipotential lines superimposed onto the surface,
See http://demonstrations.wolfram.com/ElectricDipolePotential/.
2.2.8 Quadrupole
It is not hard to extend such computations to arbitrary collections of charges. A combination that is particularly important in the study of nuclear physics, magnets used in particle accelerators, and gravitational waves (see http://en.wikipedia.org/wiki/Gravitational_wave), is the quadrupole which, as its name suggests, consists of 4 poles. Consider the following arrangement of charges: +1 at {1, 0, 1} and {-1, 0, -1}, and -1 at {1, 0, -1} and {-1, 0, 1}:
    φ4[x_, y_, z_] = Simplify[φ[P, {-1, 0, -1}, 1] + φ[P, {-1, 0, 1}, -1] + φ[P, {1, 0, -1}, -1] + φ[P, {1, 0, 1}, 1], {x, y, z} ∈ Reals]
$$\phi_4 = \frac{1}{\sqrt{(x+1)^2 + y^2 + (z+1)^2}} + \frac{1}{\sqrt{(x-1)^2 + y^2 + (z-1)^2}} - \frac{1}{\sqrt{(x+1)^2 + y^2 + (z-1)^2}} - \frac{1}{\sqrt{(x-1)^2 + y^2 + (z+1)^2}}$$
The corresponding electric field is
    {ℰx, ℰy, ℰz} = -∇φ4[x, y, z]
Writing $R_{ab} = \sqrt{(x-a)^2 + y^2 + (z-b)^2}$ for the distance from the charge at {a, 0, b}, the components are
$$E_x = \frac{x-1}{R_{1,1}^3} + \frac{x+1}{R_{-1,-1}^3} - \frac{x-1}{R_{1,-1}^3} - \frac{x+1}{R_{-1,1}^3},$$
$$E_y = \frac{y}{R_{1,1}^3} + \frac{y}{R_{-1,-1}^3} - \frac{y}{R_{1,-1}^3} - \frac{y}{R_{-1,1}^3},$$
$$E_z = \frac{z-1}{R_{1,1}^3} + \frac{z+1}{R_{-1,-1}^3} - \frac{z+1}{R_{1,-1}^3} - \frac{z-1}{R_{-1,1}^3}.$$
Here is a surface plot with the equipotential lines superimposed onto the surface:
Example 2.7 What can you say about the stability of a positive test charge positioned at the origin, {0, 0, 0}, for the quadrupole potential?
First we need to define stability: something is stable if, after an arbitrary (small) perturbation, the resulting forces acting on it tend to return it to its original position. From the above diagram it is clear that, after a small displacement in the north-east (45°) or south-west (225°) directions, the resulting force would tend to return the test charge to its original position. However, after a small displacement in the north-west (135°) or south-east (315°) directions, the resulting force on the test charge is away from the origin and towards one of the negative charges. Hence a test charge positioned at the origin is not stable.
Note that the potential and its first (partial) derivatives (i.e., its electric field) at {0, 0, 0} are both zero:
    φ4[0, 0, 0]
    0
    {ℰx, ℰy, ℰz} /. {x -> 0, y -> 0, z -> 0}
    {0, 0, 0}
Using calculus we then know that {0, 0, 0} is a critical point. In single-variable calculus, if a function has zero derivative then one test to decide whether it is a maximum or a minimum is to compute its second derivative. In higher dimensions there are other (topological) possibilities including saddle-points. The generalization of the single-variable test is to compute the eigenvalues of the matrix of second derivatives (i.e., the Hessian):
$$\left.\begin{pmatrix} \dfrac{\partial^2\phi_4}{\partial x^2} & \dfrac{\partial^2\phi_4}{\partial x\,\partial z} \\ \dfrac{\partial^2\phi_4}{\partial z\,\partial x} & \dfrac{\partial^2\phi_4}{\partial z^2} \end{pmatrix}\right|_{x=0,\,y=0,\,z=0} = \begin{pmatrix} 0 & \dfrac{3}{\sqrt{2}} \\ \dfrac{3}{\sqrt{2}} & 0 \end{pmatrix}$$
If all the eigenvalues are negative (positive) then we have a maximum (minimum). If the sign of the eigenvalues is mixed then we have a saddle-point:
    Eigenvalues[%]
    {-3/Sqrt[2], 3/Sqrt[2]}
That we have a saddle-point should be obvious from the surface plot. See http://demonstrations.wolfram.com/MultipoleFields/.
This problem is dealt with by E Durand in Electrostatique, Tome 1, Distributions (Masson, Paris 1964).
1. Write down the total potential for this configuration of charges;
With P = {x, y, z}, summing the point-charge potential φ over the three charge positions and simplifying for real x, y, z gives
$$\phi_3(x, y, z) = \frac{1}{\sqrt{x^2 - x + y^2 + z^2 + \frac{z}{\sqrt{3}} + \frac{1}{3}}} + \frac{1}{\sqrt{x^2 + x + y^2 + z^2 + \frac{z}{\sqrt{3}} + \frac{1}{3}}} + \frac{1}{\sqrt{x^2 + y^2 + \left(\frac{1}{\sqrt{3}} - z\right)^2}}$$
Here is a plot of the potential in the x-z plane.
    Plot3D[φ3[x, 0, z], {x, -1, 1}, {z, -1, 1}, MeshFunctions -> {Function[{x, y, z}, z]}, ClippingStyle -> None, PlotRange -> {0, 8}]
2. Write down the electric field for this configuration of charges;
    {ℰx, ℰy, ℰz} = -∇φ3[x, y, z]
Writing $D_1 = x^2 - x + y^2 + z^2 + \frac{z}{\sqrt{3}} + \frac{1}{3}$, $D_2 = x^2 + x + y^2 + z^2 + \frac{z}{\sqrt{3}} + \frac{1}{3}$ and $D_3 = x^2 + y^2 + \left(\frac{1}{\sqrt{3}} - z\right)^2$ for the three squared distances, the components are
$$E_x = \frac{2x - 1}{2 D_1^{3/2}} + \frac{2x + 1}{2 D_2^{3/2}} + \frac{x}{D_3^{3/2}},$$
$$E_y = \frac{y}{D_1^{3/2}} + \frac{y}{D_2^{3/2}} + \frac{y}{D_3^{3/2}},$$
$$E_z = \frac{2z + \frac{1}{\sqrt{3}}}{2 D_1^{3/2}} + \frac{2z + \frac{1}{\sqrt{3}}}{2 D_2^{3/2}} - \frac{\frac{1}{\sqrt{3}} - z}{D_3^{3/2}}.$$
3. Solve the equations {ℰx, ℰy, ℰz} == {0, 0, 0} numerically.
Here is the solution found using Newton's method, starting from z = -0.1.
    numroot = FindRoot[{ℰx, ℰy, ℰz} == {0, 0, 0}, {{x, 0}, {y, 0}, {z, -0.1}}]
    {x -> 0., y -> 0., z -> -0.164382}
Here is the same solution found exactly.
    Solve[D[φ3[0, 0, z], z] == 0, z]
    {{z -> 0}, {z -> Root[243 #1^10 - 1863 #1^8 - 459 #1^6 - 171 #1^4 + 42 #1^2 - 1 &, 3]}}
The numerical value of this root is
    N[%]
    {{z -> 0.}, {z -> -0.164382}}
4. Show that the electric field vanishes at 4 points.
The origin is a trivial solution:
    origin = {0, 0, 0}
    {0, 0, 0}
We have found a second solution,
    pt = {0, 0, z} /. numroot
    {0, 0, -0.164382}
and two more solutions can be found by rotation about the y-axis:
    RotationTransform[120 °, {0, 1, 0}]
This returns a TransformationFunction whose rotation block is {{-1/2, 0, Sqrt[3]/2}, {0, 1, 0}, {-Sqrt[3]/2, 0, -1/2}}. Applying it to pt once and twice, and writing the four points as replacement rules for P, gives
    {{x -> 0, y -> 0, z -> 0}, {x -> 0, y -> 0, z -> -0.164382}, {x -> -0.142359, y -> 0., z -> 0.0821911}, {x -> 0.142359, y -> 0., z -> 0.0821911}}
As a check, substituting these four points into {ℰx, ℰy, ℰz} and applying Chop gives {0, 0, 0} in each case.
5. Visualise the 4 points on an equipotential plot.
Restricting attention to the x-z plane, here is a plot of the critical points and equipotential contours:
From this plot it looks like {0, 0, 0} is a minimum and the other three critical points are saddle-points. Here is a plot of the electric field lines and equipotentials in the x-z plane:
Note that the apparent convergence of flux at the centre is illusory. The flux flow towards the centre diverts out of the plane of the source charges.
6. What can you say about the stability of a test charge positioned at each of the above 4 points?
We need to compute the Hessian matrix. In 2 dimensions this reads
$$H = \begin{pmatrix} \dfrac{\partial^2\phi_3}{\partial x^2} & \dfrac{\partial^2\phi_3}{\partial x\,\partial z} \\ \dfrac{\partial^2\phi_3}{\partial z\,\partial x} & \dfrac{\partial^2\phi_3}{\partial z^2} \end{pmatrix}$$
Evaluating the eigenvalues of the Hessian at the first critical point, {0, 0, 0}, it looks like {0, 0, 0} is a minimum because both eigenvalues are positive. However, the potential is a function of all three coordinates so we really need to compute the matrix
$$H = \begin{pmatrix} \dfrac{\partial^2\phi_3}{\partial x^2} & \dfrac{\partial^2\phi_3}{\partial x\,\partial y} & \dfrac{\partial^2\phi_3}{\partial x\,\partial z} \\ \dfrac{\partial^2\phi_3}{\partial y\,\partial x} & \dfrac{\partial^2\phi_3}{\partial y^2} & \dfrac{\partial^2\phi_3}{\partial y\,\partial z} \\ \dfrac{\partial^2\phi_3}{\partial z\,\partial x} & \dfrac{\partial^2\phi_3}{\partial z\,\partial y} & \dfrac{\partial^2\phi_3}{\partial z^2} \end{pmatrix}$$
That the first critical point (which looked like a minimum in two dimensions) is a saddle-point should be obvious from the physical situation: imagine placing a positive test charge at the origin. The force on the test charge after a small displacement out of the plane of the three fixed charges is away from the origin.
    H /. {x -> 0, y -> 0, z -> 0} // Eigenvalues
    {-9 Sqrt[3], 9 Sqrt[3]/2, 9 Sqrt[3]/2}
At each of the other three critical points the eigenvalues of H are
    {24.2438, -17.0743, -7.16948}
The last 3 critical points are themselves vertices of an equilateral triangle. By symmetry, we should not be surprised that the eigenvalues of H evaluated at these critical points are equal.
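One way to see why none of these critical points can be a true minimum is that the trace of the Hessian of a point-charge potential (its Laplacian) vanishes away from the charge, so the eigenvalues must sum to zero and cannot all share a sign. The sketch below checks this for a unit charge at a hypothetical position {a, b, c}, in units where 1/(4πε₀) = 1; the names `pot` and `hessian` are illustrative only.

```mathematica
(* Potential of a unit point charge at a hypothetical position {a, b, c} *)
pot = 1/Sqrt[(x - a)^2 + (y - b)^2 + (z - c)^2];

(* The trace of the Hessian is the Laplacian; it vanishes away from the charge *)
hessian = D[pot, {{x, y, z}, 2}];
Simplify[Tr[hessian]]
(* 0 *)

(* The eigenvalues quoted above sum to zero, up to rounding of the printed digits *)
Chop[Total[{24.2438, -17.0743, -7.16948}], 10^-4]
(* 0 *)
```

The exact eigenvalues at the origin, -9√3, 9√3/2 and 9√3/2, sum to zero in the same way.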
where Q is the total charge. However, if Q is zero, what is the leading term of the potential for large r? To answer this question, again consider a physical dipole. Here the total charge is Q = q - q = 0. At the point P = {x, y, z} the potential is
    φ2[x_, y_, z_, q_: 1, s_: 1] = Simplify[φ[P, {0, 0, -s/2}, -q] + φ[P, {0, 0, s/2}, q], {x, y, z, s} ∈ Reals]
$$q\left(\frac{1}{\sqrt{\left(z - \frac{s}{2}\right)^2 + x^2 + y^2}} - \frac{1}{\sqrt{\left(\frac{s}{2} + z\right)^2 + x^2 + y^2}}\right)$$
In spherical polar coordinates r, θ, ϕ (see http://demonstrations.wolfram.com/SphericalCoordinates, but note that this demonstration has θ and ϕ interchanged), the potential reads
    φ2[r Cos[ϕ] Sin[θ], r Sin[ϕ] Sin[θ], r Cos[θ], q, s] // FullSimplify
$$2q\left(\frac{1}{\sqrt{4r^2 - 4rs\cos\theta + s^2}} - \frac{1}{\sqrt{4r^2 + 4rs\cos\theta + s^2}}\right)$$
For r ≫ s we expand φ2 into a Taylor series in s:
    Simplify[% + O[s]^6, r > 0]
$$\frac{q s\cos\theta}{r^2} + \frac{q s^3\cos\theta\left(5\cos^2\theta - 3\right)}{8 r^4} + \frac{q s^5\cos\theta\left(63\cos^4\theta - 70\cos^2\theta + 15\right)}{128 r^6} + O(s^6)$$
The leading term is
$$\phi = \frac{q s\cos\theta}{4\pi\epsilon_0 r^2} = \frac{\mathbf{p}\cdot\hat{\mathbf{r}}}{4\pi\epsilon_0 r^2},$$
where the dipole moment is p = q s. This corresponds to a pure dipole potential. Evidently the potential of a dipole goes like 1/r² for large r.
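Figure 2.6 Dipole moment p of a water molecule.
The potential of an arbitrary charge distribution confined to a volume V is
$$\phi(P) = \frac{1}{4\pi\epsilon_0}\int_V \frac{\rho(\mathbf{s})}{R(\mathbf{s})}\,d\mathbf{s}, \qquad (2.5)$$
where R is the distance from $\mathbf{s}$ to P. With respect to a fixed origin, we can obtain a systematic expansion for φ(P) in terms of inverse powers of r. The diagram below defines the variables. Without loss of generality, we have aligned P with the z-axis:
[Diagram: the field point P at distance r from the origin along the z-axis, the volume element ds at position s making an angle θ with the z-axis, and the distance R from ds to P.]
In spherical polar coordinates, we write
    S = {s Cos[ϕ] Sin[θ], s Sin[ϕ] Sin[θ], s Cos[θ]}; P = {0, 0, r};
and find that
    R = ComplexExpand[Norm[P - S], TargetFunctions -> {Re, Im}] // Simplify
    Sqrt[r^2 - 2 r s Cos[θ] + s^2]
Since r, R, and s are the sides of a triangle, using the cosine rule it is clear that R is independent of the azimuthal angle ϕ. For r > s we expand 1/R in powers of s:
$$\frac{1}{R} = \frac{1}{r} + \frac{s\cos\theta}{r^2} + \frac{s^2\left(3\cos^2\theta - 1\right)}{2 r^3} + \frac{s^3\cos\theta\left(5\cos^2\theta - 3\right)}{2 r^4} + \cdots$$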
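It turns out that the trigonometric terms are Legendre polynomials, $P_n(\cos\theta)$:
    Table[LegendreP[n, Cos[θ]], {n, 0, 3}] // Factor
$$ \left\{1,\ \cos\theta,\ \tfrac{1}{2}\left(3\cos^2\theta-1\right),\ \tfrac{1}{2}\cos\theta\left(5\cos^2\theta-3\right)\right\} $$
which are generated by the generating function:
$$ \frac{1}{\sqrt{1-2xt+t^2}} = \sum_{n=0}^{\infty} t^n P_n(x), \qquad |t| < 1. $$
Hence we can write
$$ \frac{1}{R(\mathbf{s})} = \frac{1}{r}\sum_{n=0}^{\infty}\left(\frac{s}{r}\right)^n P_n(\cos\theta), \qquad r > s. \tag{2.6} $$
Substituting Eq. 2.6 into Eq. 2.5 gives
$$ \phi(P) = \frac{1}{4\pi\varepsilon_0}\sum_{n=0}^{\infty}\frac{1}{r^{n+1}}\int_V s^n P_n(\cos\theta)\,\rho(\mathbf{s})\,d^3s = \frac{1}{4\pi\varepsilon_0}\left[\frac{1}{r}\int_V \rho(\mathbf{s})\,d^3s + \frac{1}{r^2}\int_V s\cos\theta\,\rho(\mathbf{s})\,d^3s + \frac{1}{r^3}\int_V s^2\,\tfrac{1}{2}\left(3\cos^2\theta-1\right)\rho(\mathbf{s})\,d^3s + \cdots\right] \tag{2.7} $$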
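The Legendre series for $1/R$ converges quickly when $s < r$, which can be seen with a few lines of Python. This is a sketch under stated assumptions rather than part of the notes: the helper names are invented for illustration, and the Legendre polynomials are built with Bonnet's recurrence instead of Mathematica's LegendreP.

```python
import math

def legendre(n, x):
    """P_n(x) from Bonnet's recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    if n == 0:
        return 1.0
    p_prev, p = 1.0, x
    for k in range(1, n):
        p_prev, p = p, ((2 * k + 1) * x * p - k * p_prev) / (k + 1)
    return p

def inverse_distance_series(r, s, theta, n_max):
    """Truncated series (1/r) * sum_n (s/r)^n P_n(cos theta), valid for r > s."""
    x = math.cos(theta)
    return sum((s / r) ** n * legendre(n, x) for n in range(n_max + 1)) / r

def inverse_distance_exact(r, s, theta):
    """1/R from the cosine rule."""
    return 1.0 / math.sqrt(r * r - 2 * r * s * math.cos(theta) + s * s)

# Truncation error for s/r = 1/2 falls roughly like (1/2)^(n_max + 1).
for n_max in (0, 1, 2, 5, 10):
    err = inverse_distance_series(2.0, 1.0, 1.0, n_max) - inverse_distance_exact(2.0, 1.0, 1.0)
    print(n_max, err)
```

Keeping only n = 0, 1, 2 corresponds to the monopole, dipole and quadrupole terms of the expansion.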
This is the desired result: the multipole expansion of $\phi$ in powers of $1/r$. The first term ($\sim 1/r$) is the monopole term, the second ($\sim 1/r^2$) is the dipole term, the third ($\sim 1/r^3$) is the quadrupole term, and so on. Although (2.7) is exact, it is more useful as an approximation scheme. The leading term in the expansion provides the approximate potential at large distances from the charge distribution. This expansion is not restricted to computing the potential due to a charge distribution: it arises in many fields including atomic and molecular physics (both for bound states of atoms and molecules and in scattering theory), nuclear physics, and gravitational computations. It is usually easiest to compute (2.7) in spherical polar coordinates. To change coordinates you need to compute the Jacobian determinant,
$$ dx\,dy\,dz = \begin{vmatrix} \dfrac{\partial x}{\partial r} & \dfrac{\partial x}{\partial\theta} & \dfrac{\partial x}{\partial\phi} \\ \dfrac{\partial y}{\partial r} & \dfrac{\partial y}{\partial\theta} & \dfrac{\partial y}{\partial\phi} \\ \dfrac{\partial z}{\partial r} & \dfrac{\partial z}{\partial\theta} & \dfrac{\partial z}{\partial\phi} \end{vmatrix}\,dr\,d\theta\,d\phi = r^2\sin\theta\,dr\,d\theta\,d\phi. \tag{2.8} $$
In spherical polar coordinates,
    spc = r {Sin[θ] Cos[φ], Sin[θ] Sin[φ], Cos[θ]};
the Jacobian matrix of $(x, y, z)$ with respect to $(r, \theta, \phi)$ reads
$$ \frac{\partial(x,y,z)}{\partial(r,\theta,\phi)} = \begin{pmatrix} \cos\phi\sin\theta & r\cos\theta\cos\phi & -r\sin\theta\sin\phi \\ \sin\theta\sin\phi & r\cos\theta\sin\phi & r\cos\phi\sin\theta \\ \cos\theta & -r\sin\theta & 0 \end{pmatrix}. $$
Alternatively,
    D[spc, {{r, θ, φ}}]
or, in the equivalent subscript notation, $\partial_{\{r,\theta,\phi\}}\,\mathrm{spc}$, returns the same matrix, and the determinant simplifies to
    Det[%] // Simplify
$$ r^2\sin\theta $$
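As an independent check of Eq. 2.8, the Jacobian can be built by finite differences and its determinant compared with $r^2\sin\theta$. The Python below is an illustrative sketch; the function names and the step size h are assumptions, not part of the notebooks.

```python
import math

def spherical_to_cartesian(r, theta, phi):
    """(x, y, z) for spherical polar coordinates (r, theta, phi)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))

def jacobian(r, theta, phi, h=1e-6):
    """3x3 matrix d(x,y,z)/d(r,theta,phi) estimated with central differences."""
    q = [r, theta, phi]
    columns = []
    for j in range(3):
        up, down = list(q), list(q)
        up[j] += h
        down[j] -= h
        f_up = spherical_to_cartesian(*up)
        f_down = spherical_to_cartesian(*down)
        columns.append([(a - b) / (2 * h) for a, b in zip(f_up, f_down)])
    # columns[j][i] holds d x_i / d q_j; transpose so rows are x, y, z.
    return [[columns[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

r, theta, phi = 2.0, 0.8, 1.3
print(det3(jacobian(r, theta, phi)), r**2 * math.sin(theta))
```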
2.3 Summary
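Eq. 2.1, $df = \nabla f\cdot d\mathbf{l}$, shows how much a function $f(\mathbf{r})$ changes in moving from $\mathbf{r}$ to $\mathbf{r} + d\mathbf{l}$. If a vector field satisfies
$$ \oint_C \mathbf{A}\cdot d\mathbf{l} = 0 $$
for any circuit C, it is said to be conservative and we can write $\mathbf{A} = \nabla\psi$, where $\psi$ is some scalar function. In electrostatics the electric field must be conservative and by convention with $\psi = -\phi$ we write $\mathbf{E} = -\nabla\phi$. Expanding potentials into Taylor series, e.g.,
$$ \phi(x+h, y+k) = \phi(x,y) + \begin{pmatrix} h & k \end{pmatrix}\begin{pmatrix} \dfrac{\partial\phi}{\partial x} \\ \dfrac{\partial\phi}{\partial y} \end{pmatrix} + \frac{1}{2}\begin{pmatrix} h & k \end{pmatrix}\begin{pmatrix} \dfrac{\partial^2\phi}{\partial x\,\partial x} & \dfrac{\partial^2\phi}{\partial x\,\partial y} \\ \dfrac{\partial^2\phi}{\partial y\,\partial x} & \dfrac{\partial^2\phi}{\partial y\,\partial y} \end{pmatrix}\begin{pmatrix} h \\ k \end{pmatrix} + \cdots, $$
is useful when determining stability and for finding the leading long-range behaviour of a potential.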
Chapter 3
Gauss' Law, Gauss' Theorem, and Divergence
3.1 Introduction
In this chapter we look at Gauss' Law in a new way. The standard form of Gauss' Law involves integrated quantities, e.g., the "flux emergent" from a region is the flux per unit area integrated over the surface. Although this form is very useful in problems with a high degree of symmetry, in most other cases it only provides a constraint, without being of much use in finding a functional form for the electric field. In this chapter a local form of Gauss' Law is derived that applies at every point. In proceeding towards the local version of Gauss' Law, a new quantity measuring the production of flux per unit volume is introduced. This scalar quantity is called the divergence and can be derived from the field using $\nabla$, the vector derivative operator of Chapter 2.
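Figure 3.1: A surface S encloses a charge q. The electric flux coming out of an element of the surface $d\mathbf{S} = \hat{\mathbf{n}}\,dS$ ($d\mathbf{S}$ has magnitude $dS$ directed along the normal $\hat{\mathbf{n}}$) is $\mathbf{E}\cdot d\mathbf{S} = E\,dS\cos\theta$ where $\theta$ is the angle between the electric field $\mathbf{E}$ and $\hat{\mathbf{n}}$.
The total flux emergent from the surface S is then given by
$$ \oint_S \mathbf{E}\cdot d\mathbf{S}, $$
where the circle through the integral sign indicates an integral over a closed surface. In SI units, Coulomb's Law is
$$ \mathbf{E} = \frac{1}{4\pi\varepsilon_0}\frac{q}{r^2}\,\hat{\mathbf{r}}, $$
where $\hat{\mathbf{r}}$ is a unit vector in the radial direction. Therefore the total flux emergent from the surface is
$$ \frac{q}{4\pi\varepsilon_0}\oint_S \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2}. $$
The integrand equals the projected area of the element as seen from the point charge (i.e., $\hat{\mathbf{r}}\cdot d\mathbf{S}$) divided by its distance squared. This is the definition of the solid angle subtended by the element, $d\Omega$. The SI unit of solid angle is the steradian. See http://demonstrations.wolfram.com/SolidAnglesOnASphere/. In spherical polar coordinates (see Eq. 2.8):
$$ dV = r^2\sin\theta\,dr\,d\theta\,d\phi = r^2\,dr\,d\Omega, \qquad d\Omega = \frac{\hat{\mathbf{r}}\cdot d\mathbf{S}}{r^2} = \sin\theta\,d\theta\,d\phi. $$
The total flux is therefore
$$ \frac{q}{4\pi\varepsilon_0}\oint_S d\Omega = \frac{q}{\varepsilon_0}, \qquad\text{since}\qquad \oint_S d\Omega = \int_0^{2\pi}\!\!\int_0^{\pi}\sin\theta\,d\theta\,d\phi = 4\pi \text{ steradians.} $$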
Example 3.1: Compute the volume and surface area of a sphere using spherical polar coordinates.
Since the total electric field of two charges is the superposition of the electric field of each, added vectorially, the result can be extended to many charges, and we find that the electric flux emergent from a closed surface is equal to the charge enclosed by the surface divided by $\varepsilon_0$. This is Gauss' Law,
$$ \oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{q_{\text{enclosed}}}{\varepsilon_0}, \tag{3.1} $$
which depends upon the $1/r^2$ nature of Coulomb's Law. Note that gravitational forces obey an equivalent law:
$$ \oint_S \mathbf{g}\cdot d\mathbf{S} = -(4\pi G)\,m_{\text{enclosed}}, $$
where the minus sign arises because gravity is attractive.
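Gauss' Law holds for any closed surface, not just a sphere centred on the charge. The Python sketch below illustrates this by integrating the flux of a point charge through the faces of a cube with a simple midpoint rule. It is an illustration only: the function name, the grid size n and the choice of units ($q/(4\pi\varepsilon_0) = 1$, so the expected flux for an enclosed charge is $4\pi$) are assumptions made here.

```python
import math

def flux_through_cube(charge_pos, half_side=1.0, n=100):
    """Flux of E = d/|d|^3 (d = displacement from the charge) out of the cube
    |x|, |y|, |z| <= half_side, using the midpoint rule with n x n cells per face."""
    h = 2 * half_side / n
    total = 0.0
    for axis in range(3):                 # outward normal is along +/- this axis
        for sign in (-1.0, 1.0):
            for i in range(n):
                for j in range(n):
                    point = [0.0, 0.0, 0.0]
                    point[axis] = sign * half_side
                    point[(axis + 1) % 3] = -half_side + (i + 0.5) * h
                    point[(axis + 2) % 3] = -half_side + (j + 0.5) * h
                    d = [p - c for p, c in zip(point, charge_pos)]
                    dist_cubed = (d[0] ** 2 + d[1] ** 2 + d[2] ** 2) ** 1.5
                    total += sign * d[axis] / dist_cubed * h * h
    return total

inside = flux_through_cube((0.3, -0.2, 0.1))   # off-centre charge inside the cube
outside = flux_through_cube((2.0, 0.0, 0.0))   # charge outside the cube
print(inside, 4 * math.pi, outside)
```

The enclosed charge gives a flux close to $4\pi$ wherever it sits inside the cube, while the external charge gives a flux close to zero.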
Figure 3.2: The field from an infinite plane emerges at right-angles to it. The (gaussian) surface we consider consists of two faces, each of the same arbitrary shape with area A, which lie parallel to the plane, and vertical walls which connect the two faces.
The charge enclosed by this volume is $\sigma A$ and so by Gauss' Law we obtain
$$ 2EA = \frac{\sigma A}{\varepsilon_0}, $$
and therefore the magnitude of the electric field from an infinite plane is given by
$$ E = \frac{\sigma}{2\varepsilon_0}. $$
The field from an infinite plane is equal in magnitude but opposite in direction on the two sides. A more realistic case is the field close to a large charged conductor, where "close" implies that it is effectively a plane. This can be treated in exactly the same way except now the field inside the conductor is zero (if it weren't, current would flow and that would not be electrostatics). Thus all the flux escapes on one side and we get
$$ E = \frac{\sigma}{\varepsilon_0}, $$
for the field close to a charged conductor.
Example 3.2: The electric field beneath a thunder cloud is 1000 V/m. What is the surface charge density of the ground underneath the cloud? Electrostatically, the Earth is a conductor. Thus $E = \sigma/\varepsilon_0$ applies and so $\sigma = 1000\,\varepsilon_0 = 8.9\times10^{-9}$ C m$^{-2}$.
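The arithmetic of Example 3.2 takes one line to reproduce. This small Python sketch assumes the CODATA value of $\varepsilon_0$; the function name is illustrative.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m (CODATA 2018 value)

def conductor_surface_charge_density(e_field):
    """Surface charge density (C/m^2) from E = sigma / eps0 just outside a conductor."""
    return EPSILON_0 * e_field

sigma = conductor_surface_charge_density(1000.0)  # field in V/m
print(f"sigma = {sigma:.2e} C/m^2")  # about 8.85e-09 C/m^2
```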
Figure 3.3: The gaussian surface for a long cylinder is itself a cylinder, but of radius r.
This cylinder is co-axial with the infinite cylinder so that the electric field is uniform over, and perpendicular to, its curved surface. No electric flux passes through the two ends of the gaussian cylinder. The surface over which the flux emerges has area $2\pi rL$, while the amount of charge enclosed is $\lambda L$. Therefore by Gauss' Law we have
$$ 2\pi rLE = \frac{\lambda L}{\varepsilon_0}, $$
and so
$$ E = \frac{\lambda}{2\pi\varepsilon_0 r}. $$
Unlike the case of a plane, getting closer to a real cylinder never makes it appear to be an infinite cylinder: end effects do not become infinitesimal. However, there are situations of great practical importance where the above solution is useful. In particular the above field describes the field pattern inside co-axial cables, even in the time-varying case.
Figure 3.4: The gaussian surface for a sphere is itself a sphere, but of radius r.
The field will come out radially and will therefore be perpendicular to the $4\pi r^2$ area of the gaussian sphere. Thus by Gauss' Law
$$ 4\pi r^2E = \frac{Q}{\varepsilon_0}, \qquad\text{so that}\qquad E = \frac{Q}{4\pi\varepsilon_0 r^2}. $$
This is so familiar that it almost seems "obvious", but try deriving it directly from Coulomb's Law and you will see that it is not. This result also applies for an arbitrary spherically symmetric charge distribution where $Q(r)$ is the charge enclosed in a (gaussian) sphere of radius r:
$$ \mathbf{E}(r) = \frac{Q(r)}{4\pi\varepsilon_0 r^2}\,\hat{\mathbf{r}}. $$
For a general charge distribution with charge density $\rho$, Gauss' Law reads
$$ \oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_V \rho\,dV. \tag{3.2} $$
This is a fundamental equation which you need to remember. Even point charges can be included in this formulation by use of delta functions.
We want a version of Gauss' Law that applies at a point. However, one cannot define a volume enclosed or a surface area for a point, and so we consider instead a finite volume that is shrunk to infinitesimal dimensions. Consider first the charge enclosed,
$$ Q = \int_V \rho\,dV, $$
as V becomes smaller. For a continuous charge distribution, for V sufficiently small, $\rho$ is essentially constant and so, in the limit $V \to 0$, $Q \to \rho V$. We want a finite limit so it makes more sense to divide by V so that we have
$$ \lim_{V\to0}\frac{1}{V}\int_V \rho\,dV = \rho. $$
Consider the following limit for the left-hand side of Eq. 3.2, called the divergence of the electric field (div E for short):
$$ \operatorname{div}\mathbf{E} = \lim_{V\to0}\frac{1}{V}\oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S}. \tag{3.3} $$
This quantity is the amount of electric flux produced per unit volume at a point. It can be defined similarly for any vector field. For instance we will find later that the divergence of the magnetic field is always zero. With this definition Gauss' Law at a point becomes $\operatorname{div}\mathbf{E} = \rho/\varepsilon_0$, which says that the amount of electric flux produced per unit volume is proportional to the charge density at every point. Eq. 3.3 defines the divergence. By considering particular shapes for the volume V, we can obtain expressions for computing the divergence that are suited to particular geometries. Cartesian coordinates are most commonly used, and so let us consider a small cuboid oriented with its sides along the x, y and z axes and centred on the point $(x, y, z)$:
Figure 3.5: A small box with sides parallel to the cartesian axes and drawn to have more electric flux leaving than entering.
Let its sides have lengths $\Delta x$, $\Delta y$, and $\Delta z$. We will now calculate the flux emergent from this cuboid. First consider the amount of flux emerging from the two faces oriented parallel to the y-z plane. Only the x component of the electric field, $E_x$, contributes to the flux through these faces, and in one face it points in while at the other it points out. A subscripted function such as $G_x(x, y, z)$ denotes the x-component of the vector $\mathbf{G}(x, y, z)$. Partial derivatives are denoted using any of the equivalent standard notations $\partial_x G(x, y, z)$ or $\dfrac{\partial G(x,y,z)}{\partial x}$. It is sloppy to denote the partial derivative of a function using a subscript on the function, for how would you interpret an expression like $\sum_{i=1}^{3} G_i$? Taking the difference between the x components evaluated in the centre of each face and multiplying by their area, these faces contribute
$$ \left[E_x\!\left(x+\tfrac{\Delta x}{2}, y, z\right) - E_x\!\left(x-\tfrac{\Delta x}{2}, y, z\right)\right]\Delta y\,\Delta z, \tag{3.4} $$
to the flux emergent from the cuboid. The only reason that there is any net contribution to the flux is that the $E_x$ component may change across the cuboid so that the two faces do not cancel. Thus Fig. 3.5 has been drawn to indicate that more flux leaves than enters the box. As $\Delta x$ becomes small, the expression for $E_x$ can be expanded to first order, e.g.,
$$ E_x\!\left(x+\tfrac{\Delta x}{2}, y, z\right) = E_x(x, y, z) + \frac{1}{2}\frac{\partial E_x}{\partial x}\Delta x. $$
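The partial derivative applies as the change is in x alone. A similar expression with a negative sign applies for the other face and substituting into Eq. 3.4 we obtain a contribution to the emergent flux of
$$ \frac{\partial E_x}{\partial x}\,\Delta x\,\Delta y\,\Delta z. $$
The other four faces give analogous contributions from the y and z components and, recognising the product of lengths $\Delta x\,\Delta y\,\Delta z$ as the volume V, we get a total emergent flux from the cuboid of
$$ \oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \left(\frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z}\right)V. $$
Therefore the limit of the left-hand side of Eq. 3.2, which we called the divergence of E, becomes
$$ \operatorname{div}\mathbf{E} = \lim_{V\to0}\frac{1}{V}\oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z}. $$
Using $\nabla$, the vector derivative operator of Chapter 2, the expression on the right can be written in shorthand form as $\nabla\cdot\mathbf{E}$. We thus arrive at our target, a form of Gauss' Law that applies at a point:
$$ \nabla\cdot\mathbf{E} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = \frac{\rho}{\varepsilon_0}. \tag{3.5} $$
This is the first of Maxwell's equations. Like all equations, it is best remembered not just as a collection of symbols but from the physical meaning of the various terms. Remembering that $\nabla\cdot\mathbf{E}$, the divergence of E, represents the amount of electric flux produced per unit volume, by Gauss' Law it must equal the charge per unit volume, $\rho$, divided by $\varepsilon_0$.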
    VectorPlot3D[v, {x, -1, 1}, {y, -1, 1}, {z, -1, 1}, VectorScale -> {Automatic, [email protected]}]
[Plot: the vector field v over -1 ≤ x, y, z ≤ 1.]
From the geometrical interpretation we expect this field to have zero divergence at any point $\{x, y, z\}$. Computing the divergence in cartesian coordinates we obtain
    D[v[[1]], x] + D[v[[2]], y] + D[v[[3]], z]
    0
Example 3.4: An electric field has the form $E_x = kx$, $E_y = E_z = 0$. What is its divergence and what physical set-up could give such a field?
This is about the simplest possible field other than a constant. We obtain immediately $\nabla\cdot\mathbf{E} = k$:
    ℰ = {k x, 0, 0};
    D[ℰ[[1]], x] + D[ℰ[[2]], y] + D[ℰ[[3]], z]
    k
Here is a plot of the vector field for k = 1:
[Plot: the vector field {x, 0, 0} over -1 ≤ x, y, z ≤ 1.]
The physical interpretation follows from Gauss' Law. The charge density is proportional to $\nabla\cdot\mathbf{E}$ and so this field comes from a uniform charge density and would be the form of field set up inside an infinite slab of uniform charge density perpendicular to the x-axis.
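For a spherically symmetric field $\mathbf{E} = E(r)\,\hat{\mathbf{r}}$, the flux emergent from a thin spherical shell between r and $r + dr$, of volume $V = 4\pi r^2\,dr$, is
$$ 4\pi(r+dr)^2E(r+dr) - 4\pi r^2E(r) = 4\pi\frac{\partial\left(r^2E(r)\right)}{\partial r}\,dr = \frac{1}{r^2}\frac{\partial\left(r^2E(r)\right)}{\partial r}\,V, $$
where the middle term follows from taking small differences of the expression $r^2E(r)$ treated as a single function. We can easily verify this result using Mathematica:
    (4 π (r + dr)^2 E[r + dr] - 4 π r^2 E[r] + O[dr]^2)/(4 π r^2 dr) == 1/r^2 D[r^2 E[r], r] + O[dr]^2 // Simplify
    True
Hence, from the definition of divergence,
$$ \nabla\cdot\mathbf{E} \equiv \lim_{V\to0}\frac{1}{V}\oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S}, $$
we obtain
$$ \nabla\cdot\mathbf{E} = \frac{1}{r^2}\frac{\partial\left(r^2E(r)\right)}{\partial r}. $$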
Proceeding in this manner one can obtain more general expressions for the divergences of fields expressed in spherical polar and other coordinates. It turns out that there are much more direct methods for computing the divergence in any coordinate system; see Appendix A for a summary of vector operators in orthogonal coordinate systems. For example, the general expression for the divergence in spherical polar coordinates is (see Eq. A.14)
$$ \nabla\cdot\mathbf{E} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2E_r\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,E_\theta\right) + \frac{1}{r\sin\theta}\frac{\partial E_\phi}{\partial\phi}, \tag{3.6} $$
where $\mathbf{E} \equiv E_r\,\hat{\mathbf{r}} + E_\theta\,\hat{\boldsymbol{\theta}} + E_\phi\,\hat{\boldsymbol{\phi}}$.
Example 3.5: What is the divergence of the vector function $\mathbf{v} = \mathbf{r} = r\,\hat{\mathbf{r}}$?
First, let us visualise this vector field using cartesian coordinates:
    v = {x, y, z};
[Plot: the vector field {x, y, z} over -1 ≤ x, y, z ≤ 1.]
From the geometrical interpretation we expect this field to have large (positive) divergence at any point $\{x, y, z\}$. Computing the divergence in cartesian coordinates we obtain
    D[v[[1]], x] + D[v[[2]], y] + D[v[[3]], z]
    3
Since $\mathbf{v} \equiv v_r\,\hat{\mathbf{r}} + v_\theta\,\hat{\boldsymbol{\theta}} + v_\phi\,\hat{\boldsymbol{\phi}} = r\,\hat{\mathbf{r}}$, using the divergence in spherical polar coordinates (Eq. 3.6) we find that
    1/r^2 D[r^2 r, r]
    3
which is identical to the result obtained using cartesian coordinates. For this particular vector field, the divergence does not depend on the point at which it is computed.
Example 3.6: What is the divergence of $\mathbf{v} = \dfrac{1}{r^2}\,\hat{\mathbf{r}} = \dfrac{1}{r^3}\,\mathbf{r}$?
With
    v = {x, y, z}/(x^2 + y^2 + z^2)^(3/2);
[Plot: the vector field v.]
From this plot we might expect this field to have non-zero divergence. However, computing the divergence in cartesian coordinates,
    D[v[[1]], x] + D[v[[2]], y] + D[v[[3]], z] // Together
    0
or spherical polar coordinates,
    1/r^2 D[r^2 (1/r^2), r]
    0
we find that the divergence is identically zero! What is going on here? We will return to this example shortly.
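Examples 3.5 and 3.6 can be compared side by side using the radial formula $(1/r^2)\,\partial(r^2 v_r)/\partial r$. The Python sketch below is illustrative only (the function names and step size are assumptions); it also evaluates the flux through spheres of different radii, which gives a hint about where the flux of the inverse-square field originates.

```python
import math

def radial_divergence(v_r, r, h=1e-5):
    """(1/r^2) d(r^2 v_r)/dr by a central difference, for a field v = v_r(r) r_hat, r > 0."""
    g = lambda s: s * s * v_r(s)
    return (g(r + h) - g(r - h)) / (2 * h * r * r)

def flux_through_sphere(v_r, r):
    """Flux of v = v_r(r) r_hat through a sphere of radius r centred on the origin."""
    return 4 * math.pi * r * r * v_r(r)

linear = lambda r: r                 # Example 3.5: v = r r_hat
inverse_square = lambda r: 1 / r**2  # Example 3.6: v = r_hat / r^2

for r in (0.1, 1.0, 10.0):
    print(r, radial_divergence(linear, r), radial_divergence(inverse_square, r),
          flux_through_sphere(inverse_square, r))
```

The inverse-square field has zero divergence at every radius sampled, yet the flux through every sphere is the same non-zero value, so all of the flux must be produced at the origin itself.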
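Substituting $\mathbf{E} = -\nabla\phi$ into Eq. 3.5 gives
$$ \nabla^2\phi = -\frac{\rho}{\varepsilon_0}, \tag{3.7} $$
which is known as Poisson's equation. Here $\nabla^2$ is the Laplacian operator which, written in Cartesian form, is
$$ \nabla^2 = \frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2} + \frac{\partial^2}{\partial z^2}. $$
The Laplacian operator also arises in quantum mechanics. There are other forms for $\nabla^2$ in different coordinate systems. E.g., the Laplacian operator in spherical polar coordinates reads (Eq. A.15)
$$ \nabla^2 = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\frac{\partial}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial\phi^2}. \tag{3.8} $$
Poisson's equation (Eq. 3.7) can be used to find the charge distribution given a form for the potential.
Example 3.7: What charge distribution is needed to give a potential of the form $\phi = kr^2$ where r is the distance from a point?
Apply the $\nabla^2$ operator to $r^2 = x^2 + y^2 + z^2$. Thus
$$ \frac{\partial}{\partial x}r^2 = 2x, \qquad \frac{\partial^2}{\partial x^2}r^2 = 2, $$
and similarly for y and z, so $\nabla^2\left(kr^2\right) = 6k$. Alternatively, using the Laplacian operator in spherical polar coordinates, we get the same result:
$$ \nabla^2\left(kr^2\right) = k\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial\left(r^2\right)}{\partial r}\right) = k\,\frac{1}{r^2}\frac{\partial}{\partial r}\left(2r^3\right) = 6k. $$
Therefore Eq. 3.7 gives $\rho = -6k\varepsilon_0$. Thus a constant charge density gives a potential proportional to $r^2$, and this is the form of potential inside a uniformly charged sphere, for example.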
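The result of Example 3.7 can be cross-checked with a finite-difference Laplacian. The following Python sketch is an illustration under stated assumptions: the seven-point stencil, the step size h and the value of k are choices made here, not taken from the notes.

```python
EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m

def laplacian(f, x, y, z, h=1e-3):
    """Seven-point finite-difference estimate of the Laplacian of f at (x, y, z)."""
    return (f(x + h, y, z) + f(x - h, y, z)
            + f(x, y + h, z) + f(x, y - h, z)
            + f(x, y, z + h) + f(x, y, z - h)
            - 6 * f(x, y, z)) / h**2

k = 2.5  # illustrative constant in phi = k r^2

def potential(x, y, z):
    return k * (x * x + y * y + z * z)

lap = laplacian(potential, 0.3, -0.4, 0.2)
rho = -EPSILON_0 * lap  # Poisson's equation rearranged for the charge density
print(lap, 6 * k, rho)
```

The estimate is independent of the point chosen, in line with the uniform charge density found in the example.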
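In regions where $\rho = 0$, Poisson's equation reduces to
$$ \nabla^2\phi = 0, $$
which is known as Laplace's equation. Solutions of this equation with boundary conditions are important in the design of the focussing fields of TV tubes, for instance.
Example 3.8: Verify that the $1/r$ Coulomb potential satisfies Laplace's equation.
Using brute force by applying the $\nabla^2$ operator in cartesian form, i.e.,
$$ \frac{\partial}{\partial x}\frac{1}{r} = -\frac{x}{r^3}, \qquad \frac{\partial^2}{\partial x^2}\frac{1}{r} = -\frac{1}{r^3} + \frac{3x^2}{r^5}, $$
and similarly for y and z, we find
$$ \nabla^2\frac{1}{r} = -\frac{3}{r^3} + \frac{3\left(x^2+y^2+z^2\right)}{r^5} = 0, $$
since $x^2 + y^2 + z^2 = r^2$. Alternatively, using the Laplacian operator in spherical polar coordinates (Eq. A.15),
$$ \nabla^2\frac{1}{r} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial(1/r)}{\partial r}\right) = \frac{1}{r^2}\frac{\partial}{\partial r}(-1) = 0. $$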
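Since the divergence is the flux produced per unit volume, the volume integral $\int_V \nabla\cdot\mathbf{A}\,dV$ must represent the emergent flux from a volume V for any vector field A. We have seen that the emergent flux can also be written as $\oint_{S=\partial V}\mathbf{A}\cdot d\mathbf{S}$ and so we expect
$$ \oint_{S=\partial V}\mathbf{A}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{A}\,dV. \tag{3.10} $$
This result is known as Gauss' Theorem (sometimes it is called the divergence theorem). It is important to distinguish between Gauss' Theorem, which has only mathematical content and applies to any physical vector field, and Gauss' Law, which is founded in experiment and is just another way of expressing Coulomb's Law. The importance of Gauss' Theorem is that it provides a way to transform between the surface and volume integrals frequently encountered in physics. Thus if we go back to the integral version of Gauss' Law (Eq. 3.1),
$$ \oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_V \rho\,dV, $$
and apply Gauss' Theorem to the left-hand side, we find
$$ \oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{E}\,dV = \frac{1}{\varepsilon_0}\int_V \rho\,dV. $$
Since this holds for any volume V, the integrands must be equal, $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$, which, as before, is Gauss' Law in differential form (but it no longer depends upon the assumption of a volume of particular shape as that is accounted for in the proof of Gauss' Theorem).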
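For a point charge Q at the origin the potential is
$$ \phi(r) = \frac{Q}{4\pi\varepsilon_0 r}. $$
Hence the electric field E is
$$ \mathbf{E}(r) = -\nabla\phi(r) = -\frac{\partial\phi(r)}{\partial r}\,\hat{\mathbf{r}} = \frac{1}{4\pi\varepsilon_0}\frac{Q}{r^2}\,\hat{\mathbf{r}}. $$
In Example 3.6 we saw that
$$ \mathbf{v} = \frac{1}{r^2}\,\hat{\mathbf{r}} \;\Rightarrow\; \nabla\cdot\mathbf{v} = 0 \;\Rightarrow\; \nabla\cdot\mathbf{E} = 0. $$
However, after visualizing the fields we were puzzled to find that the divergence was identically zero. Alternatively, in Example 3.8 we found that $\nabla^2(1/r) = 0$. Hence
$$ \nabla\cdot\mathbf{E} \equiv -\nabla^2\phi = -\frac{Q}{4\pi\varepsilon_0}\nabla^2\frac{1}{r} = 0. $$
This is consistent with Gauss' Law, $\nabla\cdot\mathbf{E} = \rho/\varepsilon_0$, wherever $\rho = 0$ and, for a point charge, the charge density is zero everywhere except at $r = 0$ where it is infinite! If we apply the divergence theorem to the electric field E we find that
$$ \int_V \nabla\cdot\mathbf{E}\,dV = \oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \frac{Q}{4\pi\varepsilon_0}\oint_{S=\partial V}\frac{1}{r^2}\,r^2\,d\Omega = \frac{Q}{\varepsilon_0}. $$
However, above we have shown that $\nabla\cdot\mathbf{E} = 0$. What is the resolution to this paradox? Hint: you should be suspicious of "point" charges. Taking Coulomb's Law at face value, the potential and field of a "point charge" at the origin are infinite. Note that, although there is no such thing as a "point charge", the electron is effectively a point charge with physical radius $d \lesssim 10^{-17}$ m. The problem is the point $r = 0$, where E blows up. A more careful analysis shows that $\nabla\cdot\mathbf{E} = 0$ everywhere except at the origin. We seem to require a function with the bizarre property that $\nabla\cdot\mathbf{E} = 0$ everywhere except at a single point, yet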
$$ \int_V \nabla\cdot\mathbf{E}\,dV = \frac{Q}{\varepsilon_0}. $$
No function can possibly behave this way. What we have stumbled onto is the Dirac delta function, which is not really a function at all (it is a distribution). The Dirac delta "function" was originally defined by Dirac as
$$ \delta(x) = \begin{cases} 0, & x \neq 0 \\ \infty, & x = 0 \end{cases}, \qquad\text{and}\qquad \int_{-\infty}^{\infty}\delta(x)\,dx = 1, \tag{3.11} $$
so that
$$ \int_{-\infty}^{\infty} f(x)\,\delta(x-a)\,dx = f(a). \tag{3.12} $$
However, this definition does not make sense mathematically. It turns out that Dirac had a good idea though. See Appendix B for more on $\delta(x)$.
Example 3.10: Compute $\mathbf{E}$, $\phi$, and $\nabla\cdot\mathbf{E}$ for the following (spherically symmetric) charge distribution:
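$$ \rho(r) = \begin{cases} \dfrac{Q}{\frac{4}{3}\pi R^3}, & r < R \\ 0, & r \geq R \end{cases} $$
Using spherical polar coordinates, the charge enclosed in a sphere of radius $r < R$ is
$$ q(r) = \int_V \rho(r)\,dV = 4\pi\int_0^r \rho(r')\,r'^2\,dr' = 4\pi\frac{Q}{\frac{4}{3}\pi R^3}\int_0^r r'^2\,dr' = 4\pi\frac{Q}{\frac{4}{3}\pi R^3}\,\frac{r^3}{3} = Q\,\frac{r^3}{R^3}. $$
For $r = R$, we find that $q(R) = Q$, the total charge. Using Gauss' Law (also see Section 3.3.3), the electric field is
$$ \mathbf{E}(r) = \begin{cases} \dfrac{1}{4\pi\varepsilon_0}\dfrac{Qr}{R^3}\,\hat{\mathbf{r}}, & r < R \\[2ex] \dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{r^2}\,\hat{\mathbf{r}}, & r \geq R \end{cases} $$
[Plot: E(r), with the r axis extending to 4R.]
The potential is
$$ \phi(r) = \begin{cases} \dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{R^3}\left(\dfrac{3}{2}R^2 - \dfrac{1}{2}r^2\right), & r < R \\[2ex] \dfrac{1}{4\pi\varepsilon_0}\dfrac{Q}{r}, & r \geq R \end{cases} $$
[Plot: φ(r), falling from 3Q/(8πRε0) at r = 0 through Q/(4πRε0) at r = R, with the r axis extending to 4R.]
and the divergence of the field is
$$ \nabla\cdot\mathbf{E}(r) = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2E_r\right) = \begin{cases} \dfrac{1}{\varepsilon_0}\dfrac{Q}{\frac{4}{3}\pi R^3}, & r < R \\ 0, & r \geq R \end{cases} = \frac{\rho(r)}{\varepsilon_0}. $$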
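This result is to be expected from Gauss' Law. Note that $\nabla\cdot\mathbf{E}$ is discontinuous (because $\rho$ itself is discontinuous). If we apply the divergence theorem to the electric field E we find that
$$ \int_V \nabla\cdot\mathbf{E}\,dV = \int_V \frac{\rho(r)}{\varepsilon_0}\,dV = \frac{q(r)}{\varepsilon_0}, $$
$$ \oint_{S=\partial V}\mathbf{E}\cdot d\mathbf{S} = \frac{1}{4\pi\varepsilon_0}\begin{cases} \displaystyle\oint \frac{Qr}{R^3}\,r^2\,d\Omega, & r < R \\[2ex] \displaystyle\oint \frac{Q}{r^2}\,r^2\,d\Omega, & r \geq R \end{cases} = \frac{Q}{\varepsilon_0}\begin{cases} \dfrac{r^3}{R^3}, & r < R \\ 1, & r \geq R \end{cases} = \frac{q(r)}{\varepsilon_0}, $$
which all checks out. There is no paradox here. Consider now what happens to $\rho(r)$ in the point-charge limit, i.e., as $R \to 0$: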
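The piecewise results of Example 3.10 are easy to get wrong by a factor, so a quick numerical check is worthwhile. The Python below is a sketch with illustrative function names and parameter values; it encodes $q(r)$, $E(r)$ and $\phi(r)$ as given above and is valid for $r > 0$.

```python
import math

EPSILON_0 = 8.8541878128e-12  # vacuum permittivity in F/m

def enclosed_charge(r, Q, R):
    """Charge inside radius r for a uniformly charged sphere of total charge Q, radius R."""
    return Q * (r / R) ** 3 if r < R else Q

def radial_field(r, Q, R):
    """E_r(r) from Gauss' Law, E = q(r) / (4 pi eps0 r^2), for r > 0."""
    return enclosed_charge(r, Q, R) / (4 * math.pi * EPSILON_0 * r * r)

def potential(r, Q, R):
    """phi(r), continuous at r = R and vanishing as r -> infinity."""
    k = Q / (4 * math.pi * EPSILON_0)
    if r < R:
        return k * (1.5 * R * R - 0.5 * r * r) / R**3
    return k / r

Q, R = 1e-9, 0.1  # illustrative values: 1 nC spread over a sphere of radius 10 cm
for r in (0.05, 0.1, 0.2):
    h = 1e-6
    minus_gradient = -(potential(r + h, Q, R) - potential(r - h, Q, R)) / (2 * h)
    print(r, radial_field(r, Q, R), minus_gradient)
```

Inside and outside the sphere the field agrees with the numerical value of $-\partial\phi/\partial r$, and both quantities are continuous across $r = R$.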
[Plot: ρ(r) for decreasing R, drawn as boxes of height Q/((4/3)πR³), with the r axis marked at R/2, R and 2R.]
The boxes get narrower and taller, in such a way that the volume integral (note the $r^2\,dr$ factor) is a constant equal to the total charge. If you read through Appendix B you should be able to show that, in the limit, $\rho(\mathbf{r}) = Q\,\delta(\mathbf{r})$ and hence
$$ \nabla\cdot\mathbf{E} = \frac{\rho(\mathbf{r})}{\varepsilon_0} = \frac{Q}{\varepsilon_0}\,\delta(\mathbf{r}). $$
This simple result is a key ingredient in a concise formulation of much of electromagnetism.
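The rate of change of the charge Q inside a fixed volume V equals minus the current flowing out through its surface S,
$$ \frac{\partial Q}{\partial t} = -\oint_S \mathbf{J}\cdot d\mathbf{S}, $$
where the partial derivative indicates that the volume is fixed in position. Since the total charge $Q = \int_V \rho\,dV$ we find
$$ \frac{\partial}{\partial t}\int_V \rho\,dV = -\oint_S \mathbf{J}\cdot d\mathbf{S}. $$
Now apply Gauss' Theorem to transform the surface integral into a volume integral and we find
$$ \int_V \frac{\partial\rho}{\partial t}\,dV = -\int_V \nabla\cdot\mathbf{J}\,dV. $$
We can bring $\partial/\partial t$ inside the integral because the volume is fixed.
As before, since this applies for any volume V we obtain our final result, the continuity equation for electric charge
$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{J} = 0. \tag{3.13} $$
This equation expresses the conservation of electric charge. It says that at every point the electric current produced per unit volume ($\nabla\cdot\mathbf{J}$) must be balanced by a decrease in the charge density.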
In incompressible flow (a good approximation at low speeds), $\rho_m$ is constant and we have $\nabla\cdot\mathbf{v} = 0$, an important equation in fluid dynamics. In the conduction of heat in a uniform solid the "density" of heat is $CT$ where C is the heat capacity per unit volume and T is the temperature. The equation of continuity is then
$$ C\,\frac{\partial T}{\partial t} + \nabla\cdot\mathbf{Q} = 0, $$
where $\mathbf{Q}$ is the heat flux. We obtained an expression for $\mathbf{Q}$ in terms of the temperature gradient in Eq. 2.2 and substituting this we obtain
$$ \nabla^2T = \frac{C}{k}\frac{\partial T}{\partial t}, $$
a fundamental equation in the theory of heat conduction, also known as the diffusion equation.
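To see the continuity equation at work, the diffusion equation can be stepped forward in one dimension with an explicit finite-difference scheme. The Python below is a minimal sketch, not a method taken from the notes: the function name, the insulated (zero-flux) boundaries and the parameter `alpha`, standing for $k/C$, are all assumptions made for illustration.

```python
def diffuse(temperature, alpha, dx, dt, steps):
    """Explicit update for dT/dt = alpha d2T/dx2 on a 1-D grid with insulated ends.
    Stable when alpha*dt/dx**2 <= 0.5."""
    t = list(temperature)
    ratio = alpha * dt / dx**2
    for _ in range(steps):
        # Mirror the end values so that no heat flows through the boundaries.
        padded = [t[0]] + t + [t[-1]]
        t = [padded[i + 1] + ratio * (padded[i] - 2 * padded[i + 1] + padded[i + 2])
             for i in range(len(t))]
    return t

initial = [0.0] * 10 + [100.0] + [0.0] * 10   # a single hot cell
final = diffuse(initial, alpha=1.0, dx=1.0, dt=0.25, steps=200)
print(sum(initial), sum(final), max(final))
```

The hot spot spreads out and flattens while the sum over cells, which plays the role of the total heat, stays fixed, as the continuity equation requires.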
3.6 Summary
In this chapter we expressed Gauss' Law in integral form as
$$ \oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\varepsilon_0}\int_V \rho\,dV. $$
We then progressed from that to considering the limit of infinitesimal volumes, defining a scalar quantity called the divergence div E by
$$ \operatorname{div}\mathbf{E} = \lim_{V\to0}\frac{1}{V}\oint_S \mathbf{E}\cdot d\mathbf{S}. $$
A form convenient for cartesian coordinates was developed by considering a volume in the shape of a cuboid. We obtained
$$ \operatorname{div}\mathbf{E} = \frac{\partial E_x}{\partial x} + \frac{\partial E_y}{\partial y} + \frac{\partial E_z}{\partial z} = \nabla\cdot\mathbf{E}. $$
We then turned to Gauss' Theorem, a mathematical theorem that applies to any continuous vector field and allows one to transform between surface and volume integrals:
$$ \oint_S \mathbf{A}\cdot d\mathbf{S} = \int_V \nabla\cdot\mathbf{A}\,dV. $$
Gauss' Theorem was applied to derive the continuity equation, which expresses the conservation of charge in differential form:
$$ \frac{\partial\rho}{\partial t} + \nabla\cdot\mathbf{J} = 0, $$
where $\mathbf{J}$ is the current density. Finally the examples illustrated how to cope with non-cartesian coordinate systems.
Chapter 4
Faraday's Law, Stokes' Theorem and curl
4.1 Introduction
This chapter repeats the pattern of Chapter 3. We start from an experimentally derived physical law, in this case Faraday's Law of induction, and derive a differential version of it that applies at a point. In so doing, we introduce the third and final of the vector derivatives, a vector called the curl of a field. We then follow up with Stokes' Theorem which, in the same way that Gauss' Theorem is used to transform between volume and surface integrals, can be used to transform between surface and line integrals.
We met this earlier in Chapter 2 where we said that this quantity had to be zero for energy conservation. However, that was in electrostatics, and does not apply when work is being done to change the fields. A corollary is that the electrostatic relation $\mathbf{E} = -\nabla\phi$ no longer applies in the time-varying case. The flux connecting the circuit C is the integral of the magnetic flux density B over any surface, S, bounded by $C = \partial S$ and can be written $\Phi_B = \int_S \mathbf{B}\cdot d\mathbf{S}$. Faraday's Law then states
$$ \oint_{C=\partial S}\mathbf{E}\cdot d\mathbf{l} = -\frac{d}{dt}\int_S \mathbf{B}\cdot d\mathbf{S}. \tag{4.1} $$
For the sign to make sense, the direction in which the circuit is travelled has to be defined. Fig. 4.1 shows the convention based upon the right-hand rule. If one grasps the circuit with the right hand so that the fingers point along the direction of B, then the thumb points along the direction in which C is traversed.
Figure 4.1: Magnetic flux threads a circuit C which is covered by a surface S that has C as its boundary. The arrow on C indicates in which direction the line integral is taken for B pointing in the direction shown.
Example 4.1: What is the electric field inside a long solenoid of n turns/unit length when the current I flowing through the coils changes?
Figure 4.2: The figure shows side and end-on views of a solenoid carrying current I. The end-on view looks into the magnetic field (represented by dots). To calculate the electric field induced by changing I, a circuit is taken to be a circle of radius r enclosing the field.
We will assume the result from first year that the magnetic field inside the solenoid is given by $B = \mu_0 nI$. By symmetry E must run in circles around the axis of the solenoid and so we take such a circle as our circuit. Since E runs parallel to the circuit at all points, the line integral reduces to
$$ \oint_C \mathbf{E}\cdot d\mathbf{l} = 2\pi rE. $$
For circuits inside the radius a of the solenoid ($r < a$), the flux linking the circuit is
$$ \Phi_B = \int_S \mathbf{B}\cdot d\mathbf{S} = \pi r^2B = \pi r^2\mu_0 nI. $$
Faraday's Law (Eq. 4.1) then gives
$$ 2\pi rE = -\pi r^2\mu_0 n\frac{dI}{dt} \qquad\text{or}\qquad E = -\frac{r}{2}\,\mu_0 n\frac{dI}{dt} \quad (r < a). $$
There is no magnetic flux outside the solenoid so the flux linking the circuit stays fixed at $\pi a^2B$ for $r > a$ and therefore
$$ E = -\frac{a^2}{2r}\,\mu_0 n\frac{dI}{dt} \quad (r > a). $$
The existence of an electric field outside the coil allows signals flowing through the coil to be picked up with a loop of wire enclosing the coil.
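The two branches of the solenoid result can be wrapped in a short Python function. This is an illustrative sketch: the function name and the sample values for the radius, turn density and rate of change of current are assumptions, and the sign follows the circuit direction fixed by the right-hand rule.

```python
import math

MU_0 = 1.25663706212e-6  # vacuum permeability in N/A^2 (CODATA 2018 value)

def induced_field(r, a, n, dI_dt):
    """Azimuthal E (V/m) at distance r from the axis of a long solenoid of radius a,
    with n turns per metre and current changing at dI_dt (A/s)."""
    if r < a:
        return -0.5 * r * MU_0 * n * dI_dt
    return -0.5 * (a * a / r) * MU_0 * n * dI_dt

a, n, dI_dt = 0.02, 1000.0, 50.0   # illustrative: 2 cm radius, 1000 turns/m, 50 A/s
for r in (0.01, 0.02, 0.04):
    emf = 2 * math.pi * r * induced_field(r, a, n, dI_dt)
    print(f"r = {r:.2f} m  E = {induced_field(r, a, n, dI_dt):.3e} V/m  emf = {emf:.3e} V")
```

Inside the solenoid the field grows linearly with r; outside it falls as $1/r$, so the line integral $2\pi rE$ is the same for every loop that encloses the whole solenoid.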
Figure 4.3: The figure shows a small loop used to obtain an expression for the line integral in the limit of infinitesimal size. The circuit is traversed in a direction appropriate for the right-hand rule and a z-axis which points out of the page.
To be specific we consider the loop illustrated in Fig. 4.3. This is a rectangle in the x-y plane with sides parallel to the x and y-axes. Note that by contrast with the derivation of divergence, the loop's orientation is significant. This will be reflected in the quantity called curl, introduced below, which turns out to be a vector rather than a scalar like divergence. The circuit direction indicated in Fig. 4.3 follows the right-hand rule for a right-handed set of coordinate axes in which $\hat{\mathbf{z}} = \hat{\mathbf{x}}\times\hat{\mathbf{y}}$. The line integral around the circuit has four separate parts corresponding to the line segments PQ, QR, RS and SP. The contribution from PQ is due entirely to the x component of E, which we evaluate at the mid-point of the segment as $E_x(x, y - \delta y/2)$. Multiplying this by the length of the segment and adding in the other three similar terms we have
$$ \oint_C \mathbf{E}\cdot d\mathbf{l} = E_x(x, y-\delta y/2)\,\delta x + E_y(x+\delta x/2, y)\,\delta y - E_x(x, y+\delta y/2)\,\delta x - E_y(x-\delta x/2, y)\,\delta y, $$
with the minus signs appearing when we travel against the direction of the coordinate axes. This expression can be grouped into two pairs of differences:
$$ \oint_C \mathbf{E}\cdot d\mathbf{l} = \left[E_y(x+\delta x/2, y) - E_y(x-\delta x/2, y)\right]\delta y - \left[E_x(x, y+\delta y/2) - E_x(x, y-\delta y/2)\right]\delta x, $$
which, when expanded to first order as we did when deriving $\operatorname{div}\mathbf{E} \equiv \nabla\cdot\mathbf{E}$, yields
$$ \left[E_y(x+\delta x/2, y) - E_y(x-\delta x/2, y)\right]\delta y = \frac{\partial E_y}{\partial x}\,\delta x\,\delta y + O(\delta x^3), \qquad \left[E_x(x, y+\delta y/2) - E_x(x, y-\delta y/2)\right]\delta x = \frac{\partial E_x}{\partial y}\,\delta x\,\delta y + O(\delta y^3), $$
so that
$$ \oint_C \mathbf{E}\cdot d\mathbf{l} = \left(\frac{\partial E_y}{\partial x} - \frac{\partial E_x}{\partial y}\right)\delta x\,\delta y = (\nabla\times\mathbf{E})_z\,\delta x\,\delta y = (\nabla\times\mathbf{E})\cdot\delta\mathbf{S}, $$
where we have recognised that the term in brackets is the z-component of the vector $\nabla\times\mathbf{E}$. The last expression follows because the area vector representing the loop is given by $\delta\mathbf{S} = \hat{\mathbf{z}}\,\delta x\,\delta y$. Although we have not proved it, this result is general, i.e., for an infinitesimal flat element of area $\delta\mathbf{S}$ of any shape and orientation bounded by a loop $C = \partial S$ we can write
$$ \oint_{C=\partial S}\mathbf{E}\cdot d\mathbf{l} = (\nabla\times\mathbf{E})\cdot\delta\mathbf{S}. \tag{4.2} $$
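The four-segment line integral can be evaluated directly for a small rectangle and compared with the analytic z-component of the curl. The Python sketch below follows the segment order PQ, QR, RS, SP used above; the function name, the loop size and the two test fields are illustrative assumptions.

```python
def circulation_per_area(field, x, y, dx=1e-4, dy=1e-4):
    """Line integral of `field` around a small rectangle centred on (x, y), traversed
    anticlockwise, divided by the loop area. `field` maps (x, y) to (Fx, Fy)."""
    line_integral = (field(x, y - dy / 2)[0] * dx      # PQ: along +x at the bottom edge
                     + field(x + dx / 2, y)[1] * dy    # QR: along +y at the right edge
                     - field(x, y + dy / 2)[0] * dx    # RS: along -x at the top edge
                     - field(x - dx / 2, y)[1] * dy)   # SP: along -y at the left edge
    return line_integral / (dx * dy)

def rotation(x, y):
    """Rigid rotation about the z-axis: (-y, x), with (curl)_z = 2 everywhere."""
    return (-y, x)

def shear(x, y):
    """Field (x y, x^2), with (curl)_z = 2x - x = x."""
    return (x * y, x * x)

print(circulation_per_area(rotation, 0.3, -0.8))  # close to 2
print(circulation_per_area(shear, 0.7, 0.3))      # close to 0.7
```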
The quantity $\nabla\times\mathbf{E}$ is a vector and is called the curl of the electric field. You may also sometimes see it called rot E, short for rotation. Eq. 4.2 defines curl in the same way that Eq. 3.3 defined the divergence. Of all three derivatives we have now encountered (grad, div, and curl), the curl is the most difficult to get a feel for. Its nature is defined by Eq. 4.2. When thinking about curl, one should picture a small loop embedded in the vector field and consider what the circulation around it is. Still, it is not always obvious whether there is any overall line integral. We will look at some examples later which may help, but first we will finish with Stokes' Theorem.
Any finite surface can be subdivided into many small flat facets obeying the above equation. Adding the line integrals of all of these facets, the individual contributions cancel except on the outer boundary (for example refer back to Fig. 4.1 and consider adding the integrals around the two adjacent dashed squares). We then obtain Stokes' Theorem
$$ \oint_{C=\partial S}\mathbf{E}\cdot d\mathbf{l} = \int_S (\nabla\times\mathbf{E})\cdot d\mathbf{S}, \tag{4.3} $$
for any surface S bounded by the circuit $C = \partial S$. This applies to any physical vector field, not just E. S and C here are now finite in contrast to Eq. 4.2 and S no longer has to be flat. With Stokes' Theorem we can transform line integrals into surface integrals.
As a simple application, let us revisit the condition $\oint_C \mathbf{E}\cdot d\mathbf{l} = 0$ which we derived for electrostatic fields in Chapter 2. Since this applies for any circuit, Stokes' Theorem implies that $\nabla\times\mathbf{E} = \mathbf{0}$ for electrostatic fields ($\mathbf{0}$ is a zero vector). Such a field is said to be curl-free or irrotational. The study of irrotational fluid flows for which $\nabla\times\mathbf{v} = \mathbf{0}$ is of great importance in aerodynamics, and approximations based on this explain why aircraft fly. The reverse of the above condition is also true. That is, if a vector field A satisfies $\nabla\times\mathbf{A} = \mathbf{0}$, then we can write $\oint_C \mathbf{A}\cdot d\mathbf{l} = 0$ and, from Chapter 2, A can be derived from a potential, $\mathbf{A} = \nabla\phi$. We are saying then that $\nabla\times\nabla\phi = \mathbf{0}$, which is reasonable if you remember that the cross-product of a vector with itself is zero (although this is not a proof because $\nabla$ is not an ordinary vector). Since the condition $\nabla\times\mathbf{v} = \mathbf{0}$ implies $\mathbf{v} = \nabla\phi$, curl-free flows are also called potential flows. Recalling that incompressible flows satisfy $\nabla\cdot\mathbf{v} = 0$, we then have $\nabla^2\phi = 0$, and so incompressible potential flows satisfy Laplace's equation, which is also satisfied by electrostatic fields: a useful mathematical similarity between very different physical systems.
Applying Stokes' Theorem to the left-hand side of Eq. 4.1 gives
$$ \int_S (\nabla\times\mathbf{E})\cdot d\mathbf{S} = -\frac{d}{dt}\int_S \mathbf{B}\cdot d\mathbf{S}. $$
The total time derivative $d/dt$ allows the loop to move, but we do not want this because it means that we are measuring B in our rest frame while we are measuring E in the rest frame of the loop. A simple thought experiment shows that E and B change according to the frame in which they are measured. Picture a charge q moving at velocity v in a region with a magnetic field B but no electric field. The force on the charge is $q\,\mathbf{v}\times\mathbf{B}$. How does the picture alter when viewed from a frame in which the charge is at rest (even if it is only at rest for an infinitesimal time)? Since in this frame the charge is at rest, there is no $\mathbf{v}\times\mathbf{B}$ term, and yet the charge must feel a force because it moves in a circle in a magnetic field. We are forced to the conclusion that in the new frame there is an electric field, which for low v must have strength $\mathbf{v}\times\mathbf{B}$. In other words, a magnetic field in one frame may look like an electric field in another frame (see the section on Lorentz transformations below). The important point is that it is essential to measure electric and magnetic fields in the same reference frame. This can be done by fixing the loop to be stationary in our rest frame (and replacing total derivatives by partial derivatives) whereby we obtain
$$ \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}. \tag{4.4} $$
This is the differential version of Faraday's Law and the second of Maxwell's equations. It contains no more physics than the integral version, Eq. 4.1, but it applies at a point.
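$$\mathbf{E}' = \gamma\,(\mathbf{E} + \mathbf{v}\times\mathbf{B}), \qquad \mathbf{B}' = \gamma\left(\mathbf{B} - \frac{1}{c^2}\,\mathbf{v}\times\mathbf{E}\right);$$
(for the field components perpendicular to $\mathbf{v}$). The primed expressions $(\mathbf{E}', \mathbf{B}')$ correspond to quantities measured in a coordinate system moving at a uniform velocity $\mathbf{v}$ with respect to the coordinate system in which the unprimed expressions, $(\mathbf{E}, \mathbf{B})$, are deduced. Can you see where the Lorentz force law comes from? What about the Biot-Savart law?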
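ω = {a, b, c}; r = {x, y, z}; v = Cross[ω, r]
{b z - c y, c x - a z, a y - b x}
It is usual to visualize this vector field in two dimensions. For example,
VectorPlot[{-y, x}, {x, -1, 1}, {y, -1, 1}]
[Plot: the two-dimensional field {-y, x} circulating anticlockwise about the origin, for x and y between -1 and 1.]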
From the geometrical interpretation we expect this field to have non-zero curl at any point {x, y, z}. Also, we expect the curl to point in the z-direction, as the right-hand rule would suggest. Note that the vector field really exists in three dimensions:
VectorPlot3D[{z - y, x - z, y - x}, {x, -1, 1}, {y, -1, 1}, {z, -1, 1}]
[Plot: the three-dimensional field {z - y, x - z, y - x} for x, y and z between -1 and 1.]
However, it is harder to decide from this plot whether we expect this field to have non-zero curl. Computing the curl in Cartesian coordinates we obtain
{D[v[[3]], y] - D[v[[2]], z], D[v[[1]], z] - D[v[[3]], x], D[v[[2]], x] - D[v[[1]], y]}
{2 a, 2 b, 2 c}
% == 2 ω
True

On other occasions it is better to remember the meaning of curl. For example, what is the curl of $\mathbf{E} = r^4 e^{-r/\lambda}\,\hat{\mathbf{r}}$? It would be a difficult task to calculate all the necessary derivatives, but also an unnecessary one. This is a spherically symmetric field, so the curl, which is a vector, can only point along the radial direction: anything else would not be spherically symmetric. The derivation of the curl (see Eq. 4.2) means that, to find its radial component, we need to calculate the line integral around a loop perpendicular to the radial direction. But such a loop is everywhere perpendicular to the field, and therefore the line integral, and the curl, are zero everywhere. This is true for any spherically symmetric field, and seems natural as there is no sense of rotation about such a field.

When calculating the divergence we used its basic definition to compute its value in a spherically symmetric case (see section 3.4.1). A similar approach can be taken with the curl in the case of cylindrical symmetry. Consider a field $\mathbf{A}$ that runs in circles around the z-axis with a strength that varies as $A(r)$, where $r$ is the distance from the z-axis. This describes, for instance, the magnetic field around a wire carrying a current, with $A(r) \propto 1/r$.
Figure 4.4
The figure shows the path of integration used to evaluate the curl in a field that runs in circles around an axis pointing out of the page and passing through the centre of the circles.
Only loops perpendicular to z can have any line integral. Therefore, the curl must lie in the z-direction, and we choose a small loop which lies perpendicular to the z-axis, as illustrated in Fig. 4.4. We are free to choose any shape for the loop, so we pick one to make the calculation as easy as possible. Each side of the circuit C illustrated in Fig. 4.4 runs either parallel to the field or perpendicular to it. The line integral around it can then be written down as
$$\oint_C \mathbf{A}\cdot d\mathbf{l} = (\nabla\times\mathbf{A})\cdot\delta\mathbf{S} = (r+\delta r)\,A(r+\delta r)\,\delta\theta - r\,A(r)\,\delta\theta = \frac{\partial\,(r A(r))}{\partial r}\,\delta\theta\,\delta r = |\nabla\times\mathbf{A}|\;r\,\delta\theta\,\delta r,$$
with the last step following from Eq. 4.2, and where $A(r)$ is the magnitude of the field at distance $r$. Equating the last two expressions gives $|\nabla\times\mathbf{A}| = \frac{1}{r}\frac{\partial\,(r A(r))}{\partial r}$. We can easily verify the middle result (up to second order) using Mathematica:
(r + dr) A[r + dr] θ - r A[r] θ + O[dr]^2 == D[r A[r], r] θ dr + O[dr]^2
True
Proceeding in this manner one can obtain more general expressions for the curls of fields expressed in any coordinate system. See Appendix A for a summary of vector operators in orthogonal coordinate systems. For example, the general expression for the curl in spherical polar coordinates is (see Eq. A.16)
$$\nabla\times\mathbf{A} = \frac{1}{r\sin\theta}\left[\frac{\partial}{\partial\theta}\left(\sin\theta\,A_\phi\right) - \frac{\partial A_\theta}{\partial\phi}\right]\mathbf{e}_r + \frac{1}{r}\left[\frac{1}{\sin\theta}\frac{\partial A_r}{\partial\phi} - \frac{\partial}{\partial r}\left(r A_\phi\right)\right]\mathbf{e}_\theta + \frac{1}{r}\left[\frac{\partial}{\partial r}\left(r A_\theta\right) - \frac{\partial A_r}{\partial\theta}\right]\mathbf{e}_\phi,$$
where $\mathbf{A} \equiv A_r\,\hat{\mathbf{r}} + A_\theta\,\hat{\boldsymbol{\theta}} + A_\phi\,\hat{\boldsymbol{\phi}}$.

Example 4.3: What is the curl of a field that runs in circles around the z-axis with a strength that drops off inversely with distance?
Using the cylindrical result $|\nabla\times\mathbf{A}| = \frac{1}{r}\frac{\partial\,(rA)}{\partial r}$ obtained above, we put $A(r) = 1/r$ and find that $\nabla\times\mathbf{A} = \mathbf{0}$. This is the form of the magnetic field near a wire carrying a current. It illustrates the point that a field can appear to have circulation but have no curl. However, our calculation breaks down on the axis itself, because a loop enclosing the axis would definitely have a finite line integral (cf. computing the divergence of the electric field of a point charge).

Example 4.4: What is the curl of a field that runs in circles around the z-axis with a strength that increases linearly with distance?
Now $A(r) = r$ and we find $\nabla\times\mathbf{A} = 2\,\hat{\mathbf{z}}$. This is the form of the magnetic field inside a wire of uniform current density. It turns out that the curl of the magnetic field is proportional to the current density, so it is no fluke that the curl turns out to have constant magnitude. Since the region near a current-carrying wire carries no current itself, the result in Example 4.3 is no surprise either.
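Examples 4.3 and 4.4 can be reproduced directly from the definition of the curl as circulation per unit area. The following is a minimal Python sketch (not part of the original notebooks): it evaluates the line integral around the small loop of Fig. 4.4 and divides by the loop area. The loop size and the radius at which the curl is sampled are arbitrary illustrative choices.

```python
def curl_z(A, r, dr=1e-6, dtheta=1e-3):
    """Estimate the z-component of the curl of a field of magnitude A(r)
    that circulates about the z-axis, as circulation / area for a small loop."""
    # Only the two arcs contribute; the radial sides are perpendicular to the field.
    circulation = (r + dr) * A(r + dr) * dtheta - r * A(r) * dtheta
    area = r * dtheta * dr
    return circulation / area


# Example 4.3: A(r) = 1/r, the field outside a current-carrying wire.
curl_outside_wire = curl_z(lambda r: 1.0 / r, r=2.0)

# Example 4.4: A(r) = r, the field inside a wire of uniform current density.
curl_inside_wire = curl_z(lambda r: r, r=2.0)

print(curl_outside_wire)  # close to 0
print(curl_inside_wire)   # close to 2
```

Because the loop has finite size, the second value differs from 2 by a term of order dr/r.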
because a scalar on the left cannot equal a vector on the right. When first met these quantities can be confusing. One has no "feel" or "intuition" for them. Intuition is actually a misleading expression; "experience of" would be more accurate. Why should anyone have intuition for a concept such as curl which they have never met before? You can only develop "intuition" after use and after seeing these quantities in action. The way to develop it most quickly is to remember equations such as Eq. 4.4 and always to focus on the physical meaning behind the symbols.
4.7 Summary
We started in this chapter by expressing Faraday's law. We then considered how this can be applied to an infinitesimal region and in doing so defined a new quantity, the curl of $\mathbf{E}$, defined by (for infinitesimal loops)
$$\oint_C \mathbf{E}\cdot d\mathbf{l} \equiv \operatorname{curl}\mathbf{E}\cdot\delta\mathbf{S} = (\nabla\times\mathbf{E})\cdot\delta\mathbf{S},$$
with the second form based on consideration of a small rectangular loop. The curl of a vector field is itself a vector. This led on to a more general equation, called Stokes' Theorem, that can be applied to finite surfaces and can be used to transform between surface and line integrals. This equation was then applied to the integral version of Faraday's law to arrive at the differential form of Faraday's law.
Chapter 5
Magnetic fields
5.1 Introduction
In this chapter we look at the physics of magnetostatics. We will encounter the second pair of Maxwell's equations, although one of them will have to be modified later for time-variable phenomena. We make use of Gauss' and Stokes' theorems, but no new mathematics has to be introduced. Our starting points are the Biot-Savart and Ampère's laws.
The cross-product here gives a magnetic field obeying the usual right-hand rule for magnetic fields from currents. That is, with the thumb of your right hand pointing along the current, your fingers point in the direction of the field. There are rather few situations where the Biot-Savart law proves practical to use and we are not going to use it to any great extent. The main point we take from it is that the field lines run in circles around an axis defined by the direction of $d\mathbf{l}$. If they run in circles, no flux of $\mathbf{B}$ is produced or destroyed, and therefore we can write immediately
$$\nabla\cdot\mathbf{B} = 0, \qquad (5.1)$$
because $\nabla\cdot\mathbf{B}$, the divergence of $\mathbf{B}$, represents the amount of magnetic flux produced per unit volume. This equation is the third of Maxwell's equations. It can be proved more formally, but the proof is not illuminating. If we compare with the equivalent equation for the electric field (Eq. 3.5),
$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0},$$
we see that there are no sources of magnetic flux, or in other words there are no magnetic charges (magnetic monopoles).
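We can relate this to the electric field of the point charge as follows:
$$\mathbf{B} = \frac{\mu_0}{4\pi}\,4\pi\epsilon_0\,\mathbf{v}\times\frac{q}{4\pi\epsilon_0 r^2}\,\hat{\mathbf{r}} = \mu_0\epsilon_0\,\mathbf{v}\times\mathbf{E}.$$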
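Compare this with the Lorentz transformation formula for magnetic fields:
$$\mathbf{B}' = \gamma\left(\mathbf{B} - \frac{1}{c^2}\,\mathbf{v}\times\mathbf{E}\right);$$
in the rest frame of the charge $\mathbf{B} = \mathbf{0}$, and the frame in which the charge moves with velocity $\mathbf{v}$ itself moves with velocity $-\mathbf{v}$ relative to that rest frame, so for $v \ll c$ (where $\gamma \approx 1$) the formula reproduces $\mathbf{B}' = \mu_0\epsilon_0\,\mathbf{v}\times\mathbf{E}$, since $\mu_0\epsilon_0 = 1/c^2$.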
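$$\oint_{C=\partial S}\mathbf{B}\cdot d\mathbf{l} = \mu_0\int_S \mathbf{J}\cdot d\mathbf{S}. \qquad (5.2)$$
Here C is the loop through which the current flows, $\mathbf{J}$ is the current density and S is any surface bounded by C. Applying Stokes' theorem to transform the line integral into a surface integral we have:
$$\oint_{C=\partial S}\mathbf{B}\cdot d\mathbf{l} = \int_S (\nabla\times\mathbf{B})\cdot d\mathbf{S} = \mu_0\int_S \mathbf{J}\cdot d\mathbf{S}$$
and, since this applies for any loop C, we must have
$$\nabla\times\mathbf{B} = \mu_0\,\mathbf{J}, \qquad (5.3)$$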
which is the differential form of Ampère's law. This was the relation referred to in the discussion after Example 4.4, in which we calculated the curl of some example fields. As an unusual application of Ampère's law, suppose that we wish to measure the total current flowing to or from the ground during a thunderstorm. We could do so by measuring the magnetic field at a series of points on the ground at the boundary of the storm. Taking the line integral would give us the current. This would be a great deal easier than measuring the current directly, which would in any case require knowing where lightning was going to strike.

Example 5.1: A current I flows in a long wire of circular cross-section of radius a (Fig. 5.1). What is the magnetic field as a function of the distance r from the axis of the wire?
Figure 5.1
A cross-section of a wire carrying a current into the page (represented by crosses). C is the path used to determine the magnetic field.
We will apply the integral form of Ampère's law (Eq. 5.2). We need to define a suitable circuit. Since the magnetic field must run around the wire in circles, the obvious path is itself a circle centred on the axis of the wire, so that the magnetic field is everywhere parallel to it and of the same strength. This problem is very similar to Example 4.1, where we calculated the electric field of a solenoid. Fig. 5.1 shows such a path. The line integral $\oint_C \mathbf{B}\cdot d\mathbf{l}$ reduces to $2\pi r B$. The current linked depends on whether the circuit is inside or outside the wire. If it is outside ($r > a$) then the current enclosed is simply $I$; if it is inside ($r < a$) then the current enclosed scales with area (i.e., $r^2$) and must therefore be $I\,(r/a)^2$, so that it equals $I$ for $r = a$. Applying Ampère's law (Eq. 5.2) we obtain
$$B = \begin{cases} \dfrac{\mu_0 I}{2\pi r}, & r > a, \\[2ex] \dfrac{\mu_0 I\,r}{2\pi a^2}, & r < a. \end{cases}$$
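The result of Example 5.1 is easy to evaluate numerically. The following is a minimal Python sketch (not part of the original notebooks); the function name `wire_field` and the sample current and radius are illustrative assumptions.

```python
import math

MU0 = 4e-7 * math.pi  # permeability of free space, H/m (classical value)


def wire_field(r, current, a):
    """Magnitude of B (tesla) at distance r from the axis of a long straight
    wire of radius a carrying a uniformly distributed current."""
    if r <= 0:
        raise ValueError("r must be positive")
    if r >= a:
        # Outside the wire the full current is enclosed.
        return MU0 * current / (2 * math.pi * r)
    # Inside the wire only the fraction (r/a)^2 of the current is enclosed.
    return MU0 * current * r / (2 * math.pi * a * a)


# Hypothetical numbers: 10 A in a wire of radius 1 mm.
a = 1e-3
current = 10.0
for r in (0.25e-3, 0.5e-3, 1e-3, 2e-3, 4e-3):
    print(r, wire_field(r, current, a))
```

The field rises linearly inside the wire, peaks at the surface and then falls off as 1/r; the two branches agree at r = a.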
Example 5.2: Derive the equation $B = \mu_0 n I$ for the magnetic field inside a long solenoid with n turns per unit length and carrying a current I.
Figure 5.2
Cross-section of a solenoid with ⊗ representing wires carrying current into the page and ⊙ representing currents flowing out of the page. The circuit C is used to determine the magnetic field.
Consider the rectangular circuit as shown in Fig. 5.2. Only the side running parallel to the field inside the solenoid gives any contribution to $\oint_C \mathbf{B}\cdot d\mathbf{l}$. Its contribution is $B L$. The circuit links a current of $n L I$ and so $B L = \mu_0 n L I$. Therefore $B = \mu_0 n I$.
5.4 Summary
Starting from the Biot-Savart law, the third of Maxwell's equations, Eq. 5.1, was written down. This equation expresses the fact that no magnetic charges have ever been found. Next, Ampère's law was translated into mathematical form (Eq. 5.2). Applying Stokes' theorem, the differential version, Eq. 5.3, was immediately obtained.
Chapter 6
Electromagnetic Waves
6.1 Introduction
In this chapter we show that Ampère's Law cannot apply in the time-varying case. We consider how to modify it, introducing an extra term, the displacement current, to satisfy charge conservation. We then demonstrate the existence of electromagnetic waves. We examine the essential properties of these waves in the vacuum, considering both the general properties of waves and properties specific to electromagnetic waves.
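$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{B} = \mu_0\,\mathbf{J}.$$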
The second pair of equations relate the curl of one vector field to a different vector field. If we take the final equation, for instance, it says that the free current density, $\mathbf{J}$, is proportional to the curl of the magnetic field. This places an important restriction upon the nature of $\mathbf{J}$. To realise why, we first need a mathematical result (a vector field identity) which states that for any vector field $\mathbf{A}$, the divergence of its curl equals zero, i.e., $\nabla\cdot(\nabla\times\mathbf{A}) = 0$. Taking the divergence ($\nabla\cdot$) of both sides of $\nabla\times\mathbf{B} = \mu_0\mathbf{J}$ we obtain
$$\nabla\cdot\mathbf{J} = 0.$$
In other words, Ampère's Law (as we have seen it so far) implies that the current density $\mathbf{J}$ is divergence-less. This result cannot be true in general. It says that the net outflow of current per unit volume is everywhere zero. Equivalently, using Gauss' Theorem we have
$$\int_V \nabla\cdot\mathbf{J}\;d\mathbf{r} = \oint_{S=\partial V}\mathbf{J}\cdot d\mathbf{S} = 0,$$
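which says that the total current flowing out of any volume is always zero. This is wrong because it would mean that nothing could ever be charged or discharged. Every time a capacitor is charged, $\nabla\cdot\mathbf{J} = 0$ is violated as charge flows on and off the plates. We saw the correct relation for the divergence of the current density when we discussed continuity equations. If current flows out of a volume, it is balanced by a loss of charge from the volume, and this led us to
$$\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = 0.$$
Using Gauss' Law, $\nabla\cdot\mathbf{E} = \rho/\epsilon_0$, we can write
$$\nabla\cdot\mathbf{J} + \frac{\partial\rho}{\partial t} = \nabla\cdot\mathbf{J} + \epsilon_0\frac{\partial\,(\nabla\cdot\mathbf{E})}{\partial t} = \nabla\cdot\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right) = 0.$$
Thus, if we modify Ampère's Law to read
$$\nabla\times\mathbf{B} = \mu_0\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right), \qquad (6.1)$$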
we have a divergence-less field on each side of the equation. Moreover, in static cases the modified equation reduces to $\nabla\times\mathbf{B} = \mu_0\mathbf{J}$, which was derived from magnetostatic experiments. Finally, the modified equation now resembles Faraday's Law, the only difference being that, because there are no magnetic charges, there is no magnetic current term in Faraday's Law. Equation 6.1 is our final version of Ampère's Law and completes the set of Maxwell's equations. The new term, $\mu_0\epsilon_0\,\partial\mathbf{E}/\partial t$, is related to the displacement current (actually a current density) and was introduced by Maxwell.

Although very suggestive, the above discussion provides only a motivation for the introduction of the displacement current, and ultimately its true test rests on experiments. For example, there are many other terms that could be added which would also have zero divergence. However, the displacement current is needed for the propagation of electromagnetic waves, and so every time one turns on a light its existence is demonstrated. The effects of the displacement current are exactly those of an ordinary current, and cannot be distinguished from it. For instance, as a capacitor is charged, and the field between the plates increases, it is as if a current were flowing between the plates, and a magnetic field will be generated just as it would if there were a true current of the same magnitude.

Why wasn't the displacement current found experimentally? First, the experiments that led to Ampère's Law are difficult to perform in time-varying cases; Ampère experimented with coils and steady currents. Second, the displacement current is small. In vacuum, a rate of change of electric field of order $10^{11}\ \mathrm{V\,m^{-1}\,s^{-1}}$ over $1\ \mathrm{m^2}$ is needed to produce a current of only 1 A. The displacement current is often negligible, with the important exception of when no ordinary current can flow, as in a vacuum or a dielectric. When we derive the wave equation in Section 6.5, the displacement current is vital.
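The size estimate quoted above follows from multiplying the displacement current density, $\epsilon_0\,\partial E/\partial t$, by the area. A minimal Python sketch (not part of the original notebooks; the function name is an illustrative assumption, and the field is taken to be uniform over the area):

```python
EPS0 = 8.854e-12  # permittivity of free space, F/m (rounded)


def displacement_current(dE_dt, area):
    """Displacement current (A) through a flat area (m^2) for a uniform
    electric field changing at the rate dE_dt (V/m/s)."""
    return EPS0 * dE_dt * area


# The case quoted in the text: 1e11 V/m/s over one square metre.
current = displacement_current(1e11, 1.0)
print(current)  # a little under 1 A
```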
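$$\nabla\cdot\mathbf{E} = \frac{\rho}{\epsilon_0}, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{B} = \mu_0\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right). \qquad (6.2)$$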
Note that these equations apply generally but, because the total charge and current densities include contributions from polarisation and magnetisation, it is not usually convenient to use them when materials are present. Each of these differential equations has an integral equivalent. Very often it is the integral versions that are easier to apply, but the differential equations are vital in the study of electromagnetic radiation as we will see in the rest of this chapter. The integral versions can be derived by suitable integration followed by application of Stokes' Theorem or Gauss' Theorem. For example, consider
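$$\nabla\times\mathbf{B} = \mu_0\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right).$$
This has a curl on the left, so if we integrate it over some surface, we will be able to transform the resulting surface integral into a line integral by Stokes' theorem. Thus we get the following steps:
$$\int_S (\nabla\times\mathbf{B})\cdot d\mathbf{S} = \mu_0\int_S\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{S},$$
and so, after applying Stokes' Theorem to the left-hand side, we get
$$\oint_{C=\partial S}\mathbf{B}\cdot d\mathbf{l} = \mu_0\int_S\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{S}.$$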
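This equation says that the line integral of $\mathbf{B}$ around a circuit is equal to $\mu_0$ times the sum of the free and displacement currents flowing through it. We can go through a similar procedure for each equation, and we obtain the following integral equations equivalent to Equations 6.2:
$$\oint_S \mathbf{E}\cdot d\mathbf{S} = \frac{1}{\epsilon_0}\int_V \rho\;d\mathbf{r}, \qquad \oint_S \mathbf{B}\cdot d\mathbf{S} = 0,$$
$$\oint_C \mathbf{E}\cdot d\mathbf{l} = -\int_S \frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S}, \qquad \oint_{C=\partial S}\mathbf{B}\cdot d\mathbf{l} = \mu_0\int_S\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right)\cdot d\mathbf{S}. \qquad (6.3)$$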
The right-hand sides of these equations can be replaced by integrated quantities such as charge or current as appropriate.
where $\zeta$ represents any wave-like quantity. Why does this describe waves? First we have to define what a wave is. A wave is some sort of disturbance that propagates with time. In the simplest case waves propagate without changing shape. For example, someone's voice sounds the same, apart from loudness, almost independently of the distance of the speaker. The sound waves are little distorted by travel in air. A wave of this form travelling in the x-direction can be described by $\zeta(x, t) = \zeta_0 f(x - vt)$. This function is constant for constant values of $x - vt$, which implies that $x = vt + \text{const}$, and so $v$ represents the speed at which the disturbance travels, which we will call the wave or phase velocity. For example, with the function
f[x_] = 1/(1 + x^2);
the following Manipulate cell gives an animation of a one-dimensional wave in the notebook:
Manipulate[Plot[f[x - 1.2 t], {x, 0.01, 20}, PlotRange -> All], {t, 0, 11}, SaveDefinitions -> True]
[Interactive plot: the pulse f(x - 1.2 t) for x from 0 to 20, with t adjustable from 0 to 11.]
GraphicsGrid[Partition[Table[Plot[f[x - 1.2 t], {x, 0.01, 20}, PlotRange -> All, Ticks -> {5 Range[4], {0, 1}}], {t, 0, 11}], 3]]
[Grid of twelve plots: snapshots of f(x - 1.2 t) for t = 0, 1, ..., 11, with x from 0 to 20 and a vertical range from 0 to 1.]
To see that zHx, tL satisfies the wave equation, we substitute it in. We therefore need to calculate various derivatives. Thus, for example, putting
ζ[x_, t_] = ζ0 f[x - v t];
then the second derivative with respect to x is
D[ζ[x, t], {x, 2}]
ζ0 f''[x - t v]
whilst the second derivative with respect to t is
D[ζ[x, t], {t, 2}]
v^2 ζ0 f''[x - t v]
Substituting these into the wave equation gives
$$\zeta_0 f''(x - vt) = \frac{v^2}{v_\phi^2}\,\zeta_0 f''(x - vt).$$
Since $f''(x - vt)$ crops up on both sides, this result holds for any $f$, provided that $v^2 = v_\phi^2$, so that the constant $v_\phi$ appearing in the wave equation can be identified with the wave velocity. The above relation shows that $v = \pm v_\phi$, because waves can go in either direction.
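The same check can be made numerically for the example pulse used above. The following is a minimal Python sketch (not part of the original notebooks): it estimates both second derivatives of $\zeta(x,t) = f(x - vt)$ by central differences and evaluates the difference between the two sides of the one-dimensional wave equation. The sample point and step size are arbitrary choices for illustration.

```python
def pulse(u):
    # The example profile f(u) = 1 / (1 + u^2) used in the text.
    return 1.0 / (1.0 + u * u)


def zeta(x, t, v=1.2):
    # A disturbance travelling in the +x direction at speed v.
    return pulse(x - v * t)


def wave_residual(x, t, v_phase, h=1e-3):
    """d2(zeta)/dx2 - (1 / v_phase^2) d2(zeta)/dt2 by central differences."""
    d2_dx2 = (zeta(x + h, t) - 2 * zeta(x, t) + zeta(x - h, t)) / h ** 2
    d2_dt2 = (zeta(x, t + h) - 2 * zeta(x, t) + zeta(x, t - h)) / h ** 2
    return d2_dx2 - d2_dt2 / v_phase ** 2


matched = wave_residual(3.0, 1.0, v_phase=1.2)      # v_phase equals the pulse speed
mismatched = wave_residual(3.0, 1.0, v_phase=2.0)   # wrong speed in the equation
print(matched)      # close to zero
print(mismatched)   # clearly non-zero
```

The residual vanishes (to within discretisation error) only when the constant in the wave equation equals the speed of the pulse.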
6.4.2 Linearity and superposition
The next important property of the wave equation is its linearity, which allows one to superpose solutions. This means that given any two solutions of the wave equation, their sum (or in general any linear combination) is also a solution. The physical consequence of this property is that two beams of light do not affect each other even where they cross. Linearity is an extremely useful property, and although nonlinear equations show more interesting effects, they are often harder to deal with. Sound waves are linear at typical strengths; when they become non-linear they turn into shock waves. Ocean waves are approximately linear when the depth of the water is much larger than their height. However, as they approach the shore this is no longer the case, and the top of the wave curls over and the wave breaks. The behaviour in this zone is highly nonlinear.

To prove the property of superposition for the wave equation, suppose that we have two solutions $\zeta_1$ and $\zeta_2$ that satisfy the wave equation. That is,
$$\nabla^2\zeta_1 = \frac{1}{v_\phi^2}\frac{\partial^2\zeta_1}{\partial t^2}, \qquad \nabla^2\zeta_2 = \frac{1}{v_\phi^2}\frac{\partial^2\zeta_2}{\partial t^2}.$$
Then their sum $\zeta_3 = \zeta_1 + \zeta_2$ satisfies
$$\nabla^2\zeta_3 = \nabla^2(\zeta_1 + \zeta_2) = \nabla^2\zeta_1 + \nabla^2\zeta_2 = \frac{1}{v_\phi^2}\frac{\partial^2\zeta_1}{\partial t^2} + \frac{1}{v_\phi^2}\frac{\partial^2\zeta_2}{\partial t^2} = \frac{1}{v_\phi^2}\frac{\partial^2(\zeta_1 + \zeta_2)}{\partial t^2} = \frac{1}{v_\phi^2}\frac{\partial^2\zeta_3}{\partial t^2}.$$
If there were any non-linear terms such as $\zeta^2$, the above proof would break down. Hence the close connection between linearity and superposition.
6.4.3 Plane waves
We saw above that $\zeta(x, t) = \zeta_0 f(kx - \omega t)$ is a solution of the wave equation corresponding to a wave travelling at $v_\phi$ in the x-direction. More generally, $\zeta(\mathbf{r}, t) = \zeta_0 f(\mathbf{k}\cdot\mathbf{r} - \omega t)$ is a solution. To confirm that this is a possible solution, we write this equation out explicitly
ζ[x_, y_, z_, t_] = ζ0 f[a x + b y + c z - ω t];
and compute the left-hand side
D[ζ[x, y, z, t], {x, 2}] + D[ζ[x, y, z, t], {y, 2}] + D[ζ[x, y, z, t], {z, 2}] // Factor
(a^2 + b^2 + c^2) ζ0 f''[a x + b y + c z - t ω]
where a, b, and c are the components of the vector $\mathbf{k} = (a, b, c)$, which has magnitude $k$. The right-hand side of the wave equation is $(\omega^2/v_\phi^2)\,\zeta_0 f''$, so the wave equation is satisfied provided $k^2 = \omega^2/v_\phi^2$. Then we have the important (dispersion) relation
$$v_\phi = \frac{\omega}{k}. \qquad (6.4)$$
The function $f(ax + by + cz - \omega t)$ is constant for x, y, z, and t which satisfy $ax + by + cz - \omega t = \text{constant}$. At a fixed instant of time, therefore, $ax + by + cz = \mathbf{k}\cdot\mathbf{r} = \text{constant}$.
This is the equation of a plane in three dimensions, and so the function $f$ describes plane waves. All physical quantities related to the wave are constant over these planes. The vector $\mathbf{k}$ is perpendicular to the planes of constant phase and therefore is parallel to the direction of the waves. It is called the wave vector. As time changes, the wave-fronts move. We can calculate the velocity at which they move from consideration of the argument of $f$, more usually called the phase, which we denote by $\psi = \mathbf{k}\cdot\mathbf{r} - \omega t$. Differentiating with respect to time, with $\psi$ constant, and setting $\mathbf{v} = d\mathbf{r}/dt$, we obtain $\mathbf{k}\cdot\mathbf{v} = \omega$; it is only movement perpendicular to the wavefronts that is of interest. The wave vector $\mathbf{k}$ is perpendicular to the wavefronts, and so we can set $\mathbf{v} = v_\phi\,\hat{\mathbf{k}}$, whereby we obtain
$$v_\phi = \frac{\omega}{k},$$
which we have already seen when we proved that the plane wave solution satisfied the wave equation. This shows that the vf in the wave equation is the velocity at which the wavefronts move in the direction of the wave vector k, as expected. Thus our 1D results extend naturally to 3D through the wave vector. The wave vector k points in the direction of motion of the wavefronts and for all the waves that we will consider, this coincides with the direction that a beam of radiation (i.e., the energy) travels. Rather surprisingly perhaps, there are waves for which this is not the case (e.g., light in birefringent crystals) but we will not consider these in this course.
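6.4.4 Harmonic plane waves
A particularly important solution to the wave equation is the harmonic plane wave form
$$\zeta(\mathbf{r}, t) = \zeta_0\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)} = \zeta_0\left(\cos(\mathbf{k}\cdot\mathbf{r} - \omega t) + i\sin(\mathbf{k}\cdot\mathbf{r} - \omega t)\right).$$
Applying the operators $\partial/\partial t$ and $\nabla$ gives
$$\frac{\partial}{\partial t}\zeta(\mathbf{r}, t) = -i\omega\,\zeta(\mathbf{r}, t), \qquad \nabla\zeta(\mathbf{r}, t) = i\mathbf{k}\,\zeta(\mathbf{r}, t),$$
along with analogous results for vector fields. In other words, application of the differential operator $\partial/\partial t$ is equivalent to multiplication by $-i\omega$, and application of $\nabla$ is equivalent to multiplication by $i\mathbf{k}$. For example, $\nabla\cdot\mathbf{E} = 0$ becomes $i\mathbf{k}\cdot\mathbf{E} = 0$, or $\mathbf{k}\cdot\mathbf{E} = 0$. Compare with the quantum mechanical momentum operator, $\mathbf{p} = -i\hbar\nabla$.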
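In this course we will only consider electromagnetic waves in vacuum and restrict attention to the situation where there are no free currents ($\mathbf{J} = \mathbf{0}$) or charges ($\rho = 0$). Hence Maxwell's equations become
$$\nabla\cdot\mathbf{E} = 0, \qquad \nabla\cdot\mathbf{B} = 0, \qquad \nabla\times\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \nabla\times\mathbf{B} = \mu_0\epsilon_0\frac{\partial\mathbf{E}}{\partial t}. \qquad (6.5)$$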
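The first pair of equations restricts the variety of possible fields. The standard way to proceed with such pairs of coupled differential equations is to take the derivative of one of them and use the other to eliminate one or other of the fields. In this case we take the curl of Ampère's Law, because this leads to a $\nabla\times\mathbf{E}$ term which can be eliminated using Faraday's Law:
$$\nabla\times(\nabla\times\mathbf{B}) = \mu_0\epsilon_0\,\nabla\times\frac{\partial\mathbf{E}}{\partial t}.$$
The left-hand side can be rearranged using a second vector field identity, $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$. Using the commutativity of $\nabla\times$ and $\partial/\partial t$ on the right-hand side we obtain
$$\nabla(\nabla\cdot\mathbf{B}) - \nabla^2\mathbf{B} = \mu_0\epsilon_0\frac{\partial\,(\nabla\times\mathbf{E})}{\partial t}.$$
Substituting from Equations 6.5 for $\nabla\cdot\mathbf{B}$ and $\nabla\times\mathbf{E}$ gives $-\nabla^2\mathbf{B} = -\mu_0\epsilon_0\,\partial^2\mathbf{B}/\partial t^2$, and we then have
$$\nabla^2\mathbf{B} = \mu_0\epsilon_0\frac{\partial^2\mathbf{B}}{\partial t^2}.$$
Similarly, one finds that
$$\nabla^2\mathbf{E} = \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2}.$$
This equation has the form of a three-dimensional (vector) wave equation for the field $\mathbf{E}$, with phase velocity
$$v_\phi = \frac{1}{\sqrt{\mu_0\epsilon_0}}.$$
Since $\mu_0 = 4\pi\times10^{-7}\ \mathrm{H\,m^{-1}}$ and $\epsilon_0 = 8.854\times10^{-12}\ \mathrm{F\,m^{-1}}$, we obtain $v_\phi = 2.998\times10^{8}\ \mathrm{m\,s^{-1}}$, which is equal to the speed of light, $c$. The clear implication is that light is itself an electromagnetic wave. This is an early instance of the unification of seemingly separate branches of physics, in this case electromagnetism and optics, and one of the triumphs of 19th century physics.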
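The arithmetic behind the phase velocity is a one-line calculation. A minimal Python sketch (not part of the original notebooks), using rounded values of the two constants:

```python
import math

MU0 = 4e-7 * math.pi   # permeability of free space, H/m (classical value)
EPS0 = 8.854e-12       # permittivity of free space, F/m (rounded)

# Phase velocity of electromagnetic waves in vacuum.
v_phase = 1.0 / math.sqrt(MU0 * EPS0)
print(v_phase)  # about 3.0e8 m/s
```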
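Writing the fields as harmonic plane waves,
$$\mathbf{E}(\mathbf{r}, t) = \mathbf{E}_0\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)}, \qquad \mathbf{B}(\mathbf{r}, t) = \mathbf{B}_0\,e^{i(\mathbf{k}\cdot\mathbf{r} - \omega t)},$$
we can simplify the equations, translating them from vector differential equations to plain vector equations:
$$\mathbf{k}\cdot\mathbf{E} = 0, \qquad \mathbf{k}\cdot\mathbf{B} = 0, \qquad \mathbf{k}\times\mathbf{E} = \omega\,\mathbf{B}, \qquad \mathbf{k}\times\mathbf{B} = -\mu_0\epsilon_0\,\omega\,\mathbf{E}. \qquad (6.6)$$
The first two equations show that $\mathbf{E}$ and $\mathbf{B}$ are perpendicular to the wave vector $\mathbf{k}$. Since $\mathbf{k}$ points in the direction of the wave, this means that electromagnetic waves are transverse waves. In general a vector in 3D has three degrees of freedom. The condition that $\mathbf{E}$ must be perpendicular to $\mathbf{k}$ reduces this to two degrees of freedom. Physically this corresponds to the two polarisations that light can be split into. The other two equations relate $\mathbf{E}$ and $\mathbf{B}$. It is normal to regard the electric field as the one which defines the wave; for example, the direction in which it points defines the polarisation of the wave. Thus it is convenient to use $\mathbf{k}\times\mathbf{E} = \omega\mathbf{B}$ to obtain the magnetic field strength. This equation shows that $\mathbf{B}$ is perpendicular to $\mathbf{E}$, and so we have found the property of electromagnetic waves that $\mathbf{E}$, $\mathbf{B}$, and $\mathbf{k}$ are mutually perpendicular. Since $\mathbf{k}$ and $\mathbf{E}$ are perpendicular, in terms of magnitudes we have $kE = \omega B$, and therefore
$$E = \frac{\omega}{k}\,B = v_\phi B$$
by Eq. 6.4. For waves in a vacuum $v_\phi = c$, and so $B = E/c$. The final equation, $\mathbf{k}\times\mathbf{B} = -\mu_0\epsilon_0\,\omega\,\mathbf{E}$, tells us nothing new, since with $\mu_0\epsilon_0 = 1/c^2$ it also reduces to $B = E/c$.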
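These relations can be checked for a concrete wave. The following is a minimal Python sketch (not part of the original notebooks): it builds the magnetic amplitude from $\mathbf{k}\times\mathbf{E}_0 = \omega\mathbf{B}_0$ and verifies that the fields are transverse, mutually perpendicular, and related by $B = E/c$. The wavelength, polarisation and field amplitude are hypothetical values chosen for illustration.

```python
import math

C = 2.998e8  # speed of light in vacuum, m/s (rounded)


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def norm(a):
    return math.sqrt(dot(a, a))


wavelength = 500e-9                        # hypothetical: visible light
k = (0.0, 0.0, 2 * math.pi / wavelength)   # wave travelling along +z
omega = C * norm(k)                        # dispersion relation, omega = c k
E0 = (100.0, 0.0, 0.0)                     # hypothetical amplitude (V/m), polarised along x

# Magnetic amplitude from k x E0 = omega B0.
B0 = tuple(component / omega for component in cross(k, E0))

print(B0)                      # points along +y
print(norm(E0) / norm(B0))     # equals c
```

The remaining equation, $\mathbf{k}\times\mathbf{B} = -\mu_0\epsilon_0\,\omega\,\mathbf{E}$, can be confirmed in the same way using $\mu_0\epsilon_0 = 1/c^2$.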
6.7 Summary
In this chapter we showed that Ampère's law, $\nabla\times\mathbf{B} = \mu_0\mathbf{J}$, fails to satisfy charge conservation, and introduced a new term, the displacement current, in order to correct it. Thus we obtained
$$\nabla\times\mathbf{B} = \mu_0\left(\mathbf{J} + \epsilon_0\frac{\partial\mathbf{E}}{\partial t}\right).$$
This gave us our final versions of Maxwell's equations. We then studied general properties of the wave equation, and in particular plane waves of the form $\zeta(\mathbf{r}, t) = \zeta_0 f(\mathbf{k}\cdot\mathbf{r} - \omega t)$. We showed that $\mathbf{k}$ is a vector pointing in the direction of the wave and that the phase or wave velocity $v_\phi$ was given by
$$v_\phi = \frac{\omega}{k}.$$
We then showed that Maxwell's equations in free space lead to a 3D wave equation
$$\nabla^2\mathbf{E} = \mu_0\epsilon_0\frac{\partial^2\mathbf{E}}{\partial t^2}.$$
The complex exponential form of the wave allowed us to substitute $i\mathbf{k}$ for $\nabla$ and $-i\omega$ for $\partial/\partial t$ in all of Maxwell's equations, and therefore derive relations between the fields and $\mathbf{k}$; we showed that $\mathbf{E}$, $\mathbf{B}$, and $\mathbf{k}$ were mutually perpendicular.
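$\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$
There are elegant methods of proving that $\nabla\times(\nabla\times\mathbf{A}) = \nabla(\nabla\cdot\mathbf{A}) - \nabla^2\mathbf{A}$, but these are outside the scope of this course. An inelegant but straightforward proof is to examine each component. The x-component of the left-hand side reads
$$\frac{\partial}{\partial y}\left(\frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y}\right) - \frac{\partial}{\partial z}\left(\frac{\partial A_x}{\partial z} - \frac{\partial A_z}{\partial x}\right) = -\frac{\partial^2 A_x}{\partial z^2} + \frac{\partial^2 A_z}{\partial x\,\partial z} - \frac{\partial^2 A_x}{\partial y^2} + \frac{\partial^2 A_y}{\partial x\,\partial y},$$
while the x-component of the right-hand side reads
$$\frac{\partial}{\partial x}\left(\frac{\partial A_x}{\partial x} + \frac{\partial A_y}{\partial y} + \frac{\partial A_z}{\partial z}\right) - \left(\frac{\partial^2 A_x}{\partial x^2} + \frac{\partial^2 A_x}{\partial y^2} + \frac{\partial^2 A_x}{\partial z^2}\right) = -\frac{\partial^2 A_x}{\partial z^2} - \frac{\partial^2 A_x}{\partial y^2} + \frac{\partial^2 A_z}{\partial x\,\partial z} + \frac{\partial^2 A_y}{\partial x\,\partial y}.$$
These two results are identical:
% == %%
True
You can either compute the other two components or argue that x, y, and z are just labels which can be permuted at will (which implies that proving the identity for one component proves it for all components). Note that $\nabla^2\mathbf{A}$ consists of a scalar operator ($\nabla^2$) applied to a vector ($\mathbf{A}$), resulting in a vector.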
Appendix A
Differential Operators
A.1 Definitions
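When parameterizing the position of a point in 3-space, the particular coordinate system which is most useful depends on the symmetry of the physical or geometric system at hand. All coordinate systems can be derived from the Cartesian system by a particular (non-linear) transformation. If $\mathbf{x} = \sum_j \mathbf{e}_j x_j$ is a particular geometric point referred to a rectangular frame of reference, the same point may also be described by coordinates $q_i$ derived from the transformation
$$x_i = x_i(q_j), \quad i, j = 1, 2, 3. \qquad (1.1)$$
To obtain the transformation between different coordinate systems, one needs to compute the partial derivatives $\partial x_i/\partial q_j$. The matrix of partial derivatives (the linear map $D\mathbf{x}(\mathbf{q})$) is known as the Jacobian matrix (of $\mathbf{x}$):
$$D\mathbf{x}(\mathbf{q}) \equiv \begin{pmatrix} \dfrac{\partial x_1}{\partial q_1} & \dfrac{\partial x_1}{\partial q_2} & \dfrac{\partial x_1}{\partial q_3} \\[2ex] \dfrac{\partial x_2}{\partial q_1} & \dfrac{\partial x_2}{\partial q_2} & \dfrac{\partial x_2}{\partial q_3} \\[2ex] \dfrac{\partial x_3}{\partial q_1} & \dfrac{\partial x_3}{\partial q_2} & \dfrac{\partial x_3}{\partial q_3} \end{pmatrix} \qquad (1.2)$$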
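For an orthogonal coordinate system the columns of the Jacobian matrix are mutually perpendicular,
$$\sum_{i=1}^{3}\frac{\partial x_i}{\partial q_j}\frac{\partial x_i}{\partial q_k} = 0, \quad k \neq j. \qquad (1.3)$$
The scale factors $h_j$ are given by
$$h_j^2 = \sum_{i=1}^{3}\left(\frac{\partial x_i}{\partial q_j}\right)^2, \quad j = 1, 2, 3, \qquad (1.4)$$
so that
$$\left(D\mathbf{x}(\mathbf{q})\right)^{T} D\mathbf{x}(\mathbf{q}) = \begin{pmatrix} h_1^2 & 0 & 0 \\ 0 & h_2^2 & 0 \\ 0 & 0 & h_3^2 \end{pmatrix} \qquad (1.5)$$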
(1.7)
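$$D\mathbf{f}(\mathbf{x}) \equiv \left(\frac{\partial f_i}{\partial x_j}\right) = \begin{pmatrix} \dfrac{\partial f_1}{\partial x_1} & \cdots & \dfrac{\partial f_1}{\partial x_n} \\[2ex] \dfrac{\partial f_2}{\partial x_1} & \cdots & \dfrac{\partial f_2}{\partial x_n} \\ \vdots & & \vdots \\ \dfrac{\partial f_m}{\partial x_1} & \dfrac{\partial f_m}{\partial x_2}\ \cdots & \dfrac{\partial f_m}{\partial x_n} \end{pmatrix} \qquad (1.8)$$
$$D^k f(\mathbf{x})\,(\mathbf{y}-\mathbf{x}, \ldots, \mathbf{y}-\mathbf{x}) = \sum_{i_1,\ldots,i_k=1}^{n}\frac{\partial^k f}{\partial x_{i_1}\cdots\partial x_{i_k}}\,(y_{i_1}-x_{i_1})\cdots(y_{i_k}-x_{i_k}) \qquad (1.9)$$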
(1.10)
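Divergence:
$$\nabla\cdot\mathbf{F} = \frac{1}{J_{\mathbf{x}}(\mathbf{q})}\left[\frac{\partial}{\partial q_1}\left(\frac{J_{\mathbf{x}}(\mathbf{q})}{h_1}F_1\right) + \frac{\partial}{\partial q_2}\left(\frac{J_{\mathbf{x}}(\mathbf{q})}{h_2}F_2\right) + \frac{\partial}{\partial q_3}\left(\frac{J_{\mathbf{x}}(\mathbf{q})}{h_3}F_3\right)\right] \qquad (1.12)$$
Laplacian:
$$\nabla^2 f = \frac{1}{J_{\mathbf{x}}(\mathbf{q})}\left[\frac{\partial}{\partial q_1}\left(\frac{J_{\mathbf{x}}(\mathbf{q})}{h_1^2}\frac{\partial f}{\partial q_1}\right) + \frac{\partial}{\partial q_2}\left(\frac{J_{\mathbf{x}}(\mathbf{q})}{h_2^2}\frac{\partial f}{\partial q_2}\right) + \frac{\partial}{\partial q_3}\left(\frac{J_{\mathbf{x}}(\mathbf{q})}{h_3^2}\frac{\partial f}{\partial q_3}\right)\right] \qquad (1.13)$$
Curl:
$$\nabla\times\mathbf{F} = \frac{1}{J_{\mathbf{x}}(\mathbf{q})}\begin{vmatrix} h_1\mathbf{e}_1 & h_2\mathbf{e}_2 & h_3\mathbf{e}_3 \\[1ex] \dfrac{\partial}{\partial q_1} & \dfrac{\partial}{\partial q_2} & \dfrac{\partial}{\partial q_3} \\[2ex] h_1 F_1 & h_2 F_2 & h_3 F_3 \end{vmatrix} \qquad (1.14)$$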
A.2 Examples
A.2.1 Spherical Polar Coordinates
A.2.1.1 Definition
$$x = r\cos\phi\,\sin\theta, \qquad y = r\sin\phi\,\sin\theta, \qquad z = r\cos\theta \qquad (1.15)$$
Note that care must be taken when inverting these relations. For example, $\phi \neq \tan^{-1}(y/x)$ in general, though you will see this statement appearing regularly in textbooks.
A.2.1.2 Jacobian Matrix
The Jacobian matrix is
$$\begin{pmatrix} \cos\phi\,\sin\theta & r\cos\theta\,\cos\phi & -r\sin\theta\,\sin\phi \\ \sin\theta\,\sin\phi & r\cos\theta\,\sin\phi & r\cos\phi\,\sin\theta \\ \cos\theta & -r\sin\theta & 0 \end{pmatrix}$$
This matrix can also be generated directly by taking the outer product of the partial derivative (D) with the coordinate vectors (it is stored here under the name jac):
jac = Outer[D, {r Cos[φ] Sin[θ], r Sin[φ] Sin[θ], r Cos[θ]}, {r, θ, φ}]
{{Cos[φ] Sin[θ], r Cos[θ] Cos[φ], -r Sin[θ] Sin[φ]}, {Sin[θ] Sin[φ], r Cos[θ] Sin[φ], r Cos[φ] Sin[θ]}, {Cos[θ], -r Sin[θ], 0}}
A.2.1.3 Jacobian Determinant
The Jacobian determinant is needed when computing the volume element:
Det[jac] // Simplify
r^2 Sin[θ]
A.2.1.4 Scale Factors
Computing
Transpose[jac].jac // Simplify
{{1, 0, 0}, {0, r^2, 0}, {0, 0, r^2 Sin[θ]^2}}
we see that spherical polar coordinates are an orthogonal coordinate system. The diagonal entries,
Transpose[%, {1, 1}]
{1, r^2, r^2 Sin[θ]^2}
lead to the scale factors:
{hr, hθ, hφ} = Sqrt[%] // PowerExpand
{1, r, r Sin[θ]}
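The orthogonality check and the scale factors can also be reproduced numerically, without symbolic algebra. The following is a minimal Python sketch (not part of the original notebooks): it approximates the Jacobian columns of the spherical polar transformation by central differences and forms the matrix of their dot products. The sample point is an arbitrary choice for illustration.

```python
import math


def to_cartesian(r, theta, phi):
    """Spherical polar coordinates (r, theta, phi) to Cartesian (x, y, z)."""
    return (r * math.sin(theta) * math.cos(phi),
            r * math.sin(theta) * math.sin(phi),
            r * math.cos(theta))


def jacobian_columns(q, h=1e-6):
    """The three columns dx/dq_j of the Jacobian, by central differences."""
    columns = []
    for j in range(3):
        up, down = list(q), list(q)
        up[j] += h
        down[j] -= h
        x_up, x_down = to_cartesian(*up), to_cartesian(*down)
        columns.append([(a - b) / (2 * h) for a, b in zip(x_up, x_down)])
    return columns


def gram_matrix(q):
    """Matrix of dot products of the Jacobian columns (Jacobian^T . Jacobian)."""
    cols = jacobian_columns(q)
    return [[sum(a * b for a, b in zip(cols[j], cols[k])) for k in range(3)]
            for j in range(3)]


q = (2.0, 0.7, 1.3)   # hypothetical sample point (r, theta, phi)
G = gram_matrix(q)
scale_factors = [math.sqrt(G[j][j]) for j in range(3)]
print(scale_factors)  # compare with (1, r, r sin(theta))
```

The off-diagonal entries of `G` vanish to within rounding error, which is the numerical counterpart of the orthogonality condition.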
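A.2.1.5 Differential Operators
From the definitions, we find that
Gradient:
$$\nabla f = \frac{\partial f}{\partial r}\,\mathbf{e}_r + \frac{1}{r}\frac{\partial f}{\partial\theta}\,\mathbf{e}_\theta + \frac{1}{r\sin\theta}\frac{\partial f}{\partial\phi}\,\mathbf{e}_\phi \qquad (1.16)$$
Divergence:
$$\nabla\cdot\mathbf{F} = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2 F_r\right) + \frac{1}{r\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,F_\theta\right) + \frac{1}{r\sin\theta}\frac{\partial F_\phi}{\partial\phi} \qquad (1.17)$$
Laplacian:
$$\nabla^2 f = \frac{1}{r^2}\frac{\partial}{\partial r}\left(r^2\frac{\partial f}{\partial r}\right) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial\theta}\left(\sin\theta\,\frac{\partial f}{\partial\theta}\right) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2 f}{\partial\phi^2} \qquad (1.18)$$
Curl:
$$\nabla\times\mathbf{F} = \frac{1}{r\sin\theta}\left[\frac{\partial}{\partial\theta}\left(\sin\theta\,F_\phi\right) - \frac{\partial F_\theta}{\partial\phi}\right]\mathbf{e}_r + \frac{1}{r}\left[\frac{1}{\sin\theta}\frac{\partial F_r}{\partial\phi} - \frac{\partial}{\partial r}\left(r F_\phi\right)\right]\mathbf{e}_\theta + \frac{1}{r}\left[\frac{\partial}{\partial r}\left(r F_\theta\right) - \frac{\partial F_r}{\partial\theta}\right]\mathbf{e}_\phi \qquad (1.19)$$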
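A.2.2 Cylindrical Polar Coordinates
A.2.2.1 Definition
$$x = \rho\cos\phi, \qquad y = \rho\sin\phi, \qquad z = z \qquad (1.20)$$
Note that care must be taken when inverting these relations. For example, $\phi \neq \tan^{-1}(y/x)$ in general, though you will see this statement appearing regularly in textbooks. Instead,
$$\phi = 2\tan^{-1}\left(\frac{y}{x + \rho}\right)$$
is correct, since $\tan(\phi/2) = y/(x + \rho)$. Alternatively, you can use $\phi = \tan^{-1}(x, y)$ (ArcTan[x, y] in Mathematica), a special form of $\tan^{-1}$ which gives the arc tangent of $y/x$, taking into account which quadrant the point $(x, y)$ is in.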
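The difference between the naive inverse and the quadrant-aware forms is easy to see numerically. The following is a minimal Python sketch (not part of the original notebooks) comparing $\tan^{-1}(y/x)$, the half-angle formula, and the two-argument arc tangent for one point in each quadrant. Note that Python's `math.atan2` takes its arguments in the order (y, x), the opposite of Mathematica's ArcTan[x, y]; the helper name `azimuth_half_angle` is an illustrative assumption.

```python
import math


def azimuth_half_angle(x, y):
    """Azimuthal angle from the half-angle formula phi = 2 atan(y / (x + rho)).
    Not defined on the negative x-axis, where x + rho = 0."""
    rho = math.hypot(x, y)
    return 2.0 * math.atan(y / (x + rho))


# One sample point in each quadrant.
points = [(1.0, 1.0), (-1.0, 1.0), (-1.0, -1.0), (1.0, -1.0)]
for x, y in points:
    naive = math.atan(y / x)            # loses the quadrant information
    half_angle = azimuth_half_angle(x, y)
    two_argument = math.atan2(y, x)     # quadrant-aware arc tangent
    print((x, y), naive, half_angle, two_argument)
```

The naive form agrees with the other two only for x > 0; in the second and third quadrants it is off by π.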
A.2.2.2 Jacobian Matrix
We generate the Jacobian matrix directly by taking the outer product of the partial derivative (D) with the coordinate vectors (it is stored here under the name jac):
jac = Outer[D, {ρ Cos[φ], ρ Sin[φ], z}, {ρ, φ, z}]
{{Cos[φ], -ρ Sin[φ], 0}, {Sin[φ], ρ Cos[φ], 0}, {0, 0, 1}}
A.2.2.3 Jacobian Determinant
The Jacobian determinant is needed when computing the volume element:
Det[jac] // Simplify
ρ
A.2.2.4 Scale Factors
Computing
Transpose[jac].jac // Simplify
{{1, 0, 0}, {0, ρ^2, 0}, {0, 0, 1}}
we see that cylindrical polar coordinates are an orthogonal coordinate system. The diagonal entries,
Transpose[%, {1, 1}]
{1, ρ^2, 1}
lead to the scale factors:
{hρ, hφ, hz} = Sqrt[%] // PowerExpand
{1, ρ, 1}
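A.2.2.5 Differential Operators
From the definitions, we find that
Gradient:
$$\nabla f = \frac{\partial f}{\partial\rho}\,\mathbf{e}_\rho + \frac{1}{\rho}\frac{\partial f}{\partial\phi}\,\mathbf{e}_\phi + \frac{\partial f}{\partial z}\,\mathbf{e}_z \qquad (1.21)$$
Divergence:
$$\nabla\cdot\mathbf{F} = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho F_\rho\right) + \frac{1}{\rho}\frac{\partial F_\phi}{\partial\phi} + \frac{\partial F_z}{\partial z} \qquad (1.22)$$
Laplacian:
$$\nabla^2 f = \frac{1}{\rho}\frac{\partial}{\partial\rho}\left(\rho\frac{\partial f}{\partial\rho}\right) + \frac{1}{\rho^2}\frac{\partial^2 f}{\partial\phi^2} + \frac{\partial^2 f}{\partial z^2} \qquad (1.23)$$
Curl:
$$\nabla\times\mathbf{F} = \frac{1}{\rho}\begin{vmatrix} \mathbf{e}_\rho & \rho\,\mathbf{e}_\phi & \mathbf{e}_z \\[1ex] \dfrac{\partial}{\partial\rho} & \dfrac{\partial}{\partial\phi} & \dfrac{\partial}{\partial z} \\[2ex] F_\rho & \rho F_\phi & F_z \end{vmatrix} = \left(\frac{1}{\rho}\frac{\partial F_z}{\partial\phi} - \frac{\partial F_\phi}{\partial z}\right)\mathbf{e}_\rho + \left(\frac{\partial F_\rho}{\partial z} - \frac{\partial F_z}{\partial\rho}\right)\mathbf{e}_\phi + \frac{1}{\rho}\left(\frac{\partial}{\partial\rho}\left(\rho F_\phi\right) - \frac{\partial F_\rho}{\partial\phi}\right)\mathbf{e}_z \qquad (1.24)$$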
Appendix B
Dirac's Delta Function $\delta(x)$
B.1 Examples
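Consider the (piecewise continuous) function
$$\theta_\epsilon(x) := \begin{cases} -\tfrac{1}{2}, & x < -\epsilon, \\[1ex] \dfrac{x}{2\epsilon}, & -\epsilon \le x \le \epsilon, \\[1ex] \tfrac{1}{2}, & x > \epsilon. \end{cases}$$
[Plot: $\theta_\epsilon(x)$ for x between -4 and 4, rising linearly from -1/2 to 1/2 across the interval from $-\epsilon$ to $\epsilon$.]
Its derivative, $\delta_\epsilon(x) = \theta_\epsilon'(x)$, equals $1/(2\epsilon)$ for $|x| < \epsilon$ and zero otherwise.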
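Plot[{Subscript[δ, 2][x], Subscript[δ, 1][x], Subscript[δ, 1/2][x]}, {x, -4, 4}, Exclusions -> None, PlotRange -> All]
[Plot: $\delta_2(x)$, $\delta_1(x)$ and $\delta_{1/2}(x)$ for x between -4 and 4; the rectangles become narrower and taller as $\epsilon$ decreases.]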
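$$\int_{-5}^{5}\delta_1(x)\,dx = 1, \qquad \int_{-5}^{5}\delta_{1/2}(x)\,dx = 1.$$
In other words
$$\int_{-\infty}^{\infty}\delta_\epsilon(x)\,dx = 1, \quad \forall\,\epsilon \in \mathbb{R}^{+}.$$
Indeed,
$$\int_{-\infty}^{\infty}\delta_\epsilon(x)\,dx = \int_{-\infty}^{\infty}\theta_\epsilon'(x)\,dx = \theta_\epsilon(x)\Big|_{-\infty}^{\infty} = \frac{1}{2} - \left(-\frac{1}{2}\right) = 1, \quad \forall\,\epsilon \in \mathbb{R}^{+}.$$
Consider now the integral
$$\int_{-\infty}^{\infty} f(x)\,\delta_\epsilon(x)\,dx,$$
where $f(x)$ is an arbitrary function which goes to 0 "sufficiently fast" as $x \to \pm\infty$. For $\epsilon$ "sufficiently small", we can write
$$\int_{-\infty}^{\infty} f(x)\,\delta_\epsilon(x)\,dx = \frac{1}{2\epsilon}\int_{-\epsilon}^{\epsilon} f(x)\,dx \simeq \frac{1}{2\epsilon}\left(2\epsilon\,f(0)\right) = f(0),$$
using the Mean Value Theorem. As $\epsilon \to 0$, $\delta_\epsilon(x) \to \delta(x)$, and $\delta(x)$ has the interesting properties that
$$\int_{-\infty}^{\infty}\delta(x)\,dx = 1, \qquad \int_{-\infty}^{\infty} f(x)\,\delta(x)\,dx = f(0).$$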
B.2 Definition
The symbol $\delta(x)$ is not a function in the usual mathematical sense. A function in one dimension is a mapping between ordered pairs, $x \to y = f(x)$. In the case of the symbol $\delta(x)$, any such mapping carries every point x on the real axis, save one, into the number zero. This is hardly a well-behaved function. Nevertheless, it can be treated symbolically as though it shared most properties of ordinary smooth functions. I will often treat it as an ordinary function. Our purpose here is to outline this highly useful notation, and not to give a mathematical justification for this use. It is possible to view $\delta(x)$ as representing the symbolic limit of a sequence of suitably defined functions. Imagine, for example, a sequence based on the parameter $\epsilon$ defined by any of the following three functions:
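$$\delta_\epsilon(x) = \frac{1}{\pi}\,\frac{\epsilon}{x^2 + \epsilon^2}$$
Subscript[δ, ε_][x_] := ε/(π (x^2 + ε^2))
[Figure: plot of $\delta_\epsilon(x)$ for $-4 \le x \le 4$.]
$$\int_{-\infty}^{\infty} \delta_\epsilon(x)\,dx = 1, \qquad \lim_{\epsilon \to 0} \int_{-\infty}^{\infty} \frac{1}{1 + x^2}\,\delta_\epsilon(x)\,dx = 1.$$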
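$$\delta_\epsilon(x) = \frac{1}{\pi}\,\frac{\sin(x/\epsilon)}{x}$$
Subscript[δ, ε_][x_] := Sin[x/ε]/(π x)
[Figure: plot of $\delta_\epsilon(x)$ for $-4 \le x \le 4$.]
$$\int_{-\infty}^{\infty} \delta_\epsilon(x)\,dx = 1$$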
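$$\delta_\epsilon(x) = \frac{1}{\sqrt{\pi}}\,\frac{1}{\epsilon}\,e^{-x^2/\epsilon^2}$$
Subscript[δ, ε_][x_] := Exp[-x^2/ε^2]/(Sqrt[π] ε)
[Figure: plot of $\delta_\epsilon(x)$ for $-4 \le x \le 4$.]
$$\int_{-\infty}^{\infty} \delta_\epsilon(x)\,dx = 1$$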
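The unit area of all three sequences can be confirmed in one pass. The following is a sketch under stated assumptions: the names lorentz, sinc and gauss are hypothetical labels for the three functions above, and the curried form name[ε][x] replaces the subscript notation of the notebook.

```mathematica
(* Hypothetical names for the three sequences defined above *)
lorentz[ε_][x_] := ε/(π (x^2 + ε^2))
sinc[ε_][x_] := Sin[x/ε]/(π x)
gauss[ε_][x_] := Exp[-x^2/ε^2]/(Sqrt[π] ε)

(* Area under each curve for positive ε; each should be unity *)
Table[
  Integrate[g[ε][x], {x, -Infinity, Infinity}, Assumptions -> ε > 0],
  {g, {lorentz, sinc, gauss}}
]

(* Sifting with the test function 1/(1 + x^2), followed by the limit ε -> 0 *)
Limit[
  Integrate[lorentz[ε][x]/(1 + x^2), {x, -Infinity, Infinity}, Assumptions -> ε > 0],
  ε -> 0
]
```

For the Lorentzian the sifting integral can be done by hand: it equals $1/(1 + \epsilon)$, which tends to $1$, the value of the test function at $x = 0$.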
B.3 Sequence
Each of these functions, for any $\epsilon \ll 1$, has the properties that (1) it is sharply peaked at $x = 0$; and (2) the area under the curve is unity, independent of $\epsilon$. In short, if one constructs a convergent sequence of $\epsilon$'s, then the quantity $\delta(x) = \lim_{\epsilon \to 0} \delta_\epsilon(x)$, for each of the above functions $\delta_\epsilon(x)$, has the desired properties of the delta "function". By this I mean the following: whenever a delta "function" appears multiplying a smooth function under an integral sign, you should imagine that it is replaced by $\delta_\epsilon(x)$ and the integral evaluated. Then, after integration, the limit of the sequence of the results of integration is taken. This process gives meaning to the delta function. With this idea in mind, I can treat the delta function as if it were itself a smooth function and even write, for example,
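$$\int_{-\infty}^{\infty} \delta'(x)\,f(x)\,dx = -\int_{-\infty}^{\infty} \delta(x)\,f'(x)\,dx = -f'(0),$$
where I have integrated by parts. Other useful results, whose justification is based on such arguments, are these:
$$\delta(a x) = \frac{1}{|a|}\,\delta(x),$$
$$\delta(x^2 - a^2) = \frac{1}{2|a|}\,\bigl[\delta(x - a) + \delta(x + a)\bigr],$$
$$\delta\bigl((x - a)(x - b)\bigr) = \frac{1}{|a - b|}\,\bigl[\delta(x - a) + \delta(x - b)\bigr].$$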
An example of this is
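[Equation not recoverable from the source: an integral over $x$, $y$, $z$ containing $\delta(x - a)$, with exponents $-x - b$ on the left-hand side and $-b - a$ on the right-hand side.]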
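As an illustration of how such results follow from the integral definition, here is a short worked derivation of the scaling rule $\delta(ax) = \delta(x)/|a|$. It assumes $a \neq 0$ and a smooth test function $f$; the substitution variable $u$ is introduced here for the derivation only.

```latex
% Case a > 0: substitute u = a x, so dx = du / a and the limits are unchanged.
\int_{-\infty}^{\infty} f(x)\,\delta(a x)\,dx
  = \frac{1}{a}\int_{-\infty}^{\infty} f\!\left(\frac{u}{a}\right)\delta(u)\,du
  = \frac{f(0)}{a}.

% Case a < 0: the same substitution swaps the limits, which costs a sign.
\int_{-\infty}^{\infty} f(x)\,\delta(a x)\,dx
  = \frac{1}{a}\int_{\infty}^{-\infty} f\!\left(\frac{u}{a}\right)\delta(u)\,du
  = -\frac{f(0)}{a}.

% Both cases combine into
\int_{-\infty}^{\infty} f(x)\,\delta(a x)\,dx
  = \frac{f(0)}{|a|}
  = \int_{-\infty}^{\infty} f(x)\,\frac{\delta(x)}{|a|}\,dx .
```

Since this holds for every test function $f$, the two expressions under the integral sign can be identified. The results for $\delta(x^2 - a^2)$ and $\delta((x - a)(x - b))$ follow in the same way by applying the substitution near each simple zero of the argument.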
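As a further example, consider $h(\mathbf{r}) = \nabla^2 \dfrac{1}{r}$. If $r \neq 0$,
$$\nabla^2 \frac{1}{r} = \nabla \cdot \nabla \frac{1}{r} = -\nabla \cdot \frac{\mathbf{r}}{r^3} = -\left[\frac{1}{r^3}\,\nabla \cdot \mathbf{r} - 3\,\frac{\mathbf{r} \cdot \mathbf{r}}{r^5}\right] = -\left[\frac{3}{r^3} - \frac{3}{r^3}\right] = 0 .$$
Clearly, $h(\mathbf{r})$ is singular at $r = 0$. To investigate its behavior near the singularity, integrate over a small spherical volume $V$ centered at the origin. The divergence theorem
$$\int_V \nabla \cdot \mathbf{A}\; d^3r = \oint_{S = \partial V} \mathbf{A} \cdot d\mathbf{S},$$
gives
$$\int_V \nabla^2 \frac{1}{r}\; d^3r = \int_V \nabla \cdot \left(\nabla \frac{1}{r}\right) d^3r = \oint_{S = \partial V} \left(\nabla \frac{1}{r}\right) \cdot d\mathbf{S} = -\oint_{S = \partial V} \frac{\mathbf{r}}{r^3} \cdot \hat{\mathbf{r}}\; r^2\, d\Omega = -\oint_{S = \partial V} d\Omega = -4\pi .$$
Conclusion: $h(\mathbf{r})$ is zero everywhere except at a single point, namely $r = 0$. There it is infinite, but in such a way that its volume integral over the singularity is $-4\pi$. Therefore, we have the identity
$$-\nabla^2 \frac{1}{r} = \nabla \cdot \frac{\hat{\mathbf{r}}}{r^2} = 4\pi\,\delta(\mathbf{r}) .$$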
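The same conclusion can be reached with a regularized potential, in the spirit of the sequences $\delta_\epsilon$ used earlier. The sketch below is an illustration rather than part of the original notes: it assumes the smoothing $1/r \to 1/\sqrt{r^2 + \epsilon^2}$, and the name reg is hypothetical.

```mathematica
(* Assumed regularization (not from the notes): 1/r -> 1/Sqrt[r^2 + ε^2] *)
reg = Laplacian[1/Sqrt[r^2 + ε^2], {r, θ, φ}, "Spherical"] // Simplify

(* Volume integral over all space; the integrand depends only on r *)
Integrate[4 π r^2 reg, {r, 0, Infinity}, Assumptions -> ε > 0]
```

By hand, the regularized Laplacian is $-3\epsilon^2/(r^2 + \epsilon^2)^{5/2}$, which is finite everywhere, becomes sharply peaked at the origin as $\epsilon \to 0$, and integrates to $-4\pi$ for every $\epsilon > 0$. This is the behavior expected of $-4\pi\,\delta(\mathbf{r})$.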