Acprof 9780199228676 Chapter 2

This document provides an introduction to wave phenomena, focusing on sinusoidal waves and their mathematical representation using complex numbers and Fourier transforms. It discusses the properties of waves, including amplitude, frequency, wavelength, and the principle of superposition, which explains how waves interact. The chapter also covers vector representation of waves and the concept of stationary and traveling waves, illustrating these concepts with various examples and figures.

Uploaded by

byour923
2 Waves, complex numbers and Fourier transforms

The theory of X-ray and neutron scattering relies heavily on the
mathematics of waves. This chapter provides a tutorial introduction
to the basic physical concepts, and the associated analytical tools,
needed for an understanding of wave phenomena.

Downloaded from https://academic.oup.com/book/7397/chapter/152238668 by guest on 13 February 2025

2.1 Sinusoidal waves


An everyday description of a wave would be a ‘wiggle’, or something
that goes up-and-down as you move forward. The progression of the
fluctuations could refer to changes in ‘height’ with respect to position
at a fixed time, or with respect to time at a fixed position. Several
examples of geometrical waves are shown in Fig. 2.1; they are un-
usual in that they have points where there are abrupt changes in
the value of the function or its gradient. What they have in common
with the more familiar sinusoidal variation of Fig. 2.2 is a regularly
repeating pattern.

Fig. 2.1 Geometric examples of waves: square, triangular and exponential.

The sine and cosine curves of Fig. 2.2 are regarded as the archety-
pal waves, as they occur in many elementary physical situations; for
example, the vibrations of an elastic string. Their smooth character-
istics also make them amenable to analytical manipulation. An easy
way of visualizing sinusoidal variations is to think about the projec-
tion of a circular motion onto the horizontal and vertical axes, as
illustrated in Fig. 2.3.

Fig. 2.2 The sinusoidal curves, or waves, ψ = A sin θ and ψ = A cos θ.

Fig. 2.3 The generation of sinusoidal variations through circular motion.

Elementary Scattering Theory, First Edition, D.S. Sivia, © D.S. Sivia 2011. Published in 2011
by Oxford University Press.

The two curves shown in Fig. 2.2 are identical apart from a lateral
shift of π/2 radians, or 90°: cos θ = sin(θ + π/2). Hence, the general
expression for a function of this type is

ψ = A sin(θ + φ) , (2.1)

where A is the amplitude of the wave and the angle φ, or the phase,
controls its horizontal displacement with respect to sin θ. If the θ in
eqn (2.1) varies linearly with position x, so that θ = k x where k is a
constant, then we obtain a sinusoidal variation with respect to this
physical coordinate:


ψ = A sin(k x + φ) . (2.2)

Since the sine curve cycles around every 2π radians, the corresponding
repeat distance, or wavelength λ, can be found from

k = 2π/λ . (2.3)

The constant k is called the wavenumber and has SI units of rad m⁻¹.
Note that, as mentioned in Section 1.5, spectroscopists use the same
term for 1/λ given in cm⁻¹.
If the φ in eqn (2.2) itself varies linearly with time t, so that it can
be written as φ = φo − ω t where φo and ω are constants, then we
obtain the travelling wave

ψ = A sin(k x − ω t + φo ) . (2.4)

That is to say, with ω > 0, the sinusoidal variation in x moves steadily
towards the right as time evolves; this is illustrated in Fig. 2.4. The
crests and troughs of the translated wave will coincide with those of
an earlier time after a duration T, called the period, given by

ω = 2π/T . (2.5)

Fig. 2.4 The travelling wave of eqn (2.4) plotted as a function of x for several values
of t, from zero to a quarter of the period.

The reciprocal of T, usually denoted by ν = 1/T, is known as the
frequency of the wave. It is related to its angular variant, ω, through

ω = 2πν , (2.6)

with ω specified in rad s⁻¹ and ν in cycles per second or Hz (hertz).
The speed of the wave, c, follows readily from the observation that
it moves forward by a distance λ in a time T:

c = λ/T = ω/k = νλ , (2.7)

in agreement with the result quoted in eqn (1.16).
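
As a quick numerical check of eqns (2.3) and (2.5)-(2.7), the short sketch below computes k, ω, ν and c from a wavelength and a period. The function name and the sample values are illustrative assumptions, not part of the original text.

```python
import math

def wave_parameters(wavelength, period):
    """Return (k, omega, nu, c) for a sinusoidal wave, following eqns (2.3), (2.5)-(2.7)."""
    k = 2 * math.pi / wavelength   # wavenumber, rad per unit length
    omega = 2 * math.pi / period   # angular frequency, rad per unit time
    nu = 1 / period                # frequency, cycles per unit time
    c = wavelength / period        # speed: one wavelength per period
    return k, omega, nu, c

# Hypothetical sample values: wavelength 2.0 and period 0.5 in arbitrary units.
k, omega, nu, c = wave_parameters(wavelength=2.0, period=0.5)
print(k, omega, nu, c)
```

The three expressions for the speed in eqn (2.7), λ/T, ω/k and νλ, should agree to within rounding error.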

2.1.1 The direction of propagation


A negative prefactor was chosen for the ω t term in eqn (2.4) so that
the wave would travel in the positive x direction; the opposite sign,
ψ = A sin(k x + ω t + φo ), gives a wave that moves backwards. In fact,
a reversal also occurs with ψ = A sin(− k x − ω t + φo ). Is there any
reason for preferring one of these alternatives over the other to de-
fine the sense of the progression?
Conceptually, it would make more sense to associate the change of
sign with the spatial term, rather than the temporal factor, because
we are concerned with an orientation. This line of thought leads
to the following generalization to accommodate fully the directional
aspect of waves:

ψ = A sin(k • r − ω t + φo ) , (2.8)

where the bold script k and r are vectors, and the dot between them
indicates their ‘scalar multiplication’:

r = (x , y , z) ,   k = (kx , ky , kz ) ,
k • r = kx x + ky y + kz z ,
|k|² = k² = kx² + ky² + kz² .

The vector r denotes a general position in space, with coordinates
x, y and z, but what do the three components, kx , ky and kz , of the
wavevector k represent? Its magnitude, or modulus, |k| = k is the
familiar wavenumber of eqn (2.3), and its orientation indicates the
direction of propagation. For a wave travelling along the x direction,
with ky = kz = 0, the scalar product k • r = kx x where kx = k for a
forwards progression and kx = −k for the reverse.
Since r and k are generally three-dimensional vectors, the wave
of eqn (2.8) is, in general, a function of x, y, z and t. As such, it
represents a travelling ‘plane wave’ rather than a moving oscillation
on a string. That is to say ψ, which could be the air pressure in a
sound wave, is uniform in planes perpendicular to k, but its value
varies sinusoidally with time, and with position along the direction
of the wavevector, in accordance with the wavelength of eqn (2.3),
the period of eqn (2.5) and the speed in eqn (2.7). The situation is
illustrated for the two-dimensional analogue in Fig. 2.5.

Fig. 2.5 The geometry of a plane wave.
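
The two defining properties of a plane wave can be checked numerically. The sketch below evaluates eqn (2.8) at three points: two that lie in the same plane perpendicular to k, and one displaced by a whole wavelength along k. The wavevector, positions and time are arbitrary illustrative choices.

```python
import math

def plane_wave(r, t, k, omega, amplitude=1.0, phase=0.0):
    """psi = A sin(k . r - omega t + phi_o), eqn (2.8); r and k are 3-component tuples."""
    k_dot_r = sum(ki * ri for ki, ri in zip(k, r))
    return amplitude * math.sin(k_dot_r - omega * t + phase)

k_vec = (3.0, 4.0, 0.0)              # |k| = 5, so the wavelength is 2 pi / 5
omega, t = 2.0, 0.3
wavelength = 2 * math.pi / 5.0

r1 = (0.1, 0.2, 0.0)
# (4, -3, 0) and (0, 0, 1) are both perpendicular to k, so r2 is in the same plane as r1.
r2 = (0.1 + 4 * 0.7, 0.2 - 3 * 0.7, 1.5)
# The unit vector along k is (0.6, 0.8, 0); r3 is one wavelength further along it.
r3 = (0.1 + 0.6 * wavelength, 0.2 + 0.8 * wavelength, 0.0)

psi1 = plane_wave(r1, t, k_vec, omega)
psi2 = plane_wave(r2, t, k_vec, omega)
psi3 = plane_wave(r3, t, k_vec, omega)
print(psi1, psi2, psi3)
```

All three values should coincide: ψ is constant within a plane perpendicular to k, and repeats after one wavelength along k.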

Magnitudes, directions and vectors


Quantities that have both a magnitude and a direction, such as a force,
are called vectors. Unlike scalars, which only have a ‘size’, they cannot
be quantified by a single number. They are defined by coordinates, or an
array of numbers giving displacements with respect to a set of reference or
basis axes. In the most common case of an x, y and z or Cartesian system,
the vectors a and b can be written as

a = (ax , ay , az ) and b = (bx , by , bz ) .


Addition and subtraction are straightforward, in that the corresponding
components are just combined separately:

a + b = (ax + bx , ay + by , az + bz ) ,

with all the pluses replaced by minuses for a take away. The multiplica-
tion of a vector by a scalar, µ say, is also easy,

µ a = (µax , µay , µaz ) ,

and yields a vector with the original direction but an appropriately scaled
length. The modulus, magnitude or length of a vector is given by Pythagoras’
theorem,

|a|² = a • a = ax² + ay² + az² ;

it’s one for a unit or normalized vector.
A vector can be multiplied by another in two different ways. The first is
a ‘dot’ or scalar product, which is a sum of the products of corresponding
elements:

a • b = ax bx + ay by + az bz (2.9)

and is geometrically the modulus of a times the modulus of b times the
cosine of the angle between them, a • b = |a| |b| cos θ. Vectors of non-zero
length are perpendicular, or orthogonal, to each other if their dot product
is zero; if they are also of unit length, they are said to be orthonormal.
Vectors can also be multiplied by a ‘cross’ or vector product. This is a bit
more complicated since the result is a vector:

a × b = (ay bz − by az , az bx − bz ax , ax by − bx ay ) . (2.10)

Geometrically, its magnitude is equal to the modulus of a times the
modulus of b times the sine of the angle between them,
|a × b| = |a| |b| sin θ; this is also the area of the related
parallelogram. The direction of the cross product is perpendicular to
both a and b, and given by the ‘right-hand screw rule’: if the curl
of the right-hand fingers indicates the sense of rotation needed to go from
a to b, then the direction is given by the out-stretched thumb.
The physical meaning of a dot product holds irrespective of the
dimensionality of the vectors and eqn (2.9) generalizes in an obvious way.
The same is not true of a cross product, which is specific to a space of
three dimensions (as considered here). The scalar product is also symmetric
with respect to an interchange of a and b, a • b = b • a, whereas the vector
product is antisymmetric, a × b = − b × a: the latter changes sign but the
former does not. Division by a vector is not defined and must never be
performed.
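
The component formulae of eqns (2.9) and (2.10) translate directly into code. The following is a minimal sketch using plain tuples; the helper names and the two sample vectors are illustrative assumptions.

```python
import math

def dot(a, b):
    """Scalar product, eqn (2.9): sum of products of corresponding components."""
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    """Vector product, eqn (2.10), for three-component vectors."""
    ax, ay, az = a
    bx, by, bz = b
    return (ay * bz - by * az, az * bx - bz * ax, ax * by - bx * ay)

def modulus(a):
    """Length of a vector from Pythagoras' theorem."""
    return math.sqrt(dot(a, a))

a = (1.0, 2.0, 3.0)
b = (4.0, -5.0, 6.0)
c = cross(a, b)
print(dot(a, b), c, modulus(c))
```

The result c is perpendicular to both a and b (zero dot products), and swapping the arguments of cross reverses its sign, in line with the symmetry properties stated above.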

2.1.2 The principle of superposition


A central feature of waves is that they pass through each other un-
affected and, where overlapped, give a net result that is the sum of
the individual contributions. This principle of superposition lies at
the heart of scattering theory. Here we illustrate it with a couple
of one-dimensional examples involving the combination of just two
sinusoidal waves.
Consider first the case where the waves are identical but travelling
in opposite directions. With the simplifying assignments that
A = 1 and φo = 0 in eqn (2.4), to reduce the algebraic clutter, the
principle of superposition yields

ψ = sin(k x − ω t) + sin(−k x − ω t)
  = − 2 sin(ω t) cos(k x) ,

where the second line follows from a trigonometric ‘factor formula’,

sin X + sin Y = 2 sin[(X+Y)/2] cos[(X−Y)/2] ,

and the antisymmetric properties of the sine function. This is called
a stationary wave (Fig. 2.6), because there is no movement with time
along the x direction. The separation of ψ into a product of spatial
and temporal terms results in a purely ‘up-and-down’ oscillation, at
an angular frequency of ω, with an amplitude that varies sinusoidally
with wavelength λ = 2π/k. The locations at which the amplitude is zero
are called nodes.

Fig. 2.6 A stationary wave, plotted as a function of x for several values of t.
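
The stationary-wave result can be verified numerically by comparing the sum of the two counter-propagating waves with the product form on a grid of positions and times. This is only a sanity check under arbitrarily chosen values of k and ω.

```python
import math

def counter_propagating_sum(x, t, k, omega):
    """Sum of two identical waves travelling in opposite directions (A = 1, phi_o = 0)."""
    return math.sin(k * x - omega * t) + math.sin(-k * x - omega * t)

def stationary_form(x, t, k, omega):
    """Product form: -2 sin(omega t) cos(k x)."""
    return -2 * math.sin(omega * t) * math.cos(k * x)

k, omega = 2.0, 3.0
samples = [(0.1 * i, 0.07 * j) for i in range(20) for j in range(20)]
max_diff = max(
    abs(counter_propagating_sum(x, t, k, omega) - stationary_form(x, t, k, omega))
    for x, t in samples
)

# A node sits where cos(k x) = 0, for example at x = pi / (2 k).
node_x = math.pi / (2 * k)
node_values = [counter_propagating_sum(node_x, t, k, omega) for _, t in samples]
print(max_diff, max(abs(v) for v in node_values))
```

Both printed numbers should be at the level of rounding error: the two forms agree everywhere, and the displacement at a node stays zero at all times.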
As a second example, consider two waves travelling in the same
direction with equal amplitudes but slightly different wavelengths
and frequencies: k ± ∆k and ω ± ∆ω, where the ∆-terms represent
small departures from the average k and ω. With the simplification
that A = 1 and φo = 0, as before, ψ is now a sum of two travelling
waves:

ψ = sin[(k + ∆k) x − (ω + ∆ω) t] + sin[(k − ∆k) x − (ω − ∆ω) t]
  = 2 sin(k x − ω t) cos(∆k x − ∆ω t) ,

Fig. 2.7 The slowly varying ‘beating’ modulation, of wavelength 2π/∆k, propagates
with a speed of ∆ω/∆k, whereas the finer structure inside the envelope has the
properties of the average wavenumber and frequency, k and ω.

and is illustrated in Fig. 2.7. The amplitude of a sinusoid with the
mean wavelength of 2π/k, propagating with a speed ω/k, is modulated
by a slowly varying envelope of wavelength 2π/∆k, moving
with a speed of ∆ω/∆k. This is the origin of the beating that is
heard when neighbouring musical notes are played together: the
sound becomes periodically louder and quieter.
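
The beating result follows from the factor formula sin X + sin Y = 2 sin[(X+Y)/2] cos[(X−Y)/2], with (X+Y)/2 = k x − ω t and (X−Y)/2 = ∆k x − ∆ω t. The sketch below checks numerically that the sum of the two waves equals the product of the fast carrier and the slow envelope; all parameter values are arbitrary illustrative choices.

```python
import math

def beat_sum(x, t, k, omega, dk, domega):
    """Sum of two waves with wavenumbers k +/- dk and angular frequencies omega +/- domega."""
    return (math.sin((k + dk) * x - (omega + domega) * t)
            + math.sin((k - dk) * x - (omega - domega) * t))

def beat_product(x, t, k, omega, dk, domega):
    """Carrier times envelope: 2 sin(k x - omega t) cos(dk x - domega t)."""
    return 2 * math.sin(k * x - omega * t) * math.cos(dk * x - domega * t)

k, omega, dk, domega = 10.0, 20.0, 0.5, 0.8
samples = [(0.13 * i, 0.05 * j) for i in range(30) for j in range(30)]
max_diff = max(
    abs(beat_sum(x, t, k, omega, dk, domega) - beat_product(x, t, k, omega, dk, domega))
    for x, t in samples
)
print(max_diff)
```

The printed difference should be at the level of rounding error.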
Although we have only considered the combination of two simi-
lar waves, its generalization to the sum of many such components
results in the formation of wavepackets; the beating modulation of
Fig. 2.7 is just the most elementary example. The shape of the
wavepacket will be preserved on propagation if all its constituents
travel with the same speed c, when

ω/k = c and dω/dk = c .

Gradients, rates of change and differentiation

From Sivia and Rawlings (1999), Foundations of Science Mathematics,
Oxford Chemistry Primers Series, 77.

The relationship between two quantities, x and y say, can be visualized
with the aid of a graph. While the intersections of the associated curve
with the x and y axes may be of interest, it is often more important to
know the slope at any given point; that is, how quickly y increases, or de-
creases, as x changes, and vice versa. This issue is at the heart of the topic
of differentiation, and the related rules and formulae are simply ways of
calculating the gradient algebraically.
Let us begin with a precise definition of what is meant by the slope of
a curve. Suppose that y is related to x through some function called ‘f ’,
usually written as y = f(x), so that f(x) = m x + c for a general straight
line, and f(x) = sin(x) for a sinusoidal variation, and so on. Then, if the
horizontal coordinate changes from x to x + ∆x, where ∆x represents a
small increment, the value of y is altered from f(x) to f(x+∆x). The
gradient, at a point x, is defined to be the ratio of the change in the ver-
tical coordinate, ∆y, to that of the horizontal increment, as ∆x becomes
vanishingly small. This can be stated formally as

dy/dx = lim_{∆x→0} ∆y/∆x = lim_{∆x→0} [f(x + ∆x) − f(x)]/∆x , (2.11)

where dy/dx is known as the derivative, or differential coefficient, and
is pronounced ‘dy-by-dx’. The tendency of ∆x → 0 has to be approached
gradually to ascertain the limiting value of the ratio ∆y/∆x, as both
increments are individually equal to zero when the condition is met. Strictly
speaking, we should check that the same value of dy/dx is obtained
whether ∆x is positive or negative, but this is assured as long as the
curve y = f(x) is ‘smooth’; inconsistencies will arise if kinks and sudden
breaks (or discontinuities) are present, and the function is said to be
non-differentiable at those points. The first and second derivatives are
commonly abbreviated as

y′ = dy/dx = d/dx (y) = f′(x) ,
y′′ = d²y/dx² = d/dx (dy/dx) = f′′(x) .

If the speed of the sinusoidal waves varies with their wavelength,
because the frequency does not happen to be directly proportional
to the wavenumber in the medium of interest, then the wavepacket
will change with time. This phenomenon is called dispersion, and
the relationship between ω and k which determines the nature of
the ‘spreading’,

ω = ω (k) ,

is called the ‘dispersion relation’ or the ‘dispersion curve’ (Fig. 2.8).
For the non-dispersive case, ω = c k , there is a unique speed, c,
associated with the propagation of a wavepacket. The ratio ω/k and
the derivative dω/dk still yield useful characteristic speeds in the
dispersive case, however, when there is a dominant contribution from
sinusoidal waves around a particular wavelength:

vφ = ω/k and vg = dω/dk , (2.12)

where vφ is called the phase velocity, and gives the rate at which the
crests and troughs of the local wavefront move, and vg is the group
velocity, which indicates how fast the envelope of the wavepacket
travels.

Fig. 2.8 A dispersion curve, and the related phase and group velocities
for waves in the neighbourhood of ko .
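
To make eqn (2.12) concrete, the sketch below evaluates the phase and group velocities for one particular dispersion relation. The choice ω = √(g k), which describes gravity waves on deep water, is an assumption introduced purely for illustration and does not come from the text; the derivative dω/dk is estimated with a central finite difference.

```python
import math

G = 9.81  # gravitational acceleration in m s^-2 (illustrative assumption)

def omega_of_k(k):
    """Assumed dispersion relation omega = sqrt(g k); not taken from the text."""
    return math.sqrt(G * k)

def phase_velocity(k):
    """v_phi = omega / k, eqn (2.12)."""
    return omega_of_k(k) / k

def group_velocity(k, dk=1e-6):
    """v_g = d(omega)/dk, eqn (2.12), estimated by a central finite difference."""
    return (omega_of_k(k + dk) - omega_of_k(k - dk)) / (2 * dk)

k0 = 2.0  # wavenumber about which the wavepacket is centred, rad per metre
v_phase = phase_velocity(k0)
v_group = group_velocity(k0)
print(v_phase, v_group)
```

For this assumed relation the group velocity comes out at half the phase velocity, so the crests move through the envelope; for a non-dispersive relation ω = c k both functions would return c.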

2.2 Complex numbers


The analysis of wave phenomena is aided greatly by the use of complex
numbers, in particular by a result which links an exponential
to sines and cosines:

e^(iθ) = cos θ + i sin θ , (2.13)

where i² = −1. Since complex numbers play a central role in theoretical
work, we will devote a few pages to them; as with most of the
mathematical background given in this book, the material is based
on Sivia and Rawlings (1999).

2.2.1 Definition
If any number, integer or fraction, positive or negative, is multiplied
by itself, then the result is always greater than, or equal to, zero.
What, then, is the square root of −9 ? To address this question we
need to invent an imaginary number, usually denoted by ‘ i ’, whose
square is defined to be negative:

i² = −1 . (2.14)

A real number, say b (where b² > 0), times i is also imaginary; it’s
just b times bigger than i, so that √(−9) = ± 3i. If a is also an
ordinary number, then the sum z of a and ib,

z = a + ib , (2.15)

is known as a ‘complex’ number; this does not indicate an intrinsic
difficulty with the concept, but highlights the hybrid nature of the
entity. It consists of both a real part and an imaginary one:

Re {z} = a and Im {z} = b . (2.16)

It may seem odd that Im {z} is b rather than ib, but this is because
it represents the size of the imaginary component.

2.2.2 Basic algebra


To add or subtract complex numbers, we simply add or subtract the
real and imaginary parts separately:

a + ib ± (c + id ) = a ± c + i (b ± d ) , (2.17)

where a, b, c and d are real; for example, 1 + 2i − (5 − i) = −4 + 3i.
The usual rules of algebra apply for brackets and multiplication,
except that every occurrence of i² is replaced by −1. Thus, it’s easy
to show that the product of a + ib and c + id is given by

(a + ib) (c + id ) = a c − b d + i (a d + b c) , (2.18)

since i² b d = −b d; for example, (1 + 2i) (3 − i) = 5 + 5i. Division
involves the use of a complex conjugate, so let us consider this first.
The conjugate of a complex number z, denoted by z∗, is defined to
have the same real part but the opposite imaginary component; that
is, Re{z∗} = Re{z} and Im{z∗} = −Im{z}. In terms of eqn (2.15),
therefore,

z∗ = (a + ib)∗ = a − ib . (2.19)

Hence, complex numbers and their conjugates satisfy the following
relationships:

z + z∗ = 2a = 2 Re{z}
z − z∗ = 2 ib = 2 i Im{z} (2.20)
z z∗ = a² + b² = |z|²

We will come to the meaning of |z| shortly, but the important point
about eqn (2.20) is that the product z z∗ is a real number. This
feature enables us to calculate the real and imaginary part of the ratio
of two complex numbers by multiplying both the top and bottom by
the conjugate of the denominator:

(a + ib)/(c + id) = (a + ib)/(c + id) × (c − id)/(c − id)
                  = [a c + b d + i (b c − a d)]/(c² + d²) . (2.21)

To evaluate the ratio (1+2i)/(3−i), for example, we multiply it by
unity in the form (3+i)/(3+i); this gives a real denominator of 10,
and a complex numerator of 1+7i. Hence the result is 1/10 + i 7/10.
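
The worked examples of this section can be reproduced with Python's built-in complex type, in which the imaginary unit is written as `j`. The sketch below repeats the subtraction, product and quotient quoted above, and carries out the division explicitly through the conjugate of the denominator as in eqn (2.21); the variable names are illustrative.

```python
z1 = 1 + 2j
z2 = 3 - 1j

difference = z1 - (5 - 1j)          # eqn (2.17): expect -4 + 3i
product = z1 * z2                   # eqn (2.18): expect 5 + 5i
quotient = z1 / z2                  # built-in division: expect 1/10 + 7i/10

# Division by hand, eqn (2.21): multiply top and bottom by the conjugate of z2.
denominator = (z2 * z2.conjugate()).real     # z z* is real: 3^2 + 1^2 = 10
numerator = z1 * z2.conjugate()              # 1 + 7i
by_conjugate = numerator / denominator

print(difference, product, quotient, by_conjugate)
```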

2.2.3 The Argand diagram


So far we have considered complex numbers from an algebraic point
of view; it is often helpful to think of them in geometrical terms.
This is easily done with the aid of an Argand diagram where the
horizontal, or x, axis of a graph is seen as representing the real part
of a complex number, and the vertical, or y, axis gives the imaginary
component. Thus the point with (x, y) coordinates (a, b) corresponds
to the complex number a + ib. Its conjugate z ∗ is a reflection in the
real axis. An alternative way of specifying the location of a point on a
graph is through its distance r from the origin, and the anticlockwise
angle θ that this ‘radius’ makes with the (positive) real axis. In this
system r is known as the modulus, magnitude or amplitude of z ; θ
is called its argument or phase.
The quantities r, θ, a and b in the Argand diagram are related
through elementary trigonometry by

a = r cos θ and b = r sin θ , (2.22)

or, in the reverse sense, by

r² = a² + b² and θ = tan⁻¹(b/a) . (2.23)

A comparison between eqns (2.20) and (2.23) shows that z z∗ = r²,
where r = |z| is the modulus of the complex number. The second
part of eqn (2.23) needs qualification because there is an ambiguity
of 180° with tan⁻¹(b/a). The arctangent of unity, for example, could
be either 45° or −135°. For complete consistency with eqn (2.22),
0 < θ < π (radians) for b > 0 and −π < θ < 0 for b < 0; θ is zero for a > 0,
and ± π for a < 0, if b = 0. It is also worth remembering that θ is only
defined to within an additive multiple of 2π, because we could add
(or subtract) any integer multiple of 360° to it and obtain the same
point in the Argand diagram.
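
The 180° ambiguity is easy to see numerically. In the sketch below, the one-argument arctangent cannot tell the points (1, 1) and (−1, −1) apart, whereas the two-argument `math.atan2(b, a)` uses the signs of both components and returns an angle in the range −π to π, consistent with the convention described above. The helper name is an illustrative assumption.

```python
import math

def modulus_and_argument(a, b):
    """Return (r, theta) for z = a + ib, with theta in radians between -pi and pi."""
    r = math.hypot(a, b)        # r^2 = a^2 + b^2
    theta = math.atan2(b, a)    # quadrant-aware arctangent of b/a
    return r, theta

r1, theta1 = modulus_and_argument(1.0, 1.0)      # first quadrant
r2, theta2 = modulus_and_argument(-1.0, -1.0)    # third quadrant

# The plain arctangent of b/a gives the same answer for both points.
naive1 = math.atan(1.0 / 1.0)
naive2 = math.atan(-1.0 / -1.0)

print(math.degrees(theta1), math.degrees(theta2), math.degrees(naive1), math.degrees(naive2))
```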

2.2.4 The imaginary exponential


Perhaps the most important result in complex analysis concerns the
exponential of an imaginary number:

e^(iθ) = cos θ + i sin θ , (2.24)

where θ is in radians. This equation can be verified by substituting
x = iθ in the Taylor series expansion for e^x, and collecting the odd
and even powers of θ separately; remembering that i² = −1, a comparison
with the Taylor series for sin θ and cos θ yields eqn (2.24). The
relevant series are

e^x = 1 + x + x²/2! + x³/3! + ···
sin θ = θ − θ³/3! + θ⁵/5! − ···
cos θ = 1 − θ²/2! + θ⁴/4! − ···

The product of r and e^(iθ) allows a complex number to be expressed
in a very compact form in terms of its modulus and argument:

z = a + ib = r (cos θ + i sin θ) = r e^(iθ) , (2.25)



where a, b, r and θ are related through eqns (2.22) and (2.23). As
can be seen from the Argand diagram, and verified by the symmetry
properties of sines and cosines, its conjugate entails the replacement
of θ with −θ:

z∗ = a − ib = r (cos θ − i sin θ) = r e^(−iθ) , (2.26)

from which the result z z∗ = r² follows immediately.
Although the exponential form of a complex number is very useful
when dealing with roots and logarithms, and provides a valuable
insight into products and quotients, our interest here is in its
relationship with waves. This hinges on eqn (2.24), which enables eqn (2.8)
to be written as the imaginary part of

ψ = A e^(i(k • r − ω t)) , (2.27)

where A is now a complex number whose modulus and argument
give the amplitude and phase offset of the wave, respectively:

A = |A| e^(iφo) . (2.28)

The real part of eqn (2.27) also represents the same wave, apart
from a difference of 90° in the value of φo.
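
The link between the complex exponential and a real wave can be checked with the standard `cmath` module. The sketch below first confirms eqn (2.24) for one angle, and then verifies, for a one-dimensional case with arbitrarily chosen values, that the imaginary part of eqn (2.27) reproduces the sine wave of eqn (2.4), while the real part is the same wave with φo advanced by 90°.

```python
import cmath
import math

# Euler's formula, eqn (2.24), for a sample angle.
theta = 0.75
euler_lhs = cmath.exp(1j * theta)
euler_rhs = complex(math.cos(theta), math.sin(theta))

# Complex amplitude A = |A| exp(i phi_o), eqn (2.28); the values are illustrative.
amplitude, phase_offset = 2.0, 0.4
A = amplitude * cmath.exp(1j * phase_offset)

k, omega, x, t = 1.5, 3.0, 0.2, 0.1
psi = A * cmath.exp(1j * (k * x - omega * t))      # eqn (2.27) in one dimension

sine_wave = amplitude * math.sin(k * x - omega * t + phase_offset)
shifted_wave = amplitude * math.sin(k * x - omega * t + phase_offset + math.pi / 2)

print(psi.imag, sine_wave, psi.real, shifted_wave)
```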

Taylor series

From Sivia and Rawlings (1999), Foundations of Science Mathematics,
Oxford Chemistry Primers Series, 77.

When dealing with a complicated function, it can be useful to approxi-
mate it with one of a simpler form. While the latter may not represent a
complete and accurate description of the situation at hand, it frequently
provides the only means of making analytical progress. There are many
approximations that could be used, of course, but it is the one that cap-
tures the salient features that is most helpful. A Taylor series is appropri-
ate when our principal interest lies in the behaviour of a function in the
neighbourhood of a particular point.
Consider the curve y = f(x). The crudest approximation to this function
is a horizontal line y = a0 , where a0 is a constant; if a0 = f(xo ), then it will
even be correct at x = xo . A better approximation would be a sloping line
y = a0 + a1 (x−xo ), where the coefficient a1 allows for a non-zero gradient.
Continuing along this path, we could add a quadratic (or curvature) term
a2 (x−xo)², a cubic contribution a3 (x−xo)³, and so on, to gain further
improvements. Thus, a function f(x) can be approximated about the point
xo by using a polynomial expansion:

f(x) ≈ a0 + a1 (x−xo) + a2 (x−xo)² + a3 (x−xo)³ + · · · . (2.29)

The coefficients are given by the derivatives of f(x) evaluated at xo ,

an = (1/n!) dⁿf/dxⁿ |xo .

This is the essence of a Taylor series. Its advantage is that the right-hand
side of eqn (2.29) is usually easier to calculate, differentiate, integrate,
and generally manipulate, than the expression on the left. The case of
xo = 0, when the Taylor series simplifies, is called a Maclaurin series.
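
As an illustration of a Maclaurin series in action, the sketch below sums the first few terms of the expansion of e^x with x = iθ and compares the real and imaginary parts with cos θ and sin θ, as the verification of eqn (2.24) suggests. The function name and the number of terms are illustrative choices.

```python
import cmath
import math

def exp_series(z, n_terms):
    """Partial sum of the Maclaurin series 1 + z + z^2/2! + z^3/3! + ... with n_terms terms."""
    total = 0j
    term = 1 + 0j                 # the n = 0 term
    for n in range(n_terms):
        total += term
        term *= z / (n + 1)       # turn z^n/n! into z^(n+1)/(n+1)!
    return total

theta = 1.2
approx = exp_series(1j * theta, 20)
print(approx, math.cos(theta), math.sin(theta))
```

With twenty terms, the even powers of θ have accumulated into cos θ in the real part and the odd powers into sin θ in the imaginary part, to within rounding error.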

The benefit of using eqn (2.27) over (2.8) in wave analysis is that
exponentials are easier to deal with mathematically than sinusoids;
multiplication, differentiation and integration, for example, are more
straightforward. As an illustration of this advantage, let’s derive the
‘compound angle’ formulae for sines and cosines with complex num-
bers. Starting with the rule of eqn (1.2) for combining powers,

e^(i(α+β)) = e^(iα) e^(iβ) ,

and expanding the exponentials with eqn (2.24),

cos (α+β) + i sin (α+β) = (cos α + i sin α) (cos β + i sin β) ,


the equating of the real and imaginary parts on the left- and right-
hand sides yields the desired results:

cos (α+β) = cos α cos β − sin α sin β , (2.30)


sin (α+β) = sin α cos β + cos α sin β . (2.31)

As well as being the real and imaginary parts of exp(iθ), cosines and
sines can also be expressed as

cos θ = [e^(iθ) + e^(−iθ)]/2 and sin θ = [e^(iθ) − e^(−iθ)]/(2i) , (2.32)

which follow from the addition and subtraction, respectively, of eqn
(2.24) with its complex conjugate.

2.3 Fourier series


Let’s begin wave analysis by considering how periodic signals, such
as those in Fig. 2.1, can be decomposed into the sum of sinusoids.
Suppose that the function f(x) repeats itself after a ‘distance’ of λ,
so that
f(x) = f(x +λ) . (2.33)
This has the same periodicity as sines and cosines of wavenumber
k = 2π/λ. A simple approximation to f(x), which matches its wave-
length, is therefore

f(x) ≈ a0 + a1 cos(kx) + b1 sin(kx) , (2.34)

where a0 , a1 and b1 are constants whose values need to be selected
in some way. The crudest assignment would be to set both a1 and b1
equal to zero, giving an invariant f(x) ≈ a0 , but the linear combina-
tion of sin(kx) and cos(kx) allows for a sinusoidal variation with the
correct period and an appropriate amplitude and phase:

a1 cos(kx) + b1 sin(kx) = A sin(kx + φ) ,

where a1 = A sin φ and b1 = A cos φ, in accordance with eqn (2.31).



The sines and cosines of 2 kx, 3 kx, 4 kx, and so on, also satisfy the
periodicity of eqn (2.33); they just go through several, or many, com-
plete cycles in the interval λ. We can obtain a better approximation
to f(x), therefore, by including contributions from these higher-order
terms:
f(x) ≈ a0 + a1 cos(kx) + a2 cos(2 kx) + a3 cos(3 kx) + · · ·
          + b1 sin(kx) + b2 sin(2 kx) + b3 sin(3 kx) + · · · (2.35)

This expansion is called a Fourier series, and eqn (2.34) is simply
the first-order version of it which contains only the lowest, or
fundamental, harmonic.

We will come to the evaluation of the coefficients an and bn , for
integer n, shortly but note that one of the sets goes to zero if f(x)
possesses a symmetry about the y-axis:

f(x) = f(−x) ⟹ bn = 0
f(x) = − f(−x) ⟹ an = 0 (2.36)

because sines and cosines are odd and even functions, respectively:
sin(θ) = − sin(−θ) and cos(θ) = cos(−θ).
The generalization of eqn (2.35) explains why the invariant term is
designated as a0 , and why there is no corresponding b0 (apart from
its general redundancy): they are the coefficients of cos(0) = 1 and
sin(0) = 0, with the b0 being unnecessary since it adds nothing to the
Fourier series.

2.3.1 Orthogonality and the Fourier coefficients


A prescription for the an and bn in eqn (2.35) presents itself once
we realize that the related sine and cosine functions are orthogonal.
By this we mean that the integral of the product of any two over the
interval of the period λ will be zero, unless they happen to be exactly
the same functions:

∫_0^λ sin(m kx) sin(n kx) dx = 0 if m ≠ n , or λ/2 if m = n , (2.37)

with an identical expression for cos(m kx) cos(n kx), but n ≠ 0, and

∫_0^λ sin(m kx) cos(n kx) dx = 0 . (2.38)

Although these sines and cosines aren’t perpendicular in a geometrical
sense, this type of integral is the functional analogue of a dot
product which is zero for orthogonal vectors.
If we multiply eqn (2.35) through by one of the sine or cosine func-
tions, sin(m kx) or cos(m kx), and integrate the resultant products
over the period λ, then all but one of the terms on the right-hand
side will be zero due to eqns (2.37) and (2.38). The surviving m = n
contributions yield the formulae for the Fourier coefficients:

an = (2/λ) ∫_0^λ f(x) cos(n kx) dx and bn = (2/λ) ∫_0^λ f(x) sin(n kx) dx (2.39)

for n = 1, 2, 3, . . . , from which eqn (2.36) can be verified. If eqn (2.35)
is integrated over the period λ as it stands, then the constant a0 is
seen to be the average value of f(x):

a0 = (1/λ) ∫_0^λ f(x) dx , (2.40)

since ∫_0^λ sin(n kx) dx = 0 (and likewise for the cosines with n ≠ 0)
and ∫_0^λ dx = λ.
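
The formulae of eqns (2.39) and (2.40) can be tried out numerically on a square wave of the kind shown in Fig. 2.1. The sketch below is a minimal illustration: the integrals are estimated with a simple midpoint rule, and the particular square wave (+1 over the first half of each period, −1 over the second, with λ = 1) is an assumed test case rather than one taken from the text.

```python
import math

def integrate(func, a, b, n=20000):
    """Midpoint-rule estimate of the integral of func from a to b."""
    dx = (b - a) / n
    return sum(func(a + (j + 0.5) * dx) for j in range(n)) * dx

def fourier_coefficients(f, wavelength, n_max):
    """Return a0 and the lists [a1..a_nmax], [b1..b_nmax] from eqns (2.39) and (2.40)."""
    k = 2 * math.pi / wavelength
    a0 = integrate(f, 0.0, wavelength) / wavelength
    a, b = [], []
    for n in range(1, n_max + 1):
        a.append(2 / wavelength * integrate(lambda x: f(x) * math.cos(n * k * x), 0.0, wavelength))
        b.append(2 / wavelength * integrate(lambda x: f(x) * math.sin(n * k * x), 0.0, wavelength))
    return a0, a, b

def square_wave(x, wavelength=1.0):
    """Assumed test signal: +1 on the first half of each period, -1 on the second."""
    return 1.0 if (x % wavelength) < wavelength / 2 else -1.0

a0, a, b = fourier_coefficients(square_wave, 1.0, 5)
print(a0, a, b)
```

Because this square wave is an odd function, the an vanish in line with eqn (2.36); the bn come out close to 4/(nπ) for odd n and zero for even n, which is the standard result for this waveform.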

Cumulative properties and integrals

From Sivia and Rawlings (1999), Foundations of Science Mathematics,
Oxford Chemistry Primers Series, 77.

While differentiation is concerned with the slope of y = f(x), integration
deals with the ‘area under the curve’. This relates to the average and
cumulative behaviour of y, over some range in x.
To set up a definition of an integral, consider the region bounded by the
straight lines x = a , x = b and y = 0, and the curve y = f(x). The size
of the enclosure can be estimated by approximating it as a whole series of
narrow vertical strips, and adding together the areas of these contiguous
rectangular blocks. If the x-axis between a and b is divided into N equal
intervals, then the width of each strip is given by ∆x = (b−a)/N ; the
corresponding heights of the thin blocks are equal to the values of the
function f(x) at their central positions. In other words, the area of the j th
strip, which is at x = xj and of height y = f(xj ), is f(xj ) ∆x; the index j
ranges from 1 to N, of course, with x1 = a +∆x/2 and xN = b −∆x/2 . As
N tends to infinity, ∆x → 0 and the approximation to the area under the
curve becomes ever more accurate. This limiting form of the summation
procedure defines an integral

∫_a^b y dx = ∫_a^b f(x) dx = lim_{N→∞} Σ_{j=1}^{N} f(xj ) ∆x , (2.41)

where the symbol ∫_a^b dx is read as the ‘integral, from a to b, with respect
to x’. The use of the term ‘area’ in the above discussion needs some
qualification, in that it can be negative; this is because the ‘height’ of a strip
f(xj ) < 0 whenever the curve y = f(x) lies below the x-axis (and even the
‘width’ ∆x < 0 if b < a).
Although an integral is defined as the limiting form of a summation, it
is usually calculated analytically by noting that ‘integration is the reverse
of differentiation’. While this may not be obvious, it is easily illustrated
with an example from everyday kinematics: the distance travelled by a
car (say) is the integral of the speed with respect to time, and speed is the
rate of change of distance with time (a derivative).
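
The strip construction behind eqn (2.41) can be tried out numerically. The following Python sketch is only an illustration: the function name `midpoint_integral` and the choice of sin(x) on the interval 0 to π (whose exact integral is 2) are our own assumptions, not part of the original text.

```python
import math

def midpoint_integral(f, a, b, n):
    """Approximate the integral of f from a to b with n strips of equal width."""
    dx = (b - a) / n                      # strip width; negative if b < a
    # strip centres: x_1 = a + dx/2, ..., x_n = b - dx/2
    return sum(f(a + (j - 0.5) * dx) for j in range(1, n + 1)) * dx

coarse = midpoint_integral(math.sin, 0.0, math.pi, 10)
fine = midpoint_integral(math.sin, 0.0, math.pi, 1000)
reversed_limits = midpoint_integral(math.sin, math.pi, 0.0, 1000)
print(coarse, fine, reversed_limits)
```

The estimate approaches 2 as the number of strips grows, and swapping the limits flips the sign because the ‘width’ ∆x becomes negative.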

2.3.2 The complex Fourier series


The Fourier series of eqn (2.35) can be written in a very compact
form by using complex numbers:

$$ f(x) = \sum_{n=-\infty}^{\infty} c_n\, e^{i n k x} , \tag{2.42} $$

where the Σ stands for a summation over integer values of n, from


−∞ to ∞, and the approximation has been replaced by an equality
to denote a definition. The right-hand side of eqn (2.42) will yield a
real function, f(x), as long as the complex coefficients c n satisfy the

conjugacy condition

$$ c_{-n} = c_n^{*} . \tag{2.43} $$
This follows from eqn (2.20) because the contribution to the sum
from pairs of positive and negative values of n of equal magnitudes
will then be
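$$ \begin{aligned} c_n\, e^{i n k x} + c_{-n}\, e^{-i n k x} &= 2\,\mathrm{Re}\left\{ c_n\, e^{i n k x} \right\} \\ &= a_n \cos(nkx) + b_n \sin(nkx) , \end{aligned} $$

[Margin note: $\left( c_n\, e^{i n k x} \right)^{*} = c_n^{*}\, e^{-i n k x}$]

for n ≠ 0, where we have substituted 2c_n = a_n − i b_n in the second
line to obtain consistency with eqn (2.35). In fact, the formula for
the complex coefficients is simply

$$ c_n = \frac{1}{\lambda} \int_0^{\lambda} f(x)\, e^{-i n k x}\, dx , \tag{2.44} $$

with c_0 = a_0.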
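
As a hedged numerical check of eqns (2.43) and (2.44), the sketch below evaluates the coefficients of a simple test function by a midpoint sum over one period. The test function, the period `lam = 2` and the helper name `coeff` are illustrative assumptions of ours.

```python
import cmath
import math

lam = 2.0                       # period (assumed value)
k = 2 * math.pi / lam           # wavenumber of the fundamental

def f(x):
    # a_0 = 1, a_1 = 2, b_2 = 3 and all other coefficients zero
    return 1.0 + 2.0 * math.cos(k * x) + 3.0 * math.sin(2 * k * x)

def coeff(n, samples=2000):
    """c_n = (1/lam) * integral over one period of f(x) exp(-i n k x) dx (midpoint sum)."""
    dx = lam / samples
    total = sum(f((j + 0.5) * dx) * cmath.exp(-1j * n * k * (j + 0.5) * dx)
                for j in range(samples))
    return total * dx / lam

c = {n: coeff(n) for n in range(-2, 3)}
for n in sorted(c):
    print(n, c[n])
```

With 2c_n = a_n − i b_n, we expect c_0 = 1, c_1 = 1 and c_2 = −1.5i, with c_{−n} equal to the complex conjugate of c_n.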

2.4 Fourier transforms


We began our discussion of Fourier series by considering how a peri-
odic function could be decomposed into, or approximated by, a sum of
sinusoidal waves. The analysis can be extended to the non-periodic
case by letting λ → ∞, so that no repetitions are required of f(x)
within a finite interval. To carry out this limiting procedure, it is
helpful to define

$$ \Delta k = \frac{2\pi}{\lambda} \quad\text{and}\quad k_n = n\,\Delta k $$
because the wavenumber of the fundamental harmonic, ∆k, shrinks
gradually to zero as λ gets ever larger and kn approaches a contin-
uum even with integer n. The imaginary exponentials of eqns (2.42)
and (2.44) can then be written as

$$ e^{i k_n x} \quad\text{and}\quad e^{-i k_n x} , $$

and the coefficients expressed as

$$ c_n = \alpha\, F(k_n)\, \Delta k , $$

where α is a constant and F(k) is a continuous function of k. With


these substitutions, eqns (2.42) and (2.44) become


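$$ f(x) = \alpha \sum_{n=-\infty}^{\infty} F(k_n)\, e^{i k_n x}\, \Delta k \quad\text{and}\quad F(k_n) = \frac{1}{2\pi\alpha} \int_{-\lambda/2}^{\lambda/2} f(x)\, e^{-i k_n x}\, dx , $$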

where we have expressed the integral over a period as being from



−λ/2 to λ/2, instead of 0 to λ, for a more symmetrical appearance.
The limit of λ→∞, when ∆k→0, can now be taken safely, and yields
the integrals

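$$ f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(k)\, e^{i k x}\, dk \tag{2.45} $$

and

$$ F(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i k x}\, dx \tag{2.46} $$

as being the continuum versions of eqns (2.42) and (2.44), where
we have set α = 1/√(2π) for aesthetic reasons of symmetry. These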
come as a linked pair, and define a Fourier transform and its inverse;
which one is called which is quite arbitrary.
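
A quick way to get a feel for eqn (2.46) is to evaluate it numerically for a function whose transform is known. The sketch below uses the Gaussian exp(−x²/2), which is its own Fourier transform under the symmetric convention with α = 1/√(2π). The helper name `fourier_transform`, the integration range and the number of sample points are assumptions made for this illustration.

```python
import cmath
import math

def fourier_transform(f, k, half_width=10.0, n=2000):
    """F(k) = (1/sqrt(2 pi)) * integral of f(x) exp(-i k x) dx, as a midpoint sum on [-L, L]."""
    dx = 2 * half_width / n
    xs = (-half_width + (j + 0.5) * dx for j in range(n))
    return sum(f(x) * cmath.exp(-1j * k * x) for x in xs) * dx / math.sqrt(2 * math.pi)

def gaussian(x):
    return math.exp(-0.5 * x * x)

for k in (0.0, 1.0, 2.5):
    print(k, fourier_transform(gaussian, k), math.exp(-0.5 * k * k))
```

The numerical values should agree closely with exp(−k²/2), with a negligible imaginary part.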
While the exponents of a Fourier transform and its inverse must
have opposite signs, their precise definitions are a matter of conven-
tion; the choice of α is up to us, for example. If the wavenumber is
taken to be 1/λ instead of 2π/λ, as done by spectroscopists, then the
exponents will be ±i 2πkx and neither integral will have a scaling
term:

$$ f(x) = \int_{-\infty}^{\infty} F(k)\, e^{i 2\pi k x}\, dk \quad\text{and}\quad F(k) = \int_{-\infty}^{\infty} f(x)\, e^{-i 2\pi k x}\, dx , $$

where k is measured in cycles per unit length, typically cm⁻¹, rather
than the SI radians per metre.
Although our goal is to gain a physical insight into Fourier trans-
forms, we first need to discuss some of their formal properties. Basic
symmetries are a good place to start, as the most common one is the
continuum analogue of eqn (2.43):

$$ f(x) = f(x)^{*} \iff F(-k) = F(k)^{*} , \tag{2.47} $$



which states that the Fourier transform of a real function is ‘con-


jugate symmetric’. If one of them possesses a symmetry about the
origin, then so too will the other:
$$ f(x) = \begin{cases} f(-x) \\ -f(-x) \end{cases} \iff F(k) = \begin{cases} F(-k) , \\ -F(-k) . \end{cases} \tag{2.48} $$

Equations (2.47) and (2.48) can be combined to show that the Fourier
transform of a real and symmetric function is also real and even,
whereas that of a real and antisymmetric function is imaginary and
odd; this is equivalent to eqn (2.36).
The substitution of k = 0 in eqns (2.46) and (2.47) reveals F(0) to



be proportional to the area under the curve y = f(x),
$$ F(0) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, dx , \tag{2.49} $$

and necessarily real if f(x) = f(x)∗. It will equal zero if f(x) = −f(−x).
Technically, the integral of the modulus, |f(x)|, must be bounded (or
finite) if its Fourier transform is to exist everywhere; this is known
as the Dirichlet condition.
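
The symmetry properties of eqns (2.47) to (2.49) can be checked with a small numerical experiment. This is a sketch under our own assumptions: the two test functions, the evaluation point k = 1.3 and the helper name `ft` are chosen only for illustration.

```python
import cmath
import math

def ft(f, k, half_width=10.0, n=2000):
    """Midpoint-sum estimate of F(k) = (1/sqrt(2 pi)) * integral f(x) exp(-i k x) dx."""
    dx = 2 * half_width / n
    xs = (-half_width + (j + 0.5) * dx for j in range(n))
    return sum(f(x) * cmath.exp(-1j * k * x) for x in xs) * dx / math.sqrt(2 * math.pi)

def real_f(x):
    return math.exp(-(x - 1.0) ** 2)      # real, with no symmetry about the origin

def odd_f(x):
    return x * math.exp(-x * x)           # real and antisymmetric

F_plus, F_minus = ft(real_f, 1.3), ft(real_f, -1.3)
F_zero = ft(real_f, 0.0)                  # area sqrt(pi) divided by sqrt(2 pi)
F_odd, F_odd_zero = ft(odd_f, 1.3), ft(odd_f, 0.0)
print(F_plus, F_minus, F_zero, F_odd, F_odd_zero)
```

For the real function F(−k) is the conjugate of F(k) and F(0) is the area divided by √(2π); for the antisymmetric one the transform is purely imaginary and vanishes at k = 0.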

2.4.1 Convolution theorem


One of the most useful results in Fourier theory concerns the convo-
lution of two functions. Mathematically, the convolution of g(x) and
h(x) is defined by
$$ g(x) \otimes h(x) = \int_{-\infty}^{\infty} g(t)\, h(x-t)\, dt , \tag{2.50} $$

where g ⊗ h is read as ‘g convolved with h’, and physically represents


a ‘blurring’ of g(x) by h(x). This can be understood from the example
of Fig. 2.9, where g(x) consists of four spikes, or δ-functions, and
h(x) is a broad asymmetric function. The convolution is carried out

Fig. 2.9 The convolution of the spiky function g(x) with the broad asymmetric function h(x): f(x) = g(x) ⊗ h(x).
by replacing each of the sharp peaks in g(x) with scaled copies of
h(x) and adding together the four contributions; those from the two
closely spaced components in the middle, shown by dotted grey lines,
combine to give a resultant function where the constituent doublet
is no longer resolved clearly. Although it’s not as easy to visualize
it the other way around, eqn (2.50) can equally be thought of as the
blurring of h(x) by g(x).

[Margin note: g(x) ⊗ h(x) = h(x) ⊗ g(x)]
The convolution theorem states that the Fourier transform of the
convolution of two functions is proportional to the product of their
Fourier transforms:

$$ f(x) = g(x) \otimes h(x) \iff F(k) = \sqrt{2\pi}\; G(k) \times H(k) , \tag{2.51} $$

where F(k), G(k) and H(k) are the Fourier transforms of f(x), g(x)
and h(x), respectively, according to eqn (2.46). Given the reciprocity
between a Fourier transform and its inverse,

$$ f(x) = g(x) \times h(x) \iff F(k) = \frac{1}{\sqrt{2\pi}}\; G(k) \otimes H(k) . \tag{2.52} $$


The power of eqn (2.51) will be illustrated in a physical sense in the
next section, and throughout this book, but its computational benefit
stems from the fact that it’s much easier to multiply functions than
to convolve them. To work out g(x) ⊗ h(x) numerically, for example,
it’s quicker to use a fast Fourier transform (FFT) computer subroutine
to calculate G(k) and H(k), and inverse Fourier transform their
product, than to compute the integral of eqn (2.50) directly.

[Margin note: $g(x) \otimes h(x) = \int_{-\infty}^{\infty} G(k)\, H(k)\, e^{i k x}\, dk$]
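
The equivalence of the two routes can be seen in a discrete, periodic setting. The sketch below is an illustration under stated assumptions: it uses a plain O(N²) discrete Fourier sum (the hypothetical helper `dft`) rather than a true FFT routine, a symmetric Gaussian for h for simplicity, and an unnormalised discrete transform, so the √(2π) of eqn (2.51) is replaced by a factor of 1/N on the inverse.

```python
import cmath
import math

def dft(a, sign=-1):
    """Unnormalised discrete Fourier sum; sign=+1 gives the inverse apart from a factor 1/N."""
    n = len(a)
    return [sum(a[p] * cmath.exp(sign * 2j * math.pi * p * m / n) for p in range(n))
            for m in range(n)]

def circular_convolve(g, h):
    """Discrete, periodic form of eqn (2.50): sum over t of g[t] * h[x - t]."""
    n = len(g)
    return [sum(g[t] * h[(x - t) % n] for t in range(n)) for x in range(n)]

n = 32
g = [0.0] * n
for position, amplitude in ((3, 1.0), (10, 0.5), (12, 0.8), (20, 0.3)):
    g[position] = amplitude                                   # four spikes
h = [math.exp(-0.5 * (min(p, n - p) / 1.5) ** 2) for p in range(n)]   # broad blurring function

direct = circular_convolve(g, h)
G, H = dft(g), dft(h)
via_fourier = [v.real / n for v in dft([a * b for a, b in zip(G, H)], sign=+1)]
largest_difference = max(abs(a - b) for a, b in zip(direct, via_fourier))
print(largest_difference)
```

The two results agree to rounding error. With a genuine FFT the Fourier route costs of order N log N operations instead of N², which is where the computational benefit comes from.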
Putting k = 0 in eqn (2.51), and interpreting F(0), G(0) and H(0)
with eqn (2.49), shows that the area under the convolution is equal
to the product of the corresponding individual integrals:
$$ \int_{-\infty}^{\infty} \left[ g(x) \otimes h(x) \right] dx = \int_{-\infty}^{\infty} g(x)\, dx \times \int_{-\infty}^{\infty} h(t)\, dt . \tag{2.53} $$

This can be seen from the example of Fig. 2.10, where an array of
different shaped peaks is convolved with a Gaussian. Although the

Fig. 2.10 The convolution of a function with an array of different shaped peaks, g(x), with a Gaussian, h(x).

The Dirac δ-function


A Dirac δ-function, δ (x−xo ), is a sharp spike of unit area at a given loca-
tion, xo ; its simplicity as a ‘point impulse’ makes it a useful test object for
studying equations that model physical situations. Mathematically, it is
defined by
$$ \delta(x - x_o) = 0 \ \text{ if } x \neq x_o \quad\text{and}\quad \int_{-\infty}^{\infty} \delta(x - x_o)\, dx = 1 , $$

and can be thought of as the limiting form of a variety of functions as they
become ever narrower. Of these the most straightforward is a rectangular
column of width ε, centred on x = xo, and height 1/ε; this acquires the
properties of δ(x−xo) in the limit of ε → 0. An important corollary of the
above definition is

$$ \int_a^b f(x)\, \delta(x - x_o)\, dx = \begin{cases} f(x_o) & \text{if } a < x_o < b , \\ 0 & \text{otherwise} , \end{cases} \tag{2.54} $$

so that integrals involving a δ-function are easy to evaluate.

[Margin note: δ(x−xo) ⊗ h(x) = h(x−xo)]

two spikes on the left of g(x) merge into one in f(x), because they
are very closely spaced compared with the width of h(x), the areas
of the various components in the blurred output are proportional to
those of the input signal. The amplitudes of the narrowest peaks are
affected the most, since their relative spreading is the greatest as a
result of the convolution; the slowly varying parts of the structure
change the least.
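
The conservation of area in eqn (2.53), and the blurring of narrow features, can be illustrated with a few lines of Python. The sequences `g` and `h` below are made-up sample values on a unit-spaced grid, and `convolve` is a hypothetical helper written for this sketch.

```python
def convolve(g, h):
    """Full discrete convolution: out[x] = sum over t of g[t] * h[x - t]."""
    out = [0.0] * (len(g) + len(h) - 1)
    for t, gt in enumerate(g):
        for s, hs in enumerate(h):
            out[t + s] += gt * hs
    return out

g = [0.0, 2.0, 0.0, 0.0, 1.0, 1.5, 0.0, 0.0, 0.0, 0.5]   # spiky input with two close peaks
h = [0.1, 0.3, 0.4, 0.15, 0.05]                           # broad asymmetric blurring function
f = convolve(g, h)
print(sum(f), sum(g) * sum(h))    # the two 'areas' agree
print(max(g), max(f))             # the sharpest peak is lowered by the blurring

spike = [0.0, 0.0, 1.0]           # discrete stand-in for a delta function at position 2
print(convolve(spike, h))         # a copy of h shifted by two places
```

The last line mirrors the margin note δ(x−xo) ⊗ h(x) = h(x−xo): convolving with a unit spike simply relocates h.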

2.4.2 Auto-correlation function


The last Fourier concept that we need to consider concerns the auto-
correlation function, or ACF, which provides information on the dis-
tance distribution of the various structures in f(x). Mathematically,
the ACF of f(x) is defined by
$$ \mathrm{ACF}(x) = \int_{-\infty}^{\infty} f(t)^{*}\, f(x+t)\, dt , \tag{2.55} $$

and is real if f(x) = f(x)∗. Although this looks like a self-convolution,
or f(x) ⊗ f(−x)∗, it’s not the best way to think about eqn (2.55). The
ACF is largest at the origin,

$$ \mathrm{ACF}(0) = \int_{-\infty}^{\infty} f(t)^{*}\, f(t)\, dt , $$

[Margin note: f(x) f(x)∗ = |f(x)|² ≥ 0]



Fig. 2.11 An f(x) consisting of four sharp peaks and its auto-correlation function.
The spike at the origin of the ACF should be three times higher than drawn, and has
been suppressed for clarity. The relationship of the closest and farthest peaks in f(x)
to their corresponding mutual contributions in the ACF is indicated.

because everything correlates with itself. The value of the ACF at


a distance L away from the origin is calculated by multiplying f(x)
with a copy that’s displaced by L relative to it, f(L+x), and integrat-
ing the product; its magnitude is a measure of how much structure
there is in f(x) separated by a distance of L. This can be under-
stood most easily by considering the ACF of a function that consists
of a few sharp peaks, such as that shown in Fig. 2.11. Basically, two
spikes at x1 and x2 in f(x), with amplitudes A1 and A2, will contribute
a symmetric pair of very sharp components at ±(x1 − x2), and magnitude
A1 A2, towards the ACF of f(x); they will also add an amount
A1² + A2² to the ACF at the origin.

[Margin note: ACF(−x) = ACF(x)∗]
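
The two-spike rule can be verified directly on a discrete grid. In this sketch the positions (2 and 7), the amplitudes (1 and 2) and the helper name `acf` are illustrative assumptions, and f is taken to be real and zero outside the listed range.

```python
def acf(f, shift):
    """ACF(x) = sum over t of f[t] * f[x + t] for a real sequence that is zero off the grid."""
    n = len(f)
    return sum(f[t] * f[t + shift] for t in range(n) if 0 <= t + shift < n)

f = [0.0] * 12
f[2], f[7] = 1.0, 2.0          # spikes at x1 = 2 (A1 = 1) and x2 = 7 (A2 = 2)

values = {x: acf(f, x) for x in range(-11, 12)}
nonzero = {x: v for x, v in values.items() if v != 0.0}
print(nonzero)                 # {-5: 2.0, 0: 5.0, 5: 2.0}
```

The pair at ±5 has magnitude A1 A2 = 2, and the origin carries A1² + A2² = 5.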
The reason for discussing the ACF is its linear relationship to the
modulus-squared of a Fourier transform:

$$ \mathrm{ACF}(x) = \int_{-\infty}^{\infty} \left| F(k) \right|^{2} e^{i k x}\, dk , \tag{2.56} $$

where F(k) is given by eqn (2.46). While a Fourier transform and its
inverse contain the same information, albeit in different ways, and
it’s possible to switch between one and the other through eqns (2.45)
and (2.46), the situation becomes less straightforward if only |F(k)|
is available. We can begin to appreciate the problems caused by such
a loss of the Fourier phase by comparing the relative complexity of
the ACF with f(x) in Fig. 2.11. The ACF, which is directly available
Fig. 2.12 An f(x) containing a variety of peaks and its auto-correlation function.

from |F(k)| through eqn (2.56), is much harder to interpret in terms


of the underlying structure; for a diffuse case, such as that in Fig.
2.12, it’s almost impossible.
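
A discrete, periodic version of eqn (2.56) makes the point that the ACF needs only the Fourier amplitudes. The sketch below is a hedged illustration: the sample sequence `f` is made up, `dft` is a hypothetical plain discrete Fourier sum, and the 1/N normalisation belongs to this discrete convention rather than to the text.

```python
import cmath
import math

def dft(a, sign=-1):
    """Unnormalised discrete Fourier sum; sign=+1 gives the inverse apart from a factor 1/N."""
    n = len(a)
    return [sum(a[p] * cmath.exp(sign * 2j * math.pi * p * m / n) for p in range(n))
            for m in range(n)]

f = [0.0] * 16
f[1], f[4], f[5] = 1.0, 2.0, 0.5
n = len(f)

# Direct periodic ACF for a real sequence: sum over t of f[t] * f[t + x]
direct = [sum(f[t] * f[(t + x) % n] for t in range(n)) for x in range(n)]

# Same quantity from the modulus-squared of the transform; the phases are discarded here
power = [abs(F) ** 2 for F in dft(f)]
from_power = [v.real / n for v in dft(power, sign=+1)]

print(max(abs(a - b) for a, b in zip(direct, from_power)))
```

Both routes give the same ACF, even though `power` retains no phase information about f.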

2.5 Fourier optics and physical insight


So far, we have discussed Fourier transforms in a largely abstract
context. Now let’s try to gain some physical insight into their proper-
ties with the aid of diffraction experiments familiar from high school
physics. First, though, we need to establish the link between optics
and Fourier transforms.
The geometry of the diffraction experiment is shown in Fig. 2.13,
where a travelling plane wave passes through a set of slits and pro-
duces a pattern of dark and light bands on a very distant screen. We
have made the problem one-dimensional for simplicity, but will indi-
cate its generalization later. The nature of the aperture is defined by
the function A(x), which describes how much light passes through
it at position x; this is called the aperture function. It usually only
takes values of zero or one, corresponding to complete opaqueness
and transparency respectively, but it could in principle be complex
with 0 ≤ |A(x)| ≤ 1.
To calculate the diffraction pattern, the principle of superposition
tells us that we need to add up all the waves that emerge through
the aperture. The amplitude of the contribution from the narrow
region between x and x + ∆x is proportional to A(x) ∆x, but what


Fig. 2.13 The geometry for Fraunhofer diffraction by a one-dimensional aperture,
A(x); the interference pattern of interest, I(q), is projected onto a distant screen.

about its phase φ? That depends on both x and the angle of propaga-
tion relative to the incident wave, θ, as well as the time t. The phase
will be invariant with position parallel to the incoming wavefront,
but will gain a relative factor of

$$ \Delta\phi = \frac{2\pi}{\lambda}\, x \sin\theta $$

in the direction of θ due to the associated path difference of x sin θ.


Hence, the complex contribution to the resultant wave is

$$ \Delta\psi = \psi_o\, A(x)\, e^{i q x}\, \Delta x , $$

where q = 2π sin θ/λ and the temporal variation has been absorbed
into the ‘constant’ of proportionality, ψo . The diffracted wave, ψ, is
the sum of all such terms; in the limit ∆x → 0, it becomes the Fourier
transform of the aperture function:
$$ \psi(q) = \psi_o \int_{-\infty}^{\infty} A(x)\, e^{i q x}\, dx . \tag{2.57} $$

Thus we met Fourier transforms a (very) long time ago but did not
realize it! Before reminding ourselves of the results from elementary
diffraction experiments, and trying to understand them in terms
of what we’ve now learnt about Fourier transforms, we need to make a
few qualifying remarks.
The first point is essentially a technicality, but the above analy-
sis assumes that we are considering Fraunhofer diffraction. This

is the limit where the projection screen is so far away that all the
waves reaching a particular point can be considered to be travelling
in parallel directions. The equations become more cumbersome
when this approximation does not hold, which leads to the theory of
Fresnel diffraction.
The more serious point of note is that the observed, or measured,
diffraction pattern is not the complex function ψ(q) but its intensity,
or modulus-squared, I(q):
$$ I(q) = \left| \psi(q) \right|^{2} = \psi(q)\, \psi(q)^{*} . \tag{2.58} $$

The difficulties caused by such a loss of phase information, in terms



of ascertaining the aperture function from its diffraction pattern,
have been alluded to in Section 2.4.2, but we will encounter them
again throughout this book.

2.5.1 Young’s double slit


A first introduction to interference experiments usually involves a
Young’s double slit. This consists of a pair of very narrow slits that
are separated by a distance d, and give rise to a diffraction pattern of
uniformly spaced dark and light bands which become closer together
as d increases. Let’s try to understand this theoretically by using
eqns (2.57) and (2.58).
The aperture function for a Young’s double slit can be modelled
by two δ-functions located at a distance of d/2 on either side of an
arbitrarily defined origin,
 
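$$ A(x) = \delta\!\left( x - \tfrac{d}{2} \right) + \delta\!\left( x + \tfrac{d}{2} \right) , $$

and is plotted in Fig. 2.14. Since δ-functions are easy to integrate,
from eqn (2.54), the Fourier transform of A(x) is readily shown to
yield

$$ \begin{aligned} \psi(q) &= \psi_o \left[ e^{i q d/2} + e^{-i q d/2} \right] \\ &= \psi_o\, 2 \cos\!\left( \tfrac{q d}{2} \right) , \end{aligned} $$

[Margin note: $\int_{-\infty}^{\infty} \delta(x - x_o)\, e^{i q x}\, dx = e^{i q x_o}$]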

where we have used eqn (2.32) in writing the second line. The prod-
uct of this diffracted wave with its complex conjugate, ψ(q)∗ , leads
to the prediction
$$ I(q) \propto \left[ \cos\!\left( \tfrac{q d}{2} \right) \right]^{2} \propto 1 + \cos(q d) , \tag{2.59} $$

where all the multiplicative prefactors not involving q, such as |ψo|²,
have been omitted and a trigonometric double angle formula used
on the far right-hand side. This pattern of ‘uniform cosine fringes’ is
plotted in Fig. 2.14.

[Margin note: cos 2θ = 2 cos²θ − 1]
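
The algebra leading to eqn (2.59) is easily confirmed numerically. In the sketch below the slit separation `d = 1` and the prefactor ψo = 1 are arbitrary assumptions; with these choices the intensity is exactly 2[1 + cos(qd)].

```python
import cmath
import math

d = 1.0   # slit separation in arbitrary units (assumed value)

def psi(q):
    """Transform of two delta-function slits at x = +d/2 and x = -d/2, taking psi_o = 1."""
    return cmath.exp(1j * q * d / 2) + cmath.exp(-1j * q * d / 2)

for q in (0.0, 1.0, math.pi / d, 2 * math.pi / d):
    print(q, abs(psi(q)) ** 2, 2 * (1 + math.cos(q * d)))
```

Dark bands occur where qd is an odd multiple of π, and the pattern repeats every 2π/d in q, so the fringes crowd together as d grows.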

Fig. 2.14 The aperture function for a Young’s double slit, A(x), its Fourier transform, ψ(q), and the diffraction pattern, I(q).



The theoretical result in eqn (2.59) is consistent with the experi-
mental observations: the dark and light bands are equally spaced, of
uniform intensity and become closer together in inverse proportion
to the distance d between the slits. The last feature is a universal
property of Fourier transforms: the length scales which character-
ize a function and its Fourier transform are inversely related to each
other. This leads to the use of the terminology reciprocal space when
referring to the Fourier domain.

2.5.2 A single wide slit


Another common interference experiment involves a single wide slit
that gives rise to a diffraction pattern where the intensity of the
light bands diminishes rapidly away from a central bright region,
which is itself twice as broad as the rest. Let’s also try to understand
this theoretically.
If we take the x-origin to be in the middle of the slit of width w,
then the aperture function becomes
$$ A(x) = \begin{cases} 1 & \text{if } |x| \leq \tfrac{w}{2} , \\ 0 & \text{otherwise} , \end{cases} $$

and is plotted in Fig. 2.15. According to eqn (2.57), therefore,

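$$ \psi(q) = \psi_o \int_{-w/2}^{w/2} e^{i q x}\, dx . $$

This Fourier transform is easy to evaluate, because the integration
of an exponential is straightforward, and yields

$$ \psi(q) = \psi_o \left[ \frac{e^{i q x}}{i q} \right]_{-w/2}^{w/2} = \frac{\psi_o}{i q} \left[ e^{i q w/2} - e^{-i q w/2} \right] . $$

[Margin note: $\frac{d}{dx}\left( e^{\mu x} \right) = \mu\, e^{\mu x}$]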

The difference of the imaginary exponentials on the far right-hand


side can be recognized as being equal to 2 i times sin(qw/2) from

Fig. 2.15 The aperture function for a single wide slit, A(x), its Fourier transform, ψ(q), and the diffraction pattern, I(q).



eqn (2.32). With this substitution, the modulus-squared of ψ(q) leads
to the prediction
$$ I(q) \propto \left[ \frac{1}{q} \sin\!\left( \frac{q w}{2} \right) \right]^{2} \propto \frac{1 - \cos(q w)}{q^{2}} , \tag{2.60} $$

which is shown in Fig. 2.15 and consistent with the sinc-squared
behaviour of the observed diffraction pattern. We again see the inverse
relationship between the width of the aperture function and the
spread of the diffraction pattern: as one of them becomes broader
the other gets narrower.

[Margin note: sinc θ = sin θ / θ → 1 as θ → 0]
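
For readers who like to see the integral done by brute force, the following sketch compares a midpoint-sum evaluation of the slit transform with the closed form 2 sin(qw/2)/q. The width `w = 2`, the prefactor ψo = 1 and the helper names are assumptions made for the illustration.

```python
import cmath
import math

w = 2.0    # slit width in arbitrary units (assumed value)

def psi_numeric(q, n=4000):
    """Midpoint-sum estimate of the integral of exp(i q x) from -w/2 to w/2."""
    dx = w / n
    return sum(cmath.exp(1j * q * (-w / 2 + (j + 0.5) * dx)) for j in range(n)) * dx

def psi_analytic(q):
    """Closed form 2 sin(q w / 2) / q, with the limiting value w at q = 0."""
    return w if q == 0 else 2 * math.sin(q * w / 2) / q

for q in (0.0, 0.7, 3.0, 10.0):
    print(q, psi_numeric(q), psi_analytic(q))

first_zero = 2 * math.pi / w      # the central maximum spans -2 pi / w to +2 pi / w
print(first_zero, psi_analytic(first_zero))
```

The zeros fall at multiples of 2π/w, so the central maximum is twice as wide as the side lobes and shrinks as the slit is widened.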

2.5.3 A diffraction grating


A diffraction grating is an aperture consisting of a large number of
thin, parallel and equally spaced lines. In one dimension, it can be
modelled as a periodic array of δ-functions:

$$ A(x) = \sum_{m=-\infty}^{\infty} \delta(x - m d) , $$

where d is the distance between the grating lines. Swapping the order
of integration and summation, and using eqn (2.54), the Fourier
transform of eqn (2.57) reduces to

$$ \psi(q) = \psi_o \sum_{m=-\infty}^{\infty} e^{i q d m} . $$

[Margin note: $\int_{-\infty}^{\infty} \delta(x - m d)\, e^{i q x}\, dx = e^{i q m d}$]

The nature of ψ(q) becomes apparent once we realize that it’s pro-
portional to the sum of complex numbers that are of unit magnitude
but varying phase. They will add up coherently if the product qd is
an integer multiple of 2π, yielding a huge resultant sum, but cancel
out otherwise. Hence,

$$ \psi(q) \propto \sum_{n=-\infty}^{\infty} \delta(q - n q_o) , \tag{2.61} $$

Fig. 2.16 The aperture function for a diffraction grating, A(x), its Fourier transform, ψ(q), and the diffraction pattern, I(q).



where qo = 2 π/d . The diffraction pattern has the same structure
as the grating, therefore, but the spacing of the lines is inversely
related to d (Fig. 2.16).
[Margin note: $\frac{2\pi \sin\theta}{\lambda} = \frac{2\pi n}{d}$]

In terms of the physical set up of Fig. 2.13, where q = 2π sin θ/λ,
sharp bright lines are seen when

$$ n \lambda = d \sin\theta , \tag{2.62} $$

for n = 0, ±1, ±2, . . . , ±nmax, where the trigonometric constraint that
|sin θ| ≤ 1 imposes a cutoff on the highest observable order nmax. If
the spacing of the diffraction grating is known, such as 500 lines per
millimetre (so that d = 2 µm), then eqn (2.62) provides the basis for
an accurate measurement of the wavelength of the illumination. If
white light is used for the experiment, then the intense central line
is accompanied by increasingly dispersed rainbows for the higher
orders; this is because each of the wavelengths that makes up white
light satisfies eqn (2.62) for a slightly different angle θ for a given
value of n ≠ 0.
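
As a worked example of eqn (2.62), the sketch below lists the observable orders for the 500 lines per millimetre grating mentioned above. The wavelength of 589 nm (sodium yellow light) is our own illustrative assumption and does not appear in the original text.

```python
import math

lines_per_mm = 500
d = 1e-3 / lines_per_mm          # line spacing in metres (2 micrometres)
wavelength = 589e-9              # assumed illustrative wavelength in metres

n_max = math.floor(d / wavelength)        # highest order allowed by |sin(theta)| <= 1
angles = {n: math.degrees(math.asin(n * wavelength / d)) for n in range(n_max + 1)}
for n, theta in angles.items():
    print(n, round(theta, 2))
```

Running the calculation in reverse, a measured angle for a known order n gives the wavelength as d sin θ / n.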

2.5.4 The convolution theorem in action


Although a real diffraction grating isn’t infinite as assumed above,
we expect the analysis to be a very good approximation for one that
is sufficiently large. The case of a grating of limited extent w can be
addressed by combining the results of eqns (2.60) and (2.61) through
the convolution theorem: as the aperture function can be expressed
as a product of an infinite grating with line spacing d and a single
slit of width w, as illustrated in Fig. 2.17, the Fourier transform
of the finite grating is equal to the convolution of the Fourier
transforms of the infinite grating and the single wide slit. The
resultant diffraction pattern is simply that of the infinite grating but
with each of the δ-functions replaced by a narrow sinc-squared function,
as shown in Fig. 2.17. A qualification is in order here, in that
I(q) ∝ |G(q)|² ⊗ |H(q)|² is only an approximation (albeit a good one);
strictly speaking, I(q) ∝ |G(q) ⊗ H(q)|². Given the inverse relationship
between the length scales of a function and its Fourier transform,
the width of the large diffraction peaks tells us about the size
w of the grating whereas the distance between them indicates the
d-spacing of its lines. As the number of grating lines goes up, so that
the ratio w/d increases, the principal peaks become narrower and
more low-level wiggles appear between them.

[Margin note: A(x) = g(x) × h(x) ∴ ψ(q) = ψo G(q) ⊗ H(q)]

Fig. 2.17 The diffraction pattern from a grating of limited extent, w, can be evaluated from a knowledge of the Fourier transforms of an infinite grating, with line spacing d, and a single slit, of width w, through the use of the convolution theorem.
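
The narrowing of the principal peaks with the number of lines can be checked by summing the contributions from a finite grating directly. This is a sketch under our own assumptions: the lines are idealized as δ-functions at x = md for m = 0 to N − 1, the spacing is `d = 1`, and ψo = 1.

```python
import cmath
import math

d = 1.0    # line spacing in arbitrary units (assumed value)

def intensity(q, n_lines):
    """Modulus-squared of the sum of exp(i q m d) over a grating of n_lines delta-function lines."""
    psi = sum(cmath.exp(1j * q * m * d) for m in range(n_lines))
    return abs(psi) ** 2

q_o = 2 * math.pi / d                 # spacing of the principal peaks
for n_lines in (5, 20):
    peak = intensity(q_o, n_lines)                        # close to n_lines squared
    first_zero = intensity(q_o + q_o / n_lines, n_lines)  # first zero beside the peak
    print(n_lines, peak, first_zero)
```

The principal peaks stay at multiples of 2π/d, while the first zero beside each one sits at a distance 2π/(Nd), so the peaks sharpen as the grating is extended.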
The convolution theorem also enables us to ascertain the diffrac-
tion pattern for a pair of broad slits from the results of eqns (2.59)
and (2.60). Taking each to be of width w, and separated by d, the
aperture function can be seen as a convolution of an ideal Young’s
double slit with a narrow but finite single slit, as in Fig. 2.18. Since
the Fourier transform of the former is then equal to the product of
those of the latter, the intensity of the uniform cosine fringes that
we’d expect from a perfect Young’s double slit is modulated by a
slowly varying sinc-squared function.

[Margin note: A(x) = g(x) ⊗ h(x) ∴ ψ(q) = ψo G(q) × H(q)]

2.5.5 Multi-dimensional generalization


Having illustrated Fourier transforms and the use of the convolu-
tion theorem with one-dimensional versions of familiar high school
experiments, let’s indicate the multi-dimensional generalization of
eqn (2.57). A closer examination of Fig. 2.13 reveals q to be the
x-component of the wavevector k of Section 2.1.1:

$$ k_x = \frac{2\pi}{\lambda} \sin\theta = q \quad\text{and}\quad k_z = \frac{2\pi}{\lambda} \cos\theta , $$

[Margin note: k = (kx, ky, kz)]


Fig. 2.18 The diffraction pattern from a pair of slits of width w and separation d can be evaluated from a knowledge of the
Fourier transforms of a Young’s double slit, of spacing d, and a single slit, of width w, with the convolution theorem.

where we have taken z to be the original direction of propagation,


from the aperture to the projection screen, and |k| = 2π/λ. With this
observation, it seems plausible that the two-dimensional diffraction
pattern, I(kx , ky ), from an aperture in the x–y plane, A(x, y), with y
coming out of the page in Fig. 2.13, might be given by the modulus-
squared of
$$ \psi(k_x, k_y) = \psi_o \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} A(x, y)\, e^{i (k_x x + k_y y)}\, dx\, dy . \tag{2.63} $$

This double integral, over the surface area of the aperture, simpli-
fies to the product of two one-dimensional integrals if the aperture
function is separable:
$$ \psi(k_x, k_y) = \psi_o \int_{-\infty}^{\infty} A_1(x)\, e^{i k_x x}\, dx \int_{-\infty}^{\infty} A_2(y)\, e^{i k_y y}\, dy $$

if A(x, y) = A1(x) A2(y). The substitution of either δ(y) or a constant
for A2(y), and the properties of δ-functions, allows us to confirm that
eqn (2.63) reduces to the one-dimensional form of eqn (2.57) if the
aperture is either infinitesimally thin or invariant with respect to y.
Strictly speaking,

$$ \psi \to \psi_1(k_x) \ \text{ as } \ A \to A_1(x)\, \delta(y) \quad\text{and}\quad \psi \to \psi_1(k_x)\, \delta(k_y) \ \text{ as } \ A \to A_1(x) $$

[Margin note: $\int_{-\infty}^{\infty} e^{i (q - q_o) t}\, dt = 2\pi\, \delta(q - q_o)$]


Multiple integrals
(From Sivia and Rawlings (1999), Foundations of Science Mathematics, Oxford Chemistry Primers Series, 77.)

In ordinary integration, we are concerned with the area under the curve
y = f(x). Many functions of interest in real life entail several variables,
and multiple integrals are a natural extension of the one-dimensional
ideas to deal with multivariate problems.
To get a feel for how multiple integrals arise, let’s consider a couple of
physical examples. Suppose that we wish to calculate the force exerted
on a wall by a gale. If the pressure P was constant across the whole face
with area A, then the total force is simply P×A. With a varying pressure
P(x, y), the answer is not so obvious. This situation can be handled by



thinking about the wall as consisting of many small square segments, each
with area δx δy, so that the total force is the sum of all the contributions
P(x, y) δx δy; in the limiting case when δx → 0 and δy → 0, we have
$$ \text{Force} = \iint_{\text{wall}} P(x, y)\, dx\, dy $$

where the double integral indicates that the infinitesimal summation is


being carried out over a two-dimensional surface (in the x and y direc-
tions). Incidentally, if the wall does not have a conventional (rectangular)
shape then its area can be calculated similarly according to
$$ \text{Area} = \iint_{\text{wall}} dx\, dy . $$

The double integral is also called a surface integral.


Another illustration is provided by quantum mechanics where the
modulus-squared of the wave function, |ψ(x, y, z)|², of an electron (say)
gives the probability density of finding it at some point in space. The
chance that the electron is in a small (cuboid) region of volume δx δy δz
is then |ψ(x, y, z)|² δx δy δz. Hence, the probability of finding it within a
finite domain V is given by

$$ \text{Probability} = \iiint_{V} \left| \psi(x, y, z) \right|^{2} dx\, dy\, dz , $$

which is known as a triple, or volume, integral.

and demonstrates the reciprocal Fourier relationship between the


widths of A2(y) and ψ2(ky) in the limit of complete invariance versus
a δ-function. A careful consideration of the situation, in a manner
analogous to that used to derive eqn (2.57), shows eqn (2.63) to be
the correct two-dimensional extension.
The most common case of two-dimensional diffraction is from a
circular hole, but a rectangle is easier to deal with analytically. This
is because the aperture function of the latter, which is equal to one
inside the rectangle and zero outside it, is separable and yields a


Fig. 2.19 Diffraction patterns from two-dimensional apertures: (a) a rectangular
opening of size Dx by Dy , and (b) a circular hole of radius R.

Fourier transform that is just a product of the now familiar sinc


functions in kx and ky ; its modulus-squared is shown in Fig. 2.19(a).
The evaluation of the integral of eqn (2.63) is less straightforward
for a circular aperture, but the resultant diffraction pattern is plot-
ted in Fig. 2.19(b). It is circularly symmetric, depending only on
kx² + ky², and is similar to a sinc function in the radial direction; the
behaviour is formally governed by a J1 Bessel function. The central
bright region is called an Airy disc, and its spread is the basis of the
resolution formula of eqn (1.8).
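
The separability argument for the rectangular opening can be tested by comparing a direct two-dimensional sum with the product of two one-dimensional ones. In the sketch below the dimensions `Dx` and `Dy`, the sample wavevector components and the helper names are illustrative assumptions, and ψo is set to one.

```python
import cmath
import math

Dx, Dy = 2.0, 1.0      # size of the rectangular opening (assumed values)

def slit(k, width, n=200):
    """Midpoint-sum transform of a unit-height one-dimensional slit of the given width."""
    dx = width / n
    return sum(cmath.exp(1j * k * (-width / 2 + (j + 0.5) * dx)) for j in range(n)) * dx

def rectangle(kx, ky, n=200):
    """Direct two-dimensional midpoint sum of exp(i (kx x + ky y)) over the rectangle."""
    dx, dy = Dx / n, Dy / n
    total = 0j
    for a in range(n):
        x = -Dx / 2 + (a + 0.5) * dx
        for b in range(n):
            y = -Dy / 2 + (b + 0.5) * dy
            total += cmath.exp(1j * (kx * x + ky * y))
    return total * dx * dy

kx, ky = 1.5, -2.0
two_dimensional = rectangle(kx, ky)
product = slit(kx, Dx) * slit(ky, Dy)
sincs = (2 * math.sin(kx * Dx / 2) / kx) * (2 * math.sin(ky * Dy / 2) / ky)
print(two_dimensional, product, sincs)
```

The double sum matches the product of the two single sums, and both approximate the product of sinc-like factors expected for a rectangle. No such shortcut exists for the circular hole, which is why its analysis is harder.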
Having seen the formulae for the Fourier transforms of one- and
two-dimensional functions, in eqns (2.45), (2.46) and (2.63), we can
state the M -dimensional generalization succinctly by using vector
notation:

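$$ f(\mathbf{r}) = (2\pi)^{-M/2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} F(\mathbf{k})\, e^{-i\, \mathbf{k} \cdot \mathbf{r}}\, d^{M}\mathbf{k} \tag{2.64} $$

and

$$ F(\mathbf{k}) = (2\pi)^{-M/2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(\mathbf{r})\, e^{i\, \mathbf{k} \cdot \mathbf{r}}\, d^{M}\mathbf{r} \tag{2.65} $$

[Margin notes: d³k = dkx dky dkz; k • r = kx x + ky y + kz z + · · · ; d³r = dx dy dz]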

where r = (x, y, z, . . . ) and k = (kx , ky , kz , . . . ) have M components,


with corresponding hyper-volume elements $d^{M}\mathbf{r}$ and $d^{M}\mathbf{k}$. We reiterate
that a Fourier transform and its inverse come as a linked
pair, but which one is called which is arbitrary. Their precise defini-
tions are also a matter of convention. No multiplicative prefactors
are required if the wavevector is specified in cycles rather than ra-
dians per unit length, for example, when the k in the exponents is
replaced with 2πk.

2.6 Fourier data analysis


The analysis of data from X-ray and neutron scattering experiments
is similar to the task of making inferences about the aperture func-
tion from its diffraction pattern. If we knew that A(x) consisted of
a small number of slits, n say, of equal spacing d, as in Fig. 2.17
with w = (n−1)d, then an examination of the width and separation
of the principal peaks in I(q) readily provides the desired parame-
ters n and d. In less well informed circumstances, however, all we
have to go on is the relationship between A(x) and I(q) enshrined in
eqns (2.57) and (2.58). How can the data then be analysed and what



difficulties are likely to arise?

2.6.1 The phase problem


Ignoring matters of practicality for the moment, the most relevant
mathematical operation that can be performed on a diffraction pat-
tern is a Fourier transform:
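$$ \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} I(\mathbf{k})\, e^{-i\, \mathbf{k} \cdot \mathbf{r}}\, d^{M}\mathbf{k} \;\propto\; \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} A(\mathbf{t})^{*}\, A(\mathbf{r} + \mathbf{t})\, d^{M}\mathbf{t} , $$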

giving the M -dimensional generalization of the ACF of eqns (2.55)


and (2.56), where
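$$ I(\mathbf{k}) = \left| \psi(\mathbf{k}) \right|^{2} \quad\text{and}\quad \psi(\mathbf{k}) = \psi_o \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} A(\mathbf{r})\, e^{i\, \mathbf{k} \cdot \mathbf{r}}\, d^{M}\mathbf{r} . $$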

Whereas the correspondence between A(r) and the complex function


ψ(k) is one-to-one, implying that there is no loss of information in
the transformation, the same is not true of A(r) and the real and
positive diffraction pattern I(k). Only the auto-correlation function
of A(r) can be ascertained unambiguously from I(k), and we have
already seen, in Figs. 2.11 and 2.12, how much more difficult it is to
interpret the ACF than A(r).
The simplest way of appreciating how the lack of phase, arg {ψ(k)},
in a diffraction pattern results in a loss of uniqueness about A(r) is
to consider the Fourier transform of an aperture function that has
been shifted by ro ,
$$
\psi_o \int_{-\infty}^{\infty}\!\cdots\!\int_{-\infty}^{\infty} A(\mathbf{r}+\mathbf{r}_o)\, e^{i\,\mathbf{k}\cdot\mathbf{r}}\, d^{M}\mathbf{r} = \psi(\mathbf{k})\, e^{-i\,\mathbf{k}\cdot\mathbf{r}_o} \,,
$$

which differs from that of A(r) only through an additional term of
−k • ro in its argument; the intensity, or modulus-squared,

$$
\Bigl[\psi(\mathbf{k})\, e^{-i\,\mathbf{k}\cdot\mathbf{r}_o}\Bigr] \Bigl[\psi(\mathbf{k})\, e^{-i\,\mathbf{k}\cdot\mathbf{r}_o}\Bigr]^{*} = \psi(\mathbf{k})\, \psi(\mathbf{k})^{*}\, e^{-i\,\mathbf{k}\cdot\mathbf{r}_o}\, e^{i\,\mathbf{k}\cdot\mathbf{r}_o} \,,
$$

is unchanged, since e^θ e^−θ = e^0 = 1. The amplitude of a Fourier
transform is, therefore, insensitive to translation. The same is true
of the inversion of a real function, so that A(r) and A(−r) give
identical diffraction patterns if A(r) = A(r)∗. This provides another
elementary demonstration of the loss of uniqueness without the
Fourier phase.

Fig. 2.20 The phase problem: (c) has the Fourier phases of (a) and the Fourier amplitudes
of (b), while (d) has the phases of (b) and the amplitudes of (a).

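
The insensitivity of the intensity to translation and inversion can be checked numerically. The following is a minimal sketch, assuming a small discrete aperture, a cyclic shift in place of A(r + ro), and a plain discrete Fourier transform standing in for the integral; the names `dft` and `intensity` and the sample values are illustrative choices, not part of the text.

```python
import cmath
import math

def dft(values):
    """Plain O(N^2) discrete Fourier transform with kernel exp(+2*pi*i*k*m/N)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def intensity(values):
    """Modulus-squared of the transform, the discrete analogue of I(k)."""
    return [abs(c) ** 2 for c in dft(values)]

aperture = [0.0, 0.0, 1.0, 3.0, 2.0, 0.0, 0.0, 0.0]   # arbitrary real, asymmetric A(x)
n = len(aperture)
shifted = aperture[3:] + aperture[:3]                  # cyclic analogue of A(x + x0)
inverted = [aperture[(-m) % n] for m in range(n)]      # cyclic analogue of A(-x)

reference = intensity(aperture)
same_after_shift = all(abs(p - q) < 1e-9 for p, q in zip(reference, intensity(shifted)))
same_after_inversion = all(abs(p - q) < 1e-9 for p, q in zip(reference, intensity(inverted)))
print(same_after_shift, same_after_inversion)
```

The three apertures are different functions, yet they share one diffraction pattern; only the discarded phases tell them apart.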
The importance of the phase of a Fourier transform can be illus-
trated dramatically with graphical examples of the type shown in
Fig. 2.20. Here two photographs, pertaining to any subject or scene,
are Fourier transformed numerically and the phases of one assigned
to the amplitudes of the other. Each of these hybrid sets of complex
coefficients is then inverse Fourier transformed, and the resultant
pictures examined visually. Instinctively we would guess that the
outcome of this numerical experiment will be a complete mess, for
why should the Fourier phases of one distribution of light intensity
have anything to do with the amplitudes from another; if not, we
might expect to see some sort of mixture of the two sources. What we
find in practice is certainly degraded compared to the originals, but
each output only resembles the scene which contributed the Fourier
phase with no hint of that from which the amplitudes were taken.
It seems that most of the structural information in a Fourier transform
resides in its phase; and since this is missing in diffraction
data, it makes their analysis difficult in general without additional
prior knowledge.
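
A one-dimensional toy version of this numerical experiment can be sketched as follows. It is not the photographic example of Fig. 2.20: the two 'scenes' are assumed to be a narrow peak and a broad bump at different positions, and the helper names (`dft`, `hybrid`, `peak`) are hypothetical. For this particular choice the hybrid built from the phases of the narrow peak is guaranteed to peak where that scene did, whatever amplitudes are supplied; the other hybrid is simply printed for inspection.

```python
import cmath
import math

def dft(values, sign=+1):
    """Plain O(N^2) discrete Fourier transform with kernel exp(sign*2*pi*i*k*m/N)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

def hybrid(phase_source, amplitude_source):
    """Inverse transform of the amplitudes of one signal with the phases of another."""
    phases = [cmath.phase(c) for c in dft(phase_source)]
    amplitudes = [abs(c) for c in dft(amplitude_source)]
    mixed = [cmath.rect(a, p) for a, p in zip(amplitudes, phases)]
    n = len(mixed)
    return [c.real / n for c in dft(mixed, sign=-1)]

def peak(signal):
    """Index of the largest value in a signal."""
    return max(range(len(signal)), key=lambda m: signal[m])

n = 32
scene_a = [1.0 if m == 5 else 0.0 for m in range(n)]            # narrow peak at m = 5
scene_b = [math.exp(-((m - 20) / 3.0) ** 2) for m in range(n)]  # broad bump at m = 20

phases_a_amps_b = hybrid(scene_a, scene_b)
phases_b_amps_a = hybrid(scene_b, scene_a)
print(peak(scene_a), peak(scene_b), peak(phases_a_amps_b), peak(phases_b_amps_a))
```

The position of a feature is carried entirely by the phases, which is why the hybrid follows the scene that donated them.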

2.6.2 Truncation effects and windowing


Even when the main interest is in the ACF of the aperture function,
and the absence of Fourier phase is not a problem, the limited sam-
pling of a diffraction pattern causes difficulties in practice. In the
simplest one-dimensional case, when I(q) is available only within
the range |q| ≤ qmax , the truncated Fourier integral

$$
\int_{-q_{\max}}^{q_{\max}} I(q)\, e^{-iqx}\, dq = 2 \int_{0}^{q_{\max}} I(q) \cos(qx)\, dq \,, \tag{2.66}
$$

where the cosine equivalent on the right assumes that the aperture
function is real (A(x) = A(x)∗ ⟹ I(q) = I(−q)), yields an estimate
of the ACF that is corrupted by
ripples with a characteristic wavelength of 2π/qmax . These artefacts
can be understood with the aid of the convolution theorem, and Fig.
2.21, by considering eqn (2.66) to be the Fourier transform of the
product of the full but unmeasured diffraction pattern, J(q), and a
‘top-hat’ function of width 2 qmax , H(q). The result is, therefore, the
true but unknown auto-correlation function, acf, convolved with a
sinc function, h(x), whose central peak has a full width at half
maximum (FWHM) of about 3.8/qmax .

Fig. 2.21 The Fourier transform of a diffraction pattern of limited q-extent, I(q), yields an ACF of the aperture function which
is corrupted by truncation ripples associated with qmax ; their origin is easily understood from the convolution theorem.

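
The truncation ripples can be reproduced with a few lines of numerical integration. The sketch below assumes, purely for illustration, that the full diffraction pattern is a unit-width Gaussian (so that its exact transform is known and positive everywhere) and evaluates the right-hand side of eqn (2.66) with the trapezoidal rule; the function names and the value qmax = 1 are arbitrary choices. It also locates the half-maximum of sin(u)/u by bisection to check the quoted width of the sinc function.

```python
import math

def truncated_transform(pattern, q_max, x, steps=2000):
    """Trapezoidal estimate of 2 * integral_0^q_max pattern(q) cos(q x) dq, eqn (2.66)."""
    dq = q_max / steps
    total = 0.5 * (pattern(0.0) + pattern(q_max) * math.cos(q_max * x))
    for m in range(1, steps):
        q = m * dq
        total += pattern(q) * math.cos(q * x)
    return 2.0 * total * dq

def full_pattern(q):
    """Assumed test pattern: a Gaussian, whose untruncated transform is known exactly."""
    return math.exp(-0.5 * q * q)

q_max = 1.0
xs = [0.1 * j for j in range(101)]
exact = [math.sqrt(2.0 * math.pi) * math.exp(-0.5 * x * x) for x in xs]
estimate = [truncated_transform(full_pattern, q_max, x) for x in xs]

# Half-width at half maximum of sin(u)/u, found by bisection on [1, 3],
# where the function decreases monotonically.
lo, hi = 1.0, 3.0
for _ in range(60):
    mid = 0.5 * (lo + hi)
    if math.sin(mid) / mid > 0.5:
        lo = mid
    else:
        hi = mid
fwhm_times_qmax = 2.0 * lo

print(min(exact) > 0.0, min(estimate) < 0.0, round(fwhm_times_qmax, 2))
```

The exact transform never goes negative, whereas the truncated estimate dips well below zero near x = 3π/(2 qmax): a ripple that has nothing to do with the underlying function.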
The messy picture due to the truncation ripples can be cleaned up
greatly by multiplying the incomplete diffraction pattern, I(q), with
a window function, W(q), which decays smoothly from one at the
origin to zero around ±qmax , before the (inverse) Fourier transform
is calculated. This is illustrated in Fig. 2.22 with the ubiquitous
Gaussian,

$$
W(q) = \exp\!\left(-\frac{q^{2}}{2\,\sigma^{2}}\right) , \tag{2.67}
$$

whose standard deviation σ was chosen somewhat arbitrarily as
qmax /2 (the FWHM of a Gaussian being √(8 ln 2) σ ≈ 2.35 σ). The
resultant auto-correlation function, denoted by acf, is
said to be a filtered version of the ACF given by I(q). The suppression
of the truncation ripples can also be understood from the convolution
theorem, which tells us that acf(x) = ACF(x) ⊗ w(x) , because the
subsidiary oscillations are averaged out through a blurring with the
filter w(x). The latter is just the Fourier transform of the windowing
function, W(q), with
$$
w(x) \propto \exp\!\left(-\frac{\sigma^{2} x^{2}}{2}\right) \tag{2.68}
$$

for the case of eqn (2.67). Although the spurious peaks and troughs
are increasingly reduced as w(x) becomes broader, requiring W(q)
to be narrower, the drawback is that intrinsically sharp features
of the auto-correlation function are smeared out even more. Thus
filtering is a matter of striking a balance between the suppression of
the truncation ripples and a further loss of resolution. A variety of
windowing functions have been developed to try to best achieve this
end.

Fig. 2.22 Truncation ripples can be suppressed by multiplying the diffraction pattern, I(q), by a ‘window’ function, W(q),
which decays smoothly to zero over a q-range comparable to that of the measurements, before calculating the (inverse) Fourier
transform; the resultant ‘filtered’ auto-correlation function can also be understood from the convolution theorem.

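
The trade-off between ripple suppression and resolution can be seen in a short numerical sketch. As before, the full pattern is assumed to be a unit-width Gaussian and qmax = 1; the window is that of eqn (2.67) with σ = qmax/2. The helper names (`cosine_transform`, `half_width`) are illustrative only.

```python
import math

def cosine_transform(func, q_max, x, steps=2000):
    """Trapezoidal estimate of 2 * integral_0^q_max func(q) cos(q x) dq."""
    dq = q_max / steps
    total = 0.5 * (func(0.0) + func(q_max) * math.cos(q_max * x))
    for m in range(1, steps):
        q = m * dq
        total += func(q) * math.cos(q * x)
    return 2.0 * total * dq

q_max = 1.0
sigma = q_max / 2.0          # the 'somewhat arbitrary' choice made in the text

def pattern(q):
    """Assumed test pattern with a known, strictly positive transform."""
    return math.exp(-0.5 * q * q)

def windowed(q):
    """Pattern multiplied by the Gaussian window W(q) of eqn (2.67)."""
    return pattern(q) * math.exp(-q * q / (2.0 * sigma * sigma))

def half_width(values, positions):
    """First position at which the values fall below half of their value at x = 0."""
    half = 0.5 * values[0]
    return next(x for x, v in zip(positions, values) if v < half)

xs = [0.1 * j for j in range(101)]
exact = [math.sqrt(2.0 * math.pi) * math.exp(-0.5 * x * x) for x in xs]
raw = [cosine_transform(pattern, q_max, x) for x in xs]
filtered = [cosine_transform(windowed, q_max, x) for x in xs]

print(min(raw), min(filtered))                          # depth of the deepest ripple
print(half_width(exact, xs), half_width(filtered, xs))  # width of the central peak
```

Windowing removes most of the negative excursion, but the central peak of the filtered estimate is markedly broader than that of the true function.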
Diffraction measurements are often unattainable at low q-values
as well as high ones, so that I(q) is available only within the range
qmin ≤ |q| ≤ qmax . The truncated Fourier integral,

$$
\int_{q_{\min}}^{q_{\max}} I(q) \cos(qx)\, dq \,,
$$

then yields an estimate of the ACF of the aperture function that is
plagued by both low and high frequency artefacts. Structure in A(x)
which is longer than around 2π/qmin or narrower than about 2π/qmax
cannot be inferred reliably. The difficulty caused by a lack of I(0) is
easiest to appreciate since, from eqns (2.57) and (2.58), it relates to
the area under A(x):
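$$
I(0) \;\propto\; \left|\, \int_{-\infty}^{\infty} A(x)\, dx \,\right|^{2} \;\propto\; \int_{-\infty}^{\infty} \mathrm{ACF}(x)\, dx \,, \tag{2.69}
$$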

where the equivalent expression on the far right follows from the
x-integral of eqn (2.55), or the inverse of eqn (2.56) with k (or q) set
to zero. As the truncated Fourier integral implicitly assumes that
I(0) = 0 if qmin ≠ 0 , the resultant ACF will contain equal amounts
of positive and negative structure to ensure a net null area. Apart
from at the origin, q = 0, the diffraction pattern is insensitive to the
addition of a constant to A(x) or its ACF.
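
The role of I(0) has an exact discrete counterpart that is easy to verify. The sketch below assumes a small sampled aperture with periodic (cyclic) boundary conditions, so that the discrete Fourier transform replaces the integrals; the array values and helper names are arbitrary illustrations.

```python
import cmath
import math

def dft(values, sign=+1):
    """Plain O(N^2) discrete Fourier transform with kernel exp(sign*2*pi*i*k*m/N)."""
    n = len(values)
    return [sum(values[m] * cmath.exp(sign * 2j * math.pi * k * m / n) for m in range(n))
            for k in range(n)]

aperture = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.5, 0.0]
n = len(aperture)

intensity = [abs(c) ** 2 for c in dft(aperture)]
acf = [sum(aperture[m] * aperture[(m + lag) % n] for m in range(n)) for lag in range(n)]

# Discrete form of eqn (2.69): I(0) = (sum of A)^2 = sum of the ACF.
area_squared = sum(aperture) ** 2

# Dropping the q = 0 datum and inverting gives an ACF estimate with zero net area.
without_origin = [0.0] + intensity[1:]
acf_estimate = [c.real / n for c in dft(without_origin, sign=-1)]

print(intensity[0], sum(acf), area_squared)
print(sum(acf_estimate), min(acf_estimate), max(acf_estimate))
```

Without the measurement at the origin the reconstruction is the true ACF minus a constant, which forces positive and negative structure to balance.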

2.6.3 Noise and probability theory


In practice, the analysis of a diffraction pattern is also limited by
the noise in the measurement process and the extent to which the
details of the experimental setup are understood and modelled. The
task is not really one of calculating an inverse Fourier transform,
which isn’t possible in a strict mathematical sense, but a matter of
making inferences about the aperture function given incomplete and
noisy data. The tool for dealing with and quantifying uncertainty is
probability theory, as developed by Laplace (1812), and the reader is
referred to Sivia (1996) for an extended tutorial (Data analysis: a
Bayesian tutorial, Oxford University Press; 2nd edition (2006) with
Skilling). A brief overview is given below.
The generic data analysis problem can be stated as follows: Given
a set of N measurements {Dk }, for k = 1, 2, 3, . . . , N , and some
pertinent information H, what can we infer about the object of
interest A(x)? The Fourier nature of the experiment enters the analysis
through the equation that predicts the kth data point, Fk , for a given
aperture A(x):

$$
F_k = f\bigl[\, I(q), k \,\bigr] , \tag{2.70}
$$

where

$$
I(q) = \left|\, \psi_o \int_{-\infty}^{\infty} A(x)\, e^{iqx}\, dx \,\right|^{2} \tag{2.71}
$$

and ‘ f ’ is the function that models the measurement process. In
the simplest case Fk = I(qk ), but a more common situation involves
the convolution of I(q) with an instrumental response, or resolution,
function, R(q), and the addition of a slowly varying background
signal, B(q):

$$
F_k = \int_{-\infty}^{\infty} I(q)\, R(q_k - q)\, dq + B(q_k) \,. \tag{2.72}
$$

The noise, or the expected mismatch between Fk and Dk , is usually
quantified through an error-bar, σk , which is a shorthand way of
assigning a Gaussian probability for the likelihood of the kth datum:

$$
\mathrm{prob}\bigl[\, D_k \,\big|\, A(x), H \,\bigr] = \frac{1}{\sigma_k \sqrt{2\pi}} \exp\!\left[ -\frac{(D_k - F_k)^{2}}{2\,\sigma_k^{2}} \right] , \tag{2.73}
$$

where the vertical bar ‘ | ’ means ‘given’ (so that all items to the right
of this conditioning symbol are taken as being true) and the comma
is read as the conjunction ‘and’. A knowledge of eqns (2.70)–(2.72)
and, hopefully, the related resolution and background functions, as
well as the error-bars, is implicitly assumed in H. If the N measure-
ments, {Dk }, are independent, in that the noise associated with one
is unrelated to that of another (as far as H is concerned), then their
joint likelihood is just the product of the individual contributions:

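$$
\mathrm{prob}\bigl[\, \{D_k\} \,\big|\, A(x), H \,\bigr] = \prod_{k=1}^{N} \mathrm{prob}\bigl[\, D_k \,\big|\, A(x), H \,\bigr] \,.
$$

In conjunction with eqn (2.73), therefore, the likelihood function for
the data can be written as

$$
\mathrm{prob}\bigl[\, \{D_k\} \,\big|\, A(x), H \,\bigr] \propto \exp\!\left( -\frac{\chi^{2}}{2} \right) , \tag{2.74}
$$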

where
$$
\chi^{2} = \sum_{k=1}^{N} \left( \frac{F_k - D_k}{\sigma_k} \right)^{2} \tag{2.75}
$$

is the sum of the squares of the normalized residuals.
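
In code, eqns (2.73) to (2.75) amount to very little. The following is a minimal sketch with made-up numbers; the function names and the three-point data set are assumptions for illustration. The logarithm of the likelihood is used because the product of many small probabilities quickly underflows.

```python
import math

def chi_squared(predicted, data, sigma):
    """Sum of the squares of the normalized residuals, eqn (2.75)."""
    return sum(((f - d) / s) ** 2 for f, d, s in zip(predicted, data, sigma))

def log_likelihood(predicted, data, sigma):
    """Logarithm of the product of the Gaussians of eqn (2.73), normalization included."""
    norm = -sum(math.log(s * math.sqrt(2.0 * math.pi)) for s in sigma)
    return norm - 0.5 * chi_squared(predicted, data, sigma)

data = [10.2, 7.9, 5.1]        # hypothetical measurements D_k
sigma = [0.5, 0.4, 0.3]        # their error-bars sigma_k
model_good = [10.0, 8.0, 5.0]  # predictions F_k from one candidate A(x)
model_poor = [9.0, 9.0, 6.0]   # predictions from another

chi_good = chi_squared(model_good, data, sigma)
chi_poor = chi_squared(model_poor, data, sigma)
ll_good = log_likelihood(model_good, data, sigma)
ll_poor = log_likelihood(model_poor, data, sigma)
print(chi_good, chi_poor, ll_good - ll_poor)
```

Since the normalization does not depend on the model, the difference in log-likelihood between two candidates is just minus one half of the difference in χ², in line with eqn (2.74).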


Our inference, or ‘state of knowledge’, about the aperture function
in the light of the data and H is not encapsulated by the likelihood
function but by the posterior probability,

$$
\mathrm{prob}\bigl[\, A(x) \,\big|\, \{D_k\}, H \,\bigr] \,,
$$

where the positions of {Dk } and A(x) are reversed with respect to the
conditioning symbol. The A(x) which gives the largest value for the
posterior probability can be regarded as the ‘best’ estimate of the
aperture function, while the range of the alternatives that yield a
reasonable fraction of the maximum probability gives an indication
of the uncertainty. The likelihood function is related to the posterior
probability through Bayes’ theorem,

$$
\mathrm{prob}\bigl[\, A(x) \,\big|\, \{D_k\}, H \,\bigr] = \frac{ \mathrm{prob}\bigl[\, \{D_k\} \,\big|\, A(x), H \,\bigr] \times \mathrm{prob}\bigl[\, A(x) \,\big|\, H \,\bigr] }{ \mathrm{prob}\bigl[\, \{D_k\} \,\big|\, H \,\bigr] } \,,
$$

where the second term in the numerator is called the prior probabil-
ity, and represents our state of knowledge (or ignorance) about the
aperture function before the analysis of the data, and the denomi-
nator usually constitutes an uninteresting proportionality constant
(required for normalization) since it doesn’t explicitly mention A(x).
The latter plays a crucial role when comparing different assump-
tions or models, however, such as H1 versus H2 , and is referred to
as the ‘global likelihood’, ‘prior predictive’ or simply the evidence in
that context.
A quantitative discussion of the aperture function is contingent
on a parametric description of A(x), of course, and its choice is a
reflection of the information H at hand. If it were known that we
were dealing with a pair of slits of equal finite width, as in Fig. 2.18
for example, then A(x) would be defined by the two parameters d
and w as follows:
$$
A(x) = \begin{cases} 1 & \text{if } \left| x \pm \tfrac{d}{2} \right| \le \tfrac{w}{2} \,, \\ 0 & \text{otherwise} \,, \end{cases}
$$

where d > w > 0 . If very little information was available, then we
might use the formulation

$$
A(x) = \sum_{j=1}^{M} c_j\, G_j(x) \,,
$$

where the M coefficients, {cj }, define the aperture function through
a linear combination of suitable basis functions, Gj (x). Although a
larger value of M provides greater flexibility in the range of A(x)
that can be modelled, a more careful choice of the Gj (x) can re-
duce the number required and, thereby, aid many aspects of the data
analysis task.


Fig. 2.23 The likelihood constraint on the value of a Fourier coefficient given (a) a
noisy measurement of its modulus-squared and (b) additional phase information.

When the aperture function is defined by only a few parameters,
the data tend to impose a strong constraint on their allowed values.
The likelihood function dominates the posterior probability in this
case and the prior, which is relatively broad to represent ignorance,
is largely irrelevant. For the likelihood function of eqns (2.74) and
(2.75), therefore, the optimal parameters of A(x) are those that yield
the smallest value of χ² ; this is called the least-squares estimate.
When little is known about the aperture function beforehand, and
its description entails a large number of parameters to reflect this
initial ignorance, it becomes important to give due consideration to
the prior to encode whatever weak information is available about
A(x): for example, positivity, bounds, local smoothness and so on.
This leads to the use of regularization procedures, or constrained
optimization, such as maximum entropy.
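
For a model with a single free parameter, the least-squares estimate and the posterior probability can be evaluated by brute force on a grid. The sketch below assumes the pair of slits defined earlier, with the width w taken as known and ψo = 1, for which the Fourier integral of A(x) gives ψ(q) = (4/q) sin(qw/2) cos(qd/2). The 'data' are synthetic, with small fixed offsets standing in for noise, and the prior for d is taken to be flat over the grid; all names and numerical values are illustrative assumptions.

```python
import math

def two_slit_intensity(q, d, w):
    """|psi(q)|^2 for two slits of width w with centres d apart (psi_o = 1, q > 0)."""
    amplitude = (4.0 / q) * math.sin(0.5 * q * w) * math.cos(0.5 * q * d)
    return amplitude * amplitude

def chi_squared(d, w, q_values, data, sigma):
    """Eqn (2.75) for the simplest measurement model, F_k = I(q_k)."""
    return sum(((two_slit_intensity(q, d, w) - y) / sigma) ** 2
               for q, y in zip(q_values, data))

# Synthetic measurements: the exact pattern for d = 3 plus small alternating offsets.
w_known, d_true, sigma = 1.0, 3.0, 0.01
q_values = [0.2 * k for k in range(1, 31)]
data = [two_slit_intensity(q, d_true, w_known) + (0.001 if k % 2 else -0.001)
        for k, q in enumerate(q_values)]

# Evaluate chi-squared on a grid of trial separations (all satisfy d > w > 0).
d_grid = [2.0 + 0.05 * j for j in range(41)]
chi2 = [chi_squared(d, w_known, q_values, data, sigma) for d in d_grid]
best = min(range(len(d_grid)), key=lambda j: chi2[j])

# With a flat prior the posterior is proportional to exp(-chi2/2); subtracting the
# minimum before exponentiating avoids a 0/0 when every weight would underflow.
weights = [math.exp(-0.5 * (c - chi2[best])) for c in chi2]
total = sum(weights)
posterior = [v / total for v in weights]

print(d_grid[best], posterior[best])
```

A grid search is only practical for one or two parameters, but it makes the relationship between the smallest χ² and the maximum of the posterior explicit.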
The computational task of finding the maximum of the posterior
probability distribution and determining its spread, in the space of
the parameters used to describe A(x), can be a very challenging one.
If we have a good initial estimate of the optimal solution, then an
efficient gradient algorithm, such as Newton–Raphson, can often be
employed. Otherwise we may need to use the slower, but more ro-
bust, Monte Carlo methods. These sorts of practical considerations
can make it tempting to ignore the noise and limited coverage of the
data, and try to emulate an inverse Fourier transform in some way.
For the ACF, and with appropriate filtering, this can provide a useful
quick method for a qualitative analysis.
As mentioned earlier, the loss of the Fourier phase in diffraction
experiments causes a serious difficulty for ascertaining the aperture
function. We can begin to appreciate this from a probabilistic point
of view by considering the constraint that the likelihood function
imposes on the value of a Fourier coefficient, ψ(qk ), when only its
modulus-squared can be measured; this is shown pictorially in Fig.
2.23(a). Unlike the case of Fig. 2.23(b), where additional phase
information is available, the permissible values of ψ(qk ) do not shrink
towards a unique point in the Argand plane even in the limit of
noiseless data; they reduce instead to a thin circular region, with a
phase ambiguity of 2π radians.
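
The circular constraint is easily verified. The following is a minimal sketch, assuming a Gaussian likelihood of the form of eqn (2.73) for a single measurement of the modulus-squared; the datum, error-bar and function name are invented for illustration.

```python
import cmath
import math

def log_likelihood(psi, datum, sigma):
    """Gaussian log-likelihood (up to a constant) for a measurement of |psi|^2."""
    return -0.5 * ((abs(psi) ** 2 - datum) / sigma) ** 2

datum, sigma = 4.0, 0.2     # hypothetical measurement of |psi|^2 and its error-bar

# Trial coefficients with the 'right' modulus, sqrt(4) = 2, and eight different phases.
phases = [2.0 * math.pi * j / 8 for j in range(8)]
on_ring = [log_likelihood(cmath.rect(2.0, phase), datum, sigma) for phase in phases]

# A trial coefficient with the wrong modulus, for comparison.
off_ring = log_likelihood(cmath.rect(1.0, 0.0), datum, sigma)

print(on_ring)
print(off_ring)
```

Every point on the circle of radius 2 is equally probable, however small σ is made; the data constrain the modulus but say nothing about the phase.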

2.7 A list of useful formulae


To finish off this principally mathematical chapter, covering the im-
portant prerequisites for scattering theory, we give a list of some
useful formulae.

Powers and logarithms (see Section 1.1)

$$
a^{M} a^{N} = a^{M+N} , \qquad \bigl(a^{M}\bigr)^{N} = a^{MN} ,
$$

$$
a^{0} = 1 , \qquad a^{-N} = 1/a^{N} \qquad\text{and}\qquad a^{1/p} = \sqrt[p]{a} \quad (\text{integer } p) \,.
$$

$$
y = a^{x} \iff x = \log_{a}(y)
$$

$$
\log(AB) = \log(A) + \log(B) \qquad\text{and}\qquad \log(A/B) = \log(A) - \log(B) \,,
$$

$$
\log\bigl(A^{\beta}\bigr) = \beta \log(A) \qquad\text{and}\qquad \log_{b}(A) = \log_{a}(A) \times \log_{b}(a) \,.
$$

Trigonometry

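$$
\sin\theta = \frac{y}{r} = \frac{1}{\operatorname{cosec}\theta} , \qquad \cos\theta = \frac{x}{r} = \frac{1}{\sec\theta} , \qquad \tan\theta = \frac{y}{x} = \frac{\sin\theta}{\cos\theta} = \frac{1}{\cot\theta} \,.
$$

$$
x^{2} + y^{2} = r^{2} \iff \sin^{2}\theta + \cos^{2}\theta \equiv 1 , \qquad \tan^{2}\theta + 1 \equiv \sec^{2}\theta , \qquad \cot^{2}\theta + 1 \equiv \operatorname{cosec}^{2}\theta
$$

$$
\sin(A \pm B) = \sin A \cos B \pm \cos A \sin B \implies \sin 2\theta = 2 \sin\theta \cos\theta
$$

$$
\cos(A \pm B) = \cos A \cos B \mp \sin A \sin B \implies \cos 2\theta = \cos^{2}\theta - \sin^{2}\theta
$$

$$
2 \sin A \cos B = \sin(A + B) + \sin(A - B)
$$

$$
2 \cos A \cos B = \cos(A + B) + \cos(A - B)
$$

$$
-2 \sin A \sin B = \cos(A + B) - \cos(A - B)
$$

$$
\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}
$$

$$
c^{2} = a^{2} + b^{2} - 2ab \cos C
$$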

Power series, sums and expansions (see Section 2.2.4)

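$$
\sin x = x - \frac{x^{3}}{3!} + \frac{x^{5}}{5!} - \frac{x^{7}}{7!} + \cdots
$$

$$
\cos x = 1 - \frac{x^{2}}{2!} + \frac{x^{4}}{4!} - \frac{x^{6}}{6!} + \cdots \qquad \bigl[\, n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1 \,\bigr]
$$

$$
e^{x} = \exp(x) = 1 + x + \frac{x^{2}}{2!} + \frac{x^{3}}{3!} + \frac{x^{4}}{4!} + \cdots \qquad \bigl[\, e = e^{1} = 1 + 1 + \tfrac{1}{2} + \tfrac{1}{6} + \tfrac{1}{24} + \cdots = 2.718\ldots \,\bigr]
$$

$$
\log_{e}(1+x) = \ln(1+x) = x - \frac{x^{2}}{2} + \frac{x^{3}}{3} - \frac{x^{4}}{4} + \frac{x^{5}}{5} - \cdots \qquad \bigl(|x| < 1\bigr)
$$

$$
(1+x)^{p} = 1 + p\,x + \frac{p\,(p-1)}{2!}\, x^{2} + \frac{p\,(p-1)(p-2)}{3!}\, x^{3} + \cdots \qquad \bigl(|x| < 1\bigr) \qquad \Bigl[\, \sqrt{1+x} = 1 + \frac{x}{2} - \frac{x^{2}}{8} + \cdots \,\Bigr]
$$

$$
(a+b)^{n} = \sum_{k=0}^{n} {}^{n}C_{k}\, a^{k}\, b^{\,n-k} , \qquad {}^{n}C_{k} = \binom{n}{k} = \frac{n!}{k!\,(n-k)!} , \qquad 0! = 1
$$

$$
\sum_{k=1}^{n} \bigl[\, a + (k-1)\,d \,\bigr] = \frac{n}{2} \bigl[\, 2a + (n-1)\,d \,\bigr] \qquad \Bigl[\, 1 + 2 + 3 + \cdots + N = \frac{N(N+1)}{2} \,\Bigr]
$$

$$
\sum_{k=1}^{n} a\, r^{\,k-1} = \frac{a\,(1 - r^{n})}{1 - r} \;\longrightarrow\; \frac{a}{1 - r} \quad \text{as } n \to \infty \text{ and } |r| < 1
$$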

Vectors (see Section 2.1.1)

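$$
\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \cdot \mathbf{c})\, \mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\, \mathbf{c}
$$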

Complex numbers (see Section 2.2)

$$
\sinh x = \frac{e^{x} - e^{-x}}{2} = -\,i \sin(i x)
$$

$$
\cosh x = \frac{e^{x} + e^{-x}}{2} = \cos(i x)
$$

$$
\tanh x = \frac{\sinh x}{\cosh x} = \frac{e^{2x} - 1}{e^{2x} + 1}
$$

Differentiation and integration (see Sections 2.1.2 and 2.3.1)


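$$
y'' = \frac{d^{2}y}{dx^{2}} = \frac{d}{dx}\!\left( \frac{dy}{dx} \right) , \qquad \frac{d^{n}y}{dx^{n}} = \frac{d}{dx}\!\left( \frac{d^{\,n-1}y}{dx^{\,n-1}} \right) , \qquad y' = \frac{dy}{dx} = \frac{1}{dx/dy} \,,
$$

$$
\frac{d}{dx}\bigl( u v \bigr) = u\, \frac{dv}{dx} + v\, \frac{du}{dx} , \qquad \frac{d}{dx}\!\left( \frac{u}{v} \right) = \frac{v\,u' - u\,v'}{v^{2}} \,,
$$

$$
\bigl( u v \bigr)'' = u\,v'' + 2\,u'v' + u''v , \qquad \frac{d^{n}}{dx^{n}}\bigl( u v \bigr) = \sum_{k=0}^{n} {}^{n}C_{k}\, \frac{d^{k}u}{dx^{k}}\, \frac{d^{\,n-k}v}{dx^{\,n-k}} , \qquad \frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx} = \frac{dy/dt}{dx/dt} \,.
$$

$$
\frac{df}{dx} = g(x) \iff \int_{a}^{b} g(x)\, dx = \Bigl[\, f(x) + c \,\Bigr]_{a}^{b} = f(b) - f(a)
$$

$$
\frac{d}{dt} \int_{a(t)}^{b(t)} g(x)\, dx = g(b)\, \frac{db}{dt} - g(a)\, \frac{da}{dt}
$$

$$
\int u\, \frac{dv}{dx}\, dx = u\,v - \int v\, \frac{du}{dx}\, dx \qquad\text{and}\qquad \int g(u)\, \frac{du}{dx}\, dx = \int g(u)\, du
$$

$$
\begin{array}{ll|ll|ll}
f(x) & df/dx & f(x) & df/dx & f(x) & df/dx \\
\hline
x^{n} & n\,x^{n-1} & \ln x & 1/x & \sinh x & \cosh x \\
e^{x} & e^{x} & \log_{a}(x) & (x \ln a)^{-1} & \cosh x & \sinh x \\
a^{x} & a^{x} \ln a & e^{-x^{2}} & -2\,x\,e^{-x^{2}} & \tanh x & \operatorname{sech}^{2} x \\
\sin x & \cos x & \sin^{-1} x & \bigl(1-x^{2}\bigr)^{-1/2} & \sinh^{-1} x & \bigl(1+x^{2}\bigr)^{-1/2} \\
\cos x & -\sin x & \cos^{-1} x & -\bigl(1-x^{2}\bigr)^{-1/2} & \cosh^{-1} x & \bigl(x^{2}-1\bigr)^{-1/2} \\
\tan x & \sec^{2} x & \tan^{-1} x & \bigl(1+x^{2}\bigr)^{-1} & \tanh^{-1} x & \bigl(1-x^{2}\bigr)^{-1}
\end{array}
\qquad \bigl[\, y = \sin\theta \iff \theta = \sin^{-1} y \,\bigr]
$$

$$
\int_{0}^{x} e^{-t^{2}}\, dt = \frac{\sqrt{\pi}}{2}\, \operatorname{erf}(x) \qquad \bigl[\, \operatorname{erf}(-x) = -\operatorname{erf}(x) , \quad \operatorname{erf}(\infty) = 1 \,\bigr]
$$

$$
\int_{-\infty}^{\infty} e^{\,i\,(x - x_o)\,t}\, dt = 2\pi\, \delta(x - x_o) \qquad \text{(see Section 2.4.1)}
$$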

Fourier transforms (see Sections 2.4 and 2.5.5)

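$$
F(k) = \int_{-\infty}^{\infty} f(x)\, e^{-i k x}\, dx \iff f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(k)\, e^{\,i k x}\, dk , \qquad \int_{-\infty}^{\infty} \bigl| f(x) \bigr|\, dx < \infty
$$

$$
\begin{array}{lll}
f(x) & F(k) & \text{condition} \\
\hline
f(x + x_o) & e^{\,i k x_o}\, F(k) & \\
df/dx & i\,k\, F(k) & f(x) = 0 \text{ for } x = \pm\infty \\
f(x) \otimes g(x) & G(k)\, F(k) & \\
f(x)\, g(x) & \frac{1}{2\pi}\, G(k) \otimes F(k) & \\
\delta(x - x_o) & e^{-i k x_o} & \\
1 & 2\pi\, \delta(k) & \\
\delta(x + d) + \delta(x - d) & 2 \cos(k d) & \\
\delta(x + d) - \delta(x - d) & 2\,i \sin(k d) & \\
1 \text{ if } |x| \le w , \; 0 \text{ otherwise} & \frac{2}{k} \sin(k w) & \\
\sum_{n=-\infty}^{\infty} \delta(x - n\,d) & \frac{2\pi}{d} \sum_{m=-\infty}^{\infty} \delta\!\left( k - \frac{2\pi m}{d} \right) & \\
e^{-a x} \text{ for } x > 0 , \; 0 \text{ otherwise} & \dfrac{1}{a + i k} & a > 0 \\
\dfrac{a}{\pi\,(a^{2} + x^{2})} & e^{-a |k|} & a > 0 \\
\dfrac{1}{\sigma \sqrt{2\pi}} \exp\!\left( -\dfrac{x^{2}}{2\sigma^{2}} \right) & e^{-\sigma^{2} k^{2}/2} &
\end{array}
$$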

2.7.1 Dimensional analysis


Theoretical analysis involves the use of equations for understanding
physical phenomena in a quantitative manner. Since the derivation
of the relationships can be mathematically complicated, it’s always
worth carrying out ‘sanity checks’ on the formulae before applying
them in detailed calculations; this is a good way of detecting alge-
braic mistakes and typographical errors. A requirement to simplify
to familiar or intuitive results in elementary cases is one part of
this approach, but the need for ‘dimensional consistency’ provides
an even more basic test.
Physical parameters related to mechanics can be analysed in terms
of their associated dimensions of length, L, time, T , and mass, M .
Thus velocity, being a displacement per unit time, has dimensions
of L T⁻¹ ; acceleration, being the rate of change of velocity, L T⁻² ;
force, from Newton’s second law of motion, M L T⁻² ; energy, from
work = force × distance, M L² T⁻² ; and so on. While it may be
necessary to add charge, Q, and temperature, Θ, to the basic list for
dealing with electromagnetism and thermodynamics, the balance
implied by an = symbol means that the dimensions on both sides
of an equation must match up (or else something has gone wrong).
Indeed, the dimensions of every component separated by a + or −
must be the same, and the arguments of functions, such as exp, log,
sin and cos, should be dimensionless.
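
This bookkeeping is simple enough to automate. The sketch below is one possible representation, not a standard library facility: each quantity is assumed to be described by a tuple of its exponents of (L, T, M), so that multiplying quantities adds the exponents and dividing subtracts them. All function and variable names are hypothetical.

```python
def dim(length=0, time=0, mass=0):
    """Dimensions as a tuple of the exponents of (L, T, M)."""
    return (length, time, mass)

def multiply(a, b):
    """Dimensions of a product: exponents add."""
    return tuple(x + y for x, y in zip(a, b))

def divide(a, b):
    """Dimensions of a quotient: exponents subtract."""
    return tuple(x - y for x, y in zip(a, b))

DIMENSIONLESS = dim()
LENGTH, TIME, MASS = dim(length=1), dim(time=1), dim(mass=1)

velocity = divide(LENGTH, TIME)           # L T^-1
acceleration = divide(velocity, TIME)     # L T^-2
force = multiply(MASS, acceleration)      # M L T^-2
energy = multiply(force, LENGTH)          # M L^2 T^-2

# Sanity check on a formula: kinetic energy, m v^2 / 2, must have dimensions of energy
# (the numerical factor of 1/2 is dimensionless and plays no part).
kinetic = multiply(MASS, multiply(velocity, velocity))
print(kinetic == energy)

# The argument of exp, sin or cos must be dimensionless: q x passes, x alone does not.
wavenumber = divide(DIMENSIONLESS, LENGTH)
print(multiply(wavenumber, LENGTH) == DIMENSIONLESS, LENGTH == DIMENSIONLESS)
```

A check of this kind cannot confirm that a formula is right, but a mismatch is a sure sign that something has gone wrong.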
