Acprof 9780199228676 Chapter 2
Fourier transforms 2
The theory of X-ray and neutron scattering relies heavily on the
Fig. 2.2 The sinusoidal curves, or waves, ψ = A sin θ and ψ = A cos θ.
Fig. 2.3 The generation of sinusoidal variations through circular motion.
Elementary Scattering Theory, First Edition, D.S. Sivia, © D.S. Sivia 2011. Published in 2011
by Oxford University Press.
20 Waves, complex numbers and Fourier transforms
The two curves shown in Fig. 2.2 are identical apart from a lateral
shift of π/2 radians, or 90◦ : cos θ = sin (θ + π/2). Hence, the general
expression for a function of this type is
ψ = A sin(θ + φ) , (2.1)
where A is the amplitude of the wave and the angle φ, or the phase,
controls its horizontal displacement with respect to sin θ. If the θ in
eqn (2.1) varies linearly with position x, so that θ = k x where k is a
constant, then we obtain a sinusoidal variation with respect to this
physical coordinate:
$$\psi = A \sin(k x + \phi) \,. \tag{2.2}$$
Since the sine curve cycles around every 2π radians, the correspond-
ing repeat distance, or wavelength λ, can be found from
$$k = \frac{2\pi}{\lambda} \,. \tag{2.3}$$
This is called the wavenumber and has SI units of rad m−1 . Note
that, as mentioned in Section 1.5, spectroscopists use the same term
for 1/λ given in cm−1 .
If the φ in eqn (2.2) itself varies linearly with time t, so that it can
be written as φ = φo − ω t where φo and ω are constants, then we
obtain the travelling wave
ψ = A sin(k x − ω t + φo ) . (2.4)
$$\omega = \frac{2\pi}{T} \,. \tag{2.5}$$
Fig. 2.4 The travelling wave of eqn (2.4) plotted as a function of x for several values
of t, from zero to a quarter of the period.
2.1 Sinusoidal waves 21
$$\omega = 2\pi\nu \,, \qquad \nu = \frac{1}{T} \tag{2.6}$$
$$c = \frac{\lambda}{T} = \frac{\omega}{k} = \nu\lambda \,, \tag{2.7}$$
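The relations between wavelength, period, wavenumber, frequency and speed in eqns (2.3) and (2.5)–(2.7) are easy to check numerically. The following is a minimal sketch in Python; the function name `wave_parameters` and the sample values (a wavelength of 2 m and a period of 0.5 s) are illustrative assumptions rather than anything taken from the text.

```python
import math

def wave_parameters(wavelength, period):
    """Return (k, omega, nu, c) for a sinusoidal wave.

    wavelength is in metres and period in seconds (illustrative units).
    """
    k = 2 * math.pi / wavelength      # wavenumber, eqn (2.3)
    omega = 2 * math.pi / period      # angular frequency, eqn (2.5)
    nu = 1 / period                   # frequency, so that omega = 2 pi nu
    c = wavelength / period           # speed, eqn (2.7)
    return k, omega, nu, c

k, omega, nu, c = wave_parameters(wavelength=2.0, period=0.5)
# The three forms of eqn (2.7) should agree: lambda/T = omega/k = nu*lambda
print(c, omega / k, nu * 2.0)
```

All three printed values should coincide, which is just the statement that the wave advances by one wavelength in one period.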
where the bold script k and r are vectors, and the dot between them
indicates their ‘scalar multiplication’. The vector r denotes a general
position in space, with coordinates x, y and z, but what do the three
components, kx , ky and kz , of the wavevector k represent? Its magnitude,
or modulus, |k| = k is the familiar wavenumber of eqn (2.3),
and its orientation indicates the direction of propagation. For a wave
travelling along the x direction, with ky = kz = 0, the scalar product
k • r = kx x where kx = k for a forwards progression and kx = −k for
the reverse.
$$|\mathbf{k}|^2 = k^2 = k_x^2 + k_y^2 + k_z^2$$
Since r and k are generally three-dimensional vectors, the wave
of eqn (2.8) tends to be a function of x, y, z and t. As such, it rep-
resents a travelling ‘plane wave’ rather than a moving oscillation
on a string. That is to say ψ, which could be the air pressure in a
sound wave, is uniform in planes perpendicular to k, but its value
varies sinusoidally with time in the direction of the wavevector in
accordance with the wavelength of eqn (2.3), the period of eqn (2.5)
and the speed in eqn (2.7). The situation is illustrated for the two-dimensional analogue in Fig. 2.5.
Fig. 2.5 The geometry of a plane wave.
a + b = (ax + bx , ay + by , az + bz ) ,
with all the pluses replaced by minuses for a take away. The multiplica-
tion of a vector by a scalar, µ say, is also easy,
and yields a vector with the original direction but an appropriately scaled
length. The modulus, magnitude or length of a vector is given by Pythago-
ras’ theorem; it’s one for a unit or normalized vector.
A vector can be multiplied by another in two different ways. The first is
a ‘dot’ or scalar product, which is a sum of the products of corresponding
elements:
$$\mathbf{a} \cdot \mathbf{b} = a_x b_x + a_y b_y + a_z b_z \tag{2.9}$$
$$|\mathbf{a}|^2 = \mathbf{a} \cdot \mathbf{a} = a_x^2 + a_y^2 + a_z^2$$
a × b = (ay bz − by az , az bx − bz ax , ax by − bx ay ) . (2.10)
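The component formulae of eqns (2.9) and (2.10) translate directly into code. The sketch below uses plain Python tuples; the helper names `dot` and `cross` and the two sample vectors are hypothetical choices made only for illustration.

```python
import math

def dot(a, b):
    """Scalar product of two 3-vectors, eqn (2.9)."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def cross(a, b):
    """Vector product of two 3-vectors, written term by term as in eqn (2.10)."""
    return (a[1] * b[2] - b[1] * a[2],
            a[2] * b[0] - b[2] * a[0],
            a[0] * b[1] - b[0] * a[1])

a = (1.0, 2.0, 3.0)
b = (4.0, 5.0, 6.0)
c = cross(a, b)

print(dot(a, b))               # scalar product
print(c)                       # vector product
print(dot(a, c), dot(b, c))    # both vanish: a x b is perpendicular to a and to b
print(math.sqrt(dot(a, a)))    # modulus |a|, from |a|^2 = a . a
```

The two vanishing scalar products are a handy sanity check whenever a vector product is coded by hand.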
= − 2 sin(ω t) cos(k x) ,
Fig. 2.7 The slowly varying ‘beating’ modulation, of wavelength 2π/∆k, propagates
with a speed of ∆ω/∆k, whereas the finer structure inside the envelope has the
properties of the average wavelength and frequency, ω and k.
2.2.1 Definition
If any number, integer or fraction, positive or negative, is multiplied
by itself, then the result is always greater than, or equal to, zero.
What, then, is the square root of −9 ? To address this question we
need to invent an imaginary number, usually denoted by ‘ i ’, whose
square is defined to be negative:
$$i^2 = -1 \,. \tag{2.14}$$
$$\sqrt{-9} = \pm\, 3i$$
A real number, say b (where b² > 0), times i is also imaginary; it’s
just b times bigger than i. If a is also an ordinary number, then the
sum z of a and ib,
z = a + ib , (2.15)
It may seem odd that Im {z} is b rather than ib, but this is because
it represents the size of the imaginary component.
a + ib ± (c + id ) = a ± c + i (b ± d ) , (2.17)
where a, b, c and d are real; for example, 1 + 2i − (5 − i) = −4 + 3i. The usual rules of algebra apply for
brackets and multiplication, except that every occurrence of i² is
replaced by −1. Thus, it’s easy to show that the product of a + ib
and c + id is given by
(a + ib) (c + id ) = a c − b d + i (a d + b c) , (2.18)
$$\begin{aligned} z + z^{*} &= 2a = 2\,\mathrm{Re}\{z\} \\ z - z^{*} &= 2ib = 2i\,\mathrm{Im}\{z\} \\ z\, z^{*} &= a^2 + b^2 = |z|^2 \end{aligned} \tag{2.20}$$
We will come to the meaning of |z| shortly, but the important point
about eqn (2.20) is that the product z z ∗ is a real number. This fea-
ture enables us to calculate the real and imaginary part of the ratio
of two complex numbers by multiplying both the top and bottom by
the conjugate of the denominator
$$\frac{a + ib}{c + id} = \frac{a + ib}{c + id} \times \frac{c - id}{c - id} = \frac{ac + bd + i\,(bc - ad)}{c^2 + d^2} \,. \tag{2.21}$$
To evaluate the ratio (1+2i)/(3−i), for example, we multiply it by
unity in the form (3+i)/(3+i); this gives a real denominator of 10,
and a complex numerator of 1+7i. Hence the result is 1/10 + i 7/10.
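The worked example above can be reproduced in a few lines. This is a small sketch using Python's built-in complex type; the function name `divide` is a hypothetical helper that spells out eqn (2.21) explicitly instead of relying on the `/` operator.

```python
def divide(z, w):
    """Divide z by w by multiplying top and bottom by the conjugate of w, eqn (2.21)."""
    a, b = z.real, z.imag
    c, d = w.real, w.imag
    denom = c * c + d * d                 # w w* is real, eqn (2.20)
    return complex((a * c + b * d) / denom, (b * c - a * d) / denom)

z = complex(1, 2)      # 1 + 2i
w = complex(3, -1)     # 3 - i

result = divide(z, w)
print(z * w.conjugate())   # the numerator, 1 + 7i
print(result)              # 1/10 + i 7/10
print(z / w)               # Python's own division, for comparison
```

The explicit formula and the built-in division should agree to within rounding error.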
2.2 Complex numbers 27
$$\psi = A\, e^{i(\mathbf{k} \cdot \mathbf{r} - \omega t)} \,, \tag{2.27}$$
$$A = |A|\, e^{i\phi_o} \,. \tag{2.28}$$
The real part of eqn (2.27) also represents the same wave, apart
from a difference of 90◦ in the value of φo .
The benefit of using eqn (2.27) over (2.8) in wave analysis is that
exponentials are easier to deal with mathematically than sinusoids;
multiplication, differentiation and integration, for example, are more
straightforward. As an illustration of this advantage, let’s derive the
‘compound angle’ formulae for sines and cosines with complex num-
bers. Starting with the rule of eqn (1.2) for combining powers,
$$e^{i(\alpha + \beta)} = e^{i\alpha}\, e^{i\beta} \,,$$
As well as being the real and imaginary parts of exp(iθ), sines and
cosines can also be expressed as
$$\cos\theta = \frac{e^{i\theta} + e^{-i\theta}}{2} \quad \text{and} \quad \sin\theta = \frac{e^{i\theta} - e^{-i\theta}}{2i} \,, \tag{2.32}$$
which follow from the addition and subtraction, respectively, of eqn
(2.24) with its complex conjugate.
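Both eqn (2.32) and the compound angle formulae that follow from combining exponentials can be confirmed numerically. The sketch below uses Python's `cmath` module; the helper names and the two test angles are arbitrary illustrative choices.

```python
import cmath
import math

def cos_from_exp(theta):
    """cos(theta) built from complex exponentials, eqn (2.32)."""
    return (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2

def sin_from_exp(theta):
    """sin(theta) built from complex exponentials, eqn (2.32)."""
    return (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2j

alpha, beta = 0.7, 1.9     # arbitrary test angles in radians

# exp(i(alpha+beta)) = exp(i alpha) exp(i beta): compare real and imaginary parts
product = cmath.exp(1j * alpha) * cmath.exp(1j * beta)
print(product.real, math.cos(alpha) * math.cos(beta) - math.sin(alpha) * math.sin(beta))
print(product.imag, math.sin(alpha) * math.cos(beta) + math.cos(alpha) * math.sin(beta))
print(cos_from_exp(alpha), sin_from_exp(alpha))
```

Each printed pair should match: the real part of the product reproduces cos(α + β) and the imaginary part sin(α + β).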
The sines and cosines of 2 kx, 3 kx, 4 kx, and so on, also satisfy the
periodicity of eqn (2.33); they just go through several, or many, com-
plete cycles in the interval λ. We can obtain a better approximation
to f(x), therefore, by including contributions from these higher-order
terms:
$$\begin{aligned} f(x) \approx a_0 &+ a_1 \cos(kx) + a_2 \cos(2kx) + a_3 \cos(3kx) + \cdots \\ &+ b_1 \sin(kx) + b_2 \sin(2kx) + b_3 \sin(3kx) + \cdots \end{aligned} \tag{2.35}$$
because sines and cosines are odd and even functions, respectively.
The generalization of eqn (2.35) explains why the invariant term is
designated as a0 , and why there is no corresponding b0 (apart from
its general redundancy): they are the coefficients of cos(0) = 1 and
sin(0) = 0, with the b0 being unnecessary since it adds nothing to the
Fourier series.
with an identical expression for cos(m kx) cos(n kx), but n ≠ 0, and
$$\int_0^{\lambda} \sin(m kx) \cos(n kx)\, dx = 0 \,. \tag{2.38}$$
side will be zero due to eqns (2.37) and (2.38). The surviving m = n
contributions yield the formulae for the Fourier coefficients:
$$a_n = \frac{2}{\lambda} \int_0^{\lambda} f(x) \cos(n kx)\, dx \quad \text{and} \quad b_n = \frac{2}{\lambda} \int_0^{\lambda} f(x) \sin(n kx)\, dx \tag{2.39}$$
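To see eqn (2.39) in action, the integrals can be approximated by a simple sum over one period. The sketch below is only illustrative: the function names, the midpoint-rule integration and the choice of a square wave as the test function are all assumptions made here, not part of the text.

```python
import math

def fourier_coefficients(f, wavelength, n, steps=2000):
    """Estimate a_n and b_n of eqn (2.39) with a midpoint-rule sum over one period."""
    k = 2 * math.pi / wavelength
    dx = wavelength / steps
    a_n = b_n = 0.0
    for j in range(steps):
        x = (j + 0.5) * dx
        a_n += f(x) * math.cos(n * k * x) * dx
        b_n += f(x) * math.sin(n * k * x) * dx
    return 2 * a_n / wavelength, 2 * b_n / wavelength

def square_wave(x, wavelength=1.0):
    """Odd square wave: +1 in the first half of each period, -1 in the second."""
    return 1.0 if (x % wavelength) < wavelength / 2 else -1.0

for n in (1, 2, 3):
    a_n, b_n = fourier_coefficients(square_wave, 1.0, n)
    print(n, round(a_n, 4), round(b_n, 4))
```

For this odd test function the cosine coefficients vanish, and the sine coefficients are close to 4/(nπ) for odd n and zero for even n.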
where the symbol $\int_a^b dx$ is read as the ‘integral, from a to b, with respect
to x’. The use of the term ‘area’ in the above discussion needs some qual-
ification, in that it can be negative; this is because the ‘height’ of a strip
f(xj ) < 0 whenever the curve y = f(x) lies below the x-axis (and even the
‘width’ ∆x < 0 if b < a).
Although an integral is defined as the limiting form of a summation, it
is usually calculated analytically by noting that ‘integration is the reverse
of differentiation’. While this may not be obvious, it is easily illustrated
with an example from everyday kinematics: the distance travelled by a
car (say) is the integral of the speed with respect to time, and speed is the
rate of change of distance with time (a derivative).
= an cos(nkx) + bn sin(nkx) ,
$e^{i k_n x}$ and $e^{-i k_n x}$ ,
2.4 Fourier transforms 33
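$$c_n = \alpha\, F(k_n)\, \Delta k \,,$$
$$f(x) = \alpha \sum_{n=-\infty}^{\infty} F(k_n)\, e^{i k_n x}\, \Delta k \quad \text{and} \quad F(k_n) = \frac{1}{2\pi\alpha} \int_{-\lambda/2}^{\lambda/2} f(x)\, e^{-i k_n x}\, dx \,,$$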
$$f(x) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} F(k)\, e^{i kx}\, dk \tag{2.45}$$
and
$$F(k) = \frac{1}{\sqrt{2\pi}} \int_{-\infty}^{\infty} f(x)\, e^{-i kx}\, dx \tag{2.46}$$
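A quick numerical check of eqn (2.46) is possible for a function whose transform is known. With the symmetric 1/√(2π) convention used here, the Gaussian exp(−x²/2) transforms into exp(−k²/2). The sketch below approximates the integral by a midpoint-rule sum over a finite range; the function names, the integration limits and the step count are assumptions chosen for illustration.

```python
import cmath
import math

def fourier_transform(f, k, x_max=10.0, steps=4000):
    """Approximate eqn (2.46) by a midpoint-rule sum over [-x_max, x_max]."""
    dx = 2 * x_max / steps
    total = 0j
    for j in range(steps):
        x = -x_max + (j + 0.5) * dx
        total += f(x) * cmath.exp(-1j * k * x) * dx
    return total / math.sqrt(2 * math.pi)

def gaussian(x):
    """Unit-width Gaussian, exp(-x^2/2)."""
    return math.exp(-x * x / 2)

for k in (0.0, 1.0, 2.0):
    F = fourier_transform(gaussian, k)
    print(k, F.real, F.imag, math.exp(-k * k / 2))
```

The imaginary part is negligible because the Gaussian is real and symmetric, and the real part follows exp(−k²/2).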
Equations (2.47) and (2.48) can be combined to show that the Fourier
transform of a real and symmetric function is also real and even,
whereas that of a real and antisymmetric function is imaginary and
odd; this is equivalent to eqn (2.36).
The substitution of k = 0 in eqns (2.46) and (2.47) reveals F(0) to
and necessarily real if f(x) = f(x)∗. It will equal zero if f(x) = −f(−x).
Technically, the integral of the modulus, |f(x)|, must be bounded (or
finite) if its Fourier transform is to exist everywhere; this is known
as the Dirichlet condition.
Fig. 2.9 The convolution of the spiky function g(x) with the broad asymmetric function h(x): f(x) = g(x) ⊗ h(x).
by replacing each of the sharp peaks in g(x) with scaled copies of
h(x) and adding together the four contributions; those from the two
closely spaced components in the middle, shown by dotted grey lines,
combine to give a resultant function where the constituent doublet
is no longer resolved clearly. Although it’s not as easy to visualize
it the other way around, eqn (2.50) can equally be thought of as the
blurring of h(x) by g(x), since g(x) ⊗ h(x) = h(x) ⊗ g(x).
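The picture of replacing each spike with a scaled copy of the blurring function has a direct discrete analogue. The sketch below is a hypothetical illustration in plain Python: the function `convolve` and the two short sequences standing in for g(x) and h(x) are assumptions, with the integral replaced by a sum over samples.

```python
def convolve(g, h):
    """Discrete convolution: every sample of g deposits a copy of h scaled by its height."""
    f = [0.0] * (len(g) + len(h) - 1)
    for m, g_m in enumerate(g):
        for j, h_j in enumerate(h):
            f[m + j] += g_m * h_j
    return f

# g: two sharp spikes of different height; h: a short asymmetric blurring function
g = [0.0, 2.0, 0.0, 0.0, 1.0, 0.0]
h = [0.5, 0.3, 0.2]

f = convolve(g, h)
print(f)                           # a copy of h at each spike, scaled by its height
print(convolve(h, g))              # the same result with the roles swapped
print(sum(f), sum(g) * sum(h))     # total area is the product of the two areas
```

Swapping the arguments leaves the output unchanged, which is the discrete counterpart of g(x) ⊗ h(x) = h(x) ⊗ g(x).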
The convolution theorem states that the Fourier transform of the
convolution of two functions is proportional to the product of their
Fourier transforms:
where F(k), G(k) and H(k) are the Fourier transforms of f(x), g(x)
and h(x), respectively, according to eqn (2.46). Given the reciprocity
between a Fourier transform and its inverse,
This can be seen from the example of Fig. 2.10, where an array of
different shaped peaks is convolved with a Gaussian. Although the
two spikes on the left of g(x) merge into one in f(x), because they
are very closely spaced compared with the width of h(x), the areas
of the various components in the blurred output are proportional to
those of the input signal. The amplitudes of the narrowest peaks are
affected the most, since their relative spreading is the greatest as a
result of the convolution; the slowly varying parts of the structure
change the least.
Fig. 2.10 The convolution of a function with an array of different shaped peaks, g(x), with a Gaussian, h(x).
$$\delta(x - x_o) \otimes h(x) = h(x - x_o)$$
so that integrals involving a δ-function are easy to evaluate.
and is real if f(x) = f(x)∗. Although this looks like a self-convolution,
or f(x) ⊗ f(−x)∗, it’s not the best way to think about eqn (2.55). The
ACF is largest at the origin,
$$\mathrm{ACF}(0) = \int_{-\infty}^{\infty} f(t)^{*} f(t)\, dt \,,$$
$$f(x)^{*} f(x) = |f(x)|^2 \geq 0$$
where F(k) is given by eqn (2.46). While a Fourier transform and its
inverse contain the same information, albeit in different ways, and
it’s possible to switch between one and the other through eqns (2.45)
and (2.46), the situation becomes less straightforward if only |F(k)|
is available. We can begin to appreciate the problems caused by such
a loss of the Fourier phase by comparing the relative complexity of
the ACF with f(x) in Fig. 2.11. The ACF, which is directly available
about its phase φ? That depends on both x and the angle of propaga-
tion relative to the incident wave, θ, as well as the time t. The phase
will be invariant with position parallel to the incoming wavefront,
but will gain a relative factor of
$$\Delta\phi = \frac{2\pi}{\lambda}\, x \sin\theta$$
$$\Delta\psi = \psi_o\, A(x)\, e^{i qx}\, \Delta x \,,$$
where q = 2π sin θ/λ and the temporal variation has been absorbed
into the ‘constant’ of proportionality, ψo . The diffracted wave, ψ, is
the sum of all such terms; in the limit ∆x → 0, it becomes the Fourier
transform of the aperture function:
$$\psi(q) = \psi_o \int_{-\infty}^{\infty} A(x)\, e^{iqx}\, dx \,. \tag{2.57}$$
Thus we met Fourier transforms a (very) long time ago but did not
realize it! Before reminding ourselves of the results from elementary
diffraction experiments, and trying to understand them in terms of
what we’ve now learnt about Fourier transforms, we need to make a
few qualifying remarks.
The first point is essentially a technicality, but the above analysis
assumes that we are considering Fraunhofer diffraction. This
is the limit where the projection screen is so far away that all the
waves reaching a particular point can be considered to be travelling
in parallel directions. The equations become more cumbersome
when this approximation does not hold, leading to the theory of
Fresnel diffraction.
The more serious point of note is that the observed, or measured,
diffraction pattern is not the complex function ψ(q) but its intensity,
or modulus-squared, I(q):
$$I(q) = |\psi(q)|^2 = \psi(q)\, \psi(q)^{*} \,. \tag{2.58}$$
where we have used eqn (2.32) in writing the second line. The prod-
uct of this diffracted wave with its complex conjugate, ψ(q)∗ , leads
to the prediction
$$I(q) \propto \left[ \cos\!\left( \frac{qd}{2} \right) \right]^2 \propto 1 + \cos(qd) \,, \tag{2.59}$$
where all the multiplicative prefactors not involving q, such as |ψo|²,
have been omitted and a trigonometric double angle formula,
cos 2θ = 2 cos²θ − 1, used
on the far right-hand side. This pattern of ‘uniform cosine fringes’ is
plotted in Fig. 2.14.
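The cosine fringes of eqn (2.59) can be reproduced by adding the two complex contributions directly and taking the modulus-squared. The sketch below assumes that the two slits are idealized as δ-functions at x = ±d/2, so that eqn (2.57) gives a sum of two exponentials; the function name and the numerical values are illustrative only.

```python
import cmath
import math

def double_slit_intensity(q, d, psi0=1.0):
    """Intensity from two ideal (delta-function) slits at x = -d/2 and x = +d/2."""
    psi = psi0 * (cmath.exp(1j * q * d / 2) + cmath.exp(-1j * q * d / 2))
    return abs(psi) ** 2          # modulus-squared, eqn (2.58)

d = 2.0    # slit separation (arbitrary units)
for q in (0.0, 0.5, 1.0, math.pi / d):
    predicted = 2 * (1 + math.cos(q * d))     # same q-dependence as eqn (2.59)
    print(q, double_slit_intensity(q, d), predicted)
```

With ψo set to one, the intensity equals 2[1 + cos(qd)], falling to zero whenever qd is an odd multiple of π.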
2.5 Fourier optics and physical insight 41
Fig. 2.14 The aperture function for a Young’s double slit, A(x), its Fourier transform, ψ(q), and the diffraction pattern, I(q).
$$\psi(q) = \psi_o \int_{-w/2}^{w/2} e^{iqx}\, dx \,.$$
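The integral for the single wide slit can be evaluated in closed form, giving 2 sin(qw/2)/q, or compared with a brute-force sum. The sketch below does both with ψo set to one; the function names, the midpoint rule and the slit width are assumptions made for illustration.

```python
import cmath
import math

def single_slit_amplitude(q, w, steps=2000):
    """Midpoint-rule estimate of the integral of exp(iqx) from -w/2 to w/2."""
    dx = w / steps
    return sum(cmath.exp(1j * q * (-w / 2 + (j + 0.5) * dx)) for j in range(steps)) * dx

def single_slit_analytic(q, w):
    """Closed form of the same integral, 2 sin(qw/2)/q, which tends to w as q -> 0."""
    return w if q == 0 else 2 * math.sin(q * w / 2) / q

w = 1.5    # slit width (arbitrary units)
for q in (0.0, 1.0, 4.0, 2 * math.pi / w):
    print(q, single_slit_amplitude(q, w).real, single_slit_analytic(q, w))
```

The amplitude is real because the aperture is symmetric about the origin, and it first passes through zero at q = 2π/w.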
Fig. 2.15 The aperture function for a single wide slit, A(x), its Fourier transform, ψ(q), and the diffraction pattern, I(q).
where d is the distance between the grating lines. Swapping the order of integration and summation, and using eqn (2.54), the Fourier transform of eqn (2.57) reduces to
$$\psi(q) = \psi_o \sum_{m=-\infty}^{\infty} e^{iqdm} \,.$$
$$\int_{-\infty}^{\infty} \delta(x - md)\, e^{iqx}\, dx = e^{iqmd}$$
The nature of ψ(q) becomes apparent once we realize that it’s pro-
portional to the sum of complex numbers that are of unit magnitude
but varying phase. They will add up coherently if the product q d is
an integer multiple of 2π, yielding a huge resultant sum, but cancel
out otherwise. Hence,
$$\psi(q) \propto \sum_{n=-\infty}^{\infty} \delta(q - n\, q_o) \,, \tag{2.61}$$
Fig. 2.16 The aperture function for a diffraction grating, A(x), its Fourier transform, ψ(q), and the diffraction pattern, I(q).
n λ = d sin θ , (2.62)
form, the width of the large diffraction peaks tells us about the size
w of the grating whereas the distance between them indicates the
d-spacing of its lines. As the number of grating lines goes up, so that
the ratio w/d increases, the principal peaks become narrower and
more low-level wiggles appear between them.
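The narrowing of the principal peaks with an increasing number of lines can be seen by summing the phase factors for a finite grating. In the sketch below the grating is assumed to consist of N ideal (δ-function) lines at x = 0, d, 2d, …, and the peak spacing is taken to be 2π/d, in line with the coherence condition that q d be an integer multiple of 2π; the function name and numbers are illustrative.

```python
import cmath
import math

def grating_intensity(q, d, n_lines):
    """|sum over m of exp(i q d m)|^2 for a grating of n_lines ideal lines."""
    psi = sum(cmath.exp(1j * q * d * m) for m in range(n_lines))
    return abs(psi) ** 2

d = 1.0
q0 = 2 * math.pi / d          # assumed spacing of the principal peaks in q
for n_lines in (5, 20):
    on_peak = grating_intensity(q0, d, n_lines)                     # coherent sum
    first_zero = grating_intensity(q0 + q0 / n_lines, d, n_lines)   # complete cancellation
    print(n_lines, on_peak, first_zero)
```

The peak height grows as N² while the first zero sits a distance of 2π/(N d) from the peak, so the peaks become both taller and narrower as lines are added.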
The convolution theorem also enables us to ascertain the diffrac-
tion pattern for a pair of broad slits from the results of eqns (2.59)
and (2.60). Taking each to be of width w, and separated by d, the
aperture function can be seen as a convolution of an ideal Young’s
double slit with a narrow but finite single slit, as in Fig. 2.18. Since
the Fourier transform of the former is then equal to the product of
those of the latter, the intensity of the uniform cosine fringes that
we’d expect from a perfect Young’s double slit is modulated by a
slowly varying sinc-squared function.
$$A(x) = g(x) \otimes h(x) \quad \therefore \quad \psi(q) = \psi_o\, G(q) \times H(q)$$
This double integral, over the surface area of the aperture, simpli-
fies to the product of two one-dimensional integrals if the aperture
function is separable:
$$\psi(k_x, k_y) = \psi_o \int_{-\infty}^{\infty} A_1(x)\, e^{i k_x x}\, dx \int_{-\infty}^{\infty} A_2(y)\, e^{i k_y y}\, dy$$
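$$f(\mathbf{r}) = (2\pi)^{-M/2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} F(\mathbf{k})\, e^{-i\, \mathbf{k} \cdot \mathbf{r}}\, d^{M}\mathbf{k} \tag{2.64}$$
and
$$F(\mathbf{k}) = (2\pi)^{-M/2} \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} \cdots \int_{-\infty}^{\infty} f(\mathbf{r})\, e^{i\, \mathbf{k} \cdot \mathbf{r}}\, d^{M}\mathbf{r} \tag{2.65}$$
$$d^3\mathbf{k} = dk_x\, dk_y\, dk_z \,, \qquad \mathbf{k} \cdot \mathbf{r} = k_x x + k_y y + k_z z + \cdots \,, \qquad d^3\mathbf{r} = dx\, dy\, dz$$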
where the cosine equivalent on the right assumes that the aperture
function is real (A(x) = A(x)∗ ⟹ I(q) = I(−q)), yields an estimate of the ACF that is corrupted by
ripples with a characteristic wavelength of 2π/qmax . These artefacts
can be understood with the aid of the convolution theorem, and Fig.
2.21, by considering eqn (2.66) to be the Fourier transform of the
product of the full but unmeasured diffraction pattern, J(q), and a
‘top-hat’ function of width 2 qmax , H(q). The result is, therefore, the
true but unknown auto-correlation function, acf, convolved with a
Fig. 2.21 The Fourier transform of a diffraction pattern of limited q-extent, I(q), yields an ACF of the aperture function which
is corrupted by truncation ripples associated with qmax ; their origin is easily understood from the convolution theorem.
2.6 Fourier data analysis 51
sinc function, h(x), whose central peak has a full width at half max-
imum (FWHM) of about 3.8/qmax .
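The figure of about 3.8/qmax can be checked by locating the point at which sin(u)/u falls to one half. The sketch below relies on the standard result that the Fourier transform of a top-hat of width 2 qmax is proportional to sin(qmax x)/(qmax x); the bisection routine, its bracketing interval and the value of qmax are illustrative assumptions.

```python
import math

def sinc(u):
    """sin(u)/u, with the limiting value of 1 at u = 0."""
    return 1.0 if u == 0 else math.sin(u) / u

def sinc_half_width(tol=1e-12):
    """Bisection for the u > 0 at which sinc(u) first falls to one half."""
    lo, hi = 1.0, 3.0          # sinc(1) > 0.5 > sinc(3), and sinc decreases in between
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if sinc(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

q_max = 5.0                               # illustrative cut-off
fwhm = 2 * sinc_half_width() / q_max      # full width of the central peak of sinc(q_max x)
print(fwhm * q_max)                       # close to the 3.8 quoted in the text
```

The product of the FWHM and qmax is independent of the cut-off chosen, which is why a wider q-range gives a proportionally sharper ACF.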
The messy picture due to the truncation ripples can be cleaned up
greatly by multiplying the incomplete diffraction pattern, I(q), with
a window function, W(q), which decays smoothly from one at the
origin to zero around ±qmax , before the (inverse) Fourier transform
is calculated. This is illustrated in Fig. 2.22 with the ubiquitous
Gaussian,
$$W(q) = \exp\!\left( -\frac{q^2}{2\sigma^2} \right) , \tag{2.67}$$
$$\mathrm{FWHM} = \sqrt{8 \ln 2}\; \sigma \approx 2.35\, \sigma$$
whose standard deviation σ was chosen somewhat arbitrarily as
Fig. 2.22 Truncation ripples can be suppressed by multiplying the diffraction pattern, I(q), by a ‘window’ function, W(q),
which decays smoothly to zero over a q-range comparable to that of the measurements, before calculating the (inverse) Fourier
transform; the resultant ‘filtered’ auto-correlation function can also be understood from the convolution theorem.
I(q) cos(qx) dq ,
where the equivalent expression on the far right follows from the
x-integral of eqn (2.55), or the inverse of eqn (2.56) with k (or q) set
to zero. As the truncated Fourier integral implicitly assumes that
I(0) = 0 if qmin ≠ 0 , the resultant ACF will contain equal amounts
of positive and negative structure to ensure a net null area. Apart
from at the origin, q = 0, the diffraction pattern is insensitive to the
addition of a constant to A(x) or its ACF.
through the equation that predicts the k th data point, Fk , for a given
aperture A(x):
$$F_k = f\big( I(q), k \big) \,, \tag{2.70}$$
where
$$I(q) = \left| \psi_o \int_{-\infty}^{\infty} A(x)\, e^{iqx}\, dx \right|^2 \tag{2.71}$$
where the vertical bar ‘ | ’ means ‘given’ (so that all items to the right
of this conditioning symbol are taken as being true) and the comma
is read as the conjunction ‘and’. A knowledge of eqns (2.70)–(2.72)
and, hopefully, the related resolution and background functions, as
well as the error-bars, is implicitly assumed in H. If the N measure-
ments, {Dk }, are independent, in that the noise associated with one
is unrelated to that of another (as far as H is concerned), then their
joint likelihood is just the product of the individual contributions:
$$\mathrm{prob}\big( \{D_k\} \,\big|\, A(x), H \big) = \prod_{k=1}^{N} \mathrm{prob}\big( D_k \,\big|\, A(x), H \big) \,.$$
where
$$\chi^2 = \sum_{k=1}^{N} \left( \frac{F_k - D_k}{\sigma_k} \right)^2 \tag{2.75}$$
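The misfit statistic of eqn (2.75) is simple to compute once the predictions, data and error-bars are in hand. The sketch below uses made-up numbers purely for illustration; the function name `chi_squared` and all three lists are hypothetical.

```python
def chi_squared(predicted, data, sigma):
    """Sum of the squared normalized residuals, eqn (2.75)."""
    return sum(((f_k - d_k) / s_k) ** 2
               for f_k, d_k, s_k in zip(predicted, data, sigma))

# Hypothetical numbers, for illustration only
predicted = [10.0, 12.0, 9.0]     # F_k from a trial aperture function
data = [11.0, 11.0, 9.5]          # measurements D_k
sigma = [1.0, 2.0, 0.5]           # error-bars sigma_k

chi2 = chi_squared(predicted, data, sigma)
print(chi2)     # (-1/1)^2 + (1/2)^2 + (-0.5/0.5)^2 = 2.25
```

Each term measures how many error-bars a prediction lies from its datum, so a poorly measured point contributes less to the total than a precise one with the same absolute mismatch.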
where the positions of {Dk } and A(x) are reversed with respect to the
conditioning symbol. The A(x) which gives the largest value for the
posterior probability can be regarded as the ‘best’ estimate of the
aperture function, while the range of the alternatives that yield a
reasonable fraction of the maximum probability gives an indication
of the uncertainty. The likelihood function is related to the posterior
where the second term in the numerator is called the prior probabil-
ity, and represents our state of knowledge (or ignorance) about the
aperture function before the analysis of the data, and the denomi-
nator usually constitutes an uninteresting proportionality constant
(required for normalization) since it doesn’t explicitly mention A(x).
The latter plays a crucial role when comparing different assump-
tions or models, however, such as H1 versus H2 , and is referred to
as the ‘global likelihood’, ‘prior predictive’ or simply the evidence in
that context.
A quantitative discussion of the aperture function is contingent
on a parametric description of A(x), of course, and its choice is a
reflection of the information H at hand. If it were known that we
were dealing with a pair of slits of equal finite width, as in Fig. 2.18
for example, then A(x) would be defined by the two parameters d
and w as follows:
$$A(x) = \begin{cases} 1 & \text{if } \left| x \pm \frac{d}{2} \right| \leq \frac{w}{2} \,, \\ 0 & \text{otherwise} \,, \end{cases}$$
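This two-parameter description of the aperture is straightforward to code, which is the first step towards predicting the data for trial values of d and w. The sketch below is a hypothetical helper; the function name `aperture` and the sample values are assumptions.

```python
def aperture(x, d, w):
    """Pair of slits of width w, centred at x = -d/2 and x = +d/2."""
    inside = abs(x - d / 2) <= w / 2 or abs(x + d / 2) <= w / 2
    return 1.0 if inside else 0.0

d, w = 4.0, 1.0      # illustrative separation and width
for x in (-2.4, -2.0, 0.0, 1.5, 2.6):
    print(x, aperture(x, d, w))
```

The function returns one inside either slit, including the edges, and zero everywhere else.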
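$$a^M a^N = a^{M+N} \,,$$
$$\left( a^M \right)^N = a^{MN} \,,$$
$$a^0 = 1 \,, \quad a^{-N} = 1/a^N \quad \text{and} \quad a^{1/p} = \sqrt[p]{a} \quad \text{(integer } p\text{)} \,.$$
$$y = a^x \iff x = \log_a(y)$$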
Trigonometry
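$$\sin\theta = \frac{y}{r} = \frac{1}{\operatorname{cosec}\theta} \,, \quad \cos\theta = \frac{x}{r} = \frac{1}{\sec\theta} \,, \quad \tan\theta = \frac{y}{x} = \frac{\sin\theta}{\cos\theta} = \frac{1}{\cot\theta} \,.$$
$$x^2 + y^2 = r^2 \iff \sin^2\theta + \cos^2\theta \equiv 1$$
$$\tan^2\theta + 1 \equiv \sec^2\theta$$
$$\cot^2\theta + 1 \equiv \operatorname{cosec}^2\theta$$
$$\frac{a}{\sin A} = \frac{b}{\sin B} = \frac{c}{\sin C}$$
$$c^2 = a^2 + b^2 - 2ab \cos C$$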
2.7 A list of useful formulae 57
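$$\sin x = x - \frac{x^3}{3!} + \frac{x^5}{5!} - \frac{x^7}{7!} + \cdots$$
$$\cos x = 1 - \frac{x^2}{2!} + \frac{x^4}{4!} - \frac{x^6}{6!} + \cdots$$
$$n! = n \times (n-1) \times (n-2) \times \cdots \times 3 \times 2 \times 1$$
$$e^x = \exp(x) = 1 + x + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \cdots$$
$$e = e^1 = 1 + 1 + \frac{1}{2} + \frac{1}{6} + \frac{1}{24} + \cdots = 2.718\ldots$$
$$(1+x)^p = 1 + p\,x + \frac{p\,(p-1)}{2!}\, x^2 + \frac{p\,(p-1)(p-2)}{3!}\, x^3 + \cdots \qquad |x| < 1$$
$$\sqrt{1+x} = 1 + \frac{x}{2} - \frac{x^2}{8} + \cdots$$
$$(a+b)^n = \sum_{k=0}^{n} {}^{n}C_k\, a^k\, b^{n-k} \,, \qquad {}^{n}C_k = \binom{n}{k} = \frac{n!}{k!\,(n-k)!} \,, \quad 0! = 1$$
$$\sum_{k=1}^{n} \big[ a + (k-1)\, d \big] = \frac{n}{2} \big[ 2a + (n-1)\, d \big] \,, \qquad 1 + 2 + 3 + \cdots + N = \frac{N(N+1)}{2}$$
$$\sum_{k=1}^{n} a\, r^{k-1} = \frac{a\,(1 - r^n)}{1 - r} \longrightarrow \frac{a}{1-r} \quad \text{as } n \to \infty \text{ and } |r| < 1$$
$$\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \cdot \mathbf{c})\, \mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\, \mathbf{c}$$
$$\sinh x = \frac{e^x - e^{-x}}{2} = -i \sin(ix)$$
$$\cosh x = \frac{e^x + e^{-x}}{2} = \cos(ix)$$
$$\tanh x = \frac{\sinh x}{\cosh x} = \frac{e^{2x} - 1}{e^{2x} + 1}$$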
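$$(uv)'' = u\,v'' + 2\,u'v' + u''v \,, \qquad \frac{d^n}{dx^n}(uv) = \sum_{k=0}^{n} {}^{n}C_k\, \frac{d^k u}{dx^k}\, \frac{d^{n-k} v}{dx^{n-k}} \,, \qquad \frac{dy}{dx} = \frac{dy}{du} \times \frac{du}{dx} = \frac{dy/dt}{dx/dt} \,.$$
$$\frac{d}{dt} \int_{a(t)}^{b(t)} g(x)\, dx = g(b)\, \frac{db}{dt} - g(a)\, \frac{da}{dt}$$
$$\int u\, \frac{dv}{dx}\, dx = u\,v - \int v\, \frac{du}{dx}\, dx \quad \text{and} \quad \int g(u)\, \frac{du}{dx}\, dx = \int g(u)\, du$$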
df df df
f (x) f (x) f (x)
dx dx dx
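$$\int_0^x e^{-t^2}\, dt = \frac{\sqrt{\pi}}{2}\, \mathrm{erf}(x) \quad \text{and} \quad \int_{-\infty}^{\infty} e^{i(x - x_o)t}\, dt = 2\pi\, \delta(x - x_o) \quad \text{(see Section 2.4.1)}$$
$$\mathrm{erf}(-x) = -\,\mathrm{erf}(x) \,, \qquad \mathrm{erf}(\infty) = 1$$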
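$$F(k) = \int_{-\infty}^{\infty} f(x)\, e^{-i kx}\, dx \iff f(x) = \frac{1}{2\pi} \int_{-\infty}^{\infty} F(k)\, e^{i kx}\, dk \,, \qquad \int_{-\infty}^{\infty} \big| f(x) \big|\, dx < \infty$$
$$f(x + x_o) \iff e^{i k x_o}\, F(k)$$
$$\frac{df}{dx} \iff i k\, F(k) \,, \qquad f(x) = 0 \text{ for } x = \pm\infty$$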