LONG CHEN
ABSTRACT. The fast Fourier transform (FFT) is a fast algorithm to compute the discrete
Fourier transform in $O(N \log N)$ operations for an array of size $N = 2^J$. It is based
on a nice property of the principal root of $x^N = 1$. In addition to the recursive
implementation, a non-recursive and in-place implementation, known as the butterfly
algorithm, is also provided.
The matrix $F$ is called the Fourier matrix, and the $(l, k)$ entry of $F$ is $e^{ikx_l} = e^{ikl\theta} = \omega^{lk}$. So we can simply write the matrix $F = (\omega^{lk})_{N \times N}$, for $l, k = 0, 1, \ldots, N-1$.

[FIGURE 1 (labelled Figure 3.11 in the reproduced page): The eight solutions to $z^8 = 1$ are $1, w, w^2, \ldots, w^7$ with $w = (1 + i)/\sqrt{2}$. They lie on the unit circle, with $w = e^{2\pi i/8} = \cos\frac{2\pi}{8} + i \sin\frac{2\pi}{8}$, $w^2 = i$, $w^4 = -1$, $w^6 = -i$, $w^7 = \bar{w}$, and $w^8 = 1$ on the real axis.]

The square of $w$ can be found directly (it just doubles the angle):
$$w^2 = (\cos\theta + i \sin\theta)^2 = \cos^2\theta - \sin^2\theta + 2i \sin\theta \cos\theta.$$
The real part $\cos^2\theta - \sin^2\theta$ is $\cos 2\theta$, and the imaginary part $2 \sin\theta \cos\theta$ is $\sin 2\theta$. (Note that $i$ is not included; the imaginary part is a real number.) Thus $w^2 = \cos 2\theta + i \sin 2\theta$. The square of $w$ is still on the unit circle, but at the double angle $2\theta$. That makes us suspect that $w^n$ lies at the angle $n\theta$, and we are right.

There is a better way to take powers of $w$. The combination of cosine and sine is a complex exponential, with amplitude one and phase angle $\theta$:
$$\cos\theta + i \sin\theta = e^{i\theta}.$$
The rules for multiplying, like $(e^2)(e^3) = e^5$, continue to hold when the exponents $i\theta$ are imaginary. The powers of $w = e^{i\theta}$ stay on the unit circle:
$$w^2 = e^{i2\theta}, \qquad w^n = e^{in\theta}, \qquad \frac{1}{w} = e^{-i\theta}.$$
The $n$th power is at the angle $n\theta$. When $n = -1$, the reciprocal $1/w$ has angle $-\theta$. If we multiply $\cos\theta + i \sin\theta$ by $\cos(-\theta) + i \sin(-\theta)$, we get the answer 1:
$$e^{i\theta} e^{-i\theta} = (\cos\theta + i \sin\theta)(\cos\theta - i \sin\theta) = \cos^2\theta + \sin^2\theta = 1.$$

Note. I remember the day when a letter came to MIT from a prisoner in New York, asking if Euler's formula $e^{i\theta} = \cos\theta + i \sin\theta$ was true.

Lemma 1.1. The matrix $F$ is orthogonal, i.e., the column vectors of $F$ are mutually orthogonal.

Proof. Let $v_k$ be the column vectors of $F$. We compute the inner product $v_k \cdot \bar{v}_l$, for $k \neq l$, as
$$1 + W + W^2 + \cdots + W^{N-1} \tag{2}$$
with $W = \omega^k \bar{\omega}^l$. A key observation is that $W$ is still a root of unity as a power of the generator, i.e., $W^N = \omega^{kN} \bar{\omega}^{lN} = 1^k 1^l = 1$. And since $k \neq l$, $W \neq 1$. Multiplying (2) by $1 - W$, we obtain $1 - W^N = 0$ and conclude that (2) is zero since $W \neq 1$.

As a consequence, we obtain the identity $\bar{F} F = N I$, and thus the inverse of $F$ is a scaled $\bar{F}$, i.e., $F^{-1} = \bar{F}/N$. Here we skip the transpose since $F$ is symmetric. Notice that $\bar{F}$ looks just like $F$: simply change $\omega$ to $\bar{\omega}$.

2. FAST FOURIER TRANSFORM

FFT is a fast algorithm for computing $F c$ or $\bar{F} u$. It is a divide-and-conquer algorithm. To unify the discussion, we consider the computation $y = F_N x$ with $F_N = (w_N^{kj})_{N \times N}$, i.e., for $k = 0, \ldots, N-1$,
$$y_k = \sum_{j=0}^{N-1} w_N^{kj} x_j. \tag{3}$$
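As a numerical sanity check of the orthogonality identity $\bar{F} F = N I$ and of the sum (3), the following Python sketch builds the Fourier matrix and evaluates the transform directly in $O(N^2)$ operations. The helper names (`fourier_matrix`, `dft_direct`, `matmul`, `conj`) are hypothetical, and the sign convention $w_N = e^{-2\pi i/N}$ in `dft_direct` is an assumption taken from the MATLAB listing in these notes.

```python
import cmath

def fourier_matrix(N, sign=1):
    """N x N matrix with entries omega**(l*k), omega = exp(sign*2*pi*i/N)."""
    omega = cmath.exp(sign * 2j * cmath.pi / N)
    return [[omega ** (l * k) for k in range(N)] for l in range(N)]

def conj(A):
    """Entrywise complex conjugate of a matrix stored as a list of rows."""
    return [[z.conjugate() for z in row] for row in A]

def matmul(A, B):
    """Plain triple-loop matrix product of two lists of rows."""
    return [[sum(A[i][t] * B[t][j] for t in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def dft_direct(x, sign=-1):
    """Direct evaluation of y_k = sum_j w_N**(k*j) * x_j, w_N = exp(sign*2*pi*i/N)."""
    N = len(x)
    return [sum(cmath.exp(sign * 2j * cmath.pi * k * j / N) * x[j]
                for j in range(N)) for k in range(N)]

N = 8
F = fourier_matrix(N)
G = matmul(conj(F), F)  # expected to equal N * I up to rounding
max_err = max(abs(G[i][j] - (N if i == j else 0))
              for i in range(N) for j in range(N))

# The transform of a unit impulse is the constant vector of ones.
y_delta = dft_direct([1] + [0] * (N - 1))
```

The direct sum serves only as a reference for checking faster algorithms; its cost grows like $N^2$.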
THE FAST FOURIER TRANSFORM 3
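Algorithm (FFT)
(1) Divide $x$ into $x_{\text{even}}$ and $x_{\text{odd}}$.
(2) Compute $y_{\text{even}} = F_{N_c} x_{\text{even}}$ and $y_{\text{odd}} = F_{N_c} x_{\text{odd}}$, where $N_c = N/2$.
(3) Merge $y_{\text{even}}$ and $y_{\text{odd}}$ into $y$.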
The merge step is not straightforward. The formula found by Cooley and Tukey [1] is
based on the following properties of the factor $w_N$, which can be verified easily. The key is
$$w_N^2 = w_{N_c}.$$
(1) Symmetry:
$$(w_N^{kj})^* = w_N^{-kj}, \qquad w_N^{kj + N/2} = -w_N^{kj}.$$
(2) Periodic:
$$w_N^{kj} = w_N^{k(j+N)} = w_N^{(k+N)j}.$$
(3)
$$w_N^{kj} = w_{mN}^{mkj}, \qquad w_N^{kj} = w_{N/m}^{kj/m}.$$
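The properties of $w_N$ can be checked numerically. The Python sketch below assumes $w_N = e^{-2\pi i/N}$ (the convention of the MATLAB listing in these notes); the helper names `w` and `close` are hypothetical. For the half-period shift it tests both the exponent form $w_N^{p + N/2} = -w_N^{p}$ and the index form $w_N^{k(j+N/2)} = (-1)^k w_N^{kj}$, whose sign depends on the parity of $k$.

```python
import cmath

def w(N, p=1):
    """Return w_N**p with the assumed convention w_N = exp(-2*pi*i/N)."""
    return cmath.exp(-2j * cmath.pi * p / N)

def close(a, b, tol=1e-12):
    """True when two complex numbers agree up to rounding."""
    return abs(a - b) < tol

N, m = 8, 2
checks = [close(w(N, 2), w(N // 2, 1))]  # key identity: w_N^2 = w_{N/2}
for k in range(N):
    for j in range(N):
        p = k * j
        checks.append(close(w(N, p).conjugate(), w(N, -p)))           # conjugation
        checks.append(close(w(N, p + N // 2), -w(N, p)))              # half-period shift of the exponent
        checks.append(close(w(N, k * (j + N // 2)), (-1) ** k * w(N, p)))  # shift of the index j
        checks.append(close(w(N, k * (j + N)), w(N, p)))              # periodic in j
        checks.append(close(w(N, (k + N) * j), w(N, p)))              # periodic in k
        checks.append(close(w(m * N, m * p), w(N, p)))                # property (3), first form
        checks.append(close(w(N // m, p / m), w(N, p)))               # property (3), second form
all_ok = all(checks)
```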
Theorem 2.1. For $k = 0, \ldots, N_c - 1$, we have
$$y(k) = y_{\text{even}}(k) + w_N^k \, y_{\text{odd}}(k), \tag{4}$$
$$y(k + N_c) = y_{\text{even}}(k) - w_N^k \, y_{\text{odd}}(k). \tag{5}$$
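Proof. We split the sum in the formula for $y_k$ into even and odd parts:
$$y_k = \sum_{j=0}^{N-1} w_N^{kj} x_j = \sum_{j=0}^{N_c-1} w_N^{k2j} x_{2j} + \sum_{j=0}^{N_c-1} w_N^{k(2j+1)} x_{2j+1}.$$
Since $w_N^{2kj} = w_{N_c}^{kj}$, the first sum is $y_{\text{even}}(k)$ and the second is $w_N^k \, y_{\text{odd}}(k)$, which gives (4). Replacing $k$ by $k + N_c$ and using $w_{N_c}^{(k+N_c)j} = w_{N_c}^{kj}$ and $w_N^{k+N_c} = -w_N^k$ gives (5).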
10 w = exp(-1i*2*pi/N);
11 Nc = N/2;
12 k = 1:Nc;
13 tempy_odd = w.^(k-1).*y_odd;
14 y(k) = y_even + tempy_odd;
15 y(k+Nc) = y_even - tempy_odd;
The dominant operation in the merge step is the multiplication of complex numbers, i.e.,
line 13. We thus skip the additions and count only the number of multiplications. With
$T(1) = 0$, we easily get the recurrence
$$T(N) = 2T(N/2) + N/2,$$
from which we obtain $T(N) = \frac{1}{2} N \log_2 N$.
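The divide, compute, and merge steps, together with the multiplication count, can be sketched in Python as follows. This is a minimal illustration rather than the notes' own MATLAB routine: the names `fft_recursive` and `dft_direct` and the counter argument are hypothetical, and $w_N = e^{-2\pi i/N}$ is assumed. One multiplication is counted per twiddle product, which matches the recurrence $T(N) = 2T(N/2) + N/2$.

```python
import cmath

def dft_direct(x):
    """Reference O(N^2) transform y_k = sum_j w_N**(k*j) x_j, w_N = exp(-2*pi*i/N)."""
    N = len(x)
    return [sum(cmath.exp(-2j * cmath.pi * k * j / N) * x[j] for j in range(N))
            for k in range(N)]

def fft_recursive(x, counter):
    """Radix-2 decimation-in-time FFT; len(x) must be a power of 2.

    counter is a one-element list; counter[0] accumulates the number of
    twiddle-factor multiplications performed in the merge steps.
    """
    N = len(x)
    if N == 1:
        return list(x)
    Nc = N // 2
    y_even = fft_recursive(x[0::2], counter)  # transform of even-indexed entries
    y_odd = fft_recursive(x[1::2], counter)   # transform of odd-indexed entries
    y = [0j] * N
    for k in range(Nc):
        t = cmath.exp(-2j * cmath.pi * k / N) * y_odd[k]  # w_N^k * y_odd(k)
        counter[0] += 1
        y[k] = y_even[k] + t        # merge formula for y(k)
        y[k + Nc] = y_even[k] - t   # merge formula for y(k + Nc)
    return y

x = [complex(j, j % 3) for j in range(16)]
count = [0]
y = fft_recursive(x, count)
max_diff = max(abs(a - b) for a, b in zip(y, dft_direct(x)))
```

For $N = 16$ the counter should equal $\frac{1}{2} \cdot 16 \cdot \log_2 16 = 32$.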
3. BUTTERFLY ALGORITHM
We discuss a non-recursive and in-place implementation of FFT. The top-to-bottom
phase is an even-odd reordering of the input array. An example of N = 8 is displayed in
the following table.
L=4 0 1 2 3 4 5 6 7
L=3 0 2 4 6 1 3 5 7
L=2 0 4 2 6 1 5 3 7
L=1 0 4 2 6 1 5 3 7
This ordering can be generated by the bit-reversal permutation. The entry at an index $n$,
written in binary with digits $b_2 b_1 b_0$, is swapped with the entry at the index with
reversed digits $b_0 b_1 b_2$; see the figure for an illustration.
    ...
        x(ir+1) = t;
    end
end
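The bit-reversal permutation can be sketched in Python as follows. The function names are hypothetical and this is not a reconstruction of the truncated listing above; it only illustrates that swapping each index with its bit-reversed partner reproduces the repeated even-odd reordering shown in the table for $N = 8$.

```python
def bit_reverse(n, bits):
    """Reverse the lowest `bits` binary digits of n."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (n & 1)  # append the lowest bit of n to r
        n >>= 1
    return r

def bit_reverse_permute(x):
    """Reorder the list x in place; len(x) must be a power of 2."""
    N = len(x)
    bits = N.bit_length() - 1
    for i in range(N):
        ir = bit_reverse(i, bits)
        if i < ir:  # swap each pair only once
            x[i], x[ir] = x[ir], x[i]
    return x

def even_odd_reorder(x):
    """Recursive even-odd splitting, as in the top-to-bottom phase."""
    if len(x) <= 2:
        return list(x)
    return even_odd_reorder(x[0::2]) + even_odd_reorder(x[1::2])

order = bit_reverse_permute(list(range(8)))  # [0, 4, 2, 6, 1, 5, 3, 7]
```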
The bottom-to-top phase can be decomposed into pairs combined by the following butterfly diagram.

[Figure: butterfly diagram]

The weight $w_L^j$ is called the twiddle factor. This provides an in-place implementation of FFT;
namely, we can use the same array to store the updated values. Suppose $N = 2^J$. We use the
subscript $L$, $L = 1 : J$, to denote a level index. The two input data of a butterfly
are separated by the distance $d_L = 2^{L-1}$. The twiddle factor in level $L$ is
$w_L = w_{2^L} = w_N^{2^{J-L}}$. The update formulae are
$$y_L(k) = y_{L-1}(k) + w_L^k \, y_{L-1}(k + d_L),$$
$$y_L(k + d_L) = y_{L-1}(k) - w_L^k \, y_{L-1}(k + d_L).$$
We can first copy the values $y_{L-1}(k)$, $y_{L-1}(k + d_L)$ to two temporary locations and then
overwrite them with the values of $y_L(k)$, $y_L(k + d_L)$. So only one array of size $N$ is needed.
The $N = 8$ case is illustrated in the following figure. Note that the bottom (left) is in
bit-reversed ordering and the top (right) is in natural ordering.
MATLAB code is presented below.
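% In-place butterfly loops. Assumes x is already in bit-reversed order,
% N = 2^level, and w = exp(-1i*2*pi/N) as in the recursive code.
for L = 1:level
    d = 2^(L-1);                 % distance between the two butterfly inputs
    for j = 0:d-1
        p = j*2^(level - L);     % exponent such that w^p = w_L^j
        wL = w^p;                % twiddle factor
        for k = j+1:2^L:N        % 1-based index of the upper input in each block
            a = x(k);
            b = wL*x(k+d);
            x(k) = a + b;
            x(k+d) = a - b;
        end
    end
end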
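For comparison, here is a self-contained Python sketch of the whole non-recursive procedure: bit reversal followed by the level-by-level butterflies, using 0-based indices. The function names are hypothetical and $w_N = e^{-2\pi i/N}$ is assumed; the result is checked against a direct evaluation of the sum defining $y_k$.

```python
import cmath

def dft_direct(x):
    """Reference O(N^2) transform with w_N = exp(-2*pi*i/N)."""
    N = len(x)
    return [sum(cmath.exp(-2j * cmath.pi * k * j / N) * x[j] for j in range(N))
            for k in range(N)]

def fft_butterfly(x):
    """In-place radix-2 FFT of the list x; len(x) must be a power of 2."""
    N = len(x)
    J = N.bit_length() - 1
    # Top-to-bottom phase: bit-reversal permutation.
    for i in range(N):
        ir, n = 0, i
        for _ in range(J):
            ir = (ir << 1) | (n & 1)
            n >>= 1
        if i < ir:
            x[i], x[ir] = x[ir], x[i]
    # Bottom-to-top phase: J levels of butterflies.
    for L in range(1, J + 1):
        d = 2 ** (L - 1)                                    # input distance d_L
        for j in range(d):
            wL = cmath.exp(-2j * cmath.pi * j / 2 ** L)     # twiddle factor w_L^j
            for k in range(j, N, 2 ** L):                   # one butterfly per block
                a = x[k]
                b = wL * x[k + d]
                x[k] = a + b
                x[k + d] = a - b
    return x

data = [complex(j * j, -j) for j in range(8)]
result = fft_butterfly(list(data))
max_diff = max(abs(a - b) for a, b in zip(result, dft_direct(data)))
```

Only the two temporaries `a` and `b` are needed per butterfly, so the transform uses a single array of size $N$.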
4. VARIANTS OF FFT
In signal processing, the input array $x(t)$ is a signal sampled at certain times. The
output is in the frequency domain. The FFT presented in the previous section is called
decimation-in-time (DIT).
Now we discuss the decimation-in-frequency (DIF) version. The divide step is easier.
We simply split the input array $x(0 : N-1)$ into two halves, $x_1 = x(0 : N_c - 1)$ and
$x_2 = x(N_c : N-1)$. Then we apply the butterfly procedure to $x_1$ and $x_2$ to get
$\tilde{x}_1$ and $\tilde{x}_2$. After that we apply FFT to the shorter arrays $\tilde{x}_1$ and
$\tilde{x}_2$ to obtain $y_1$ and $y_2$. To merge $y_1$ and $y_2$, we need the even-odd
splitting of $y$: $y_{\text{even}} = y(0 : 2 : N-2)$, $y_{\text{odd}} = y(1 : 2 : N-1)$.
Exercise 4.1. (1) Derive the formulae from $x_1$, $x_2$ to $y_{\text{even}}$ and $y_{\text{odd}}$.
(2) Write pseudocode for DIF (recursive and non-recursive).
When the length $N$ is not a power of 2, we can simply add zeros to enlarge the length.
We now discuss another variant of FFT adapted to the general factorization $N = N_1 N_2$,
where $N_1$ is usually small (say between 2 and 8) and is called the radix. The FFT presented in
the previous section is known as radix-2 DIT.
The idea is to reinterpret the 1-D array of length $N_1 N_2$ as a two-dimensional matrix
of size $N_2 \times N_1$ and then apply the Fourier transform to $N_1$ arrays of shorter length $N_2$.
The single summation index $j$ is changed to two subscripts $(j_1, j_2)$ with the relation
$j = j_2 N_1 + j_1$ for $j_1 = 0, \ldots, N_1 - 1$, $j_2 = 0, \ldots, N_2 - 1$. We can split the sum as
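$$
\begin{aligned}
y(k) = \sum_{j=0}^{N-1} w_N^{kj} x(j)
&= \sum_{j_1=0}^{N_1-1} \sum_{j_2=0}^{N_2-1} w_N^{k\,j(j_1,j_2)} x(j_1, j_2) \\
&= \sum_{j_1=0}^{N_1-1} \sum_{j_2=0}^{N_2-1} w_{N_1 N_2}^{k j_2 N_1 + k j_1} x_{j_1}(j_2) \\
&= \sum_{j_1=0}^{N_1-1} w_N^{k j_1} \sum_{j_2=0}^{N_2-1} w_{N_2}^{k j_2} x_{j_1}(j_2) \\
&= \sum_{j_1=0}^{N_1-1} w_N^{k j_1} \tilde{y}_{j_1}(k).
\end{aligned}
$$
Here $\tilde{y}_{j_1}$ is the Fourier transform of length $N_2$ of the subarray $x_{j_1}$, which is periodic in $k$ with period $N_2$. Writing $k = k_1 N_2 + k_2$ with $k_1 = 0, \ldots, N_1 - 1$ and $k_2 = 0, \ldots, N_2 - 1$, we have $w_N^{k j_1} = w_{N_1}^{k_1 j_1} w_N^{k_2 j_1}$. So $\hat{y}_{j_1}(k_2) = w_N^{k_2 j_1} \tilde{y}_{j_1}(k_2)$ is obtained from $\tilde{y}$ by multiplying by $N$ twiddle factors. The last sum is then a Fourier transform of $\hat{y}$, of length $N_1$ in the index $j_1$.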
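The three stages of this factorization (short transforms of length $N_2$, multiplication by twiddle factors, short transforms of length $N_1$) can be sketched in Python. The function name `fft_two_factor` and the output indexing $k = k_1 N_2 + k_2$ are assumptions made for illustration, with $w_N = e^{-2\pi i/N}$; the short transforms are done by direct summation to keep the sketch small.

```python
import cmath

def dft_direct(x):
    """Reference O(N^2) transform with w_N = exp(-2*pi*i/N)."""
    N = len(x)
    return [sum(cmath.exp(-2j * cmath.pi * k * j / N) * x[j] for j in range(N))
            for k in range(N)]

def fft_two_factor(x, N1, N2):
    """One Cooley-Tukey step for N = N1*N2, with j = j2*N1 + j1 and k = k1*N2 + k2."""
    N = N1 * N2
    if len(x) != N:
        raise ValueError("len(x) must equal N1*N2")
    # Stage 1: N1 transforms of length N2 on the subarrays x_{j1}(j2) = x[j2*N1 + j1].
    y_tilde = [dft_direct(x[j1::N1]) for j1 in range(N1)]
    # Stage 2: multiply by the N twiddle factors w_N^(k2*j1).
    y_hat = [[cmath.exp(-2j * cmath.pi * k2 * j1 / N) * y_tilde[j1][k2]
              for k2 in range(N2)] for j1 in range(N1)]
    # Stage 3: N2 transforms of length N1 across the index j1.
    y = [0j] * N
    for k2 in range(N2):
        column = dft_direct([y_hat[j1][k2] for j1 in range(N1)])
        for k1 in range(N1):
            y[k1 * N2 + k2] = column[k1]
    return y

x = [complex(j, (j * j) % 5) for j in range(12)]
y = fft_two_factor(x, 3, 4)
max_diff = max(abs(a - b) for a, b in zip(y, dft_direct(x)))
```

Applying the same step recursively to the short transforms, instead of direct summation, gives the general mixed-radix algorithm.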
So far the sampling is uniform or, equivalently, the grid on the unit circle is equidistant.
We refer to [2] for FFT on nonuniform sampling. A comprehensive treatment of
FFTs can be found in [3].
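Exercise 4.2. (1) Use the Fourier matrix to find the eigenvectors and eigenvalues of the
$N \times N$ circulant matrix
$$
C = \begin{pmatrix}
c_0 & c_{N-1} & \cdots & c_2 & c_1 \\
c_1 & c_0 & c_{N-1} & \cdots & c_2 \\
\vdots & \ddots & \ddots & \ddots & \vdots \\
c_{N-2} & \cdots & \cdots & c_0 & c_{N-1} \\
c_{N-1} & c_{N-2} & \cdots & \cdots & c_0
\end{pmatrix}.
$$
(2) Design an $O(N \log N)$ algorithm to compute $Cx$ or $C^{-1} x$.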
REFERENCES
[1] J. W. Cooley and J. W. Tukey. An Algorithm for the Machine Computation of the Complex Fourier Series.
Mathematics of Computation, 19:297, 1965.
[2] L. Greengard and J.-Y. Lee. Accelerating the Nonuniform Fast Fourier Transform, 2004.
[3] C. Van Loan. Computational frameworks for the fast Fourier transform. SIAM, 1992.