DSP - Cooley Tukey FFT Algorithm

The Cooley–Tukey algorithm, named after J. W. Cooley and John Tukey, is the most
common fast Fourier transform (FFT) algorithm. It re-expresses the discrete Fourier
transform (DFT) of an arbitrary composite size N = N1N2 in terms of smaller DFTs of
sizes N1 and N2, recursively, to reduce the computation time to O(N log N) for
highly composite N (smooth numbers). Because of the algorithm's importance,
specific variants and implementation styles have become known by their own names,
as described below.

Because the Cooley–Tukey algorithm breaks the DFT into smaller DFTs, it can be
combined arbitrarily with any other algorithm for the DFT. For example, Rader's or
Bluestein's algorithm can be used to handle large prime factors that cannot be
decomposed by Cooley–Tukey, or the prime-factor algorithm can be exploited for
greater efficiency in separating out relatively prime factors.


This algorithm, including its recursive application, was invented around 1805 by
Carl Friedrich Gauss, who used it to interpolate the trajectories of the asteroids
Pallas and Juno, but his work was not widely recognized (being published only
posthumously and in neo-Latin).[1][2] Gauss did not analyze the asymptotic
computational time, however. Various limited forms were also rediscovered several
times throughout the 19th and early 20th centuries.[2] FFTs became popular after
James Cooley of IBM and John Tukey of Princeton published a paper in 1965
reinventing the algorithm and describing how to perform it conveniently on a
computer.[3]

Tukey reportedly came up with the idea during a meeting of President Kennedy's
Science Advisory Committee discussing ways to detect nuclear-weapon tests in the
Soviet Union by employing seismometers located outside the country. These sensors
would generate seismological time series. However, analysis of these data would
require fast algorithms for computing the DFT, given the number of sensors and the
lengths of the time series. This task was critical for the ratification of the
proposed nuclear test ban, so that any violations could be detected without the
need to visit Soviet facilities.[4][5] Another participant at that meeting, Richard
Garwin of IBM, recognized the potential of the method and put Tukey in touch with
Cooley, while making sure that Cooley did not know the original purpose. Instead,
Cooley was told that the method was needed to determine periodicities of the spin
orientations in a 3-D crystal of helium-3. Cooley and Tukey subsequently published
their joint paper, and wide adoption quickly followed, aided by the simultaneous
development of analog-to-digital converters capable of sampling at rates up to
300 kHz.

The fact that Gauss had described the same algorithm (albeit without analyzing its
asymptotic cost) was not realized until several years after Cooley and Tukey's 1965
paper.[2] Their paper cited as inspiration work by I. J. Good on what is now
called the prime-factor FFT algorithm (PFA);[3] although Good's algorithm was
initially thought to be equivalent to the Cooley–Tukey algorithm, it was quickly
realized that PFA is a quite different algorithm (it works only for sizes with
relatively prime factors and relies on the Chinese remainder theorem, unlike the
support for any composite size in Cooley–Tukey).[6]

A radix-2 decimation-in-time (DIT) FFT is the simplest and most common form of the
Cooley–Tukey algorithm, although highly optimized Cooley–Tukey implementations
typically use other forms of the algorithm as described below. Radix-2 DIT divides
a DFT of size N into two interleaved DFTs (hence the name "radix-2") of size N/2
with each recursive stage.

The discrete Fourier transform (DFT) is defined by the formula:


X_k = \sum_{n=0}^{N-1} x_n e^{-\frac{2\pi i}{N} nk},

where k is an integer ranging from 0 to N-1.
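
As a point of reference for the fast algorithm below, here is a minimal sketch of
the DFT computed directly from this definition (plain Python; the function name
dft is our own, not from the original text). The nested sum over n for every k
makes the O(N^2) cost of the direct approach explicit:

    import cmath

    def dft(x):
        """Naive O(N^2) DFT, computed directly from the defining formula."""
        N = len(x)
        return [sum(x[n] * cmath.exp(-2j * cmath.pi * n * k / N)
                    for n in range(N))
                for k in range(N)]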

Radix-2 DIT first computes the DFTs of the even-indexed inputs (x_{2m} = x_0, x_2,
\ldots, x_{N-2}) and of the odd-indexed inputs (x_{2m+1} = x_1, x_3, \ldots,
x_{N-1}), and then combines those two results to produce the DFT of the whole
sequence. This idea can then be applied recursively to reduce the overall runtime
to O(N log N). This simplified form assumes that N is a power of two; since the
number of sample points N can usually be chosen freely by the application, this is
often not an important restriction.

The radix-2 DIT algorithm rearranges the DFT of the function x_n into two parts: a
sum over the even-numbered indices n = 2m and a sum over the odd-numbered indices
n = 2m+1:

X_k = \sum_{m=0}^{N/2-1} x_{2m} e^{-\frac{2\pi i}{N} (2m)k} + \sum_{m=0}^{N/2-1} x_{2m+1} e^{-\frac{2\pi i}{N} (2m+1)k}

One can factor a common multiplier e^{-\frac{2\pi i}{N}k} out of the second sum, as
shown in the equation below. It is then clear that the two sums are the DFT of the
even-indexed part x_{2m} and the DFT of the odd-indexed part x_{2m+1} of the
function x_n. Denote the DFT of the even-indexed inputs x_{2m} by E_k and the DFT
of the odd-indexed inputs x_{2m+1} by O_k, and we obtain:

X_k = \underbrace{\sum_{m=0}^{N/2-1} x_{2m} e^{-\frac{2\pi i}{N/2} mk}}_{\mathrm{DFT\;of\;even-indexed\;part\;of\;} x_n} + e^{-\frac{2\pi i}{N}k} \underbrace{\sum_{m=0}^{N/2-1} x_{2m+1} e^{-\frac{2\pi i}{N/2} mk}}_{\mathrm{DFT\;of\;odd-indexed\;part\;of\;} x_n} = E_k + e^{-\frac{2\pi i}{N}k} O_k.
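
This identity is easy to sanity-check numerically. The following sketch (using
NumPy, which the original text does not mention) computes E_k and O_k with a
library FFT and confirms that E_k + e^{-\frac{2\pi i}{N}k} O_k reproduces the
first half of the full DFT:

    import numpy as np

    N = 8
    rng = np.random.default_rng(0)
    x = rng.standard_normal(N) + 1j * rng.standard_normal(N)

    E = np.fft.fft(x[0::2])          # DFT of the even-indexed inputs (length N/2)
    O = np.fft.fft(x[1::2])          # DFT of the odd-indexed inputs (length N/2)
    k = np.arange(N // 2)
    w = np.exp(-2j * np.pi * k / N)  # twiddle factors e^{-2 pi i k / N}

    X = np.fft.fft(x)
    assert np.allclose(X[:N // 2], E + w * O)  # X_k = E_k + e^{-2 pi i k/N} O_k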

Thanks to the periodicity of the DFT, we know that

E_{k + \frac{N}{2}} = E_k

and

O_{k + \frac{N}{2}} = O_k.

Therefore, we can rewrite the above equation as

X_k = \begin{cases} E_k + e^{-\frac{2\pi i}{N}k} O_k & \mbox{for } 0 \leq k < N/2 \\ E_{k-N/2} + e^{-\frac{2\pi i}{N}k} O_{k-N/2} & \mbox{for } N/2 \leq k < N. \end{cases}
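
Continuing the NumPy sketch from above (reusing E, O, and X), the second case can
be checked directly; note that the exponent still uses the full output index k,
while E and O are indexed modulo N/2:

    k2 = np.arange(N // 2, N)                   # second half of the output indices
    w2 = np.exp(-2j * np.pi * k2 / N)           # twiddle factors for N/2 <= k < N
    assert np.allclose(X[N // 2:], E + w2 * O)  # E_{k-N/2} + e^{-2 pi i k/N} O_{k-N/2}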

We also know that the twiddle factor e^{-2\pi i k/N} obeys the following relation:

e^{\frac{-2\pi i}{N} (k + N/2)} = e^{\frac{-2\pi i k}{N} - \pi i} = e^{-\pi i} e^{\frac{-2\pi i k}{N}} = -e^{\frac{-2\pi i k}{N}}
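
In the NumPy sketch above, this relation says exactly that the second-half twiddle
factors are the negatives of the first-half ones:

    assert np.allclose(w2, -w)  # e^{-2 pi i (k + N/2)/N} = -e^{-2 pi i k/N}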

This allows us to cut the number of "twiddle factor" calculations in half as well.
For 0 \leq k < \frac{N}{2}, we have

\begin{matrix} X_k & = & E_k + e^{-\frac{2\pi i}{N}k} O_k \\ X_{k+\frac{N}{2}} & = & E_k - e^{-\frac{2\pi i}{N}k} O_k \end{matrix}
This result, expressing the DFT of length N recursively in terms of two DFTs of
size N/2, is the core of the radix-2 DIT fast Fourier transform. The algorithm
gains its speed by re-using the results of intermediate computations to compute
multiple DFT outputs. Note that the final outputs are obtained by a +/- combination
of E_k and e^{-2\pi i k/N} O_k, which is simply a size-2 DFT (sometimes called a
butterfly in this context); when this is generalized to larger radices below, the
size-2 DFT is replaced by a larger DFT (which itself can be evaluated with an FFT).

[Figure: data flow diagram for N = 8. A decimation-in-time radix-2 FFT breaks a
length-N DFT into two length-N/2 DFTs followed by a combining stage consisting of
many size-2 DFTs called "butterfly" operations (so called because of the shape of
the data-flow diagrams).]
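
Putting the pieces together, here is a minimal recursive sketch of the radix-2 DIT
algorithm in plain Python (the function name is ours; as above, it assumes the
input length is a power of two). Each call splits the input into even- and
odd-indexed halves, transforms each half recursively, and then applies the
butterfly relations X_k = E_k + e^{-\frac{2\pi i}{N}k} O_k and
X_{k+N/2} = E_k - e^{-\frac{2\pi i}{N}k} O_k:

    import cmath

    def fft_radix2(x):
        """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of 2."""
        N = len(x)
        if N == 1:
            return list(x)            # a size-1 DFT is the identity
        E = fft_radix2(x[0::2])       # DFT of the even-indexed inputs
        O = fft_radix2(x[1::2])       # DFT of the odd-indexed inputs
        X = [0j] * N
        for k in range(N // 2):
            t = cmath.exp(-2j * cmath.pi * k / N) * O[k]  # twiddle factor times O_k
            X[k] = E[k] + t           # butterfly, "+" branch: X_k
            X[k + N // 2] = E[k] - t  # butterfly, "-" branch: X_{k+N/2}
        return X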

This process is an example of the general technique of divide-and-conquer
algorithms; in many traditional implementations, however, the explicit recursion is
avoided, and instead one traverses the computational tree in breadth-first fashion.
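
A breadth-first variant can be sketched as follows (again plain Python, names
ours): the input is first permuted into bit-reversed order, after which each
stage's butterflies can be applied in place, sweeping from size-2 transforms up to
size N:

    import cmath

    def fft_iterative(x):
        """In-place, breadth-first radix-2 DIT FFT; len(x) must be a power of 2."""
        N = len(x)
        X = list(x)
        # Permute the input into bit-reversed order.
        j = 0
        for i in range(1, N):
            bit = N >> 1
            while j & bit:           # increment j as a bit-reversed counter
                j ^= bit
                bit >>= 1
            j |= bit
            if i < j:
                X[i], X[j] = X[j], X[i]
        # Breadth-first passes: combine size-(m/2) DFTs into size-m DFTs.
        m = 2
        while m <= N:
            wm = cmath.exp(-2j * cmath.pi / m)  # principal m-th root of unity
            for start in range(0, N, m):
                w = 1 + 0j
                for k in range(m // 2):
                    t = w * X[start + k + m // 2]
                    u = X[start + k]
                    X[start + k] = u + t            # butterfly "+" branch
                    X[start + k + m // 2] = u - t   # butterfly "-" branch
                    w *= wm
            m *= 2
        return X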

The above re-expression of a size-N DFT as two size-N/2 DFTs is sometimes called
the Danielson–Lanczos lemma, since the identity was noted by those two authors in
1942[7] (influenced by Runge's 1903 work[2]). They applied their lemma in a
"backwards" recursive fashion, repeatedly doubling the DFT size until the transform
spectrum converged (although they apparently did not realize the linearithmic
[i.e., order N log N] asymptotic complexity they had achieved). The
Danielson–Lanczos work predated widespread availability of computers and required
hand calculation (possibly with mechanical aids such as adding machines); they
reported a computation time of 140 minutes for a size-64 DFT operating on real
inputs to 3–5 significant digits. Cooley and Tukey's 1965 paper reported a running
time of 0.02 minutes for a size-2048 complex DFT on an IBM 7094 (probably in 36-bit
single precision, ~8 digits).[3] Rescaling the time by the number of operations,
this corresponds roughly to a speedup factor of around 800,000. (To put the time
for the hand calculation in perspective, 140 minutes for size 64 corresponds to an
average of at most 16 seconds per floating-point operation, around 20% of which are
multiplications.)
