Cooley–Tukey FFT
Because the Cooley-Tukey algorithm breaks the DFT into smaller DFTs, it can be
combined arbitrarily with any other algorithm for the DFT. For example, Rader's or
Bluestein's algorithm can be used to handle large prime factors that cannot be
decomposed by Cooley–Tukey, or the prime-factor algorithm can be exploited for
greater efficiency in separating out relatively prime factors.
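To make this combinability concrete, here is a minimal Python sketch (the names fft_cooley_tukey, dft_naive, and smallest_factor are illustrative, not from any particular library): it peels the smallest prime factor off N at each step and hands prime lengths to a direct O(N^2) DFT, which merely stands in here for a specialized prime-size algorithm such as Rader's or Bluestein's.

import cmath

def dft_naive(x):
    # Direct O(N^2) DFT. In this sketch it stands in for a prime-size
    # algorithm such as Rader's or Bluestein's, which are not implemented here.
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

def smallest_factor(n):
    # Smallest prime factor of n (returns n itself when n is prime).
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

def fft_cooley_tukey(x):
    # Mixed-radix Cooley-Tukey decimation in time: split N = N1 * N2,
    # transform the N1 interleaved length-N2 subsequences recursively,
    # then recombine them with twiddle factors.
    n = len(x)
    n1 = smallest_factor(n)
    if n1 == n:                  # prime length: defer to the base-case algorithm
        return dft_naive(x)
    n2 = n // n1
    sub = [fft_cooley_tukey(x[r::n1]) for r in range(n1)]
    return [sum(sub[r][k % n2] * cmath.exp(-2j * cmath.pi * r * k / n)
                for r in range(n1))
            for k in range(n)]

The point is only the structure: the split-and-recombine step does not depend on how the base-case DFTs are computed, so any prime-size DFT algorithm can be plugged in at that point.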
The algorithm, along with its recursive application, was invented by Carl Friedrich
Gauss. Cooley and Tukey independently rediscovered and popularized it 160 years
later.
This algorithm, including its recursive application, was invented around 1805 by
Carl Friedrich Gauss, who used it to interpolate the trajectories of the asteroids
Pallas and Juno, but his work was not widely recognized (being published only
posthumously and in neo-Latin).[1][2] Gauss did not analyze the asymptotic
computational time, however. Various limited forms were also rediscovered several
times throughout the 19th and early 20th centuries.[2] FFTs became popular after
James Cooley of IBM and John Tukey of Princeton published a paper in 1965
reinventing the algorithm and describing how to perform it conveniently on a
computer.[3]
Tukey reportedly came up with the idea during a meeting of President Kennedy's
Science Advisory Committee discussing ways to detect nuclear-weapon tests in the
Soviet Union by employing seismometers located outside the country. These sensors
would generate seismological time series. However, analysis of this data would
require fast algorithms for computing the DFT, due to the number of sensors and the
length of the time series. This task was critical for the ratification of the
proposed nuclear test ban, so that any violations could be detected without the need
to visit Soviet facilities.[4][5] Another participant at that meeting, Richard
Garwin of IBM, recognized the potential of the method and put Tukey in touch with
Cooley, while making sure that Cooley did not know the original purpose. Instead,
Cooley was told that this was needed to determine periodicities of the spin
orientations in a 3-D crystal of helium-3. Cooley and Tukey subsequently published
their joint paper, and wide adoption quickly followed due to the simultaneous
development of analog-to-digital converters capable of sampling at rates up to
300 kHz.
The fact that Gauss had described the same algorithm (albeit without analyzing its
asymptotic cost) was not realized until several years after Cooley and Tukey's 1965
paper.[2] Their paper cited as inspiration the work of I. J. Good on what is now
called the prime-factor FFT algorithm (PFA);[3] although Good's algorithm was
initially thought to be equivalent to the Cooley–Tukey algorithm, it was quickly
realized that PFA is a quite different algorithm (working only for sizes that have
relatively prime factors and relying on the Chinese Remainder Theorem, unlike the
support for any composite size in Cooley–Tukey).[6]
A radix-2 decimation-in-time (DIT) FFT is the simplest and most common form of the
Cooley–Tukey algorithm, although highly optimized Cooley–Tukey implementations
typically use other forms of the algorithm as described below. Radix-2 DIT divides
a DFT of size N into two interleaved DFTs (hence the name "radix-2") of size N/2
with each recursive stage.
Radix-2 DIT first computes the DFTs of the even-indexed inputs (x_{2m}=x_0, x_2, \
ldots, x_{N-2}) and of the odd-indexed inputs (x_{2m+1}=x_1, x_3, \ldots, x_{N-1}),
and then combines those two results to produce the DFT of the whole sequence. This
decomposition can then be applied recursively to reduce the overall runtime to
O(N log N).
This simplified form assumes that N is a power of two; since the number of sample
points N can usually be chosen freely by the application, this is often not an
important restriction.
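As a concrete sketch of this recursion (in Python; the function name fft_radix2_dit is illustrative, and it assumes the length is a power of two), the even- and odd-indexed halves are transformed recursively and then combined using the butterfly relations derived below:

import cmath

def fft_radix2_dit(x):
    # Radix-2 decimation-in-time FFT; assumes len(x) is a power of two.
    n = len(x)
    if n == 1:
        return list(x)                      # a size-1 DFT is the input itself
    even = fft_radix2_dit(x[0::2])          # E_k: DFT of the even-indexed inputs
    odd = fft_radix2_dit(x[1::2])           # O_k: DFT of the odd-indexed inputs
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(-2j * cmath.pi * k / n)   # twiddle factor e^{-2 pi i k / N}
        out[k] = even[k] + w * odd[k]           # X_k
        out[k + n // 2] = even[k] - w * odd[k]  # X_{k + N/2}
    return out

Each level of the recursion does O(N) work to combine the two half-size results, and there are log2 N levels, which is where the O(N log N) total comes from.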
The radix-2 DIT algorithm rearranges the DFT of the function x_n into two parts: a
sum over the even-numbered indices n = 2m and a sum over the odd-numbered indices
n = 2m + 1:

X_k = \sum_{m=0}^{N/2-1} x_{2m} e^{-\frac{2\pi i}{N}(2m)k} + \sum_{m=0}^{N/2-1} x_{2m+1} e^{-\frac{2\pi i}{N}(2m+1)k}
One can factor a common multiplier e^{-\frac{2\pi i}{N}k} out of the second sum, as
shown in the equation below. It is then clear that the two sums are the DFT of the
even-indexed part x_{2m} and the DFT of the odd-indexed part x_{2m+1} of the function
x_n. Denote the DFT of the even-indexed inputs x_{2m} by E_k and the DFT of the
odd-indexed inputs x_{2m+1} by O_k, and we obtain:

X_k = \underbrace{\sum_{m=0}^{N/2-1} x_{2m} e^{-\frac{2\pi i}{N/2} mk}}_{E_k} + e^{-\frac{2\pi i}{N}k} \underbrace{\sum_{m=0}^{N/2-1} x_{2m+1} e^{-\frac{2\pi i}{N/2} mk}}_{O_k} = E_k + e^{-\frac{2\pi i}{N}k} O_k.

By the periodicity of the DFT, E_{k+\frac{N}{2}} = E_k and O_{k+\frac{N}{2}} = O_k.
We also know that the twiddle factor e^{-2\pi i k/ N} obeys the following relation:

e^{-\frac{2\pi i}{N}(k + \frac{N}{2})} = e^{-\frac{2\pi i k}{N} - \pi i} = -e^{-\frac{2\pi i k}{N}}

This allows us to cut the number of "twiddle factor" calculations in half also. For
0 \leq k < \frac{N}{2}, we have

X_k = E_k + e^{-\frac{2\pi i}{N} k} O_k
X_{k+\frac{N}{2}} = E_k - e^{-\frac{2\pi i}{N} k} O_k
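As a quick numerical sanity check of these two relations (a sketch only; direct_dft and the length-8 input are arbitrary illustrative choices), one can compute E_k, O_k, and X_k directly from the DFT definition and confirm that the butterfly formulas reproduce the full transform:

import cmath

def direct_dft(x):
    # Naive O(N^2) DFT, used here only as a reference.
    n = len(x)
    return [sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
            for k in range(n)]

x = [0.0, 1.0, 2.0, 3.0, 4.0, 3.0, 2.0, 1.0]   # arbitrary example input, N = 8
N = len(x)
E = direct_dft(x[0::2])   # DFT of the even-indexed inputs
O = direct_dft(x[1::2])   # DFT of the odd-indexed inputs
X = direct_dft(x)         # full size-N reference DFT

for k in range(N // 2):
    w = cmath.exp(-2j * cmath.pi * k / N)                  # twiddle factor
    assert abs(X[k] - (E[k] + w * O[k])) < 1e-9            # X_k
    assert abs(X[k + N // 2] - (E[k] - w * O[k])) < 1e-9   # X_{k+N/2}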
The above re-expression of a size-N DFT as two size-N/2 DFTs is sometimes called
the Danielson–Lanczos lemma, since the identity was noted by those two authors in
1942[7] (influenced by Runge's 1903 work[2]). They applied their lemma in a
"backwards" recursive fashion, repeatedly doubling the DFT size until the transform
spectrum converged (although they apparently didn't realize the linearithmic [i.e.,
order N log N] asymptotic complexity they had achieved). The Danielson–Lanczos work
predated widespread availability of computers and required hand calculation
(possibly with mechanical aids such as adding machines); they reported a
computation time of 140 minutes for a size-64 DFT operating on real inputs to 3–5
significant digits. Cooley and Tukey's 1965 paper reported a running time of 0.02
minutes for a size-2048 complex DFT on an IBM 7094 (probably in 36-bit single
precision, ~8 digits).[3] Rescaling the time by the number of operations, this
corresponds roughly to a speedup factor of around 800,000. (To put the time for the
hand calculation in perspective, 140 minutes for size 64 corresponds to an average
of at most 16 seconds per floating-point operation, around 20% of which are
multiplications.)