Parallel Fast Fourier Transform: 159.735 Studies in Parallel and Distributed System
Name: Bo LIU
ID: 03278999
Parallel Fast Fourier Transform
Introduction
The Fourier Transform plays an important role in digital signal processing, image
processing and voice recognition. It is used widely, both in everyday applications and in
scientific research and engineering. The Discrete Fourier Transform (DFT) is a specific
kind of Fourier Transform. It maps a sequence over time to another sequence over
frequency. However, a straightforward implementation of the DFT has time complexity
O(n^2), which is too slow for large inputs in practice. The Fast Fourier Transform (FFT)
is an O(n log n) algorithm for computing the DFT, and it can also be parallelized easily.
Fourier analysis
Fourier analysis is the representation of a continuous function by a potentially infinite
series of sine and cosine functions. It grew out of the study of Fourier series. A Fourier
series expresses a function as the sum of a series of sines and cosines:
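\[
f(x) = \frac{1}{2} a_0 + \sum_{n=1}^{\infty} a_n \cos(nx) + \sum_{n=1}^{\infty} b_n \sin(nx)
\]
where, for n = 1, 2, 3, ...
\[
a_0 = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x)\,dx, \qquad
a_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \cos(nx)\,dx, \qquad
b_n = \frac{1}{\pi} \int_{-\pi}^{\pi} f(x) \sin(nx)\,dx
\]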
The numbers a_n and b_n are called the Fourier coefficients of f, and the infinite sum
f(x) is called the Fourier series of f. Fourier series can be generalized to complex
numbers, and further generalized to derive the Fourier Transform.
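As a rough numerical illustration of the coefficient formulas, the sketch below approximates a_n and b_n with a midpoint rule. The function name `fourier_coefficients`, the sample count, and the choice of test function f(x) = x are assumptions made for this example only.

```python
import math

def fourier_coefficients(f, n_max, samples=2000):
    """Approximate a_0..a_n_max and b_0..b_n_max of f on [-pi, pi].

    Uses the midpoint rule; `samples` is an illustrative choice, not a
    recommendation. b[0] is always 0 because sin(0 * x) = 0.
    """
    h = 2 * math.pi / samples
    xs = [-math.pi + (k + 0.5) * h for k in range(samples)]
    a, b = [], []
    for n in range(n_max + 1):
        # a_n = (1/pi) * integral of f(x) cos(nx) over [-pi, pi]
        a.append(sum(f(x) * math.cos(n * x) for x in xs) * h / math.pi)
        # b_n = (1/pi) * integral of f(x) sin(nx) over [-pi, pi]
        b.append(sum(f(x) * math.sin(n * x) for x in xs) * h / math.pi)
    return a, b

# Example: f(x) = x is odd, so every a_n should be close to zero.
a, b = fourier_coefficients(lambda x: x, 3)
```

For f(x) = x the exact coefficients are a_n = 0 and b_n = 2(-1)^(n+1)/n, so the computed b values should come out close to 2, -1 and 2/3.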
Fourier Transform
The Fourier Transform and its inverse are defined by the expressions:
\[
F(k) = \int_{-\infty}^{\infty} f(x)\, e^{-2\pi i k x}\,dx
\]
\[
f(x) = \int_{-\infty}^{\infty} F(k)\, e^{2\pi i k x}\,dk
\]
The Fourier Transform maps a function in the time domain into the frequency domain, so
F is often called the frequency-domain representation of f. The Inverse Fourier
Transform maps the frequency domain back into the corresponding time domain. The two
transforms are inverses of each other.
Frequency-domain ideas are important in many application areas, including audio, signal
processing and image processing. For example, spectrum analysis is widely used to
analyze speech, compress images, and search for periodicities in a wide variety of data
in biology, physics and other fields. In particular, the JPEG compression algorithm,
which is widely used and very effective, uses a version of the Fourier cosine transform
(the discrete cosine transform) to compress image data.
However, the Fourier Transform is not suitable for machine computation because
infinitely many samples have to be considered. The Discrete Fourier Transform, a
modification of the Fourier Transform that works on a finite sequence of samples, can
be used for machine computation instead.
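Discrete Fourier Transform
The DFT of a sequence f_0, ..., f_{N-1} is given by:
\[
X_k = \sum_{n=0}^{N-1} f_n\, e^{-\frac{2\pi i}{N} kn}, \qquad k = 0, 1, \ldots, N-1
\]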
The inverse DFT is given by:
\[
f_n = \frac{1}{N} \sum_{k=0}^{N-1} X_k\, e^{\frac{2\pi i}{N} kn}
\]
where w_N = e^{2\pi i/N} = cos(2π/N) + i sin(2π/N) is a primitive Nth root of unity.
DFT Computation
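Given an n-element vector x, the DFT can be computed as the matrix-vector product F_n x,
where the entries of the DFT matrix F_n are f_{i,j} = w_n^{ij} for 0 <= i, j < n. Note
that these entries use a positive exponent, the same sign as in the inverse formula
above; texts differ on this convention, and for real input the two choices give
complex-conjugate results. The following examples are based on this matrix form.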
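DFT of the vector (2, 3), where the primitive square root of unity w_2 is -1:
\[
\begin{pmatrix} w_2^{0 \times 0} & w_2^{0 \times 1} \\ w_2^{1 \times 0} & w_2^{1 \times 1} \end{pmatrix}
\begin{pmatrix} x_0 \\ x_1 \end{pmatrix}
=
\begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}
\begin{pmatrix} 2 \\ 3 \end{pmatrix}
=
\begin{pmatrix} 5 \\ -1 \end{pmatrix}
\]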
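The matrix-vector formulation can be sketched directly in Python. This is a minimal O(n^2) illustration, assuming the convention w_n = e^{2πi/n} with entries w_n^{ij} stated above; the names `dft_matrix` and `dft` are chosen for this example.

```python
import cmath

def dft_matrix(n):
    """Build the n x n DFT matrix with entries w_n**(i*j), w_n = e^(2*pi*i/n)."""
    w = cmath.exp(2j * cmath.pi / n)
    return [[w ** (i * j) for j in range(n)] for i in range(n)]

def dft(x):
    """Compute the DFT of x as a matrix-vector product (O(n^2) operations)."""
    n = len(x)
    F = dft_matrix(n)
    return [sum(F[i][j] * x[j] for j in range(n)) for i in range(n)]

result = dft([2, 3])
```

Up to floating-point rounding, `dft([2, 3])` gives the vector (5, -1) from the example. The two nested loops over i and j are where the O(n^2) cost comes from.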
DFT of vector (1, 2, 4, 3), the primitive 4th root of unity for w4 is i.
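Worked out from the definition f_{i,j} = w_n^{ij} given earlier, with w_4 = i, w_4^2 = -1 and w_4^3 = -i, the product for this vector is:
\[
\begin{pmatrix}
1 & 1 & 1 & 1 \\
1 & i & -1 & -i \\
1 & -1 & 1 & -1 \\
1 & -i & -1 & i
\end{pmatrix}
\begin{pmatrix} 1 \\ 2 \\ 4 \\ 3 \end{pmatrix}
=
\begin{pmatrix}
1 + 2 + 4 + 3 \\
1 + 2i - 4 - 3i \\
1 - 2 + 4 - 3 \\
1 - 2i - 4 + 3i
\end{pmatrix}
=
\begin{pmatrix} 10 \\ -3 - i \\ 0 \\ -3 + i \end{pmatrix}
\]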
The time complexity of the DFT for n samples is O(n^2) if it is implemented
straightforwardly, so computing the DFT directly is not the best approach in practice.
The Fast Fourier Transform (FFT) is an improved algorithm that produces exactly the
same result as the DFT. It uses a divide-and-conquer strategy, so it takes only
O(n log n) time to transform n samples. The only difference between the two is speed:
the FFT can be thought of as a fast way of computing the DFT.
The idea is to keep dividing a DFT sequence of N samples into two subsequences,
separating the even-indexed and odd-indexed elements at each step. If N is a power of
2, the splitting continues until each subsequence has only one element. The resulting
order of the indices is the bit-reversed order of the original indices.
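The bit-reversed ordering produced by repeated even/odd splitting can be sketched as follows. The helper names `reverse_bits` and `bit_reversed_order` are assumptions for this illustration, and n is assumed to be a power of 2.

```python
def reverse_bits(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def bit_reversed_order(n):
    """Return the indices 0..n-1 in bit-reversed order (n a power of 2)."""
    if n <= 0 or n & (n - 1):
        raise ValueError("n must be a power of 2")
    bits = n.bit_length() - 1
    return [reverse_bits(i, bits) for i in range(n)]

order = bit_reversed_order(8)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

For N = 8, index 1 (binary 001) moves to position 4 (binary 100), index 3 (011) to position 6 (110), and so on, which matches the order obtained by splitting into even and odd indices three times.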
For example, with N = 16 points there are log2 N = 4 steps, each performing about N
operations, so the time complexity is O(N log N).
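A minimal recursive sketch of the divide-and-conquer idea is given here. It assumes the input length is a power of 2 and uses the same convention as the DFT matrix, w_n = e^{2πi/n}; the name `fft_recursive` is chosen for this example.

```python
import cmath

def fft_recursive(x):
    """Recursive radix-2 FFT using w_n = e^(2*pi*i/n); len(x) must be a power of 2."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    if n == 1:
        return [complex(x[0])]
    even = fft_recursive(x[0::2])   # transform of the even-indexed elements
    odd = fft_recursive(x[1::2])    # transform of the odd-indexed elements
    out = [0j] * n
    for k in range(n // 2):
        t = cmath.exp(2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t            # upper half of the combined result
        out[k + n // 2] = even[k] - t   # lower half reuses the same product
    return out

result = fft_recursive([1, 2, 4, 3])
```

Each level of recursion does about n operations and there are log2 n levels, which is where the O(n log n) bound comes from. For the input (1, 2, 4, 3) the result agrees with the matrix form, (10, -3 - i, 0, -3 + i), up to rounding.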
The example from the Discrete Fourier Transform section can be recomputed with the FFT.
The diagram is shown below.
When parallelizing the FFT, we have to consider which formulation of the algorithm is
suitable. The recursive FFT is easy to implement. However, there are two reasons for
using an iterative FFT instead. First, the iterative version performs fewer index
computations. Second, it is easier to derive a parallel FFT algorithm when the
sequential algorithm is in iterative form. As noted earlier, the output indices are the
bit-reversed input indices, so the iterative algorithm uses this fact to rearrange the
input before combining.
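One way the iterative form might look is sketched here: the input is first permuted into bit-reversed order, then log2 n combining passes are applied in place. It assumes a power-of-2 length and the convention w_m = e^{2πi/m}; the names `reverse_bits` and `fft_iterative` are illustrative.

```python
import cmath

def reverse_bits(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def fft_iterative(x):
    """Iterative radix-2 FFT using w_m = e^(2*pi*i/m); len(x) must be a power of 2."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of 2")
    bits = n.bit_length() - 1
    # Rearrange the input into bit-reversed order.
    a = [complex(x[reverse_bits(i, bits)]) for i in range(n)]
    m = 2
    while m <= n:                       # log2(n) passes
        half = m // 2
        w_m = cmath.exp(2j * cmath.pi / m)
        for start in range(0, n, m):    # each block of m values is combined
            w = 1 + 0j
            for k in range(half):
                u = a[start + k]
                t = w * a[start + k + half]
                a[start + k] = u + t
                a[start + k + half] = u - t
                w *= w_m
        m *= 2
    return a

result = fft_iterative([1, 2, 4, 3])
```

Because every pass walks the array with fixed strides, it is straightforward to see which elements each pass combines, and that is what makes this form convenient to split across processes.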
The following graph shows the process for parallel Fast Fourier Transform:
The top sequence is the input and the bottom sequence is the output. Each process is
represented by a gray rectangle.
There are three phases in the parallel algorithm. Assume n is the number of elements
and p is the number of processes. In the first phase, the processes permute the input
sequence a, rearranging the indices into bit-reversed order. In the second phase, the
processes perform the first log n - log p iterations of the FFT by performing the
required multiplications, additions and subtractions on complex numbers. In the third
phase, the processes perform the final log p iterations of the FFT, swapping values
with a partner process across a hypercube dimension in each iteration.
Each process controls n/p elements of the input sequence a. There are log p iterations
in which each process swaps about n/p values with a partner process, so the overall
communication time complexity is O((n/p) log p). The computational complexity of the
parallel algorithm is O((n log n)/p).
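The three phases can be imitated in a single Python program that keeps one list per "process". This is only a sequential simulation under stated assumptions: n and p are powers of 2 with p <= n, the data is distributed in contiguous blocks, w_m = e^{2πi/m} as in the DFT matrix, and a copied snapshot stands in for real message passing. The names `simulate_parallel_fft` and `reverse_bits` are hypothetical.

```python
import cmath

def reverse_bits(i, bits):
    """Reverse the lowest `bits` bits of the integer i."""
    r = 0
    for _ in range(bits):
        r = (r << 1) | (i & 1)
        i >>= 1
    return r

def simulate_parallel_fft(a, p):
    """Simulate the three phases; returns (result, values_sent_per_process)."""
    n = len(a)
    if n == 0 or n & (n - 1) or p <= 0 or p & (p - 1) or p > n:
        raise ValueError("n and p must be powers of 2 with p <= n")
    bits = n.bit_length() - 1
    chunk = n // p                      # elements controlled by each process

    # Phase 1: permute the input into bit-reversed order, then split into blocks.
    permuted = [complex(a[reverse_bits(i, bits)]) for i in range(n)]
    local = [permuted[r * chunk:(r + 1) * chunk] for r in range(p)]

    # Phase 2: the first log n - log p iterations need only local data.
    m = 2
    while m <= chunk:
        half = m // 2
        for block in local:
            for start in range(0, chunk, m):
                for k in range(half):
                    w = cmath.exp(2j * cmath.pi * k / m)
                    u, t = block[start + k], w * block[start + k + half]
                    block[start + k], block[start + k + half] = u + t, u - t
        m *= 2

    # Phase 3: the final log p iterations pair each process with the partner
    # whose rank differs in one bit (one hypercube dimension).
    sent = 0
    while m <= n:
        half = m // 2
        dim = half // chunk                    # rank bit that selects the partner
        snapshot = [blk[:] for blk in local]   # stands in for exchanged messages
        for r in range(p):
            mine, theirs = snapshot[r], snapshot[r ^ dim]
            for i in range(chunk):
                w = cmath.exp(2j * cmath.pi * ((r * chunk + i) % half) / m)
                if (r & dim) == 0:             # this process holds the upper inputs
                    local[r][i] = mine[i] + w * theirs[i]
                else:                          # this process holds the lower inputs
                    local[r][i] = theirs[i] - w * mine[i]
        sent += chunk                          # n/p values swapped per iteration
        m *= 2
    return [v for blk in local for v in blk], sent

result, sent = simulate_parallel_fft([1, 2, 4, 3], 2)
```

In this simulation each process sends n/p values in each of the log p final iterations, so `sent` equals (n/p) log2 p, in line with the communication bound above. A real implementation would replace the snapshot with pairwise exchanges between processes.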