The Quantum Fourier Transform and Jordan's Algorithm
Dave Bacon
Department of Computer Science & Engineering, University of Washington
After Simon’s algorithm, the next big breakthrough in quantum algorithms occurred when Peter Shor discovered
his algorithm for efficiently factoring numbers. This algorithm makes use of the quantum Fourier transform. In this
lecture we will take a detour to discuss the (quantum) discrete Fourier transform and see an application of this transform
which was only recently (2005) realized.
There are many motivations for the discrete Fourier transform. Those of you in computer science have probably
encountered it first in signal processing and perhaps later in cool theory results like switching lemmas. Those of
you in physics have definitely encountered the continuous Fourier transform, most likely first in quantum theory, where
we learn that the Fourier transform goes between the momentum and position representations of a wave function.
Suppose that we have a vector f of N complex numbers, f_k, k ∈ {0, 1, . . . , N − 1}. Then the discrete Fourier
transform (DFT) is a map from these N complex numbers to N complex numbers, the Fourier transformed coefficients
f̃_j, given by
\[
\tilde{f}_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega^{-jk} f_k \tag{1}
\]
where $\omega = \exp\left(\frac{2\pi i}{N}\right)$. The inverse DFT is given by
\[
f_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega^{jk} \tilde{f}_k \tag{2}
\]
To see this, consider how the basis vectors transform. If $f^l_k = \delta_{k,l}$, then
\[
\tilde{f}^l_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega^{-jk} \delta_{k,l} = \frac{1}{\sqrt{N}} \omega^{-jl}. \tag{3}
\]
These DFTed vectors are orthonormal:
\[
\sum_{j=0}^{N-1} \left(\tilde{f}^l_j\right)^* \tilde{f}^m_j = \frac{1}{N} \sum_{j=0}^{N-1} \omega^{jl} \omega^{-jm} = \frac{1}{N} \sum_{j=0}^{N-1} \omega^{j(l-m)} \tag{4}
\]
This last sum can be evaluated as a geometric series, but beware of the (l − m) = 0 term, and yields
\[
\sum_{j=0}^{N-1} \left(\tilde{f}^l_j\right)^* \tilde{f}^m_j = \delta_{l,m}. \tag{5}
\]
From this we can check that the inverse DFT does indeed perform the inverse transform:
\[
f_j = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega^{jk} \tilde{f}_k = \frac{1}{\sqrt{N}} \sum_{k=0}^{N-1} \omega^{jk} \frac{1}{\sqrt{N}} \sum_{l=0}^{N-1} \omega^{-lk} f_l = \frac{1}{N} \sum_{k,l=0}^{N-1} \omega^{(j-l)k} f_l = \sum_{l=0}^{N-1} \delta_{j,l} f_l = f_j \tag{6}
\]
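For concreteness, here is a minimal numerical sketch of the unitary DFT of Eqs. (1) and (2), written in Python with numpy (which these notes do not otherwise assume), together with a check of Eq. (6) and of unitarity:

import numpy as np

def dft(f):
    # Unitary DFT, Eq. (1):  f~_j = (1/sqrt(N)) sum_k omega^{-jk} f_k,  omega = exp(2 pi i / N)
    N = len(f)
    n = np.arange(N)
    W = np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)
    return W @ f

def idft(ft):
    # Inverse DFT, Eq. (2):  f_j = (1/sqrt(N)) sum_k omega^{+jk} f~_k
    N = len(ft)
    n = np.arange(N)
    W = np.exp(+2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)
    return W @ ft

f = np.random.randn(8) + 1j * np.random.randn(8)
assert np.allclose(idft(dft(f)), f)                           # Eq. (6): the inverse undoes the DFT
assert np.isclose(np.linalg.norm(dft(f)), np.linalg.norm(f))  # unitarity: the norm is preserved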
An important property of the DFT is the convolution theorem. The circular convolution of two vectors f and g is
given by
\[
(f * g)_i = \sum_{j=0}^{N-1} f_j g_{i-j} \tag{7}
\]
where we define g_{-m} = g_{N-m}. The convolution theorem states that the DFT turns convolution into pointwise vector
multiplication: if the components of the DFT of (f ∗ g) are c̃_k, then c̃_k = f̃_k g̃_k (up to an overall factor of √N
that comes from our unitary normalization). What use is the convolution theorem? Well, this leads us nicely to our
next topic, the fast Fourier transform.
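A quick numerical illustration of the convolution theorem (again a Python/numpy sketch, not part of the original notes); note the √N that appears because we use the unitary normalization:

import numpy as np

def dft(f):
    # Unitary DFT with the omega^{-jk} convention of Eq. (1).
    N = len(f)
    n = np.arange(N)
    return (np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)) @ f

def circular_convolution(f, g):
    # (f * g)_i = sum_j f_j g_{(i - j) mod N}, Eq. (7)
    N = len(f)
    return np.array([sum(f[j] * g[(i - j) % N] for j in range(N)) for i in range(N)])

N = 8
f = np.random.randn(N)
g = np.random.randn(N)
# With the unitary normalization, DFT(f * g) = sqrt(N) * DFT(f) . DFT(g) (pointwise product).
assert np.allclose(dft(circular_convolution(f, g)), np.sqrt(N) * dft(f) * dft(g))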
Naively, how many math operations do we have to do to perform a discrete Fourier transform? For each
component of the new vector we need to perform N multiplications and then add up the results, and we need to do
this for each of the N different components. Thus N^2 complex multiplications and N(N − 1) complex additions are
needed to compute the DFT. The goal of the fast Fourier transform is to perform the DFT using fewer basic math
operations. There are many ways to do this. We will describe one particular method for N = 2^n and will put off
discussion of the case where N ≠ 2^n until later. So assume N = 2^n from here until I say otherwise.
The Fast Fourier Transform (FFT) we will consider is based on the observation that there are symmetries among
the coefficients in the DFT:
\[
\omega^{k+N/2} = -\omega^{k}, \qquad \omega^{k+N} = \omega^{k}. \tag{8}
\]
Suppose we want to perform the DFT of the vector f. Split the components of f up into two smaller vectors of size
N/2, e and o: the components of e are the even-indexed components of f and the components of o are the odd-indexed
components of f, with the order of the components retained. Then it is easy to see that
\[
\tilde{f}_j = \frac{1}{\sqrt{N}} \sum_{i=0}^{N-1} \omega^{-ij} f_i
= \frac{1}{\sqrt{N}} \left( \sum_{i=0}^{N/2-1} \omega^{-2ij} e_i + \sum_{i=0}^{N/2-1} \omega^{-(2i+1)j} o_i \right)
= \frac{1}{\sqrt{N}} \left( \sum_{i=0}^{N/2-1} \omega_{N/2}^{-ij} e_i + \omega_N^{-j} \sum_{i=0}^{N/2-1} \omega_{N/2}^{-ij} o_i \right) \tag{9}
\]
where $\omega_M = \exp\left(\frac{2\pi i}{M}\right)$.
Now recall that j runs from 0 to N − 1 and that the DFTs of e and o (call them ẽ and õ) are periodic with period N/2.
Using this and the above symmetry we find that we can express our formula as
\[
\sqrt{2}\,\tilde{f}_j = \tilde{e}_j + \omega_N^{-j}\,\tilde{o}_j, \qquad 0 \le j \le N/2 - 1 \tag{10}
\]
\[
\sqrt{2}\,\tilde{f}_{j+N/2} = \tilde{e}_j - \omega_N^{-j}\,\tilde{o}_j, \qquad 0 \le j \le N/2 - 1 \tag{11}
\]
Suppose that we first compute the DFTs of e and o and then use them in this formula to compute the full DFT
of f. How many complex multiplications do we need to perform? To compute ẽ and õ requires 2(N/2)^2 = N^2/2
multiplications. We need another N/2 to compute the products ω_N^{-j} õ_j. Forget about the factor of square root
of two; it can always be put in at the end as an extra N multiplications. Thus we require N^2/2 + N/2 complex
multiplications to compute the DFT, as opposed to N^2 in the previous method. This is a reduction of about a factor
of 2 for large N.
Further, it is clear that for N = 2^n we can use the above trick recursively, all the way down to N = 2. How many
complex multiplications do we need to perform if we do this? Let T_n denote the number of multiplications at the
N = 2^n level, so that T_1 = 4. Then
\[
T_n \le 2 T_{n-1} + 2^n \tag{13}
\]
which has solution T_n = O(n 2^n). In other words the running time is bounded by O(N log N). Thus with the FFT
we can compute the DFT using O(N log N) operations. This is a nice little improvement. As a point of historical
interest, Gauss apparently knew the FFT algorithm.
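To make the recursion concrete, here is a minimal radix-2 FFT in Python implementing Eqs. (10) and (11); it assumes, as in the text, that N = 2^n, and numpy is used only for convenience and for the comparison against the direct DFT:

import numpy as np

def fft(f):
    # Recursive radix-2 FFT computing the unitary DFT of Eq. (1); len(f) must be a power of 2.
    N = len(f)
    if N == 1:
        return np.array(f, dtype=complex)
    e = fft(f[0::2])                                   # DFT of the even-indexed entries
    o = fft(f[1::2])                                   # DFT of the odd-indexed entries
    w = np.exp(-2j * np.pi * np.arange(N // 2) / N)    # omega_N^{-j}
    # Eqs. (10) and (11): sqrt(2) f~_j = e~_j +/- omega_N^{-j} o~_j
    return np.concatenate([e + w * o, e - w * o]) / np.sqrt(2)

def dft(f):
    N = len(f)
    n = np.arange(N)
    return (np.exp(-2j * np.pi * np.outer(n, n) / N) / np.sqrt(N)) @ f

f = np.random.randn(16) + 1j * np.random.randn(16)
assert np.allclose(fft(f), dft(f))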
Here is a very cool application of the FFT. Suppose that you have two polynomials with complex coefficients:
f(x) = a_0 + a_1 x + · · · + a_{N−1} x^{N−1} and g(x) = b_0 + b_1 x + · · · + b_{N−1} x^{N−1}. If you multiply these two polynomials
together you get a new polynomial
\[
f(x) g(x) = \sum_{i,j=0}^{N-1} a_i b_j x^{i+j} = \sum_{k=0}^{2(N-1)} c_k x^k.
\]
The new coefficients for this polynomial are a function of the coefficients of the two polynomials:
\[
c_k = \sum_{l=0}^{N-1} a_l b_{k-l} \tag{14}
\]
where the sum is over all valid polynomial terms (i.e. when k − l is negative there is no term in the sum). One sees
that computing all of the c_k requires N^2 multiplications.
But wait, the expression for c_k looks a lot like a convolution. Indeed, suppose that we form the 2N dimensional vectors
a = (a_0, . . . , a_{N−1}, 0, . . . , 0) and b = (b_0, . . . , b_{N−1}, 0, . . . , 0) from our original data. The vector c which represents
the coefficients of the new polynomial is then given by
\[
c_k = \sum_{l=0}^{2N-1} a_l b_{(k-l) \bmod 2N} \tag{15}
\]
Now we don’t need to condition this sum on there being valid terms, and the expression is explicitly a circular
convolution! Thus we can compute the coefficients c_k by the following algorithm. Compute the DFT of the vectors
a and b. Pointwise multiply these two vectors. Then inverse DFT this new vector. The result will be the c_k, by the
convolution theorem. If we use the FFT algorithm for this procedure, then we require only O(N log N) multiplications.
This is pretty cool: by using the FFT we can multiply polynomials faster than with our naive grade school method for
multiplying polynomials. It is good to see that our grad school self can do things our grade school self cannot do.
Some of you will even know that you can use the FFT to multiply two N-digit integers with a cost of O(N log^2 N),
or, used recursively, O(N log N log log N log log log N · · · ).
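Here is a sketch of the polynomial multiplication trick in Python, using numpy's built-in (unnormalized) FFT for brevity; the zero-padding to length 2N is the key step, and the small example polynomials are my own:

import numpy as np

def poly_multiply(a, b):
    # Multiply polynomials with coefficient vectors a and b (each of length N) via the FFT.
    N = len(a)
    # Zero-pad to length 2N so that the circular convolution mod 2N reproduces Eq. (15).
    A = np.fft.fft(np.concatenate([a, np.zeros(N)]))
    B = np.fft.fft(np.concatenate([b, np.zeros(N)]))
    c = np.fft.ifft(A * B)            # convolution theorem: pointwise multiply, then invert
    return c.real[:2 * N - 1]         # c_k for k = 0, ..., 2(N-1); real when a and b are real

# (1 + 2x)(3 + 4x) = 3 + 10x + 8x^2
print(poly_multiply(np.array([1.0, 2.0]), np.array([3.0, 4.0])))   # -> [ 3. 10.  8.]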
Now let’s turn to the Quantum Fourier Transform (QFT). We’ve already seen the QFT for N = 2. It is the Hadamard
transform:
\[
H = \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix} \tag{16}
\]
Why is this the QFT for N = 2? Well, suppose we have the single qubit state a_0|0⟩ + a_1|1⟩. If we apply the Hadamard
operation to this state we obtain the new state
\[
\frac{1}{\sqrt{2}}(a_0 + a_1)|0\rangle + \frac{1}{\sqrt{2}}(a_0 - a_1)|1\rangle = \tilde{a}_0 |0\rangle + \tilde{a}_1 |1\rangle. \tag{17}
\]
In other words, the Hadamard gate performs the DFT for N = 2 on the amplitudes of the state! Notice that this is
very different from computing the DFT for N = 2: remember the amplitudes are not numbers which are accessible to
us mere mortals; they just represent our description of the quantum system.
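As a tiny numerical check (a Python/numpy sketch, not part of the original notes), the Hadamard matrix is exactly the N = 2 DFT matrix of Eq. (1):

import numpy as np

H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
# N = 2 DFT matrix from Eq. (1): entries omega^{-jk} / sqrt(2), with omega = exp(2*pi*i/2) = -1.
F2 = np.array([[np.exp(-2j * np.pi * j * k / 2) for k in range(2)] for j in range(2)]) / np.sqrt(2)
assert np.allclose(H, F2)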
So what is the full quantum Fourier transform? It is the transform which takes the amplitudes of an N dimen-
sional state and computes the Fourier transform of these amplitudes (which are then the new amplitudes in the
computational basis). In other words, the QFT enacts the transform
\[
\sum_{x=0}^{N-1} a_x |x\rangle \rightarrow \sum_{x=0}^{N-1} \tilde{a}_x |x\rangle = \sum_{x=0}^{N-1} \frac{1}{\sqrt{N}} \sum_{y=0}^{N-1} \omega_N^{-xy} a_y |x\rangle. \tag{18}
\]
It is easy to see that this implies that the QFT performs the following transform on basis states:
\[
|x\rangle \rightarrow \frac{1}{\sqrt{N}} \sum_{y=0}^{N-1} \omega_N^{-xy} |y\rangle \tag{19}
\]
The QFT is a very important transform in quantum computing. It can be used for all sorts of cool tasks, including,
as we shall see, in Shor’s algorithm. But before we can use it for quantum computing tasks, we should try to see if
we can efficiently implement the QFT with a quantum circuit. Indeed we can, and the reason we can is intimately
related to the fast Fourier transform.
Let’s derive a circuit for the QFT when N = 2^n. The QFT performs the transform
\[
|x\rangle \rightarrow \frac{1}{\sqrt{2^n}} \sum_{y=0}^{2^n - 1} \omega_N^{-xy} |y\rangle. \tag{22}
\]
Writing y in binary as y = y_1 y_2 · · · y_n (so that $y = \sum_{k=1}^{n} y_k 2^{n-k}$), expanding the exponential of the sum
into a product of exponentials, and collecting the terms belonging to each y_k, we can express this as
\[
|x\rangle \rightarrow \frac{1}{\sqrt{2^n}} \sum_{y_1, y_2, \ldots, y_n \in \{0,1\}} \bigotimes_{k=1}^{n} \omega_N^{-x 2^{n-k} y_k} |y_k\rangle \tag{24}
\]
Take the first qubit of |x_1, . . . , x_n⟩ and apply a Hadamard transform. This produces the transform
\[
|x\rangle \rightarrow \frac{1}{\sqrt{2}} \left( |0\rangle + e^{-2\pi i\, 0.x_1} |1\rangle \right) \otimes |x_2, x_3, \ldots, x_n\rangle \tag{29}
\]
where 0.x_1 x_2 · · · x_m denotes the binary fraction $\sum_{k=1}^{m} x_k 2^{-k}$.
Now define the rotation gate
\[
R_k = \begin{pmatrix} 1 & 0 \\ 0 & \exp\left( \frac{-2\pi i}{2^k} \right) \end{pmatrix} \tag{30}
\]
If we now apply R_2, R_3, . . . , R_n gates to the first qubit, controlled on qubits 2, 3, . . . , n respectively, this enacts
the transform
\[
|x\rangle \rightarrow \frac{1}{\sqrt{2}} \left( |0\rangle + e^{-2\pi i\, 0.x_1 x_2 \ldots x_n} |1\rangle \right) \otimes |x_2, x_3, \ldots, x_n\rangle \tag{31}
\]
Thus we have reproduced the last factor in the QFTed state. Of course now we can proceed to the second qubit,
perform a Hadamard and the appropriate controlled R_k gates, and obtain the second-to-last factor. When we are
finished we will have the transform
\[
|x\rangle \rightarrow \frac{1}{\sqrt{2^n}} \left( |0\rangle + e^{-2\pi i\, 0.x_1 x_2 \cdots x_n} |1\rangle \right) \otimes \left( |0\rangle + e^{-2\pi i\, 0.x_2 \cdots x_n} |1\rangle \right) \otimes \cdots \otimes \left( |0\rangle + e^{-2\pi i\, 0.x_n} |1\rangle \right) \tag{32}
\]
Reversing the order of these qubits will then produce the QFT!
The circuit we have constructed on n qubits is shown below.
[Circuit diagram: a Hadamard followed by controlled R_k gates on each qubit in turn, then the qubit-reversing swaps.]
This circuit is polynomial size in n. Actually we can count the number of quantum gates in it: $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$
Hadamards and controlled R_k gates, plus $\lfloor n/2 \rfloor$ swap gates. What was it that allowed us to construct an efficient circuit?
Well, if you look at the factorization we used, you will see that we have basically used the same trick that we used
for the FFT! But now, since we are working on amplitudes and not operating on the complex vectors themselves,
we get an algorithm which scales nicely in the number of qubits. It is important to realize, however, that the QFT
cannot simply be used like the FFT on classical data. There is a tendency to want to port quantum computers over
to signal processing; currently there are some preliminary ideas about how to do this, but the naive way you might
expect it to work does not work.
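To see that the Hadamard, controlled-R_k, and swap construction really does implement the matrix of Eq. (19) on the amplitudes, here is a small Python/numpy sketch of my own that multiplies the gates together for a few qubits and compares against the N × N matrix with entries ω^{-xy}/√N:

import numpy as np
from functools import reduce

def qft_matrix(n):
    # The N x N matrix of Eq. (19): <y| QFT |x> = omega^{-xy} / sqrt(N), with N = 2^n.
    N = 2 ** n
    x = np.arange(N)
    return np.exp(-2j * np.pi * np.outer(x, x) / N) / np.sqrt(N)

def bit(x, k, n):
    # k-th bit of x, counting qubit 1 as the most significant of n bits.
    return (x >> (n - k)) & 1

def circuit_matrix(n):
    N = 2 ** n
    H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
    I = np.eye(2, dtype=complex)
    U = np.eye(N, dtype=complex)
    for k in range(1, n + 1):
        # Hadamard on qubit k (qubit 1 is the leftmost / most significant qubit).
        gate = reduce(np.kron, [H if q == k else I for q in range(1, n + 1)])
        U = gate @ U
        # Controlled R_{m-k+1} with control qubit m and target qubit k (diagonal, so build it directly).
        for m in range(k + 1, n + 1):
            phase = np.exp(-2j * np.pi / 2 ** (m - k + 1))
            diag = np.array([phase if bit(x, k, n) and bit(x, m, n) else 1.0 for x in range(N)])
            U = np.diag(diag) @ U
    # Finally reverse the order of the qubits (the bit-reversal permutation).
    perm = [int(format(x, '0{}b'.format(n))[::-1], 2) for x in range(N)]
    P = np.zeros((N, N))
    P[perm, np.arange(N)] = 1
    return P @ U

n = 3
assert np.allclose(circuit_matrix(n), qft_matrix(n))

Note that only O(n^2) gates are multiplied together here, even though the matrices themselves are exponentially large; on a quantum computer only the gates are needed.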
There is a very neat application of the QFT which was only recently realized. It is motivated by the Bernstein-
Vazirani problem. Recall that in the nonrecursive Bernstein-Vazirani problem we are given a function f_s(x) = x · s and we
wish to find s. Notice that in some sense, s is the (discrete) gradient of the function (which in this case is linear).
Now we know that central to that algorithm was the Hadamard, which we have seen is the QFT for N = 2. So can we use the
QFT over larger N to learn something about the gradient of a more general function? In 2004, Jordan showed that this was
possible (while he was a graduate student! This should give hope to all graduate students around the world. Indeed,
it should even give hope to postdocs and faculty members as well.)
Suppose that we are given a black box which computes a function f : R^d → R on some pre-specified domain. We
will scale this domain to [0, 1)^d and assume that the function is computed to some finite precision of n bits.
Now suppose that we want to calculate an estimate of the gradient of this function,
\[
\nabla f = \left( \frac{\partial f}{\partial x_1}, \frac{\partial f}{\partial x_2}, \ldots, \frac{\partial f}{\partial x_d} \right),
\]
without loss of generality at the origin. Classically, when we query f we only obtain one value of f. A natural way to
calculate an estimate of the gradient of f is to evaluate f at the origin, then at a small distance l along each of the d
coordinate directions, i.e. at f(0, . . . , 0, l, 0, . . . , 0), and evaluate the differences. For small enough l this approximates
the gradient using d + 1 queries. A more symmetric method is to query the function at +l/2 and −l/2 along each
direction and take the differences. Thus we can easily see that to evaluate an estimate of the gradient we need some
Ω(d) queries. (Really we should be defining the class of functions whose gradients we can estimate, and being careful
about accuracy, etc., but we will skip over these details.)
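For comparison, here is a simple Python sketch of the classical symmetric-difference estimate, which uses 2d queries; the test function and step size are illustrative choices of mine:

import numpy as np

def classical_gradient(f, d, l=1e-4):
    # Estimate grad f at the origin with 2d queries via central differences.
    grad = np.zeros(d)
    for i in range(d):
        e = np.zeros(d)
        e[i] = l / 2
        grad[i] = (f(e) - f(-e)) / l
    return grad

f = lambda x: 3.0 * x[0] - 2.0 * x[1] + 0.5 * x[0] * x[1]   # example function; gradient at 0 is (3, -2)
print(classical_gradient(f, 2))                              # -> approximately [ 3. -2.]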
But can we do better using a quantum algorithm? Suppose that we have access to the function f by the unitary
oracle
\[
U_f = \sum_{x_1, \ldots, x_d, y \in \{0,1\}^n} |x_1, \ldots, x_d\rangle \langle x_1, \ldots, x_d| \otimes \left| y + f\!\left( -\frac{l}{2} + \frac{l x_1}{2^n}, \ldots, -\frac{l}{2} + \frac{l x_d}{2^n} \right) \bmod 2^n \right\rangle \langle y| \tag{34}
\]
where the function’s input has been appropriately scaled to a region of size l. Now we do what we did in the Bernstein-
Vazirani problem. First we perform phase kickback. We do this by feeding into the |y⟩ register of this unitary the
state
\[
|\tilde{1}\rangle = \frac{1}{\sqrt{2^n}} \sum_{x=0}^{2^n - 1} \exp\left( \frac{2\pi i x}{2^n} \right) |x\rangle \tag{35}
\]
along with a uniform superposition over all possible |x_1, . . . , x_d⟩ inputs. The state |1̃⟩ can be created by performing a
QFT on |1⟩. The resulting state will be
\[
\frac{1}{2^{nd/2}} \sum_{x_1, \ldots, x_d \in \{0,1\}^n} \exp\left( \frac{2\pi i f\!\left( -\frac{l}{2} + \frac{l x_1}{2^n}, \ldots, -\frac{l}{2} + \frac{l x_d}{2^n} \right)}{2^n} \right) |x_1, \ldots, x_d\rangle \otimes |\tilde{1}\rangle \tag{36}
\]
If we now perform QFTs individually on each of these d registers, we will obtain (with high probability, under some
reasonable assumptions)
\[
\left| \frac{l}{2^n} \frac{\partial f}{\partial x_1}, \ldots, \frac{l}{2^n} \frac{\partial f}{\partial x_d} \right\rangle \otimes |\tilde{1}\rangle \tag{38}
\]
Thus we see that we have obtained the gradient of the function using a single quantum query! Now of course we
need to be careful about analyzing this algorithm correctly, and if you are interested in seeing the details of such a
calculation you can find it at http://arxiv.org/abs/quant-ph/0405146.
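To see the mechanics, here is a small Python simulation of the gradient algorithm for d = 1, following the conventions of Eqs. (35), (36) and (38); the particular linear function, the box size l, and the register size n are illustrative assumptions of mine, chosen so that the answer is an exact computational basis state:

import numpy as np

n = 6                        # qubits in the register, N = 2^n grid points
N = 2 ** n
l = float(N)                 # box size, chosen so that l * gradient / N is an integer
grad_true = 5.0              # the gradient we hope to read out: l * grad_true / N = 5

def f(x):
    # Black-box function; exactly linear, so the algorithm succeeds with probability 1.
    return grad_true * x

# Phase-kickback state of Eq. (36) for d = 1: amplitudes exp(2 pi i f(xtilde)/N) over the grid.
x = np.arange(N)
xtilde = -l / 2 + l * x / N
psi = np.exp(2j * np.pi * f(xtilde) / N) / np.sqrt(N)

# QFT with the omega^{-xy} convention of Eq. (19).
F = np.exp(-2j * np.pi * np.outer(x, x) / N) / np.sqrt(N)
out = F @ psi

probs = np.abs(out) ** 2
print("most likely outcome:", np.argmax(probs))   # -> 5, i.e. l * grad_true / 2^n, as in Eq. (38)
print("probability:", probs.max())                # -> 1.0 for an exactly linear f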
Acknowledgments
The diagrams in these notes were typeset using Q-circuit, a LaTeX utility written by graduate students extraordinaire
Steve Flammia and Bryan Eastin.