0.1 Phase Estimation Technique
In this lecture we will describe Kitaev's phase estimation algorithm, and use it to obtain an alternate derivation of a quantum factoring algorithm. We will also use this technique to design quantum circuits for computing the Quantum Fourier Transform modulo an arbitrary positive integer.
0.1 Phase Estimation Technique
In this section, we define the phase estimation problem and describe an efficient quantum circuit for it.
Property 0.1 Let U be an $N \times N$ unitary transformation. U has an orthonormal basis of eigenvectors $|\psi_1\rangle, |\psi_2\rangle, \ldots, |\psi_N\rangle$ with eigenvalues $\lambda_1, \lambda_2, \ldots, \lambda_N$, where $\lambda_j = e^{2\pi i \theta_j}$ for some $\theta_j$.
Proof: U, being unitary, maps unit vectors to unit vectors and hence all the eigenvalues have unit magnitude, i.e. they are of the form $e^{2\pi i\theta}$ for some $\theta$. Let $|\psi_j\rangle$ and $|\psi_k\rangle$ be two distinct eigenvectors with distinct eigenvalues $\lambda_j$ and $\lambda_k$. We have that $\bar{\lambda}_j(\psi_j, \psi_k) = (\lambda_j\psi_j, \psi_k) = (U\psi_j, \psi_k) = (\psi_j, U^\dagger\psi_k) = (\psi_j, \bar{\lambda}_k\psi_k) = \bar{\lambda}_k(\psi_j, \psi_k)$. Since $\lambda_j \neq \lambda_k$, the inner product $(\psi_j, \psi_k)$ is 0, i.e. the eigenvectors $|\psi_j\rangle$ and $|\psi_k\rangle$ are orthogonal. $\Box$
Given a unitary transformation U and one of its eigenvectors $|\psi_j\rangle$, we want to figure out the corresponding eigenvalue $\lambda_j$ (or, equivalently, $\theta_j$). This is the phase estimation problem.
Definition 0.2 For any unitary transformation U, let C-U stand for a controlled-U circuit which conditionally transforms $|\psi\rangle$ to $U|\psi\rangle$ as shown in Figure 0.1.

Figure 0.1: Controlled-U circuit: on control bit b, the second register is mapped to $|\psi\rangle$ if b = 0 and to $U|\psi\rangle$ if b = 1.
Assume that we have a circuit which implements the controlled-U transformation (we will see later in the course how to construct a circuit that implements a controlled-U transformation given a circuit that implements U). The phase estimation circuit in Figure 0.2 can be used to estimate the value of $\theta$.
The phase estimation circuit performs the following sequence of transformations:

$|0\rangle|\psi\rangle \xrightarrow{H} \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)|\psi\rangle \xrightarrow{C\text{-}U} \frac{1}{\sqrt{2}}(|0\rangle|\psi\rangle + \lambda|1\rangle|\psi\rangle) = \frac{1}{\sqrt{2}}(|0\rangle + \lambda|1\rangle)|\psi\rangle$
CS 294, Spring 2009, 0-1
Figure 0.2: Phase estimation circuit: the control qubit $|0\rangle$ passes through H, controls U applied to $|\psi\rangle$, passes through H again, and is measured.
Note that after the C-U transformation, the eigenvector remains unchanged while we have been able to put $\lambda$ into the phase of the first qubit. A Hadamard transform on the first qubit will transform this information into the amplitudes, which we will be able to measure:

$\frac{1}{\sqrt{2}}(|0\rangle + \lambda|1\rangle) \xrightarrow{H} \frac{1+\lambda}{2}|0\rangle + \frac{1-\lambda}{2}|1\rangle$
Let P(0) and P(1) be the probabilities of seeing a zero and a one respectively on measuring the first qubit. If we write $\lambda = e^{2\pi i\theta}$, we have:

$P(0) = \left|\frac{1 + \cos 2\pi\theta + i\sin 2\pi\theta}{2}\right|^2 = \frac{1 + \cos 2\pi\theta}{2}$

$P(1) = \left|\frac{1 - \cos 2\pi\theta - i\sin 2\pi\theta}{2}\right|^2 = \frac{1 - \cos 2\pi\theta}{2}$
There is a bias of $\frac{1}{2}\cos 2\pi\theta$ in the probability of seeing a 0 or 1 upon measurement. Hence, we can hope to estimate $\theta$ by performing the measurement several times. However, to estimate $\cos 2\pi\theta$ within m bits of accuracy, we need to perform $\Omega(2^m)$ measurements. This follows from the fact that estimating the bias of a coin to within $\varepsilon$ with probability at least $1 - \delta$ requires $\Omega\left(\frac{\log(1/\delta)}{\varepsilon^2}\right)$ samples.
We will now see how to estimate $\theta$ efficiently. Suppose we can implement the $C^m$-U transformation as defined below.
Figure 0.3: m-controlled U circuit: $|k\rangle|\psi\rangle \mapsto |k\rangle U^k|\psi\rangle$, where k is an m-bit integer, $k \in \{0, 1, \ldots, 2^m - 1\}$.
Definition 0.3 For any unitary transformation U, let $C^m$-U stand for an m-controlled U circuit which implements the transformation $|k\rangle|\psi\rangle \mapsto |k\rangle U^k|\psi\rangle$ as shown in Figure 0.3.
Estimating $\theta$ within m bits of accuracy is equivalent to estimating the integer j, where $\frac{j}{2^m}$ is the closest approximation to $\theta$. Let $M = 2^m$ and $\omega_M = e^{2\pi i/M}$.
The circuit in Figure 0.4 performs the following sequence of transformations:

$|0^m\rangle|\psi\rangle \xrightarrow{H^{\otimes m}} \frac{1}{\sqrt{M}}\sum_{k=0}^{M-1}|k\rangle|\psi\rangle \xrightarrow{C^m\text{-}U} \frac{1}{\sqrt{M}}\sum_{k=0}^{M-1}\lambda^k|k\rangle|\psi\rangle = \left(\frac{1}{\sqrt{M}}\sum_{k=0}^{M-1}\omega_M^{jk}|k\rangle\right)|\psi\rangle$

where the last equality holds when $\theta = \frac{j}{M}$, so that $\lambda^k = e^{2\pi i jk/M} = \omega_M^{jk}$.
Note that the first register now contains the Fourier Transform mod M of j, and if we apply the reverse of the Fourier Transform mod M (note that quantum circuits are reversible), we will get back j.
Figure 0.4: Efficient phase estimation circuit: $H^{\otimes m}$ is applied to $|0^m\rangle$, the result controls U applied to $|\psi\rangle$ via $C^m$-U, and $QFT_M^{-1}$ followed by measurement yields $|j\rangle$.
If $\theta = \frac{j}{2^m}$, then clearly the circuit outputs j. If $\theta \approx \frac{j}{2^m}$, then the circuit outputs j with high probability (exercise!).
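This behavior is easy to check numerically. The following Python sketch (illustrative code, not part of the original notes; the function name is ours) builds the control-register state after the $C^m$-U step for an eigenvalue $e^{2\pi i\theta}$, applies the inverse QFT as an explicit matrix, and returns the outcome distribution:

```python
import numpy as np

def phase_estimation_distribution(theta, m):
    """Probability of measuring each j in {0, ..., 2^m - 1} when the
    eigenvalue is e^{2 pi i theta}: apply the inverse QFT to the
    phase-kicked uniform superposition."""
    M = 2 ** m
    k = np.arange(M)
    # control register after the C^m-U step: (1/sqrt(M)) sum_k lambda^k |k>
    state = np.exp(2j * np.pi * theta * k) / np.sqrt(M)
    # inverse QFT matrix: entries (1/sqrt(M)) * w_M^{-jk}
    j = k.reshape(-1, 1)
    iqft = np.exp(-2j * np.pi * j * k / M) / np.sqrt(M)
    return np.abs(iqft @ state) ** 2

probs = phase_estimation_distribution(theta=3 / 8, m=3)
print(np.argmax(probs))  # 3, since theta = 3/2^3 exactly
```

When $\theta$ is not an exact multiple of $1/2^m$, the same function shows the probability mass concentrating on the nearest j, which is the "high probability" claim of the exercise.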
0.2 Kitaev's Factoring Algorithm
In this section, we will see how to use the phase estimation circuit to factor a number.
Recall that the problem of factoring reduces to the problem of order finding. To factor N, it is sufficient to pick a random number a and compute the minimum positive r such that $a^r \equiv 1 \pmod{N}$. With reasonable probability, r is even and $a^{r/2} \not\equiv -1 \pmod{N}$, and hence $N \mid a^r - 1$, i.e. $N \mid (a^{r/2} + 1)(a^{r/2} - 1)$. Since N does not divide $a^{r/2} - 1$, it must be the case that a part of it divides $a^{r/2} + 1$, and hence $\gcd(N, a^{r/2} + 1)$ is a non-trivial factor of N.
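The classical part of this reduction can be sketched in Python (a toy illustration, not part of the original notes, with a brute-force order finder standing in for the quantum step; function names are ours):

```python
from math import gcd

def order(a, N):
    """Smallest positive r with a^r = 1 (mod N); brute force stands in
    for the quantum order-finding subroutine."""
    r, x = 1, a % N
    while x != 1:
        x = (x * a) % N
        r += 1
    return r

def try_factor(N, a):
    """Attempt to split N using a base a coprime to N."""
    r = order(a, N)
    if r % 2 == 1 or pow(a, r // 2, N) == N - 1:
        return None  # unlucky choice of a; retry with another
    return gcd(N, pow(a, r // 2, N) + 1)

print(try_factor(15, 7))  # order of 7 mod 15 is 4, and gcd(15, 7^2 + 1) = 5
```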
We now reduce the problem of order finding to the phase estimation problem. Consider the unitary transformation $M_a : |x\rangle \mapsto |xa \bmod N\rangle$. Its eigenvectors are

$|\psi_k\rangle = \frac{1}{\sqrt{r}}\left(|1\rangle + \omega^{-k}|a\rangle + \cdots + \omega^{-k(r-1)}|a^{r-1}\rangle\right),$

where $\omega = e^{2\pi i/r}$:

$M_a|\psi_k\rangle = \frac{1}{\sqrt{r}}\left(|a\rangle + \omega^{-k}|a^2\rangle + \cdots + \omega^{-k(r-1)}|a^r\rangle\right) = \omega^k \cdot \frac{1}{\sqrt{r}}\left(|1\rangle + \omega^{-k}|a\rangle + \cdots + \omega^{-k(r-1)}|a^{r-1}\rangle\right) = \omega^k|\psi_k\rangle$
It follows that $|\psi_k\rangle$ is an eigenvector of $M_a$ with eigenvalue $\omega^k$. Hence, if we can implement the $C^m$-$M_a$ transformation and construct the eigenvector $|\psi_k\rangle$ for some suitable k, we can use the phase estimation circuit to obtain an approximation to the eigenvalue $\omega^k$ and therefore reconstruct r as follows: $\omega^k = e^{2\pi i\theta}$ for $\theta = k/r$. Recall that phase estimation reconstructs $\frac{j}{2^m} \approx \theta$, where j is the output of the phase estimation procedure carried out to m bits of precision. Thus with high probability $\frac{j}{2^m}$ is a very close approximation to $\frac{k}{r}$. Assuming that k is relatively prime to r (which we will ensure with high probability), we can estimate r using the method of continued fractions if we choose $M \geq N^2$.
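The continued-fraction step can be sketched with Python's fractions module, whose limit_denominator method finds the best rational approximation with bounded denominator (the specific numbers below are hypothetical, chosen so that $M \geq N^2$; this example is ours, not from the notes):

```python
from fractions import Fraction

# Hypothetical numbers: N = 21 and a = 2, which has order r = 6 mod 21.
# Suppose the random eigenvector index was k = 5, and we ran phase
# estimation with m = 9 bits, so M = 512 >= N^2 = 441.
N, r, k, m = 21, 6, 5, 9
j = round(2 ** m * k / r)  # the most likely measurement outcome
approx = Fraction(j, 2 ** m).limit_denominator(N - 1)
print(approx)  # 5/6, so the recovered order is approx.denominator = 6
```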
Let's look carefully at the $C^m$-$M_a$ transformation. It transforms $|k\rangle|x\rangle \mapsto |k\rangle|xa^k \bmod N\rangle$. But this is precisely the transformation that does modular exponentiation. There exists a classical circuit that performs this transformation in $O(|x|^2|k|)$ time, and thus we can construct a quantum circuit that implements the $C^m$-$M_a$ transformation.
It is not obvious how to obtain an eigenvector $|\psi_k\rangle$ for some k, but it is easy to obtain the uniform superposition of the eigenvectors $|\psi_0\rangle, |\psi_1\rangle, \ldots, |\psi_{r-1}\rangle$. Note that $\frac{1}{\sqrt{r}}\sum_{k=0}^{r-1}|\psi_k\rangle = |1\rangle$. Hence, if we use $|1\rangle$ as the second input to the phase estimation circuit, then we will be able to measure a random eigenvalue $\omega^k$, where k is chosen u.a.r. from the set $\{0, \ldots, r-1\}$. Note that k = 0 is completely useless for our purposes, but k will be relatively prime to r with reasonable probability.
Figure 0.5: Order finding circuit (Kitaev's): the phase estimation circuit with $M_a$ as the unitary, $|1\rangle$ as the second register and $|0^m\rangle$ as the first; applying $QFT_M^{-1}$ and measuring yields $|j\rangle$.
With these observations, it is easy to see that the circuit in Figure 0.5 outputs $|j\rangle$ with high probability, where $\frac{j}{2^m}$ is the closest approximation to $\frac{k}{r}$ for some random k. Note that with reasonable probability, k is relatively prime to r, and if that is the case, then we can estimate r using the method of continued fractions if we choose $M \geq N^2$. Note also that $QFT_M$ and $H^{\otimes m}$ act in an identical manner on $|0^m\rangle$, so $H^{\otimes m}$ could be used in the above circuit in place of the $QFT_M$ transformation.
Though the thinking and the analysis behind Kitaev's and Shor's order-finding algorithms are different, it is interesting to note that the two circuits are almost identical. Figure 0.6 describes Shor's circuit. The quantities q, Q and x in Shor's algorithm correspond to m, M and a in Kitaev's algorithm. Also, note that raising a to some power is the same as performing controlled multiplication.
Figure 0.6: Order finding circuit (Shor's): $QFT_Q$ is applied to $|0^q\rangle$, the register then controls the computation $|x\rangle \mapsto |x^k \bmod N\rangle$, and a second $QFT_Q$ is applied before measurement.
0.3 QFT mod Q
In this section, we will present Kitaev's quantum circuit for computing the Fourier Transform over an arbitrary positive integer Q, not necessarily a power of 2. Let m be such that $2^{m-1} < Q \leq 2^m$ and let $M = 2^m$.
Recall that the Fourier Transform mod Q sends

$|a \bmod Q\rangle \mapsto \frac{1}{\sqrt{Q}}\sum_{b=0}^{Q-1}\omega^{ab}|b\rangle =: |\chi_a\rangle,$

where $\omega = e^{2\pi i/Q}$. Note that $\{|\chi_a\rangle : a = 0, 1, \ldots, Q-1\}$ forms an orthonormal basis, so we may regard the Fourier Transform as a change of basis.
Consider the following sequence of transformations, which computes something close to the Fourier Transform mod Q:

$|a\rangle|0\rangle \mapsto |a\rangle\frac{1}{\sqrt{Q}}\sum_{b=0}^{Q-1}|b\rangle \mapsto |a\rangle\frac{1}{\sqrt{Q}}\sum_{b=0}^{Q-1}\omega^{ab}|b\rangle = |a\rangle|\chi_a\rangle$
We can implement the circuit that sends $|0\rangle \mapsto \frac{1}{\sqrt{Q}}\sum_{b=0}^{Q-1}|b\rangle$ efficiently in the following two ways:
1. Perform the following sequence of transformations:

$|0^m\rangle|0\rangle \xrightarrow{H^{\otimes m}} \frac{1}{\sqrt{M}}\sum_{x=0}^{2^m-1}|x\rangle|0\rangle \mapsto \frac{1}{\sqrt{M}}\sum_{x=0}^{2^m-1}|x\rangle|x \geq Q\rangle$

Note that since we can efficiently decide whether or not $x \geq Q$ classically, we can also do so quantum mechanically. Now measure the second register. If the result is 0, the first register contains a uniform superposition over $|0\rangle, \ldots, |Q-1\rangle$. If not, we repeat the experiment. At each trial, we succeed with probability $Q/M > 2^{m-1}/2^m = 1/2$.
2. If we pick a number u.a.r. in the range 0 to Q-1, the most significant bit of the number is 0 with probability $2^{m-1}/Q$. We can therefore set the first bit of our output to be the superposition

$\sqrt{\frac{2^{m-1}}{Q}}|0\rangle + \sqrt{1 - \frac{2^{m-1}}{Q}}|1\rangle$

If the first bit is 0, then the remaining m-1 bits may be chosen randomly and independently, which corresponds to the output of $H^{\otimes(m-1)}$ on $|0^{m-1}\rangle$. If the first bit is 1, we need to pick the remaining m-1 bits to correspond to a uniformly chosen random number between 0 and $Q - 2^{m-1}$, which we can do recursively.
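The recursion in the second method mirrors the following classical bit-by-bit sampler of a uniform number in $\{0, \ldots, Q-1\}$ (an illustrative Python sketch, not part of the notes; the quantum circuit replaces the biased coin flip with a single-qubit rotation):

```python
import random

random.seed(0)  # deterministic for reproducibility

def uniform_below(Q, m):
    """Sample uniformly from {0, ..., Q-1} using m >= ceil(log2 Q) bits,
    choosing the most significant bit first, as in the recursive
    state-preparation method."""
    if m == 0:
        return 0
    half = 2 ** (m - 1)
    if Q > half and random.random() < half / Q:
        # MSB = 0: the remaining bits are uniform over {0, ..., half-1}
        return random.randrange(half)
    elif Q > half:
        # MSB = 1: recurse on the remaining range {0, ..., Q-half-1}
        return half + uniform_below(Q - half, m - 1)
    else:
        return uniform_below(Q, m - 1)

counts = [0] * 5
for _ in range(10000):
    counts[uniform_below(5, 3)] += 1
print(counts)  # roughly equal counts for 0..4
```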
The second transformation, $|a\rangle|b\rangle \mapsto \omega^{ab}|a\rangle|b\rangle$, can be implemented using the controlled phase shift circuit.
This gives us an efficient quantum circuit for $|a\rangle|0\rangle \mapsto |a\rangle|\chi_a\rangle$, but what we really want is a circuit for $|a\rangle \mapsto |\chi_a\rangle$. In particular, for application to factoring, we need a circuit that forgets the input a in order to have interference in the superposition over $|\chi_a\rangle$.
What we would like is a quantum circuit that transforms $|a\rangle|\chi_a\rangle \mapsto |0\rangle|\chi_a\rangle$. If we could find a unitary transformation U with eigenvector $|\chi_a\rangle$ and eigenvalue $e^{2\pi ia/Q}$, then we could use phase estimation to implement the transformation $|0\rangle|\chi_a\rangle \mapsto |a\rangle|\chi_a\rangle$. By reversing the quantum circuit for phase estimation (which we can do since quantum circuits are reversible), we have an efficient quantum circuit for

$|a\rangle|0\rangle \mapsto |a\rangle|\chi_a\rangle \mapsto |0\rangle|\chi_a\rangle,$

which is what we need. Note that the phase estimation circuit with m bits of precision outputs j such that $\frac{j}{2^m} \approx \frac{a}{Q}$. So if we take $2^m \gg Q^2$, we can use continued fractions to reconstruct a as required above.
To see that the required U exists, consider $U : |x\rangle \mapsto |x - 1 \bmod Q\rangle$. Then

$U|\chi_a\rangle = U\left(\frac{1}{\sqrt{Q}}\sum_{b=0}^{Q-1}\omega^{ab}|b\rangle\right) = \frac{1}{\sqrt{Q}}\sum_{b=0}^{Q-1}\omega^{ab}|b-1\rangle = \omega^a\frac{1}{\sqrt{Q}}\sum_{b=1}^{Q}\omega^{a(b-1)}|b-1\rangle = \omega^a|\chi_a\rangle.$
In addition, note that $U^k$ can be efficiently computed with a classical circuit, and can therefore be both efficiently and reversibly computed with a quantum circuit. The overall circuit to compute QFT mod Q is shown in Figure 0.7 (the circuit should be read from right to left).
Figure 0.7: Using the reverse phase estimation circuit to do QFT mod Q for arbitrary Q. The circuit combines $H^{\otimes m}$, the controlled phase shift, U and $QFT_M^{-1}$, applied to $|a\rangle|0^m\rangle$, to produce $|\chi_a\rangle$.
0.4 Mixed Quantum State
Next, we will outline a very interesting application of phase estimation. Before we can do this, we must introduce some concepts in quantum information theory.
So far we have dealt with pure quantum states

$|\psi\rangle = \sum_x \alpha_x|x\rangle.$
This is not the most general state we can think of. We can consider a probability distribution over pure states, such as $|0\rangle$ with probability 1/2 and $|1\rangle$ with probability 1/2. Another possibility is the state

$|+\rangle = \frac{1}{\sqrt{2}}(|0\rangle + |1\rangle)$ with probability 1/2, $\qquad |-\rangle = \frac{1}{\sqrt{2}}(|0\rangle - |1\rangle)$ with probability 1/2.

In fact, no measurement can distinguish the first case ($|0\rangle$ or $|1\rangle$) from this case. This will be seen below.
In general, we can think of a mixed state $\{p_i, |\psi_i\rangle\}$ as a collection of pure states $|\psi_i\rangle$, each with associated probability $p_i$, with the conditions $0 \leq p_i \leq 1$ and $\sum_i p_i = 1$. One context in which mixed states arise naturally is in quantum protocols, where two players share an entangled (pure) quantum state. Each player's view of their quantum register is then a probability distribution over pure states (achieved when the other player measures their register). Another reason we consider such mixed states is that quantum states are hard to isolate, and hence are often entangled with the environment.
0.5 Density Matrix
Now we consider the result of measuring a mixed quantum state. Suppose we have a mixture of quantum states $|\psi_i\rangle$ with probability $p_i$. Each $|\psi_i\rangle$ can be represented by a vector in $\mathbb{C}^{2^n}$, and thus we can associate with it the outer product $|\psi_i\rangle\langle\psi_i|$, which is a $2^n \times 2^n$ matrix:

$\begin{pmatrix} a_1 \\ a_2 \\ \vdots \\ a_N \end{pmatrix}\begin{pmatrix} \bar{a}_1 & \bar{a}_2 & \cdots & \bar{a}_N \end{pmatrix} = \begin{pmatrix} a_1\bar{a}_1 & a_1\bar{a}_2 & \cdots & a_1\bar{a}_N \\ a_2\bar{a}_1 & a_2\bar{a}_2 & \cdots & a_2\bar{a}_N \\ \vdots & & \ddots & \vdots \\ a_N\bar{a}_1 & a_N\bar{a}_2 & \cdots & a_N\bar{a}_N \end{pmatrix}.$
We can now take the average of these matrices, and obtain the density matrix of the mixture $\{p_i, |\psi_i\rangle\}$:

$\rho = \sum_i p_i|\psi_i\rangle\langle\psi_i|.$
We give some examples. Consider the mixed state $|0\rangle$ with probability 1/2 and $|1\rangle$ with probability 1/2. Then

$|0\rangle\langle 0| = \begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}1&0\end{pmatrix} = \begin{pmatrix}1&0\\0&0\end{pmatrix},$

and

$|1\rangle\langle 1| = \begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}0&1\end{pmatrix} = \begin{pmatrix}0&0\\0&1\end{pmatrix}.$

Thus in this case

$\rho = \frac{1}{2}|0\rangle\langle 0| + \frac{1}{2}|1\rangle\langle 1| = \begin{pmatrix}1/2&0\\0&1/2\end{pmatrix}.$
Now consider another mixed state, this time consisting of $|+\rangle$ with probability 1/2 and $|-\rangle$ with probability 1/2. This time we have

$|+\rangle\langle +| = \frac{1}{2}\begin{pmatrix}1\\1\end{pmatrix}\begin{pmatrix}1&1\end{pmatrix} = \frac{1}{2}\begin{pmatrix}1&1\\1&1\end{pmatrix},$

and

$|-\rangle\langle -| = \frac{1}{2}\begin{pmatrix}1\\-1\end{pmatrix}\begin{pmatrix}1&-1\end{pmatrix} = \frac{1}{2}\begin{pmatrix}1&-1\\-1&1\end{pmatrix}.$
Thus in this case the off-diagonals cancel, and we get

$\rho = \frac{1}{2}|+\rangle\langle +| + \frac{1}{2}|-\rangle\langle -| = \begin{pmatrix}1/2&0\\0&1/2\end{pmatrix}.$
Note that the two density matrices we computed are identical, even though the mixed states we started out with were different. Hence we see that it is possible for two different mixed states to have the same density matrix.
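This can be verified directly in a few lines of numpy (an illustrative check, not part of the original notes):

```python
import numpy as np

zero, one = np.array([1.0, 0.0]), np.array([0.0, 1.0])
plus = (zero + one) / np.sqrt(2)
minus = (zero - one) / np.sqrt(2)

# density matrices of the two mixtures from the text
rho1 = 0.5 * np.outer(zero, zero) + 0.5 * np.outer(one, one)
rho2 = 0.5 * np.outer(plus, plus) + 0.5 * np.outer(minus, minus)
print(np.allclose(rho1, rho2))  # True: both equal I/2
```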
Nonetheless, the density matrix of a mixture completely determines the effects of making a measurement on the system:
Theorem 0.1: Suppose we measure a mixed state $\{p_j, |\psi_j\rangle\}$ in an orthonormal basis $\{|\phi_k\rangle\}$. Then the outcome is $|\phi_k\rangle$ with probability $\langle\phi_k|\rho|\phi_k\rangle$.
Proof: We denote the probability of measuring $|\phi_k\rangle$ by $\Pr[k]$. Then

$\Pr[k] = \sum_j p_j|\langle\psi_j|\phi_k\rangle|^2 = \sum_j p_j\langle\phi_k|\psi_j\rangle\langle\psi_j|\phi_k\rangle = \langle\phi_k|\left(\sum_j p_j|\psi_j\rangle\langle\psi_j|\right)|\phi_k\rangle = \langle\phi_k|\rho|\phi_k\rangle. \quad\Box$
Thus mixtures with the same density matrix are indistinguishable by measurement. It will be shown in the next section
that, in fact, two mixtures are distinguishable by measurement if and only if they have different density matrices.
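The theorem can likewise be checked numerically for a random mixture and a random orthonormal measurement basis (an illustrative numpy sketch, not part of the notes; the specific dimensions and probabilities are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

# a random 2-state mixture of vectors in C^3
psis = [v / np.linalg.norm(v)
        for v in rng.normal(size=(2, 3)) + 1j * rng.normal(size=(2, 3))]
p = np.array([0.3, 0.7])
# a random orthonormal basis: columns of a unitary from a QR decomposition
basis, _ = np.linalg.qr(rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3)))

rho = sum(pi * np.outer(v, v.conj()) for pi, v in zip(p, psis))
# Pr[k] computed directly from the mixture ...
direct = [sum(pi * abs(np.vdot(v, basis[:, k])) ** 2 for pi, v in zip(p, psis))
          for k in range(3)]
# ... and via the density matrix, as in Theorem 0.1
via_rho = [(basis[:, k].conj() @ rho @ basis[:, k]).real for k in range(3)]
print(np.allclose(direct, via_rho))  # True
```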
We list several properties of the density matrix:
1. $\rho$ is Hermitian, so the eigenvalues are real and the eigenvectors orthogonal.
2. If we measure in the standard basis, the probability of measuring i is $\Pr[i] = \rho_{i,i}$. Also, the eigenvalues of $\rho$ are non-negative. Suppose that $\lambda$ and $|e\rangle$ are a corresponding eigenvalue and eigenvector. Then if we measure in the eigenbasis, we have

$\Pr[e] = \langle e|\rho|e\rangle = \lambda\langle e|e\rangle = \lambda \geq 0.$

3. $\mathrm{tr}\,\rho = 1$. This is because if we measure in the standard basis, $\rho_{i,i} = \Pr[i]$, but also $\sum_i\Pr[i] = 1$, so that $\sum_i\rho_{i,i} = \sum_i\Pr[i] = 1$.
Consider the following two mixtures and their density matrices (writing $c = \cos\theta$, $s = \sin\theta$):

$\cos\theta|0\rangle + \sin\theta|1\rangle$ w.p. 1/2: $\quad\frac{1}{2}\begin{pmatrix}c\\s\end{pmatrix}\begin{pmatrix}c&s\end{pmatrix} = \frac{1}{2}\begin{pmatrix}c^2&cs\\cs&s^2\end{pmatrix}$

$\cos\theta|0\rangle - \sin\theta|1\rangle$ w.p. 1/2: $\quad\frac{1}{2}\begin{pmatrix}c\\-s\end{pmatrix}\begin{pmatrix}c&-s\end{pmatrix} = \frac{1}{2}\begin{pmatrix}c^2&-cs\\-cs&s^2\end{pmatrix}$

$\Rightarrow \rho = \begin{pmatrix}\cos^2\theta&0\\0&\sin^2\theta\end{pmatrix}$

$|0\rangle$ w.p. $\cos^2\theta$: $\quad\cos^2\theta\begin{pmatrix}1\\0\end{pmatrix}\begin{pmatrix}1&0\end{pmatrix} = \cos^2\theta\begin{pmatrix}1&0\\0&0\end{pmatrix}$

$|1\rangle$ w.p. $\sin^2\theta$: $\quad\sin^2\theta\begin{pmatrix}0\\1\end{pmatrix}\begin{pmatrix}0&1\end{pmatrix} = \sin^2\theta\begin{pmatrix}0&0\\0&1\end{pmatrix}$

$\Rightarrow \rho = \begin{pmatrix}\cos^2\theta&0\\0&\sin^2\theta\end{pmatrix}$

Thus, since the mixtures have identical density matrices, they are indistinguishable.
0.6 Von Neumann Entropy
We will now show that if two mixed states are represented by different density matrices, then there is a measurement that distinguishes them. Suppose we have two mixed states, with density matrices A and B such that $A \neq B$. We can ask: what is a good measurement to distinguish the two states? We can diagonalize the difference $A - B$ to get

$A - B = E\Lambda E^\dagger,$

where E is the eigenbasis and $\Lambda$ is the diagonal matrix of eigenvalues $\lambda_i$. If we measure in the eigenbasis E, then $\lambda_i$ is the difference in the probability of measuring $e_i$:

$\Pr_A[i] - \Pr_B[i] = \lambda_i.$
We can define the distance between two probability distributions (with respect to a basis E) as

$|D_A - D_B|_E = \sum_i|\Pr_A[i] - \Pr_B[i]|.$

If E is the eigenbasis, then

$|D_A - D_B|_E = \sum_i|\lambda_i| = \mathrm{tr}|A - B| = \|A - B\|_{tr},$

which is called the trace distance between A and B.
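For Hermitian A and B the trace distance is just the sum of the absolute eigenvalues of A - B, which is easy to compute (an illustrative numpy sketch, not part of the notes):

```python
import numpy as np

def trace_distance(A, B):
    """||A - B||_tr for Hermitian A, B: sum of |eigenvalues| of A - B."""
    return np.abs(np.linalg.eigvalsh(A - B)).sum()

A = np.array([[0.5, 0.0], [0.0, 0.5]])  # the maximally mixed state I/2
B = np.array([[1.0, 0.0], [0.0, 0.0]])  # the pure state |0><0|
print(trace_distance(A, B))  # 1.0, since the eigenvalues of A - B are +-1/2
```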
Claim: Measuring with respect to the eigenbasis E (of the matrix $A - B$) is optimal in the sense that it maximizes the distance $|D_A - D_B|_E$ between the two probability distributions.
Before we prove this claim, we introduce the following definition and lemma without proof.
Definition: Let $\{a_i\}_{i=1}^N$ and $\{b_i\}_{i=1}^N$ be two non-increasing sequences such that $\sum_i a_i = \sum_i b_i$. Then the sequence $\{a_i\}$ is said to majorize $\{b_i\}$ if for all k,

$\sum_{i=1}^k a_i \geq \sum_{i=1}^k b_i.$

Lemma [Schur]: The eigenvalues of any Hermitian matrix majorize the diagonal entries (if both are sorted in non-increasing order).
Now we can prove our claim.
Proof: Since we can reorder the eigenvectors, we can assume $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$. Note that $\mathrm{tr}(A - B) = 0$, so we must have $\sum_i\lambda_i = 0$. Splitting the $\lambda_i$'s into the positive ones and the negative ones, we must have

$\sum_{\lambda_i > 0}\lambda_i = \frac{1}{2}\|A - B\|_{tr}, \qquad \sum_{\lambda_i < 0}\lambda_i = -\frac{1}{2}\|A - B\|_{tr}.$

Thus

$\max_k\sum_{i=1}^k\lambda_i = \frac{1}{2}\|A - B\|_{tr}.$
Now consider measuring in another basis F. Then the matrix $A - B$ is represented as $H = F(A - B)F^\dagger$; let $\mu_1 \geq \mu_2 \geq \cdots \geq \mu_n$ be the diagonal entries of H. A similar argument shows that

$\max_k\sum_{i=1}^k\mu_i = \frac{1}{2}\sum_{i=1}^n|\mu_i| = \frac{|D_A - D_B|_F}{2}.$
But by Schur's lemma the $\lambda_i$'s majorize the $\mu_i$'s, so we must have

$|D_A - D_B|_F \leq |D_A - D_B|_E = \|A - B\|_{tr}. \quad\Box$
Let H(X) be the Shannon entropy of a random variable X which takes on value i with probability $p_i$, for $i = 1, \ldots, n$:

$H(p_1, \ldots, p_n) = \sum_i p_i\log\frac{1}{p_i}$
In the quantum world, we define an analogous quantity $S(\rho)$, the von Neumann entropy of a quantum ensemble with density matrix $\rho$ with eigenvalues $\lambda_1, \ldots, \lambda_n$:

$S(\rho) = H(\lambda_1, \ldots, \lambda_n) = \sum_i\lambda_i\log\frac{1}{\lambda_i}$
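A small numpy sketch of this definition (illustrative, not part of the notes; we use log base 2, a choice we make here since the notes leave the base unspecified):

```python
import numpy as np

def von_neumann_entropy(rho):
    """S(rho) = sum_i lambda_i * log2(1/lambda_i) over nonzero eigenvalues."""
    lam = np.linalg.eigvalsh(rho)
    lam = lam[lam > 1e-12]  # 0 * log(1/0) is taken to be 0
    return float(-(lam * np.log2(lam)).sum())

print(von_neumann_entropy(np.eye(2) / 2))                    # 1.0 (maximally mixed)
print(von_neumann_entropy(np.array([[1.0, 0.0], [0.0, 0.0]])))  # 0.0 (pure state)
```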
0.7 Phase Estimation and Mixed State Computation
Liquid NMR (Nuclear Magnetic Resonance) quantum computers have successfully implemented 7 qubits and performed a stripped-down version of quantum factoring on the number 15. In liquid NMR, the quantum register is composed of the nuclear spins in a suitably chosen molecule; the number of qubits is equal to the number of atoms in the molecule. We can think of the computer as consisting of about $10^{16}$ such molecules (a macroscopic amount of liquid), each controlled by the same operations simultaneously. Thus we will have $10^{16}$ copies of our state, each consisting of say 7 qubits. We assume that we can address the qubits individually, so that, for example, we could perform an operation such as CNOT on the 2nd and 4th qubits (simultaneously on each copy).
The catch in liquid NMR quantum computing is that initializing the register is hard. Each qubit starts out in state $|0\rangle$ with probability $1/2 + \varepsilon$ and in state $|1\rangle$ with probability $1/2 - \varepsilon$. Here $\varepsilon$ depends upon the strength of the magnetic field that the liquid sample is placed in. Even using very strong magnets in the NMR apparatus, the polarization $\varepsilon$ is still only about $10^{-5}$.
If $\varepsilon = 0$ then the density matrix describing the quantum state of the register is $\rho = \frac{1}{2^n}I$. This means that if we apply a unitary transformation U, the density matrix of the resulting state is $U\rho U^\dagger = \frac{1}{2^n}UIU^\dagger = \frac{1}{2^n}I$, so the state is unchanged and no computation takes place. With $\varepsilon > 0$ there is a small signal, which gets amplified by the $10^{16}$ copies of the computation being carried out simultaneously. The problem is that this signal is exponentially small in n, the number of qubits. Therefore liquid NMR quantum computation cannot scale beyond 10-20 qubits.
What if we tried one qubit with $\varepsilon$-bias and n-1 qubits maximally mixed?
Question: Say we have a single clean qubit (and n maximally mixed qubits); what can we do with this?
We can do at least one quantum computation: phase estimation to approximate the trace of a unitary matrix. Use the single clean qubit as the control bit and apply the (controlled) unitary to the n maximally mixed qubits. We can think of the n qubits as being a uniform mixture over the eigenvectors of the unitary, and upon measuring, we get a random eigenvalue estimate, from which we can estimate trace(U). See Figure 0.8.
Figure 0.8: Phase estimation circuit for trace(U): the clean qubit $|0\rangle$ passes through H, controls U applied to the maximally mixed register I, passes through H again, and both registers are measured.
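Working through this circuit, the bias of the clean qubit comes out to $P(0) = \frac{1}{2}\left(1 + \mathrm{Re}\,\mathrm{tr}(\rho U)\right) = \frac{1}{2} + \frac{\mathrm{Re}\,\mathrm{tr}(U)}{2^{n+1}}$ for $\rho = I/2^n$. This formula and the example below are our own sketch derived from the circuit, not stated in the notes; the example uses a hypothetical diagonal unitary so that the trace is easy to check:

```python
import numpy as np

def p_zero(U):
    """Probability of measuring 0 on the clean control qubit after
    H, controlled-U on the maximally mixed register, then H."""
    dim = U.shape[0]
    rho = np.eye(dim) / dim  # maximally mixed register, I/2^n
    return 0.5 * (1 + np.trace(rho @ U).real)

# hypothetical example: a random diagonal phase matrix (hence unitary)
rng = np.random.default_rng(1)
U = np.diag(np.exp(2j * np.pi * rng.random(8)))
est_trace_re = 8 * (2 * p_zero(U) - 1)  # invert the bias to recover Re tr(U)
print(np.isclose(est_trace_re, np.trace(U).real))  # True
```

In practice the bias would be estimated from repeated measurements, so the trace is only obtained to additive precision, which is exactly the trade-off discussed above.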
Is there anything else that we can do with just one qubit? Can you prove limits on what can be done with one clean
qubit? And where is the entanglement in the computation?