Lectures on quantum circuits l01
This is a course on quantum algorithms. It is intended for graduate students who have already
taken an introductory course on quantum information. Such an introductory course typically covers
only the early breakthroughs in quantum algorithms, namely Shor’s factoring algorithm (1994) and
Grover’s searching algorithm (1996). The purpose of this course is to show that there is more to
quantum computing than Shor and Grover by exploring some of the many quantum algorithms
that have been developed since then.
The course will consist of three main parts.
• In the first part, we will discuss algorithms that generalize the main idea of Shor’s algorithm.
These algorithms make use of the quantum Fourier transform, and typically achieve an
exponential (or at least superpolynomial) speedup over classical computers. In particular, we
will explore a group-theoretic problem called the hidden subgroup problem. We will see how
a solution of this problem for abelian groups leads to several applications, and we will also
discuss what is known about the nonabelian case.
• In the second part, we will explore the concept of quantum walk, a quantum generalization
of random walk. This concept leads to a powerful framework for solving search problems,
generalizing Grover’s search algorithm. It can also be used to solve other problems in query
complexity, such as evaluating Boolean formulas. Most of these applications involve a
polynomial speedup over classical computers (although there are also some indications that quantum
walk may be useful for more dramatic speedups).
• In the third (short) part, we will discuss lower bounds on quantum query complexity,
demonstrating limitations on the power of quantum algorithms. We will cover the two main quantum
lower bound techniques, the adversary method and the polynomial method.
We will also discuss a few other topics along the way, including the simulation of quantum
dynamics and (time permitting) approximate computation of the Jones polynomial.
In this lecture, we will briefly review some background material on quantum computation. If
you plan to take this course, most of this material should be familiar to you (except perhaps the
details of the Solovay-Kitaev theorem).
Quantum data
g ∈ G, and
\[ |\phi\rangle = \sum_{g \in G} b_g |g\rangle \tag{2} \]
for an arbitrary superposition over the group. We assume that there is some canonical way of
efficiently representing group elements using bit strings; it is usually unnecessary to make this
representation explicit.
If a quantum computer stores the state $|\psi\rangle$ and the state $|\phi\rangle$, its overall state is given by the
tensor product of those two states. This may be denoted $|\psi\rangle \otimes |\phi\rangle = |\psi\rangle|\phi\rangle = |\psi, \phi\rangle$.
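As a small numerical illustration of the tensor product, the amplitudes of a joint state can be obtained as the Kronecker product of the two amplitude vectors. This is only a sketch: the helper name `kron_vec` and the two example states are assumptions chosen for illustration, not part of the lecture.

```python
import math

def kron_vec(psi, phi):
    """Kronecker product of two state vectors given as lists of amplitudes."""
    return [a * b for a in psi for b in phi]

# Hypothetical example states: |psi> = (|0> + |1>)/sqrt(2) and |phi> = |1>.
psi = [1 / math.sqrt(2), 1 / math.sqrt(2)]
phi = [0.0, 1.0]

# Amplitudes of |psi>|phi> in the order |00>, |01>, |10>, |11>.
joint = kron_vec(psi, phi)

# The tensor product of normalized states is again normalized.
norm_sq = sum(abs(a) ** 2 for a in joint)
```

The ordering convention (first register is the most significant index) is a choice made here; the lecture does not fix one.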
Quantum circuits
The allowed operations on (pure) quantum states are those that map normalized states to
normalized states, namely unitary operators $U$, satisfying $U U^\dagger = U^\dagger U = I$. (You probably know that
there are more general quantum operations, but for the most part we will not need to use them in
this course.)
To have a sensible notion of efficient computation, we require that the unitary operators
appearing in a quantum computation are realized by quantum circuits. We are given a set of gates,
each of which acts on one or two qubits at a time (meaning that it is a tensor product of a one- or
two-qubit operator with the identity operator on the remaining qubits). A quantum computation
begins in the $|0\rangle$ state, applies a sequence of one- and two-qubit gates chosen from the set of allowed
gates, and finally reports an outcome obtained by measuring in the computational basis.
In principle, any unitary operator on n qubits can be implemented using only 1- and 2-qubit gates.
Thus we say that the set of all 1- and 2-qubit gates is (exactly) universal. Of course, some unitary
operators may take many more 1- and 2-qubit gates to realize than others, and indeed, a counting
argument shows that most unitary operators on n qubits can only be realized using an exponentially
large circuit of 1- and 2-qubit gates.
In general, we are content to give circuits that give good approximations of our desired unitary
transformations. We say that a circuit with gates $U_1, U_2, \ldots, U_t$ approximates $U$ with precision $\epsilon$ if
\[ \|U - U_t \cdots U_2 U_1\| \le \epsilon. \tag{3} \]
Here $\|\cdot\|$ denotes some appropriate matrix norm, which should have the property that if $\|U - V\|$
is small, then $U$ should be hard to distinguish from $V$ no matter what quantum state they act on.
A natural choice (which will be suitable for our purposes) is the spectral norm
\[ \|A\| := \max_{|\psi\rangle} \frac{\|A|\psi\rangle\|}{\||\psi\rangle\|}, \tag{4} \]
(where $\||\psi\rangle\| = \sqrt{\langle\psi|\psi\rangle}$ denotes the vector 2-norm of $|\psi\rangle$), i.e., the largest singular value of $A$.
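To make the definition of the spectral norm concrete, the following sketch computes it for a 2×2 matrix in two ways: as the largest singular value (from the eigenvalues of $A^\dagger A$) and by sampling the ratio in the definition over a grid of unit vectors. The matrix `A`, the function names, and the grid size are illustrative assumptions; restricting the samples to real unit vectors is enough only because this particular `A` is real.

```python
import math

def spectral_norm_2x2(A):
    """Largest singular value of a 2x2 matrix A, given as a list of two rows."""
    (a, b), (c, d) = A
    # Entries of the Hermitian matrix M = A^dagger A.
    m00 = abs(a) ** 2 + abs(c) ** 2
    m11 = abs(b) ** 2 + abs(d) ** 2
    m01 = a.conjugate() * b + c.conjugate() * d
    tr = m00 + m11
    det = m00 * m11 - abs(m01) ** 2
    # Larger eigenvalue of M, then its square root.
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2)

# Hypothetical example matrix (not unitary, so the norm differs from 1).
A = [[1.0, 2.0], [0.0, 1.0]]
norm = spectral_norm_2x2(A)

def stretch(theta):
    """Length of A applied to the unit vector (cos theta, sin theta)."""
    x, y = math.cos(theta), math.sin(theta)
    return math.hypot(A[0][0] * x + A[0][1] * y, A[1][0] * x + A[1][1] * y)

# Sampled maximum of ||A psi|| / ||psi|| over a grid of real unit vectors.
sampled = max(stretch(math.pi * j / 3600) for j in range(3600))
```

The sampled maximum can only approach the true norm from below, which is consistent with the norm being defined as a maximum.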
Then we call a set of elementary gates universal if any unitary operator on a fixed number of
qubits can be approximated to any desired precision using elementary gates.
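It turns out that there are finite sets of gates that are universal: for example, the set $\{H, T, C\}$
with
\[ H := \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \qquad T := \begin{pmatrix} e^{i\pi/8} & 0 \\ 0 & e^{-i\pi/8} \end{pmatrix}, \qquad C := \begin{pmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{pmatrix}. \tag{5} \]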
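As a quick sanity check on the gate set in (5), the sketch below writes the three matrices as nested lists and verifies numerically that each satisfies $U U^\dagger = I$. The helper names (`dagger`, `matmul`, `is_unitary`) and the tolerance are assumptions made for this illustration.

```python
import cmath
import math

def dagger(M):
    """Conjugate transpose of a square matrix given as a list of rows."""
    n = len(M)
    return [[M[j][i].conjugate() for j in range(n)] for i in range(n)]

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def is_unitary(U, tol=1e-12):
    """Check U U^dagger = I entrywise, up to the tolerance tol."""
    P = matmul(U, dagger(U))
    n = len(U)
    return all(abs(P[i][j] - (1 if i == j else 0)) <= tol
               for i in range(n) for j in range(n))

s = 1 / math.sqrt(2)
H = [[s, s], [s, -s]]                                   # Hadamard
T = [[cmath.exp(1j * math.pi / 8), 0],
     [0, cmath.exp(-1j * math.pi / 8)]]                 # T gate as in (5)
C = [[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]]  # controlled-NOT

all_unitary = is_unitary(H) and is_unitary(T) and is_unitary(C)
```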
There are situations in which we say a set of gates is effectively universal, even though it cannot
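actually approximate any unitary operator on n qubits. For example, the set $\{H, T^2, \mathrm{Tof}\}$, where
\[ \mathrm{Tof} := \begin{pmatrix}
1 & 0 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 1 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 1 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 & 0 & 1 \\
0 & 0 & 0 & 0 & 0 & 0 & 1 & 0
\end{pmatrix} \tag{6} \]
is universal, but only if we allow the use of ancilla qubits (qubits that start and end in the $|0\rangle$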
state). Similarly, the basis {H, Tof} is universal in the sense that, with ancillas, it can approximate
any orthogonal matrix. It clearly cannot approximate complex unitary matrices, since the entries
of H and Tof are real; but the effect of arbitrary unitary transformations can be simulated using
orthogonal ones by simulating the real and imaginary parts separately.
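One standard way to realize this idea (assumed here; the lecture does not spell out the encoding) is to write $U = A + iB$ with $A, B$ real and to act with the real block matrix $\begin{pmatrix} A & -B \\ B & A \end{pmatrix}$ on the stacked vector of real and imaginary parts of the state. The sketch below checks this on the $T$ gate; the names `realify` and `apply` and the test state are hypothetical.

```python
import cmath
import math

def realify(U):
    """Map an n x n complex matrix U = A + iB to the 2n x 2n real matrix [[A, -B], [B, A]]."""
    n = len(U)
    top = [[U[i][j].real for j in range(n)] + [-U[i][j].imag for j in range(n)]
           for i in range(n)]
    bottom = [[U[i][j].imag for j in range(n)] + [U[i][j].real for j in range(n)]
              for i in range(n)]
    return top + bottom

def apply(M, v):
    """Matrix-vector product for a matrix given as a list of rows."""
    return [sum(M[i][j] * v[j] for j in range(len(v))) for i in range(len(M))]

T = [[cmath.exp(1j * math.pi / 8), 0j], [0j, cmath.exp(-1j * math.pi / 8)]]
psi = [0.6 + 0j, 0.8j]  # hypothetical normalized test state

# Direct complex evolution.
direct = apply(T, psi)

# Real simulation: stack (Re psi, Im psi), apply the real matrix, then recombine.
real_vec = [z.real for z in psi] + [z.imag for z in psi]
real_out = apply(realify(T), real_vec)
recombined = [complex(real_out[k], real_out[k + 2]) for k in range(2)]
```

The doubling of the dimension corresponds to one extra qubit that records whether an amplitude is a real or an imaginary part.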
Are some universal gate sets better than others? Classically, this is not an issue: the set of possible
operations is discrete, so any gate acting on a constant number of bits can be simulated exactly
using a constant number of gates from any given universal gate set. But we might imagine that
some quantum gates are much more powerful than others. For example, given two rotations about
strange axes by strange angles, it may not be obvious how to implement a Hadamard gate, and
we might worry that implementing such a gate to high precision could take a very large number of
elementary operations, scaling badly with the required precision.
Fortunately, it turns out that this is not the case: a unitary operator that can be realized
efficiently with one set of 1- and 2-qubit gates can also be realized efficiently with another such set.
In particular, we have the following.
Theorem (Solovay-Kitaev). Fix two universal gate sets that are closed under inverses. Then any t-gate
circuit using one gate set can be implemented to precision $\epsilon$ using a circuit of $t \cdot \mathrm{poly}(\log(t/\epsilon))$ gates
from the other set (indeed, there is a classical algorithm for finding this circuit in time $t \cdot \mathrm{poly}(\log(t/\epsilon))$).
Thus, not only are the two gate sets equivalent under polynomial-time reduction, but the
running time of an algorithm using one gate set is the same as that using the other gate set up to
logarithmic factors. This means that even polynomial quantum speedups are robust with respect
to the choice of gate set.
To establish this, we first note the basic fact that errors in the approximation of one quantum
circuit by another accumulate linearly.
Lemma. Let $U_i, V_i$ be unitary matrices satisfying $\|U_i - V_i\| \le \epsilon$ for all $i \in \{1, 2, \ldots, t\}$. Then
$\|U_t \cdots U_2 U_1 - V_t \cdots V_2 V_1\| \le t\epsilon$.
Proof. We use induction on t. For t = 1 the lemma is trivial. Now suppose the lemma holds for
a particular value of t. Then by the triangle inequality and the fact that the norm is unitarily
invariant ($\|U A V\| = \|A\|$ for any unitary matrices $U, V$),
\begin{align*}
\|U_{t+1} U_t \cdots U_1 - V_{t+1} V_t \cdots V_1\|
&= \|U_{t+1} U_t \cdots U_1 - U_{t+1} V_t \cdots V_1 + U_{t+1} V_t \cdots V_1 - V_{t+1} V_t \cdots V_1\| \tag{7} \\
&\le \|U_{t+1} U_t \cdots U_1 - U_{t+1} V_t \cdots V_1\| + \|U_{t+1} V_t \cdots V_1 - V_{t+1} V_t \cdots V_1\| \tag{8} \\
&= \|U_{t+1} (U_t \cdots U_1 - V_t \cdots V_1)\| + \|(U_{t+1} - V_{t+1}) V_t \cdots V_1\| \tag{9} \\
&= \|U_t \cdots U_1 - V_t \cdots V_1\| + \|U_{t+1} - V_{t+1}\| \tag{10} \\
&\le (t + 1)\epsilon, \tag{11}
\end{align*}
where the last step uses the induction hypothesis, so the lemma follows by induction.
Thus, in order to simulate a t-gate quantum circuit with total error at most $\epsilon$, it suffices to
simulate each individual gate with error at most $\epsilon/t$.
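The linear accumulation of errors can be checked numerically on a small example. In the sketch below, an ideal single-qubit circuit alternates $H$ and $T$ gates, and each noisy gate is the ideal gate followed by a small extra rotation; the over-rotation angle `delta`, the circuit, and all helper names are assumptions chosen for illustration.

```python
import cmath
import math

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def spectral_norm(A):
    """Largest singular value of a 2x2 matrix, via the eigenvalues of A^dagger A."""
    Ad = [[A[j][i].conjugate() for j in range(2)] for i in range(2)]
    M = matmul(Ad, A)
    tr = (M[0][0] + M[1][1]).real
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2)

def product(gates):
    """Return U_t ... U_2 U_1 for gates = [U_1, ..., U_t] (U_1 is applied first)."""
    P = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for G in gates:
        P = matmul(G, P)
    return P

s = 1 / math.sqrt(2)
H = [[complex(s), complex(s)], [complex(s), complex(-s)]]
T = [[cmath.exp(1j * math.pi / 8), 0j], [0j, cmath.exp(-1j * math.pi / 8)]]

delta = 1e-3  # hypothetical over-rotation angle per gate
R = [[cmath.exp(1j * delta), 0j], [0j, cmath.exp(-1j * delta)]]

ideal = [H, T] * 10                      # U_1, ..., U_t with t = 20
noisy = [matmul(G, R) for G in ideal]    # V_i = U_i R

t = len(ideal)
eps = max(spectral_norm(sub(U, V)) for U, V in zip(ideal, noisy))
total_error = spectral_norm(sub(product(ideal), product(noisy)))
# The lemma guarantees total_error <= t * eps.
```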
To simulate an arbitrary individual gate, the strategy is to first construct a very fine net covering
a very small ball around the identity using the group commutator,
\[ [\![U, V]\!] := U V U^{-1} V^{-1}. \tag{12} \]
To approximate general unitaries, we will effectively translate them close to the identity.
Note that it suffices to consider unitary gates with determinant 1 (i.e., elements of SU(2)) since
a global phase is irrelevant. Let
\[ S_\epsilon := \{U \in \mathrm{SU}(2) : \|I - U\| \le \epsilon\} \tag{13} \]
denote the $\epsilon$-ball around the identity. Given sets $\Gamma, S \subseteq \mathrm{SU}(2)$, we say that $\Gamma$ is an $\epsilon$-net for $S$ if
for any $A \in S$, there is a $U \in \Gamma$ such that $\|A - U\| \le \epsilon$. The following result (to be proved later
on) indicates how the group commutator helps us to make a fine net around the identity.
Lemma. If $\Gamma$ is an $\epsilon^2$-net for $S_\epsilon$, then $[\![\Gamma, \Gamma]\!] := \{[\![U, V]\!] : U, V \in \Gamma\}$ is an $O(\epsilon^3)$-net for $S_{\epsilon^2}$.
To make an arbitrarily fine net, we apply this idea recursively. But first it is helpful to derive
a consequence of the lemma that is more suitable for recursion. We would like to maintain the
quadratic relationship between the size of the ball and the quality of the net. If we aim for a $k^2\epsilon^3$-net
(for some constant $k$), we would like it to apply to arbitrary points in $S_{k\epsilon^{3/2}}$, whereas the lemma
only lets us approximate points in $S_{\epsilon^2}$. To handle an arbitrary $A \in S_{k\epsilon^{3/2}}$, we first let $W$ be the
closest gate in $\Gamma$ to $A$. For sufficiently small $\epsilon$ we have $k\epsilon^{3/2} < \epsilon$, so $S_{k\epsilon^{3/2}} \subset S_\epsilon$, and therefore
$A \in S_\epsilon$. Since $\Gamma$ is an $\epsilon^2$-net for $S_\epsilon$, we have $\|A - W\| \le \epsilon^2$, i.e., $\|A W^\dagger - I\| \le \epsilon^2$, so $A W^\dagger \in S_{\epsilon^2}$.
Then we can apply the lemma to find $U, V \in \Gamma$ such that $\|A W^\dagger - [\![U, V]\!]\| = \|A - [\![U, V]\!] W\| \le k^2\epsilon^3$.
In other words, if $\Gamma$ is an $\epsilon^2$-net for $S_\epsilon$, then $[\![\Gamma, \Gamma]\!]\Gamma := \{[\![U, V]\!] W : U, V, W \in \Gamma\}$ is a $k^2\epsilon^3$-net for
$S_{k\epsilon^{3/2}}$.
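Now suppose that $\Gamma_0$ is an $\epsilon_0^2$-net for $S_{\epsilon_0}$, and let $\Gamma_i := [\![\Gamma_{i-1}, \Gamma_{i-1}]\!]\Gamma_{i-1}$ for all positive integers
$i$. Then $\Gamma_i$ is an $\epsilon_i^2$-net for $S_{\epsilon_i}$, where $\epsilon_i = k\epsilon_{i-1}^{3/2}$. Solving this recursion gives $\epsilon_i = (k^2\epsilon_0)^{(3/2)^i}/k^2$.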
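The closed form can be checked against the recursion directly. Substituting $\delta_i := k^2\epsilon_i$ turns the recursion into $\delta_i = \delta_{i-1}^{3/2}$, which gives $\delta_i = \delta_0^{(3/2)^i}$. The sketch below compares the two numerically; the values of `k` and `eps0` are arbitrary assumptions (chosen so that $k^2\epsilon_0 < 1$), as are the function names.

```python
import math

def eps_recursive(eps0, k, i):
    """Iterate eps_j = k * eps_{j-1}^(3/2) for j = 1, ..., i."""
    eps = eps0
    for _ in range(i):
        eps = k * eps ** 1.5
    return eps

def eps_closed_form(eps0, k, i):
    """Closed form (k^2 eps0)^((3/2)^i) / k^2."""
    return (k * k * eps0) ** (1.5 ** i) / (k * k)

k, eps0 = 2.0, 0.1  # hypothetical constants with k^2 * eps0 = 0.4 < 1
table = [(i, eps_recursive(eps0, k, i), eps_closed_form(eps0, k, i))
         for i in range(7)]
```

Because $k^2\epsilon_0 < 1$, the values shrink doubly exponentially in $i$, which is what makes the recursion useful.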
With these tools in hand, we are prepared to establish the main result.
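First we take products of elements of $\Gamma$ to form a new universal gate set $\Gamma_0$ that is an $\epsilon_0^2$-net
for SU(2), for some sufficiently small constant $\epsilon_0$. We know this can be done since $\Gamma$ is universal.
Since $\epsilon_0$ is a constant, the overhead in constructing $\Gamma_0$ is constant.
Now we can find $V_0 \in \Gamma_0$ such that $\|U - V_0\| \le \epsilon_0^2$. Since $\|U - V_0\| = \|U V_0^\dagger - I\|$, we have
$U V_0^\dagger \in S_{\epsilon_0^2}$. If $\epsilon_0$ is sufficiently small, then $\epsilon_0^2 < k\epsilon_0^{3/2} = \epsilon_1$, so $U V_0^\dagger \in S_{\epsilon_1}$.
Since $\Gamma_0$ is an $\epsilon_0^2$-net for SU(2), in particular it is an $\epsilon_0^2$-net for $S_{\epsilon_0}$. Thus by the above argument,
$\Gamma_1$ is an $\epsilon_1^2$-net for $S_{\epsilon_1}$, so we can find $V_1 \in \Gamma_1$ such that $\|U V_0^\dagger - V_1\| \le \epsilon_1^2 < k\epsilon_1^{3/2} = \epsilon_2$, i.e.,
$U V_0^\dagger V_1^\dagger \in S_{\epsilon_2}$.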
It remains to prove the lemma. A key idea is to move between the Lie group SU(2) and its
Lie algebra, i.e., the Hamiltonians generating these unitaries. In particular, we can represent any
$A \in \mathrm{SU}(2)$ as $A = e^{i\vec a\cdot\vec\sigma}$, where $\vec a \in \mathbb{R}^3$ and $\vec\sigma = (\sigma_x, \sigma_y, \sigma_z)$ is a vector of Pauli matrices. Note that
we can choose $\|\vec a\| \le \pi$ without loss of generality.
In the proof, the following basic facts about SU(2) will be useful.
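(i) $\|I - e^{i\vec a\cdot\vec\sigma}\| = 2\sin\frac{\|\vec a\|}{2} = \|\vec a\| + O(\|\vec a\|^3)$
(ii) $\|e^{i\vec b\cdot\vec\sigma} - e^{i\vec c\cdot\vec\sigma}\| = \|\vec b - \vec c\| + O(\|\vec b - \vec c\|^3)$
(iii) $[\vec b\cdot\vec\sigma, \vec c\cdot\vec\sigma] = 2i(\vec b \times \vec c)\cdot\vec\sigma$
(iv) $\|[\![e^{i\vec b\cdot\vec\sigma}, e^{i\vec c\cdot\vec\sigma}]\!] - e^{-[\vec b\cdot\vec\sigma, \vec c\cdot\vec\sigma]}\| = O(\|\vec b\|\|\vec c\|(\|\vec b\| + \|\vec c\|))$
Here the big-O notation is with respect to $\|\vec a\| \to 0$ in (i), with respect to $\|\vec b - \vec c\| \to 0$ in (ii), and
with respect to $\|\vec b\|, \|\vec c\| \to 0$ in (iv).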
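Facts (i), (iii) and (iv) can be spot-checked numerically using the closed form $e^{i\vec a\cdot\vec\sigma} = \cos\|\vec a\|\, I + i \sin\|\vec a\|\, (\vec a\cdot\vec\sigma)/\|\vec a\|$. The sketch below is only an illustration at a few hand-picked vectors, not a proof; all function names and the test vectors are assumptions.

```python
import math

def pauli_dot(a):
    """The 2x2 matrix a . sigma = ax*X + ay*Y + az*Z."""
    ax, ay, az = a
    return [[complex(az), complex(ax, -ay)], [complex(ax, ay), complex(-az)]]

def su2_exp(a):
    """exp(i a . sigma) = cos|a| I + i sin|a| (a . sigma)/|a|."""
    r = math.sqrt(sum(x * x for x in a))
    if r == 0.0:
        return [[1 + 0j, 0j], [0j, 1 + 0j]]
    s, c, f = pauli_dot(a), math.cos(r), math.sin(r) / r
    return [[c * (i == j) + 1j * f * s[i][j] for j in range(2)] for i in range(2)]

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def dagger(A):
    return [[A[j][i].conjugate() for j in range(2)] for i in range(2)]

def sub(A, B):
    return [[A[i][j] - B[i][j] for j in range(2)] for i in range(2)]

def spectral_norm(A):
    """Largest singular value, from the eigenvalues of A^dagger A."""
    M = matmul(dagger(A), A)
    tr = (M[0][0] + M[1][1]).real
    det = (M[0][0] * M[1][1] - M[0][1] * M[1][0]).real
    return math.sqrt((tr + math.sqrt(max(tr * tr - 4 * det, 0.0))) / 2)

def cross(b, c):
    return (b[1] * c[2] - b[2] * c[1], b[2] * c[0] - b[0] * c[2],
            b[0] * c[1] - b[1] * c[0])

def group_commutator(U, V):
    """U V U^{-1} V^{-1}; for unitaries the inverse is the conjugate transpose."""
    return matmul(matmul(U, V), matmul(dagger(U), dagger(V)))

I2 = [[1 + 0j, 0j], [0j, 1 + 0j]]

# Fact (i): ||I - exp(i a.sigma)|| = 2 sin(|a|/2).
a = (0.03, -0.04, 0.12)
r = math.sqrt(sum(x * x for x in a))
lhs_i, rhs_i = spectral_norm(sub(I2, su2_exp(a))), 2 * math.sin(r / 2)

# Fact (iii): [b.sigma, c.sigma] = 2i (b x c).sigma.
b, c = (0.1, 0.2, -0.3), (0.4, -0.1, 0.2)
Bs, Cs = pauli_dot(b), pauli_dot(c)
comm = sub(matmul(Bs, Cs), matmul(Cs, Bs))
rhs_iii = [[2j * z for z in row] for row in pauli_dot(cross(b, c))]

# Fact (iv): the group commutator is close to exp(-[b.sigma, c.sigma]) = exp(i(-2 b x c).sigma).
eps = 0.01
bb, cc = (eps, 0.0, 0.0), (0.0, eps, 0.0)
target = su2_exp(tuple(-2 * x for x in cross(bb, cc)))
err_iv = spectral_norm(sub(group_commutator(su2_exp(bb), su2_exp(cc)), target))
scale_iv = eps * eps * (eps + eps)  # |b| |c| (|b| + |c|)
```

Fact (iv) only asserts that `err_iv` is bounded by a constant multiple of `scale_iv`; the constant is not specified in the lecture.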
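Proof of Lemma. Let $A \in S_{\epsilon^2}$. Our goal is to find $U, V \in \Gamma$ such that $\|A - [\![U, V]\!]\| = O(\epsilon^3)$.
Choose $\vec a \in \mathbb{R}^3$ such that $A = e^{i\vec a\cdot\vec\sigma}$. Since $A \in S_{\epsilon^2}$, by (i) we can choose $\vec a$ so that $\|\vec a\| = O(\epsilon^2)$.
Then choose $\vec b, \vec c \in \mathbb{R}^3$ such that $2\vec b \times \vec c = \vec a$. We can choose these vectors to be orthogonal
and of equal length, so that $\|\vec b\| = \|\vec c\| = \sqrt{\|\vec a\|/2} = O(\epsilon)$. Let $B = e^{i\vec b\cdot\vec\sigma}$ and $C = e^{i\vec c\cdot\vec\sigma}$. Then
the only difference between $A$ and $[\![B, C]\!]$ is the difference between the commutator and the group
commutator, which is $O(\epsilon^3)$ by (iv).
However, we need to choose points from the net $\Gamma$. So let $U = e^{i\vec u\cdot\vec\sigma}$ be the closest element of
$\Gamma$ to $B$, and let $V = e^{i\vec v\cdot\vec\sigma}$ be the closest element of $\Gamma$ to $C$. Since $\Gamma$ is an $\epsilon^2$-net for $S_\epsilon$, we have
$\|U - B\| \le \epsilon^2$ and $\|V - C\| \le \epsilon^2$, so in particular $\|\vec u - \vec b\| = O(\epsilon^2)$ and $\|\vec v - \vec c\| = O(\epsilon^2)$.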
Now by the triangle inequality, we have
Note that it is possible to improve the construction somewhat over the version described above.
Furthermore, it can be generalized to SU(N) for arbitrary N. In general, the cost is exponential
in $N^2$, but for any fixed N this is just a constant.
Reversible computation
a quantum computer. This transformation can be applied to any superposition of computational
basis states, so for example, we can perform the transformation
\[ \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x, 0\rangle \mapsto \frac{1}{\sqrt{2^n}} \sum_{x \in \{0,1\}^n} |x, f(x)\rangle. \tag{22} \]
Note that this does not necessarily mean we can efficiently implement the map $|x\rangle \mapsto |f(x)\rangle$,
even when f is a bijection (so that this is indeed a unitary transformation). However, if we can
efficiently invert f , then we can indeed do this efficiently.
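To see why a map of the form $|x, 0\rangle \mapsto |x, f(x)\rangle$ can be unitary even when $f$ is not a bijection, consider the standard reversible extension $(x, y) \mapsto (x, y \oplus f(x))$, which is assumed here as the underlying construction. The sketch below checks on a toy example that this map permutes the basis states and is its own inverse; the function `f` and the name `oracle` are hypothetical.

```python
def oracle(f, x, y):
    """Reversible version of f: (x, y) -> (x, y XOR f(x))."""
    return x, y ^ f(x)

def f(x):
    # Hypothetical 2-bit function, deliberately not a bijection.
    return (x * x) % 4

# All basis states |x, y> with two bits in each register.
inputs = [(x, y) for x in range(4) for y in range(4)]
outputs = [oracle(f, x, y) for x, y in inputs]

# A permutation of basis states corresponds to a unitary (permutation) matrix.
is_permutation = sorted(outputs) == sorted(inputs)

# Applying the map twice returns every basis state to itself.
is_involution = all(oracle(f, *oracle(f, x, y)) == (x, y) for x, y in inputs)
```

By linearity, the same map acts on a uniform superposition of the states $|x, 0\rangle$ as in (22).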
Uniformity
When we give an algorithm for a computational problem, we consider inputs of varying sizes.
Typically, the circuits for instances of different sizes will be related to one another in a simple way.
But this need not be the case; and indeed, given the ability to choose an arbitrary circuit for each
input size, we could have circuits computing uncomputable languages. Thus we require that our
circuits be uniformly generated: say, that there exists a fixed (classical) Turing machine that, given
a tape containing the symbol ‘1’ n times, outputs a description of the nth circuit in time poly(n).
Quantum complexity
We say that an algorithm for a problem is efficient if the circuit describing it contains a number
of gates that is polynomial in the number of bits needed to write down the input. For example, if
the input is a number modulo N, the input size is $\lceil \log_2 N \rceil$.
With a quantum computer, as with a randomized (or noisy) classical computer, the final result
of a computation may not be correct with certainty. Instead, we are typically content with an
algorithm that can produce the correct answer with high enough probability (for a decision problem,
bounded above 1/2; for a non-decision problem for which we can check a correct solution, Ω(1)).
By repeating the computation many times, we can make the probability of outputting an incorrect
answer arbitrarily small.
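For a decision problem, one simple way to carry out this repetition is to run the algorithm an odd number of times and output the majority answer. The sketch below computes the exact failure probability of that majority vote from the binomial distribution; the success probability 2/3 and the function name are assumptions for illustration.

```python
from math import comb

def majority_error(p, r):
    """Probability that the majority of r independent runs (r odd) is wrong,
    when each run is correct independently with probability p."""
    # The majority is wrong exactly when at most r // 2 runs are correct.
    return sum(comb(r, j) * p ** j * (1 - p) ** (r - j)
               for j in range(r // 2 + 1))

p = 2 / 3  # hypothetical per-run success probability, bounded above 1/2
errors = {r: majority_error(p, r) for r in (1, 3, 11, 101)}
```

The failure probability decays exponentially in the number of repetitions, so a modest number of runs suffices for any fixed target error.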
In addition to considering explicit computational problems, in which the input is a string, we
will also consider the concept of query complexity. Here the input is a black box transformation, and
our goal is to discover some property of the transformation by making as few queries as possible. For
example, in Simon’s problem, we are given a transformation $f : \mathbb{Z}_2^n \to S$ satisfying $f(x) = f(y)$ iff
$y \in \{x, x \oplus t\}$ for some unknown $t \in \mathbb{Z}_2^n$, and the goal is to learn $t$. The main advantage of considering
query complexity is that it allows us to prove lower bounds on the number of queries required
to solve a given problem. Furthermore, if we find an efficient algorithm for a problem in query
complexity, then if we are given an explicit circuit realizing the black-box transformation, we will
have an efficient algorithm for an explicit computational problem.
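To make the black-box setting concrete, the sketch below builds a toy instance of Simon's problem and recovers the hidden string with a naive classical collision search that counts its queries. This is not the quantum algorithm; the construction `min(x, x ^ t)` and both function names are assumptions used only to produce a valid instance with a nonzero hidden string.

```python
def make_simon_function(t):
    """Return f with f(x) == f(y) exactly when y is x or x XOR t (for t != 0)."""
    return lambda x: min(x, x ^ t)

def find_shift_classically(f, n):
    """Query f on 0, 1, 2, ... until two inputs collide; return (t, number of queries)."""
    seen = {}
    queries = 0
    for x in range(2 ** n):
        queries += 1
        value = f(x)
        if value in seen:
            return seen[value] ^ x, queries
        seen[value] = x
    return 0, queries  # no collision found: f is injective on the whole domain

n, t = 4, 0b1011  # hypothetical instance
f = make_simon_function(t)
recovered, queries = find_shift_classically(f, n)
```

Counting calls to `f` rather than elementary operations is exactly the cost measure used in query complexity.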
Sometimes, we care not just about the size of a circuit for implementing a particular unitary
operation, but also about its depth, the maximum number of gates on any path from an input to
an output. The depth of a circuit tells us how long it takes to implement if we can perform gates in
parallel. In the problem set, you will get a chance to think about parallel circuits for implementing
the quantum Fourier transform.
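The difference between size and depth can be illustrated with a small calculation. The sketch below represents a circuit as a list of gates, each given by the tuple of qubits it acts on (a hypothetical encoding, as is the function name), and computes the depth by tracking, for each qubit, the layer of the last gate that touched it.

```python
def circuit_depth(gates, n):
    """Depth of a circuit on n qubits; gates is a list of tuples of qubit indices,
    listed in the order in which the gates are applied."""
    level = [0] * n  # layer of the most recent gate acting on each qubit
    for qubits in gates:
        # A gate is placed one layer after the latest gate on any of its qubits.
        layer = 1 + max(level[q] for q in qubits)
        for q in qubits:
            level[q] = layer
    return max(level, default=0)

# Hypothetical 3-qubit circuit: two 1-qubit gates, a 2-qubit gate,
# another 1-qubit gate, and a final 2-qubit gate.
gates = [(0,), (1,), (0, 1), (2,), (1, 2)]
size = len(gates)
depth = circuit_depth(gates, 3)
```

Here the gates on qubits 0 and 1 and the gate on qubit 2 can be performed in the same layer, so the depth is smaller than the size.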
Fault tolerance
In any real computer, operations cannot be performed perfectly. Quantum gates and
measurements may be performed imprecisely, and errors may happen even to stored data that is not being
manipulated. Fortunately, there are protocols for dealing with faults that may occur during the
execution of a quantum computation. Specifically, the threshold theorem states that as long as the
noise level is below some threshold (depending on the noise model, but typically in the range of
$10^{-3}$ to $10^{-4}$), an arbitrarily long computation can be performed with an arbitrarily small amount
of error.
In this course, we will always assume implicitly that fault-tolerant protocols have been applied,
such that we can effectively assume a perfectly functioning quantum computer.