Intro Stabilizer Circuits
Abstract
The Gottesman-Knill theorem states that a circuit made up of controlled-not,
Hadamard, phase, and measurement gates can be simulated efficiently on a classical
computer and provides a method for doing so in O(n^3) time. Aaronson and Gottesman’s
paper Improved Simulation of Stabilizer Circuits describes how to reduce the
simulation time to O(n^2) and explores the computational complexity implications of
the theorem. An immediate corollary of the Gottesman-Knill theorem is that the
circuits to which the theorem applies, known as stabilizer circuits, are not universal
for quantum computation. Aaronson and Gottesman answer the question of where
stabilizer circuits lie in the realm of computational complexity theory by showing that
they are complete for the classical complexity class ⊕L and thus are likely not universal
even for classical computation.
1 Introduction
One difficulty with designing quantum algorithms is the lack of a method for easily testing
and debugging circuits. To debug circuits in the classical setting, programmers can add
test conditions or read intermediate data to find where problems occur. In the quantum
setting, tricks like these may disturb the coherence of the quantum states being manipulated
by the quantum circuit. Furthermore, it seems likely that large-scale quantum computers
may not be built for many years to come, while physicists and chemists are looking for ways
to simulate quantum systems now. With these problems in mind, a method for simulating
quantum circuits on a classical computer is sought.
Several packages already exist for simulating general quantum systems on classical
computers [5, 6, 7, 8], but the running time of these simulators grows exponentially in
the number of qubits being simulated. It is of course expected that no method exists for
simulating all quantum circuits efficiently on a classical computer, as that would imply
that quantum computers could at best offer a polynomial speed-up over classical computers.
Such a limit is known not to hold for at least some problems in the oracle setting: the
n-bit version of the Deutsch-Jozsa algorithm [11] is exponentially faster than any
deterministic classical algorithm for the same problem. Attention is thus restricted to
trying to find specific classes of quantum circuits that can be simulated efficiently on
classical computers.
The Gottesman-Knill theorem [2, Theorem 10.5.4] provides a simple example of one such
class of quantum circuits, known as stabilizer circuits. The theorem gives a constructive
method for simulating an n-qubit stabilizer circuit in time that is polynomial in n.
Simulating measurements in the computational basis requires O(n^3) time in the originally
proposed simulation method, while simulating the other gates that make up stabilizer
circuits requires only O(n) time. Aaronson and Gottesman provide a method for improving
the efficiency of simulating measurements within stabilizer circuits, which then reduces
the time needed to simulate a stabilizer circuit to O(n^2) [1]. The cost of this speed-up
is a doubling of the number of bits that must be kept track of throughout the algorithm.
They also answer the question of where stabilizer circuits live in relation to other
well-known computational complexity classes.
The remainder of this paper proceeds as follows. Section 2 outlines the notation and
terminology necessary to discuss Aaronson and Gottesman’s results, Section 3 describes
the original approach for simulating stabilizer circuits, Section 4 outlines the algorithm
that Aaronson and Gottesman proposed for improving measurement simulation efficiency,
and Section 5 discusses some of the computational complexity implications of the fact that
performing this simulation is possible.
2 Preliminaries
First, we recall the definitions of several 1-qubit quantum gates:
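\[
\begin{aligned}
I &= \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}, &
X &= \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, &
H &= \frac{1}{\sqrt{2}} \begin{pmatrix} 1 & 1 \\ 1 & -1 \end{pmatrix}, \\
Y &= \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, &
Z &= \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, &
S &= \begin{pmatrix} 1 & 0 \\ 0 & i \end{pmatrix}.
\end{aligned}
\tag{1}
\]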
The Gottesman-Knill theorem states that a quantum circuit consisting only of CNOT,
H, S, and 1-qubit measurement gates can be simulated efficiently on a classical computer;
a quantum circuit of this form is known as a stabilizer circuit. The group generated by
CNOT, H, and S under multiplication is known as the Clifford group, which contains each of
the four Pauli matrices I, X, Y, and Z (for example, Z = S^2 and X = HS^2H). A stabilizer
circuit that contains no measurement gates is thus referred to as a Clifford group circuit.
Define the group P_n of n-qubit Pauli operators to be the group of all tensor products
of n Pauli matrices, together with the multiplicative factors ±1 and ±i. These factors
are required for P_n to be closed under multiplication (for example, XZ = −iY), and hence
to be a group. As a notational convenience, tensor product signs are omitted when
describing elements of P_n (e.g. XYZ means X ⊗ Y ⊗ Z). Similarly, subscripts on a Pauli
matrix indicate that the matrix appears in the indicated tensor slots, with the identity
matrix elsewhere (e.g. Z_{2,4} means I ⊗ Z ⊗ I ⊗ Z ⊗ I ⊗ ⋯ ⊗ I, where the number of
identity matrices being tensored will be clear from context).
Given a pure state |ψ⟩, a unitary matrix U is said to stabilize |ψ⟩ if U|ψ⟩ = |ψ⟩,
where global phase is not ignored (e.g. the Pauli X matrix stabilizes the state
(1/√2)|0⟩ + (1/√2)|1⟩ and the Pauli Z matrix stabilizes |0⟩). The identity matrix I
stabilizes all states, while −I stabilizes no states. The set Stab(|ψ⟩) of stabilizers of
|ψ⟩ is a group because if U|ψ⟩ = |ψ⟩ and V|ψ⟩ = |ψ⟩ then U^{−1}|ψ⟩ = |ψ⟩ and
UV|ψ⟩ = |ψ⟩ as well.
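The definition of a stabilizer can be checked numerically on the examples just given. The following is a minimal sketch in plain Python; the helper names apply and stabilizes are hypothetical and are introduced here only for illustration.

```python
import math

# Single-qubit matrices as nested lists, states as length-2 lists.
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]
MINUS_I = [[-1, 0], [0, -1]]

def apply(U, psi):
    """Return the vector U|psi>."""
    return [sum(U[r][c] * psi[c] for c in range(len(psi))) for r in range(len(U))]

def stabilizes(U, psi, tol=1e-12):
    """True if U|psi> equals |psi> exactly; the global phase is NOT ignored."""
    return all(abs(a - b) < tol for a, b in zip(apply(U, psi), psi))

zero = [1, 0]
one = [0, 1]
plus = [1 / math.sqrt(2), 1 / math.sqrt(2)]

print(stabilizes(X, plus))        # X stabilizes (|0> + |1>)/sqrt(2)
print(stabilizes(Z, zero))        # Z stabilizes |0>
print(stabilizes(Z, one))         # Z|1> = -|1>, so Z does not stabilize |1>
print(stabilizes(MINUS_I, plus))  # -I stabilizes no state
```

Because the comparison is made without discarding the global phase, Z is rejected as a stabilizer of |1⟩ even though Z|1⟩ and |1⟩ represent the same physical state.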
To see why this theorem is useful in this setting, first note that a general n-qubit
quantum state requires about 2^n parameters to be completely described. Theorem 1 says,
however, that any state |ψ⟩ that can be obtained from |0⟩^⊗n via a stabilizer circuit can
be described uniquely in terms of S(|ψ⟩), and |S(|ψ⟩)| = 2^n. It then follows that any
|ψ⟩ of the type described by Theorem 1 can be described by exactly n elements of P_n (in
particular, the n generators of S(|ψ⟩)). Furthermore, each generator of S(|ψ⟩) can be
described by 2n + 1 bits: 2 bits for each of the n Pauli matrices and 1 bit for the phase
(note that the only possible phases are ±1, since a generator g with phase ±i would
satisfy g^2 = −I⋯I, implying that −I⋯I ∈ S(|ψ⟩), which is impossible).
Putting all of this together, Theorem 1 implies that any state |ψ⟩ that can be obtained
from |0⟩^⊗n via a stabilizer circuit can be described uniquely by n(2n + 1) = 2n^2 + n
bits. Equally important is the fact that these bits can be updated efficiently after a
CNOT, H, S, or measurement gate is applied to |ψ⟩. Throughout the rest of this paper, when
a quantum state |ψ⟩ is referred to, it is assumed that it satisfies the conditions of
Theorem 1 unless stated otherwise.
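The bit count above can be made concrete with a short sketch. The particular 2-bit encoding of each Pauli matrix used here (I = 00, X = 10, Y = 11, Z = 01) is an assumption borrowed from the tableau convention stated in Appendix I, and the function names are hypothetical.

```python
# Assumed 2-bit encoding (x, z) of a single Pauli matrix.
PAULI_BITS = {"I": (0, 0), "X": (1, 0), "Y": (1, 1), "Z": (0, 1)}

def encode_generator(sign, paulis):
    """Encode a signed Pauli string as 2n + 1 bits: (x, z) per slot, then a sign bit."""
    bits = []
    for p in paulis:
        bits.extend(PAULI_BITS[p])
    bits.append(0 if sign == +1 else 1)  # 0 encodes phase +1, 1 encodes phase -1
    return bits

def description_size(n):
    """Total number of bits for n generators of 2n + 1 bits each."""
    return n * (2 * n + 1)

# The state |00> has stabilizer generators +ZI and +IZ.
print(encode_generator(+1, "ZI"))
print(encode_generator(+1, "IZ"))
print(description_size(2))
```

For comparison, description_size(50) is 5050 bits, whereas a general 50-qubit state requires on the order of 2^50 complex amplitudes.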
Operation   Input   Output        Operation   Input   Output
CNOT        X1      X1 X2         X           X       X
            X2      X2                        Z       −Z
            Z1      Z1            Y           X       −X
            Z2      Z1 Z2                     Z       −Z
H           X       Z             Z           X       −X
            Z       X                         Z       Z
S           X       Y
            Z       Z

Figure 1: The update rules for updating Pauli X and Z matrices upon conjugation by various
Clifford group gates. For the CNOT gate, qubit 1 is the control and qubit 2 is the target.
Adapted from [2, Section 10.5.2].
matrices together in the same order. For example, the update rule for applying S to Y
follows from the fact that Y = iXZ. The update rule applied to X yields Y and the update
rule applied to Z yields Z, so the update rule applied to Y yields i(Y )(Z) = −X.
Thus to update S(|ψ⟩) after a quantum gate is applied to the a-th qubit, simply apply the
update rule to the a-th tensor slot of each of its n generators. Updating S(|ψ⟩) in this
way takes O(n) time, as the update rule needs to be applied n times to simulate an H, S,
X, Y, or Z gate, and 2n times to simulate a CNOT gate.
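The single-qubit rows of Figure 1, together with the derived rule for Y under S, can be checked by direct conjugation U P U†. The following sketch uses plain Python with complex numbers; the helper names (matmul, dagger, conjugate, scale, close) are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def dagger(A):
    """Conjugate transpose of a 2x2 matrix."""
    return [[complex(A[j][i]).conjugate() for j in range(2)] for i in range(2)]

def conjugate(U, P):
    """Return U P U^dagger, the image of P when the gate U is applied."""
    return matmul(matmul(U, P), dagger(U))

def scale(c, A):
    return [[c * a for a in row] for row in A]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

s = 2 ** -0.5
X = [[0, 1], [1, 0]]
Y = [[0, -1j], [1j, 0]]
Z = [[1, 0], [0, -1]]
H = [[s, s], [s, -s]]
S = [[1, 0], [0, 1j]]

print(close(conjugate(H, X), Z))             # H: X -> Z
print(close(conjugate(S, X), Y))             # S: X -> Y
print(close(conjugate(S, Y), scale(-1, X)))  # S: Y -> -X, as derived from Y = iXZ
print(close(conjugate(X, Z), scale(-1, Z)))  # X: Z -> -Z
```

The third check reproduces the worked example in the text: the rule for Y under S follows from the rules for X and Z.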
for the two eigenvalues +1 and −1 of P are both 1/2 and thus the measurement outcome
can be chosen uniformly at random. After randomly choosing a measurement outcome, the
rule for updating S(|ψ⟩) is to replace g_q in the list of generators by ±P, where the
sign in front of P corresponds to the measurement outcome chosen.
Case 2: P commutes with all the generators of S(|ψ⟩). In this case we know that
g_j P|ψ⟩ = P g_j|ψ⟩ = P|ψ⟩ for all generators g_j and thus S(P|ψ⟩) = S(|ψ⟩). This implies
that P|ψ⟩ is a multiple of |ψ⟩ which, together with the fact that P^2 = I, implies that
P|ψ⟩ = ±|ψ⟩. Thus, the measurement of P gives +1 or −1 with probability 1 and does not
disturb the state of the system (and thus does not alter S(|ψ⟩)). However, determining
whether the measurement outputs +1 or −1 (equivalently |0⟩ or |1⟩, respectively) seems to
require inverting an n × n matrix, which takes O(n^ω) time, where ω is the exponent of
matrix multiplication [9]. The lowest known exponent of matrix multiplication at the time
of writing is 2.376 [10], although in practice the naive (standard) algorithm of matrix
multiplication is used for numerical stability and implementation reasons, giving a
running time of O(n^3).
Note that Z_k commutes with Q ∈ P_n if and only if the k-th tensor slot of Q looks like Z
or I. Thus the two cases above can be distinguished in O(n) time by simply looking at the
k-th tensor slot of each of g_1, g_2, …, g_n.
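The O(n) test that separates the two cases can be sketched as follows. Generators are represented here as plain strings over the letters I, X, Y, Z with signs omitted, since signs do not affect commutation; this representation and the function name are assumptions made for illustration.

```python
def measurement_is_random(generators, k):
    """Return True if measuring Z on qubit k (1-indexed) falls into case 1.

    Z_k anti-commutes with a generator exactly when the k-th tensor slot of
    that generator is X or Y, so a single pass over the n generators suffices.
    """
    return any(g[k - 1] in ("X", "Y") for g in generators)

# |00> has generators ZI and IZ: both measurements are deterministic.
print(measurement_is_random(["ZI", "IZ"], 1))
# (|00> + |11>)/sqrt(2) has generators XX and ZZ: both measurements are random.
print(measurement_is_random(["XX", "ZZ"], 1))
```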
This procedure shows how to efficiently simulate stabilizer circuits on a classical
computer. Recall, however, that if a measurement is deterministic (i.e. it falls into
case 2 above) then the simulation requires O(n^3) time, a significant increase over the
O(n) time needed to simulate Clifford group gates and the O(n^2) time needed to simulate
random measurements (as in case 1 above). This large running time for simulating certain
measurements is the motivation for looking for an improved algorithm for simulating
stabilizer circuits.
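Definition 2 The tableau of a state |ψ⟩ is a matrix consisting of binary variables x_ij,
z_ij for all i ∈ {1, …, 2n}, j ∈ {1, …, n}, and r_i for all i ∈ {1, …, 2n}. Row i of the
tableau is (x_i1 ⋯ x_in | z_i1 ⋯ z_in | r_i) and encodes a Pauli operator R_i: writing
P_ij for the j-th tensor slot of R_i, x_ij = 1 if P_ij = X or P_ij = Y (and x_ij = 0
otherwise), z_ij = 1 if P_ij = Y or P_ij = Z (and z_ij = 0 otherwise), and r_i = 1 if R_i
has negative phase (and r_i = 0 otherwise).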
Recalling that Y = iXZ, the above rule for x_ij and z_ij can be intuitively thought of as
x_ij = 1 if and only if the j-th tensor slot of R_i contains an X, and z_ij = 1 if and
only if the j-th tensor slot of R_i contains a Z. The first n rows of a tableau represent
the n destabilizer generators for that state and the last n rows represent the n
stabilizer generators for that state.
As a simple example of a tableau, if |ψ⟩ = |00⟩ then S(|ψ⟩) is generated by {+ZI, +IZ}.
That is, the stabilizer generators are R_3 = +ZI and R_4 = +IZ. The destabilizer
generators can then be chosen to be R_1 = +XI and R_2 = +IX, as the set
{+XI, +IX, +ZI, +IZ, +iI} generates all of P_2. It is then easy to verify that r_i = 0
for all i, x_11 = 1, x_22 = 1, x_ij = 0 for all other i, j, z_31 = 1, z_42 = 1, and
z_ij = 0 for all other i, j. It thus follows that a tableau for this state is:
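\[
\left(\begin{array}{cc|cc|c}
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0 \\ \hline
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0
\end{array}\right)
\tag{2}
\]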
Note that any given state has several tableaux. The obvious generalization of the above
tableau to more qubits is the starting point of the tableau algorithm, as it is a “nice”
way of representing the state |0⟩^⊗n. The procedures for simulating CNOT, H, and S gates
are provided in Appendix II and follow directly from the update rules given in Figure 1.
Simulating these gates requires only O(n) time, as in the original algorithm.
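The generalization of Tableau (2) to n qubits can be sketched in a few lines. The row layout (x_1, …, x_n | z_1, …, z_n | r) and the function name initial_tableau are assumptions made for illustration.

```python
def initial_tableau(n):
    """Tableau for |0>^(tensor n) as a list of 2n rows of 2n + 1 bits each.

    Row i (0-indexed) for i < n is the destabilizer X_i; row n + i is the
    stabilizer Z_i. All phase bits are 0, so the matrix is the 2n x 2n
    identity followed by a zero column.
    """
    rows = []
    for i in range(2 * n):
        row = [0] * (2 * n + 1)
        row[i] = 1
        rows.append(row)
    return rows

for row in initial_tableau(2):
    print(row)
```

For n = 2 this reproduces Tableau (2) row by row.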
4.2 Improving measurement simulations
In order to simulate a measurement of qubit a in the computational basis, a procedure
called rowsum(h, j) is repeatedly used. The rowsum(h, j) procedure replaces R_h by
R_h R_j; the details of its implementation are provided in Appendix II. Two cases are
considered depending on whether the measurement will output a random (case 1) or
deterministic (case 2) result. Note that these two cases correspond exactly to the two
cases described in Section 3.2 for the original algorithm. As was the case earlier, these
two cases can be distinguished in O(n) time.
Case 1: There exists p ∈ {n + 1, …, 2n} such that x_pa = 1. Let q be the smallest
such p. This case corresponds to the situation where the a-th tensor slot of at least one
stabilizer generator of |ψ⟩ looks like either X or Y (i.e. the a-th tensor slot of some
stabilizer generator does not look like I or Z). This means that the a-th qubit of |ψ⟩
does not look like |0⟩ or |1⟩ and thus the measurement outcome is random. The state
therefore needs to be updated, which is done as follows.
For all j ∈ {1, …, q − 1, q + 1, …, 2n} such that x_ja = 1, call rowsum(j, q). This step
corresponds to replacing g_j by g_q g_j in the list of generators of S(|ψ⟩) whenever g_j
anti-commutes with the measured observable Z_a. Next, set the (q − n)-th row of the
tableau equal to the q-th row of the tableau. Then set x_qj = z_qj = 0 for all
j ∈ {1, …, n}, except set z_qa = 1. Finally, set r_q to be 0 or 1 with equal probability.
These last three steps correspond to replacing g_q by ±P, where P = Z_a can be thought of
as the observable that is being measured. The measurement outcome is r_q.
Note that it takes O(n^2) time to simulate a measurement using this method, as the
rowsum procedure is run O(n) times and each execution of the rowsum procedure takes
O(n) time. This agrees with the O(n^2) running time of simulating random measurements in
the original algorithm.
Case 2: There does not exist p ∈ {n + 1, …, 2n} such that x_pa = 1. In this case |ψ⟩
looks like |0⟩ or |1⟩ on the a-th qubit and thus the measurement is deterministic and the
state does not need to be updated. All that needs to be determined is whether the
measurement will result in 0 or 1. To this end, create a (2n + 1)-st row on the tableau
and set it identically equal to zero. Then for all j ∈ {1, …, n} such that x_ja = 1, call
rowsum(2n + 1, j + n) and return r_{2n+1} as the measurement outcome.
This procedure corresponds to multiplying together all of the stabilizer generators
R_{j+n} such that the destabilizer generator R_j anti-commutes with Z_a, and returning the
resulting phase (±1) as the measurement outcome. Notice that this procedure takes O(n^2)
time to carry out in general, as the rowsum procedure is called O(n) times and takes O(n)
time to run each time. Also notice that this procedure is the reason that the n
destabilizer generators were added to the algorithm; in no other part of the tableau
algorithm are they needed nor do they provide a speed-up elsewhere.
To see why this procedure correctly computes the measurement outcome, notice that Z_a
must commute with all generators of S(|ψ⟩) (just as in case 2 of the original algorithm).
It is not hard to then show that either Z_a ∈ S(|ψ⟩) or −Z_a ∈ S(|ψ⟩), but not both (if
both were in the stabilizer, then (−Z_a)Z_a = −I would also be in the stabilizer, which
cannot happen).
Furthermore, the sign in front of Z_a in the stabilizer corresponds exactly to whether the
measurement outcome is 0 or 1, since |0⟩ is stabilized by Z and |1⟩ is stabilized by −Z.
In order to determine the sign of Z_a in the stabilizer, a method for computing ±Z_a by
multiplying together stabilizer generators is developed. Notice for the initial
Tableau (2) that the j-th stabilizer generator anti-commutes with the j-th destabilizer
generator but commutes with the rest of the destabilizer generators. It is not difficult
to prove that these properties remain after any number of gates from a stabilizer circuit
are applied. It is similarly easy to see that Z_a will anti-commute with the destabilizer
generator R_j if and only if the stabilizer generator R_{j+n} appears in the list of
stabilizer generators that are multiplied together to compute ±Z_a. Thus, multiplying
together all of the stabilizer generators R_{j+n} such that R_j anti-commutes with Z_a
computes ±Z_a, where the sign corresponds to the measurement outcome.
The new tableau algorithm provides a method for simulating both random and deterministic
measurements on a classical computer in O(n^2) time. Note that in both case 1 and
case 2 above the number of times that rowsum(h, j) is called depends on the number
of stabilizer generators R_{n+j} and destabilizer generators R_j whose a-th tensor slot
looks like X or Y. One can observe from Figure 1 and Tableau (2), however, that states
that started as |0⟩^⊗n and have had very few CNOT, H, and S gates applied to them are
expected to have far fewer than n such stabilizer and destabilizer generators, resulting
in measurements that take much less than order n^2 time to simulate. It is thus expected
that measurements performed sufficiently early in the circuit will be simulated by the
tableau algorithm much more quickly than measurements that are performed later in the
circuit, a fact that is confirmed by Aaronson and Gottesman [1, Figure 2].
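The two measurement cases of this section can be sketched together in Python. This is a minimal illustration rather than Aaronson and Gottesman's reference implementation: the tableau is stored as three hypothetical lists x, z, r, all indices are 0-based, and only the Hadamard gate is included so that a random outcome can be produced.

```python
import random

def g(x1, z1, x2, z2):
    """Exponent of i picked up when Pauli (x1, z1) is multiplied by Pauli (x2, z2)."""
    if x1 == 0 and z1 == 0:
        return 0
    if x1 == 1 and z1 == 1:
        return z2 - x2
    if x1 == 1 and z1 == 0:
        return z2 * (2 * x2 - 1)
    return x2 * (1 - 2 * z2)

def rowsum(x, z, r, h, j):
    """Replace row h by the product of rows j and h; the phase uses the old row h."""
    n = len(x[0])
    m = 2 * r[h] + 2 * r[j] + sum(g(x[j][k], z[j][k], x[h][k], z[h][k]) for k in range(n))
    r[h] = (m % 4) // 2  # phases of destabilizer rows are never read
    for k in range(n):
        x[h][k] ^= x[j][k]
        z[h][k] ^= z[j][k]

def initial_state(n):
    """Tableau for |0...0>: destabilizers X_k in rows 0..n-1, stabilizers Z_k after."""
    x = [[1 if i == k else 0 for k in range(n)] for i in range(2 * n)]
    z = [[1 if i == k + n else 0 for k in range(n)] for i in range(2 * n)]
    return x, z, [0] * (2 * n)

def hadamard(x, z, r, a):
    for i in range(len(r)):
        r[i] ^= x[i][a] & z[i][a]
        x[i][a], z[i][a] = z[i][a], x[i][a]

def measure(x, z, r, a, rng=random):
    """Measure qubit a in the computational basis and return the outcome bit."""
    n = len(x[0])
    q = next((p for p in range(n, 2 * n) if x[p][a] == 1), None)
    if q is not None:
        # Case 1: some stabilizer generator anti-commutes with Z_a; outcome is random.
        for j in range(2 * n):
            if j != q and x[j][a] == 1:
                rowsum(x, z, r, j, q)
        x[q - n], z[q - n], r[q - n] = x[q][:], z[q][:], r[q]
        x[q], z[q] = [0] * n, [0] * n
        z[q][a] = 1
        r[q] = rng.randint(0, 1)
        return r[q]
    # Case 2: deterministic outcome, accumulated in a temporary scratch row.
    x.append([0] * n)
    z.append([0] * n)
    r.append(0)
    for j in range(n):
        if x[j][a] == 1:
            rowsum(x, z, r, 2 * n, j + n)
    outcome = r[2 * n]
    x.pop(); z.pop(); r.pop()
    return outcome

x, z, r = initial_state(2)
hadamard(x, z, r, 0)
first = measure(x, z, r, 0)   # random: 0 or 1
second = measure(x, z, r, 0)  # deterministic: repeats the first outcome
print(first == second)
```

Measuring the same qubit twice illustrates both cases: the first call takes the random branch and rewrites row q, and the second call finds no stabilizer row with x_pa = 1 and so takes the deterministic branch.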
Definition 3 P is the class of problems solvable by a Turing machine in time that is poly-
nomial in n, where n is the size of the input.
Definition 4 L is the class of problems solvable by a Turing machine restricted to use an
amount of memory in O(log(n)), where n is the size of the input.
Definition 5 ⊕L is the class of all problems that are reducible to simulating a circuit
whose size is polynomial in n and is composed entirely of X and CNOT gates, acting on the
initial state |0⟩^⊗n.
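Circuits of the kind appearing in Definition 5 act on computational basis states as ordinary bit operations, which the following sketch makes explicit. The gate representation (tuples such as ("X", t) and ("CNOT", c, t) with 0-based qubit indices) and the function name are assumptions made for illustration.

```python
def run_x_cnot_circuit(n, gates):
    """Simulate a circuit of X and CNOT gates acting on |0...0>.

    The state always remains a computational basis state, so it is tracked
    as a list of n classical bits. Each gate costs O(1) time.
    """
    bits = [0] * n
    for gate in gates:
        if gate[0] == "X":
            _, t = gate
            bits[t] ^= 1          # X is a classical NOT
        elif gate[0] == "CNOT":
            _, c, t = gate
            bits[t] ^= bits[c]    # CNOT adds the control bit to the target mod 2
        else:
            raise ValueError("only X and CNOT gates are allowed")
    return bits

print(run_x_cnot_circuit(3, [("X", 0), ("CNOT", 0, 1), ("CNOT", 1, 2)]))
```

Every bit of the output is a sum modulo 2 of the inputs and constants, which is the source of the parity symbol ⊕ in the name of the class.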
It is clear from these definitions that ⊕L ⊆ P because, roughly speaking, any problem
that can be reduced to simulating a circuit consisting of only a polynomial number of X
and CNOT gates must be solvable in polynomial time. It is currently not known whether this
inclusion is strict, but it is believed that ⊕L ≠ P. Similarly, it is known that L ⊆ ⊕L,
a fact that is made clearer by a more complicated alternate definition of ⊕L [1, 12]. It
is conjectured that L ≠ ⊕L, but this is also currently unknown.
It is simple to see that X = HSSH, and thus any circuit consisting entirely of X and CNOT
gates is a stabilizer circuit; that is, any problem in ⊕L is reducible to simulating a
polynomial-size stabilizer circuit. In other words, simulating stabilizer circuits is
⊕L-hard. Aaronson and Gottesman further show that simulating stabilizer circuits is a
problem that is in ⊕L, which in turn shows that simulating stabilizer circuits is
⊕L-complete. It thus follows that stabilizer circuits are likely not universal even for
classical computation, because if they were then it would follow that ⊕L = P.
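The identity X = HSSH used above can be confirmed by multiplying out the matrices of Equation (1). A minimal sketch in plain Python follows; the helper names matmul and close are hypothetical.

```python
def matmul(A, B):
    """Product of two 2x2 matrices."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)] for i in range(2)]

def close(A, B, tol=1e-12):
    return all(abs(A[i][j] - B[i][j]) < tol for i in range(2) for j in range(2))

s = 2 ** -0.5
H = [[s, s], [s, -s]]
S = [[1, 0], [0, 1j]]
X = [[0, 1], [1, 0]]
Z = [[1, 0], [0, -1]]

SS = matmul(S, S)                  # S^2 = Z
HSSH = matmul(matmul(H, SS), H)    # H Z H = X

print(close(SS, Z))
print(close(HSSH, X))
```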
In order to prove that simulating stabilizer circuits is a problem that is in ⊕L, Aaronson
and Gottesman argue that the principle of deferred measurement shows that they can assume
without loss of generality that the stabilizer circuit contains only one measurement gate,
and it is located at the end of the circuit. They then show that the tableau algorithm
allows for the stabilizer circuit to be simulated using only O(log(n)) memory provided
they can simulate subcircuits consisting of X and CNOT gates with an oracle. This then
shows that the problem of simulating a stabilizer circuit with an oracle for simulating
subcircuits of X and CNOT gates is in L. A result of Hertrampf, Reith, and Vollmer [13]
then shows that this implies that simulating a stabilizer circuit is a problem that must
be in ⊕L.
Since simulating stabilizer circuits is a problem that is in ⊕L, stabilizer circuits can
be simulated efficiently by circuits consisting entirely of X and CNOT gates. This
provides another way of looking at the Gottesman-Knill theorem: X and CNOT gates can be
thought of as classical operations (NOT and controlled-NOT, respectively), so it follows
that stabilizer circuits can be simulated efficiently on a classical computer.
6 Conclusion
The Gottesman-Knill theorem, tableau algorithm, and several related results all help pin
down exactly where stabilizer circuits reside in the realm of computational complexity.
Even though stabilizer circuits are an extremely important class of quantum circuits, as
CNOT, H, S, and Pauli gates form the foundation of several important quantum algorithms,
it was shown that they are not universal for quantum computation and likely not universal
even for classical computation.
The work of Aaronson and Gottesman in [1] improves greatly on the already-important
Gottesman-Knill theorem, but certainly there are related avenues of interest that are
still open. It was shown in [14] that generalizing stabilizer circuits by adding any 1- or
2-qubit gate not generated by CNOT, H, and S results in a set of gates that is universal
for quantum computation. Is there a set of quantum gates that is neither universal for
quantum computation nor efficiently simulable on a classical computer? Is there a way to
improve the tableau algorithm so that simulation of measurements on stabilizer circuits
can be performed in O(n) time? Answers to these questions will help shed more light on the
delicate link between quantum computational complexity and classical computational
complexity that the Gottesman-Knill theorem highlights.
Acknowledgements. This paper was primarily based on the paper Improved Simulation of
Stabilizer Circuits by S. Aaronson and D. Gottesman [1]. Additional information about the
Gottesman-Knill theorem and its proof were provided by the book Quantum Computation
and Quantum Information by M. Nielsen and I. Chuang [2].
References
[1] S. Aaronson and D. Gottesman, Improved Simulation of Stabilizer Circuits.
http://xxx.lanl.gov/abs/quant-ph/0406196
[2] M. A. Nielsen and I. Chuang, Quantum computation and quantum information, Cam-
bridge University Press, Cambridge, 2000.
[4] C. Bennett and S. J. Wiesner. Communication via one- and two-particle operators on
Einstein-Podolsky-Rosen states. Phys. Rev. Lett., 69:2881, 1992.
[9] Cohn, H., Umans, C., Kleinberg, R., Szegedy, B., Group-theoretic Algorithms for Matrix
Multiplication, Proceedings of the 46th Annual IEEE Symposium on Foundations of
Computer Science 2005, IEEE Computer Society, 2005, 438-449.
[11] D. Deutsch and R. Jozsa, Rapid solution of problems by quantum computation.
Proceedings of the Royal Society of London A, 439:553–558, 1992.
[13] U. Hertrampf, S. Reith, and H. Vollmer. A note on closure properties of logspace MOD
classes, Information Processing Letters 75(3):91-93, 2000.
[14] Y. Shi. Quantum Information and Computation 3(1), 84, 2003. quant-ph/0205115.
[15] The matrix iI needs to be added to the list of generators in general to ensure that
it is possible to find n destabilizer Pauli operators that, together with S(|ψ⟩), truly
generate all of P_n, including the four possible multiplicative factors.
Appendix I: Corrections
In Aaronson and Gottesman’s paper [1] it is said on page 4 that x_ij = 1 if P_ij = Y or
P_ij = Z (and x_ij = 0 otherwise) and that z_ij = 1 if P_ij = X or P_ij = Y (and z_ij = 0
otherwise). There was a typo in these definitions, however, as they would lead to the
tableau for the state |00⟩ (i.e. the “identity matrix”) being
\[
\left(\begin{array}{cc|cc|c}
0 & 0 & 1 & 0 & 0 \\
0 & 0 & 0 & 1 & 0 \\ \hline
1 & 0 & 0 & 0 & 0 \\
0 & 1 & 0 & 0 & 0
\end{array}\right).
\]
Similarly, it is easy to see that the measurement protocol described on page 5 of their
paper does not work as intended under the given definitions of x_ij and z_ij. Their paper
provides the correct definitions of x_ij^(T) and z_ij^(T) (which are closely related to
x_ij and z_ij) on page 8. The correct definitions are x_ij = 1 if P_ij = X or P_ij = Y
(and x_ij = 0 otherwise) and z_ij = 1 if P_ij = Y or P_ij = Z (and z_ij = 0 otherwise), as
stated in Definition 2.
Simulating H on qubit a. For all i ∈ {1, …, 2n} set r_i = r_i ⊕ x_ia z_ia and swap x_ia
with z_ia.
Simulating S on qubit a. For all i ∈ {1, …, 2n} set r_i = r_i ⊕ x_ia z_ia and then set
z_ia = z_ia ⊕ x_ia.
These update rules simply implement the update rules given in Figure 1. For example, if
the i-th destabilizer has its a-th tensor slot occupied by X then x_ia = 1 and z_ia = 0.
Thus, after simulating an S gate on qubit a we have x_ia = 1, z_ia = 0 ⊕ 1 = 1 (recall
from Definition 2 that this corresponds to the a-th tensor slot of the destabilizer now
being occupied by Y) and r_i = r_i ⊕ 0 = r_i (i.e. the phase of the destabilizer does not
change). This agrees with the rule given in Figure 1 that says S updates X to Y.
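The H and S update rules translate almost directly into code. The sketch below assumes the tableau is stored as three lists x, z, r (hypothetical names) and that the qubit index a is 0-based.

```python
def apply_h(x, z, r, a):
    """Tableau update for a Hadamard gate on qubit a."""
    for i in range(len(r)):
        r[i] ^= x[i][a] & z[i][a]            # a Y in slot a flips the sign
        x[i][a], z[i][a] = z[i][a], x[i][a]  # X and Z are exchanged

def apply_s(x, z, r, a):
    """Tableau update for a phase gate on qubit a."""
    for i in range(len(r)):
        r[i] ^= x[i][a] & z[i][a]  # the phase uses the old value of z
        z[i][a] ^= x[i][a]         # X -> Y and Y -> X, while I and Z are unchanged

# A single row holding +X on one qubit.
x, z, r = [[1]], [[0]], [0]
apply_s(x, z, r, 0)
print(x, z, r)  # x = 1, z = 1, r = 0, i.e. +Y, matching the worked example
apply_s(x, z, r, 0)
print(x, z, r)  # x = 1, z = 0, r = 1, i.e. -X, matching S: Y -> -X
```

Note that in apply_s the phase bit must be updated before z_ia is overwritten, since the rule refers to the value of z_ia before the gate.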
The rowsum(h, j) procedure. First, define the function g : {0, 1}^4 → {−1, 0, 1} by:
\[
g(x_1, z_1, x_2, z_2) =
\begin{cases}
0, & \text{if } x_1 = z_1 = 0 \\
z_2 - x_2, & \text{if } x_1 = z_1 = 1 \\
z_2 (2 x_2 - 1), & \text{if } x_1 = 1,\ z_1 = 0 \\
x_2 (1 - 2 z_2), & \text{if } x_1 = 0,\ z_1 = 1
\end{cases}
\]
Intuitively, g is a function that returns the exponent to which the imaginary number
i is raised (either 0, 1, or −1) when the Pauli matrices represented by (x_1, z_1) and
(x_2, z_2) are multiplied together. For example, if x_1 = z_2 = 0 and z_1 = x_2 = 1 then
Definition 2 shows that (x_1, z_1) and (x_2, z_2) represent Z and X, respectively.
Multiplying Z and X together gives ZX = iY. Since the exponent on i is 1, we know that
g(0, 1, 1, 0) = 1. This fact is confirmed by the formula for g given earlier.
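The function g is small enough to transcribe and spot-check against known Pauli products. The following sketch is a direct transcription of the case formula; only the Python function name is an assumption.

```python
def g(x1, z1, x2, z2):
    """Exponent of i when the Pauli encoded by (x1, z1) multiplies the one encoded by (x2, z2)."""
    if x1 == 0 and z1 == 0:      # first factor is I
        return 0
    if x1 == 1 and z1 == 1:      # first factor is Y
        return z2 - x2
    if x1 == 1 and z1 == 0:      # first factor is X
        return z2 * (2 * x2 - 1)
    return x2 * (1 - 2 * z2)     # first factor is Z

print(g(0, 1, 1, 0))  # ZX = iY
print(g(1, 0, 0, 1))  # XZ = -iY
print(g(1, 1, 0, 1))  # YZ = iX
print(g(1, 0, 1, 0))  # XX = I
```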
The rowsum procedure takes two arguments h and j and performs the following set of
updates to the tableau. First, using the values of x_hk and z_hk from before any update,
define
\[
m \equiv 2 r_h + 2 r_j + \sum_{k=1}^{n} g(x_{jk}, z_{jk}, x_{hk}, z_{hk}) \pmod{4}
\]
and set r_h = m/2. Note that this is a valid operation because m will never equal 1 or 3
whenever R_h and R_j commute, as all stabilizer generators do. Next, for all
k ∈ {1, …, n} set x_hk = x_jk ⊕ x_hk and set z_hk = z_jk ⊕ z_hk.
Updating the x’s and z’s corresponds to multiplying the h-th and j-th generators together,
while updating r_h keeps track of the phase of the new h-th generator.
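A minimal sketch of rowsum follows, assuming the tableau is stored as three lists x, z, r with 0-based row indices (hypothetical names). The phase is computed from the old contents of row h before the x and z bits are overwritten.

```python
def g(x1, z1, x2, z2):
    """Exponent of i when Pauli (x1, z1) is multiplied by Pauli (x2, z2)."""
    if x1 == 0 and z1 == 0:
        return 0
    if x1 == 1 and z1 == 1:
        return z2 - x2
    if x1 == 1 and z1 == 0:
        return z2 * (2 * x2 - 1)
    return x2 * (1 - 2 * z2)

def rowsum(x, z, r, h, j):
    """Replace row h by the product of rows j and h, tracking the sign in r[h]."""
    n = len(x[0])
    m = 2 * r[h] + 2 * r[j] + sum(g(x[j][k], z[j][k], x[h][k], z[h][k]) for k in range(n))
    r[h] = (m % 4) // 2          # m mod 4 is 0 or 2 when the two rows commute
    for k in range(n):
        x[h][k] ^= x[j][k]
        z[h][k] ^= z[j][k]

# Row 0 is +XX and row 1 is +ZZ; their product is (XZ)(XZ) = (-iY)(-iY) = -YY.
x = [[1, 1], [0, 0]]
z = [[0, 0], [1, 1]]
r = [0, 0]
rowsum(x, z, r, 1, 0)
print(x[1], z[1], r[1])  # x = [1, 1], z = [1, 1], r = 1, i.e. -YY
```

Python's % operator returns a non-negative result for a positive modulus, so a negative sum such as −2 is correctly reduced to 2 before halving.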