Quantum Algorithm for Linear Systems of Equations
Introduction.—Quantum computers are devices that harness quantum mechanics to perform computations in ways that classical computers cannot. For certain problems, quantum algorithms supply exponential speedups over their classical counterparts, the most famous example being Shor's factoring algorithm [1]. Few such exponential speedups are known, and those that are (such as the use of quantum computers to simulate other quantum systems [2]) have so far found limited use outside the domain of quantum mechanics. This Letter presents a quantum algorithm to estimate features of the solution of a set of linear equations. Compared to classical algorithms for the same task, our algorithm can be as much as exponentially faster.

Linear equations play an important role in virtually all fields of science and engineering. The sizes of the data sets that define the equations are growing rapidly over time, so that terabytes and even petabytes of data may need to be processed to obtain a solution. In other cases, such as when discretizing partial differential equations, the linear equations may be implicitly defined and thus far larger than the original description of the problem. For a classical computer, even to approximate the solution of $N$ linear equations in $N$ unknowns in general requires time that scales at least as $N$. Indeed, merely to write out the solution takes time of order $N$. Frequently, however, one is interested not in the full solution to the equations, but rather in computing some function of that solution, such as determining the total weight of some subset of the indices.

We show that in some cases, a quantum computer can approximate the value of such a function in time which scales logarithmically in $N$, and polynomially in the condition number (defined below) and desired precision. The dependence on $N$ is exponentially better than what is achievable classically, while the dependence on condition number is comparable, and the dependence on error is worse. Typically, the accuracy required is not very large. However, the condition number often scales with the size of the problem, which presents a more serious limitation of our algorithm. Coping with large condition numbers has been studied extensively in the context of classical algorithms. In the discussion section, we will describe the applicability of some of the classical tools (pseudoinverses, preconditioners) to our quantum algorithm.

We sketch here the basic idea of our algorithm and then discuss it in more detail in the next section. Given a Hermitian $N \times N$ matrix $A$, and a unit vector $\vec{b}$, suppose we would like to find $\vec{x}$ satisfying $A\vec{x} = \vec{b}$. (We discuss later questions of efficiency as well as how the assumptions we have made about $A$ and $\vec{b}$ can be relaxed.) First, the algorithm represents $\vec{b}$ as a quantum state $|b\rangle = \sum_{i=1}^{N} b_i |i\rangle$. Next, we use techniques of Hamiltonian simulation [3,4] to apply $e^{iAt}$ to $|b\rangle$ for a superposition of different times $t$. This ability to exponentiate $A$ translates, via the well-known technique of phase estimation [5,6], into the ability to decompose $|b\rangle$ in the eigenbasis of $A$ and to find the corresponding eigenvalues $\lambda_j$. Informally, the state of the system after this stage is close to $\sum_{j=1}^{N} \beta_j |u_j\rangle |\lambda_j\rangle$, where $u_j$ is the eigenvector basis of $A$, and $|b\rangle = \sum_{j=1}^{N} \beta_j |u_j\rangle$. We would then like to perform the linear map taking $|\lambda_j\rangle$ to $C\lambda_j^{-1} |\lambda_j\rangle$, where $C$ is a normalizing constant. As this operation is not unitary, it has some probability of failing, which will enter into our discussion of the runtime below. After it succeeds, we uncompute the $|\lambda_j\rangle$ register and are left with a state proportional to $\sum_{j=1}^{N} \beta_j \lambda_j^{-1} |u_j\rangle = A^{-1}|b\rangle = |x\rangle$.

An important factor in the performance of the matrix inversion algorithm is $\kappa$, the condition number of $A$, or the ratio between $A$'s largest and smallest eigenvalues. As the condition number grows, $A$ becomes closer to a matrix which cannot be inverted, and the solutions become less stable. Our algorithms will generally assume that the singular values of $A$ lie between $1/\kappa$ and 1; equivalently, $\kappa^{-2} I \le A^\dagger A \le I$. In this case, our algorithm uses roughly $O(\kappa^2 \log(N)/\epsilon)$ steps to output a state within distance $\epsilon$ of $|x\rangle$. Therefore, the greatest advantage our algorithm has over classical algorithms occurs when both $\kappa$ and $1/\epsilon$ are $\mathrm{poly}\log(N)$, in which case it achieves an exponential speedup.

This procedure yields a quantum-mechanical representation $|x\rangle$ of the desired vector $\vec{x}$. Clearly, to read out all the components of $\vec{x}$ would require one to perform the procedure at least $N$ times. However, often one is interested not in $\vec{x}$ itself, but in some expectation value $\vec{x}^T M \vec{x}$, where $M$ is some linear operator (our procedure also accommodates nonlinear operators as described below). By mapping $M$ to a quantum-mechanical operator, and performing the quantum measurement corresponding to $M$, we obtain an estimate of the expectation value $\langle x|M|x\rangle = \vec{x}^T M \vec{x}$, as desired. A wide variety of features of the vector $\vec{x}$ can be extracted in this way, including normalization, weights in different parts of the state space, moments, etc.

A simple example where the algorithm can be used is to see if two different stochastic processes have a similar stable state [7]. Consider a stochastic process where the state of the system at time $t$ is described by the $N$-dimensional vector $\vec{x}_t$ and evolves according to the recurrence relation $\vec{x}_t = A\vec{x}_{t-1} + \vec{b}$. The stable state of this distribution is given by $|x\rangle = (I - A)^{-1}|b\rangle$. Let $\vec{x}'_t = A'\vec{x}'_{t-1} + \vec{b}'$ and $|x'\rangle = (I - A')^{-1}|b'\rangle$. To know if $|x\rangle$ and $|x'\rangle$ are similar, we perform the SWAP test between them [8]. We note that classically finding out if two probability distributions are similar requires at least $\Omega(\sqrt{N})$ samples [9].

The strength of the algorithm is that it works only with $O(\log N)$-qubit registers, and never has to write down all of $A$, $\vec{b}$, or $\vec{x}$. In situations (detailed below) where the Hamiltonian simulation and our nonunitary step incur only $\mathrm{poly}\log(N)$ overhead, this means our algorithm takes exponentially less time than a classical computer would need even to write down the output. In that sense, our algorithm is related to classical Monte Carlo algorithms, which achieve dramatic speedups by working with samples from a probability distribution on $N$ objects rather than by writing down all $N$ components of the distribution. However, while these classical sampling algorithms are powerful, we will prove that in fact any classical algorithm requires in general exponentially more time than our quantum algorithms to perform the same matrix inversion task.

Outline.—The rest of the Letter proceeds by first describing our algorithm in detail, analyzing its runtime and comparing it with the best known classical algorithms. Next, we prove (modulo some complexity-theoretic assumptions) hardness results for matrix inversion that imply both that our algorithm's runtime is nearly optimal, and that it runs exponentially faster than any classical algorithm. We conclude with a discussion of applications, generalizations, and extensions.

Related work.—Previous papers gave quantum algorithms to perform linear algebraic operations in a limited setting [10]. Our work was extended by Ref. [11] to solving nonlinear differential equations.

Algorithm.—We now give a more detailed explanation of the algorithm. First, we want to transform a given Hermitian matrix $A$ into a unitary operator $e^{iAt}$ which we can apply at will. This is possible (for example) if $A$ is $s$ sparse and efficiently row computable, meaning it has at most $s$ nonzero entries per row, and given a row index, these entries can be computed in time $O(s)$. Under these assumptions, Ref. [3] shows how to simulate $e^{iAt}$ in time

$$\tilde{O}(\log(N)\, s^2 t),$$

where the $\tilde{O}$ suppresses more slowly growing terms (described in Ref. [12]). If $A$ is not Hermitian, define

$$\tilde{A} = \begin{pmatrix} 0 & A \\ A^\dagger & 0 \end{pmatrix}. \tag{1}$$

As $\tilde{A}$ is Hermitian, we can solve the equation

$$\tilde{A}\vec{y} = \begin{pmatrix} \vec{b} \\ 0 \end{pmatrix}$$

to obtain

$$y = \begin{pmatrix} 0 \\ \vec{x} \end{pmatrix}.$$

Applying this reduction if necessary, the rest of the Letter assumes that $A$ is Hermitian.

We also need an efficient procedure to prepare $|b\rangle$. For example, if $b_i$ and $\sum_{i=i_1}^{i_2} |b_i|^2$ are efficiently computable, then we can use the procedure of Ref. [13] to prepare $|b\rangle$. Alternatively, our algorithm could be a subroutine in a larger quantum algorithm of which some other component is responsible for producing $|b\rangle$.

The next step is to decompose $|b\rangle$ in the eigenvector basis, using phase estimation [5,6]. Denote by $|u_j\rangle$ the eigenvectors of $A$ (or equivalently, of $e^{iAt}$), and by $\lambda_j$ the corresponding eigenvalues. Let

$$|\Psi_0\rangle := \sqrt{\frac{2}{T}} \sum_{\tau=0}^{T-1} \sin\frac{\pi(\tau + \frac{1}{2})}{T}\, |\tau\rangle \tag{2}$$

for some large $T$. The coefficients of $|\Psi_0\rangle$ are chosen (following Ref. [6]) to minimize a certain quadratic loss function which appears in our error analysis (see Ref. [12] for details).

Next, we apply the conditional Hamiltonian evolution $\sum_{\tau=0}^{T-1} |\tau\rangle\langle\tau| \otimes e^{iA\tau t_0/T}$ on $|\Psi_0\rangle \otimes |b\rangle$, where $t_0 = O(\kappa/\epsilon)$. Fourier transforming the first register gives the state

$$\sum_{j=1}^{N} \sum_{k=0}^{T-1} \alpha_{k|j}\, \beta_j\, |k\rangle |u_j\rangle, \tag{3}$$

where $|k\rangle$ are the Fourier basis states, and $|\alpha_{k|j}|$ is large if and only if $\lambda_j \approx 2\pi k/t_0$. Defining $\tilde{\lambda}_k := 2\pi k/t_0$, we can relabel our $|k\rangle$ register to obtain
PRL 103, 150502 (2009), PHYSICAL REVIEW LETTERS, week ending 9 OCTOBER 2009
$$\sum_{j=1}^{N} \sum_{k=0}^{T-1} \alpha_{k|j}\, \beta_j\, |\tilde{\lambda}_k\rangle |u_j\rangle.$$

Adding a qubit and rotating conditioned on $|\tilde{\lambda}_k\rangle$ yields

$$\sum_{j=1}^{N} \sum_{k=0}^{T-1} \alpha_{k|j}\, \beta_j\, |\tilde{\lambda}_k\rangle |u_j\rangle \left( \sqrt{1 - \frac{C^2}{\tilde{\lambda}_k^2}}\, |0\rangle + \frac{C}{\tilde{\lambda}_k}\, |1\rangle \right),$$

where $C$ is chosen to be $O(1/\kappa)$. We now undo the phase estimation to uncompute the $|\tilde{\lambda}_k\rangle$. If the phase estimation were perfect, we would have $\alpha_{k|j} = 1$ if $\tilde{\lambda}_k = \lambda_j$, and 0 otherwise. Assuming this for now, we obtain

$$\sum_{j=1}^{N} \beta_j |u_j\rangle \left( \sqrt{1 - \frac{C^2}{\lambda_j^2}}\, |0\rangle + \frac{C}{\lambda_j}\, |1\rangle \right).$$

To finish the inversion, we measure the last qubit. Conditioned on seeing 1, we have the state

$$\sqrt{\frac{1}{\sum_{j=1}^{N} C^2 |\beta_j|^2 / |\lambda_j|^2}}\; \sum_{j=1}^{N} \beta_j \frac{C}{\lambda_j}\, |u_j\rangle,$$

which corresponds to $|x\rangle = \sum_{j=1}^{N} \beta_j \lambda_j^{-1} |u_j\rangle$ up to normalization. We can determine the normalization factor from the probability of obtaining 1. Finally, we make a measurement $M$ whose expectation value $\langle x|M|x\rangle$ corresponds to the feature of $\vec{x}$ that we wish to evaluate.

Runtime and error analysis.—We present an informal description of the sources of error; the exact error analysis and runtime considerations are presented in Ref. [12]. Performing the phase estimation is done by simulating $e^{iAt}$. Assuming that $A$ is $s$ sparse, this can be done with error $\epsilon$ in time proportional to $t s^2 (t/\epsilon)^{o(1)} =: \tilde{O}(t s^2)$.

The dominant source of error is phase estimation. This step errs by $O(1/t_0)$ in estimating $\lambda$, which translates into a relative error of $O(1/(\lambda t_0))$ in $\lambda^{-1}$. If $\lambda \ge 1/\kappa$, taking $t_0 = O(\kappa/\epsilon)$ induces a final error of $\epsilon$. Finally, we consider the success probability of the post-selection process. Since $C = O(1/\kappa)$ and $\lambda \le 1$, this probability is at least $\Omega(1/\kappa^2)$. Using amplitude amplification [14], we find that $O(\kappa)$ repetitions are sufficient. Putting this all together, we obtain the stated runtime of $\tilde{O}(\log(N)\, s^2 \kappa^2/\epsilon)$.

Classical matrix inversion algorithms.—To put our algorithm in context, one of the best general-purpose classical matrix inversion algorithms is the conjugate gradient method [15], which, when $A$ is positive definite, uses $O(\sqrt{\kappa} \log(1/\epsilon))$ matrix-vector multiplications each taking time $O(Ns)$ for a total runtime of $O(Ns\sqrt{\kappa} \log(1/\epsilon))$. (If $A$ is not positive definite, $O(\kappa \log(1/\epsilon))$ multiplications are required, for a total time of $O(Ns\kappa \log(1/\epsilon))$.) An important question is whether classical methods can be improved when only a summary statistic of the solution, such as $\vec{x}^\dagger M \vec{x}$, is required. Another question is whether our quantum algorithm could be improved, say to achieve error $\epsilon$ in time proportional to $\mathrm{poly}\log(1/\epsilon)$. We show that the answer to both questions is negative, using an argument from complexity theory. Our strategy is to prove that the ability to invert matrices (with the right choice of parameters) can be used to simulate a general quantum computation.

The complexity of matrix inversion.—We will show that a quantum circuit using $n$ qubits and $T$ gates can be simulated by inverting an $O(1)$ sparse matrix $A$ of dimension $N = O(2^n \kappa)$. The condition number $\kappa$ is $O(T^2)$ if we need $A$ to be positive definite or $O(T)$ if not. As a result, if classical computers could estimate quantities of the form $\vec{x}^\dagger M \vec{x}$ in $\mathrm{poly}(\log N, \kappa, 1/\epsilon)$ time, then any $\mathrm{poly}(n)$-gate quantum circuit could be simulated by a $\mathrm{poly}(n)$-time classical algorithm. Such a simulation is strongly conjectured to be false, and is known to be impossible in the presence of oracles [16].

The reduction from a general quantum circuit to a matrix inversion problem also implies that our algorithm cannot be substantially improved (under standard assumptions). If the runtime could be made polylogarithmic in $\kappa$, then any problem solvable on $n$ qubits could be solved in $\mathrm{poly}(n)$ time (i.e., BQP = PSPACE), a highly unlikely possibility. Even improving our $\kappa$ dependence to $\kappa^{1-\delta}$ for $\delta > 0$ would allow any time-$T$ quantum algorithm to be simulated in time $o(T)$; iterating this would again imply that BQP = PSPACE. Similarly, improving the error dependence to $\mathrm{poly}\log(1/\epsilon)$ would imply that BQP includes PP, and even minor improvements would contradict oracle lower bounds [17].

The reduction.—We now present the key reduction from simulating a quantum circuit to matrix inversion. Consider a quantum circuit acting on $n = \log N$ qubits which applies $T$ two-qubit gates $U_1, \ldots, U_T$. The initial state is $|0\rangle^{\otimes n}$, and the answer is determined by measuring the first qubit of the final state, corresponding to the observable $M = |0\rangle\langle 0| \otimes I^{\otimes (n-1)}$.

Now adjoin an ancilla register of dimension $3T$ and define a unitary

$$U = \sum_{t=1}^{T} |t+1\rangle\langle t| \otimes U_t + |t+T+1\rangle\langle t+T| \otimes I + |t+2T+1 \bmod 3T\rangle\langle t+2T| \otimes U^\dagger_{3T+1-t}. \tag{4}$$

We have chosen $U$ so that for $T+1 \le t \le 2T$, applying $U^t$ to $|1\rangle|\psi\rangle$ yields $|t+1\rangle \otimes U_T \cdots U_1 |\psi\rangle$. If we now define $A = I - U e^{-1/T}$, then $\kappa(A) = O(T)$, and

$$A^{-1} = \sum_{k \ge 0} U^k e^{-k/T}. \tag{5}$$

This can be interpreted as applying $U^t$ for $t$ a geometrically distributed random variable. Since $U^{3T} = I$, we can assume $1 \le t \le 3T$. If we measure the first register and obtain $T+1 \le t \le 2T$ [which occurs with probability $e^{-2}/(1 + e^{-2} + e^{-4}) \ge 1/10$], then we are left with the second register in the state $U_T \cdots U_1 |\psi\rangle$, corresponding to a successful computation. Sampling from $|x\rangle$ allows us to sample from the results of the computation. Using these techniques, it is possible to show that matrix inversion is BQP-complete. Full details of the proof are given in the supplementary online material [12].
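The inversion steps of the Algorithm section can also be emulated classically for a tiny example, which may help in following the role of $C$, the post-selection probability, and the recovery of the normalization. The sketch below assumes perfect phase estimation (the case $\alpha_{k|j} = 1$ when $\tilde{\lambda}_k = \lambda_j$), real-valued vectors, and an eigendecomposition that is already known; the function `invert_via_eigenbasis` and the $2 \times 2$ example matrix are hypothetical choices for illustration, not part of the Letter.

```python
import math


def invert_via_eigenbasis(eigvals, eigvecs, b, kappa):
    """Emulate the post-selected inversion step, assuming perfect phase estimation.

    eigvals: real eigenvalues lambda_j with 1/kappa <= |lambda_j| <= 1
    eigvecs: orthonormal real eigenvectors u_j (lists of floats)
    b:       real unit vector
    Returns (normalized |x>, success probability, recovered norm of A^{-1} b).
    """
    C = 1.0 / kappa  # C = O(1/kappa) keeps every amplitude C / lambda_j <= 1
    # beta_j = <u_j|b>: decomposition of |b> in the eigenbasis of A
    betas = [sum(u_i * b_i for u_i, b_i in zip(u, b)) for u in eigvecs]
    # Amplitude carried by the ancilla outcome |1> for each eigenvector
    amps = [beta * C / lam for beta, lam in zip(betas, eigvals)]
    p_success = sum(a * a for a in amps)
    # Renormalized post-selected state, written in the computational basis
    dim = len(b)
    x = [sum(a * u[i] for a, u in zip(amps, eigvecs)) / math.sqrt(p_success)
         for i in range(dim)]
    # The norm of A^{-1} b follows from the probability of observing 1
    return x, p_success, math.sqrt(p_success) / C


s = 1 / math.sqrt(2)
eigvals = [1.0, 0.5]           # eigenvalues of A = [[0.75, 0.25], [0.25, 0.75]]
eigvecs = [[s, s], [s, -s]]    # corresponding orthonormal eigenvectors
x, p_success, x_norm = invert_via_eigenbasis(eigvals, eigvecs, [1.0, 0.0], kappa=2.0)
```

For this matrix, $A^{-1}\vec{b} = (1.5, -0.5)$, so the returned state should be that vector divided by its norm $\sqrt{2.5}$, which is also the value recovered in `x_norm`.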
Discussion.—There are a number of ways to extend our algorithm and relax the assumptions we made while presenting it. We will discuss first how to invert a broader class of matrices and then consider measuring other features of $\vec{x}$ and performing operations on $A$ other than inversion.

Certain nonsparse $A$ can be simulated and therefore inverted; see Ref. [4] for techniques and examples. It is also possible to invert nonsquare matrices, using the reduction presented from the non-Hermitian case to the Hermitian one.

The most important challenge in applying our algorithm is controlling the scaling of $\kappa$ with $N$. In the worst case, $\kappa$ can scale exponentially with $N$, although this is not robust against small perturbations in $A$ [18,19]. More often, $\kappa$ is polynomial in $N$, in which case our algorithm may not outperform classical algorithms, or may offer only polynomial speedups. Finally, the ideal situation for our algorithm is when $\kappa$ is polylogarithmic in $N$, as is the case with finite element models that use a fixed lattice spacing and a growing number of dimensions ([20], Section 9.6).

Given a matrix $A$ with large condition number, our algorithm can also choose to invert only the part of $|b\rangle$ which is in the well-conditioned part of $A$ (i.e., the subspace spanned by the eigenvectors with large eigenvalues). Formally, instead of transforming $|b\rangle = \sum_j \beta_j |u_j\rangle$ to $|x\rangle = \sum_j \lambda_j^{-1} \beta_j |u_j\rangle$, we transform it to a state which is close to

$$\sum_{j,\, \lambda_j \ge 1/\kappa} \lambda_j^{-1} \beta_j |u_j\rangle |\mathrm{well}\rangle + \sum_{j,\, \lambda_j < 1/\kappa} \beta_j |u_j\rangle |\mathrm{ill}\rangle$$

in time proportional to $\kappa^2$ for any chosen $\kappa$ (i.e., not necessarily the true condition number of $A$). The last qubit is a flag which enables the user to estimate the size of the ill-conditioned part, or to handle it in any other way she wants. If $A$ is not invertible and $1/\kappa$ is taken to be smaller than the smallest nonzero eigenvalue of $A$, then this procedure can be used to compute the pseudoinverse of $A$.

Another method that is often used in classical algorithms to handle ill-conditioned matrices is to apply a preconditioner [21,22]. If we have a method of generating a preconditioner matrix $B$ such that $\kappa(BA)$ is smaller than $\kappa(A)$, then we can solve $A\vec{x} = \vec{b}$ by instead solving the possibly easier matrix inversion problem $(BA)\vec{x} = B\vec{b}$. Further, if $A$ and $B$ are both sparse, then $BA$ is as well. Thus, as long as a state proportional to $B|b\rangle$ can be efficiently prepared, our algorithm could potentially run much faster if a suitable preconditioner is used.

The outputs of the algorithm can also be generalized. We can estimate degree-$2k$ polynomials in the entries of $\vec{x}$ by generating $k$ copies of $|x\rangle$ and measuring the appropriate $nk$-qubit observable on the state $|x\rangle^{\otimes k}$. Alternatively, one can use our algorithm to generate a quantum analogue of Monte Carlo calculations, where given $A$ and $\vec{b}$ we sample from the vector $\vec{x}$, meaning that the value $i$ occurs with probability $|\vec{x}_i|^2$.

Perhaps the most far-reaching generalization of the matrix inversion algorithm is not to invert matrices at all. Instead, it can compute $f(A)|b\rangle$ for any computable $f$. Depending on the degree of nonlinearity of $f$, nontrivial tradeoffs between accuracy and efficiency arise. Some variants of this idea are considered in Refs. [4,11,23].

We thank the W. M. Keck Foundation for support, and A. W. H. thanks them as well as MIT for hospitality while this work was carried out. A. W. H. was also funded by the U.K. EPSRC grant "QIP IRC." S. L. thanks R. Zecchina for encouraging him to work on this problem. We are grateful as well to R. Cleve, D. Farmer, S. Gharabian, J. Kelner, S. Mitter, P. Parillo, D. Spielman, and M. Tegmark for helpful discussions.

[1] P. W. Shor, Proc. of the 35th FOCS (IEEE, New York, 1994), pp. 124–134.
[2] S. Lloyd, Science 273, 1073 (1996).
[3] D. W. Berry, G. Ahokas, R. Cleve, and B. C. Sanders, Commun. Math. Phys. 270, 359 (2007).
[4] A. M. Childs, arXiv:0810.0312.
[5] R. Cleve, A. Ekert, C. Macchiavello, and M. Mosca, arXiv:quant-ph/9708016.
[6] A. Luis and J. Peřina, Phys. Rev. A 54, 4564 (1996).
[7] D. G. Luenberger, Introduction to Dynamic Systems: Theory, Models, and Applications (Wiley, New York, 1979).
[8] H. Buhrman, R. Cleve, J. Watrous, and R. de Wolf, Phys. Rev. Lett. 87, 167902 (2001).
[9] P. Valiant, Proc. of the 40th Annual STOC (ACM, New York, 2008), pp. 383–392.
[10] A. Klappenecker and M. Rotteler, Phys. Rev. A 67, 010302(R) (2003).
[11] S. K. Leyton and T. J. Osborne, arXiv:0812.4423.
[12] See EPAPS Document No. E-PRLTAO-103-055942 for detailed proofs of the claims in our manuscript. For more information on EPAPS, see http://www.aip.org/pubservs/epaps.html.
[13] L. Grover and T. Rudolph, arXiv:quant-ph/0208112.
[14] G. Brassard, P. Høyer, M. Mosca, and A. Tapp, Quantum Amplitude Amplification and Estimation, Contemporary Mathematics Series Millenium Volume 305 (AMS, New York, 2002).
[15] J. R. Shewchuk, Technical Report No. CMU-CS-94-125, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, March 1994.
[16] D. R. Simon, SIAM J. Comput. 26, 1474 (1997).
[17] E. Farhi, J. Goldstone, S. Gutmann, and M. Sipser, Phys. Rev. Lett. 81, 5442 (1998).
[18] A. Sankar, D. A. Spielman, and S. H. Teng, SIAM J. Matrix Anal. Appl. 28, 446 (2006).
[19] T. Tao and V. Vu, arXiv:0805.3167.
[20] S. C. Brenner and L. R. Scott, The Mathematical Theory of Finite Element Methods (Springer-Verlag, New York, 2008).
[21] K. Chen, Matrix Preconditioning Techniques and Applications (Cambridge Univ. Press, Cambridge, U.K., 2005).
[22] D. A. Spielman and S. H. Teng, arXiv:cs.NA/0607105.
[23] L. Sheridan, D. Maslov, and M. Mosca, J. Math Phys. A 42, 185302 (2009).