An Efficient Quantum Factoring Algorithm
Oded Regev∗
arXiv:2308.06572v2 [quant-ph] 17 Aug 2023
Abstract
We show that $n$-bit integers can be factorized by independently running a quantum circuit
with $\tilde{O}(n^{3/2})$ gates for $\sqrt{n} + 4$ times, and then using polynomial-time classical post-processing.
The correctness of the algorithm relies on a number-theoretic heuristic assumption reminiscent
of those used in subexponential classical factorization algorithms. It is currently not clear if the
algorithm can lead to improved physical implementations in practice.
1 Introduction
Shor’s celebrated algorithm [Sho99] allows one to factorize $n$-bit integers using a quantum circuit of
size (i.e., number of gates) $\tilde{O}(n^2)$. For factoring to be feasible in practice, however, it is desirable
to reduce this number further. Indeed, all else being equal, the fewer quantum gates there are in a
circuit, the likelier it is that it can be implemented without noise and decoherence destroying the
quantum effects.
Here we show that quantum circuits of size $\tilde{O}(n^{3/2})$ are enough. More precisely, we present
an algorithm that independently runs $\sqrt{n} + 4$ times a quantum circuit with $\tilde{O}(n^{3/2})$ gates. The
outputs are then classically post-processed in polynomial time (using a lattice reduction algorithm)
to generate the desired factorization.
The quantum circuit size can be made even smaller if super-polynomial-time classical post-processing
is allowed. Specifically, for any $0 < \varepsilon \le 1/2$, it can be brought down to $\tilde{O}(n^{3/2-\varepsilon})$ using
classical post-processing (solving a hard lattice problem) running in time $\exp(\tilde{O}(n^{2\varepsilon}))$. The number
of times the quantum circuit needs to be applied is still small ($n^{1/2+\varepsilon}$). A curious corollary is that
if lattice-based cryptography is broken classically (more precisely, if a polynomial-time classical
algorithm exists for hard lattice problems), then quantum circuits of nearly-linear size Õ(n) are
sufficient for factoring integers. This is obtained by taking ε = 1/2 in the previous discussion.
A few remarks are in order. First, our algorithm relies on a number-theoretic heuristic assumption
reminiscent of those used in subexponential classical factorization algorithms [CP05]. Second,
the number of qubits in our quantum circuit is $O(n^{3/2})$, higher than the $O(n)$ in optimized
implementations of Shor’s algorithm. Third, the quantum circuit’s depth is smaller than the one in Shor’s
original algorithm by $\tilde{O}(n^{1/2})$. (Note, though, that Shor’s algorithm can be implemented using
circuits of depth only $O(\log n)$ at the expense of a (much) larger number of gates $\Omega(n^5)$ [CW00].)
Finally, we expect similar ideas to apply to the discrete logarithm problem; details will be provided
in the full version.
∗ Courant Institute of Mathematical Sciences, New York University. Supported by a Simons Investigator Award
While our analysis is asymptotic, we expect a significant improvement in the number of gates
also for moderately large n, say 2048 bits, possibly by up to two or three orders of magnitude.
For such n, fast integer multiplication is typically not advantageous, and one instead uses highly
optimized variants of naive integer multiplication, leading to a quantum circuit size of approximately
$n^3$ for Shor’s algorithm. Combined with the fact that polynomial-time lattice reduction algorithms
provide surprisingly good approximation factors (e.g., $1.01^d$ for a $d$-dimensional lattice [GN08]), this
suggests that our approach can potentially achieve a circuit size closer to $n^2$ without fast integer
multiplication.
It is important to note, though, that an improvement in the number of gates does not necessarily
translate into an improved practical implementation. Indeed, in most architectures currently being
considered by industry, the space (or number of qubits) plays an important role. Shor’s algorithm is
amenable to extensive optimizations, allowing implementations with a very small number of qubits
(see [GE21] and references therein). It is currently not clear if our algorithm can benefit from all
these optimizations, the main issue being our use of repeated squaring. It therefore remains to be
seen whether the algorithm can lead to improved physical implementations in practice.
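Statement of the result: Fix some $n$-bit number $N \le 2^n$ to be factorized. For some $d > 0$,
let $b_1, \ldots, b_d$ be some small $O(\log d)$-bit integers (say, $b_i$ is the $i$th prime number) and let $a_i = b_i^2$.
Define the lattice
$$
\begin{aligned}
L &= \Big\{ (z_1, \ldots, z_d) \in \mathbb{Z}^d \;\Big|\; \Big(\prod_i b_i^{z_i}\Big)^2 = 1 \bmod N \Big\} \\
  &= \Big\{ (z_1, \ldots, z_d) \in \mathbb{Z}^d \;\Big|\; \prod_i a_i^{z_i} = 1 \bmod N \Big\} \subset \mathbb{Z}^d . \qquad (1)
\end{aligned}
$$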
Assuming $N$ is odd and not a prime power (since factoring is easy otherwise), we heuristically
expect at least half the vectors in $L$ to not be in $L_0$. For instance, when $N$ is a product of two
distinct odd primes, there are 4 square roots of 1 modulo $N$, so heuristically, half of the vectors in
$L$ should not be in $L_0$.
Given a vector $z \in L \setminus L_0$, we have that $b = \prod_i b_i^{z_i} \bmod N$ is a square root of unity modulo $N$
(because $z \in L$) yet it is a non-trivial square root of 1, i.e., not equal to $\pm 1$ modulo $N$ (because
$z \notin L_0$). In this case, $N$ divides the product $(b - 1)(b + 1)$ but does not divide either of the terms,
and we therefore must have that $\gcd(b - 1, N)$ is a non-trivial factor of $N$, as desired. Therefore,
it suffices to find a vector in $L \setminus L_0$.
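The gcd step just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `factor_from_vector` and the toy values $N = 35$, $b = (2, 3)$ are not from the paper, and negative exponents rely on Python 3.8+ modular inverses (so each $b_i$ must be coprime to $N$).

```python
from math import gcd

def factor_from_vector(N, bs, z):
    """Return a non-trivial factor of N given z in L but not in L0 (None if z is in L0)."""
    b = 1
    for b_i, z_i in zip(bs, z):
        # pow with a negative exponent computes a modular inverse (Python 3.8+)
        b = (b * pow(b_i, z_i, N)) % N
    if (b * b) % N != 1:
        raise ValueError("z is not in the lattice L")
    if b in (1, N - 1):
        return None  # trivial square root of 1: z lies in L0
    return gcd(b - 1, N)

# Toy instance (hypothetical): 2 * 3 = 6 and 6^2 = 36 = 1 mod 35.
print(factor_from_vector(35, [2, 3], [1, 1]))
```

In the toy instance, $z = (1, 1)$ gives $b = 6$, a square root of 1 modulo 35 other than $\pm 1$, so $\gcd(5, 35) = 5$ is returned.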
By the pigeon-hole principle (or Minkowski’s first theorem) and using the fact that there are
at most $N \le 2^n$ possible values for the product modulo $N$ in Eq. (1) (i.e., the determinant of
$L$ is at most $2^n$), $L$ is guaranteed to have nonzero vectors of norm at most $\sqrt{d}\,2^{n/d}$.¹ While we
expect some (in fact, at least half) of them to be in $L \setminus L_0$, we do not know how to prove it.²
Instead, we will make the heuristic assumption that there exists a vector in $L \setminus L_0$ of norm at most
$T = \exp(O(n/d))$. With this assumption, the algorithm is guaranteed to provide a factorization of
$N$, as in our main result, stated next.
¹ To see this, consider all vectors $z \in \{-2^{n/d-1}, \ldots, 2^{n/d-1}\}^d$. Since there are more than $2^n \ge N$ such vectors,
there are two that lead to the same product in Eq. (1). Their difference is therefore in $L$, and is of norm at most
$\sqrt{d}\,2^{n/d}$.
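The pigeon-hole argument can be tried out on a toy instance. The sketch below is an illustration only: the names `h` and `short_vector_by_pigeonhole` and the parameters $N = 35$ (so $n = 6$), $a = (4, 9)$, $d = 2$ are assumptions made here, and brute-force enumeration is of course only feasible for tiny $N$.

```python
from itertools import product
from math import prod, sqrt

def h(N, a, z):
    """The product of a_i^{z_i} mod N (negative exponents need gcd(a_i, N) = 1)."""
    return prod(pow(a_i, z_i, N) for a_i, z_i in zip(a, z)) % N

def short_vector_by_pigeonhole(N, a, half_width):
    """Enumerate the box {-half_width..half_width}^d until two points collide under h."""
    seen = {}
    for z in product(range(-half_width, half_width + 1), repeat=len(a)):
        value = h(N, a, z)
        if value in seen:
            # The difference of two colliding points maps to 1, so it lies in L.
            return tuple(x - y for x, y in zip(z, seen[value]))
        seen[value] = z
    return None

# n = 6, d = 2, so the box half-width is 2^(n/d - 1) = 4 and it holds 81 > 2^6 points.
u = short_vector_by_pigeonhole(35, [4, 9], 4)
print(u, h(35, [4, 9], u), sqrt(sum(x * x for x in u)))
```

Any vector returned this way is nonzero, satisfies $\prod_i a_i^{u_i} = 1 \bmod N$, and has norm at most $\sqrt{d}\,2^{n/d}$, matching the bound in the footnote.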
Theorem 1.1. Let $N$ be an $n$-bit number and assume that for $d = \sqrt{n}$ and $O(\log n)$-bit numbers
$b_1, \ldots, b_d$, there exists a vector in $L \setminus L_0$ of norm at most $T = \exp(O(\sqrt{n}))$. Then, there is a
classical polynomial-time algorithm that outputs a non-trivial factor of $N$ using $\sqrt{n} + 4$ calls to a
quantum circuit of size $O(n^{3/2} \log n)$.
Related work: The idea of reducing the cost of the quantum circuit at the expense of applying
it several times independently and classically post-processing the outputs was already suggested by
Seifert [Sei01] (see also [EH17]). However, the improvement obtained by his algorithm is only by a
constant factor.
Acknowledgements: The author is grateful to Martin Ekerå, Craig Gidney, Minki Hhan, Igor
Shparlinski, Noah Stephens-Davidowitz, Thomas Vidick, and Ronald de Wolf for their comments
on an early draft.
$\tilde{O}(n^{3/2})$. The elementary but crucial idea is to perform all multiplications on the small numbers $a_i$
directly, so that the only operations we have to perform on large $n$-bit numbers are $\log_2 R = O(\sqrt{n})$
squaring operations.
$$
Q_v(w) := \frac{\rho_{1/(\sqrt{2}R)}(v - w + \mathbb{Z}^d)}{\rho_{1/(\sqrt{2}R)}(v - D^{-1}\mathbb{Z}^d)} . \qquad (3)
$$
In other words, a sample from $Q$ can be described as the output of the following process: let $v \in L^*/\mathbb{Z}^d$
be a uniform coset; then, output a sample from the Gaussian distribution on $\{0, 1/D, \ldots, (D-1)/D\}^d$
whose mass at point $w$ is proportional to $\rho_{1/(\sqrt{2}R)}(v - w + \mathbb{Z}^d)$. Importantly, as we show in
Claim A.7, with all but probability $O(2^{-d})$, $\mathrm{dist}_{\mathbb{R}^d/\mathbb{Z}^d}(w, v) \le \sqrt{d}/(\sqrt{2}R)$, where
$\mathrm{dist}_{\mathbb{R}^d/\mathbb{Z}^d}(w, v) := \min_{z \in \mathbb{Z}^d} \mathrm{dist}(w, v + z)$ denotes distance in the torus, i.e., modulo 1.
The quantum procedure starts by approximating to within $1/\mathrm{poly}(d)$ the state proportional to
$$
\sum_{z \in \{-D/2, \ldots, D/2-1\}^d} \rho_R(z)\,|z\rangle , \qquad (4)
$$
similarly to how it was done in, e.g., [Reg09]. Namely, to generate this “discrete Gaussian state”,
first note that it can be written as the tensor product of $d$ copies of the one-dimensional state
($d = 1$). It therefore suffices to generate the one-dimensional state. To do this, we use a standard
technique which basically puts each qubit from the most-significant to the least-significant into the
appropriate superposition of $|0\rangle$ and $|1\rangle$ conditioned on the values of the previous qubits [GR02].
To obtain a small circuit size, notice that beyond the $O(\log d)$ most significant qubits, all remaining
qubits are within distance $1/\mathrm{poly}(d)$ of the “plus state” $(|0\rangle + |1\rangle)/\sqrt{2}$, no matter what values we
condition on. (This uses that $D = O(\mathrm{poly}(d) \cdot R)$ and that $\rho_R$ changes slowly, namely, that for all
$z \in \{-D/2, \ldots, D/2 - 1\}$ and $0 \le k \le D/\mathrm{poly}(d)$, $\rho_R(z)$ is within $1 \pm 1/\mathrm{poly}(d)$ of $\rho_R(z + k)$.)
We can therefore simply initialize the remaining qubits to the plus state using one Hadamard gate
per qubit. In summary, we can approximate the state in (4) using a quantum circuit of size only
$d(\log D + \mathrm{poly}(\log d))$, where the $\mathrm{poly}(\log d)$ term is for computing the rotation needed for each
of the $O(\log d)$ most significant qubits, and the $\log D$ term is due to the Hadamard gates on the
remaining qubits.
The next step is the most costly one. Here we apply a classical procedure in superposition
in order to compute the value $\prod_i a_i^{z_i + D/2} \bmod N$ into a new register $|e\rangle$. (We added $D/2$ for
convenience so we do not need to worry about negative exponents.) Notice that $h(z) := \prod_i a_i^{z_i} \bmod N$
is a homomorphism from $\mathbb{Z}^d$ to $\mathbb{Z}_N^*$, the multiplicative group of integers modulo $N$, and that its
kernel is $L$. Therefore, there is a bijection between $\mathbb{Z}^d/L$ and the image of $h$. As a result, since $|e\rangle$
will be ignored, we can equivalently write the resulting state up to normalization as
$$
\sum_{e \in \mathbb{Z}^d/L} \ \sum_{z \in (L+e) \cap [-D/2, D/2)^d} \rho_R(z)\,|z\rangle|e\rangle . \qquad (5)
$$
To compute $\prod_i a_i^{z_i + D/2} \bmod N$, first notice that when all the exponents are in $\{0, 1\}$, we can
compute the product of the $d$ numbers in a binary tree fashion, leading to the recurrence $T(d) =
2T(d/2) + M(d \log d)$, where $M(k)$ is the number of gates needed to compute the product of two
$k$-bit numbers, and here we are using that $a_1, \ldots, a_d$ are all small $O(\log d)$-bit numbers. Using fast
integer multiplication [HvdH21], $M(k) = O(k \log k)$, leading to a circuit of size $O(d \log^3 d)$. The
general case of exponents in $\{0, \ldots, D - 1\}$ can be handled using a repeated squaring-like idea.
More specifically, for $j = 0, \ldots, \lfloor \log_2(D - 1) \rfloor$, let $z_{ij}$ denote the $j$th bit of $z_i + D/2$, with $j = 0$
being the most significant. Then, letting $e$ be a register initialized to 1, we do the following for
$j = 0, \ldots, \lfloor \log_2(D - 1) \rfloor$: square $e$, then compute the product of the subset of the $a_i$ determined
by the $z_{ij}$, and multiply $e$ by the result. To summarize, the circuit size needed for this step is
$O(\log D \cdot (d \log^3 d + n \log n))$, where we use that $e$ is an $n$-bit number and can therefore be squared
in time $O(n \log n)$.
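The square-then-multiply loop above can be mirrored classically. The following is a minimal sketch, not the reversible circuit itself: the names `tree_product` and `multi_exponentiate` are hypothetical, Python's built-in integer multiplication stands in for fast multiplication, and the small subset product is not reduced modulo $N$ until it is multiplied into `e`.

```python
def tree_product(nums):
    """Multiply a list of small numbers pairwise, level by level (binary tree)."""
    if not nums:
        return 1
    while len(nums) > 1:
        nums = [nums[i] * nums[i + 1] if i + 1 < len(nums) else nums[i]
                for i in range(0, len(nums), 2)]
    return nums[0]

def multi_exponentiate(a, exps, N, bits):
    """Compute prod_i a[i]**exps[i] mod N, where 0 <= exps[i] < 2**bits."""
    e = 1
    for j in range(bits):              # j = 0 is the most significant bit
        shift = bits - 1 - j
        e = (e * e) % N                # the only squaring of a large number
        chosen = [a_i for a_i, x in zip(a, exps) if (x >> shift) & 1]
        e = (e * tree_product(chosen)) % N
    return e

# Toy check against direct modular exponentiation (values chosen arbitrarily).
a, exps, N = [4, 9, 25, 49], [5, 0, 7, 3], 1000001
direct = 1
for a_i, x in zip(a, exps):
    direct = direct * pow(a_i, x, N) % N
print(multi_exponentiate(a, exps, N, 3) == direct)
```

Each of the `bits` rounds performs one squaring of the large register and one multiplication by a product of small numbers, which is where the $\log D \cdot (d \log^3 d + n \log n)$ shape of the bound comes from.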
In the final step, we apply the quantum Fourier transform (over $\mathbb{Z}_D^d$) to the $|z\rangle$ register, and then
output the vector in $\{0, 1/D, \ldots, (D - 1)/D\}^d$ obtained by measuring that register and dividing
by $D$. As we show using a standard calculation in Appendix A (specifically, in Claim A.5 and
Proposition A.6), the resulting distribution is within $O(2^{-d})$ distance of the distribution $Q$, as
desired. The circuit size needed for this step is only $O(d \log D \cdot \log((\log D)/\varepsilon))$ by using approximate
QFT with error $\varepsilon$ [Cop02]. Taking $\varepsilon = 1/\mathrm{poly}(d)$, we get circuit size $O(d \cdot \log D \cdot (\log \log D + \log d))$.
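For intuition, the three steps (Gaussian state, computing $e$, Fourier transform and measurement) can be simulated classically in one dimension. The sketch below is an illustration under several assumptions made here: $d = 1$, toy values $N = 35$, $a = 4$ (whose multiplicative order is 6, so $L = 6\mathbb{Z}$ and $L^*/\mathbb{Z} = \{0, 1/6, \ldots, 5/6\}$), $R = 32$, $D = 128$, and the normalization $\rho_R(z) = \exp(-\pi z^2/R^2)$. The function names are hypothetical.

```python
import cmath
import math

def output_distribution(N, a, R, D):
    """Classically simulate the d = 1 procedure; returns P[k] for the outcome w = k / D."""
    # Group the Gaussian amplitudes rho_R(z) by the content of the e register.
    by_e = {}
    for z in range(-D // 2, D // 2):
        e = pow(a, z + D // 2, N)                    # a^(z + D/2) mod N
        amp = math.exp(-math.pi * z * z / (R * R))   # assumed rho_R(z)
        by_e.setdefault(e, []).append((z, amp))
    probs = [0.0] * D
    for members in by_e.values():
        for k in range(D):
            # Fourier transform over Z_D of the branch with this value of e
            f = sum(amp * cmath.exp(1j * 2 * math.pi * k * z / D) for z, amp in members)
            probs[k] += abs(f) ** 2
    total = sum(probs)
    return [p / total for p in probs]

def torus_dist(x, y):
    """Distance between x and y modulo 1."""
    t = abs(x - y) % 1.0
    return min(t, 1.0 - t)

N, a, R, D = 35, 4, 32.0, 128
delta = 1.0 / (math.sqrt(2) * R)   # sqrt(d) / (sqrt(2) R) with d = 1
probs = output_distribution(N, a, R, D)
near = sum(p for k, p in enumerate(probs)
           if min(torus_dist(k / D, m / 6) for m in range(6)) <= delta)
print(f"mass within delta of L*/Z: {near:.4f}")
```

With these toy values almost all of the output mass sits within $\delta = \sqrt{d}/(\sqrt{2}R)$ of a multiple of $1/6$, which is the one-dimensional picture of the output being close to a uniformly chosen coset $v \in L^*/\mathbb{Z}^d$.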
To summarize, the quantum procedure uses a circuit of size
Corollary 4.2. In the setting of Theorem 4.1, r + 4 uniformly random elements of G generate G
with probability at least 1/2.
Proof. Otherwise, with probability at least 1/2, r + 5 elements are needed to generate G. Since this
random variable is never smaller than r by assumption, its expectation is at least r + 5/2 > r + σ,
in contradiction.
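The expectation bound in this proof can be spelled out as follows. Here $X$ is a name introduced only for this illustration, denoting the number of uniformly random elements needed to generate $G$, and $p := \Pr[X \ge r + 5] \ge 1/2$ is the probability assumed in the proof; the final comparison with $r + \sigma$ presumes, as the proof does, that Theorem 4.1 bounds $\mathbb{E}[X]$ below $r + 5/2$.

```latex
\mathbb{E}[X] \;\ge\; r \cdot (1 - p) + (r + 5) \cdot p
            \;=\; r + 5p
            \;\ge\; r + \tfrac{5}{2} .
```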
Lemma 4.3. Let $L \subset \mathbb{Z}^d$, $m \ge d + 4$, and assume $v_1, \ldots, v_m$ are uniformly chosen cosets from
$L^*/\mathbb{Z}^d$. With probability at least 1/4, it holds that for all nonzero $u \in \mathbb{Z}^d/L$, there exists an $i$ such
that $\langle u, v_i \rangle \notin [-\varepsilon, \varepsilon] \bmod 1$, where $\varepsilon = (4 \det L)^{-1/m}/3$.
Proof. First, notice that the group $L^*/\mathbb{Z}^d$ can be generated by at most $d$ elements, e.g., by taking
a basis of $L^*$. Therefore, by Corollary 4.2, with probability at least 1/2, $v_1, \ldots, v_m$ generate
$L^*/\mathbb{Z}^d$. Assume that this is the case. Fix some nonzero $u \in \mathbb{Z}^d/L$, and consider the distribution
of $\langle u, v \rangle \bmod 1$ where $v$ is uniformly chosen from $L^*/\mathbb{Z}^d$. This distribution is not identically zero
(as otherwise $u$ would be the zero coset $L$). In fact, it must be equal to the uniform distribution
over the set $\{0, 1/t, 2/t, \ldots, (t - 1)/t\}$ for some $t \ge 2$. This follows, e.g., from the invariance of the
uniform distribution over $L^*/\mathbb{Z}^d$ to shifts by elements from that same group. If $t < 1/\varepsilon$ then by
our assumption, there must exist an $i$ such that $\langle u, v_i \rangle \ne 0 \bmod 1$ which in particular implies that
$\langle u, v_i \rangle \notin [-\varepsilon, \varepsilon] \bmod 1$. Otherwise, assume $t \ge 1/\varepsilon$ and notice that for any fixed $i$, the probability
of $\langle u, v_i \rangle \in [-\varepsilon, \varepsilon] \bmod 1$ is
$$
(1 + 2\lfloor t\varepsilon \rfloor)/t \le 3\varepsilon .
$$
Therefore, the probability that $\langle u, v_i \rangle \in [-\varepsilon, \varepsilon] \bmod 1$ for all $i \in \{1, \ldots, m\}$ is at most $(3\varepsilon)^m$. We
complete the proof by applying the union bound over all $\det L - 1$ nonzero elements $u$ in $\mathbb{Z}^d/L$.
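The arithmetic behind the final union bound, written out as a short worked step. It uses only the choice $\varepsilon = (4 \det L)^{-1/m}/3$ from the lemma statement, which gives $(3\varepsilon)^m = (4 \det L)^{-1}$:

```latex
\Pr\big[\exists\, u \neq 0 \text{ with } t \ge 1/\varepsilon :\
        \langle u, v_i \rangle \in [-\varepsilon, \varepsilon] \bmod 1 \ \ \forall i \big]
  \;\le\; (\det L - 1)\,(3\varepsilon)^m
  \;<\; \det L \cdot \frac{1}{4 \det L}
  \;=\; \frac{1}{4} .
```

Together with the probability of at most 1/2 that $v_1, \ldots, v_m$ fail to generate $L^*/\mathbb{Z}^d$, the total failure probability is below $3/4$, which leaves the success probability of at least 1/4 claimed in the lemma.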
Lemma 4.4. Let $L \subset \mathbb{Z}^d$ and $m \ge d + 4$. Assume $v_1, \ldots, v_m$ are uniformly chosen cosets from
$L^*/\mathbb{Z}^d$. For some $\delta > 0$ let $w_1, \ldots, w_m \in [0, 1)^d$ satisfy that $\mathrm{dist}_{\mathbb{R}^d/\mathbb{Z}^d}(w_i, v_i) < \delta$ for all $i$. For some
$S > 0$, define the $(d + m)$-dimensional lattice $L'$ generated by the columns of
$$
B = \begin{pmatrix}
I_{d \times d} & 0 \\
\begin{matrix} S \cdot w_1 \\ \vdots \\ S \cdot w_m \end{matrix} & S \cdot I_{m \times m}
\end{pmatrix} .
$$
Then, for any $u \in L$, there exists a vector $u' \in L'$ whose first $d$ coordinates are equal to $u$ and whose
norm is at most $\|u\| \cdot (1 + m \cdot S^2 \cdot \delta^2)^{1/2}$. Moreover, with probability at least 1/4 (over the choice
of the $v_i$), any nonzero $u' \in L'$ of norm $\|u'\| < \min(S, \delta^{-1}) \cdot \varepsilon/2$ satisfies that its first $d$ coordinates
are a nonzero vector in $L$, where $\varepsilon = (4 \det L)^{-1/m}/3$.
Proof. Take any $u \in L$. Then, for any $i$, $\langle u, v_i \rangle = 0 \bmod 1$ (since $v_i \in L^*/\mathbb{Z}^d$), and therefore, by
Cauchy-Schwarz, $\langle u, w_i \rangle$ is within distance $\delta\|u\|$ of an integer. The claim now follows by taking the
combination of the first $d$ columns of $B$ given by the coordinates of $u$ and taking an appropriate
combination of the remaining $m$ columns of $B$ to make the last $m$ coordinates of the resulting vector
at most $S\delta\|u\|$ in absolute value. Next, assume $v_1, \ldots, v_m$ satisfy the conclusion of Lemma 4.3,
which happens with probability at least 1/4. Take any nonzero $u' \in L'$ and let $u \in \mathbb{Z}^d$ be its first
$d$ coordinates. If $u = 0$ then clearly $\|u'\| \ge S$, as desired. Assume therefore that $u$ is not in $L$, or
equivalently, that the coset $u + L$ is not the zero element in $\mathbb{Z}^d/L$. If $\|u\| \ge \varepsilon/(2\delta)$, we are done, so
assume $\|u\| < \varepsilon/(2\delta)$. By Lemma 4.3, there exists an $i$ such that $\langle u, v_i \rangle$ is at least $\varepsilon$ away from an
integer. By Cauchy-Schwarz, this implies that $\langle u, w_i \rangle$ is at least
$$
\varepsilon - \delta\|u\| > \varepsilon/2
$$
away from an integer. As a result, $u'$ has a coordinate of absolute value at least $S\varepsilon/2$, as desired.
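The first part of the lemma (lifting $u \in L$ to a short $u' \in L'$) is easy to check numerically. The sketch below is a toy illustration under assumptions made here: $d = 1$, $L = 3\mathbb{Z}$ (so $L^*/\mathbb{Z} = \{0, 1/3, 2/3\}$), $m = 5$, hand-picked noisy points $w_i$ with $\delta = 0.0025$, and $S = 100$. The function names are hypothetical.

```python
import math

def lift_to_L_prime(u, ws, S):
    """Vector of L' whose first d coordinates are u and whose last m coordinates are small."""
    tail = []
    for w in ws:
        inner = sum(u_j * w_j for u_j, w_j in zip(u, w))
        # Subtracting the nearest integer corresponds to adding an integer
        # multiple of one of the last m columns of B.
        tail.append(S * (inner - round(inner)))
    return list(u) + tail

def norm(x):
    return math.sqrt(sum(t * t for t in x))

# Toy data: each w_i is within delta of some coset in {0, 1/3, 2/3}.
S, delta = 100.0, 0.0025
ws = [[1 / 3 + 0.001], [2 / 3 - 0.002], [0.0005], [1 / 3 - 0.0015], [2 / 3 + 0.002]]
u = [3]                                  # a vector of L = 3Z
u_prime = lift_to_L_prime(u, ws, S)
bound = norm(u) * math.sqrt(1 + len(ws) * S**2 * delta**2)
print(norm(u_prime) <= bound)
```

Each of the last $m$ coordinates is at most $S\delta\|u\|$ in absolute value, so the norm stays below $\|u\| \cdot (1 + m S^2 \delta^2)^{1/2}$ as the lemma states.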
Proof of Theorem 1.1. Take $d = \sqrt{n}$ and $R = \exp(C\sqrt{n})$ for a large enough constant $C > 0$. With
these parameters, and recalling that $D \in [2\sqrt{d} \cdot R, 4\sqrt{d} \cdot R)$, the quantum procedure’s circuit size
in (6) becomes
$$
O(n^{3/2} \log n)
$$
and its output is a point $w$ within distance $\delta = \sqrt{d}/(\sqrt{2}R)$ of a uniformly chosen $v \in L^*/\mathbb{Z}^d$ (with
all but probability $1/\mathrm{poly}(d)$). Apply the quantum procedure $d + 4$ times independently to obtain
such vectors $w_1, \ldots, w_{d+4}$.
Consider the $(2d + 4)$-dimensional lattice $L'$ given in Corollary 4.5. By our assumption, there
exists a vector $u' \in L'$ of norm at most $(d + 5)^{1/2} \cdot T$ whose first $d$ coordinates are a nonzero
vector $u \in L \setminus L_0$. We next apply the classical algorithm in Claim 5.1 to $L'$ with the norm bound
$(d + 5)^{1/2} \cdot T$. As its output, we obtain vectors $z_1', \ldots, z_\ell'$ of norm at most
where the inequality follows since $\det L \le N \le 2^n$ and by choosing $R$ large enough. As a result,
by the second property in Corollary 4.5, except with probability 1/4, if we denote the first $d$
coordinates of $z_i'$ by $z_i$, we have that $z_i \in L$ for all $i$. Moreover, at least one of the $z_i$ must not be
in $L_0$, otherwise $u$, which is an integer combination of the $z_i$ (since $u'$ is an integer combination
of the $z_i'$) would also be in $L_0$. Finally, we apply for each of the $z_i$ the gcd calculation outlined
in Section 1. Since there exists an $i$ such that $z_i \in L \setminus L_0$, one of these calculations will yield a
non-trivial factor of $N$, as desired.
and rearrange.
We will use the following formulation of the Poisson summation formula. Here, $\hat{f}$ denotes the
Fourier transform of $f$. For instance, $\hat{\rho}_s = s^n \rho_{1/s}$.
Lemma A.3. (Poisson summation formula) For any lattice $L$ and any (nice enough) function
$f : \mathbb{R}^n \to \mathbb{C}$,
$$
f(L) = \det(L^*)\,\hat{f}(L^*) ,
$$
where $\hat{f}$ denotes the Fourier transform of $f$.
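As a quick sanity check, the formula can be verified numerically in the simplest case $L = \mathbb{Z}$, where $L^* = \mathbb{Z}$ and $\det(L^*) = 1$, so that it reads $\rho_s(\mathbb{Z}) = s \cdot \rho_{1/s}(\mathbb{Z})$. The sketch assumes the normalization $\rho_s(x) = \exp(-\pi x^2/s^2)$, which is consistent with $\hat{\rho}_s = s^n \rho_{1/s}$; the function names and the truncation cutoff are choices made here.

```python
import math

def rho(s, x):
    """Gaussian rho_s(x) = exp(-pi x^2 / s^2) (assumed normalization)."""
    return math.exp(-math.pi * x * x / (s * s))

def gaussian_mass_on_Z(s, cutoff=200):
    """Truncated sum of rho_s over the integers; the omitted tail is negligible here."""
    return sum(rho(s, z) for z in range(-cutoff, cutoff + 1))

def poisson_gap(s):
    """|rho_s(Z) - s * rho_{1/s}(Z)|, which Poisson summation says is zero."""
    return abs(gaussian_mass_on_Z(s) - s * gaussian_mass_on_Z(1.0 / s))

for s in (0.6, 1.0, 1.7, 5.0):
    print(s, poisson_gap(s))
```

The printed gaps should be at the level of floating-point rounding error.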
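Let $|\varphi_1\rangle$ be the state in Eq. (5) which, to recall, is given by
$$
|\varphi_1\rangle = Z_1^{-1} \sum_{e \in \mathbb{Z}^d/L} \ \sum_{z \in (L+e) \cap \{-D/2, \ldots, D/2-1\}^d} \rho_R(z)\,|z\rangle|e\rangle ,
$$
Claim A.4. We have that $Z_1^2 \in [1 \pm 2 \cdot 2^{-d}](R/\sqrt{2})^d$.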
Proof. Notice that
$$
Z_1^2 = \sum_{z \in \{-D/2, \ldots, D/2-1\}^d} \rho_R(z)^2 = \sum_{z \in \{-D/2, \ldots, D/2-1\}^d} \rho_{R/\sqrt{2}}(z) .
$$
Therefore, by Lemma A.1 (with $x = 0$) and using $D/2 \ge \sqrt{d}R/\sqrt{2}$, $Z_1^2$ satisfies
where, again, $Z_2 > 0$ is the normalization term. In other words, whereas in $|\varphi_1\rangle$ we truncate the
discrete Gaussian to the box $\{-D/2, \ldots, D/2 - 1\}^d$, in $|\varphi_2\rangle$ we let it wrap around modulo $D$. In
the next claim, we show that the two states are very close, which intuitively follows from the fact
that the Gaussian mass does not extend much beyond radius $\sqrt{d}R$.
Claim A.5. We have that
$$
\| |\varphi_1\rangle - |\varphi_2\rangle \|_2 \le 2 \cdot 2^{-d} .
$$
Moreover, $Z_1/Z_2 \in [1 \pm 2^{-d}]$.
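Proof. Let $M$ be the $2d$-dimensional lattice given by all vectors $(z_1, z_2) \in \mathbb{Z}^{2d}$ such that both
$z_1 = z_2 \bmod D$ and $z_1 = z_2 \bmod L$. Then notice that
$$
\begin{aligned}
Z_2^2 &= \sum_{e \in \mathbb{Z}^d/L} \ \sum_{z \in \mathbb{Z}_D^d} \rho_R\big((z + D\mathbb{Z}^d) \cap (L + e)\big)^2 \\
      &= \sum_{\substack{z_1, z_2 \in \mathbb{Z}^d \\ z_1 = z_2 \bmod D \\ z_1 = z_2 \bmod L}} \rho_R(z_1) \cdot \rho_R(z_2) \\
      &= \rho_R(M) .
\end{aligned}
$$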
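Similarly, denoting by $M'$ the subset of $M$ corresponding to all vectors $(z_1, z_2)$ such that neither
$z_1$ nor $z_2$ are in $\{-D/2, \ldots, D/2 - 1\}^d$,
$$
\begin{aligned}
\| Z_1 |\varphi_1\rangle - Z_2 |\varphi_2\rangle \|_2^2
  &= \sum_{e \in \mathbb{Z}^d/L} \ \sum_{z \in \mathbb{Z}_D^d} \rho_R\Big(\big((z + D\mathbb{Z}^d) \cap (L + e)\big) \setminus \{-D/2, \ldots, D/2 - 1\}^d\Big)^2 \\
  &= \rho_R(M') \\
  &\le 2^{-2d} \cdot \rho_R(M) \\
  &= 2^{-2d} \cdot Z_2^2 ,
\end{aligned}
$$
where we used Lemma A.1 (with $x = 0$) and the fact that all vectors in $M'$ are of norm at least
$D/\sqrt{2} \ge \sqrt{2d}R$. Dividing both sides by $Z_2^2$, we get
By the triangle inequality, this implies that $Z_1/Z_2 = \|(Z_1/Z_2) \cdot |\varphi_1\rangle\|_2 \in [1 \pm 2^{-d}]$. Using the
triangle inequality again, we get that
$$
\| |\varphi_1\rangle - |\varphi_2\rangle \|_2 \le \| |\varphi_1\rangle - (Z_1/Z_2) \cdot |\varphi_1\rangle \|_2 + \| (Z_1/Z_2) \cdot |\varphi_1\rangle - |\varphi_2\rangle \|_2 \le 2 \cdot 2^{-d} ,
$$
as desired.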
Proposition A.6. The distribution obtained by applying QFT to $|\varphi_2\rangle$, discarding the $e$ register,
measuring the $z$ register, and dividing the result by $D$ is within statistical distance $O(2^{-d})$ of the
distribution $Q$ defined in Eq. (2).
Therefore, $P(w)$ is bounded from below by
$$
\begin{aligned}
P'(w) &:= R^{2d} \cdot D^{-d} \cdot (\det L)^{-1} \cdot Z_2^{-2} \cdot \sum_{v \in L^*/\mathbb{Z}^d} \rho_{1/(\sqrt{2}R)}(v - w + \mathbb{Z}^d) \\
      &= \sum_{v \in L^*/\mathbb{Z}^d} \alpha_v Q_v(w) ,
\end{aligned}
$$
where we used the Poisson summation formula, and in the last step we applied Corollary A.2 (using
$D \ge 2\sqrt{d}R$) and the triangle inequality. From Claim A.4 and the second part of Claim A.5, we get
that $Z_2^2$ is within $1 \pm O(2^{-d})$ of $(R/\sqrt{2})^d$, and so $\alpha_v \in (\det L)^{-1} \cdot [1 \pm O(2^{-d})]$. This establishes
that $P'$ is within $\ell_1$ distance $O(2^{-d})$ of $Q$. In particular, its $\ell_1$ norm (i.e., total mass) is at least
$1 - O(2^{-d})$, which, combined with it being a lower bound on $P$, implies that it is within $\ell_1$ distance
$O(2^{-d})$ of $P$. Using the triangle inequality,
where in the last inequality we used Eq. (7) (specifically, that $\alpha_0 \le (1 + O(2^{-d}))\alpha_v$).
References
[Ban93] Wojciech Banaszczyk. New bounds in some transference theorems in the geometry of
numbers. Mathematische Annalen, 296(4):625–635, 1993.
[Cop02] Don Coppersmith. An approximate Fourier transform useful in quantum factoring, 2002.
quant-ph/0201067.
[CP05] Richard Crandall and Carl Pomerance. Prime numbers. Springer, New York, second
edition, 2005. A computational perspective.
[CW00] Richard Cleve and John Watrous. Fast parallel circuits for the quantum Fourier transform.
In 41st Annual Symposium on Foundations of Computer Science (Redondo Beach,
CA, 2000), pages 526–536. IEEE Comput. Soc. Press, Los Alamitos, CA, 2000.
[EH17] Martin Ekerå and Johan Håstad. Quantum algorithms for computing short discrete
logarithms and factoring RSA integers. In Post-quantum cryptography, volume 10346 of
Lecture Notes in Comput. Sci., pages 347–363. Springer, Cham, 2017.
[GE21] Craig Gidney and Martin Ekerå. How to factor 2048 bit RSA integers in 8 hours using
20 million noisy qubits. Quantum, 5:433, April 2021.
[GN08] Nicolas Gama and Phong Q. Nguyen. Finding short lattice vectors within Mordell’s
inequality. In STOC’08, pages 207–216. ACM, New York, 2008.
[GR02] Lov Grover and Terry Rudolph. Creating superpositions that correspond to efficiently
integrable probability distributions, 2002. arXiv:quant-ph/0208112.
[HvdH21] David Harvey and Joris van der Hoeven. Integer multiplication in time O(n log n). Ann.
of Math. (2), 193(2):563–617, 2021.
[Pom01] Carl Pomerance. The expected number of random elements to generate a finite abelian
group. Period. Math. Hungar., 43(1-2):191–198, 2001.
[Reg09] Oded Regev. On lattices, learning with errors, random linear codes, and cryptography.
J. ACM, 56(6):Art. 34, 40, 2009.
[Sei01] Jean-Pierre Seifert. Using fewer qubits in Shor’s factorization algorithm via simultaneous
Diophantine approximation. In Topics in cryptology—CT-RSA 2001 (San Francisco,
CA), volume 2020 of Lecture Notes in Comput. Sci., pages 319–327. Springer, Berlin,
2001.
[Sho99] Peter W. Shor. Polynomial-time algorithms for prime factorization and discrete logarithms
on a quantum computer. SIAM Rev., 41(2):303–332, 1999.