Distributed Shor’s algorithm
LIGANG XIAO
Institute of Quantum Computing and Computer Theory, School of Computer Science and Engineering,
Sun Yat-sen University, Guangzhou, 510006, China
The Guangdong Key Laboratory of Information Security Technology,
Sun Yat-sen University, Guangzhou, 510006, China
DAOWEN QIU a
Institute of Quantum Computing and Computer Theory, School of Computer Science and Engineering,
Sun Yat-sen University, Guangzhou, 510006, China
The Guangdong Key Laboratory of Information Security Technology,
Sun Yat-sen University, Guangzhou, 510006, China
QUDOOR Technologies Inc., Guangzhou, 510006, China
LE LUO
School of Physics and Astronomy, Sun Yat-sen University
Zhuhai, 519082, China
QUDOOR Technologies Inc., Guangzhou, 510006, China
PAULO MATEUS
Instituto de Telecomunicações, Departamento de Matemática, Instituto Superior Técnico,
Av. Rovisco Pais 1049-001 Lisbon, Portugal
Shor’s algorithm is one of the most important quantum algorithms, proposed by Peter Shor [Proceedings
of the 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134]. It factors
a large integer with a certain probability in time polynomial in the length of the input integer.
The key step of Shor’s algorithm is the order-finding algorithm, the quantum part of which estimates
s/r, where r is the “order” and s is some natural number less than r. Shor’s algorithm requires many
qubits and a deep circuit, which current physical devices cannot afford. In this paper, to reduce the
number of qubits required and the circuit depth, we propose a quantum-classical hybrid distributed
order-finding algorithm for Shor’s algorithm, which combines the advantages of both quantum
processing and classical processing. In our distributed order-finding algorithm, we use two quantum
computers, each capable of quantum teleportation, to separately estimate partial bits of s/r. The
measuring results are then processed by a classical algorithm to ensure the accuracy of the results.
Compared with the traditional Shor’s algorithm that uses multiple control qubits, our algorithm saves
nearly L/2 qubits when factoring an L-bit integer and reduces the circuit depth of each computer.
Keywords: Shor’s algorithm, distributed Shor’s algorithm, quantum-classical hybrid, quantum
teleportation, circuit depth
1 Introduction
Quantum computing has shown great potential in several fields and problems, such as chemical molecular
simulation [1], portfolio optimization [2], large number decomposition [3], unordered database search
[4] and linear equation solving [5]. Many useful quantum algorithms are now known [6], but realizing
them requires medium- or large-scale general-purpose quantum computers. Such computers are still
very difficult to build, because important physical problems in quantum computers remain unsolved.
In the NISQ (Noisy Intermediate-Scale Quantum) era, we can only run quantum algorithms with few
qubits and low circuit depth. It is therefore necessary to consider reducing the number of qubits and
other computing resources required by quantum algorithms.
Distributed quantum computing is a computing method that solves problems collaboratively through
multiple computing nodes. In distributed quantum computing, we can use multiple slightly smaller
quantum computers to complete a task that was originally completed by a single medium or large scale
quantum computer. Distributed quantum computing not only reduces the number of qubits required,
but also sometimes reduces the circuit depth of each computer. This is also important, since noise
increases as the circuit gets deeper. Distributed quantum computing has therefore been studied
extensively (for example, [7, 8, 9, 10]).
Shor’s algorithm, proposed by Peter Shor in 1994 [3], is an epoch-making discovery. It can factor
a large integer with a certain probability in time polynomial in the length of the input integer,
whereas the time complexity of the best known classical algorithm for factoring large numbers is
subexponential but superpolynomial. Shor’s algorithm can be applied to breaking various cryptosystems,
such as RSA cryptography and elliptic curve cryptography. For this reason, Shor’s algorithm has
received extensive attention from the community. However, some researchers have recently pointed
out that using Shor’s algorithm to break the commonly used 2048-bit RSA integers requires millions
of physical qubits [11]. So it is vital to consider reducing the logical qubits required in Shor’s algorithm.
Many researchers have been working on reducing the number of qubits required for Shor’s algorithm
[12, 13, 14], and these results have shown that Shor’s algorithm can be implemented using only one
control qubit to factor an L-bit integer, together with 2L + c qubits and circuit depth $O(L^3)$, where c is
a constant. But this method requires multiple intermediate measurements.
In 2004, Yimsiriwattana et al. [10] proposed a distributed Shor’s algorithm. Their algorithm
directly divides the qubits into several parts, so each part has fewer qubits than the
original circuit. Since all unitary operators can be decomposed into single-qubit quantum gates and
CNOT gates [15], they only need to consider how to implement CNOT gates acting on different parts,
and a CNOT gate acting on different parts can be implemented by means of pre-shared EPR pairs,
local operations and classical communication. They showed that their distributed algorithm needs to
communicate $O(L^2)$ classical bits.
In this paper, we propose a new distributed Shor’s algorithm. It is a quantum-classical hybrid
algorithm, which takes advantage both of fast quantum computing and of the ease with which
classical algorithms can process measuring results. In our distributed algorithm, two
computers execute sequentially. Each computer estimates several bits of a key intermediate quantity
(the ratio of s and r, where r is the “order” and s is some natural number less than r). In
order to guarantee the correlation between the two computers’ measuring results to some extent, we
employ quantum communication. Furthermore, to obtain high accuracy, we adjust the measuring
result of the first computer in terms of the measuring result of the second computer through classical
post-processing. Compared with the traditional Shor’s algorithm that uses multiple control qubits,
our algorithm reduces the cost of qubits (by nearly L/2 qubits) and the circuit depth of each
computer. Although each computer in our distributed algorithm requires more qubits than the
implementation of Shor’s algorithm mentioned above that uses only one control qubit, our method of using
quantum communication to distribute the phase estimation of Shor’s algorithm may be applicable to
other quantum algorithms.
The remainder of the paper is organized as follows. In Section 2, we review quantum teleportation
and some quantum algorithms related to Shor’s algorithm. In Section 3, we present a distributed Shor’s
algorithm (more specifically, a distributed order-finding algorithm), and prove the correctness of our
algorithm. In Section 4, we analyze the performance of our algorithm, including space complexity,
time complexity, circuit depth and communication complexity. Finally in Section 5, we conclude with
a summary.
2 Preliminaries
In this section, we review the quantum Fourier transform, the phase estimation algorithm, the order-finding
algorithm and other tools we will use. We assume that the readers are familiar with linear algebra and
the basic notations of quantum computing (for details, see [15]).
for $j = 0, 1, \cdots, 2^n - 1$.
The quantum Fourier transform and the inverse quantum Fourier transform can be implemented by
using $O(n^2)$ single-qubit gates and $O(n^2)$ CNOT gates [3, 15].
for any positive integer $m$ and $m$-bit string $j$, where the first register holds the control qubits. Then we can
apply the phase estimation algorithm to estimate $\omega$. Fig. 1 shows the implementation of $C_m(U)$. For the
sake of convenience, we first define the following notations. In this paper, we treat bit strings and their
corresponding binary integers as the same.
Definition 1. For any real number $\omega = a_1 a_2 \cdots a_l . b_1 b_2 \cdots$, where $a_{k_1} \in \{0, 1\}$, $k_1 = 1, 2, \cdots, l$
and $b_{k_2} \in \{0, 1\}$, $k_2 = 1, 2, \cdots$, denote $|\psi_{t,\omega}\rangle$, $\omega_{\{i,j\}}$, $\omega_{[i,j]}$, and $d_t(x, y)$ respectively as follows:
• $|\psi_{t,\omega}\rangle$: for any positive integer $t$, $|\psi_{t,\omega}\rangle = QFT^{-1} \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} e^{2\pi i j \omega} |j\rangle$.
• $\omega_{\{i,j\}}$: for any integers $i, j$ with $1 \le i \le j$, $\omega_{\{i,j\}} = b_i b_{i+1} \cdots b_j$.
• $d_t(x, y)$: for any two $t$-bit strings (or $t$-bit binary integers) $x, y$, define $d_t(x, y) = \min(|x - y|, 2^t - |x - y|)$.
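As an illustration of the notation $\omega_{\{i,j\}}$, the following minimal Python sketch extracts the $i$-th to $j$-th fractional bits of a rational number. The function name `frac_bits` and the use of `Fraction` are choices made for this illustration only; for a dyadic rational, the sketch uses the terminating binary expansion.

```python
from fractions import Fraction

def frac_bits(omega: Fraction, i: int, j: int) -> str:
    """Return b_i b_{i+1} ... b_j, the i-th to j-th fractional bits of omega."""
    if not 1 <= i <= j:
        raise ValueError("need 1 <= i <= j")
    # Fractional part of omega, in [0, 1).
    frac = omega - (omega.numerator // omega.denominator)
    # floor(frac * 2**j) holds the first j fractional bits as an integer.
    first_j_bits = (frac.numerator << j) // frac.denominator
    width = j - i + 1
    window = first_j_bits % (1 << width)  # keep only bits i..j
    return format(window, f"0{width}b")

# 7/10 = 0.1011 0011 0011 ... in binary.
print(frac_bits(Fraction(7, 10), 1, 6))   # bits 1..6
print(frac_bits(Fraction(7, 10), 5, 21))  # bits 5..21
```

With $\omega = 7/10$, the two calls return the bit windows that Alice and Bob estimate in the example of Appendix A.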
dt (·, ·) is a useful distance to estimate the error of the algorithms in our paper and it has the
following properties. We specify a mod N = (kN + a) mod N for any negative integer a and
positive integer N , where k is an integer and satisfies kN + a ≥ 0.
Lemma 1. Let $t$ be a positive integer and let $x, y$ be any two $t$-bit strings. It holds that:
(I) Let $B = \{b \in \{-(2^t - 1), \cdots, 2^t - 1\} : (x + b) \bmod 2^t = y\}$. Then $d_t(x, y) = \min_{b \in B} |b|$.
(II) $d_t(\cdot, \cdot)$ is a distance on $\{0, 1\}^t$.
(III) Let $t_0 < t$ be a positive integer. If $d_t(x, y) < 2^{t-t_0}$, then
\[ d_{t_0}(x_{[1,t_0]}, y_{[1,t_0]}) \le 1. \tag{4} \]
Proof. First we prove (I). It is clear for the case of $x = y$. Without loss of generality, assume $x > y$.
Since $x \ne y$, $B$ contains only 2 elements. Note that
\[ (x + (y - x)) \bmod 2^t = y, \tag{5} \]
\[ (x + (2^t - (x - y))) \bmod 2^t = y, \tag{6} \]
\[ |y - x| \le 2^t - 1, \tag{7} \]
\[ |2^t - (x - y)| \le 2^t - 1 \tag{8} \]
and $y - x \ne 2^t - (x - y)$, so $y - x$ and $2^t - (x - y)$ are exactly the two elements of $B$. Hence
$\min_{b \in B} |b| = \min(|x - y|, 2^t - |x - y|) = d_t(x, y)$. Thus (I) holds.
Then we prove (II). We just need to show that $d_t(\cdot, \cdot)$ satisfies the triangle inequality, that is,
$d_t(x, y) \le d_t(x, z) + d_t(z, y)$ holds for any $t$-bit string $z$. By (I), we know that there exist $b_1, b_2 \in
\{-(2^t - 1), \cdots, 2^t - 1\}$ such that
and
\[ (x + b_1) \bmod 2^t = z, \quad (z + b_2) \bmod 2^t = y. \tag{10} \]
Hence $(x + b_1 + b_2) \bmod 2^t = y$. Then by (I) again, we have
\[ (2^{t-t_0} x_{[1,t_0]} + x_{[t_0+1,t]} + b) \bmod 2^t = 2^{t-t_0} y_{[1,t_0]} + y_{[t_0+1,t]}. \tag{12} \]
\[ d_t(2^{t-t_0} x_{[1,t_0]}, 2^{t-t_0} y_{[1,t_0]}) \le |b + x_{[t_0+1,t]} - y_{[t_0+1,t]}| < 2 \cdot 2^{t-t_0}. \tag{13} \]
Hence
\[ d_{t_0}(x_{[1,t_0]}, y_{[1,t_0]}) < 2. \tag{14} \]
Therefore Eq. (4) holds.
We can understand $d_t(\cdot, \cdot)$ in a more intuitive way. We place the numbers 0 to $2^t$ evenly on a
circumference where 0 and $2^t$ coincide. Suppose the distance between two adjacent points on the circumference is
1. Then $d_t(x, y)$ can be regarded as the length of the shortest path on the circumference from $x$ to $y$.
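The circular distance and two of its properties can be checked by brute force for a small $t$. The sketch below is illustrative only: the helper name `min_shift` is ours, and the exhaustive loops verify Lemma 1 (I) and the triangle inequality of Lemma 1 (II) for $t = 4$ rather than prove them.

```python
def d_t(x: int, y: int, t: int) -> int:
    """Circular distance between two t-bit integers x and y."""
    modulus = 1 << t
    diff = abs(x - y)
    return min(diff, modulus - diff)

def min_shift(x: int, y: int, t: int) -> int:
    """min |b| over b in {-(2^t - 1), ..., 2^t - 1} with (x + b) mod 2^t == y."""
    modulus = 1 << t
    return min(abs(b) for b in range(-(modulus - 1), modulus)
               if (x + b) % modulus == y)

t = 4
values = range(1 << t)
# Lemma 1 (I): d_t equals the smallest shift that maps x to y modulo 2^t.
lemma_I_holds = all(d_t(x, y, t) == min_shift(x, y, t)
                    for x in values for y in values)
# Lemma 1 (II): triangle inequality.
triangle_holds = all(d_t(x, y, t) <= d_t(x, z, t) + d_t(z, y, t)
                     for x in values for y in values for z in values)
print(lemma_I_holds, triangle_holds)
print(d_t(1, 15, t))  # going "around" through 0 is shorter than 14 steps
```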
Next we review the phase estimation algorithm (see Algorithm 1) and its associated results.
If the fractional part of $\omega$ does not exceed $t$ bits (i.e. $2^t \omega$ is an integer), by observing Eq. (2)
and step 4 in Algorithm 1, we can see that $\tilde{\omega}$ is a perfect estimate of $\omega$ (i.e. $\frac{\tilde{\omega}}{2^t} = \omega$). However,
sometimes $\omega$ is not approximated by $\frac{\tilde{\omega}}{2^t}$ but is approximated by $1 - \frac{\tilde{\omega}}{2^t}$. For example, if the binary
representation of $\omega$ is $\omega = 0.11 \cdots 1$ (sufficiently many 1s), we will obtain the measuring result
$00 \cdots 0$ with high probability, since in this case $e^{2\pi i \omega}$ is close to $e^{2\pi i 0} = 1$. The output $\tilde{\omega}$ of the phase
estimation algorithm should satisfy that $\frac{\tilde{\omega}}{2^t}$ is close to $\omega$ or $\omega - 1$. We have the following results.
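The wrap-around behaviour described above can be observed numerically. The following sketch computes the measurement distribution of $|\psi_{t,\omega}\rangle$ directly from Definition 1, assuming the usual convention $QFT^{-1}|j\rangle = \frac{1}{\sqrt{2^t}} \sum_x e^{-2\pi i jx/2^t}|x\rangle$ (the convention of [15]); the function name is ours.

```python
import cmath

def outcome_probabilities(omega: float, t: int) -> list:
    """Probability of each t-bit outcome x when measuring |psi_{t,omega}>.

    The amplitude of |x> is (1/2^t) * sum_j exp(2*pi*i*j*(omega - x/2^t)).
    """
    size = 1 << t
    probs = []
    for x in range(size):
        amp = sum(cmath.exp(2j * cmath.pi * j * (omega - x / size))
                  for j in range(size)) / size
        probs.append(abs(amp) ** 2)
    return probs

# omega = 0.110 in binary fits in t = 3 bits: the estimate is perfect.
exact = outcome_probabilities(0.75, 3)
print(max(range(8), key=lambda x: exact[x]))

# omega = 0.11...1 (twelve 1s): the most likely outcome wraps around to 00...0.
wrapped = outcome_probabilities(1 - 2 ** -12, 3)
print(max(range(8), key=lambda x: wrapped[x]))
```

In the second case the outcome $00 \cdots 0$ dominates, so $\frac{\tilde{\omega}}{2^t}$ is close to $\omega - 1$ rather than to $\omega$, which is why the error is measured with $d_t$.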
Proposition 1 (See [15]). In Algorithm 1, for any $\epsilon > 0$ and any positive integer $n$, if $t = n +
\lceil \log_2(2 + \frac{1}{2\epsilon}) \rceil$, then the probability of $d_t(\tilde{\omega}, \omega_{\{1,t\}}) < 2^{t-n}$ is at least $1 - \epsilon$.
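Lemma 2. For any $t$-bit string $\tilde{\omega}$ and real number $\omega \in [0, 1)$, if $d_t(\tilde{\omega}, \omega_{\{1,t\}}) < 2^{t-n}$, then we have
$|\frac{\tilde{\omega}}{2^t} - \omega| \le 2^{-n}$ or $1 - |\frac{\tilde{\omega}}{2^t} - \omega| \le 2^{-n}$, where $n < t$.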
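Proof. Since $|2^t \omega - \omega_{\{1,t\}}| < 1$, if $d_t(\tilde{\omega}, \omega_{\{1,t\}}) = |\tilde{\omega} - \omega_{\{1,t\}}|$, we have
\[ |\tilde{\omega} - 2^t \omega| \le |\tilde{\omega} - \omega_{\{1,t\}}| + |\omega_{\{1,t\}} - 2^t \omega| \le 2^{t-n}, \tag{15} \]
and thus $|\frac{\tilde{\omega}}{2^t} - \omega| \le 2^{-n}$; if $d_t(\tilde{\omega}, \omega_{\{1,t\}}) = 2^t - |\tilde{\omega} - \omega_{\{1,t\}}|$, we have
\[ 2^t - |\tilde{\omega} - 2^t \omega| \le 2^t - (|\tilde{\omega} - \omega_{\{1,t\}}| - |\omega_{\{1,t\}} - 2^t \omega|) \le 2^{t-n}, \tag{16} \]
and therefore, we have $1 - |\frac{\tilde{\omega}}{2^t} - \omega| \le 2^{-n}$.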
That is to say, if $\frac{\tilde{\omega}}{2^t}$ is an estimate of $\omega_{\{1,t\}}$ with error less than $2^{-n}$, then $\frac{\tilde{\omega}}{2^t}$ is an estimate of $\omega$
with error no larger than $2^{-n}$.
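Denote
\[ |u_s\rangle = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i \frac{s}{r} k} |a^k \bmod N\rangle. \tag{18} \]
We have
\[ M_a |u_s\rangle = e^{2\pi i \frac{s}{r}} |u_s\rangle, \tag{19} \]
\[ \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |u_s\rangle = |1\rangle, \tag{20} \]
and
\[ \langle u_s | u_{s'} \rangle = \delta_{s,s'} = \begin{cases} 0 & \text{if } s \ne s', \\ 1 & \text{if } s = s'. \end{cases} \tag{21} \]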
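Equations (18)–(21) can be checked numerically on a small instance. The sketch below assumes the standard modular multiplication operator $M_a|x\rangle = |ax \bmod N\rangle$ (as in [15]) and uses the hypothetical toy values $N = 15$, $a = 7$; states are stored as dictionaries from basis labels to amplitudes, and all function names are ours.

```python
import cmath
from math import sqrt

def order(a: int, N: int) -> int:
    """Smallest r >= 1 with a^r = 1 (mod N); requires gcd(a, N) = 1."""
    r, x = 1, a % N
    while x != 1:
        x = x * a % N
        r += 1
    return r

def eigenstate(a: int, N: int, s: int) -> dict:
    """|u_s> of Eq. (18), as {basis label: amplitude}."""
    r = order(a, N)
    return {pow(a, k, N): cmath.exp(-2j * cmath.pi * s * k / r) / sqrt(r)
            for k in range(r)}

def apply_M(a: int, N: int, state: dict) -> dict:
    """Assumed operator M_a: |x> -> |a*x mod N>."""
    return {(a * x) % N: amp for x, amp in state.items()}

def inner(u: dict, v: dict) -> complex:
    """<u|v> for states given as dictionaries."""
    return sum(u[x].conjugate() * v.get(x, 0) for x in u)

a, N = 7, 15
r = order(a, N)
for s in range(r):
    u = eigenstate(a, N, s)
    phase = cmath.exp(2j * cmath.pi * s / r)
    image = apply_M(a, N, u)
    # Eq. (19): M_a |u_s> = e^{2 pi i s / r} |u_s>
    assert all(abs(image[x] - phase * u[x]) < 1e-12 for x in u)

# Eq. (20): the uniform superposition of the |u_s> is |1>.
total = {}
for s in range(r):
    for x, amp in eigenstate(a, N, s).items():
        total[x] = total.get(x, 0) + amp / sqrt(r)
print(r, {x: round(abs(amp), 6) for x, amp in total.items()})
```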
So if we expect to apply the phase estimation algorithm to finding the order, the key is to construct $C_m(M_a)$,
that is, for any $m$-bit string $j$,
The purpose of the quantum part of the order-finding algorithm (steps 1 to 5 in Algorithm 2) is to
get a measuring result $m$ such that $\frac{m}{2^t}$ is an estimation of $\frac{s}{r}$ with error no larger than $2^{-(2L+1)}$ for some
$s \in \{0, 1, \cdots, r - 1\}$ (i.e. $|\frac{m}{2^t} - \frac{s}{r}| \le 2^{-(2L+1)}$), because it is one of the prerequisites to ensure the
correctness of the result in step 6 [15]. Let $\{P_i\}$ be any projective measurement on $\mathbb{C}^{2^t}$ and let $|\phi_s\rangle$
be any $t$-qubit quantum state for $s = 0, 1, \cdots, r - 1$. By Eq. (21), we have
\[ \Big\| (P_j \otimes I) \sum_{s=0}^{r-1} |\phi_s\rangle |u_s\rangle \Big\|^2 = \sum_{s=0}^{r-1} \big\| (P_j |\phi_s\rangle) |u_s\rangle \big\|^2 \tag{23} \]
for $P_j \in \{P_i\}$. Hence by Proposition 1 and Eq. (23), we can obtain the following proposition immediately.
Proposition 2 (See [15]). In Algorithm 2, the probability of $d_t(m, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}$ for any fixed
$s \in \{0, 1, \cdots, r - 1\}$ is at least $\frac{1-\epsilon}{r}$. And the probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$
such that
\[ d_t(m, (\tfrac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)} \tag{24} \]
is at least $1 - \epsilon$.
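Proof. Denote $A_s = \{x \in \{0, 1\}^t : d_t(x, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}\}$. Let $Q_s = \sum_{i \in A_s} |i\rangle\langle i|$. For any
fixed $s \in \{0, 1, \cdots, r - 1\}$, the probability of $d_t(m, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}$ is
\begin{align}
\Big\| (Q_s \otimes I) \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} |\psi_{t,k/r}\rangle |u_k\rangle \Big\|^2 &= \frac{1}{r} \sum_{k=0}^{r-1} \big\| (Q_s |\psi_{t,k/r}\rangle) |u_k\rangle \big\|^2 \quad \text{(by Eq. (23))} \tag{25} \\
&\ge \frac{1}{r} \big\| (Q_s |\psi_{t,s/r}\rangle) |u_s\rangle \big\|^2 \tag{26} \\
&= \frac{1}{r} \big\| Q_s |\psi_{t,s/r}\rangle \big\|^2 \tag{27} \\
&\ge \frac{1-\epsilon}{r} \quad \text{(by Proposition 1)}. \tag{28}
\end{align}
Let $Q = \sum_{i \in \cup_{s=0}^{r-1} A_s} |i\rangle\langle i|$. The probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$ such that
$d_t(m, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}$ is
\begin{align}
\Big\| (Q \otimes I) \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} |\psi_{t,k/r}\rangle |u_k\rangle \Big\|^2 &= \frac{1}{r} \sum_{k=0}^{r-1} \big\| (Q |\psi_{t,k/r}\rangle) |u_k\rangle \big\|^2 \quad \text{(by Eq. (23))} \tag{29} \\
&\ge \frac{1}{r} \sum_{k=0}^{r-1} \big\| (Q_k |\psi_{t,k/r}\rangle) |u_k\rangle \big\|^2 \tag{30} \\
&= \frac{1}{r} \sum_{k=0}^{r-1} \big\| Q_k |\psi_{t,k/r}\rangle \big\|^2 \tag{31} \\
&\ge 1 - \epsilon \quad \text{(by Eq. (28))}. \tag{32}
\end{align}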
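The second bound of Proposition 2 can be sanity-checked numerically on toy parameters. The sketch below evaluates the right-hand side of Eq. (29), $\frac{1}{r} \sum_k \|Q|\psi_{t,k/r}\rangle\|^2$, with a small precision parameter `n` standing in for $2L + 1$; the values of `r`, `n` and `eps` are hypothetical, and the function names are ours.

```python
import cmath
from math import ceil, log2

def d_t(x: int, y: int, t: int) -> int:
    """Circular distance between two t-bit integers."""
    diff = abs(x - y)
    return min(diff, (1 << t) - diff)

def outcome_probabilities(omega: float, t: int) -> list:
    """Measurement distribution of |psi_{t,omega}> (Definition 1)."""
    size = 1 << t
    return [abs(sum(cmath.exp(2j * cmath.pi * j * (omega - x / size))
                    for j in range(size)) / size) ** 2
            for x in range(size)]

def success_probability(r: int, n: int, eps: float) -> float:
    """(1/r) * sum_k ||Q |psi_{t,k/r}>||^2, with n playing the role of 2L + 1."""
    t = n + ceil(log2(2 + 1 / (2 * eps)))
    size = 1 << t
    # Outcomes m within distance 2^(t-n) of the first t bits of some s/r.
    good = {m for m in range(size) for s in range(r)
            if d_t(m, (s * size) // r, t) < (1 << (t - n))}
    total = 0.0
    for k in range(r):
        probs = outcome_probabilities(k / r, t)
        total += sum(probs[m] for m in good) / r
    return total

print(success_probability(r=3, n=3, eps=0.25))
print(success_probability(r=7, n=4, eps=0.25))
```

Both printed values should be at least $1 - \epsilon = 0.75$, as Proposition 2 predicts; this is a check on small cases, not a substitute for the proof.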
Although the probability of obtaining $r$ correctly from the measuring result by applying the continued
fractions algorithm is an important topic, the details are omitted here and we focus on
whether the measuring result is an estimation of $\frac{s}{r}$ for some $s$ (i.e. $|\frac{m}{2^t} - \frac{s}{r}| \le 2^{-(2L+1)}$),
since this is exactly the goal of the quantum part in the order-finding algorithm.
Theorem 1 ([16]). When Alice and Bob share L EPR pairs, they can simulate transmitting L
qubits by communicating 2L classical bits.
Remark 1. Although Algorithm 3 is a serial algorithm, the two computers can also execute in parallel
to some extent. For example, execute the algorithm in the following order: 1, (2, 6), 3, 5, 7, (4, 8), 9,
10, 11, where i represents the ith step in Algorithm 3, and (i, j) means that the ith and jth steps are
executed in parallel.
Remark 2. If we initialize the quantum state of computer B to $|0\rangle_B |1\rangle_D$ (register D is $L$-qubit) and
do not employ quantum teleportation in Fig. 3, that is, computer A and computer B execute “partial”
order-finding algorithms respectively, then the final quantum states of computers A and B will become
$\sum_{s=0}^{r-1} |\widetilde{s/r}\rangle_A |u_s\rangle_C$ and $\sum_{s=0}^{r-1} |\widetilde{s/r}'\rangle_B |u_s\rangle_D$, respectively, where $\widetilde{s/r}$ is an estimation of $(\frac{s}{r})_{\{1, \frac{L}{2}+1\}}$
and $\widetilde{s/r}'$ is an estimation of $(\frac{s}{r})_{\{\frac{L}{2}, 2L+1\}}$. Therefore, in this case, if computer A measures register A
and computer B measures register B, their measuring results may not correspond to the same $s/r$.
Next we prove the correctness of our algorithm, that is, we can obtain an output $m$ such that
$|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ holds for some $s \in \{0, 1, \cdots, r - 1\}$ with high probability. Let
$r, L, t_1, t_2, p, m_1, m_2, m_{prefix}, m, \epsilon_0, |\phi_{final}\rangle$ be the same as those in Algorithm 3 and Algorithm 4.
We first prove that if $m_1$ and $m_2$ are both estimations of some bits of $\frac{s_0}{r}$ with $\frac{s_0}{r} = 0.a_1 a_2 \cdots a_{\frac{L}{2}+1}$,
then the output $m$ is perfect (i.e. $m = a_1 a_2 \cdots a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0$), and the probability of this case is not
less than $\frac{1}{r}$.
Proposition 3. Let $s_0 \in \{0, 1, \cdots, r - 1\}$ satisfy that $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is an integer, that is, $\frac{s_0}{r} =
0.a_1 a_2 \cdots a_{\frac{L}{2}+1}$ where $a_i \in \{0, 1\}$, $i = 1, 2, \cdots, \frac{L}{2} + 1$. Then in Algorithm 3, it holds that
\[ \mathrm{Prob}(m = a_1 a_2 \cdots a_{\frac{L}{2}+1} 0 \cdots 0) \ge \frac{1}{r}. \tag{33} \]
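Proof. Since the fractional part of $\frac{s_0}{r}$ is at most $(\frac{L}{2} + 1)$-bit, in Algorithm 3, we have
\[ |\psi_{t_1, s_0/r}\rangle = |a_1 a_2 \cdots a_{\frac{L}{2}+1} 0 \cdots 0\rangle \tag{34} \]
and
\[ |\psi_{t_2, 2^{\frac{L}{2}-1} s_0/r}\rangle = |a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0\rangle. \tag{35} \]
\[ = \frac{1}{r}. \tag{38} \]
Since $\mathrm{CorrectResults}(x, y) = a_1 a_2 \cdots a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0$, the proposition holds.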
8: Apply $QFT^{-1}$ to register B: $\to |\phi_{final}\rangle = \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t_1, s/r}\rangle_A |\psi_{t_2, 2^{\frac{L}{2}-1} s/r}\rangle_B |u_s\rangle_C$
9: Computer A measures register A and computer B measures register B:
A obtains a $t_1$-bit string $m_1$ and B obtains a $t_2$-bit string $m_2$.
10: $m \leftarrow \mathrm{CorrectResults}(m_1, m_2)$: $m$ is a $(2L + 1 + p)$-bit string.
11: Apply continued fractions algorithm: obtain $r$.
Then we prove that if $m_2$ is an estimation of the $\frac{L}{2}$th to $(2L+1)$th bit of $\frac{s_0}{r}$, we can get $(m_2)_{[1,2]} =
(\frac{s_0}{r})_{\{\frac{L}{2}, \frac{L}{2}+1\}}$.
Lemma 3. Let $s_0 \in \{0, 1, \cdots, r - 1\}$ satisfy that $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is not an integer and let $m_2$ satisfy
\[ d_{t_2}(m_2, (\tfrac{s_0}{r})_{\{\frac{L}{2}, 2L+1+p\}}) < 2^p. \tag{39} \]
Then $(m_2)_{[1,2]} = (\frac{s_0}{r})_{\{\frac{L}{2}, \frac{L}{2}+1\}}$.
Proof. Since $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is not an integer, we have
\[ 2^{-L} < \frac{1}{r} \le \frac{2^{\frac{L}{2}+1} s_0 \bmod r}{r} \le \frac{r-1}{r} < 1 - 2^{-L}. \tag{40} \]
So we get that $(\frac{s_0}{r})_{\{\frac{L}{2}+2, \frac{3L}{2}+1\}}$ is not $00 \cdots 0$ or $11 \cdots 1$. Hence, $(\frac{s_0}{r})_{\{\frac{L}{2}+2, 2L+1\}}$ is not $00 \cdots 0$ or
$11 \cdots 1$. That is to say, if we add 1 to or subtract 1 from $(\frac{s_0}{r})_{\{\frac{L}{2}, 2L+1\}}$, its first two bits are not changed.
Thus by Eq. (39), we have
\[ (m_2)_{[1,2]} = (\tfrac{s_0}{r})_{\{\frac{L}{2}, \frac{L}{2}+1\}}. \tag{41} \]
Therefore the lemma holds.
If $(m_2)_{[1,2]} = (\frac{s_0}{r})_{\{\frac{L}{2}, \frac{L}{2}+1\}}$, that is, the first two bits of $m_2$ are correct, then we can use these two
bits of $m_2$ to “correct” $m_1$. The following lemma can be used to show the correctness of Algorithm 4.
Lemma 4. Let $t > 2$ be a positive integer and let $x, y$ be two $t$-bit strings with $d_t(x, y) \le 1$. Then
there exists exactly one element $b_0$ in $\{-1, 0, 1\}$ such that $(x + b_0) \bmod 2^t = y$, and for any $b \in
\{-1, 0, 1\}$, $(x + b) \bmod 2^t = y$ if and only if $(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}$.
Proof. By Lemma 1, we know that there exists such a $b_0$. It is clear that such a $b_0$ is unique. Next we
prove that for any $b \in \{-1, 0, 1\}$, $(x + b) \bmod 2^t = y$ if and only if $(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}$.
For any $b \in \{-1, 0, 1\}$, suppose $(x + b) \bmod 2^t = y$, then we have
That is,
\[ (x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}. \tag{43} \]
On the other hand, for any $b \in \{-1, 0, 1\}$, suppose $(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}$. Since there exists
exactly one element $b_1$ in $\{-1, 0, 1\}$ such that $(x_{[t-1,t]} + b_1) \bmod 2^2 = y_{[t-1,t]}$, $b$ is equal to $b_0$, that
is, $b$ satisfies $(x + b) \bmod 2^t = y$. Consequently, the lemma holds.
We can inspect Lemma 4 from another aspect. If $d_{\frac{L}{2}+1}(m_1, (\frac{s_0}{r})_{\{1, \frac{L}{2}+1\}}) \le 1$ and $(m_2)_{[1,2]} =
(\frac{s_0}{r})_{\{\frac{L}{2}, \frac{L}{2}+1\}}$ hold for some $s_0$, then the CorrectionBit in Algorithm 4 exists, and $m_{prefix} =
(\frac{s_0}{r})_{\{1, \frac{L}{2}+1\}}$ holds as well.
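The correction step can be sketched in a few lines of Python. This is a hedged reconstruction based only on Lemma 4 and on the worked example in Appendix A, not a transcription of Algorithm 4: the function name `correct_results`, its parameters, and the integer encoding of bit strings are assumptions made for illustration. Here `prefix_len` plays the role of $\frac{L}{2} + 1$.

```python
def correct_results(m1: int, m2: int, t1: int, t2: int, prefix_len: int) -> int:
    """Combine A's t1-bit result m1 with B's t2-bit result m2.

    The last two bits of A's prefix overlap the first two bits of m2.
    A correction b in {-1, 0, 1} aligns the overlap (Lemma 4), then the
    corrected prefix is concatenated with m2 minus its first two bits.
    """
    prefix = m1 >> (t1 - prefix_len)   # (m1)_[1, prefix_len]
    m2_head = m2 >> (t2 - 2)           # (m2)_[1, 2]
    candidates = [b for b in (-1, 0, 1)
                  if ((prefix & 0b11) + b) % 4 == m2_head]
    if len(candidates) != 1:
        raise ValueError("no CorrectionBit exists for these results")
    m_prefix = (prefix + candidates[0]) % (1 << prefix_len)
    tail = m2 & ((1 << (t2 - 2)) - 1)  # (m2)_[3, t2]
    return (m_prefix << (t2 - 2)) | tail

# Values from Appendix A: Alice measures 101101, Bob measures bits 5..21 of 7/10.
m = correct_results(0b101101, 0b00110011001100110, t1=6, t2=17, prefix_len=6)
print(format(m, "021b"))
```

With the Appendix A values the CorrectionBit is $-1$ and the printed string is the 21-bit estimate of $7/10$ obtained there.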
Finally, we give the following results, which complete the proof of the correctness of our algorithms.
Proposition 4. Let $m_2$ satisfy $d_{t_2}(m_2, (\frac{s_0}{r})_{\{\frac{L}{2}, 2L+1+p\}}) < 2^p$ for some $s_0 \in \{0, 1, \cdots, r - 1\}$ with
$2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ being not an integer. Suppose $d_{t_1}(m_1, (\frac{s_0}{r})_{\{1,t_1\}}) < 2^p$. Then $|\frac{m}{2^{2L+1+p}} - \frac{s_0}{r}| \le 2^{-(2L+1)}$.
Proof. Since $d_{t_2}(m_2, (\frac{s_0}{r})_{\{\frac{L}{2}, 2L+1+p\}}) < 2^p$ and $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is not an integer, by Lemma 3, we have
\[ (m_2)_{[1,2]} = (\tfrac{s_0}{r})_{\{\frac{L}{2}, \frac{L}{2}+1\}}. \tag{44} \]
Since $d_{t_1}(m_1, (\frac{s_0}{r})_{\{1,t_1\}}) < 2^p$ and $t_1 = \frac{L}{2} + 1 + p$, by Lemma 1, we have
\[ d_{\frac{L}{2}+1}((m_1)_{[1, \frac{L}{2}+1]}, (\tfrac{s_0}{r})_{\{1, \frac{L}{2}+1\}}) \le 1. \tag{45} \]
As a result, in Algorithm 4, the CorrectionBit exists. By Eq. (44), Lemma 4, and steps 1 to 2 in
Algorithm 4, we get
\[ m_{prefix} = (\tfrac{s_0}{r})_{\{1, \frac{L}{2}+1\}}. \tag{46} \]
Since $m = m_{prefix} \circ (m_2)_{[3,t_2]}$, by Eq. (44) and Eq. (46), we get
\[ d_{2L+1+p}(m, (\tfrac{s_0}{r})_{\{1, 2L+1+p\}}) = d_{\frac{3L}{2}+2+p}(m_2, (\tfrac{s_0}{r})_{\{\frac{L}{2}, 2L+1+p\}}) < 2^p. \tag{47} \]
Since $\frac{s_0}{r}$ is not an integer, similar to Eq. (40), we know that $(\frac{s_0}{r})_{\{1, 2L+1\}}$ is not $00 \cdots 0$ or $11 \cdots 1$.
Then by Eq. (47), we get $d_{2L+1+p}(m, (\frac{s_0}{r})_{\{1, 2L+1+p\}}) = |m - (\frac{s_0}{r})_{\{1, 2L+1+p\}}|$. Therefore, by Eq.
(47) and Lemma 2, we obtain
\[ \Big| \frac{m}{2^{2L+1+p}} - \frac{s_0}{r} \Big| \le 2^{-(2L+1)}. \tag{48} \]
Theorem 2. In Algorithm 3, for any fixed $s_0 \in \{0, 1, \cdots, r - 1\}$, the probability of $|\frac{m}{2^{2L+1+p}} - \frac{s_0}{r}| \le
2^{-(2L+1)}$ is at least $\frac{1-\epsilon}{r}$. The probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$ such that
$|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ is at least $1 - \epsilon$.
Proof. By Proposition 3, for any fixed $s_0 \in \{0, 1, \cdots, r - 1\}$ with $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ being an integer, we
have
\[ \mathrm{Prob}\Big( \frac{m}{2^{2L+1+p}} = \frac{s_0}{r} \Big) \ge \frac{1}{r}. \tag{49} \]
For any fixed $s_0 \in \{0, 1, \cdots, r - 1\}$ with $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ being not an integer, by Proposition 1 and Eq.
(23), we get that the probability of
\[ d_{t_2}(m_2, (\tfrac{s_0}{r})_{\{\frac{L}{2}, 2L+1+p\}}) < 2^p \tag{50} \]
and
\[ d_{t_1}(m_1, (\tfrac{s_0}{r})_{\{1,t_1\}}) < 2^p \tag{51} \]
is at least $\frac{1}{r}(1 - \epsilon_0)^2 = \frac{1}{r}(1 - \frac{\epsilon}{2})^2 > \frac{1-\epsilon}{r}$. Consequently, by Proposition 4, we obtain
\[ \mathrm{Prob}\Big( \Big| \frac{m}{2^{2L+1+p}} - \frac{s_0}{r} \Big| \le 2^{-(2L+1)} \Big) > \frac{1-\epsilon}{r}. \tag{52} \]
Similar to the proof of Proposition 2, we can obtain that the probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$
such that $|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ is at least $1 - \epsilon$. This proves the theorem.
4 Complexity analysis
The complexity of the circuit of the (distributed) order-finding algorithm depends on the construction of
$C_t(M_a)$. There are two kinds of implementation of $C_t(M_a)$ proposed by Shor [18]. The first method
(denoted as method (I)) needs time complexity $O(L^3)$ and space complexity $O(L)$, and the second
method (denoted as method (II)) needs time complexity $O(L^2 \log L \log\log L)$ and space complexity
$O(L \log L \log\log L)$. In this section, we compare our distributed order-finding algorithm with the
traditional order-finding algorithm. For a more concrete comparison, we consider that $C_t(M_a)$ is
implemented by method (I). There is a concrete implementation of the order-finding algorithm using
method (I) in [10]. However, the advantages of our distributed order-finding algorithm in space and
circuit depth are independent of whether method (I) or method (II) is used.
Space complexity. The implementation of the operator $C_t(M_a)$ in method (I) needs $t + L$ qubits
plus $b$ auxiliary qubits for any positive integer $a$, where $b$ is $O(L)$. By Theorem 1, to teleport $L$ qubits,
computers A and B need to share $L$ EPR pairs and communicate $2L$ classical bits. As
a result, A needs $\frac{5L}{2} + 1 + \lceil \log_2(2 + \frac{1}{\epsilon}) \rceil + b$ qubits and B needs $\frac{5L}{2} + 2 + \lceil \log_2(2 + \frac{1}{\epsilon}) \rceil + b$
qubits. As a comparison, the order-finding algorithm needs $3L + 1 + \lceil \log_2(2 + \frac{1}{2\epsilon}) \rceil + b$ qubits. So, our
distributed order-finding algorithm can save nearly $\frac{L}{2}$ qubits.
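To make the comparison concrete, the qubit counts can be tabulated for a given $L$. The sketch below assumes that each computer runs its phase estimation with error parameter $\epsilon_0 = \epsilon/2$ (as in the proof of Theorem 2), so that its padding term is $\lceil \log_2(2 + \frac{1}{\epsilon}) \rceil$; the function name and the sample values of `L`, `eps` and `b` are hypothetical.

```python
from math import ceil, log2

def qubit_counts(L: int, eps: float, b: int) -> dict:
    """Qubits needed by computers A and B and by the traditional algorithm.

    b is the number of auxiliary qubits used by C_t(M_a) in method (I).
    """
    if L % 2:
        raise ValueError("this sketch assumes an even L")
    p_distributed = ceil(log2(2 + 1 / eps))        # epsilon_0 = eps / 2
    p_traditional = ceil(log2(2 + 1 / (2 * eps)))
    return {
        "A": 5 * L // 2 + 1 + p_distributed + b,
        "B": 5 * L // 2 + 2 + p_distributed + b,
        "traditional": 3 * L + 1 + p_traditional + b,
    }

counts = qubit_counts(L=1024, eps=0.1, b=0)
print(counts)
print(counts["traditional"] - max(counts["A"], counts["B"]))  # qubits saved
```

For these sample values the larger of the two computers saves 510 qubits, slightly fewer than $L/2 = 512$.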
Time complexity. The operator $C_t(M_a)$ can be implemented by means of $O(tL^2)$ elementary
gates in method (I). Hence the gate complexity (or time complexity) of both our distributed
order-finding algorithm and the order-finding algorithm is $O(L^3)$.
Circuit depth. By Fig. 1, we know that the circuit depth of $C_t(M_a)$ depends on the circuit depth
of controlled-$M_a^{2^x}$ ($x = 0, 1, \cdots, t - 1$) and on $t$. The circuit depth of controlled-$M_a^{2^x}$ is $O(L^2)$ in method
(I). By comparing the value of $t$ in the order-finding algorithm and in our distributed order-finding algorithm,
we see that the circuit depth of each computer in our distributed order-finding algorithm is less
than that of the traditional order-finding algorithm, even though both are $O(L^3)$.
Communication complexity. In our distributed Shor’s algorithm, we need to teleport $L$ qubits
(in step 5 of Algorithm 3). Therefore, the communication complexity of our distributed Shor’s
algorithm is $O(L)$. As a comparison, the communication complexity of the distributed order-finding
algorithm proposed in [10] is $O(L^2)$. In [10] they directly divide the circuit into several parts. However,
the CNOT gates acting on different parts cannot be directly implemented. In order to solve this
difficulty, they use operations called cat-entangler and cat-disentangler to implement non-local
CNOT gates (the implementation of each non-local CNOT gate needs to communicate 2 classical bits
and to share an EPR pair in advance). They demonstrated that their division makes it necessary to
implement $O(L^2)$ non-local CNOT gates and thus concluded that the communication complexity of their
distributed Shor’s algorithm is $O(L^2)$.
5 Conclusions
In this paper, we have proposed a new distributed Shor’s algorithm. More specifically, we have
proposed a new quantum-classical hybrid distributed order-finding algorithm, which uses quantum
computing to obtain results quickly, while using classical algorithms to guarantee the accuracy of the
results. In this distributed quantum algorithm, two computers work sequentially via quantum
teleportation. Each of them can obtain an estimation of partial bits of $\frac{s}{r}$ for some $s \in \{0, 1, \cdots, r - 1\}$ with
high probability, where $r$ is the “order”. It is worth mentioning that they can also be executed in
parallel to some extent. We have shown that our distributed algorithm has advantages over the traditional
order-finding algorithm in space and circuit depth, which is vital in the NISQ era. Our distributed
order-finding algorithm can save nearly $\frac{L}{2}$ qubits and reduce the circuit depth to some extent for
each computer. However, unlike parallel execution, the serial execution used in
our algorithm leads to noise in both computers.
We have proved the correctness of this distributed algorithm on two computers. A natural question
is whether this method can be generalized to multiple computers or to other quantum algorithms.
We will consider this question in subsequent work.
Acknowledgements
The authors are grateful to the anonymous referee for proposing very important comments and sug-
gestions that help us improve the quality of the paper. This work is partly supported by the National
Natural Science Foundation of China (Nos. 61572532, 61876195) and the Natural Science Foundation
of Guangdong Province of China (No. 2017B030311011).
References
1. A. Aspuru-Guzik, A.D. Dutoi, P. J. Love and M. Head-Gordon (2005), Simulated quantum computation of
molecular energies, Science, Vol. 309(5741), pp. 1704–1707.
2. G. Rosenberg, P. Haghnegahdar, P. Goddard, P. Carr, K. Wu and M. L. De Prado (2016), Solving the optimal
trading trajectory problem using a quantum annealer, IEEE J. Sel. Top. Signal Process., Vol. 10(6), pp. 1053–
1060.
3. P. W. Shor (1994), Algorithms for quantum computation: discrete logarithms and factoring, in: Proceedings
of the 35th Annual Symposium on Foundations of Computer Science, pp. 124–134.
4. L. K. Grover (1996), A fast quantum mechanical algorithm for database search, in: Proceedings of the
twenty-eighth annual ACM symposium on Theory of computing, pp. 212–219.
5. A. W. Harrow, A. Hassidim and S. Lloyd (2009), Quantum algorithm for linear systems of equations, Phys.
Rev. Lett., Vol. 103(15), pp. 150502.
6. A. Montanaro (2016), Quantum algorithms: an overview, npj Quantum Inf., Vol. 2, pp. 15023.
7. J. Avron, O. Casper and I. Rozen (2021), Quantum advantage and noise reduction in distributed quantum
computing, Phys. Rev. A, Vol. 104(5), pp. 052404.
8. R. Beals, S. Brierley, O. Gray, A. W. Harrow, S. Kutin, N. Linden, D. Shepherd and M. Stather (2013),
Efficient distributed quantum computing, Proc. Math. Phys. Eng. Sci., Vol. 469(2153), pp. 20120686.
9. K. Li, D. Qiu, L. Li, S. Zheng and Z. Rong (2017), Application of distributed semi-quantum computing
model in phase estimation, Inform. Process. Lett., Vol. 120, pp. 23–29.
10. A. Yimsiriwattana and S.J. Lomonaco (2004), Distributed quantum computing: a distributed Shor algo-
rithm, Quantum Inf. Comput. II, Vol. 5436, pp. 360–372.
11. C. Gidney and M. Ekerå (2021), How to factor 2048 bit RSA integers in 8 hours using 20 million noisy
qubits, Quantum, Vol. 5, pp. 433.
12. S. Beauregard (2003), Circuit for Shor’s algorithm using 2n + 3 qubits, Quantum Inf. Comput., Vol. 3(2),
pp. 175–185.
13. T. Häner, M. Roetteler and K. M. Svore (2017), Factoring using 2n + 2 qubits with Toffoli based modular
multiplication, Quantum Inf. Comput., Vol. 17(7-8), pp. 673–684.
14. S. Parker and M. B. Plenio (2000), Efficient factorization with a single pure qubit and log N mixed qubits,
Phys. Rev. Lett., Vol. 85(14), pp. 3049–3052.
15. M. A. Nielsen and I. L. Chuang (2000), Quantum Computation and Quantum Information, Cambridge
University Press, Cambridge.
16. C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters (1993), Teleporting an unknown
quantum state via dual classical and Einstein-Podolsky-Rosen channels, Phys. Rev. Lett., Vol. 70(13), pp.
1895–1899.
17. N. Gisin and R. Thew (2007), Quantum communication, Nat. Photonics, Vol. 1(3), pp. 165–171.
18. P. W. Shor (1999), Polynomial-time algorithms for prime factorization and discrete logarithms on a quan-
tum computer, Siam Rev., Vol. 41(2), pp. 303–332.
Appendix A: An example
We will give an example to show the procedure of our distributed order-finding algorithm. For conve-
nience, we omit some details and modify some parameters.
Consider a 10-bit composite number $N = 2^{10} - 1 = 1023$ and an integer $a = 2$. The order $r$ of $a$
modulo $N$ is 10 (i.e. $r = 10$). In this example, the purpose of the quantum part of the order-finding
algorithm is to output an estimation of $s/r$ for some $s$ with error no larger than $2^{-21}$. In addition,
Alice is to estimate the first 6 bits of $s/r$ and Bob is to estimate the 5th to the 21st bit of $s/r$ (17 bits
for Bob). The procedure of our distributed Shor’s algorithm is as follows:
(1) Alice initializes qubits as $|0\rangle_A |1\rangle_C$ and Bob initializes qubits as $|0\rangle_B$.
Registers A, B, and C have 6, 17, and 10 qubits, respectively.
where $\widetilde{s/r}'$ is an estimation of the 5th to the 21st bit of $s/r$.
(6) Apply steps 1 and 2 of the CorrectResults subroutine to correct Alice’s measurement.
Alice’s estimate of the 5th and the 6th bits of $\frac{7}{10}$ is 01. Bob’s estimate of the 5th and the 6th bits
of $\frac{7}{10}$ is 00. Since $((01)_2 - 1) \bmod 2^2 = (00)_2$, we get that the CorrectionBit is $-1$. We use
the CorrectionBit to correct the result of Alice, and it becomes $((1011\,01)_2 - 1) \bmod 2^6 =
(1011\,00)_2$. Fig. A.2 shows the relationship between these results (the overall result is obtained
in the next step).
(7) By directly concatenating the first 6 bits of Alice’s result and the bits after the 2nd bit of Bob’s
result, we obtain the overall estimation 0.1011 0011 0011 0011 0011 0 (21 bits). It satisfies
\[ \Big| (0.1011\,0011\,0011\,0011\,0011\,0)_2 - \frac{7}{10} \Big| < \frac{1}{2^{21}}. \tag{A.1} \]
It means that we output an estimation with error no larger than $2^{-21}$ and thus our algorithm
works for this example.
Note that our estimation of $\frac{s}{r}$ is $(0.1011\,0011\,0011\,0011\,0011\,0)_2 = \frac{734003}{1048576}$. After applying the
continued fractions algorithm, we have
\[ \frac{734003}{1048576} = 0 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{52428 + \cfrac{1}{2}}}}}. \tag{A.2} \]
Hence we know that
\[ 0 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{3}}} = \frac{7}{10} \tag{A.3} \]
is the closest to $(0.1011\,0011\,0011\,0011\,0011\,0)_2$ of all numbers of the form $\frac{p}{q}$ where
$\gcd(p, q) = 1$ and $q < N = 1023$. Thus, we get that $\frac{s}{r} = \frac{7}{10}$ for some $s$ and obtain that 10 is a
factor of $r$. If we repeat steps (1)-(8) several times, we are likely to get all prime factors of the
order $r$, and finally conclude that $r = 10$.
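The continued fraction expansion in Eq. (A.2) and the convergent in Eq. (A.3) can be reproduced with exact rational arithmetic. The helper names below are ours; `Fraction.limit_denominator` from the Python standard library is used as an independent cross-check of the "closest fraction with $q < N$" claim.

```python
from fractions import Fraction

def continued_fraction(x: Fraction) -> list:
    """Partial quotients [a0; a1, a2, ...] of a rational number x."""
    terms = []
    num, den = x.numerator, x.denominator
    while den:
        q, rem = divmod(num, den)
        terms.append(q)
        num, den = den, rem
    return terms

def convergents(terms: list) -> list:
    """Successive convergents h/k of the continued fraction."""
    h_prev, h = 1, terms[0]
    k_prev, k = 0, 1
    result = [Fraction(h, k)]
    for a in terms[1:]:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
    return result

estimate = Fraction(734003, 1048576)
N = 1023
terms = continued_fraction(estimate)
# Last convergent whose denominator is smaller than N.
candidate = [c for c in convergents(terms) if c.denominator < N][-1]
print(terms)
print(candidate, estimate.limit_denominator(N - 1))
```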
We have obtained the order $r$ by means of our distributed order-finding algorithm. Let us continue
with the rest of Shor’s algorithm. Since $r = 10$ is even, we compute $\gcd(2^{r/2} + 1, N)$ and
$\gcd(2^{r/2} - 1, N)$. We get that
\[ \gcd(2^{r/2} + 1, N) = \gcd(33, 1023) = 33 \tag{A.4} \]
and
\[ \gcd(2^{r/2} - 1, N) = \gcd(31, 1023) = 31. \tag{A.5} \]
Finally, we have $N = 1023 = 33 \times 31$ and we can continue to try to factor 33 and 31 similarly if
necessary.
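The classical post-processing of this example can be sketched as follows. The function name `factors_from_order` is ours, and reducing $a^{r/2}$ modulo $N$ before taking the gcd is a standard shortcut that does not change the result.

```python
from math import gcd

def factors_from_order(a: int, r: int, N: int):
    """Try to split N using the order r of a modulo N.

    Returns (gcd(a^(r/2) + 1, N), gcd(a^(r/2) - 1, N)), or None when r is
    odd or a^(r/2) = -1 (mod N), in which case another a must be tried.
    """
    if r % 2:
        return None
    half = pow(a, r // 2, N)  # a^(r/2) mod N
    if half == N - 1:
        return None
    return gcd(half + 1, N), gcd(half - 1, N)

print(factors_from_order(a=2, r=10, N=1023))
```

For $a = 2$, $r = 10$ and $N = 1023$ this returns the pair $(33, 31)$, whose product is $N$.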