Quantum Information and Computation, Vol. 23, No. 1&2 (2023) 0027–0044
© Rinton Press

DISTRIBUTED SHOR’S ALGORITHM

LIGANG XIAO
Institute of Quantum Computing and Computer Theory, School of Computer Science and Engineering,
Sun Yat-sen University, Guangzhou, 510006, China
The Guangdong Key Laboratory of Information Security Technology,
Sun Yat-sen University, Guangzhou, 510006, China

DAOWEN QIU a
Institute of Quantum Computing and Computer Theory, School of Computer Science and Engineering,
Sun Yat-sen University, Guangzhou, 510006, China
The Guangdong Key Laboratory of Information Security Technology,
Sun Yat-sen University, Guangzhou, 510006, China
QUDOOR Technologies Inc., Guangzhou, 510006, China

LE LUO
School of Physics and Astronomy, Sun Yat-sen University
Zhuhai, 519082, China
QUDOOR Technologies Inc., Guangzhou, 510006, China

PAULO MATEUS
Instituto de Telecomunicações, Departamento de Matemática, Instituto Superior Técnico,
Av. Rovisco Pais 1049-001 Lisbon, Portugal

Received July 17, 2022
Revised November 17, 2022

Shor’s algorithm, proposed by Peter Shor [Proceedings of the 35th Annual Symposium on Foundations of Computer Science, 1994, pp. 124–134], is one of the most important quantum algorithms. It can factor a large integer with a certain probability in time polynomial in the length of the input integer. The key step of Shor’s algorithm is the order-finding algorithm, the quantum part of which estimates s/r, where r is the “order” and s is some natural number less than r. Shor’s algorithm requires many qubits and a large circuit depth, which current physical devices cannot afford. In this paper, to reduce the number of qubits required and the circuit depth, we propose a quantum-classical hybrid distributed order-finding algorithm for Shor’s algorithm, which combines the advantages of both quantum processing and classical processing. In our distributed order-finding algorithm, we use two quantum computers, each capable of quantum teleportation, to separately estimate partial bits of s/r. The measuring results are then processed by a classical algorithm to ensure the accuracy of the results. Compared with the traditional Shor’s algorithm that uses multiple control qubits, our algorithm saves nearly L/2 qubits for factoring an L-bit integer and reduces the circuit depth of each computer.

Keywords: Shor’s algorithm, distributed Shor’s algorithm, quantum-classical hybrid, quantum teleportation, circuit depth

aCorresponding author (D. Qiu). E-mail addresses: [email protected] (D. Qiu)


1 Introduction
Quantum computing has shown great potential in several fields and problems, such as chemical molecular simulation [1], portfolio optimization [2], large number decomposition [3], unordered database search [4] and linear equation solving [5]. At present, there are many useful algorithms in quantum computing [6], but realizing these algorithms requires medium- or large-scale general quantum computers. However, it is still very difficult to develop such a computer, because important physical problems in quantum computers have not yet been solved. In the NISQ (Noisy Intermediate-Scale Quantum) era, we can only run quantum algorithms with few qubits and low circuit depth. Therefore it is necessary to consider reducing the number of qubits and other computing resources required by quantum algorithms.
Distributed quantum computing is a computing method that solves problems collaboratively through multiple computing nodes. In distributed quantum computing, we can use multiple smaller quantum computers to complete a task that would otherwise be completed by a single medium- or large-scale quantum computer. Distributed quantum computing not only reduces the number of qubits required, but also sometimes reduces the circuit depth of each computer. This is important because noise increases as the circuit deepens. Therefore, distributed quantum computing has been studied extensively (for example, [7, 8, 9, 10]).
Shor’s algorithm, proposed by Peter Shor in 1994 [3], is an epoch-making discovery. It can factor a large integer with a certain probability in time polynomial in the length of the input integer, whereas the time complexity of the best known classical algorithm for factoring large numbers is subexponential but superpolynomial. Shor’s algorithm can be applied to break various cryptosystems, such as RSA cryptography and elliptic curve cryptography. For this reason, Shor’s algorithm has received extensive attention from the community. However, some researchers have recently pointed out that using Shor’s algorithm to factor the commonly used 2048-bit RSA integers requires millions of physical qubits [11]. So it is vital to consider reducing the number of logical qubits required by Shor’s algorithm. Many researchers have been working on reducing the number of qubits required for Shor’s algorithm [12, 13, 14], and these results have shown that Shor’s algorithm can be implemented using only one control qubit to factor an $L$-bit integer, together with $2L + c$ qubits and circuit depth $O(L^3)$, where $c$ is a constant. However, this method requires multiple intermediate measurements.
In 2004, Yimsiriwattana et al. [10] proposed a distributed Shor’s algorithm. Their algorithm directly divides the qubits into several parts, so each part has fewer qubits than the original circuit. Since all unitary operators can be decomposed into single-qubit gates and CNOT gates [15], they only need to consider how to implement CNOT gates acting across different parts, and such a CNOT gate can be implemented by means of pre-shared EPR pairs, local operations and classical communication. They showed that their distributed algorithm needs to communicate $O(L^2)$ classical bits.
In this paper, we propose a new distributed Shor’s algorithm. It is a quantum-classical hybrid algorithm, which takes advantage not only of fast quantum computing but also of the ease with which classical algorithms can process measuring results. In our distributed algorithm, two computers execute sequentially. Each computer estimates several bits of a key intermediate quantity (the ratio $s/r$, where $r$ is the “order” and $s$ is some natural number less than $r$). To guarantee, to some extent, the correlation between the two computers’ measuring results, we employ quantum communication. Furthermore, to obtain high accuracy, we adjust the measuring result of the first computer according to the measuring result of the second computer through classical post-processing. Compared with the traditional Shor’s algorithm that uses multiple control qubits, our algorithm reduces the number of qubits (by nearly $L/2$) and the circuit depth of each computer. Although each computer in our distributed algorithm requires more qubits than the implementation of Shor’s algorithm mentioned above that uses only one control qubit, our method of using quantum communication to distribute the phase estimation of Shor’s algorithm may be applicable to other quantum algorithms.
The remainder of the paper is organized as follows. In Section 2, we review quantum teleportation
and some quantum algorithms related to Shor’s algorithm. In Section 3, we present a distributed Shor’s
algorithm (more specifically, a distributed order-finding algorithm), and prove the correctness of our
algorithm. In Section 4, we analyze the performance of our algorithm, including space complexity,
time complexity, circuit depth and communication complexity. Finally in Section 5, we conclude with
a summary.

2 Preliminaries
In this section, we review the quantum Fourier transform, the phase estimation algorithm, the order-finding algorithm and other tools we will use. We assume that readers are familiar with linear algebra and the basic notation of quantum computing (for details, see [15]).

2.1 Quantum Fourier transform


The quantum Fourier transform is a unitary operator with the following action on the standard basis states:

$$QFT|j\rangle = \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2\pi i jk/2^n} |k\rangle, \tag{1}$$

for $j = 0, 1, \cdots, 2^n - 1$. Hence the inverse quantum Fourier transform acts as follows:

$$QFT^{-1} \frac{1}{\sqrt{2^n}} \sum_{k=0}^{2^n-1} e^{2\pi i jk/2^n} |k\rangle = |j\rangle, \tag{2}$$

for $j = 0, 1, \cdots, 2^n - 1$.
The quantum Fourier transform and its inverse can be implemented using $O(n^2)$ single-qubit gates and $O(n^2)$ CNOT gates [3, 15].
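As a small numerical illustration of Eqs. (1) and (2), the sketch below builds the QFT as a dense matrix and checks that its conjugate transpose undoes it. This is a classical check on a tiny register only, not a gate-level implementation; the helper names `qft_matrix`, `dagger`, `apply` and `roundtrip_error` are hypothetical and chosen for this example.

```python
import cmath
import math


def qft_matrix(n):
    """Dense QFT matrix on n qubits: entry [k][j] = exp(2*pi*i*j*k / 2^n) / sqrt(2^n)."""
    dim = 2 ** n
    scale = 1 / math.sqrt(dim)
    return [[scale * cmath.exp(2j * math.pi * j * k / dim) for j in range(dim)]
            for k in range(dim)]


def dagger(matrix):
    """Conjugate transpose, which is the inverse of a unitary matrix."""
    dim = len(matrix)
    return [[matrix[c][r].conjugate() for c in range(dim)] for r in range(dim)]


def apply(matrix, vec):
    """Matrix-vector product with plain Python lists."""
    return [sum(row[c] * vec[c] for c in range(len(vec))) for row in matrix]


def roundtrip_error(n, j):
    """Largest deviation between |j> and QFT^{-1} QFT |j> on n qubits."""
    dim = 2 ** n
    basis = [0j] * dim
    basis[j] = 1 + 0j
    forward = qft_matrix(n)
    back = apply(dagger(forward), apply(forward, basis))
    return max(abs(back[k] - basis[k]) for k in range(dim))
```

Calling `roundtrip_error(3, 5)` returns a value at the level of floating-point rounding, which is what Eq. (2) predicts for any basis state.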

2.2 Phase estimation algorithm


The phase estimation algorithm is an application of the quantum Fourier transform. Let $|u\rangle$ be a quantum state and let $U$ be a unitary operator that satisfies $U|u\rangle = e^{2\pi i\omega}|u\rangle$ for some real number $\omega \in [0, 1)$. Suppose we can create the quantum state $|u\rangle$ and implement the controlled operation $C_m(U)$ with $m$ control qubits such that

$$C_m(U)|j\rangle|u\rangle = |j\rangle U^j|u\rangle \tag{3}$$

for any positive integer $m$ and $m$-bit string $j$, where the first register holds the control qubits. Then we can apply the phase estimation algorithm to estimate $\omega$. Fig. 1 shows the implementation of $C_m(U)$. For convenience, we first define the following notation. In this paper, we treat bit strings and their corresponding binary integers as the same.
Fig. 1. Implementation for Cm (U )

Definition 1. For any real number $\omega = a_1 a_2 \cdots a_l . b_1 b_2 \cdots$, where $a_{k_1} \in \{0, 1\}$, $k_1 = 1, 2, \cdots, l$ and $b_{k_2} \in \{0, 1\}$, $k_2 = 1, 2, \cdots$, denote $|\psi_{t,\omega}\rangle$, $\omega_{\{i,j\}}$, $\omega_{[i,j]}$ and $d_t(x, y)$ respectively as follows:

• $|\psi_{t,\omega}\rangle$: for any positive integer $t$, $|\psi_{t,\omega}\rangle = QFT^{-1} \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} e^{2\pi i j\omega} |j\rangle$.

• $\omega_{\{i,j\}}$: for any integers $i, j$ with $1 \le i \le j$, $\omega_{\{i,j\}} = b_i b_{i+1} \cdots b_j$.

• $\omega_{[i,j]}$: for any integers $i, j$ with $1 \le i \le j \le l$, $\omega_{[i,j]} = a_i a_{i+1} \cdots a_j$.

• $d_t(x, y)$: for any two $t$-bit strings (or $t$-bit binary integers) $x, y$, define $d_t(x, y) = \min(|x - y|, 2^t - |x - y|)$.

$d_t(\cdot, \cdot)$ is a useful distance for estimating the error of the algorithms in this paper, and it has the following properties. We specify $a \bmod N = (kN + a) \bmod N$ for any negative integer $a$ and positive integer $N$, where $k$ is an integer satisfying $kN + a \ge 0$.

Lemma 1. Let $t$ be a positive integer and let $x, y$ be any two $t$-bit strings. It holds that:
(I) Let $B = \{b \in \{-(2^t - 1), \cdots, 2^t - 1\} : (x + b) \bmod 2^t = y\}$. Then $d_t(x, y) = \min_{b \in B} |b|$.
(II) $d_t(\cdot, \cdot)$ is a distance on $\{0, 1\}^t$.
(III) Let $t_0 < t$ be a positive integer. If $d_t(x, y) < 2^{t-t_0}$, then

$$d_{t_0}(x_{[1,t_0]}, y_{[1,t_0]}) \le 1. \tag{4}$$

Proof. First we prove (I). It is clear for the case of $x = y$. Without loss of generality, assume $x > y$. Since $x \ne y$, $B$ contains only 2 elements. Note that

$$(x + (y - x)) \bmod 2^t = y, \tag{5}$$
$$(x + (2^t - (x - y))) \bmod 2^t = y, \tag{6}$$
$$|y - x| \le 2^t - 1, \tag{7}$$
$$|2^t - (x - y)| \le 2^t - 1 \tag{8}$$

and $y - x \ne 2^t - (x - y)$, so $y - x$ and $2^t - (x - y)$ are exactly the two elements of $B$. Hence $\min_{b \in B} |b| = \min(|x - y|, 2^t - |x - y|) = d_t(x, y)$. Thus (I) holds.
Then we prove (II). We just need to show that $d_t(\cdot, \cdot)$ satisfies the triangle inequality, that is, $d_t(x, y) \le d_t(x, z) + d_t(z, y)$ holds for any $t$-bit string $z$. By (I), we know that there exist $b_1, b_2 \in \{-(2^t - 1), \cdots, 2^t - 1\}$ such that

$$|b_1| = d_t(x, z), \quad |b_2| = d_t(z, y), \tag{9}$$

and

$$(x + b_1) \bmod 2^t = z, \quad (z + b_2) \bmod 2^t = y. \tag{10}$$

Hence $(x + b_1 + b_2) \bmod 2^t = y$. Then by (I) again, we have

$$d_t(x, y) \le |b_1 + b_2| \le |b_1| + |b_2| = d_t(x, z) + d_t(z, y). \tag{11}$$

Thus, (II) holds.

Finally we prove (III). By (I) and $d_t(x, y) < 2^{t-t_0}$, we know that there exists an integer $b$ with $|b| < 2^{t-t_0}$ such that

$$(2^{t-t_0} x_{[1,t_0]} + x_{[t_0+1,t]} + b) \bmod 2^t = 2^{t-t_0} y_{[1,t_0]} + y_{[t_0+1,t]}. \tag{12}$$

Then by (I) again we have

$$d_t(2^{t-t_0} x_{[1,t_0]}, 2^{t-t_0} y_{[1,t_0]}) \le |b + x_{[t_0+1,t]} - y_{[t_0+1,t]}| < 2 \cdot 2^{t-t_0}. \tag{13}$$

Hence

$$d_{t_0}(x_{[1,t_0]}, y_{[1,t_0]}) < 2. \tag{14}$$

Therefore Eq. (4) holds.

We can understand $d_t(\cdot, \cdot)$ in a more intuitive way. Place the numbers $0$ to $2^t$ evenly on a circle so that $0$ and $2^t$ coincide, and suppose the distance between two adjacent points on the circle is 1. Then $d_t(x, y)$ can be regarded as the length of the shortest path along the circle from $x$ to $y$.
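The circular distance and Lemma 1 can be checked by brute force for small bit lengths. The sketch below is only an illustration under the paper's convention that bit strings are read as binary integers with bit 1 the most significant; the function names `d_t`, `prefix` and `check_lemma1` are hypothetical.

```python
def d_t(x, y, t):
    """Circular distance between two t-bit integers: min(|x - y|, 2^t - |x - y|)."""
    diff = abs(x - y)
    return min(diff, 2 ** t - diff)


def prefix(x, t, t0):
    """First t0 (most significant) bits of the t-bit integer x, i.e. x_[1, t0]."""
    return x >> (t - t0)


def check_lemma1(t):
    """Exhaustively test the triangle inequality (II) and property (III) for t-bit strings."""
    values = range(2 ** t)
    for x in values:
        for y in values:
            # (II): triangle inequality through every intermediate point z.
            for z in values:
                if d_t(x, y, t) > d_t(x, z, t) + d_t(z, y, t):
                    return False
            # (III): closeness of the full strings forces closeness of the prefixes.
            for t0 in range(1, t):
                if d_t(x, y, t) < 2 ** (t - t0):
                    if d_t(prefix(x, t, t0), prefix(y, t, t0), t0) > 1:
                        return False
    return True
```

For example, `d_t(1, 15, 4)` is 2, because 1 and 15 are two steps apart once 0 and 16 are identified on the circle, and `check_lemma1(4)` confirms both properties for every pair of 4-bit strings.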
Next we review the phase estimation algorithm (see Algorithm 1) and its associated results.

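Algorithm 1 Phase estimation algorithm

Procedure:
1: Create the initial state $|0\rangle^{\otimes t}|u\rangle$.
2: Apply $H^{\otimes t}$ to the first register:
   $H^{\otimes t}|0\rangle^{\otimes t}|u\rangle = \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle|u\rangle$.
3: Apply $C_t(U)$:
   $C_t(U) \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle|u\rangle = \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle U^j|u\rangle = \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle e^{2\pi i j\omega}|u\rangle$.
4: Apply $QFT^{-1}$:
   $QFT^{-1} \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} e^{2\pi i j\omega} |j\rangle|u\rangle = |\psi_{t,\omega}\rangle|u\rangle$.
5: Measure the first register:
   obtain a $t$-bit string $\widetilde{\omega}$.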

If the fractional part of $\omega$ does not exceed $t$ bits (i.e. $2^t\omega$ is an integer), by observing Eq. (2) and step 4 in Algorithm 1, we can see that $\widetilde{\omega}$ is a perfect estimate of $\omega$ (i.e. $\frac{\widetilde{\omega}}{2^t} = \omega$). However, sometimes $\omega$ is not approximated by $\frac{\widetilde{\omega}}{2^t}$ but is approximated by $1 - \frac{\widetilde{\omega}}{2^t}$. For example, if the binary representation of $\omega$ is $\omega = 0.11\cdots1$ (sufficiently many 1s), we will obtain the measuring result $00\cdots0$ with high probability, since in this case $e^{2\pi i\omega}$ is close to $e^{2\pi i 0} = 1$. The output $\widetilde{\omega}$ of the phase estimation algorithm should satisfy that $\frac{\widetilde{\omega}}{2^t}$ is close to $\omega$ or $\omega - 1$. We have the following results.
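Proposition 1 (See [15]). In Algorithm 1, for any $\epsilon > 0$ and any positive integer $n$, if $t = n + \lceil \log_2(2 + \frac{1}{2\epsilon}) \rceil$, then the probability of $d_t(\widetilde{\omega}, \omega_{\{1,t\}}) < 2^{t-n}$ is at least $1 - \epsilon$.

Lemma 2. For any $t$-bit string $\widetilde{\omega}$ and real number $\omega \in [0, 1)$, if $d_t(\widetilde{\omega}, \omega_{\{1,t\}}) < 2^{t-n}$, then we have $|\frac{\widetilde{\omega}}{2^t} - \omega| \le 2^{-n}$ or $1 - |\frac{\widetilde{\omega}}{2^t} - \omega| \le 2^{-n}$, where $n < t$.

Proof. Since $|2^t\omega - \omega_{\{1,t\}}| < 1$, if $d_t(\widetilde{\omega}, \omega_{\{1,t\}}) = |\widetilde{\omega} - \omega_{\{1,t\}}|$, we have

$$|\widetilde{\omega} - 2^t\omega| \le |\widetilde{\omega} - \omega_{\{1,t\}}| + |\omega_{\{1,t\}} - 2^t\omega| \le 2^{t-n}, \tag{15}$$

and thus $|\frac{\widetilde{\omega}}{2^t} - \omega| \le 2^{-n}$; if $d_t(\widetilde{\omega}, \omega_{\{1,t\}}) = 2^t - |\widetilde{\omega} - \omega_{\{1,t\}}|$, we have

$$2^t - |\widetilde{\omega} - 2^t\omega| \le 2^t - (|\widetilde{\omega} - \omega_{\{1,t\}}| - |\omega_{\{1,t\}} - 2^t\omega|) \le 2^{t-n}, \tag{16}$$

and therefore, we have $1 - |\frac{\widetilde{\omega}}{2^t} - \omega| \le 2^{-n}$.

That is to say, if $\frac{\widetilde{\omega}}{2^t}$ is an estimate of $\omega_{\{1,t\}}$ with error less than $2^{-n}$, then $\frac{\widetilde{\omega}}{2^t}$ is an estimate of $\omega$ with error no larger than $2^{-n}$.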
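The outcome distribution of Algorithm 1 can be computed classically for small $t$ directly from the definition of $|\psi_{t,\omega}\rangle$: the amplitude of outcome $k$ is $\frac{1}{2^t}\sum_{j} e^{2\pi i j(\omega - k/2^t)}$. The sketch below evaluates this sum and the success probability appearing in Proposition 1. It is a brute-force illustration, not a quantum simulation of the circuit, and the function names are hypothetical.

```python
import cmath
import math


def outcome_probabilities(omega, t):
    """Probability of each t-bit measurement outcome k when estimating the phase omega."""
    dim = 2 ** t
    probs = []
    for k in range(dim):
        amp = sum(cmath.exp(2j * math.pi * j * (omega - k / dim)) for j in range(dim)) / dim
        probs.append(abs(amp) ** 2)
    return probs


def d_t(x, y, t):
    """Circular distance on t-bit integers, as in Definition 1."""
    diff = abs(x - y)
    return min(diff, 2 ** t - diff)


def required_bits(n, eps):
    """Register size t = n + ceil(log2(2 + 1/(2*eps))) from Proposition 1."""
    return n + math.ceil(math.log2(2 + 1 / (2 * eps)))


def success_probability(omega, t, n):
    """Probability that the outcome is within circular distance 2^(t-n) of omega_{1,t}."""
    truncated = int(omega * 2 ** t)  # first t fractional bits of omega, read as an integer
    probs = outcome_probabilities(omega, t)
    return sum(p for k, p in enumerate(probs) if d_t(k, truncated, t) < 2 ** (t - n))
```

With $\omega = 0.625 = 0.101_2$ and $t = 3$ the outcome 5 is obtained with certainty, while for $\omega = 0.999$ and $t = 3$ the most likely outcome is 0, which is the wrap-around case described above. For $\omega = 1/3$, $n = 3$ and $\epsilon = 0.1$, `required_bits` gives $t = 6$ and the computed success probability exceeds the bound $1 - \epsilon$.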

2.3 Order-finding algorithm


The phase estimation algorithm is a key subroutine in the order-finding algorithm. Given an $L$-bit integer $N$ and a positive integer $a$ with $\gcd(a, N) = 1$, the purpose of the order-finding algorithm is to find the order $r$ of $a$ modulo $N$, that is, the least positive integer $r$ that satisfies $a^r \equiv 1 \pmod{N}$. An important unitary operator $M_a$ in the order-finding algorithm is defined as

$$M_a|x\rangle = |ax \bmod N\rangle. \tag{17}$$

Denote

$$|u_s\rangle = \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} e^{-2\pi i \frac{s}{r} k} |a^k \bmod N\rangle. \tag{18}$$

We have

$$M_a|u_s\rangle = e^{2\pi i \frac{s}{r}} |u_s\rangle, \tag{19}$$

$$\frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |u_s\rangle = |1\rangle, \tag{20}$$

and

$$\langle u_s|u_{s'}\rangle = \delta_{s,s'} = \begin{cases} 0 & \text{if } s \ne s', \\ 1 & \text{if } s = s'. \end{cases} \tag{21}$$

So if we want to apply the phase estimation algorithm to find the order, the key is to construct $C_m(M_a)$, that is, for any $m$-bit string $j$,

$$C_m(M_a)|j\rangle|x\rangle = |j\rangle|a^j x \bmod N\rangle. \tag{22}$$

Algorithm 2 [15] and Fig. 2 show the procedure of the order-finding algorithm.

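Algorithm 2 Order-finding algorithm

Input: Positive integers $N$ and $a$ with $\gcd(N, a) = 1$.
Output: The order $r$ of $a$ modulo $N$.
Procedure:
1: Create the initial state $|0\rangle^{\otimes t}|1\rangle$:
   $t = 2L + 1 + \lceil \log_2(2 + \frac{1}{2\epsilon}) \rceil$ and the second register has $L$ qubits.
2: Apply $H^{\otimes t}$ to the first register:
   $H^{\otimes t}|0\rangle^{\otimes t}|1\rangle = \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle|1\rangle$.
3: Apply $C_t(M_a)$:
   $C_t(M_a) \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle|1\rangle = \frac{1}{\sqrt{2^t}} \sum_{j=0}^{2^t-1} |j\rangle M_a^j \Big(\frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |u_s\rangle\Big) = \frac{1}{\sqrt{r 2^t}} \sum_{s=0}^{r-1} \sum_{j=0}^{2^t-1} |j\rangle e^{2\pi i j \frac{s}{r}} |u_s\rangle$.
4: Apply $QFT^{-1}$:
   $QFT^{-1} \frac{1}{\sqrt{r 2^t}} \sum_{s=0}^{r-1} \sum_{j=0}^{2^t-1} |j\rangle e^{2\pi i j \frac{s}{r}} |u_s\rangle = \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t,s/r}\rangle|u_s\rangle$.
5: Measure the first register:
   obtain a $t$-bit string $m$ that is an estimation of $\frac{s}{r}$ for some $s$.
6: Apply the continued fractions algorithm:
   obtain $r$.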

Fig. 2. Circuit for order-finding algorithm

The purpose of the quantum part of the order-finding algorithm (steps 1 to 5 in Algorithm 2) is to get a measuring result $m$ such that $m$ is an estimation of $\frac{s}{r}$ with error no larger than $2^{-(2L+1)}$ for some $s \in \{0, 1, \cdots, r - 1\}$ (i.e. $|\frac{m}{2^t} - \frac{s}{r}| \le 2^{-(2L+1)}$), because it is one of the prerequisites to ensure the correctness of the result in step 6 [15]. Let $\{P_i\}$ be any projective measurement on $\mathbb{C}^{2^t}$ and let $|\phi_s\rangle$ be any $t$-qubit quantum state for $s = 0, 1, \cdots, r - 1$. By Eq. (21), we have

$$\Big\|(P_j \otimes I) \sum_{s=0}^{r-1} |\phi_s\rangle|u_s\rangle\Big\|^2 = \sum_{s=0}^{r-1} \big\|(P_j|\phi_s\rangle)|u_s\rangle\big\|^2 \tag{23}$$

for $P_j \in \{P_i\}$. Hence by Proposition 1 and Eq. (23), we can obtain the following proposition immediately.
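Proposition 2 (See [15]). In Algorithm 2, the probability of $d_t(m, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}$ for any fixed $s \in \{0, 1, \cdots, r - 1\}$ is at least $\frac{1-\epsilon}{r}$. And the probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$ such that

$$d_t(m, (\tfrac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)} \tag{24}$$

is at least $1 - \epsilon$.

Proof. Denote $A_s = \{x \in \{0, 1\}^t : d_t(x, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}\}$. Let $Q_s = \sum_{i \in A_s} |i\rangle\langle i|$. For any fixed $s \in \{0, 1, \cdots, r - 1\}$, the probability of $d_t(m, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}$ is

$$\Big\|(Q_s \otimes I) \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} |\psi_{t,k/r}\rangle|u_k\rangle\Big\|^2 = \frac{1}{r} \sum_{k=0}^{r-1} \big\|(Q_s|\psi_{t,k/r}\rangle)|u_k\rangle\big\|^2 \quad \text{(by Eq. (23))} \tag{25}$$
$$\ge \frac{1}{r} \big\|(Q_s|\psi_{t,s/r}\rangle)|u_s\rangle\big\|^2 \tag{26}$$
$$= \frac{1}{r} \big\|Q_s|\psi_{t,s/r}\rangle\big\|^2 \tag{27}$$
$$\ge \frac{1-\epsilon}{r} \quad \text{(by Proposition 1)} \tag{28}$$

Let $Q = \sum_{i \in \cup_{s=0}^{r-1} A_s} |i\rangle\langle i|$. The probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$ such that $d_t(m, (\frac{s}{r})_{\{1,t\}}) < 2^{t-(2L+1)}$ is

$$\Big\|(Q \otimes I) \frac{1}{\sqrt{r}} \sum_{k=0}^{r-1} |\psi_{t,k/r}\rangle|u_k\rangle\Big\|^2 = \frac{1}{r} \sum_{k=0}^{r-1} \big\|(Q|\psi_{t,k/r}\rangle)|u_k\rangle\big\|^2 \quad \text{(by Eq. (23))} \tag{29}$$
$$\ge \frac{1}{r} \sum_{k=0}^{r-1} \big\|(Q_k|\psi_{t,k/r}\rangle)|u_k\rangle\big\|^2 \tag{30}$$
$$= \frac{1}{r} \sum_{k=0}^{r-1} \big\|Q_k|\psi_{t,k/r}\rangle\big\|^2 \tag{31}$$
$$\ge 1 - \epsilon \quad \text{(by Eq. (28))} \tag{32}$$

Therefore, the proposition holds.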


Although the probability of obtaining $r$ correctly from the measuring result by applying the continued fractions algorithm is an important topic, the details are omitted here and we focus on whether the measuring result is an estimation of $\frac{s}{r}$ for some $s$ (i.e. $|\frac{m}{2^t} - \frac{s}{r}| \le 2^{-(2L+1)}$), since this is exactly the goal of the quantum part of the order-finding algorithm.
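For readers who want to see the omitted classical step in action, the following sketch recovers a candidate order from a measuring result. It is not the paper's procedure: it uses Python's `Fraction.limit_denominator` as a stand-in for the continued fractions algorithm, and the function names `candidate_order` and `is_order` are hypothetical. When $|\frac{m}{2^t} - \frac{s}{r}| \le 2^{-(2L+1)}$ and $r < N \le 2^L$, the closest fraction to $\frac{m}{2^t}$ with denominator at most $N$ is $\frac{s}{r}$ in lowest terms, so the returned denominator is $r / \gcd(s, r)$.

```python
from fractions import Fraction


def candidate_order(m, t, N):
    """Denominator of the fraction closest to m / 2^t among those with denominator <= N."""
    return Fraction(m, 2 ** t).limit_denominator(N).denominator


def is_order(a, r, N):
    """True if r is the least positive integer with a^r = 1 (mod N)."""
    return pow(a, r, N) == 1 and all(pow(a, k, N) != 1 for k in range(1, r))
```

Take $N = 21$ and $a = 2$, whose order is 6, with $L = 5$ and an 11-bit result. For $s = 1$ an ideal measurement is $m = 341 \approx 2^{11}/6$, and `candidate_order(341, 11, 21)` returns 6. For $s = 2$ the result $m = 683$ yields 3, a proper divisor of the order, which is why the check `is_order` (or a repetition of the quantum part) is needed in practice.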

2.4 Quantum teleportation


Quantum teleportation is an important means to realize quantum communication [16, 17]. Quantum
teleportation is effectively equivalent to physically teleporting qubits, but in fact, the realization of
quantum teleportation only requires classical communication and both parties to share an EPR pair in
advance. The following result is useful.

Theorem 1 ([16]). When Alice and Bob share L EPR pairs, they can simulate the transmission of L qubits by communicating 2L classical bits.
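The single-qubit case of Theorem 1 can be verified with a few lines of state-vector arithmetic. The sketch below follows the standard textbook teleportation protocol (CNOT, Hadamard, two-bit measurement, then X and Z corrections), which is assumed here rather than taken from this paper; the function name `teleport` and the register labels `q0`, `q1`, `q2` are hypothetical. Instead of sampling, it enumerates all four measurement outcomes.

```python
import math


def teleport(alpha, beta):
    """Return Bob's corrected qubit (b0, b1) for each of Alice's four outcomes (m0, m1)."""
    s = 1 / math.sqrt(2)
    # Amplitudes indexed by (q0, q1, q2): q0 is the qubit to send, q1 and q2 are
    # Alice's and Bob's halves of the EPR pair (|00> + |11>) / sqrt(2).
    state = {(q0, q1, q2): 0j for q0 in (0, 1) for q1 in (0, 1) for q2 in (0, 1)}
    for q0, amp in ((0, alpha), (1, beta)):
        state[(q0, 0, 0)] = amp * s
        state[(q0, 1, 1)] = amp * s
    # CNOT with control q0 and target q1.
    state = {(q0, q1 ^ q0, q2): amp for (q0, q1, q2), amp in state.items()}
    # Hadamard on q0.
    after_h = {key: 0j for key in state}
    for (q0, q1, q2), amp in state.items():
        after_h[(0, q1, q2)] += amp * s
        after_h[(1, q1, q2)] += amp * s * (-1 if q0 else 1)
    results = {}
    for m0 in (0, 1):
        for m1 in (0, 1):
            # Bob's unnormalised qubit given that Alice measured (m0, m1).
            b0, b1 = after_h[(m0, m1, 0)], after_h[(m0, m1, 1)]
            norm = math.sqrt(abs(b0) ** 2 + abs(b1) ** 2)
            b0, b1 = b0 / norm, b1 / norm
            if m1:  # X correction, triggered by the second classical bit
                b0, b1 = b1, b0
            if m0:  # Z correction, triggered by the first classical bit
                b1 = -b1
            results[(m0, m1)] = (b0, b1)
    return results
```

For any normalised input, for instance `teleport(0.6, 0.8j)`, all four entries of the result equal the input amplitudes up to rounding. Each qubit costs one EPR pair and two classical bits, which matches the count of $2L$ classical bits for $L$ qubits in Theorem 1.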

3 Distributed order-finding algorithm


In [9], a distributed phase estimation algorithm was proposed. However, the method of [9] does not prove that the distance between the estimation and the true value can be made less than a given margin of error. Their ideas deserve further consideration. In this section, by incorporating quantum teleportation, we propose a distributed order-finding algorithm and prove its correctness.
Without loss of generality, assume that $L = \lceil \log_2(N) \rceil$ is even. The idea of our distributed order-finding algorithm is as follows. We need two quantum computers (named A and B). We first apply the order-finding algorithm in computer A and obtain an estimation of the first $\frac{L}{2} + 1$ bits of $\frac{s}{r}$ for some $s \in \{0, 1, \cdots, r - 1\}$, and similarly obtain an estimation of the $(\frac{L}{2} + 2)$th bit to the $(2L + 1)$th bit of $\frac{s}{r}$ in computer B. We can realize this by using $C_t(M_a^{2^l})$, since $M_a^{2^l}|u_s\rangle = e^{2\pi i (2^l \frac{s}{r})}|u_s\rangle$ and the fractional part of $2^l \frac{s}{r}$ starts at the $(l + 1)$th bit of the fractional part of $\frac{s}{r}$. Moreover, since $M_a^{2^l} = M_{a^{2^l} \bmod N}$ and we can calculate $a^{2^l} \bmod N$ classically with $O(l)$ modular squarings, we can construct $C_t(M_a^{2^l})$ in the same way as $C_t(M_a)$. In addition, to guarantee that the measuring results of A and B correspond to the same $\frac{s}{r}$, we need quantum teleportation.
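Two classical facts used in this paragraph are easy to check numerically: $a^{2^l} \bmod N$ can be obtained by $l$ successive modular squarings, and multiplying $\frac{s}{r}$ by $2^l$ shifts its binary expansion left by $l$ places. The sketch below is an illustration only; the function names `power_of_two_exponent` and `fractional_bits` are hypothetical.

```python
from fractions import Fraction


def power_of_two_exponent(a, l, N):
    """Compute a^(2^l) mod N with l modular squarings."""
    value = a % N
    for _ in range(l):
        value = value * value % N
    return value


def fractional_bits(x, count):
    """First `count` bits after the binary point of the non-negative Fraction x."""
    frac = x - (x.numerator // x.denominator)  # drop the integer part
    bits = []
    for _ in range(count):
        frac *= 2
        bit = int(frac >= 1)
        bits.append(str(bit))
        frac -= bit
    return "".join(bits)
```

For $\frac{s}{r} = \frac{3}{7}$ the first eight fractional bits are `01101101`, and the first six fractional bits of $2^2 \cdot \frac{3}{7}$ are `101101`, the same string with its first two bits removed. This is the property that lets computer B read later bits of $\frac{s}{r}$ by estimating the phase of $M_a^{2^l}$.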
However, in order to make the distance between the overall estimation and the true value less than $\frac{1}{2^{2L+1}}$, computer B actually estimates the $\frac{L}{2}$th bit to the $(2L + 1)$th bit, where the estimation of the $\frac{L}{2}$th bit and the $(\frac{L}{2} + 1)$th bit is used to “correct” the measuring result of A. This “correction” operation is handed over to a classical subroutine named CorrectResults. Our distributed order-finding algorithm is shown in Algorithm 3 and Fig. 3, and the subroutine CorrectResults is shown in Algorithm 4. In addition, we give an example to show the procedure of our distributed order-finding algorithm in Appendix A.

Remark 1. Although Algorithm 3 is a serial algorithm, the two computers can also execute in parallel
to some extent. For example, execute the algorithm in the following order: 1, (2, 6), 3, 5, 7, (4, 8), 9,
10, 11, where i represents the ith step in Algorithm 3, and (i, j) means that the ith and jth steps are
executed in parallel.
Remark 2. If we initialize the quantum state of computer B to $|0\rangle_B|1\rangle_D$ (register D is $L$-qubit) and do not employ quantum teleportation in Fig. 3, that is, computer A and computer B each execute a “partial” order-finding algorithm, then the final quantum states of computers A and B will become $\sum_{s=0}^{r-1} |\widetilde{s/r}\rangle_A|u_s\rangle_C$ and $\sum_{s=0}^{r-1} |\widetilde{s/r}'\rangle_B|u_s\rangle_D$, respectively, where $\widetilde{s/r}$ is an estimation of $(\frac{s}{r})_{\{1,\frac{L}{2}+1\}}$ and $\widetilde{s/r}'$ is an estimation of $(\frac{s}{r})_{\{\frac{L}{2},2L+1\}}$. Therefore, in this case, if computer A measures register A and computer B measures register B, their measuring results may not correspond to the same $s/r$.

Fig. 3. Circuit for distributed order-finding algorithm

Next we prove the correctness of our algorithm, that is, we can obtain an output $m$ such that $|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ holds for some $s \in \{0, 1, \cdots, r - 1\}$ with high probability. Let $r, L, t_1, t_2, p, m_1, m_2, m_{\mathrm{prefix}}, m, \epsilon', |\phi_{\mathrm{final}}\rangle$ be the same as those in Algorithm 3 and Algorithm 4. We first prove that if $m_1$ and $m_2$ are both estimations of some bits of $\frac{s_0}{r}$ with $\frac{s_0}{r} = 0.a_1 a_2 \cdots a_{\frac{L}{2}+1}$, then the output $m$ is perfect (i.e. $m = a_1 a_2 \cdots a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0$), and the probability of this case is not less than $\frac{1}{r}$.
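Proposition 3. Let $s_0 \in \{0, 1, \cdots, r - 1\}$ satisfy that $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is an integer, that is, $\frac{s_0}{r} = 0.a_1 a_2 \cdots a_{\frac{L}{2}+1}$ where $a_i \in \{0, 1\}$, $i = 1, 2, \cdots, \frac{L}{2} + 1$. Then in Algorithm 3, it holds that

$$\mathrm{Prob}(m = a_1 a_2 \cdots a_{\frac{L}{2}+1} 0 \cdots 0) \ge \frac{1}{r}. \tag{33}$$

Proof. Since the fractional part of $\frac{s_0}{r}$ is at most $(\frac{L}{2} + 1)$-bit, in Algorithm 3, we have

$$|\psi_{t_1,s_0/r}\rangle = |a_1 a_2 \cdots a_{\frac{L}{2}+1} 0 \cdots 0\rangle \tag{34}$$

and

$$|\psi_{t_2,2^{\frac{L}{2}-1} s_0/r}\rangle = |a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0\rangle. \tag{35}$$

Denote $x = a_1 a_2 \cdots a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0$ and $y = a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0$. By Eq. (23), we have

$$\mathrm{Prob}(m_1 = x \text{ and } m_2 = y) = \big\|(|x\rangle\langle x| \otimes |y\rangle\langle y| \otimes I)|\phi_{\mathrm{final}}\rangle\big\|^2 \tag{36}$$
$$\ge \Big\|(|x\rangle\langle x| \otimes |y\rangle\langle y| \otimes I) \frac{1}{\sqrt{r}} |\psi_{t_1,s_0/r}\rangle|\psi_{t_2,2^{\frac{L}{2}-1} s_0/r}\rangle|u_{s_0}\rangle\Big\|^2 \tag{37}$$
$$= \frac{1}{r}. \tag{38}$$

Since $\mathrm{CorrectResults}(x, y) = a_1 a_2 \cdots a_{\frac{L}{2}} a_{\frac{L}{2}+1} 0 \cdots 0$, the proposition holds.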
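Algorithm 3 Distributed order-finding algorithm

Input: Positive integers $N$ and $a$ with $\gcd(N, a) = 1$.
Output: The order $r$ of $a$ modulo $N$.
Procedure:
1: Computer A creates the initial state $|0\rangle_A|1\rangle_C$. Computer B creates the initial state $|0\rangle_B$:
   Here registers A, B and C are $t_1$-qubit, $t_2$-qubit and $L$-qubit, respectively. We take
   $t_1 = \frac{L}{2} + 1 + p$ and $t_2 = \frac{3L}{2} + 2 + p$, where $p = \lceil \log_2(2 + \frac{1}{2\epsilon'}) \rceil$ and $\epsilon' = \frac{\epsilon}{2}$.
Computer A:
2: Apply $H^{\otimes t_1}$ to register A:
   $\to \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} (H^{\otimes t_1}|0\rangle_A|u_s\rangle_C)|0\rangle_B$.
3: Apply $C_{t_1}(M_a)$ to registers A and C:
   $\to \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} \Big(\frac{1}{\sqrt{2^{t_1}}} \sum_{j=0}^{2^{t_1}-1} e^{2\pi i j \frac{s}{r}} |j\rangle_A|u_s\rangle_C\Big)|0\rangle_B$.
4: Apply $QFT^{-1}$ to register A:
   $\to \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t_1,s/r}\rangle_A|u_s\rangle_C|0\rangle_B$.
5: Teleport the qubits of register C to computer B:
   $\to \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t_1,s/r}\rangle_A|0\rangle_B|u_s\rangle_C$.
Computer B:
6: Apply $H^{\otimes t_2}$ to register B:
   $\to \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t_1,s/r}\rangle_A H^{\otimes t_2}|0\rangle_B|u_s\rangle_C$.
7: Apply $C_{t_2}(M_a^{2^{\frac{L}{2}-1}})$ to registers B and C:
   $\to \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t_1,s/r}\rangle_A \Big(\frac{1}{\sqrt{2^{t_2}}} \sum_{j=0}^{2^{t_2}-1} e^{2\pi i j (2^{\frac{L}{2}-1} \frac{s}{r})} |j\rangle_B\Big)|u_s\rangle_C$.
8: Apply $QFT^{-1}$ to register B:
   $\to |\phi_{\mathrm{final}}\rangle = \frac{1}{\sqrt{r}} \sum_{s=0}^{r-1} |\psi_{t_1,s/r}\rangle_A|\psi_{t_2,2^{\frac{L}{2}-1} s/r}\rangle_B|u_s\rangle_C$.
9: Computer A measures register A and computer B measures register B:
   A obtains a $t_1$-bit string $m_1$ and B obtains a $t_2$-bit string $m_2$.
10: $m \leftarrow \mathrm{CorrectResults}(m_1, m_2)$: $m$ is a $(2L + 1 + p)$-bit string.
11: Apply the continued fractions algorithm: obtain $r$.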

Algorithm 4 CorrectResults subroutine

Input: Two measuring results: $t_1$-bit string $m_1$ and $t_2$-bit string $m_2$.
Output: An estimation $m$ such that $|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ for some $s \in \{0, 1, \cdots, r - 1\}$.
Procedure:
1: Choose $\mathrm{CorrectionBit} \in \{-1, 0, 1\}$ such that
   $((m_1)_{[\frac{L}{2},\frac{L}{2}+1]} + \mathrm{CorrectionBit}) \bmod 2^2 = (m_2)_{[1,2]}$.
2: $m_{\mathrm{prefix}} \leftarrow ((m_1)_{[1,\frac{L}{2}+1]} + \mathrm{CorrectionBit}) \bmod 2^{\frac{L}{2}+1}$
3: $m \leftarrow m_{\mathrm{prefix}} \circ (m_2)_{[3,t_2]}$ (“$\circ$” represents concatenation)
4: return $m$
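The subroutine translates almost line by line into integer arithmetic. The sketch below is one possible rendering, assuming $L$ is even, bit strings are passed as Python integers with bit 1 the most significant, and $p$ is supplied by the caller; the function name `correct_results` and the error raised when no correction bit fits are choices made for this example, not part of the paper.

```python
def correct_results(m1, m2, L, p):
    """Combine the t1-bit result m1 of computer A with the t2-bit result m2 of computer B."""
    t1 = L // 2 + 1 + p
    t2 = 3 * L // 2 + 2 + p
    prefix_len = L // 2 + 1
    m1_prefix = m1 >> (t1 - prefix_len)   # (m1)_[1, L/2+1]
    m1_overlap = m1_prefix & 0b11         # (m1)_[L/2, L/2+1]
    m2_head = m2 >> (t2 - 2)              # (m2)_[1, 2]
    # Step 1: find the correction in {-1, 0, 1} that aligns the two overlapping bits.
    for correction in (-1, 0, 1):
        if (m1_overlap + correction) % 4 == m2_head:
            break
    else:
        raise ValueError("no CorrectionBit in {-1, 0, 1} matches these results")
    # Step 2: apply the same correction to the whole prefix.
    m_prefix = (m1_prefix + correction) % (2 ** prefix_len)
    # Step 3: append the remaining bits (m2)_[3, t2].
    m2_tail = m2 & ((1 << (t2 - 2)) - 1)
    return (m_prefix << (t2 - 2)) | m2_tail
```

As a toy case, take $L = 4$, $p = 2$ and $\frac{s}{r} = \frac{1}{6} = 0.0010101\cdots_2$, so $t_1 = 5$ and $t_2 = 10$. An ideal result for B is $m_2 = 0101010101_2 = 341$. If A returns the slightly wrong $m_1 = 00011_2 = 3$, the overlap bits disagree, the correction $+1$ is selected, and `correct_results(3, 341, 4, 2)` returns 341, which is exactly the first 11 fractional bits of $\frac{1}{6}$.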

Then we prove that if $m_2$ is an estimation of the $\frac{L}{2}$th to $(2L+1)$th bit of $\frac{s_0}{r}$, we can get $(m_2)_{[1,2]} = (\frac{s_0}{r})_{\{\frac{L}{2},\frac{L}{2}+1\}}$.

Lemma 3. Let $s_0 \in \{0, 1, \cdots, r - 1\}$ satisfy that $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is not an integer and let $m_2$ satisfy

$$d_{t_2}(m_2, (\tfrac{s_0}{r})_{\{\frac{L}{2},2L+1+p\}}) < 2^p. \tag{39}$$

Then $(m_2)_{[1,2]} = (\frac{s_0}{r})_{\{\frac{L}{2},\frac{L}{2}+1\}}$.

Proof. Since $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is not an integer, we have

$$2^{-L} < \frac{1}{r} \le \frac{2^{\frac{L}{2}+1} s_0 \bmod r}{r} \le \frac{r-1}{r} < 1 - 2^{-L}. \tag{40}$$

So we get that $(\frac{s_0}{r})_{\{\frac{L}{2}+2,\frac{3L}{2}+1\}}$ is not $00\cdots0$ or $11\cdots1$. Hence, $(\frac{s_0}{r})_{\{\frac{L}{2}+2,2L+1\}}$ is not $00\cdots0$ or $11\cdots1$. That is to say, if we add 1 to or subtract 1 from $(\frac{s_0}{r})_{\{\frac{L}{2},2L+1\}}$, its first two bits are not changed. Thus by Eq. (39), we have

$$(m_2)_{[1,2]} = (\tfrac{s_0}{r})_{\{\frac{L}{2},\frac{L}{2}+1\}}. \tag{41}$$

Therefore the lemma holds.
If $(m_2)_{[1,2]} = (\frac{s_0}{r})_{\{\frac{L}{2},\frac{L}{2}+1\}}$, that is, the first two bits of $m_2$ are correct, then we can use these two bits of $m_2$ to “correct” $m_1$. The following lemma can be used to show the correctness of Algorithm 4.

Lemma 4. Let $t > 2$ be a positive integer and let $x, y$ be two $t$-bit strings with $d_t(x, y) \le 1$. Then there exists exactly one element $b_0$ in $\{-1, 0, 1\}$ such that $(x + b_0) \bmod 2^t = y$, and for any $b \in \{-1, 0, 1\}$, $(x + b) \bmod 2^t = y$ if and only if $(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}$.

Proof. By Lemma 1, we know that there exists such a $b_0$. It is clear that such a $b_0$ is unique. Next we prove that for any $b \in \{-1, 0, 1\}$, $(x + b) \bmod 2^t = y$ if and only if $(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}$. For any $b \in \{-1, 0, 1\}$, suppose $(x + b) \bmod 2^t = y$; then we have

$$(x + b) \bmod 2^2 = y \bmod 2^2. \tag{42}$$

That is,

$$(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}. \tag{43}$$

On the other hand, for any $b \in \{-1, 0, 1\}$, suppose $(x_{[t-1,t]} + b) \bmod 2^2 = y_{[t-1,t]}$. Since there exists only one element $b_1$ in $\{-1, 0, 1\}$ such that $(x_{[t-1,t]} + b_1) \bmod 2^2 = y_{[t-1,t]}$, $b$ is equal to $b_0$, that is, $b$ satisfies $(x + b) \bmod 2^t = y$. Consequently, the lemma holds.
We can inspect Lemma 4 from another aspect. If $d_{\frac{L}{2}+1}((m_1)_{[1,\frac{L}{2}+1]}, (\frac{s_0}{r})_{\{1,\frac{L}{2}+1\}}) \le 1$ and $(m_2)_{[1,2]} = (\frac{s_0}{r})_{\{\frac{L}{2},\frac{L}{2}+1\}}$ hold for some $s_0$, then the CorrectionBit in Algorithm 4 exists, and $m_{\mathrm{prefix}} = (\frac{s_0}{r})_{\{1,\frac{L}{2}+1\}}$ holds as well.
Finally, we give the following results, which complete the proof of the correctness of our algorithms.

Proposition 4. Let $m_2$ satisfy $d_{t_2}(m_2, (\frac{s_0}{r})_{\{\frac{L}{2},2L+1+p\}}) < 2^p$ for some $s_0 \in \{0, 1, \cdots, r - 1\}$ with $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ being not an integer. Suppose $d_{t_1}(m_1, (\frac{s_0}{r})_{\{1,t_1\}}) < 2^p$. Then $|\frac{m}{2^{2L+1+p}} - \frac{s_0}{r}| \le 2^{-(2L+1)}$.

Proof. Since $d_{t_2}(m_2, (\frac{s_0}{r})_{\{\frac{L}{2},2L+1+p\}}) < 2^p$ and $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ is not an integer, by Lemma 3, we have

$$(m_2)_{[1,2]} = (\tfrac{s_0}{r})_{\{\frac{L}{2},\frac{L}{2}+1\}}. \tag{44}$$

Since $d_{t_1}(m_1, (\frac{s_0}{r})_{\{1,t_1\}}) < 2^p$ and $t_1 = \frac{L}{2} + 1 + p$, by Lemma 1, we have

$$d_{\frac{L}{2}+1}((m_1)_{[1,\frac{L}{2}+1]}, (\tfrac{s_0}{r})_{\{1,\frac{L}{2}+1\}}) \le 1. \tag{45}$$

As a result, in Algorithm 4, the CorrectionBit exists. By Eq. (44), Lemma 4, and steps 1 to 2 in Algorithm 4, we get

$$m_{\mathrm{prefix}} = (\tfrac{s_0}{r})_{\{1,\frac{L}{2}+1\}}. \tag{46}$$

Since $m = m_{\mathrm{prefix}} \circ (m_2)_{[3,t_2]}$, by Eq. (44) and Eq. (46), we get

$$d_{2L+1+p}(m, (\tfrac{s_0}{r})_{\{1,2L+1+p\}}) = d_{\frac{3L}{2}+2+p}(m_2, (\tfrac{s_0}{r})_{\{\frac{L}{2},2L+1+p\}}) < 2^p. \tag{47}$$

Since $\frac{s_0}{r}$ is not an integer, similar to Eq. (40), we know that $(\frac{s_0}{r})_{\{1,2L+1\}}$ is not $00\cdots0$ or $11\cdots1$. Then by Eq. (47), we get $d_{2L+1+p}(m, (\frac{s_0}{r})_{\{1,2L+1+p\}}) = |m - (\frac{s_0}{r})_{\{1,2L+1+p\}}|$. Therefore, by Eq. (47) and Lemma 2, we obtain

$$\Big|\frac{m}{2^{2L+1+p}} - \frac{s_0}{r}\Big| \le 2^{-(2L+1)}. \tag{48}$$

Theorem 2. In Algorithm 3, for any fixed $s_0 \in \{0, 1, \cdots, r - 1\}$, the probability of $|\frac{m}{2^{2L+1+p}} - \frac{s_0}{r}| \le 2^{-(2L+1)}$ is at least $\frac{1-\epsilon}{r}$. The probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$ such that $|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ is at least $1 - \epsilon$.

Proof. By Proposition 3, for any fixed $s_0 \in \{0, 1, \cdots, r - 1\}$ with $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ being an integer, we have

$$\mathrm{Prob}\Big(\frac{m}{2^{2L+1+p}} = \frac{s_0}{r}\Big) \ge \frac{1}{r}. \tag{49}$$

For any fixed $s_0 \in \{0, 1, \cdots, r - 1\}$ with $2^{\frac{L}{2}+1} \cdot \frac{s_0}{r}$ being not an integer, by Proposition 1 and Eq. (23), we get that the probability of

$$d_{t_2}(m_2, (\tfrac{s_0}{r})_{\{\frac{L}{2},2L+1+p\}}) < 2^p \tag{50}$$

and

$$d_{t_1}(m_1, (\tfrac{s_0}{r})_{\{1,t_1\}}) < 2^p \tag{51}$$

is at least $\frac{1}{r}(1 - \epsilon')^2 = \frac{1}{r}(1 - \frac{\epsilon}{2})^2 > \frac{1-\epsilon}{r}$. Consequently, by Proposition 4, we obtain

$$\mathrm{Prob}\Big(\Big|\frac{m}{2^{2L+1+p}} - \frac{s_0}{r}\Big| \le 2^{-(2L+1)}\Big) > \frac{1-\epsilon}{r}. \tag{52}$$

Similar to the proof of Proposition 2, we can obtain that the probability that there exists an $s \in \{0, 1, \cdots, r - 1\}$ such that $|\frac{m}{2^{2L+1+p}} - \frac{s}{r}| \le 2^{-(2L+1)}$ is at least $1 - \epsilon$. This completes the proof of the theorem.

4 Complexity analysis
The complexity of the circuit of the (distributed) order-finding algorithm depends on the construction of $C_t(M_a)$. There are two kinds of implementation of $C_t(M_a)$ proposed by Shor [18]. The first method (denoted as method (I)) needs time complexity $O(L^3)$ and space complexity $O(L)$, and the second method (denoted as method (II)) needs time complexity $O(L^2 \log L \log\log L)$ and space complexity $O(L \log L \log\log L)$. In this section, we compare our distributed order-finding algorithm with the traditional order-finding algorithm. For a more concrete comparison, we consider that $C_t(M_a)$ is implemented by method (I). There is a concrete implementation of the order-finding algorithm using method (I) in [10]. However, the advantages of our distributed order-finding algorithm in space and circuit depth are independent of whether method (I) or method (II) is used.
Space complexity. The implementation of the operator $C_t(M_a)$ in method (I) needs $t + L$ qubits plus $b$ auxiliary qubits for any positive integer $a$, where $b$ is $O(L)$. By Theorem 1, to teleport $L$ qubits, computers A and B need to share $L$ EPR pairs and communicate $2L$ classical bits. As a result, A needs $\frac{5L}{2} + 1 + \lceil \log_2(2 + \frac{1}{\epsilon}) \rceil + b$ qubits and B needs $\frac{5L}{2} + 2 + \lceil \log_2(2 + \frac{1}{\epsilon}) \rceil + b$ qubits. As a comparison, the traditional order-finding algorithm needs $3L + 1 + \lceil \log_2(2 + \frac{1}{2\epsilon}) \rceil + b$ qubits. So, our distributed order-finding algorithm can save nearly $\frac{L}{2}$ qubits.
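The qubit counts above can be evaluated for concrete parameters. The following is a minimal sketch, assuming the distributed precision term is $\lceil \log_2(2 + 1/\epsilon) \rceil$ and the traditional one is $\lceil \log_2(2 + 1/(2\epsilon)) \rceil$, as read from the formulas in this paragraph. The function name `qubit_counts` and the choices $L = 1024$, $\epsilon = 0.25$ and $b = L$ are illustrative assumptions, not values from the paper.

```python
import math

def qubit_counts(L: int, eps: float, b: int) -> dict:
    """Evaluate the qubit-count formulas of the space-complexity paragraph (L even)."""
    # Precision terms as read from the text (an assumption of this sketch).
    p_dist = math.ceil(math.log2(2 + 1 / eps))        # distributed algorithm
    p_std = math.ceil(math.log2(2 + 1 / (2 * eps)))   # traditional algorithm
    return {
        "A": 5 * L // 2 + 1 + p_dist + b,
        "B": 5 * L // 2 + 2 + p_dist + b,
        "traditional": 3 * L + 1 + p_std + b,
    }

# Hypothetical parameters: b = L is only one possible O(L) choice.
counts = qubit_counts(L=1024, eps=0.25, b=1024)
# Saving relative to the larger of the two distributed computers.
saving = counts["traditional"] - max(counts["A"], counts["B"])
print(counts, saving)
```

For these parameters the saving is 510 qubits, close to $L/2 = 512$.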
Time complexity. The operator $C_t(M_a)$ can be implemented by means of $O(tL^2)$ elementary gates in method (I). Hence the gate complexity (or time complexity) of both our distributed order-finding algorithm and the traditional order-finding algorithm is $O(L^3)$.
Circuit depth. By Fig. 1, we know that the circuit depth of $C_t(M_a)$ depends on $t$ and on the circuit depth of controlled-$M_a^{2^x}$ ($x = 0, 1, \cdots, t-1$). The circuit depth of controlled-$M_a^{2^x}$ is $O(L^2)$ in method (I). Comparing the value of $t$ in the traditional order-finding algorithm with the values used in our distributed order-finding algorithm, we see that the circuit depth of each computer in our distributed order-finding algorithm is less than that of the traditional order-finding algorithm, even though both are $O(L^3)$.
Communication complexity. In our distributed Shor's algorithm, we need to teleport $L$ qubits (in step 5 of Algorithm 3). Therefore, the communication complexity of our distributed Shor's algorithm is $O(L)$. As a comparison, the communication complexity of the distributed order-finding algorithm proposed in [10] is $O(L^2)$. In [10] the circuit is directly divided into several parts. However, the CNOT gates acting on different parts cannot be directly implemented. To overcome this difficulty, they use operations called cat-entangler and cat-disentangler to implement non-local CNOT gates (the implementation of each non-local CNOT gate needs to communicate 2 classical bits and to share an EPR pair in advance). They demonstrated that their division makes it necessary to implement $O(L^2)$ non-local CNOT gates and thus concluded that the communication complexity of their distributed Shor's algorithm is $O(L^2)$.

5 Conclusions
In this paper, we have proposed a new distributed Shor's algorithm. More specifically, we have proposed a new quantum-classical hybrid distributed order-finding algorithm, which uses quantum computing to obtain results quickly, while using classical algorithms to guarantee the accuracy of the results. In this distributed quantum algorithm, two computers work sequentially via quantum teleportation. Each of them can obtain an estimation of partial bits of $\frac{s}{r}$ for some $s \in \{0, 1, \cdots, r-1\}$ with high probability, where $r$ is the "order". It is worth mentioning that they can also be executed in parallel to some extent. We have shown that our distributed algorithm has advantages over the traditional order-finding algorithm in space and circuit depth, which is vital in the NISQ era. Our distributed order-finding algorithm can save nearly $\frac{L}{2}$ qubits and reduce the circuit depth to some extent for each computer. However, unlike parallel execution, the serial execution used in our algorithm leads to noise in both computers.
We have proved the correctness of this distributed algorithm on two computers. A natural problem is whether or not this method can be generalized to multiple computers or to other quantum algorithms. We will consider this problem further in subsequent studies.

Acknowledgements
The authors are grateful to the anonymous referee for the very important comments and suggestions that helped us improve the quality of the paper. This work is partly supported by the National Natural Science Foundation of China (Nos. 61572532, 61876195) and the Natural Science Foundation of Guangdong Province of China (No. 2017B030311011).

References

1. A. Aspuru-Guzik, A.D. Dutoi, P. J. Love and M. Head-Gordon (2005), Simulated quantum computation of
molecular energies, Science, Vol. 309(5741), pp. 1704–1707.
2. G. Rosenberg, P. Haghnegahdar, P. Goddard, P. Carr, K. Wu and M. L. De Prado (2016), Solving the optimal
trading trajectory problem using a quantum annealer, IEEE J. Sel. Top. Signal Process., Vol. 10(6), pp. 1053–
1060.
3. P. W. Shor (1994), Algorithms for quantum computation: discrete logarithms and factoring, in: Proceedings
of the 35th Annual Symposium on Foundations of Computer Science, pp. 124–134.
4. L. K. Grover (1996), A fast quantum mechanical algorithm for database search, in: Proceedings of the
twenty-eighth annual ACM symposium on Theory of computing, pp. 212–219.
5. A. W. Harrow, A. Hassidim and S. Lloyd (2009), Quantum algorithm for linear systems of equations, Phys.
Rev. Lett., Vol. 103(15), pp. 150502.
6. A. Montanaro (2016), Quantum algorithms: an overview, npj Quantum Inf., Vol. 2, pp. 15023.
7. J. Avron, O. Casper and I. Rozen (2021), Quantum advantage and noise reduction in distributed quantum
computing, Phys. Rev. A, Vol. 104(5), pp. 052404.
8. R. Beals, S. Brierley, O. Gray, A. W. Harrow, S. Kutin, N. Linden, D. Shepherd and M. Stather (2013),
Efficient distributed quantum computing, Proc. Math. Phys. Eng. Sci., Vol. 469(2153), pp. 20120686.
9. K. Li, D. Qiu, L. Li, S. Zheng and Z. Rong (2017), Application of distributed semi-quantum computing
model in phase estimation, Inform. Process. Lett., Vol. 120, pp. 23–29.
10. A. Yimsiriwattana and S.J. Lomonaco (2004), Distributed quantum computing: a distributed Shor algorithm, Quantum Inf. Comput. II, Vol. 5436, pp. 360–372.
11. C. Gidney and M. Ekerå (2021), How to factor 2048 bit RSA integers in 8 hours using 20 million noisy qubits, Quantum, Vol. 5, pp. 433.

12. S. Beauregard (2003), Circuit for Shor’s algorithm using 2n + 3 qubits, Quantum Inf. Comput., Vol. 3(2),
pp. 175–185.
13. T. Häner, M. Roetteler and K. M. Svore (2017), Factoring using 2n + 2 qubits with Toffoli based modular multiplication, Quantum Inf. Comput., Vol. 17(7-8), pp. 673–684.
14. S. Parker and M. B. Plenio (2000), Efficient factorization with a single pure qubit and log N mixed qubits,
Phys. Rev. Lett., Vol. 85(14), pp. 3049–3052.
15. M. A. Nielsen and I. L. Chuang (2000), Quantum Computation and Quantum Information, Cambridge
University Press, Cambridge.
16. C. Bennett, G. Brassard, C. Crépeau, R. Jozsa, A. Peres and W. K. Wootters (1993), Teleporting an unknown
quantum state via dual classical and Einstein-Podolsky-Rosen channels, Phys. Rev. Lett., Vol. 70(13), pp.
1895–1899.
17. N. Gisin and R. Thew (2007), Quantum communication, Nat. Photonics, Vol. 1(3), pp. 165–171.
18. P. W. Shor (1999), Polynomial-time algorithms for prime factorization and discrete logarithms on a quantum computer, SIAM Rev., Vol. 41(2), pp. 303–332.

Appendix A: An example
We give an example to show the procedure of our distributed order-finding algorithm. For convenience, we omit some details and modify some parameters.
Consider the 10-bit composite number $N = 2^{10} - 1 = 1023$ and the integer $a = 2$. The order $r$ of $a$ modulo $N$ is 10 (i.e. $r = 10$). In this example, the purpose of the quantum part of the order-finding algorithm is to output an estimation of $s/r$ for some $s$ with error no larger than $2^{-21}$. In addition, Alice is to estimate the first 6 bits of $s/r$ and Bob is to estimate the 5th to the 21st bit of $s/r$ (17 bits for Bob). The procedure of our distributed Shor's algorithm is as follows:

(1) Alice initializes her qubits as $|0\rangle_A |1\rangle_C$ and Bob initializes his qubits as $|0\rangle_B$.
Registers A, B, and C have 6, 17, and 10 qubits, respectively.

(2) Alice applies “partial” order-finding algorithm.


Alice's qubits become $\sum_{s=0}^{r-1} |\widetilde{s/r}\rangle_A |u_s\rangle_C$, where $\widetilde{s/r}$ is an estimation of the first 6 bits of $s/r$ for some $s$.

(3) Teleport the qubits of register C to Bob.


At this time, the global quantum state is $\sum_{s=0}^{r-1} |\widetilde{s/r}\rangle_A |0\rangle_B |u_s\rangle_C$. Also, Bob owns the qubits of registers B and C.

(4) Bob applies “partial” order-finding algorithm.


The global quantum state becomes
$$\sum_{s=0}^{r-1} |\widetilde{s/r}\rangle_A |\widetilde{s/r}'\rangle_B |u_s\rangle_C,$$
where $\widetilde{s/r}'$ is an estimation of the 5th to the 21st bit of $s/r$.

(5) Alice measures register A and Bob measures register B.


In this step, Alice and Bob will obtain an estimation of partial bits of some s/r, respectively.
Suppose s = 7 (we should remember that in the actual algorithm process we do not know s and
r). It is worth mentioning that order-finding algorithm can be regarded as a phase estimation

algorithm. Note that $\frac{s}{r} = \frac{7}{10} = (0.1011\,0011\,0011\cdots)_2$, where $(0.1011\,0011\,0011\cdots)_2$ indicates that $0.1011\,0011\,0011\cdots$ is a binary representation. We can see that Alice's measurement is most likely to be
1011 01,
since 0.1011 01 is the nearest 6-bit binary fraction to $\frac{7}{10} = (0.1011\,0011\,0011\cdots)_2$. Similarly,
we know that Bob’s measurement is most likely to be

0011 0011 0011 0011 0


(17 bits), since the bits of $\frac{7}{10}$ from the 5th bit onward (the 5th bit included) are

0.0011 0011 0011 0011 0011 · · · .

Fig. A.1 shows the relationship between these estimations.

Fig. A.1. The relationship of estimations
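The most likely measurement outcomes in step (5) can be reproduced classically. The sketch below assumes, as in standard phase estimation, that an $n$-qubit register most likely returns the $n$-bit fraction nearest to the phase, and that Bob's register sees the phase $2^4 \cdot \frac{s}{r} \bmod 1$ (the bits from the 5th onward). The helper name `nearest_bits` is hypothetical.

```python
from fractions import Fraction

def nearest_bits(x: Fraction, n: int) -> str:
    """Return the n-bit string k such that k / 2**n is nearest to x mod 1."""
    k = round((x % 1) * 2**n) % 2**n
    return format(k, f"0{n}b")

phase = Fraction(7, 10)                # s/r with s = 7, r = 10

alice = nearest_bits(phase, 6)         # first 6 bits of s/r
bob = nearest_bits(phase * 2**4, 17)   # 5th to 21st bit of s/r

print(alice)  # 101101
print(bob)    # 00110011001100110
```

Exact rational arithmetic with `Fraction` avoids floating-point rounding when the phase is scaled by powers of two.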

(6) Apply steps 1 and 2 of the CorrectResults subroutine to correct Alice's measurement.
Alice's estimate of the 5th and the 6th bits of $\frac{7}{10}$ is 01. Bob's estimate of the 5th and the 6th bits of $\frac{7}{10}$ is 00. Since $((01)_2 - 1) \bmod 2^2 = (00)_2$, we get that the CorrectionBit is $-1$. We use the CorrectionBit to correct the result of Alice, and it becomes $((1011\,01)_2 - 1) \bmod 2^6 = (1011\,00)_2$. Fig. A.2 shows the relationship between these results (the overall result is obtained in the next step).

(7) By directly concatenating the first 6 bits of Alice's result and the bits after the 2nd bit of Bob's result, we obtain the overall estimation 0.1011 0011 0011 0011 0011 0 (21 bits). It satisfies
$$\left|(0.1011\,0011\,0011\,0011\,0011\,0)_2 - \frac{7}{10}\right| < \frac{1}{2^{21}}. \tag{A.1}$$
It means that we output an estimation with error no larger than $2^{-21}$, and thus our algorithm works for this example.
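Steps (6) and (7) can be sketched in a few lines. This is an illustration inferred from the worked example only, not the CorrectResults subroutine as specified in the paper: it assumes the correction is one of $-1$, $0$, $1$ and is chosen so that Alice's last two bits agree with Bob's first two bits modulo $2^2$. The function name `correct_and_concatenate` is hypothetical.

```python
from fractions import Fraction

def correct_and_concatenate(alice: str, bob: str, overlap: int = 2) -> str:
    """Align Alice's overlapping bits with Bob's, then join the two results."""
    a = int(alice, 2)
    a_low = a % 2**overlap              # Alice's last `overlap` bits
    b_high = int(bob[:overlap], 2)      # Bob's first `overlap` bits
    # Assumed rule: pick the correction in {0, -1, 1} that makes them agree.
    # next() raises StopIteration if no such correction exists.
    correction = next(
        c for c in (0, -1, 1) if (a_low + c) % 2**overlap == b_high
    )
    a_fixed = (a + correction) % 2**len(alice)
    # Keep all of Alice's corrected bits and drop Bob's overlapping bits.
    return format(a_fixed, f"0{len(alice)}b") + bob[overlap:]

estimate = correct_and_concatenate("101101", "00110011001100110")
value = Fraction(int(estimate, 2), 2**len(estimate))
error = abs(value - Fraction(7, 10))

print(estimate)             # 101100110011001100110
print(error < Fraction(1, 2**21))  # True
```

Here the correction is $-1$, Alice's result becomes 1011 00, and the 21-bit estimate satisfies the bound in Eq. (A.1).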

(8) Apply continued fractions algorithm.


Although this step is not considered in our paper, for completeness, we still show this step.

Fig. A.2. The relationship of results

Note that our estimation of $\frac{s}{r}$ is $(0.1011\,0011\,0011\,0011\,0011\,0)_2 = \frac{734003}{1048576}$. After applying the continued fractions algorithm, we have
$$\frac{734003}{1048576} = 0 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{52428 + \cfrac{1}{2}}}}}. \tag{A.2}$$
Hence we know that
$$0 + \cfrac{1}{1 + \cfrac{1}{2 + \cfrac{1}{3}}} = \frac{7}{10} \tag{A.3}$$
is the closest to $(0.1011\,0011\,0011\,0011\,0011\,0)_2$ of all numbers of the form $\frac{p}{q}$ where $\gcd(p, q) = 1$ and $q < N = 1023$. Thus, we get that $\frac{s}{r} = \frac{7}{10}$ for some $s$ and obtain that 10 is a factor of $r$. If we repeat steps (1)-(8) several times, we are likely to get all prime factors of the order $r$, and finally conclude that $r = 10$.
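The continued fraction expansion of step (8) can be reproduced with exact rational arithmetic. The sketch below is a generic textbook implementation, not code from the paper; the function names `continued_fraction` and `convergents` are hypothetical. It expands $\frac{734003}{1048576}$ and picks the last convergent whose denominator is below $N = 1023$.

```python
from fractions import Fraction

def continued_fraction(x: Fraction) -> list:
    """Return the partial quotients [a0, a1, ...] of a non-negative rational x."""
    terms = []
    while True:
        a = x.numerator // x.denominator
        terms.append(a)
        x -= a
        if x == 0:
            return terms
        x = 1 / x

def convergents(terms: list) -> list:
    """Return the convergents p_k / q_k via the standard recurrence."""
    h_prev, h = 1, terms[0]   # numerators
    k_prev, k = 0, 1          # denominators
    result = [Fraction(h, k)]
    for a in terms[1:]:
        h_prev, h = h, a * h + h_prev
        k_prev, k = k, a * k + k_prev
        result.append(Fraction(h, k))
    return result

N = 1023
estimate = Fraction(734003, 1048576)
terms = continued_fraction(estimate)
# Last convergent with denominator below N.
best = [c for c in convergents(terms) if c.denominator < N][-1]

print(terms)  # [0, 1, 2, 3, 52428, 2]
print(best)   # 7/10
```

The denominator of `best` is the candidate 10 from which the order $r$ is recovered.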

We have obtained the order $r$ by means of our distributed order-finding algorithm. Let us continue with the rest of Shor's algorithm. Since $r = 10$ is even, we compute $\gcd(2^{r/2} + 1, N)$ and $\gcd(2^{r/2} - 1, N)$. We get that
$$\gcd(2^{r/2} + 1, N) = \gcd(33, 1023) = 33 \tag{A.4}$$
and
$$\gcd(2^{r/2} - 1, N) = \gcd(31, 1023) = 31. \tag{A.5}$$
Finally, we have $N = 1023 = 33 \times 31$, and we can continue to try to factor 33 and 31 similarly if necessary.
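The classical post-processing above is easy to verify. In this minimal sketch the order is found by brute force, standing in for the quantum order-finding part, and the function name `multiplicative_order` is hypothetical.

```python
from math import gcd

def multiplicative_order(a: int, n: int) -> int:
    """Smallest r >= 1 with a**r = 1 (mod n); brute force, requires gcd(a, n) = 1."""
    if gcd(a, n) != 1:
        raise ValueError("a and n must be coprime")
    r, x = 1, a % n
    while x != 1:
        x = x * a % n
        r += 1
    return r

N, a = 1023, 2
r = multiplicative_order(a, N)     # classical stand-in for order finding

# r is even here, so both gcds of Eqs. (A.4) and (A.5) can be computed.
f1 = gcd(a ** (r // 2) + 1, N)
f2 = gcd(a ** (r // 2) - 1, N)

print(r, f1, f2)  # 10 33 31
```

The product of the two values is 1023, matching the factorization stated above.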
