Parallel Analysis of an Improved RSA Algorithm
Abstract—This paper aims at speeding up RSA decryption. The BS1PRSA (Batch RSA-S1 Multi-Power RSA) algorithm improves the performance of RSA decryption by combining the load transferring technique and the multi-prime technique in the Batch RSA algorithm [2]. The parallelism of the BS1PRSA algorithm is analyzed in the paper to improve the performance of RSA decryption, and the algorithm can be efficiently implemented in parallel on multi-core devices.

Keywords- RSA; BS1PRSA; decryption; improved; parallel; multi-core

I. INTRODUCTION
RSA [1] is one of the algorithms used in public-key cryptography and can be used for encryption, signature, and key exchange purposes. The main operation of RSA is modular exponentiation. When RSA decrypts a ciphertext or generates a signature, more computation capacity and time are required. RSA has a very high computational cost, in contrast to the many fast private-key systems available, so several methods have been proposed to speed up RSA decryption. There are several avenues for speeding up RSA: improved RSA algorithms, faster clock rates, special-purpose hardware, and parallel computers and algorithms.
The paper focuses on the last of these options, parallel computing, since it seems to provide the greatest potential for speedup over the long term. It presents a novel algorithm for modular exponentiation which is particularly well suited to parallel machines and describes an implementation of it.

II. THE NEW PROPOSED VARIANT
In this section, a new variant of RSA is proposed, called BS1PRSA (Batch RSA-S1 Multi-Power Improved RSA) in the paper. The variant combines Multi-Power RSA [3-4] and the RSA-S1 system [5-7] on the basis of Batch RSA. It can obtain a higher speedup than Batch RSA and the above two RSA variants. Before the proposals for optimizing the RSA cryptosystem [1] are presented, the basic RSA algorithms will be reviewed.

A. BS1PRSA algorithm
In this subsection, BS1PRSA is proposed; it is based on the RSA-S1 system. It can obtain a higher decryption speedup than the original Batch RSA. BS1PRSA is described as Batch RSA with four phases: Setup, Percolate-Up, Exponentiation-Phase and Percolate-Down [9-13].
Setup: The input is a security parameter n and the additional parameters k, c and b, where b is the batch size.
• Compute two distinct primes p, q, each $\lfloor n/3 \rfloor$ bits in length, and generate $N = p^2 q$ and $\varphi(N) = (p-1)(q-1)$.
• Let $e_1, \ldots, e_b$ be b different encryption exponents, relatively prime to $\varphi(N)$ and to each other. The public exponents $e_i$ should be very small; otherwise, the extra arithmetic required is too expensive. For each $e_i$, $1 \le i \le b$, compute $d_{e_i} = e_i^{-1} \bmod \varphi(N)$ and $E = \prod_{i=1}^{b} e_i \pmod{N}$. Compute the private exponent $d = d_{e_1} \times d_{e_2} \times \cdots \times d_{e_b} \bmod \varphi(N)$. Compute $\mathit{e\_inv\_p} = e^{-1} \bmod p$ and $\mathit{p^2\_inv\_q} = (p^2)^{-1} \bmod q$. Represent the private exponent d as $d = f_1 d_1 + \cdots + f_k d_k \bmod \varphi(N)$, where the $d_i$'s and $f_i$'s, $1 \le i \le k$, are random vector elements of c and |n| bits, respectively. The choice and security of the parameters k and c are discussed later. In the paper, k is chosen as 2, so the representation of d becomes $d = f_1 d_1 + f_2 d_2 \bmod \varphi(N)$, which is stored for use in the Exponentiation-Phase.
• The private key is $\langle N, d_1, d_2 \rangle$ and the public key is $\langle N, e_i, f_1, f_2, \mathit{e\_inv\_p}, \mathit{p^2\_inv\_q} \rangle$ for each encryption, for $1 \le i \le b$.
• Given messages $m_1, \ldots, m_b$, $v_i$ is computed by $v_i = m_i^{e_i} \pmod{N}$, for $1 \le i \le b$.
Percolate-Up: The private exponent d is divided into the $d_1$, $d_2$, $f_1$ and $f_2$ vectors, for k = 2, and the $f_1$ and $f_2$ vectors are computed in this phase [2]. So, the original goal of getting $V = \prod_{i=1}^{b} V_i^{E/e_i} \pmod{N}$ in Percolate-Up
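The exponent representation $d = f_1 d_1 + f_2 d_2 \bmod \varphi(N)$ from the Setup phase can be illustrated with a minimal Python sketch. It is not the authors' implementation. The helper name `split_exponent`, the toy primes, and the choice to solve for $f_2$ after drawing $d_1$, $d_2$ and $f_1$ at random are all assumptions made for illustration. To keep Euler's theorem directly applicable, the sketch also uses a plain two-prime modulus N = pq rather than the paper's $N = p^2 q$, and it omits batching. It requires Python 3.8 or later for the modular inverse via `pow`.

```python
import math
import secrets


def split_exponent(d, phi, c_bits):
    """Return (f1, f2, d1, d2) with d == (f1*d1 + f2*d2) % phi.

    d1 and d2 are short (at most c_bits bits); f1 and f2 are full-size.
    Assumption: d2 is drawn coprime to phi so that f2 can be solved for.
    """
    d1 = secrets.randbits(c_bits) | 1          # short, odd, non-zero
    while True:
        d2 = secrets.randbits(c_bits) | 1
        if math.gcd(d2, phi) == 1:
            break
    f1 = secrets.randbelow(phi)                # full-size random value
    f2 = (d - f1 * d1) * pow(d2, -1, phi) % phi
    return f1, f2, d1, d2


# Toy parameters (hypothetical, far too small for real use).
p, q = 1009, 1013
N = p * q                    # simplification: two primes instead of p^2 * q
phi = (p - 1) * (q - 1)
e = 5
d = pow(e, -1, phi)          # ordinary RSA private exponent

f1, f2, d1, d2 = split_exponent(d, phi, c_bits=8)

m = 4242
V = pow(m, e, N)             # ciphertext

# Load transferring: the long exponentiations use the public f1, f2 ...
helper_1 = pow(V, f1, N)
helper_2 = pow(V, f2, N)
# ... and the key holder only performs the short exponentiations d1, d2.
recovered = pow(helper_1, d1, N) * pow(helper_2, d2, N) % N
```

In this sketch the key holder performs only two exponentiations with 8-bit exponents, which is the cost reduction the load transferring technique is aiming for. The sizes of c and k that would be acceptable in practice are a security question the sketch does not address.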
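Because the k exponentiations $V^{f_i}$ do not depend on each other, and neither do the k exponentiations $(V^{f_i})^{d_i}$, each group can be treated as a set of independent tasks. The following Python sketch shows only this task decomposition. The function names `modexp_tasks` and `product_mod`, the toy modulus, and the exponent values are hypothetical, and a two-prime modulus is again assumed instead of $N = p^2 q$. Python threads are used for brevity and do not give a real speedup for this CPU-bound work; a C implementation with OpenMP, as the paper suggests, would be the realistic choice.

```python
from concurrent.futures import ThreadPoolExecutor


def modexp_tasks(bases, exponents, modulus, workers=4):
    """Compute bases[i] ** exponents[i] mod modulus as independent tasks."""
    def one(pair):
        base, exponent = pair
        return pow(base, exponent, modulus)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(one, zip(bases, exponents)))


def product_mod(values, modulus):
    """Multiply the partial results together modulo `modulus`."""
    acc = 1
    for value in values:
        acc = acc * value % modulus
    return acc


# Toy parameters (hypothetical, far too small for real use).
N = 1009 * 1013
phi = 1008 * 1012
e = 5
d = pow(e, -1, phi)

# k = 3 short exponents d_i; the last f_i is solved so that
# sum(f_i * d_i) == d (mod phi). This needs gcd(d_parts[-1], phi) == 1.
d_parts = [3, 11, 5]
f_parts = [123456, 654321]
partial = sum(f * di for f, di in zip(f_parts, d_parts))
f_parts.append((d - partial) * pow(d_parts[-1], -1, phi) % phi)

V = pow(4242, e, N)                              # ciphertext
stage_1 = modexp_tasks([V] * 3, f_parts, N)      # V^{f_i}, independent
stage_2 = modexp_tasks(stage_1, d_parts, N)      # (V^{f_i})^{d_i}, independent
m = product_mod(stage_2, N)                      # combine: equals V^d mod N
```

The only sequential step is the final product, which costs k − 1 modular multiplications.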
are many opportunities for exploitable concurrency in the decryption of BS1PRSA. Thus, based on the analysis of the data decomposition and task decomposition, the variant can be efficiently implemented in parallel. Figure 3 also shows that BS1PRSA decryption can be parallelized.

[Figure 2: $V = V_1 \times V_2 \times V_3 \times V_4$; for each public exponent $f_j$, $j = 1, \ldots, k$, $V^{f_j} = (V_1)^{f_j} \times (V_2)^{f_j} \times (V_3)^{f_j} \times (V_4)^{f_j}$, with the k exponentiations computed in parallel.]
Fig.2 Encryption parallelism for four encryption clients and k public exponentiations

The big modular exponentiation of standard RSA is decomposed into many smaller modular exponentiations in the variant. The encryption and decryption of the variant can be broken into multiple tasks that can execute simultaneously [12]. For the parallel implementation of the variant, multi-core computers can be chosen as the parallel hardware platform, and the OpenSSL [15] cryptographic library and OpenMP [14] can be chosen as the software development platform for the parallel programs.

[Figure 3: Batch RSA decryption with $d = f_1 d_1 + \cdots + f_k d_k \bmod \varphi(N)$. For each $i = 1, \ldots, k$, $V^{f_i}$ is raised to $d_i$ to give $(V^{f_i})^{d_i} \pmod{N}$, with the k exponentiations computed in parallel; the results are combined into $m = V^d \bmod N$.]
Fig.3 Decryption parallelism for the Batch RSA decryption server and client, with k public exponentiations and a batch size of four

IV. CONCLUSION
In this paper, BS1PRSA, which can improve the performance of decryption, was analyzed. The variant obtains high performance by transferring decryption computations to encryption and by reducing the modulus and private exponents. Many multi-core devices are now being introduced into the market, and the multi-core processor architecture can improve processor performance through parallel work. BS1PRSA can be easily implemented in parallel and can get a higher speedup on current multi-core devices. The parallel analysis shows that BS1PRSA has obvious parallel features and is easy to implement efficiently in parallel. BS1PRSA can execute in parallel on multi-core devices, making full use of the parallel processing capabilities of these devices to further improve the performance of Batch RSA.

REFERENCES
[1] R. Rivest, A. Shamir, L. Adleman, "A Method for Obtaining Digital Signatures and Public-key Cryptosystems," Communications of the ACM, 1978, 21(2): 120-126.
[2] Guang Zhao, Henbo Li, "An Efficient Variant of the Batch RSA Cryptosystem," Proc. of the 2nd International Conference on Network Engineering and Computer Science, 2011.
[3] A. Fiat, "Batch RSA," Proc. of Crypto '89, LNCS 435, 1989. Berlin: Springer-Verlag, 1989: 175-185.
[4] D. Boneh, H. Shacham, "Fast Variants of RSA," RSA Laboratories Cryptobytes, 2002, 5(1): 1-8.
[5] T. Takagi, "Fast RSA-type cryptosystem modulo p^k q," in H. Krawczyk, editor, CRYPTO, volume 1462 of Lecture Notes in Computer Science, pages 318-326. Springer, 1998.
[6] T. Matsumoto, K. Kato, "Speeding up secret computations with insecure auxiliary device," Proc. of the 8th Annual International Crypto Conference on Advances in Cryptology. London: Springer-Verlag, 1988.
[7] C. Castelluccia, E. Mykletun, and G. Tsudik, "Improving secure server performance by re-balancing SSL/TLS handshakes," Proc. of the 2006 ACM Symposium on Information, Computer and Communications Security. New York: ACM, 2006: 26-34.
[8] J-J. Quisquater and C. Couvreur, "Fast decipherment algorithm for RSA public-key cryptosystem," Electronics Letters, vol. 18: 905-907, 1982.
[9] Yunfei Li, Qing Liu, Tong Li, "Design and Implementation of an Improved RSA Algorithm," EDT 2010, 2010: 390-393.
[10] Yunfei Li, Qing Liu, Tong Li, "Two efficient methods to speed up the batch RSA decryption," in IWACI 2010, 2010: 469-473.
[11] Yunfei Li, Qing Liu, Tong Li, "Design and implementation of two improved batch RSA algorithms," in ICCSIT 2010, 2010(4): 156-160.
[12] Qing Liu, Yunfei Li, Tong Li, Lin Hao, "The Research of the Batch RSA Decryption Performance," Journal of Computational Information Systems, 2011, 7(3): 948-955.
[13] Yunfei Li, Qing Liu, Tong Li, "Efficient variant of RSA cryptosystem," Journal of Computer Applications, 30(9): 255-293, 2010.
[14] Y G. Timothy, A. Beverly, "Patterns for Parallel Programming," Addison-Wesley Professional, 2005.11.
[15] J. Viega, M. Messier and P. Chandra, "Network Security with OpenSSL," O'Reilly, 2002.