Hardware Design of Cryptographic Accelerator
Slovakia
Abstract: The goal of this paper is to present a hardware computing unit that implements a cryptographic accelerator. After an analysis of the assigned problem, the following chapters cover the available solutions and their comparison, the issues of cryptography and computer security, the FPGA device on which the designed solution is verified and tested, and the RSA cryptographic algorithm. The synthesis chapters describe the solution as designed and implemented in lower-level and higher-level programming languages. The evaluation and conclusion chapter presents measurements at the software and hardware level and compares them.

I. INTRODUCTION

Cryptography is based on one-way mathematical functions. Their characteristic feature is that they allow fairly fast encryption of the input while guaranteeing that decrypting the message without the key is computationally demanding (and, under certain conditions, infeasible). Well-known cryptographic algorithms are DES, 3DES, IDEA, and AES, as well as Diffie-Hellman, ElGamal, and RSA (named after the initials of Rivest, Shamir, and Adleman) [1] [2].

The first four algorithms belong to the group of symmetric ciphers. The principle of symmetric encryption is that the sender and the recipient of the message share a secret key, with which the sender encrypts the message and the recipient decrypts it. Clearly, before the first exchange of messages the partners must agree on a common secret key, either at a personal meeting or by sending it through a trusted secure channel. The DES (Data Encryption Standard) algorithm was patented in December 1976 in the US. Although the DES cipher was broken in 1993 [3], applications of this algorithm can still be found. An advantage of this algorithm is that it lends itself to hardware implementation, as presented in [4], [5].

The second group of algorithms mentioned above are the asymmetric algorithms. The RSA algorithm is one of the most widespread public-key algorithms. It is named after its authors, Ron Rivest, Adi Shamir, and Leonard Adleman, who created it in 1977. The RSA algorithm can be used for both encryption and digital signatures. RSA starts from two large, randomly generated prime numbers, from which the public key is derived during the calculation process. The private key is then calculated from these values by a mathematical operation. RSA encryption with a key shorter than 1024 bits is no longer considered safe [6].

Implementations of the RSA algorithm on FPGA devices are also widely applied, as demonstrated by applications for e-passports [7], hardware implementations of this algorithm on development boards according to [8] and [9], or the implementation of the RSA algorithm using the Chinese remainder theorem on a parallel architecture [10]. This article discusses the implementation of the RSA algorithm in an FPGA chip and thus contributes to widening the application possibilities of this algorithm [11].

The following chapters describe the hardware implementation of the RSA algorithm. The second chapter presents the RSA algorithm and the FPGA device. The next three chapters deal with the design of the cryptographic accelerator. The achieved results and their comparison with the results of similar projects are given in the final evaluation.

II. RSA ALGORITHM

The RSA algorithm was published in 1978 by three authors. Ron Rivest, Adi Shamir, and Leonard Adleman designed it as an asymmetric cipher that is based on Euler's theorem and is usable both for encrypting messages and for signing documents.

Encryption proceeds as follows. Anyone wanting to send a private message M to another user encrypts (enciphers) the message with the recipient's public key before transmitting it. The message can then be sent over an unsecured network, such as the Internet. Anyone eavesdropping on the communication would see only the encrypted message. Because they would not know how to decrypt it, the message would make no sense to them. In this way, privacy can be ensured in electronic communication. The recipient uses their own private key to decrypt the message from the sender [1].

A. The process of creating public and private key

1) Two different large prime numbers p, q are randomly generated.

2) Calculate

$n = p \times q$ (1)

3) Calculate

$\varphi(n) = (p - 1)(q - 1)$ (2)

4) A random integer e is selected such that

$1 < e < \varphi(n)$, where $\gcd(e, \varphi(n)) = 1$ (3)

$C = M^{e} \pmod{n}$ (6)
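The key-creation steps (1) to (3) and the encryption rule (6) can be illustrated with a short Python sketch. This is a minimal illustration under stated assumptions, not the accelerator's implementation: the function names, the Miller-Rabin primality test, and the toy 64-bit modulus are all hypothetical choices made for readability. The private exponent d and the decryption rule $M = C^{d} \bmod n$ do not appear in the text above; they are the standard RSA definitions and are included only so that the example can be checked end to end.

```python
import math
import secrets


def is_probable_prime(n: int, rounds: int = 20) -> bool:
    """Miller-Rabin probabilistic primality test (assumed helper)."""
    if n < 2:
        return False
    for small in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29):
        if n % small == 0:
            return n == small
    # Write n - 1 as d * 2^r with d odd.
    d, r = n - 1, 0
    while d % 2 == 0:
        d //= 2
        r += 1
    for _ in range(rounds):
        a = secrets.randbelow(n - 3) + 2  # random base in [2, n - 2]
        x = pow(a, d, n)
        if x in (1, n - 1):
            continue
        for _ in range(r - 1):
            x = pow(x, 2, n)
            if x == n - 1:
                break
        else:
            return False  # a is a witness that n is composite
    return True


def random_prime(bits: int) -> int:
    """Draw random odd numbers of the given bit length until one is prime."""
    while True:
        candidate = secrets.randbits(bits) | (1 << (bits - 1)) | 1
        if is_probable_prime(candidate):
            return candidate


def generate_keys(bits: int = 64):
    """Steps 1-4 of the key creation; 64 bits is a toy size, far too small for real use."""
    p = random_prime(bits // 2)               # step 1: two different primes
    q = random_prime(bits // 2)
    while q == p:
        q = random_prime(bits // 2)
    n = p * q                                 # equation (1)
    phi = (p - 1) * (q - 1)                   # equation (2)
    while True:                               # equation (3): 1 < e < phi, gcd(e, phi) = 1
        e = secrets.randbelow(phi - 3) + 2
        if math.gcd(e, phi) == 1:
            break
    d = pow(e, -1, phi)                       # standard RSA private exponent (Python 3.8+)
    return (e, n), (d, n)


def encrypt(m: int, public_key) -> int:
    e, n = public_key
    return pow(m, e, n)                       # equation (6): C = M^e mod n


def decrypt(c: int, private_key) -> int:
    d, n = private_key
    return pow(c, d, n)                       # standard RSA decryption: M = C^d mod n


if __name__ == "__main__":
    public_key, private_key = generate_keys(64)
    message = 123456789                       # must be smaller than n
    assert decrypt(encrypt(message, public_key), private_key) == message
```

The built-in three-argument `pow` performs the modular exponentiation that dominates the running time of RSA, which is the operation a hardware accelerator is meant to speed up.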
SAMI 2018 • IEEE 16th World Symposium on Applied Machine Intelligence and Informatics • February 7-10 • Košice, Herl’any, Slovakia
• CLB - Configurable Logic Blocks are the main logic resources of the chip. They allow both combinational and sequential logic circuits to be created. The 7-series configurable logic block (CLB) provides advanced, high-performance FPGA logic: real 6-input look-up table (LUT) technology, a dual LUT5 (5-input LUT) option, distributed memory and shift register logic capability, dedicated high-speed carry logic for arithmetic functions, and wide multiplexers for efficient utilization.
• Slice - each Kintex-7 CLB contains only SLICEL and SLICEM slices. A slice contains four LUTs (acting as function generators), eight flip-flops, multiplexers, and carry logic, and offers the ability to create distributed RAM or shift registers. However, only SLICEMs can use their LUTs as distributed RAM or SRLs.
• LUT - a Look-Up Table is a logical structure designed as a function generator. Its main task is to implement logical functions. The function generators can implement any arbitrarily defined x-input Boolean function (x-input look-up table).
• BRAM - the block RAM in Xilinx 7 series FPGAs stores up to 36 Kbits of data and can be configured as either two independent 18 Kb RAMs or one 36 Kb RAM [22].

IV. IMPLEMENTATION

The algorithm implementation is based on a soft processor design for the FPGA chip. A soft processor is also known as a softcore microprocessor. It is essentially a microprocessor core that can be implemented using logic synthesis. It can be implemented on devices containing programmable logic, such as CPLD or FPGA devices [23]. The key benefit of a soft processor is its configurability, as opposed to a hard processor such as a CPU, whose hardware architecture is fixed and cannot be changed [24]. The number of soft processors per FPGA chip is limited by the size of the FPGA chip. The basic IP block of the MicroBlaze soft processor is shown in the following figure [25].

Fig. 2 Soft processor MicroBlaze [11]

The properties of the MicroBlaze soft processor can be modified using the processor configuration settings [26]. The first option is the minimum-area configuration, which is characterized by the smallest possible MicroBlaze core, without cache memory and debug module (Figure 4). The maximum-performance configuration defines a large cache memory, a debug module, and a full computing unit (all execution units) (Figure 4). The maximum-frequency configuration is cache-free, has no debug module, and uses a reduced set of execution units (Figure 4). The Linux MMU configuration can also be chosen. The basic configuration defines a reasonable ratio between performance, occupied area, and working frequency. To work with the soft processor, additional components must be added to the design and linked to the processor as shown in Figure 3.

Fig. 3 Basic configuration

Fig. 4 Settings of the soft processor according to the selected preference

V. DESIGN OF CRYPTOGRAPHY ACCELERATOR

VHDL (Very High Speed Integrated Circuit Hardware Description Language) is a language used to describe hardware. It is used for designing and simulating digital integrated circuits and programmable logic such as CPLD devices and, in our case, FPGA devices. The basic building blocks of a VHDL description are the entity and the architecture. The entity defines the component's interface. The architecture can be written in three styles of description: structural, behavioral, and dataflow. The VHDL language allows the circuit to be described at the gate, RTL, and algorithmic level. The language is strongly typed and has the means to describe parallelism, connectivity, and the explicit expression of time. The VHDL language is used for simulation as well as for the description of integrated circuits [22].

The cryptographic accelerator modules were designed in the Vivado Design Suite - HLx Editions 2016.3 application, using both the block design option and VHDL and C descriptions of the individual modules.
• MicroBlaze - soft processor for communication between the user and the designed RSA module.
• Clocking Wizard - generates the synchronization signal for each module of the proposed system.
• Processor System Reset - resets the components of the designed system.
• AXI Interconnect - the role of this module is to link the components of the designed system
M. Hulič et al. • Hardware Design of Cryptographic Accelerator
TABLE IV.
AMOUNT OF HARDWARE UNITS USED

| Unit | VHDL | Verilog |
|---|---|---|
| Slice | 0 | 0 |
| FF | 2327 | 2326 |
| DSP | 20 | 20 |
| BRAM | 0 | 0 |

TABLE V.
TIME DEPENDENCY ON SELECTED MESSAGE AND KEY LENGTH

| Message length / Key length | 128 bit key | 256 bit key | 512 bit key | 1024 bit key |
|---|---|---|---|---|
| 32 bit | 48 527 ms | 501 394 ms | 500 202 ms | 1 000 404 ms |
| 64 bit | 49 710 ms | 501 654 ms | 500 441 ms | 999 689 ms |
| 128 bit | 501 156 ms | 501 156 ms | 498 533 ms | 1 512 527 ms |
| 256 bit | 499 725 ms | 1 000 643 ms | 2 002 239 ms | 3 502 131 ms |
| 512 bit | 2 002 001 ms | 2 501 488 ms | 2 502 441 ms | 4 003 048 ms |
| 1024 bit | 3 001 928 ms | 5 003 929 ms | 8 005 381 ms | 8 006 811 ms |

Fig. 7 Time dependency in ms on key length and selected architecture

TABLE VI.
CONCURRENT MEASUREMENTS

| Implementation | Test device | Frequency | CLB used |
|---|---|---|---|
| RSA FPGA ePass [7] | Xilinx VIRTEX-V | 36.3 MHz | 28,350 |
| RSA FPGA [8] | unavailable | 150 MHz | 5950 |
| RSA FPGA [9] | 3s100evq100-4 | 68.57 MHz | 2366 |
| RSA FPGA CRT parallel arch. [10] | Virtex-6 FPGA | 33 MHz | 236 |
| RSA_mul16s | Xilinx KC705 | 152.77 MHz | 2407 |
| RSA_LUT | Xilinx KC705 | 254.46 MHz | 6711 |
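The Chinese remainder theorem (CRT) approach referenced in Table VI can be illustrated with a brief Python sketch. It is a generic textbook formulation given as an assumption; the paper's actual hardware design is not reproduced here, and the function name `crt_decrypt` and the toy parameters are hypothetical. The idea is to replace one exponentiation modulo n with two smaller exponentiations modulo p and q, whose results are then recombined.

```python
def crt_decrypt(c: int, p: int, q: int, d: int) -> int:
    """RSA decryption via the Chinese remainder theorem (textbook form)."""
    d_p = d % (p - 1)            # reduced exponent for the modulus p
    d_q = d % (q - 1)            # reduced exponent for the modulus q
    q_inv = pow(q, -1, p)        # q^-1 mod p (Python 3.8+)
    m_p = pow(c, d_p, p)         # half-size exponentiation modulo p
    m_q = pow(c, d_q, q)         # half-size exponentiation modulo q
    h = (q_inv * (m_p - m_q)) % p
    return m_q + h * q           # recombined result, equal to c^d mod (p*q)


if __name__ == "__main__":
    # Toy textbook parameters, far too small for real use.
    p, q, e, d = 61, 53, 17, 2753
    n = p * q
    message = 65
    ciphertext = pow(message, e, n)
    assert crt_decrypt(ciphertext, p, q, d) == pow(ciphertext, d, n) == message
```

The two half-size exponentiations are independent of each other, which is why the CRT form maps naturally onto a parallel architecture.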
To compare the speed of RSA encryption on the FPGA chip with the speed of the same algorithm on a CPU, the algorithm was also implemented in the C language. The measured data show that the CRT-based implementation on the FPGA device fulfilled the assumption, i.e. the encryption process was accelerated several times in favor of the FPGA. The encryption rates of the RSA_LUT and RSA_mul16s modules were also compared.

The results show the advantage of the RSA look-up table architecture, which reaches a higher working frequency but uses more CLBs. This poses no problem for the tested architecture, because only 3 percent of the CLBs available on the board were needed.

ACKNOWLEDGMENT

This work was supported by the KEGA Agency of the Ministry of Education, Science, Research and Sport of the Slovak Republic under Grant No. 077TUKE-4/2015 "Promoting the interconnection of Computer and Software Engineering using the KPIkit" and Grant No. 003TUKE-4/2017 "Implementation of Modern Methods and Education Forms in the Area of Security of Information and Communication Technologies towards Requirements of Labour Market". This support is very gratefully acknowledged.

REFERENCES

[1] W. Stallings, "Cryptography and Network Security Principles and Practices," Prentice Hall, 2005. 0-17-187316-4
[2] M. Petrvalsky, M. Drutarovsky, M. Varchola, "Compact FPGA Hardware Platform For Power Analysis Attacks On Cryptographic Algorithms Implementations," Acta Electrotechnica et Informatica, vol. 16, no. 2 (2016), pp. 3-7. ISSN 1335-8243
[3] P. R. Wilson, "Design Recipes for FPGAs," Newnes (an imprint of Elsevier), Linacre House, Jordan Hill, Oxford OX2 8DP, 2007. ISBN 978-0-7506-6845-3
[4] G. Rouvroy, F. X. Standaert, J. J. Quisquater, J. D. Legat, "Efficient Uses of FPGAs for Implementations of DES and Its Experimental Linear Cryptanalysis," IEEE Transactions on Computers, vol. 52, no. 4, April 2003.
[5] P. Vishwanath, R. C. Joshi, A. K. Saxena, "FPGA Implementation of DES Using Pipelining Concept with Skew Core Key-Scheduling," Journal of Theoretical and Applied Information Technology, 2009.
[6] M. Prerna, A. Sachdeva, "A Study of Encryption Algorithms AES, DES and RSA for Security," Volume 13, Issue 15, Version 1.0, 2013. Double Blind Peer Reviewed International Research Journal. Global Journals Inc. (USA). Online ISSN: 0975-4172, Print ISSN: 0975-4350
[7] S. Khaled, H. Hussien, S. Yehia, "FPGA Implementation of RSA Encryption Algorithm for E-Passport Application," World Academy of Science, Engineering and Technology, International Journal of Computer, Electrical, Automation, Control and Information Engineering, vol. 8, no. 1, 2014.
[8] A. C. Shantilal, "A Faster Hardware Implementation of RSA Algorithm," Department of Electrical & Computer Engineering, Oregon State University, Corvallis, Oregon 97331, USA.
[9] A. H. Ansari, A. R. Landge, "RSA algorithm realization on FPGA," International Journal of Advanced Research in Computer Engineering & Technology (IJARCET), vol. 2, no. 7, July 2013.
[10] A. Shashank, "FPGA Implementation of RSA Encryption and Decryption using Parallel Architecture," Journal of Innovation in Electronics and Communication
[11] L. Vokorokos, A. Baláž, N. Ádám, "Events Planning In Intrusion Detection Systems," Acta Electrotechnica et Informatica, vol. 7, no. 4 (2007), pp. 82-86. ISSN 1335-8243
[12] E. Pietriková, S. Chodarev, "Towards Programmer Knowledge Profile Generation," Acta Electrotechnica et Informatica, vol. 16, no. 1 (2016), pp. 15-19. ISSN 1335-8243
[13] P. Feciľak, K. Kleinova, J. Janitor, "Qos in Network Traffic Management," Acta Electrotechnica et Informatica, vol. 10, no. 4 (2010), pp. 24-28. ISSN 1335-8243
[14] B. Schneier, "Applied Cryptography," 1996. ISBN 978-0471117094
[15] M. Nemec, M. Sys, P. Svenda, "The Return of Coppersmith's Attack: Practical Factorization of Widely Used RSA Moduli," 2017 ACM SIGSAC Conference.
[16] E. Chovancová, N. Ádám, A. Baláž, E. Pietriková, P. Feciľak, S. Šimoňák, M. Chovanec, "Securing distributed computer systems using an advanced sophisticated hybrid honeypot technology," Computing and Informatics, vol. 36, no. 1 (2017), pp. 113-139. ISSN 1335-9150
[17] L. Vokorokos, A. Baláž, B. Madoš, "Anomaly and Misuse Intrusions Variability Detection," Acta Electrotechnica et Informatica, vol. 2010, no. 4 (2010), pp. 5-9. ISSN 1335-8243
[18] L. Vokorokos, N. Ádám, A. Baláž, "Training Set Parallelism in PAHRA Architecture," Acta Electrotechnica et Informatica, vol. 7, no. 3 (2007), pp. 13-18. ISSN 1335-8243
[19] J. Hoffstein, J. Pipher, J. H. Silverman, "An Introduction to Mathematical Cryptography," 2008. 978-0387779935
[20] Xilinx Inc., "KC705 Evaluation Board for the Kintex-7 FPGA," 2016. Available web source: https://www.xilinx.com/support/documentation/boards_and_kits/kc705/ug810_KC705_Eval_Bd.pdf
[21] Xilinx, User Guide UG474 (v1.8), September 27, 2016, "7 Series FPGAs Configurable Logic Block." Available web source: https://www.xilinx.com/support/documentation/user_guides/ug474_7Series_CLB.pdf
[22] Xilinx, User Guide UG473 (v1.12), September 27, 2016, "7 Series FPGAs Memory Resources." Available web source: https://www.xilinx.com/support/documentation/user_guides/ug473_7Series_Memory_Resources.pdf
[23] L. Vokorokos, E. Chovancová, "Viacjadrová architektúra zameraná na akceleráciu výpočtov" [Multicore architecture focused on the acceleration of computations], Acta Informatica Pragensia, vol. 2, no. 1 (2013), pp. 79-90. ISSN 1805-4951.
[24] L. Vokorokos, B. Madoš, V. Ruska, "FPGA Hardware Acceleration For Visualization With Use Of The Ray Tracing Algorithm," Acta Electrotechnica et Informatica, vol. 14, no. 2 (2014), pp. 3-7. ISSN 1335-8243
[25] L. H. Crockett, R. A. Elliot, M. A. Enderwitz, R. W. Stewart, "The Zynq Book: Embedded Processing with the ARM Cortex-A9 on the Xilinx Zynq-7000 All Programmable SoC," First Edition, Strathclyde Academic Media, 2014.
[26] J. A. Peter, "The VHDL Cookbook," Dept. Computer Science, University of Adelaide, South Australia, 1990.