Reliable CRC-Based Error Detection Constructions For Finite Field
Abstract: Finite-field multiplication has received prominent attention in the literature with applications in cryptography and error-detecting codes. For many cryptographic algorithms, this arithmetic operation is a complex, costly, and time-consuming task that may require millions of gates. In this work, we propose efficient hardware architectures based on cyclic redundancy check (CRC) as error-detection schemes for postquantum cryptography (PQC) with case studies for the Luov cryptographic algorithm. Luov was submitted to the National Institute of Standards and Technology (NIST) PQC standardization competition and was advanced to the second round. The CRC polynomials selected are in line with the required error-detection capabilities and with the field sizes as well. We have developed verification codes through which software implementations of the proposed schemes are performed to verify the derivations of the formulations. Additionally, hardware implementations of the original multipliers with the proposed error-detection schemes are performed over a Xilinx field-programmable gate array (FPGA), verifying that the proposed schemes achieve high error coverage with acceptable overhead.

Index Terms: Cyclic redundancy check (CRC), fault detection, field-programmable gate array (FPGA), finite-field multiplication.

I. INTRODUCTION

Many modern, sensitive applications and systems use finite-field operations in their schemes, among which finite-field multiplication has received prominent attention. Finite-field multipliers perform multiplication modulo an irreducible polynomial used to define the finite field. For postquantum cryptography (PQC), the inputs can be very large, and the finite-field multipliers may require millions of logic gates. Therefore, it is a complex task to implement such architectures resilient to natural and malicious faults; consequently, research has focused on ways to eliminate errors and obtain more reliability with acceptable overhead [1]–[6]. Moreover, there has been previous work on countering fault attacks and providing reliability for PQC. Sarker et al. [7] used error-detection schemes of number theoretic transform (NTT) to detect both permanent and transient faults. Mozaffari-Kermani et al. [8] performed fault detection for stateless hash-based PQC signatures. Additionally, error-detection hash trees for stateless hash-based signatures are proposed in [9] to make such schemes more reliable against natural faults and help protect them against malicious faults. In [10], algorithm-oblivious constructions are proposed through recomputing with swapped ciphertext and additional authenticated blocks, which can be applied to the Galois counter mode (GCM) architectures using different finite-field multipliers in $GF(2^{128})$. Several countermeasures based on error-detection checksum codes and spatial/temporal redundancies for the NTRU encryption algorithm have been presented in [11].

Our proposed error-detection architectures are adapted to the Luov cryptographic algorithm [12]; however, they can be applied to different PQC algorithms that use finite-field multipliers. The Luov algorithm was submitted to the National Institute of Standards and Technology (NIST) standardization competition [13] and was advanced to the second round [14]. Cyclic redundancy check (CRC) error-detection schemes are applied in our proposed hardware constructions to make sure that they are overhead-aware with high error coverage. Our contributions in this brief are summarized as follows.

1) Error-detection schemes for the finite-field multipliers $GF(2^m)$ with $m > 1$ used in the Luov cryptographic algorithm are proposed. These error-detection architectures are based on CRC-5. Additionally, we explore and study both primitive and standardized generator polynomials for CRC-5, comparing their complexity.
2) We derive new formulations for the error-detection schemes of Luov's algorithm, performing software implementations for the sake of verification. We note that such derivation covers a wide range of applications and security levels. Nevertheless, the presented schemes are not confined to these case studies.
3) The proposed error-detection architectures are embedded into the original finite-field multipliers. We perform the implementations using Xilinx field-programmable gate array (FPGA) family Kintex Ultrascale+ for device xcku5p-ffvd900-1-i to confirm that the schemes are overhead-aware and that they provide high error coverage.

II. PRELIMINARIES

There are five popular PQC algorithm classes: code-based, hash-based, isogeny-based, lattice-based, and multivariate-quadratic-equation-based cryptosystems [15]. Code-based cryptography differs from others in that its security relies on the hardness of decoding in a linear error-correcting code. Hash-based cryptography creates signature algorithms based on the security of a selected cryptographic hash function. The security of isogeny-based cryptography is based on the hardness of finding an isogeny between two given supersingular elliptic curves. Lattice-based cryptography is capable of creating a public-key cryptosystem based on lattices. Lastly, the security of multivariate-quadratic-equation-based cryptography depends on the difficulty of solving a system of multivariate polynomials over a finite field. Such cryptographic schemes use large field sizes to provide the needed security levels.

Luov is a multivariate public key cryptosystem and an adaptation of the unbalanced oil and vinegar (UOV) signature scheme, with a restriction on the coefficients of the public key. The scheme uses two finite fields: the binary field of two elements, $\mathbb{F}_2$, and its extension of degree $m$, $\mathbb{F}_{2^m}$. The central map $\mathcal{F}: \mathbb{F}_{2^m}^{n} \to \mathbb{F}_{2^m}^{o}$ is a quadratic map, where $o$ and $v$ satisfy $n = o + v$, $\alpha_{i,j,k}$, $\beta_{i,k}$, and $\gamma_k$

Manuscript received May 13, 2020; revised August 8, 2020 and September 18, 2020; accepted October 11, 2020. Date of publication October 26, 2020; date of current version December 29, 2020. This work was supported by the U.S. National Science Foundation (NSF) under Award SaTC-1801488. (Corresponding author: Mehran Mozaffari-Kermani.)
Alvaro Cintas Canto and Mehran Mozaffari-Kermani are with the Department of Computer Science and Engineering, University of South Florida, Tampa, FL 33620 USA (e-mail: [email protected]; [email protected]).
Reza Azarderakhsh is with the Department of Computer and Electrical Engineering and Computer Science and I-SENSE, Florida Atlantic University, Boca Raton, FL 33431 USA (e-mail: [email protected]).
Digital Object Identifier 10.1109/TVLSI.2020.3031170
1063-8210 © 2020 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.
See https://www.ieee.org/publications/rights/index.html for more information.
Authorized licensed use limited to: Univ of Calif Santa Barbara. Downloaded on June 15,2021 at 08:28:40 UTC from IEEE Xplore. Restrictions apply.
IEEE TRANSACTIONS ON VERY LARGE SCALE INTEGRATION (VLSI) SYSTEMS, VOL. 29, NO. 1, JANUARY 2021 233
Fig. 1. Finite-field multiplier with the proposed error-detection schemes based on CRC.
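As a rough software analogue of the datapath in Fig. 1 (without the error-detection logic), the following Python sketch composes a $GF(2^{16})$ multiplication from the three module types the brief describes in Section III: sum, α, and pass-thru. Field elements are stored as integers whose bit $i$ is the coefficient of $x^i$. The function names are hypothetical, and the field polynomial $f(x) = x^{16} + x^{12} + x^3 + x + 1$ is the one quoted later in the text; this is a minimal sketch, not the authors' verification code.

```python
M = 16
# f(x) = x^16 + x^12 + x^3 + x + 1, stored as a bit mask (assumption: the
# field polynomial quoted in Section III of the text).
F_POLY = (1 << 16) | (1 << 12) | (1 << 3) | (1 << 1) | 1


def alpha_module(x: int) -> int:
    """Multiply a field element by alpha (i.e. by x) and reduce modulo f(x)."""
    x <<= 1
    if x >> M:              # a degree-16 term appeared: XOR with f(x) removes it
        x ^= F_POLY
    return x


def pass_thru_module(b: int, x: int) -> int:
    """Multiply a GF(2^m) element by a GF(2) element b (0 or 1)."""
    return x if b else 0


def sum_module(x: int, y: int) -> int:
    """Add two GF(2^m) elements, i.e. m two-input XOR gates."""
    return x ^ y


def gf_mul(a: int, b: int) -> int:
    """A*B mod f(x) = sum_i b_i * X^(i), with X^(0) = A, X^(i) = alpha * X^(i-1)."""
    x_i = a
    acc = pass_thru_module(b & 1, x_i)
    for i in range(1, M):
        x_i = alpha_module(x_i)                                     # m-1 alpha modules
        acc = sum_module(acc, pass_thru_module((b >> i) & 1, x_i))  # m-1 sums, m pass-thrus
    return acc


# x * x^15 = x^16, which reduces to x^12 + x^3 + x + 1 = 0x100B
assert gf_mul(0x0002, 0x8000) == 0x100B
```

The loop uses $m - 1$ α modules, $m - 1$ sum modules, and $m$ pass-thru modules, matching the module counts stated in the text.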
are chosen from the base field $\mathbb{F}_2$, and whose components $f_1, \ldots, f_o$ are of the form

$$f_k(x) = \sum_{i=1}^{v} \sum_{j=i}^{n} \alpha_{i,j,k}\, x_i x_j + \sum_{i=1}^{n} \beta_{i,k}\, x_i + \gamma_k.$$

These finite-field multiplications are very complex and require a large area footprint. Therefore, it is a complex task to implement such architectures resilient to natural and malicious faults. The aim of this work is to provide countermeasures against natural faults and fault injections for the finite-field multipliers used in cryptosystems such as the Luov algorithm as a case study, noting that the proposed error-detection schemes can be adapted to other applications and cryptographic algorithms whose building blocks need finite-field multiplications. Readers who are interested in more details about the Luov cryptographic algorithm are encouraged to refer to [12].

III. PROPOSED FAULT-DETECTION ARCHITECTURES

The multiplication of any two elements $A$ and $B$ of $GF(2^m)$, following the approach in [16], can be presented as

$$A \cdot B \bmod f(x) = \sum_{i=0}^{m-1} b_i \cdot \left((A\alpha^{i}) \bmod f(x)\right) = \sum_{i=0}^{m-1} b_i \cdot X^{(i)},$$

where the set of $\alpha^{i}$'s is the polynomial basis of element $A$, the $b_i$'s are the coefficients of $B$, $f(x)$ is the field polynomial, $X^{(i)} = \alpha \cdot X^{(i-1)} \bmod f(x)$, and $X^{(0)} = A$. To perform finite-field multiplication, three different modules are needed: sum, α, and pass-thru modules. The sum module adds two elements in $GF(2^m)$ using $m$ two-input XOR gates, the α module multiplies an element of $GF(2^m)$ by $\alpha$ and then reduces the result modulo $f(x)$, and lastly, the pass-thru module multiplies a $GF(2^m)$ element by a $GF(2)$ element. One finite-field multiplication uses a total of $m - 1$ sum modules, $m - 1$ α modules, and $m$ pass-thru modules to get the output. Fault injection can occur in any of these modules, and formulations for parity signatures in $GF(2^m)$ are derived in [16]. Parity signatures provide an error flag (EF) on each module. The major drawback of parity signatures is that their error coverage is approximately 50%, that is, if the number of faults is even, the approach would not be able to detect the faults. This highly predictable countermeasure can be circumvented by intelligent fault injection.

In this work, our aim is the derivation of error-detection schemes that provide a broader and higher error coverage than parity signatures and to explore the application of such schemes to the Luov algorithm. Thus, we derive and apply CRC signatures [17] to the finite-field multipliers used in the Luov algorithm. This would be a step forward toward detecting natural and malicious intelligent faults, especially, as discussed in this brief, when considering both primitive and standardized CRCs with different fault multiplicity coverage. CRC was first proposed in 1961 and it is based on the theory of cyclic error-correcting codes. To implement CRC, a generator polynomial $g(x)$ is required. The message serves as the dividend, the quotient is discarded, and the remainder is the result. In CRC, a fixed number of check bits are appended to the data and these check bits are inspected when the output is received to detect any errors.

The entire finite-field multiplier with our error-detection schemes is shown in Fig. 1, where actual CRC (ACRC) and predicted CRC (PCRC) stand for ACRC signatures and PCRC signatures, respectively. In Fig. 1, only one EF is shown for clarity; however, for CRC-5, which is the case study proposed in this brief, five EFs are computed on each module. In Fig. 2, the α module is shown in more depth to clarify how the proposed CRC signatures work in each finite-field multiplier.

The sum and pass-thru modules follow the same approach as the parity signatures described in [16]. For the sum module in CRC-1, $\hat{p}_X$ is equal to the sum of the parity bits of the input elements $A$ and $B$ in $GF(2^m)$, $\hat{p}_X = p_A + p_B$. Furthermore, for the pass-thru module in CRC-1, $\hat{p}_X = b \cdot p_A$, where $b$ is an element in $GF(2)$. For any other CRC-$n$ scheme, instead of summing all the bits, it checks $n$ bits at a time in the sum and pass-thru modules. For the α module, we have

$$A(x) \cdot x = a_{m-1} \cdot x^{m} + a_{m-2} \cdot x^{m-1} + \cdots + a_0 \cdot x \tag{1}$$

for which a set of derivations is needed to implement CRC-$n$ into it. In Table I, the generator polynomials used to derive the CRC-5 signatures are shown. The generator polynomial $g_0(x)$ is one of the standards used for radio frequency identification [18]. The other three generator polynomials $g_1(x)$, $g_2(x)$, and $g_3(x)$ are primitive polynomials. The benefit of using a primitive polynomial as the generator is that the resulting code has full total block length, which means that all 1-bit errors within that block length have separate
TABLE I
STANDARDIZED (STAND.) AND PRIMITIVE (PRIM.) GENERATOR POLYNOMIALS AND THEIR CORRESPONDING CRC SIGNATURES
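The CRC signatures that Table I associates with a generator polynomial are the remainders of the monomials $x^i$ modulo that generator. The Python sketch below tabulates them for $g_0(x) = x^5 + x^3 + 1$, the only generator spelled out in the text, and checks that the single-bit error positions it covers leave distinct nonzero remainders. The helper names are hypothetical, and polynomials over $GF(2)$ are stored as integer bit masks; this is an illustrative sketch rather than the authors' verification code.

```python
G0 = 0b101001  # g0(x) = x^5 + x^3 + 1 (generator quoted in the text)


def poly_mod(p: int, g: int) -> int:
    """Remainder of p(x) divided by g(x) over GF(2); bit i holds the x^i coefficient."""
    dg = g.bit_length() - 1
    while p.bit_length() - 1 >= dg:
        # Cancel the leading term of p with a shifted copy of g.
        p ^= g << (p.bit_length() - 1 - dg)
    return p


def signature_table(g: int, max_deg: int) -> dict:
    """Map each exponent i in 0..max_deg to the remainder of x^i modulo g(x)."""
    return {i: poly_mod(1 << i, g) for i in range(max_deg + 1)}


table = signature_table(G0, 15)
assert table[5] == 0b01001    # x^5  = x^3 + 1
assert table[7] == 0b01101    # x^7  = x^3 + x^2 + 1
assert table[15] == 0b00110   # x^15 = x^2 + x

# Single-bit errors at positions 0..15 all leave distinct, nonzero remainders,
# so the XOR of any two of them (a 2-bit error) is nonzero as well.
assert len(set(table.values())) == 16 and 0 not in table.values()
```

Passing a different bit mask as `g` gives the corresponding table for another CRC-5 generator.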
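remainders. Moreover, since the remainder is a linear function of the block, all 2-bit errors within that block length can be identified.

For the α module of Luov's finite-field multipliers, $g_0(x) = x^5 + x^3 + 1$ is used as the standardized generator polynomial for CRC-5. To find its CRC signatures, this fixed polynomial is used as follows:

$$
\begin{aligned}
x^{5} &\equiv x^{3} + 1 \bmod g_0(x)\\
x^{6} &\equiv x^{4} + x \bmod g_0(x)\\
x^{7} &\equiv x^{5} + x^{2} \equiv x^{3} + x^{2} + 1 \bmod g_0(x)\\
&\;\;\vdots\\
x^{15} &\equiv x^{2} + x \bmod g_0(x).
\end{aligned} \tag{2}
$$

According to (1), we obtain $A(x) \cdot x = a_{15} \cdot x^{16} + a_{14} \cdot x^{15} + \cdots + a_1 \cdot x^{2} + a_0 \cdot x$. Then, applying the irreducible polynomial $f(x) = x^{16} + x^{12} + x^{3} + x + 1$, one obtains

$$
\begin{aligned}
A(x) \cdot x \equiv{} & a_{15}x^{12} + a_{15}x^{3} + a_{15}x + a_{15} + a_{14}x^{15}\\
& + a_{13}x^{14} + a_{12}x^{13} + a_{11}x^{12} + a_{10}x^{11} + a_{9}x^{10}\\
& + a_{8}x^{9} + a_{7}x^{8} + a_{6}x^{7} + a_{5}x^{6} + a_{4}x^{5} + a_{3}x^{4}\\
& + a_{2}x^{3} + a_{1}x^{2} + a_{0}x \bmod f(x).
\end{aligned} \tag{3}
$$

To calculate the PCRC-5 for $GF(2^{16})$ in the α module ($\mathrm{PCRC5}_{16}$), the generator polynomial is applied as

$$
\begin{aligned}
A(x) \cdot x \equiv{} & a_{15}(x^{4} + x^{3} + x^{2} + x) + a_{15}x^{3} + a_{15}x + a_{15}\\
& + a_{14}(x^{2} + x) + a_{13}(x + 1) + a_{12}(x^{4} + x^{2} + 1)\\
& + a_{11}(x^{4} + x^{3} + x^{2} + x) + a_{10}(x^{3} + x^{2} + x + 1)\\
& + a_{9}(x^{4} + x + 1) + a_{8}(x^{4} + x^{3} + x^{2} + 1)\\
& + a_{7}(x^{4} + x^{3} + x) + a_{6}(x^{3} + x^{2} + 1) + a_{5}(x^{4} + x)\\
& + a_{4}(x^{3} + 1) + a_{3}x^{4} + a_{2}x^{3} + a_{1}x^{2} + a_{0}x \bmod g_0(x)
\end{aligned}
$$

or

$$
\begin{aligned}
\mathrm{PCRC5}_{16} ={} & (a_{15} + a_{12} + a_{11} + a_{9} + a_{8} + a_{7} + a_{5} + a_{3})x^{4}\\
& + (a_{11} + a_{10} + a_{8} + a_{7} + a_{6} + a_{4} + a_{2})x^{3}\\
& + (a_{15} + a_{14} + a_{12} + a_{11} + a_{10} + a_{8} + a_{6} + a_{1})x^{2}\\
& + (a_{14} + a_{13} + a_{11} + a_{10} + a_{9} + a_{7} + a_{5} + a_{0})x\\
& + (a_{15} + a_{13} + a_{12} + a_{10} + a_{9} + a_{8} + a_{6} + a_{4}).
\end{aligned} \tag{4}
$$

To calculate the ACRC-5 for $GF(2^{16})$ in the α module ($\mathrm{ACRC5}_{16}$), we rename the coefficients of (3): $a_{14}$ as $\gamma_{15}$, $\ldots$, $a_0$ as $\gamma_1$:

$$
\begin{aligned}
A(x) \cdot x \equiv{} & \gamma_{15}x^{15} + \gamma_{14}x^{14} + \gamma_{13}x^{13} + \gamma_{12}x^{12}\\
& + \gamma_{11}x^{11} + \gamma_{10}x^{10} + \gamma_{9}x^{9} + \gamma_{8}x^{8} + \gamma_{7}x^{7}\\
& + \gamma_{6}x^{6} + \gamma_{5}x^{5} + \gamma_{4}x^{4} + \gamma_{3}x^{3} + \gamma_{2}x^{2} + \gamma_{1}x\\
& + \gamma_{0} \bmod g_0(x)
\end{aligned} \tag{5}
$$

and the generator polynomial is applied as follows:

$$
\begin{aligned}
A(x) \cdot x \equiv{} & \gamma_{15}(x^{2} + x) + \gamma_{14}(x + 1) + \gamma_{13}(x^{4} + x^{2} + 1)\\
& + \gamma_{12}(x^{4} + x^{3} + x^{2} + x) + \gamma_{11}(x^{3} + x^{2} + x + 1)\\
& + \gamma_{10}(x^{4} + x + 1) + \gamma_{9}(x^{4} + x^{3} + x^{2} + 1)\\
& + \gamma_{8}(x^{4} + x^{3} + x) + \gamma_{7}(x^{3} + x^{2} + 1) + \gamma_{6}(x^{4} + x)\\
& + \gamma_{5}(x^{3} + 1) + \gamma_{4}x^{4} + \gamma_{3}x^{3} + \gamma_{2}x^{2}\\
& + \gamma_{1}x + \gamma_{0} \bmod g_0(x)
\end{aligned}
$$

or

$$
\begin{aligned}
\mathrm{ACRC5}_{16} ={} & (\gamma_{13} + \gamma_{12} + \gamma_{10} + \gamma_{9} + \gamma_{8} + \gamma_{6} + \gamma_{4})x^{4}\\
& + (\gamma_{12} + \gamma_{11} + \gamma_{9} + \gamma_{8} + \gamma_{7} + \gamma_{5} + \gamma_{3})x^{3}\\
& + (\gamma_{15} + \gamma_{13} + \gamma_{12} + \gamma_{11} + \gamma_{9} + \gamma_{7} + \gamma_{2})x^{2}\\
& + (\gamma_{15} + \gamma_{14} + \gamma_{12} + \gamma_{11} + \gamma_{10} + \gamma_{8} + \gamma_{6} + \gamma_{1})x\\
& + (\gamma_{14} + \gamma_{13} + \gamma_{11} + \gamma_{10} + \gamma_{9} + \gamma_{7} + \gamma_{5} + \gamma_{0}).
\end{aligned} \tag{6}
$$

The predicted output and the actual output are divided into five parity groups as shown in (4) and (6), respectively. These parity groups are XORed with each other to determine if there has been any fault, for example, a bit flip, during the α module operation. In total, each α module outputs five EFs. Fig. 2 shows the implementation of the α module with the proposed error-detection schemes. $A(x)$ is the input, of the form $p(x) = a_{m-1}x^{m-1} + \cdots + a_1 x + a_0$, which goes to two different modules that run in parallel. In the α module, (1) takes place. The output from the α module is divided into five groups in the ACRC module, which are denoted as $x_a^{1}$–$x_a^{5}$ in Fig. 2. Meanwhile, $A(x)$ is also divided into five groups in the PCRC module, which are denoted as $x_p^{1}$–$x_p^{5}$. Once the two CRC modules are done, each group is XORed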
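The predicted-versus-actual comparison for the α module can be cross-checked in software. The Python sketch below assumes $f(x) = x^{16} + x^{12} + x^3 + x + 1$ and $g_0(x) = x^5 + x^3 + 1$ from the text. It computes the actual signature from the α-module output, the predicted signature from the input alone (by linearity, as the XOR of per-bit signatures), and XORs the two into five error flags. All function names are hypothetical, and this is a sketch under those assumptions, not the authors' verification code.

```python
M = 16
F_POLY = 0x1100B   # f(x) = x^16 + x^12 + x^3 + x + 1 (field polynomial from the text)
G0 = 0b101001      # g0(x) = x^5 + x^3 + 1 (CRC-5 generator from the text)


def poly_mod(p, g):
    """Remainder of p(x) / g(x) over GF(2); polynomials stored as bit masks."""
    dg = g.bit_length() - 1
    while p.bit_length() - 1 >= dg:
        p ^= g << (p.bit_length() - 1 - dg)
    return p


def alpha_module(a):
    """A(x) * x reduced modulo f(x)."""
    return poly_mod(a << 1, F_POLY)


def acrc(out):
    """Actual CRC-5: signature of the alpha-module output."""
    return poly_mod(out, G0)


# Signature contributed by each single input bit a_i. Bit k of BIT_SIG[i]
# says whether a_i belongs to the parity group of x^k.
BIT_SIG = [acrc(alpha_module(1 << i)) for i in range(M)]


def pcrc(a):
    """Predicted CRC-5, computed from the input only (XOR of per-bit signatures)."""
    sig = 0
    for i in range(M):
        if (a >> i) & 1:
            sig ^= BIT_SIG[i]
    return sig


def parity_group(k):
    """Indices i of the input bits a_i that feed the predicted x^k parity bit."""
    return [i for i in range(M) if (BIT_SIG[i] >> k) & 1]


def error_flags(a, fault_mask=0):
    """Five error flags as a 5-bit value; fault_mask flips bits of the output."""
    faulty_out = alpha_module(a) ^ fault_mask
    return pcrc(a) ^ acrc(faulty_out)


assert error_flags(0x1234) == 0              # fault-free: no flag raised
assert error_flags(0x1234, 1 << 7) != 0      # single bit flip at the output is flagged
assert parity_group(4) == [3, 5, 7, 8, 9, 11, 12, 15]
```

The lists returned by `parity_group(k)` can be compared against the index sets of the five parity groups in the derivation. A CRC only misses fault patterns that are themselves multiples of $g_0(x)$; for example, `error_flags(a, G0)` returns 0 for any `a`, which is a three-bit fault the flags cannot see.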
TABLE II
OVERHEADS OF THE PROPOSED ERROR-DETECTION SCHEMES FOR THE FINITE-FIELD MULTIPLIERS USED IN THE LUOV ALGORITHM DURING THE POLYNOMIAL GENERATION ON XILINX FPGA FAMILY KINTEX ULTRASCALE+ FOR DEVICE XCKU5P-FFVD900-1-I
overhead obtained by applying error-detection schemes of NTT architectures is 24%. The worst case area overhead of [8] and [9] is more than 33% with a performance degradation of more than 14% when fault-detection architectures are applied to stateless hash-based signatures. These and similar prior works on classical cryptography verify that the proposed error-detection architectures obtain similar overheads compared to other works on fault detection, achieving an acceptable overhead. These degradations are acceptable for providing error detection to the original architectures which lack such capability to thwart natural or malicious faults.

V. CONCLUSION

In this work, we have derived error-detection schemes for the finite-field multipliers used in postquantum cryptographic algorithms such as Luov, noting that the proposed error-detection schemes can be adapted to other applications and cryptographic algorithms whose building blocks need finite-field multiplications. The error-detection architectures proposed in this work are based on CRC-5 signatures and we have performed software implementations for the sake of verification. Additionally, we have explored and studied both primitive and standardized generator polynomials for CRC-5, comparing the complexity for each of them. We have embedded the proposed error-detection schemes into the original finite-field multipliers of Luov's algorithm, obtaining high error coverage with acceptable overhead.

REFERENCES

[1] J. L. Danger et al., “On the performance and security of multiplication in GF(2^N),” Cryptography, vol. 2, no. 3, pp. 25–46, 2018.
[2] M. Mozaffari-Kermani and A. Reyhani-Masoleh, “Reliable hardware architectures for the third-round SHA-3 finalist Grostl benchmarked on FPGA platform,” in Proc. DFT, Oct. 2011, pp. 325–331.
[3] M. Mozaffari-Kermani and A. Reyhani-Masoleh, “A low-cost S-box for the advanced encryption standard using normal basis,” in Proc. IEEE Int. Conf. Electro/Inf. Technol., Jun. 2009, pp. 52–55.
[4] M. Yasin, B. Mazumdar, S. S. Ali, and O. Sinanoglu, “Security analysis of logic encryption against the most effective side-channel attack: DPA,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Nanotechnol. Syst. (DFTS), Oct. 2015, pp. 97–102.
[5] M. Mozaffari-Kermani, R. Azarderakhsh, A. Sarker, and A. Jalali, “Efficient and reliable error detection architectures of hash-counter-hash tweakable enciphering schemes,” ACM Trans. Embedded Comput. Syst., vol. 17, no. 2, pp. 54:1–54:19, May 2018.
[6] M. Mozaffari-Kermani, R. Azarderakhsh, and A. Aghaie, “Reliable and error detection architectures of Pomaranch for false-alarm-sensitive cryptographic applications,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 23, no. 12, pp. 2804–2812, Dec. 2015.
[7] A. Sarker, M. Mozaffari-Kermani, and R. Azarderakhsh, “Hardware constructions for error detection of number-theoretic transform utilized in secure cryptographic architectures,” IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 27, no. 3, pp. 738–741, Mar. 2019.
[8] M. Mozaffari-Kermani, R. Azarderakhsh, and A. Aghaie, “Fault detection architectures for post-quantum cryptographic stateless hash-based secure signatures benchmarked on ASIC,” ACM Trans. Embedded Comput. Syst., vol. 16, no. 2, pp. 59:1–59:19, Dec. 2016.
[9] M. Mozaffari-Kermani and R. Azarderakhsh, “Reliable hash trees for post-quantum stateless cryptographic hash-based signatures,” in Proc. IEEE Int. Symp. Defect Fault Tolerance VLSI Nanotechnol. Syst. (DFTS), Oct. 2015, pp. 103–108.
[10] M. M. Kermani and R. Azarderakhsh, “Reliable architecture-oblivious error detection schemes for secure cryptographic GCM structures,” IEEE Trans. Rel., vol. 68, no. 4, pp. 1347–1355, Dec. 2019.
[11] A. A. Kamal and A. M. Youssef, “Strengthening hardware implementations of NTRUEncrypt against fault analysis attacks,” J. Cryptograph. Eng., vol. 3, no. 4, pp. 227–240, Nov. 2013.
[12] A. Kipnis, J. Patarin, and L. Goubin, “Unbalanced oil and vinegar signature schemes,” in Proc. Int. Conf. Theory Appl. Cryptograph. Techn. Berlin, Germany: Springer, 1999, pp. 206–222.
[13] D. Moody, “Post-quantum cryptography: NIST’s plan for the future,” Tech. Rep., Feb. 2016. [Online]. Available: https://csrc.nist.gov/csrc/media/projects/post-quantum-cryptography/documents/pqcrypto-2016-presentation.pdf
[14] D. Moody, “Post-quantum cryptography: Round 2 submissions,” Tech. Rep., Mar. 2019. [Online]. Available: https://csrc.nist.gov/CSRC/media/Presentations/Round-2-of-the-NIST-PQC-Competition-What-was-NIST/images-media/pqcrypto-may2019-moody.pdf
[15] D. J. Bernstein, “Post-quantum cryptography,” in Encyclopedia of Cryptography and Security, H. C. A. van Tilborg and S. Jajodia, Eds. Boston, MA, USA: Springer, 2011, pp. 949–950, doi: 10.1007/978-1-4419-5906-5_386.
[16] A. Reyhani-Masoleh and M. A. Hasan, “Error detection in polynomial basis multipliers over binary extension fields,” in Proc. CHES, 2002, pp. 515–528.
[17] EPC Radio-Frequency Identity Protocols Class-1 Generation-2 UHF RFID Protocol for Communications at 860 MHz–960 MHz, EPC Global, Brussels, Belgium, Version 1.0.23, 2008.
[18] T. V. Ramabadran and S. S. Gaitonde, “A tutorial on CRC computations,” IEEE Micro, vol. 8, no. 4, pp. 62–75, Aug. 1988.
[19] S. Subramanian, M. Mozaffari-Kermani, R. Azarderakhsh, and M. Nojoumian, “Reliable hardware architectures for cryptographic block ciphers LED and HIGHT,” IEEE Trans. Comput.-Aided Design Integr. Circuits Syst., vol. 36, no. 10, pp. 1750–1758, Oct. 2017.