2015 Workshop on Fault Diagnosis and Tolerance in Cryptography

Singular Curve Point Decompression Attack

Johannes Blömer and Peter Günther∗

University of Paderborn
Germany
Email: {johannes.bloemer, peter.guenther}@upb.de

∗ This work was partially supported by the German Ministry of Education and Research, grant 16KIS0062.

Abstract: In this work, we show how to use instruction skip faults to transfer the discrete logarithm problem from a cryptographically strong elliptic curve to a weak singular curve. More specifically, we attack the algorithm that computes a point on the curve from a field element. This algorithm is a building block of point decompression, hashing to curves, and random point sampling. Our attack is most powerful for curves of j-invariant zero, which often occur in pairing based cryptography. Therefore, to demonstrate the effectiveness of our attack in practice, we perform it on an AVR Xmega A1 for the pairing based Boneh-Lynn-Shacham short signature scheme.

Keywords: elliptic curve, fault attack, singular curve, pairing based cryptography

I. INTRODUCTION

In this paper, we introduce fault attacks that allow us to transfer the discrete logarithm (DLOG) problem from a strong elliptic curve that is defined over a large prime field to a weak subgroup of a finite field. Our transfer is indirect via a singular curve and we use instruction skip faults to force the result of point decompression onto this curve. Note that point decompression is an important building block of elliptic curve cryptography (ECC) and part of many standards and schemes. We practically perform our attack for Boneh-Lynn-Shacham (BLS) signatures on an AVR Xmega A1.

The security of many cryptographic primitives is based on the DLOG problem in multiplicative subgroups of finite fields or in subgroups of elliptic curves. For elliptic curves, the DLOG problem is conjectured to be harder than for finite fields. Consequently, groups in ECC have an efficient representation due to their relatively small size. A technique to obtain even more efficient representations is point compression. Here, a point $(x, y)$ is represented only by its $x$-coordinate and the sign of its $y$-coordinate. Prior to the elliptic curve scalar multiplication (ECSM), a compressed point has to be decompressed based on the equation that defines the curve. Point compression halves the representation size of elliptic curve elements. Therefore, it is part of many ECC schemes and related standards, e.g., IEEE 1363.

Besides their efficiency, elliptic curves are also interesting because they are used to define cryptographic pairings. For many cryptographic primitives the most efficient realizations are based on pairings. Examples include identity based encryption (IBE) [1], attribute based encryption [2], or signature schemes with additional properties [3].

In practice, the security of a system depends not only on the hardness of the DLOG problem but also on the concrete implementation. In recent years, many fault attacks on ECC implementations have been published. Before describing our results, we give an overview of the previous work that is most relevant for our results. For a general survey we refer to [4], [5].

Previous Work

For the large prime field case, most standards define elliptic curves by a short Weierstrass equation of the form $y^2 = x^3 + a_4 x + a_6$ and from now on, we consider only curves of this form. A special category of fault attacks on ECC are so-called weak curve attacks. Here, an attacker tampers with the parameters $a_4$ or $a_6$, the coordinates $x$ and $y$ of group elements, or the curve's field of definition. The effect is that ECSM is performed in a different group where the complexity of solving the DLOG problem is lower. Notable examples for attacks in this category are given in [6], [7], [8], and [9].

Attacks where the weak curve is obtained by modification of the base point of the ECSM are often called invalid point attacks [4]. An important observation made in [6] is that many algorithms for ECSM only use the parameter $a_4$, but do not use the parameter $a_6$. This fact can be exploited by an attacker: A fault is introduced to move the base point onto a weak curve that shares the parameter $a_4$ with the original curve but uses a different parameter $a_6$. Then, the ECSM is performed implicitly on the weak curve. If the attacker is able to access the result of the ECSM, the DLOG problem can be solved on the weak curve. This result is used to obtain the DLOG on the original curve as well. For most previous fault attacks, the notion of a weak curve is that of a curve with smooth composite order, i.e., a curve where the largest prime factor of the group order is upper-bounded by some

978-1-4673-7579-5/15 $31.00 © 2015 IEEE


DOI 10.1109/FDTC.2015.17
smoothness parameter. On such a curve, the DLOG can be computed using the Pohlig-Hellman approach.

Several countermeasures have been proposed against fault attacks on ECC [4], [5]. The standard approach to defeat invalid point attacks is to check if the result of the ECSM is on the original elliptic curve; if this is not the case, the output is discarded. Furthermore, point compression has sometimes been proposed as a natural countermeasure against invalid curve attacks because decompression assures that the resulting point is on the curve defined by $a_4$ and $a_6$.

It was already observed, for example in [10], that a validity check can be attacked with a second fault. An invalid point attack that shows how validity checks can be circumvented is the twist curve attack from [9]. This attack is based on implementations of ECSM that do not use the $y$-coordinate of the base point. The authors use faults to perform the ECSM on a weak twist with smooth order. They furthermore show how a second fault can be used to map the point back to the original curve in order to pass the validity check. This attack has two limitations: First, many standard implementations do not use the required special form of the ECSM. Secondly, the efficiency of the attack depends on the smoothness of the order of the twist.

To deal with attacks on the validity check, [11] propose a countermeasure that randomizes the base point by adding a random point. In a correct ECSM, the randomness will be removed afterwards by subtracting a multiple of the random point. If a fault modifies the base point into a point on a weak curve, the randomness will not cancel out and the result carries no information about the secret.

An attack that is related to our attack in the sense that it also reduces the DLOG problem from elliptic curves to the DLOG problem in a finite field is the famous work of [12], called the MOV attack. Different from our work, the MOV attack does not use faults and singular curves to obtain this reduction but uses a bilinear pairing on $E$. The attack is efficient for supersingular curves (not to be confused with singular curves) and, consequently, these curves are avoided for standard ECC today.

Singular curves are weak curves because they are always isomorphic to either an additive or a multiplicative subgroup of their field of definition. In case of an additive subgroup, the DLOG problem is trivial. In the multiplicative case, the DLOG problem is not trivial but sub-exponential. Because fields used in ECC are relatively small, sub-exponential DLOG algorithms are often efficient enough to solve the DLOG problem. Consequently, singular curves were already considered before. In [13], a weak curve attack based on singular curves is proposed. It assumes that the attacker is able to directly choose the base point of the ECSM on the singular curve. For most schemes, this is not possible and the attack often does not directly apply in practice.

Our Contribution

In this paper, we show how faults at point decompression can be used to realize an invalid point attack that results in points on singular curves, not just elliptic curves with smooth order.

For our attack, we assume protocols where the output of point decompression is the input of the ECSM. We distinguish between two types of protocols: protocols that give an attacker strong control over the input of decompression and protocols that give an attacker weak control over the input of decompression. Here, strong control refers to the possibility of the attacker to actually choose the input of decompression. Weak control refers to the possibility of the attacker to choose a bit string such that its image under a hash function is the input of decompression.

We show that for every elliptic curve, there exists a set of inputs for point decompression that is vulnerable to instruction skip faults. That means that for every element of this set, we identify an instruction in the decompression algorithm such that if this instruction is skipped, decompression of the corresponding element yields a point on a singular curve. For arbitrary elliptic curves in short Weierstrass form, the set of vulnerable elements is relatively small. Hence, to mount our attack, strong control is required for explicitly selecting one of those elements. For curves with $a_4 = 0$, or equivalently, curves with j-invariant 0, it turns out that half of the inputs of point decompression are vulnerable. Therefore, it is possible for an attacker with weak control to choose input strings that are hashed to vulnerable elements.

With respect to efficiency, our attack is more efficient for curves with j-invariant 0 than for general curves. This is because for general curves, our attack results in a singular curve that is isomorphic to a multiplicative subgroup of a finite field, whereas for curves with j-invariant 0, the resulting curve is isomorphic to the additive group of a finite field where the DLOG problem is trivial. In summary, for curves with j-invariant 0, our attack has weaker assumptions on the protocol and is more efficient compared to general curves. As we explain below, for various reasons curves with j-invariant 0 are of considerable interest in practice.

The efficiency of our attack for curves with j-invariant 0 also has practical advantages. In many scenarios, first the DLOG has to be computed on the weak curve to obtain a candidate for the secret. Then the candidate has to be verified, e.g., based on the public key. If the fault mechanism is not very precise, a large number of experiments has to be performed before a point on the weak curve is obtained. Consequently, a large number of candidate DLOG instances have to be solved. This is easily possible for our attack on curves with j-invariant 0, where the DLOG problem is trivial. For our attack on general curves or for attacks that rely on invalid points with smooth order like [6], their sub-exponential run-time may not be efficient enough in practice.
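The remark that the DLOG problem is trivial in the additive group of a finite field can be illustrated with a toy computation. The following is a minimal sketch, assuming a small prime field $\mathbb{F}_p$; the modulus, base, and secret are illustrative values and the function name `additive_dlog` is hypothetical, none of them taken from the paper.

```python
def additive_dlog(g: int, h: int, p: int) -> int:
    """Return s with s * g == h (mod p), i.e. the DLOG of h to base g in (Z/pZ, +)."""
    if g % p == 0:
        raise ValueError("base must be non-zero")
    # In the additive group, "exponentiation" is multiplication by s,
    # so the logarithm is a single modular division.
    return h * pow(g, -1, p) % p

# Toy values (assumptions for illustration only).
p = 1009          # small prime modulus
g = 123           # base element of the additive group
secret = 777      # the "exponent" to be recovered
h = secret * g % p

recovered = additive_dlog(g, h, p)
```

Because each instance costs one modular inversion and one multiplication, solving a large number of candidate instances, as required when the fault injection is imprecise, stays cheap.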

Usually, in our attack we are able to deduce the exponent we are looking for from a single faulty run of the protocol on a vulnerable element. This makes our attack applicable to protocols where the exponent is a nonce and that allow us to recover the secret key from a complete nonce. To recover the complete nonce from a single run, the attack in [6] requires that the invalid point has large but smooth order. The probability to obtain such a point is relatively small. Consequently, the expected number of experiments to recover the secret is larger than in our case, and the attack is harder to realize in practice.

Furthermore, the invalid point attack from [6] applied to curves with j-invariant 0 may fail. If the original curve has j-invariant 0, the invalid points in [6] are on an elliptic curve that also has j-invariant 0. This implies that the faulty curve is a twist of the original curve. If the twist does not have smooth order, the attack is not successful. In [9], the invalid points are always on the twist of the original curve, independent of the j-invariant. As for [6], the attack fails if the twist does not have smooth order. Our attack does not require curves with weak twists and is even more efficient for curves with j-invariant 0.

Because the set of vulnerable inputs of decompression is large for curves with j-invariant 0, in this case our attack also applies to random point sampling. This allows us to defeat the randomization countermeasure from [11]. The basic idea is to attack decompression with one fault and random point sampling with another fault to obtain a randomized point on the singular curve.

To show that our attack can be realized in practice, we evaluate it on an AVR Xmega A1 for an implementation of the BLS short signature scheme with BN curves of j-invariant 0. As a countermeasure, the implementation performs an output point validity check. Hence, we have to use two faults. We place the first fault at decompression to obtain a point on a singular curve and with the second fault we eliminate the validity check.

Applications

Our attack is easier to realize for curves of j-invariant 0 because for these curves only weak control over the input of decompression is required. These curves are popular in ECC and especially pairing based cryptography (PBC) because they have useful properties. First, they admit non-trivial endomorphisms that can be used to speed up the ECSM [14], [15]. Secondly, they are the only curves with twists of degree 6, therefore they yield the most efficient pairings [16]. Finally, supersingular curves, which are required to define symmetric pairings, often have j-invariant 0 [17]. Consequently, curves with j-invariant 0 are defined in standards like [18]; considerable effort has been spent to construct them for application in PBC [19], [20], [21]; and they are proposed in, e.g., [1] and RFC-5091 [22].

Furthermore, many pairing based signature schemes hash messages to curves. Hence, they give an attacker weak control over the input of decompression and are susceptible to our attack. Examples for such schemes are given in [23], [3], [24]. Also, it was explicitly proposed to instantiate such schemes with curves of j-invariant 0 [25]. The popular combination of pairing based signature schemes with curves of j-invariant 0 is an important application of our attack. This also motivates our practical attack on an instantiation of the pairing based BLS short signature scheme [23] with BN curves of j-invariant 0.

As we explained above, schemes that offer strong control over the input of decompression are vulnerable to our attack for all curves. Examples for signature schemes that offer strong control include the blind signature scheme from [3].

So far, we always assumed that the attacker is able to access the result of the ECSM. Typically, this is true for signature schemes, because the result of the ECSM with the secret signing key is required for public verification. The situation differs for key agreement and encryption schemes. On the one hand, encryption and key agreement schemes from standards like IEEE 1363 or ANSI X9.63 offer strong control over the input of decompression. On the other hand, they often apply a key derivation function (KDF) to the output of the ECSM. Hence, these schemes are also vulnerable to our attack, but only if the attacker is able to invert the KDF or is able to access the result of the ECSM prior to the KDF. Note that this limitation also holds for other attacks on encryption schemes like [6].

The paper is structured as follows. In Section II we give the necessary background on elliptic curves and especially singular curves. Furthermore, we introduce our fault model. In Section III-A we present our attack for curves with j-invariant 0. In Section III-B we present a more restricted attack but for general curves. In Section III-D, we relate our attack to existing cryptographic schemes and discuss it with respect to proposed countermeasures. In Section IV, we give results for a practical evaluation of our attack on the AVR Xmega A1 for BLS short signatures.

II. BACKGROUND

In this section we start with the necessary background on elliptic and singular curves. Then we introduce two abstract protocols to describe our attacks. Furthermore, we introduce our fault model.

A. Elliptic Curves

Let $\mathbb{F}_q$ be a field with $q$ elements. With $\mathbb{F}_q^+$ and $\mathbb{F}_q^*$ we denote the additive and the multiplicative group of $\mathbb{F}_q$, respectively. Let $q = p^n$ for a prime $p$. Then $\mathbb{F}_q$ is an $\mathbb{F}_p$ vector space of dimension $n$ and we can represent an element $a \in \mathbb{F}_q$ as a vector $(a_0, \dots, a_{n-1})$ where $a_i \in \mathbb{F}_p$ is represented by an integer in the interval $[0, p-1]$. For $p > 2$, let $b$ be the least significant bit (LSB) of $a_0$. Then we define


$(-1)^b$ as the sign of $a$ and $\sqrt{a} \in \mathbb{F}_{q^2}$ as the positive square root of $a$. For $a \in \mathbb{F}_q$ it holds that $\sqrt{a} \in \mathbb{F}_q$ if and only if $a^{(q-1)/2} = 1$ and in that case $\sqrt{a}$ can be computed with the Tonelli-Shanks algorithm [26].

Let $\mathbb{F}_q$ be a finite field of characteristic $p > 3$. Let $E$ be a, not necessarily smooth, algebraic curve defined by a short Weierstrass equation:

$$E : y^2 = x^3 + a_4 x + a_6. \tag{1}$$

If $a_4, a_6 \in \mathbb{F}_q$ we say the curve is defined over $\mathbb{F}_q$ and write $E/\mathbb{F}_q$. For a field $\mathbb{F}_{q^k} \supseteq \mathbb{F}_q$ we write $E(\mathbb{F}_{q^k})$ for the $\mathbb{F}_{q^k}$-rational points, i.e., for the points with coordinates in $\mathbb{F}_{q^k}$.

We have to distinguish two types of points on $E$, smooth points and singular points. Singular points are points where the partial derivatives of the defining equation of $E$ vanish simultaneously. Points that are not singular are called smooth. A curve of the form (1) without singular points is called an elliptic curve in short Weierstrass form. A curve with singular points is called a singular curve. With $E_{ns}$ we denote the set of smooth points of $E$. By definition $E_{ns} = E$ for elliptic curves.

We define the discriminant of $E$ as $\Delta = \Delta(E) = -16(4a_4^3 + 27a_6^2)$. The discriminant equals $\Delta(E) = 0$ if and only if $E$ is singular [27]. For smooth curves, i.e., curves with $\Delta(E) \neq 0$, we define the j-invariant as $j = j(E) = -1728(4a_4)^3/\Delta$. Hence, $j = 0$ for curves with $a_4 = 0$. A curve $E'$ with $j(E') = j(E)$ is called a twist of $E$ [27].

Using the chord-and-tangent law, the smooth part $E_{ns}$ of a curve given by (1) forms an additive group [27]. Its neutral element is a special point at infinity in the projective plane and is denoted by $O$. For the sum of $n$ elements $P + \dots + P$ we write $nP$ and call this operation elliptic curve scalar multiplication (ECSM) with scalar $n$. If $P, Q \in E_{ns}$ and if $P$ is of order $r$ and $Q = nP$ for $n \in \mathbb{Z}/r\mathbb{Z}$ we call $n$ the discrete logarithm (DLOG) of $Q$ in base $P$. For $n$-torsion points, i.e., points of order dividing $n$, we write $E[n]$ and we define $E(\mathbb{F}_{q^k})[n] := E(\mathbb{F}_{q^k}) \cap E[n]$.

In cryptography, a pairing is a non-degenerate, bilinear map $G_1 \times G_2 \to G_T$ with groups $G_1$, $G_2$, and $G_T$. Usually, it is defined based on elliptic curves $E/\mathbb{F}_q$ with $G_1, G_2 \subset E(\mathbb{F}_{q^k})[r]$ and $G_T \subset \mathbb{F}_{q^k}^*$. The parameter $k$ is called the embedding degree of $r$ with respect to $q$ and can be defined as the smallest positive integer such that $r \mid q^k - 1$. For background on pairings, we refer the reader to [17].

Our attack is based on the next theorem that describes the structure of singular curves. The theorem summarizes Theorem 2.29 and Theorem 2.30 of [28] and adapts Theorem 2.30 of [28] to curves in short Weierstrass form (see also Proposition 2.5 of [27]).

Theorem 1. Let $E/\mathbb{F}_q$ be a singular curve defined as in (1).
1) If $a_4 = a_6 = 0$, then $E$ has a singular point at $S = (0, 0)$ and the map $\varphi^+ : E_{ns}(\mathbb{F}_q) \to \mathbb{F}_q^+$ with
$$(x, y) \mapsto \frac{x}{y} \quad \text{and} \quad O \mapsto 0 \tag{2}$$
is an isomorphism of groups.
2) If $a_4 \neq 0$ define $x_S = -\frac{3a_6}{2a_4}$. Then $E$ has a singular point at $S = (x_S, 0)$. Furthermore, with $\alpha = \sqrt{3x_S}$ define the map $\varphi^* : E_{ns}(\mathbb{F}_q) \to \mathbb{F}_q(\alpha)^*$ with
$$(x, y) \mapsto \frac{y - \alpha(x - x_S)}{y + \alpha(x - x_S)} \quad \text{and} \quad O \mapsto 1. \tag{3}$$
a) If $\alpha \in \mathbb{F}_q$, then $\varphi^*$ is an isomorphism of the groups $E_{ns}(\mathbb{F}_q)$ and $\mathbb{F}_q^*$.
b) If $\alpha \notin \mathbb{F}_q$, then $\varphi^*$ is an isomorphism of $E_{ns}(\mathbb{F}_q)$ and $\{u + \alpha v \mid u, v \in \mathbb{F}_q,\ u^2 - 3x_S v^2 = 1\}$ viewed as a multiplicative group.

Proof: For case 1 note that the partial derivatives $3x^2$ and $2y$ of $x^3 - y^2$ vanish simultaneously at $S = (0, 0)$ and hence $S$ is singular by definition. According to Theorem 2.29 of [28] the map $\varphi^+$ is an isomorphism of groups. For case 2a and case 2b we see that the partial derivatives $2y$ and $3x^2 + a_4$ of (1) vanish at $S = (x_S, 0)$ and hence $S$ is singular. Now we translate (1) such that the singular point is at $(0, 0)$. By expanding $y^2 = (x + x_S)^3 + a_4(x + x_S) + a_6$ and applying the identity $4a_4^3 + 27a_6^2 = 0$ from the discriminant of singular curves we obtain $y^2 = x^3 + 3x_S x^2$. Then, it follows from Theorem 2.30 of [28] that $\varphi^*$ is an isomorphism of groups.

Hence, the DLOG problem on $E_{ns}(\mathbb{F}_q)$ can be transferred to the DLOG problem in $\mathbb{F}_q^+$, $\mathbb{F}_q^*$, or $\mathbb{F}_{q^2}^* \supseteq \mathbb{F}_q(\alpha)^*$, respectively. For $\mathbb{F}_q^+$, the DLOG can be computed with a simple division in $\mathbb{F}_q$ and hence can be performed in time $O(\log(q)^3)$. Let

$$L_N(a) = \exp\left(\log(N)^a (\log\log N)^{1-a}\right)$$

be the sub-exponential function. For computing DLOG in $\mathbb{F}_q^*$ and $\mathbb{F}_{q^2}^*$ there exist sub-exponential time algorithms with $L_q(1/3)$ and $L_{q^2}(1/3)$, respectively [29]. In both cases, the DLOG can be computed much faster than the DLOG on an elliptic curve $E(\mathbb{F}_q)$.

B. Abstraction for ECC Protocols

In practice, the instantiation of an elliptic curve based scheme includes public domain parameters. The following two definitions are based on [30] and [8]:

Definition 2 (ECC Domain parameters). For an ECC scheme, we define the domain parameters as a tuple $D = (\mathbb{F}_q, E, G, r, c)$ with
1) A field $\mathbb{F}_q$ of size $q = p^n$ with $p > 3$ defined by a proper description.
2) An elliptic curve $E$ defined by two field elements $a_4, a_6 \in \mathbb{F}_q$ and (1).

3) A point $G \in E(\mathbb{F}_q)$ of prime order defined by two field elements $x_G, y_G \in \mathbb{F}_q$ as $G = (x_G, y_G)$.
4) The order $r$ of $G$.
5) The co-factor $c$ of $r$ defined as $\#E(\mathbb{F}_q) = cr$.

Definition 3 (Valid point). A point $Q = (x_Q, y_Q)$ is valid with respect to a set of domain parameters $D = (\mathbb{F}_q, E, G, r, c)$ if $Q \in E(\mathbb{F}_q)$ and if $Q$ is of order $r$.

A point $P \in E(\mathbb{F}_q)$ with $P = (x_P, y_P)$ can be uniquely represented with $\lceil \log_2(q + 1) \rceil + 1$ bits as $(x_P, b) \in \mathbb{F}_q \times \{0, 1\}$ where $(-1)^b$ is the sign of $y_P$. Then, $P$ can be completely recovered based on (1). This strategy is called point compression and is part of many standards, for example IEEE 1363 [31]. Fig. 1 lists an example implementation of point decompression. Similar algorithms are used in real-world implementations, for example in Open Secure Socket Layer (OpenSSL).

Require: b ∈ {0, 1}, x, a4, a6 ∈ Fq
Ensure: (x, y) with y^2 = x^3 + a4·x + a6
 1: procedure Decompress(x, b)
 2:   v ← x^2                ▷ v = x^2
 3:   v ← v + a4             ▷ v = x^2 + a4
 4:   v ← v · x              ▷ v = x^3 + a4·x
 5:   v ← v + a6             ▷ v = x^3 + a4·x + a6
 6:   if √v ∈ Fq then
 7:     v ← √v
 8:     y ← (−1)^b · v
 9:     return (x, y)
10:   else
11:     return O
12:   end if
13: end procedure
Figure 1. Algorithm Decompress for decompressing points (cf. [31], Appendix 12.8).

There are also protocols that require that arbitrary strings are hashed to the underlying group. HashToCurve in Fig. 2 implements the standard approach for hashing bit strings to elliptic curves (cf. [25, Section 3.3.3]). It is based on a cryptographic hash function $H : \{0, 1\}^* \to \mathbb{F}_q \times \{0, 1\}$ that can easily be built from any cryptographic hash function that hashes to $\{0, 1\}^t$ for $t \in \mathbb{N}$. The basic approach is to hash $m \,\|\, i$ to a compressed representation $(x_P, b)$ with $x$-coordinate $x_P$ and sign $(-1)^b$ and then decompress the result to the curve. If the image of $m \,\|\, i$ under $H$ is not the compression of a point, the counter $i$ is incremented and the process is repeated until a valid representation has been found.

Require: m ∈ {0, 1}*, E : y^2 = x^3 + a4·x + a6, hash function H : {0, 1}* → Fq × {0, 1}, r ∈ N with #E(Fq) = c·r
Ensure: P ∈ E(Fq)[r]
 1: procedure HashToCurve(m)
 2:   i ← 0
 3:   repeat                 ▷ until (x, b) is a valid compression
 4:     (x, b) ← H(m ∥ i)
 5:     U ← Decompress(x, b)
 6:     i ← i + 1
 7:   until U ≠ O
 8:   P ← cU                 ▷ map to subgroup
 9:   return P
10: end procedure
Figure 2. Algorithm HashToCurve for hashing to E(Fq)[r].

In order to simplify the description of our attack and of previous work, we introduce two abstract protocols between the parties A and B. In Section III-D we relate these protocols to real-world protocols.

Protocol 1 (Decompress-And-Multiply). For domain parameters $D = (\mathbb{F}_q, E, G, r, c)$, and B's secret $s \in \mathbb{Z}/r\mathbb{Z}$ define the protocol Decompress-And-Multiply as follows:
1) A sends a tuple $(x_P, b) \in \mathbb{F}_q \times \{0, 1\}$ to B.
2) B computes $P = \mathrm{Decompress}(x_P, b) \in E(\mathbb{F}_q)$.
3) B computes $Q = sP$ and sends the result to A.

This protocol is related to key-agreement and encryption schemes from ECC. Note that it offers A in the role of an attacker strong control over the input of decompression.

Protocol 2 (Hash-And-Multiply). For domain parameters $D = (\mathbb{F}_q, E, G, r, c)$, and B's secret $s \in \mathbb{Z}/r\mathbb{Z}$ define the protocol Hash-And-Multiply as follows:
1) A sends a message $m \in \{0, 1\}^*$ to B.
2) B computes $P = \mathrm{HashToCurve}(m) \in E(\mathbb{F}_q)$.
3) B computes $Q = sP$ and sends the result to A.

This protocol resembles pairing based signature schemes like the BLS short signature scheme [23]. We give additional examples in Section III-D. Note that this protocol offers A in the role of an attacker only weak control over the input of decompression because B's input is hashed before it is decompressed.

Definition 4. Consider the party B of Protocol 1 or Protocol 2. If B's implementation of ECSM does not use the parameter $a_6$, we call B invalid point vulnerable (IPV). If B is IPV and outputs $Q$ without checking that $Q$ is valid, we call B first order IPV (IPV1). If B is IPV and outputs $Q$ only after a positive validity check, we call B second order IPV (IPV2).

We show in Section III that we can attack implementations that are IPV1 with our singular curve attack using only one fault. We further discuss implementations with validity check in Section III-E, and in Section IV we show how we can attack IPV2 implementations by means of a second fault.
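The algorithms Decompress (Fig. 1) and HashToCurve (Fig. 2) can be sketched in a few lines of Python. This is a minimal illustration under several assumptions that are not taken from the paper: a toy prime field with $p \equiv 3 \pmod 4$ (so that a square root is a single exponentiation instead of Tonelli-Shanks), the toy curve $y^2 = x^3 + 7$, SHA-256 with a 4-byte counter as the hash $H$, and `None` standing in for $O$. The co-factor multiplication in Line 8 of HashToCurve is omitted, so `hash_to_curve` returns the point $U$.

```python
import hashlib

P_MOD = 1019       # toy prime with P_MOD % 4 == 3 (assumption, not from the paper)
A4, A6 = 0, 7      # toy curve y^2 = x^3 + 7 (assumption)

def sqrt_mod(v, p=P_MOD):
    """Return the 'positive' root of v (the one with LSB 0), or None for a non-square."""
    r = pow(v, (p + 1) // 4, p)          # valid shortcut because p % 4 == 3
    if r * r % p != v % p:
        return None
    return r if r % 2 == 0 else p - r

def decompress(x, b, a4=A4, a6=A6, p=P_MOD):
    """Follows Fig. 1 line by line; returns None instead of O."""
    v = x * x % p                        # line 2: v = x^2
    v = (v + a4) % p                     # line 3: v = x^2 + a4
    v = v * x % p                        # line 4: v = x^3 + a4*x
    v = (v + a6) % p                     # line 5: v = x^3 + a4*x + a6
    root = sqrt_mod(v, p)                # line 6: is v a square?
    if root is None:
        return None                      # line 11
    y = root if b == 0 else (-root) % p  # line 8: y = (-1)^b * sqrt(v)
    return (x, y)

def hash_to_field(m: bytes, i: int, p=P_MOD):
    """Hypothetical H: maps m || i to (x, b) using SHA-256."""
    n = int.from_bytes(hashlib.sha256(m + i.to_bytes(4, "big")).digest(), "big")
    return (n >> 1) % p, n & 1

def hash_to_curve(m: bytes):
    """Follows Fig. 2 up to Line 7 (co-factor multiplication omitted)."""
    i = 0
    while True:
        x, b = hash_to_field(m, i)
        U = decompress(x, b)
        i += 1
        if U is not None:
            return U

U = hash_to_curve(b"example message")
```

The loop terminates quickly in expectation because roughly every second $x$ yields a square right-hand side, which is the same observation the attack later relies on.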

Note that implementations following standards like IEEE 1363 do not use the parameter $a_6$ for ECSM (cf. Annex A of [31]) and are IPV. With respect to the validity check, note that Decompress and HashToCurve already output points on the correct curve. Furthermore, IEEE 1363 defines validity checks as optional. Hence, implementations might even be IPV1.

C. Instruction Skip Faults by Clock Glitching

Several fault injection techniques have been used to attack cryptographic algorithms, including clock glitching, voltage glitching, electromagnetic pulses, or laser beams. With those techniques, it is possible to generate different faults during program execution like instruction replacement faults, bit faults, or register faults. For an overview, see [32], [5].

In this work, we consider instruction replacement faults, or more specifically, instruction skip faults. Instruction replacement faults are faults where a fault is injected during instruction fetch, instruction decode, or instruction execution with the effect that another instruction is executed in place of the original instruction. An instruction skip fault is the special case where the original instruction is not executed at all or is replaced with an instruction that does not affect the data of the cryptographic algorithm. It has been shown that instruction replacement faults and instruction skip faults can be achieved with different injection techniques and on different computer architectures. For example, in [33] instruction skip faults on AVR CPUs were realized based on clock glitches. In [34], instruction skip faults on ARM CPUs were introduced by means of electromagnetic pulses.

In Section III, we describe a fault attack based on instruction skip faults. Furthermore, in Section IV we perform the attack on an AVR Xmega A1 by skipping the call to an $\mathbb{F}_q$ addition.

III. SINGULAR CURVE FAULT ATTACK

In this section, we present our attack and show how it can be realized depending on the type of elliptic curve. We explain our attack for IPV1 implementations from Definition 4 without validity checks. Furthermore, we analyze different cryptographic protocols with respect to their vulnerability to the attack.

We start with an outline of the basic idea of our attack. Let $E$ be an elliptic curve in short Weierstrass form from (1). Assume a protocol as defined in Protocol 1 or Protocol 2 that outputs $Q = sP$ for $P, Q \in E(\mathbb{F}_q)$ and for a secret $s \in \mathbb{Z}$. In our attacks, we mount a fault such that $P$ is modified into a smooth point $\tilde{P}$ on a singular curve $\tilde{E}(\mathbb{F}_q)$. The relation between $E$ and $\tilde{E}$ is as follows. If $E$ is defined as $E : y^2 = x^3 + a_4 x + a_6$, then $\tilde{E}$ is defined as $\tilde{E} : y^2 = x^3 + a_4 x + \tilde{a}_6$. Because $E$ and $\tilde{E}$ share the parameter $a_4$, an algorithm for computing $sP$ that does not use the parameter $a_6$ will compute and output $\tilde{Q} = s\tilde{P}$ on the singular curve $\tilde{E}$. Based on Theorem 1 we will then be able to solve the DLOG instance in $\mathbb{F}_q^+$ for $a_4 = 0$ and in $\mathbb{F}_q^*$ or $\mathbb{F}_{q^2}^*$ for $a_4 \neq 0$.

In the following subsections we show how we can introduce a fault at Decompress such that $P$ is modified into a point on a singular curve. We start with the case $a_4 = 0$, i.e., $j(E) = 0$, and continue with the case $j(E) \neq 0$.

A. Attacks for Curves with j-Invariant Zero

We now present our attacks for curves $E : y^2 = x^3 + a_6$, i.e., curves with j-invariant $j(E) = 0$. We only present the attack on Hash-And-Multiply in detail because the attack on Decompress-And-Multiply follows trivially (see Remark 2 at the end of this section).

For the attack, assume domain parameters $D = (\mathbb{F}_q, E, G, r, c)$ where $\mathbb{F}_q$ is of characteristic $p$. We assume that $E : y^2 = x^3 + a_6$ is defined over $\mathbb{F}_p$ and that $r \mid \#E(\mathbb{F}_p)$. Note that the case $p \neq q$ occurs in PBC where $q = p^k$ and $k$ is the embedding degree of $r$ with respect to $p$.

In the attack, A takes the role of the attacker and performs the following steps in the protocol Hash-And-Multiply with B:
A-1 Select messages $m \in \{0, 1\}^*$ until a message with $H(m \,\|\, 0) = (u, b)$ and $\sqrt{u} \in \mathbb{F}_q^*$ has been found.
A-2 Send $m$ to B. While B computes HashToCurve($m$), mount an instruction skip fault at Line 5 of Decompress.
A-3 For B's output $\tilde{Q} = (x_Q, y_Q)$ return key candidates $s_1 = (-1)^b \sqrt{u}\, x_Q (y_Q c)^{-1} \bmod p$ and $s_2 = s_1 + p$.

The following theorem summarizes the effectiveness of the attack:

Theorem 5. If B is IPV1 (see Definition 4) with secret $s$ and if A's instruction skip in step A-2 successfully skips Line 5 of Decompress, then either $s_1 = s \bmod r$ or $s_2 = s \bmod r$. Furthermore, all computations of A can be performed in time $O((\log q)^3)$.

Proof: First note that in step A-1, A chooses B's input $m$ such that $(u, b) = H(m \,\|\, 0)$ with a square $u$. According to Protocol 2, B will now execute Decompress($u, b$) as subroutine of HashToCurve. Assume A's instruction skip in step A-2 is successful and Line 5 of Decompress is skipped. Because $a_4 = 0$, in this case, the value of $v$ in Line 6 of Decompress will be $v = u^3$. Because $u$ is a square in $\mathbb{F}_q^*$, so is $v$. Hence, the output of the erroneous execution of Decompress will be $\tilde{U} = (u, (-1)^b u\sqrt{u})$. This point is an element of $\tilde{E} : y^2 = x^3$. Furthermore, with $u \in \mathbb{F}_q^*$ it holds that $u \neq 0$ and hence $\tilde{U}$ is not the singular point $(0, 0)$.

If B is IPV, the algorithm for ECSM used by B does not depend on $a_6$. Hence, B will perform ECSM with the co-factor in Line 8 of HashToCurve and ECSM with $s$ in step 3 of Protocol 2 on $\tilde{E}$. With $\tilde{P} = c\tilde{U} \in \tilde{E}$ it follows that B will compute $\tilde{Q} = s\tilde{P} \in \tilde{E}$. Because B is assumed to be IPV1, B will return $\tilde{Q}$, even if it is not on $E$.
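Steps A-1 to A-3 and the faulty execution described so far can be simulated on a toy instance. The sketch below is an illustration under stated assumptions, not the authors' implementation: it uses a small prime field with $q = p$ and $p \equiv 3 \pmod 4$, co-factor $c = 1$, a toy secret, and affine chord-and-tangent formulas that use $a_4$ but never $a_6$. Step A-1 is modeled by directly picking a square $u$ instead of searching for a message. For the key recovery it applies the map $\varphi^+$ from Theorem 1 to both $\tilde{Q}$ and a recomputed $\tilde{U}$, rather than the closed form of step A-3, so that the result does not depend on the sign convention of the toy square-root routine.

```python
P_MOD = 1019   # toy prime, P_MOD % 4 == 3 (assumption, not from the paper)
A4 = 0         # j-invariant 0: the original curve is y^2 = x^3 + a6
C = 1          # toy co-factor (assumption)

def sqrt_mod(v, p=P_MOD):
    """Root of v with LSB 0, or None if v is a non-square."""
    r = pow(v, (p + 1) // 4, p)
    if r * r % p != v % p:
        return None
    return r if r % 2 == 0 else p - r

def faulty_decompress(x, b, p=P_MOD):
    """Decompress from Fig. 1 with Line 5 (v <- v + a6) skipped."""
    v = x * x % p
    v = (v + A4) % p
    v = v * x % p
    # Line 5 is skipped by the simulated fault, so v = x^3 here.
    root = sqrt_mod(v, p)
    if root is None:
        return None
    return (x, root if b == 0 else (-root) % p)

def add(P, Q, p=P_MOD):
    """Affine point addition; depends on a4 only. None represents O."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if P == Q:
        lam = (3 * x1 * x1 + A4) * pow(2 * y1 % p, -1, p) % p
    else:
        lam = (y2 - y1) * pow((x2 - x1) % p, -1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def ecsm(n, P):
    """Double-and-add scalar multiplication."""
    R = None
    while n > 0:
        if n & 1:
            R = add(R, P)
        P = add(P, P)
        n >>= 1
    return R

def phi_plus(P, p=P_MOD):
    """Map (x, y) -> x / y and O -> 0 from Theorem 1, case 1."""
    if P is None:
        return 0
    x, y = P
    return x * pow(y, -1, p) % p

secret = 421   # B's secret s (toy value)

# A-1: choose an input whose x-coordinate u is a non-zero square.
u, b = next((x, 1) for x in range(2, P_MOD) if sqrt_mod(x) is not None)
# A-2: B decompresses under the fault, then multiplies by c and by s.
U_f = faulty_decompress(u, b)
Q_f = ecsm(secret, ecsm(C, U_f))
# A-3: transfer the DLOG to the additive group and divide.
s1 = phi_plus(Q_f) * pow(C * phi_plus(U_f) % P_MOD, -1, P_MOD) % P_MOD
```

In this toy setting the secret is smaller than $p$, so `s1` alone is the candidate; with $r < 2p$ as in Theorem 5, the second candidate would be `s1 + P_MOD`.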

Now we use the group isomorphism $\varphi^+ : E_{ns}(\mathbb{F}_q) \to \mathbb{F}_q^+$, $(x, y) \mapsto x/y$ from Theorem 1 to map the DLOG instance to $\mathbb{F}_q^+$. We obtain the two equations

$$\begin{aligned} \varphi^+(\tilde{Q}) &= x_Q / y_Q \\ \varphi^+(\tilde{Q}) &= s\,\varphi^+(\tilde{P}) = s c\,\varphi^+(\tilde{U}) = s c\,(-1)^b / \sqrt{u}. \end{aligned} \tag{4}$$

If we solve them for $s$ we obtain $s = (-1)^b \sqrt{u}\, x_Q / (y_Q c) \bmod d$, where $d$ is the order of $\varphi^+(\tilde{P}) = c(-1)^b/\sqrt{u}$ in $\mathbb{F}_q^+$. Because $\sqrt{u} \neq 0$, the order of $(-1)^b/\sqrt{u}$ is $p$. With $p \nmid \#E(\mathbb{F}_q) = rc$, $p$ does not divide $c$ either, and hence $c/\sqrt{u}$ also has order $p$. Because we assumed that $r \mid \#E(\mathbb{F}_p)$ the Hasse bound implies $r < 2p$. Therefore, either $s_1 = s \bmod r$ or $s_2 = s \bmod r$.

With respect to the time complexity, note that multiplication and inversion in $\mathbb{F}_q$ can be performed in $O((\log q)^3)$.

Note that it is easy for A to choose a message $m$ in step A-1 such that it results in a square $u$. This is because in $\mathbb{F}_q$ every second element is a square. Hence, we can reasonably expect that a hash function $H$ approximately maps every second message to a square $u$.

the attack twice, because every second element in $\mathbb{F}_q$ is a square.

B. Attack on Decompress-And-Multiply for General Curves

We now present an attack on Decompress-And-Multiply that applies to all curves in short Weierstrass form $E : y^2 = x^3 + a_4 x + a_6$. Based on the ideas of the previous section our aim is to introduce an error such that the ECSM is performed on the singular curve $\tilde{E} : y^2 = x^3 + a_4 x + \tilde{a}_6$ with discriminant $\Delta(\tilde{E}) = -16(4a_4^3 + 27\tilde{a}_6^2) = 0$.

For efficiency reasons, nearly all standardized curves with $a_4 \neq 0$ have $a_4 = -3$. It follows from $\Delta(\tilde{E}) = 0$ that the corresponding singular curves are given as $\tilde{E} : y^2 = x^3 - 3x \pm 2$. To make our description more concrete, we focus on these curves throughout this section but our attack can also be generalized to other curves.

To describe our attack, we define the set $F \subset \mathbb{F}_q$ of faults that we can introduce at the computation $v \leftarrow v + a_4$ in Line 3 of Decompress. We add $\delta$ to $F$ if we are able to perform an instruction skip fault that modifies $v \leftarrow v + a_4$ into $v \leftarrow v + a_4 + \delta$. For example in the case $a_4 = -3$, the OpenSSL implementation computes $x^3 - 3x + a_6$ as
Remark 1. The same attack applies if B uses point compres- x3 − (2x + x) + a6 . Now assume we are able to skip the
sion for the output Q. The reason is that point compression, addition of x or the addition of −(2x + x) = −3x. Then
different from decompression, does not use a4 and a6 of either x3 − 2x + a6 or x3 + a6 is computed and we obtain
the original curve E. Hence, in a successful attack, B will F = {1, 3}.
output the compression of Q̃ ∈ Ẽ. Then, Q̃ can be recovered In the attack, A takes the role of the attacker and performs
in step A-3 of the attack based on the equation of Ẽ. the following steps in the protocol Decompress-And-Multi-
ply with B:
We see from the proof of Theorem 5 that we recover s
only modulo p. Hence, it is crucial for the attack to work A-1 If F is empty fail.
over fields of large characteristic p with r|#E(Fp ). But if A-2 Remove an element δ from F and define xi =
this is the case, one run of the protocol is enough to recover ((−1)i 2 − a6 )/δ for i = 0, 1.
s ∈ Z/rZ up to one bit that selects between s1 and s2 . A-3 Select i ∈ {0, 1} such that x3i + (a4 + δ)xi + a6 is a
Hence, the attack is also applicable for protocols where s is square in F∗q . If no such i exists go back to step A-1.
a nonce or an ephemeral key that is refreshed in each run A-4 Set b = 0 and send (xi , b) as input to B.
of the protocol. A-5 While B decompresses (xi , b), mount an instruction
skip fault at Line 3 of Decompress in order to replace
Later, in Section IV, we perform the attack of this section
the addition with a4 by an addition with a4 + δ.
in practice. There we also provide numerical examples for 3 − 3x + (−1)i 2, P̃ = (x , y ), x =
A-6 Define yi = x√ i i i i S
the concrete analysis.
(−1) , and α = 3xS . With φ∗ from Theorem 1 output
i
Remark 2 (Decompress-And-Multiply). The attack directly DLOG s of φ∗ (Q̃) to the basis φ∗ (P̃ ) in Fq (α)∗ as
applies to the protocol Decompress-And-Multiply because key candidate.
in this protocol, A has strong control over the input of
Decompress and is able to choose (u, b) ∈ Fq × {0, 1} Theorem 6. If B is IPV1 (see Definition 4) with secret s
with a square u as B’s input. and if A’s instruction skip in step A-5 successfully replaces
the addition with a4 = −3 in Line 3 of Decompress by
Remark 3. The attack also applies to random point sampling.
addition with a4 + δ, then A s output fulfills s = s mod d
To see this, note that the standard approach (cf. [31], Ap-
where d is the order of φ∗ (P̃ ) in the multiplicative group
pendix 11.1) to select an element from E(Fq ) uniformly at
Fq (α)∗ .
random is very similar to HashToCurve. Instead of hashing
the message m, (u, b) ∈ Fq × {0, 1} is sampled uniformly Proof: If the instruction skip in Line 3 of Decompress
at random until it is a valid compression. Then (u, b) is is successful, and the addition with a4 in Line 3 of
decompressed to obtain a point on the curve. Hence, we Decompress is replaced by addition with a4 + δ, the value
can attack Decompress on input (u, b) as before. If u is a of v in Line 6 of Decompress will be xi (x2i + a4 + δ) + a6 .
square, we are successful. On expectation, we have to repeat Then, the check in step A-3 guarantees that this is a square

77
and hence, the output of Decompress will be (x_i, u) with u = (−1)^b √(x_i(x_i^2 + a_4 + δ) + a_6).

We will now show that u = y_i and hence that Decompress will output P̃. With A's choice x_i = ((−1)^i 2 − a_6)/δ and with a_4 = −3, we obtain

    u^2 − y_i^2 = δ x_i + a_6 − (−1)^i 2
                = δ · ((−1)^i 2 − a_6)/δ + a_6 − (−1)^i 2 = 0.

With b = 0 it follows that u = y_i and hence Decompress outputs P̃.

From the definition of y_i, we see that this point is on the curve Ẽ : y^2 = x^3 − 3x + (−1)^i 2. The discriminant of Ẽ is 0 and it follows that Ẽ is singular. Because step A-3 ensures y_i ≠ 0 it follows from Theorem 1 that P̃ is smooth and hence P̃ ∈ Ẽ_ns(F_q). Furthermore, if B is IPV1, it follows that B will compute and output Q̃ = sP̃ ∈ Ẽ_ns.

Finally, in step A-6, A will compute the DLOG in the subgroup of F_q(α)^* ⊆ F_{q^2}^* that is generated by φ∗(P̃). With Theorem 1 and because the order of P̃ is d, A's output s′ will satisfy s′ = s mod d.

We see that Theorem 6 is not as explicit as Theorem 5 in two aspects. First, it does not guarantee that the order of φ∗(P̃) is large as it has been for the order of φ+(P̃) in Section III-A. It follows that we can not guarantee that enough bits about s are recovered by the attack. Secondly, the sub-exponential complexity for computing the DLOG in F_q(α)^* can only be shown based on heuristic assumptions. Nevertheless, for our practical attacks we can assume that d ≈ p and that the complexity of DLOG in F_q^* and F_{q^2}^* is L_q(1/3) and L_{q^2}(1/3), respectively.

To demonstrate that the attack is efficient for parameters of practical relevance, we performed the analysis for the curve secp192r1 from [18] that is defined over a 192 bit prime field F_q. We were able to compute the DLOG in F_q^* with the function znlog of Pari/GP in 39 hours on one core of a 64 bit Intel Core i5 CPU with 8 GB RAM. The complete numerical example can be found in the appendix.

Remark 4. The attack also applies to protocols where A is able to provide an uncompressed input P to B and instead, B aborts if P is not on E. In this case, B will validate that y^2 = x^3 + a_4 x + a_6 holds for the input P from A. In an attack, A provides the input P̃ = (x_i, y_i) from step A-6 as B's input. Then A attacks the addition with a_4 x_i in the check y_i^2 = x_i^3 + a_4 x_i + a_6 such that B checks if y_i^2 = x_i^3 + (a_4 + δ)x_i + a_6 holds. From the proof of Theorem 6 it follows that the check will pass for the point P̃.

Note that Remark 1 from Section III-A for the case where B outputs Q in compressed form applies also for the attack in this section.

C. Comparison of Attacks

Table I shows a summary of our attacks. We see that the attack for curves with j-invariant j = 0 from Section III-A applies to a broader class of protocols compared to the attack for general curves from Section III-B. For curves with j = 0, our attack can be used to attack the protocols Decompress-And-Multiply and Hash-And-Multiply defined in Protocol 1 and Protocol 2, respectively. Furthermore, it can also be used to attack random point sampling (cf. Remark 3). For general curves the attack only applies to the protocol Decompress-And-Multiply because the attacker needs strong control over the input x of Decompress. If x is the image of the hash function H in HashToCurve this control is not given.

With respect to efficiency, the attack performs also better for curves with j = 0 than for general curves. Here, the reason is that in the former case, the DLOG problem is reduced to F_q^+ and in the latter case, the DLOG problem is reduced to F_q^* or F_{q^2}^*. In F_q^+, computing the DLOG is a trivial inversion while in F_q^* or F_{q^2}^*, sub-exponential index calculus algorithms like the number field sieve are required. Especially if α = √(3x_S) from Theorem 1 is not in F_q, the DLOG problem is reduced to F_{q^2}^*. Here, computing the DLOG is cheaper than on the original curve, but depending on the size of q, it still might be too expensive in practice.

D. Application to Real-World Protocol Instantiations

In the next two subsections, we show the relation of our abstract protocols to real-world protocols.

1) Schemes of the Hash-And-Multiply Type: In Section III-A we demonstrated a realistic attack on Hash-And-Multiply for curves with j = 0. We also explained that these curves are often used for the instantiation of pairing based schemes. In this section, we present specific pairing based signature schemes where a signature request to party B with secret key s under a message m can be modeled as an invocation of Hash-And-Multiply.

First note, that the secret key extraction protocol of the pairing based Boneh-Franklin IBE scheme [1] is also of this type: If A asks for a secret key under identity ID ∈ {0, 1}*, the private key generator B with master secret key s ∈ Z/rZ computes P_ID = HashToCurve(ID) and returns Q_ID = sP_ID as B's private key. Because secret key extraction should be performed in a protected environment, it is not very realistic that a fault attack can be mounted.

But as Naor pointed out [1], the key extraction of a pairing based IBE scheme can be transformed into a secure signature algorithm by interpreting the identity ID as the message, the master secret key s as the signing key, and the ID's private key Q_ID as signature under message ID. This is sometimes called the Naor transform. It has been applied to obtain signature schemes with many interesting properties that are of the Hash-And-Multiply type. Examples include BLS short signatures [23], threshold signatures and multisignatures [3], aggregate signatures [35], and signcryption [24]. These schemes use that pairings provide gap Diffie-Hellman groups, i.e., groups where the computational Diffie-Hellman
problem is hard but the decisional Diffie-Hellman problem is easy.

Table I
COMPARISON OF OUR SINGULAR CURVE ATTACK ON CURVES OVER F_q.

    Curve                     E : y^2 = x^3 + a_6 (j-invariant 0)   E : y^2 = x^3 + a_4 x + a_6
    Section                   III-A                                 III-B
    Decompress-And-Multiply   yes                                   yes
    Hash-And-Multiply         yes                                   no
    Random point sampling     yes                                   no
    DLOG problem reduced to   F_q^+                                 F_q^* or F_{q^2}^*
    Complexity                log(q)^3                              L_q(1/3) or L_{q^2}(1/3)

Because of its simplicity and its popularity, we look at the BLS signature scheme from [23]. We adapt the notation to additive groups on elliptic curves and make the instantiation of groups more explicit:

Definition 7 (BLS Signatures). The BLS signature scheme consists of three polynomial time algorithms that are defined as follows:
1) Setup: On input a security parameter 1^n select domain parameters D = (F_q, E, G, r, c), groups G_1, G_2 ⊆ E(F_q)[r], and G_T ⊆ F_q^* of prime order r. Furthermore select a bilinear map e : G_1 × G_2 → G_T, and a hash function HashToCurve : {0, 1}* → G_1. Select generators P_1 ∈ G_1 and P_2 ∈ G_2. Pick s uniformly at random from Z/rZ and compute A = sP_2. Output

    pk = (G_1, G_2, G_T, P_1, P_2, e, HashToCurve, A)        (5)

as public key and sk = (pk, s) as secret key.
2) Sign: On input a message m ∈ {0, 1}* and a secret key sk = (pk, s) compute P = HashToCurve(m) ∈ G_1 and output σ = sP as signature.
3) Verify: On input a message m, a signature σ ∈ G_1, and a public key pk of the form (5), output 1 if and only if e(σ, P_2) = e(HashToCurve(m), A).

Note that the scheme is of type Hash-And-Multiply: To generate a signature, first, the input m ∈ {0, 1}* is hashed to the curve by means of HashToCurve. The result is used as base-point for the ECSM with the secret key s. Finally, the result of the ECSM is released as the signature.

Furthermore, note that signatures consist of just one G_1 element. As it was analyzed in [25], this scheme is very efficient if it is instantiated with BN curves. In this case, G_1 = E(F_p)[r] and with point compression signatures are only of size log p, i.e., approximately half the size of elliptic curve digital signature algorithm (ECDSA) signatures.

Another signature scheme that can be interpreted as Hash-And-Multiply, and that does not result from the Naor transform, is the identity based signature scheme from [36]. Here, an identity ID has a secret key D_ID ∈ E. To compute a signature for ID under message m we first draw a nonce s ∈ Z/rZ uniformly at random and then compute P = HashToCurve(ID), Q = sP, and V = (s + H(m, Q))D_ID. The signature is the tuple σ = (Q, V). We can interpret Q as the output of the protocol Hash-And-Multiply on input ID. If P = HashToCurve(ID) was not pre-computed offline, we are able to apply our attack to learn the nonce s and recover the secret key D_ID from V and H(m, Q). Note that the attack on the nonce is possible here, because our attack requires only a single signature request.

2) Schemes of the Decompress-And-Multiply Type: We showed in Section III-A and Section III-B that protocols of the type Decompress-And-Multiply, including key-agreement protocols, key transport protocols, and encryption schemes, are vulnerable to our attack. Different from the attack on Hash-And-Multiply, this holds independent from the instantiation of the curve. Hence, our attack is in principle relevant for standardized ECC protocols like the elliptic curve Diffie-Hellman scheme (ECDH) and the elliptic curve integrated encryption scheme (ECIES).

Note that we assumed in the definition of Decompress-And-Multiply that the attacker can access the result Q = sP of the ECSM with the secret key s. It turns out that most applications of the Decompress-And-Multiply type do not directly release Q. Instead they apply a cryptographic hash function as part of a KDF to obtain a symmetric ephemeral key. Hence, the output of a decryption or key agreement operation will not necessarily give an attacker access to Q. Note that the same limitation applies to previous attacks like the invalid point attack on ElGamal from [6]. Still, there might be scenarios where the attacker is able to obtain Q. For example assume that the long term key s is stored on a smart-card that releases Q to an unprotected processor that derives an ephemeral key from Q. In this case an attacker with physical access to the smart-card is able to apply the attack.

Further examples of the type Decompress-And-Multiply that actually do release Q include the blind signature scheme from [3]. Hence, this scheme is vulnerable to our attack without further assumptions.

E. Countermeasures

The first option to defeat fault injection is to detect faults by the hardware. In our case of instruction skips that result
from clock glitching, we can try to detect the clock glitches by sensors. Furthermore, redundancy can be added to detect the modification of instructions. Both techniques are well known [32] and are used in tamper resistant hardware.

Since hardware mechanisms are not always available and also because they sometimes only help against special injection techniques, software countermeasures are also required. Several software countermeasures against attacks on ECC based schemes are known. For a survey, see [4]. Especially against invalid point attacks, a point validity check of the result Q was proposed. It basically evaluates (1) for Q to verify that Q is on E. If the verification fails, the result Q is discarded and a failure message is returned. It was observed already in the context of RSA, that the conditional branch of this check can be attacked with a second fault [10]. This is exactly what we do in our practical attack in Section IV.

Furthermore, we argue that it is very difficult to implement the validity check in a secure way and that it is not enough to protect the check's branch instruction. To give just one example, we could inject the very same fault that we used during decompression of P on the evaluation of (1) during the check of Q. Then, Q would pass the test exactly in the case where it is on the singular curve (cf. Remark 4).

To deal with attacks on the branch instruction, in [11] another method has been proposed to protect against second order fault attacks on ECC. The idea is to mask P with a random point R prior to the ECSM and remove the mask after ECSM. For example, we may compute the output Q = sP as Q = s(P + R) − sR with R uniform at random. Then, if a fault modifies the masked P into P̃ ∉ E, P̃ and R are on different curves. Hence, the addition with R and subtraction with sR are not valid group operations on Ẽ and the mask will not cancel out. Then Q will depend on the unknown R.

As noted in Remark 3, sampling R ∈ E(F_q) is vulnerable to our attack from Section III-A for curves with j = 0 because standard sampling algorithms use Decompress as a building block. To circumvent the masking countermeasure in this case, we apply a second order attack. With the first fault, we attack the decompression of P to obtain P̃ on the singular curve Ẽ. With the second fault, we attack the sampling of R to obtain R̃ ∈ Ẽ. If both faults are successful all subsequent group operations will be performed on the singular curve. Hence, the output of the masked computation will be Q̃ = s(P̃ + R̃) − sR̃ = sP̃ ∈ Ẽ. Then we can apply our reduction to the DLOG problem in F_q^+ as in Section III-A.

As a conclusion, we see that software countermeasures have to be implemented very carefully because they can be manipulated with a second fault. This is especially true for curves j(E) = 0, because in this case, even countermeasures based on randomization are vulnerable to our attack.

IV. PRACTICAL EVALUATION

In this section, we present our practical realization of the attack of Section III-A. We perform a second order attack on an IPV2 implementation. We use the first fault to obtain a point on a singular curve and the second fault to skip a validity check of the output point. We give the necessary background on our setup, its concrete implementation, and the details of the attack.

A. Clock Glitching Setup

CPU clock glitching is the mechanism of altering the code execution by clocking the CPU outside its specification for a short period of time. We use the same setup that was used in [37] and we refer the reader to [37] for additional details.

The setup consists of three main components: the glitcher, the host system, and the target. A block diagram of the setup is shown in Fig. 3.

[Figure 3: block diagram with the blocks Glitcher (queue of timeouts 1 to 256, timer with reset, 33 MHz and 99 MHz clocks), Host (*.log, *.py; configure and trigger signals), and Target (serial_io, CPU; clock and reset inputs).]
Figure 3. Simplified block diagram of our setup. The host configures the glitcher, which generates the glitches on the external clock of the target device. The target executes the program under attack.

The glitcher is a DDK [38] which consists of an FPGA and an ARM CPU. It is used to generate the external clock for the target device and to generate the glitches on the clock signal. The host system is a standard PC that configures the glitcher and acquires the output of the device under attack. The target, an AVR Xmega A1, executes the attacked program.

Basically, the clock glitching mechanism works as follows. The glitcher uses two internal clocks, a low frequency clock at 33 MHz and a high frequency clock at 99 MHz. The FPGA of the glitcher implements a 32-bit timer running at 33 MHz. The timeout values of the timer are loaded from a queue that is filled by the host. When the timeout on top of the queue is reached it is removed from the queue and the target clock switches from 33 MHz to 99 MHz for a defined number of clock cycles. The effect is a controlled
overclocking of the target. With a queue of size 256, also higher order fault attacks are possible. For synchronization with the target, we use the reset input of the timer.

B. The Target Implementation

For the concrete pairing implementation we used the RELIC toolkit [39]. It includes C implementations of finite field arithmetic, ECC, and PBC for different hardware platforms. Especially it provides an implementation of the BLS signature scheme. For our attack, we use RELIC version 0.3.5 without modifications of the source code. We compile the library with the avr-gcc toolchain and optimization level -O1.

We use BN curves to instantiate the corresponding groups G_1, G_2, and G_T from the scheme in Definition 7. BN curves are natively supported by RELIC. In RELIC, hashing to the curve is implemented similar to HashToCurve by the function ep_map. This function uses the function ep_rhs as a sub-program to compute x^3 + a_4 x + a_6 in a way similar to Line 2–Line 5 of Decompress. The function ep_rhs itself uses the function fp_add_dig to add x^3 + a_4 x and a_6 in Line 5 of Decompress.

As a basic protection against fault attacks, we implemented the point validation countermeasure that was discussed in Section III-E: Before the result of an ECSM is released, the implementation checks that it is a valid point in the sense of Definition 3. If not, the device responds with an error message. Hence, following Definition 4, our implementation is IPV2. The check is implemented with an if statement and with no special protection against attacks.

C. The concrete Attack

If A asks B for a BLS signature under a message m and B returns a signature for m under his private key s to A, then this is an instance of Hash-And-Multiply from Protocol 2 with P = HashToCurve(m) and Q = σ. Furthermore, if we assume that the domain parameters of B specify BN curves that have j-invariant j = 0 we can apply the attack from Section III-A on Hash-And-Multiply. In this section, we provide the necessary details.

1) Outline: Now, we outline our attack. We use two faults to attack the IPV2 implementation. We use the first fault to obtain a point on a singular curve. With the second fault, we skip the validity check of the output. A, in the role of an attacker, asks for a BLS signature of B under a selected message m. While B computes P = HashToCurve(m) as part of the signature generation, A introduces the first instruction skip fault. More specifically, A places the fault within ep_rhs that corresponds to the addition with a_6 in Line 5 of Decompress. This results in P̃ on the singular curve Ẽ. Because BN curves have prime order the co-factor will be c = 1 and HashToCurve will output P̃. Finally, B will compute a faulty signature Q̃ = sP̃ ∈ Ẽ. Then B will check the validity of Q̃. With the second instruction skip fault, A manipulates the validity check such that B outputs the invalid point Q̃. Hence, a successful attack will provide A with the DLOG instance Q̃, P̃ ∈ Ẽ.

Table II
AVR-GCC GENERATED ASSEMBLY SNIPPET OF COMPUTATION x_P^3 + a_4 x_P + a_6.

     1  .ep_rhs:
     2      ...
     3      /*init pointer to a_6*/
     4      movw r30, r24
     5      /*load a_6 into r20*/
     6      ld r20, Z
     7      /*init pointer to v=x^3+a_4 x*/
     8      movw r22, r28
     9      subi r22, 0xEB
    10      sbci r23, 0xFF
    11      movw r24, r22
    12      /*call addition of v and a_6*/
    13      call 0x38a6 ; 0x38a6 <fp_add_dig>
    14      rjmp .+18
    15      ...

2) Target Instruction: To understand how we attack the addition with a_6, we refer to Table II. It shows avr-gcc generated assembly code of our RELIC based implementation. Concretely, this part corresponds to Line 5 of Decompress on input (u, b) = H(m ‖ i). Hence, with a_4 = 0 it will add v = u^3 and a_6. In Line 4 of Table II, a pointer to a_6 is loaded into the pointer register Z (r31:r30). Then, the value of a_6 is loaded into register r20. In Line 8–Line 10 the registers r22 and r23 are initialized with a pointer to v = u^3. In Line 11 the pointer is copied to registers r24 and r25. Then the function fp_add_dig is called that adds the value of a_6 in r20 to the variable that is referenced by the pointer in r23:r22. The result is written to the address in r25:r24. Because both, r23:r22 and r25:r24 point to the same variable v, the function fp_add_dig overwrites v with v + a_6 = u^3 + a_6.

Now, an instruction skip fault that removes the call instruction in Line 13 will prevent an update of v that is referenced by r25:r24. Hence, it virtually skips the addition u^3 + a_6 in Line 5 of Decompress. If the message m was chosen properly, u is a square, and hence ep_map that corresponds to HashToCurve will output P̃ = (u, u√u) ∈ Ẽ.

Note that there are various other instructions that, if skipped, would also have the effect of virtually skipping the addition with a_6. For example we could tamper with the target pointer in r25:r24 with the effect that a_6 is added to data at another irrelevant address. A further possibility in our case, where a_6 fits into one word, is to skip the actual arithmetic add instruction within fp_add_dig. In fact, we found many different instruction cycles for the targeted implementation where a fault had the effect of skipping the addition with a_6.
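The two faults from the outline can be modelled in a few lines of Python. This is only a sketch under stated assumptions: the toy prime 1000003, SHA-256 as a stand-in for H, the message format, and every function name are hypothetical and unrelated to RELIC, and the two boolean flags merely emulate the instruction skips. The key candidate is computed with the map (x, y) ↦ x/y from Theorem 1.

```python
import hashlib

P, A4, A6 = 1000003, 0, 17      # toy field (P % 4 == 3) and curve y^2 = x^3 + 17

def sqrt_mod(v):
    y = pow(v, (P + 1) // 4, P)
    return y if y * y % P == v % P else None

def add(p1, p2):
    """Affine addition that never uses A6; None is the point at infinity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % P == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 + A4) * pow(2 * y1 % P, -1, P) % P
    else:
        lam = (y2 - y1) * pow((x2 - x1) % P, -1, P) % P
    x3 = (lam * lam - x1 - x2) % P
    return (x3, (lam * (x1 - x3) - y1) % P)

def ecsm(s, pt):
    acc = None
    while s:
        if s & 1:
            acc = add(acc, pt)
        pt = add(pt, pt)
        s >>= 1
    return acc

def h(m, i):
    """Toy hash H(m || i) -> (u, b); SHA-256 is an assumption."""
    d = hashlib.sha256(m + bytes([i])).digest()
    return int.from_bytes(d, "big") % P, d[0] >> 7

def hash_to_curve(m, skip_add_a6=False):
    i = 0
    while True:
        u, b = h(m, i)
        v = (u * u * u + A4 * u) % P
        if not (skip_add_a6 and i == 0):   # first fault: skip "v + a_6" once
            v = (v + A6) % P
        y = sqrt_mod(v)
        if y:                              # y exists and is nonzero
            return (u, P - y if b else y)
        i += 1

def sign(m, s, skip_add_a6=False, skip_check=False):
    q = ecsm(s, hash_to_curve(m, skip_add_a6))
    valid = q is not None and (q[1] ** 2 - q[0] ** 3 - A4 * q[0] - A6) % P == 0
    if not valid and not skip_check:       # second fault: skip this branch
        return None                        # device answers with an error
    return q

def choose_message():
    """Attacker picks m such that u from H(m || 0) is a nonzero square."""
    n = 0
    while True:
        m = b"msg-%d" % n
        u, _ = h(m, 0)
        if u and sqrt_mod(u) is not None:
            return m
        n += 1

def recover(m, q):
    """Key candidate phi+(Q~) / phi+(P~) with phi+(x, y) = x / y."""
    x, y = hash_to_curve(m, skip_add_a6=True)   # attacker recomputes P~
    return q[0] * pow(q[1], -1, P) * y * pow(x, -1, P) % P

secret = 424242
m = choose_message()
q = sign(m, secret, skip_add_a6=True, skip_check=True)
print(recover(m, q) == secret)
```

In this model a single fault is caught by the validity check, while both faults together release Q̃ and the scalar is obtained modulo the toy prime with two inversions and three multiplications.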
For the second fault that we used to skip the validity check of Q, we also had several options for the target instruction. For our naive implementation, we were able to eliminate the check by simply skipping an rjmp instruction that bypasses the signature output for invalid points.

We conclude that it is difficult to identify and protect vulnerable instructions and that a validity check has to be implemented very carefully to be effective. This is especially true because for both faults, we had several options for choosing the target instructions.

3) Timing the Clock Glitches: In our simplified setup, the target generates a synchronization signal after reading the signature request. This signal is used to reset the timer of the glitcher. For the attack, we then had to learn approximate timings t_1 and t_2 of the two target instructions relative to this signal.

To learn t_1, we analyzed the assembly of our target implementation for a chosen message m without knowledge of the target secret key. This is possible because from HashToCurve we see that the timing t_1 of the addition with a_6, and hence the timing of the first glitch only depends on the message that is signed. In particular, it does not depend on the secret key.

To learn t_2, we computed a signature under message m on the target device with the target secret key. From the response time of the target, we estimated the timing of the validity check. Because the validity check is performed immediately before B's output, this gives a good estimate for t_2.

Finally, we tried different values for the timing of the two glitches that are close to our estimations t_1 and t_2. Once we obtain a point on the singular curve, we compute the DLOG from it.

Note that in a real attack, the synchronization signal that we use to reset the timer is not generated by the target. Instead, it has to be generated externally. For example we could infer it from the communication of the signature request.

4) Sample Parameters: Our concrete instantiation of domain parameters is the BN curve y^2 = x^3 + 17 defined over a prime field with characteristic p = 205523667896953300194896352429254920972540065223. The curve has prime order #E(F_p) = r = 205523667896953300194895899082072403858390252929 and hence c = 1.

For the message signed by B, we chose the ASCII representation of m = “Hello world!”. With RELIC's implementation of the hash function H that is used in HashToCurve, this message results in H(m ‖ 0) = (u, b) with √u ∈ F_p. Then with c = 1 and b = 0, the point P̃ = (x_P, y_P) with x_P = u = 178593680027287028005098471379742442193364077343 and y_P = u√u = 94660545184060711272036545917981790865316334518 is on the singular curve Ẽ.

In our first attack, the uniform selection of the secret resulted in s = 109785787802901117004740481366937765753659896978. That gives us the output Q̃ = (x_Q, y_Q) = sP̃ ∈ Ẽ with x_Q = 40532871190117267482965735164319423862598390468 and y_Q = 103085238587127999197783933077858660011787397349.

Finally, we recover the secret s based on Theorem 1:

    s_1 = φ+(Q̃) / φ+(P̃) = (y_P x_Q) / (x_P y_Q) = (√u x_Q) / y_Q mod p.

Because r < p, we only obtain one possible candidate with s_1 = s.

We repeated the attack for 10 different secret keys and were always successful in getting s with only a few signature requests. This shows that our attack is a serious threat on unprotected devices and that it has to be considered for side channel resistant implementations.

V. CONCLUSION AND FUTURE WORK

In this work, we have shown that it is possible to mount a singular curve attack on various types of cryptographic schemes. In the attack, we use instruction skip faults at point decompression to move points from an elliptic curve to a weak singular curve. In theory, our attack is able to recover the secret key with one run of the protocol. We have also shown that our attack applies to many protocols, and especially to many pairing based signature schemes. To prove the practicability of the attack, we performed it on an AVR Xmega A1.

Our attack is based on curves in short Weierstrass form. It remains to show that the attack also works for points on curves in other models, like general Weierstrass form, Montgomery form, or Edwards form. Here, besides point decompression, the conversion of point representations may offer additional vulnerabilities.

ACKNOWLEDGMENT

We would like to thank our colleagues from the SecT department of TU Berlin for their support and especially we thank Ricardo Gomes da Silva and Dmitry Nedospasov for their help with the experimental setup.

REFERENCES

[1] D. Boneh and M. K. Franklin, “Identity-based encryption from the Weil pairing,” in Proceedings of Advances in Cryptology - CRYPTO 2001, ser. LNCS, vol. 2139. Springer, 2001, pp. 213–229.

[2] A. Sahai and B. Waters, “Fuzzy identity-based encryption,” in Proceedings of Advances in Cryptology - EUROCRYPT 2005, ser. LNCS, vol. 3494. Springer, 2005, pp. 457–473.

[3] A. Boldyreva, “Threshold signatures, multisignatures and blind signatures based on the Gap-Diffie-Hellman-Group signature scheme,” in Proceedings of Public Key Cryptography - PKC 2003, ser. LNCS, vol. 2567. Springer, 2003, pp. 31–46.
[4] J. Fan and I. Verbauwhede, “An Updated Survey on Secure ECC Implementations: Attacks, Countermeasures and Cost,” in Cryptography and Security: From Theory to Applications, ser. LNCS, D. Naccache, Ed. Springer Berlin Heidelberg, 2012, vol. 6805, pp. 265–282.

[5] A. Barenghi, L. Breveglieri, I. Koren, and D. Naccache, “Fault injection attacks on cryptographic devices: Theory, practice, and countermeasures,” Proceedings of the IEEE, vol. 100, no. 11, pp. 3056–3076, 2012.

[6] I. Biehl, B. Meyer, and V. Müller, “Differential Fault Attacks on Elliptic Curve Cryptosystems,” in Proceedings of Advances in Cryptology - CRYPTO 2000, ser. LNCS, vol. 1880. Springer, 2000, pp. 131–146.

[7] M. Ciet and M. Joye, “Elliptic curve cryptosystems in the presence of permanent and transient faults,” Designs, Codes and Cryptography, vol. 36, pp. 33–43, 2005.

[8] A. Antipa, D. R. L. Brown, A. Menezes, R. Struik, and S. A. Vanstone, “Validation of elliptic curve public keys,” in Proceedings of Public Key Cryptography - PKC 2003, ser. LNCS, vol. 2567. Springer, 2003, pp. 211–223.

[9] P.-A. Fouque, R. Lercier, D. Réal, and F. Valette, “Fault attack on elliptic curve Montgomery ladder implementation,” in Proceedings of Fault Diagnosis and Tolerance in Cryptography - FDTC 2008. IEEE Computer Society, 2008, pp. 92–98.

[10] S.-M. Yen, S. Kim, S. Lim, and S.-J. Moon, “RSA speedup with Chinese Remainder Theorem immune against hardware fault cryptanalysis,” IEEE Trans. Comput., vol. 52, no. 4, pp. 461–472, Apr. 2003.

[11] A. Dominguez-Oviedo and M. A. Hasan, “Algorithm-level error detection for Montgomery ladder-based ECSM,” Journal of Cryptographic Engineering, vol. 1, no. 1, pp. 57–69, 2011.

[12] A. J. Menezes, T. Okamoto, and S. A. Vanstone, “Reducing Elliptic Curve Logarithms to Logarithms in a Finite Field,” IEEE Trans. Inf. Theory, vol. 39, no. 5, pp. 1639–1646, 1993.

[13] K. Karabina and B. Ustaoglu, “Invalid-curve attacks on (hyper)elliptic curve cryptosystems,” Advances in Mathematics of Communications, pp. 307–321, 2010.

[14] R. P. Gallant, R. J. Lambert, and S. A. Vanstone, “Faster point multiplication on elliptic curves with efficient endomorphisms,” in Proceedings of Advances in Cryptology - CRYPTO 2001, ser. LNCS, vol. 2139. Springer, 2001, pp. 190–200.

[15] S. D. Galbraith and M. Scott, “Exponentiation in pairing-friendly groups using homomorphisms,” in Proceedings of Pairing-Based Cryptography 2008, ser. LNCS, vol. 5209. Springer, 2008, pp. 211–224.

[16] C. Costello, T. Lange, and M. Naehrig, “Faster Pairing Computations on Curves with High-Degree Twists,” in Proceedings of Public Key Cryptography - PKC 2010, ser. LNCS, vol. 6056. Springer, 2010, pp. 224–242.

[17] M. Joye and G. Neven, Eds., Identity-Based Cryptography, ser. Cryptology and Information Security. IOS Press, 2009, vol. 2.

[18] Certicom Research, “Standards for efficient cryptography 2 (SEC 2),” Standards for Efficient Cryptography Group (SECG), Tech. Rep. Version 2.0, 2010.

[19] P. S. L. M. Barreto and M. Naehrig, “Pairing-friendly elliptic curves of prime order,” in Proceedings of Selected Areas in Cryptography 2005, ser. LNCS, vol. 3897. Springer, 2006.

[20] E. Kachisa, E. Schaefer, and M. Scott, “Constructing Brezing-Weng Pairing-Friendly Elliptic Curves Using Elements in the Cyclotomic Field,” in Proceedings of Pairing-Based Cryptography 2008, ser. LNCS, vol. 5209. Springer, 2008, pp. 126–135.

[21] P. Barreto, B. Lynn, and M. Scott, “Constructing elliptic curves with prescribed embedding degrees,” in Proceedings of Security in Communication Networks - (SCN) 2002, ser. LNCS, vol. 2576. Springer, 2003, pp. 257–267.

[22] X. Boyen and L. Martin, “Identity-based cryptography standard (IBCS) #1: Supersingular curve implementations of the BF and BB1 cryptosystems,” RFC 5091, Internet Engineering Task Force, Dec. 2007.

[23] D. Boneh, B. Lynn, and H. Shacham, “Short signatures from the Weil pairing,” Journal of Cryptology, vol. 17, no. 4, pp. 297–319, 2004.

[24] B. Libert and J.-J. Quisquater, “Efficient signcryption with key privacy from gap Diffie-Hellman groups,” in Proceedings of Public Key Cryptography - PKC 2004, ser. LNCS, vol. 2947. Springer, 2004, pp. 187–200.

[25] S. Chatterjee, D. Hankerson, E. Knapp, and A. Menezes, “Comparing two pairing-based aggregate signature schemes,” Designs, Codes and Cryptography, vol. 55, no. 2-3, pp. 141–167, 2010.

[26] S. D. Galbraith, Mathematics of Public Key Cryptography. Cambridge University Press, 2012.

[27] J. H. Silverman, The Arithmetic of Elliptic Curves, 2nd ed., ser. Graduate Texts in Mathematics. Springer, 2009, vol. 106.

[28] L. C. Washington, Elliptic Curves: Number Theory and Cryptography. Chapman and Hall/CRC, 2003.

[29] A. Joux, R. Lercier, N. Smart, and F. Vercauteren, “The number field sieve in the medium prime case,” in Proceedings of Advances in Cryptology - CRYPTO 2006, ser. LNCS, vol. 4117. Springer, 2006, pp. 326–344.

[30] “X9.62 public key cryptography for the financial service industry: The elliptic curve digital signature algorithm (ECDSA),” American National Standards Institute (ANSI), Tech. Rep., 1999.

[31] “IEEE standard specifications for public-key cryptography,” IEEE Std 1363-2000, pp. 1–228, Aug. 2000.

[32] H. Bar-El, H. Choukri, D. Naccache, M. Tunstall, and C. Whelan, “The sorcerer’s apprentice guide to fault attacks,” Proceedings of the IEEE, vol. 94, no. 2, pp. 370–382, Feb. 2006.

[33] J. Balasch, B. Gierlichs, and I. Verbauwhede, “An in-depth and black-box characterization of the effects of clock glitches on 8-bit MCUs,” in Proceedings of Fault Diagnosis and Tolerance in Cryptography - FDTC 2011, 2011, pp. 105–114.

[34] L. Rivière, Z. Najm, P. Rauzy, J. Danger, J. Bringer, and L. Sauvage, “High precision fault injections on the instruction cache of ARMv7-M architectures,” Cryptology ePrint Archive Report 2015/147, 2015, http://eprint.iacr.org/2015/147.

[35] D. Boneh, C. Gentry, B. Lynn, and H. Shacham, “Aggregate and verifiably encrypted signatures from bilinear maps,” in Proceedings of Advances in Cryptology - EUROCRYPT 2003, ser. LNCS, vol. 2656. Springer, 2003, pp. 416–432.

[36] J. C. Cha and J. H. Cheon, “An Identity-Based Signature from Gap Diffie-Hellman Groups,” in Proceedings of Public Key Cryptography - PKC 2003, ser. LNCS, vol. 2567. Springer, 2003, pp. 18–30.

[37] J. Blömer, R. G. da Silva, P. Günther, J. Krämer, and J. Seifert, “A practical second-order fault attack against a real-world pairing implementation,” in Proceedings of Fault Diagnosis and Tolerance in Cryptography - FDTC 2014, A. Tria and D. Choi, Eds. IEEE Computer Society, 2014, pp. 123–136.

[38] D. Nedospasov and T. Schröder, “Introducing Die Datenkrake: Programmable logic for hardware security analysis,” in Proceedings of Workshop on Offensive Technologies - WOOT 2013. USENIX Association, 2013.

[39] D. F. Aranha and C. P. L. Gouvêa, “RELIC is an Efficient LIbrary for Cryptography,” https://github.com/relic-toolkit/relic.

[40] D. Coppersmith, A. M. Odlyzko, and R. Schroeppel, “Discrete logarithms in GF(p),” Algorithmica, vol. 1, no. 1, pp. 1–15, Jan. 1986.

[41] Proceedings of Public Key Cryptography - PKC 2003, ser. LNCS, vol. 2567. Springer, 2003.

[42] Proceedings of Advances in Cryptology - CRYPTO 2001, ser. LNCS, vol. 2139. Springer, 2001.

[43] Proceedings of Pairing-Based Cryptography 2008, ser. LNCS, vol. 5209. Springer, 2008.

APPENDIX

We give a numerical example for the attack from Section III-B. Assume the curve secp192r1 from [18]. Here, the field has size $q = p = 2^{192} - 2^{64} - 1$ and the curve is given as $E : y^2 = x^3 - 3x + a_6$, with $a_6 = 2455155546008943817740293915197451784769108058161191238065$, of prime order $r = 6277101735386680763835789423176059013767194773182842284081$. For the fault, we assume $\delta = -a_4 = 3$. This models an attack that completely skips the addition in Line 3 of Decompress.

It turns out that for both choices $i = 0$ and $i = 1$, we obtain a square in step A-3. We choose $i = 0$ because in this case $x_S = (-1)^i = 1$ and $\alpha = \sqrt{3} = 2326297227347680280080327471553644541955937223163763835902$ is contained in $\mathbb{F}_q$. For $i = 0$ we obtain $\tilde{P} = (x_0, y_0)$ with $x_0 = 3366349308254805903310428310405960349132903114206486228165$ and $y_0 = 2752120119983151304310814177025133128564319916601877729887$. This point is on the singular curve $\tilde{E} : y^2 = x^3 - 3x + 2$.

For B’s secret we chose $s = 1113678563645930635777463018382920452293547545942927564878$ uniformly at random in $\mathbb{Z}/r\mathbb{Z}$ and obtain $\tilde{Q} = s\tilde{P} \in \tilde{E}$ with $x_Q = 4665052203107855484719784993687553637005598487381434479215$ and $y_Q = 928861089175744755007680841945733105078722559985213649009$.

Now, we apply Theorem 1. The images of $\tilde{P}$ and $\tilde{Q}$ under $\varphi^*$ are given as $\varphi^*(\tilde{P}) = 1544399762588312570436026623516397841564503871684736441528$ and $\varphi^*(\tilde{Q}) = 1050682542392422889161194344907633478108216148696830571786$. We used the function znlog of Pari/GP to compute $s'$ as DLOG of $\varphi^*(\tilde{Q})$ to the basis $\varphi^*(\tilde{P})$. According to the documentation, the implementation is based on the linear sieve index calculus method [40]. The computation took us 39 hours on one core of a 64-bit Intel Core i5 CPU with 8 GB RAM.

Let $d$ be the order of $\varphi^*(\tilde{P})$. In this example we obtain a ratio of $r/d \approx 2$. Hence, $s \in \{s', s' + d\}$, and in our particular example we find $s = s'$.
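
The steps of this example can be retraced on a toy instance. The following Python sketch is illustrative only and rests on several assumptions that are not taken from the paper: it replaces secp192r1 by the hypothetical small prime p = 11 (chosen because 3 is a square modulo 11), it uses the textbook isomorphism (x, y) ↦ (y + α(x − x_S)) / (y − α(x − x_S)) from the non-singular points of a nodal cubic to the multiplicative group as a stand-in for the map of Theorem 1 (whose exact normalization may differ), and it uses brute force in place of znlog. All function names and the bound r = 10 are made up for the illustration.

```python
# Toy illustration (assumed parameters, not the paper's): the singular curve
# E~ : y^2 = x^3 - 3x + 2 = (x - 1)^2 (x + 2) over F_11 with node at x_S = 1.
P_MOD = 11   # hypothetical small prime for which 3 is a square
X_S = 1      # x-coordinate of the singular point
ALPHA = 5    # 5^2 = 25 = 3 (mod 11), so ALPHA plays the role of sqrt(3)

def inv(a, p=P_MOD):
    """Modular inverse via the built-in pow (Python 3.8+)."""
    return pow(a % p, -1, p)

def on_curve(pt, p=P_MOD):
    """Check y^2 = x^3 - 3x + 2 (mod p)."""
    x, y = pt
    return (y * y - (x**3 - 3 * x + 2)) % p == 0

def add(p1, p2, p=P_MOD):
    """Affine chord-and-tangent addition with a4 = -3; None is the identity."""
    if p1 is None:
        return p2
    if p2 is None:
        return p1
    (x1, y1), (x2, y2) = p1, p2
    if x1 == x2 and (y1 + y2) % p == 0:
        return None
    if p1 == p2:
        lam = (3 * x1 * x1 - 3) * inv(2 * y1, p) % p
    else:
        lam = (y2 - y1) * inv(x2 - x1, p) % p
    x3 = (lam * lam - x1 - x2) % p
    return (x3, (lam * (x1 - x3) - y1) % p)

def mul(k, pt, p=P_MOD):
    """Double-and-add scalar multiplication."""
    acc = None
    while k:
        if k & 1:
            acc = add(acc, pt, p)
        pt = add(pt, pt, p)
        k >>= 1
    return acc

def phi(pt, p=P_MOD):
    """Assumed map to F_p^*: (y + a(x - x_S)) / (y - a(x - x_S))."""
    x, y = pt
    t = ALPHA * (x - X_S)
    return (y + t) * inv(y - t, p) % p

def dlog_candidates(g, h, r, p=P_MOD):
    """All s < r with g^s = h (mod p); brute force standing in for znlog."""
    return [s for s in range(r) if pow(g, s, p) == h]

P = (2, 2)        # a non-singular point on the toy curve
s = 8             # toy secret
Q = mul(s, P)     # the faulted device would output s * P on the singular curve
print(dlog_candidates(phi(P), phi(Q), r=10))  # [3, 8]
```

In this toy instance the image of P has order d = 5 and the bound is r = 10, so r/d = 2 and the discrete logarithm in the multiplicative group leaves exactly the two candidates s' = 3 and s' + d = 8, mirroring the final step of the example above.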
