ICISC2005
1 Introduction
Cryptographic hardware devices such as smart cards are widely used nowadays.
Because smart cards are so commonly used to implement cryptosystems, many
research results on attacking them through side channels have been published
during the past few years. This branch of cryptanalysis is usually called
side-channel analysis. The power analysis attack is an important category of
side-channel attack, originally pointed out by Kocher et al. [1], in which both
simple power analysis (SPA) and differential power analysis (DPA) were
considered.
In an SPA, the attacker observes one or a few collected power traces of
the smart card executing an algorithm and tries to identify the execution of
an instruction, or the access to a specific operand or datum, that is driven by
a part of the private key. If the observation is precise enough, the private
key can be derived from it. In a DPA, the attacker tries to verify a guess on a
part of the private key by analyzing only some specific bits of the result of a
specific intermediate step of the algorithm, where that result is a function of
the private key. To raise the signal-to-noise ratio enough to mount a
successful DPA, the attacker usually collects many more power traces than in an
SPA (see footnote 1) and partitions them into groups according to the guessed
key bits and the underlying attack design. The difference between the power
traces of the different groups is then used to verify the guess on the key
bits. Usually, a DPA is mounted by analyzing many executions of the same
algorithm with different random inputs, and in theory the attack works better
if those inputs are statistically unrelated.
Exponentiation and its analogue, point scalar multiplication on an elliptic
curve, are of central importance when implementing modern cryptosystems, as
they are the basic operation of almost all modern public-key cryptosystems,
e.g., the RSA system [2], the ElGamal system [3], and elliptic curve
cryptography [4, 5]. Therefore, many side-channel attacks on implementations of
exponentiation and point scalar multiplication, together with the related
countermeasures, have been reported in the literature.
The square-multiply-always exponentiation (or point scalar multiplication)
algorithm [6] is a well-known SPA countermeasure. It uses a simple trick to
make the algorithm execute regularly: redundant computation is introduced into
a loop iteration whenever necessary. Unfortunately, Fouque and Valette
proposed the doubling attack [7], which threatens the square-multiply-always
algorithm (more precisely, the left-to-right version of the algorithm) by
exploiting the existence of this redundant computation in a novel way.
Joye and Yen proposed an enhanced SPA countermeasure based on the Montgomery
ladder [8], which also executes regularly but rests on a totally different
idea from the original square-multiply-always exponentiation. The most notable
property of the Montgomery ladder is that the algorithm contains no redundant
computation, which also helps it resist some hardware fault attacks [9, 10].
Footnote 1: A DPA usually collects a few thousand or more power traces in order
to obtain a meaningful average power trace.
Relative Doubling Attack Against Montgomery Ladder 3
depend on the private key related instructions being executed and/or the data
being manipulated. Therefore, the side-channel information may be exploited to
mount a successful attack to retrieve the embedded private key, e.g., the private
exponent d in M^d mod n.
The classical binary exponentiation algorithm in Fig. 1 (a) includes a
conditional branch (i.e., Step (04)) that is driven by the secret data d_i. If
the two possible branches behave differently (or the branch decision operation
itself behaves distinguishably), then some side-channel analysis (e.g., simple
power analysis, SPA) may be employed to retrieve the secret data d_i. So, the
algorithms need further enhancement.
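The leak caused by the conditional branch can be illustrated with a minimal Python sketch. It assumes the usual left-to-right form of binary exponentiation; the function name and the "S"/"M" operation log are hypothetical stand-ins for what an SPA attacker would read off a power trace, not part of the original algorithm.

```python
def binary_exp_left_to_right(M, d, n):
    """Classical left-to-right binary exponentiation.

    Returns (M^d mod n, ops), where ops logs each squaring as "S" and each
    multiplication as "M" (a stand-in for an SPA-visible operation sequence).
    """
    R = 1
    ops = []
    for i in range(d.bit_length() - 1, -1, -1):
        R = (R * R) % n
        ops.append("S")
        if (d >> i) & 1:          # branch driven by the secret bit d_i
            R = (R * M) % n
            ops.append("M")
    return R, "".join(ops)


# Illustrative values only: d = 75 = (1,0,0,1,0,1,1)_2 as in the later example.
result, ops = binary_exp_left_to_right(3, 75, 1000003)
print(result == pow(3, 75, 1000003), ops)
```

In this sketch every "S" followed by "M" marks a bit d_i = 1 and every "S" followed by another "S" marks a bit d_i = 0, so the operation sequence alone determines d.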
The idea of introducing redundant operations and eliminating statements that
depend on secret data was proposed previously to enhance the basic algorithms
so that the improved versions behave more regularly. Some square-multiply-always
algorithms (or their counterpart for point scalar multiplication, called
double-add-always) were developed on this basis [6]. Two of these
square-multiply-always algorithms are shown in Fig. 2.
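A minimal Python sketch of the left-to-right square-multiply-always idea follows. It is an assumption-laden reconstruction based only on the step R_b ← R_0 × M mod n quoted later in the text, not a copy of Fig. 2 (a); the function name and the operation log are hypothetical.

```python
def square_multiply_always(M, d, n):
    """Left-to-right square-multiply-always exponentiation (sketch).

    R[0] is the accumulator and R[1] is a dummy register. The multiplication
    is always executed; its result is kept only when d_i = 1.
    """
    R = [1, 0]
    ops = []
    for i in range(d.bit_length() - 1, -1, -1):
        R[0] = (R[0] * R[0]) % n
        ops.append("S")
        b = 1 - ((d >> i) & 1)    # b = NOT d_i
        R[b] = (R[0] * M) % n     # real when d_i = 1, redundant when d_i = 0
        ops.append("M")
    return R[0], "".join(ops)


result, ops = square_multiply_always(3, 75, 1000003)
print(result == pow(3, 75, 1000003), ops)
```

Under these assumptions the operation sequence is "SM" repeated once per bit regardless of d, which is what removes the SPA leak; the price is the redundant multiplication performed whenever d_i = 0.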
attacker cannot tell the values of A and/or B, however the attacker can detect
the collision if A = B.
The following example given in Table 1 provides the details of the doubling
attack. Let the private exponent d be 75 = (1, 0, 0, 1, 0, 1, 1)_2 and the two
related input data be M and M^2, respectively. The computational process of
raising M^d and (M^2)^d using the left-to-right square-multiply-always
algorithm reveals that if d_i = 0, then the first computations (both are
squarings) of iteration i − 1 for M^d and of iteration i for (M^2)^d are
exactly the same. So, observing collisions (i.e., the same instruction executed
with the same operand) between the computations in the two collected power
consumption traces enables the attacker to identify all private key bits of
zero value except the LSB of d. In the scenario of RSA private computation, it
is assumed that d_0 = 1.
The assumption made (claimed in [7] to be confirmed by experiment) is very
reasonable: the target computations usually take many machine clock cycles
(and are thus easier to measure and to observe) and depend greatly on the
operands, so the collision is easier to detect.
i    d_i   Process of M^d      Process of (M^2)^d
6    1     1^2                 1^2
           1 × M               1 × M^2
5    0     M^2                 (M^2)^2
           M^2 × M             M^4 × M^2
4    0     (M^2)^2             (M^4)^2
           M^4 × M             M^8 × M^2
3    1     (M^4)^2             (M^8)^2
           M^8 × M             M^16 × M^2
2    0     (M^9)^2             (M^18)^2
           M^18 × M            M^36 × M^2
1    1     (M^18)^2            (M^36)^2
           M^36 × M            M^72 × M^2
0    1     (M^37)^2            (M^74)^2
           M^74 × M            M^148 × M^2
Return     M^75                M^150
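The collision pattern in the table can be reproduced with a small simulation. This is only a sketch under stated assumptions: the comparison of power-trace segments is replaced by a direct equality test on the squaring operands, the square-multiply-always loop is the reconstruction assumed from the step R_b ← R_0 × M mod n, and all function names and numeric values are hypothetical.

```python
def sma_squaring_operands(M, d, m, n):
    """Operand squared at each iteration i of left-to-right
    square-multiply-always, for an m-bit exponent d."""
    R = [1, 0]
    operands = {}
    for i in range(m - 1, -1, -1):
        operands[i] = R[0]            # value about to be squared
        R[0] = (R[0] * R[0]) % n
        b = 1 - ((d >> i) & 1)
        R[b] = (R[0] * M) % n         # redundant when d_i = 0
    return operands


def doubling_attack(M, d, m, n):
    """Recover d_{m-1}, ..., d_1 from squaring collisions.

    d is used only to simulate the device; the attacker side uses nothing
    but the equality of operands in the two runs.
    """
    run1 = sma_squaring_operands(M, d, m, n)             # device computes M^d
    run2 = sma_squaring_operands((M * M) % n, d, m, n)   # device computes (M^2)^d
    bits = {}
    for i in range(1, m):
        # Collision between iteration i-1 of run 1 and iteration i of run 2
        # occurs exactly when d_i = 0.
        bits[i] = 0 if run1[i - 1] == run2[i] else 1
    return bits


print(doubling_attack(3, 75, 7, 1000003))
```

With d = 75 the simulated collisions fall at i = 5, 4 and 2, which are the zero bits of d listed in the table; the LSB is not covered by the loop, in line with the text.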
attack, and also the safe-error attacks [9, 10] (a category of hardware fault
attack). The algorithm is given in Fig. 3. This version is only SPA resistant
and is used here to simplify the description of the proposed attack. However,
the enhanced version in [8], which is meant to be immune to the safe-error
attacks and replaces Step 04 by (R_b ← R_b × R_{d_i} mod n), is still
vulnerable to the relative doubling attack proposed in this paper.
It is evident that the Montgomery ladder (and its enhanced version) behaves
regularly and, most notably, that there is no redundant computation within the
algorithm.
Input: M, d = (d_{m−1} · · · d_0)_2, n
Output: M^d mod n
01 R_0 ← 1; R_1 ← M
02 for i = m − 1 downto 0 do
03     b ← ¬d_i
04     R_b ← R_0 × R_1 mod n
05     R_{d_i} ← (R_{d_i})^2 mod n
06 return R_0
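The pseudocode above translates almost line for line into Python. The following is a minimal sketch; the function name and the choice to pass the bit length m explicitly are assumptions made for illustration.

```python
def montgomery_ladder(M, d, m, n):
    """Montgomery ladder for M^d mod n, with d given as an m-bit exponent."""
    R = [1 % n, M % n]                    # Step 01
    for i in range(m - 1, -1, -1):        # Step 02
        di = (d >> i) & 1
        b = 1 - di                        # Step 03: b = NOT d_i
        R[b] = (R[0] * R[1]) % n          # Step 04
        R[di] = (R[di] * R[di]) % n       # Step 05
    return R[0]                           # Step 06


print(montgomery_ladder(3, 75, 7, 1000003) == pow(3, 75, 1000003))
```

Every iteration performs one multiplication followed by one squaring, and both results are used in later iterations, so no operation in the loop is redundant.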
The assumption made in this paper is basically the same as that considered in
the doubling attack [7] and in the attack reported in [14]. The assumption
is that an adversary can distinguish a collision of power trace segments (within
one or more power traces) when the smart card performs the same computation
twice, even if the adversary is not able to tell which exact computation
is done. The collision instance to be distinguished in [7] and in our proposed
attack is the modular squaring computation. An adversary is assumed to be
able to detect the collision of A^2 mod n and B^2 mod n if A = B, even though A
and B are unknown.
In the algorithm (Fig. 3), the register R_0 is used to store the value of
M^{L_i} and the register R_1 is used to store M^{H_i}. To make the algorithm
execute regularly and be immune to SPA, the operations of Step 04 and Step 05
are designed as follows:
  (R_1 = M^{H_i}, R_0 = M^{L_i}) = (M^{L_{i+1}} × M^{H_{i+1}}, (M^{L_{i+1}})^2)   if d_i = 0,   (2)
and
  (R_0 = M^{L_i}, R_1 = M^{H_i}) = (M^{L_{i+1}} × M^{H_{i+1}}, (M^{H_{i+1}})^2)   if d_i = 1.   (3)
The above statements clearly show that the Montgomery ladder executes in a
highly regular way and that there is no redundant computation within the
algorithm. Whatever the processed bit d_i, there is always a multiplication
followed by a squaring. In contrast, we want to emphasize that in the
left-to-right square-multiply-always algorithm (see Fig. 2 (a)), redundant
computation (i.e., Step 05: R_b ← R_0 × M mod n) does exist when d_i = 0. The
original doubling attack on the algorithm in Fig. 2 (a) exploits the existence
of this redundant computation.
However, given that the algorithm contains no redundant computation, no
research has been reported on whether the Montgomery ladder is immune to the
doubling attack or to any doubling-like attack. A straightforward result is
that the original doubling attack does not apply to the Montgomery ladder.
However, the following result shows that another doubling-like attack is
still applicable to the Montgomery ladder.
Fact 1 Given d_i = 0, we have L_i = 2L_{i+1}.
Proof. This follows directly from the definition L_i = \sum_{j=i}^{m-1} d_j 2^{j-i},
since d_i = 0.
Fact 2 Given d_i = 1, we have H_i = 2H_{i+1}.
Proof. From the definitions L_i = \sum_{j=i}^{m-1} d_j 2^{j-i} and H_i = L_i + 1, and
from d_i = 1, we have H_i = L_i + 1 = (2L_{i+1} + 1) + 1 = 2(L_{i+1} + 1) = 2H_{i+1}.
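Both facts can be checked numerically for a concrete exponent. The sketch below assumes d < 2^m, in which case the sum defining L_i is simply d with its i low-order bits dropped; the helper names are hypothetical.

```python
def L(d, i):
    """L_i = sum_{j=i}^{m-1} d_j 2^(j-i); for d < 2^m this equals d >> i."""
    return d >> i


def H(d, i):
    """H_i = L_i + 1."""
    return L(d, i) + 1


def check_facts(d, m):
    """Return True if Fact 1 and Fact 2 hold at every bit position of d."""
    for i in range(m):
        di = (d >> i) & 1
        if di == 0 and L(d, i) != 2 * L(d, i + 1):   # Fact 1
            return False
        if di == 1 and H(d, i) != 2 * H(d, i + 1):   # Fact 2
            return False
    return True


print(check_facts(75, 7))
```

For d = 75 and i = 3 (d_3 = 1), for instance, L_3 = 9 and H_3 = 10, while L_4 = 4 and H_4 = 5, so H_3 = 2H_4 as Fact 2 states.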
From Eq. (2), we see that if d_i = d_{i−1} = 0 then both
  R_0 ← (M^{L_i})^2 : Step 05 of iteration i − 1 when evaluating M^d
  R_0 ← ((M^2)^{L_{i+1}})^2 : Step 05 of iteration i when evaluating (M^2)^d      (4)
perform the same computation because L_i = 2L_{i+1} (see Fact 1). Because of
this collision of computations, a new doubling-like attack can be
mounted to learn that d_i = d_{i−1} = 0.
8 Sung-Ming Yen, Lee-Chun Ko, SangJae Moon, and JaeCheol Ha
On the other hand, from Eq. (3), we also observe that if d_i = d_{i−1} = 1 then
both
  R_1 ← (M^{H_i})^2 : Step 05 of iteration i − 1 when evaluating M^d
  R_1 ← ((M^2)^{H_{i+1}})^2 : Step 05 of iteration i when evaluating (M^2)^d      (5)
perform the same computation because H_i = 2H_{i+1} (see Fact 2). Observing
this collision of computations reveals that d_i = d_{i−1} = 1.
All other cases, i.e., d_i ≠ d_{i−1}, lead to one of the following results:
case (1): d_i = 0 and d_{i−1} = 1
  R_1 ← (M^{H_i})^2 : Step 05 of iteration i − 1 when evaluating M^d
  R_0 ← ((M^2)^{L_{i+1}})^2 : Step 05 of iteration i when evaluating (M^2)^d      (6)
case (2): d_i = 1 and d_{i−1} = 0
  R_0 ← (M^{L_i})^2 : Step 05 of iteration i − 1 when evaluating M^d
  R_1 ← ((M^2)^{H_{i+1}})^2 : Step 05 of iteration i when evaluating (M^2)^d.     (7)
Based on the definition of the Montgomery ladder, it is evident that in case
(1) we have H_i ≠ 2L_{i+1}, so no collision of computation can be detected.
Similarly, in case (2) no collision of computation can be detected since
L_i ≠ 2H_{i+1}.
i    d_i   Process of M^d            Process of (M^2)^d
6    1     R_0 = 1 × M               R_0 = 1 × M^2
           R_1 = M^2                 R_1 = (M^2)^2
5    0     R_1 = M^2 × M             R_1 = M^4 × M^2
           R_0 = M^2                 R_0 = (M^2)^2
4    0     R_1 = M^3 × M^2           R_1 = M^6 × M^4
           R_0 = (M^2)^2             R_0 = (M^4)^2
3    1     R_0 = M^4 × M^5           R_0 = M^8 × M^10
           R_1 = (M^5)^2             R_1 = (M^10)^2
2    0     R_1 = M^10 × M^9          R_1 = M^20 × M^18
           R_0 = (M^9)^2             R_0 = (M^18)^2
1    1     R_0 = M^18 × M^19         R_0 = M^36 × M^38
           R_1 = (M^19)^2            R_1 = (M^38)^2
0    1     R_0 = M^37 × M^38         R_0 = M^74 × M^76
           R_1 = (M^38)^2            R_1 = (M^76)^2
Return     R_0 = M^75                R_0 = M^150
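The collision rule derived from Eqs. (4) to (7) can be exercised end to end in a small simulation. This is a sketch under stated assumptions, not the authors' experimental setup: trace comparison is modelled as an equality test on the Step 05 operands, the known bit d_0 = 1 is borrowed from the RSA remark made earlier in the text, and all names and numeric values are hypothetical.

```python
def ladder_squaring_operands(M, d, m, n):
    """Operand squared in Step 05 at each iteration i of the Montgomery ladder."""
    R = [1 % n, M % n]
    operands = {}
    for i in range(m - 1, -1, -1):
        di = (d >> i) & 1
        R[1 - di] = (R[0] * R[1]) % n     # Step 04
        operands[i] = R[di]               # value about to be squared
        R[di] = (R[di] * R[di]) % n       # Step 05
    return operands


def relative_doubling_attack(M, d, m, n):
    """Recover d from squaring collisions between M^d and (M^2)^d.

    d is used only to simulate the device. A collision between Step 05 of
    iteration i-1 (first run) and iteration i (second run) means
    d_i == d_{i-1}; no collision means d_i != d_{i-1}.
    """
    run1 = ladder_squaring_operands(M, d, m, n)
    run2 = ladder_squaring_operands((M * M) % n, d, m, n)
    bits = [0] * m
    bits[0] = 1                           # assumption: d_0 = 1
    for i in range(1, m):
        collision = run1[i - 1] == run2[i]
        bits[i] = bits[i - 1] if collision else 1 - bits[i - 1]
    return sum(bit << i for i, bit in enumerate(bits))


print(relative_doubling_attack(3, 75, 7, 1000003))
```

For d = 75 the simulated collisions occur at i = 5 (d_5 = d_4 = 0) and i = 1 (d_1 = d_0 = 1), matching the equal squarings visible in the table; since each comparison yields only the relation between adjacent bits, one known bit is needed to fix the whole exponent.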
One may argue that the standard blinding technique can easily prevent the
proposed relative doubling attack as well as the original doubling attack.
However, we have some remarks on this claim.
The first objection is that the standard blinding technique is well known
as a countermeasure against DPA. The second objection is that in a standard
blinding technique the input data should be protected by a random mask, which
is then removed from the result. However, it has been pointed out clearly in
[7] that a regular mask updating (meant to be efficient), e.g., the one
mentioned in [6], is vulnerable to the doubling attack. It can be verified
easily that the regular mask updating in [6] is also vulnerable to the proposed
relative doubling attack. It was eventually suggested that a truly random mask
be used to avoid the attack. Unfortunately, the computational overhead of
employing a truly random mask is usually very high.
The work, and especially the title, of [7] implies that upward (right-to-left)
exponentiation could be better than downward (left-to-right) exponentiation
with respect to vulnerability to the doubling attack. This is also the case for
the proposed relative doubling attack. However, this superiority of
the upward exponentiation comes at an additional cost. It is evident
that the upward square-multiply-always exponentiation in Fig. 2 (b) needs
one more temporary memory location than the downward exponentiation does.
The purpose of the following discussion is to clarify that upward
exponentiation is not a necessary requirement for immunity to the doubling
attack and the proposed relative doubling attack. The following SPA-protected and
Notice that the algorithm (Fig. 4) needs only two temporary memory locations
(the same as that in Fig. 2 (a)), which is one fewer than the
doubling-attack-immune upward algorithm in Fig. 2 (b) requires. Recall that
if we take into account the fact that the input datum M is also stored inside
the smart card (as already described previously), then the algorithm in Fig. 4
needs only one temporary memory location, which is two fewer than the
doubling-attack-immune upward algorithm in Fig. 2 (b) requires.
However, it is worth noting that protection against the relative doubling
attack does not necessarily ward off other potential attacks.
5 Conclusions
The Montgomery ladder can be secure against both the ordinary SPA and the
ordinary doubling attack. However, in this paper we showed that the Montgomery
ladder is vulnerable to the proposed relative doubling attack. Both the
ordinary doubling attack and the proposed relative doubling attack share the
same reasonable attack assumption of observing collisions between computations.
One difference is that the original doubling attack (against the
square-multiply-always algorithm) fully exploits the existence of redundant
computation, while the proposed relative doubling attack (against the
Montgomery ladder) does not exploit any redundant computation. Our relative
doubling attack uses a different approach to derive the private key.
6 Acknowledgment
The authors would like to thank the anonymous reviewers for their helpful
suggestions and comments on both technical and editing issues. These
suggestions greatly improved the final version of this paper.
References
1. P. Kocher, J. Jaffe, and B. Jun, “Differential power analysis,” Advances in
Cryptology – CRYPTO ’99, LNCS 1666, pp. 388–397, Springer-Verlag, 1999.
2. R.L. Rivest, A. Shamir, and L. Adleman, “A method for obtaining digital signatures
and public-key cryptosystems,” Commun. of ACM, vol. 21, no. 2, pp. 120–126, 1978.
3. T. ElGamal, “A public key cryptosystem and a signature scheme based on discrete
logarithms,” IEEE Trans. Inf. Theory, vol. 31, no. 4, pp. 469–472, 1985.
4. V. Miller, “Use of elliptic curves in cryptography,” Advances in Cryptology –
CRYPTO ’85, LNCS 218, pp. 417–426, Springer-Verlag, 1985.
5. N. Koblitz, “Elliptic curve cryptosystems,” Mathematics of Computation, vol. 48,
no. 177, pp. 203–209, Jan. 1987.
6. J.-S. Coron, “Resistance against differential power analysis for elliptic curve
cryptosystems,” Proc. of Cryptographic Hardware and Embedded Systems – CHES ’99,
LNCS 1717, pp. 292–302, Springer-Verlag, 1999.
7. P.-A. Fouque and F. Valette, “The doubling attack – why upwards is better than
downwards,” Proc. of Cryptographic Hardware and Embedded Systems – CHES ’03,
LNCS 2779, pp. 269–280, Springer-Verlag, 2003.
8. M. Joye and S.M. Yen, “The Montgomery powering ladder,” Proc. of Cryptographic
Hardware and Embedded Systems – CHES ’02, LNCS 2523, pp. 291–302,
Springer-Verlag, 2003.
9. S.M. Yen and M. Joye, “Checking before output may not be enough against
fault-based cryptanalysis,” IEEE Trans. on Computers, vol. 49, no. 9, pp. 967–970,
Sept. 2000.
10. S.M. Yen, S.J. Kim, S.G. Lim, and S.J. Moon, “A countermeasure against one
physical cryptanalysis may benefit another attack,” Proc. of Information Security
and Cryptology – ICISC ’01, LNCS 2288, pp. 414–427, Springer-Verlag, 2002.
11. D.M. Gordon, “A survey of fast exponentiation methods,” Journal of Algorithms,
vol. 27, pp. 129–146, 1998.
12. P.L. Montgomery, “Speeding the Pollard and elliptic curve methods of
factorization,” Mathematics of Computation, vol. 48, no. 177, pp. 243–264, Jan. 1987.
13. S.M. Yen and C.S. Laih, “Fast algorithms for LUC digital signature computation,”
IEE Proc. Computers and Digital Techniques, vol. 142, no. 2, pp. 165–169, March
1995.
14. K. Schramm, T. Wollinger, and C. Paar, “A new class of collision attacks and its
application to DES,” Proc. of Fast Software Encryption – FSE ’03, LNCS 2887,
pp. 206–222, Springer-Verlag, 2003.
15. PKCS #1 v2.1, “RSA Cryptography Standard,” 5 January 2001.
http://www.rsasecurity.com/rsalabs/pkcs/
16. M. Bellare and P. Rogaway, “Optimal asymmetric encryption padding – How to
encrypt with RSA,” Advances in Cryptology – EUROCRYPT ’94, LNCS 950,
pp. 92–111, Springer-Verlag, 1995.
17. S.M. Yen, C.C. Lu, and S.Y. Tseng, “Method for protecting public key schemes
from timing, power and fault attacks,” U.S. Patent Number US2004/0125950 A1,
July 2004.