RSA and power analysis

Mehdi-Laurent Akkar¹* and Paul Dischamp²

¹ Bull CP8, 68 route de Versailles, 78431 Louveciennes, France.
[email protected]
² Oberthur Card Systems, 25, rue Auguste Blanche, 92800 Puteaux, France.
[email protected]

Abstract. RSA is probably the most famous and most widely used public
key cryptosystem. It is implemented in many smart cards with cryptographic
functionality. Because of its popularity it is also, together with DES,
one of the most attacked. Since Paul Kocher published the first article on
power analysis against RSA, many attacks and countermeasures have been
proposed. In this paper we analyze several attacks. We then present
several countermeasures and some ideas for a "secure" implementation of
RSA with reasonable performance.

Keywords: RSA, Public Key Cryptography, Smart cards, Blinding, Power analysis, DPA, SPA.

1 Introduction
Protecting a card against power analysis on PKC¹ is perhaps at once one
of the simplest and one of the most difficult things to do. It looks
simple because the underlying mathematical structure offers numerous
properties that can be exploited; several countermeasures do exactly that,
blinding methods being the perfect example. On the other hand, the
mathematical computations require many operations and, therefore, many
smart cards integrate a crypto-processor implementing the basic modular
operations. The programmer cannot change or bypass these operations and,
in many cases, there are restrictions on the usage of those hardwired
instructions. That is why, as usual when dealing with security problems,
the programmer has to study carefully the details of the execution of
these operations.
In this paper, we first present some general results on RSA and the way
it is implemented in many smart cards. Next, we discuss several attacks
and their (practical) feasibility. Finally, we present some
countermeasures against these attacks.

* Research done while at Oberthur Card Systems.
¹ Public Key Cryptosystems.
2 RSA and the smart card

As usual we will consider the following RSA parameters:

– $P$ and $Q$, two prime numbers
– $N = P \times Q$
– $e$ and $d$, the public and secret exponents ($ed \equiv 1 \bmod (P-1)(Q-1)$)
– a message $M$ and the corresponding ciphertext $C = M^e \bmod N$

For efficiency, most implementations use subring computation (the
Chinese Remainder Theorem, CRT) to speed up the modular exponentiation
$C^d \bmod N$ (we consider the deciphering/signature operation) as follows:

1. Compute $C_P = C \bmod P$ and $C_Q = C \bmod Q$
2. Compute $d_P = d \bmod (P-1)$ and $d_Q = d \bmod (Q-1)$
3. Execute the two exponentiations to get
   $M_P = C_P^{d_P} \bmod P$ and $M_Q = C_Q^{d_Q} \bmod Q$
4. Use the Chinese Remainder reconstruction to get $M$:

   if $P \le Q$ then $M = (((M_Q - M_P)(1/P \bmod Q) \bmod Q) \times P + M_P) \bmod N$

(in practice, $1/P \bmod Q$, $d_P$ and $d_Q$ are usually precomputed)
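The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the toy primes (two small Mersenne primes), the exponent 65537 and the function name `rsa_crt_decrypt` are not from the paper, and Python 3.8 or later is assumed for modular inverses through `pow`.

```python
# Toy parameters (assumptions for illustration only, far too small for real use).
P = 2**31 - 1                      # prime, P <= Q
Q = 2**61 - 1                      # prime
N = P * Q
e = 65537
d = pow(e, -1, (P - 1) * (Q - 1))  # secret exponent, e*d = 1 mod (P-1)(Q-1)

# Values that are usually precomputed.
d_P = d % (P - 1)
d_Q = d % (Q - 1)
P_inv = pow(P, -1, Q)              # 1/P mod Q


def rsa_crt_decrypt(C):
    """Compute C^d mod N through the two half-size exponentiations."""
    # Step 1: initial reductions.
    C_P = C % P
    C_Q = C % Q
    # Step 3: the two exponentiations.
    M_P = pow(C_P, d_P, P)
    M_Q = pow(C_Q, d_Q, Q)
    # Step 4: Chinese Remainder reconstruction.
    return ((((M_Q - M_P) * P_inv) % Q) * P + M_P) % N
```

For any message below N, `rsa_crt_decrypt(pow(M, e, N))` should return M, the same value as the direct computation `pow(C, d, N)`.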


In many smart cards, the following operations are implemented in
hardware: modular squaring, modular multiplication, modular addition and
modular reduction. To speed up the computation, some algorithmic
improvements are used, such as the algorithms of Sedlak, Quisquater or
Montgomery [9]. Sometimes, knowledge of these algorithms is necessary to
mount a successful, adapted attack.
We will see in the next section that all four steps of the usual
RSA algorithm are vulnerable to power analysis.

3 Attacks

3.1 Basic attacks


The first attack appeared in 1996 with the publication of the timing
attack [5] on the Montgomery multiplication algorithm. The second one,
again by P. Kocher [4], targeted the "square and multiply" exponentiation
algorithm used to compute $M = C^d \bmod N$, which is summarized here:
– $M = 1$
– from the most to the least significant bit $b$ of the exponent do:
  • $M = M^2 \bmod N$
  • if the bit $b = 1$ do $M = M \times C \bmod N$
– output $M$
The attack was based on the fact that the implementation used a special,
faster procedure for the squaring step. By considering the difference
between the squaring and multiplication patterns, it was easy to extract
the secret exponent $d$ by reading the power consumption curve (cf.
fig. 1).

Fig. 1. Each high peak is the beginning of one operation. A large peak is a multiply,
a thin peak is a square.
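To make the leak concrete, here is a minimal Python sketch of the square and multiply loop that also records the sequence of operations, standing in for what the attacker reads on the curve. The letters "S" and "M" and both function names are assumptions made for this illustration; a real trace has to be segmented into operations first.

```python
def square_and_multiply(C, d, N):
    """Left-to-right square and multiply; also returns the operation sequence."""
    M = 1
    operations = []
    for bit in bin(d)[2:]:           # most to least significant bit
        M = (M * M) % N
        operations.append("S")       # a square is always executed
        if bit == "1":
            M = (M * C) % N
            operations.append("M")   # a multiply only for a 1 bit
    return M, "".join(operations)


def exponent_from_operations(operations):
    """Rebuild the exponent from the observed square/multiply pattern."""
    bits = []
    i = 0
    while i < len(operations):
        # operations[i] is a square; a multiply right after it means the bit is 1
        if i + 1 < len(operations) and operations[i + 1] == "M":
            bits.append("1")
            i += 2
        else:
            bits.append("0")
            i += 1
    return int("".join(bits), 2)
```

With a distinguishable pattern, the recovery is a direct read-out: every "SM" pair is a 1 bit and every isolated "S" is a 0 bit.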

Often, what the attacker gets is not exactly the exponent $d$ but the
reduced exponent $d_P$ (or $d_Q$). Since $e \, d_P \equiv 1 \bmod (P-1)$,
elementary equations give

$$P = 1 + \frac{e \, d_P - 1}{k}, \qquad 1 \le k \le e$$

So, if $e$ is small (often 3 or 65537), it is very easy to recover $P$ and
$Q$ by exhaustive search on the value of $k$. Unfortunately, when $e$ is
random, this exhaustive search is no longer practical; however, another
method is applicable in the general case. The idea² is the following:

– pick a random number $x$
– compute $y = x^{e d_P} \bmod N$
– evaluate $\tilde{P} = \gcd(x - y, N)$
– output $\tilde{P} = P$ with very high probability.
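Both ways of turning a leaked $d_P$ into the factor $P$ can be sketched as follows. The toy primes, the function names and the retry loop around the GCD step are assumptions of this sketch, not details given in the paper; the retry only covers the rare case where the GCD is trivial.

```python
import math
import random

# Toy key (assumption for illustration only).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
e = 65537
d = pow(e, -1, (P - 1) * (Q - 1))
d_P = d % (P - 1)                    # the value leaked by the exponentiation


def recover_p_small_e(d_P, e, N):
    """Exhaustive search on k in P = 1 + (e*d_P - 1)/k, practical for small e."""
    numerator = e * d_P - 1
    for k in range(1, e + 1):
        if numerator % k == 0:
            candidate = 1 + numerator // k
            if 1 < candidate < N and N % candidate == 0:
                return candidate
    return None


def recover_p_gcd(d_P, e, N, rng):
    """General case: x^(e*d_P) = x mod P, so P divides x - y."""
    while True:
        x = rng.randrange(2, N)
        y = pow(x, e * d_P, N)
        g = math.gcd(x - y, N)
        if 1 < g < N:                # retry in the unlikely trivial case
            return g
```

The second function never needs $e$ to be small, since it only relies on $x^{e d_P} \equiv x \bmod P$.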

Another attack, which can be very efficient and easy to carry out in some
special cases, is the following: if someone has a card on which the
exponent can be changed, he can perform a comparative SPA attack between
the power consumption of his card's computation and that of the real one.
In this way, it is possible to guess the exponent bit by bit (cf. fig. 2).


Fig. 2. The attacker subtracts the middle curve (his card) from the upper
curve (real card) to obtain the last curve. A difference indicates that he
did not guess the right exponent.

² Original idea by Jacques Stern.
3.2 Attack on the initial message reduction

Let us analyze how to attack the initial reduction on cards implementing
the "CRT version" of RSA. In most cases the reduction is performed
by approximately³ the following division algorithm (see [3, 7]):

– INPUT: the base $b$, $x = \{x_n \ldots x_1, x_0\}_b$, $y = \{y_t \ldots y_0\}_b$
– OUTPUT: $q$ and $r$ such that $x = qy + r$, $0 \le r < y$
– $q = 0$; while $x \ge y b^{n-t}$ do: $q_{n-t} = q_{n-t} + 1$, $x = x - y b^{n-t}$ end-do
– for $i = n$ down to $t + 1$ do:
  • if $x_i = y_t$ then do $q_{i-t-1} = b - 1$ end-do
  • else do $q_{i-t-1} = \lfloor (x_i b + x_{i-1}) / y_t \rfloor$ end-do end-if
  • while $q_{i-t-1}(y_t b + y_{t-1}) > x_i b^2 + x_{i-1} b + x_{i-2}$ do $q_{i-t-1} = q_{i-t-1} - 1$ end-do
  • $x = x - q_{i-t-1} y b^{i-t-1}$
  • if $x < 0$ then do $x = x + y b^{i-t-1}$ and $q_{i-t-1} = q_{i-t-1} - 1$ end-do
  • end-for
– $r = x$
– output $r$ and $q$
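The division algorithm above can be written as a short Python sketch. It keeps $x$ as one big integer and extracts its base-$b$ digits on demand, which is a simplification; a card works on digit arrays. The helper names and the requirement that the divisor has at least two digits are assumptions of this sketch.

```python
def digit(value, index, b):
    """Digit number `index` (0 = least significant) of value in base b."""
    return (value // b**index) % b


def digit_count(value, b):
    count = 1
    while value >= b:
        value //= b
        count += 1
    return count


def schoolbook_divide(x, y, b=256):
    """Return (q, r) with x = q*y + r and 0 <= r < y, digit by digit in base b."""
    if y < b:
        raise ValueError("the divisor needs at least two base-b digits")
    t = digit_count(y, b) - 1
    n = digit_count(x, b) - 1
    if n < t:
        return 0, x
    q = 0
    shifted = y * b**(n - t)
    while x >= shifted:                      # leading quotient digit
        q += b**(n - t)
        x -= shifted
    y_t, y_t1 = digit(y, t, b), digit(y, t - 1, b)
    for i in range(n, t, -1):
        x_i, x_i1, x_i2 = digit(x, i, b), digit(x, i - 1, b), digit(x, i - 2, b)
        if x_i == y_t:
            q_digit = b - 1
        else:
            q_digit = (x_i * b + x_i1) // y_t
        # Refine the estimate with one more digit of x and y.
        while q_digit * (y_t * b + y_t1) > x_i * b * b + x_i1 * b + x_i2:
            q_digit -= 1
        shifted = y * b**(i - t - 1)
        x -= q_digit * shifted
        if x < 0:                            # rare final correction step
            x += shifted
            q_digit -= 1
        q += q_digit * b**(i - t - 1)
    return q, x
```

The data-dependent `while` and `if` steps are the ones whose presence or absence shows up in a power trace.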

Often, thanks to general knowledge of the architecture of the analyzed
component, it is easy to determine the base $b$ used by the algorithm.
Most of the time, the digits are 8, 16 or 32 bits wide; sometimes they are
1 bit wide, but then the algorithm is much simpler (subtract and test the
carry).
It can happen that the algorithm first checks whether $x$ is smaller than
$y$ and, in this case, does not execute the division step. By power
analysis, it is quite easy to determine whether such a procedure is
executed or not; one can then use a dichotomy process to determine the
value of $P$ in a chosen plaintext attack:

– INPUT: the size $n$ of $P$ in bits
– OUTPUT: the value of $P$
– $X = 0$
– for $i = n + 1$ down to 0 do:
  • $X = X + 2^i$
  • execute RSA with message $X$
  • if the division occurs do $X = X - 2^i$ end-do
  • end-for
– output $X + 1$ (with the strict test $x < y$, the loop ends with $X = P - 1$, the largest value for which no division occurs)
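A simulation of this dichotomy is sketched below. The card is replaced by a hypothetical oracle `division_occurs(X)` that answers what the attacker would read on the power trace; the oracle, the toy prime and the function name are assumptions of this sketch. Since the loop converges to the largest $X$ for which no division occurs, that is $P - 1$ when the card tests $x < y$, the sketch returns $X + 1$.

```python
def recover_prime_by_dichotomy(division_occurs, n):
    """Binary search on the chosen message X, one bit per RSA execution."""
    X = 0
    for i in range(n + 1, -1, -1):
        X += 1 << i
        if division_occurs(X):       # X >= P: the reduction was executed
            X -= 1 << i
    return X + 1                     # X ends at P - 1


# Hypothetical card model: the division is skipped when the input is below P.
SECRET_P = 2**31 - 1


def simulated_card_leak(X):
    return X >= SECRET_P
```

Calling `recover_prime_by_dichotomy(simulated_card_leak, 31)` needs only 33 simulated executions, one per tested bit position.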
³ Some issues about initial conditions and normalization of the input values will not
be discussed here.
Even when the test $M < P$ is not performed, it is still possible to use
this attack. Indeed, with a comparative SPA method, one can see the
difference between the execution of RSA(0) and RSA($X$) ($X$ being the
test value of the previous algorithm); one can also determine whether the
$q_i$ values stay equal to zero or not during the whole algorithm.
On small-base architectures, one can use a DPA-type attack by simulating
the first bits of the divisor $P$. With this information, one can guess
what the first bits of the partial remainder obtained in the first step of
the reduction would be. With a usual DPA selection function (on the value
of one bit, for example), one can incrementally obtain all the bits of the
secret factor $P$. As in the timing attack, it is possible to detect a
guessing error when no bias appears in the curve differences for several
consecutive bits.
It seems possible to perform other attacks on all steps of the algorithm
with an appropriate selection function on the message; for example, the
last step of the algorithm occurs only one time in 2b, and some attack
exploiting this particular step may appear.

3.3 Attack on the exponentiation step


DPA attacks. The most natural idea is to proceed as follows with a DPA
attack against a 1024-bit RSA (for example):
– initialize an array T[1000][256]
– for i = 0 to 999: generate a random message $M_i$
– for i = 0 to 999, for j = 0 to 255 do
  • compute $X = HW(M_i^j \bmod N)$, where HW denotes the Hamming weight
  • if $X > 512$ then T[i][j] = 1 else T[i][j] = 0
– perform a usual DPA attack on the consumption curves with the
T[i][j] selection function.
By "usual DPA attack on the curves", we mean the following: we assume
that we are able to isolate the squaring that follows the processing of
the first byte of the exponent (or the whole exponentiation step, but the
processing will be slower). At this step the operation $(M_i^j)^2 \bmod N$
is performed and so, if the guess $j$ is correct, a difference will appear
between the curves with low Hamming weight results and those with high
ones. If $j$ is incorrectly guessed, the (short) exponentiation behaves
like a random value and no difference will appear. After this, simply
repeat the attack for the other bytes of the key.
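The selection function and the difference of means can be illustrated on simulated data. Everything about the leakage here is an assumption of this sketch: each "trace" is a single sample equal to the Hamming weight of the operand $M_i^j \bmod N$ of the isolated squaring plus Gaussian noise, the modulus is a 92-bit toy value (so the threshold is 46 instead of 512), and 500 messages are used.

```python
import random


def hamming_weight(value):
    return bin(value).count("1")


def mean(values):
    return sum(values) / len(values)


def simulate_traces(messages, secret_byte, N, rng, noise=1.0):
    """Hypothetical leakage: Hamming weight of the squaring operand plus noise."""
    return [hamming_weight(pow(M, secret_byte, N)) + rng.gauss(0.0, noise)
            for M in messages]


def dpa_bias(messages, traces, guess, N, threshold):
    """Difference of means between the two sets chosen by the selection function."""
    high, low = [], []
    for M, sample in zip(messages, traces):
        if hamming_weight(pow(M, guess, N)) > threshold:   # T[i][j] = 1
            high.append(sample)
        else:                                              # T[i][j] = 0
            low.append(sample)
    if not high or not low:
        return 0.0
    return abs(mean(high) - mean(low))


def recover_first_byte(messages, traces, N, threshold):
    """The correct guess is the one showing the largest bias."""
    return max(range(256),
               key=lambda j: dpa_bias(messages, traces, j, N, threshold))


rng = random.Random(2024)
N = (2**31 - 1) * (2**61 - 1)          # toy modulus, 92 bits
messages = [rng.randrange(2, N) for _ in range(500)]
traces = simulate_traces(messages, 0xB7, N, rng)
recovered = recover_first_byte(messages, traces, N, N.bit_length() // 2)
```

Note that the simulation needs the modulus to predict $M_i^j \bmod N$, which is exactly the information that is missing when the CRT method is used.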
Unfortunately, in most cases, one cannot perform this attack for the
simple reason that the CRT method is used and therefore the attacker has
no access to the moduli $P$ and $Q$. Simulating the exponentiation is then
no longer possible. Luckily for the attacker, SPA methods exist to analyze
the exponentiation phase directly.

SPA attacks. The first, and most commonly used, method is to try to
distinguish directly the square and the multiply steps or, when both
are always executed, to distinguish between the real and the dummy
multiplications. Some time ago, in most cases it was possible to tell
the difference just by looking at the consumption curves: the timings or
the shapes (the bump...) of the two operations were so different that one
could immediately distinguish the two steps. In more recent
implementations, the programmers write the exponentiation very carefully
and a visual analysis is no longer possible (cf. fig. 3).


Fig. 3. Each high peak is the beginning of one operation, but it is very difficult to
distinguish a difference between the square and multiply.

For a precise analysis, the simplest thing is to apply an auto-correlation
to the consumption curve. This method will detect very small differences
in time or in amplitude between the different operations and will
sometimes be enough to break the exponentiation.
A second idea is to analyze the correlation of the curve with the first
"multiply by C" operation (this one is often easy to isolate because it is
the first operation different from "1 = 1 × 1"). This multiplication
occurs with probability 1/2 at each loop of the algorithm and, even if one
of the operands is unknown, the other is $C$, so some correlation often
appears on the multiplication step.
A third method, which gives partial information, can sometimes be
used: the exponent is often processed in a loop over the bytes of the key;
in this case, even if one cannot distinguish the square from the multiply,
one can get the Hamming weight of each byte of the key.
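The third method reduces to simple arithmetic: with a plain square and multiply loop, each key byte costs 8 squarings plus one multiplication per 1 bit, so counting the operations between two byte boundaries gives the Hamming weight. The sketch below assumes that the byte boundaries are visible on the curve and that no dummy multiplications are executed; the function names are hypothetical.

```python
def operations_per_byte(d, n_bytes):
    """Number of operations executed for each key byte (8 squares + 1 per set bit)."""
    return [8 + bin(byte).count("1") for byte in d.to_bytes(n_bytes, "big")]


def hamming_weights_from_counts(counts):
    """What the attacker deduces from the per-byte operation counts."""
    return [count - 8 for count in counts]
```

For example, a key byte with 12 operations between its boundaries has exactly four bits set, which narrows its 256 possible values down to 70.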
A last method, when both square and multiply are executed, is to
analyze the succession square1, multiply1, square2. Several cases could
appear:

Correlation between           case 1   case 2   case 3   case 4
Sq1 and Sq2                   yes      yes      no       no
Beginning of mult1 and sq2    yes      yes      no       no
End of mult1 and sq2          no       yes      yes      yes
All of mult1 and sq2          no       yes      no       no

Case 1 corresponds to a zero exponent bit where, during the
multiplication by $C$ (which is a dummy in this case), the add-and-shift
algorithm adds $C$ and not the result of the square⁴. There is a
correlation between the two squares because the output of the first is the
input of the second. There is a correlation between the beginning of the
multiply and square 2 because the input of the multiply is the input of
square 2. There is no correlation between the rest of the multiply and the
squaring because, just after the beginning of the multiply, the value just
squared is modified: it is the value $C$ which is added at each elementary
step of the multiplication.
Case 2 corresponds to a zero bit where the multiplication step uses
the squared value for the elementary addition step of the multiplication.
Case 3 is a 1 exponent bit with the value $C$ used for the elementary addition.
Case 4 is a 1 exponent bit with the squared value used for the elementary addition.

These attack ideas generally apply to normal, Montgomery, Sedlak
and Quisquater architectures. Some information on this subject can be
found in [8].

⁴ It is quite complicated, but this difference between cases 1/2 and 3/4 is important.

3.4 Attacking the recombination

First let us recall the CRT reconstruction:

if $P \le Q$ then $M = (((M_Q - M_P)(1/P \bmod Q) \bmod Q) \times P + M_P) \bmod N$

The attack is based on this simple remark: if $M < P$, the quantity
$M_Q - M_P$ involved in the computation of the result $M$ is equal to
zero, since then $M_P = M_Q = M$. The subsequent multiplications by
$1/P \bmod Q$ and by $P$ then also operate on zero. As far as the power
consumption is concerned, this makes a big difference. So, once again,
knowing the public key, one can encrypt a small message $M \to C$ and have
the card sign $C$. Then, as previously, it is easy to obtain the value of
the factor $P$ by dichotomy.
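The following sketch simulates this attack end to end. The card is modelled by a hypothetical function that reports whether $M_Q - M_P$ was zero, which stands for what the attacker would observe on the consumption curve; the toy key and all names are assumptions of this sketch.

```python
# Toy key held by the simulated card (assumption for illustration only).
P = 2**31 - 1
Q = 2**61 - 1                        # P <= Q
N = P * Q
e = 65537
d = pow(e, -1, (P - 1) * (Q - 1))
d_P = d % (P - 1)
d_Q = d % (Q - 1)


def card_difference_is_zero(C):
    """Simulated observation: True when M_Q - M_P = 0 during the recombination."""
    M_P = pow(C % P, d_P, P)
    M_Q = pow(C % Q, d_Q, Q)
    return M_Q - M_P == 0


def recover_p_from_recombination(n_bits):
    """Dichotomy on the message M, using only public data and the observation."""
    X = 0
    for i in range(n_bits, -1, -1):
        X += 1 << i
        C = pow(X, e, N)             # encrypt the chosen message with the public key
        if not card_difference_is_zero(C):
            X -= 1 << i              # X >= P: the difference was not zero
    return X + 1                     # X ends at P - 1
```

The search converges to the largest message for which the difference is zero, that is $P - 1$, hence the final increment.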

4 Countermeasures and Implementation

4.1 Basic ideas


We present in this section some implementation ideas to make
the code more secure during the RSA computation. Most of them are
applied in a small pseudo RSA code.
– The most important thing is that the code must have no branches,
in assembly as well as in C. In C, the programmer often does
not control the compilation phase (optimizations and reorganization
of the code may occur...), so he should at least avoid the use of
conditional branches (if, while...). Even in assembly, the code has to
be really "linear" because, even if the same instructions are executed
from different parts of the ROM, a simple comparison of the consumption
traces could tell the attacker which branch was executed (cf. [1]).
– Always use indirect addressing instead of moving the values; moreover,
the location of a value must not give much information about the
message. Try to use similar addresses for the different buffers (not
one at address 0x0005 and the other at 0xFEFF).
– Try to execute the critical part while something else is happening,
typically while the cryptoprocessor is working in the case of an RSA
implementation on a smart card.
Example (very bad)

Set(buffer1,1);
for(i=0;i<n/8;i++)
  for(j=0;j<8;j++)
  {
    Square(buffer2,buffer1);
    if(bit(i,j,d)==1)
    {
      Multiply(buffer1,buffer2,C);
    }
    else
    {
      Copy(buffer1,buffer2);
    }
  }

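Example (better)

Set(buffer1,1);

p1 = &buffer1;
p2 = &buffer2;

for(i=0;i<n;i++)
{
  waitrdm();
  BeginSquare(p2,p1);         /* *p2 = (*p1)^2 */
  waitrdm();
  WaitEndSquare();
  waitrdm();
  BeginMultiply(p1,p2,C);     /* *p1 = (*p2) * C, always executed */
  waitrdm();
  b = bit(i,d);
  tmp = p1;                   /* keep the old p1 for the second selection */
  p1 = b*p1 + (1-b)*p2;       /* b = 1: keep the product, b = 0: keep the square */
  p2 = b*p2 + (1-b)*tmp;
  waitrdm();
  WaitEndMultiply();
  waitrdm();
}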
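The buffer selection of the second example can be checked with a small Python model. It only models the data flow: the multiplication is always executed and the result buffer is chosen by arithmetic on an index instead of an `if`. Python itself gives no guarantee about timing or power consumption, and the function name and the two-element buffer list are assumptions of this sketch.

```python
def exponentiate_without_branch(C, d, N, n_bits):
    """Square and always multiply; the exponent bit only selects a buffer index."""
    buffers = [1, 0]                 # buffers[result] holds the accumulator
    result = 0
    for i in range(n_bits - 1, -1, -1):
        bit = (d >> i) & 1
        other = 1 - result
        buffers[other] = (buffers[result] * buffers[result]) % N   # square
        buffers[result] = (buffers[other] * C) % N                 # multiply, always
        # bit = 1: keep the product; bit = 0: the squared value becomes the result
        result = bit * result + (1 - bit) * other
    return buffers[result]
```

The value must agree with an ordinary modular exponentiation for every exponent, including exponents with leading zero bits.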
4.2 Blinding ideas

To prevent predictions of the values manipulated during the RSA
computation, some blinding operations can be applied to the message,
the exponent and the modulus:

– $\tilde{N} = rdm_1 \times N$
– $\tilde{M} = M + rdm_2 \times N$, with preferably $\gcd(rdm_1, rdm_2) = 1$
– $\tilde{d} = d + rdm_3 \times (P-1)(Q-1)$

Another idea is to modify the message before and after the
exponentiation:

– pick $Y = rdm$ and compute $X = Y^{-e} \bmod N$
– replace $M$ by $MX \bmod N$
– exponentiate $MX$ with $d$
– unblind $(MX)^d \bmod N$ by multiplying by $Y$
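A sketch combining these blindings is given below, as a check that the result is unchanged. The toy key, the 32-bit size of the random values, the way they are drawn and the function name are assumptions of this sketch; Python 3.8 or later is assumed for modular inverses through `pow`.

```python
import math
import random

# Toy key (assumption for illustration only).
P = 2**31 - 1
Q = 2**61 - 1
N = P * Q
PHI = (P - 1) * (Q - 1)
e = 65537
d = pow(e, -1, PHI)


def blinded_private_operation(M, rng):
    """Compute M^d mod N with blinded modulus, message and exponent."""
    rdm1 = rng.getrandbits(32) | 1
    rdm2 = rng.getrandbits(32)
    rdm3 = rng.getrandbits(32)
    N_blind = rdm1 * N                   # modulus blinding
    M_blind = M + rdm2 * N               # additive message blinding
    d_blind = d + rdm3 * PHI             # exponent blinding

    # Multiplicative blinding: X = Y^(-e) mod N for a random invertible Y.
    Y = rng.randrange(2, N)
    while math.gcd(Y, N) != 1:
        Y = rng.randrange(2, N)
    X = pow(pow(Y, e, N), -1, N)

    masked = (M_blind * X) % N_blind
    result = pow(masked, d_blind, N_blind)
    return (result * Y) % N              # unblind and reduce modulo the real N
```

Two calls with the same input manipulate different intermediate values, yet both return $M^d \bmod N$.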

In practice, the blinding methods are used with 32- or 64-bit random
values, which is sufficient to thwart even quite powerful adversaries
without losing much performance (in some cases, for example on
Quisquater architectures, thanks to the use of an extended modulus, some
countermeasures do not even affect the performance).
All these blindings can of course be applied to the message, the
exponent and the modulus in their original form ($M$, $d$ and $N$) but
also in their "reduced" form ($M_P$, $d_P$ and $P$). The main advantage is
to prevent any prediction of the values operated on during the RSA
computation and so to counteract most of the attacks presented in the
previous section. But the major problem is that none of them prevents an
attacker who is able to get the exponent of the exponentiation in one
"shot" from completely breaking the system⁵! In a nutshell, for RSA, it
seems easier to prevent DPA attacks than SPA ones.

4.3 Work in progress

Now the smart card world seems to be moving towards powerful 32-bit core
processors without a crypto-processor, but with some efficient and useful
basic instructions (32×32 to 64-bit multiplier, ...). The advantage of
those processors is that the programmer fully controls his implementation,
and countermeasures can be implanted deep in the heart of the algorithm;
such a thing is impossible with crypto-processors.
Most of these attacks can easily be adapted to other finite field
algorithms and to elliptic curves; moreover, concerning the latter, there
are specific techniques using the structure of ECC (see [2, 6]).

⁵ Even if the adversary does not get the real secret exponent $d$, he gets a working
decrypting exponent.

4.4 Some remarks to the reviewer


In some cases, it could be even more secure to change the value of
the blinding random number during the computation. Indeed, we want to
point out that during the exponentiation the buffer is always multiplied
by the same value, and this weakness could be exploited by more
complicated attacks. [Some details about this may be included in the final
version of the paper].

An efficient countermeasure against SPA would be to blind each message
by a "personal value" associated with a special exponent, so that
compromising this exponent does not compromise the secret exponent. Work
is in progress but we are still confronted with some problems⁶.

We will include in the final paper curves to illustrate the attacks
presented here, as space allows.

5 Conclusion

It seems that, as for private key cryptosystems, the hardest problem on
smart cards now is to protect the card against quite advanced SPA-type
attacks. Some powerful and theoretical countermeasures exist against DPA,
but not against SPA. Luckily, many ad hoc solutions exist to prevent SPA
attacks.

⁶ The term "some" is optimistic because, right now, the solution under consideration
requires solving a hard problem (a special case of the DPL) on the card at each
signature!

References
1. M.-L. Akkar, R. Bévan, P. Dischamp, and D. Moyart. Power analysis, what is now
possible. Asiacrypt'00, 2000.
2. J.-S. Coron. Resistance against differential power analysis for elliptic curve
cryptosystems. CHES, 1999.
3. D.E. Knuth. The Art of Computer Programming, volume 2. Addison Wesley, third
edition, 1988.
4. P. Kocher, J. Jaffe, and B. Jun. Differential power analysis. Web Site:
www.cryptography.com/dpa, 1998.
5. P. C. Kocher. Timing attacks on implementations of Diffie-Hellman, RSA, DSS,
and other systems. Crypto '96, pages 104–113, 1996.
6. J. López and R. Dahab. Fast multiplication on elliptic curve over GF(2^m) without
precomputation. CHES, 1999.
7. A.J. Menezes, P.C. van Oorschot, and S.A. Vanstone. Handbook of Applied
Cryptography. CRC Press, 1997.
8. T.S. Messerges, E.A. Dabbish, and R.H. Sloan. Power analysis attacks of modular
exponentiation in smartcards. CHES, 1999.
9. P.L. Montgomery. Modular multiplication without trial division. Mathematics of
Computation, 54, pages 839–854, 1990.
