Exponential Bounds For Information Leakage in Unknown-Message Side-Channel Attacks
July 8, 2010

Email: [email protected]. Mailing address: 3133 Connecticut Ave., NW #521, Washington, DC 20008, USA.
Abstract: In [1], the authors introduced an important new information theoretic numerical measure
for assessing a system’s resistance to unknown-message side-channel attacks and computed a formula
for the limit of the numerical values defined by this measure as the number of side-channel observations
tends to infinity. Here, we present corresponding quantitative (exponential) bounds that yield an
actual rate-of-convergence for this limit, something not given in [1]. Such rate-of-convergence results
can potentially be used to significantly strengthen the utility of the limit formula of [1] as a tool
to reduce computational complexity difficulties associated with calculating the side-channel attack
resistance measure presented there. In addition, our approach here shows how the arguments used in
[1] to prove the limit formula can be substantially simplified.
1 Introduction
In a side-channel attack, the attacker attempts to circumvent the security of cryptographic
algorithms by exploiting information inadvertently exposed within the concrete context of their
actual, real-world implementations. Such possibly indirect yet nonetheless potentially quite
relevant information might include that revealed by running time characteristics [2], cache
behavior [4], electromagnetic signals [5], or other electronic or physical properties or emanations.
In an unknown-message side-channel attack scenario, the attacker cannot see or control the
input that is decrypted (or encrypted) by the system. An example of such an unknown-message
attack is a timing attack against systems employing advanced countermeasures such as message
blinding.
In [1], the authors propose an important new information theoretic measure for quantifying
the resistance of systems to unknown-message side-channel attacks. Their measure, denoted
Λ = Λ(n) (see (6) below), is a quantitative benchmark for measuring the reduction in the
attacker’s uncertainty (i.e., the corresponding information gain) regarding the attacked system’s
designated cryptographic key after n side-channel observations. In addition, in their Theorem
1 in [1], they compute an explicit formula for the limit of Λ(n) as n → ∞ (restated here as (7)
below).
Our own Theorem 1 in §3 below not only recovers the qualitative limit result for lim_{n→∞} Λ(n)
appearing in [1] but very significantly strengthens and improves on it by establishing quantitative
(exponential) bounds on the corresponding rate of convergence (see (8) below), something not
addressed in [1]. Additionally, in Example 1 in §3, we show how our numerical estimate in (8)
can actually be applied in a concrete, if somewhat abstract, setting. A further important
contribution of our work here is that the proof of our Theorem 1 involves what we believe to be
a much-simplified argument relative to that used in [1] to establish (7).
In [1] (see the full version of that paper), a relatively involved argument invoking somewhat
elaborate, and perhaps esoteric, results from information theory is used to compute lim_{n→∞} Λ(n).
Here, our simplified argument is entirely self-contained aside from an appeal to the well-known,
classical Hoeffding inequality. In fact, to merely identify the limit lim_{n→∞} Λ(n) as is done in [1],
without also proving our quantitative bound (8), an argument exactly analogous to the one we use
to prove our Theorem 1 would require only the standard Law of Large Numbers (in its weak form)
from probability theory.
In [1], the authors also introduce a precise algorithm for facilitating the computation of
Λ(n). However, as they themselves point out, the time complexity of the algorithm is exponential
in n, rendering computation for large values of n infeasible. In fact, they promote their
determination of the limit lim_{n→∞} Λ(n) in their Theorem 1 as a remedy for this computational
complexity problem, since it allows one to compute limiting values of the resistance to side-channel
attacks without confronting the exponential growth in n. We believe that the quantitative bound we
present here in the statement of our own Theorem 1, by giving the associated rate of convergence,
has the potential to dramatically strengthen this remediation strategy, since it shows exactly how
large n must be for lim_{n→∞} Λ(n) to be a good approximation of Λ(n). In fact, the bound shows
that the exponential increase in the time complexity of the algorithm given in [1] for computing
Λ(n) is offset by an exponential decrease in the numerical error incurred by using lim_{n→∞} Λ(n)
as an approximation for it.
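For concreteness, the bound in Theorem 1 (see (8) below) can be inverted explicitly: the error 2|V|²I exp(−2nε_V²) falls below a tolerance δ as soon as n ≥ ln(2|V|²I/δ)/(2ε_V²). The following minimal Python sketch performs this computation; the function name and all parameter values are illustrative assumptions of ours, not quantities taken from [1]:

```python
from math import ceil, log

def n_threshold(card_V, I, eps_V, delta):
    """Smallest n with 2 * |V|^2 * I * exp(-2 * n * eps_V^2) <= delta, per (8)."""
    return ceil(log(2 * card_V ** 2 * I / delta) / (2 * eps_V ** 2))

# Hypothetical parameters: 2 equivalence classes, 2 observable values,
# separation eps_V = 1/6, and a target approximation error of 10^-3.
print(n_threshold(card_V=2, I=2, eps_V=1 / 6, delta=1e-3))   # -> 175
```

Past such an n, replacing Λ(n) by its limit H(K|V) costs at most δ bits of accuracy while avoiding the exponential-in-n computation altogether.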
The rest of this article is organized as follows. In the next section, we give the necessary
background on unknown-message side-channel attacks and relevant information theoretic concepts.
In the final section, we state and prove our fundamental new result, Theorem 1, and
present an example, Example 1, that directly applies it, as discussed above.
2 Background

Following [1], the side-channel behavior of a concrete implementation IF of a cryptographic
function F is modeled by a function f = fIF : K × M → O mapping each key k ∈ K and message
m ∈ M to the corresponding side-channel observation f(k, m), where K, M, and the set of possible
observations O = {o1, ..., oI} are finite. We assume the attacker has full knowledge regarding
the implementation IF, i.e., f = fIF is known to the attacker.
In a side-channel attack, the attacker collects side-channel observations f(k, m1), ..., f(k, mn)
for ascertaining the key k or narrowing down its possible values. Such an attack is unknown-
message if the attacker cannot observe or choose the messages mi ∈ M . (By way of contrast, an
attack is known-message if the attacker can observe but cannot influence the choice of mi ∈ M ,
and an attack is chosen-message if the attacker can choose mi ∈ M .)
Now let pK : K → R and pM : M → R be probability distributions on the sets of keys K and
messages M, respectively. These immediately give rise to random variables K and M modeling
the respective random choices of keys and messages, and we assume that the distributions pK
and pM are known to the attacker. For positive integers n, let the random variable
O^n : K × M^n → O^n be defined by O^n(k, m1, ..., mn) = (f(k, m1), ..., f(k, mn)), where
pKM^n(k, m1, ..., mn) = pK(k)pM(m1) · · · pM(mn) is the probability distribution on K × M^n.
Also, we write O = O^1.
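As a concrete illustration of this setup, the following minimal Python sketch simulates the observation process; the key and message distributions and the map f are toy assumptions of ours, not an instance from [1]:

```python
import random

# Toy model: 4 keys, 4 messages, binary observations (all assumptions).
K, M = [0, 1, 2, 3], [0, 1, 2, 3]
p_K = {k: 0.25 for k in K}
p_M = {m: 0.25 for m in M}
f = lambda k, m: int(m == 3) if k < 2 else int(m != 3)   # hypothetical f(k, m)

def draw(dist):
    return random.choices(list(dist), weights=list(dist.values()))[0]

def observe(n):
    """Sample O^n: fix one secret key, then emit n unknown-message observations."""
    k = draw(p_K)
    return k, [f(k, draw(p_M)) for _ in range(n)]

random.seed(0)
print(observe(5))   # a secret key together with 5 leaked side-channel values
```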
Now, naturally, for each key k ∈ K, the random variable O gives rise to a probability
distribution pO|K=k on O. For k, k′ ∈ K, define k ≡ k′ if and only if pO|K=k = pO|K=k′. Then ≡
is an equivalence relation on K, and we define V = K/≡ to be the (finite) set of equivalence
classes, with |V| its cardinality. The random variable V : K → V defined by V(k) = [k] maps
every key to its equivalence class with respect to ≡.
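The classes making up V can be computed directly by grouping keys whose induced observation distributions coincide. Here is a minimal sketch for the toy instance above (the quantity ε_V evaluated at the end is defined in (5) below); all concrete values are illustrative assumptions:

```python
from itertools import combinations

K, M, O = [0, 1, 2, 3], [0, 1, 2, 3], [0, 1]   # toy instance (assumptions)
p_M = {m: 0.25 for m in M}
f = lambda k, m: int(m == 3) if k < 2 else int(m != 3)

def obs_dist(k):
    """The distribution p_{O|K=k} induced by f and p_M, as a tuple."""
    d = {o: 0.0 for o in O}
    for m in M:
        d[f(k, m)] += p_M[m]
    return tuple(round(d[o], 12) for o in O)

# Group keys k == k' whenever p_{O|K=k} = p_{O|K=k'}.
classes = {}
for k in K:
    classes.setdefault(obs_dist(k), []).append(k)
print(list(classes.values()))                  # -> [[0, 1], [2, 3]]

# eps_V as in (5): a third of the smallest sup-distance between classes.
eps_V = min(max(abs(a - b) for a, b in zip(d1, d2))
            for d1, d2 in combinations(classes, 2)) / 3
print(eps_V)                                   # -> 0.1666...
```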
Now, for any random variable X assuming values in the set X, we of course have the
information theoretic (Shannon) entropy

$$H(X) = -\sum_{x \in X} p_X(x) \log_2\big(p_X(x)\big). \tag{1}$$
If Y is another random variable defined on the same probability space taking values in some set
Y, then, for y ∈ Y, we denote by H(X|Y = y) the entropy of X with respect to the distribution
p_{X|Y=y}. As in [1], we then define the conditional entropy as

$$H(X|Y) = \sum_{y \in Y} p_Y(y) H(X|Y=y). \tag{2}$$
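Definitions (1) and (2) transcribe directly into code. The sketch below represents distributions as dictionaries; the tiny check at the end is a toy assumption of ours:

```python
from math import log2

def entropy(p):
    """Shannon entropy H(X) as in (1); p maps values to probabilities."""
    return -sum(q * log2(q) for q in p.values() if q > 0)

def cond_entropy(p_joint, p_y):
    """Conditional entropy H(X|Y) as in (2), from p(x, y) and the marginal p(y)."""
    xs = {x for (x, _) in p_joint}
    h = 0.0
    for y, py in p_y.items():
        if py > 0:
            h += py * entropy({x: p_joint.get((x, y), 0.0) / py for x in xs})
    return h

# A fair bit observed perfectly: H(X) = 1 while H(X|Y) = 0.
p_y = {0: 0.5, 1: 0.5}
p_joint = {(0, 0): 0.5, (1, 1): 0.5}
print(entropy(p_y), cond_entropy(p_joint, p_y))   # -> 1.0 0.0
```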
We have the identity

$$H(XY) = H(Y) + H(X|Y). \tag{3}$$

Next, for any positive integer n, any o^n = (o^n_1, ..., o^n_n) ∈ O^n, and each i = 1, ..., I, let
t_i count the occurrences of the observation o_i among the entries of o^n:

$$t_i = t_i(o_i, o^n) \stackrel{\text{def}}{=} \big|\{\, j : 1 \le j \le n,\ o^n_j = o_i \,\}\big|, \tag{4}$$

with o^n = (o^n_1, ..., o^n_n) and | · | denoting the cardinality of the enclosed set. Finally, in the next
section, we will also require a measure of how far apart, in a sense, the distributions of the form
p_{O|V=B}, B ∈ V, are. So, for this, set

$$\epsilon_V \stackrel{\text{def}}{=} \frac{1}{3} \min_{\substack{B, B' \in V \\ B \neq B'}} \, \max_{i=1,\dots,I} \big| p_{O|V=B}(o_i) - p_{O|V=B'}(o_i) \big|, \tag{5}$$

so that the distributions attached to any two distinct classes in V differ by more than 2ε_V in
at least one coordinate; provided |V| ≥ 2, we have ε_V > 0, since distinct equivalence classes
induce, by definition, distinct distributions on the finite set O.

In [1], the authors propose

$$\Lambda(n) \stackrel{\text{def}}{=} H(K|O^n), \quad n \text{ a positive integer}, \tag{6}$$

as a practical measure for potentially quantifying the resistance of systems against unknown-message
side-channel attacks. In addition, they prove that

$$\lim_{n \to \infty} \Lambda(n) = H(K|V). \tag{7}$$
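Both the exponential cost of evaluating Λ(n) exactly and the convergence (7) are visible in a brute-force computation. The sketch below uses the same toy instance as before (an assumption of ours); the loop over all of O^n is precisely what makes the cost grow like |O|^n:

```python
from itertools import product
from math import log2

K, M, O = [0, 1, 2, 3], [0, 1, 2, 3], [0, 1]   # toy instance (assumptions)
p_K = {k: 0.25 for k in K}
p_M = {m: 0.25 for m in M}
f = lambda k, m: int(m == 3) if k < 2 else int(m != 3)

def p_o_given_k(o, k):                          # p_{O|K=k}(o)
    return sum(p_M[m] for m in M if f(k, m) == o)

def Lambda(n):                                  # Lambda(n) = H(K|O^n), as in (6)
    h = 0.0
    for on in product(O, repeat=n):             # all |O|^n observation tuples
        joint = []
        for k in K:
            q = p_K[k]
            for o in on:
                q *= p_o_given_k(o, k)
            joint.append(q)                     # q = p(K = k, O^n = on)
        p_on = sum(joint)
        h -= sum(q * log2(q / p_on) for q in joint if q > 0)
    return h

for n in (1, 2, 4, 8, 12):
    print(n, round(Lambda(n), 4))               # decreases toward H(K|V) = 1
```

Here the two classes {0, 1} and {2, 3} each contain two equally likely keys, so H(K|V) = 1, and the printed values can be watched approaching it.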
3 Main Result

Our own Theorem 1 below recovers the qualitative result (7) above that appears in [1] but
very significantly strengthens and improves on it by establishing a quantitative (exponential)
estimate for the difference H(K|O^n) − H(K|V) (see (8) below). This yields a bound on the
corresponding convergence rate, something not given in [1]. Moreover, we believe that an
additional important contribution of our work here is that the proof of our Theorem 1 involves
a much-simplified argument relative to that used in [1] to establish (7), as discussed in more
detail in the Introduction.
So, without further ado, our numerical estimate on information gain after repeated unknown-message
side-channel measurements is given in the following theorem.
Theorem 1. For every positive integer n, we have

$$0 \le \Lambda(n) - H(K|V) = H(K|O^n) - H(K|V) \le 2|V|^2 I \exp\big(-2n\epsilon_V^2\big). \tag{8}$$

In particular, Λ(n) → H(K|V) exponentially fast as n → ∞, recovering (7).
The proof of Theorem 1 rests on the following lemma.

Lemma 1. For every positive integer n,

$$H(K|O^n) - H(K|V) = H(V|O^n). \tag{10}$$

Proof. Note that p_{O|K=k} = p_{O|V=[k]}, so that H(O^n|K = k) = H(O^n|V = [k]) and

$$H(O^n|K) = \sum_{k \in K} p_K(k) H(O^n|K=k) = \sum_{B \in V} \sum_{k \in B} p_K(k) H(O^n|V=B) = \sum_{B \in V} p_V(B) H(O^n|V=B) = H(O^n|V). \tag{11}$$
Now observe that H(KV) = H(K), since V is determined by K. Hence, invoking (3), H(K|O^n) −
H(K|V) = (H(O^n|K) + H(K) − H(O^n)) − (H(KV) − H(V)) = H(O^n|K) + H(V) − H(O^n). Using
(11), it now follows that H(O^n|K) + H(V) = H(O^n|V) + H(V) = H(O^nV). As H(O^nV) −
H(O^n) = H(V|O^n), (10) immediately follows.
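Lemma 1 can also be checked numerically. The following sketch evaluates both sides of (10) by direct enumeration on the same toy instance as above (an assumption of ours):

```python
from itertools import product
from math import log2

K, M, O = [0, 1, 2, 3], [0, 1, 2, 3], [0, 1]   # toy instance (assumptions)
p_K = {k: 0.25 for k in K}
p_M = {m: 0.25 for m in M}
f = lambda k, m: int(m == 3) if k < 2 else int(m != 3)
V_of = {0: 0, 1: 0, 2: 1, 3: 1}                # classes [[0, 1], [2, 3]]

def p_o_given_k(o, k):
    return sum(p_M[m] for m in M if f(k, m) == o)

def H(probs):
    return -sum(q * log2(q) for q in probs if q > 0)

def both_sides(n):
    hKO = hVO = 0.0
    for on in product(O, repeat=n):
        joint = []
        for k in K:
            q = p_K[k]
            for o in on:
                q *= p_o_given_k(o, k)
            joint.append(q)
        p_on = sum(joint)
        post = [q / p_on for q in joint]       # p_{K|O^n = on}
        hKO += p_on * H(post)
        hVO += p_on * H([sum(p for k, p in zip(K, post) if V_of[k] == c)
                         for c in (0, 1)])
    return hKO - 1.0, hVO                      # H(K|V) = 1 for this instance

print(both_sides(3))                           # the two numbers coincide
```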
Proof of Theorem 1. By Lemma 1 above, it is sufficient to show that H(V|O^n) ≤ 2|V|²I exp(−2nε_V²).
Note first that from the definitions (1) and (2), it follows that

$$H(V|O^n) = \sum_{o^n \in O^n} p_{O^n}(o^n)\Big( -\sum_{B \in V} p_{V|O^n=o^n}(B)\, \log_2\big(p_{V|O^n=o^n}(B)\big) \Big). \tag{12}$$
Now, for each i = 1, ..., I, let p^n_i = p^n_i(o_i, o^n) = t_i/n, where t_i is as in (4), and, for each
B ∈ V = K/≡ and o^n ∈ O^n, where n is any positive integer, write

$$m(o^n, B) \stackrel{\text{def}}{=} \max_{i=1,\dots,I} \big|\, p^n_i(o_i, o^n) - p_{O|V=B}(o_i) \,\big|. \tag{13}$$
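In words, m(o^n, B) is the worst-case discrepancy between the empirical frequencies t_i/n and the class distribution p_{O|V=B}. A minimal sketch, with a hypothetical class distribution matching the toy instance above:

```python
O = [0, 1]
p_B = {0: 0.75, 1: 0.25}                       # hypothetical p_{O|V=B}

def m_stat(on, p_class):
    """The statistic (13): sup-distance between t_i/n and p_{O|V=B}."""
    n = len(on)
    return max(abs(on.count(o) / n - p_class[o]) for o in O)

print(m_stat((0, 0, 1, 0, 0, 0, 1, 0), p_B))   # -> 0.0: frequencies match p_B
```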
Note that, by iterated application of the classical Hoeffding inequality (see, for example,
Theorem 2.3(a) in [3]), it follows that, for any B ∈ V and ε > 0,

$$p_{O^n|V=B}\big(\{\, o^n \in O^n : m(o^n, B) > \epsilon \,\}\big) \le 2I \exp\big(-2n\epsilon^2\big). \tag{14}$$
In fact, (14) directly follows from the Hoeffding inequality because, for each i = 1, ..., I, we can
define indicator random variables that are set to 1 if o^n_j = o_i, where o^n = (o^n_1, ..., o^n_n), and
to 0 otherwise, so that the expectation in the statement of Hoeffding's inequality is simply the
probability p_{O|V=B}(o_i); a union bound over the I coordinates then produces the factor 2I in
(14). In any case, notice as well that (14) can equivalently be stated as

$$p_{O^n|V=B}\big(\{\, o^n \in O^n : m(o^n, B) \le \epsilon \,\}\big) \ge 1 - 2I \exp\big(-2n\epsilon^2\big). \tag{15}$$
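The bound (14) is easy to probe numerically. The minimal Monte Carlo sketch below draws o^n i.i.d. from an assumed class distribution and compares the empirical exceedance frequency with the Hoeffding union bound:

```python
import random
from math import exp

random.seed(1)
O = [0, 1]
p_B = {0: 0.75, 1: 0.25}                       # hypothetical p_{O|V=B}
n, eps, trials = 200, 0.1, 20000

def m_stat(on):
    return max(abs(on.count(o) / n - p_B[o]) for o in O)

# Count how often m(o^n, B) exceeds eps over many simulated o^n.
bad = sum(m_stat(random.choices(O, weights=[p_B[o] for o in O], k=n)) > eps
          for _ in range(trials))
print(bad / trials, "<=", 2 * len(O) * exp(-2 * n * eps ** 2))
```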
Now consider that the definition of V = K/≡ clearly implies that, if ε is small enough, in
particular if ε = ε_V with ε_V defined as in (5) above, then, for any o^n ∈ O^n, any B ∈ V satisfying
m(o^n, B) ≤ ε (if such a B ∈ V does exist) must be unique, and we denote this corresponding
unique equivalence class B by B_{o^n}. Hence, for such small ε, the random variable V is fully
determined by the random variable O^n under m(·, ·), and in fact we have V = B_{O^n}. So, for
ε = ε_V, we can rewrite the conclusion of (15) as the statement that, with p_{O^n|V=B}-probability
at least 1 − 2I exp(−2nε_V²), the observed o^n satisfies

$$p_{V|O^n=o^n}(B_{o^n}) = 1 \quad \text{and} \quad p_{V|O^n=o^n}(B) = 0 \ \text{ for } B \in V,\ B \neq B_{o^n}. \tag{17}$$
Now let E_n denote the set of all o^n ∈ O^n for which (17) holds, so that, summing over B ∈ V,

$$p_{O^n}(O^n \setminus E_n) = \sum_{B \in V} p_V(B)\, p_{O^n|V=B}(O^n \setminus E_n) \le 2|V| I \exp\big(-2n\epsilon_V^2\big). \tag{18}$$

Hence, since for x ∈ (0, 1], |x log₂(x)| ≤ 1, it follows directly from (12) along with (18) that

$$H(V|O^n) \le |V|\, p_{O^n}(O^n \setminus E_n) \le 2|V|^2 I \exp\big(-2n\epsilon_V^2\big),$$

which, in view of Lemma 1, completes the proof of Theorem 1.
References