Optimal Asymmetric Encryption
1 Introduction
Asymmetric (i.e. public key) encryption is a goal for which there is a large and
widely-recognized gap between practical schemes and provably-secure ones: the
practical methods are efficient but not well-founded, while the provably-secure
schemes have more satisfying security properties but are not nearly as efficient.
The goal of this paper is to (nearly) have it all: to do asymmetric encryption in
a way as efficient as any mechanism yet suggested, yet to achieve an assurance
benefit almost as good as that obtained by provable security.
In the setup we consider a sender who holds a $k$-bit to $k$-bit trapdoor
permutation $f$ and wants to transmit a message $x$ to a receiver who holds the
inverse permutation $f^{-1}$. We concentrate on the case which arises most often
in cryptographic practice, where $n = |x|$ is at least a little smaller than $k$.
What practitioners want is the following: encryption should require just one
computation of $f$; decryption should require just one computation of $f^{-1}$; the
length of the enciphered text should be precisely $k$; and the length $n$ of the
text $x$ that can be encrypted is close to $k$. Since heuristic schemes achieving
these conditions exist [22, 15], if provable security is provided at the cost of
violating any of these conditions (e.g., two applications of $f$ to encrypt, message
length $n + k$ rather than $k$) practitioners will prefer the heuristic constructions.
A variety of goals for encryption have come to be known which are actually
stronger than the notion of [11]. These include non-malleability [7] and chosen
ciphertext security. We introduce a new notion of an encryption scheme being
plaintext-aware: roughly said, it should be impossible for a party to produce a
valid ciphertext without "knowing" the corresponding plaintext (see Section 3
for a precise definition). In the ideal-hash model that we assume, this notion can
be shown to imply non-malleability and chosen-ciphertext security.
We construct a plaintext-aware encryption scheme by slightly modifying the
basic scheme. Let $k$ and $k_0$ be as before and let $k_1$ be another parameter. This
time let $n = k - k_0 - k_1$. Let the generator be $G: \{0,1\}^{k_0} \to \{0,1\}^{n+k_1}$ and the
hash function $H: \{0,1\}^{n+k_1} \to \{0,1\}^{k_0}$. To encrypt, choose a random $k_0$-bit $r$
and set
\[ \mathcal{E}^{G,H}(x) = f\bigl(x0^{k_1} \oplus G(r) \,\|\, r \oplus H(x0^{k_1} \oplus G(r))\bigr). \]
1.3 Efficiency
The function $f$ can be set to any candidate trapdoor permutation such as RSA
[21] or modular squaring [19, 3]. In such a case the time for computing $G$ and
$H$ is negligible compared to the time for computing $f, f^{-1}$. Thus complexity is
discussed only in terms of $f, f^{-1}$ computations. In this light our basic encryption
scheme requires just a single application of $f$ to encrypt, a single application
of $f^{-1}$ to decrypt, and the length of the ciphertext is $k$ (as long as $k \ge n + k_0$).
Our plaintext-aware scheme requires a single application of $f$ to encrypt, a single
application of $f^{-1}$ to decrypt, and the length of the ciphertext is still $k$ (as long
as $k \ge n + k_0 + k_1$).
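The length conditions above amount to simple arithmetic. The following is a minimal sketch, assuming bit lengths are given as plain integers; the helper name `plaintext_bits` and the sample values are illustrative and not taken from the paper.

```python
def plaintext_bits(k: int, k0: int, k1: int = 0) -> int:
    """Largest plaintext length n (in bits) with n + k0 + k1 = k.

    k  : bit length of the trapdoor permutation's domain
    k0 : bit length of the random string r
    k1 : bit length of the zero redundancy (0 for the basic scheme)
    """
    n = k - k0 - k1
    if n <= 0:
        raise ValueError("k0 + k1 must be smaller than k")
    return n


# Illustrative parameter choices only.
print(plaintext_bits(1024, 128))        # basic scheme: 896
print(plaintext_bits(1024, 128, 128))   # plaintext-aware scheme: 768
```

In both cases the ciphertext remains exactly $k$ bits; the extra parameter only reduces the room left for the message.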
A concrete instantiation of our plaintext aware scheme (using RSA for f and
getting G, H from the Secure Hash Algorithm [18]) is given in Section 7.
much greater assurance benefit than purely ad hoc protocol design. We refer
the reader to that paper for further discussion of the meaningfulness, motivation
and history of this ideal hash approach.
1.6 Extensions
1.7 Prior work in encryption
6 Exact security is not new: previous works which address it explicitly include [10,
14, 23, 16, 8, 1]. Moreover, although it is true that most theoretical works only
provide asymptotic security guarantees of the form "the success probability of a
polynomially bounded adversary is negligible" (everything measured as a function
of the security parameter), the exact security can be derived from examination of
the proof. (However, a lack of concern with the exactness means that in many cases
the reductions are very inefficient, and the results are not useful for practice.)
matching our plaintext aware scheme in computation but having bit complex-
ity n + k + kl. Non-malleability is provably achieved by [7], but the scheme
is extremely inefficient. An efficient scheme proven in [2] to achieve both non-
malleability and chosen-ciphertext security under the ideal-hash model is
2 Preliminaries
We extend the definition of semantic security [11] to the random oracle model
in a way which enables us to discuss exact security.
The following definition will be used to discuss (exact) security. It captures the
notion of semantic security [11] appropriately lifted to take into account the
presence of G, H .
7 Candidates like RSA [21] don't quite fit our definition, in that the domain of RSA
is some $Z_N^*$, a proper subset of $\{0,1\}^k$. Things can be patched in standard ways.
Note that $t$ is the total running time; i.e., the sum of the times in the two stages.
Similarly $q_{gen}, q_{hash}$ are the total number of G and H queries, respectively.
Let $\mathcal{F}$ be a trapdoor permutation generator and $k_0(\cdot)$ a positive integer valued
function such that $k_0(k) < k$ for all $k > 1$. The basic scheme $\mathcal{G}$ with parameters
$\mathcal{F}$ and $k_0(\cdot)$ has an associated plaintext-length function of $n(k) = k - k_0(k)$. On
input $1^k$, the generator $\mathcal{G}$ runs $\mathcal{F}(1^k)$ to obtain $(f, f^{-1})$. Then it outputs the
pair of algorithms $(\mathcal{E}, \mathcal{D})$ determined as follows:
(1) On input $x$ of length $n = n(k)$, algorithm $\mathcal{E}$ selects a random $r$ of length
$k_0 = k_0(k)$. It sets $s = x \oplus G(r)$ and $t = r \oplus H(s)$. It sets $w = s \| t$ and
returns $y = f(w)$.
(2) On input $y$ of length $k$, algorithm $\mathcal{D}$ computes $w = f^{-1}(y)$. Then it sets $s$
to the first $n$ bits of $w$ and $t$ to the last $k_0$ bits of $w$. It sets $r = t \oplus H(s)$,
and returns the string $x = s \oplus G(r)$.
The oracles $G$ and $H$ which $\mathcal{E}$ and $\mathcal{D}$ reference above have input/output lengths
of $G: \{0,1\}^{k_0} \to \{0,1\}^n$ and $H: \{0,1\}^n \to \{0,1\}^{k_0}$. We use the encoding of $f$
as the encoding of $\mathcal{E}$ and the encoding of $f^{-1}$ as the encoding of $\mathcal{D}$.
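The two algorithms of the basic scheme can be sketched in a few lines. This is only an illustration under stated assumptions: the oracles G and H are stood in for by counter-mode SHA-256 (not the paper's instantiation), the byte lengths `N` and `K0` are arbitrary, and the "permutation" `f` is a byte reversal that is neither trapdoor nor secure. The message is called `x` in the code.

```python
import hashlib
import os

N = 32    # plaintext length in bytes (illustrative)
K0 = 16   # length of the random string r in bytes (illustrative)


def _xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def _oracle(tag: bytes, data: bytes, out_len: int) -> bytes:
    """Stand-in for a random oracle: counter-mode SHA-256 (an assumption)."""
    out, counter = b"", 0
    while len(out) < out_len:
        out += hashlib.sha256(tag + counter.to_bytes(4, "big") + data).digest()
        counter += 1
    return out[:out_len]


def G(r: bytes) -> bytes:      # G: {0,1}^k0 -> {0,1}^n
    return _oracle(b"G", r, N)


def H(s: bytes) -> bytes:      # H: {0,1}^n -> {0,1}^k0
    return _oracle(b"H", s, K0)


def f(w: bytes) -> bytes:      # placeholder permutation, NOT a trapdoor permutation
    return w[::-1]


def f_inv(y: bytes) -> bytes:  # inverse of the placeholder permutation
    return y[::-1]


def encrypt(x: bytes, r: bytes = None) -> bytes:
    """Step (1): s = x xor G(r), t = r xor H(s), y = f(s || t)."""
    assert len(x) == N
    if r is None:
        r = os.urandom(K0)
    s = _xor(x, G(r))
    t = _xor(r, H(s))
    return f(s + t)


def decrypt(y: bytes) -> bytes:
    """Step (2): split w = f^{-1}(y) into s, t; recover r, then x."""
    w = f_inv(y)
    s, t = w[:N], w[N:]
    r = _xor(t, H(s))
    return _xor(s, G(r))
```

A round trip such as `decrypt(encrypt(x))` returns `x`, and the ciphertext has length `N + K0` bytes, matching the requirement that the ciphertext length equal $k = n + k_0$.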
The intuition behind the (semantic) security of this scheme is as follows. We
wish to guarantee that the adversary, given a point $y$ in the range of $f$, must
recover the complete preimage $w = s \| t$ of $y$ if she is to say anything meaningful
about $x$ itself. Well, if the adversary does not recover all of the first $n$ bits
of the preimage, $s$, then she will have no idea about the value $H(s)$ which
is its hash; a failure to know anything about $H(s)$ implies a failure to know
anything about $r = H(s) \oplus t$ (where $t$ is the last $k_0$ bits of $w$), and therefore $G(r)$,
and therefore $x = G(r) \oplus s$ itself. Now, assuming the adversary does recover $s$,
a failure to completely recover $t$ will again mean that the adversary fails to
completely recover $r$, and, in the lack of complete knowledge about $r$, $x \oplus G(r)$
is uniformly distributed and so again the adversary can know nothing about $x$.
Yet the above discussion masks some subtleties and a formal proof of security
is more complex than it might appear. This is particularly the case when one is
interested, as we are here, in achieving the best possible exact security.
The following theorem says that if there is an adversary A who is able to
break the encryption scheme with some success probability, then there is an algo-
rithm M which can invert the underlying trapdoor permutation with comparable
success probability and in comparable time. This implies that if the trapdoor
permutations can't be inverted in reasonable time (which is the implicit assump-
tion) then our scheme is secure. But the theorem says more: it specifies exactly
how the resources and success of M relate to those of A and to the underlying
scheme parameters $k, n, k_0$ ($k = n + k_0$).
The inverting algorithm $M$ can be obtained from $A$ in a "uniform" way;
the theorem says there is a "universal" oracle machine $U$ such that $M$ can be
implemented by $U$ with oracle access to $A$. It is important for practice that the
"description" of $U$ is "small"; this is not made explicit in the theorem but is clear
from the proof. The constant $\lambda$ depends only on details of the underlying model
of computation. We write $n, k_0$ for $n(k), k_0(k)$, respectively, when, as below, $k$ is
understood.
Let $\mathcal{F}$ be a trapdoor permutation generator. Let $k_0(\cdot)$ and $k_1(\cdot)$ be positive
integer valued functions such that $k_0(k) + k_1(k) < k$ for all $k > 1$. The plaintext-
aware scheme $\mathcal{G}$ with parameters $\mathcal{F}, k_0, k_1$ has an associated plaintext-length
function of $n(k) = k - k_0(k) - k_1(k)$. On input $1^k$, the generator $\mathcal{G}$ runs $\mathcal{F}(1^k)$
to obtain $(f, f^{-1})$. Then it outputs the pair of algorithms $(\mathcal{E}, \mathcal{D})$ determined as
follows:
(1) On input $x$ of length $n = n(k)$, algorithm $\mathcal{E}$ selects a random $r$ of length
$k_0 = k_0(k)$. It sets $s = x0^{k_1} \oplus G(r)$ and $t = r \oplus H(s)$. It sets $w = s \| t$ and
returns $y = f(w)$.
(2) On input $y$ of length $k$, algorithm $\mathcal{D}$ computes $w = f^{-1}(y)$. Then it sets $s$
to the first $n + k_1$ bits of $w$ and $t$ to the last $k_0$ bits of $w$. It sets $r = t \oplus H(s)$.
It sets $x$ to the first $n$ bits of $s \oplus G(r)$ and $z$ to the last $k_1$ bits of $s \oplus G(r)$.
If $z = 0^{k_1}$ then it returns $x$, else it returns $*$.
The oracles $G$ and $H$ which $\mathcal{E}$ and $\mathcal{D}$ reference above have input/output lengths
of $G: \{0,1\}^{k_0} \to \{0,1\}^{n+k_1}$ and $H: \{0,1\}^{n+k_1} \to \{0,1\}^{k_0}$.
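The effect of the $0^{k_1}$ redundancy is easiest to see in code. The sketch below makes the same assumptions as any toy illustration would: G and H are stood in for by counter-mode SHA-256, the lengths `N`, `K0`, `K1` are arbitrary byte counts, and `f` is a byte reversal that is not a trapdoor permutation. Returning `None` plays the role of the reject symbol $*$.

```python
import hashlib
import os

N, K0, K1 = 32, 16, 16   # illustrative byte lengths for n, k0, k1


def _xor(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length strings."""
    return bytes(p ^ q for p, q in zip(a, b))


def _oracle(tag: bytes, data: bytes, out_len: int) -> bytes:
    """Stand-in for a random oracle: counter-mode SHA-256 (an assumption)."""
    out, counter = b"", 0
    while len(out) < out_len:
        out += hashlib.sha256(tag + counter.to_bytes(4, "big") + data).digest()
        counter += 1
    return out[:out_len]


def G(r: bytes) -> bytes:      # G: {0,1}^k0 -> {0,1}^(n+k1)
    return _oracle(b"G", r, N + K1)


def H(s: bytes) -> bytes:      # H: {0,1}^(n+k1) -> {0,1}^k0
    return _oracle(b"H", s, K0)


def f(w: bytes) -> bytes:      # placeholder permutation, NOT a trapdoor permutation
    return w[::-1]


def f_inv(y: bytes) -> bytes:  # inverse of the placeholder permutation
    return y[::-1]


def encrypt(x: bytes, r: bytes = None) -> bytes:
    """Step (1): s = (x || 0^k1) xor G(r), t = r xor H(s), y = f(s || t)."""
    assert len(x) == N
    if r is None:
        r = os.urandom(K0)
    s = _xor(x + bytes(K1), G(r))
    t = _xor(r, H(s))
    return f(s + t)


def decrypt(y: bytes):
    """Step (2): recover x || z and accept only if z is all zeroes."""
    w = f_inv(y)
    s, t = w[:N + K1], w[N + K1:]
    r = _xor(t, H(s))
    padded = _xor(s, G(r))
    x, z = padded[:N], padded[N:]
    return x if z == bytes(K1) else None   # None stands for the symbol *
```

An honestly produced ciphertext decrypts to the original message, while a ciphertext modified after encryption is rejected except with small probability, because the recovered $z$ is then unlikely to equal $0^{k_1}$.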
The semantic security of this scheme as given by the following theorem is a
consequence of Theorem 3.
Proof. Let $\mathcal{G}'$ be the generator for the basic scheme with parameters $\mathcal{F}$ and $k_0$;
the associated plaintext-length function is $n'(k) = k - k_0(k) = n(k) + k_1(k)$. Let
$A'$ be the adversary for $\mathcal{G}'$ who (i) in the find-stage runs $A$ to get $(x_0, x_1, c)$ and
outputs $(x_0 0^{k_1}, x_1 0^{k_1}, c)$; and (ii) in the guess-stage removes the padded zeroes
from the messages and runs $A$. Now apply Theorem 3 to $A'$. $\Box$
The intuition for the plaintext awareness of our encryption scheme can be
described as follows. Let $y$ be the string output by $B$. If she hasn't asked $G(r)$,
then almost certainly the first $n + k_1$ bits of the preimage of $y$ won't end with
the right substring $0^{k_1}$; and if she hasn't asked $H(s)$, then she can't know $r$; but
if the adversary does know $s$, then certainly she knows its first $n$ bits, which is $x$.
To discuss exact security it is convenient to say that adversary $B(\cdot)$ is a
$(t, q_{gen}, q_{hash})$-adversary for $\mathcal{G}(1^k)$ if for all $(\mathcal{E}, \mathcal{D}) \in [\mathcal{G}(1^k)]$, $B(\mathcal{E})$ runs in at
most $t$ steps, makes $q_{gen}$ G-queries and makes $q_{hash}$ H-queries.
machine $K$ and a constant $\lambda$ such that for each integer $k$ the following is true.
Suppose $B$ is a $(t, q_{gen}, q_{hash})$-adversary for $\mathcal{G}(1^k)$. Then $K = U^B$ is a $(t', \epsilon')$-
plaintext extractor for $B$, $\mathcal{G}$, where
[Figure: pseudocode of the concrete scheme. Only the closing fragment "i ← i + 1; until IND(v_i); return f(v_i);" survives.]
do not specify; and adding enough additional padding to fill out the length of
the string we have made to $k - 128$ bits. The resulting string $x$ now plays the
same role as the $x$ of our basic scheme, and a separate 128-bit $r$ is then used to
encrypt it.
We comment that in the concrete scheme shown in Figure 7 we have elected
to make our generator and hash function sensitive both to our scheme itself
(via K0) and to the particular function f (via desc). Such "key separation" is
a generally-useful heuristic to help ensure that when the same key is used in
multiple (separately-secure) algorithms that the internals of these algorithms do
not interact in such a way as to jointly compromise security. The use of "key
variants" o l , o's and ~'s is motivated similarly. Our choice to only use half the
bits of SHA has to do with a general "deficiency" in the use of SHA-like hash
functions to instantiate random oracles; see [2] for a discussion.
Acknowledgments
We thank Don Johnson for an early discussion on this problem, where he
described the method of [15]. We thank Silvio Micali for encouraging us to find and
present the exact security of our constructions. Thanks also to (anonymous)
Eurocrypt reviewers for their comments and corrections.
This work was carried out while the second author worked for IBM in Austin,
Texas (System Design (Jamil Bissar), PSP LAN Systems).
References
A Proof of Theorem 3
(3.2) Suppose $A$ makes G-query $g$. Then for each $h$ on the H-list $M$
constructs the string $w_{h,g} = h \| g \oplus H_h$ and computes $y_{h,g} = f(w_{h,g})$.
(3.2.1) If there are $h, g$ such that $y_{h,g} = y$ then $M$ sets $w^* = w_{h,g}$. It
sets $G_g = h \oplus x_b$, adds $g$ to the G-list, and returns $G_g$ to $A$.
(3.2.2) Else (i.e., there are no $h, g$ such that $y_{h,g} = y$) $M$ provides $A$
with a random string $G_g$ of length $n$ and adds $g$ to the G-list.
The output of $M$ is $w^*$ if this string was defined in the above experiment, and
fail otherwise. Note that the H-list and G-list include the queries of both the
find and guess stages of $A$'s execution.
It is easy to verify that the amount of time $t'$ to carry out Game 1 is as claimed.
It is also easy to verify that there is a universal machine $U$ such that the
computation of $M$ can be done by $U^A$.
We note that as soon as $M$ successfully finds a point $w^* = f^{-1}(y)$, it could stop
and output $w^*$. Not only do we have it go on, but some variables and actions
(such as the usage of the bit $b$ in Step (3.2.1)) come into play only after $w^*$ is
found. These "unnecessary" actions do not affect the success probability of $M$
but we put them in to simplify our exposition of the analysis of $M$'s success
probability. The intuition is that $A$ in the above experiment is trying to predict
$b$ and $M$ is trying to make the distribution provided to $A$ look like that which
$A$ would expect were $A$ running under the experiment which defines $A$'s success
in breaking the encryption scheme. Unfortunately, $M$ does not provide $A$ with
a simulation which is quite perfect. Let us now proceed to the analysis.
We consider the probability space given by the above experiment. The inputs
$f, y$ to $M$ are drawn at random according to $(f, f^{-1}) \leftarrow \mathcal{F}(1^k)$; $y \leftarrow \{0,1\}^k$. We
call this "Game 1" and we let $\Pr_1[\cdot]$ denote the corresponding probability.
Let $w = f^{-1}(y)$ and write it as $w = s \| t$ where $|s| = n$ and $|t| = k_0$. Let $r$ be
the random variable $t \oplus H(s)$. We consider the following events.
\[ W = \mathsf{AskR} \wedge \mathsf{AskS}. \]
Now consider the experiment which defines the advantage of $A$. Namely, first
choose $(f_*, f_*^{-1}) \leftarrow \mathcal{F}(1^k)$ and let $\mathcal{E}_*$ be the corresponding encryption function
under the basic scheme. Then choose
and run A (y*, =L c*). Let Pr; [.] be the corresponding distribution and
Game 1* the game.
Now consider playing Game 1* a little bit differently. As before, choose
$(f_*, f_*^{-1}) \leftarrow \mathcal{F}(1^k)$ and let $\mathcal{E}_*$ be the corresponding encryption function. But now choose
$y^* \leftarrow \{0,1\}^k$ uniformly at random first, and then select the rest according to
the distribution which makes the outcome the same as in Game 1*. (This is
possible because the distribution on $y^*$-values in Game 1* is indeed uniform.)
We let Game 2* be this different way of playing Game 1*.
We claim that Game 2 and Game 2* are identical in the sense that the view of
A at any point in these two games is the same. Indeed we have chosen the event
G so that the oracle queries we are returning in Game 1 will mimic Game 2* as
long as G remains true.
We omit details to formally justify these claims, but a good way to get some
intuition is to assume for simplicity that the find-stage is trivial and $A$ always
outputs the same strings $x_0^*, x_1^*, c^*$. Now if $y^*$ is fixed then the conditional
distribution on $G_*, H_*$ can be described as follows: Pick $H_*$ at random; pick $G_*(g)$
to be random whenever $g \ne t \oplus H_*(s)$. But $G_*(t \oplus H_*(s))$ must be constrained to
be either $s \oplus x_0^*$ or $s \oplus x_1^*$, the choice of which being at random.
To proceed further with our analysis (of Game 1), let us introduce the following
additional events:
The first step is to show that the probability that the good event fails is low.
Proof. The intuition is that as long as H-query $s$ has not been made, each
G-query has probability only $2^{-k_0}$ of being $r$. Now, $\neg G = \mathsf{FBAD} \vee \mathsf{GBAD}$. In GBAD
is already included the fact that no H-query of $s$ has been made before the
G-query $r$. But in FBAD it could be that H-query $s$ was made. But the probability
The random choice of $y$ implies that $\Pr_1[\mathsf{FAskS}] \le q_{hash} 2^{-n}$. On the other hand
$\Pr_1[\mathsf{AskR} \mid \neg\mathsf{FAskS}] \le q_{gen} 2^{-k_0}$. $\Box$
\[ \Pr_2[W] \ge \Pr_2[A = b] - \frac{1}{2} - \frac{2 q_{gen} 2^{-k}}{\Pr_1[G]} \]
Now we wish to upper bound the last two terms above. If $\neg\mathsf{AskR}$ then clearly $A$
has no advantage in predicting $b$:
\[ \Pr_2[A = b \mid \neg\mathsf{AskR}] \le 1/2. \qquad (2) \]
On the other hand let RBS be the event that $r$ is on the G-list and at the time
it was put there, $s$ was not on the H-list. Recall that $k = k_0 + n$. One can check
that:
\[ \Pr_1[\mathsf{AskR} \wedge \neg\mathsf{AskS} \wedge G] = \Pr_1[\mathsf{RBS} \wedge G_r \in \{s \oplus x_0, s \oplus x_1\}] \]
\[ = \Pr_1[\mathsf{RBS}] \cdot \Pr_1[G_r \in \{s \oplus x_0, s \oplus x_1\} \mid \mathsf{RBS}] \]
Now put the bounds provided by (2) and (4) into (1) to get
\[ \Pr_2[A = b] \le \Pr_2[W] + \frac{1}{2} + \frac{2 q_{gen} 2^{-k}}{\Pr_1[G]} \]
\[ \ge \left(\epsilon - \frac{2 q_{gen} 2^{-k}}{\Pr_1[G]}\right) \cdot \Pr_1[G] \]
\[ = \epsilon \cdot \Pr_1[G] - 2 q_{gen} 2^{-k} \]
B Proof of Theorem 6
We define the plaintext extractor $K$. Let $(f, f^{-1}) \in [\mathcal{F}(1^k)]$ and let $\mathcal{E}$ be the
corresponding encryption function as constructed by our plaintext-aware scheme.
Let $\tau = (\tau_{gen}, \tau_{hash})$ where
\[ \tau_{gen} = (r_1, G_1), \ldots, (r_{q_{gen}}, G_{q_{gen}}) \]
\[ \tau_{hash} = (s_1, H_1), \ldots \]