The RC6 Block Cipher:
A simple fast secure
AES proposal
Ronald L. Rivest MIT
Matt Robshaw RSA Labs
Ray Sidney RSA Labs
Yiqun Lisa Yin RSA Labs
(August 21, 1998)
Outline
• Design Philosophy
• Description of RC6
• Implementation Results
• Security
• Conclusion
Design Philosophy
• Leverage our experience with RC5: use
data-dependent rotations to achieve a
high level of security.
• Adapt RC5 to meet AES requirements.
• Take advantage of a new primitive for
increased security and efficiency:
32x32 multiplication, which executes
quickly on modern processors, to
compute rotation amounts.
Description of RC6
• RC6-w/r/b parameters:
– Word size in bits: w ( 32 ), so lg(w) = 5
– Number of rounds: r ( 20 )
– Number of key bytes: b ( 16, 24, or 32 )
• Key Expansion:
– Produces array S[ 0 … 2r + 3 ] of w-bit
round keys.
• Encryption and Decryption:
– Input/Output in 32-bit registers A, B, C, D
RC6 Primitive Operations
Operations shared with RC5:
A + B      Addition modulo 2^w
A - B      Subtraction modulo 2^w
A ⊕ B      Exclusive-Or
A <<< B    Rotate A left by amount in low-order lg(w) bits of B
A >>> B    Rotate A right, similarly
(A,B,C,D) = (B,C,D,A)    Parallel assignment
New in RC6:
A x B      Multiplication modulo 2^w
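The operations listed above can be sketched in Python as follows. This is a minimal illustration, assuming w = 32; the helper names (add, sub, mul, rotl, rotr) are made up for this sketch and do not come from the RC6 specification.

```python
W = 32                # word size in bits (the AES parameter choice)
MASK = (1 << W) - 1   # reduces results modulo 2^w

def add(a, b):
    """Addition modulo 2^w."""
    return (a + b) & MASK

def sub(a, b):
    """Subtraction modulo 2^w."""
    return (a - b) & MASK

def mul(a, b):
    """Multiplication modulo 2^w."""
    return (a * b) & MASK

def rotl(a, b):
    """Rotate a left by the amount in the low-order lg(w) bits of b."""
    n = b & (W - 1)
    return ((a << n) | (a >> (W - n))) & MASK

def rotr(a, b):
    """Rotate a right by the amount in the low-order lg(w) bits of b."""
    n = b & (W - 1)
    return ((a >> n) | (a << (W - n))) & MASK
```

Only the low-order five bits of the second argument matter to the rotations, so rotl(x, 37) and rotl(x, 5) give the same result.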
RC6 Encryption (Generic)
B = B + S[ 0 ]
D = D + S[ 1 ]
for i = 1 to r do
{
t = ( B x ( 2B + 1 ) ) <<< lg( w )
u = ( D x ( 2D + 1 ) ) <<< lg( w )
A = ( ( A ⊕ t ) <<< u ) + S[ 2i ]
C = ( ( C ⊕ u ) <<< t ) + S[ 2i + 1 ]
(A, B, C, D) = (B, C, D, A)
}
A = A + S[ 2r + 2 ]
C = C + S[ 2r + 3 ]
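The generic pseudocode above translates almost line for line into Python. The sketch below assumes w is a power of two and that the round-key array S (of length 2r + 4) has already been produced by the key expansion; the function name rc6_encrypt_words is hypothetical.

```python
def rc6_encrypt_words(A, B, C, D, S, w=32, r=20):
    """Encrypt one block held in four w-bit words, following the generic slide."""
    mask = (1 << w) - 1
    lgw = w.bit_length() - 1          # lg(w); 5 when w = 32

    def rotl(x, n):
        n &= w - 1                    # only the low-order lg(w) bits count
        return ((x << n) | (x >> (w - n))) & mask

    B = (B + S[0]) & mask             # pre-whitening
    D = (D + S[1]) & mask
    for i in range(1, r + 1):
        t = rotl((B * (2 * B + 1)) & mask, lgw)
        u = rotl((D * (2 * D + 1)) & mask, lgw)
        A = (rotl(A ^ t, u) + S[2 * i]) & mask
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & mask
        A, B, C, D = B, C, D, A       # parallel assignment
    A = (A + S[2 * r + 2]) & mask     # post-whitening
    C = (C + S[2 * r + 3]) & mask
    return A, B, C, D
```

With r = 0 the function reduces to the four whitening additions, which makes for an easy sanity check of the indexing.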
RC6 Encryption (for AES)
B = B + S[ 0 ]
D = D + S[ 1 ]
for i = 1 to 20 do
{
t = ( B x ( 2B + 1 ) ) <<< 5
u = ( D x ( 2D + 1 ) ) <<< 5
A = ( ( A ⊕ t ) <<< u ) + S[ 2i ]
C = ( ( C ⊕ u ) <<< t ) + S[ 2i + 1 ]
(A, B, C, D) = (B, C, D, A)
}
A = A + S[ 42 ]
C = C + S[ 43 ]
RC6 Decryption (for AES)
C = C - S[ 43 ]
A = A - S[ 42 ]
for i = 20 downto 1 do
{
(A, B, C, D) = (D, A, B, C)
u = ( D x ( 2D + 1 ) ) <<< 5
t = ( B x ( 2B + 1 ) ) <<< 5
C = ( ( C - S[ 2i + 1 ] ) >>> t ) ⊕ u
A = ( ( A - S[ 2i ] ) >>> u ) ⊕ t
}
D = D - S[ 1 ]
B = B - S[ 0 ]
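Decryption undoes each encryption step in reverse order. The sketch below pairs a Python version of the decryption pseudocode with the matching encryption so that the round trip can be checked. The round keys used here are random stand-ins (an assumption made for illustration), not the output of the real key expansion, and the function names are hypothetical.

```python
import random

W, R = 32, 20
MASK = (1 << W) - 1

def rotl(x, n):
    n &= W - 1
    return ((x << n) | (x >> (W - n))) & MASK

def rotr(x, n):
    n &= W - 1
    return ((x >> n) | (x << (W - n))) & MASK

def f(x):
    """Quadratic function followed by the fixed rotation by 5 bits."""
    return rotl((x * (2 * x + 1)) & MASK, 5)

def encrypt(A, B, C, D, S):
    B = (B + S[0]) & MASK
    D = (D + S[1]) & MASK
    for i in range(1, R + 1):
        t, u = f(B), f(D)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    return (A + S[42]) & MASK, B, (C + S[43]) & MASK, D

def decrypt(A, B, C, D, S):
    C = (C - S[43]) & MASK
    A = (A - S[42]) & MASK
    for i in range(R, 0, -1):
        A, B, C, D = D, A, B, C       # undo the register rotation first
        u, t = f(D), f(B)
        C = rotr((C - S[2 * i + 1]) & MASK, t) ^ u
        A = rotr((A - S[2 * i]) & MASK, u) ^ t
    D = (D - S[1]) & MASK
    B = (B - S[0]) & MASK
    return A, B, C, D

rng = random.Random(1)                # fixed seed; stand-in round keys only
S = [rng.getrandbits(32) for _ in range(44)]
block = tuple(rng.getrandbits(32) for _ in range(4))
assert decrypt(*encrypt(*block, S), S) == block
```

Note that B and D are unchanged by a round apart from moving registers, which is why decryption can recompute t and u before undoing the updates to A and C.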
Key Expansion (Same as RC5’s)
• Input: array L[ 0 … c-1 ] of input key words
• Output: array S[ 0 … 43 ] of round key words
• Procedure:
S[ 0 ] = 0xB7E15163
for i = 1 to 43 do S[i] = S[i-1] + 0x9E3779B9
A=B=i=j=0
for s = 1 to 132 do
{ A = S[ i ] = ( S[ i ] + A + B ) <<< 3
B = L[ j ] = ( L[ j ] + A + B ) <<< ( A + B )
i = ( i + 1 ) mod 44
j = ( j + 1 ) mod c }
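The key expansion procedure above can be sketched in Python as follows. Two points are assumptions not stated on the slide: key bytes are loaded into the words of L in little-endian order (the RC5 convention), and the iteration count is written as 3 x max(c, 2r + 4), which equals the slide's 132 for the AES key sizes. The function name expand_key is hypothetical.

```python
P32 = 0xB7E15163      # initial value of S[0], from the slide
Q32 = 0x9E3779B9      # additive constant, from the slide
MASK = 0xFFFFFFFF

def rotl(x, n):
    n &= 31           # only the low-order 5 bits of the amount are used
    return ((x << n) | (x >> (32 - n))) & MASK

def expand_key(key: bytes, rounds: int = 20):
    """Return the 2*rounds + 4 round keys for a 32-bit word size."""
    # Assumed convention: little-endian packing of key bytes into c words.
    c = max(1, (len(key) + 3) // 4)
    padded = key.ljust(4 * c, b"\x00")
    L = [int.from_bytes(padded[4 * k:4 * k + 4], "little") for k in range(c)]

    n = 2 * rounds + 4                      # 44 round keys when rounds = 20
    S = [P32]
    for i in range(1, n):
        S.append((S[i - 1] + Q32) & MASK)

    A = B = i = j = 0
    for _ in range(3 * max(c, n)):          # 132 iterations for AES key sizes
        A = S[i] = rotl((S[i] + A + B) & MASK, 3)
        B = L[j] = rotl((L[j] + A + B) & MASK, A + B)
        i = (i + 1) % n
        j = (j + 1) % c
    return S
```

For example, expand_key(bytes(16)) yields the 44 round keys for an all-zero 128-bit key under these assumptions.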
From RC5 to RC6
in seven easy steps
(1) Start with RC5
RC5 encryption inner loop:
for i = 1 to r do
{
A = ( ( A ⊕ B ) <<< B ) + S[ i ]
( A, B ) = ( B, A )
}
Can RC5 be strengthened by having rotation
amounts depend on all the bits of B?
Better rotation amounts?
• Modulo function?
Use low-order bits of ( B mod d )
Too slow!
• Linear function?
Use high-order bits of ( c x B )
Hard to pick c well!
• Quadratic function?
Use high-order bits of ( B x (2B+1) )
Just right!
B x (2B+1) is one-to-one mod 2^w
Proof: By contradiction. If B ≠ C (mod 2^w) but
B x (2B + 1) = C x (2C + 1) (mod 2^w)
then
(B - C) x (2B + 2C + 1) = 0 (mod 2^w)
But (B - C) is nonzero mod 2^w and (2B + 2C + 1) is
odd, hence invertible mod 2^w; their product can’t be zero! ∎
Corollary:
B uniform ⇒ B x (2B+1) uniform
(and high-order bits are uniform too!)
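The one-to-one property can also be checked by brute force for small word sizes. The sketch below is an illustration only: it enumerates every input for a reduced word size w (an assumption made to keep the search small) and is no substitute for the proof above. The function name is hypothetical.

```python
def is_permutation(w):
    """Check exhaustively that B -> B x (2B+1) mod 2^w hits every value once."""
    m = 1 << w
    return len({(b * (2 * b + 1)) % m for b in range(m)}) == m

# Check every word size from 1 to 12 bits.
print(all(is_permutation(w) for w in range(1, 13)))
```

By contrast, plain squaring B x B is not one-to-one modulo 2^w, since B and -B have the same square.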
High-order bits of B x (2B+1)
• The high-order bits of
f(B) = B x ( 2B + 1 ) = 2B^2 + B
depend on all the bits of B.
• Let B = B_31 B_30 B_29 … B_1 B_0 in binary.
• Flipping bit i of input B
– leaves bits 0 … i-1 of f(B) unchanged,
– flips bit i of f(B) with probability one,
– flips bit j of f(B), for j > i, with
probability approximately 1/2 (1/4…1),
– is likely to change some high-order bit.
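The bit-flipping behaviour described above can be measured directly. The sketch below assumes a reduced word size so that every input can be enumerated; it reports, for each output bit j, the fraction of inputs for which flipping input bit i flips output bit j. The names f and flip_stats are hypothetical.

```python
def f(b, w):
    """The quadratic function B x (2B+1) reduced modulo 2^w."""
    return (b * (2 * b + 1)) & ((1 << w) - 1)

def flip_stats(w, i):
    """Fraction of w-bit inputs for which flipping input bit i flips output bit j."""
    counts = [0] * w
    for b in range(1 << w):
        diff = f(b, w) ^ f(b ^ (1 << i), w)
        for j in range(w):
            counts[j] += (diff >> j) & 1
    return [c / (1 << w) for c in counts]

stats = flip_stats(10, 3)   # 10-bit words, flip input bit 3
for j, p in enumerate(stats):
    print(f"output bit {j}: flips with frequency {p:.3f}")
```

The entries below index i are exactly zero and the entry at index i is exactly one, because f(B') - f(B) = (B' - B) x (2B' + 2B + 1) is 2^i times an odd number.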
(2) Quadratic Rotation Amounts
for i = 1 to r do
{
t = ( B x ( 2B + 1 ) ) <<< 5
A = ( ( A ⊕ B ) <<< t ) + S[ i ]
( A, B ) = ( B, A )
}
But now much of the output of this nice
multiplication is being wasted...
(3) Use t, not B, as xor input
for i = 1 to r do
{
t = ( B x ( 2B + 1 ) ) <<< 5
A = ( ( A ⊕ t ) <<< t ) + S[ i ]
( A, B ) = ( B, A )
}
Now AES requires 128-bit blocks.
We could use two 64-bit registers, but
64-bit operations are poorly supported
with typical C compilers...
(4) Do two RC5’s in parallel
Use four 32-bit regs (A,B,C,D), and do
RC5 on (C,D) in parallel with RC5 on (A,B):
for i = 1 to r do
{
t = ( B x ( 2B + 1 ) ) <<< 5
A = ( ( A ⊕ t ) <<< t ) + S[ 2i ]
( A, B ) = ( B, A )
u = ( D x ( 2D + 1 ) ) <<< 5
C = ( ( C ⊕ u ) <<< u ) + S[ 2i + 1 ]
( C, D ) = ( D, C )
}
(5) Mix up data between copies
Switch rotation amounts between copies,
and cyclically permute registers instead of
swapping:
for i = 1 to r do
{
t = ( B x ( 2B + 1 ) ) <<< 5
u = ( D x ( 2D + 1 ) ) <<< 5
A = ( ( A ⊕ t ) <<< u ) + S[ 2i ]
C = ( ( C ⊕ u ) <<< t ) + S[ 2i + 1 ]
(A, B, C, D) = (B, C, D, A)
}
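The effect of step (5) can be seen by comparing it with step (4) in code. In the sketch below, the round keys are arbitrary stand-in values and the round count is reduced to 4; both are assumptions for illustration, and the function names are hypothetical. In the step (4) structure, the (C, D) half of the output never depends on the (A, B) half of the input, which is the weakness that mixing removes.

```python
MASK = 0xFFFFFFFF

def rotl(x, n):
    n &= 31
    return ((x << n) | (x >> (32 - n))) & MASK

def f(x):
    return rotl((x * (2 * x + 1)) & MASK, 5)

def parallel_rounds(A, B, C, D, S, r):
    """Step (4): two independent RC5-like halves."""
    for i in range(1, r + 1):
        t = f(B)
        A = (rotl(A ^ t, t) + S[2 * i]) & MASK
        A, B = B, A
        u = f(D)
        C = (rotl(C ^ u, u) + S[2 * i + 1]) & MASK
        C, D = D, C
    return A, B, C, D

def mixed_rounds(A, B, C, D, S, r):
    """Step (5): swapped rotation amounts and a cyclic register shift."""
    for i in range(1, r + 1):
        t, u = f(B), f(D)
        A = (rotl(A ^ t, u) + S[2 * i]) & MASK
        C = (rotl(C ^ u, t) + S[2 * i + 1]) & MASK
        A, B, C, D = B, C, D, A
    return A, B, C, D

S = list(range(100, 110))                  # stand-in round keys S[0..9] for r = 4
x = parallel_rounds(1, 2, 3, 4, S, 4)
y = parallel_rounds(9, 9, 3, 4, S, 4)      # same (C, D), different (A, B)
print("step (4), (C, D) outputs equal:", x[2:] == y[2:])
print("step (5) outputs:", mixed_rounds(1, 2, 3, 4, S, 4),
      mixed_rounds(9, 9, 3, 4, S, 4))
```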
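One Round of RC6
[Figure: data flow for one round. Inputs A, B, C, D; B and D each pass through f and a fixed rotation <<< 5 to give t and u; A and C are combined with t and u, rotated (<<<), and have S[2i] and S[2i+1] added; outputs are the permuted registers A, B, C, D.]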
(6) Add Pre- and Post-Whitening
B = B + S[ 0 ]
D = D + S[ 1 ]
for i = 1 to r do
{
t = ( B x ( 2B + 1 ) ) <<< 5
u = ( D x ( 2D + 1 ) ) <<< 5
A = ( ( A ⊕ t ) <<< u ) + S[ 2i ]
C = ( ( C ⊕ u ) <<< t ) + S[ 2i + 1 ]
(A, B, C, D) = (B, C, D, A)
}
A = A + S[ 2r + 2 ]
C = C + S[ 2r + 3 ]
(7) Set r = 20 for high security
(based on analysis)
B = B + S[ 0 ]
D = D + S[ 1 ]
for i = 1 to 20 do
{
t = ( B x ( 2B + 1 ) ) <<< 5
u = ( D x ( 2D + 1 ) ) <<< 5
A = ( ( A ⊕ t ) <<< u ) + S[ 2i ]
C = ( ( C ⊕ u ) <<< t ) + S[ 2i + 1 ]
(A, B, C, D) = (B, C, D, A)
}
A = A + S[ 42 ]
C = C + S[ 43 ]
Final RC6
RC6 Implementation Results
CPU Cycles / Operation
           Java      Borland C   Assembly
Setup      110000    2300        1108
Encrypt    16200     616         254
Decrypt    16500     566         254
Less than two clocks per bit of plaintext!
Operations/Second (200MHz)
           Java      Borland C   Assembly
Setup      1820      86956       180500
Encrypt    12300     325000      787000
Decrypt    12100     353000      788000
Encryption Rate (200MHz)
                             Java     Borland C   Assembly
Encrypt  MegaBytes / second  0.197    5.19        12.6
         MegaBits / second   1.57     41.5        100.8
Decrypt  MegaBytes / second  0.194    5.65        12.6
         MegaBits / second   1.55     45.2        100.8
Over 100 Megabits / second!
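The three tables are related by simple arithmetic: operations per second is the clock rate divided by cycles per operation, and the data rate is that figure times the 16-byte block size. The sketch below recomputes the encryption figures from the cycle counts, assuming a megabyte means 10^6 bytes; the slide values appear to be rounded, so small differences are expected.

```python
CLOCK_HZ = 200_000_000   # 200 MHz, as on the slides
BLOCK_BYTES = 16         # one 128-bit block per operation

def rates(cycles_per_block):
    """Return (blocks per second, megabytes per second, megabits per second)."""
    ops = CLOCK_HZ / cycles_per_block
    mbytes = ops * BLOCK_BYTES / 1e6
    return ops, mbytes, mbytes * 8

for name, cycles in [("Java", 16200), ("Borland C", 616), ("Assembly", 254)]:
    ops, mb, mbit = rates(cycles)
    print(f"{name:10s} {ops:10.0f} blocks/s {mb:7.3f} MB/s {mbit:7.2f} Mbit/s")
```

The same cycle count gives the clocks-per-bit figure: 254 cycles for 128 bits is just under two clocks per bit.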
On an 8-bit processor
• On an Intel MCS51 ( 1 MHz clock )
• Encrypt/decrypt at 9.2 Kbits/second
(13535 cycles/block;
from actual implementation)
• Key setup in 27 milliseconds
• Only 176 bytes needed for table of
round keys.
• Fits on smart card (< 256 bytes RAM).
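Two of the figures above follow from the cipher parameters. The sketch below recomputes the round-key table size from r = 20 and w = 32, and the throughput from the cycle count. The throughput line rests on two assumptions not stated on the slide: that each counted cycle takes one tick of the 1 MHz clock, and that "Kbits" means 1024 bits.

```python
ROUNDS, WORD_BITS, BLOCK_BITS = 20, 32, 128

round_keys = 2 * ROUNDS + 4                 # S[0 .. 2r+3]
table_bytes = round_keys * WORD_BITS // 8   # bytes of RAM for the round keys

cycles_per_block = 13535
clock_hz = 1_000_000                        # assumed: one cycle per clock tick
bits_per_second = BLOCK_BITS * clock_hz / cycles_per_block
kbits_per_second = bits_per_second / 1024   # assumed: 1 Kbit = 1024 bits

print(table_bytes, "bytes of round keys")
print(f"{kbits_per_second:.1f} Kbits/second")
```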
Custom RC6 IC
• 0.25 micron CMOS process
• One round/clock at 200 MHz
• Conventional multiplier designs
• 0.05 mm^2 of silicon
• 21 milliwatts of power
• Encrypt/decrypt at 1.3 Gbits/second
• With pipelining, can go faster, at cost
of more area and power
RC6 Security Analysis
Analysis procedures
• Intensive analysis, based on most
effective known attacks (e.g. linear
and differential cryptanalysis)
• Analyze not only RC6, but also several
“simplified” forms (e.g. with no
quadratic function, no fixed rotation
by 5 bits, etc.)
Linear analysis
• Find approximations for r-2 rounds.
• Two ways to approximate A = B <<< C
– with one bit each of A, B, C (type I)
– with one bit each of A, B only (type II)
– each has bias 1/64; type I more useful
• Non-zero bias across f(B) only when
input bit = output bit. (Best for lsb.)
• Also include effects of multiple linear
approximations and linear hulls.
Security against linear attacks
Estimate of number of plaintext/ciphertext
pairs required to mount a linear attack.
(Only 2^128 such pairs are available.)
Rounds      Pairs
8           2^47
12          2^83
16          2^119
20 (RC6)    2^155   Infeasible
24          2^191   Infeasible
Differential analysis
• Considers use of (iterative and non-iterative)
(r-2)-round differentials as
well as (r-2)-round characteristics.
• Considers two notions of “difference”:
– exclusive-or
– subtraction (better!)
• Combination of quadratic function and
fixed rotation by 5 bits very good at
thwarting differential attacks.
An iterative RC6 differential
A        B        C        D
1<<16    1<<11    0        0
1<<11    0        0        0
0        0        0        1<<s
0        1<<26    1<<s     0
1<<26    1<<21    0        1<<v
1<<21    1<<16    1<<v     0
1<<16    1<<11    0        0
• Probability = 2^-91
Security against differential attacks
Estimate of number of plaintext pairs
required to mount a differential attack.
(Only 2^128 such pairs are available.)
Rounds      Pairs
8           2^56
12          2^117
16          2^190   Infeasible
20 (RC6)    2^238   Infeasible
24          2^299   Infeasible
Security of Key Expansion
• Key expansion is identical to that of
RC5; no known weaknesses.
• No known weak keys.
• No known related-key attacks.
• Round keys appear to be a “random”
function of the supplied key.
• Bonus: key expansion is quite “one-way”:
it is difficult to infer the supplied key
from round keys.
Conclusion
• RC6 more than meets the
requirements for the AES; it is
– simple,
– fast, and
– secure.
• For more information, including a copy
of these slides, a copy of the RC6
description, and the security analysis, see
www.rsa.com/rsalabs/aes
(The End)