The Science of Encryption: Prime Numbers and Mod N Arithmetic
Go check your e-mail. You'll notice that the webpage address starts with "https://". The "s"
at the end stands for "secure," meaning that a process called SSL is being used to encode the
contents of your inbox and prevent people from hacking your account. The heart of SSL, as well
as pretty much every other computer security or encoding system, is something called a public-key
encryption scheme. The first article below describes how a public-key encryption scheme works,
and the second explains the mathematics behind it: prime numbers and mod n arithmetic.
enciphering schemes should be asymmetric. For thousands of years all ciphers were symmetric:
the key for encrypting a message was identical to the key for decrypting it, but used, so to speak,
in reverse. To change "5 100 100 5 15 55" or "6 120 120 6 18 66" back into "attack," for instance,
one simply reverses the encryption by dividing the numbers by the key, instead of multiplying
them, and then replaces the numbers with their equivalent letters. Thus sender and receiver must
both have the key, and must both keep it secret. The symmetry, Diffie and Hellman realized, is the
origin of the key-management problem. The solution is to have an encrypting key that is different
from the decrypting key: one key to encipher a message, and another, different key to decipher
it. With an asymmetric cipher, Alice could send encrypted messages to Bob without providing
him with a secret key. In fact, Alice could send him a secret message even if she had never before
communicated with him in any way.
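
The symmetric example above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: letters are taken to stand for their alphabet positions (a = 1, ..., z = 26), which is consistent with the numbers quoted in the article, and the function names `encrypt` and `decrypt` are made up here.

```python
# Toy symmetric cipher: each letter becomes its alphabet position
# (a=1, ..., z=26), and that position is multiplied by the shared key.
# Assumption: the message contains lowercase letters only.

def encrypt(word, key):
    """Multiply each letter's alphabet position by the key."""
    return [(ord(ch) - ord("a") + 1) * key for ch in word.lower()]

def decrypt(numbers, key):
    """Divide by the same key, then map positions back to letters."""
    return "".join(chr(n // key - 1 + ord("a")) for n in numbers)

print(encrypt("attack", 5))                  # [5, 100, 100, 5, 15, 55]
print(decrypt([6, 120, 120, 6, 18, 66], 6))  # attack
```

Notice that `decrypt` needs exactly the same `key` that `encrypt` used. That is the symmetry the article is describing.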
"If this sounds ridiculous, it should," Schneier wrote in Secrets and Lies (2001). "It sounds
impossible. If you were to survey the world's cryptographers in 1975, they would all have told you
it was impossible." One year later, Diffie and Hellman showed that it was possible, after all. (Later
the British Secret Service revealed that it had invented these techniques before Diffie and Hellman,
but kept them secret and apparently did nothing with them.)
To be precise, Diffie and Hellman demonstrated only that public-key encryption was possible in
theory. Another year passed before three MIT mathematicians, Ronald L. Rivest, Adi Shamir,
and Leonard M. Adleman, figured out a way to do it in the real world. At the base of the
Rivest-Shamir-Adleman, or RSA, encryption scheme is the mathematical task of factoring. Factoring
a number means identifying the prime numbers which, when multiplied together, produce that
number. Thus 126,356 can be factored into 2 x 2 x 31 x 1,019, where 2, 31, and 1,019 are all
prime. (A given number has only one set of prime factors.) [1] Surprisingly, mathematicians
regard factoring numbers, part of the elementary-school curriculum, as a fantastically difficult
task. Despite the efforts of such luminaries as Fermat, Gauss, and Fibonacci, nobody has ever
discovered a consistent, usable method for factoring large numbers. Instead, mathematicians try
potential factors by invoking complex rules of thumb, looking for numbers that divide evenly. For
big numbers the process is horribly time-consuming, even with fast computers. The largest number
yet factored is 155 digits long. It took 292 computers, most of them fast workstations, more than
seven months.
Note something odd. It is easy to multiply primes together. But there is no easy way to take
the product and reduce it back to its original primes. In crypto jargon, this is a "trapdoor": a
function that lets you go one way easily, but not the other. Such one-way functions, of which
this is perhaps the simplest example, are at the bottom of all public-key encryption. They make
asymmetric ciphers possible.
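
To get a feel for the asymmetry, here is a minimal sketch of the most naive factoring method, trial division. It is not one of the "complex rules of thumb" the article mentions, just the simplest possible approach, and the function name `prime_factors` is made up for this illustration.

```python
def prime_factors(n):
    """Factor n by trial division: try 2, 3, 4, ... as divisors."""
    factors = []
    d = 2
    while d * d <= n:
        while n % d == 0:      # d divides n evenly, so record it
            factors.append(d)
            n //= d
        d += 1
    if n > 1:                  # whatever is left over is itself prime
        factors.append(n)
    return factors

print(prime_factors(126356))   # [2, 2, 31, 1019]
print(2 * 2 * 31 * 1019)       # 126356
```

Going from the primes to the product is a single multiplication. Going the other way, the loop may have to try every candidate up to the square root of the number, which is hopeless when the number has hundreds of digits.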
To use RSA encryption, Alice first secretly chooses two prime numbers, p and q, each more than
a hundred digits long. This is easier than it may sound: there is an infinite supply of prime
numbers. Last year a Canadian college student found the biggest known prime: $2^{13466917} - 1$. It
has 4,053,946 digits; typed without commas in standard 12-point type, the number would be more
than ten miles long. Fortunately Alice doesn't need one nearly that big. She runs a program that
randomly selects two prime numbers for her and then she multiplies them by each other, producing
pq, a still bigger number that is, naturally, not prime. This is Alice's public key. (In fact, creating
the key is more complicated than I suggest here, but not wildly so.)
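
As a rough sketch of what "a program that randomly selects two prime numbers" might look like, the snippet below draws random four-digit numbers until it hits primes. Everything here is an assumption made for illustration: the helper names, the tiny size of the primes, and the slow trial-division test (real systems use primes hundreds of digits long and much faster probabilistic tests).

```python
import random

def is_prime(n):
    """Trial-division primality check; only sensible for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def random_prime(rng, low, high):
    """Draw random integers in [low, high) until one is prime."""
    while True:
        candidate = rng.randrange(low, high)
        if is_prime(candidate):
            return candidate

rng = random.Random(0)   # seeded only so the sketch is repeatable
p = random_prime(rng, 1000, 10000)
q = random_prime(rng, 1000, 10000)
while q == p:            # make sure the two primes are different
    q = random_prime(rng, 1000, 10000)
n = p * q                # the product pq that Alice publishes
print(p, q, n)
```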
[1] Pop quiz: which one of our theorems from class says this?
As the name suggests, public keys are not secret; indeed, the Alices of this world often post
them on the Internet or attach them to the bottom of their e-mail. When Bob wants to send Alice
a secret message, he first converts the text of the message into a number. Perhaps, as before, he
transforms "attack" into "5 100 100 5 15 55." Then he obtains Alice's public key (that is, the
number pq) by looking it up on a Web site or copying it from her e-mail. (Note here that Bob does
not use his own key to send Alice a message, as in regular encryption. Instead, he uses Alice's key.)
Having found Alice's public key, he plugs it into a special algorithm invented by Rivest, Shamir,
and Adleman to encrypt the message.
At this point the three mathematicians' cleverness becomes evident. Bob knows the product pq,
because Alice has displayed it on her Web site. But he almost certainly does not know p and q
themselves, because they are its only factors, and factoring large numbers is effectively impossible.
Yet the algorithm is constructed in such a way that to decipher the message the recipient must
know both p and q individually. Because only Alice knows p and q, Bob can send secret messages to
Alice without ever having to swap keys. Anyone else who wants to read the message will somehow
have to factor pq back into the prime numbers p and q. [2]
In the real world, public-key encryption is practically never used to encrypt actual messages.
The reason is that it requires so much computation: even on computers, public-key is very slow.
According to a widely cited estimate by Schneier, public-key crypto is about a thousand times
slower than conventional cryptography. As a result, public-key cryptography is more often used
as a solution to the key-management problem, rather than as direct cryptography. People employ
public-key to distribute regular, private keys, which are then used to encrypt and decrypt actual
messages. In other words, Alice and Bob send each other their public keys. Alice generates a
symmetric key that she will only use for a short time (usually, in the trade, called a session key),
encrypts it with Bob's public key, and sends it to Bob, who decrypts it with his private key. Now
that Alice and Bob both have the session key, they can exchange messages. When Alice wants to
begin a new round of messages, she creates another session key. Systems that use both symmetric
and public-key cryptography are called hybrid, and almost every available public-key system, such
as PGP, is a hybrid. [3]
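
The hybrid idea can be sketched with toy numbers. The snippet below is an illustration under several assumptions, not how PGP or SSL actually work: Bob's key pair reuses the small values N = 943, e = 7, d = 503 from the worked example later in this handout, and the "symmetric cipher" is a made-up XOR keystream (`xor_stream`) that is not secure at all.

```python
import random

# Bob's toy RSA key pair (assumed values): public (N, e), private d.
N, e, d = 943, 7, 503

def xor_stream(data, session_key):
    """Toy symmetric cipher: XOR each byte with a keystream seeded by
    the session key. The same call encrypts and decrypts. Not secure."""
    stream = random.Random(session_key)
    return bytes(b ^ stream.randrange(256) for b in data)

# Alice: pick a session key, wrap it with Bob's PUBLIC key,
# and encrypt the actual message with the fast symmetric cipher.
session_key = random.randrange(2, N)
wrapped_key = pow(session_key, e, N)
ciphertext = xor_stream(b"attack at dawn", session_key)

# Bob: unwrap the session key with his PRIVATE key, then decrypt.
recovered_key = pow(wrapped_key, d, N)
plaintext = xor_stream(ciphertext, recovered_key)
print(plaintext)   # b'attack at dawn'
```

Only the short session key goes through the slow public-key step; the message itself is handled by the symmetric cipher.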
[2] The next article will give you an indication of how amazingly difficult this is.
[3] Or SSL. PGP is the encryption process used for most secure computer databases, whereas SSL is typically used
over the internet. It is also a hybrid: encoding and decoding the contents of an entire webpage using public-key
encryption would slow down your internet browser too much. Instead, a public key is used to send a temporary
private key that lets you decode the encrypted data from the website. Every time you visit Facebook or Gmail, the
private key changes, and your information is kept secure.
number e is also part of the public key, so B also is told the value of e. [See footnote 4 for a
remark on why we're using the number $(p - 1)(q - 1)$.]
Now B knows enough to encode a message to A. Suppose, for this example, that the message
is the number M = 35.
B calculates the value of $C = M^e \pmod{N} = 35^7 \pmod{943}$.
$35^7 = 64339296875$ and $64339296875 \pmod{943} = 545$. The number 545 is the encoding
that B sends to A.
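
If you want to check the encoding step on a computer, here is a small sketch using the values from this example (N = 943, e = 7, M = 35). Python's built-in `pow` with three arguments computes a power modulo a number directly.

```python
N, e = 943, 7        # the public key from the example
M = 35               # the message

print(M ** e)        # 64339296875
C = pow(M, e, N)     # M^e mod N, without building the huge number first
print(C)             # 545
```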
Now A wants to decode 545. To do so, he needs to find a number d such that
$ed \equiv 1 \pmod{(p - 1)(q - 1)}$, or in this case, such that $7d \equiv 1 \pmod{880}$. A solution is $d = 503$,
since $7 \cdot 503 = 3521 = 4(880) + 1 \equiv 1 \pmod{880}$.
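
The text does not say how d = 503 was found. One standard way (an assumption here, not necessarily the method the author has in mind) is the extended Euclidean algorithm, sketched below with made-up helper names `extended_gcd` and `modular_inverse`.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def modular_inverse(e, m):
    """Find d with e*d = 1 (mod m), if such a d exists."""
    g, x, _ = extended_gcd(e, m)
    if g != 1:
        raise ValueError("e has no inverse modulo m")
    return x % m

d = modular_inverse(7, 880)
print(d)             # 503
print(7 * d % 880)   # 1
```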
To find the decoding, A must calculate $C^d \pmod{N} = 545^{503} \pmod{943}$. This looks like
it will be a horrible calculation, and at first it seems like it is, but notice that
$503 = 256 + 128 + 64 + 32 + 16 + 4 + 2 + 1$ (this is just the binary expansion of 503). So this means
that
[4] Why the number $(p - 1)(q - 1)$? It has to do with the function that we talked about in class. When p and q
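
The binary-expansion trick used in the decoding step, repeated squaring, can be sketched as follows. The function name `power_mod` is made up for this illustration; it reads the exponent one binary digit at a time and reduces mod N after every multiplication, so no number ever gets much bigger than N squared.

```python
def power_mod(base, exponent, modulus):
    """Compute base^exponent mod modulus by repeated squaring."""
    result = 1
    square = base % modulus          # base^(2^k) mod modulus, starting at k = 0
    while exponent > 0:
        if exponent & 1:             # this binary digit of the exponent is 1
            result = (result * square) % modulus
        square = (square * square) % modulus
        exponent >>= 1               # move on to the next binary digit
    return result

print(bin(503))                      # 0b111110111
print(power_mod(545, 503, 943))      # 35
```

The answer, 35, is the original message M.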
Exercises
(1) Summarize, in non-mathematical terms, how public-key encryption works.
(2) Is public-key encryption more secure, or at least less risky, than private-key encryption?
What are the main advantages and disadvantages of each?
(3) Follow through all the steps of RSA encryption as outlined in Davis' article, using the prime
numbers $p = 17$, $q = 19$ and $e = 11$ to encode the message 81 as some other number,
and then decode it back. Hint: to save you some time in step (7), a number d such that
$ed \equiv 1 \pmod{(p - 1)(q - 1)}$ is the number $d = 131$. But you must check this to make sure
that it works, i.e. show that 1 is the remainder when you divide ed by $(p - 1)(q - 1)$.
Be sure to show all of your work and write down all of the steps.