The Key To Cryptography - The RSA Algorithm
5-1-2018
Recommended Citation
Robinson, Clifton Paul. (2018). The Key to Cryptography: The RSA Algorithm. In BSU Honors Program Theses and Projects. Item 268.
Available at: http://vc.bridgew.edu/honors_proj/268
Copyright © 2018 Clifton Paul Robinson
This item is available as part of Virtual Commons, the open-access institutional repository of Bridgewater State University, Bridgewater, Massachusetts.
The Key to Cryptography: The RSA Algorithm
May 1, 2018
UNDERGRADUATE THESIS
Author: Advisors:
Dedicated to
Contents
Abstract
1 Introduction
1.1 The Project Overview
4 The Mathematics
4.1 What is a Prime Number?
4.2 Factoring Numbers
4.3 Factoring Products of Two Prime Numbers
4.4 The Use of Fermat’s Little Theorem
4.5 The Factoring Algorithms
4.5.1 Trial Division
4.5.2 Pollard Rho
4.5.3 Continued Fraction Factoring
4.5.4 Quadratic Sieve
5 Computing Languages
5.1 Python
5.2 SageMath
5.3 Java
9 Conclusion
9.1 Future Threats
9.2 Final Thoughts
Acknowledgements
Bibliography
A Code Appendix
A.1 The Main Code Used
Abstract
The Key To Cryptography: The RSA Algorithm
Introduction
• Carmichael Number
• Cipher
• CIPHERTEXT
• Continued Fraction
\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} \]
• Cryptography
• Cryptology
– The science and art of making and breaking codes and ciphers.
• Cryptosystem
• Decryption
• Encryption
• Modular Arithmetic
• plaintext
• Prime Number
• Public-Key Cryptography
• Quantum Computing
• RSA
• Semiprime Number
– These are numbers that are the product of two prime numbers.
This means the only divisors are 1, the two primes, and the number itself.
• Time Complexity
2.2 Theorems
Theorem 1 (The Fundamental Theorem of Arithmetic).
Every integer greater than 1 can be written as a product of prime numbers, perhaps
with just one prime in the product, and this product is unique when the primes are
written in non-descending order.
\[ p_1^{e_1} \times p_2^{e_2} \times \cdots \times p_k^{e_k} = \prod_{i=1}^{k} p_i^{e_i} \]
Proof.
If n is composite, then it has at least two prime factors.
Let p and q be two of them.
Assume $p \le q$; then $n \ge pq \ge p^2$.
So $p \le \sqrt{n}$.
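The bound in this proof is what makes a simple primality check practical: a composite $n$ must have a prime factor no larger than $\sqrt{n}$, so it is enough to test divisors up to that point. The following is a minimal sketch in Python; the function name `is_prime_by_sqrt_bound` is illustrative and is not taken from the thesis code.

```python
from math import isqrt

def is_prime_by_sqrt_bound(n):
    """Return True if n is prime, testing divisors only up to floor(sqrt(n))."""
    if n < 2:
        return False
    # If n were composite, some divisor d with 2 <= d <= sqrt(n) would exist.
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return False
    return True

# List the primes below 30 as a quick check of the idea.
print([k for k in range(2, 30) if is_prime_by_sqrt_bound(k)])
```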
Theorem 4 (Fermat’s Little Theorem).
If $p$ is a prime number and $n$ is an integer not divisible by $p$, then $p$ divides $n^{p-1} - 1$,
that is, $n^{p-1} \equiv 1 \pmod{p}$.
Corollary 4.1. If $p$ is a prime number and $n$ is an integer, then $n^p \equiv n \pmod{p}$.
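Both statements are easy to check numerically for a small prime. The sketch below uses Python's built-in three-argument `pow` for modular exponentiation; the prime $p = 13$ and the helper names are arbitrary choices made for illustration.

```python
def fermat_holds(n, p):
    """Check n^(p-1) == 1 (mod p), assuming p is prime and p does not divide n."""
    return pow(n, p - 1, p) == 1

def corollary_holds(n, p):
    """Check n^p == n (mod p) for any integer n and prime p."""
    return pow(n, p, p) == n % p

p = 13  # an arbitrary small prime chosen for the demonstration
theorem_ok = all(fermat_holds(n, p) for n in range(1, p))
corollary_ok = all(corollary_holds(n, p) for n in range(0, 3 * p))
print(theorem_ok, corollary_ok)
```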
3.2 A Transition
Up until the Greeks, cryptography was considered to be a form of
literature because it consisted only of manipulating writing. Once the Romans
came along, the focus shifted from literature to mathematics. The
Caesar Cipher, named after Julius Caesar, started to use small amounts
of math, mainly addition. This cipher, more commonly referred to as the shift
cipher, shifts each letter of the original message a fixed number of places
along the alphabet and returns what looks like a random group of letters. For
example, using our alphabet, if we shifted the letters 5 places then the letter
’A’ would be changed to ’F’, and the person receiving this message would be
able to convert it back by shifting 5 places in the opposite direction.
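The shift cipher described above can be sketched in a few lines of Python. This is an illustration only: the function name `caesar_shift` and the choice to leave non-letters unchanged are assumptions, not part of the thesis.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A' through 'Z'

def caesar_shift(message, shift):
    """Shift every letter of message by `shift` places, wrapping around at 'Z'."""
    result = []
    for ch in message.upper():
        if ch in ALPHABET:
            # Addition modulo 26 is the only arithmetic the cipher needs.
            result.append(ALPHABET[(ALPHABET.index(ch) + shift) % 26])
        else:
            result.append(ch)  # assumption: spaces and punctuation pass through
    return "".join(result)

secret = caesar_shift("HELLO WORLD", 5)
print(secret)                    # the shifted message
print(caesar_shift(secret, -5))  # shifting back recovers the original
```

Decryption is the same function with the shift negated, which is why the receiver only needs to know the shift amount.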
Bob                                          Alice
Key Creation
• Pick your secret primes, p and q.          This is the part where Bob will create
  Compute N = pq.                            a public key so people can send him
• Choose an encryption exponent, e, and      messages. (For back-and-forth
  make sure gcd(e, (p − 1)(q − 1)) = 1.      communication, Alice will create her
• Publish N and e.                           own public key.)
Encryption
Decryption
• Compute d satisfying                       If everything is done correctly, Bob
  ed ≡ 1 (mod (p − 1)(q − 1)).               will receive the message that Alice
• Compute m′ ≡ c^d (mod N).                  sent.
• Then m′ = plaintext m.
(Wagstaff, 2013)
This table shows how you can send messages using RSA.
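The exchange can be traced end to end with toy numbers. The sketch below follows the key creation and decryption steps listed in the table; the encryption step, $c \equiv m^e \pmod{N}$, is the standard RSA rule and is supplied here as an assumption because it does not appear in the table as extracted. The primes 61 and 53, the exponent 17, and the message 65 are illustrative values that are far too small for real use.

```python
from math import gcd

# Key creation (Bob): secret primes and public modulus.
p, q = 61, 53
N = p * q
phi = (p - 1) * (q - 1)

# Encryption exponent must be coprime to (p - 1)(q - 1).
e = 17
assert gcd(e, phi) == 1

# Bob publishes (N, e) and keeps d secret; d satisfies e*d == 1 (mod phi).
# pow with exponent -1 computes a modular inverse (Python 3.8 or newer).
d = pow(e, -1, phi)

# Encryption (Alice): standard RSA rule, assumed here.
m = 65
c = pow(m, e, N)

# Decryption (Bob): m' == c^d (mod N) should equal the plaintext m.
m_prime = pow(c, d, N)
print("public key:", (N, e), "ciphertext:", c, "decrypted:", m_prime)
```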
The Mathematics
4.1 What is a Prime Number?
Prime numbers are an integral part of mathematics, especially when it
comes to factoring. A prime number is an integer greater than 1 that is
divisible only by 1 and itself. One remarkable fact about prime numbers is that
every whole number greater than 1 can be written as a product of primes. For
example, let’s look at the number 100:
When you start to factor 100 you get:
100 = 2 × 50
50 = 2 × 25
25 = 5 × 5
100 = 2 × 2 × 5 × 5 or $100 = 2^2 \times 5^2$
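The same factorization can be reproduced programmatically. The following is a small sketch that repeatedly divides out the smallest divisor and records how many times each prime appears; the function name `prime_factorization` is hypothetical.

```python
from collections import Counter

def prime_factorization(n):
    """Return {prime: exponent} for an integer n > 1."""
    factors = Counter()
    f = 2
    while f * f <= n:
        while n % f == 0:   # divide out every copy of f
            factors[f] += 1
            n //= f
        f += 1
    if n > 1:               # whatever remains is itself prime
        factors[n] += 1
    return dict(factors)

print(prime_factorization(100))  # {2: 2, 5: 2}
```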
actually prime numbers. These will mess up public keys because the numbers
being used are not really prime. When this theorem is used correctly, it
is a powerful tool that makes prime factorization and public-key
cryptography easier.
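Assuming the passage above refers to composite numbers that pass a test based on Fermat's Little Theorem (the Carmichael numbers listed in the glossary), the smallest such number, $561 = 3 \times 11 \times 17$, makes the danger concrete. The helper name `passes_fermat_test` is illustrative.

```python
from math import gcd

def passes_fermat_test(n, base):
    """True if base^(n-1) == 1 (mod n), i.e. n looks prime to this base."""
    return pow(base, n - 1, n) == 1

n = 561  # composite: 3 * 11 * 17
coprime_bases = [a for a in range(2, n) if gcd(a, n) == 1]

# Every base coprime to 561 reports "possibly prime", even though 561 is composite.
fooled_by_all = all(passes_fermat_test(n, a) for a in coprime_bases)
print(fooled_by_all)
```

A base that shares a factor with 561, such as 3, does expose it as composite, but finding such a base is as hard as finding a factor.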
\[ a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} \]
\[ a_i = [\alpha_i], \qquad \alpha_{i+1} = \frac{1}{\alpha_i - a_i} \]
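To understand continued fractions better, we shall look at an example:
Let $\alpha = 437$.
\begin{align*}
\sqrt{437} &= 20 + (\sqrt{437} - 20)
  = 20 + \cfrac{1}{\dfrac{1}{\sqrt{437}-20} \cdot \dfrac{\sqrt{437}+20}{\sqrt{437}+20}}
  = 20 + \cfrac{1}{\dfrac{20+\sqrt{437}}{37}} \\
\frac{20+\sqrt{437}}{37} &= 1 + \frac{-17+\sqrt{437}}{37}
  = 1 + \cfrac{1}{\dfrac{37}{-17+\sqrt{437}} \cdot \dfrac{-17-\sqrt{437}}{-17-\sqrt{437}}}
  = 1 + \cfrac{1}{\dfrac{-629-37\sqrt{437}}{-148}}
  = 1 + \cfrac{1}{\dfrac{17+\sqrt{437}}{4}} \\
\frac{17+\sqrt{437}}{4} &= 9 + \frac{-19+\sqrt{437}}{4}
  = 9 + \cfrac{1}{\dfrac{4}{-19+\sqrt{437}} \cdot \dfrac{-19-\sqrt{437}}{-19-\sqrt{437}}}
  = 9 + \cfrac{1}{\dfrac{-76-4\sqrt{437}}{-76}}
  = 9 + \cfrac{1}{\dfrac{19+\sqrt{437}}{19}} \\
\frac{19+\sqrt{437}}{19} &= 2 + \frac{-19+\sqrt{437}}{19}
  = 2 + \cfrac{1}{\dfrac{19}{-19+\sqrt{437}} \cdot \dfrac{-19-\sqrt{437}}{-19-\sqrt{437}}}
  = 2 + \cfrac{1}{\dfrac{361+19\sqrt{437}}{76}}
  = 2 + \cfrac{1}{\dfrac{19+\sqrt{437}}{4}} \\
\frac{19+\sqrt{437}}{4} &= 9 + \frac{-17+\sqrt{437}}{4}
  = 9 + \cfrac{1}{\dfrac{4}{-17+\sqrt{437}} \cdot \dfrac{-17-\sqrt{437}}{-17-\sqrt{437}}}
  = 9 + \cfrac{1}{\dfrac{68+4\sqrt{437}}{148}}
  = 9 + \cfrac{1}{\dfrac{17+\sqrt{437}}{37}} \\
\frac{17+\sqrt{437}}{37} &= 1 + \frac{-20+\sqrt{437}}{37}
  = 1 + \cfrac{1}{\dfrac{37}{-20+\sqrt{437}} \cdot \dfrac{-20-\sqrt{437}}{-20-\sqrt{437}}}
  = 1 + \cfrac{1}{\dfrac{(37 \cdot 20)+37\sqrt{437}}{37}}
  = 1 + \cfrac{1}{\dfrac{20+\sqrt{437}}{1}} \\
20 + \sqrt{437} &= 40 + (\sqrt{437} - 20)
\end{align*}
This will now continue to repeat.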
We know to stop there because the continued fraction of the square root of a
non-square integer always repeats eventually. So in this example the sequence was:
20, 1, 9, 2, 9, 1, 40
When we finally see the repetition, the final number in the sequence will be
double the first number (i.e. 20 ∗ 2 = 40).
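The expansion can be computed with integer arithmetic only, which avoids any rounding error from floating-point square roots. The sketch below uses the standard recurrence for the continued fraction of a square root and stops at the first term equal to twice the initial term, as described above. The function and variable names are illustrative and are not taken from the thesis code.

```python
from math import isqrt

def sqrt_continued_fraction(n):
    """Return the terms a0, a1, ..., up to and including the term 2*a0."""
    a0 = isqrt(n)
    if a0 * a0 == n:
        return [a0]          # perfect square: the expansion is just a0
    terms = [a0]
    # Each complete quotient has the form (m + sqrt(n)) / d with integers m, d.
    m, d, a = 0, 1, a0
    while a != 2 * a0:       # the term 2*a0 marks the end of one period
        m = d * a - m
        d = (n - m * m) // d
        a = (a0 + m) // d
        terms.append(a)
    return terms

print(sqrt_continued_fraction(437))  # [20, 1, 9, 2, 9, 1, 40]
```

For 437 the successive values of `d` are 37, 4, 19, 4, 37, 1, which match the denominators that appear in the worked example.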
In cryptography, continued fractions can actually be used to factor.
Continued fractions can also be used for $\sqrt{n}$ in general, and the
expansion takes the same form we saw above:
\[ \sqrt{n} = 1 + \frac{n-1}{1+\sqrt{n}} \]
Using the integers that appear at every step of the continued fraction,
we can create a strong algorithm that factors numbers well. There are five
integers in this algorithm: $i$, $q_i$, $p_i$, $Q_i$, $A_i$ (Wagstaff, 2013).
$i$ is just the step you are on, so it starts at 0 and works its way up for as
many steps as are computed. $A_i$ is computed outside of the expansion itself;
it is the numerator of the $i$-th convergent. In the continued fraction method,
every part of the expansion is used to help factor. Each integer has a specific
location in the continued fraction, and they are used to build a table from
which the other integers are computed. These are the positions:
\[ q_i + \cfrac{1}{\dfrac{p_i + \sqrt{\alpha}}{Q_i}} \]
We are able to use continued fractions to factor because they help us find
perfect squares modulo $N$. The key fact is the relationship between these
variables when we compute the continued fraction. We have
$(-1)^i Q_i = A_{i-1}^2 - B_{i-1}^2 N$, where $B_{i-1}$ is the denominator of
the same convergent, which implies $A_{i-1}^2 \equiv (-1)^i Q_i \pmod{N}$. Thus, if
we can find a $Q_i$ that is a perfect square (or we can build a perfect square by
multiplying together different $Q_i$), then we have a solution to the congruence
$x^2 \equiv y^2 \pmod{N}$. That means $(x + y)(x - y) \equiv 0 \pmod{N}$, so
$\gcd(x - y, N)$ may be a nontrivial factor; if it is not, we keep going until
we find the factors of our original number (Wagstaff, 2013). However, the
continued fraction factoring algorithm was overshadowed by the Quadratic Sieve.
Section 4.5.4 covers the second step of both algorithms. Both use the same second
step, but their first steps are different, with the Quadratic Sieve being more efficient.
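A simplified sketch of this idea follows. It only looks for a single $Q_i$ that is already a perfect square at an even step $i$, so that $A_{i-1}^2 \equiv Q_i \pmod{N}$; it does not combine several $Q_i$ as the full algorithm does. The function name `cfrac_factor`, the `max_steps` cutoff, and the variable names are assumptions made for illustration.

```python
from math import gcd, isqrt

def cfrac_factor(N, max_steps=1000):
    """Try to split N using one square Q_i from the expansion of sqrt(N)."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        return a0
    P, Q, q = 0, 1, a0          # complete quotient is (P + sqrt(N)) / Q
    A_prev2, A_prev = 1, a0     # convergent numerators A_{-1} and A_0
    for i in range(1, max_steps + 1):
        P = q * Q - P
        Q = (N - P * P) // Q    # this is Q_i; A_prev holds A_{i-1}
        # Sanity check of the key identity A_{i-1}^2 == (-1)^i Q_i (mod N).
        assert (A_prev * A_prev - (-1) ** i * Q) % N == 0
        if i % 2 == 0:
            y = isqrt(Q)
            if y * y == Q:      # x^2 == y^2 (mod N) with x = A_{i-1}
                d = gcd(A_prev - y, N)
                if 1 < d < N:
                    return d
        q = (a0 + P) // Q
        A_prev2, A_prev = A_prev, (q * A_prev + A_prev2) % N
    return None                 # no usable square found within max_steps

print(cfrac_factor(437))  # 19
```

For $N = 437$ the second step gives $Q_2 = 4$ and $A_1 = 21$, so $21^2 \equiv 2^2 \pmod{437}$ and $\gcd(21 - 2, 437) = 19$.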
Computing Languages
One of the main goals of this project was to implement the factoring
algorithms in code and to collect our data from those implementations. We
ended up using three programming languages: Python, Sage, and Java.
5.1 Python
Python was created in the late 1980s, and it is one of the most commonly
used programming languages today. It was also the main language used
throughout this research. Python can handle integers no matter how large
they get; however, its runtime is slower than that of other languages.
Python is also a good choice when you need to explain the code, since it
is usually straightforward to read. We were also lucky that Python has an
easy-to-use timer, so we could gather accurate timing data.
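The thesis does not say which timer was used. As one possibility, the sketch below times a function call with `time.perf_counter` from the standard library. The helper `time_call`, the stand-in function `smallest_factor`, and the sample inputs are all hypothetical.

```python
import time

def time_call(func, *args):
    """Run func(*args) and return (result, elapsed seconds)."""
    start = time.perf_counter()
    result = func(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def smallest_factor(n):
    """Stand-in workload: smallest divisor of n greater than 1."""
    f = 2
    while f * f <= n:
        if n % f == 0:
            return f
        f += 1
    return n

# Sample inputs of increasing size; Python integers never overflow.
for n in (15, 10403, 1000036000099):
    factor, seconds = time_call(smallest_factor, n)
    print(len(str(n)), "digits:", factor, f"{seconds:.6f} s")
```

Timing each input several times and keeping the best or the average run would reduce noise from other processes on the machine.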
5.2 SageMath
SageMath was brought into this research late, but it ended up being the
most effective language for what we needed. SageMath is built on top of
Python and adds a large number of mathematical functions. This gave us
many extra tools to use, including certain factoring algorithms. It is
still a fairly new language, and it can also be run on a server. Because
of this, time measurements can vary depending on how busy the server is.
However, you can download your own copy of Sage so that timings do not
depend on server traffic.
5.3 Java
Initially, we expected Java to be useful for this research; however, the
opposite was true. This language was faster at factoring, but it is not good
with large numbers. We could have added different classes to handle larger
numbers, but this was not needed when researching this topic.
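def trial_division(n):
    """Return the prime factors of n in non-descending order."""
    a = []   # prime factors found so far
    f = 2    # current trial divisor
    while n > 1:
        if n % f == 0:
            a.append(f)
            n //= f   # integer division keeps n an exact integer
        else:
            f += 1
    return a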
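Pollard Rho
from math import gcd

def pollard_rho(n):
    """Return a nontrivial factor of n, or None on failure."""
    def g(x):
        return (x * x + 1) % n   # g(x) = (x^2 + 1) mod n
    x, y, d = 2, 2, 1
    while d == 1:
        x = g(x)                 # x moves one step
        y = g(g(y))              # y moves two steps
        d = gcd(abs(x - y), n)
    if d == n:
        return None              # failure
    return d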
Quadratic Sieve ≥ Continued Fraction > Pollard’s Rho > Trial Division
7.1.1 Graphs
For every algorithm that was implemented, we collected timing data to
compare the algorithms against each other. Each algorithm has a graph that
shows the data collected from the tests we ran. These tests measured the
running time of the algorithm in seconds against the number of digits in the
number being factored. The graphs show how the algorithms compare better
than a written explanation can.
There are several types of line graphs here. There are three individual line
graphs, one for each of the three implemented algorithms. There is also a
comparison of the three algorithms against each other by number of digits (the
length of the semiprime number). We will also see a comparison of some
algorithms in different languages.
Unlike the Trial Division algorithm, Pollard’s Rho was able to factor
semiprimes of up to 20 digits. It was fairly quick up to 20 digits, taking only
about 3 seconds. However, once we got past 20 digits, the algorithm took so
long that it could no longer factor in a practical amount of time. Similar to
Trial Division, Pollard’s Rho did not perform as well as expected. For
semiprimes of 1 to 20 digits it performed extremely well; beyond that, its
running time grew so quickly that it became too slow.
The Quadratic Sieve graph looks much different from the graphs of the other
two algorithms. This is because the sieve works best with large numbers. We
were able to factor semiprime numbers up to 70 digits long. This is a huge
improvement over the first two algorithms. This was also expected because
of how strong the algorithm is. I believe that if we had allowed more time to
gather data, we would have been able to factor the 129-digit number associated
with RSA.
Looking at all three algorithms together shows a lot about their different
growth rates. Trial Division goes off the graph fairly quickly. This shows
the strength of the other two algorithms and also how weak Trial Division is.
Pollard’s Rho looks good compared to Trial Division, but the Quadratic Sieve
beats both of them. The sieve looks almost like a completely flat line because
it factors 20-digit semiprimes in under one second. This is strong evidence of
how powerful the Quadratic Sieve is compared to the others.
Conclusion
9.1 Future Threats
The RSA public key is simple, yet it is very hard to break. Prime numbers
are such an interesting topic because we know so much about them and yet, at
the same time, there is so much we still do not know. It is extremely
impressive that the RSA cryptosystem has lasted for over 40 years. It appears
that public keys will keep being used until there is an easy way to factor
semiprime numbers.
A threat to these types of public keys is currently being developed: quantum
computing. If quantum computing becomes more than just a theory and works on
the scale it is expected to, it is expected to be able to factor large numbers
almost instantly. This would make many of today’s cryptosystems obsolete but
also bring in a new age of encryption techniques. The reason this is so fast is
that "quantum bits" can hold a value of 0, 1, or both at the same time. Regular
bits can hold only one value at a time, which is why quantum computing is
expected to be so much faster. Currently, many companies are working towards
creating a quantum computer.
Acknowledgements
This thesis was only possible because of the help of many individuals.
Writing a thesis is no easy task and I would like to extend my sincerest thanks
to everyone that helped.
First, I would like to thank my advisor, Dr. Jackie Anderson, who spent over
a year working with me on this project. Without her amazing knowledge and
guidance, I would not have been able to complete this research.
I would also like to thank the members of my reading committee, Dr. Ward
Heilman and Dr. Haleh Khojasteh. They took the time to help edit this thesis
and gave great feedback that made it better than it was before.
Bibliography
Code Appendix
A.1 The Main Code Used
These are the imports needed for much of the code:
import random
import math
import re
from math import gcd  ## fractions.gcd was removed in Python 3.9 ##
def primeNumber(n):
    ## This program checks to see if a number is prime or not ##
    squareRoot = int(math.sqrt(n)+1)
    else:
        print(n, "is a composite number.")
def primeGenerator(n,outfileName):
    outfile = open(outfileName, "w")
    primeList = []
    outfile.close()
    print("\nThe", len(primeList),"prime numbers have been written to",
          outfileName)
def primeGeneratorParameters(a,b,outfileName):
    outfile = open(outfileName, "w")
    primeList = []
    outfile.close()
    print("\nThe", len(primeList),"prime numbers have been written to",
          outfileName)
def FLT(n):
    ## Fermat test with base 2: computes 2^(n-1) mod n without a huge power ##
    primeTest = pow(2, n-1, n)
    if(n==2):
        print(n, "is a prime number.")
    elif(n%2==0):
        print(n, "is not a prime number.")
    elif(primeTest == 1):
        print("There is a possibility that", n, "is a prime number.")
    else: ## If nothing else is there then it is not a prime ##
        print(n, "is not a prime number.")
def normalFactor(number):
    factors = []
    squareRoot = int(math.sqrt(number)+1)
    print("")
    print("There are", int(len(factors)/2), "ways to factor", number)
def pollardRho(N):
    b = random.randint(1, N-3) ## finds a random b ##
    s = random.randint(0, N-1) ## finds a random s ##
    A = s
    B = s
    def f(x):
        return ((x**2)+b)%N
    g = 1
    ## int() parses the count without evaluating arbitrary input ##
    attempt = int(input("How many times would you like to attempt this? "))
    print("")
    for i in range(attempt):
        while(g == 1):
            A = f(A) ## sends A through the f(x)
            B = f(f(B)) ## sends B through f(x) twice
            g = gcd(A-B,N) ## sets g to the gcd