Integer Factorization Algorithms: Connelly Barnes
Connelly Barnes
Department of Physics, Oregon State University
December 7, 2004
I. Introduction
1. Terminology
2. Fundamental Theorem of Arithmetic
3. Practical Motivations
II. Algorithms
1. Algorithm: Trial Division
2. Pseudocode: Trial Division
3. Algorithm: Fermat Factorization
4. Pseudocode: Fermat Factorization
5. Algorithm: Pollard rho Factorization
6. Pseudocode: Pollard rho Factorization
7. Algorithm: Brent’s Factorization Method
8. Pseudocode: Brent’s Factorization Method
9. Algorithm: Pollard p-1 Factorization
10. Pseudocode: Pollard p-1 Factorization
V. Conclusion
Appendix B. References
I. Introduction
This paper gives a brief survey of integer factorization algorithms. We offer
several motivations for the factorization of large integers. A number of factoring
algorithms are then explained, and pseudocode is given for each. Bounds on the running time
are given for algorithms that always succeed, and failure cases are shown for
probabilistic algorithms. Finally, the run times of all presented algorithms are plotted for
certain prime products and compared.
1. Terminology
Big O notation:
The function $f(x)$ is $O(g(x))$ as $x \to \infty$ if and only if there are positive
real constants c, k such that for every $x > k$, $0 \leq f(x) \leq c\,g(x)$.
Example: $f(x) = 2x^2 + x + 1$ is $O(x^2)$ as $x \to \infty$ for c = 3, k = 2.
Trivial factor:
A positive integer factor s of N such that s = 1 or s = N.
Nontrivial factor:
A positive integer factor s of N such that 1 < s < N .
Prime number:
A positive integer greater than 1 that is divisible by no positive integers
other than 1 and itself.
3. Practical Motivations
The fundamental theorem of arithmetic implies that any composite integer can be
factored. Given the number N = 21, it is straightforward to find the factors of N:
21 = 3 ⋅ 7 . Now consider a larger composite number:
N = 25195908475657893494027183240048398571429282126204
03202777713783604366202070759555626401852588078440
69182906412495150821892985591491761845028084891200
72844992687392807287776735971418347270261896375014
97182469116507761337985909570009733045974880842840
17974291006424586918171951187461215151726546322822
16869987549182422433637259085141865462043576798423
38718477444792073993423658482382428119816381501067
48104516603773060562016196762561338441436038339044
14952634432190114657544454178424020924616515723350
77870774981712577246796292638635637328991215483143
81678998850404453640235273819513786365643912120103
97122822120720357.
$$S = O\!\left(\exp\!\left(\left(\tfrac{64}{9}\,n\right)^{1/3} (\log n)^{2/3}\right)\right)$$
steps to factor an integer with n decimal digits. The running time of the algorithm is
bounded below by functions polynomial in n and bounded above by functions
exponential in n [2].
The apparent difficulty of factoring large integers is the basis of some modern
cryptographic algorithms. The RSA encryption algorithm [3], and the Blum Blum Shub
cryptographic pseudorandom number generator [4] both rely on the difficulty of factoring
large integers. If it were possible to factor products of large prime numbers quickly,
these algorithms would be insecure.
The SSL encryption used for TCP/IP connections over the World Wide Web
relies on the security of the RSA algorithm [5]. Hence if one could factor large integers
quickly, "secured" Internet sites would no longer be secure.
Finally, in computational complexity theory, it is unknown whether factoring is in
the complexity class P. In technical terms, this means that there is no known algorithm
for answering the question "Does integer N have a factor less than integer s?" in a number
of steps that is O(P(n)), where n is the number of digits in N, and P(n) is a polynomial
function. Moreover, no one has proved that such an algorithm exists, or does not exist.
In layman's terms, one can simply ask the question, "What is the fastest algorithm for
factoring large numbers?" This is an important open question in mathematics [6].
II. Algorithms
1. Algorithm: Trial Division
Trial division is the simplest algorithm for factoring an integer. Assume that s
and t are nontrivial factors of N such that st = N and s ≤ t. To perform the trial division
algorithm, one simply checks whether $s \mid N$ for $s = 2, \ldots, \lfloor \sqrt{N} \rfloor$. When such a divisor s is
found, then t = N / s is also a factor, and a factorization has been found for N. The upper
bound $s \leq \sqrt{N}$ holds because s ≤ t and st = N together give $s^2 \leq st = N$.
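The search described above can be sketched in Python as follows. This is a minimal illustration rather than the paper's own pseudocode: the function name `trial_division` and the use of `math.isqrt` for $\lfloor \sqrt{N} \rfloor$ are assumptions made for this example.

```python
from math import isqrt


def trial_division(n):
    """Return (s, t) with s * t == n and 1 < s <= t, or None if no such pair exists."""
    # Only candidates up to floor(sqrt(n)) need to be tested, since s <= t.
    for s in range(2, isqrt(n) + 1):
        if n % s == 0:
            return s, n // s
    return None


print(trial_division(21))  # (3, 7)
print(trial_division(13))  # None, because 13 is prime
```

The loop performs at most $\lfloor \sqrt{N} \rfloor - 1$ divisions, which is why the method is only practical when N has a small factor.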
$$N = x^2 - y^2$$
$$N = (x + y)(x - y)$$
Assume that s and t are nontrivial odd factors of N such that st = N and s ≤ t. We
can find x and y such that s = (x − y) and t = (x + y). Solving these equations, we find that
x = (s + t) / 2 and y = (t − s) / 2. Here x and y are integers, since the sum and the difference
of any two odd numbers are even, and an even number is divisible by two. Since s > 1 and t ≥ s,
we find that x ≥ 1 and y ≥ 0. For particular x, y satisfying s = (x − y) and t = (x + y), we
thus know that $x = \sqrt{N + y^2}$, and hence $x \geq \sqrt{N}$. Also, $x = (s + t)/2 \leq 2t/2 = t < N$.
For an algorithm, we choose $x_1 = \lceil \sqrt{N} \rceil$ and $x_{i+1} = x_i + 1$. For each i, we check
whether $y_i = \sqrt{x_i^2 - N}$ is an integer and whether $(x_i + y_i)$, $(x_i - y_i)$ are nontrivial
factors of N. If both of these conditions hold, we return the nontrivial factors. Otherwise,
we continue to the next i, and exit once $x_i = N$.
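The iteration just described can be sketched in Python. This is a hedged illustration, assuming N is odd; the name `fermat_factor` is hypothetical, and `math.isqrt` is used to test whether $x^2 - N$ is a perfect square without floating-point error.

```python
from math import isqrt


def fermat_factor(n):
    """Search for x, y with n = x^2 - y^2; return (x - y, x + y) or None."""
    x = isqrt(n)
    if x * x < n:
        x += 1  # start at ceil(sqrt(n))
    while x < n:
        y_squared = x * x - n
        y = isqrt(y_squared)
        if y * y == y_squared:  # x^2 - n is a perfect square
            s, t = x - y, x + y
            if 1 < s < n:  # reject the trivial split 1 * n
                return s, t
        x += 1
    return None


print(fermat_factor(8051))  # (83, 97), found at x = 90, y = 7
```

The method is fastest when s and t are close together, because then x starts only slightly above $\sqrt{N}$.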
$$x_0 \equiv 2 \pmod{N}$$
$$x_{n+1} \equiv x_n^2 + 1 \pmod{N}$$
This sequence will eventually become periodic. It can be shown by contradiction that the
length of the cycle is less than or equal to N: assume that the length L
of the cycle is greater than N. There are only N distinct values of $x_n$ modulo N, so a cycle of
length L > N must contain two $x_n$ values that are congruent, and these can be identified
as the “starting points” of a cycle with length less than or equal to N. Probabilistic
arguments show that the expected time for this sequence (mod N) to fall into a cycle and
the expected length of the cycle are both proportional to $\sqrt{N}$, for almost all N [8]. Other
initial values and iterative functions often have similar behavior under iteration, but the
function $f(x) = x^2 + 1$ has been found to work well in practice for factorization.
Assume that s and t are nontrivial factors of N such that st = N and s ≤ t. Now
suppose that we have found nonnegative integers i, j with i < j such that $x_i \equiv x_j \pmod{s}$
but $x_i \not\equiv x_j \pmod{N}$. Since $s \mid (x_i - x_j)$ and $s \mid N$, we have that $s \mid \gcd(x_i - x_j, N)$. By
assumption s ≥ 2, thus $\gcd(x_i - x_j, N) \geq 2$. By definition we know $\gcd(x_i - x_j, N) \mid N$.
However, we have that $N \nmid (x_i - x_j)$, and thus that $N \nmid \gcd(x_i - x_j, N)$, because the
gcd divides $x_i - x_j$. So $\gcd(x_i - x_j, N) > 1$, $\gcd(x_i - x_j, N) \mid N$, and
$\gcd(x_i - x_j, N) \neq N$. Therefore $\gcd(x_i - x_j, N)$ is a nontrivial factor of N.
Now we must find i, j such that $x_i \equiv x_j \pmod{s}$ and $x_i \not\equiv x_j \pmod{N}$. Observe
that the sequence $x_n \pmod{s}$ is periodic, with an expected cycle length proportional to $\sqrt{s}$.
Pollard suggested that $x_n$ be compared to $x_{2n}$ for n = 1, 2, 3, …. For each n, we check
whether $\gcd(x_n - x_{2n}, N)$ is a nontrivial factor of N. If it is a trivial
factor of N, we continue with the next n until a nontrivial factor is found. If no factor is found,
the algorithm does not terminate.
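The comparison of $x_n$ with $x_{2n}$ can be sketched in Python as below. This is an illustrative sketch, not the paper's pseudocode: the name `pollard_rho` and the `max_steps` cap are assumptions added here so that the function returns `None` instead of looping forever when no factor is found.

```python
from math import gcd


def pollard_rho(n, max_steps=1_000_000):
    """Pollard rho with x_0 = 2 and x -> x^2 + 1 (mod n), comparing x_n to x_2n."""
    slow = fast = 2 % n
    for _ in range(max_steps):
        slow = (slow * slow + 1) % n  # advance one step: x_n
        fast = (fast * fast + 1) % n  # advance two steps: x_2n
        fast = (fast * fast + 1) % n
        d = gcd(slow - fast, n)
        if 1 < d < n:  # nontrivial factor found
            return d, n // d
    return None  # gave up (an assumption of this sketch)


print(sorted(pollard_rho(8051)))      # [83, 97]
print(pollard_rho(21, max_steps=50))  # None
```

Only two sequence values are stored at any time, so the memory use is constant regardless of how long the cycle is.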
nontrivial factor s of N by finding indices i, j with i < j such that $x_i \equiv x_j \pmod{s}$ and
$x_i \not\equiv x_j \pmod{N}$. The $x_n$ sequence is defined by the recurrence relation:
$$x_0 \equiv 2 \pmod{N}$$
$$x_{n+1} \equiv x_n^2 + 2 \pmod{N}$$
function integralPowerOf2(z)
    pow2 := 1
    while pow2 <= z do
        if pow2 = z then
            return true
        end if
        pow2 := pow2 * 2
    end while
    return false
end function
Proof. Let there be d binary bits in z, and let $(\cdot)_i$ be an operator which gives the ith
binary bit of $(\cdot)$, where i = 1 is the least significant bit. If z is an integral power of 2,
then clearly $z_k = 0$ for k = 1, 2, …, d − 1, and $z_d = 1$. We also have that $z - 1 < z$, so
clearly $(z - 1)_d = 0$. Using the truth table for the logical AND operator, we find that
$(z \,\&\, (z - 1))_k$ must be 0 for k = 1, …, d. Hence $z \,\&\, (z - 1) = 0$. In the case that z is not
an integral power of 2, $z_d = 1$. Let α be the largest integral power of 2 that is less than
z. Then z > α, hence z − 1 ≥ α, and thus $(z - 1)_d = \alpha_d = 1$. Using the truth table for the
logical AND operator at bit d we find that $(z \,\&\, (z - 1))_d = 1$, hence $z \,\&\, (z - 1) \neq 0$.
Therefore, z is an integral power of 2 if and only if $z \,\&\, (z - 1) = 0$.
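The equivalence proved above can be checked numerically with a short Python sketch. The function names are hypothetical; the loop version mirrors the doubling search, and the bitwise version assumes z is a positive integer (for z = 0 the expression z & (z − 1) is also 0, so that case is excluded explicitly).

```python
def is_power_of_two_loop(z):
    """Doubling search: compare z against 1, 2, 4, ... until the power exceeds z."""
    pow2 = 1
    while pow2 <= z:
        if pow2 == z:
            return True
        pow2 *= 2
    return False


def is_power_of_two(z):
    """Bitwise test: a positive z is a power of 2 exactly when z & (z - 1) == 0."""
    return z > 0 and (z & (z - 1)) == 0


# The two tests agree on every integer from 1 to 4096.
print(all(is_power_of_two(z) == is_power_of_two_loop(z) for z in range(1, 4097)))  # True
```

The bitwise form replaces a loop of about log2(z) iterations with a single subtraction and a single AND.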
$$2^{p-1} \equiv 1 \pmod{p}$$
$$2^{k!} \equiv \left(2^{p-1}\right)^{q} \equiv 1^{q} \equiv 1 \pmod{p}$$
10. Pseudocode: Pollard p-1 Factorization
function pollard_p1(N)
    # Initial value 2^(k!) for k = 0, i.e. 2^(0!) = 2^1 = 2.
    two_k_fact := 2
    for k from 1 to infinity
        # Calculate 2^(k!) (mod N) from 2^((k-1)!).
        two_k_fact := modPow(two_k_fact, k, N)
        rk := gcd(two_k_fact - 1, N)
        if rk <> 1 and rk <> N then
            return rk, N/rk
        end if
    end for
end function
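A runnable Python sketch of the same loop follows. It is an illustration under stated assumptions: the running value starts at 2, since $2^{0!} = 2$; Python's built-in three-argument `pow` stands in for `modPow`; and the `max_steps` cap is an addition of this sketch so that the function returns `None` rather than looping forever.

```python
from math import gcd


def pollard_p1(n, max_steps=1_000_000):
    """Pollard p-1 with base 2: test gcd(2^(k!) - 1, n) for k = 1, 2, 3, ..."""
    two_k_fact = 2 % n  # 2^(0!) = 2
    for k in range(1, max_steps + 1):
        # 2^(k!) = (2^((k-1)!))^k, reduced modulo n at every step.
        two_k_fact = pow(two_k_fact, k, n)
        r = gcd(two_k_fact - 1, n)
        if 1 < r < n:
            return r, n // r
    return None  # gave up (an assumption of this sketch)


print(pollard_p1(299))  # (13, 23): 13 - 1 = 12 divides 4! = 24
```

For N = 299 = 13 · 23 the factor 13 appears at k = 4 because 12 divides 24, while the order of 2 modulo 23 is 11, which does not divide 24.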
$$a^b = a^{b_0 2^0}\, a^{b_1 2^1} \cdots a^{b_{n-1} 2^{n-1}} = \left(a^{2^0}\right)^{b_0} \left(a^{2^1}\right)^{b_1} \cdots \left(a^{2^{n-1}}\right)^{b_{n-1}}.$$
have:
$$a^b = \prod_{\substack{k = 0 \\ b_k \neq 0}}^{n-1} a^{2^k}$$
Also note that $a^{2^{k+1}} = a^{2 \cdot 2^k} = \left(a^{2^k}\right)^2$. Via a process of repeated squaring, we can
thus construct an algorithm which returns the least nonnegative integer y such that
$a^b \equiv y \pmod{m}$.
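The repeated-squaring process can be sketched in Python as follows. This is a minimal illustration, assuming b ≥ 0 and m ≥ 1; the name `mod_pow` is hypothetical, and the result is checked against Python's built-in three-argument `pow`.

```python
def mod_pow(a, b, m):
    """Return the least nonnegative y with a^b ≡ y (mod m), for b >= 0 and m >= 1."""
    result = 1 % m
    square = a % m  # holds a^(2^k) mod m, starting at k = 0
    while b > 0:
        if b & 1:  # bit b_k is 1, so a^(2^k) belongs in the product
            result = (result * square) % m
        square = (square * square) % m  # a^(2^(k+1)) = (a^(2^k))^2
        b >>= 1  # move on to the next binary digit of b
    return result


print(mod_pow(2, 24, 65))                      # 1
print(mod_pow(3, 200, 13) == pow(3, 200, 13))  # True
```

The loop runs once per binary digit of b, so about log2(b) squarings replace the b − 1 multiplications of the naive method.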
Here a % m is a modulo operation, which returns the least nonnegative integer y
such that a ≡ y (mod m ).
Figure 1 shows a plot of the median number of steps for each algorithm versus the
number of decimal digits d in the prime factors, where "steps" is defined as the number of
iterations through the for loop.
For each value of d, each algorithm was tested 100 times. For each test, integers
s, t were chosen in a uniform random manner from the set of integers having d decimal
digits. If s was composite, or t was composite, or s equaled t, then the numbers were
reselected. Once a valid pair s, t was found, the algorithm was run on the product st for up
to $10^6$ steps. The median number of steps is plotted for each algorithm.
Figure 1. Number of steps (logarithmic axis, 1 to 1,000,000) versus the number of decimal digits in the prime factors (0 to 7), plotted for Pollard rho, Pollard p-1, trial factorization, Fermat factorization, and Brent factorization. [Plot not reproduced.]
$$x_0 \equiv 2 \pmod{21}$$
$$x_1 \equiv x_0^2 + 1 \equiv 5 \pmod{21}$$
$$x_2 \equiv x_1^2 + 1 \equiv 5 \pmod{21}$$
$$x_n \equiv x_{n-1}^2 + 1 \equiv 5 \pmod{21} \quad \text{for } n \geq 1$$
If n ≥ 1, $x_{2n} - x_n = 0$. The algorithm at each step for n = 1, 2, … computes
$\gcd(x_{2n} - x_n, N) = \gcd(0, N) = N$. The algorithm never finds a nontrivial factor, and
never terminates.
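The fixed point at 5 can be reproduced with a few lines of Python. This is only a numerical check of the arithmetic above; the variable names are illustrative.

```python
from math import gcd

N = 21
xs = [2]  # x_0 = 2
for _ in range(8):
    xs.append((xs[-1] ** 2 + 1) % N)  # x_{n+1} = x_n^2 + 1 (mod 21)

# gcd(x_2n - x_n, N) for n = 1, 2, 3, 4
gcds = [gcd(xs[2 * n] - xs[n], N) for n in range(1, 5)]

print(xs)    # [2, 5, 5, 5, 5, 5, 5, 5, 5]
print(gcds)  # [21, 21, 21, 21]
```

Because 5² + 1 = 26 ≡ 5 (mod 21), the sequence is stuck from the first step on, and every gcd is the trivial factor 21.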
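$$x_k \equiv 2^{k!} - 1 \pmod{65}, \quad k = 1, 2, 3, \ldots$$
$$x_1 \equiv 2^{1} - 1 \equiv 1 \pmod{65}$$
$$x_2 \equiv 2^{2} - 1 \equiv 3 \pmod{65}$$
$$x_3 \equiv 2^{6} - 1 \equiv 63 \pmod{65}$$
$$x_4 \equiv 2^{24} - 1 \equiv 0 \pmod{65}$$
$$x_k \equiv (x_{k-1} + 1)^{k} - 1 \equiv 1^{k} - 1 \equiv 0 \pmod{65} \quad \text{for } k \geq 5$$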
The Pollard p-1 algorithm computes at each step gcd( x k , N ) . For the first three
steps, we find that gcd(1, 65) = 1, gcd(3, 65) = 1, and gcd(63, 65) = 1. For steps k ≥ 4
we find gcd(0, 65) = 65. Hence the algorithm never finds a nontrivial factor, and never
terminates.
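The values in this example can be checked with a short Python sketch. It simply evaluates $2^{k!} - 1 \pmod{65}$ and the corresponding gcd for the first few k; the variable names are illustrative.

```python
from math import factorial, gcd

N = 65
rows = []
for k in range(1, 7):
    x_k = (pow(2, factorial(k), N) - 1) % N  # x_k = 2^(k!) - 1 (mod 65)
    rows.append((k, x_k, gcd(x_k, N)))

for row in rows:
    print(row)
# (1, 1, 1), (2, 3, 1), (3, 63, 1), then (k, 0, 65) for k = 4, 5, 6
```

The failure occurs because 65 = 5 · 13 and both 5 − 1 = 4 and 13 − 1 = 12 first divide k! at the same step, k = 4, so both prime factors enter the gcd at once.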
V. Conclusion
There are no known algorithms which can factor arbitrary large integers
efficiently. Probabilistic algorithms such as the Pollard rho and Pollard p-1 algorithms are
in most cases more efficient than the trial division and Fermat factorization algorithms.
However, probabilistic algorithms can fail when given certain prime products: for
example, Pollard's rho algorithm fails for N = 21. Integer factorization algorithms are an
important subject in mathematics, both for complexity theory, and for practical purposes
such as data security on computers.
Appendix A. Maple Source Code for Simulation
> # Define each factorization algorithm
# Increment iteration counter.
iters := iters + 1;
if iters >= maxsteps then
return 1, N, maxsteps;
fi;
od;
end;
# of two randomly selected k-digit primes, over 100 runs of the algorithm.
> median_steps_for_k_digit_prime := proc(algo, k)
  local i, times, p, q, p1, p2, iter;
  # Initially empty sequence of the number of steps made by the given algo
  # for each pair of random primes.
  times := seq(j, j=0..-1);
  # Run the algorithm 100 times on products of two random k-digit primes.
  for i from 1 to 100 do
    while 1=1 do
      p := rand(10^(k-1)..10^k-1)();
      q := rand(10^(k-1)..10^k-1)();
      if isprime(p) and isprime(q) and p <> q then
        break;
      fi;
    od;
    # Run the algorithm, but bail out after 1e6 steps.
    p1, p2, iter := algo(p*q, 1000000);
    times := times, iter;
  od;
  times := sort([times]);
  return times[1+floor(nops(times)/2)];
end;
> # Reproduce the median number of steps for each algorithm when
> # given the products of two randomly selected 4-digit primes.
>
> # Vary the last argument to reproduce the data in Figure 1.
>
> median_steps_for_k_digit_prime(trial_factor, 4);
3342
> median_steps_for_k_digit_prime(fermat_factor, 4);
101
> median_steps_for_k_digit_prime(pollard_rho, 4);
40
> median_steps_for_k_digit_prime(pollard_p1, 4);
36
> median_steps_for_k_digit_prime(brent_factor, 4);
97
Appendix B. References
[1]. Kaliski, Burt. "RSA Factoring Challenge." USENET newsgroup sci.crypto. March 18, 1991.
Available: http://www.google.com/groups?selm=BURT.91Mar18092126%40chirality.rsa.com
Accessed November 17, 2004.
[2]. "General number field sieve." From Wikipedia, an online encyclopedia. November 13, 2004.
Available: http://en.wikipedia.org/wiki/GNFS
[3]. Weisstein, Eric W. "RSA Encryption." From MathWorld, an online encyclopedia. April, 2001.
Available: http://mathworld.wolfram.com/RSAEncryption.html
[4]. Junod, Pascal. "Cryptographic Secure Pseudo-Random Bits Generation: The Blum-Blum-Shub
Generator." August 1999. Available: http://www.win.tue.nl/~henkvt/boneh-bbs.pdf
[5]. Housley et al. "RFC 2459: Internet X.509 Public Key Infrastructure Certificate and CRL
Profile." January, 1999. Available: http://www.faqs.org/rfcs/rfc2459.html
[6]. "Integer factorization – Difficulty and complexity." From Wikipedia, an online encyclopedia.
October 30, 2004. Available: http://en.wikipedia.org/wiki/Integer_factorization
[7]. Weisstein, Eric W. "Fermat, Pierre de." From MathWorld, an online encyclopedia.
Available: http://scienceworld.wolfram.com/biography/Fermat.html
[8]. Weisstein, Eric W. "Pollard Rho Factorization." From MathWorld, an online encyclopedia.
December 28, 2002. Available: http://mathworld.wolfram.com/PollardRhoFactorizationMethod.html
[9]. Weisstein, Eric W. "Brent's Factorization Method." From MathWorld, an online encyclopedia.
December 28, 2002. Available: http://mathworld.wolfram.com/BrentsFactorizationMethod.html
[10]. Ohannessian, Robert J. "Bob's page of mildly useful but still pretty neat code snippets."
February 18, 2003. Available: http://bob.allegronetwork.com/prog/tricks.html
[11]. Weisstein, Eric W. "Pollard p-1 Factorization Method." From MathWorld, an online encyclopedia.
December 28, 2002. Available: http://mathworld.wolfram.com/Pollardp-1FactorizationMethod.html
[12]. Campbell, Robert. "Computation – Exponentiation via the Russian Peasant Algorithm."
March 29, 1998. Available: http://www.math.umbc.edu/%7Ecampbell/Math413Fall98/7-FermatThm.html