Chapter 1
Introduction: Efficient Primality Testing
This algorithm, when presented with an input number n, gives rise to the
following calculation: In the loop in lines 2–5 the numbers i = 2, 3, . . . , √n,
in this order, are tested for being a divisor of n. As soon as a divisor is
found, the calculation stops and returns the value 1. If no divisor is found,
the answer 0 is returned. The algorithm solves the primality problem in the
following sense:
n is a prime number if and only if Algorithm 1.1.1 returns 0.
This is because if n = a · b for 1 < a, b < n, then one of the factors a and b is
not larger than √n, and hence such a factor must be found by the algorithm.
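In Python, the procedure just described may be rendered as follows (an illustrative sketch only; the function name trial_division and the use of math.isqrt are choices made here, not part of Algorithm 1.1.1):

from math import isqrt

def trial_division(n: int) -> int:
    """Return 1 as soon as a divisor of n in {2, ..., isqrt(n)} is found,
    and 0 if no divisor exists (i.e., n is prime). Assumes n >= 2."""
    for i in range(2, isqrt(n) + 1):   # i = 2, 3, ..., floor(sqrt(n)), in order
        if n % i == 0:                 # divisor found: n is composite
            return 1
    return 0                           # no divisor found: n is prime

# Example: 97 is prime, 91 = 7 * 13 is composite.
# print(trial_division(97), trial_division(91))  # -> 0 1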
For moderately large n this procedure may be used for a calculation by hand;
using a modern computer, it is feasible to carry it out for numbers with 20
or 25 decimal digits. However, when confronted with a number like
n = 74838457648748954900050464578792347604359487509026452654305481,
this method cannot be used, simply because it takes too long. The 62-digit
number n happens to be prime, so the loop runs for more than 10^31 rounds.
One might think of some simple tricks to speed up the computation, like
dividing by 2, 3, and 5 at the beginning, but afterwards not by any proper
multiples of these numbers. Even after applying tricks of this kind, and under
the assumption that a very fast computer is used that can carry out one trial
division in 1 nanosecond, say, a simple estimate shows that this would take
more than 10^13 years of computing time on a single computer.
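The estimate is easily reproduced (a rough back-of-the-envelope check under the stated assumption of one trial division per nanosecond; the factor 8/30 models skipping multiples of 2, 3, and 5):

from math import isqrt

n = 74838457648748954900050464578792347604359487509026452654305481
trials = isqrt(n)                  # about 8.7e30 candidate divisors up to sqrt(n)
trials = trials * 8 // 30          # keep only candidates coprime to 2, 3 and 5
seconds = trials * 1e-9            # one trial division per nanosecond
years = seconds / (3600 * 24 * 365)
print(f"{years:.1e} years")        # on the order of 10^13 years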
Presented with such a formidably large number, or an even larger one with
several hundred decimal digits, naive procedures like trial division are not
helpful, and will never be, even if the speed of computers increases by several
orders of magnitude and even if computer networks comprising hundreds of
thousands of computers are employed.
One might ask whether considering prime numbers of some hundred dec-
imal digits makes sense at all, because there cannot be any set of objects
in the real world that would have a cardinality that large. Interestingly, in
algorithmics and especially in cryptography there are applications that use
prime numbers of that size for very practical purposes. A prominent example
of such an application is the public key cryptosystem by Rivest, Shamir, and
Adleman [36] (the “RSA system”), which is based on our ability to create
random primes of several hundred decimal digits. (The interested reader may
wish to consult cryptography textbooks like [37, 40] for this and other exam-
ples of cryptosystems that use randomly generated large prime numbers.)
One may also look at the primality problem from a more theoretical point
of view. A long time before prime numbers became practically important as
basic building blocks of cryptographic systems, Carl Friedrich Gauss had
written:
“The problem of distinguishing prime numbers from composites, and
of resolving composite numbers into their prime factors, is one of
the most important and useful in all of arithmetic. . . . The dignity
of science seems to demand that every aid to the solution of such an
elegant and celebrated problem be zealously cultivated.” ([20], in the
translation from Latin into English from [25])
Obviously, Gauss knew the trial division method and also methods for finding
the prime decomposition of natural numbers. So it was not just any procedure
for deciding primality he was asking for, but one with further properties —
simplicity, maybe, and speed, certainly.
fast on numbers that are not too large. But what does “fast” and “not too
large” mean? Clearly, for any algorithm the number of computational steps
made on input n will grow as larger and larger n are considered. It is the rate
of growth that is of interest here.
To illustrate a growth rate different from √n as in Algorithm 1.1.1, we
consider another algorithm for the primality problem (Lehmann [26]).
Algorithm 1.2.1 (Lehmann’s Primality Test)
Input: Odd integer n ≥ 3, integer ℓ ≥ 2.
Method:
0  a, c: integer; b[1..ℓ]: array of integer;
1  for i from 1 to ℓ do
2    a ← a randomly chosen element of {1, . . . , n − 1};
3    c ← a^((n−1)/2) mod n;
4    if c ∉ {1, n − 1}
5      then return 1;
6      else b[i] ← c;
7  if b[1] = · · · = b[ℓ] = 1
8    then return 1;
9    else return 0;
The intended output of Algorithm 1.2.1 is 0 if n is a prime number and 1
if n is composite. The loop in lines 1–6 causes the same action to be carried
out ℓ times, for ℓ ≥ 2 a number given as input. The core of the algorithm is
lines 2–6. In line 2 a method is invoked that is important in many efficient
algorithms: randomization. We assume that the computer that carries out
the algorithm has access to a source of randomness and in this way can
choose a number a in {1, . . . , n − 1} uniformly at random. (Intuitively, we
may imagine it casts a fair “die” with n − 1 faces. In reality, of course, some
mechanism for generating “pseudorandom numbers” is used.) In the ith round
through the loop, the algorithm chooses a number a_i at random and calculates
c_i = a_i^((n−1)/2) mod n, i.e., the remainder when a_i^((n−1)/2) is divided by n. If c_i
is different from 1 and n − 1, then output 1 is given, and the algorithm stops
(lines 4 and 5); otherwise (line 6) c_i is stored in memory cell b[i]. If all of the
c_i’s are in {1, n − 1}, the loop runs to the end, and in lines 7–9 the outcomes
c_1, . . . , c_ℓ of the rounds are looked at again. If n − 1 appears at least once,
output 0 is given; if all c_i’s equal 1, output 1 is given.
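The same procedure can be written compactly in Python (again only a sketch; random.randrange stands in for the ideal source of randomness, and the built-in pow with three arguments computes a^((n−1)/2) mod n):

import random

def lehmann_test(n: int, l: int) -> int:
    """Lehmann's test as described above: returns 0 ('prime') or 1 ('composite')
    for an odd integer n >= 3 and a repetition count l >= 2."""
    b = []
    for _ in range(l):
        a = random.randrange(1, n)          # uniform from {1, ..., n-1}
        c = pow(a, (n - 1) // 2, n)         # a^((n-1)/2) mod n
        if c not in (1, n - 1):             # certificate of compositeness
            return 1
        b.append(c)
    if all(ci == 1 for ci in b):            # n - 1 never appeared: declare composite
        return 1
    return 0

# print(lehmann_test(101, 10))  # 101 is prime; output 0 with high probability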
We briefly discuss how the output should be interpreted. Since the algo-
rithm performs random experiments, the result is a random variable. What
is the probability that we get the “wrong” output? We must consider two
cases.
Case 1: n is a prime number. (The desired output is 0.) — We shall see
later (Sect. 6.1) that for n an odd prime exactly half of the elements a
of {1, . . . , n − 1} satisfy a^((n−1)/2) mod n = n − 1, the other half satisfies
a^((n−1)/2) mod n = 1. This means that the loop runs through all rounds,
example, a number with 80 binary digits has about 24 decimal digits.) Simi-
larly, as an elementary operation we view the addition or the multiplication
of two bits. A rough estimate on the basis of the naive methods shows that
certainly c · ℓ bit operations are sufficient to add, subtract, or compare two
ℓ-bit numbers; for multiplication and division we are on the safe side if we
assume an upper bound of c · ℓ^2 bit operations, for some constant c. Assume
now an algorithm A is given that performs T_A(n) elementary operations on
input n. We consider possible bounds on T_A(n) expressed as f_i(log n), for
some functions f_i : N → R; see Table 1.1. The table lists the bounds we
get for numbers with about 60, 150, and 450 decimal digits, and it gives the
binary length of numbers we can treat within 10^12 and 10^20 computational
steps.
Table 1.1. Growth functions for operation bounds. f_i(200), f_i(500), f_i(1500) de-
note the bounds obtained for 200-, 500-, and 1500-bit numbers; s_i(10^12) and s_i(10^20)
are the maximal numbers of binary digits admissible so that an operation bound
of 10^12 resp. 10^20 is guaranteed
to situations where the length of the numbers that can be treated is already
severely restricted — with (log n)^9 operations we may deal with one 7-digit
number in 1 second; treating a single 50-digit number takes years. Bounds
f_7(log n) = (log n)^(2 ln ln log n), f_8(log n) = c · 2^√(log n), and f_9(log n) = c · √n exceed
any polynomial in log n for sufficiently large n. For numbers with small bi-
nary length log n, however, some of these superpolynomial bounds may still be
smaller than high-degree polynomial bounds, as the comparison between f_6,
f_7, and f_8 shows. In particular, note that for log n = 180,000 (corresponding
to a 60,000-digit number) we have 2 ln ln(log n) < 5, so f_7(log n) < (log n)^5.
The bound f_9(log n) = c · √n, which belongs to the trial division method,
is extremely bad; only very short inputs can be treated.
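A few evaluations make the comparison concrete (an illustrative computation with c = 1, using the superpolynomial bounds as quoted above, as functions of the binary length k = log n):

from math import log, sqrt

def f7(k): return k ** (2 * log(log(k)))     # (log n)^(2 ln ln log n)
def f8(k): return 2 ** sqrt(k)               # 2^sqrt(log n)
def f9(k): return 2 ** (k / 2)               # sqrt(n), since n is about 2^(log n)

for k in (200, 500, 1500):
    print(k, f"{f7(k):.2e}", f"{f8(k):.2e}", f"{f9(k):.2e}")

# The claim for 60,000-digit numbers: the exponent of f7 stays below 5.
print(2 * log(log(180_000)))                 # about 4.99 < 5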
Summing up, we see that algorithms with a polynomial bound with a truly
small exponent are useful even for larger numbers. Algorithms with polyno-
mial time bounds with larger exponents may become impossible to carry out
even for moderately large numbers. If the time bound is superpolynomial,
treating really large inputs is usually out of the question. From a theoretical
perspective, it has turned out to be useful to draw a line between computa-
tional problems that admit algorithms with a polynomial operation bound
and problems that do not have such algorithms, since for large enough n,
every polynomial bound will be smaller than every superpolynomial bound.
This is why the class P, to be discussed next, is of such prominent importance
in computational complexity theory.
1.3 Is PRIMES in P?
In order to formulate what exactly the question “Is PRIMES in P?” means,
we must sketch some concepts from computational complexity theory. Tra-
ditionally, the objects of study of complexity theory are “languages” and
“functions”. A nonempty finite set Σ is regarded as an alphabet, and one
considers the set Σ ∗ of all finite sequences or words over Σ. The most impor-
tant alphabet is the binary alphabet {0, 1}, where Σ ∗ comprises the words
ε (the empty word), 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, 100, 101, . . . .
Note that natural numbers can be represented as binary words, e.g., by means
of the binary representation: bin(n) denotes the binary representation of n.
Now decision problems for numbers can be expressed as sets of words over
{0, 1}, e.g.
SQUARE = {bin(n) | n ≥ 0 is a square}
= {0, 1, 100, 1001, 10000, 11001, 100100, 110001, 1000000, . . .}
codes the problem “Given n, decide whether n is a square of some number”,
while
PRIMES = {bin(n) | n is a prime number}
codes the primality problem.
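As an illustration, such languages can be modelled in Python (the names binrep, SQUARE, and PRIMES_small are ad hoc, and only a finite initial segment of each infinite language is built):

from math import isqrt

def binrep(n: int) -> str:
    """Binary representation bin(n) as a word over {0, 1}."""
    return format(n, "b")

def is_square(n: int) -> bool:
    return isqrt(n) ** 2 == n

def is_prime(n: int) -> bool:                # trial division, as in Algorithm 1.1.1
    return n >= 2 and all(n % i for i in range(2, isqrt(n) + 1))

SQUARE = {binrep(n) for n in range(50) if is_square(n)}
PRIMES_small = {binrep(n) for n in range(50) if is_prime(n)}
# binrep(9) = '1001' lies in SQUARE; binrep(7) = '111' lies in PRIMES_small.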
is in class P.
Thus, to establish that PRIMES is in P it is sufficient to find an algorithm
A for the primality problem that operates on (not too large) numbers with
a polynomial operation bound. The question of whether such an algorithm
might exist had been open ever since the terminology for asking the question
was developed in the 1960s.
most n1 . Algorithms with this kind of behavior are called primality proving
algorithms.
The algorithm of Adleman and Huang (A_AH) may be combined with, for
example, the Solovay-Strassen Test (A_SS) to obtain an error-free randomized
algorithm for the primality problem with expected polynomial time bound,
as follows: Given an input n, run both algorithms on n. If one of them gives
a definite answer (A_AH declares that n is a prime number or A_SS declares
that n is composite), we are done. Otherwise, keep repeating the procedure
until an answer is obtained. The expected number of repetitions is smaller
than 2 no matter whether n is prime or composite. The combined algorithm
gives the correct answer with probability 1, and the expected time bound is
polynomial in log n.
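In code, the combination looks like this (a sketch only; adleman_huang and solovay_strassen are hypothetical placeholders for implementations of A_AH and A_SS, each either giving a definite answer or reporting that it could not decide):

def certified_primality(n, adleman_huang, solovay_strassen):
    """Repeat both randomized tests until one of them gives a definite answer.
    Sketch of the combined error-free (Las Vegas) procedure described above."""
    while True:
        if adleman_huang(n) == "prime":          # A_AH proves primality
            return True
        if solovay_strassen(n) == "composite":   # A_SS proves compositeness
            return False
        # neither run was conclusive: repeat; expected number of rounds < 2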
There are further algorithms that provide proofs for the primality of an
input number n, many of them quite successful in practice. For much more
information on primality testing and primality proving algorithms see [16].
(A complete list of the known algorithms as of 2004 may be found in the
overview paper [11].)
Such was the state of affairs when in August 2002 M. Agrawal, N. Kayal, and
N. Saxena published their paper “PRIMES is in P”. In this paper, Agrawal,
Kayal, and Saxena described a deterministic algorithm for the primality prob-
lem, and a polynomial bound of c · (log n)^12 · (log log n)^d was proved for the
number of bit operations, for constants c and d.
In the time analysis of the algorithm, a deep result of Fouvry [19] from
analytical number theory was used, published in 1985. This result concerns
the density of primes of a special kind among the natural numbers. Unfor-
tunately, the proof of Fouvry’s theorem is accessible only to readers with a
quite strong background in number theory. In discussions following the pub-
lication of the new algorithm, some improvements were suggested. One of
these improvements (by H.W. Lenstra [10, 27]) leads to a slightly modified
algorithm with a new time analysis, which avoids the use of Fouvry’s the-
orem altogether, and makes it possible to carry out the time analysis and
correctness proof solely by basic methods from number theory and algebra.
The new analysis even yields an improved bound of c · (log n)^10.5 · (log log n)^d
on the number of bit operations. Employing Fouvry’s result one obtains the
even smaller bound c · (log n)^7.5 · (log log n)^d.
Experiments and number-theoretical conjectures make it seem likely that
the exponent in the complexity bound can be chosen even smaller, about 6
instead of 7.5. The reader may consult Table 1.1 to get an idea for num-
bers of which order of magnitude the algorithm is guaranteed to terminate
in reasonable time. Currently, improvements of the new algorithm are be-
ing investigated, and these may at some time make it competitive with the
primality proving algorithms currently in use. (See [11].)
Citing the title of a review of the result [13], with the improved and
simplified time analysis the algorithm by Agrawal, Kayal, and Saxena appears
even more a “Breakthrough for Everyman”: a result that can be explained
to interested high-school students, with a correctness proof and time analysis
that can be understood by everyone with a basic mathematical training as
acquired in the first year of studying mathematics or computer science. It
is the purpose of this text to describe this amazing and impressive result
in a self-contained manner, along with two randomized algorithms (Solovay-
Strassen and Miller-Rabin) to represent practically important primality tests.
The book covers just enough material from basic number theory and
elementary algebra to carry through the analysis of these algorithms, and so
frees the reader from collecting methods and facts from different sources.
is not easily solvable for n sufficiently large. An introduction into the subject
of factoring is given in, for example, [41]; an in-depth treatment may be
found in [16]. As an example, we mention one algorithm from the family of
the fastest known factorization algorithms, the “number field sieve”, which
has a superpolynomial running time bound of c · e^(d·(ln n)^(1/3)·(ln ln n)^(2/3)), for a
constant d a little smaller than 1.95 and some c > 0. Using algorithms like
this, one has been able to factor single numbers of more than 200 decimal
digits.
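For a feeling of the order of magnitude, one may evaluate the stated bound for a 200-digit number (purely illustrative, with c = 1 and d = 1.95):

from math import log, exp

digits = 200
ln_n = digits * log(10)                      # ln n for a 200-digit number
bound = exp(1.95 * ln_n ** (1 / 3) * log(ln_n) ** (2 / 3))
print(f"{bound:.1e}")                        # the formula gives roughly 10^22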
It should be noted that with respect to factoring (and to the security
of cryptosystems that are based on the supposed difficulty of factoring) no
change is to be expected as a consequence of the new primality test. This
algorithm shares with all other fast primality tests the property that if it
declares an input number n composite, in most cases it does so on the basis
of indirect evidence, having detected a property of n that prime numbers cannot
have. Such a property usually does not help in finding a proper factor of n.
Of course, the book may be read from cover to cover. In this way, the reader
is led on a guided tour through the basics of algorithms for numbers, of
number theory, and of algebra (including all the proofs), as far as they are
needed for the analysis of the three primality tests treated here.
Chapter 2 should be checked for algorithmic notation and basic algo-
rithms for numbers. Readers with some background in basic number the-
ory and/or algebra may want to read Sects. 3.1 through 3.5 and Sects. 4.1
through 4.3 only cursorily to make sure they are familiar with the (standard)
topics treated there. Section 3.6 on the density bounds for prime numbers
and Sect. 4.4 on the fact that in finite fields the multiplicative group is cyclic
are a little more special and provide essential building blocks of the analysis
of the new primality test by Agrawal, Kayal, and Saxena.
Chapters 5 and 6 treat the Miller-Rabin Test and the Solovay-Strassen
Test in a self-contained manner; a proof of the quadratic reciprocity law,
which is used for the time analysis of the latter algorithm, is provided in
Appendix A.3. These two chapters may be skipped by readers interested
exclusively in the deterministic primality test.
Chapter 7 treats polynomials, in particular polynomials over finite fields
and the technique of constructing finite fields by quotienting modulo an ir-
reducible polynomial. Some special properties of the polynomial X r − 1 are
developed there. All results compiled in this section are essential for the anal-
ysis of the deterministic primality test, which is given in Chap. 8.
Readers are invited to send information about mistakes, other suggestions
for improvements, or comments directly to the author’s email address:
[email protected]