Quadratic Sieve
n produces a very large number of different prime factors, and so very long vectors and a very large matrix. The answer is to look specifically for numbers a such that $a^2 \bmod n$ has only small prime factors (they are smooth numbers). They are harder to find, but using only smooth numbers keeps the vectors and matrices smaller and more tractable. The quadratic sieve searches for smooth numbers using a technique called sieving, discussed later, from which the algorithm takes its name.

3 The algorithm

To summarize, the basic quadratic sieve algorithm has these main steps:

1. Choose a smoothness bound B. The number π(B), denoting the number of prime numbers less than B, will control both the length of the vectors and the number of vectors needed.
2. Use sieving to locate π(B) + 1 numbers $a_i$ such that $b_i = (a_i^2 \bmod n)$ is B-smooth.
3. Factor the $b_i$ and generate exponent vectors mod 2 for each one.
4. Use linear algebra to find a subset of these vectors which add to the zero vector. Multiply the corresponding $a_i$ together, naming the result mod n: a, and multiply the $b_i$ together, which yields a B-smooth square $b^2$.

4 How QS optimizes finding congruences

The factorization of a value of y(x) that splits over the factor base, together with the value of x, is known as a relation. The quadratic sieve speeds up the process of finding relations by taking x close to the square root of n. This ensures that y(x) will be smaller, and thus have a greater chance of being smooth.

$$y(x) = \left(\lceil\sqrt{n}\rceil + x\right)^2 - n \quad \text{(where } x \text{ is a small integer)}$$

$$y(x) \approx 2x\lceil\sqrt{n}\rceil$$

The approximation follows from expanding the square: $y(x) = x^2 + 2x\lceil\sqrt{n}\rceil + (\lceil\sqrt{n}\rceil^2 - n)$, where the last term is smaller than $2\lceil\sqrt{n}\rceil$ and $x^2$ is negligible for small x. This implies that y is on the order of $2x\lceil\sqrt{n}\rceil$. However, it also implies that y grows linearly with x times the square root of n.

Another way to increase the chance of smoothness is by simply increasing the size of the factor base. However, it is necessary to find at least one smooth relation more than the number of primes in the factor base, to ensure the existence of a linear dependency.

4.1 Partial relations and cycles

Even if for some relation y(x) is not smooth, it may be possible to merge two of these partial relations to form a full one, if the two y's are products of the same prime(s) outside the factor base. [Note that this is equivalent to extending the factor base.] For example, if the factor base is {2, 3, 5, 7} and n = 91, there are partial relations:
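As an illustration of merging partial relations with factor base {2, 3, 5, 7} and n = 91, the following minimal Python sketch searches for values a whose $a^2 \bmod n$ is smooth apart from one leftover factor, then combines two that share that factor. The function names (`split_over_base`, `find_partials`, `merge_partials`) and the search range are hypothetical choices made for this sketch, and the pairs it finds are computed here rather than taken from the text. It assumes the leftover factor is a single prime (true for n = 91, since 11² > 91) and needs Python 3.8+ for the modular inverse via `pow`.

```python
from math import gcd

def split_over_base(value, base):
    """Divide out factor-base primes; return (exponents, leftover cofactor)."""
    exps = {}
    for p in base:
        while value % p == 0:
            value //= p
            exps[p] = exps.get(p, 0) + 1
    return exps, value

def find_partials(n, base, candidates):
    """Group partial relations a^2 = (smooth part) * leftover (mod n) by leftover."""
    by_leftover = {}
    for a in candidates:
        b = a * a % n
        if b == 0:
            continue
        exps, rest = split_over_base(b, base)
        # Keep only relations with a leftover factor that is invertible mod n.
        if rest > 1 and gcd(rest, n) == 1:
            by_leftover.setdefault(rest, []).append((a, exps))
    return by_leftover

def merge_partials(n, leftover, rel1, rel2):
    """Combine two partials sharing `leftover` into one full relation."""
    (a1, e1), (a2, e2) = rel1, rel2
    inv = pow(leftover, -1, n)      # modular inverse of the shared factor
    a = a1 * a2 * inv % n           # the leftover^2 cancels on the right side
    exps = dict(e1)
    for p, e in e2.items():
        exps[p] = exps.get(p, 0) + e
    return a, exps

n, base = 91, [2, 3, 5, 7]
partials = find_partials(n, base, range(10, 46))
first, second = partials[11][:2]    # two partials sharing the leftover 11
a, exps = merge_partials(n, 11, first, second)
print(first, second, a, exps)
```

The merged value a satisfies a² ≡ ∏ p^e (mod n) over the factor base alone, so it can be used like any full relation in the linear algebra step.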
Another method that has some acceptance is the elliptic curve method (ECM). In practice, a process called sieving is typically used. If f(x) is the polynomial $f(x) = x^2 - n$ we have

$$f(x) = x^2 - n$$
$$f(x + kp) = (x + kp)^2 - n$$
$$f(x + kp) = x^2 + 2xkp + (kp)^2 - n$$
$$f(x + kp) = f(x) + 2xkp + (kp)^2 \equiv f(x) \pmod{p}$$

Thus solving f(x) ≡ 0 (mod p) for x generates a whole sequence of values y = f(x) which are divisible by p. This is finding a square root modulo a prime, for which there exist efficient algorithms, such as the Shanks–Tonelli algorithm. (This is where the quadratic sieve gets its name: y is a quadratic polynomial in x, and the sieving process works like the Sieve of Eratosthenes.)

The sieve starts by setting every entry in a large array A[] of bytes to zero. For each p, solve the quadratic equation mod p to get two roots α and β, and then add an approximation to log(p) to every entry for which y(x) = 0 mod p ... that is, A[kp + α] and A[kp + β]. It is also necessary to solve the quadratic equation modulo small powers of p in order to recognise numbers divisible by the square of a factor-base prime.

At the end of the factor base, any A[] containing a value above a threshold of roughly log(n) will correspond to a value of y(x) which splits over the factor base. The information about exactly which primes divide y(x) has been lost, but it has only small factors, and there are many good algorithms (trial division by small primes, SQUFOF, Pollard rho, and ECM are usually used in some combination) for factoring a number known to have only small factors.

There are many y(x) values that work, so the factorization process at the end doesn't have to be entirely reliable; often the processes misbehave on say 5% of inputs, requiring a small amount of extra sieving.

5 Example of basic sieve

This example will demonstrate standard quadratic sieve without logarithm optimizations or prime powers. Let the number to be factored N = 15347, therefore the ceiling of the square root of N is 124. Since N is small, the basic polynomial is enough: $y(x) = (x + 124)^2 - 15347$.

residue modulo each of these primes). These primes will be the basis for sieving.

Now we construct our sieve $V_X$ of $Y(X) = (X + \lceil\sqrt{N}\rceil)^2 - N = (X + 124)^2 - 15347$ and begin the sieving process for each prime in the basis, choosing to sieve the first 0 ≤ X < 100 of Y(X):

$$V = \begin{bmatrix} Y(0) & Y(1) & Y(2) & Y(3) & Y(4) & Y(5) & \cdots & Y(99) \end{bmatrix}$$
$$= \begin{bmatrix} 29 & 278 & 529 & 782 & 1037 & 1294 & \cdots & 34382 \end{bmatrix}$$

The next step is to perform the sieve. For each p in our factor base {2, 17, 23, 29} solve the equation

$$Y(X) \equiv (X + \lceil\sqrt{N}\rceil)^2 - N \equiv 0 \pmod{p}$$

to find the entries in the array V which are divisible by p.

For p = 2 solve $(X + 124)^2 - 15347 \equiv 0 \pmod{2}$ to get the solution $X \equiv \sqrt{15347} - 124 \equiv 1 \pmod{2}$.

Thus, starting at X=1 and incrementing by 2, each entry will be divisible by 2. Dividing each of those entries by 2 yields

$$V = \begin{bmatrix} 29 & 139 & 529 & 391 & 1037 & 647 & \cdots & 17191 \end{bmatrix}$$

Similarly for the remaining primes p in {17, 23, 29} the equation $X \equiv \sqrt{15347} - 124 \pmod{p}$ is solved. Note that for every p > 2, there will be 2 resulting linear equations due to there being 2 modular square roots.

$$X \equiv \sqrt{15347} - 124 \equiv 8 - 124 \equiv 3 \pmod{17}$$
$$\equiv 9 - 124 \equiv 4 \pmod{17}$$
$$X \equiv \sqrt{15347} - 124 \equiv 11 - 124 \equiv 2 \pmod{23}$$
$$\equiv 12 - 124 \equiv 3 \pmod{23}$$
$$X \equiv \sqrt{15347} - 124 \equiv 8 - 124 \equiv 0 \pmod{29}$$
$$\equiv 21 - 124 \equiv 13 \pmod{29}$$

Each equation X ≡ a (mod p) results in $V_x$ being divisible by p at x=a and each pth value beyond that. Dividing V by p at a, a+p, a+2p, a+3p, etc., for each prime in the basis finds the smooth numbers which are products of unique primes (first powers).

$$V = \begin{bmatrix} 1 & 139 & 23 & 1 & 61 & 647 & \cdots & 17191 \end{bmatrix}$$

Any entry of V that equals 1 corresponds to a smooth number. Since $V_0$, $V_3$, and $V_{71}$ equal one, this corresponds to:
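The sieve in this example can be reproduced with a short Python sketch, which lists each smooth X together with X + 124 and Y(X). This is a sketch under stated assumptions, not a practical implementation: the roots of Y(X) ≡ 0 (mod p) are found by brute force over 0 ≤ X < p as a stand-in for a modular square root algorithm such as Shanks–Tonelli, and the variable names are chosen here for illustration. As in the example, each prime is divided out only once per entry (first powers only).

```python
from math import isqrt

N = 15347
BASE = [2, 17, 23, 29]   # factor base used in the example
M = 100                  # sieve interval 0 <= X < 100

root = isqrt(N)
if root * root < N:
    root += 1            # ceiling of sqrt(N); 124 for this N

def Y(x):
    return (x + root) ** 2 - N

V = [Y(x) for x in range(M)]

for p in BASE:
    # Brute-force the roots of Y(X) = 0 (mod p); an assumption of this sketch,
    # standing in for a modular square root computation.
    starts = [x for x in range(p) if Y(x) % p == 0]
    for s in starts:
        # Every p-th entry from a root onward is divisible by p.
        for x in range(s, M, p):
            V[x] //= p

# Entries reduced to 1 are smooth over the factor base (first powers only).
smooth = [x for x in range(M) if V[x] == 1]
relations = [(x, x + root, Y(x)) for x in smooth]
for x, a, y in relations:
    print(f"X={x}: {a}^2 - {N} = {y}")
```

Because only first powers are divided out, an entry such as Y(2) = 529 = 23² is reduced to 23 rather than 1, matching the final array shown above.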
• Number of factors for polynomial A coefficients: 10 (see Multiple polynomials above)
• Large prime bound: 128795733 (26 bits) (see Large primes above)
• Smooth values found: 25952 by sieving directly, 24462 by combining numbers with large primes
• Final matrix size: 50294 × 50414, reduced by filtering to 35750 × 35862
• Nontrivial dependencies found: 15
• Total time (on a 1.6 GHz UltraSparc III): 35 min 39 seconds
• Maximum memory used: 8 MB

9 Factoring records

Until the discovery of the number field sieve (NFS), QS was the asymptotically fastest known general-purpose factoring algorithm. Now, Lenstra elliptic curve factorization has the same asymptotic running time as QS (in the case where n has exactly two prime factors of equal size), but in practice, QS is faster since it uses single-precision operations instead of the multi-precision operations used by the elliptic curve method.

On April 2, 1994, the factorization of RSA-129 was completed using QS. It was a 129-digit number, the product of two large primes, one of 64 digits and the other of 65. The factor base for this factorization contained 524339 primes. The data collection phase took 5000 MIPS-years, done in distributed fashion over the Internet. The data collected totaled 2 GB. The data processing phase took 45 hours on Bellcore's (now Telcordia Technologies) MasPar (massively parallel) supercomputer. This was the largest published factorization by a general-purpose algorithm, until NFS was used to factor RSA-130, completed April 10, 1996. All RSA numbers factored since then have been factored using NFS.

The current QS record is a 135-digit cofactor of $2^{803} - 2^{402} + 1$, itself an Aurifeuillian factor of $2^{1606} + 1$, which was split into 66-digit and 69-digit prime factors in 2001.

10 Implementations

• PPMPQS and PPSIQS
• mpqs
• SIMPQS is a fast implementation of the self-initialising multiple polynomial quadratic sieve written by William Hart. It provides support for the large prime variant and uses Jason Papadopoulos’ block Lanczos code for the linear algebra stage. SIMPQS is accessible as the qsieve command in the SAGE computer algebra package or can be downloaded in source form. SIMPQS is optimized for use on Athlon and Opteron machines, but will operate on most common 32 and 64 bit architectures. It is written entirely in C.
• a factoring applet by Dario Alpern, that uses the quadratic sieve if certain conditions are met.
• The PARI/GP computer algebra package includes an implementation of the self-initialising multiple polynomial quadratic sieve implementing the large prime variant. It was adapted by Thomas Papanikolaou and Xavier Roblot from a sieve written for the LiDIA project. The self initialisation scheme is based on an idea from the thesis of Thomas Sosnowski.
• A variant of the quadratic sieve is available in the MAGMA computer algebra package. It is based on an implementation of Arjen Lenstra from 1995, used in his “factoring by email” program.
• msieve, an implementation of the multiple polynomial quadratic sieve with support for single and double large primes, written by Jason Papadopoulos. Source code and a Windows binary are available.
• YAFU, written by Ben Buhrow, is similar to msieve but is faster for most modern processors. It uses Jason Papadopoulos’ block Lanczos code. Source code and binaries for Windows and Linux are available.
• Ariel, a simple Java implementation of the quadratic sieve for didactic purposes.

11 See also

• Lenstra elliptic curve factorization
• primality test

12 References

[1] Carl Pomerance, Analysis and Comparison of Some Integer Factoring Algorithms, in Computational Methods in Number Theory, Part I, H.W. Lenstra, Jr. and R. Tijdeman, eds., Math. Centre Tract 154, Amsterdam, 1982, pp 89-139.
[2] Pomerance, Carl (December 1996). “A Tale of Two Sieves”. Notices of the AMS 43 (12). pp. 1473–1485.
• Richard Crandall and Carl Pomerance (2001). Prime Numbers: A Computational Perspective (1st ed.). Springer. ISBN 0-387-94777-9. Section 6.1: The quadratic sieve factorization method, pp. 227–244.
13 Other external links
14.2 Images