
Quadratic Sieve

The quadratic sieve is an integer factorization algorithm that is faster than the general number field sieve for integers under about 100 decimal digits. It works by finding a congruence of squares modulo the integer n being factored. It does this in two phases: a data collection phase, where it finds numbers whose squares are smooth modulo n, and a data processing phase, where it uses linear algebra to combine the collected data into a congruence of squares. The algorithm speeds up the search by choosing values close to the square root of n, which makes the residues smaller and more likely to be smooth.

The quadratic sieve algorithm (QS) is an integer factorization algorithm and, in practice, the second fastest method known (after the general number field sieve). It is still the fastest for integers under 100 decimal digits or so, and is considerably simpler than the number field sieve. It is a general-purpose factorization algorithm, meaning that its running time depends solely on the size of the integer to be factored, and not on special structure or properties. It was invented by Carl Pomerance in 1981 as an improvement to Schroeppel's linear sieve.[1]

1 Basic aim

The algorithm attempts to set up a congruence of squares modulo n (the integer to be factorized), which often leads to a factorization of n. The algorithm works in two phases: the data collection phase, where it collects information that may lead to a congruence of squares; and the data processing phase, where it puts all the data it has collected into a matrix and solves it to obtain a congruence of squares. The data collection phase can be easily parallelized to many processors, but the data processing phase requires large amounts of memory, and is difficult to parallelize efficiently over many nodes or if the processing nodes do not each have enough memory to store the whole matrix. The block Wiedemann algorithm can be used in the case of a few systems each capable of holding the matrix.

The naive approach to finding a congruence of squares is to pick a random number, square it, and hope the least non-negative remainder modulo n is a perfect square (in the integers). For example, $80^2 \bmod 5959$ is 441, which is $21^2$. This approach finds a congruence of squares only rarely for large n, but when it does find one, more often than not, the congruence is nontrivial and the factorization is complete. This is roughly the basis of Fermat's factorization method.

The quadratic sieve is a modification of Dixon's factorization method.

The general running time required for the quadratic sieve (to factor an integer n) is

$$e^{(1+o(1))\sqrt{\ln n \,\ln\ln n}} = L_n[1/2, 1]$$

in the L-notation.[2] The constant e is the base of the natural logarithm.

2 The approach

Let x mod y denote the remainder after dividing x by y. To factorize the integer n, Fermat's method entails a search for a single number a such that $a^2 \bmod n$ is a square. But these a are hard to find. The quadratic sieve consists of computing $a^2 \bmod n$ for several a, then finding a subset of these whose product is a square. This will yield a congruence of squares.

For example, $41^2 \bmod 1649 = 32$, $42^2 \bmod 1649 = 115$, and $43^2 \bmod 1649 = 200$. None of these is a square, but the product $(32)(200) = 6400 = 80^2$, and mod 1649, $(32)(200) = (41^2)(43^2) = ((41)(43))^2$. Since $(41)(43) \bmod 1649 = 114$, this gives a congruence of squares: $114^2 \equiv 80^2 \pmod{1649}$. The factorization is then finished by computing $\gcd(114 - 80, 1649) = 17$, which gives $1649 = 17 \cdot 97$ (see Congruence of squares).

But how to solve the problem of, given a set of numbers, finding a subset whose product is a square? The solution uses the concept of an exponent vector. For example, the prime-power factorization of 504 is $2^3 3^2 5^0 7^1$. It can be represented by the exponent vector (3,2,0,1), which gives the exponents of 2, 3, 5, and 7 in the prime factorization. The number 490 would similarly have the vector (1,0,1,2). Multiplying the numbers is the same as componentwise adding their exponent vectors: (504)(490) has the vector (4,2,1,3).

A number is a square if and only if every entry in its exponent vector is even. For example, the vectors (3,0,0,1) and (1,2,0,1) add to (4,2,0,2), so (56)(126) is a square. Searching for a square requires knowledge only of the parity of the numbers in the vectors, so it is possible to reduce the entire vector mod 2 and perform addition of elements mod 2: (1,0,0,1) + (1,0,0,1) = (0,0,0,0). This is particularly efficient in practical implementations, as the vectors can be represented as bitsets and addition mod 2 reduces to bitwise XOR.

The problem is reduced to: given a set of (0,1)-vectors, find a subset which adds to the zero vector mod 2. This is a linear algebra problem; the solution is a linear dependency. It is a theorem of linear algebra that with more vectors than each vector has elements, such a dependency must exist. It can be found efficiently, for example by placing the vectors as rows in a matrix and then using Gaussian elimination, which is easily adapted to work for integers mod 2 instead of real numbers. The desired square is then the product of the numbers corresponding to those vectors.
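The exponent-vector idea can be tried out on the numbers used above. The following Python sketch packs each exponent vector mod 2 into an integer bitset and looks for a subset that XORs to zero by a simple incremental elimination. The function names `exponent_vector_mod2` and `find_dependency` are illustrative choices, not part of any library, and the elimination shown is only one of several ways to find a dependency.

```python
from math import isqrt, prod


def exponent_vector_mod2(m, primes):
    """Exponent vector of m over `primes`, reduced mod 2 and packed into an int."""
    bits = 0
    for i, p in enumerate(primes):
        while m % p == 0:
            m //= p
            bits ^= 1 << i  # flip the parity bit for each factor of p
    if m != 1:
        raise ValueError("number does not factor completely over the given primes")
    return bits


def find_dependency(vectors):
    """Return indices of a non-empty subset whose XOR is zero, or None."""
    basis = {}  # lowest set bit -> (reduced vector, bitset of contributing indices)
    for idx, vec in enumerate(vectors):
        combo = 1 << idx
        while vec:
            pivot = vec & -vec  # lowest set bit of the current vector
            if pivot not in basis:
                basis[pivot] = (vec, combo)
                break
            bvec, bcombo = basis[pivot]
            vec ^= bvec      # addition mod 2 is bitwise XOR
            combo ^= bcombo  # remember which inputs were combined
        else:
            # vec was reduced to zero: the inputs recorded in combo are dependent
            return [i for i in range(len(vectors)) if combo >> i & 1]
    return None


primes = [2, 3, 5, 7]
numbers = [504, 490, 56, 126]
vectors = [exponent_vector_mod2(m, primes) for m in numbers]
subset = find_dependency(vectors)
product = prod(numbers[i] for i in subset)
print(subset, product, isqrt(product) ** 2 == product)  # [0, 2] 28224 True
```

Here the first dependency found is (504)(56) = 28224 = 168², since 504 and 56 have the same parity vector (1,0,0,1). Production implementations use block Lanczos or block Wiedemann instead, because the matrices are far too large for this kind of elimination.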


However, simply squaring many random numbers mod n produces a very large number of different prime factors, and so very long vectors and a very large matrix. The answer is to look specifically for numbers a such that $a^2 \bmod n$ has only small prime factors (they are smooth numbers). They are harder to find, but using only smooth numbers keeps the vectors and matrices smaller and more tractable. The quadratic sieve searches for smooth numbers using a technique called sieving, discussed later, from which the algorithm takes its name.

3 The algorithm

To summarize, the basic quadratic sieve algorithm has these main steps:

1. Choose a smoothness bound B. The number π(B), denoting the number of prime numbers less than B, will control both the length of the vectors and the number of vectors needed.

2. Use sieving to locate π(B) + 1 numbers $a_i$ such that $b_i = (a_i^2 \bmod n)$ is B-smooth.

3. Factor the $b_i$ and generate exponent vectors mod 2 for each one.

4. Use linear algebra to find a subset of these vectors which add to the zero vector. Multiply the corresponding $a_i$ together and call the result, reduced mod n, a; multiply the corresponding $b_i$ together, which yields a B-smooth square $b^2$.

5. We are now left with the congruence $a^2 \equiv b^2 \pmod n$, from which we get two square roots of $(a^2 \bmod n)$: one by taking the square root of $b^2$ in the integers, namely b, and the other the a computed in step 4.

6. We now have the desired identity: $(a + b)(a - b) \equiv 0 \pmod n$. Compute the GCD of n with the difference (or sum) of a and b. This produces a factor, although it may be a trivial factor (n or 1). If the factor is trivial, try again with a different linear dependency or different a.

The remainder of this article explains details and extensions of this basic algorithm.

4 How QS optimizes finding congruences

The quadratic sieve attempts to find pairs of integers x and y(x) (where y(x) is a function of x) satisfying a much weaker condition than $x^2 \equiv y^2 \pmod n$. It selects a set of primes called the factor base, and attempts to find x such that the least absolute remainder of $y(x) = x^2 \bmod n$ factorizes completely over the factor base. Such values are said to be smooth with respect to the factor base.

The factorization of a value of y(x) that splits over the factor base, together with the value of x, is known as a relation. The quadratic sieve speeds up the process of finding relations by taking x close to the square root of n. This ensures that y(x) will be smaller, and thus have a greater chance of being smooth.

$$y(x) = \left(\lceil\sqrt{n}\rceil + x\right)^2 - n \quad \text{(where x is a small integer)}$$

$$y(x) \approx 2x\lceil\sqrt{n}\rceil$$

This implies that y is on the order of $2x\lceil\sqrt{n}\rceil$, far smaller than n when x is small. However, it also implies that y grows linearly with x times the square root of n.

Another way to increase the chance of smoothness is by simply increasing the size of the factor base. However, it is necessary to find at least one smooth relation more than the number of primes in the factor base, to ensure the existence of a linear dependency.

4.1 Partial relations and cycles

Even if for some relation y(x) is not smooth, it may be possible to merge two of these partial relations to form a full one, if the two y's are products of the same prime(s) outside the factor base. [Note that this is equivalent to extending the factor base.] For example, if the factor base is {2, 3, 5, 7} and n = 91, there are partial relations:

$$21^2 \equiv 7^1 \cdot 11 \pmod{91}$$
$$29^2 \equiv 2^1 \cdot 11 \pmod{91}$$

Multiply these together:

$$(21 \cdot 29)^2 \equiv 2^1 \cdot 7^1 \cdot 11^2 \pmod{91}$$

and multiply both sides by $(11^{-1})^2$ modulo 91. $11^{-1}$ modulo 91 is 58, so:

$$(58 \cdot 21 \cdot 29)^2 \equiv 2^1 \cdot 7^1 \pmod{91}$$
$$14^2 \equiv 2^1 \cdot 7^1 \pmod{91}$$

producing a full relation. Such a full relation (obtained by combining partial relations) is called a cycle. Sometimes, forming a cycle from two partial relations leads directly to a congruence of squares, but rarely.

4.2 Checking smoothness by sieving

There are several ways to check for smoothness of the ys. The most obvious is by trial division, although this increases the running time for the data collection phase.
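To make the steps of the basic algorithm concrete, the following Python sketch collects relations using plain trial division over the factor base (the obvious but slow smoothness check) instead of a sieve, keeps the exponent vectors mod 2 as bitsets, and tries a GCD each time a dependency appears. It is a toy under stated assumptions: the names `small_primes`, `factor_over_base` and `qs_factor`, and the default parameters `bound=30` and `max_x=10_000`, are chosen here for illustration and are only sensible for very small n.

```python
from math import gcd, isqrt


def small_primes(bound):
    """Primes below `bound`, found by naive trial division (fine for toy sizes)."""
    return [p for p in range(2, bound)
            if all(p % q for q in range(2, isqrt(p) + 1))]


def factor_over_base(m, base):
    """Trial-divide m over `base`; return its exponent list, or None if not smooth."""
    exps = []
    for p in base:
        e = 0
        while m % p == 0:
            m //= p
            e += 1
        exps.append(e)
    return exps if m == 1 else None


def qs_factor(n, bound=30, max_x=10_000):
    """Return a nontrivial factor of n, or None if none is found within max_x."""
    base = small_primes(bound)
    root = isqrt(n)
    if root * root == n:
        return root
    relations = []  # pairs (a, exponents of a*a - n over the base)
    pivots = {}     # lowest set bit -> (parity bitset, bitset of relation indices)
    for x in range(1, max_x):
        a = root + x  # take a close to sqrt(n) so that a*a - n stays small
        exps = factor_over_base(a * a - n, base)
        if exps is None:
            continue
        relations.append((a, exps))
        vec = sum((e & 1) << i for i, e in enumerate(exps))
        combo = 1 << (len(relations) - 1)
        while vec:
            low = vec & -vec
            if low not in pivots:
                pivots[low] = (vec, combo)
                break
            vec, combo = vec ^ pivots[low][0], combo ^ pivots[low][1]
        else:
            # The relations selected by combo multiply to a square: build a and b.
            a_prod, total = 1, [0] * len(base)
            for i, (ai, ei) in enumerate(relations):
                if combo >> i & 1:
                    a_prod = a_prod * ai % n
                    total = [t + e for t, e in zip(total, ei)]
            b = 1
            for p, t in zip(base, total):
                b = b * pow(p, t // 2, n) % n
            g = gcd(a_prod - b, n)
            if 1 < g < n:
                return g  # otherwise the factor was trivial; keep collecting
    return None


print(qs_factor(1649))  # 17
```

For n = 1649 the sketch finds the relations for a = 41 and a = 43, combines them into 114² ≡ 80² (mod 1649), and returns gcd(114 − 80, 1649) = 17. Replacing the trial division by a sieve changes only how the smooth values are located, not the rest of the procedure.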

Another method that has some acceptance is the elliptic curve method (ECM). In practice, a process called sieving is typically used. If f(x) is the polynomial $f(x) = x^2 - n$ we have

$$f(x) = x^2 - n$$
$$f(x + kp) = (x + kp)^2 - n$$
$$f(x + kp) = x^2 + 2xkp + (kp)^2 - n$$
$$f(x + kp) = f(x) + 2xkp + (kp)^2 \equiv f(x) \pmod p$$

Thus solving $f(x) \equiv 0 \pmod p$ for x generates a whole sequence of values y = f(x) which are divisible by p. This is finding a square root modulo a prime, for which there exist efficient algorithms, such as the Shanks–Tonelli algorithm. (This is where the quadratic sieve gets its name: y is a quadratic polynomial in x, and the sieving process works like the Sieve of Eratosthenes.)

The sieve starts by setting every entry in a large array A[] of bytes to zero. For each p, solve the quadratic equation mod p to get two roots α and β, and then add an approximation to log(p) to every entry for which $y(x) \equiv 0 \pmod p$, that is, A[kp + α] and A[kp + β]. It is also necessary to solve the quadratic equation modulo small powers of p in order to recognise numbers divisible by the square of a factor-base prime.

Once every prime in the factor base has been processed, any A[] containing a value above a threshold of roughly log(y(x)) will correspond to a value of y(x) which splits over the factor base. The information about exactly which primes divide y(x) has been lost, but it has only small factors, and there are many good algorithms (trial division by small primes, SQUFOF, Pollard rho, and ECM are usually used in some combination) for factoring a number known to have only small factors.

There are many y(x) values that work, so the factorization process at the end doesn't have to be entirely reliable; often the processes misbehave on say 5% of inputs, requiring a small amount of extra sieving.

5 Example of basic sieve

This example will demonstrate the standard quadratic sieve without logarithm optimizations or prime powers. Let the number to be factored be N = 15347, so the ceiling of the square root of N is 124. Since N is small, the basic polynomial is enough: $y(x) = (x + 124)^2 - 15347$.

5.1 Data collection

Since N is small, only 4 primes are necessary. The first 4 primes p for which 15347 has a square root mod p are 2, 17, 23, and 29 (in other words, 15347 is a quadratic residue modulo each of these primes). These primes will be the basis for sieving.

Now we construct our sieve $V_X$ of $Y(X) = (X + \lceil\sqrt{N}\rceil)^2 - N = (X + 124)^2 - 15347$ and begin the sieving process for each prime in the basis, choosing to sieve the first 0 ≤ X < 100 of Y(X):

$$V = [\,Y(0)\ \ Y(1)\ \ Y(2)\ \ Y(3)\ \ Y(4)\ \ Y(5)\ \cdots\ Y(99)\,]$$
$$\phantom{V} = [\,29\ \ 278\ \ 529\ \ 782\ \ 1037\ \ 1294\ \cdots\ 34382\,]$$

The next step is to perform the sieve. For each p in our factor base {2, 17, 23, 29} solve the equation

$$Y(X) \equiv (X + \lceil\sqrt{N}\rceil)^2 - N \equiv 0 \pmod p$$

to find the entries in the array V which are divisible by p.

For p = 2 solve $(X + 124)^2 - 15347 \equiv 0 \pmod 2$ to get the solution $X \equiv \sqrt{15347} - 124 \equiv 1 \pmod 2$.

Thus, starting at X = 1 and incrementing by 2, each entry will be divisible by 2. Dividing each of those entries by 2 yields

$$V = [\,29\ \ 139\ \ 529\ \ 391\ \ 1037\ \ 647\ \cdots\ 17191\,]$$

Similarly for the remaining primes p in {17, 23, 29} the equation $X \equiv \sqrt{15347} - 124 \pmod p$ is solved. Note that for every p > 2, there will be 2 resulting linear equations due to there being 2 modular square roots.

$$X \equiv \sqrt{15347} - 124 \equiv 8 - 124 \equiv 3 \pmod{17}$$
$$\phantom{X \equiv \sqrt{15347} - 124} \equiv 9 - 124 \equiv 4 \pmod{17}$$
$$X \equiv \sqrt{15347} - 124 \equiv 11 - 124 \equiv 2 \pmod{23}$$
$$\phantom{X \equiv \sqrt{15347} - 124} \equiv 12 - 124 \equiv 3 \pmod{23}$$
$$X \equiv \sqrt{15347} - 124 \equiv 8 - 124 \equiv 0 \pmod{29}$$
$$\phantom{X \equiv \sqrt{15347} - 124} \equiv 21 - 124 \equiv 13 \pmod{29}$$

Each equation $X \equiv a \pmod p$ results in $V_X$ being divisible by p at X = a and each pth value beyond that. Dividing V by p at a, a+p, a+2p, a+3p, etc., for each prime in the basis finds the smooth numbers which are products of unique primes (first powers).

$$V = [\,1\ \ 139\ \ 23\ \ 1\ \ 61\ \ 647\ \cdots\ 17191\,]$$

Any entry of V that equals 1 corresponds to a smooth number. Since $V_0$, $V_3$, and $V_{71}$ equal one, this corresponds to:

X     X + 124    Y(X)     factorization
0     124        29       $2^0 \cdot 17^0 \cdot 23^0 \cdot 29^1$
3     127        782      $2^1 \cdot 17^1 \cdot 23^1 \cdot 29^0$
71    195        22678    $2^1 \cdot 17^1 \cdot 23^1 \cdot 29^1$

5.2 Matrix Processing

Since smooth numbers Y have been found with the property $Y \equiv Z^2 \pmod N$, the remainder of the algorithm follows equivalently to any other variation of Dixon's factorization method.
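The data collection of this example is small enough to replay directly. The Python sketch below follows the description in the text: it fills V with Y(X) for 0 ≤ X < 100, divides once by each factor-base prime at the positions given by the roots of Y(X) ≡ 0 (mod p), and reports the entries that end up equal to 1. One simplification is an assumption made for brevity: the roots are found by brute force over 0..p−1 rather than by a modular square-root algorithm such as Shanks–Tonelli. The variable names are illustrative.

```python
from math import isqrt

N = 15347
root = isqrt(N - 1) + 1          # ceiling of sqrt(N), here 124
factor_base = [2, 17, 23, 29]
interval = 100                   # sieve over 0 <= X < 100


def Y(X):
    return (X + root) ** 2 - N


V = [Y(X) for X in range(interval)]
for p in factor_base:
    # Roots of Y(X) ≡ 0 (mod p), found by brute force for this tiny example.
    roots = [r for r in range(p) if Y(r) % p == 0]
    for r in roots:
        for X in range(r, interval, p):
            V[X] //= p           # divide once only (first powers), as in the text

smooth_X = [X for X, v in enumerate(V) if v == 1]
relations = [(X + root, Y(X)) for X in smooth_X]
print(V[:6])       # [1, 139, 23, 1, 61, 647]
print(relations)   # [(124, 29), (127, 782), (195, 22678)]
```

Because each prime is divided out only once, Y(2) = 529 = 23² is left as 23 and is not reported, even though it is smooth; recognising such values is what sieving with prime powers is for.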

Writing the exponents of the product of a subset of the equations

$$29 = 2^0 \cdot 17^0 \cdot 23^0 \cdot 29^1$$
$$782 = 2^1 \cdot 17^1 \cdot 23^1 \cdot 29^0$$
$$22678 = 2^1 \cdot 17^1 \cdot 23^1 \cdot 29^1$$

as a matrix (mod 2) yields:

$$S \cdot \begin{bmatrix} 0 & 0 & 0 & 1 \\ 1 & 1 & 1 & 0 \\ 1 & 1 & 1 & 1 \end{bmatrix} \equiv \begin{bmatrix} 0 & 0 & 0 & 0 \end{bmatrix} \pmod 2$$

A solution to the equation is given by the left null space, simply

$$S = \begin{bmatrix} 1 & 1 & 1 \end{bmatrix}$$

Thus the product of all 3 equations yields a square (mod N).

$$29 \cdot 782 \cdot 22678 = 22678^2$$

and

$$124^2 \cdot 127^2 \cdot 195^2 = 3070860^2$$

So the algorithm found

$$22678^2 \equiv 3070860^2 \pmod{15347}$$

Testing the result yields GCD(3070860 − 22678, 15347) = 103, a nontrivial factor of 15347, the other being 149.

This demonstration should also serve to show that the quadratic sieve is only appropriate when n is large. For a number as small as 15347, this algorithm is overkill. Trial division or Pollard rho could have found a factor with much less computation.

6 Multiple polynomials

In practice, many different polynomials are used for y, since a single polynomial will not typically provide enough (x, y) pairs that are smooth over the factor base. The polynomials used must have a special form, since they need to be squares modulo n. The polynomials must all have a similar form to the original $y(x) = x^2 - n$:

$$y(x) = (Ax + B)^2 - n, \qquad A, B \in \mathbb{Z}$$

Assuming $B^2 - n$ is a multiple of A, so that $B^2 - n = AC$, the polynomial y(x) can be written as $y(x) = A \cdot (Ax^2 + 2Bx + C)$. If A is then a square, only the factor $(Ax^2 + 2Bx + C)$ has to be considered.

This approach (called MPQS, Multiple Polynomial Quadratic Sieve) is ideally suited for parallelization, since each processor involved in the factorization can be given n, the factor base and a collection of polynomials, and it will have no need to communicate with the central processor until it is finished with its polynomials.

7 Large primes

7.1 One large prime

If, after dividing by all the factors less than A, the remaining part of the number (the cofactor) is less than $A^2$, then this cofactor must be prime. (Here A denotes the bound on the factor-base primes, not the polynomial coefficient of the previous section.) In effect, it can be added to the factor base, by sorting the list of relations into order by cofactor. If y(a) = 7·11·23·137 and y(b) = 3·5·7·137, then $y(a)y(b) = 3 \cdot 5 \cdot 11 \cdot 23 \cdot 7^2 \cdot 137^2$. This works by reducing the threshold of entries in the sieving array above which a full factorization is performed.

7.2 More large primes

Reducing the threshold even further, and using an effective process for factoring y(x) values into products of even relatively large primes (ECM is superb for this), can find relations with most of their factors in the factor base, but with two or even three larger primes. Cycle finding then allows combining a set of relations sharing several primes into a single relation.

8 Parameters from realistic example

To illustrate typical parameter choices for a realistic example on a real implementation including the multiple polynomial and large prime optimizations, the tool msieve was run on a 267-bit semiprime, producing the following parameters:

• Trial factoring cutoff: 27 bits
• Sieve interval (per polynomial): 393216 (12 blocks of size 32768)
• Smoothness bound: 1300967 (50294 primes)

• Number of factors for polynomial A coefficients: 10 (see Multiple polynomials above)
• Large prime bound: 128795733 (26 bits) (see Large primes above)
• Smooth values found: 25952 by sieving directly, 24462 by combining numbers with large primes
• Final matrix size: 50294 × 50414, reduced by filtering to 35750 × 35862
• Nontrivial dependencies found: 15
• Total time (on a 1.6 GHz UltraSparc III): 35 min 39 seconds
• Maximum memory used: 8 MB

9 Factoring records

Until the discovery of the number field sieve (NFS), QS was the asymptotically fastest known general-purpose factoring algorithm. Now, Lenstra elliptic curve factorization has the same asymptotic running time as QS (in the case where n has exactly two prime factors of equal size), but in practice, QS is faster since it uses single-precision operations instead of the multi-precision operations used by the elliptic curve method.

On April 2, 1994, the factorization of RSA-129 was completed using QS. It was a 129-digit number, the product of two large primes, one of 64 digits and the other of 65. The factor base for this factorization contained 524339 primes. The data collection phase took 5000 MIPS-years, done in distributed fashion over the Internet. The data collected totaled 2 GB. The data processing phase took 45 hours on Bellcore's (now Telcordia Technologies) MasPar (massively parallel) supercomputer. This was the largest published factorization by a general-purpose algorithm, until NFS was used to factor RSA-130, completed April 10, 1996. All RSA numbers factored since then have been factored using NFS.

The current QS record is a 135-digit cofactor of $2^{803} - 2^{402} + 1$, itself an Aurifeuillian factor of $2^{1606} + 1$, which was split into 66-digit and 69-digit prime factors in 2001.

10 Implementations

• PPMPQS and PPSIQS
• mpqs
• SIMPQS is a fast implementation of the self-initialising multiple polynomial quadratic sieve written by William Hart. It provides support for the large prime variant and uses Jason Papadopoulos' block Lanczos code for the linear algebra stage. SIMPQS is accessible as the qsieve command in the SAGE computer algebra package or can be downloaded in source form. SIMPQS is optimized for use on Athlon and Opteron machines, but will operate on most common 32 and 64 bit architectures. It is written entirely in C.
• A factoring applet by Dario Alpern, which uses the quadratic sieve if certain conditions are met.
• The PARI/GP computer algebra package includes an implementation of the self-initialising multiple polynomial quadratic sieve implementing the large prime variant. It was adapted by Thomas Papanikolaou and Xavier Roblot from a sieve written for the LiDIA project. The self initialisation scheme is based on an idea from the thesis of Thomas Sosnowski.
• A variant of the quadratic sieve is available in the MAGMA computer algebra package. It is based on an implementation of Arjen Lenstra from 1995, used in his "factoring by email" program.
• msieve, an implementation of the multiple polynomial quadratic sieve with support for single and double large primes, written by Jason Papadopoulos. Source code and a Windows binary are available.
• YAFU, written by Ben Buhrow, is similar to msieve but is faster for most modern processors. It uses Jason Papadopoulos' block Lanczos code. Source code and binaries for Windows and Linux are available.
• Ariel, a simple Java implementation of the quadratic sieve for didactic purposes.

11 See also

• Lenstra elliptic curve factorization
• primality test

12 References

[1] Carl Pomerance, Analysis and Comparison of Some Integer Factoring Algorithms, in Computational Methods in Number Theory, Part I, H.W. Lenstra, Jr. and R. Tijdeman, eds., Math. Centre Tract 154, Amsterdam, 1982, pp. 89–139.
[2] Pomerance, Carl (December 1996). "A Tale of Two Sieves". Notices of the AMS 43 (12). pp. 1473–1485.

• Richard Crandall and Carl Pomerance (2001). Prime Numbers: A Computational Perspective (1st ed.). Springer. ISBN 0-387-94777-9. Section 6.1: The quadratic sieve factorization method, pp. 227–244.

13 Other external links


• Reference paper from University of Illinois at
Urbana-Champaign

14 Text and image sources, contributors, and licenses


14.1 Text
• Quadratic sieve Source: http://en.wikipedia.org/wiki/Quadratic_sieve?oldid=620027069 Contributors: Michael Hardy, Ixfd64, Timwi,
Dcoetzee, Ww, Zoicon5, Itai, Jaredwf, Lupo, Decrypt3, Giftlite, Alexf, Chbarts, MoraSique, Paul F., The JPS, Arabani, MZMcBride,
Fivemack, YurikBot, Bovineone, Alpertron, Amcfreely, Woscafrench, Robost, That Guy, From That Show!, Wwkk, SmackBot, Fulldecent,
Chris the speller, TimBentley, Octahedron80, Tompsci, Saxbryn, Jafet, Fsswsb, CRGreathouse, Ntsimp, Gremagor, MC10, Thijs!bot,
Schneau, Philippe, Blastwave, Kilrothi, Reedy Bot, Derlay, TXiKiBoT, Optimisteo, Jimbo Grilles, Beej175560, Skippydo, Nusumareta,
Billycorganisbald, Sun Creator, XLinkBot, Addbot, Btelcorb, LaaknorBot, West.andrew.g, DigitalSorcerer, Luckas-bot, Yobot, Citation
bot, Groovenstein, Raulshc, Antares5245, Citation bot 1, 10metreh, Thinking of England, ZéroBot, John Cline, Scott contini, Tw1s7y and
Anonymous: 43


14.3 Content license


• Creative Commons Attribution-Share Alike 3.0
