A Multi-Level Blocking Distinct-Degree Factorization Algorithm

This document summarizes a new algorithm for factoring polynomials over GF(2) called the multi-level blocking distinct-degree factorization algorithm. The algorithm speeds up previous algorithms by replacing multiplications with faster squaring operations. As an application, the authors give a fast algorithm to search for all irreducible trinomials of degree r over GF(2). Under reasonable assumptions, the new algorithm has complexity O(r^2(log r)^{3/2}(log log r)^{1/2}) to search all trinomials of degree r, providing a speedup of over 560 times compared to the naive algorithm when searching trinomials of degree 24036583.

Contemporary Mathematics

A Multi-level Blocking Distinct-degree Factorization Algorithm

Richard P. Brent and Paul Zimmermann

Abstract. We give a new algorithm for performing the distinct-degree factorization of a polynomial $P(x)$ over GF(2), using a multi-level blocking strategy. The coarsest level of blocking replaces GCD computations by multiplications, as suggested by Pollard (1975), von zur Gathen and Shoup (1992), and others. The novelty of our approach is that a finer level of blocking replaces multiplications by squarings, which speeds up the computation in $\mathrm{GF}(2)[x]/P(x)$ of certain interval polynomials when $P(x)$ is sparse.
As an application we give a fast algorithm to search for all irreducible trinomials $x^r + x^s + 1$ of degree $r$ over GF(2), while producing a certificate that can be checked in less time than the full search. Naive algorithms cost $O(r^2)$ per trinomial, thus $O(r^3)$ to search over all trinomials of given degree $r$. Under a plausible assumption about the distribution of factors of trinomials, the new algorithm has complexity $O(r^2 (\log r)^{3/2} (\log\log r)^{1/2})$ for the search over all trinomials of degree $r$. Our implementation achieves a speedup of greater than a factor of 560 over the naive algorithm in the case $r = 24036583$ (a Mersenne exponent).
Using our program, we have found two new primitive trinomials of degree 24036583 over GF(2) (the previous record degree was 6972593).

1. Introduction
The problem of factoring a univariate polynomial $P(x)$ over a finite field $F$ often arises in computational algebra [7, 11, 12]. An important case is when $F$ has small characteristic and $P(x)$ has high degree but is sparse, that is, $P(x)$ has only a small number of nonzero terms.
To simplify the exposition we restrict attention to the case where $F = \mathrm{GF}(2)$ and $P(x)$ is a trinomial
$$P(x) = x^r + x^s + 1, \quad r > s > 0,$$
although the ideas apply more generally and should be useful for factoring sparse polynomials over fields of small characteristic.

1991 Mathematics Subject Classification. Primary 11B83, 11Y05, 11Y16; Secondary 11-04,
11K31, 11N35, 11R09, 11T06, 11Y55, 12-04, 68Q25 .
Key words and phrases. Amortized complexity, distinct-degree factorization, finite field,
irreducible trinomial, Mersenne exponent, polynomial factorization, primitive trinomial.

© 2008 the authors. rpb230


Our aim is to give an algorithm with good amortized complexity, that is, one
that works well on average. Since we are restricting attention to trinomials, we
average over all trinomials of fixed degree r.
Our motivation is to speed up previous algorithms for searching for irreducible trinomials of high degree [5, 6, 13, 14]. For given degree $r$, we want to find all irreducible trinomials $x^r + x^s + 1$.
In our examples the degree $r$ is a Mersenne exponent, i.e., $2^r - 1$ is a Mersenne prime. In this case an irreducible trinomial of degree $r$ is necessarily primitive. In general, without the restriction to Mersenne exponents, we would need the prime factorisation of $2^r - 1$ in order to test primitivity (see e.g., [10]).
We are only interested in Mersenne exponents $r = \pm 1 \bmod 8$, because in other cases Swan’s theorem [15, 21, 22] rules out irreducible trinomials of degree $r$ (except for $s = 2$ or $r - 2$, but these cases are usually easy to handle: for example if $r = 13466917$ or $20996011$ we have $r = 1 \bmod 3$, so $x^r + x^2 + 1$ is divisible by $x^2 + x + 1$).
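As a quick sanity check of the two conditions above, the following minimal Python sketch filters the exponents mentioned in this section by $r = \pm 1 \bmod 8$ and tests divisibility by $x^2 + x + 1$ using only residues mod 3 (valid because $x^3 = 1$ modulo $x^2 + x + 1$). The function names are illustrative assumptions and are not taken from the authors' program.

```python
def passes_swan_condition(r):
    """True if r = +1 or -1 (mod 8), the case not excluded by Swan's theorem."""
    return r % 8 in (1, 7)


def divisible_by_x2_x_1(r, s):
    """True if x^r + x^s + 1 is divisible by x^2 + x + 1 over GF(2).

    Modulo x^2 + x + 1 we have x^3 = 1, so only r mod 3 and s mod 3 matter,
    and x^a + x^b + 1 vanishes exactly when {a, b} = {1, 2}.
    """
    return {r % 3, s % 3} == {1, 2}


# Mersenne exponents quoted in the text (illustrative list, not exhaustive).
exponents = [6972593, 13466917, 20996011, 24036583, 25964951, 30402457, 32582657]
candidates = [r for r in exponents if passes_swan_condition(r)]
```

Under these assumptions `candidates` keeps the five exponents listed in the next paragraph, and the two excluded exponents satisfy `divisible_by_x2_x_1(r, 2)`.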
Mersenne exponents can be found on the GIMPS website [23]. At the time of writing, the five largest known Mersenne exponents $r$ satisfying the condition $r = \pm 1 \bmod 8$ are $r = 6972593$, $24036583$, $25964951$, $30402457$ and $32582657$. In the smallest case $r = 6972593$, a primitive trinomial was found by Brent, Larvala and Zimmermann [6] using an efficient implementation of the naive algorithm. However, it was not feasible to consider the larger Mersenne exponents $r$ using the same algorithm, since the time complexity of this algorithm is roughly of order $r^3$, and the next case $r = 24036583$ would take about 41 times longer than $r = 6972593$. With the new “fast” algorithm described in this paper we have been able to find two primitive trinomials of degree $r = 24036583$ in less time than the naive algorithm took for $r = 6972593$. The speedup over the naive algorithm for $r = 24036583$ is about a factor of 560.
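The factor of 41 can be reproduced from the cubic cost model alone. The short Python sketch below assumes that the naive running time is exactly proportional to $r^3$ (the text only says "roughly of order $r^3$"), so the numbers are indicative rather than measured; the variable names are illustrative.

```python
r_old, r_new = 6972593, 24036583

# Cost model (assumption): naive search time is proportional to r^3.
naive_ratio = (r_new / r_old) ** 3

# Time of the new algorithm at r_new, in units of the naive time at r_old,
# using the quoted speedup of about 560.
relative_time = naive_ratio / 560
```

Under this model `naive_ratio` is close to 41 and `relative_time` is well below 1, which is consistent with the claim that the new search finished in less time than the naive search for the smaller degree.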
If $x^r + x^s + 1$ is reducible then we want to provide an easily-checked certificate of reducibility. The certificate can simply be an encoding of an irreducible factor $f$ of $x^r + x^s + 1$. We choose the factor $f$ of smallest degree $d > 0$. In case there are several factors of equal smallest degree $d$, we give the one that is least in lexicographic order, e.g., $x^3 + x + 1$ is preferred to $x^3 + x^2 + 1$.
1.1. Distinct-degree factorization. Factorization of polynomials over finite fields typically proceeds in three stages: square-free factorization, distinct-degree factorization, and equal-degree factorization. The most time-consuming stage, and the one that we consider in this paper, is distinct-degree factorization [8, 10, 11].
The program described in §4.3 performs equal-degree factorization when it is necessary to split a product of equal-degree factors in order to give the unique certificate described above, but this is cheap (on average) because it is rarely required for factors of high degree.
In the complexity analysis we only consider the time required to find one nontrivial factor (it will be a factor of smallest degree) or output “irreducible”, since that is what is required in the search for irreducible trinomials. However the algorithm outlined in §2.4 readily extends to a complete distinct-degree factorization.
1.2. Factorization over GF(2). It is well-known that $x^{2^d} + x$ is the product of all irreducible polynomials of degree dividing $d$. For example,
$$x^{2^3} + x = x(x + 1)(x^3 + x + 1)(x^3 + x^2 + 1).$$
Thus, a simple algorithm to find a factor of smallest degree of $P(x)$ is to compute $\mathrm{GCD}(x^{2^d} + x, P(x))$ for $d = 1, 2, \ldots$ The first time that the GCD is nontrivial, it contains a factor of minimal degree $d$. If the GCD has degree $> d$, it must be a product of factors of degree $d$. If no factor has been found for $d \le r/2$, where $r = \deg(P(x))$, then $P(x)$ must be irreducible.
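The simple algorithm just described can be sketched in a few lines of Python. This is only an illustration on tiny inputs, not the authors' implementation: polynomials over GF(2) are stored as Python integers (bit $i$ is the coefficient of $x^i$), and the helper names `pmod`, `pmulmod`, `pgcd` and `smallest_factor` are assumptions introduced here.

```python
def pmod(a, p):
    """Remainder of a modulo p in GF(2)[x]; bit i of an int is the coefficient of x^i."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a


def pmulmod(a, b, p):
    """Carry-less (GF(2)) product of a and b, reduced modulo p."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        b >>= 1
    return pmod(res, p)


def pgcd(a, b):
    """Euclidean GCD in GF(2)[x]."""
    while b:
        a, b = b, pmod(a, b)
    return a


def smallest_factor(p):
    """Return (d, g) with g = GCD(x^(2^d) + x, P) for the first d giving g != 1,
    or None if no factor of degree <= deg(P)/2 exists (P is then irreducible)."""
    r = p.bit_length() - 1
    h = 2                          # the polynomial x
    for d in range(1, r // 2 + 1):
        h = pmulmod(h, h, p)       # h = x^(2^d) mod P, by repeated squaring
        g = pgcd(p, h ^ 2)         # h ^ 2 is x^(2^d) + x
        if g != 1:
            return d, g
    return None
```

For example, $x^5 + x + 1 = (x^2 + x + 1)(x^3 + x^2 + 1)$, so `smallest_factor(0b100011)` reports the degree-2 factor, while the irreducible trinomial $x^7 + x + 1$ yields `None`.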
Some simplifications are possible when $P(x) = x^r + x^s + 1$ is a trinomial over GF(2) with $r$ or $s$ odd (otherwise $P(x)$ is a square and hence trivially reducible):
(1) We can skip the case $d = 1$ because a trinomial cannot have a factor of degree 1 (since $P(0) = P(1) = 1$).
(2) Since $x^r P(1/x) = x^r + x^{r-s} + 1$, we only need consider $s \le r/2$.
(3) We can assume that $P(x)$ is square-free.
(4) By applying Swan’s theorem, we can often show that the trinomial under consideration has an odd number of irreducible factors; in this case we only need check $d \le r/3$ before claiming that $P(x)$ is irreducible.

2. Complexity of the algorithm

Note that $x^{2^d}$ should not be computed explicitly; it is much better to compute $x^{2^d} \bmod P(x)$ by repeated squaring. The complexity of squaring modulo a trinomial of degree $r$ is only $S(r) = O(r)$ bit-operations.
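To make the squaring step concrete, here is a minimal Python sketch, assuming polynomials are stored as integers with bit $i$ holding the coefficient of $x^i$. Squaring over GF(2) just moves the coefficient of $x^i$ to $x^{2i}$, and reduction uses $x^r = x^s + 1$. The function names are illustrative; this is not the memory-efficient word-level routine of [5], and with Python big integers it does not achieve the $O(r)$ bound.

```python
def spread_bits(a):
    """Square in GF(2)[x]: the coefficient of x^i moves to x^(2i)."""
    out, i = 0, 0
    while a:
        if a & 1:
            out |= 1 << (2 * i)
        a >>= 1
        i += 1
    return out


def reduce_mod_trinomial(a, r, s):
    """Reduce a modulo x^r + x^s + 1 using x^r = x^s + 1."""
    mask = (1 << r) - 1
    while a >> r:
        hi = a >> r                      # part of a with degree >= r, divided by x^r
        a = (a & mask) ^ hi ^ (hi << s)  # hi * x^r is replaced by hi * (x^s + 1)
    return a


def square_mod_trinomial(a, r, s):
    """Square a modulo the trinomial x^r + x^s + 1."""
    return reduce_mod_trinomial(spread_bits(a), r, s)


def x_pow_2d(d, r, s):
    """x^(2^d) mod (x^r + x^s + 1), computed with d squarings."""
    h = 2                                # the polynomial x
    for _ in range(d):
        h = square_mod_trinomial(h, r, s)
    return h
```

As a check, modulo the irreducible trinomial $x^7 + x + 1$ seven squarings return to $x$, so `x_pow_2d(7, 7, 1)` equals `2`.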

2.1. Complexity of polynomial multiplication and squaring. We need to perform multiplications in $\mathrm{GF}(2)[x]/P(x)$, and an important special case is squaring a polynomial modulo $P(x)$, so we consider the bit-complexity of these operations.
Multiplication of polynomials of degree $r$ over GF(2) can be performed in time $M(r) = O(r \log r \log\log r)$. We have implemented an algorithm of Schönhage [17] that achieves this bound. The algorithm uses a radix-3 FFT and is different from the better-known Schönhage-Strassen algorithm [18]. We remark that the $\log\log r$ term in the time-bound for the Schönhage-Strassen algorithm has been reduced by Fürer [9], but it is not clear if a similar idea can be used to improve Schönhage’s algorithm [17]. In any event the $\log\log r$ term comes from the number of levels of recursion and is a small constant for the values of $r$ that we are considering.
In practice, Schönhage’s algorithm is not the fastest unless $r$ is quite large. We have also implemented classical, Karatsuba and Toom-Cook algorithms that have $M(r) = O(r^\alpha)$, $1 < \alpha \le 2$, since these algorithms are easier to implement and are faster for small $r$. Our implementations of the Toom-Cook algorithms TC3 and TC4 are based on recent ideas of Bodrato [1].
For brevity we assume that $r$ is large and Schönhage’s algorithm is used. On a 64-bit machine the crossover versus TC4 occurs around degree $r = 180000$, see [4]. In the complexity estimates we assume that $M(r)$ is a sufficiently smooth and well-behaved function.
By squaring we mean squaring a polynomial of degree $< r$ and reduction mod $P(x)$. Squaring in $\mathrm{GF}(2)[x]/P(x)$ can be performed in time $S(r) = \Theta(r) \ll M(r)$ (assuming, as usual, that $P(x)$ is a trinomial). Our algorithm takes advantage of the fact that squaring is much faster than multiplication.
Where possible we use the memory-efficient squaring algorithm of Brent, Larvala and Zimmermann [5], which in our implementation is about 2.2 times faster than the naive squaring algorithm.
2.2. Complexity of GCD. For GCDs we use a sub-quadratic algorithm that runs in time $G(r) = \Theta(M(r) \log r)$. More precisely,
$$G(2r) = 2G(r) + \Theta(M(r)), \tag{2.1}$$
so
$$M(r) = \Theta(r \log r \log\log r) \;\Rightarrow\; G(r) = \Theta(M(r) \log r).$$
If the classical or Karatsuba algorithm (or one of the Toom-Cook class of algorithms) is used for multiplication, then $M(r) = \Theta(r^\alpha)$ for some $\alpha > 1$, and in this case it follows from (2.1) that
$$G(r) = \Theta(M(r)).$$
In practice, for $r \approx 2.4 \times 10^7$ and our implementation on a 2.2 GHz Opteron, $S(r) \approx 0.005$ second, $M(r) \approx 2$ seconds, $G(r) \approx 80$ seconds, so $M(r)/S(r) \approx 400$, and $G(r)/M(r) \approx 40$.
2.3. Avoiding GCD computations. In the context of integer factorization, Pollard [16] suggested a blocking strategy to avoid most GCD computations and thus reduce the amortized cost; von zur Gathen and Shoup [12] applied the same idea to polynomial factorization.
The idea of blocking is to choose a parameter $\ell > 0$ and, instead of computing
$$\mathrm{GCD}(x^{2^d} + x, P(x)) \quad \text{for } d \in [d', d' + \ell),$$
compute
$$\mathrm{GCD}(p_\ell(x^{2^{d'}}, x), P(x)),$$
where the interval polynomial $p_\ell(X, x)$ is defined by
$$p_\ell(X, x) = \prod_{j=0}^{\ell-1} \left(X^{2^j} + x\right).$$
In this way we replace $\ell$ GCDs by one GCD and $\ell - 1$ multiplications mod $P(x)$.
The drawback of blocking is that we may have to backtrack if $P(x)$ has more than one factor with degree in the interval $[d', d' + \ell)$, since the algorithm produces the product of these factors. Thus $\ell$ should not be too large. The optimal strategy depends on the expected size distribution of factors and the ratio of times for GCDs and multiplications.
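A minimal Python sketch of this single-level blocking follows, again with polynomials over GF(2) stored as integers (bit $i$ is the coefficient of $x^i$). The helper names and the function `interval_gcd` are assumptions made for illustration; a real implementation would use fast multiplication and a subquadratic GCD.

```python
def pmod(a, p):
    """Remainder of a modulo p in GF(2)[x]."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a


def pmulmod(a, b, p):
    """Carry-less product of a and b, reduced modulo p."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        b >>= 1
    return pmod(res, p)


def pgcd(a, b):
    """Euclidean GCD in GF(2)[x]."""
    while b:
        a, b = b, pmod(a, b)
    return a


def interval_gcd(p, d_start, ell):
    """GCD(p_ell(x^(2^d_start), x), P): one GCD for the whole interval
    [d_start, d_start + ell) instead of ell separate GCDs."""
    X = 2                                   # the polynomial x
    for _ in range(d_start):
        X = pmulmod(X, X, p)                # X = x^(2^d_start) mod P
    prod = 1
    for _ in range(ell):
        prod = pmulmod(prod, X ^ 2, p)      # multiply by (X^(2^j) + x)
        X = pmulmod(X, X, p)                # advance to X^(2^(j+1))
    return pgcd(p, prod)
```

With $P(x) = x^5 + x + 1 = (x^2 + x + 1)(x^3 + x^2 + 1)$, the interval $[2, 3)$ isolates the degree-2 factor, whereas the interval $[2, 4)$ returns the product of both factors (here $P$ itself), which illustrates the backtracking drawback mentioned above.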
2.4. Multi-level blocking. Our new idea is to use a finer level of blocking to replace most multiplications by squarings, which speeds up the computation in $\mathrm{GF}(2)[x]/P(x)$ of the above interval polynomials. The idea is to split the interval $[d', d' + \ell)$ into $k \ge 2$ smaller intervals of length $m$ over which
$$p_m(X, x) = \prod_{j=0}^{m-1} \left(X^{2^j} + x\right) = \sum_{j=0}^{m} x^{m-j} s_{j,m}(X), \tag{2.2}$$
where
$$s_{j,m}(X) = \sum_{0 \le k < 2^m,\; w(k) = j} X^k, \tag{2.3}$$
and $w(k)$ denotes the Hamming weight of $k$, that is the number of nonzero bits in the binary representation of $k$.
For example, if $m = 3$, we have:
$$p_m(X, x) = x^3 + x^2(X^4 + X^2 + X) + x(X^6 + X^5 + X^3) + X^7;$$
hence $s_{0,3}(X) = 1$, $s_{1,3}(X) = X^4 + X^2 + X$, $s_{2,3}(X) = X^6 + X^5 + X^3$, and $s_{3,3}(X) = X^7$.
Note that
$$s_{j,m}(X^2) = s_{j,m}(X)^2 \quad \text{in } \mathrm{GF}(2)[x]/P(x).$$
Thus, $p_m(x^{2^d}, x)$ can be computed with cost $m^2 S(r)$ if we already know $s_{j,m}(x^{2^{d-m}})$ for $0 < j \le m$. (The constant polynomial $s_{0,m}(X) = 1$ is computed only once.)
Continuing the example with $m = 3$, and assuming that we know $s_{1,3}(x^{2^{d-3}})$, $s_{2,3}(x^{2^{d-3}})$, and $s_{3,3}(x^{2^{d-3}})$, squaring each of these $m = 3$ times gives $s_{1,3}(x^{2^d})$, $s_{2,3}(x^{2^d})$, and $s_{3,3}(x^{2^d})$, from which we can easily get $p_3(x^{2^d}, x)$ using the sum in Eq. (2.2).
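The identity (2.2) and the squaring property can be checked numerically on a small example. The Python sketch below evaluates $s_{j,m}(X)$ directly from definition (2.3), which costs $O(2^m)$ terms and is only meant as a brute-force check, not as the efficient method. Polynomials over GF(2) are stored as integers (bit $i$ is the coefficient of $x^i$); all function names are illustrative assumptions.

```python
def pmod(a, p):
    """Remainder of a modulo p in GF(2)[x]."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a


def pmulmod(a, b, p):
    """Carry-less product of a and b, reduced modulo p."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        b >>= 1
    return pmod(res, p)


def ppow(a, e, p):
    """a^e mod p by square-and-multiply."""
    result = 1
    while e:
        if e & 1:
            result = pmulmod(result, a, p)
        a = pmulmod(a, a, p)
        e >>= 1
    return result


def s_direct(j, m, X, p):
    """s_{j,m}(X) mod P from definition (2.3): sum of X^k over k < 2^m of Hamming weight j."""
    total = 0
    for k in range(1 << m):
        if bin(k).count("1") == j:
            total ^= ppow(X, k, p)
    return total


def interval_poly_product(m, X, p):
    """Left-hand side of (2.2): product of (X^(2^j) + x) for j = 0..m-1."""
    prod = 1
    for _ in range(m):
        prod = pmulmod(prod, X ^ 2, p)
        X = pmulmod(X, X, p)
    return prod


def interval_poly_sum(m, X, p):
    """Right-hand side of (2.2): sum of x^(m-j) * s_{j,m}(X) for j = 0..m."""
    total = 0
    for j in range(m + 1):
        total ^= pmulmod(ppow(2, m - j, p), s_direct(j, m, X, p), p)
    return total
```

For instance, with $P(x) = x^7 + x + 1$, $m = 3$ and $X = x^4$, the two sides of (2.2) agree, and squaring each $s_{j,3}(X)$ gives $s_{j,3}(X^2)$.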
In this way we replace $m - 1$ multiplications and $m$ squarings (if we used the product in Eq. (2.2)) by $m^2$ squarings. Each $s_{j,m}$, $0 < j \le m$, requires $m$ squarings to be shifted from argument $x^{2^{d-m}}$ to argument $x^{2^d}$. The summation in Eq. (2.2) costs only $O(mr)$, which is negligible. Choosing $m \approx \sqrt{M(r)/S(r)}$ (about 20 if $M(r)/S(r) \approx 400$), the speedup over single-level blocking is about $m/2 \approx 10$ (not counting the cost of GCDs).
Von zur Gathen and Gerhard [11, p. 1685] suggested using the same idea with $m = 2$ (thus reducing the number of multiplications by a factor of two), but did not consider choosing an optimal $m > 2$.
At first sight initialization of the polynomials $s_{j,m}(X)$ for $X = x$ might appear to be expensive, since the definition (2.3) involves $O(2^m)$ terms. However, the polynomials $s_{j,m}(X)$ satisfy a “Pascal triangle” recurrence relation
$$s_{j,m}(X) = s_{j,m-1}(X^2) + X s_{j-1,m-1}(X^2)$$
with boundary conditions
$$s_{j,m}(X) = \begin{cases} 0 & \text{if } j > m \ge 0, \\ 1 & \text{if } m \ge j = 0. \end{cases}$$
Using this recurrence, it is easy to compute $s_{j,m}(x) \bmod P(x)$ for $0 \le j \le m$ in time $O(m^2 r)$. Thus, the initialization is cheap.
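The recurrence translates directly into a row-by-row computation, in the same way as Pascal's triangle. The Python sketch below is an illustration under the assumption that polynomials over GF(2) are stored as integers (bit $i$ is the coefficient of $x^i$); it uses $s_{j,m-1}(x^2) = s_{j,m-1}(x)^2$, so each row needs only squarings and multiplications by $x$ (shifts). The name `s_table` is hypothetical.

```python
def pmod(a, p):
    """Remainder of a modulo p in GF(2)[x]."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a


def pmulmod(a, b, p):
    """Carry-less product of a and b, reduced modulo p."""
    res = 0
    while b:
        if b & 1:
            res ^= a
        a <<= 1
        b >>= 1
    return pmod(res, p)


def s_table(m, p):
    """Return [s_{0,m}(x), ..., s_{m,m}(x)] mod P via the Pascal-triangle recurrence."""
    row = [1]                                        # m = 0: s_{0,0} = 1
    for _ in range(m):
        # Squares give s_{j,m-1}(x^2); the appended 0 is the boundary value s_{m,m-1}.
        sq = [pmulmod(v, v, p) for v in row] + [0]
        new_row = []
        for j in range(len(sq)):
            term = sq[j]
            if j > 0:
                term ^= pmod(sq[j - 1] << 1, p)      # x * s_{j-1,m-1}(x^2)
            new_row.append(term)
        row = new_row
    return row
```

With a modulus of degree larger than 7 no reduction occurs for $m = 3$, so `s_table(3, p)` reproduces the polynomials $1$, $x^4 + x^2 + x$, $x^6 + x^5 + x^3$ and $x^7$ of the example above.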
To summarise, we use two levels of blocking:
(1) The outer level replaces most GCDs by multiplications.
(2) The inner level replaces most multiplications by squarings.
(3) The parameter $m \approx \sqrt{M(r)/S(r)}$ is used for the inner level of blocking.
(4) A different parameter $\ell = km$ is used for the outer level of blocking.
For example, suppose $S = 1/400$, $M = 1$, $G = 40$ (where we have normalised so $M = 1$). We could choose $\ell = 80$ and $m = 20$. With no blocking, the cost for an interval of length 80 is $80G + 80S = 3200.2$; with 1-level blocking the cost is $G + 79M + 80S = 119.2$; with 2-level blocking the cost is $G + 3M + 1600S = 47.0$.
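The three costs in this example follow from a simple cost model, sketched below in Python. The function name and its parameters are illustrative assumptions; the model counts only GCDs, multiplications and squarings for one outer interval of length $\ell = km$ and ignores the $O(mr)$ summations.

```python
def interval_costs(ell, m, S, M, G):
    """Cost of processing one interval of length ell (a multiple of m)
    with no blocking, 1-level blocking and 2-level blocking."""
    k = ell // m                                   # number of inner intervals
    no_blocking = ell * G + ell * S                # one GCD and one squaring per degree
    one_level = G + (ell - 1) * M + ell * S        # one GCD, ell - 1 multiplications
    two_level = G + (k - 1) * M + k * m * m * S    # m^2 squarings per inner interval
    return no_blocking, one_level, two_level


costs = interval_costs(ell=80, m=20, S=1 / 400, M=1, G=40)
```

With the normalised values above, `costs` is approximately `(3200.2, 119.2, 47.0)`, matching the figures in the text.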

2.5. Sieving out small factors. We define a small factor to be one with degree $d < \frac{1}{2} \log_2 r$, so $2^d < \sqrt{r}$. The constant $\frac{1}{2}$ in the definition is arbitrary and could be replaced by any fixed constant in $(0, 1)$. A large factor is a factor that is not small.
It would be inefficient to find small factors in the same way as large factors. Instead, let $D = 2^d - 1$, $r' = r \bmod D$, $s' = s \bmod D$. Then
$$P(x) = x^r + x^s + 1 = x^{r'} + x^{s'} + 1 \bmod (x^D - 1),$$
so we only need compute
$$\mathrm{GCD}(x^{r'} + x^{s'} + 1, x^D - 1).$$
Because $r', s' < D < \sqrt{r}$, the cost of finding small factors is negligible (both theoretically and in practice), so can be neglected.
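The sieving step only touches polynomials of degree below $D$, so it can be sketched without ever forming $P(x)$. The Python code below is an illustration with hypothetical names, storing polynomials over GF(2) as integers (bit $i$ is the coefficient of $x^i$); a nontrivial GCD means that $P(x)$ has an irreducible factor whose degree divides $d$.

```python
def pmod(a, p):
    """Remainder of a modulo p in GF(2)[x]."""
    dp = p.bit_length()
    while a.bit_length() >= dp:
        a ^= p << (a.bit_length() - dp)
    return a


def pgcd(a, b):
    """Euclidean GCD in GF(2)[x]."""
    while b:
        a, b = b, pmod(a, b)
    return a


def has_small_factor(r, s, d):
    """True if x^r + x^s + 1 has an irreducible factor of degree dividing d (d >= 2)."""
    D = (1 << d) - 1
    r1, s1 = r % D, s % D
    small = (1 << r1) ^ (1 << s1) ^ 1      # x^r' + x^s' + 1 (XOR handles coinciding terms)
    x_D_minus_1 = (1 << D) ^ 1             # x^D - 1 equals x^D + 1 over GF(2)
    return pgcd(x_D_minus_1, small) != 1
```

For example, `has_small_factor(13466917, 2, 2)` is true, in line with the divisibility by $x^2 + x + 1$ noted in the introduction, while the irreducible trinomial $x^7 + x + 1$ gives false for $d = 2$ and $d = 3$.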
2.6. Outer-level blocking strategy. The blocksize in the outer level of blocking is $\ell = km$. We take a linearly increasing sequence of block sizes
$$k = k_0 j \quad \text{for } j = 1, 2, 3, \ldots,$$
where the first interval starts at about $\log r$ (since small factors will have been found by sieving).
The choice $k = k_0 j$ leads to a quadratic polynomial for the interval bounds. More generally, we could take $k$ to be a polynomial of degree $\delta > 0$ in $j$, so the interval bounds would be a polynomial of degree $\delta + 1$. The analysis of §4 would go through with minor changes. Generally, increasing $\delta$ reduces the number of GCDs but increases the number of squarings/multiplications. In practice, we found that the simple choice $\delta = 1$ is close to optimal.
In principle, using the data that we have obtained on the distribution of degrees of smallest factors of trinomials (see §3), and assuming that this distribution is not very sensitive to the degree $r$, we could obtain a strategy that is close to optimal. However, the choice $k_0 j$ with suitable $k_0$ is easy to implement and not too far from optimal. The number of GCD and sqr/mul operations is usually within a factor of 1.5 of the minimum possible in our experiments.
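The linearly increasing block sizes can be generated as follows. This Python sketch is an assumption-laden illustration: the generator name, the half-open interval convention and the clipping at `d_max` are choices made here, and the parameter values in the example are arbitrary rather than the ones used by the authors.

```python
def outer_intervals(d_start, d_max, m, k0):
    """Yield half-open outer intervals [lo, hi) covering degrees d_start..d_max,
    where the j-th interval has length ell = k0 * j * m."""
    lo, j = d_start, 1
    while lo <= d_max:
        ell = k0 * j * m
        yield lo, min(lo + ell, d_max + 1)
        lo += ell
        j += 1


blocks = list(outer_intervals(d_start=10, d_max=100, m=5, k0=2))
```

Here `blocks` is `[(10, 20), (20, 40), (40, 70), (70, 101)]`. Because the interval bounds grow quadratically in $j$, a smallest factor of degree $d$ is reached after $O(\sqrt{d})$ outer intervals, that is, after $O(\sqrt{d})$ GCDs.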

3. Distribution of degrees of factors

In order to predict the expected behaviour of our algorithm, we need to know
the expected distribution of degrees of smallest irreducible factors. From Swan’s
theorem [22], we know that there are significant differences between the distribution
of factors of trinomials and of all polynomials of the same degree. Our complexity
estimates are based on the heuristic assumption that this difference is not too large,
in a sense made precise by Hypothesis 3.1.
Hypothesis 3.1. Over all trinomials $x^r + x^s + 1$ of degree $r$ over GF(2), the probability $\pi_d$ that a trinomial has no nontrivial factor of degree $\le d$, $1 < d \le r$, is at most $c/d$, where $c$ is a constant.
Hypothesis 3.1 implies that there are at most $c$ irreducible trinomials of degree $r$. This is probably false, as there may well be a sequence of exceptional $r$ for which the number of irreducible trinomials is unbounded. Thus, we may need to replace the constant $c$ in Hypothesis 3.1 by a slowly-growing function $c(r)$. Nevertheless, in order to give realistic complexity estimates that are in agreement with experiments, we assume below that Hypothesis 3.1 is correct. Under this assumption we use an amortized model to obtain the total complexity over all trinomials of degree $r$.
From Hypothesis 3.1, the probability that a trinomial does not have a small factor (as defined in §2.5) is $O(1/\log r)$.
Table 1 gives the observed values of $d\pi_d$ for $r = 3021377$, $r = 6972593$, and $r = 24036583$. The maximum values for each $r$ are given in bold. The table shows that the values of $d\pi_d$ are remarkably stable for small $d$, and bounded by 4 for large $d$ (this is because there are four irreducible trinomials of degree 3021377 and also four of degree 24036583, when we count both trinomials $x^r + x^s + 1$ and their reciprocals $x^r + x^{r-s} + 1$).

Table 1. $d\pi_d$ for various degrees $r$.

d           r = 3021377   r = 6972593   r = 24036583
2           1.333         1.333         1.333
3           1.429         1.429         1.429
4           1.524         1.524         1.524
5           1.536         1.536         1.536
6           1.598         1.598         1.598
7           1.600         1.600         1.600
8           1.667         1.667         1.667
9           1.642         1.642         1.642
10          1.652         1.652         1.652
100         1.763         1.771         1.770
1000        1.783         1.756         1.786
10000       1.946         1.873         1.786
100000      1.986         1.606         1.880
279383      1.480         2.084         1.813
1000000     1.324         1.147         1.831
10000000    –             –             1.664
r − 1       4.000         2.000         4.000

3.1. Consequences of the hypothesis. Define $p_d = \pi_{d-1} - \pi_d$ to be the probability that the smallest nontrivial factor $f$ of a randomly chosen trinomial has degree $d = \deg(f)$. In order to estimate the running time of our algorithm, we use the following Lemma, which gives the expectation $E_\beta$ of $d^\beta$.

Lemma 3.2. If $\beta > 0$ is constant and Hypothesis 3.1 holds, then
$$E_\beta := \sum_{d=1}^{r} d^\beta p_d = \begin{cases} O(1) & \text{if } \beta < 1, \\ O(\log r) & \text{if } \beta = 1, \\ O(r^{\beta-1}) & \text{if } \beta > 1. \end{cases}$$
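Proof. We use summation by parts. Note that a trinomial has no factor of degree 1, so $p_1 = 0$ and $\pi_0 = \pi_1 = 1$. Thus
$$\begin{aligned}
E_\beta = \sum_{d=1}^{r} d^\beta p_d &= \sum_{d=1}^{r} d^\beta (\pi_{d-1} - \pi_d) \\
&= \sum_{d=1}^{r-1} \left((d+1)^\beta - d^\beta\right) \pi_d + \pi_0 - r^\beta \pi_r \\
&\le 1 + c \sum_{d=1}^{r-1} \frac{(d+1)^\beta - d^\beta}{d} \quad \text{(by Hypothesis 3.1)} \\
&\le 1 + O\!\left(\sum_{d=1}^{r-1} d^{\beta-2}\right)
\end{aligned}$$
and the result follows.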
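For readers who want the last two steps spelled out, one possible way to fill them in is the following (a sketch, not taken from the original proof). By the mean value theorem, for each $d \ge 1$ there is a $\xi \in (d, d+1)$ with

$$(d+1)^\beta - d^\beta = \beta\, \xi^{\beta-1} \le \beta \max\!\left(1, 2^{\beta-1}\right) d^{\beta-1},$$

so each term of the sum is $O(d^{\beta-2})$. Comparing the remaining sum with an integral gives

$$\sum_{d=1}^{r-1} d^{\beta-2} = \begin{cases} O(1) & \text{if } \beta < 1 \ (\text{the exponent is} < -1), \\ O(\log r) & \text{if } \beta = 1, \\ O(r^{\beta-1}) & \text{if } \beta > 1, \end{cases}$$

which are the three cases of Lemma 3.2.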
The following Lemma gives a stronger result in the case $\beta < 1$.
Lemma 3.3. If $0 < \beta < 1$, $0 < D \le r$, and Hypothesis 3.1 holds, then
$$\sum_{d=D}^{r} d^\beta p_d = O\!\left(D^{\beta-1}\right).$$

Proof. The proof is similar to that of Lemma 3.2. We end with the upper bound
$$\sum_{d=D}^{r-1} \frac{(d+1)^\beta - d^\beta}{d} + D^\beta \pi_{D-1}.$$
From Hypothesis 3.1, $\pi_{D-1} = O(1/D)$, and the sum over $d$ is $O(D^{\beta-1})$, so the result follows.

4. Expected cost of sqr/mul and GCD

Recall that the inner level of blocking replaces $m - 1$ multiplications by $m^2 - m$ squarings, where the choice $m \approx \sqrt{M(r)/S(r)}$ makes the total cost of squarings about equal to the cost of multiplications.
For a smallest factor of degree $d$, the number of squarings is $m(d + O(\sqrt{d}))$, where the $O(\sqrt{d})$ term follows from our choice of outer-level blocksizes (see §2.6). Averaging over all trinomials of degree $r$, the expected number of squarings is
$$O\!\left(m \sum_{d \le r/2} \left(d + O(\sqrt{d})\right) p_d\right),$$
and from Lemma 3.2 this is $O(m \log r)$. Thus, the expected cost of sqr/mul operations per trinomial is
$$O\!\left(S(r) \log r \sqrt{M(r)/S(r)}\right) = O\!\left(\log r \sqrt{M(r) S(r)}\right) = O\!\left(r (\log r)^{3/2} (\log\log r)^{1/2}\right). \tag{4.1}$$
If we used only a single level of blocking, then the cost of multiplications would dominate that of squarings, with an expected cost per trinomial of $O(\log r \, M(r)) = O\!\left(r (\log r)^2 \log\log r\right)$.
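The final form of the two-level bound follows by substituting the complexities from §2.1; as a short worked step (assuming $M(r) = O(r \log r \log\log r)$ and $S(r) = \Theta(r)$ as stated there):

$$\log r \sqrt{M(r)\, S(r)} = O\!\left(\log r \sqrt{r^2 \log r \log\log r}\right) = O\!\left(r (\log r)^{3/2} (\log\log r)^{1/2}\right).$$

Dividing the single-level bound by the two-level bound gives

$$\frac{\log r \, M(r)}{\log r \sqrt{M(r)\, S(r)}} = \sqrt{\frac{M(r)}{S(r)}},$$

so the asymptotic gain on sqr/mul operations is of the order of the inner blocking parameter $m$.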
The bound (4.1) is correct as $r \to \infty$. In practice, for $r < 6.4 \times 10^7$, our implementation of Schönhage’s FFT-based polynomial multiplication algorithm [17] calls a different multiplication routine (usually TC4) to perform smaller multiplications, rather than recursively calling itself. TC4 has exponent $\alpha' = \ln(7)/\ln(4) \approx 1.4$, so the effective exponent for FFT multiplication is $\alpha = (1 + \alpha')/2 \approx 1.2 > 1$. In this case, the expected cost of sqr/mul operations per trinomial is
$$O\!\left(\log r \sqrt{M(r) S(r)}\right) = O\!\left(r^{(1+\alpha)/2} \log r\right) = O\!\left(r^{1.1\cdots} \log r\right). \tag{4.2}$$

4.1. Expected cost of GCDs. Suppose that $P(x)$ has a smallest factor of degree $d$. The number of GCDs required to find the factor, using our (quadratic polynomial) blocking strategy, is at least 1, and $O(\sqrt{d})$ if $d$ is large. By Hypothesis 3.1, the expected number of GCDs for a trinomial with no small factor is
$$1 + O\!\left(\sum_{\log_2 r < 2d \le r} d^{1/2} p_d\right),$$
and by Lemma 3.3 this is
$$1 + O\!\left(\frac{1}{\sqrt{\log r}}\right).$$
Thus the expected cost of GCDs per trinomial is
$$O(G(r)/\log r) = O(M(r)) = O(r \log r \log\log r). \tag{4.3}$$
The estimate (4.3) is asymptotically less than the expected cost (4.1) of sqr/mul operations. However, if $M(r) = O(r^\alpha)$ with $\alpha > 1$, then the expected cost of GCDs is $O(r^\alpha / \log r)$, which is asymptotically greater than the expected cost (4.2) of sqr/mul operations. Note the expected cost of GCDs does not depend on whether we use one or two levels of blocking.
For $r \approx 2.4 \times 10^7$, GCDs take about 65% of the time versus 35% for sqr/mul.

4.2. Comparison with previous algorithms. For simplicity we use the $\tilde{O}$ notation which ignores log factors. For example, $M(r) = \tilde{O}(r)$.
The “naive” algorithm, as implemented by Brent, Larvala and Zimmermann [5] and earlier authors, takes an expected time $\tilde{O}(r^2)$ per trinomial, or $\tilde{O}(r^3)$ to cover all trinomials of degree $r$.
The single-level blocking strategy and the new algorithm both take expected time $\tilde{O}(r)$ per trinomial, or $\tilde{O}(r^2)$ to cover all trinomials of degree $r$.
In practice, the new algorithm is faster than the naive algorithm by a factor
of about 160 for r = 6972593, and by a factor of about 560 for r = 24036583. For
r = 24036583, where sqr/mul operations take 35% of the total time in the new
algorithm, and the corresponding speedup is about 10, this gives a global speedup
of more than 4 over the single-blocking strategy.
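The global speedup can be checked with a line of arithmetic. Normalise the running time of the new algorithm to 1, so that GCDs account for 0.65 and sqr/mul operations for 0.35. Assuming, as stated in §4.1, that the GCD cost is the same with one or two levels of blocking, and that single-level blocking makes the sqr/mul part about 10 times more expensive, the single-level time is roughly

$$0.65 + 10 \times 0.35 = 4.15,$$

that is, a ratio slightly above 4 in favour of the two-level strategy.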

4.3. Some details of our implementation. We first implemented the 2-level blocking strategy in NTL [19]. To get full efficiency, we rewrote all critical routines and tuned them efficiently on the target processors. Our squaring routine implements the algorithm described in [5], which is more than twice as fast as the corresponding optimized NTL routine for trinomials. Our multiplication routine implements Toom-Cook 3-way, 4-way, and Schönhage’s algorithm [17]. We also improved the basecase multiplication code; more details concerning efficient multiplication in $\mathrm{GF}(2)[x]$ are available in [4]. Finally, we implemented a subquadratic GCD routine, since NTL only provides a classical GCD for binary polynomials.

4.4. Primitive trinomials. The largest published primitive trinomial is
$$x^{6972593} + x^{3037958} + 1,$$
found by Brent, Larvala and Zimmermann [5] in 2002 using a naive (but efficiently implemented) algorithm.
In March–April 2007, we tested our new program by verifying the published results on primitive trinomials for Mersenne exponents $r \le 6972593$, and in the process produced certificates of reducibility (lists of smallest factors for each reducible trinomial). These are available from the first author’s website [3].
In April–August 2007, we ran our new algorithm to search for primitive trinomials of degree $r = 24036583$. This is the next Mersenne exponent, apart from two that are trivial to exclude by Swan’s theorem. It would take about 41 times as long as for $r = 6972593$ by the naive algorithm, but our new program is 560 times faster than the naive algorithm. Each trinomial takes on average about 16 seconds on a 2.2 GHz Opteron.
The complete computation was performed in four months, using about 24 Opteron and Core 2 processors located at ANU and INRIA.
We found two new primitive trinomials of (equal) record degree:
$$x^{24036583} + x^{8412642} + 1 \tag{4.4}$$
and
$$x^{24036583} + x^{8785528} + 1. \tag{4.5}$$

4.5. Verification. Allan Steel [20] kindly verified irreducibility of (4.4)–(4.5) using Magma [2]. Each verification took about 67 hours on a 2.4 GHz Core 2 processor. Independent verifications using our irred V3.15 program [5, 6] took about 35 hours on a 2.2 GHz Opteron. The difference in speed is mainly due to the fast squaring algorithm implemented in irred.
Primitivity of (4.4)–(4.5) follows from irreducibility provided that the degree 24036583 is a Mersenne exponent. We have not verified this, but rely on computations performed by the GIMPS project [23].
Reducibility of the remaining trinomials of degree 24036583 can be verified using the certificate (or extended log, a list of smallest irreducible factors) available from our website [3]. The verification takes less than 10 hours using Magma on a 2.66 GHz Core 2 processor.

5. Conclusion
The new double-blocking strategy, combined with fast multiplication and GCD
algorithms, has allowed us to find new primitive trinomials of record degree.
The same ideas should work over finite fields GF($p$) for small prime $p > 2$, and for factoring sparse polynomials $P(x)$ that are not necessarily trinomials: all we need is that the time for $p$-th powers (mod $P(x)$) is much less than the time for multiplication (mod $P(x)$).
Acknowledgements. We thank Allan Steel for verifying irreducibility of the trinomials (4.4)–(4.5), and Marco Bodrato, Pierrick Gaudry and Emmanuel Thomé for their assistance in implementing fast algorithms for multiplication of polynomials over GF(2). ANU and INRIA provided computing facilities. The first author’s research was supported by MASCOS and the Australian Research Council.

References
[1] M. Bodrato, Towards Optimal Toom-Cook Multiplication for Univariate and Multivariate Polynomials in Characteristic 2 and 0, Lecture Notes in Computer Science 4547, 119–136. Springer, 2007. http://bodrato.it/papers/#WAIFI2007
[2] W. Bosma and J. Cannon, Handbook of Magma Functions, School of Mathematics and Statistics, University of Sydney, 1995. http://magma.maths.usyd.edu.au/
[3] R. P. Brent, Search for primitive trinomials (mod 2), http://wwwmaths.anu.edu.au/~brent/trinom.html
[4] R. P. Brent, P. Gaudry, E. Thomé and P. Zimmermann, Faster Multiplication in GF(2)[x], Proceedings of ANTS VIII, A. van der Poorten, A. Stein, editors, Lecture Notes in Computer Science, 2008, to appear. Also INRIA Tech Report RR-6359, http://hal.inria.fr/inria-00188261/en/, Nov. 2007, 19 pp.
[5] R. P. Brent, S. Larvala and P. Zimmermann, A fast algorithm for testing reducibility of trinomials mod 2 and some new primitive trinomials of degree 3021377, Math. Comp. 72 (2003), 1443–1452. http://wwwmaths.anu.edu.au/~brent/pub/pub199.html
[6] R. P. Brent, S. Larvala and P. Zimmermann, A primitive trinomial of degree 6972593, Math. Comp. 74 (2005), 1001–1002. http://wwwmaths.anu.edu.au/~brent/pub/pub224.html
[7] D. G. Cantor and H. Zassenhaus, A new algorithm for factoring polynomials over finite fields, Math. Comp. 36 (1981), 587–592.
[8] Ph. Flajolet, X. Gourdon and D. Panario, The complete analysis of a polynomial factorization algorithm over finite fields, J. of Algorithms 40 (2001), 37–81.
[9] M. Fürer, Faster integer multiplication, Proceedings of the 39th annual ACM Symposium on Theory of Computing (STOC 2007), 57–66.
[10] J. von zur Gathen and J. Gerhard, Modern Computer Algebra, Cambridge University Press, Cambridge, UK, 1999.
[11] J. von zur Gathen and J. Gerhard, Polynomial factorization over $\mathbb{F}_2$, Math. Comp. 71 (2002), 1677–1698.
[12] J. von zur Gathen and V. Shoup, Computing Frobenius maps and factoring polynomials, Computational Complexity 2 (1992), 187–224. http://www.shoup.net/papers/
[13] J. R. Heringa, H. W. J. Blöte and A. Compagner, New primitive trinomials of Mersenne-exponent degrees for random-number generation, International J. of Modern Physics C 3 (1992), 561–564.
[14] T. Kumada, H. Leeb, Y. Kurita and M. Matsumoto, New primitive t-nomials (t = 3, 5) over GF(2) whose degree is a Mersenne exponent, Math. Comp. 69 (2000), 811–814. Corrigenda: ibid 71 (2002), 1337–1338.
[15] A.-E. Pellet, Sur la décomposition d’une fonction entière en facteurs irréductibles suivant un module premier p, Comptes Rendus de l’Académie des Sciences Paris 86 (1878), 1071–1072.
[16] J. M. Pollard, A Monte Carlo method for factorization, BIT 15 (1975), 331–334.
[17] A. Schönhage, Schnelle Multiplikation von Polynomen über Körpern der Charakteristik 2, Acta Inf. 7 (1977), 395–398.
[18] A. Schönhage and V. Strassen, Schnelle Multiplikation großer Zahlen, Computing 7 (1971), 281–292.
[19] V. Shoup, NTL: A library for doing number theory, Version 5.4.1, http://www.shoup.net/ntl/
[20] A. Steel, personal communications, July 5–9, 2007.
[21] L. Stickelberger, Über eine neue Eigenschaft der Diskriminanten algebraischer Zahlkörper, Verhandlungen des ersten Internationalen Mathematiker-Kongresses, Zürich, 1897, 182–193.
[22] R. G. Swan, Factorization of polynomials over finite fields, Pacific J. Math. 12 (1962), 1099–1106.
[23] G. Woltman et al, GIMPS, The Great Internet Mersenne Prime Search, http://www.mersenne.org/
Mathematical Sciences Institute, John Dedman Building (27), Australian National University, Canberra, ACT 0200, Australia
E-mail address: [email protected]

Centre de Recherche INRIA Nancy - Grand Est, 615 rue du Jardin Botanique, 54600
Villers-lès-Nancy, France
E-mail address: [email protected]
