
Efficient polynomial L∞-approximations

Nicolas Brisebarre
LaMUSE, Université J. Monnet
23, rue du Dr P. Michelon, 42023 St-Étienne Cedex 02
and Projet Arénaire, LIP, 46 allée d’Italie, 69364 Lyon Cedex 07, France
[email protected]
Sylvain Chevillard
LIP (CNRS/ENS Lyon/INRIA/Univ. Lyon 1),
Projet Arénaire, 46 allée d’Italie, 69364 Lyon Cedex 07, France
[email protected]

Abstract

We address the problem of computing a good floating-point-coefficient polynomial approximation to a function, with respect to the supremum norm. This is a key step in most processes of evaluation of a function. We present a fast and efficient method, based on lattice basis reduction, that often gives the best polynomial possible and most of the time returns a very good approximation.

Keywords: efficient polynomial approximation, floating-point arithmetic, absolute error, L∞ norm, lattice basis reduction, closest vector problem, LLL algorithm.

1. Introduction

To evaluate a mathematical function on a computer, in software or hardware, one frequently replaces the function with good polynomial approximations of it. This is due to the fact that floating-point (FP) addition, subtraction and multiplication are carefully and efficiently implemented on modern processors, and also because one may take advantage of existing efficient schemes of polynomial evaluation. Such polynomial approximations are used in several recent elementary function evaluation algorithms [15, 17, 20, 6].

The first natural goal for someone who tries to evaluate a function f is thus to obtain a polynomial p which is sufficiently close to f for the approximation required by the application.

Hence, we are naturally led to consider the problem of getting, for a given continuous real-valued function f, a real interval [a, b] and a degree n ∈ N, the polynomial p ∈ R_n[X] that approximates f best, where R_n[X] is the R-vector space of the polynomials with real coefficients and degree less than or equal to n. The optimal polynomial depends on the way used to measure the quality of the approximation. The most usual choices are the supremum norm (or L∞ norm, or absolute error)

    ||p − f||_{∞,[a,b]} = sup_{a≤x≤b} |p(x) − f(x)|,

or the relative error

    ||p − f||_{rel,[a,b]} = sup_{a≤x≤b} |p(x) − f(x)| / |f(x)|,

or the least-squares norm

    ||p − f||_{2,[a,b]} = ( ∫_a^b w(x) (f(x) − p(x))^2 dx )^{1/2},

where w is a continuous weight function.

The method proposed in this paper aims at minimizing the absolute error between f and polynomials with FP coefficients. In practice, people are mostly interested in minimizing the relative error. This second problem is in general more difficult than the first one (even when searching for real-coefficient polynomials) and we are currently working on this issue. But we can remark that there are numerous situations where the two problems do not differ much. For example, there are a lot of applications where the domain of definition of
the function is first cut into small intervals where the order of magnitude of the function is constant. In these cases, the absolute and relative errors become almost proportional. Hence minimizing the absolute error instead of the relative one is in general quite satisfactory.

From now on, we will focus on the supremum norm, which we shall write ||·||_[a,b] instead of ||·||_{∞,[a,b]} in the sequel. Among important theoretical works by various mathematicians (such as Bernstein, Weierstrass and Lagrange), the work of Chebyshev in this area of L∞ function approximation is especially remarkable. He showed in particular that when p runs over R_n[X], ||f − p||∞ reaches a minimum at a unique polynomial, and he gave a very precise characterization of this best polynomial approximant (see [5] for example). From that characterization, Remez [19] designed an algorithm that makes it possible to compute the best L∞ polynomial approximation, also called the minimax approximation (see [5] for a proof of the algorithm), in a fairly short time (see [24] for a proof of its quadratic convergence). Therefore, the general situation for L∞ approximation by real polynomials can be considered quite satisfactory. The problem for the scientist who implements such approximations in software or hardware is that he uses finite-precision arithmetic and, unfortunately, most of the time the minimax approximation given by Chebyshev's theorem and computed by Remez' algorithm has coefficients which are transcendental (or at least irrational) numbers, hence not exactly representable with a finite number of bits.

Thus, the coefficients of the approximation usually need to be rounded according to the requirements of the targeted application (for example, in current software implementations, one often uses FP numbers in IEEE single or double precision for storing the coefficients of the polynomial approximation). But this rounding, if carelessly done, may lead to an important loss of accuracy. For instance, if we choose to round each coefficient of the minimax approximation to the nearest number in the required format (this yields a polynomial that we will call the rounded minimax in the sequel of the paper), the quality of the approximation we get can be very poor. To illustrate that problem, let us look at the following simple example. We want to approximate the function f : x ↦ √2 + πx + e·x^2 on the interval [2, 4] by a degree-2 polynomial with IEEE double precision FP coefficients. The minimax approximation is the function f itself. If we round the coefficients of f to the closest doubles, we obtain the polynomial

    P = 6369051672525773/4503599627370496
      + 884279719003555/281474976710656 · X
      + 6121026514868073/2251799813685248 · X^2

and we have ||f − P||∞ ≈ 2.70622 · 10^−15. But the best degree-2 polynomial with double precision coefficients is

    P* = 6369051672525769/4503599627370496
       + 3537118876014221/1125899906842624 · X
       + 6121026514868073/2251799813685248 · X^2

for which ||f − P*||∞ ≈ 2.2243 · 10^−16. Therefore, there is a factor bigger than 10 between the optimum and the rounded polynomial.

In 2005, Brisebarre, Muller and Tisserand proposed in [3] a first general method for tackling this problem. Given a fixed finite sequence m0, m1, ..., mn of natural integers, they search for the polynomials of the form

    p(x) = a0/2^m0 + a1/2^m1 · x + ··· + an/2^mn · x^n   with ai ∈ Z

that minimize ||f − p||∞. The approach given in [3] consists in, first, carefully constructing a certain rational polytope containing all the solutions (a0, ..., an) ∈ Z^{n+1} and as few other elements of Z^{n+1} as possible and, second, efficiently scanning all the points with integer coordinates that lie inside this polytope.

This approach, which currently makes it possible to obtain polynomials of degree up to 10, has two major drawbacks. First of all, its complexity is not precisely estimated, but it might be exponential in the worst case, since the scanning of the integer points of the polytope is done using linear rational programming. The other problem is that the efficiency of the method relies on the relevance of a preliminary estimate of the optimal error ||f − p||∞. If this error is overestimated, then the polytope may contain too many integer points and the scanning step may become intractable. If this error is underestimated, then the polytope contains no solution. Hence, we were led to design a tool that could give us a better insight into the value of the optimal error, in order to speed up the method of [3]. To do so, we developed a new approach based on lattice basis reduction and in particular on the LLL algorithm. But, indeed, that tool proved to be far more useful than expected: it gives most of the time an excellent approximant (if not the best) and
this is done quickly and at low memory cost. The goal of this paper is to present this new approach.

The outline of the paper is the following. In the second section, we recall basic facts about lattices, about the closest vector problem (CVP), an algorithmic problem related to lattices that appears naturally in our approach, and about an algorithm that solves an approximated version of the CVP. Then we present our new approach in Section 3 and give some worked examples in Section 4, before giving a brief conclusion in the last section.

2. A reminder on Lattice Basis Reduction and the LLL Algorithm

Let x = (x1, ..., xℓ) ∈ R^ℓ. We set ||x||_2 = (x|x)^{1/2} = (x1^2 + ··· + xℓ^2)^{1/2} and ||x||∞ = max_{1≤i≤ℓ} |xi|.

A lattice is a discrete algebraic object that is encountered in several domains of various sciences, such as mathematics, computer science or chemistry. It is a rich and powerful modelling tool thanks to the deep and numerous available theoretical, algorithmic and implementation works (see [4, 9, 14, 22] for example). The lattice structure is ubiquitous in our approach.

Definition 2.1 Let L be a nonempty subset of R^ℓ. L is a lattice if and only if there exists a set of R-linearly independent vectors b1, ..., bd such that

    L = {λ1 b1 + ··· + λd bd : λ1, ..., λd ∈ Z}.

The family (b1, ..., bd) is a basis of the lattice L and d is called the rank of the lattice L.

In the sequel, we will have to deal with the following algorithmic problem.

Problem 2.2 (Closest vector problem (CVP)) Let ||·|| be a norm on R^ℓ. Given a basis of a lattice L ⊂ Q^ℓ of rank k, k ≤ ℓ, and x ∈ R^ℓ, find y ∈ L such that ||x − y|| = dist(x, L), where dist(x, L) denotes min{||x − y||, y ∈ L}.
Associated approximation problem: find y ∈ L such that ||x − y|| ≤ γ dist(x, L), where γ ∈ R is fixed.

Let us recall some of the complexity results known about this problem¹. In 1981, van Emde Boas [23] proved that CVP is NP-hard (see also [16]). On the other hand, Goldreich and Goldwasser showed (but their proof is nonconstructive) in [8] that approximating CVP, in the Euclidean norm case (CVP2), to a factor √(d / log d) is not NP-hard. Regarding the infinity norm (CVP∞), they also show that approximating CVP to a factor d / log d is not NP-hard². Unfortunately, no polynomial algorithm is known for approximating CVP to a polynomial factor.

¹ A presentation of different algorithms for solving the CVP is given in [1].
² As noted in [18], it seems that the CVP problem is more difficult when the norm is the infinity norm.

However, if we relax the constraint on the factor, the situation becomes far better. In 1982, Lenstra, Lenstra and Lovász [13] gave an algorithm that allows one to get "fairly" short vectors of the lattice in polynomial time. Among many important applications of that algorithm, Babai [2] proposed a polynomial algorithm for solving CVP with an exponential approximation factor. For x ∈ R, we denote by ⌊x⌉ the integer nearest to x, defined by ⌊x⌉ = ⌊x + 1/2⌋.

Algorithm 1: Babai's nearest plane algorithm (ApproximatedCVP)
    Data: an LLL-reduced basis (bi)_{1≤i≤d}; its Gram-Schmidt
          orthogonalization (bi*)_{1≤i≤d}; a vector v
    Result: an approximation to the factor 2^{d/2} of the CVP2 of v
    begin
        t := v;
        for (j = d; j ≥ 1; j--) do
            t := t − ⌊⟨t, bj*⟩ / ⟨bj*, bj*⟩⌉ · bj;
        end
        return v − t;
    end

Remark 2.3 In practice, the result of Babai's algorithm is of better quality, and obtained faster, than expected.

For a practical exact CVP algorithm, see [12], in which the given algorithm is super-exponential. This complexity has recently been improved in [10].

3. Our approach

In many applications, scientists and engineers wish to find the best (or a very good) polynomial approximation among those that have FP coefficients, and this is the problem we address in this paper. First, let us make the following important


remark: to solve that problem, it is sufficient to be able to solve the following problem:

Problem 3.1 Let f be a continuous real-valued function defined on a given real interval [a, b], let n ∈ N and let (mi)_{0≤i≤n} be a finite sequence of rational integers; give one polynomial of the form

    p = a0/2^m0 + a1/2^m1 · X + ··· + an/2^mn · X^n,

where the ai ∈ Z, that minimizes (or at least makes very small) ||f − p||_[a,b].

The problem we address can be reduced to this one by the following heuristic. Let us denote by P the best polynomial with FP coefficients and by R the minimax. For the sake of simplicity, we will assume here that all the coefficients of P have the same precision t (but the arguments would be the same with several different precisions for the coefficients). In many cases (in particular if t is big enough), the order of magnitude of the i-th coefficient of P is the same as that of the corresponding coefficient of R. Hence, we can suppose, initially, that the exponents of these coefficients are the same. From Remez' algorithm, we know the i-th coefficient of R and thus its exponent e. Since the i-th coefficient of P is a FP number with precision t, and since we may suppose that its exponent is e, we can write it 1.u1 ... u_{t−1} · 2^e, where 1u1 ... u_{t−1} is the unknown binary mantissa of the coefficient. This can be rewritten 1u1...u_{t−1} / 2^{t−1−e} and we can set mi = t − 1 − e.

However, the guessed exponent may slightly differ from the real optimal one. In that case, the computed coefficient of P will be of the form ai/2^mi but with an ai that needs more than t bits to be written. Our heuristic is then to use this computed coefficient as a new insight into the order of magnitude of the optimal one. We note e' its exponent, set mi to t − 1 − e', and try again until we reach a fixed point.

This is just a heuristic and we have no proof of convergence for that process. However, in every practical case we met, this procedure was observed to converge in at most two or three steps. In particular, we (sometimes, but rarely) encountered situations where the initial assumption that the coefficients of P and R have the same order of magnitude was completely wrong. But in those cases, this heuristic also converged in two or three steps and let us find a polynomial with FP coefficients which was really good.

Now that we have reduced our general problem to Problem 3.1, we start explaining how we solve it. The first idea is to discretize the continuous problem: let ℓ ≥ n + 1 and let x1, ..., xℓ ∈ [a, b]; we want a0/2^m0 + a1/2^m1 · xi + ··· + an/2^mn · xi^n to be as close as possible to f(xi) for all i = 1, ..., ℓ. That is to say, we want the vectors

    ( a0/2^m0 + a1/2^m1 · x1 + ··· + an/2^mn · x1^n )         ( f(x1) )
    ( a0/2^m0 + a1/2^m1 · x2 + ··· + an/2^mn · x2^n )   and   ( f(x2) )
    (                       ...                     )         (  ...  )
    ( a0/2^m0 + a1/2^m1 · xℓ + ··· + an/2^mn · xℓ^n )         ( f(xℓ) )

to be as close as possible, with respect to the ||·||∞ norm. This can be rewritten as: we want the vectors

    a0·v0 + a1·v1 + ··· + an·vn,   where vj = ( x1^j/2^mj, x2^j/2^mj, ..., xℓ^j/2^mj )ᵀ,

and y = ( f(x1), ..., f(xℓ) )ᵀ to be as close as possible. Hence, we have to find (a0, ..., an) ∈ Z^{n+1} that minimizes ||a0 v0 + ··· + an vn − y||∞: this is an instance of CVP∞.

Two problems arise at this point: first, how to choose the points xi, and second, how to deal with the associated CVP∞?

3.1. Choice of the points xi

This is a critical step in our approach. We want a small value of ||a0 v0 + ··· + an vn − y||∞ to mean a small value of the supremum norm ||p − f||_[a,b]. More precisely, requiring ||a0 v0 + ··· + an vn − y||∞ to be small can be viewed as an approximate interpolation problem, and it is well known that, if the points xi are carelessly chosen, one may find a polynomial that coincides with the function at the points xi but is pretty far from it on the whole of [a, b] (one may consider the classical Runge example [7, Chap. 2] for instance).

Our choice of the points relies on the observation that, when the imposed precision of the coefficients is big enough (i.e. with mi big enough, which is the case with double FP coefficients for instance), the good polynomial approximants to the function will be close to the minimax approximation. Hence, we generally choose the points where f and the degree-n minimax polynomial R
are the closest possible, i.e. where f − R vanishes. The following classical result (see [5, Chap. 3] for a more general statement) tells us that there are at least n + 1 such points.

Theorem 3.2 (Chebyshev) A polynomial p ∈ R_n[X] is the best approximation of f in [a, b] if and only if there exist a ≤ y0 < y1 < ··· < yn+1 ≤ b such that

• ∀i ∈ {0, ..., n+1}, |f(yi) − p(yi)| = ||f − p||∞,

• ∀i ∈ {0, ..., n}, f(yi+1) − p(yi+1) = −(f(yi) − p(yi)).

Another possible choice is the Chebyshev points [7, Chap. 2]. They are known to be good choices in some approximation problems (for example, they are the starting points in Remez' algorithm). They proved to be a pretty valuable choice in the experiments we made, especially when the size of the mi is low.

3.2. Solving the CVP∞

Kannan's algorithm [12] makes it possible to solve either CVP2 or CVP∞, but its complexity is super-exponential. Since the Euclidean and infinity norms have the same order of magnitude (remember that, in R^ℓ, ||·||∞ ≤ ||·||_2 ≤ √ℓ ||·||∞), we preferred to use Babai's algorithm, which solves CVP2 with, in theory, an exponential approximation factor, but whose performance, in time, space and quality of the result, is directly related to LLL's, hence very good in practice for n, say, not greater than 50.

We may then take the approximation of CVP2 provided by Babai's algorithm as the sought approximation of CVP∞. But we can also refine that result in the following way. We can use the LLL-reduced basis to explore the neighborhood of the computed approximated CVP2. Let us denote by v the vector given by Babai's nearest plane algorithm and by (ε0, ..., εn) the LLL-reduced basis. LLL gives pretty short vectors in the lattice: for small values of n (say up to 50), ε0 is often the shortest nonzero vector in the lattice (in norm ||·||_2) and the other vectors of the basis are also short. Since ||·||∞ ≤ ||·||_2 ≤ √ℓ ||·||∞ in R^ℓ, they are also pretty short vectors for ||·||∞. If v is not the exact CVP∞, it is maybe not so far from it: the exact CVP∞ is probably of the form v + a0 ε0 + ··· + an εn, where the ai are rational integers with small absolute value. We can thus explore the neighborhood of v with the following (convergent) method:

1. Let w denote successively the 2n + 2 following vectors: v + ε0, v − ε0, ..., v + εn, v − εn.

2. If there is no w such that ||w − y||∞ < ||v − y||∞, we may think that the quality of v is quite good: return v.

3. Else, denote by wopt one w such that ||w − y||∞ is minimal. Set v := wopt and go to 1.

4. Examples

We will now present two examples that illustrate the behavior of our algorithm in real situations. Our first example shows the possible benefit that may come from the use of our method in the implementation of a function for the Intel Itanium. The second example well illustrates the gap between an (almost) optimal polynomial and the rounded minimax. This example comes from the implementation of the function arcsin in the library CRlibm [21].

To implement our method, we have used Maple and GP³. Maple lets us compute the minimax polynomial R (Remez' algorithm is available in Maple with the function minimax of the package numapprox), which is used to determine the points xi by solving the equation R(x) = f(x). Then, we use GP to compute an LLL-reduced basis and to solve the approximated CVP2 with Babai's nearest-plane algorithm. We thus get a polynomial P and we finally use Maple to compute ||P − f||_[a,b] (with Maple's numapprox[infnorm]) and compare it to ||R − f||_[a,b].

³ GP is an interactive calculator for number theory algorithms. In particular, it implements the LLL algorithm. It is distributed under the GPL at http://pari.math.u-bordeaux.fr/

Applying our method to an Itanium implementation of a function

The Intel Itanium processor uses a particular FP format called extended double: it is a FP format with a 64-bit mantissa. An extended double allows more accurate computations, but this gain in accuracy has a cost: extended doubles must be loaded into cache one after the
other, whereas two regular doubles may be loaded at the same time. Moreover, the latency of such an operation is 6 cycles (whereas a multiplication costs 4 cycles). Thus the loading time is quite critical in such a processor and it is very interesting to replace an extended double with a double when the accuracy does not require it. See [6] and [11] for more information about the Itanium architecture.

Intel developed and included in the glibc a mathematical library optimized for the Itanium. Let us consider an example inspired by the implementation of the function erf in this library. The function erf is defined by

    erf(x) = (2/√π) ∫_0^x e^{−t²} dt   for all x ∈ R.

The domain is cut into several intervals where the function can be approximated by polynomials of reasonable degree (say less than 30). This example deals with erf on the interval [1, 2]. The goal is to provide a polynomial, with extended and/or regular double coefficients, which approximates the function with a relative error bounded by 2^−64. The domain is first translated to the interval [0, 1]. The problem is thus reduced to the following: find a polynomial p(x) = a0 + a1 x + ··· + an x^n such that ||p(x) − erf(x + 1)||_{rel,[0,1]} < 2^−64.

The minimax of degree 18 gives an error of 2^−61.19. A fortiori, a polynomial of degree 18 with FP coefficients cannot provide the required accuracy. The minimax of degree 19 gives an error of 2^−66.92. Can we find a satisfying polynomial of degree 19?

A common strategy consists, when 0 belongs to the definition interval, in using a bigger precision for the first coefficients and a smaller precision for the last ones. Setting the first 8 coefficients to be extended doubles and the others to be doubles, and rounding each coefficient of the minimax to the nearest in the corresponding format, we obtain an error of 2^−61.13: it is not sufficient. However, using 9 extended doubles and the same classical procedure, we obtain an error of 2^−64.74, which is enough.

When we use our method, we obtain a polynomial with an error of 2^−64.74 and only two extended doubles. We saved 7 extended doubles thanks to our method. The computing time is the time necessary for Maple to compute the minimax approximation, since the other steps are performed instantaneously.

Second example: an example from CRlibm

The first example showed that our method makes it possible to reduce the size of the coefficients given a target accuracy. This can also be very interesting in hardware applications, where each single bit may count.

Another possibility offered by our method is to improve the accuracy provided by the polynomial approximation (compared to the rounded minimax, for example) while keeping the same precision for the coefficients. Our second example comes from the implementation of the function arcsin in CRlibm. CRlibm is a mathematical library which provides correct rounding: the value returned by CRlibm when evaluating a function is the exact mathematical value rounded to the nearest double. To achieve this goal, the developers of CRlibm must use a bigger precision than the standard double precision provided by the processor. They use special formats such as double-double and triple-double, which are unevaluated sums of two or three regular doubles. In particular, they approximate functions by polynomials with double-double and triple-double coefficients (and also regular doubles).

Basically, a double-double gives the same precision as a FP number with a 106-bit mantissa. However, a number of the form 1.x1 x2 ... x52 000000001 y1 y2 ... y52 · 2^e is representable as the double-double (1.x1 x2 ... x52 · 2^e + 1.y1 y2 ... y52 · 2^{e−8−53}) yet is not representable in a 106-bit format. This limitation is not a problem in general: we can often prove a priori that, to achieve a given precision, the coefficients of a polynomial are so constrained that the first 53 bits are fixed. Thus, one of the two doubles of the double-double is known and we are back to the problem of searching for a regular double (that is, a FP number with at most 53 bits). At worst, we cannot prove that the first 53 bits are fixed and we just search for numbers with 106 bits: the resulting number is representable with a double-double (it may just be nonoptimal).

We focus here on the approximation of arcsin on the interval [0.79, 1]. It is a real-life example we worked on with C. Lauter (who is one of the developers of CRlibm). After an argument reduction we obtain the problem of approximating

    g(z) = ( arcsin(1 − (z + m)) − π/2 ) / √(2·(z + m))

where −0.110 ≤ z ≤ 0.110 and m ≈ 0.110. The developers know from previous computations that they need an accuracy of more than 119 bits to achieve correct rounding. The minimax of degree 21 provides this accuracy (see Figure 1).
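The double-double representation discussed above (an unevaluated sum of two regular doubles) can be illustrated with a short Python sketch. This is only an illustration of the splitting idea, not the paper's Maple/GP code; the function name and the 40-digit value of π used as test data are our own choices.

```python
from decimal import Decimal, getcontext

getcontext().prec = 60  # work with ~60 significant decimal digits

def double_double(x):
    """Split a high-precision value x (a Decimal) into an unevaluated sum
    of two IEEE doubles: hi = nearest double to x, lo = nearest double
    to the remainder x - hi."""
    hi = float(x)                 # correctly rounded to double
    lo = float(x - Decimal(hi))   # rounding error, rounded again
    return hi, lo

# Illustrative high-precision value: 40 digits of pi.
pi40 = Decimal("3.141592653589793238462643383279502884197")
hi, lo = double_double(pi40)

# The pair (hi, lo) carries roughly 106 bits (~32 decimal digits):
# the residual after subtracting both doubles is tiny.
residual = abs(pi40 - Decimal(hi) - Decimal(lo))
print(hi, lo, residual)
```

The residual left after subtracting both doubles is below 10^−31, consistent with the roughly 106 bits (about 32 decimal digits) of precision stated above for a double-double.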
Figure 1. Binary logarithm of the absolute error of several approximants:

    Target              -119
    Minimax             -119.83
    Rounded minimax     -103.31
    Our polynomial      -119.77

Our method gives a satisfying polynomial with two triple-doubles, eight double-doubles and twelve regular doubles (here again, the computing time is the time needed by the Remez algorithm for computing the minimax approximation).

More precisely, we searched for a polynomial of the form

    p0 + p1 x + p2 x^2 + ··· + p9 x^9 + p10 x^10 + ··· + p21 x^21,

where the pi are FP numbers whose mantissas have the following sizes in bits: 173 for p0, 159 for p1, 106 for each of p2, ..., p9, and 53 for each of p10, ..., p21. Note that the first coefficient is actually searched for with the method described above: we prove that the first 53 bits are fixed and we then search for a FP number with 106 bits for the two remaining doubles. The number 173 indicated for p0 only means that the 159 bits of this triple-double are nonconsecutive.

The accuracy provided by our polynomial is very close to the one given by the minimax. However, without our method, we would have to use the rounded minimax, which gives only 103 bits of accuracy: it would not be enough to provide correct rounding. This example illustrates the gap between the rounded minimax and the optimal polynomial with FP coefficients. Here, 16 precious bits are lost by the rounding of the minimax' coefficients. This loss can be seen very well if the absolute errors are plotted: Theorem 3.2 indicates that a polynomial is optimal if and only if the absolute error oscillates n + 2 times between its extrema. Our polynomial almost satisfies this theorem (see Figure 2). On the other hand, one can see in Figure 3 that the rounded minimax does not satisfy this theorem at all.

Figure 2. Absolute error between g and our polynomial. [plot not reproduced]

Figure 3. Absolute error between g and the rounded minimax. [plot not reproduced]
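The computations in both examples rest on Babai's nearest-plane procedure (Algorithm 1, Section 2). The following is a minimal Python sketch of it, under the simplifying assumption that the input basis is already LLL-reduced (the LLL step itself is skipped); the toy 2-dimensional lattice and target vector are our own choices for illustration.

```python
from math import floor

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def gram_schmidt(basis):
    """Classical Gram-Schmidt orthogonalization (no normalization)."""
    ortho = []
    for b in basis:
        t = list(b)
        for bs in ortho:
            mu = dot(t, bs) / dot(bs, bs)
            t = [ti - mu * bi for ti, bi in zip(t, bs)]
        ortho.append(t)
    return ortho

def nearest_plane(basis, v):
    """Babai's nearest-plane algorithm: return a lattice vector close to v.
    Round-to-nearest is floor(x + 1/2), as in the paper's definition."""
    ortho = gram_schmidt(basis)
    t = list(v)
    for j in range(len(basis) - 1, -1, -1):
        c = floor(dot(t, ortho[j]) / dot(ortho[j], ortho[j]) + 0.5)
        t = [ti - c * bi for ti, bi in zip(t, basis[j])]
    return [vi - ti for vi, ti in zip(v, t)]

# Toy example: lattice spanned by (2, 0) and (1, 2); target (1.4, 3.3).
w = nearest_plane([[2.0, 0.0], [1.0, 2.0]], [1.4, 3.3])
print(w)  # -> [2.0, 4.0], the lattice point 2*(1, 2)
```

On this basis and target, the procedure returns the lattice point (2, 4). In the setting of Section 3.2, this output plays the role of the vector v, and the LLL-reduced basis plays the role of (ε0, ..., εn) in the neighborhood exploration.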

5. Conclusion

We have presented a new method for computing efficient polynomial approximations with machine-number coefficients, and in particular FP coefficients. It improves on the results provided by the existing Remez-based method. The method, based on lattice basis reduction, is much faster and more efficient than the one given in [3] and gives a very good approximant. Moreover, it may considerably speed up the polytope-based method of [3], which gives the best polynomial possible, by providing a relevant (indeed often very good) estimate of the L∞ approximation error.

This method can be adapted to several kinds of coefficients: fixed-point format, multi-double, or classical floating-point arithmetic with several precision formats.

For some applications, one may want to set the value of certain coefficients of the approximant. For example, one may search for approximants whose first terms match the corresponding ones in the Taylor expansion of the approximated function. We should be able to deal with this additional constraint in the near future.

We are currently working on an adaptation of this method to the problem of minimizing the relative error, and we plan to extend it to the problem of approximation by sums of cosines, which arises in the field of signal processing and more specifically in FIR filters.

References

[1] E. Agrell, T. Eriksson, A. Vardy, and K. Zeger. Closest point search in lattices. IEEE Transactions on Information Theory, 48(8):2201–2214, Aug. 2002.
[2] L. Babai. On Lovász' lattice reduction and the nearest lattice point problem. Combinatorica, 6(1):1–13, 1986.
[3] N. Brisebarre, J.-M. Muller, and A. Tisserand. Computing machine-efficient polynomial approximations. ACM Transactions on Mathematical Software, 32(2), June 2006.
[4] J. W. S. Cassels. An Introduction to the Geometry of Numbers. Classics in Mathematics. Springer-Verlag, Berlin, 1997. Corrected reprint of the 1971 edition.
[5] E. W. Cheney. Introduction to Approximation Theory. AMS Chelsea Publishing, second edition, 1982.
[6] M. Cornea, J. Harrison, and P. T. P. Tang. Scientific Computing on Itanium-Based Systems. Intel Press, 2002.
[7] W. Gautschi. Numerical Analysis: An Introduction. Birkhäuser Boston Inc., Boston, MA, 1997.
[8] O. Goldreich and S. Goldwasser. On the limits of non-approximability of lattice problems. In Proceedings of the 30th Annual ACM Symposium on Theory of Computing (STOC), pages 1–9. ACM, May 1998.
[9] P. M. Gruber and C. G. Lekkerkerker. Geometry of Numbers, volume 37 of North-Holland Mathematical Library. North-Holland Publishing Co., Amsterdam, second edition, 1987.
[10] G. Hanrot and D. Stehlé. Tighter analysis of Kannan's enumeration algorithm. 2007. Submitted.
[11] Intel Corporation. Intel Itanium 2 processor reference manual for software development and optimization. Technical Report 251110-003, Intel, 2004.
[12] R. Kannan. Minkowski's convex body theorem and integer programming. Math. Oper. Res., 12(3):415–440, 1987.
[13] A. K. Lenstra, H. W. Lenstra, and L. Lovász. Factoring polynomials with rational coefficients. Math. Annalen, 261:515–534, 1982.
[14] L. Lovász. An Algorithmic Theory of Numbers, Graphs and Convexity, volume 50 of CBMS-NSF Regional Conference Series in Applied Mathematics. SIAM, Philadelphia, PA, 1986.
[15] P. Markstein. IA-64 and Elementary Functions: Speed and Precision. Hewlett-Packard Professional Books. Prentice Hall, 2000.
[16] D. Micciancio. The hardness of the closest vector problem with preprocessing. IEEE Trans. Inform. Theory, 47(3):1212–1215, 2001.
[17] J.-M. Muller. Elementary Functions, Algorithms and Implementation. Birkhäuser, Boston, 1997.
[18] P. Q. Nguyen and J. Stern. The two faces of lattices in cryptology. In Proceedings of CALC '01, volume 2146 of Lecture Notes in Computer Science, pages 146–180. Springer-Verlag, 2001.
[19] E. Remes. Sur un procédé convergent d'approximations successives pour déterminer les polynômes d'approximation. C.R. Acad. Sci. Paris, 198:2063–2065, 1934.
[20] S. Story and P. T. P. Tang. New algorithms for improved transcendental functions on IA-64. In Koren and Kornerup, editors, Proceedings of the 14th IEEE Symposium on Computer Arithmetic, pages 4–11, Los Alamitos, CA, Apr. 1999. IEEE Computer Society Press.
[21] The Arénaire Project. CRlibm, Correctly Rounded mathematical library, July 2006. http://lipforge.ens-lyon.fr/www/crlibm/.
[22] V. Shoup. NTL, a library for doing number theory, version 5.4. http://shoup.net/ntl/, 2005.
[23] P. van Emde Boas. Another NP-complete problem and the complexity of computing short vectors in a lattice. Technical Report 81-04, Mathematische Instituut, University of Amsterdam, 1981.
[24] L. Veidinger. On the numerical determination of the best approximations in the Chebyshev sense. Numerische Mathematik, 2:99–105, 1960.