
CAAM 453/553 · NUMERICAL ANALYSIS I

Lecture 21: Minimax Approximation

In many applications, the L2-norm has a physical interpretation, often associated with some
measure of energy, and this makes continuous least-squares approximation particularly appealing
(e.g., the best approximation minimizes the energy of the error). Moreover, the optimal polynomial
approximation in this norm can be computed at the expense of a few inner products (integrals).
However, like discrete least-squares, this approach suffers from a potential problem: it minimizes
the influence of outlying data, i.e., points where the function f varies wildly over a small portion
of the interval [a, b]. Such an example is shown below.†

[Figure: the function f(x) plotted over x ∈ [−1, 1], with a sharp spike near x = 1/2.]

For this function the L2-norm of the error,

\[ \|f - p_*\|_{L^2} = \left( \int_a^b \big( f(x) - p_*(x) \big)^2 \, dx \right)^{1/2}, \]

averages out the discrepancy f − p∗ over all x ∈ [a, b], so it is possible to have a large error
f(x) − p∗(x) on some narrow range of x values that makes a negligible contribution to the integral.
Below on the left, we compare the function shown above to its degree-5 least-squares approximation;
on the right, we show the error f − p∗, which is small throughout [−1, 1] except for a large spike.

[Figure: left, "L2 Approximation, Degree 5", comparing f(x) with p∗(x) over [−1, 1]; right, "Error in L2 Approximation, Degree 5", showing the error f − p∗, which is small except for a spike near x = 1/2.]

† The function in question is f(x) = sin(πx) + 3 exp(−50(x − 1/2)²). Despite the nasty appearance of the plot, this
function is perfectly smooth: f ∈ C∞[−1, 1].
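To see the effect concretely, here is a small numerical sketch (an addition to these notes, not part of the original): it fits a degree-5 polynomial to this f by discrete least squares on a fine grid, a crude stand-in for the continuous L2 problem, and compares the average error with the worst-case error.

```python
import numpy as np

# The function from the footnote: smooth, but with a sharp bump near x = 1/2.
f = lambda x: np.sin(np.pi * x) + 3 * np.exp(-50 * (x - 0.5) ** 2)

# Discrete least-squares fit of a degree-5 polynomial on a fine grid,
# a stand-in for the continuous L2 projection discussed above.
x = np.linspace(-1, 1, 2001)
c = np.polyfit(x, f(x), deg=5)
err = f(x) - np.polyval(c, x)

print("RMS error   :", np.sqrt(np.mean(err ** 2)))  # modest: the L2 view
print("max |error| :", np.max(np.abs(err)))         # large: the spike survives
```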


Functions of this sort may seem pathological, but they highlight the fact that L2-optimization does
not always generate a polynomial p∗ that is close to f throughout the interval [a, b]. Indeed, in a
number of settings the L2-norm is the wrong way to measure error: we really want to minimize
max_{x∈[a,b]} |f(x) − p(x)|.
3.4 Minimax approximation. The goal of minimizing the maximum deviation of a polynomial p
from our function f is called minimax (or uniform, or L∞) approximation, since

\[ \min_{p \in P_k} \max_{x \in [a,b]} |f(x) - p(x)| = \min_{p \in P_k} \|f - p\|_{L^\infty}. \]

A simple example. Suppose we seek the constant that best approximates f(x) = e^x over the
interval [0, 1], shown below.

[Figure: f(x) = e^x over x ∈ [0, 1].]

Since f(x) is monotonically increasing for x ∈ [0, 1], the optimal constant approximation p∗ = c0
must fall somewhere between f(0) = 1 and f(1) = e, i.e., 1 ≤ c0 ≤ e. Moreover, since f is
monotonic and p∗ is a constant, the function f − p∗ is also monotonic, so the maximum error
max_{x∈[a,b]} |f(x) − p∗(x)| must be attained at one of the end points, x = 0 or x = 1. Thus,

\[ \|f - p_*\|_{L^\infty} = \max\{ |e^0 - c_0|, \ |e^1 - c_0| \}. \]

The following figure shows |e^0 − c0| (broken line) and |e^1 − c0| (dotted line) for c0 ∈ [1, e].

[Figure: the two endpoint errors plotted against c0 ∈ [1, e]; the lines cross at c0 = (1 + e)/2, where both errors equal (e − 1)/2.]


The optimal value for c0 will be the point at which the larger of these two lines is minimal.
The figure above clearly reveals that this happens when the errors are equal, at c0 = (1 + e)/2.
We conclude that the optimal minimax constant polynomial approximation to e^x on x ∈ [0, 1] is
p∗(x) = c0 = (1 + e)/2.
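As a quick numerical sanity check (an addition, not part of the original notes), the sketch below sweeps candidate constants c0 through [1, e] and confirms that the worst-case endpoint error is minimized at (1 + e)/2:

```python
import numpy as np

# Worst-case error max(|e^0 - c0|, |e^1 - c0|) for each candidate constant c0.
c0 = np.linspace(1, np.e, 100001)
worst = np.maximum(np.abs(1 - c0), np.abs(np.e - c0))

print(c0[np.argmin(worst)])   # ~1.85914...
print((1 + np.e) / 2)         # 1.85914...
```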
The plots below compare f to the optimal polynomial p∗ (left), and show the error f − p∗ (right).
We picked c0 to be the point at which the error was equal in magnitude at the end points x = 0
and x = 1; in fact, it is equal in magnitude but opposite in sign,

\[ e^0 - c_0 = -(e^1 - c_0), \]

as seen in the illustration on the right below. It turns out that this property (maximal error
attained at several points in the interval, with alternating sign) is a key feature of minimax
approximation.

[Figure: left, f(x) = e^x together with the constant p∗ = (e + 1)/2 over [0, 1]; right, "Minimax Error, degree n=0", the error f − p∗, equal in magnitude and opposite in sign at x = 0 and x = 1.]

3.4.1. Oscillation Theorem. As hinted in the previous example, the points at which the error
f − p∗ attains its maximum magnitude play a central role in the theory of minimax approximation.
The Theorem of de la Vallée Poussin is a first step toward such a result. We include its proof (from
Süli and Mayers, §8.3) to give a general impression of how such results are established.
Theorem (de la Vallée Poussin Theorem). Let f ∈ C[a, b] and suppose r ∈ Pn is some polynomial
for which there exist n + 2 points {x_j}_{j=0}^{n+1} with a ≤ x_0 < x_1 < · · · < x_{n+1} ≤ b at which the error
f(x) − r(x) alternates in sign, i.e.,

\[ \operatorname{sign}\big( f(x_j) - r(x_j) \big) = -\operatorname{sign}\big( f(x_{j+1}) - r(x_{j+1}) \big) \]

for j = 0, . . . , n. Then

\[ \min_{p \in P_n} \|f - p\|_{L^\infty} \ \ge\ \min_{0 \le j \le n+1} |f(x_j) - r(x_j)|. \]

Before proving this result, we provide a numerical illustration. Here we are approximating f(x) = e^x
with a quintic polynomial, r ∈ P5 (i.e., n = 5). This polynomial is not necessarily the minimax
approximation to f over the interval [0, 1]. However, in the plot below we can see that for this
r, we can find n + 2 = 7 points at which the sign of the error f(x) − r(x) alternates. The broken


line shows the error curve for the optimal minimax polynomial p∗ (whose computation is discussed
below). Here is the point of the de la Vallée Poussin theorem: since the error f(x) − r(x) alternates
in sign at n + 2 points, the minimax error ±‖f − p∗‖_{L∞} (denoted by the horizontal broken lines)
must be at least as large as |f(x) − r(x)| at one of those alternation points. In other words, the
de la Vallée Poussin theorem provides a mechanism for developing lower bounds on ‖f − p∗‖_{L∞}.

[Figure: "Minimax Error, degree n=5": the errors f(x) − r(x) and f(x) − p∗(x) over [0, 1], on a vertical scale of about 2 × 10⁻⁶; horizontal broken lines mark ±‖f − p∗‖_{L∞}.]
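To make the bound concrete, here is a small numerical sketch (an addition to the notes): it builds a trial polynomial r by interpolating e^x at Chebyshev points, tests n + 2 points of alternating error sign, and reports the resulting certified lower bound on the minimax error. The choice of interpolation nodes and test points here is illustrative, not canonical.

```python
import numpy as np

f, n = np.exp, 5

# Trial polynomial r in P_n: interpolate f at the n + 1 Chebyshev points of
# [0, 1]. The nodes lie strictly inside the interval, and since every
# derivative of e^x is positive, the interpolation error f - r takes
# alternating signs on the n + 2 regions the nodes carve out of [0, 1].
k = np.arange(n + 1)
nodes = 0.5 + 0.5 * np.cos((2 * k + 1) * np.pi / (2 * (n + 1)))
r = np.polyfit(nodes, f(nodes), n)

# One test point per region: the endpoints plus midpoints between nodes.
s = np.sort(nodes)
x = np.concatenate(([0.0], (s[:-1] + s[1:]) / 2, [1.0]))

err = f(x) - np.polyval(r, x)
assert np.all(np.sign(err[:-1]) == -np.sign(err[1:]))  # signs alternate
print("certified lower bound on the minimax error:", np.abs(err).min())
```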

Proof. Suppose we have n + 2 ordered points, {x_j}_{j=0}^{n+1} ⊂ [a, b], such that f(x_j) − r(x_j) alternates
sign at consecutive points, and let p∗ denote the minimax polynomial,

\[ \|f - p_*\|_{L^\infty} = \min_{p \in P_n} \|f - p\|_{L^\infty}. \]

We will prove the result by contradiction. Thus suppose

\[ \|f - p_*\|_{L^\infty} < |f(x_j) - r(x_j)| \quad \text{for all } j = 0, \dots, n+1. \tag{21.1} \]

As the left hand side is the maximum of |f(x) − p∗(x)| over all x ∈ [a, b], that difference can be
no larger at x_j ∈ [a, b], and so:

\[ |f(x_j) - p_*(x_j)| < |f(x_j) - r(x_j)| \quad \text{for all } j = 0, \dots, n+1. \tag{21.2} \]

Now consider p∗(x) − r(x) = (f(x) − r(x)) − (f(x) − p∗(x)), which is a polynomial of degree at most n,
since p∗, r ∈ Pn. Equation (21.2) states that f(x_j) − r(x_j) always has larger magnitude than
f(x_j) − p∗(x_j). Thus, regardless of the sign of f(x_j) − p∗(x_j), the magnitude |f(x_j) − p∗(x_j)| is
never large enough to overcome |f(x_j) − r(x_j)|, and hence p∗(x_j) − r(x_j) always has the
same sign as f(x_j) − r(x_j). We know from the hypothesis that f(x) − r(x) must change sign at least
n + 1 times (at least once in each interval (x_j, x_{j+1}) for j = 0, . . . , n), and thus p∗(x) − r(x) ∈ Pn
must do the same. But n + 1 sign changes implies n + 1 roots, and the only polynomial of degree at
most n with n + 1 roots is the zero polynomial; hence p∗ = r. However, this contradicts the strict
inequality in equation (21.1). Thus there must be at least one j for which

\[ \|f - p_*\|_{L^\infty} \ \ge\ |f(x_j) - r(x_j)| \ \ge\ \min_{0 \le i \le n+1} |f(x_i) - r(x_i)|, \]

which is the desired bound.

12 December 2009 21-4 M. Embree, Rice University


CAAM 453/553 · NUMERICAL ANALYSIS I

The following result has the same flavor, but it is considerably more precise (with a more intricate
proof, which we omit).‡
Theorem (Oscillation Theorem). Suppose f ∈ C[a, b]. Then p∗ ∈ Pn is a minimax approximation
to f from Pn on [a, b] if and only if there exist n + 2 points x_0 < x_1 < · · · < x_{n+1} such that

\[ |f(x_j) - p_*(x_j)| = \|f - p_*\|_{L^\infty}, \quad j = 0, \dots, n+1, \]

and

\[ f(x_j) - p_*(x_j) = -\big( f(x_{j+1}) - p_*(x_{j+1}) \big), \quad j = 0, \dots, n. \]

In words, this means that the optimal error f − p∗ attains its maximum magnitude at n + 2 points,
with the error alternating sign between consecutive points.
Note that this result is an if and only if: the oscillation property exactly characterizes the minimax
approximation. If you can present some polynomial p∗ ∈ Pn such that f − p∗ satisfies the oscillation
property, then this p∗ must be the unique minimax approximation!
Theorem (Uniqueness of minimax approximant). The minimax approximant p∗ ∈ Pn of f ∈
C[a, b] over the interval [a, b] is unique.
The proof is a straightforward application of the Oscillation Theorem. One can show that any two
potential minimax polynomials must have the same n + 2 critical oscillation points; any two degree-n
polynomials that agree at n + 2 points must be identical. See Süli and Mayers, Theorem 8.5, for
details.
This oscillation property forms the basis of algorithms that find the minimax approximation:
iteratively adjust an approximating polynomial until it satisfies the oscillation property. The most
famous algorithm for computing the minimax approximation is called the Remez exchange algorithm,
essentially a specialized linear programming procedure. In exact arithmetic, this algorithm
is guaranteed to terminate with the correct answer in finitely many operations.
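The sketch below (an addition to the notes, not a production implementation) illustrates the idea under simplifying assumptions: each pass solves a small linear system for the polynomial whose error is leveled with alternating sign ±δ on a reference set of n + 2 points, then exchanges the reference points for the observed extrema of the error on a fine grid.

```python
import numpy as np

def remez_sketch(f, n, a=0.0, b=1.0, iters=5, grid_pts=5001):
    """Crude Remez exchange iteration for the minimax polynomial in P_n."""
    # Initial reference: n + 2 Chebyshev extreme points mapped to [a, b].
    k = np.arange(n + 2)
    x = np.sort((a + b) / 2 - (b - a) / 2 * np.cos(np.pi * k / (n + 1)))
    grid = np.linspace(a, b, grid_pts)
    for _ in range(iters):
        # Solve the (n + 2) x (n + 2) system
        #   p(x_i) + (-1)^i * delta = f(x_i),   i = 0, ..., n + 1,
        # for the coefficients of p (increasing degree) and the level delta.
        A = np.vander(x, n + 1, increasing=True)
        A = np.hstack([A, ((-1.0) ** np.arange(n + 2))[:, None]])
        *c, delta = np.linalg.solve(A, f(x))
        # Exchange: between consecutive sign changes of the error, keep the
        # grid point of largest |error| as the new reference point.
        err = f(grid) - np.polyval(c[::-1], grid)
        breaks = np.where(np.sign(err[:-1]) != np.sign(err[1:]))[0]
        segs = np.split(np.arange(grid_pts), breaks + 1)
        new_x = np.array([grid[s[np.argmax(np.abs(err[s]))]] for s in segs])
        if len(new_x) != n + 2:   # degenerate exchange; keep the old reference
            break
        x = new_x
    return np.array(c), abs(delta)

c, delta = remez_sketch(np.exp, n=1)
print(c, delta)   # coefficients ~ [0.89406, 1.71828], delta ~ 0.10593
```

For n = 1 this reproduces the values α, β, and δ derived by hand in the example at the end of this lecture.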
The oscillation property is demonstrated in the previous example, where we approximated f(x) = e^x
with a constant. Indeed, the maximum error is attained at two points (that is, n + 2, since n = 0),
and the error differs in sign at those points. The pictures below show the errors f(x) − p∗(x) for
minimax approximations p∗ of increasing degree.§ The oscillation property becomes increasingly
apparent as the polynomial degree increases. In each case, there are n + 2 extreme points of the
error, where n is the degree of the approximating polynomial.

‡ For a proof, see Süli and Mayers, §8.3. Another excellent resource is G. W. Stewart, Afternotes Goes to Graduate
School, SIAM, 1998; see Stewart's Lecture 3.
§ These examples were computed using the COCA package, software written by Bernd Fischer and Jan Modersitzki
that even solves minimax approximation problems when the interval [a, b] is replaced by a region of the complex plane.


[Figure: the errors f(x) − p∗(x) over [0, 1] for minimax approximations of degree n = 2, 3, 4, and 5 ("Minimax Error, degree n=2" through "n=5"); the vertical scales shrink from about 10⁻² down to 10⁻⁶, and each error equioscillates at n + 2 points.]

Example: e^x revisited. Now we shall use the Oscillation Theorem to compute the optimal linear
minimax approximation to f(x) = e^x on [0, 1]. Assume that the minimax polynomial p∗ ∈ P1 has
the form p∗(x) = α + βx. Since f is convex, a quick sketch of the situation suggests the maximal
error will be attained at the end points of the interval, x0 = 0 and x2 = 1. We assume this to
be true, and seek some third point x1 ∈ (0, 1) that attains the same maximal error, δ, but with
opposite sign. If we can find such a point, then by the Oscillation Theorem, we are guaranteed
that the resulting polynomial is optimal, confirming our assumption that the maximal error was
attained at the ends of the interval.
This scenario suggests the following three equations:

\[
\begin{aligned}
f(x_0) - p_*(x_0) &= \delta \\
f(x_1) - p_*(x_1) &= -\delta \\
f(x_2) - p_*(x_2) &= \delta.
\end{aligned}
\]

Substituting our values for x0, x2, and p∗(x) = α + βx, these equations become

\[
\begin{aligned}
1 - \alpha &= \delta \\
e^{x_1} - \alpha - \beta x_1 &= -\delta \\
e - \alpha - \beta &= \delta.
\end{aligned}
\]

The first and third equations together imply β = e − 1. We also deduce that 2α = e^{x_1} − x1(e − 1) + 1.
There are a variety of choices for x1 that will satisfy these conditions, but for such choices δ will not,
in general, be the maximal error. It is key that

\[ |\delta| = \max_{x \in [a,b]} |f(x) - p_*(x)|. \]

To make sure this happens, we can require that the derivative of the error be zero at x1, reflecting
that the error f − p∗ attains a local minimum/maximum at x1. The pictures on the previous page
confirm that this is reasonable.¶ Imposing the condition that f′(x1) − p∗′(x1) = 0 yields

\[ e^{x_1} - \beta = 0. \]

Now we can explicitly solve the equations to obtain

\[
\begin{aligned}
\alpha &= \tfrac{1}{2}\big( e - (e - 1)\log(e - 1) \big) = 0.89406\dots \\
\beta  &= e - 1 = 1.71828\dots \\
x_1    &= \log(e - 1) = 0.54132\dots \\
\delta &= \tfrac{1}{2}\big( 2 - e + (e - 1)\log(e - 1) \big) = 0.10593\dots
\end{aligned}
\]
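These closed-form values are easy to confirm numerically. The check below (an addition to the notes) evaluates the error on a fine grid and verifies that it reaches +δ at the endpoints and −δ at x1:

```python
import numpy as np

beta = np.e - 1                  # from the first and third equations
x1 = np.log(beta)                # interior extremum: e^{x1} = beta
alpha = (np.e - beta * x1) / 2   # 2*alpha = e^{x1} - x1*(e - 1) + 1, e^{x1} = e - 1
delta = 1 - alpha                # from the first equation

x = np.linspace(0, 1, 100001)
err = np.exp(x) - (alpha + beta * x)
print(alpha, beta, x1, delta)    # 0.89406..., 1.71828..., 0.54132..., 0.10593...
print(err.max(), err.min())      # ~ +delta (at x = 0 and 1), ~ -delta (at x1)
```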

An illustration of the optimal linear approximation we have just computed, along with the
associated error, is shown below. Compare this approximation to the L2-optimal linear polynomial
computed at the beginning of our study of continuous least-squares minimization.

[Figure: left, f(x) = e^x and the linear minimax approximation p∗ = α + βx over [0, 1]; right, "Minimax Error, degree n=1", the error f − p∗, which equioscillates between about ±0.106 at x = 0, x1, and 1.]


¶ This requirement need not hold at the points x0 and x2, since these points are on the ends of the interval [a, b];
it is only required at the interior points where the extreme error is attained, x_j ∈ (a, b).
