Pade
Flemming Topsøe ∗
University of Copenhagen
[email protected]
Abstract
Bounds for the logarithmic function are studied. In particular, we
establish bounds with rational functions as approximants. The study
leads into the fascinating areas of Padé approximations ([2], [6]), con-
tinued fractions ([7], [11]) and orthogonal polynomials ([14], [4]) as
well as the somewhat frightening jungle of special functions and asso-
ciated identities ([5], [9]). Originally, the results aimed at establishing
certain inequalities for Shannon entropy but are here discussed in their
own right (the applications to entropy inequalities will be published
elsewhere).
The exposition is informal, a kind of essay, with only occasional
indications of proofs. The reader may take it as an invitation to further
studies. Enough details are provided to enable the reader to verify all
statements. To the expert in the fields pointed to there is little or
nothing new.
1 Basic inequalities
Consider first the truly basic inequalities:
$$ 1 - \frac{1}{x} \le \ln x \le x - 1 \quad \text{for } x > 0 . \tag{1} $$
Here and in all inequalities below it is understood that the inequalities shown
are strict, except in easily recognizable cases of equality. One may prefer to
write (1) in the form
$$ \frac{x}{1+x} \le \ln(1+x) \le x \quad \text{for } x > -1 . \tag{2} $$
∗ This work was supported by the Danish Natural Science Research Council.
The bounds (1) and (2) involve rational functions of type [1, 1] (the left-
hand-side) and of type [1, 0] (the right-hand-side). Here, the type refers to the
degrees of numerator and denominator in the rational function concerned.
The simplicity, generality and usefulness of (1) and (2) are unbeatable.
Nevertheless, there are many other interesting inequalities for the logarithmic
function. For instance, if we separate positive and negative values of x, neither
of the following two inequalities is much more complex:
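$$ \frac{2x}{2+x} \le \ln(1+x) \le \frac{x}{2} \cdot \frac{2+x}{1+x} \quad \text{for } 0 \le x < \infty , \tag{3} $$
$$ \frac{2x}{2+x} \ge \ln(1+x) \ge \frac{x}{2} \cdot \frac{2+x}{1+x} \quad \text{for } -1 < x \le 0 \tag{4} $$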
–and these inequalities are sharper than (2). Also note that the first inequal-
ity of (3) is an improvement, for x ≥ 0, over the first inequality of (2) with
a rational function of the same type – whereas the second inequality of (2)
cannot, of course, be improved in a similar way.
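As a quick numerical sanity check of (2), (3) and (4), one may evaluate the bounds on a grid of sample points. The following Python sketch is only an illustration; the function names and the choice of grid are our own assumptions, and a finite grid is of course no proof.

```python
import math

def lower2(x):
    """Left-hand side of (2): x / (1 + x)."""
    return x / (1 + x)

def lower3(x):
    """Left-hand expression in (3) and (4): 2x / (2 + x)."""
    return 2 * x / (2 + x)

def upper3(x):
    """Right-hand expression in (3) and (4): (x / 2) (2 + x) / (1 + x)."""
    return 0.5 * x * (2 + x) / (1 + x)

def check(points):
    """Return True if (2), (3) and (4) hold at every sample point."""
    ok = True
    for x in points:
        log = math.log1p(x)
        ok &= lower2(x) <= log <= x                    # (2), valid for all x > -1
        if x >= 0:
            # (3), which sits inside the bounds of (2) for x >= 0
            ok &= lower2(x) <= lower3(x) <= log <= upper3(x) <= x
        else:
            ok &= lower3(x) >= log >= upper3(x)        # (4): the roles are reversed
    return ok

positive = [k / 10 for k in range(0, 101)]     # 0 <= x <= 10
negative = [-k / 100 for k in range(1, 100)]   # -0.99 <= x < 0
print(check(positive + negative))
```

The chained comparison for x ≥ 0 also makes visible that (3) is sharper than (2) on that range.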
The inequalities (3) and (4) may be written as a single double inequality,
e.g. by taking absolute values. Another possibility is to define the function
λ by
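$$ \lambda(x) = \frac{\ln(1+x)}{x} \tag{5} $$
($\lambda$ represents the slope of the chord on the graph of the function $x \mapsto \ln(1+x)$ which
connects $(0, 0)$ with $(x, \ln(1+x))$). By continuity, $\lambda(0) = 1$. Then
$$ \frac{2}{2+x} \le \lambda(x) \le \frac{2+x}{2+2x} \quad \text{for } x > -1 . \tag{6} $$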
For some possibilities to sharpen the bounds (3) and (4), see (22) and
(23) further on. However, the form chosen is convenient for a special reason.
Actually, (3) and (4) are equivalent, as follows by writing $\ln(1+x)$ in the
form $-\ln\left(1 - \frac{x}{1+x}\right)$. To further exploit this observation, we introduce a
notion of duality between functions defined on $[0, \infty[$ and functions defined
on $]-1, 0]$. The dual of one such function, say $\varphi$, is defined by the relation
$$ \varphi^{*}(x) = -\varphi\left(\frac{-x}{1+x}\right) . \tag{7} $$
Inequalities of the form φ(x) ≤ ln (1 + x) ≤ ψ(x) for x ∈ [0, ∞[ (respect-
ively, for x ∈] − 1, 0]) then translate into inequalities φ∗ (x) ≥ ln (1 + x) ≥
ψ ∗ (x) for x ∈] − 1, 0] (respectively, for x ∈ [0, ∞[).
In (3) and (4) we have an instance with two self-dual functions in the
sense that for the two analytic functions – say φ and ψ – given on the full
interval ] − 1, ∞[ by, respectively the left-hand and the right-hand expression
in (3) and (4), we find that
and similarly for ψ (here, the subscripts refer to restrictions of the domain of
definition). As the functions we shall consider will be analytic in ] − 1, ∞[,
we do not find it necessary to indicate the restrictions to the appropriate
interval since the analytic form of the functions will be the same whether we
consider the restriction to [0, ∞[ or to ] − 1, 0]. For the functions above we
may thus write simply φ∗ = φ and ψ ∗ = ψ.
Let us return to (2). Also there, duality is relevant. Indeed, writing (2)
in the form φ(x) ≤ ln (1 + x) ≤ ψ(x), we find that φ∗ = ψ (and ψ ∗ = φ), i.e.
(φ, ψ) is a dual pair. This explains why there is no restriction on x in (2),
except the natural one, x > −1.
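The duality statements above are identities between rational functions, so they can be checked exactly at rational sample points. The following Python sketch does this with exact fractions; the helper names (`dual`, `lo2`, `up2`) and the sample points are our own choices and not part of the original exposition.

```python
from fractions import Fraction

def dual(phi):
    """Dual in the sense of (7): phi*(x) = -phi(-x / (1 + x))."""
    return lambda x: -phi(-x / (1 + x))

phi = lambda x: 2 * x / (2 + x)                 # left-hand expression in (3), (4)
psi = lambda x: x * (2 + x) / (2 * (1 + x))     # right-hand expression in (3), (4)
lo2 = lambda x: x / (1 + x)                     # left-hand side of (2)
up2 = lambda x: x                               # right-hand side of (2)

# Rational sample points in ]-1, oo[ (x = -1 itself is excluded).
samples = [Fraction(k, 7) for k in range(-6, 30)]

# phi and psi are self-dual; the two bounds in (2) form a dual pair.
self_dual = all(dual(phi)(x) == phi(x) and dual(psi)(x) == psi(x) for x in samples)
dual_pair = all(dual(lo2)(x) == up2(x) and dual(up2)(x) == lo2(x) for x in samples)
print(self_dual, dual_pair)
```

Since a rational function is determined by its values at sufficiently many points, agreement at all 36 sample points already forces the identities here.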
Duality takes a different – and somewhat simpler – form when having
bounds for $\ln x$ in mind rather than bounds for $\ln(1+x)$. In order to
save on notation, we use only one symbol, a “star”, for duality and then
indicate by a superposed “tilde” if the functions are intended as bounds for
$\ln x$ rather than for $\ln(1+x)$. The duality definition for “tilde-functions”
is then given by
$$ \tilde{\varphi}^{*}(x) = -\tilde{\varphi}(x^{-1}) . \tag{9} $$
As an example of a basic inequality of ln x-type, we mention
$$ |\ln x| \le \frac{1}{2}\left| x - \frac{1}{x} \right| \quad \text{for } x > 0 \tag{10} $$
which is obtained from the right-hand inequalities of (3) and (4).
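From (10), and after a further substitution $x := x^{a}$ with $a$ a positive
parameter, we obtain useful approximations of $\ln x$. In more detail, one
finds that
$$ \frac{1}{2a}\left(x^{a} - x^{-a}\right) \downarrow \ln x \quad \text{as } a \downarrow 0 \quad \text{for } 1 \le x < \infty , \tag{11} $$
and that
$$ \frac{1}{2a}\left(x^{a} - x^{-a}\right) \uparrow \ln x \quad \text{as } a \downarrow 0 \quad \text{for } 0 < x \le 1 . \tag{12} $$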
These results – clearly duals of each other – can also be derived directly
from the expansion
$$ \frac{1}{2a}\left(x^{a} - x^{-a}\right) = \ln x + \sum_{n=1}^{\infty} \frac{a^{2n}}{(2n+1)!} (\ln x)^{2n+1} . \tag{13} $$
Consider, for instance, the case $a = \frac{1}{2}$. This leads to the inequality
$$ \ln x \le \frac{x-1}{\sqrt{x}} \quad \text{for } 1 \le x < \infty \tag{14} $$
with reversal of the inequality sign for 0 < x < 1. Expressed in terms of the
λ-function this shows that
$$ \lambda(x) \le \frac{1}{\sqrt{1+x}} \quad \text{for } x > -1 . \tag{15} $$
This useful inequality – which one could just as well have been led to by
considering the product of the extreme terms in (6) – goes back at least to
Karamata [8], cf. also Mitrinović [12, Section 3.6.15].
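The monotone convergence in (11) and (12) and the bound (15) are easy to observe numerically. The Python sketch below is only an illustration under our own choice of sample points and exponents; `sinh_bound` and `monotone_towards_log` are hypothetical helper names.

```python
import math

def sinh_bound(x, a):
    """The expression (x**a - x**(-a)) / (2a) appearing in (11)-(13)."""
    return (x ** a - x ** (-a)) / (2 * a)

def monotone_towards_log(x, exponents):
    """Check that the bounds move monotonically towards ln x as a decreases."""
    values = [sinh_bound(x, a) for a in exponents]   # exponents are decreasing
    target = math.log(x)
    if x >= 1:   # (11): decreasing towards ln x
        return all(u >= v >= target for u, v in zip(values, values[1:]))
    # (12): increasing towards ln x
    return all(u <= v <= target for u, v in zip(values, values[1:]))

def lam(t):
    """lambda(t) = ln(1 + t) / t, with lambda(0) = 1 by continuity, cf. (5)."""
    return math.log1p(t) / t if t != 0 else 1.0

exponents = [1.0, 0.5, 0.25, 0.125]
print(all(monotone_towards_log(x, exponents) for x in (0.2, 0.5, 1.0, 3.0, 10.0)))
print(all(lam(t) <= 1 / math.sqrt(1 + t) for t in (-0.9, -0.5, 0.0, 0.5, 4.0)))   # (15)
```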
Let us end this section with a more special inequality which is, again,
related to (6). It is our only result which involves two parameters (other such
inequalities can be found in [12]). The inequality states that for 0 ≤ x ≤ 1
and 0 ≤ y < ∞,
$$ (2 - x)\lambda(y) - \frac{1-x}{1+y} \le \lambda(xy) \le x\lambda(y) + (1 - x) . \tag{16} $$
This follows by a standard analysis of the inequalities which result when
you keep y fixed (for the proof, consider the function equal to the difference
between terms of the inequality you wish to prove, and differentiate twice).
The indicated proof makes use of (6) (which is needed in order to discuss
endpoint behaviour). One may also note that, for 0 ≤ x ≤ 1, the second
inequality in (6) follows by comparison of the two extreme terms in (16).
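Before carrying out the proof sketched above, it may be reassuring to test (16) on a grid. The following Python sketch is a plausibility check only; the grid and the rounding tolerance `tol` are our own assumptions (a tolerance is needed because equality holds for x = 1 and for y = 0).

```python
import math

def lam(t):
    """lambda(t) = ln(1 + t) / t, with lambda(0) = 1 by continuity, cf. (5)."""
    return math.log1p(t) / t if t != 0 else 1.0

def holds_16(x, y, tol=1e-12):
    """Check the double inequality (16) at one point, up to rounding error."""
    left = (2 - x) * lam(y) - (1 - x) / (1 + y)
    middle = lam(x * y)
    right = x * lam(y) + (1 - x)
    return left <= middle + tol and middle <= right + tol

grid_x = [i / 20 for i in range(21)]      # 0 <= x <= 1
grid_y = [j / 4 for j in range(201)]      # 0 <= y <= 50
print(all(holds_16(x, y) for x in grid_x for y in grid_y))
```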
For a loose consideration, let us neglect the fact that the denominator here
varies with the choice of φ. Then we only need to assure that the numerator
is small and non-negative. To choose among the possible bounds, note that
the numerator is a polynomial of degree 2n. We aim at a bound which is
especially good for small values of x and realize that we should attempt to
make lower terms vanish. We achieve this by insisting that all terms in the
numerator vanish, except the leading term.
To be precise, we denote by $\varphi_n$ the function of the form $\varphi_n = \frac{F}{G}$ with $F$
and $G$ two polynomials of degree $n$ such that $F(0) = 0$ and $G(0) > 0$ and
such that the numerator in (17) is a positive constant times $x^{2n}$. Similarly, $\psi_n$
is defined by $\psi_n = \frac{F}{G}$, where $F$ is a polynomial of degree $n$ with $F(0) = 0$ and
$G$ a polynomial of degree $n - 1$ with $G(0) > 0$ and such that the numerator
in (17) is a negative constant times $x^{2n-1}$.
It turns out that, for each n ≥ 1, φn and ψn are uniquely determined by
these requirements. We introduce the following standard representations:
$$ \varphi_n(x) = \frac{x P_{n-1}(x)}{Q_n(x)} , \tag{18} $$
$$ \psi_n(x) = \frac{x R_{n-1}(x)}{S_{n-1}(x)} , \tag{19} $$
with (Pn )n≥0 , (Qn )n≥0 , (Rn )n≥0 and (Sn )n≥0 four sets of polynomials of de-
grees as indicated by the subscript and such that
The considerations leading to the above definitions express a key idea
in the theory of Padé approximation, cf. Baker and Graves-Morris [2], a
standard reference. In the terminology of that theory, φn is the [n, n]-Padé
approximant and ψn the [n, n − 1]-Padé approximant of ln (1 + x). We shall
refer to the φn ’s as simply the lower approximants (to ln (1 + x)) and to the
ψn ’s as the upper approximants.
Instead of just referring the reader to the literature on Padé approxima-
tion, we base our exposition on experiments facilitated by modern computing
tools. This will lead rather quickly to desired formulas and other insights.
Full proofs of relations found experimentally may not always be so obvious.
We shall include enough details to enable the reader to validate all state-
ments in the usual rigorous mathematical style. The interested reader will
find further results in the literature referred to.
3 Some experiments
In order to get a feel for the nature of the approximants defined in the
previous section, it is natural to work out a number of examples. This can be
done by equating coefficients of polynomials occurring in the defining relations
(20) and (21). With hindsight, this can be done much more conveniently by
recursion formulas developed later ((60)–(63)) or by simply asking MAPLE
to work out the relevant Padé approximants. Anyhow, in one way or another
we can get at the first few approximants and obtain a suitable table, cf. Table
1.
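For readers without access to MAPLE, the approximants can also be obtained by equating coefficients in the Taylor series of ln(1 + x). The Python sketch below uses the standard linear-algebra formulation of Padé approximation with exact fractions; it is our own illustration (all function names are assumptions), and it normalizes the denominator to constant term 1, so the output differs from Table 1 by a common constant factor in numerator and denominator.

```python
from fractions import Fraction

def log_series(order):
    """Taylor coefficients c_0..c_order of ln(1 + x) around x = 0."""
    return [Fraction(0)] + [Fraction((-1) ** (k + 1), k) for k in range(1, order + 1)]

def solve(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [v / rows[col][col] for v in rows[col]]
        for r in range(n):
            if r != col and rows[r][col] != 0:
                factor = rows[r][col]
                rows[r] = [v - factor * w for v, w in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

def pade_log(m, n):
    """Coefficient lists (numerator, denominator) of the [m, n] approximant, q_0 = 1."""
    c = log_series(m + n)
    coeff = lambda k: c[k] if k >= 0 else Fraction(0)
    # Unknowns q_1..q_n: q(x) ln(1 + x) must have no terms x^(m+1), ..., x^(m+n).
    matrix = [[coeff(k - j) for j in range(1, n + 1)] for k in range(m + 1, m + n + 1)]
    rhs = [-coeff(k) for k in range(m + 1, m + n + 1)]
    q = [Fraction(1)] + (solve(matrix, rhs) if n else [])
    # The numerator collects the terms of degree at most m.
    p = [sum(q[j] * coeff(k - j) for j in range(min(k, n) + 1)) for k in range(m + 1)]
    return p, q

print(pade_log(1, 1))   # x / (1 + x/2), i.e. 2x / (2 + x)
print(pade_log(2, 1))   # (x + x^2/6) / (1 + 2x/3), i.e. x(6 + x) / (2(3 + 2x))
```

Coefficient lists are in ascending order of powers of x; multiplying numerator and denominator by a common constant recovers the forms used in the text.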
For instance, φ1 (x) ≤ ln (1 + x) for x ≥ 0 which we recognize as the first
inequality of (3). As φ∗1 = φ1 , the first inequality of (4) also follows. The
determination of ψ2 gives the inequality
$$ \ln(1+x) \le \frac{x(6+x)}{2(3+2x)} \quad \text{for } x \ge 0 , \tag{22} $$
a strengthening of the second inequality in (3) with a bound of the same type
([2, 1]). And, if we dualize the inequality for $\psi_2$, we get $\ln(1+x) \ge \psi_2^{*}(x)$
for $-1 < x \le 0$, i.e.
$$ \ln(1+x) \ge \frac{x(6+5x)}{2(1+x)(3+x)} \quad \text{for } -1 < x \le 0 , \tag{23} $$
a strengthening of the second inequality of (4), though with a function of a
different type ([2, 2] rather than [2, 1]).
By (18) and (19), Table 1 reveals the identity of the first P QRS-polynomials,
cf. Table 2.
Table 1
n   φn                                                                                      ψn
1   2x/(2+x)                                                                                x
2   3x(2+x)/(6+6x+x^2)                                                                      x(6+x)/(2(3+2x))
3   x(60+60x+11x^2)/(3(20+30x+12x^2+x^3))                                                   x(30+21x+x^2)/(3(10+12x+3x^2))
4   5x(84+126x+52x^2+5x^3)/(6(70+140x+90x^2+20x^3+x^4))                                     x(420+510x+140x^2+3x^3)/(12(35+60x+30x^2+4x^3))
5   x(7560+15120x+9870x^2+2310x^3+137x^4)/(30(252+630x+560x^2+210x^3+30x^4+x^5))            x(3780+6510x+3360x^2+505x^3+6x^4)/(30(126+280x+210x^2+60x^3+5x^4))
6   7x(1320+3300x+2960x^2+1140x^3+174x^4+7x^5)/(10(924+2772x+3150x^2+1680x^3+420x^4+42x^5+x^6))   x(13860+30870x+23520x^2+7035x^3+672x^4+5x^5)/(30(462+1260x+1260x^2+560x^3+105x^4+6x^5))

Table 2
n    0   1        2                       3
Pn   2   6 + 3x   20 + 20x + (11/3)x^2    70 + 105x + (130/3)x^2 + (25/6)x^3
Qn   1   2 + x    6 + 6x + x^2            20 + 30x + 12x^2 + x^3
Rn   1   6 + x    30 + 21x + x^2          140 + 170x + (140/3)x^2 + x^3
Sn   1   6 + 4x   30 + 36x + 9x^2         140 + 240x + 120x^2 + 16x^3
[Figure: graph of ln(1+x) together with the curves labelled f1, f2, y1 and y2.]
and may continue based on the identification of the first few good bounds
given in Table 1.
The approach just described is of course well known and nothing but an
attempt to represent ln (1 + x) by a continued fraction. The approach above
does not lead in a unique way to (the beginnings of) a continued fraction.
E.g., above we could choose to look, as we did, at $x$ divided by 2 + something
or we could have looked at $\frac{1}{2}x$ divided by 1 plus something. Experimenting
with the possibilities, you soon see that a very simple structure emerges if
you at each step divide by $n$ plus something. Indeed, you then arrive at the
beautiful representation
$$ \ln(1+x) = \cfrac{x}{1+\cfrac{1^2 x}{2+\cfrac{1^2 x}{3+\cfrac{2^2 x}{4+\cfrac{2^2 x}{5+\cdots+\cfrac{n^2 x}{2n+\cfrac{n^2 x}{2n+1+\cdots}}}}}}} . \tag{28} $$
This representation is not new. You find it in [3], [4], [2], [7] and [11].
In fact, the result is more than two hundred years old (!) and goes back to
Lambert, cf. [10]. Regarding historical comments – to this and other aspects
of the paper – we refer to [3] and to [7].
In a sense, the continued fraction (28) tells us all we need to know as
it provides us with easy access to the approximants φn and ψn . We shall
elaborate more on this in Section 5. For now, let us look at a few more
“experiments” .
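To see how the continued fraction (28) gives access to the approximants, one can evaluate its truncations numerically. The Python sketch below is our own illustration (the name `convergent` and the bottom-up evaluation scheme are assumptions, not taken from the text); truncating after 2n partial quotients reproduces φn, and after 2n − 1 partial quotients ψn.

```python
import math

def convergent(x, depth):
    """Value of (28) truncated after `depth` partial quotients (depth >= 1)."""
    def numerator(k):
        # Partial numerators of (28): a_1 = x and a_(2k) = a_(2k+1) = k^2 x.
        return x if k == 1 else (k // 2) ** 2 * x

    value = float(depth)                 # innermost partial denominator
    for k in range(depth, 1, -1):        # work outwards, bottom-up
        value = (k - 1) + numerator(k) / value
    return numerator(1) / value

x = 1.0
for depth in range(1, 7):
    print(depth, convergent(x, depth), math.log1p(x))
```

For x ≥ 0 the printed values alternate around ln(1 + x): odd depths give upper bounds, even depths lower bounds.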
Some coefficients for the P QRS-polynomials are easy to guess from Tables
1 and 2, e.g., we see that
$$ S_{n,n} = (n+1)^2 . \tag{29} $$
As another example, we realize that $P_{n,n} - P_{n-1,n-1} = \frac{2}{n+1}$, hence
$$ P_{n,n} = 2 \sum_{\nu=1}^{n+1} \frac{1}{\nu} . \tag{30} $$
One may also go hunting for relationships among the coefficients by con-
sulting the “On-Line Encyclopedia of Integer Sequences” , cf. Sloane [13].
For instance, enquiring about the sequence 6, 30, 90, 210, the feed-back from
this source will reveal the fact that
$$ Q_{n,n-2} = \frac{1}{4}\,[n+2]_4 .{}^{1} $$
Many other relations for the P QRS-coefficients can be discovered in this
way. Instead of pursuing this line of investigation, we shall have a look at the
approximants after a transformation so that they approximate ln x rather
than ln (1 + x). In other words, we are asking about the “tilde-functions”
n   φ̃n                                                                                  ψ̃n
1   2(x−1)/(1+x)                                                                         x−1
2   3(x−1)(1+x)/(1+4x+x^2)                                                               (x−1)(x+5)/(2(1+2x))
3   (x−1)(11+38x+11x^2)/(3(1+9x+9x^2+x^3))                                               (x−1)(10+19x+x^2)/(3(1+6x+3x^2))
4   5(x−1)(5+37x+37x^2+5x^3)/(6(1+16x+36x^2+16x^3+x^4))                                  (x−1)(47+239x+131x^2+3x^3)/(12(1+12x+18x^2+4x^3))
5   (x−1)(137+1762x+3762x^2+1762x^3+137x^4)/(30(1+25x+100x^2+100x^3+25x^4+x^5))          (x−1)(131+1281x+1881x^2+481x^3+6x^4)/(30(1+20x+60x^2+40x^3+5x^4))
6   7(x−1)(7+139x+514x^2+514x^3+139x^4+7x^5)/(10(1+36x+225x^2+400x^3+225x^4+36x^5+x^6))  (x−1)(142+2272x+6397x^2+4397x^3+647x^4+5x^5)/(30(1+30x+150x^2+200x^3+75x^4+6x^5))
Polynomial   Zeros
P1             -2.0000
P2             -4.1356   -1.3189
P3             -7.2397   -2.0000   -1.1603
P4            -11.3204   -2.9243   -1.5197   -1.0969
P5            -16.3907   -4.0764   -2.0000   -1.3251   -1.0650
Q1             -2.0000
Q2             -4.7321   -1.2679
Q3             -8.8730   -2.0000   -1.1270
Q4            -14.4026   -3.0302   -1.4926   -1.0746
Q5            -21.3174   -4.3334   -2.0000   -1.3000   -1.0492
R1             -6.0000
R2            -19.4582   -1.5418
R3            -42.7683   -2.6743   -1.2240
R4            -77.0830   -4.2481   -1.7114   -1.1242
R5           -123.2945   -6.2526   -2.3648   -1.4089   -1.0792
S1             -1.5000
S2             -2.8165   -1.1835
S3             -4.7094   -1.6934   -1.0992
S4             -7.1551   -2.4015   -1.3828   -1.0606
S5            -10.1487   -3.2837   -1.7793   -1.2469   -1.0415
4 Some facts
In this section it is assumed that any occurring x is non-negative unless stated
otherwise explicitly.
The experiments of the previous section lead to important facts about
the φn ’s and ψn ’s. Let us start by looking at the quality of these functions
as bounds for ln (1 + x).
Apparently, all coefficients in the P QRS-polynomials are positive. From
(24) - (27) it therefore follows that
These bounds are less explicit but typically much sharper than (33).
The behaviour of φn (x) and ψn (x) for large x is given by the relations
$$ \varphi_n(x) \approx 2 \sum_{\nu=1}^{n} \frac{1}{\nu} ; \qquad \psi_n(x) \approx \frac{x}{n^2} \tag{36} $$
in the sense that the ratios involved converge to 1 for x → ∞. These relations
follow from (29) and (30).
We also note that the functions φn and ψn are increasing. As for φn this
follows from (20) (see also (17)) which tells us that
$$ \varphi_n'(x) = \frac{Q_n(x)^2 - x^{2n}}{(1+x)\,Q_n(x)^2} \ge 0 . \tag{37} $$
for x ≥ 0 which dominates any other such function locally, i.e., for any such
function f , there exists ε > 0 such that φn (x) ≥ f (x) for 0 ≤ x < ε. In fact,
for this conclusion we may assume about type-[n, n] functions f considered
only that f ≤ ψn+1 , locally, instead of f (x) ≤ ln (1 + x) for x ≥ 0. A similar
characterization holds for ψn .
Note that we have to work locally with inequalities in the characterizations
just pointed out (consider, for example, the function given by
$f(x) = \frac{3x}{4+x}$; then $f(x) \le \ln(1+x)$ for $x \ge 0$ – as $f \le \varphi_2$ – but $f \le \varphi_1$ does not hold
globally for $x \ge 0$, only locally). The proofs of the characterizations rest on
the facts that if a type-[n, n] function $f$ is not identical to $\varphi_n$, then there is a
$\nu \le 2n$ such that $f^{(\nu)}(0) \ne \varphi_n^{(\nu)}(0)$ and if a type-[n, n − 1] function $g$ is not
identical to $\psi_n$, then there is a $\nu \le 2n - 1$ such that $g^{(\nu)}(0) \ne \psi_n^{(\nu)}(0)$.
The determination of the PQRS-polynomials in closed form is not that
obvious based on our experiments. Most striking, perhaps, is that if we
define $\tilde{Q}_n$ by
$$ \tilde{Q}_n(x) = Q_n(x - 1) , $$
then
$$ \tilde{Q}_n(x) = \sum_{k=0}^{n} \binom{n}{k}^{2} x^{k} , $$
Thus, the coefficients in Qn have been identified as certain trinomial coef-
ficients.
Having determined the Qn,k ’s we may use the defining relation (20) to
determine the Pn−1,k ’s. With some effort this leads to the formula
$$ P_{n-1,k} = \sum_{\nu=0}^{k} \frac{(-1)^{\nu}}{\nu+1}\, Q_{n,k-\nu} . \tag{39} $$
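The two observations above, the binomial-square form of Q̃n and formula (39), can be combined into a few lines of exact arithmetic. The Python sketch below is our own illustration; it expands Qn(x) = Q̃n(x + 1) = Σ C(n,k)² (1 + x)^k and then applies (39). The names `q_coeffs` and `p_coeffs` are hypothetical.

```python
from fractions import Fraction
from math import comb

def q_coeffs(n):
    """Coefficients Q_{n,0}, ..., Q_{n,n} of Q_n(x) = sum_k C(n,k)^2 (1 + x)^k."""
    return [sum(comb(n, k) ** 2 * comb(k, j) for k in range(j, n + 1))
            for j in range(n + 1)]

def p_coeffs(n):
    """Coefficients P_{n-1,0}, ..., P_{n-1,n-1} computed from (39)."""
    q = q_coeffs(n)
    return [sum(Fraction((-1) ** v, v + 1) * q[k - v] for v in range(k + 1))
            for k in range(n)]

for n in range(1, 5):
    print(n, q_coeffs(n), [str(c) for c in p_coeffs(n)])
```

For n = 1, ..., 4 the output agrees with the entries of Table 2.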
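$$ \tilde{\varphi}_n(x) = \frac{(x-1)\tilde{P}_{n-1}(x)}{\tilde{Q}_n(x)} $$
with $\tilde{P}_{n-1}(x) = P_{n-1}(x-1)$ and $\tilde{Q}_n(x) = Q_n(x-1)$, we realize that these
polynomials are self-reflected in the sense that
$$ x^{n-1}\,\tilde{P}_{n-1}\!\left(\frac{1}{x}\right) = \tilde{P}_{n-1}(x) , $$
$$ x^{n}\,\tilde{Q}_n\!\left(\frac{1}{x}\right) = \tilde{Q}_n(x) . $$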
By simple calculation, this implies that the φ̃n ’s are self-dual. Hence also
the φn ’s are self-dual: φ∗n = φn . By duality, this implies that φn is a best
type-[n, n] upper bound of ln (1 + x) for −1 < x ≤ 0 (with “best” being
understood in much the same way as discussed for the bounds found for
x ≥ 0).
The functions ψn are not self-dual. The duals ψn∗ are even of a different
type, viz. of type [n, n]. These functions then, are the best type-[n, n] lower
bounds of ln (1 + x) for −1 < x ≤ 0.
products of polynomials and simpler procedures are desirable. Here Lambert's
expansion (28) provides the proper tool.
Following usual terminology of continued fraction theory, let An and Bn
denote the approximants related to (28). These are polynomials associated
in the natural manner to the corresponding finite continued fractions, e.g.
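$$ \frac{A_1}{B_1} = \frac{x}{1} , \qquad \frac{A_2}{B_2} = \cfrac{x}{1+\cfrac{1^2 x}{2}} , \qquad \frac{A_3}{B_3} = \cfrac{x}{1+\cfrac{1^2 x}{2+\cfrac{1^2 x}{3}}} . $$
The first polynomials are
$$ A_0 = 0 , \quad A_1 = x , \quad B_0 = 1 , \quad B_1 = 1 \tag{40} $$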
and, exploiting the special form of (28), the following recursive relations hold
for n ≥ 1:
$$ xP_{n-1} = \frac{A_{2n}}{(n!)^2} , \qquad Q_n = \frac{B_{2n}}{(n!)^2} , \tag{48} $$
$$ xR_{n-1} = \frac{A_{2n-1}}{((n-1)!)^2} , \qquad S_{n-1} = \frac{B_{2n-1}}{((n-1)!)^2} . \tag{49} $$
Now, Qn is known, cf. (38), hence also B2n can be found. And then, from
(43), we can determine B2n−1 , hence, by (49), also Sn−1 . This leads to the
formula
$$ S_{n-1} = n \sum_{\nu=0}^{n-1} \binom{2n-\nu-1}{\nu,\; n-\nu,\; n-\nu-1} x^{\nu} . \tag{50} $$
Similarly, a formula for Rn−1 can be found based on (39), (41) and (49).
Going through the details you find that
$$ R_{n-1} = n \sum_{\nu=0}^{n-1} \sum_{\mu=1}^{\nu+1} (-1)^{\mu-1} \mu^{-1} \binom{2n-\nu+\mu-2}{\nu-\mu+1,\; n-\nu+\mu-1,\; n-\nu+\mu-2} x^{\nu} . \tag{51} $$
The expressions (38), (39), (50) and (51) thus provide formulas for the
P QRS-polynomials in closed form with very satisfactory expressions for the
denominator polynomials (Q, S) and somewhat less satisfactory formulas for
the numerator polynomials (P , R).
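The closed forms (50) and (51) can be evaluated directly. The Python sketch below is our own illustration, reading the three-entry symbols in (50) and (51) as multinomial coefficients and the factor in (51) as (−1)^(μ−1)/μ; the helper names are assumptions.

```python
from fractions import Fraction
from math import factorial

def multinomial(parts):
    """Multinomial coefficient (sum of parts)! / (parts[0]! parts[1]! ...)."""
    value = factorial(sum(parts))
    for part in parts:
        value //= factorial(part)   # each partial quotient is an integer
    return value

def s_coeffs(n):
    """Coefficients of S_{n-1} (ascending powers of x) according to (50)."""
    return [n * multinomial((v, n - v, n - v - 1)) for v in range(n)]

def r_coeffs(n):
    """Coefficients of R_{n-1} (ascending powers of x) according to (51)."""
    coeffs = []
    for v in range(n):
        inner = sum(
            Fraction((-1) ** (m - 1), m)
            * multinomial((v - m + 1, n - v + m - 1, n - v + m - 2))
            for m in range(1, v + 2)
        )
        coeffs.append(n * inner)
    return coeffs

print(s_coeffs(4))                       # coefficients of S_3
print([str(c) for c in r_coeffs(4)])     # coefficients of R_3
```

With this reading, the values for n = 1, ..., 4 agree with the R- and S-rows of Table 2.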
For practical calculations, however, the recursive relations (41)–(44) are
more expedient and simple to programme. By (48) and (49) these relations
may be written directly in terms of the P QRS-polynomials:
$$ P_{n-1} = \frac{2}{n} R_{n-1} + x P_{n-2} , \tag{52} $$
$$ R_n = (2n+1) P_{n-1} + x R_{n-1} , \tag{53} $$
$$ Q_n = \frac{2}{n} S_{n-1} + x Q_{n-1} , \tag{54} $$
$$ S_n = (2n+1) Q_n + x S_{n-1} . \tag{55} $$
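To illustrate how simple the relations (52)–(55) are to programme, here is a minimal Python sketch with exact rational coefficients. It is our own illustration rather than the author's code: polynomials are stored as coefficient lists in ascending powers of x, the start values are taken from Table 2, and the helper names are assumptions.

```python
from fractions import Fraction

def add(p, q):
    """Sum of two polynomials given as coefficient lists (ascending powers)."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def scale(c, p):
    """Multiply a polynomial by the constant c."""
    return [c * a for a in p]

def times_x(p):
    """Multiply a polynomial by x."""
    return [Fraction(0)] + p

def pqrs(count):
    """Return lists P, Q, R, S holding P_n, Q_n, R_n, S_n for n = 0..count."""
    one = [Fraction(1)]
    P, Q, R, S = [[Fraction(2)]], [one], [one], [one]   # start values from Table 2
    for n in range(1, count + 1):
        Q.append(add(scale(Fraction(2, n), S[n - 1]), times_x(Q[n - 1])))    # (54)
        S.append(add(scale(2 * n + 1, Q[n]), times_x(S[n - 1])))             # (55)
        R.append(add(scale(2 * n + 1, P[n - 1]), times_x(R[n - 1])))         # (53)
        P.append(add(scale(Fraction(2, n + 1), R[n]), times_x(P[n - 1])))    # (52), n -> n + 1
    return P, Q, R, S

P, Q, R, S = pqrs(3)
for name, polys in (("P", P), ("Q", Q), ("R", R), ("S", S)):
    print(name, [[str(c) for c in poly] for poly in polys])
```

The order of the four updates matters: S_n needs Q_n, and P_n needs R_n, so Q and R are updated first.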
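Another form of (52)–(55) is obtained for the reflected polynomials. Following
[6], the reflected polynomial of a polynomial $A$ is the polynomial
denoted $\overline{A}$ which is given by
$$ \overline{A}(x) = x^{N} A(x^{-1}) $$
with $N$ the degree of $A$. We find:
$$ \overline{P}_{n-1} - \overline{P}_{n-2} = \frac{2}{n}\,\overline{R}_{n-1} ; \qquad \overline{R}_n - \overline{R}_{n-1} = (2n+1)\,x\,\overline{P}_{n-1} , \tag{58} $$
$$ \overline{Q}_n - \overline{Q}_{n-1} = \frac{2}{n}\,x\,\overline{S}_{n-1} ; \qquad \overline{S}_n - \overline{S}_{n-1} = (2n+1)\,\overline{Q}_n . \tag{59} $$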
Table 5
n   Pn   Qn   Rn       Sn
0   2    1    1        1
1   3x   x    3x − 2   3x + 1
$$ P_n - P_{n-1} = \frac{2}{n+1} R_n , \tag{64} $$
$$ Q_n - Q_{n-1} = \frac{x-1}{n} S_{n-1} , \tag{65} $$
$$ R_n - R_{n-1} = \left(n + \frac{1}{2}\right)(x-1) P_{n-1} , \tag{66} $$
$$ S_n - S_{n-1} = (2n+1) Q_n , \tag{67} $$
Note the disappearance of any x’s in the last terms. We may apply (68)–
(71) with n ≥ 2 in connection with the start polynomials given in Table
5.
We shall relate the PQRS-polynomials to the classical Jacobi polynomials,
here denoted Pnα,β , which are associated with the measures on [−1, 1] with
densities (1 − x)α (1 + x)β . From standard sources, e.g. [14], we see that these
polynomials may be determined from the recurrence relations
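$$ P_n^{\alpha,\beta} - \frac{(2n-1+\alpha+\beta)\left[(2n+\alpha+\beta)(2n+\alpha+\beta-2)x + (\alpha^2-\beta^2)\right]}{2n(n+\alpha+\beta)(2n-2+\alpha+\beta)}\, P_{n-1}^{\alpha,\beta} + \frac{(n+\alpha-1)(n+\beta-1)(2n+\alpha+\beta)}{n(n+\alpha+\beta)(2n-2+\alpha+\beta)}\, P_{n-2}^{\alpha,\beta} = 0 \tag{72} $$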
for n ≥ 2 in conjunction with the first polynomials which are given by
$$ P_0^{\alpha,\beta} = 1 ; \qquad P_1^{\alpha,\beta} = \frac{1}{2}\left[(\alpha+\beta+2)x + (\alpha-\beta)\right] . \tag{73} $$
The Jacobi polynomials have a simple expression in closed form:
$$ P_n^{\alpha,\beta} = \frac{1}{2^{n}} \sum_{\nu=0}^{n} \binom{n+\alpha}{\nu} \binom{n+\beta}{n-\nu} (x-1)^{n-\nu} (x+1)^{\nu} . \tag{74} $$
We realize that
$$ Q_n = P_n^{0,0} , \tag{75} $$
i.e. the $Q_n$’s are nothing but the classical Legendre polynomials. By (65)
the S-polynomials are closely related to these polynomials. However, apart
from a constant factor, they may also be identified directly as orthogonal
polynomials. Indeed, by Table 5, (71), (72) and (73) it is easy to check that
$$ S_n = (n+1) P_n^{1,0} . \tag{76} $$
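The identifications (75) and (76) can be tested for small n by exact computation. The Python sketch below is our own illustration: it generates Qn and Sn from (65) and (67) with the start values Q0 = S0 = 1 of Table 5 and compares them with the closed form (74), restricted here to non-negative integer α and β. All helper names are assumptions.

```python
from fractions import Fraction
from math import comb

def add(p, q):
    """Sum of two coefficient lists (ascending powers of x)."""
    n = max(len(p), len(q))
    p = p + [Fraction(0)] * (n - len(p))
    q = q + [Fraction(0)] * (n - len(q))
    return [a + b for a, b in zip(p, q)]

def mul(p, q):
    """Product of two coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out

def power(p, k):
    """k-th power of a polynomial."""
    out = [Fraction(1)]
    for _ in range(k):
        out = mul(out, p)
    return out

def jacobi(n, alpha, beta):
    """Closed form (74) for non-negative integers alpha and beta."""
    total = [Fraction(0)]
    for v in range(n + 1):
        term = mul(power([Fraction(-1), Fraction(1)], n - v),   # (x - 1)^(n - v)
                   power([Fraction(1), Fraction(1)], v))        # (x + 1)^v
        weight = Fraction(comb(n + alpha, v) * comb(n + beta, n - v), 2 ** n)
        total = add(total, [weight * t for t in term])
    return total

def qs(count):
    """Q_n and S_n for n = 0..count from (65) and (67), with Q_0 = S_0 = 1."""
    Q, S = [[Fraction(1)]], [[Fraction(1)]]
    for n in range(1, count + 1):
        step = mul([Fraction(-1, n), Fraction(1, n)], S[n - 1])   # ((x - 1)/n) S_{n-1}
        Q.append(add(Q[n - 1], step))                             # (65)
        S.append(add(S[n - 1], [(2 * n + 1) * c for c in Q[n]]))  # (67)
    return Q, S

Q, S = qs(5)
print(all(Q[n] == jacobi(n, 0, 0) for n in range(6)))                            # (75)
print(all(S[n] == [(n + 1) * c for c in jacobi(n, 1, 0)] for n in range(6)))     # (76)
```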
We have not identified the R-polynomials in a similar manner. Possibly,
it is more reasonable to look at the polynomials Sn − Rn when searching for
properties related to orthogonal polynomials.
Lastly we note two interlacing properties with mixed polynomials. For
this, we may as well return to the original P QRS-polynomials. In fact, from
the defining relation (20) and an investigation of the sign of Pn−1 at zeroes of
Qn it follows that between any two neighbouring zeroes of Qn there is a zero
of Pn−1 (and of course only one zero). And from (21) a similar investigation
shows that between any two neighbouring zeroes of Sn−1 there is a zero of
Rn−1 . This accounts for n − 2 zeroes of Rn−1 . As is easily seen, the last zero
of Rn−1 is smaller than the smallest zero of Sn−1 .
Acknowledgement
Thanks are due to Jacob Stordal Christiansen for some of the references to
the literature and for identification of the associated orthogonal polynomials
related to the Pn ’s.
References
[1] E. S. Andersen and M. E. Larsen. Combinatorial Summation Identities.
In Analysis, algebra, and computers in mathematical research, volume
156 of Lecture Notes in Pure and Applied Mathematics, pages 1–23.
Marcel Dekker, 1994.
[7] W. B. Jones and W. J. Thron. Continued Fractions - Analytic Theory
and Applications, volume 11 of Encyclopedia of Mathematics and its
Applications. Addison-Wesley, 1980.