

2. Fundamentals of Convergent Procedures

The O(x) pseudo-function.

In most articles related to approximation theory, or to asymptotic convergence rates,


a pseudo-function O(x) is prominently used. We call it a pseudo-function, since it does
not identify a specific function, but rather a norm of behaviour to which other functions
are compared. Specifically, the following expressions are used:

I(z)"" O(g(z» means lim I«Z»


,(:)-09 z
= A ,60
I(z) < O(g(z» means lim I«Z»
,(:)-09 z
=0 (2.1)

I(z) > O(g(z» means lim Ig«Z»


,(:)_0 Z
=0
The symbol $\sim$ ('similar to') can also be used to compare any two functions, in which case one can say
$$f(z) \sim g(z) \quad \text{means} \quad \lim \frac{f(z)}{g(z)} = A \neq 0 \tag{2.2}$$
Here, the type of limit ($\to 0$ or $\to \infty$) must be specified if it is not clear from context. If comparison is to the pseudo-function $O$, then usually the argument tells us which limit to take.
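As a concrete illustration: for $z \to 0$ one has $\sin z \sim O(z)$, since $\lim_{z\to 0} \sin z / z = 1 \neq 0$, while $1 - \cos z < O(z)$ but $1 - \cos z \sim O(z^2)$, since $\lim_{z\to 0} (1 - \cos z)/z^2 = \frac{1}{2}$.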
Occasionally, the O(x) function is used also with large arguments. In computability theory, for instance, one can see statements such as "Algorithm XX multiplies two n-digit numbers in a time $\sim O(n \log n \log\log n)$". However, it will be clear from context which meaning is currently attached to the O(x) function, and in this paper we will use it only to compare small quantities.
One also frequently writes that $|f(z)| \lesssim g(z)$, meaning that $|f(z)|$ is asymptotically bounded by some $A \cdot g(z)$:
$$\limsup_{z\to\infty} \frac{|f(z)|}{g(z)} < A \tag{2.3}$$
Finally, the term "exponentially decreasing" is frequently used. This does not mean that $f(z) \sim e^{-\gamma_0 z}$ for some $\gamma_0$. Instead, it means
$$\exists \gamma_0 : \begin{cases} |f(z)| < O(e^{-\gamma z}) & \text{if } \gamma < \gamma_0 \\ |f(z)| > O(e^{-\gamma z}) & \text{if } \gamma > \gamma_0 \end{cases} \tag{2.4}$$

The Taylor series.

The Taylor series expansion of a function $f(z)$ around the point $z_0$ is defined as a power series
$$f(z) \sim \sum_{\nu=0}^{\infty} a_\nu (z - z_0)^\nu$$
with partial sums
$$s_n(z) = \sum_{\nu=0}^{n} a_\nu (z - z_0)^\nu \tag{2.5}$$
which fulfill
$$f(z) - s_n(z) \sim O\!\left((z - z_0)^{n+1}\right)$$
if such a series exists. We quote the following well-known facts:
• If $\limsup_{n\to\infty} \sqrt[n]{|a_n|} = \rho$ exists, then the function $f^{(T)}(z) = \lim_{n\to\infty} s_n(z)$ exists within the circle $|z - z_0| < 1/\rho$ if $\rho > 0$; else, it exists for all $z$. $R = 1/\rho$ is called the convergence radius; the latter case is indicated by saying $R = \infty$.
• Else, the sequence $\sqrt[n]{|a_n|}$ is unbounded. In that case, $s_n(z)$ diverges everywhere except for $z = z_0$, and one says that the convergence radius is $R = 0$.
Now assume a non-zero convergence radius. One can then prove that:
• Inside the circle, $f^{(T)}(z)$ is an analytic function.
• If $f(z)$ is analytic inside the circle, then $f^{(T)}(z) = f(z)$ inside the circle.
• If $f(z)$ is analytic in $z_0$, then the coefficients are given by
$$a_n = \frac{1}{n!} \left. \frac{d^n f}{dz^n} \right|_{z = z_0} \tag{2.6}$$
• $R$ is the largest possible radius such that $f^{(T)}$ is analytic within the circle, i.e. if $R$ is finite, then there is at least one non-analytic point on the circumference of the circle.
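As a small numerical illustration of the root test in the first point above (a sketch, not part of the original text), the following Python snippet estimates $R = 1/\rho$ from finitely many Taylor coefficients; with a finite list this is necessarily only a rough estimate:

```python
import math

def radius_estimate(coeffs):
    """Root-test estimate of the convergence radius R = 1/rho from a
    finite list of Taylor coefficients a_0..a_N; rho is approximated
    by |a_N|^(1/N), so the result is only a rough estimate."""
    n = len(coeffs) - 1
    rho = abs(coeffs[n]) ** (1.0 / n)
    return math.inf if rho == 0.0 else 1.0 / rho

# 1/(1 - z) about z0 = 0 has a_n = 1, hence R = 1:
print(radius_estimate([1.0] * 50))                       # ~1.0
# exp(z) has a_n = 1/n!, hence rho -> 0 and R = infinity:
print(radius_estimate([1.0 / math.factorial(n) for n in range(50)]))
# this estimate keeps growing as more coefficients are used
```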
There is a common belief that a convergent Taylor series also means that the series converges to the intended function. This is usually true in most areas of applied mathematics, but there are some very common counterexamples in some applications of perturbation theory. One such example is the energy of a system perturbed by an external field $F$, under circumstances which allow field ionization for all non-zero $F$. Here, the unperturbed energy gets immersed in an infinitely deep continuum as soon as $F \neq 0$. The tunneling probability has the property of being limited by $O(F^N)$ for any arbitrarily large $N$. The energy can be expressed as a function of $F$, but only if complex energies are allowed, corresponding to finite lifetimes. The energy then has the form
$$E(F) = (E_0 + E_1 F + E_2 F^2 + \cdots) + S(F), \tag{2.7}$$
where $S$ is some complex function with Taylor expansion $S^{(T)}(F) \equiv 0$. The first part of the r.h.s. may have a finite convergence radius, which then allows definition of dipole moment, polarizability, etc. in the form of energy derivatives, even though this at first may seem to be prohibited by the possibility of field ionization.
Another common example arises in the expansion of the interaction energy of two atoms as a function of the inverse interatomic distance $1/R_{AB}$. This turns out to be a Taylor series which differs from the correct energy by a real contribution with zero Taylor expansion. The difference may be attributed to so-called exchange repulsion. In this case, it is nowadays known that the Taylor series has a zero convergence radius, so that the energy expression constitutes an example of an asymptotic series (to be defined in a moment) which is non-convergent for all $R_{AB}$.

Asymptotic series.

Sometimes, one is not so much concerned with the pointwise convergence of a series; one merely wants each partial sum to be asymptotically better than the last. Such an asymptotic series is almost always an inverse power series, and it is then defined as follows:
$$f(z) \sim \sum_{i=0}^{\infty} a_i z^{-i} \tag{2.8}$$
with partial sums
$$s_n(z) = \sum_{i=0}^{n} a_i z^{-i}$$
iff
$$f(z) - s_n(z) \sim O\!\left(z^{-(n+1)}\right)$$
There is no suggestion that $\lim_{n\to\infty} s_n(z) = f(z)$. By comparison to what is known about Taylor series, by defining $g(z) = f(1/z)$ we immediately find:
• If $g(z)$ is analytic in the origin, then the asymptotic series converges towards $f(z)$ for all $|z| > 1/R$, where $R$ is the convergence radius of the Taylor series for $g(z)$. If $R = 0$, it diverges for all $z$.
• Else, it either does not converge, or it converges towards a function which differs from $f(z)$. From the definition, it then follows that the difference is limited by all inverse powers of $z$. The function $e^{-|z|}$ provides an explicit example of such a function.
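As a simple worked case of the first point, take $f(z) = 1/(1+z)$ for large $z$. Then $g(z) = f(1/z) = z/(1+z)$ is analytic at the origin with convergence radius $R = 1$, and the asymptotic series $f(z) \sim z^{-1} - z^{-2} + z^{-3} - \cdots$ indeed converges to $f(z)$ for all $|z| > 1$.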
A famous example of a non-convergent asymptotic series is the Stirling formula for the factorial function:
$$\log(z!) = \frac{1}{2}\log 2\pi + \left(z + \frac{1}{2}\right)\log z - z + \frac{1}{12z} - \frac{1}{360z^3} + \cdots \tag{2.9}$$
As it stands, the series diverges for any $z$. It is usually used together with the recursion $z! = (z+1)!/(z+1)$, which is first employed to push the argument arbitrarily far into the asymptotic region, to provide any requested accuracy of the Stirling formula.
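The following Python sketch (an illustration, not from the original text) implements this argument-shifting: it pushes the argument up by $m$ steps, applies the truncated series (2.9), and compares against the exact value via math.lgamma:

```python
import math

def log_factorial(z, shift=10):
    """Approximate log(z!) with the truncated Stirling series (2.9),
    after pushing the argument into the asymptotic region using
    z! = (z+1)!/(z+1), i.e.
    log(z!) = log((z+m)!) - sum_{j=1..m} log(z+j)."""
    correction = sum(math.log(z + j) for j in range(1, shift + 1))
    w = z + shift
    series = (0.5 * math.log(2 * math.pi) + (w + 0.5) * math.log(w) - w
              + 1.0 / (12 * w) - 1.0 / (360 * w**3))
    return series - correction

z = 3.0
print(log_factorial(z))       # truncated series, shifted argument
print(math.lgamma(z + 1))     # exact log(3!) = log(6)
```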
Note, however, that the common belief that an asymptotic series is always non-convergent is wrong.

Application of divergent series.

It is not true that a divergent series cannot be applied in computations. The key to the matter is the analyticity of virtually all functions derived in applied mathematics, and the essential uniqueness of analytic functions. With the latter, we refer to the fact that if two analytic functions are identical in any open disk, they are in fact identical in every point where they have both been defined - with the exception that we may have to select branches of multivalued functions. Thus, as long as some functional dependencies on a variable remain in the terms, it does not matter if some reshuffling of the terms changes the convergence region of a series, as long as these convergence regions overlap somewhere. The point is to make the reshuffling in such a way that the resulting convergence region covers our intended argument.

There are many different ways to sum a divergent series. The two simplest ways are: either to add and subtract a function with a known series, with the same type of divergence, or similarly to multiply and divide with such a series expansion. This operation is intended to remove the non-analytic point(s) closest to the origin, thereby leaving a rest with a larger convergence circle. A slightly less easy way is to make a linear combination of low powers of the expanded function. The trick is then to find such a combination which in itself has a larger convergence circle, and the result is thereby reexpressed as the solution of an algebraic equation where the coefficients can be evaluated as summable series. The latter trick is valuable when the poor convergence is known to arise from branch points of some very large and complicated algebraic equation, such as an eigenvalue problem. The straightforward deformation of the convergence region can be more or less difficult, but can be exemplified: Consider the lower eigenvalue $E(\lambda)$ of the perturbed two-by-two matrix

$$H(\lambda) = \begin{pmatrix} 0 & \lambda \\ \lambda & 1 \end{pmatrix},$$
which has the Taylor expansion
$$E(\lambda) = -\lambda^2 + \frac{2}{2}\lambda^4 - \frac{2\cdot 6}{2\cdot 3}\lambda^6 + \frac{2\cdot 6\cdot 10}{2\cdot 3\cdot 4}\lambda^8 - \cdots \tag{2.10}$$
This series cannot be used to evaluate $E(1)$, since the convergence radius is only $\frac{1}{2}$. This is due to two branch points, at $\lambda = \pm\frac{1}{2}i$. The simple replacement of $\lambda$ with $\mu$, which we define as
$$\mu = \frac{2\lambda^2}{1 + 2\lambda^2}; \qquad \lambda^2 = \frac{\mu}{2(1-\mu)},$$
can be substituted into the terms of the series, and the terms themselves expanded in powers of $\mu$. The result is (after some boring algebra)
$$E(\mu) = -\frac{\mu}{2} - \frac{\mu^2}{4} - \frac{\mu^3}{4} - \cdots \tag{2.11}$$
The new series has a convergence radius of 1; the value of $\mu$ to insert (corresponding to $\lambda = 1$) is $\mu_0 = \frac{2}{3}$, so we have been able to transform the divergent series into a convergent one. This was done by a transformation whereby the original branch points were transformed into a single branch point located at $\mu = -1$. We also introduced a pole at $\mu = +1$. The resulting convergence circle covers the intended argument, $\mu = \frac{2}{3}$.
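As a numerical cross-check (a sketch that assumes the two-by-two matrix reconstructed above, so that $E(\mu) = \frac{1}{2}\bigl(1 - \sqrt{(1+\mu)/(1-\mu)}\bigr)$), the following Python snippet sums the $\mu$-series at $\mu_0 = 2/3$ and compares with the exact lower eigenvalue $E(1) = (1 - \sqrt{5})/2$:

```python
import math

def binom_series(alpha, sign, n):
    """Taylor coefficients of (1 + sign*mu)^alpha up to order n."""
    coeffs, c = [1.0], 1.0
    for k in range(1, n + 1):
        c *= (alpha - (k - 1)) / k * sign
        coeffs.append(c)
    return coeffs

N = 40
# sqrt((1+mu)/(1-mu)) as a Cauchy product of two binomial series:
a = binom_series(0.5, +1.0, N)
b = binom_series(-0.5, -1.0, N)
s = [sum(a[j] * b[k - j] for j in range(k + 1)) for k in range(N + 1)]

# Assumes E(mu) = (1 - sqrt((1+mu)/(1-mu)))/2 from the matrix above.
mu0 = 2.0 / 3.0                   # corresponds to lambda = 1
partial = 0.0
for k in range(1, N + 1):
    partial += -0.5 * s[k] * mu0 ** k
print(partial)                    # partial sums approach the exact value
print((1 - math.sqrt(5)) / 2)     # exact lower eigenvalue at lambda = 1
```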

There also exist convergence acceleration schemes, which act rather like black-box algorithms, where one can enter the partial sums of a series in one column, perform a sequence of 'difference and quotient operations' following a given prescription, and end up with a rapidly converging column of values. This type of scheme generally works fine for the intended purpose, especially if the non-analyticities responsible for the poor convergence are regular poles rather than branch points. The application of such methods to the partial sums of a divergent series is not recommended.
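One classic scheme of this difference-and-quotient type is Aitken's $\Delta^2$ process (named here for concreteness; the text does not single out a specific algorithm). A minimal Python sketch, applied to the slowly converging partial sums of $\log 2 = 1 - \frac{1}{2} + \frac{1}{3} - \cdots$:

```python
import math

def aitken(s):
    """One pass of Aitken's delta-squared process over partial sums:
    t_n = s_n - (s_{n+1} - s_n)**2 / (s_{n+2} - 2*s_{n+1} + s_n)."""
    return [s[n] - (s[n + 1] - s[n]) ** 2
            / (s[n + 2] - 2 * s[n + 1] + s[n])
            for n in range(len(s) - 2)]

# Partial sums of the slowly convergent log 2 = 1 - 1/2 + 1/3 - ...
s, total = [], 0.0
for n in range(1, 12):
    total += (-1) ** (n + 1) / n
    s.append(total)

t = aitken(s)       # first accelerated column
u = aitken(t)       # second column, faster still
print(s[-1], t[-1], u[-1], math.log(2.0))
```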

Iteration, recursion and defect-correction.

Very often, a sequence of quantities $\{Q_i\}$ has a relation between neighboring elements, which can be exploited by repeatedly performing the same operations to successively generate one quantity after another. This procedure is called either iteration (from Lat. iterum = again) or recursion (Lat. recurrere = run back). The procedure may terminate, or it may be formally infinite. The only distinction between the two words seems to be that recursion is used to describe a procedure where each of the quantities is considered as potentially interesting, while iteration is used when only a final result (such as the convergence limit) is of interest.
Let us investigate some iterative scheme
$$x^{(k+1)} = F(x^{(k)}) \tag{2.12}$$
as an attempt to find a fixpoint $x^*$ of $F(x)$:
$$x^* = F(x^*) \tag{2.13}$$
Let us define an error $e^{(k)}$ by
$$e^{(k)} = x^{(k)} - x^*.$$
We immediately get
$$e^{(k+1)} = x^{(k+1)} - x^* = F(x^* + e^{(k)}) - F(x^*) = F'(x^*)\,e^{(k)} + \frac{1}{2}F''(x^*)\,e^{(k)2} + \cdots = A e^{(k)} + B e^{(k)2} + R(e^{(k)}), \tag{2.14}$$
where $R(e) \le O(e^3)$.
Some conclusions can be drawn immediately:
• If $0 < |A| < 1$, there will be some disk around $x = x^*$ within which the procedure will converge with an exponentially decreasing error. If $F(x)$ was unsuitable, then $|A| > 1$, and the procedure will initially diverge and can never converge towards the intended root.
• If $A = F'(x^*) = 0$, there will be some disk within which $e$ can be mapped onto some new variable $z$, such that
$$e \leftrightarrow z: \quad z = e + O(e^2)$$
$$e^{(k+1)} = B e^{(k)2} + R(e^{(k)}) \quad \Rightarrow \quad z^{(k+1)} = B z^{(k)2},$$
i.e., the mapping is intended to remove the residual term so that the recursion can be solved exactly:
$$B z^{(k)} = \left(B z^{(0)}\right)^{2^k}$$
Within this disk, we then get
$$e^{(k)} \sim O(z^{(k)}) \sim O\!\left(\left(B z^{(0)}\right)^{2^k}\right),$$
i.e.,
$$e^{(k)} \sim O\!\left(\mathrm{const}^{2^k}\right) \tag{2.15}$$
Such convergence is called quadratic, and is obviously very fast if $\mathrm{const} \approx B e^{(0)}$ is appreciably smaller than 1. The convergence region is given by $|z| < |1/B|$, and this can be used as a rough estimate also for the convergence region of $x$, provided that it is contained in the region where the mapping $e \leftrightarrow z$ is valid.
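A brief numerical illustration of the two cases (a sketch, not from the original text): the fixpoint iteration $x^{(k+1)} = \cos x^{(k)}$ has $A = -\sin x^* \approx -0.67$, so the error shrinks by a roughly constant factor per step, while the Newton map for $x^2 = 2$ has $A = 0$ and doubles the number of correct digits per step, as in (2.15):

```python
import math

# Linear convergence: fixpoint of F(x) = cos(x); A = -sin(x*) ~ -0.67,
# so each step multiplies the error by roughly |A|.
fixpoint = 0.7390851332151607
x = 1.0
for k in range(6):
    x = math.cos(x)
    print("linear   ", k, abs(x - fixpoint))

# Quadratic convergence: Newton map for x^2 = 2 has A = 0 at the root,
# so the number of correct digits doubles per step.
x = 1.0
for k in range(6):
    x = 0.5 * (x + 2.0 / x)
    print("quadratic", k, abs(x - math.sqrt(2)))
```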
A full investigation of the convergence properties should ideally subdivide the x
space into domains where the procedure diverges, domains where it ultimately converges
towards some root, and in particular one convergent domain which is mapped onto itself
by the iteration procedure, and which contains the desired solution. This domain, if it
exists, is called the convergence region, and within it, we want to know how fast the
convergence is.
Such a full investigation is virtually impossible except in some very simple cases, and
is even then usually very difficult. In particular for quadratically convergent methods,
the convergence region is usually bounded by a fractal instead of a regular curve. Out
of necessity, the convergence properties studied are usually some necessary criterion on
f and c near the desired solution, and the influence of that criterion on the asymptotic
error.
A particular case is the iterative solution of some equation $f(x) = 0$, which is usually done with some variant of the so-called defect-correction methods. Assume that we want to find a solution to the equation
$$f(x) = 0 \tag{2.16}$$
close to some given point $x_0$. It is then solved by iterating the formulae
$$\begin{cases} r^{(k)} = f(x^{(k)}) \\ x^{(k+1)} = c^{(k)}(x^{(k)}, r^{(k)}) \end{cases} \tag{2.17}$$
where $r^{(k)}$ is the defect, or residue, and $c^{(k)}$ is the correction function. The latter may, as indicated, be different in each step. This may be due to an adaptive modification of the correction function, based upon the result of previous iterations, or it may be due to a switching between use of different correction functions. However, the superscript of $c^{(k)}$ will now be dropped for simplicity.
We must require that
$$x = c(x, 0) \tag{2.18}$$
which is necessary (but not sufficient) for the $x^{(k)}$ to converge to a solution of (2.16). The reverse requirement
$$x = c(x, y) \Rightarrow y = 0 \tag{2.19}$$
is highly desirable, since this is a sufficient condition for a converged $x$ to be a solution to (2.16).
In this formulation, let us assume that $f(x)$ can be written as
$$f(x) = f_1 e + f_2 e^2 + R_1(e),$$
where
$$R_1(e) \le O(e^3),$$
and similarly, using requirement (2.18) to fix the pure-$e$ part,
$$c(x, y) = x + c_{01} y + c_{11} e y + c_{02} y^2 + R_2(e, y),$$
where $R_2$ collects terms of third order.
Direct substitution into (2.17) gives
$$e^{(k+1)} = A e^{(k)} + B e^{(k)2} + R(e^{(k)})$$
where
$$A = 1 + c_{01} f_1, \qquad B = c_{11} f_1 + c_{01} f_2 + c_{02} f_1^2, \qquad R(e) \le O(e^3) \tag{2.20}$$

We will not carry this analysis further, since in writing the correction function on the form $c(x, y)$ we have not explicitly included the dependence on derivatives of $f$. However, we note that this section provides reason for a classification of defect-correction schemes by a new concept, called convergence order. A procedure which guarantees the asymptotic inequalities (within some disk)
$$|e^{(k+1)}| \lesssim |e^{(k)}|^{\gamma} \ \text{if } \gamma < \gamma_0; \qquad |e^{(k+1)}| \gtrsim |e^{(k)}|^{\gamma} \ \text{if } \gamma > \gamma_0 \tag{2.21}$$
is said to have convergence order $\gamma_0$. We note that $\gamma_0 > 1$ guarantees fast convergence for a starting point $x^{(0)}$ sufficiently close to the intended root.
A very important method is the Newton-Raphson procedure:
$$x^{(k+1)} = x^{(k)} - f(x^{(k)})/f'(x^{(k)}) \tag{2.22}$$
Series expansion gives
$$e^{(k+1)} = \frac{f_2}{f_1}\, e^{(k)2} + O(e^{(k)3}), \tag{2.23}$$
so it has second-order (= quadratic) convergence, and unless higher-order derivatives of $f$ are large, the convergence radius can be estimated as $|f_1/f_2|$. In this case, note that the correction function was given the form
$$c(x, y) = x - y/f'(x)$$
This is a common form, but by no means the only one. Often, the corrected $x$ is instead given by some nonlinear procedure. As an example, an SCF procedure can be implemented by applying corrections to the Fock matrix, which is diagonalized to provide new occupied orbitals. The variable $x$ is best represented (in SCF) by a density matrix, which then depends in a highly non-linear way upon the corrections to the Fock matrix.
Finally, we note that if it is impractical to calculate the derivative, it can be replaced by a difference approximation, where the two last iterations are used:
$$x^{(k+1)} = x^{(k)} - f(x^{(k)})\, \frac{x^{(k)} - x^{(k-1)}}{f(x^{(k)}) - f(x^{(k-1)})} \tag{2.24}$$
This method is called regula falsi, and it has superlinear convergence, which means that $\gamma_0 > 1$. In this one-dimensional case, $\gamma_0 = \frac{1+\sqrt{5}}{2} \approx 1.618$.
We will return to the analysis of iterative equation solving in Chapter 4, but then
we will extend it to many dimensions.

Exercises for Chapter 2.

(8) When $n \to \infty$, is $n e^{-n} \sim O(e^{-n})$? - Is $n e^{-n}$ an exponentially decreasing function?

(9) Find the Taylor series for $\log(z)$ expanded around $z_0 = 1$. Determine the convergence radius. Explain which feature of the function is responsible for the limited convergence radius. Then show that a better way of calculating the logarithm is provided by
$$\log(z) = 2\left(\frac{z-1}{z+1} + \frac{1}{3}\left(\frac{z-1}{z+1}\right)^3 + \frac{1}{5}\left(\frac{z-1}{z+1}\right)^5 + \cdots\right)$$
which is valid for $\Re(z) > 0$. This is an example of a deformation of the convergence region by a simple variable substitution.
(10) Is it possible to find an asymptotic series $\sum_i a_i z^{-i}$ for any of the following functions? If so, is the series convergent? If it is convergent, what is its sum?
(a) $f(z) = 1/(z \log(z))$
(b) $f(z) = 1/(z + z^2 + e^{-z})$

(11) Consider a sequence $s_n$, $n = 1, 2, \ldots$, which is slowly converging with
$$s_n = s_\infty + A/n + B/n^2 + C/n^3 + D/n^4 + \cdots$$
Calculate the sequences $a_n, b_n, c_n, \ldots$ as
$$a_n = s_{2n} + (s_{2n} - s_n)/1$$
$$b_n = a_{2n} + (a_{2n} - a_n)/3$$
$$c_n = b_{2n} + (b_{2n} - b_n)/7$$
etc., where the denominators are $2^1 - 1, 2^2 - 1, 2^3 - 1, \ldots$, and show that each of the sequences converges towards $s_\infty$, and that each new sequence has faster asymptotic convergence than the previous one. This procedure is called Richardson extrapolation and is often used for its simplicity.
(12) Suppose you want to solve the equation $x = e^{-x}$ on a pocket calculator. Suggest a simple method, and verify that it works by working out its asymptotic convergence properties: What is the convergence order? If this is a first-order procedure, what is the convergence rate $|A| \overset{\text{def}}{=} \lim_{k\to\infty} \left|e^{(k+1)}/e^{(k)}\right|$? Show that the NR method, applied as $f(x) = x - e^{-x} = 0$, takes the form
$$x^{(k+1)} = \left(1 + x^{(k)}\right)/\left(1 + e^{x^{(k)}}\right)$$
Try a few iterations and see how it performs.
