JOURNAL OF COMPLEXITY 4, 12-32 (1988)

A Fast Algorithm for the Discrete Laplace Transformation

V. ROKHLIN*

Computer Science Department, Yale University, New Haven, Connecticut 06520-2158

An algorithm is presented for the rapid evaluation of expressions of the form

$$\sum_{j=1}^{m} \alpha_j \cdot e^{-\beta_j \cdot x}$$

at multiple points. In order to evaluate the above sum for n different values of the variable x, the algorithm requires order O(n + m) operations, and a simple modification of the scheme provides an order O(n) procedure for the evaluation of an order n polynomial at n arbitrary real points. The algorithm is numerically stable, and its performance is demonstrated by numerical examples. © 1988 Academic Press, Inc.

1. INTRODUCTION

In this paper, we present an algorithm for the rapid evaluation of expressions of the form

$$S_{\alpha,\beta}(x) = \sum_{j=1}^{m} \alpha_j \cdot e^{-\beta_j \cdot x}, \tag{1}$$

where $x \ge 0$, $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$, $\beta = \{\beta_1, \beta_2, \ldots, \beta_m\}$ are two finite sequences of real numbers and $\beta_j \ge 0$ for all $1 \le j \le m$. To evaluate the sum (1) at n arbitrary points on the real axis, the algorithm requires a number of arithmetic operations proportional to

$$(n + m) \cdot \left(\log\left(\frac{1}{\varepsilon}\right)\right)^3, \tag{2}$$

where $\varepsilon$ is the precision with which the calculations are being performed, and in most cases likely to be encountered in practice, the estimate (2) can be reduced to

$$(n + m) \cdot \log_2\left(\frac{1}{\varepsilon}\right) \tag{3}$$

(see Observations 7.1, 7.2 below).

* The author was supported in part by the Office of Naval Research under Grant N00014-86-K-0310.

0885-064X/88 $3.00
Copyright © 1988 by Academic Press, Inc.
All rights of reproduction in any form reserved.


The evaluation of expressions of the form (1) is closely related to several classical problems in the theory of computation. For example, the problem of rapidly evaluating a polynomial

$$P(t) = \sum_{j=1}^{m} p_j \cdot t^j \tag{4}$$

at m different points is readily reduced to the form (1) by the obvious substitution $x = -\log(t)$ (see Remark 3.1 below). The classical algorithm for evaluating (4) at m points has an asymptotic complexity $O(m \cdot \log^2(m))$ (see, for example, Aho et al., 1974; Borodin and Munro, 1975), making (3) a moderate improvement over previously available results, so far as the asymptotic CPU time estimate is concerned. On the other hand, the algorithm of the present paper is numerically stable, and our numerical experiments (see Sect. 8) indicate that in practical calculations, it is extremely efficient, making it a method of choice whenever expressions of the form (1) have to be evaluated at large numbers of points.
Remark 1.1. Classical algorithms for the rapid manipulation of poly-
nomials are purely algebraic and are applicable to polynomials over a
wide class of fields. On the other hand, the algorithm presented here is
based on approximation theory (i.e., it relies on certain facts from real
analysis) and is restricted to polynomials over the field of real numbers.
While it can be generalized to certain other fields, detailed investigation of
such generalizations is outside the scope of this paper, and will be re-
ported at a later date.

2. RELEVANT FACTS FROM APPROXIMATION THEORY

Suppose that a, b are a pair of real numbers such that a < b and that $k \ge 2$ is an integer. Chebychev nodes $t_1, t_2, \ldots, t_k$ on the interval [a, b] are defined by the formula

$$t_i = \frac{a+b}{2} + \frac{a-b}{2} \cdot \cos\left(\frac{2i-1}{2k} \cdot \pi\right). \tag{5}$$

For a function $f: [a, b] \to R^1$, we will denote by $P^k_{a,b,f}$ the order k - 1 Chebychev approximation to the function f on the interval [a, b], i.e., the (unique) polynomial of order k - 1 such that $P^k_{a,b,f}(t_i) = f(t_i)$ for all i = 1, 2, ..., k. There exist several expressions for the polynomial $P^k_{a,b,f}$, and the one we will use in this paper is

$$P^k_{a,b,f}(t) = \sum_{j=1}^{k} u_j(t) \cdot f(t_j) \tag{6}$$

with

$$u_j(t) = \frac{\prod_{i=1, i \ne j}^{k} (t - t_i)}{\prod_{i=1, i \ne j}^{k} (t_j - t_i)}. \tag{7}$$
The following well-known lemma provides an error estimate for Cheby-
chev approximations. It is the principal analytical tool of this paper and
can be found, in a somewhat different form, in Dahlquist and Bjork
(1974).
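The nodes (5) and the interpolation formula (6)-(7) can be sketched in a few lines of Python. This is only an illustration, assuming the usual Chebychev nodes (the roots of the Chebychev polynomial of degree k mapped to [a, b]); the function names are ours and do not come from the program described in Section 8.

```python
import math

def chebyshev_nodes(a, b, k):
    """Return the k Chebychev nodes on [a, b], in increasing order."""
    return [(a + b) / 2 + (a - b) / 2 * math.cos((2 * i - 1) * math.pi / (2 * k))
            for i in range(1, k + 1)]

def lagrange_weights(nodes, t):
    """Return [u_1(t), ..., u_k(t)], the weights of formula (7)."""
    weights = []
    for j, tj in enumerate(nodes):
        num = den = 1.0
        for i, ti in enumerate(nodes):
            if i != j:
                num *= t - ti
                den *= tj - ti
        weights.append(num / den)
    return weights

def interpolate(f, a, b, k, t):
    """Evaluate the order k - 1 Chebychev approximation of f at t, formula (6)."""
    nodes = chebyshev_nodes(a, b, k)
    return sum(u * f(tj) for u, tj in zip(lagrange_weights(nodes, t), nodes))

# Example: 8-node approximation of exp on [0, 1], evaluated at t = 0.3.
approx = interpolate(math.exp, 0.0, 1.0, 8, 0.3)
```

Evaluating all k weights at one point costs order k² operations in this form, which is the per-point cost quoted for the coefficients in Section 7.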
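LEMMA 2.1. If $f \in c^k[a, b]$ (i.e., f has k continuous derivatives on the interval [a, b]), then for any $t \in [a, b]$,

$$\left| P^k_{a,b,f}(t) - f(t) \right| \le \frac{M}{k!} \cdot \frac{(b-a)^k}{4^k} \tag{8}$$

with

$$M = \max_{t \in [a,b]} \left| f^{(k)}(t) \right|. \tag{9}$$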

Furthermore, for any k 2 2 and t E [a, bl,

(10)

and

$$\sum_{j=1}^{k} |u_j(t)| \le 2 + \frac{2}{\pi} \cdot \log(k). \tag{11}$$

In the present paper, the above lemma will be used in the special case where 0 < a < b, and $f(t) = e^{-\gamma \cdot t}$, with $\gamma > 0$. Under these conditions, the expression (8) assumes the form

$$\left| P^k_{a,b,f}(t) - f(t) \right| \le \frac{\gamma^k}{k!} \cdot \left(\frac{b-a}{4}\right)^k \cdot e^{-\gamma \cdot a}, \tag{12}$$

and the following lemma provides a form of the estimate (12) independent of $\gamma$.

LEMMA 2.2. If under the conditions of Lemma 2.1, $f(t) = e^{-\gamma \cdot t}$, b = 2a, and a > 0, then

$$\left| P^k_{a,b,f}(t) - f(t) \right| \le \frac{1}{4^k} \tag{13}$$

for all $k \ge 2$ and $t \in [a, b]$.


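Proof. Obviously, for $t \in [a, 2a]$, the estimate (12) can be rewritten in the form

$$\left| P^k_{a,b,f}(t) - f(t) \right| \le \frac{\gamma^k}{k!} \cdot \frac{a^k}{4^k} \cdot e^{-\gamma \cdot a}. \tag{14}$$

Differentiating the latter expression with respect to $\gamma$, we find that its maximum is achieved at

$$\gamma = \frac{k}{a}. \tag{15}$$

Now, substituting (15) into (14) and using Stirling's formula, we obtain

$$\left| P^k_{a,b,f}(t) - f(t) \right| \le \frac{k^k \cdot e^{-k}}{k! \cdot 4^k} < \frac{1}{4^k}. \tag{16}$$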
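The statement of Lemma 2.2 is easy to probe numerically. The following Python sketch interpolates $e^{-\gamma \cdot t}$ on [a, 2a] at k Chebychev nodes for several values of $\gamma$ and records the largest error observed on a grid of sample points. The node formula, the sample grid, and the chosen values of a and $\gamma$ are our assumptions for the purpose of illustration, and a grid search is of course not a proof.

```python
import math

def chebyshev_nodes(a, b, k):
    """The k Chebychev nodes on [a, b] (assumed to be the standard ones)."""
    return [(a + b) / 2 + (a - b) / 2 * math.cos((2 * i - 1) * math.pi / (2 * k))
            for i in range(1, k + 1)]

def interpolate_exp(gamma, a, k, t):
    """Interpolate f(s) = exp(-gamma * s) on [a, 2a] at k nodes and evaluate at t."""
    nodes = chebyshev_nodes(a, 2 * a, k)
    total = 0.0
    for j, tj in enumerate(nodes):
        u = 1.0
        for i, ti in enumerate(nodes):
            if i != j:
                u *= (t - ti) / (tj - ti)
        total += u * math.exp(-gamma * tj)
    return total

def worst_error(a, k, gammas, samples=200):
    """Largest observed |P(t) - f(t)| over sample points t in [a, 2a]."""
    worst = 0.0
    for gamma in gammas:
        for s in range(samples + 1):
            t = a + a * s / samples
            err = abs(interpolate_exp(gamma, a, k, t) - math.exp(-gamma * t))
            worst = max(worst, err)
    return worst

a = 1.5
gammas = [0.0, 0.1, 1.0, 3.0, 10.0, 100.0]
errors = {k: worst_error(a, k, gammas) for k in (2, 4, 6, 8)}
```

For each k in the dictionary `errors`, the observed value can be compared with the bound $4^{-k}$ of (13); the point of the lemma is that this comparison does not depend on $\gamma$ or on a.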

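3. EXACT STATEMENT OF THE PROBLEM

In the description of the algorithm below, we will assume that

(a) $\alpha = \{\alpha_1, \alpha_2, \ldots, \alpha_m\}$, $\beta = \{\beta_1, \beta_2, \ldots, \beta_m\}$, $x = \{x_1, x_2, \ldots, x_n\}$ are three finite sequences of real numbers.
(b) The sequences $\beta$ and $x$ are monotonically increasing.
(c) $\beta_1 \ge 0$.
(d) $x_1 \ge 0$.
(e) We would like to evaluate the sums

$$S_{\alpha,\beta}(x_k) = \sum_{j=1}^{m} \alpha_j \cdot e^{-\beta_j \cdot x_k} \tag{17}$$

for all k = 1, 2, ..., n with a relative accuracy $\varepsilon > 0$; i.e., we would like to find a number $\tilde{S}_{\alpha,\beta}(x_k)$ such that

$$\frac{\left| \tilde{S}_{\alpha,\beta}(x_k) - S_{\alpha,\beta}(x_k) \right|}{\sum_{j=1}^{m} |\alpha_j|} < \varepsilon \tag{18}$$

for each $k \in [1, n]$.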
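For reference, the straightforward evaluation of (17) and the error measure (18) can be written as follows. This is a minimal Python sketch of the direct order n · m calculation, not of the fast algorithm; the function names and the small data set are ours.

```python
import math

def direct_sums(alpha, beta, x):
    """Evaluate S(x_k) = sum_j alpha_j * exp(-beta_j * x_k) for every x_k: O(n * m) work."""
    return [sum(a * math.exp(-b * xk) for a, b in zip(alpha, beta)) for xk in x]

def relative_error(approx, exact, alpha):
    """Error measure of (18): largest |approx_k - exact_k| divided by sum_j |alpha_j|."""
    scale = sum(abs(a) for a in alpha)
    return max(abs(p - q) for p, q in zip(approx, exact)) / scale

alpha = [1.0, -2.0, 0.5]
beta = [0.0, 1.0, 2.0]      # monotonically increasing, beta_1 >= 0
x = [0.0, 1.0]              # monotonically increasing, x_1 >= 0
S = direct_sums(alpha, beta, x)
```

Note that the error in (18) is measured relative to the sum of the absolute values of the weights, not relative to the value of the sum itself, so cancellation among the terms does not tighten the requirement.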


Remark 3.1. As has been mentioned in the Introduction, the problem of evaluating a polynomial of order m at n points is easily reduced to the form (17). Indeed, suppose that a polynomial of the form (4) has to be evaluated at a monotonically increasing finite sequence of points $t_1, t_2, \ldots, t_n$. It can be assumed without a loss of generality that $0 \le t_k \le 1$ for all k = 1, 2, ..., n, and we will introduce a new variable $x = -\log(t)$, and denote $-\log(t_k)$ by $x_k$. Thus, evaluation of the polynomial (4) at a monotonically increasing finite sequence of points has been reduced to evaluating the expression

$$\sum_{j=1}^{m} p_j \cdot e^{-j \cdot x} \tag{19}$$

at a monotonically decreasing finite sequence of points $x = \{x_1, x_2, \ldots, x_n\}$. Finally, by reversing the order of the sequence $x$, we reduce the evaluation of the polynomial (4) at the points $t_1, t_2, \ldots, t_n$ to the standard problem formulated above.
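The reduction of Remark 3.1 can be checked on a small example. The Python sketch below evaluates a polynomial without a constant term once by Horner's rule and once as the exponential sum (19) with $x = -\log(t)$. The coefficients and points are arbitrary illustrative values, and the points are taken strictly positive so that the logarithm is defined.

```python
import math

def horner(p, t):
    """Evaluate P(t) = sum_{j=1..m} p[j-1] * t**j by Horner's rule."""
    acc = 0.0
    for coeff in reversed(p):
        acc = (acc + coeff) * t
    return acc

def poly_as_exponential_sum(p, t):
    """Evaluate the same polynomial in the form (19), with x = -log(t), 0 < t <= 1."""
    x = -math.log(t)
    return sum(pj * math.exp(-j * x) for j, pj in enumerate(p, start=1))

p = [0.5, -1.0, 2.0, 0.25]
ts = [0.1, 0.4, 0.9, 1.0]                 # increasing points t_k
xs = sorted(-math.log(t) for t in ts)     # the x_k, put back into increasing order
```

In the notation of Section 3, this corresponds to $\alpha_j = p_j$ and $\beta_j = j$, so the sequence $\beta$ is automatically increasing and non-negative.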

4. NOTATION

In this section, we will introduce several definitions to be used in the description of the algorithm in Sections 5 and 6 below. Throughout this section, we will assume that we are dealing with the problem described in Section 3, and that q is an integer whose particular value is to be determined later.
We will denote by M the smallest integer number such that

(20)

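We will define a finite sequence $\{U_i\}$, i = 1, 2, ..., M of intervals on the real axis by the formulae

$$U_i = \left[ \frac{\beta_m}{2^i}, \frac{\beta_m}{2^{i-1}} \right] \quad \text{for } 1 \le i \le M - 1, \qquad U_M = \left[ 0, \frac{\beta_m}{2^{M-1}} \right]. \tag{21}$$

Similarly, we will define a finite sequence $\{V_i\}$, i = 1, 2, ..., M of intervals by the formulae

$$V_i = \left[ \frac{x_n}{2^i}, \frac{x_n}{2^{i-1}} \right] \quad \text{for } 1 \le i \le M - 1, \qquad V_M = \left[ 0, \frac{x_n}{2^{M-1}} \right]. \tag{22}$$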

For any i = 1, 2, ..., M, we will denote by $\bar{\beta}_i$ the subset of $\beta$ consisting of all points $\beta_l$ such that $\beta_l \in U_i$, and for any i = 1, 2, ..., M, we will denote by $\bar{x}_i$ the subset of $x$ consisting of all points $x_l$ such that $x_l \in V_i$.
For each i = 1, 2, ..., M, $m_i$ will denote the number of elements in $\bar{\beta}_i$. Similarly, for each i = 1, 2, ..., M, $n_i$ will denote the number of elements in $\bar{x}_i$.
Remark 4.1. Obviously, depending on the distributions of the points $\beta_i$ and $x_i$, the number M can be fairly large. However, the total number $\bar{M}$ of such i that $m_i \ne 0$ is bounded by m, and the total number $\bar{N}$ of such i that $n_i \ne 0$ is bounded by n. For obvious reasons, we will refer to intervals $U_i$, $V_j$ such that $m_i = 0$ and $n_j = 0$ as empty. In the opposite case, the intervals will be referred to as non-empty.
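The assignment of points to intervals can be sketched as follows. This Python fragment assumes that the intervals are dyadic, i.e., that interval i (for i < M) runs from largest / 2^i to largest / 2^(i-1) and that interval M collects everything down to zero. It also takes M as a parameter supplied by the caller rather than computing it from (20). Both the function names and the treatment of points lying exactly on a shared endpoint are our choices.

```python
import math

def interval_index(value, largest, M):
    """Index i in 1..M of the dyadic interval containing value, 0 <= value <= largest.

    Interval i (i < M) is taken to be [largest / 2**i, largest / 2**(i - 1)];
    interval M collects everything from 0 up to largest / 2**(M - 1).
    """
    if value <= 0.0:
        return M
    i = math.floor(math.log2(largest / value)) + 1
    return min(max(i, 1), M)

def bin_points(points, M):
    """Group an increasing sequence of non-negative points by interval index."""
    largest = points[-1]
    bins = {}
    for p in points:
        bins.setdefault(interval_index(p, largest, M), []).append(p)
    return bins

beta = [0.0, 0.3, 1.0, 1.1, 2.5, 7.9, 8.0]
bins = bin_points(beta, M=6)
```

Only the non-empty intervals appear as keys of `bins`, so the number of keys plays the role of $\bar{M}$ in Remark 4.1 and can never exceed the number of points.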
For each i = 1, 2, ..., M, and j = 1, 2, ..., q, we will denote by $\beta_j^i$ the jth Chebychev node on the interval $U_i$.
Similarly, for each i = 1, 2, ..., M, and j = 1, 2, ..., q, we will denote by $x_j^i$ the jth Chebychev node on the interval $V_i$.
For each k = 1, 2, ..., M, and i such that $\beta_i \in U_k$, we will define the finite sequence $\{u^k_{i,j}\}$, j = 1, 2, ..., q by the formula

(23)

For each k = 1, 2, ..., M, and i = 1, 2, ..., q, we will define a real number $U_i^k$ by the formula

(24)

Observation 4.1. Due to Lemma 2.2, the expression

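can be viewed as an approximation to the function $e^{-\beta_i \cdot t}$. Furthermore, for any $t \in [0, \infty]$,

$$\left| \psi_i^k(t) - e^{-\beta_i \cdot t} \right| \le \frac{1}{4^q}. \tag{26}$$

Combining (24)-(26) with the triangle inequality, we easily see that the sum

$$\Phi_k(t) = \sum_{i=1}^{q} U_i^k \cdot e^{-\beta_i^k \cdot t} = \sum_{i=1}^{q} \sum_{\beta_j \in U_k} \alpha_j \cdot u^k_{j,i} \cdot e^{-\beta_i^k \cdot t} \tag{27}$$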

can be viewed as an approximation to

and that

(29)

Furthermore, combining (11) with (24) and using the triangle inequality, we obtain

(30)

Given k = 1, 2, ..., M, and i = 1, 2, ..., q, we will define a real number $f_i^k$ by the expression

(31)

For each k = 1, 2, ..., M, and $1 \le j \le n$ such that $x_j \in V_k$, we will define $\tilde{f}_j$ by the formula

(32)

with the coefficients $v^k_{i,j}$ defined by the formula

$$v^k_{i,j} = \prod_{l=1, l \ne i}^{q} \frac{x_j - x_l^k}{x_i^k - x_l^k}. \tag{33}$$

Observation 4.2. Due to Lemma 2.2, for any j = 1, 2, ..., n and k such that $x_j \in V_k$, $\tilde{f}_j$ can be viewed as an approximation to the expression

$$f_j = \sum_{i=\nu_k+1}^{\mu_k-1} \sum_{l=1}^{q} U_l^i \cdot e^{-\beta_l^i \cdot x_j}, \tag{34}$$

and

(35)

Combining (35), (29), (30), and using the triangle inequality, we conclude that

(36)

for any j = 1, 2, ..., n. Now, for any given $\varepsilon$ and $q > 2 \cdot \log_4(1/\varepsilon)$,

(37)

For any i = 1, 2, ..., M - 1, we will denote by $\nu_i$ the largest integer such that

$$\nu_i < \log_2(\beta_m \cdot x_n) - i - \log_2\left(\log_e\left(\frac{1}{\varepsilon}\right)\right). \tag{38}$$

Similarly, for any i = 1, 2, ..., M - 1, we will denote by $\mu_i$ the smallest integer such that

/Ai > lOgz(p, ’ X,) - i - log2


il
i , (39)

For any k = 1, 2, ..., M, we will define the subset $W_k$ of the interval $[0, \beta_m]$ by the formula

$$W_k = \bigcup_{i \ge \mu_k} V_i, \tag{40}$$

and denote by $S_k$ the sum

(41)

Observation 4.3. It is easy to see that if $x \in U_i$ and $\beta \in V_j$ with $j \le \nu_i$, then

$$e^{-\beta \cdot x} \le \varepsilon. \tag{42}$$

Similarly, if $x \in U_i$ and $\beta \in V_j$ with $j \ge \mu_i$, then

$$\left| e^{-\beta \cdot x} - 1 \right| \le \varepsilon. \tag{43}$$

Furthermore, for any i = 1, 2, ..., M - 1,

(44)

In other words, given $x \in U_i$ and $\beta \in V_j$, one of three possible situations obtains:
(a) $j \le \nu_i$. In this case, $e^{-\beta \cdot x}$ can be approximated by 0 with a precision $\varepsilon$.
(b) $j \ge \mu_i$. In this case, $e^{-\beta \cdot x}$ can be approximated by 1 with a precision $\varepsilon$.
(c) $\nu_i \le j \le \mu_i$. In this case, $e^{-\beta \cdot x}$ cannot be approximated by either 0 or 1. However, the total number of indices j for which this situation obtains is bounded by $2 \cdot \log_2(1/\varepsilon)$, independently of $x$, $\beta$, or i.
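The three situations can be illustrated directly in terms of the product $\beta \cdot x$. The Python sketch below does not use the indices $\nu_i$, $\mu_i$; instead it assumes the two thresholds $\log_e(1/\varepsilon)$ and $\varepsilon$ on the product itself, which follow from $e^{-s} \le \varepsilon$ for $s \ge \log_e(1/\varepsilon)$ and from $1 - e^{-s} \le s$. The function names are ours.

```python
import math

def regime(beta, x, eps):
    """Classify exp(-beta * x) for beta, x >= 0 and 0 < eps < 1."""
    s = beta * x
    if s >= math.log(1.0 / eps):
        return "zero"      # exp(-s) <= eps
    if s <= eps:
        return "one"       # 1 - exp(-s) <= s <= eps
    return "middle"        # neither shortcut applies

def octaves_in_middle(eps):
    """Number of factor-of-two steps between the thresholds eps and log(1/eps)."""
    return math.log2(math.log(1.0 / eps) / eps)

eps = 1e-8
labels = [regime(1.0, x, eps) for x in (1e-9, 1e-3, 1.0, 20.0)]
```

Since the intervals are dyadic, the number of interval indices falling in the middle regime is governed by the number of doublings between the two thresholds, which `octaves_in_middle` measures; for $\varepsilon = 10^{-8}$ it is well below $2 \cdot \log_2(1/\varepsilon)$.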

5. INFORMAL DESCRIPTION OF THE ALGORITHM

We will illustrate the idea of the algorithm on a simplified example. Namely, we will assume that $\beta_i \in U_1$, i.e.,

(45)

for all i = 1, 2, ..., m, and $x_j \in V_1$, i.e.,

(46)

for all j = 1, 2, ..., n.


Consider the function $e^{-\beta \cdot x}$ with $\beta \in U_1$, $x \in V_1$. Fixing x and viewing $e^{-\beta \cdot x}$ as a function of $\beta$, we construct its q-point Chebychev approximation $\psi_x^q(\beta)$ on the interval $U_1$. Due to (6),

(47)

with the functions $u_j$ defined by (7) and the coefficients $\beta_j^1$ defined in Section 4. According to Lemma 2.2,

$$\left| \psi_x^q(\beta) - e^{-\beta \cdot x} \right| \le \frac{1}{4^q}, \tag{48}$$

and, given a fixed precision $\varepsilon$, we can choose $q \sim 2 \cdot \log_4(1/\varepsilon)$ and in all subsequent calculations replace $e^{-\beta \cdot x}$ with $\psi_x^q(\beta)$. Combining (48) with the triangle inequality, we obtain the estimate

(49)

for any $x \in [0, +\infty]$, and due to (47), the latter can be rewritten in the form

(50)

with the coefficients $\psi_1, \psi_2, \ldots, \psi_q$ defined by the formula

(51)

Now, instead of evaluating (17) at each of the points $x_i$, we start by evaluating the coefficients $\psi_i$, i = 1, 2, ..., q, which is, obviously, an order $O(m \cdot q)$ procedure. After that, we evaluate the expression

(52)

for all k = 1, 2, ..., n, which is an order $O(n \cdot q)$ procedure (evaluating a q-term expansion at n points). Thus, the total operation count becomes $O((n + m) \cdot q)$. Due to (49), in order to obtain a relative accuracy $\varepsilon$, q has to be of the order $\log_4(1/\varepsilon)$, and we have reduced the computational complexity of evaluating (17) from $O(n \cdot m)$ to

$$O\left( (n + m) \cdot \log_4\left(\frac{1}{\varepsilon}\right) \right). \tag{53}$$
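The simplified scheme can be sketched in Python as follows. The sketch reflects our reading of the description above: each $e^{-\beta_j \cdot x}$ is replaced by its q-point Chebychev interpolant in $\beta$ on the interval [b/2, b] with b the largest $\beta_j$, the weights $\alpha_j$ are accumulated onto the q nodes, and the resulting q-term expansion is evaluated at every $x_k$. The names `psi` and `fast_sums_single_interval`, and the test data, are hypothetical. The interpolation weights are recomputed on the fly here, so the first step costs order m · q² rather than m · q.

```python
import math

def chebyshev_nodes(a, b, k):
    return [(a + b) / 2 + (a - b) / 2 * math.cos((2 * i - 1) * math.pi / (2 * k))
            for i in range(1, k + 1)]

def lagrange_weights(nodes, t):
    weights = []
    for j, tj in enumerate(nodes):
        u = 1.0
        for i, ti in enumerate(nodes):
            if i != j:
                u *= (t - ti) / (tj - ti)
        weights.append(u)
    return weights

def fast_sums_single_interval(alpha, beta, x, q):
    """Approximate sum_j alpha_j * exp(-beta_j * x_k), assuming all beta_j lie in [b/2, b]."""
    b = max(beta)
    nodes = chebyshev_nodes(b / 2, b, q)
    # Step 1: accumulate the weights alpha_j onto the q nodes.
    psi = [0.0] * q
    for a_j, b_j in zip(alpha, beta):
        for i, u in enumerate(lagrange_weights(nodes, b_j)):
            psi[i] += a_j * u
    # Step 2, O(n * q): evaluate the q-term expansion at every x_k.
    return [sum(p * math.exp(-t * xk) for p, t in zip(psi, nodes)) for xk in x]

m, n, q = 50, 40, 14
beta = [1.0 + j / (m - 1) for j in range(m)]      # points in [1, 2]
alpha = [math.sin(3.0 * j) for j in range(m)]
x = [0.5 * k for k in range(n)]
fast = fast_sums_single_interval(alpha, beta, x, q)
direct = [sum(a * math.exp(-b * xk) for a, b in zip(alpha, beta)) for xk in x]
scale = sum(abs(a) for a in alpha)
error = max(abs(f - d) for f, d in zip(fast, direct)) / scale
```

The quantity `error` is the relative error in the sense of (18), and by (48) and the triangle inequality it should not exceed $4^{-q}$ up to rounding.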

An alternative approach would be to calculate the coefficients $\psi_i$ for i = 1, 2, ..., q (order $m \cdot q$ operations), evaluate the expression

(54)

for all k = 1, 2, ..., q (order $q^2$ operations), and interpolate the expression (17) from the Chebychev nodes $x_1^1, x_2^1, \ldots, x_q^1$ to the points $x_1, x_2, \ldots, x_n$ (order $n \cdot q$ operations). The resulting CPU time estimate in this case is

$$O((n + m) \cdot q) + O(q^2) = O\left( (n + m) \cdot \log_4\left(\frac{1}{\varepsilon}\right) \right) + O\left( \left( \log_4\left(\frac{1}{\varepsilon}\right) \right)^2 \right), \tag{55}$$

which is not substantially different from (53).


When the points $\beta_1, \beta_2, \ldots, \beta_m$ and $x_1, x_2, \ldots, x_n$ do not satisfy the inequalities (45), (46), the above approach cannot be used in such a straightforward manner. However, for any $i, j \in [1, M]$, Lemma 2.2 can be used separately on each of the intervals $U_i$, $V_j$, with the results combined to obtain an approximation to (17). This is done in the following section, resulting in an order

Ok n + m) * log (i)) + 0 (n * (log (3)‘) (56)

algorithm for evaluating (17) at n points with a relative precision E.

6. DETAILED DESCRIPTION OF THE ALGORITHM

Algorithm

Stage 1.
Comment [Choose parameters and perform geometrical preprocessing.]
Choose precision $\varepsilon$ to be achieved. Set $q = 2 \cdot \log(1/\varepsilon)$. Construct the intervals $U_i$, $V_i$, and the sets $\bar{\beta}_i$, $\bar{x}_i$ with i = 1, 2, ..., M.

Stage 2.
Comment [On each of the non-empty intervals $U_k$, evaluate the coefficients $U_i^k$ in the expansions (27).]
Step 1.
Comment [Set all coefficients $U_i^k$ to zero.]
do k = 1, M - 1, $\bar{\beta}_k \ne \emptyset$
    do i = 1, q
        set $U_i^k$ to zero.
    end do
end do

Step 2.
Comment [For each $\beta_j$ on each of the non-empty intervals $U_k$, evaluate $\alpha_j \cdot u^k_{j,i}$ and add it to the $U_i^k$.]
do k = 1, M - 1, $\bar{\beta}_k \ne \emptyset$
    do i = 1, q
        do $\beta_j \in U_k$
            Evaluate $u^k_{j,i}$ via formula (23) and add the product $\alpha_j \cdot u^k_{j,i}$ to $U_i^k$.
        end do
    end do
end do

Stage 3.
Comment [Evaluate $f_i^k$ via formula (31) for all k = 1, 2, ..., M such that $\bar{x}_k \ne \emptyset$, and i = 1, 2, ..., q.]
do k = 1, M - 1, $\bar{x}_k \ne \emptyset$
    do i = 1, q
        evaluate the expression $f_i^k = \sum_{j=\nu_k+1}^{\mu_k-1} \sum_{l=1}^{q} U_l^j \cdot e^{-\beta_l^j \cdot x_i^k}$.
    end do
end do

Stage 4.
Comment [For each j = 1, 2, ..., n, evaluate $\tilde{f}_j$ via formula (32).]
do k = 1, M - 1, $\bar{x}_k \ne \emptyset$
    do $x_j \in V_k$
        evaluate the expression $\tilde{f}_j = \sum_{i=1}^{q} v^k_{i,j} \cdot f_i^k$.
    end do
end do

Stage 5.
Comment [For each k = 1, 2, ..., M and each $x_i \in V_k$, use Observation 4.3 to evaluate the sum $S_k = \sum_{\beta_j \in W_k} \alpha_j$. Add the result to $\tilde{f}_i$, concluding the calculation.]
Step 1.
Comment [Evaluate $S_1$.]
set $S_1 = \sum_{\beta_i \in W_1} \alpha_i$.

Step 2.
Comment [Evaluate $S_k$ recursively for k = 2, 3, ..., M.]
do k = 2, M, $\bar{x}_k \ne \emptyset$
    evaluate $S_k$ via the formula $S_k = S_{k-1} + \sum_{\beta_j \in U_k} \alpha_j$.
end do

Step 3.
Comment [For all k = 1, 2, ..., M, and all i such that $x_i \in V_k$, add $S_k$ to $\tilde{f}_i$, concluding the calculation.]
do k = 1, M, $\bar{x}_k \ne \emptyset$
    do $x_j \in V_k$
        add $S_k$ to $\tilde{f}_j$.
    end do
end do

7. COMPLEXITY ANALYSIS

Stage number    Operation count                                   Explanation

Stage 1         $O(n + m)$                                        Each of the points $\beta_1, \beta_2, \ldots, \beta_m$ is assigned to a single
                                                                  interval $U_i$. Each of the points $x_1, x_2, \ldots, x_n$ is assigned
                                                                  to a single interval $V_i$.
Stage 2
  Step 1        $O(\bar{M} \cdot q)$                              Each of the coefficients $U_i^k$, with k = 1, 2, ..., M, and
                                                                  i = 1, 2, ..., q is set to zero.
  Step 2        $O(m \cdot q^2)$                                  Each of the points $\beta_1, \beta_2, \ldots, \beta_m$ contributes to the
                                                                  coefficients $u^k_{i,j}$ with j = 1, 2, ..., q, and evaluating each
                                                                  of the coefficients $u^k_{i,j}$ requires order q work (see (23)).
Stage 3         $O(\bar{M} \cdot q^2 \cdot \log(1/\varepsilon))$  The sum (31) has to be evaluated at q nodes $x_1^k, x_2^k, \ldots, x_q^k$
                                                                  on each of non-empty intervals $V_1, V_2, \ldots, V_M$, and on the
                                                                  kth interval, it contains $\mu_k - \nu_k$ terms. However, due to
                                                                  (44), $\mu_k - \nu_k \le 2 \cdot \log_2(1/\varepsilon)$ for all k = 1, 2, ..., M.
Stage 4         $O(n \cdot q^2)$                                  The expression (32) has to be evaluated for each of the points
                                                                  $x_1, x_2, \ldots, x_n$, and evaluating each of the coefficients
                                                                  $v^k_{i,j}$ requires order q work (see (33)).
Stage 5
  Step 1        $O(m)$                                            The sum $S_1 = \sum_{\beta_i \in W_1} \alpha_i$ contains no more than m terms.
  Step 2        $O(n + m)$                                        The total number of non-empty intervals $V_i$ is bounded by n,
                                                                  and the total number of coefficients $\alpha_j$ is bounded by m.
  Step 3        $O(n)$                                            Each of the numbers $\tilde{f}_j$ is amended once.

Summing up the CPU times for all stages above, we obtain the following time estimate:

$$T_{total} = a \cdot m + b \cdot n + c \cdot m \cdot q^2 + d \cdot n \cdot q^2 + e \cdot \bar{M} \cdot q^2 \cdot \log\left(\frac{1}{\varepsilon}\right), \tag{57}$$

where the coefficients a, b, c, d, e depend on the computer system, language, implementation, etc. However, $\bar{M} \le m$, and $q \sim \log(1/\varepsilon)$, and the estimate (57) assumes the form

$$T_{total} = O\left( (m + n) \cdot \left( \log\left(\frac{1}{\varepsilon}\right) \right)^3 \right). \tag{58}$$

The estimate (58) is independent of the locations of the points $\beta_i$, $x_i$ in $R^1$, and does not depend on any precomputed data. The following two observations reduce it to

$$T_{total} = O\left( (m + n) \cdot \log\left(\frac{1}{\varepsilon}\right) \right) \tag{59}$$

for many problems of practical interest.


Observation 7.1. The term $e \cdot \bar{M} \cdot q^2 \cdot \log(1/\varepsilon)$ in (57) is associated with the Stage 3 of the algorithm and the grossly pessimistic estimate

MSMIrn. (60)

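According to (20),

$$M - 1 \le \log_2\left( \frac{\beta_m \cdot x_n}{\varepsilon} \right) = \log_2(\beta_m) + \log_2(x_n) + \log_2\left(\frac{1}{\varepsilon}\right). \tag{61}$$

Normally, when calculations are performed on a physical computer, the exponent in the binary representation of a real number is bounded, and we will denote this bound by A. It immediately follows that in all cases,

$$\bar{M} \le M \le 3 \cdot \log_2(A) + 1, \tag{62}$$

and the estimate (57) becomes

$$T_{total} = a \cdot m + b \cdot n + c \cdot m \cdot q^2 + d \cdot n \cdot q^2 + e \cdot \log(A) \cdot q^2 \cdot \log\left(\frac{1}{\varepsilon}\right). \tag{63}$$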
Observation 7.2. The terms $c \cdot m \cdot q^2$ and $d \cdot n \cdot q^2$ in (57) are associated with the Stages 2 and 4 of the algorithm, and with the fact that in order to evaluate each of the coefficients $u^k_{j,i}$ (or $v^k_{j,i}$), a (q - 1)-term product of the form (23) (or (33)) has to be evaluated. Obviously, the coefficients $u^k_{j,i}$ depend only on the distribution of the points $\beta_i$, and not on that of $x_i$ or $\alpha_i$. Similarly, the coefficients $v^k_{j,i}$ depend only on the distribution of the points $x_i$ and not on that of $\beta_i$ or $\alpha_i$. Therefore, for fixed distributions of $\beta_i$ and $x_i$, the coefficients $u^k_{j,i}$, $v^k_{j,i}$ can be precomputed and stored, reducing the total CPU time estimate to

$$T_{total} \sim a \cdot m \cdot q + b \cdot n \cdot q + c \cdot \log(A) \cdot q^2 \cdot \log\left(\frac{1}{\varepsilon}\right). \tag{64}$$

However, $q \sim \log(1/\varepsilon)$, and log(A) is fixed for given computer system and language. Thus, when $m, n \to \infty$,

$$T_{total} \sim (a \cdot m + b \cdot n) \cdot q. \tag{65}$$
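The effect of precomputation can be illustrated on the single-interval case of Section 5. In the Python sketch below, the interpolation weights are computed once for a fixed set of points $\beta_j$ and then reused for several sets of weights $\alpha$. The class and its methods are hypothetical names of ours, the restriction to one interval [b/2, b] is a simplifying assumption, and the sketch is not the two-subroutine program described in Section 8.

```python
import math

class SingleIntervalTransform:
    """Initialise once for fixed beta in [b/2, b]; apply to many weight vectors alpha."""

    def __init__(self, beta, q):
        b = max(beta)
        # Chebychev nodes on [b/2, b]: midpoint 0.75 * b, half-width 0.25 * b.
        self.nodes = [0.75 * b - 0.25 * b * math.cos((2 * i - 1) * math.pi / (2 * q))
                      for i in range(1, q + 1)]
        # u[j][i] = i-th interpolation weight at beta_j; depends on beta only, O(m * q**2).
        self.u = [[self._weight(i, bj) for i in range(q)] for bj in beta]

    def _weight(self, i, t):
        w = 1.0
        for l, tl in enumerate(self.nodes):
            if l != i:
                w *= (t - tl) / (self.nodes[i] - tl)
        return w

    def apply(self, alpha, x):
        """O((m + n) * q) work per weight vector, using the stored weights."""
        q = len(self.nodes)
        psi = [sum(a * row[i] for a, row in zip(alpha, self.u)) for i in range(q)]
        return [sum(p * math.exp(-t * xk) for p, t in zip(psi, self.nodes)) for xk in x]

def direct(alpha, beta, x):
    return [sum(a * math.exp(-b * xk) for a, b in zip(alpha, beta)) for xk in x]

beta = [2.0 + 0.02 * j for j in range(101)]       # points in [2, 4]
x = [0.1 * k for k in range(60)]
transform = SingleIntervalTransform(beta, q=14)
results = {}
for name, alpha in (("ones", [1.0] * 101), ("alternating", [(-1.0) ** j for j in range(101)])):
    fast, exact = transform.apply(alpha, x), direct(alpha, beta, x)
    results[name] = max(abs(f - e) for f, e in zip(fast, exact)) / sum(abs(a) for a in alpha)
```

The constructor carries the order m · q² cost, while each call to `apply` costs order (m + n) · q, which mirrors the split between initialization and evaluation discussed in this observation.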

8. NUMERICAL RESULTS

A computer program has been written implementing the algorithm of this paper. The calculation is performed in two stages, each implemented by a separate subroutine. During the initialization stage, the coefficients $u^k_{i,j}$, $v^k_{i,j}$ are evaluated for given distributions of points $\beta_1, \beta_2, \ldots, \beta_m$, $x_1, x_2, \ldots, x_n$ (see Observation 7.2). During the second stage, the sums (17) are evaluated for a given set of weights $\alpha_1, \alpha_2, \ldots, \alpha_m$.
Remark 8.1. It is clear from Tables 1, 3, and 5 that the first stage (initialization) tends to be several times more expensive than the second (evaluation). However, in most applications the algorithm has to be initialized once, with subsequent repeated evaluation of the sums (17) for varying sets of weights $\alpha_1, \alpha_2, \ldots, \alpha_m$. This situation is similar to that encountered for the Fast Fourier Transformation.

TABLE I
EXAMPLE 1: TIMINGS

n       T_init    T_alg     T_dir
20      0.0112    0.0015    0.0081
40      0.0369    0.0042    0.0318
80      0.0802    0.0092    0.1278
160     0.136     0.0165    0.5202
320     0.218     0.0283    2.069
640     0.333     0.0468    8.368
1280    0.484     0.0784    33.25
2560    0.727     0.137     133.58

TABLE II
EXAMPLE 1: ACCURACIES

n       $\delta^{max}_{alg}$   $\delta^{max}_{dir}$   $\delta^{max,rel}_{alg}$   $\delta^{max,rel}_{dir}$   $\delta_{alg}$   $\delta_{dir}$
20      .412E-06   .638E-06   .383E-06   .243E-06   .359E-07   .556E-
40      .179E-05   .219E-05   .459E-06   .418E-06   .783E-07   .960E-
80      .408E-05   .688E-05   .623E-06   .671E-06   .100E-06   .169E-
160     .138E-04   .262E-04   .825E-06   .116E-06   .173E-06   .326E-
320     .378E-04   .873E-04   .597E-06   .195E-05   .238E-06   .550E-
640     .922E-04   .231E-03   .103E-05   .258E-05   .279E-06   .699E-
1280    .273E-03   .740E-03   .841E-06   .473E-05   .420E-06   .114E-
2560    .522E-03   .233E-02   .880E-06   .886E-05   .407E-06   .181E-
The program has been applied to a variety of situations, and three such examples are presented in this section, with the computations performed on a VAX-8600 computer. In each case, we performed the calculations in three ways: via the algorithm of the present paper in single precision arithmetic, directly in single precision arithmetic, and directly in double precision arithmetic. The first two calculations were used to compare the speed and precision of the algorithm with that of the direct calculation. The direct evaluation of the field in double precision was used as a standard for comparing the accuracies of the first two calculations. In all cases, we set $\varepsilon = 10^{-8}$, and

$$m = n = 10 \cdot 2^k, \tag{66}$$

with k varying from 1 to 8.

TABLE III
EXAMPLE 2: TIMINGS

n       T_init    T_alg     T_dir
20      0.0097    0.0011    0.0083
40      0.0275    0.0033    0.0332
80      0.0768    0.0089    0.1328
160     0.126     0.0157    0.536
320     0.210     0.0271    2.12
640     0.326     0.0455    8.50
1280    0.497     0.0784    34.12
2560    0.698     0.1351    138.34

TABLE IV
EXAMPLE 2: ACCURACIES

n       $\delta^{max}_{alg}$   $\delta^{max}_{dir}$   $\delta^{max,rel}_{alg}$   $\delta^{max,rel}_{dir}$   $\delta_{alg}$   $\delta_{dir}$
20      .312E-06   .164E-06   .160E-06   .192E-06   .268E-07   .141E-07
40      .109E-05   .270E-05   .405E-06   .316E-06   .501E-07   .124E-06
80      .354E-05   .364E-05   .658E-06   .860E-06   .832E-07   .857E-07
160     .835E-05   .161E-04   .860E-06   .845E-06   .100E-06   .194E-06
320     .198E-04   .298E-04   .974E-06   .158E-06   .115E-06   .173E-06
640     .606E-04   .128E-03   .824E-06   .288E-05   .185E-06   .389E-06
1280    .336E-03   .657E-03   .914E-06   .423E-05   .207E-06   .100E-05
2560    .451E-03   .149E-02   .827E-06   .799E-05   .348E-06   .115E-05

Tables 1, 3, and 5 contain the CPU timings for Examples 1, 2, and 3, respectively. The following is a detailed description of the entries in these tables:
n, the number of points at which the sum (1) is being evaluated;
T_init, the initialization time of the algorithm;
T_alg, the CPU time required by the algorithm once it has been initialized;
T_dir, the CPU time required by the direct calculation.
Tables 2, 4, and 6 contain the accuracies for Examples 1, 2, and 3, respectively. In the description of the entries of these tables below, $S_k$ denotes the sum (1) at the point $x_k$ as evaluated directly in double precision, $S_k^{dir}$ denotes the sum (1) at the point $x_k$ as evaluated directly in single precision, and $S_k^{alg}$ denotes the sum (1) at the point $x_k$ as evaluated in single precision via the algorithm of the present paper.

TABLE V
EXAMPLE 3: TIMINGS

n       T_init    T_alg     T_dir
20      0.0135    0.0015    0.0013
40      0.0435    0.0052    0.0047
80      0.0948    0.0109    0.0179
160     0.1327    0.0172    0.0729
320     0.222     0.0286    0.2779
640     0.306     0.0445    1.101
1280    0.422     0.0718    4.54
2560    0.664     0.1322    18.37

TABLE VI
EXAMPLE 3: ACCURACIES

n       $\delta^{max}_{alg}$   $\delta^{max}_{dir}$   $\delta^{max,rel}_{alg}$   $\delta^{max,rel}_{dir}$   $\delta_{alg}$   $\delta_{dir}$
20      .772E-06   .260E-05   .301E-06   .305E-06   .673E-07   .227E-
40      .218E-05   .275E-05   .396E-06   .337E-06   .955E-07   .121E-
80      .518E-05   .554E-05   .528E-06   .210E-06   .127E-06   .136E-
160     .615E-05   .462E-05   .671E-06   .304E-06   .766E-07   .576E-
320     .103E-04   .232E-05   .851E-06   .387E-06   .646E-07   .146E-
640     .224E-04   .295E-04   .832E-06   .422E-06   .680E-07   .894E-
1280    .615E-04   .360E-03   .745E-06   .120E-05   .949E-07   .555E-
2560    .757E-04   .120E-02   .862E-06   .191E-05   .590E-07   .932E-

The following is a detailed description of the entries in the Tables 2, 4, and 6:
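n, the number of points at which the sum (1) is being evaluated;
$\delta^{max}_{alg}$, the maximum error produced by the algorithm at any point. It is defined by the formula

$$\delta^{max}_{alg} = \max_{1 \le k \le n} \left| S_k^{alg} - S_k \right|; \tag{67}$$

$\delta^{max}_{dir}$, the maximum error produced by the direct calculation at any point. It is defined by the formula

$$\delta^{max}_{dir} = \max_{1 \le k \le n} \left| S_k^{dir} - S_k \right|; \tag{68}$$

$\delta^{max,rel}_{alg}$, the maximum relative error produced by the algorithm at any point. It is defined by the formula

$$\delta^{max,rel}_{alg} = \max_{1 \le k \le n} \frac{\left| S_k^{alg} - S_k \right|}{|S_k|}; \tag{69}$$

$\delta^{max,rel}_{dir}$, the maximum relative error produced by the direct calculation at any point. It is defined by the formula

$$\delta^{max,rel}_{dir} = \max_{1 \le k \le n} \frac{\left| S_k^{dir} - S_k \right|}{|S_k|}; \tag{70}$$

$\delta_{alg}$, the relative error as defined in Section 3 as produced by the algorithm. It is given by the formula

$$\delta_{alg} = \frac{\sum_{k=1}^{n} \left| S_k^{alg} - S_k \right|}{\sum_{k=1}^{n} |S_k|}; \tag{71}$$

$\delta_{dir}$, the relative error as defined in Section 3 as produced by the direct calculation. It is given by the formula

$$\delta_{dir} = \frac{\sum_{k=1}^{n} \left| S_k^{dir} - S_k \right|}{\sum_{k=1}^{n} |S_k|}. \tag{72}$$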

The following is a detailed description of the three examples.

EXAMPLE 1. In this example, the points $\beta_1, \beta_2, \ldots, \beta_m$ and $x_1, x_2, \ldots, x_n$ were defined by the formulae

$\beta_i =$ (73)

$x_k =$ (74)

and the weights $\alpha_1, \alpha_2, \ldots, \alpha_m$ were generated randomly on the interval [0, 1]. Here, by "direct algorithm" we mean a straightforward implementation of the formula (17). The results of this set of experiments are summarized in Tables 1 and 2.
EXAMPLE 2. In this example, the points $\beta_1, \beta_2, \ldots, \beta_m$ and $x_1, x_2, \ldots, x_n$ were generated randomly on the interval [0, 5], and the weights $\alpha_1, \alpha_2, \ldots, \alpha_m$ were generated randomly on the interval [0, 1]. Again, by "direct algorithm" we mean a straightforward implementation of the formula (17). The results of this set of experiments are summarized in Tables 3 and 4.
EXAMPLE 3. Here, we evaluate a polynomial of order n at a collection of randomly generated points on the interval [0, 1]. The coefficients of the nth order polynomial are randomly distributed on the interval [0, 1]. In this example, the direct evaluation of the polynomials is performed via the Horner's rule (see, for example, Dahlquist and Bjork, 1974), and the algorithm of this paper is applied via the formula (1). The results of this set of experiments are presented in Tables 5 and 6.
The following observations can be made from the Tables 1-6 and are in agreement with the results of our more extensive experiments.

1. In all cases, the accuracy produced by the algorithm of the present paper is comparable to that obtained by the direct calculation. For large n, the algorithm tends to be slightly more accurate.
2. The CPU times and accuracies produced by the algorithm are virtually independent of the distributions of points $\alpha_i$, $\beta_i$, $x_k$ in $R^1$.
3. When used for evaluating expressions of the form (17), the algorithm becomes faster than the direct calculation at $n = m \le 20$, if the initialization time is ignored. If we include the initialization time, the break-even point is between n = m = 40 and n = m = 60.
4. When used for evaluating polynomials, the algorithm becomes faster than the direct calculation at roughly n = m = 40, if the initialization time is ignored. If we include the initialization time, the break-even point is roughly n = m = 300.

REFERENCES

AHO, A. V., HOPCROFT, J. E., AND ULLMAN, J. D. (1974). “The Design and Analysis of
Computer Algorithms,” Addison-Wesley, Reading, MA.
BORODIN, A., AND MUNRO, I. (1975). “The Computational Complexity of Algebraic and
Numeric Problems,” Elsevier, Amsterdam/New York.
DAHLQUIST, G., AND BJORK, A. (1974). “Numerical Methods,” Prentice-Hall, Englewood Cliffs, NJ.
