The Automatic Control of Numerical Integration
pp. 55-74
This paper reviews recent advances in developing automatic control algorithms
for the numerical integration of ordinary differential equations (ODEs)
and differential-algebraic equations (DAEs). By varying the stepsize, the error
committed in a single step of the discretization method can be affected. Modern
time-stepping methods provide an estimate of this error, and by comparing the
estimate to a specified accuracy requirement a control algorithm selects the next
stepsize. To construct efficient controllers it is necessary to analyze the dynamic
behaviour of the discretization method together with the controller. Based on
feedback control theory, this systematic approach has replaced earlier heuristics
and resulted in a more consistent and robust performance. Other quantities af-
fected by the stepsize are the convergence rates of fixed-point and Newton iterations,
and we therefore also review new techniques for the coordination of nonlinear
equation solvers with the primary stepsize controller. Taken together, the re-
cent developments provide principles and guidelines for constructing ODE/DAE
software where heuristics and tuning parameters have largely been eliminated.
1. Introduction
Software for solving initial value problems for ODEs and DAEs has existed for
the past 15-30 years. Yet there has been a quite rapid recent development in
the analysis and design of the algorithmic structures needed to turn a numerical
time-stepping method into an adaptive integration procedure.
We shall consider the problem of numerically solving an ODE

    ẏ = f(y);   y(0) = y_0;   t ≥ 0,   (1)

where f : R^d → R^d. The qualitative behaviour of the solution y(t) may vary
considerably depending on the properties of f; f may be linear or nonlinear,
some problems are sensitive to perturbations, some have smooth solutions while
others have intervals where y(t) changes rapidly. Sometimes an accuracy of a
few digits is sufficient but occasionally high precision results are required. To
handle the great variety of situations many different discretization methods are
needed, but each automatic integration procedure must nevertheless be able to
adapt properly to a wide range of operating conditions.
The work/precision efficiency of a time-stepping method depends on the
discretization as well as on problem properties and size. The integration proce-
dure should attempt to compute a numerical solution {y_n} to (1) with minimum
effort, subject to a prescribed error tolerance tol. As tol → 0, the global er-
ror ||y_n − y(t_n)|| should decrease in a regular way, a property called tolerance
proportionality. At the same time, computational effort increases, usually with a
logarithmic proportionality between accuracy and effort.
Minimizing work subject to a bound on the global error requires a global
computational strategy. But the nature of time-stepping is inherently sequen-
tial or local; given the "state" y(t), the method is a procedure for computing an
approximation to y(t + h) a time step h > 0 ahead. The size of h is used to trade
accuracy for efficiency and vice versa, and is therefore the principal means of
controlling the error and making the computational procedure adaptive. Thus
it is a well established practice to control the local error, an approach which is
inexpensive and far simpler than controlling the global error. By using differ-
ential inequalities, however, it can be shown that if the local error per unit time
of integration is kept below tol, then the global error at time t is bounded by
C(t)·tol. Thus a local error tolerance indirectly affects the global error.
We shall assume that an s-stage Runge-Kutta method characterized by the
matrix-vector pair (A, b) is used to solve (1). Using standard notation, [8], the
method can be written

    Y = 1l ⊗ y_n + (A ⊗ I_d) h Ẏ   (2)
    Ẏ_i = f(Y_i),   i = 1 : s   (3)
    y_{n+1} = y_n + (b^T ⊗ I_d) h Ẏ,   (4)

where y_n approximates y(t_n). Further, h is the stepsize, Y is the sd-dimensional
stage vector, and Ẏ is the corresponding stage derivative. For the purposes of
this study, it is of secondary importance if the method is explicit or implicit,
but we shall assume that the method also supports a reference method defined
by a different quadrature formula

    ŷ_{n+1} = y_n + (b̂^T ⊗ I_d) h Ẏ.   (5)

By means of this reference method, a local error estimate is obtained by

    l̂_{n+1} = y_{n+1} − ŷ_{n+1} = ((b^T − b̂^T) ⊗ I_d) h Ẏ.   (6)

Let ŷ(t; τ, η) denote a solution to (1) with initial condition ŷ(τ) = η. Then the
local error in a step from y_n to y_{n+1} may be written

    l_{n+1} = y_{n+1} − ŷ(t_{n+1}; t_n, y_n).   (7)
By expanding the local error in an asymptotic series one shows that

    l_{n+1} = Φ(t_n, y_n) h^{p+1} + O(h^{p+2}),   (8)

where Φ is the principal error function and p is the order of the method. Sim-
ilarly, one derives a corresponding expression for the local error estimate,

    l̂_{n+1} = Φ̂_n h^{p̂+1} + O(h^{p̂+2}),   (9)

where the order p̂ ≤ p, depending on design objectives for the method. For
simplicity we shall make no distinction between p̂ and p but wish to emphasize
that in the local error control algorithms under discussion, the "order" always
refers to that of the error estimate.
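As an illustration (not taken from the paper), the embedded error estimate (6) and the asymptotic behaviour (9) can be demonstrated with a minimal explicit pair: Heun's method (p = 2) with an explicit Euler step as the reference method (p̂ = 1). The function names below are of course hypothetical.

```python
import numpy as np

def heun_step_with_estimate(f, y, h):
    """One step of Heun's method (order 2) with an embedded explicit
    Euler step (order 1) as reference. Returns the higher-order result
    and the norm of the local error estimate (6)."""
    k1 = f(y)
    k2 = f(y + h * k1)
    y_new = y + h * (k1 + k2) / 2.0   # b  = (1/2, 1/2)
    y_ref = y + h * k1                # b^ = (1, 0)
    err = y_new - y_ref               # ((b^T - b^^T) ⊗ I_d) h Y'
    return y_new, np.linalg.norm(err)

# For y' = -y the estimate behaves like C h^{p̂+1} = C h^2,
# so halving h should quarter the estimate.
f = lambda y: -y
y0 = np.array([1.0])
_, r1 = heun_step_with_estimate(f, y0, 0.1)
_, r2 = heun_step_with_estimate(f, y0, 0.05)
print(r1 / r2)  # ≈ 4, consistent with the h^{p̂+1} model
```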
It is of interest to control either the local error per step (EPS) or the local
error per unit step (EPUS). In the former case we let r̂_{n+1} = ||l̂_{n+1}|| and in the
latter r̂_{n+1} = ||l̂_{n+1}/h_n|| (we now use h_n instead of h). A step is accepted if
r̂_{n+1} ≤ tol, and efficiency suggests choosing the stepsize adaptively, as large
as possible, subject to this accuracy requirement. To reduce the risk of having
to reject a step, it is common to aim for a slightly smaller error, ε = θ·tol,
where θ < 1 is a safety factor. The elementary local error control algorithm [2,
p. 156] is

    h_{n+1} = (ε / r̂_{n+1})^{1/k} h_n,   (10)

where k = p + 1 (EPS) or k = p (EPUS). The heuristic derivation of the control
law (10) assumes the following behaviour of the integration process:
Process assumptions
1. Asymptotics: r̂_{n+1} = φ̂_n h_n^k, where φ̂_n = ||Φ̂_n||,
2. Slow variation: φ̂_n ≈ φ̂_{n−1}.
If these were correct, and there is a deviation between ε and r̂_{n+1}, then (10)
will eliminate this deviation in a single step and make the error equal ε:

    h_{n+1} = (ε / (φ̂_n h_n^k))^{1/k} h_n  ⇒  φ̂_n h_{n+1}^k = ε.   (11)

Hence if φ̂_n is constant the new stepsize exactly meets the accuracy require-
ment.
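A minimal sketch (with hypothetical names, not from the paper) of the elementary controller (10) also verifies the deadbeat property (11): under the two process assumptions, a single application puts the error exactly at the setpoint ε = θ·tol.

```python
def elementary_controller(h, r, tol, k, theta=0.9):
    """Elementary local error control, eq. (10):
    h_{n+1} = (eps / r_{n+1})^{1/k} h_n, with setpoint eps = theta*tol."""
    eps = theta * tol
    return (eps / r) ** (1.0 / k) * h

# If the process obeys r_{n+1} = phi * h_n^k with constant phi,
# one controller step drives the next error to eps (deadbeat control):
phi, k, tol = 2.0, 3, 1e-6
h = 0.05
r = phi * h ** k
h_new = elementary_controller(h, r, tol, k)
r_next = phi * h_new ** k
print(r_next, 0.9 * tol)  # r_next equals eps up to rounding
```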
In practice it may happen that one or both process assumptions are false.
The first states that h_n is small enough for the error to exhibit its theoretical
asymptotic behaviour. Some reasons why this may be false are
(i) in an explicit method the stability region is bounded and if the stepsize is
limited by numerical stability it is no longer in the asymptotic regime;
(ii) in stiff ODEs one uses "large" stepsizes far outside the asymptotic regime
for modes which have decayed;
(iii) for stiff ODEs and DAEs, some methods suffer from order reduction, also
invalidating the classical asymptotic model for the order.
The second assumption asserts that f and/or y have a negligible variation
during a time-step of size h and is rarely if ever correct. It also depends on
the smoothness of the norm || · ||, and whether the norm is "aligned" with the
behaviour of the solutions.
In spite of its shortcomings, the elementary error control has been quite
successful in practical computations. One apparent success is its ability to pre-
vent numerical instability. In computations with an explicit method, stepsizes
may eventually grow until h_n must be limited by numerical stability. This
is at large managed by (10). As long as stability is at hand, the solution y_n
remains smooth and the error small, and (10) will attempt to increase the
stepsize. But if h_n increases to cause instability, no matter how slight, then y_n
quickly becomes less regular. As a result r̂_n increases, forcing (10) to reduce
h_n. This process repeats itself and keeps instability at bay through a continual
adjustment of h_n.
But this suggests that h_n will oscillate around a maximum stable stepsize.
Such oscillations are indeed observed in practice, cf. [3], [8, p. 25], and may even
have visible effects on the smoothness of the numerical solution [4, p. 534], [8, p.
31]. This led Hall [9, 10] to study a new question of stability: is the method
in tandem with the control (10) stable when numerical stability limits the
stepsize? It was found that for the linear test equation ẏ = λy stable stepsize
equilibria exist on parts of the boundary of the stability region, [8, p. 26]. For
some popular methods, however, the performance is less satisfactory. Hall
and Higham [12] then took the approach of constructing new methods that
together with (10) had improved "step control stability" and thus overcame
stepsize oscillations.
Around the same time, however, Gustafsson et al. [3] studied the problem
from a control theoretic point of view, which implies the opposite approach;
instead of constructing methods that match the control law (10), the controller
should be designed to match the method. Moreover, because (10) is as elemen-
tary to feedback control theory as the explicit Euler method is to numerical
ODEs, it should be possible to employ a more advanced digital control design
parameterized for each given method. A first experimental study [3] confirmed
that a proven standard technique, the proportional integral (PI) controller (see
also Section 2), worked well. It was subsequently thoroughly analyzed, tested
and parameterized [4]. This resulted in smoother stepsize sequences and nu-
merical solutions [4, p. 547], [8, p. 31], fewer rejected steps, an improved ability
to prevent numerical instability and a possibility to run closer to tol, i.e. to
use a larger θ; in all, a more robust performance at no extra computational
expense. This technique has since become the modern standard for nonstiff
computations and is promoted in various guises in research literature [8, pp.
28-31] and general numerical analysis textbooks alike [1, pp. 173-179].
The purpose of this paper is to review the recent advances in the automatic
control of numerical integration. In Section 2 we study the elementary con-
troller (10) for nonstiff computations while introducing some methodology and
basic notions from control theory. We explain the structure of the PI controller
and the effect of its parameters. Section 3 then gives a more detailed account
of the PI control design process and the parametrization of the controller. In
Section 4 we proceed to review the development in stiff computations where
the PI control is replaced by a predictive control, [5]. As stiff computations
use implicit methods, they also rely on iterative methods of Newton type. The
convergence rate of such iterations is affected by the stepsize, and stepsize
changes may force matrix refactorizations. It is therefore important to coor-
dinate matrix strategies with the controller. In Section 5 we describe a local
strategy, [6], which by monitoring and predicting convergence rates manages
a good performance without interfering with the predictive controller. This
reduces unnecessary matrix handling and convergence failures. We also dis-
cuss high order predictors for the starting values of the iteration, [13], which
are combined with an attempt to optimize global performance based on prob-
lem characteristics. These techniques are important steps towards eliminating
heuristics in numerical ODE software. They are available in the literature in
concise algorithmic descriptions to facilitate their practical implementation.
This first-order difference equation for log h_n has the characteristic equation
q = 0, i.e. the root is at the origin. Known as deadbeat control, this is the
result of choosing the integral gain kI = 1/k in the controller.
From the point of view of control theory, however, the integral gain kI is
a free parameter; it is not determined by the asymptotic model of the process
but is a design parameter used to achieve a good overall dynamic behaviour
for process and controller together. Replacing the factor 1/k in (12) by an
arbitrary gain kI and inserting the asymptotic model for r̂_{n+1} we obtain the
closed loop dynamics

    log h_{n+1} = (1 − kkI) log h_n + kI (log ε − log φ̂_n).   (15)
Thus the root of the characteristic equation is now q = 1 − kkI. Stability
evidently requires kkI ∈ [0, 2], but the choice kkI = 1 is by no means neces-
sary. Instead, the value of kkI determines how quickly the control responds
to variations in the external disturbance log φ̂_n, whose fluctuations are to be
rejected or compensated by the controller. The system's unit step response is
the behaviour resulting when the control error increases from 0 to 1. In the
deadbeat controller, log h_{n+1} depends exclusively on log φ̂_n, implying, as we
already noted, that the error will be immediately rejected. But in general this
leads to a rather "nervous" control, which is often not advantageous. Taking
kkI ∈ (0, 1) leads to a smoother control¹, where the control variable log h_n
depends in part on its history and in part on log φ̂_n. By contrast, kkI ∈ (1, 2)
makes the controller overreact, and on subsequent steps it must even compen-
sate its own actions, resulting in damped oscillations. The choice of kkI is
therefore a tradeoff between response time and sensitivity; for an I controller
to work well one must in effect choose kkI ∈ (0, 1).
The solution of the difference equation (15) is given by the convolution

    log h_n = (1 − kkI)^n log h_0 + kI Σ_{m=1}^{n} (1 − kkI)^{n−m} (log ε − log φ̂_{m−1}).   (16)

This mollifies the fluctuations in log φ̂_n (recall that φ̂_n is composed of high
order derivatives of f) and therefore yields smoother stepsize sequences when
kkI ∈ (0, 1) than those obtained with the deadbeat control (13).
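The smoothing effect of kkI ∈ (0, 1) is easy to illustrate numerically (a sketch, not taken from the paper): iterate (15) with a fluctuating disturbance log φ̂_n and compare the stepsize increments produced by the deadbeat gain kkI = 1 with those of a smaller gain.

```python
import numpy as np

def simulate(kkI, k, log_eps, log_phi, logh0=0.0):
    """Iterate eq. (15): log h_{n+1} = (1-kkI) log h_n + (kkI/k)(log eps - log phi_n)."""
    logh = [logh0]
    for lp in log_phi:
        logh.append((1.0 - kkI) * logh[-1] + (kkI / k) * (log_eps - lp))
    return np.array(logh)

rng = np.random.default_rng(0)
log_phi = rng.normal(0.0, 0.5, 200)   # fluctuating external disturbance
k, log_eps = 3, np.log(1e-6)
deadbeat = simulate(1.0, k, log_eps, log_phi)
smooth = simulate(0.6, k, log_eps, log_phi)
# The convolution (16) filters the noise: smaller increments in log h.
print(np.std(np.diff(smooth)) < np.std(np.diff(deadbeat)))  # True
```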
The general I controller can be written as

    h_{n+1} = (ε / r̂_{n+1})^{kI} h_n.   (17)
When we compare this to (10), it is important to note that the change
of integral gain from 1/k to kI is not an arbitrary violation of the theoretical
asymptotics of the method but a deliberate change of controller dynamics to
achieve smoother stepsize sequences for the very same asymptotic error model
as was assumed for the elementary control algorithm (10). The control analysis
above rests on the asymptotic assumption, but not on the assumption of slow
variation.
¹ Note that one cannot choose kkI = 0 as (15) then yields a constant stepsize.
In control theory it is common to use PI and proportional integral derivative
(PID) control structures to increase robustness. In most cases a PI controller
is satisfactory. Its idea is to construct the control log h_n as follows:

    log h_n = log h_0 + kI Σ_{m=1}^{n} (log ε − log r̂_m) + kP (log ε − log r̂_n).   (18)

Thus log h_n consists of two terms: it adds a term proportional to the control
error to the previous integral term. The proportional gain kP and integral gain
kI are to be chosen to obtain robust closed loop dynamics.
Forming a recursion from (18), we obtain

    log h_{n+1} = log h_n + kI (log ε − log r̂_{n+1}) + kP (log r̂_n − log r̂_{n+1}).   (19)

The PI controller can therefore be written

    h_{n+1} = (ε / r̂_{n+1})^{kI} (r̂_n / r̂_{n+1})^{kP} h_n,   (20)

and is obviously a relatively simple modification to the elementary local error
control. The new proportional factor accounts for error trends; if r̂_n is increas-
ing the new factor is less than 1 and makes for a quicker stepsize reduction than
the I controller would have produced. Conversely, a decreasing error leads to
a faster stepsize increase.
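In code, the PI controller (20) is a one-line extension of (10). The sketch below (hypothetical function name, not from the paper) is written so that the gain products kkI and kkP appear directly as design parameters, i.e. kI = kkI/k and kP = kkP/k; the particular values used here are only placeholders.

```python
def pi_controller(h, r_new, r_old, tol, k, theta=0.8, kkI=0.3, kkP=0.4):
    """PI stepsize controller, eq. (20):
    h_{n+1} = (eps/r_{n+1})^{kI} (r_n/r_{n+1})^{kP} h_n,
    with gains kI = kkI/k, kP = kkP/k and setpoint eps = theta*tol."""
    eps = theta * tol
    kI, kP = kkI / k, kkP / k
    return (eps / r_new) ** kI * (r_old / r_new) ** kP * h

tol, k = 1e-6, 3
eps = 0.8 * tol
# At the setpoint with no error trend the stepsize is left unchanged:
print(pi_controller(0.05, eps, eps, tol, k))             # 0.05
# A growing error (r_old < r_new) gives a quicker reduction:
print(pi_controller(0.05, 2 * eps, eps, tol, k) < 0.05)  # True
```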
The major change, however, is that the closed loop has second order dy-
namics. Inserting the asymptotic model into (19), the closed loop dynamics is
described by

    log h_{n+1} = (1 − kkI − kkP) log h_n + kkP log h_{n−1} +
                  kI (log ε − log φ̂_n) + kP (log φ̂_{n−1} − log φ̂_n),

with the characteristic equation

    q² − (1 − kkI − kkP) q − kkP = 0.   (21)

The dynamics is therefore determined by two roots, the locations of which are
functions of the two design parameters kkI and kkP. One cannot, however,
choose the values of kkI and kkP solely from this characteristic equation, as we
have assumed that the asymptotic process model holds when we derived (21).
It is necessary to study additional process models, e.g. the dynamics when
numerical stability limits the stepsize, to find a good parametrization. In the
case of the pure asymptotic model and an almost constant external disturbance
log φ̂_n, a controller with integral action is typically both necessary and sufficient
[19]. In many but certainly not all nonstiff computations these assumptions are
not unrealistic, explaining the success of the elementary controller (10). It is
in the remaining cases that a more robust controller is needed, and our next
task is to outline the control analysis necessary to design and parameterize the
PI controller properly.
with

    C₁(z) = Re( z E′(z)/E(z) ),   C₂(z) = Re( z P′(z)/P(z) ).   (25)

These coefficients depend on the method parameters (A, b, b̂) as such, but also
vary along ∂S. We shall take a different approach from [11, 4] and normalize
the coefficients by k, defining

    ĉ₁(z) = C₁(z)/k,   ĉ₂(z) = C₂(z)/k.   (26)

We now take the logarithm of r̂_n = |l̂_n| (or in the EPUS case |l̂_n|/h_n) and
express (24) in terms of the forward shift q. Noting that |P(z)| = 1 we obtain

    log r̂ ≐ G^∂S_P(q) (log h − log h̄),   (27)

where the sought dynamic process model takes the form

    G^∂S_P(q) = (k/q) ( ĉ₁(z) − p_k/k + ĉ₂(z)/(q − 1) ),   (28)

with p_k = 0 for EPS and p_k = 1 for EPUS, see [4, p. 538]. To verify this
dynamic model, Gustafsson used system identification, [16]. The coefficients
obtained when a real ODE solver was applied to quasi-stationary nonlinear
problems were found to be well within 1% of theoretical values [4, pp. 541,
549-551], confirming that the dynamic model is highly accurate.
We thus have two different transfer functions representing the process, the
asymptotic model G⁰_P(q) = k/q valid near z = 0 and the dynamic model
G^∂S_P(q) valid near z ∈ ∂S. These models are compatible; since ĉ₁(0) − p_k/k = 1
and ĉ₂(0) = 0, G^∂S_P(q) reduces to G⁰_P(q) near z = 0 ∈ ∂S. For many methods
ĉ₁(z) ≈ 1, and ĉ₂(z) often varies in [0, 2] along ∂S, cf. [4, p. 552].
The next task is to design a controller, which must be able to adequately
manage both processes. The elementary controller (10) will not do, as it is
often unable to yield stable dynamics for G^∂S_P(q).
A controller G_C(q) is a linear operator, mapping the control error log ε − log r̂
to the control log h, expressed as

    log h = G_C(q) (log ε − log r̂).   (29)

Identifying (29) with (19) we readily find the expression for the PI controller,

    G^PI_C(q) = kI q/(q − 1) + kP.   (30)

As q is the forward shift operator, q − 1 is a difference operator; 1/(q − 1) is
therefore the summation operator representing the controller's integral action.
Moreover, the pure I controller is obtained as the special case kP = 0.
The closed loop dynamics is obtained by combining process and controller
and eliminating log h. Thus, in the asymptotic regime, we insert (29) into (22)
to obtain

    log r̂ = G⁰_ε(q) log ε + G⁰_φ̂(q) log φ̂,   (31)

where

    G⁰_ε(q) = G_C(q) G⁰_P(q) / (1 + G_C(q) G⁰_P(q)),   G⁰_φ̂(q) = (1/q) / (1 + G_C(q) G⁰_P(q)).   (32)

To consider PI control we insert (30) and the process model G⁰_P(q) = k/q into
(32). The transfer function from external disturbance log φ̂ to error estimate
log r̂ can then be written explicitly as

    G⁰_φ̂(q) = (q − 1) / (q² − (1 − kkI − kkP) q − kkP).   (33)

The closed loop dynamics is governed by the poles of the transfer functions
(32). Thus, by looking at the denominator of (33) we recognize the same char-
acteristic equation (21) as before. Moreover, the numerator of G⁰_φ̂(q) contains
the difference operator q − 1. Appearing as a consequence of the controller's in-
tegral action, this operator will remove any constant disturbance log φ̂ in (31)
at a rate determined by the location of the poles. Last, we find that G⁰_ε(1) = 1,
implying that log r̂ will, in a stable system, eventually approach the setpoint
log ε; this criterion is akin to the notion of consistency in numerical analysis.
Before selecting controller parameters, we also need to consider the closed
loop dynamics on ∂S. Combining (27) and (29) we obtain

    log r̂ = G^∂S_ε(q) log ε + G^∂S_h(q) log h̄,   (34)

where we need to find the poles of the transfer functions

    G^∂S_ε(q) = G^PI_C(q) G^∂S_P(q) / (1 + G^PI_C(q) G^∂S_P(q)),   G^∂S_h(q) = G^∂S_P(q) / (1 + G^PI_C(q) G^∂S_P(q)).   (35)
In the EPS case the characteristic equation is q³ + a₂q² + a₁q + a₀ = 0, where

    a₂ = −2 + (kkI + kkP) ĉ₁
    a₁ = 1 − (kkI + 2kkP) ĉ₁ + (kkI + kkP) ĉ₂
    a₀ = −kkP (ĉ₂ − ĉ₁).

Therefore the poles are determined by the parameters kkI and kkP both in
the asymptotic case and on ∂S. It is nevertheless a considerable challenge to
choose the PI control parameters. There is a gradual change in process models
from G⁰_P(q) to G^∂S_P(q) as z leaves the asymptotic domain and approaches ∂S
[4, p. 541]. In addition ĉ₁ and ĉ₂ are method dependent and vary on ∂S.
The choice of kkI and kkP is based on theoretical considerations as well
as computational performance. A systematic investigation of the range of the
coefficients ĉ₁ and ĉ₂ for a large number of methods reveals how large the
closed loop stability region in the (ĉ₁, ĉ₂)-plane needs to be. Extensive prac-
tical testing led Gustafsson [4] to suggest (kkI, kkP) = (0.3, 0.4) as a good
choice, to be considered as a starting point for fine-tuning the controller for an
individual method. This results in poles located at q = 0.8 and q = −0.5 for
(33), regardless of method. Negative poles are less desirable as they may cause
overshoot and risk step rejections, but they cannot be avoided in this design³.
If actual values of ĉ₁ and ĉ₂ permit a small reduction of the stability region in
the (ĉ₁, ĉ₂)-plane, a somewhat faster, better damped and smoother response
near z = 0 is achieved by taking (kkI, kkP) = (0.4, 0.2). The poles at z = 0
have then moved to q ≈ 0.69 and q ≈ −0.29. In most cases the parameter set
of interest is {(kkI, kkP) : kkI + kkP ≤ 0.8, kkI ≥ 0.3, kkP ≥ 0.1}, noting
that individual methods have somewhat different characteristics and that the
parameter choice is a tradeoff between different objectives.
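These pole locations are easy to verify numerically from the characteristic equation (21); the following check (illustrative only) recovers the values quoted above.

```python
import numpy as np

def poles(kkI, kkP):
    """Roots of the characteristic polynomial of eq. (21):
    q^2 - (1 - kkI - kkP) q - kkP = 0."""
    return np.sort(np.roots([1.0, -(1.0 - kkI - kkP), -kkP]).real)

print(poles(0.3, 0.4))  # ≈ [-0.5, 0.8]
print(poles(0.4, 0.2))  # ≈ [-0.29, 0.69]
```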
Extensive additional tests with the safety factor θ showed that one can
run as close as 90% of tol before stepsize rejections become frequent. For
³ We note that [8, p. 30] maintain kkI + kkP = 1; finding (kkI, kkP) = (0.36, 0.64) unsatis-
factory for an 8th order method, they change to (0.68, 0.32). Closed loop dynamics is less
convincing in both cases, however. Governed by the two roots ±√kkP of (21) near z = 0, it
is quite oscillatory and a reduced integral gain might be preferred. Further we note that
the control analysis in [1] only treats a static process model and therefore prematurely
concludes that an I controller is satisfactory.
an increased margin of robustness at a negligible cost, a value of θ = 0.8 is
recommended, implying a setpoint of ε = 0.8·tol. Thus, a good overall
performance can be expected from the PI.3.4 controller

    h_{n+1} = (0.8·tol / r̂_{n+1})^{0.3/k} (r̂_n / r̂_{n+1})^{0.4/k} h_n,   (36)

with (10), the PI1.0, as a safety net in case of successive rejected steps, [4,
p. 546]. In many codes it is common to use additional logic, e.g. a deadzone
inhibiting stepsize increases unless they exceed say 20%. But such a deadzone
is detrimental to controller performance as it is equivalent to disengaging the
controller from the process, waiting for large control errors to build up. When
the controller is eventually employed it is forced to use larger actions and might
even find itself out of bounds. By contrast, the advantage of using a control
design such as the PI.3.4 or the PI.4.2 lies in its capacity to exert a delicate
stabilizing control on the process, as is clearly demonstrated by the smoother
stepsize sequences. We therefore advocate a continual control action.
error estimate which adequately reflects the true error. Controllability requires
that the stepsize be an effective means to exert control; by adjusting h_n we
must be able to put the error at a prescribed level of ε. Neither of these re-
quirements is trivial and only a brief account can be given here. To this end, we
consider the linear test equation ẏ = λy. The method yields y_{n+1} = R(z_n) y_n,
where R(z_n) is a rational function, typically a Padé approximation to e^{z_n}. To-
day it is common to select L-stable methods, e.g. the Radau IIa methods [8],
for which the stability region S contains the left half-plane and in addition
R(∞) = 0. Given an embedded formula with rational function R̂(z), an error
estimate is provided by l̂_n = (R(z_n) − R̂(z_n)) y_n. Here R̂(z) must be bounded
as z → ∞. But it is also important that the estimated error for very stiff or
algebraic solution components essentially vanishes⁴. Unless such conditions are
met, the stepsize-error relation may suddenly break down, forcing stepsize re-
ductions by several orders of magnitude to re-establish an asymptotic relation
[8, pp. 113-114]. A simplified explanation is an unsuitable choice of R and
R̂ for which (R(z_n) − R̂(z_n)) y_n is almost constant as a function of z_n when
|z_n| is large, i.e. the error estimate is unaffected by moderate stepsize varia-
tions. In such a situation any controller, no matter how sophisticated, is at
a loss of control authority. In order to avoid this breakdown one must seek
method/error estimator combinations which are controllable. This is still an
area of active research, however, as it must address method properties, error
estimator construction, cf. [15], termination criteria for Newton iterations and
problem properties in DAEs.
In the sequel we shall assume that the error estimator is appropriately con-
structed and free of the shortcomings mentioned above. This implies, with
few exceptions, that the controller and error estimator operate in the asymp-
totic regime, i.e. the solution components or modes to be controlled satisfy
r̂_{n+1} = φ̂_n h_n^k. Stiff or algebraic components, which are outside the asymptotic
regime, typically give negligible contributions to the error estimate, cf. [8, p.
125]. Thus our process model is

    log r̂ = (k/q) log h + (1/q) log φ̂.   (38)

When this model holds, the error control problem is a matter of adapting the
stepsize to variations in φ̂_n. Investigations of real computational data show
that variations in φ̂_n have a lot of structure [5, p. 503]. The simplest model for
predicting log φ̂_n is the linear extrapolation

    log φ̃_n = log φ̂_{n−1} + ∇ log φ̂_{n−1},   (39)

where log φ̃_n denotes the predicted value of log φ̂_n and ∇ is the backward
difference operator. Note that the model is not compensated by using divided
differences. Such models have been tried but show no significant advantages
over (39). Using the forward shift we rewrite (39) in the form

    log φ̃ = (2q − 1) q⁻² log φ̂.   (40)
⁴ A sufficient condition is that both R(z) and R̂(z) are L-stable.
We shall choose h_n such that φ̃_n h_n^k = ε. The stepsize is therefore given by

    log h = k⁻¹ (log ε − log φ̃).   (41)

Inserting the estimate (40) into (41), noting that log ε is constant, we obtain
the closed loop dynamics

    log h = ((2q − 1)/(kq²)) (log ε − log φ̂).   (42)

The double pole at q = 0 shows that we have obtained a deadbeat control
analogous to (14). To find the controller, we use the asymptotic process model
(38) to eliminate log φ̂ from (42) and obtain

    log h = ((2q − 1)/(kq)) (log ε − log r̂) + ((2q − 1)/q²) log h.   (43)

Solving for log h, the controller transfer function becomes

    G^PC_C(q) = (1/k) · q(2q − 1)/(q − 1)² = (1/k) · (q/(q − 1)) · (q/(q − 1) + 1).   (44)

We note that the transfer function contains a double integral action; this is
known to be necessary to follow a linear trend without control error.
In order to find the recursion formula (37) we rearrange (43) in the form

    ((q − 1)/q) log h = (1/k) (q/(q − 1) + 1) (log ε − log r̂),   (45)

where the usual PI control structure (30) is recognized in the right-hand side,
with integral and proportional gains of 1/k. Since the quantity on the left-hand
side corresponds to log(h_n/h_{n−1}), the formula (37) follows, and with θ = 0.8
we have obtained the PC11 controller

    h_{n+1} = (0.8·tol / r̂_{n+1})^{1/k} (r̂_n / r̂_{n+1})^{1/k} (h_n² / h_{n−1}).   (46)
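A compact sketch of the PC11 recursion (46) (hypothetical names, not the pseudocode of [5]) also illustrates the double integral action: when log φ̂_n follows a linear trend, the controlled error locks onto the setpoint after a short transient, regardless of the starting stepsizes.

```python
def pc11(h, h_old, r_new, r_old, eps, k):
    """PC11 predictive controller, eq. (46):
    h_{n+1} = (eps/r_{n+1})^{1/k} (r_n/r_{n+1})^{1/k} h_n^2 / h_{n-1}."""
    return (eps / r_new) ** (1.0 / k) * (r_old / r_new) ** (1.0 / k) * h * h / h_old

k, eps = 3, 0.8e-6
phi = lambda n: 2.0 * 1.05 ** n        # log phi_n grows linearly with n
h_old, h = 0.010, 0.011                # arbitrary starting stepsizes
r_old = phi(0) * h_old ** k
ratios = []
for n in range(1, 6):
    r_new = phi(n) * h ** k            # process: r_{n+1} = phi_n h_n^k
    h, h_old = pc11(h, h_old, r_new, r_old, eps, k), h
    r_old = r_new
    ratios.append(r_new / eps)
print(ratios)  # after the first entry the ratio r/eps locks onto 1
```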
For a detailed discussion of a general observer for log φ̂ with gain parameters
(k_e, k_r) we refer to [5]. Here it is sufficient to remark that the purpose of such
observers is to introduce dynamics in the estimation so that log φ̃_n depends
on its own history as well as the present values of log φ̂_n and ∇ log φ̂_n. This
control will be slower than the PC11 deadbeat design but is less sensitive to
fluctuations in log φ̂; the convolution operator mapping log φ̂ to log h acts as a
mollifier and smoother stepsize sequences are obtained. By contrast, (42) shows
that in the PC11, log h_{n+1} is directly proportional to log φ̂_n and its difference
∇ log φ̂_n. For stiff computations irregularities in log φ̂_n are quite common as
the error estimate is also influenced by an irregular error contribution from
truncated Newton iterations. Therefore it may be worthwhile to try a different
parametrization of the PC controller. Because the dynamics of the PC closed
loop is different from that obtained with the usual PI controller, the parameters
discussed in the previous section are not relevant in this context. The choice
of (k_e, k_r) is discussed in [5, p. 512] where closed loop pole positioning as
well as practical tests indicate the possibility of using slightly smaller values
of the parameters, e.g. the PC.6.9 controller with (k_e, k_r) = (0.6, 0.9). This
will provide somewhat smoother stepsize sequences without significant loss of
efficiency and robustness.
The PC controllers also need safety nets in case of rejected steps. A com-
plete pseudocode of the PC11 controller, including exception handling and
estimation of k in case of order reduction, is provided in [5, p. 511].
5. Stiff computation: Matrix strategies
In stiff ODE computations implicit methods are used. It is therefore necessary
to employ iterative equation solvers on each step, usually of Newton type, to
solve equations of the form

    y_n = γ h_n f(y_n) + ψ,   (47)

where γ is a method-dependent constant and ψ is a known vector. The New-
ton iteration requires that we solve linear systems using the matrix I − γh_n J_n,
where J_n is an approximation to the Jacobian f′(y_n). As the iteration matrix
depends on h_n, it has long been argued that minor stepsize variations might
force matrix refactorizations and must be avoided. As a remedy the elementary
controller is often used in combination with a deadzone prohibiting small step-
size increases, aiming to keep h_n piecewise constant. Stepsize decreases, on the
other hand, are readily accepted even when small, and may also be controlled
by a different strategy; as an example Radau5 uses the elementary control
(10) with a 20% deadzone for stepsize increases, and the predictive control (46)
for decreases [8, p. 124]. This gain scheduling strategy makes an unsymmetric,
discontinuous and nonlinear control, and yet it does not save factorizations
during long sequences of shrinking steps, which do occur in practice [6, p. 35,
Fig. 8]. Instead of treating stepsize increments and decrements differently, a
logic consistent with a 20% deadzone for increases should immediately effect
a 20% decrease as soon as the error estimate exceeds ε. Such a "staircase"
strategy, however, has more in common with step doubling/halving than with
actual control. In addition its control actions are often large enough to cause
transient effects, where process models no longer hold, resulting in an increased
risk of step rejections. The lack of smooth control may also cause an erratic
tolerance proportionality.
We advocate the use of a smooth, symmetric, linear control, allowed to
work with minute adjustments of h_n on every step. The question is if we
can allow a continual control action without an excessive number of matrix
refactorizations. Because one normally uses modified Newton iteration, J_n is
already in error, and small changes in h_n can be accommodated by viewing h_n J_n
as being approximate, as long as the convergence rate does not deteriorate.
We shall only discuss Newton iterations here; for a discussion of nonstiff compu-
tations and fixed-point iterations for (47) we refer to [6]. In the fixed-point case
the convergence rate α is proportional to hn. Stepsize increases therefore cause
slower rates of convergence, which are acceptable as long as larger steps reduce
total work per unit time of integration. But this puts an upper limit on the
stepsize, beyond which further stepsize increases are counterproductive;
convergence slows to the point where total work per unit time of integration
starts growing [6, p. 27]. In theory the convergence rate should never exceed
1/e ≈ 0.37, but in practice one should limit hn so that α ≤ 0.2. This
method-independent stepsize limitation will therefore have to be coordinated
with the controller, and becomes part of the complete controller specification.
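Since α is proportional to hn in this regime, an observed rate directly yields the largest admissible stepsize. A minimal sketch of this limitation (the function names and the way the rate is estimated are our assumptions, not a prescription from the text):

```python
def estimate_rate(displacement_norms):
    """Estimate the convergence rate alpha from the norms of successive
    iteration displacements ||y^{j+1} - y^j||; the worst observed ratio
    gives a cautious estimate."""
    return max(d2 / d1 for d1, d2 in
               zip(displacement_norms, displacement_norms[1:]))

def rate_limited_stepsize(h, alpha_est, alpha_max=0.2):
    """Since alpha is proportional to h, a rate alpha_est observed at
    stepsize h implies that h * alpha_max / alpha_est is the largest
    stepsize keeping the rate at or below alpha_max."""
    return h if alpha_est <= alpha_max else h * alpha_max / alpha_est
```

A stepsize suggested by the error controller would then be clipped against this limit, so the more restrictive of the two mechanisms decides.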
As for Newton iterations, an analysis of efficiency in terms of convergence
rates is extremely complicated and depends on problem size, complexity and
degree of nonlinearity [13]. This model investigation indicates that α = 0.2
is an acceptable rate for low precision computations, but that rates as fast
as α = 0.02 may be preferred if the accuracy requirement is high. Since we
normally want to keep the same Jacobian for several steps, as well as use the
same Jacobian for all stage values in (3), a convergence rate of α = 0.1 must
generally be allowed or the strategy may break down when faced with a strongly
nonlinear problem. In fact, using an upper bound of α = 0.2 will only lose
efficiency if this rate can be maintained for a considerable time using an old
Jacobian [13]. A convergence rate of α = 0.2 is therefore not only feasible but
in most cases also acceptable.
Let us consider a modified Newton iteration and assume that the Jaco-
bian Jm was computed at the point (tm, ym), when the stepsize hm was used.
Further, we assume that the iteration matrix I − hm Jm is still in use at
time tn, when the actual values of stepsize and Jacobian have changed to
hn Jn = hm Jm + Δ(hJ). It is then straightforward to show that the iteration
error ej in the modified Newton method satisfies, up to first order terms,

    ej+1 = (I − hm Jm)^{-1} Δ(hJ) ej .    (48)
We shall find an estimate of α in terms of the relative deviation of hn Jn from
hm Jm. Assuming that Jm^{-1} exists (this is no restriction), we rewrite (48) as

    ej+1 = (I − hm Jm)^{-1} hm Jm (hm Jm)^{-1} Δ(hJ) ej .    (49)
By taking norms, we obtain ||ej+1|| ≤ κ ||(hm Jm)^{-1} Δ(hJ)|| ||ej||, where it can
be shown for inner product norms that κ = ||(I − hm Jm)^{-1} hm Jm|| → 1 in
the presence of stiffness, i.e. as ||hm Jm|| grows large [6, p. 29]. We can therefore
estimate the convergence rate by

    α ≲ ||(hm Jm)^{-1} Δ(hJ)|| .    (50)
For small variations around hm and Jm we approximate Δ(hJ) ≈ Jm Δh + hm ΔJ,
which together with (50) yields the bound

    α ≲ ||(Δh/hm) I + Jm^{-1} ΔJ|| ≤ |Δh/hm| + ||Jm^{-1} ΔJ|| .    (51)
Therefore the convergence rate is bounded by the sum of the relative changes in
stepsize and Jacobian, respectively. This simple formula is useful for assessing
the need for refactorizations and reevaluations of the Jacobian. Thus we see
that if variations in J are negligible, then we can accept a 20% variation in
stepsize without exceeding α = 0.2; in other words, a 20% deadzone in the
stepsize strategy is unnecessary. Conversely, if without stepsize change the
estimated convergence rate grows and exceeds the acceptable level of α =
0.2, then (51) demonstrates that this can only be due to a relative change
in the Jacobian, calling for a reevaluation of J. Note that there is no need
to compute ||Jm^{-1} ΔJ||. It is sufficient to monitor the convergence rate α and
the accumulated stepsize change |Δh/hm|. Moreover, if a stepsize change is
suggested by the controller, one can find out beforehand whether the proposed
change implies that a refactorization is necessary; it is not necessary to wait
for a convergence failure before invoking countermeasures. This local strategy
has been thoroughly tested in [6], where a full pseudocode specification can be
found. Further improvements have been added by De Swart [14, p. 160]. The
purpose of these strategies is to establish a full coordination of stepsize control
and iterative method. This can be achieved for various choices of the upper
limit on acceptable convergence rate.
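The bound (51) is easy to check numerically. The sketch below is our own construction: the matrices are arbitrary test data, with a symmetric negative definite Jm so that the inner product norm argument behind κ applies. It measures the actual error-propagation rate of (48) against the bound:

```python
import numpy as np

rng = np.random.default_rng(1)
d = 8
A = rng.standard_normal((d, d))
Jm = -(A @ A.T + np.eye(d))      # symmetric negative definite test Jacobian
hm, hn = 0.1, 0.12               # a 20% stepsize increase
dJ = 0.01 * (A + A.T)            # small symmetric drift of the Jacobian
Jn = Jm + dJ

# Error-propagation matrix of (48): e_{j+1} = (I - hm Jm)^{-1} Delta(hJ) e_j
M = np.eye(d) - hm * Jm
E = np.linalg.solve(M, hn * Jn - hm * Jm)
alpha = np.linalg.norm(E, 2)     # observed linear convergence rate

# Bound (51): alpha <~ |dh/hm| + ||Jm^{-1} dJ||
bound = abs(hn - hm) / hm + np.linalg.norm(np.linalg.solve(Jm, dJ), 2)
assert alpha <= bound            # 0.2 from the stepsize, plus the Jacobian term
```

With the Jacobian drift set to zero, the bound reduces to |Δh/hm| alone, which is the 20% observation made above.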
In [13] an attempt is made to develop a global matrix strategy. This requires
a significant amount of problem information, however. Transformed into a local
test, this strategy keeps the same Jacobian as long as α is small enough to
satisfy

    log α ≤ −√( log(||e0||/ε) / ((l+1) τf) ) ,    (52)
where e0 is the initial iteration error, l is the number of steps since the iteration
matrix was formed, and τf is the time for computing and factorizing the Jaco-
bian, relative to the time for a full iteration, including function evaluation. This
strategy is therefore adaptive with respect to computational progress, problem
properties and size, but it is more difficult to use in practice, not least because
it typically puts a quite sharp upper bound on α, which tends to interfere with
the stepsize controller. For certain problems in high precision computations,
this adaptive strategy may reduce CPU times by 20%.
The total work of the iteration depends to a large extent on starting values
and termination criterion. Starting values are obtained from some predictor,
and a more accurate predictor implies better efficiency. A high order predic-
tor is a significant advantage, but may be difficult to construct if an implicit
Runge-Kutta method has low stage order. New second order predictors were
developed in [13], indicating that overall performance increases of 10-20%
may be obtained, in particular for high precision computations.
Termination criteria can be formulated in different ways. One requirement
is that the stage derivatives Ẏi must be sufficiently accurate, as these are the
values entering the quadrature formula (4). Moreover, the terminal iteration
errors must not make more than an insignificant contribution to the local error
(7) and its estimate (6). The iteration error is less regular than the truncation
error of the discretization, and may, if too large, cause an irregular behaviour
in log φ̂, i.e. the controller will be fed "noisy data." The first order predictor
(39) is then prone to be erratic, and it is no longer possible to achieve better
control with the PC11. If the correct remedy of a sharper termination criterion
is unacceptable, the remaining possibilities are either to give up deadbeat con-
trol and replace the PC11 by e.g. the PC.6.9, or to resort to the PI1.0, which
is based on the assumption of slow variation, equivalent to a predictor of order
zero for log φ̂. In the coordination of controller and iteration strategy, these
considerations need to be accounted for to obtain a coherent performance.
6. Conclusions
The algorithmic content of ODE/DAE software is not dominated by the dis-
cretization method as such, but by a considerable amount of control structures,
support algorithms and logic. Nevertheless, it appears that only the discretiza-
tion methods have received thorough mathematical attention, while control
logic and structures have remained largely heuristic. These algorithmic parts are,
however, amenable to rigorous analysis and can be constructed in a system-
atic way. The objective of this review has been to introduce the numerical
analyst interested in ODEs to some basic notions and techniques from control
theory which have proved efficient in the construction of adaptive ODE/DAE
software. This methodology is not unfamiliar to numerical analysis; digital
control rests on the classical theories of linear difference equations, difference
operators and stability. Bearing this in mind, the numerical analyst has a
wide range of powerful concepts at his/her disposal for the automatic control
of numerical integration. The present paper should serve as a starting point
for accessing the state of the art in this field as of 1998 through the referenced
literature.
The control theoretic approach views the adaptive numerical integration of
ODEs as a process which maps a stepsize log h to a corresponding error log r̂.
Section 3 showed that the process is an affine map, i.e. log r̂ = GP(q) log h +
log φ̂. Moreover, in the asymptotic regime it is static: GP(q) is just a constant.
The process is therefore relatively simple to control. But when log h becomes
large, the process representing an explicit method shifts to become dynamic.
Both cases could be controlled by a standard technique, the PI controller. This
is a linear map log h = GC(q)(log ε − log r̂), which finds a suitable stepsize from
the deviation between accuracy requirement and estimated error. But even if
the process has a generic structure for all discretizations, it is a challenging task
to find suitable controller parameters. The recommended controllers, PI.3.4
and PI.4.2, are fully specified in the form of pseudocode in [4].
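In its usual multiplicative form this linear map can be sketched as follows. The gains shown are generic placeholders of ours; the recommended PI.3.4 and PI.4.2 parameter sets are those specified in [4]:

```python
def pi_stepsize(h, r, r_prev, eps, k, kI=0.3, kP=0.4):
    """PI stepsize controller in multiplicative form,
        h_{n+1} = (eps/r_n)^(kI/k) * (r_{n-1}/r_n)^(kP/k) * h_n,
    a discrete realization of log h = G_C(q)(log eps - log r̂).
    k is the order of the error estimator; kI and kP are illustrative
    gains, not the certified PI.3.4 / PI.4.2 values from [4]."""
    return h * (eps / r) ** (kI / k) * (r_prev / r) ** (kP / k)
```

Setting kP = 0 recovers the elementary controller: the integral term reacts to the current error deviation, while the proportional term reacts to its trend.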
For implicit methods we saw in Section 4 that a more advanced technique
based on predicting the evolution of the principal error function could be used.
The process GP(q) was still just a constant, but the predictive controller had
an increased complexity, enabling it to follow linear trends. A full specification
of the recommended PC11 controller is found in [5]. Finally, in Section 5 we
saw how a continual control action could be coordinated with matrix strategies
for the Newton iterations; algorithm specifications are found in [14, p. 160],
which improves the original in [6], and in [13].
The new control algorithms are efficient, yield smoother stepsize sequences
and fewer rejected steps, and are designed to be parts of a coherent strategy.
Even if the new algorithms are more complex and mainly directed towards
qualitative improvements, they usually decrease total work as well.
Thus we have brought together a number of algorithms, chiefly based on
modern control theory, for making numerical ODE/DAE integration meth-
ods adaptive. These control algorithms are supported by a well established
mathematical methodology instead of heuristics, and should be considered as
well-defined, fully specified structures accomplishing equally specific tasks. Al-
though one may wish to use different controllers in different situations, the
tuning of a controller relies in equal parts on control theoretic principles and
a thorough knowledge of the process to be controlled; the parameter choice must
be consistent and is not arbitrary. A less systematic approach, such as trying
to achieve "ultimate performance" on a few test problems used for tuning, is
usually of little value, as it often trades robustness and coherence for a marginal
gain in a single aspect of performance, e.g. the total number of steps. Other perfor-
mance aspects, such as the quality of the numerical solution in terms of better
smoothness achieved by smoother stepsize sequences (cf. [4, p. 534, Fig. 1]) or
a more stable and regular tolerance proportionality, are often overlooked, but
do belong in a serious evaluation of software performance and quality.
The recent development and present state of the art point to the possibility
of eliminating heuristic elements from adaptive solution methods for ODEs
and DAEs. This is an important goal as it may eventually bring ODE/DAE
software to approach the level of quality and standardization found in modern
linear algebra software.
Acknowledgements. The author would like to thank a large number of col-
leagues and collaborators in numerical analysis and automatic control who have
contributed directly as well as indirectly to shaping the techniques presented in
this review. The research was in part funded by the Swedish Research Council
for Engineering Sciences TFR contract 222/91-405.
References
1. P. Deuflhard, F. Bornemann (1994). Numerische Mathematik II: In-
tegration gewöhnlicher Differentialgleichungen. Walter de Gruyter, Berlin.
2. C.W. Gear (1971). Numerical Initial Value Problems in Ordinary Dif-
ferential Equations. Prentice-Hall, Englewood Cliffs.
3. K. Gustafsson, M. Lundh, G. Söderlind (1988). A PI stepsize control
for the numerical solution of ordinary differential equations. BIT 28, 270-
287.
4. K. Gustafsson (1991). Control theoretic techniques for stepsize selection
in explicit Runge-Kutta methods. ACM TOMS 17, 533-554.
5. K. Gustafsson (1994). Control theoretic techniques for stepsize selection
in implicit Runge-Kutta methods. ACM TOMS 20, 496-517.
6. K. Gustafsson, G. Söderlind (1997). Control strategies for the iterative
solution of nonlinear equations in ODE solvers. SIAM J. Sci. Comp. 18,
23-40.
7. E. Hairer, S.P. Nørsett, G. Wanner (1993). Solving Ordinary Differ-
ential Equations I: Nonstiff Problems. Springer-Verlag, 2nd revised edition,
Berlin.
8. E. Hairer, G. Wanner (1996). Solving Ordinary Differential Equations
II: Stiff and Differential-algebraic Problems. Springer-Verlag, 2nd revised
edition, Berlin.
9. G. Hall (1985). Equilibrium states of Runge-Kutta schemes. ACM TOMS
11, 289-301.
10. G. Hall (1986). Equilibrium states of Runge-Kutta schemes, part II. ACM
TOMS 12, 183-192.
11. G. Hall, D. Higham (1988). Analysis of stepsize selection schemes for
Runge-Kutta codes. IMA J. Num. Anal. 8, 305-310.
12. D. Higham, G. Hall (1990). Embedded Runge-Kutta formulae with sta-
ble equilibrium states. J. Comp. and Appl. Math. 29, 25-33.
13. H. Olsson, G. Söderlind (1998). Stage value predictors and efficient
Newton iterations in implicit Runge-Kutta methods. To appear in SIAM
J. Sci. Comp. 19.
14. J.J.B. de Swart (1997). Parallel software for implicit differential equa-
tions. Ph.D. thesis, CWI, Amsterdam.
15. J.J.B. de Swart, G. Söderlind (1997). On the construction of error
estimators for implicit Runge-Kutta methods. J. Comp. and Appl. Math.
86, 347-358.
16. T. Söderström, P. Stoica (1989). System Identification. Prentice-Hall,
Englewood Cliffs.
17. H.A. Watts (1984). Step size control in ordinary differential equation
solvers. Trans. Soc. Comput. Sim. 1, 15-25.
18. J.A. Zonneveld (1964). Automatic numerical integration. Ph.D. thesis,
Math. Centre Tracts 8, CWI, Amsterdam.
19. K.J. Åström, B. Wittenmark (1990). Computer Controlled Systems:
Theory and Design. 2nd ed., Prentice-Hall, Englewood Cliffs.