
Volume 11 (1) 1998, pp. 55-74

The Automatic Control of Numerical Integration


Gustaf Söderlind
Department of Computer Science,
Lund University,
Box 118, S-221 00 Lund, Sweden
e-mail: [email protected]

This paper reviews the recent advances in developing automatic control algorithms for the numerical integration of ordinary differential equations (ODEs) and differential-algebraic equations (DAEs). By varying the stepsize, the error committed in a single step of the discretization method can be affected. Modern time-stepping methods provide an estimate of this error, and by comparing the estimate to a specified accuracy requirement a control algorithm selects the next stepsize. To construct efficient controllers it is necessary to analyze the dynamic behaviour of the discretization method together with the controller. Based on feedback control theory, this systematic approach has replaced earlier heuristics and resulted in a more consistent and robust performance. Other quantities affected by the stepsize are convergence rates of fixed-point and Newton iterations, and we therefore also review new techniques for the coordination of nonlinear equation solvers with the primary stepsize controller. Taken together, the recent development provides principles and guidelines for constructing ODE/DAE software where heuristics and tuning parameters have largely been eliminated.

1. Introduction
Software for solving initial value problems for ODEs and DAEs has existed for
the past 15-30 years. Yet there has been a quite rapid recent development in
the analysis and design of the algorithmic structures needed to turn a numerical
time-stepping method into an adaptive integration procedure.
We shall consider the problem of numerically solving an ODE

ẏ = f(y), y(0) = y_0, t ≥ 0, (1)

where f : R^d → R^d. The qualitative behaviour of the solution y(t) may vary considerably depending on the properties of f; f may be linear or nonlinear, some problems are sensitive to perturbations, some have smooth solutions while others have intervals where y(t) changes rapidly. Sometimes an accuracy of a few digits is sufficient but occasionally high precision results are required. To handle the great variety of situations many different discretization methods are needed, but each automatic integration procedure must nevertheless be able to adapt properly to a wide range of operating conditions.
The work/precision efficiency of a time-stepping method depends on the discretization as well as on problem properties and size. The integration procedure should attempt to compute a numerical solution {y_n} to (1) with minimum effort, subject to a prescribed error tolerance tol. As tol → 0, the global error ‖y_n - y(t_n)‖ should decrease in a regular way, a property called tolerance proportionality. At the same time, computational efforts increase, usually in a logarithmic proportionality between accuracy and effort.
Minimizing work subject to a bound on the global error requires a global computational strategy. But the nature of time-stepping is inherently sequential or local; given the "state" y(t), the method is a procedure for computing an approximation to y(t + h) a time step h > 0 ahead. The size of h is used to trade accuracy for efficiency and vice versa, and is therefore the principal means of controlling the error and making the computational procedure adaptive. Thus it is a well established practice to control the local error, an approach which is inexpensive and far simpler than controlling the global error. By using differential inequalities, however, it can be shown that if the local error per unit time of integration is kept below tol, then the global error at time t is bounded by C(t) · tol. Thus a local error tolerance indirectly affects the global error.
We shall assume that an s-stage Runge-Kutta method characterized by the matrix-vector pair (A, b) is used to solve (1). Using standard notation, [8], the method can be written

Y = 𝟙 ⊗ y_n + (A ⊗ I_d) h Ẏ (2)
Ẏ_i = f(Y_i), i = 1 : s (3)
y_{n+1} = y_n + (b^T ⊗ I_d) h Ẏ, (4)

where y_n approximates y(t_n). Further, h is the stepsize, Y is the sd-dimensional stage vector, and Ẏ is the corresponding stage derivative. For the purposes of this study, it is of secondary importance if the method is explicit or implicit, but we shall assume that the method also supports a reference method defined by a different quadrature formula

ŷ_{n+1} = y_n + (b̂^T ⊗ I_d) h Ẏ. (5)

By means of this reference method, a local error estimate is obtained by

l̂_{n+1} = y_{n+1} - ŷ_{n+1} = ((b^T - b̂^T) ⊗ I_d) h Ẏ. (6)

Let ŷ(t; τ, η) denote a solution to (1) with initial condition ŷ(τ) = η. Then the local error in a step from y_n to y_{n+1} may be written

l_{n+1} = y_{n+1} - ŷ(t_{n+1}; t_n, y_n). (7)

By expanding the local error in an asymptotic series one shows that

l_{n+1} = φ(t_n, y_n) h^{p+1} + O(h^{p+2}), (8)

where φ is the principal error function and p is the order of the method. Similarly, one derives a corresponding expression for the local error estimate,

l̂_{n+1} = φ̂(t_n, y_n) h^{p̂+1} + O(h^{p̂+2}), (9)

where the order p̂ ≤ p, depending on design objectives for the method. For simplicity we shall make no distinction between p̂ and p but wish to emphasize that in the local error control algorithms under discussion, the "order" always refers to that of the error estimate.
It is of interest to control either the local error per step (EPS) or the local error per unit step (EPUS). In the former case we let r̂_{n+1} = ‖l̂_{n+1}‖ and in the latter r̂_{n+1} = ‖l̂_{n+1}/h_n‖ (we now use h_n instead of h). A step is accepted if r̂_{n+1} ≤ tol, and efficiency suggests choosing the stepsize adaptively, as large as possible, subject to this accuracy requirement. To reduce the risk of having to reject a step, it is common to aim for a slightly smaller error, ε = θ · tol, where θ < 1 is a safety factor. The elementary local error control algorithm [2, p. 156] is

h_{n+1} = (ε / r̂_{n+1})^{1/k} h_n, (10)

where k = p + 1 (EPS) or k = p (EPUS). The heuristic derivation of the control law (10) assumes the following behaviour of the integration process:
Process assumptions
1. Asymptotics: r̂_{n+1} = φ̂_n h_n^k, where φ̂_n = ‖φ̂(t_n, y_n)‖,
2. Slow variation: φ̂_n ≈ φ̂_{n-1}.

If these were correct, and there is a deviation between ε and r̂_{n+1}, then (10) will eliminate this deviation in a single step and make the error equal ε:

h_{n+1} = (ε / (φ̂_n h_n^k))^{1/k} h_n ⟹ φ̂_n h_{n+1}^k = ε. (11)

Hence if φ̂_n is constant the new stepsize exactly meets the accuracy requirement.
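As an illustration, the accept/reject logic around (10) can be sketched in a few lines of code. This is a schematic under the EPS convention; the embedded `step` routine, returning the new state together with the error estimate r̂, is a hypothetical stand-in for any embedded pair.

```python
def elementary_controller(r, eps, h, k):
    """Elementary local error control (10): deadbeat I control with gain 1/k."""
    return (eps / r) ** (1.0 / k) * h

def advance(step, y, t, h, tol, k, theta=0.9):
    """Take one accepted step; step(y, h) is a hypothetical embedded
    method returning (y_new, error_estimate)."""
    eps = theta * tol              # setpoint slightly below tol
    while True:
        y_new, r = step(y, h)
        h_next = elementary_controller(max(r, 1e-16), eps, h, k)
        if r <= tol:               # EPS acceptance test
            return y_new, t + h, h_next
        h = h_next                 # rejected: retry with the reduced stepsize
```

Under the asymptotic model r̂ = φ̂ h^k, one application of `elementary_controller` lands exactly on the setpoint, which is the deadbeat property expressed in (11).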
In practice it may happen that one or both process assumptions are false. The first states that h_n is small enough for the error to exhibit its theoretical asymptotic behaviour. Some reasons why this may be false are
(i) in an explicit method the stability region is bounded and if the stepsize is limited by numerical stability it is no longer in the asymptotic regime;
(ii) in stiff ODEs one uses "large" stepsizes far outside the asymptotic regime for modes which have decayed;
(iii) for stiff ODEs and DAEs, some methods suffer from order reduction, also invalidating the classical asymptotic model for the order.

The second assumption asserts that f and/or y have a negligible variation during a time-step of size h and is rarely if ever correct. It also depends on the smoothness of the norm ‖·‖, and whether the norm is "aligned" with the behaviour of the solutions.
In spite of its shortcomings, the elementary error control has been quite successful in practical computations. One apparent success is its ability to prevent numerical instability. In computations with an explicit method, stepsizes may eventually grow until h_n must be limited by numerical stability. This is by and large managed by (10). As long as stability is at hand, the solution y_n remains smooth and the error small, and (10) will attempt to increase the stepsize. But if h_n increases enough to cause instability, no matter how slight, then y_n quickly becomes less regular. As a result r̂_n increases, forcing (10) to reduce h_n. This process repeats itself and keeps instability at bay through a continual adjustment of h_n.
But this suggests that h_n will oscillate around a maximum stable stepsize. Such oscillations are indeed observed in practice, cf. [3], [8, p. 25], and may even have visible effects on the smoothness of the numerical solution [4, p. 534], [8, p. 31]. This led Hall [9, 10] to study a new question of stability: is the method in tandem with the control (10) stable when numerical stability limits the stepsize? It was found that for the linear test equation ẏ = λy stable stepsize equilibria exist on parts of the boundary of the stability region, [8, p. 26]. For some popular methods, however, the performance is less satisfactory. Hall and Higham [12] then took the approach of constructing new methods that together with (10) had improved "step control stability" and thus overcame stepsize oscillations.
Around the same time, however, Gustafsson et al. [3] studied the problem from a control theoretic point of view, which implies the opposite approach; instead of constructing methods that match the control law (10), the controller should be designed to match the method. Moreover, because (10) is as elementary to feedback control theory as the explicit Euler method is to numerical ODEs, it should be possible to employ a more advanced digital control design parameterized for each given method. A first experimental study [3] confirmed that a proven standard technique, the proportional integral (PI) controller (see also Section 2), worked well. It was subsequently thoroughly analyzed, tested and parameterized [4]. This resulted in smoother stepsize sequences and numerical solutions [4, p. 547], [8, p. 31], fewer rejected steps, an improved ability to prevent numerical instability and a possibility to run closer to tol, i.e. to use a larger θ; in all, a more robust performance at no extra computational expense. This technique has since become the modern standard for nonstiff computations and is promoted in various guises in research literature [8, pp. 28-31] and general numerical analysis textbooks alike [1, pp. 173-179].
The purpose of this paper is to review the recent advances in the automatic

control of numerical integration. In Section 2 we study the elementary controller (10) for nonstiff computations while introducing some methodology and basic notions from control theory. We explain the structure of the PI controller and the effect of its parameters. Section 3 then gives a more detailed account of the PI control design process and the parametrization of the controller. In Section 4 we proceed to review the development in stiff computations where the PI control is replaced by a predictive control, [5]. As stiff computations use implicit methods, they also rely on iterative methods of Newton type. The convergence rate of such iterations is affected by the stepsize, and stepsize changes may force matrix refactorizations. It is therefore important to coordinate matrix strategies with the controller. In Section 5 we describe a local strategy, [6], which by monitoring and predicting convergence rates manages a good performance without interfering with the predictive controller. This reduces unnecessary matrix handling and convergence failures. We also discuss high order predictors for the starting values of the iteration, [13], which are combined with an attempt to optimize global performance based on problem characteristics. These techniques are important steps towards eliminating heuristics in numerical ODE software. They are available in the literature in concise algorithmic descriptions to facilitate their practical implementation.

2. PI-controlled time-stepping: Basic notions


Let us consider (10), known in control theory as a discrete-time integral controller (I controller). Taking logarithms in (10) yields the linear difference equation

log h_{n+1} = log h_n + (1/k)(log ε - log r̂_{n+1}). (12)

The term log ε - log r̂_{n+1} is called the control error, and the factor k_I = 1/k is referred to as the integral gain. The controller acts to make the error estimate log r̂_{n+1} equal the setpoint log ε so that the control error vanishes; only then will the control log h_n cease to change.
Solving the difference equation (12) we obtain

log h_n = log h_0 + (1/k) Σ_{m=1}^{n} (log ε - log r̂_m). (13)

This is a discrete-time analogue of an integral, explaining the term integral control; the stepsize is obtained as a weighted sum of all past control errors.
We shall first examine the closed loop dynamics for controller and controlled process. To this end we insert the asymptotic model r̂_{n+1} = φ̂_n h_n^k (i.e. the process and its dependence on the control log h_n, according to the asymptotic assumption) into the control law (12) to obtain

log h_{n+1} = (1/k)(log ε - log φ̂_n). (14)
This first-order difference equation for log h_n has the characteristic equation q = 0, i.e. the root is at the origin. Known as deadbeat control, this is the result of choosing the integral gain k_I = 1/k in the controller.
From the point of view of control theory, however, the integral gain k_I is a free parameter; it is not determined by the asymptotic model of the process but is a design parameter used to achieve a good overall dynamic behaviour for process and controller together. Replacing the factor 1/k in (12) by an arbitrary gain k_I and inserting the asymptotic model for r̂_{n+1} we obtain the closed loop dynamics

log h_{n+1} = (1 - kk_I) log h_n + k_I (log ε - log φ̂_n). (15)

Thus the root of the characteristic equation is now q = 1 - kk_I. Stability evidently requires kk_I ∈ [0, 2], but the choice kk_I = 1 is by no means necessary. Instead, the value of kk_I determines how quickly the control responds to variations in the external disturbance log φ̂_n, whose fluctuations are to be rejected or compensated by the controller. The system's unit step response is the behaviour resulting when the control error increases from 0 to 1. In the deadbeat controller, log h_{n+1} depends exclusively on log φ̂_n, implying (as we already noted) that the error will be immediately rejected. But in general this leads to a rather "nervous" control, which is often not an advantage. Taking kk_I ∈ (0, 1) leads to a smoother control¹, where the control variable log h_n depends in part on its history and in part on log φ̂_n. By contrast, kk_I ∈ (1, 2) makes the controller overreact, and on subsequent steps it must even compensate its own actions, resulting in damped oscillations. The choice of kk_I is therefore a tradeoff between response time and sensitivity; for an I controller to work well one must in effect choose kk_I ∈ (0, 1).
The solution of the difference equation (15) is given by the convolution

log h_n = (1 - kk_I)^n log h_0 + k_I Σ_{m=1}^{n} (1 - kk_I)^{n-m} (log ε - log φ̂_{m-1}). (16)

This mollifies the fluctuations in log φ̂_n (recall that φ̂_n is composed of high order derivatives of f) and therefore yields smoother stepsize sequences when kk_I ∈ (0, 1) than those obtained with the deadbeat control (13).
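The smoothing effect is easy to demonstrate numerically. The sketch below iterates the closed loop (15) for an artificially alternating disturbance; the gains and the disturbance are chosen purely for illustration.

```python
def closed_loop_stepsizes(log_phi, k, kI, log_eps=0.0, log_h0=0.0):
    """Iterate log h_{n+1} = (1 - k kI) log h_n + kI (log_eps - log_phi_n), eq. (15)."""
    log_h = [log_h0]
    for lp in log_phi:
        log_h.append((1.0 - k * kI) * log_h[-1] + kI * (log_eps - lp))
    return log_h

# Alternating disturbance log phi_n = +/- 0.5
log_phi = [0.5 * (-1) ** n for n in range(200)]
k = 2
deadbeat = closed_loop_stepsizes(log_phi, k, kI=1.0 / k)  # k*kI = 1 (deadbeat)
smoothed = closed_loop_stepsizes(log_phi, k, kI=0.3)      # k*kI = 0.6

# Peak-to-peak variation of log h after the transient has died out
swing = lambda seq: max(seq[50:]) - min(seq[50:])
assert swing(deadbeat) > swing(smoothed)  # the smaller gain filters the fluctuations
```

The deadbeat sequence copies every fluctuation of the disturbance into the stepsize, while the reduced gain averages over the history as in (16).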
The general I controller can be written as

h_{n+1} = (ε / r̂_{n+1})^{k_I} h_n. (17)
When we compare this to (10), it is of importance to note that the change of integral gain from 1/k to k_I is not an arbitrary violation of the theoretical asymptotics of the method but a deliberate change of controller dynamics to achieve smoother stepsize sequences for the very same asymptotic error model as was assumed for the elementary control algorithm (10). The control analysis above rests on the asymptotic assumption, but not on the assumption of slow variation.

¹ Note that one cannot choose kk_I = 0, as (15) then yields a constant stepsize.
In control theory it is common to use PI and proportional integral derivative (PID) control structures to increase robustness. In most cases a PI controller is satisfactory. Its idea is to construct the control log h_n as follows:

log h_n = log h_0 + k_I Σ_{m=1}^{n} (log ε - log r̂_m) + k_P (log ε - log r̂_n). (18)

Thus log h_n consists of two terms: it adds a term proportional to the control error to the previous integral term. The proportional gain k_P and integral gain k_I are to be chosen to obtain robust closed loop dynamics.
Forming a recursion from (18), we obtain

log h_{n+1} = log h_n + k_I (log ε - log r̂_{n+1}) + k_P (log r̂_n - log r̂_{n+1}). (19)

The PI controller can therefore be written

h_{n+1} = (ε / r̂_{n+1})^{k_I} (r̂_n / r̂_{n+1})^{k_P} h_n, (20)

and is obviously a relatively simple modification to the elementary local error control. The new proportional factor accounts for error trends; if r̂_n is increasing the new factor is less than 1 and makes for a quicker stepsize reduction than the I controller would have produced. Conversely, a decreasing error leads to a faster stepsize increase.
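In code the PI controller (20) requires remembering only the previous error estimate; a minimal sketch:

```python
def pi_controller(r_new, r_old, h, eps, kI, kP):
    """PI stepsize selection (20): integral action on the control error
    plus proportional action on the error trend r_old -> r_new."""
    return (eps / r_new) ** kI * (r_old / r_new) ** kP * h
```

With the error exactly at the setpoint and no trend, the stepsize is left unchanged; an increasing error (r_old < r_new) makes the trend factor smaller than 1, cutting the stepsize faster than the I controller (17) alone would.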
The major change, however, is that the closed loop has second order dynamics. Inserting the asymptotic model into (19), the closed loop dynamics is described by

log h_{n+1} = (1 - kk_I - kk_P) log h_n + kk_P log h_{n-1} + k_I (log ε - log φ̂_n) + k_P (log φ̂_{n-1} - log φ̂_n),

with the characteristic equation

q^2 - (1 - kk_I - kk_P) q - kk_P = 0. (21)
The dynamics is therefore determined by two roots, the locations of which are functions of the two design parameters kk_I and kk_P. One cannot, however, choose the values of kk_I and kk_P solely from this characteristic equation, as we have assumed that the asymptotic process model holds when we derived (21). It is necessary to study additional process models, e.g. the dynamics when numerical stability limits the stepsize, to find a good parametrization. In the case of the pure asymptotic model and an almost constant external disturbance log φ̂_n, a controller with integral action is typically both necessary and sufficient [19]. In many but certainly not all nonstiff computations these assumptions are not unrealistic, explaining the success of the elementary controller (10). It is in the remaining cases that a more robust controller is needed, and our next task is to outline the control analysis necessary to design and parameterize the PI controller properly.

3. Nonstiff computation: PI control


The analysis and design of a linear control system has three steps:
(i) finding the dynamics of the process to be controlled;
(ii) choosing a control structure;
(iii) finding the closed loop dynamics of controller and process together in order to select appropriate control parameters.
The procedure is usually carried out in the frequency domain rather than in the time domain. This implies that process and controller dynamics are represented in terms of their z-transforms. We shall denote the transform variable, corresponding to the forward shift operator, by q, and let G_P(q) and G_C(q) denote the transfer functions of the process and controller, respectively. Further, we denote transformed stepsize, error and disturbance sequences without a subscript, see Figure 1.
The first task is to identify the process models, which map log h to log r̂. When the asymptotic assumption holds, we have

log r̂ = G_P(q) log h + q^{-1} log φ̂, (22)

where the process is just a constant gain, G_P(q) = G_P^0(q) = kq^{-1}. The backward shift q^{-1} appears as a consequence of indexing conventions; when the method takes a step of size h_n it produces an error estimate r̂_{n+1} = φ̂_n h_n^k.
If numerical stability limits the stepsize, however, the process is no longer static but dynamic. A dynamic model was first obtained in [9, 10] for the linear test equation² ẏ = λy. The explicit Runge-Kutta method then yields y_{n+1} = P(z_n) y_n, where P is the stability polynomial of the method and z_n = h_n λ. For the error estimate one similarly has l̂_{n+1} = E(z_n) y_n, where E is the error estimator polynomial. We now assume that z* = h*λ is such that |P(z*)| = 1, i.e. z* ∈ ∂S, the boundary of the stability region. We next write

l̂_{n+1} = E(z_n) y_n = E(z_n) P(z_{n-1}) y_{n-1} = P(z_{n-1}) (E(z_n)/E(z_{n-1})) l̂_n (23)

and consider small stepsize variations h_n = (1 + δ_n) h*, i.e. z_n = (1 + δ_n) z*. Expanding the polynomials in P(z_{n-1}) E(z_n)/E(z_{n-1}) around z*, retaining first-order terms, yields the approximation

l̂_{n+1} ≐ P(z*) (h_n/h*)^{C_1} (h_{n-1}/h*)^{C_2 - C_1} l̂_n, (24)

² This problem models differential equations of the form ẋ = f(x - ψ(t)) + ψ̇(t), with f(0) = 0 and ψ(t) slowly varying, as x(t) approaches the quasi-stationary solution ψ(t).

[Block diagram: the setpoint log ε and the fed-back estimate log r̂ (through a gain of -1) form the control error entering the controller G_C(q); its output log h drives the process G_P(q), at whose output the external disturbance log φ̂ enters, producing log r̂.]

Figure 1. Adaptive stepsize selection viewed as a feedback control system.


The process consists of the discretization method, which takes a given stepsize log h as input and produces an error output which is estimated by log r̂. The method is applied to an ODE whose influence is represented as an external disturbance log φ̂ accounting for problem properties. The error log r̂ is fed back with reversed sign (the factor -1 on the feedback loop), then added to log ε to compare actual and desired error levels. Taking this difference as its input, the controller selects the next stepsize log h. In the analysis of a linear control system, the process and controller are represented by transfer functions mapping the inputs to outputs. The closed loop transfer function, i.e. the map from log φ̂ to log r̂ with the controller included, must be stable and have a good ability to reject the external disturbance; variations in log φ̂, even if substantial, must not force log r̂ far away from the setpoint log ε.

with

C_1(z*) = Re(z* E'(z*)/E(z*)), C_2(z*) = Re(z* P'(z*)/P(z*)). (25)

These coefficients depend on the method parameters (A, b, b̂) as such, but also vary along ∂S. We shall take a different approach from [11, 4] and normalize the coefficients by k, defining

ĉ_1(z*) = C_1(z*)/k, ĉ_2(z*) = C_2(z*)/k. (26)
We now take the logarithm of r̂_n = |l̂_n| (or in the EPUS case |l̂_n|/h_n) and express (24) in terms of the forward shift q. Noting that |P(z*)| = 1 we obtain

log r̂ ≐ G_P^{∂S}(q)(log h - log h*), (27)

where the sought dynamic process model takes the form

G_P^{∂S}(q) = kq^{-1} (ĉ_1(z*) - p_k/k + ĉ_2(z*)/(q - 1)), (28)

with p_k = 0 for EPS and p_k = 1 for EPUS, see [4, p. 538]. To verify this dynamic model, Gustafsson used system identification, [16]. The coefficients obtained when a real ODE solver was applied to quasi-stationary nonlinear problems were found to be well within 1% of theoretical values [4, pp. 541, 549-551], confirming that the dynamic model is highly accurate.
We thus have two different transfer functions representing the process, the asymptotic model G_P^0(q) = kq^{-1} valid near z = 0 and the dynamic model G_P^{∂S}(q) valid near z ∈ ∂S. These models are compatible; since ĉ_1(0) - p_k/k = 1 and ĉ_2(0) = 0, G_P^{∂S}(q) reduces to G_P^0(q) near z = 0 ∈ ∂S. For many methods ĉ_1(z*) ≈ 1, and ĉ_2(z*) often varies in [0, 2] along ∂S, cf. [4, p. 552].

The next task is to design a controller, which must be able to adequately manage both processes. The elementary controller (10) will not do, as it is often unable to yield stable dynamics for G_P^{∂S}(q).
A controller G_C(q) is a linear operator, mapping the control error log ε - log r̂ to the control log h, expressed as

log h = G_C(q)(log ε - log r̂). (29)

Identifying (29) with (19) we readily find the expression for the PI controller,

G_C^{PI}(q) = k_I q/(q - 1) + k_P. (30)

As q is the forward shift operator, q - 1 is a difference operator; 1/(q - 1) is therefore the summation operator representing the controller's integral action. Moreover, the pure I controller is obtained as the special case k_P = 0.
The closed loop dynamics is obtained by combining process and controller and eliminating log h. Thus, in the asymptotic regime, we insert (29) into (22) to obtain

log r̂ = G_ε^0(q) log ε + G_φ̂^0(q) log φ̂, (31)

where

G_ε^0(q) = G_C(q) G_P^0(q) / (1 + G_C(q) G_P^0(q)), G_φ̂^0(q) = q^{-1} / (1 + G_C(q) G_P^0(q)). (32)

To consider PI control we insert (30) and the process model G_P^0(q) = kq^{-1} into (32). The transfer function from external disturbance log φ̂ to error estimate log r̂ can then be written explicitly as

G_φ̂^0(q) = (q - 1) / (q^2 - (1 - kk_I - kk_P) q - kk_P). (33)
The closed loop dynamics is governed by the poles of the transfer functions (32). Thus, by looking at the denominator of (33) we recognize the same characteristic equation (21) as before. Moreover, the numerator of G_φ̂^0(q) contains the difference operator q - 1. Appearing as a consequence of the controller's integral action, this operator will remove any constant disturbance log φ̂ in (31) at a rate determined by the location of the poles. Last, we find that G_ε^0(1) = 1, implying that log r̂ will, in a stable system, eventually approach the setpoint log ε; this criterion is akin to the notion of consistency in numerical analysis.
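The algebra behind (33) is easy to verify numerically: at any test point q, the closed-loop expression q^{-1}/(1 + G_C(q) G_P^0(q)) from (32) must agree with the explicit rational function in (33). A small sketch, where the numerical values of k, kk_I and kk_P are arbitrary:

```python
def G_C(q, kI, kP):
    """PI controller transfer function (30)."""
    return kI * q / (q - 1) + kP

def G_P0(q, k):
    """Asymptotic process model: constant gain with a one-step delay."""
    return k / q

def G_phi_closed(q, k, kI, kP):
    """Closed-loop disturbance transfer function as composed in (32)."""
    return (1 / q) / (1 + G_C(q, kI, kP) * G_P0(q, k))

def G_phi_explicit(q, k, kI, kP):
    """The same transfer function in the explicit form (33)."""
    return (q - 1) / (q**2 - (1 - k * kI - k * kP) * q - k * kP)

# Agreement at an arbitrary complex test point
k, kI, kP = 4, 0.3 / 4, 0.4 / 4
q = 1.7 + 0.3j
assert abs(G_phi_closed(q, k, kI, kP) - G_phi_explicit(q, k, kI, kP)) < 1e-12
```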
Before selecting controller parameters, we also need to consider the closed loop dynamics on ∂S. Combining (27) and (29) we obtain

log r̂ = G_ε^{∂S}(q) log ε + G_h^{∂S}(q) log h*, (34)

where we need to find the poles of the transfer functions

G_ε^{∂S}(q) = G_C^{PI}(q) G_P^{∂S}(q) / (1 + G_C^{PI}(q) G_P^{∂S}(q)), G_h^{∂S}(q) = G_P^{∂S}(q) / (1 + G_C^{PI}(q) G_P^{∂S}(q)). (35)
In the EPS case the characteristic equation is q^3 + a_2 q^2 + a_1 q + a_0 = 0, where

a_2 = -2 + (kk_I + kk_P) ĉ_1
a_1 = 1 - (kk_I + 2kk_P) ĉ_1 + (kk_I + kk_P) ĉ_2
a_0 = -kk_P (ĉ_2 - ĉ_1).

Therefore the poles are determined by the parameters kk_I and kk_P both in the asymptotic case and on ∂S. It is nevertheless a considerable challenge to choose the PI control parameters. There is a gradual change in process models from G_P^0(q) to G_P^{∂S}(q) as z leaves the asymptotic domain and approaches ∂S [4, p. 541]. In addition ĉ_1 and ĉ_2 are method dependent and vary on ∂S.
The choice of kk_I and kk_P is based on theoretical considerations as well as computational performance. A systematic investigation of the range of the coefficients ĉ_1 and ĉ_2 for a large number of methods reveals how large the closed loop stability region in the (ĉ_1, ĉ_2)-plane needs to be. Extensive practical testing led Gustafsson [4] to suggest (kk_I, kk_P) = (0.3, 0.4) as a good choice, to be considered as a starting point for fine-tuning the controller for an individual method. This results in poles located at q = 0.8 and q = -0.5 for (33), regardless of method. Negative poles are less desirable as they may cause overshoot and risk step rejections, but they cannot be avoided in this design³. If actual values of ĉ_1 and ĉ_2 permit a small reduction of the stability region in the (ĉ_1, ĉ_2)-plane, a somewhat faster, better damped and smoother response near z = 0 is achieved by taking (kk_I, kk_P) = (0.4, 0.2). The poles at z = 0 have then moved to q ≈ 0.69 and q ≈ -0.29. In most cases the parameter set of interest is {(kk_I, kk_P) : kk_I + kk_P ≤ 0.8, kk_I ≥ 0.3, kk_P ≥ 0.1}, noting that individual methods have somewhat different characteristics and that the parameter choice is a tradeoff between different objectives.
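The quoted pole locations follow directly from the characteristic equation (21); a quick check using only the quadratic formula:

```python
import cmath

def closed_loop_poles(kkI, kkP):
    """Roots of q^2 - (1 - kkI - kkP) q - kkP = 0, eq. (21)."""
    b = -(1.0 - kkI - kkP)
    c = -kkP
    disc = cmath.sqrt(b * b - 4.0 * c)
    return (-b + disc) / 2.0, (-b - disc) / 2.0

print(closed_loop_poles(0.3, 0.4))  # PI.3.4: q = 0.8 and q = -0.5
print(closed_loop_poles(0.4, 0.2))  # PI.4.2: q = 0.69 and q = -0.29 (approx.)
```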
Extensive additional tests with the safety factor θ showed that one can run as close as 90% of tol before stepsize rejections become frequent. For an increased margin of robustness at a negligible cost, a value of θ = 0.8 is recommended, implying a setpoint of ε = 0.8 · tol. Thus, a good overall performance can be expected from the PI.3.4 controller

h_{n+1} = (0.8 · tol / r̂_{n+1})^{0.3/k} (r̂_n / r̂_{n+1})^{0.4/k} h_n, (36)

with (10), the PI1.0, as a safety net in case of successive rejected steps, [4, p. 546]. In many codes it is common to use additional logic, e.g. a deadzone inhibiting stepsize increases unless they exceed, say, 20%. But such a deadzone is detrimental to controller performance, as it is equivalent to disengaging the controller from the process, waiting for large control errors to build up. When the controller is eventually employed it is forced to use larger actions and might even find itself out of bounds. By contrast, the advantage of using a control design such as the PI.3.4 or the PI.4.2 lies in its capacity to exert a delicate stabilizing control on the process, as is clearly demonstrated by the smoother stepsize sequences. We therefore advocate a continual control action.

³ We note that [8, p. 30] maintain kk_I + kk_P = 1; finding (kk_I, kk_P) = (0.36, 0.64) unsatisfactory for an 8th order method, they change to (0.68, 0.32). Closed loop dynamics is less convincing in both cases, however. Governed by the two roots ±√(kk_P) of (21) near z = 0, it is quite oscillatory and a reduced integral gain might be preferred. Further we note that the control analysis in [1] only treats a static process model and therefore prematurely concludes that an I controller is satisfactory.
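A sketch of how the PI.3.4 rule (36) and its PI1.0 safety net might be combined, with continual control action and no deadzone; the rejection-counting convention here is an assumption made for illustration, not taken from the paper.

```python
def pi34_stepsize(r_new, r_old, h, tol, k, successive_rejects=0):
    """PI.3.4 controller (36) with setpoint 0.8*tol; falls back to the
    elementary PI1.0 rule (10) after successive rejected steps."""
    eps = 0.8 * tol
    if successive_rejects >= 2:
        return (eps / r_new) ** (1.0 / k) * h      # PI1.0 safety net
    return (eps / r_new) ** (0.3 / k) * (r_old / r_new) ** (0.4 / k) * h
```

Note that the stepsize is recomputed after every step, however small the suggested change; any filtering is left to the controller gains themselves rather than to a deadzone.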

4. Stiff computation: Predictive control


In stiff computations the PI control turns out to be less useful and shows few clear advantages over (10), owing to the fact that the process has other characteristics. The two main problems are with the process assumptions: (i) stepsizes "far outside" the asymptotic regime for stiff, decayed solution components, and (ii) substantial changes in log φ̂_n. In the nonstiff case the asymptotic model is predominantly correct except near the stability boundary, and the PI control is effective as a countermeasure for fluctuations in log φ̂_n. In the stiff case, however, a better rejection of external disturbances can be achieved if the variation of log φ̂_n is modelled and predicted. This is done by constructing an observer [19] for log φ̂_n. Using the observer estimate to select the stepsize, Gustafsson [5, 8, p. 124] arrived at the predictive control

h_{n+1} = (h_n / h_{n-1}) (ε / r̂_{n+1})^{k_e/k} (r̂_n / r̂_{n+1})^{k_r/k} h_n, (37)

with a recommended observer gain of (k_e, k_r)^T = (1, 1)^T. We recognize this control structure as a PI controller for the stepsize change log h_{n+1} - log h_n. This may appear to be a minor modification but the dynamics is different, requiring an unbounded stability region S. It is therefore not suitable for nonstiff methods. Interestingly, similar controllers based on extrapolating error trends and stepsizes were first suggested and used in [17, 18], although without an analysis of controller dynamics and parametrization.
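For completeness, the predictive controller (37) in code; compared to the PI rule it additionally remembers the previous stepsize. A minimal sketch with the recommended gains k_e = k_r = 1:

```python
def predictive_controller(r_new, r_old, h, h_old, eps, k, ke=1.0, kr=1.0):
    """Gustafsson's predictive control (37): PI control acting on the
    stepsize change, intended for stiff (implicit) computations."""
    return (h / h_old) * (eps / r_new) ** (ke / k) * (r_old / r_new) ** (kr / k) * h
```

At a steady state with r̂ = ε and h_n = h_{n-1} the stepsize is reproduced; away from it, the controller extrapolates both the error trend and the stepsize trend, in the spirit of [17, 18].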
When an implicit Runge{Kutta method is used to solve DAEs or sti ODEs,
the behaviour of the local error estimate is of fundamental importance for suc-
cessful control. Two notions of importance are observability and controllability
[19]. Loosely speaking, in our context observability requires that we have an

66
error estimate which adequately re ects the true error. Controllability requires
that the stepsize be an e ective means to exert control; by adjusting hn we
must be able to put the error at a prescribed level of ". Neither of these re-
quirements is trivial and only a brief account can be given here. To this end, we
consider the linear test equation ẏ = λy. The method yields y_{n+1} = R(z_n) y_n,
where z_n = h_n λ and R is a rational function, typically a Padé approximation
to e^{z_n}. Today it is common to select L-stable methods, e.g. the Radau IIa
methods [8], for which the stability region S contains the left half-plane and
in addition R(∞) = 0. Given an embedded formula with rational function R̂(z),
an error estimate is provided by l̂_n = (R(z_n) − R̂(z_n)) y_n. Here R̂(z) must
be bounded as z → ∞. But it is also important that the estimated error for very
stiff or algebraic solution components essentially vanishes⁴. Unless such
conditions are met, the stepsize–error relation may suddenly break down, forcing
stepsize reductions by several orders of magnitude to re-establish an asymptotic
relation [8, pp. 113–114]. A simplified explanation is an unsuitable choice of
R and R̂ for which (R(z_n) − R̂(z_n)) y_n is almost constant as a function of
z_n when |z_n| is large, i.e. the error estimate is unaffected by moderate
stepsize variations. In such a situation any controller, no matter how
sophisticated, is at a loss of control authority. In order to avoid this
breakdown one must seek method/error estimator combinations which are
controllable. This is still an area of active research, however, as it must
address method properties, error estimator construction, cf. [15], termination
criteria for Newton iterations and problem properties in DAEs.
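To make the loss of control authority concrete, consider the following sketch (an illustration of ours, not taken from the paper). It pairs the L-stable implicit Euler stability function R(z) = 1/(1 − z), for which R(∞) = 0, with the trapezoidal-rule function R̂(z) = (1 + z/2)/(1 − z/2), which is bounded but satisfies |R̂(∞)| = 1. For a very stiff eigenvalue the estimated error factor |R(z_n) − R̂(z_n)| is then nearly independent of the stepsize:

```python
# Stability functions (standard formulas, used here only as an illustrative pairing):
# R is L-stable with R(inf) = 0; R_hat is bounded but |R_hat(inf)| = 1.
def R(z):
    return 1.0 / (1.0 - z)                 # implicit Euler

def R_hat(z):
    return (1.0 + z / 2) / (1.0 - z / 2)   # trapezoidal rule

lam = -1.0e6                               # a very stiff eigenvalue
for h in (1e-3, 1e-2, 1e-1):
    z = h * lam
    est = abs(R(z) - R_hat(z))             # error estimate factor |R(z) - R_hat(z)|
    print(f"h = {h:.0e}: |R - R_hat| = {est:.5f}")
```

Across two decades of stepsize the estimate stays pinned near |R̂(∞)| = 1, so no moderate stepsize adjustment can bring it to ε: this method/estimator combination is not controllable in the stiff regime.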
In the sequel we shall assume that the error estimator is appropriately
constructed and free of the shortcomings mentioned above. This implies, with
few exceptions, that the controller and error estimator operate in the asymptotic
regime, i.e. the solution components or modes to be controlled satisfy
r̂_{n+1} = φ̂_n h_n^k. Stiff or algebraic components, which are outside the
asymptotic regime, typically give negligible contributions to the error estimate,
cf. [8, p. 125]. Thus our process model is
$$\log \hat r = k q^{-1} \log h + q^{-1} \log \hat\varphi. \qquad (38)$$
When this model holds, the error control problem is a matter of adapting the
stepsize to variations in φ̂_n. Investigations of real computational data show
that variations in φ̂_n have a lot of structure [5, p. 503]. The simplest model
for predicting log φ̂_n is the linear extrapolation
$$\log \tilde\varphi_n = \log \hat\varphi_{n-1} + \nabla \log \hat\varphi_{n-1}, \qquad (39)$$
where log φ̃_n denotes the predicted value of log φ̂_n and ∇ is the backward
difference operator. Note that the model is not compensated by using divided
differences. Such models have been tried but show no significant advantages
over (39). Using the forward shift we rewrite (39) in the form
$$\log \tilde\varphi = (2q - 1)\, q^{-2} \log \hat\varphi. \qquad (40)$$
⁴ A sufficient condition is that both R(z) and R̂(z) are L-stable.
We shall choose h_n such that φ̃_n h_n^k = ε. The stepsize is therefore given by
$$\log h = k^{-1} (\log \varepsilon - \log \tilde\varphi). \qquad (41)$$
Inserting the estimate (40) into (41), noting that log ε is constant, we obtain
the closed loop dynamics
$$\log h = \frac{2q-1}{kq^2}\, (\log \varepsilon - \log \hat\varphi). \qquad (42)$$
The double pole at q = 0 shows that we have obtained a deadbeat control
analogous to (14). To find the controller, we use the asymptotic process model
(38) to eliminate log φ̂ from (42) and obtain
$$\log h = \frac{2q-1}{kq}\, (\log \varepsilon - \log \hat r) + \frac{2q-1}{q^2}\, \log h. \qquad (43)$$
Thus log h = G_C^{PC}(q)(log ε − log r̂), where the predictive controller is
$$G_C^{PC}(q) = \frac{1}{k}\,\frac{q(2q-1)}{(q-1)^2} = \frac{1}{k}\,\frac{q}{q-1}\Bigl(\frac{q}{q-1} + 1\Bigr). \qquad (44)$$
We note that the transfer function contains a double integral action; this is
known to be necessary to follow a linear trend without control error.
In order to find the recursion formula (37) we rearrange (43) in the form
$$\frac{q-1}{q}\,\log h = \frac{1}{k}\Bigl(\frac{q}{q-1} + 1\Bigr)(\log\varepsilon - \log\hat r), \qquad (45)$$
where the usual PI control structure (30) is recognized in the right-hand side,
with integral and proportional gains of 1/k. Since the quantity on the left-hand
side corresponds to log(h_n/h_{n-1}), the formula (37) follows, and with the
safety factor 0.8 applied to the tolerance we have obtained the PC11 controller
$$h_{n+1} = \Bigl(\frac{0.8\,tol}{\hat r_{n+1}}\Bigr)^{1/k} \Bigl(\frac{\hat r_n}{\hat r_{n+1}}\Bigr)^{1/k} \frac{h_n^2}{h_{n-1}}. \qquad (46)$$
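The claim that the double integral action follows a linear trend without control error can be checked numerically. The sketch below (an illustration of ours, not from the paper) drives the recursion (46) with the asymptotic process model r̂_{n+1} = φ̂_n h_n^k and a linearly drifting log φ̂; the safety factor is set to 1 here so the lock-on is exact:

```python
import math

k, eps = 4, 1e-6                 # method order k and tolerance (illustrative values)
log_phi = lambda n: 0.1 * n      # log(phi_hat) follows a linear trend

# Two arbitrary startup steps, then the predictive recursion of eq. (46)
h = [1e-2, 1e-2]
r = [math.exp(log_phi(0)) * h[0]**k, math.exp(log_phi(1)) * h[1]**k]
for n in range(2, 30):
    h_next = (eps / r[-1])**(1/k) * (r[-2] / r[-1])**(1/k) * h[-1]**2 / h[-2]
    h.append(h_next)
    r.append(math.exp(log_phi(n)) * h_next**k)   # process: r_{n+1} = phi_n * h_n^k

print(r[-1] / eps)               # locks onto 1: zero control error on a linear trend
```

After the two startup steps the error estimate equals ε exactly (up to rounding), which is the deadbeat behaviour derived above; a PI controller with a single integrator would instead lag behind the drifting log φ̂.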
For a detailed discussion of a general observer for log φ̂ with gain parameters
(k_e, k_r) we refer to [5]. Here it is sufficient to remark that the purpose of
such observers is to introduce dynamics in the estimation, so that log φ̃_n
depends on its own history as well as the present values of log φ̂_n and
∇ log φ̂_n. This control will be slower than the PC11 deadbeat design but is
less sensitive to fluctuations in log φ̂; the convolution operator mapping
log φ̂ to log h acts as a mollifier, and smoother stepsize sequences are
obtained. By contrast, (42) shows that in the PC11, log h_{n+1} is directly
proportional to log φ̂_n and its difference ∇ log φ̂_n. For stiff computations
irregularities in log φ̂_n are quite common, as the error estimate is also
influenced by an irregular error contribution from truncated Newton iterations.
Therefore it may be worthwhile to try a different
parametrization of the PC controller. Because the dynamics of the PC closed
loop is different from that obtained with the usual PI controller, the parameters
discussed in the previous section are not relevant in this context. The choice
of (k_e, k_r) is discussed in [5, p. 512], where closed loop pole positioning as
well as practical tests indicate the possibility of using slightly smaller values
of the parameters, e.g. the PC.6.9 controller with (k_e, k_r) = (0.6, 0.9). This
will provide somewhat smoother stepsize sequences without significant loss of
efficiency and robustness.
The PC controllers also need safety nets in case of rejected steps. A complete
pseudocode of the PC11 controller, including exception handling and estimation
of k in case of order reduction, is provided in [5, p. 511].
5. Stiff computation: Matrix strategies
In stiff ODE computations implicit methods are used. It is therefore necessary
to employ iterative equation solvers on each step, usually of Newton type, to
solve equations of the form
$$y_n = \gamma h_n f(y_n) + \psi, \qquad (47)$$
where γ is a method-dependent constant and ψ is a known vector. The Newton
iteration requires that we solve linear systems with the matrix I − γh_n J_n,
where J_n is an approximation to the Jacobian f′(y_n). As the iteration matrix
depends on h_n, it has long been argued that minor stepsize variations might
force matrix refactorizations and must be avoided. As a remedy the elementary
controller is often used in combination with a deadzone prohibiting small
stepsize increases, aiming to keep h_n piecewise constant. Stepsize decreases,
on the other hand, are readily accepted even when small, and may also be
controlled by a different strategy: as an example, Radau5 uses the elementary
control (10) with a 20% deadzone for stepsize increases, and the predictive
control (46) for decreases [8, p. 124]. This gain scheduling strategy makes an
unsymmetric, discontinuous and nonlinear control, and yet it does not save
factorizations during long sequences of shrinking steps, which do occur in
practice [6, p. 35, Fig. 8]. Instead of treating stepsize increments and
decrements differently, a logic consistent with a 20% deadzone for increases
should immediately effect a 20% decrease as soon as the error estimate exceeds
ε. Such a "staircase" strategy, however, has more in common with step
doubling/halving than with actual control. In addition its control actions are
often large enough to cause transient effects, where process models no longer
hold, resulting in an increased risk of step rejections. The lack of smooth
control may also cause an erratic tolerance proportionality.
We advocate the use of a smooth, symmetric, linear control, allowed to work
with minute adjustments of h_n on every step. The question is whether we can
allow a continual control action without an excessive number of matrix
refactorizations. Because one normally uses modified Newton iteration, J_n is
already in error, and small changes in h_n can be accommodated by viewing
h_n J_n as approximate, as long as the convergence rate does not deteriorate.
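As a minimal illustration of this setting (a sketch of ours with hypothetical parameter names, not the paper's pseudocode), the scalar version of (47) can be solved by a modified Newton iteration that freezes the Jacobian at the predictor and observes the convergence rate α from successive corrections:

```python
def modified_newton(f, fprime_y0, y0, gamma, h, psi, tol=1e-12, maxit=25):
    """Solve the scalar version of (47), y = gamma*h*f(y) + psi, with the
    Jacobian frozen at y0. Returns the root and the observed rates alpha_j."""
    m = 1.0 - gamma * h * fprime_y0      # scalar iteration "matrix", formed once
    y, rates, d_old = y0, [], None
    for _ in range(maxit):
        d = (gamma * h * f(y) + psi - y) / m   # correction from the residual
        y += d
        if d_old is not None:
            rates.append(abs(d) / abs(d_old))  # observed convergence rate alpha
        if abs(d) < tol:
            break
        d_old = d
    return y, rates

# f(y) = -y^3, Jacobian f'(y0) = -3 frozen at the predictor y0 = 1
y, rates = modified_newton(lambda y: -y**3, fprime_y0=-3.0, y0=1.0,
                           gamma=1.0, h=0.1, psi=1.0)
```

Here the frozen matrix stays accurate and the observed α is far below 0.2; the point of the analysis below is precisely to quantify how much h (and J) may drift before α grows past the acceptable level.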
We shall only discuss Newton iterations. For a discussion of nonstiff
computations and fixed-point iterations for (47) we refer to [6]; in that case
the convergence rate α is proportional to h_n. Stepsize increases therefore
cause slower rates of convergence, which are acceptable as long as larger steps
reduce total work per unit time of integration. But this puts an upper limit on
the stepsize beyond which further stepsize increases are counterproductive;
convergence slows to the point where total work per unit time of integration
starts growing [6, p. 27]. In theory the convergence rate should never exceed
1/e ≈ 0.37, but in practice one should limit h_n so that α ≤ 0.2. This
method-independent stepsize limitation will therefore have to be coordinated
with the controller, and becomes part of the complete controller specification.
As for Newton iterations, an analysis of efficiency in terms of convergence
rates is extremely complicated and depends on problem size, complexity and
degree of nonlinearity [13]. This model investigation indicates that α = 0.2
is an acceptable rate for low precision computations, but that rates as fast
as α = 0.02 may be preferred if the accuracy requirement is high. Since we
normally want to keep the same Jacobian for several steps, as well as use the
same Jacobian for all stage values in (3), a convergence rate of α = 0.1 must
generally be allowed, or the strategy may break down when faced with a strongly
nonlinear problem. In fact, using an upper bound of α = 0.2 will only lose
efficiency if this rate can be maintained for a considerable time using an old
Jacobian [13]. A convergence rate of α = 0.2 is therefore not only feasible but
in most cases also acceptable.
Let us consider a modified Newton iteration and assume that the Jacobian J_m
was computed at the point (t_m, y_m), when the stepsize h_m was used. Further,
we assume that the iteration matrix I − γh_m J_m is still in use at time t_n,
when actual values of stepsize and Jacobian have changed to
h_n J_n = h_m J_m + Δ(hJ). It is then straightforward to show that the
iteration error e_j in the modified Newton method satisfies, up to first order
terms,
$$e_{j+1} = (I - \gamma h_m J_m)^{-1}\, \gamma\, \Delta(hJ)\, e_j. \qquad (48)$$
We shall find an estimate of α in terms of the relative deviation of h_n J_n
from h_m J_m. Assuming that J_m^{-1} exists (this is no restriction), we
rewrite (48) as
$$e_{j+1} = (I - \gamma h_m J_m)^{-1}\, \gamma h_m J_m\, (h_m J_m)^{-1} \Delta(hJ)\, e_j. \qquad (49)$$
By taking norms, we obtain ||e_{j+1}|| ≤ κ · ||(h_m J_m)^{-1} Δ(hJ)|| · ||e_j||,
where it can be shown for inner product norms that
κ = ||(I − γh_m J_m)^{-1} γh_m J_m|| → 1 in the presence of stiffness, i.e. as
||h_m J_m|| grows large [6, p. 29]. We can therefore estimate the convergence
rate by
$$\alpha \lesssim \|(h_m J_m)^{-1} \Delta(hJ)\|. \qquad (50)$$
For small variations around h_m and J_m we approximate
Δ(hJ) ≈ J_m Δh + h_m ΔJ, which together with (50) yields the bound
$$\alpha \lesssim \Bigl\|\frac{\Delta h}{h_m}\,I + J_m^{-1}\Delta J\Bigr\| \le \Bigl|\frac{\Delta h}{h_m}\Bigr| + \|J_m^{-1}\Delta J\|. \qquad (51)$$
Therefore the convergence rate is bounded by the sum of the relative changes in
stepsize and Jacobian, respectively. This simple formula is useful for assessing
the need for refactorizations and reevaluations of the Jacobian. Thus we see
that if variations in J are negligible, then we can accept a 20% variation in
stepsize without exceeding α = 0.2; in other words, a 20% deadzone in the
stepsize strategy is unnecessary. Conversely, if without stepsize change the
estimated convergence rate grows and exceeds the acceptable level of α = 0.2,
then (51) demonstrates that this can only be due to a relative change in the
Jacobian, calling for a reevaluation of J. Note that there is no need to
compute ||J_m^{-1} ΔJ||. It is sufficient to monitor the convergence rate and
the accumulated stepsize change |Δh/h_m|. Moreover, if a stepsize change is
suggested by the controller, one can find out beforehand whether the proposed
change implies that a refactorization is necessary; it is not necessary to wait
for a convergence failure before invoking countermeasures. This local strategy
has been thoroughly tested in [6], where a full pseudocode specification can be
found. Further improvements have been added by De Swart [14, p. 160]. The
purpose of these strategies is to establish a full coordination of stepsize
control and iterative method. This can be achieved for various choices of the
upper limit on acceptable convergence rate.
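The resulting local test can be sketched as follows (our rendering of the idea; the helper names are hypothetical and the full specification is in [6]). The decision uses only the observed rate α and the accumulated stepsize change, as the bound (51) suggests:

```python
ALPHA_MAX = 0.2   # acceptable convergence rate, as discussed above

def matrix_action(alpha_est, h_proposed, h_factorized):
    """Decide, via the bound (51), what to do before the next step:
    alpha is (roughly) bounded by |Delta h / h_m| + ||J_m^{-1} Delta J||."""
    rel_h = abs(h_proposed - h_factorized) / h_factorized   # |Delta h / h_m|
    if rel_h > ALPHA_MAX:
        # the stepsize term alone may push alpha past the limit: refactorize
        return "refactorize"
    if alpha_est > ALPHA_MAX:
        # rate too slow although |Delta h / h_m| is small: by (51) the
        # Jacobian must have drifted, so reevaluate J (and then refactorize)
        return "reevaluate_jacobian"
    return "keep_current_factorization"

print(matrix_action(0.05, 1.3, 1.0))   # 30% stepsize change since factorization
print(matrix_action(0.30, 1.05, 1.0))  # slow rate at nearly constant stepsize
```

Note that the test can be evaluated before the proposed step is taken, which is exactly the point made above: no convergence failure is needed to trigger the countermeasure.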
In [13] an attempt is made to develop a global matrix strategy. This requires
a significant amount of problem information, however. Transformed into a local
test, this strategy keeps the same Jacobian as long as α is small enough to
satisfy
$$\frac{\log\alpha}{l+1} \le -\sqrt{\frac{\log(\|e_0\|/\varepsilon)}{f}}, \qquad (52)$$
where e_0 is the initial iteration error, l is the number of steps since the
iteration matrix was formed, and f is the time for computing and factorizing
the Jacobian, relative to the time for a full iteration, including function
evaluation. This strategy is therefore adaptive with respect to computational
progress, problem properties and size, but it is more difficult to use in
practice, not least because it typically puts a quite sharp upper bound on α,
which tends to interfere with the stepsize controller. For certain problems in
high precision computations, this adaptive strategy may reduce CPU times by 20%.
The total work of the iteration depends to a large extent on starting values
and the termination criterion. Starting values are obtained from some predictor,
and a more accurate predictor implies better efficiency. A high order predictor
is a significant advantage but may be difficult to construct if an implicit
Runge–Kutta method has low stage order. New second order predictors were
developed in [13], indicating that overall performance increases of 10–20%
may be obtained, in particular for high precision computations.
Termination criteria can be formulated in different ways. One requirement is
that the stage derivatives Ẏ_i must be sufficiently accurate, as these are the
values entering the quadrature formula (4). Moreover, the terminal iteration
errors must not make more than an insignificant contribution to the local error
(7) and its estimate (6). The iteration error is less regular than the
truncation error of the discretization, and may, if too large, cause an
irregular behaviour in log φ̂, i.e. the controller will be fed "noisy data."
The first order predictor (39) is then prone to be erratic, and it is no longer
possible to achieve better control with the PC11. If the correct remedy of a
sharper termination criterion is unacceptable, the remaining possibilities are
either to give up deadbeat control and replace the PC11 by e.g. the PC.6.9, or
to resort to the PI1.0, which is based on the assumption of slow variation,
equivalent to a predictor of order zero for log φ̂. In the coordination of
controller and iteration strategy, these considerations need to be accounted
for to obtain a coherent performance.
6. Conclusions
The algorithmic content of ODE/DAE software is not dominated by the
discretization method as such, but by a considerable amount of control
structures, support algorithms and logic. Nevertheless, it appears that only
the discretization methods have received thorough mathematical attention, while
control logic and structures have been largely heuristic. These algorithmic
parts are, however, amenable to a rigorous analysis and can be constructed in
a systematic way. The objective of this review has been to introduce the
numerical analyst interested in ODEs to some basic notions and techniques from
control theory which have proved efficient in the construction of adaptive
ODE/DAE software. This methodology is not unfamiliar to numerical analysis;
digital control rests on the classical theories of linear difference equations,
difference operators and stability. Bearing this in mind, the numerical analyst
has a wide range of powerful concepts at his/her disposal for the automatic
control of numerical integration. The present paper should serve as a starting
point for accessing the state of the art in this field as of 1998 through the
referenced literature.
The control theoretic approach views the adaptive numerical integration of
ODEs as a process which maps a stepsize log h to a corresponding error log r̂.
Section 3 showed that the process is an affine map, i.e.
log r̂ = G_P(q) log h + log φ̂. Moreover, in the asymptotic regime it is static:
G_P(q) is just a constant. The process is therefore relatively simple to
control. But when log h becomes large, the process representing an explicit
method shifts to become dynamic. Both cases could be controlled by a standard
technique, the PI controller. This is a linear map
log h = G_C(q)(log ε − log r̂), which finds a suitable stepsize from the
deviation between accuracy requirement and estimated error. But even if the
process has a generic structure for all discretizations, it is a challenging
task to find suitable controller parameters. The recommended controllers,
PI.3.4 and PI.4.2, are fully specified in the form of pseudocode in [4].
For implicit methods we saw in Section 4 that a more advanced technique based
on predicting the evolution of the principal error function could be used. The
process G_P(q) was still just a constant, but the predictive controller had an
increased complexity, enabling it to follow linear trends. A full specification
of the recommended PC11 controller is found in [5]. Finally, in Section 5 we
saw how a continual control action could be coordinated with matrix strategies
for the Newton iterations; algorithm specifications are found in [14, p. 160],
which improves the original in [6], and in [13].
The new control algorithms are efficient, yield smoother stepsize sequences
and fewer rejected steps, and are designed to be parts of a coherent strategy.
Even if the new algorithms are more complex and mainly directed towards
qualitative improvements, they usually decrease total work as well.
Thus we have brought together a number of algorithms, chiefly based on modern
control theory, for making numerical ODE/DAE integration methods adaptive.
These control algorithms are supported by a well established mathematical
methodology instead of heuristics, and should be considered as well-defined,
fully specified structures accomplishing equally specific tasks. Although one
may wish to use different controllers in different situations, the tuning of a
controller relies in equal parts on control theoretic principles and a thorough
knowledge of the process to be controlled; parameter choice must be consistent
and is not arbitrary. A less systematic approach, such as trying to achieve
"ultimate performance" on a few test problems used for tuning, is usually of
little value, as it often trades robustness and coherence for a marginal gain
in a single aspect of performance, e.g. the total number of steps. Other
performance aspects, such as the quality of the numerical solution in terms of
better smoothness achieved by smoother stepsize sequences (cf. [4, p. 534,
Fig. 1]) or a more stable and regular tolerance proportionality, are often
overlooked but do belong in a serious evaluation of software performance and
quality.
The recent development and present state of the art point to the possibility
of eliminating heuristic elements from adaptive solution methods for ODEs
and DAEs. This is an important goal as it may eventually bring ODE/DAE
software to approach the level of quality and standardization found in modern
linear algebra software.
Acknowledgements. The author would like to thank a large number of col-
leagues and collaborators in numerical analysis and automatic control who have
contributed directly as well as indirectly to shaping the techniques presented in
this review. The research was in part funded by the Swedish Research Council
for Engineering Sciences TFR contract 222/91-405.
References
1. P. Deuflhard, F. Bornemann (1994). Numerische Mathematik II: Integration
gewöhnlicher Differentialgleichungen. Walter de Gruyter, Berlin.
2. C.W. Gear (1971). Numerical Initial Value Problems in Ordinary Differential
Equations. Prentice-Hall, Englewood Cliffs.
3. K. Gustafsson, M. Lundh, G. Söderlind (1988). A PI stepsize control for the
numerical solution of ordinary differential equations. BIT 28, 270–287.
4. K. Gustafsson (1991). Control theoretic techniques for stepsize selection
in explicit Runge–Kutta methods. ACM TOMS 17, 533–554.
5. K. Gustafsson (1994). Control theoretic techniques for stepsize selection
in implicit Runge–Kutta methods. ACM TOMS 20, 496–517.
6. K. Gustafsson, G. Söderlind (1997). Control strategies for the iterative
solution of nonlinear equations in ODE solvers. SIAM J. Sci. Comp. 18, 23–40.
7. E. Hairer, S.P. Nørsett, G. Wanner (1993). Solving Ordinary Differential
Equations I: Nonstiff Problems. Springer-Verlag, 2nd revised edition, Berlin.
8. E. Hairer, G. Wanner (1996). Solving Ordinary Differential Equations II:
Stiff and Differential-algebraic Problems. Springer-Verlag, 2nd revised
edition, Berlin.
9. G. Hall (1985). Equilibrium states of Runge–Kutta schemes. ACM TOMS 11,
289–301.
10. G. Hall (1986). Equilibrium states of Runge–Kutta schemes, part II. ACM
TOMS 12, 183–192.
11. G. Hall, D. Higham (1988). Analysis of stepsize selection schemes for
Runge–Kutta codes. IMA J. Num. Anal. 8, 305–310.
12. D. Higham, G. Hall (1990). Embedded Runge–Kutta formulae with stable
equilibrium states. J. Comp. and Appl. Math. 29, 25–33.
13. H. Olsson, G. Söderlind (1998). Stage value predictors and efficient
Newton iterations in implicit Runge–Kutta methods. To appear in SIAM J. Sci.
Comp. 19.
14. J.J.B. de Swart (1997). Parallel software for implicit differential
equations. Ph.D. thesis, CWI, Amsterdam.
15. J.J.B. de Swart, G. Söderlind (1997). On the construction of error
estimators for implicit Runge–Kutta methods. J. Comp. and Appl. Math. 86,
347–358.
16. T. Söderström, P. Stoica (1989). System Identification. Prentice-Hall,
Englewood Cliffs.
17. H.A. Watts (1984). Step size control in ordinary differential equation
solvers. Trans. Soc. Comput. Sim. 1, 15–25.
18. J.A. Zonneveld (1964). Automatic numerical integration. Ph.D. thesis,
Math. Centre Tracts 8, CWI, Amsterdam.
19. K.J. Åström, B. Wittenmark (1990). Computer Controlled Systems: Theory
and Design. 2nd ed., Prentice-Hall, Englewood Cliffs.