Introduction To Model Order Reduction
Wil Schilders 1,2
1 NXP Semiconductors, Eindhoven, The Netherlands, [email protected]
2 Eindhoven University of Technology, Faculty of Mathematics and Computer Science, Eindhoven, The Netherlands, [email protected]
1 Introduction
In this first section we present a high level discussion on computational science, and
the need for compact models of phenomena observed in nature and industry. We
argue that much more complex problems can be addressed by making use of current
computing technology and advanced algorithms, but that there is a need for model
order reduction in order to cope with even more complex problems. We also discuss in somewhat more detail what model order reduction actually is.
An important factor in enabling the complex simulations carried out today is the
increase in computational power. Computers and chips are getting faster, with Moore's law predicting that the speed will double every 18 months (see Figure 2).
This increase in computational power appears to go hand-in-hand with devel-
opments in numerical algorithms. Iterative solution techniques for linear systems
are mainly responsible for this speed-up in algorithms, as is shown in Figure 3.
Important contributions in this area are the conjugate gradient method (Hestenes and
Stiefel [22]), preconditioned conjugate gradient methods (ICCG [25], biCGstab [34])
and multigrid methods (Brandt [4] and [5]).
The combined speed-up achieved by computer chips and algorithms is enormous,
and has enabled computational science to make big steps forward. Many problems
that people did not dream of solving two decades ago are now solved routinely.
The developments described in the previous section also have a downside: the increased power of computers and algorithms reduces the need to develop smart, sophisticated solution methods that make use of properties of the underlying systems.
For example, whereas in the 1960s and 1970s one often had to construct special basis
functions to solve certain problems, this can be avoided nowadays by using brute
force methods using grids that are refined in the right places.
The question arises whether we could use the knowledge generated by these very
accurate, but time-consuming, simulations to generate the special basis functions that
would have constituted the scientific approach a few decades ago. This is a promising
idea, as many phenomena are described very well by a few dominant modes.
The foregoing example clearly shows that it may not be necessary to calculate
all details, and nevertheless obtain a good understanding of the phenomena taking
place. There may be many reasons why such detail is not needed. There may be
physical reasons that can be formulated beforehand, and therefore incorporated into
the model before starting calculations. A very nice example is that of simulating
the blood flow in the human body, as described in many publications by the group
of Alfio Quarteroni (see [30], but also work of others). In his work, the blood flow
in the body is split into different parts. In very small arteries, it is assumed that
the flow is one dimensional. In somewhat larger arteries, two dimensional models
are used, whereas in the heart, a three dimensional model is used as these effects
are very important and must be modelled in full detail. This approach does enable a simulation of the blood flow in the entire human body; clearly, such simulations would not be feasible if three-dimensional models were used throughout. This approach, which is also observed in different application areas, is also termed operational model order reduction. It uses physical (or other) insight to reduce the complexity of models.
Another example of operational model order reduction is the simulation of elec-
tromagnetic effects in special situations. As is well known, electromagnetic effects
can be fully described by a system of Maxwell equations. Despite the power of
current computers and algorithms, solving the Maxwell equations in 3-dimensional
space and time is still an extremely demanding problem, so that simplifications are
being made whenever possible. An assumption that is made quite often is that of
quasi-statics, which holds whenever the frequencies playing a role are low to moder-
ate. In this case, simpler models can be used, and techniques for solving these models
have been developed (see [32]).
In special situations, the knowledge about the problem and solutions can be so
detailed, that a further reduction of model complexity can be achieved. A promi-
nent and very successful example is the compact modelling [19] of semiconductor
devices. Integrated circuits nowadays consist of millions of semiconductor devices,
such as resistors, capacitors, inductors, diodes and transistors. For resistors, capac-
itors and inductors, simple linear models are available, but diodes and especially
transistors are much more complicated. Their behaviour is not easily described, but
can be calculated accurately using software dedicated to semiconductor device sim-
ulation. However, it is impossible to perform a full simulation of the entire electronic
circuit, by using the results of the device simulation software for each of the millions
of transistors. This would imply coupling of the circuit simulation software to the
device simulation software. Bearing in mind that device simulations are often quite
time consuming (it is an extremely nonlinear problem, described by a system of three
partial differential equations), this is an impossible task.
The solution to the aforementioned problem is to use accurate compact mod-
els for each of the transistors. Such models look quite complicated, and can easily
occupy a number of pages of description, but consist of a set of algebraic relations
that can be evaluated very quickly. The compact models are constructed using a large
amount of measurements and simulations, and, above all, using much human insight.
The models often depend on as many as 40-50 parameters, so that they are widely
applicable for many different types and geometries of transistors. The most promi-
nent model nowadays is the Penn-State-Philips (PSP) model for MOS transistors (see
Figure 5), being chosen as the world standard in 2007 [15]. It is very accurate, in-
cluding also derivatives up to several orders. Similar developments can be observed
at Berkeley [6], where the BSIM suite of models is constructed.
Using these so-called compact models, it is possible to perform simulations of
integrated circuits containing millions of components, both for steady-state and time-
dependent situations. Compact modelling, therefore, plays an extremely important
role in enabling such demanding simulations. The big advantage of this approach is
that the compact models are formulated in a way that is very appealing to designers,
as they are formulated in terms of components they are very familiar with.
Unfortunately, in many cases, it is not possible to a priori simplify the model de-
scribing the behaviour. In such cases, a procedure must be used, in which we rely on
the automatic identification of potential simplifications. Designing such algorithms
is, in essence, the task of the field of model order reduction. In the remainder of this
chapter, we will describe it in more detail.
There are several definitions of model order reduction, and which one is preferred depends on the context. Originally, MOR was developed in the area of systems and control theory, which studies properties of dynamical systems with the aim of reducing their complexity while preserving their input-output behavior as much as possible. The field has also been taken up by numerical mathematicians, especially
after the publication of methods such as PVL [9]. Nowadays, model order reduction
is a flourishing field of research, both in systems and control theory and in numer-
ical analysis. This has a very healthy effect on MOR as a whole, bringing together
different techniques and different points of view, pushing the field forward rapidly.
So what is model order reduction about? As was mentioned in the foregoing
sections, we need to deal with the simplification of dynamical models that may con-
tain many equations and/or variables ($10^5$ to $10^9$). Such simplification is needed in
order to perform simulations within an acceptable amount of time and limited stor-
age capacity, but with reliable outcome. In some cases, we would even like to have
on-line predictions of the behaviour with acceptable computational speed, in order
to be able to perform optimizations of processes and products.
Model Order Reduction tries to quickly capture the essential features of a struc-
ture. This means that in an early stage of the process, the most basic properties of
the original model must already be present in the smaller approximation. At a certain
moment the process of reduction is stopped. At that point all necessary properties of
the original model must be captured with sufficient precision. All of this has to be
done automatically.
Over the years, many new developments in model order reduction have been reported; some are tailored to specific applications, others are more general. In the second and third parts of this book, many of these new developments are discussed. In the remainder of this chapter, we will discuss some basic methods and properties, as this is essential knowledge required for the remainder of the book.
In essence, the systems studied in model order reduction are dynamical systems of the form
$$\frac{dx}{dt} = f(x, u), \qquad y = g(x, u). \tag{1}$$
Here, u is the input of the system, y the output, and x the so-called state variable.
The dynamical system can thus be viewed as an input-output system, as displayed in
Figure 7.
The complexity of the system is characterized by the number of its state vari-
ables, i.e. the dimension n of the state space vector x. It should be noted that similar
dynamical systems can also be defined in terms of differential algebraic equations, in which case the first set of equations in (1) is replaced by $F\!\left(\frac{dx}{dt}, x, u\right) = 0$.
Model order reduction can now be viewed as the task of reducing the dimension
of the state space vector, while preserving the character of the input-output relations.
In other words, we should find a dynamical system of the form
$$\frac{d\hat{x}}{dt} = \hat{f}(\hat{x}, u), \qquad y = \hat{g}(\hat{x}, u),$$
where the dimension of x̂ is much smaller than n. In order to provide a good ap-
proximation of the original input-output system, a number of conditions should be
satisfied:
• the approximation error should be small,
• properties of the original system, such as stability and passivity, should be preserved (see Sections 2.4-2.6),
• the reduction procedure should be computationally efficient.
A special case is encountered if the functions f and g are linear, in which case
the system reads
$$\frac{dx}{dt} = Ax + Bu, \qquad y = C^T x + Du.$$
A large class of model order reduction methods is based on projection. Let V and W be $n \times k$ matrices whose columns span suitably chosen subspaces, normalized such that $W^* V = I_k$. Then
$$\Pi = V W^*$$
is an oblique projection along the kernel of W ∗ onto the k-dimensional subspace that
is spanned by the columns of the matrix V .
If we substitute the decomposition $x = V\hat{x} + T_1\tilde{x}$ (with the columns of $T_1$ spanning the complement of the range of V) into the dynamical system (1), the set of equations obtained is
$$\frac{d\hat{x}}{dt} = W^* f(V\hat{x} + T_1\tilde{x}, u), \qquad y = g(V\hat{x} + T_1\tilde{x}, u).$$
Note that this is an exact expression. The approximation occurs when we delete the terms involving $\tilde{x}$, in which case we obtain the reduced system
$$\frac{d\hat{x}}{dt} = W^* f(V\hat{x}, u), \qquad y = g(V\hat{x}, u).$$
For this to produce a good approximation to the original system, the neglected
term T1 x̃ must be sufficiently small. This has implications for the choice of the pro-
jection Π. In the following sections, various ways of constructing this projection are
discussed.
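The projection framework itself is simple to implement once V and W are available. The following sketch (in Python, with illustrative names; the bases V and W are assumed to be given, e.g. by one of the methods discussed below) carries out the Petrov-Galerkin reduction for the linear case:

```python
import numpy as np

def project_linear_system(A, B, C, V, W):
    """Sketch of a Petrov-Galerkin reduction of dx/dt = A x + B u,
    y = C^T x, with x approximated by V x_hat. V and W are n-by-k
    matrices; W is rescaled so that W^* V = I_k (biorthogonality)."""
    W = W @ np.linalg.inv(W.conj().T @ V)   # enforce W^* V = I_k
    A_hat = W.conj().T @ A @ V              # reduced system matrix (k x k)
    B_hat = W.conj().T @ B                  # reduced input matrix
    C_hat = V.T @ C                         # reduced output: y = C_hat^T x_hat
    return A_hat, B_hat, C_hat
```

The quality of the reduced model is determined entirely by the choice of the subspaces spanned by V and W; the projection itself is cheap.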
In order to illustrate the various concepts related to model order reduction of input-
output systems as described in the previous section, we consider the linear time-
invariant system
$$\frac{dx}{dt} = Ax + Bu, \qquad y = C^T x.$$
A common way to solve the differential equation is by transforming it from the time
domain to the frequency domain, by means of a Laplace transform defined as
$$\mathcal{L}(f)(s) \equiv \int_0^\infty f(t)\, \exp(-st)\, dt.$$
If we apply this transform to the system, assuming that x(0) = 0, the system is transformed to a purely algebraic system of equations:
$$s\mathbf{X}(s) = A\mathbf{X}(s) + B\mathbf{U}(s), \qquad \mathbf{Y}(s) = C^T \mathbf{X}(s),$$
where the capital letters indicate the Laplace transforms of the respective lower case quantities. This immediately leads to the following relation for the transfer function:
$$H(s) = C^T (sI_n - A)^{-1} B.$$
This transfer function represents the direct relation between input and output in the
frequency domain, and therefore the behavior of the system in frequency domain. For
example, in the case of electronic circuits this function may describe the transfer from
currents to voltages, and is then termed impedance. If the transfer is from voltages to
currents, then the transfer function corresponds to the admittance.
Note that if the system has more than one input or more than one output, then B and C have more than one column. This makes H(s) a matrix function, whose (i, j) entry denotes the transfer from input j to output i.
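For a system of modest size, the transfer function can be evaluated directly from this relation by solving one shifted linear system per frequency point. A minimal sketch (Python; names are illustrative):

```python
import numpy as np

def transfer_function(A, B, C, s):
    """Evaluate H(s) = C^T (s I - A)^{-1} B via a linear solve,
    avoiding the explicit inverse. Returns a matrix with one row
    per output and one column per input."""
    n = A.shape[0]
    X = np.linalg.solve(s * np.eye(n) - A, B)
    return C.T @ X

# Frequency sweep along the imaginary axis, s = i * omega:
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
C = np.array([[1.0], [0.0]])
for omega in (0.1, 1.0, 10.0):
    print(omega, transfer_function(A, B, C, 1j * omega))
```

For the very large systems targeted by model order reduction, exactly this kind of sweep becomes too expensive, which is why reduced models are constructed first.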
2.2 Moments
The transfer function is a function in s, and can therefore be expanded into a moment
expansion around s = 0:
$$H(s) = M_0 + M_1 s + M_2 s^2 + \ldots,$$
where the $M_i$ are called the moments of the transfer function. A well-known quantity in circuit analysis that is closely related to the moments is the Elmore delay, defined as
$$T_D \equiv \int_0^\infty t\, h(t)\, dt,$$
where h(t) is the impulse response function, which is the response of the system to the Dirac delta input. The transfer function in the frequency domain is the Laplace transform of the impulse response function:
$$H(s) = \int_0^\infty h(t)\, \exp(-st)\, dt.$$
Expanding the exponential function in a power series, it is seen that the Elmore delay
indeed corresponds to the first order moment of the transfer function.
Of course, the transfer function can also be expanded around some non-zero s0 .
We then obtain a similar expansion in terms of moments. This may be advantageous
in some cases, and truncation of that alternative moment expansion may lead to better
approximations.
The transfer function can also be written in pole-residue form,
$$H(s) = \sum_{j=1}^{n} \frac{R_j}{s - p_j},$$
where the $p_j$ are the poles, and the $R_j$ are the corresponding residues. The poles are directly related to the eigenvalues of the matrix $-A^{-1}$: if $\lambda_j$ is such an eigenvalue, the corresponding pole is $p_j = -1/\lambda_j$. In fact, if the matrix E of eigenvectors is non-singular, we can write
$$-A^{-1} = E \Lambda E^{-1},$$
where the diagonal matrix Λ contains the eigenvalues $\lambda_j$. Substituting this into the expression for the transfer function, we obtain
$$H(s) = C^T E \left( I_n + s\Lambda \right)^{-1} \Lambda E^{-1} B.$$
Hence, if B and C contain only one column (which corresponds to the single input, single output or SISO case), then
$$H(s) = \sum_{j=1}^{n} \frac{l_j^T r_j}{1 + s\lambda_j},$$
where the $l_j$ and $r_j$ are the left and right eigenvectors, respectively.
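Numerically, a pole-residue decomposition can be obtained from an eigendecomposition. The sketch below (Python; illustrative names, SISO case) uses the equivalent standard form given above, with the poles taken directly as eigenvalues of A:

```python
import numpy as np

def pole_residue(A, b, c):
    """Diagonalize A = E diag(p) E^{-1}; then
    H(s) = c^T (sI - A)^{-1} b = sum_j R_j / (s - p_j),
    with residues built from left and right eigenvector components."""
    p, E = np.linalg.eig(A)              # poles = eigenvalues of A
    r = np.linalg.solve(E, b.ravel())    # right components: E^{-1} b
    l = c.ravel() @ E                    # left components:  c^T E
    return p, l * r                      # poles and residues
```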
We see that there is a one-to-one relation between the poles and the eigenvalues of
the system. If the original dynamical system originates from a differential algebraic
system, then a generalized eigenvalue problem needs to be solved. Since the poles
appear directly in the pole-residue formulation of the transfer function, there is also
a strong relation between the transfer function and the poles or, stated differently,
between the behavior of the system and the poles. If one approximates the system,
one should take care to approximate the most important poles. There are several
methods that do this, which are discussed in later chapters of this book. In general,
we can say that, since the transfer function is usually plotted for imaginary points
s = ωi, the poles that have a small imaginary part dictate the behavior of the transfer
function for small values of the frequency ω. Consequently, the poles with a large
imaginary part are needed for a good approximation at higher frequencies. Therefore,
a successful reduction method aims at capturing the poles with small imaginary part rather well, and leaves out poles with a small residue.
2.4 Stability
Poles and eigenvalues of a system are strongly related to the stability of the system.
Stability is the property of a system that ensures that the output signal of a system is
limited (in the time domain).
Consider again the system (1). The system is stable if and only if, for all eigenval-
ues λj , we have that Re(λj ) ≤ 0, and all eigenvalues with Re(λj ) = 0 are simple.
In that case, the corresponding matrix A is termed stable.
There are several properties associated with stability. Clearly, if A is stable, then $A^{-1}$ is also stable. Stability of A also implies stability of $A^T$ and of $A^*$.
Finally, if the product of matrices AB is stable, then also BA can be shown to be
stable. It is also clear that, due to the relation between eigenvalues of A and poles of
the transfer function, stability can also be formulated in terms of the poles of H(s).
The more general linear dynamical system
$$Q\,\frac{dx}{dt} = Ax + Bu, \qquad y = C^T x,$$
is stable if and only if for all generalized eigenvalues we have that $\mathrm{Re}(\lambda_j(Q, A)) \le 0$, and all generalized eigenvalues for which $\mathrm{Re}(\lambda_j(Q, A)) = 0$ are simple. The set of generalized eigenvalues $\sigma(Q, A)$ is defined as the collection of eigenvalues of the generalized eigenvalue problem
$$Qx = \lambda Ax.$$
In this case, the pair of matrices (Q, A) is termed a matrix pencil. This pencil is said to be regular if there exists at least one value λ for which $Q + \lambda A$ is non-singular.
Just as for the simpler system discussed in the above, stability can also be formulated
in terms of the poles of the corresponding transfer function.
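In practice, the stability criterion for the pencil can be checked with a generalized eigenvalue solver. A small sketch (Python/SciPy; illustrative names, following the convention $Qx = \lambda Ax$ used above, and not checking simplicity of the purely imaginary eigenvalues):

```python
import numpy as np
from scipy.linalg import eig

def pencil_is_stable(Q, A, tol=1e-12):
    """Check Re(lambda_j(Q, A)) <= 0 for all finite generalized
    eigenvalues of Q x = lambda A x."""
    lam = eig(Q, A, right=False)
    lam = lam[np.isfinite(lam)]      # discard infinite eigenvalues
    return bool(np.all(lam.real <= tol))
```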
The converse of this theorem is not true, as can be seen when we take
$$A = \begin{pmatrix} -\frac{1}{3} & -1 \\ 1 & 2 \end{pmatrix}, \qquad x = \begin{pmatrix} 1 \\ 0 \end{pmatrix};$$
both eigenvalues of this A have positive real part, yet $x^* A x = -\frac{1}{3} < 0$. Matrices with the property that $\mathrm{Re}(x^* A x) > 0$ for all $x \in \mathbb{C}^n$ are termed positive
real. The counter example shows that the class of positive real matrices is smaller
than the class of matrices for which all eigenvalues have positive real part. In the
next section, this new and restricted class will be discussed in more detail. For now,
we remark that a number of properties of positive real matrices are easily derived. If
A is positive real, then this also holds for A−1 (if it exists). Furthermore, A is positive
real if and only if A∗ is positive real. If two matrices A and B are both positive real,
then any linear combination αA + βB is also positive real provided Re(α) > 0 and
Re(β) > 0.
There is an interesting relation between positive real and positive definite ma-
trices. Evidently, the class of positive definite matrices is a subclass of the set of
positive real matrices. But we also have:
Theorem 2. A matrix $A \in \mathbb{C}^{n \times n}$ is positive real if and only if the Hermitian part of A (i.e. $\frac{1}{2}(A + A^*)$) is symmetric positive definite.
Similarly one can prove that a matrix is non-negative real if and only if its Hermitian part is symmetric positive semi-definite.
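Theorem 2 yields a simple computational test for positive realness. A sketch (Python; illustrative names):

```python
import numpy as np

def is_positive_real(A):
    """Theorem 2: A is positive real iff its Hermitian part
    (A + A^*)/2 is positive definite."""
    H = 0.5 * (A + A.conj().T)
    return bool(np.all(np.linalg.eigvalsh(H) > 0))

# The counterexample from the text: both eigenvalues of A lie in the
# right half plane, yet A is not positive real.
A = np.array([[-1.0 / 3.0, -1.0], [1.0, 2.0]])
print(np.linalg.eigvals(A))   # eigenvalues with positive real part
print(is_positive_real(A))    # False: Hermitian part has eigenvalue -1/3
```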
2.6 Passivity
Consider a circuit with N ports through which it is connected to its environment. The instantaneous power absorbed through the ports is
$$w_{\mathrm{inst}}(t) \equiv \sum_{j=1}^{N} v_j(t)\, i_j(t),$$
where $v_j(t)$ and $i_j(t)$ are the real instantaneous voltage and current at the j-th port.
An N -port contains stored energy, say E(t). If the system dissipates energy at rate
wd (t), and contains sources which provide energy at rate ws (t), then the energy
balance during a time interval [t1 , t2 ] looks like:
$$\int_{t_1}^{t_2} \left( w_{\mathrm{inst}} + w_s - w_d \right) dt = E(t_2) - E(t_1). \tag{5}$$
The N-port is called passive if
$$E(t_2) - E(t_1) \le \int_{t_1}^{t_2} w_{\mathrm{inst}}\, dt \tag{6}$$
over any time interval $[t_1, t_2]$. This means that the increase in stored energy must be less than or equal to the energy delivered through the ports. The N-port is called lossless if (6) holds with equality over any interval. Assume that the port quantities exhibit purely exponential time-dependence, at a single complex frequency s. We may then write:
$$v(t) = \hat{v} \exp(st), \qquad i(t) = \hat{i} \exp(st),$$
where $\hat{v}$ and $\hat{i}$ are the complex amplitudes. We define the total complex power absorbed to be the inner product of $\hat{i}$ and $\hat{v}$,
$$w = \hat{i}^* \hat{v},$$
and passivity can then be expressed as a condition on the transfer function: for all s with $\mathrm{Re}(s) > 0$,
$$H^*(s) + H(s) \ge 0$$
in the matrix sense, i.e. H is positive real.
One of the basic and earliest methods in Model Order Reduction is Asymptotic
Waveform Evaluation (AWE), proposed by Pillage and Rohrer in 1990 [7, 29]. The
underlying idea of the method is that the transfer function can be well approximated
by a Padé approximation. A Padé approximation is a ratio of two polynomials P (s)
and Q(s). AWE calculates a Padé approximation of finite degree, so the degree of
P (s) and Q(s) is finite and deg(Q(s)) ≥ deg(P (s)). There is a close relation be-
tween the Padé approximations and Krylov subspace methods (see Chapter 2). To
explain this fundamental property, consider the general system
$$\frac{dx}{dt} = Ax + Bu, \qquad y = C^T x,$$
and expand its Laplace-transformed state around an expansion point $s_0$:
$$\mathbf{X}(s) = \sum_{i=0}^{\infty} X_i (s - s_0)^i.$$
Here, the $X_i$ are the moments. Assuming $\mathbf{U}(s) = 1$, and equating like powers of $(s - s_0)^i$, we find:
$$(s_0 I_n - A) X_0 = B,$$
for the term corresponding to i = 0, and for $i \ge 1$:
$$(s_0 I_n - A) X_i = -X_{i-1}.$$
The process can be terminated after finding a sufficient number of moments, and
the hope is that then a good approximation has been found for the transfer function.
Clearly, this approximation is of the form
$$\tilde{H}(s) = \sum_{k=0}^{n} m_k (s - s_0)^k,$$
with moments $m_k = C^T X_k$, whereas the Padé approximation is a rational function,
$$\hat{H}(s) = \frac{P(s - s_0)}{Q(s - s_0)}.$$
Letting
$$P(s) = \sum_{k=0}^{p} a_k (s - s_0)^k, \qquad Q(s) = \sum_{k=0}^{p+1} b_k (s - s_0)^k,$$
we find that the following relation must hold:
$$\sum_{k=0}^{p} a_k (s - s_0)^k = \left( \sum_{k=0}^{n} m_k (s - s_0)^k \right) \left( \sum_{k=0}^{p+1} b_k (s - s_0)^k \right).$$
Equating like powers of s − s0 (for the higher powers), and setting b0 = 1, we find
the following system to be solved:
$$\begin{pmatrix} m_0 & m_1 & \cdots & m_p \\ m_1 & m_2 & \cdots & m_{p+1} \\ \vdots & \vdots & \ddots & \vdots \\ m_p & m_{p+1} & \cdots & m_{2p} \end{pmatrix} \begin{pmatrix} b_{p+1} \\ b_p \\ \vdots \\ b_1 \end{pmatrix} = - \begin{pmatrix} m_{p+1} \\ m_{p+2} \\ \vdots \\ m_{2p+1} \end{pmatrix}, \tag{2}$$
from which we can extract the coefficients bi , i = 1, . . . , p + 1 of Q. A similar step
can be used to subsequently find the coefficients of the polynomial P .
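The two steps just described, generating moments by repeated solves with $(s_0 I_n - A)$ and solving the Hankel system (2) for the denominator coefficients, can be sketched as follows (Python/SciPy; illustrative names, SISO case):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def awe_moments(A, b, c, s0, num):
    """m_k = c^T X_k with (s0 I - A) X_0 = b and (s0 I - A) X_k = -X_{k-1};
    a single LU factorization is reused for all moments."""
    lu = lu_factor(s0 * np.eye(A.shape[0]) - A)
    X = lu_solve(lu, b)
    m = [c @ X]
    for _ in range(num - 1):
        X = lu_solve(lu, -X)
        m.append(c @ X)
    return np.array(m)

def pade_denominator(m, p):
    """Solve the Hankel system (2) for (b_{p+1}, ..., b_1), with b_0 = 1;
    requires the moments m_0, ..., m_{2p+1}."""
    H = np.array([[m[i + j] for j in range(p + 1)] for i in range(p + 1)])
    bs = np.linalg.solve(H, -m[p + 1:2 * p + 2])
    return bs[::-1]                    # coefficients b_1, ..., b_{p+1}
```

The ill-conditioning discussed next is clearly visible in such an implementation: the condition number of the Hankel matrix grows rapidly with p.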
The problem with the above method, and with AWE in general, is that the coefficient matrix in (2) quickly becomes ill-conditioned as the number of moments used
goes up. In fact, practical experience indicates that applicability of the method stops
once 8 or more moments are used. The method can be made more robust by using
Complex Frequency Hopping [29], meaning that more than one expansion point is
used. However, the method remains computationally demanding and, for that reason,
alternatives as described in the next subsections are much more popular nowadays.
In the PVL (Padé Via Lanczos) method [9], the transfer function is written in shifted-and-inverted form,
$$H(s) = c^T \left( I_n - (s - s_0)\hat{A} \right)^{-1} r,$$
where
$$\hat{A} = -(s_0 I_n - A)^{-1},$$
and
$$r = (s_0 I_n - A)^{-1} b.$$
This transfer function can, just as in the case of AWE, be approximated well by a rational function in the form of a Padé approximation. In PVL this approximation is found via the Lanczos algorithm. By running q steps of this algorithm (see Chapter 2 for details on the Lanczos method), an approximation of $\hat{A}$ is found in the form of a tridiagonal matrix $T_q$, and the approximate transfer function is of the form
$$H_q(s) = c^T r \; e_1^T \left( I_q - (s - s_0) T_q \right)^{-1} e_1,$$
where $e_1$ is the first unit vector. The moments can also be found from this expression:
$$m_k = c^T r \; e_1^T T_q^k e_1.$$
In [35, 36], a two-step Lanczos algorithm is described for very large systems: in the first phase, the original problem is reduced to a system of size, say, a few thousand. In the second phase, the 'ordinary' Lanczos procedure is used to reduce the problem much further, now using the inverse of the coefficient matrix. For more details, see [35, 36].
The search for provably passive model order reduction techniques continued after the
publication of PRIMA. A new development was the construction of approximations
using the framework of Laguerre functions, as proposed in [23,24]. In these methods,
the transfer function is not shifted-and-inverted, as is the case in the PVL and Arnoldi
methods. Instead, it is expanded in terms of Laguerre functions that are defined as
$$\phi_k^\alpha(t) \equiv \sqrt{2\alpha}\, \exp(-\alpha t)\, \ell_k(2\alpha t),$$
where the Laguerre polynomials are given by
$$\ell_k(t) \equiv \frac{\exp(t)}{k!} \frac{d^k}{dt^k} \left( \exp(-t)\, t^k \right).$$
The Laplace transform of $\phi_k^\alpha(t)$ is
$$\Phi_k^\alpha(s) = \frac{\sqrt{2\alpha}}{s + \alpha} \left( \frac{s - \alpha}{s + \alpha} \right)^k.$$
Furthermore, it can be shown (see [20]) that the Laguerre expansion of the transfer function is
$$H(s) = \frac{2\alpha}{s + \alpha}\, C^T \sum_{k=0}^{\infty} (\alpha I_n - A)^{-1} \left[ (-\alpha I_n - A)(\alpha I_n - A)^{-1} \right]^k B \left( \frac{s - \alpha}{s + \alpha} \right)^k.$$
Clearly, this expansion gives rise to a Krylov space again. The number of linear
systems that needs to be solved is equivalent to that in PRIMA, so the method is
computationally competitive.
The algorithm presented in [23] builds an orthonormal basis for the associated Krylov space by means of an SVD. In later work, a method is presented that makes use of intermediate orthogonalisation, and has been shown to have certain advantages over using an SVD.
Just as in the PRIMA method, the Laguerre-based methods make use of explicit
projection of the system matrices. Consequently, these methods preserve stability and
passivity. Since α is a real number, the matrices in the Laguerre algorithm remain real
during projection, thereby making it suitable for circuit synthesis (see [21, 24]).
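The recursion implied by the Laguerre expansion is easily turned into a basis-generation procedure: the vectors $(\alpha I_n - A)^{-1}[(-\alpha I_n - A)(\alpha I_n - A)^{-1}]^k B$ satisfy a simple two-term recurrence. A sketch (Python/SciPy; illustrative names, not the exact algorithm of [23]):

```python
import numpy as np
from scipy.linalg import lu_factor, lu_solve

def laguerre_basis(A, B, alpha, q):
    """Build v_0 = (aI - A)^{-1} B and v_{k+1} = (aI - A)^{-1}(-aI - A) v_k
    for k = 0, ..., q-2, then orthonormalise by QR. B is an n-by-m matrix;
    since alpha is real, all arithmetic stays real."""
    n = A.shape[0]
    lu = lu_factor(alpha * np.eye(n) - A)     # one LU, reused q times
    v = lu_solve(lu, B)
    blocks = [v]
    for _ in range(q - 1):
        v = lu_solve(lu, -alpha * v - A @ v)  # apply (-aI - A), then solve
        blocks.append(v)
    V, _ = np.linalg.qr(np.hstack(blocks))
    return V                                  # projection basis
```

Projecting the system matrices onto the columns of V, as in PRIMA, then yields the reduced model.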
As mentioned before, model order reduction has its roots in the area of systems and
control theory. Within this area, methods have been developed that differ consider-
ably from the Krylov based methods as discussed in subsections 3.1-3.5. The basic
idea is to truncate the dynamical system studied at some point. To illustrate how it
works, consider again the linear dynamical system (1):
$$\frac{dx}{dt} = Ax + Bu, \qquad y = C^T x + Du.$$
Applying a state space transformation
$$T\tilde{x} = x$$
does not affect the input-output behavior of the system. This transformation could be chosen to be based on the eigenvalue decomposition of the matrix A:
$$AT = T\Lambda.$$
When T is non-singular, T −1 AT is a diagonal matrix consisting of the eigenvalues
of A, and we could use an ordering such that the eigenvalues on the diagonal occur
in order of decreasing magnitude. The system can then be truncated by restricting
the matrix T to the dominant eigenvalues. This process is termed modal truncation.
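A sketch of modal truncation (Python; illustrative names, using the ordering by eigenvalue magnitude mentioned above, though other dominance criteria are also used in practice):

```python
import numpy as np

def modal_truncation(A, B, C, q):
    """Diagonalize A T = T Lambda, order the modes, keep the first q.
    Conventions as in the text: y = C^T x. The reduced model is complex
    in general; real variants pair complex conjugate modes."""
    lam, T = np.linalg.eig(A)
    order = np.argsort(-np.abs(lam))        # decreasing magnitude
    lam, T = lam[order], T[:, order]
    Tinv = np.linalg.inv(T)
    A_r = np.diag(lam[:q])                  # reduced (diagonal) state matrix
    B_r = Tinv[:q, :] @ B                   # reduced input matrix
    C_r = T[:, :q].T @ C                    # reduced output: y = C_r^T x_r
    return A_r, B_r, C_r
```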
Another truncation method is that of balanced truncation, usually known as
Truncated Balanced Realization (TBR). This method is based upon the observation
that only the largest singular values of a system are important. As there is a very
good reference to this method, containing all details, we will only summarize the
main concepts. The reader interested in the details of the method is referred to the
book by Antoulas [2].
The controllability Gramian and the observability Gramian associated to the lin-
ear time-invariant system (A, B, C, D) are defined as follows:
$$P = \int_0^\infty e^{At} B B^* e^{A^* t}\, dt, \tag{3a}$$
and
$$Q = \int_0^\infty e^{A^* t} C^* C e^{At}\, dt, \tag{3b}$$
respectively.
The matrices P and Q are the unique solutions of two Lyapunov equations:
$$AP + PA^* + BB^* = 0, \tag{4a}$$
$$A^*Q + QA + C^*C = 0. \tag{4b}$$
The Lyapunov equations for the Gramians rely on the assumption that A is stable; stability of A implies that the infinite integrals defined in (3) are bounded.
Finding the solution of the Lyapunov equation is quite expensive. There are direct
ways and iterative ways to do this. One of the interesting iterative methods to find a
solution is vector ADI [8, 33]. New developments can also be found in the work of Benner [3].
After finding the Gramians, we look for a state space transformation which balances the system. A system is called balanced if $P = Q = \Sigma = \mathrm{diag}(\sigma_i)$. The transformation is applied to the system as follows:
$$A' = T^{-1}AT, \qquad B' = T^{-1}B, \qquad C' = CT, \qquad D' = D.$$
Under this transformation the Gramians transform as
$$P' = T^{-1}PT^{-*}, \qquad Q' = T^*QT.$$
The balancing transformation can be computed as follows. Let R be a Cholesky factor of P, i.e. $P = R^T R$, and compute the symmetric eigendecomposition
$$RQR^T = U^T \Sigma^2 U. \tag{5}$$
Then the transformation $T \in \mathbb{R}^{n \times n}$ is defined as:
$$T = R^T U^T \Sigma^{-1/2}. \tag{6}$$
This procedure is called balancing. It can be shown that T indeed balances the system:
$$P' = Q' = \Sigma. \tag{7}$$
Since a transformation was defined which transforms the system according to the Hankel singular values [2], a truncation can now be defined very easily.
Σ can be partitioned:
$$\Sigma = \begin{pmatrix} \Sigma_1 & 0 \\ 0 & \Sigma_2 \end{pmatrix}, \tag{8}$$
where $\Sigma_1$ contains the largest Hankel singular values. This is the main advantage of the method: an appropriate size of the reduction can now be chosen on the basis of the Hankel singular values, instead of being guessed.
$A'$, $B'$ and $C'$ can be partitioned in conformance with Σ:
$$A' = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \tag{9}$$
$$B' = \begin{pmatrix} B_1 \\ B_2 \end{pmatrix}, \tag{10}$$
$$C' = \begin{pmatrix} C_1 & C_2 \end{pmatrix}. \tag{11}$$
The reduced model is then based on $A_{11}$, $B_1$ and $C_1$:
$$\dot{\tilde{x}} = A_{11}\tilde{x} + B_1 u, \qquad y = C_1 \tilde{x}.$$
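The whole procedure, equations (3) through (11), fits in a few lines once a Lyapunov solver is available. The following is a square-root sketch (Python/SciPy; illustrative names, assuming A stable, P positive definite, and the convention y = C^T x):

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, cholesky, svd

def balanced_truncation(A, B, C, k):
    """Balanced truncation following (5)-(6): Gramians from the Lyapunov
    equations (4), balancing via a Cholesky factor of P and an SVD,
    truncation to the k largest Hankel singular values."""
    P = solve_continuous_lyapunov(A, -B @ B.T)    # A P + P A^T + B B^T = 0
    Q = solve_continuous_lyapunov(A.T, -C @ C.T)  # A^T Q + Q A + C C^T = 0
    R = cholesky(P)                        # upper triangular, P = R^T R
    U, s2, _ = svd(R @ Q @ R.T)            # R Q R^T = U diag(s2) U^T
    sigma = np.sqrt(s2)                    # Hankel singular values
    T = R.T @ U @ np.diag(sigma ** -0.5)   # balancing transformation (6)
    Tinv = np.linalg.inv(T)
    Ab, Bb, Cb = Tinv @ A @ T, Tinv @ B, T.T @ C
    return Ab[:k, :k], Bb[:k, :], Cb[:k, :]   # (A11, B1, C1)
```

The decay of the Hankel singular values indicates how large k must be chosen for a given accuracy.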
It is sometimes proposed to apply Balanced Truncation-like methods as a second
reduction step, after having applied a Krylov method. This can be advantageous in
some cases, and has also been done by several authors.
A remark should be made on solving the Lyapunov equations. These equations are normally solved by first calculating a Schur decomposition of the matrix A. Therefore, finding the solution of the Lyapunov equations is quite expensive: the number of operations is at least $O(n^3)$, where n is the size of the original model. Hence, it is only feasible for small systems. Furthermore, if we arrived at this point using the inverse of an ill-conditioned matrix, we have to be careful: B can have very large entries, which introduce tremendous errors in the solution of the Lyapunov equation. Dividing both equations by the square of the norm of B spreads this effect a bit, which makes finding a useful solution feasible. In recent years, however, quite a lot of
progress has been made on solving larger Lyapunov equations. We refer to the work
of Benner [3].
Another remark: since the matrices are projected by a similarity transform,
preservation of passivity is not guaranteed in this method. In [27] a Balanced Trunca-
tion method is presented which is provably passive. Here also Poor Man’s TBR [28]
should be mentioned as a fruitful approach to implement TBR in a more efficient way. We refer to a later chapter in this book for more information on this topic.
To define the Hankel norm we first have to define the Hankel operator H:
$$H : u \mapsto y = \int_{-\infty}^{0} h(t - \tau)\, u(\tau)\, d\tau, \tag{12}$$
where h(t) is the impulse response in the time domain: $h(t) = C \exp(At) B$ for $t > 0$.
This operator considers the past input, i.e. the energy that was put into the system before t = 0 in order to reach a given state. The amount of energy needed to reach a state tells something about the controllability of that state. If, after t = 0, no energy is put into the system and the system is stable, then the corresponding output will be bounded as well. The energy that comes out of a state gives information about the observability of that state. The observability and controllability Gramians were defined in (3).
Therefore, the maximal gain of this Hankel operator can be calculated:
$$\Sigma_H = \sup_{u \in L_2(-\infty, 0]} \frac{\|y\|_2}{\|u\|_2}. \tag{13}$$
This norm is called the Hankel norm. Since it can be proved that $\Sigma_H = \lambda_{\max}^{1/2}(PQ) = \sigma_1$, the Hankel norm is nothing but the largest Hankel singular value of the system.
There exists a transfer function and a corresponding linear dynamical system
which minimizes this norm. In [18] an algorithm is given which explicitly generates
this optimal approximation in the Hankel-norm. The algorithm is based on a balanced
realization.
Krylov based methods build up the reduced order model by iterating, every iteration
leading to a larger size of the model. Hence, the first few iterations yield extremely
small models that will not be very accurate in general, and only when a sufficient
number of iterations has been performed, the approximate model will be sufficiently
accurate. Hence, characteristic for such methods is that the space in which approx-
imations are being sought is gradually built up. An alternative would be to start ‘at
the other end’, in other words, start with the original model, and reduce in it every
iteration, until we obtain a model that is small and yet has sufficient accuracy. This is
the basic idea behind a method termed selective node elimination. Although it can,
in principle, be applied in many situations, it has been described in literature only for
the reduction of electronic circuits. Therefore, we limit the discussion in this section
to that application.
Reduction of a circuit can be done by explicitly removing components and nodes
from the circuit. If a node in a circuit is removed, the behaviour of the circuit can be
preserved by adjusting the components around this node. Recall that the components connected to the node that is removed are also removed. The values of the remaining components surrounding this node must be changed to preserve the behavior of the
circuit. For circuits with only resistors this elimination can be done exactly.
We explain the idea for an RC circuit. For this circuit we have the following
circuit equation Y (s)v = (G + sC)v = J. The vector v here consists of the node
voltages, J is some input current term. Suppose the n-th node is eliminated. Then,
we partition the matrix such that the (n, n) entry forms one part:
$$\begin{pmatrix} \tilde{Y} & y \\ y^T & \gamma_n + s\chi_n \end{pmatrix} \begin{pmatrix} \tilde{v} \\ v_n \end{pmatrix} = \begin{pmatrix} J_1 \\ j_n \end{pmatrix}.$$
Gaussian elimination of $v_n$ yields a smaller system for $\tilde{v}$, in which entry (i, j) of $\tilde{Y}$ is decreased by $E_{ij}$ and entry i of $J_1$ by $F_i$, with
$$E_{ij} = \frac{y_i y_j}{\gamma_n + s\chi_n} = \frac{(g_{in} + sc_{in})(g_{jn} + sc_{jn})}{\gamma_n + s\chi_n}, \tag{15a}$$
$$F_i = \frac{y_i}{\gamma_n + s\chi_n}\, j_n = \frac{g_{in} + sc_{in}}{\gamma_n + s\chi_n}\, j_n. \tag{15b}$$
If node n is not a terminal node, $j_n$ is equal to 0 and therefore $F_i = 0$ for all i. We see that the elimination can also be written in matrix notation. Hence, this approach is analogous to solving the system by Gaussian elimination, and it can also be used to solve PDEs in an efficient way.
After the elimination process the matrix is no longer of the form G + sC, but is a fraction of polynomials in s. To get an RC-circuit representation back, an approximation is needed. Given the approximation method that is applied, removing one node may lead to a larger error than removing another.
Many others have investigated methods which are strongly related to the approach described here, for instance using a more symbolic approach. The strong attribute of the methods described above is that an RC circuit is the direct result. The error made in the reduction is controllable, but can be rather large. A disadvantage is that reducing an RLC-circuit in this way is more difficult, and it is hard to get an RLC-circuit back after reduction.
Apart from Krylov subspace methods and truncation methods, there is Proper Orthogonal Decomposition (POD), also known as Karhunen-Loève decomposition. This method was developed within the area of Computational Fluid Dynamics and is nowadays used frequently in many CFD problems. The method is so common there,
that it should at least be mentioned here as an option to reduce models derived in
an electronic setting. The strong point of POD is that it can be applied to non-linear
partial differential equations and is at the moment state-of-the-art for many of such
problems.
The idea underlying this method is that the time response of a system given a certain input contains the essential behavior of the system. The most important
aspects of this output in time are retrieved to describe the system. Therefore, the set of
outputs serves as a starting-point for POD. The outputs, which are called ‘snapshots’,
must be given or else be computed first.
A snapshot consists of a column vector describing the state at a certain moment. Let $W \in \mathbb{R}^{K \times N}$ be the matrix whose columns are the snapshots. N is the number of snapshots, K is the number of elements in every snapshot, say the number of state variables. Usually we have that N < K.
Let X be a separable Hilbert space with inner product $(\cdot, \cdot)$, and with an orthonormal basis $\{\varphi_i\}_{i \in I}$. Then, any element $T(x, t) \in X$ can be written as:
$$T(x, t) = \sum_i a_i(t)\, \varphi_i(x) = \sum_i \left( T(x, t), \varphi_i(x) \right) \varphi_i(x). \tag{16}$$
The time dependent coefficients $a_i$ are called Fourier coefficients. We are looking for an orthonormal basis $\{\varphi_i\}_{i \in I}$ such that the averages of the Fourier coefficients are ordered:
$$\langle a_1^2(t) \rangle \ge \langle a_2^2(t) \rangle \ge \ldots, \tag{17}$$
where $\langle \cdot \rangle$ is an averaging operator. In many practical applications the first few elements represent 99% of the content. Incorporating these elements in the approximation gives a good approximation; the misfit, the part to which the remaining elements contribute, is small.
It can be shown that this basis can be found among the first eigenvectors of the correlation operator
$$C\varphi = \langle (T(t), \varphi)\, T(t) \rangle. \tag{18}$$
In case we consider a finite dimensional problem, in a discrete and finite set of time points, this definition of C comes down to:
$$C = \frac{1}{N} W W^T. \tag{19}$$
Because C is self-adjoint, its eigenvalues are real and can be ordered such that:
$$\lambda_1 \ge \lambda_2 \ge \ldots \tag{20}$$
A basis consisting of the first, say q, eigenvectors of this matrix forms the optimal basis for POD of size q.
This leads to the following POD algorithm:
1. Input: the data in the matrix W consisting of the snapshots.
2. Define the correlation matrix C as:
$$C = \frac{1}{N} W W^T.$$
3. Compute the eigenvalue decomposition $C\Phi = \Phi\Lambda$.
4. Output: the basis Φ to project the system on.
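Equivalently, and more stably, the basis can be obtained from a singular value decomposition of the snapshot matrix, as the relations (22)-(24) below make explicit. A sketch (Python; illustrative names, W stored with one snapshot per column):

```python
import numpy as np

def pod_basis(W, q):
    """POD basis of size q from the snapshot matrix W (K x N, one
    snapshot per column): the left singular vectors of W are the
    eigenvectors of C = (1/N) W W^T."""
    Phi, S, _ = np.linalg.svd(W, full_matrices=False)
    energies = S**2 / W.shape[1]   # eigenvalues of C, ordered decreasingly
    return Phi[:, :q], energies
```

The ratio `energies[:q].sum() / energies.sum()` measures how much of the 'content' the basis captures, cf. the 99% rule of thumb mentioned above.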
In practice, the eigenvalue decomposition of C is obtained from the singular value decomposition of W:
$$W = \Phi \Sigma \Psi^T, \tag{22}$$
since
$$C = \frac{1}{N} W W^T = \frac{1}{N} \Phi \Sigma \Psi^T \Psi \Sigma \Phi^T = \frac{1}{N} \Phi \Sigma^2 \Phi^T. \tag{23}$$
The eigenvectors of C are in Φ:
$$C\Phi = \frac{1}{N} \Phi \Sigma^2 \Phi^T \Phi = \Phi \left( \frac{1}{N} \Sigma^2 \right), \tag{24}$$
from which it can be seen that the eigenvalues of C are $\frac{1}{N}\Sigma^2$.
Once the optimal orthonormal basis is found, the system is projected onto it. For this, we will focus on the following formulation of a possibly non-linear model:
$$C(x)\frac{dx}{dt} = f(x, u), \qquad y = h(x, u).$$
The state is written as
$$x = \hat{x} + r, \tag{25}$$
where $\hat{x} = \sum_{k=1}^{Q} a_k(t) w_k$ is the part in the span of the POD basis vectors $w_k$. When $\hat{x}$ is taken as the state in the model, an error is made:
$$C(\hat{x})\frac{d\hat{x}}{dt} - f(\hat{x}, u) = \rho \ne 0. \tag{26}$$
This error is forced to be perpendicular to the basis W; forcing this defines the projection fully. In the following derivation we use that $\frac{d\hat{x}}{dt} = \sum_{k=1}^{Q} \frac{da_k}{dt} w_k$:
$$0 = \left( C(\hat{x})\frac{d\hat{x}}{dt} - f(\hat{x}, u),\, w_j \right) = \left( C(\hat{x}) \sum_{k=1}^{Q} \frac{da_k}{dt} w_k - f(\hat{x}, u),\, w_j \right)$$
$$= \sum_{k=1}^{Q} \frac{da_k}{dt} \left( C(\hat{x}) w_k,\, w_j \right) - \left( f(\hat{x}, u),\, w_j \right). \tag{27}$$
In matrix form this is a system of ordinary differential equations for the coefficient vector $a(t) = (a_1(t), \ldots, a_Q(t))^T$:
$$A(a)\,\frac{da}{dt} = g(a(t), u(t)),$$
where:
$$A_{ij} = \left( C\Big( \sum_{k=1}^{Q} a_k(t) w_k \Big) w_i,\, w_j \right), \qquad g_j(a(t), u(t)) = \left( f\Big( \sum_{k=1}^{Q} a_k(t) w_k,\, u(t) \Big),\, w_j \right).$$
Obviously, if the time domain output of a system has yet to be calculated, this method is far too expensive. Fortunately, the much cheaper to obtain frequency response can be used. Consider therefore the following linear system:
$$(G + j\omega C)x = Bu, \qquad y = L^T x.$$
The snapshots are then obtained by solving this system for a set of sample frequencies $\omega_j$, $j = 1, \ldots, M$:
$$x_{\omega_j} = (j\omega_j C + G)^{-1} B,$$
where $x_{\omega_j} \in \mathbb{C}^{n \times 1}$. We can take the real and imaginary part, or linear combinations of both, for the POD process. We immediately see that the correlation matrix is an approximation of the controllability Gramian:
$$K = \frac{1}{M} \sum_{j=1}^{M} [j\omega_j C + G]^{-1} B B^* [-j\omega_j C^* + G^*]^{-1}. \tag{30}$$
This approach solves the problem of choosing which time-simulation is the most appropriate.
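A sketch of the frequency-domain snapshot generation (Python; illustrative names, G and C the conductance and capacitance matrices as above):

```python
import numpy as np

def frequency_snapshots(G, C, B, omegas):
    """Solve (j w C + G) x = B for each sample frequency w and stack the
    real and imaginary parts into a real snapshot matrix, whose
    correlation matrix approximates the Gramian (30)."""
    cols = []
    for w in omegas:
        X = np.linalg.solve(1j * w * C + G, B)
        cols.extend([X.real, X.imag])
    return np.hstack(cols)
```

The resulting matrix can be fed directly to the POD procedure described above.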
In the foregoing sections, we have reviewed a number of the most important meth-
ods for model order reduction. The discussion is certainly not exhaustive; alternative methods have been published. For example, we have not mentioned the method of
vector fitting. This method builds rational approximations of the transfer function
in a very clever and efficient way, and can be used to adaptively build reduced or-
der models. The chapter by Deschrijver and Dhaene contains an account of recent
developments in this area.
As model order reduction is a very active area of research, ongoing progress may well lead to entirely new classes of methods. The development of such
new methods is often sparked by an industrial need. For example, right now there is
a demand for reducing problems in the electronics industry that contain many inputs
and outputs. It has already become clear that current methods cannot cope with such
problems, as the Krylov spaces very quickly become prohibitively large, even after
a few iterations. Hence, new ways of constructing reduced order models must be
developed.
References
1. J.I. Aliaga, D.L. Boley, R.W. Freund and V. Hernández. A Lanczos-type method for mul-
tiple starting vectors. Math. Comp., 69(232):1577-1601, May 1998.
2. A.C. Antoulas. Approximation of Large-Scale Dynamical Systems. SIAM series on
Advances in Design and Control, 2005.
3. P. Benner, V. Mehrmann and D.C. Sorensen. Dimension Reduction of Large-Scale Sys-
tems. Lecture Notes in Computational Science and Engineering, vol. 45, Springer-Verlag,
June 2005.
4. A. Brandt. Multilevel adaptive solutions to boundary value problems. Math. Comp.,
31:333-390, 1977.
5. W.L. Briggs, Van Emden Henson and S.F. McCormick. A multigrid tutorial. SIAM, 2000.
6. BSIM3 and BSIM4 Compact MOSFET Model Summary. Online: http://www-device.eecs.berkeley.edu/ 3/get.html.
7. E. Chiprout and M.S. Nakhla. Asymptotic Waveform Evaluation and moment matching
for interconnect analysis. Kluwer Academic Publishers, 1994.
8. N.S. Ellner and E.L. Wachspress. Alternating direction implicit iteration for systems with
complex spectra. SIAM J. Numer. Anal., 28(3):859-870, June 1991.
9. P. Feldmann and R. Freund. Efficient linear circuit analysis by Padé approximation via
the Lanczos process. IEEE Trans. Computer-Aided Design, 14:137-158, 1993.
10. P. Feldmann and R. Freund. Reduced-order modeling of large linear subcircuits via a
block Lanczos algorithm. Proc. 32nd ACM/IEEE Design Automation Conf., June 1995.
11. R.W. Freund and P. Feldmann. Reduced-order modeling of large passive linear circuits by
means of the SyPVL algorithm. Numerical Analysis Manuscript 96-13, Bell Laboratories,
Murray Hill, N.J., May 1996.
12. R.W. Freund and P. Feldmann. Interconnect-Delay Computation and Signal-Integrity Ver-
ification Using the SyMPVL Algorithm. Proc. 1997 European Conf. Circuit Theory and
Design, 408-413, 1997.
13. R.W. Freund and P. Feldmann. The SyMPVL algorithm and its application to interconnect
simulation. Proc. 1997 Int. Conf. Simulation of Semiconductor Processes and Devices,
113-116, 1997.
14. R.W. Freund and P. Feldmann. Reduced-order modeling of large linear passive multi-
terminal circuits using Matrix-Padé approximation. Proc. DATE Conf. 1998, 530-537,
1998.
15. G. Gildenblat, X. Li, W. Wu, H. Wang, A. Jha, R. van Langevelde, G.D.J. Smit,
A.J. Scholten and D.B.M. Klaassen. PSP: An Advanced Surface-Potential-Based MOS-
FET Model for Circuit Simulation. IEEE Trans. Electron Dev., 53(9):1979-1993,
September 2006.
16. E.J. Grimme. Krylov Projection Methods for model reduction. PhD thesis, Univ. Illinois,
Urbana-Champaign, 1997.
17. G. H. Golub and C. F. Van Loan. Matrix Computations. Johns Hopkins University Press,
Baltimore, Maryland, 3rd edition, 1996.
18. K. Glover. All optimal Hankel-norm approximations of linear multivariable systems and their L∞-error bounds. Int. J. Control, 39(6):1115-1193, 1984.
19. H.C. de Graaff and F.M. Klaassen. Compact Transistor Modelling for Circuit Design.
Springer-Verlag, Wien, New York, 1990.
20. Heres2001
21. P.J. Heres. Robust and efficient Krylov subspace methods for Model Order Reduction.
PhD Thesis, TU Eindhoven, The Netherlands, 2005.
22. M.R. Hestenes and E. Stiefel. Methods of Conjugate Gradients for the solution of linear
systems. J. Res. Natl. Bur. Stand., 49:409-436, 1952.
23. L. Knockaert and D. De Zutter. Passive Reduced Order Multiport Modeling: The Padé-
Arnoldi-SVD Connection. Int. J. Electron. Commun. (AEU), 53:254-260, 1999.
24. L. Knockaert and D. De Zutter. Laguerre-SVD Reduced-Order Modelling. IEEE Trans.
Microwave Theory and Techn., 48(9):1469-1475, September 2000.
25. J.A. Meijerink and H.A. van der Vorst. An iterative solution method for linear systems of
which the coefficient matrix is a symmetric M-matrix. Math. Comp., 31:148-162, 1977.
26. A. Odabasioglu, M. Celik and L.T. Pileggi. PRIMA: passive reduced-order interconnect macromodeling algorithm. IEEE Trans. Computer-Aided Design, 17(8):645-654, August 1998.
27. J.R. Phillips, L. Daniel, and L.M. Silveira. Guaranteed passive balancing transformations
for model order reduction. Proc. 39th Conf. Design Automation, 52-57, 2002.
28. J.R. Phillips and L.M. Silveira. Poor Man’s TBR: A simple model reduction scheme.
IEEE Trans. Computer-Aided Design Int. Circ. and Syst., 24(1):43-55, January 2005.
29. L.T. Pillage and R.A. Rohrer. Asymptotic Waveform Evaluation for Timing Analysis.
IEEE Trans. Computer-Aided Design Int. Circ. and Syst., 9(4): 352-366, April 1990.
30. A. Quarteroni and A. Veneziani. Analysis of a geometrical multiscale model based on
the coupling of PDE’s and ODE’s for blood flow simulations. SIAM J. on MMS, 1(2):
173-195, 2003.
31. Y. Saad. Iterative Methods for Sparse Linear Systems. SIAM, Philadelphia, 2nd edition,
2003.
32. W.H.A. Schilders and E.J.W. ter Maten. Numerical methods in electromagnetics. Hand-
book of Numerical Analysis, volume XIII, Elsevier, 2005.
33. G. Starke. Optimal alternating direction implicit parameters for nonsymmetric systems of
linear equations. SIAM J. Numer. Anal., 28(5):1431-1445, October 1991.
34. H.A. van der Vorst. Bi-CGSTAB: A fast and smoothly converging variant of Bi-CG for
the solution of nonsymmetric linear systems. SIAM J. Sci. Stat. Comput., 13(2):631-644,
1992.
35. T. Wittig. Zur Reduktion der Modellordnung in elektromagnetischen Feldsimulationen.
PhD thesis, TU Darmstadt, 2003.
36. T. Wittig, I. Munteanu, R. Schuhmann and T. Weiland. Two-step Lanczos algorithm for model order reduction. IEEE Trans. Magnetics, 38(2):673-676, March 2002.